Where the bugs are
Full text: PDF (154 KB)
Source: International Symposium on Software Testing and Analysis
Proceedings of the 2004 ACM SIGSOFT International Symposium on Software Testing and Analysis
Boston, Massachusetts, USA
SESSION: Empirical studies
Pages: 86-96
Year of Publication: 2004
ISBN: 1-58113-820-2
Authors
Thomas J. Ostrand, AT&T Labs - Research, Florham Park, NJ
Elaine J. Weyuker, AT&T Labs - Research, Florham Park, NJ
Robert M. Bell, AT&T Labs - Research, Florham Park, NJ
Sponsors
SIGSOFT: ACM Special Interest Group on Software Engineering
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 13, Downloads (12 Months): 131, Citation Count: 26
DOI: http://doi.acm.org/10.1145/1007512.1007524

ABSTRACT

The ability to predict which files in a large software system are most likely to contain the largest numbers of faults in the next release can be a very valuable asset. To accomplish this, a negative binomial regression model using information from previous releases has been developed and used to predict the numbers of faults for a large industrial inventory system. The files of each release were sorted in descending order based on the predicted number of faults and then the first 20% of the files were selected. This was done for each of fifteen consecutive releases, representing more than four years of field usage. The predictions were extremely accurate, correctly selecting files that contained between 71% and 92% of the faults, with the overall average being 83%. In addition, the same model was used on data for the same system's releases, but with all fault data prior to integration testing removed. The prediction was again very accurate, ranging from 71% to 93%, with the average being 84%. Predictions were made for a second system, and again the first 20% of files accounted for 83% of the identified faults. Finally, a highly simplified predictor was considered which correctly predicted 73% and 74% of the faults for the two systems.
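
The selection step described in the abstract (sort files by predicted fault count, take the first 20%, then check what share of the actual faults those files contain) can be sketched as follows. This is only an illustration of the ranking and evaluation arithmetic, not the authors' negative binomial regression model. The function name, the file names, and all numbers are hypothetical, and rounding the 20% cutoff down is an assumption, since the abstract does not say how fractional file counts were handled.

```python
def top_files_fault_share(predicted, actual, percent=20):
    """Rank files by predicted faults and measure the share of actual faults in the top slice.

    predicted: dict mapping file name -> predicted fault count (hypothetical model output)
    actual:    dict mapping file name -> faults actually found in that release
    percent:   size of the selected slice, as a whole-number percentage of all files
    """
    # Sort by predicted faults, highest first; break ties by name so the result is deterministic.
    ranked = sorted(predicted, key=lambda name: (-predicted[name], name))

    # Integer arithmetic keeps the cutoff exact; rounding down is an assumption of this sketch.
    n_selected = len(ranked) * percent // 100
    selected = ranked[:n_selected]

    total_faults = sum(actual.values())
    if total_faults == 0:
        return selected, 0.0

    captured = sum(actual.get(name, 0) for name in selected)
    return selected, captured / total_faults


# Made-up data for ten files; none of these numbers come from the paper.
predicted = {"a.c": 5.2, "b.c": 3.1, "c.c": 0.9, "d.c": 0.7, "e.c": 0.5,
             "f.c": 0.4, "g.c": 0.3, "h.c": 0.2, "i.c": 0.1, "j.c": 0.05}
actual = {"a.c": 6, "b.c": 2, "c.c": 1, "d.c": 0, "e.c": 1,
          "f.c": 0, "g.c": 0, "h.c": 0, "i.c": 0, "j.c": 0}

selected, share = top_files_fault_share(predicted, actual)
print(selected, f"{share:.0%}")  # ['a.c', 'b.c'] 80%
```

With this toy data the two highest-ranked files hold 8 of the 10 faults, so the selected 20% of files accounts for 80% of the faults. The percentages reported in the abstract are this same ratio computed per release on the real systems.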


REVIEW

Reviewer: Timothy R. Hopkins

This interesting paper reports on the use of a negative binomial regression model to predict which files in a large software system are most likely to contain faults in the next release. This model was developed by studying a large and evolving in ...
