Fair and balanced?: bias in bug-fix datasets
Source
Foundations of Software Engineering archive
Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering
Amsterdam, The Netherlands
SESSION: Empirical software engineering
Pages: 121-130  
Year of Publication: 2009
ISBN:978-1-60558-001-2
Authors
Christian Bird  University of California, Davis, Davis, CA, USA
Adrian Bachmann  University of Zurich, Zurich, Switzerland
Eirik Aune  University of California, Davis, Davis, CA, USA
John Duffy  University of California, Davis, Davis, CA, USA
Abraham Bernstein  University of Zurich, Zurich, Switzerland
Vladimir Filkov  University of California, Davis, Davis, CA, USA
Premkumar Devanbu  University of California, Davis, Davis, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGSOFT: ACM Special Interest Group on Software Engineering
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1595696.1595716

ABSTRACT

Software engineering researchers have long been interested in where and why bugs occur in code, and in predicting where they might turn up next. Historical bug-occurrence data has been key to this research. Bug tracking systems, and code version histories, record when, how and by whom bugs were fixed; from these sources, datasets that relate file changes to bug fixes can be extracted. These historical datasets can be used to test hypotheses concerning processes of bug introduction, and also to build statistical bug prediction models. Unfortunately, processes and humans are imperfect, and only a fraction of bug fixes are actually labelled in source code version histories, and thus become available for study in the extracted datasets. The question naturally arises: are the bug fixes recorded in these historical datasets a fair representation of the full population of bug fixes? In this paper, we investigate historical data from several software projects, and find strong evidence of systematic bias. We then investigate the potential effects of "unfair, imbalanced" datasets on the performance of prediction techniques. We draw the lesson that bias is a critical problem that threatens both the effectiveness of processes that rely on biased datasets to build prediction models and the generalizability of hypotheses tested on biased data.

