Applying classification techniques to remotely-collected program execution data
Source
Foundations of Software Engineering
Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on Foundations of software engineering
Lisbon, Portugal
SESSION: Patterns and aspects
Pages: 146 - 155
Year of Publication: 2005
ISBN: 1-59593-014-0
Authors
Murali Haran, Penn State University, University Park, PA
Alan Karr, National Institute of Statistical Sciences, Research Triangle Park, NC
Alessandro Orso, Georgia Institute of Technology, Atlanta, GA
Adam Porter, University of Maryland, College Park, MD
Ashish Sanil, National Institute of Statistical Sciences, Research Triangle Park, NC
ABSTRACT
There is an increasing interest in techniques that support measurement and analysis of fielded software systems. One of the main goals of these techniques is to better understand how software actually behaves in the field. In particular, many of these techniques require a way to distinguish, in the field, failing from passing executions. So far, researchers and practitioners have only partially addressed this problem: they have simply assumed that program failure status is either obvious (i.e., the program crashes) or provided by an external source (e.g., the users). In this paper, we propose a technique for automatically classifying execution data, collected in the field, as coming from either passing or failing program runs. (Failing program runs are executions that terminate with a failure, such as a wrong outcome.) We use statistical learning algorithms to build the classification models. Our approach builds the models by analyzing executions performed in a controlled environment (e.g., test cases run in-house) and then uses the models to predict whether execution data produced by a fielded instance were generated by a passing or failing program execution. We also present results from an initial feasibility study, based on multiple versions of a software subject, in which we investigate several issues vital to the applicability of the technique. Finally, we present some lessons learned regarding the interplay between the reliability of classification models and the amount and type of data collected.
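The train-in-house, classify-in-the-field workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract says only that statistical learning algorithms are used, so a one-level decision tree (a decision stump) stands in here as the simplest possible classifier. The feature layout (one count per instrumented event), the data values, and the names `train_stump` and `classify` are all hypothetical.

```python
from typing import List, Tuple

Run = List[int]  # hypothetical execution data: one count per instrumented event
Stump = Tuple[int, int, bool]  # (feature index, threshold, predict "failing" if above)


def classify(stump: Stump, run: Run) -> bool:
    """Return True if the stump predicts that the run failed."""
    feature, threshold, fail_if_above = stump
    return (run[feature] > threshold) == fail_if_above


def train_stump(runs: List[Run], failed: List[bool]) -> Stump:
    """Pick the single feature/threshold split with the fewest training errors."""
    best_errors = len(runs) + 1
    best_stump: Stump = (0, 0, True)
    for feature in range(len(runs[0])):
        for threshold in sorted({r[feature] for r in runs}):
            for fail_if_above in (True, False):
                candidate = (feature, threshold, fail_if_above)
                errors = sum(
                    classify(candidate, r) != y for r, y in zip(runs, failed)
                )
                if errors < best_errors:  # strict: keep the first best split
                    best_errors, best_stump = errors, candidate
    return best_stump


# In-house runs (made-up values); labels come from test oracles.
# Columns: [branch_a_count, exceptions_thrown, loop_iterations] (hypothetical).
in_house_runs = [
    [5, 0, 10], [7, 0, 12], [6, 1, 9], [4, 0, 11],  # passing
    [5, 3, 10], [6, 4, 30], [7, 2, 8],              # failing
]
in_house_failed = [False, False, False, False, True, True, True]

model = train_stump(in_house_runs, in_house_failed)

# Field runs arrive without a pass/fail label; the model supplies one.
field_runs = [[6, 0, 11], [5, 5, 10]]
field_predictions = [classify(model, run) for run in field_runs]
```

On this toy data the stump splits on the second feature, so the first field run is predicted to pass and the second to fail. A realistic study would use a stronger learner and far more execution data, and would measure how prediction reliability changes with the amount and type of data collected.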
CITED BY 11
Janaki T. Madhavan, E. James Whitehead, Jr., Predicting buggy changes inside an integrated development environment, Proceedings of the 2007 OOPSLA workshop on eclipse technology eXchange, p.36-40, October 21, 2007, Montreal, Quebec, Canada
Murali Haran, Alan Karr, Michael Last, Alessandro Orso, Adam A. Porter, Ashish Sanil, Sandro Fouche, Techniques for Classifying Executions of Deployed Software to Support Software Engineering Tasks, IEEE Transactions on Software Engineering, v.33 n.5, p.287-304, May 2007
A. V. Miranskyy, N. H. Madhavji, M. S. Gittens, M. Davison, M. Wilding, D. Godwin, An iterative, multi-level, and scalable approach to comparing execution traces, The 6th Joint Meeting on European software engineering conference and the ACM SIGSOFT symposium on the foundations of software engineering: companion papers, September 03-07, 2007, Dubrovnik, Croatia
Andriy V. Miranskyy, Nazim H. Madhavji, Mechelle S. Gittens, Matthew Davison, Mark Wilding, David Godwin, An iterative, multi-level, and scalable approach to comparing execution traces, Proceedings of the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering, September 03-07, 2007, Dubrovnik, Croatia
A. V. Miranskyy, N. H. Madhavji, M. S. Gittens, M. Davison, M. Wilding, D. Godwin, C. A. Taylor, SIFT: a scalable iterative-unfolding technique for filtering execution traces, Proceedings of the 2008 conference of the center for advanced studies on collaborative research: meeting of minds, October 27-30, 2008, Ontario, Canada