ABSTRACT
An error occurs when software cannot complete a requested action as a result of some problem with its input, configuration, or environment. A high-quality error report allows a user to understand and correct the problem. Unfortunately, the quality of error reports has been decreasing as software becomes more complex and layered. End users are left with the cryptic error messages that programs give them and struggle to fix their problems using search engines and support websites. Developers cannot improve their error messages when they receive an ambiguous or otherwise insufficient error indicator from a black-box software component. We introduce Clarify, a system that improves error reporting by classifying application behavior. Clarify uses minimally invasive monitoring to generate a behavior profile, which is a summary of the program's execution history. A machine learning classifier uses the behavior profile to classify the application's behavior, thereby enabling a more precise error report than the output of the application itself. We evaluate a prototype Clarify system on ambiguous error messages generated by large, modern applications like gcc, LaTeX, and the Linux kernel. For a performance cost of less than 1% on user applications and 4.7% on the Linux kernel, the prototype correctly disambiguates at least 85% of application behaviors that result in ambiguous error reports. This accuracy does not degrade significantly with more behaviors: a Clarify classifier for 81 LaTeX error messages is at most 2.5% less accurate than a classifier for 27 LaTeX error messages. Finally, we show that without any human effort to build a classifier, Clarify can provide nearest-neighbor software support, where users who experience a problem are told about 5 other users who might have had the same problem. On average, 2.3 of the 5 users that Clarify identifies have experienced the same problem.
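The idea of classifying a behavior profile can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the profile format, the distance measure, or the classifier, so the sketch treats a profile as normalized event counts and uses a plain nearest-neighbor vote as a stand-in for the machine learning classifier. The function names (`behavior_profile`, `nearest_users`, `classify`) are hypothetical and are not Clarify's API.

```python
import math
from collections import Counter


def behavior_profile(events):
    """Summarize an execution trace as normalized event counts.

    Assumption: a profile is a bag of event names (for example, function
    calls). The abstract only says it summarizes execution history.
    """
    counts = Counter(events)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


def distance(p, q):
    """Euclidean distance between two profiles (missing events count as 0)."""
    keys = set(p) | set(q)
    return math.sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))


def nearest_users(profile, known, k=5):
    """Return the k users whose stored profiles are closest to `profile`.

    `known` maps a user name to that user's profile. No labels are needed,
    which mirrors the nearest-neighbor support mode described above.
    """
    ranked = sorted(known, key=lambda user: distance(profile, known[user]))
    return ranked[:k]


def classify(profile, labeled, k=3):
    """Label a profile by majority vote among its k nearest labeled profiles.

    `labeled` is a list of (profile, label) pairs. This is a stand-in
    classifier chosen for brevity, not the one used by the prototype.
    """
    ranked = sorted(labeled, key=lambda pair: distance(profile, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical traces from three users who saw the same ambiguous message.
known = {
    "alice": behavior_profile(["open", "read", "parse_error"]),
    "bob": behavior_profile(["open", "open", "missing_file"]),
    "carol": behavior_profile(["open", "read", "read", "parse_error"]),
}
labeled = [
    (known["alice"], "syntax error"),
    (known["bob"], "missing file"),
    (known["carol"], "syntax error"),
]
query = behavior_profile(["open", "read", "parse_error"])
```

With these toy traces, `nearest_users(query, known, k=2)` ranks the two users whose runs hit the parse failure ahead of the one whose run failed on a missing file, and `classify(query, labeled)` assigns the label shared by those two neighbors.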
CITED BY 8

Jason V. Davis, Brian Kulis, Prateek Jain, Suvrit Sra, Inderjit S. Dhillon, Information-theoretic metric learning, Proceedings of the 24th international conference on Machine learning, p.209-216, June 20-24, 2007, Corvallis, Oregon

Xiaoning Ding, Hai Huang, Yaoping Ruan, Anees Shaikh, Xiaodong Zhang, Automatic software fault diagnosis by exploiting application signatures, Proceedings of the 22nd conference on Large installation system administration conference, p.23-39, November 09-14, 2008, San Diego, California

Justin Brickell, Donald E. Porter, Vitaly Shmatikov, Emmett Witchel, Privacy-preserving remote diagnostics, Proceedings of the 14th ACM conference on Computer and communications security, October 28-31, 2007, Alexandria, Virginia, USA