Reference-driven performance anomaly identification
Source
Joint International Conference on Measurement and Modeling of Computer Systems
Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems
Seattle, WA, USA
SESSION: Computing and switching
Pages 85-96
Year of Publication: 2009
ISBN: 978-1-60558-511-6
Authors
Kai Shen, University of Rochester, Rochester, NY, USA
Christopher Stewart, University of Rochester, Rochester, NY, USA
Chuanpeng Li, University of Rochester, Rochester, NY, USA
Xin Li, University of Rochester, Rochester, NY, USA
ABSTRACT
Complex system software allows a variety of execution conditions on system configurations and workload properties. This paper explores a principled use of reference executions--those of similar execution conditions from the target--to help identify the symptoms and causes of performance anomalies. First, to identify anomaly symptoms, we construct change profiles that probabilistically characterize expected performance deviations between target and reference executions. By synthesizing several single-parameter change profiles, we can scalably identify anomalous reference-to-target changes in a complex system with multiple execution parameters. Second, to narrow the scope of anomaly root cause analysis, we filter anomaly-related low-level system metrics as those that manifest very differently between target and reference executions. Our anomaly identification approach requires little expert knowledge or detailed models on system internals and consequently it can be easily deployed. Using empirical case studies on the Linux I/O subsystem and a J2EE-based distributed online service, we demonstrate our approach's effectiveness in identifying performance anomalies over a wide range of execution conditions as well as multiple system software versions. In particular, we discovered five previously unknown performance anomaly causes in the Linux 2.6.23 kernel. Additionally, our preliminary results suggest that online anomaly detection and system reconfiguration may help evade performance anomalies in complex online systems.
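The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the empirical tail probability, the symmetric relative-difference score, and every function and metric name below are assumptions made only for illustration. The first function treats a single-parameter change profile as a list of previously observed relative performance changes and asks how unusual a new reference-to-target change is. The second ranks low-level metrics by how differently they manifest in the target and reference executions. Combining several single-parameter profiles into a multi-parameter judgment is not covered here.

```python
from bisect import bisect_right


def change_p_value(profile_samples, observed_change):
    """Empirical probability of a change at least as bad as the observed one.

    profile_samples: relative performance changes, (target - reference) / reference,
    seen earlier for the same single-parameter change (hypothetical data).
    Lower values mean larger degradation, so the p-value is the fraction of
    profile samples that are less than or equal to the observed change.
    """
    if not profile_samples:
        raise ValueError("change profile needs at least one sample")
    ordered = sorted(profile_samples)
    return bisect_right(ordered, observed_change) / len(ordered)


def rank_metrics(target, reference, eps=1e-9):
    """Rank metric names by how differently they manifest in the two executions.

    target, reference: dicts mapping metric name to a measured value.
    The score is a symmetric relative difference in [0, 1); this scoring rule
    is an assumption, chosen so metrics with different units stay comparable.
    """
    scores = {}
    for name in target.keys() & reference.keys():
        t, r = target[name], reference[name]
        scores[name] = abs(t - r) / (abs(t) + abs(r) + eps)
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical change profile for one parameter change.
    profile = [-0.10, -0.05, 0.0, 0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.20]
    # A 60% throughput drop is worse than anything in the profile.
    print(change_p_value(profile, -0.60))

    # Hypothetical low-level metrics from a target and a reference execution.
    target = {"ctx_switches": 900.0, "disk_reads": 105.0, "cache_misses": 50.0}
    reference = {"ctx_switches": 100.0, "disk_reads": 100.0, "cache_misses": 52.0}
    print(rank_metrics(target, reference))
```

A change whose p-value falls below a chosen threshold would be flagged as a candidate anomaly symptom, and the top-ranked metrics would then narrow where to look for its cause. The threshold itself is a tuning choice that this sketch leaves open.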