ABSTRACT
Java performance is far from trivial to benchmark because it is affected by various factors such as the Java application, its input, the virtual machine, the garbage collector, the heap size, etc. In addition, non-determinism at run-time causes the execution time of a Java program to differ from run to run. There are a number of sources of non-determinism, such as Just-In-Time (JIT) compilation and optimization in the virtual machine (VM) driven by timer-based method sampling, thread scheduling, garbage collection, and various system effects. There exists a wide variety of Java performance evaluation methodologies used by researchers and benchmarkers. These methodologies differ from each other in a number of ways. Some report average performance over a number of runs of the same experiment; others report the best or second best performance observed; yet others report the worst. Some iterate the benchmark multiple times within a single VM invocation; others consider multiple VM invocations and iterate a single benchmark execution; yet others consider multiple VM invocations and iterate the benchmark multiple times. This paper shows that prevalent methodologies can be misleading, and can even lead to incorrect conclusions. The reason is that the data analysis is not statistically rigorous. In this paper, we present a survey of existing Java performance evaluation methodologies and discuss the importance of statistically rigorous data analysis for dealing with non-determinism. We advocate approaches to quantify startup as well as steady-state performance, and, in addition, we provide the JavaStats software to automatically obtain performance numbers in a rigorous manner. Although this paper focuses on Java performance evaluation, many of the issues addressed here also apply to other programming languages and systems that build on a managed runtime system.
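To make the contrast between reporting a single best or average number and a statistically rigorous summary concrete, the following is a minimal sketch of summarizing execution times from several VM invocations as a mean with an approximate 95% confidence interval. It is not the JavaStats tool and is not taken from the paper: the class name `RunStats`, its method names, and the sample timings are all hypothetical, and the sketch assumes the normal approximation (z = 1.96), which is only reasonable for roughly 30 or more measurements.

```java
public class RunStats {
    /** Arithmetic mean of the measured execution times. */
    public static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Sample standard deviation (n - 1 in the denominator). */
    public static double stdDev(double[] xs) {
        double m = mean(xs);
        double squares = 0.0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return Math.sqrt(squares / (xs.length - 1));
    }

    /**
     * Half-width of an approximate 95% confidence interval for the mean.
     * Assumption: normal approximation (z = 1.96); for small samples a
     * Student t quantile should be substituted for z.
     */
    public static double halfWidth95(double[] xs) {
        final double z = 1.96;
        return z * stdDev(xs) / Math.sqrt(xs.length);
    }

    /** True when the two 95% intervals overlap, i.e. no clear difference is shown. */
    public static boolean intervalsOverlap(double[] a, double[] b) {
        double loA = mean(a) - halfWidth95(a);
        double hiA = mean(a) + halfWidth95(a);
        double loB = mean(b) - halfWidth95(b);
        double hiB = mean(b) + halfWidth95(b);
        return loA <= hiB && loB <= hiA;
    }

    public static void main(String[] args) {
        // Hypothetical execution times in seconds, one per VM invocation.
        // Made-up numbers for illustration; far fewer than a real study needs.
        double[] configA = {4.10, 4.25, 3.98, 4.31, 4.07, 4.18};
        double[] configB = {4.02, 4.40, 3.95, 4.22, 4.15, 4.09};

        System.out.printf("A: %.3f +/- %.3f s%n", mean(configA), halfWidth95(configA));
        System.out.printf("B: %.3f +/- %.3f s%n", mean(configB), halfWidth95(configB));
        System.out.println("Intervals overlap: " + intervalsOverlap(configA, configB));
    }
}
```

Reporting the interval alongside the mean shows whether an observed difference between two configurations could plausibly be explained by run-to-run variation alone; when the intervals overlap, a comparison of the two means by themselves does not support a conclusion either way.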