Statistically rigorous Java performance evaluation
Source
Conference on Object Oriented Programming Systems Languages and Applications
Proceedings of the 22nd annual ACM SIGPLAN conference on Object-oriented programming systems and applications
Montreal, Quebec, Canada
SESSION: Runtime techniques/GC
Pages: 57-76
Year of Publication: 2007
ISBN: 978-1-59593-786-5
Authors
Andy Georges, Ghent University, Ghent, Belgium
Dries Buytaert, Ghent University, Ghent, Belgium
Lieven Eeckhout, Ghent University, Ghent, Belgium
Sponsors
SIGPLAN: ACM Special Interest Group on Programming Languages
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 23,   Downloads (12 Months): 178,   Citation Count: 9
DOI: http://doi.acm.org/10.1145/1297027.1297033

ABSTRACT

Java performance is far from being trivial to benchmark because it is affected by various factors such as the Java application, its input, the virtual machine, the garbage collector, the heap size, etc. In addition, non-determinism at run-time causes the execution time of a Java program to differ from run to run. There are a number of sources of non-determinism such as Just-In-Time (JIT) compilation and optimization in the virtual machine (VM) driven by timer-based method sampling, thread scheduling, garbage collection, and various system effects.
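
The run-to-run variation described above can be observed with a very small timing loop. The following is only an illustrative sketch: the class name `TimingSketch`, the `workload` method, and the iteration counts are hypothetical and do not come from the paper. It times the same workload several times inside a single VM invocation with `System.nanoTime()`; the reported times will usually differ between iterations, and early iterations often include JIT compilation effects.

```java
import java.util.Arrays;

// Hypothetical illustration only; this is not the paper's JavaStats tool.
class TimingSketch {
    // A small CPU-bound workload standing in for a real benchmark (assumption).
    static long workload(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i % 7;
        }
        return sum;
    }

    // Times `iterations` executions of the workload within this VM invocation
    // and returns the elapsed wall-clock time of each one in nanoseconds.
    static long[] timeIterations(int iterations, int n) {
        long[] nanos = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            long result = workload(n);
            nanos[i] = System.nanoTime() - start;
            // Use the result so the computation cannot be discarded as dead code.
            if (result == -1) {
                System.out.println(result);
            }
        }
        return nanos;
    }

    public static void main(String[] args) {
        // The printed values vary from run to run; no fixed output is expected.
        System.out.println(Arrays.toString(timeIterations(10, 5_000_000)));
    }
}
```

Running the program several times (that is, in several VM invocations) and comparing the printed arrays gives a feel for both kinds of variation: between iterations and between invocations.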

There exist a wide variety of Java performance evaluation methodologies used by researchers and benchmarkers. These methodologies differ from each other in a number of ways. Some report average performance over a number of runs of the same experiment; others report the best or second best performance observed; yet others report the worst. Some iterate the benchmark multiple times within a single VM invocation; others consider multiple VM invocations and iterate a single benchmark execution; yet others consider multiple VM invocations and iterate the benchmark multiple times.
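
To make the difference between these reporting choices concrete, here is a minimal sketch that computes the average, best, second best, and worst execution time for two sets of runs. Everything in it is an assumption made for illustration: the class `RunSummary`, the configuration names A and B, and the execution times, which are invented numbers rather than measurements from the paper.

```java
import java.util.Arrays;

// Hypothetical illustration; names and data are assumptions, not from the paper.
class RunSummary {
    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Returns the k-th smallest execution time (k = 0 is the best run).
    private static double kthSmallest(double[] xs, int k) {
        if (k < 0 || k >= xs.length) {
            throw new IllegalArgumentException("not enough runs");
        }
        double[] sorted = xs.clone();
        Arrays.sort(sorted);
        return sorted[k];
    }

    static double best(double[] xs) { return kthSmallest(xs, 0); }
    static double secondBest(double[] xs) { return kthSmallest(xs, 1); }
    static double worst(double[] xs) { return kthSmallest(xs, xs.length - 1); }

    public static void main(String[] args) {
        // Invented execution times in seconds for two hypothetical configurations.
        double[] a = {10.0, 10.2, 10.1, 10.3, 9.9};
        double[] b = {9.5, 10.8, 10.9, 10.7, 10.6};
        System.out.printf("A: mean=%.2f best=%.2f worst=%.2f%n", mean(a), best(a), worst(a));
        System.out.printf("B: mean=%.2f best=%.2f worst=%.2f%n", mean(b), best(b), worst(b));
    }
}
```

With these invented numbers, reporting the best run favors B while reporting the average favors A, so the choice of summary alone can change which configuration appears faster.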

This paper shows that prevalent methodologies can be misleading, and can even lead to incorrect conclusions. The reason is that the data analysis is not statistically rigorous. In this paper, we present a survey of existing Java performance evaluation methodologies and discuss the importance of statistically rigorous data analysis for dealing with non-determinism. We advocate approaches to quantify startup as well as steady-state performance, and, in addition, we provide the JavaStats software to automatically obtain performance numbers in a rigorous manner. Although this paper focuses on Java performance evaluation, many of the issues addressed in this paper also apply to other programming languages and systems that build on a managed runtime system.
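
The abstract does not spell out the statistical procedure, so the following is only a sketch of one standard approach to rigorous analysis: computing a 95% confidence interval for the mean execution time with Student's t distribution and checking whether the intervals of two alternatives overlap. The class `ConfidenceIntervalSketch`, its method names, and the sample data are hypothetical; this is not the JavaStats implementation.

```java
// Hypothetical sketch; names and sample data are illustrative assumptions.
class ConfidenceIntervalSketch {
    // Two-sided 95% critical values of Student's t for 1..29 degrees of freedom.
    private static final double[] T_95 = {
        12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
        2.201, 2.179, 2.160, 2.145, 2.131, 2.120, 2.110, 2.101, 2.093, 2.086,
        2.080, 2.074, 2.069, 2.064, 2.060, 2.056, 2.052, 2.048, 2.045
    };

    static double tCritical95(int df) {
        if (df < 1) {
            throw new IllegalArgumentException("degrees of freedom must be >= 1");
        }
        // For 30 or more degrees of freedom, use the normal approximation.
        return df <= T_95.length ? T_95[df - 1] : 1.96;
    }

    // Returns {lower, upper} bounds of the 95% confidence interval for the mean.
    static double[] confidenceInterval95(double[] xs) {
        int n = xs.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least 2 measurements");
        }
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        double mean = sum / n;
        double squares = 0;
        for (double x : xs) {
            squares += (x - mean) * (x - mean);
        }
        double stdDev = Math.sqrt(squares / (n - 1)); // sample standard deviation
        double half = tCritical95(n - 1) * stdDev / Math.sqrt(n);
        return new double[] {mean - half, mean + half};
    }

    // Overlapping intervals mean the observed difference may be due to chance.
    static boolean overlap(double[] a, double[] b) {
        return a[0] <= b[1] && b[0] <= a[1];
    }

    public static void main(String[] args) {
        // Invented execution times in seconds, one per VM invocation.
        double[] a = {10.0, 10.2, 10.1, 10.3, 9.9};
        double[] b = {9.5, 10.8, 10.9, 10.7, 10.6};
        double[] ciA = confidenceInterval95(a);
        double[] ciB = confidenceInterval95(b);
        System.out.printf("A: [%.3f, %.3f]%n", ciA[0], ciA[1]);
        System.out.printf("B: [%.3f, %.3f]%n", ciB[0], ciB[1]);
        System.out.println("overlap: " + overlap(ciA, ciB));
    }
}
```

For the invented data the two intervals overlap, so under this analysis neither alternative could be declared faster, whatever a single best or average number might suggest.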


