ABSTRACT
This paper studies the memory behavior of important Java workloads used in benchmarking Java Virtual Machines (JVMs), based on instrumentation of both application and library code in a state-of-the-art JVM, and provides structured information about these workloads to help guide systems' design. We begin by characterizing the inherent memory behavior of the benchmarks, such as information on the breakdown of heap accesses among different categories and on the hotness of references to fields and methods. We then provide detailed information about misses in the data TLB and caches, including the distribution of misses over different kinds of accesses and over different methods. In the process, we make interesting discoveries about TLB behavior and limitations of data prefetching schemes discussed in the literature in dealing with pointer-intensive Java codes. Throughout this paper, we develop a set of recommendations for computer architects and compiler writers on how to optimize computer systems and system software to run Java programs more efficiently. This paper also makes the first attempt to compare the characteristics of SPECjvm98 to those of a server-oriented benchmark, pBOB, and to explain why the current set of SPECjvm98 benchmarks may not be adequate for a comprehensive and objective evaluation of JVMs and just-in-time (JIT) compilers.

We discover that the fraction of accesses to array elements is quite significant, demonstrate that the number of "hot spots" in the benchmarks is small, and show that field reordering cannot yield significant performance gains. We also show that even a fairly large L2 data cache is not effective for many Java benchmarks. We observe that instructions used to prefetch data into the L2 data cache are often squashed because of high TLB miss rates and because the TLB does not usually have the translation information needed to prefetch the data into the L2 data cache. We also find that co-allocation of frequently used method tables can reduce the number of TLB misses and lower the cost of accessing type information block entries in virtual method calls and runtime type checking.