| Stride prefetching by dynamically inspecting objects |
| Source
|
Conference on Programming Language Design and Implementation
Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
San Diego, California, USA
SESSION: Code optimization II
Pages: 269 - 277
Year of Publication: 2003
ISBN:1-58113-662-5
|
|
Authors
|
|
Tatsushi Inagaki
|
IBM Tokyo Research Laboratory, Shimotsuruma, Yamato-shi, Kanagawa-ken 242-8502, Japan
|
|
Tamiya Onodera
|
IBM Tokyo Research Laboratory, Shimotsuruma, Yamato-shi, Kanagawa-ken 242-8502, Japan
|
|
Hideaki Komatsu
|
IBM Tokyo Research Laboratory, Shimotsuruma, Yamato-shi, Kanagawa-ken 242-8502, Japan
|
|
Toshio Nakatani
|
IBM Tokyo Research Laboratory, Shimotsuruma, Yamato-shi, Kanagawa-ken 242-8502, Japan
|
|
ABSTRACT
Software prefetching is a promising technique for hiding cache miss latencies, but it remains challenging to prefetch pointer-based data structures effectively because obtaining the memory address to be prefetched requires pointer dereferences. The recently proposed stride prefetching overcomes this problem, but it only exploits inter-iteration stride patterns and relies on an off-line profiling method.

We propose a new algorithm for stride prefetching that is intended for use in a dynamic compiler. We exploit both inter- and intra-iteration stride patterns, which we discover using an ultra-lightweight profiling technique called object inspection. This is a kind of partial interpretation that only a dynamic compiler can perform. During the compilation of a method, the dynamic compiler gathers the profile information by partially interpreting the method using the actual values of its parameters and causing no side effects.

We evaluated an implementation of our prefetching algorithm in a production-level Java just-in-time compiler. The results show that the algorithm achieved speedups of up to 18.9% and 25.1% in industry-standard benchmarks on the Pentium 4 and the Athlon MP, respectively, while increasing the compilation time by less than 3.0%.
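
To make the two kinds of stride pattern concrete, the following is a minimal sketch in Java of how a profiler might decide whether a sequence of observed load addresses exhibits a usable stride. It is not the paper's algorithm: the class `StrideSketch`, its method names, the simulated address arrays, and the `minRatio` threshold are all hypothetical, and a real dynamic compiler would obtain the addresses by inspecting live objects and would then emit prefetch instructions, neither of which is shown here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

public class StrideSketch {

    // Returns the most frequent non-zero delta if it accounts for at least
    // minRatio of all deltas; otherwise reports that no stride was found.
    // (minRatio is an assumed tuning knob, not a value from the paper.)
    static OptionalLong dominant(long[] deltas, double minRatio) {
        if (deltas.length == 0) {
            return OptionalLong.empty();
        }
        Map<Long, Integer> counts = new HashMap<>();
        long best = 0;
        int bestCount = 0;
        for (long d : deltas) {
            int c = counts.merge(d, 1, Integer::sum);
            if (c > bestCount) {
                bestCount = c;
                best = d;
            }
        }
        if (best != 0 && bestCount >= minRatio * deltas.length) {
            return OptionalLong.of(best);
        }
        return OptionalLong.empty();
    }

    // Inter-iteration pattern: the same load, observed in consecutive
    // loop iterations, moves by a constant distance.
    static OptionalLong interIterationStride(long[] addresses, double minRatio) {
        if (addresses.length < 2) {
            return OptionalLong.empty();
        }
        long[] deltas = new long[addresses.length - 1];
        for (int i = 1; i < addresses.length; i++) {
            deltas[i - 1] = addresses[i] - addresses[i - 1];
        }
        return dominant(deltas, minRatio);
    }

    // Intra-iteration pattern: two different loads in the same iteration
    // sit a constant distance apart (for example, an object and the
    // object one of its fields points to, if they were allocated together).
    static OptionalLong intraIterationStride(long[] first, long[] second, double minRatio) {
        int n = Math.min(first.length, second.length);
        long[] deltas = new long[n];
        for (int i = 0; i < n; i++) {
            deltas[i] = second[i] - first[i];
        }
        return dominant(deltas, minRatio);
    }

    public static void main(String[] args) {
        // Simulated addresses of one load over four iterations.
        long[] sameLoad = {4096, 4160, 4224, 4288};
        System.out.println("inter: " + interIterationStride(sameLoad, 0.75));

        // Simulated addresses of two dependent loads in each iteration.
        long[] parent = {1000, 2000, 3000};
        long[] child = {1024, 2024, 3024};
        System.out.println("intra: " + intraIterationStride(parent, child, 0.75));
    }
}
```

Once a stride is found, the address needed a few iterations ahead can be computed by simple arithmetic from the current address, which is what lets a prefetch be issued without first performing the pointer dereference.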