|
ABSTRACT
Unless the speed gap between CPU and memory disappears, efficient memory usage remains a decisive factor for performance. To optimize data usage of programs in the presence of the memory hierarchy, we are particularly interested in two compiler techniques: pool allocation and field layout restructuring. Since foreseeing runtime behaviors of programs at compile time is difficult, most of the previous work relied on profiling. On the contrary, our goal is to develop a fully automatic compiler that statically transforms input codes to use memory efficiently. Noticing that regular expressions, which denote repetition explicitly, are sufficient for memory access patterns, we describe how to extract memory access patterns as regular expressions in detail. Based on static patterns presented in regular expressions, we apply pool allocation to repeatedly accessed structures and exploit field layout restructuring according to field affinity relations of chosen structures. To make a scalable framework, we devise and apply new abstraction techniques, which build and interpret access patterns for the whole programs in a bottom-up fashion. We implement our analyses and transformations with the CIL compiler. To verify the effect and scalability of our scheme, we examine 17 benchmarks including 2 SPECINT 2000 benchmarks whose source lines of code are larger than 10,000. Our experiments demonstrate that the static layout transformations for dynamic memory can reduce L1D cache misses by 16% and execution times by 14% on average.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Alfred V. Aho , Ravi Sethi , Jeffrey D. Ullman, Compilers: principles, techniques, and tools, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1986
|
 |
2
|
|
| |
3
|
Bourdoncle, F. 1993. Efficient chaotic iteration strategies with widenings. In Proceedings of the International Conference on Formal Methods in Programming and Their Applications (FMPA'93). ACM, New York, 128--141.
|
 |
4
|
|
 |
5
|
Trishul M. Chilimbi , Bob Davidson , James R. Larus, Cache-conscious structure definition, Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation, p.13-24, May 01-04, 1999, Atlanta, Georgia, United States
|
 |
6
|
|
 |
7
|
|
 |
8
|
Samuel Z. Guyer , Kathryn S. McKinley, Finding your cronies: static analysis for dynamic object colocation, Proceedings of the 19th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, October 24-28, 2004, Vancouver, BC, Canada
|
 |
9
|
Thomas A. Henzinger , Ranjit Jhala , Rupak Majumdar , Grégoire Sutre, Lazy abstraction, Proceedings of the 29th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, p.58-70, January 16-18, 2002, Portland, Oregon
|
| |
10
|
John E. Hopcroft , Rajeev Motwani , Jeffrey D. Ullman, Introduction to Automata Theory, Languages, and Computation (3rd Edition), Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 2006
|
 |
11
|
Xianglong Huang , Stephen M. Blackburn , Kathryn S. McKinley , J Eliot B. Moss , Zhenlin Wang , Perry Cheng, The garbage collection advantage: improving program locality, Proceedings of the 19th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, October 24-28, 2004, Vancouver, BC, Canada
|
| |
12
|
|
 |
13
|
|
| |
14
|
McGill benchmark suite. http://llvm.org/.
|
| |
15
|
|
 |
16
|
|
 |
17
|
|
| |
18
|
Rundberg, P. and Warg, F. Freebench v1.0 benchmark suite. http://www.freebench.org/.
|
 |
19
|
|
 |
20
|
|
| |
21
|
Keoncheol Shin , Jungeun Kim , Seonggun Kim , Hwansoo Han, Restructuring field layouts for embedded memory systems, Proceedings of the conference on Design, automation and test in Europe: Proceedings, March 06-10, 2006, Munich, Germany
|
 |
22
|
|
 |
23
|
|
 |
24
|
|
| |
25
|
|
| |
26
|
Valgrind. http://valgrind.org.
|
| |
27
|
Wheeler, D. A. SLOCcount. http://www.dwheeler.com/sloccount/.
|
 |
28
|
|
|