|
ABSTRACT
Heap memory allocation plays an important role in modern applications. Conventional heap allocators, however, generally ignore the underlying memory hierarchy of the system, favoring instead a low runtime overhead and fast response times. Unfortunately, with little concern for the memory hierarchy, the data layout may exhibit poor spatial locality, and degrade cache performance. In this paper, we describe a dynamic heap allocation scheme called pool allocation. The strategy aims to improve cache performance by inspecting memory allocation requests, and allocating memory from appropriate heap pools as dictated by the requesting context. The advantages are two fold. First, by pooling together data with a common context, we expect to improve spatial locality, as data fetched to the caches will contain fewer items from different contexts. If the allocation patterns are closely matched to the traversal patterns, the end result is faster memory performance. Second, by pooling heap objects, we expect access patterns to exhibit more regularity, thus creating more opportunities for data prefetching. Our dynamic memory optimizer exploits the increased regularity to insert prefetch instructions at runtime. The optimizations are implemented in DynamoRIO, a dynamic optimization framework. We evaluate the work using various bench-marks, and measure a 17% speedup over gcc −03 on an Athlon MP, and a 13% speedup on a Pentium 4.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
S. G. Abraham and B. R. Rau. Predicting load latencies using cache profiling. Technical Report HPL-94-110, Hewlett Packard Labs, Dec 1994.
|
| |
2
|
|
 |
3
|
Brad Calder , Chandra Krintz , Simmi John , Todd Austin, Cache-conscious data placement, Proceedings of the eighth international conference on Architectural support for programming languages and operating systems, p.139-149, October 02-07, 1998, San Jose, California, United States
|
 |
4
|
David Callahan , Ken Kennedy , Allan Porterfield, Software prefetching, Proceedings of the fourth international conference on Architectural support for programming languages and operating systems, p.40-52, April 08-11, 1991, Santa Clara, California, United States
|
| |
5
|
M. Charney and A. Reeves. Generalized correlation-based hardware prefetching. Technical Report EE-CEG-95-1, Cornell University, Feb. 1995.
|
 |
6
|
Trishul M. Chilimbi , Mark D. Hill , James R. Larus, Cache-conscious structure layout, Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation, p.1-12, May 01-04, 1999, Atlanta, Georgia, United States
|
 |
7
|
|
 |
8
|
John W. C. Fu , Janak H. Patel , Bob L. Janssens, Stride directed prefetching in scalar processors, Proceedings of the 25th annual international symposium on Microarchitecture, p.102-110, December 01-04, 1992, Portland, Oregon, United States
|
| |
9
|
|
 |
10
|
|
 |
11
|
|
 |
12
|
|
| |
13
|
Jiwei Lu , Howard Chen , Rao Fu , Wei-Chung Hsu , Bobbie Othmer , Pen-Chung Yew , Dong-Yuan Chen, The Performance of Runtime Data Cache Prefetching in a Dynamic Optimization System, Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture, p.180, December 03-05, 2003
|
 |
14
|
Todd C. Mowry , Monica S. Lam , Anoop Gupta, Design and evaluation of a compiler algorithm for prefetching, Proceedings of the fifth international conference on Architectural support for programming languages and operating systems, p.62-73, October 12-15, 1992, Boston, Massachusetts, United States
|
| |
15
|
Toshihiro Ozawa , Yasunori Kimura , Shin'ichiro Nishizaki, Cache miss heuristics and preloading techniques for general-purpose programs, Proceedings of the 28th annual international symposium on Microarchitecture, p.243-248, November 29-December 01, 1995, Ann Arbor, Michigan, United States
|
 |
16
|
|
 |
17
|
Rodric M. Rabbah , Hariharan Sandanagobalane , Mongkol Ekpanyapong , Weng-Fai Wong, Compiler orchestrated prefetching via speculation and predication, Proceedings of the 11th international conference on Architectural support for programming languages and operating systems, October 07-13, 2004, Boston, MA, USA
|
 |
18
|
|
 |
19
|
|
| |
20
|
|
 |
21
|
|
CITED BY 2
|
|
|
|
|
Robert Pyka , Christoph Faßbach , Manish Verma , Heiko Falk , Peter Marwedel, Operating system integrated energy aware scratchpad allocation strategies for multiprocess applications, Proceedingsof the 10th international workshop on Software & compilers for embedded systems, April 20-20, 2007, Nice, France
|
|