|
ABSTRACT
Although central processor speeds continues to improve, improvements in overall system performance are increasingly hampered by memory latency, especially for pointer-intensive applications. To counter this loss of performance, numerous data and instruction prefetch mechanisms have been proposed. Recently, several proposals have posited a memory-side prefetcher; typically, these prefetchers involve a distinct processor that executes a program slice that would effectively prefetch data needed by the primary program. Alternative designs embody large state tables that learn the miss reference behavior of the processor and attempt to prefetch likely misses.This paper proposes Content-Directed Data Prefetching, a data prefetching architecture that exploits the memory allocation used by operating systems and runtime systems to improve the performance of pointer-intensive applications constructed using modern language systems. This technique is modeled after conservative garbage collection, and prefetches "likely" virtual addresses observed in memory references. This prefetching mechanism uses the underlying data of the application, and provides an 11.3% speedup using no additional processor state. By adding less than ½% space overhead to the second level cache, performance can be further increased to 12.6% across a range of "real world" applications.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
H.-J. Boehm. Hardware and operating system support for conservative garbage collection. In International Workshop on Memory Management, pages 61-67, Palo Alto, California, October 1991. IEEE Press.
|
| |
2
|
M. Charney and A. Reeves. Generalized correlation based hardware prefetching. Technical Report EE-CEG-95-1, Cornell University, February 1995.
|
 |
3
|
|
| |
4
|
|
 |
5
|
|
 |
6
|
|
| |
7
|
Mikko H. Lipasti , William J. Schmidt , Steven R. Kunkel , Robert R. Roediger, SPAID: software prefetching in pointer- and call-intensive environments, Proceedings of the 28th annual international symposium on Microarchitecture, p.231-236, November 29-December 01, 1995, Ann Arbor, Michigan, United States
|
 |
8
|
|
 |
9
|
Todd C. Mowry , Monica S. Lam , Anoop Gupta, Design and evaluation of a compiler algorithm for prefetching, Proceedings of the fifth international conference on Architectural support for programming languages and operating systems, p.62-73, October 12-15, 1992, Boston, Massachusetts, United States
|
| |
10
|
Toshihiro Ozawa , Yasunori Kimura , Shin'ichiro Nishizaki, Cache miss heuristics and preloading techniques for general-purpose programs, Proceedings of the 28th annual international symposium on Microarchitecture, p.243-248, November 29-December 01, 1995, Ann Arbor, Michigan, United States
|
 |
11
|
|
 |
12
|
Amir Roth , Andreas Moshovos , Gurindar S. Sohi, Dependence based prefetching for linked data structures, Proceedings of the eighth international conference on Architectural support for programming languages and operating systems, p.115-126, October 02-07, 1998, San Jose, California, United States
|
 |
13
|
|
CITED BY 28
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Luis M. Ramos , José Luis Briz , Pablo E. Ibáñez , Victor Viñals, Data prefetching in a cache hierarchy with high bandwidth and capacity, Proceedings of the 2006 workshop on MEmory performance: DEaling with Applications, systems and architectures, p.37-44, September 16-20, 2006, Seattle, Washington
|
|
|
Kartik K. Agaram , Stephen W. Keckler , Calvin Lin , Kathryn S. McKinley, Decomposing memory performance: data structures and phases, Proceedings of the 2006 international symposium on Memory management, June 10-11, 2006, Ottawa, Ontario, Canada
|
|
|
|
|
|
Zhen Yang , Xudong Shi , Feiqi Su , Jih-Kwon Peir, Overlapping dependent loads with addressless preload, Proceedings of the 15th international conference on Parallel architectures and compilation techniques, September 16-20, 2006, Seattle, Washington, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Seung Woo Son , Mahmut Kandemir , Mustafa Karakoy , Dhruva Chakrabarti, A compiler-directed data prefetching scheme for chip multiprocessors, Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming, February 14-18, 2009, Raleigh, NC, USA
|
|