| Push vs. pull: data movement for linked data structures |
| Full text |
Pdf
(983 KB)
|
| Source
|
International Conference on Supercomputing
archive
Proceedings of the 14th international conference on Supercomputing
table of contents
Santa Fe, New Mexico, United States
Pages: 176 - 186
Year of Publication: 2000
ISBN:1-58113-270-0
|
|
Authors
|
|
Chia-Lin Yang
|
Department of Computer Science, Duke University, Durham, NC
|
|
Alvin R. Lebeck
|
Department of Computer Science, Duke University, Durham, NC
|
|
| Sponsor |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 3, Downloads (12 Months): 42, Citation Count: 17
|
|
|
ABSTRACT
As the performance gap between the CPU and main memory continues to grow, techniques to hide memory latency are essential to deliver a high performance computer system. Prefetching can often overlap memory latency with computation for array-based numeric applications. However, prefetching for pointer-intensive applications still remains a challenging problem. Prefetching linked data structures (LDS) is difficult because the address sequence of LDS traversal does not present the same arithmetic regularity as array-based applications and the data dependence of pointer dereferences can serialize the address generation process.In this paper, we propose a cooperative hardware/software mechanism to reduce memory access latencies for linked data structures. Instead of relying on the past address history to predict future accesses, we identify the load instructions that traverse the LDS, and execute them ahead of the actual computation. To overcome the serial nature of the LDS address generation, we attach a prefetch controller to each level of the memory hierarchy and push, rather than pull, data to the CPU. Our simulations, using four pointer-intensive applications, show that the push model can achieve between 4% and 30% larger reductions in execution time compared to the pull model.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
 |
2
|
|
| |
3
|
D. C. Burger, T. M.. Austin, and S. Bennett. Evaluating future microprocessors-the simplescalar tool set. Technical Report 1308, Computer Sciences Department, University of Wisconsin-Madison, July 1996.
|
| |
4
|
T. Chilimbi, J. Larus, and M. Hill. Improving pointerbased codes through cache-concious data placement. Technical Report CSL-TR-98-1365, University of Wisconsin, Madison., March 1998.
|
 |
5
|
|
 |
6
|
|
 |
7
|
|
 |
8
|
|
| |
9
|
C. Kolb. The rayshade user's guide. In http://graphics.stanford.EDU/ cek/rayshade.
|
| |
10
|
Mikko H. Lipasti , William J. Schmidt , Steven R. Kunkel , Robert R. Roediger, SPAID: software prefetching in pointer- and call-intensive environments, Proceedings of the 28th annual international symposium on Microarchitecture, p.231-236, November 29-December 01, 1995, Ann Arbor, Michigan, United States
|
 |
11
|
|
 |
12
|
|
 |
13
|
Todd C. Mowry , Monica S. Lam , Anoop Gupta, Design and evaluation of a compiler algorithm for prefetching, Proceedings of the fifth international conference on Architectural support for programming languages and operating systems, p.62-73, October 12-15, 1992, Boston, Massachusetts, United States
|
 |
14
|
Subbarao Palacharla , Norman P. Jouppi , J. E. Smith, Complexity-effective superscalar processors, Proceedings of the 24th annual international symposium on Computer architecture, p.206-218, June 01-04, 1997, Denver, Colorado, United States
|
 |
15
|
|
| |
16
|
|
| |
17
|
|
 |
18
|
|
 |
19
|
Amir Roth , Andreas Moshovos , Gurindar S. Sohi, Dependence based prefetching for linked data structures, Proceedings of the eighth international conference on Architectural support for programming languages and operating systems, p.115-126, October 02-07, 1998, San Jose, California, United States
|
 |
20
|
|
 |
21
|
|
| |
22
|
|
 |
23
|
|
CITED BY 17
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Zhen Yang , Xudong Shi , Feiqi Su , Jih-Kwon Peir, Overlapping dependent loads with addressless preload, Proceedings of the 15th international conference on Parallel architectures and compilation techniques, September 16-20, 2006, Seattle, Washington, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Peer to Peer - Readers of this Article have also read:
-
Data structures for quadtree approximation and compression
Communications of the ACM
28, 9
Hanan Samet
-
A hierarchical single-key-lock access control using the Chinese remainder theorem
Proceedings of the 1992 ACM/SIGAPP Symposium on Applied computing
Kim S. Lee
, Huizhu Lu
, D. D. Fisher
-
The GemStone object database management system
Communications of the ACM
34, 10
Paul Butterworth
, Allen Otis
, Jacob Stein
-
Putting innovation to work: adoption strategies for multimedia communication systems
Communications of the ACM
34, 12
Ellen Francik
, Susan Ehrlich Rudman
, Donna Cooper
, Stephen Levine
-
An intelligent component database for behavioral synthesis
Proceedings of the 27th ACM/IEEE Design Automation Conference on
Gwo-Dong Chen
, Daniel D. Gajski
|