Computation regrouping: restructuring programs for temporal data cache locality
Source: International Conference on Supercomputing
Proceedings of the 16th International Conference on Supercomputing
New York, New York, USA
SESSION: Compilers 2
Pages: 252 - 261  
Year of Publication: 2002
ISBN:1-58113-483-5
Authors
Venkata K. Pingali  Information Sciences Institute, University of Southern California
Sally A. McKee  School of Computing, University of Utah
Wilson C. Hsieh  School of Computing, University of Utah
John B. Carter  School of Computing, University of Utah
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/514191.514227

ABSTRACT

Data access costs contribute significantly to the execution time of applications with complex data structures. As the latency of memory accesses becomes high relative to processor cycle times, application performance is increasingly limited by memory performance. In some situations it may be reasonable to trade increased computation costs for reduced memory costs. The contributions of this paper are threefold: we provide a detailed analysis of the memory performance of a set of seven memory-intensive benchmarks; we describe Computation Regrouping, a general, source-level approach to improving the overall performance of these applications by improving temporal locality to reduce cache and TLB miss ratios (and thus memory stall times); and we demonstrate significant performance improvements from applying Computation Regrouping to our suite of seven benchmarks. With Computation Regrouping, we observe an average speedup of 1.97, with individual speedups ranging from 1.26 to 3.03. Most of this improvement comes from eliminating memory stall time.
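
The trade described in the abstract (extra computation in exchange for better temporal locality) can be illustrated with a toy sketch. This is not the authors' implementation or one of their benchmarks. The table, the query stream, `BLOCK_SIZE`, and both function names are hypothetical, and the count of "block switches" is only a rough stand-in for cache or TLB misses. The sketch defers a stream of lookups, buckets them by the region of data they touch, and then processes one region at a time.

```python
from collections import defaultdict

BLOCK_SIZE = 4  # hypothetical: table entries that share one "cache block"


def block_of(index):
    """Map a table index to the block that holds it."""
    return index // BLOCK_SIZE


def run_in_order(table, queries):
    """Process queries in arrival order and count changes of the active block."""
    results, switches, last = {}, 0, None
    for qid, index in enumerate(queries):
        if block_of(index) != last:
            switches += 1
            last = block_of(index)
        results[qid] = table[index] * 2
    return results, switches


def run_regrouped(table, queries):
    """Defer queries, bucket them by block, then process one block at a time."""
    buckets = defaultdict(list)
    for qid, index in enumerate(queries):  # extra bookkeeping: the added computation
        buckets[block_of(index)].append((qid, index))
    results, switches = {}, 0
    for block in sorted(buckets):
        switches += 1  # each block becomes active exactly once
        for qid, index in buckets[block]:
            results[qid] = table[index] * 2
    return results, switches


table = list(range(16))
queries = [0, 12, 1, 13, 2, 14, 3, 15]  # alternates between two distant blocks
baseline, baseline_switches = run_in_order(table, queries)
regrouped, regrouped_switches = run_regrouped(table, queries)
```

Both versions produce the same results per query. The regrouped version does additional bucketing work but touches each block once instead of bouncing between blocks, which is the kind of exchange the abstract calls trading computation cost for memory cost.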



