ACM Home Page
Please provide us with feedback. Feedback
A data locality optimizing algorithm
Full text PdfPdf (2.26 MB)
Source ACM SIGPLAN Notices archive
Volume 39 ,  Issue 4  (April 2004) table of contents
Best of PLDI 1979-1999
SPECIAL ISSUE: 1991 table of contents
Pages: 442 - 459  
Year of Publication: 2004
ISSN:0362-1340
Authors
Monica S. Lam  Stanford University, CA
Michael E. Wolf  Stanford University, CA
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 7,   Downloads (12 Months): 77,   Citation Count: 3
Additional Information:

abstract   references   cited by   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/989393.989437
What is a DOI?

ABSTRACT

This paper proposes an algorithm that improves the locality of a loop nest by transforming the code via interchange, reversal, skewing and tiling. The loop transformation algorithm is based on two concepts: a mathematical formulation of reuse and locality, and a loop transformation theory that unifies the various transforms as unimodular matrix transformations.The algorithm has been implemented in the SUIF (Stanford University Intermediate Format) compiler, and is successful in optimizing codes such as matrix multiplication, successive over-relaxation (SOR), LU decomposition without pivoting, and Givens QR factorization. Performance evaluation indicates that locality optimization is especially crucial for scaling up the performance of parallel code.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
P. Feautrier. Dataflow analysis of scalar and array references. Journal of Parallel and Distributed Computing, 20(1):23--53, February 1991.
 
3
4
 
5
6
 
7
8
 
9
 
10
 
11
U. Banerjee. Data dependence in ordinary programs. Technical Report 76--837, University of Illinios at Urbana-Champaign, Nov 1976.
 
12
 
13
U. Banerjee. Unimodular transformations of double loops. In 3rd Workshop on Languages and Compilers for Parallel Computing, Aug 1990.
14
15
 
16
K. Gallivan, W. Jalby, U. Meier, and A. Sameh. The impact of hierachical memory systems on linear algebra algorithm design. Technical report, University of Illinios, 1987.
 
17
 
18
G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, 1989.
 
19
F. Irigoin and R. Triolet. Computing dependence direction vectors and dependence cones. Technical Report E94, Centre D'Automatique et Informatique, 1988.
20
21
22
 
23
 
24
R. Schreiber and J. Dongarra. Automatic blocking of nested loops. 1990.
 
25
 
26
M. J. Wolfe. Techniques for improving the inherent parallelism in programs. Technical Report UIUCDCS-R-78-929, University of Illinois, 1978.
27

Collaborative Colleagues:
Monica S. Lam: colleagues
Michael E. Wolf: colleagues