Data locality enhancement by memory reduction
Source
International Conference on Supercomputing
Proceedings of the 15th international conference on Supercomputing
Sorrento, Italy
Pages: 50-64
Year of Publication: 2001
ISBN: 1-58113-410-X
Authors
Yonghong Song, Sun Microsystems, Inc., 901 San Antonio Rd., Palo Alto, CA
Rong Xu, Department of Computer Sciences, Purdue University, West Lafayette, IN
Cheng Wang, Department of Computer Sciences, Purdue University, West Lafayette, IN
Zhiyuan Li, Department of Computer Sciences, Purdue University, West Lafayette, IN
ABSTRACT
In this paper, we propose memory reduction as a new approach to data locality enhancement. Under this approach, we use the compiler to reduce the size of the data repeatedly referenced in a collection of nested loops. Between their reuses, the data will more likely remain in higher-speed memory devices, such as the cache. Specifically, we present an optimal algorithm to combine loop shifting, loop fusion and array contraction to reduce the temporary array storage required to execute a collection of loops. When applied to 20 benchmark programs, our technique reduces the memory requirement, counting both the data and the code, by 51% on average. The transformed programs gain a speedup of 1.40 on average, due to the reduced footprint and, consequently, the improved data locality.
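As a rough illustration of the three transformations named in the abstract, the sketch below applies loop shifting, loop fusion and array contraction by hand to a small two-loop kernel. It is not taken from the paper: the kernel, the function names `original` and `shifted_fused_contracted`, and the scalars `prev` and `cur` are all hypothetical, and the paper's algorithm chooses the shifts automatically rather than by inspection.

```python
def original(a, b):
    """Two separate loops that communicate through a full temporary array t."""
    n = len(a)
    t = [0.0] * n                      # temporary array: n elements
    for i in range(n):
        t[i] = a[i] + b[i]             # producer loop
    c = [0.0] * max(n - 1, 0)
    for i in range(n - 1):
        c[i] = t[i] + t[i + 1]         # consumer loop reads t[i] and t[i + 1]
    return c


def shifted_fused_contracted(a, b):
    """Same result with the temporary array reduced to two scalars.

    Fusing the loops directly would be illegal, because consumer iteration i
    needs t[i + 1], which the producer has not written yet. Shifting the
    consumer by one iteration removes that problem; after fusion only the
    current and previous values of t are live, so t contracts to scalars.
    """
    n = len(a)
    c = [0.0] * max(n - 1, 0)
    if n == 0:
        return c
    prev = a[0] + b[0]                 # peeled first producer iteration (t[0])
    for i in range(1, n):
        cur = a[i] + b[i]              # producer statement, iteration i
        c[i - 1] = prev + cur          # consumer statement, shifted by one
        prev = cur                     # slide the two-element window
    return c


if __name__ == "__main__":
    a = [1, 2, 3, 4]
    b = [10, 20, 30, 40]
    print(original(a, b))
    print(shifted_fused_contracted(a, b))
```

In this toy case the temporary storage drops from n elements to two scalars, which is the kind of footprint reduction the abstract credits for the improved locality.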