Loop optimization for a class of memory-constrained computations
Source: International Conference on Supercomputing
Proceedings of the 15th International Conference on Supercomputing
Sorrento, Italy
Pages: 103-113
Year of Publication: 2001
ISBN: 1-58113-410-X
Authors
D. Cociorva, Dept. of Physics, The Ohio State University, Columbus, OH
J. W. Wilkins, Dept. of Physics, The Ohio State University, Columbus, OH
C. Lam, Dept. of Comp. & Info. Sci., The Ohio State University, Columbus, OH
G. Baumgartner, Dept. of Comp. & Info. Sci., The Ohio State University, Columbus, OH
J. Ramanujam, Dept. of Elec. & Comp. Engr., Louisiana State University, Baton Rouge, LA
P. Sadayappan, Dept. of Comp. & Info. Sci., The Ohio State University, Columbus, OH
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/377792.377814

ABSTRACT

Compute-intensive multi-dimensional summations that involve products of several arrays arise in modeling the electronic structure of materials. Sometimes several alternative formulations of a computation, representing different space-time trade-offs, are possible. Computing and storing some intermediate arrays can reduce the number of arithmetic operations, but the intermediate temporary arrays may be prohibitively large. Loop fusion can reduce memory requirements, but it can impede the effective tiling needed to minimize memory access costs. This paper develops an integrated model that combines loop tiling, to enhance data reuse, with loop fusion, to reduce the memory needed for intermediate temporary arrays. An algorithm is presented that selects tile sizes and chooses loops for fusion, with the objective of minimizing cache misses while keeping total memory usage within a given limit. Experimental results demonstrate the effectiveness of the combined loop tiling and fusion transformations performed using the developed framework.
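
The space-time trade-off described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm or cost model; it is a hypothetical example chosen for illustration, assuming the summation E[i] = sum over j and k of A[i][j] * B[j][k] * C[k]. The array names A, B, C, E, the intermediate T, and the function names are all assumptions. The unfused version stores a full intermediate array T; the fused version shares the j loop between producer and consumer, so the intermediate shrinks to a single scalar.

```python
def unfused(A, B, C):
    """Operation-reduced form: materialize the intermediate T[j] as a full array."""
    n_i, n_j, n_k = len(A), len(B), len(C)
    # Producer loop nest: T[j] = sum_k B[j][k] * C[k]  (needs n_j words of storage)
    T = [0.0] * n_j
    for j in range(n_j):
        for k in range(n_k):
            T[j] += B[j][k] * C[k]
    # Consumer loop nest: E[i] = sum_j A[i][j] * T[j]
    E = [0.0] * n_i
    for i in range(n_i):
        for j in range(n_j):
            E[i] += A[i][j] * T[j]
    return E


def fused(A, B, C):
    """Fused form: the j loop is shared, so T[j] is replaced by the scalar t."""
    n_i, n_j, n_k = len(A), len(B), len(C)
    E = [0.0] * n_i
    for j in range(n_j):
        t = 0.0  # scalar stand-in for T[j]
        for k in range(n_k):
            t += B[j][k] * C[k]
        # The consumer now runs inside the j loop and walks A down column j.
        for i in range(n_i):
            E[i] += A[i][j] * t
    return E


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[1, 0], [0, 1]]
    C = [5, 6]
    print(unfused(A, B, C))
    print(fused(A, B, C))
```

Both functions perform the same arithmetic in the same order. In the fused version the consumer accesses A column by column for a fixed j, which hints at why fusion can conflict with the loop orders and tilings that would otherwise be chosen for data reuse.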


