Transformations for imperfectly nested loops
Source
Conference on High Performance Networking and Computing
Proceedings of the 1996 ACM/IEEE conference on Supercomputing (CDROM)
Pittsburgh, Pennsylvania, United States
Article No. 12
Year of Publication: 1996
ISBN: 0-89791-854-1
Publisher
IEEE Computer Society
Washington, DC, USA
ABSTRACT
Loop transformations are critical for compiling high-performance code for modern computers. Existing work has focused on transformations for perfectly nested loops (that is, loops in which all assignment statements are contained within the innermost loop of a loop nest). In practice, most loop nests, such as those in matrix factorization codes, are imperfectly nested. In some programs, imperfectly nested loops can be transformed into perfectly nested loops by loop distribution, but this is not always legal. In this paper, we present an approach to transforming imperfectly nested loops directly. Our approach is an extension of the linear loop transformation framework for perfectly nested loops, and it models permutation, reversal, skewing, scaling, alignment, distribution and jamming. We also give a completion procedure which generates a complete transformation from a partial transformation.
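The distinction drawn in the abstract can be made concrete with a small sketch. The Python below is an illustration only and is not taken from the paper: the function names and the toy "factorization-like" kernel are hypothetical, chosen to show one imperfect nest where loop distribution preserves the result and one where it does not, because the outer-level statement and the inner-level statement depend on each other across iterations.

```python
import math


def row_sums_imperfect(a):
    """Imperfect nest: S1 sits in the outer loop, S2 in the innermost loop."""
    s = [0.0] * len(a)
    for i in range(len(a)):
        s[i] = 0.0                  # S1: outside the innermost loop
        for j in range(len(a[i])):
            s[i] += a[i][j]         # S2: inside the innermost loop
    return s


def row_sums_distributed(a):
    """Same computation after loop distribution: S1 and S2 get their own nests."""
    s = [0.0] * len(a)
    for i in range(len(a)):
        s[i] = 0.0                  # S1 alone
    for i in range(len(a)):         # S2 is now perfectly nested
        for j in range(len(a[i])):
            s[i] += a[i][j]
    return s


def factor_like_imperfect(v):
    """Toy kernel with the dependence shape of a factorization (hypothetical).

    S2 at step k reads v[k] written by S1 at step k, and S1 at a later
    step reads values that S2 updated earlier, so S1 and S2 form a cycle.
    """
    v = list(v)
    for k in range(len(v)):
        v[k] = math.sqrt(v[k])      # S1
        for i in range(k + 1, len(v)):
            v[i] = v[i] - v[k]      # S2
    return v


def factor_like_distributed(v):
    """Distributing the nest above runs every S1 before any S2.

    This reorders the dependence cycle, so the result changes.
    """
    v = list(v)
    for k in range(len(v)):
        v[k] = math.sqrt(v[k])      # all S1 instances first
    for k in range(len(v)):
        for i in range(k + 1, len(v)):
            v[i] = v[i] - v[k]      # then all S2 instances
    return v
```

For the row-sum nest the two versions agree on every input, so distribution yields an equivalent perfect nest. For the toy kernel the two versions generally disagree (for example on the input [16.0, 13.0, 20.0]), which is the kind of case where a transformation has to act on the imperfect nest directly.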
CITED BY 8
Suvas Vajracharya, Steve Karmesin, Peter Beckman, James Crotinger, Allen Malony, Sameer Shende, Rod Oldehoeft, Stephen Smith, SMARTS: exploiting temporal locality and parallelism through vertical execution, Proceedings of the 13th international conference on Supercomputing, p.302-310, June 20-25, 1999, Rhodes, Greece