| Compile-time composition of run-time data and iteration reorderings |
| Full text |
Pdf
(209 KB)
|
| Source
|
Conference on Programming Language Design and Implementation
archive
Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
table of contents
San Diego, California, USA
SESSION: Code optimization I
table of contents
Pages: 91 - 102
Year of Publication: 2003
ISBN:1-58113-662-5
Also published in ...
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 8, Downloads (12 Months): 48, Citation Count: 16
|
|
|
ABSTRACT
Many important applications, such as those using sparse data structures, have memory reference patterns that are unknown at compile-time. Prior work has developed run-time reorderings of data and computation that enhance locality in such applications.This paper presents a compile-time framework that allows the explicit composition of run-time data and iteration-reordering transformations. Our framework builds on the iteration-reordering framework of Kelly and Pugh to represent the effects of a given composition. To motivate our extension, we show that new compositions of run-time reordering transformations can result in better performance on three benchmarks.We show how to express a number of run-time data and iteration-reordering transformations that focus on improving data locality. We also describe the space of possible run-time reordering transformations and how existing transformations fit within it. Since sparse tiling techniques are included in our framework, they become more generally applicable, both to a larger class of applications, and in their composition with other reordering transformations. Finally, within the presented framework data need be remapped only once at runtime for a given composition thus exhibiting one example of overhead reductions the framework can express.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Nawaaz Ahmed , Nikolay Mateev , Keshav Pingali, Synthesizing transformations for locality enhancement of imperfectly-nested loop nests, Proceedings of the 14th international conference on Supercomputing, p.141-152, May 08-11, 2000, Santa Fe, New Mexico, United States
[doi> 10.1145/335231.335245]
|
| |
2
|
|
 |
3
|
Steve Carr , Kathryn S. McKinley , Chau-Wen Tseng, Compiler optimizations for improving data locality, Proceedings of the sixth international conference on Architectural support for programming languages and operating systems, p.252-262, October 05-07, 1994, San Jose, California, United States
|
 |
4
|
|
| |
5
|
R. Das, D. Mavriplis, J. Saltz, S. Gupta, and R. Ponnusamy. The design and implementation of a parallel unstructured Euler solver using software primitives, AIAA-92-0562. In Proceedings of the 30th Aerospace Sciences Meeting, 1992.
|
| |
6
|
|
 |
7
|
|
| |
8
|
|
| |
9
|
C. C. Douglas, J. Hu, M. Kowarschik, U. Rüde, and C. Weiss. Cache Optimization for Structured and Unstructured Grid Multigrid. Electronic Transaction on Numerical Analysis, 10:21--40, February 2000.
|
| |
10
|
|
| |
11
|
|
| |
12
|
|
| |
13
|
|
| |
14
|
M. Kandemir , A. Choudhary , J. Ramanujam , P. Banerjee, Improving locality using loop and data transformations in an integrated framework, Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture, p.285-297, November 1998, Dallas, Texas, United States
|
| |
15
|
|
| |
16
|
|
| |
17
|
|
| |
18
|
|
 |
19
|
|
 |
20
|
|
| |
21
|
|
| |
22
|
|
| |
23
|
|
| |
24
|
|
 |
25
|
|
 |
26
|
|
 |
27
|
|
| |
28
|
Jaswinder Pal Singh , Chris Holt , Takashi Totsuka , Anoop Gupta , John Hennessy, Load balancing and data locality in adaptive hierarchical N-body methods: Barnes-Hut, fast multipole, and radiosity, Journal of Parallel and Distributed Computing, v.27 n.2, p.118-141, June 1995
[doi> 10.1006/jpdc.1995.1077]
|
| |
29
|
|
| |
30
|
M. M. Strout, L. Carter, J. Ferrante, J. Freeman, and B. Kreaseck. Combining performance aspects of irregular gauss-seidel via sparse tiling. In Proceedings of the 15th Workshop on Languages and Compilers for Parallel Computing (LCPC), July 2002.
|
 |
31
|
William Thies , Frédéric Vivien , Jeffrey Sheldon , Saman Amarasinghe, A unified framework for schedule and storage optimization, Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation, p.232-242, June 2001, Snowbird, Utah, United States
|
| |
32
|
|
| |
33
|
H. Yu, F. Dang, and L. Rauchwerger. Parallel reductions: An application of adaptive algorithm selection. In Proceedings of the 15th Workshop on Languages and Compilers for Parallel Computing (LCPC), July 2002.
|
CITED BY 16
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Brian S. White , Sally A. McKee , Bronis R. de Supinski , Brian Miller , Daniel Quinlan , Martin Schulz, Improving the computational intensity of unstructured mesh applications, Proceedings of the 19th annual international conference on Supercomputing, June 20-22, 2005, Cambridge, Massachusetts
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Xiaoming Gu , Ian Christopher , Tongxin Bai , Chengliang Zhang , Chen Ding, A component model of spatial locality, Proceedings of the 2009 international symposium on Memory management, June 19-20, 2009, Dublin, Ireland
|
|
|
|
|