ACM Home Page
Please provide us with feedback. Feedback
Better tiling and array contraction for compiling scientific programs
Full text PdfPdf (506 KB)
Source Conference on High Performance Networking and Computing archive
Proceedings of the 2002 ACM/IEEE conference on Supercomputing table of contents
Baltimore, Maryland
Pages: 1 - 12  
Year of Publication: 2002
Authors
Geoff Pike  University of California at Berkeley
Paul N. Hilfinger  University of California at Berkeley
Sponsors
IEEE-CS\DATC : IEEE Computer Society
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
IEEE Computer Society Press  Los Alamitos, CA, USA
Bibliometrics
Downloads (6 Weeks): 2,   Downloads (12 Months): 10,   Citation Count: 7
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  

ABSTRACT

Scientific programs often include multiple loops over the same data; interleaving parts of different loops may greatly improve performance. We exploit this in a compiler for Titanium, a dialect of Java. Our compiler combines reordering optimizations such as loop fusion and tiling with storage optimizations such as array contraction (eliminating or reducing the size of temporary arrays).The programmers we have in mind are willing to spend some time tuning their code and their compiler parameters. Given that, and the difficulty in statically selecting parameters such as tile sizes, it makes sense to provide automatic parameter searching alongside the compiler. Our strategy is to optimize aggressively but to expose the compiler's decisions to external control. We double or triple the performance of Gauss-Seidel relaxation and multi-grid (versus an optimizing compiler without tiling and array contraction), and we argue that ours is the best compiler for that kind of program.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
W. L. Briggs. A Multigrid Tutorial. SIAM, 1987.
3
 
4
Larry Carter, Jeanne Ferrante, Susan Flynn Hummel, Bowen Alpern, Kang-Su Gatlin. Hierarchical Tiling: A Methodology for High Performance. UCSD Technical Report CS96-508, November 1996.
 
5
 
6
C. C. Douglas et al. Maximizing Cache Memory Usage for Multigrid Algorithms. In Z. Chen, R. E. Ewing and Z.-C. Shi, editors, Multiphase Flows and Transport in Porous Media: State of the Art, Springer-Verlag, Lecture Notes in Physics, Berlin, 2000.
 
7
FFTW. http://www.fftw.org/.
 
8
 
9
S. Flynn Hummel, I. Banicescu, C. Wang, and J. Wein. Load Balancing and Data Locality via Fractiling: An Experimental Study. In Boleslaw K. Szymanski and Balaram Sinharoy, editors, Proc. Third Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers, pages 85--89. Kluwer Academic Publishers, Boston, MA, 1995.
 
10
 
11
12
 
13
 
14
T. Kisuki, P. M. W. Knijnenburg, K. Gallivan, and M. F. P. O'Boyle. The Effect of Cache Models on Iterative Compilation for Combined Tiling and Unrolling. In Proc. FDDO-3, pages 31--40, 2000.
 
15
T. Kisuki, P. M. W. Knijnenburg, and M. F. P. O'Boyle. Combined Selection of Tile Sizes and Unroll Factors Using Iterative Compilation. Technical Report 2000-07, LIACS, Leiden University, 2000.
 
16
17
18
 
19
20
 
21
M. F. P. O'Boyle, P. M. W. Knijnenburg, and G. G. Fursin. Feedback Assisted Iterative Compilation. Preprint, 2000.
 
22
 
23
G. Pike, L. Semenzato, P. Colella, P. Hilfinger. Parallel 3D Adaptive Mesh Refinement in Titanium. In Proceedings of the SIAM Conference on Parallel Processing for Scientific Computing, San Antonio, TX, March 1999.
 
24
25
26
27
 
28
 
29
30

CITED BY  7

Collaborative Colleagues:
Geoff Pike: colleagues
Paul N. Hilfinger: colleagues