Compiling stencils in high performance Fortran
Source: Conference on High Performance Networking and Computing
Proceedings of the 1997 ACM/IEEE conference on Supercomputing (CDROM)
San Jose, CA
Pages: 1 - 20  
Year of Publication: 1997
ISBN: 0-89791-985-8
Authors
Gerald Roth  Rice University, Houston, TX
John Mellor-Crummey  Rice University, Houston, TX
Ken Kennedy  Rice University, Houston, TX
R. Gregg Brickner  Los Alamos National Laboratory, Los Alamos, NM
Sponsors
IEEE-CS\DATC : IEEE Computer Society
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/509593.509605

ABSTRACT

For many Fortran90 and HPF programs performing dense matrix computations, the main computational portion of the program belongs to a class of kernels known as stencils. Stencil computations are commonly used in solving partial differential equations, image processing, and geometric modeling. The efficient handling of such stencils is critical for achieving high performance on distributed-memory machines. Compiling stencils into efficient code is viewed as so important that some companies have built special-purpose compilers for handling them and others have added stencil recognizers to existing compilers.

In this paper we present a general compilation strategy for stencils written using Fortran90 array constructs. Our strategy is capable of optimizing single-statement or multi-statement stencils and applies equally well to stencils specified with shift intrinsics and to those specified with array syntax. The strategy eliminates the need for pattern-recognition algorithms by orchestrating a set of optimizations that address the overhead of both intraprocessor and interprocessor data movement that results from the translation of Fortran90 array constructs. Our experimental results show that code produced by this strategy beats or matches the best code produced by the special-purpose compilers or pattern-recognition schemes that are known to us. In addition, our strategy produces highly optimized code in situations where the others fail, yielding performance improvements of several orders of magnitude, and thus provides a stencil compilation strategy that is more robust than its predecessors.
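To make concrete the two ways of specifying a stencil that the abstract mentions, the following is a minimal sketch, not taken from the paper. It writes the same 5-point averaging stencil once with the CSHIFT intrinsic and once with array-section syntax. The array names (src, dst_shift, dst_sect), the size n = 6, and the 0.25 weighting are all hypothetical choices for illustration, and the HPF distribution directives that a real program would carry are omitted.

```fortran
program stencil_forms
  implicit none
  integer, parameter :: n = 6            ! hypothetical problem size
  real :: src(n, n), dst_shift(n, n), dst_sect(n, n)
  integer :: i, j

  ! Fill the source array with arbitrary, exactly representable values.
  do j = 1, n
    do i = 1, n
      src(i, j) = real(i + 10 * j)
    end do
  end do

  ! Form 1: shift intrinsics. CSHIFT(a, shift=1, dim=1) yields a(i+1, j),
  ! wrapping circularly at the array boundary.
  dst_shift = 0.25 * (cshift(src, shift=1, dim=1) + cshift(src, shift=-1, dim=1) &
                    + cshift(src, shift=1, dim=2) + cshift(src, shift=-1, dim=2))

  ! Form 2: array syntax. Only interior points are updated, so no wraparound.
  dst_sect = 0.0
  dst_sect(2:n-1, 2:n-1) = 0.25 * (src(3:n, 2:n-1) + src(1:n-2, 2:n-1) &
                                 + src(2:n-1, 3:n) + src(2:n-1, 1:n-2))

  ! The two forms should agree on the interior; they differ only at the
  ! boundary, where CSHIFT wraps around.
  print *, 'max interior difference:', &
           maxval(abs(dst_shift(2:n-1, 2:n-1) - dst_sect(2:n-1, 2:n-1)))
end program stencil_forms
```

On the interior both forms add the same four neighbors in the same order, so the printed maximum difference should be zero. The sketch only shows the source-level forms; it does not show the optimizations the paper applies to them.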


