ABSTRACT
For many Fortran90 and HPF programs performing dense matrix computations, the main computational portion of the program belongs to a class of kernels known as stencils. Stencil computations are commonly used in solving partial differential equations, image processing, and geometric modeling. The efficient handling of such stencils is critical for achieving high performance on distributed-memory machines. Compiling stencils into efficient code is viewed as so important that some companies have built special-purpose compilers for handling them and others have added stencil-recognizers to existing compilers.

In this paper we present a general compilation strategy for stencils written using Fortran90 array constructs. Our strategy is capable of optimizing single- or multi-statement stencils and applies equally well to stencils specified with shift intrinsics or with array syntax. The strategy eliminates the need for pattern-recognition algorithms by orchestrating a set of optimizations that address the overhead of both intraprocessor and interprocessor data movement that results from the translation of Fortran90 array constructs. Our experimental results show that code produced by this strategy beats or matches the best code produced by the special-purpose compilers or pattern-recognition schemes that are known to us. In addition, our strategy produces highly optimized code in situations where the others fail, yielding performance improvements of several orders of magnitude, and thus provides a stencil compilation strategy that is more robust than its predecessors.
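To make the two source forms mentioned above concrete, the following is a minimal Python sketch of a one-dimensional 3-point stencil written once in the style of shift intrinsics and once in the style of array sections. It only illustrates what a stencil is; it is not the compilation strategy of the paper. The helper names `cshift`, `stencil_shift`, and `stencil_sections` are hypothetical, and `cshift` is an assumed stand-in for the circular-shift behavior of the Fortran90 `CSHIFT` intrinsic on a rank-one array.

```python
def cshift(row, shift):
    """Circular shift modeled on Fortran90 CSHIFT for a 1-D array.

    With 0-based indexing, result[i] = row[(i + shift) mod n], so a
    positive shift moves elements toward lower indices.
    """
    n = len(row)
    return [row[(i + shift) % n] for i in range(n)]


def stencil_shift(src):
    """3-point sum in shift-intrinsic style.

    Mirrors: dst = CSHIFT(src, -1) + src + CSHIFT(src, +1)
    Boundary elements wrap around because the shift is circular.
    """
    left = cshift(src, -1)
    right = cshift(src, 1)
    return [l + c + r for l, c, r in zip(left, src, right)]


def stencil_sections(src):
    """The same 3-point sum in array-section style, interior points only.

    Mirrors: dst(2:n-1) = src(1:n-2) + src(2:n-1) + src(3:n)
    Boundary elements are copied through unchanged.
    """
    n = len(src)
    dst = list(src)
    dst[1:n - 1] = [a + b + c
                    for a, b, c in zip(src[0:n - 2], src[1:n - 1], src[2:n])]
    return dst


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    print(stencil_shift(data))     # [8, 6, 9, 12, 10]
    print(stencil_sections(data))  # [1, 6, 9, 12, 5]
```

The two forms agree on the interior points and differ only at the boundaries. Each shifted copy in the first form is a whole-array temporary, which gives a rough picture of the intraprocessor data movement the abstract refers to.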