ABSTRACT
Data-parallel languages like Fortran 90 express parallelism in the form of operations on data aggregates such as arrays. Misalignment of the operands of an array operation can reduce program performance on a distributed-memory parallel machine by requiring nonlocal data accesses. Determining array alignments that reduce communication is therefore a key issue in compiling such languages.
We present a framework for the automatic determination of array alignments in data-parallel languages such as Fortran 90. Our language model handles array sectioning, reductions, spreads, transpositions, and masked operations. We decompose alignment functions into three constituents: axis, stride, and offset. For each of these subproblems, we show how to solve the alignment problem for a basic block of code, possibly containing common subexpressions. Alignments are generated for all array objects in the code, both named program variables and intermediate results. The alignments obtained by our algorithms are more general than those provided by the “owner-computes” rule. Finally, we present some ideas for dealing with control flow, replication, and dynamic alignments that depend on loop induction variables.
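As an illustration of the axis/stride/offset decomposition, consider a statement such as A(1:N-1) = A(1:N-1) + B(2:N): if A and B are aligned identically, every addition pairs elements that sit in different positions. The Python sketch below is a hypothetical model, not the authors' algorithm. The `Alignment` class, its fields, and `misaligned_pairs` are names invented here. It assumes the template has the same rank as the array and uses "different template cell" as a rough proxy for communication, ignoring how the template is distributed over processors.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Alignment:
    """Hypothetical affine alignment of an array onto a template.

    axis[d]   : template axis that array axis d maps to
    stride[d] : spacing between consecutive elements on that template axis
    offset[d] : template position of array index 0 along that axis
    """
    axis: Tuple[int, ...]
    stride: Tuple[int, ...]
    offset: Tuple[int, ...]

    def cell(self, index):
        """Return the template cell holding the array element at `index`."""
        out = [0] * len(self.axis)
        for d, i in enumerate(index):
            out[self.axis[d]] = self.stride[d] * i + self.offset[d]
        return tuple(out)


def misaligned_pairs(a, b, pairs):
    """Count operand pairs whose elements sit in different template cells."""
    return sum(1 for ia, ib in pairs if a.cell(ia) != b.cell(ib))


# 0-based version of A(1:N-1) + B(2:N): element A[i] meets element B[i+1].
n = 8
pairs = [((i,), (i + 1,)) for i in range(n - 1)]

a = Alignment(axis=(0,), stride=(1,), offset=(0,))
b_same = Alignment(axis=(0,), stride=(1,), offset=(0,))
b_shifted = Alignment(axis=(0,), stride=(1,), offset=(-1,))

print(misaligned_pairs(a, b_same, pairs))     # identical alignments
print(misaligned_pairs(a, b_shifted, pairs))  # B shifted by one cell

# An axis permutation models a transposed placement of a 2-D array.
transposed = Alignment(axis=(1, 0), stride=(1, 1), offset=(0, 0))
print(transposed.cell((2, 3)))
```

In this toy model, shifting the offset of B by one cell makes every operand pair coincide, which is the kind of choice an alignment algorithm has to make for each array object in a basic block.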
CITED BY 30
Z. Bozkus, A. Choudhary, G. Fox, T. Haupt, S. Ranka, Fortran 90D/HPF compiler for distributed memory MIMD computers: design, implementation, and performance results, Proceedings of the 1993 ACM/IEEE conference on Supercomputing, p.351-360, December 1993, Portland, Oregon, United States
Jordi Garcia, Eduard Ayguadé, Jesus Labarta, A novel approach towards automatic data distribution, Proceedings of the 1995 ACM/IEEE conference on Supercomputing (CDROM), p.78-es, December 04-08, 1995, San Diego, California, United States
D. Cociorva, J. W. Wilkins, C. Lam, G. Baumgartner, J. Ramanujam, P. Sadayappan, Loop optimization for a class of memory-constrained computations, Proceedings of the 15th international conference on Supercomputing, p.103-113, June 2001, Sorrento, Italy
Alok Choudhary, Geoffrey Fox, Seema Hiranandani, Ken Kennedy, Charles Koelbel, Sanjay Ranka, Chau-Wen Tseng, Unified compilation of Fortran 77D and 90D, ACM Letters on Programming Languages and Systems (LOPLAS), v.2 n.1-4, p.95-114, March–Dec. 1993