ABSTRACT
Portability, efficiency, and ease of coding are all important considerations in choosing the programming model for a scalable parallel application. The message-passing programming model is widely used because of its portability, yet some applications are too complex to code in it while also maintaining a balanced computational load and avoiding redundant computations. The shared-memory programming model simplifies coding, but it is not portable and often provides little control over interprocessor data transfer costs. This paper describes a new approach, called Global Arrays (GA), that combines the better features of both models, leading to both simple coding and efficient execution. The key concept of GA is that it provides a portable interface through which each process in a MIMD parallel program can asynchronously access logical blocks of physically distributed matrices, with no need for explicit cooperation by other processes. We have implemented GA libraries on a variety of computer systems, including the Intel DELTA and Paragon and the IBM SP-1 (all message-passing machines), the Kendall Square KSR-2 (a nonuniform-access shared-memory machine), and networks of Unix workstations. We discuss the design and implementation of these libraries, report their performance, illustrate the use of GA in the context of computational chemistry applications, and describe the use of a GA performance visualization tool.
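The key concept stated in the abstract, access to a logical block of a physically distributed matrix without the owning processes taking part, can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: the class `ToyGlobalArray`, its methods `get`, `put`, and `owner_of_row`, and the row-block distribution are all hypothetical choices made for illustration and are not the GA library interface, and no real communication or asynchrony is modeled.

```python
# Toy, single-process simulation of a block-distributed matrix.
# Every name here (ToyGlobalArray, get, put, owner_of_row) is hypothetical
# and is NOT the Global Arrays library API; the row-block layout is assumed.

class ToyGlobalArray:
    """A rows x cols matrix split into contiguous row blocks, one per simulated process."""

    def __init__(self, rows, cols, nproc):
        self.rows, self.cols, self.nproc = rows, cols, nproc
        # Rows per block, rounded up so that every row has an owner.
        self.block = -(-rows // nproc)
        # local[p] holds only the rows "physically" stored on process p.
        self.local = []
        for p in range(nproc):
            lo = min(p * self.block, rows)
            hi = min(lo + self.block, rows)
            self.local.append([[0.0] * cols for _ in range(hi - lo)])

    def owner_of_row(self, i):
        """Return (owning process, row index inside that process's block)."""
        return i // self.block, i % self.block

    def get(self, ilo, ihi, jlo, jhi):
        """Copy the logical block [ilo:ihi, jlo:jhi] into a new local buffer."""
        out = []
        for i in range(ilo, ihi):
            p, r = self.owner_of_row(i)
            out.append(self.local[p][r][jlo:jhi])
        return out

    def put(self, ilo, jlo, buf):
        """Write a local buffer into the logical block starting at (ilo, jlo)."""
        for di, row in enumerate(buf):
            p, r = self.owner_of_row(ilo + di)
            self.local[p][r][jlo:jlo + len(row)] = row


# The caller addresses the matrix by global indices only; the block written
# below spans rows 1..3, which live on simulated processes 0 and 1.
ga = ToyGlobalArray(rows=6, cols=4, nproc=3)
ga.put(1, 1, [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
patch = ga.get(1, 4, 1, 3)
```

In this sketch the caller never names an owning process: the mapping from global indices to owners is hidden behind `get` and `put`, which is the coding simplification the abstract attributes to a shared-memory style of access.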