ABSTRACT
Scientific problems contain substantial data-level parallelism. The PARTY runtime system is an attempt to obtain efficient parallel implementations of scientific computations, particularly those whose data dependencies become manifest only at runtime, which can preclude compiler-based detection of certain types of parallelism. The automated system is structured as follows. First, an appropriate level of granularity is selected for the computations. Next, a directed acyclic graph representation of the program is generated, to which various aggregation techniques may be applied in order to generate efficient schedules. Finally, these schedules are mapped onto the target machine. We describe initial results from experiments on the Intel Hypercube and the Encore Multimax that indicate the usefulness of our approach.
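The pipeline outlined above (dependency graph, schedule, mapping) can be illustrated with a minimal sketch. This is not PARTY's implementation: the function names, the wavefront (topological level) scheduling, and the round-robin mapping are all assumptions chosen for illustration, and the aggregation step is omitted. The sketch assumes each task is one loop iteration whose dependencies on earlier iterations are known only at runtime, for example through an index array.

```python
from collections import defaultdict

def build_dag(deps):
    """Build a DAG from runtime dependency lists (hypothetical helper).

    deps[i] lists the iterations that iteration i reads from; only
    references to earlier iterations (j < i) are kept as edges.
    """
    return {i: [j for j in d if j < i] for i, d in enumerate(deps)}

def wavefronts(dag):
    """Group tasks into wavefronts: tasks in one wavefront are independent."""
    level = {}
    for i in sorted(dag):  # predecessors always have a smaller index
        level[i] = 1 + max((level[j] for j in dag[i]), default=-1)
    fronts = defaultdict(list)
    for i, lvl in level.items():
        fronts[lvl].append(i)
    return [fronts[lvl] for lvl in sorted(fronts)]

def map_to_processors(fronts, nprocs):
    """Assign the tasks of each wavefront to processors round-robin."""
    schedule = []
    for front in fronts:
        step = [[] for _ in range(nprocs)]
        for k, task in enumerate(front):
            step[k % nprocs].append(task)
        schedule.append(step)
    return schedule

# Dependencies as they might be discovered at runtime (illustrative data).
deps = [[], [0], [0], [1, 2], []]
fronts = wavefronts(build_dag(deps))
schedule = map_to_processors(fronts, nprocs=2)
```

Each entry of `schedule` is one parallel step holding a task list per processor; a synchronization point between steps is assumed. A coarser granularity could be modeled by letting each task stand for a block of iterations rather than a single one.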