ACM Home Page
Please provide us with feedback. Feedback
A methodology for parallelizing programs for multicomputers and complex memory multiprocessors
Full text PdfPdf (1.11 MB)
Source Conference on High Performance Networking and Computing archive
Proceedings of the 1989 ACM/IEEE conference on Supercomputing table of contents
Reno, Nevada, United States
Pages: 637 - 646  
Year of Publication: 1989
ISBN:0-89791-341-8
Authors
J. Ramanujam  Department of Computer and Information Science, The Ohio State University, Columbus, Ohio
P. Sadayappan  Department of Computer and Information Science, The Ohio State University, Columbus, Ohio
Sponsors
Argonne Natl Lab : Argonne National Lab
IEEE-CS : Computer Society
NASA : National Aeronatics and Space Administration
SIGARCH: ACM Special Interest Group on Computer Architecture
Los Alamos National Labs : Los Alamos National Labs
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 2,   Downloads (12 Months): 8,   Citation Count: 14
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/76263.76335
What is a DOI?

ABSTRACT

The availability of large distributed memory multiprocessors and parallel computers with complex memory hierarchies have left the programmer with the difficult task of planning the detailed parallel execution of a program; for example, in the case of distributed memory machines, the programmer is forced to manually distribute code and data in addition to managing communication among tasks explicitly. Current work in compiler support for this task has focused on automating task partitioning, assuming a fixed data partition or a programmer-specified (in the form of annotations) data partition. This paper argues that one of the reasons for inefficient parallel execution is the lack of synergism between task partitioning and data partitioning and allocation; hence data and task allocation should both be influenced by the inherent dependence structure of the computation (which is the source of synergism). We present a methodology based on a unified approach to task and data partitioning; we show how to derive data and task partitions for computations expressed as nested loops that exhibit regular dependencies and how to map these onto distributed memory multiprocessors. Based on the mapping, we show how to derive code for the nodes of a distributed memory machine with appropriate message transmission constructs. We also discuss related communication optimizations.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
 
3
D. Callahan and K. Kennedy, "Compiling Programs for Distributed-Memory Multiprocessors," The Journal of Supercomputing, Vol. 2, Oct. 1988, pp. 151-169.
 
4
 
5
M. Chen, Y. Choo and J. Li, "Compiling Parallel Programs by Optimizing Performance," The Journal of Supercomputing, 2, 1988, pp. 171-207.
 
6
 
7
R. Cytron, "Limited Processor Scheduling of Doacross Loops," Proc. 1987 International Conference on Paral. lel Processing, S. Sahni (Ed.), Aug. 1987, pp. 226-234.
 
8
J. Fortes and D. Moldown, "ParaUehsm Detection and Transformation Techniques Useful for VLSI Algorithms," Journal of Parallel and D~stributed Computing, Vol. 2, 1985, pp. 277-301.
 
9
10
 
11
12
13
 
14
B. Kernighan and S. Lin, "An Efficient Heuristic Procedure for Partitioning Graphs," Bell Syslems Technical Journal, Vol. 49, No. 2, 1970, pp. 291-308.
 
15
 
16
C. Koelbel and P. Mehrotra, "Semi-automatic Process Decomposition for Non-shared Memory Machines," in Proc. 1989 Inlernational Conference on Supercomputing, May 1989.
17
18
 
19
 
20
 
21
D. Padua, D. Kuck and D. Lawrie, "High-speed Multiprocessors and Compilation Techniques," IEEE Trans. Computers, Vol. C-29, No. 9, Sep. 1980, pp. 763-776.
22
23
 
24
P. Sadayappan, F. Ercal ~nd J. Ramanujam, "Cluster Partitioning Approaches to Mapping Parallel Programs onto a Hypercube," to appear in Parallel Computing.
 
25
C. Shen and W. Tsai, "A Graph Matching Approach to Optimal Task Assignment in Distributed Computing Systems Using a Minmax Criterion," IEEE Trans. on Computers, Vol. C-34, No. 3, Mar. 1985, pp. 197-203.
 
26
B.J. Smith, "A Pipelined, Shared Resource MIMD Computer," Proc. 1978 International Conference on Parallel Processing, pp. 6-8.
 
27
 
28
 
29

CITED BY  14

Collaborative Colleagues:
J. Ramanujam: colleagues
P. Sadayappan: colleagues