Evaluating automatic parallelization for efficient execution on shared-memory multiprocessors
Source: International Conference on Supercomputing
Proceedings of the 8th International Conference on Supercomputing
Manchester, England
Pages: 54-63
Year of Publication: 1994
ISBN:0-89791-665-4
Author
Kathryn S. McKinley  Department of Computer Science, University of Massachusetts, Amherst, MA
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/181181.181265

ABSTRACT

We present a parallel code generation algorithm for complete applications and a new experimental methodology that tests the efficacy of our approach. The algorithm optimizes for data locality and parallelism, reducing or eliminating false sharing. It also uses interprocedural analysis and transformations to improve the granularity of parallelism. Although the individual components of the algorithm have been published previously, their coordination is unique to this paper. For experimental validation, we do not attempt to parallelize “dusty deck” programs where many have tried and failed. Instead, we collect programs where the users tried to achieve excellent parallel performance. We apply our optimizations to sequential versions of these programs, i.e., the compiler was required to use its analysis and algorithms to parallelize the program and could not rely on user assertions that, for example, a loop is parallel. With this metric, our algorithm improves or matches hand-coded parallel programs on shared-memory, bus-based parallel machines for eight of the nine programs in our test suite.



