ACM Home Page
Please provide us with feedback. Feedback
A compiler method for the parallel execution of irregular reductions in scalable shared memory multiprocessors
Full text PdfPdf (899 KB)
Source International Conference on Supercomputing archive
Proceedings of the 14th international conference on Supercomputing table of contents
Santa Fe, New Mexico, United States
Pages: 78 - 87  
Year of Publication: 2000
ISBN:1-58113-270-0
Authors
E. Gutiérrez  Department of Computer Architecture, University of Málaga, E-29080 Málaga, Spain
O. Plata  Department of Computer Architecture, University of Málaga, E-29080 Málaga, Spain
E. L. Zapata  Department of Computer Architecture, University of Málaga, E-29080 Málaga, Spain
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 5,   Downloads (12 Months): 32,   Citation Count: 8
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/335231.335239
What is a DOI?

ABSTRACT

This paper presents a new parallelization method for reductions of arrays with subscripted subscripts on scalable shared memory multiprocessors. The mapping of computations is based on grouping reduction loop iterations into sets that are further assigned to the cooperating threads of computation. Iterations belonging to the same set are chosen in such a way that update different entries in the reduction array. That is, the loop distribution implies a conflict-free write distribution of the reduction array. The iteration sets are set up by building a loop-index prefetching data structure that allows to reorder properly the loop iterations. The proposed method is general, scalable, and easy to implement on a compiler. In addition it deals in a uniform way with one and multiple subscript arrays. In case of multiple indirection arrays, writes on the reduction array affecting different sets are solved by defining conflict-free supersets. A performance evaluation is presented. From the experimental results and performance analysis, the proposed method appears as a clear alternative to the array expansion and privatized buffer techniques, used on state-of-the-art parallelizing compilers, like Polaris or SUIF. The scalability problem that those techniques exhibit is missing in our method, as the memory overhead presented does not depend on the number of processors.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
R. Asenjo, E. Gutierrez, Y. Lin, D. Padua, B. Pottengerg, and E. Zapata. On the automatic parallelization of sparse and irregular Fortran codes. Technical Report TR-1512, University for Illinois at Urbana-Champalgn. Center for Supercomputing R&D, December 1996.
 
2
 
3
B. Brooks, K. Bruccoleri, B. Olafson, D. States, S. Swaminathan, and M. Karplus. Charmm: A program for macromolecular energy, minimization and dynamics calculations. J. Computational Chemistry, 4:217-183, 1983.
4
 
5
I. Foster, R. Schreiber, and P. Havlak. HPF-2, scope of activities and motivating applications. Technical Report CRPC-TR94492, Center for Research on Parallel Computation, PAce University, November 1994.
6
 
7
 
8
 
9
R. Hanxleden. Compiler support for machine independent parallelization of irregular problems. Technical Report CR.PC-TR.92301-S, Center for Research on Parallel Computation, Rice University, November 1992.
 
10
High Performance Fortran Forum. High Performance Fortran Language Specification, Version 2.0, 1996.
 
11
J. Ku. The design of an efficient and portable interface between a parallelizing compiler and its target machine. Technical report, Master's Thesis, Univ. of Illinois at Urbana-Champaign, Center for Supercomputing R&D, 1995.
 
12
 
13
 
14
J. Morales and S. Toxvaerd. The ceU-neighbour table method in moleimlar dynamics simulations. Computer Physics Communications, 71:71-76, 1992.
 
15
OpenMP Architecture Re,clew Board. OpenMP: A Proposed lndustry Standard API for Shared Memory Programming, 1997.
 
16
17
 
18
Silicon Graphics, Inc. MIPSpro Fortran77 Programmer's Guide, 1994.
 
19
Silicon Graphics, Inc. MIPSpro Automatic Parallelization, 1998.
 
20
S. Toxvacrd. Algorithms for canonical molecular dynamics simulations. Molecular Physics, 72(1):159-168, 1991.

CITED BY  8

Collaborative Colleagues:
E. Gutiérrez: colleagues
O. Plata: colleagues
E. L. Zapata: colleagues