MCAMP: communication optimization on massively parallel machines with hierarchical scratch-pad memory
Source
Proceedings of the 17th international conference on Parallel architectures and compilation techniques (PACT)
Toronto, Ontario, Canada
SESSION: I/O optimizations
Pages 102-111
Year of Publication: 2008
ISBN: 978-1-60558-282-5
Authors
Hiroshige Hayashizaki  The University of Tokyo, Tokyo, Japan
Yutaka Sugawara  The University of Tokyo, Tokyo, Japan
Mary Inaba  The University of Tokyo, Tokyo, Japan
Kei Hiraki  The University of Tokyo, Tokyo, Japan
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1454115.1454132

ABSTRACT

Massively parallel machines that integrate a large number of simple processors and small scratch-pad memories (SPMs) on a single chip can achieve high peak performance per watt. In these machines, communication optimizations are important because communication bandwidth tends to be a bottleneck. Previously proposed communication optimizations based on copy candidates, which have been shown to be effective, detect frequently reused array regions by compile-time analysis and copy those regions to scratch-pad memories nearer to the processors. However, they were proposed for uniprocessor systems or small parallel machines with one or more layers of scratch-pad memories, and their analysis time increases when they are applied to massively parallel machines. In this paper, we propose Multilayer Copy-candidate Analysis for Massively Parallel machines (MCAMP), a communication optimization method for massively parallel machines. MCAMP re-formalizes the framework used in earlier works and improves the scalability of the analysis by assuming that the target systems are homogeneous. We implemented an MCAMP optimizer, which takes an input program consisting of perfectly nested loops that contain array references and computation code, and generates optimized communication. We measured the performance of the optimizer's output programs by executing them on GRAPE-DR, a real massively parallel machine, using a software tool chain that we also implemented. We showed that MCAMP can achieve optimal data transfer patterns and performance comparable to that of hand-optimized code, with a short analysis time.
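
The general copy-candidate idea described in the abstract (copy a frequently reused array region once into a memory nearer to the processor, then reuse it there) can be illustrated with a small sketch. This is not the MCAMP analysis or its optimizer output; it is a hand-written toy in Python, and the `CountingMemory` class, the function names, and the matrix-vector loop nest are all assumptions chosen only to make the reduction in far-memory traffic visible.

```python
class CountingMemory:
    """Hypothetical stand-in for a far memory level; counts element reads."""

    def __init__(self, data):
        self._data = list(data)
        self.reads = 0

    def load(self, index):
        self.reads += 1
        return self._data[index]


def matvec_direct(rows, far_x):
    """Perfectly nested loop in which every reference to x goes to far memory."""
    result = []
    for row in rows:
        total = 0
        for j, a in enumerate(row):
            total += a * far_x.load(j)
        result.append(total)
    return result


def matvec_with_copy(rows, far_x, length):
    """Copy the reused region x[0:length] into a local buffer once, then reuse it.

    The local list plays the role of a scratch-pad memory in this toy model.
    """
    local_x = [far_x.load(j) for j in range(length)]  # one-time copy
    return [sum(a * local_x[j] for j, a in enumerate(row)) for row in rows]
```

In this toy model, the direct version reads the vector from far memory once per matrix element, while the copying version reads it once per vector element, regardless of the number of rows. Deciding which regions to copy, and to which memory layer, is the part that a compile-time analysis such as the one the abstract describes would automate.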


