ACM Home Page
Please provide us with feedback. Feedback
A global communication optimization technique based on data-flow analysis and linear algebra
Full text PdfPdf (386 KB)
Source ACM Transactions on Programming Languages and Systems (TOPLAS) archive
Volume 21 ,  Issue 6  (November 1999) table of contents
Pages: 1251 - 1297  
Year of Publication: 1999
ISSN:0164-0925
Authors
M. Kandemir  Syracuse Univ., Syracuse, NY
P. Banerjee  Northwestern Univ., Evanston, IL
A. Choudhary  Northwestern Univ., Evanston, IL
J. Ramanujam  Louisiana State Univ., Baton Rouge
N. Shenoy  Northwestern Univ., Evanston, IL
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 4,   Downloads (12 Months): 59,   Citation Count: 10
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/330643.330647
What is a DOI?

ABSTRACT

Reducing communication overhead is extremely important in distributed-memory message-passing architectures. In this article, we present a technique to improve communication that considers data access patterns of the entire program. Our approach is based on a combination of traditional data-flow analysis and a linear algebra framework, and it works on structured programs with conditional statements and nested loops but without arbitrary goto statements.The distinctive features of the solution are the accuracy in keeping communication set information, support for general alignments and distributions including block-cyclic distribu-tions, and the ability to simulate some of the previous approaches with suitable modifications. We also show how optimizations such as message vectorization, message coalescing, and redundancy elimination are supported by our framework. Experimental results on several benchmarks show that our technique is effective in reducing the number of messages (anaverage of 32% reduction), the volume of the data communicated (an average of 37%reduction), and the execution time (an average of 26% reduction).


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
ADVE, V., MELLOR-CRUMMEY, J., AND SETHI, A. 1997. An integer set framework for HPF analysis and code generation. Technical Report TR97-275, Computer Science Dept., Rice University.
 
3
 
4
5
6
 
7
 
8
BALASUNDARAM, V., FOX, G., KENNEDY, K., AND K_REMER, U. 1990. An interactive environment for data partitioning and distribution. In 5th Distributed Memory Computing Conference, Charleston, SC.
 
9
 
10
 
11
 
12
13
 
14
 
15
 
16
 
17
 
18
GONG, C., GUPTA, R., AND MELHEM, R. 1993. Compilation techniques for optimizing communication on distributed-memory systems. In Proc. International Conference on Parallel Processing, Volume II, pages 39-46, St. Charles, IL.
19
 
20
 
21
22
 
23
 
24
 
25
 
26
 
27
28
 
29
 
30
 
31
KENNEDY, K. AND SETHI, A. 1995. A constrained-based communication placement framework, Technical Report CRPC-TR95515-S, CRPC, Rice University.
 
32
 
33
34
 
35
KENNEDY, K., NEDELJKOVIC, N., AND SETHI, A. 1996. Communication generation for cyclic(k) distributions. In Languages, Compilers, and Run-Time Systems for Scalable Computers, B. Szymanski and B. Sinharoy (Eds.), Chapter 14, Kluwer Academic Publishers.
36
 
37
 
38
PALERMO, D. g., Su, E., CHANDY, g. A., AND BANERJEE, P. 1994. Communication optimizations used in the PARADIGM compiler for distributed-memory multicomputers. In Proc. International Conference on Parallel Processing.
 
39
POLYCHRONOPOULOS, C., GIRKAR, M. B., HAGHIGHAT, M. R., LEE, C. L., LEUNG, B. P., AND SCHOUTEN, D.A. 1989. Parafrase-2: An environment for parallelizing, partitioning, synchronizing, and scheduling programs on multiprocessors. In Proc. the International Conference on Parallel Processing, St. Charles IL, August 1989, pages II 39-48.
40
41
 
42
 
43
TARJAN, R. E. 1974. Testing flow graph reducibility. Journal of Computer and System Sciences, 9:355-365.
 
44
 
45
VAN HANXLEDEN, R. AND KENNEDY, K. 1993. A code placement framework and its application to communication generation. Technical Report CRPC-TR93337-S, CRPC, Rice University.
46
 
47
VENKATACHAR, A., RAMANUJAM, J., AND THIRUMALAI, A. 1997. Communication generation for block-cyclic distributions. Parallel Processing Letters, 7, 2, 195-202, June.
 
48
 
49
 
50
YUAN, X., GUPTA, R., AND MELHEM, R. 1997b. Demand-driven data flow analysis for communication optimization. Parallel Processing Letters, 7, 4, 359-370, December.
 
51
52

CITED BY  10

Collaborative Colleagues:
M. Kandemir: colleagues
P. Banerjee: colleagues
A. Choudhary: colleagues
J. Ramanujam: colleagues
N. Shenoy: colleagues