ABSTRACT
Irregular applications are difficult to optimize for communication because their irregular data accesses are hard to analyze both accurately and efficiently. The difficulty is especially pronounced when translating irregular shared-memory applications to message-passing form for clusters. Without effective irregular data analysis, the translation system generates unnecessary or redundant communication, which limits application scalability. In this paper, we present a Lean Distributed Shared Memory (LDSM) system, which features a fast and accurate irregular data access (IDA) analysis. The analysis uses a region-based diff method and relies on a runtime library that is optimized for irregular applications. We describe three optimizations that improve the performance of the LDSM system. A parallel array reduction transformation reduces the overhead of the analysis. A packed communication optimization and a differential communication optimization eliminate unnecessary and redundant messages. We evaluate the optimized LDSM system on a set of representative irregular benchmarks. On average, the optimized LDSM executes irregular applications 45% faster than the hand-tuned MPI applications.
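To make the idea of a region-based diff combined with packed, differential communication concrete, the following is a minimal sketch and not the LDSM implementation. It assumes that a "region" is a maximal run of consecutive array elements whose values changed since the last exchange, and that a single list of regions stands in for one packed message. The names `diff_regions` and `apply_regions` are hypothetical and chosen only for illustration.

```python
def diff_regions(old, new):
    """Return (start, values) pairs for each maximal run where new differs from old.

    Assumption: both sequences have the same length.
    """
    regions = []
    i, n = 0, len(new)
    while i < n:
        if old[i] != new[i]:
            start = i
            # Extend the region while consecutive elements keep differing.
            while i < n and old[i] != new[i]:
                i += 1
            regions.append((start, list(new[start:i])))
        else:
            i += 1
    return regions


def apply_regions(target, regions):
    """Apply a received list of regions to a local copy of the array, in place."""
    for start, values in regions:
        target[start:start + len(values)] = values
    return target


# Sender side: only changed elements are collected, grouped into regions,
# and the whole list would travel as one message instead of one per element.
previous = [0, 0, 0, 0, 0, 0]
current = [0, 5, 6, 0, 0, 9]
message = diff_regions(previous, current)

# Receiver side: a stale copy is brought up to date from the regions alone.
remote_copy = apply_regions(list(previous), message)
```

In this sketch, unchanged elements never enter the message (the differential part), and all changed regions are grouped into one payload (the packed part). How the actual system detects changes and lays out messages is not specified in the abstract.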