ABSTRACT
Several existing compiler transformations can help improve communication-computation overlap in MPI applications. However, traditional compilers treat calls to the MPI library as a black box with unknown side effects and thus miss potential optimizations. This paper's contributions enable the development of an MPI-aware optimizing compiler that can perform transformations exploiting knowledge of MPI call effects to increase communication-computation overlap. We formulate a set of data flow equations and rules to describe the side effects of key MPI functions so an MPI-aware compiler can automatically assess the safety of transformations. After categorizing existing compiler transformations based on their effect on the application code, we present an optimization algorithm that specifies when and how to apply these optimizing transformations to achieve improved communication-computation overlap. By manually applying the optimization algorithm to kernels extracted from HYCOM and the NAS benchmarks, we show that even when transforming these highly optimized codes, execution time can be decreased by an average of over 30%.