Maximizing MPI point-to-point communication performance on RDMA-enabled clusters with customized protocols
Source
International Conference on Supercomputing
Proceedings of the 23rd international conference on Supercomputing
Yorktown Heights, NY, USA
SESSION: High-performance communications II
Pages 306-315
Year of Publication: 2009
ISBN: 978-1-60558-498-0
Authors
Matthew Small, Florida State University, Tallahassee, FL, USA
Xin Yuan, Florida State University, Tallahassee, FL, USA
ABSTRACT
Message Passing Interface (MPI) point-to-point communications are usually realized with two protocols: the eager protocol for small messages and the rendezvous protocol for medium- and large-sized messages. Traditional sender-initiated rendezvous protocols are sub-optimal in many situations. In this work, we propose to refine the rendezvous protocol for medium and large messages on RDMA-enabled clusters with three protocols that are customized for different situations: a hybrid protocol for medium-sized messages when the sender arrives early, a sender-initiated protocol for large messages when the sender arrives early, and a receiver-initiated protocol when the receiver arrives early. In comparison to traditional sender-initiated rendezvous protocols, the proposed scheme reduces unnecessary synchronizations, decreases the number of control messages that are in the critical path of communications, and improves communication progress, which results in a significantly better communication-computation overlap capability. We present and analyze these protocols, and describe how they and the eager protocol can be seamlessly integrated into one system without introducing an excessive number of control messages. We have implemented the proposed scheme for InfiniBand clusters. The experimental results demonstrate the effectiveness of the proposed technique.
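The protocol selection described in the abstract can be read as a decision over two inputs: message size and which side arrives first. The following is a minimal sketch of that reading only, not the authors' implementation. The names `Protocol`, `choose_protocol`, `EAGER_LIMIT`, and `MEDIUM_LIMIT` are hypothetical, the byte thresholds are placeholder values (the abstract gives none), and treating arrival order as a simple boolean is an assumption.

```python
from enum import Enum


class Protocol(Enum):
    """Protocol names taken from the abstract; the enum itself is illustrative."""
    EAGER = "eager"
    HYBRID = "hybrid"
    SENDER_INITIATED = "sender-initiated rendezvous"
    RECEIVER_INITIATED = "receiver-initiated rendezvous"


# Hypothetical size thresholds in bytes. The abstract does not state the
# boundaries between small, medium, and large messages.
EAGER_LIMIT = 16 * 1024
MEDIUM_LIMIT = 256 * 1024


def choose_protocol(message_size: int, sender_arrived_first: bool) -> Protocol:
    """Pick a protocol from message size and arrival order.

    Assumption: arrival order is known and reduces to one boolean.
    """
    if message_size < 0:
        raise ValueError("message_size must be non-negative")
    if message_size <= EAGER_LIMIT:
        # Small messages use the eager protocol regardless of arrival order.
        return Protocol.EAGER
    if not sender_arrived_first:
        # Receiver arrives early: receiver-initiated protocol.
        return Protocol.RECEIVER_INITIATED
    if message_size <= MEDIUM_LIMIT:
        # Sender arrives early with a medium-sized message: hybrid protocol.
        return Protocol.HYBRID
    # Sender arrives early with a large message: sender-initiated protocol.
    return Protocol.SENDER_INITIATED


if __name__ == "__main__":
    for size, sender_first in [(1024, True), (64 * 1024, True),
                               (1024 * 1024, True), (1024 * 1024, False)]:
        print(size, sender_first, choose_protocol(size, sender_first).value)
```

In a real MPI library the arrival order is only discovered at run time through control messages, so this function illustrates the classification and not how the library detects which case applies.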