High performance RDMA-based MPI implementation over InfiniBand
Source: International Conference on Supercomputing
Proceedings of the 17th Annual International Conference on Supercomputing
San Francisco, CA, USA
Session: Parallel architectures
Pages: 295-304
Year of Publication: 2003
ISBN: 1-58113-733-8
Authors
Jiuxing Liu  The Ohio State University, Columbus, OH
Jiesheng Wu  The Ohio State University, Columbus, OH
Sushmitha P. Kini  The Ohio State University, Columbus, OH
Pete Wyckoff  Ohio Supercomputer Center, Columbus, OH
Dhabaleswar K. Panda  The Ohio State University, Columbus, OH
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/782814.782855

ABSTRACT

Although the InfiniBand Architecture is relatively new to high performance computing, it offers many features that help improve the performance of communication subsystems. One of these features is Remote Direct Memory Access (RDMA). In this paper, we propose a new design of MPI over InfiniBand that brings the benefit of RDMA not only to large messages, but also to small and control messages. We also achieve better scalability by exploiting application communication patterns and combining send/receive operations with RDMA operations. Our RDMA-based MPI implementation currently delivers a latency of 6.8 microseconds for small messages and a peak bandwidth of 871 million bytes (831 megabytes) per second. Performance evaluation at the MPI level shows that for small messages, our RDMA-based design can reduce latency by 24%, increase bandwidth by over 104%, and reduce host overhead by up to 22%. For large messages, we improve performance by reducing the time for transferring control messages. We also show that our new design benefits MPI collective communication and the NAS Parallel Benchmarks.
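
The abstract quotes the peak bandwidth in two units. The short Python sketch below checks that the two figures agree, assuming the second figure uses binary megabytes (2^20 bytes); the abstract does not state this explicitly, and the function and variable names are illustrative only.

```python
# Assumption: "Mega Bytes" in the abstract means binary megabytes (2**20 bytes).
BYTES_PER_BINARY_MB = 2 ** 20


def to_binary_mb_per_sec(bytes_per_sec: float) -> float:
    """Convert a rate in bytes per second to binary megabytes per second."""
    return bytes_per_sec / BYTES_PER_BINARY_MB


# Peak bandwidth quoted in the abstract: 871 million bytes per second.
peak_bytes_per_sec = 871e6
peak_binary_mb = to_binary_mb_per_sec(peak_bytes_per_sec)

# Rounded to the nearest whole number for comparison with the quoted figure.
print(round(peak_binary_mb))
```

Under that assumption the rounded result matches the 831 figure quoted alongside the 871 million bytes per second.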


