High performance RDMA-based MPI implementation over InfiniBand
Source
International Conference on Supercomputing
Proceedings of the 17th annual international conference on Supercomputing
San Francisco, CA, USA
SESSION: Parallel architectures
Pages: 295 - 304
Year of Publication: 2003
ISBN: 1-58113-733-8
Authors
Jiuxing Liu, The Ohio State University, Columbus, OH
Jiesheng Wu, The Ohio State University, Columbus, OH
Sushmitha P. Kini, The Ohio State University, Columbus, OH
Pete Wyckoff, Ohio Supercomputer Center, Columbus, OH
Dhabaleswar K. Panda, The Ohio State University, Columbus, OH
ABSTRACT
Although the InfiniBand Architecture is relatively new in the high performance computing area, it offers many features that help improve the performance of communication subsystems. One of these features is Remote Direct Memory Access (RDMA) operations. In this paper, we propose a new design of MPI over InfiniBand which brings the benefit of RDMA not only to large messages, but also to small and control messages. We also achieve better scalability by exploiting application communication patterns and combining send/receive operations with RDMA operations. Our RDMA-based MPI implementation currently delivers a latency of 6.8 microseconds for small messages and a peak bandwidth of 871 million bytes (831 megabytes) per second. Performance evaluation at the MPI level shows that for small messages, our RDMA-based design can reduce the latency by 24%, increase the bandwidth by over 104%, and reduce the host overhead by up to 22%. For large messages, we improve performance by reducing the time for transferring control messages. We have also shown that our new design is beneficial to MPI collective communication and the NAS Parallel Benchmarks.
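The two peak bandwidth figures in the abstract appear to be the same rate expressed in decimal and binary units. The abstract does not define its units, so the sketch below assumes that "million bytes" means 10^6 bytes and that the second figure uses 2^20 bytes per unit; the function name is illustrative only.

```python
# Assumption: the first figure counts 10**6 bytes per unit and the second
# counts 2**20 bytes per unit. The abstract itself does not state this.
BYTES_PER_BINARY_MEGABYTE = 1024 ** 2


def to_binary_megabytes_per_sec(rate_bytes_per_sec: float) -> float:
    """Convert a rate in bytes per second to units of 2**20 bytes per second."""
    return rate_bytes_per_sec / BYTES_PER_BINARY_MEGABYTE


peak_bytes_per_sec = 871_000_000  # 871 million bytes per second, from the abstract
converted = to_binary_megabytes_per_sec(peak_bytes_per_sec)
print(f"{converted:.2f}")  # about 830.65, which rounds to the quoted 831
```

Under that assumption the conversion reproduces the second figure, so the two numbers describe one measurement rather than two separate results.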