| Optimizing threaded MPI execution on SMP clusters |
| Source
|
International Conference on Supercomputing
Proceedings of the 15th international conference on Supercomputing
Sorrento, Italy
Pages: 381 - 392
Year of Publication: 2001
ISBN: 1-58113-410-X
|
|
Authors
|
|
Hong Tang
|
Department of Computer Science, University of California, Santa Barbara, CA
|
|
Tao Yang
|
Department of Computer Science, University of California, Santa Barbara, CA
|
|
ABSTRACT
Our previous work has shown that using threads to execute MPI programs can yield large performance gains on multiprogrammed shared-memory machines. This paper investigates the design and implementation of a thread-based MPI system on SMP clusters. Our study indicates that, with a proper design for threaded MPI execution, both point-to-point and collective communication performance can be improved substantially compared to a process-based MPI implementation in a cluster environment. Our contributions include a hierarchy-aware, adaptive communication scheme for threaded MPI execution and a thread-safe network device abstraction that uses event-driven synchronization and provides separate channels for collective and point-to-point communication. This paper describes the implementation of our design and illustrates its performance advantage on a Linux SMP cluster.
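The abstract does not spell out the hierarchy-aware scheme, so the following is only a minimal sketch of the general idea, not the authors' design. It assumes a broadcast in which each MPI rank is a thread, ranks on the same SMP node can share memory, and only one message per remote node has to cross the network. The function names, the `rank_to_node` mapping, and the choice of the lowest rank as node leader are all hypothetical.

```python
from collections import defaultdict


def hierarchy_aware_broadcast(root, rank_to_node):
    """Plan a two-level broadcast (illustrative assumption, not the paper's scheme).

    Returns two lists of (sender, receiver) pairs: transfers that cross the
    network, and transfers that stay inside one node's shared memory.
    """
    # Group ranks (threads) by the SMP node that hosts them.
    nodes = defaultdict(list)
    for rank, node in sorted(rank_to_node.items()):
        nodes[node].append(rank)

    root_node = rank_to_node[root]
    network, shared_memory = [], []
    for node, ranks in nodes.items():
        # The root leads its own node; every other node uses its lowest rank.
        leader = root if node == root_node else ranks[0]
        if node != root_node:
            network.append((root, leader))  # one network message per remote node
        # The other ranks on the node copy from the leader's buffer.
        shared_memory.extend((leader, r) for r in ranks if r != leader)
    return network, shared_memory


def flat_broadcast_network_messages(root, rank_to_node):
    """Count network messages when the root sends to every rank individually."""
    root_node = rank_to_node[root]
    return sum(
        1 for r, n in rank_to_node.items() if r != root and n != root_node
    )


# Hypothetical layout: three nodes with two ranks each.
layout = {0: "a", 1: "a", 2: "b", 3: "b", 4: "c", 5: "c"}
network, shared = hierarchy_aware_broadcast(0, layout)
print(len(network), "network messages versus",
      flat_broadcast_network_messages(0, layout), "for a flat broadcast")
```

In this toy layout the two-level plan sends one network message per remote node and serves the remaining ranks through shared memory, which is the kind of saving a hierarchy-aware scheme aims for on SMP clusters.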