Optimizing threaded MPI execution on SMP clusters
Source: International Conference on Supercomputing
Proceedings of the 15th International Conference on Supercomputing
Sorrento, Italy
Pages: 381 - 392  
Year of Publication: 2001
ISBN: 1-58113-410-X
Authors
Hong Tang, Department of Computer Science, University of California, Santa Barbara, CA
Tao Yang, Department of Computer Science, University of California, Santa Barbara, CA
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/377792.377895

ABSTRACT

Our previous work has shown that using threads to execute MPI programs can yield substantial performance gains on multiprogrammed shared-memory machines. This paper investigates the design and implementation of a thread-based MPI system on SMP clusters. Our study indicates that, with a proper design for threaded MPI execution, both point-to-point and collective communication performance can be improved substantially compared to a process-based MPI implementation in a cluster environment. Our contributions include a hierarchy-aware and adaptive communication scheme for threaded MPI execution, and a thread-safe network device abstraction that uses event-driven synchronization and provides separate channels for collective and point-to-point communication. This paper describes the implementation of our design and illustrates its performance advantage on a Linux SMP cluster.

