MRNet: A Software-Based Multicast/Reduction Network for Scalable Tools
Source: Conference on High Performance Networking and Computing
Proceedings of the 2003 ACM/IEEE Conference on Supercomputing
Page: 21
Year of Publication: 2003
ISBN: 1-58113-695-1
Authors
Philip C. Roth  University of Wisconsin, Madison
Dorian C. Arnold  University of Wisconsin, Madison
Barton P. Miller  University of Wisconsin, Madison
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
IEEE Computer Society  Washington, DC, USA
ABSTRACT

We present MRNet, a software-based multicast/reduction network for building scalable performance and system administration tools. MRNet supports multiple simultaneous, asynchronous collective communication operations. MRNet is flexible, allowing tool builders to tailor its process network topology to suit their tool's requirements and the underlying system's capabilities. MRNet is extensible, allowing tool builders to incorporate custom data reductions to augment its collection of built-in reductions. We evaluated MRNet in a simple test tool and also integrated it into an existing, real-world performance tool with up to 512 tool back-ends. In the real-world tool, we used MRNet not only for multicast and simple data reductions but also with custom histogram and clock skew detection reductions. In our experiments, the MRNet-based tools showed significantly better performance than the tools without MRNet for average message latency and throughput, overall tool start-up latency, and performance data processing throughput.
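
The general idea of a tree-based reduction with pluggable reductions can be illustrated with a minimal Python sketch. It is not MRNet's API: the names `Node`, `build_tree`, `reduce_up`, and `histogram_merge` are hypothetical, the tree is simulated in a single process rather than across networked processes, and the reductions are assumed to be associative and commutative so that the result does not depend on the chosen topology.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical names throughout; this is an illustration, not MRNet's API.
Reduction = Callable[[list], object]


@dataclass
class Node:
    """One process in the tree: a back-end (leaf) or an internal process."""
    value: object = None                      # data produced by a back-end
    children: List["Node"] = field(default_factory=list)


def build_tree(leaf_values: list, fanout: int) -> Node:
    """Build a balanced tree with the given fan-out over the back-end values."""
    if fanout < 2 or not leaf_values:
        raise ValueError("need fanout >= 2 and at least one back-end")
    level = [Node(value=v) for v in leaf_values]
    while len(level) > 1:
        # Group each run of `fanout` nodes under a new parent process.
        level = [Node(children=level[i:i + fanout])
                 for i in range(0, len(level), fanout)]
    return level[0]


def reduce_up(node: Node, reduction: Reduction):
    """Apply the reduction at every internal node on the way to the root."""
    if not node.children:
        return node.value
    return reduction([reduce_up(child, reduction) for child in node.children])


def histogram_merge(parts: list) -> Counter:
    """Custom reduction: merge per-child histograms by summing bucket counts."""
    total = Counter()
    for part in parts:
        total.update(part)
    return total
```

For example, `reduce_up(build_tree(list(range(512)), fanout=8), sum)` combines the values from 512 simulated back-ends through a three-level tree, and passing `histogram_merge` instead of `sum` swaps in a custom reduction without changing the tree code. Changing `fanout` alters only the shape of the tree, which is the kind of topology tailoring the abstract describes.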


