Multi-protocol active messages on a cluster of SMP's
Source: Conference on High Performance Networking and Computing
Proceedings of the 1997 ACM/IEEE conference on Supercomputing (CDROM)
San Jose, CA
Pages: 1 - 22  
Year of Publication: 1997
ISBN: 0-89791-985-8
Authors
Steven S. Lumetta  University of California, Berkeley
Alan M. Mainwaring  University of California, Berkeley
David E. Culler  University of California, Berkeley
Sponsors
IEEE-CS\DATC : IEEE Computer Society
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/509593.509596

ABSTRACT

Clusters of multiprocessors, or Clumps, promise to be the supercomputers of the future, but obtaining high performance on these architectures requires an understanding of interactions between the multiple levels of interconnection. In this paper, we present the first multi-protocol implementation of a lightweight message layer: a version of Active Messages-II running on a cluster of Sun Enterprise 5000 servers connected with Myrinet. This research brings together several pieces of high-performance interconnection technology: bus backplanes for symmetric multiprocessors, low-latency networks for connections between machines, and simple, user-level primitives for communication. The paper describes the shared memory message-passing protocol and analyzes the multi-protocol implementation with both microbenchmarks and Split-C applications. Three aspects of the communication layer are critical to performance: the overhead of cache-coherence mechanisms, the method of managing concurrent access, and the cost of accessing state with the slower protocol. Through the use of an adaptive polling strategy, the multi-protocol implementation limits performance interactions between the protocols, delivering up to 160 MB/s of bandwidth with 3.6 microsecond end-to-end latency. Applications within an SMP benefit from this fast communication, running up to 75% faster than on a network of uniprocessor workstations (NOW). Applications running on the entire Clump are limited by the balance of NICs to processors in our system, and are typically slower than on the NOW. These results illustrate several potential pitfalls for the Clumps architecture.


