Decoupled hardware support for distributed shared memory
Source: International Symposium on Computer Architecture
Proceedings of the 23rd Annual International Symposium on Computer Architecture
Philadelphia, Pennsylvania, United States
Pages: 34 - 43  
Year of Publication: 1996
ISBN:0-89791-786-3
Authors
Steven K. Reinhardt  Computer Sciences Department, University of Wisconsin-Madison, 1210 West Dayton Street, Madison, WI
Robert W. Pfile  Computer Sciences Department, University of Wisconsin-Madison, 1210 West Dayton Street, Madison, WI
David A. Wood  Computer Sciences Department, University of Wisconsin-Madison, 1210 West Dayton Street, Madison, WI
Sponsors
IEEE-CS\TCCA: TC on Computer Architecture
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/232973.232979

ABSTRACT

This paper investigates hardware support for fine-grain distributed shared memory (DSM) in networks of workstations. To reduce design time and implementation cost relative to dedicated DSM systems, we decouple the functional hardware components of DSM support, allowing greater use of off-the-shelf devices.

We present two decoupled systems, Typhoon-0 and Typhoon-1. Typhoon-0 uses an off-the-shelf protocol processor and network interface; a custom access control device is the only DSM-specific hardware. To demonstrate the feasibility and simplicity of this access control device, we designed and built an FPGA-based version in under one year. Typhoon-1 also uses an off-the-shelf protocol processor, but integrates the network interface and access control devices for higher performance.

We compare the performance of the two decoupled systems with two integrated systems via simulation. For six benchmarks on 32 nodes, Typhoon-0 ranges from 30% to 309% slower than the best integrated system, while Typhoon-1 ranges from 13% to 132% slower. Four of the six benchmarks achieve speedups of 12 to 18 on Typhoon-0 and 15 to 26 on Typhoon-1, compared with 19 to 35 on the best integrated system. Two benchmarks are hampered by high communication overheads, but selectively replacing shared-memory operations with message passing provides speedups of at least 16 on both decoupled systems. These speedups indicate that decoupled designs can potentially provide a cost-effective alternative to complex high-end DSM systems.


