Multigrain shared memory
Source ACM Transactions on Computer Systems (TOCS)
Volume 18, Issue 2 (May 2000)
Pages: 154 - 196  
Year of Publication: 2000
ISSN:0734-2071
Authors
Donald Yeung  Univ. of Maryland at College Park, College Park
John Kubiatowicz  Univ. of California at Berkeley, Berkeley
Anant Agarwal  Massachusetts Institute of Technology, Cambridge
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 15,   Downloads (12 Months): 77,   Citation Count: 2
DOI: http://doi.acm.org/10.1145/350853.350871

ABSTRACT

Parallel workstations, each comprising tens of processors based on shared memory, promise cost-effective scalable multiprocessing. This article explores the coupling of such small- to medium-scale shared-memory multiprocessors through software over a local area network to synthesize larger shared-memory systems. We call these systems Distributed Shared-memory MultiProcessors (DSMPs). This article introduces the design of a shared-memory system that uses multiple granularities of sharing, called MGS, and presents a prototype implementation of MGS on the MIT Alewife multiprocessor. Multigrain shared memory enables the collaboration of hardware and software shared memory, thus synthesizing a single transparent shared-memory address space across a cluster of multiprocessors. The system leverages the efficient support for fine-grain cache-line sharing within multiprocessor nodes as often as possible, and resorts to coarse-grain page-level sharing across nodes only when absolutely necessary. Using our prototype implementation of MGS, an in-depth study of several shared-memory applications is conducted to understand the behavior of DSMPs. Our study is the first to comprehensively explore the DSMP design space, and to compare the performance of DSMPs against all-software and all-hardware DSMs on a single experimental platform. Keeping the total number of processors fixed, we show that applications execute up to 85% faster on a DSMP as compared to an all-software DSM. We also show that all-hardware DSMs hold a significant performance advantage over DSMPs on challenging applications, between 159% and 1014%. However, program transformations to improve data locality for these applications allow DSMPs to almost match the performance of an all-hardware multiprocessor of the same size.
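
The two sharing granularities described in the abstract can be illustrated with a toy model. This is only a sketch, not the MGS implementation: the names `Processor`, `sharing_grain`, and `transfer_unit` are hypothetical, and the 64-byte line and 4096-byte page sizes are assumed values chosen for illustration, not figures from the article.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed sizes for illustration only; the article's abstract does not state them.
CACHE_LINE_BYTES = 64
PAGE_BYTES = 4096


@dataclass(frozen=True)
class Processor:
    node: int  # index of the multiprocessor node this processor belongs to
    cpu: int   # index of the processor within its node


def sharing_grain(owner: Processor, requester: Processor) -> str:
    """Pick the sharing granularity for data moving from owner to requester.

    Within one node, hardware cache coherence shares data at cache-line
    grain; across nodes, software shares data at page grain.
    """
    return "cache-line" if owner.node == requester.node else "page"


def transfer_unit(address: int, owner: Processor, requester: Processor) -> Tuple[int, int]:
    """Return (base address, size in bytes) of the block that would move."""
    if sharing_grain(owner, requester) == "cache-line":
        size = CACHE_LINE_BYTES
    else:
        size = PAGE_BYTES
    base = address - (address % size)  # align down to the block boundary
    return base, size
```

In this model, a miss satisfied inside a node moves one cache line, while a miss that crosses node boundaries moves a whole page, which is why data locality matters so much for the cross-node case.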


REVIEW

Reviewer: Farnaz Mounes-Toussi

The paper describes a distributed shared-memory multiprocessor system in which each node is a multiprocessor with hardware support for cache coherence. Nodes are connected through a local area network and cache coherence is supported by soft...
