VM-based shared memory on low-latency, remote-memory-access networks
Source
International Symposium on Computer Architecture
Proceedings of the 24th annual international symposium on Computer architecture
Denver, Colorado, United States
Pages: 157 - 169
Year of Publication: 1997
ISBN:0-89791-901-7
Authors
Leonidas Kontothanassis (DEC Cambridge Research Lab, One Kendall Sq., Bldg. 700, Cambridge, MA)
Galen Hunt (Department of Computer Science, University of Rochester, Rochester, NY)
Robert Stets (Department of Computer Science, University of Rochester, Rochester, NY)
Nikolaos Hardavellas (Department of Computer Science, University of Rochester, Rochester, NY)
Michał Cierniak (Department of Computer Science, University of Rochester, Rochester, NY)
Srinivasan Parthasarathy (Department of Computer Science, University of Rochester, Rochester, NY)
Wagner Meira, Jr. (Department of Computer Science, University of Rochester, Rochester, NY)
Sandhya Dwarkadas (Department of Computer Science, University of Rochester, Rochester, NY)
Michael Scott (Department of Computer Science, University of Rochester, Rochester, NY)
ABSTRACT
Recent technological advances have produced network interfaces that provide users with very low-latency access to the memory of remote machines. We examine the impact of such networks on the implementation and performance of software DSM. Specifically, we compare two DSM systems, Cashmere and TreadMarks, on a 32-processor DEC Alpha cluster connected by a Memory Channel network.

Both Cashmere and TreadMarks use virtual memory to maintain coherence on pages, and both use lazy, multi-writer release consistency. The systems differ dramatically, however, in the mechanisms used to track sharing information and to collect and merge concurrent updates to a page, with the result that Cashmere communicates much more frequently, and at a much finer grain.

Our principal conclusion is that low-latency networks make DSM based on fine-grain communication competitive with more coarse-grain approaches, but that further hardware improvements will be needed before such systems can provide consistently superior performance. In our experiments, Cashmere scales slightly better than TreadMarks for applications with false sharing. At the same time, it is severely constrained by limitations of the current Memory Channel hardware. In general, performance is better for TreadMarks.
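
The abstract does not spell out how concurrent updates to a page are merged. As a rough illustration only, the sketch below shows the twin-and-diff idea commonly associated with TreadMarks-style multi-writer protocols: each writer keeps a pristine copy (a "twin") of the page, later computes a diff of the bytes it changed, and the diffs are applied to a master copy. The function names (`make_twin`, `make_diff`, `apply_diff`), the 8-byte page, and the byte-level granularity are assumptions made for this example and are not taken from either system.

```python
from typing import Dict


def make_twin(page: bytearray) -> bytes:
    """Snapshot the page before the first local write (hypothetical helper)."""
    return bytes(page)


def make_diff(twin: bytes, page: bytearray) -> Dict[int, int]:
    """Record only the offsets whose bytes differ from the twin."""
    return {i: new for i, (old, new) in enumerate(zip(twin, page)) if old != new}


def apply_diff(page: bytearray, diff: Dict[int, int]) -> None:
    """Merge one writer's changes into another copy of the page."""
    for offset, value in diff.items():
        page[offset] = value


# Two writers modify disjoint bytes of the same page (false sharing).
master = bytearray(8)
copy_a, copy_b = bytearray(master), bytearray(master)
twin_a, twin_b = make_twin(copy_a), make_twin(copy_b)

copy_a[0] = 1  # writer A touches the first byte
copy_b[7] = 2  # writer B touches the last byte

diff_a = make_diff(twin_a, copy_a)
diff_b = make_diff(twin_b, copy_b)

# Applying both diffs preserves both updates without either writer
# overwriting the other's bytes.
apply_diff(master, diff_a)
apply_diff(master, diff_b)
merged = bytes(master)
```

Because each diff carries only the bytes a writer actually changed, two writers to different parts of one page do not clobber each other, which is why multi-writer protocols tolerate false sharing. Writes to the same byte would still need ordering by synchronization, which this sketch does not model.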