Removing the overhead from software-based shared memory
Source Conference on High Performance Networking and Computing archive
Proceedings of the 2001 ACM/IEEE conference on Supercomputing (CDROM) table of contents
Denver, Colorado
Pages: 56
Year of Publication: 2001
ISBN:1-58113-293-X
Authors
Zoran Radović  Uppsala University, Sweden
Erik Hagersten  Uppsala University, Sweden
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
IEEE-CS\DATC : IEEE Computer Society
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/582034.582090

ABSTRACT

The implementation presented in this paper, DSZOOM-WF, is a sequentially consistent, fine-grained distributed software-based shared memory. It demonstrates a protocol-handling overhead below a microsecond for all the actions involved in a remote load operation, compared with around ten microseconds for the fastest implementation to date.

The all-software protocol is implemented assuming some basic low-level primitives in the cluster interconnect and an operating system bypass functionality, similar to the emerging InfiniBand standard. All interrupt- and/or poll-based asynchronous protocol processing is completely removed by running the entire coherence protocol in the requesting processor. This not only removes the asynchronous overhead, but also makes use of a processor that would otherwise stall. The technique is applicable to both page-based and fine-grain software-based shared memory.

DSZOOM-WF consistently demonstrates performance comparable to hardware-based distributed shared memory implementations.
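
The idea of running the whole coherence protocol in the requesting processor can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the DSZOOM-WF protocol itself: the abstract does not give the protocol's data structures, so the names Node, make_cluster, remote_load, and handler_calls are hypothetical, a per-line lock stands in for a remote atomic operation on a directory entry, and plain list reads stand in for remote memory reads over the interconnect. Homes are assigned round-robin and the home copy is assumed always valid, which a real protocol would not require.

```python
import threading


class Node:
    """One simulated cluster node (hypothetical model, not from the paper)."""

    def __init__(self, node_id, num_lines):
        self.node_id = node_id
        self.memory = [0] * num_lines      # this node's copy of each line
        self.valid = [False] * num_lines   # True if the local copy is readable
        # Directory state; only used for lines this node is home to.
        self.dir_lock = [threading.Lock() for _ in range(num_lines)]
        self.sharers = [set() for _ in range(num_lines)]
        # Counts protocol handlers run on behalf of other nodes.
        # Nothing in this sketch increments it, which is the point.
        self.handler_calls = 0


def make_cluster(num_nodes, num_lines):
    """Create nodes; line i is homed on node i % num_nodes (an assumption)."""
    nodes = [Node(i, num_lines) for i in range(num_nodes)]
    for line in range(num_lines):
        home = nodes[line % num_nodes]
        home.valid[line] = True
        home.sharers[line].add(home.node_id)
    return nodes


def remote_load(nodes, requester, line):
    """Load a line; every protocol step executes in the caller's thread."""
    me = nodes[requester]
    if me.valid[line]:
        return me.memory[line]             # local hit, no protocol action
    home = nodes[line % len(nodes)]
    # Stand-in for a remote atomic that locks the directory entry.
    with home.dir_lock[line]:
        value = home.memory[line]          # stand-in for a remote read
        home.sharers[line].add(requester)  # requester updates the directory
        me.memory[line] = value
        me.valid[line] = True
    return value


if __name__ == "__main__":
    cluster = make_cluster(num_nodes=2, num_lines=4)
    cluster[1].memory[1] = 42              # line 1 is homed on node 1
    print(remote_load(cluster, requester=0, line=1))
    print(cluster[1].handler_calls)
```

In this model the home node never executes any code in response to the request: the requester locks the directory entry, reads the data, and records itself as a sharer on its own. A handler-based design would instead interrupt the home node or require it to poll for incoming requests.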



