Memory sharing predictor: the key to a speculative coherent DSM
Source
International Symposium on Computer Architecture
Proceedings of the 26th annual international symposium on Computer architecture
Atlanta, Georgia, United States
Pages: 172-183
Year of Publication: 1999
ISBN: 0-7695-0170-2
Authors
An-Chow Lai
School of Electrical & Computer Engineering, Purdue University, 1285 EE Building, West Lafayette, IN
Babak Falsafi
School of Electrical & Computer Engineering, Purdue University, 1285 EE Building, West Lafayette, IN
Publisher
IEEE Computer Society
Washington, DC, USA
Bibliometrics
Downloads (6 Weeks): 13, Downloads (12 Months): 33, Citation Count: 19
ABSTRACT
Recent research advocates using general message predictors to learn and predict the coherence activity in distributed shared memory (DSM). By accurately predicting a message and invoking the necessary coherence actions in a timely manner, a DSM can hide much of the remote access latency. This paper proposes Memory Sharing Predictors (MSPs), pattern-based predictors that significantly improve prediction accuracy and reduce implementation cost relative to general message predictors. An MSP is based on the key observation that, to hide the remote access latency, a predictor must accurately predict only the remote memory accesses (i.e., request messages) and not the subsequent coherence messages invoked by an access. Simulation results indicate that MSPs improve prediction accuracy over general message predictors from 81% to 93% while requiring less storage overhead.

This paper also presents the first design and evaluation of a speculative coherent DSM using pattern-based predictors. We identify simple techniques and mechanisms to trigger predictions in a timely manner and to perform speculation for remote read accesses. Our speculation hardware readily works with a conventional full-map write-invalidate coherence protocol without any modifications. Simulation results indicate that performing speculative read requests alone reduces execution times by 12% in our shared-memory applications.
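The key observation in the abstract (predict only request messages, not the coherence messages they trigger) can be illustrated with a small sketch. This is not the MSP hardware described in the paper: the class name, the message type names, the per-block history table, and the history depth are all assumptions made for illustration. The sketch records, per memory block, which request followed each recent request history, and it drops every non-request message before it reaches the tables.

```python
from collections import defaultdict, deque

# Hypothetical message type names. The abstract only distinguishes request
# messages from the coherence messages that an access subsequently invokes.
REQUEST_TYPES = {"read", "write", "upgrade"}


class RequestPatternPredictor:
    """Toy per-block, history-indexed request predictor (illustrative only)."""

    def __init__(self, depth=1):
        self.depth = depth  # number of past requests used as the pattern index
        # block address -> the most recent `depth` requests seen for it
        self.history = defaultdict(lambda: deque(maxlen=depth))
        # (block, history tuple) -> request last observed after that history
        self.patterns = {}
        self.predicted = 0  # predictions made
        self.correct = 0    # predictions that matched the next request

    def predict(self, block):
        """Return the predicted next (type, node) request, or None."""
        hist = self.history[block]
        if len(hist) < self.depth:
            return None
        return self.patterns.get((block, tuple(hist)))

    def observe(self, block, msg_type, node):
        """Feed one directory message; only requests update the tables."""
        if msg_type not in REQUEST_TYPES:
            return  # acknowledgements, replies, etc. are filtered out
        request = (msg_type, node)
        guess = self.predict(block)
        if guess is not None:
            self.predicted += 1
            self.correct += guess == request
        hist = self.history[block]
        if len(hist) == self.depth:
            self.patterns[(block, tuple(hist))] = request
        hist.append(request)


# Producer-consumer sharing on one block: node 1 writes, node 2 reads.
predictor = RequestPatternPredictor(depth=1)
for _ in range(4):
    for msg_type, node in [("write", 1), ("inval_ack", 2),
                           ("read", 2), ("data_reply", 1)]:
        predictor.observe(0x40, msg_type, node)
print(predictor.predict(0x40), predictor.correct, predictor.predicted)
```

Because acknowledgements and replies never enter the history, the pattern table in this sketch holds fewer entries than one indexed on every message, which is in the spirit of the storage argument made in the abstract. The accuracy figures quoted there come from the authors' simulations, not from anything resembling this toy.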
CITED BY 19
Manuel E. Acacio, José González, José M. García, José Duato, Owner prediction for accelerating cache-to-cache transfer misses in a cc-NUMA architecture, Proceedings of the 2002 ACM/IEEE conference on Supercomputing, p.1-12, November 16, 2002, Baltimore, Maryland
Stephen Somogyi, Thomas F. Wenisch, Nikolaos Hardavellas, Jangwoo Kim, Anastassia Ailamaki, Babak Falsafi, Memory coherence activity prediction in commercial workloads, Proceedings of the 3rd workshop on Memory performance issues: in conjunction with the 31st international symposium on computer architecture, p.37-45, June 20, 2004, Munich, Germany
Thomas F. Wenisch, Stephen Somogyi, Nikolaos Hardavellas, Jangwoo Kim, Anastassia Ailamaki, Babak Falsafi, Temporal Streaming of Shared Memory, ACM SIGARCH Computer Architecture News, v.33 n.2, p.222-233, May 2005