Dynamic self-invalidation: reducing coherence overhead in shared-memory multiprocessors
Source: International Symposium on Computer Architecture
Proceedings of the 22nd Annual International Symposium on Computer Architecture
S. Margherita Ligure, Italy
Pages: 48 - 59  
Year of Publication: 1995
ISBN: 0-89791-698-0
Authors
Alvin R. Lebeck  Computer Sciences Department, University of Wisconsin-Madison, Madison, Wisconsin
David A. Wood  Computer Sciences Department, University of Wisconsin-Madison, Madison, Wisconsin
Sponsors
IEEE-CS\TCCA: TC on Computer Architecture
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/223982.223995

ABSTRACT

This paper introduces dynamic self-invalidation (DSI), a new technique for reducing cache coherence overhead in shared-memory multiprocessors. DSI eliminates invalidation messages by having a processor automatically invalidate its local copy of a cache block before a conflicting access by another processor. Eliminating invalidation overhead is particularly important under sequential consistency, where the latency of invalidating outstanding copies can increase a program's critical path.

DSI is applicable to software, hardware, and hybrid coherence schemes. In this paper we evaluate DSI in the context of hardware directory-based write-invalidate coherence protocols. Our results show that DSI reduces execution time of a sequentially consistent full-map coherence protocol by as much as 41%. This is comparable to an implementation of weak consistency that uses a coalescing write-buffer to allow up to 16 outstanding requests for exclusive blocks. When used in conjunction with weak consistency, DSI can exploit tear-off blocks, which eliminate both invalidation and acknowledgment messages, for a total reduction in messages of up to 26%.
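The core idea in the abstract can be illustrated with a toy model. The sketch below is not the protocol evaluated in the paper. It assumes a simplified full-map directory, and it assumes (for illustration only) that each reader drops its shared copy on its own before the writer's next access and that the directory learns of this at no cost. The names `Directory`, `self_invalidate`, and `run` are hypothetical.

```python
from collections import defaultdict


class Directory:
    """Toy full-map directory for a write-invalidate protocol (illustrative only)."""

    def __init__(self):
        # block name -> set of processor ids currently holding a copy
        self.sharers = defaultdict(set)
        # explicit invalidation messages sent by the directory
        self.invalidations = 0

    def read(self, proc, block):
        # A read adds the processor to the block's sharer set.
        self.sharers[block].add(proc)

    def write(self, proc, block):
        # A write must invalidate every other outstanding copy first.
        others = self.sharers[block] - {proc}
        self.invalidations += len(others)
        self.sharers[block] = {proc}

    def self_invalidate(self, proc):
        # Assumption: the processor discards all of its copies locally and
        # the directory's sharer sets are updated without extra messages.
        for holders in self.sharers.values():
            holders.discard(proc)


def run(use_dsi, rounds=3):
    """Processors 1 and 2 read block A, then processor 0 writes it, repeatedly."""
    d = Directory()
    for _ in range(rounds):
        d.read(1, "A")
        d.read(2, "A")
        if use_dsi:
            # Readers give up their copies before the conflicting write.
            d.self_invalidate(1)
            d.self_invalidate(2)
        d.write(0, "A")
    return d.invalidations


if __name__ == "__main__":
    print("invalidations without DSI:", run(False))
    print("invalidations with DSI:   ", run(True))
```

In this toy run the baseline sends two invalidations per round (six in total), while the self-invalidating variant sends none, because the writer never finds an outstanding copy. The model ignores latency, acknowledgments, and the cost of re-fetching blocks that were self-invalidated too early, all of which matter in a real protocol.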


