Lazy release consistency for hardware-coherent multiprocessors
Source
Conference on High Performance Networking and Computing
Proceedings of the 1995 ACM/IEEE conference on Supercomputing (CDROM)
San Diego, California, United States
Article No. 61
Year of Publication: 1995
ISBN: 0-89791-816-9
ABSTRACT
Release consistency is a widely accepted memory model for distributed shared memory systems. Eager release consistency represents the state of the art in release-consistent protocols for hardware-coherent multiprocessors, while lazy release consistency has been shown to provide better performance for software distributed shared memory (DSM). Several of the optimizations performed by lazy protocols have the potential to improve the performance of hardware-coherent multiprocessors as well, but their complexity has precluded a hardware implementation. With the advent of programmable protocol processors, it may become possible to use these optimizations after all. We present and evaluate a lazy release-consistent protocol suitable for machines with dedicated protocol processors. This protocol admits multiple concurrent writers, sends write notices concurrently with computation, and delays invalidations until acquire operations. We also consider a lazier protocol that delays sending write notices until release operations. Our results indicate that the first protocol outperforms eager release consistency by as much as 20% across a variety of applications. The lazier protocol, on the other hand, is unable to recoup its high synchronization overhead. This represents a qualitative shift from the DSM world, where lazier protocols always yield performance improvements. Based on our results, we conclude that machines with flexible hardware support for coherence should use protocols based on lazy release consistency, but in a less "aggressively lazy" form than is appropriate for DSM.
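The difference between invalidating eagerly and delaying invalidations until acquire operations can be illustrated with a toy model. The sketch below is not the protocol evaluated in the paper: the class `ToyCoherence`, its methods, and the function `false_sharing_misses` are hypothetical names, the model has no directory, latencies, or evictions, and it only counts cache misses when two processors write to the same block without synchronizing (false sharing).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One processor: cached blocks plus write notices not yet acted on."""
    cached: set = field(default_factory=set)
    pending: set = field(default_factory=set)
    misses: int = 0


class ToyCoherence:
    """Hypothetical miss counter; an illustration, not the paper's protocol."""

    def __init__(self, n_nodes, lazy):
        self.nodes = [Node() for _ in range(n_nodes)]
        self.lazy = lazy

    def _touch(self, p, block):
        node = self.nodes[p]
        if block not in node.cached:
            node.misses += 1          # fetch a fresh copy
            node.cached.add(block)

    def read(self, p, block):
        self._touch(p, block)

    def write(self, p, block):
        self._touch(p, block)
        for q, other in enumerate(self.nodes):
            if q == p or block not in other.cached:
                continue
            if self.lazy:
                # Write notice only: the other copy stays usable until
                # that node's next acquire (multiple concurrent writers).
                other.pending.add(block)
            else:
                # Eager: the other copy is invalidated right away.
                other.cached.discard(block)

    def acquire(self, p):
        # Lazy invalidation point: act on the notices received so far.
        node = self.nodes[p]
        node.cached -= node.pending
        node.pending.clear()


def false_sharing_misses(lazy, rounds=4):
    """Two nodes alternately write block "B", then acquire and read it."""
    sim = ToyCoherence(n_nodes=2, lazy=lazy)
    for _ in range(rounds):
        sim.write(0, "B")
        sim.write(1, "B")
    for p in (0, 1):
        sim.acquire(p)
        sim.read(p, "B")
    return sum(node.misses for node in sim.nodes)


if __name__ == "__main__":
    print("eager misses:", false_sharing_misses(lazy=False))
    print("lazy misses: ", false_sharing_misses(lazy=True))
```

In this toy run the eager variant takes 9 misses and the lazy variant 4, because the lazy variant keeps both copies usable between synchronization points. These counts are a property of the simplified model only and say nothing about the magnitude of the performance results reported in the abstract.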