A scalable approach to thread-level speculation
Source: International Symposium on Computer Architecture
Proceedings of the 27th Annual International Symposium on Computer Architecture
Vancouver, British Columbia, Canada
Pages: 1 - 12  
Year of Publication: 2000
ISBN:1-58113-232-8
Authors
J. Greggory Steffan  Computer Science Department, Carnegie Mellon University, Pittsburgh, PA
Christopher B. Colohan  Computer Science Department, Carnegie Mellon University, Pittsburgh, PA
Antonia Zhai  Computer Science Department, Carnegie Mellon University, Pittsburgh, PA
Todd C. Mowry  Computer Science Department, Carnegie Mellon University, Pittsburgh, PA
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 33,   Downloads (12 Months): 125,   Citation Count: 77
DOI: http://doi.acm.org/10.1145/339647.339650

ABSTRACT

While architects understand how to build cost-effective parallel machines across a wide spectrum of machine sizes (ranging from within a single chip to large-scale servers), the real challenge is how to easily create parallel software to effectively exploit all of this raw performance potential. One promising technique for overcoming this problem is Thread-Level Speculation (TLS), which enables the compiler to optimistically create parallel threads despite uncertainty as to whether those threads are actually independent. In this paper, we propose and evaluate a design for supporting TLS that seamlessly scales to any machine size because it is a straightforward extension of writeback invalidation-based cache coherence (which itself scales both up and down). Our experimental results demonstrate that our scheme performs well on both single-chip multiprocessors and on larger-scale machines where communication latencies are twenty times larger.
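
The idea of optimistic parallelization can be illustrated with a small software model. The sketch below is not the paper's mechanism, which is implemented in hardware as an extension of cache coherence; it is a toy simulation under simplifying assumptions. All names (`SpecView`, `run_speculatively`, `make_increment`) are hypothetical. Each loop iteration runs as an epoch against a snapshot of memory, buffering its stores and recording its loads. Epochs then commit in program order, and an epoch that read a location written by an earlier epoch is squashed and re-executed.

```python
from typing import Callable, Dict, List, Set


class SpecView:
    """Per-epoch view of memory: records loads and buffers stores."""

    def __init__(self, backing: Dict[str, int]) -> None:
        self._backing = backing
        self.reads: Set[str] = set()
        self.writes: Dict[str, int] = {}

    def load(self, addr: str) -> int:
        # A value the epoch wrote itself is forwarded from its own buffer
        # and does not count as a speculative read.
        if addr in self.writes:
            return self.writes[addr]
        self.reads.add(addr)
        return self._backing[addr]

    def store(self, addr: str, value: int) -> None:
        self.writes[addr] = value


Epoch = Callable[[SpecView], None]


def run_speculatively(memory: Dict[str, int], epochs: List[Epoch]) -> int:
    """Run epochs optimistically, commit in order, return the squash count."""
    snapshot = dict(memory)
    views = []
    # Phase 1: every epoch executes against the same snapshot, which
    # models all of them running at the same time.
    for body in epochs:
        view = SpecView(snapshot)
        body(view)
        views.append(view)

    squashes = 0
    dirty: Set[str] = set()  # addresses written by already committed epochs
    # Phase 2: commit in original program order.
    for body, view in zip(epochs, views):
        if view.reads & dirty:
            # Read-after-write violation: the epoch saw a stale value,
            # so squash it and re-execute against up-to-date memory.
            squashes += 1
            view = SpecView(memory)
            body(view)
        memory.update(view.writes)
        dirty.update(view.writes)
    return squashes


def make_increment(addr: str) -> Epoch:
    """Hypothetical loop body: memory[addr] += 1."""
    def body(view: SpecView) -> None:
        view.store(addr, view.load(addr) + 1)
    return body
```

In this model, three epochs that increment `a`, `b`, and `a` again produce the same final memory as sequential execution. Only the third epoch is squashed, because it depends on the first; the independent second epoch commits without re-execution.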


