| A scalable approach to thread-level speculation |
| Full text |
Pdf
(187 KB)
|
| Source
|
International Symposium on Computer Architecture
archive
Proceedings of the 27th annual international symposium on Computer architecture
table of contents
Vancouver, British Columbia, Canada
Pages: 1 - 12
Year of Publication: 2000
ISBN:1-58113-232-8
Also published in ...
|
|
Authors
|
|
J. Greggory Steffan
|
Computer Science Department, Carnegie Mellon University, Pittsburgh, PA
|
|
Christopher B. Colohan
|
Computer Science Department, Carnegie Mellon University, Pittsburgh, PA
|
|
Antonia Zhai
|
Computer Science Department, Carnegie Mellon University, Pittsburgh, PA
|
|
Todd C. Mowry
|
Computer Science Department, Carnegie Mellon University, Pittsburgh, PA
|
|
| Sponsor |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 33, Downloads (12 Months): 125, Citation Count: 73
|
|
|
ABSTRACT
While architects understand how to build cost-effective parallel machines across a wide spectrum of machine sizes (ranging from within a single chip to large-scale servers), the real challenge is how to easily create parallel software to effectively exploit all of this raw performance potential. One promising technique for overcoming this problem is Thread-Level Speculation (TLS), which enables the compiler to optimistically create parallel threads despite uncertainty as to whether those threads are actually independent. In this paper, we propose and evaluate a design for supporting TLS that seamlessly scales to any machine size because it is a straightforward extension of writeback invalidation-based cache coherence (which itself scales both up and down). Our experimental results demonstrate that our scheme performs well on both single-chip multiprocessors and on larger-scale machines where communication latencies are twenty times larger.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Alfred V. Aho , Ravi Sethi , Jeffrey D. Ullman, Compilers: principles, techniques, and tools, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1986
|
| |
2
|
|
| |
3
|
|
 |
4
|
|
 |
5
|
|
| |
6
|
|
| |
7
|
|
| |
8
|
|
 |
9
|
Lance Hammond , Mark Willey , Kunle Olukotun, Data speculation support for a chip multiprocessor, Proceedings of the eighth international conference on Architectural support for programming languages and operating systems, p.58-69, October 02-07, 1998, San Jose, California, United States
|
| |
10
|
J. Kahle. Power4: A Dual-CPU Processor Chip. Microprocessor Forum '99, October 1999.
|
| |
11
|
R Keleher, A. L. Cox, S. Dwarkadas, and W. Zwaenepoel. Tread- Marks: Distributed Shared Memory on Standard Workstations and Operating Systems. In Proceedings of the Winter Usenix Conference, January 1994.
|
 |
12
|
|
| |
13
|
|
 |
14
|
|
 |
15
|
|
 |
16
|
Kunle Olukotun , Basem A. Nayfeh , Lance Hammond , Ken Wilson , Kunyung Chang, The case for a single-chip multiprocessor, Proceedings of the seventh international conference on Architectural support for programming languages and operating systems, p.2-11, October 01-04, 1996, Cambridge, Massachusetts, United States
|
| |
17
|
|
 |
18
|
|
| |
19
|
J. G. Steffan, C. B. Colohan, and T. C. Mowry. Architectural Support for Thread-Level Data Speculation. Technical Report CMU-CS- 97-188, School of Computer Science, Carnegie Mellon University, November 1997.
|
| |
20
|
|
| |
21
|
M. Tremblay. MAJC: Microprocessor Architecture for Java Computing. HotChips '99, August 1999.
|
| |
22
|
|
 |
23
|
|
| |
24
|
|
| |
25
|
|
CITED BY 77
|
|
|
|
|
|
|
|
|
|
|
Chong-Liang Ooi , Seon Wook Kim , Il Park , Rudolf Eigenmann , Babak Falsafi , T. N. Vijaykumar, Multiplex: unifying conventional and speculative thread-level parallelism on a chip multiprocessor, Proceedings of the 15th international conference on Supercomputing, p.368-380, June 2001, Sorrento, Italy
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Karthikeyan Sankaralingam , Ramadass Nagarajan , Haiming Liu , Changkyu Kim , Jaehyuk Huh , Nitya Ranganathan , Doug Burger , Stephen W. Keckler , Robert G. McDonald , Charles R. Moore, TRIPS: A polymorphous architecture for exploiting ILP, TLP, and DLP, ACM Transactions on Architecture and Code Optimization (TACO), v.1 n.1, p.62-93, March 2004
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jose Renau , Karin Strauss , Luis Ceze , Wei Liu , Smruti Sarangi , James Tuck , Josep Torrellas, Thread-Level Speculation on a CMP can be energy efficient, Proceedings of the 19th annual international conference on Supercomputing, June 20-22, 2005, Cambridge, Massachusetts
|
|
|
|
|
|
Jose Renau , James Tuck , Wei Liu , Luis Ceze , Karin Strauss , Josep Torrellas, Tasking with out-of-order spawn in TLS chip multiprocessors: microarchitecture and compilation, Proceedings of the 19th annual international conference on Supercomputing, June 20-22, 2005, Cambridge, Massachusetts
|
|
|
Sanjeev Kumar , Michael Chu , Christopher J. Hughes , Partha Kundu , Anthony Nguyen, Hybrid transactional memory, Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming, March 29-31, 2006, New York, New York, USA
|
|
|
María Jesús Garzarán , Milos Prvulovic , José María Llabería , Víctor Viñals , Lawrence Rauchwerger , Josep Torrellas, Tradeoffs in buffering speculative memory state for thread-level speculation in multiprocessors, ACM Transactions on Architecture and Code Optimization (TACO), v.2 n.3, p.247-279, September 2005
|
|
|
Wei Liu , James Tuck , Luis Ceze , Wonsun Ahn , Karin Strauss , Jose Renau , Josep Torrellas, POSH: a TLS compiler that exploits program structure, Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming, March 29-31, 2006, New York, New York, USA
|
|
|
|
|
|
|
|
|
|
|
|
Karthikeyan Sankaralingam , Ramadass Nagarajan , Haiming Liu , Changkyu Kim , Jaehyuk Huh , Doug Burger , Stephen W. Keckler , Charles R. Moore, Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture, ACM SIGARCH Computer Architecture News, v.31 n.2, May 2003
|
|
|
|
|
|
|
|
|
Antonia Zhai , Christopher B. Colohan , J. Gregory Steffan , Todd C. Mowry, Compiler Optimization of Memory-Resident Value Communication Between Speculative Threads, Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization, p.39, March 20-24, 2004, Palo Alto, California
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Antonia Zhai , Shengyue Wang , Pen-Chung Yew , Guojin He, Compiler optimizations for parallelizing general-purpose applications under thread-level speculation, Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming, February 20-23, 2008, Salt Lake City, UT, USA
|
|
|
Yao Guo , Vladimir Vlassov , Raksit Ashok , Richard Weiss , Csaba Andras Moritz, Synchronization coherence: A transparent hardware mechanism for cache coherence and fine-grained synchronization, Journal of Parallel and Distributed Computing, v.68 n.2, p.165-181, February, 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Austen McDonald , JaeWoong Chung , Brian D. Carlstrom , Chi Cao Minh , Hassan Chafi , Christos Kozyrakis , Kunle Olukotun, Architectural Semantics for Practical Transactional Memory, ACM SIGARCH Computer Architecture News, v.34 n.2, p.53-65, May 2006
|
|
|
|
|
|
|
|
|
|
|
|
Haitham Akkary , Komal Jothi , Renjith Retnamma , Satyanarayana Nekkalapu , Doug Hall , Shahrokh Shahidzadeh, On the potential of latency tolerant execution in speculative multithreading, Proceedings of the 1st international forum on Next-generation multicore/manycore technologies, November 24-25, 2008, Cairo, Egypt
|
|
|
Lukasz Ziarek , Suresh Jagannathan , Matthew Fluet , Umut A. Acar, Speculative N-Way barriers, Proceedings of the 4th workshop on Declarative aspects of multicore programming, January 20-20, 2009, Savannah, GA, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Seth H. Pugsley , Manu Awasthi , Niti Madan , Naveen Muralimanohar , Rajeev Balasubramonian, Scalable and reliable communication for hardware transactional memory, Proceedings of the 17th international conference on Parallel architectures and compilation techniques, October 25-29, 2008, Toronto, Ontario, Canada
|
|
|
|
|
|
|
|
|
Benjamin Wester , James Cowling , Edmund B. Nightingale , Peter M. Chen , Jason Flinn , Barbara Liskov, Tolerating latency in replicated state machines through client speculation, Proceedings of the 6th USENIX symposium on Networked systems design and implementation, p.245-260, April 22-24, 2009, Boston, Massachusetts
|
|
|
|
|
|
|
|
|
Jose Renau , Karin Strauss , Luis Ceze , Wei Liu , Smruti R. Sarangi , James Tuck , Josep Torrellas, Energy-Efficient Thread-Level Speculation, IEEE Micro, v.26 n.1, p.80-91, January 2006
|
|
|
Jose Renau , Karin Strauss , Luis Ceze , Wei Liu , Smruti R. Sarangi , James Tuck , Josep Torrellas, Energy-Efficient Thread-Level Speculation, IEEE Micro, v.26 n.1, p.80-91, January 2006
|
|
|
Jose Renau , Karin Strauss , Luis Ceze , Wei Liu , Smruti R. Sarangi , James Tuck , Josep Torrellas, Energy-Efficient Thread-Level Speculation, IEEE Micro, v.26 n.1, p.80-91, January 2006
|
|
|
|
|
|
|
|