ABSTRACT
Chip Multiprocessors (CMPs) are flexible, high-frequency platforms on which to support Thread-Level Speculation (TLS). However, for TLS to deliver on its promise, CMPs must exploit multiple sources of speculative task-level parallelism, including any nesting level of both subroutines and loop iterations. Unfortunately, these environments are hard to support in decentralized CMP hardware: since tasks are spawned out of order and unpredictably, maintaining key TLS basics such as task ordering and efficient resource allocation is challenging.

While the concept of out-of-order spawning is not new, this paper is the first to propose a set of microarchitectural mechanisms that, taken together, fundamentally enable fast TLS with out-of-order spawn in a CMP. Moreover, we develop a fully automated TLS compiler for aggressive out-of-order spawn. With our mechanisms, a TLS CMP with four 4-issue cores achieves an average speedup of 1.30 for full SPECint 2000 applications; the corresponding speedup for in-order-only spawn is 1.04. Overall, our mechanisms unlock the potential of TLS for the toughest applications.