ABSTRACT
Chip Multiprocessors (CMP) with Thread-Level Speculation (TLS) have become the subject of intense research. However, TLS is suspected of being too energy-inefficient to compete against conventional processors. In this paper, we refute this claim. To do so, we first identify the main sources of dynamic energy consumption in TLS. Then, we present simple energy-saving optimizations that cut the energy cost of TLS by over 60% on average with minimal performance impact. The resulting TLS CMP, populated with four 3-issue cores, speeds up full SPECint 2000 codes by a factor of 1.27 on average, while keeping the fraction of the chip's energy consumption due to TLS to only 20%. Compared to a 6-issue superscalar at the same frequency, the TLS CMP is on average faster, while consuming only 85% of its total on-chip power.
CITED BY 4
Seth H. Pugsley, Manu Awasthi, Niti Madan, Naveen Muralimanohar, Rajeev Balasubramonian, Scalable and reliable communication for hardware transactional memory, Proceedings of the 17th international conference on Parallel architectures and compilation techniques, October 25-29, 2008, Toronto, Ontario, Canada
Jose Renau, Karin Strauss, Luis Ceze, Wei Liu, Smruti R. Sarangi, James Tuck, Josep Torrellas, Energy-Efficient Thread-Level Speculation, IEEE Micro, v.26 n.1, p.80-91, January 2006