ABSTRACT
The issue logic of a dynamically scheduled superscalar processor is a complex mechanism that starts the execution of multiple instructions every cycle. Because of its complexity, it is responsible for a significant percentage of the energy consumed by a microprocessor. The energy consumption of the issue logic depends on several architectural parameters, and the size of the instruction issue queue is one of the most important. In this paper we present a technique to reduce the energy consumption of the issue logic of a high-performance superscalar processor. The technique is based on the observation that conventional issue logic wastes a significant amount of energy on useless activity. In particular, waking up empty entries and operands that are already ready is an important source of energy waste. In addition, we propose a mechanism to dynamically reduce the effective size of the instruction queue. We show that, on average, the effective instruction queue size can be reduced by 26% with minimal impact on performance. This reduction, together with the energy saved on empty and ready entries, results in about a 90.7% reduction in the energy consumed by the wake-up logic, which represents 14.9% of the total energy of the assumed processor.
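The source of waste described above can be illustrated with a small counting model. The sketch below is not the paper's circuit or simulator. It assumes a hypothetical queue in which every slot has two operand tag comparators, and it counts how many comparators are activated when one result tag is broadcast, with and without gating of empty slots and already-ready operands. All names (`Operand`, `Entry`, `wakeup_comparisons`) are made up for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

OPERANDS_PER_ENTRY = 2  # assumption: two source-operand comparators per slot


@dataclass
class Operand:
    tag: int      # register tag this operand waits for
    ready: bool   # True if the operand value is already available


@dataclass
class Entry:
    operands: List[Operand]  # exactly OPERANDS_PER_ENTRY operands


def wakeup_comparisons(queue: List[Optional[Entry]], gated: bool) -> int:
    """Count comparators activated by one result-tag broadcast.

    Ungated: every comparator in every slot fires, including empty slots.
    Gated: only comparators of valid entries with a non-ready operand fire.
    """
    count = 0
    for entry in queue:
        if entry is None:                  # empty slot
            if not gated:
                count += OPERANDS_PER_ENTRY
            continue
        for op in entry.operands:
            if gated and op.ready:         # nothing to wake up
                continue
            count += 1
    return count


# Example: 8 slots, 3 of them empty; 4 of the 10 stored operands not ready.
queue: List[Optional[Entry]] = [
    Entry([Operand(1, True), Operand(2, False)]),
    None,
    Entry([Operand(3, True), Operand(4, True)]),
    Entry([Operand(5, False), Operand(6, False)]),
    None,
    Entry([Operand(7, True), Operand(8, False)]),
    None,
    Entry([Operand(9, True), Operand(10, True)]),
]

baseline = wakeup_comparisons(queue, gated=False)
gated = wakeup_comparisons(queue, gated=True)
```

In this toy queue the ungated broadcast activates all 16 comparators, while the gated one activates only the 4 that belong to operands still waiting for a value. Comparator counts are only a proxy for energy, and the figures here say nothing about the percentages reported in the abstract.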