Optimizing pipelines for power and performance
Source
International Symposium on Microarchitecture
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Istanbul, Turkey
SESSION: Energy aware design
Pages: 333-344
Year of Publication: 2002
ISSN: 1072-4451, ISBN: 0-7695-1859-1
Authors
Viji Srinivasan, IBM T.J. Watson Research Center, Yorktown Heights, NY
David Brooks, IBM T.J. Watson Research Center, Yorktown Heights, NY
Michael Gschwind, IBM T.J. Watson Research Center, Yorktown Heights, NY
Pradip Bose, IBM T.J. Watson Research Center, Yorktown Heights, NY
Victor Zyuban, IBM T.J. Watson Research Center, Yorktown Heights, NY
Philip N. Strenski, IBM T.J. Watson Research Center, Yorktown Heights, NY
Philip G. Emma, IBM T.J. Watson Research Center, Yorktown Heights, NY
Publisher
IEEE Computer Society Press, Los Alamitos, CA, USA
Bibliometrics
Downloads (6 Weeks): 7, Downloads (12 Months): 37, Citation Count: 28
ABSTRACT
During the concept phase and definition of next-generation high-end processors, power and performance will need to be weighted appropriately to deliver competitive cost/performance. It is not enough to adopt a CPI-centric view alone in early-stage definition studies. One of the fundamental issues confronting the architect at this stage is the choice of pipeline depth and target frequency. In this paper we present an optimization methodology that starts with an analytical power-performance model to derive the optimal pipeline depth for a superscalar processor. The results are validated and further refined using detailed simulation-based analysis. As part of the power-modeling methodology, we have developed equations that model the variation of energy as a function of pipeline depth. Our results using a set of SPEC2000 applications show that when both power and performance are considered for optimization, the optimal clock period is around 18 FO4. We also provide a detailed sensitivity analysis of the optimal pipeline depth against key assumptions of these energy models.
CITED BY 28
Ozgur Celebican, Tajana Simunic Rosing, Vincent J. Mooney, III, Energy estimation of peripheral devices in embedded systems, Proceedings of the 14th ACM Great Lakes symposium on VLSI, April 26-28, 2004, Boston, MA, USA
D. Brooks, P. Bose, V. Srinivasan, M. K. Gschwind, P. G. Emma, M. G. Rosenfield, New methodology for early-stage, microarchitecture-level power-performance analysis of microprocessors, IBM Journal of Research and Development, v.47 n.5-6, p.653-670, September 2003
J. A. Kahle, M. N. Day, H. P. Hofstee, C. R. Johns, T. R. Maeurer, D. Shippy, Introduction to the cell multiprocessor, IBM Journal of Research and Development, v.49 n.4/5, p.589-604, July 2005
Yoav Almog, Roni Rosner, Naftali Schwartz, Ari Schmorak, Specialized Dynamic Optimizations for High-Performance Energy-Efficient Microarchitecture, Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization, p.137, March 20-24, 2004, Palo Alto, California
Pong-Fei Lu, Nianzheng Cao, Leon Sigal, Pieter Woltgens, R. Robertazzi, D. Heidel, A pulsed low-voltage swing latch for reduced power dissipation in high-frequency microprocessors, Proceedings of the 2006 international symposium on Low power electronics and design, October 04-06, 2006, Tegernsee, Bavaria, Germany
Nam Sung Kim, Taeho Kgil, K. Bowman, V. De, T. Mudge, Total power-optimal pipelining and parallel processing under process variations in nanometer technology, Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design, p.535-540, November 06-10, 2005, San Jose, CA
B. Flachs, S. Asano, S. H. Dhong, H. P. Hofstee, G. Gervais, R. Kim, T. Le, P. Liu, J. Leenstra, J. S. Liberty, B. Michael, H.-J. Oh, S. M. Mueller, O. Takahashi, K. Hirairi, A. Kawasumii, H. Murakami, H. Noro, S. Onishi, J. Pille, J. Silberman, S. Yong, A. Hatakeyama, Y. Watanabe, N. Yano, D. A. Brokenshire, M. Peyravian, V. To, E. Iwata, Microarchitecture and implementation of the synergistic processor in 65-nm and 90-nm SOI, IBM Journal of Research and Development, v.51 n.5, p.529-543, September 2007
Valentina Salapura, Matthias Blumrich, Alan Gara, Improving the accuracy of snoop filtering using stream registers, Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture, p.25-32, September 16, 2007, Brasov, Romania