Power-efficient clustering via incomplete bypassing
Source: Proceedings of the 13th International Symposium on Low Power Electronics and Design, Bangalore, India
Session: Microarchitectural techniques
Pages: 369-374
Year of Publication: 2008
ISBN: 978-1-60558-109-5
ABSTRACT
Researchers have proposed clustered microarchitectures for performance and energy efficiency. Typically, clustered microarchitectures offer fast, local bypassing between instructions within a cluster, whereas global bypasses between clusters are slower. Traditional clustered microarchitectures (TCM) are implemented by partitioning the register file and associated functional units into clusters. This paper demonstrates an alternative implementation: the incomplete bypass-based clustered microarchitecture (IBCM). IBCM reduces the length of bypass wires by 42.4%, resulting in an 8.9% reduction in "Execute" (EX) stage delay. This delay reduction in the critical EX stage enables voltage scaling, which lowers average power consumption significantly (by between 11.7% and 19.5%) while achieving identical performance.
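The link between a shorter EX stage and lower power can be illustrated with a rough sketch. It is not the paper's methodology: it assumes the textbook alpha-power delay model (delay proportional to Vdd / (Vdd - Vth)^alpha) and dynamic power proportional to Vdd^2 at a fixed clock frequency. The function names and the values Vdd = 1.0 V, Vth = 0.3 V, and alpha = 1.3 are hypothetical, chosen only for illustration. The idea is that a stage whose delay is 8.9% shorter has timing slack, so the supply voltage can be lowered until the stage again just meets the original clock period, leaving performance unchanged.

```python
def gate_delay(vdd, vth=0.3, alpha=1.3):
    """Relative gate delay under the alpha-power model (assumed, not from the paper)."""
    return vdd / (vdd - vth) ** alpha


def scaled_voltage(v_nominal, delay_reduction, vth=0.3, alpha=1.3):
    """Lowest supply voltage at which the shortened stage still meets the original period.

    The shortened stage has delay (1 - delay_reduction) * gate_delay(V), so the
    voltage may drop until gate_delay(V) grows by a factor 1 / (1 - delay_reduction).
    Delay decreases monotonically with V for alpha >= 1, so bisection applies.
    """
    target = gate_delay(v_nominal, vth, alpha) / (1.0 - delay_reduction)
    lo, hi = vth + 1e-6, v_nominal  # delay(lo) > target >= delay(hi)
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if gate_delay(mid, vth, alpha) > target:
            lo = mid  # too slow at this voltage: raise the lower bound
        else:
            hi = mid  # still meets timing: try a lower voltage
    return hi


def dynamic_power_saving(v_nominal, v_scaled):
    """Fractional dynamic-power saving at fixed frequency (P proportional to V^2)."""
    return 1.0 - (v_scaled / v_nominal) ** 2


if __name__ == "__main__":
    v_new = scaled_voltage(1.0, 0.089)  # 8.9% EX-stage delay reduction from the abstract
    print(f"scaled Vdd: {v_new:.3f} V")
    print(f"dynamic power saving: {dynamic_power_saving(1.0, v_new):.1%}")
```

The saving this sketch prints depends entirely on the assumed model parameters, so it should not be compared directly with the measured 11.7% to 19.5% range reported above.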