Power efficient branch prediction through early identification of branch addresses
Source
International Conference on Compilers, Architecture and Synthesis for Embedded Systems
Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Seoul, Korea
SESSION: Architecture/power
Pages: 169 - 178
Year of Publication: 2006
ISBN: 1-59593-543-6
ABSTRACT
Ever-increasing performance requirements have made deeply pipelined architectures standard even in the embedded processor domain, requiring dynamic branch prediction subsystems to hide the execution latency of control-altering instructions. This paper proposes a low-power early branch identification technique that enables the design of extremely power-efficient branch predictors and BTBs. Through static extraction of program information regarding the distance to subsequent branches, the technique enables the calculation of the next branch address as soon as the direction of the current branch has been predicted. Early identification of branch addresses enables the complete elimination of the power-hungry BTB lookups that normally occur at every execution cycle, as well as a just-in-time wake-up mechanism for accessing "hibernating" entries in complex predictors, which are switched to a power-saving mode to reduce leakage power dissipation. A cost-efficient Branch Identification Unit (BIU) that calculates branch addresses is presented and analyzed in terms of power and timing characteristics. The effectiveness of the proposed BTB access policy and predictor wake-up mechanism is confirmed by simulation results on the SPECint 2000 and MediaBench benchmarks.
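The idea of computing the next branch address from statically extracted distances can be illustrated with a minimal Python sketch. It is not the paper's BIU design, which the abstract does not detail. The sketch assumes fixed 4-byte instructions, direct branches only, and two hypothetical per-branch fields (`dist_taken`, `dist_not_taken`) that hold the number of instructions between the chosen path's first instruction and the next branch. All names here are illustrative.

```python
from dataclasses import dataclass

INSTR_BYTES = 4  # assumption: fixed-width 4-byte instructions


@dataclass(frozen=True)
class BranchInfo:
    """Hypothetical statically extracted data attached to one branch."""
    target: int          # address of the taken-path target
    dist_taken: int      # instructions from the target to the next branch
    dist_not_taken: int  # instructions from the fall-through to the next branch


def next_branch_address(pc, info, predicted_taken):
    """Address of the next branch, known once the direction is predicted."""
    if predicted_taken:
        return info.target + info.dist_taken * INSTR_BYTES
    return pc + INSTR_BYTES + info.dist_not_taken * INSTR_BYTES


def count_btb_lookups(branches, predictions, start_pc, first_branch_pc, cycles):
    """Fetch one instruction per cycle; access the BTB only at branch addresses."""
    pc, next_branch, lookups = start_pc, first_branch_pc, 0
    for _ in range(cycles):
        if pc == next_branch:
            lookups += 1  # the only cycles in which the BTB is accessed
            info = branches[pc]
            taken = predictions[pc]
            next_branch = next_branch_address(pc, info, taken)
            pc = info.target if taken else pc + INSTR_BYTES
        else:
            pc += INSTR_BYTES  # non-branch instruction: no BTB lookup needed
    return lookups


# Toy loop: instructions at 0x100..0x10C, with a backward branch at 0x10C.
loop_branch = BranchInfo(target=0x100, dist_taken=3, dist_not_taken=2)
branches = {0x10C: loop_branch}
predictions = {0x10C: True}  # assumption: the predictor always says "taken"

lookups = count_btb_lookups(branches, predictions, 0x100, 0x10C, cycles=12)
print(f"BTB lookups in 12 fetch cycles: {lookups}")
```

In this toy loop the BTB is accessed on 3 of the 12 fetch cycles, whereas a conventional design would access it on all 12. The same precomputed address could also serve as the trigger for waking a predictor entry shortly before it is needed, although the sketch does not model that.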