Branchless cycle prediction for embedded processors
Source
Symposium on Applied Computing
Proceedings of the 2006 ACM symposium on Applied computing
Dijon, France
SESSION: Embedded systems: applications, solutions and techniques (EMBS)
Pages: 928 - 932
Year of Publication: 2006
ISBN: 1-59593-108-2
ABSTRACT
Modern embedded processors access the Branch Target Buffer (BTB) every cycle to speculate branch target addresses. Such accesses are often unnecessary because none of the fetched instructions is a branch. In this work we introduce Branchless Cycle Prediction (BLCP) to exploit this design inefficiency. BLCP uses a simple, power-efficient structure to predict, at least one cycle in advance, the cycles in which no branch instruction is among those fetched. We avoid accessing the BTB during such cycles. We show that BLCP can reduce BTB power dissipation by 32% at a negligible performance cost (0.2% on average).
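The abstract does not describe the internal organization of the BLCP structure, so the following is only an illustrative sketch of the general idea, not the authors' design. It assumes a small table of 2-bit saturating counters indexed by the current fetch address; each counter records whether the fetch group of the following cycle has recently been branch-free. The table size, the confidence threshold, the reset-on-branch policy, and the names `BranchlessCyclePredictor` and `simulate` are all hypothetical.

```python
class BranchlessCyclePredictor:
    """Toy model (assumed design): one 2-bit saturating counter per entry."""

    def __init__(self, entries=64, threshold=2):
        self.entries = entries      # hypothetical table size
        self.threshold = threshold  # counter value needed to predict "branchless"
        self.counters = [0] * entries

    def _index(self, pc):
        # Drop the low two bits (assumes 4-byte instructions) and wrap.
        return (pc >> 2) % self.entries

    def predict_next_branchless(self, pc):
        """Predict, from the current fetch address, whether the NEXT
        cycle's fetch group contains no branch."""
        return self.counters[self._index(pc)] >= self.threshold

    def update(self, pc, next_had_branch):
        i = self._index(pc)
        if next_had_branch:
            # Reset on a branch so a needed BTB access is rarely skipped.
            self.counters[i] = 0
        else:
            self.counters[i] = min(3, self.counters[i] + 1)


def simulate(trace, predictor):
    """trace: list of (fetch_pc, fetch_group_contains_branch) per cycle.
    Returns (btb_accesses, branches_fetched_while_btb_was_skipped)."""
    btb_accesses = 0
    missed_branches = 0
    skip_btb = False  # decision made one cycle in advance
    for k, (pc, has_branch) in enumerate(trace):
        if skip_btb:
            if has_branch:
                missed_branches += 1  # misprediction: BTB was needed
        else:
            btb_accesses += 1
        if k > 0:
            # Train the entry of the previous cycle with what was fetched now.
            predictor.update(trace[k - 1][0], has_branch)
        skip_btb = predictor.predict_next_branchless(pc)
    return btb_accesses, missed_branches
```

As a usage example, a synthetic loop of four fetch groups in which only the last contains a branch, e.g. `simulate([(0, False), (16, False), (32, False), (48, True)] * 10, BranchlessCyclePredictor())`, lets the counters warm up and then skip the BTB in the branch-free cycles. Every skipped access where a branch does appear counts as a missed branch, which is the kind of event that would account for the performance cost mentioned in the abstract.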