Completion time multiple branch prediction for enhancing trace cache performance
Source
International Symposium on Computer Architecture
Proceedings of the 27th annual international symposium on Computer architecture
Vancouver, British Columbia, Canada
Pages: 47 - 58
Year of Publication: 2000
ISBN: 1-58113-232-8
Authors
Ryan Rakvic, Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA
Bryan Black, Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA
John Paul Shen, Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA
ABSTRACT
The need for multiple branch prediction is inherent to wide instruction fetching. This paper presents a completion-time multiple branch predictor, the Tree-based Multiple Branch Predictor (TMP), that builds on previous single branch prediction techniques. It employs a tree structure of branch predictors, or tree-node predictors, and achieves accurate multiple branch prediction by leveraging the high accuracies of the individual branch predictors. A highly efficient TMP design uses 2-bit saturating counters for the tree-node predictors. To achieve a higher prediction rate, the TMP employs two-level schemes for the tree-node predictors, resulting in a three-level TMP design. Placing the TMP at completion time reduces the critical latency in the front end of the pipeline; the resultant longer update latency does not significantly impact overall performance. In this paper, the TMP is applied to a trace cache design and shown to be very effective in increasing its performance.
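The idea of a tree of 2-bit saturating counters can be illustrated with a minimal Python sketch. This is not the authors' design: the class names, the heap-order node layout, the initial counter state, and the rule of updating only the nodes on the actual outcome path are all assumptions made for illustration, and the two-level indexing mentioned in the abstract is left out.

```python
class SaturatingCounter:
    """2-bit saturating counter: states 0..3, predicts taken when state >= 2."""

    def __init__(self, state=1):
        # Assumed initial state: 1 (weakly not taken).
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


class TreePredictor:
    """Hypothetical tree of counters stored in heap order.

    Node i has its not-taken child at 2*i + 1 and its taken child at 2*i + 2.
    A tree of depth d predicts d consecutive branches in one lookup.
    """

    def __init__(self, depth):
        self.depth = depth
        self.nodes = [SaturatingCounter() for _ in range(2 ** depth - 1)]

    def predict(self):
        # Walk from the root, following each node's own prediction.
        path, i = [], 0
        for _ in range(self.depth):
            taken = self.nodes[i].predict()
            path.append(taken)
            i = 2 * i + (2 if taken else 1)
        return path

    def update(self, outcomes):
        # Assumed update rule: once the real outcomes are known (for example
        # at completion time), train only the nodes on the actual path.
        i = 0
        for taken in outcomes[: self.depth]:
            self.nodes[i].update(taken)
            i = 2 * i + (2 if taken else 1)


tmp = TreePredictor(depth=3)
pattern = [True, False, True]   # taken, not taken, taken
for _ in range(4):
    tmp.update(pattern)
print(tmp.predict())
```

Because each node sits on exactly one path from the root, a node is only trained by outcomes that followed the same preceding branch directions, which is one plausible reading of how the tree exploits the accuracy of the individual predictors.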
Results: A realistic-size TMP (72KB) can predict 1, 2, 3, and 4 consecutive blocks with compounded prediction accuracies of 96%, 93%, 87%, and 82%, respectively. The block-based trace cache with this TMP achieves 4.75 IPC for SPECint95 on an idealized machine, which is a 20% performance improvement over the original design [1]. This improved performance is 8% above that of a conventional I-cache design with perfect single branch prediction.
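The reported figures can be unpacked with a little arithmetic. The sketch below rests on two assumptions that the abstract does not state outright: that a compounded accuracy for k blocks is the probability that all of the first k blocks are predicted correctly, and that an improvement of x% means the new IPC is (1 + x/100) times the baseline. The function names are hypothetical.

```python
# Compounded accuracies for 1, 2, 3, and 4 consecutive blocks, from the abstract.
compounded = [0.96, 0.93, 0.87, 0.82]


def conditional_accuracies(values):
    """Accuracy of block k given that blocks 1..k-1 were predicted correctly."""
    result, previous = [], 1.0
    for accuracy in values:
        result.append(accuracy / previous)
        previous = accuracy
    return result


def implied_baseline(ipc, improvement):
    """Baseline IPC implied by a relative improvement (0.20 means 20%)."""
    return ipc / (1.0 + improvement)


for k, accuracy in enumerate(conditional_accuracies(compounded), start=1):
    print(f"block {k}: {accuracy:.3f}")

print(f"original block-based trace cache: {implied_baseline(4.75, 0.20):.2f} IPC")
print(f"I-cache with perfect single branch prediction: {implied_baseline(4.75, 0.08):.2f} IPC")
```

Under these assumptions, the implied baselines are roughly 3.96 IPC for the original design and 4.40 IPC for the conventional I-cache; these are derived values, not numbers reported in the abstract.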
REFERENCES
[1] Bryan Black, Bohuslav Rychlik, John Paul Shen, The block-based trace cache, Proceedings of the 26th annual international symposium on Computer architecture, p.196-207, May 01-04, 1999, Atlanta, Georgia, United States
[3] Thomas M. Conte, Kishore N. Menezes, Patrick M. Mills, Burzin A. Patel, Optimization of instruction fetch mechanisms for high issue rates, Proceedings of the 22nd annual international symposium on Computer architecture, p.333-344, June 22-24, 1995, S. Margherita Ligure, Italy
[7] IBM Microelectronics Division, PowerPC 604 RISC Microprocessor User's Manual, 1994.
[8] Quinn Jacobson, Eric Rotenberg, James E. Smith, Path-based next trace prediction, Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, p.14-23, December 01-03, 1997, Research Triangle Park, North Carolina, United States
[9] S. McFarling, "Combining Branch Predictors." Technical Report TN-36, Digital Equipment Corp., June 1993.
[12] S. Patel, D. Friendly, and Y. Patt, "Evaluation of Design Options for the Trace Cache Fetch Mechanism." IEEE Transactions on Computers, Special Issue on Cache Memory and Related Problems.
[14] Eric Rotenberg, Quinn Jacobson, Yiannakis Sazeides, Jim Smith, Trace processors, Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, p.138-148, December 01-03, 1997, Research Triangle Park, North Carolina, United States
[15] André Seznec, Stéphan Jourdan, Pascal Sainrat, Pierre Michaud, Multiple-block ahead branch predictors, Proceedings of the seventh international conference on Architectural support for programming languages and operating systems, p.116-127, October 01-04, 1996, Cambridge, Massachusetts, United States
[16] J. Smith, "A Study of Branch Prediction Strategies." In Proceedings of the 8th Annual International Symposium on Computer Architecture, pp. 135-148, May 1981.
CITED BY 5
Yuan Chou, Pazhani Pillai, Herman Schmit, John Paul Shen, PipeRench implementation of the instruction path coprocessor, Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture, p.147-158, December 2000, Monterey, California, United States
Juan C. Moure, Domingo Benítez, Dolores I. Rexachs, Emilio Luque, Wide and efficient trace prediction using the local trace predictor, Proceedings of the 20th annual international conference on Supercomputing, June 28-July 01, 2006, Cairns, Queensland, Australia