Completion time multiple branch prediction for enhancing trace cache performance
Source: Proceedings of the 27th Annual International Symposium on Computer Architecture (ISCA)
Vancouver, British Columbia, Canada
Pages: 47-58
Year of Publication: 2000
ISBN: 1-58113-232-8
Authors
Ryan Rakvic  Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA
Bryan Black  Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA
John Paul Shen  Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/339647.339654

ABSTRACT

The need for multiple branch prediction is inherent to wide instruction fetching. This paper presents a completion time multiple branch predictor called the Tree-based Multiple Branch Predictor (TMP) that builds on previous single branch prediction techniques. It employs a tree structure of branch predictors, or tree-node predictors, and achieves accurate multiple branch prediction by leveraging the high accuracies of the individual branch predictors. A highly efficient TMP design uses 2-bit saturating counters for the tree-node predictors. To achieve a higher prediction rate, the TMP employs two-level schemes for the tree-node predictors, resulting in a three-level TMP design. Placing the TMP at completion time reduces the critical latency in the front-end of the pipeline; the resultant longer update latency does not significantly impact the overall performance. In this paper the TMP is applied to a trace cache design and shown to be very effective in increasing its performance.

Results: A realistic-size TMP (72KB) can predict 1, 2, 3, and 4 consecutive blocks with compounded prediction accuracies of 96%, 93%, 87%, and 82%, respectively. The block-based trace cache with this TMP achieves 4.75 IPC for SPECint95 on an idealized machine, which is a 20% performance improvement over the original design [1]. This improved performance is 8% above that of a conventional I-cache design with perfect single branch prediction.
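The tree-of-counters idea in the abstract can be illustrated with a small sketch. This is not the paper's design: it assumes a complete binary tree stored in heap order, one 2-bit saturating counter per node, and a single tree with no history-based indexing. The class name `TreePredictor` and its methods are hypothetical, and the two-level (three-level TMP) variant and the trace cache are not modeled.

```python
class TreePredictor:
    """Hypothetical binary tree of 2-bit saturating counters (heap layout)."""

    def __init__(self, depth):
        self.depth = depth
        # One counter per tree node: 0-1 predict not-taken, 2-3 predict taken.
        # Every counter starts at 1 (weakly not-taken); this is an assumption.
        self.counters = [1] * (2 ** depth - 1)

    def predict(self):
        """Return predicted directions for `depth` consecutive branches."""
        node, path = 0, []
        for _ in range(self.depth):
            taken = self.counters[node] >= 2
            path.append(taken)
            # Follow the predicted direction: left child = not-taken, right = taken.
            node = 2 * node + (2 if taken else 1)
        return path

    def update(self, outcomes):
        """Train the counters along the actual path, as a completion-time update would."""
        node = 0
        for taken in outcomes[:self.depth]:
            c = self.counters[node]
            # Saturate at 0 and 3 instead of wrapping around.
            self.counters[node] = min(3, c + 1) if taken else max(0, c - 1)
            node = 2 * node + (2 if taken else 1)


tmp = TreePredictor(depth=3)
for _ in range(4):
    tmp.update([True, False, True])  # resolved outcomes of three consecutive branches
print(tmp.predict())  # [True, False, True]
```

Each walk from the root makes one single-branch prediction per level, so a multi-branch prediction is only as accurate as the product of the node-level predictions along the path, which is one way to read the falling compounded accuracies reported for 1 to 4 blocks.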


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

[7] IBM Microelectronics Division, PowerPC 604 RISC Microprocessor User's Manual, 1994.

[9] S. McFarling, "Combining Branch Predictors." Technical Report TN-36, Digital Equipment Corp., June 1993.

[12] S. Patel, D. Friendly, and Y. Patt, "Evaluation of Design Options for the Trace Cache Fetch Mechanism." IEEE Transactions on Computers, Special Issue on Cache Memory and Related Problems.

[16] J. Smith, "A Study of Branch Prediction Strategies." In Proceedings of the 30th International Symposium on Microarchitecture, pp. 138-148, December 1997.

