Using profiling to reduce branch misprediction costs on a dynamically scheduled processor
Source International Conference on Supercomputing archive
Proceedings of the 14th international conference on Supercomputing table of contents
Santa Fe, New Mexico, United States
Pages: 206 - 214  
Year of Publication: 2000
ISBN:1-58113-270-0
Authors
Srinivas Mantripragada  Silicon Graphics Inc., 2011 N. Shoreline Blvd, Mountain View, CA
Alexandru Nicolau  Dept. of Computer Science, University of California, Irvine, CA
Sponsor
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/335231.335251

ABSTRACT

Modern dynamically scheduled processors use branch prediction hardware to speculatively fetch and execute the most likely executed paths in a program. Complex branch predictors have been proposed which attempt to identify these paths accurately so that the hardware can benefit from out-of-order (OOO) execution. Recent studies have shown that, in spite of such complex prediction schemes, there still exist many frequently executed branches which are difficult to predict. Predicated execution has been proposed as an alternative technique to eliminate some of these branches, in forms ranging from restrictive support to full-blown support. We call the restrictive form of predicated execution guarded execution.

In this paper, we propose a new algorithm which uses profiling and selectively performs if-conversion for architectures with guarded execution support. Branch profiling is used to gather the taken, non-taken and misprediction counts for every branch. This, combined with block profiling, is used to select paths which suffer from heavy mispredictions and are profitable to if-convert. The effects of three different selection criteria, namely size-based, predictability-based and profile-based, on net cycle improvements, branch mispredictions and mis-speculated instructions are then studied. We also propose new mechanisms to convert unsafe instructions to safe form to enhance the applicability of the technique. Finally, we explain numerous adjustments that were made to the selection criteria to better reflect the OOO processor behavior.
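
As a rough illustration of how taken, non-taken and misprediction counts might be combined with block sizes to decide whether a branch is profitable to if-convert, here is a minimal sketch. It is not the algorithm from the paper: the cost model, the `BranchProfile` fields, and the `mispredict_penalty` and `issue_width` defaults are all assumptions made for this example. The model assumes a simple if-then-else region in which, after if-conversion, both paths are always executed.

```python
from dataclasses import dataclass


@dataclass
class BranchProfile:
    """Hypothetical per-branch profile record (field names are assumptions)."""
    name: str
    taken: int                  # times the branch was taken
    not_taken: int              # times the branch fell through
    mispredicts: int            # times the predictor was wrong
    taken_path_size: int        # instructions on the taken path
    fallthrough_path_size: int  # instructions on the fall-through path


def net_cycle_gain(b, mispredict_penalty=10, issue_width=4):
    """Estimate cycles saved by if-converting branch b under a toy model.

    Saved:  every misprediction avoided saves `mispredict_penalty` cycles.
    Added:  each execution now also runs the path it would have skipped,
            and those extra instructions issue `issue_width` per cycle.
    Both parameters are illustrative assumptions, not measured values.
    """
    saved = b.mispredicts * mispredict_penalty
    extra_instrs = (b.taken * b.fallthrough_path_size
                    + b.not_taken * b.taken_path_size)
    added = extra_instrs / issue_width
    return saved - added


def select_for_if_conversion(profiles, **model):
    """Return names of branches whose estimated net gain is positive."""
    return [b.name for b in profiles if net_cycle_gain(b, **model) > 0]


profiles = [
    BranchProfile("hard_small", 500, 500, 400, 2, 2),    # unpredictable, tiny paths
    BranchProfile("easy_small", 990, 10, 12, 2, 2),      # well predicted
    BranchProfile("hard_large", 500, 500, 400, 40, 40),  # unpredictable, big paths
]
selected = select_for_if_conversion(profiles)
print(selected)
```

Under these assumed numbers only the unpredictable branch with small paths is selected: the well-predicted branch has too few mispredictions to repay the extra instructions, and the large region adds more work than its mispredictions cost.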



