Code placement for improving dynamic branch prediction accuracy
Source
Conference on Programming Language Design and Implementation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Chicago, IL, USA
SESSION: Optimization
Pages: 107-116
Year of Publication: 2005
ISBN:1-59593-056-6
Author
Daniel A. Jiménez  Rutgers University, Piscataway, New Jersey & Universidad Politécnica de Cataluña, Barcelona, Spain
Sponsors
SIGPLAN: ACM Special Interest Group on Programming Languages
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 13,   Downloads (12 Months): 72,   Citation Count: 3
DOI: http://doi.acm.org/10.1145/1065010.1065025

ABSTRACT

Code placement techniques have traditionally improved instruction fetch bandwidth by increasing instruction locality and decreasing the number of taken branches. However, traditional code placement techniques have less benefit in the presence of a trace cache that alters the placement of instructions in the instruction cache. Moreover, as pipelines have become deeper to accommodate increasing clock rates, branch misprediction penalties have become a significant impediment to performance. We evaluate pattern history table partitioning, a feedback-directed code placement technique that explicitly places conditional branches so that they are less likely to interfere destructively with one another in branch prediction tables. On SPEC CPU benchmarks running on an Intel Pentium 4, branch mispredictions are reduced by up to 22%, and by 3.5% on average. This reduction yields a speedup of up to 16.0%, and of 4.5% on average. By contrast, branch alignment, a previous code placement technique, yields a speedup of at most 4.7%, and of less than 1% on average.
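
The destructive interference mentioned in the abstract can be illustrated with a toy model. The sketch below is not the paper's algorithm and does not model the Pentium 4 predictor; it assumes a hypothetical table of 2-bit saturating counters indexed by the low bits of the branch address, with two branches of opposite bias executing alternately. The function name `simulate`, the table size, and the addresses are all illustrative assumptions.

```python
def simulate(addr_a, addr_b, table_bits=4, rounds=1000):
    """Count mispredictions for two branches sharing a toy prediction table.

    Branch A is always taken and branch B is never taken; they run alternately.
    Each entry is a 2-bit saturating counter (0..3); a value >= 2 predicts taken.
    The index function (address modulo table size) is an assumption made for
    illustration, not the indexing of any real processor.
    """
    size = 1 << table_bits
    table = [1] * size  # every counter starts at "weakly not taken"
    mispredictions = 0
    for _ in range(rounds):
        for addr, taken in ((addr_a, True), (addr_b, False)):
            idx = addr % size
            predicted_taken = table[idx] >= 2
            if predicted_taken != taken:
                mispredictions += 1
            # Train the counter toward the actual outcome, saturating at 0 and 3.
            if taken:
                table[idx] = min(3, table[idx] + 1)
            else:
                table[idx] = max(0, table[idx] - 1)
    return mispredictions


# Both addresses map to entry 0, so the branches keep undoing each other's training.
aliased = simulate(0x40, 0x50)
# Moving branch B by one byte maps it to entry 1, so each branch trains its own counter.
separated = simulate(0x40, 0x51)
print(aliased, separated)
```

In this toy setting the aliased placement mispredicts on every execution, while the separated placement mispredicts only during warm-up. A placement technique of the kind the abstract describes would aim to choose addresses so that such conflicts become less frequent.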

