ABSTRACT
Current trends in modern out-of-order processors involve implementing deeper pipelines and a large instruction window to achieve high performance. However, as pipeline depth increases, the branch misprediction penalty becomes a critical factor in overall processor performance. Current approaches to handling branch mispredictions either incrementally roll back to in-order state by waiting until the mispredicted branch reaches the head of the reorder buffer, or utilize checkpointing at branches for faster recovery. Rolling back to in-order state stalls the pipeline for a significant number of cycles, and checkpointing is costly.

This paper proposes a fast recovery mechanism, called Eager Misprediction Recovery (EMR), to reduce the branch misprediction penalty. Upon a misprediction, the processor immediately starts fetching and renaming instructions from the correct path without restoring the map table. Those instructions that access incorrect speculative values wait until the correct data are restored; however, instructions that access correct values continue executing while recovery occurs. Thus, the recovery mechanism hides the latency of long branch recovery with useful instructions.

EMR achieves a mean performance improvement very close to that of a recovery mechanism that supports checkpointing at each branch. In addition, EMR provides an average of 9.0% and up to 19.9% better performance than traditional sequential misprediction recovery on the SPEC2000 benchmark suite.
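The renaming behavior described in the abstract can be illustrated with a small software model. The sketch below is only an assumption-laden illustration, not the paper's hardware design: the abstract gives no implementation details, so the class `EagerRenameSketch`, its per-entry `stale` flag, the undo log, and the one-entry-per-step recovery are all hypothetical. It models a single misprediction at a time and shows only the idea that correct-path instructions reading untouched mappings need not wait, while those reading wrong-path mappings wait until recovery reaches them.

```python
class EagerRenameSketch:
    """Toy rename map with eager recovery (hypothetical model, one misprediction at a time)."""

    def __init__(self, num_arch_regs):
        # arch reg -> [physical reg, sequence number of writer, stale flag]
        self.table = {r: [r, -1, False] for r in range(num_arch_regs)}
        self.next_phys = num_arch_regs
        self.next_seq = 0
        self.log = []          # (seq, arch reg, old physical reg, old writer seq)
        self.undo = []         # wrong-path log entries still waiting to be undone
        self.branch_seq = None

    def rename(self, dest, srcs):
        """Rename one instruction; return (seq, sources it must wait on)."""
        seq = self.next_seq
        self.next_seq += 1
        # A source is unsafe to read while its mapping still comes from the wrong path.
        waiting = [s for s in srcs if self.table[s][2]]
        if dest is not None:
            old_phys, old_seq, _ = self.table[dest]
            self.log.append((seq, dest, old_phys, old_seq))
            # A correct-path write always produces a valid (non-stale) mapping.
            self.table[dest] = [self.next_phys, seq, False]
            self.next_phys += 1
        return seq, waiting

    def mispredict(self, branch_seq):
        """Flag wrong-path mappings as stale instead of restoring the table first."""
        self.branch_seq = branch_seq
        self.undo = [e for e in self.log if e[0] > branch_seq]
        self.log = [e for e in self.log if e[0] <= branch_seq]
        for entry in self.table.values():
            if entry[1] > branch_seq:
                entry[2] = True

    def recover_one(self):
        """Undo the youngest remaining wrong-path write; return False when done."""
        if not self.undo:
            return False
        _, arch, old_phys, old_seq = self.undo.pop()
        if self.table[arch][2]:  # skip registers already redefined on the correct path
            self.table[arch] = [old_phys, old_seq, old_seq > self.branch_seq]
        return True


if __name__ == "__main__":
    t = EagerRenameSketch(4)
    t.rename(1, [0])                  # r1 <- f(r0), before the branch
    branch, _ = t.rename(None, [1])   # the branch that will be mispredicted
    t.rename(2, [1])                  # wrong path writes r2
    t.mispredict(branch)
    print(t.rename(3, [1])[1])        # reads r1: nothing to wait on
    print(t.rename(0, [2])[1])        # reads r2: must wait for recovery
    while t.recover_one():
        pass
    print(t.rename(0, [2])[1])        # r2 restored: nothing to wait on
```

In this model, the work done between `mispredict` and the last `recover_one` call stands in for the recovery latency that the abstract says EMR overlaps with useful correct-path instructions.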
CITED BY 5
Isidro Gonzalez , Marco Galluzzi , Alex Veidenbaum , Marco A. Ramirez , Adrian Cristal , Mateo Valero, A distributed processor state management architecture for large-window processors, Proceedings of the 2008 41st IEEE/ACM International Symposium on Microarchitecture, p.11-22, November 08-12, 2008
REVIEW
Reviewer: Ronaldo A. L. Goncalves
This paper proposes a mechanism, eager misprediction recovery (EMR), for recovering the processor state after branch misprediction in modern out-of-order architectures. The idea of this mechanism is to restart the instruction fetching on the corre