|
ABSTRACT
Although modern superscalar processors achieve high branch prediction accuracy, certain branches either are inherently difficult to predict or incur destructive interference in prediction tables, causing significant performance loss due to mispredictions. We propose a novel microarchitecture, called Skipper, to handle such difficult branches by exploiting control-flow independence. Previous approaches to handling difficult branches, one way or another, amount to executing incorrect instructions, squandering cycles and resources such as the i-cache bandwidth. Skipper altogether avoids incorrect instructions by skipping over, without even fetching, the control-flow dependent computation conditioned by a difficult branch. Instead, Skipper fetches and executes the control-flow independent instructions, which are past the point where the branch's taken and not-taken paths reconverge, and which need to be executed irrespective of the branch outcome. Because Skipper executes the correct control-flow dependent instructions after the difficult branch is resolved, it conserves the valuable resources.Skipper is the first proposal to exploit control-flow independence by skipping over control-flow dependent computation in a superscalar pipeline. Skipper fetches the skipped control-flow dependent instructions after the post-reconvergent instructions, out of program order. We describe key mechanisms to implement Skipper without unduly complicating the pipeline despite out-of-order fetch. SPECint95 simulations show that Skipper performs 10% and 8% better than superscalar and the previously-proposed Polypath, respectively, when all three microarchitectures have equal i-cache bandwidth and hardware resources.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
 |
2
|
David I. August , Daniel A. Connors , Scott A. Mahlke , John W. Sias , Kevin M. Crozier , Ben-Chung Cheng , Patrick R. Eaton , Qudus B. Olaniran , Wen-mei W. Hwu, Integrated predicated and speculative execution in the IMPACT EPIC architecture, Proceedings of the 25th annual international symposium on Computer architecture, p.227-237, June 27-July 02, 1998, Barcelona, Spain
|
 |
3
|
Scott E. Breach , T. N. Vijaykumar , Gurindar S. Sohi, The anatomy of the register file in a multiscalar processor, Proceedings of the 27th annual international symposium on Microarchitecture, p.181-190, November 30-December 02, 1994, San Jose, California, United States
[doi> 10.1145/192724.192750]
|
| |
4
|
D. Burger, T. M. Austin, and S. Bennett. Evaluating future microprocessors: the simplescalar tool set. Technical Report CS TR-1308, University of Wisconsin, Madison, July 1996.
|
 |
5
|
|
 |
6
|
Dirk Grunwald , Artur Klauser , Srilatha Manne , Andrew Pleszkun, Confidence estimation for speculation control, Proceedings of the 25th annual international symposium on Computer architecture, p.122-131, June 27-July 02, 1998, Barcelona, Spain
|
 |
8
|
Eric Hao , Po-Yung Chang , Yale N. Patt, The effect of speculatively updating branch history on branch prediction accuracy, revisited, Proceedings of the 27th annual international symposium on Microarchitecture, p.228-232, November 30-December 02, 1994, San Jose, California, United States
[doi> 10.1145/192724.192756]
|
| |
9
|
|
| |
10
|
|
| |
11
|
|
 |
12
|
|
 |
13
|
|
 |
14
|
Scott A. Mahlke , David C. Lin , William Y. Chen , Richard E. Hank , Roger A. Bringmann, Effective compiler support for predicated execution using the hyperblock, Proceedings of the 25th annual international symposium on Microarchitecture, p.45-54, December 01-04, 1992, Portland, Oregon, United States
|
| |
15
|
S. McFarling. Combining branch predictors. Technical Report TR-36, DEC-WRL, June 1993.
|
 |
16
|
Andreas Moshovos , Scott E. Breach , T. N. Vijaykumar , Gurindar S. Sohi, Dynamic speculation and synchronization of data dependences, Proceedings of the 24th annual international symposium on Computer architecture, p.181-193, June 01-04, 1997, Denver, Colorado, United States
|
| |
17
|
|
| |
18
|
|
 |
19
|
|
 |
20
|
|
| |
21
|
Jared Stark , Paul Racunas , Yale N. Patt, Reducing the performance impact of instruction cache misses by writing instructions into the reservation stations out-of-order, Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, p.34-43, December 01-03, 1997, Research Triangle Park, North Carolina, United States
|
| |
23
|
|
 |
24
|
|
 |
25
|
|
CITED BY 15
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Tanausú Ramírez , Alex Pajuelo , Oliverio J. Santana , Mateo Valero, A simple speculative load control mechanism for energy saving, Proceedings of the 2006 workshop on MEmory performance: DEaling with Applications, systems and architectures, p.29-36, September 16-20, 2006, Seattle, Washington
|
|
|
|
|
|
|
|
|
|
|
|
Hyesoon Kim , Onur Mutlu , Jared Stark , Yale N. Patt, Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution, Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture, p.43-54, November 12-16, 2005, Barcelona, Spain
|
|
|
|
|
|
|
|
|
|
|
|
|
|