Predicate prediction for efficient out-of-order execution
Source
International Conference on Supercomputing
Proceedings of the 17th annual international conference on Supercomputing
San Francisco, CA, USA
SESSION: Processor microarchitecture II
Pages: 183 - 192
Year of Publication: 2003
ISBN: 1-58113-733-8
ABSTRACT
Predicated execution is an important optimization even for an out-of-order processor, since it can eliminate hard-to-predict branches and help to enable software pipelining. Using predication with out-of-order execution creates a naming bottleneck, because multiple definitions can reach a use, and not knowing which definition is the correct one can stall the processor. In this paper, we examine using predicate prediction to speculatively allow execution to proceed in the face of multiple definitions. We show that the penalty for mispredicting a predicate is not as severe as that for mispredicting a branch, which makes it advantageous to replace hard-to-predict branches with predicate predictions. We present a predicate misprediction recovery architecture that replays instructions through the renamer to link up the correct dependencies on a misprediction. This approach allows us to avoid putting the predicted-false-path instructions in the issue queue, reducing the pressure on the dynamic out-of-order scheduler.
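The naming bottleneck and the replay idea described in the abstract can be illustrated with a toy model. The sketch below is not the paper's architecture: the `Instr` record, the `rename` function, the tag scheme (`t<index>`), and the three-instruction program are all hypothetical, and recovery is simplified to re-running the whole sequence through the renamer with the resolved predicate values. It only shows how a predicted predicate lets the renamer pick a single definition for a use, and how replaying with corrected predicates re-links the consumer to the other definition.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: dest = op(srcs), optionally guarded by a predicate."""
    dest: str
    srcs: Tuple[str, ...]
    pred: Optional[str] = None  # name of the guarding predicate, None if unguarded


def rename(program, predicate_values):
    """Rename `program`, treating each guarded instruction according to `predicate_values`.

    Instructions whose predicate is (predicted) false are skipped entirely,
    so they never occupy an issue-queue slot in this toy model.
    Returns a list of (instruction index, destination tag, source tags).
    """
    map_table = {}    # architectural register -> tag of its latest definition
    issue_queue = []
    for i, ins in enumerate(program):
        if ins.pred is not None and not predicate_values[ins.pred]:
            continue  # false-path instruction: not renamed, not queued
        # Sources with no in-flight producer keep their architectural name (live-ins).
        src_tags = tuple(map_table.get(s, s) for s in ins.srcs)
        tag = f"t{i}"  # assumption: the tag simply names the producing instruction
        map_table[ins.dest] = tag
        issue_queue.append((i, tag, src_tags))
    return issue_queue


# Two guarded definitions of r1 reach the use in instruction 2.
program = [
    Instr("r1", ("r5",), pred="p1"),  # (p1) r1 = f(r5)
    Instr("r1", ("r6",), pred="p2"),  # (p2) r1 = g(r6)
    Instr("r2", ("r1",)),             #      r2 = h(r1)
]

# Speculate: predict p1 true and p2 false, so the use is linked to instruction 0.
speculative = rename(program, {"p1": True, "p2": False})

# The predicates resolve the other way: replay through the renamer
# with the actual values, which links the use to instruction 1 instead.
replayed = rename(program, {"p1": False, "p2": True})

print(speculative)
print(replayed)
```

In this model the consumer's source tag changes from `t0` to `t1` after the replay, which is the dependency re-linking the abstract refers to; a real design would replay only from the mispredicted point and would manage physical registers rather than symbolic tags.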