ABSTRACT
This paper proposes a new processor architecture for handling hard-to-predict branches, the diverge-merge processor (DMP). The goal of this paradigm is to eliminate branch mispredictions due to hard-to-predict dynamic branches by dynamically predicating them without requiring ISA support for predicate registers and predicated instructions. To achieve this without incurring large hardware cost and complexity, the compiler provides control-flow information by hints and the processor dynamically predicates instructions only on frequently executed program paths. The key insight behind DMP is that most control-flow graphs look and behave like simple hammock (if-else) structures when only frequently executed paths in the graphs are considered. Therefore, DMP can dynamically predicate a much larger set of branches than simple hammock branches.
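To make the term "simple hammock" concrete, the following is a minimal software analogy, not the hardware mechanism itself. It contrasts an if-else hammock written as a branch with a predicated form in which both paths are computed and one result is selected at the merge point. The function names, the arithmetic on each path, and the select step are illustrative assumptions that do not come from the paper.

```python
def hammock_branch(cond, a, b):
    """Control-flow form: a simple if-else hammock (hypothetical example)."""
    if cond:          # diverge point: a hard-to-predict branch would sit here
        r = a + 1     # taken path
    else:
        r = b * 2     # not-taken path
    return r          # merge point: both paths rejoin


def hammock_predicated(cond, a, b):
    """Predicated form: compute both paths, then select by the predicate.

    Assumption for illustration only: the merge is modeled as a single
    select on the predicate value, so no branch outcome has to be guessed.
    """
    p = bool(cond)        # predicate produced by the branch condition
    r_taken = a + 1       # result of the taken path
    r_not_taken = b * 2   # result of the not-taken path
    return r_taken if p else r_not_taken


def forms_agree(cases):
    """Return True if both forms give the same result on every case."""
    return all(
        hammock_branch(c, a, b) == hammock_predicated(c, a, b)
        for (c, a, b) in cases
    )
```

Both forms return the same value for any input. The predicated form does extra work on the path that is not needed, which is the cost that predication trades against a possible misprediction.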