|
ABSTRACT
Program analysis methods, especially those which support automatic vectorization, are based on the concept of interstatement dependence where a dependence holds between two statements when one of the statements computes values needed by the other. Powerful program transformation systems that convert sequential programs to a form more suitable for vector or parallel machines have been developed using this concept [AllK 82, KKLW 80].The dependence analysis in these systems is based on data dependence. In the presence of complex control flow, data dependence is not sufficient to transform programs because of the introduction of control dependences. A control dependence exists between two statements when the execution of one statement can prevent the execution of the other. Control dependences do not fit conveniently into dependence-based program translators.One solution is to convert all control dependences to data dependences by eliminating goto statements and introducing logical variables to control the execution of statements in the program. In this scheme, action statements are converted to IF statements. The variables in the conditional expression of an IF statement can be viewed as inputs to the statement being controlled. The result is that control dependences between statements become explicit data dependences expressed through the definitions and uses of the controlling logical variables.This paper presents a method for systematically converting control dependences to data dependences in this fashion. The algorithms presented here have been implemented in PFC, an experimental vectorizer written at Rice University.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
{AllK 82} J. R. Allen and K., Kennedy, "PFC: a program to convert Fortran to parallel form," Report MASC TR 82-6, Department of Mathematical Sciences, Rice University, Houston, Texas, March, 1982.
|
| |
2
|
{AlKW 82} J. R. Allen, K. Kennedy, and J., Warren, "Simplification of Boolean formulas in PFC," Dept. Mathematical Sciences, Rice University, Houston, Texas, November
|
| |
3
|
{ANSI 81} American National Standards Institute. Inc., "Proposals approved for Fortran 8x," X3J3/S6.80 (preliminary document), November 30, 1981.
|
| |
4
|
{Bane 76} U. Banerjee, "Data dependence in ordinary programs," Report 76-837, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, November 1976.
|
 |
5
|
|
| |
6
|
{Burr 77} Burroughs Corporations, "Implementation of FORTRAN," Burroughs Scientific Processor brochure, 1977.
|
| |
7
|
{GibK 81} C. Gibbons, and K. Kennedy, "Simplification of functions," Rice Technical Report 476-029-10, Rice University, January 1981.
|
 |
8
|
|
| |
9
|
{Kenn 80} K. Kennedy, "Automatic translation of Fortran programs to vector form," Rice Technical Report 476-029-4, Rice University, October 1980.
|
 |
10
|
|
| |
11
|
{KKLP 81} D. J. Kuck, R. H. Kuhn, B. Leasure, D. A. Padua, and M. Wolfe, "Compiler transformation of dependence graphs," Conf. Record of the Eighth ACM Symposium on Principles of Programming Languages, Williamsburg, Va., January 1981.
|
| |
12
|
{KKLW 80} D. J. Kuck, R. H. Kuhn, B. Leasure, and M. Wolfe, "The structure of an advanced vectorizer for pipelined processors," Proc. IEEE Computer Society Fourth International Computer Software and Applications Conf. IEEE, Chicago, October 1980.
|
| |
13
|
{McCl 56} E. J. McCluskey, "Minimization of Boolean functions," Bell System Tech. J. 35, 5, November 1956, 1417-1444.
|
| |
14
|
{Quin 52} W. V. Quine, "The problem of simplifying truth functions," Am. Math. Monthly 59, 8, October 1952, 521-531.
|
| |
15
|
|
| |
16
|
{Wolf 78} M. J. Wolfe, "Techniques for improving the inherent parallelism in program," Report 78-929, Dept. of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois. July 1978.
|
CITED BY 134
|
|
|
|
|
|
|
|
David I. August , John W. Sias , Jean-Michel Puiatti , Scott A. Mahlke , Daniel A. Connors , Kevin M. Crozier , Wen-mei W. Hwu, The program decision logic approach to predicated execution, ACM SIGARCH Computer Architecture News, v.27 n.2, p.208-219, May 1999
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
T. Tsuda , Y. Kunieda , P. Atipas, Automatic vectorization of character string manipulation and relational operations in Pascal, Proceedings of the 1989 ACM/IEEE conference on Supercomputing, p.187-196, November 12-17, 1989, Reno, Nevada, United States
|
|
|
|
|
|
David López , Mateo Valero , Josep Llosa , Eduard Ayguadé, Increasing memory bandwidth with wide buses: compiler, hardware and performance trade-offs, Proceedings of the 11th international conference on Supercomputing, p.12-19, July 07-11, 1997, Vienna, Austria
|
|
|
|
|
|
Jack L. Lo , Susan J. Eggers , Henry M. Levy , Sujay S. Parekh , Dean M. Tullsen, Tuning compiler optimizations for simultaneous multithreading, Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, p.114-124, December 01-03, 1997, Research Triangle Park, North Carolina, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
David I. August , Daniel A. Connors , Scott A. Mahlke , John W. Sias , Kevin M. Crozier , Ben-Chung Cheng , Patrick R. Eaton , Qudus B. Olaniran , Wen-mei W. Hwu, Integrated predicated and speculative execution in the IMPACT EPIC architecture, ACM SIGARCH Computer Architecture News, v.26 n.3, p.227-237, June 1998
|
|
|
Kazuaki Ishizaki , Mikio Takeuchi , Kiyokuni Kawachiya , Toshio Suganuma , Osamu Gohda , Tatsushi Inagaki , Akira Koseki , Kazunori Ogata , Motohiro Kawahito , Toshiaki Yasue , Takeshi Ogasawara , Tamiya Onodera , Hideaki Komatsu , Toshio Nakatani, Effectiveness of cross-platform optimizations for a java just-in-time compiler, ACM SIGPLAN Notices, v.38 n.11, November 2003
|
|
|
Michael Burke , Ron Cytron , Jeanne Ferrante , Wilson Hsieh , Vivek Sarkar , David Shields, Automatic discovery of parallelism: a tool and an experiment (extended abstract), ACM SIGPLAN Notices, v.23 n.9, p.77-84, Sept. 1988
|
|
|
Josep Llosa , Mateo Valero , Eduard Ayguadé , Antonio González, Hypernode reduction modulo scheduling, Proceedings of the 28th annual international symposium on Microarchitecture, p.350-360, November 29-December 01, 1995, Ann Arbor, Michigan, United States
|
|
|
|
|
|
|
|
|
|
|
|
B.-M. Hsieh , M. Hind , R. Cytron, Loop distribution with multiple exits, Proceedings of the 1992 ACM/IEEE conference on Supercomputing, p.204-213, November 16-20, 1992, Minneapolis, Minnesota, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
R. Govindarajan , Erik R. Altman , Guang R. Gao, Minimizing register requirements under resource-constrained rate-optimal software pipelining, Proceedings of the 27th annual international symposium on Microarchitecture, p.85-94, November 30-December 02, 1994, San Jose, California, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
David I. August , Wen-mei W. Hwu , Scott A. Mahlke, A framework for balancing control flow and predication, Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, p.92-103, December 01-03, 1997, Research Triangle Park, North Carolina, United States
|
|
|
|
|
|
|
|
|
Zhihong Tang , Gang Chen , Chihong Zhang , Yingwei Zhang , Bogong Su , Stanley Habib, GPMB—software pipelining branch-intensive loops, Proceedings of the 26th annual international symposium on Microarchitecture, p.21-30, December 01-03, 1993, Austin, Texas, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
David M. Gillies , Dz-ching Roy Ju , Richard Johnson , Michael Schlansker, Global predicate analysis and its application to register allocation, Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture, p.114-125, December 02-04, 1996, Paris, France
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Po-Yung Chang , Eric Hao , Yale N. Patt , Pohua P. Chang, Using predicated execution to improve the performance of a dynamically scheduled machine with speculative execution, Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques, p.99-108, June 27-29, 1995, Limassol, Cyprus
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Scott A. Mahlke , Richard E. Hank , Roger A. Bringmann , John C. Gyllenhaal , David M. Gallagher , Wen-mei W. Hwu, Characterizing the impact of predicated execution on branch prediction, Proceedings of the 27th annual international symposium on Microarchitecture, p.217-227, November 30-December 02, 1994, San Jose, California, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Arun Kejariwal , Alexander V. Veidenbaum , Alexandru Nicolau , Milind Girkar , Xinmin Tian , Hideki Saito, On the exploitation of loop-level parallelism in embedded applications, ACM Transactions on Embedded Computing Systems (TECS), v.8 n.2, p.1-34, January 2009
|
|
|
Hongbo Rong , Zhizhong Tang , R. Govindarajan , Alban Douillet , Guang R. Gao, Single-Dimension Software Pipelining for Multi-Dimensional Loops, Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization, p.163, March 20-24, 2004, Palo Alto, California
|
|
|
|
|
|
|
|
|
|
|
|
Alexandre Eichenberger , Alexandre E. Eichenberger , Waleed Meleis , Suman Maradani, An integrated approach to accelerate data and predicate computations in hyperblocks, Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture, p.101-111, December 2000, Monterey, California, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Mattan Erez , Jung Ho Ahn , Jayanth Gummaraju , Mendel Rosenblum , William J. Dally, Executing irregular scientific applications on stream architectures, Proceedings of the 21st annual international conference on Supercomputing, June 17-21, 2007, Seattle, Washington
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Hyesoon Kim , Onur Mutlu , Jared Stark , Yale N. Patt, Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution, Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture, p.43-54, November 12-16, 2005, Barcelona, Spain
|
|
|
|
|
|
Aaron Smith , Jon Gibson , Bertrand Maher , Nick Nethercote , Bill Yoder , Doug Burger , Kathryn S. McKinle , Jim Burrill, Compiling for EDGE Architectures, Proceedings of the International Symposium on Code Generation and Optimization, p.185-195, March 26-29, 2006
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Aaron Smith , Ramadass Nagarajan , Karthikeyan Sankaralingam , Robert McDonald , Doug Burger , Stephen W. Keckler , Kathryn S. McKinley, Dataflow Predication, Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, p.89-102, December 09-13, 2006
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
M. Hohenauer , F. Engel , R. Leupers , G. Ascheid , H. Meyr , Gerrit Bette , Balpreet Singh, Retargetable code optimization for predicated execution, Proceedings of the conference on Design, automation and test in Europe, March 10-14, 2008, Munich, Germany
|
|
|
B. Mei , B. Sutter , T. Aa , M. Wouters , A. Kanstein , S. Dupont, Implementation of a Coarse-Grained Reconfigurable Media Processor for AVC Decoder, Journal of Signal Processing Systems, v.51 n.3, p.225-243, June 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Alessandro Cevrero , Panagiotis Athanasopoulos , Hadi Parandeh-Afshar , Ajay K. Verma , Hosein Seyed Attarzadeh Niaki , Chrysostomos Nicopoulos , Frank K. Gurkaynak , Philip Brisk , Yusuf Leblebici , Paolo Ienne, Field Programmable Compressor Trees: Acceleration of Multi-Input Addition on FPGAs, ACM Transactions on Reconfigurable Technology and Systems (TRETS), v.2 n.2, p.1-36, June 2009
|
|
|
|
|
|
Weihua Zhang , Lili Liu , Chen Zhang , Hongjiong Zhang , Binyu Zang , Chuanqi Zhu, Optimizing techniques for saturated arithmetic with first-order linear recurrence, Proceedings of the 2009 ACM symposium on Applied Computing, March 08-12, 2009, Honolulu, Hawaii
|
|
|
|
|
|
|
|
|
Alexandru Nicolau , Guangqiang Li , Alexander V. Veidenbaum , Arun Kejariwal, Synchronization optimizations for efficient execution on multi-cores, Proceedings of the 23rd international conference on Supercomputing, June 08-12, 2009, Yorktown Heights, NY, USA
|
|
|
|
|