|
ABSTRACT
The notion of dependence captures that most important properties of a program for efficient execution on parallel computers. The dependence structure of a program defines that necessary constraints of the order of execution of the program components and provides sufficient information for the exploitation of the available parallelism. Static discovery and management of the dependence structure of programs save a tremendous amount of execution time, and dynamic utilization of dependence information results in a significant performance gain on parallel computers. However, experiments with parallel computers indicate that existing multiprocessing environments are unable to deliver the desired performance over a wide range of real applicatins, mainly due to lack of precision of their dependence information. This calls for an effective compilation scheme capable of understanding the dependence structure of complicated application programs. This article describes a methodology for capturing analyzing program properties that are essential in the effective detection and efficient exploitation of parallelism on parallel computers. Based on this methodology, a symbolic analysis framework is developed for the Parafrase-2 parallelizing compiler. This framework extends the scope of a variety of important program analysis problems and solves them in a unified way. The attained solution space of these problems is much larger than that handled by existing compiler technology. Such a powerful approach is required for the effective compilation of a large class of application programs. programs.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Alfred V. Aho , Ravi Sethi , Jeffrey D. Ullman, Compilers: principles, techniques, and tools, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1986
|
| |
2
|
Allen, F. E., Cocke, J., and Kennedy, K. 1981. Reduction of operator strength. In Program Flow Analysis, S. S. Muchnick and N. D. Jones, Eds. Prentice-Hall, Englewood Cli s, N.J., 79{101.
|
| |
3
|
|
 |
4
|
|
 |
5
|
|
| |
6
|
|
| |
7
|
|
| |
8
|
|
 |
9
|
|
| |
10
|
Berry, M., Chen, D., Koss, P., Kuck, D. J., Pointer, L., Lo, S., Pang, Y., Roloff, R., Sameh, A., Clementi, E., Chin, S., Schneider, D., Fox, G., Messina, P., Walker, D., Hsiung, C., Schwarzmeier, J., Lue, K., Orszag, S., Seidl, F., Johnson, O., Swanson, G., Goodrum, R., and Martin, J. 1989. The Perfect Club Benchmarks: E ective performance evaluation of supercomputers. Int. J. Supercomput. Appl. 3, 3 (Fall), 5{40.
|
| |
11
|
|
| |
12
|
D. Callahan , J. Dongarra , D. Levine, Vectorizing compilers: a test suite and results, Proceedings of the 1988 ACM/IEEE conference on Supercomputing, p.98-105, November 12-17, 1988, Orlando, Florida, United States
|
 |
13
|
|
| |
14
|
|
| |
15
|
Cheatham, T. E., Jr., Holloway, G. H., and Townley, J. A. 1979. Symbolic evaluation and the analysis of programs. IEEE Trans. Softw. Eng. 5, 4 (July), 402{417.
|
| |
16
|
Clarke, L. A. and Richardson, D. J. 1981. Symbolic evaluation methods for program analysis. In Program Flow Analysis, S. S. Muchnick and N. D. Jones, Eds. Prentice-Hall, Englewood Cli s, N.J., 264{300.
|
 |
17
|
|
 |
18
|
|
| |
19
|
Cooper, K. D., Hall, M. W., and Kennedy, K. 1992. Procedure cloning. In Proceedings of the 4th IEEE International Conference on Computer Languages. IEEE Computer Society, Washington, D.C., 96{105.
|
 |
20
|
|
| |
21
|
Cousot, P. and Cousot, R. 1992. Abstract interpretation frameworks. J. Logic Comput. 2, 4, 511{547.
|
 |
22
|
|
 |
23
|
|
| |
24
|
|
| |
25
|
|
 |
26
|
|
 |
27
|
|
| |
28
|
Eigenmann, R., Hoeflinger, J., Jaxon, G., Li, Z., and Padua, D. 1991. Restructuring Fortran programs for cedar. In Proceedings of the 1991 ICPP. Vol. I. The Pennsylvania State University Press, University Park, Pa., 57{66.
|
| |
29
|
Eisenberg, M. 1971. Axiomatic Theory of Sets and Classes. Holt, Rinehart and Winston, Inc., New York.
|
| |
30
|
|
| |
31
|
Feautrier, P. 1988. Parametric integer programming. RAIRO Recherche Op erationelle 22, 243{268.
|
 |
32
|
|
| |
33
|
|
 |
34
|
|
 |
35
|
|
| |
36
|
|
| |
37
|
|
| |
38
|
|
| |
39
|
|
| |
40
|
|
 |
41
|
Gina Goff , Ken Kennedy , Chau-Wen Tseng, Practical dependence testing, Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation, p.15-29, June 24-28, 1991, Toronto, Ontario, Canada
|
| |
42
|
Gosper, R. W., Jr. 1978. Decision procedure for inde nite hypergeometric summation. Proc. Nat. Acad. Sci. 75, 1 (Jan.), 40{42.
|
| |
43
|
|
| |
44
|
Haghighat, M. R. and Polychronopoulos, C. D. 1991. Symbolic dependence analysis for high-performance parallelizing compilers. In Advances in Languages and Compilers for Parallel Processing, A. Nicolau, D. Gelernter, T. Gross, and D. Padua, Eds. The MIT Press, Cambridge, Mass., 310{330.
|
| |
45
|
|
| |
46
|
|
| |
47
|
Hansen, E. R. 1992. Global Optimization Using Interval Analysis. Marcel Dekker, New York.
|
| |
48
|
|
| |
49
|
|
 |
50
|
|
| |
51
|
|
 |
52
|
|
| |
53
|
|
| |
54
|
|
| |
55
|
Jordan, C. 1965. Calculus of Finite Di erences, 3rd ed. Chelsea, New York.
|
 |
56
|
|
 |
57
|
|
| |
58
|
|
| |
59
|
|
| |
60
|
Kozen, D. 1981. Semantics of probabilistic programs. J. Comput. Syst. Sci. 22, 3 (June), 328{350.
|
| |
61
|
|
| |
62
|
Kuck and Associates 1988. KAP User's Guide. Kuck and Associates, Champaign, Ill.
|
| |
63
|
Kuck, D. J., Kuhn, R. H., Leasure, B., and Wolfe, M. J. 1980. The structure of an advanced vectorizer for pipelined processors. In Proceedings of the 4th International Computer Software and Applications Conference. IEEE Computer Society, Washington, D.C., 709{715.
|
 |
64
|
|
 |
65
|
Dror E. Maydan , John L. Hennessy , Monica S. Lam, Efficient and exact data dependence analysis, Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation, p.1-14, June 24-28, 1991, Toronto, Ontario, Canada
|
| |
66
|
Moore, R. E. 1966. Interval Analysis. Prentice-Hall, Englewood Cli s, N.J.
|
| |
67
|
|
 |
68
|
|
 |
69
|
|
| |
70
|
|
| |
71
|
|
| |
72
|
|
| |
73
|
Polychronopoulos, C. D., Girkar, M. B., Haghighat, M. R., Lee, C. L., Leung, B. P., and Schouten, D. A. 1989. Parafrase-2: An environment for parallelizing, partitioning, synchronizing and scheduling programs on multiprocessors. In Proceedings of the 1989 ICPP. Vol. II. The Pennsylvania State University Press, University Park, Pa., 39{48.
|
| |
74
|
|
 |
75
|
|
 |
76
|
|
| |
77
|
Reif, J. H. and Tarjan, R. E. 81. Symbolic program analysis in almost linear time. SIAM J. Comput. 11, 1 (Feb.), 81{93.
|
| |
78
|
Richardson, D. 1968. Some unsolvable problems involving elementary functions of a real variable. J. Symb. Logic 33, 511{520.
|
| |
79
|
|
| |
80
|
Schouten, D. A. 1990. An overview of interprocedural analysis techniques for high performance parallelizing compilers. M.S. thesis, Dept. of Computer Science, Univ. of Illinois at Urbana-Champaign, Urbana, Ill.
|
| |
81
|
Singh, J. P. and Hennessy, J. L. 1992. An empirical investigation of the e ectiveness and limitations of automatic parallelization. In Shared Memory Multiprocessing, N. Suzuki, Ed. The MIT Press, Cambridge, Mass., 213{240.
|
| |
82
|
Tang, P. and Yew, P.-C. 1986. Processor self-scheduling for multiple-nested parallel loops. In Proceedings of the 1986 ICPP. The Pennsylvania State University Press, University Park, Pa., 528{535.
|
 |
83
|
|
 |
84
|
Rémi Triolet , Francois Irigoin , Paul Feautrier, Direct parallelization of call statements, Proceedings of the 1986 SIGPLAN symposium on Compiler construction, p.176-185, June 25-27, 1986, Palo Alto, California, United States
|
| |
85
|
Tzen, T. H. and Ni, L. M. 1991. Dynamic loop scheduling for shared memory multiprocessors. In Proceedings of the 1991 ICPP. Vol. II. The Pennsylvania State University Press, University Park, Pa., 247{250.
|
 |
86
|
|
| |
87
|
Wegbreit, B. 1975b. Property extraction in well-founded property sets. IEEE Trans. Softw. Eng. 1, 3 (Sept.), 270{285.
|
| |
88
|
Wegbreit, B. 1976a. Goal-directed program transformation. IEEE Trans. Softw. Eng. 2, 2 (June), 69{80.
|
 |
89
|
|
 |
90
|
|
| |
91
|
|
| |
92
|
Wolfe, M. J. 1990. Flow graph anomalies: What's in a loop? Tech. Rep., Oregon Graduate Inst., Beaverton, Oreg.
|
 |
93
|
|
| |
94
|
Wolfe, M. J. 1993. Engineering a data dependence test. Concurrency Pract. Exper. 5, 7 (Oct.), 603{622.
|
CITED BY 28
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Robert A. van Engelen , J. Birch , Y. Shou , B. Walsh , Kyle A. Gallivan, A unified framework for nonlinear dependence testing and symbolic analysis, Proceedings of the 18th annual international conference on Supercomputing, June 26-July 01, 2004, Malo, France
|
|
|
|
|
|
|
|
|
Arun Kejariwal , Alexandru Nicolau , Utpal Banerjee , Constantine D. Polychronopoulos, A novel approach for partitioning iteration spaces with variable densities, Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, June 15-17, 2005, Chicago, IL, USA
|
|
|
|
|
|
Mary W. Hall , Jennifer M. Anderson , Saman P. Amarasinghe , Brian R. Murphy , Shih-Wei Liao , Edouard Bugnion , Monica S. Lam, Maximizing Multiprocessor Performance with the SUIF Compiler, Computer, v.29 n.12, p.84-89, December 1996
|
|
|
|
|
|
|
|
|
|
|
|
Arun Kejariwal , Alexandru Nicolau , Hideki Saito , Xinmin Tian , Milind Girkar , Utpal Banerjee , Constantine D. Polychronopoulos, A general approach for partitioning N-dimensional parallel nested loops with conditionals, Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures, July 30-August 02, 2006, Cambridge, Massachusetts, USA
|
|
|
|
|
|
|
|
|
J. Birch , R.A. van Engelen , K.A. Gallivan , Y. Shou, An empirical evaluation of chains of recurrences for array dependence testing, Proceedings of the 15th international conference on Parallel architectures and compilation techniques, September 16-20, 2006, Seattle, Washington, USA
|
|
|
Silvius Rus , Guobin He , Christophe Alias , Lawrence Rauchwerger, Region array SSA, Proceedings of the 15th international conference on Parallel architectures and compilation techniques, September 16-20, 2006, Seattle, Washington, USA
|
|
|
|
|
|
|
|
|
|
|
|
Arun Kejariwal , Alexandru Nicolau , Utpal Banerjee , Alexander V. Veidenbaum , Constantine D. Polychronopoulos, Cache-aware iteration space partitioning, Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming, February 20-23, 2008, Salt Lake City, UT, USA
|
|
|
|
|
|
|
|
|
Arun Kejariwal , Alexandru Nicolau , Utpal Banerjee , Alexander V. Veidenbaum , Constantine D. Polychronopoulos, Cache-aware partitioning of multi-dimensional iteration spaces, Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference, May 04-April 06, 2009, Haifa, Israel
|
|
|
Alexandru Nicolau , Guangqiang Li , Alexander V. Veidenbaum , Arun Kejariwal, Synchronization optimizations for efficient execution on multi-cores, Proceedings of the 23rd international conference on Supercomputing, June 08-12, 2009, Yorktown Heights, NY, USA
|
REVIEW
"Max Hailperin : Reviewer"
In this paper, Haghighat and Polychronopoulos perfectly exemplify
two of my maxims as a computer science educator: theory is an integral
part of practice, not a contrast to it, and a concrete example is often
the best way to convey an abstract
more...
|