|
ABSTRACT
An aggregate array computation is a loop that computes accumulated quantities over array elements. Such computations are common in programs that use arrays, and the array elements involved in such computations often overlap, especially across iterations of loops, resulting in significant redundancy in the overall computations. This article presents a method and algorithms that eliminate such overlapping aggregate array redundancies and shows analytical and experimental performance improvements. The method is based on incrementalization, that is, updating the values of aggregate array computations from iteration to iteration rather than computing them from scratch in each iteration. This involves maintaining additional values not maintained in the original program. We reduce various analysis problems to solving inequality constraints on loop variables and array subscripts, and we apply results from work on array data dependence analysis. For aggregate array computations that have significant redundancy, incrementalization produces drastic speedup compared to previous optimizations; when there is little redundancy, the benefit might be offset by cache effects and other factors. Previous methods for loop optimizations of arrays do not perform incrementalization, and previous techniques for loop incrementalization do not handle arrays.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
 |
2
|
|
| |
3
|
Allen, F. E. and Cocke, J.1971. A catalogue of optimizing transformations. In Design and Optimization of Compilers. R. Rustin, Ed. Prentice-Hall, Englewood Cliffs, N.J. pp. 1--30.
|
| |
4
|
|
| |
5
|
Banerjee, U.1990. Unimodular transformations of double loops. In Proceedings of the Workshop on Advances in Languages and Compilers for Parallel Processing (Aug.), pp. 192--219.
|
| |
6
|
|
 |
7
|
|
| |
8
|
Booth, R. 1997. Inner Loops. Addison-Wesley Developers Press, Reading, Mass.
|
 |
9
|
Mark Bromley , Steven Heller , Tim McNerney , Guy L. Steele, Jr., Fortran at ten gigaflops: the connection machine convolution compiler, Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation, p.145-156, June 24-28, 1991, Toronto, Ontario, Canada
|
| |
10
|
Broy, M.1984. Algebraic methods for program construction: The project CIP. In Program Transformation and Programming Environments, P. Pepper, Ed., Springer-Verlag, Berlin, Germany. pp. 199--222.
|
 |
11
|
|
| |
12
|
|
| |
13
|
|
 |
14
|
Siddhartha Chatterjee , Erin Parker , Philip J. Hanlon , Alvin R. Lebeck, Exact analysis of the cache behavior of nested loops, Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation, p.286-297, June 2001, Snowbird, Utah, United States
|
 |
15
|
|
 |
16
|
|
 |
17
|
|
| |
18
|
|
| |
19
|
Earley, J.1976. High level iterators and a method for automatically designing data structure representation. J. Comput. Lang. 1, 321--342.
|
| |
20
|
Ernst, M. D.1992. Serializing parallel programs by removing redundant computation. Master's thesis. MIT, Cambridge, Mass. (Revised August 1994.)
|
| |
21
|
|
| |
22
|
Feautrier, P.1988. Parametric integer programming. Operationnelle/Oper. Res. 22, 3 (Sept.), 243--268.
|
| |
23
|
Feautrier, P.1991. Dataflow analysis of array and scalar references. Int. J. Parall. Prog. 20, 1 (Feb.).
|
| |
24
|
Fisher, A. L. and Highnam, P. T.1988. Communication and code optimization in SIMD programs. In International Conference on Parallel Processing (Aug.).
|
| |
25
|
Fisher, A. L., Leon, J., and Highnam, P. T.1990. Design and performance of an optimizing SIMD compiler. In Frontiers of Massively Parallel Computation.
|
 |
26
|
|
 |
27
|
|
| |
28
|
|
| |
29
|
Garg, V. K. and Mitchell, J. R.1996. An efficient algorithm for detecting conjunctions of general global predicates. Tech. Rep. TR-PDS-1996-005. University of Texas at Austin, Austin, Tex.
|
 |
30
|
|
| |
31
|
Goldstine, H. H.1972. Charles Babbage and his analytical engine. In The Computer from Pascal to von Neumann, Princeton University Press, Princeton, N.J., Chap. 2, 10--26.
|
| |
32
|
|
| |
33
|
Grau, A. A., Hill, U., and Langmaac, H.1967. Translation of ALGOL 60. Volume 1 of Handbook for Automatic Computation. Springer, Berlin, Germany.
|
| |
34
|
Gries, D.1984. A note on a standard strategy for developing loop invariants and loops. Sci. Comput. Program., 2, 207--214.
|
 |
35
|
|
| |
36
|
|
| |
37
|
|
| |
38
|
Katz, S.1978. Program optimization using invariants. IEEE Trans. Softw. Eng. SE-4, 5 (Nov.), 378--389.
|
| |
39
|
|
| |
40
|
Knuth, D. E.1971. An empirical study of FORTRAN programs. Softw. Pract. Exp. 1, 2, 105--133.
|
| |
41
|
Li, K.1997. Interferometric strain/slope rosette for static and dynamic measurements. Exper. Mech. 37, 2, 111--118.
|
| |
42
|
|
| |
43
|
|
 |
44
|
|
| |
45
|
|
 |
46
|
|
| |
47
|
|
| |
48
|
|
 |
49
|
|
| |
50
|
|
 |
51
|
|
| |
52
|
Proceedings of the ACM SIGPLAN '96 Conference on Programming Language Design and Implementation. ACM, New York, 1996.
|
 |
53
|
|
 |
54
|
|
 |
55
|
|
| |
56
|
|
 |
57
|
|
| |
58
|
|
| |
59
|
|
 |
60
|
|
 |
61
|
|
| |
62
|
|
| |
63
|
Wegbreit, B.1976. Goal-directed program transformation. IEEE Trans. Softw. Eng. SE-2, 2 (June), 69--80.
|
| |
64
|
|
 |
65
|
|
| |
66
|
|
| |
67
|
|
|