|
ABSTRACT
In the last three decades a large number of compiler transformations for optimizing programs have been implemented. Most optimizations for uniprocessors reduce the number of instructions executed by the program using transformations based on the analysis of scalar quantities and data-flow techniques. In contrast, optimizations for high-performance superscalar, vector, and parallel processors maximize parallelism and memory locality with transformations that rely on tracking the properties of arrays using loop dependence analysis.This survey is a comprehensive overview of the important high-level program restructuring techniques for imperative languages, such as C and Fortran. Transformations for both sequential and various types of parallel architectures are covered in depth. We describe the purpose of each transformation, explain how to determine if it is legal, and give an example of its application.Programmers wishing to enhance the performance of their code can use this survey to improve their understanding of the optimizations that compilers can perform, or as a reference for techniques to be applied manually. Students can obtain an overview of optimizing compiler technology. Compiler writers can use this survey as a reference for most of the important optimizations developed to date, and as bibliographic reference for the details of each optimization. Readers are expected to be familiar with modern computer architecture and basic program compilation techniques.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
|
| |
3
|
ABU-SUFAH, W., KUCK, D. J., AND LAWRIE, I). 1981 On the performance enhancement of paging systems through program analysis and transformations IEEE Trans. Comput. C-30, 5 (May), 341-356.
|
| |
4
|
ACKERSLAXq, W B. 1982. Data flow languages. Computer 15, 2 (Feb.), 15-25.
|
 |
5
|
|
| |
6
|
Alfred V. Aho , Ravi Sethi , Jeffrey D. Ullman, Compilers: principles, techniques, and tools, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1986
|
 |
7
|
|
| |
8
|
|
| |
9
|
|
| |
10
|
ALLEN, F. E. 1969. Program optimization. In Annual Rewew ~n Automatic Programming 5. International Tracts in Computer Science and Technology and their Application, vol. 13. Pergamon Press, Oxford, England, 239-307.
|
| |
11
|
ALLEN, F. E. AND COCKE, J. 1971. A catalogue of optimizing transformations. In Design and Opt~m~zat~on of Compilers, R Rustin, Ed. Prentice-Hall, Englewood Cliffs, N.J., 1-30.
|
 |
12
|
|
 |
13
|
|
| |
14
|
|
 |
15
|
F. Allen , M. Burke , R. Cytron , J. Ferrante , W. Hsieh, A framework for determining useful parallelism, Proceedings of the 2nd international conference on Supercomputing, p.207-215, June 1988, St. Malo, France
[doi> 10.1145/55364.55385]
|
| |
16
|
ALLEN, F. E., COCKE, J., AND KENNEDY, K. 1981. Reduction of operator ~trength. In Program Flow Analysis: Theory and Apphcations, S. S. Muchnik and N. D. Jones, Eds., Prentice-Hall, Englewood Cliffs, N.J., 79-101.
|
 |
17
|
J. R. Allen , Ken Kennedy , Carrie Porterfield , Joe Warren, Conversion of control dependence to data dependence, Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, p.177-189, January 24-26, 1983, Austin, Texas
[doi> 10.1145/567067.567085]
|
| |
18
|
|
| |
19
|
AMERICAN NATIONAL STANDARDS INSTITUTE. 1987. An American National standard, IEEE standard for binary floating-point arithmetic SIGPLAN Not. 22, 2 (Feb.), 9-25.
|
 |
20
|
|
 |
21
|
|
| |
22
|
|
 |
23
|
|
| |
24
|
|
| |
25
|
ARWND, KATHAIL, V., AND PINGALI, K. 1980. A dataflow architecture with tagged tokens. Tech. Rep TM-174, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Mass.
|
| |
26
|
David F. Bacon , Jyh-Herng Chow , Dz-ching R. Ju , Kalyan Muthukumar , Vivek Sarkar, A compiler framework for restructuring data declarations to enhance cache and TLB effectiveness, Proceedings of the 1994 conference of the Centre for Advanced Studies on Collaborative research, p.3, October 31-November 03, 1994, Toronto, Ontario, Canada
|
| |
27
|
BAILEY, D. H. 1992. On the behavior of cache memories with strided data access. Tech. Rep. RNR- 92-015, NASA Ames Research Center, Moffett Field, Calif., (May).
|
| |
28
|
BAILEY, D. H., BARszcz, E., BARTON, J. T., BROWNING, D. S., CARTER, R. L., DAGUM, L., FATOOHI, R. A., FREDERICKSON, P. 0., LASINSKI, T. A., SCHREIBER, R. S., SIMON, H. D., VENKATAKRISHNAN, V., AND WEERATUNGA, S. K. 1991. The NAS parallel benchmarks. Int. J. Supercomput. Appl. 5, 3 (Fall), 63-73.
|
| |
29
|
|
| |
30
|
BALASUNDARAM, V., Fox, G., KENNEDY, K., AND KREMER, V. 1990. An interactive environment for data partitioning and distribution. In Proceedtngs of the 5th D~strzbuted Memory Computer Conference (Charleston, South Carolina, Apr.). IEEE Computer Society Press, Los Alamitos, Calif.
|
 |
31
|
V. Balasundaram , K. Kennedy , U. Kremer , K. McKinley , J. Subhlok, The parascope editor: an interactive parallel programming tool, Proceedings of the 1989 ACM/IEEE conference on Supercomputing, p.540-550, November 12-17, 1989, Reno, Nevada, United States
[doi> 10.1145/76263.76323]
|
 |
32
|
|
| |
33
|
B~NE~EE, U. 1991. Unimodular transformations of double loops. In Advances in Languages and Compilers for Parallel Processing, A. Nicolau, Ed. Research Monographs in Parallel and Distributed Computing, MIT Press, Cambridge, Mass., Chap. 10.
|
| |
34
|
|
| |
35
|
BnNERJEE, U. 1988b. An introduction to a formal theory of dependence analysis. J. Supercomput. 2, 2 (Oct.), 133-149.
|
| |
36
|
|
| |
37
|
BANERJEE~ U., CHEN, S. C., KUCK, D. J., AND TOWLE, R. A. 1979. Time and parallel processor bounds for Fortran-like loops. IEEE Trans. Comput. C-28, 9 (Sept.), 660-670.
|
 |
38
|
|
| |
39
|
|
| |
40
|
BERRY, M., CHEN, D., Koss, P., KUCK, D., Lo, S., PANG, Y., POINTER, L., ROLOFF, R., SAMEH, A., CLEMENTI, E_. CHIN. S.. SCHNEIDER. D.. FOX. FT., MESSINA, P., WALKER, D., HSIUNG, C., SCHWARZMEIER, J., LUE, K., ORSZAG, S., SEIDL, F., JOHNSON, O., GOODRUM, R., AND MARTIN, J. 1989. The Perfect Club benchmarks: Effective performance evaluation of supercomputers. Int. J. Supercomput. Appl. 3, 3 (Fall), 5-40.
|
| |
41
|
|
 |
42
|
|
 |
43
|
|
| |
44
|
BURNETT, G. J. AND COFFMAN, E. G., JR. 1970. A study of interleaved memory systems. In Proceedings of the Spring Joint AFIPS Computer Conference. vol. 36. AFIPS, Montvale, N.J., 467-474.
|
 |
45
|
|
 |
46
|
Michael Butler , Tse-Yu Yeh , Yale Patt , Mitch Alsup , Hunter Scales , Michael Shebanow, Single instruction stream parallelism is greater than two, ACM SIGARCH Computer Architecture News, v.19 n.3, p.276-286, May 1991
|
| |
47
|
|
| |
48
|
CALLAHAN, D. AND KENNEDY, K. 1988b. Compiling programs for distributed-memory multiprocessots. J. Supercomput. 2, 2 (Oct.), 151-169.
|
 |
49
|
|
| |
50
|
|
 |
51
|
|
| |
52
|
|
 |
53
|
|
| |
54
|
CHAITIN, G. J., AUSLANDER, M. A., CHANDRA, A. K., COOKE, J., HOPKINS, M. E., AND MARKSTEIN, P. W. 1981. Register allocation via coloring. Comput. Lang. 6, 1 (Jan.), 47-57.
|
 |
55
|
|
| |
56
|
|
 |
57
|
Siddhartha Chatterjee , John R. Gilbert , Robert Schreiber , Shang-Hua Teng, Automatic array alignment in data-parallel programs, Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, p.16-28, March 1993, Charleston, South Carolina, United States
[doi> 10.1145/158511.158517]
|
| |
58
|
CHEN, S. C. AND KUCK, D. J. 1975. Time and parallel processor bounds for linear recurrence systems. IEEE Trans. Comput. C-24 7 (July), 701-717.
|
 |
59
|
|
 |
60
|
|
 |
61
|
|
| |
62
|
|
 |
63
|
|
 |
64
|
|
| |
65
|
COCKE, J. AND MARKSTEIN, P. 1980. Measurement of program improvement algorithms. In Proceedrags of the IFIP Congress (Tokyo, Japan, Oct.). North-Holland, Amsterdam, Netherlands, 221-228. Also available as IBM Tech. Rep. RC 8111, Feb. 1980.
|
| |
66
|
|
| |
67
|
|
 |
68
|
|
| |
69
|
COOPER, K. D., HALL, M. W., AND KENNEDY, K. 1993. A methodology for procedure cloning. Comput. Lang. 19, 2 (Apr.), 105-117.
|
 |
70
|
|
| |
71
|
|
| |
72
|
CRAY RESEARCH 1988 CFT77 Reference Manual. Publication SR-0018-C. Cray Research, Inc., Eagan, Minn.
|
| |
73
|
CYTRON, R. 1986. Doacross: Beyond vectorization for multiprocessors. In Proceedings of the Internatwnal Conference on Parallel Processing (St. Charles, Ill., Aug.). IEEE Computer Society, Washington, D.C., 836-844.
|
| |
74
|
CYTRON, R. AND FERRANTE, J. 1987. What's in a name? -or- the value of renaming for parallelism detection and storage allocation. Proceedings of the Internatwnal Conference on Parallel Processing (University Park, Penn., Aug.). Pennsylvania State University Press, University Park, Pa., 19-27.
|
| |
75
|
DANTZIG, G. B. AND EAVES, B. C. 1974. Fourier- Motzkin ehmination and sits dual with application to integer programming, in Combinatorial Programming: Methods and Applications (Versailles, France, Sept.). B. D. Roy, Ed. D. Reidel, Boston, Mass., 93-102.
|
| |
76
|
DENNIS, J. B. 1980 Data flow supercomputers_ Computer 13, 11 (Nov.), 48-56.
|
| |
77
|
|
| |
78
|
DONGARRA, J. AND HIND, A. R. 1979. Unrolling loops in Fortran. Softw. Pratt. Exper. 9, 3 (Mar.), 219-226.
|
 |
79
|
|
| |
80
|
|
| |
81
|
|
 |
82
|
|
| |
83
|
FEAUTRIER, P. 1991. Dataflow analysis of array and scalar references. Int. J. Parall. Program. 20, 1 (Feb.), 23-52.
|
 |
84
|
|
| |
85
|
FERRARI, D. 1976. The improvement of program behavior. Computer 9, 11 (Nov.), 39-47.
|
| |
86
|
FISHER, J.A. 1981. Trace scheduling: A technique for global microcode compaction. IEEE Trans. Comput. C-30, 7 (July), 478-490.
|
 |
87
|
Joseph A. Fisher , John R. Ellis , John C. Ruttenberg , Alexandru Nicolau, Parallel processing: a smart compiler and a dumb machine, Proceedings of the 1984 SIGPLAN symposium on Compiler construction, p.37-47, June 17-22, 1984, Montreal, Canada
|
| |
88
|
FLOATING POINT SYSTEMS. 1979. FPS AP-120B Processor Handbook. Floating Point Systems, Inc., Beaverton, Ore.
|
| |
89
|
FREE SOFTWARE FOUNDATION. 1992. gCC 2.X Reference Manual. Free Software Foundation, Cambridge, Mass.
|
 |
90
|
|
| |
91
|
|
| |
92
|
|
| |
93
|
|
| |
94
|
|
 |
95
|
|
 |
96
|
|
 |
97
|
|
| |
98
|
|
| |
99
|
GUPTA, M., MIDKIFF, S., SCHONBERG, E., SWEENEY, P., WANG, K. Y., AND BURKE, M. 1993. PTRAN II: A compiler for High Performance Fortran. In Proceedings of the 4th International Workshop on Compilers for Parallel Computers (Delft, Netherlands, Dec.). 479-493.
|
 |
100
|
Mary W. Hall , Ken Kennedy , Kathryn S. McKinley, Interprocedural transformations for parallel code generation, Proceedings of the 1991 ACM/IEEE conference on Supercomputing, p.424-434, November 18-22, 1991, Albuquerque, New Mexico, United States
[doi> 10.1145/125826.126055]
|
| |
101
|
HARRIS, K. AND HOBBS, S. 1994. VAX Fortran. In Optimization in Compilers, F. E. Allen, R. Cytron, B. K. Rosen, and K. Zadeck, Eds. ACM Press, New York, chap. 16. To be published.
|
| |
102
|
HATFIELD, D. J. AND GERALD, j. 1971. Program restructuring for virtual memory. IBM Syst. J. 10, 3, 168-192.
|
| |
103
|
|
| |
104
|
HEWLETT-PACKARD. 1992. PA-RISC 1.1 Architecture and Instruction Manual. 2nd ed. Part Number 09740-90039. Hewlett-Packard, Palo Alto, Ca}i~
|
| |
105
|
HIGH PERFORMANCE FORTRAN FORUM. 1993. High Performance Fortran language specification, version 1.0. Tech. Rep. CRPC-TR92225, Rice University, Houston, Tex.
|
 |
106
|
|
 |
107
|
|
| |
108
|
|
 |
109
|
|
| |
110
|
Wen-Mei W. Hwu , Scott A. Mahlke , William Y. Chen , Pohua P. Chang , Nancy J. Warter , Roger A. Bringmann , Roland G. Ouellette , Richard E. Hank , Tokuzo Kiyohara , Grant E. Haab , John G. Holm , Daniel M. Lavery, The superblock: an effective technique for VLIW and superscalar compilation, The Journal of Supercomputing, v.7 n.1-2, p.229-248, May 1993
[doi> 10.1007/BF01205185]
|
| |
111
|
IBM. 1992. Optimization and Tuning Gutde for the XL Fortran and XL C Compilers. 1st ed. Publication SC09-1545-00, IBM, Armonk, N.Y.
|
| |
112
|
IBM. 1991. IBM RISC System~6000 NIC Tuning Guide for Fortran and C Publication GG24- 3611-01, IBM, Armonk, N.Y.
|
 |
113
|
|
| |
114
|
|
| |
115
|
|
| |
116
|
KENNEDY, K., MCKINLEY, K. S., AND TSENG, C.-W. 1993. Analysis and transformation in an interactive parallel programming tool. Concurrency Pract. Exper 5, 7 (Oct.), 575-602.
|
 |
117
|
|
| |
118
|
|
| |
119
|
|
| |
120
|
KNOBE, K., LUKAS, J D., AND STEELE, G. L., JR 1988. Massively parallel data optimization. In The 2nd. Symposium on the Frontiers of Massively Parallel Computation (Fairfax, Va., Oct.). IEEE, Piscataway, N.J., 551-558.
|
| |
121
|
KNUTH, D. E. 1971. An empirical study of Fortran programs. Soflw. Pract. Exper. 1, 2 (Apr.-June), 105-133.
|
| |
122
|
|
| |
123
|
|
 |
124
|
David Kranz , Norman Adams , Richard Kelsey , Jonathan Rees , Paul Hudak , James Philbin, ORBIT: an optimizing compiler for scheme, ACM SIGPLAN Notices, v.21 n.7, p.219-233, July 1986
|
| |
125
|
|
 |
126
|
|
| |
127
|
KUCK, D. J. AND STOKES, R. 1982. The Burroughs Scientific Processor (BSP). IEEE Trans. Comput. C-31, 5 (May), 363-376.
|
 |
128
|
D. J. Kuck , R. H. Kuhn , D. A. Padua , B. Leasure , M. Wolfe, Dependence graphs and compiler optimizations, Proceedings of the 8th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, p.207-218, January 26-28, 1981, Williamsburg, Virginia
[doi> 10.1145/567532.567555]
|
 |
129
|
|
 |
130
|
|
 |
131
|
Monica D. Lam , Edward E. Rothberg , Michael E. Wolf, The cache performance and optimizations of blocked algorithms, Proceedings of the fourth international conference on Architectural support for programming languages and operating systems, p.63-74, April 08-11, 1991, Santa Clara, California, United States
|
 |
132
|
|
 |
133
|
|
| |
134
|
|
| |
135
|
|
 |
136
|
|
| |
137
|
|
| |
138
|
LI, J. AND CHEN, M. 1990. Index domain alignment: Minimizing cost of cross-referencing between distributed arrays. In The 3rd Symposium on the Frontiers of Massively Parallel Computation, J. Jaja, Ed. IEEE Computer Society Press, Los Alamitos, Calif., 424-433.
|
| |
139
|
LI, Z. AND YEW, P. 1988. Interprocedura} analysis for parallel computing. In Proceedings of the International Conference on Parallel Processing. F. A. Briggs, Ed. Vol. 2. Pennsylvania State University Press, University Park, Pa., 221-228.
|
| |
140
|
|
 |
141
|
|
 |
142
|
|
| |
143
|
|
| |
144
|
MACE, M. E. AND WAGNER, R. A. 1985. Globally optimum selection of memory storage patterns. In Proceedings of the International Conference on Parallel Processing, D. DeGrott, Ed. IEEE Computer Society, Washington, D.C., 264-271.
|
| |
145
|
MARKSTEIN, P., MARKSTEIN, V., AND ZADECK, K. 1994. Strength reduction. In Optimizatmn in Compilers. ACM Press, New York, Chap. 9. To be published.
|
 |
146
|
|
 |
147
|
|
 |
148
|
|
 |
149
|
|
| |
150
|
McGRAw, J. R. 1985. SISAL: Streams and iteration in a single assignment language. Tech. Rep. M-146, Lawrence Livermore National Laboratory, Livermore, Calif.
|
| |
151
|
MCMAHON, F. M. 1986. The Livermore Fortran kernels: A computer test of numerical performance range. Tech. Rep. UCRL-55745, Lawrence Livermore National Laboratory, Livermore, Calif.
|
| |
152
|
MICHIE, D. 1968. "Memo" functions and machine learning. Nature 218, 19-22.
|
 |
153
|
|
 |
154
|
R. Mirchandaney , J. H. Saltz , R. M. Smith , D. M. Nico , K. Crowley, Principles of runtime support for parallel processors, Proceedings of the 2nd international conference on Supercomputing, p.140-152, June 1988, St. Malo, France
[doi> 10.1145/55364.55378]
|
 |
155
|
|
| |
156
|
|
| |
157
|
MUm~OKA, Y. 1971. Parallelism exposure and exploitation in programs. Ph.D. thesis, Tech. Rep. 71-424, Univ. of Illinois at Urbana-Champaign.
|
 |
158
|
|
| |
159
|
|
| |
160
|
NICOLAU, A. AND FISHER, J. A. 1984. Measuring the parallelism available for very long instruction word architectures. IEEE Trans. Comput. C-33, 11 (Nov.), 968-976.
|
| |
161
|
NIKHIL, R. S. 1988. ID reference manual, version 88.0. Tech. Rep. 284, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Mass.
|
 |
162
|
|
| |
163
|
O'BRIEN, K., HAY, B., MINISH, J., SCHAFFER, H., SCHLOSS, B., SHEPHERD, A., AND ZALESKI, M. 1990. Advanced compiler technology for the RISC System/6000 architecture. In IBM RISC System/6000 Technology. Publication SA23- 2619. IBM Corporation, Mechanlcsburg, Penn
|
| |
164
|
|
| |
165
|
|
| |
166
|
PADUA, D. A. AND PETERSEN, P. M. 1992. Evaluation of paralleIizing compilers. Parallel Computing and Transputer Applications, (Barcelona, Spain. Sept.). CIMNE, 1505-1514. Also available as Center for Supercomputing Research and Development Tech Rep. 1173.
|
 |
167
|
|
| |
168
|
PADUA, D. A., KUCK, D. j. AND LAWRIE, D. 1980. High-speed multiprocessors and compilation techniques. IEEE Trans. Comput. C-29, 9 (Sept.), 763-776.
|
 |
169
|
|
 |
170
|
|
| |
171
|
|
| |
172
|
|
| |
173
|
POLYCHRONOPOULOS, C. D. 1987b. Loop coalescing: A compiler transformation for parallel machines. In Proceedings of the International Conference on Parallel Processing (University Park, Penn., Aug.). Pennsylvania State University Press, University Park, Pa., 235-242.
|
| |
174
|
|
| |
175
|
POLYCHRONOPOULOS, C. D., GIRKAR, M., HAGHIGHAT, M. R., LEE, C. L., LEUNG, B., AND SCHOUTEN, D. 1989. Parafrase-2: An environment for parallelizing, partitioning, synchronizing, and scheduling programs on multiprocessors. In Proceedings of the International Conference on Parallel Processing. Volume 2. Pennsylvania State University Press, University Park, Pa., 39-48.
|
 |
176
|
|
 |
177
|
|
 |
178
|
|
| |
179
|
|
| |
180
|
B. Ramakrishna Rau , David W. L. Yen , Wei Yen , Ross A. Towie, The Cydra 5 Departmental Supercomputer: Design Philosophies, Decisions, and Trade-Offs, Computer, v.22 n.1, p.12-26, 28-30, 32-35, January 1989
[doi> 10.1109/2.19820]
|
 |
181
|
|
| |
182
|
RISEMAN, E. M. AND FOSTER, C. C. 1972. The inhibition of potential parallelism by conditional jumps. IEEE Trans. Comput. C-21~ 12 (Dec.), 1405-1411.
|
| |
183
|
|
 |
184
|
|
 |
185
|
|
| |
186
|
|
 |
187
|
|
 |
188
|
|
| |
189
|
|
| |
190
|
SINGH, J. P. AND HENNESSY, J. L. 1991. An empirical investigation of the effectiveness and limitations of automatic parallelization. In Proceedings of the International Symposium on Shared Memory Multiprocessing, (Apr.). 213-240.
|
 |
191
|
|
| |
192
|
|
| |
193
|
|
 |
194
|
|
 |
195
|
|
| |
196
|
|
| |
197
|
SUN MICROSYSTEMS. 1991. SPARC Architecture Manual, Version 8. Part No. 800-1399-08. Sun Microsystems, Mountain View, Calif.
|
 |
198
|
|
| |
199
|
|
| |
200
|
THINKING MACHINES. 1989. Connection Machine, Model CM-2 Technical Summary. Thinking Machines Corp., Cambridge, Mass.
|
| |
201
|
TJADEN, G. S. AND FLYNN, M. J. 1970. Detection and parallel execution of parallel instructions. IEEE Trans. Comput. C-19, 10 (Oct.), 889-895.
|
| |
202
|
|
| |
203
|
|
 |
204
|
|
| |
205
|
|
| |
206
|
|
 |
207
|
|
 |
208
|
|
| |
209
|
|
| |
210
|
WANG, Ko AND GANNON, D. 1989. Applying AI techniques to program optimization for parallel computers, in Parallel Processing for Supercomputers and Artificial Intelligence, K. Hwang and D. Degroot, Eds. McGraw Hill, New York, Chap. 12.
|
 |
211
|
|
 |
212
|
|
 |
213
|
|
 |
214
|
|
| |
215
|
|
 |
216
|
|
| |
217
|
|
| |
218
|
|
| |
219
|
|
| |
220
|
ZIMA, H. P, BAST, H. J, AND GERNDT, M. 1988. SUPERB. A tool for semi-automatic S1MD/MIMD parallelization Parall. Comput 6, 1 (Jan.J, 1-18.
|
CITED BY 118
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Inki Hong , Darko Kirovski , Miodrag Potkonjak, Potential-driven statistical ordering of transformations, Proceedings of the 34th annual conference on Design automation, p.347-352, June 09-13, 1997, Anaheim, California, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Wray Buntine , Bernd Fischer , Thomas Pressburger, Towards automated synthesis of data mining programs, Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, p.372-376, August 15-18, 1999, San Diego, California, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Christian Collberg , Clark Thomborson , Douglas Low, Manufacturing cheap, resilient, and stealthy opaque constructs, Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, p.184-196, January 19-21, 1998, San Diego, California, United States
|
|
|
|
|
|
|
|
|
|
|
|
Björn Franke , Michael O'Boyle, An empirical evaluation of high level transformations for embedded processors, Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems, November 16-17, 2001, Atlanta, Georgia, USA
|
|
|
Lisa Guerra , Miodrag Potkonjak , Jan Rabaey, A methodology for guided behavioral-level optimization, Proceedings of the 35th annual conference on Design automation, p.309-314, June 15-19, 1998, San Francisco, California, United States
|
|
|
|
|
|
Christian Weiß , Wolfgang Karl , Markus Kowarschik , Ulrich Rüde, Memory characteristics of iterative methods, Proceedings of the 1999 ACM/IEEE conference on Supercomputing (CDROM), p.31-es, November 14-19, 1999, Portland, Oregon, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Clifford Liem , Pierre Paulin , Ahmed Jerraya, Address calculation for retargetable compilation and exploration of instruction-set architectures, Proceedings of the 33rd annual conference on Design automation, p.597-600, June 03-07, 1996, Las Vegas, Nevada, United States
|
|
|
|
|
|
Markus Lorenz , Peter Marwedel , Thorsten Dräger , Gerhard Fettweis , Rainer Leupers, Compiler based exploration of DSP energy savings by SIMD operations, Proceedings of the 2004 conference on Asia South Pacific design automation: electronic design and solution fair, p.838-841, January 27-30, 2004, Yokohama, Japan
|
|
|
J. H. Moreno , V. Zyuban , U. Shvadron , F. D. Neeser , J. H. Derby , M. S. Ware , K. Kailas , A. Zaks , A. Geva , S. Ben-David , S. W. Asaad , T. W. Fox , D. Littrell , M. Biberstein , D. Naishlos , H. Hunter, An innovative low-power high-performance programmable signal processor for digital communications, IBM Journal of Research and Development, v.47 n.2-3, p.299-326, March 2003
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Gang-Ryung Uh , Yuhong Wang , David Whalley , Sanjay Jinturkar , Yunheung Paek , Vincent Cao , Chris Burns, Compiler transformations for effectively exploiting a zero overhead loop buffer, Software—Practice & Experience, v.35 n.4, p.393-412, 10 April 2005
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Nikolaos Kroupis , Nikolaos Zervas , Minas Dasygenis , Konstantinos Tatas , Antonios Argyriou , Dimitrios Soudris , Antonios Thanailakis, Behavioral-Level Performance and Power Exploration of Data-Intensive Applications Mapped on Programmable Processors, Journal of VLSI Signal Processing Systems, v.44 n.1-2, p.153-171, August 2006
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Doosan Cho , Ilya Issenin , Nikil Dutt , Jonghee W. Yoon , Yunheung Paek, Software controlled memory layout reorganization for irregular array access patterns, Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems, September 30-October 03, 2007, Salzburg, Austria
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Paul Lokuciejewski , Heiko Falk , Peter Marwedel , Henrik Theiling, WCET-driven, code-size critical procedure cloning, Proceedings of the 11th international workshop on Software & compilers for embedded systems, March 13-14, 2008, Munich, Germany
|
|
|
Paul Lokuciejewski , Heiko Falk , Martin Schwarzer , Peter Marwedel , Henrik Theiling, Influence of procedure cloning on WCET prediction, Proceedings of the 5th IEEE/ACM international conference on Hardware/software codesign and system synthesis, September 30-October 03, 2007, Salzburg, Austria
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
REVIEW
"Max Hailperin : Reviewer"
This is a comprehensive survey of high-level optimizing transformations used in compilers for superscalar, vector, and multiprocessor machines. The emphasis is on transformations endemic to these architectures; supporting analyses and more classic
more...
|