ABSTRACT
Compilers for vector or multiprocessor computers must have certain optimization features to successfully generate parallel code.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
1. Abu-Sufah, W., Kuck, D.J., and Lawrie, D.H. On the performance enhancement of paging systems through program analysis and transformations. IEEE Trans. Comput. C-30, 5 (May 1981), 341-356.
2. Aho, A.V., Sethi, R., and Ullman, J.D. Compilers: Principles, Techniques, and Tools. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1986.
3. Allen, F.E., and Cocke, J. A catalogue of optimizing transformations. In Design and Optimization of Compilers, R. Rustin, Ed. Prentice-Hall, Englewood Cliffs, N.J., 1972, pp. 1-30.
4. Allen, F.E., Carter, J.L., Fabri, J., Ferrante, J., Harrison, W.H., Loewner, P.G., and Trevillyan, L.H. The experimental compiling system. IBM J. Res. Dev. 24, 6 (Nov. 1980), 695-715.
5. Allen, J.R., and Kennedy, K. PFC: A program to convert Fortran to parallel form. Rep. MASC-TR82-6, Rice Univ., Houston, Tex., Mar. 1982.
7. American National Standards Institute. American National Standard for Information Systems. Programming Language Fortran S8 (X3.9-198x). Revision of X3.9-1978. Draft S8, Version 99. American National Standards Institute, New York, Apr. 1986.
9. Banerjee, U. Direct parallelization of call statements - A review. Rep. 576, Center for Supercomputing Research and Development, Univ. of Illinois at Urbana-Champaign, Nov. 1985.
10. Banerjee, U., Chen, S.C., Kuck, D.J., and Towle, R.A. Time and parallel processor bounds for Fortran-like loops. IEEE Trans. Comput. C-28, 9 (Sept. 1979), 660-670.
11. Brode, B. Precompilation of Fortran programs to facilitate array processing. Computer 14, 9 (Sept. 1981), 46-51.
13. Burroughs Corp. Numerical Aerodynamic Simulation Facility Feasibility Study. Burroughs Corp., Paoli, Pa., Mar. 1979.
15. Chen, S.C. Large-scale and high-speed multiprocessor system for scientific applications: Cray X-MP Series. In High-Speed Computation, NATO ASI Series, vol. F7, J.S. Kowalik, Ed. Springer-Verlag, New York, 1984, pp. 59-67.
16. Cytron, R.G. Doacross: Beyond vectorization for multiprocessors. In Proceedings of the 1986 International Conference on Parallel Processing (St. Charles, Ill., Aug. 19-22). IEEE Press, New York, 1986, pp. 836-844.
17. Davies, J., Huson, C., Macke, T., Leasure, B., and Wolfe, M. The KAP/S-1: An advanced source-to-source vectorizer for the S-1 Mark IIa supercomputer. In Proceedings of the 1986 International Conference on Parallel Processing (St. Charles, Ill., Aug. 19-22). IEEE Press, New York, 1986, pp. 833-835.
18. Davies, J., Huson, C., Macke, T., Leasure, B., and Wolfe, M. The KAP/205: An advanced source-to-source vectorizer for the Cyber 205 supercomputer. In Proceedings of the 1986 International Conference on Parallel Processing (St. Charles, Ill., Aug. 19-22). IEEE Press, New York, 1986, pp. 827-832.
19. Davies, J.R. Parallel loop constructs for multiprocessors. M.S. thesis, Rep. 81-1070, Dept. of Computer Science, Univ. of Illinois at Urbana-Champaign, May 1981.
20. Dongarra, J.J., and Hinds, A. Comparison of the Cray X-MP-4, Fujitsu VP-200, and Hitachi S-810/20: An Argonne perspective. Rep. ANL-85-19, Argonne National Laboratory, Argonne, Ill., Oct. 1985.
21. Guzzi, M.D. Cedar Fortran Reference Manual. Rep. 601, Center for Supercomputing Research and Development, Univ. of Illinois at Urbana-Champaign, Nov. 1986.
22. Harrison, W.L. Compiling LISP for evaluation on a tightly coupled multiprocessor. Rep. 565, Center for Supercomputing Research and Development, Univ. of Illinois at Urbana-Champaign, Mar. 1986.
23. Harrison, W.L., and Padua, D.A. Representing S-expressions for the efficient evaluation of Lisp on parallel processors. In Proceedings of the 1986 International Conference on Parallel Processing (St. Charles, Ill., Aug. 19-22). IEEE Press, New York, 1986, pp. 703-710.
24. Huson, C.A. An in-line subroutine expander for Parafrase. M.S. thesis, Rep. 82-1118, Dept. of Computer Science, Univ. of Illinois at Urbana-Champaign, Dec. 1982.
25. Kamiya, S., Isobe, F., Takashima, H., and Takiuchi, M. Practical vectorization techniques for the Facom VP. In Information Processing 83, R.E.A. Mason, Ed. Elsevier North-Holland, New York, 1983, pp. 369-394.
27. Kruskal, C.P., and Weiss, A. Allocating independent subtasks on parallel processors. In Proceedings of the 1984 International Conference on Parallel Processing, R.M. Keller, Ed. IEEE Press, New York, Aug. 1984, pp. 236-240.
28. Kuck, D.J. Parallel processing of ordinary programs. In Advances in Computers, vol. 15, M. Rubinoff and M.C. Yovits, Eds. Academic Press, New York, 1976, pp. 119-179.
30. Kuck, D.J., and Stokes, R.A. The Burroughs scientific processor (BSP). Special Issue on Supersystems, IEEE Trans. Comput. C-31, 5 (May 1982), 363-376.
31. Kuck, D.J., Davidson, E.S., Lawrie, D.H., and Sameh, A.H. Parallel supercomputing today and the Cedar approach. Science 231, 4740 (Feb. 28, 1986), 967-974.
32. Kuck, D.J., Kuhn, R.H., Leasure, B., and Wolfe, M. The structure of an advanced retargetable vectorizer. In Tutorial on Supercomputers: Designs and Applications, K. Hwang, Ed. IEEE Press, New York, 1984, pp. 163-178.
33. Kuck, D.J., Kuhn, R.H., Padua, D.A., Leasure, B., and Wolfe, M. Dependence graphs and compiler optimizations. In Proceedings of the 8th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (Williamsburg, Va., Jan. 26-28, 1981), pp. 207-218. doi:10.1145/567532.567555
34. Kuck, D.J., Sameh, A.H., Cytron, R., Veidenbaum, A.V., Polychronopoulos, C.D., Lee, G., McDaniel, T., Leasure, B.R., Beckman, C., Davies, J.R.B., and Kruskal, C.P. The effects of program restructuring, algorithm change, and architecture choice on program performance. In Proceedings of the 1984 International Conference on Parallel Processing, R.M. Keller, Ed. IEEE Press, New York, Aug. 1984, pp. 129-138.
36. Lundstrom, S.F., and Barnes, G.H. A controllable MIMD architecture. In Proceedings of the 1980 International Conference on Parallel Processing (Bellaire, Mich., Aug. 26-29). IEEE Press, New York, 1980, pp. 19-27.
37. Mehrotra, P., and Van Rosendale, J. The Blaze Language: A parallel language for scientific programming. Rep. 85-29, Institute for Computer Applications in Science and Engineering, NASA Langley Research Center, Hampton, Va., May 1985.
38. Midkiff, S.P., and Padua, D.A. Compiler generated synchronization for Do loops. In Proceedings of the 1986 International Conference on Parallel Processing (St. Charles, Ill., Aug. 19-22). IEEE Press, New York, 1986.
39. Miura, K., and Uchida, K. Facom vector processor VP-100/VP-200. In High-Speed Computation, NATO ASI Series, vol. F7, J.S. Kowalik, Ed. Springer-Verlag, New York, 1984, pp. 127-138.
40. Nagashima, S., Inagami, Y., Odaka, T., and Kawabe, S. Design consideration for a high-speed vector processor: The Hitachi S-810. In Proceedings of the IEEE International Conference on Computer Design: VLSI in Computers, ICCD 84 (Port Chester, N.Y., Oct. 8-11). IEEE Press, New York, 1984, pp. 238-243.
43. Padua, D.A., Kuck, D.J., and Lawrie, D.H. High-speed multiprocessors and compilation techniques. IEEE Trans. Comput. C-29, 9 (Sept. 1980), 763-776.
47. Tang, P., and Yew, P. Processor self-scheduling for multiple-nested parallel loops. In Proceedings of the 1986 International Conference on Parallel Processing (St. Charles, Ill., Aug. 19-22). IEEE Press, New York, 1986, pp. 528-535.
53. Yasumura, M., Tanaka, Y., Kanada, Y., and Aoyama, A. Compiling algorithms and techniques for the S-810 vector processor. In Proceedings of the 1984 International Conference on Parallel Processing, R.M. Keller, Ed. IEEE Press, New York, Aug. 1984, pp. 285-290.
CITED BY 164
Nicolas Gloy, Michael D. Smith, Cliff Young, Performance issues in correlated branch prediction schemes, Proceedings of the 28th annual international symposium on Microarchitecture, p.3-14, November 29-December 01, 1995, Ann Arbor, Michigan, United States
P. R. Panda, F. Catthoor, N. D. Dutt, K. Danckaert, E. Brockmeyer, C. Kulkarni, A. Vandercappelle, P. G. Kjeldsberg, Data and memory optimization techniques for embedded systems, ACM Transactions on Design Automation of Electronic Systems (TODAES), v.6 n.2, p.149-206, April 2001
Michael Burke, Ron Cytron, Jeanne Ferrante, Wilson Hsieh, Vivek Sarkar, David Shields, Automatic discovery of parallelism: a tool and an experiment (extended abstract), ACM SIGPLAN Notices, v.23 n.9, p.77-84, Sept. 1988
T. M. Watts, M. L. Soffa, R. Gupta, Techniques for integrating parallelizing transformations and compiler-based scheduling methods, Proceedings of the 1992 ACM/IEEE conference on Supercomputing, p.830-839, November 16-20, 1992, Minneapolis, Minnesota, United States
D. Callahan, J. Dongarra, D. Levine, Vectorizing compilers: a test suite and results, Proceedings of the 1988 ACM/IEEE conference on Supercomputing, p.98-105, November 12-17, 1988, Orlando, Florida, United States
Pedro V. Artigas, Manish Gupta, Samuel P. Midkiff, José E. Moreira, Automatic loop transformations and parallelization for Java, Proceedings of the 14th international conference on Supercomputing, p.1-10, May 08-11, 2000, Santa Fe, New Mexico, United States
Gene Fuh, Jyh-Herng Chow, Nelson Mattos, Brian Tran, Supporting procedural constructs in existing SQL compilers, Proceedings of the 1996 conference of the Centre for Advanced Studies on Collaborative research, p.11, November 12-14, 1996, Toronto, Ontario, Canada
M. Chen, Y. Choo, J. Li, Crystal: from functional description to efficient parallel code, Proceedings of the third conference on Hypercube concurrent computers and applications: Architecture, software, computer systems, and general issues, p.417-433, January 19-20, 1988, Pasadena, California, United States
Akimasa Yoshida, Kenichi Koshizuka, Hironori Kasahara, Data-localization for Fortran macro-dataflow computation using partial static task assignment, Proceedings of the 10th international conference on Supercomputing, p.61-68, May 25-28, 1996, Philadelphia, Pennsylvania, United States
Tetsuo Hironaka, Takashi Hashimoto, Keizo Okazaki, Kazuaki Murakami, Shinji Tomita, Benchmarking a vector-processor prototype based on multithreaded streaming/FIFO vector (MSFV) architecture, Proceedings of the 6th international conference on Supercomputing, p.272-281, July 19-24, 1992, Washington, D. C., United States
I. Kadayif, T. Chinoda, M. Kandemir, N. Vijaykrishnan, M. J. Irwin, A. Sivasubramaniam, vEC: virtual energy counters, Proceedings of the 2001 ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering, p.28-31, June 2001, Snowbird, Utah, United States
Keshav Pingali, Micah Beck, Richard Johnson, Mayan Moudgill, Paul Stodghill, Dependence flow graphs: an algebraic approach to program dependencies, Proceedings of the 18th ACM SIGPLAN-SIGACT symposium on Principles of programming languages, p.67-78, January 21-23, 1991, Orlando, Florida, United States
Arun Kejariwal, Alexandru Nicolau, Utpal Banerjee, Constantine D. Polychronopoulos, A novel approach for partitioning iteration spaces with variable densities, Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, June 15-17, 2005, Chicago, IL, USA
J. Dongarra, G. Bosilca, Z. Chen, V. Eijkhout, G. E. Fagg, E. Fuentes, J. Langou, P. Luszczek, J. Pjesivac-Grbovic, K. Seymour, H. You, S. S. Vadhiyar, Self-adapting numerical software (SANS) effort, IBM Journal of Research and Development, v.50 n.2/3, p.223-238, March 2006
Lamia Youseff, Keith Seymour, Haihang You, Jack Dongarra, Rich Wolski, The impact of paravirtualized memory hierarchy on linear algebra computational kernels and software, Proceedings of the 17th international symposium on High performance distributed computing, June 23-27, 2008, Boston, MA, USA
R. Mirchandaney, J. H. Saltz, R. M. Smith, D. M. Nico, K. Crowley, Principles of runtime support for parallel processors, Proceedings of the 2nd international conference on Supercomputing, p.140-152, June 1988, St. Malo, France
Arun Kejariwal, Alexandru Nicolau, Hideki Saito, Xinmin Tian, Milind Girkar, Utpal Banerjee, Constantine D. Polychronopoulos, A general approach for partitioning N-dimensional parallel nested loops with conditionals, Proceedings of the eighteenth annual ACM symposium on Parallelism in algorithms and architectures, July 30-August 02, 2006, Cambridge, Massachusetts, USA
D. Baxter, R. Mirchandaney, J. H. Saltz, Run-time parallelization and scheduling of loops, Proceedings of the first annual ACM symposium on Parallel algorithms and architectures, p.303-312, June 18-21, 1989, Santa Fe, New Mexico, United States
Arun Kejariwal, Alexandru Nicolau, Utpal Banerjee, Alexander V. Veidenbaum, Constantine D. Polychronopoulos, Cache-aware partitioning of multi-dimensional iteration spaces, Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference, May 04-06, 2009, Haifa, Israel
Lamia Youseff, Keith Seymour, Haihang You, Dmitrii Zagorodnov, Jack Dongarra, Rich Wolski, Paravirtualization effect on single- and multi-threaded memory-intensive linear algebra software, Cluster Computing, v.12 n.2, p.101-122, June 2009
REVIEW
Reviewer: Edgar M. Pass
The authors survey the problems associated with the analysis of algorithms,
stated in FORTRAN, designed for highly parallel supercomputers. Their goal
is to explain the types of compiler optimizations that are currently in use
on some of these F...