|
ABSTRACT
As part of an effort to develop an optimizing compiler for a pipelined architecture, a code reorganization algorithm has been developed that significantly reduces the number of runtime pipeline interlocks. In a pass after code generation, the algorithm uses a dag representation to heuristically schedule the instructions in each basic block.Previous algorithms for reducing pipeline interlocks have had worst-case runtimes of at least O (n4). By using a dag representation which prevents scheduling deadlocks and a selection method that requires no lookahead, the resulting algorithm reorganizes instructions almost as effectively in practice, while having an O (n2) worst-case runtime.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
Ary83
|
Arya, S. Optimal Instruction Scheduling for a Class of Vector Processors: An Integer Programming Approach. Tech. Rept. CRL-TR-19-83, Computer Research Laboratory, the Univ. of Michigan, Ann Arbor, April 1983.
|
 |
Aus82
|
|
| |
Con67
|
Conway, R.W., W.L. Maxwell & L.W. Miller, Theory of Scheduling, Addison-Wesley, Reading, MA, 1967.
|
 |
Cou86
|
|
| |
Dav81
|
Davidson, S., D. Landskov, B.D. Shriver & P.W. Mallett. Some Experiments in Local Microcode Compaction for Horizontal Machines. 1EEE Trans. on Computers, Vol. C-30, No. 7, July 1981, pp. 460 - 477.
|
| |
Fis81
|
Fisher, j.A. Trace Scheduling: A Technique for Global Microcode Compaction. IEEE Trans. on Computers, Vol. C-30, No. 7, July 1981, pp. 478 - 490.
|
| |
Gro83
|
Gross, T.R. Code Optimization of Pipeline Constraints. Tech. Rept. 83-255, Computer Systems Lab., Stanford Univ., Dec. 1983.
|
| |
Hen81
|
Hennessy, J.L. Symbolic Debugging of Optimized Code, ACM Trans. on Prog. Lang. and Sys., Vol. 3, No. 1, Jan. 1981, pp. 200 - 206.
|
 |
Hen83
|
|
 |
Joh86
|
Mark S. Johnson , Terrence C. Miller, Effectiveness of a machine-level, global optimizer, Proceedings of the 1986 SIGPLAN symposium on Compiler construction, p.99-108, June 25-27, 1986, Palo Alto, California, United States
|
| |
Knu68
|
Knuth, D.E. Fundamental Algorithms, Addison- Wesley, Reading, MA, p. 258.
|
| |
Kog81
|
Kogge, P.M. The Architecture of Pipelined Computers, McGraw-Hill, New York, 1981.
|
 |
Rym82
|
|
| |
Sit78
|
Sites, R.L. Instruction Ordering for the Cray-1 Computer. Tech. Rept. 78-CS-023, Univ. of California, San Diego, July 1978.
|
| |
Spi71
|
Spillman, Thomas C., Exposing Side-Effects in a PldI Optimizing Compiler, Information Processing 81, North:Holland, 1972, pp. 376 - 381.
|
| |
Tho64
|
Thornton, J.E. Parallel Operation in the Control Data 6600, Proc. Fall Joint Comp. Conf., Part 2, Vol. 26, 1964, pp. 33 -40.
|
| |
Tok81
|
Tokoru, M., E. Tamura & T. Takizuka. Optimization of Microprograms. IEEE Trans. on Computers, Vol. C-30, No. 7, July 1981, pp. 491 - 504.
|
| |
Tom67
|
Tomasulo, R.M. An Efficient Algorithm for Exploiting Multiple Arithmetic Units, IBM J. of Res. and Devt., Vol. 11, No. 1, Jan. 1967, pp. 25 - 33.
|
| |
Veg82
|
|
| |
Zel84
|
Zellweger, P.T. lnteractiv~ Source-Level Debugging of Optimized Programs, Research Report CSL-84-5, Xerox Palo Alto Research Center, Palo Alto, CA, May 1984.
|
CITED BY 63
|
|
|
|
|
|
|
|
Pritpal S. Ahuja , Douglas W. Clark , Anne Rogers, The performance impact of incomplete bypassing in processor pipelines, Proceedings of the 28th annual international symposium on Microarchitecture, p.36-45, November 29-December 01, 1995, Ann Arbor, Michigan, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Mark Smotherman , Sanjay Krishnamurthy , P. S. Aravind , David Hunnicutt, Efficient DAG construction and heuristic calculation for instruction scheduling, Proceedings of the 24th annual international symposium on Microarchitecture, p.93-102, September 1991, Albuquerque, New Mexico, Puerto Rico
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Feipei Lai , Hung-Chang Lee , Chun-Luh Lee, Optimization on instruction reorganization, Proceedings of the 23rd annual workshop and symposium on Microprogramming and microarchitecture, p.143-148, November 27-29, 1990, Orlando, Florida, United States
|
|
|
|
|
|
A. R. Pleszkun , J. R. Goodman , W. C. Hsu , R. T. Joersz , G. Bier , P. Woest , P. B. Schechter, WISQ: a restartable architecture using queues, Proceedings of the 14th annual international symposium on Computer architecture, p.290-299, June 02-05, 1987, Pittsburgh, Pennsylvania, United States
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
R. Leupers , K. Karuri , S. Kraemer , M. Pandey, A design flow for configurable embedded processors based on optimized instruction set extension synthesis, Proceedings of the conference on Design, automation and test in Europe: Proceedings, March 06-10, 2006, Munich, Germany
|
|
|
|
|
|
|
|
|
Vikki Tang , Joran Siu , Alexander Vasilevskiy , Marcel Mitran, A framework for reducing instruction scheduling overhead in dynamic compilers, Proceedings of the 2006 conference of the Center for Advanced Studies on Collaborative research, October 16-19, 2006, Toronto, Ontario, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Pieter Bellens , Josep M. Perez , Felipe Cabarcas , Alex Ramirez , Rosa M. Badia , Jesus Labarta, CellSs: Scheduling techniques to better exploit memory hierarchy, Scientific Programming, v.17 n.1-2, p.77-95, January 2009
|
|