A compilation technique for software pipelining of loops with conditional jumps
Source: International Symposium on Microarchitecture
Proceedings of the 20th Annual Workshop on Microprogramming
Colorado Springs, Colorado, United States
Pages: 69 - 79  
Year of Publication: 1987
ISBN: 0-89791-250-0
Author
Kemal Ebcioğlu  IBM, Thomas J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY
Sponsor
SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/255305.255317

ABSTRACT

We describe a compilation algorithm for efficient software pipelining of general inner loops, where the number of iterations and the time taken by each iteration may be unpredictable, due to arbitrary if-then-else statements and conditional exit statements within the loop. As our target machine, we assume a wide instruction word architecture that allows multi-way branching in the form of if-then-else trees, and that allows conditional register transfers depending on where the microinstruction branches to (a hardware implementation proposal for such a machine is briefly described in the paper). Our compilation algorithm, which we call the pipeline scheduling technique, produces a software-pipelined version of a given inner loop, which allows a new iteration of the loop to begin on every cycle whenever dependencies and resources permit. The correctness and termination properties of the algorithm are studied in the paper.
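
As a rough illustration of what software pipelining of a loop with a conditional exit means, the sketch below hand-pipelines a toy two-stage loop. It is not the pipeline scheduling technique from the paper, which targets a wide instruction word machine and is not reproduced here. The function names `sequential` and `pipelined`, the two stages, and the exit condition are all assumptions made for the example. Each pass of the `while` loop stands in for one "cycle" in which the second stage of one iteration is grouped with the exit test and first stage of the next iteration.

```python
def sequential(xs):
    """Reference loop: two dependent steps per iteration plus a conditional exit."""
    out = []
    for x in xs:
        if x < 0:            # conditional exit
            break
        t = x * x            # stage A
        out.append(t + 1)    # stage B, depends on stage A
    return out


def pipelined(xs):
    """Hand-pipelined sketch: stage B of iteration i-1 is grouped with the
    exit test and stage A of iteration i (hypothetical two-stage schedule)."""
    out = []
    n = len(xs)

    # Prologue: exit test and stage A of iteration 0.
    if n == 0 or xs[0] < 0:
        return out
    t = xs[0] * xs[0]
    i = 1

    while True:
        # Steady state: one "cycle" holds work from two iterations.
        exiting = i >= n or xs[i] < 0                 # exit test of iteration i
        t_next = None if exiting else xs[i] * xs[i]   # stage A of iteration i
        out.append(t + 1)                             # stage B of iteration i-1
        if exiting:
            break   # epilogue is empty: the in-flight iteration has drained
        t = t_next
        i += 1
    return out


if __name__ == "__main__":
    data = [1, 2, -1, 3]
    print(sequential(data), pipelined(data))
```

In Python the two stages still run one after another, so the sketch shows only the reordering of work across iterations. On a machine that can issue both operations in one instruction, the grouped operations could proceed in parallel.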

