VLIW compilation techniques in a superscalar environment
Source: Conference on Programming Language Design and Implementation
Proceedings of the ACM SIGPLAN 1994 Conference on Programming Language Design and Implementation
Orlando, Florida, United States
Pages: 36 - 48  
Year of Publication: 1994
ISBN:0-89791-662-X
Also published in ...
Authors
Kemal Ebcioglu  IBM T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY
Randy D. Groves  IBM RISC System/6000 Division, 11400 Burnet Road, Austin, TX
Ki-Chang Kim  IBM T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY
Gabriel M. Silberman  IBM T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY
Isaac Ziv  IBM T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY
Sponsor
SIGPLAN: ACM Special Interest Group on Programming Languages
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/178243.178247

ABSTRACT

We describe techniques for converting the intermediate code representation of a given program, as generated by a modern compiler, to another representation which produces the same run-time results, but can run faster on a superscalar machine. The algorithms, based on novel parallelization techniques for Very Long Instruction Word (VLIW) architectures, find and place together independently executable operations that may be far apart in the original code, i.e., they may be separated by many conditional branches or belong to different iterations of a loop. As a result, the functional units in the superscalar are presented with more work that can proceed in parallel, thus achieving higher performance than the approach of using hardware instruction dispatch techniques alone.

While general scheduling techniques improve performance by removing idle pipeline cycles, further improving performance on a superscalar with only a few functional units requires a reduction in the pathlength. We have designed a set of new algorithms for reducing pathlength and removing stalls due to branches, namely speculative load-store motion out of loops, unspeculation, limited combining, basic block expansion, and prolog tailoring. These algorithms were implemented in a prototype version of the IBM RS/6000 xlc compiler and have shown significant improvement in SPEC integer benchmarks on the IBM POWER machines.

Also, we describe a new technique to obtain profiling information with low overhead, and some applications of profiling directed feedback, including scheduling heuristics, code reordering and branch reversal.
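The abstract names speculative load-store motion out of loops as one of the pathlength-reducing techniques but does not describe the algorithm. As a rough, source-level illustration of the general idea only (the paper works on compiler intermediate code, and the function names below are hypothetical), the sketch shows a loop that loads and stores a memory location on every taken iteration, and an equivalent loop in which the load is moved above the loop and the store below it. The motion is read here as "speculative" because the hoisted version touches memory even when the original loop would not (for example, when no element is positive). It assumes that `sum` does not alias `a` and that `*sum` is always safe to read and write.

```c
#include <stddef.h>

/* Before: *sum is loaded and stored inside the loop on every
 * iteration that takes the branch. */
void accumulate_naive(int *sum, const int *a, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (a[i] > 0) {
            *sum += a[i];   /* load *sum, add, store *sum */
        }
    }
}

/* After (illustrative assumption, not the paper's algorithm):
 * the load is hoisted above the loop and the store is sunk below it,
 * so the loop body works on a register-resident value. */
void accumulate_hoisted(int *sum, const int *a, size_t n) {
    int s = *sum;           /* executed even if the branch is never taken */
    for (size_t i = 0; i < n; i++) {
        if (a[i] > 0) {
            s += a[i];      /* no memory traffic for the accumulator */
        }
    }
    *sum = s;               /* single store after the loop */
}
```

Both functions compute the same result under the stated assumptions; the second executes fewer memory operations per iteration, which is the kind of pathlength reduction the abstract refers to.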


