| Register allocation for software pipelined multi-dimensional loops |
| Full text |
Pdf
(745 KB)
|
| Source
|
Conference on Programming Language Design and Implementation
archive
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
table of contents
Chicago, IL, USA
SESSION: Register allocation
table of contents
Pages: 154 - 167
Year of Publication: 2005
ISBN:1-59593-056-6
Also published in ...
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 5, Downloads (12 Months): 43, Citation Count: 2
|
|
|
ABSTRACT
Software pipelining of a multi-dimensional loop is an important optimization that overlaps the execution of successive outermost loop iterations to explore instruction-level parallelism from the entire n-dimensional iteration space. This paper investigates register allocation for software pipelined multi-dimensional loops.For single loop software pipelining, the lifetime instances of a loop variant in successive iterations of the loop form a repetitive pattern. An effective register allocation method is to represent the pattern as a vector of lifetimes (or a vector lifetime using Rau's terminology) and map it to rotating registers. Unfortunately, the software pipelined schedule of a multi-dimensional loop is considerably more complex, and so are the vector lifetimes in it.In this paper, we develop a way to normalize and represent vector lifetimes in multi-dimensional loop software pipelining, which capture their complexity, while exposing their regularity that enables us to develop a simple, yet powerful solution. Our algorithm is based on the development of a metric, called distance, that quantitatively determines the degree of potential overlapping (conflicts) between two vector lifetimes. We show how to calculate and use the distance, conservatively or aggressively, to guide the register allocation of the vector lifetimes under a bin-packing algorithm framework. The classical register allocation for software pipelined single loops is subsumed by our method as a special case.The method has been implemented in the ORC compiler and produced code for the Itanium architecture. We report the effectiveness of our method on 134 loop nests with 348 loop levels. Several strategies for register allocation are compared and analyzed.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
 |
2
|
|
| |
3
|
|
 |
4
|
|
| |
5
|
|
| |
6
|
|
 |
7
|
|
| |
8
|
Intel. Intel IA-64 Architecture Software Developer's Manual, Vol. 1. Intel, 2001.
|
 |
9
|
|
| |
10
|
|
| |
11
|
|
 |
12
|
B. R. Rau , M. Lee , P. P. Tirumalai , M. S. Schlansker, Register allocation for software pipelined loops, Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation, p.283-299, June 15-19, 1992, San Francisco, California, United States
|
| |
13
|
Hongbo Rong , Alban Douillet , R. Govindarajan , Guang R. Gao, Code Generation for Single-Dimension Software Pipelining of Multi-Dimensional Loops, Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization, p.175, March 20-24, 2004, Palo Alto, California
|
| |
14
|
Hongbo Rong , Zhizhong Tang , R. Govindarajan , Alban Douillet , Guang R. Gao, Single-Dimension Software Pipelining for Multi-Dimensional Loops, Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization, p.163, March 20-24, 2004, Palo Alto, California
|
|