ACM Home Page
Please provide us with feedback. Feedback
Register allocation for software pipelined multidimensional loops
Full text PdfPdf (3.34 MB)
Source
ACM Transactions on Programming Languages and Systems (TOPLAS) archive
Volume 30 ,  Issue 4  (July 2008) table of contents
Article No. 23  
Year of Publication: 2008
ISSN:0164-0925
Authors
Hongbo Rong  Microsoft Corporation, Redmond, WA
Alban Douillet  Hewlett-Packard Co., Palo Alto, CA
Guang R. Gao  University of Delaware, Newark, DE
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 15,   Downloads (12 Months): 168,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1377492.1377498
What is a DOI?

ABSTRACT

This article investigates register allocation for software pipelined multidimensional loops where the execution of successive iterations from an n-dimensional loop is overlapped. For single loop software pipelining, the lifetimes of a loop variable in successive iterations of the loop form a repetitive pattern. An effective register allocation method is to represent the pattern as a vector of lifetimes (or a vector lifetime using Rau's terminology [Rau 1992]) and map it to rotating registers. Unfortunately, the software pipelined schedule of a multidimensional loop is considerably more complex and so are the vector lifetimes in it.

In this article, we develop a way to normalize and represent the vector lifetimes, which captures their complexity, while exposing their regularity that enables a simple solution. The problem is formulated as bin-packing of the multidimensional vector lifetimes on the surface of a space-time cylinder. A metric, called distance, is calculated either conservatively or aggressively to guide the bin-packing process, so that there is no overlapping between any two vector lifetimes, and the register requirement is minimized. This approach subsumes the classical register allocation for software pipelined single loops as a special case. The method has been implemented in the ORC compiler and produced code for the IA-64 architecture. Experimental results show the effectiveness. Several strategies for register allocation are compared and analyzed.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
2
3
4
5
 
6
7
8
9
 
10
 
11
 
12
Douillet, A. and Gao, G. R. 2005. Register pressure in software-pipelined loop nests: Fast computation and impact on architecture design. In The 18th International Workshop on Languages and Compilers for Parallel Computing (LCPC'05). Hawthorne, NY, 17--31.
 
13
 
14
 
15
Gao, G. R., Ning, Q., and Dongen, V. V. 1993. Software pipelining for nested loops. ACAPS Tech Memo 53, School of Computer Science, McGill Univ., Montréal, Québec.
 
16
17
 
18
Intel. 2001. Intel IA-64 Architecture Software Developer's Manual. Vol. 1: IA-64 Application Architecture. Intel Corporation, Santa Clara, CA.
19
20
 
21
Lawler, E. L., Lenstra, J. K., Khan, A. H. G. R., and Shmoys, D. B. 1985. The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization. John Wiley & Sons.
22
 
23
 
24
25
 
26
 
27
Rau, B. R., Lee, M., Tirumalai, P. P., and Schlansker, M. S. 1992. Register allocation for modulo scheduled loops: Strategies, algorithms and heuristics. HP Labs Tech. rep. HPL-92-48, Hewlett-Packard Laboratories, Palo Alto, CA.
 
28
 
29
Rong, H. and Govindarajan, R. 2007. Advances in software piplining. In The Compiler Design Handbook: Optimization and Machine Code Generation, 2nd Ed. Y. N. Srikant and P. Shankar, Eds. CRC, Chapter 20.
 
30
Rong, H., Tang, Z., Govindarajan, R., Douillet, A., and Gao, G. R. 2007a. Single-dimension software pipelining for multi-dimensional loops. CAPSL Technical Memo, Department of Electrical and Computer Engineering, University of Delaware, Newark, DE. In ftp://ftp.capsl.udel.edu/pub/doc/memos/memo049.ps.gz.
31
 
32
Turkington, K., Masselos, K., Constantinides, G. A., and Leong, P. 2006. FPGA based acceleration of the linpack benchmark: A high level code transformation approach. In Proceedings of the International Conference on Field Programmable Logic and Applications (FPL). Madrid, Spain. IEEE, 1--6.
 
33

Collaborative Colleagues:
Hongbo Rong: colleagues
Alban Douillet: colleagues
Guang R. Gao: colleagues