ABSTRACT
Micro-architecture designers are very cautious about expanding the number of architected registers (and hence the register field), because widening the register field increases code size, raises I-cache and memory pressure, and complicates the processor pipeline. For low-end processors in particular, encoding space can be extremely limited due to area and power considerations. On the other hand, the number of architected registers exposed to the compiler directly affects the effectiveness of compiler analysis and optimization. For high-performance computers, register pressure can exceed the number of available registers in some regions, e.g. due to optimizations such as aggressive function inlining and software pipelining. The compiler cannot perform compilation and optimization effectively if only a small number of registers are exposed through the ISA. It is therefore crucial that more architected registers be available at the compiler's disposal without expanding the code size significantly.

In this paper, we look at a new register encoding scheme called differential encoding, which allows more registers to be addressed in the operand field of instructions than the direct encoding currently in use. We show that it can be implemented with very low overhead. We apply differential encoding in several ways so that the extra architected registers can benefit performance. Three schemes are devised to integrate differential encoding with register allocation. We demonstrate that differential register allocation helps improve the performance of both high-end and low-end processors. Moreover, we can combine it with software pipelining to provide more registers and reduce spills.

Our results show that differential encoding significantly reduces the number of spills and speeds up program execution. For a low-end configuration, we achieve over 12% speedup while keeping code size almost unaffected. For loop optimization, it significantly speeds up loops with high register pressure (over 70% speedup).
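The abstract does not spell out the encoding itself, so the following is only a minimal sketch of one way a differential scheme could work, under stated assumptions: each operand field stores the modular difference between the current register number and the previously accessed one, and a repair operation resets the reference register when the difference does not fit the field. The names `diff_encode`, `diff_decode`, `reg_n`, `diff_n`, and the `set_last` operation are hypothetical and chosen for illustration; they are not taken from the text above.

```python
def diff_encode(regs, reg_n, diff_n, last=0):
    """Encode register numbers as differences modulo reg_n.

    reg_n  : number of architected registers (assumed parameter)
    diff_n : number of differences the operand field can hold (assumed)
    Returns a list of ("diff", d) and ("set_last", r) items.
    """
    stream = []
    for r in regs:
        d = (r - last) % reg_n
        if d >= diff_n:
            # Difference does not fit the field: emit a hypothetical
            # repair op that moves the reference register to r.
            stream.append(("set_last", r))
            d = 0
        stream.append(("diff", d))
        last = r
    return stream


def diff_decode(stream, reg_n, last=0):
    """Recover the register numbers from an encoded stream."""
    regs = []
    for kind, value in stream:
        if kind == "set_last":
            last = value
        else:
            last = (last + value) % reg_n
            regs.append(last)
    return regs


# 16 registers addressed through a field that holds only 8 differences,
# i.e. 3 bits per operand instead of the 4 bits direct encoding needs.
accesses = [1, 3, 4, 2, 9, 10]
encoded = diff_encode(accesses, reg_n=16, diff_n=8)
decoded = diff_decode(encoded, reg_n=16)
```

In this sketch every repair operation costs an extra instruction, which suggests why an allocator would want consecutive accesses to use register numbers that are close together in modular distance.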