Heads and tails: a variable-length instruction format supporting parallel fetch and decode
Source
International Conference on Compilers, Architecture and Synthesis for Embedded Systems
Proceedings of the 2001 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems
Atlanta, Georgia, USA
Session: Hardware Support
Pages: 168-175
Year of Publication: 2001
ISBN: 1-58113-399-5
Authors
Heidi Pan, MIT Laboratory for Computer Science, Cambridge, MA
Krste Asanović, MIT Laboratory for Computer Science, Cambridge, MA
ABSTRACT
Existing variable-length instruction formats provide higher code densities than fixed-length formats, but are ill-suited to pipelined or parallel instruction fetch and decode. This paper presents a new variable-length instruction format that supports parallel fetch and decode of multiple instructions per cycle, allowing both high code density and rapid execution for high-performance embedded processors. In contrast to earlier schemes that store compressed variable-length instructions in main memory and then expand them into fixed-length in-cache formats, the new format is suitable for direct execution from the instruction cache, thereby increasing effective cache capacity and reducing cache power. The new heads-and-tails (HAT) format splits each instruction into a fixed-length head and a variable-length tail, and packs heads and tails in separate sections within a larger fixed-length instruction bundle. The heads can be easily fetched and decoded in parallel because they are a fixed distance apart in the instruction stream, while the variable-length tails provide improved code density. A conventional MIPS RISC instruction set is re-encoded in a variable-length HAT scheme, and achieves an average static code compression ratio of 75% and a dynamic fetch ratio (new-bits-fetched/old-bits-fetched) of 75%.
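The bundle layout described in the abstract can be illustrated with a minimal sketch. This is not the paper's encoding: the head width, the bundle width, the greedy packing policy, and every name below (`Instr`, `pack_bundle`, `head_at`, `tail_at`, `compression_ratio`) are assumptions chosen only for illustration. Heads are packed from one end of the bundle at fixed offsets and tails from the other end, so any head can be located without decoding the instructions before it. Tail lengths are passed in explicitly here; how a real decoder would derive them is not covered by this sketch.

```python
from dataclasses import dataclass

# All sizes below are illustrative assumptions, not values from the paper.
HEAD_BITS = 16     # assumed fixed head width
BUNDLE_BITS = 128  # assumed fixed bundle width
ORIG_BITS = 32     # width of a conventional fixed-length MIPS instruction


@dataclass
class Instr:
    head: int       # fixed-length part, assumed to fit in HEAD_BITS
    tail: int       # variable-length part
    tail_bits: int  # tail width in bits; 0 means the instruction has no tail


def pack_bundle(instrs):
    """Greedily pack instructions into one bundle; return (bit string, count)."""
    heads, tails, used = [], [], 0
    for ins in instrs:
        need = HEAD_BITS + ins.tail_bits
        if used + need > BUNDLE_BITS:
            break  # the next instruction would not fit in this bundle
        heads.append(format(ins.head, f"0{HEAD_BITS}b"))
        tails.append(format(ins.tail, f"0{ins.tail_bits}b") if ins.tail_bits else "")
        used += need
    padding = "0" * (BUNDLE_BITS - used)
    # Heads grow from the left edge; tails grow from the right edge.
    return "".join(heads) + padding + "".join(reversed(tails)), len(heads)


def head_at(bundle, i):
    """Head i sits at a fixed offset, independent of every other instruction."""
    return int(bundle[i * HEAD_BITS:(i + 1) * HEAD_BITS], 2)


def tail_at(bundle, tail_lengths, i):
    """Tail i is found by summing the lengths of the tails that precede it."""
    end = BUNDLE_BITS - sum(tail_lengths[:i])
    bits = bundle[end - tail_lengths[i]:end]
    return int(bits, 2) if bits else 0


def compression_ratio(instrs):
    """New bits over original bits, ignoring bundle padding (a simplification)."""
    new_bits = sum(HEAD_BITS + ins.tail_bits for ins in instrs)
    return new_bits / (ORIG_BITS * len(instrs))


demo = [Instr(0xABCD, 0b101, 3), Instr(0x1234, 0, 0), Instr(0xFFFF, 0xFF, 8)]
bundle, count = pack_bundle(demo)
lengths = [ins.tail_bits for ins in demo]
```

In this sketch `head_at(bundle, 1)` needs only the index, which is the property that allows several heads to be decoded in parallel, whereas `tail_at` depends on the lengths of the preceding tails. The ratio helper follows the abstract's new-bits/old-bits definition but leaves out padding and any per-bundle bookkeeping, so it should not be compared with the reported 75% figures.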
CITED BY 4
Tay-Jyi Lin, Chie-Min Chao, Chia-Hsien Liu, Pi-Chen Hsiao, Shin-Kai Chen, Li-Chun Lin, Chih-Wei Liu, Chein-Wei Jen, A unified processor architecture for RISC & VLIW DSP, Proceedings of the 15th ACM Great Lakes symposium on VLSI, April 17-19, 2005, Chicago, Illinois, USA