A fill-unit approach to multiple instruction issue
Source
International Symposium on Microarchitecture
Proceedings of the 27th annual international symposium on Microarchitecture
San Jose, California, United States
Pages: 162 - 171
Year of Publication: 1994
ISBN: 0-89791-707-3
Authors
Manoj Franklin, Dept. of Elect. and Computer Eng., Clemson University, Clemson, SC
Mark Smotherman, Dept. of Computer Science, Clemson University, Clemson, SC
ABSTRACT
Multiple issue of instructions occurs in superscalar and VLIW machines. This paper investigates a third type of machine design, which combines the advantages of code compatibility as in superscalars and the absence of complex dependency-checking logic from the decoder as in VLIW. In this design, a stream of scalar instructions is executed by the hardware and is simultaneously compacted into VLIW-type instructions, which are then stored in a structure called a shadow cache. When a shadow cache line contains the instructions requested by the fetch unit, the scalar instruction stream is preempted and all operations in the shadow cache line are simultaneously issued and executed. The mechanism that compacts instructions is called a fill unit, and was first proposed for dynamically compacting microoperations into large executable units by Melvin, Shebanow, and Patt in 1988. We have extended their approach to directly handle data dependencies, delayed branches, and speculative execution (using branch prediction). This approach is evaluated using the MIPS architecture, and a six-functional-unit machine is found to be 52 to 108% faster than a single-issue processor for unrecompiled SPECint92 benchmarks.
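The compaction idea in the abstract can be illustrated with a small sketch. This is not the authors' design: the names `Instr`, `ShadowLine`, `FillUnit`, `observe`, `flush`, and `lookup` are hypothetical, and the line-closing policy is a simplifying assumption. Here the fill unit simply closes the current line when the next instruction has a read-after-write or write-after-write dependence on an operation already in the line, or when the line is full, whereas the paper states that dependencies, delayed branches, and speculation are handled directly.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """A scalar instruction (hypothetical minimal model)."""
    pc: int
    dest: Optional[str]
    srcs: Tuple[str, ...] = ()


@dataclass
class ShadowLine:
    """One VLIW-type line: operations that may issue together."""
    start_pc: int
    ops: List[Instr] = field(default_factory=list)


class FillUnit:
    """Compacts an executed scalar stream into shadow cache lines.

    Assumption: a line is closed when it is full or when the incoming
    instruction depends (RAW or WAW) on an operation already in it.
    """

    def __init__(self, width: int = 6) -> None:
        self.width = width  # six functional units, as in the evaluated machine
        self.shadow_cache: Dict[int, ShadowLine] = {}  # keyed by starting PC
        self.current: Optional[ShadowLine] = None

    def _conflicts(self, instr: Instr) -> bool:
        for op in self.current.ops:
            if op.dest is None:
                continue
            if op.dest in instr.srcs:   # read-after-write
                return True
            if op.dest == instr.dest:   # write-after-write
                return True
        return False

    def observe(self, instr: Instr) -> None:
        """Record one scalar instruction as it executes."""
        if self.current is None:
            self.current = ShadowLine(instr.pc)
        elif len(self.current.ops) >= self.width or self._conflicts(instr):
            self.flush()
            self.current = ShadowLine(instr.pc)
        self.current.ops.append(instr)

    def flush(self) -> None:
        """Store the line under construction in the shadow cache."""
        if self.current is not None and self.current.ops:
            self.shadow_cache[self.current.start_pc] = self.current
        self.current = None

    def lookup(self, pc: int) -> Optional[ShadowLine]:
        """Return the line starting at pc, or None on a shadow cache miss."""
        return self.shadow_cache.get(pc)
```

In this sketch, a fetch at address `pc` would first call `lookup(pc)`; on a hit, every operation in the returned line issues at once, and on a miss the scalar stream continues to execute while `observe` builds new lines.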
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
4. K. Ebcioglu, "Some Design Ideas for a VLIW Architecture for Sequential Natured Software," in M. Cosnard, et al., (eds.), Parallel Processing (Proc. IFIP WG 10.3 Working Conference on Parallel Processing, Pisa, Italy), North Holland, 1988, pp. 3-21.
7. M. Johnson, Superscalar Microprocessor Design. Englewood Cliffs, NJ: Prentice-Hall, 1991.
9. Nadeem Malik, Richard J. Eickemeyer, Stamatis Vassiliadis, Interlock collapsing ALU for increased instruction-level parallelism, Proceedings of the 25th annual international symposium on Microarchitecture, p.149-157, December 01-04, 1992, Portland, Oregon, United States. doi: 10.1145/144953.145794
10. S. W. Melvin, M. C. Shebanow, Y. N. Patt, Hardware support for large atomic units in dynamically scheduled machines, Proceedings of the 21st annual workshop on Microprogramming and microarchitecture, p.60-63, November 28-December 02, 1988, San Diego, California, United States
11. Val Popescu, Merle Schultz, John Spracklen, Gary Gibson, Bruce Lightner, David Isaman, The Metaflow Architecture, IEEE Micro, v.11 n.3, p.10-13, 63-73, May 1991. doi: 10.1109/40.87564
CITED BY 22
Daniel Holmes Friendly, Sanjay Jeram Patel, Yale N. Patt, Alternative fetch and issue policies for the trace cache fetch mechanism, Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, p.24-33, December 01-03, 1997, Research Triangle Park, North Carolina, United States
Jared Stark, Paul Racunas, Yale N. Patt, Reducing the performance impact of instruction cache misses by writing instructions into the reservation stations out-of-order, Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, p.34-43, December 01-03, 1997, Research Triangle Park, North Carolina, United States
Jude A. Rivers, Gary S. Tyson, Edward S. Davidson, Todd M. Austin, On high-bandwidth data cache design for multi-issue processors, Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, p.46-56, December 01-03, 1997, Research Triangle Park, North Carolina, United States