ABSTRACT
We examine two pipeline structures that are employed in commercial microprocessors. The first is the load-use interlock (LUI) pipeline, which employs an interlock to ensure correct operation during load-use hazards. The second is the address-generation interlock (AGI) pipeline, which eliminates the load-use hazard but introduces an address-generation hazard that requires an address-generation interlock for correct operation. We compare the performance of these two pipelines on existing binaries and on applications that have been recompiled with a local code scheduler that understands the difference in the pipeline structures. When branch prediction is more than 80% accurate and the data cache access time is greater than two cycles, the AGI pipeline performs significantly better than the LUI pipeline on existing binaries. Recompiling the benchmarks with a new local code scheduler provides little additional performance improvement.
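The difference between the two hazards can be illustrated with a toy stall counter. This is a minimal sketch under stated assumptions, not the simulator used in the paper: it assumes a scalar, in-order pipeline with full forwarding, in which the LUI pipeline computes addresses and ALU results in one execute stage followed by `cache_latency` cache stages, while the AGI pipeline places an address-generation stage first, then the cache stages, then the ALU stage. The names `Instr` and `stall_cycles`, the stage offsets, and the resulting stall counts are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    kind: str                # "alu" or "load"
    dest: str                # destination register
    addr_srcs: tuple = ()    # registers used to form a memory address
    data_srcs: tuple = ()    # registers used as ALU operands


def stall_cycles(program, pipeline, cache_latency=1):
    """Count interlock stall cycles in a toy in-order, scalar pipeline model.

    Stage offsets are measured from the cycle an instruction issues.
    Assumed layouts (illustrative only):
      LUI: EX at offset 0, cache access at offsets 1..d
      AGI: address generation at offset 0, cache access at 1..d, ALU at d+1
    """
    d = cache_latency
    if pipeline == "LUI":
        need_addr, need_data = 0, 0          # all operands read in EX
        ready = {"alu": 0, "load": d}        # offset of the producing stage
    elif pipeline == "AGI":
        need_addr, need_data = 0, d + 1      # addresses read early, ALU operands late
        ready = {"alu": d + 1, "load": d}
    else:
        raise ValueError("pipeline must be 'LUI' or 'AGI'")

    avail = {}   # register -> first cycle in which its value can be consumed
    cycle = 0    # earliest cycle the next instruction could issue
    stalls = 0
    for ins in program:
        issue = cycle
        for reg in ins.addr_srcs:
            issue = max(issue, avail.get(reg, 0) - need_addr)
        for reg in ins.data_srcs:
            issue = max(issue, avail.get(reg, 0) - need_data)
        stalls += issue - cycle
        # The result is forwarded the cycle after its producing stage completes.
        avail[ins.dest] = issue + ready[ins.kind] + 1
        cycle = issue + 1
    return stalls


# A load followed immediately by a use of the loaded value.
load_use = [
    Instr("load", "r1", addr_srcs=("r2",)),
    Instr("alu", "r3", data_srcs=("r1", "r4")),
]

# An ALU result used immediately as a load address.
addr_gen = [
    Instr("alu", "r2", data_srcs=("r5", "r6")),
    Instr("load", "r1", addr_srcs=("r2",)),
]

for name, prog in (("load-use", load_use), ("address-generation", addr_gen)):
    for pipe in ("LUI", "AGI"):
        print(name, pipe, stall_cycles(prog, pipe, cache_latency=2))
```

In this toy model the load-use sequence stalls only on the LUI pipeline and the address-generation sequence stalls only on the AGI pipeline, which mirrors the trade-off described in the abstract. The specific stall lengths follow from the assumed stage layout and should not be read as the paper's measured penalties.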
INDEX TERMS
Primary Classification:
B. Hardware
  B.5 REGISTER-TRANSFER-LEVEL IMPLEMENTATION
    B.5.1 Design
      Subjects: Styles (e.g., parallel, pipeline, special-purpose)
Additional Classification:
B. Hardware
  B.5 REGISTER-TRANSFER-LEVEL IMPLEMENTATION
    B.5.1 Design
      Subjects: Memory design
C. Computer Systems Organization
  C.0 GENERAL
    Subjects: Instruction set design (e.g., RISC, CISC, VLIW)
  C.1 PROCESSOR ARCHITECTURES
    C.1.2 Multiple Data Stream Architectures (Multiprocessors)
      Subjects: Pipeline processors**
General Terms: Experimentation, Measurement, Performance
Keywords: RISC, cache memory, interlocks, memory system, pipelines