Instruction fetching: coping with code bloat
Source: International Symposium on Computer Architecture
Proceedings of the 22nd Annual International Symposium on Computer Architecture
S. Margherita Ligure, Italy
Pages: 345-356
Year of Publication: 1995
ISBN: 0-89791-698-0
Authors
Richard Uhlig  Gesellschaft für Mathematik und Datenverarbeitung (GMD), Schloß Birlinghoven, 53757 Sankt Augustin, Germany
David Nagle  Department of ECE, Carnegie Mellon University, Pittsburgh, PA
Trevor Mudge  EECS Department, University of Michigan, 1301 Beal Ave., Ann Arbor, Michigan
Stuart Sechrest  EECS Department, University of Michigan, 1301 Beal Ave., Ann Arbor, Michigan
Joel Emer  Digital Equipment Corporation, 77 Reed Road HLO2-3/J3, Hudson, MA
Sponsors
IEEE-CS TCCA: TC on Computer Architecture
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/223982.224445

ABSTRACT

Previous research has shown that the SPEC benchmarks achieve low miss ratios in relatively small instruction caches. This paper presents evidence that current software-development practices produce applications that exhibit substantially higher instruction-cache miss ratios than do the SPEC benchmarks. To represent these trends, we have assembled a collection of applications, called the Instruction Benchmark Suite (IBS), that provides a better test of instruction-cache performance. We discuss the rationale behind the design of IBS and characterize its behavior relative to the SPEC benchmark suite. Our analysis is based on trace-driven and trap-driven simulations and takes into full account both the application and operating-system components of the workloads.

This paper then reexamines a collection of previously proposed hardware mechanisms for improving instruction-fetch performance in the context of the IBS workloads. We study the impact of cache organization, transfer bandwidth, prefetching, and pipelined memory systems on machines that rely on the use of relatively small primary instruction caches to facilitate increased clock rates. We find that, although of little use for SPEC, the right combination of these techniques substantially benefits IBS. Even so, under IBS, a stubborn lower bound on the instruction-fetch CPI remains as an obstacle to improving overall processor performance.
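
The two quantities the abstract relies on, instruction-cache miss ratio and instruction-fetch CPI, can be illustrated with a minimal trace-driven sketch. This is not the authors' simulator: the direct-mapped organization, the 8 KB cache size, the 32-byte line, the 10-cycle miss penalty, and the synthetic loop trace are all assumptions chosen only for illustration, and the function names are hypothetical.

```python
def simulate_icache(trace, cache_bytes=8192, line_bytes=32):
    """Replay a trace of instruction-fetch addresses through a
    direct-mapped cache and return (misses, fetches).

    cache_bytes and line_bytes are illustrative assumptions.
    """
    num_lines = cache_bytes // line_bytes
    tags = [None] * num_lines           # one stored tag per line; None = invalid
    misses = 0
    fetches = 0
    for addr in trace:
        fetches += 1
        block = addr // line_bytes      # memory block containing the instruction
        index = block % num_lines       # cache line that block maps to
        tag = block // num_lines        # distinguishes blocks sharing a line
        if tags[index] != tag:          # cold or conflict miss: fill the line
            misses += 1
            tags[index] = tag
    return misses, fetches


def fetch_cpi(misses, fetches, miss_penalty_cycles):
    """Stall cycles per instruction caused by instruction-cache misses,
    assuming every miss costs the same fixed penalty."""
    return misses * miss_penalty_cycles / fetches


# Synthetic trace (assumption): a 64-instruction loop of 4-byte
# instructions starting at 0x1000, executed 10 times.
loop_trace = [0x1000 + 4 * i for i in range(64)] * 10

misses, fetches = simulate_icache(loop_trace)
print("miss ratio:", misses / fetches)
print("fetch CPI :", fetch_cpi(misses, fetches, miss_penalty_cycles=10))
```

With the default 8 KB cache the loop fits entirely, so only the first pass misses. Shrinking `cache_bytes` to 128 makes the loop's eight lines contend for four cache lines and every line fetch misses, which is the small-cache, large-footprint effect the abstract attributes to the IBS workloads.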


