ABSTRACT
This paper describes the miss classification table, a simple mechanism that enables the processor or memory controller to identify each cache miss as either a conflict miss or a capacity (non-conflict) miss. The miss classification table works by storing part of the tag of the most recently evicted line of a cache set. If the next miss to that cache set has a matching tag, it is identified as a conflict miss. This technique correctly identifies 88% of misses. Several applications of this information are demonstrated, including improvements to victim caching, next-line prefetching, cache exclusion, and a pseudo-associative cache. This paper also presents the adaptive miss buffer (AMB), which combines several of these techniques, targeting each miss with the most appropriate optimization, all within a single small miss buffer. The AMB's combination of techniques achieves 16% better performance than any single technique alone.
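The classification rule described above can be illustrated with a minimal sketch. It assumes a direct-mapped cache for simplicity; the class name `MissClassifyingCache`, its parameters (`num_sets`, `line_bytes`, `mct_tag_bits`), and the default sizes are hypothetical choices made for this illustration and are not taken from the paper.

```python
class MissClassifyingCache:
    """Toy direct-mapped cache with a miss classification table (MCT).

    Assumption: one MCT entry per cache set, holding the low
    `mct_tag_bits` bits of the tag of the line most recently evicted
    from that set. All names and sizes here are illustrative only.
    """

    def __init__(self, num_sets=4, line_bytes=16, mct_tag_bits=8):
        self.num_sets = num_sets
        self.line_bytes = line_bytes
        self.tag_mask = (1 << mct_tag_bits) - 1
        self.tags = [None] * num_sets  # full tag of the resident line per set
        self.mct = [None] * num_sets   # partial tag of the last evicted line per set

    def access(self, addr):
        """Return 'hit', 'conflict', or 'capacity' for a byte address."""
        line = addr // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets

        if self.tags[index] == tag:
            return "hit"

        # Miss: a match against the partial tag of the most recently
        # evicted line of this set is classified as a conflict miss.
        partial = tag & self.tag_mask
        kind = "conflict" if self.mct[index] == partial else "capacity"

        # Record the partial tag of the line being evicted, then fill.
        evicted = self.tags[index]
        if evicted is not None:
            self.mct[index] = evicted & self.tag_mask
        self.tags[index] = tag
        return kind


if __name__ == "__main__":
    cache = MissClassifyingCache(num_sets=4, line_bytes=16)
    # Addresses 0, 64, and 128 all map to set 0 in this configuration.
    for addr in (0, 64, 0, 0, 128):
        print(addr, cache.access(addr))
```

In this sketch, first-touch (cold) misses fall into the non-conflict category, matching the two-way split in the abstract. Because only part of the tag is stored, two different lines whose tags share the same low bits can alias in the table, which is one way a partial-tag scheme like this can misclassify a miss.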