| Divide-and-conquer: a bubble replacement for low level caches |
| Full text |
Pdf
(651 KB)
|
Source
|
International Conference on Supercomputing
archive
Proceedings of the 23rd international conference on Supercomputing
table of contents
Yorktown Heights, NY, USA
SESSION: Cache enhancement techniques
table of contents
Pages 80-89
Year of Publication: 2009
ISBN:978-1-60558-498-0
|
|
Authors
|
|
Chuanjun Zhang
|
University of Missouri Kansas City, Kansas City, MO, USA
|
|
Bing Xue
|
University of Missouri Kansas City, Kansas City, MO, USA
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 36, Downloads (12 Months): 94, Citation Count: 0
|
|
|
ABSTRACT
The widely used LRU replacement policy suffers from the following problems. First, LRU does not exploit fre-quency information of cache accesses. Second, LRU may experience cache thrashing when access to cache exhibits cyclic patterns and the cache capacity is less than the working set. Finally, LRU is expensive to implement in hardware. We propose a bubble replacement for low-level caches, where cache blocks in one set are arranged in a queue for replacement determination. An incoming block enters the queue from the bottom and exchanges its posi-tion with the block above when the block hits, therefore, both recency and frequency information of a program are exploited. A victim block can be chosen from either the bottom or the top block of the queue, which is controlled by a single-bit set-hit flag per set. Choosing the bottom block as the victim makes the bubble replacement resistant to less frequently used blocks from polluting the cache while choosing the top block as the victim makes the bub-ble replacement adaptable to changes in the working set. We also propose to divide the blocks in a cache set into groups where each group implements the bubble replace-ment (we name it the DC-Bubble) to resolve the problems of the bubble replacement. The victim block is chosen ran-domly from the bottom block of each group. The proposed DC-Bubble reduces the average MPKI of the baseline 1MB 16-way L2 cache by 14%, bridges 47% of the gap between LRU and the OPT, reduces the storage require-ment by 61% and simplifies the circuit design compared to LRU.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
L. A. Belady. A study of replacement algorithms for virtual storage computers. In IBM Systems Journal, pp. 78--101, 1966.
|
| |
2
|
D. Burger and T. M. Austin. The SimpleScalar Tool Set, Univ. of Wisconsin-Madison. Technical Report #1342, June 1997.
|
 |
3
|
|
 |
4
|
|
 |
5
|
|
| |
6
|
|
 |
7
|
|
 |
8
|
Jude A. Rivers , Edward S. Tam , Gary S. Tyson , Edward S. Davidson , Matt Farrens, Utilizing reuse information in data cache management, Proceedings of the 12th international conference on Supercomputing, p.449-456, July 1998, Melbourne, Australia
[doi> 10.1145/277830.277941]
|
| |
9
|
C. T. Weaver. Pre-compiled SPEC2000 Alpha Binaries. Available at: http://www.simplescalar.org.
|
 |
10
|
Donghee Lee , Jongmoo Choi , Jong-Hun Kim , Sam H. Noh , Sang Lyul Min , Yookun Cho , Chong Sang Kim, On the existence of a spectrum of policies that subsumes the least recently used (LRU) and least frequently used (LFU) policies, Proceedings of the 1999 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, p.134-143, May 01-04, 1999, Atlanta, Georgia, United States
|
| |
11
|
|
| |
12
|
W. Wong and J.-L. Baer, Modified LRU Policies for Improving Second-Level Cache Behavior, In HPCA-6, 2000.
|
 |
13
|
David A. Wood , Mark D. Hill , R. E. Kessler, A model for estimating trace-sample miss ratios, Proceedings of the 1991 ACM SIGMETRICS conference on Measurement and modeling of computer systems, p.79-89, May 21-24, 1991, San Diego, California, United States
|
| |
14
|
|
| |
15
|
|
| |
16
|
|
| |
17
|
|
| |
18
|
|
 |
19
|
|
 |
20
|
|
 |
21
|
Moinuddin K. Qureshi , Aamer Jaleel , Yale N. Patt , Simon C. Steely , Joel Emer, Adaptive insertion policies for high performance caching, Proceedings of the 34th annual international symposium on Computer architecture, June 09-13, 2007, San Diego, California, USA
|
 |
22
|
|
 |
23
|
|
| |
24
|
|
| |
25
|
|
| |
26
|
Standard Performance Evaluation Corporation, http://www.specbench.org/osg/cpu2000/.
|
| |
27
|
W. Wong and J.-L. Baer, Modified LRU Policies for Improving Second-Level Cache Behavior, In HPCA-6, 2000.
|
| |
28
|
|
|