Dynamic cache clustering for chip multiprocessors
International Conference on Supercomputing
Proceedings of the 23rd International Conference on Supercomputing
Yorktown Heights, NY, USA
SESSION: Cache enhancement techniques
Pages 56-67
Year of Publication: 2009
ISBN: 978-1-60558-498-0
ABSTRACT
This paper proposes DCC (Dynamic Cache Clustering), a novel distributed cache management scheme for large-scale chip multiprocessors. With DCC, each core has a cache cluster made up of a number of L2 cache banks, and clusters are constructed, expanded, and contracted dynamically to match each core's cache demand. Varying the on-chip cache clusters trades off two metrics: average L2 access latency and L2 miss rate. DCC uniquely and efficiently optimizes both metrics and continuously tracks a near-optimal cache organization among the many possible configurations. Results from a full-system simulator demonstrate that DCC outperforms alternative L2 cache designs.
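The latency versus miss-rate trade-off described in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions and not the mechanism from the paper: the cluster sizes, the latency and miss-rate formulas, every constant, and the names `avg_l2_latency`, `l2_miss_rate`, `avg_access_cost`, and `resize_step` are all hypothetical. It shows one way a per-core cluster could be expanded or contracted one step at a time toward the size with the lowest estimated average access cost.

```python
# Toy model only: every constant and function name here is an assumption
# made for illustration, not a value or mechanism taken from the paper.

CLUSTER_SIZES = [1, 2, 4, 8, 16]  # hypothetical: L2 banks per cluster


def avg_l2_latency(banks, base_cycles=10.0, hop_cycles=3.0):
    """Assumed latency model: a larger cluster means more network hops."""
    avg_hops = (banks ** 0.5) - 1.0  # rough square-patch-on-a-mesh estimate
    return base_cycles + hop_cycles * avg_hops


def l2_miss_rate(banks, working_set_banks):
    """Assumed miss model: misses fall as the cluster covers the working set."""
    covered = min(1.0, banks / working_set_banks)
    return 0.02 + 0.30 * (1.0 - covered)


def avg_access_cost(banks, working_set_banks, memory_cycles=300.0):
    """Average cycles per L2 access: hit latency + miss rate * miss penalty."""
    miss_cost = l2_miss_rate(banks, working_set_banks) * memory_cycles
    return avg_l2_latency(banks) + miss_cost


def resize_step(current, working_set_banks):
    """Expand, contract, or keep the cluster, whichever costs least."""
    i = CLUSTER_SIZES.index(current)
    candidates = CLUSTER_SIZES[max(0, i - 1): i + 2]
    return min(candidates, key=lambda b: avg_access_cost(b, working_set_banks))
```

In this model a core with a small working set contracts toward a single bank because extra banks only add latency, while a core with a large working set expands because the saved misses outweigh the extra hops. Calling `resize_step` repeatedly walks the cluster toward the lowest-cost size for the assumed demand.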