ABSTRACT
The common approach to reducing cache conflicts is to increase associativity. From a dynamic power perspective, this associativity comes at a high cost. In this paper we present a miss-ratio and dynamic-power comparison of set-associative caches, a skewed cache, and a newly proposed organization, the elbow cache. The elbow cache extends the skewed cache organization with a relocation strategy for conflicting blocks. We show that these skewed designs significantly reduce conflict problems while consuming up to 56% less dynamic power than a comparably performing 8-way set-associative cache. We believe this to be the strongest case in favor of skewed caches presented so far.
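The relocation idea can be illustrated with a toy model. The sketch below is not the design evaluated in the paper: the two skewing functions, the timestamp-based victim choice, the single-step relocation rule, and all names (`SkewedCache`, `count_misses`) are assumptions made for illustration only. It models a two-bank skewed cache in which, on a conflict, a resident block may be moved to its slot in the other bank instead of being evicted.

```python
class SkewedCache:
    """Toy 2-bank skewed cache; relocate=True adds an elbow-style move.

    All policies here are illustrative assumptions, not the paper's design.
    """

    def __init__(self, sets_per_bank=4, relocate=False):
        self.n = sets_per_bank
        self.relocate = relocate
        # banks[b][i] is None or (block_address, last_use_time)
        self.banks = [[None] * self.n for _ in range(2)]
        self.clock = 0
        self.misses = 0

    def index(self, bank, addr):
        # Hypothetical skewing functions: bank 0 uses the low bits,
        # bank 1 XORs them with the next group of bits.
        if bank == 0:
            return addr % self.n
        return (addr ^ (addr // self.n)) % self.n

    def _age(self, slot):
        bank, idx = slot
        entry = self.banks[bank][idx]
        return -1 if entry is None else entry[1]  # empty slots sort first

    def access(self, addr):
        """Return True on a hit, False on a miss (the block is then filled)."""
        self.clock += 1
        direct = [(b, self.index(b, addr)) for b in range(2)]
        for bank, idx in direct:
            entry = self.banks[bank][idx]
            if entry is not None and entry[0] == addr:
                self.banks[bank][idx] = (addr, self.clock)
                return True
        self.misses += 1
        # Default: overwrite the empty or least recently used direct slot.
        target = min(direct, key=self._age)
        if self.relocate and self._age(target) >= 0:
            # Both direct slots are full. Try to move one occupant to its
            # slot in the other bank, if that slot is empty or holds a
            # block older than the default victim.
            best = None
            for bank, idx in direct:
                occupant = self.banks[bank][idx]
                alt = (1 - bank, self.index(1 - bank, occupant[0]))
                if alt in direct:
                    continue  # would only overwrite the other candidate
                if self._age(alt) < self._age(target) and (
                    best is None or self._age(alt) < self._age(best[1])
                ):
                    best = ((bank, idx), alt)
            if best is not None:
                src, alt = best
                self.banks[alt[0]][alt[1]] = self.banks[src[0]][src[1]]
                target = src  # the freed slot receives the new block
        self.banks[target[0]][target[1]] = (addr, self.clock)
        return False


def count_misses(trace, relocate):
    cache = SkewedCache(sets_per_bank=4, relocate=relocate)
    for addr in trace:
        cache.access(addr)
    return cache.misses
```

On the short trace `[4, 1, 5, 0, 4]`, the plain skewed model evicts block 4 when block 0 arrives, while the relocating variant moves block 4 to its free slot in the other bank, so the final access to 4 hits. This only shows the mechanism; it says nothing about the miss ratios or power figures reported in the paper.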