A way-halting cache for low-energy high-performance systems
Source
International Symposium on Low Power Electronics and Design
Proceedings of the 2004 international symposium on Low power electronics and design
Newport Beach, California, USA
SESSION: Power optimizations for cache memory
Pages: 126-131
Year of Publication: 2004
ISBN: 1-58113-929-2
ABSTRACT
Caches account for much of a microprocessor system's power and energy consumption. We have developed a new cache architecture, called a way-halting cache, that reduces energy while imposing no performance overhead. Our way-halting cache is a four-way set-associative cache that stores the four lowest-order bits of all ways' tags in a fully associative memory, which we call the halt tag array. The lookup in the halt tag array is done in parallel with, and is no slower than, the set-index decoding. The halt tag array pre-determines which tags cannot match because their low-order four bits mismatch. Further accesses to ways with known mismatching tags are then halted, thus saving power. Our halt tag array has the additional feature of using static logic only, rather than the dynamic logic used in highly associative caches. We provide data from experiments on 17 benchmarks drawn from MediaBench and SPEC 2000, based on our layouts in 0.18 micron CMOS technology. On average, 55% savings of memory-access-related energy were obtained over a conventional four-way set-associative cache. We show that the energy savings are greater than those of previous methods, and nearly twice those of highly associative caches, while imposing no performance overhead and only 2% cache area overhead.
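The halting idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' circuit: the line size, the number of sets, and every name (`WayHaltingCache`, `fill`, `lookup`) are hypothetical, and the hardware's parallel halt-tag lookup is modeled as an ordinary sequential filter. Only the four-way associativity and the four low-order tag bits come from the abstract.

```python
HALT_BITS = 4    # from the abstract: four lowest-order tag bits
NUM_WAYS = 4     # from the abstract: four-way set-associative
OFFSET_BITS = 5  # assumption: 32-byte cache lines
INDEX_BITS = 7   # assumption: 128 sets

HALT_MASK = (1 << HALT_BITS) - 1


class WayHaltingCache:
    """Toy model: tracks tags only, no data, timing, or energy."""

    def __init__(self):
        num_sets = 1 << INDEX_BITS
        # Full tag per set and way; None marks an invalid line.
        self.tags = [[None] * NUM_WAYS for _ in range(num_sets)]
        # Halt tag array: low-order HALT_BITS of each stored tag.
        self.halt_tags = [[None] * NUM_WAYS for _ in range(num_sets)]

    @staticmethod
    def split(address):
        """Return (set index, tag) for an address."""
        index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
        tag = address >> (OFFSET_BITS + INDEX_BITS)
        return index, tag

    def fill(self, address, way):
        """Install the line for `address` in the given way."""
        index, tag = self.split(address)
        self.tags[index][way] = tag
        self.halt_tags[index][way] = tag & HALT_MASK

    def lookup(self, address):
        """Return (hit way or None, number of ways not halted)."""
        index, tag = self.split(address)
        halt = tag & HALT_MASK
        # A way whose halt tag differs cannot hit, so it is halted
        # and its full tag is never compared.
        active = [w for w in range(NUM_WAYS)
                  if self.halt_tags[index][w] == halt]
        hit_way = next((w for w in active
                        if self.tags[index][w] == tag), None)
        return hit_way, len(active)


cache = WayHaltingCache()
cache.fill(0x12345678, way=2)
print(cache.lookup(0x12345678))  # hit in way 2, one way accessed
```

The second element of the returned pair counts the ways that would still be accessed; a conventional four-way cache would access all four on every lookup. A halt-tag match does not guarantee a hit, so the full tag comparison is still performed on the ways that remain.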