ABSTRACT
Lower threshold voltages in deep submicron technologies cause more leakage current, increasing static power dissipation. This trend, combined with the trend toward larger and more numerous cache memories dominating die area, has prompted circuit designers to develop SRAM cells with low-leakage operating modes (e.g., sleep mode). Sleep mode reduces static power dissipation, but data stored in a sleeping cell is unreliable or lost. So, at the architecture level, there is interest in exploiting sleep mode to reduce static power dissipation while maintaining high performance.

Current approaches dynamically control the operating mode of large groups of cache lines or even individual cache lines. However, the performance monitoring mechanism that controls the percentage of sleep-mode lines, and identifies particular lines for sleep mode, is somewhat arbitrary. There is no way to know what the performance could be with all cache lines active, so arbitrary miss rate targets are set (perhaps on a per-benchmark basis using profile information), and the control mechanism tracks these targets. We propose applying sleep mode only to the data store and not the tag store. By keeping the entire tag store active, the hardware knows what the hypothetical miss rate would be if all data lines were active, and the actual miss rate can be made to precisely track it. Simulations show that an average of 73% of I-cache lines and 54% of D-cache lines are put in sleep mode with an average IPC impact of only 1.7%, for 64 KB caches.
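
To illustrate why an always-active tag store exposes the hypothetical miss rate, here is a minimal sketch of a direct-mapped cache model. It is not the authors' design or control policy: the class `TagAwakeCache`, its method names, the direct-mapped organization, and the 32-byte line size are all assumptions made for illustration. The only idea taken from the abstract is that tags stay readable while data lines may sleep, so a tag match on a sleeping line can be counted separately from a miss that would occur even with every data line active.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Line:
    tag: Optional[int] = None  # tag store entry, always readable
    asleep: bool = False       # state of the data store entry only


class TagAwakeCache:
    """Hypothetical direct-mapped cache whose tag store never sleeps."""

    def __init__(self, num_lines: int, line_bytes: int = 32) -> None:
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.lines = [Line() for _ in range(num_lines)]
        self.accesses = 0
        self.hypothetical_misses = 0  # tag mismatch: a miss even if all data were active
        self.sleep_misses = 0         # tag match, but the data line was asleep

    def access(self, addr: int) -> str:
        block = addr // self.line_bytes
        index = block % self.num_lines
        tag = block // self.num_lines
        line = self.lines[index]
        self.accesses += 1
        if line.tag != tag:
            # Ordinary miss: refill the line and wake its data store.
            self.hypothetical_misses += 1
            line.tag = tag
            line.asleep = False
            return "miss"
        if line.asleep:
            # Extra miss caused purely by sleep mode: refetch and wake.
            self.sleep_misses += 1
            line.asleep = False
            return "sleep_miss"
        return "hit"

    def sleep(self, index: int) -> None:
        """Put one data line into sleep mode; its tag is left untouched."""
        self.lines[index].asleep = True

    @property
    def actual_misses(self) -> int:
        return self.hypothetical_misses + self.sleep_misses


if __name__ == "__main__":
    cache = TagAwakeCache(num_lines=4)
    cache.access(0)   # cold miss
    cache.access(0)   # hit
    cache.sleep(0)    # data line 0 sleeps, tag stays valid
    cache.access(0)   # tag matches, data asleep
    cache.access(0)   # hit again after wake-up
    print(cache.hypothetical_misses, cache.actual_misses)  # 1 2
```

In this sketch, the gap between `actual_misses` and `hypothetical_misses` is exactly the cost of sleep mode, which is the quantity a controller could try to keep small. How the paper's mechanism selects lines for sleep and tracks the target is not specified in the abstract and is not modeled here.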