HitME: low power Hit MEmory buffer for embedded systems
Source: Proceedings of the 2009 Asia and South Pacific Design Automation Conference, Yokohama, Japan
Session: System level architectures
Pages: 335-340
Year of Publication: 2009
ISBN: 978-1-4244-2748-2
Publisher: IEEE Press, Piscataway, NJ, USA
ABSTRACT
In this paper, we present a novel HitME (Hit-MEmory) buffer to reduce the energy consumption of the memory hierarchy in embedded processors. The HitME buffer is a small direct-mapped cache memory that is added to an existing cache memory hierarchy. The HitME buffer is loaded only when there is a hit on the L1 cache. Otherwise, the L1 cache is updated from memory and the processor's memory request is served directly from the L1 cache. The strategy works because 90% of memory accesses are to locations accessed only once, and these often pollute the cache. Energy reduction is achieved by reducing the number of accesses to the L1 cache memory. Experimental results show that the HitME buffer reduces the number of L1 cache accesses, which lowers the cache system energy consumption by an average of 60.9% compared to a traditional L1 cache memory architecture and by 6.4% compared to a filter cache architecture, for 70nm cache technology.
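The load policy described in the abstract can be illustrated with a small trace-driven sketch. This is not the authors' simulator. The class and function names, the cache sizes, and the assumption that the HitME buffer is probed before the L1 cache are all illustrative choices made here. Only tags are modeled, and the buffer is allowed to keep a line after L1 evicts it, which is a simplification.

```python
class DirectMappedCache:
    """Tag store of a direct-mapped cache; data payloads are not modeled."""

    def __init__(self, num_lines, line_size):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines

    def _index_tag(self, addr):
        block = addr // self.line_size
        return block % self.num_lines, block // self.num_lines

    def lookup(self, addr):
        index, tag = self._index_tag(addr)
        return self.tags[index] == tag

    def fill(self, addr):
        index, tag = self._index_tag(addr)
        self.tags[index] = tag


def simulate(trace, l1, hitme):
    """Count accesses under a HitME-style policy (hypothetical model)."""
    counts = {"hitme_hits": 0, "l1_accesses": 0, "l1_hits": 0, "l1_misses": 0}
    for addr in trace:
        # Assumption: the small buffer is probed first, so a buffer hit
        # avoids an L1 access entirely.
        if hitme.lookup(addr):
            counts["hitme_hits"] += 1
            continue
        counts["l1_accesses"] += 1
        if l1.lookup(addr):
            # The buffer is loaded only on an L1 hit.
            counts["l1_hits"] += 1
            hitme.fill(addr)
        else:
            # On an L1 miss, L1 is updated from memory and serves the
            # request; the buffer is left untouched.
            counts["l1_misses"] += 1
            l1.fill(addr)
    return counts


if __name__ == "__main__":
    l1 = DirectMappedCache(num_lines=4, line_size=16)
    hitme = DirectMappedCache(num_lines=2, line_size=16)
    print(simulate([0, 0, 0, 0, 64, 0], l1, hitme))
```

In this toy trace, address 0 is reused while address 64 is touched once. The single-use access misses in L1 and is never loaded into the buffer, so the reused line stays in the buffer and the final access to address 0 is served without touching L1.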