ABSTRACT
This article presents a methodology for virtual memory support in energy-efficient embedded systems. A holistic approach is proposed, in which the combined efforts of the compiler, operating system, and hardware architecture achieve significant system power reductions. The application information extracted and analyzed by the compiler is utilized dynamically by the microarchitecture and the operating system to perform energy-efficient and, for many memory references, time-deterministic address translations. We demonstrate that by using application information regarding virtual memory layout, an efficient and conflict-free translation process can be implemented through a small hardware direct translation table (DTT) accessed in an application-specific manner. The set of virtual pages is partitioned into groups, such that for each group only a few of the least significant bits are used as an index to obtain the physical page number. We outline an efficient compile-time algorithm for identifying these groups and allocating their translation entries optimally into the DTT. The introduced hardware is minimal in terms of area, performance, and power overhead, while offering the flexibility of software programmability. This is achieved through a small set of registers and tables, which are made software accessible. We have quantitatively evaluated the proposed methodology on a number of embedded applications, including voice, image, and video processing.
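The group-indexed lookup described above can be illustrated with a minimal sketch. It is not the article's implementation: the 4 KiB page size, the `Group` record, the helper names `build_dtt` and `translate`, and the rule that picks the smallest number of index bits per group are all assumptions made for illustration. The abstract specifies only that each group is indexed by a few least significant bits of the virtual page number and that the lookup is conflict-free.

```python
from dataclasses import dataclass

PAGE_BITS = 12  # assumption: 4 KiB pages


@dataclass
class Group:
    index_bits: int   # number of low VPN bits used as the DTT index
    base: int         # first DTT slot reserved for this group
    vpns: frozenset   # virtual page numbers that belong to the group


def build_dtt(group_pages, page_map):
    """Lay out one DTT region per group (hypothetical helper).

    group_pages: list of lists of virtual page numbers (VPNs)
    page_map:    dict mapping VPN -> physical page number (PPN)
    """
    dtt, groups = [], []
    for pages in group_pages:
        vpns = sorted(set(pages))
        # Smallest k such that the low k bits tell all pages apart,
        # which makes the lookup conflict-free within the group.
        k = 0
        while len({v & ((1 << k) - 1) for v in vpns}) < len(vpns):
            k += 1
        base = len(dtt)
        dtt.extend([None] * (1 << k))
        for v in vpns:
            dtt[base + (v & ((1 << k) - 1))] = page_map[v]
        groups.append(Group(k, base, frozenset(vpns)))
    return dtt, groups


def translate(vaddr, dtt, groups):
    """Translate a virtual address; None means 'not covered by the DTT'."""
    vpn = vaddr >> PAGE_BITS
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    for g in groups:
        if vpn in g.vpns:
            ppn = dtt[g.base + (vpn & ((1 << g.index_bits) - 1))]
            return (ppn << PAGE_BITS) | offset
    return None
```

In this sketch a software set-membership test stands in for whatever group-selection mechanism the hardware registers provide, and a `None` result stands in for falling back to a conventional translation path. Both are placeholders rather than details taken from the article.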