Compiler-decided dynamic memory allocation for scratch-pad based embedded systems
Source: International Conference on Compilers, Architecture and Synthesis for Embedded Systems
Proceedings of the 2003 International Conference on Compilers, Architecture and Synthesis for Embedded Systems
San Jose, California, USA
Session: Memory hierarchy
Pages: 276 - 286  
Year of Publication: 2003
ISBN:1-58113-676-5
Authors
Sumesh Udayakumaran  University of Maryland, College Park, MD
Rajeev Barua  University of Maryland, College Park, MD
Sponsors
ACM: Association for Computing Machinery
SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 12,   Downloads (12 Months): 122,   Citation Count: 46
DOI: http://doi.acm.org/10.1145/951710.951747

ABSTRACT

This paper presents a highly predictable, low-overhead, and yet dynamic memory allocation strategy for embedded systems with scratch-pad memory. A scratch-pad is a fast, compiler-managed SRAM that replaces the hardware-managed cache. It is motivated by its better real-time guarantees compared with a cache and by its significantly lower overheads in energy consumption, area, and overall runtime, even with a simple allocation scheme [4].

Existing scratch-pad allocation methods are of two types. First, software-caching schemes emulate the workings of a hardware cache in software: instructions are inserted before each load/store to check the software-maintained cache tags. Such methods incur large overheads in runtime, code size, energy consumption, and SRAM space for tags, and, just like hardware caches, deliver poor real-time guarantees. A second category of algorithms partitions variables at compile time into the two memory banks (scratch-pad and main memory). For example, our previous work [3] derives a provably optimal static allocation for global and stack variables and achieves a speedup over all earlier methods. However, a drawback of such static allocation schemes is that they do not account for dynamic program behavior. It is easy to see why a data allocation that never changes at runtime cannot achieve the full locality benefits of a cache.

In this paper we present a dynamic allocation method for global and stack data that, for the first time, (i) accounts for changing program requirements at runtime, (ii) has no software-caching tags, (iii) requires no run-time checks, (iv) has extremely low overheads, and (v) yields 100% predictable memory access times. In this method, data that is about to be accessed frequently is copied into the SRAM using compiler-inserted code at fixed and infrequent points in the program. Earlier data is evicted if necessary.

When compared to a provably optimal static allocation, our results show runtime reductions ranging from 11% to 38%, averaging 31.2%, using no additional hardware support. With hardware support for pseudo-DMA and full DMA, which is already provided in some commercial systems, the runtime reductions increase to 33.4% and 34.2%, respectively.
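
The contrast between a static allocation and the dynamic scheme described above can be illustrated with a toy cost model. The sketch below is not the paper's algorithm: the latencies, the copy cost, the greedy "hottest per word first" policy, and every name in it (`static_cost`, `dynamic_cost`, the variables `a`, `b`, `c`) are assumptions made for illustration. It only shows the general idea that scratch-pad contents change at a few fixed program points (here, region boundaries), with copy-in and eviction costs charged at those points and no per-access checks.

```python
# Toy cost model. All constants and the greedy policy are illustrative
# assumptions, not values or methods taken from the paper.
SRAM_CYCLES = 1            # assumed cycles per scratch-pad access
DRAM_CYCLES = 10           # assumed cycles per main-memory access
COPY_CYCLES_PER_WORD = 2   # assumed cycles to move one word between memories


def access_cycles(accesses, resident):
    """Cycles for one region, given the set of variables in the scratch-pad."""
    return sum(count * (SRAM_CYCLES if var in resident else DRAM_CYCLES)
               for var, count in accesses.items())


def static_cost(regions, resident):
    """Total cycles when the scratch-pad contents never change."""
    return sum(access_cycles(accesses, resident) for accesses in regions)


def dynamic_cost(regions, sizes, capacity):
    """Total cycles when contents are updated only at region boundaries."""
    resident = set()
    total = 0
    for accesses in regions:
        keep = set()
        free = capacity
        # Consider the variables with the most accesses per word first.
        hot_first = sorted(accesses, key=lambda v: accesses[v] / sizes[v],
                           reverse=True)
        for var in hot_first:
            if sizes[var] > free:
                continue
            copy_in = 0 if var in resident else sizes[var] * COPY_CYCLES_PER_WORD
            saving = accesses[var] * (DRAM_CYCLES - SRAM_CYCLES)
            if saving > copy_in:          # copy in only when it pays off
                total += copy_in
                keep.add(var)
                free -= sizes[var]
        # Earlier data stays if room remains; otherwise it is evicted
        # (pessimistically charged a full write-back).
        for var in sorted(resident - keep):
            if sizes[var] <= free:
                keep.add(var)
                free -= sizes[var]
            else:
                total += sizes[var] * COPY_CYCLES_PER_WORD
        resident = keep
        total += access_cycles(accesses, resident)
    return total


# Hypothetical program: two regions with different hot arrays.
sizes = {"a": 64, "b": 64, "c": 32}                        # sizes in words
regions = [{"a": 1000, "c": 100}, {"b": 1000, "c": 100}]   # access counts
static = static_cost(regions, resident={"a", "c"})
dynamic = dynamic_cost(regions, sizes, capacity=96)
print(static, dynamic)   # prints: 11200 2648
```

In this made-up example the static allocation keeps `a` and `c` resident and pays main-memory latency for every access to `b` in the second region, while the dynamic variant pays a one-time copy and eviction cost at the region boundary. The size of the gap is an artifact of the assumed constants and says nothing about the measured results reported in the abstract.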


REFERENCES

 
[5] L. A. Belady. A study of replacement algorithms for virtual storage. IBM Systems Journal, 5:78-101, 1966.
[8] David Brash. The ARM architecture Version 6 (ARMv6). ARM Ltd., January 2002. White Paper.
[10] Document No. ARM DDI 0084D, ARM Ltd. ARM7TDMI-S Data sheet, October 1998.
[14] ILOG Corporation. The CPLEX optimization suite. http://www.ilog.com/products/cplex/, 2001.
[17] Matlab 6.1. The Math Works, Inc., 2001. http://www.mathworks.com/products/matlab/.
[20] CPU12 Reference Manual. Motorola Corporation, 2000. (A 16-bit processor). http://e-www.motorola.com/brdata/PDFDB/MICROCONTROLLERS/16 BIT/68HC12FAMILY/REFMAT/CPU12RM.pdf.
[21] M-CORE - MMC2001 Reference Manual. Motorola Corporation, 1998. (A 32-bit processor). http://www.motorola.com/SPS/MCORE/infodocumentation.htm.
[23] Jan Sjodin, Bo Froderberg, and Thomas Lindgren. Allocation of Global Data Objects in On-Chip RAM. Compiler and Architecture Support for Embedded Computing Systems, December 1998.
[26] TMS370Cx7x 8-bit microcontroller. Texas Instruments, Revised Feb. 1997. http://www-s.ti.com/sc/psheets/spns034c/spns034c.pdf.
