|
ABSTRACT
Efficient utilization of on-chip memory space is extremely important in modern embedded system applications based on processor cores. In addition to a data cache that interfaces with slower off-chip memory, a fast on-chip SRAM, called Scratch-Pad memory, is often used in several applications, so that critical data can be stored there with a guaranteed fast access time. We present a technique for efficiently exploiting on-chip Scratch-Pad memory by partitioning the application's scalar and arrayed variables into off-chip DRAM and on-chip Scratch-Pad SRAM, with the goal of minimizing the total execution time of embedded applications. We also present extensions of our proposed memory assignment strategy to handle context switching between multiple programs, as well as a generalized memory hierarchy. Our experiments on code kernels from typical applications show that our technique results in significant performance improvements.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
AHMAD, I. AND CHEN, C. Y. R. 1991. Post-processor for data path synthesis using multiport memories. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '91, Santa Clara, CA, Nov. 11-14, 1991), IEEE Computer Society Press, Los Alamitos, CA, 276-279.
|
| |
2
|
Alfred V. Aho , Ravi Sethi , Jeffrey D. Ullman, Compilers: principles, techniques, and tools, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1986
|
| |
3
|
|
| |
4
|
BALAKRISHNAN, M., BANERJI, D. K., MAJUMDAR, A. K., LINDERS, J. G., AND MAJITHIA, J. C. 1990. Allocation of multiport memories in data path synthesis. IEEE Trans. Comput.- Aided Des. 7, 4 (Apr. 1990), 536-540.
|
| |
5
|
|
| |
6
|
|
| |
7
|
|
 |
8
|
|
| |
9
|
|
 |
10
|
|
| |
11
|
|
 |
12
|
|
 |
13
|
Monica D. Lam , Edward E. Rothberg , Michael E. Wolf, The cache performance and optimizations of blocked algorithms, Proceedings of the fourth international conference on Architectural support for programming languages and operating systems, p.63-74, April 08-11, 1991, Santa Clara, California, United States
|
 |
14
|
Stan Liao , Srinivas Devadas , Kurt Keutzer , Steve Tjiang , Albert Wang, Code optimization techniques for embedded DSP microprocessors, Proceedings of the 32nd ACM/IEEE conference on Design automation, p.599-604, June 12-16, 1995, San Francisco, California, United States
[doi> 10.1145/217474.217596]
|
 |
15
|
Stan Liao , Srinivas Devadas , Kurt Keutzer , Steve Tjiang , Albert Wang, Storage assignment to decrease code size, Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation, p.186-195, June 18-21, 1995, La Jolla, California, United States
|
| |
16
|
P. E. R. Lippens , J. L. van Meerbergen , W. F. J. Verhaegh , A. van der Werf, Allocation of multiport memories for hierarchical data stream, Proceedings of the 1993 IEEE/ACM international conference on Computer-aided design, p.728-735, November 07-11, 1993, Santa Clara, California, United States
|
| |
17
|
LSI LOGIC CORPORATION. 1992. CW33000 MIPS Embedded Processor User's Manual. VLSI Technologies, Inc..
|
| |
18
|
MARGOLIN, B. 1997. Embedded systems to benefit from advances in dram technology. Comput. Des., 76-86.
|
| |
19
|
|
| |
20
|
|
 |
21
|
|
| |
22
|
|
| |
23
|
|
| |
24
|
|
| |
25
|
RAMACHANDRAN, L., GAJSKI, D., AND CHAIYAKUL, V. 1994. An algorithm for array variable clustering. In Proceedings of the European Conference on Design Automation (Feb. 1994),
|
| |
26
|
RAWAT, J. 1993. Static analysis of cache performance for real-time programming. Master's Thesis. Iowa State Univ., Ames, IA.
|
 |
27
|
Mazen A. R. Saghir , Paul Chow , Corinna G. Lee, Exploiting dual data-memory banks in digital signal processors, Proceedings of the seventh international conference on Architectural support for programming languages and operating systems, p.234-243, October 01-04, 1996, Cambridge, Massachusetts, United States
|
| |
28
|
|
| |
29
|
STOK, L. AND JESS, J. A. G. 1992. Foreground memory management in data path synthesis. Int. J. Circuits Theor. Appl. 20, 3, 235-255.
|
| |
30
|
|
| |
31
|
|
| |
32
|
|
| |
33
|
TSENG, C. AND SIEWIOREK, D. P. 1986. Automated synthesis of data paths in digital systems. IEEE Trans. Comput.-Aided Des. 5, 3 (July 1986), 379-395.
|
| |
34
|
TURLEY, J. L. 1994. New processor families join embedded fray. Microprocessor Report 8, 17 (Dec.), 1-8.
|
| |
35
|
VANHOOF, g., BOLSENS, I., AND MAN, H. D. 1991. Compiling multi-dimensional data streams into distributed DSP ASIC memory. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '91, Santa Clara, CA, Nov. 11-14, 1991), IEEE Computer Society Press, Los Alamitos, CA, 272-275.
|
 |
36
|
Ingrid M. Verbauwhede , Chris J. Scheers , Jan M. Rabaey, Memory estimation for high level synthesis, Proceedings of the 31st annual conference on Design automation, p.143-148, June 06-10, 1994, San Diego, California, United States
[doi> 10.1145/196244.196313]
|
| |
37
|
WILSON, R. 1997. Graphics IC vendors take a shot at embedded DRAM. Elec. Eng. Times 938 (Jan.), 41-57.
|
| |
38
|
|
CITED BY 28
|
|
P. R. Panda , F. Catthoor , N. D. Dutt , K. Danckaert , E. Brockmeyer , C. Kulkarni , A. Vandercappelle , P. G. Kjeldsberg, Data and memory optimization techniques for embedded systems, ACM Transactions on Design Automation of Electronic Systems (TODAES), v.6 n.2, p.149-206, April 2001
|
|
|
Oren Avissar , Rajeev Barua , Dave Stewart, Heterogeneous memory management for embedded systems, Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems, November 16-17, 2001, Atlanta, Georgia, USA
|
|
|
|
|
|
Takanori Okuma , Yun Cao , Masanori Muroyama , Hiroto Yasuura, Reducing access energy of on-chip data memory considering active data bitwidth, Proceedings of the 2002 international symposium on Low power electronics and design, August 12-14, 2002, Monterey, California, USA
|
|
|
Nghi Nguyen , Angel Dominguez , Rajeev Barua, Memory allocation for embedded systems with a compile-time-unknown scratch-pad size, Proceedings of the 2005 international conference on Compilers, architectures and synthesis for embedded systems, September 24-27, 2005, San Francisco, California, USA
|
|
|
Preeti Ranjan Panda , Nikil D. Dutt , Alexandru Nicolau , Francky Catthoor , Arnout Vandecappelle , Erik Brockmeyer , Chidamber Kulkarni , Eddy De Greef, Data Memory Organization and Optimizations in Application-Specific Systems, IEEE Design & Test, v.18 n.3, p.56-68, May 2001
|
|
|
Preeti Ranjan Panda , Nikil D. Dutt , Alexandru Nicolau , Francky Catthoor , Arnout Vandecappelle , Erik Brockmeyer , Chidamber Kulkarni , Eddy De Greef, Data Memory Organization and Optimizations in Application-Specific Systems, IEEE Design & Test, v.18 n.3, p.56-68, May 2001
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Nghi Nguyen , Angel Dominguez , Rajeev Barua, Scratch-pad memory allocation without compiler support for java applications, Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems, September 30-October 03, 2007, Salzburg, Austria
|
|
|
|
|
|
|
|
|
Mihir Choudhury , Kyle Ringgenberg , Scott Rixner , Kartik Mohanram, Interactive presentation: Single-ended coding techniques for off-chip interconnects to commodity memory, Proceedings of the conference on Design, automation and test in Europe, April 16-20, 2007, Nice, France
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Angel Dominguez , Nghi Nguyen , Rajeev K. Barua, Recursive function data allocation to scratch-pad memory, Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems, September 30-October 03, 2007, Salzburg, Austria
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
REVIEW
"Neil Robert Karl : Reviewer"
An algorithm is outlined that allocates program variables to fast
on-chip static random access memory (SRAM) in order to
optimize program CPU execution time. The algorithm needs to be
implemented in a precompiler for a target langu
more...
|