ABSTRACT
Code generation for embedded processors opens up the possibility of several performance optimization techniques that traditional compilers have ignored because of compilation time constraints. We present techniques that take the parameters of the data cache into account when organizing the scalar and array variables declared in embedded code in memory, with the objective of improving data cache performance. We present techniques for clustering variables to minimize compulsory cache misses, and for solving the memory assignment problem to minimize conflict cache misses. Our experiments with benchmark code kernels from DSP and other domains on the CW4001 embedded processor from LSI Logic indicate significant improvements in data cache performance from applying our memory organization techniques.
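To make the notion of a conflict miss concrete, the following minimal sketch simulates a direct-mapped cache and compares two placements of a pair of arrays that are read in lockstep. It is an illustration only, not the assignment algorithm presented in the paper: the cache size (1024 bytes), line size (16 bytes), element size (4 bytes), and the names `count_misses` and `lockstep_trace` are all assumptions chosen for the example.

```python
def count_misses(addresses, cache_size=1024, line_size=16):
    """Simulate a direct-mapped cache and return the number of misses.

    cache_size and line_size are illustrative values, not CW4001 parameters.
    """
    num_lines = cache_size // line_size
    tags = [None] * num_lines          # memory block currently held by each line
    misses = 0
    for addr in addresses:
        block = addr // line_size      # memory block containing this byte
        index = block % num_lines      # cache line that block maps to
        if tags[index] != block:
            misses += 1
            tags[index] = block
    return misses


def lockstep_trace(base_a, base_b, n, elem_size=4):
    """Byte addresses touched by a loop that reads a[i] and then b[i]."""
    trace = []
    for i in range(n):
        trace.append(base_a + i * elem_size)
        trace.append(base_b + i * elem_size)
    return trace


N = 64  # elements per array (hypothetical kernel size)

# Naive placement: b starts exactly one cache size after a, so a[i] and b[i]
# map to the same cache line and evict each other on every access.
naive = count_misses(lockstep_trace(0, 1024, N))

# Padded placement: shifting b by one cache line removes the overlap, leaving
# only the first-touch (compulsory) miss for each memory block.
padded = count_misses(lockstep_trace(0, 1024 + 16, N))

print("naive placement misses: ", naive)
print("padded placement misses:", padded)
```

Under these assumed parameters, the naive placement misses on all 128 accesses, while the padded placement misses only 32 times, once for each memory block the two arrays occupy. A memory assignment step that knows the cache parameters can choose base addresses like the padded one automatically.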