Optimizing the memory bandwidth with loop fusion
Source
International Conference on Hardware Software Codesign
Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Stockholm, Sweden
SESSION: Software and hardware techniques for performance optimisation of embedded applications
Pages: 188 - 193
Year of Publication: 2004
ISBN: 1-58113-937-3
Bibliometrics
Downloads (6 Weeks): 5, Downloads (12 Months): 17, Citation Count: 4
ABSTRACT
Memory bandwidth largely determines the performance and energy cost of embedded systems. At the compiler level, several techniques improve the use of memory bandwidth within the scope of a basic block, but they often fail to exploit all of the available bandwidth. We propose a technique that optimizes the memory bandwidth across basic-block boundaries. Our technique incrementally fuses loops to make better use of the available bandwidth. The resulting performance depends on how the data is assigned to the memories of the memory layer. At the same time, this assignment strongly influences the energy cost. We therefore combine the fusion and assignment decisions in our approach. Designers can use our output to trade off energy cost against system performance.
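To make the interaction between fusion and assignment concrete, the following is a minimal sketch of a toy cycle model. It is not the authors' algorithm. It assumes that each memory serves one access per cycle, that different memories operate in parallel, and that the two loops have the same trip count and no dependences that would forbid fusing them. The names `body_cycles`, `total_cycles`, `mem0`, and `mem1` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def body_cycles(arrays_accessed, assignment):
    """Cycles for one loop iteration in the toy model.

    Assumption: each memory serves one access per cycle and all
    memories work in parallel, so the busiest memory sets the cost.
    """
    per_memory = Counter(assignment[name] for name in arrays_accessed)
    return max(per_memory.values())


def total_cycles(loops, assignment, iterations):
    """Sum the cost of every loop, each running `iterations` times."""
    return sum(body_cycles(body, assignment) * iterations for body in loops)


# Each loop body is listed as the arrays it touches per iteration.
loop1 = ["A", "A"]        # e.g. A[i] = A[i] + 1: one read, one write of A
loop2 = ["B", "B"]        # e.g. B[i] = B[i] * 2: one read, one write of B

separate = [loop1, loop2]  # two loops executed one after the other
fused = [loop1 + loop2]    # a single loop containing both bodies

split_assignment = {"A": "mem0", "B": "mem1"}   # arrays in different memories
shared_assignment = {"A": "mem0", "B": "mem0"}  # arrays in the same memory

n = 100
print(total_cycles(separate, split_assignment, n))   # 400
print(total_cycles(fused, split_assignment, n))      # 200
print(total_cycles(fused, shared_assignment, n))     # 400
```

In this model, fusion halves the cycle count only when the two arrays live in different memories. With both arrays in one memory the fused loop is no faster, which is one way to see why the fusion and assignment decisions are worth making together.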
CITED BY 4
Youcef Bouchebaba, Bruno Girodias, Gabriela Nicolescu, El Mostapha Aboulhamid, Bruno Lavigueur, Pierre Paulin, MPSoC memory optimization using program transformation, ACM Transactions on Design Automation of Electronic Systems (TODAES), v.12 n.4, p.43-es, September 2007