Optimizing the memory bandwidth with loop fusion
Source
International Conference on Hardware Software Codesign
Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Stockholm, Sweden
SESSION: Software and hardware techniques for performance optimisation of embedded applications
Pages: 188 - 193
Year of Publication: 2004
ISBN: 1-58113-937-3
Authors
Paul Marchal  IMEC/KULEUVEN, Heverlee, Belgium
José Ignacio Gómez  DACYA U.C.M., Madrid, Spain
Francky Catthoor  IMEC/KULEUVEN, Heverlee, Belgium
Sponsors
SIGDA: ACM Special Interest Group on Design Automation
SIGBED: ACM Special Interest Group on Embedded Systems
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1016720.1016767

ABSTRACT

Memory bandwidth largely determines the performance and energy cost of embedded systems. At the compiler level, several techniques improve the use of memory bandwidth within the scope of a basic block, but they often fail to exploit all of the available bandwidth. We propose a technique that optimizes memory bandwidth across basic block boundaries by incrementally fusing loops to make better use of the available bandwidth. The resulting performance depends on how the data is assigned to the memories of the memory layer, and this assignment also strongly influences the energy cost. Our approach therefore combines the fusion and assignment decisions. Designers can use our output to trade off energy cost against system performance.
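
The interplay between loop fusion and data assignment described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the cost model (each memory serves one access per cycle, so a loop body takes as many cycles as its busiest memory has accesses), the array names a, b, c, d, and the helper functions are all assumptions made for illustration, and energy is not modeled.

```python
from collections import Counter


def body_cycles(accessed_arrays, assignment):
    """Toy model (an assumption, not from the paper): every memory serves
    one access per cycle, so one loop iteration needs as many cycles as
    the busiest memory has accesses."""
    per_memory = Counter(assignment[name] for name in accessed_arrays)
    return max(per_memory.values())


def total_cycles(loops, assignment, n):
    """Sum the cycles of a sequence of loops, each running n iterations."""
    return sum(n * body_cycles(accesses, assignment) for accesses in loops)


def run_separate(a, c):
    """Two independent loops, executed one after the other."""
    b = [0] * len(a)
    d = [0] * len(c)
    for i in range(len(a)):
        b[i] = 2 * a[i]
    for i in range(len(c)):
        d[i] = c[i] + 1
    return b, d


def run_fused(a, c):
    """The same computation after fusion (assumes len(a) == len(c))."""
    b = [0] * len(a)
    d = [0] * len(c)
    for i in range(len(a)):
        b[i] = 2 * a[i]
        d[i] = c[i] + 1
    return b, d


N = 8
separate_loops = [["a", "b"], ["c", "d"]]  # arrays touched per loop body
fused_loops = [["a", "b", "c", "d"]]       # arrays touched by the fused body

# Hypothetical assignments of arrays to two single-ported memories.
clustered = {"a": 0, "b": 0, "c": 1, "d": 1}
spread = {"a": 0, "b": 1, "c": 0, "d": 1}

cycles_separate_clustered = total_cycles(separate_loops, clustered, N)
cycles_fused_clustered = total_cycles(fused_loops, clustered, N)
cycles_separate_spread = total_cycles(separate_loops, spread, N)
cycles_fused_spread = total_cycles(fused_loops, spread, N)
```

Under the clustered assignment, each separate loop leaves one memory idle, and fusion halves the cycle count in this model. Under the spread assignment, the separate loops already use both memories, so fusion gains nothing. This is why, in a model like this one, the fusion and assignment decisions cannot be evaluated independently.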

