Profitable loop fusion and tiling using model-driven empirical search
Source: International Conference on Supercomputing
Proceedings of the 20th Annual International Conference on Supercomputing
Cairns, Queensland, Australia
Session: Memory
Pages: 249-258
Year of Publication: 2006
ISBN: 1-59593-282-8
Authors
Apan Qasem, Rice University, Houston, TX
Ken Kennedy, Rice University, Houston, TX
Sponsors
SIGARCH: ACM Special Interest Group on Computer Architecture
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1183401.1183437

ABSTRACT

Loop fusion and tiling are both recognized as effective transformations for improving the memory performance of scientific applications. However, because they are sensitive to the underlying cache architecture and interact with each other, it is difficult to determine a good heuristic for applying them profitably across architectures. In this paper, we present a model-guided empirical tuning strategy for the profitable application of loop fusion and tiling. Our strategy is built on a detailed cost model that characterizes the interaction between the two transformations at different levels of the memory hierarchy. The novelty of our approach lies in exposing key architectural parameters within the model for automatic tuning through empirical search. Preliminary experiments with a set of applications on four different platforms show that our strategy achieves significant performance improvement over fully optimized code generated by state-of-the-art commercial compilers. The time spent searching for the best parameters is considerably less than with other search strategies.
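
For readers unfamiliar with the two transformations, the following is a minimal illustrative sketch, not the authors' implementation. It shows loop fusion, loop tiling, and a toy search over tile sizes in which a simple capacity rule prunes the candidates before any timing is done. Every name here (`fused`, `transpose_tiled`, `candidate_tiles`, `search_tile`) is hypothetical, and the pruning rule "the tiles of all arrays must fit in a given cache budget" is an assumption standing in for the paper's cost model, which the abstract does not spell out.

```python
import time


def unfused(a, b):
    """Two separate loops: a and c are each traversed twice."""
    n = len(a)
    c = [0.0] * n
    d = [0.0] * n
    for i in range(n):
        c[i] = a[i] + b[i]
    for i in range(n):
        d[i] = a[i] * c[i]
    return c, d


def fused(a, b):
    """Loop fusion: one pass, so a[i] and c[i] are reused immediately."""
    n = len(a)
    c = [0.0] * n
    d = [0.0] * n
    for i in range(n):
        c[i] = a[i] + b[i]
        d[i] = a[i] * c[i]
    return c, d


def transpose_tiled(m, tile):
    """Loop tiling: visit a square matrix in tile x tile blocks."""
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for i in range(ii, min(ii + tile, n)):
                for j in range(jj, min(jj + tile, n)):
                    out[j][i] = m[i][j]
    return out


def candidate_tiles(cache_bytes, elem_bytes=8, arrays=2):
    """Assumed pruning rule (not the paper's model): keep power-of-two
    tile sizes whose combined working set fits in cache_bytes."""
    tiles = []
    t = 2
    while arrays * t * t * elem_bytes <= cache_bytes:
        tiles.append(t)
        t *= 2
    return tiles


def search_tile(m, candidates):
    """Empirical search: time each candidate and return the fastest."""
    best_tile, best_time = None, float("inf")
    for tile in candidates:
        start = time.perf_counter()
        transpose_tiled(m, tile)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_tile, best_time = tile, elapsed
    return best_tile
```

In this sketch, `cache_bytes` plays the role of an architectural parameter exposed for tuning: shrinking or growing it changes which tile sizes are ever timed, so the search space stays small. Pure Python timings say little about real cache behavior, so the block is meant only to show the shape of the approach.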

