| Parametric multi-level tiling of imperfectly nested loops |
| Full text |
Pdf
(543 KB)
|
Source
|
International Conference on Supercomputing
archive
Proceedings of the 23rd international conference on Supercomputing
table of contents
Yorktown Heights, NY, USA
SESSION: Compilers
table of contents
Pages 147-157
Year of Publication: 2009
ISBN:978-1-60558-498-0
|
|
Authors
|
|
Albert Hartono
|
The Ohio State University, Columbus, OH, USA
|
|
Muthu Manikandan Baskaran
|
The Ohio State University, Columbus, OH, USA
|
|
Cédric Bastoul
|
Paris-Sud 11 University and HiPEAC Network, Orsay, France
|
|
Albert Cohen
|
INRIA Saclay and HiPEAC Network, Orsay, France
|
|
Sriram Krishnamoorthy
|
Pacific Northwest National Laboratory, Richland, WA, USA
|
|
Boyana Norris
|
Argonne National Laboratory, Argonne, IL, USA
|
|
J. Ramanujam
|
Louisiana State University, Baton Rouge, LA, USA
|
|
P. Sadayappan
|
The Ohio State University, Columbus, OH, USA
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 20, Downloads (12 Months): 71, Citation Count: 0
|
|
|
ABSTRACT
Tiling is a crucial loop transformation for generating high performance code on modern architectures. Efficient generation of multi-level tiled code is essential for maximizing data reuse in systems with deep memory hierarchies. Tiled loops with parametric tile sizes (not compile-time constants) facilitate runtime feedback and dynamic optimizations used in iterative compilation and automatic tuning. Previous parametric multi-level tiling approaches have been restricted to perfectly nested loops, where all assignment statements are contained inside the innermost loop of a loop nest. Previous solutions to tiling for imperfect loop nests have only handled fixed tile sizes. In this paper, we present an approach to parametric multi-level tiling of imperfectly nested loops. The tiling technique generates loops that iterate over full rectangular tiles, making them amenable to compiler optimizations such as register tiling. Experimental results using a number of computational benchmarks demonstrate the effectiveness of the developed tiling approach.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
PrimeTile: A Parametric Multi-Level Tiler for Imperfect Loop Nests. http://primetile.sourceforge.net.
|
| |
2
|
Nawaaz Ahmed , Nikolay Mateev , Keshav Pingali, Tiling imperfectly-nested loop nests, Proceedings of the 2000 ACM/IEEE conference on Supercomputing (CDROM), p.31-es, November 04-10, 2000, Dallas, Texas, United States
|
| |
3
|
|
 |
4
|
|
| |
5
|
Workshop on Automatic Tuning for Petascale Systems. http://cscads.rice.edu/workshops/summer08/autotuning.
|
| |
6
|
C. Bastoul. Efficient code generation for automatic parallelization and optimization. In ISPDC, page 23, 2003.
|
| |
7
|
|
| |
8
|
U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan. A practical automatic polyhedral program optimization system. In PLDI'08, 2008.
|
| |
9
|
|
| |
10
|
|
| |
11
|
|
| |
12
|
C. Chen, J. Chame, and M. Hall. Chill: A framework for composing high-level loop transformations. Technical Report 08-897, USC Computer Science Technical Report, June 2008.
|
| |
13
|
CLooG: The Chunky Loop Generator. http://www.cloog.org.
|
 |
14
|
|
| |
15
|
P. Feautrier. Dataflow analysis of array and scalar references. IJPP, 20(1):23--53, 1991.
|
| |
16
|
G. Goumas, M. Athanasaki, and N. Koziris. An Efficient Code Generation Technique for Tiled Iteration Spaces. IEEE Trans. Parallel Distrib. Syst., 14(10):1021--1034, 2003.
|
| |
17
|
M. Griebl. Automatic Parallelization of Loop Programs for Distributed Memory Architectures. FMI, University of Passau, 2004. Habilitation Thesis.
|
| |
18
|
A. Hartono, M. Baskaran, C. Bastoul, A. Cohen, S. Krishnamoorthy, B. Norris, J. Ramanujam, and P. Sadayappan. Primetile: A parametric multi-level tiler for imperfect loop nests. Technical Report OSU-CISRC-2/09-TR04, The Ohio State University, Feb. 2009.
|
| |
19
|
HiTLoG: Hierarchical Tiled Loop Generator. Available at http://www.cs.colostate.edu/MMAlpha/tiling/.
|
 |
20
|
|
 |
21
|
|
 |
22
|
|
| |
23
|
M. Jiménez, J. Llabería, and A. Fernández. A cost-effective implementation of multilevel tiling. IEEE Trans. Parallel Distrib. Syst., 14(10):1006--1020, 2003.
|
| |
24
|
D. Kim and S. Rajopadhye. Parameterized tiling for imperfectly nested loops. Technical Report CS-09-101, Colorado State University, Department of Computer Science, February 2009.
|
 |
25
|
DaeGon Kim , Lakshminarayanan Renganarayanan , Dave Rostron , Sanjay Rajopadhye , Michelle Mills Strout, Multi-level tiling: M for the price of one, Proceedings of the 2007 ACM/IEEE conference on Supercomputing, November 10-16, 2007, Reno, Nevada
[doi> 10.1145/1362622.1362691]
|
| |
26
|
|
 |
27
|
|
| |
28
|
|
 |
29
|
|
| |
30
|
Pluto: A polyhedral automatic parallelizer and locality optimizer for multicores. Available at http://pluto-compiler.sourceforge.net.
|
 |
31
|
|
| |
32
|
|
| |
33
|
J. Ramanujam and P. Sadayappan. Tiling multidimensional iteration spaces for multicomputers. JPDC, 16(2):108--230, 1992.
|
 |
34
|
Lakshminarayanan Renganarayanan , DaeGon Kim , Sanjay Rajopadhye , Michelle Mills Strout, Parameterized tiled loops for free, Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation, June 10-13, 2007, San Diego, California, USA
|
| |
35
|
|
 |
36
|
|
| |
37
|
|
 |
38
|
|
| |
39
|
A. Tiwari, C. Chen, J. Chame, M. Hall, and J. Hollingsworth. Scalable autotuning framework for compiler optimization. In IPDPS'09, May 2009.
|
| |
40
|
TLoG: A Parametrized Tiled Loop Generator. Available at http://www.cs.colostate.edu/MMAlpha/tiling/.
|
| |
41
|
|
| |
42
|
R. Whaley, A. Petitet, and J. Dongarra. Automated Empirical Optimizations of Software and the ATLAS Project. Parallel Computing Journal, 2000.
|
| |
43
|
|
| |
44
|
|
| |
45
|
|
|