Compact multi-dimensional kernel extraction for register tiling
Source
Conference on High Performance Networking and Computing
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Portland, Oregon
SESSION: Technical papers
Article No.: 45
Year of Publication: 2009
ISBN: 978-1-60558-744-8
Authors
Lakshminarayanan Renganarayana, IBM T.J. Watson Research Center, Yorktown Heights, New York
Uday Bondhugula, IBM T.J. Watson Research Center, Yorktown Heights, New York
Salem Derisavi, IBM Toronto Lab, Ontario, Canada
Alexandre E. Eichenberger, IBM T.J. Watson Research Center, Yorktown Heights, New York
Kevin O'Brien, IBM T.J. Watson Research Center, Yorktown Heights, New York
ABSTRACT
To achieve high performance on multi-cores, modern loop optimizers apply long sequences of transformations that produce complex loop structures. Downstream optimizations such as register tiling (unroll-and-jam plus scalar promotion) typically provide a significant performance improvement, but typical register tilers deliver it only when applied to simple loop structures. They often fail to operate on complex loop structures, leaving a significant amount of performance on the table. We present a technique called compact multi-dimensional kernel extraction (COMDEX), which makes register tilers operate on arbitrarily complex loop structures and so lets them deliver their performance benefits. COMDEX extracts compact unrollable kernels from complex loops. We show that by using COMDEX as a pre-processing step for register tiling we can (i) enable register tiling on complex loop structures and (ii) realize a significant performance improvement on a variety of codes.
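The register tiling that the abstract refers to (unroll-and-jam plus scalar promotion) can be illustrated on a simple, perfectly nested loop. The sketch below is not taken from the paper and is not COMDEX itself; it only shows the kind of transformation a register tiler performs when the loop structure is simple enough. The function names `matmul_ref` and `matmul_regtiled`, the matrix-multiply kernel, and the 2x2 tile size are all assumptions chosen for illustration.

```c
#include <stddef.h>

/* Baseline: C += A * B for n x n row-major matrices. */
void matmul_ref(size_t n, const double *A, const double *B, double *C) {
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                C[i * n + j] += A[i * n + k] * B[k * n + j];
}

/* Hypothetical 2x2 register-tiled version of the same kernel.
 * Unroll-and-jam: the i and j loops are unrolled by 2 and the copies
 * of the k loop are fused into one body.
 * Scalar promotion: the four C elements of the tile live in scalars
 * (c00..c11) for the whole k loop, so they can stay in registers. */
void matmul_regtiled(size_t n, const double *A, const double *B, double *C) {
    size_t n2 = n - (n % 2);  /* largest even bound <= n */
    size_t i, j, k;

    for (i = 0; i < n2; i += 2) {
        for (j = 0; j < n2; j += 2) {
            double c00 = C[i * n + j];
            double c01 = C[i * n + j + 1];
            double c10 = C[(i + 1) * n + j];
            double c11 = C[(i + 1) * n + j + 1];
            for (k = 0; k < n; k++) {
                double a0 = A[i * n + k];
                double a1 = A[(i + 1) * n + k];
                double b0 = B[k * n + j];
                double b1 = B[k * n + j + 1];
                c00 += a0 * b0;
                c01 += a0 * b1;
                c10 += a1 * b0;
                c11 += a1 * b1;
            }
            C[i * n + j] = c00;
            C[i * n + j + 1] = c01;
            C[(i + 1) * n + j] = c10;
            C[(i + 1) * n + j + 1] = c11;
        }
        /* Cleanup: leftover column when n is odd. */
        for (j = n2; j < n; j++) {
            for (k = 0; k < n; k++) {
                C[i * n + j] += A[i * n + k] * B[k * n + j];
                C[(i + 1) * n + j] += A[(i + 1) * n + k] * B[k * n + j];
            }
        }
    }
    /* Cleanup: leftover row when n is odd. */
    for (i = n2; i < n; i++)
        for (j = 0; j < n; j++)
            for (k = 0; k < n; k++)
                C[i * n + j] += A[i * n + k] * B[k * n + j];
}
```

Even in this simple case the tiled version needs separate cleanup loops for the partial tiles. On the complex loop structures the abstract describes, the full-tile kernel is harder to isolate, which is the situation COMDEX is presented as addressing.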