|
ABSTRACT
Determining the optimal tile size-one that minimizes the execution time-is a classical problem in compilation and performance tuning of loop kernels. Designing a model of the overall execution time of a tiled loop nest is an important subproblem. Both problems become harder when tiling is applied at multiple levels. We present a framework for determining the optimal tile sizes for a fully permutable, perfectly nested, rectangular loop with uniform dependences. Our framework supports multiple levels of tiling and uses a BSP style high level model for estimating the overall execution time of a loop program. In our framework, the problem of determining the optimal tile sizes, subject to memory capacity and bandwidth constraints, is modeled as a geometric program and transformed into a convex optimization problem, which can be solved efficiently. The model is validated through experimental results obtained by running twenty loop programs for different levels of tiling and different program and tile parameters. Our framework is very general and can also be used to solve the optimal tile size problem with many other models of execution time.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
[2] R Andonov, S. Balev, S Rajopadhye, and N. Yanev. Optimal semi-oblique tiling. IEEE Transactions on Parallel and Distributed Systems, 14(9):944-960, 2003.
|
 |
3
|
Jeff Bilmes , Krste Asanovic , Chee-Whye Chin , Jim Demmel, Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology, Proceedings of the 11th international conference on Supercomputing, p.340-347, July 07-11, 1997, Vienna, Austria
[doi> 10.1145/263580.263662]
|
| |
4
|
|
 |
5
|
|
| |
6
|
[6] L. Carter, J. Ferrante, F. Hummel, B. Alpern, and K.S. Gatlin. Hierarchical tiling: A methodology for high performance. Technical Report CS96-508, UCSD, Nov. 1996.
|
 |
7
|
|
 |
8
|
|
 |
9
|
Siddhartha Chatterjee , Erin Parker , Philip J. Hanlon , Alvin R. Lebeck, Exact analysis of the cache behavior of nested loops, Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation, p.286-297, June 2001, Snowbird, Utah, United States
|
 |
10
|
|
 |
11
|
|
| |
12
|
[12] R.J. Duffin, E.L. Peterson, and C. Zener. Geometric Programming - Theory and Applications. John Wiley, 1967.
|
| |
13
|
|
 |
14
|
|
| |
15
|
|
| |
16
|
|
| |
17
|
[17] C. Hsu and U. Kremer. Tile selection algorithms and their performance models. Technical Report DCS-TR-401, CS Dept., Rutgers University, Oct. 1999.
|
 |
18
|
|
| |
19
|
[19] M. Jimenez, J.M. Llaberia, and A. Fernandez. A cost-effective implementation of multilevel tiling. IEEE Transactions on Parallel and Distributed Computing, 14(10):1006 1020, 2003.
|
| |
20
|
|
| |
21
|
[21] Computational Optimization Laboratory. User's guide of COPL_GP: Compuational optimization program lihary: Geometric programming. (available at: http://dollar.biz.uiowa.edu/col/). Technical Report Department of Management Science, The University of Iowa, May 2000.
|
 |
22
|
Monica D. Lam , Edward E. Rothberg , Michael E. Wolf, The cache performance and optimizations of blocked algorithms, Proceedings of the fourth international conference on Architectural support for programming languages and operating systems, p.63-74, April 08-11, 1991, Santa Clara, California, United States
|
| |
23
|
|
| |
24
|
[24] MOSEK optimization tools. see http://www.mosek.com/products/3/tools/help.
|
 |
25
|
|
 |
26
|
|
 |
27
|
|
| |
28
|
|
| |
29
|
|
| |
30
|
|
 |
31
|
O. Temam , C. Fricker , W. Jalby, Cache interference phenomena, Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems, p.261-271, May 16-20, 1994, Nashville, Tennessee, United States
|
 |
32
|
|
 |
33
|
|
| |
34
|
|
| |
35
|
|
| |
36
|
|
| |
37
|
|
| |
38
|
|
CITED BY 4
|
|
|
|
|
|
|
|
|
|
|
Albert Hartono , Muthu Manikandan Baskaran , Cédric Bastoul , Albert Cohen , Sriram Krishnamoorthy , Boyana Norris , J. Ramanujam , P. Sadayappan, Parametric multi-level tiling of imperfectly nested loops, Proceedings of the 23rd international conference on Supercomputing, June 08-12, 2009, Yorktown Heights, NY, USA
|
|