|
ABSTRACT
Empirical program optimizers estimate the values of key optimization parameters by generating different program versions and running them on the actual hardware to determine which values give the best performance. In contrast, conventional compilers use models of programs and machines to choose these parameters. It is widely believed that model-driven optimization does not compete with empirical optimization, but few quantitative comparisons have been done to date. To make such a comparison, we replaced the empirical optimization engine in ATLAS (a system for generating a dense numerical linear algebra library called the BLAS) with a model-driven optimization engine that used detailed models to estimate values for optimization parameters, and then measured the relative performance of the two systems on three different hardware platforms. Our experiments show that model-driven optimization can be surprisingly effective, and can generate code whose performance is comparable to that of code generated by empirical optimizers for the BLAS.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
|
 |
3
|
|
 |
4
|
|
 |
5
|
|
| |
6
|
|
| |
7
|
|
| |
8
|
M. Frigo and S. G. Johnson. FFTW: An adaptive software architecture for the FFT. In proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), volume 3, pages 1381--1384, 1998
|
| |
9
|
|
| |
10
|
T. Kisuki, P. M. W. Knijnenburg, M. F. P. O'Boyle, and H. A. G. Wijshoff . Iterative compilation in program optimization. In proceedings of Compilers for Parallel Computers, pages (CPC), pages 35--44, 2000
|
| |
11
|
|
 |
12
|
Monica D. Lam , Edward E. Rothberg , Michael E. Wolf, The cache performance and optimizations of blocked algorithms, Proceedings of the fourth international conference on Architectural support for programming languages and operating systems, p.63-74, April 08-11, 1991, Santa Clara, California, United States
|
| |
13
|
R. L. Mattson, J. Gecsei, D. R. Slutz, and I. L. Traiger. Evaluation techniques for storage hierarchies. IBM Systems Journal 9, 2, 78--117, 1970
|
 |
14
|
|
 |
15
|
|
 |
16
|
|
| |
17
|
J. Ramanujam and P. Sadayappan, Tiling multidimensional iteration spaces for multicomputers, Journal of Parallel and Distributed Computing, 16(2):108--120, 1992
|
| |
18
|
|
| |
19
|
|
| |
20
|
|
 |
21
|
Jianxin Xiong , Jeremy Johnson , Robert Johnson , David Padua, SPL: a language and compiler for DSP algorithms, Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation, p.298-308, June 2001, Snowbird, Utah, United States
|
CITED BY 30
|
|
|
|
|
Cemal Yilmaz , Arvind S. Krishna , Atif Memon , Adam Porter , Douglas C. Schmidt , Aniruddha Gokhale , Balachandran Natarajan, Main effects screening: a distributed continuous quality assurance process for monitoring performance degradation in evolving software systems, Proceedings of the 27th international conference on Software engineering, May 15-21, 2005, St. Louis, MO, USA
|
|
|
|
|
|
|
|
|
|
|
|
Albert Cohen , Marc Sigler , Sylvain Girbal , Olivier Temam , David Parello , Nicolas Vasilache, Facilitating the search for compositions of program transformations, Proceedings of the 19th annual international conference on Supercomputing, June 20-22, 2005, Cambridge, Massachusetts
|
|
|
|
|
|
J. Dongarra , G. Bosilca , Z. Chen , V. Eijkhout , G. E. Fagg , E. Fuentes , J. Langou , P. Luszczek , J. Pjesivac-Grbovic , K. Seymour , H. You , S. S. Vadhiyar, Self-adapting numerical software (SANS) effort, IBM Journal of Research and Development, v.50 n.2/3, p.223-238, March 2006
|
|
|
|
|
|
|
|
|
Leonardo Bachega , Siddhartha Chatterjee , Kenneth A. Dockser , John A. Gunnels , Manish Gupta , Fred G. Gustavson , Christopher A. Lapkowski , Gary K. Liu , Mark P. Mendell , Charles D. Wait , T. J. Chris Ward, A High-Performance SIMD Floating Point Unit for BlueGene/L: Architecture, Compilation, and Algorithm Design, Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques, p.85-96, September 29-October 03, 2004
|
|
|
|
|
|
John Cavazos , Christophe Dubach , Felix Agakov , Edwin Bonilla , Michael F. P. O'Boyle , Grigori Fursin , Olivier Temam, Automatic performance model construction for the fast software exploration of new hardware designs, Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems, October 22-25, 2006, Seoul, Korea
|
|
|
|
|
|
Albert Cohen , Sébastien Donadio , Maria-Jesus Garzaran , Christoph Herrmann , Oleg Kiselyov , David Padua, In search of a program generator to implement generic transformations for high-performance computing, Science of Computer Programming, v.62 n.1, p.25-46, September 2006
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
F. Agakov , E. Bonilla , J. Cavazos , B. Franke , G. Fursin , M. F. P. O'Boyle , J. Thomson , M. Toussaint , C. K. I. Williams, Using Machine Learning to Focus Iterative Optimization, Proceedings of the International Symposium on Code Generation and Optimization, p.295-305, March 26-29, 2006
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
John Cavazos , Grigori Fursin , Felix Agakov , Edwin Bonilla , Michael F. P. O'Boyle , Olivier Temam, Rapidly Selecting Good Compiler Optimizations using Performance Counters, Proceedings of the International Symposium on Code Generation and Optimization, p.185-197, March 11-14, 2007
|
|
|
Van Bui , Boyana Norris , Kevin Huck , Lois Curfman McInnes , Li Li , Oscar Hernandez , Barbara Chapman, A component infrastructure for performance and power modeling of parallel scientific applications, Proceedings of the 2008 compFrame/HPC-GECO workshop on Component based high performance, October 16-17, 2008, Karlsruhe, Germany
|
|
|
Cemal Yilmaz , Adam Porter , Arvind S. Krishna , Atif M. Memon , Douglas C. Schmidt , Aniruddha S. Gokhale , Balachandran Natarajan, Reliable Effects Screening: A Distributed Continuous Quality Assurance Process for Monitoring Performance Degradation in Evolving Software Systems, IEEE Transactions on Software Engineering, v.33 n.2, p.124-141, February 2007
|
|
|
|
|
|
Jason Ansel , Cy Chan , Yee Lok Wong , Marek Olszewski , Qin Zhao , Alan Edelman , Saman Amarasinghe, PetaBricks: a language and compiler for algorithmic choice, ACM SIGPLAN Notices, v.44 n.6, June 2009
|
INDEX TERMS
Primary Classification:
D.
Software
D.3
PROGRAMMING LANGUAGES
D.3.4
Processors
Subjects:
Compilers
Additional Classification:
D.
Software
D.3
PROGRAMMING LANGUAGES
D.3.4
Processors
Subjects:
Code generation;
Optimization
G.
Mathematics of Computing
G.4
MATHEMATICAL SOFTWARE
I.
Computing Methodologies
I.2
ARTIFICIAL INTELLIGENCE
I.2.2
Automatic Programming
Subjects:
Program transformation
General Terms:
Algorithms,
Experimentation,
Measurement,
Performance
Keywords:
BLAS,
blocking,
code generation,
compilers,
empirical optimization,
memory hierarchy,
model-driven optimization,
program transformation,
tiling,
unrolling
|