| PetaBricks: a language and compiler for algorithmic choice |
| Full text |
Pdf
(551 KB)
|
Source
|
Conference on Programming Language Design and Implementation
archive
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation
table of contents
Dublin, Ireland
SESSION: Adaptation
table of contents
Pages 38-49
Year of Publication: 2009
ISBN:978-1-60558-392-1
Also published in ...
|
|
Authors
|
|
Jason Ansel
|
Massachusetts Institute of Technology, Cambridge, MA, USA
|
|
Cy Chan
|
Massachusetts Institute of Technology , Cambridge, MA, USA
|
|
Yee Lok Wong
|
Massachusetts Institute of Technology , Cambridge, MA, USA
|
|
Marek Olszewski
|
Massachusetts Institute of Technology , Cambridge, MA, USA
|
|
Qin Zhao
|
Massachusetts Institute of Technology , Cambridge, MA, USA
|
|
Alan Edelman
|
Massachusetts Institute of Technology , Cambridge, MA, USA
|
|
Saman Amarasinghe
|
Massachusetts Institute of Technology , Cambridge, MA, USA
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 44, Downloads (12 Months): 166, Citation Count: 0
|
|
|
ABSTRACT
It is often impossible to obtain a one-size-fits-all solution for high performance algorithms when considering different choices for data distributions, parallelism, transformations, and blocking. The best solution to these choices is often tightly coupled to different architectures, problem sizes, data, and available system resources. In some cases, completely different algorithms may provide the best performance. Current compiler and programming language techniques are able to change some of these parameters, but today there is no simple way for the programmer to express or the compiler to choose different algorithms to handle different parts of the data. Existing solutions normally can handle only coarse-grained, library level selections or hand coded cutoffs between base cases and recursive cases. We present PetaBricks, a new implicitly parallel language and compiler where having multiple implementations of multiple algorithms to solve a problem is the natural way of programming. We make algorithmic choice a first class construct of the language. Choices are provided in a way that also allows our compiler to tune at a finer granularity. The PetaBricks compiler autotunes programs by making both fine-grained as well as algorithmic choices. Choices also include different automatic parallelization techniques, data distributions, algorithmic parameters, transformations, and blocking. Additionally, we introduce novel techniques to autotune algorithms for different convergence criteria. When choosing between various direct and iterative methods, the PetaBricks compiler is able to tune a program in such a way that delivers near-optimal efficiency for any desired level of accuracy. The compiler has the flexibility of utilizing different convergence criteria for the various components within a single algorithm, providing the user with accuracy choice alongside algorithmic choice.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
| |
2
|
E. Anderson , Z. Bai , C. Bischof , L. S. Blackford , J. Demmel , Jack J. Dongarra , J. Du Croz , S. Hammarling , A. Greenbaum , A. McKenney , D. Sorensen, LAPACK Users' guide (third ed.), Society for Industrial and Applied Mathematics, Philadelphia, PA, 1999
|
 |
3
|
Jeff Bilmes , Krste Asanovic , Chee-Whye Chin , Jim Demmel, Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology, Proceedings of the 11th international conference on Supercomputing, p.340-347, July 07-11, 1997, Vienna, Austria
[doi> 10.1145/263580.263662]
|
 |
4
|
|
| |
5
|
|
| |
6
|
Matteo Frigo and Steven G. Johnson. The design and implementation of FFTW3. Proceedings of the IEEE, 93(2):216--231, February 2005. Invited paper, special issue on Program Generation, Optimization, and Platform Adaptation.
|
| |
7
|
Matteo Frigo and Steven G. Johnson. FFTW: An adaptive software architecture for the FFT. In Proceedings of the IEEE International Conference on Acoustics Speech and Signal Processing, volume 3, pages 1381--1384. IEEE, 1998.
|
 |
8
|
Matteo Frigo , Charles E. Leiserson , Keith H. Randall, The implementation of the Cilk-5 multithreaded language, Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation, p.212-223, June 17-19, 1998, Montreal, Quebec, Canada
|
 |
9
|
|
| |
10
|
|
| |
11
|
|
| |
12
|
|
| |
13
|
|
| |
14
|
Marek Olszewski and Michael Voss. Install-time system for automatic generation of optimized parallel sorting algorithms. In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, pages 17--23, 2004.
|
| |
15
|
Justin Mazzola Paluska , Hubert Pham , Umar Saif , Grace Chau , Chris Terman , Steve Ward, Structured Decomposition of Adaptive Applications, Proceedings of the 2008 Sixth Annual IEEE International Conference on Pervasive Computing and Communications, p.1-10, March 17-21, 2008
[doi> 10.1109/PERCOM.2008.55]
|
| |
16
|
Markus Puschel, Jose M. F. Moura, Jeremy R. Johnson, David Padua, Manuela M. Veloso, Bryan W. Singer, Jianxin Xiong, Aca Gacic, Franz Franchetti, Robbert W. Johnson Yevgen Voronenko, Kang Chen, and Nicholas Rizzolo. SPIRAL: Code generation for DSP transforms. In Proceedings of the IEEE, volume 93, pages 232--275. IEEE, Feb 2005.
|
| |
17
|
Richard H. Rand. Computer algebra in applied mathematics: an introduction to MACSYMA. Number 94 in Research notes in mathematics. 1984. ISBN 0-273-08632-4.
|
 |
18
|
Nathan Thomas , Gabriel Tanase , Olga Tkachyshyn , Jack Perdue , Nancy M. Amato , Lawrence Rauchwerger, A framework for adaptive algorithm selection in STAPL, Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, June 15-17, 2005, Chicago, IL, USA
[doi> 10.1145/1065944.1065981]
|
| |
19
|
|
 |
20
|
|
| |
21
|
Richard Vuduc, James W. Demmel, and Jeff A. Bilmes. Statistical models for empirical search-based performance tuning. International Journal of
|
| |
22
|
High Performance Computing Applications, 18(1):65--94, 2004. ISSN 1094--3420.
|
| |
23
|
Richard Vuduc, James W. Demmel, and Katherine A. Yelick. OSKI: A library of automatically tuned sparse matrix kernels. In Proceedings of the Scientific Discovery through Advanced Computing Conference, Journal of Physics: Conference Series, San Francisco, CA, USA, June 2005. Institute of Physics Publishing.
|
| |
24
|
|
| |
25
|
|
| |
26
|
Samuel Webb Williams, Andrew Waterman, and David A. Patterson. Roofline: An insightful visual performance model for floating-point programs and multicore architectures. Technical Report UCB/EECS--2008--134, EECS Department, University of California, Berkeley, Oct 2008.
|
 |
27
|
Jianxin Xiong , Jeremy Johnson , Robert Johnson , David Padua, SPL: a language and compiler for DSP algorithms, Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation, p.298-308, June 2001, Snowbird, Utah, United States
|
 |
28
|
|
 |
29
|
Kamen Yotov , Xiaoming Li , Gang Ren , Michael Cibulskis , Gerald DeJong , Maria Garzaran , David Padua , Keshav Pingali , Paul Stodghill , Peng Wu, A comparison of empirical and model-driven optimization, Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation, June 09-11, 2003, San Diego, California, USA
|
| |
30
|
|
|