ACM Home Page
Please provide us with feedback. Feedback
PetaBricks: a language and compiler for algorithmic choice
Full text PdfPdf (551 KB)
Source
Conference on Programming Language Design and Implementation archive
Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation table of contents
Dublin, Ireland
SESSION: Adaptation table of contents
Pages 38-49  
Year of Publication: 2009
ISBN:978-1-60558-392-1
Also published in ...
Authors
Jason Ansel  Massachusetts Institute of Technology, Cambridge, MA, USA
Cy Chan  Massachusetts Institute of Technology , Cambridge, MA, USA
Yee Lok Wong  Massachusetts Institute of Technology , Cambridge, MA, USA
Marek Olszewski  Massachusetts Institute of Technology , Cambridge, MA, USA
Qin Zhao  Massachusetts Institute of Technology , Cambridge, MA, USA
Alan Edelman  Massachusetts Institute of Technology , Cambridge, MA, USA
Saman Amarasinghe  Massachusetts Institute of Technology , Cambridge, MA, USA
Sponsors
SIGPLAN: ACM Special Interest Group on Programming Languages
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 44,   Downloads (12 Months): 166,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1542476.1542481
What is a DOI?

ABSTRACT

It is often impossible to obtain a one-size-fits-all solution for high performance algorithms when considering different choices for data distributions, parallelism, transformations, and blocking. The best solution to these choices is often tightly coupled to different architectures, problem sizes, data, and available system resources. In some cases, completely different algorithms may provide the best performance. Current compiler and programming language techniques are able to change some of these parameters, but today there is no simple way for the programmer to express or the compiler to choose different algorithms to handle different parts of the data. Existing solutions normally can handle only coarse-grained, library level selections or hand coded cutoffs between base cases and recursive cases.

We present PetaBricks, a new implicitly parallel language and compiler where having multiple implementations of multiple algorithms to solve a problem is the natural way of programming. We make algorithmic choice a first class construct of the language. Choices are provided in a way that also allows our compiler to tune at a finer granularity. The PetaBricks compiler autotunes programs by making both fine-grained as well as algorithmic choices. Choices also include different automatic parallelization techniques, data distributions, algorithmic parameters, transformations, and blocking.

Additionally, we introduce novel techniques to autotune algorithms for different convergence criteria. When choosing between various direct and iterative methods, the PetaBricks compiler is able to tune a program in such a way that delivers near-optimal efficiency for any desired level of accuracy. The compiler has the flexibility of utilizing different convergence criteria for the various components within a single algorithm, providing the user with accuracy choice alongside algorithmic choice.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
3
4
 
5
 
6
Matteo Frigo and Steven G. Johnson. The design and implementation of FFTW3. Proceedings of the IEEE, 93(2):216--231, February 2005. Invited paper, special issue on Program Generation, Optimization, and Platform Adaptation.
 
7
Matteo Frigo and Steven G. Johnson. FFTW: An adaptive software architecture for the FFT. In Proceedings of the IEEE International Conference on Acoustics Speech and Signal Processing, volume 3, pages 1381--1384. IEEE, 1998.
8
9
 
10
 
11
 
12
 
13
 
14
Marek Olszewski and Michael Voss. Install-time system for automatic generation of optimized parallel sorting algorithms. In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, pages 17--23, 2004.
 
15
 
16
Markus Puschel, Jose M. F. Moura, Jeremy R. Johnson, David Padua, Manuela M. Veloso, Bryan W. Singer, Jianxin Xiong, Aca Gacic, Franz Franchetti, Robbert W. Johnson Yevgen Voronenko, Kang Chen, and Nicholas Rizzolo. SPIRAL: Code generation for DSP transforms. In Proceedings of the IEEE, volume 93, pages 232--275. IEEE, Feb 2005.
 
17
Richard H. Rand. Computer algebra in applied mathematics: an introduction to MACSYMA. Number 94 in Research notes in mathematics. 1984. ISBN 0-273-08632-4.
18
 
19
20
 
21
Richard Vuduc, James W. Demmel, and Jeff A. Bilmes. Statistical models for empirical search-based performance tuning. International Journal of
 
22
High Performance Computing Applications, 18(1):65--94, 2004. ISSN 1094--3420.
 
23
Richard Vuduc, James W. Demmel, and Katherine A. Yelick. OSKI: A library of automatically tuned sparse matrix kernels. In Proceedings of the Scientific Discovery through Advanced Computing Conference, Journal of Physics: Conference Series, San Francisco, CA, USA, June 2005. Institute of Physics Publishing.
 
24
 
25
 
26
Samuel Webb Williams, Andrew Waterman, and David A. Patterson. Roofline: An insightful visual performance model for floating-point programs and multicore architectures. Technical Report UCB/EECS--2008--134, EECS Department, University of California, Berkeley, Oct 2008.
27
28
29
 
30

Collaborative Colleagues:
Jason Ansel: colleagues
Cy Chan: colleagues
Yee Lok Wong: colleagues
Marek Olszewski: colleagues
Qin Zhao: colleagues
Alan Edelman: colleagues
Saman Amarasinghe: colleagues