ACM Home Page
Please provide us with feedback. Feedback
A framework for adaptive algorithm selection in STAPL
Full text PdfPdf (225 KB)
Source Principles and Practice of Parallel Programming archive
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming table of contents
Chicago, IL, USA
SESSION: Libraries and applications table of contents
Pages: 277 - 288  
Year of Publication: 2005
ISBN:1-59593-080-9
Authors
Nathan Thomas  Texas A&M University
Gabriel Tanase  Texas A&M University
Olga Tkachyshyn  Texas A&M University
Jack Perdue  Texas A&M University
Nancy M. Amato  Texas A&M University
Lawrence Rauchwerger  Texas A&M University
Sponsors
SIGPLAN: ACM Special Interest Group on Programming Languages
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 5,   Downloads (12 Months): 56,   Citation Count: 10
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1065944.1065981
What is a DOI?

ABSTRACT

Writing portable programs that perform well on multiple platforms or for varying input sizes and types can be very difficult because performance is often sensitive to the system architecture, the run-time environment, and input data characteristics. This is even more challenging on parallel and distributed systems due to the wide variety of system architectures. One way to address this problem is to adaptively select the best parallel algorithm for the current input data and system from a set of functionally equivalent algorithmic options. Toward this goal, we have developed a general framework for adaptive algorithm selection for use in the Standard Template Adaptive Parallel Library (STAPL). Our framework uses machine learning techniques to analyze data collected by STAPL installation benchmarks and to determine tests that will select among algorithmic options at run-time. We apply a prototype implementation of our framework to two important parallel operations, sorting and matrix multiplication, on multiple platforms and show that the framework determines run-time tests that correctly select the best performing algorithm from among several competing algorithmic options in 86-100% of the cases studied, depending on the operation and the system.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
P. An, A. Jula, S. Rus, S. Saunders, T. Smith, G. Tanase, N. Thomas, N. Amato, and L. Rauchwerger. STAPL: A standard template adaptive parallel C++ library. In Proc. of the Intern. Workshop on Advanced Compiler Technology for High Performance and Embedded Processors (IWACT), Bucharest, Romania, July 2001.
 
2
P. An, A. Jula, S. Rus, S. Saunders, T. Smith, G. Tanase, N. Thomas, N. Amato, and L. Rauchwerger. STAPL: An adaptive, generic parallel programming library for C++. In Workshop on Languages and Compilers for Parallel Computing (LCPC), Cumberland Falls, KY, August 2001.
3
4
5
6
 
7
W.H. Burge. Sorting, trees, and measures of order. Information and Control, 1(3):181--197, 1958.
8
 
9
Jaeyoung Choi , Jack J. Dongarra , L. Susan Ostrouchov , Antoine P. Petitet , David W. Walker , R. Clint Whaley, Design and implementation of the ScaLAPACK LU, QR, and Cholesky factorization routines, Scientific Programming, v.5 n.3, p.173-184, Fall 1996
 
10
Jaeyoung Choi, Jack J. Dongarra, and David W. Walker. PUMMA: Parallel Universal Matrix Multiplication Algorithms on distributed memory concurrent computers. Concurrency: Practice and Experience, 6(7):543--570, 1994.
11
 
12
Edsger W. Dijkstra. Smoothsort, an alternative for sorting in situ. Science of Computer Programming, 1(3):223--233, 1982.
 
13
 
14
15
 
16
G. C. Fox and S. W. Otto. Matrix algorithms on a hypercube I: matrix multiplication.
 
17
R. A. Van De Geijn and J. Watts. SUMMA: scalable universal matrix multiplication algorithm. Concurrency: Practice and Experience, 9(4):255--274, 1997.
18
19
 
20
 
21
 
22
 
23
 
24
 
25
Marek Olszewski and Michael Voss. Proc. of the international conference on parallel and distributed processing techniques and applications, pdpta '04, june 21-24, 2004, las vegas, nevada, usa, volume 1. In Hamid R. Arabnia, editor, PDPTA. CSREA Press, 2004.
 
26
 
27
Of Signal Processing. SPIRAL: A generator for platform-adapted libraries.
 
28
 
29
 
30
J. R. Rice. The algorithm selection problem. Advances in Computers, 15:65--118, 1976.
 
31
D. Rumelhart, G. Hinton, and R. Williams. Learning internal representations by error propagation. IEEE Transactions on Parallel and Distributed Systems: Explorations in the Microstructure of Cognition, 1, 1986.
32
 
33
Steven Saunders, Nathan Thomas, Nancy Amato, and Lawrence Rauchwerger. Adaptive parallel sorting in the STAPL library. Technical Report TR01-005, Parasol Laboratory, Texas A&M University, November 2001.
 
34
 
35
 
36
R. Clint Whaley, Antoine Petitet, and J. Dongarra. Automated empirical optimizations of software and the ATLAS project. Parallel Computing, 27(1-2):3--35, January 2001.
37
 
38

CITED BY  10

Collaborative Colleagues:
Nathan Thomas: colleagues
Gabriel Tanase: colleagues
Olga Tkachyshyn: colleagues
Jack Perdue: colleagues
Nancy M. Amato: colleagues
Lawrence Rauchwerger: colleagues