Model-guided autotuning of high-productivity languages for petascale computing
Source
Proceedings of the 18th ACM International Symposium on High Performance Distributed Computing
Garching, Germany
Pages 151-166  
Year of Publication: 2009
ISBN:978-1-60558-587-1
Authors
Hans Zima  JPL, Pasadena, CA, USA
Mary Hall  University of Utah, Salt Lake City, USA
Chun Chen  University of Utah, Salt Lake City, USA
Jacqueline Chame  ISI, Marina del Rey, USA
Sponsors
ACM: Association for Computing Machinery
SIGARCH: ACM Special Interest Group on Computer Architecture
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1551609.1551611

ABSTRACT

This paper addresses the enormous complexity of mapping applications to current and future highly parallel platforms, including scalable architectures consisting of tens of thousands of nodes, many-core devices with tens to hundreds of cores, and hierarchical systems providing multi-level parallelism. On systems of this scale, the performance of many important algorithms is dominated by the time required to move data across the levels of the memory hierarchy. As a consequence, locality awareness of algorithms and efficient management of communication are essential for obtaining scalable parallel performance, and are of particular concern for applications characterized by irregular memory access patterns. We describe the design of a programming system that focuses on the productivity of application programmers in expressing locality-aware algorithms for high-end architectures, which are then automatically tuned for performance. The approach combines the successes of two novel concepts for managing locality: high-level specification of user-defined data distributions and model-guided autotuning for data locality. The resulting combined system provides a powerful general mechanism for specifying data distributions, which can express domain-specific knowledge, and facilitates automatically tuning a distribution to the access patterns of an algorithm and applying it at different levels of a memory hierarchy. Because the specification of a data distribution is cleanly separated from the algorithms in which it is used, the two can be written separately and composed to quickly develop new applications that can be tuned in the context of their data set and execution environment. We address key issues for a range of codes that include LU Decomposition, Sparse Matrix-Vector Multiply, and Knowledge Discovery. The knowledge discovery algorithms, in particular, stress the proposed language and compiler technology and provide a forcing function for developing tools that address the inherent challenges of irregular applications.