Efficient, portable implementation of asynchronous multi-place programs
Principles and Practice of Parallel Programming
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Raleigh, NC, USA
SESSION: High end computing software
Pages 271-282
Year of Publication: 2009
ISBN:978-1-60558-397-6
Authors
Ganesh Bikshandi, IBM STG, Bangalore, India
Jose G. Castanos, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
Sreedhar B. Kodali, IBM STG, Bangalore, India
V. Krishna Nandivada, IBM India Research Lab, New Delhi, India
Igor Peshansky, IBM T.J. Watson Research Center, Hawthorne, NY, USA
Vijay A. Saraswat, IBM T.J. Watson Research Center, Hawthorne, NY, USA
Sayantan Sur, IBM T.J. Watson Research Center, Hawthorne, NY, USA
Pradeep Varma, IBM India Research Lab, New Delhi, India
Tong Wen, Interactive Supercomputing, Boston, MA, USA
ABSTRACT
The X10 programming language is organized around the notion of places (an encapsulation of data and activities operating on the data), partitioned global address space (PGAS), and asynchronous computation and communication. This paper introduces an expressive subset of X10, Flat X10, designed to permit efficient execution across multiple single-threaded places with a simple runtime and without compromising on the productivity of X10. We present the design, implementation and evaluation of a compiler and runtime system for Flat X10. The Flat X10 compiler translates programs into C++ SPMD programs communicating using an active messaging infrastructure. It uses novel techniques to transform explicitly parallel programs into SPMD programs. The runtime system is based on IBM's LAPI (Low-level API) and is easily portable to other libraries such as GASNet and ARMCI. Our implementation realizes performance comparable to hand-written MPI programs for well-known HPC benchmarks such as Random Access, Stream, and FFT, on a Federation-based cluster of Power5 SMPs (with hundreds of processors) and the Blue Gene (with thousands of processors). Submissions based on the work presented in this paper were co-winners of the 2007 and 2008 HPC Challenge Type II Awards.
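The execution model the abstract describes (single-threaded places, one SPMD program, remote work delivered as active messages) can be illustrated with a small simulation. This is only a sketch under stated assumptions: it is not the Flat X10 compiler's output, it does not use LAPI, and every name in it (`Place`, `send`, `xor_update`, `spmd_main`, `run`) is hypothetical. It mimics a Random Access style update on a block-distributed table, with all places simulated in one process.

```python
from collections import deque


class Place:
    """A single-threaded place: local data plus pending active messages."""

    def __init__(self, pid, table_size):
        self.pid = pid
        self.table = [0] * table_size  # this place's block of the global table
        self.inbox = deque()           # (handler, args) pairs awaiting execution


def send(places, dest, handler, *args):
    """Hypothetical active-message send: queue `handler` to run at place `dest`."""
    places[dest].inbox.append((handler, args))


def xor_update(place, index, value):
    """Handler run by the owning place; it touches only that place's data."""
    place.table[index] ^= value


def spmd_main(place, places, updates):
    """The same program runs at every place, on that place's share of updates."""
    local_size = len(place.table)
    for global_index, value in updates.get(place.pid, []):
        # Block distribution: the owner is the quotient, the offset the remainder.
        owner, index = divmod(global_index, local_size)
        send(places, owner, xor_update, index, value)


def run(nplaces, table_size, updates):
    """Run the SPMD program at every place, then drain all pending messages."""
    places = [Place(pid, table_size) for pid in range(nplaces)]
    for place in places:
        spmd_main(place, places, updates)
    # Stand-in for a 'finish': keep polling until no place has pending work.
    while any(place.inbox for place in places):
        for place in places:
            while place.inbox:
                handler, args = place.inbox.popleft()
                handler(place, *args)
    return [place.table for place in places]
```

Calling `run(2, 4, {0: [(5, 3)], 1: [(0, 7), (5, 1)]})` routes each update to the place that owns the target index, so no place ever writes another place's data directly. A real runtime would replace the in-process queues with network active messages and the polling loop with distributed termination detection.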
CITED BY
Jun Shirako, Jisheng M. Zhao, V. Krishna Nandivada, Vivek N. Sarkar, Chunking parallel loops in the presence of synchronization, Proceedings of the 23rd international conference on Supercomputing, June 08-12, 2009, Yorktown Heights, NY, USA
INDEX TERMS
Primary Classification: D. Software; D.3 PROGRAMMING LANGUAGES; D.3.4 Processors
General Terms: Design, Languages
Keywords: apgas, asynchrony, compiler, fft, hpc, hpc challenge, pgas, random access, runtime, spmd, stream, x10