Toward loosely coupled programming on petascale systems
Source: Conference on High Performance Networking and Computing
Proceedings of the 2008 ACM/IEEE conference on Supercomputing - Volume 00
Austin, Texas
SECTION: Papers
Article No. 22  
Year of Publication: 2008
ISBN: 978-1-4244-2835-9
Authors
Ioan Raicu  University of Chicago, Chicago, IL
Zhao Zhang  University of Chicago and Argonne National Laboratory, Chicago, IL
Mike Wilde  Argonne National Laboratory, Argonne, IL and University of Chicago, Chicago, IL
Ian Foster  Argonne National Laboratory, Argonne, IL and University of Chicago, Chicago, IL
Pete Beckman  Argonne National Laboratory, Argonne, IL
Kamil Iskra  Argonne National Laboratory, Argonne, IL
Ben Clifford  University of Chicago and Argonne National Laboratory, Chicago, IL
Publisher
IEEE Press  Piscataway, NJ, USA

ABSTRACT

We have extended the Falkon lightweight task execution framework to make loosely coupled programming on petascale systems a practical and useful programming model. This work studies and measures the performance factors involved in applying this approach to enable the use of petascale systems by a broader user community, and with greater ease. Our work enables the execution of highly parallel computations composed of loosely coupled serial jobs with no modifications to the respective applications. This approach allows a new (and potentially far larger) class of applications to leverage petascale systems, such as the IBM Blue Gene/P supercomputer. We present the challenges of I/O performance encountered in making this model practical, and show results using both microbenchmarks and real applications from two domains: economic energy modeling and molecular dynamics. Our benchmarks show that we can scale up to 160K processor-cores with high efficiency, and can achieve sustained execution rates of thousands of tasks per second.
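
To make the programming model in the abstract concrete, the following is a minimal sketch of a loosely coupled "bag of tasks": independent, unmodified serial programs handed to a fixed pool of workers, with the sustained task rate measured afterwards. It is not Falkon's interface. The names `run_task`, `run_bag_of_tasks`, and `efficiency` are hypothetical, and the efficiency formula (ideal parallel time divided by measured time) is an assumed definition rather than the one used in the paper.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def run_task(command):
    """Run one unmodified serial program (an argv list); return its exit code."""
    return subprocess.run(command, capture_output=True).returncode


def run_bag_of_tasks(tasks, workers=4, run=run_task):
    """Dispatch independent tasks to a fixed pool of workers.

    Tasks share no state and never communicate with each other, which is
    what makes the computation loosely coupled.
    Returns (results in submission order, tasks completed per second).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run, tasks))
    elapsed = time.perf_counter() - start
    rate = len(tasks) / elapsed if elapsed > 0 else float("inf")
    return results, rate


def efficiency(task_seconds, num_tasks, workers, elapsed):
    """Assumed definition: ideal parallel time over measured wall-clock time."""
    ideal = task_seconds * num_tasks / workers
    return ideal / elapsed
```

Under these assumptions, calling `run_bag_of_tasks` with a list of argv lists (for example, eight copies of a short serial command) returns the exit codes and the achieved tasks-per-second rate. Passing a different `run` callable swaps the executor without touching the dispatch loop. At the scale the abstract describes, a single dispatcher like this one would itself become the bottleneck.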


