ABSTRACT
Observational astrophysics has recently become a data-intensive science after many decades of relative data poverty. As a result, many of the algorithms developed for processing astronomical data, although well established for low-volume data capture, do not scale well to today's high-volume sky surveys and transient searches. Specifically, problems may occur with data transfer, workflow management, efficient parallelization, and integration of legacy code. Observational astrophysics workflows present computational challenges unique in high performance computing, including 24/7 operations, time-critical processing, and very large numbers of relatively small data files which must all be processed and archived. We present a case study based on Sunfall, a distributed, parallel scientific workflow system we built for the Nearby Supernova Factory, the largest data-volume supernova search currently in existence. We describe innovative techniques for data transfer and workflow management, and discuss lessons learned in building a large-scale observational astrophysics workflow management system.