An integrated framework for performance-based optimization of scientific workflows
Source
High Performance Distributed Computing
Proceedings of the 18th ACM international symposium on High performance distributed computing
Garching, Germany
SESSION: Workflow and dataflow applications
Pages 177-186
Year of Publication: 2009
ISBN: 978-1-60558-587-1
Authors
Vijay S. Kumar, Ohio State University, Columbus, OH, USA
P. Sadayappan, Ohio State University, Columbus, OH, USA
Gaurang Mehta, University of Southern California, Marina del Rey, CA, USA
Karan Vahi, University of Southern California, Marina del Rey, CA, USA
Ewa Deelman, University of Southern California, Marina del Rey, CA, USA
Varun Ratnakar, University of Southern California, Marina del Rey, CA, USA
Jihie Kim, University of Southern California, Marina del Rey, CA, USA
Yolanda Gil, University of Southern California, Marina del Rey, CA, USA
Mary Hall, University of Utah, Salt Lake City, UT, USA
Tahsin Kurc, Emory University, Atlanta, GA, USA
Joel Saltz, Emory University, Atlanta, GA, USA
ABSTRACT
Data analysis processes in scientific applications can be expressed as coarse-grain workflows of complex data processing operations with data flow dependencies between them. Performance optimization of these workflows can be viewed as a search for a set of optimal values in a multi-dimensional parameter space. While some performance parameters such as grouping of workflow components and their mapping to machines do not affect the accuracy of the output, others may dictate trading the output quality of individual components (and of the whole workflow) for performance. This paper describes an integrated framework which is capable of supporting performance optimizations along multiple dimensions of the parameter space. Using two real-world applications in the spatial data analysis domain, we present an experimental evaluation of the proposed framework.
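The search view described in the abstract can be illustrated with a minimal sketch. It is not the framework from the paper: the parameter names, their values, and the cost model below are all hypothetical, chosen only to show an exhaustive search that separates parameters that leave output quality unchanged (grouping, machine mapping) from one that trades quality for speed.

```python
from itertools import product

# Hypothetical parameter space; none of these names or values come from the paper.
PARAM_SPACE = {
    "group_size": [1, 2, 4],          # components grouped per job (does not affect quality)
    "num_machines": [2, 4, 8],        # machines the workflow is mapped to (does not affect quality)
    "resolution": [0.5, 0.75, 1.0],   # quality-affecting knob: lower means faster but coarser
}


def estimate(config):
    """Toy cost model (an assumption) returning (runtime_seconds, output_quality)."""
    compute = 120.0 * config["resolution"] / config["num_machines"]
    scheduling_overhead = 2.0 * (8 / config["group_size"])
    return compute + scheduling_overhead, config["resolution"]


def best_config(min_quality):
    """Exhaustively search the space for the fastest config meeting min_quality."""
    names = list(PARAM_SPACE)
    best, best_runtime = None, float("inf")
    for values in product(*(PARAM_SPACE[n] for n in names)):
        config = dict(zip(names, values))
        runtime, quality = estimate(config)
        if quality >= min_quality and runtime < best_runtime:
            best, best_runtime = config, runtime
    return best, best_runtime


if __name__ == "__main__":
    config, runtime = best_config(min_quality=0.7)
    print(config, runtime)
```

Exhaustive enumeration is only practical for a space this small. A real system would presumably measure or model performance rather than rely on a closed-form estimate like the one assumed here.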
CITED BY
Tahsin Kurc, Shannon Hastings, Vijay Kumar, Stephen Langella, Ashish Sharma, Tony Pan, Scott Oster, David Ervin, Justin Permar, Sivaramakrishnan Narayanan, Yolanda Gil, Ewa Deelman, Mary Hall, Joel Saltz, HPC and Grid Computing for Integrative Biomedical Research, International Journal of High Performance Computing Applications, v.23 n.3, p.252-264, August 2009