Experience in using a process language to define scientific workflow and generate dataset provenance
Source: Foundations of Software Engineering
Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering
Atlanta, Georgia
Session: Process
Pages 319-329
Year of Publication: 2008
ISBN: 978-1-59593-995-1
Authors
Leon J. Osterweil, Univ. of Massachusetts, Amherst, MA
Lori A. Clarke, Univ. of Massachusetts, Amherst, MA
Aaron M. Ellison, Harvard University, Petersham, MA
Rodion Podorozhny, Texas State University, San Marcos, TX
Alexander Wise, Univ. of Massachusetts, Amherst, MA
Emery Boose, Harvard University, Petersham, MA
Julian Hadley, Harvard University, Petersham, MA
Sponsor
SIGSOFT: ACM Special Interest Group on Software Engineering
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1453101.1453147

ABSTRACT

This paper describes our experiences in exploring the applicability of software engineering approaches to scientific data management problems. Specifically, this paper describes how process definition languages can be used to expedite production of scientific datasets as well as to generate documentation of their provenance. Our approach uses a process definition language that incorporates powerful semantics to encode scientific processes in the form of a Process Definition Graph (PDG). The paper describes how execution of the PDG-defined process can generate Dataset Derivation Graphs (DDGs), metadata that document how the scientific process developed each of its product datasets. The paper uses an example to show that scientific processes may be complex and to illustrate why some of the more powerful semantic features of the process definition language are useful in supporting clarity and conciseness in representing such processes. This work is similar in goals to work generally referred to as Scientific Workflow. The paper demonstrates the contribution that software engineering can make to this domain.
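The relationship between a process definition and the derivation records produced by executing it can be illustrated with a minimal sketch. This is not the authors' process language or their PDG/DDG implementation. The names Step, DerivationGraph, run, and lineage, along with the two-step example process, are hypothetical and chosen only for illustration. The sketch assumes a purely sequential process, whereas the abstract indicates that real scientific processes need more powerful semantics.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Step:
    """One node of a toy process definition (hypothetical, not the paper's language)."""
    name: str
    inputs: List[str]
    output: str
    action: Callable[..., object]


@dataclass
class DerivationGraph:
    """Toy derivation graph: each edge is (source dataset, step name, product dataset)."""
    edges: List[Tuple[str, str, str]] = field(default_factory=list)

    def lineage(self, dataset: str) -> List[str]:
        """Return every dataset that `dataset` was derived from, nearest first."""
        found: List[str] = []
        frontier = [dataset]
        while frontier:
            current = frontier.pop(0)
            for source, _step, product in self.edges:
                if product == current and source not in found:
                    found.append(source)
                    frontier.append(source)
        return found


def run(steps: List[Step], data: Dict[str, object]) -> DerivationGraph:
    """Execute the steps in order, recording provenance as a side effect."""
    ddg = DerivationGraph()
    for step in steps:
        args = [data[name] for name in step.inputs]
        data[step.output] = step.action(*args)
        for name in step.inputs:
            ddg.edges.append((name, step.name, step.output))
    return ddg


# Hypothetical two-step process: drop missing readings, then average the rest.
process = [
    Step("drop_missing", ["raw"], "cleaned",
         lambda xs: [x for x in xs if x is not None]),
    Step("mean", ["cleaned"], "daily_mean",
         lambda xs: sum(xs) / len(xs)),
]
datasets = {"raw": [2.0, None, 4.0]}
ddg = run(process, datasets)
```

In this sketch the provenance record is a by-product of execution rather than something the scientist writes by hand, which is the general idea the abstract describes. Calling ddg.lineage("daily_mean") walks the recorded edges back to the raw input.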


