Provenance and scientific workflows: challenges and opportunities
Source
International Conference on Management of Data
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Vancouver, Canada
TUTORIAL SESSION: Tutorials
Pages 1345-1350  
Year of Publication: 2008
ISBN: 978-1-60558-102-6
Authors
Susan B. Davidson, University of Pennsylvania, Philadelphia, PA, USA
Juliana Freire, University of Utah, Salt Lake City, UT, USA
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1376616.1376772

ABSTRACT

Provenance in the context of workflows, both for the data they derive and for their specification, is essential for result reproducibility, sharing, and knowledge re-use in the scientific community. Several workshops have been held on the topic, and it has been the focus of many research projects and prototype systems. This tutorial provides an overview of research issues in provenance for scientific workflows, with a focus on recent literature and technology in this area. It is aimed at a general database research audience and at people who work with scientific data and workflows. We will (1) provide a general overview of scientific workflows; (2) describe research on provenance for scientific workflows and show in detail how provenance is supported in existing systems; (3) discuss emerging applications that are enabled by provenance; and (4) outline open problems and new directions for database-related research.


