ABSTRACT
Through technologies such as RSS (Really Simple Syndication), Web Services, and AJAX (Asynchronous JavaScript and XML), the Internet has facilitated the emergence of applications that are composed from a variety of services and data sources. Through tools such as Yahoo Pipes, these “mash-ups” can be composed in a dynamic, just-in-time manner from components provided by multiple institutions (e.g., Google, Amazon, or your neighbor). However, when using these applications, it is not apparent where the data comes from or how it is processed. Thus, to inspire trust and confidence in mash-ups, it is critical to be able to analyze their processes after the fact. These trailing analyses, in particular the determination of the provenance of a result (i.e., the process that led to it), are enabled by process documentation: documentation of an application's past process, created by the components of that application at execution time. In this article, we define a generic conceptual data model that supports the autonomous creation of attributable, factual process documentation for dynamic multi-institutional applications. The data model is instantiated using two Internet formats, OWL and XML, and is evaluated with respect to questions about the provenance of results generated by a complex bioinformatics mash-up.