On the provenance of non-answers to queries over extracted data
Source
Proceedings of the VLDB Endowment
Volume 1, Issue 1 (August 2008)
SESSION: New topics
Pages 736-747
Year of Publication: 2008
ISSN: 2150-8097
Authors
Jiansheng Huang  University of Wisconsin at Madison
Ting Chen  University of Wisconsin at Madison
AnHai Doan  University of Wisconsin at Madison
Jeffrey F. Naughton  University of Wisconsin at Madison
DOI: http://doi.acm.org/10.1145/1453856.1453936

ABSTRACT

In information extraction, uncertainty is ubiquitous. For this reason, it is useful to provide users querying extracted data with explanations for the answers they receive. Providing the provenance for tuples in a query result partially addresses this problem, in that provenance can explain why a tuple is in the result of a query. However, in some cases explaining why a tuple is not in the result may be just as helpful. In this work we focus on providing provenance-style explanations for non-answers and develop a mechanism for providing this new type of provenance. Our experience with an information extraction prototype suggests that our approach can provide effective provenance information that can help a user resolve their doubts over non-answers to a query.



