Efficiently incorporating user feedback into information extraction and integration programs
Source
International Conference on Management of Data
Proceedings of the 35th SIGMOD international conference on Management of data
Providence, Rhode Island, USA
Session: Research session 3: information extraction
Pages 87-100
Year of Publication: 2009
ISBN: 978-1-60558-551-2
Authors
Xiaoyong Chai  University of Wisconsin-Madison, Madison, WI, USA
Ba-Quy Vuong  University of Wisconsin-Madison, Madison, WI, USA
AnHai Doan  University of Wisconsin-Madison, Madison, WI, USA
Jeffrey F. Naughton  University of Wisconsin-Madison, Madison, WI, USA
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1559845.1559857

ABSTRACT

Many applications increasingly employ information extraction and integration (IE/II) programs to infer structure from unstructured data. Automatic IE/II is inherently imprecise. Such programs therefore often make many IE/II mistakes and can benefit significantly from user feedback. Today, however, there is no good way to automatically provide and process such feedback. On finding an IE/II mistake, users often must alert the developer team (e.g., via email or a Web form), and then wait for the team to manually examine the program internals to locate and fix the mistake. This process is slow, error-prone, and frustrating.

In this paper we propose a solution for users to directly provide feedback and for IE/II programs to automatically process such feedback. In our solution a developer U uses hlog, a declarative IE/II language, to write an IE/II program P. Next, U writes declarative user feedback rules that specify which parts of P's data (e.g., input, intermediate, or output data) users can edit, and via which user interfaces. The augmented program P is then executed and enters a loop of waiting for and incorporating user feedback. Given user feedback F on a data portion of P, we show how to automatically propagate F to the rest of P, and to seamlessly combine F with prior user feedback. We describe the syntax and semantics of hlog, a baseline execution strategy, and then various optimization techniques. Finally, we describe experiments with real-world data that demonstrate the promise of our solution.
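The feedback loop described above can be illustrated with a small Python sketch. This is not hlog and not the paper's algorithm: the `Stage` and `Program` classes, the `extract_people` and `make_pages` stages, and the sample documents are all hypothetical names invented for illustration. The sketch assumes the simplest possible strategy, re-running the whole program from scratch and re-applying all stored edits at the stage where they were made, so that a correction to intermediate data flows into later stages and accumulates with earlier corrections. The optimization techniques the abstract mentions are not modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set, Tuple

Row = Tuple[str, ...]


@dataclass
class Stage:
    """One step of a toy IE/II pipeline (hypothetical, not hlog)."""
    name: str
    fn: Callable[[List[Row]], List[Row]]
    editable: bool = False  # stands in for a "user feedback rule"


@dataclass
class Program:
    stages: List[Stage]
    # Feedback accumulated so far, keyed by stage name.
    deleted: Dict[str, Set[Row]] = field(default_factory=dict)
    inserted: Dict[str, Set[Row]] = field(default_factory=dict)

    def give_feedback(self, stage_name: str,
                      delete: Iterable[Row] = (),
                      insert: Iterable[Row] = ()) -> None:
        """Record user edits on one stage's output; later edits win."""
        editable = {s.name for s in self.stages if s.editable}
        if stage_name not in editable:
            raise PermissionError(f"stage {stage_name!r} is not open to user edits")
        dele = self.deleted.setdefault(stage_name, set())
        ins = self.inserted.setdefault(stage_name, set())
        for row in delete:
            dele.add(row)
            ins.discard(row)
        for row in insert:
            ins.add(row)
            dele.discard(row)

    def run(self, rows: Iterable[Row]) -> List[Row]:
        """Re-execute every stage, applying stored edits after each one."""
        rows = list(rows)
        for stage in self.stages:
            rows = stage.fn(rows)
            dele = self.deleted.get(stage.name, set())
            rows = [r for r in rows if r not in dele]
            for r in sorted(self.inserted.get(stage.name, set())):
                if r not in rows:
                    rows.append(r)
        return rows


def extract_people(rows: List[Row]) -> List[Row]:
    """Naive extractor: the token after 'Prof.' is taken to be a person."""
    out: List[Row] = []
    for (text,) in rows:
        tokens = text.split()
        for i, tok in enumerate(tokens[:-1]):
            if tok == "Prof.":
                out.append((tokens[i + 1],))
    return out


def make_pages(rows: List[Row]) -> List[Row]:
    """Downstream step that consumes the extracted names."""
    return [(name, "/people/" + name.lower()) for (name,) in rows]


def build_program() -> Program:
    return Program(stages=[
        Stage("extract_people", extract_people, editable=True),
        Stage("make_pages", make_pages),
    ])


DOCS: List[Row] = [
    ("Prof. Smith met Prof. Jones",),
    ("Prof. Madison Hall is closed",),  # "Madison" is an extraction mistake
]

program = build_program()
first = program.run(DOCS)
# A user removes the wrong extraction; the fix propagates to make_pages.
program.give_feedback("extract_people", delete=[("Madison",)])
second = program.run(DOCS)
# A later edit is combined with the earlier one on the next run.
program.give_feedback("extract_people", insert=[("Doe",)])
third = program.run(DOCS)
```

In this sketch `second` no longer contains a page for "Madison", and `third` additionally contains a page for "Doe" while the earlier deletion still holds. Re-running everything on each edit is the naive choice made here for clarity only.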


