Strategies for lifelong knowledge extraction from the web
Source
International Conference On Knowledge Capture
Proceedings of the 4th international conference on Knowledge capture
Whistler, BC, Canada
SESSION: Text analysis for knowledge acquisition
Pages: 95-102
Year of Publication: 2007
ISBN: 978-1-59593-643-1
Authors
Michele Banko  University of Washington, Seattle, WA
Oren Etzioni  University of Washington, Seattle, WA
Sponsors
SIGART: ACM Special Interest Group on Artificial Intelligence
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1298406.1298425

ABSTRACT

The increasing availability of electronic text has made it possible to acquire information using a variety of techniques that leverage the expertise of both humans and machines. In particular, the field of Information Extraction (IE), in which knowledge is extracted automatically from text, has shown promise for large-scale knowledge acquisition. While IE systems can uncover assertions about individual entities with an increasing level of sophistication, text understanding -- the formation of a coherent theory from a textual corpus -- involves representation and learning abilities not currently achievable by today's IE systems. Compared to the individual relational assertions output by IE systems, a theory includes coherent knowledge of abstract concepts and the relationships among them. We believe that the ability to fully discover the richness of knowledge present within large, unstructured and heterogeneous corpora will require a lifelong learning process in which earlier learned knowledge is used to guide subsequent learning. This paper introduces Alice, a lifelong learning agent whose goal is to automatically discover a collection of concepts, facts and generalizations that describe a particular topic of interest directly from a large volume of Web text. Building upon recent advances in unsupervised information extraction, we demonstrate that Alice can iteratively discover new concepts and compose general domain knowledge with a precision of 78%.



