ABSTRACT
We describe a method for the automatic acquisition of the hyponymy lexical relation from unrestricted text. Two goals motivate the approach: (i) avoidance of the need for pre-encoded knowledge and (ii) applicability across a wide range of text. We identify a set of lexico-syntactic patterns that are easily recognizable, that occur frequently and across text genre boundaries, and that indisputably indicate the lexical relation of interest. We describe a method for discovering these patterns and suggest that other lexical relations will also be acquirable in this way. A subset of the acquisition algorithm is implemented and the results are used to augment and critique the structure of a large hand-built thesaurus. Extensions and applications to areas such as information retrieval are suggested.
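As a rough illustration of what a lexico-syntactic pattern for hyponymy might look like, the sketch below matches a "such as" construction. The abstract does not list the patterns themselves, so this choice of pattern is an assumption. The names `SUCH_AS` and `extract_hyponyms` are hypothetical, and a noun phrase is crudely approximated as a single word rather than identified by a tagger or parser. It is a minimal sketch, not the acquisition algorithm described above.

```python
import re

# Assumed illustrative pattern: "<hypernym> such as <item>, <item>, and/or <item>".
# Each noun phrase is approximated as one word; a real system would need
# part-of-speech tagging or noun-phrase chunking here.
SUCH_AS = re.compile(
    r"\b(?P<hypernym>\w+),? such as "
    r"(?P<items>\w+(?:, (?!and\b|or\b)\w+)*(?:,? (?:and|or) \w+)?)"
)


def extract_hyponyms(text):
    """Return (hyponym, hypernym) pairs found by the assumed 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        hypernym = match.group("hypernym")
        # Split the enumerated items on commas and on the final "and"/"or".
        items = re.split(r",? (?:and|or) |, ", match.group("items"))
        pairs.extend((item, hypernym) for item in items if item)
    return pairs


if __name__ == "__main__":
    sentence = "works by authors such as Herrick, Goldsmith, and Shakespeare"
    for hyponym, hypernym in extract_hyponyms(sentence):
        print(f"hyponym({hyponym}, {hypernym})")
```

Because the pattern relies only on surface cues, it needs no pre-encoded knowledge, which is the first of the two goals stated in the abstract. The single-word approximation would miss multi-word noun phrases and does no lemmatization, so the hypernym is returned exactly as it appears in the text.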