ABSTRACT
This paper introduces the use of Wikipedia as a resource for automatic keyword extraction and word sense disambiguation, and shows how this online encyclopedia can be used to achieve state-of-the-art results on both tasks. The paper also shows how the two methods can be combined into a system that automatically enriches a text with links to encyclopedic knowledge. Given an input document, the system identifies the important concepts in the text and links them to the corresponding Wikipedia pages. Evaluations of the system show that the automatic annotations are reliable and hardly distinguishable from manual annotations.