ABSTRACT
This paper presents a new approach to determining the senses of words in queries by using WordNet. In our approach, the noun phrases in a query are identified first. For each word in the query, the information associated with it, including its synonyms, hyponyms, hypernyms, the definitions of its synonyms and hyponyms, and its domains, can be used for word sense disambiguation. By comparing these pieces of information for the words that form a phrase, it may be possible to assign senses to these words. If this disambiguation fails, then the other query words, if any exist, are used by going through exactly the same process. If the sense of a query word still cannot be determined in this manner, then a guess at the sense of the word is made, provided the guess has at least a 50% chance of being correct. If no sense of the word has a 50% or higher chance of being used, then we apply a Web search to assist in the word sense disambiguation process. Experimental results show that our approach has 100% applicability and 90% accuracy on the 250 queries of the most recent robust track of the TREC collection. We integrate this disambiguation algorithm into our retrieval system to examine the effect of word sense disambiguation on text retrieval. Experimental results show that the disambiguation algorithm, together with the other components of our retrieval system, yields a result that is 13.7% above that produced by the same system without disambiguation, and 9.2% above that produced by using Lesk's algorithm. Our retrieval effectiveness is 7% better than the best reported result in the literature.
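The cascade described in the abstract (phrase words first, then other query words, then a guess with at least a 50% chance, then a Web search) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a tiny hand-made sense inventory in place of WordNet, a simple term-overlap score in place of the paper's comparison of synonyms, hyponyms, hypernyms, definitions, and domains, and a caller-supplied stub in place of the Web search step. The names `INVENTORY`, `by_overlap`, `by_frequency`, `disambiguate`, and `web_lookup` are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical toy inventory standing in for WordNet (an assumption):
# word -> sense id -> associated terms and a made-up usage count.
INVENTORY: Dict[str, Dict[str, dict]] = {
    "bank": {
        "bank#1": {"terms": {"financial", "institution", "deposit", "money"}, "freq": 20},
        "bank#2": {"terms": {"river", "slope", "land", "water"}, "freq": 25},
    },
    "deposit": {
        "deposit#1": {"terms": {"money", "bank", "account"}, "freq": 30},
        "deposit#2": {"terms": {"sediment", "mineral", "layer"}, "freq": 10},
    },
    "pitch": {
        "pitch#1": {"terms": {"sound", "frequency", "tone"}, "freq": 10},
        "pitch#2": {"terms": {"throw", "baseball", "ball"}, "freq": 10},
        "pitch#3": {"terms": {"sales", "talk", "persuade"}, "freq": 10},
    },
}


def context_terms(words: List[str], inventory: Dict[str, Dict[str, dict]]) -> Set[str]:
    """Collect the context words plus every term attached to any of their senses."""
    terms = set(words)
    for w in words:
        for sense in inventory.get(w, {}).values():
            terms |= sense["terms"]
    return terms


def by_overlap(word: str, context: List[str],
               inventory: Dict[str, Dict[str, dict]]) -> Optional[str]:
    """Return the sense with the strictly largest term overlap, or None if no clear winner."""
    if not context:
        return None
    terms = context_terms(context, inventory)
    scores = {sid: len(s["terms"] & terms) for sid, s in inventory[word].items()}
    best = max(scores.values())
    winners = [sid for sid, score in scores.items() if score == best]
    return winners[0] if best > 0 and len(winners) == 1 else None


def by_frequency(word: str, inventory: Dict[str, Dict[str, dict]],
                 threshold: float = 0.5) -> Optional[str]:
    """Guess the most frequent sense only if its share of usage reaches the threshold."""
    senses = inventory[word]
    total = sum(s["freq"] for s in senses.values())
    if total == 0:
        return None
    sid, sense = max(senses.items(), key=lambda kv: kv[1]["freq"])
    return sid if sense["freq"] / total >= threshold else None


def disambiguate(word: str, phrase_words: List[str], other_words: List[str],
                 web_lookup: Callable[[str, List[str]], str],
                 inventory: Dict[str, Dict[str, dict]] = INVENTORY) -> Tuple[str, str]:
    """Try each stage in order and report which stage produced the sense."""
    sense = by_overlap(word, phrase_words, inventory)
    if sense:
        return sense, "phrase"
    sense = by_overlap(word, other_words, inventory)
    if sense:
        return sense, "query"
    sense = by_frequency(word, inventory)
    if sense:
        return sense, "frequency"
    # Last resort: a caller-supplied stand-in for the Web search step.
    return web_lookup(word, phrase_words + other_words), "web"


def stub_web_lookup(word: str, context: List[str]) -> str:
    """Placeholder only: always returns the first sense id; no real search is performed."""
    return word + "#1"


print(disambiguate("bank", ["deposit"], [], stub_web_lookup))
print(disambiguate("pitch", [], [], stub_web_lookup))
```

Returning the stage name alongside the sense makes it easy to count how often each fallback fires on a query set, which is one way to check that every query word receives some sense.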