ABSTRACT
We present an analysis of word senses that provides fresh insight into the impact of word ambiguity on retrieval effectiveness, with potential broader implications for other processes of information retrieval. Using a methodology of forming artificially ambiguous words, known as pseudowords, and through reference to other researchers' work, the analysis illustrates that the distribution of the frequency of occurrence of the senses of a word plays a strong role in ambiguity's impact on effectiveness. Further investigation shows that this analysis may also be applicable to other processes of retrieval, such as Cross Language Information Retrieval, query expansion, retrieval of OCR'ed texts, and stemming. The analysis appears to provide a means of explaining, at least in part, the reasons for the processes' impact (or lack of it) on effectiveness.
CITED BY 8
Guang Qiu, Kangmiao Liu, Jiajun Bu, Chun Chen, Zhiming Kang, Quantify query ambiguity using ODP metadata, Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, July 23-27, 2007, Amsterdam, The Netherlands
REVIEW
Reviewer: Richard S. Marcus
Attempts to incorporate word sense distinctions into the
information retrieval process have generally led, counterintuitively, to
minimal improvements in retrieval effectiveness. The authors, and
others, have speculated that the skewed distrib...