Predicting query performance
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Tampere, Finland
SESSION: Queries
Pages: 299 - 306  
Year of Publication: 2002
ISBN:1-58113-561-0
Authors
Steve Cronen-Townsend  University of Massachusetts, Amherst, MA
Yun Zhou  University of Massachusetts, Amherst, MA
W. Bruce Croft  University of Massachusetts, Amherst, MA
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 27, Downloads (12 Months): 174, Citation Count: 88
DOI: http://doi.acm.org/10.1145/564376.564429

ABSTRACT

We develop a method for predicting query performance by computing the relative entropy between a query language model and the corresponding collection language model. The resulting clarity score measures the coherence of the language usage in documents whose models are likely to generate the query. We suggest that clarity scores measure the ambiguity of a query with respect to a collection of documents and show that they correlate positively with average precision in a variety of TREC test sets. Thus, the clarity score may be used to identify ineffective queries, on average, without relevance information. We develop an algorithm for automatically setting the clarity score threshold between predicted poorly-performing queries and acceptable queries and validate it using TREC data. In particular, we compare the automatic thresholds to optimum thresholds and also check how frequently results as good are achieved in sampling experiments that randomly assign queries to the two classes.
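The clarity score described above can be illustrated with a minimal sketch. The abstract specifies only that the score is the relative entropy between a query language model and the collection language model; everything else here is an assumption made for illustration: the function name `clarity_score`, the linear smoothing weight `lam` (which must stay below 1), the uniform document prior, the use of every document rather than a retrieved subset, and the toy collection.

```python
import math
from collections import Counter


def clarity_score(query, docs, lam=0.6):
    """Relative entropy, in bits, between a query model and the collection model.

    query: list of terms; docs: list of token lists.
    lam is an assumed smoothing weight (0 <= lam < 1), not taken from the abstract.
    """
    # Collection model: relative frequency of each term over all documents.
    coll = Counter(w for d in docs for w in d)
    total = sum(coll.values())
    p_coll = {w: c / total for w, c in coll.items()}

    # Document model: term frequency linearly smoothed with the collection model.
    def doc_model(doc):
        tf = Counter(doc)
        return {w: lam * tf[w] / len(doc) + (1 - lam) * p_coll[w] for w in p_coll}

    models = [doc_model(d) for d in docs if d]  # skip empty documents

    # Weight each document by how likely its model is to generate the query
    # (uniform prior assumed); query terms absent from the collection are ignored.
    terms = [q for q in query if q in p_coll]
    likelihoods = [math.prod(m[q] for q in terms) for m in models]
    norm = sum(likelihoods)
    p_doc = [lik / norm for lik in likelihoods]

    # Query model P(w|Q) is the weighted mixture of document models;
    # the score sums P(w|Q) * log2(P(w|Q) / P_coll(w)) over the vocabulary.
    score = 0.0
    for w, pc in p_coll.items():
        pq = sum(pd * m[w] for pd, m in zip(p_doc, models))
        score += pq * math.log2(pq / pc)
    return score


# Toy collection (hypothetical): "car" occurs only in car documents,
# while "jaguar" occurs in one car document and one animal document.
docs = [
    ["jaguar", "car", "engine", "car"],
    ["jaguar", "cat", "jungle", "cat"],
    ["engine", "car", "speed", "car"],
    ["jungle", "cat", "tree", "cat"],
]
print(clarity_score(["car"], docs))
print(clarity_score(["jaguar"], docs))
```

In this toy setting the query "car" scores higher than the ambiguous query "jaguar", which matches the intuition that a query whose likely documents share coherent language usage is clearer. The thresholding algorithm mentioned in the abstract is not sketched here because the abstract does not describe it.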


