ABSTRACT
Given the ranked lists of documents returned by multiple search engines in response to a given query, the problem of metasearch is to combine these lists in a way that optimizes the performance of the combination. This problem can be naturally decomposed into three subproblems: (1) normalizing the relevance scores given by the input systems, (2) estimating relevance scores for unretrieved documents, and (3) combining the newly acquired scores for each document into one improved score.

Research on the problem of metasearch has historically concentrated on algorithms for combining (normalized) scores. In this paper, we show that the techniques used for normalizing relevance scores and estimating the relevance scores of unretrieved documents can have a significant effect on the overall performance of metasearch. We propose two new normalization/estimation techniques and demonstrate empirically that the performance of well-known metasearch algorithms can be significantly improved through their use.
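The three subproblems can be illustrated with a minimal sketch. It is not the method proposed in the paper: it assumes a conventional baseline in which each system's scores are shifted and scaled into [0, 1] (min-max normalization), unretrieved documents receive a fixed score of 0, and the normalized scores are summed in the style of CombSUM. The function names, the `unretrieved_score` parameter, and the sample scores are hypothetical, chosen only for illustration.

```python
def min_max_normalize(scores):
    """Subproblem 1: shift and scale one system's scores into [0, 1].

    Assumption: plain min-max normalization, used here as a common baseline.
    """
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Degenerate case: all scores are equal, so treat them all as 1.0.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_sum(runs, unretrieved_score=0.0):
    """Fuse several {doc_id: raw_score} runs into one ranked list.

    Subproblem 2: a document missing from a run gets `unretrieved_score`
    (assumed to be 0.0 by default).
    Subproblem 3: the fused score is the sum of the normalized scores.
    """
    normalized = [min_max_normalize(run) for run in runs]
    all_docs = set().union(*normalized)
    fused = {
        doc: sum(run.get(doc, unretrieved_score) for run in normalized)
        for doc in all_docs
    }
    # Highest fused score first; ties are broken by document id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Two hypothetical engines that score on different scales.
    engine_a = {"d1": 120.0, "d2": 70.0, "d3": 20.0}
    engine_b = {"d2": 9.0, "d4": 5.0, "d1": 1.0}
    for doc, score in comb_sum([engine_a, engine_b]):
        print(doc, score)
```

In this sketch the choices made in the first two steps are visible in the result: `d1` is the top document for the first engine but the lowest-scored retrieved document for the second, so min-max normalization gives it a contribution of 0 from the second engine, exactly what an unretrieved document would receive. A different normalization or a different `unretrieved_score` would change the fused ranking even though the combination step stays the same.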