ABSTRACT
Given the ranked lists of documents returned by multiple search engines in response to a given query, the problem of metasearch is to combine these lists in a way that optimizes the performance of the combination. This paper makes three contributions to the problem of metasearch: (1) we describe and investigate a metasearch model based on an optimal democratic voting procedure, the Borda Count; (2) we describe and investigate a metasearch model based on Bayesian inference; and (3) we describe and investigate a model for obtaining upper bounds on the performance of metasearch algorithms. Our experimental results show that metasearch algorithms based on the Borda and Bayesian models usually outperform the best input system and are competitive with, and often outperform, existing metasearch strategies. Finally, our initial upper bounds demonstrate that there is much to learn about the limits of the performance of metasearch.
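To make the voting idea concrete, the following is a minimal sketch of a textbook Borda count applied to ranked result lists. It is an illustration under stated assumptions, not necessarily the exact procedure evaluated in the paper: the function name `borda_fuse`, the document identifiers, and the rule that documents an engine did not return share that engine's leftover points evenly are all assumptions introduced here, since the abstract does not spell out these details.

```python
from collections import defaultdict


def borda_fuse(ranked_lists):
    """Fuse ranked lists of document ids with a textbook Borda count.

    Each list is one engine's ranking, best document first, with no
    duplicates. Returns all documents ordered by total Borda points.
    """
    # Candidate pool: every document returned by at least one engine.
    candidates = set()
    for ranking in ranked_lists:
        candidates.update(ranking)
    c = len(candidates)

    scores = defaultdict(float)
    for ranking in ranked_lists:
        # Rank 1 earns c points, rank 2 earns c - 1, and so on.
        for position, doc in enumerate(ranking):
            scores[doc] += c - position
        # Assumption: points this engine did not hand out are split
        # evenly among the documents it did not return.
        unranked = candidates - set(ranking)
        if unranked:
            leftover = sum(range(1, c - len(ranking) + 1))
            share = leftover / len(unranked)
            for doc in unranked:
                scores[doc] += share

    # Highest total first; ties broken by document id for determinism.
    return sorted(candidates, key=lambda d: (-scores[d], d))


# Hypothetical rankings from three engines for one query.
engine_a = ["d1", "d2", "d3"]
engine_b = ["d2", "d1", "d4"]
engine_c = ["d2", "d3"]
fused = borda_fuse([engine_a, engine_b, engine_c])
print(fused)  # ['d2', 'd1', 'd3', 'd4']
```

In this example `d2` wins with 11 points because two engines rank it first and the third ranks it second. The sketch uses only rank positions, so it needs no relevance scores or training data from the input engines.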