|
ABSTRACT
Statistical language models have been proposed recently for several information retrieval tasks, including the resource selection task in distributed information retrieval. This paper extends the language modeling approach to integrate resource selection, ad-hoc searching, and merging of results from different text databases into a single probabilistic retrieval model. This new approach is designed primarily for Intranet environments, where it is reasonable to assume that resource providers are relatively homogeneous and can adopt the same kind of search engine. Experiments demonstrate that this new, integrated approach is at least as effective as the prior state-of-the-art in distributed IR.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
J. Callan. Distributed information retrieval. In W.B. Croft, editor, Advances in information retrieval, chapter 5, pages 127--150. Kluwer Academic Publishers, 2000.
|
 |
2
|
|
 |
3
|
|
 |
4
|
James P. Callan , Zhihong Lu , W. Bruce Croft, Searching distributed collections with inference networks, Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval, p.21-28, July 09-13, 1995, Seattle, Washington, United States
[doi> 10.1145/215206.215328]
|
 |
5
|
|
 |
6
|
Luis Gravano , Chen-Chuan K. Chang , Héctor García-Molina , Andreas Paepcke, STARTS: Stanford proposal for Internet meta-searching, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.207-218, May 11-15, 1997, Tucson, Arizona, United States
|
 |
7
|
Luis Gravano , Héctor García-Molina , Anthony Tomasic, The effectiveness of GIOSS for the text database discovery problem, Proceedings of the 1994 ACM SIGMOD international conference on Management of data, p.126-137, May 24-27, 1994, Minneapolis, Minnesota, United States
|
| |
8
|
|
 |
9
|
|
| |
10
|
N. Craswell, D. Hawking, and P. Thistlewaite. Merging Results from Isolated Search Engines. In Proc. of the Tenth Australasian Database Conf., pages 189--200, 1999
|
 |
11
|
James C. French , Allison L. Powell , Jamie Callan , Charles L. Viles , Travis Emmitt , Kevin J. Prey , Yun Mou, Comparing the performance of database selection algorithms, Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, p.238-245, August 15-19, 1999, Berkeley, California, United States
[doi> 10.1145/312624.312684]
|
| |
12
|
A F. Smeaton and F. Crimmins Using A Data Fusion Agent for Searching the WWW. In Proc of The Sixth International World Wide Web Conference, 1997.
|
 |
13
|
|
 |
14
|
|
 |
15
|
|
 |
16
|
Leah S. Larkey , Margaret E. Connell , Jamie Callan, Collection selection and results merging with topically organized U.S. patents and TREC data, Proceedings of the ninth international conference on Information and knowledge management, p.282-289, November 06-11, 2000, McLean, Virginia, United States
[doi> 10.1145/354756.354830]
|
| |
17
|
Lemur Toolkit http://www.cs.cmu.edu/~lemur
|
| |
18
|
|
 |
19
|
David R. H. Miller , Tim Leek , Richard M. Schwartz, A hidden Markov model information retrieval system, Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, p.214-221, August 15-19, 1999, Berkeley, California, United States
[doi> 10.1145/312624.312680]
|
 |
20
|
|
| |
21
|
C. Buckley, A. Singhal, M. Mitra, and G. Salton, New retrieval approaches using SMART. In Proceedings of 1995 Text REtrieval Conference (TREC-3). National Institute of Standards and Technology, special publication.
|
CITED BY 24
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Matthias Bender , Sebastian Michel , Peter Triantafillou , Gerhard Weikum , Christian Zimmer, Improving collection selection with overlap awareness in P2P search engines, Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, August 15-19, 2005, Salvador, Brazil
|
|
|
|
|
|
|
|
|
Klaus Berberich , Manolis Koubarakis , Christos Tryfonopoulos , Gerhard Weikum , Christian Zimmer, MAPS: approximate publish/subscribe functionality in peer-to-peer networks, Proceedings of the 1st international workshop on Advanced data processing in ubiquitous computing (ADPUC 2006), November 27-December 01, 2006, Melbourne, Australia
|
|
|
Milad Shokouhi , Justin Zobel , Yaniv Bernstein, Distributed text retrieval from overlapping collections, Proceedings of the eighteenth conference on Australasian database, p.141-150, January 30-February 02, 2007, Ballarat, Victoria, Australia
|
|
|
Milad Shokouhi , Justin Zobel , Falk Scholer , S. M. M. Tahaghoghi, Capturing collection size for distributed non-cooperative retrieval, Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, August 06-11, 2006, Seattle, Washington, USA
|
|
|
Sebastian Michel , Matthias Bender , Nikos Ntarmos , Peter Triantafillou , Gerhard Weikum , Christian Zimmer, Discovering and exploiting keyword and attribute-value co-occurrences to improve P2P routing indices, Proceedings of the 15th ACM international conference on Information and knowledge management, November 06-11, 2006, Arlington, Virginia, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|