| Improving text collection selection with coverage and overlap statistics |
| Full text |
Pdf
(100 KB)
|
| Source
|
International World Wide Web Conference
archive
Special interest tracks and posters of the 14th international conference on World Wide Web
table of contents
Chiba, Japan
POSTER SESSION: Posters
table of contents
Pages: 1128 - 1129
Year of Publication: 2005
ISBN:1-59593-051-5
|
|
Authors
|
|
| Sponsor |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 1, Downloads (12 Months): 14, Citation Count: 6
|
|
|
ABSTRACT
In an environment of distributed text collections, the first step in the information retrieval process is to identify which of all available collections are more relevant to a given query and which should thus be accessed to answer the query. We address the challenge of collection selection when there is full or partial overlap between the available text collections, a scenario which has not been examined previously despite its real-world applications. To that end, we present COSCO, a collection selection approach which uses collection-specific coverage and overlap statistics. We describe our experimental results which show that the presented approach displays the desired behavior of retrieving more new results early on in the collection order, and performs consistently and significantly better than CORI, previously considered to be one of the best collection selection systems.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
BibFinder: A Computer Science Bibliography Mediator. http://rakaposhi.eas.asu.edu/bibfinder, 2004.
|
 |
2
|
James P. Callan , Zhihong Lu , W. Bruce Croft, Searching distributed collections with inference networks, Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval, p.21-28, July 09-13, 1995, Seattle, Washington, United States
[doi> 10.1145/215206.215328]
|
| |
3
|
T. Hernandez and S. Kambhampati. Collection selection in the presence of overlapping text collections. Submitted to ACM SIGIR, 2005. http://rakaposhi.eas.asu.edu/cosco.html
|
| |
4
|
|
 |
5
|
|
CITED BY 6
|
|
Matthias Bender , Sebastian Michel , Peter Triantafillou , Gerhard Weikum , Christian Zimmer, Improving collection selection with overlap awareness in P2P search engines, Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, August 15-19, 2005, Salvador, Brazil
|
|
|
Milad Shokouhi , Justin Zobel , Yaniv Bernstein, Distributed text retrieval from overlapping collections, Proceedings of the eighteenth conference on Australasian database, p.141-150, January 30-February 02, 2007, Ballarat, Victoria, Australia
|
|
|
Sebastian Michel , Matthias Bender , Nikos Ntarmos , Peter Triantafillou , Gerhard Weikum , Christian Zimmer, Discovering and exploiting keyword and attribute-value co-occurrences to improve P2P routing indices, Proceedings of the 15th ACM international conference on Information and knowledge management, November 06-11, 2006, Arlington, Virginia, USA
|
|
|
|
|
|
|
|
|
|
|