Approaches to collection selection and results merging for distributed information retrieval
Source: Conference on Information and Knowledge Management
Proceedings of the tenth international conference on Information and knowledge management
Atlanta, Georgia, USA
Session: Distributed Information Retrieval
Pages: 191-198
Year of Publication: 2001
ISBN:1-58113-436-3
Authors
Yves Rasolofo  Université de Neuchâtel, Neuchâtel, Switzerland
Faïza Abbaci  Ecole nationale supérieure des Mines de Saint-Étienne, Saint-Étienne, France
Jacques Savoy  Université de Neuchâtel, Neuchâtel, Switzerland
Sponsors
SIGMIS: ACM Special Interest Group on Management Information Systems
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/502585.502618

ABSTRACT

We have investigated two major issues in Distributed Information Retrieval (DIR): collection selection and search results merging. While most published work on these two issues relies on pre-stored metadata, the approaches described in this paper extract the required information at the time the query is processed. To predict the relevance of collections to a given query, we analyse a limited number of full documents (e.g., the top five documents) retrieved from each collection and consider term proximity within them. Our merging technique, by contrast, is simple: its only inputs are the document scores and the lengths of the result lists. Our experiments evaluate the retrieval effectiveness of these approaches and compare them with centralised indexing and various other DIR techniques (e.g., CORI). We conducted our experiments using two testbeds: one containing news articles extracted from four different sources (2 GB) and another containing 10 GB of Web pages. Our evaluations demonstrate that the retrieval effectiveness of our simple approaches is worth considering.
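
To make the merging idea concrete, here is a minimal sketch of a merge step that uses only document scores and result-list lengths. The abstract names these inputs but does not give a formula, so the log-based collection score, the constant `k`, and the function name `merge_results` are illustrative assumptions and should not be read as the authors' exact method.

```python
import math


def merge_results(result_lists, k=600.0, top_n=10):
    """Merge per-collection ranked lists using only scores and list lengths.

    result_lists maps a collection name to a list of (doc_id, score) pairs.
    The weighting scheme and the default k are assumptions for illustration.
    """
    lengths = {name: len(docs) for name, docs in result_lists.items()}
    total = sum(lengths.values())
    if total == 0:
        return []

    # Assumed collection score: grows with the share of all results
    # that this collection contributed.
    coll_score = {
        name: math.log(1.0 + k * n / total) for name, n in lengths.items()
    }
    mean_score = sum(coll_score.values()) / len(coll_score)

    # Weight is above 1 when a collection's score exceeds the mean,
    # below 1 when it falls short.
    weight = {
        name: 1.0 + (s - mean_score) / mean_score
        for name, s in coll_score.items()
    }

    # Rescale every document score by its collection weight, then sort.
    merged = [
        (doc_id, score * weight[name], name)
        for name, docs in result_lists.items()
        for doc_id, score in docs
    ]
    merged.sort(key=lambda item: item[1], reverse=True)
    return merged[:top_n]
```

With two collections returning two and one documents respectively, a document from the longer list outranks an equally scored document from the shorter one, because the longer list receives the larger weight. No pre-stored collection metadata is needed, which matches the query-time emphasis of the abstract.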


