A decision-theoretic approach to database selection in networked IR
Source: ACM Transactions on Information Systems (TOIS)
Volume 17, Issue 3 (July 1999)
Pages: 229-249
Year of Publication: 1999
ISSN: 1046-8188
Author: Norbert Fuhr, University of Dortmund
Publisher: ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/314516.314517

ABSTRACT

In networked IR, a client submits a query to a broker, which is in contact with a large number of databases. In order to yield a maximum number of documents at minimum cost, the broker has to estimate the retrieval cost of each database, and then decide for each database whether or not to use it for the current query and, if so, how many documents to retrieve from it. For this purpose, we develop a general decision-theoretic model and discuss different cost structures. Besides the costs of retrieving relevant versus nonrelevant documents, we consider the following parameters for each database: expected retrieval quality, expected number of relevant documents in the database, and cost factors for query processing and document delivery. For computing the overall optimum, a divide-and-conquer algorithm is given. If there are several brokers knowing different databases, a preselection of brokers can only be performed heuristically, but the computation of the optimum can be done similarly to the single-broker case. In addition, we derive a formula that estimates the number of relevant documents in a database based on dictionary information.
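
The selection problem described in the abstract can be illustrated with a minimal sketch. This is not the article's algorithm or cost model: the abstract gives no formulas, so the cost curve below (`toy_cost`, a fixed connection fee plus a per-document fee) and all function names are assumptions made for illustration only. The sketch shows one way a divide-and-conquer scheme could work: split the set of databases in half, solve each half for every target number of documents, and merge the two halves by choosing the cheapest split.

```python
from typing import List, Optional, Tuple

Table = List[float]      # Table[k]: expected cost of obtaining k documents
Plan = List[List[int]]   # Plan[k]: documents requested from each database for target k


def toy_cost(fixed: float, per_doc: float, n: int) -> Table:
    """Hypothetical cost curve: zero if the database is unused,
    otherwise a connection fee plus a per-document fee."""
    return [0.0 if k == 0 else fixed + per_doc * k for k in range(n + 1)]


def combine(a: Table, plan_a: Plan, b: Table, plan_b: Plan) -> Tuple[Table, Plan]:
    """Merge two groups of databases: for each target k, pick the cheapest
    way to take i documents from the first group and k - i from the second."""
    n = len(a) - 1
    cost: Table = []
    plan: Plan = []
    for k in range(n + 1):
        best = min(range(k + 1), key=lambda i: a[i] + b[k - i])
        cost.append(a[best] + b[k - best])
        plan.append(plan_a[best] + plan_b[k - best])
    return cost, plan


def select(costs: List[Table], lo: int = 0, hi: Optional[int] = None) -> Tuple[Table, Plan]:
    """Divide and conquer over databases lo..hi-1 (assumed scheme, see lead-in)."""
    if hi is None:
        hi = len(costs)
    if hi - lo < 1:
        raise ValueError("at least one database is required")
    if hi - lo == 1:
        n = len(costs[lo]) - 1
        return costs[lo], [[k] for k in range(n + 1)]
    mid = (lo + hi) // 2
    a, plan_a = select(costs, lo, mid)
    b, plan_b = select(costs, mid, hi)
    return combine(a, plan_a, b, plan_b)


# Three hypothetical databases, up to 3 documents requested in total.
costs = [toy_cost(5.0, 1.0, 3), toy_cost(1.0, 3.0, 3), toy_cost(0.0, 4.0, 3)]
total, plan = select(costs)
print(total[3], plan[3])
```

In this toy setting a database with a high connection fee but cheap documents wins once enough documents are requested, while for small requests a database with a low connection fee is cheaper. A realistic cost curve would also have to reflect the parameters named in the abstract, such as expected retrieval quality and the expected number of relevant documents.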


