On modeling information retrieval with probabilistic inference
Source: ACM Transactions on Information Systems (TOIS)
Volume 13, Issue 1 (January 1995)
Pages: 38-68
Year of Publication: 1995
ISSN: 1046-8188
Authors
S. K. M. Wong  Univ. of Regina, Regina, Sask., Canada
Y. Y. Yao  Lakehead Univ., Thunder Bay, Ont., Canada
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 7, Downloads (12 Months): 114, Citation Count: 43
DOI: http://doi.acm.org/10.1145/195705.195713

ABSTRACT

This article examines and extends the logical models of information retrieval in the context of probability theory. The fundamental notions of term weights and relevance are given probabilistic interpretations. A unified framework is developed for modeling the retrieval process with probabilistic inference. This new approach provides a common conceptual and mathematical basis for many retrieval models, such as the Boolean, fuzzy set, vector space, and conventional probabilistic models. Within this framework, the underlying assumptions employed by each model are identified, and the inherent relationships between these models are analyzed. Although this article is mainly a theoretical analysis of probabilistic inference for information retrieval, practical methods for estimating the required probabilities are provided by simple examples.


REVIEW

Reviewer: Duncan A. Buell

The problem of the mathematical framework for the documents-to-queries matching portion of an information retrieval system is examined. Assuming some universe in which probabilities can be computed, the fundamental notio...
