ABSTRACT
This article examines and extends the logical models of information retrieval in the context of probability theory. The fundamental notions of term weights and relevance are given probabilistic interpretations. A unified framework is developed for modeling the retrieval process with probabilistic inference. This new approach provides a common conceptual and mathematical basis for many retrieval models, such as the Boolean, fuzzy set, vector space, and conventional probabilistic models. Within this framework, the underlying assumptions employed by each model are identified, and the inherent relationships between these models are analyzed. Although this article is mainly a theoretical analysis of probabilistic inference for information retrieval, practical methods for estimating the required probabilities are provided by simple examples.
CITED BY 43
Robert W.P. Luk, H. V. Leong, Tharam S. Dillon, Alvin T.S. Chan, W. Bruce Croft, James Allan, A survey in indexing and searching XML documents, Journal of the American Society for Information Science and Technology, v.53 n.6, p.415-437, May 2002
Norbert Gövert, Mounia Lalmas, Norbert Fuhr, A probabilistic description-oriented approach for categorizing web documents, Proceedings of the eighth international conference on Information and knowledge management, p.475-482, November 02-06, 1999, Kansas City, Missouri, United States
F. Kokkoras, H. Jiang, I. Vlahavas, A. K. Elmagarmid, E. N. Houstis, W. G. Aref, Smart videotext: a video data model based on conceptual graphs, Multimedia Systems, v.8 n.4, p.328-338, July 2002
King-Kup Liu, Weiyi Meng, Clement Yu, Discovery of similarity computations of search engines, Proceedings of the ninth international conference on Information and knowledge management, p.290-297, November 06-11, 2000, McLean, Virginia, United States
Jane Cleland-Huang, Raffaella Settimi, Oussama BenKhadra, Eugenia Berezhanskaya, Selvia Christina, Goal-centric traceability for managing non-functional requirements, Proceedings of the 27th international conference on Software engineering, May 15-21, 2005, St. Louis, MO, USA
Shuming Shi, Ji-Rong Wen, Qing Yu, Ruihua Song, Wei-Ying Ma, Gravitation-based model for information retrieval, Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, August 15-19, 2005, Salvador, Brazil
REVIEW
Reviewer: Duncan A. Buell
The problem of the mathematical framework for the documents-to-queries matching portion of an information retrieval system is examined. Assuming some universe in which probabilities can be computed, the fundamental notio...