On Relevance, Probabilistic Indexing and Information Retrieval
Source
Journal of the ACM (JACM), Volume 7, Issue 3 (July 1960)
Pages: 216-244
Year of Publication: 1960
ISSN: 0004-5411
Authors
M. E. Maron, The RAND Corporation, Santa Monica, California
J. L. Kuhns, Ramo-Wooldridge, Canoga Park, California
ABSTRACT
This paper reports on a novel technique for literature indexing and searching in a mechanized library system. The notion of relevance is taken as the key concept in the theory of information retrieval and a comparative concept of relevance is explicated in terms of the theory of probability. The resulting technique, called “Probabilistic Indexing,” allows a computing machine, given a request for information, to make a statistical inference and derive a number (called the “relevance number”) for each document, which is a measure of the probability that the document will satisfy the given request. The result of a search is an ordered list of those documents which satisfy the request, ranked according to their probable relevance.
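The ranking step described above can be illustrated with a minimal Python sketch. It is not the paper's own procedure: the abstract gives no formula, so the sketch assumes that each document carries a weight per index term and a prior probability of use, and that the relevance number is proportional to their product. The names `rank_by_relevance`, `index_weights` and `priors` are hypothetical.

```python
from typing import Dict, List, Tuple


def rank_by_relevance(
    index_weights: Dict[str, Dict[str, float]],
    priors: Dict[str, float],
    term: str,
) -> List[Tuple[str, float]]:
    """Rank documents for a one-term request.

    Assumption (not stated in the abstract): the relevance number of a
    document is proportional to prior(document) * weight(term, document).
    """
    scores = []
    for doc, weights in index_weights.items():
        weight = weights.get(term, 0.0)
        if weight > 0.0:
            # Only documents indexed under the term are candidates.
            scores.append((doc, priors.get(doc, 0.0) * weight))

    total = sum(score for _, score in scores)
    if total == 0.0:
        return []

    # Normalise so the relevance numbers of the candidates sum to 1,
    # then order the list from most to least probably relevant.
    normalised = [(doc, score / total) for doc, score in scores]
    return sorted(normalised, key=lambda pair: pair[1], reverse=True)


# Illustrative data only; the weights and priors are invented.
index_weights = {
    "d1": {"radar": 0.9},
    "d2": {"radar": 0.3, "sonar": 0.8},
    "d3": {"sonar": 0.5},
}
priors = {"d1": 0.2, "d2": 0.5, "d3": 0.3}
ranked = rank_by_relevance(index_weights, priors, "radar")
```

With this invented data, `d1` ranks ahead of `d2` and `d3` is not returned, because it is not indexed under the requested term.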
The paper goes on to show that whereas in a conventional library system the cross-referencing (“see” and “see also”) is based solely on the “semantical closeness” between index terms, statistical measures of closeness between index terms can be defined and computed. Thus, given an arbitrary request consisting of one (or many) index term(s), a machine can elaborate on it to increase the probability of selecting relevant documents that would not otherwise have been selected.
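As a rough illustration of a statistical closeness measure and of request elaboration, the sketch below estimates the conditional probability that a document indexed under one term is also indexed under another, and adds to the request any term whose closeness reaches a threshold. The abstract does not say which measures the paper defines, so this choice of measure, the threshold, and the names `term_closeness` and `elaborate_request` are assumptions made for the example.

```python
from typing import Dict, Set


def term_closeness(doc_terms: Dict[str, Set[str]], given: str, other: str) -> float:
    """Estimate P(other | given) from how often the two terms are
    assigned to the same document (an assumed closeness measure)."""
    with_given = [terms for terms in doc_terms.values() if given in terms]
    if not with_given:
        return 0.0
    both = sum(1 for terms in with_given if other in terms)
    return both / len(with_given)


def elaborate_request(
    doc_terms: Dict[str, Set[str]],
    request: Set[str],
    threshold: float,
) -> Set[str]:
    """Return the request plus every other index term whose closeness
    to some request term is at least `threshold`."""
    vocabulary = set().union(*doc_terms.values())
    expanded = set(request)
    for term in request:
        for candidate in vocabulary - request:
            if term_closeness(doc_terms, term, candidate) >= threshold:
                expanded.add(candidate)
    return expanded


# Illustrative index only; the term assignments are invented.
doc_terms = {
    "d1": {"radar", "antenna"},
    "d2": {"radar", "antenna", "signal"},
    "d3": {"radar"},
    "d4": {"sonar", "signal"},
}
expanded = elaborate_request(doc_terms, {"radar"}, threshold=0.5)
```

Here "antenna" is added to the request because two of the three documents indexed under "radar" also carry it, while "signal" falls below the threshold and is left out.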
Finally, the paper suggests an interpretation of the whole library problem as one where the request is considered as a clue on the basis of which the library system makes a concatenated statistical inference in order to provide as an output an ordered list of those documents which most probably satisfy the information needs of the user.
CITED BY 102
Richard H. Fowler , Wendy A. L. Fowler , Bradley A. Wilson, Integrating query thesaurus, and documents through a common visual representation, Proceedings of the 14th annual international ACM SIGIR conference on Research and development in information retrieval, p.142-151, October 13-16, 1991, Chicago, Illinois, United States
David R. H. Miller , Tim Leek , Richard M. Schwartz, A hidden Markov model information retrieval system, Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, p.214-221, August 15-19, 1999, Berkeley, California, United States
R. M. Fung , S. L. Crawford , L. A. Appelbaum , R. M. Tong, An architecture for probabilistic concept-based information retrieval, Proceedings of the 13th annual international ACM SIGIR conference on Research and development in information retrieval, p.455-467, September 05-07, 1990, Brussels, Belgium
D. Wallace , B. Boyce , D. Kraft, Determining online retrieval system display size, Proceedings of the 10th annual international ACM SIGIR conference on Research and development in information retrieval, p.234-245, June 03-05, 1987, New Orleans, Louisiana, United States
Hsinchun Chen , Bruce Schatz , Tobun Ng , Joanne Martinez , Amy Kirchhoff , Chienting Lin, A Parallel Computing Approach to Creating Engineering Concept Spaces for Semantic Retrieval: The Illinois Digital Library Initiative Project, IEEE Transactions on Pattern Analysis and Machine Intelligence, v.18 n.8, p.771-782, August 1996
Terry Noreault , Michael McGill , Matthew B. Koll, A performance evaluation of similarity measures, document term weighting schemes and representations in a Boolean environment, Proceedings of the 3rd annual ACM conference on Research and development in information retrieval, p.57-76, June 23-27, 1980, Cambridge, England
Fabio Crestani , Sandor Dominich , Mounia Lalmas , Cornelis Joost van Rijsbergen, Mathematical, logical, and formal methods in information retrieval: an introduction to the special issue, Journal of the American Society for Information Science and Technology, v.54 n.4, p.281-284, February 15, 2003
G. N. Arnovick , J. A. Liles , J. S. Wood, Information storage and retrieval-analysis of the state of the art, Proceedings of the April 21-23, 1964, spring joint computer conference, April 21-23, 1964, Washington, D.C.
Jianhan Zhu , Jun Wang , Ingemar J. Cox , Michael J. Taylor, Risky business: modeling and exploiting uncertainty in information retrieval, Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, July 19-23, 2009, Boston, MA, USA