|
ABSTRACT
The inference network model of information retrieval allows a
probabilistic interpretation of query operators. In particular, Boolean query operators are conveniently modeled as link matrices of the Bayesian Network. Prior work has shown, however, that these operators do not perform as well as the pnorm operators used for modeling query operators in the context of the vector space model. This motivates the search for alternative probabilistic formulations for these operators. The design of such alternatives must contend with the issue of computational tractability, since the evaluation of an arbitrary operator requires exponential time. We define a flexible class of link matrices that are natural candidates for the implementation of query operators and an O(n2) algorithm (n = the number of parent nodes) for the computation of probabilities involving link matrices of this class. We present experimental results indicating that Boolean operators implemented in terms of link matrices from this class perform as well as pnorm operators in the context of the INQUERY inference network.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Nicholas J. Belkin , C. Cool , W. Bruce Croft , James P. Callan, The effect multiple query representations on information retrieval system performance, Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval, p.339-346, June 27-July 01, 1993, Pittsburgh, Pennsylvania, United States
[doi> 10.1145/160688.160760]
|
| |
2
|
|
| |
3
|
BOOKSTEIN, A. 1980. Fuzzy requests: An approach to weighted Boolean searches. J. Am. Soc. Inf. Sci. 31, 4 (July), 240-247.
|
| |
4
|
BOOKSTEIN, A. 1981. A comparison of two systems of weighted Boolean queries. J. Am. Soc. Inf. Sci. 32, 4 (July), 275-279.
|
| |
5
|
BRIER, G. W. 1950. Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78, 1 (Jan.), 1-3.
|
| |
6
|
BUELL,D.A.AND KRAFT, D. H. 1981. Threshold values and Boolean retrieval system. Inf. Process. Manage. 17, 3, 127-136.
|
| |
7
|
BALLAN,J.P.,CROFT,W.B.,AND HARDING, S. M. 1992. The INQUERY retrieval system. In Proceedings of the 3rd International Conference on Database and Expert Systems Applications 78-83.
|
| |
8
|
|
| |
9
|
|
| |
10
|
DAWID, A. P. 1989. Probability forecasting. In Encyclopedia of Statistical Sciences John Wiley & Sons, Inc., New York, NY, 210-218.
|
| |
11
|
E. Fox , S. Betrabet , M. Koushik , W. Lee, Extended Boolean models, Information retrieval: data structures and algorithms, Prentice-Hall, Inc., Upper Saddle River, NJ, 1992
|
| |
12
|
FOX,E.A.AND SHAW, J. A. 1994. Combination of multiple searches. In Proceedings of the 2nd Text Retrieval Conference (TREC-2), D. K. Harman, Ed. 243-248.
|
| |
13
|
GREIFF, W. R. 1996. Computationally tractable, conceptually plausible classes of link matrices for the INQUERY inference network. Tech. Rep. TR-96-66. Department of Computer Science, University of Massachusetts, Amherst, MA.
|
 |
14
|
|
| |
15
|
KIM,J.H.AND PEARL, J. 1983. A computational model for combined causal and diagnostic reasoning in inference systems. In Proceedings of the 8th International Joint Conference on Artificial Intelligence (Karlsruhe, Germany) 190-193.
|
| |
16
|
|
| |
17
|
NOREAULT, T., KOLL, M., AND MCGILL, M. J. 1977. Automatic ranked output from Boolean searches in SIRE. J. Am. Soc. Inf. Sci. 28, 6, 333-339.
|
| |
18
|
ORTEGA, J. M. 1972. Numerical Analysis: A Second Course. Academic Press, Inc., New York, NY.
|
| |
19
|
|
| |
20
|
|
| |
21
|
|
| |
22
|
|
| |
23
|
SALTON, G., BUCKLEY, C., AND FOX, E. A. 1983a. Automatic query formulations in information retrieval. J. Am. Soc. Inf. Sci. 34, 4 (July), 262-280.
|
 |
24
|
|
| |
25
|
SHAW,J.A.AND FOX, E. A. 1995. Combination of multiple searches. In Proceedings of the 3rd Text Retrieval Conference (TREC-3) 105-108.
|
| |
26
|
|
 |
27
|
|
| |
28
|
|
| |
29
|
WALLER,W.G.AND KRAFT, D. H. 1979. A mathematical model for a weighted Boolean retrieval system. Inf. Process. Manage. 15, 5, 235-245.
|
INDEX TERMS
Primary Classification:
H.
Information Systems
H.3
INFORMATION STORAGE AND RETRIEVAL
H.3.3
Information Search and Retrieval
Subjects:
Query formulation
General Terms:
Algorithms,
Performance,
Theory
Keywords:
Bayesian networks,
Boolean queries,
computational complexity,
inference networks,
link matrices,
piecewise linear functions,
pnorm,
probabilistic information retrieval,
query operators
REVIEW
"Donald Harris Kraft : Reviewer"
This paper offers a most interesting view of researchers seeking to
improve the probabilistic model of information retrieval in order to
achieve improved performance over the leading model (the vector space
model with
p
more...
|