ABSTRACT
In this article we show how probabilistic information retrieval based on document components may be implemented as a feedforward (feedbackward) artificial neural network. The network supports adaptation of connection weights as well as the growing of new edges between queries and terms based on user relevance feedback data for training, and it reflects query modification and expansion in information retrieval. A learning rule is applied that can also be viewed as supporting sequential learning using a harmonic sequence learning rate. Experimental results with four standard small collections and a large Wall Street Journal collection (173,219 documents) show that retrieval with feedback performs substantially better than retrieval without feedback, and that further gains are obtained when queries are expanded with terms from the feedback documents. The effect is much more pronounced in the small collections than in the large collection. Query expansion may be considered a tool for both precision and recall enhancement. In particular, small query expansion levels of about 30 terms can achieve most of the gains in the low-recall, high-precision region, while larger expansion levels continue to provide gains in the high-recall, low-precision region of a precision-recall curve.
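The abstract does not spell out the learning rule, so the following is only a generic sketch of what sequential learning with a harmonic sequence learning rate (1, 1/2, 1/3, ...) can look like. It is not the article's actual weight update. The function name `sequential_estimate` and the binary relevance judgments are assumptions made for illustration.

```python
def sequential_estimate(observations, initial=0.0):
    """Update an estimate one observation at a time with learning rate 1/n.

    Hypothetical helper for illustration only; not taken from the article.
    """
    estimate = initial
    for n, x in enumerate(observations, start=1):
        rate = 1.0 / n  # harmonic sequence: 1, 1/2, 1/3, ...
        # Move the estimate a fraction `rate` of the way toward the new value.
        estimate += rate * (x - estimate)
    return estimate


# Assumed example data: 1 = judged relevant, 0 = judged not relevant.
feedback = [1, 0, 1, 1]
print(sequential_estimate(feedback))
```

With this rate the estimate after n observations equals their arithmetic mean, so processing feedback items one at a time gives the same result as averaging them in a single batch. The first update uses a rate of 1, which also makes the result independent of the initial value.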
CITED BY 26
Aitao Chen, A comparison of regression, neural net, and pattern recognition approaches to IR, Proceedings of the seventh international conference on Information and knowledge management, p.140-147, November 02-07, 1998, Bethesda, Maryland, United States
Makoto Nakashima, Keizo Sato, Yanhua Qu, Tetsuro Ito, Browsing-based conceptual information retrieval incorporating dictionary term relations, keyword association, and a user's interest, Journal of the American Society for Information Science and Technology, v.54 n.1, p.16-28, January 2003
King-Lup Liu, Weiyi Meng, Clement Yu, Discovery of similarity computations of search engines, Proceedings of the ninth international conference on Information and knowledge management, p.290-297, November 06-11, 2000, McLean, Virginia, United States
INDEX TERMS
Primary Classification:
H. Information Systems
H.3 INFORMATION STORAGE AND RETRIEVAL
H.3.3 Information Search and Retrieval
Subjects: Retrieval models
Additional Classification:
E. Data
H. Information Systems
H.3 INFORMATION STORAGE AND RETRIEVAL
H.3.1 Content Analysis and Indexing
Subjects: Indexing methods
I. Computing Methodologies
I.2 ARTIFICIAL INTELLIGENCE
I.2.6 Learning
Subjects: Connectionism and neural nets
General Terms: Algorithms, Experimentation, Performance
Keywords: artificial neural networks, document-focused and query-focused relevance feedback, indexing and retrieval, item self-learning, learning, probabilistic indexing, probabilistic retrieval, query expansion, training
REVIEW
"Caroline Merriam Eastman : Reviewer"
Modification of queries to information retrieval systems by the reweighting of query terms or by the addition of new terms can in many cases lead to improved retrieval performance. This paper describes a new approach to query modification base...