ABSTRACT
We show that any approach to developing optimum retrieval functions is based on two kinds of assumptions: first, a certain form of representation for documents and requests, and second, additional simplifying assumptions that predefine the type of the retrieval function. Then we describe an approach for the development of optimum polynomial retrieval functions: request-document pairs (f_l, d_m) are mapped onto description vectors x(f_l, d_m), and a polynomial function e(x) is developed such that it yields estimates of the probability of relevance P(R | x(f_l, d_m)) with minimum square errors. We give experimental results for the application of this approach to documents with weighted indexing as well as to documents with complex representations. In contrast to other probabilistic models, our approach yields estimates of the actual probabilities, it can handle very complex representations of documents and requests, and it can be easily applied to multivalued relevance scales. On the other hand, this approach is not suited to log-linear probabilistic models and it needs large samples of relevance feedback data for its application.
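The least-squares idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two components of the description vector, the toy relevance judgments, and the function names `polynomial_features`, `fit_retrieval_function`, and `estimate_relevance` are all assumptions made for illustration. Each description vector x is expanded into its monomials up to a chosen degree, and the coefficients of e(x) are chosen to minimize the squared error against the binary relevance judgments.

```python
from itertools import combinations_with_replacement

import numpy as np


def polynomial_features(X, degree=2):
    """Expand each description vector into all monomials up to `degree`."""
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    columns = [np.ones(n_samples)]  # constant term
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(n_features), d):
            # Product of the selected components, e.g. x0*x1 or x1**2.
            columns.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(columns)


def fit_retrieval_function(X, y, degree=2):
    """Least-squares coefficients of e(x) for relevance judgments y."""
    V = polynomial_features(X, degree)
    y = np.asarray(y, dtype=float)
    coeffs, *_ = np.linalg.lstsq(V, y, rcond=None)
    return coeffs


def estimate_relevance(coeffs, X, degree=2):
    """Evaluate e(x) for new request-document description vectors."""
    return polynomial_features(X, degree) @ coeffs


# Hypothetical training sample: each row is x(f_l, d_m) for one judged pair.
# Assumed components: [share of request terms matched, mean indexing weight].
X_train = [
    [0.0, 0.0], [0.2, 0.1], [0.4, 0.3], [0.5, 0.8],
    [0.6, 0.2], [0.8, 0.7], [1.0, 0.5], [1.0, 0.9],
]
y_train = [0, 0, 0, 1, 0, 1, 1, 1]  # 1 = judged relevant, 0 = not relevant

coeffs = fit_retrieval_function(X_train, y_train, degree=2)
scores = estimate_relevance(coeffs, [[0.9, 0.8], [0.1, 0.1]], degree=2)
ranking = np.argsort(-scores)  # rank documents by decreasing estimate
```

Because e(x) is an unconstrained polynomial, its values are not guaranteed to lie in [0, 1] on a sample this small, which is in line with the abstract's remark that the approach needs large samples of relevance feedback data.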
CITED BY 30
Yong Zhang , Vijay V. Raghavan , Jitender S. Deogun, An object-oriented modeling of the history of optimal retrievals, Proceedings of the 14th annual international ACM SIGIR conference on Research and development in information retrieval, p.241-250, October 13-16, 1991, Chicago, Illinois, United States
Hinrich Schütze , David A. Hull , Jan O. Pedersen, A comparison of classifiers and document representations for the routing problem, Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval, p.229-237, July 09-13, 1995, Seattle, Washington, United States
Thorsten Joachims , Laura Granka , Bing Pan , Helene Hembrooke , Geri Gay, Accurately interpreting clickthrough data as implicit feedback, Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, August 15-19, 2005, Salvador, Brazil
Hongyuan Zha , Zhaohui Zheng , Haoying Fu , Gordon Sun, Incorporating query difference for learning retrieval functions in world wide web search, Proceedings of the 15th ACM international conference on Information and knowledge management, November 06-11, 2006, Arlington, Virginia, USA
Thorsten Joachims , Laura Granka , Bing Pan , Helene Hembrooke , Filip Radlinski , Geri Gay, Evaluating the accuracy of implicit feedback from clicks and query reformulations in Web search, ACM Transactions on Information Systems (TOIS), v.25 n.2, p.7-es, April 2007
REVIEW
Reviewer: Kathleen H. V. Booth
This excellent paper describes an application of the least squares
polynomial method, previously used in automatic indexing, to the
classification of request-document pairs in information retrieval.
The retrieval functions developed provide both ...