ABSTRACT
Recall and precision are often used to evaluate the effectiveness of information retrieval systems. They are easy to define if there is a single query and if the retrieval result generated for the query is a linear ordering. However, when the retrieval results are weakly ordered, in the sense that several documents have an identical retrieval status value with respect to a query, some probabilistic notion of precision has to be introduced. Relevance probability, expected precision, and so forth, are some alternatives mentioned in the literature for this purpose. Furthermore, when many queries are to be evaluated and the retrieval results averaged over these queries, some method of interpolation of precision values at certain preselected recall levels is needed. The currently popular approaches for handling both a weak ordering and interpolation are found to be inconsistent, and the results obtained are not easy to interpret. Moreover, in cases where some alternatives are available, no comparative analysis that would facilitate the selection of a particular strategy has been provided. In this paper, we systematically investigate the various problems and issues associated with the use of recall and precision as measures of retrieval system performance. Our motivation is to provide a comparative analysis of methods available for defining precision in a probabilistic sense and to promote a better understanding of the various issues involved in retrieval performance evaluation.
CITED BY 29
Christopher Scaffidi, Kevin Bierhoff, Eric Chang, Mikhael Felker, Herman Ng, Chun Jin, Red Opal: product-feature scoring from reviews, Proceedings of the 8th ACM conference on Electronic commerce, June 11-15, 2007, San Diego, California, USA
M. S. Ali, Mariano P. Consens, Gabriella Kazai, Mounia Lalmas, Structural relevance: a common basis for the evaluation of structured document retrieval, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
REVIEW
"Dagobert Soergel : Reviewer"
This paper deals with the problem of defining performance measures
under two conditions: (1) the retrieval system produces a list of items
which is weakly rank-ordered by some relevance coefficient, and (2)
performance is to be eva