|
|||||||||||||||||||||||||
|
|||||||||||||||||||||||||
ABSTRACT
There has been recent interest in collecting user or assessor preferences, rather than absolute judgments of relevance, for the evaluation or learning of ranking algorithms. Since measures like precision, recall, and DCG are defined over absolute judgments, evaluation over preferences will require new evaluation measures that explicitly model them. We describe a class of such measures and compare absolute and preference measures over a large TREC collection. REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
INDEX TERMS
Primary Classification:
General Terms:
|
|||||||||||||||||||||||||