ABSTRACT
Understanding user intent is key to designing an effective ranking system in a search engine. In the absence of any explicit knowledge of user intent, search engines want to diversify results to improve user satisfaction. In such a setting, the probability ranking principle-based approach of presenting the most relevant results at the top can be sub-optimal, and hence the search engine would like to trade off relevance for diversity in the results. In analogy to prior work on ranking and clustering systems, we use the axiomatic approach to characterize and design diversification systems. We develop a set of natural axioms that a diversification system is expected to satisfy, and show that no diversification function can satisfy all the axioms simultaneously. We illustrate the use of the axiomatic framework by providing three example diversification objectives that satisfy different subsets of the axioms. We also uncover a rich link to the facility dispersion problem that results in algorithms for a number of diversification objectives. Finally, we propose an evaluation methodology to characterize the objectives and the underlying axioms. We conduct a large-scale evaluation of our objectives based on two data sets: a data set derived from the Wikipedia disambiguation pages and a product database.