|
ABSTRACT
Designing effective ranking functions is a core problem for information retrieval and Web search since the ranking functions directly impact the relevance of the search results. The problem has been the focus of much of the research at the intersection of Web search and machine learning, and learning ranking functions from preference data in particular has recently attracted much interest. The objective of this paper is to empirically examine several objective functions that can be used for learning ranking functions from preference data. Specifically, we investigate the roles of ties in the learning process. By ties, we mean preference judgments that two documents have equal degree of relevance with respect to a query. This type of data has largely been ignored or not properly modeled in the past. In this paper, we analyze the properties of ties and develop novel learning frameworks which combine ties and preference data using statistical paired comparison models to improve the performance of learned ranking functions. The resulting optimization problems explicitly incorporating ties and preference data are solved using gradient boosting methods. Experimental studies are conducted using three publicly available data sets which demonstrate the effectiveness of the proposed new methods.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
R. Bradley and M.Terry. The rank analysis of incomplete block designs: I. the method of paired comparisons. Biometrika, 39, 1952.
|
 |
3
|
Chris Burges , Tal Shaked , Erin Renshaw , Ari Lazier , Matt Deeds , Nicole Hamilton , Greg Hullender, Learning to rank using gradient descent, Proceedings of the 22nd international conference on Machine learning, p.89-96, August 07-11, 2005, Bonn, Germany
[doi> 10.1145/1102351.1102363]
|
| |
4
|
C. J. Burges, R. Ragno, and Q. V. Le. Learning to rank with nonsmooth cost functions. In Advances in Neural Information Processing Systems 19. 2007.
|
 |
5
|
Yunbo Cao , Jun Xu , Tie-Yan Liu , Hang Li , Yalou Huang , Hsiao-Wuen Hon, Adapting ranking SVM to document retrieval, Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, August 06-11, 2006, Seattle, Washington, USA
[doi> 10.1145/1148170.1148205]
|
 |
6
|
Zhe Cao , Tao Qin , Tie-Yan Liu , Ming-Feng Tsai , Hang Li, Learning to rank: from pairwise approach to listwise approach, Proceedings of the 24th international conference on Machine learning, p.129-136, June 20-24, 2007, Corvalis, Oregon
[doi> 10.1145/1273496.1273513]
|
 |
7
|
|
| |
8
|
H. A. David. The Method of Paired Comparisons. Oxford University Press, 2nd edition, 1988.
|
| |
9
|
|
| |
10
|
J. Friedman. Greedy function approximation: a gradient boosting machine. The Annals of Statistics, 29, 2001.
|
| |
11
|
|
| |
12
|
G. Fung, R. Rosales, and B. Krishnapuram. Learning rankings via convex hull separation. In Advances in Neural Information Processing Systems 18. 2006.
|
| |
13
|
T. Hastie, R. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning. Springer, 2001.
|
| |
14
|
R. Herbrich, T. Graepel, and K. Obermayer. Large margin rank boundaries for ordinal regression. In Advances in Large Margin Classifiers, Cambridge, MA, 2000. MIT Press.
|
 |
15
|
|
 |
16
|
|
 |
17
|
|
 |
18
|
Thorsten Joachims , Laura Granka , Bing Pan , Helene Hembrooke , Geri Gay, Accurately interpreting clickthrough data as implicit feedback, Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, August 15-19, 2005, Salvador, Brazil
[doi> 10.1145/1076034.1076063]
|
 |
19
|
John Lafferty , Chengxiang Zhai, Document language models, query models, and risk minimization for information retrieval, Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, p.111-119, September 2001, New Orleans, Louisiana, United States
[doi> 10.1145/383952.383970]
|
| |
20
|
T. Y. Liu, J. Xu, T. Qin, W. Xiong, and H. Li. Letor: Benchmark dataset for research on learning to rank for information retrieval. In Proceedings of the Learning to Rank workshop in the 30th ACM SIGIR, 2007.
|
 |
21
|
|
| |
22
|
P. Rao and L. Kupper. Ties in paired-comparison experiments: A generalization of the bradley-terry model,. Journal of the American Statistical Association, 62, March 1967.
|
| |
23
|
S. Robertson and D. A. Hull. The TREC-9 filtering track final report. Proceedings of the 9th Text REtrieval Conference (TREC-9), 2001.
|
 |
24
|
|
 |
25
|
|
| |
26
|
L. Thurstone. A law of comparative judgement. Psychological Review, 34, 1927.
|
 |
27
|
Ming-Feng Tsai , Tie-Yan Liu , Tao Qin , Hsin-Hsi Chen , Wei-Ying Ma, FRank: a ranking method with fidelity loss, Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, July 23-27, 2007, Amsterdam, The Netherlands
[doi> 10.1145/1277741.1277808]
|
 |
28
|
|
 |
29
|
|
 |
30
|
|
| |
31
|
Z. Zheng, H. Zha, T. Zhang, O. Chapelle, K. Chen, and G. Sun. A general boosting method and its application to learning ranking functions for web search. In Advances in Neural Information Processing Systems. 2007.
|
CITED BY 2
|
|
R. Agrawal , A. Halverson , K. Kenthapadi , N. Mishra , P. Tsaparas, Generating labels from clicks, Proceedings of the Second ACM International Conference on Web Search and Data Mining, February 09-12, 2009, Barcelona, Spain
|
|
|
|
|