ABSTRACT
Effective ranking functions are an essential part of commercial search engines. We focus on developing a regression framework for learning ranking functions that improve the relevance of search engines serving diverse streams of user queries. We explore supervised learning methodology from machine learning and distinguish two types of relevance judgments used as training data: 1) absolute relevance judgments arising from explicit labeling of search results; and 2) relative relevance judgments extracted from user clickthroughs of search results or converted from the absolute relevance judgments. We propose a novel optimization framework emphasizing the use of relative relevance judgments. The main contribution is the development of an algorithm based on regression that can be applied to objective functions involving preference data, i.e., data indicating that a document is more relevant than another with respect to a query. Experiments are carried out using data sets obtained from a commercial search engine. Our results show significant improvements of our proposed methods over some existing methods.
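To make the idea of regression on preference data concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a squared hinge loss over preference pairs, regression stumps as the base regressor, and hypothetical names and parameters (`labels_to_preferences`, `fit_stump`, `fit_ranker`, `margin`, `step`, `rounds`). It first converts absolute judgments into relative ones, then runs functional gradient descent in which each round fits a regressor to the negative gradient of the pairwise loss.

```python
from itertools import combinations


def labels_to_preferences(labels):
    """Turn absolute grades for one query into (preferred, other) index pairs.

    Documents with equal grades produce no pair.
    """
    pairs = []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] > labels[j]:
            pairs.append((i, j))
        elif labels[j] > labels[i]:
            pairs.append((j, i))
    return pairs


def fit_stump(X, targets):
    """Least-squares regression stump: one feature, one threshold, two leaves."""
    mean = sum(targets) / len(targets)
    # Fallback: a constant predictor (threshold at +inf sends every row left).
    best = (sum((t - mean) ** 2 for t in targets), 0, float("inf"), mean, mean)
    for f in range(len(X[0])):
        # Dropping the largest value keeps both sides of every split non-empty.
        for thr in sorted({row[f] for row in X})[:-1]:
            left = [t for row, t in zip(X, targets) if row[f] <= thr]
            right = [t for row, t in zip(X, targets) if row[f] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((t - lm) ** 2 for t in left)
                   + sum((t - rm) ** 2 for t in right))
            if err < best[0]:
                best = (err, f, thr, lm, rm)
    _, f, thr, lm, rm = best
    return lambda row: lm if row[f] <= thr else rm


def fit_ranker(X, pairs, rounds=20, margin=1.0, step=0.5):
    """Functional gradient descent on the assumed pairwise loss
    0.5 * sum over pairs (i, j) of max(0, margin - (h(x_i) - h(x_j)))**2.
    """
    stumps = []

    def score(row):
        return sum(step * s(row) for s in stumps)

    for _ in range(rounds):
        scores = [score(row) for row in X]
        neg_grad = [0.0] * len(X)
        for i, j in pairs:
            slack = margin - (scores[i] - scores[j])
            if slack > 0:  # pair not yet separated by the margin
                neg_grad[i] += slack  # push the preferred document up
                neg_grad[j] -= slack  # push the other document down
        if not any(neg_grad):
            break  # nothing left to fit
        stumps.append(fit_stump(X, neg_grad))
    return score


# Toy data (made up for illustration): four documents, two features each.
X = [[3.0, 0.2], [2.0, 0.9], [1.0, 0.5], [0.5, 0.1]]
labels = [3, 2, 1, 0]  # absolute relevance judgments
pairs = labels_to_preferences(labels)  # relative relevance judgments
ranker = fit_ranker(X, pairs)
ranking = sorted(range(len(X)), key=lambda d: -ranker(X[d]))
print(ranking)
```

Preference pairs extracted from clickthroughs could be passed to `fit_ranker` in the same `(preferred, other)` form. A production system would typically replace the stump with a stronger regressor such as a regression tree.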
CITED BY 10
Zhicheng Dou, Ruihua Song, Xiaojie Yuan, Ji-Rong Wen, Are click-through data adequate for learning web search rankings?, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
Keke Chen, Rongqing Lu, C. K. Wong, Gordon Sun, Larry Heck, Belle Tseng, Trada: tree based ranking function adaptation, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
Fan Li, Xin Li, Shihao Ji, Zhaohui Zheng, Comparing both relevance and robustness in selection of web ranking functions, Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, July 19-23, 2009, Boston, MA, USA
Jiang Bian, Yandong Liu, Ding Zhou, Eugene Agichtein, Hongyuan Zha, Learning to recognize reliable users and content in social media with coupled mutual reinforcement, Proceedings of the 18th international conference on World wide web, April 20-24, 2009, Madrid, Spain
Yi Chang, Anlei Dong, Ciya Liao, Zhaohui Zheng, Enhancing topical ranking with preferences from click-through data, Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, July 19-23, 2009, Boston, MA, USA
Shihao Ji, Ke Zhou, Ciya Liao, Zhaohui Zheng, Gui-Rong Xue, Olivier Chapelle, Gordon Sun, Hongyuan Zha, Global ranking by exploiting user clicks, Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, July 19-23, 2009, Boston, MA, USA
INDEX TERMS
Primary Classification: H. Information Systems; H.3 INFORMATION STORAGE AND RETRIEVAL; H.3.3 Information Search and Retrieval
Subjects: Retrieval models
Additional Classification: H. Information Systems; H.4 INFORMATION SYSTEMS APPLICATIONS; H.4.m Miscellaneous
General Terms: Algorithms, Experimentation, Theory
Keywords: absolute relevance judgment, clickthroughs, functional gradient descent, gradient boosting, machine learning, preferences, ranking function, regression, relative relevance judgment