Smoothing clickthrough data for web search ranking
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Boston, MA, USA
SESSION: Clickthrough models
Pages 355-362
Year of Publication: 2009
ISBN: 978-1-60558-483-6
Authors
Jianfeng Gao, Microsoft Research, Redmond, USA
Wei Yuan, University of Montreal, Montreal, Canada
Xiao Li, Microsoft Research, Redmond, USA
Kefeng Deng, Microsoft China, Beijing, China
Jian-Yun Nie, University of Montreal, Montreal, Canada
ABSTRACT
Incorporating features extracted from clickthrough data (called clickthrough features) has been demonstrated to significantly improve the performance of ranking models for Web search applications. Such benefits, however, are severely limited by the data sparseness problem: many queries and documents have few or no clicks, so the ranker cannot rely strongly on clickthrough features for document ranking. This paper presents two smoothing methods to expand clickthrough data: query clustering via random walk on click graphs, and a discounting method inspired by the Good-Turing estimator. Both methods are evaluated on real-world data in three Web search domains. Experimental results show that the ranking models trained on smoothed clickthrough features consistently outperform those trained on unsmoothed features. This study demonstrates both the importance and the benefits of dealing with the sparseness problem in clickthrough data.
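The abstract names the Good-Turing estimator as the inspiration for the discounting method but does not give the paper's formula. As a rough illustration only, the sketch below applies the classical Good-Turing adjustment, r* = (r + 1) N_{r+1} / N_r, where N_r is the number of items observed exactly r times, to a small set of click counts. The click data, the function names, and the fallback used when N_{r+1} is zero are assumptions made for this example; this is not the authors' discounting method.

```python
from collections import Counter


def good_turing_adjusted_counts(counts):
    """Map each observed count r to its classical Good-Turing adjusted count r*.

    r* = (r + 1) * N_{r+1} / N_r, where N_r is the number of items seen
    exactly r times. When N_{r+1} is zero the raw count is kept; that
    fallback is an assumption of this sketch.
    """
    freq_of_freq = Counter(counts.values())
    adjusted = {}
    for r, n_r in freq_of_freq.items():
        n_next = freq_of_freq.get(r + 1, 0)
        adjusted[r] = (r + 1) * n_next / n_r if n_next else float(r)
    return adjusted


def unseen_mass(counts):
    """Probability mass reserved for unseen items: N_1 / (total observations)."""
    total = sum(counts.values())
    n_1 = sum(1 for c in counts.values() if c == 1)
    return n_1 / total if total else 0.0


# Hypothetical click counts for (query, document) pairs.
clicks = {
    ("q1", "d1"): 1, ("q1", "d2"): 1, ("q2", "d1"): 1,
    ("q2", "d3"): 2, ("q3", "d4"): 2, ("q3", "d5"): 3,
}

adjusted = good_turing_adjusted_counts(clicks)
p_unseen = unseen_mass(clicks)
```

In this toy data, pairs clicked once or twice are discounted, and the mass taken from them is what the estimator makes available to query-document pairs with no recorded clicks, which is the sparseness case the abstract describes.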