ABSTRACT
This paper presents a novel approach for using clickthrough data to learn ranked retrieval functions for web search results. We observe that users searching the web often perform a sequence, or chain, of queries with a similar information need. Using query chains, we generate new types of preference judgments from search engine logs, thus taking advantage of user intelligence in reformulating queries. To validate our method, we perform a controlled user study comparing generated preference judgments to explicit relevance judgments. We also implement a real-world search engine to test our approach, using a modified ranking SVM to learn an improved ranking function from preference data. Our results demonstrate significant improvements in the ranking given by the search engine. The learned rankings outperform both a static ranking function and one trained without considering query chains.
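The idea of turning a query chain into preference judgments can be illustrated with a small sketch. The abstract does not spell out the exact rules, so the two heuristics below are assumptions made for illustration: a clicked result is preferred over unclicked results ranked above it for the same query, and a result clicked after a reformulation is preferred, for the earlier query, over results the user apparently examined there without clicking. The names `QueryEvent`, `examined_unclicked`, `within_query_preferences`, `chain_preferences`, and the `assumed_seen` parameter are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

# (query, preferred_doc, less_preferred_doc)
Preference = Tuple[str, str, str]


@dataclass
class QueryEvent:
    """One logged query: the ranked result ids and the ids that were clicked."""
    query: str
    results: List[str]
    clicked: Set[str]


def examined_unclicked(event: QueryEvent, assumed_seen: int = 2) -> List[str]:
    """Unclicked results the user is assumed to have looked at.

    Assumption: the user examined everything down to the lowest click,
    or only the top `assumed_seen` results if nothing was clicked.
    """
    clicked_ranks = [i for i, d in enumerate(event.results) if d in event.clicked]
    cutoff = clicked_ranks[-1] + 1 if clicked_ranks else assumed_seen
    return [d for d in event.results[:cutoff] if d not in event.clicked]


def within_query_preferences(event: QueryEvent) -> List[Preference]:
    """A clicked result is preferred over unclicked results ranked above it."""
    prefs: List[Preference] = []
    for rank, doc in enumerate(event.results):
        if doc in event.clicked:
            for above in event.results[:rank]:
                if above not in event.clicked:
                    prefs.append((event.query, doc, above))
    return prefs


def chain_preferences(chain: List[QueryEvent]) -> List[Preference]:
    """Preferences from single queries plus preferences that span the chain."""
    prefs: List[Preference] = []
    for event in chain:
        prefs.extend(within_query_preferences(event))
    for i, earlier in enumerate(chain):
        passed_over = examined_unclicked(earlier)
        for later in chain[i + 1:]:
            for doc in later.results:
                if doc not in later.clicked:
                    continue
                # The later click is judged better, for the earlier query,
                # than what the user passed over before reformulating.
                for skipped in passed_over:
                    if skipped != doc:
                        prefs.append((earlier.query, doc, skipped))
    return prefs


example_chain = [
    QueryEvent("jaguar", ["a", "b", "c"], set()),
    QueryEvent("jaguar car", ["d", "e", "f"], {"e"}),
]
example_prefs = chain_preferences(example_chain)
```

Each resulting triple could then serve as one pairwise training constraint for a ranking learner such as a ranking SVM, which is trained on "document x should rank above document y for query q" examples rather than on absolute relevance labels.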