A user browsing model to predict search engine click data from past observations.
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Singapore, Singapore
SESSION: Web-search--2
Pages 331-338
Year of Publication: 2008
ISBN: 978-1-60558-164-4
Authors
Georges E. Dupret  Yahoo! Research Latin America, Santiago, Chile
Benjamin Piwowarski  Yahoo! Research Latin America, Santiago, Chile
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 61,   Downloads (12 Months): 523,   Citation Count: 16
DOI: http://doi.acm.org/10.1145/1390334.1390392

ABSTRACT

Search engine click logs provide an invaluable source of relevance information, but this information is biased because we do not know which documents in the result list users actually saw before and after they clicked. If we did, we could estimate document relevance by simple counting. In this paper, we propose a set of assumptions on user browsing behavior that allows us to estimate the probability that a document is seen, thereby providing an unbiased estimate of document relevance. To train, test, and compare our model to the best alternatives described in the literature, we gather a large set of real data and conduct an extensive cross-validation experiment. Our solution significantly outperforms all previous models. As a side effect, we gain insight into the browsing behavior of users, which we can compare to the conclusions of an eye-tracking experiment by Joachims et al. [12]. In particular, our findings confirm that a user almost always sees the document directly after a clicked document. They also explain why documents situated just after a very relevant document are clicked more often.
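
The counting argument in the abstract can be illustrated with a minimal sketch. It is not the model proposed in the paper: it assumes a hypothetical `seen` flag for every displayed document, which real click logs do not contain, and the session data and the function name `click_rates` are invented for illustration only. It shows why dividing clicks by displays underestimates the attractiveness of documents that were often not looked at, and why dividing by the number of times a document was seen would remove that bias.

```python
from collections import defaultdict

# Hypothetical sessions: each entry is (doc_id, seen, clicked).
# The `seen` flag is an assumption made for illustration; it is exactly
# the information that real click logs lack.
sessions = [
    [("a", True, True), ("b", True, False), ("c", False, False)],
    [("a", True, False), ("b", True, True), ("c", True, False)],
    [("a", True, True), ("b", False, False), ("c", False, False)],
    [("a", True, True), ("b", True, False), ("c", True, True)],
]


def click_rates(sessions):
    """Return (naive, corrected) click rates per document.

    naive     = clicks / times displayed
    corrected = clicks / times seen (simple counting, given `seen`)
    """
    shown = defaultdict(int)
    seen = defaultdict(int)
    clicks = defaultdict(int)
    for session in sessions:
        for doc, was_seen, was_clicked in session:
            shown[doc] += 1
            seen[doc] += int(was_seen)
            clicks[doc] += int(was_clicked)
    naive = {d: clicks[d] / shown[d] for d in shown}
    # Documents that were never seen have no defined corrected rate.
    corrected = {d: clicks[d] / seen[d] for d in shown if seen[d] > 0}
    return naive, corrected


naive, corrected = click_rates(sessions)
for doc in sorted(naive):
    print(doc, round(naive[doc], 3), round(corrected[doc], 3))
```

In this toy data, document "c" is displayed four times but seen only twice, so its naive rate is half its corrected rate. Because the `seen` flag is unobservable in practice, a browsing model has to estimate the probability that a document was seen instead of reading it from the log.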


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

[1]
[2]
[3] H. Becker, C. Meek, and D. M. Chickering. Modeling contextual factors of click rates. In AAAI, pages 1310-1315, 2007.
[4]
[5]
[6] D. Downey, S. T. Dumais, and E. Horvitz. Models of searching and browsing: Languages, studies, and application. In IJCAI, pages 2740-2747, 2007.
[7] G. Dupret, B. Piwowarski, C. Hurtado, and M. Mendoza. A statistical model of query log generation. In Proceedings of SPIRE 2006, LNCS 4209, pages 217-228. Springer, 2006.
[8] A. Genkin, D. Lewis, and D. Madigan. Large-scale Bayesian logistic regression for text categorization. Technometrics, 49, 2007.
[9]
[10]
[11]
[12]
[13]
