Entropy-biased models for query representation on the click graph
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Boston, MA, USA
SESSION: Clickthrough models
Pages 339-346  
Year of Publication: 2009
ISBN: 978-1-60558-483-6
Authors
Hongbo Deng  The Chinese University of Hong Kong, Hong Kong, Hong Kong
Irwin King  The Chinese University of Hong Kong, Hong Kong, Hong Kong
Michael R. Lyu  The Chinese University of Hong Kong, Hong Kong, Hong Kong
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1571941.1572001

ABSTRACT

Query log analysis has received substantial attention in recent years, and the click graph is an important technique within it for describing the relationship between queries and URLs. State-of-the-art approaches that model the click graph from raw click frequencies, however, neither eliminate noise nor handle heterogeneous query-URL pairs well. In this paper, we investigate and develop a novel entropy-biased framework for modeling click graphs. The intuition behind this model is that query-URL pairs should not all be treated alike: common clicks on less frequent but more specific URLs are of greater value than common clicks on frequent and general URLs. Based on this intuition, we use the entropy information of the URLs and introduce a new concept, the inverse query frequency (IQF), to weigh the importance (discriminative ability) of a click on a certain URL. The IQF weighting scheme has never been explicitly explored or statistically examined for any bipartite graph in the information retrieval literature. We not only formally define and quantify this scheme, but also combine it with the click frequency and user frequency information on the click graph to obtain an effective query representation. To illustrate our methodology, we conduct experiments on the AOL query log data for query similarity analysis and query suggestion tasks. Experimental results demonstrate that our entropy-biased models yield considerable improvements in performance. Moreover, our method can also be applied to other bipartite graphs.
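
The abstract does not state the IQF formula, so the following is only a minimal sketch of the idea under an explicit assumption: IQF is taken to mirror inverse document frequency, i.e. the log of the total number of queries divided by the number of distinct queries that clicked a URL. The function names (`iqf_weights`, `cf_iqf_vectors`, `cosine`) and the toy click data are hypothetical and are not taken from the paper. The sketch weights each raw click frequency by the IQF of its URL and compares queries by cosine similarity.

```python
import math
from collections import defaultdict


def iqf_weights(clicks):
    """Return an assumed IQF weight per URL.

    clicks: list of (query, url, click_count) triples.
    Assumption (not from the abstract): iqf(url) = log(|Q| / n(url)),
    where |Q| is the number of distinct queries and n(url) is the
    number of distinct queries with at least one click on the URL.
    """
    queries = set()
    queries_per_url = defaultdict(set)
    for query, url, count in clicks:
        if count > 0:
            queries.add(query)
            queries_per_url[url].add(query)
    total = len(queries)
    return {url: math.log(total / len(qs)) for url, qs in queries_per_url.items()}


def cf_iqf_vectors(clicks):
    """Represent each query as a sparse vector of click_count * iqf(url)."""
    iqf = iqf_weights(clicks)
    vectors = defaultdict(dict)
    for query, url, count in clicks:
        if count > 0:
            vectors[query][url] = vectors[query].get(url, 0.0) + count * iqf[url]
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy data (hypothetical): "hub" is clicked from every query, "maps" from two.
clicks = [
    ("map", "hub", 10),
    ("mail", "hub", 10),
    ("directions", "hub", 1),
    ("map", "maps", 3),
    ("directions", "maps", 4),
]
weights = iqf_weights(clicks)
vectors = cf_iqf_vectors(clicks)
```

In this toy data the general URL "hub" is clicked from all three queries, so its assumed IQF is log(3/3) = 0 and it contributes nothing to any query vector, while the more specific URL "maps" keeps a positive weight. With raw click frequencies, "map" and "mail" would look similar through their shared clicks on "hub"; under this weighting only the shared click on "maps" counts.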


