ABSTRACT
In this paper we study a large query log of more than twenty million queries, with the goal of extracting the semantic relations that are implicitly captured in the actions of users submitting queries and clicking on answers. Previous query log analyses mostly considered only the queries, not the actions that followed them. We first propose a novel way to represent queries in a vector space, based on a graph derived from the query-click bipartite graph. We then analyze the graph produced by our query log, showing that it is less sparse than previous results suggested and that almost all the measures of these graphs follow power laws, which sheds some light on user search behavior as well as on the distribution of topics that people look for on the Web. The representation we introduce allows us to infer interesting semantic relationships between queries. We also provide an experimental analysis of the quality of these relations, showing that most of them are relevant. Finally, we sketch an application that detects multitopical URLs.
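As a rough illustration of the idea of representing queries through their clicks, the sketch below builds one sparse vector per query over the URLs clicked for it and links two queries whenever their vectors overlap. The abstract does not specify the weighting or similarity measure, so the raw click counts, the cosine similarity, the function names, and the toy click log here are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_query_vectors(click_log):
    """Map each query to a sparse vector of click counts over URLs.

    click_log is an iterable of (query, clicked_url) pairs; this is a
    hypothetical log format chosen for the example.
    """
    vectors = defaultdict(Counter)
    for query, url in click_log:
        vectors[query][url] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse click-count vectors."""
    dot = sum(weight * v[url] for url, weight in u.items() if url in v)
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def query_graph(vectors, threshold=0.0):
    """Return weighted edges between queries that share clicked URLs."""
    queries = sorted(vectors)
    edges = {}
    for i, q1 in enumerate(queries):
        for q2 in queries[i + 1:]:
            similarity = cosine(vectors[q1], vectors[q2])
            if similarity > threshold:
                edges[(q1, q2)] = similarity
    return edges


# Toy click log with made-up queries and URLs (assumption, not real data).
click_log = [
    ("jaguar car", "jaguar.com"),
    ("jaguar price", "jaguar.com"),
    ("jaguar animal", "example.org/big-cats"),
    ("big cats", "example.org/big-cats"),
]

vectors = build_query_vectors(click_log)
edges = query_graph(vectors)
for (q1, q2), weight in sorted(edges.items()):
    print(q1, "<->", q2, round(weight, 3))
```

In this toy log the two car-related queries end up connected, as do the two animal-related ones, while no edge joins the two groups. The all-pairs loop is quadratic in the number of queries, so a log of the size described above would need an inverted index from URLs to queries rather than this direct comparison.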