| Towards context-aware search by learning a very large variable length hidden Markov model from search logs |
Source
|
International World Wide Web Conference
Proceedings of the 18th international conference on World wide web
Madrid, Spain
SESSION: Data mining/session: learning
Pages 191-200
Year of Publication: 2009
ISBN:978-1-60558-487-4
|
|
Authors
Huanhuan Cao, University of Science and Technology of China, Hefei, China
Daxin Jiang, Microsoft Research Asia, Beijing, China
Jian Pei, Simon Fraser University, Vancouver, Canada
Enhong Chen, University of Science and Technology of China, Hefei, China
Hang Li, Microsoft Research Asia, Beijing, China
ABSTRACT
Capturing the context of a user's query from the previous queries and clicks in the same session may help understand the user's information need. A context-aware approach to document re-ranking, query suggestion, and URL recommendation may improve users' search experience substantially. In this paper, we propose a general approach to context-aware search. To capture contexts of queries, we learn a variable length Hidden Markov Model (vlHMM) from search sessions extracted from log data. Although the mathematical model is intuitive, how to learn a large vlHMM with millions of states from hundreds of millions of search sessions poses a grand challenge. We develop a strategy for parameter initialization in vlHMM learning which can greatly reduce the number of parameters to be estimated in practice. We also devise a method for distributed vlHMM learning under the map-reduce model. We test our approach on a real data set consisting of 1.8 billion queries, 2.6 billion clicks, and 840 million search sessions, and evaluate the effectiveness of the vlHMM learned from the real data on three search applications: document re-ranking, query suggestion, and URL recommendation. The experimental results show that our approach is both effective and efficient.
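The idea of conditioning on earlier queries in a session, with statistics gathered in a map and reduce pass, can be illustrated with a toy sketch. This is not the vlHMM described above: it has no hidden states and no EM training, only a variable-length Markov chain over observed queries with back-off to shorter contexts. The names `MAX_CONTEXT`, `map_session`, `reduce_counts`, `build_model`, and `suggest`, as well as the sample sessions, are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from itertools import chain

MAX_CONTEXT = 2  # hypothetical cap on how many previous queries form a context


def map_session(session):
    """Map step: emit ((context, next_query), 1) for each context suffix."""
    for i in range(1, len(session)):
        for k in range(1, min(MAX_CONTEXT, i) + 1):
            context = tuple(session[i - k:i])
            yield (context, session[i]), 1


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def build_model(sessions):
    """Run map then reduce in-process and index the counts by context."""
    totals = reduce_counts(chain.from_iterable(map_session(s) for s in sessions))
    model = defaultdict(Counter)
    for (context, next_query), count in totals.items():
        model[context][next_query] = count
    return model


def suggest(model, history, top_k=1):
    """Suggest next queries, backing off from the longest known context."""
    for k in range(min(MAX_CONTEXT, len(history)), 0, -1):
        context = tuple(history[-k:])
        if context in model:
            return [q for q, _ in model[context].most_common(top_k)]
    return []


# Made-up sessions, for illustration only.
sessions = [
    ["jaguar", "jaguar car", "jaguar price"],
    ["jaguar", "jaguar car", "jaguar price"],
    ["zoo", "jaguar", "jaguar habitat"],
    ["jaguar", "jaguar car"],
]
model = build_model(sessions)
```

With these sessions, `suggest(model, ["jaguar"])` favors the car-related follow-up, while `suggest(model, ["zoo", "jaguar"])` uses the longer context and returns the habitat query, which is the kind of disambiguation the abstract attributes to session context. In a real map-reduce deployment the map and reduce functions would run on separate workers over partitions of the log; here they run in a single process.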
CITED BY 2
Huanhuan Cao, Derek Hao Hu, Dou Shen, Daxin Jiang, Jian-Tao Sun, Enhong Chen, Qiang Yang, Context-aware query classification, Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, July 19-23, 2009, Boston, MA, USA