Learning query intent from regularized click graphs
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Singapore, Singapore
SESSION: Web-search--2
Pages 339-346
Year of Publication: 2008
ISBN: 978-1-60558-164-4
Authors
Xiao Li, Microsoft Research, Redmond, WA, USA
Ye-Yi Wang, Microsoft Research, Redmond, WA, USA
Alex Acero, Microsoft Research, Redmond, WA, USA
ABSTRACT
This work presents the use of click graphs to improve query intent classifiers, which are critical if vertical search and general-purpose search services are to be offered in a unified user interface. Previous work on query classification has primarily focused on improving the feature representation of queries, e.g., by augmenting queries with search engine results. In this work, we investigate a completely orthogonal approach: instead of enriching the feature representation, we aim to drastically increase the amount of training data through semi-supervised learning with click graphs. Specifically, we infer the class memberships of unlabeled queries from those of labeled ones according to their proximity in a click graph. Moreover, we regularize the learning with click graphs by content-based classification to avoid propagating erroneous labels. We demonstrate the effectiveness of our algorithms in two different applications, product intent and job intent classification. In both cases, we expand the training data with automatically labeled queries by over two orders of magnitude, leading to significant improvements in classification performance. An additional finding is that, with a large amount of training data obtained in this fashion, classifiers using only query words/phrases as features can work remarkably well.
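The propagation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or objective function; it is a generic label-propagation loop over a bipartite query-URL click graph, written under stated assumptions. The function name `propagate_intent`, the parameters `alpha` and `iterations`, the `content_scores` argument (a stand-in for the output of a content-based classifier), and all queries, URLs, and click counts are hypothetical.

```python
from collections import defaultdict


def propagate_intent(clicks, seed_labels, content_scores=None,
                     alpha=0.8, iterations=50):
    """Spread intent scores from labeled to unlabeled queries.

    clicks:         dict mapping (query, url) -> click count
    seed_labels:    dict mapping query -> 1.0 (positive) or 0.0 (negative)
    content_scores: optional dict mapping query -> score in [0, 1], assumed
                    to come from a content-based classifier (hypothetical)
    alpha:          weight on graph evidence versus the per-query anchor
    """
    content_scores = content_scores or {}

    # Total clicks per query and per URL, used to normalise edge weights.
    q_total, u_total = defaultdict(float), defaultdict(float)
    for (q, u), c in clicks.items():
        q_total[q] += c
        u_total[u] += c

    # Transition weights in both directions of the bipartite graph.
    q_to_u, u_to_q = defaultdict(dict), defaultdict(dict)
    for (q, u), c in clicks.items():
        q_to_u[q][u] = c / q_total[q]
        u_to_q[u][q] = c / u_total[u]

    # Anchor: the manual label if present, else the content score, else 0.5.
    # Pulling every query back toward its anchor is the regularising term
    # in this sketch; it limits how far an erroneous label can spread.
    anchor = {q: seed_labels.get(q, content_scores.get(q, 0.5))
              for q in q_to_u}

    score = dict(anchor)
    for _ in range(iterations):
        # Each URL takes the click-weighted average of its queries' scores.
        url_score = {u: sum(w * score[q] for q, w in nbrs.items())
                     for u, nbrs in u_to_q.items()}
        # Each query mixes its URLs' scores with its own anchor.
        score = {q: alpha * sum(w * url_score[u] for u, w in nbrs.items())
                    + (1 - alpha) * anchor[q]
                 for q, nbrs in q_to_u.items()}
    return score


# Toy click log for a product-intent classifier (all values invented).
clicks = {
    ("cheap laptop", "shop.example/laptops"): 8,
    ("laptop deals", "shop.example/laptops"): 5,
    ("laptop deals", "reviews.example/laptops"): 1,
    ("nurse jobs", "jobs.example/nursing"): 7,
    ("rn openings", "jobs.example/nursing"): 4,
}
seeds = {"cheap laptop": 1.0, "nurse jobs": 0.0}
scores = propagate_intent(clicks, seeds)
```

In this toy graph, "laptop deals" shares a clicked URL with the positively labeled "cheap laptop" and so ends above 0.5, while "rn openings" shares a URL with the negatively labeled "nurse jobs" and ends below 0.5. Thresholding such scores is one plausible way to obtain the automatically labeled training queries the abstract refers to.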