Learning query intent from regularized click graphs
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Singapore, Singapore
SESSION: Web-search--2
Pages 339-346
Year of Publication: 2008
ISBN:978-1-60558-164-4
Authors
Xiao Li  Microsoft Research, Redmond, WA, USA
Ye-Yi Wang  Microsoft Research, Redmond, WA, USA
Alex Acero  Microsoft Research, Redmond, WA, USA
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1390334.1390393

ABSTRACT

This work uses click graphs to improve query intent classifiers, which are critical if vertical search and general-purpose search services are to be offered in a unified user interface. Previous work on query classification has primarily focused on improving the feature representation of queries, e.g., by augmenting queries with search engine results. In this work, we investigate a completely orthogonal approach: instead of enriching the feature representation, we aim to drastically increase the amount of training data through semi-supervised learning with click graphs. Specifically, we infer the class memberships of unlabeled queries from those of labeled ones according to their proximity in a click graph. Moreover, we regularize the learning with click graphs by content-based classification to avoid propagating erroneous labels. We demonstrate the effectiveness of our algorithms in two applications, product intent and job intent classification. In both cases, we expand the training data with automatically labeled queries by over two orders of magnitude, leading to significant improvements in classification performance. An additional finding is that, with a large amount of training data obtained in this fashion, classifiers using only query words/phrases as features can work remarkably well.
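
The abstract does not spell out the learning objective, so the following is only a generic sketch of the idea it describes: class scores spread from labeled queries to unlabeled ones through shared clicked URLs, while a content-based prior pulls each score back toward what a text classifier would say. The function name `propagate_labels`, the mixing weight `alpha`, the two-step random walk, and the toy click counts are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def propagate_labels(clicks, seed_labels, prior=None, alpha=0.5, n_iter=100):
    """Hypothetical label propagation over a query-URL click graph.

    clicks:      (n_queries, n_urls) array of click counts.
    seed_labels: dict {query_index: 0.0 or 1.0} for manually labeled queries.
    prior:       optional (n_queries,) scores from a content-based classifier;
                 defaults to 0.5 everywhere (no opinion).
    alpha:       weight given to the graph relative to the prior (assumed).
    """
    clicks = np.asarray(clicks, dtype=float)
    n_queries = clicks.shape[0]

    # Row-normalize in both directions to get transition probabilities.
    q2u = clicks / np.maximum(clicks.sum(axis=1, keepdims=True), 1e-12)
    u2q = clicks.T / np.maximum(clicks.sum(axis=0)[:, None], 1e-12)
    # Two-step walk query -> URL -> query: queries sharing clicks are close.
    walk = q2u @ u2q

    if prior is None:
        prior = np.full(n_queries, 0.5)
    prior = np.asarray(prior, dtype=float)

    keys = list(seed_labels)
    seed_idx = np.array(keys, dtype=int)
    seed_val = np.array([seed_labels[k] for k in keys], dtype=float)

    scores = prior.copy()
    for _ in range(n_iter):
        # Mix neighbors' scores with the content-based prior (the regularizer).
        scores = alpha * (walk @ scores) + (1.0 - alpha) * prior
        # Clamp the manually labeled queries to their known classes.
        scores[seed_idx] = seed_val
    return scores

# Toy graph: 4 queries x 3 URLs. Query 0 is labeled positive, query 2 negative.
toy_clicks = [[5, 0, 0],
              [3, 0, 1],
              [0, 4, 0],
              [0, 2, 1]]
toy_scores = propagate_labels(toy_clicks, {0: 1.0, 2: 0.0})
```

In this toy graph, query 1 mostly shares clicks with the positive seed and query 3 with the negative seed, so their scores move above and below 0.5 respectively. Passing a `prior` that disagrees with the graph shows how content-based evidence can damp a propagated label.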


