|
ABSTRACT
We propose a methodology for building a practical robust query classification system that can identify thousands of query classes with reasonable accuracy, while dealing in real-time with the query volume of a commercial web search engine. We use a blind feedback technique: given a query, we determine its topic by classifying the web search results retrieved by the query. Motivated by the needs of search advertising, we primarily focus on rare queries, which are the hardest from the point of view of machine learning, yet in aggregation account for a considerable fraction of search engine traffic. Empirical evaluation confirms that our methodology yields a considerably higher classification accuracy than previously reported. We believe that the proposed methodology will lead to better matching of online ads to rare queries and overall to a better user experience.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Steven M. Beitzel , Eric C. Jensen , Ophir Frieder , David Grossman , David D. Lewis , Abdur Chowdhury , Aleksandr Kolcz, Automatic web query classification using labeled and unlabeled training data, Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, August 15-19, 2005, Salvador, Brazil
[doi> 10.1145/1076034.1076138]
|
| |
2
|
Steven M. Beitzel , Eric C. Jensen , Ophir Frieder , David D. Lewis , Abdur Chowdhury , Aleksander Kolcz, Improving Automatic Query Classification via Semi-Supervised Learning, Proceedings of the Fifth IEEE International Conference on Data Mining, p.42-49, November 27-30, 2005
[doi> 10.1109/ICDM.2005.80]
|
| |
3
|
|
| |
4
|
E. Efthimiadis and P. Biron. UCLA-Okapi at TREC-2: Query expansion experiments. In TREC-2, 1994.
|
| |
5
|
E. Gabrilovich and S. Markovitch. Feature generation for text categorization using world knowledge. In IJCAI'05, pages 1048--1053, 2005.
|
 |
6
|
|
| |
7
|
|
 |
8
|
|
 |
9
|
|
| |
10
|
P. Kowalczyk, I. Zukerman, and M. Niemann. Analyzing the effect of query class on document retrieval performance. In Proc. Australian Conf. on AI, pages 550--561, 2004.
|
 |
11
|
|
 |
12
|
|
| |
13
|
|
| |
14
|
S. Robertson, S. Walker, S. Jones, M. Hancock-Beaulieu, and M. Gatford. Okapi at TREC-3. In TREC-3, 1995.
|
| |
15
|
|
| |
16
|
G. Salton and C. Buckley. Improving retrieval performance by relevance feedback. JASIS, 41(4):288--297, 1990.
|
| |
17
|
T. Santner and D. Duffy. The Statistical Analysis of Discrete Data. Springer-Verlag, 1989.
|
 |
18
|
Dou Shen , Rong Pan , Jian-Tao Sun , Jeffrey Junfeng Pan , Kangheng Wu , Jie Yin , Qiang Yang, Q2C@UST: our winning solution to query classification in KDDCUP 2005, ACM SIGKDD Explorations Newsletter, v.7 n.2, p.100-110, December 2005
[doi> 10.1145/1117454.1117467]
|
 |
19
|
Dou Shen , Rong Pan , Jian-Tao Sun , Jeffrey Junfeng Pan , Kangheng Wu , Jie Yin , Qiang Yang, Query enrichment for web-query classification, ACM Transactions on Information Systems (TOIS), v.24 n.3, p.320-352, July 2006
[doi> 10.1145/1165774.1165776]
|
 |
20
|
Dou Shen , Jian-Tao Sun , Qiang Yang , Zheng Chen, Building bridges for web query classification, Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, August 06-11, 2006, Seattle, Washington, USA
[doi> 10.1145/1148170.1148196]
|
 |
21
|
David Vogel , Steffen Bickel , Peter Haider , Rolf Schimpfky , Peter Siemen , Steve Bridges , Tobias Scheffer, Classifying search engine queries using the web as background knowledge, ACM SIGKDD Explorations Newsletter, v.7 n.2, p.117-122, December 2005
[doi> 10.1145/1117454.1117469]
|
| |
22
|
|
 |
23
|
|
CITED BY 15
|
|
|
Filip Radlinski , Andrei Broder , Peter Ciccolo , Evgeniy Gabrilovich , Vanja Josifovski , Lance Riedel, Optimizing relevance and revenue in ad search: a query substitution approach, Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, July 20-24, 2008, Singapore, Singapore
|
|
|
|
Xuerui Wang , Andrei Broder , Evgeniy Gabrilovich , Vanja Josifovski , Bo Pang, Cross-lingual query classification: a preliminary study, Proceeding of the 2nd ACM workshop on Improving non english web searching, October 30-30, 2008, Napa Valley, California, USA
|
|
Jian Hu , Gang Wang , Fred Lochovsky , Jian-tao Sun , Zheng Chen, Understanding user's query intent with wikipedia, Proceedings of the 18th international conference on World wide web, April 20-24, 2009, Madrid, Spain
|
|
Aris Anagnostopoulos , Andrei Z. Broder , Evgeniy Gabrilovich , Vanja Josifovski , Lance Riedel, Just-in-time contextual advertising, Proceedings of the sixteenth ACM conference on Conference on information and knowledge management, November 06-10, 2007, Lisbon, Portugal
|
|
|
|
Xuerui Wang , Andrei Broder , Evgeniy Gabrilovich , Vanja Josifovski , Bo Pang, Cross-language query classification using web search for exogenous knowledge, Proceedings of the Second ACM International Conference on Web Search and Data Mining, February 09-12, 2009, Barcelona, Spain
|
|
|
|
|
|
|
|
Andrei Z. Broder , Peter Ciccolo , Marcus Fontoura , Evgeniy Gabrilovich , Vanja Josifovski , Lance Riedel, Search advertising using web relevance feedback, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
|
|
|
|
Evgeniy Gabrilovich , Andrei Broder , Marcus Fontoura , Amruta Joshi , Vanja Josifovski , Lance Riedel , Tong Zhang, Classifying search queries using the Web as a source of knowledge, ACM Transactions on the Web (TWEB), v.3 n.2, p.1-28, April 2009
|
|
|
|