Classifying search queries using the Web as a source of knowledge
Source: ACM Transactions on the Web (TWEB), Volume 3, Issue 2 (April 2009), Article No. 5
Year of Publication: 2009
ISSN: 1559-1131
Authors
Evgeniy Gabrilovich, Yahoo Research, Santa Clara, CA
Andrei Broder, Yahoo Research, Santa Clara, CA
Marcus Fontoura, PUC-Rio, Rio de Janeiro, Brazil
Amruta Joshi, UCLA, Los Angeles, CA
Vanja Josifovski, Yahoo Research, Santa Clara, CA
Lance Riedel, Yahoo Research, Santa Clara, CA
Tong Zhang, Rutgers University, Piscataway, NJ
ABSTRACT
We propose a methodology for building a robust query classification system that can identify thousands of query classes, while dealing in real time with the query volume of a commercial Web search engine. We use a pseudo relevance feedback technique: given a query, we determine its topic by classifying the Web search results retrieved by the query. Motivated by the needs of search advertising, we primarily focus on rare queries, which are the hardest from the point of view of machine learning, yet in aggregate account for a considerable fraction of search engine traffic. Empirical evaluation confirms that our methodology yields a considerably higher classification accuracy than previously reported. We believe that the proposed methodology will lead to better matching of online ads to rare queries and overall to a better user experience.
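The core idea in the abstract, assigning a query the topic of the Web results it retrieves, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the names classify_query, toy_search and toy_classify_page are hypothetical, the page classifier is a trivial keyword rule, and simple majority voting over the top results stands in for whatever aggregation scheme the article actually uses.

```python
from collections import Counter
from typing import Callable, Iterable, Optional


def classify_query(
    query: str,
    search: Callable[[str], Iterable[str]],
    classify_page: Callable[[str], str],
    top_k: int = 10,
) -> Optional[str]:
    """Label a query with the most common class among its top-k results.

    `search` and `classify_page` are hypothetical callables supplied by the
    caller; majority voting is an assumed aggregation rule.
    """
    votes: Counter = Counter()
    for rank, page_text in enumerate(search(query)):
        if rank >= top_k:
            break
        # Each retrieved result casts one vote for its predicted class.
        votes[classify_page(page_text)] += 1
    if not votes:
        return None  # no results retrieved, so no evidence to classify on
    return votes.most_common(1)[0][0]


# Toy stand-ins for a search engine and a page classifier (illustrative only).
TOY_INDEX = {
    "sd1000 battery": [
        "digital camera battery pack for compact cameras",
        "camera accessories and chargers",
        "laptop battery replacement",
    ],
}


def toy_search(query: str) -> list:
    return TOY_INDEX.get(query, [])


def toy_classify_page(text: str) -> str:
    return "cameras" if "camera" in text else "computers"


label = classify_query("sd1000 battery", toy_search, toy_classify_page)
print(label)
```

In this toy setup the short query carries little signal on its own, while the retrieved pages supply enough text for even a crude classifier to vote on a topic.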