Named entity recognition in query
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Boston, MA, USA
Session: Information extraction
Pages 267-274
Year of Publication: 2009
ISBN: 978-1-60558-483-6
Authors
Jiafeng Guo, Institute of Computing Technology, CAS, Beijing, China
Gu Xu, Microsoft Research Asia, Beijing, China
Xueqi Cheng, Institute of Computing Technology, CAS, Beijing, China
Hang Li, Microsoft Research Asia, Beijing, China
ABSTRACT
This paper addresses the problem of Named Entity Recognition in Query (NERQ), which involves detection of the named entity in a given query and classification of the named entity into predefined classes. NERQ is potentially useful in many applications in web search. The paper proposes taking a probabilistic approach to the task using query log data and Latent Dirichlet Allocation. We consider contexts of a named entity (i.e., the remainders of the named entity in queries) as words of a document, and classes of the named entity as topics. The topic model is constructed by a novel and general learning method referred to as WS-LDA (Weakly Supervised Latent Dirichlet Allocation), which employs weakly supervised learning (rather than unsupervised learning) using partially labeled seed entities. Experimental results show that the proposed method based on WS-LDA can accurately perform NERQ, and outperform the baseline methods.
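As a rough illustration of the data representation the abstract describes, the sketch below groups the contexts of each entity into a bag of "words" that a topic model could take as one document. It is not the authors' implementation and does not perform WS-LDA training. The `#` placeholder, the function names, and the sample queries are assumptions made only for this example.

```python
from collections import Counter, defaultdict

PLACEHOLDER = "#"  # assumed marker for the slot the entity occupied


def extract_context(query, entity):
    """Return the query with the entity replaced by PLACEHOLDER.

    Returns None when the entity does not occur in the query as a
    contiguous token sequence.
    """
    q_tokens = query.lower().split()
    e_tokens = entity.lower().split()
    if not e_tokens:
        return None
    n = len(e_tokens)
    for i in range(len(q_tokens) - n + 1):
        if q_tokens[i:i + n] == e_tokens:
            remainder = q_tokens[:i] + [PLACEHOLDER] + q_tokens[i + n:]
            return " ".join(remainder)
    return None


def build_entity_documents(queries, entities):
    """Map each entity to a Counter of its contexts across the query log.

    Each Counter plays the role of one "document" whose "words" are
    contexts; the classes of the entity would correspond to topics.
    """
    docs = defaultdict(Counter)
    for query in queries:
        for entity in entities:
            context = extract_context(query, entity)
            if context is not None:
                docs[entity][context] += 1
    return docs


# Hypothetical query log and entity list, for illustration only.
sample_queries = [
    "harry potter walkthrough",
    "watch harry potter",
    "kung fu panda trailer",
    "weather boston",
]
sample_entities = ["harry potter", "kung fu panda"]

entity_docs = build_entity_documents(sample_queries, sample_entities)
for entity, contexts in entity_docs.items():
    print(entity, dict(contexts))
```

In a weakly supervised setting such as the one the abstract outlines, a small set of seed entities would additionally carry class labels; those labels are left out here because the abstract does not specify how they enter the model.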