Named entity recognition in query
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Boston, MA, USA
SESSION: Information extraction
Pages 267-274  
Year of Publication: 2009
ISBN: 978-1-60558-483-6
Authors
Jiafeng Guo  Institute of Computing Technology, CAS, Beijing, China
Gu Xu  Microsoft Research Asia, Beijing, China
Xueqi Cheng  Institute of Computing Technology, CAS, Beijing, China
Hang Li  Microsoft Research Asia, Beijing, China
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1571941.1571989

ABSTRACT

This paper addresses the problem of Named Entity Recognition in Query (NERQ), which involves detecting the named entity in a given query and classifying it into predefined classes. NERQ is potentially useful in many web search applications. The paper proposes a probabilistic approach to the task using query log data and Latent Dirichlet Allocation. We treat the contexts of a named entity (i.e., what remains of each query once the named entity is removed) as the words of a document, and the classes of the named entity as topics. The topic model is constructed by a novel and general learning method referred to as WS-LDA (Weakly Supervised Latent Dirichlet Allocation), which employs weakly supervised learning (rather than unsupervised learning) using partially labeled seed entities. Experimental results show that the proposed method based on WS-LDA can accurately perform NERQ and outperforms the baseline methods.
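
The document analogy in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it only shows one plausible way to turn a query log into per-entity "documents" whose "words" are contexts. The function names, the "#" placeholder for the removed entity, and the toy queries are all assumptions made for illustration, and the topic model (WS-LDA) itself is not implemented here.

```python
from collections import Counter, defaultdict

PLACEHOLDER = "#"  # assumed marker standing in for the removed entity


def extract_context(query, entity):
    """Return the query with the entity replaced by a placeholder.

    Matching is on whole, lowercased tokens. Returns None when the
    entity does not occur in the query.
    """
    q_tokens = query.lower().split()
    e_tokens = entity.lower().split()
    n = len(e_tokens)
    for i in range(len(q_tokens) - n + 1):
        if q_tokens[i:i + n] == e_tokens:
            return " ".join(q_tokens[:i] + [PLACEHOLDER] + q_tokens[i + n:])
    return None


def build_entity_documents(queries, entities):
    """Map each entity to a bag of contexts, i.e. its pseudo-document."""
    docs = defaultdict(Counter)
    for query in queries:
        for entity in entities:
            context = extract_context(query, entity)
            if context is not None:
                docs[entity][context] += 1
    return docs


# Toy query log and seed entities (hypothetical data).
toy_log = [
    "harry potter walkthrough",
    "harry potter cheats",
    "harry potter walkthrough",
    "kung fu panda trailer",
]
seeds = ["harry potter", "kung fu panda"]
entity_docs = build_entity_documents(toy_log, seeds)
```

Under this sketch, each entry of entity_docs plays the role of one document, and a topic model trained over these bags of contexts would treat entity classes as its topics. How the partial class labels of seed entities constrain training is specific to WS-LDA and is left out.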


