| Recognition and classification of noun phrases in queries for effective retrieval |
Source
Conference on Information and Knowledge Management
Proceedings of the sixteenth ACM conference on information and knowledge management
Lisbon, Portugal
SESSION: Natural language II (IR)
Pages 711-720
Year of Publication: 2007
ISBN:978-1-59593-803-9
Authors
Wei Zhang, University of Illinois at Chicago, Chicago, IL
Shuang Liu, Ask.com, Edison, NJ
Clement Yu, University of Illinois at Chicago, Chicago, IL
Chaojing Sun, Broadcom Corporation, San Diego, CA
Fang Liu, Microsoft, Redmond, WA
Weiyi Meng, Binghamton University, Binghamton, NY
ABSTRACT
Using phrases properly in document retrieval has been shown to improve retrieval effectiveness. In this paper, we define four types of noun phrases and present an algorithm for recognizing them in queries. The algorithm combines the strengths of several existing tools. We test it on a set of 500 web queries from a query log and a set of 238 TREC queries, and the experimental results show that it yields high phrase recognition accuracy. For comparison, we also apply a baseline noun phrase recognition algorithm to the TREC queries. We then run a document retrieval experiment on the TREC queries under three conditions: (1) without any phrases, (2) with the phrases recognized by the baseline algorithm, and (3) with the phrases recognized by our algorithm. The retrieval effectiveness of (3) is better than that of (2), which in turn is better than that of (1). This demonstrates that using phrases in queries improves retrieval effectiveness and that better noun phrase recognition yields higher retrieval performance.