ABSTRACT
Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this paper we describe probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines, and we present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR), which uses proximity and question-type features, achieves a total reciprocal document rank of 0.20 on the TREC 8 corpus. Our techniques have been implemented as a Web-accessible system called NSIR.
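The abstract reports a total reciprocal document rank but does not define it. As a rough illustration only, the sketch below assumes the metric sums 1/rank over every ranked item that contains a correct answer; that definition, the function name `total_reciprocal_rank`, and the sample ranking are assumptions made for this example and are not taken from the text above.

```python
from typing import Sequence


def total_reciprocal_rank(is_correct: Sequence[bool]) -> float:
    """Sum 1/rank over every ranked item flagged as correct.

    Ranks start at 1. This is an assumed reading of the metric,
    not a definition given in the abstract.
    """
    return sum(1.0 / rank for rank, ok in enumerate(is_correct, start=1) if ok)


# Hypothetical ranked list: the items at ranks 2 and 4 contain a correct answer.
hits = [False, True, False, True]
score = total_reciprocal_rank(hits)  # 1/2 + 1/4
print(score)
```

Under this reading, a correct item near the top of the ranking contributes far more than one further down, so the score rewards systems that place correct material early.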
INDEX TERMS
Primary Classification: H. Information Systems; H.4 Information Systems Applications; H.4.m Miscellaneous
Additional Classification: D. Software; D.2 Software Engineering; D.2.8 Metrics (Subjects: Performance measures)
General Terms: Algorithms, Design, Experimentation, Languages, Performance
Keywords: answer extraction, answer selection, information retrieval, natural language processing, query modulation, question answering, search engines