ABSTRACT
We introduce a method for learning to find documents on the Web that contain answers to a given natural language question. In our approach, questions are transformed into new queries aimed at maximizing the probability of retrieving answers from existing information retrieval systems. The method involves automatically learning phrase features for classifying questions into different types, automatically generating candidate query transformations from a training set of question/answer pairs, and automatically evaluating the candidate transformations on target information retrieval systems such as real-world general purpose search engines. At run time, questions are transformed into a set of queries, and the retrieved documents are reranked. We present a prototype search engine, Tritus, that applies the method to Web search engines. Blind evaluation on a set of real queries from a Web search engine log shows that the method significantly outperforms the underlying search engines, and outperforms a commercial search engine specializing in question answering. Our methodology also cleanly supports combining documents retrieved from different search engines: a system that merges search results from multiple Web search engines yields additional improvement.
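The run-time step described above (classify the question by its leading phrase, rewrite it into several queries, then rerank the merged results) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `TRANSFORMS` table, its weights, the function names, and the reciprocal-rank scoring are hypothetical stand-ins, not the learned transformations or the reranking scheme used by Tritus.

```python
from collections import defaultdict

# Hypothetical table: question phrase -> [(transformation, weight), ...].
# In the described method such entries would be learned from question/answer
# pairs and evaluated against a search engine; here they are made up.
TRANSFORMS = {
    "what is a": [("is a", 2.0), ("refers to", 1.0)],
    "how do i": [("how to", 1.5)],
}


def match_question_phrase(question):
    """Return the longest known phrase that prefixes the question, or None."""
    text = question.lower().strip()
    matches = [p for p in TRANSFORMS if text.startswith(p + " ")]
    return max(matches, key=len) if matches else None


def transform(question):
    """Rewrite a question into a list of (query, weight) pairs."""
    text = question.lower().strip().rstrip("?")
    phrase = match_question_phrase(question)
    if phrase is None:
        return [(text, 1.0)]  # fall back to the unmodified question
    rest = text[len(phrase):].strip()
    return [(f'{rest} "{tr}"', w) for tr, w in TRANSFORMS[phrase]]


def rerank(queries, search):
    """Merge the documents retrieved for every query and sort them by score.

    `search` is any callable mapping a query string to a ranked list of
    document ids. The score (weight divided by rank, summed over queries)
    is an assumed placeholder, not the scoring used in the paper.
    """
    scores = defaultdict(float)
    for query, weight in queries:
        for rank, doc in enumerate(search(query), start=1):
            scores[doc] += weight / rank
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Because `search` is passed in as a plain callable, the same `rerank` function can merge results from several engines by calling each one inside a wrapper, which mirrors the multi-engine combination mentioned in the abstract.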