Mining the web for answers to natural language questions
Source: Conference on Information and Knowledge Management
Proceedings of the Tenth International Conference on Information and Knowledge Management
Atlanta, Georgia, USA
Session: World Wide Web
Pages: 143-150
Year of Publication: 2001
ISBN: 1-58113-436-3
Authors
Dragomir R. Radev  University of Michigan, Ann Arbor, MI
Hong Qi  University of Michigan, Ann Arbor, MI
Zhiping Zheng  University of Michigan, Ann Arbor, MI
Sasha Blair-Goldensohn  University of Michigan, Ann Arbor, MI
Zhu Zhang  University of Michigan, Ann Arbor, MI
Weiguo Fan  University of Michigan, Ann Arbor, MI
John Prager  IBM TJ Watson Research Center, Hawthorne, NY
Sponsors
SIGMIS: ACM Special Interest Group on Management Information Systems
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 68,   Citation Count: 21
DOI: http://doi.acm.org/10.1145/502585.502610

ABSTRACT

The Web is becoming one of the largest information and knowledge repositories. Many large-scale search engines (Google, Fast, Northern Light, etc.) have emerged to help users find information. In this paper, we study how to use these existing search engines effectively to mine the Web and discover the "correct" answers to factual natural language questions. We propose a probabilistic algorithm called QASM (Question Answering using Statistical Models) that learns the best query paraphrase of a natural language question. We validate our approach for both local and web search engines using questions from the TREC evaluation. We also show how this algorithm can be combined with another algorithm (AnSel) to produce precise answers to natural language questions.
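
The idea of choosing a query paraphrase by how well it retrieves known answers can be illustrated with a small sketch. This is not QASM itself: the abstract describes QASM only as a probabilistic algorithm, whereas the code below simply sums reciprocal ranks over a handful of training questions. The paraphrase operators (`identity`, `drop_stopwords`), the word-overlap search engine, the scoring rule, and the sample documents are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stopword list; not taken from the paper.
STOPWORDS = {"what", "who", "is", "the", "of", "a", "an", "in"}


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, and trim surrounding punctuation."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w]


def identity(words: List[str]) -> List[str]:
    """Paraphrase operator (assumed): send the question unchanged."""
    return list(words)


def drop_stopwords(words: List[str]) -> List[str]:
    """Paraphrase operator (assumed): keep only content words."""
    return [w for w in words if w not in STOPWORDS]


OPERATORS: Dict[str, Callable[[List[str]], List[str]]] = {
    "identity": identity,
    "drop_stopwords": drop_stopwords,
}


def search(query: List[str], docs: List[str]) -> List[int]:
    """Toy engine: rank document indices by distinct query words matched."""
    overlap = [len(set(query) & set(tokenize(d))) for d in docs]
    # sorted() is stable, so ties keep their original document order.
    return sorted(range(len(docs)), key=lambda i: -overlap[i])


def reciprocal_rank(ranking: List[int], docs: List[str], answer: str) -> float:
    """Return 1/rank of the first document containing the answer, else 0."""
    for rank, i in enumerate(ranking, start=1):
        if answer.lower() in docs[i].lower():
            return 1.0 / rank
    return 0.0


def score_operators(training: List[Tuple[str, str]],
                    docs: List[str]) -> Dict[str, float]:
    """Sum reciprocal ranks per operator over (question, answer) pairs."""
    totals = {name: 0.0 for name in OPERATORS}
    for question, answer in training:
        words = tokenize(question)
        for name, op in OPERATORS.items():
            totals[name] += reciprocal_rank(search(op(words), docs), docs, answer)
    return totals


# Made-up documents and questions, for illustration only.
DOCS = [
    "What is the meaning of the question in the survey",
    "Ottawa became the capital city for Canada",
    "Mount Everest, highest mountain on Earth",
]
TRAINING = [
    ("What is the capital of Canada?", "Ottawa"),
    ("What is the highest mountain?", "Everest"),
]

totals = score_operators(TRAINING, DOCS)
best = max(totals, key=totals.get)
print(totals, best)
```

On this toy data the unchanged question is distracted by the stopword-heavy first document, so `drop_stopwords` scores 2.0 against 1.0 for `identity` and is selected. A probabilistic approach such as the one the abstract describes would estimate which paraphrase works best rather than take a plain maximum over summed scores.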

