MURAX: a robust linguistic approach for question answering using an on-line encyclopedia
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval
Pittsburgh, Pennsylvania, United States
Pages: 181 - 190  
Year of Publication: 1993
ISBN: 0-89791-605-0
Author
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/160688.160717

ABSTRACT

Robust linguistic methods are applied to the task of answering closed-class questions using a corpus of natural language. The methods are illustrated in a broad domain: answering general-knowledge questions using an on-line encyclopedia. A closed-class question is a question stated in natural language, which assumes some definite answer typified by a noun phrase rather than a procedural answer. The methods hypothesize noun phrases that are likely to be the answer, and present the user with relevant text in which they are marked, focussing the user's attention appropriately. Furthermore, the sentences of matching text that are shown to the user are selected to confirm phrase relations implied by the question, rather than being selected solely on the basis of word frequency. The corpus is accessed via an information retrieval (IR) system that supports boolean search with proximity constraints. Queries are automatically constructed from the phrasal content of the question, and passed to the IR system to find relevant text. Then the relevant text is itself analyzed; noun phrase hypotheses are extracted and new queries are independently made to confirm phrase relations for the various hypotheses. The methods are currently being implemented in a system called MURAX and although this process is not complete, it is sufficiently advanced for an interim evaluation to be presented.


