Generic soft pattern models for definitional question answering
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Salvador, Brazil
SESSION: Question answering
Pages: 384 - 391  
Year of Publication: 2005
ISBN:1-59593-034-5
Authors
Hang Cui  National University of Singapore
Min-Yen Kan  National University of Singapore
Tat-Seng Chua  National University of Singapore
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1076034.1076101

ABSTRACT

This paper explores probabilistic lexico-syntactic pattern matching, also known as soft pattern matching. While previous methods in soft pattern matching are ad hoc in computing the degree of match, we propose two formal matching models: one based on bigrams and the other on the Profile Hidden Markov Model (PHMM). Both models provide a theoretically sound method to model pattern matching as a probabilistic process that generates token sequences. We demonstrate the effectiveness of these models on definition sentence retrieval for definitional question answering. We show that both models significantly outperform state-of-the-art manually constructed patterns. A critical difference between the two models is that the PHMM technique handles language variations more effectively but requires more training data to converge. We believe that both models can be extended to other areas where lexico-syntactic pattern matching can be applied.
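To make the bigram idea concrete, the following is a minimal sketch of scoring a token sequence against training sequences with an interpolated bigram model. It is not the authors' implementation: the function names, the `<s>` start marker, the `<TERM>` and `NN` tokens, the interpolation weight `lam`, and the add-one smoothing of the unigram estimate are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_bigram(sequences):
    """Collect unigram, context, and bigram counts from token sequences."""
    unigrams, contexts, bigrams = Counter(), Counter(), Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq)       # hypothetical start-of-sequence marker
        unigrams.update(seq)
        contexts.update(padded[:-1])       # tokens that act as a left context
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, contexts, bigrams


def log_prob(seq, model, lam=0.7):
    """Log-probability of seq under a bigram model interpolated with unigrams.

    lam is an assumed interpolation weight; the unigram estimate uses
    add-one smoothing so unseen tokens never produce log(0).
    """
    unigrams, contexts, bigrams = model
    total = sum(unigrams.values())
    vocab = len(unigrams) + 1              # one extra slot for unseen tokens
    padded = ["<s>"] + list(seq)
    score = 0.0
    for prev, tok in zip(padded, padded[1:]):
        p_uni = (unigrams[tok] + 1) / (total + vocab)
        p_bi = bigrams[(prev, tok)] / contexts[prev] if contexts[prev] else 0.0
        score += math.log(lam * p_bi + (1 - lam) * p_uni)
    return score


# Toy training data: generalized token sequences around a search term.
training = [
    ["<TERM>", "is", "a", "NN"],
    ["<TERM>", ",", "a", "NN"],
    ["<TERM>", "is", "the", "NN"],
]
model = train_bigram(training)
seen_order = log_prob(["<TERM>", "is", "a", "NN"], model)
scrambled = log_prob(["NN", "a", "is", "<TERM>"], model)
```

In this sketch a sequence whose token order resembles the training sequences receives a higher log-probability than the same tokens in scrambled order, which is the sense in which the degree of match becomes a probability rather than an ad hoc score.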


