ABSTRACT
Question classification is an important part of question answering. This paper presents our work on automatic question classification using machine learning. We experimented with five machine learning algorithms: Nearest Neighbors (NN), Naive Bayes (NB), Decision Tree (DT), Sparse Network of Winnows (SNoW), and Support Vector Machines (SVM), using two kinds of features: bag-of-words and bag-of-ngrams. The experimental results show that, with only surface text features, the SVM outperforms the other four methods on this task. Further, we propose a special kernel function, the tree kernel, that enables the SVM to take advantage of the syntactic structure of questions. We describe how the tree kernel can be computed efficiently by dynamic programming. The performance of our approach is promising when tested on questions from the TREC QA track.
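To illustrate how a tree kernel can be computed by dynamic programming, the following is a minimal sketch of a subset-tree kernel in the style of Collins and Duffy. It is not necessarily the exact kernel used in this work. The nested-tuple tree encoding, the `decay` parameter, and the function names `production`, `nodes`, and `tree_kernel` are assumptions made for this example. Memoizing the per-node-pair score means each pair of nodes is evaluated at most once.

```python
from functools import lru_cache

# Assumed encoding: a parse tree is a nested tuple (label, child, child, ...),
# where each child is either another tuple (a subtree) or a str (a word).


def production(node):
    """Return the grammar production at a node: its label plus its children's labels/words."""
    label, *children = node
    rhs = tuple(
        ("word", c) if isinstance(c, str) else ("node", c[0]) for c in children
    )
    return (label, rhs)


def nodes(tree):
    """List every internal (non-word) node of the tree, root first."""
    found = [tree]
    for child in tree[1:]:
        if not isinstance(child, str):
            found.extend(nodes(child))
    return found


def tree_kernel(t1, t2, decay=1.0):
    """Sum, over all node pairs, the decay-weighted count of shared tree fragments."""

    @lru_cache(maxsize=None)  # the dynamic-programming table, keyed by node pair
    def common(n1, n2):
        # Fragments rooted at n1 and n2 can only match if the productions match.
        if production(n1) != production(n2):
            return 0.0
        score = decay
        # Matching productions guarantee that children align position by position.
        for c1, c2 in zip(n1[1:], n2[1:]):
            if not isinstance(c1, str):
                # Each child is either left unexpanded (1) or expanded (common).
                score *= 1.0 + common(c1, c2)
        return score

    return sum(common(a, b) for a in nodes(t1) for b in nodes(t2))


if __name__ == "__main__":
    the_dog = ("NP", ("DT", "the"), ("NN", "dog"))
    the_cat = ("NP", ("DT", "the"), ("NN", "cat"))
    print(tree_kernel(the_dog, the_dog))
    print(tree_kernel(the_dog, the_cat))
```

With `decay=1.0` the value is the raw number of shared fragments. A smaller `decay` down-weights larger fragments, which is one common way to keep the kernel from being dominated by near-identical trees.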