A maximal figure-of-merit learning approach to text categorization
Source Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Toronto, Canada
SESSION: Text categorization
Pages: 174 - 181  
Year of Publication: 2003
ISBN:1-58113-646-3
Authors
Sheng Gao  Institute for Infocomm Research, Singapore
Wen Wu  National Univ. of Singapore, Singapore
Chin-Hui Lee  National Univ. of Singapore, Singapore and Georgia Institute of Technology, Atlanta, GA
Tat-Seng Chua  National Univ. of Singapore, Singapore
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/860435.860469

ABSTRACT

A novel maximal figure-of-merit (MFoM) learning approach to text categorization is proposed. Unlike conventional techniques, the proposed MFoM method attempts to integrate any performance metric of interest (e.g., accuracy, recall, precision, or the F1 measure) into the design of any classifier. The corresponding classifier parameters are learned by optimizing an overall objective function of interest. To solve this highly nonlinear optimization problem, we use a generalized probabilistic descent algorithm. The MFoM learning framework is evaluated on the Reuters-21578 task with LSI-based feature extraction and a binary tree classifier. Experimental results indicate that the MFoM classifier gives improved F1 and enhanced robustness over the conventional one. It also outperforms the popular SVM method in micro-averaged F1. Extensions that design discriminative multiple-category MFoM classifiers for application scenarios with new performance metrics can also be envisioned.
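
The idea of building a performance metric directly into classifier training can be illustrated with a small sketch. This is not the authors' implementation: it assumes a linear discriminant rather than the binary tree classifier named in the abstract, uses hand-made two-dimensional data rather than LSI features from Reuters-21578, and substitutes plain batch gradient ascent for the generalized probabilistic descent algorithm. The names TOY, score, smooth_f1, and train are hypothetical. The sketch replaces the hard true-positive, false-positive, and false-negative counts with sigmoid scores so that F1 becomes differentiable in the classifier parameters.

```python
import math

# Hypothetical toy data: (feature vector, label); label 1 means "in category".
TOY = [
    ((2.0, 2.0), 1), ((1.5, 2.5), 1),
    ((-1.0, -1.0), 0), ((0.0, -1.0), 0), ((-1.0, 0.0), 0),
    ((-2.0, -1.0), 0), ((-1.0, -2.0), 0), ((0.0, 0.0), 0),
]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(weights, bias, x, alpha):
    """Soft decision in (0, 1) from an assumed linear discriminant."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(alpha * z)


def smooth_f1(weights, bias, samples, alpha=1.0):
    """F1 with hard counts replaced by soft (sigmoid) counts."""
    tp = fp = fn = 0.0
    for x, y in samples:
        s = score(weights, bias, x, alpha)
        if y == 1:
            tp += s
            fn += 1.0 - s
        else:
            fp += s
    return 2.0 * tp / (2.0 * tp + fp + fn)


def train(samples, dim, lr=0.5, epochs=200, alpha=1.0):
    """Batch gradient ascent on the smoothed F1 (a stand-in for GPD)."""
    weights = [0.0] * dim
    bias = 0.0
    n_pos = sum(1 for _, y in samples if y == 1)  # assumed to be >= 1
    for _ in range(epochs):
        scores = [score(weights, bias, x, alpha) for x, _ in samples]
        tp = sum(s for s, (_, y) in zip(scores, samples) if y == 1)
        fp = sum(s for s, (_, y) in zip(scores, samples) if y == 0)
        # 2*tp + fp + fn equals tp + fp + n_pos because tp + fn = n_pos.
        denom = tp + fp + n_pos
        grad_w = [0.0] * dim
        grad_b = 0.0
        for s, (x, y) in zip(scores, samples):
            # Derivative of the smoothed F1 with respect to this soft score.
            if y == 1:
                d_f1_d_s = 2.0 * (denom - tp) / denom ** 2
            else:
                d_f1_d_s = -2.0 * tp / denom ** 2
            # Chain rule through the sigmoid.
            g = d_f1_d_s * alpha * s * (1.0 - s)
            for i in range(dim):
                grad_w[i] += g * x[i]
            grad_b += g
        weights = [w + lr * gw for w, gw in zip(weights, grad_w)]
        bias += lr * grad_b
    return weights, bias
```

Calling train(TOY, dim=2) and comparing smooth_f1 before and after training shows the surrogate metric rising from its starting value. Swapping in a different smoothed metric (for example precision or recall) only changes the objective and its derivative with respect to the soft scores, which is the flexibility the abstract describes.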


