ABSTRACT
A novel maximal figure-of-merit (MFoM) learning approach to text categorization is proposed. Unlike conventional techniques, the proposed MFoM method attempts to integrate any performance metric of interest (e.g., accuracy, recall, precision, or the F1 measure) into the design of any classifier. The classifier parameters are learned by optimizing an overall objective function derived from that metric. To solve this highly nonlinear optimization problem, we use a generalized probabilistic descent algorithm. The MFoM learning framework is evaluated on the Reuters-21578 task with LSI-based feature extraction and a binary tree classifier. Experimental results indicate that the MFoM classifier achieves a higher F1 and greater robustness than the conventional classifier. It also outperforms the popular SVM method in micro-averaged F1. Extensions to discriminative multiple-category MFoM classifiers, for application scenarios with new performance metrics, can also be envisioned.
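The idea of folding a metric such as F1 into the training objective can be illustrated with a minimal sketch. It assumes a linear discriminant, a sigmoid-smoothed approximation of the true-positive, false-positive, and false-negative counts, and plain batch gradient ascent as a simple stand-in for generalized probabilistic descent. None of these choices come from the abstract, which names LSI features and a binary tree classifier; the names `soft_f1`, `hard_f1`, `train`, `alpha`, and the toy data are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def scores(w, b, X):
    # Linear discriminant score for each sample (an assumed classifier form).
    return [sum(wj * xj for wj, xj in zip(w, x)) + b for x in X]


def soft_f1(w, b, X, y, alpha=1.0):
    # Smoothed F1: each hard decision is replaced by p = sigmoid(alpha * score).
    # With soft counts TP = sum(p*y), FP = sum(p*(1-y)), FN = sum((1-p)*y),
    # F1 = 2TP / (2TP + FP + FN) simplifies to 2*sum(p*y) / (sum(p) + sum(y)).
    p = [sigmoid(alpha * s) for s in scores(w, b, X)]
    num = 2.0 * sum(pi * yi for pi, yi in zip(p, y))
    den = sum(p) + sum(y)
    return num / den, p, num, den


def hard_f1(w, b, X, y):
    # Ordinary F1 computed from thresholded decisions (score > 0 means positive).
    pred = [1 if s > 0 else 0 for s in scores(w, b, X)]
    tp = sum(1 for pi, yi in zip(pred, y) if pi == 1 and yi == 1)
    fp = sum(1 for pi, yi in zip(pred, y) if pi == 1 and yi == 0)
    fn = sum(1 for pi, yi in zip(pred, y) if pi == 0 and yi == 1)
    denom = 2 * tp + fp + fn
    return 2.0 * tp / denom if denom > 0 else 0.0


def train(X, y, steps=300, lr=0.5, alpha=1.0):
    # Batch gradient ascent on the smoothed F1 (stand-in for GPD).
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(steps):
        _, p, num, den = soft_f1(w, b, X, y, alpha)
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, yi, pi in zip(X, y, p):
            # dF1/dp_i = (2*y_i*den - num) / den^2, chained through the sigmoid.
            dscore = (2.0 * yi * den - num) / (den * den) * alpha * pi * (1.0 - pi)
            for j, xj in enumerate(x):
                grad_w[j] += dscore * xj
            grad_b += dscore
        w = [wj + lr * gj for wj, gj in zip(w, grad_w)]
        b += lr * grad_b
    return w, b


# Toy, linearly separable data (illustrative only, not from Reuters-21578).
X_toy = [[2.0, 1.0], [1.5, 2.0], [1.0, 1.5], [-1.0, -0.5], [-2.0, -1.0], [-0.5, -1.5]]
y_toy = [1, 1, 1, 0, 0, 0]

w_fit, b_fit = train(X_toy, y_toy)
print("smoothed F1:", soft_f1(w_fit, b_fit, X_toy, y_toy)[0])
print("hard F1:", hard_f1(w_fit, b_fit, X_toy, y_toy))
```

Because the smoothed F1 is differentiable in the classifier parameters, the metric itself drives the updates rather than a surrogate such as likelihood or margin. Swapping in recall or precision would only change the `num` and `den` expressions and the corresponding derivative.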
CITED BY 8
Sheng Gao, Wen Wu, Chin-Hui Lee, Tat-Seng Chua, A MFoM learning approach to robust multiclass multi-label text categorization, Proceedings of the twenty-first international conference on Machine learning, p.42, July 04-08, 2004, Banff, Alberta, Canada