Feature selection with conditional mutual information maximin in text categorization
Source: Conference on Information and Knowledge Management
Proceedings of the thirteenth ACM international conference on Information and knowledge management
Washington, D.C., USA
SESSION: IR-4 (information retrieval): machine learning in information retrieval
Pages: 342 - 349  
Year of Publication: 2004
ISBN:1-58113-874-1
Authors
Gang Wang  Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
Frederick H. Lochovsky  Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 12,   Downloads (12 Months): 104,   Citation Count: 4
DOI: http://doi.acm.org/10.1145/1031171.1031241

ABSTRACT

Feature selection is an important component of text categorization. It can both speed up a classifier's computation and reduce overfitting. Several feature selection methods, such as information gain and mutual information, are widely used. Although they greatly improve classifier performance, they share a common drawback: they do not consider the mutual relationships among features. As a result, they are not very effective when one feature's predictive power is weakened by others, or when the selected features are biased towards the major categories. In this paper, we propose a novel feature selection method for text categorization called conditional mutual information maximin (CMIM). It selects a set of features that are individually discriminating and only weakly dependent on one another. Experimental results show that CMIM can perform much better than traditional feature selection methods.
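The abstract does not spell out the algorithm, so the following is only a minimal sketch of a greedy maximin criterion of the kind the method's name suggests. It is not the authors' implementation. It assumes binary term-presence features and plain frequency estimates of the probabilities; the function names and the toy data are hypothetical. Each candidate starts with its mutual information with the category label, and after every pick its score is lowered to the smallest conditional mutual information given any already selected feature, so a term that only duplicates a selected term drops to zero.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def conditional_mutual_information(xs, ys, zs):
    """Estimate I(X; Y | Z) as the p(z)-weighted average of I(X; Y | Z = z)."""
    n = len(zs)
    cmi = 0.0
    for z, cz in Counter(zs).items():
        idx = [i for i in range(n) if zs[i] == z]
        sub_x = [xs[i] for i in idx]
        sub_y = [ys[i] for i in idx]
        cmi += (cz / n) * mutual_information(sub_x, sub_y)
    return cmi


def cmim_select(features, labels, k):
    """Greedily pick up to k feature names using a maximin score.

    features: dict mapping a feature name to its per-document values.
    labels:   per-document category labels.
    """
    # Initial score: how informative each feature is on its own.
    score = {name: mutual_information(col, labels)
             for name, col in features.items()}
    selected = []
    while len(selected) < k and score:
        best = max(score, key=score.get)
        selected.append(best)
        del score[best]
        # Keep, for every remaining candidate, the minimum information it
        # adds about the label given any single already selected feature.
        for name in score:
            cmi = conditional_mutual_information(
                features[name], labels, features[best])
            score[name] = min(score[name], cmi)
    return selected


# Hypothetical toy corpus: 8 documents, 2 categories, 3 binary term features.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "goal":  [1, 1, 1, 0, 0, 0, 0, 0],
    "score": [1, 1, 1, 0, 0, 0, 0, 0],  # exact duplicate of "goal"
    "vote":  [0, 0, 0, 0, 1, 1, 0, 0],
}

print(cmim_select(features, labels, 2))  # -> ['goal', 'vote']
```

On this toy data, ranking by mutual information alone would place "goal" and "score" first even though the second adds nothing once the first is chosen. The maximin score skips the duplicate and picks "vote", which is weaker on its own but still informative given "goal".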





REVIEW

Reviewer: Luminita State

The main, and very often computationally overwhelming, characteristic of text data is its extremely high dimensionality, which could prove to be a severe obstacle for any classification algorithm. One of the most frequently used ways to reduce dim...
