Higher order feature selection for text classification

Source
Knowledge and Information Systems
Volume 9, Issue 4 (April 2006)
Pages: 468-491
Year of Publication: 2006
ISSN: 0219-1377

Authors
Jan Bakus, Department of Systems Design Engineering, University of Waterloo, Waterloo, Ontario, Canada
Mohamed S. Kamel, Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario, Canada

Publisher
Springer-Verlag New York, Inc., New York, NY, USA

Bibliometrics
Downloads (6 Weeks): n/a, Downloads (12 Months): n/a, Citation Count: 1

ABSTRACT
In this paper, we present MIFS-C, a variant of the mutual information feature-selection (MIFS) algorithms. We present an algorithm that finds the optimal value of the redundancy parameter, a key parameter in MIFS-type algorithms. We also present an algorithm that reduces the execution time of all the MIFS variants. Overall, MIFS-C achieves classification accuracy comparable to (and in some cases better than) that of other MIFS algorithms, while running faster. We also compared MIFS-C with other feature selectors and found that it performs better in most cases. MIFS-C performed especially well on the breakeven and F-measure evaluation measures because the algorithm can be tuned to optimise them.
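
The abstract does not state the MIFS-C criterion, its parameter-tuning algorithm, or its speed-up, so none of those are reproduced here. As a rough illustration of what a "redundancy parameter" does in MIFS-type selection in general, the sketch below follows the greedy criterion commonly attributed to Battiti (1994): at each step, pick the feature f that maximises I(C; f) - beta * sum of I(f; s) over the already selected features s. The function names, the toy term-presence data, and the plug-in mutual information estimate are all assumptions made for this example.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    # Sum over observed pairs of p(x,y) * log2(p(x,y) / (p(x) p(y))).
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mifs_select(features, labels, k, beta):
    """Greedy MIFS-style selection (illustrative, not the paper's MIFS-C).

    features: dict mapping a feature name to a list of discrete values.
    beta: redundancy parameter; 0 ranks by relevance I(C; f) alone.
    """
    relevance = {
        name: mutual_information(col, labels) for name, col in features.items()
    }
    selected = []
    remaining = sorted(features)  # sorted so that ties break deterministically
    while remaining and len(selected) < k:

        def score(name):
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            )
            return relevance[name] - beta * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 8 documents, 4 classes, binary term-presence features.
labels = [0, 0, 1, 1, 2, 2, 3, 3]
features = {
    "term_a": [0, 0, 0, 0, 1, 1, 1, 1],
    "term_a_dup": [0, 0, 0, 0, 1, 1, 1, 1],  # exact duplicate of term_a
    "term_b": [0, 0, 1, 1, 0, 0, 1, 0],      # less relevant, but not redundant
}

no_penalty = mifs_select(features, labels, k=2, beta=0.0)
with_penalty = mifs_select(features, labels, k=2, beta=1.0)
print(no_penalty)    # ['term_a', 'term_a_dup']
print(with_penalty)  # ['term_a', 'term_b']
```

With beta = 0 the duplicate term is selected because only relevance counts; with beta = 1 its redundancy with the already selected term cancels its relevance, so the complementary term is chosen instead. This sensitivity to beta is why the choice of the redundancy parameter matters in this family of algorithms.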