Feature selection using linear classifier weights: interaction with classification models
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Sheffield, United Kingdom
SESSION: Text classification
Pages: 234 - 241  
Year of Publication: 2004
ISBN:1-58113-881-4
Authors
Dunja Mladenić  Jožef Stefan Institute, Ljubljana, Slovenia
Janez Brank  Jožef Stefan Institute, Ljubljana, Slovenia
Marko Grobelnik  Jožef Stefan Institute, Ljubljana, Slovenia
Natasa Milic-Frayling  Microsoft Research Ltd, Cambridge, United Kingdom
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1008992.1009034

ABSTRACT

This paper explores feature scoring and selection based on weights from linear classification models. It investigates how these methods combine with various learning models. Our comparative analysis includes three learning algorithms: Naïve Bayes, Perceptron, and Support Vector Machines (SVM), in combination with three feature weighting methods: Odds Ratio, Information Gain, and weights from linear models (the linear SVM and the Perceptron). Experiments show that feature selection using weights from linear SVMs yields better classification performance than other feature weighting methods when combined with the three explored learning algorithms. The results support the conjecture that it is the sophistication of the feature weighting method rather than its apparent compatibility with the learning algorithm that improves classification performance.
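
The idea of scoring features by the weights of a linear model can be illustrated with a small sketch. The code below is not the authors' implementation. It is a minimal pure-Python example that trains a basic Perceptron (one of the linear models named in the abstract, used here in place of a linear SVM for brevity), ranks features by the absolute value of their learned weights, keeps the top k, and retrains on the reduced feature set. The toy data, the function names, and the parameters `epochs`, `lr`, and `k` are all assumptions made for illustration.

```python
# Hypothetical toy illustration: feature selection by linear-classifier weights.
# All names and data here are assumptions, not taken from the paper.

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Train a basic perceptron. X: list of feature vectors, y: labels in {+1, -1}."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score <= 0:  # misclassified or on the boundary: update
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the sign of the linear score."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score > 0 else -1


def select_top_features(w, k):
    """Return the (sorted) indices of the k features with the largest |weight|."""
    ranked = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
    return sorted(ranked[:k])


def project(X, keep):
    """Restrict every feature vector to the selected feature indices."""
    return [[xi[j] for j in keep] for xi in X]


# Toy data: features 0 and 2 separate the classes, feature 1 is constant,
# and feature 3 is unrelated to the label.
X = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 1],
]
y = [1, 1, -1, -1]

# Step 1: train a linear model on all features and score features by |weight|.
w, b = train_perceptron(X, y)
keep = select_top_features(w, k=2)

# Step 2: retrain a classifier (here the same perceptron) on the reduced set.
X_reduced = project(X, keep)
w2, b2 = train_perceptron(X_reduced, y)
predictions = [predict(w2, b2, xi) for xi in X_reduced]
```

In this sketch the second-stage learner is again a Perceptron, but the reduced feature set could be passed to any other classifier, which is the kind of combination of weighting method and learning algorithm that the abstract describes.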

