Local sparsity control for naive Bayes with extreme misclassification costs
Source International Conference on Knowledge Discovery and Data Mining archive
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining table of contents
Chicago, Illinois, USA
SESSION: Research track paper table of contents
Pages: 128 - 137  
Year of Publication: 2005
ISBN:1-59593-135-X
Author
Aleksander Kolcz  AOL, Inc., Dulles, VA
Sponsors
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1081870.1081888
ABSTRACT

In applications of data mining characterized by highly skewed misclassification costs, certain types of errors become virtually unacceptable. This limits the utility of a classifier to a range in which such constraints can be met. Naive Bayes, which has proven to be very useful in text mining applications because of its high scalability, can be particularly affected: although its 0/1 loss tends to be small, its misclassifications are often made with apparently high confidence. Aside from efforts to better calibrate Naive Bayes scores, it has been shown that its accuracy depends on document sparsity and that feature selection can lead to marked improvement in classification performance. Traditionally, sparsity is controlled globally, so the resulting sparsity of any particular document may vary. In this work we examine the merits of local sparsity control for Naive Bayes in the context of highly asymmetric misclassification costs. In experiments with three benchmark document collections we demonstrate clear advantages of document-level feature selection. In the extreme cost setting, multinomial Naive Bayes with local sparsity control is able to outperform even some of the recently proposed effective improvements to the Naive Bayes classifier. There are also indications that local feature selection may be preferable in other cost settings.
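
The abstract does not spell out the selection procedure, so the following is only a minimal sketch of what document-level (local) sparsity control for a two-class multinomial Naive Bayes scorer might look like. It assumes that each document keeps only its k tokens with the largest absolute log-odds weight; that ranking criterion, the function names (train_multinomial_nb, score, cost_threshold), and the parameters alpha and k are illustrative assumptions and are not taken from the paper. The decision threshold uses the standard cost-sensitive rule of predicting positive when the posterior exceeds cost_fp / (cost_fp + cost_fn), assuming correct decisions cost nothing.

```python
import math
from collections import Counter


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit per-token log-odds weights for a two-class multinomial Naive Bayes.

    docs: list of token lists; labels: list of 0/1 (both classes must occur).
    alpha is a Laplace smoothing constant (an assumption of this sketch).
    """
    vocab = {t for d in docs for t in d}
    counts = {0: Counter(), 1: Counter()}
    for d, y in zip(docs, labels):
        counts[y].update(d)
    totals = {y: sum(counts[y].values()) for y in (0, 1)}
    v = len(vocab)
    weights = {}
    for t in vocab:
        p1 = (counts[1][t] + alpha) / (totals[1] + alpha * v)
        p0 = (counts[0][t] + alpha) / (totals[0] + alpha * v)
        weights[t] = math.log(p1 / p0)
    n1 = sum(labels)
    n0 = len(labels) - n1
    prior = math.log(n1 / n0)
    return weights, prior


def score(doc, weights, prior, k=None):
    """Log-odds score of one document.

    With k set, only the k in-vocabulary tokens of THIS document with the
    largest |weight| contribute (hypothetical local selection criterion).
    With k=None, every in-vocabulary token contributes.
    """
    tf = Counter(t for t in doc if t in weights)
    terms = list(tf)
    if k is not None:
        terms = sorted(terms, key=lambda t: abs(weights[t]), reverse=True)[:k]
    return prior + sum(tf[t] * weights[t] for t in terms)


def cost_threshold(cost_fp, cost_fn):
    """Log-odds threshold equivalent to posterior >= cost_fp / (cost_fp + cost_fn)."""
    return math.log(cost_fp / cost_fn)


docs = [["cheap", "pills", "now"], ["cheap", "offer"],
        ["meeting", "now"], ["project", "meeting", "notes"]]
labels = [1, 1, 0, 0]
weights, prior = train_multinomial_nb(docs, labels)

s_local = score(["cheap", "pills", "now"], weights, prior, k=1)
s_full = score(["cheap", "pills", "now"], weights, prior)
flag = s_local >= cost_threshold(cost_fp=100.0, cost_fn=1.0)
```

With a false positive assumed to be 100 times as costly as a false negative, the threshold is a log-odds of log(100), so a document is flagged only when the retained evidence is very strong. Capping the number of contributing tokens per document is one way to keep long documents from accumulating extreme scores, which is the kind of overconfidence the abstract describes.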

