Predicting rare classes: can boosting make any weak learner strong?
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Edmonton, Alberta, Canada
SESSION: Ensembles and boosting
Pages: 297 - 306
Year of Publication: 2002
ISBN: 1-58113-567-X
ABSTRACT
Boosting is a strong ensemble-based learning algorithm with the promise of iteratively improving classification accuracy using any base learner, as long as it satisfies the condition of yielding weighted accuracy > 0.5. In this paper, we analyze boosting with respect to this basic condition on the base learner, to see if boosting ensures prediction of rarely occurring events with high recall and precision. First, we show that a base learner can satisfy the required condition even for poor recall or precision levels, especially for very rare classes. Furthermore, we show that the intelligent weight updating mechanism in boosting, even in its strong cost-sensitive form, does not prevent cases where the base learner always achieves high precision but poor recall, or high recall but poor precision, when mapped to the original distribution. In either of these cases, we show that the voting mechanism of boosting fails to achieve good overall recall and precision for the ensemble. In effect, our analysis indicates that one cannot be blind to the base learner's performance and simply rely on the boosting mechanism to take care of its weakness. We validate our arguments empirically on a variety of real and synthetic rare class problems. In particular, using AdaCost as the boosting algorithm, and variations of PNrule and RIPPER as the base learners, we show that if algorithm A achieves a better recall-precision balance than algorithm B, then using A as the base learner in AdaCost yields significantly better performance than using B as the base learner.
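The first claim in the abstract can be illustrated with simple arithmetic. The sketch below is not taken from the paper: the data set size, the 1% rare-class rate, the confusion-matrix counts, and the helper name `rare_class_metrics` are all hypothetical, and it covers only the uniform-weight case (as in a first boosting round), where weighted accuracy equals plain accuracy.

```python
def rare_class_metrics(tp, fp, fn, tn):
    """Return (accuracy, recall, precision) for the rare (positive) class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, recall, precision


# Hypothetical data set: 10,000 examples, 100 of them (1%) in the rare class.
# Case 1: the learner finds only 10 of the 100 rare examples (poor recall).
low_recall = rare_class_metrics(tp=10, fp=5, fn=90, tn=9895)
# Case 2: the learner finds 90 rare examples but raises 900 false alarms
# (poor precision).
low_precision = rare_class_metrics(tp=90, fp=900, fn=10, tn=9000)

for name, (acc, rec, prec) in [("low recall", low_recall),
                               ("low precision", low_precision)]:
    # With uniform weights, the boosting condition is simply accuracy > 0.5.
    print(f"{name}: accuracy={acc:.4f} recall={rec:.2f} "
          f"precision={prec:.2f} satisfies_condition={acc > 0.5}")
```

Under these assumed counts, both learners clear the 0.5 threshold by a wide margin even though one has recall of 0.10 and the other has precision of about 0.09, because the abundant negative class dominates the accuracy figure.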
CITED BY 4
Junjie Wu, Hui Xiong, Peng Wu, Jian Chen, Local decomposition for rare class analysis, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, August 12-15, 2007, San Jose, California, USA