ACM Home Page
Please provide us with feedback. Feedback
How boosting the margin can also boost classifier complexity
Full text PdfPdf (245 KB)
Source ACM International Conference Proceeding Series; Vol. 148 archive
Proceedings of the 23rd international conference on Machine learning table of contents
Pittsburgh, Pennsylvania
Pages: 753 - 760  
Year of Publication: 2006
ISBN:1-59593-383-2
Authors
Lev Reyzin  Yale University, New Haven, CT
Robert E. Schapire  Princeton University, Princeton, NJ
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 4,   Downloads (12 Months): 43,   Citation Count: 4
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1143844.1143939
What is a DOI?

ABSTRACT

Boosting methods are known not to usually overfit training data even as the size of the generated classifiers becomes large. Schapire et al. attempted to explain this phenomenon in terms of the margins the classifier achieves on training examples. Later, however, Breiman cast serious doubt on this explanation by introducing a boosting algorithm, arc-gv, that can generate a higher margins distribution than AdaBoost and yet performs worse. In this paper, we take a close look at Breiman's compelling but puzzling results. Although we can reproduce his main finding, we find that the poorer performance of arc-gv can be explained by the increased complexity of the base classifiers it uses, an explanation supported by our experiments and entirely consistent with the margins theory. Thus, we find maximizing the margins is desirable, but not necessarily at the expense of other factors, especially base-classifier complexity.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
Breiman, L. (1998). Arcing classifiers. The Annals of Statistics, 26, 801--849.
 
3
 
4
Drucker, H., & Cortes, C. (1996). Boosting decision trees. Advances in Neural Information Processing Systems 8 (pp. 479--485).
 
5
 
6
 
7
Koltchinskii, V., & Panchenko, D. (2002). Empirical margin distributions and bounding the generalization error of combined classifiers. The Annals of Statistics, 30.
 
8
Mason, L., Bartlett, P. L., & Golea, M. (2002). Generalization error of combined classifiers. Journal of Computer and System Sciences, 65, 415--438.
 
9
 
10
Quinlan, J. R. (1996). Bagging, boosting, and C4.5. Proceedings of the Thirteenth National Conference on Artificial Intelligence (pp. 725--730).
 
11
 
12
Rudin, C., Schapire, R. E., & Daubechies, T. (2004). Boosting based on a smooth margin. 17th Annual Conference on Learning Theory (pp. 502--517).
 
13
Schapire, R. E. (2002). The boosting approach to machine learning: An overview. Nonlinear Estimation and Classification. Springer.
 
14
Schapire, R. E., Freund, Y., Bartlett, P., & Lee, W. S. (1998). Boosting the margin: A new explanation for the effectiveness of voting methods. The Annals of Statistics, 26, 1651--1686.
 
15


Collaborative Colleagues:
Lev Reyzin: colleagues
Robert E. Schapire: colleagues