ACM Home Page
Please provide us with feedback. Feedback
Surrogate maximization/minimization algorithms for AdaBoost and the logistic regression model
Full text PdfPdf (192 KB)
Source ACM International Conference Proceeding Series; Vol. 69 archive
Proceedings of the twenty-first international conference on Machine learning table of contents
Banff, Alberta, Canada
Page: 117  
Year of Publication: 2004
ISBN:1-58113-828-5
Authors
Zhihua Zhang  Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
James T. Kwok  Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
Dit-Yan Yeung  Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 9,   Downloads (12 Months): 30,   Citation Count: 0
Additional Information:

abstract   references   collaborative colleagues  

Tools and Actions: Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1015330.1015342
What is a DOI?

ABSTRACT

Surrogate maximization (or minimization) (SM) algorithms are a family of algorithm that can be regarded as a generalization of expectation-maximization (EM) algorithms. There are three major approaches to the construction of surrogate function, all relying on the convexity of some function. In this paper, we solve the boosting problem by proposing SM algorithms for the corresponding optimization problem. Specifically, for AdaBoost, we derive an SM algorithm that can be shown to be identical to the algorithm proposed by Collins et al. (2002) based on Bregman distance. More importantly, for LogitBoost (or logistic boosting), we use several methods to construct different surrogate functions which result in different SM algorithms. By combining multiple methods, we are able to derive an SM algorithm that is also the same as an algorithm derived by Collins et al. (2002). Our approach based on SM algorithms is much simpler and convergence results follow naturally.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
Becker, M. P., Yang, I., & Lange, K. (1997). EM algorithms without missing data. Statistical Methods in Medical Research, 6, 38--54.
 
2
Böhning, D., & Lindsay, B. G. (1988). Monotonicity of quadratic-approximation algorithms. Annals of the Institute of Statistical Mathematics, 40, 641--663.
 
3
Bühlmann, P., & Yu, B. (2003). Boosting with the L2-loss: Regression and classification. Journal of the American Statistical Association, 98, 324--339.
 
4
 
5
Darroch, J. N., & Ratcliff, D. (1972). Generalized iterative scaling for log-linear models. The Annals of Mathematical Statistics, 43, 1470--1480.
 
6
 
7
Della Pietra, S., Della Pietra, V., & Lafferty, J. (2001). Duality and auxiliary functions for Bregman distances (Technical Report CMU-CS-01-109). School of Computer Science, Carnegie-Mellon University.
 
8
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society Series B, 39, 1--38.
 
9
Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine. Annals of Statistics, 29, 1189--1232.
 
10
Friedman, J. H., Hastie, T., & Tibshirani, R. (2000). Additive logistic regression: A statistical view of boosting. Annals of Statistics, 28, 337--374.
 
11
Jaakkola, T., & Jordan, M. (1997). A variational approach to Bayesian logistic regression models and their extensions. The Sixth International Workshop on Artificial Intelligence and Statistics.
12
13
 
14
Lange, K. (1995). A gradient algorithm locally equivalent to the EM algorithm. Journal of the Royal Statistical Society, Series B, 57, 425--437.
 
15
Lange, K., Hunter, D. R., & Yang, I. (2000). Optimization transfer using surrogate objective functions with discussion. Journal of Computational and Graphical Statistics, 9, 1--59.
 
16
Meng, X.-L. (2000). Discussion on "Optimization transfer using surrogate objective functions". Journal of Computational and Graphical Statistics, 9, 35--43.
 
17
Rockafellar, T. (1970). Convex analysis. Princeton, New Jersey: Princeton University Press.
Collaborative Colleagues:
Zhihua Zhang: colleagues
James T. Kwok: colleagues
Dit-Yan Yeung: colleagues