| Surrogate maximization/minimization algorithms for AdaBoost and the logistic regression model |
| Full text |
Pdf
(192 KB)
|
| Source
|
ACM International Conference Proceeding Series; Vol. 69
archive
Proceedings of the twenty-first international conference on Machine learning
table of contents
Banff, Alberta, Canada
Page: 117
Year of Publication: 2004
ISBN:1-58113-828-5
|
|
Authors
|
|
Zhihua Zhang
|
Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
|
|
James T. Kwok
|
Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
|
|
Dit-Yan Yeung
|
Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
|
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 9, Downloads (12 Months): 30, Citation Count: 0
|
|
|
ABSTRACT
Surrogate maximization (or minimization) (SM) algorithms are a family of algorithm that can be regarded as a generalization of expectation-maximization (EM) algorithms. There are three major approaches to the construction of surrogate function, all relying on the convexity of some function. In this paper, we solve the boosting problem by proposing SM algorithms for the corresponding optimization problem. Specifically, for AdaBoost, we derive an SM algorithm that can be shown to be identical to the algorithm proposed by Collins et al. (2002) based on Bregman distance. More importantly, for LogitBoost (or logistic boosting), we use several methods to construct different surrogate functions which result in different SM algorithms. By combining multiple methods, we are able to derive an SM algorithm that is also the same as an algorithm derived by Collins et al. (2002). Our approach based on SM algorithms is much simpler and convergence results follow naturally.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Becker, M. P., Yang, I., & Lange, K. (1997). EM algorithms without missing data. Statistical Methods in Medical Research, 6, 38--54.
|
| |
2
|
Böhning, D., & Lindsay, B. G. (1988). Monotonicity of quadratic-approximation algorithms. Annals of the Institute of Statistical Mathematics, 40, 641--663.
|
| |
3
|
Bühlmann, P., & Yu, B. (2003). Boosting with the L2-loss: Regression and classification. Journal of the American Statistical Association, 98, 324--339.
|
| |
4
|
|
| |
5
|
Darroch, J. N., & Ratcliff, D. (1972). Generalized iterative scaling for log-linear models. The Annals of Mathematical Statistics, 43, 1470--1480.
|
| |
6
|
|
| |
7
|
Della Pietra, S., Della Pietra, V., & Lafferty, J. (2001). Duality and auxiliary functions for Bregman distances (Technical Report CMU-CS-01-109). School of Computer Science, Carnegie-Mellon University.
|
| |
8
|
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society Series B, 39, 1--38.
|
| |
9
|
Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine. Annals of Statistics, 29, 1189--1232.
|
| |
10
|
Friedman, J. H., Hastie, T., & Tibshirani, R. (2000). Additive logistic regression: A statistical view of boosting. Annals of Statistics, 28, 337--374.
|
| |
11
|
Jaakkola, T., & Jordan, M. (1997). A variational approach to Bayesian logistic regression models and their extensions. The Sixth International Workshop on Artificial Intelligence and Statistics.
|
 |
12
|
|
 |
13
|
John Lafferty, Additive models, boosting, and inference for generalized divergences, Proceedings of the twelfth annual conference on Computational learning theory, p.125-133, July 07-09, 1999, Santa Cruz, California, United States
[doi> 10.1145/307400.307422]
|
| |
14
|
Lange, K. (1995). A gradient algorithm locally equivalent to the EM algorithm. Journal of the Royal Statistical Society, Series B, 57, 425--437.
|
| |
15
|
Lange, K., Hunter, D. R., & Yang, I. (2000). Optimization transfer using surrogate objective functions with discussion. Journal of Computational and Graphical Statistics, 9, 1--59.
|
| |
16
|
Meng, X.-L. (2000). Discussion on "Optimization transfer using surrogate objective functions". Journal of Computational and Graphical Statistics, 9, 35--43.
|
| |
17
|
Rockafellar, T. (1970). Convex analysis. Princeton, New Jersey: Princeton University Press.
|
|