| A quasi-Newton approach to non-smooth convex optimization |
| Full text |
Pdf
(632 KB)
|
| Source
|
ICML; Vol. 307
archive
Proceedings of the 25th international conference on Machine learning
table of contents
Helsinki, Finland
Pages: 1216-1223
Year of Publication: 2008
ISBN:978-1-60558-205-4
|
|
Authors
|
|
Jin Yu
|
NICTA, Canberra, Australia and Australian National University, Canberra, Australia
|
|
S. V. N. Vishwanathan
|
NICTA, Canberra, Australia and Australian National University, Canberra, Australia
|
|
Simon Günter
|
NICTA, Canberra, Australia and Australian National University, Canberra, Australia
|
|
Nicol N. Schraudolph
|
NICTA, Canberra, Australia and Australian National University, Canberra, Australia
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 7, Downloads (12 Months): 61, Citation Count: 1
|
|
|
ABSTRACT
We extend the well-known BFGS quasi-Newton method and its limited-memory variant LBFGS to the optimization of non-smooth convex objectives. This is done in a rigorous fashion by generalizing three components of BFGS to subdifferentials: The local quadratic model, the identification of a descent direction, and the Wolfe line search conditions. We apply the resulting subLBFGS algorithm to L2-regularized risk minimization with binary hinge loss, and its direction-finding component to L1-regularized risk minimization with logistic loss. In both settings our generic algorithms perform comparable to or better than their counterparts in specialized state-of-the-art solvers.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
| |
2
|
A. Belloni. Introduction to bundle methods. Technical report, Operation Research Center, M.I.T., 2005.
|
| |
3
|
M. Haarala. Large-Scale Non-smooth Optimization. PhD thesis, University of Jyväskylä, 2004.
|
| |
4
|
J. Hiriart-Urruty and C. Lemaréchal. Convex Analysis and Minimization Algorithms, I and II, volume 305 and 306. Springer-Verlag, 1993.
|
 |
5
|
|
| |
6
|
|
| |
7
|
A. Nedich and D. P. Bertsekas. Convergence rate of incremental subgradient algorithms. In S. Uryasev and P. M. Pardalos, editors, Stochastic Optimization: Algorithms and Applications, pages 263--304. Kluwer Academic Publishers, 2000.
|
| |
8
|
J. Nocedal and S. J. Wright. Numerical Optimization. Springer Series in Operations Research. Springer, 1999.
|
 |
9
|
Choon Hui Teo , Alex Smola , S. V.N. Vishwanathan , Quoc Viet Le, A scalable modular convex solver for regularized risk minimization, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, August 12-15, 2007, San Jose, California, USA
[doi> 10.1145/1281192.1281270]
|
| |
10
|
F. Vojtěch and S. Sonnenburg. Optimized cutting plane algorithm for support vector machines. Technical Report 1, Fraunhofer Institute FIRST, December 2007. http://publica.fraunhofer.de/documents/N-66225.html.
|
| |
11
|
J. Yu, S. V. N. Vishwanathan, S. Günter, and N. N. Schraudolph. A quasi-Newton approach to non-smooth convex optimization. Technical Report arXiv:0804.3835, April 2008. http://arxiv.org/pdf/0804.3835.
|
CITED BY
|
|
Honglak Lee , Rajat Raina , Alex Teichman , Andrew Y. Ng, Exponential family sparse coding with applications to self-taught learning, Proceedings of the 21st international jont conference on Artifical intelligence, p.1113-1119, July 11-17, 2009, Pasadena, California, USA
|
|