Multiple kernel learning, conic duality, and the SMO algorithm
Source: ACM International Conference Proceeding Series; Vol. 69
Proceedings of the twenty-first international conference on Machine learning
Banff, Alberta, Canada
Page: 6
Year of Publication: 2004
ISBN: 1-58113-828-5
Authors
Francis R. Bach  University of California, Berkeley, CA
Gert R. G. Lanckriet  University of California, Berkeley, CA
Michael I. Jordan  University of California, Berkeley, CA
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1015330.1015424

ABSTRACT

While classical kernel-based classifiers are based on a single kernel, in practice it is often desirable to base classifiers on combinations of multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for the support vector machine (SVM), and showed that the optimization of the coefficients of such a combination reduces to a convex optimization problem known as a quadratically-constrained quadratic program (QCQP). Unfortunately, current convex optimization toolboxes can solve this problem only for a small number of kernels and a small number of data points; moreover, the sequential minimal optimization (SMO) techniques that are essential in large-scale implementations of the SVM cannot be applied because the cost function is non-differentiable. We propose a novel dual formulation of the QCQP as a second-order cone programming problem, and show how to exploit the technique of Moreau-Yosida regularization to yield a formulation to which SMO techniques can be applied. We present experimental results that show that our SMO-based algorithm is significantly more efficient than the general-purpose interior point methods available in current optimization toolboxes.
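
Two ideas named in the abstract can be illustrated on a toy scale. The sketch below is not the paper's algorithm or objective: it uses f(y) = |y| as a stand-in for a non-differentiable cost and shows that its Moreau-Yosida envelope is differentiable everywhere, and it forms a conic (nonnegative-weighted) combination of two small kernel matrices. All function names, the parameter lam, and the example matrices are assumptions chosen for illustration.

```python
# Illustrative sketch only; names and values are hypothetical, and |y| is a
# stand-in for a non-differentiable cost, not the objective used in the paper.


def soft_threshold(x, lam):
    """Proximal point of |.|: argmin_y |y| + (x - y)**2 / (2 * lam)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def moreau_envelope_abs(x, lam):
    """Moreau-Yosida envelope of |.| with parameter lam > 0.

    Equals x**2 / (2 * lam) for |x| <= lam and |x| - lam / 2 otherwise.
    """
    y = soft_threshold(x, lam)
    return abs(y) + (x - y) ** 2 / (2.0 * lam)


def envelope_gradient(x, lam):
    """Gradient of the envelope, (x - prox(x)) / lam, defined for every x."""
    return (x - soft_threshold(x, lam)) / lam


def conic_combination(kernels, weights):
    """Return sum_j weights[j] * kernels[j] for nonnegative weights."""
    if any(w < 0 for w in weights):
        raise ValueError("a conic combination requires nonnegative weights")
    n = len(kernels[0])
    return [
        [sum(w * K[i][j] for K, w in zip(kernels, weights)) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    lam = 0.5
    for x in (-2.0, -0.25, 0.0, 0.25, 2.0):
        print(x, moreau_envelope_abs(x, lam), envelope_gradient(x, lam))

    K1 = [[1.0, 0.0], [0.0, 1.0]]  # toy kernel matrix (identity)
    K2 = [[1.0, 1.0], [1.0, 1.0]]  # toy kernel matrix (all ones)
    print(conic_combination([K1, K2], [0.5, 2.0]))
```

At x = 0 the absolute value has a kink, whereas the envelope has gradient 0 there; a smoothing of this kind is what lets gradient-based coordinate updates be defined. How the paper applies the regularization to its second-order cone formulation is described in the full text, not here.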


REFERENCES


 
1. Andersen, E. D., & Andersen, K. D. (2000). The MOSEK interior point optimizer for linear programming: an implementation of the homogeneous algorithm. High Perf. Optimization (pp. 197–232).
2. Bertsekas, D. (1995). Nonlinear programming. Nashua, NH: Athena Scientific.
4. Brent, R. P. (1973). Algorithms for minimization without derivatives. Englewood Cliffs, NJ: Prentice-Hall.
6. Grandvalet, Y., & Canu, S. (2003). Adaptive scaling for feature selection in SVMs. Neural Information Processing Systems. Cambridge, MA: MIT Press.
11. Lobo, M. S., Vandenberghe, L., Boyd, S., & Lebret, H. (1998). Applications of second-order cone programming. Lin. Alg. and its Applications, 284, 193–228.
12. Ong, S., Smola, A. J., & Williamson, R. C. (2003). Hyperkernels. Neural Information Processing Systems. Cambridge, MA: MIT Press.
