More efficiency in multiple kernel learning
Source
ICML; Vol. 227
Proceedings of the 24th international conference on Machine learning
Corvallis, Oregon
Pages: 775 - 782
Year of Publication: 2007
ISBN: 978-1-59593-793-3
Bibliometrics
Downloads (6 Weeks): 13, Downloads (12 Months): 109, Citation Count: 8
ABSTRACT
An efficient and general multiple kernel learning (MKL) algorithm was recently proposed by Sonnenburg et al. (2006). This approach has opened new perspectives, since it makes MKL tractable for large-scale problems by iteratively calling existing support vector machine code. However, this iterative algorithm needs several iterations before converging towards a reasonable solution. In this paper, we address the MKL problem through an adaptive 2-norm regularization formulation. Weights on each kernel matrix are included in the standard SVM empirical risk minimization problem, with an ℓ1 constraint to encourage sparsity. We propose an algorithm for solving this problem and provide new insight into MKL algorithms based on block 1-norm regularization by showing that the two approaches are equivalent. Experimental results show that the resulting algorithm converges rapidly and that its efficiency compares favorably to other MKL algorithms.
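The idea of placing a weight on each kernel matrix under an ℓ1 constraint can be illustrated with a minimal sketch. It only shows how nonnegative weights that sum to one combine several Gram matrices into a single one, and how a zero weight drops a kernel entirely. It is not the algorithm proposed in the paper: the function names, the toy inputs, and the choice of kernels are all assumptions made for illustration.

```python
import math


def linear_gram(xs):
    """Gram matrix of the linear kernel k(a, b) = a * b on scalar inputs."""
    return [[a * b for b in xs] for a in xs]


def gaussian_gram(xs, gamma):
    """Gram matrix of the Gaussian kernel exp(-gamma * (a - b)^2)."""
    return [[math.exp(-gamma * (a - b) ** 2) for b in xs] for a in xs]


def combine_kernels(grams, weights, tol=1e-9):
    """Return sum_m weights[m] * grams[m].

    The weights must be nonnegative and sum to one (an l1 constraint on
    nonnegative weights), so some of them may be exactly zero.
    """
    if len(grams) != len(weights):
        raise ValueError("one weight per kernel matrix is required")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must be nonnegative and sum to one")
    n = len(grams[0])
    return [
        [sum(w * g[i][j] for w, g in zip(weights, grams)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: three scalar points and three candidate kernels.
xs = [0.0, 1.0, 2.0]
grams = [linear_gram(xs), gaussian_gram(xs, 0.5), gaussian_gram(xs, 2.0)]

# A sparse weight vector: the third kernel is switched off.
weights = [0.5, 0.5, 0.0]
K = combine_kernels(grams, weights)
```

The combined matrix `K` could then be handed to any SVM solver that accepts a precomputed Gram matrix. Learning the weights themselves is the subject of the paper and is deliberately left out here.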
REFERENCES
1. Argyriou, A., Evgeniou, T., & Pontil, M. (2007). Convex multi-task feature learning (Technical Report).
2. Bach, F. R., Lanckriet, G. R. G., & Jordan, M. I. (2004). Multiple kernel learning, conic duality, and the SMO algorithm. Proceedings of the Twenty-First International Conference on Machine Learning (p. 6). Banff, Alberta, Canada. doi:10.1145/1015330.1015424
7. Grandvalet, Y. (1998). Least absolute shrinkage is equivalent to quadratic penalization. ICANN'98 (pp. 201-206). Springer.
8. Grandvalet, Y., & Canu, S. (2003). Adaptive scaling for feature selection in SVMs. Advances in Neural Information Processing Systems. MIT Press.
13. Schölkopf, B., & Smola, A. (2001). Learning with kernels. MIT Press.
15. Wahba, G. (1990). Spline models for observational data. Series in Applied Mathematics, Vol. 59, SIAM.
CITED BY 8
Gustavo Camps-Valls, Jordi Muñoz-Marí, Manel Martínez-Ramón, Jesús Requena-Carrión, José Luis Rojo-Álvarez, Letters: Learning non-linear time-scales with kernel γ-filters, Neurocomputing, v.72 n.4-6, p.1324-1328, January, 2009
Zenglin Xu, Rong Jin, Jieping Ye, Michael R. Lyu, Irwin King, Non-monotonic feature selection, Proceedings of the 26th Annual International Conference on Machine Learning, p.1145-1152, June 14-18, 2009, Montreal, Quebec, Canada
Matthieu Kowalski, Marie Szafranski, Liva Ralaivola, Multiple indefinite kernel learning with mixed norm regularization, Proceedings of the 26th Annual International Conference on Machine Learning, p.545-552, June 14-18, 2009, Montreal, Quebec, Canada