Nonstationary kernel combination
Source: ACM International Conference Proceeding Series; Vol. 148
Proceedings of the 23rd International Conference on Machine Learning
Pittsburgh, Pennsylvania
Pages: 553 - 560  
Year of Publication: 2006
ISBN:1-59593-383-2
Authors
Darrin P. Lewis  Columbia University, New York, NY
Tony Jebara  Columbia University, New York, NY
William Stafford Noble  University of Washington, Seattle, WA
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1143844.1143914

ABSTRACT

The power and popularity of kernel methods stem in part from their ability to handle diverse forms of structured inputs, including vectors, graphs and strings. Recently, several methods have been proposed for combining kernels from heterogeneous data sources. However, all of these methods produce stationary combinations; i.e., the relative weights of the various kernels do not vary among input examples. This article proposes a method for combining multiple kernels in a nonstationary fashion. The approach uses a large-margin latent-variable generative model within the maximum entropy discrimination (MED) framework. Latent parameter estimation is rendered tractable by variational bounds and an iterative optimization procedure. The classifier we use is a log-ratio of Gaussian mixtures, in which each component is implicitly mapped via a Mercer kernel function. We show that the support vector machine is a special case of this model. In this approach, discriminative parameter estimation is feasible via a fast sequential minimal optimization algorithm. Empirical results are presented on synthetic data, several benchmarks, and on a protein function annotation task.
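
To make the distinction between stationary and nonstationary combinations concrete, the following is a minimal sketch and not the method of the article. It does not implement the MED framework, the latent-variable model, or the variational bounds described above. It assumes a simple gating scheme in which each kernel's weight depends on the inputs through a softmax; the gating function, its parameters `V`, and all function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def linear_kernel(X, Y):
    # Plain inner-product kernel.
    return X @ Y.T

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian (RBF) kernel from pairwise squared distances.
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def stationary_combination(kernels, weights):
    # One fixed weight per kernel, shared by every pair of examples.
    return sum(w * K for w, K in zip(weights, kernels))

def softmax_gates(X, V):
    # Hypothetical input-dependent weights: one row per example,
    # one column per kernel, each row summing to 1.
    scores = X @ V
    scores = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)

def nonstationary_combination(kernels, gates_x, gates_y):
    # K(x, y) = sum_m g_m(x) * K_m(x, y) * g_m(y).
    # Each term is a diagonal rescaling of a PSD matrix, so the
    # sum stays positive semidefinite when gates_x == gates_y.
    return sum(np.outer(gates_x[:, m], gates_y[:, m]) * K
               for m, K in enumerate(kernels))

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
V = rng.normal(size=(2, 2))  # assumed gating parameters, not learned here

kernels = [linear_kernel(X, X), rbf_kernel(X, X)]
K_stat = stationary_combination(kernels, [0.5, 0.5])

G = softmax_gates(X, V)
K_nonstat = nonstationary_combination(kernels, G, G)
```

In this sketch the relative weight of each kernel varies from example to example through `G`, whereas `K_stat` uses the same weights everywhere. Setting every gate to the constant `sqrt(w_m)` recovers the stationary combination as a special case.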


