Bayesian inference for transductive learning of kernel matrix using the Tanner-Wong data augmentation algorithm
Source: ACM International Conference Proceeding Series; Vol. 69
Proceedings of the twenty-first international conference on Machine learning
Banff, Alberta, Canada
Page: 118  
Year of Publication: 2004
ISBN: 1-58113-828-5
Authors
Zhihua Zhang  Hong Kong University of Science and Technology, Kowloon, Hong Kong
Dit-Yan Yeung  Hong Kong University of Science and Technology, Kowloon, Hong Kong
James T. Kwok  Hong Kong University of Science and Technology, Kowloon, Hong Kong
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1015330.1015368

ABSTRACT

In kernel methods, an interesting recent development seeks to learn a good kernel from empirical data automatically. In this paper, by regarding the transductive learning of the kernel matrix as a missing data problem, we propose a Bayesian hierarchical model for the problem and devise the Tanner-Wong data augmentation algorithm for making inference on the model. The Tanner-Wong algorithm is closely related to Gibbs sampling, and it also bears a strong resemblance to the expectation-maximization (EM) algorithm. For an efficient implementation, we propose a simplified Bayesian hierarchical model and the corresponding Tanner-Wong algorithm. We express the relationship between the kernel on the input space and the kernel on the output space as a symmetric-definite generalized eigenproblem. Based on this eigenproblem, an efficient approach to choosing the base kernel matrices is presented. The effectiveness of our Bayesian model with the Tanner-Wong algorithm is demonstrated through some classification experiments showing promising results.
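
To make the data augmentation idea concrete, the following is a minimal sketch of the Tanner-Wong scheme on a toy problem. It is not the kernel-matrix model of this paper. It assumes the classic genetic-linkage counts (125, 18, 20, 34) with cell probabilities (1/2 + θ/4, (1 − θ)/4, (1 − θ)/4, θ/4) and a flat prior on θ; the names `tanner_wong` and `posterior_mean` are hypothetical. The first cell is split by a latent count z, so that θ given z follows a Beta distribution. Each iteration imputes latent counts from the current posterior approximation and then rebuilds that approximation as a mixture of Beta densities. Alternating between draws of θ and z is what relates the algorithm to Gibbs sampling, and the impute-then-update pattern is what resembles the E and M steps of EM.

```python
import random

# Observed counts for the classic genetic-linkage example (assumed toy data,
# not taken from this paper): 197 animals in four cells.
Y = (125, 18, 20, 34)


def tanner_wong(y, m=200, iterations=30, seed=0):
    """Return m latent counts z; the posterior approximation for theta is the
    equal-weight mixture of Beta(z + y4 + 1, y2 + y3 + 1) over these z."""
    rng = random.Random(seed)
    y1, y2, y3, y4 = y
    # Crude starting approximation: every mixture component uses z = y1 // 2.
    zs = [y1 // 2] * m
    for _ in range(iterations):
        new_zs = []
        for _ in range(m):
            # Imputation step, part 1: draw theta from the current mixture by
            # picking one component at random and sampling its Beta density.
            z = rng.choice(zs)
            theta = rng.betavariate(z + y4 + 1, y2 + y3 + 1)
            # Imputation step, part 2: draw the latent split of cell 1 given
            # theta, i.e. z ~ Binomial(y1, theta / (2 + theta)).
            p = theta / (2.0 + theta)
            new_zs.append(sum(rng.random() < p for _ in range(y1)))
        # Posterior step: the imputed z values define the new mixture.
        zs = new_zs
    return zs


def posterior_mean(zs, y):
    """Mean of the Beta mixture; a Beta(a, b) component has mean a / (a + b)."""
    _, y2, y3, y4 = y
    return sum((z + y4 + 1) / (z + y4 + y2 + y3 + 2) for z in zs) / len(zs)


if __name__ == "__main__":
    latent = tanner_wong(Y)
    print("approximate posterior mean of theta:", posterior_mean(latent, Y))
```

With a single imputation per iteration (`m=1`) the loop reduces to a chain that alternates draws of θ and z, which is one way to see the connection to Gibbs sampling mentioned in the abstract.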

