The Bayesian backfitting relevance vector machine
Source: ACM International Conference Proceeding Series; Vol. 69
Proceedings of the twenty-first international conference on Machine learning
Banff, Alberta, Canada
Page: 31  
Year of Publication: 2004
ISBN:1-58113-828-5
Authors
Aaron D'Souza  University of Southern California, Los Angeles, CA
Sethu Vijayakumar  University of Edinburgh, Edinburgh, UK
Stefan Schaal  University of Southern California, Los Angeles, CA
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1015330.1015358
ABSTRACT

Traditional non-parametric statistical learning techniques are often computationally attractive, but lack the generalization and model selection abilities of state-of-the-art Bayesian algorithms, which, however, are usually computationally prohibitive. This paper makes several important contributions that allow Bayesian learning to scale to more complex, real-world learning scenarios. Firstly, we show that backfitting, a traditional non-parametric yet highly efficient regression tool, can be derived in a novel formulation within an expectation-maximization (EM) framework and thus can finally be given a probabilistic interpretation. Secondly, we show that the general framework of sparse Bayesian learning, and in particular the relevance vector machine (RVM), can be derived as a highly efficient algorithm using a Bayesian version of backfitting at its core. As we demonstrate on several regression and classification benchmarks, Bayesian backfitting offers a compelling alternative to current regression methods, especially when the size and dimensionality of the data challenge computational resources.
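
As a rough illustration of the classical backfitting idea mentioned in the abstract (not the paper's EM derivation or its Bayesian variant), the following minimal sketch fits a linear additive model by cycling through the inputs and refitting one coefficient at a time against the partial residual. The function name `backfit_linear`, the synthetic data, and the sweep count are assumptions made for this example only.

```python
import random


def backfit_linear(X, y, n_sweeps=50):
    """Classical backfitting for a linear additive model y ~ sum_j w[j] * x_j.

    Illustrative sketch only: each pass refits a single coefficient by
    univariate least squares on the partial residual.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    # Residual of the current full model; equals y while all weights are zero.
    resid = list(y)
    for _ in range(n_sweeps):
        for j in range(n_features):
            col = [row[j] for row in X]
            # Partial residual: add component j's current contribution back in.
            partial = [r + w[j] * x for r, x in zip(resid, col)]
            # Univariate least-squares fit of the partial residual on column j.
            denom = sum(x * x for x in col)
            w[j] = sum(x * p for x, p in zip(col, partial)) / denom if denom > 0 else 0.0
            # Update the full-model residual with the refitted component.
            resid = [p - w[j] * x for p, x in zip(partial, col)]
    return w


# Hypothetical noise-free data generated from known weights.
random.seed(0)
true_w = [2.0, -1.0, 0.5]
X = [[random.gauss(0.0, 1.0) for _ in true_w] for _ in range(200)]
y = [sum(w * x for w, x in zip(true_w, row)) for row in X]

w_hat = backfit_linear(X, y)
print([round(w, 4) for w in w_hat])
```

Each inner step costs time linear in the number of samples, which is the source of the efficiency the abstract attributes to backfitting; no matrix inversion over all inputs is needed in this sketch.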


