Bayesian regression with input noise for high dimensional data
Source
Proceedings of the 23rd international conference on Machine learning
ACM International Conference Proceeding Series; Vol. 148
Pittsburgh, Pennsylvania
Pages: 937-944
Year of Publication: 2006
ISBN: 1-59593-383-2
Authors
Jo-Anne Ting, University of Southern California, Los Angeles, CA
Aaron D'Souza, Google, Inc., Mountain View, CA
Stefan Schaal, University of Southern California, Los Angeles, CA and ATR Computational Neuroscience Laboratories, Kyoto, Japan
ABSTRACT
This paper examines high-dimensional regression in which both input and output data are contaminated by noise. Goals of such learning problems include optimal prediction at noiseless query points and optimal system identification. As a first step, we focus on linear regression methods, since these can be easily extended to nonlinear learning problems through locally weighted learning approaches. Standard linear regression algorithms produce biased regression estimates when input noise is present and suffer numerically when the data contain redundant or irrelevant dimensions. Inspired by Factor Analysis Regression, we develop a variational Bayesian algorithm that is robust to ill-conditioned data, automatically detects relevant features, and identifies input and output noise, all in a computationally efficient way. We demonstrate the effectiveness of our techniques on synthetic data and on a system identification task for a rigid body dynamics model of a robotic vision head. Our algorithm performs 10 to 70% better than previously suggested methods.
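The abstract's claim that standard linear regression is biased under input noise can be illustrated with a small simulation. The sketch below is not the paper's variational Bayesian algorithm; it only shows the bias of ordinary least squares on a one-dimensional toy problem. The function name, sample size, slope, and noise levels are all illustrative assumptions. For zero-mean inputs with variance s_t^2 and independent input noise with variance s_e^2, the least-squares slope is expected to shrink by roughly the factor s_t^2 / (s_t^2 + s_e^2).

```python
import numpy as np


def ols_slope_with_input_noise(n=20000, true_slope=2.0,
                               input_noise_std=1.0, seed=0):
    """Fit a one-dimensional least-squares slope on noise-contaminated inputs.

    All names and default values are hypothetical, chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    t = rng.normal(0.0, 1.0, size=n)                     # noiseless inputs, variance 1
    x = t + rng.normal(0.0, input_noise_std, size=n)     # observed, noisy inputs
    y = true_slope * t + rng.normal(0.0, 0.1, size=n)    # observed, noisy outputs
    # Ordinary least squares through the origin: slope = (x . y) / (x . x)
    return float(x @ y / (x @ x))


clean_slope = ols_slope_with_input_noise(input_noise_std=0.0)
noisy_slope = ols_slope_with_input_noise(input_noise_std=1.0)
print(f"slope estimated from noiseless inputs: {clean_slope:.3f}")
print(f"slope estimated from noisy inputs:     {noisy_slope:.3f}")
```

With unit input variance and unit input-noise variance, the shrink factor is about 1/2, so the second estimate should land near half the true slope while the first stays close to it. Collecting more data does not remove this bias, which is why methods that model input noise explicitly are of interest.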