Learning to learn with the informative vector machine
Source
ACM International Conference Proceeding Series; Vol. 69
Proceedings of the twenty-first international conference on Machine learning
Banff, Alberta, Canada
Page: 65
Year of Publication: 2004
ISBN: 1-58113-828-5
ABSTRACT
This paper describes an efficient method for learning the parameters of a Gaussian process (GP). The parameters are learned from multiple tasks, which are assumed to have been drawn independently from the same GP prior. An efficient algorithm is obtained by extending the informative vector machine (IVM) algorithm to handle the multi-task learning case. The multi-task IVM (MT-IVM) saves computation by greedily selecting the most informative examples from the separate tasks. The MT-IVM is shown to be more efficient than random sub-sampling on an artificial dataset and more effective than the traditional IVM in a speaker-dependent phoneme recognition task.
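The greedy selection step described in the abstract can be illustrated with a minimal sketch. This is not the paper's MT-IVM implementation: it assumes Gaussian observation noise (a regression setting), a squared-exponential kernel on 1-D inputs, and fixed kernel parameters shared by all tasks. The names `rbf_kernel`, `greedy_multitask_select`, `noise_var`, and `lengthscale` are hypothetical. Under these assumptions, including a point whose current posterior variance is v reduces the posterior entropy by 0.5 log(1 + v / noise_var), so the sketch repeatedly picks, across all tasks, the point with the largest reduction and applies a rank-one update to that task's posterior covariance.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)


def greedy_multitask_select(tasks, budget, noise_var=0.1, lengthscale=1.0):
    """Greedily pick `budget` points across tasks by largest entropy reduction.

    tasks: list of 1-D input arrays, one per task. The tasks share kernel
    parameters but are otherwise independent, so each keeps its own
    posterior covariance. Returns (task_index, point_index) pairs in
    selection order.
    """
    total = sum(len(x) for x in tasks)
    if budget > total:
        raise ValueError("budget exceeds the number of available points")

    covs = [rbf_kernel(x, x, lengthscale) for x in tasks]  # prior covariance per task
    chosen = [set() for _ in tasks]
    order = []

    for _ in range(budget):
        best = None
        for t, cov in enumerate(covs):
            var = np.maximum(np.diag(cov), 0.0)  # guard against round-off
            # Entropy reduction from observing each point under Gaussian noise.
            gain = 0.5 * np.log1p(var / noise_var)
            if chosen[t]:
                gain[list(chosen[t])] = -np.inf  # each point is included at most once
            i = int(np.argmax(gain))
            if best is None or gain[i] > best[0]:
                best = (gain[i], t, i)

        _, t, i = best
        # Rank-one posterior covariance update after including point i of task t.
        col = covs[t][:, i].copy()
        covs[t] -= np.outer(col, col) / (col[i] + noise_var)
        chosen[t].add(i)
        order.append((t, i))

    return order


if __name__ == "__main__":
    tasks = [np.linspace(0.0, 1.0, 5), np.linspace(0.0, 1.0, 5)]
    print(greedy_multitask_select(tasks, budget=4))
```

Because the tasks are treated as independent given the shared kernel, an inclusion only changes the covariance of the task it came from. The selection budget, however, is spent across all tasks jointly, which is the source of the computational saving the abstract refers to.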
CITED BY 11
Jianhui Chen, Lei Tang, Jun Liu, Jieping Ye, A convex formulation for learning shared structures from multiple tasks, Proceedings of the 26th Annual International Conference on Machine Learning, p.137-144, June 14-18, 2009, Montreal, Quebec, Canada
Kai Yu, John Lafferty, Shenghuo Zhu, Yihong Gong, Large-scale collaborative prediction using a nonparametric random effects model, Proceedings of the 26th Annual International Conference on Machine Learning, p.1185-1192, June 14-18, 2009, Montreal, Quebec, Canada