ABSTRACT
Most existing approaches to collaborative filtering cannot handle very large data sets. In this paper we show how a class of two-layer undirected graphical models, called Restricted Boltzmann Machines (RBMs), can be used to model tabular data, such as users' ratings of movies. We present efficient learning and inference procedures for this class of models and demonstrate that RBMs can be successfully applied to the Netflix data set, containing over 100 million user/movie ratings. We also show that RBMs slightly outperform carefully tuned SVD models. When the predictions of multiple RBM models and multiple SVD models are linearly combined, we achieve an error rate that is well over 6% better than the score of Netflix's own system.