Mind the gaps: weighting the unknown in large-scale one-class collaborative filtering
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Paris, France
SESSION: Research track papers
Pages 667-676
Year of Publication: 2009
ISBN: 978-1-60558-495-9
ABSTRACT
One-Class Collaborative Filtering (OCCF) is a task that naturally emerges in recommender system settings. Its typical characteristics are that only positive examples can be observed, classes are highly imbalanced, and the vast majority of data points are missing. Introducing weights for the missing parts of a matrix has recently been shown to help in OCCF. While existing weighting approaches mitigate the first two problems, a sparsity-preserving solution that makes efficient use of data sets with, e.g., hundreds of thousands of users and items has not yet been reported. In this paper, we study three collaborative filtering frameworks: low-rank matrix approximation, probabilistic latent semantic analysis, and maximum-margin matrix factorization. We propose two novel algorithms for large-scale OCCF that allow the unknowns to be weighted. Our experimental results demonstrate their effectiveness and efficiency on different problems, including the Netflix Prize data.
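To make the idea of weighting the unknowns concrete, the following is a minimal sketch of weighted low-rank approximation fitted by alternating least squares. It is not one of the two algorithms proposed in the paper: it works on dense matrices and therefore does not preserve sparsity, and the function names, the uniform weight of 0.2 for missing entries, the rank, and the regularization strength are all assumptions chosen for illustration.

```python
import numpy as np

def weighted_loss(R, W, U, V, reg=0.1):
    """Weighted squared error plus L2 penalty on both factor matrices."""
    err = R - U @ V.T
    return float(np.sum(W * err ** 2) + reg * (np.sum(U ** 2) + np.sum(V ** 2)))

def weighted_als(R, W, rank=2, reg=0.1, iters=20, seed=0):
    """Fit R ~ U @ V.T by alternating weighted ridge regressions.

    R: users x items matrix, 1 for observed positives, 0 for unknowns.
    W: same shape, confidence weight for each entry.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, rank))
    V = rng.normal(scale=0.1, size=(n_items, rank))
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        # Update each user vector with the item factors held fixed.
        for i in range(n_users):
            VW = V * W[i][:, None]
            U[i] = np.linalg.solve(VW.T @ V + ridge, VW.T @ R[i])
        # Update each item vector with the user factors held fixed.
        for j in range(n_items):
            UW = U * W[:, j][:, None]
            V[j] = np.linalg.solve(UW.T @ U + ridge, UW.T @ R[:, j])
    return U, V

# Toy data: two groups of users, each with positives on one pair of items.
R = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 0]], dtype=float)
# Observed positives get full weight; unknowns are treated as weak negatives.
W = np.where(R > 0, 1.0, 0.2)

U, V = weighted_als(R, W)
scores = U @ V.T  # higher score = stronger recommendation
print(np.round(scores, 2))
```

Setting every unknown weight to 1 would treat all missing entries as confident negatives, while setting it to 0 would ignore them entirely; intermediate weights express that a missing entry is only weak evidence of disinterest.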