Predictive discrete latent factor models for large scale dyadic data
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
San Jose, California, USA
SESSION: Research track papers
Pages: 26-35
Year of Publication: 2007
ISBN: 978-1-59593-609-7
Authors
Deepak Agarwal  Yahoo! Research
Srujana Merugu  Yahoo! Research
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1281192.1281199

ABSTRACT

We propose a novel statistical method to predict large-scale dyadic response variables in the presence of covariate information. Our approach simultaneously incorporates the effect of covariates and estimates local structure that is induced by interactions among the dyads through a discrete latent factor model. The discovered latent factors provide a predictive model that is both accurate and interpretable. We illustrate our method by working in a framework of generalized linear models, which includes commonly used regression techniques like linear regression, logistic regression and Poisson regression as special cases. We also provide scalable generalized EM-based algorithms for model fitting using both "hard" and "soft" cluster assignments. We demonstrate the generality and efficacy of our approach through large-scale simulation studies and analysis of datasets obtained from certain real-world movie recommendation and internet advertising applications.
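
The abstract gives no algorithmic detail, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes the Gaussian (linear regression) special case, a fully observed response matrix, and "hard" cluster assignments: each response is modelled as a covariate term plus an offset shared by all dyads in the same row-cluster/column-cluster block. The function name `fit_hard_sketch`, its parameters, and the alternating least-squares updates are all assumptions made for illustration.

```python
import numpy as np

def fit_hard_sketch(Y, X, k, l, n_iter=20, seed=0):
    """Hypothetical sketch: y_ij ~ x_ij . beta + delta[row_cluster(i), col_cluster(j)].

    Y: (m, n) responses; X: (m, n, p) dyad covariates.
    k, l: assumed numbers of row and column clusters.
    Returns beta, delta, row labels, column labels, and the squared-error history.
    """
    rng = np.random.default_rng(seed)
    m, n, p = X.shape
    rows = rng.integers(k, size=m)      # hard row-cluster label for each row
    cols = rng.integers(l, size=n)      # hard column-cluster label for each column
    delta = np.zeros((k, l))            # one offset per co-cluster block
    X_flat = X.reshape(m * n, p)
    history = []
    for _ in range(n_iter):
        # 1. Covariate step: least squares on responses minus block offsets.
        target = (Y - delta[rows][:, cols]).reshape(-1)
        beta = np.linalg.lstsq(X_flat, target, rcond=None)[0]
        R = Y - X @ beta                # residuals after removing covariate effects
        # 2. Block step: each offset is the mean residual inside its block.
        for a in range(k):
            for b in range(l):
                block = R[np.ix_(rows == a, cols == b)]
                if block.size:
                    delta[a, b] = block.mean()
        # 3. Reassign each row to the cluster with the lowest squared error.
        row_cost = ((R[:, None, :] - delta[None, :, cols]) ** 2).sum(axis=2)
        rows = row_cost.argmin(axis=1)
        # 4. Reassign each column likewise, using the updated row labels.
        col_cost = ((R[:, :, None] - delta[rows][:, None, :]) ** 2).sum(axis=0)
        cols = col_cost.argmin(axis=1)
        history.append(float(((R - delta[rows][:, cols]) ** 2).sum()))
    return beta, delta, rows, cols, history
```

Each of the four steps minimizes the same squared-error objective over one group of unknowns while holding the others fixed, so the recorded error cannot increase from one iteration to the next. A "soft" variant would replace the `argmin` reassignments with posterior membership probabilities, and other generalized linear models would replace squared error with the corresponding log-likelihood; neither is shown here.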

