A latent mixed membership model for relational data
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the 3rd International Workshop on Link Discovery
Chicago, Illinois
Pages: 82-89
Year of Publication: 2005
ISBN: 1-59593-215-1
Authors
Edoardo Airoldi, Carnegie Mellon University
David Blei, Carnegie Mellon University
Eric Xing, Carnegie Mellon University
Stephen Fienberg, Carnegie Mellon University
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 12,   Downloads (12 Months): 102,   Citation Count: 3
DOI: http://doi.acm.org/10.1145/1134271.1134283

ABSTRACT

Modeling relational data is an important problem for modern data analysis and machine learning. In this paper we propose a Bayesian model that uses a hierarchy of probabilistic assumptions about the way objects interact with one another in order to learn latent groups, their typical interaction patterns, and the degree of membership of objects to groups. Our model explains the data using a small set of parameters that can be reliably estimated with an efficient inference algorithm. In our approach, the set of probabilistic assumptions may be tailored to a specific application domain in order to incorporate intuitions and/or semantics of interest. We demonstrate our methods on simulated data and we successfully apply our model to a data set of protein-to-protein interactions.
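
The abstract does not spell out the model's equations, so the following is only an illustrative sketch of one generic way a mixed membership model for relational data can generate pairwise relations; it should not be read as the authors' exact specification. It assumes, hypothetically, that each object has a Dirichlet-distributed membership vector over latent groups, that each interaction picks one group per object from those vectors, and that a group-by-group matrix of Bernoulli probabilities (called `block` here) encodes the typical interaction patterns. All function and parameter names are made up for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one vector from Dirichlet(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw an index k with probability probs[k]."""
    u = rng.random()
    cumulative = 0.0
    for k, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return k
    return len(probs) - 1  # guard against floating-point round-off


def simulate_relations(n_objects, alpha, block, seed=0):
    """Simulate a binary relation matrix under the assumed generative sketch.

    alpha: Dirichlet parameters, one per latent group (assumed prior).
    block: block[g][h] is the assumed probability that an object acting as a
           member of group g relates to an object acting as a member of h.
    """
    rng = random.Random(seed)
    # Degree of membership of each object in each latent group.
    memberships = [sample_dirichlet(alpha, rng) for _ in range(n_objects)]
    relations = [[0] * n_objects for _ in range(n_objects)]
    for i in range(n_objects):
        for j in range(n_objects):
            if i == j:
                continue  # no self-relations in this sketch
            # Each object picks the group it acts as for this interaction.
            g_i = sample_categorical(memberships[i], rng)
            g_j = sample_categorical(memberships[j], rng)
            relations[i][j] = 1 if rng.random() < block[g_i][g_j] else 0
    return memberships, relations
```

For example, `simulate_relations(20, [0.5, 0.5], [[0.9, 0.05], [0.05, 0.9]])` would produce 20 objects whose relations are dense within each of two groups and sparse across them; inference would run in the opposite direction, recovering the memberships and the interaction matrix from observed relations.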



