ABSTRACT
Learning probabilistic graphical models from high-dimensional datasets is a computationally challenging task. In many interesting applications, the dimensionality of the domain prevents state-of-the-art statistical learning techniques from delivering accurate models in reasonable time. This paper presents a hybrid random field model for pseudo-likelihood estimation in high-dimensional domains. A theoretical analysis proves that the class of pseudo-likelihood distributions representable by hybrid random fields strictly includes the class of joint probability distributions representable by Bayesian networks. To learn hybrid random fields from data, we develop the Markov Blanket Merging algorithm. Theoretical and experimental evidence shows that Markov Blanket Merging scales well to high-dimensional datasets. Compared to other widely used statistical learning techniques, Markov Blanket Merging delivers accurate results in a number of link prediction tasks while also achieving significant improvements in computational efficiency. Our software implementation of the models investigated in this paper is publicly available at http://www.dii.unisi.it/~freno/. The same website also hosts those datasets used in this work that are not available elsewhere in the preprocessed form used for our experiments.