ABSTRACT
There are well-known algorithms for learning the structure of directed and undirected graphical models from data, but nearly all assume that the data consist of a single i.i.d. sample. In contexts such as fMRI analysis, the data may instead consist of an ensemble of independent samples that come from a common data-generating mechanism but may not have identical distributions. Pooling such data can cause a number of well-known statistical problems, so each sample must be analyzed individually, which yields no increase in power from the presence of multiple samples. We show how existing constraint-based methods can be modified to learn structure from the aggregate of such data in a statistically sound manner. The prescribed method is simple to implement and is based on existing statistical methods employed in meta-analysis and other areas, yet it works surprisingly well in this context, where issues such as retesting raise additional concerns. We report results for directed models, but the method is just as applicable to undirected models.
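To illustrate the general idea of aggregating evidence across samples without pooling the raw data, the sketch below combines per-sample p-values from the same conditional independence test into one decision. It uses Fisher's method, a standard meta-analytic combining rule, purely as an assumption: the abstract does not say which combining rule the paper adopts, and the function names `fisher_combine` and `independent_in_aggregate` are hypothetical.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method (illustrative choice).

    Under the null hypothesis, X = -2 * sum(ln p_i) follows a chi-square
    distribution with 2k degrees of freedom, where k = len(p_values).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    # For even degrees of freedom 2k, the chi-square survival function has
    # the closed form exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def independent_in_aggregate(p_values, alpha=0.05):
    """Hypothetical decision rule for a constraint-based search: treat the
    variables as independent unless the combined p-value rejects at alpha."""
    return fisher_combine(p_values) > alpha


# Four samples, none of which rejects independence on its own at alpha = 0.05.
per_sample_p = [0.08, 0.11, 0.06, 0.09]
combined_p = fisher_combine(per_sample_p)
```

In this made-up example no individual sample rejects independence at the 0.05 level, but the combined p-value does, which is the kind of power gain that analyzing each sample in isolation cannot provide. A constraint-based search would call such a rule in place of its usual single-sample independence test.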