|
ABSTRACT
Mixtures of gaussian (or normal) distributions arise in a variety of application areas. Many techniques have been proposed for the task of finding the component gaussians given samples from the mixture, such as the EM algorithm, a local-search heuristic from Dempster, Laird and Rubin~(1977). However, such heuristics are known to require time exponential in the dimension (i.e., number of variables) in the worst case, even when the number of components is $2$.This paper presents the first algorithm that provably learns the component gaussians in time that is polynomial in the dimension. The gaussians may have arbitrary shape provided they satisfy a “nondegeneracy” condition, which requires their high-probability regions to be not “too close” together.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
[1] N. Alon, J. Spencer and P. Erdös. The probabilistic method. Wiley Interscience, 1992.
|
| |
2
|
[2] J. Bourgain. Random points in isotropic convex sets. Convex geometric analysis (Berkeley, CA, 1996), 53-58, Math. Sci. Res. Inst. Publ., 34, Cambridge Univ. Press, Cambridge, 1999.
|
 |
3
|
Moses Charikar , Sudipto Guha , Éva Tardos , David B. Shmoys, A constant-factor approximation algorithm for the k-median problem (extended abstract), Proceedings of the thirty-first annual ACM symposium on Theory of computing, p.1-10, May 01-04, 1999, Atlanta, Georgia, United States
[doi> 10.1145/301250.301257]
|
| |
4
|
|
| |
5
|
[5] S. Dasgupta and L. Schulman. Personal communication, March 2000.
|
| |
6
|
[6] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. J. Royal Statistics Soc. Ser. B, 39:1-38, 1977.
|
 |
7
|
|
| |
8
|
[8] W. J. Hoeffding. Probability inequalities for sums of bounded random variables. J. American Statistical Assoc, 58(301):13-30, March 1963.
|
| |
9
|
[9] P. J. Huber. Projection pursuit. Annals of Statistics, 13(2):435-475, June 1985.
|
| |
10
|
[10] W. B. Johnson and J. Lindenstrauss. Extensions of Lipshitz mapping into Hilbert space. Contemp. Math. 26:189-206, 1984.
|
| |
11
|
|
| |
12
|
[12] L. Leindler. On a certain converse of Hölder's inequality II. stochastic programming. Acta Sci. Math. Szeged 33:217-223(1972).
|
| |
13
|
[13] B. Lindsay. Mixture models: theory, geometry, and applications. American Statistical Association, Virginia 1995.
|
| |
14
|
[14] A. Prékopa. Logarithmic concave measures with applications to stochastic programming. Acta Sci. Math. Szeged 32:301-316 (1971).
|
| |
15
|
[15] A. Prékopa. On logarithmic concave measures and functions. Acta Sci. Math. Szeged 34:335-343 (1973).
|
| |
16
|
[16] R. A. Redner, H. F. Walker. Mixture densities, maximum likelihood and the EM Algorithm. SIAM Review, 26(2):195-239, 1984.
|
| |
17
|
[17] M. Rudelson. Random vectors in the isotropic position. J. Func. Anal. 164:60-72, 1999.
|
| |
18
|
[18] D. M. Titterington, A. F. M. Smith, and U. E. Makov. Statistical analysis of finite mixture distributions, Wiley, 1985.
|
CITED BY 9
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Anirban Dasgupta , John Hopcroft , Ravi Kannan , Pradipta Mitra, Spectral clustering with limited independence, Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, p.1036-1045, January 07-09, 2007, New Orleans, Louisiana
|
|
|
Kamalika Chaudhuri , Eran Halperin , Satish Rao , Shuheng Zhou, A rigorous analysis of population stratification with limited data, Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, p.1046-1055, January 07-09, 2007, New Orleans, Louisiana
|
|
|
|
|