Multi-view clustering via canonical correlation analysis
Source
ACM International Conference Proceeding Series; Vol. 382
Proceedings of the 26th Annual International Conference on Machine Learning
Montreal, Quebec, Canada
Pages 129-136
Year of Publication: 2009
ISBN: 978-1-60558-516-1
ABSTRACT
Clustering data in high dimensions is believed to be a hard problem in general. A number of efficient clustering algorithms developed in recent years address this problem by projecting the data into a lower-dimensional subspace, e.g., via Principal Component Analysis (PCA) or random projections, before clustering. Here, we consider constructing such projections using multiple views of the data, via Canonical Correlation Analysis (CCA). Under the assumption that the views are uncorrelated given the cluster label, we show that the separation conditions required for the algorithm to succeed are significantly weaker than those in prior results in the literature. We provide results for mixtures of Gaussians and mixtures of log-concave distributions. We also provide empirical support from audio-visual speaker clustering (where we want the clusters to correspond to speaker ID) and from hierarchical Wikipedia document clustering (where one view is the words in the document and the other is the link structure).
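The project-then-cluster idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cca_project` and `kmeans`, the ridge term `reg`, the use of plain Lloyd's k-means as the clustering step, and the synthetic two-view data are all assumptions made for illustration. The sketch whitens each view, takes the SVD of the whitened cross-covariance, and projects the first view onto its top canonical directions (here k - 1 = 1 direction for k = 2 clusters) before clustering.

```python
import numpy as np

def cca_project(X1, X2, dim, reg=1e-6):
    """Project view 1 onto its top `dim` canonical directions w.r.t. view 2."""
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)
    n = X1.shape[0]
    # Within-view covariances (small ridge for numerical stability; an assumption)
    C11 = X1.T @ X1 / n + reg * np.eye(X1.shape[1])
    C22 = X2.T @ X2 / n + reg * np.eye(X2.shape[1])
    # Cross-covariance between the two views
    C12 = X1.T @ X2 / n

    def inv_sqrt(C):
        # Inverse square root of a symmetric positive-definite matrix
        w, V = np.linalg.eigh(C)
        return V @ np.diag(w ** -0.5) @ V.T

    W1, W2 = inv_sqrt(C11), inv_sqrt(C22)
    # Singular values of the whitened cross-covariance are the canonical correlations
    U, s, _ = np.linalg.svd(W1 @ C12 @ W2)
    return X1 @ W1 @ U[:, :dim], s[:dim]

def kmeans(Z, k, iters=50, seed=0):
    """Plain Lloyd's k-means, used here only as a stand-in clustering step."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(iters):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

# Synthetic data (an assumption): two clusters, and noise that is
# independent across the two views once the cluster label is fixed.
rng = np.random.default_rng(0)
n, d = 1000, 10
y = rng.integers(0, 2, size=n)
signs = 2 * y - 1
mu1 = np.zeros(d); mu1[0] = 3.0   # cluster mean direction in view 1
mu2 = np.zeros(d); mu2[1] = 3.0   # cluster mean direction in view 2
X1 = signs[:, None] * mu1 + rng.standard_normal((n, d))
X2 = signs[:, None] * mu2 + rng.standard_normal((n, d))

Z, corrs = cca_project(X1, X2, dim=1)   # k - 1 = 1 canonical direction
labels = kmeans(Z, k=2)
```

Because the views in this toy setup are uncorrelated given the label, the only cross-view correlation comes from the cluster means, so the top canonical direction lines up with the direction that separates the clusters. Real data would need a held-out evaluation and a more careful choice of `reg` and of the clustering step.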