| Clustering via matrix powering |
| Full text |
Pdf
(134 KB)
|
| Source
|
Symposium on Principles of Database Systems
archive
Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
table of contents
Paris, France
SESSION: Clustering, data mining, approximations
table of contents
Pages: 136 - 142
Year of Publication: 2004
ISBN:158113858X
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 7, Downloads (12 Months): 34, Citation Count: 1
|
|
|
ABSTRACT
Given a set of n points with a matrix of pairwise similarity measures, one would like to partition the points into clusters so that similar points are together and different ones apart. We present an algorithm requiring only matrix powering that performs well in practice and bears an elegant interpretation in terms of random walks on a graph. Under a certain mixture model involving planting a partition via randomized rounding of tailored matrix entries, the algorithm can be proven effective for only a single squaring. It is shown that the clustering performance of the algorithm degrades with larger values of the exponent, thus revealing that a single squaring is optimal.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
 |
2
|
Sanjeev Arora , Prabhakar Raghavan , Satish Rao, Approximation schemes for Euclidean k-medians and related problems, Proceedings of the thirtieth annual ACM symposium on Theory of computing, p.106-113, May 24-26, 1998, Dallas, Texas, United States
[doi> 10.1145/276698.276718]
|
| |
3
|
Y. Azar, A. Fiat, A. Karlin, and F. McSherry. Data mining through spectral analysis. In IEEE Symposium on Foundations of Computer Science, 2001.
|
| |
4
|
|
| |
5
|
R. Boppana. Eigenvalues and graph bisection: An average-case analysis. In IEEE Symposium on Foundations of Computer Science, pages 280--285, 1985.
|
| |
6
|
|
 |
7
|
Moses Charikar , Sudipto Guha , Éva Tardos , David B. Shmoys, A constant-factor approximation algorithm for the k-median problem (extended abstract), Proceedings of the thirty-first annual ACM symposium on Theory of computing, p.1-10, May 01-04, 1999, Atlanta, Georgia, United States
[doi> 10.1145/301250.301257]
|
| |
8
|
A. Condon and R. Karp. Algorithms for graph partitioning on the planted partition model. In Random Structure and Algorithms, 1999.
|
| |
9
|
D. Coppersmith and S. Winograd. Matrix multiplication via arithmetic progressions, 1990.
|
 |
10
|
|
| |
11
|
Z. Drezner, Facility Location: A survey of Applications and Methods. Springer-Verlag, 1995.
|
| |
12
|
P. Drineas , Alan Frieze , Ravi Kannan , Santosh Vempala , V. Vinay, Clustering in large graphs and matrices, Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms, p.291-299, January 17-19, 1999, Baltimore, Maryland, United States
|
| |
13
|
M. Dyer and A. M. Frieze. A simple heuristic for the p-center problem. Operations Research Letters, 3:285--288, 1985.
|
| |
14
|
D. S. Hochbaum and D. B. Shmoys. A best possible approximation algorithm for the k-center problem. Math. Oper. Res., 10:180--184, 1985.
|
| |
15
|
|
| |
16
|
M. Jambu and M. O. Lebeaux. Cluster Analysis and Data Analysis. North-Holland, New York, 1983.
|
| |
17
|
M. Jerrum and G. Sorkin. Simulated annealing for graph bisection. In IEEE Symposium on Foundations of Computer Science, 1993.
|
| |
18
|
|
 |
19
|
Jon Kleinberg , Christos Papadimitriou , Prabhakar Raghavan, Segmentation problems, Proceedings of the thirtieth annual ACM symposium on Theory of computing, p.473-482, May 24-26, 1998, Dallas, Texas, United States
[doi> 10.1145/276698.276860]
|
| |
20
|
L. Lovasz. Random walks on graphs: a survey. In Combinatorics, pages 1--46, 1993.
|
| |
21
|
|
| |
22
|
|
 |
23
|
Christos H. Papadimitriou , Hisao Tamaki , Prabhakar Raghavan , Santosh Vempala, Latent semantic indexing: a probabilistic analysis, Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.159-168, June 01-04, 1998, Seattle, Washington, United States
[doi> 10.1145/275487.275505]
|
| |
24
|
M. Szummer and T. Jaakkola. Partially labeled classification with markov random walks. In Advances in Neural Information Processing Systems, volume 14, 2001.
|
| |
25
|
|
|