ACM Home Page
Please provide us with feedback. Feedback
Clustering via matrix powering
Full text PdfPdf (134 KB)
Source Symposium on Principles of Database Systems archive
Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems table of contents
Paris, France
SESSION: Clustering, data mining, approximations table of contents
Pages: 136 - 142  
Year of Publication: 2004
ISBN:158113858X
Authors
Hanson Zhou  Intelligence Laboratory, Cambridge, MA
David Woodruff  Intelligence Laboratory, Cambridge, MA
Sponsors
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
SIGMOD: ACM Special Interest Group on Management of Data
SIGART: ACM Special Interest Group on Artificial Intelligence
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 7,   Downloads (12 Months): 34,   Citation Count: 1
Additional Information:

abstract   references   cited by   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1055558.1055579
What is a DOI?

ABSTRACT

Given a set of n points with a matrix of pairwise similarity measures, one would like to partition the points into clusters so that similar points are together and different ones apart. We present an algorithm requiring only matrix powering that performs well in practice and bears an elegant interpretation in terms of random walks on a graph. Under a certain mixture model involving planting a partition via randomized rounding of tailored matrix entries, the algorithm can be proven effective for only a single squaring. It is shown that the clustering performance of the algorithm degrades with larger values of the exponent, thus revealing that a single squaring is optimal.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
 
3
Y. Azar, A. Fiat, A. Karlin, and F. McSherry. Data mining through spectral analysis. In IEEE Symposium on Foundations of Computer Science, 2001.
 
4
 
5
R. Boppana. Eigenvalues and graph bisection: An average-case analysis. In IEEE Symposium on Foundations of Computer Science, pages 280--285, 1985.
 
6
7
 
8
A. Condon and R. Karp. Algorithms for graph partitioning on the planted partition model. In Random Structure and Algorithms, 1999.
 
9
D. Coppersmith and S. Winograd. Matrix multiplication via arithmetic progressions, 1990.
10
 
11
Z. Drezner, Facility Location: A survey of Applications and Methods. Springer-Verlag, 1995.
 
12
 
13
M. Dyer and A. M. Frieze. A simple heuristic for the p-center problem. Operations Research Letters, 3:285--288, 1985.
 
14
D. S. Hochbaum and D. B. Shmoys. A best possible approximation algorithm for the k-center problem. Math. Oper. Res., 10:180--184, 1985.
 
15
 
16
M. Jambu and M. O. Lebeaux. Cluster Analysis and Data Analysis. North-Holland, New York, 1983.
 
17
M. Jerrum and G. Sorkin. Simulated annealing for graph bisection. In IEEE Symposium on Foundations of Computer Science, 1993.
 
18
19
 
20
L. Lovasz. Random walks on graphs: a survey. In Combinatorics, pages 1--46, 1993.
 
21
 
22
23
 
24
M. Szummer and T. Jaakkola. Partially labeled classification with markov random walks. In Advances in Neural Information Processing Systems, volume 14, 2001.
 
25

Collaborative Colleagues:
Hanson Zhou: colleagues
David Woodruff: colleagues