ACM Home Page
Please provide us with feedback. Feedback
On coresets for k-means and k-median clustering
Full text PdfPdf (223 KB)
Source
Annual ACM Symposium on Theory of Computing archive
Proceedings of the thirty-sixth annual ACM symposium on Theory of computing table of contents
Chicago, IL, USA
SESSION: Session 7B table of contents
Pages: 291 - 300  
Year of Publication: 2004
ISBN:1-58113-852-0
Authors
Sariel Har-Peled  University of Illinois, Urbana-Champaign, Urbana, IL
Soham Mazumdar  University of Illinois, Urbana-Champaign, Urbana, IL
Sponsors
ACM: Association for Computing Machinery
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 27,   Downloads (12 Months): 157,   Citation Count: 30
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1007352.1007400
What is a DOI?

ABSTRACT

In this paper, we show the existence of small coresets for the problems of computing k-median and k-means clustering for points in low dimension. In other words, we show that given a point set P in Rd, one can compute a weighted set S ⊆ P, of size O(k ε-d log n), such that one can compute the k-median/means clustering on S instead of on P, and get an (1+ε)-approximation. As a result, we improve the fastest known algorithms for (1+ε)-approximate k-means and k-median. Our algorithms have linear running time for a fixed k and ε. In addition, we can maintain the (1+ε)-approximate k-median or k-means clustering of a stream when points are being only inserted, using polylogarithmic space and update time.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
 
3
4
5
6
7
 
8
M. Badoiu and K. L. Clarkson. Optimal core-sets for balls. In Proc. 14th ACM-SIAM Sympos. Discrete Algorithms, pages 801--802, 2003.
9
 
10
J. L. Bentley and J. B. Saxe. Decomposable searching problems i: Static-to-dynamic transformation. J. Algorithms, 1(4):301--358, 1980.
 
11
12
13
14
 
15
 
16
T. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoret. Comput. Sci., 38:293--306, 1985.
 
17
 
18
 
19
20
21
22
 
23
P. Indyk. (1+ε)-approximate bi-criterion algorithm for k-median on a stream. In Proc. 36th Annu. ACM Sympos. Theory Comput., 2004.
 
24
25
 
26
27
 
28
J. Matousek. On approximate geometric k-clustering. Discrete Comput. Geom., 24:61--84, 2000.
 
29
R. Mettu and C. G. Plaxton. Optimal time bounds for approximate clustering. In In Proceedings of the 18th Conference on Uncertainty in Artifical Intelligence, pages 344--351, 2002.

CITED BY  31

Collaborative Colleagues:
Sariel Har-Peled: colleagues
Soham Mazumdar: colleagues