| On coresets for k-means and k-median clustering |
| Full text |
Pdf
(223 KB)
|
Source
|
Annual ACM Symposium on Theory of Computing
archive
Proceedings of the thirty-sixth annual ACM symposium on Theory of computing
table of contents
Chicago, IL, USA
SESSION: Session 7B
table of contents
Pages: 291 - 300
Year of Publication: 2004
ISBN:1-58113-852-0
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 23, Downloads (12 Months): 159, Citation Count: 30
|
|
|
ABSTRACT
In this paper, we show the existence of small coresets for the problems of computing k-median and k-means clustering for points in low dimension. In other words, we show that given a point set P in Rd, one can compute a weighted set S ⊆ P, of size O(k ε-d log n), such that one can compute the k-median/means clustering on S instead of on P, and get an (1+ε)-approximation. As a result, we improve the fastest known algorithms for (1+ε)-approximate k-means and k-median. Our algorithms have linear running time for a fixed k and ε. In addition, we can maintain the (1+ε)-approximate k-median or k-means clustering of a stream when points are being only inserted, using polylogarithmic space and update time.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
| |
2
|
|
| |
3
|
|
 |
4
|
|
 |
5
|
Sanjeev Arora , Prabhakar Raghavan , Satish Rao, Approximation schemes for Euclidean k-medians and related problems, Proceedings of the thirtieth annual ACM symposium on Theory of computing, p.106-113, May 24-26, 1998, Dallas, Texas, United States
[doi> 10.1145/276698.276718]
|
 |
6
|
|
 |
7
|
Vijay Arya , Naveen Garg , Rohit Khandekar , Kamesh Munagala , Vinayaka Pandit, Local search heuristic for k-median and facility location problems, Proceedings of the thirty-third annual ACM symposium on Theory of computing, p.21-29, July 2001, Hersonissos, Greece
[doi> 10.1145/380752.380755]
|
| |
8
|
M. Badoiu and K. L. Clarkson. Optimal core-sets for balls. In Proc. 14th ACM-SIAM Sympos. Discrete Algorithms, pages 801--802, 2003.
|
 |
9
|
|
| |
10
|
J. L. Bentley and J. B. Saxe. Decomposable searching problems i: Static-to-dynamic transformation. J. Algorithms, 1(4):301--358, 1980.
|
| |
11
|
|
 |
12
|
Moses Charikar , Sudipto Guha , Éva Tardos , David B. Shmoys, A constant-factor approximation algorithm for the k-median problem (extended abstract), Proceedings of the thirty-first annual ACM symposium on Theory of computing, p.1-10, May 01-04, 1999, Atlanta, Georgia, United States
[doi> 10.1145/301250.301257]
|
 |
13
|
W. Fernandez de la Vega , Marek Karpinski , Claire Kenyon , Yuval Rabani, Approximation schemes for clustering problems, Proceedings of the thirty-fifth annual ACM symposium on Theory of computing, June 09-11, 2003, San Diego, CA, USA
[doi> 10.1145/780542.780550]
|
 |
14
|
|
| |
15
|
|
| |
16
|
T. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoret. Comput. Sci., 38:293--306, 1985.
|
| |
17
|
|
| |
18
|
|
| |
19
|
|
 |
20
|
|
 |
21
|
Mary Inaba , Naoki Katoh , Hiroshi Imai, Applications of weighted Voronoi diagrams and randomization to variance-based k-clustering: (extended abstract), Proceedings of the tenth annual symposium on Computational geometry, p.332-339, June 06-08, 1994, Stony Brook, New York, United States
[doi> 10.1145/177424.178042]
|
 |
22
|
|
| |
23
|
P. Indyk. (1+ε)-approximate bi-criterion algorithm for k-median on a stream. In Proc. 36th Annu. ACM Sympos. Theory Comput., 2004.
|
| |
24
|
|
 |
25
|
Tapas Kanungo , David M. Mount , Nathan S. Netanyahu , Christine D. Piatko , Ruth Silverman , Angela Y. Wu, A local search approximation algorithm for k-means clustering, Proceedings of the eighteenth annual symposium on Computational geometry, p.10-18, June 05-07, 2002, Barcelona, Spain
[doi> 10.1145/513400.513402]
|
| |
26
|
|
 |
27
|
|
| |
28
|
J. Matousek. On approximate geometric k-clustering. Discrete Comput. Geom., 24:61--84, 2000.
|
| |
29
|
R. Mettu and C. G. Plaxton. Optimal time bounds for approximate clustering. In In Proceedings of the 18th Conference on Uncertainty in Artifical Intelligence, pages 344--351, 2002.
|
CITED BY 30
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Nir Ailon , Bernard Chazelle , Seshadhri Comandur , Ding Liu, Self-improving algorithms, Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm, p.261-270, January 22-26, 2006, Miami, Florida
|
|
|
Amit Deshpande , Luis Rademacher , Santosh Vempala , Grant Wang, Matrix approximation and projective clustering via volume sampling, Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm, p.1117-1126, January 22-26, 2006, Miami, Florida
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Anirban Dasgupta , Petros Drineas , Boulos Harb , Ravi Kumar , Michael W. Mahoney, Sampling algorithms and coresets for ℓp regression, Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms, p.932-941, January 20-22, 2008, San Francisco, California
|
|
|
|
|
|
|
|
|
Dan Feldman , Amos Fiat , Haim Kaplan , Kobbi Nissim, Private coresets, Proceedings of the 41st annual ACM symposium on Theory of computing, May 31-June 02, 2009, Bethesda, MD, USA
|
|
|
|
|
|
|
|