| Sublinear time approximate clustering |
| Full text |
Pdf
(740 KB)
|
| Source
|
Symposium on Discrete Algorithms
archive
Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms
table of contents
Washington, D.C., United States
Pages: 439 - 447
Year of Publication: 2001
ISBN:0-89871-490-7
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
Society for Industrial and Applied Mathematics
Philadelphia, PA, USA
|
| Bibliometrics |
Downloads (6 Weeks): 3, Downloads (12 Months): 39, Citation Count: 23
|
|
|
ABSTRACT
Clustering is of central importance in a number of disciplines including Machine Learning, Statistics, and Data Mining. This paper has two foci: (1) It describes how existing algorithms for clustering can benefit from simple sampling techniques arising from work in statistics [Pol84]. (2) It motivates and introduces a new model of clustering that is in the spirit of the “PAC (probably approximately correct)” learning model, and gives examples of efficient PAC-clustering algorithms.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
ADPR00
|
|
| |
AHM95
|
Peter Auer, Robert C. Holte, and Wolfgang Maass. Theory and applications of agnostic PAC- learning with small decision trees. In Proc. 12th International Conference on Machine Learning, pages 21- 29. Morgan Kaufmann, 1995.
|
| |
Ang87
|
D. Angluin. Learning k-term DNF formulas using queries and counterexamples. Technical Report YALEU/DCS/RR-559, Department of Computer Science, Yale University, August 1987.
|
| |
BEHW87
|
|
 |
BKK+94
|
Avrim Blum , Roni Khardon , Eyal Kushilevitz , Leonard Pitt , Dan Roth, On learning Read-k-Satisfy-j DNF, Proceedings of the seventh annual conference on Computational learning theory, p.110-117, July 12-15, 1994, New Brunswick, New Jersey, United States
[doi> 10.1145/180139.181051]
|
 |
CCFM97
|
Moses Charikar , Chandra Chekuri , Tomás Feder , Rajeev Motwani, Incremental clustering and dynamic information retrieval, Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, p.626-635, May 04-06, 1997, El Paso, Texas, United States
[doi> 10.1145/258533.258657]
|
| |
CG99
|
|
 |
CGTS99
|
Moses Charikar , Sudipto Guha , Éva Tardos , David B. Shmoys, A constant-factor approximation algorithm for the k-median problem (extended abstract), Proceedings of the thirty-first annual ACM symposium on Theory of computing, p.1-10, May 01-04, 1999, Atlanta, Georgia, United States
[doi> 10.1145/301250.301257]
|
 |
FG88
|
|
| |
GMMO00
|
|
| |
Gon85
|
T. F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38(2-3):293-306, June 1985.
|
| |
Hau92
|
|
 |
HS86
|
|
 |
Ind99
|
|
| |
JV99
|
|
| |
KVV00
|
|
| |
Pol84
|
D. Pollard. Convergence of Stochastic Processes. Springer Verlag, 1984.
|
| |
PR88
|
|
 |
Sch00
|
|
 |
Val84
|
|
CITED BY 23
|
|
|
|
|
W. Fernandez de la Vega , Marek Karpinski , Claire Kenyon , Yuval Rabani, Approximation schemes for clustering problems, Proceedings of the thirty-fifth annual ACM symposium on Theory of computing, June 09-11, 2003, San Diego, CA, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Brain Babcock , Mayur Datar , Rajeev Motwani , Liadan O'Callaghan, Maintaining variance and k-medians over data stream windows, Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.234-243, June 09-11, 2003, San Diego, California
|
|
|
Nir Ailon , Bernard Chazelle , Seshadhri Comandur , Ding Liu, Self-improving algorithms, Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm, p.261-270, January 22-26, 2006, Miami, Florida
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|