ACM Home Page
Please provide us with feedback. Feedback
Sublinear time approximate clustering
Full text PdfPdf (740 KB)
Source Symposium on Discrete Algorithms archive
Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms table of contents
Washington, D.C., United States
Pages: 439 - 447  
Year of Publication: 2001
ISBN:0-89871-490-7
Authors
Nina Mishra  Hewlett-Packard Labs, Palo Alto, CA
Dan Oblinger  IBM TJ Watson Labs
Leonard Pitt  University of Illinois at Urbana-Champaign, Urbana, IL
Sponsors
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
SIAM : Society for Industrial and Applied Mathematics
Publisher
Society for Industrial and Applied Mathematics  Philadelphia, PA, USA
Bibliometrics
Downloads (6 Weeks): 3,   Downloads (12 Months): 45,   Citation Count: 23
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  

ABSTRACT

Clustering is of central importance in a number of disciplines including Machine Learning, Statistics, and Data Mining. This paper has two foci: (1) It describes how existing algorithms for clustering can benefit from simple sampling techniques arising from work in statistics [Pol84]. (2) It motivates and introduces a new model of clustering that is in the spirit of the “PAC (probably approximately correct)” learning model, and gives examples of efficient PAC-clustering algorithms.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
ADPR00
 
AHM95
Peter Auer, Robert C. Holte, and Wolfgang Maass. Theory and applications of agnostic PAC- learning with small decision trees. In Proc. 12th International Conference on Machine Learning, pages 21- 29. Morgan Kaufmann, 1995.
 
Ang87
D. Angluin. Learning k-term DNF formulas using queries and counterexamples. Technical Report YALEU/DCS/RR-559, Department of Computer Science, Yale University, August 1987.
 
BEHW87
BKK+94
CCFM97
 
CG99
CGTS99
FG88
 
GMMO00
 
Gon85
T. F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38(2-3):293-306, June 1985.
 
Hau92
HS86
Ind99
 
JV99
 
KVV00
 
Pol84
D. Pollard. Convergence of Stochastic Processes. Springer Verlag, 1984.
 
PR88
Sch00
Val84

CITED BY  23

Collaborative Colleagues:
Nina Mishra: colleagues
Dan Oblinger: colleagues
Leonard Pitt: colleagues