Requirements for clustering data streams
Source
ACM SIGKDD Explorations Newsletter
Volume 3, Issue 2 (January 2002)
COLUMN: Contributed articles on online, interactive, and anytime data mining
Pages: 23 - 27
Year of Publication: 2002
ISSN: 1931-0145
ABSTRACT
Scientific and industrial examples of data streams abound in astronomy, telecommunication operations, banking and stock-market applications, e-commerce, and other fields. A challenge posed by continuously arriving data streams is to analyze them and to update the models that explain them as new data arrives. In this paper, we analyze the requirements for clustering data streams. We review some of the latest algorithms in the literature and assess whether they meet these requirements.