Mining data streams under block evolution
Source: ACM SIGKDD Explorations Newsletter
Volume 3, Issue 2 (January 2002)
Column: Contributed articles on online, interactive, and anytime data mining
Pages: 1-10
Year of Publication: 2002
ISSN: 1931-0145
Authors
Venkatesh Ganti, Microsoft Research
Johannes Gehrke, Cornell University
Raghu Ramakrishnan, UW-Madison
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/507515.507517

ABSTRACT

This paper surveys recent work on incremental data mining model maintenance and change detection under block evolution, in which a dataset is updated periodically through insertions and deletions of whole blocks of records. We describe two techniques: (1) a generic algorithm for model maintenance that takes any traditional incremental model maintenance algorithm and transforms it into one that allows the model to be restricted to a temporal subset of the database; and (2) a generic framework for change detection that quantifies the difference between two datasets in terms of the data mining models they induce.
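
To make the two ideas in the abstract concrete, here is a minimal Python sketch. It is an illustration only and not the algorithms from the paper: the "model" is assumed to be a plain table of item counts, the temporal restriction is assumed to be "the most recent `window` blocks", and the class `BlockWindowCounts` and the function `deviation` are hypothetical names introduced for this example. Blocks are assumed to be non-empty.

```python
from collections import Counter, deque


class BlockWindowCounts:
    """Item counts over the most recent `window` blocks (hypothetical helper)."""

    def __init__(self, window):
        self.window = window
        self.blocks = deque()    # per-block counts, oldest block first
        self.counts = Counter()  # the maintained "model" for the current window

    def add_block(self, records):
        # Insert a whole block at once and fold it into the model.
        block = Counter(records)
        self.blocks.append(block)
        self.counts.update(block)
        # Once the window is full, delete the oldest block from the model.
        if len(self.blocks) > self.window:
            oldest = self.blocks.popleft()
            self.counts.subtract(oldest)
            self.counts = +self.counts  # drop items whose count fell to zero


def deviation(records_a, records_b):
    """Sum of absolute differences in relative item frequency.

    Assumes both datasets are non-empty. This compares the two datasets
    through the simple count models they induce.
    """
    a, b = Counter(records_a), Counter(records_b)
    total_a, total_b = sum(a.values()), sum(b.values())
    return sum(abs(a[k] / total_a - b[k] / total_b) for k in a.keys() | b.keys())


model = BlockWindowCounts(window=2)
model.add_block(["a", "b", "a"])
model.add_block(["b", "c"])
model.add_block(["c", "c"])  # the first block now falls out of the window
print(model.counts)
print(deviation(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```

Counts are additive, so inserting or deleting a block is a cheap update here. Models that cannot be "un-learned" this easily are the harder case that a generic maintenance algorithm has to address.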


