Expressive power of an algebra for data mining
Source: ACM Transactions on Database Systems (TODS)
Volume 31, Issue 4 (December 2006)
Pages: 1169 - 1214  
Year of Publication: 2006
ISSN:0362-5915
Authors
Toon Calders  Eindhoven Technical University, Eindhoven, The Netherlands
Laks V. S. Lakshmanan  University of British Columbia, Vancouver, B.C., Canada
Raymond T. Ng  University of British Columbia, Vancouver, B.C., Canada
Jan Paredaens  University of Antwerp, Antwerp, Belgium
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1189769.1189770

APPENDICES and SUPPLEMENTS
Online appendix to designing mediation for context-aware applications. The appendix supports the information on page 1169.


ABSTRACT

The relational data model has simple and clear foundations on which significant theoretical and systems research has flourished. By contrast, most research on data mining has focused on algorithmic issues. A major open question is: what's an appropriate foundation for data mining, which can accommodate disparate mining tasks? We address this problem by presenting a database model and an algebra for data mining. The database model is based on the 3W-model introduced by Johnson et al. [2000]. This model relied on black box mining operators. A main contribution of this article is to open up these black boxes, by using generic operators in a data mining algebra. Two key operators in this algebra are regionize, which creates regions (or models) from data tuples, and a restricted form of looping called mining loop. Then the resulting data mining algebra MA is studied and properties concerning expressive power and complexity are established. We present results in three directions: (1) expressiveness of the mining algebra; (2) relations with alternative frameworks, and (3) interactions between regionize and mining loop.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
6. Breiman, L., Friedman, J., Olshen, R., and Stone, C. 1984. Classification and Regression Trees. Wadsworth, Belmont, CA.
7. Calders, T., Rigotti, C., and Boulicaut, J.-F. 2006. A survey on condensed representations for frequent sets. In Constraint-Based Mining and Inductive Databases, J.-F. Boulicaut, L. de Raedt, and H. Mannila, Eds. Lecture Notes in Computer Science, vol. 3848. Springer-Verlag, London, U.K.
10. Dantzig, G. 1963. Linear Programming and Extensions. Princeton University Press, Princeton, NJ.
12. Geist, I. and Sattler, K. 2002. Towards data mining operators in database systems: Algebra and implementation. In Proceedings of the DBFusion International Workshop on Databases, Documents, and Information Fusion. Vol. 124. CEUR-WS, Karlsruhe, Germany.
21. Law, Y.-N., Wang, H., and Zaniolo, C. 2004. Query languages and data models for database sequences and data streams. In Proceedings of the VLDB International Conference on Very Large Data Bases. Morgan Kaufmann, San Francisco, CA, 492-503.
23. Mannila, H. and Toivonen, H. 1996. Multiple uses of frequent sets and condensed representations. In Proceedings of the KDD International Conference on Knowledge Discovery in Databases. ACM Press, New York, NY.
25. Murty, K. G. 1983. Linear Programming. John Wiley & Sons, New York, NY.
33. Wang, H. and Zaniolo, C. 2003. ATLaS: A native extension of SQL for data mining. In Proceedings of the Third SIAM International Conference on Data Mining. SIAM Press, Philadelphia, PA.
34. Zaniolo, C. 2005. Mining databases and data streams with query languages and rules. In Proceedings of the ECML-PKDD 2005 Workshop on Knowledge Discovery in Inductive Databases. Lecture Notes in Computer Science, vol. 3933. Springer-Verlag, London, U.K.

