ABSTRACT
The relational data model has simple and clear foundations on which significant theoretical and systems research has flourished. By contrast, most research on data mining has focused on algorithmic issues. A major open question is: what is an appropriate foundation for data mining, one that can accommodate disparate mining tasks? We address this problem by presenting a database model and an algebra for data mining. The database model is based on the 3W-model introduced by Johnson et al. [2000]. That model relied on black-box mining operators. A main contribution of this article is to open up these black boxes by using generic operators in a data mining algebra. Two key operators in this algebra are regionize, which creates regions (or models) from data tuples, and a restricted form of looping called mining loop. We then study the resulting data mining algebra MA and establish properties concerning its expressive power and complexity. We present results in three directions: (1) expressiveness of the mining algebra; (2) relations with alternative frameworks; and (3) interactions between regionize and mining loop.
REFERENCES
4. Agrawal, R., Imieliński, T., and Swami, A. 1993. Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data (Washington, D.C., May 25--28). 207--216.
6. Breiman, L., Friedman, J., Olshen, R., and Stone, C. 1984. Classification and Regression Trees. Wadsworth, Belmont, CA.
7. Calders, T., Rigotti, C., and Boulicaut, J.-F. 2006. A survey on condensed representations for frequent sets. In Constraint-Based Mining and Inductive Databases, J.-F. Boulicaut, L. de Raedt, and H. Mannila, Eds. Lecture Notes in Computer Science, vol. 3848. Springer-Verlag, London, U.K.
10. Dantzig, G. 1963. Linear Programming and Extensions. Princeton University Press, Princeton, NJ.
12. Geist, I. and Sattler, K. 2002. Towards data mining operators in database systems: Algebra and implementation. In Proceedings of the DBFusion International Workshop on Databases, Documents, and Information Fusion. Vol. 124. CEUR-WS, Karlsruhe, Germany.
14. Han, J., Fu, Y., Wang, W., Chiang, J., Zaïane, O. R., and Koperski, K. 1996. DBMiner: Interactive mining of multiple-level knowledge in relational databases. In Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data (Montreal, Quebec, Canada, June 4--6). 550.
15. Han, J., Pei, J., and Yin, Y. 2000. Mining frequent patterns without candidate generation. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data (Dallas, Texas, May 15--18). 1--12.
21. Law, Y.-N., Wang, H., and Zaniolo, C. 2004. Query languages and data models for database sequences and data streams. In Proceedings of the VLDB International Conference on Very Large Data Bases. Morgan Kaufmann, San Francisco, CA, 492--503.
23. Mannila, H. and Toivonen, H. 1996. Multiple uses of frequent sets and condensed representations. In Proceedings of the KDD International Conference on Knowledge Discovery in Databases. ACM Press, New York, NY.
25. Murty, K. G. 1983. Linear Programming. John Wiley & Sons, New York, NY.
27. Paredaens, J., Van den Bussche, J., and Van Gucht, D. 1994. Towards a theory of spatial database queries (extended abstract). In Proceedings of the Thirteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (Minneapolis, Minnesota, May 24--27). 279--288. DOI: 10.1145/182591.182640
29. Sarawagi, S., Thomas, S., and Agrawal, R. 1998. Integrating association rule mining with relational database systems: Alternatives and implications. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data (Seattle, Washington, June 1--4). 343--354.
30. Tsur, D., Ullman, J. D., Abiteboul, S., Clifton, C., Motwani, R., Nestorov, S., and Rosenthal, A. 1998. Query flocks: A generalization of association-rule mining. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data (Seattle, Washington, June 1--4). 1--12.
33. Wang, H. and Zaniolo, C. 2003. ATLaS: A native extension of SQL for data mining. In Proceedings of the Third SIAM International Conference on Data Mining. SIAM Press, Philadelphia, PA.
34. Zaniolo, C. 2005. Mining databases and data streams with query languages and rules. In Proceedings of the ECML-PKDD 2005 Workshop on Knowledge Discovery in Inductive Databases. Lecture Notes in Computer Science, vol. 3933. Springer-Verlag, London, U.K.
CITED BY 2
Ortale, R., Ritacco, E., Pelekis, N., Trasarti, R., Costa, G., Giannotti, F., Manco, G., Renso, C., and Theodoridis, Y. 2008. The DAEDALUS framework: Progressive querying and mining of movement data. In Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (Irvine, California, November 5--7).