|
ABSTRACT
The notion of roll-up dependency (RUD) extends functional dependencies with generalization hierarchies. RUDs can be applied in OLAP and database design. The problem of discovering RUDs in large databases is at the center of this paper. An algorithm is provided that relies on a number of theoretical results. The algorithm has been implemented; results on two real-life datasets are given. The extension of functional dependency (FD) with roll-ups turns out to capture meaningful rules that are outside the scope of classical FD mining. Performance figures show that RUDs can be discovered in linear time in the number of tuples of the input dataset.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Rakesh Agrawal , Tomasz Imieliński , Arun Swami, Mining association rules between sets of items in large databases, Proceedings of the 1993 ACM SIGMOD international conference on Management of data, p.207-216, May 25-28, 1993, Washington, D.C., United States
|
| |
2
|
Rakesh Agrawal , Hiekki Mannila , Ramakrishnan Srikant , Hannu Toivonen , A. Inkeri Verkamo, Fast discovery of association rules, Advances in knowledge discovery and data mining, American Association for Artificial Intelligence, Menlo Park, CA, 1996
|
| |
3
|
|
| |
4
|
|
| |
5
|
Bay, S. 1999. The UCI KDD archive, http://kdd.ics.uci.edu/. University of California, Irvine, Dept. of Information and Computer Sciences.
|
| |
6
|
Bell, S. 1995. Discovery and maintenance of functional dependencies by independencies. In Proceedings of the 1st International Conference on Knowledge Discovery in Databases. AAAI Press, Montreal, Canada, 27--32.
|
| |
7
|
Bettini, C., Dyreson, C., Evans, W., Snodgrass, R., and Wang, X. 1998. A glossary of time granularity concepts. In Temporal Databases: Research and Practice, O. Etzion, S. Jajodia, and S. Sripada, Eds. Number 1399 in LNCS State-of-the-art Survey. Springer-Verlag, 406--413.
|
| |
8
|
|
| |
9
|
|
| |
10
|
CIA. 2001. The world factbook 2001, http://www.cia.gov/cia/publications/factbook.
|
| |
11
|
Davey, B. and Priestley, H. 1990. Introduction to Lattices and Order. Cambridge University Press.
|
| |
12
|
Jim Gray , Surajit Chaudhuri , Adam Bosworth , Andrew Layman , Don Reichart , Murali Venkatrao , Frank Pellow , Hamid Pirahesh, Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals, Data Mining and Knowledge Discovery, v.1 n.1, p.29-53, 1997
[doi> 10.1023/A:1009726021843]
|
| |
13
|
Han, J. 1997. OLAP mining: An integration of OLAP with data mining. In Proceedings of the 7th IFIP 2.6 Working Conference on Database Semantics (DS-7). Leysin, Switzerland, 1--9.
|
 |
14
|
Venky Harinarayan , Anand Rajaraman , Jeffrey D. Ullman, Implementing data cubes efficiently, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.205-216, June 04-06, 1996, Montreal, Quebec, Canada
|
| |
15
|
Huhtala, Y., Kärkkäinen, J., Porkka, P., and Toivonen, H. 1999. TANE: An efficient algorithm for discovering functional and approximate dependencies. Comput. J. 42, 2, 100--111.
|
| |
16
|
Kantola, M., Mannila, H., Räihä, K.-J., and Siirtola, H. 1992. Discovering functional and inclusion dependencies in relational databases. Int. J. Intelligent Systems 7, 591--607.
|
| |
17
|
|
| |
18
|
|
| |
19
|
|
| |
20
|
|
| |
21
|
|
| |
22
|
May, W. 1999. Information extraction and integration with Florid: The Mondial case study. Tech. Rep. 131, Universität Freiburg, Institut für Informatik.
|
| |
23
|
|
| |
24
|
Schlimmer, J. C. 1993. Efficiently inducing determinations: A complete and systematic search algorithm that uses optimal pruning. In Proceedings of the 10th International Conference on Machine Learning. Morgan Kaufmann, Amherst, MA, 284--290.
|
| |
25
|
van der Heyden, J. 2001. Global statistics, http://www.geohive.com/. GeoHive.
|
| |
26
|
van der Laag, P. R. J. and Nienhuys-Cheng, S.-H. 1998. Completeness and properness of refinement operators in inductive logic programming. J. Logic Program. 34, 3, 201--225.
|
 |
27
|
|
 |
28
|
|
| |
29
|
|
 |
30
|
Jef Wijsen , Raymond T. Ng , Toon Calders, Discovering roll-up dependencies, Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, p.213-222, August 15-18, 1999, San Diego, California, United States
[doi> 10.1145/312129.312232]
|
| |
31
|
|
|