|
ABSTRACT
In this paper we propose a lattice-based approach intended for extracting semantics from datacubes: borders of version spaces for supervised classification, closed cube lattice to summarize the semantics of datacubes w.r.t. COUNT, SUM, and covering graph of the quotient cube as a visualization tool of minimal multidimensional associations. With this intention, we introduce two novel concepts: the cube transversals and the cube closures over the cube lattice of a categorical database relation. We propose a levelwise merging algorithm for mining minimal cube transversals with a single database scan. We introduce the cube connection, show that it is a Galois connection and derive a closure operator over the cube lattice. Using cube transversals and closures, we define a new characterization of boundary sets which provide a condensed representation of version spaces used to enhance supervised classification. The algorithm designed for computing such borders improves the complexity of previous proposals. We also introduce the concept of closed cube lattice and show that it is isomorph to on one hand the Galois lattice and on the other hand the quotient cube w.r.t. COUNT, SUM. Proposed in [16], the quotient cube is a succinct summary of a datacube preserving the Rollup/Drilldown semantics. We show that the quotient cube w.r.t. COUNT, SUM and the closed cube lattice have a similar expression power but the latter has the smallest possible size. Finally we focus on the multidimensional association issue and introduce the covering graph of the quotient cube which provides the user with a visualization tool of minimal multidimensional associations.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
C. Berge. Hypergraphs: combinatorics of finite sets. North-Holland, Amsterdam, 1989.
|
| |
3
|
|
| |
4
|
A. Casali, R. Cicchetti, and L. Lakhal. Cube Lattices: a Framework for Multidimensional Data Mining. In Proceedings of the 3rd SIAM International Conference on Data Mining, SDM, pages 304--308, 2003.
|
| |
5
|
A. Casali, R. Cicchetti, and L. Lakhal. Lattice-Based Discovery of Semantics from Datacubes. Technical report, Université de la Méditerranée, 2003.
|
| |
6
|
A. Casali, R. Cicchetti, and L. Lakhal. Mining Concise Représentations of Frequent Multidimensional Patterns. In Proceedings of the 11th International Conference on Conceptual Structures, ICCS, 2003.
|
| |
7
|
|
| |
8
|
|
| |
9
|
|
| |
10
|
Jim Gray , Surajit Chaudhuri , Adam Bosworth , Andrew Layman , Don Reichart , Murali Venkatrao , Frank Pellow , Hamid Pirahesh, Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals, Data Mining and Knowledge Discovery, v.1 n.1, p.29-53, 1997
[doi> 10.1023/A:1009726021843]
|
 |
11
|
Dimitrios Gunopulos , Heikki Mannila , Roni Khardon , Hannu Toivonen, Data mining, hypergraph transversals, and machine learning (extended abstract), Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.209-216, May 11-15, 1997, Tucson, Arizona, United States
[doi> 10.1145/263661.263684]
|
| |
12
|
|
 |
13
|
Jiawei Han , Jian Pei , Guozhu Dong , Ke Wang, Efficient computation of Iceberg cubes with complex measures, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.1-12, May 21-24, 2001, Santa Barbara, California, United States
|
| |
14
|
H. Hirsh. Theoretical Underpinnings of Version Spaces. In Proceedings of the 12th International Joint Conference on Artificial Intelligence, IJCAI, pages 665--670, 1991.
|
| |
15
|
H. Hirsh. The Computational Compexity of the Candidate-Elimination Algorithm. Technical report, Rutgers Univeristy, 1992.
|
| |
16
|
L. Lakshmanan, J. Pei, and J. Han. Quotient Cube: How to Summarize the Semantics of a Data Cube. In Proceedings of the 28th International Conference on Very Large Databases, VLDB, pages 778--789, 2002.
|
| |
17
|
|
| |
18
|
|
| |
19
|
|
| |
20
|
|
| |
21
|
|
| |
22
|
|
| |
23
|
|
 |
24
|
|
| |
25
|
J. Pei, J. Han, and R. Mao. CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets. In Workshop on Research Issues in Data Mining and Knowledge Discovery, DMKD, pages 21--30, 2000.
|
| |
26
|
J. Pfaltz and R. Jamison. Closure systems and their structure. In Information Sciences, volume 139(3--4), pages 275--286, 2001.
|
| |
27
|
|
 |
28
|
|
| |
29
|
|
| |
30
|
|
| |
31
|
M. Zaki and C. Hsio. CHARM: An Efficient Algorithm for Closed Itemset Mining. In Proceedings of the 2nd SIAM International Conference on Data mining, 2002.
|
 |
32
|
|
CITED BY
|
|
Sébastien Nedjar , Alain Casali , Rosine Cicchetti , Lotfi Lakhal, Emerging Cubes: Borders, size estimations and lossless reductions, Information Systems, v.34 n.6, p.536-550, September, 2009
|
|