ABSTRACT
In On-Line Analytical Processing (OLAP), users explore a database cube with roll-up and drill-down operations to find interesting results. Most approaches rely on simple aggregations and value comparisons to validate findings. In this work, we propose combining OLAP dimension lattice traversal with statistical tests to discover significant metric differences between highly similar groups. A parametric statistical test allows pairwise comparison of neighboring cells in cuboids, providing statistical evidence about the validity of findings. We introduce a two-dimensional checkerboard visualization of the cube that supports interactive exploration of significant measure differences between two cuboids differing in one dimension, along with associated image data. Our system is tightly integrated into a relational DBMS: it dynamically generates SQL code that incorporates several optimizations to efficiently explore the cube, visualize discovered cell pairs, and view associated images. We present an experimental evaluation with medical data sets, focusing on finding significant relationships between risk factors and disease.
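The pairwise comparison of neighboring cells can be illustrated with a minimal sketch. The abstract does not say which parametric test is used, so the example assumes a two-sample z statistic on cell means with unpooled variances, computed from per-cell aggregates (count, sum, sum of squares) such as a SQL GROUP BY could return. The function names (`cell_stats`, `z_statistic`, `two_sided_p`, `significant_pairs`), the row layout, and the 0.05 threshold are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def cell_stats(rows, dims, measure):
    """Aggregate rows into cuboid cells: {key: (n, sum, sum_of_squares)}."""
    cells = {}
    for row in rows:
        key = tuple(row[d] for d in dims)
        n, s, q = cells.get(key, (0, 0.0, 0.0))
        x = float(row[measure])
        cells[key] = (n + 1, s + x, q + x * x)
    return cells


def z_statistic(a, b):
    """Two-sample z statistic for the difference of means (assumed test)."""
    n1, s1, q1 = a
    n2, s2, q2 = b
    m1, m2 = s1 / n1, s2 / n2
    # Sample variances recovered from the aggregates.
    v1 = (q1 - n1 * m1 * m1) / (n1 - 1)
    v2 = (q2 - n2 * m2 * m2) / (n2 - 1)
    se = math.sqrt(v1 / n1 + v2 / n2)
    if se == 0.0:
        return None  # both cells are constant; the statistic is undefined
    return (m1 - m2) / se


def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def significant_pairs(cells, alpha=0.05):
    """Test every pair of cells whose keys differ in exactly one dimension."""
    found = []
    for (k1, a), (k2, b) in combinations(sorted(cells.items()), 2):
        if sum(x != y for x, y in zip(k1, k2)) != 1:
            continue  # not neighboring cells
        if a[0] < 2 or b[0] < 2:
            continue  # too few rows to estimate a variance
        z = z_statistic(a, b)
        if z is not None and two_sided_p(z) < alpha:
            found.append((k1, k2, z))
    return found


if __name__ == "__main__":
    # Made-up toy rows, for illustration only.
    rows = [
        {"smoker": "N", "sex": "M", "risk": 1.0},
        {"smoker": "N", "sex": "M", "risk": 2.0},
        {"smoker": "N", "sex": "M", "risk": 3.0},
        {"smoker": "Y", "sex": "M", "risk": 11.0},
        {"smoker": "Y", "sex": "M", "risk": 12.0},
        {"smoker": "Y", "sex": "M", "risk": 13.0},
    ]
    for k1, k2, z in significant_pairs(cell_stats(rows, ("smoker", "sex"), "risk")):
        print(k1, k2, round(z, 3))
```

Working from count, sum, and sum of squares keeps the test computable from a single aggregation pass per cuboid, which fits the in-DBMS setting the abstract describes; a t-test or another parametric test could be substituted without changing the overall structure.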