Exploration and visualization of OLAP cubes with statistical tests
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the ACM SIGKDD Workshop on Visual Analytics and Knowledge Discovery: Integrating Automated Analysis with Interactive Exploration
Paris, France
Pages 46-55  
Year of Publication: 2009
ISBN: 978-1-60558-670-0
Authors
Carlos Ordonez, University of Houston, Houston, TX
Zhibo Chen, University of Houston, Houston, TX
Sponsors
PASCAL2 - Pattern Analysis, Statistical Modelling and Computational Learning
Helsinki Institute for Information Technology HIIT
VisMaster, a European FP7 Coordination Action Project focused on Visual Analytics
Danube University Krems, Department of Information and Knowledge Engineering (DUK)
National Visualization and Analytics Center (NVAC)
Publisher
ACM, New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 20,   Downloads (12 Months): 47,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1562849.1562855

ABSTRACT

In On-Line Analytical Processing (OLAP), users explore a database cube with roll-up and drill-down operations in order to find interesting results. Most approaches rely on simple aggregations and value comparisons in order to validate findings. In this work, we propose to combine OLAP dimension lattice traversal and statistical tests to discover significant metric differences between highly similar groups. A parametric statistical test allows pair-wise comparison of neighboring cells in cuboids, providing statistical evidence about the validity of findings. We introduce a two-dimensional checkerboard visualization of the cube that allows interactive exploration to understand significant measure differences between two cuboids differing in one dimension along with associated image data. Our system is tightly integrated into a relational DBMS, by dynamically generating SQL code, which incorporates several optimizations to efficiently explore the cube, to visualize discovered cell pairs and to view associated images. We present an experimental evaluation with medical data sets focusing on finding significant relationships between risk factors and disease.
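
As an illustration only: the abstract does not name the parametric test or show the generated SQL, so the sketch below assumes a two-sample z-test computed from per-cell summary statistics (count, mean, variance), which is one common parametric choice. The cell keys, dimension values, and measure values are hypothetical toy data, not taken from the paper's experiments, and the function names are invented for this example.

```python
import math
from itertools import combinations


def summarize(values):
    """Return (count, mean, sample variance) for one cell's measure values."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return n, mean, var


def z_test(n1, mean1, var1, n2, mean2, var2):
    """Two-sample z statistic and two-sided p-value (normal approximation).

    Assumed test; the paper's actual test may differ.
    """
    se = math.sqrt(var1 / n1 + var2 / n2)
    z = (mean1 - mean2) / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


def neighbor_pairs(cells):
    """Yield pairs of cell keys that differ in exactly one dimension value."""
    for a, b in combinations(sorted(cells), 2):
        if sum(x != y for x, y in zip(a, b)) == 1:
            yield a, b


if __name__ == "__main__":
    # Hypothetical cuboid over (gender, smoker) with a numeric measure per record.
    cuboid = {
        ("F", "no"): [1.0, 1.2, 0.9, 1.1, 1.0],
        ("F", "yes"): [1.8, 2.1, 1.9, 2.2, 2.0],
        ("M", "no"): [1.1, 1.0, 1.2, 0.8, 1.0],
    }
    stats = {key: summarize(vals) for key, vals in cuboid.items()}
    for a, b in neighbor_pairs(stats):
        z, p = z_test(*stats[a], *stats[b])
        flag = "significant" if p < 0.05 else "not significant"
        print(a, b, f"z={z:.2f}", f"p={p:.4f}", flag)
```

Working from counts, means, and variances rather than raw rows reflects the kind of quantities a SQL `GROUP BY` could return per cell; whether the authors' system computes the test this way is not stated in the abstract.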


