Efficient and effective explanation of change in hierarchical summaries
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
San Jose, California, USA
SESSION: Research track papers
Pages: 6 - 15  
Year of Publication: 2007
ISBN: 978-1-59593-609-7
Authors
Deepak Agarwal  Yahoo! Research
Dhiman Barman  University of California: Riverside
Dimitrios Gunopulos  University of California: Riverside
Neal E. Young  University of California: Riverside
Flip Korn  AT&T Labs-Research
Divesh Srivastava  AT&T Labs-Research
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 14,   Downloads (12 Months): 136,   Citation Count: 4
DOI: http://doi.acm.org/10.1145/1281192.1281197

ABSTRACT

Dimension attributes in data warehouses are typically hierarchical (e.g., geographic locations in sales data, URLs in Web traffic logs). OLAP tools are used to summarize the measure attributes (e.g., total sales) along a dimension hierarchy, and to characterize changes (e.g., trends and anomalies) in a hierarchical summary over time. When the number of changes identified is large (e.g., total sales in many stores differed from their expected values), a parsimonious explanation of the most significant changes is desirable. In this paper, we propose a natural model of parsimonious explanation, as a composition of node weights along the root-to-leaf paths in a dimension hierarchy, which permits changes to be aggregated with maximal generalization along the dimension hierarchy. We formalize this model of explaining changes in hierarchical summaries and investigate the problem of identifying optimally parsimonious explanations on arbitrary rooted one-dimensional tree hierarchies. We show that such explanations can be computed efficiently in time essentially proportional to the number of leaves and the depth of the hierarchy. Further, our method can produce parsimonious explanations from the output of any statistical model that provides predictions and confidence intervals, making it widely applicable. Our experiments use real data sets to demonstrate the utility and robustness of our proposed model for explaining significant changes, as well as its superior parsimony compared to alternatives.
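
As a rough illustration of the model described in the abstract, the sketch below composes node weights along root-to-leaf paths and checks the result against per-leaf tolerance intervals. It rests on several assumptions that the abstract does not state: the composition is taken to be multiplicative, a weight of 1.0 is taken to mean "no change", and the toy hierarchy, the intervals, and all function names are hypothetical. It only evaluates candidate explanations; it does not implement the paper's algorithm for finding an optimal one.

```python
# Hypothetical toy hierarchy (child -> parent); the root has parent None.
PARENT = {
    "All": None,
    "West": "All",
    "East": "All",
    "CA": "West",
    "OR": "West",
    "NY": "East",
    "NJ": "East",
}

# Hypothetical tolerated change ratios per leaf, e.g. derived from the
# confidence intervals of some forecasting model.
INTERVALS = {
    "CA": (1.9, 2.1),
    "OR": (1.9, 2.1),
    "NY": (0.95, 1.05),
    "NJ": (2.9, 3.1),
}


def path_to_root(node):
    """Return the nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def composed_change(leaf, weights):
    """Multiply the weights on the leaf's path to the root.

    Nodes absent from `weights` default to 1.0 (assumed to mean "no change").
    """
    result = 1.0
    for node in path_to_root(leaf):
        result *= weights.get(node, 1.0)
    return result


def explains(weights, intervals):
    """True if every leaf's composed change lies inside its interval."""
    return all(
        lo <= composed_change(leaf, weights) <= hi
        for leaf, (lo, hi) in intervals.items()
    )


def explanation_size(weights):
    """Count the non-trivial weights; fewer means more parsimonious."""
    return sum(1 for w in weights.values() if w != 1.0)


# Two candidate explanations of the same observed changes.
leaf_level = {"CA": 2.0, "OR": 2.0, "NJ": 3.0}   # one weight per changed leaf
generalized = {"West": 2.0, "NJ": 3.0}           # CA and OR rolled up to West

if __name__ == "__main__":
    for name, weights in [("leaf_level", leaf_level), ("generalized", generalized)]:
        print(name, explains(weights, INTERVALS), explanation_size(weights))
```

Both candidates fit the intervals, but the second does so with fewer non-trivial weights because the change shared by CA and OR is stated once at their common ancestor. This is the sense in which generalizing along the hierarchy can shorten an explanation.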



