Pruning attribute values from data cubes with diamond dicing
Source
ACM International Conference Proceeding Series; Vol. 299
Proceedings of the 2008 international symposium on Database engineering & applications
Coimbra, Portugal
SESSION: Data management
Pages 121-129  
Year of Publication: 2008
ISBN: 978-1-60558-188-0
Authors
Hazel Webb  University of New Brunswick
Owen Kaser  University of New Brunswick
Daniel Lemire  Université du Québec à Montréal
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1451940.1451958

ABSTRACT

Data stored in a data warehouse are inherently multidimensional, but most data-pruning techniques (such as iceberg and top-k queries) are unidimensional. However, analysts need to issue multidimensional queries. For example, an analyst may need to select not just the most profitable stores or, separately, the most profitable products, but simultaneous sets of stores and products fulfilling some profitability constraints. To fill this need, we propose a new operator, the diamond dice. Because of the interaction between dimensions, the computation of diamonds is challenging. We present the first diamond-dicing experiments on large data sets. Our external memory algorithm avoids potentially expensive random accesses. Experiments show that we can compute diamond cubes over fact tables containing 100 million facts and 500,000 distinct attribute values in less than an hour using a single-core PC.
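
To illustrate why the dimensions interact, here is a minimal in-memory sketch of one way a diamond dice could be computed. It is not the external-memory algorithm of the paper. It assumes a SUM aggregate over non-negative measures and one minimum slice total per dimension; the function name `diamond_dice`, the sample stores and products, and the thresholds are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def diamond_dice(facts, thresholds):
    """Prune attribute values until every remaining slice meets its threshold.

    facts: dict mapping a tuple of attribute values (one per dimension)
           to a non-negative numeric measure (assumption: SUM aggregate).
    thresholds: one minimum slice total per dimension (illustrative).
    """
    cells = dict(facts)
    changed = True
    while changed:
        changed = False
        for dim, k in enumerate(thresholds):
            # Total the measure of each attribute value along this dimension.
            totals = defaultdict(float)
            for key, measure in cells.items():
                totals[key[dim]] += measure
            # Attribute values whose slice total falls below the threshold.
            weak = {value for value, total in totals.items() if total < k}
            if weak:
                cells = {key: m for key, m in cells.items()
                         if key[dim] not in weak}
                changed = True  # a removal may invalidate other dimensions
    return cells


# Hypothetical profit figures keyed by (store, product).
profit = {
    ("A", "x"): 50, ("A", "y"): 40,
    ("B", "x"): 45, ("B", "y"): 30,
    ("C", "y"): 20, ("C", "z"): 5,
}

# Keep stores totalling at least 40 and products totalling at least 80.
result = diamond_dice(profit, thresholds=(40, 80))
print(result)
```

In this sample, product `y` totals 90 on its own and would pass a one-dimensional filter, but once store `C` is pruned its total drops to 70 and it is removed as well. This cascading effect is the interaction between dimensions that the abstract mentions.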


