Quasi-cubes: exploiting approximations in multidimensional databases
Source: ACM SIGMOD Record
Volume 26, Issue 3 (September 1997)
Pages: 12-17
Year of Publication: 1997
ISSN:0163-5808
Authors
Daniel Barbará, Bell Communications Research, 445 South St., Morristown, N.J.
Mark Sullivan, Juno Online Services, 120 West 45th Street, 39th floor, New York, NY
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/262762.262764

ABSTRACT

A data cube is a popular organization for summary data. A cube is simply a multidimensional structure that contains at each point an aggregate value, i.e., the result of applying an aggregate function to an underlying relation. In practical situations, cubes can require a large amount of storage. The typical approach to reducing storage cost is to materialize parts of the cube on demand. Unfortunately, this lazy evaluation can be a time-consuming operation. In this paper, we describe an approximation technique that reduces the storage cost of the cube without incurring the run time cost of lazy evaluation. The idea is to provide an incomplete description of the cube and a method of estimating the missing entries with a certain level of accuracy. The description, of course, should take a fraction of the space of the full cube and the estimation procedure should be faster than computing the data from the underlying relations. Since cubes are used to support data analysis and analysts are rarely interested in the precise values of the aggregates (but rather in trends), providing approximate answers is, in most cases, a satisfactory compromise. Alternatively, the technique can be used to implement a multiresolution system in which a tradeoff is established between the execution time of queries and the errors the user is willing to tolerate. By only going to the disk when it is necessary (to reduce the errors), the query can be executed faster. This idea can be extended to produce a system that incrementally increases the accuracy of the answer while the user is looking at it, supporting on-line aggregation.
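
The idea of an incomplete description plus an estimation method can be illustrated with a minimal sketch. It is not the authors' technique: the abstract does not say which model the paper uses, so this sketch assumes a simple additive model (row mean plus column mean minus grand mean) over a two-dimensional cube, and keeps exact values only for cells the model misses by more than a chosen tolerance. The names `CubeDescription`, `describe`, and `estimate` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class CubeDescription:
    """Compact stand-in for a 2-D cube (assumed additive model, not the paper's)."""
    grand_mean: float
    row_means: List[float]
    col_means: List[float]
    # Cells the model cannot estimate within the tolerance are stored exactly.
    exact_cells: Dict[Tuple[int, int], float] = field(default_factory=dict)


def _model(desc: CubeDescription, i: int, j: int) -> float:
    """Additive estimate: row effect + column effect - grand mean."""
    return desc.row_means[i] + desc.col_means[j] - desc.grand_mean


def describe(cube: List[List[float]], tolerance: float) -> CubeDescription:
    """Build the description, retaining only cells whose error exceeds tolerance."""
    n_rows, n_cols = len(cube), len(cube[0])
    grand = sum(sum(row) for row in cube) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in cube]
    col_means = [sum(cube[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    desc = CubeDescription(grand, row_means, col_means)
    for i in range(n_rows):
        for j in range(n_cols):
            if abs(_model(desc, i, j) - cube[i][j]) > tolerance:
                desc.exact_cells[(i, j)] = cube[i][j]
    return desc


def estimate(desc: CubeDescription, i: int, j: int) -> float:
    """Answer a point query without touching the underlying relation."""
    return desc.exact_cells.get((i, j), _model(desc, i, j))


# Illustrative data only: a 3x3 cube of aggregates with one anomalous cell.
sales = [[10.0, 20.0, 30.0],
         [20.0, 30.0, 40.0],
         [30.0, 40.0, 500.0]]
desc = describe(sales, tolerance=60.0)
print(len(desc.exact_cells), "of 9 cells stored exactly")
print("estimate for cell (0, 1):", estimate(desc, 0, 1))
```

By construction, every estimate returned here is within `tolerance` of the true cell value. Raising the tolerance stores fewer exact cells at the cost of larger errors, which mirrors the tradeoff between storage, query time, and accuracy that the abstract describes.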

