ACM Home Page
Please provide us with feedback. Feedback
Multiplicative synopses for relative-error metrics
Full text PdfPdf (726 KB)
Source Extending Database Technology; Vol. 360 archive
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology table of contents
Saint Petersburg, Russia
SESSION: Research sessions: Multi-dimensional table of contents
Pages 756-767  
Year of Publication: 2009
ISBN:978-1-60558-422-5
Author
Panagiotis Karras  National University of Singapore
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 6,   Downloads (12 Months): 35,   Citation Count: 0
Additional Information:

abstract   references   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1516360.1516447
What is a DOI?

ABSTRACT

Existing hierarchical summarization techniques fail to provide synopses good in terms of relative-error metrics. This paper introduces multiplicative synopses: a summarization paradigm tailored for effective relative-error summarization. This paradigm is inspired from previous hierarchical index-based summarization schemes, but goes beyond them by altering their underlying data representation mechanism. Existing schemes have decomposed the summarized data based on sums and differences of values, resulting in what we call additive synopses. We argue that the incapacity of these models to handle relative-error metrics stems exactly from this additive nature of their representation mechanism. We substitute this additive nature by a multiplicative one. We argue that this is more appropriate for achieving low-relative-error data approximations. We develop an efficient linear-time dynamic programming scheme for one-dimensional multiplicative synopsis construction under general relative-error-based metrics, and a special scheme for the case of maximum relative error. We generalize our schemes to higher data dimensionality and we show a surprising additional benefit gained by our special scheme for maximum relative error in this case. In our experimental study, we verify the higher efficacy of our model on relative-error-oriented summarization problems.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
3
 
4
5
 
6
S. Chen and A. Nucci. Dynamic nonuniform data approximation in databases with Haar wavelet. Journal of Computers, 2(8):64--76, 2007 (also DCC 2007).
 
7
G. Cormode, M. Garofalakis, and D. Sacharidis. Fast approximate wavelet tracking on streams. In EDBT, 2006.
8
9
10
11
12
13
14
 
15
 
16
S. Guha and B. Harb. Approximation algorithms for wavelet transform coding of data streams. IEEE Transactions on Information Theory, 54(2):811--830, 2008 (also SODA 2006).
 
17
 
18
19
20
 
21
 
22
 
23
 
24
25
 
26
 
27
 
28
P. Karras and N. Mamoulis. The Haar+ tree: a refined synopsis data structure. In ICDE, 2007.
29
 
30
31
32
 
33
34
 
35
M. Muralikrishna and D. J. DeWitt. Equi-depth histograms for estimating selectivity factors for multi-dimensional queries. In SIGMOD, 1988.
 
36
S. Muthukrishnan. Subquadratic algorithms for workload-aware Haar wavelet synopses. In FSTTCS, 2005.
 
37
 
38
 
39
S. Muthukrishnan, M. Strauss, and X. Zheng. Workload-optimal histograms on streams. In ESA, 2005.
 
40
41
 
42
V. Poosala, V. Ganti, and Y. E. Ioannidis. Approximate query answering using histograms. IEEE Data Eng. Bull., 22(4):5--14, 1999.
 
43
44
45
 
46
 
47
 
48
E. Terzi and P. Tsaparas. Efficient algorithms for sequence segmentation. In SIAM SDM, 2006.
49
50