| TuG synopses for approximate query answering |
| Full text |
Pdf
(1.76 MB)
|
Source
|
ACM Transactions on Database Systems (TODS)
archive
Volume 34 , Issue 1 (April 2009)
table of contents
Article No. 3
Year of Publication: 2009
ISSN:0362-5915
|
|
Authors
|
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 20, Downloads (12 Months): 111, Citation Count: 0
|
|
|
ABSTRACT
This article introduces the Tuple Graph (TuG) synopses, a new class of data summaries that enable accurate approximate answers for complex relational queries. The proposed summarization framework adopts a “semi-structured” view of the relational database, modeling a relational data set as a graph of tuples and join queries as graph traversals, respectively. The key idea is to approximate the structure of the induced data graph in a concise synopsis, and to approximate the answer to a query by performing the corresponding traversal over the summarized graph. We detail the (TuG) synopsis model that is based on this novel approach, and we describe an efficient and scalable construction algorithm for building accurate (TuG) within a specific storage budget. We validate the performance of (TuG) with an extensive experimental study on real-life and synthetic datasets. Our results verify the effectiveness of (TuG) in generating accurate approximate answers for complex join queries, and demonstrate their benefits over existing summarization techniques.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
 |
2
|
Swarup Acharya , Phillip B. Gibbons , Viswanath Poosala , Sridhar Ramaswamy, Join synopses for approximate query answering, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.275-286, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
|
 |
3
|
Rakesh Agrawal , Johannes Gehrke , Dimitrios Gunopulos , Prabhakar Raghavan, Automatic subspace clustering of high dimensional data for data mining applications, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.94-105, June 01-04, 1998, Seattle, Washington, United States
|
 |
4
|
Noga Alon , Yossi Matias , Mario Szegedy, The space complexity of approximating the frequency moments, Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, p.20-29, May 22-24, 1996, Philadelphia, Pennsylvania, United States
[doi> 10.1145/237814.237823]
|
| |
5
|
Anonymous. Details removed due to double-blind reviewing.
|
| |
6
|
|
 |
7
|
Surajit Chaudhuri , Rajeev Motwani , Vivek Narasayya, On random sampling over joins, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.263-274, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
|
| |
8
|
|
 |
9
|
Amol Deshpande , Minos Garofalakis , Rajeev Rastogi, Independence is good: dependency-based histogram synopses for high-dimensional data, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.199-210, May 21-24, 2001, Santa Barbara, California, United States
|
 |
10
|
|
 |
11
|
Tomás Feder , Rajeev Motwani, Clique partitions, graph compression and speeding-up algorithms, Proceedings of the twenty-third annual ACM symposium on Theory of computing, p.123-133, May 05-08, 1991, New Orleans, Louisiana, United States
[doi> 10.1145/103418.103424]
|
 |
12
|
Sumit Ganguly , Phillip B. Gibbons , Yossi Matias , Avi Silberschatz, Bifocal sampling for skew-resistant join size estimation, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.271-281, June 04-06, 1996, Montreal, Quebec, Canada
|
 |
13
|
Lise Getoor , Benjamin Taskar , Daphne Koller, Selectivity estimation using probabilistic models, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.461-472, May 21-24, 2001, Santa Barbara, California, United States
|
| |
14
|
Gilbert, A. C. 2004. Compressing network graphs. In Proceedings of the LinkKDD Workshop at the 10th ACM Conference on KDD.
|
| |
15
|
|
| |
16
|
|
 |
17
|
Yossi Matias , Jeffrey Scott Vitter , Min Wang, Wavelet-based histograms for selectivity estimation, Proceedings of the 1998 ACM SIGMOD international conference on Management of data, p.448-459, June 01-04, 1998, Seattle, Washington, United States
|
 |
18
|
|
| |
19
|
|
 |
20
|
Viswanath Poosala , Peter J. Haas , Yannis E. Ioannidis , Eugene J. Shekita, Improved histograms for selectivity estimation of range predicates, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.294-305, June 04-06, 1996, Montreal, Quebec, Canada
|
 |
21
|
|
| |
22
|
|
 |
23
|
Tian Zhang , Raghu Ramakrishnan , Miron Livny, BIRCH: an efficient data clustering method for very large databases, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.103-114, June 04-06, 1996, Montreal, Quebec, Canada
|
|