|
ABSTRACT
This paper presents a principled framework for efficient processing of ad-hoc top-k (ranking) aggregate queries, which provide the k groups with the highest aggregates as results. Essential support of such queries is lacking in current systems, which process the queries in a naïve materialize-group-sort scheme that can be prohibitively inefficient. Our framework is based on three fundamental principles. The Upper-Bound Principle dictates the requirements of early pruning, and the Group-Ranking and Tuple-Ranking Principles dictate group-ordering and tuple-ordering requirements. They together guide the query processor toward a provably optimal tuple schedule for aggregate query processing. We propose a new execution framework to apply the principles and requirements. We address the challenges in realizing the framework and implementing new query operators, enabling efficient group-aware and rank-aware query plans. The experimental study validates our framework by demonstrating orders of magnitude performance improvement in the new query plans, compared with the traditional plans.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
F. N. Afrati and R. Chirkova. Selecting and using views to compute aggregate queries (extended abstract). In ICDT, pages 383--397, 2005.
|
| |
2
|
Sameet Agarwal , Rakesh Agrawal , Prasad Deshpande , Ashish Gupta , Jeffrey F. Naughton , Raghu Ramakrishnan , Sunita Sarawagi, On the Computation of Multidimensional Aggregates, Proceedings of the 22th International Conference on Very Large Data Bases, p.506-521, September 03-06, 1996
|
 |
3
|
|
| |
4
|
N. Bruno, L. Gravano, and A. Marian. Evaluating top-k queries over web-accessible databases. In ICDE, 2002.
|
 |
5
|
|
 |
6
|
|
| |
7
|
|
| |
8
|
S. Chaudhuri, R. Ramakrishnan, and G. Weikum. Integrating DB and IR technologies: What is the sound of one hand clapping? In CIDR, pages 1--12, 2005.
|
| |
9
|
|
| |
10
|
|
 |
11
|
Sara Cohen , Werner Nutt , Alexander Serebrenik, Rewriting aggregate queries using views, Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.155-166, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
[doi> 10.1145/303976.303992]
|
| |
12
|
|
 |
13
|
|
 |
14
|
|
| |
15
|
|
| |
16
|
Jim Gray , Surajit Chaudhuri , Adam Bosworth , Andrew Layman , Don Reichart , Murali Venkatrao , Frank Pellow , Hamid Pirahesh, Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals, Data Mining and Knowledge Discovery, v.1 n.1, p.29-53, 1997
[doi> 10.1023/A:1009726021843]
|
| |
17
|
|
| |
18
|
|
 |
19
|
|
 |
20
|
Jiawei Han , Jian Pei , Guozhu Dong , Ke Wang, Efficient computation of Iceberg cubes with complex measures, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.1-12, May 21-24, 2001, Santa Barbara, California, United States
|
 |
21
|
Venky Harinarayan , Anand Rajaraman , Jeffrey D. Ullman, Implementing data cubes efficiently, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.205-216, June 04-06, 1996, Montreal, Quebec, Canada
|
 |
22
|
Joseph M. Hellerstein , Peter J. Haas , Helen J. Wang, Online aggregation, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.171-182, May 11-15, 1997, Tucson, Arizona, United States
|
| |
23
|
I. F. Ilyas, W. G. Aref, and A. K. Elmagarmid. Supporting top-k join queries in relational databases. In VLDB, pages 754--765, 2003.
|
 |
24
|
Ihab F. Ilyas , Rahul Shah , Walid G. Aref , Jeffrey Scott Vitter , Ahmed K. Elmagarmid, Rank-aware query optimization, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, June 13-18, 2004, Paris, France
[doi> 10.1145/1007568.1007593]
|
| |
25
|
C. Li, K. C.-C. Chang, and I. F. Ilyas. Efficient processing of ad-hoc top-k aggregate queries in OLAP. Technical Report UIUCDCS-R-2005-2596, Department of Computer Science, UIUC, June 2005. http://aim.cs.uiuc.edu.
|
 |
26
|
|
| |
27
|
H.-G. Li, H. Yu, D. Agrawal, , and A. E. Abbadi. Ranking aggregates. Technical report, UCSB, July 2004.
|
| |
28
|
|
| |
29
|
T. Neumann and G. Moerkotte. A combined framework for grouping and order optimization. In VLDB, 2004.
|
| |
30
|
|
 |
31
|
David Simmen , Eugene Shekita , Timothy Malkemus, Fundamental techniques for order optimization, Proceedings of the 1996 ACM SIGMOD international conference on Management of data, p.57-67, June 04-06, 1996, Montreal, Quebec, Canada
|
| |
32
|
|
| |
33
|
A. Tsois and T. K. Sellis. The generalized pre-grouping transformation: Aggregate-query optimization in the presence of dependencies. In VLDB, pages 644--655, 2003.
|
| |
34
|
|
| |
35
|
|
 |
36
|
Yihong Zhao , Prasad M. Deshpande , Jeffrey F. Naughton, An array-based algorithm for simultaneous multidimensional aggregates, Proceedings of the 1997 ACM SIGMOD international conference on Management of data, p.159-170, May 11-15, 1997, Tucson, Arizona, United States
|
CITED BY 12
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ka Cheung Sia , Junghoo Cho , Yun Chi , Belle L. Tseng, Efficient computation of personal aggregate queries on blogs, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|