ABSTRACT
Rank-aware query processing has emerged as a key requirement in modern applications. In these applications, efficient and adaptive evaluation of top-k queries is an integral part of the application semantics. In this article, we introduce a rank-aware query optimization framework that fully integrates rank-join operators into relational query engines. The framework is based on extending the System R dynamic programming algorithm in both enumeration and pruning. We define ranking as an interesting physical property that triggers the generation of rank-aware query plans. Unlike traditional join operators, optimizing for rank-join operators depends on estimating the input cardinality of these operators. We introduce a probabilistic model for estimating the input cardinality, and hence the cost of a rank-join operator. To our knowledge, this is the first effort in estimating the needed input size for optimal rank aggregation algorithms. Costing ranking plans is key to the full integration of rank-join operators in real-world query processing engines.

Since optimal execution strategies picked by static query optimizers lose their optimality due to estimation errors and unexpected changes in the computing environment, we introduce several adaptive execution strategies for top-k queries that respond to these unexpected changes and costing errors. Our reactive reoptimization techniques change the execution plan at runtime to significantly enhance the performance of running queries. Since top-k query plans are usually pipelined and maintain a complex ranking state, altering the execution strategy of a running ranking query is an important and challenging task.

We conduct an extensive experimental study to evaluate the performance of the proposed framework. The experimental results are twofold: (1) we show the effectiveness of our cost-based approach of integrating ranking plans in dynamic programming cost-based optimizers; and (2) we show a significant speedup (up to 300%) when using our adaptive execution of ranking plans over the state-of-the-art mid-query reoptimization strategies.