|
ABSTRACT
This paper introduces RankSQL, a system that provides a systematic and principled framework to support efficient evaluations of ranking (top-k) queries in relational database systems (RDBMS), by extending relational algebra and query optimization. Previously, top-k query processing is studied in the middleware scenario or in RDBMS in a "piecemeal" fashion, i.e., focusing on specific operator or sitting outside the core of query engines. In contrast, we aim to support ranking as a first-class database construct. As a key insight, the new ranking relationship can be viewed as another logical property of data, parallel to the "membership" property of relational data model. While membership is essentially supported in RDBMS, the same support for ranking is clearly lacking. We address the fundamental integration of ranking in RDBMS in a way similar to how membership, i.e., Boolean filtering, is supported. We extend relational algebra by proposing a rank-relational model to capture the ranking property, and introducing new and extended operators to support ranking as a first-class construct. Enabled by the extended algebra, we present a pipelined and incremental execution model of ranking query plans (that cannot be expressed traditionally) based on a fundamental ranking principle. To optimize top-k queries, we propose a dimensional enumeration algorithm to explore the extended plan space by enumerating plans along two dual dimensions: ranking and membership. We also propose a sampling-based method to estimate the cardinality of rank-aware operators, for costing plans. Our experiments show the validity of our framework and the accuracy of the proposed estimation model.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Shivnath Babu , Rajeev Motwani , Kamesh Munagala , Itaru Nishizawa , Jennifer Widom, Adaptive ordering of pipelined stream filters, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, June 13-18, 2004, Paris, France
[doi> 10.1145/1007568.1007615]
|
| |
2
|
N. Bruno, L. Gravano, and A. Marian. Evaluating top-k queries over web-accessible databases. In ICDE, 2002.
|
 |
3
|
|
 |
4
|
|
 |
5
|
Yuan-Chi Chang , Lawrence Bergman , Vittorio Castelli , Chung-Sheng Li , Ming-Ling Lo , John R. Smith, The onion technique: indexing for linear optimization queries, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.391-402, May 15-18, 2000, Dallas, Texas, United States
|
| |
6
|
|
 |
7
|
Surajit Chaudhuri , Rajeev Motwani , Vivek Narasayya, On random sampling over joins, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.263-274, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
|
| |
8
|
|
| |
9
|
|
 |
10
|
|
 |
11
|
|
| |
12
|
G. Graefe. The cascades framework for query optimization. IEEE Data Eng. Bull., 18(3): 19--29, 1995.
|
| |
13
|
|
| |
14
|
S. Guha, D. Gunopulos, N. Koudas, D. Srivastava, and M. Vlachos. Efficient approximation of optimization queries under parametric aggregation constraints. In VLDB, pages 778--789, 2003.
|
| |
15
|
S. Guha, N. Koudas, A. Marathe, and D. Srivastava. Merging the results of approximate match operations. In VLDB, pages 636--647, 2004.
|
| |
16
|
|
 |
17
|
|
 |
18
|
|
| |
19
|
|
 |
20
|
Vagelis Hristidis , Nick Koudas , Yannis Papakonstantinou, PREFER: a system for the efficient execution of multi-parametric ranked queries, Proceedings of the 2001 ACM SIGMOD international conference on Management of data, p.259-270, May 21-24, 2001, Santa Barbara, California, United States
|
| |
21
|
I. F. Ilyas, W. G. Aref, and A. K. Elmagarmid. Joining ranked inputs in practice. In VLDB, pages 950--961, 2002.
|
| |
22
|
I. F. Ilyas, W. G. Aref, and A. K. Elmagarmid. Supporting top-k join queries in relational databases. In VLDB, pages 754--765, 2003.
|
 |
23
|
Ihab F. Ilyas , Rahul Shah , Walid G. Aref , Jeffrey Scott Vitter , Ahmed K. Elmagarmid, Rank-aware query optimization, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, June 13-18, 2004, Paris, France
[doi> 10.1145/1007568.1007593]
|
| |
24
|
W. Kießling. Foundations of preferences in database systems. In VLDB, pages 311--322, 2002.
|
| |
25
|
|
| |
26
|
|
| |
27
|
S. Raghavan and H. Garcia-Molina. Complex queries over web repositories. In VLDB, pages 33--44, 2003.
|
 |
28
|
P. Griffiths Selinger , M. M. Astrahan , D. D. Chamberlin , R. A. Lorie , T. G. Price, Access path selection in a relational database management system, Proceedings of the 1979 ACM SIGMOD international conference on Management of data, May 30-June 01, 1979, Boston, Massachusetts
[doi> 10.1145/582095.582099]
|
| |
29
|
P. Tsaparas, T. Palpanas, Y. Kotidis, N. Koudas, and D. Srivastava. Ranked join indices. In ICDE, 2003.
|
| |
30
|
|
| |
31
|
K. Yi, H. Yu, J. Yang, G. Xia, and Y. Chen. Efficient maintenance of materialized top-k views. In ICDE, 2003.
|
CITED BY 34
|
|
|
|
|
|
|
|
|
|
|
Ihab F. Ilyas , Walid G. Aref , Ahmed K. Elmagarmid , Hicham G. Elmongui , Rahul Shah , Jeffrey Scott Vitter, Adaptive rank-aware query optimization in relational databases, ACM Transactions on Database Systems (TODS), v.31 n.4, p.1257-1304, December 2006
|
|
|
|
|
|
Holger Bast , Debapriyo Majumdar , Ralf Schenkel , Martin Theobald , Gerhard Weikum, IO-Top-k: index-access optimized top-k query processing, Proceedings of the 32nd international conference on Very large data bases, September 12-15, 2006, Seoul, Korea
|
|
|
|
|
|
Zhen Zhang , Seung-won Hwang , Kevin Chen-Chuan Chang , Min Wang , Christian A. Lang , Yuan-chi Chang, Boolean + ranking: querying a database by k-constrained optimization, Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 27-29, 2006, Chicago, IL, USA
|
|
|
|
|
|
Aristides Gionis , Heikki Mannila , Kai Puolamäki , Antti Ukkonen, Algorithms for discovering bucket orders from data, Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, August 20-23, 2006, Philadelphia, PA, USA
|
|
|
Chengkai Li , Min Wang , Lipyeow Lim , Haixun Wang , Kevin Chen-Chuan Chang, Supporting ranking and clustering as generalized order-by and group-by, Proceedings of the 2007 ACM SIGMOD international conference on Management of data, June 11-14, 2007, Beijing, China
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Hai Jin , Xiaomin Ning , Weijia Jia , Hao Wu , Guilin Lu, Combining weights with fuzziness for intelligent semantic web search, Knowledge-Based Systems, v.21 n.7, p.655-665, October, 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Partha Pratim Talukdar , Marie Jacob , Muhammad Salman Mehmood , Koby Crammer , Zachary G. Ives , Fernando Pereira , Sudipto Guha, Learning to create data-integrating queries, Proceedings of the VLDB Endowment, v.1 n.1, August 2008
|
|
|
|
|
|
|
|
|
Demetrios Zeinalipour-Yazti , Zografoula Vagena , Vana Kalogeraki , Dimitrios Gunopulos , Vassilis J. Tsotras , Michail Vlachos , Nick Koudas , Divesh Srivastava, Finding the K highest-ranked answers in a distributed network, Computer Networks: The International Journal of Computer and Telecommunications Networking, v.53 n.9, p.1431-1449, June, 2009
|
|
|
Thomas Neumann , Matthias Bender , Sebastian Michel , Ralf Schenkel , Peter Triantafillou , Gerhard Weikum, Distributed top-k aggregation queries at large, Distributed and Parallel Databases, v.26 n.1, p.3-27, August 2009
|
|
|
Jun-Seok Heo , Kyu-Young Whang , Min-Soo Kim , Yi-Reun Kim , Il-Yeol Song, The partitioned-layer index: Answering monotone top-k queries using the convex skyline and partitioning-merging technique, Information Sciences: an International Journal, v.179 n.19, p.3286-3308, September, 2009
|
|