| Progressive and selective merge: computing top-k with ad-hoc ranking functions |
| Full text |
Pdf
(400 KB)
|
Source
|
International Conference on Management of Data
archive
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
table of contents
Beijing, China
SESSION: Top-k queries and ranking
table of contents
Pages: 103 - 114
Year of Publication: 2007
ISBN:978-1-59593-686-8
|
|
Authors
|
|
Dong Xin
|
University of Illinois at Urbana-Champaign, Urbana, IL
|
|
Jiawei Han
|
University of Illinois at Urbana-Champaign, Urbana, IL
|
|
Kevin C. Chang
|
University of Illinois at Urbana-Champaign, Urbana, IL
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 7, Downloads (12 Months): 121, Citation Count: 5
|
|
|
ABSTRACT
The family of threshold algorithm (ie, TA) has been widely studied for efficiently computing top-k queries. TA uses a sort-merge framework that assumes data lists are pre-sorted, and the ranking functions are monotone. However, in many database applications, attribute values are indexed by tree-structured indices (eg, B-tree, R-tree), and the ranking functions are not necessarily monotone. To answer top-k queries with ad-hoc ranking functions, this paper studies anindex-merge paradigm that performs progressive search over the space of joint states composed by multiple index nodes. We address two challenges for efficient query processing. First, to minimize the search complexity, we present a double-heap algorithm which supports not only progressive state search but also progressive state generation. Second, to avoid unnecessary disk access, we characterize a type of "empty-state" that does not contribute to the final results, and propose a new materialization model, join-signature, to prune empty-states. Our performance study shows that the proposed method achieves one order of magnitude speed-up over baseline solutions.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Holger Bast , Debapriyo Majumdar , Ralf Schenkel , Martin Theobald , Gerhard Weikum, IO-Top-k: index-access optimized top-k query processing, Proceedings of the 32nd international conference on Very large data bases, September 12-15, 2006, Seoul, Korea
|
 |
2
|
|
 |
3
|
|
| |
4
|
|
| |
5
|
|
 |
6
|
Thomas Brinkhoff , Hans-Peter Kriegel , Bernhard Seeger, Efficient processing of spatial joins using R-trees, Proceedings of the 1993 ACM SIGMOD international conference on Management of data, p.237-246, May 25-28, 1993, Washington, D.C., United States
|
| |
7
|
N. Bruno, L. Gravano, and A. Marian. Evaluating top-k queries over web-accessible databases. In ICDE, 2002.
|
 |
8
|
Kaushik Chakrabarti , Venkatesh Ganti , Jiawei Han , Dong Xin, Ranking objects based on relationships, Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 27-29, 2006, Chicago, IL, USA
[doi> 10.1145/1142473.1142516]
|
 |
9
|
|
 |
10
|
|
 |
11
|
|
| |
12
|
R. Fagin, A. Lotem, and M. Naor. Optimal aggregation algorithms for middleware. In PODS, 2001.
|
| |
13
|
A. Fraenkel and S. Klein. Novel compression of sparse bit-strings-preliminary report. Combinatorial Algorithms on Words, NATO ASI Series, 12:169--183, 1985.
|
| |
14
|
H. Garcia-Molina, J. D. Ullman, and J. Widom. Database Systems: The Complete Book. Prentice Hall, 2002.
|
 |
15
|
|
| |
16
|
|
| |
17
|
|
 |
18
|
|
| |
19
|
|
 |
20
|
|
| |
21
|
|
 |
22
|
Zhen Zhang , Seung-won Hwang , Kevin Chen-Chuan Chang , Min Wang , Christian A. Lang , Yuan-chi Chang, Boolean + ranking: querying a database by k-constrained optimization, Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 27-29, 2006, Chicago, IL, USA
[doi> 10.1145/1142473.1142515]
|
| |
23
|
|
CITED BY 5
|
|
|
|
|
|
|
|
|
|
|
Wai Kit Wong , David Wai-lok Cheung , Ben Kao , Nikos Mamoulis, Secure kNN computation on encrypted databases, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|
|
Thomas Neumann , Matthias Bender , Sebastian Michel , Ralf Schenkel , Peter Triantafillou , Gerhard Weikum, Distributed top-k aggregation queries at large, Distributed and Parallel Databases, v.26 n.1, p.3-27, August 2009
|
|