| Kernel-based skyline cardinality estimation |
| Full text |
Pdf
(1.15 MB)
|
Source
|
International Conference on Management of Data
archive
Proceedings of the 35th SIGMOD international conference on Management of data
table of contents
Providence, Rhode Island, USA
SESSION: Research session 13: skyline query processing
table of contents
Pages 509-522
Year of Publication: 2009
ISBN:978-1-60558-551-2
|
|
Authors
|
|
Zhenjie Zhang
|
National University of Singapore, Singapore, Singapore
|
|
Yin Yang
|
Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
|
|
Ruichu Cai
|
South China University of Technology, Guangzhou, China
|
|
Dimitris Papadias
|
Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
|
|
Anthony Tung
|
National University of Singapore, Singapore, Singapore
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 33, Downloads (12 Months): 138, Citation Count: 0
|
|
|
ABSTRACT
The skyline of a d-dimensional dataset consists of all points not dominated by others. The incorporation of the skyline operator into practical database systems necessitates an efficient and effective cardinality estimation module. However, existing theoretical work on this problem is limited to the case where all d dimensions are independent of each other, which rarely holds for real datasets. The state of the art Log Sampling (LS) technique simply applies theoretical results for independent dimensions to non-independent data anyway, sometimes leading to large estimation errors. To solve this problem, we propose a novel Kernel-Based (KB) approach that approximates the skyline cardinality with nonparametric methods. Extensive experiments with various real datasets demonstrate that KB achieves high accuracy, even in cases where LS fails. At the same time, despite its numerical nature, the efficiency of KB is comparable to that of LS. Furthermore, we extend both LS and KB to the k-dominant skyline, which is commonly used instead of the conventional skyline for high-dimensional data.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
| |
2
|
Jon L. Bentley , Kenneth L. Clarkson , David B. Levine, Fast linear expected-time alogorithms for computing maxima and convex hulls, Proceedings of the first annual ACM-SIAM symposium on Discrete algorithms, p.179-187, January 22-24, 1990, San Francisco, California, United States
|
 |
3
|
|
 |
4
|
Björn Blohsfeld , Dieter Korus , Bernhard Seeger, A comparison of selectivity estimators for range queries on metric attributes, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.239-250, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States
|
| |
5
|
Börzsönyi, S., Kossmann, D., Stocker, K. The Skyline Operator. ICDE, 2001.
|
| |
6
|
Casella, G., Berger, R., Statistical Inference. Duxbury Press, 2001.
|
 |
7
|
|
 |
8
|
Chee-Yong Chan , H. V. Jagadish , Kian-Lee Tan , Anthony K. H. Tung , Zhenjie Zhang, Finding k-dominant skylines in high dimensional space, Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 27-29, 2006, Chicago, IL, USA
[doi> 10.1145/1142473.1142530]
|
| |
9
|
|
| |
10
|
|
| |
11
|
Chomicki, J., Godfrey, P., Gryz, J., Liang, D. Skyline with Presorting. ICDE, 2003.
|
| |
12
|
|
| |
13
|
Fridman, J. Exploratory Projection Pursuit. Journal of American Statistics Association, 82:249--266, 1987.
|
| |
14
|
Godfrey, P. Skyline Cardinality for Relational Processing. FoIKS, 2004.
|
| |
15
|
|
| |
16
|
|
 |
17
|
|
| |
18
|
Hwang, J.-N., Lay, S.-R., Lippman, A. Nonparametric Multivariate Density Estimation: A Comparative Study. IEEE Trans. on Signal Processing, 42(10): 2795--2810, 1994.
|
| |
19
|
Hwang, J.-N., Lay, S.-R., Lippman, A. Unsupervised Learning for Multivariate Probability Density Estimation: Radial Basis and Projection Pursuit. IEEE Conf. on Neural Networks, 1994.
|
| |
20
|
H. V. Jagadish , Nick Koudas , S. Muthukrishnan , Viswanath Poosala , Kenneth C. Sevcik , Torsten Suel, Optimal Histograms with Quality Guarantees, Proceedings of the 24rd International Conference on Very Large Data Bases, p.275-286, August 24-27, 1998
|
| |
21
|
Koltun, V., Papadimitriou, C. Approximately Dominating Representatives. ICDT, 2005.
|
| |
22
|
|
 |
23
|
|
| |
24
|
|
| |
25
|
|
| |
26
|
|
 |
27
|
|
| |
28
|
|
| |
29
|
|
| |
30
|
|
| |
31
|
|
| |
32
|
|
| |
33
|
|
| |
34
|
|
| |
35
|
|
| |
36
|
|
 |
37
|
|
| |
38
|
Yidong Yuan , Xuemin Lin , Qing Liu , Wei Wang , Jeffrey Xu Yu , Qing Zhang, Efficient computation of the skyline cube, Proceedings of the 31st international conference on Very large data bases, August 30-September 02, 2005, Trondheim, Norway
|
|