ACM Home Page
Please provide us with feedback. Feedback
Kernel-based skyline cardinality estimation
Full text PdfPdf (1.15 MB)
Source
International Conference on Management of Data archive
Proceedings of the 35th SIGMOD international conference on Management of data table of contents
Providence, Rhode Island, USA
SESSION: Research session 13: skyline query processing table of contents
Pages 509-522  
Year of Publication: 2009
ISBN:978-1-60558-551-2
Authors
Zhenjie Zhang  National University of Singapore, Singapore, Singapore
Yin Yang  Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
Ruichu Cai  South China University of Technology, Guangzhou, China
Dimitris Papadias  Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
Anthony Tung  National University of Singapore, Singapore, Singapore
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 33,   Downloads (12 Months): 138,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1559845.1559899
What is a DOI?

ABSTRACT

The skyline of a d-dimensional dataset consists of all points not dominated by others. The incorporation of the skyline operator into practical database systems necessitates an efficient and effective cardinality estimation module. However, existing theoretical work on this problem is limited to the case where all d dimensions are independent of each other, which rarely holds for real datasets. The state of the art Log Sampling (LS) technique simply applies theoretical results for independent dimensions to non-independent data anyway, sometimes leading to large estimation errors. To solve this problem, we propose a novel Kernel-Based (KB) approach that approximates the skyline cardinality with nonparametric methods. Extensive experiments with various real datasets demonstrate that KB achieves high accuracy, even in cases where LS fails. At the same time, despite its numerical nature, the efficiency of KB is comparable to that of LS. Furthermore, we extend both LS and KB to the k-dominant skyline, which is commonly used instead of the conventional skyline for high-dimensional data.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
3
4
 
5
Börzsönyi, S., Kossmann, D., Stocker, K. The Skyline Operator. ICDE, 2001.
 
6
Casella, G., Berger, R., Statistical Inference. Duxbury Press, 2001.
7
8
 
9
 
10
 
11
Chomicki, J., Godfrey, P., Gryz, J., Liang, D. Skyline with Presorting. ICDE, 2003.
 
12
 
13
Fridman, J. Exploratory Projection Pursuit. Journal of American Statistics Association, 82:249--266, 1987.
 
14
Godfrey, P. Skyline Cardinality for Relational Processing. FoIKS, 2004.
 
15
 
16
17
 
18
Hwang, J.-N., Lay, S.-R., Lippman, A. Nonparametric Multivariate Density Estimation: A Comparative Study. IEEE Trans. on Signal Processing, 42(10): 2795--2810, 1994.
 
19
Hwang, J.-N., Lay, S.-R., Lippman, A. Unsupervised Learning for Multivariate Probability Density Estimation: Radial Basis and Projection Pursuit. IEEE Conf. on Neural Networks, 1994.
 
20
 
21
Koltun, V., Papadimitriou, C. Approximately Dominating Representatives. ICDT, 2005.
 
22
23
 
24
 
25
 
26
27
 
28
 
29
 
30
 
31
 
32
 
33
 
34
 
35
 
36
37
 
38

Collaborative Colleagues:
Zhenjie Zhang: colleagues
Yin Yang: colleagues
Ruichu Cai: colleagues
Dimitris Papadias: colleagues
Anthony Tung: colleagues