ACM Home Page
Please provide us with feedback. Feedback
Efficient skyline retrieval with arbitrary similarity measures
Full text PdfPdf (744 KB)
Source Extending Database Technology; Vol. 360 archive
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology table of contents
Saint Petersburg, Russia
SESSION: Research sessions: Skylines table of contents
Pages 1052-1063  
Year of Publication: 2009
ISBN:978-1-60558-422-5
Authors
Deepak P  India Research Lab, Bangalore
Prasad M Deshpande  India Research Lab, Bangalore
Debapriyo Majumdar  India Research Lab, Bangalore
Raghu Krishnapuram  India Research Lab, Bangalore
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 81,   Citation Count: 0
Additional Information:

abstract   references   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1516360.1516480
What is a DOI?

ABSTRACT

A skyline query returns a set of objects that are not dominated by other objects. An object is said to dominate another if it is closer to the query than the latter on all factors under consideration. In this paper, we consider the case where the similarity measures may be arbitrary and do not necessarily come from a metric space. We first explore middleware algorithms, analyze how skyline retrieval for non-metric spaces can be done on the middleware backend, and lay down a necessary and sufficient stopping condition for middleware-based skyline algorithms. We develop the Balanced Access Algorithm, which is provably more IO-friendly than the state-of-the-art algorithm for skyline query processing on middleware and show that BAA outperforms the latter by orders of magnitude. We also show that without prior knowledge about data distributions, it is unlikely to have a middleware algorithm that is more IO-friendly than BAA. In fact, we empirically show that BAA is very close to the absolute lower bound of IO costs for middleware algorithms. Further, we explore the non-middleware setting and devise an online algorithm for skyline retrieval which uses a recently proposed value space index over non-metric spaces (AL-Tree [10]). The AL-Tree based algorithm is able to prune subspaces and efficiently maintain candidate sets leading to better performance. We compare our algorithms to existing ones which can work with arbitrary similarity measures and show that our approaches are better in terms of computational and disk access costs leading to significantly better response times.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
How fast is your disk? http://www.linuxinsight.com/how_fast_is_your_disk.html, January 2007.
 
2
A. Asuncion and D. Newman. UCI machine learning repository, 2007.
 
3
W.-T. Balke, U. Güntzer, and J. X. Zheng. Efficient distributed skylining for web information systems. In EDBT, pages 256--273, 2004.
 
4
5
 
6
S. Börzsönyi, D. Kossmann, and K. Stocker. The skyline operator. In ICDE, 2001.
 
7
J. Chomicki, P. Godfrey, J. Gryz, and D. Liang. Skyline with presorting. In ICDE, 2003.
 
8
W. Chung, Gray and Horst. Windows 2000 disk io performance. Microsoft Research TR, June 2000.
 
9
K. Deng, X. Zhou, and H. T. Shen. Multi-source skyline query processing in road networks. In ICDE, 2007.
10
11
 
12
 
13
14
 
15
16
17
 
18
 
19
 
20
S. Wang, B. C. Ooi, A. K. H. Tung, and L. Xu. Efficient skyline query processing on peer-to-peer networks. In ICDE, pages 1126--1135, 2007.
 
21
Collaborative Colleagues:
Deepak P: colleagues
Prasad M Deshpande: colleagues
Debapriyo Majumdar: colleagues
Raghu Krishnapuram: colleagues