ACM Home Page
Please provide us with feedback. Feedback
Evaluating rank joins with optimal cost
Full text PdfPdf (264 KB)
Source
Symposium on Principles of Database Systems archive
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems table of contents
Vancouver, Canada
SESSION: Best Newcomer Award & Invited Tutorial 1 table of contents
Pages 43-52  
Year of Publication: 2008
ISBN:978-1-60558-152-1
Authors
Karl Schnaitter  UC Santa Cruz, Santa Cruz, CA, USA
Neoklis Polyzotis  UC Santa Cruz, Santa Cruz, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 11,   Downloads (12 Months): 171,   Citation Count: 2
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1376916.1376924
What is a DOI?

ABSTRACT

In the rank join problem, we are given a set of relations and a scoring function, and the goal is to return the join results with the top K scores. It is often the case in practice that the inputs may be accessed in ranked order and the scoring function is monotonic. These conditions allow for efficient algorithms that solve the rank join problem without reading all of the input. In this paper, we present a thorough analysis of such rank join algorithms. A strong point of our analysis is that it is based on a more general problem statement than previous work, making it more relevant to the execution model that is employed by database systems. One of our results indicates that the well known HRJN algorithm has shortcomings, because it does not stop reading its input as soon as possible. We find that it is NP-hard to overcome this weakness in the general case, but cases of limited query complexity are tractable. We prove the latter with an algorithm that infers provably tight bounds on the potential benefit of reading more input in order to stop as soon as possible. As a result, the algorithm achieves a cost that is within a constant factor of optimal.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
Parag Agrawal and Jennifer Widom. Confidence-aware joins in large uncertain databases. Technical report, Stanford University, 2007. Available at http://dbpubs.stanford.edu/pub/2007-14.
 
2
 
3
 
4
5
6
7
 
8
 
9
Karl Schnaitter and Neoklis Polyzotis. Evaluating rank joins with optimal cost. Technical report, UC Santa Cruz, 2007. Available at http://www.soe.ucsc.edu/research/reports/UCSC-CRL-07-10.pdf.
 
10


Collaborative Colleagues:
Karl Schnaitter: colleagues
Neoklis Polyzotis: colleagues