ACM Home Page
Please provide us with feedback. Feedback
Minimal probing: supporting expensive predicates for top-k queries
Full text PdfPdf (1.53 MB)
Source International Conference on Management of Data archive
Proceedings of the 2002 ACM SIGMOD international conference on Management of data table of contents
Madison, Wisconsin
SESSION: Research sessions: query processing II table of contents
Pages: 346 - 357  
Year of Publication: 2002
ISBN:1-58113-497-5
Authors
Kevin Chen-Chuan Chang  University of Illinois, Urbana-Champaign
Seung-won Hwang  University of Illinois, Urbana-Champaign
Sponsor
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 5,   Downloads (12 Months): 67,   Citation Count: 55
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/564691.564731
What is a DOI?

ABSTRACT

This paper addresses the problem of evaluating ranked top-k queries with expensive predicates. As major DBMSs now all support expensive user-defined predicates for Boolean queries, we believe such support for ranked queries will be even more important: First ranked queries often need to model user-specific concepts of preference, relevance, or similarity, which call for dynamic user-defined functions. Second, middleware systems must incorporate external predicates for integrating autonomous sources typically accessible only by per-object queries. Third, fuzzy joins are inherently expensive, as they are essentially user-defined operations that dynamically associate multiple relations. These predicates, being dynamically defined or externally accessed, cannot rely on index mechanisms to provide zero-time sorted output, and must instead require per-object probe to evaluate. The current standard sort-merge framework for ranked queries cannot efficiently handle such predicates because it must completely probe all objects, before sorting and merging them to produce top-k answers. To minimize expensive probes, we thus develop the formal principle of "necessary probes," which determines if a probe is absolutely required. We then propose Algorithm MPro which, by implementing the principle, is provably optimal with minimal probe cost. Further, we show that MPro can scale well and can be easily parallelized. Our experiments using both a real-estate benchmark database and synthetic datasets show that MPro enables significant probe reduction, which can be orders of magnitude faster than the standard scheme using complete probing.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
2
3
4
 
5
6
7
 
8
 
9
S. Nepal and M. Ramakrishna. Query processing issues in image (multimedia) databases. ICDE 1999, pages 22-29, 1999.
10
 
11
K. Chang and S. Hwang. Minimal probing: Supporting expensive predicates for top-k queries. Technical Report UIUCDCS-R-2001-2258, University of Illinois, December 2001.
 
12
13
14
 
15
16
 
17
N. Bruno, L. Gravano, and A. Marian. Evaluating top-k queries over web-accessible databases. ICDE 2002, 2002
 
18
 
19
L. Zadeh. Fuzzy sets. Information and Control, 8:338-353, 1965.
 
20
 
21
 
22
The PostgreSQL Global Development Group. The PostgreSQL 7.1. Reference Manual.

CITED BY  55

Collaborative Colleagues:
Kevin Chen-Chuan Chang: colleagues
Seung-won Hwang: colleagues