ACM Home Page
Please provide us with feedback. Feedback
Optimal aggregation algorithms for middleware
Full text PdfPdf (231 KB)
Source Symposium on Principles of Database Systems archive
Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems table of contents
Santa Barbara, California, United States
Pages: 102 - 113  
Year of Publication: 2001
ISBN:1-58113-361-8
Authors
Ronald Fagin  IBM Almaden Research, Center, 650 Harry Road, San Jose, CA
Amnon Lotem  Maryland-College Park, Dept. of Computer Science, College Park, Maryland
Moni Naor  Weizmann Institute of Science, Dept. of Computer Science, and Applied Mathematics, Rehovot 76100, Israel
Sponsor
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 16,   Downloads (12 Months): 168,   Citation Count: 173
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/375551.375567
What is a DOI?

ABSTRACT

Assume that each object in a database has m grades, or scores, one for each of m attributes. For example, an object can have a color grade, that tells how red it is, and a shape grade, that tells how round it is. For each attribute, there is a sorted list, which lists each object and its grade under that attribute, sorted by grade (highest grade first). There is some monotone aggregation function, or combining rule, such as min or average, that combines the individual grades to obtain an overall grade.

To determine objects that have the best overall grades, the naive algorithm must access every object in the database, to find its grade under each attribute. Fagin has given an algorithm (“Fagin's Algorithm”, or FA) that is much more efficient. For some distributions on grades, and for some monotone aggregation functions, FA is optimal in a high-probability sense.

We analyze an elegant and remarkably simple algorithm (“the threshold algorithm”, or TA) that is optimal in a much stronger sense than FA. We show that TA is essentially optimal, not just for some monotone aggregation functions, but for all of them, and not just in a high-probability sense, but over every database. Unlike FA, which requires large buffers (whose size may grow unboundedly as the database size grows), TA requires only a small, constant-size buffer.

We distinguish two types of access: sorted access (where the middleware system obtains the grade of an object in some sorted list by proceeding through the list sequentially from the top), and random access (where the middleware system requests the grade of object in a list, and obtains it in one step). We consider the scenarios where random access is either impossible, or expensive relative to sorted access, and provide algorithms that are essentially optimal for these cases as well.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
 
3
 
4
 
5
 
6
 
7
 
8
 
9
 
10
W Niblack R Barber W Equitz M Flickner E Glasman D Petkovic and P Yanker The QBIC project Querying images bycontent using color texture and shape In SPIE Conference on Storage and Retrieval for Image and Video Databases volume 1908, pages 173-187, 1983. QBIC Web server is http wwwqbic almaden ibm com
 
11
12
 
13
 
14
L A Zadeh Fuzzy sets Information and Control 8:338-363, 1999
 
15
H J Zimmermann Fuzzy Set Theory Kluwer Academic Publishers Boston 3rd edition, 1996.

CITED BY  173

Collaborative Colleagues:
Ronald Fagin: colleagues
Amnon Lotem: colleagues
Moni Naor: colleagues