|
ABSTRACT
Assume that each object in a database has m grades, or scores, one for each of m attributes. For example, an object can have a color grade, that tells how red it is, and a shape grade, that tells how round it is. For each attribute, there is a sorted list, which lists each object and its grade under that attribute, sorted by grade (highest grade first). There is some monotone aggregation function, or combining rule, such as min or average, that combines the individual grades to obtain an overall grade.
To determine objects that have the best overall grades, the naive algorithm must access every object in the database, to find its grade under each attribute. Fagin has given an algorithm (“Fagin's Algorithm”, or FA) that is much more efficient. For some distributions on grades, and for some monotone aggregation functions, FA is optimal in a high-probability sense.
We analyze an elegant and remarkably simple algorithm (“the threshold algorithm”, or TA) that is optimal in a much stronger sense than FA. We show that TA is essentially optimal, not just for some monotone aggregation functions, but for all of them, and not just in a high-probability sense, but over every database. Unlike FA, which requires large buffers (whose size may grow unboundedly as the database size grows), TA requires only a small, constant-size buffer.
We distinguish two types of access: sorted access (where the middleware system obtains the grade of an object in some sorted list by proceeding through the list sequentially from the top), and random access (where the middleware system requests the grade of object in a list, and obtains it in one step). We consider the scenarios where random access is either impossible, or expensive relative to sorted access, and provide algorithms that are essentially optimal for these cases as well.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
|
| |
3
|
M. J. Carey , L. M. Haas , P. M. Schwarz , M. Arya , W. F. Cody , R. Fagin , M. Flickner , A. W. Luniewski , W. Niblack , D. Petkovic , J. Thomas , J. H. Williams , E. L. Wimmers, Towards heterogeneous multimedia information systems: the Garlic approach, Proceedings of the 5th International Workshop on Research Issues in Data Engineering-Distributed Object Management (RIDE-DOM'95), p.124, March 06-07, 1995
|
| |
4
|
|
| |
5
|
|
| |
6
|
|
| |
7
|
|
| |
8
|
|
| |
9
|
|
| |
10
|
W Niblack R Barber W Equitz M Flickner E Glasman D Petkovic and P Yanker The QBIC project Querying images bycontent using color texture and shape In SPIE Conference on Storage and Retrieval for Image and Video Databases volume 1908, pages 173-187, 1983. QBIC Web server is http wwwqbic almaden ibm com
|
| |
11
|
|
 |
12
|
|
| |
13
|
|
| |
14
|
L A Zadeh Fuzzy sets Information and Control 8:338-363, 1999
|
| |
15
|
H J Zimmermann Fuzzy Set Theory Kluwer Academic Publishers Boston 3rd edition, 1996.
|
CITED BY 173
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Walid Aref , Moustafa Hammad , Ann Christine Catlin , Ihab Ilyas , Thanaa Ghanem , Ahmed Elmagarmid , Mirette Marzouk, Video query processing in the VDBMS testbed for video database research, Proceedings of the 1st ACM international workshop on Multimedia databases, November 07-07, 2003, New Orleans, LA, USA
|
|
|
Ronald Fagin , Ravi Kumar , Mohammad Mahdian , D. Sivakumar , Erik Vee, Comparing and aggregating rankings with ties, Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 14-16, 2004, Paris, France
|
|
|
Therese Biedl , Broňa Brejová , Erik D. Demaine , Angèle M. Hamel , Alejandro López-Ortiz , Tomáš Vinař, Finding hidden independent sets in interval graphs, Theoretical Computer Science, v.310 n.1-3, p.287-307, 01 January 2004
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ihab F. Ilyas , Rahul Shah , Walid G. Aref , Jeffrey Scott Vitter , Ahmed K. Elmagarmid, Rank-aware query optimization, Proceedings of the 2004 ACM SIGMOD international conference on Management of data, June 13-18, 2004, Paris, France
|
|
|
|
|
|
|
|
|
|
|
|
Yi Wu , Edward Y. Chang , Kevin Chen-Chuan Chang , John R. Smith, Optimal multimodal fusion for multimedia data analysis, Proceedings of the 12th annual ACM international conference on Multimedia, October 10-16, 2004, New York, NY, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
D. Zeinalipour-Yazti , Z. Vagena , D. Gunopulos , V. Kalogeraki , V. Tsotras , M. Vlachos , N. Koudas , D. Srivastava, The threshold join algorithm for top-k queries in distributed sensor networks, Proceedings of the 2nd international workshop on Data management for sensor networks, August 30-30, 2005, Trondheim, Norway
|
|
|
|
|
|
Varun Kacholia , Shashank Pandit , Soumen Chakrabarti , S. Sudarshan , Rushi Desai , Hrishikesh Karambelkar, Bidirectional expansion for keyword search on graph databases, Proceedings of the 31st international conference on Very large data bases, August 30-September 02, 2005, Trondheim, Norway
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Gautam Das , Vagelis Hristidis , Nishant Kapoor , S. Sudarshan, Ordering the attributes of query results, Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 27-29, 2006, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ihab F. Ilyas , Walid G. Aref , Ahmed K. Elmagarmid , Hicham G. Elmongui , Rahul Shah , Jeffrey Scott Vitter, Adaptive rank-aware query optimization in relational databases, ACM Transactions on Database Systems (TODS), v.31 n.4, p.1257-1304, December 2006
|
|
|
|
|
|
|
|
|
Evangelos Dellis , Akrivi Vlachou , Ilya Vladimirskiy , Bernhard Seeger , Yannis Theodoridis, Constrained subspace skyline computation, Proceedings of the 15th ACM international conference on Information and knowledge management, November 06-11, 2006, Arlington, Virginia, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Zhen Zhang , Seung-won Hwang , Kevin Chen-Chuan Chang , Min Wang , Christian A. Lang , Yuan-chi Chang, Boolean + ranking: querying a database by k-constrained optimization, Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 27-29, 2006, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Surajit Chaudhuri , Gautam Das , Vagelis Hristidis , Gerhard Weikum, Probabilistic ranking of database query results, Proceedings of the Thirtieth international conference on Very large data bases, p.888-899, August 31-September 03, 2004, Toronto, Canada
|
|
|
|
|
|
|
|
|
Sudipto Guha , Nick Koudas , Amit Marathe , Divesh Srivastava, Merging the results of approximate match operations, Proceedings of the Thirtieth international conference on Very large data bases, p.636-647, August 31-September 03, 2004, Toronto, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Yu-Ting Liu , Tie-Yan Liu , Tao Qin , Zhi-Ming Ma , Hang Li, Supervised rank aggregation, Proceedings of the 16th international conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada
|
|
|
|
|
|
|
|
|
|
|
|
Chee-Yong Chan , H. V. Jagadish , Kian-Lee Tan , Anthony K. H. Tung , Zhenjie Zhang, Finding k-dominant skylines in high dimensional space, Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 27-29, 2006, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
|
|
|
Bin Cui , Ling Liu , Calton Pu , Jialie Shen , Kian-Lee Tan, QueST: querying music databases by acoustic and textual features, Proceedings of the 15th international conference on Multimedia, September 25-29, 2007, Augsburg, Germany
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Gleb Skobeltsyn , Toan Luu , Ivana Podnar Zarko , Martin Rajman , Karl Aberer, Web text retrieval with a P2P query-driven index, Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, July 23-27, 2007, Amsterdam, The Netherlands
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ming Hua , Jian Pei , Ada W. C. Fu , Xuemin Lin , Ho-Fung Leung, Efficiently answering top-k typicality queries on large databases, Proceedings of the 33rd international conference on Very large data bases, September 23-27, 2007, Vienna, Austria
|
|
|
|
|
|
|
|
|
|
|
|
Feng Shao , Lin Guo , Chavdar Botev , Anand Bhaskar , Muthiah Chettiar , Fan Yang , Jayavel Shanmugasundaram, Efficient keyword search over virtual XML views, Proceedings of the 33rd international conference on Very large data bases, September 23-27, 2007, Vienna, Austria
|
|
|
|
|
|
|
|
|
|
|
|
Andrey Balmin , Latha Colby , Emiran Curtmola , Quanzhong Li , Fatma Özcan , Sharath Srinivas , Zografoula Vagena, SEDA: a system for search, exploration, discovery, and analysis of XML Data, Proceedings of the VLDB Endowment, v.1 n.2, August 2008
|
|
|
|
|
|
Gleb Skobeltsyn , Toan Luu , Ivana Podnar arko , Martin Rajman , Karl Aberer, Query-driven indexing for scalable peer-to-peer text retrieval, Future Generation Computer Systems, v.25 n.1, p.89-99, January, 2009
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ka Cheung Sia , Junghoo Cho , Yun Chi , Belle L. Tseng, Efficient computation of personal aggregate queries on blogs, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
|
|
|
|
|
|
|
|
|
|
|
|
Mingjie Zhu , Shuming Shi , Nenghai Yu , Ji-Rong Wen, Can phrase indexing help to process non-phrase queries?, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
|
|
|
|
|
|
|
|
|
Ramakrishna Varadarajan , Vagelis Hristidis , Louiqa Raschid , Maria-Esther Vidal , Luis Ibáñez , Héctor Rodríguez-Drumond, Flexible and efficient querying and ranking on hyperlinked data sources, Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, March 24-26, 2009, Saint Petersburg, Russia
|
|
|
|
|
|
|
|
|
Bin Wang , Xiao-Chun Yang , Guo-Ren Wang , Ge Yu , Lei Chen , X. Sean Wang , Xue-Min Lin, Continually answering constraint k-NN queries in unstructured P2P systems, Journal of Computer Science and Technology, v.23 n.4, p.538-556, July 2008
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Feng Shao , Lin Guo , Chavdar Botev , Anand Bhaskar , Muthiah Chettiar , Fan Yang , Jayavel Shanmugasundaram, Efficient keyword search over virtual XML views, The VLDB Journal — The International Journal on Very Large Data Bases, v.18 n.2, p.543-570, April 2009
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Demetrios Zeinalipour-Yazti , Zografoula Vagena , Vana Kalogeraki , Dimitrios Gunopulos , Vassilis J. Tsotras , Michail Vlachos , Nick Koudas , Divesh Srivastava, Finding the K highest-ranked answers in a distributed network, Computer Networks: The International Journal of Computer and Telecommunications Networking, v.53 n.9, p.1431-1449, June, 2009
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Yufei Tao , Ke Yi , Cheng Sheng , Panos Kalnis, Quality and efficiency in high dimensional nearest neighbor search, Proceedings of the 35th SIGMOD international conference on Management of data, June 29-July 02, 2009, Providence, Rhode Island, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jun-Seok Heo , Kyu-Young Whang , Min-Soo Kim , Yi-Reun Kim , Il-Yeol Song, The partitioned-layer index: Answering monotone top-k queries using the convex skyline and partitioning-merging technique, Information Sciences: an International Journal, v.179 n.19, p.3286-3308, September, 2009
|
|
|
|
|
|
|
|
|
|
|
|
Theodoros Lappas , Benjamin Arai , Manolis Platakis , Dimitrios Kotsakos , Dimitrios Gunopulos, On burstiness-aware search for document sequences, Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, June 28-July 01, 2009, Paris, France
|
|
|
|
|