|
ABSTRACT
Rank aggregation has recently been proposed as a useful abstraction that has several applications, including meta-search, synthesizing rank functions from multiple indices, similarity search, and classification. In database applications (catalog searches, fielded searches, parametric searches, etc.), the rankings are produced by sorting an underlying database according to various fields. Typically, there are a number of fields that each have very few distinct values, and hence the corresponding rankings have many ties in them. Known methods for rank aggregation are poorly suited to this context, and the difficulties can be traced back to the fact that we do not have sound mathematical principles to compare two partial rankings, that is, rankings that allow ties.In this work, we provide a comprehensive picture of how to compare partial rankings, We propose several metrics to compare partial rankings, present algorithms that efficiently compute them, and prove that they are within constant multiples of each other. Based on these concepts, we formulate aggregation problems for partial rankings, and develop a highly efficient algorithm to compute the top few elements of a near-optimal aggregation of multiple partial rankings. In a model of access that is suitable for databases, our algorithm reads essentially as few elements of each partial ranking as are necessary to determine the winner(s).
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
|
| |
2
|
K. A. Baggerly. Visual Estimation of Structure in Ranked Data. PhD thesis, Rice University, 1995.
|
| |
3
|
W. W. Cohen, R. E. Schapire, and Y. Singer. Learning to order things. Journal of Artificial Intelligence Research, 10:243--270, 1999.
|
| |
4
|
D. E. Critchlow. Metric Methods for Analyzing Partially Ranked Data. Number 34 in Lecture Notes in Statistics. Springer-Verlag, 1980.
|
| |
5
|
P. Diaconis. Group Representation in Probability and Statistics. Number 11 in IMS Lecture Series. Institute of Mathematical Statistics, 1988.
|
| |
6
|
P. Diaconis and R. Graham. Spearman's footrule as a measure of disarray. Journal of the Royal Statistical Society, Series B, 39(2):262--268, 1977.
|
 |
7
|
Cynthia Dwork , Ravi Kumar , Moni Naor , D. Sivakumar, Rank aggregation methods for the Web, Proceedings of the 10th international conference on World Wide Web, p.613-622, May 01-05, 2001, Hong Kong, Hong Kong
[doi> 10.1145/371920.372165]
|
 |
8
|
Ronald Fagin , Ravi Kumar , Kevin S. McCurley , Jasmine Novak , D. Sivakumar , John A. Tomlin , David P. Williamson, Searching the workplace web, Proceedings of the 12th international conference on World Wide Web, May 20-24, 2003, Budapest, Hungary
[doi> 10.1145/775152.775204]
|
| |
9
|
|
 |
10
|
|
 |
11
|
|
| |
12
|
L. A. Goodman and W. H. Kruskal. Measures of association for cross classification. Journal of the American Statistical Association, 49:732--764, 1954.
|
 |
13
|
Taher H. Haveliwala , Aristides Gionis , Dan Klein , Piotr Indyk, Evaluating strategies for similarity search on the web, Proceedings of the 11th international conference on World Wide Web, May 07-11, 2002, Honolulu, Hawaii, USA
[doi> 10.1145/511446.511502]
|
| |
14
|
M. Kendall and J. D. Gibbons. Rank Correlation Methods. Edward Arnold, 1990.
|
| |
15
|
M. G. Kendall. The treatment of ties in ranking problems. Biometrika, 33(3):239--251, 1945.
|
| |
16
|
|
 |
17
|
|
 |
18
|
|
| |
19
|
J. Sese and S. Morishita. Rank aggregation method for biological databases. Genome Informatics, 12:506--507, 2001.
|
| |
20
|
R. R. Yager and V. Kreinovich. On how to merge sorted lists coming from different web search tools. Soft Computing Research Journal, 3:83--88, 1999.
|
CITED BY 21
|
|
|
|
|
Allan Borodin , Gareth O. Roberts , Jeffrey S. Rosenthal , Panayiotis Tsaparas, Link analysis ranking: algorithms, theory, and experiments, ACM Transactions on Internet Technology (TOIT), v.5 n.1, p.231-297, February 2005
|
|
|
Gautam Das , Vagelis Hristidis , Nishant Kapoor , S. Sudarshan, Ordering the attributes of query results, Proceedings of the 2006 ACM SIGMOD international conference on Management of data, June 27-29, 2006, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
|
|
|
Tamás Sarlós , Adrás A. Benczúr , Károly Csalogány , Dániel Fogaras , Balázs Rácz, To randomize or not to randomize: space optimal summaries for hyperlink analysis, Proceedings of the 15th international conference on World Wide Web, May 23-26, 2006, Edinburgh, Scotland
|
|
|
Aristides Gionis , Heikki Mannila , Kai Puolamäki , Antti Ukkonen, Algorithms for discovering bucket orders from data, Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, August 20-23, 2006, Philadelphia, PA, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Ramakrishna Varadarajan , Vagelis Hristidis , Louiqa Raschid , Maria-Esther Vidal , Luis Ibáñez , Héctor Rodríguez-Drumond, Flexible and efficient querying and ranking on hyperlinked data sources, Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, March 24-26, 2009, Saint Petersburg, Russia
|
|
|
R. Agrawal , A. Halverson , K. Kenthapadi , N. Mishra , P. Tsaparas, Generating labels from clicks, Proceedings of the Second ACM International Conference on Web Search and Data Mining, February 09-12, 2009, Barcelona, Spain
|
|
|
|
|
|
|
|