ACM Home Page
Please provide us with feedback. Feedback
Ad-hoc aggregations of ranked lists in the presence of hierarchies
Full text PdfPdf (467 KB)
Source
International Conference on Management of Data archive
Proceedings of the 2008 ACM SIGMOD international conference on Management of data table of contents
Vancouver, Canada
SESSION: Research Session 2: Ranking table of contents
Pages 67-78  
Year of Publication: 2008
ISBN:978-1-60558-102-6
Authors
Nilesh Bansal  University of Toronto, Toronto, ON, Canada
Sudipto Guha  University of Pennsylvania, Philadelphia, PA, USA
Nick Koudas  University of Toronto, Toronto, ON, Canada
Sponsors
ACM: Association for Computing Machinery
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 15,   Downloads (12 Months): 206,   Citation Count: 2
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1376616.1376626
What is a DOI?

ABSTRACT

A variety of web sites and web based services produce textual lists at varying time granularities ranked according to several criteria. For example, Google Trends produces lists of popular query keywords which can be visualized according to several criteria. At Flickr, lists of popular tags used to tag the images uploaded can be visualized as a cloud based on their popularity. Identification of the k most popular terms can be easily conducted by utilizing well known rank aggregation algorithms.

In this paper we take a different approach to information discovery from such ranked lists. We maintain the same rank aggregation framework but we elevate terms at a higher level by making use of popular term hierarchies commonly available. Under such a transformation we show that typical early stopping certificates available for rank aggregation algorithms are no longer applicable. Based on this observation, in this paper, we present a probabilistic framework for early stopping in this setting. We introduce a relaxed version of the rank aggregation problem involving a deterministic stopping condition with user specified precision. We introduce an algorithm pH -- RA for the solution of this problem. In addition we introduce techniques to improve the performance of pH -- RAeven further via precomputation utilizing a sparse set system. Through a detailed experimental evaluation using synthetic and real datasets we demonstrate the efficiency of our framework.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
2
 
3
 
4
 
5
 
6
J. O. Berger. Statistical Decision Theory and Bayesian Analysis, 2nd Edition. Springer, 1985.
 
7
BlogScope. http://www.blogscope.net/about/.
8
 
9
W. W. Cohen, R. E. Schapire, and Y. Singer. Learning to order things. JAIR, 10:243--270, 1999.
 
10
11
12
 
13
 
14
 
15
W. Feller. An Introduction to Probability Theory and its Applications, volume 1. John Wiley & Sons, 2nd edition, 1957.
 
16
M. F. Fontoura, R. Lempel, R. Qi, , and J. Zien. Inverted index support for numeric search. In Internet Mathematics, 2006.
17
 
18
19
 
20
S. Nepal and M. V. Ramakrishna. Query processing issues in image (multimedia) databases. In ICDE, 1999.
 
21
 
22
 
23
R. R. Yager and V. Kreinovich. On how to merge sorted lists coming from different web search tools. Soft Comput, 3(2):83--88, 1999.


Collaborative Colleagues:
Nilesh Bansal: colleagues
Sudipto Guha: colleagues
Nick Koudas: colleagues