ACM Home Page
Please provide us with feedback. Feedback
Distributed computation of the mode
Full text PdfPdf (336 KB)
Source
Annual ACM Symposium on Principles of Distributed Computing archive
Proceedings of the twenty-seventh ACM symposium on Principles of distributed computing table of contents
Toronto, Canada
SESSION: R1 table of contents
Pages 15-24  
Year of Publication: 2008
ISBN:978-1-59593-989-0
Authors
Fabian Kuhn  ETH Zurich, Zurich, Switzerland
Thomas Locher  ETH Zurich, Zurich, Switzerland
Stefan Schmid  TU Muenchen, Munich, Germany
Sponsors
SIGOPS: ACM Special Interest Group on Operating Systems
ACM: Association for Computing Machinery
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 5,   Downloads (12 Months): 81,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1400751.1400756
What is a DOI?

ABSTRACT

This paper studies the problem of computing the most frequent element (the mode) by means of a distributed algorithm where the elements are located at the nodes of a network. Let k denote the number of distinct elements and further let mi be the number of occurrences of the element ei in the ordered list of occurrences m1m2≥ ... ≥ mk. We give a deterministic distributed algorithm with time complexity O(D+k) where D denotes the diameter of the graph, which is essentially tight. As our main contribution, a Monte Carlo algorithm is presented which computes the mode in O(D + F2/m12*log k) time with high probability, where the frequency moment Ft is defined as Ft = sumi=1k mit. This algorithm is substantially faster than the deterministic algorithm for various relevant frequency distributions. Moreover, we provide a lower bound of Omega(D+F5/(m15B)), where B is the maximum message size, that captures the effect of the frequency distribution on the time complexity to compute the mode.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker. Web Caching and Zipf-like Distributions: Evidence and Implications. In Proc. 18th Annual IEEE Conference on Computer Communications (INFOCOM), 1999.
 
3
 
4
 
5
D. Dobkin and J. I. Munro. Determining the Mode. Theoretical Computer Science, 12:255--263, 1980.
 
6
M. Durand and P. Flajolet. LogLog Counting of Large Cardinalities. In Proc. 11th Annual European Symposium on Algorithms (ESA), 2003.
 
7
A. Farzan, P. Ferragina, G. Franceschini, and J. J. Munro. Cache-Oblivious Comparison-Based Algorithms on Multisets. In Proc. 13th Annual European Sympoisum on Algorithms (ESA), 2005.
 
8
 
9
10
 
11
 
12
M. Mitzenmacher. A Brief History of Generative Models for Power Law and Lognormal Distributions. Internet Mathematics, 1:226--251, 2003.
 
13
J. I. Munro and P. M. Spira. Sorting and Searching in Multisets. SIAM Journal of Computing (SICOMP), 5(1):1--8, 1976.
 
14
 
15
 
16
 
17
H. Simon. On a Class of Skew Distribution Functions. Biometrika, 42:425--440, 1955.

Collaborative Colleagues:
Fabian Kuhn: colleagues
Thomas Locher: colleagues
Stefan Schmid: colleagues