|
ABSTRACT
We prove that any one-pass streaming algorithm which (ε, δ)-approximates the kth frequency moment Fk, for any real k ≠ 1 and any ε = Ω(1/√m), must use Ω(1/ε²) bits of space, where m is the size of the universe. This is optimal in terms of ε, resolves the open questions of Bar-Yossef et al in [3, 4], and extends the Ω(1/ε²) lower bound for F0 in [11] to much smaller ε by applying novel techniques. Along the way we lower bound the one-way communication complexity of approximating the Hamming distance and the number of bipartite graphs with minimum/maximum degree constraints.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Noga Alon , Yossi Matias , Mario Szegedy, The space complexity of approximating the frequency moments, Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, p.20-29, May 22-24, 1996, Philadelphia, Pennsylvania, United States
[doi> 10.1145/237814.237823]
|
| |
2
|
N. Alon and J. Spencer. The Probabilistic Method, Wiley Interscience, New York, 1992, 86--87.
|
| |
3
|
|
| |
4
|
Z. Bar Yossef. The complexity of massive data set computations. Ph.D. Thesis, U.C. Berkeley, 2002.
|
| |
5
|
|
| |
6
|
|
| |
7
|
G. Cormode, M. Datar, P. Indyk, and M. Muthukrishnan. Comparing Data Streams Using Hamming Norms. 28th International Conference on Very Large Databases (VLDB), 2002.
|
| |
8
|
|
| |
9
|
P. Flajolet and G. N. Martin. Probabilistic Counting Algorithms for Data Base Applications. Journal of Computer and System Sciences, 18(2) 143--154, 1979.
|
| |
10
|
I. J. Good. Surprise Indexes and P-values. Journal of Statistical Computation and Simulation 32, p 90--92, 1989.
|
| |
11
|
|
| |
12
|
|
| |
13
|
|
| |
14
|
V. N. Vapnik and A. Y. Chervonenkis. On the uniform converges of relative frequences of events to their probabilities. Theory of Probability and its Applications, XVI(2):264--280, 1971.
|
| |
15
|
A. C-C. Yao. Lower bounds by probabilistic arguments. In Proceedings of the 24th Annual IEEE Symposium on Foundations of Computer Science, p. 420--428, 1983.
|
|