| Space lower bounds for distance approximation in the data stream model |
| Full text |
Pdf
(271 KB)
|
| Source
|
Annual ACM Symposium on Theory of Computing
archive
Proceedings of the thiry-fourth annual ACM symposium on Theory of computing
table of contents
Montreal, Quebec, Canada
SESSION: Session 7A
table of contents
Pages: 360 - 369
Year of Publication: 2002
ISBN:1-58113-495-9
|
|
Authors
|
|
| Sponsor |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 5, Downloads (12 Months): 41, Citation Count: 14
|
|
|
ABSTRACT
(MATH) We consider the problem of approximating the distance of two d-dimensional vectors x and y in the data stream model. In this model, the 2d coordinates are presented as a "stream" of data in some arbitrary order, where each data item includes the index and value of some coordinate and a bit that identifies the vector (x or y) to which it belongs. The goal is to minimize the amount of memory needed to approximate the distance. For the case of Lp-distance with p &egr; [1,2], there are good approximation algorithms that run in polylogarithmic space in d (here we assume that each coordinate is an integer with O(log d) bits). Here we prove that they do not exist for pρ2. In particular, we prove an optimal approximation-space tradeoff of approximating L&infty; distance of two vectors. We show that any randomized algorithm that approximates L&infty; distance of two length d vectors within factor of d&dgr; requires ω(d1—4&dgr;) space. As a consequence we show that for pρ2/(1—4&dgr;), any randomized algorithm that approximate Lp distance of two length d vectors within a factor d&dgr; requires ω(d 1— 2< \over p—4&dgr;) space.The lower bound follows from a lower bound on the two-party one-round communication complexity of this problem. This lower bound is proved using a combination of information theory and Fourier analysis.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
Noga Alon , Yossi Matias , Mario Szegedy, The space complexity of approximating the frequency moments, Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, p.20-29, May 22-24, 1996, Philadelphia, Pennsylvania, United States
[doi> 10.1145/237814.237823]
|
| |
2
|
L. Babai, Peter Frankl, and Janos Simon. Complexity classes in communication complexity theory (preliminary version). FOCS 1986 337--347.
|
| |
3
|
|
| |
4
|
Ziv Bar-Yossef, T. S. Jayram, Ravi Kumar, and D. Sivakumar. Personal communication.
|
| |
5
|
|
| |
6
|
|
| |
7
|
|
| |
8
|
|
| |
9
|
|
| |
10
|
Balasubramanian Kalyanasundaram and Georg Schnitger. The Probabilistic Communication Complexity of Set Intersection (preliminary Version) Proc. of 2nd Structure in Complexity Theory 1987 41--49.
|
| |
11
|
W. B. Johnson and J. Lindenstrauss. Extensions of Lipshitz mapping into Hilbert space. Contemporary (MATH)ematics, 26(1984) 189--206.
|
| |
12
|
|
| |
13
|
A. C. Yao. Lower bounds by probabilistic arguments. FOCS 1983 420--428.
|
CITED BY 14
|
|
|
|
|
|
|
|
Lakshminath Bhuvanagiri , Sumit Ganguly , Deepanjan Kesh , Chandan Saha, Simpler algorithm for estimating frequency moments of data streams, Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm, p.708-713, January 22-26, 2006, Miami, Florida
|
|
|
|
|
|
|
|
|
Brian Babcock , Shivnath Babu , Mayur Datar , Rajeev Motwani , Jennifer Widom, Models and issues in data stream systems, Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 03-05, 2002, Madison, Wisconsin
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|