ACM Home Page
Please provide us with feedback. Feedback
Earth mover distance over high-dimensional spaces
Full text PdfPdf (516 KB)
Source Symposium on Discrete Algorithms archive
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms table of contents
San Francisco, California
Pages 343-352  
Year of Publication: 2008
Authors
Alexandr Andoni  MIT
Piotr Indyk  MIT
Robert Krauthgamer  Weizmann Institute and IBM Almaden
Sponsors
: SIAM Activity Group on Discrete Mathematics
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
Publisher
Society for Industrial and Applied Mathematics  Philadelphia, PA, USA
Bibliometrics
Downloads (6 Weeks): 4,   Downloads (12 Months): 91,   Citation Count: 2
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  

ABSTRACT

The Earth Mover Distance (EMD) between two equal-size sets of points in ℝd is defined to be the minimum cost of a bipartite matching between the two pointsets. It is a natural metric for comparing sets of features, and as such, it has received significant interest in computer vision. Motivated by recent developments in that area, we address computational problems involving EMD over high-dimensional pointsets.

A natural approach is to embed the EMD metric into l1, and use the algorithms designed for the latter space. However, Khot and Naor [KN06] show that any embedding of EMD over the d-dimensional Hamming cube into l1 must incur a distortion Ω(d), thus practically losing all distance information. We circumvent this roadblock by focusing on sets with cardinalities upper-bounded by a parameter s, and achieve a distortion of only O(log s · log d). Since in applications the feature sets have bounded size, the resulting distortion is much smaller than the Ω(d) lower bound. Our approach is quite general and easily extends to EMD over ℝd.

We then provide a strong lower bound on the multi-round communicatic complexity of estimating EMD, which in particular strengthens the known non-embeddability result of [KN06]. Our bound exhibits a smooth tradeoff between approximation and communication, and for example implies that every algorithm that estimates EMD using constant size sketches can only achieve ω(log s) approximation.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
{AIK07} Alexandr Andoni, Piotr Indyk, and Robert Krauthgamer. Earth mover distance over high-dimensional spaces. ECCC Report TR07-048, May 2007.
 
2
3
 
4
 
5
 
6
7
 
8
{CK06} Moses Charikar and Robert Krauthgamer. Embedding the ulam metric into l1. Theory of Computing, 2(11):207--224, 2006.
 
9
{CLJ+06} M. Charikar, C. Lv, W. Josephson, Z. Wang, and K. Li. Ferret: A toolkit for content-based similarity search. In Proceedings of ACM SIGOS EuroSys Conference, 2006. To appear.
10
11
 
12
{GD04} K. Grauman and T. Darrell. Fast contour matching using approximate Earth Mover's Distance. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Washington DC, June 2004.
 
13
 
14
 
15
{GD06a} K. Grauman and T. Darrell. Approximate correspondences in high dimensions. In Proceedings of Advances in Neural Information Processing Systems (NIPS), 2006.
 
16
17
 
18
19
 
20
{IT03} P. Indyk and N. Thaper. Fast color image retrieval via embeddings. Workshop on Statistical and Computational Theories of Vision (at ICCV), 2003.
 
21
{KN06} S. Khot and A. Naor. Nonembeddability theorems via fourier analysis. Mathematische Annalen, 334(4):821--852, 2006.
22
23
 
24
{Law76} E. Lawler. Combinatorial optimization: Networks and Matroids. Holt, Rinehart and Winston, 1976.
 
25
{LN04} J. Lee and A. Naor. Embedding the diamond graph in lp and dimension reduction in l1. Geometric and Functional Analysis (GAFA), 14(4):745--747, 2004.
26
27
28
 
29
30
 
31
{Yao83} A. C-C. Yao. Lower bounds by probabilistic arguments. In Proceedings of the Symposium on Foundations of Computer Science, pages 420--428, 1983.


Collaborative Colleagues:
Alexandr Andoni: colleagues
Piotr Indyk: colleagues
Robert Krauthgamer: colleagues