|
ABSTRACT
Given a matrix A it is often desirable to find an approximation to A that has low rank. We introduce a simple technique for accelerating the computation of such approximations when A has strong spectral structure, i.e., when the singular values of interest are significantly greater than those of a random matrix with size and entries similar to A. Our technique amounts to independently sampling and/or quantizing the entries of A, thus speeding up computation by reducing the number of non-zero entries and/or the length of their representation. Our analysis is based on observing that the acts of sampling and quantization can be viewed as adding a random matrix E to A, whose entries are independent random variables with zero-mean and bounded variance. Since, with high probability, E has very weak spectral structure, we can prove that the effect of sampling and quantization nearly vanishes when a low rank approximation to A+E is computed. In fact, the stronger the spectral structure of A, the more of its entries we can afford to discard and, ultimately, the faster we can discover that structure. We give bounds on the quality of our approximation both in the L2 and in the Frobenius norm.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Yossi Azar, Amos Fiat, Anna Karlin, Frank McSherry, and Jared Saia, Data mining through spectral analysis, These Proceedings.
|
| |
2
|
|
| |
3
|
|
| |
4
|
Scott Deerwester, Susan T. Dumais, Thomas K. Landauer, George W. Furnas, and Richard A. Harshman, Indexing by latent semantic analysis, J. Soc. Inf. Sci. 41 (1990), no. 6, 391-407.
|
| |
5
|
P. Drineas , Alan Frieze , Ravi Kannan , Santosh Vempala , V. Vinay, Clustering in large graphs and matrices, Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms, p.291-299, January 17-19, 1999, Baltimore, Maryland, United States
|
| |
6
|
|
| |
7
|
Zoltan F.uredi and Janos Komlos, The eigenvalues of random symmetric matrices, Combinatorica 1 (1981), no. 3, 233-241.
|
| |
8
|
|
| |
9
|
Ferenc Juhasz, On the spectrum of a random graph, Algebraic methods in graph theory, Vol. I, II (Szeged, 1978), North-Holland, Amsterdam, 1981, pp. 313-316.
|
 |
10
|
|
| |
11
|
Michael Krivelevich and Van H. Vu, On the concentration of eigenvalues of random symmetric matrices, Tech. Report 60, Microsoft Research, 2000.
|
 |
12
|
Christos H. Papadimitriou , Hisao Tamaki , Prabhakar Raghavan , Santosh Vempala, Latent semantic indexing: a probabilistic analysis, Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, p.159-168, June 01-04, 1998, Seattle, Washington, United States
[doi> 10.1145/275487.275505]
|
| |
13
|
Matthew Turk and Alex Pentland, Eigenfaces for recognition, J. Cogn. Neurosci. 3 (1991), no. 1, 71-86.
|
| |
14
|
Eugene P. Wigner, On the distribution of the roots of certain symmetric matrices, Ann. of Math. (2) 67 (1958), 325-327.
|
CITED BY 26
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Daniel A. Spielman , Shang-Hua Teng, Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems, Proceedings of the thirty-sixth annual ACM symposium on Theory of computing, June 13-16, 2004, Chicago, IL, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Amit Deshpande , Luis Rademacher , Santosh Vempala , Grant Wang, Matrix approximation and projective clustering via volume sampling, Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm, p.1117-1126, January 22-26, 2006, Miami, Florida
|
|
|
|
|
|
|
|
|
A. C. Gilbert , S. Guha , P. Indyk , S. Muthukrishnan , M. Strauss, Near-optimal sparse fourier representations via sampling, Proceedings of the thiry-fourth annual ACM symposium on Theory of computing, May 19-21, 2002, Montreal, Quebec, Canada
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|