ACM Home Page
Please provide us with feedback. Feedback
Sampling algorithms for l2 regression and applications
Full text PdfPdf (290 KB)
Source Symposium on Discrete Algorithms archive
Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm table of contents
Miami, Florida
Pages: 1127 - 1136  
Year of Publication: 2006
ISBN:0-89871-605-5
Authors
Petros Drineas  Rensselaer Polytechnic Institute, Troy, New York
Michael W. Mahoney  Yahoo Research Labs, Sunnyvale, California
S. Muthukrishnan  Rutgers University, New Brunswick NJ
Sponsors
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
: SIAM Activity Group on Discrete Mathematics
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 10,   Downloads (12 Months): 47,   Citation Count: 4
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1109557.1109682
What is a DOI?

ABSTRACT

We present and analyze a sampling algorithm for the basic linear-algebraic problem of l2 regression. The l2 regression (or least-squares fit) problem takes as input a matrix A ∈ Rn×d (where we assume nd) and a target vector b ∈ Rn, and it returns as output Z = minx∈Rd |b - Ax|2. Also of interest is xopt = A+b, where A+ is the Moore-Penrose generalized inverse, which is the minimum-length vector achieving the minimum. Our algorithm randomly samples r rows from the matrix A and vector b to construct an induced l2 regression problem with many fewer rows, but with the same number of columns. A crucial feature of the algorithm is the nonuniform sampling probabilities. These probabilities depend in a sophisticated manner on the lengths, i.e., the Euclidean norms, of the rows of the left singular vectors of A and the manner in which b lies in the complement of the column space of A. Under appropriate assumptions, we show relative error approximations for both Z and xopt. Applications of this sampling methodology are briefly discussed.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

1
 
2
R. Bhatia. Matrix Analysis. Springer-Verlag, New York, 1997.
 
3
 
4
P. Drineas, R. Kannan, and M. W. Mahoney. Fast Monte Carlo algorithms for matrices I: Approximating matrix multiplication. Technical Report YALEU/DCS/TR-1269, Yale University Department of Computer Science, New Haven, CT, February 2004. Accepted for publication in the SIAM Journal on Computing.
 
5
P. Drineas, R. Kannan, and M. W. Mahoney. Fast Monte Carlo algorithms for matrices II: Computing a low-rank approximation to a matrix. Technical Report YALEU/DCS/TR-1270, Yale University Department of Computer Science, New Haven, CT, February 2004. Accepted for publication in the SIAM Journal on Computing.
 
6
P. Drineas, R. Kannan, and M. W. Mahoney. Fast Monte Carlo algorithms for matrices III: Computing a compressed approximate matrix decomposition. Technical Report YALEU/DCS/TR-1271, Yale University Department of Computer Science, New Haven, CT, February 2004. Accepted for publication in the SIAM Journal on Computing.
 
7
P. Drineas and M. W. Mahoney. A randomized algorithm for a tensor-based generalization of the Singular Value Decomposition. Technical Report YALEU/DCS/TR-1327, Yale University Department of Computer Science, New Haven, CT, June 2005.
 
8
9
 
10
G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, Baltimore, 1989.
 
11
 
12
S. Sathiya Keerthi , Dennis DeCoste, A Modified Finite Newton Method for Fast Solution of Large Scale Linear SVMs, The Journal of Machine Learning Research, 6, p.341-361, 9/1/2005
 
13
 
14
M. Z. Nashed, editor. Generalized Inverses and Applications. Academic Press, New York, 1976.
 
15
L. Rademacher, S. Vempala, and G. Wang. Matrix approximation and projective clustering via iterative sampling. Technical Report MIT-LCS-TR-983, Massachusetts Institute of Technology, Cambridge, MA, March 2005.
 
16
M. Rudelson and R. Vershynin. Approximation of matrices. manuscript.
 
17
G. W. Stewart and J. G. Sun. Matrix Perturbation Theory. Academic Press, New York, 1990.
 
18
R. Vershynin. Coordinate restrictions of linear operators in ln2. manuscript.


Collaborative Colleagues:
Petros Drineas: colleagues
Michael W. Mahoney: colleagues
S. Muthukrishnan: colleagues