ACM Home Page
Please provide us with feedback. Feedback
Algorithms for subset selection in linear regression
Full text PdfPdf (276 KB)
Source
Annual ACM Symposium on Theory of Computing archive
Proceedings of the 40th annual ACM symposium on Theory of computing table of contents
Victoria, British Columbia, Canada
SESSION: 1B table of contents
Pages 45-54  
Year of Publication: 2008
ISBN:978-1-60558-047-0
Authors
Abhimanyu Das  University of Southern California, Los Angeles, CA, USA
David Kempe  University of Southern California, Los Angeles, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 15,   Downloads (12 Months): 187,   Citation Count: 5
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1374376.1374384
What is a DOI?

ABSTRACT

We study the problem of selecting a subset of k random variables to observe that will yield the best linear prediction of another variable of interest, given the pairwise correlations between the observation variables and the predictor variable. Under approximation preserving reductions, this problem is equivalent to the "sparse approximation" problem of approximating signals concisely. The subset selection problem is NP-hard in general; in this paper, we propose and analyze exact and approximation algorithms for several special cases of practical interest. Specifically, we give an FPTAS when the covariance matrix has constant bandwidth, and exact algorithms when the associated covariance graph, consisting of edges for pairs of variables with non-zero correlation, forms a tree or has a large (known) independent set. Furthermore, we give an exact algorithm when the variables can be embedded into a line such that the covariance decreases exponentially in the distance, and a constant-factor approximation when the variables have no "conditional suppressor variables". Much of our reasoning is based on perturbation results for the R2 multiple correlation measure, which is frequently used as a natural measure for "goodness-of-fit statistics". It lies at the core of our FPTAS, and also allows us to extend our exact algorithms to approximation algorithms when the matrix "nearly" falls into one of the above classes. We also use our perturbation analysis to prove approximation guarantees for the widely used "Forward Regression" heuristic under the assumption that the observation variables are nearly independent.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
K. Anstreicher, M. Fampa, J. Lee, and J. Williams. An exact algorithm for maximum entropy sampling. Operations Research, 43(4):684--691, 1995.
 
2
 
3
 
4
E. J. Candes, J. Romberg, and T. Tao. Stable signal recovery from incomplete and inaccurate measurements. Communications on Pure and Applied Mathematics, 59:1207--1223, 2005.
 
5
W. Cochran. Some effects of errors of measurement on multiple correlation. Journal of the American Statistical Association, 65(329):22--34, 1970.
 
6
J. Cohen and P. Cohen. Applied multiple regression/correlation analysis for the behavioral sciences. Lawrence Erlbaum Assoc Publishers, 2003.
 
7
G. Cornuejols, M. Fisher, and G. Nemhauser. Location of bank accounts to optimize float. Management Science, 23:789--810, 1977.
 
8
 
9
G. Davis, S. Mallat, and M. Avellaneda. Greedy adaptive approximation. Journal of Constructive Approximation, 13:57--98, 1997.
 
10
 
11
G. Diekhoff. Statistics for the Social and Behavioral Sciences. Wm. C. Brown Publishers, 2002.
 
12
D. Donoho. For most large underdetermined systems of linear equations, the minimal 11-norm near-solution approximates the sparsest near-solution. Communications on Pure and Applied Mathematics, 59:1207--1223, 2005.
 
13
D. L. Donoho, M. Elad, and V. N. Temlyakov. Stable recovery of sparse overcomplete representations in the presence of noise. IEEE Transaction on Information Theory, 52:6--18, 2006.
 
14
V. F. Flack and P. C. Chang. Frequency of selecting noise variables in subset regression analysis: A simulation study. The American Statistician Journal, 41(1):84--86, 1987.
 
15
16
 
17
 
18
 
19
 
20
P. Liaskovitis and C. Schurgers. Leveraging redundancy in sampling-interpolation applications for sensor networks. In Proc. 3rd Intl. Conf. on Distributed Computing in Sensor Systems, 2007.
 
21
A. Miller. Subset Selection in Regression. Chapman and Hall, second edition, 2002.
 
22
 
23
 
24
G. Nemhauser, L. Wolsey, and M. Fisher. An analysis of the approximations for maximizing submodular set functions. Mathematical Programming, 14:265--294, 1978.
 
25
 
26
M. H. Pesaran and R. J. Smith. A generalized r2 criterion for regression models estimated by the instrumental variables method. Econometrica, 62(3):705--710, 1994.
 
27
J. Saxe. Dynamic programming algorithms for recognizing small bandwidth graphs in polynomial time. SIAM Journal on Algebraic Methods I, 1(4):363--369, 1980.
 
28
G. W. Stewart and J.G. Sun. Matrix Perturbation Theory. Academic Press, 1990.
 
29
 
30
V. Temlyakov. Nonlinear methods of approximation. Foundations of Computational Mathematics, 3:33--107, 2002.
 
31
R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of Royal Statistical Society, 58:267--288, 1996.
 
32
J. Tropp. Greed is good: algorithmic results for sparse approximation. IEEE Trans. Information Theory, 50:2231--2242, 2004.
 
33
J. Tropp. Topics in Sparse Approximation. PhD thesis, University of Texas, Austin, 2004.
 
34
J. Tropp. Just relax: Convex programming methods for identifying sparse signals. IEEE Trans. Information Theory, 51:1030--1051, 2006.
 
35
J. Tropp, A. Gilbert, S. Muthukrishnan, and M. Strauss. Improved sparse approximation over quasi-incoherent dictionaries. In Proc. IEEE-ICIP, 2003.
 
36
W. F. Velicer. Suppressor variables and the semipartial correlation coefficient. Educational and Psychological Measurement, 38:953--958, 1978.
 
37
M. Wainwright. Sharp thresholds for noisy and high-dimensional recovery of sparsity using l1-constrained quadratic programming. In Proc. Allerton Conference on Communication, 2006.
 
38
D. A. Walker. Suppressor variable(s) importance within a regression model. Journal of College Student Development, 44:127--133, 2003.
 
39
H. Zou and T. Hastie. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B, 67(2):301--320, 2005.


Collaborative Colleagues:
Abhimanyu Das: colleagues
David Kempe: colleagues