Blockwise coordinate descent procedures for the multi-task lasso, with applications to neural semantic basis discovery
ACM International Conference Proceeding Series; Vol. 382
Proceedings of the 26th Annual International Conference on Machine Learning
Montreal, Quebec, Canada
Pages 649-656
Year of Publication: 2009
ISBN:978-1-60558-516-1
ABSTRACT
We develop a cyclical blockwise coordinate descent algorithm for the multi-task Lasso that efficiently solves problems with thousands of features and tasks. The main result shows that a closed-form Winsorization operator can be obtained for sup-norm penalized least squares regression. This allows the algorithm to find solutions to very large-scale problems far more efficiently than existing methods. This result complements the pioneering work of Friedman et al. (2007) for the single-task Lasso. As a case study, we use the multi-task Lasso as a variable selector to discover a semantic basis for predicting human neural activation. The learned solution outperforms the standard basis for this task on the majority of test participants, while requiring far fewer assumptions about cognitive neuroscience. We demonstrate how this learned basis can yield insights into how the brain represents the meanings of words.
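The Winsorization step mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper. It assumes the per-feature subproblem has the form argmin_b 0.5*||c - b||^2 + lam*max_k|b_k|, which holds only under a simplifying assumption such as unit-norm feature columns. The function name `winsorize_sup_norm` and the arguments `c` (one feature's unpenalized coefficients across tasks) and `lam` are hypothetical. The clipping level is found with a standard sort-based search for the threshold t satisfying sum_k max(|c_k| - t, 0) = lam, which may differ from the authors' own procedure.

```python
def winsorize_sup_norm(c, lam):
    """Sketch: solve argmin_b 0.5*||c - b||_2^2 + lam*max_k |b_k|.

    The minimizer clips (Winsorizes) every entry of c to [-t, t], where
    t >= 0 satisfies sum_k max(|c_k| - t, 0) == lam.
    Assumes lam >= 0; names and interface are illustrative only.
    """
    if lam <= 0:
        # No penalty: the minimizer is c itself.
        return list(c)

    abs_c = [abs(v) for v in c]
    if sum(abs_c) <= lam:
        # Penalty dominates: the whole block is set to zero,
        # i.e. the feature is dropped for every task at once.
        return [0.0 for _ in c]

    # Sort magnitudes in decreasing order and scan for the clipping level.
    u = sorted(abs_c, reverse=True)
    running = 0.0
    t = 0.0
    for m, u_m in enumerate(u, start=1):
        running += u_m
        candidate = (running - lam) / m
        if u_m > candidate:
            # The m largest magnitudes are the ones that get clipped.
            t = candidate

    # Clip each entry to [-t, t]; entries already inside are unchanged.
    return [max(-t, min(t, v)) for v in c]


if __name__ == "__main__":
    print(winsorize_sup_norm([3, -1, 0.5], 1))
    print(winsorize_sup_norm([0.3, -0.2], 1))
```

In a cyclical blockwise scheme, an operator of this kind would be applied to one feature's coefficient block at a time while the other blocks are held fixed. The all-zero case is what lets the sup-norm penalty act as a variable selector shared across tasks.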
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
[1]
[2]
[3] Friedman, J., Hastie, T., Höfling, H., & Tibshirani, R. (2007). Pathwise coordinate optimization. The Annals of Applied Statistics, 1, 302-332.
[4] Friedman, J. H., Hastie, T., & Tibshirani, R. (2008). Regularized paths for generalized linear models via coordinate descent. Technical report, Stanford University.
[5] Fu, W. J. (1998). Penalized regressions: The bridge versus the lasso. Journal of Computational and Graphical Statistics, 7, 397-416.
[6] Hastie, T., Tibshirani, R., & Friedman, J. H. (2001). The elements of statistical learning: Data mining, inference, and prediction. Springer-Verlag.
[7] Mallows, C. L. (Ed.). (1990). The collected works of John W. Tukey. Volume VI: More mathematical, 1938-1984. Wadsworth & Brooks/Cole.
[8] Mitchell, T., et al. (2008). Predicting human brain activity associated with the meanings of nouns. Science, 320, 1191-1195.
[9] Rockafellar, R. T., & Wets, R. J.-B. (1998). Variational analysis. Springer-Verlag Inc.
[10]
[11] Turlach, B., Venables, W. N., & Wright, S. J. (2005). Simultaneous variable selection. Technometrics, 47, 349-363.
[12] Wu, T. T., & Lange, K. (2008). Coordinate descent algorithms for lasso penalized regression. The Annals of Applied Statistics, 2, 224-244.
[13] Zhang, J. (2006). A probabilistic framework for multitask learning (Technical Report CMU-LTI-06-006). Ph.D. thesis, Carnegie Mellon University.
[14] Zhao, P., Rocha, G., & Yu, B. (2009). The grouped and hierarchical model selection through composite absolute penalties. The Annals of Statistics (to appear).