|
ABSTRACT
Sparse coding---that is, modelling data vectors as sparse linear combinations of basis elements---is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on learning the basis set, also called dictionary, to adapt it to specific data, an approach that has recently proven to be very effective for signal reconstruction and classification in the audio and image processing domains. This paper proposes a new online optimization algorithm for dictionary learning, based on stochastic approximations, which scales up gracefully to large datasets with millions of training samples. A proof of convergence is presented, along with experiments with natural images demonstrating that it leads to faster performance and better dictionaries than classical batch algorithms for both small and large datasets.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Aharon, M., & Elad, M. (2008). Sparse and redundant modeling of image content using an image-signature-dictionary. SIAM Imaging Sciences, 1, 228--247.
|
| |
2
|
Aharon, M., Elad, M., & Bruckstein, A. M. (2006). The K-SVD: An algorithm for designing of overcomplete dictionaries for sparse representations. IEEE Transactions Signal Processing, 54, 4311--4322
|
| |
3
|
Bertsekas, D. (1999). Nonlinear programming. Athena Scientific Belmont, Mass.
|
| |
4
|
Bickel, P., Ritov, Y., & Tsybakov, A. (2007). Simultaneous analysis of Lasso and Dantzig selector. preprint.
|
| |
5
|
|
| |
6
|
Borwein, J., & Lewis, A. (2006). Convex analysis and nonlinear optimization: theory and examples. Springer.
|
| |
7
|
|
| |
8
|
Bottou, L., & Bousquet, O. (2008). The tradeoffs of large scale learning. Advances in Neural Information Processing Systems, 20, 161--168.
|
| |
9
|
|
| |
10
|
Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression. Annals of Statistics, 32, 407--499.
|
| |
11
|
Elad, M., & Aharon, M. (2006). Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions Image Processing, 54, 3736--3745.
|
| |
12
|
Fisk, D. (1965). Quasi-martingale. Transactions of the American Mathematical Society, 359--388.
|
| |
13
|
Friedman, J., Hastie, T., Hölfling, H., & Tibshirani, R. (2007). Pathwise coordinate optimization. Annals of Statistics, 1, 302--332.
|
| |
14
|
Fu, W. (1998). Penalized Regressions: The Bridge Versus the Lasso. Journal of computational and graphical statistics, 7, 397--416.
|
| |
15
|
Fuchs, J. (2005). Recovery of exact sparse representations in the presence of bounded noise. IEEE Transactions Information Theory, 51, 3601--3608.
|
| |
16
|
Lee, H., Battle, A., Raina, R., & Ng, A. Y. (2007). Efficient sparse coding algorithms. Advances in Neural Information Processing Systems, 19, 801--808.
|
| |
17
|
Mairal, J., Elad, M., & Sapiro, G. (2008). Sparse representation for color image restoration. IEEE Transactions Image Processing, 17, 53--69.
|
| |
18
|
Mairal, J., Bach, F., Ponce, J., Sapiro, G., & Zisserman, A. (2009). Supervised dictionary learning. Advances in Neural Information Processing Systems, 21, 1033--1040.
|
| |
19
|
Mallat, S. (1999). A wavelet tour of signal processing, second edition. Academic Press, New York.
|
| |
20
|
Olshausen, B. A., & Field, D. J. (1997). Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37, 3311--3325.
|
| |
21
|
Osborne, M., Presnell, B., & Turlach, B. (2000). A new approach to variable selection in least squares problems. IMA Journal of Numerical Analysis, 20, 389--403.
|
| |
22
|
Protter, M., & Elad, M. (2009). Image sequence denoising via sparse and redundant representations. IEEE Transactions Image Processing, 18, 27--36.
|
 |
23
|
Rajat Raina , Alexis Battle , Honglak Lee , Benjamin Packer , Andrew Y. Ng, Self-taught learning: transfer learning from unlabeled data, Proceedings of the 24th international conference on Machine learning, p.759-766, June 20-24, 2007, Corvalis, Oregon
[doi> 10.1145/1273496.1273592]
|
| |
24
|
Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society Series B, 67, 267--288.
|
| |
25
|
Van der Vaart, A. (1998). Asymptotic Statistics. Cambridge University Press.
|
| |
26
|
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society Series B, 67, 301--320.
|
|