Self-taught learning: transfer learning from unlabeled data
Source: ICML; Vol. 227
Proceedings of the 24th International Conference on Machine Learning
Corvallis, Oregon
Pages: 759-766
Year of Publication: 2007
ISBN: 978-1-59593-793-3
Authors
Rajat Raina  Stanford University, CA
Alexis Battle  Stanford University, CA
Honglak Lee  Stanford University, CA
Benjamin Packer  Stanford University, CA
Andrew Y. Ng  Stanford University, CA
Sponsor
Machine Learning Journal
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 55,   Downloads (12 Months): 251,   Citation Count: 17
DOI: http://doi.acm.org/10.1145/1273496.1273592

ABSTRACT

We present a new machine learning framework called "self-taught learning" for using unlabeled data in supervised classification tasks. We do not assume that the unlabeled data follows the same class labels or generative distribution as the labeled data. Thus, we would like to use a large number of unlabeled images (or audio samples, or text documents) randomly downloaded from the Internet to improve performance on a given image (or audio, or text) classification task. Such unlabeled data is significantly easier to obtain than in typical semi-supervised or transfer learning settings, making self-taught learning widely applicable to many practical learning problems. We describe an approach to self-taught learning that uses sparse coding to construct higher-level features using the unlabeled data. These features form a succinct input representation and significantly improve classification performance. When using an SVM for classification, we further show how a Fisher kernel can be learned for this representation.
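
The pipeline the abstract outlines (learn bases from unlabeled data with sparse coding, then represent labeled inputs by their sparse activations) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the common L1-penalized reconstruction objective, swaps in a plain ISTA solver for the activations and a simple gradient step for the bases, and renormalizes each basis to unit length as a simplification. All function names, parameters, and the toy data are hypothetical.

```python
import math
import random


def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of the L1 penalty)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def reconstruct(bases, a):
    """Return sum_j a[j] * bases[j] as a list of floats."""
    dim = len(bases[0])
    return [sum(a[j] * bases[j][d] for j in range(len(bases))) for d in range(dim)]


def sparse_code(x, bases, beta=0.1, steps=200):
    """ISTA for min_a 0.5 * ||x - sum_j a_j b_j||^2 + beta * ||a||_1."""
    # Step size from a Lipschitz bound: the largest eigenvalue of B^T B
    # is at most its trace, i.e. the sum of squared basis norms.
    step = 1.0 / sum(sum(v * v for v in b) for b in bases)
    a = [0.0] * len(bases)
    for _ in range(steps):
        residual = [r - xi for r, xi in zip(reconstruct(bases, a), x)]
        grad = [sum(b[d] * residual[d] for d in range(len(x))) for b in bases]
        a = [soft_threshold(a[j] - step * grad[j], step * beta)
             for j in range(len(bases))]
    return a


def normalize(b):
    """Rescale b to unit Euclidean norm (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(v * v for v in b)) or 1.0
    return [v / norm for v in b]


def learn_bases(unlabeled, k, beta=0.1, epochs=5, lr=0.1, seed=0):
    """Alternate between sparse coding each example and nudging the bases."""
    rng = random.Random(seed)
    dim = len(unlabeled[0])
    bases = [normalize([rng.gauss(0, 1) for _ in range(dim)]) for _ in range(k)]
    for _ in range(epochs):
        for x in unlabeled:
            a = sparse_code(x, bases, beta, steps=50)
            err = [xi - r for xi, r in zip(x, reconstruct(bases, a))]
            # Gradient step on the reconstruction error, then renormalize.
            bases = [normalize([b[d] + lr * a[j] * err[d] for d in range(dim)])
                     for j, b in enumerate(bases)]
    return bases


# Toy data (made up): unlabeled examples need not share the labeled classes.
unlabeled = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1], [0.0, 1.0, 0.1],
             [0.1, 0.9, 0.0], [0.0, 0.1, 1.0], [0.1, 0.0, 0.9]]
labeled = [([1.0, 0.0, 0.0], 0), ([0.0, 1.0, 0.0], 1)]

bases = learn_bases(unlabeled, k=2)
# The sparse activations become the new input features for any classifier.
features = [sparse_code(x, bases) for x, _ in labeled]
```

The resulting `features` would then be passed, together with the labels, to a standard supervised classifier such as an SVM; that step is omitted here to keep the sketch dependency-free.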

