ABSTRACT
This work proposes a learning method for deep architectures that takes advantage of sequential data, in particular the temporal coherence that naturally exists in unlabeled video recordings: two successive frames are likely to contain the same object or objects. This coherence serves as a supervisory signal over the unlabeled data and is used to improve performance on a supervised task of interest. We demonstrate the effectiveness of the method on pose-invariant object and face recognition tasks.
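The abstract does not give the exact objective, so the following is only a minimal sketch of how temporal coherence could act as a supervisory signal, assuming a contrastive-style penalty: representations of consecutive frames are pulled together, and representations of non-consecutive frames are pushed at least a margin apart. The names `l1_distance`, `coherence_loss`, and `margin`, and the choice of the L1 distance, are illustrative assumptions rather than details taken from the text above.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def coherence_loss(z_a, z_b, consecutive, margin=1.0):
    """Hypothetical temporal coherence penalty on two frame representations.

    consecutive=True  : the frames are adjacent in the video, so their
                        representations are encouraged to be close.
    consecutive=False : the frames are not adjacent, so their representations
                        are penalized only if closer than `margin`.
    """
    d = l1_distance(z_a, z_b)
    if consecutive:
        return d
    return max(0.0, margin - d)


if __name__ == "__main__":
    # Toy 2-D representations (made-up numbers, for illustration only).
    frame_t = [0.2, 0.9]
    frame_t_plus_1 = [0.25, 0.85]   # adjacent frame, similar representation
    frame_far = [0.9, 0.1]          # frame from elsewhere in the video

    print(coherence_loss(frame_t, frame_t_plus_1, consecutive=True))
    print(coherence_loss(frame_t, frame_far, consecutive=False))
```

In a training loop, a term like this would be computed on unlabeled frame pairs and minimized alongside the supervised loss on the labeled task, so that both signals shape the same network.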