ABSTRACT
Existing recurrent neural networks (RNNs) are limited in their ability to model dynamical systems with nonlinearities and hidden internal states. Here we use our general framework for sequence learning, EVOlution of recurrent systems with LINear Outputs (Evolino), to discover good RNN hidden node weights through evolution, while using linear regression to compute an optimal linear mapping from hidden state to output. Using the Long Short-Term Memory (LSTM) RNN architecture, Evolino outperforms previous state-of-the-art methods on several tasks: 1) context-sensitive languages and 2) multiple superimposed sine waves.
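The division of labour described above (evolution searches the hidden weights, linear regression fits the output layer) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: a plain tanh RNN stands in for LSTM, a simple mutation hill climber stands in for the evolutionary algorithm, and the output weights come from ridge-regularised normal equations. All function names, the toy sine-wave task, and the hyperparameters are hypothetical.

```python
import math
import random

def run_rnn(weights, inputs, n_hidden):
    """Run a tiny tanh RNN (a stand-in for LSTM) and return every hidden state."""
    w_in = weights[:n_hidden]    # one input weight per hidden unit
    w_rec = weights[n_hidden:]   # n_hidden * n_hidden recurrent weights
    h = [0.0] * n_hidden
    states = []
    for x in inputs:
        h = [math.tanh(w_in[i] * x +
                       sum(w_rec[i * n_hidden + j] * h[j] for j in range(n_hidden)))
             for i in range(n_hidden)]
        states.append(h + [1.0])  # constant 1.0 acts as an output bias
    return states

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_readout(states, targets, ridge=1e-6):
    """Linear regression from hidden state to output (assumed ridge term for stability)."""
    k = len(states[0])
    ata = [[sum(s[i] * s[j] for s in states) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    atb = [sum(s[i] * t for s, t in zip(states, targets)) for i in range(k)]
    return solve_linear(ata, atb)

def mse(weights, inputs, targets, n_hidden):
    """Fitness: error of the regression-fitted readout for these hidden weights."""
    states = run_rnn(weights, inputs, n_hidden)
    w_out = fit_readout(states, targets)
    preds = [sum(w * s for w, s in zip(w_out, st)) for st in states]
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

def evolve(inputs, targets, n_hidden=3, generations=20, pop_size=8, seed=0):
    """Hill-climb the hidden weights by Gaussian mutation; only improvements are kept."""
    rng = random.Random(seed)
    n_w = n_hidden + n_hidden * n_hidden
    best = [rng.gauss(0, 1) for _ in range(n_w)]
    best_err = mse(best, inputs, targets, n_hidden)
    history = [best_err]
    for _ in range(generations):
        for _ in range(pop_size):
            child = [w + rng.gauss(0, 0.3) for w in best]
            err = mse(child, inputs, targets, n_hidden)
            if err < best_err:
                best, best_err = child, err
        history.append(best_err)
    return best, best_err, history

# Toy task (hypothetical): one-step-ahead prediction of two superimposed sine waves.
series = [math.sin(0.2 * t) + math.sin(0.311 * t) for t in range(61)]
inputs, targets = series[:-1], series[1:]
best, best_err, history = evolve(inputs, targets)
print("initial error:", history[0], "final error:", best_err)
```

Because the readout is refit by regression for every candidate, the search only has to find hidden dynamics whose states are linearly useful for the target. The recorded error never increases from one generation to the next, since a mutated candidate replaces the current best only when it lowers the error.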