Co-evolving recurrent neurons learn deep memory POMDPs
Source: Genetic And Evolutionary Computation Conference
Proceedings of the 2005 conference on Genetic and evolutionary computation
Washington DC, USA
SESSION: Coevolution
Pages: 491-498
Year of Publication: 2005
ISBN:1-59593-010-8
Authors
Faustino J. Gomez  IDSIA, Lugano, Switzerland
Jürgen Schmidhuber  IDSIA, Lugano, Switzerland and TU Munich, München, Germany
Sponsors
SIGEVO: ACM Special Interest Group on Genetic and Evolutionary Computation
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1068009.1068092

ABSTRACT

Recurrent neural networks are theoretically capable of learning complex temporal sequences, but training them with gradient descent is too slow and unstable for practical use in reinforcement learning environments. Neuroevolution, the evolution of artificial neural networks using genetic algorithms, can potentially solve real-world reinforcement learning tasks that require deep use of memory, i.e., memory spanning hundreds or thousands of inputs, by searching the space of recurrent neural networks directly. In this paper, we introduce a new neuroevolution algorithm called Hierarchical Enforced SubPopulations that simultaneously evolves networks at two levels of granularity: full networks, and network components (neurons). We demonstrate the method on two POMDP tasks that involve temporal dependencies of up to thousands of time-steps, and show that it is faster and simpler than the current best conventional reinforcement learning system on these tasks.
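
The two-level idea in the abstract can be illustrated with a toy sketch. This is not the published Hierarchical Enforced SubPopulations algorithm: the recall task, the weight encoding, the selection and mutation scheme, and every name below (run_network, fitness, evolve, and their parameters) are assumptions made for illustration only. The sketch keeps one subpopulation of neurons per hidden unit, assembles networks from randomly chosen neurons, and separately keeps and mutates a small population of complete networks.

```python
import math
import random


def run_network(neurons, cue, delay):
    """Run a small fully recurrent tanh network and return its final output.

    Hypothetical encoding: each neuron is [w_in, w_rec_0 .. w_rec_{h-1}, w_out].
    """
    h = len(neurons)
    state = [0.0] * h
    for t in range(delay + 1):
        x = cue if t == 0 else 0.0  # the cue is visible only on the first step
        state = [math.tanh(n[0] * x + sum(n[1 + j] * state[j] for j in range(h)))
                 for n in neurons]
    return sum(n[-1] * s for n, s in zip(neurons, state))


def fitness(neurons, delay=20):
    """Toy memory task (an assumption): recall a cue seen `delay` steps ago."""
    return -sum((run_network(neurons, cue, delay) - cue) ** 2
                for cue in (-1.0, 1.0))


def mutate(neuron, rng, scale=0.3):
    """Return a copy of the neuron with Gaussian noise added to every weight."""
    return [w + rng.gauss(0.0, scale) for w in neuron]


def keep_best(pool, k):
    """Keep the k highest-fitness (fitness, network) pairs, best first."""
    return sorted(pool, key=lambda e: e[0], reverse=True)[:k]


def evolve(hidden=2, sub_size=10, net_pop=5, generations=30, trials=40, seed=0):
    """Toy two-level search; sub_size is assumed to be even."""
    rng = random.Random(seed)
    glen = hidden + 2  # input weight + recurrent weights + output weight
    # Neuron level: one subpopulation of candidate neurons per hidden unit.
    subpops = [[[rng.uniform(-1.0, 1.0) for _ in range(glen)]
                for _ in range(sub_size)] for _ in range(hidden)]
    networks = []  # network level: (fitness, network) pairs
    history = []   # best network fitness after each generation
    for _ in range(generations):
        scores = [[[] for _ in range(sub_size)] for _ in range(hidden)]
        for _ in range(trials):
            # Assemble a network from one randomly chosen neuron per slot.
            picks = [rng.randrange(sub_size) for _ in range(hidden)]
            net = [list(subpops[i][p]) for i, p in enumerate(picks)]
            f = fitness(net)
            for i, p in enumerate(picks):
                scores[i][p].append(f)  # credit every participating neuron
            networks.append((f, net))
        # Network level: keep the best complete networks and mutate them whole.
        networks = keep_best(networks, net_pop)
        children = [[mutate(n, rng, 0.1) for n in net] for _, net in networks]
        networks = keep_best(networks + [(fitness(c), c) for c in children],
                             net_pop)
        history.append(networks[0][0])
        # Neuron level: rank by average fitness, refill with mutated survivors.
        for i in range(hidden):
            avg = [sum(s) / len(s) if s else float("-inf") for s in scores[i]]
            order = sorted(range(sub_size), key=lambda k: avg[k], reverse=True)
            parents = [subpops[i][k] for k in order[:sub_size // 2]]
            subpops[i] = parents + [mutate(n, rng) for n in parents]
            # Feed the best network's neuron back into its subpopulation.
            subpops[i][-1] = list(networks[0][1][i])
    return networks[0], history
```

Calling evolve() returns the best (fitness, network) pair found and the best fitness after each generation. Because the best complete networks are carried over between generations in this sketch, that per-generation best never decreases, even when the neuron subpopulations are disrupted by mutation.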



