ABSTRACT
Evolution of neural networks, or neuroevolution, has been successful on many low-level control problems such as pole balancing, vehicle control, and collision warning. However, high-level strategy problems that require the integration of multiple sub-behaviors have remained difficult for neuroevolution to solve. This paper proposes the hypothesis that such problems are difficult because they are fractured: the correct action varies discontinuously as the agent moves from state to state. This hypothesis is evaluated on several examples of fractured high-level reinforcement learning domains. Standard neuroevolution methods such as NEAT indeed have difficulty solving them. However, a modification of NEAT that uses radial basis function (RBF) nodes to make precise local mutations to network output is able to do much better. These results provide a better understanding of the different types of reinforcement learning problems and the limitations of current neuroevolution methods. Thus, they lay the groundwork for creating the next generation of neuroevolution algorithms that can learn strategic high-level behavior in fractured domains.
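The contrast between global and local mutations can be illustrated with a small sketch. It is not the paper's implementation: the Gaussian form of the RBF node, the one-dimensional input, and all names and parameter values (`delta`, `tol`, `width`) are assumptions chosen for illustration. The sketch mutates the output weight of a single hidden node and measures what fraction of the input range sees a noticeable change in network output.

```python
import math

def sigmoid_node(x, weight=1.0, bias=0.0):
    """Standard sigmoid hidden node: active over a wide region of input space."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))

def rbf_node(x, center=0.0, width=0.3):
    """Gaussian RBF hidden node (assumed form): active only near its center."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def affected_fraction(node, inputs, delta=0.5, tol=0.01):
    """Fraction of inputs whose network output changes by more than tol
    when the node's output weight is mutated by delta.

    With output = w * node(x), mutating w by delta shifts the output
    by delta * node(x) at each input x.
    """
    changed = sum(1 for x in inputs if abs(delta * node(x)) > tol)
    return changed / len(inputs)

# 1001 evenly spaced sample states in [-5, 5] (hypothetical 1-D state space).
inputs = [i / 100.0 for i in range(-500, 501)]

sigmoid_fraction = affected_fraction(sigmoid_node, inputs)
rbf_fraction = affected_fraction(rbf_node, inputs)

print(f"sigmoid mutation affects {sigmoid_fraction:.0%} of sampled states")
print(f"RBF mutation affects {rbf_fraction:.0%} of sampled states")
```

Under these assumed settings the sigmoid mutation shifts the output across most of the sampled range, while the RBF mutation is confined to a narrow band around the node's center. That locality is what would let a mutation adjust the action in one region of a fractured domain without disturbing behavior elsewhere.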