|
ABSTRACT
The crossover bias theory for bloat [18] is a recent result which predicts that bloat is caused by the sampling of short, unfit programs. This theory is clear and simple, but it has some weaknesses: (1) it implicitly assumes that the population is large enough to allow sampling of all relevant program sizes (although it does explain what to expect in the many practical cases where this is not true, e.g., because the population is small); (2) it does not explain what is meant by its assumption that short programs are unfit. In this paper we discuss these weaknesses and propose a refined version of the crossover bias theory that clarifies the relationship between bloat and finite populations, and explains what features of the fitness landscape cause bloat to occur. The theory, in particular, predicts that smaller populations will bloat more slowly than larger ones. Additionally, the theory predicts that bloat will only be observed in problems where short programs are less fit than longer ones when looking at samples created by fitness-based importance sampling, i.e. samplings of the search space in which fitter programs have a higher probability of being sampled (e.g., the Metropolis-Hastings method). Experiments with two classical GP benchmarks fully corroborate the theory.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
T. Blickle and L. Thiele. Genetic programming and redundancy. In J. Hopf, editor, Genetic Algorithms within the Framework of Evolutionary Computation (Workshop at KI-94, Saarbrcken), pages 33--38, Saarbrcken, Germany, 1994. Max-Planck-Institut fr Informatik (MPI-I-94-241).
|
| |
2
|
S. Chib and E. Greenberg. Understanding the metropolis-hastings algorithm. The American Statistician, 49(4):327--335, Nov. 1995.
|
 |
3
|
|
| |
4
|
J. R. Koza. A genetic approach to the truck backer upper problem and the inter-twined spiral problem. In Proceedings of IJCNN International Joint Conference on Neural Networks, volume IV, pages 310--318. IEEE Press, 1992.
|
| |
5
|
|
| |
6
|
|
| |
7
|
W. B. Langdon. How many good programs are there? How long are they? In K. A. De Jong, et al., editors, Foundations of Genetic Algorithms VII, pages 183--202, Torremolinos, Spain, 4-6 Sept. 2002. Morgan Kaufmann. Published 2003.
|
| |
8
|
W. B. Langdon. Convergence of program fitness landscapes. In E. Cantú-Paz, et al., editors, Genetic and Evolutionary Computation -- GECCO-2003, volume 2724 of LNCS, pages 1702--1714, Chicago, 12-16 July 2003. Springer-Verlag.
|
| |
9
|
W. B. Langdon. The distribution of reversible functions is Normal. In R. L. Riolo and B. Worzel, editors, Genetic Programming Theory and Practise, chapter 11, pages 173--188. Kluwer, 2003.
|
| |
10
|
W. B. Langdon. The distribution of amorphous computer outputs. In S. Stepney and S. Emmott, editors, The Grand Challenge in Non-Classical Computation: International Workshop, York, UK, 18-19 Apr. 2005.
|
| |
11
|
|
| |
12
|
W. B. Langdon and R. Poli. The halting probability in von Neumann architectures. In P. Collet, et al., editors, Proceedings of the 9th European Conference on Genetic Programming, volume 3905 of Lecture Notes in Computer Science, pages 225--237, Budapest, Hungary, 10 - 12 Apr. 2006. Springer.
|
| |
13
|
|
| |
14
|
|
 |
15
|
|
| |
16
|
|
| |
17
|
R. Poli and W. B. Langdon. Efficient markov chain model of machine code program execution and halting. In R. L. Riolo, et al., editors, Genetic Programming Theory and Practice IV, volume 5 of Genetic and Evolutionary Computation, chapter 13. Springer, Ann Arbor, 11-13 May 2006.
|
| |
18
|
R. Poli, W. B. Langdon, and S. Dignum. On the limiting distribution of program sizes in tree-based genetic programming. In M. Ebner, et al., editors, Proceedings of the 10th European Conference on Genetic Programming, volume 4445 of Lecture Notes in Computer Science, pages 193--204, Valencia, Spain, 11 - 13 Apr. 2007. Springer.
|
| |
19
|
R. Poli, W. B. Langdon, and N. F. McPhee. A field guide to genetic programming. Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk, 2008. (With contributions by J. R. Koza).
|
| |
20
|
|
| |
21
|
S. Silva and J. Almeida. Dynamic maximum tree depth. In E. Cantú-Paz, et al., editors, Genetic and Evolutionary Computation -- GECCO-2003, volume 2724 of LNCS, pages 1776--1787, Chicago, 12-16 July 2003. Springer-Verlag.
|
| |
22
|
|
| |
23
|
L. Vanneschi. Theory and Practice for Efficient Genetic Programming. Ph.D. thesis, Faculty of Sciences, University of Lausanne, Switzerland, 2004.
|
 |
24
|
|
|