What kind of idiot makes jokes in their thesis, you might ask? This kind of idiot.
Using the weighting scheme described in 3.1.1, the notion was developed that decreasing the strength of the weighting as generations increased could actually be useful to improving the quality of results. The strength of the weighting can be considered to be analogous to an evolutionary pressure, or the advantage that a fit individual will have over an unfit individual. Due to this, it could be seen as counter-intuitive that decreasing the evolutionary pressure would produce better individual chromosomes, but a possible explanation of this theory is shown in Figure 4, below, with an accompanying explanation.
Near the start of a test with an evolutionary algorithm, the evolutionary pressure is high, but the average score is bad. This is described by part a of Figure 4. The dark grey areas represent parts of the solution space surrounding local optima. These are the areas into which the algorithm is guiding the chromosomes. Not all of them will be found, however, and the light grey areas surrounding the dark grey areas represent the parts of the solution space that will eventually lead into one of the optima. These light grey areas will only lead into the optima because the current average scores are so bad, that even with high pressure, much of the solution space is reachable.
After a while, part b of Figure 4 is reached. Good best and average scores will reduce the size of the areas which will lead to a local optimum. The black regions indicate these optima. The patch inside the black region near the top-left represents a current chromosome stuck in a local optimum that is not the global optimum. Near the bottom left is a light patch inside a dark grey region. This light patch represents the global optimum in the search space. There is very little chance that a chromosome in the local optimum will manage to mutate and hit the global optimum which is very small. This is why, after some generations have passed, if the evolutionary pressure is eased off, the dark grey region of part
18b of Figure 4 can be reached, instead of just the pinhead in the middle of it. This does not guarantee that a better result will be reached, but it increases the chances of it happening.
This is why the weighting aggression should to be lowered, so that the light grey areas can be reached. This way, it still restricts results to being far better than the earliest generations, but it allows larger deviations from the current best so that the other optima can be found.
This is analogous to human evolution. In the animal kingdom, the weak are killed with no regard for whether they could become stronger in a different way if given the opportunity. Humanity, however, has evolved to the level where the weak are allowed to survive and breed. This is a trade-off. It allows those who are weak with minimal chance of imparting anything useful to society to exist for the benefit of allowing the chance few, who manage to overcome their lack of traditional fitness to gain fitness in other ways.
For example, the humble nerd would have begun in human evolution as the creature unable to hunt effectively due to lacklustre hand-eye coordination, small muscles and a constant wheezing that would scare off possible quarry. The nerd however, managed to utilise his spare time to other ends, such as inventing tools and discovering ideas that were of more benefit to his society than any individual's muscles.
So basically, it is a matter of being strict until a viable result is achieved, and then relaxing until the optimal result is reached. The time to switch is possibly located at the point of diminishing returns, such as where no significant improvement has been noticed in x generations.
This weighting scheme would result in the best score graph changing from an exponential graph asymptoting at an optimal result to a series of exponentials, each smaller than the previous, representing the finding of new local optima, and all combining to make an approximate exponential graph, overall. The average score graph would then be expected to increase briefly after each new local optimum is found, as in the first generations after an optimum is found, the genetic operators would produce more poor results than after the new optimum becomes the standard.