Genetic algorithm

Here are some details of the genetic algorithm used to obtain lower bounds for [math]\displaystyle{ c_n }[/math] and [math]\displaystyle{ c'_n }[/math].

Lookup Table

I created a lookup table for each c_n on which I ran the genetic algorithm. This speeds up checking if a chromosome is line-free tremendously, making this tractable. (Any ideas about efficient ways to do this will have a very large effect on the speed of the algorithm).

Encoding Solutions

Chromosome structure: I code the solutions as a list of ones and zeros of length 3^n. The elements are ordered in this way 1…11, 1…12, 1…13, 1…21 and so on. A 0 in position 1 means element 1…11 is not in the set, otherwise it is.

Making a new generation

Selection: Any solutions below the mean of the population are completely ignored. The rest are selected with a probablity related to their score.

Crossover operator: I pick a random number, m between 2 and (3^n) - 1 and I make a new chromosome with the first m elements of a one chromosome and the last (3^n) - m elements of another chromosome.

Mutation operator: I pick a few points at random and flip them.

I check any new chromosomes to make sure they are valid. I do this by going through the chromosome elements in a random order (visiting each one) and removing points that conflict with other points.

Population size

I use elitism. (I keep the best 5 or sometimes 10 between generations.)

Population size: 60

I make more children than I can use (somewhere between 60-80) making the algorithm somewhat Malthusian. I take the best of the children and use them to replace the rest of the population.

Non-standard addition

An extremely effective trick has been to use a greedy algoritm on the list of all solutions to:

1. pick a small but fixed number of points in each chromosome and flip them if they improve the score.

2. run the greedy algorithm on the whole chromosome, going through all positions in a random order, and flipping any that make an improvement.

I vary the number of times I use 1 versus 2 in order to control the computational costs of this step.

I have found that without this step the algorithm is dramatically less good.

Adaptive Elements

The mutation rate is adaptive. I keep statistics on whether there are any repetitions in the chromosomes I generated and if there are, I increase the mutation rate until they disappear. Thus the mutation rate is just high enough to make sure a maximum number of novel solutions are being explored. If I have increased the rate twice in a row then I double the increment size and if I decrease the rate twice in the row then I half the increment size. (This allows quick changes in the rate if necessary.)

I also keep statistics on the probability that crossover and the probability that mutaton improve the score. I randomly choose one or the other in proportion to their effectiveness in the last generation.

Cataloging solutions

I set a criteria for when I think the genetic algorithm is stuck (200 generations with no change in the best performer). If there is no improvement then I restart the algorithm and I store the best solution.

On crossover

Thanks for the suggestion. This is only my second genetic algorithm project so I appreciate the input. I tried it without crossover and crossover does seem to help. In the beginning, I tried crossover on the chomosomes with regular crossover points at the one third and two thirds mark. That did seem to be disasterous, leading to very early convergence.

Final comments

For n less than 8, this works very quickly. For n=5, a quick look at a few examples shows 150 is attained within 40 generations. This gets faster after the program has attained the solution once because a better mutation rate than my default guess is found.

Genetic algorithm

Contents

Lookup Table

Encoding Solutions

Making a new generation

Population size

Non-standard addition

Adaptive Elements

Cataloging solutions

On crossover

Final comments

Navigation menu

Page actions

Page actions

Personal tools

Navigation

Search

Tools