Machine Learning W4771
Homework 4: Genetic Programming
Introduction
The crossover step in genetic programming plays a role
of “information sharing”. Put more precisely, the information contained
in more than one potential solution can be put in common to generate another
(hopefully a better) potential solution. To evaluate the importance of
crossover characteristics in genetic programming, difference combination
of parameters involved in crossover is used to train players and is played
against random player to see how the performance changes with respect to
the different combination.
Experiment
Performance comparison is done by running the defined genetic player (trained) against random players 50 times, and counting how many times the defined genetic player wins.
Unchanged parameters settings:
PopulationSize = 150
NumberOfGenerations = 35
MaximumDepthForCreation = 6
Changed parameters:
MaximumDepthForCrossover: ranging from 6 to 30
MaximumComplexity: ranging from 50 to 250
CrossoverProbability: ranging from 0 to 100
This experiment evaluates how the performance of othello game is affected by changes made in crossover parameters.
Defined genetic player is expressed as given in othello
packet.
Result
(1) Fixed: MaximumDepthForCrossover=17, MaximumComplexity=100
CrossoverProbability | 0 | 30 | 50 | 70 | 90 | 100 |
Games Won | 36 | (*) | 36 | 35 | 34 | 33 |
Tree Depth | 2(3.6) | 3 (3.6) | 2(3.6) | 3(3.8) | 2(3.7) | 2(3.6) |
Complexity | 3(15.4) | 7(14.5) | 3(15.5) | 5(15.9) | 3(15.4) | 3(15.3) |
(2)Fixed: CrossoverProbablitiy=70, MaximumComplexity=100
MaximumDepth | 6 | 12 | 18 | 24 | 30 |
Games Won | 36 | 37 | 42 | 38 | 48 |
Tree Depth | 2(3.6) | 2(3.5) | 2(3.7) | 2(3.7) | 2(3.6) |
Complexity | 3(15.2) | 3(15.0) | 3(15.7) | 3(15.4) | 3(15.6) |
(3)Fixed: CrossoverProbablity=70, MaximumDepthForCrossover=17
MaximumComplexity | 50 | 100 | 150 | 200 | 250 |
Games Wom | 49 | 35 | 37 | (*) | (*) |
Tree Depth | 2(3.5) | 3(3.8) | 2(3.7) | 3(3.6) | 4(3.6) |
Complexity | 3(12.0) | 5(15.9) | 3(15.3) | 7(15.4) | 7(15.1) |
(*) … the genetic representation is not in simple form; parse error
(4) Combination of the best and worst values (from
above 3 tables)
(ordered <CrossoverProbability,MaximumDepth,MaximumComplexity>)
Best:
<50,18,50> | <50,30,50> | |
Games Won | 41 | (*) |
Tree Depth | 2(3.4) | 3(3.6) |
Complexity | 3(11.9) | 7(12.4) |
Worst:
<100,6,100> | |
Games Won | (*) |
Tree Depth | 3(3.8) |
Complexity | 7(16.2) |
Note: for tree depth and complexity, the best case is
written, and average is in parenthesis.
Discussion
The first observation seen here is that when playing against a random player, a defined genetic player won more than half of the 50 games it played, no matter how the parameters were set. For some particular parameter settings, it won more than 45 out of 50 games, which is remarkable considering the fact that the population size was set to a small number(150) for a problem that is fairly complex. This shows that the information within the potential solution was shared in some useful manner to achieve this outcome. None of the cases covered for this experiment performed terribly, though one can see an exceptionally good performance.
The best and average depth tree and complexity seem to stay similar for most cases. The only case when the change is drastic is when the MaximumComplexity parameter was altered. This implies that the crossover probability and MaximumDepthForCrossover parameters do not affect the shape of the tree which results from applying genetic programming.
An interesting observation from the experiment (which is not mentioned
in detail here) is the genetic representation of a defined genetic player.
The player with the best outcome had the simplest form of genetic representation.
For example, the genetic representation of the best result in table (3)
is expressed simply as
( + white_corners white ).
Throughout the experiment, the best RPBs were always simpler and less complex
than the respective worst RPBs.
In table (3), the non-simplistic expression are the two largest values set to MaximumComplexity. Performance could not be measured from these two parameter settings, but from what is said about non-simplistic expression in the previous paragraph, the assumption is that the performance probably would not have been any better than the previous three cases. Since the population size for this experiment is 150, setting the MaximumComplexity to 200 and 250 seens too much; population can becomes too diverse and so the problem very complex. Intuitively, the assumption seems to make sense. This assumption is specific to this particular problem. To verify this assumption, or to see if this applies to a more general case, a more intensive testing is required.
Table (4) was put to shows that simply combining the best values for each parameters will not necessarily result in the best performance.
One puzzling result of the experiment is the result in table (1).
The performance of when Crossover probablity was set to 0 (no "information
sharing") was not much different from that of when Crossover probablity
was set to some positive number. This might again be the problem
of setting the population size to a small number. And
population size may also be the reason for most cases to have the low tree
depth of 2 or 3. Not much information will be shared if the population
is small. This is similar to how people group together to come up
with a solution to a project (more information will be exchanged if there
is more people and vice versa) It would be interesting to see how
the number will differ if a larger population size is used, in which case
a characteristics of information sharing will come to play a more
important role in order to achieve good, stable results. From
this is the conclusion that crossover operation can have an effect on the
performance of the game, but to what degree it can have an effect seems
to depend on other factors such as population size.