Machine Learning

Homework 4

Leonid Portnoy

lp178@columbia.edu


Motivation:

Implementation Strategy:

Approach:

I modified the code to include these new terminals, removed the old ones, and trained the algorithm to play the black pieces. The code for handling the next_board's_value primitive, inside GPOthello.java's evaluate() function, was essentially the following:
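
In sketch form (the Board type and every helper method here are illustrative stand-ins, not the package's exact API, and treating the terminal's value as the worst case over the opponent's legal replies is likewise an assumption):

    import java.util.List;

    // Stand-in for the homework package's board class; these method
    // names are assumptions, not the package's actual API.
    interface Board {
        int BLACK = 1, WHITE = 2;
        Board copy();
        void play(int x, int y, int color);   // apply a move, flipping pieces
        List<int[]> legalMoves(int color);    // legal moves as {x, y} pairs
        int count(int color);                 // number of pieces of that color
    }

    class NextBoardsValue {
        // Value of playing (moveX, moveY) as black: the number of black
        // pieces that survive the opponent's immediate reply, taken here
        // as the worst case over all of white's legal replies.
        static double evaluate(Board board, int moveX, int moveY) {
            Board scratch = board.copy();
            scratch.play(moveX, moveY, Board.BLACK);

            int worst = Integer.MAX_VALUE;
            for (int[] reply : scratch.legalMoves(Board.WHITE)) {
                Board after = scratch.copy();
                after.play(reply[0], reply[1], Board.WHITE);
                worst = Math.min(worst, after.count(Board.BLACK));
            }
            // If white has no legal reply, fall back to the current count.
            return (worst == Integer.MAX_VALUE) ? scratch.count(Board.BLACK)
                                                : worst;
        }
    }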

The code for handling the other primitives was trivial: the x and y coordinates of the last move are simply passed to the evaluate() function and returned whenever the corresponding primitives are encountered.
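
A minimal sketch of that terminal handling, reusing the Board stand-in above (the field names and string codes are illustrative):

    // Sketch of the trivial terminal cases; in the real code this logic
    // sits inside GPOthello.java's evaluate() function.
    class TerminalSketch {
        int lastMoveX, lastMoveY;   // coordinates of the move being scored
        Board board;                // current board position

        double terminalValue(String terminal) {
            switch (terminal) {
                case "x":     return lastMoveX;
                case "y":     return lastMoveY;
                case "white": return board.count(Board.WHITE);
                case "black": return board.count(Board.BLACK);
                default:      return 10.0;   // the constant terminal "10"
            }
        }
    }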

Problems encountered:

Coding went smoothly, as the genetic programming package was probably designed with ease of extension in mind. However, there was a slight problem with saving a run midway through its execution (along with all of its associated state) to a checkpoint file. When this feature was enabled (by setting the CheckpointGens variable to a value greater than 0 in the .ini file), Java generated an error on every attempt to load the file back in order to restore the run; the error was related to registering classes within Java's environment. This is probably due to a bug in the genetic programming package. Otherwise, no major problems were encountered.
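
For reference, checkpointing is controlled by a single line in the .ini file; the value below is only illustrative (any value greater than 0 saves a checkpoint every that many generations, and 0 disables the feature):

    CheckpointGens = 5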

Results:

Results from test run 1

The .stc file for this run


Results from test run 2

Its fitness is: 8

Its fitness is: 3


Results from run 3

This is the .stc file for run 3

Its fitness is: 0

Its fitness is: 0


Results from run 4

This is the .stc file for run 4


Conclusions:

The goal of my project was not to produce the best possible Othello player, but rather to see how (and whether) the genetic programming algorithm would learn the concepts needed to perform better than random play in Othello from scratch, or at least from very primitive and general notions such as position and the move under consideration. A secondary goal was to examine the results afterwards and perhaps discover concepts that we did not, or would not, think of ourselves.

As can be seen from the result of test run 1, the heuristic the evolved player used was "10 / (white + y)". In other words, the value of the board was inversely proportional to the sum of the number of white pieces (the opponent's, since the player was trained to play black) and the y coordinate. It makes a great deal of sense that the value of a board should be inversely proportional to the number of the opponent's pieces, so the genetic programming algorithm was successful in that regard. But why should it be inversely proportional to the y coordinate? One possible answer is that this is an attempt by the algorithm to arrive at a concept of edges: the closer we place our piece to the upper edge, the more valuable the resulting board, since that piece will be harder to flip later in the game. And, of course, a smaller value of y corresponds to a smaller distance between the piece and the upper edge of the board.

After this, I ran a second test, which also showed some interesting results. The first image shows a heuristic that does essentially the same thing as in the previous example: it is "black - y", so it favors a large number of black (its own) pieces and small values of y (close to the upper edge). However, the second image, with an even better fitness than the first, shows the heuristic "y / white", which is proportional to the y value. This seems to contradict the previous examples, but it is really saying the same thing with respect to the other (bottom) edge.
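
Written out as code, the three evolved heuristics discussed so far are one-line functions over the GP terminals (a sketch; the signatures are mine, with white and black standing for the piece counts and y for the row of the move under consideration):

    // The evolved heuristics from runs 1 and 2, written as plain
    // functions; larger values mean a board that looks better for black.
    class EvolvedHeuristics {
        // Run 1: "10 / (white + y)" falls as the opponent's count and y
        // grow, favoring few white pieces and moves near the top edge.
        static double run1(int white, int y) { return 10.0 / (white + y); }

        // Run 2, first individual: "black - y" favors many black pieces
        // and small y.
        static double run2a(int black, int y) { return black - y; }

        // Run 2, second individual: "y / white" favors large y (the
        // bottom edge) and few white pieces: the same edge idea,
        // mirrored. Assumes white > 0.
        static double run2b(int white, int y) { return (double) y / white; }
    }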

By this time one thing had become very obvious: almost none of the hypotheses generated by the algorithm used the next_board's_value primitive. What does this mean? Most probably, the number of pieces that survive a single move by the opponent is not very relevant to how the game concludes. The reason might be that the many moves following the current one completely invalidate this primitive's prediction. For example, if it says that the value of the board is high because many pieces remain after the opponent's move, those pieces might be flipped shortly afterwards, in which case the prediction does not help us win the game. In effect, the prediction becomes akin to a random one, and so it was excluded from the best hypotheses produced, which were better than random.

Moreover, this primitive was very time-consuming to evaluate, and it wound up not being used. Because of this, I decided to remove it for the rest of the experiments and see how the algorithm would do with just the "x", "y", "white", and "black" primitives (as well as the constant "10").

Then came one crucial realization: the ComplexityAffectsFitness parameter was set to true in the .ini file! This meant that only small trees were achieving anything close to good fitness. But in our case it was very likely that large trees would be needed, because we are combining very general and very simple primitives; for example, to build the concept of an edge or a corner, the algorithm would probably need several layers of combinations of these small primitives. After this realization, I set the parameter to false and continued running the tests. And since evaluation was now faster, I tested each hypothesis on 10 games instead of 5.
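
The .ini change itself was a single line (an illustrative excerpt; the rest of the file stays as before):

    ComplexityAffectsFitness = false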

The results were great. As can be seen from the images for run 3, the hypotheses are quite complex, which is to be expected when complex concepts are represented with simple primitives, and the fitness of both of them is 0! This means that they almost never lost (out of 10 games) to the randomized player.

Finally, I trained the genetic program against Edgar. Upon conclusion of the first such run, most individuals in the population had a fitness of around 200-250; some individuals, however, got really impressive results. The best individual is depicted in the image for run 4 in the results section above. The heuristic shown there received a fitness of 0 when playing against Edgar!

That result seemed interesting, and I decided to double-check it by running that heuristic (function) through the OthelloCache program, where it played against Edgar. The result can be seen here: The total annihilation of Edgar

Yes, the name means what it says :) That function won the game with a score of 47 to Edgar's 0.

I tried to analyze that function, but it seems to require some time to make sense of. It is really a rather complex combination of very simple primitive functions; as to what concept it represents, one can only guess, and it will probably take a while to figure out. However, it is very likely that humans would not come up with such an obscure function, at least not without a considerable amount of thought.

What we can infer from this result is that it is indeed possible for a genetic programming algorithm to take very primitive functions and terminals, general enough to apply to a wide range of problems, and produce a good solution to a particular problem. The drawback is that this solution might be very complex (and therefore we have to allow for such complexity, which I did not for the first two runs), and we might not be able to see the concepts embedded within the solution right away. But given enough time, it is likely that we will be able to learn something from that result as well.


Code

Here's the modified GPOthello.java

This is the modified OthelloGPTester.java

Modified OthelloCache.java

Modified OthelloGPPlayer.java

Here's the .ini file used to produce the final (winning) hypothesis