Machine Learning (W4771)
Homework 4: Genetic Programming
Project: Othello
Due Date:
Class time, Thursday March 30.
Reading:
Chapter 9 ("Genetic Algorithms") of "Machine Learning" by Tom M. Mitchell. The
math of Section 9.4.1 is optional. Also read the handout of three genetic
programming chapters (all stapled into one; note that two of the three are only partial),
not available on the web.
Writeups and results of 18 student projects on this assignment
Your assignment:
Follow the
IMPORTANT ADDENDUM MATERIAL
to familiarize yourself with the genetic
programming (GP) implementation we are using for this project, and the
application thereof to symbolic regression.
In class we are covering our implementation of the game Othello, and a specific approach to
Othello using GP. Here are the GP and Othello Java code and READMEs.
Note: If you are an instructor and decide to use all or
part of this assignment and/or these course materials, please
email email: evs at cs dot columbia dot edu
to let me know.
Our goals:
- Compare, contrast and discover methods to approach Othello with GP.
- Investigate methods to evaluate resulting Othello players.
- Compare reseults to another learned Othello player,
Edgar
- Use another Othello expert,
Edgar,
during training
For the sake of coverage, we will distribute this project
across the body of students in our class.
To avoid redundancy, we will be approving all projects.
Select a project that you can do within two weeks and email to
devans@cs.columbia.edu. Your
email description of the project should be 3 or 4 sentences at most --
we will ask if we need more details at that point.
Note
that you are allowed to work in teams.
We may ask parties to communicate
and come to an agreement on specific differences between experiments.
What is interesting to you about this problem and the use of
machine learning to solve it? What aspect of the system needs
further understanding and/or experimentation? What would result
in better Othello players?
Below are a list of possible projects (also follow the link above to
see what previous students have done for this). Note that your Java
expertice is a pragmatic factor in your decision. You should select
(or invent) only one project. One interesting thing to do is
to vary something over a series of options, and record the performance
of each.
- Variations on the hypothesis
- New primitives (i.e., terminals) for GP trees, such as
distance-from-corner, parity thereof.
- Improve EDGAR, e.g., provide a terminal that says what EDGAR's
score for a given move would be (which can be overridden by an evolved tree
based on the values of other primitives).
- Other primitives:
- number of choices this move gives apponent (e.g., 0
is good).
- number of black/white pieces that will never be switched the remainder
of this game.
- statistics, e.g., "scatterness/clumpiness of white distribution"
- ave num flips black/white has made per move this particular game (this
type of terminal will allow a GP tree to vary its game-playing behavior,
depending on who it is playing, and how particular game is going)
- random constants
- more fundamental, simple primitives, e.g., x/y coordinates of
the piece just placed to get to this new board configuration -- could
GP use this and automatically build the concepts of "edge", "corner",
"one-away-from-corner", or other useful concepts we haven't thought of?
(Get rid of (some) other primitives for this experiment.)
- etcetera
- perform a shallow search, e.g., minimax, and apply the tree
at the end-points of this search. That is, resulting player uses
the hypothesis/tree to evaluate endpoints of a shallow search to
determine which is the best current move.
This of course requires a fair
amount of programming, and results in an even lengthier fitness
evaluation -- other changes would be required to
keep the fitness measure fast enough.
- help the hypothesis behave differently in the end-game versus other points
in the game. For example, a seperate tree (ADF-ish) for each portion
of the game controls the play -- each specializes for different
"temporal epochs" of the game (credit: Chris' idea). Note that the "white"
and "black" primitives encode how
far into the game you are by way of the number of pieces on the board
so far.
- ADFs
- etcetera
- Variations on the learning method
- parameters: population size, number of generations, proportion of
crossover, etc.
- number of games played during fitness measure
- who does it play against for fitness (e.g., different kinds of
random players, or play against EDGAR, or against a previously-evolved
player, or against a hand-made player, or combination thereof)
- competitive fitness measure -- population plays itself, or competiting
populations co-evolve
- with competition, how do you select the best-of-generation
individuals? How do you track the changes in "absolute, objective" fitness?
- have a set of 32 individuals each make one move of a game and
distribute the resulting scores amongst them equally. Each individual
in the population can make one move in several games. This is a noisy
fitness measure, but much faster (more generations and/or larger
population size possible). It's so crazy it just might work!
(applies whether or not competition is happening)
- have a population vote on each move (competiting populations?)
- etcetera
- Methods to evaluate results
- against people
- against EDGAR
- against any other on-line Othello players you know of
- against random
- against solutions from previous evolution
- hand-written baselines of comparison
- head-to-head between results of various experiments, including
experiments performed by other classmates
- statistics of tendencies of how it plays -- visualize this?
- analysis of resulting function trees to "understand" its "strategy"
- etcetera
Deliverables:
You must hand in (electronically to devans@cs.columbia.edu) an html
write-up of your project, about 3-6 pages in length (including tables).
(If you do not know html and do now want to learn, a text write-up will
suffice.)
There will not be in-class presentations of this project. Instead,
we will compile an on-line compendium of the results. This compendium
will be advertised to the machine learning community (including the
GP mailing list). The compendium will optionally include applets that
allow users to compete against evolved Othello players.
Your
write-up should contain
a written description of the experiments you conducted.
In particular, the introduction
should start with the motivation, and then give an overview of the
solution. A seperate "approach" section gives
the details of your modification/experiment.
The results section gives the numerical results and some analysis.
Finally, the conclusions section puts the capital P on "perspective".
email: evs at cs dot columbia dot edu