AIS_PAPER

An On-line Othello Program Trained by Genetic Algorithm
Project Write-up for AIS (CS4721)
Prof. Eric, Siegel
By Scott Kai Lei (leikai@cs.columbia.edu)

Introduction

The idea was to right a program (Java Applet) that plays the game Othello and make the program good by letting it learn "genetically".

The normal way to write a program that plays a game is described in the first two or three chapters of the class textbook Artificial Intelligence, a modern approach (by Russel and Norwig) and some other related AI textbooks. The idea is to through the possible moves for each player in turn (the possible moves form a really big tree structure). There is an algorithm called MiniMax for determining the best move for a player. There is a faster algorithm that does the same thing called Alpha-Beta pruning.

Problem

As we have already noticed that many new domains for genetic programming require evolved programs to be executed for longer amounts of time, since the tree is so big, you usually can't search through the whole thing. One way to control the execution time of evolved programs is to impose an absolute time limit. That is, you only search part of it, down to a depth of N nodes, usually. However, this is too constraining if some test cases require more processing time than others. So in some cases, when you reach the maximum depth that you want to search, you stop searching and make a guess which player has the advantage in the position that would exist on the board at that node (remember a node might correspond to a series of moves). How can you let the program itself know what next step that it should choose? Which choice is the best? This is the question that I want find out, I want to utilize the Genetic Algorithm to let the program learn how to choose next best place that it should place the chess on.

This evaluation usually works just by calculating simple numerical features of the game position (for example in Othello, whether one player has more pieces, or is controlling the corners of the board). The final evaluation of the position is a number; you get it by computing a linear function of the position features. That is, if my evaluation function uses 10 features of the position, it calculates a number for each one, then multiplies each numbers by its own "weight" value, usually a real number between -1 and 1. This is because some of the features are more important, so we want to pay more attention to them.

Approach

So the most important question here is the "weights" evaluation function. If the weights are set correctly, our computer program will play well. If they're not set correctly, the program may play very badly. We can try to use a "genetic" algorithm to learn good weights. We think of the weights as "genes", and we "evolve" a good set. The way to do this is to start with a random set of weights in the program, and use them to test the program. If the program does well, we keep the weights, and use them
(making small changes) in the next version of the program. If they do badly, we change them a lot or throw them out and start again.

But, just like with animals, evolution happens slowly over a large population of individuals. Likewise with programs. Instead of having just one version of the program at a time, we have many. We test them all, then keep a few of the best ones. We throw out the rest. Then we make a bunch of new versions of the program by making small random changes ("mutations") in the good individuals from the last batch. In this way we hope to get better and better individual weight sets each time.

Before we are talking about how to solve this problem, we should first simply introduce some backgroud information about Othello, what is it and what are the rules to play it.

The game of Othello is based on the ancient Chinese game Reversi. The western version is named after Shakespeare's Othello because "it is a game of dramatic reversals."

The game is played on an 8 x 8 board by two players, one playing black and one playing white. A move is made by placing a piece of your color on the board in any unoccupied space that causes a "flip" of your opponent's pieces. A "flip" means that by placing the piece you outflank or surround a row, column, or diagonal of your opponents colors by pieces of your own denomination. These surrounded pieces are then captured and flipped over to your side. i.e., if there is a row of three black pieces with a white piece on one side, by placing another white piece on the other side of the row of black pieces, the three black pieces are replaced with three white pieces. Players alternate placing pieces of their color on the board until neither player may make a legal move. A legal move must result in a capture of the opponent's pieces. If a player cannot capture any pieces, he must skip his turn and let the other player make another move. Conversely, if a player has a legal move, he or she must make it, even if it is strategically undesirable. When neither player can make a move, the game is over and the winner is the person with the most pieces of his denomination on the board. This usually occurs when all 64 places on the board are filled, though it can occur earlier if neither player can move.

To implement the specific algorithm for the Othello, We also need to know those important strategies to make players to win. The following is some key ones:

Position -- The most important places on the Othello board are the corners. Most strategy revolves around moving into the corners. So you should skip those bad positions ( such as [2,2], [2,7], [7,2], [7,7] ) which might increase the likelihood that one's opponent may be able to move into the corresponding corner. What's more, you should try to occupy those good positions (such as [3,3], [3, 6], [6,3], [6,6] ) which can increase the likelihood of one's opponent moving into the bad positions that you must skip.

Edge -- Edge strategy also revolves around taking the corners. Therefore, it is sometimes considered bad or risky to occupy the "3" position. It is suicidal to occupy the "3" position if one's opponent already occupies the "1" position. It is usually considered good to occupy the "1" position, and "2" positions are neutral. Edge strategy also revolves around the avoidance of holes. If one player has pieces at the "3" "1" "2" and "1" positions, there is a hole at the empty "2" position into which the opponent could maneuver and take the corner.

Move -- There exists one non-corner based strategy. This is simply to move in such a way that your apponent has little or no legal moves and is forced to make a bad one. The theory is that any forced move will be poor, and that anytime a player has to skip a move and allow his or her opponent to take two or more turns in a row, the opponent makes a huge gain. Each player tries at each turn to have greater mobility (more moves) than the opponent. To do so, one must ideally try to increase his own mobility while decreasing the opponent’s mobility. If one player can manage to reduce the opponent’s mobility to zero (or near zero) then that player will be able to "force" his opponent into making undesirable moves and follow through with an easy win.This strategy usually isn't successful unless one's opponent is very short sighted. Human beings often have a hard time visualizing the dramatic changes that happen after a single move in Othello, and visualizing the number of moves an opponent will have after making a given move can be all but impossible except in the most simple situations. Computers do not have this problem. This is the main reason that the world's best Othello players are all computer programs.
Openning -- The opening phase in Othello has a very strong influence on the rest of the game. The opening phase is not only important for the outcome of the game, but also tends to govern the style of play in the mid-game.
The "classic" opening principles:

Try to have fewer discs than your opponent.
Try to occupy the center of the position (the 4 center-squares in the first few moves).
Avoid flipping too many frontier discs (those located on the outside boundaries of the position, i.e. avoid building walls).
Try to group your discs into one connected cluster rather than having several scattered isolated discs.
Avoid taking edges too soon (before the mid-game).
All these principles can also be applied well into the mid-game.

After we know those basic strategies on how to play the Othello, We then implement a program which has N-nodes, each node will have a weight, the weights decide how much influence that the value of this node will have on the final postion among all the possible ones ( It is very easy for us to find all the possilbe postions that a player can put chess in at one time, this has already been implemented in the applet interface version of this program) to place a next chess. Each node's value will be detemined by current chess board situation according to those principles that we have dicussed above.

After initialization, we will begin the training process, When we test a version of the program, we need something to test it against. Basically there are three ways:

Find an already written, good Othello playing program. The better our program does against this program, the better the weights in our program are.
Get a large number of recorded Othello games (usually as played by expert humans). We let are program choose moves at random from random games and try to choose the next move that the person made.
Let versions of the program play against each other. We keep the weight sets from the programs that win.

During the tranining process, hopefully, the gentical algorithm will find the best suitable weights for each inner nodes. That means, the program will know how much influence a node's value ( judged by a principle) should have on the whole final positon decision to place a new chess on board.

Interface Description

You can find our implemented java applet interface at:
http://www.cs.columbia.edu/~leikai/Othello/Othello.html

How to play:
First, when you visit this web page, the java applet will automatically start and a little control dialogue will appear to let you choose which color of chess that you want to choose.

Currently, you can only play between two people manually since the computer player has not been implememted. So two human player will play alternatively. There is a little square on the control panel to indicate whose turn is right now. there is a dark gray rectangle indicating which box you are at when you are moving the mouse. When you reach a possile box to place a your chess, all the reversable chesses of your opponent's will rotate themselves so that you can easily judge the chess board situation if you want to place your chess.

During the whole playing course, the control board will calculate the number of each chesses on the board.

When the game is over, you can find the result on the control panel to decide who wins the game, and you can re-start a new game.

The current program is easily extentable, you just need to implement the Player interface if you want to implement any other computer players that use this interface.