AI: why it’s so hard for computers to beat humans at the game of “Go”
Editor’s note: the author of this article works at a startup that provides tl;drs as a service, so he also provided this summary of key points.
The game of Go is very interesting from an AI programmer’s point of view, both because of how difficult it is to make a computer compete with strong human players and because of how researchers have approached the problem. Paradoxically, the use of randomly played games has helped computers get much stronger.
Game of Go: a (very) brief overview
Go is wildly popular in Asia but little known in the Western world. Like Chess, it is a two-player strategy game in which the players take turns, here by placing stones on a board. The key differences between the two games are:
- In Go, the board starts empty and you add a stone each turn, whereas in Chess you start with an army and try to destroy your opponent’s
- The rules are much simpler in Go than in Chess: in particular, Go has only one type of stone while Chess uses six different pieces
Complexity: a comparison with Chess
Even though the rules are simple, it has proven very hard to build good Go-playing software. In fact, while top Chess programs can compete with the best humans (remember how Deep Blue beat Garry Kasparov in 1997), even strong amateurs can beat the top Go-playing programs. And strong amateurs are much weaker than top professionals. There are three main reasons for this:
- Game-tree complexity: there are about 10^123 possible games in Chess. Estimates for Go vary, but the American Go Association reckons there are about 10^700 possible games. This is because a Go game has more turns than a Chess game (about 200 vs 50 on average) and more possibilities at each turn (up to 350 vs about 50).
- Lack of a good heuristic: Chess programs can evaluate quite accurately and quickly whether a given position is better than another. By contrast, no good heuristic has been found for Go yet.
- Pattern recognition: strong Go players rely heavily on recognizing the shapes the stones form. There are too many such shapes for a computer to enumerate within a game’s timeframe, and humans remain far better at recognizing them.
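The game-tree gap can be illustrated with back-of-the-envelope arithmetic, simply raising the average branching factor to the power of the average game length. The figures below are rough stand-ins for the averages quoted above (250 as a middle ground for Go’s branching factor), not exact counts, so the results differ somewhat from the published 10^123 and 10^700 estimates while showing the same order-of-magnitude gap:

```python
def tree_size_digits(branching: int, length: int) -> int:
    """Decimal digits in branching ** length, i.e. the exponent x in ~10^x."""
    return len(str(branching ** length))

# Crude estimates: branching factor ** average game length.
chess_digits = tree_size_digits(50, 50)    # ~50 choices per move, ~50 moves
go_digits = tree_size_digits(250, 200)     # ~250 choices per move, ~200 moves

print(f"Chess: ~10^{chess_digits} games, Go: ~10^{go_digits} games")
```

Even with these conservative numbers, the Go estimate comes out hundreds of orders of magnitude larger than the Chess one, which is why brute-force search alone cannot work.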
Key ideas behind current Go-playing AI
This complexity is what makes the study of Go algorithms so interesting.
A few years ago we saw a spectacular improvement, led by a program called MoGo. The key ideas behind MoGo are:
- Position evaluation using random games: paradoxically, it was found that a good way to evaluate a position is to play, from that position, a large number of games in which each player places stones at random. The position’s score is simply the percentage of those random games that were won.
- Use of multi-armed bandit algorithms to decrease the branching factor (the number of moves the program has to consider at each turn). The goal is to optimize the trade-off between exploration (trying new moves) and exploitation (further testing the most promising moves in order to select the best one). The program begins with a prior probability distribution over every possible move. It picks one according to this distribution, plays a random game from the corresponding position, and uses the result to update that move’s probability. It then selects another move according to the updated distribution, and so on.
- Use of a little expert knowledge: the prior distribution used in the multi-armed bandit is calculated so as to avoid trivially stupid moves. For the opening of the game, classic starting positions are also favored. And the random games are not completely random, since they avoid certain moves according to simple patterns. Nonetheless, expert knowledge is used sparingly, as it really hurts the exploration part of the algorithm.
- Parallelization: MoGo and its peers become much stronger on computer clusters, since the random games can be run on multiple CPUs. The speedup factor is about 8x.
- Reinforcement learning: during the game and the random playouts, MoGo learns that certain responses to its opponent’s moves tend to lead to bad results, and assigns them a lower probability.
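The first two ideas above can be sketched together: evaluate each candidate move by the win percentage of random playouts, and use a bandit rule to decide which move to test next. This is a minimal illustration, not MoGo’s actual code: it uses the standard UCB1 formula for the bandit, and it stands in for a real Go playout with a random draw against a hidden, made-up win rate (the move names and rates are purely illustrative):

```python
import math
import random

def ucb1_select(wins, plays, total, c=1.4):
    """Pick the move maximizing win rate + exploration bonus (UCB1)."""
    best, best_score = None, -1.0
    for m in wins:
        if plays[m] == 0:
            return m  # exploration: try every move at least once
        score = wins[m] / plays[m] + c * math.sqrt(math.log(total) / plays[m])
        if score > best_score:
            best, best_score = m, score
    return best

def choose_move(true_win_rates, n_playouts=4000, seed=0):
    """Run n_playouts bandit-guided 'playouts', return the most-tested move."""
    rng = random.Random(seed)
    wins = {m: 0 for m in true_win_rates}
    plays = {m: 0 for m in true_win_rates}
    for t in range(1, n_playouts + 1):
        m = ucb1_select(wins, plays, t)
        # Stand-in for one random game played out from the position after m:
        if rng.random() < true_win_rates[m]:
            wins[m] += 1
        plays[m] += 1
    # The most-played move is the one the bandit came to trust the most.
    return max(plays, key=plays.get)

moves = {"A": 0.35, "B": 0.62, "C": 0.44}  # hidden win rates (illustrative)
print(choose_move(moves))  # should select "B", the move with the best win rate
```

Note how exploitation and exploration interact: the bonus term shrinks as a move accumulates playouts, so weak moves still get occasional tests while the budget concentrates on the most promising one.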
You can learn more about MoGo in this article.
Current strength of computers
Today, Go-playing programs can compete with professional players on small (9×9) boards, and with strong amateurs (5-dan on KGS) on normal (19×19) boards. This is quite an achievement, but there is still a long way to go before they can face professional players on full-size boards!
I am a Go player myself, though not a very good one! During an internship in a computer science laboratory in 2008, I worked on MoGo for two months.
Special thanks to Arpad Rimmel for reviewing this article. He spent three years working on MoGo as a PhD student.