Algorithm helps computers beat human Go players

BUDAPEST (Reuters) - Computers can beat some of the world’s top chess players, but the most powerful machines have failed at the popular Asian board game “Go” in which human intuition has so far proven key.

A demonstration of the board game "Go" during the World PC Expo in Tokyo, October 20, 2006. Computers can beat some of the world's top chess players, but the most powerful machines have failed at the popular Asian board game Go in which human intuition has so far proven key. REUTERS/Kiyoshi Ota

Two Hungarian scientists have now come up with an algorithm that helps computers pick the right move in Go, played by millions around the world, in which players must capture spaces by placing black and white marbles on a board in turn.

“On a nine by nine board we are not far from reaching the level of a professional Go player,” said Levente Kocsis at the Hungarian Academy of Sciences’ computing lab SZTAKI.

The 19 by 19 board which top players use is still hard for a machine, but the new method is promising because it makes better use of the growing power of computers than earlier Go software.

“Programs using this method immediately improve if you use two processors instead of one, say, which was not typical for earlier programs,” Kocsis said.

Whereas a chess program can evaluate a scenario by assigning numerical values to pieces -- say 9 to the queen and 1 to a pawn -- and to the tactical worth of their position, that technique is not available to a Go machine.

In Go all marbles are identical and scenarios are too complex, so the computer has to think forward all the way till the end of the game and emulate the outcome of each alternative move, whose number rises exponentially with the number of turns.


Even the most powerful computers have failed at that task, but Kocsis and his colleague Csaba Szepesvari have found a way of helping computers focus on the most promising moves, using an analogy with slot machines in a casino.

Punters will find that certain one-armed bandits in a casino appear to pay more on average than others, but an intelligent player should also try machines that have so far paid less in case they are hiding a jackpot, Kocsis said.

The key is to find the balance between the two sorts of machine.

Go software using a similar method, called UCT, does not have to scan all possible outcomes of a game and they can quickly find the best mix of scenarios to check.

“This bandit algorithm has proven advantages,” Kocsis said.

The possible outcomes of a game are like branches of a tree, and earlier Go programs, unable to scan all branches, picked some at random and tried to find the best move from that sample.

The UCT method helps a computer decide which routes are most worth investigating and programs based on it have consistently won games against most other machines.