MoGo used UCT to bias its Monte Carlo search, a strategy that enabled the program to win the CGOS tournament in August 2006, where games were played on the smaller 9x9 board. MoGo was first introduced at the On-line Trading of Exploration and Exploitation Workshop 2006 (sponsored by PASCAL, the Network of Excellence on Pattern Analysis, Statistical Modelling and Computational Learning).
The same team went on to use a Monte Carlo and bandit-based approach to win the 2006 “PASCAL Exploration vs. Exploitation Challenge,” set by TouchClarity (now Omniture TouchClarity); see box.
Despite the successes of the MoGo team, other researchers remained sceptical about their approach. According to Sylvain Gelly, “at that time, the ideas we were using in MoGo were considered ‘interesting’ in the Computer Go community only for the 9x9 Go, but were considered doomed by the fact that they were not successful on the ‘real’ game. 9x9 Go was considered to be another game unrelated to 19x19 Go.” But the MoGo team did not lose hope, and Sylvain Gelly spent the next year improving MoGo (with methods such as Rapid Action Value Estimation).
From game-playing to webpage optimisation, Monte Carlo bandit methods seem to provide a powerful new technique. In the words of Remi Munos, “I think the idea of bandit algorithms used recursively for performing efficient tree search (such as UCT) is a new research direction that is definitely worth investigating both from theoretical and practical perspectives.”
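To make the bandit idea concrete, here is a minimal sketch of the UCB1 rule, the selection rule that UCT applies at each node of a search tree. It is an illustration only and not MoGo's code: the function names `ucb1_select` and `run_bandit`, the exploration constant, and the win probabilities are all assumptions chosen for the example, and the sketch covers a single flat bandit rather than a full tree search.

```python
import math
import random


def ucb1_select(counts, totals, c=math.sqrt(2)):
    """Return the index of the arm with the highest UCB1 score.

    counts[i] is how often arm i was played, totals[i] its summed reward.
    The constant c is an assumed default; c = sqrt(2) gives the classic
    bonus sqrt(2 * ln(n) / n_i).
    """
    # Play every arm once before the formula is applied.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total_plays = sum(counts)
    scores = [
        totals[arm] / counts[arm]                      # exploitation: mean reward
        + c * math.sqrt(math.log(total_plays) / counts[arm])  # exploration bonus
        for arm in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(win_probs, rounds=2000, seed=0):
    """Simulate a flat bandit with hypothetical win probabilities."""
    rng = random.Random(seed)
    counts = [0] * len(win_probs)
    totals = [0.0] * len(win_probs)
    for _ in range(rounds):
        arm = ucb1_select(counts, totals)
        reward = 1.0 if rng.random() < win_probs[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts


if __name__ == "__main__":
    # Three hypothetical moves with different chances of winning a playout.
    print(run_bandit([0.2, 0.5, 0.8]))
```

In a UCT-style search, each candidate move at a position plays the role of an arm and the reward is the outcome of a random playout, so the same rule is applied recursively down the tree.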
TouchClarity was recently acquired by Omniture, specialists in online business optimisation. The development of MoGo has also been a fruitful avenue of research for all involved. In 2007 Yizao Wang and Sylvain Gelly won the prize for best student paper at the IEEE Symposium on Computational Intelligence and Games. Wang also received an honour, the "prix d'option", from Ecole Polytechnique for his work on this project. Gelly was awarded his PhD on “A Contribution to Reinforcement Learning; Application to Computer Go” in September 2007 and now works at Google, Zurich. Coulom continues to improve Crazy Stone, which, after several recent wins against MoGo, is now perhaps MoGo's main competitor in Computer Go.
Remi Munos summed up the MoGo work with enthusiasm: “From my point of view, this project (or adventure...) has been really a wonderful collaborative work which has led to some advances not only in the field of Go, but also which has opened new perspectives in other fields as well, such as in optimal control, large scale optimisation problems, and complex sequential decision problems, in particular when several agents interact, such as in games.”
Resources:
INRIA article and links for MoGo: http://www.inria.fr/bordeaux/ressources-1/computer-culture/mogo-champion-program-for-go-games/
PASCAL: http://www.pascal-network.org/
PASCAL Exploration-Exploitation Challenge & Workshop: http://www.pascal-network.org/Challenges/EEC/
http://www.homepages.ucl.ac.uk/~ucabzhu/OTEE.htm
http://www.omniture.com/products/optimization/touchclarity