Last year I wrote a couple posts about AlphaGo, the computer go program developed by a team at Google that beat Lee Sedol (one of the world’s top players), 4-1, in a five-game match. If you’ve forgotten, you can read my post-match thoughts here.
For several months nothing more was heard about the program, but then in January a computer program popped up online, called Master or Magister, that played 60 games against top go masters (including two against Ke Jie, the 19-year-old current world champion)… and won all 60 of them. That’s right, 60 wins and no losses. After the games were finished, the Google team confirmed the rampant speculation that Master was in fact the latest iteration of AlphaGo.
In my post last year, I wrote,
It’s cool that we are still at a stage where the top human can beat the computer, if he’s lucky and the stars align. In chess, the era when we can be competitive has passed. The “roughly equal” era will probably not last very long in go either, but we can anticipate some good computer-versus-human go matches while it lasts.
It looks as if the window of opportunity for the best humans to beat the best computer may have shut. AlphaGo’s loss to Lee Sedol in the fourth game of last year’s match may be the only game it will ever lose against a human player. Ke Jie still wants to play a match against AlphaGo, but I’d be surprised if it ended in anything besides a score of x–0.
Can human go players learn from AlphaGo? Does it go about things differently from a human master? A YouTube video with analysis from Brady Daniels (3 dan) made some interesting points about those questions.
Daniels said, in his video “Whatever You Do is Wrong,” that he actually found AlphaGo’s games easier to follow than those of the top human players. In recent years, the top humans have played very sharp, aggressive games with tactical sequences that require exact calculation. Surprisingly, the computer program does not go in for that. When the human makes a move on one part of the board, AlphaGo will play a move on a different part of the board. “I’ll give you that, but I’ll take this,” the computer says. The computer wins because its moves are all just a little bit better. If each of its moves gains, on average, just a tenth of a point more than the opponent’s, then over the course of a 200-move game (100 moves for each side) AlphaGo will win by 10 points.
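That back-of-the-envelope estimate can be written out in a few lines. (The 0.1-point edge is the illustrative assumption from the paragraph above, not a measured figure.)

```python
# Illustrative arithmetic only: assumes a constant average edge per move,
# as in the rough estimate above -- not anything AlphaGo actually computes.
edge_per_move = 0.1    # points gained per AlphaGo move, on average (assumed)
moves_per_side = 100   # a 200-move game gives each side 100 moves

final_margin = edge_per_move * moves_per_side
print(final_margin)    # about 10 points
```

The point of the toy calculation is that a margin this small per move is invisible to a human observer, yet compounds into a decisive final score.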
Things are different in the chess world. In chess, tactics are the computer’s strength, and for a long time if there was any area where the computer appeared vulnerable, it was in strategy. However, Daniels’ comment does seem true for chess in one respect. I’ve noticed, in my many games against Shredder, that it almost never plays “brilliant” sacrifices. Even though it has great tactical vision, it is usually able to find a way to increase its advantage that doesn’t involve giving up material. It just gradually outplays you, posing more and more difficult problems, until eventually you have to give up material (or perhaps you give it up unintentionally by blundering). Often it’s hard to tell what you did wrong, at least up to the point where you cracked and made a blunder.
Another difference between the go world and the chess world is that go players have had to adjust to the idea of machines surpassing us much faster than chess players did. In the space of a year and a half, go programs went from not being able to give professionals a good match, to beating low-rated professionals, to beating one of the top players in the world, to being essentially unbeatable. The comparable developments took about 20 years in chess. One has to take one’s hat off to the AlphaGo team for single-handedly accelerating the timetable by a factor of more than 10.
I am also very glad to see the AlphaGo team’s commitment to continuing to improve their program. That’s very different from IBM’s approach. After their chess program, Deep Blue, defeated Garry Kasparov in 1997, IBM just walked away. Deep Blue never played again. This leaves a bad taste in my mouth; to IBM, the Kasparov match was just a publicity stunt, and after they had beaten the best human they didn’t care about chess any more. As I’ve written elsewhere, for chess players the revolution didn’t come when Deep Blue beat Kasparov, it came when commercial chess programs came out that could beat anybody in the world. When that happened, humans could learn from the computer.
The AlphaGo team hasn’t quite reached that level of openness yet; they haven’t released a version of AlphaGo that the general public can buy. Nevertheless, it does seem to me that they care about go and that they are not treating last year’s match as a one-time publicity stunt. That’s encouraging for the go world. I’ll end with a quote from a Wall Street Journal article (“Mr. Ke” = Ke Jie):
“After humanity spent thousands of years improving our tactics, computers tell us that humans are completely wrong,” Mr. Ke, 19, wrote on Chinese social media platform Weibo after his defeat. “I would go as far as to say not a single human has touched the edge of the truth of Go.”
{ 2 comments }
I would agree actually to finding AlphaGo’s 60 games easier to follow on average than games between top human pros, and mostly agree with why, but I would put a different emphasis on the exact shade of why. We know AlphaGo is eager and able to get into positions of frightening complexity when playing itself, from looking at all three of its self-play games that were released some months ago, so why don’t we see any of this in the 60 games here?
I would guess the reason is that AlphaGo is so much stronger than top pros that it doesn’t need to! A well-known behavior of bots that maximize win probability rather than score margin is that when winning, the bot will play to simplify, giving up a few points for security and clarity of the position. I’m pretty sure that if AlphaGo were to face stronger opposition (either itself, or yet-to-be-written competitor bots, or perhaps human pros with appropriate numbers of handicap stones), we would see AlphaGo opt for sharpness and tactical complication more often.
Effectively, these are all teaching games, as if between a master and a student where the master is so much more skilled that he doesn’t need to play to his utmost, but can always win calmly, keeping the situation simple and under complete control.
Your description of AlphaGo’s style of play sounds reminiscent of the GTO/exploitative dichotomy that factors heavily into 1-on-1 no limit poker matches (this is my area of expertise, and the latest skill domain to fall to computers). Game-theory-optimal plays are not a reaction to any particular parry; rather, they are moves that aim to create an unwinnable situation no matter the style of the opponent. Human matches are often exploitative in style, where the players are trying to counter-adapt to each other’s tactics instead of just settling into a set of optimal frequencies and ignoring any attempts to provoke a more aggressive skirmish.
Go is obviously deterministic, so it’s not a direct comparison (no mixed strategies), but the stylistic notion of just playing every single move slightly better is very much the way computers play poker.
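The “unexploitable no matter what the opponent does” property of GTO play can be shown in the smallest possible mixed-strategy game, rock-paper-scissors. (This is a textbook illustration, not a poker solver; the payoffs and strategies are the standard ones for this game.)

```python
# GTO in miniature: in rock-paper-scissors, the uniform 1/3-1/3-1/3 mix
# breaks exactly even against ANY opponent strategy, pure or mixed, so no
# amount of counter-adaptation can exploit it.

PAYOFF = {  # row player's payoff: +1 win, 0 tie, -1 loss
    ("R", "R"): 0,  ("R", "P"): -1, ("R", "S"): 1,
    ("P", "R"): 1,  ("P", "P"): 0,  ("P", "S"): -1,
    ("S", "R"): -1, ("S", "P"): 1,  ("S", "S"): 0,
}

gto_mix = {"R": 1/3, "P": 1/3, "S": 1/3}

def expected_value(my_mix, opp_mix):
    """Expected payoff when both players mix independently."""
    return sum(my_mix[m] * opp_mix[o] * PAYOFF[(m, o)]
               for m in my_mix for o in opp_mix)

# Against every pure opponent strategy, the GTO mix has expected value 0,
# hence also against every mixture of them.
evs = [expected_value(gto_mix, {opp: 1.0}) for opp in ("R", "P", "S")]
print(evs)
```

That indifference is what “settling into a set of optimal frequencies and ignoring the opponent” buys you: you forgo exploiting a weak opponent in exchange for being unexploitable yourself.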