The battle between man and machine: AI meets world Go champ

By Danielle Prieur

A match of Go between man and machine will be streamed on YouTube starting March 9.

The game pits South Korea’s Lee Sedol, the world Go champion, against AlphaGo, an artificial intelligence agent designed by Google DeepMind.

Sedol will play against the AlphaGo computer program, whose moves will be projected on a monitor as a human places them on the board in real time. Google DeepMind’s earlier learning systems have already mastered classic Atari video games such as Space Invaders.

Demis Hassabis, Google DeepMind’s vice president of engineering, said the Korean press gives AlphaGo a 5 percent chance of winning the $1 million prize. Hassabis said that Google DeepMind’s “internal tests suggest something else.”

And although the March match is a gamble, the goal for AlphaGo’s human creators is even greater: to build artificial intelligence that might eventually help us win the higher-stakes games of life.

Go is an ancient Chinese board game; Confucius thought mastering it was necessary to becoming a true scholar, said Hassabis. A Go board looks similar to a chess board, but that’s where the comparison ends. Players alternately place black and white stones (a full set has 181 black and 180 white) with a single objective: the winner is the player who surrounds more territory by the game’s end.

The ancient Chinese game of Go has over 200 possibilities to consider at each move. (Photo courtesy of Flickr)

Hassabis said that children in China sometimes attend Go schools where they study solely the strategy and philosophy of the game, and Go champions are treated like “rock stars.” In the United States, genius game theorist and Nobel laureate John Nash used to play the game at Princeton. Hassabis and his team have applied Nash’s work to AlphaGo.

“What they don’t realize maybe is how much our system has improved,” Hassabis said.

How exactly was AlphaGo improving, even as Hassabis gave his seminar on the program at the American Association for the Advancement of Science conference in Washington, D.C., on Saturday?

“It’s improving right now when I’m talking to you. It’s playing itself, getting better and improving itself,” Hassabis said at the conference.

He said Google DeepMind began building AlphaGo about a year and a half ago. A second version, built in October 2015, could beat the older version 100 percent of the time.

That second version played European Go champion Fan Hui and beat him 5-0, a result announced in January.

“It’s a decade earlier than most experts in the field were predicting,” said Hassabis. “Now we feel we’re ready to challenge the world champion. This really is our Deep Blue moment.”

Hassabis said that AlphaGo has a better chance of beating Sedol than the famous IBM computer Deep Blue would have had: Deep Blue thought like a machine, but AlphaGo thinks more like a human.

“It plays more like a human in the way it thinks,” Hassabis said. “In fact many of the professionals when they looked at the European games, they commented on how human-like the style of the program was. And if you think about it […] it’s more like what a human grandmaster would do” in demonstrating intuition in the moves, said Hassabis.

Hassabis said the secret is the difference between the narrow artificial intelligence (AI) of Deep Blue and the general artificial intelligence (GAI) of AlphaGo. Deep Blue’s every move depended on rules its programmers had coded by hand; an error in that code meant the system would fail. AlphaGo is given only the goal of the game, and must learn through observation and its algorithms how to play and win.

“GAI from the ground up is built to be adaptive and flexible, and to deal gracefully with the unexpected and learn how to deal with that,” said Hassabis. “The goal the system is given here is to maximize the score. Everything else is learned from scratch; it’s not told anything about the rules of the game.”
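To make that concrete, here is a minimal, hypothetical sketch of learning from nothing but a score, in the spirit of what Hassabis describes (an illustration, not DeepMind’s code). A toy agent plays a trivial walk-to-the-goal game; it is never told the rules, only the score it receives, and a standard Q-learning update is enough for it to work out a winning policy:

```python
import random

# Toy "game": walk along positions 0..6. Reaching 6 scores 1; reaching 0 scores 0.
# As in the setup Hassabis describes, the agent is told nothing but the score.
GOAL, START, ACTIONS = 6, 3, (-1, +1)

q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}  # learned move values
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate

def play_episode():
    s = START
    while 0 < s < GOAL:
        # Mostly pick the best-known move; occasionally explore a random one.
        a = random.choice(ACTIONS) if random.random() < epsilon \
            else max(ACTIONS, key=lambda act: q[(s, act)])
        s2 = s + a
        reward = 1.0 if s2 == GOAL else 0.0  # the only signal: the final score
        future = 0.0 if s2 in (0, GOAL) else max(q[(s2, act)] for act in ACTIONS)
        q[(s, a)] += alpha * (reward + gamma * future - q[(s, a)])  # Q-learning update
        s = s2

for _ in range(2000):
    play_episode()

# After training, the greedy policy steps toward the goal from every position.
print({s: max(ACTIONS, key=lambda act: q[(s, act)]) for s in range(1, GOAL)})
```

AlphaGo’s self-play stage rests on the same principle, with deep neural networks and millions of games in place of this lookup table.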

The policy and value networks that make up AlphaGo’s program, combined with Monte Carlo tree search, a technique for sampling possible continuations, allow it to determine which move to make and which side is winning at any moment in a game, much as a human player might.

How AlphaGo learns and determines its next move 

The policy and value “neural networks” that the AlphaGo program uses to decide its next move during games of Go. (Graph from Google DeepMind/Hassabis)
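To illustrate how those two networks steer a search, here is a hedged, toy-scale sketch of Monte Carlo tree search guided by a policy prior and a value estimate. The game (remove 1-3 stones from a pile; whoever takes the last stone wins) and the stand-in `policy_net` and `value_net` functions are illustrative assumptions, nothing like AlphaGo’s real components:

```python
import math, random

# Toy stand-in for Go: remove 1-3 stones per turn; taking the last stone wins.
MOVES = (1, 2, 3)

def legal_moves(pile):
    return [m for m in MOVES if m <= pile]

def policy_net(pile):
    # Stand-in policy network: a uniform prior over legal moves.
    moves = legal_moves(pile)
    return {m: 1.0 / len(moves) for m in moves}

def value_net(pile, rollouts=20):
    # Stand-in value network: estimate the mover's win chance from random games.
    wins = 0
    for _ in range(rollouts):
        p, mover_wins = pile, True
        while p > 0:
            p -= random.choice(legal_moves(p))
            if p == 0:
                break
            mover_wins = not mover_wins
        wins += mover_wins
    return wins / rollouts

class Node:
    def __init__(self, pile):
        self.pile, self.children = pile, {}   # move -> child Node
        self.N, self.W = {}, {}               # per-move visit counts / total value
        self.prior = policy_net(pile) if pile else {}

def simulate(node, c_puct=1.4):
    """One simulation: descend by PUCT, evaluate the leaf, back the value up."""
    if node.pile == 0:
        return 0.0                            # the opponent took the last stone
    total = 1 + sum(node.N.values())
    def puct(m):
        q = node.W.get(m, 0.0) / node.N[m] if node.N.get(m) else 0.0
        u = c_puct * node.prior[m] * math.sqrt(total) / (1 + node.N.get(m, 0))
        return q + u                          # exploitation + network-guided exploration
    m = max(node.prior, key=puct)
    if m not in node.children:                # expand a new leaf: ask the value net
        child = node.children[m] = Node(node.pile - m)
        v = 1.0 if child.pile == 0 else 1.0 - value_net(child.pile)
    else:                                     # descend; the perspective flips each ply
        v = 1.0 - simulate(node.children[m])
    node.N[m] = node.N.get(m, 0) + 1
    node.W[m] = node.W.get(m, 0.0) + v
    return v

root = Node(10)
for _ in range(500):
    simulate(root)
print("visits per move:", root.N)             # play the most-visited move
```

In the real system both the prior and the evaluation come from trained deep networks, which is what lets the search examine far fewer positions than brute force would require.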

But Go is more complicated than most games, including chess. Unlike chess, with about 20 possible moves to consider per position, Go has over 200, said Hassabis. Go also relies on pattern recognition and intuition.

“You can explicitly codify rules and write handcrafted rules and an evaluation function in chess. In Go, that’s impossible,” said Hassabis. “It combines pattern recognition with this search and plan.”
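The arithmetic behind that claim is stark. Taking Hassabis’s rough figures of 20 candidate moves per chess position and 200 per Go position, the number of possible continuations diverges within a handful of moves:

```python
# Rough count of move sequences after d plies, using the figures quoted above.
for d in (2, 5, 10):
    print(f"depth {d:2}: chess ~ {20**d:.1e}  |  Go ~ {200**d:.1e}")
# At depth 10, Go's tree (~1e23) is ten orders of magnitude bigger than chess's (~1e13).
```

That explosion is why the exhaustive search that served chess programs well breaks down for Go.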

As always, practice makes perfect: after training on hundreds of thousands of example Go games and millions of internal test games, AlphaGo started to win 100 percent of the time against other Go programs such as Zen and Crazy Stone, and then against humans.

“We downloaded every game available on the Internet of human experts playing each other, a couple of hundred thousand games, and we took all of those positions and trained the neural network to mimic the human expert. We trained the neural network to try to predict what the next move was,” said Hassabis.
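In machine-learning terms, that first stage is supervised learning: given a position, predict the expert’s next move. The sketch below shows the idea at toy scale, with synthetic stand-in data and a plain softmax classifier instead of AlphaGo’s deep network; every name and number here is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: each "position" is a 9-feature encoding, and the
# label is which of 4 candidate moves the (simulated) expert actually played.
N_FEATURES, N_MOVES, N_POSITIONS = 9, 4, 5000
expert_style = rng.normal(size=(N_FEATURES, N_MOVES))  # hidden "expert" behavior
X = rng.normal(size=(N_POSITIONS, N_FEATURES))
y = np.argmax(X @ expert_style + rng.normal(scale=0.1, size=(N_POSITIONS, N_MOVES)), axis=1)

# Train a softmax classifier with cross-entropy: "predict the expert's move."
W = np.zeros((N_FEATURES, N_MOVES))
for _ in range(300):
    logits = X @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    grad = X.T @ (p - np.eye(N_MOVES)[y]) / N_POSITIONS  # cross-entropy gradient
    W -= 0.5 * grad

accuracy = (np.argmax(X @ W, axis=1) == y).mean()
print(f"agreement with the synthetic expert's moves: {accuracy:.0%}")
```

AlphaGo’s version of this step used the couple of hundred thousand expert games Hassabis mentions, with a deep network learning to predict each next move.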

As AlphaGo plays thousands more games than any human opponent will ever be able to play, and learns how to win from each one, the program seems unstoppable.

Although Hassabis wouldn’t say how much the newest version of AlphaGo has improved over the past few months, he said AlphaGo will be ready to play Sedol in a few weeks.

Hassabis said that Sedol might win if he “finds a weakness in the program, in the system that internal tests didn’t find [as] he is one of the greatest games players of all time.”

After March, Hassabis said, Google DeepMind will begin work on artificial intelligence that uses “random play”: a version of the program with no external knowledge of games whatsoever.

“It will take longer to train, a month or something like that, but at least we’ll be starting with no external knowledge” programmed, said Hassabis. “And what that will mean is that the program AlphaGo will be out of the box for any board game.”

“I see two big challenges facing society today: information overload and the complexity of the systems we’d like to understand from climate and physics and energy. They’re so complex now it’s difficult for even the smartest humans to master in their lifetime,” said Hassabis.

“I want to make AI scientists or AI-assisted science possible. So AI programs that work side by side with scientists and doctors to try and advance our human knowledge,” said Hassabis. Used ethically, he added, artificial intelligence might be able to “be a meta solution to these problems.”

Until then, “The million-dollar question is what’s gonna happen when we play?” said Hassabis.

Photo at top: The battle between AlphaGo and Lee Sedol. (Flickr. Cropped from original)