DeepMind, a division of Google that's focused on advancing artificial intelligence research, unveiled a new version of its AlphaGo program today that learned the game exclusively by playing itself.

"If we learn the game of Go purely through supervised learning, the best you could hope to do would be as good as the human you're imitating..."

By the 72-hour mark, the AI had already surpassed AlphaGo Lee, the predecessor famous for being the first machine to beat a human Go champion, defeating Lee Sedol in 4 out of 5 games. (There are approximately 10^170 possible positions on a Go board.) As DeepMind co-founder Demis Hassabis told The Verge, AlphaGo Zero could be reprogrammed to sort through other kinds of data instead. It is also worth remembering that Go is simpler to learn than chess.
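For a sense of scale, here is a rough back-of-the-envelope sketch of how large the space of Go board arrangements is. It assumes a standard 19x19 board and counts every way of marking each intersection as empty, black or white. That is only an upper bound: many of those arrangements break the rules, so the number of legal positions is smaller. The function names are illustrative and do not come from DeepMind.

```python
BOARD_SIZE = 19                    # standard Go board (assumption)
POINTS = BOARD_SIZE * BOARD_SIZE   # 361 intersections


def upper_bound_positions(points=POINTS):
    """Count arrangements where each point is empty, black or white."""
    return 3 ** points


def order_of_magnitude(n):
    """Return the power of ten of a positive integer (digits minus one)."""
    return len(str(n)) - 1


if __name__ == "__main__":
    bound = upper_bound_positions()
    print(f"Upper bound on arrangements: about 10^{order_of_magnitude(bound)}")
```

Python integers have arbitrary precision, so the exact value of the bound can be computed directly rather than estimated with floating-point logarithms.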

The next move from the best possible chain is then played, and the computer players repeat the above steps, coming up with chains of moves ranked by strength. AlphaGo made history again last year, when it pummeled Lee Sedol, and triumphed once more in May with a three-game win against the current No. 1 ranked player, Ke Jie. After 40 days of self-play, AlphaGo Zero beat the Master version (the same program that triumphed over world number one Ke Jie in May) 89-11, making it "arguably the strongest Go player in history". The original AlphaGo analyzed those games, move by move, and then played itself in simulations over and over again, hyper-optimizing moves each turn based on its store of human knowledge about the game. However, that iteration of the algorithm has now been thoroughly thrashed by DeepMind's new AlphaGo Zero. For the first time, a neural network has taught itself completely how to play a game, accumulating in just a few days knowledge that took humans thousands of years to build. Zero only understood that concept later in its training, according to DeepMind's paper.

To accomplish this, the company used a machine learning technique called "reinforcement learning" to push Zero to optimize its gameplay.
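To give a flavor of what learning by self-play means, here is a toy sketch. It is not DeepMind's method: AlphaGo Zero pairs a deep neural network with a search algorithm, while this example uses a simple lookup table and a far smaller game, a single-pile version of Nim in which players alternately take 1 to 3 stones and whoever takes the last stone wins. The function names, the choice of game and all parameter values are assumptions made for illustration only.

```python
import random


def train_self_play(max_stones=12, episodes=20000, alpha=0.5, epsilon=0.5, seed=0):
    """Learn Nim by playing both sides from one shared value table."""
    rng = random.Random(seed)
    # q[s][a]: estimated result (+1 win, -1 loss) for the player to move
    # with s stones left who takes a stones.
    q = {s: {a: 0.0 for a in range(1, min(3, s) + 1)}
         for s in range(1, max_stones + 1)}
    for _ in range(episodes):
        stones = rng.randint(1, max_stones)  # random start for coverage
        while stones > 0:
            moves = list(q[stones])
            if rng.random() < epsilon:
                move = rng.choice(moves)                 # explore
            else:
                move = max(moves, key=q[stones].get)     # exploit
            remaining = stones - move
            if remaining == 0:
                target = 1.0  # taking the last stone wins
            else:
                # The opponent moves next, so their best outcome is our worst.
                target = -max(q[remaining].values())
            q[stones][move] += alpha * (target - q[stones][move])
            stones = remaining
    return q


def best_move(q, stones):
    """Return the move the learned table currently rates highest."""
    return max(q[stones], key=q[stones].get)


if __name__ == "__main__":
    table = train_self_play()
    for pile in (5, 6, 7):
        print(pile, "stones -> take", best_move(table, pile))
```

The program is told only the rules and whether a game was won. The known winning strategy for this game, leaving the opponent a multiple of four stones, is never written into the code; it has to emerge from the table's own games.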

He adds: "They're not putting explicit declarative knowledge of things other than the rules of Go in there, but there's a lot of implicit knowledge that the programmers have about how to construct machines to play problems like Go". Like the original, it used a deep neural network and a powerful search algorithm to pick the next move.

"However, this is not the beginning of any end because AlphaGo Zero, like all other successful AI so far, is extremely limited in what it knows and in what it can do compared with humans and even other animals", he said.

DeepMind's human-conquering AlphaGo AI just got even smarter.

AlphaGo Zero required no human input to learn how to play Go, and beat every single AI on its path to becoming the ultimate player.

Hassabis said that using real-world data, particularly human-generated or labeled data, is problematic because it can be expensive, not publicly available, or biased. "AlphaGo Zero is now the strongest version of our program and shows how much progress we can make even with less computing power and zero use of human data".

When asked how many Alphabet dollars DeepMind used to fund all of its AlphaGo work, Hassabis said it was hard to quantify before admitting that the figure is "probably quite scary".

"AlphaGo Zero becomes its own teacher", wrote DeepMind researchers recently on the company blog.

While the AlphaGo Zero breakthrough is impressive, it's worth noting that researchers are still a long way off the AIs depicted in Hollywood films like "Ex Machina" or "Her".

