AlphaGo Zero – A New Breakthrough in AI

In October 2015, AlphaGo became the first computer Go program to beat a human professional Go player without handicaps on a full-sized 19×19 board. In March 2016, it beat Lee Sedol in a five-game match. At the 2017 Future of Go Summit, AlphaGo beat Ke Jie, the world’s No. 1 ranked player at the time, in a three-game match.

[Image: Lee Sedol]

Go was long considered a difficult game for computers to master because, besides being complex, it has an enormous number of possible board configurations: about 10^170, far more than chess and greater than the number of atoms in the universe.
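As a rough sanity check on that scale, you can bound the count from above by letting each of the 361 points on the board be empty, black, or white. The short Python sketch below does just that. The figure of 10^80 atoms is a commonly quoted estimate that I am assuming here, and the names are only for illustration.

```python
# Upper bound on Go board configurations: each of the 19 x 19 points is
# empty, black, or white. Not every arrangement is a legal position, so the
# true count (about 10^170) is somewhat smaller than this bound.
BOARD_POINTS = 19 * 19
upper_bound = 3 ** BOARD_POINTS

# Assumption: a commonly quoted order-of-magnitude estimate, not a measurement.
ATOMS_IN_UNIVERSE = 10 ** 80


def order_of_magnitude(n: int) -> int:
    """Return k such that 10^k <= n < 10^(k+1) for a positive integer n."""
    return len(str(n)) - 1


print(order_of_magnitude(upper_bound))   # prints 172, so the bound is about 10^172
print(upper_bound > ATOMS_IN_UNIVERSE)   # prints True
```

Even this loose bound is about ninety orders of magnitude beyond the atom estimate, which is why checking every position by brute force is not an option.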

After beating Ke Jie earlier this year, DeepMind announced that AlphaGo was retiring from competition. And just a week ago, DeepMind published a paper describing AlphaGo Zero, a leaner and meaner version of AlphaGo, the artificially intelligent program that crushed professional Go players.

I bet you are wondering what AlphaGo Zero does better than AlphaGo. Why can it be called a BREAKTHROUGH in AI?

The answer here is SELF-LEARNING.

Previous versions of AlphaGo were initially trained on thousands of human amateur and professional games to learn how to play Go. AlphaGo Zero skips this step and learns to play simply by playing games against itself, starting from completely random play. In doing so, it quickly surpassed the human level of play and defeated the previously published champion-defeating version of AlphaGo by 100 games to 0.
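To make the idea of learning purely from self-play more concrete, here is a minimal Python sketch. It is not DeepMind's algorithm: AlphaGo Zero pairs a deep neural network with tree search, while this toy uses a simple lookup table on a tiny stone-taking game (take 1 to 3 stones, and whoever takes the last stone wins). The game, the function names, and all parameter values are my own assumptions for illustration. What it shares with the real thing is the setup: no human games, both sides played by the same learner, and only the final win or loss as feedback.

```python
import random
from collections import defaultdict


def self_play_train(pile=10, max_take=3, games=30000, epsilon=0.2, seed=0):
    """Learn move values for a toy stone-taking game purely from self-play."""
    rng = random.Random(seed)
    value = defaultdict(float)  # (stones_left, take) -> average result for the mover
    counts = defaultdict(int)   # how often each (stones_left, take) was played
    for _ in range(games):
        stones = pile
        history = []  # moves in order; the two players alternate
        while stones > 0:
            moves = list(range(1, min(max_take, stones) + 1))
            if rng.random() < epsilon:
                take = rng.choice(moves)  # explore a random move
            else:
                take = max(moves, key=lambda m: value[(stones, m)])  # best known move
            history.append((stones, take))
            stones -= take
        # The player who made the last move won. Walk backwards through the
        # game, flipping the sign each step because the players alternate.
        reward = 1.0
        for state_action in reversed(history):
            counts[state_action] += 1
            value[state_action] += (reward - value[state_action]) / counts[state_action]
            reward = -reward
    return value


def best_move(value, stones, max_take=3):
    """Pick the move with the highest learned value for this pile size."""
    moves = range(1, min(max_take, stones) + 1)
    return max(moves, key=lambda m: value[(stones, m)])


learned = self_play_train()
for stones in (5, 6, 7, 10):
    print(stones, "stones -> take", best_move(learned, stones))
```

If training goes well, the learned moves should leave the opponent with a multiple of four stones, which is the known winning strategy for this little game. Nobody told the program that; it only ever saw who won.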

[Figure: AlphaGo Zero training time graph]

DeepMind used a novel form of reinforcement learning, in which AlphaGo Zero becomes its own teacher. You might say, “Okay, AlphaGo Zero can teach itself, but why is this so great?”

Notice that the new technique means AlphaGo Zero is no longer constrained by the limits of human knowledge. Also, doesn’t it feel a lot like how the human brain works? When you are in a new environment, you start to sense it, gain knowledge from it, and teach yourself to understand it. The new technique is, in a way, a sign of AI getting closer to the human brain.

In addition, AlphaGo Zero uses one neural network rather than two. Earlier versions of AlphaGo used a “policy network” to select the next move to play and a “value network” to predict the winner of the game from each position. In AlphaGo Zero these are combined into a single network, allowing it to be trained and evaluated more efficiently.
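Here is a rough Python sketch of what "one network, two outputs" can look like. It is only a toy under my own assumptions: the real AlphaGo Zero network is a deep convolutional network trained on self-play games, while this one has a single shared hidden layer, random untrained weights, and a hypothetical 3×3 board. The class name and layer sizes are made up for illustration.

```python
import math
import random


class TwoHeadedNet:
    """Toy network: one shared hidden layer feeding a policy head and a value head."""

    def __init__(self, n_inputs, n_hidden, n_moves, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.w_shared = matrix(n_hidden, n_inputs)  # shared by both heads
        self.w_policy = matrix(n_moves, n_hidden)   # policy head: one score per move
        self.w_value = matrix(1, n_hidden)          # value head: one score per position

    @staticmethod
    def _matvec(weights, vector):
        return [sum(w * x for w, x in zip(row, vector)) for row in weights]

    def forward(self, board):
        """Return (move probabilities, predicted outcome in [-1, 1])."""
        hidden = [max(0.0, h) for h in self._matvec(self.w_shared, board)]  # ReLU
        logits = self._matvec(self.w_policy, hidden)
        peak = max(logits)  # subtract the max for a numerically stable softmax
        exps = [math.exp(score - peak) for score in logits]
        total = sum(exps)
        policy = [e / total for e in exps]
        value = math.tanh(self._matvec(self.w_value, hidden)[0])
        return policy, value


# Hypothetical 3x3 board: 1 = my stone, -1 = opponent's stone, 0 = empty.
net = TwoHeadedNet(n_inputs=9, n_hidden=16, n_moves=9)
policy, value = net.forward([0, 1, -1, 0, 0, 1, -1, 0, 0])
print(len(policy), round(sum(policy), 6))
print(-1.0 <= value <= 1.0)
```

Because both heads read the same hidden layer, the position only has to be processed once to get both a move suggestion and a win prediction, which is the efficiency gain described above.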

[Figure: AlphaGo efficiency]

The logic behind AlphaGo Zero imitates how humans think and learn, which makes it more general, and generality is the whole purpose of studying AI.

We want AI to help humans with a huge range of tasks, from housework, driving, and laundry at the personal level to financial work and supply chain support at the industry level. While AlphaGo Zero is a step towards general-purpose AI, it can only work on problems that can be perfectly simulated in a computer. Right now, people are only researching AI one area at a time. AIs that match humans at a huge range of tasks are still a long way off.

We have discussed a lot in class whether AI will be a threat to human beings. Everyone has their own opinion. For me, I still believe the benefits we get from AI will far outweigh its threats to us. And AlphaGo Zero does bring a big surprise: it shows how fast the technology is developing and how much humans have achieved.


Sources:

https://deepmind.com/blog/alphago-zero-learning-scratch/

https://www.theregister.co.uk/2017/10/18/deepminds_latest_alphago_software_doesnt_need_human_data_to_win/

https://www.theguardian.com/science/2017/oct/18/its-able-to-create-knowledge-itself-google-unveils-ai-learns-all-on-its-own

 

4 comments

  1. Great blog post! AlphaGo Zero is another example of how AI is replacing humans. It really transforms the training experience by removing humans and the possibility of human error. Although AlphaGo Zero makes mistakes while training itself, it’s definitely more beneficial that it learns by building on AI to create an invention free of human error. You also brought up the class discussion of AI and whether it is a threat to humans or not. Ultimately, it seems the world is heading toward a more efficient, seamless way of life. It will be interesting to see which industry AI disrupts next!

  2. briandentonbc

    Really cool post! This is a perfect real-world example of machine learning, and it was really interesting to see how AlphaGo Zero was able to surpass AlphaGo so quickly. It’s so scary to think that even at a complex game, the computer is literally able to teach itself to beat the best in the world so quickly. While I agree that this form of machine learning is harmless and benign, the more examples like this of machines surpassing us and our own knowledge so quickly, the more nervous I get about what AI will look like in our future. It will be interesting to see how it all develops. Great post!

  3. andrewmanginelli

    Very interesting to hear more about self-learning. I think another very important factor is that the machine is able to learn as long as it is hooked up to electricity. It does not need to sleep or eat. This is another obvious reason why many believe that robots will be able to replace humans in the workforce. It’s obvious that AI is going to penetrate all parts of our society. It’s pretty frightening to me that our governments are not taking a proactive approach in developing plans for how our economies will work with robots (what will people do?).

  4. Nice post. Interesting to see the developments in AI. Exciting times ahead!
