Paper Summary: Human-Level Control Through Deep Reinforcement Learning
Why this paper?
First successful combination of reinforcement learning and deep learning that scales. The result outperforms preceding approaches on Atari games. It uses only pixel data, game scores, and the number of actions as input, with the same architecture across different games.
Why reinforcement learning?
That is how animals and humans seem to make decisions in their environments, as evidenced by parallels between neural recordings and temporal difference RL algorithms.
What about previous approaches?
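Earlier approaches relied on handcrafted features. When a non-linear function approximator such as a neural network is used to represent Q, the value estimates can become unstable or even diverge. Stable neural network approaches existed, such as neural fitted Q-iteration, but they are too slow to work for large networks.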
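To make the quantity Q and the instability remark more concrete, here is a minimal sketch of a single Q-learning update on a small table. It is not the paper's method, which uses a deep network rather than a table; the function name `q_learning_update`, the parameters `alpha` and `gamma`, and the toy values are all assumptions chosen for illustration. The thing to notice is that the update target is itself built from the current Q estimates. In a table, changing one entry leaves the others alone, but with a non-linear approximator a single update can shift many estimates, and with them the targets, which is one plausible source of the instability mentioned above.

```python
def q_learning_update(q, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the TD error.

    q is a list of lists indexed as q[state][action] (illustrative layout).
    """
    # Bootstrapped target: the reward plus the discounted best estimated
    # value of the next state. The target depends on q itself.
    if done:
        target = reward
    else:
        target = reward + gamma * max(q[next_state])
    # Temporal difference error: how far the current estimate is from the target.
    td_error = target - q[state][action]
    # Move the estimate a fraction alpha of the way toward the target.
    q[state][action] += alpha * td_error
    return td_error


# Toy usage: 3 states, 2 actions, all values hypothetical.
q_table = [[0.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
error = q_learning_update(q_table, state=0, action=1, reward=1.0,
                          next_state=1, done=False, alpha=0.5, gamma=0.9)
print(error, q_table[0][1])
```

With these toy numbers the target is roughly 1.0 + 0.9 * 2.0, so the TD error is about 2.8 and the updated entry moves halfway there.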
What are the outcomes?
Tested the method against best performing approaches at the time and a professional game tester. Used 49 different Atari g...