[Paper Summary + Doubts] Deep Residual Learning for Image Recognition
This is a great paper that addresses the degradation problem observed when training very deep neural networks: accuracy saturates and then degrades as depth grows, and the paper notes this is not simply the vanishing/exploding gradient problem, which normalized initialization and intermediate normalization layers largely handle. The solution proposed is a deep residual learning framework that allows extremely deep CNN models to be trained for various visual recognition tasks. The architecture consists of stacked convolutional layers with identity shortcut connections that skip over every two layers, adding a block's input directly to its output. In this way, every pair of layers is trained to approximate a residual function of an underlying mapping.
The claim made in the paper is that if a stack of layers can asymptotically approximate an underlying mapping H(x), it can equally approximate the residual function F(x) = H(x) - x, from which H(x) is recovered by adding x back; the hypothesis is that the residual form is easier to optimize with several layers of a neural network. This intuition isn't very clear to me. Section 3.1 discusses it, and I was wondering if someone could help me understand it.
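To make the block structure concrete, here is a minimal sketch of a basic two-layer residual block in PyTorch (my own illustration with an arbitrary channel count, not the authors' code):

```python
import torch.nn as nn
import torch.nn.functional as F

class BasicBlock(nn.Module):
    """Two 3x3 conv layers learn F(x) = H(x) - x; the skip adds x back."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))  # this is F(x), the learned residual
        return F.relu(out + x)           # H(x) = F(x) + x via the identity shortcut
```

One way I read the intuition: if the optimal mapping for a block is close to the identity, the layers only have to push their weights toward zero (F(x) ≈ 0), which is presumably easier than fitting an identity mapping from scratch through a stack of nonlinear layers.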
Some further questions and observations:
This framework doesn't seem to have several fully connected layers at the end, as the VGG/AlexNet papers do; instead it ends with global average pooling followed by a single fully connected layer.
This paper focused on solving the degradation problem: with plain networks, accuracy saturates and then degrades as depth increases, even on the training set. The paper's hypothesis is that plain deep networks are simply harder to optimize, and that identity shortcuts ease optimization, with gradients also able to flow directly through the skip paths. That makes sense, but there's more to the story.
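One way to make the gradient claim precise (a gloss of my own; this derivation appears in the authors' follow-up paper "Identity Mappings in Deep Residual Networks" rather than in this one): for a block y = x + F(x), the chain rule gives

$$\frac{\partial \mathcal{L}}{\partial x} = \frac{\partial \mathcal{L}}{\partial y}\left(1 + \frac{\partial F}{\partial x}\right),$$

so the gradient reaching earlier layers always carries a direct identity term in addition to whatever propagates through the weight layers, and it cannot vanish just because F's contribution is small.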
The self-referential formulation of ResNets leads to...
Hey! Welcome (back) to the ML reading group. Indicate what type of reading you're interested in doing and suggest papers here! We meet weekly to engage in discussions about various areas of machine learning.
Paper Summary: Human-Level Control Through Deep Reinforcement Learning
Why this paper?
First scalable, successful combination of reinforcement learning and deep learning. The result outperforms preceding approaches at Atari games while using only pixel data, the game score, and the number of available actions, with the same architecture and hyperparameters across different games.
Why reinforcement learning?
It is how animals and humans seem to make decisions in their environments, as suggested by parallels between the activity of dopaminergic neurons and temporal-difference RL algorithms.
What about previous approaches?
Previous approaches relied on handcrafted features. When non-linear approximators such as neural networks are used to represent Q, the learned values become unstable or even diverge. More stable neural-network approaches did exist, like neural fitted Q-iteration, but they are slow, retraining the network from scratch at each iteration, and so don't scale to large networks.
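To see where the instability comes from, here is a minimal sketch (my own illustration in PyTorch, not code from the paper) of the bootstrapped target a Q-network is regressed onto:

```python
import torch

def q_learning_targets(q_net, rewards, next_states, dones, gamma=0.99):
    """Bootstrapped targets y = r + gamma * max_a' Q(s', a').

    The same network produces the targets it is then trained to match, so
    every update shifts the targets; with nonlinear approximators this
    feedback loop can oscillate or diverge. DQN stabilizes it by sampling
    transitions from an experience replay buffer and by computing targets
    with a periodically updated frozen copy of the network.
    """
    with torch.no_grad():
        next_q = q_net(next_states).max(dim=1).values    # max over actions
        return rewards + gamma * (1.0 - dones) * next_q  # no bootstrap at terminal states
```

Here q_net is assumed to be any torch module mapping a batch of states to per-action values; in DQN, the q_net used inside the target computation would be the frozen copy.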
What are the outcomes?
Tested the method against the best-performing approaches at the time and against a professional human games tester. Used 49 different Atari games.