The paper proposes an adversarial approach for estimating generative models where one model (generative model) tries to learn a data distribution and another model (discriminative model) tries to distinguish between samples from the generative model and original data distribution.
Two models - Generative Model(G) and Discriminative Model(D)
Both are multi-layer perceptrons.
G takes as input a noise variable z and outputs data sample x(=G(z)).
D takes as input a data sample x and predicts whether it came from true data or from G.
G tries to minimise log(1-D(G(z))) while D tries to maximise the probability of correct classification.
Think of it as a minimax game between 2 players and the global optimum would be when G generates perfect samples and D can not distinguish between the samples (thereby always returning 0.5 as the probability of sample coming from true data).
Alternate between k steps of training D and 1 step of training G so that D is maintained near its optimal solution.
When starting training, the loss log(1-D(G(z))) would saturate as G would be weak. Instead maximise log(D(G(z)))
The paper contains the theoretical proof for global optimum of the minimax game.
Datasets: MNIST, Toronto Face Database, CIFAR-10
Generator model uses RELU and sigmoid activations.
Discriminator model uses maxout and dropout.
Evaluation Metric: Fit Gaussian Parzen window to samples obtained from G and compare log-likelihood.
Backprop is sufficient for training with no need for Markov chains or performing inference.
A variety of functions can be used in the model.
Since G is trained only using the gradients from D, fewer chances of directly copying features from the true data.
Can represent sharp (even degenerate) distributions.
D must be well synchronised with G.
While G may learn to sample data points that are indistinguishable from true data, no explicit representation can be obtained.