Perceptron algorithm

June 21, 2017

Lets say you have a dataset for binary classification. Each data point x has n-dimensions, and the target value is y = 1 or -1 for the two classes.

The perceptron

The perceptron algorithm can be visualized as follows.

For a given data point x, the prediction is simply based on whether the linear function f(x) = b + w₁x₁ + w₂x₂ + … + w_n*x_n is greater than or lesser than 0 (in the above diagram, w₀ is b).

define predict(x):
    let f(x) := dot_product(x, w) + b
    prediction = 1 if f(x) > 0 and -1 if f(x) < 0

The learning algorithm

The learning algorithm is iterative. We start by initializing w with random values, and iteratively update its value.

define perceptron():
    A: initialize weights w randomly (n dimensions)
    B: for each data point (x, y) in the dataset
           make prediction p given input x
           if p is not equal to y (i.e. prediction is wrong)
               w = w + y*x (update the weights) 
    C: repeat step B until no points are misclassified

Why does the algorithm work?

The algorithm works because if a point is misclassified, then the new weights are better for the point than the old weights. [1]

For example, suppose for the data point (x, y) that f(x) < 0 and y = 1. Then, after the update, we would want f(x) to be higher than before the update (so that after many updates, it becomes > 0).

Old weights = w, and new weights = w + x. It is easy to verify that

dot-product(w + x, x) + b > dot-product(w, x) + b

Algorithm at work:

Here is an animation of the algorithm during training. Notice how the line moves only when a point is misclassified.

Notes:

Sometimes people initialize the weights to 0 instead of with random values (algorithm step A).
Theoretical guarantee: If the dataset is linearly separable, then the perceptron algorithm will always find a function that separates the data.
If data is not linearly separable, you can repeat step B of algorithm a fixed number of times.

This is my first time writing a tutorial. Hope you enjoyed, and I’d love some feedback. I tried to make the tutorial as clear and to the point as possible.