Currently, most machine learning products use supervised learning. Here we have a set of features or inputs X (for example, an image), and our model predicts a target or output variable y (for example, a caption for the image).
In other words, our model learns a function that maps inputs to desired outputs. Features are the independent variables and targets are the dependent variables.
Supervised learning problems can be further grouped into classification and regression problems. When the output variable is a category, such as "spam" or "ham" (non-spam), the problem is a classification problem. When the output variable is a real value, such as the price of a house, it is a regression problem.
- Spam filtering: Given an email, decide whether it is spam or not
- Image classification: Given an image, output which objects are present in it (dog, cat, computer, building, and so on)
- House price prediction: Given information about a house, predict its price
- Netflix: Given a user and a movie, predict the rating the user is going to give to the movie (which can then be used for providing recommendations)
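To make the difference between regression and classification concrete, here is a minimal sketch in plain Python. The data values and the helper names `fit_line` and `nearest_neighbor_label` are made up for illustration, and the two methods (a least-squares line and a 1-nearest-neighbour rule) are simple stand-ins rather than what the products above actually use.

```python
from statistics import mean

def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar

def nearest_neighbor_label(train, query):
    """Classification: return the label of the closest training example."""
    closest = min(train, key=lambda pair: abs(pair[0] - query))
    return closest[1]

# Regression: hypothetical house sizes (X) and prices (y, a real value).
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 90 + intercept

# Classification: hypothetical count of suspicious words per email (X)
# and the category of each email (y).
emails = [(0, "ham"), (1, "ham"), (7, "spam"), (9, "spam")]
predicted_label = nearest_neighbor_label(emails, 8)

print(predicted_price, predicted_label)
```

In both cases the model is fitted on (X, y) pairs and then asked about an X it has not seen; the only difference is whether y is a number or a category.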
In unsupervised learning we have input data X but no output variable y. The goal of unsupervised learning is to model the distribution of the data, i.e. to identify patterns in it.
Unsupervised learning problems can be further grouped into clustering and association problems. When we want to discover inherent groupings in the input data, it is known as a clustering problem. When we want to discover rules that describe portions of the input data, it is known as an association problem.
- Given a list of customers and information about them, discover groups of similar users. This knowledge can then be used for targeted marketing.
- Anomaly detection: Given measurements from sensors in a manufacturing facility, identify anomalies, i.e. signs that something is wrong
- Discover patterns in data, such as: whenever it rains, people tend to stay indoors; when it is hot, people buy more ice cream.
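As an illustration of clustering, the customer-grouping example can be sketched with k-means, one common clustering method (the text above does not prescribe a particular algorithm). The spending figures, the function name `k_means_1d`, and the starting centroids are all assumptions chosen to keep the example small. Note that no labels y are used anywhere.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group 1-D values around centroids by alternating two steps:
    assign each value to its nearest centroid, then move each centroid
    to the mean of the values assigned to it."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical monthly spend per customer; there is no target variable.
spend = [12, 15, 14, 210, 190, 205]
centroids, clusters = k_means_1d(spend, centroids=[0.0, 100.0])

print(clusters)
```

The two groups that come out (low spenders and high spenders) were never given to the algorithm; it discovered them from the inputs alone.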
In the above two problem categories, the input is given to us. In reinforcement learning, the key difference is that the input itself depends on the actions we take. For example, in robotics, we might start in a situation where the robot does not know anything about its surroundings. As it takes actions, it finds out more about the world. But the world it sees depends on whether it chose to move forward or to turn right.
The robot is known as an agent, and it is in some environment (its surroundings). At each time step, it can take some action and it might receive some reward (say, a negative one because the robot fell in a ditch, or a positive one because it found a lake on Mars).
Example reinforcement learning problems:
- Robotics: A robot is in a maze, and it needs to find a way out.
- Training an AI for a complex game such as Civilization or Dota
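The agent, environment, action and reward loop can be sketched with tabular Q-learning, one standard reinforcement learning algorithm. Everything specific here is an assumption made for illustration: the "maze" is just a corridor of five cells with the exit at the right end, the reward is 1 for reaching the exit and 0 otherwise, and the hyperparameters are arbitrary.

```python
import random

N_STATES = 5          # corridor cells 0..4; the exit is cell 4
ACTIONS = (-1, +1)    # move left or move right

def step(state, action):
    """Environment: apply an action, return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Agent: learn action values Q(state, action) by trial and error."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)      # explore
            else:                                 # exploit, random tie-break
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice(
                    [a for a in ACTIONS if q[(state, a)] == best])
            next_state, reward, done = step(state, action)
            future = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            target = reward + gamma * future
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
    return q

q_table = train()
# Best action in each non-exit cell according to the learned values.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[(s, a)])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

Notice that the data the agent learns from (which cells it visits and which rewards it sees) is produced by its own choices, which is the difference from the supervised and unsupervised settings.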
Problems may not necessarily fall into one of the above categories cleanly.
For example, in the image classification problem, we are given a number of images and the objects present in those images as training data. However, we may also have a strategy for using the large number of images available on the web (for which the objects have not been annotated). This is an example of semi-supervised learning, i.e. we have some data that is labelled and some that is not.
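One possible strategy for using the unlabelled data is self-training: fit a model on the labelled examples, use it to guess labels for the unlabelled ones, and refit on everything. The sketch below is only an assumption about what such a strategy could look like. It uses a single made-up numeric feature per image and a nearest-centroid classifier, and the helper names `class_centroids` and `predict` are hypothetical.

```python
def class_centroids(points):
    """Return the mean feature value for each label."""
    sums = {}
    for x, label in points:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + x, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}

def predict(centroids, x):
    """Pick the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

# A few labelled examples and many unlabelled ones (made-up values).
labelled = [(1.0, "cat"), (9.0, "dog")]
unlabelled = [2.0, 3.0, 4.0, 7.0, 8.0, 8.5]

# Step 1: fit on the labelled data only.
centroids = class_centroids(labelled)
# Step 2: guess ("pseudo-label") the unlabelled data with that model.
pseudo_labelled = [(x, predict(centroids, x)) for x in unlabelled]
# Step 3: refit on the labelled and pseudo-labelled data together.
centroids = class_centroids(labelled + pseudo_labelled)

print(centroids)
```

In practice the pseudo-labels can be wrong, so real systems usually keep only the confident ones; that filtering is left out here for brevity.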
As another example, we might want to learn a model that, given an image of a handwritten sentence, tells us what the sentence is (this problem is called Optical Character Recognition). Suppose we have only 100 images of sentences with their corresponding labels, but we also have 100,000 images of individual letters with their corresponding labels. Then we could learn a model on the letters, and re-use that model in some way when learning a model on the sentences. This concept is called transfer learning (as in, we are transferring our knowledge from one domain to another).
We should note that the boundaries between the above problem types are quite fluid. For example, a supervised learning problem could become a semi-supervised or transfer learning problem if we find we have more data that can help our model in some way, even though it doesn't look exactly like the data for the problem we have at hand.
We've made a lot more progress towards solving supervised learning than unsupervised learning. That is, there are many more problems for which supervised machine learning algorithms make accurate predictions than there are problems for which unsupervised or reinforcement learning algorithms do.
A majority of the successful machine learning products currently fall under the category of supervised learning. Examples include predicting what rating a user might give to a movie (think Netflix), how likely a person is to buy a particular product (think Amazon), or how likely it is for a particular email to be spam.
Unsupervised and reinforcement learning are areas of active research, and we've recently made significant progress in both with algorithms such as Generative Adversarial Networks (for unsupervised learning) and Deep Q-networks (for reinforcement learning).