Naive Bayes is a widely used classification algorithm. It is a supervised learning algorithm based on Bayes’ theorem. The word "naive" comes from the assumption of independence among features: if our input vector is (x1, x2, ..., xn), then the xi's are assumed to be conditionally independent given y.
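Let's start with Bayes' theorem (for naive Bayes, x is the input and y is the output):

$$P(y \mid x) = \frac{P(x \mid y)\, P(y)}{P(x)}$$

When we have more than one feature, we can rewrite Bayes' theorem as:

$$P(y \mid x_1, x_2, \ldots, x_n) = \frac{P(x_1, x_2, \ldots, x_n \mid y)\, P(y)}{P(x_1, x_2, \ldots, x_n)}$$

Since we are making the assumption that the xi's are conditionally independent given y, we can rewrite the above as:

$$P(y \mid x_1, x_2, \ldots, x_n) = \frac{P(y) \prod_{i=1}^{n} P(x_i \mid y)}{P(x_1, x_2, \ldots, x_n)}$$

We also know that P(x1, x2, ..., xn) is a constant for a given input, i.e.

$$P(y \mid x_1, x_2, \ldots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y) \tag{1}$$

In equation (1):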
- The left-hand side is the term we are interested in: the probability distribution of the output y given the input x.
- P(y) can be estimated by counting the number of times each class y appears in our training data. This relative-frequency count is the maximum likelihood estimate of P(y); choosing the class that maximizes the left-hand side is the maximum a posteriori (MAP) decision rule.
- P(xi|y) can be estimated by counting the number of times each value of xi appears for each class y in our training data.
Putting this together, the algorithm is:
- Estimate P(y): P(y=t) = (number of times class t appears in the dataset) / (size of the dataset)
- Estimate P(xi|y): P(xi=k|y=t) = (number of data points where xi has value k and y has value t) / (number of data points of class t)
- Estimate P(y|x1,...,xn): plug the estimated values of P(y) and P(xi|y) into equation (1), then normalize the results so that they sum to 1 over all classes.
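The counting steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names `fit_counts` and `predict_proba` and the tiny weather-style dataset are made up for this example, and a feature value never seen with a class simply gets probability 0 (no smoothing is applied).

```python
from collections import Counter, defaultdict

def fit_counts(X, y):
    """Estimate P(y) and P(xi|y) by counting (hypothetical helper)."""
    n = len(y)
    class_counts = Counter(y)
    # prior[t] = (number of times class t appears) / (size of the dataset)
    prior = {t: c / n for t, c in class_counts.items()}
    # value_counts[(i, t)][k] = number of points with xi == k and y == t
    value_counts = defaultdict(Counter)
    for row, t in zip(X, y):
        for i, k in enumerate(row):
            value_counts[(i, t)][k] += 1
    # likelihood[(i, t)][k] = P(xi = k | y = t)
    likelihood = {
        (i, t): {k: c / class_counts[t] for k, c in counts.items()}
        for (i, t), counts in value_counts.items()
    }
    return prior, likelihood

def predict_proba(x, prior, likelihood):
    """Return normalized P(y | x1, ..., xn) for every class."""
    scores = {}
    for t, p_t in prior.items():
        score = p_t
        for i, k in enumerate(x):
            # An unseen (value, class) pair contributes probability 0.
            score *= likelihood.get((i, t), {}).get(k, 0.0)
        scores[t] = score
    total = sum(scores.values())
    if total == 0:
        return scores  # every class scored 0, so there is nothing to normalize
    return {t: s / total for t, s in scores.items()}

# Made-up toy data: (outlook, temperature) -> play outside?
X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
     ("rainy", "cool"), ("sunny", "cool"), ("rainy", "hot")]
y = ["no", "no", "yes", "yes", "yes", "no"]

prior, likelihood = fit_counts(X, y)
probs = predict_proba(("sunny", "mild"), prior, likelihood)
print(probs)  # P(no | x) is about 0.667 and P(yes | x) about 0.333
```

Here P(no) = P(yes) = 0.5, P(sunny|no) = 2/3, P(sunny|yes) = 1/3 and P(mild|no) = P(mild|yes) = 1/3, so the unnormalized scores are 1/9 and 1/18, which normalize to 2/3 and 1/3.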
There are several variants of naive Bayes, which differ in the distribution they assume for P(xi|y): the Gaussian distribution (Gaussian naive Bayes), the multinomial distribution (multinomial naive Bayes) and the Bernoulli distribution (Bernoulli naive Bayes).
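To make the Gaussian variant concrete, here is a rough sketch in plain Python. It assumes each feature within a class follows a normal distribution whose mean and variance are estimated from the training data. The names `fit_gaussian_nb`, `log_gaussian` and `predict`, the small variance floor of 1e-9, and the six data points are all illustrative choices, not part of any library.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Per class: the prior P(y) and a (mean, variance) pair for each feature."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    params = {}
    for label, rows in rows_by_class.items():
        stats = []
        for column in zip(*rows):  # iterate over features
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, var + 1e-9))  # small floor avoids a zero variance
        params[label] = (len(rows) / len(y), stats)
    return params

def log_gaussian(x, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def predict(x, params):
    """Return the class maximizing log P(y) + sum_i log P(xi | y)."""
    def log_score(label):
        prior, stats = params[label]
        return math.log(prior) + sum(
            log_gaussian(v, m, s) for v, (m, s) in zip(x, stats)
        )
    return max(params, key=log_score)

# Made-up two-feature data with two well-separated classes.
X = [(180, 80), (175, 77), (185, 90), (160, 55), (165, 60), (158, 52)]
y = ["A", "A", "A", "B", "B", "B"]

params = fit_gaussian_nb(X, y)
pred_a = predict((178, 82), params)
pred_b = predict((162, 57), params)
print(pred_a, pred_b)  # A B
```

Working with sums of log probabilities instead of products is a common way to avoid numerical underflow when there are many features; it does not change which class scores highest.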
```python
# We will use the iris dataset.
# The iris flower data set consists of 50 samples from each of three
# species of Iris (Iris setosa, Iris virginica and Iris versicolor).
# Four features were measured from each sample: the length and the width
# of the sepals and petals, in centimeters.
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

# load the dataset
data = load_iris()

# fit the model
model = GaussianNB()
model.fit(data.data, data.target)

# evaluate (note: this is accuracy on the training data itself)
print(model.score(data.data, data.target))
# output = 0.96

# predict the class of a new sample
print(model.predict([[4.2, 3, 0.9, 2.1]]))
# 0 = setosa, 1 = versicolor, and 2 = virginica
```
Naive Bayes is one of the simplest yet most effective algorithms for tasks such as:
- Text classification: For example, we have a number of news articles, and we want to learn to classify whether an article is about politics, health, technology, sports or lifestyle.
- Spam filtering: We have a number of emails, and we want to learn to classify whether an email is spam or not.
- Gender classification: Given features such as height, weight, etc., predict whether the person is male or female.
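As an illustration of the spam filtering use case, the sketch below applies a multinomial-style naive Bayes to word counts. Several things here are assumptions made for the example rather than anything stated above: the helper names `train` and `classify`, the four made-up messages, whitespace tokenization, and add-one (Laplace) smoothing, which is used so that a word unseen in one class does not force that class's probability to 0.

```python
import math
from collections import Counter

def train(messages, labels):
    """Count documents and words per class (hypothetical helper)."""
    class_counts = Counter(labels)
    word_counts = {label: Counter() for label in class_counts}
    for text, label in zip(messages, labels):
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab

def classify(text, class_counts, word_counts, vocab):
    """Return the label with the highest log P(y) + sum of log P(word | y)."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for word in text.lower().split():
            if word not in vocab:
                continue  # skip words never seen during training
            # add-one smoothing: (count + 1) / (total words + vocabulary size)
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training messages.
messages = ["win money now", "free prize win",
            "meeting at noon", "lunch at noon tomorrow"]
labels = ["spam", "spam", "ham", "ham"]

model = train(messages, labels)
label_1 = classify("win a free prize", *model)
label_2 = classify("meeting tomorrow at noon", *model)
print(label_1, label_2)  # spam ham
```

A real filter would need far more training data and better tokenization; the point here is only that the same counting procedure carries over to text once each message is treated as a bag of words.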