Naïve Bayes Classifier
Naïve Bayes Classifier is one of the simplest and most effective classification
algorithms. It helps in building fast machine learning models that can make
quick predictions.
Some popular applications of the Naïve Bayes algorithm are spam filtering, sentiment
analysis, and classifying articles.
The name Naïve Bayes is made up of two words, Naïve and Bayes, which can be
described as:
Naïve: It is called Naïve because it assumes that the occurrence of a certain feature
is independent of the occurrence of the other features. For example, if a fruit is
identified on the basis of color, shape, and taste, then a red, spherical, and sweet
fruit is recognized as an apple. Each feature individually contributes to identifying
it as an apple, without depending on the others.
Bayes: It is called Bayes because it is based on Bayes' theorem.
Bayes' Theorem:
Bayes' theorem, also known as Bayes' rule or Bayes' law, is used to
determine the probability of a hypothesis with prior knowledge. It depends on
conditional probability:
P(A|B) = P(B|A) * P(A) / P(B)

Where,

P(A|B) is Posterior probability: Probability of hypothesis A given the observed event B.
P(B|A) is Likelihood probability: Probability of the evidence given that the hypothesis is true.
P(A) is Prior probability: Probability of the hypothesis before observing the evidence.
P(B) is Marginal probability: Probability of the evidence.
Problem: If the weather is sunny, should the player play or not?
Consider the following dataset of weather conditions and the corresponding target variable "Play":
    Outlook     Play
0   Rainy       Yes
1   Sunny       Yes
2   Overcast    Yes
3   Overcast    Yes
4   Sunny       No
5   Rainy       Yes
6   Sunny       Yes
7   Overcast    Yes
8   Rainy       No
9   Sunny       No
10  Sunny       Yes
11  Rainy       No
12  Overcast    Yes
13  Overcast    Yes
Frequency table of the weather conditions:

Weather     Yes   No
Overcast    5     0
Rainy       2     2
Sunny       3     2
Total       10    4

Likelihood table of the weather conditions:

Weather     No            Yes
Overcast    0             5             P(Overcast) = 5/14 = 0.35
Rainy       2             2             P(Rainy)    = 4/14 = 0.29
Sunny       2             3             P(Sunny)    = 5/14 = 0.35
All         4/14 = 0.29   10/14 = 0.71
Applying Bayes' theorem:

P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny)

P(Sunny|Yes) = 3/10 = 0.30
P(Yes) = 10/14 = 0.71
P(Sunny) = 5/14 = 0.35

So P(Yes|Sunny) = 0.30 * 0.71 / 0.35 ≈ 0.60

P(No|Sunny) = P(Sunny|No) * P(No) / P(Sunny)

P(Sunny|No) = 2/4 = 0.50
P(No) = 4/14 = 0.29
P(Sunny) = 5/14 = 0.35

So P(No|Sunny) = 0.50 * 0.29 / 0.35 ≈ 0.41

(With exact fractions the posteriors are 3/5 = 0.60 and 2/5 = 0.40; the small discrepancy comes from rounding the intermediate values.)

Since P(Yes|Sunny) > P(No|Sunny), the player should play on a sunny day.
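The calculation above can be checked with a few lines of Python, using the exact fractions rather than the rounded intermediate values (which is why P(No|Sunny) comes out as 0.40 instead of 0.41):

```python
# Counts taken from the frequency table: 14 days total, 10 "Yes", 4 "No",
# and 5 sunny days of which 3 were "Yes" and 2 were "No".
p_sunny = 5 / 14
p_yes, p_no = 10 / 14, 4 / 14
p_sunny_given_yes = 3 / 10
p_sunny_given_no = 2 / 4

# Bayes' theorem: P(class | Sunny) = P(Sunny | class) * P(class) / P(Sunny)
p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny
p_no_given_sunny = p_sunny_given_no * p_no / p_sunny

print(round(p_yes_given_sunny, 2))  # 0.6
print(round(p_no_given_sunny, 2))   # 0.4
```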
Naïve Bayes is one of the fastest and easiest ML algorithms for predicting the class of a dataset.
On the other hand, Naive Bayes assumes that all features are independent or unrelated, so it cannot
learn relationships between features.
Types of Naïve Bayes Model:
There are three types of Naïve Bayes Model, which are given below:
Gaussian: The Gaussian model assumes that the features follow a normal distribution.
This means that if the predictors take continuous values instead of discrete ones, the
model assumes these values are sampled from a Gaussian distribution.
Multinomial: The Multinomial Naïve Bayes classifier is used when the data is
multinomially distributed. It is primarily used for document classification problems,
i.e., deciding which category a particular document belongs to, such as Sports,
Politics, or Education. The classifier uses the frequency of words as the predictors.
Bernoulli: The Bernoulli classifier works similarly to the Multinomial classifier, but the
predictor variables are independent Boolean variables, such as whether a particular
word is present in a document or not. This model is also popular for document
classification tasks.
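As a rough sketch of how the three variants map onto scikit-learn (the class names GaussianNB, MultinomialNB, and BernoulliNB come from sklearn.naive_bayes; the tiny datasets below are made up purely for illustration):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

y = np.array([0, 0, 1, 1])  # two toy classes

# Gaussian NB: continuous features, assumed normally distributed per class
x_cont = np.array([[1.0, 2.1], [0.9, 1.9], [3.0, 4.2], [3.1, 4.0]])
print(GaussianNB().fit(x_cont, y).predict([[1.0, 2.0]]))      # [0]

# Multinomial NB: count features, e.g. word frequencies per document
x_counts = np.array([[3, 0, 1], [2, 0, 0], [0, 4, 2], [0, 3, 1]])
print(MultinomialNB().fit(x_counts, y).predict([[2, 0, 1]]))  # [0]

# Bernoulli NB: binary features, e.g. word present / absent
x_bool = (x_counts > 0).astype(int)
print(BernoulliNB().fit(x_bool, y).predict([[1, 0, 1]]))      # [0]
```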
Steps to implement:
1) Data pre-processing step
2) Fitting Naive Bayes to the training set
3) Predicting the test result
4) Test accuracy of the result (creation of confusion matrix)
5) Visualizing the test set result

1) Data pre-processing step:
In this step, we will pre-process/prepare the data so that we can use it efficiently in our
code. It is similar to what we did earlier in data pre-processing. The code for this is given below:
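A minimal sketch of this pre-processing step. In the tutorial the data comes from user_data.csv; the Age / EstimatedSalary / Purchased column names are an assumption based on the "purchased" variable mentioned later, and a small synthetic stand-in keeps the snippet self-contained:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for user_data.csv (hypothetical column names).
rng = np.random.default_rng(0)
dataset = pd.DataFrame({
    'Age': rng.integers(18, 60, 400),
    'EstimatedSalary': rng.integers(15000, 150000, 400),
    'Purchased': rng.integers(0, 2, 400),
})

# Extracting the independent (feature) and dependent (target) variables
x = dataset.iloc[:, [0, 1]].values
y = dataset.iloc[:, 2].values

# Splitting the dataset into a training set and a test set
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0)

# Feature scaling: fit on the training set only, then apply to both sets
sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)
```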
In the above code, we have loaded the dataset into our program using dataset =
pd.read_csv('user_data.csv'). The loaded dataset is divided into a training set and a test
set, and then we have scaled the feature variables.
2) Fitting Naive Bayes to the training set:
After the pre-processing step, we will now fit the Naive Bayes model to the training set.
Below is the code for it:
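A sketch of the fitting step. The same caveat applies: make_classification stands in for the scaled user_data.csv features so the snippet runs on its own:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.naive_bayes import GaussianNB

# Stand-in for the pre-processed user_data.csv features.
x, y = make_classification(n_samples=400, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0)
sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)

# Fitting Gaussian Naive Bayes to the training set
classifier = GaussianNB()
classifier.fit(x_train, y_train)
```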
Output:
3) Predicting the test result:
Now we will predict the test set results. For this, we will create a new prediction vector
y_pred and use the predict function to make the predictions.
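A sketch of the prediction step, continuing the same self-contained setup (synthetic data in place of user_data.csv):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.naive_bayes import GaussianNB

# Same stand-in setup as in the fitting step.
x, y = make_classification(n_samples=400, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0)
sc = StandardScaler()
x_train, x_test = sc.fit_transform(x_train), sc.transform(x_test)
classifier = GaussianNB().fit(x_train, y_train)

# Predicting the test set results
y_pred = classifier.predict(x_test)

# Show the first few predictions next to the real test labels
print(np.column_stack((y_pred, y_test))[:5])
```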
Output:
The above output shows the prediction vector y_pred and the real vector y_test. We
can see that some predictions differ from the real values; these are the incorrect
predictions.
4) Test accuracy of the result (creation of confusion matrix):
Now we will check the accuracy of the Naive Bayes classifier using the confusion matrix.
Below is the code for it:
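A sketch of the confusion-matrix step. Note that the 65/25/7/3 figures quoted below come from the tutorial's own dataset; the synthetic stand-in used here will give different numbers:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix, accuracy_score

# Same stand-in setup as in the earlier steps.
x, y = make_classification(n_samples=400, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0)
sc = StandardScaler()
x_train, x_test = sc.fit_transform(x_train), sc.transform(x_test)

classifier = GaussianNB().fit(x_train, y_train)
y_pred = classifier.predict(x_test)

# Making the confusion matrix: diagonal entries are correct predictions,
# off-diagonal entries are incorrect predictions.
cm = confusion_matrix(y_test, y_pred)
print(cm)
print(accuracy_score(y_test, y_pred))
```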
As we can see in the above confusion matrix output, there are 7 + 3 = 10 incorrect
predictions and 65 + 25 = 90 correct predictions.
5) Visualizing the result:
Next, we will visualize the training set result using the Naïve Bayes classifier. Below is
the code for it:
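A sketch of the visualization step using the classic contour-plot recipe: colour every point of a fine grid by the predicted class, then overlay the actual points. The colours, title, and output file name are illustrative choices, and synthetic data again stands in for user_data.csv:

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the script runs headlessly
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.naive_bayes import GaussianNB

# Same stand-in setup as in the earlier steps.
x, y = make_classification(n_samples=400, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0)
sc = StandardScaler()
x_train, x_test = sc.fit_transform(x_train), sc.transform(x_test)
classifier = GaussianNB().fit(x_train, y_train)

# Predict the class at every point of a fine grid over the feature space;
# the colour change traces the decision boundary.
x_set, y_set = x_train, y_train
x1, x2 = np.meshgrid(
    np.arange(x_set[:, 0].min() - 1, x_set[:, 0].max() + 1, 0.01),
    np.arange(x_set[:, 1].min() - 1, x_set[:, 1].max() + 1, 0.01))
grid_pred = classifier.predict(
    np.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape)
plt.contourf(x1, x2, grid_pred, alpha=0.75,
             cmap=ListedColormap(('purple', 'green')))

# Overlay the actual training points. For the test set plot, swap in
# x_test / y_test here instead.
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                color=('purple', 'green')[i], label=j)
plt.title('Naive Bayes (Training set)')
plt.legend()
plt.savefig('naive_bayes_training.png')
```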
Output:
In the above output, we can see that the Naïve Bayes classifier has segregated the data
points with a fine boundary. The boundary is a Gaussian curve because we have used the
GaussianNB classifier in our code.
Output:
The above output is the final output for the test set data. As we can see, the classifier
has created a Gaussian curve to divide the "purchased" and "not purchased" variables.
There are some wrong predictions, which we calculated in the confusion matrix, but it is
still a pretty good classifier.