Unit - 4 Machine Learning
BTCS 618‐18
Bayes' theorem is stated as: P(A|B) = P(B|A) * P(A) / P(B)
Where,
• P(A|B) is Posterior probability: Probability of hypothesis A on the observed event
B.
• P(B|A) is Likelihood probability: Probability of the evidence given that the
hypothesis is true.
• P(A) is Prior Probability: Probability of hypothesis before observing the evidence.
• P(B) is Marginal Probability: Probability of Evidence.
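As a quick sketch, Bayes' theorem can be written as a one‐line Python function (the function name is illustrative, not from the slides; the sample numbers anticipate the sunny‐day example worked out below):

```python
def posterior(likelihood, prior, evidence):
    # Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
    return likelihood * prior / evidence

# P(Sunny|Yes)=0.3, P(Yes)=0.71, P(Sunny)=0.35  ->  roughly 0.61
print(posterior(0.3, 0.71, 0.35))
```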
Naïve Bayes Classifier Algorithm
Working of Naïve Bayes' Classifier:
The working of the Naïve Bayes classifier can be understood with the help of the
example below:
• Suppose we have a dataset of weather conditions and a corresponding target
variable "Play". Using this dataset, we need to decide whether we should
play on a particular day according to the weather conditions. To solve this
problem, we need to follow the steps below:
• Convert the given dataset into frequency tables.
• Generate a likelihood table by finding the probabilities of the given features.
• Now, use Bayes' theorem to calculate the posterior probability.
• Problem: If the weather is sunny, should the player play or not?
• Solution: To solve this, first consider the below dataset:
Naïve Bayes Classifier Algorithm
     Outlook     Play
0    Rainy       Yes
1    Sunny       Yes
2    Overcast    Yes
3    Overcast    Yes
4    Sunny       No
5    Rainy       Yes
6    Sunny       Yes
7    Overcast    Yes
8    Rainy       No
9    Sunny       No
10   Sunny       Yes
11   Rainy       No
12   Overcast    Yes
13   Overcast    Yes
Naïve Bayes Classifier Algorithm
• Frequency table for the weather conditions:

Weather     Yes   No
Overcast      5    0
Rainy         2    2
Sunny         3    2
Total        10    4
P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny)
P(Sunny|Yes) = 3/10 = 0.3
P(Sunny) = 5/14 = 0.35
P(Yes) = 10/14 = 0.71
So P(Yes|Sunny) = 0.3 * 0.71 / 0.35 = 0.60
P(No|Sunny) = P(Sunny|No) * P(No) / P(Sunny)
P(Sunny|No) = 2/4 = 0.5
P(No) = 4/14 = 0.29
P(Sunny) = 5/14 = 0.35
So P(No|Sunny) = 0.5 * 0.29 / 0.35 = 0.41
As we can see from the above calculation, P(Yes|Sunny) > P(No|Sunny).
Hence, on a sunny day, the player can play the game.
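The same calculation can be reproduced in a few lines of Python. This is a minimal sketch: the two lists simply transcribe the 14‐row Outlook/Play table above, and the variable names are illustrative.

```python
outlook = ["Rainy", "Sunny", "Overcast", "Overcast", "Sunny", "Rainy", "Sunny",
           "Overcast", "Rainy", "Sunny", "Sunny", "Rainy", "Overcast", "Overcast"]
play = ["Yes", "Yes", "Yes", "Yes", "No", "Yes", "Yes",
        "Yes", "No", "No", "Yes", "No", "Yes", "Yes"]

n = len(play)
p_yes = play.count("Yes") / n         # P(Yes) = 10/14 ~ 0.71
p_no = play.count("No") / n           # P(No)  =  4/14 ~ 0.29
p_sunny = outlook.count("Sunny") / n  # P(Sunny) = 5/14 ~ 0.35

# Likelihoods read off the frequency table
p_sunny_yes = sum(o == "Sunny" and p == "Yes"
                  for o, p in zip(outlook, play)) / play.count("Yes")  # 3/10
p_sunny_no = sum(o == "Sunny" and p == "No"
                 for o, p in zip(outlook, play)) / play.count("No")    # 2/4

print("P(Yes|Sunny) =", p_sunny_yes * p_yes / p_sunny)  # = 0.60
print("P(No|Sunny)  =", p_sunny_no * p_no / p_sunny)    # = 0.40
```

Note that the exact value of P(No|Sunny) is 0.40; the 0.41 above comes from rounding P(No) and P(Sunny) to two decimals before dividing.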
Naïve Bayes Classifier Algorithm
Advantages of Naïve Bayes Classifier:
• Naïve Bayes is one of the fastest and simplest ML algorithms for predicting
the class of a dataset.
• It can be used for binary as well as multi‐class classification.
• It performs well on multi‐class predictions compared to many other
algorithms.
• It is a popular choice for text classification problems.
Disadvantages of Naïve Bayes Classifier:
• Naïve Bayes assumes that all features are independent or unrelated, so it
cannot learn relationships between features.
Naïve Bayes Classifier Algorithm
Applications of Naïve Bayes Classifier:
• It is used for Credit Scoring.
• It is used in medical data classification.
• It can be used in real‐time predictions because Naïve Bayes Classifier is
an eager learner.
• It is used in Text classification such as Spam filtering and Sentiment
analysis.
Naïve Bayes Classifier Algorithm
Types of Naïve Bayes Model:
• Gaussian: The Gaussian model assumes that features follow a normal
distribution. This means if predictors take continuous values instead of
discrete, then the model assumes that these values are sampled from the
Gaussian distribution.
• Multinomial: The Multinomial Naïve Bayes classifier is used when the data
is multinomially distributed. It is primarily used for document
classification problems, i.e., deciding which category a particular
document belongs to, such as sports, politics, education, etc. The
classifier uses the frequency of words as the predictors.
• Bernoulli: The Bernoulli classifier works similarly to the Multinomial
classifier, but the predictor variables are independent Boolean variables,
such as whether a particular word is present in a document or not. This
model is also popular for document classification tasks.
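As a sketch of how the three variants differ in practice, the snippet below uses scikit‐learn's GaussianNB, MultinomialNB, and BernoulliNB (assuming scikit‐learn is installed; the tiny random arrays are placeholder data, not from the slides):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

rng = np.random.default_rng(0)
y = np.array([0, 1, 0, 1])                  # two classes, four samples

X_cont = rng.normal(size=(4, 3))            # continuous features -> Gaussian
X_counts = rng.integers(0, 5, size=(4, 3))  # word counts -> Multinomial
X_bool = rng.integers(0, 2, size=(4, 3))    # word present/absent -> Bernoulli

print(GaussianNB().fit(X_cont, y).predict(X_cont))
print(MultinomialNB().fit(X_counts, y).predict(X_counts))
print(BernoulliNB().fit(X_bool, y).predict(X_bool))
```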
K‐Nearest Neighbors (KNN)
• K‐Nearest Neighbour is one of the simplest Machine Learning algorithms
based on Supervised Learning technique.
• The K‐NN algorithm assumes similarity between the new case/data and the
available cases, and puts the new case into the category most similar to
the available categories.
• The K‐NN algorithm stores all the available data and classifies a new
data point based on similarity. This means that when new data appears, it
can easily be classified into a well‐suited category using the K‐NN
algorithm.
• The K‐NN algorithm can be used for regression as well as classification,
but it is mostly used for classification problems.
• K‐NN is a non‐parametric algorithm, which means it does not make any
assumption on underlying data.
K‐Nearest Neighbors (KNN)
• It is also called a lazy learner algorithm because it does not learn from
the training set immediately; instead, it stores the dataset and, at the
time of classification, performs an action on the dataset.
• The KNN algorithm at the training phase just stores the dataset, and when
it gets new data, it classifies that data into the category most similar
to the new data.
• Example: Suppose we have an image of a creature that looks similar to both
a cat and a dog, and we want to know whether it is a cat or a dog. For
this identification, we can use the KNN algorithm, as it works on a
similarity measure. Our KNN model will find the features of the new image
that are most similar to the cat and dog images and, based on the most
similar features, will put it in either the cat or the dog category.
K‐Nearest Neighbors (KNN)
Why do we need a K‐NN Algorithm?
• Suppose there are two categories, i.e., Category A and Category B, and we
have a new data point x1. In which of these categories will this data
point lie? To solve this type of problem, we need a K‐NN algorithm. With
the help of K‐NN, we can easily identify the category or class of a
particular data point. Consider the below diagram:
K‐Nearest Neighbors (KNN)
How does K‐NN work?
The K‐NN working can be explained on the basis of the below algorithm:
• Step‐1: Select the number K of neighbors.
• Step‐2: Calculate the Euclidean distance from the new data point to the
points in the dataset.
• Step‐3: Take the K nearest neighbors as per the calculated Euclidean
distance.
• Step‐4: Among these K neighbors, count the number of data points in each
category.
• Step‐5: Assign the new data point to the category for which the number of
neighbors is maximum.
• Step‐6: Our model is ready.
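Steps 1‐5 can be written as a short from‐scratch sketch in Python (the function and variable names are illustrative, not from the slides):

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, new_point, k=5):
    # Step 2: Euclidean distance from the new point to every stored point
    dists = [math.dist(p, new_point) for p in train_points]
    # Step 3: indices of the K nearest neighbors
    nearest = sorted(range(len(dists)), key=lambda i: dists[i])[:k]
    # Steps 4-5: count categories among the neighbors, return the majority
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Example: two categories, one new data point to classify
points = [(1, 1), (2, 1), (1, 2), (6, 6), (7, 7), (6, 7)]
labels = ["A", "A", "A", "B", "B", "B"]
print(knn_classify(points, labels, (2, 2), k=3))  # -> 'A'
```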
K‐Nearest Neighbors (KNN)
Suppose we have a new data point and we need to put it in the required
category. Consider the below image:
As we can see, the 3 nearest neighbors are from Category A; hence this new
data point must belong to Category A.
K‐Nearest Neighbors (KNN)
How to select the value of K in the K‐NN Algorithm?
Below are some points to remember while selecting the value of K in the
K‐NN algorithm:
• There is no particular way to determine the best value of "K", so we need
to try some values to find the best among them. The most preferred value
for K is 5.
• A very low value of K, such as K=1 or K=2, can be noisy and expose the
model to the effects of outliers.
• Large values of K are generally good, but too large a value can smooth
over small local patterns and make prediction slower.
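A common way to "try some values" of K is cross‐validation. Below is a minimal sketch using scikit‐learn (the Iris dataset is just a stand‐in; assumes scikit‐learn is installed):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
for k in [1, 3, 5, 7, 9]:
    acc = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    print(f"K={k}: mean CV accuracy = {acc:.3f}")
```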
K‐Nearest Neighbors (KNN)
Advantages of KNN Algorithm:
• It is simple to implement.
• It is robust to noisy training data.
• It can be more effective if the training data is large.