
Faculty of Engineering & Technology

Machine Learning Laboratory (203105403)


B. Tech CSE 4th Year 7th Semester

Practical - 3
Aim: Write a program to implement the naïve Bayesian classifier for a sample training data set stored as a .CSV file. Compute the accuracy of the classifier, considering a few test data sets.
Theory:
What is Naïve Bayesian classifier?
- The Naive Bayes classifier is a simple yet powerful probabilistic machine learning
algorithm that is commonly used for classification tasks. It is based on Bayes' theorem and
assumes that the features are conditionally independent given the class label. This
assumption is known as the "naive" assumption, which simplifies the calculation of
probabilities.
- In machine learning, Naïve Bayes classification is a straightforward and powerful
algorithm for the classification task. Naïve Bayes classification is based on applying Bayes’
theorem with strong independence assumption between the features. Naïve Bayes
classification produces good results when we use it for textual data analysis such as Natural
Language Processing.
- Naïve Bayes models are also known as simple Bayes or independent Bayes. All these
names refer to the application of Bayes’ theorem in the classifier’s decision rule. Naïve
Bayes classifier applies the Bayes’ theorem in practice. This classifier brings the power of
Bayes’ theorem to machine learning.

Given a feature vector X = (x1, x2, ..., xn) and a class label y, Bayes' theorem states:

P(y | X) = (P(X | y) * P(y)) / P(X)

Naive Bayes algorithm calculations:

Under the naive conditional-independence assumption, the likelihood factorizes, so the posterior is proportional to:

P(y | X) ∝ P(y) * P(x1 | y) * P(x2 | y) * ... * P(xn | y)

- Naïve Bayes Classifier uses Bayes' theorem to predict membership probabilities for each class, i.e. the probability that a given record or data point belongs to a particular class. The class with the highest probability is taken as the most likely class. This decision rule is known as Maximum A Posteriori (MAP).
For a hypothesis A with evidence B:

MAP(A) = max P(A | B)
       = max (P(B | A) * P(A)) / P(B)
       = max P(B | A) * P(A)

Here, P(B) is the evidence probability. It only normalizes the result and is the same for every class, so removing it does not change which class is chosen. The Naïve Bayes classifier assumes that all the features are unrelated to each other: the presence or absence of one feature does not affect the others.
The Naive Bayes classifier works as follows:
1. Training: Given a labelled training dataset, the classifier calculates the prior probability P(y) for each class in the dataset. It also estimates the likelihood probability P(X|y) for each feature given each class, assuming conditional independence between the features.
2. Prediction: When a new unlabelled instance is presented, the classifier calculates the posterior probability P(y|X) for each class using Bayes' theorem. It then assigns the class label with the highest posterior probability as the predicted class for that instance.
3. Handling Continuous Features: For continuous features, the Naive Bayes classifier typically assumes a probability distribution, often Gaussian (hence called Gaussian Naive Bayes), to estimate the likelihood probability.
4. Laplace Smoothing: To avoid zero probabilities when a feature value in the testing data was not observed in the training data, Laplace smoothing (also known as additive smoothing) is often applied. It adds a small constant to the numerator and adjusts the denominator accordingly (see the sketch after this list).
5. Decision Rule: In some cases, the Naive Bayes classifier can be used for decision making by considering the posterior probabilities. For example, in binary classification, if P(y=1|X) > P(y=0|X), the instance is assigned to class 1; otherwise, it is assigned to class 0.
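To make the training, prediction and smoothing steps concrete, here is a minimal from-scratch sketch for categorical features. The toy weather-style data and the helper names (likelihood, predict) are illustrative and not part of this practical's dataset:

from collections import Counter, defaultdict

# Toy training data (illustrative): each row is (outlook, windy) -> play
X_train = [("sunny", "no"), ("sunny", "yes"), ("rainy", "no"),
           ("rainy", "yes"), ("overcast", "no")]
y_train = ["yes", "no", "yes", "no", "yes"]

classes = sorted(set(y_train))
n_features = len(X_train[0])

# Training: priors P(y) and per-feature value counts for each class
priors = {c: y_train.count(c) / len(y_train) for c in classes}
counts = defaultdict(Counter)              # counts[(feature_index, class)][value]
for xs, c in zip(X_train, y_train):
    for i, v in enumerate(xs):
        counts[(i, c)][v] += 1
values = [set(xs[i] for xs in X_train) for i in range(n_features)]

def likelihood(i, v, c, alpha=1.0):
    # P(x_i = v | y = c) with Laplace (add-alpha) smoothing
    total = sum(counts[(i, c)].values())
    return (counts[(i, c)][v] + alpha) / (total + alpha * len(values[i]))

# Prediction: the class with the largest unnormalized posterior wins
def predict(xs):
    scores = {c: priors[c] for c in classes}
    for c in classes:
        for i, v in enumerate(xs):
            scores[c] *= likelihood(i, v, c)
    return max(scores, key=scores.get)

print(predict(("sunny", "no")))            # prints 'yes' on this toy data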

Types of Naive Bayes algorithm

- There are 3 types of Naïve Bayes algorithm. The 3 types are listed below:

1.) Gaussian Naïve Bayes algorithm:

- When we have continuous attribute values, we assume that the values associated with each class are distributed according to a Gaussian (Normal) distribution. For example, suppose the training data contains a continuous attribute x. We first segment the data by class, and then compute the mean and variance of x in each class; the likelihood P(x | y) is then read off the corresponding Gaussian density, as in the sketch below.
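A minimal sketch of this per-class Gaussian likelihood; the mean and variance below are made-up illustrative numbers, not values computed from the real dataset:

import math

def gaussian_likelihood(x, mean, var):
    # Gaussian density used as P(x | y) in Gaussian Naive Bayes
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# E.g. if one class has petal lengths with mean 1.46 and variance 0.03:
print(gaussian_likelihood(1.5, mean=1.46, var=0.03))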


2.) Multinomial Naïve Bayes algorithm:

- With a Multinomial Naïve Bayes model, samples (feature vectors) represent the frequencies with which certain events have been generated by a multinomial distribution (p1, ..., pn), where pi is the probability that event i occurs. The Multinomial Naïve Bayes algorithm is preferred for data that is multinomially distributed, such as word counts, and is one of the standard algorithms used in text categorization; a small sketch follows.
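A small text-classification sketch with scikit-learn's MultinomialNB; the four-document corpus and its labels are made up for illustration:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

docs = ["free prize money now", "meeting agenda for monday",
        "win money free offer", "project meeting notes"]
labels = ["spam", "ham", "spam", "ham"]

vec = CountVectorizer()                    # word counts = multinomial event frequencies
X = vec.fit_transform(docs)

clf = MultinomialNB(alpha=1.0)             # alpha=1.0 is Laplace smoothing
clf.fit(X, labels)
print(clf.predict(vec.transform(["free money offer"])))   # likely ['spam']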
3.) Bernoulli Naïve Bayes algorithm:

- In the multivariate Bernoulli event model, features are independent boolean variables describing the inputs. Like the multinomial model, this model is popular for document classification tasks, but it uses binary term-occurrence features rather than term frequencies; see the sketch below.
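The same toy corpus as above, but with binary occurrence features and BernoulliNB (again, the corpus and labels are illustrative):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB

docs = ["free prize money now", "meeting agenda for monday",
        "win money free offer", "project meeting notes"]
labels = ["spam", "ham", "spam", "ham"]

vec = CountVectorizer(binary=True)         # 1 if the term occurs, 0 otherwise
X = vec.fit_transform(docs)

clf = BernoulliNB()
clf.fit(X, labels)
print(clf.predict(vec.transform(["monday project meeting"])))   # likely ['ham']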

❖ Dataset taken: IRIS Dataset.

- This dataset consists of the petal and sepal measurements of 3 different types of irises (Setosa, Versicolour, and Virginica), stored in a 150x4 numpy.ndarray.
- The rows are the samples and the columns are: Sepal Length, Sepal Width, Petal Length and Petal Width.
- No. of Rows: 150
- No. of Columns: 4
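If iris.csv is not already present at the path used in Step-2 below, one way to create it is from scikit-learn's bundled copy of the dataset (a convenience sketch; the procedure itself assumes the file already exists):

import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
df = iris.frame.rename(columns={"target": "species"})
df["species"] = df["species"].map(dict(enumerate(iris.target_names)))
df.to_csv("iris.csv", index=False)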

❖ Procedure:
#Step-1: Import python libraries.

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
import matplotlib.pyplot as plt
import seaborn as sns

#Step-2: Import IRIS Dataset

df = pd.read_csv("/content/iris.csv")
print(df.head())

#Step-3: Prepare the data for modelling

x = df.drop("species", axis=1)   # features
y = df["species"]                # target variable

#Step-4: Splitting the data into training and testing sets

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=42)

#Step-5: Training the Gaussian Naïve Bayes model

model = GaussianNB()
model.fit(x_train, y_train)

#Step-6: Make predictions on the test set

y_pred = model.predict(x_test)

#Step-7: Evaluate the model’s performance.

print("Accuracy:",accuracy_score(y_test,y_pred)) print("Confusion
Matrix") print(confusion_matrix(y_test,y_pred))
print("Classification Report")
print(classification_report(y_test,y_pred))

#Step-8: Plot Confusion Matrix

cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, cmap="Blues")
plt.show()

Output:

