
Machine Learning

The Naive Bayes Approach to Classification

Dr. rer. nat. Akmal Junaidi, S.Si., M.Sc.

Department of Computer Science
FMIPA – Universitas Lampung
Introduction

Naive Bayes is a classification algorithm based on Bayes' theorem.
Naive Bayes assumes that the features are:
▪ independent
▪ equally important
This is a relatively simple idea. However, Naive Bayes can often outperform other, more sophisticated algorithms.
Pros and Cons of Naive Bayes

Pros
▪ It's relatively simple to understand and build
▪ It's easily trained, even with a small dataset
▪ It's fast!
▪ It's not sensitive to irrelevant features

Cons
▪ It assumes that every feature is independent, which is not always the case in reality.
Example of Observation

Consider a fictional dataset that describes the weather conditions for playing a game of golf. Given the weather conditions, each tuple classifies the conditions as fit ("Yes") or unfit ("No") for playing golf.

The dataset is divided into two parts, namely the feature matrix and the response vector.
▪ The feature matrix contains all the vectors (rows) of the dataset, where each vector consists of the values of the dependent features. In this dataset, the features are 'Outlook', 'Temperature', 'Humidity' and 'Windy'.
▪ The response vector contains the value of the class variable (prediction or output) for each row of the feature matrix. In this dataset, the class variable is named 'Play golf'. A sketch of this split appears after the dataset table below.
A Fictional Dataset

     Outlook    Temperature   Humidity   Windy   Play golf
 1   Rainy      Hot           High       False   No
 2   Rainy      Hot           High       True    No
 3   Overcast   Hot           High       False   Yes
 4   Sunny      Mild          High       False   Yes
 5   Sunny      Cool          Normal     False   Yes
 6   Sunny      Cool          Normal     True    No
 7   Overcast   Cool          Normal     True    Yes
 8   Rainy      Mild          High       False   No
 9   Rainy      Cool          Normal     False   Yes
10   Sunny      Mild          Normal     False   Yes
11   Rainy      Mild          Normal     True    Yes
12   Overcast   Mild          High       True    Yes
13   Overcast   Hot           Normal     False   Yes
14   Sunny      Mild          High       True    No
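As a minimal sketch, the feature-matrix/response-vector split of this dataset could be represented in Python as follows (the list-of-tuples layout is illustrative, not prescribed by the slides):

```python
# Feature matrix: one row per observation,
# columns = Outlook, Temperature, Humidity, Windy
X = [
    ("Rainy",    "Hot",  "High",   False),
    ("Rainy",    "Hot",  "High",   True),
    ("Overcast", "Hot",  "High",   False),
    ("Sunny",    "Mild", "High",   False),
    ("Sunny",    "Cool", "Normal", False),
    ("Sunny",    "Cool", "Normal", True),
    ("Overcast", "Cool", "Normal", True),
    ("Rainy",    "Mild", "High",   False),
    ("Rainy",    "Cool", "Normal", False),
    ("Sunny",    "Mild", "Normal", False),
    ("Rainy",    "Mild", "Normal", True),
    ("Overcast", "Mild", "High",   True),
    ("Overcast", "Hot",  "Normal", False),
    ("Sunny",    "Mild", "High",   True),
]
# Response vector: the 'Play golf' class label for each row of X
y = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
     "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]
```

Each row of X pairs with the label at the same index in y.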
Interpretation of Assumptions

Based on this dataset, the two assumptions can be understood as follows:
▪ We assume that no pair of features is dependent. For example, the temperature being 'Hot' has nothing to do with the humidity, and the outlook being 'Rainy' has no effect on the wind. Hence, the features are assumed to be independent.
▪ Secondly, each feature is given the same weight (or importance). For example, knowing the temperature and humidity alone cannot predict the outcome accurately. None of the attributes is irrelevant, and each is assumed to contribute equally to the outcome.
Note

The assumptions required by Naive Bayes are generally not correct in reality. In fact, the independence assumption rarely holds exactly, yet the method often works well in practice.

In a nutshell, the algorithm allows us to predict a class, given a set of features, using probability.
Theorem of Bayes

Bayes' theorem finds the probability of an event occurring given the probability of another event that has already occurred. Bayes' theorem is stated mathematically as the following equation:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

where A and B are events and P(B) ≠ 0.
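For instance, taking A = "play golf = Yes" and B = "outlook = Overcast" in the dataset above (4 of the 14 days are overcast, and all 4 of them are "Yes" days):

$$P(\text{Yes} \mid \text{Overcast}) = \frac{P(\text{Overcast} \mid \text{Yes})\,P(\text{Yes})}{P(\text{Overcast})} = \frac{(4/9)\,(9/14)}{4/14} = 1$$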
Note

▪ Basically, we are trying to find the probability of event A, given that event B is true. Event B is also termed the evidence.
▪ P(A) is the prior probability of A, i.e. the probability of the event before the evidence is seen. The evidence is an attribute value of an unknown instance (here, event B).
▪ P(A|B) is the posterior probability of A, i.e. the probability of the event after the evidence is seen.
Applying the Theorem of Bayes

$$P(y \mid X) = \frac{P(X \mid y)\,P(y)}{P(X)}$$

where y is the class variable and X is a dependent feature vector (of size n):

$$X = (x_1, x_2, x_3, \ldots, x_n)$$
Note

Just to be clear, an example of a feature vector and its corresponding class variable is (refer to the 1st row of the dataset):
X = (Rainy, Hot, High, False)
y = No
So basically, P(y|X) here means the probability of "not playing golf" given that the weather conditions are "rainy outlook", "hot temperature", "high humidity" and "no wind".
Naive Assumption

Now it is time to apply the naive assumption to Bayes' theorem: independence among the features. We therefore split the evidence into independent parts.
If any two events A and B are independent, then
P(A, B) = P(A) P(B)
Hence, we arrive at the result:

$$P(y \mid x_1, x_2, x_3, \ldots, x_n) = \;?$$
Naive Assumption

This can be expressed as:

$$P(y \mid x_1, x_2, x_3, \ldots, x_n) = \frac{P(y)\,\prod_{i=1}^{n} P(x_i \mid y)}{P(x_1)\,P(x_2)\cdots P(x_n)}$$

Because the denominator remains constant for a given input, it can be removed, leaving:

$$P(y \mid x_1, x_2, x_3, \ldots, x_n) \propto P(y)\,\prod_{i=1}^{n} P(x_i \mid y)$$
Classifier Model

Based on this expression, how can the classifier model be built? The model classifies a given input by selecting the class variable y with the maximum probability, which can be written as:

$$y = \arg\max_{y} \; P(y)\,\prod_{i=1}^{n} P(x_i \mid y)$$

The only calculation this formula requires is computing the probabilities P(y) and P(xi | y). P(y) is called the class probability and P(xi | y) the conditional probability.
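A minimal sketch of this decision rule in Python, assuming the class probabilities (`priors`) and conditional probabilities (`cond`) have already been estimated from the data (both dictionary layouts are illustrative, not from the slides):

```python
def classify(x, priors, cond):
    """Pick the class y maximizing P(y) * prod_i P(x_i | y).

    x      -- dict mapping feature name -> observed value
    priors -- dict mapping class label -> P(y)
    cond   -- nested dict: cond[feature][value][class] = P(x_i | y)
    """
    best_class, best_score = None, -1.0
    for label, p_y in priors.items():
        score = p_y
        for feature, value in x.items():
            # Multiply in P(x_i | y); unseen (value, class) pairs default to 0
            score *= cond[feature].get(value, {}).get(label, 0.0)
        if score > best_score:
            best_class, best_score = label, score
    return best_class
```

The scores here are unnormalized, which is enough for the argmax; dividing each score by their sum recovers the actual posterior probabilities.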
Weather Dataset

What should be done?
• Perform some precomputation on the weather dataset.
• The goal of this computation is to find the probability P(xi | yj) for each feature value xi in X and each class yj in y.
• The calculation is done for the probabilities of outlook, temperature, humidity and windy, as well as for the class probability, as sketched below.
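A sketch of this precomputation, assuming the dataset is given as the feature matrix `X` and response vector `y` shown earlier (the function and variable names are illustrative):

```python
from collections import Counter, defaultdict

FEATURES = ("Outlook", "Temperature", "Humidity", "Windy")

def train(X, y):
    """Estimate P(y) and P(x_i | y) by counting frequencies in the dataset."""
    n = len(y)
    class_counts = Counter(y)                             # e.g. Yes: 9, No: 5
    priors = {c: k / n for c, k in class_counts.items()}  # P(y) = count / n
    cond = {f: defaultdict(dict) for f in FEATURES}
    for i, feature in enumerate(FEATURES):
        # Count how often each (value, class) pair occurs for this feature
        pairs = Counter((row[i], label) for row, label in zip(X, y))
        for (value, label), k in pairs.items():
            cond[feature][value][label] = k / class_counts[label]
    return priors, cond
```

On the weather data this reproduces the tables on the following slides, e.g. priors['Yes'] == 9/14 and cond['Outlook']['Overcast'] == {'Yes': 4/9}. A value never seen with a class simply gets no entry (probability 0); real implementations usually add Laplace smoothing to avoid such zeros.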
Outlook

            Yes   No   P(x | Yes)   P(x | No)
Sunny        3     2      3/9          2/5
Overcast     4     0      4/9          0
Rainy        2     3      2/9          3/5
Total        9     5      100%         100%

P(outlook = Overcast | play golf = Yes) = 4/9.
Temperature

            Yes   No   P(x | Yes)   P(x | No)
Hot          2     2      2/9          2/5
Mild         4     2      4/9          2/5
Cool         3     1      3/9          1/5
Total        9     5      100%         100%

P(temperature = Cool | play golf = Yes) = 3/9.
Humidity

            Yes   No   P(x | Yes)   P(x | No)
High         3     4      3/9          4/5
Normal       6     1      6/9          1/5
Total        9     5      100%         100%

P(humidity = High | play golf = No) = 4/5.
Windy

            Yes   No   P(x | Yes)   P(x | No)
True         3     3      3/9          3/5
False        6     2      6/9          2/5
Total        9     5      100%         100%

P(windy = True | play golf = Yes) = 3/9.
Class Probability

Play    Count    P(y)
Yes       9      9/14
No        5      5/14
Total    14      100%
Example

Suppose today's conditions are today = (Sunny, Hot, Normal, False), and let
y : playing golf
x1 : outlook
x2 : temperature
x3 : humidity
x4 : wind

What is the probability of playing golf today? Is the decision to play or not to play?
Example

$$P(y \mid X{=}\text{today}) = \frac{P(X{=}\text{today} \mid y)\,P(y)}{P(X{=}\text{today})}$$

$$P(y \mid X) = \frac{P(x_1{=}\text{Sunny} \mid y)\,P(x_2{=}\text{Hot} \mid y)\,P(x_3{=}\text{Normal} \mid y)\,P(x_4{=}\text{False} \mid y)\,P(y)}{P(X{=}\text{today})}$$

Since the question is whether the decision is to play or not to play, the hypothesis is y = yes or y = no, given that the condition is today (the evidence). Therefore, the computation must be done for both P(y = yes | X) and P(y = no | X).
Example

$$P(y{=}\text{yes} \mid X) = \frac{P(x_1{=}\text{Sunny} \mid y{=}\text{yes})\,P(x_2{=}\text{Hot} \mid y{=}\text{yes})\,P(x_3{=}\text{Normal} \mid y{=}\text{yes})\,P(x_4{=}\text{False} \mid y{=}\text{yes})\,P(y{=}\text{yes})}{P(X{=}\text{today})}$$

$$P(y{=}\text{no} \mid X) = \frac{P(x_1{=}\text{Sunny} \mid y{=}\text{no})\,P(x_2{=}\text{Hot} \mid y{=}\text{no})\,P(x_3{=}\text{Normal} \mid y{=}\text{no})\,P(x_4{=}\text{False} \mid y{=}\text{no})\,P(y{=}\text{no})}{P(X{=}\text{today})}$$

Note that the denominator P(X = today) is common to both probabilities, so the comparison can rely on the computation of the numerators only.
Example

$$P(y{=}\text{yes} \mid X) \propto P(x_1{=}\text{Sunny} \mid y{=}\text{yes})\,P(x_2{=}\text{Hot} \mid y{=}\text{yes})\,P(x_3{=}\text{Normal} \mid y{=}\text{yes})\,P(x_4{=}\text{False} \mid y{=}\text{yes})\,P(y{=}\text{yes})$$

P(y = yes | X = today) = ?

$$P(y{=}\text{no} \mid X) \propto P(x_1{=}\text{Sunny} \mid y{=}\text{no})\,P(x_2{=}\text{Hot} \mid y{=}\text{no})\,P(x_3{=}\text{Normal} \mid y{=}\text{no})\,P(x_4{=}\text{False} \mid y{=}\text{no})\,P(y{=}\text{no})$$

P(y = no | X = today) = ?

To play or not to play ….
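Filling in the values from the probability tables above:

$$P(y{=}\text{yes} \mid X) \propto \frac{3}{9}\cdot\frac{2}{9}\cdot\frac{6}{9}\cdot\frac{6}{9}\cdot\frac{9}{14} \approx 0.0212$$

$$P(y{=}\text{no} \mid X) \propto \frac{2}{5}\cdot\frac{2}{5}\cdot\frac{1}{5}\cdot\frac{2}{5}\cdot\frac{5}{14} \approx 0.0046$$

Since 0.0212 > 0.0046, the prediction is y = yes: play golf today. Normalizing the two scores gives P(y = yes | today) ≈ 0.0212 / (0.0212 + 0.0046) ≈ 0.82.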
Questions?

Thank you for your attention
