K-Nearest Neighbors Clearly Explained
What is K-Nearest Neighbors?
K-Nearest Neighbors (KNN) is a Supervised Learning method. It's
quite similar to the K-Means Clustering we saw earlier, which is an
Unsupervised Learning method. We use KNN when we already have a
labelled set of clusters and we're trying to predict the label for a given
set of unlabelled data points. It can be used for both classification
and regression.
That new fruit would have 3 Apples and 1 Orange as its nearest
neighbors. We would then classify it based on the majority class,
which is Apple.
[Scatter plot: fruits grouped into Apples, Pineapples, and Oranges]
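The majority vote in this example can be sketched in a few lines of Python. The neighbor labels below are just the hypothetical 3 Apples and 1 Orange from the fruit example above:

```python
from collections import Counter

# Labels of the new fruit's nearest neighbors
# (3 Apples, 1 Orange, matching the example above)
neighbor_labels = ["Apple", "Apple", "Orange", "Apple"]

# Majority vote: the most frequent label among the neighbors wins
prediction = Counter(neighbor_labels).most_common(1)[0][0]
print(prediction)  # Apple
```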
linkedin.com/in/vikrantkumar95
How does KNN work?
We saw earlier what KNN is. So how does it work? How does it find the
nearest neighbors? How do we decide on the value of K? Let’s take a look.
We’ll quantify our initial fruits example. Given below is a table with the
fruits’ weight and color intensity score along with their label.
[Scatter plot: fruits F1–F7 plotted by Weight vs. Color Score]
How does KNN work?
We have our data set. Now suppose we get a new fruit that we need
to classify:
Fruit  Weight (kg)  Color Score  Type
F8     0.27         0.75         ?
[Scatter plot: the new fruit F8 plotted alongside the labelled fruits F1–F7]
Now, in order to classify our new fruit, F8, using KNN, we have to
execute the following steps:
Choosing K & Calculating Distance
Step 1: Choose the number of neighbors (k)
[Scatter plot: F8 among the labelled fruits F1–F7]
Calculating Distances
Step 2: Calculate the distance of our new fruit, F8, from all the
other fruits in the labelled dataset.
Fruit  Weight (kg)  Color Score  Type
F8     0.27         0.75         ?

[Table: each labelled fruit's Weight (kg), Color Score, Type, and Distance from F8]
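This step can be sketched as below. The slide's exact coordinates for F1–F7 aren't shown, so the values here are illustrative stand-ins, and Euclidean distance is assumed as the metric:

```python
import math

# Illustrative (weight, color score) pairs for the labelled fruits;
# the original slide's exact numbers are assumptions
fruits = {
    "F1": (0.20, 0.30), "F2": (0.40, 0.90), "F3": (0.15, 0.25),
    "F4": (0.30, 0.70), "F5": (0.25, 0.85), "F6": (0.28, 0.65),
    "F7": (0.18, 0.28),
}
f8 = (0.27, 0.75)  # the new fruit from the table above

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Distance of F8 from every labelled fruit, smallest first
distances = {name: euclidean(f8, xy) for name, xy in fruits.items()}
for name, d in sorted(distances.items(), key=lambda kv: kv[1]):
    print(f"{name}: {d:.3f}")
```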
Identify Nearest Neighbors
Step 3: Identify the k nearest neighbors.
[Table: labelled fruits sorted by Distance from F8]
Looking at the distances, the nearest 3 neighbors are: F5, F4, and F6.
Now let’s see next how we classify our unknown fruit based on the
nearest neighbors we’ve identified.
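Picking the k nearest is just a sort over the distances. The distance values below are illustrative stand-ins, since the slide's exact numbers aren't shown:

```python
# Illustrative distances from F8 to each labelled fruit
distances = {"F1": 0.46, "F2": 0.20, "F3": 0.51,
             "F4": 0.06, "F5": 0.10, "F6": 0.10, "F7": 0.48}

k = 3
# Sort fruit names by their distance to F8 and keep the k closest
nearest = sorted(distances, key=distances.get)[:k]
print(nearest)  # ['F4', 'F5', 'F6']
```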
Make a Prediction
Step 4: Assign F8 the majority class among its k nearest neighbors.
Since the majority of them are Apples, F8 is classified as an Apple.

[Scatter plot: F8 shown with its three nearest neighbors F4, F5, and F6 highlighted]
Now that we’ve seen how KNN works, there’s still a major question
that you might have: How do we decide on the value of K? Let’s take
a look in the next section!
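Putting all four steps together, a minimal end-to-end KNN classifier might look like this. The coordinates and labels are illustrative, and Euclidean distance (via `math.dist`) is assumed:

```python
import math
from collections import Counter

def knn_predict(query, data, k=3):
    """Classify `query` by majority vote among its k nearest points.
    `data` is a list of (features, label) pairs."""
    by_distance = sorted(data, key=lambda p: math.dist(query, p[0]))
    votes = [label for _, label in by_distance[:k]]
    return Counter(votes).most_common(1)[0][0]

# Illustrative labelled fruits: ((weight, color score), type)
data = [
    ((0.20, 0.30), "Orange"), ((0.40, 0.90), "Apple"),
    ((0.15, 0.25), "Orange"), ((0.30, 0.70), "Apple"),
    ((0.25, 0.85), "Apple"),  ((0.28, 0.65), "Apple"),
    ((0.18, 0.28), "Orange"),
]
print(knn_predict((0.27, 0.75), data, k=3))  # Apple
```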
How to Choose the value of K?
First, let's recap: what is K? K is simply the number of nearest
neighbors we look at when making a prediction.
What are the Pros and Cons of having a very high or a very low K?
K: High (e.g., K = 40)
Pros: More stable predictions and less sensitive to noise.
Cons: May oversmooth, ignoring small but important patterns.
Cross-Validation:
Use cross-validation to test different values of K and choose the
one with the best performance.
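A minimal sketch of that idea, using leave-one-out cross-validation in plain Python (the dataset and the candidate K values are illustrative):

```python
import math
from collections import Counter

def knn_predict(query, data, k):
    """Majority vote among the k nearest (features, label) pairs."""
    by_distance = sorted(data, key=lambda p: math.dist(query, p[0]))
    return Counter(lbl for _, lbl in by_distance[:k]).most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out CV: predict each point from all the others."""
    hits = sum(knn_predict(x, data[:i] + data[i + 1:], k) == y
               for i, (x, y) in enumerate(data))
    return hits / len(data)

# Illustrative labelled fruits; odd K values help avoid tied votes
data = [((0.20, 0.30), "Orange"), ((0.40, 0.90), "Apple"),
        ((0.15, 0.25), "Orange"), ((0.30, 0.70), "Apple"),
        ((0.25, 0.85), "Apple"),  ((0.28, 0.65), "Apple"),
        ((0.18, 0.28), "Orange")]

# Keep the K with the best cross-validated accuracy
best_k = max([1, 3, 5], key=lambda k: loo_accuracy(data, k))
print(best_k)
```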
Rule of thumb:
Enjoyed reading? Follow for everything Data and AI!
linkedin.com/in/vikrantkumar95