
Confusion Matrix

Prof. Asim Tewari
IIT Bombay
ME 781: Statistical Machine Learning and Data Mining
What is a Confusion Matrix?
• A table that describes the performance of a classification model (or "classifier") on a set of test data for which the true values are known.
• It allows easy visualization of confusion between classes, e.g. when one class is commonly mislabeled as another.
• It gives insight not only into the errors being made by a classifier, but, more importantly, into the types of errors being made.

Confusion Matrix

• Here, Class 1 is the positive class and Class 2 is the negative class.
• Definition of the terms:
• Positive (P): the observation is positive (for example: is an apple).
• Negative (N): the observation is not positive (for example: is not an apple).
• True Positive (TP): the observation is positive, and is predicted to be positive.
• False Negative (FN): the observation is positive, but is predicted to be negative.
• True Negative (TN): the observation is negative, and is predicted to be negative.
• False Positive (FP): the observation is negative, but is predicted to be positive.

Confusion Matrix

• Arranged as a table, the four outcomes form the 2×2 confusion matrix:

                      Predicted Positive      Predicted Negative
  Actual Positive     True Positive (TP)      False Negative (FN)
  Actual Negative     False Positive (FP)     True Negative (TN)
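
As a concrete illustration, the four counts can be tallied directly from paired lists of true and predicted labels. The following is a minimal sketch in plain Python; the example labels are invented for demonstration:

```python
# Hypothetical example data: 1 = positive (an apple), 0 = negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # ground truth
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]  # classifier output

# Tally each of the four outcomes defined above.
TP = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
FN = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
TN = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
FP = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

print(f"TP={TP} FN={FN}")  # TP=3 FN=1
print(f"FP={FP} TN={TN}")  # FP=2 TN=4
```

The same four counts feed every metric on the slides that follow.
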
Classification Rate/Accuracy

• Classification Rate or Accuracy is given by the relation:

  Accuracy = (TP + TN) / (TP + TN + FP + FN)

• However, there are problems with accuracy. It assumes equal costs for both kinds of errors. A 99% accuracy can be excellent, good, mediocre, poor, or terrible depending upon the problem: if only 1% of observations are positive, a model that always predicts negative is 99% accurate yet never finds a single positive.
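
Continuing the sketch above (the counts come from the invented example, not the slides), accuracy and the imbalance pitfall look like this:

```python
# Accuracy from the four counts of the previous example.
accuracy = (TP + TN) / (TP + TN + FP + FN)
print(accuracy)  # (3 + 4) / 10 = 0.7

# The pitfall: with 1% positives, a model that always predicts
# "negative" still scores 99% accuracy while finding nothing.
y_true = [1] * 1 + [0] * 99
y_pred = [0] * 100
acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(acc)  # 0.99
```
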
Recall

• Recall is defined as the ratio of the number of correctly classified positive examples to the total number of positive examples. High recall indicates that the class is correctly recognized (a small number of FN).
• Recall is given by the relation:

  Recall = TP / (TP + FN)
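
In the running sketch (TP = 3, FN = 1 from the invented data above), recall works out as:

```python
# Recall: what fraction of the actual positives did we recover?
recall = TP / (TP + FN)
print(recall)  # 3 / (3 + 1) = 0.75
```
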
Precision

• To get the value of precision, we divide the number of correctly classified positive examples by the total number of predicted positive examples. High precision indicates that an example labeled as positive is indeed positive (a small number of FP).
• Precision is given by the relation:

  Precision = TP / (TP + FP)

• High recall, low precision: most of the positive examples are correctly recognized (low FN), but there are a lot of false positives.
• Low recall, high precision: we miss a lot of positive examples (high FN), but those we predict as positive are indeed positive (low FP).

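And precision in the same sketch (TP = 3, FP = 2 from the invented data):

```python
# Precision: of everything we flagged as positive, how much was right?
precision = TP / (TP + FP)
print(precision)  # 3 / (3 + 2) = 0.6
```
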
F-measure

• Since we have two measures (Precision and Recall), it helps to have a single measurement that represents both of them. We calculate an F-measure, which uses the Harmonic Mean in place of the Arithmetic Mean, as it punishes extreme values more:

  F-measure = (2 × Recall × Precision) / (Recall + Precision)

• The F-measure will always be nearer to the smaller value of Precision or Recall.

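A short sketch of why the harmonic mean "punishes" extreme values (the numbers are illustrative):

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    return 2 * precision * recall / (precision + recall)

# A lopsided pair: the harmonic mean stays near the weaker score,
# while the arithmetic mean would hide it.
p, r = 0.9, 0.1
print(f_measure(p, r))  # 0.18 -- pulled toward the smaller value
print((p + r) / 2)      # 0.5  -- arithmetic mean, for comparison

# In the running example: precision 0.6, recall 0.75.
print(f_measure(0.6, 0.75))  # 0.666..., nearer the smaller value
```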
