
MENTORNESS ARTICLE

TASK 1

MIP-ML-08 BATCH

Evaluation Metrics in Machine Learning: Exploring Performance Assessment

BY NIGAR SULTANA

What Are Evaluation Metrics in Machine Learning?

Evaluation metrics in Machine Learning (ML) are quantitative measures of how well ML models work. They capture, in numbers, how effectively a model handles data it has not seen during training, and they help decide which model is best for a task. Common examples include accuracy, precision, and recall. The goal is to verify that a model has genuinely learned from its training data and can make accurate predictions on new inputs. Evaluation metrics also support choosing the best model from many candidates and reveal where a model needs to improve, so ML practitioners can adjust it until it performs reliably in real-world situations.

Why Do We Need Evaluation Metrics in ML?

Evaluation metrics are crucial in ML for several reasons:

• They tell us whether a model's predictions or classifications are effective and accurate.
• They guide model selection by letting us compare the performance of several candidates.
• They inform hyperparameter tuning, showing whether a change in settings actually improves performance.
• They provide a foundation for continuously improving and fine-tuning ML algorithms.
• They allow objective, score-based comparison of different models.
• They support informed decisions about which model to deploy in a given situation.
• They highlight where a model needs improvement or adjustment.
• They verify that a model meets the desired performance standards and delivers reliable results.

In short, evaluation metrics are key tools for assessing, improving, and optimizing ML models across applications.

Types of Evaluation Metrics

There are various types of evaluation metrics used in ML, including:

1. Regression Model Evaluation Metrics: These assess how well regression models predict numerical outcomes. They include:

• Mean Absolute Error (MAE): The average absolute difference between predicted and actual values, giving a direct sense of how far off predictions are on average.
• Root Mean Squared Error (RMSE): Similar to MAE but penalizes larger errors more heavily, because errors are squared before averaging.
• R-squared (R2) Score: The proportion of the variance in the data explained by the model, indicating how well the model fits the data.

2. Classification Model Evaluation Metrics: These evaluate how accurately classification models assign data to categories. They include:

• Accuracy: The overall proportion of correct predictions.
• Precision: How many of the model's positive predictions were actually correct, focusing on the reliability of positive predictions.
• Recall (Sensitivity): How many of the actual positive instances the model correctly identified, emphasizing its ability to capture all positive cases.
• F1 Score: The harmonic mean of precision and recall, offering a single balanced measure that is especially useful under class imbalance.
• Confusion Matrix: A tabular summary of a classification model's performance, showing the counts of true positive, true negative, false positive, and false negative predictions.

Explaining Each Evaluation Metric in Detail

1. Mean Absolute Error (MAE):

• MAE measures the average magnitude of errors between predicted and actual values in regression models.
• It indicates how accurate the model's predictions are on average.
• It is calculated by averaging the absolute differences between predicted and actual values.

Formula for MAE:

MAE = \frac{1}{N} \sum_{j=1}^{N} \left| y_j - \hat{y}_j \right|

Where:

• y_j: ground-truth value
• \hat{y}_j: predicted value from the regression model
• N: number of data points

[Example Graph: Mean Absolute Error]
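To make the calculation concrete, here is a minimal sketch (not part of the original article) that computes MAE by hand with NumPy and cross-checks the result against scikit-learn's mean_absolute_error; the y_true and y_pred arrays are made-up example values.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Made-up ground-truth and predicted values for illustration.
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# MAE by hand: average the absolute differences.
mae_manual = np.mean(np.abs(y_true - y_pred))

# The same metric computed by scikit-learn.
mae_sklearn = mean_absolute_error(y_true, y_pred)

print(mae_manual, mae_sklearn)  # both 0.5
```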

2. Root Mean Squared Error (RMSE):

• RMSE is similar to MAE but gives more weight to larger errors, making it sensitive to outliers.
• By penalizing significant errors, it gives a stricter assessment of model performance than MAE.
• It is calculated by taking the square root of the average of the squared differences between predicted and actual values.

Formula for RMSE:

RMSE = \sqrt{ \frac{1}{N} \sum_{j=1}^{N} \left( y_j - \hat{y}_j \right)^2 }

[Example Graph: Root Mean Squared Error]
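The same illustrative arrays can be reused for RMSE; squaring the errors before averaging is exactly what makes the one-unit error count for more here than it did in MAE. Again, this is a sketch with made-up values, using only standard NumPy and scikit-learn calls.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Same made-up values as in the MAE example.
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# RMSE by hand: square the errors, average them, take the square root.
rmse_manual = np.sqrt(np.mean((y_true - y_pred) ** 2))

# Via scikit-learn: square root of the mean squared error.
rmse_sklearn = np.sqrt(mean_squared_error(y_true, y_pred))

print(rmse_manual, rmse_sklearn)  # ~0.612, larger than the MAE of 0.5
```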

3. R-squared (R2) Score:

• The R2 score quantifies the proportion of variance in the dependent variable that is explained by the independent variables in a regression model.
• It indicates the goodness of fit, i.e., how well the model fits the data.
• An R2 of 1 indicates a perfect fit, while an R2 of 0 means the model explains none of the variance (it does no better than always predicting the mean); the score can even be negative for models that fit worse than that baseline.

Formula for R2 Score:

R^2 = 1 - \frac{ \sum_{j=1}^{N} \left( y_j - \hat{y}_j \right)^2 }{ \sum_{j=1}^{N} \left( y_j - \bar{y} \right)^2 }

Where:

• \bar{y} is the mean of the actual values.

[Example Graph: R-Squared Score]
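As a sketch (with the same made-up arrays as above), R2 can be computed directly from this definition and checked against scikit-learn's r2_score.

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# R^2 by hand: one minus the ratio of residual to total sum of squares.
ss_res = np.sum((y_true - y_pred) ** 2)          # variance left unexplained
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variance around the mean
r2_manual = 1 - ss_res / ss_tot

print(r2_manual, r2_score(y_true, y_pred))  # ~0.949
```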

4. Accuracy:

• Accuracy is a simple, intuitive metric that measures the percentage of correct predictions made by a model.
• It is suitable for balanced datasets, where the positive and negative classes are similar in number.
• On imbalanced datasets, however, accuracy can be misleading: a model can score well simply by predicting the majority class while neglecting the minority class.
• This can lead to an inaccurate assessment of the model's performance, especially when the minority class is the one that matters most.

Formula for Accuracy:

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}

Where:

• TP = True Positive
• TN = True Negative
• FP = False Positive
• FN = False Negative

[Example Graph: Accuracy]
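For the classification metrics, the sketches below all use the same small set of made-up binary labels (1 = positive, 0 = negative); none of this code comes from the original article. Accuracy first:

```python
from sklearn.metrics import accuracy_score

# Made-up labels for illustration (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Accuracy = (TP + TN) / total: 6 of the 8 predictions are correct.
print(accuracy_score(y_true, y_pred))  # 0.75
```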

5. Precision:

• Precision measures the proportion of correctly predicted positive instances out of all instances predicted as positive.
• It is particularly valuable when the cost of a false positive is high.
• For instance, in medical diagnosis, high precision means that patients flagged as having a disease usually do have it, keeping false positive cases low.

Formula for Precision:

Precision = \frac{TP}{TP + FP}

[Example Graph: Precision]
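With the same made-up labels as in the accuracy sketch, precision looks only at the instances the model predicted as positive:

```python
from sklearn.metrics import precision_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Precision = TP / (TP + FP): 3 of the 4 positive predictions are correct.
print(precision_score(y_true, y_pred))  # 0.75
```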

6. Recall:

• Recall (also known as Sensitivity) measures the proportion of correctly predicted positive instances out of all actual positive instances.
• It is crucial in scenarios where capturing every positive case is vital, even at the cost of some false alarms.
• For instance, in healthcare, high recall ensures the model rarely misses a patient who has the disease, even if some healthy individuals are flagged for further evaluation.

Formula for Recall:

Recall = \frac{TP}{TP + FN}

Where:

• TP = True Positive
• FN = False Negative

[Example Graph: Recall]
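Recall, sketched with the same made-up labels, instead asks how many of the actual positives were found:

```python
from sklearn.metrics import recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Recall = TP / (TP + FN): 3 of the 4 actual positives are identified.
print(recall_score(y_true, y_pred))  # 0.75
```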

7. F1 Score:

• The F1 score represents the harmonic mean of precision and recall.
• It offers a balanced assessment of a model's performance by taking both precision and recall into account.
• It favors models that keep precision and recall in balance over those that score high on one and low on the other.
• Because the harmonic mean is well suited to averaging ratios, the F1 score is especially valuable when precision and recall diverge, as often happens with imbalanced classes.

Formula for F1 Score:

F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall}

[Example Graph: F1 Score]
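A minimal sketch with the same made-up labels; since precision and recall both happen to be 0.75 here, their harmonic mean is 0.75 as well:

```python
from sklearn.metrics import f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# F1 = 2 * (precision * recall) / (precision + recall).
print(f1_score(y_true, y_pred))  # 0.75
```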

8. Confusion Matrix:

• The confusion matrix is a tabular representation of true versus predicted classes in a classification problem.
• It displays the four possible combinations (true positives, true negatives, false positives, and false negatives), offering insight into the model's performance and where it needs improvement.

Layout of a Confusion Matrix:

                     Predicted Negative    Predicted Positive
Actual Negative      TN                    FP
Actual Positive      FN                    TP
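The full breakdown behind all of the classification scores above can be read off scikit-learn's confusion_matrix, again sketched with the same made-up labels:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# scikit-learn orders the matrix as:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))
# [[3 1]
#  [1 3]]
```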

Recap:

In conclusion, evaluation metrics are indispensable tools in ML for assessing model performance and guiding decision-making. Using both regression and classification metrics gives a comprehensive picture of a model's capabilities and its areas for improvement. The table below summarizes the key evaluation metrics discussed in this article, along with their descriptions and formulas:

Metric                    Description                                          Formula
------------------------  ---------------------------------------------------  -----------------------------------------------
Mean Absolute Error       Average magnitude of errors between predicted        MAE = (1/N) Σ |y_j − ŷ_j|
(MAE)                     and actual values in regression models.
Root Mean Squared         Like MAE but penalizes large errors more;            RMSE = √((1/N) Σ (y_j − ŷ_j)²)
Error (RMSE)              sensitive to outliers.
R-squared (R2) Score      Proportion of variance in the dependent variable     R² = 1 − Σ(y_j − ŷ_j)² / Σ(y_j − ȳ)²
                          explained by the independent variables.
Accuracy                  Proportion of correct predictions over               (TP + TN) / (TP + TN + FP + FN)
                          total predictions.
Precision                 Proportion of predicted positives that are           TP / (TP + FP)
                          actually positive.
Recall (Sensitivity)      Proportion of actual positives that are              TP / (TP + FN)
                          correctly predicted.
F1 Score                  Harmonic mean of precision and recall;               2 · (Precision · Recall) / (Precision + Recall)
                          a balanced performance measure.
Confusion Matrix          Table of true vs. predicted classes showing          rows: actual class; columns: predicted class
                          the counts of TP, TN, FP, and FN.

By leveraging evaluation metrics effectively, data scientists and ML practitioners can develop robust, accurate ML models that meet the desired performance standards and deliver reliable results in real-world applications.

Thank you.
