Unit 7 ML
Data collection and preprocessing are critical steps in the data analysis
and machine learning pipeline.
Data Collection
1. Define Objectives: Clearly articulate what you want to learn or predict. Your objectives determine
what data you need and how much of it.
2. Identify Data Sources: Determine where your data will come from. It might be databases, APIs,
web scraping, sensors, surveys, or existing datasets.
3. Data Gathering: Collect the data from the identified sources. This can involve writing scripts or
programs to automate data retrieval. Ensure that you have the necessary permissions and rights to
use the data.
4. Data Quality Check: Examine the data for quality issues such as missing values, duplicates,
outliers, and inconsistencies. Clean the data as needed to address these issues.
5. Data Integration: If your data comes from multiple sources, you may need to integrate it into a
single dataset. This may involve data merging, joining, or concatenation.
6. Data Storage: Decide on an appropriate storage format and location for your data. Common
options include relational databases, NoSQL databases, data lakes, or simple file formats like CSV
or JSON.
7. Data Documentation: Maintain documentation that describes the data sources, collection methods,
and any transformations or cleaning steps performed. This documentation is crucial for
reproducibility.
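As a concrete sketch of the gathering and quality-check steps above (the CSV content and column names are illustrative placeholders, and pandas is assumed to be available):

```python
# Sketch of a minimal collection + quality-check pass using pandas.
import io
import pandas as pd

# Stand-in for data gathered from a file, API, or database export.
raw_csv = io.StringIO(
    "id,age,income\n"
    "1,34,52000\n"
    "2,,61000\n"
    "1,34,52000\n"
    "3,29,\n"
)
df = pd.read_csv(raw_csv)

# Quality check (step 4): missing values and exact duplicate rows.
missing_per_column = df.isna().sum()
duplicate_rows = df.duplicated().sum()

# Basic cleaning: drop exact duplicates.
df = df.drop_duplicates().reset_index(drop=True)
```

The same `isna`/`duplicated` pass is worth rerunning after any integration step (step 5), since merges often reintroduce duplicates.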
Data Preprocessing
1. Handling Missing Data: Decide how to deal with missing values. You can either remove rows with missing data,
impute missing values with statistical methods, or use advanced imputation techniques.
2. Outlier Detection and Treatment: Identify and handle outliers that can skew your analysis. You can remove outliers,
transform them, or use robust statistical methods.
3. Feature Selection: Choose relevant features (columns) that are likely to contribute to your analysis or machine
learning model. Feature selection can reduce dimensionality and prevent overfitting.
4. Feature Engineering: Create new features that can provide more information or improve model performance. This
might involve mathematical transformations, aggregation, or creating categorical variables from continuous data.
5. Scaling and Normalization: Scale or normalize your data to ensure that different features have similar scales.
Common techniques include min-max scaling and z-score normalization.
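The imputation and scaling steps above can be sketched with numpy alone (the sample values are arbitrary illustrations):

```python
# Sketch: mean imputation plus the two scalings named above.
import numpy as np

x = np.array([2.0, 4.0, np.nan, 8.0, 6.0])

# 1. Handling missing data: impute NaNs with the mean of the observed values.
x_imputed = np.where(np.isnan(x), np.nanmean(x), x)

# 5. Min-max scaling maps values into [0, 1] ...
x_minmax = (x_imputed - x_imputed.min()) / (x_imputed.max() - x_imputed.min())

# ... while z-score normalization gives mean 0 and unit variance.
x_zscore = (x_imputed - x_imputed.mean()) / x_imputed.std()
```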
6. Encoding Categorical Data: Convert categorical variables into numerical format using techniques like one-hot
encoding or label encoding, depending on the nature of the data and the machine learning algorithm you plan to use.
7. Data Splitting: Divide your dataset into training, validation, and test sets for machine learning tasks. The training set
is used to train models, the validation set is used for hyperparameter tuning, and the test set is reserved for evaluating
model performance.
8. Data Transformation: Some machine learning algorithms require specific data transformations, such as principal
component analysis (PCA) or time series decomposition.
9. Data Imbalance Handling: If you're dealing with imbalanced datasets in classification tasks, consider techniques
like oversampling, undersampling, or using different evaluation metrics.
10. Data Visualization: Visualize your data to gain insights and identify patterns or anomalies. Data visualization tools
can help you explore the data's characteristics.
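Steps 6 and 7 can be sketched as follows (pandas and scikit-learn are assumed to be available; the 60/20/20 split and column names are illustrative choices):

```python
# Sketch: one-hot encoding and a train/validation/test split.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "color": ["red", "blue", "green", "red", "blue"] * 4,
    "value": range(20),
    "label": [0, 1] * 10,
})

# 6. One-hot encode the categorical column.
df_encoded = pd.get_dummies(df, columns=["color"])

# 7. Split: first hold out 20% as the test set, then carve a
# validation set out of the remainder (0.25 * 0.8 = 0.2 overall).
train_val, test = train_test_split(df_encoded, test_size=0.2, random_state=0)
train, val = train_test_split(train_val, test_size=0.25, random_state=0)
```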
Outlier analysis using Z-Score
Outlier analysis using Z-Score, also known as the standard score, is a statistical technique
used to identify and deal with outliers in a dataset. Outliers are data points that deviate
significantly from the rest of the data and can distort statistical analysis and machine
learning models. Z-Score helps you quantify how far each data point is from the mean and
provides a threshold for identifying outliers.
Perform outlier analysis using Z-Score
Calculate the mean (average) and standard deviation of your dataset. These statistics
describe the central tendency and the spread of the data, respectively.
Z-Score (Z) = (x - μ) / σ
where x is a data point, μ is the mean, and σ is the standard deviation of the dataset.
The Z-Score represents how many standard deviations a data point is away from the mean. A Z-Score of 0 means
the data point is exactly at the mean, positive values indicate data points above the mean, and negative values
indicate data points below the mean.
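The formula translates directly into numpy (the sample data, with 40 as a planted outlier, is illustrative):

```python
# Computing Z-Scores from the formula Z = (x - mu) / sigma.
import numpy as np

data = np.array([10.0, 12.0, 11.0, 13.0, 12.0, 40.0])  # 40 is a planted outlier
mu = data.mean()
sigma = data.std()
z = (data - mu) / sigma

# Flag points more than 2 standard deviations from the mean.
is_outlier = np.abs(z) > 2
```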
Set a Threshold for Identifying Outliers: Decide on a threshold Z-Score value beyond which data points are considered
outliers. A common threshold is |Z| > 2 (that is, Z > 2 or Z < -2), which corresponds to data points that are more
than two standard deviations away from the mean.
Identify Outliers: Data points with Z-Scores exceeding the chosen threshold are considered outliers. You can create a new
binary variable to label them as outliers (1) or not (0).
Handle Outliers:
1. Remove Outliers: Exclude outlier data points from your analysis, especially if you believe they are erroneous or
irrelevant.
2. Transform Data: Apply transformations to mitigate the impact of outliers, such as log transformations or winsorization
(capping extreme values).
3. Keep Outliers: In some cases, outliers may be of interest, and you may want to analyze them separately or understand
why they exist.
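The removal and winsorization options can be sketched like this (the data and z-score threshold repeat the earlier illustration; the 5th/95th-percentile caps are an arbitrary choice):

```python
# Sketch: two of the handling options applied to flagged points.
import numpy as np

data = np.array([10.0, 12.0, 11.0, 13.0, 12.0, 40.0])
z = (data - data.mean()) / data.std()
is_outlier = np.abs(z) > 2

# Option 1: remove outliers entirely.
cleaned = data[~is_outlier]

# Option 2: winsorize by capping at the 5th/95th percentiles.
low, high = np.percentile(data, [5, 95])
winsorized = np.clip(data, low, high)
```

Winsorization keeps the sample size intact, which matters when rows carry other useful columns.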
Reanalyze Data: After handling outliers, you can recompute summary statistics, visualize the data, or build machine
learning models with the cleaned dataset.
Z-Score-based outlier analysis is a simple yet effective method for identifying and managing outliers in your data.
However, the choice of the Z-Score threshold is somewhat subjective and should be guided by domain knowledge and the
specific goals of your analysis. Additionally, be aware that Z-Score-based methods may not work well for datasets with
non-normal distributions, and alternative techniques may be more appropriate in such cases.
Model selection & evaluation
Model selection and evaluation are crucial steps in the process of building and deploying
machine learning models. Selecting the right model and assessing its performance correctly
are essential for achieving the best results in your machine learning project.
Model Selection
1. Define Your Goals: Clearly articulate the objectives of your machine learning project. Understand what
you want to predict or accomplish with your model.
2. Choose Candidate Models: Based on your problem type (classification, regression, clustering, etc.) and
the nature of your data, select a set of candidate machine learning algorithms. Common choices include
linear regression, decision trees, random forests, support vector machines, neural networks, etc.
3. Feature Selection/Engineering: Before building and comparing models, carefully select and preprocess
your features. Feature engineering may involve creating new features, transforming data, and handling
missing or categorical data.
4. Split the Data: Divide your dataset into training, validation, and test sets. The training set is used for
model training, the validation set for hyperparameter tuning and model selection, and the test set for final
model evaluation.
5. Train Models: Train each candidate model using the training data. Tune hyperparameters using the
validation set to find the best-performing configuration for each model.
6. Cross-Validation: Perform cross-validation on the training data to assess each model's performance more
robustly. Common techniques include k-fold cross-validation.
7. Evaluate Model Complexity: Consider the trade-off between model complexity and performance.
Simpler models are less likely to overfit but may have lower predictive power, while complex models may
capture more nuances but are more prone to overfitting.
8. Select the Best Model: Based on cross-validation results and your evaluation criteria (e.g., accuracy,
precision, recall, F1-score for classification; RMSE, MAE for regression), choose the best-performing
model.
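Steps 5, 6, and 8 can be sketched with scikit-learn's cross-validation utilities on a synthetic dataset (the candidate models and dataset parameters are illustrative, and scikit-learn is assumed to be available):

```python
# Sketch: comparing two candidate models with 5-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

# Mean cross-validated accuracy per candidate (steps 5-6).
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}

# Step 8: pick the best-performing model by the chosen metric.
best_name = max(scores, key=scores.get)
```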
Model Evaluation
1. Test Data Evaluation: Once you've selected your best model, evaluate its performance on the test dataset,
which it has never seen before. This step provides a realistic estimate of how well your model will
perform on unseen data.
2. Performance Metrics: Choose appropriate evaluation metrics based on your problem type. For
classification, you might use accuracy, precision, recall, F1-score, ROC AUC, etc. For regression,
common metrics include RMSE, MAE, R-squared, etc.
3. Confusion Matrix (Classification): Analyze the confusion matrix to understand the model's performance
regarding true positives, true negatives, false positives, and false negatives. This can help you make
informed decisions about trade-offs between precision and recall.
4. Visualizations: Create visualizations, such as ROC curves, precision-recall curves, or residual plots, to
gain insights into your model's behavior.
5. Business Impact: Consider the business or real-world implications of your model's performance.
Evaluate whether the model meets the desired objectives and whether it aligns with the project's goals.
6. Bias and Fairness: Assess the model for biases, fairness, and ethical concerns. Ensure that it doesn't
discriminate against certain groups or exhibit unintended behavior.
7. Interpretability: If model interpretability is important, use techniques such as feature importance analysis
or model-agnostic interpretability tools to understand how the model makes predictions.
8. Iterate and Refine: Depending on the evaluation results, you may need to iterate on the model selection,
feature engineering, or data preprocessing steps to improve model performance.
9. Documentation: Maintain thorough documentation of the selected model, its hyperparameters, and the
evaluation results. This documentation is crucial for reproducibility and future reference.
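A minimal sketch of the test-set evaluation described in steps 1-3, assuming scikit-learn is available (the synthetic dataset and logistic regression model are illustrative stand-ins):

```python
# Sketch: final evaluation on a held-out test set with a confusion
# matrix and common classification metrics.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)

acc = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)  # rows: true class, cols: predicted
```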
Optimization of tuning parameters
Hyperparameter tuning is a crucial step in building machine learning models. Hyperparameters are
parameters of the model that are not learned from the data but are set before training. Tuning these
hyperparameters can significantly impact a model's performance.
Define Your Search Space:
Start by identifying the hyperparameters you want to tune. These may include learning rates, regularization
strengths, tree depths, kernel types, etc., depending on the algorithm you're using.
Choose a Search Strategy:
•Grid Search: In this method, you specify a set of possible values for each hyperparameter. The algorithm
then evaluates all possible combinations, creating a grid of hyperparameter configurations. Grid search is
straightforward but can be computationally expensive.
•Random Search: Random search selects random combinations of hyperparameters from predefined
ranges. It's often more efficient than grid search and can find good hyperparameters faster.
Cross-Validation:
Split your training data into multiple subsets for cross-validation. A common choice is k-fold
cross-validation, where the data is divided into k subsets and each model is trained and evaluated on different
combinations of these subsets. Cross-validation helps assess how well a set of hyperparameters performs across
different data partitions, reducing the risk of overfitting.
Hyperparameter Tuning:
Apply the chosen search method (grid search or random search) to find the best hyperparameters. For
each combination of hyperparameters:
•Train the model on the training data.
•Evaluate the model's performance using cross-validation and the chosen evaluation metric(s).
•Record the evaluation metric(s) for that combination.
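The train/evaluate/record loop above is what scikit-learn's GridSearchCV automates; a minimal sketch, assuming scikit-learn is available (the parameter grid and dataset are illustrative):

```python
# Sketch: grid search with 5-fold cross-validation over a tiny grid.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)

param_grid = {"max_depth": [2, 4, 8], "min_samples_leaf": [1, 5]}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,                 # 5-fold cross-validation per combination
    scoring="accuracy",   # the evaluation metric to record
)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
```

Swapping `GridSearchCV` for `RandomizedSearchCV` gives the random-search variant over the same grid.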
Refinement: Depending on the results, you may need to iterate on the optimization process, fine-tuning the
hyperparameters further or even revisiting your initial choices.
Documentation: Document the optimized hyperparameters and the performance metrics associated with them.
This documentation is essential for reproducibility and model deployment.
Deployment and Monitoring: Deploy your model with the optimized hyperparameters in a production
environment. Monitor its performance over time and be prepared to re-tune the hyperparameters periodically as
the data distribution evolves or the model's requirements change.
Visualization of results
Visualizing the results of your data analysis or machine learning model can
provide valuable insights, help you communicate your findings effectively, and
aid in decision-making. The choice of visualization techniques depends on the
type of data and the specific goals of your analysis.
Common ways to visualize
Line Charts
Line charts are ideal for showing trends over time or
across ordered categories. They are commonly used
for time series data or to visualize the relationship
between two continuous variables.
Scatter Plots
Scatter plots are effective for visualizing the
relationship between two continuous variables. Each
point on the plot represents a data point, making it
easy to identify patterns or outliers.
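A minimal matplotlib sketch of these two chart types (the data is synthetic, and the non-interactive Agg backend is an assumption so the script runs headlessly):

```python
# Sketch: a line chart and a scatter plot side by side.
import matplotlib
matplotlib.use("Agg")  # headless backend; no display needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 50)
rng = np.random.default_rng(0)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.plot(x, np.sin(x))  # trend over an ordered axis
ax1.set_title("Line chart")
ax2.scatter(x, np.sin(x) + rng.normal(0, 0.2, x.size))  # noisy relationship
ax2.set_title("Scatter plot")
fig.savefig("charts.png")
```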
Area Charts
Area charts are similar to line charts but are filled in with
color, making it easier to visualize the cumulative effect of
values over time.
Sankey Diagrams
Sankey diagrams are used to visualize the flow of resources or
quantities between different entities. They are often used in
process analysis or to depict hierarchical structures.
Choropleth Maps
Choropleth maps use color-coding to represent data by
geographic regions. They are useful for showing regional
patterns or variations.
3D Plots
3D plots can be used when you need to visualize data in three
dimensions. They are suitable for situations where two
continuous variables are dependent on a third variable.
Interactive Dashboards
Create interactive dashboards using tools like Tableau, Power
BI, or Plotly. These allow users to explore data and results
dynamically.
Network Graphs
Network graphs are useful for visualizing
relationships between entities, such as social
networks, co-authorship networks, or
hierarchical structures.
Error Bars
When presenting statistical results, use error
bars to indicate variability or uncertainty in
your measurements.
Thank You