Chapter 3 Exploratory Data Analysis
[Figure: first look at a dataset. Panels: load the dataset; dimensions of the data (dim, nrow, ncol, names); structure of the dataset; summary of the dataset]
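A minimal sketch of these first-look steps in R. It assumes the built-in `iris` data frame stands in for the chapter's dataset, which is not shown here; `str` and `summary` are one common way to inspect structure and summary, not necessarily the calls the chapter has in mind.

```r
# Assumption: the built-in iris data frame stands in for the chapter's dataset.
data(iris)

dim(iris)      # number of rows and number of columns
nrow(iris)     # number of rows
ncol(iris)     # number of columns
names(iris)    # column names

str(iris)      # structure: type of each column and its first few values
summary(iris)  # summary: quartiles and mean for numeric columns, counts for factors
```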
CONTINUOUS AND CATEGORICAL VARIABLES
Continuous variables are quantitative variables that can be measured and
can take any value within a range, so the number of possible values is
infinite. The mean, median and mode can be calculated for continuous
variables. Examples include height, weight and the speed of a vehicle.
Categorical variables are variables whose values fall into a finite
number of distinct groups, for example gender or pass/fail.
In simple words, if we can measure a variable's values, it is continuous;
if we can only count how many observations fall into each group, it is
categorical.
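The distinction can be sketched in R as follows. The values and the names `height_cm` and `result` are hypothetical, made up for illustration only.

```r
# Hypothetical sample values, invented for illustration only.
height_cm <- c(152.4, 160.1, 171.8, 165.0, 158.3)               # continuous: measured
result    <- factor(c("pass", "fail", "pass", "pass", "fail"))  # categorical: finite groups

mean(height_cm)    # average of the measured values
median(height_cm)  # middle value once the data are sorted

levels(result)     # the distinct groups
table(result)      # how many observations fall into each group
```

Note that base R's `mode()` returns an object's storage type, not the statistical mode, so the most frequent value has to be found another way, for example from `table()`.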
NORMAL DISTRIBUTION
[Figure: normal distribution curve (line drawing to be drawn)]
RIGHT SKEWED AND LEFT SKEWED
[Figure: box plot with labeled parts: whiskers, median, outliers]
OUTLIER TREATMENT
[Figure: data divided into four quarters: first 25% of the data, second 25% of the data, third 25% of the data, last 25% of the data]
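The four 25% segments and the box plot parts (median, whiskers, outliers) can be computed in base R. This is a minimal sketch on a small made-up vector `x`. The 1.5 × IQR whisker rule is the default of `boxplot.stats()`, not necessarily the rule this chapter uses.

```r
# Hypothetical values, invented for illustration; 40 is deliberately extreme.
x <- c(2, 4, 5, 6, 7, 8, 9, 10, 12, 40)

quantile(x)   # minimum, 25%, 50% (median), 75%, maximum: boundaries of the four quarters

stats <- boxplot.stats(x)  # default coef = 1.5
stats$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
stats$out     # points beyond the whiskers, reported as outliers
```

Calling `boxplot(x, horizontal = TRUE)` draws the same quantities as a box, a median line, whiskers and individual outlier points.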
DEALING WITH MISSING VALUES
STANDARDIZING DATA
A sample dataset