R Lectures PDF
characters or not.
The operators for assigning a value to a named object:
= or <-
R will perform the arithmetic
operation and return the result in the
console on the line below, next to a [1]
that indicates the first and, in this case,
only result from your command.
Notice that when you ask R to
print x, the value 42 appears
in the console.
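A minimal sketch of the two ideas above, assuming x is assigned the value 42 as in the text. The arithmetic expression 3 + 4 is only an illustrative choice.

```r
# Arithmetic typed at the console is evaluated and printed next to [1]
3 + 4

# Assign the value 42 to the object x using the <- operator
x <- 42

# Typing the name of the object prints its value in the console
x
```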
Suppose you have a fruit
basket with five apples. As a
data analyst in training, you
want to store the number of
apples in a variable with the
name my_apples.
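The fruit basket exercise can be sketched as follows, using the variable name my_apples from the text.

```r
# Store the number of apples in the variable my_apples
my_apples <- 5

# Print the contents of my_apples
my_apples
```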
Example
What was your overall profit or loss on each day of
the week?
MEAN
It is calculated by taking the sum of the values and dividing by
the number of values in a data series.
If there are missing values, then the mean() function returns NA unless they are removed with na.rm = TRUE.
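The behavior of the mean with missing values can be sketched as follows. The vector profit and its values are hypothetical and serve only as an illustration.

```r
# Hypothetical daily profit/loss values with one missing entry
profit <- c(2, 4, NA, 6)

# The mean is the sum of the values divided by the number of values
complete <- profit[!is.na(profit)]
manual_mean <- sum(complete) / length(complete)
manual_mean

# With a missing value present, mean() returns NA by default
mean(profit)

# na.rm = TRUE drops the missing values before averaging
mean(profit, na.rm = TRUE)
```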
MEDIAN
MODE
VARIANCE
It is a numerical measure of how the data values are dispersed
around the mean.
STANDARD DEVIATION
It is the square root of the variance.
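A short sketch of the relationship between variance and standard deviation in R. The data vector x is hypothetical. Note that var() and sd() compute the sample versions, which divide by n - 1.

```r
# Hypothetical data series for illustration
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Sample variance: sum of squared deviations from the mean, divided by n - 1
manual_var <- sum((x - mean(x))^2) / (length(x) - 1)
manual_var
var(x)

# The standard deviation is the square root of the variance
sd(x)
sqrt(var(x))
```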
Graphical
Normal Q-Q Plot
Histogram
Numerical
Shapiro-Wilk Test
EXAMPLE:
qqnorm(x)
qqline(x)
To construct a histogram:
hist(x)
shapiro.test(x)
t.test(x)
Suppose we would like to estimate the mean amount of money spent
on books by BS Statistics students in a semester. We have data
from 10 randomly selected students. Construct a 95% confidence interval.
Student No.   Amount of Money
1             125
2             225
3             154
4             150
5             125
6             220
7             195
8             90
9             123
10            145
Because the p-value of the Shapiro-Wilk test is greater
than the level of significance of 0.05, we do not reject normality;
the sample data are consistent with a normal distribution.
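The normality check can be sketched as follows, using the ten amounts from the table. The object names money, sw, and normal_ok are illustrative choices.

```r
# Amount of money spent on books by the 10 sampled students
money <- c(125, 225, 154, 150, 125, 220, 195, 90, 123, 145)

# Shapiro-Wilk test of normality
sw <- shapiro.test(money)
sw

# Compare the p-value with the 0.05 level of significance
normal_ok <- sw$p.value > 0.05
normal_ok
```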
If we call the t.test() command with only the
data name, the output reports a 95% confidence interval for
the mean after the significance test results.
The t.test() command can also be used to find
confidence intervals with confidence levels different
from 95%. We can specify the desired level of
confidence using the conf.level argument.
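A minimal sketch of both calls, using the book-spending data from the table. The 99% level is only an example of a level other than 95%.

```r
# Amount of money spent on books by the 10 sampled students
money <- c(125, 225, 154, 150, 125, 220, 195, 90, 123, 145)

# Default call: reports a 95% confidence interval for the mean
ci95 <- t.test(money)$conf.int
ci95

# Same data with a 99% confidence level set through conf.level
ci99 <- t.test(money, conf.level = 0.99)$conf.int
ci99
```

A higher confidence level produces a wider interval around the same sample mean.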
EXAMPLE:
1. H0 : μ = 82 and Ha: μ ≠ 82
EXAMPLE:
2. H0 : μ ≤ 36 and Ha : μ > 36
EXAMPLE:
Student       1  2  3    4  5  6  7  8  9  10
Grade points  5  6  4.5  5  5  6  5  5  5  5.5
H0 : μ = 4.5
Ha : μ ≠ 4.5
Reject the null hypothesis because the computed
p-value is less than the alpha level (0.01). Therefore,
the grade point average of the 10 pupils is different
from the population's GPA.
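The two-sided test of H0 : μ = 4.5 can be sketched as follows, using the ten grade points listed in the example. The object names grades and res are illustrative.

```r
# Grade points of the 10 students
grades <- c(5, 6, 4.5, 5, 5, 6, 5, 5, 5, 5.5)

# One-sample t-test of H0: mu = 4.5 against Ha: mu != 4.5
res <- t.test(grades, mu = 4.5, alternative = "two.sided")
res

# Decision at the 0.01 alpha level
reject_h0 <- res$p.value < 0.01
reject_h0
```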
EXAMPLE:
Student 1 2 3 4 5 6
Weight 135 119 106 135 180 108
H0 : μ ≤ 140
Ha : μ > 140
Do not reject the null hypothesis because the
computed p-value is greater than the alpha level (0.05).
Therefore, there is not enough evidence that the average
weight of the students exceeds 140 lb., and the claim of
the teacher is not supported.
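A sketch of the one-sided test of H0 : μ ≤ 140 against Ha : μ > 140, using the six weights from the example. The object names weight and res are illustrative.

```r
# Weights (lb.) of the 6 students
weight <- c(135, 119, 106, 135, 180, 108)

# One-sample t-test with a right-tailed alternative (Ha: mu > 140)
res <- t.test(weight, mu = 140, alternative = "greater")
res

# Decision at the 0.05 alpha level
reject_h0 <- res$p.value < 0.05
reject_h0
```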
Independent Sample t-Test
The independent sample t-test allows researchers
to compare the mean difference between
two populations using the data from two separate
samples. It is used to test whether population means
are significantly different from each other, using the
means from randomly drawn samples.
H0 : μ1 − μ2 ≥ 0 and Ha : μ1 − μ2 < 0
H0 : μ1 − μ2 ≤ 0 and Ha : μ1 − μ2 > 0
H0 : μ1 − μ2 = 0 and Ha : μ1 − μ2 ≠ 0
Fruit Diet 3 4 4 4 5 6 6
Bread Diet 1 2 2 2 3 4 4
Before we proceed to the t.test() command, we must
first check whether the variances are homogeneous.
Use the var.test() command for Fisher's F-test.
The hypotheses used are:
H0: Equal Variances Assumed
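The check for homogeneous variances followed by the t-test can be sketched as follows, using the fruit diet and bread diet data. The two-sided alternative and the 0.05 level are assumptions here, since the slide does not state which hypothesis pair applies to this example.

```r
# Data for the two independent samples
fruit <- c(3, 4, 4, 4, 5, 6, 6)
bread <- c(1, 2, 2, 2, 3, 4, 4)

# F-test for equality of variances (H0: equal variances)
vt <- var.test(fruit, bread)
vt

# If H0 is not rejected, run the t-test with pooled variances
equal_var <- vt$p.value > 0.05
res <- t.test(fruit, bread, var.equal = equal_var)
res
```

Setting var.equal = FALSE instead gives the Welch version of the test, which does not assume equal variances.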
Theory 30 42 49 50 63 38 43 46 54 42 26
Practical 52 58 42 67 94 68 22 34 55 48 17
H0 : μ1 − μ2 ≤ 0 and Ha : μ1 − μ2 > 0
H0 : μ1 − μ2 = 0 and Ha : μ1 − μ2 ≠ 0
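For the theory and practical scores, the same two steps can be sketched as follows. Treating theory as the first sample and using a 0.05 level are assumptions. The two-sided form is shown, and the right-tailed form only changes the alternative argument.

```r
# Scores for the two independent samples
theory    <- c(30, 42, 49, 50, 63, 38, 43, 46, 54, 42, 26)
practical <- c(52, 58, 42, 67, 94, 68, 22, 34, 55, 48, 17)

# Step 1: F-test for equality of variances
vt <- var.test(theory, practical)
equal_var <- vt$p.value > 0.05

# Step 2: two-sided test of H0: mu1 - mu2 = 0
two_sided <- t.test(theory, practical, var.equal = equal_var)
two_sided

# Right-tailed test of H0: mu1 - mu2 <= 0 against Ha: mu1 - mu2 > 0
right_tailed <- t.test(theory, practical, var.equal = equal_var,
                       alternative = "greater")
right_tailed
```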
Independent Sample t-Test
To compute the one sample t-test:
Case             1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20
Before Training  85 84 86 87 89 82 80 84 86 82 89 87 82 81 86 89 89 84 85 88
After Training   95 98 97 92 96 93 94 95 90 82 97 98 95 95 92 91 94 95 96 97
H0 : μ1 − μ2 = 0 and Ha : μ1 − μ2 ≠ 0
The results before and after training
differ significantly because the p-value (0.000)
is less than the level of significance (0.05).
Furthermore, the result tells us that the training is
effective because the teaching performance of the
teacher was significantly higher after the training
course than before. Therefore, we have sufficient
evidence to support the claim.
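Because each case is measured before and after training, one way to run this comparison is a paired t-test, which is equivalent to a one-sample t-test on the differences. This is a sketch under that assumption. The object names before, after, and res are illustrative.

```r
# Scores of the 20 cases before and after training
before <- c(85, 84, 86, 87, 89, 82, 80, 84, 86, 82,
            89, 87, 82, 81, 86, 89, 89, 84, 85, 88)
after  <- c(95, 98, 97, 92, 96, 93, 94, 95, 90, 82,
            97, 98, 95, 95, 92, 91, 94, 95, 96, 97)

# Paired t-test of H0: mean difference = 0 (two-sided)
res <- t.test(after, before, paired = TRUE)
res

# Equivalent one-sample t-test on the differences
res_diff <- t.test(after - before, mu = 0)
res_diff$p.value
```

A very small p-value is often displayed as 0.000 when rounded to three decimals.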
Correlation Test