Statistics Handouts
INTRODUCTION
The processing of statistical information has a history that extends back to the beginning of
mankind. In early biblical times nations compiled statistical data to provide descriptive
information about all sorts of things, such as taxes, wars, agricultural crops, and even
athletic events. Today, with the development of probability theory, we are able to use
statistical methods that not only describe important features of the data but also
allow us to proceed beyond the data into the area of decision making through generalizations
and predictions.
Example 1.2.2
1. Situation: Quiz in a Math 5 class of 50 students.
Population: 50 students
Sample: 11 students
Descriptive Statistics: 70% of the class, or 35 students, failed the quiz.
Inferential Statistics: The students in the Math 5 class should increase their studying time to
more than 1 hour, since 11 of the students who failed admitted that they
allocated only 1 hour of studying before the quiz.
Example 1.3.2
1. Situation: Customers were asked about their favorite ice cream flavors sold at a grocery store.
Variable: Customers' choice of ice cream flavors
Observation: Mango flavor
Data set: {Mango flavor, Ube flavor, Vanilla flavor, . . .}
Example 1.4.2
1. The following table gives several discrete variables and the set of possible values for each one.
In each case the value of the variable is determined by counting.
Example 1.5.2 The following table gives several examples of qualitative variables along with
a set of categories into which they may be classified.
Example 1.6.2 The following table gives several qualitative variables and a set of possible
nominal level data values.
Definition 1.6.3 The ordinal level of measurement is characterized by data that applies
to categories that can be ranked. The ordinal scale applies to data that can be arranged in
some order, but differences between data values either cannot be determined or are
meaningless. Ordinal scale data can be arranged in an ordering scheme.
Example 1.6.4 Table 1.6.4 gives several qualitative variables and a set of possible ordinal
level data values. Arithmetic operations are not performed on ordinal level data, but an
ordering scheme exists.
Example 1.6.6
1. IQ scores represent interval level data. Jose's IQ score equals 104 and Juan's IQ score equals
140. Juan has a higher IQ than Jose; that is, IQ scores can be arranged in order. Juan's IQ
score is 36 points higher than Jose's IQ score; that is, differences can be calculated and
interpreted. However, we cannot conclude that Juan is about 1.3 times (140/104 ≈ 1.3) as
intelligent as Jose. An IQ score of zero does not indicate a complete lack of intelligence.
2. Test scores represent interval level data. Mara scored 90 on a test and Clara scored 70 on
a test. Mara scored higher than Clara did on the test; that is, the test scores can be arranged
in order. Mara scored 20 points higher than Clara did on the test; that is, differences can be
calculated and interpreted. We cannot conclude that Mara knows twice as much as Clara
about the subject matter. A test score of 0 does not indicate an absence of knowledge
concerning the subject matter.
Definition 1.6.7 The ratio level of measurement results from counting or measuring.
The ratio scale applies to data that can be ranked and for which all arithmetic operations
including division can be performed. Division by zero is, of course, excluded. Ratio scale
data can be arranged in an ordering scheme and differences and ratios can be calculated and
interpreted. Ratio data has an absolute zero and a value of zero indicates a complete absence
of the characteristic of interest.
Example 1.6.8 The grams of fat consumed per day for adults in the Philippines is ratio scale
data. Mark consumes 70 grams of fat per day and Anthony consumes 35 grams per day. Mark
consumes twice as much fat as Anthony per day, since 70/35 = 2. For an individual who
consumes 0 grams of fat on a given day, there is a complete absence of fat consumed on that
day. Notice that a ratio is interpretable and an absolute zero exists.
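$\sum_{i=1}^{4} x_i = x_1 + x_2 + x_3 + x_4$
$= 15 + 10 + 18 + 6$
$= 49.$
Also,
$\sum_{i=2}^{3} x_i = x_2 + x_3$
$= 10 + 18$
$= 28.$
In general, the symbol $\sum_{i=1}^{n} x_i$ denotes the sum $x_1 + x_2 + \cdots + x_n$.
Example 1.7.1 If x1 = 3, x2 = 5, and x3 = 7, find
1. $\sum_{i=1}^{3} x_i$
Solution:
$\sum_{i=1}^{3} x_i = x_1 + x_2 + x_3 = 3 + 5 + 7 = 15$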
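The summation notation above can be checked with a few lines of Python. This is only a sketch: the helper name `sigma` is hypothetical, and the values 15, 10, 18, 6 are the ones that appear in the worked sums.

```python
# Values taken from the worked sums above: x1 = 15, x2 = 10, x3 = 18, x4 = 6.
x = [15, 10, 18, 6]


def sigma(values, start, stop):
    """Sum values[start..stop] using 1-based, inclusive indices as in the notation."""
    return sum(values[start - 1:stop])


total_1_to_4 = sigma(x, 1, 4)  # x1 + x2 + x3 + x4
total_2_to_3 = sigma(x, 2, 3)  # x2 + x3
print(total_1_to_4, total_2_to_3)  # 49 28
```

The slice `values[start - 1:stop]` converts the 1-based limits of the sigma symbol into Python's 0-based indexing.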
III. Indicate the scale of measurements for each of the following variables. Write N for
nominal, O for ordinal, I for interval and R for ratio.
1. racial origin
2. military ranks
3. temperature scale
4. cellular phone numbers
5. medical diagnoses
IV. If x1 = 4, x2 = −3, x3 = 6, and x4 = −1, evaluate the following:
1. $\sum_{i=1}^{4} x_i^2 (x_i - 3)$
2. $\sum_{i=2}^{4} (x_i + 1)^2$
V. Given x1 = −2, x2 = 3, x3 = 1, y1 = 4, y2 = 0, and y3 = 5, find the value of the following:
1. $\sum_{i=1}^{3} x_i y_i^2$
2. $\left(\sum_{i=1}^{2} x_i\right)\left(\sum_{i=2}^{3} y_i\right)$
II. Supply the empty spaces for sample or population that would be appropriate for the
corresponding data given.
Population                                   Sample
1. ________                                  1. A criminal justice study of 350 prison inmates
2. Legal aliens living in the Philippines    2. ________
3. Alzheimer patients in the Philippines     3. ________
4. ________                                  4. A psychological study of 200 individuals who suffer anemia
III. Identify the variable and the number of observations in the data set.
1. In a sociological study involving 38 low-income households, the number of children per
household was recorded for each household.
2. A national survey was conducted among 3000 households, and one question asked for
the number of televisions per household. One thousand five hundred households completed
the survey.
3. The number of hours of research work was determined for 25 college professors. The
minimum number was 0 hours and the maximum was 28 hours.
4. Classify the variables in numbers 1, 2 and 3 in III as discrete or continuous.
Chapter 2
ORGANIZING DATA
Definition 2.1
1. Raw data is information obtained by observing values of a variable.
2. Data obtained by observing values of a qualitative variable are referred to as qualitative data.
3. Data obtained by observing values of a quantitative variable are referred to as
quantitative data.
4. Quantitative data obtained from a discrete variable are also referred to as discrete data.
5. Quantitative data obtained from a continuous variable are called continuous data.
Example 2.1.2
1. The following is a set of offenses with which individuals were charged in PNP Tangub.
rape robbery burglary arson murder robbery rape defamation
arson theft arson burglary theft robbery theft theft
theft burglary murder murder theft theft theft defamation defamation
2. The following are the flavors of ice cream sold at a grocery store, coded as 0-vanilla,
1-chocolate, 2-ube, 3-mango, 4-melon, 5-banana, 6-durian and 7-avocado.
1 5 1 6 2 7
1 4 1 4 3 5
0 6 0 3 4 6
7 6 6 2 1 7
2 2 2 0 7 2
Make a frequency distribution of the given data.
Answer :
Definition 2.1.4 The percentage for a category is obtained by multiplying the relative
frequency for that category by 100. The sum of the percentages for all the categories will always
equal 100%.
Example 2.1.5
1. Make a relative frequency and percentage distribution of the offenses in Example 2.1.2(1).
Answer :
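As an illustration of Definition 2.1.4, the following Python sketch tallies the offenses listed in Example 2.1.2(1) and derives the relative frequency and percentage of each category. The variable names are assumptions made for this example, and the printed layout is just one possible way to show the table.

```python
from collections import Counter

# The 25 offenses listed in Example 2.1.2(1).
offenses = """
rape robbery burglary arson murder robbery rape defamation
arson theft arson burglary theft robbery theft theft
theft burglary murder murder theft theft theft defamation defamation
""".split()

n = len(offenses)
counts = Counter(offenses)  # frequency of each category

# category -> (frequency, relative frequency, percentage)
table = {cat: (f, f / n, 100 * f / n) for cat, f in counts.items()}

for cat, (f, rel, pct) in sorted(table.items()):
    print(f"{cat:<12}{f:>3}{rel:>8.2f}{pct:>8.1f}%")
```

The relative frequencies add up to 1 and the percentages to 100%, which is a quick check that no observation was lost in the tally.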
Definition 2.1.6 A bar graph is a graph composed of bars whose heights are the
frequencies of the different categories. A bar graph displays graphically the same information
concerning qualitative data that a frequency distribution shows in tabular form.
Example 2.1.7
1. Make a bar graph of the offenses in Example 2.1.2(1).
Answer :
Definition 2.1.8 A pie chart is also used to graphically display qualitative data. To construct
a pie chart, a circle is divided into portions that represent the relative frequencies or
percentages belonging to different categories. To construct a pie chart for the frequency
distribution, construct a table that gives angle sizes for each category. The 360° in a circle are
divided into portions that are proportional to the category sizes.
Example 2.1.9
1. Make a pie graph of the offenses in Example 2.1.2(1).
Answer :
IQ Score Frequency
80-94 8
95-109 14
110-124 24
125-139 16
140-154 13
IQ score is a quantitative variable and, according to the above table, eight of the individuals
have scores between 80 and 94, fourteen have scores between 95 and 109, twenty-four have
scores between 110 and 124, sixteen have scores between 125 and 139, and thirteen have
scores between 140 and 154.
Definition 2.2.1
A data set consisting of the observations for some variable is referred to as
raw data or ungrouped data.
Data presented in the form of a frequency distribution are called grouped data.
Illustration:
The following were grades obtained by students in their preliminary examination:
80 87
92 90
95 83
83 88
Grouped into a frequency distribution, the same grades become:
Grades frequency
80-85 3
86-91 3
92-97 2
Question: How do we turn ungrouped data into grouped data if the number of classes is not given or
stated?
Answer:
1. Solve for $C = \sqrt{n}$, where n = number of data.
If C is not a whole number, then round C up to the next whole number.
The value C will be the number of classes in the desired frequency distribution.
Example 2.2.2
1. The following are the scores of an 80–item exam. Make a frequency distribution of the
data containing class boundaries, class width and class mark.
50 65 70 35 40 57 66 65 70 35
29 33 44 56 66 60 44 50 58 46
67 78 79 47 35 36 44 57 60 57
Answer :
Solve for $C = \sqrt{n} = \sqrt{30} = 5.48$
Since C is not a whole number, C ≈ 6 is the number of classes.
Solve for $D = \dfrac{\text{highest value} - \text{lowest value}}{C} = \dfrac{79 - 29}{6} = \dfrac{50}{6} = 8.33$
Since D is not a whole number and the data are whole numbers, therefore D ≈ 9
Therefore, LCL1 = smallest value = 29
UCL1 = smallest value + D = 29 + 9 = 38
LCL2 = 39
UCL2 = 39 + D = 39 + 9 = 48
LCL3 = 49
UCL3 = 49 + D = 49 + 9 = 58
LCL4 = 59
UCL4 = 59 + D = 59 + 9 = 68
LCL5 = 69
UCL5 = 69 + D = 69 + 9 = 78
LCL6 = 79
UCL6 = 79 + D = 79 + 9 = 88
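The procedure used in this answer (number of classes from the square root of n, class size D from the range, then consecutive class limits) can be sketched in Python as follows. The variable names are hypothetical, and the sketch assumes whole-number data so that the next lower limit is the previous upper limit plus 1.

```python
import math

# Scores of the 80-item exam from Example 2.2.2.
scores = [50, 65, 70, 35, 40, 57, 66, 65, 70, 35,
          29, 33, 44, 56, 66, 60, 44, 50, 58, 46,
          67, 78, 79, 47, 35, 36, 44, 57, 60, 57]

n = len(scores)
num_classes = math.ceil(math.sqrt(n))                            # C, rounded up
step = math.ceil((max(scores) - min(scores)) / num_classes)      # D, rounded up

classes = []  # (lower class limit, upper class limit, frequency)
lower = min(scores)
for _ in range(num_classes):
    upper = lower + step  # UCL = LCL + D, as in the worked answer
    freq = sum(lower <= s <= upper for s in scores)
    classes.append((lower, upper, freq))
    lower = upper + 1     # next LCL for whole-number data

for lcl, ucl, freq in classes:
    boundaries = (lcl - 0.5, ucl + 0.5)
    width = boundaries[1] - boundaries[0]
    mark = (lcl + ucl) / 2
    print(lcl, ucl, freq, boundaries, width, mark)
```

Each printed row gives the class limits, frequency, class boundaries, class width and class mark requested in the example.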
9.00 7.60 8.26 8.30 8.21 7.90 8.21 8.31 8.57 8.86
8.86 8.04 7.70 7.82 7.82 8.04 8.28 8.30 9.01 8.87
8.50 7.90 8.30 8.04 8.26 8.27 8.50 8.57 8.51 8.87
8.51 8.26 8.21 8.04 7.82 8.04 8.30 8.50 8.86 8.51
Answer :
Solve for $C = \sqrt{n} = \sqrt{40} = 6.32$
Since C is not a whole number, C ≈ 7 is the number of classes.
Solve for $D = \dfrac{\text{highest value} - \text{lowest value}}{C} = \dfrac{9.01 - 7.57}{7} = \dfrac{1.44}{7} = 0.21$
Therefore, LCL1 = smallest value = 7.57
UCL1 = smallest value + D = 7.57 + 0.21 = 7.78
LCL2 = 7.79
UCL2 = 7.79 + D = 7.79 + 0.21 = 8.00
LCL3 = 8.01
UCL3 = 8.01 + D = 8.01 + 0.21 = 8.22
LCL4 = 8.23
UCL4 = 8.23 + D = 8.23 + 0.21 = 8.44
LCL5 = 8.45
UCL5 = 8.45 + D = 8.45 + 0.21 = 8.66
LCL6 = 8.67
UCL6 = 8.67 + D = 8.67 + 0.21 = 8.88
LCL7 = 8.89
UCL7 = 8.89 + D = 8.89 + 0.21 = 9.10
Working hours Frequency Class Boundaries Class Width Class Marks
7.57-7.78 2 7.565-7.785 0.22 7.675
7.79-8.00 5 7.785-8.005 0.22 7.895
8.01-8.22 8 8.005-8.225 0.22 8.115
8.23-8.44 10 8.225-8.445 0.22 8.335
8.45-8.66 8 8.445-8.665 0.22 8.555
8.67-8.88 5 8.665-8.885 0.22 8.775
8.89-9.10 2 8.885-9.105 0.22 8.995
Exercise
1. Group the following weights into the classes 100 to under 125, 125 to under 150, and so forth:
111 120 127 129 130 145 145 150 153 155 160
161 165 167 170 171 174 175 177 179 180 180
185 185 190 195 195 201 210 220 224 225 230
245 248
Make a frequency distribution of the data containing class boundaries, class width and class
mark.
2. The price for 500 aspirin tablets is determined for each of twenty randomly selected stores
as part of a larger consumer study. The prices are as follows:
2.50 2.95 2.65 3.10 3.15 3.05 3.05 2.60 2.70 2.75
2.80 2.80 2.85 2.80 3.00 3.00 2.90 2.90 2.85 2.85
Group these data into seven classes and make a frequency distribution of the data containing
class boundaries, class width and class mark.
2.3 Histograms
A histogram is a graph that displays the classes on the horizontal axis and the frequencies of
the classes on the vertical axis. The frequency of each class is represented by a vertical bar
whose height is equal to the frequency of the class. A histogram is similar to a bar graph.
However, a histogram utilizes classes or intervals and frequencies while a bar graph utilizes
categories and frequencies.
A symmetric histogram is one that can be divided into two pieces such that each is the
mirror image of the other.
A skewed to the right histogram has a longer tail on the right side. The histogram shown
above is skewed to the right.
A skewed to the left histogram has a longer tail on the left side. The histogram shown
above is skewed to the left.
A cumulative frequency distribution gives the total number of values that fall below
various class boundaries of a frequency distribution. A cumulative relative frequency is
obtained by dividing a cumulative frequency by the total number of observations in the data
set. Cumulative percentages are obtained by multiplying cumulative relative frequencies
by 100.
Example 2.4.1
1. Make a cumulative and cumulative relative frequency distribution of the scores of the
80-item exam.
Answer :
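A cumulative distribution for the exam scores can be sketched in Python as below. The upper class limits 38, 48, ..., 88 are the ones obtained in Example 2.2.2; counting the scores that do not exceed each limit is the same as counting the values below the corresponding upper class boundary (38.5, 48.5, and so on). The variable names are assumptions for this illustration.

```python
# Scores of the 80-item exam from Example 2.2.2.
scores = [50, 65, 70, 35, 40, 57, 66, 65, 70, 35,
          29, 33, 44, 56, 66, 60, 44, 50, 58, 46,
          67, 78, 79, 47, 35, 36, 44, 57, 60, 57]

upper_limits = [38, 48, 58, 68, 78, 88]  # upper class limits from Example 2.2.2
n = len(scores)

# Number of scores falling below each upper class boundary (limit + 0.5).
cumulative = [sum(s <= u for s in scores) for u in upper_limits]
cumulative_relative = [c / n for c in cumulative]
cumulative_percent = [100 * r for r in cumulative_relative]

for u, c, r, p in zip(upper_limits, cumulative, cumulative_relative, cumulative_percent):
    print(f"less than {u + 0.5}: {c:>2}  {r:.3f}  {p:.1f}%")
```

The last cumulative frequency equals the number of observations, so the last cumulative relative frequency is 1.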
2.5 Ogives
An ogive is a graph in which a point is plotted above each class boundary at a height equal to
the cumulative frequency corresponding to that boundary. Ogives can also be constructed for
a cumulative relative frequency distribution as well as a cumulative percentage distribution.
The following graph is the ogive of the preceding data.
50 65 70 35 40 57 66 65 70 35
29 33 44 56 66 60 44 50 58 46
67 78 79 47 35 36 44 57 60 57
Thus, we have the stem-and-leaf display of the data. The first row represents the number 29,
the second row represents the numbers 33, 35, 35, 35, and 36, etc. The first column in the plot is
a cumulative frequency that starts at both ends of the data and meets in the middle. The row
that contains the median of the data is marked with parentheses around the count of
observations for that row. For the rows above the median, the number in the first column is
the number of items in that row plus the number of items in all the rows above. Rows below the
median are just the opposite.
1 2 9
6 3 3 5 5 5 6
12 4 0 4 4 4 6 7
(7) 5 0 0 6 7 7 7 8
11 6 0 0 5 5 6 6 7
4 7 0 0 8 9
Chapter 3
DESCRIPTIVE MEASURES
Mean
Definition 3.2.2
If the set of data x1, x2, . . . , xN , not necessarily all distinct, represents a finite population of size
N, then the population mean is
$\mu = \dfrac{\sum_{i=1}^{N} x_i}{N}$
If the set of data x1, x2, . . . , xn, not necessarily all distinct, represents a finite sample of size n,
then the sample mean is
$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$
Example 3.2.3
1. Compute the sample mean of the grades obtained by students in their preliminary
examination
80 87
92 90
95 83
83 88
Solution:
Since n = 8 and
$\sum_{i=1}^{8} x_i = x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8$
$= 80 + 92 + 95 + 83 + 87 + 90 + 83 + 88$
$= 698$
Thus, $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{\sum_{i=1}^{8} x_i}{8} = \dfrac{698}{8} = 87.25$
2. The following were the working hours of Mary on seventeen days of February: 8.76, 8.88,
9.2, 9.02, 7.99, 8.67, 9.21, 9.12, 8.89, 8.67, 8.76, 8.66, 8.00, 8.01, 8.10, 8.49, 9.19. Find the
mean for this sample of hours.
Solution:
Since n = 17 and
$\sum_{i=1}^{17} x_i = x_1 + x_2 + x_3 + \cdots + x_{16} + x_{17}$
$= 8.76 + 8.88 + 9.2 + \cdots + 8.49 + 9.19$
$= 147.62$
Thus, $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{\sum_{i=1}^{17} x_i}{17} = \dfrac{147.62}{17} = 8.68$
3. If a class of 40 students has a total preliminary grade of 3612, what is the population mean
of the grades?
Solution:
Given N = 40 and $\sum_{i=1}^{40} x_i = 3612$. Hence, $\mu = \dfrac{\sum_{i=1}^{40} x_i}{N} = \dfrac{3612}{40} = 90.30$
Median
Definition 3.2.4 The median of a set of observations arranged in an increasing or
decreasing order of magnitude is the middle value when the number of observations is odd or
the arithmetic mean of the two middle values when the number of observations is even.
Example 3.2.5
1. Compute the sample median of the grades obtained by students in their preliminary
examination
80 87
92 90
95 83
83 88
Solution:
Arranging the grades in an increasing order of magnitude, we get
80 83 83 87 88 90 92 95
Since the number of observations is even and the middle values are 87 and 88, the
sample median is $\tilde{x} = \dfrac{87 + 88}{2} = 87.5$
2. The following were the working hours of Mary on seventeen days of February: 8.76, 8.88,
9.2, 9.02, 7.99, 8.67, 9.21, 9.12, 8.89, 8.67, 8.76, 8.66, 8.00, 8.01, 8.10, 8.49, 9.19. Find the
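median for this sample of hours.
Solution:
Arranging the hours in an increasing order of magnitude, we get
7.99 8.00 8.01 8.10 8.49 8.66 8.67 8.67 8.76 8.76 8.88 8.89 9.02 9.12 9.19 9.2 9.21
Since the number of observations is odd (n = 17), the median is the 9th value in the ordered
list, so $\tilde{x} = 8.76$.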
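The mean and median computations in Examples 3.2.3 and 3.2.5 can be reproduced with Python's standard `statistics` module. This is a small sketch; the list names `grades` and `hours` are chosen here for illustration.

```python
from statistics import mean, median

# Preliminary examination grades and Mary's working hours from the examples.
grades = [80, 87, 92, 90, 95, 83, 83, 88]
hours = [8.76, 8.88, 9.2, 9.02, 7.99, 8.67, 9.21, 9.12, 8.89,
         8.67, 8.76, 8.66, 8.00, 8.01, 8.10, 8.49, 9.19]

grades_mean = mean(grades)      # sum of the grades divided by n = 8
grades_median = median(grades)  # average of the two middle values (n is even)
hours_mean = mean(hours)        # sum of the hours divided by n = 17
hours_median = median(hours)    # middle value of the sorted hours (n is odd)

print(grades_mean, grades_median)           # 87.25 87.5
print(round(hours_mean, 2), hours_median)   # 8.68 8.76
```

`median` sorts the data internally, so the list does not have to be arranged in increasing order first.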
Mode
Definition 3.2.6 The mode of a set of observations is that value which occurs most often or
with the greatest frequency.
Remark 3.2.7 The mode does not always exist. This is certainly true when all observations
occur with the same frequency. If no such value exists, we say that the data set has no mode.
For some sets of data there may be several values occurring with the greatest frequency, in
which case we have more than one mode. If two such values exist, we say the data set is
bimodal. If three such values exist, we say the data set is trimodal. There is no symbol that
is used to represent the mode.
Example 3.2.8
1. Find the mode of the grades obtained by students in their preliminary examination
80 87
92 90
95 83
83 88
Solution:
The value that occurs with the greatest frequency is 83. Thus, the mode is 83.
2. Find the mode in the working hours of Mary on seventeen days of February: 8.76, 8.88,
9.2, 9.02, 7.99, 8.67, 9.21, 9.12, 8.89, 8.67, 8.76, 8.66, 8.00, 8.01, 8.10, 8.49, 9.19.
Solution:
The values that occur with the greatest frequency are 8.67 and 8.76. Therefore, the
modes are 8.67 and 8.76. In this case the set of data has two modes and is called bimodal.
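Because a data set may have one mode, several modes, or none, a function that returns every most-frequent value is convenient. The sketch below uses `statistics.multimode` from the Python standard library (available from Python 3.8); the list names are assumptions for this illustration.

```python
from statistics import multimode

grades = [80, 87, 92, 90, 95, 83, 83, 88]
hours = [8.76, 8.88, 9.2, 9.02, 7.99, 8.67, 9.21, 9.12, 8.89,
         8.67, 8.76, 8.66, 8.00, 8.01, 8.10, 8.49, 9.19]

grade_modes = multimode(grades)        # every value with the highest frequency
hour_modes = sorted(multimode(hours))  # sorted only to make the output easy to read

print(grade_modes)  # [83]
print(hour_modes)   # [8.67, 8.76]
```

Note that when all observations occur equally often, `multimode` returns all of them, which corresponds to the "no mode" case described in Remark 3.2.7.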
Example 3.3.3
1. Find the range of the grades obtained by students in their preliminary examination
80 87
92 90
95 83
83 88
Solution:
The maximum value is 95 and the minimum value is 80. Thus the range is 95 − 80 = 15.
2. Find the range in the working hours of Mary on seventeen days of February: 8.76, 8.88,
9.2, 9.02, 7.99, 8.67, 9.21, 9.12, 8.89, 8.67, 8.76, 8.66, 8.00, 8.01, 8.10, 8.49, 9.19.
Solution:
The maximum value is 9.21 and the minimum value is 7.99. Thus the range is 9.21 − 7.99 = 1.22
Example 3.3.5
1. Find the variance and sample standard deviation of the grades obtained by students in their
preliminary examination
80 87
92 90
95 83
83 88
Solution:
xi xi2
80 6400
83 6889
83 6889
87 7569
88 7744
90 8100
92 8464
95 9025
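$\sum_{i=1}^{8} x_i = 698$, $\sum_{i=1}^{8} x_i^2 = 61080$
Since n = 8, $\sum_{i=1}^{8} x_i = 698$, and $\sum_{i=1}^{8} x_i^2 = 61080$, the variance is
$s^2 = \dfrac{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n-1)}$
$= \dfrac{8\sum_{i=1}^{8} x_i^2 - \left(\sum_{i=1}^{8} x_i\right)^2}{8(8-1)}$
$= \dfrac{8(61080) - (698)^2}{8(7)}$
$= \dfrac{488640 - 487204}{56}$
$= \dfrac{1436}{56}$
$= 25.64$
and so the standard deviation is $s = \sqrt{s^2} = \sqrt{25.64} = 5.06$.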
2. Find the variance and standard deviation of the working hours of Mary on seventeen days
of February: 8.76, 8.88, 9.2, 9.02, 7.99, 8.67, 9.21, 9.12, 8.89, 8.67, 8.76, 8.66, 8.00, 8.01,
8.10, 8.49, 9.19.
Solution:
xi xi2
8.76 76.7376
8.88 78.8544
9.2 84.64
9.02 81.3604
7.99 63.8401
8.67 75.1689
9.21 84.8241
9.12 83.1744
8.89 79.0321
8.67 75.1689
8.76 76.7376
8.66 74.9956
8.00 64.00
8.01 64.1601
8.10 65.61
8.49 72.0801
9.19 84.4561
$\sum_{i=1}^{17} x_i = 147.62$, $\sum_{i=1}^{17} x_i^2 = 1284.8404$
Since n = 17, $\sum_{i=1}^{17} x_i = 147.62$, and $\sum_{i=1}^{17} x_i^2 = 1284.8404$, the variance is
$s^2 = \dfrac{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n-1)}$
$= \dfrac{17\sum_{i=1}^{17} x_i^2 - \left(\sum_{i=1}^{17} x_i\right)^2}{17(17-1)}$
$= \dfrac{17(1284.8404) - (147.62)^2}{17(16)}$
$= \dfrac{21842.2868 - 21791.6644}{272}$
$= \dfrac{50.6224}{272}$
$= 0.19$
and so the standard deviation is $s = \sqrt{s^2} = \sqrt{0.19} = 0.44$.
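The computational formula for the sample variance used in Example 3.3.5 can be written as a short Python function. This is a sketch: the name `shortcut_variance` is hypothetical, and `statistics.variance` is used only as an independent cross-check of the result.

```python
import math
from statistics import variance


def shortcut_variance(data):
    """Sample variance via [n * sum(x^2) - (sum(x))^2] / [n * (n - 1)]."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))


grades = [80, 87, 92, 90, 95, 83, 83, 88]

s2 = shortcut_variance(grades)  # (8 * 61080 - 698^2) / 56
s = math.sqrt(s2)               # sample standard deviation

print(round(s2, 2), round(s, 2))          # 25.64 5.06
print(abs(s2 - variance(grades)) < 1e-9)  # True: both formulas agree
```

The shortcut formula and the definition based on squared deviations from the mean give the same value; the shortcut form simply avoids computing the mean first.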
Example 3.4.1
1. Compute the mean of the grouped data
Mary’s Working Hours frequency Class Boundaries Class Width Class Mark
7.99-8.23 4 7.985-8.235 0.25 8.11
8.24-8.48 1 8.235-8.485 0.25 8.36
8.49-8.73 3 8.485-8.735 0.25 8.61
8.74-8.98 4 8.735-8.985 0.25 8.86
8.99-9.23 5 8.985-9.235 0.25 9.11
Solution:
The sample size n = 17 and the number of classes is m = 5. The class marks in the table are
c1 = 8.11, c2 = 8.36, c3 = 8.61 c4 = 8.86 c5 = 9.11 and the frequencies are f1 = 4, f2 = 1, f3 = 3,
f4 = 4, f5 = 5. Therefore the mean is
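$\bar{x} = \dfrac{\sum_{i=1}^{m} c_i f_i}{n}$
$= \dfrac{\sum_{i=1}^{5} c_i f_i}{17}$
$= \dfrac{c_1 f_1 + c_2 f_2 + c_3 f_3 + c_4 f_4 + c_5 f_5}{17}$
$= \dfrac{(8.11)(4) + (8.36)(1) + (8.61)(3) + (8.86)(4) + (9.11)(5)}{17}$
$= \dfrac{32.44 + 8.36 + 25.83 + 35.44 + 45.55}{17}$
$= \dfrac{147.62}{17}$
$= 8.68$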
Median
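The median for grouped data is found by locating the value that divides the data into two
equal parts. It is given by
$\tilde{x} = a + \dfrac{b}{f}\left(\dfrac{n}{2} - c\right)$
where a is the lower boundary of the median class, b is the class width, f is the frequency of the
median class, n is the number of observations, and c is the cumulative frequency of the classes
before the median class.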
Example 3.4.2
1. Find the median of the grouped data
Mary’s Working Hours frequency Class Boundaries Class Width Class Mark
7.99-8.23 4 7.985-8.235 0.25 8.11
8.24-8.48 1 8.235-8.485 0.25 8.36
8.49-8.73 3 8.485-8.735 0.25 8.61
8.74-8.98 4 8.735-8.985 0.25 8.86
8.99-9.23 5 8.985-9.235 0.25 9.11
Solution:
The median hour for the data in the table is a value such that 8.5 of the observations are less than
the value and 8.5 are greater than the value. The median hour must occur in the class 8.74-8.98, and
thus 8.74-8.98 is called the median class. Thus,
$\tilde{x} = a + \dfrac{b}{f}\left(\dfrac{n}{2} - c\right)$
$= 8.735 + \dfrac{0.25}{4}\left(\dfrac{17}{2} - 8\right)$
$= 8.735 + 0.0625\left(\dfrac{1}{2}\right)$
$= 8.735 + 0.03125$
$= 8.77$
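The grouped-median computation above can be sketched in Python. The function name `grouped_median` is hypothetical; it takes the median class to be the first class whose cumulative frequency reaches n/2, which matches the choice of the class 8.74-8.98 in the worked solution.

```python
def grouped_median(boundaries, freqs):
    """Median of grouped data: a + (b / f) * (n / 2 - c).

    boundaries: list of (lower boundary, upper boundary) per class
    freqs:      list of class frequencies
    """
    n = sum(freqs)
    cum_before = 0  # c: cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(boundaries, freqs):
        if cum_before + f >= n / 2:       # first class that reaches n/2
            width = upper - lower         # b: class width
            return lower + width / f * (n / 2 - cum_before)
        cum_before += f
    raise ValueError("no observations")


# Class boundaries and frequencies of Mary's working hours (Example 3.4.2).
boundaries = [(7.985, 8.235), (8.235, 8.485), (8.485, 8.735),
              (8.735, 8.985), (8.985, 9.235)]
freqs = [4, 1, 3, 4, 5]

x_tilde = grouped_median(boundaries, freqs)
print(round(x_tilde, 2))  # 8.77
```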
Mode
The modal class is defined to be the class with the maximum frequency. The mode for
grouped data is defined to be the class mark of the modal class.
Example 3.4.3
1. Find the mode of the grouped data.
Grades frequency Class Boundaries Class Width Class Mark
80-85 3 79.5-85.5 6 82.5
86-91 3 85.5-91.5 6 88.5
92-97 2 91.5-97.5 6 94.5
Solution:
The classes with the highest frequency are 80-85 and 86-91, and their class marks are 82.5 and
88.5. Thus, the modes are 82.5 and 88.5. The grouped data is bimodal.
2. Find the mode of the grouped data.
Mary’s Working Hours frequency Class Boundaries Class Width Class Mark
7.99-8.23 4 7.985-8.235 0.25 8.11
8.24-8.48 1 8.235-8.485 0.25 8.36
8.49-8.73 3 8.485-8.735 0.25 8.61
8.74-8.98 4 8.735-8.985 0.25 8.86
8.99-9.23 5 8.985-9.235 0.25 9.11
Solution:
The class with the highest frequency is 8.99-9.23 and its class mark is 9.11. Therefore, the
mode is 9.11.
Range
The range for grouped data is the difference between the upper boundary of the
class having the largest values and the lower boundary of the class having the smallest
values.
Example 3.5.1
1. Find the range of the grouped data.
Mary’s Working Hours frequency Class Boundaries Class Width Class Mark
7.99-8.23 4 7.985-8.235 0.25 8.11
8.24-8.48 1 8.235-8.485 0.25 8.36
8.49-8.73 3 8.485-8.735 0.25 8.61
8.74-8.98 4 8.735-8.985 0.25 8.86
8.99-9.23 5 8.985-9.235 0.25 9.11
Solution:
The upper boundary of the class having the maximum value is 9.235 and the lower boundary of
the class having minimum value is 7.985. Therefore the range is 9.235 − 7.985 = 1.25.
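The variance for grouped data is given by
$s^2 = \dfrac{n\sum_{i=1}^{m} c_i^2 f_i - \left(\sum_{i=1}^{m} c_i f_i\right)^2}{n(n-1)}$
where $c_i$ are class marks and $f_i$ are class frequencies, and the standard deviation is given by
$s = \sqrt{s^2}$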
Example 3.5.2
1. Find the variance and standard deviation of the grouped data.
$\sum_{i=1}^{m} c_i^2 f_i = \sum_{i=1}^{3} c_i^2 f_i$
$= c_1^2 f_1 + c_2^2 f_2 + c_3^2 f_3$
$= (82.5)^2(3) + (88.5)^2(3) + (94.5)^2(2)$
$= (6806.25)(3) + (7832.25)(3) + (8930.25)(2)$
$= 20418.75 + 23496.75 + 17860.5$
$= 61776$
$\sum_{i=1}^{m} c_i f_i = \sum_{i=1}^{3} c_i f_i$
$= c_1 f_1 + c_2 f_2 + c_3 f_3$
$= (82.5)(3) + (88.5)(3) + (94.5)(2)$
$= 247.5 + 265.5 + 189$
$= 702$
thus
$s^2 = \dfrac{n\sum_{i=1}^{m} c_i^2 f_i - \left(\sum_{i=1}^{m} c_i f_i\right)^2}{n(n-1)}$
$= \dfrac{8(61776) - (702)^2}{8(8-1)}$
$= \dfrac{494208 - 492804}{8(7)}$
$= \dfrac{1404}{56}$
$= 25.07$
and so $s = \sqrt{s^2} = \sqrt{25.07} = 5.01$
Mary’s Working Hours frequency Class Boundaries Class Width Class Mark
7.99-8.23 4 7.985-8.235 0.25 8.11
8.24-8.48 1 8.235-8.485 0.25 8.36
8.49-8.73 3 8.485-8.735 0.25 8.61
8.74-8.98 4 8.735-8.985 0.25 8.86
8.99-9.23 5 8.985-9.235 0.25 9.11
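Solution:
Note that n = 17 and m = 5
$\sum_{i=1}^{m} c_i^2 f_i = \sum_{i=1}^{5} c_i^2 f_i$
$= (8.11)^2(4) + (8.36)^2(1) + (8.61)^2(3) + (8.86)^2(4) + (9.11)^2(5)$
$= 263.0884 + 69.8896 + 222.3963 + 313.9984 + 414.9605$
$= 1284.3332$
$\sum_{i=1}^{m} c_i f_i = \sum_{i=1}^{5} c_i f_i$
$= c_1 f_1 + c_2 f_2 + c_3 f_3 + c_4 f_4 + c_5 f_5$
$= (8.11)(4) + (8.36)(1) + (8.61)(3) + (8.86)(4) + (9.11)(5)$
$= 32.44 + 8.36 + 25.83 + 35.44 + 45.55$
$= 147.62$
thus
$s^2 = \dfrac{n\sum_{i=1}^{m} c_i^2 f_i - \left(\sum_{i=1}^{m} c_i f_i\right)^2}{n(n-1)}$
$= \dfrac{17(1284.3332) - (147.62)^2}{17(17-1)}$
$= \dfrac{21833.6644 - 21791.6644}{17(16)}$
$= \dfrac{42.00}{272}$
$= 0.15$
and so $s = \sqrt{s^2} = \sqrt{0.15} = 0.39$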
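The grouped mean and grouped variance of Mary's working hours can be recomputed from the class marks and frequencies with the Python sketch below. The function name `grouped_mean_variance` is hypothetical; the formulas are the ones used in the worked solutions.

```python
import math


def grouped_mean_variance(marks, freqs):
    """Return (mean, variance) of grouped data from class marks and frequencies."""
    n = sum(freqs)
    sum_cf = sum(c * f for c, f in zip(marks, freqs))       # sum of c_i * f_i
    sum_c2f = sum(c * c * f for c, f in zip(marks, freqs))  # sum of c_i^2 * f_i
    mean = sum_cf / n
    var = (n * sum_c2f - sum_cf ** 2) / (n * (n - 1))
    return mean, var


# Class marks and frequencies of Mary's working hours.
marks = [8.11, 8.36, 8.61, 8.86, 9.11]
freqs = [4, 1, 3, 4, 5]

mean, var = grouped_mean_variance(marks, freqs)
sd = math.sqrt(var)
print(round(mean, 2), round(var, 2), round(sd, 2))  # 8.68 0.15 0.39
```

Rounding is applied only when printing, so the intermediate sums keep full precision.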
Equivalence: If $\bar{x}$ is the mean and s is the standard deviation of a set of data, then at least
$\dfrac{k^2 - 1}{k^2}$ of the data set will fall between $\bar{x} - ks$ and $\bar{x} + ks$.
If k = 2, then at least $\dfrac{3}{4}$ or 75% of the data will fall between $\bar{x} - 2s$ and $\bar{x} + 2s$.
If k = 3, then at least $\dfrac{8}{9}$ or 88.89% of the data will fall between $\bar{x} - 3s$ and $\bar{x} + 3s$.
If k = 4, then at least $\dfrac{15}{16}$ or 93.75% of the data will fall between $\bar{x} - 4s$ and $\bar{x} + 4s$.
If k = 5, then at least $\dfrac{24}{25}$ or 96% of the data will fall between $\bar{x} - 5s$ and $\bar{x} + 5s$.
Example 3.6.1
1. If the IQs of a random sample of 1080 students at a large university have a mean score of
120 and a standard deviation of 8, use Chebyshev’s theorem to determine the interval
containing at least 810 of the IQs in the sample.
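Solution:
Given: $\bar{x} = 120$, s = 8
Since $\dfrac{810}{1080} = 0.75 = 75\%$, which corresponds to k = 2, 75% of the IQ scores will fall between
$\bar{x} - 2s = 120 - 2(8) = 104$ and $\bar{x} + 2s = 120 + 2(8) = 136$.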
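Chebyshev's theorem can be applied in both directions: from k to the guaranteed fraction, and from a required fraction back to k and the interval. The Python sketch below does both; the function names are hypothetical, and solving (k² − 1)/k² = fraction for k gives k = 1/√(1 − fraction).

```python
import math


def chebyshev_fraction(k):
    """Minimum fraction of the data within k standard deviations of the mean."""
    return (k * k - 1) / (k * k)


def chebyshev_interval(mean, sd, fraction):
    """Solve (k^2 - 1) / k^2 = fraction for k, then return (k, lower, upper)."""
    k = 1 / math.sqrt(1 - fraction)
    return k, mean - k * sd, mean + k * sd


# Example 3.6.1: at least 810 of 1080 IQ scores, with mean 120 and standard deviation 8.
k, low, high = chebyshev_interval(120, 8, 810 / 1080)
print(k, low, high)  # 2.0 104.0 136.0

for kk in (2, 3, 4, 5):
    print(kk, round(100 * chebyshev_fraction(kk), 2))  # 75.0, 88.89, 93.75, 96.0
```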
3.7 z Score
z Score: An observation x from a data set with mean µ or $\bar{x}$ and standard deviation σ or s has a z
score or z value defined by
$z = \dfrac{x - \mu}{\sigma}$ or $z = \dfrac{x - \bar{x}}{s}$
A z score measures how many standard deviations an observation is above or below the mean.
A positive z score measures the number of standard deviations an observation is above the
mean and a negative z score gives the number of standard deviations an observation is below
the mean.
Example 3.7.1
1. Maria’s grade in Math is 82 and 89 in English. If the class mean grade in Math was 68 and
standard deviation was 8 while the grades in English had a mean score of 80 and a standard
deviation of 6, can we conclude that Maria is a better student in English than Math?
Solution:
Math: x = 82, $\bar{x} = 68$, s = 8
$z = \dfrac{x - \bar{x}}{s} = \dfrac{82 - 68}{8} = \dfrac{14}{8} = 1.75$
English: x = 89, $\bar{x} = 80$, s = 6
$z = \dfrac{x - \bar{x}}{s} = \dfrac{89 - 80}{6} = \dfrac{9}{6} = 1.5$
Since Maria's z score in Math is greater than her z score in English, Maria is a better student
in Math than in English.
2. Two soap companies argued about which of their powdered soaps dissolves more quickly
and efficiently. In an actual demo, both soap A from company A and soap B from company B
had a dissolving time of 9 min. If the mean dissolving time of all soaps from company A was
10.0 min with a standard deviation of 5.25, while the dissolving time of all soaps from company B
had a mean of 11.5 min and a standard deviation of 3, can we conclude that soaps from
company A dissolve quicker than soaps from company B?
Solution:
Soap A: x = 9, $\bar{x} = 10.0$, s = 5.25
$z = \dfrac{x - \bar{x}}{s} = \dfrac{9 - 10.0}{5.25} = -\dfrac{1}{5.25} = -0.19$
Soap B: x = 9, $\bar{x} = 11.5$, s = 3
$z = \dfrac{x - \bar{x}}{s} = \dfrac{9 - 11.5}{3} = -\dfrac{2.5}{3} = -0.83$
Since the z score of soap B is less than that of soap A, soaps from company B dissolve quicker
than soaps from company A.
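The z score comparisons in Example 3.7.1 amount to one small function applied to each observation. A minimal Python sketch follows; the function name `z_score` is an assumption for this illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Maria's grades: Math (82, class mean 68, s = 8) and English (89, class mean 80, s = 6).
z_math = z_score(82, 68, 8)
z_english = z_score(89, 80, 6)
print(z_math, z_english)  # 1.75 1.5

# Soaps: both dissolve in 9 min; company A (mean 10.0, s = 5.25), company B (mean 11.5, s = 3).
z_a = z_score(9, 10.0, 5.25)
z_b = z_score(9, 11.5, 3)
print(round(z_a, 2), round(z_b, 2))  # -0.19 -0.83
```

Because z scores are unit-free, observations from distributions with different means and spreads can be compared directly.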
Example 3.8.1
1. A national sampling of prices for new and used motorcycles found that the mean price for
a new motorcycle is 60,100 and the standard deviation is 6125 and that the mean price for a
used motorcycle is 25485 with a standard deviation equal to 2630. Compute their CVs.
Solution:
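New Motorcycle: $\bar{x} = 60100$, s = 6125
$CV = \dfrac{s}{\bar{x}} \times 100\% = \dfrac{6125}{60100} \times 100\% = 0.1019 \times 100\% = 10.19\%$
Used Motorcycle: $\bar{x} = 25485$, s = 2630
$CV = \dfrac{s}{\bar{x}} \times 100\% = \dfrac{2630}{25485} \times 100\% = 0.1032 \times 100\% = 10.32\%$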
2. The mean dissolving time of all powdered soaps from company A was 10.0 min and the
standard deviation was 5.25, while the dissolving time of all powdered soaps from company B had a
mean of 11.5 min and a standard deviation of 3. Compute their CVs.
Solution:
Soaps from Company A: $\bar{x} = 10.0$, s = 5.25
$CV = \dfrac{s}{\bar{x}} \times 100\% = \dfrac{5.25}{10.0} \times 100\% = 0.5250 \times 100\% = 52.50\%$
Soaps from Company B: $\bar{x} = 11.5$, s = 3
$CV = \dfrac{s}{\bar{x}} \times 100\% = \dfrac{3}{11.5} \times 100\% = 0.2609 \times 100\% = 26.09\%$
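The coefficient of variation used in Example 3.8.1 expresses the standard deviation as a percentage of the mean. A brief Python sketch, with the hypothetical function name `coefficient_of_variation`:

```python
def coefficient_of_variation(sd, mean):
    """CV = (s / mean) * 100, expressed in percent."""
    return sd / mean * 100


cv_new = coefficient_of_variation(6125, 60100)     # new motorcycles
cv_soap_a = coefficient_of_variation(5.25, 10.0)   # soaps from company A
cv_soap_b = coefficient_of_variation(3, 11.5)      # soaps from company B

print(round(cv_new, 2), round(cv_soap_a, 2), round(cv_soap_b, 2))  # 10.19 52.5 26.09
```

A larger CV means more variability relative to the mean, which is why soaps from company A (52.50%) vary relatively more than soaps from company B (26.09%).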
3.9 Pearsonian Coefficient of Skewness
The Pearsonian Coefficient of Skewness is given by
$SK = \dfrac{3(\bar{x} - \tilde{x})}{s}$ or $SK = \dfrac{3(\mu - \tilde{\mu})}{\sigma}$
Example 3.9.1
1. Compute the Pearsonian Coefficient of Skewness of the grades obtained by students in their
preliminary examination
80 87
92 90
95 83
83 88
Solution:
Note that $\bar{x} = 87.25$, $\tilde{x} = 87.5$, s = 5.06. Therefore,
$SK = \dfrac{3(\bar{x} - \tilde{x})}{s}$
$= \dfrac{3(87.25 - 87.5)}{5.06}$
$= \dfrac{3(-0.25)}{5.06}$
$= -\dfrac{0.75}{5.06}$
$= -0.15$
2. Compute the Pearsonian Coefficient of Skewness of the grouped data
Mary’s Working Hours frequency Class Boundaries Class Width Class Mark
7.99-8.23 4 7.985-8.235 0.25 8.11
8.24-8.48 1 8.235-8.485 0.25 8.36
8.49-8.73 3 8.485-8.735 0.25 8.61
8.74-8.98 4 8.735-8.985 0.25 8.86
8.99-9.23 5 8.985-9.235 0.25 9.11
Solution:
Note that $\bar{x} = 8.68$, $\tilde{x} = 8.77$, s = 0.39. Therefore,
$SK = \dfrac{3(\bar{x} - \tilde{x})}{s}$
$= \dfrac{3(8.68 - 8.77)}{0.39}$
$= \dfrac{3(-0.09)}{0.39}$
$= -\dfrac{0.27}{0.39}$
$= -0.69$
Equivalence:
68% of the data will fall between x − s and x + s
95% of the data will fall between x − 2s and x + 2s
99.70% of the data will fall between x − 3s and x + 3s
Example 3.10.1
1. Assuming the incomes for all households last year produced a bell-shaped distribution
with a mean equal to 200,000 and a standard deviation equal to 56,540, deduce an
approximation based on empirical rule.
Solution:
Given: $\bar{x} = 200,000$, s = 56,540
68% of the incomes will fall between
$(\bar{x} - s, \bar{x} + s)$ = (200,000 − 56,540, 200,000 + 56,540)
= (143,460, 256,540)
95% of the incomes will fall between
$(\bar{x} - 2s, \bar{x} + 2s)$ = (200,000 − 2(56,540), 200,000 + 2(56,540))
= (200,000 − 113,080, 200,000 + 113,080)
= (86,920, 313,080)
99.70% of the incomes will fall between
$(\bar{x} - 3s, \bar{x} + 3s)$ = (200,000 − 3(56,540), 200,000 + 3(56,540))
= (200,000 − 169,620, 200,000 + 169,620)
= (30,380, 369,620)
2. Deduce an approximation based on empirical rule of the grades obtained by students in
their preliminary examination assuming the distribution is bell-shaped
80 87
92 90
95 83
83 88
Solution:
Given: $\bar{x} = 87.25$, s = 5.06.
68% of the grades will fall between
$(\bar{x} - s, \bar{x} + s)$ = (87.25 − 5.06, 87.25 + 5.06)
= (82.19, 92.31)
95% of the grades will fall between
$(\bar{x} - 2s, \bar{x} + 2s)$ = (87.25 − 2(5.06), 87.25 + 2(5.06))
= (87.25 − 10.12, 87.25 + 10.12)
= (77.13, 97.37)
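The empirical-rule intervals are the same computation repeated for k = 1, 2, 3, so they are easy to generate in a loop. The Python sketch below uses the grades example (mean 87.25, s = 5.06); the function name `empirical_rule` is hypothetical, and the percentages are the approximate ones quoted for bell-shaped distributions.

```python
def empirical_rule(mean, sd):
    """Approximate coverage intervals for a bell-shaped distribution."""
    coverage = {1: "68%", 2: "95%", 3: "99.70%"}
    return {pct: (round(mean - k * sd, 2), round(mean + k * sd, 2))
            for k, pct in coverage.items()}


intervals = empirical_rule(87.25, 5.06)
for pct, (low, high) in intervals.items():
    print(f"{pct} of the grades fall between {low} and {high}")
```

For the grades this reproduces the 68% and 95% intervals computed above and adds the interval for three standard deviations, (72.07, 102.43).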
Percentile
Definition 3.11.1 Percentiles are values that divide a set of observations into 100 equal
parts. These values, denoted by P1, P2, . . . P99, are such that 1% of the data falls below P1, 2%
falls below P2, . . . and 99% falls below P99.
Decile
Definition 3.11.2 Deciles are values that divide a set of observations into 10 equal parts.
These values, denoted by D1, D2, . . . D9, are such that 10% of the data falls below D1, 20%
falls below D2, . . . and 90% falls below D9.
Quartile
Definition 3.11.3 Quartiles are values that divide a set of observations into 4 equal parts.
These values, denoted by Q1, Q2 and Q3, are such that 25% of the data falls below Q1, 50% falls
below Q2 and 75% falls below Q3.
Equivalence:
Decile–Percentile
D1 = P10, D2 = P20, D3 = P30, D4 = P40, D5 = P50, D6 = P60, D7 = P70, D8 = P80, D9 = P90
Quartile–Percentile
Q1 = P25, Q2 = P50, Q3 = P75
Example 3.11.4
1. Given the grades obtained by students in their preliminary examination
80 87
92 90
95 83
83 88
Compute the following
a. P15
b. D2
c. Q3
Solution:
Arranging the grades in an increasing order, we get
80 83 83 87 88 90 92 95
Note that n = 8.
a. Solve for P15.
Compute the index $i = \dfrac{(p)(n)}{100} = \dfrac{15(8)}{100} = \dfrac{120}{100} = 1.2$.
Since i = 1.2 is not an integer, round up to i = 2. Thus, P15 is the 2nd observation.
Therefore, P15 = 83.
b. Solve for D2.
Since D2 = P20, we compute P20.
Compute the index $i = \dfrac{(p)(n)}{100} = \dfrac{20(8)}{100} = \dfrac{160}{100} = 1.6$.
Since i = 1.6 is not an integer, round up to i = 2. Thus, P20 is also the 2nd observation.
Therefore, P20 = 83.
c. Solve for Q3.
Since Q3 = P75, we compute P75.
Compute the index $i = \dfrac{(p)(n)}{100} = \dfrac{75(8)}{100} = \dfrac{600}{100} = 6$.
Since i = 6 is an integer, Q3 is the average of the 6th and 7th observations.
Therefore, $Q_3 = \dfrac{90 + 92}{2} = \dfrac{182}{2} = 91$.
2. The following were the working hours of Mary on seventeen days of February: 8.76, 8.88,
9.2, 9.02, 7.99, 8.67, 9.21, 9.12, 8.89, 8.67, 8.76, 8.66, 8.00, 8.01, 8.10, 8.49, 9.19. Compute
the following
a. D7
b. Q1
c. P67
d. P83
e. P98
Solution:
Arranging the hours in an increasing order, we get
7.99 8.00 8.01 8.10 8.49 8.66 8.67 8.67 8.76 8.76 8.88 8.89 9.02 9.12 9.19 9.2 9.21
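The percentile rule used in Example 3.11.4 (compute i = pn/100; if i is not an integer, round up and take that observation; if i is an integer, average the i-th and (i+1)-th observations) can be sketched in Python. The function name `percentile` is hypothetical, and the sketch assumes p is a whole number from 1 to 99. Other textbooks and software use different percentile conventions, so results may differ slightly from library functions.

```python
def percentile(data, p):
    """p-th percentile by the index rule i = p * n / 100 (p an integer, 1..99)."""
    ordered = sorted(data)
    # divmod keeps the arithmetic in integers: i = q + r / 100
    q, r = divmod(p * len(ordered), 100)
    if r:                  # i is not an integer: round up to observation q + 1
        return ordered[q]  # 0-based index q is the (q + 1)-th observation
    # i is an integer: average the i-th and (i + 1)-th observations
    return (ordered[q - 1] + ordered[q]) / 2


grades = [80, 87, 92, 90, 95, 83, 83, 88]
p15 = percentile(grades, 15)  # i = 1.2 -> 2nd observation
d2 = percentile(grades, 20)   # D2 = P20, i = 1.6 -> 2nd observation
q3 = percentile(grades, 75)   # Q3 = P75, i = 6 -> average of 6th and 7th
print(p15, d2, q3)  # 83 83 91.0
```

Deciles and quartiles are obtained through the equivalences D2 = P20, Q3 = P75, and so on, so one function covers all three measures.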
Chapter 4
COUNTING AND PROBABILITY
Example 4.1.2
1. Experiment: Tossing a coin
Let H = head, T = tail
S = {H, T}
n(S) = 2
3. Experiment: Tossing 2 coins
Let H = head, T = tail
S = {HH, HT, TH, TT}
n(S) = 4
Event of having the same results
E = {HH, TT}
n(E) = 2
4. Experiment: Tossing 3 coins
Let H = head, T = tail
S = {HHH, HHT, HTT, HTH, THH, THT, TTH, TTT}
n(S) = 8
Event that at least 2 heads occur
E = {HHH, HHT, HTH, THH}
n(E) = 4
Event of having the same results
E = {HHH, TTT}
n(E) = 2
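The sample spaces and events in this example can also be listed by machine. The following is a minimal sketch, assuming outcomes are written as strings of H and T; the helper name sample_space is hypothetical.

```python
from itertools import product

def sample_space(n_coins):
    """List every outcome of tossing n_coins coins, e.g. 'HHT'."""
    return ["".join(outcome) for outcome in product("HT", repeat=n_coins)]

S = sample_space(3)
# Event that at least 2 heads occur.
at_least_two_heads = [s for s in S if s.count("H") >= 2]
# Event of having the same results on every coin.
same_results = [s for s in S if len(set(s)) == 1]

print(len(S), at_least_two_heads, same_results)
```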
Example 4.2.1
1. How many sample points are in the sample space when a pair of dice is thrown?
Solution:
Let
n1 = number of possible outcomes when the 1st die is thrown,
n2 = number of possible outcomes when the 2nd die is thrown.
There are 6 possible outcomes when the 1st die is thrown and there are also 6 possible outcomes
when the 2nd die is thrown. Thus, there are n1 · n2 = 6 · 6 = 36 sample points.
2. How many lunches are possible consisting of soup, a sandwich, dessert, and a drink if one
can select from 4 soups, 3 kinds of sandwiches, 5 desserts and 4 drinks?
Solution:
There are
n1 = 4 ways to select soups,
n2 = 3 ways to select sandwiches,
n3 = 5 ways to select desserts and
n4 = 4 ways to select drinks.
Therefore, there are n1 · n2 · n3 · n4 = 4 · 3 · 5 · 4 = 240 ways to select different lunches.
3. How many even three-digit numbers can be formed from the digits 1, 2, 5, 6 and 9 if each
digit can be used only once?
Solution:
Let
n1 = number of ways to select the hundreds’ digit,
n2 = number of ways to select the tens’ digit,
n3 = number of ways to select the ones’ digit.
We want the number to be even, so the ones’ digit must be 2 or 6 and n3 = 2. Since there are 5
digits, one has already been used for the ones’ digit and each digit can be used only once, n2 = 4
and then n1 = 3. Therefore, there are n1 · n2 · n3 = 3 · 4 · 2 = 24 even three-digit numbers that can
be formed from the digits 1, 2, 5, 6 and 9 if each digit can be used only once.
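The three counts in this example can be checked by brute-force enumeration. The sketch below is illustrative only; the variable names are assumptions of the sketch.

```python
from itertools import permutations, product

# Item 1: a pair of dice, 6 possible outcomes for each die.
dice = list(product(range(1, 7), repeat=2))

# Item 2: one soup, one sandwich, one dessert and one drink.
lunches = 4 * 3 * 5 * 4

# Item 3: even three-digit numbers from 1, 2, 5, 6, 9 without repeating a digit.
digits = [1, 2, 5, 6, 9]
even_numbers = [
    100 * hundreds + 10 * tens + ones
    for hundreds, tens, ones in permutations(digits, 3)
    if ones % 2 == 0
]

print(len(dice), lunches, len(even_numbers))
```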
Example 4.2.2
1. How many distinct arrangements of 5 persons on 5 chairs are there?
Solution:
There are n! = 5! = 5 · 4 · 3 · 2 · 1 = 120 distinct arrangements of 5 persons on 5 chairs.
3. How many possible arrangements of the letters of the word “LOGARITHM” are there if it starts
with a vowel and ends with a consonant?
Solution:
There are n1 · n2 · n3 · n4 · n5 · n6 · n7 · n8 · n9 possible arrangements. There are n1 = 3 ways
to select the first letter from the vowels O, A and I, and n9 = 6 ways to select a consonant for the
last letter. Moreover, there are n2 · n3 · n4 · n5 · n6 · n7 · n8 = 7! arrangements of the 7 remaining
letters in a row.
Therefore, there are n1 · n2 · n3 · n4 · n5 · n6 · n7 · n8 · n9 = 3 · 7! · 6 = 90720 possible arrangements of the
letters of the word “LOGARITHM” if it starts with a vowel and ends with a consonant.
2. Two lottery tickets are drawn from 20 for first and second prizes. Find the number of
sample points in the sample space.
Solution:
There are 20P2 = 20!/(20 − 2)! = 20!/18! = 20 · 19 = 380 sample points in the sample space.
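The factorial and permutation counts above can be verified with Python's standard library. This is a sketch under the assumption that Python 3.8 or later is available (math.perm was added in that version); the LOGARITHM count is confirmed by enumerating all 9! arrangements.

```python
from itertools import permutations
from math import factorial, perm

# 5 persons on 5 chairs: 5! arrangements.
chairs = factorial(5)

# Arrangements of LOGARITHM that start with a vowel and end with a consonant.
vowels = set("AEIOU")
logarithm = sum(
    1
    for arrangement in permutations("LOGARITHM")
    if arrangement[0] in vowels and arrangement[-1] not in vowels
)

# Two tickets drawn from 20 for first and second prizes: 20P2.
tickets = perm(20, 2)

print(chairs, logarithm, tickets)
```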
6. The number of combinations of n distinct objects taken r at a time is nCr = n!/(r!(n − r)!).
Example 4.2.6
1. How many combinations of the letters A, B and C are there, taken 1, 2 or 3 at a time?
Solution:
Taken 1 at a time
3C1 = 3!/(1!(3 − 1)!) = 3!/2! = 3
Therefore, there are 3 combinations of the letters A, B and C taken 1 at a time.
Taken 2 at a time
3C2 = 3!/(2!(3 − 2)!) = 3!/2! = 3
Therefore, there are 3 combinations of the letters A, B and C taken 2 at a time.
Taken 3 at a time
3C3 = 3!/(3!(3 − 3)!) = 3!/3! = 1
Therefore, there is 1 combination of the letters A, B and C taken 3 at a time.
2. From 5 SOE students, 4 SBA students and 3 SAS students find the number of committees
of 6 persons that can be formed with 3 SOE, 2 SBA and 1 SAS students.
Solution:
There are
n1 = 5C3 = 5!/(3!(5 − 3)!) = 10 ways to select from SOE students,
n2 = 4C2 = 4!/(2!(4 − 2)!) = 6 ways to select from SBA students and
n3 = 3C1 = 3!/(1!(3 − 1)!) = 3 ways to select from SAS students.
Therefore, there are
n1 · n2 · n3 = 10 · 6 · 3 = 180 possible committees.
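The combination counts in this example can be reproduced as follows. This is a minimal sketch, assuming Python 3.8 or later for math.comb; the dictionary name by_size is hypothetical.

```python
from itertools import combinations
from math import comb

# Item 1: list the combinations of A, B and C taken 1, 2 or 3 at a time.
by_size = {
    r: ["".join(group) for group in combinations("ABC", r)]
    for r in (1, 2, 3)
}

# Item 2: 3 of 5 SOE, 2 of 4 SBA and 1 of 3 SAS students.
committees = comb(5, 3) * comb(4, 2) * comb(3, 1)

print(by_size, committees)
```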
5. Find the probability of winning the 6-55 lotto game given one ticket.
Solution:
We note that there are n(S) = 55C6 = 28,989,675 possible combinations in the 6-55 lotto game. Since
n(E) = 1 for one ticket, we have
P (E) = n(E)/n(S) = 1/28,989,675 ≈ 0.000000034
6. Find the probability of winning the 6-45 lotto game given six tickets.
Solution:
We note that there are n(S) = 45C6 = 45!/(6!(45 − 6)!) = 8,145,060 possible combinations in the 6-45
lotto game. Since n(E) = 6 for six tickets (assuming six different combinations), we have
P (E) = n(E)/n(S) = 6/8,145,060 ≈ 0.000000737
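Both lotto probabilities follow directly from the combination formula. The sketch below assumes that every combination is equally likely and that the six tickets in the 6-45 game carry six different combinations.

```python
from math import comb

n_655 = comb(55, 6)      # possible combinations in the 6-55 game
n_645 = comb(45, 6)      # possible combinations in the 6-45 game

p_655 = 1 / n_655        # one ticket
p_645 = 6 / n_645        # six tickets, assumed to be six different combinations

print(n_655, f"{p_655:.9f}")
print(n_645, f"{p_645:.9f}")
```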
Chapter 5
NORMAL DISTRIBUTION
Definition 5.1.1
1. If X is a random variable with mean µ and variance σ², then the equation of the normal
curve is
n(x; µ, σ) = (1/(√(2π) σ)) e^(−(1/2)((x − µ)/σ)²)
2. The distribution of a normal random variable with mean zero and standard deviation equal to
1 is called a standard normal distribution.
z = (X − µ)/σ.
If X is between the values x = x1 and x = x2 (x1 < X < x2), the random variable Z will
fall between the corresponding values z1 = (x1 − µ)/σ and z2 = (x2 − µ)/σ (z1 < Z < z2). Thus,
P (x1 < X < x2) = P (z1 < Z < z2).
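The normal curve and the standardization identity can be checked numerically. The sketch below is an illustration only: the values of µ, σ, x1 and x2 are assumed for the demonstration, the function name normal_curve is hypothetical, and statistics.NormalDist requires Python 3.8 or later.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_curve(x, mu, sigma):
    """Evaluate n(x; mu, sigma) as written in Definition 5.1.1."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sqrt(2 * pi) * sigma)

# Illustrative values, assumed for this sketch.
mu, sigma = 800.0, 40.0
x1, x2 = 778.0, 834.0

# Standardize the endpoints with z = (x - mu) / sigma.
z1 = (x1 - mu) / sigma
z2 = (x2 - mu) / sigma

X = NormalDist(mu, sigma)
Z = NormalDist()          # mean 0 and standard deviation 1

p_x = X.cdf(x2) - X.cdf(x1)    # P(x1 < X < x2)
p_z = Z.cdf(z2) - Z.cdf(z1)    # P(z1 < Z < z2)

# The hand-written curve should agree with the library density.
density_gap = abs(normal_curve(790.0, mu, sigma) - X.pdf(790.0))

print(z1, z2, round(p_x, 4), round(p_z, 4))
```

Both probabilities agree up to floating-point error, which is the content of the identity P (x1 < X < x2) = P (z1 < Z < z2).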
Example 5.2.1
I. Find the following probabilities for the standard normal variable z.
1. P (z < 2.64)
Solution:
P (z < 2.64) = 0.9959
2. P (z < −1.61)
Solution:
P (z < −1.61) = 0.0537
3. P (z < 0.84)
Solution:
P (z < 0.84) = 0.7995
4. P (z > 1.38)
Solution:
P (z > 1.38) = 1 − P (z < 1.38) = 1 − 0.9162 = 0.0838 or
P (z > 1.38) = P (z < −1.38) = 0.0838
5. P (z > −2.75)
Solution:
P (z > −2.75) = 1 − P (z < −2.75) = 1 − 0.0030 = 0.9970 or
P (z > −2.75) = P (z < −(−2.75)) = P (z < 2.75) = 0.9970
6. P (z > −0.68)
Solution:
P (z > −0.68) = 1 − P (z < −0.68) = 1 − 0.2483 = 0.7517 or
P (z > −0.68) = P (z < −(−0.68)) = P (z < 0.68) = 0.7517
7. P (−2.67 < z < 1.32)
Solution:
P (−2.67 < z < 1.32) = P (z < 1.32) − P (z < −2.67) = 0.9066 − 0.0038 = 0.9028
8. P (1 < z < 2)
Solution:
P (1 < z < 2) = P (z < 2) − P (z < 1) = 0.9772 − 0.8413 = 0.1359
9. P (−2.74 < z < −0.11)
Solution:
P (−2.74 < z < −0.11) = P (z < −0.11) − P (z < −2.74) = 0.4562 − 0.0031 = 0.4531
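The table values used in these solutions can be reproduced from the error function, since P (Z < z) = (1 + erf(z/√2))/2 for a standard normal variable. The following is a minimal sketch; the function name phi is an assumption of the sketch, and results are rounded to four decimals to match a z table.

```python
from math import erf, sqrt

def phi(z):
    """Cumulative probability P(Z < z) of the standard normal variable."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

left_tail = phi(2.64)              # P(z < 2.64)
right_tail = 1 - phi(1.38)         # P(z > 1.38)
right_tail_neg = 1 - phi(-0.68)    # P(z > -0.68)
between = phi(2) - phi(1)          # P(1 < z < 2)

for value in (left_tail, right_tail, right_tail_neg, between):
    print(round(value, 4))
```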
2. P (x < 44)
Solution:
x = 44, z = (x − µ)/σ = (44 − 50)/3 = −6/3 = −2
P (x < 44) = P (z < −2) = 0.0228
3. P (x > 46)
Solution:
x = 46, z = (x − µ)/σ = (46 − 50)/3 = −4/3 = −1.33
P (x > 46) = P (z > −1.33) = 1 − P (z < −1.33) = 1 − 0.0918 = 0.9082
4. P (45 < x < 60)
Solution:
x1 = 45, z1 = (x1 − µ)/σ = (45 − 50)/3 = −5/3 = −1.67
x2 = 60, z2 = (x2 − µ)/σ = (60 − 50)/3 = 10/3 = 3.33
P (45 < x < 60) = P (−1.67 < z < 3.33) = P (z < 3.33) − P (z < −1.67) = 0.9996 − 0.0475 = 0.9521
5. P (51 < x < 57)
Solution:
x1 = 51, z1 = (x1 − µ)/σ = (51 − 50)/3 = 1/3 = 0.33
x2 = 57, z2 = (x2 − µ)/σ = (57 − 50)/3 = 7/3 = 2.33
P (51 < x < 57) = P (0.33 < z < 2.33) = P (z < 2.33) − P (z < 0.33) = 0.9901 − 0.6293 = 0.3608
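These probabilities can be cross-checked without a z table. The sketch below takes µ = 50 and σ = 3 from the worked steps above and assumes Python 3.8 or later for statistics.NormalDist. Because it does not round z to two decimals before the lookup, its answers can differ from the table-based ones in the third or fourth decimal place.

```python
from statistics import NormalDist

X = NormalDist(mu=50, sigma=3)

p_below_44 = X.cdf(44)                 # P(x < 44)
p_above_46 = 1 - X.cdf(46)             # P(x > 46)
p_45_to_60 = X.cdf(60) - X.cdf(45)     # P(45 < x < 60)
p_51_to_57 = X.cdf(57) - X.cdf(51)     # P(51 < x < 57)

for value in (p_below_44, p_above_46, p_45_to_60, p_51_to_57):
    print(round(value, 4))
```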
2. If n = 80, µ = 77 and σ = 5.2, find the maximum grade of the lowest 15.34% of the class.
Solution:
P (z < k) = 0.1534
⇒ k = −1.02 and so x = kσ + µ = −1.02(5.2) + 77 = −5.304 + 77 = 71.696 ≈ 72
Therefore, 72 is the maximum grade of the lowest 15.34% of the class.
3. If n = 80, µ = 77 and σ = 5.2, find the minimum grade of the highest 23.18% of the
class.
Solution:
P (z > k) = 0.2318
P (z > k) = 1 − P (z < k) = 0.2318, thus P (z < k) = 1 − 0.2318 = 0.7682
⇒ k = 0.73 and so x = kσ + µ = 0.73(5.2) + 77 = 3.796 + 77 = 80.796 ≈ 81
Therefore, 81 is the minimum grade of the highest 23.18% of the class.
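Finding a grade from a given area reverses the table lookup. A minimal sketch, assuming Python 3.8 or later, uses NormalDist.inv_cdf to find k and then x = kσ + µ; the variable names are assumptions of the sketch, and the k values differ slightly from the two-decimal table values.

```python
from statistics import NormalDist

mu, sigma = 77, 5.2
Z = NormalDist()                       # standard normal distribution

# Maximum grade of the lowest 15.34%: P(z < k) = 0.1534.
k_low = Z.inv_cdf(0.1534)
max_of_lowest = k_low * sigma + mu

# Minimum grade of the highest 23.18%: P(z < k) = 1 - 0.2318.
k_high = Z.inv_cdf(1 - 0.2318)
min_of_highest = k_high * sigma + mu

print(round(k_low, 2), round(max_of_lowest))
print(round(k_high, 2), round(min_of_highest))
```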
4. A certain type of storage battery lasts on the average 3.0 years, with a standard deviation
of 0.5 year. Assuming that the battery lives are normally distributed, find the probability that
a given battery will last less than 2.3 years.
Solution:
x = 2.3 transforming to z = (x − µ)/σ = (2.3 − 3)/0.5 = −0.7/0.5 = −1.4
P (x < 2.3) = P (z < −1.4) = 0.0808
5. An electrical firm manufactures light bulbs that have a length of life that is normally
distributed with mean equal to 800 hours and standard deviation of 40 hours. Find the
probability that a bulb burns between 778 and 834 hours.
Solution:
x1 = 778 transforming to z1 = (x1 − µ)/σ = (778 − 800)/40 = −22/40 = −0.55
x2 = 834 transforming to z2 = (x2 − µ)/σ = (834 − 800)/40 = 34/40 = 0.85
P (778 < x < 834) = P (−0.55 < z < 0.85) = P (z < 0.85) − P (z < −0.55) = 0.8023 − 0.2912 = 0.5111
Chapter 6
TESTS OF HYPOTHESIS
H0 : θ = θ0,
H1 : θ > θ0,
or
H0 : θ = θ0,
H1 : θ < θ0,
H0 : θ = θ0
H1 : θ ≠ θ0,
is called a two-tailed test. The alternative hypothesis θ ≠ θ0 states that either θ < θ0 or θ > θ0.
Thus,
H0 : θ = θ0,
H1 : θ < θ0 or θ > θ0.
3. A test is significant if the null hypothesis is rejected at the 0.05 level of significance and
is considered to be highly significant if the null hypothesis is rejected at the 0.01 level of
significance.
Test statistic: z = (x̄ − µ0)/(σ/√n); σ known or n ≥ 30

H0        H1        Critical region
µ = µ0    µ < µ0    z < −zα
          µ > µ0    z > zα
          µ ≠ µ0    z < −zα/2 or z > zα/2
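The critical values zα and zα/2 in the table come from the standard normal distribution. The sketch below shows one way to compute them, assuming Python 3.8 or later; the function name critical_values and its string labels are hypothetical. Table values such as 1.65 or 2.58 are rounded versions of these.

```python
from statistics import NormalDist

def critical_values(alpha, alternative):
    """Return (lower, upper) z cut-offs; None means that tail is not used."""
    z = NormalDist()
    if alternative == "less":          # H1: mu < mu0, reject when z < -z_alpha
        return (z.inv_cdf(alpha), None)
    if alternative == "greater":       # H1: mu > mu0, reject when z > z_alpha
        return (None, z.inv_cdf(1 - alpha))
    if alternative == "two-sided":     # H1: mu != mu0, reject in either tail
        half = z.inv_cdf(1 - alpha / 2)
        return (-half, half)
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")

two_sided = critical_values(0.01, "two-sided")
upper = critical_values(0.05, "greater")
lower = critical_values(0.025, "less")
print(two_sided, upper, lower)
```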
Example 6.3.2
1. A manufacturer of sports equipment has developed a new synthetic fishing line that he
claims has a mean breaking strength of 8 kilograms with a standard deviation of 0.5
kilogram. Test the hypothesis that µ = 8 kilograms against the alternative that µ ≠ 8
kilograms if a random sample of 50 lines is tested and found to have a mean breaking
strength of 7.8 kilograms. Use a 0.01 level of significance.
Solution:
1. H0 : µ = 8 kilograms.
2. H1 : µ ≠ 8 kilograms.
3. α = 0.01
4. Critical region: z < −2.58 or z > 2.58, where z = (x̄ − µ0)/(σ/√n).
5. Computations: x̄ = 7.8 kilograms, σ = 0.5 kilogram, n = 50, and hence
z = (7.8 − 8)/(0.5/√50) = −2.83.
6. Decision: Since z = −2.83 < −2.58, reject H0 and conclude that the average breaking strength
is not equal to 8 kilograms.
2. A random sample of 100 recorded deaths in the United States during the past year showed
an average life span of 71.8 years, with a standard deviation of 8.9 years. Does this seem to
indicate that the average life span today is greater than 70 years? Use a 0.05 level of
significance.
Solution:
1. H0 : µ = 70 years.
2. H1 : µ > 70 years.
3. α = 0.05
4. Critical region: z > 1.65, where z = (x̄ − µ0)/(σ/√n).
5. Computations: x̄ = 71.8 years, σ = 8.9 years, n = 100, and hence
z = (71.8 − 70)/(8.9/√100) = 2.02.
6. Decision: Since z = 2.02 > 1.65, reject H0 and conclude that the average life span today is
greater than 70 years.
3. The average length of time for students to register for summer classes at a certain college
has been 50 minutes with a standard deviation of 10 minutes. If a random sample of 60
students had an average registration time of 52 minutes, test the hypothesis that the
population mean is now less than 50, using a level of significance of 0.025.
Solution:
1. H0 : µ = 50 minutes.
2. H1 : µ < 50 minutes.
3. α = 0.025
4. Critical region: z < −1.96, where z = (x̄ − µ0)/(σ/√n).
5. Computations: x̄ = 52 minutes, σ = 10 minutes, n = 60, and hence
z = (52 − 50)/(10/√60) = 1.55.
6. Decision: Since z = 1.55 is not less than −1.96, do not reject H0. There is no evidence that
the average registration time is now less than 50 minutes.
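The three tests in this example follow the same steps, so they can be sketched as one function. This is an illustration only, assuming Python 3.8 or later; the function name z_test and its arguments are hypothetical, and it uses exact critical values rather than the rounded table values 2.58, 1.65 and 1.96, which does not change any of the three decisions.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha, alternative):
    """Return (z, reject) for a one-sample z test of H0: mu = mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    std = NormalDist()
    if alternative == "less":              # H1: mu < mu0
        reject = z < std.inv_cdf(alpha)
    elif alternative == "greater":         # H1: mu > mu0
        reject = z > std.inv_cdf(1 - alpha)
    else:                                  # H1: mu != mu0 (two-tailed)
        reject = abs(z) > std.inv_cdf(1 - alpha / 2)
    return z, reject

fishing_line = z_test(7.8, 8, 0.5, 50, 0.01, "two-sided")
life_span = z_test(71.8, 70, 8.9, 100, 0.05, "greater")
registration = z_test(52, 50, 10, 60, 0.025, "less")

for z, reject in (fishing_line, life_span, registration):
    print(round(z, 2), "reject H0" if reject else "do not reject H0")
```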
6.4 Exercise
1. An electrical firm manufactures light bulbs that have a length of life that is approximately
normally distributed with a mean of 800 hours and a standard deviation of 40 hours. Test the
hypothesis that µ = 800 hours against the alternative µ ≠ 800 hours if a random sample of 30
light bulbs has an average life span of 788 hours. Use a 0.04 level of significance.
2. The average height of females in the freshman class of a certain college has been 160.5
centimeters with a standard deviation of 6.9 centimeters. Is there a reason to believe that
there has been a change in the average height if a random sample of 50 females in the present
freshman class has an average height of 162.5 centimeters? Use a 0.02 level of significance.
3. Test the hypothesis that the average content of containers of a particular lubricant is 10
liters if the contents of a random sample of 30 containers are 10.2, 9.7, 10.1, 10.3, 10.1, 9.8,
9.9, 10.4, 10.3, 9.8, 10.21, 9.71, 10.11, 10.31, 10.11, 9.81, 9.91, 10.41, 10.31, 9.81, 9.83, 10.33,
10.43, 9.93, 9.83, 10.13, 10.33, 10.13, 9.73, and 10.23. Use a 0.01 level of significance and
assume that the distribution of contents is normal.