Basicstat 1011

This document is an introduction to statistics textbook that covers topics such as: - Definitions and classifications of key statistical terms - The stages of a statistical investigation and applications/limitations of statistics - Variables, measurement scales, and methods of data collection/presentation - Measures of central tendency (mean, median, mode) and variation (range, variance, standard deviation) - Elementary probability concepts like sets, counting, and probability distributions - Common probability distributions like the binomial and normal distributions - Sampling techniques The textbook is intended as an introductory resource for a college statistics course and provides explanations of fundamental statistical concepts.


Introduction to Statistics - Stat 1011

Awol S.
Department of Statistics
College of Computing & Informatics
Haramaya University
Dire Dawa, Ethiopia

2013/2014
Contents

1 Introduction
1.1 Some Statistical Terms
1.2 Definition and Classification of Statistics
1.2.1 Definitions of Statistics
1.2.2 Stages in Statistical Investigation
1.2.3 Classification of Statistics
1.3 Applications, Uses and Limitations of Statistics
1.3.1 Applications of Statistics
1.3.2 Uses of Statistics
1.3.3 Limitations of Statistics
1.4 Types of Variables and Measurement Scales
1.4.1 Variable
1.4.2 Scales of Measurement

2 Methods of Data Collection and Presentation
2.1 Types of Data
2.2 Methods of Data Collection
2.2.1 Questionnaire
2.2.2 Secondary data
2.3 Data Organization
2.4 Methods of Data Presentation
2.4.1 Frequency Distributions
2.4.2 Diagrammatic Display of Data
2.4.3 Graphical Presentation of Data

3 Measures of Central Tendency
3.1 Objectives of Measures of Central Tendency
3.2 Characteristics of Good Measure of Central Tendency
3.3 Summation Notation
3.4 Mean
3.4.1 Arithmetic Mean
3.4.2 Geometric Mean
3.4.3 Harmonic Mean


3.5 Median
3.6 Other Measures of Location: Quantiles
3.6.1 Quartiles
3.6.2 Deciles
3.6.3 Percentiles
3.7 Mode

4 Measures of Variation, Skewness and Kurtosis
4.1 Objectives of Measures of Variation
4.2 Types of Measures of Variation
4.2.1 Range and Relative Range
4.2.2 Quartile Deviation and Coefficient of Quartile Deviation
4.2.3 Mean Deviation and Coefficient of Mean Deviation
4.2.4 Variance and Standard Deviation
4.2.5 Coefficient of Variation
4.2.6 Standard Score
4.3 Moments
4.4 Skewness
4.4.1 Frequency Curves
4.4.2 Measures of Skewness
4.5 Kurtosis

5 Elementary Probability
5.1 Concept of Set
5.2 Basic Probability Terms
5.3 Counting Techniques
5.4 Definitions of Probability
5.5 Some Rules of Probability
5.6 Conditional Probability and Independence
5.6.1 Conditional Events
5.6.2 Independent Events

6 Probability Distributions
6.1 Random Variable
6.1.1 Probability Distribution
6.1.2 Expectations of a Random Variable
6.2 Common Discrete Distributions
6.2.1 The Binomial Distribution
6.2.2 The Poisson Distribution
6.3 Common Continuous Distributions
6.3.1 The Normal Distribution
6.3.2 Other Continuous Distributions

Introduction to Statistics - Stat 1011 es.awol@gmail.com

7 Sampling Techniques
7.1 Basic Concepts
7.2 Reasons for Sampling
7.3 Types of Errors
7.4 Types of Sampling Techniques
7.4.1 Probability Sampling Techniques
7.4.2 Non-probability Sampling Techniques

8 Statistical Inference for a Single Population
8.1 Estimation
8.1.1 Point Estimation
8.1.2 Interval Estimation
8.2 Hypothesis Testing
8.2.1 Basic Concepts in Hypothesis Testing
8.2.2 Hypothesis Testing for a Population Mean
8.2.3 Confidence Interval for a Population Mean

9 Inference for Two or More Populations
9.1 Comparison of the Population Mean in Two groups
9.1.1 Paired Sample
9.1.2 Independent Samples
9.2 Analysis of Variance (ANOVA)

10 Simple Linear Regression and Correlation
10.1 Correlation
10.1.1 Covariance
10.1.2 Pearson’s Correlation Coefficient
10.1.3 Spearman’s Rank Correlation
10.2 Simple Linear Regression
10.2.1 Method of Estimation
10.2.2 The Coefficient of Determination

Chapter 1

Introduction

1.1 Some Statistical Terms

Statistics has become an integral part of our daily lives. Every day we are confronted
with some form of statistical information through newspapers, magazines and other forms
of communication. Such statistical information has become highly influential in our lives.
Before getting involved in the subject matter in detail, let us define some of the terms used
extensively in the field of statistics.

• Datum: a single piece of information taken from an object. It is also known as an
observation, an item, a case or a unit.

• Data: a collection of observed values representing one or more characteristics of
some objects.

• Population: the totality of all objects under study.

• Sample: a subset of the population. Normally a sample should be selected in
such a way as to be representative of the population.
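
The four terms above can be illustrated with a small sketch. The ages below are made-up values for a hypothetical club of 12 members, and simple random selection is only one possible way of aiming for a representative sample.

```python
import random

# Hypothetical population: ages of all 12 members of a small club (made-up values).
population = [19, 21, 22, 20, 23, 21, 24, 20, 22, 21, 25, 19]

# Each element of the population is one datum (a single observation).
datum = population[0]

# A sample is a subset of the population. Random selection without replacement
# is one common way to aim for a representative sample.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
sample = rng.sample(population, k=4)

print("population size:", len(population))
print("one datum:", datum)
print("sample:", sample)
```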

1.2 Definition and Classification of Statistics

1.2.1 Definitions of Statistics
Statistics can be defined in two senses: plural (as statistical data) and singular (as statis-
tical methods).

• Plural sense: Statistics are collections of facts (figures). This meaning of the word
is widely used when reference is made to facts and figures on a certain characteristic,
for example sales statistics, labor statistics, employment statistics, etc. In this
sense the word "statistics" serves simply as "data". But not all numerical data
are statistics. For numerical data to be identified as statistics, they must
possess certain identifiable characteristics. Some of these characteristics are described
as follows:

1. Statistics are aggregates of facts. Single or isolated facts or figures cannot be
called statistics, as these cannot be compared or related to other figures within
the same framework. Accordingly, there must be an aggregate of these figures.
For example, if a person says "I earn Birr 30000 per year", this would not be
considered statistics. On the other hand, if we say that the average salary of a
professor at our university is Birr 30000 per year, then this would be considered
statistics, since the average has been computed from many related figures such
as the yearly salaries of many professors.
2. Statistics are numerically expressed. All statistics are stated in numerical
figures only. Qualitative statements cannot be called statistics. For example,
qualitative statements such as "Ethiopia is a developing country" or "Jack is
very tall" would not be considered statistical statements. On the other hand,
comparing the per capita income of Ethiopia with that of Kenya would be considered
statistical in nature. Similarly, Jack's height in numbers compared to the average
height in Ethiopia would also be considered statistics.
3. Statistics must be placed in relation to each other. The main objective
of statistical analysis is to facilitate a comparative and relative study of the
desired characteristics of the data. The comparison of facts and figures may be
conducted regarding the same characteristic over a period of time from a single
source, or it may be from various sources at any one given time. For example,
the prices of different items in a store would not, as such, be considered statistics.
However, the prices of one product in different stores constitute statistical data, since
these prices are comparable. The changes in the price of a product in one
store over a period of time would also be considered statistical data, since
these changes provide for comparison over a period of time. However, these
comparisons must relate to the same phenomenon or subject, so that like is
compared with like and oranges are not compared with apples.

• Singular sense: Statistics is a science that deals with the methods of data collection,
data organization, data presentation, data analysis and interpretation of results.
It refers to a subject matter that is concerned with extracting relevant information
from available data with the aim of making sound decisions. According to this meaning,
statistics is concerned with the development and application of methods and
techniques for collecting, organizing, presenting and analyzing data and for interpreting
results.

1.2.2 Stages in Statistical Investigation

According to the singular sense definition of statistics, a statistical investigation involves
five stages: data collection, organization, presentation, analysis and interpretation of results.

1. Collection of data: Data collection is the first stage in any statistical investigation.
It involves the process of obtaining (gathering) a set of related measurements
or counts to meet predetermined objectives. Data may be available from existing
published sources, which may have already been organized in some presentable form.
Such information is commonly referred to as secondary data. On the other hand, the
investigator may actually collect his or her own data. This is usually warranted when
information about some area of inquiry has not yet been ascertained. In such cases, the
data are said to be of primary form.

2. Organization of data: It is usually not possible to derive any conclusion about
the main features of the data from direct inspection of the observations. The second
stage of a statistical investigation is therefore to describe the properties of the data in
summary form. Editing is the first step in the organization of data, since there may be
omissions, inconsistencies, ambiguities, irrelevant answers and recording errors. Once the
data are edited, the second step is classification, that is, arranging the collected data
according to some common characteristics. Such classified data can be presented more easily.
The last step of the organization of data is presenting the classified data in tabular
form, using rows and columns (tabulation).

3. Presentation of data: The purpose of data presentation is to give an overview of
what the data actually look like, and to facilitate statistical analysis. Data presentation
can be done using diagrams and graphs, which are easy to remember
and facilitate comparison.

4. Analysis of data: The analysis of data is the extraction of summarized and comprehensive
numerical descriptions in order to reach conclusions or provide answers to
a problem. That is, the basic purpose of data analysis is to make the data useful for drawing
conclusions. This analysis may require anything from simple to sophisticated mathematical
techniques.

5. Interpretation of results: This is the last stage of a statistical investigation. Once
the data have been analyzed, some numerical value(s) are obtained. The main job
consists of attaching physical meaning or interpretation to these numerical results.
The interpretation must be faithful to what the results actually mean. No preconceived
ideas should be imposed on the numerical results obtained from the analysis of the data.
Also, no attempt should be made to draw more conclusions than the results actually
support.
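
The five stages above can be sketched on a toy data set. The raw responses below are hypothetical (family sizes as they might be written on survey forms), and the editing rule used here, keeping only answers made up of digits, is just one simple assumption about what counts as a usable response.

```python
from collections import Counter
from statistics import mean

# 1. Collection: hypothetical raw survey answers (family size), as recorded.
raw = ["4", "5", "", "3", "4", "six", "4", "5"]

# 2. Organization: edit (drop omissions and recording errors), then
#    classify and tabulate the cleaned values.
edited = [int(x) for x in raw if x.isdigit()]
table = Counter(edited)  # value -> frequency

# 3. Presentation: a crude text bar chart of the frequency table.
for value in sorted(table):
    print(f"{value}: {'*' * table[value]}")

# 4. Analysis: reduce the data to a summary figure.
average = mean(edited)

# 5. Interpretation: attach meaning without claiming more than the data support.
print(f"Average family size among the {len(edited)} usable responses: {average:.2f}")
```

Note that the interpretation line speaks only about the usable responses; it does not claim anything about families that were not surveyed.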

1.2.3 Classification of Statistics

Based on the scope of the decision, statistics can be classified into two types: descriptive and
inferential statistics.


1. Descriptive statistics: It refers to the procedures used to organize and summarize
masses of data. It is concerned with describing or summarizing the most important
features of the collected data without going beyond the data themselves. That is, this
part deals only with describing the data collected, without attempting to conclude
anything that goes beyond them.
The methodologies of descriptive statistics include methods of data organization
(classification, tabulation and constructing frequency distributions), methods
of data presentation (diagrammatic and graphical displays) and calculation of
certain indicators of the data, such as measures of central tendency and measures of
dispersion (variation).

2. Inferential statistics: Inferential statistics includes the methods used to find out
something about a population based on a sample. It is concerned with drawing
statistically valid conclusions about the characteristics of the population based on
information obtained from a sample. In this form of statistical analysis, descriptive
statistics is linked with probability theory in order to generalize the results of the
sample to the population. Performing hypothesis tests, determining relationships
between variables and making predictions are also part of inferential statistics.
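
The distinction can be sketched with simulated data. The population below is hypothetical (randomly generated ages), and using the sample mean as the estimate of the population mean is only the simplest possible inferential step; it is not a full inferential procedure.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: simulated ages of 1000 students.
population = [rng.randint(18, 26) for _ in range(1000)]
sample = rng.sample(population, k=50)

# Descriptive: a statement about the data in hand and nothing more.
sample_mean = mean(sample)
print(f"The average age of the 50 sampled students is {sample_mean:.1f} years.")

# Inferential: the same figure used to say something about the whole population.
print(f"The average age of all students is estimated to be about {sample_mean:.1f} years.")

# Because the population is simulated, the estimate can be checked here.
# In practice the population mean is unknown, which is why inference is needed.
population_mean = mean(population)
print(f"True population mean: {population_mean:.1f} years.")
```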

Example 1.1. Classify each of the following statements as descriptive or inferential statistics.

1. The average age of the students in this class is 21 years.

2. There is a strong association between smoking and lung cancer.

3. The price of wheat will be increased by 5% in the coming year.

4. Of the students enrolled in Haramaya University this year, 74% are male and 26%
are female.

5. The chance of winning the Ethiopian National Lottery in any day is 1 out of 167000.

1.3 Applications, Uses and Limitations of Statistics

1.3.1 Applications of Statistics
There is hardly any walk of life which has not been affected by statistics - ranging from
a simple household to big business and the government. Hence, in this modern time,
statistical information plays a very important role in a wide range of fields. Some of the
areas where the knowledge of statistics is usually applied are as follows:

• In scientific research: There is hardly any advanced research going on without the
use of statistics in one form or another. Statistics are used extensively in medical,
pharmaceutical and agricultural research. The effectiveness of a new drug is determined
by statistical experimentation and evaluation. In agriculture, experiments
about crop yields, types of fertilizers and types of soils under different types of
environments are commonly designed and analyzed through statistical methods and
concepts. In marketing research, statistical tools are indispensable in studying consumer
behavior, the effects of various promotional strategies and so on. In economics, statistics
is used for modeling functional relationships between or among variables. In education
and agricultural extension, it is also used to study the effects of certain training.
In decision making, statistics helps to improve decisions made in
the face of uncertainty by providing sufficient information.

• In quality control: Statistics are used in quality control so extensively that the
practice itself is known as statistical quality control. Statistical quality control
(SQC) consists of using statistical methods to gather and analyze data for the
determination and control of quality. Statistical methods help to check whether a product
satisfies a given standard. The technique deals primarily with samples taken
randomly so as to be representative of the entire population; these samples are
analyzed and inferences are made concerning the characteristics of the population from
which they were taken. The concept is similar to testing one spoonful from
a pot of stew and deciding whether it needs more salt or not. The characteristics
of the samples are analyzed using statistical quality control and other statistical
techniques.

• In natural, social and physical sciences: In the natural sciences, for example in botany,
statistics are used in evaluating the effects of temperature and other climatic conditions and
types of soil on the health of plants. In the social sciences, statistics are used in all
areas of human and social characteristics. Similarly, in the physical sciences, for example,
the science of meteorology uses statistics in analyzing the data gathered by satellites
to predict weather conditions.

• In other areas: Statistics are commonly used by insurance companies, stock brokerage
houses, banks, public utility companies and so on. Statistics are also immensely
useful to politicians, since their chances of winning can be predicted by
selecting random samples of voters and studying their
attitudes on issues and policies.

1.3.2 Uses of Statistics

• Reduction and summarization of data: Statistics condenses and summarizes a large
mass of data and presents the facts in a few presentable, understandable and precise
numerical figures. Raw data, as usually available, are voluminous and haphazard.
It is generally not possible to draw any conclusions from the raw data as collected.
Hence it is necessary and desirable to express these data in a few numerical values.

• Facilitating comparison of data: Arrangement of data with respect to different
characteristics facilitates comparison. Statistical devices such as averages, percentages,
ratios, etc. are used for this purpose.

• Determining functional relationships between two or more phenomena: Statistical
techniques such as correlation analysis assist in establishing the degree of association
between two or more variables.
• Formulation and testing of hypotheses: For instance, hypotheses such as whether a new
medicine is effective in curing a disease, or whether there is an association between
variables, can be tested using statistical tools.
• Prediction: Statistical methods are highly useful tools for analyzing past data and
predicting future trends.
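
Three of the uses listed above (summarization, comparison and measuring association) can be sketched on a small data set. The paired values are hypothetical, and `pearson_r` is a helper written for this illustration; it computes Pearson's correlation coefficient, one common measure of the degree of linear association.

```python
from math import sqrt
from statistics import mean

# Hypothetical paired observations: hours studied and quiz score for 5 students.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

# Reduction and summarization: five scores condensed into one figure.
average_score = mean(score)

# Facilitating comparison: a percentage puts groups of any size on one footing.
percent_scoring_4_or_more = 100 * sum(s >= 4 for s in score) / len(score)

def pearson_r(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Degree of association between the two variables.
r = pearson_r(hours, score)

print(f"average score: {average_score}")
print(f"scoring 4 or more: {percent_scoring_4_or_more:.0f}%")
print(f"correlation between hours and score: {r:.3f}")
```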

1.3.3 Limitations of Statistics

• It does not deal with a single observation; rather, as discussed earlier, it deals only
with aggregates of facts. For example, the marks obtained by one student in a class
do not carry any meaning in themselves, unless they are compared with a set standard,
with other students in the same class or with the student's own marks obtained earlier.
• Statistical methods are not applicable to qualitative characteristics that cannot be coded
as numerical values.
• Statistical results are true on average, i.e. for the majority of cases. Since statistics is
not an exact science, statistical conclusions are not universally true. That is, statistical
laws are not universally true like the laws of physics, chemistry and mathematics.
• Statistics are liable to be misused or misinterpreted. Statistical interpretation requires
a high degree of skill and understanding of the subject. Misuse of statistics may arise
from inadequate and faulty procedures of data collection and sample selection,
and mainly from ignorance (lack of knowledge).

1.4 Types of Variables and Measurement Scales

1.4.1 Variable
A variable is a characteristic or an attribute that can assume different values. For example:
height, family size, gender, · · · . Based on the values that variables assume, variables can
be classified as qualitative or quantitative.

Qualitative variables are those variables that do not assume numeric values. For example,
gender is a qualitative variable. Quantitative variables, on the other hand, are those
variables which assume numeric values. These variables are numeric in nature. Height and
family size are examples of quantitative variables.


Quantitative variables are further classified into two types: discrete and continuous variables.
Discrete variables are those variables that assume whole number values and consist of distinct
and recognizable individual elements that can be counted. For example, family size, number
of children in a family and number of cars at a traffic light are discrete
variables. These variables assume a finite or countable number of possible values. The
values of these variables are obtained by counting (0, 1, 2, · · · ).

The other quantitative variables, continuous variables, can take any value within a range,
including decimals. These variables can theoretically assume an infinite number of possible
values. Their values are obtained by measuring. Examples of continuous variables are height,
weight, time and temperature.

Generally the values of a variable can be obtained either by counting for discrete variables,
by measuring for continuous variables or by making categories for qualitative variables.
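
As a rough illustration of this classification, the sketch below labels recorded values by how they are stored. The rule used (text for categories, integers for counts, decimals for measurements) is an assumption made for this example only, and `classify` is a hypothetical helper. It would mislabel numbers that are merely codes, so it is not a substitute for thinking about what the variable means.

```python
def classify(value):
    """Rough heuristic: label one recorded value by its Python type."""
    if isinstance(value, (bool, str)):
        return "qualitative"               # obtained by making categories
    if isinstance(value, int):
        return "quantitative, discrete"    # obtained by counting
    if isinstance(value, float):
        return "quantitative, continuous"  # obtained by measuring
    raise TypeError(f"cannot classify a value of type {type(value).__name__}")

# One hypothetical record using the variables named in the text.
record = {"gender": "female", "family size": 5, "height": 1.68}

kinds = {name: classify(value) for name, value in record.items()}
for name, kind in kinds.items():
    print(f"{name}: {kind}")
```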

Example 1.2. Classify each of the following variables as qualitative or quantitative and,
if it is quantitative, classify it as discrete or continuous.

1. Color of automobiles in a dealer’s show room.

2. Number of seats in a movie theater.

3. Classification of patients based on nursing care needed (complete, partial or safers).

4. Number of tomatoes on each plant on a field.

5. Weight of newly born babies.

1.4.2 Scales of Measurement

Consider the following two cases.

Case 1:

• Mr A wears shirt number 5 when he plays football.

• Mr B wears shirt number 6 when he plays football.

Who plays better? What is the average shirt number?

Case 2:

• Mr A scored 5 in Stat quiz.

• Mr B scored 6 in Stat quiz.

Who did better? What is the average score?


Based on the numbers on the shirts, it is not possible to judge whether Mr B plays better.
But by using the quiz scores, it is possible to judge that Mr B did better in the quiz. Also,
it is not meaningful to find the average shirt number, because the numbers on the shirts are
simply codes, but it is possible to obtain the average quiz score.

In general, a scale of measurement shows the information contained in the value of a
variable, and which mathematical operations and statistical analyses are permissible
on the values of the variable. There are four levels of measurement. In order from the
weakest to the strongest, these are: nominal scale, ordinal scale, interval scale
and ratio scale.

1. Nominal variables: are those qualitative variables which show the category to which
an individual belongs. They reflect classification into mutually exclusive (non-overlapping)
and exhaustive categories (names of groups) without any associated ranking. Numbers
may be assigned to the categories simply for coding purposes. It is not possible to
compare individuals based on the numbers assigned to categories. This scale is the
weakest form of measurement. The only mathematical operation permissible on these
variables is counting. Some examples of nominal variables are gender, religion, ID
No, ethnicity, color,· · · .

2. Ordinal variables: are those qualitative variables whose values can be ordered
and ranked. Ranking and counting are the mathematical operations that can be performed on
the values of these variables. However, the ranks only indicate which category is
greater or better; the difference between the values (categories) of
the variable is not precisely defined. Examples: grade scores (A, B, C, D, F), academic
qualifications (B.Sc., M.Sc., Ph.D.), strength (very weak, weak, strong, very strong),
health status (very sick, sick, cured).

3. Interval variables: are those quantitative variables whose values identify not only
which category is greater or better but also by how much. This is a stronger form of
measurement, but there is no true zero: zero indicates a low value rather than the absence
of the characteristic. Example: for temperature, 0 °C does not mean there is no temperature
but, rather, that it is cold.
Similarly, if a student scores 0 in a certain course, it does not mean that the student
has no knowledge of the course at all.

4. Ratio variables: This scale is the highest form of measurement. Ratio variables
are quantitative variables for which, unlike interval variables, zero shows the absence
of the characteristic. All mathematical operations can be performed on
the values of these variables. Examples: height, weight, income, amount of yield,
expenditure, consumption,· · · .
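
The hierarchy of the four scales can be sketched as follows. The operation names are labels chosen for this illustration; in particular, listing addition and subtraction under the interval scale is a reading of "by how much", not a statement taken from the text. The temperature lines show why ratios are not meaningful on an interval scale: the ratio of two temperatures changes when the unit changes, while the ratio of two weights does not.

```python
# Scales ordered from weakest to strongest, as in the text.
SCALES = ["nominal", "ordinal", "interval", "ratio"]

# Operations each scale adds to those already allowed on the weaker scales
# (operation names are labels chosen for this sketch).
ADDED_OPERATIONS = {
    "nominal": {"count"},
    "ordinal": {"rank"},
    "interval": {"add", "subtract"},
    "ratio": {"multiply", "divide"},
}

def permissible(scale):
    """All operations allowed on a scale, including those of weaker scales."""
    allowed = set()
    for name in SCALES[: SCALES.index(scale) + 1]:
        allowed |= ADDED_OPERATIONS[name]
    return allowed

def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

# Interval scale: "20 degrees is twice 10 degrees" depends on the unit used.
ratio_celsius = 20 / 10
ratio_fahrenheit = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

# Ratio scale: 20 kg is twice 10 kg in kilograms and in grams alike.
ratio_kilograms = 20 / 10
ratio_grams = 20_000 / 10_000

print(sorted(permissible("ordinal")))
print(ratio_celsius, ratio_fahrenheit)
print(ratio_kilograms, ratio_grams)
```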

Chapter 2

Methods of Data Collection and Presentation

2.1 Types of Data

The statistical data, as previously discussed, may be classified into two categories depending
upon the sources utilized. These categories are:

1. Primary data: Primary data are data collected by the investigator himself or herself
for the purpose of a specific inquiry or study. They are collected for
the first time, either through direct observation or by questioning individuals under the
direct supervision and instruction of the researcher. Such data are original in character
and are generated in surveys conducted by individuals or research institutions.

2. Secondary data: When an investigator uses data which have already been collected
by others, such data are called secondary data. These data are primary data for
the agency that collected them and become secondary data for someone else who uses
them for his or her own purposes. Secondary data can be obtained from journals,
official reports, government publications, publications of professional and research
organizations and so on.

Based on the role of time, data can be classified as cross-sectional or time series.

1. Cross-sectional data: is a set of observations taken at a single point in time.

2. Time series data: is a set of observations collected over a sequence of time periods, usually
at equal intervals.
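
The difference between the two can be seen in how the data are laid out. The store names, months and prices below are hypothetical values chosen for this sketch.

```python
# Cross-sectional data: several units observed at one point in time
# (hypothetical price of one product in three stores in the same month).
cross_section = {"store A": 12.5, "store B": 13.0, "store C": 12.0}

# Time series data: one unit observed over equally spaced periods
# (hypothetical monthly price of the same product in one store).
time_series = [("2013-01", 12.5), ("2013-02", 12.8), ("2013-03", 13.1)]

# A cross-section supports comparison across units.
cheapest = min(cross_section, key=cross_section.get)

# A time series supports comparison over time.
changes = [
    round(later[1] - earlier[1], 2)
    for earlier, later in zip(time_series, time_series[1:])
]

print("cheapest store:", cheapest)
print("month-to-month changes:", changes)
```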

2.2 Methods of Data Collection

The first and foremost task in a statistical investigation is data collection. Before the actual
data collection, four important points should be considered. These are the purpose of data
collection (why we need to collect data), the kind of data to be collected (what type of data
is to be collected), the source of data (where we can get the data) and the method of data
collection (how we can collect the data).

Once these questions are answered, it becomes necessary to collect the information needed.
This information has to be collected from certain individuals, directly or indirectly. Such
a technique is known as the survey method, which is commonly used in the social sciences, i.e.,
for problems related to sociology, political science, psychology and various economic studies.

Another way of collecting data is experimentation, i.e., an actual experiment is conducted
and then observations (measurements and counts) are recorded. Such experimental studies
are common in the natural sciences, such as agriculture, biology, medical science, industry,· · ·

2.2.1 Questionnaire
The most common methods of data collection for surveys are the personal interview and the
self-administered questionnaire. In these and other methods of data collection, it is necessary
to prepare a document, called a questionnaire, which contains a set of questions to be
answered and is used to record the responses.

A questionnaire is a form containing a cover letter, which introduces the person conducting
the survey and explains the objectives of the survey, and a set of related questions to be answered
by the respondents. One of the most important points in preparing it is that all questions
in it must be relevant to the objectives of the survey. In short, the following points
should be kept in mind while designing a questionnaire:

• The person conducting the survey should introduce himself or herself, state the objective(s)
of the survey, promise anonymity and include instructions on how to fill in the
form, as these are necessary for getting correct responses (cover letter).

• The number of questions should be as few as possible. Once the objectives of the sur-
vey are clearly defined only questions pertinent to the objectives should be included.
The time of the respondent should not be wasted by asking irrelevant questions. In
general 5 to 25 questions may be regarded as a fair number. If a lengthy questionnaire
is unavoidable, it should preferably be divided into two or more parts.

• Questions should be logically arranged. Questions should be in a logical order (an
appropriate sequence of topics) so that a natural and spontaneous reply is induced.
Topics should not be mixed up and questions should not skip back and forth. For
example, it is undesirable to ask a person how many children s/he has before asking
whether s/he is married or not. Questions related to identification and description
of the respondent should come first, followed by major information questions. If
opinions are requested, such questions should usually be placed at the end of the list.


• Questions should be simple, short and easy to understand and they should convey
one and only one idea. Technical terms should be avoided.

• Sensitive questions (questions of a personal or financial nature) should be avoided if
possible. Otherwise, such questions should be asked indirectly, by constructing a set
of ranges, and must be put in the last part of the questionnaire. Examples: age (0-25,
26-50, 51-75, >75), salary (below 200, 200-500, 500-1000, >1000).

• Leading questions should be completely avoided. If you ask a person "You do not
smoke cigarettes?", the person will almost automatically answer "Yes, I do not".

• Answers to the questions should not require any calculation.

• Questions should be capable of objective answers.

2.2.2 Secondary data

Secondary data should be used with utmost care. So before using this data, the following
three points should be considered.

1. Whether the data are suitable for the purpose of investigation. This can be judged in
the light of the nature and scope of investigation.

2. If the data are suitable for our purpose, it should be checked whether they
are adequate for the purpose of investigation. This can be judged in the light of the
time and geographical area covered by the available data.

3. Whether the data are reliable. The data obtained should be checked for accuracy.
If the data are based on a sample, one should see whether the sample is
properly representative of the population.

Once the above points are observed in the secondary data, it is ready to be used for further
statistical analysis.

2.3 Data Organization

It is almost impossible for management to deal with all the collected data in raw form,
as it is haphazard and unsystematic. In order to describe situations and make
inferences about the population, or even to describe the sample, the data must be organized
in some meaningful way.


Editing Data
Before further analysis, the collected data should be edited for completeness, consistency,
accuracy and homogeneity.

• Completeness: If the answers to some questions are missing, it becomes necessary to
contact the person again and complete the missing information.

• Consistency: Some information given by the respondent may be incompatible,
in the sense that one piece of information furnished by the individual either does not
agree with some other information or contradicts an earlier answer.

• Accuracy: It is of vital importance. If the data are inaccurate, the conclusions
drawn from them have no relevance. If the investigator has made a false report or
the respondent has deliberately supplied wrong information, editing will be of no
use. In recent times, checks have been developed to attain accuracy, for example sending
supervisors to check the work of investigators or reinvestigating a few respondents
after a certain gap of time.

• Homogeneity: To maintain homogeneity, the information sheets are checked to see
whether the unit of information or measurement is the same in all the questionnaires.
If there are differences, the values have to be converted to the same unit during editing.

Classification of Data
The next important step towards organizing data is classification. Classification is the
separation of items according to similar characteristics and grouping them into various
groups. Data may be classified into four broad classes:

1. Geographical classification: This classification groups the data according to
differences in location (places, areas or regions) among the items. The geographical areas
are usually listed in alphabetical order for easy reference.

2. Chronological classification: Chronological classification groups the data according
to the time period (weekly, monthly, quarterly, annually, and so on) in which the items
under consideration occurred.

3. Qualitative classification: In this type of classification, the data are grouped
together according to some distinguishing characteristic or attribute such as religion,
sex, nation and so on. This classification simply identifies whether a given attribute
is present or absent in a given population.

4. Quantitative classification: It refers to the classification of data according to some
characteristic that has been measured, such as weight, height, income and so on.


Tabulation of Data
A table is a systematic arrangement of data in rows and columns, which is easy to under-
stand and makes data fit for further analysis and drawing conclusions.

Tabulation should not be confused with classification, as the two differ in many ways.
The main purpose of classification is to divide the data into homogeneous groups, whereas
in tabulation the data are presented in rows and columns. Hence, classification is a
preliminary step prior to tabulation.

A statistical table, in general, should have the following parts.

1. Table Number: Every table should be identified by a number, which facilitates easy
reference. Whenever you refer to the table in the text, you need give only the number of
the table.

2. Title: There should be a title at the top of every table. The title should be clear,
concise and adequate. The title should answer the questions: What is the data?
Where is the data? How is the data classified? What is the time period of the data?

3. Caption: The caption labels the data presented in a column of the table. There
may be sub-captions in each caption.

4. Stub: It is a title given to each row.

5. Body: The body of the table is the most important part. The information given
in the rows and columns forms the body of the table. It contains the quantitative
information to be presented.

6. Footnote: Any explanatory note concerning the table itself, placed directly beneath
the table, is called a 'footnote'. The main purpose of a footnote is to clarify some of the
specific items given in the table or to explain any ambiguities or omissions in
the data shown in the table.

7. Source Note: If the data is collected from secondary sources, a source note is given
to disclose the sources from which the data is collected.

Though the format of a table has already been discussed, some guidelines for preparing a
table are as follows:

1. The table should contain the required number of rows and columns with stubs and
captions and the whole data should be accommodated within the cells formed corre-
sponding to these rows and columns.

2. If the quantity is zero, it should be entered as zero. Leaving blank space or putting
dash in place of zero is confusing and undesirable.


3. The unit of measurement should be given in parentheses, either just below the
column's caption or along with the stub in the row.

4. If any figure in the table has to be specified for a particular purpose, it should be
marked with an asterisk or another symbol. The marked figure
should be explained beneath the table with the same mark.

2.4 Methods of Data Presentation

2.4.1 Frequency Distributions
The most convenient way of organizing numerical data is to construct a frequency
distribution. A frequency distribution is the organization of raw data in table form, using
classes and frequencies. Here the term 'class' stands for a description of a group of
similar objects in a data set, and 'frequency' is the number of times a variable value is repeated.

There are three types of frequency distributions; categorical, ungrouped and grouped fre-
quency distributions.
1. Categorical frequency distribution: It is used when the variable is qualitative
i.e. either nominal or ordinal. Each category of the variable represents a single class
and the number of times each category repeats represents the frequency of that class.

Example 2.1. The blood types of 25 students are: A B B AB O AB O O B B B A B
B AB O A O AB A O O O AB O. Construct a categorical frequency distribution.

Class (Blood type)    Frequency (Number of students)
A                      4
B                      7
AB                     5
O                      9
Total                 25

Example 2.2. Construct frequency distribution for the following letter grades of 25
students: A B C C C C B B A D A C C A B F C C A B.

2. Ungrouped frequency distribution: It is also called a frequency array. It is a
frequency distribution of numerical data (a quantitative variable) in which each value
of the variable represents a single class and the number of times each value repeats
represents the frequency of that class.

Example 2.3. The number of children for 21 families is: 2 3 5 4 3 3 2 3 1 0 4 3 2 2 1 1
1 4 2 2 2.


Class (No. of children)    Frequency (No. of families)
0                           1
1                           4
2                           7
3                           5
4                           3
5                           1
Total                      21

3. Grouped frequency distribution: Like the ungrouped frequency distribution, the grouped
frequency distribution is used for numerical data, but in a grouped frequency
distribution several values of a variable are grouped into one class and the number of
observations belonging to the class is the frequency of that class. This frequency
distribution is also called a continuous frequency distribution.

Example 2.4. Consider the following age groups and numbers of persons:

Class Limits      Class Boundaries    Frequency
(Age in years)    (Age in years)      (No. of persons)
1-25              0.5-25.5            20
26-50             25.5-50.5           15
51-75             50.5-75.5           25
76-100            75.5-100.5          10
Total                                 70

(a) Class Limits: The lowest and highest values that can be included in a class
are called class limits. The lowest value is called the lower class limit (LCL) and the
highest value is called the upper class limit (UCL). For example, the class limits of the
first class are 1-25, where 1 is the lower class limit and 25 is the upper class limit
of the first class.
(b) Class Boundaries: Class boundaries are the class limits adjusted so that there is no gap
between the upper boundary of one class and the lower boundary of the next class. The
lowest value is called the lower class boundary (LCB) and the highest value is called the
upper class boundary (UCB). The class boundaries of the first class are 0.5-25.5, where the
lower class boundary is 0.5 and the upper class boundary is 25.5. Note that the UCB of
one class is the LCB of the next class.
(c) Class Width: It is the difference between the UCB and the LCB of a class.
It is also the difference between the lower limits of two consecutive classes, or
the difference between the upper limits of two consecutive classes. That is,
\[ W = UCB_i - LCB_i \quad\text{or}\quad W = LCL_i - LCL_{i-1} \quad\text{or}\quad W = UCL_i - UCL_{i-1}. \]
The class width of the above frequency distribution is $W = 25.5 - 0.5 = 25$, or
$W = 26 - 1 = 25$, or $W = 50 - 25 = 25$.


(d) Class Mark: It is the value halfway between the class limits, or between the class
boundaries, of a class.
\[ CM_i = \frac{LCL_i + UCL_i}{2} = \frac{LCB_i + UCB_i}{2} \]
The class marks of the above distribution are $CM_1 = 13$, $CM_2 = 38$, $CM_3 = 63$
and $CM_4 = 88$. Note also that $W = CM_i - CM_{i-1}$.
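
As a rough illustration of the frequency distributions described above, the following Python sketch tallies the blood types of Example 2.1 and derives the class boundaries, class width and class mark for the age classes of Example 2.4. The function name class_summary and the variable names are illustrative assumptions, and a unit of measurement of 1 year is assumed for the age classes.

```python
from collections import Counter

# Blood types of the 25 students in Example 2.1.
blood = "A B B AB O AB O O B B B A B B AB O A O AB A O O O AB O".split()

# Categorical frequency distribution: one class per category.
blood_freq = Counter(blood)

# Class limits of Example 2.4 (age in years); assumed unit of measurement is 1.
limits = [(1, 25), (26, 50), (51, 75), (76, 100)]
UNIT = 1


def class_summary(lcl, ucl, unit):
    """Return (LCB, UCB, width, class mark) for one class."""
    lcb = lcl - unit / 2      # lower class boundary
    ucb = ucl + unit / 2      # upper class boundary
    width = ucb - lcb         # class width W = UCB - LCB
    mark = (lcl + ucl) / 2    # class mark
    return lcb, ucb, width, mark


summaries = [class_summary(lcl, ucl, UNIT) for lcl, ucl in limits]

for category in ("A", "B", "AB", "O"):
    print(category, blood_freq[category])
for (lcl, ucl), (lcb, ucb, width, mark) in zip(limits, summaries):
    print(f"{lcl}-{ucl}: boundaries {lcb}-{ucb}, width {width}, mark {mark}")
```

The same Counter call would also produce an ungrouped frequency distribution if it were given the list of numbers of children from Example 2.3.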

Relative Frequency Distribution

An absolute frequency distribution is a summary table in which the original data are
condensed into groups and their frequencies.
But if a researcher would like to know the proportion or percentage of cases in each group,
instead of simply the number of cases, s/he can do so by constructing a relative frequency
distribution table. The relative frequency distribution is formed by dividing the
frequency of each class of the frequency distribution by the total number of observations.
It can be converted into a percentage frequency distribution by multiplying each
relative frequency by 100.

Relative frequencies are particularly helpful when comparing two or more frequency
distributions in which the numbers of cases under investigation are not equal. Percentage
distributions make such a comparison more meaningful, since percentages are relative
frequencies and hence the total number in the sample or population under consideration
becomes irrelevant.

Class Limits    Class Boundaries    Relative Frequency    Percentage Frequency
1-25            0.5-25.5            20/70 = 0.2857         28.57
26-50           25.5-50.5           15/70 = 0.2143         21.43
51-75           50.5-75.5           25/70 = 0.3571         35.71
76-100          75.5-100.5          10/70 = 0.1429         14.29
Total                               1                      100
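
The calculation behind this table can be sketched in a few lines of Python. This is a minimal sketch using the class frequencies of Example 2.4; the variable names are illustrative assumptions.

```python
# Class frequencies of Example 2.4 (number of persons per age class).
freqs = [20, 15, 25, 10]
total = sum(freqs)

# Relative frequency = class frequency / total number of observations.
rel_freqs = [f / total for f in freqs]

# Percentage frequency = relative frequency * 100.
pct_freqs = [100 * r for r in rel_freqs]

for f, r, p in zip(freqs, rel_freqs, pct_freqs):
    print(f"{f:3d}  {r:.4f}  {p:6.2f}")
```

Because each frequency is divided by the same total, the relative frequencies always sum to 1 (up to rounding), whatever the sample size.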

Cumulative Frequency Distribution

The above frequency distributions tell us the actual number (percentage) of units in each
class, but they do not tell us directly the total number (percentage) of units that lie below
or above specified values of the classes. This can be determined from a cumulative
frequency distribution. A cumulative frequency distribution displays the total number of
observations above (below) a certain value. When the interest of the investigator focuses on
the number of items below a specified value, this specified value is the upper boundary
of a class, and the result is known as a less than cumulative frequency distribution. Similarly,
when the interest lies in finding the number of cases above a specified value, this value is
taken as the lower boundary of a class, and the result is known as a more than cumulative
frequency distribution.


Less than Cumulative Frequency      More than Cumulative Frequency
Class      F                        Class     F
<25.5      20                       >0.5      10+25+15+20=70
<50.5      20+15=35                 >25.5     10+25+15=50
<75.5      20+15+25=60              >50.5     10+25=35
<100.5     20+15+25+10=70           >75.5     10
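
The two cumulative columns are running totals taken from opposite ends of the table. A minimal Python sketch, using the frequencies and class boundaries of Example 2.4 (the variable names are illustrative assumptions):

```python
from itertools import accumulate

freqs = [20, 15, 25, 10]
lower_boundaries = [0.5, 25.5, 50.5, 75.5]
upper_boundaries = [25.5, 50.5, 75.5, 100.5]

# Less than cumulative frequency: running total starting from the first class.
less_than = list(accumulate(freqs))

# More than cumulative frequency: running total starting from the last class,
# then reversed so that it lines up with the classes in their original order.
more_than = list(accumulate(reversed(freqs)))[::-1]

for ucb, lcf, lcb, mcf in zip(upper_boundaries, less_than,
                              lower_boundaries, more_than):
    print(f"<{ucb}: {lcf:3d}    >{lcb}: {mcf:3d}")
```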

Construction of Grouped Frequency Distribution

1. Arrange the data in an array form (increasing or decreasing order).

2. Find the unit of measurement ($U$). The unit of measurement is the smallest numerical
difference between any two distinct values of the data.

3. Find the range ($R$). The range is the maximum numerical difference in the data set, i.e.
the difference between the largest and the smallest values of the variable.

4. Determine the number of classes ($k$) using Sturges' rule, $k = 1 + 3.322 \log_{10} N$, where
$N$ is the total number of observations.

5. Specify the class width ($W$): $W = R/k$.

6. Put the smallest value of the data set as the LCL of the first class. Then obtain the
LCL of the second class by adding the class width $W$ to the LCL of the first class.
Continue adding $W$ until you get $k$ classes.
Let $X$ be the smallest observation. Thus, $LCL_1 = X$ and $LCL_i = LCL_{i-1} + W$ for
$i = 2, 3, \cdots, k$.

7. Now obtain the UCLs of the frequency distribution by adding $W - U$ to the
corresponding LCLs: $UCL_i = LCL_i + (W - U)$ for $i = 1, 2, \cdots, k$.

8. Generate the class boundaries: $LCB_i = LCL_i - \frac{1}{2}U$ and $UCB_i = UCL_i + \frac{1}{2}U$ for
$i = 1, 2, \cdots, k$.

Example 2.5. Construct a grouped frequency distribution for the following scores of 56
students (out of 40).
31 33 33 34 34 35 35 17 31 36 17 18 19 25 26 27 27 19 20 22 31 36 38 13 22 22 35 36 28 28
29 30 30 36 11 13 16 17 17 22 22 23 23 23 23 24 24 24 25 27 27 28 28 30 13 16

Solution:

1. The array form of the data (increasing order):

11 13 13 13 16 16 17 17 17 17 18 19 19 20 22 22 22 22 22 23 23 23 23 24 24 24 25 25
26 27 27 27 27 28 28 28 28 29 30 30 30 31 31 31 33 33 34 34 35 35 35 36 36 36 36 38


2. $U = 17 - 16 = 1$
3. $R = L - S = 38 - 11 = 27$
4. $k = 1 + 3.322 \log_{10} N = 1 + 3.322 \log_{10} 56 = 6.81 \approx 7$
5. $W = R/k = 27/6.81 = 3.96 \approx 4$
6. $W - U = 4 - 1 = 3$

k      CLs     CBs          CM     Freq.   Rel. Freq.   Per. Freq.   LCF   MCF
1      11-14   10.5-14.5    12.5    4      0.0714         7.14         4    56
2      15-18   14.5-18.5    16.5    7      0.1250        12.50        11    52
3      19-22   18.5-22.5    20.5    8      0.1429        14.29        19    45
4      23-26   22.5-26.5    24.5   10      0.1786        17.86        29    37
5      27-30   26.5-30.5    28.5   12      0.2143        21.43        41    27
6      31-34   30.5-34.5    32.5    7      0.1250        12.50        48    15
7      35-38   34.5-38.5    36.5    8      0.1429        14.29        56     8
Total                              56      1            100
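
The construction steps can be traced in code. The following Python sketch applies them to the 56 scores of Example 2.5; the function name grouped_distribution is an illustrative assumption. It rounds $k$ and $W$ to the nearest integer, as the worked solution does, which is an assumption that happens to cover all values here; for other data sets rounding up may be needed so that the largest value still falls in the last class.

```python
import math

scores = [
    31, 33, 33, 34, 34, 35, 35, 17, 31, 36, 17, 18, 19, 25, 26, 27, 27, 19, 20,
    22, 31, 36, 38, 13, 22, 22, 35, 36, 28, 28, 29, 30, 30, 36, 11, 13, 16, 17,
    17, 22, 22, 23, 23, 23, 23, 24, 24, 24, 25, 27, 27, 28, 28, 30, 13, 16,
]


def grouped_distribution(data, unit=1):
    """Return one (LCL, UCL, LCB, UCB, frequency) tuple per class."""
    smallest, largest = min(data), max(data)
    data_range = largest - smallest                 # range R
    k_exact = 1 + 3.322 * math.log10(len(data))     # Sturges' rule
    k = round(k_exact)                              # number of classes
    width = round(data_range / k_exact)             # class width W
    classes = []
    for i in range(k):
        lcl = smallest + i * width                  # LCL_i = LCL_{i-1} + W
        ucl = lcl + (width - unit)                  # UCL_i = LCL_i + (W - U)
        lcb, ucb = lcl - unit / 2, ucl + unit / 2   # class boundaries
        freq = sum(lcl <= x <= ucl for x in data)   # observations in the class
        classes.append((lcl, ucl, lcb, ucb, freq))
    return classes


classes = grouped_distribution(scores, unit=1)
for lcl, ucl, lcb, ucb, freq in classes:
    print(f"{lcl}-{ucl}  {lcb}-{ucb}  {freq}")
```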

Example 2.6. The birth weights (in kilograms) of 30 children were recorded as follows.

2.0 2.1 2.3 3.0 3.1 2.7 2.8 3.5 3.1 3.7
4.0 2.3 3.5 4.2 3.7 3.2 2.7 2.5 2.7 3.8
3.1 3.0 2.6 2.8 2.9 3.5 4.1 3.9 2.8 2.2
Construct frequency distribution for these data.

Guidelines for Constructing Grouped Frequency Distributions

The number of classes and the class width are more or less arbitrary within the
general guidelines established for constructing a frequency distribution. The following are
some guidelines for such construction:
1. Classes should be complete (they should cover the whole data set) and non-overlapping
(no observation should belong to two classes). The classes should be clearly defined,
and each observation should fall in exactly one class.
2. The number of classes should be neither too large nor too small. Normally, between 5
and 20 classes are considered adequate, that is, $5 \le k \le 20$. In fact it depends
on the total number of observations: the larger the number of observations, the more
classes the frequency distribution should have. But we need to condense the data
set into easily manageable classes with minimum loss of information. Fewer classes
mean greater class width, with consequent loss of accuracy. Too many classes
result in greater complexity.


3. The class width should be the same for all classes.

4. Classes should be standardized. Classes should follow a logical and chronological
(increasing) order.

5. Classes should be continuous. Even if there are no values in a class the class must
be included in the frequency distribution.

6. Open ended classes, where there is no lower limit of the first class or no upper
limit of the last class, should be avoided since this creates difficulty in analysis and
interpretation.

Advantages and disadvantages of grouped frequency distributions:

• Advantages:

– It condenses a large mass of data into a comparatively small table.

– It attracts the attention of even a layman and gives him an insight into the
nature of the distribution.
– It helps in further statistical analysis of the data, such as the study of central
tendency, scatter and symmetry.

• Disadvantages:

– In grouped frequency distributions, the identity of the observations is lost.
We know only the number of observations in a class and do not know what the
values are.
– Because the selection of the class width and the lower class limit of the first
class are to a certain extent arbitrary, different frequency distributions may be
constructed for the same data and hence may give contradictory impressions.

2.4.2 Diagrammatic Display of Data

1. Bar Diagram: It is the simplest and most commonly used diagrammatic
representation of a frequency distribution. It is the most common presentation for nominal,
categorical or discrete data. It uses a series of separated and equally spaced bars.
The heights of the bars represent the frequency or relative frequency of the classes.
The width of the bars has no meaning; however, all the bars should have the same
width to avoid distortion, and the bars should be separated by a constant distance.

(a) Simple Bar Diagram: is a diagram in which the categories of a variable are
marked on the X axis and the frequencies of the categories are marked on the
Y axis. It is applicable for discrete variables, that is, for data given according
to some period, places and timings. These periods and timings are represented
on the base line (X axis) at regular intervals and the corresponding frequencies
are represented on the Y axis. The width of the bar represents nothing (it is
meaningless), but it should be equal for all bars. Also, each bar is separated by
an equal space.

Example 2.7. Construct a simple bar chart for the following data.

Marital Status            Number of Individuals
Single                    10
Married                    7
Divorced                   3
Others (Widowed, ...)      1
Total                     21

(b) Component Bar Diagram: is used when there is a desire to show how a total or
aggregate is divided into its component parts. The bars represent the total value of
a variable, with each total broken into its component parts, and different colors
are used for identification. In this type of diagram, a bar is subdivided into
parts in proportion to the size of each subdivision. These subdivided rectangles
are shaded differently by lines, dots and colors so that the components are easy to
compare. Sometimes the volumes of different attributes may
be greatly different. To make meaningful comparisons, the components of
the attributes are reduced to percentages. In that case each attribute will have
100 as its maximum volume. This sort of component bar diagram is known as a
percentage bar diagram.

Example 2.8. Construct component bar chart for the following data.


Marital Status            Male    Female    Total
Single                    8       2         10
Married                   3       4          7
Divorced                  2       1          3
Others (Widowed, ...)     1       0          1

(c) Multiple Bar Diagram: is used to display data on more than one variable. In
a multiple bar diagram, two or more sets of inter-related data are represented.

Example 2.9. Construct a multiple bar chart for the following data.

Year    Coffee    Butter    Sugar
1997    12        10        7
1998     5         9        8
1999    10        12        7
2000     9         8        8


2. Pie Chart: The pie chart is popularly used in practice to show the percentage
breakdown of data. A pie chart is simply a circle divided into a number of slices whose
sizes correspond to the frequency or relative frequency of each class; in other words, it is
a circle representing the total, cut into slices in proportion to the size of the parts
that make up the total.

Example 2.10. Construct a pie chart for the following data.

Marital Status            Number of Individuals
Single                    10
Married                    7
Divorced                   3
Others (Widowed, ...)      1
Total                     21

Solution:

Marital Status    Percentage                                  Degree
Single            $\frac{10 \times 100}{21} = 47.62$          $\frac{47.62 \times 360}{100} = 171.43$
Married           $\frac{7 \times 100}{21} = 33.33$           $\frac{33.33 \times 360}{100} = 119.99$
Divorced          $\frac{3 \times 100}{21} = 14.29$           $\frac{14.29 \times 360}{100} = 51.44$
Others            $\frac{1 \times 100}{21} = 4.76$            $\frac{4.76 \times 360}{100} = 17.14$
Total             100                                         360
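
The slice angles can also be computed directly from the counts. The Python sketch below is a minimal illustration using the marital status data; the variable names are illustrative assumptions. Because it works from unrounded proportions rather than from percentages rounded to two decimals, some angles may differ from the table in the second decimal place.

```python
# Marital status counts from Example 2.10.
counts = {"Single": 10, "Married": 7, "Divorced": 3, "Others": 1}
total = sum(counts.values())

# Percentage share and slice angle of each category.
percentages = {status: 100 * n / total for status, n in counts.items()}
angles = {status: 360 * n / total for status, n in counts.items()}

for status in counts:
    print(f"{status:10s} {percentages[status]:6.2f}%  {angles[status]:7.2f} degrees")
```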

2.4.3 Graphical Presentation of Data

1. Histogram: Histogram is the most common graphical presentation of a frequency
distribution for numerical data. It uses a series of adjacent bars in which the width of
each bar represents the class width and the heights represent the frequency or relative
frequency of the class. It is used for grouped data in which the class boundaries are
marked on the X axis and the frequencies are marked along the Y axis.

2. Frequency polygon: It is a graph that consists of line segments connecting the
points formed by the class marks and the frequencies of a continuous frequency
distribution. It can also be constructed from a histogram by joining the mid-points of the
tops of the bars. It is called a frequency curve if the points are joined by a smooth
freehand sketch.

3. Cumulative Frequency (Ogive) curves: As there are two cumulative frequency
distributions, there are two Ogive curves. These are the less than cumulative frequency
curve, which is a line graph joining the points formed by the upper class boundaries
and their corresponding less than cumulative frequencies, and the more than
cumulative frequency curve, which is a line graph joining the points formed by the
lower class boundaries and their corresponding more than cumulative frequencies.
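
To make the three graphs concrete, the following Python sketch lists the points that would be plotted for the score distribution of Example 2.5. It only computes coordinates; drawing them with a plotting tool is left out, and the variable names are illustrative assumptions.

```python
from itertools import accumulate

# Class boundaries and frequencies of the score distribution in Example 2.5.
boundaries = [10.5, 14.5, 18.5, 22.5, 26.5, 30.5, 34.5, 38.5]
freqs = [4, 7, 8, 10, 12, 7, 8]

# Frequency polygon: (class mark, frequency) pairs joined by line segments.
marks = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]
polygon = list(zip(marks, freqs))

# Less than ogive: (upper class boundary, less than cumulative frequency).
less_than_ogive = list(zip(boundaries[1:], accumulate(freqs)))

# More than ogive: (lower class boundary, more than cumulative frequency).
more_than_cf = list(accumulate(freqs[::-1]))[::-1]
more_than_ogive = list(zip(boundaries[:-1], more_than_cf))

print(polygon)
print(less_than_ogive)
print(more_than_ogive)
```

A histogram of the same data would use the consecutive pairs of boundaries as the bar edges and the frequencies as the bar heights.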

Chapter 3

Measures of Central Tendency

Usually the collected data are not suitable for drawing conclusions about the mass from which
they have been taken. Even though the data are somewhat summarized after they have been
depicted using frequency distributions and presented using graphs and diagrams, we still
cannot make inferences about the data since there are many groups. Hence, organizing
data into a frequency distribution is not sufficient; there is a need for further condensation.
In particular, to compare two or more distributions, we may reduce each entire distribution
to one number that represents it. A single value which can be
considered as typical or representative of a set of observations, and around which the
observations can be considered as centered, is called an 'average' (or average value or center
of location). Since such typical values tend to lie centrally within a set of observations
when arranged according to magnitude, averages are called measures of central tendency.

3.1 Objectives of Measures of Central Tendency

1. To condense a mass of data into one single value: That is, to get a single
value which best represents the data (that describes the characteristics of
the entire data). Measures of central tendency, by condensing masses of data into
one single value, enable us to get an idea of the entire data. Thus, one value can
represent thousands of observations or even more.

2. To facilitate comparison: Statistical devices like averages, percentages and ratios
are used for this purpose. For example, to compare the performances of two classes, A
and B, instead of comparing each student's result, which is infeasible, we can compare
the average marks of the two classes.

There are many types of measures of central tendency, each possessing particular properties
and each being typical in some unique way. The most frequently encountered ones are:
• Computed averages: Mean (Arithmetic Mean, Geometric Mean and Harmonic Mean)

• Positional averages: Median and Quantiles (Quartiles, Deciles and Percentiles)


• Mode

3.2 Characteristics of Good Measure of Central Tendency

There are various measures of central tendency. The difficulty lies in choosing among them,
as no hard and fast rule has been made for selecting any one. However, some norms have been
set which work as a guideline for choosing a particular measure of central tendency. A
measure of central tendency is good or satisfactory if it possesses the following characteristics,
although no measure satisfies all of them:

1. It should be calculated based on all observations.

2. It should not be affected by extreme values. It should be as close as possible to the
greatest number of observed values.

3. It should be defined rigidly, which means it should have a definite value (it should be
unique).

4. It should always exist.

5. It should be easy to understand and calculate. It should not be subject to complicated
and tedious calculations, though the advent of electronic calculators and computers
has made such calculations feasible.

6. It should be stable with regard to sampling. This means that if a number of samples
of the same size are drawn from a population, the measure of central tendency having
the minimum variation among the different calculated values should be preferred.

7. It should be capable of further algebraic treatment. By algebraic treatment, we mean
that the measure can be used in the formulation of other formulae or in
further statistical analysis.

3.3 Summation Notation

The sum $X_1 + X_2 + \cdots + X_n$ is denoted using the Greek letter $\Sigma$ (Sigma) as
$\sum_{i=1}^{n} X_i$, and this is called the summation notation.

Properties of the summation notation:

• $\sum_{i=1}^{n} (X_i \pm Y_i) = \sum_{i=1}^{n} X_i \pm \sum_{i=1}^{n} Y_i$

• $\sum_{i=1}^{n} X_i Y_i = X_1 Y_1 + X_2 Y_2 + \cdots + X_n Y_n$

• $\sum_{i=1}^{n} c = nc$, where $c$ is a constant.

• $\sum_{i=1}^{n} (X_i \pm c) = \sum_{i=1}^{n} X_i \pm nc$

• $\sum_{i=1}^{n} c X_i = c \sum_{i=1}^{n} X_i$

• $\sum_{i=1}^{n} (X_i \pm Y_i)^2 = \sum_{i=1}^{n} X_i^2 \pm 2 \sum_{i=1}^{n} X_i Y_i + \sum_{i=1}^{n} Y_i^2$

• $\sum_{i=1}^{n} X_i Y_i \neq \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i$

• $\sum_{i=1}^{n} X_i^2 \neq \left( \sum_{i=1}^{n} X_i \right)^2$
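
These properties can be checked numerically. The Python sketch below uses small made-up lists X and Y and a constant c; the values are assumptions chosen only for illustration. It verifies the equalities and shows that the two sides of the last two properties differ for these values.

```python
# Illustrative values (assumed, not taken from the text).
X = [2, 4, 8]
Y = [1, 3, 5]
c = 10
n = len(X)

# Sum of a sum, sum of a constant, and constants factoring out.
assert sum(x + y for x, y in zip(X, Y)) == sum(X) + sum(Y)
assert sum(c for _ in X) == n * c
assert sum(x + c for x in X) == sum(X) + n * c
assert sum(c * x for x in X) == c * sum(X)

# Expansion of the sum of squared sums.
sum_sq_x = sum(x * x for x in X)
sum_sq_y = sum(y * y for y in Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
assert sum((x + y) ** 2 for x, y in zip(X, Y)) == sum_sq_x + 2 * sum_xy + sum_sq_y

# The sum of products is not the product of the sums,
# and the sum of squares is not the square of the sum.
product_of_sums = sum(X) * sum(Y)
square_of_sum = sum(X) ** 2
print(sum_xy, product_of_sums)
print(sum_sq_x, square_of_sum)
```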

3.4 Mean
3.4.1 Arithmetic Mean
1. Simple arithmetic mean: The arithmetic mean is the simplest but most useful
measure of central tendency. It is nothing but the 'average' which we compute in
high school arithmetic. It is defined as the sum of all observations divided by the
total number of observations. The sample mean is denoted by $\bar{X}$ (read as X bar)
while the population mean is represented by the Greek letter $\mu$ (mu).
• For a sample of $n$ raw (individual) observations, $X_1, X_2, \cdots, X_n$:
\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \]
• For grouped data (continuous or ungrouped frequency distributions):
\[ \bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i} \]
where $X_i$ is the class mark of the $i$th class for grouped data, or the $i$th class value
for ungrouped data, and $f_i$ is the corresponding frequency.

Example 3.1. Find the arithmetic mean of 2, 4 and 8.

\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{2 + 4 + 8}{3} = \frac{14}{3} = 4.67 \]

Example 3.2. Find the mean for the frequency distribution of the students' score data
considered in Example 2.5.

To find the mean of the frequency distribution, the necessary calculations are as
follows:

Class Boundaries    Class Marks ($X_i$)    Frequency ($f_i$)    $f_i X_i$
10.5-14.5           12.5                    4                     50
14.5-18.5           16.5                    7                    115.5
18.5-22.5           20.5                    8                    164
22.5-26.5           24.5                   10                    245
26.5-30.5           28.5                   12                    342
30.5-34.5           32.5                    7                    227.5
34.5-38.5           36.5                    8                    292
Total                                      56                   1436

Thus,
\[ \bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i} = \frac{\sum_{i=1}^{7} f_i X_i}{\sum_{i=1}^{7} f_i} = \frac{1436}{56} = 25.64 \]
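
The same calculation can be sketched in Python. The variable names below are illustrative assumptions; the class boundaries and frequencies are those of the table above.

```python
# Class boundaries and frequencies of the students' score distribution.
boundaries = [(10.5, 14.5), (14.5, 18.5), (18.5, 22.5), (22.5, 26.5),
              (26.5, 30.5), (30.5, 34.5), (34.5, 38.5)]
freqs = [4, 7, 8, 10, 12, 7, 8]

# Class mark = midpoint of the class boundaries.
marks = [(lcb + ucb) / 2 for lcb, ucb in boundaries]

# Mean of grouped data = sum(f_i * X_i) / sum(f_i).
total_fx = sum(f * x for f, x in zip(freqs, marks))
mean = total_fx / sum(freqs)

print(total_fx, round(mean, 2))
```

Note that this is the mean of the grouped table, which treats every observation in a class as if it were equal to the class mark, so it need not equal the mean of the raw scores exactly.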

Properties of Arithmetic Mean

(a) The algebraic sum of the deviations of each value from the arithmetic mean is
zero. That is, $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$.

(b) The sum of the squares of the deviations from the mean is less than the sum
of the squares of the deviations about any other value, that
is, the sum of the squares of the deviations from the mean is minimum. That is,
\[ \sum_{i=1}^{n} (X_i - \bar{X})^2 < \sum_{i=1}^{n} (X_i - a)^2, \quad a \neq \bar{X} \]


(c) If a constant $c$ is added to or subtracted from each value in a distribution, then
the new mean will be $\bar{X}_{new} = \bar{X}_{old} \pm c$, respectively.
(d) If each value of a distribution is multiplied by a constant $c$, the new mean will
be the original mean multiplied by $c$.
(e) Combined Mean: If there are $g$ different groups having the same units of
measurement with means $\bar{X}_1, \bar{X}_2, \cdots, \bar{X}_g$ and numbers of sample observations
$n_1, n_2, \cdots, n_g$ respectively, then the mean of all the groups, called the combined
mean (denoted by $\bar{X}_c$), is given by:
\[ \bar{X}_c = \frac{\sum_{i=1}^{g} n_i \bar{X}_i}{\sum_{i=1}^{g} n_i} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2 + \cdots + n_g \bar{X}_g}{n_1 + n_2 + \cdots + n_g} \]

Examples:
i. The mean weight of 50 women working in a factory is 48 kilograms. The
mean weight of 75 men working in the same factory is 58 kilograms. Find
the mean weight of all workers in the factory.
ii. The mean mark in statistics of 50 students in a class was 72 and that of the
35 boys was 75. Find the mean mark of the girls in the class. Ans:65
iii. The mean salary of 100 laborers working in a factory, running in two shifts
of 40 and 60 workers respectively is birr 380. The mean salary of the 40
laborers working in the morning shift is 350. Find the mean salary of the
60 laborers working in the evening shift.
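Solutions:
i. $n_w = 50$, $\bar{X}_w = 48$, $n_m = 75$, $\bar{X}_m = 58$, $\bar{X}_c = ?$
\[ \bar{X}_c = \frac{n_w \bar{X}_w + n_m \bar{X}_m}{n_w + n_m} = \frac{50 \times 48 + 75 \times 58}{50 + 75} = \frac{6750}{125} = 54 \]
ii. $n = 50$, $\bar{X}_c = 72$, $n_b = 35$, $\bar{X}_b = 75$ $\Rightarrow n_g = n - n_b = 50 - 35 = 15$, $\bar{X}_g = ?$
\[ \bar{X}_c = \frac{n_b \bar{X}_b + n_g \bar{X}_g}{n} \Rightarrow \bar{X}_g = \frac{n \bar{X}_c - n_b \bar{X}_b}{n_g} = \frac{50(72) - 35(75)}{15} = 65 \]
iii. $n = 100$, $\bar{X}_c = 380$, $n_m = 40$, $\bar{X}_m = 350$, $n_e = 60$, $\bar{X}_e = ?$
\[ \Rightarrow \bar{X}_e = \frac{n \bar{X}_c - n_m \bar{X}_m}{n_e} = \frac{100(380) - 40(350)}{60} = 400 \]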
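The three combined-mean problems can also be checked numerically. The sketch below assumes two helper functions, `combined_mean` and `missing_group_mean`, whose names are introduced here for illustration only; the second simply rearranges the combined-mean formula for one unknown group mean.

```python
def combined_mean(sizes, means):
    """Combined mean: sum(n_i * mean_i) / sum(n_i)."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


def missing_group_mean(total_n, overall_mean, known_n, known_mean):
    """Mean of the remaining group, from the rearranged combined-mean formula."""
    remaining_n = total_n - known_n
    return (total_n * overall_mean - known_n * known_mean) / remaining_n


workers = combined_mean([50, 75], [48, 58])       # problem i: women and men
girls = missing_group_mean(50, 72, 35, 75)        # problem ii: 15 girls
evening = missing_group_mean(100, 380, 40, 350)   # problem iii: evening shift
print(workers, girls, evening)
```

Problem i evaluates to (50 x 48 + 75 x 58) / 125 = 6750 / 125 = 54 kilograms.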
Example 3.3. The mean of 100 values was found to be 40. It was later discovered
that a value was misread as 83 instead of 53. Find the correct mean.


Example 3.4. The mean of 200 items was found to be 50. Later on it was discovered
that two items were wrongly read as 92 and 8 instead of the correct values 192 and
88 respectively. Find the correct mean.

2. Weighted Arithmetic Mean: In calculating the simple arithmetic mean, we
gave equal importance to all values. But there are cases where the relative
importance is not the same for all values. When this is the case, it is necessary to
assign them weights (that is, relative importance) and then calculate the weighted
arithmetic mean. Let $X_1, X_2, \cdots, X_n$ be the values and $w_1, w_2, \cdots, w_n$ be the
corresponding weights; then the weighted arithmetic mean is denoted by $\bar{X}_w$ and is
given by:
\[ \bar{X}_w = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i} = \frac{w_1 X_1 + w_2 X_2 + \cdots + w_n X_n}{w_1 + w_2 + \cdots + w_n} \]

Example 3.5. A teacher assigns a weight of 2 to the quiz, 3 to the midterm and 5 to the
final exam. If a student gets 90, 50 and 60 for the quiz, midterm and final exam
respectively, what is his/her average academic performance?
Solution: $X_i = 90, 50, 60$ and $w_i = 2, 3, 5$
\[ \bar{X}_w = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i} = \frac{\sum_{i=1}^{3} w_i X_i}{\sum_{i=1}^{3} w_i} = \frac{2(90) + 3(50) + 5(60)}{2 + 3 + 5} = \frac{630}{10} = 63 \]
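As a quick check of the weighted mean, here is a minimal Python sketch. The function name `weighted_mean` is an assumption made for this illustration.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w_i * X_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Quiz, midterm and final marks with weights 2, 3 and 5 (Example 3.5).
performance = weighted_mean([90, 50, 60], [2, 3, 5])
print(performance)
```

With equal weights the function reduces to the simple arithmetic mean.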

The arithmetic mean fulfils all the characteristics of a good measure of central tendency,
with the exception that it is highly affected by extreme values. In addition, it cannot be
calculated for a frequency distribution with open-ended classes (a frequency distribution
with no lower boundary of the first class, no upper boundary of the last class, or both).

3.4.2 Geometric Mean

In algebra, the geometric mean is calculated for a geometric progression, but in statistics
the data need not form a progression. What matters here is the type of data for which
the geometric mean gives a good mean value. If the variable values are measured as
ratios, proportions or percentages, and some values are much larger in magnitude than
others, then the geometric mean is a better measure of central tendency than the simple
average, because the arithmetic mean is strongly biased toward the large numbers in the
series.

Geometric mean is defined as the nth root of the product of n positive numerical values.


• For raw data, $X_1, X_2, \cdots, X_n$:
\[ GM = \sqrt[n]{\prod_{i=1}^{n} X_i} = \sqrt[n]{X_1 \times X_2 \times \cdots \times X_n} \]

• For grouped data:
\[ GM = \sqrt[n]{\prod_{i=1}^{k} X_i^{f_i}} = \sqrt[n]{X_1^{f_1} \times X_2^{f_2} \times \cdots \times X_k^{f_k}} \]
where $X_i$ is the class mark of the $i$th class, $f_i$ is the corresponding class frequency and
$n = \sum_{i=1}^{k} f_i$.

The above formulas are convenient only if $n = \sum_{i=1}^{k} f_i$ is small. If it is large, it is
difficult to calculate the $n$th root directly. Thus, to facilitate the computation, we make
use of logarithms:
\[ GM = \text{antilog}\left( \frac{1}{n} \sum_{i=1}^{n} \log X_i \right) \text{ for ungrouped data and} \]
\[ GM = \text{antilog}\left( \frac{1}{\sum_{i=1}^{k} f_i} \sum_{i=1}^{k} f_i \log X_i \right) \text{ for grouped data.} \]

The disadvantage of geometric mean is that it will be meaningless if one or more obser-
vations are zero or negative. It is also affected by extreme values but not to the extent of
arithmetic mean.

Example 3.6. Find the geometric mean of 2, 4 and 8.

\[ GM = \sqrt[n]{\prod_{i=1}^{n} X_i} = \sqrt[3]{\prod_{i=1}^{3} X_i} = \sqrt[3]{2(4)(8)} = \sqrt[3]{64} = 4 \]

Example 3.7. The price of a commodity increased by 5% from 1989 to 1990, 8% from
1990 to 1991 and by 77% from 1991 to 1992. Find the average price increase.

For increment, take the base line value as 100% and then add the % increase so as to get
the values in successive years.


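Year         % increase    Value ($X_i$), base = 100%    $\log X_i$

1989-1990     5            105                            2.02
1990-1991     8            108                            2.03
1991-1992    77            177                            2.25
Total                                                     $\sum \log X_i = 6.30$

Then,
\[ GM = \text{antilog}\left( \frac{1}{n} \sum_{i=1}^{n} \log X_i \right) = \text{antilog}\left( \frac{1}{3} \sum_{i=1}^{3} \log X_i \right) = \text{antilog}\left( \frac{1}{3}(6.30) \right) = \text{antilog}(2.1) = 125.89 \]
Therefore, the average price increase is 25.89% per year. (The logarithms above are rounded
to two decimal places; with unrounded logarithms the result is about 26.14%.)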
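The logarithmic form of the geometric mean is easy to sketch in Python. The function name `geometric_mean` below is an assumption for this illustration. Because the code does not round the logarithms to two decimals, its result for the price data is about 126.14 rather than the hand-computed 125.89.

```python
import math


def geometric_mean(values):
    """Geometric mean computed as antilog of the mean of the base-10 logs."""
    if any(x <= 0 for x in values):
        raise ValueError("geometric mean needs strictly positive values")
    mean_log = sum(math.log10(x) for x in values) / len(values)
    return 10 ** mean_log  # the antilog


gm_simple = geometric_mean([2, 4, 8])        # Example 3.6
gm_price = geometric_mean([105, 108, 177])   # Example 3.7, values relative to base 100
print(round(gm_simple, 4), round(gm_price, 2))
```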

Example 3.8. A machine depreciated by 10% in each of the first two years and by 40% in
the third year. Find the average rate of depreciation.

As in the previous example, take the baseline value of the machine as 100% and then deduct
the percentage of depreciation to get the depreciated values in successive years.

Year    % of depreciation    Value ($X_i$), base = 100%    $\log X_i$

1       10                   90                             1.9542
2       10                   90                             1.9542
3       40                   60                             1.7782
Total                                                       $\sum \log X_i = 5.6866$

Then,
\[ GM = \text{antilog}\left( \frac{1}{3}(5.6866) \right) = \text{antilog}(1.8955) \approx 78.6 \]
Therefore, the machine depreciated by about $100 - 78.6 = 21.4\%$ per year on average.

Example 3.9. The decadal percentage growth of the population of country A is given below.
Find the average rate of growth.

Year 1921 1931 1941 1951 1961 1971 1981

% increase 8.25 19.08 32.09 41.49 25.89 37.91 46.02

3.4.3 Harmonic Mean

In algebra, the harmonic mean is found in the case of a harmonic progression only. But in
statistics, the harmonic mean is a suitable measure of central tendency when the data pertain
to speeds, rates and times. That is, the harmonic mean is another specialized average which is
useful in averaging variables expressed as a rate per unit of time, such as speed or the number
of units produced per day.

The harmonic mean is defined as the inverse of the arithmetic mean of the reciprocals of the
values.


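• For raw data, $X_1, X_2, \cdots, X_n$:
\[ HM = \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}} = \frac{n}{\frac{1}{X_1} + \frac{1}{X_2} + \cdots + \frac{1}{X_n}} \]

• For grouped data:
\[ HM = \frac{\sum_{i=1}^{k} f_i}{\sum_{i=1}^{k} \frac{f_i}{X_i}} = \frac{\sum_{i=1}^{k} f_i}{\frac{f_1}{X_1} + \frac{f_2}{X_2} + \cdots + \frac{f_k}{X_k}} \]
where $X_i$ is the class mark of the $i$th class, $f_i$ is the corresponding class frequency and
$n = \sum_{i=1}^{k} f_i$.

Similar to the weighted arithmetic mean, there is also a weighted harmonic mean. It is given by:
\[ HM = \frac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n} \frac{w_i}{X_i}} = \frac{\sum_{i=1}^{n} w_i}{\frac{w_1}{X_1} + \frac{w_2}{X_2} + \cdots + \frac{w_n}{X_n}} \]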

The harmonic mean is less affected by extremely large values than the arithmetic and
geometric means, although it is sensitive to very small values. It cannot be calculated when
one or more observations are zero.
Example 3.10. Find the harmonic mean of 2, 4 and 8.
$X_i = 2, 4, 8$;
\[ HM = \frac{3}{1/2 + 1/4 + 1/8} = \frac{3}{0.875} = 3.429 \]
Example 3.11. In a factory, one mechanic takes 15 days to fabricate a machine, the second
mechanic takes 18 days, the third takes 30 days and the fourth takes 90 days. Find the
average number of days taken by the workers to fabricate the machine.
$X_i = 15, 18, 30, 90$;
\[ HM = \frac{4}{1/15 + 1/18 + 1/30 + 1/90} = \frac{4}{15/90} = 24 \]
Example 3.12. Suppose a train moves 100 km with a speed of 40 km per hour, then 150
km with a speed of 50 km per hour and the next 135 km with a speed of 45 km per hour.
Calculate the average speed of the train.
$X_i = 40, 50, 45$ and $w_i = 100, 150, 135$;
\[ HM = \frac{100 + 150 + 135}{100/40 + 150/50 + 135/45} = \frac{385}{8.500} = 45.294 \]
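Example 3.13. Suppose a train moves for 5 hours at a speed of 40 km per hour, then for 3
hours at a speed of 50 km per hour and for the next 5 hours at a speed of 45 km per hour.
Calculate the average speed of the train.
$X_i = 40, 50, 45$ and $w_i = 5, 3, 5$. Here the weights are travel times rather than
distances, so the average speed (total distance divided by total time) is the weighted
arithmetic mean of the speeds:
\[ \bar{X}_w = \frac{5(40) + 3(50) + 5(45)}{5 + 3 + 5} = \frac{575}{13} = 44.23 \]
The weighted harmonic mean with the same weights, $13/(5/40 + 3/50 + 5/45) = 13/0.2961 = 43.90$,
would understate the average speed in this case.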
Example 3.14. A driver traveled 400 km per day for three days at speeds of 60, 50 and
40 kilometers per hour. Find the average speed of the driver.
$X_i = 60, 50, 40$ and $w_i = 400, 400, 400$;
\[ HM = \frac{400 + 400 + 400}{400/60 + 400/50 + 400/40} = \frac{3}{1/60 + 1/50 + 1/40} = \frac{1200}{24.667} = 48.65 \]
Example 3.15. A student reads the first 100 pages of a book at a rate of 5 pages per hour and
the next 100 pages at a rate of 8 pages per hour. What is the student's average reading
speed?
$X_i = 5, 8$ and $w_i = 100, 100$;
\[ HM = \frac{2}{1/5 + 1/8} = \frac{2}{0.325} = 6.15 \]
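The simple and weighted harmonic means can be sketched with one Python function. The name `harmonic_mean` and its optional `weights` argument are assumptions made for this illustration; equal weights reproduce the simple harmonic mean.

```python
def harmonic_mean(values, weights=None):
    """Weighted harmonic mean: sum(w_i) / sum(w_i / X_i); equal weights by default."""
    if weights is None:
        weights = [1] * len(values)
    if any(x == 0 for x in values):
        raise ValueError("harmonic mean is undefined when a value is zero")
    return sum(weights) / sum(w / x for x, w in zip(values, weights))


hm_simple = harmonic_mean([2, 4, 8])           # Example 3.10
hm_days = harmonic_mean([15, 18, 30, 90])      # Example 3.11
# Example 3.12: speeds weighted by the distances (km) covered at each speed.
hm_train = harmonic_mean([40, 50, 45], weights=[100, 150, 135])
print(round(hm_simple, 3), round(hm_days, 2), round(hm_train, 3))
```

Weighting speeds by distance makes the result equal to total distance divided by total time, which is why the harmonic mean suits this kind of data.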

Relationships between $AM$, $GM$ and $HM$

• For $n$ positive observations: $AM \geq GM \geq HM$

• For two positive observations: $GM = \sqrt{AM \times HM}$

Example 3.16. The arithmetic mean of two observations is 36 and their harmonic mean
is 25. What is the geometric mean of the two observations?
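Both relationships can be checked numerically. The sketch below uses an assumed helper `three_means` and an arbitrary illustrative pair of values (2 and 8); neither comes from the text. The last line applies the two-observation identity to the figures of Example 3.16.

```python
import math


def three_means(values):
    """Return (AM, GM, HM) for a list of positive values."""
    n = len(values)
    am = sum(values) / n
    gm = math.prod(values) ** (1 / n)
    hm = n / sum(1 / x for x in values)
    return am, gm, hm


am, gm, hm = three_means([2, 8])       # illustrative pair of positive values
print(am, gm, hm)

# Example 3.16: AM = 36 and HM = 25 for two observations.
gm_example = math.sqrt(36 * 25)
print(gm_example)
```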

3.5 Median
It has been pointed out that the mean cannot be calculated for a frequency distribution
with open-ended classes. The mean is also affected to a great extent by extreme
values. For instance, suppose eight persons earn salaries of Birr 150, 225, 240, 260,
275, 290, 300 and 1500. The mean salary of the persons is Birr 405. This value is not a
good measure of central tendency because, out of the eight people, seven get Birr 300 or
less. Hence, a better measure is preferable, and the median is one such measure.

The median is the halfway point in a data set. It divides a data set into two equal parts
such that half of the numbers have values less than the median and half have values
greater than the median. Graphically, the median is located at the intersection point of the
less than and more than cumulative frequency curves.

The median (denoted by $\tilde{X}$) of a set of $n$ observations $X_1, X_2, \cdots, X_n$ arranged in ascending
or descending order of magnitude is the middle value if $n$ is odd or the arithmetic mean of
the two middle values if $n$ is even. That is:
• if $n$ is odd, $\tilde{X} = \left( \frac{n+1}{2} \right)^{th}$ value.
• if $n$ is even, $\tilde{X} = \frac{\left( \frac{n}{2} \right)^{th} \text{value} + \left( \frac{n}{2} + 1 \right)^{th} \text{value}}{2}$.
Example 3.17. Find the median of the following two data sets:

a. 180, 201, 220, 191, 219, 209 and 220.

b. 62, 63, 64, 65, 66, 66, 68 and 78

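Arrange each data set in ascending order and use the formula for raw data:

a. Data array: 180, 191, 201, 209, 219, 220, 220. Since $n = 7$ is odd, the median is the 4th value = 209.

b. The data are already ordered. Since $n = 8$ is even, the median is (4th value + 5th value)/2 = (65+66)/2 = 65.5.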
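The rule for raw data can be sketched as follows. The function name `median_raw` is an assumption for this illustration; the function sorts the data first, so the input need not be ordered.

```python
def median_raw(values):
    """Middle value if n is odd, mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # the ((n + 1) / 2)th value
    return (ordered[mid - 1] + ordered[mid]) / 2  # mean of (n/2)th and (n/2 + 1)th


median_a = median_raw([180, 201, 220, 191, 219, 209, 220])
median_b = median_raw([62, 63, 64, 65, 66, 66, 68, 78])
print(median_a, median_b)
```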

The median value for grouped frequency distributions is given by the formula:
\[ \tilde{X} = L_{\tilde{X}} + \left( \frac{\frac{n}{2} - F_{\tilde{X}-1}}{f_{\tilde{X}}} \right) \times w \]
where $n = \sum_{i=1}^{k} f_i$ is the total number of observations, $f_{\tilde{X}}$ is the frequency of the median class,
$L_{\tilde{X}}$ is the lower class boundary of the median class, $F_{\tilde{X}-1}$ is the less than cumulative
frequency just before the median class (that is, the sum of all the frequencies up to but not
including the median class) and $w$ is the class width of the median class. The median class
is the class corresponding to the minimum less than cumulative frequency which contains
the value $\frac{n}{2}$.

Example 3.18. Find the median mark of the students score data and interpret it.


First calculate less than cumulative frequency of the frequency distribution and identify
the median class.
Class Boundaries fi LCF (Fi )
10.5-14.5 4 4
14.5-18.5 7 11
18.5-22.5 8 19
22.5-26.5 10 29
26.5-30.5 12 41
30.5-34.5 7 48
34.5-38.5 8 56
Total 56

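The median class is the first class whose less than cumulative frequency reaches the
value $n/2 = 56/2 = 28$. Since $19 < 28 \leq 29$, the median class is 22.5-26.5.
\[ \tilde{X} = L_{\tilde{X}} + \left( \frac{\frac{n}{2} - F_{\tilde{X}-1}}{f_{\tilde{X}}} \right) \times w = 22.5 + \frac{28 - 19}{10} \times 4 = 22.5 + 3.6 = 26.1 \]
That is, about half of the students scored less than 26.1 and the other half scored more.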
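The grouped median formula can be sketched in Python. The function name `grouped_median` and the representation of classes as (lower, upper) boundary pairs are assumptions made for this illustration.

```python
def grouped_median(boundaries, freqs):
    """Interpolated median of a grouped frequency distribution.

    boundaries: (lower, upper) class boundaries in ascending order.
    freqs: the corresponding class frequencies.
    """
    n = sum(freqs)
    if n == 0:
        raise ValueError("empty frequency distribution")
    target = n / 2
    cumulative = 0  # less than cumulative frequency before the current class
    for (lower, upper), f in zip(boundaries, freqs):
        if cumulative + f >= target:  # first class whose LCF reaches n/2
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f


score_classes = [(10.5, 14.5), (14.5, 18.5), (18.5, 22.5), (22.5, 26.5),
                 (26.5, 30.5), (30.5, 34.5), (34.5, 38.5)]
score_freqs = [4, 7, 8, 10, 12, 7, 8]
median_score = grouped_median(score_classes, score_freqs)
print(round(median_score, 2))
```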

Example 3.19. Find the median of the following data.

Class 13.5-22.5 22.5-31.5 31.5-40.5 40.5-49.5 49.5-58.5

Frequency 3 9 12 20 3

The median is not influenced by extreme values. It can be calculated for frequency
distributions with open-ended classes, and it can even be located when the data are incomplete.

3.6 Other Measures of Location: Quantiles

As discussed before, median divides a given data set into two equal parts. There are also
other positional measures that divide a given data set into more than two equal parts.
These measures are collectively known as quantiles. Quantiles include quartiles, deciles
and percentiles. For all of these measures, first, the data should be arranged in ascending
order.

3.6.1 Quartiles
Quartiles are values that divide a data set into four equal parts. These values are denoted
by Q1 , Q2 and Q3 such that 25% of the data fall below Q1 , 50% below Q2 and 75% below Q3 .
Let $Q_i$ be the $i$th quartile ($i = 1, 2, 3$), then $Q_i = \left( \frac{i(n+1)}{4} \right)^{th}$ value.


Example 3.20. Given the data: 420, 430, 435, 438, 441, 449, 490, 500, 510 and 515. Find
all the quartiles.

$Q_i = \left( \frac{i(n+1)}{4} \right)^{th}$ value, $i = 1, 2, 3$

$Q_1 = \left( \frac{10+1}{4} \right)^{th}$ value = 2.75th value = 2nd value + 0.75 (3rd value - 2nd value) =
430 + 0.75(435 - 430) = 433.75

$Q_2 = \left( \frac{2(10+1)}{4} \right)^{th}$ value = 5.5th value = 5th value + 0.5 (6th value - 5th value) =
441 + 0.5(449 - 441) = 445

$Q_3 = \left( \frac{3(10+1)}{4} \right)^{th}$ value = 8.25th value = 8th value + 0.25 (9th value - 8th value) =
500 + 0.25(510 - 500) = 502.5
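The positional rule with interpolation can be sketched as one function that serves quartiles, deciles and percentiles alike. The name `positional_quantile` and the parameter `parts` (4, 10 or 100) are assumptions for this illustration, as is the choice to return the smallest or largest value when the computed position falls outside the data.

```python
def positional_quantile(values, i, parts):
    """Value at 1-based position i * (n + 1) / parts in the ordered data,
    interpolating linearly between neighbouring values."""
    ordered = sorted(values)
    n = len(ordered)
    position = i * (n + 1) / parts
    whole = int(position)          # integer part of the position
    fraction = position - whole    # fractional part used for interpolation
    if whole < 1:                  # assumption: clamp to the smallest value
        return ordered[0]
    if whole >= n:                 # assumption: clamp to the largest value
        return ordered[-1]
    return ordered[whole - 1] + fraction * (ordered[whole] - ordered[whole - 1])


data = [420, 430, 435, 438, 441, 449, 490, 500, 510, 515]
q1 = positional_quantile(data, 1, 4)
q2 = positional_quantile(data, 2, 4)
q3 = positional_quantile(data, 3, 4)
print(q1, q2, q3)
```

Calling the same function with `parts=10` or `parts=100` gives deciles and percentiles of the same data.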
For a frequency distribution,
\[ Q_i = L_{Q_i} + \left( \frac{\frac{in}{4} - F_{Q_i - 1}}{f_{Q_i}} \right) \times w, \quad i = 1, 2, 3 \]
where $n = \sum_{i=1}^{k} f_i$ is the total number of observations, $f_{Q_i}$ is the frequency of the $i$th quartile class,
$L_{Q_i}$ is the lower class boundary of the $i$th quartile class, $F_{Q_i - 1}$ is the less than cumulative frequency
just before the $i$th quartile class and $w$ is the class width of the $i$th quartile class. The $i$th
quartile class is the class corresponding to the minimum less than cumulative frequency
which contains the value $\frac{in}{4}$.
Example 3.21. Calculate all quartiles for the students score data and interpret the results.

Calculate the less than cumulative frequencies $F_i$ first, then apply
\[ Q_i = L_{Q_i} + \left( \frac{\frac{in}{4} - F_{Q_i - 1}}{f_{Q_i}} \right) \times w, \quad i = 1, 2, 3 \]

$Q_1$ class: $n/4 = 56/4 = 14$, so the $Q_1$ class is 18.5-22.5.
\[ Q_1 = L_{Q_1} + \left( \frac{\frac{n}{4} - F_{Q_1 - 1}}{f_{Q_1}} \right) \times w = 18.5 + \frac{14 - 11}{8} \times 4 = 18.5 + 1.5 = 20 \]

$Q_2$ class: $2n/4 = 2(56)/4 = 28$, so the $Q_2$ class is 22.5-26.5.
\[ Q_2 = L_{Q_2} + \left( \frac{\frac{2n}{4} - F_{Q_2 - 1}}{f_{Q_2}} \right) \times w = 22.5 + \frac{28 - 19}{10} \times 4 = 22.5 + 3.6 = 26.1 \]

$Q_3$ class: $3n/4 = 3(56)/4 = 42$, so the $Q_3$ class is 30.5-34.5.
\[ Q_3 = L_{Q_3} + \left( \frac{\frac{3n}{4} - F_{Q_3 - 1}}{f_{Q_3}} \right) \times w = 30.5 + \frac{42 - 41}{7} \times 4 = 30.5 + 0.57 = 31.07 \]

That is, about 25% of the students scored below 20, 50% below 26.1 and 75% below 31.07.
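The grouped formula has the same shape for quartiles, deciles and percentiles; only the divisor changes. The sketch below captures this with an assumed function `grouped_quantile`, where `parts` is 4, 10 or 100. The names and the (lower, upper) boundary representation are illustrative assumptions.

```python
def grouped_quantile(boundaries, freqs, i, parts):
    """i-th quantile of a grouped distribution split into `parts` equal parts."""
    n = sum(freqs)
    target = i * n / parts
    cumulative = 0  # less than cumulative frequency before the current class
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("quantile class not found")


score_classes = [(10.5, 14.5), (14.5, 18.5), (18.5, 22.5), (22.5, 26.5),
                 (26.5, 30.5), (30.5, 34.5), (34.5, 38.5)]
score_freqs = [4, 7, 8, 10, 12, 7, 8]

quartiles = [grouped_quantile(score_classes, score_freqs, i, 4) for i in (1, 2, 3)]
print([round(q, 2) for q in quartiles])
```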

3.6.2 Deciles
Deciles are values that divide the data into ten equal parts. These values are denoted by
D1 , D2 , · · · , D9 such that 10% of the data fall below D1 , 20% below D2 , · · · , 90% below D9 .
Let $D_i$ be the $i$th decile ($i = 1, 2, \cdots, 9$), then $D_i = \left( \frac{i(n+1)}{10} \right)^{th}$ value.
Example 3.22. Given the data: 420, 430, 435, 438, 441, 449, 490, 500, 510 and 515. Find
the 1st and 7th deciles.

$D_i = \left( \frac{i(n+1)}{10} \right)^{th}$ value, $i = 1, 2, \cdots, 9$

$D_1 = \left( \frac{10+1}{10} \right)^{th}$ value = 1.1th value = 1st value + 0.1 (2nd value - 1st value) =
420 + 0.1(430 - 420) = 421

$D_7 = \left( \frac{7(10+1)}{10} \right)^{th}$ value = 7.7th value = 7th value + 0.7 (8th value - 7th value) =
490 + 0.7(500 - 490) = 497
For a frequency distribution,
\[ D_i = L_{D_i} + \left( \frac{\frac{in}{10} - F_{D_i - 1}}{f_{D_i}} \right) \times w, \quad i = 1, 2, \cdots, 9 \]
where $n = \sum_{i=1}^{k} f_i$ is the total number of observations, $f_{D_i}$ is the frequency of the $i$th decile class,
$L_{D_i}$ is the lower class boundary of the $i$th decile class, $F_{D_i - 1}$ is the less than cumulative
frequency just before the $i$th decile class and $w$ is the class width of the $i$th decile class. The
$i$th decile class is the class corresponding to the minimum less than cumulative frequency
which contains the value $\frac{in}{10}$.
Example 3.23. Calculate the 5th and 8th deciles for the students score data and interpret
the results.
\[ D_i = L_{D_i} + \left( \frac{\frac{in}{10} - F_{D_i - 1}}{f_{D_i}} \right) \times w, \quad i = 1, 2, \cdots, 9 \]

$D_5$ class: $5n/10 = 5(56)/10 = 28$, so the $D_5$ class is 22.5-26.5.
\[ D_5 = L_{D_5} + \left( \frac{\frac{5n}{10} - F_{D_5 - 1}}{f_{D_5}} \right) \times w = 22.5 + \frac{28 - 19}{10} \times 4 = 22.5 + 3.6 = 26.1 \]

$D_8$ class: $8n/10 = 8(56)/10 = 44.8$, so the $D_8$ class is 30.5-34.5.
\[ D_8 = L_{D_8} + \left( \frac{\frac{8n}{10} - F_{D_8 - 1}}{f_{D_8}} \right) \times w = 30.5 + \frac{44.8 - 41}{7} \times 4 = 30.5 + 2.17 = 32.67 \]

3.6.3 Percentiles
Percentiles are values that divide a data set into 100 equal parts. These values are denoted
by P1 , P2 , · · · , P99 .
Let $P_i$ be the $i$th percentile ($i = 1, 2, \cdots, 99$), then $P_i = \left( \frac{i(n+1)}{100} \right)^{th}$ value.
Example 3.24. Given the data: 420, 430, 435, 438, 441, 449, 490, 500, 510 and 515. Find
the 40th and 75th percentiles.

$P_i = \left( \frac{i(n+1)}{100} \right)^{th}$ value, $i = 1, 2, \cdots, 99$

$P_{40} = \left( \frac{40(10+1)}{100} \right)^{th}$ value = 4.4th value = 4th value + 0.4 (5th value - 4th value) =
438 + 0.4(441 - 438) = 439.2

$P_{75} = \left( \frac{75(10+1)}{100} \right)^{th}$ value = 8.25th value = 8th value + 0.25 (9th value - 8th value) =
500 + 0.25(510 - 500) = 502.5
For a frequency distribution,
\[ P_i = L_{P_i} + \left( \frac{\frac{in}{100} - F_{P_i - 1}}{f_{P_i}} \right) \times w, \quad i = 1, 2, \cdots, 99 \]
where $n = \sum_{i=1}^{k} f_i$ is the total number of observations, $f_{P_i}$ is the frequency of the $i$th percentile class,
$L_{P_i}$ is the lower class boundary of the $i$th percentile class, $F_{P_i - 1}$ is the less than cumulative frequency
just before the $i$th percentile class and $w$ is the class width of the $i$th percentile class. The $i$th
percentile class is the class corresponding to the minimum less than cumulative frequency
which contains the value $\frac{in}{100}$.
Example 3.25. Calculate the 30th and 90th percentiles for the students score data and
interpret the results.
\[ P_i = L_{P_i} + \left( \frac{\frac{in}{100} - F_{P_i - 1}}{f_{P_i}} \right) \times w, \quad i = 1, 2, \cdots, 99 \]

$P_{30}$ class: $30n/100 = 30(56)/100 = 16.80$, so the $P_{30}$ class is 18.5-22.5.
\[ P_{30} = L_{P_{30}} + \left( \frac{\frac{30n}{100} - F_{P_{30} - 1}}{f_{P_{30}}} \right) \times w = 18.5 + \frac{16.80 - 11}{8} \times 4 = 18.5 + 2.9 = 21.4 \]

$P_{90}$ class: $90n/100 = 90(56)/100 = 50.40$, so the $P_{90}$ class is 34.5-38.5.
\[ P_{90} = L_{P_{90}} + \left( \frac{\frac{90n}{100} - F_{P_{90} - 1}}{f_{P_{90}}} \right) \times w = 34.5 + \frac{50.40 - 48}{8} \times 4 = 34.5 + 1.2 = 35.7 \]

Example 3.26. The life times (in hours) of eighty randomly selected light bulbs are
summarized in the following table. Find all the quartiles, the 6th decile and the 65th percentile.

Time 52.5-63.5 63.5-74.5 74.5-85.5 85.5-96.5 ≥96.5

No. of bulbs 6 12 25 18 19

Relationship between Median, Quartiles, Deciles and Percentiles

• $\tilde{X} = Q_2 = D_5 = P_{50}$

• $Q_i = P_{25i}$, $i = 1, 2, 3$

• $D_i = P_{10i}$, $i = 1, 2, \cdots, 9$

3.7 Mode
Mode is another measure of central tendency. It is the value that occurs most frequently
in a data set. For instance, if shoe size 7 has the maximum demand, size
No. 7 is the modal value of shoe sizes. Mode is denoted by $\hat{X}$. A data set may have one
mode (uni-modal), two modes (bi-modal), more than two modes (multi-modal) or no mode
at all (i.e. when all observations are equally frequent).

In ungrouped (individual series) cases, one can find mode by inspection. After arranging
the data in ascending or descending order, the value appearing most frequently (the most
frequent value) is taken as the modal value.

Example 3.27. Find the mode of the following data sets.

a. 110, 113, 116, 116, 118, 118, 118, 121 and 123.


b. 2, 3, 5, 7 and 8.

c. 15, 18, 18, 18, 20, 22, 24, 24, 24, 26 and 26

d. 5, 6, 6, 7, 9, 9, 10, 12 and 12.

e. 1, 1, 0, 1, 0, 0, 0, 2, 4 and 3.

To find the modal value of each data set, just find the value having the highest frequency.

a. Since 118 occurs more often than any other value, the mode is 118.

b. Each value occurs once (all are equally frequent), so the data set has no mode.

c. 18 and 24 each occur three times, hence the modal values are 18 and 24 (bi-modal).

d. Tri-modal (multi-modal): 6, 9 and 12, each of which occurs twice.

e. The modal value here is 0, as it occurs more times than any other value.
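Finding the mode by inspection amounts to counting how often each value occurs. The sketch below assumes a helper named `modes` (a name introduced here) that returns every most-frequent value, or an empty list when all distinct values are equally frequent, matching the "no mode" convention used above.

```python
from collections import Counter


def modes(values):
    """All most-frequent values; empty list if every value is equally frequent."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []  # all observations equally frequent: no mode
    return sorted(v for v, c in counts.items() if c == highest)


mode_a = modes([110, 113, 116, 116, 118, 118, 118, 121, 123])
mode_b = modes([2, 3, 5, 7, 8])
mode_c = modes([15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26])
mode_d = modes([5, 6, 6, 7, 9, 9, 10, 12, 12])
mode_e = modes([1, 1, 0, 1, 0, 0, 0, 2, 4, 3])
print(mode_a, mode_b, mode_c, mode_d, mode_e)
```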

In a grouped (continuous) frequency distribution, the modal value is located in the class with
the highest frequency, and that class is the modal class.
\[ \hat{X} = L_{\hat{X}} + \frac{f_{\hat{X}} - f_{\hat{X}-1}}{(f_{\hat{X}} - f_{\hat{X}-1}) + (f_{\hat{X}} - f_{\hat{X}+1})} \times w \]
where $L_{\hat{X}}$ is the lower class boundary of the modal class, $f_{\hat{X}}$ is the frequency of the modal
class, $f_{\hat{X}-1}$ is the frequency of the class just before the modal class, $f_{\hat{X}+1}$ is the frequency of
the class just after the modal class and $w$ is the class width of the modal class. The modal
class is the class corresponding to the largest frequency.

Example 3.28. Find the modal score of the students score data.

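The class having the highest frequency is 26.5-30.5, hence it is the modal class.
\[ \hat{X} = L_{\hat{X}} + \frac{f_{\hat{X}} - f_{\hat{X}-1}}{(f_{\hat{X}} - f_{\hat{X}-1}) + (f_{\hat{X}} - f_{\hat{X}+1})} \times w \]
\[ \hat{X} = 26.5 + \frac{12 - 10}{(12 - 10) + (12 - 7)} \times 4 = 26.5 + 1.14 = 27.64 \]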
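The grouped mode formula can be sketched as below. The function name `grouped_mode` is an assumption, and so are two design choices not stated in the text: if several classes share the highest frequency the first is used, and a missing neighbouring class (when the modal class is first or last) is treated as having frequency zero.

```python
def grouped_mode(boundaries, freqs):
    """Interpolated mode of a grouped frequency distribution."""
    k = max(range(len(freqs)), key=freqs.__getitem__)  # index of the modal class
    lower, upper = boundaries[k]
    f_mode = freqs[k]
    f_before = freqs[k - 1] if k > 0 else 0               # assumed 0 if absent
    f_after = freqs[k + 1] if k < len(freqs) - 1 else 0   # assumed 0 if absent
    d_before = f_mode - f_before
    d_after = f_mode - f_after
    return lower + d_before / (d_before + d_after) * (upper - lower)


score_classes = [(10.5, 14.5), (14.5, 18.5), (18.5, 22.5), (22.5, 26.5),
                 (26.5, 30.5), (30.5, 34.5), (34.5, 38.5)]
score_freqs = [4, 7, 8, 10, 12, 7, 8]
modal_score = grouped_mode(score_classes, score_freqs)
print(round(modal_score, 2))
```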
Example 3.29. What is the modal life time of the light bulbs given in the table below?

Time 52.5-63.5 63.5-74.5 74.5-85.5 85.5-96.5 ≥96.5

No. of bulbs 6 12 25 18 19

Mode is not affected by extreme values and can be calculated for open-ended classes. But
it often does not exist and its value may not be unique.


Relationship between AM, Median and Mode

In a symmetrical and uni-modal distribution: Mean = Median = Mode.
For a moderately skewed distribution, approximately: Mean - Mode = 3(Mean - Median).

EXERCISES:

1. In a certain investigation, 460 persons were involved in the study, and based on
an enquiry on their age, it was known that 75% of them were 22 or more years.
The following frequency distribution shows the age composition of the persons under
study.

Mid age in years 13 18 23 28 33 38 43 48

No. of persons 24 f1 90 122 f2 56 20 33

(a) Find the median and modal age of the persons and interpret them.
(b) Find the values of all quartiles.
(c) Compute the 5th decile, 25th percentile, 50th percentile and the 75th percentile
and interpret the results.

2. The mean annual salary of all employees in a company is 2500. The mean salary of
male and female is 2700 and 1700 respectively. Find the percentage of males and
females employed in the company.

3. Given the following FD.

Mid price of a commodity (birr) 15 25 35 45 55

Number of items sold 27 A 28 B 19

(a) If 75% of the items were sold at birr 45 or less and most items were sold at birr
34, find the missing frequencies.
(b) If 25% of the items were sold at birr 45 or more and most items were sold at
birr 34, find the missing frequencies.

Summary
Different measures of central tendency and quantiles have been discussed in this chapter.
Of the mean, median and mode, the mean (average) is the most commonly used measure
of central tendency. But the other two, namely the median and the mode, are no less
important. The median is a widely used central measure in psychology, education and other
social sciences. The mode is a suitable average for qualitative information such as attitude
towards disabled people, or the beauty or intelligence of certain individuals. It is also a
useful measure for manufacturers.

Chapter 4

Measures of Variation, Skewness and Kurtosis

In the previous chapter, we concentrated on central values (measures of central tendency),
which give an idea of the whole mass, that is, a complete set of values. However, the
information so obtained is neither exhaustive nor comprehensive, as the mean does not
tell us whether the observations are close to each other or far apart. The median
is a positional average and has nothing to do with the variability of the observations in
a data set. This leads us to conclude that a measure of central tendency is not enough
to have a clear idea about the data unless all observations are the same. Moreover, two
or more data sets may have the same mean and/or median but they may be quite different.

The following table displays the prices of a certain commodity in four cities. Find the mean
and median prices for each of the four cities and interpret them.

City    Prices
A       30    30    30    30    30
B       28    29    30    31    32
C       10    15    30    45    50
D        0     5    30    55    60

All four data sets have mean 30, and the median is also 30. But by inspection it is
apparent that the four data sets differ remarkably from one another. So measures of central
tendency alone do not provide enough information about the nature of the data. Thus, to
have a clear picture of the data, one needs to have a measure of dispersion or variability
among observations in the data set.
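A short Python sketch makes the point concrete. It uses the standard library `statistics` module for the mean and median and, as an illustrative stand-in for the formal measures of dispersion, the difference between the largest and smallest price. The dictionary `prices` and the tuple layout of `summary` are assumptions made for this illustration.

```python
from statistics import mean, median

# Prices of the commodity in the four cities.
prices = {
    "A": [30, 30, 30, 30, 30],
    "B": [28, 29, 30, 31, 32],
    "C": [10, 15, 30, 45, 50],
    "D": [0, 5, 30, 55, 60],
}

# For each city: (mean, median, largest minus smallest).
summary = {city: (mean(v), median(v), max(v) - min(v)) for city, v in prices.items()}
for city, (avg, med, spread) in summary.items():
    print(city, avg, med, spread)
```

The mean and median are identical for every city, while the spread differs from city to city.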

Variation or dispersion may be defined as the extent to which the values are scattered around
a measure of central tendency. Thus, a measure of dispersion tells us the extent to which
the values of a variable vary about the measure of central tendency.


4.1 Objectives of Measures of Variation

1. To have an idea about the reliability of the measures of central tendency.
If the degree of scatteredness is large, an average is less reliable. If the value of the
variation is small, it indicates that a central value is a good representative of all the
values in the data set.

2. To compare two or more sets of data with regard to their variability. Two or
more data sets can be compared by calculating the same measure of variation having
the same units of measurement. A set with a smaller value possesses less variability, that is,
it is more uniform (or more consistent).

3. To provide information about the structure of the data. The value of a measure
of variation gives an idea about the spread of the observations. Further, one can
surmise the limits within which the values in the data set extend.

4. To pave the way for the use of other statistical measures. Measures of variation,
especially variance and standard deviation, lead to many statistical techniques such as
correlation, regression and analysis of variance.

4.2 Types of Measures of Variation

• Absolute Measures of Variation: A measure of variation is said to be in absolute
form when it shows the actual amount of variation of an item from a measure
of central tendency and is expressed in the same concrete units in which the data have been
expressed. In other words, all absolute measures of variation have units. As a result,
if two or more distributions differ in their units of measurement, their variability
cannot be compared by using any absolute measure of variation.

The size of an absolute measure of dispersion depends upon the size of the values
in the data. That is, if the values are larger, the value of the absolute
measure will also be larger. Therefore, an absolute measure of variation is not
appropriate for comparing two or more groups if the size of the values differs among the
groups.

• Relative Measures of Variation: A relative measure of variation is the quotient
obtained by dividing the absolute measure by the quantity with respect to which the absolute
deviation has been computed. It is a unitless pure number and is used for making
comparisons between different distributions.


Absolute Measures                    Relative Measures

Range                                Coefficient of Range
Quartile Deviation                   Coefficient of Quartile Deviation
Mean Deviation                       Coefficient of Mean Deviation
Variance and Standard Deviation      Coefficient of Variation
                                     Standard Scores

Before giving the details of these measures of dispersion, it is worthwhile to point out that
a measure of dispersion (variation) is to be adjudged on the basis of all those properties of
good measures of central tendency. Hence, their repetition is superfluous.

4.2.1 Range and Relative Range

Range is the simplest and crudest measure of dispersion. It is defined as the difference
between the largest and the smallest values in the data.
• For raw data: $R = L - S$, where $L$ is the largest and $S$ is the smallest value
• For grouped data: $R = UCL_{last} - LCL_{first}$, where $UCL_{last}$ is the upper class limit of the
last class and $LCL_{first}$ is the lower class limit of the first class
Coefficient of Range:
• For raw data: $CR = \frac{L - S}{L + S}$
• For grouped data: $CR = \frac{UCL_{last} - LCL_{first}}{UCL_{last} + LCL_{first}}$
Range hardly satisfies any property of a good measure of dispersion, as it is based on the two
extreme values only, ignoring the others. It is also not amenable to further algebraic treatment.

4.2.2 Quartile Deviation and Coefficient of Quartile Deviation

Quartile deviation is sometimes known as the Semi-interquartile Range (SIR). The
Interquartile Range is $Q_3 - Q_1$. Thus,
\[ QD = \frac{Q_3 - Q_1}{2}. \]
The corresponding relative measure of variation, the coefficient of quartile deviation, is:
\[ CQD = \frac{Q_3 - Q_1}{Q_3 + Q_1}. \]
QD involves only the middle 50% of the observations, excluding the observations below
the lower quartile and the observations above the upper quartile. Note also that it does
not take into account the individual values occurring between $Q_1$ and $Q_3$. This means
that it gives no idea about the variation of even the middle 50% of the values.
It does, however, provide some idea if the values are uniformly distributed between $Q_1$ and $Q_3$.
It can be calculated for open-ended classes.


4.2.3 Mean Deviation and Coefficient of Mean Deviation

The measures of variation discussed so far are not satisfactory in the sense that they lack
most of the requirements of a good measure. Mean deviation is a better measure than
range and quartile deviation.
Mean deviation is the arithmetic mean of the absolute values of the deviations from some
measure of central tendency, usually the mean or the median of a distribution. Hence
we have mean deviation about the mean, $MD(\bar{X})$, and mean deviation about the median,
$MD(\tilde{X})$.
• For raw data:
\[ MD(\bar{X}) = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n} \quad \text{and} \quad MD(\tilde{X}) = \frac{\sum_{i=1}^{n} |X_i - \tilde{X}|}{n} \]
• For grouped data:
\[ MD(\bar{X}) = \frac{\sum_{i=1}^{k} f_i |X_i - \bar{X}|}{\sum_{i=1}^{k} f_i} \quad \text{and} \quad MD(\tilde{X}) = \frac{\sum_{i=1}^{k} f_i |X_i - \tilde{X}|}{\sum_{i=1}^{k} f_i} \]

$MD$ is not much affected by extreme values. Its main drawback is that the algebraic
negative signs of the deviations are ignored. $MD$ is minimum when the deviations are taken
from the median. The coefficients of mean deviation are:
\[ CMD(\bar{X}) = \frac{MD(\bar{X})}{\bar{X}} \quad \text{and} \quad CMD(\tilde{X}) = \frac{MD(\tilde{X})}{\tilde{X}} \]
Example 4.1. Calculate the R, CR, QD, CQD, MD($\bar{X}$), MD($\tilde{X}$), CMD($\bar{X}$) and CMD($\tilde{X}$)
for the following data: 20, 28, 40, 12, 30, 15, 50.

Data array: 12, 15, 20, 28, 30, 40, 50.

\[ R = L - S = 50 - 12 = 38, \quad CR = \frac{50 - 12}{50 + 12} = \frac{38}{62} = 0.613 \]
$Q_1 = ((7 + 1)/4)^{th}$ value = 2nd value = 15

$Q_3 = (3(7 + 1)/4)^{th}$ value = 6th value = 40

\[ \Rightarrow QD = \frac{Q_3 - Q_1}{2} = \frac{40 - 15}{2} = \frac{25}{2} = 12.5 \]
\[ \Rightarrow CQD = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{40 - 15}{40 + 15} = \frac{25}{55} = 0.45 \]
\[ \bar{X} = \frac{12 + 15 + \cdots + 50}{7} = 27.86 \quad \text{and} \quad \tilde{X} = 28 \]
\[ MD(\bar{X}) = \frac{|12 - 27.86| + |15 - 27.86| + \cdots + |50 - 27.86|}{7} = \frac{73.14}{7} = 10.45 \]
\[ MD(\tilde{X}) = \frac{|12 - 28| + |15 - 28| + \cdots + |50 - 28|}{7} = \frac{73}{7} = 10.43 \]
\[ CMD(\bar{X}) = \frac{10.45}{27.86} = 0.375 \quad \text{and} \quad CMD(\tilde{X}) = \frac{10.43}{28} = 0.372 \]
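The measures in Example 4.1 can be reproduced with a short Python sketch. The helper `quartile` and all variable names are assumptions introduced here; the helper uses the $i(n+1)/4$ position rule with linear interpolation and is meant for data sets with at least three values.

```python
def quartile(ordered, i):
    """i-th quartile of already sorted data using the i(n + 1)/4 position rule."""
    position = i * (len(ordered) + 1) / 4
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        return ordered[whole - 1]
    return ordered[whole - 1] + fraction * (ordered[whole] - ordered[whole - 1])


data = sorted([20, 28, 40, 12, 30, 15, 50])
n = len(data)

data_range = data[-1] - data[0]
coef_range = data_range / (data[-1] + data[0])

q1, q3 = quartile(data, 1), quartile(data, 3)
quartile_dev = (q3 - q1) / 2
coef_quartile_dev = (q3 - q1) / (q3 + q1)

mean = sum(data) / n
median = data[n // 2]  # n is odd for this data set
md_mean = sum(abs(x - mean) for x in data) / n
md_median = sum(abs(x - median) for x in data) / n

print(data_range, round(coef_range, 3), quartile_dev, round(coef_quartile_dev, 2))
print(round(md_mean, 2), round(md_median, 2))
```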
Example 4.2. Calculate the R, CR, QD, CQD, MD(X̄), MD(X̃), CMD(X̄) and CMD(X̃)
for the students score data.

Previously, we have obtained the following quantities for the students score data:
X̄ = 25.64, X̃ = 26.1, Q1 = 20, Q3 = 31.07

CBs          $X_i$    $f_i$    $|X_i - \bar{X}|$    $f_i |X_i - \bar{X}|$    $|X_i - \tilde{X}|$    $f_i |X_i - \tilde{X}|$

10.5-14.5    12.5      4       13.14                52.56                    13.6                   54.4
14.5-18.5    16.5      7        9.14                63.98                     9.6                   67.2
18.5-22.5    20.5      8        5.14                41.12                     5.6                   44.8
22.5-26.5    24.5     10        1.14                11.40                     1.6                   16.0
26.5-30.5    28.5     12        2.86                34.32                     2.4                   28.8
30.5-34.5    32.5      7        6.86                48.02                     6.4                   44.8
34.5-38.5    36.5      8       10.86                86.88                    10.4                   83.2
Total                 56                           338.28                                          339.2
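\[ R = UCL_{last} - LCL_{first} = 38 - 11 = 27 \]
\[ CR = \frac{UCL_{last} - LCL_{first}}{UCL_{last} + LCL_{first}} = \frac{38 - 11}{38 + 11} = \frac{27}{49} = 0.551 \]
\[ QD = \frac{Q_3 - Q_1}{2} = \frac{31.07 - 20}{2} = \frac{11.07}{2} = 5.54 \]
\[ CQD = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{31.07 - 20}{31.07 + 20} = \frac{11.07}{51.07} = 0.22 \]
\[ MD(\bar{X}) = \frac{\sum f_i |X_i - \bar{X}|}{\sum f_i} = \frac{338.28}{56} = 6.04 \]
\[ MD(\tilde{X}) = \frac{\sum f_i |X_i - \tilde{X}|}{\sum f_i} = \frac{339.2}{56} = 6.06 \]
\[ CMD(\bar{X}) = \frac{MD(\bar{X})}{\bar{X}} = \frac{6.04}{25.64} = 0.24 \]
\[ CMD(\tilde{X}) = \frac{MD(\tilde{X})}{\tilde{X}} = \frac{6.06}{26.1} = 0.23 \]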
Example 4.3. Calculate the R, QD and CQD for the following frequency distribution.

Class limits 10-14 15-19 20-24 25-29 30-34 35-39

Frequency 8 10 22 35 15 10


4.2.4 Variance and Standard Deviation

Variance and standard deviation are the best and most widely used measures of
dispersion, and both measure the average dispersion of the observations around the mean.
The variance of a data set is the sum of the squares of the deviations of each observation
from the mean divided by the total number of observations in the data set. The positive
square root of the variance is called the standard deviation.
For a population containing $N$ elements, the population standard deviation is denoted by
the Greek letter $\sigma$ (sigma) and hence the population variance is denoted by $\sigma^2$.
• For raw data,
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \quad \text{and} \quad \sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} \]
• For grouped data,
\[ \sigma^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{\sum_{i=1}^{k} f_i} \quad \text{and} \quad \sigma = \sqrt{\frac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{\sum_{i=1}^{k} f_i}} \]

For a sample of $n$ elements, the sample variance and standard deviation, denoted by $S^2$ and
$S$ respectively, are calculated using the formulae:
• For raw data,
\[ S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \quad \text{and} \quad S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \]
• For grouped data,
\[ S^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \bar{X})^2}{\sum_{i=1}^{k} f_i - 1} \quad \text{and} \quad S = \sqrt{\frac{\sum_{i=1}^{k} f_i (X_i - \bar{X})^2}{\sum_{i=1}^{k} f_i - 1}} \]

Example 4.4. Find the variance and standard deviation of: 20, 28, 40, 12, 30, 15 and 50.
a. Take the data as a population.
b. Consider it as a sample.
a. $N = 7$, $\mu = 27.86$;

$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} = \frac{(20 - 27.86)^2 + \cdots + (50 - 27.86)^2}{7} = \frac{1120.86}{7} = 160.12$

$\Rightarrow \sigma = \sqrt{160.12} = 12.65$

b. $n = 7$, $\bar{X} = 27.86$;

$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{(20 - 27.86)^2 + \cdots + (50 - 27.86)^2}{6} = \frac{1120.86}{6} = 186.81$

$\Rightarrow S = \sqrt{186.81} = 13.67$
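
The two calculations in Example 4.4 can be checked with a short Python sketch. It relies on the standard library `statistics` module, where `pvariance`/`pstdev` divide by N and `variance`/`stdev` divide by n − 1; the variable names are illustrative only.

```python
import statistics

data = [20, 28, 40, 12, 30, 15, 50]

# (a) Treat the data as a population: divide by N
pop_var = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# (b) Treat the data as a sample: divide by n - 1
samp_var = statistics.variance(data)
samp_sd = statistics.stdev(data)

print(round(pop_var, 2), round(pop_sd, 2))
print(round(samp_var, 2), round(samp_sd, 2))
```

The only difference between the two results is the divisor, which is why the sample values are slightly larger.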

Example 4.5. Find the variance and standard deviation of the students score data.
The calculations needed for the variance are as follows, with X̄ = 25.64:

CBs Xi fi Xi − X̄ (Xi − X̄)2 fi (Xi − X̄)2

10.5-14.5 12.5 4 -13.14 172.6596 690.6384
14.5-18.5 16.5 7 -9.14 83.5396 584.7772
18.5-22.5 20.5 8 -5.14 26.4196 211.3568
22.5-26.5 24.5 10 -1.14 1.2996 12.9960
26.5-30.5 28.5 12 2.86 8.1796 98.1552
30.5-34.5 32.5 7 6.86 47.0596 329.4172
34.5-38.5 36.5 8 10.86 117.9396 943.5168
Total 56 2870.8576
$\sigma^2 = \frac{\sum f_i (X_i - \bar{X})^2}{\sum f_i} = \frac{2870.8576}{56} = 51.27$

$\sigma = \sqrt{51.27} = 7.16$
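
As a rough check of Example 4.5, the grouped-data formula can be sketched in Python using the class marks and frequencies from the table. The list names are illustrative, and the mean is computed without rounding, so intermediate values differ slightly from the table in the later decimals.

```python
import math

# Class marks (X_i) and frequencies (f_i) from the table in Example 4.5
marks = [12.5, 16.5, 20.5, 24.5, 28.5, 32.5, 36.5]
freqs = [4, 7, 8, 10, 12, 7, 8]

n = sum(freqs)
mean = sum(f * x for x, f in zip(marks, freqs)) / n

# Grouped variance: sum of f_i (X_i - mean)^2 divided by sum of f_i
var = sum(f * (x - mean) ** 2 for x, f in zip(marks, freqs)) / n
sd = math.sqrt(var)

print(n, round(mean, 2), round(var, 2), round(sd, 2))
```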
Example 4.6. Find the variance and standard deviation for the following data.

Class limits 6-10 11-15 16-20 21-25 26-30 31-35 36-40

Frequency 1 2 3 5 4 3 2
The main objection to the mean deviation, namely that the negative signs are simply ignored,
is overcome in the variance by squaring the deviations from the mean. The first main demerit
of variance is that its unit is the square of the unit of measurement of the variable values.
For example, the sample variance of 2m, 6m and 4m is 4m$^2$. Interpreting this as each value
differing from the mean by 4m$^2$ on average is wrong, because the unit of measurement of the
variance is not the same as that of the data set. The other disadvantage of variance is that
the variation of the data is exaggerated, because the deviation of each value from the mean
is squared. In the given example, the variation is exaggerated from two to four by squaring
the deviations. Variance also gives more weight to extreme values than to those near the
mean value.


Standard deviation is considered to be the best measure of dispersion because its unit of
measurement is the same as that of the data set, and the exaggeration introduced by the
variance is removed by taking its square root. In simple words, it describes the average
amount of variation on either side of the mean. If the standard deviation of the data is
small, the values are concentrated near the mean; if it is large, the values are scattered
away from the mean.

Properties of Variance and Standard Deviation

1. If a constant is added to (or subtracted from) every observation, the standard
deviation and the variance remain the same.
2. If every value is multiplied by a nonzero constant c, the standard deviation
is multiplied by |c| and the variance is multiplied by $c^2$.
3. If there are g different groups having the same units of measurement with sample
means $\bar{X}_1, \bar{X}_2, \cdots, \bar{X}_g$, number of sample observations $n_1, n_2, \cdots, n_g$ and sample
variances $S_1^2, S_2^2, \cdots, S_g^2$ respectively, then the variance of all the groups, called the
pooled variance (denoted by $S_p^2$), is given by:

$S_p^2 = \frac{\sum_{i=1}^{g} (n_i - 1)[S_i^2 + (\bar{X}_i - \bar{X}_c)^2]}{\sum_{i=1}^{g} n_i - g}$

$\Rightarrow S_p^2 = \frac{(n_1 - 1)[S_1^2 + (\bar{X}_1 - \bar{X}_c)^2] + \cdots + (n_g - 1)[S_g^2 + (\bar{X}_g - \bar{X}_c)^2]}{n_1 + n_2 + \cdots + n_g - g}$

If $\bar{X}_1 = \bar{X}_2 = \cdots = \bar{X}_g$,

$\Rightarrow S_p^2 = \frac{\sum_{i=1}^{g} (n_i - 1) S_i^2}{\sum_{i=1}^{g} n_i - g} = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2 + \cdots + (n_g - 1)S_g^2}{n_1 + n_2 + \cdots + n_g - g}$

Similarly, the pooled population variance can be calculated using the formula:

$\sigma_p^2 = \frac{\sum_{i=1}^{g} N_i [\sigma_i^2 + (\mu_i - \mu_c)^2]}{\sum_{i=1}^{g} N_i}$

$\Rightarrow \sigma_p^2 = \frac{N_1 [\sigma_1^2 + (\mu_1 - \mu_c)^2] + \cdots + N_g [\sigma_g^2 + (\mu_g - \mu_c)^2]}{N_1 + N_2 + \cdots + N_g}$


Example 4.7. The mean weight of 150 students is 60 kilograms. The mean weight of the boys
is 70 kilograms with a standard deviation of 10 kilograms. For the girls, the mean weight
is 55 kilograms and the standard deviation is 15 kilograms.

a. Find the number of boys and girls.

b. Find the combined standard deviation.

Example 4.8. A distribution consists of four parts characterized as follows. Find the
mean and standard deviation of the distribution. Ans: X̄c = 73.8 and σ = 11.93

Part No. of items Mean S.D

1 50 61 8
2 100 70 9
3 120 50 10
4 30 83 11
Example 4.9. The arithmetic mean and standard deviation of a series of 20 items were
computed as 20 and 5 respectively. While calculating these, an item 13 was misread as 30.
Find the correct mean and standard deviation.

Example 4.10. The following data are some of the particulars of the distribution of
weights of boys and girls in a class.

Boys Girls
Number 100 50
Mean 60 45
Variance 9 4

a. Find the mean and variance of the combined series.

b. If one of the values is misread as 60 instead of 40, what is the correct standard
deviation?
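
Part (a) of Example 4.10 can be sketched in Python with the pooled population variance formula given earlier. This is only an illustration: it assumes the stated variances are population variances (divisor N), and the function name `combined_mean_and_variance` is hypothetical.

```python
def combined_mean_and_variance(sizes, means, variances):
    """Combined mean and pooled population variance of several groups."""
    total = sum(sizes)
    mean_c = sum(n * m for n, m in zip(sizes, means)) / total
    # Each group contributes its own variance plus the squared distance
    # of its mean from the combined mean, weighted by its size.
    var_c = sum(
        n * (v + (m - mean_c) ** 2)
        for n, m, v in zip(sizes, means, variances)
    ) / total
    return mean_c, var_c

# Boys: N = 100, mean 60, variance 9; Girls: N = 50, mean 45, variance 4
mean_c, var_c = combined_mean_and_variance([100, 50], [60, 45], [9, 4])
print(mean_c, round(var_c, 2))
```

Under these assumptions the combined variance is much larger than either group variance, because the two group means are far apart.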

Empirical Relationship between QD, MD and SD

6QD=5MD=4SD

4.2.5 Coefficient of Variation

Coefficient of variation is a relative measure of dispersion based on the standard deviation.
It is the ratio of the standard deviation to the mean, expressed as a percentage. Hence, it is
a unitless measure of variation that also takes into account the size of the means of the
distributions.

• For population: $CV = \frac{\sigma}{\mu} \times 100\%$

• For sample: $CV = \frac{S}{\bar{X}} \times 100\%$

The distribution having the smaller CV is said to be less variable, more consistent or more
uniform. For field experiments, the CV is generally reported; a small value indicates more
reliable experimental findings.

Example 4.11. Compare the variability of the following two sample data sets using stan-
dard deviation and coefficient of variation:

A. 2 Meters, 4 Meters, 6 Meters

B. 600 Liters, 400 Liters, 500 Liters
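
A minimal Python sketch of the comparison asked for in Example 4.11, treating both data sets as samples (divisor n − 1). The helper name `coefficient_of_variation` is illustrative.

```python
import statistics

def coefficient_of_variation(sample):
    """Sample CV as a percentage: S / mean * 100."""
    return statistics.stdev(sample) / statistics.mean(sample) * 100

lengths_m = [2, 4, 6]          # data set A, in meters
volumes_l = [600, 400, 500]    # data set B, in liters

sd_a = statistics.stdev(lengths_m)
sd_b = statistics.stdev(volumes_l)
cv_a = coefficient_of_variation(lengths_m)
cv_b = coefficient_of_variation(volumes_l)

print(sd_a, sd_b)                      # standard deviations, in different units
print(round(cv_a, 1), round(cv_b, 1))  # unit-free percentages
```

The standard deviations are in different units and cannot be compared directly, whereas the CV values can.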

Example 4.12. The average IQ of statistics students is 110 with standard deviation 5 and
the average IQ of mathematics students is 106 with standard deviation 4. Which class is
less variable in terms of IQ?

4.2.6 Standard Score

The standard score (Z-score) tells us how many standard deviations a given value lies above
or below the mean, according to whether the Z-score is positive or negative.

• For population: $Z = \frac{X - \mu}{\sigma}$

• For sample: $Z = \frac{X - \bar{X}}{S}$
Example 4.13. Suppose Yoseph scored 90 on a test in which the mean and standard
deviation of the class were 70 and 10 respectively. In another test, Helen scored 600,
where the mean and standard deviation of the class were 560 and 40 respectively. Who is
better off relative to his/her class?
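
The comparison in Example 4.13 amounts to computing two Z-scores. A small Python sketch, with an illustrative helper `z_score`:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

z_yoseph = z_score(90, 70, 10)
z_helen = z_score(600, 560, 40)

print(z_yoseph, z_helen)
```

The student with the larger Z-score performed better relative to his or her own class.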

Summary
Measures of dispersion, especially the variance, are the backbone of statistics. As a matter
of fact, almost every statistical study involves variance in one way or another. Most
surveys and experiments are studies of sample units; hence, the formulae for samples are the
ones mostly used. Except for the variance (and hence the standard deviation), the formulae
are the same whether we consider a population or a sample. Of course, the interpretation of
the values has to be made accordingly.


4.3 Moments
Let X be a variable that assumes the values $X_1, X_2, \cdots, X_N$.

1. The rth raw moment about a number A is defined as:

• $\mu''_r = \frac{\sum_{i=1}^{N} (X_i - A)^r}{N}$ for raw data

• $\mu''_r = \frac{\sum_{i=1}^{k} f_i (X_i - A)^r}{\sum_{i=1}^{k} f_i}$ for grouped data.

$\bar{X} = \mu''_1 + A$

2. The rth moment about the origin (i.e., in (1) above A = 0) is defined as:

• $\mu'_r = \frac{\sum_{i=1}^{N} X_i^r}{N}$ for raw data

• $\mu'_r = \frac{\sum_{i=1}^{k} f_i X_i^r}{\sum_{i=1}^{k} f_i}$ for grouped data.

$\mu'_1 = \bar{X}$, $\sigma^2 = \mu'_2 - (\mu'_1)^2$

3. The rth central moment (i.e., in (1) above A = µ) is defined as:

• $\mu_r = \frac{\sum_{i=1}^{N} (X_i - \mu)^r}{N}$ for raw data

• $\mu_r = \frac{\sum_{i=1}^{k} f_i (X_i - \mu)^r}{\sum_{i=1}^{k} f_i}$ for grouped data.
For any distribution,

• $\mu_0 = 1$
• $\mu_1 = 0$
• $\mu_2 = \sigma^2$

Example 4.14. Find the first three central moments of the numbers 2, 3 and 7.

Example 4.15. Find the third moment about 3 of the numbers 2, 3 and 7.
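
Examples 4.14 and 4.15 can be worked through with one small function for the rth moment about an arbitrary point. This is a sketch for raw data only; the name `moment` and its `about` parameter are illustrative.

```python
def moment(values, r, about=0.0):
    """rth moment of raw data about the point `about` (divisor N)."""
    return sum((x - about) ** r for x in values) / len(values)

data = [2, 3, 7]

mean = moment(data, 1)  # first moment about the origin

# Example 4.14: first three central moments (moments about the mean)
central = [moment(data, r, about=mean) for r in (1, 2, 3)]

# Example 4.15: third moment about A = 3
third_about_3 = moment(data, 3, about=3)

print(mean, central, third_about_3)
```

The first central moment comes out as zero and the second equals the population variance, in line with the properties listed above.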

4.4 Skewness
4.4.1 Frequency Curves
As discussed earlier, the frequency curve is one of the graphical methods of data
presentation used for continuous data. It is a smooth curve joining the points whose
coordinates are the class marks and the corresponding frequencies.

1. Symmetric/Normal curve: A frequency curve is symmetric when it looks the same to the
left and right of the central point. The distribution is spread around a central value in
a symmetrical, bell-shaped pattern. Specifically, in a symmetrical (normal) distribution:

• The lengths of both tails are the same.
• The mean, median and modal values are approximately equal.
• The corresponding pairs of quartiles, deciles and percentiles are equidistant
from the median. For example, the first quartile and third quartile are the same
distance from the median.


2. Positively skewed curve: If some observations are extremely large, the mean of the
distribution becomes greater than the median or mode. In such case, the distribution
is said to be positively skewed. In positively skewed distribution:

• The right tail of the frequency curve is more elongated, longest tail to the right
of the central point.
• More values are on the left of the mean.
• The extreme variation is towards large values (to the right).
• Smaller values are more frequent.
• Mean>Median>Mode

3. Negatively skewed curve: If some extremely small observations are present, the
mean is smaller than the other two averages, and the distribution is said to be
negatively skewed. In a negatively skewed distribution:

• The left tail is more elongated.
• More observations are concentrated on the right of the mean.
• The extreme variation is towards lower values (to the left).
• Larger values are more frequent than small values.
• Mean<Median<Mode


4.4.2 Measures of Skewness

Skewness is the lack of symmetry or departure (asymmetry) from the normal curve. If
the frequency curve is symmetrical then it has no skewness. The measure of the degree
of asymmetry is called a measure of skewness. If both tails of a frequency curve are not
equally distributed, the curve is asymmetric and is called a skewed curve.

Measures of Skewness

1. Karl Pearson's Coefficient of Skewness ($Sk_p$):

$Sk_p = \frac{\bar{X} - \hat{X}}{S}$

• If $Sk_p = 0$, the distribution is symmetrical.
• If $Sk_p > 0$, the distribution is positively skewed.
• If $Sk_p < 0$, the distribution is negatively skewed.

2. Bowley's Coefficient of Skewness ($Sk_b$):

$Sk_b = \frac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1}$

• If $Sk_b = 0$, the distribution is symmetrical.
• If $Sk_b > 0$, the distribution is positively skewed.
• If $Sk_b < 0$, the distribution is negatively skewed.


3. The Moment Measure of Skewness (denoted by $\alpha_3$):

$\alpha_3 = \frac{\mu_3}{\sqrt{\mu_2^3}}$

• If $\alpha_3 = 0$, the distribution is symmetrical.
• If $\alpha_3 > 0$, the distribution is positively skewed.
• If $\alpha_3 < 0$, the distribution is negatively skewed.

Example 4.16. Calculate the Pearson’s and Bowley’s coefficient of skewness for: 2,3,4,4,5,5,5,7,8,9.
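
One way Example 4.16 might be worked is sketched below in Python. Two assumptions are made that the text does not state: X̂ in Pearson's formula is taken to be the mode, and the quartiles are located at the (n + 1)p-th ordered value, which is what `statistics.quantiles` does with `method="exclusive"`. Other quartile conventions give slightly different values of Bowley's coefficient.

```python
import statistics

data = [2, 3, 4, 4, 5, 5, 5, 7, 8, 9]

# Pearson's coefficient: (mean - mode) / sample standard deviation
mean = statistics.mean(data)
mode = statistics.mode(data)
s = statistics.stdev(data)
sk_p = (mean - mode) / s

# Bowley's coefficient from the three quartiles
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
sk_b = (q3 + q1 - 2 * q2) / (q3 - q1)

print(round(sk_p, 3), (q1, q2, q3), round(sk_b, 4))
```

Both coefficients are positive under these assumptions, which points to a mild positive skew.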

Example 4.17. The mean, median and coefficient of variation of 100 observations are
found to be 90, 84 and 80 respectively. Find the coefficient of skewness.

4.5 Kurtosis
Kurtosis refers to the peakedness or flatness of a certain distribution with respect to the
normal distribution. It describes the degree of concentration of observations around the
mode of the distribution, whether the values are concentrated more around the mode (a
peaked curve) or away from the mode toward both the tails of the frequency curve. Two or
more distributions may have identical average, variation and skewness but they may show
different degrees of concentration of values of observations around the mode and hence
may show different degrees of peakedness.

A distribution which is neither more peaked nor flat topped is called mesokurtic.


If a distribution is flatter-topped than the normal, it is called platykurtic.

A distribution which is more peaked than the normal is called a leptokurtic distribution.

Measures of Kurtosis

1. The Coefficient of Kurtosis:

$K = \frac{Q_3 - Q_1}{D_9 - D_1}$


2. The Moment Measure of Kurtosis:

$\beta = \frac{\mu_4}{\mu_2^2}$

• If $\beta = 3$, the distribution is normal (mesokurtic).
• If $\beta > 3$, the distribution is leptokurtic.
• If $\beta < 3$, the distribution is platykurtic.

Example 4.18. The first four central moments of a distribution are 0, 2.5, 0.7 and 18.75.
Comment on the skewness and kurtosis of the distribution.
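
Example 4.18 only needs the two moment measures defined above. A brief Python sketch, with illustrative variable names:

```python
import math

# Second, third and fourth central moments given in Example 4.18
mu2, mu3, mu4 = 2.5, 0.7, 18.75

alpha3 = mu3 / math.sqrt(mu2 ** 3)  # moment measure of skewness
beta = mu4 / mu2 ** 2               # moment measure of kurtosis

print(round(alpha3, 3), beta)
```

A positive value of the skewness measure indicates positive skew, and the kurtosis measure is compared with 3 as described above.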

EXERCISES:

1. Find the range, quartile deviation, mean deviation about the mean, mean deviation
about the median, mean deviation about the mode, variance, standard deviation and
coefficient of variation for the following distribution.

Class 2-4 4-6 6-8 8-10

Frequency 2 5 4 7

Also calculate Pearson's coefficient of skewness, the moment coefficient of skewness
and Bowley's coefficient of skewness.

2. Three independent distributions each of 100 members and standard deviation 4.5
units are located with their means at 12.1, 17.1 and 22.1 units respectively. Find the
standard deviation of the three distributions taken as a whole.

3. The first of the two groups has 100 items with mean 45 and variance 49. If the
combined has 250 items with mean 51 and variance 130, find the mean and standard
deviation of the second group.

4. Karl Pearson’s coefficient of skewness is +0.32. Its standard deviation is 6.5 and
mean is 29.6. Find the median and mode of the distribution.

5. For a distribution, Bowley’s coefficient of skewness is -0.56, the lower quartile is 16.4
and median is 24.2. What is the quartile deviation?

6. If the first two moments of a distribution about the value 5 are 2 and 20, find the
mean and variance of the distribution.

Chapter 5

Elementary Probability

As a general concept, probability is a measure of the chance that something will occur. It
may also be defined as a quantitative measure of uncertainty.

5.1 Concept of Set

In order to discuss the theory of probability, it is essential to be familiar with some ideas
and concepts of mathematical theory of set. A set is a collection of well-defined objects
which is denoted by capital letters like A, B, C, etc.

In describing which objects are contained in set A, two common methods are available.
These methods are:

1. Listing all objects of A. For example, A = {1, 2, 3, 4} describes the set consisting of
the positive integers 1, 2, 3 and 4.

2. Describing a set in words, for example, set A consists of all real numbers between 0
and 1, inclusive. It can be written as A = {x : 0 ≤ x ≤ 1}, that is, A is the set of
all x’s where x is a real number between 0 and 1, inclusive.

If $A = \{a_1, a_2, \cdots, a_n\}$, then each object $a_i$ ($i = 1, 2, \cdots, n$) belonging to set A is called
a member or an element of set A, i.e., $a_i \in A$. A set consisting of all possible elements under
consideration is called a universal set (denoted by U). On the other hand, a set containing
no element is called an empty set (denoted by ∅ or {}).

If every element of set A is also an element of set B, A is said to be a subset of B, written
as A ⊂ B. Every set is a subset of itself, i.e., A ⊂ A. The empty set is a subset of every set.
If A ⊂ B and B ⊂ C, then A ⊂ C. If A ⊂ B and B ⊂ A, then A and B are said to be equal.

Now let us see some methods of combining sets in order to form a new set and develop the
main properties.


1. Union (Or): The set consisting of all elements in A or B or both is called the union
of A and B, written as A ∪ B. That is, A ∪ B = {x : x ∈ A or x ∈ B}.
The set A ∪ B is also called the sum of A and B.

2. Intersection (And): The set consisting of all elements in both A and B is
called the intersection of A and B, written as A ∩ B. That is, A ∩ B = {x : x ∈
A and x ∈ B}. The intersection of A and B is also called the product of A
and B.

3. Complement (Not): The complement of a set A, denoted by A′, is the set consisting
of all elements of U that are not in A, i.e., A′ = {x : x ∈ U and x ∉ A}.

Equivalent Sets

• Commutative laws:

– A∪B =B∪A
– A∩B =B∩A

• Associative laws:

– A ∪ (B ∪ C) = (A ∪ B) ∪ C
– A ∩ (B ∩ C) = (A ∩ B) ∩ C

• Distributive laws:

– A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
– A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)

• Identity laws:

– A ∪ A = A, A ∩ A = A
– A ∪ U = U, A ∩ U = A
– A ∪ ∅ = A, A ∩ ∅ = ∅

• A ∪ A′ = U and A ∩ A′ = ∅

• ∅′ = U and U′ = ∅

• De Morgan's laws:

– (A ∪ B)′ = A′ ∩ B′
– (A ∩ B)′ = A′ ∪ B′

• A ⊂ B ⇔ B′ ⊂ A′ ⇒ A ∪ B = B and A ∩ B = A


Sets having no element in common, i.e., A ∩ B = ∅, are called mutually exclusive or
disjoint sets. Hence, note that U and ∅ are mutually exclusive sets, and so are A and A′.
If A and B are finite sets, then n(A ∪ B) = n(A) + n(B) − n(A ∩ B).

Example 5.1. Let U = {a, b, c, d, e, f, g, h}. Let A = {a, d, e}, B = {d, e, g, h} and
C = {a, d, c, e, h}. Find A ∪ B, A ∩ B, A ∩ B′, A′ ∩ B, (A ∪ B)′, (A ∩ B)′, A ∩ (B ∪ C),
A ∪ (B ∩ C).
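
The set operations in Example 5.1 map directly onto Python's built-in `set` type (`|` for union, `&` for intersection, `-` for difference). The sketch below uses a hypothetical helper `complement` that takes the complement relative to U.

```python
# Universal set and the three sets from Example 5.1
U = set("abcdefgh")
A = {"a", "d", "e"}
B = {"d", "e", "g", "h"}
C = {"a", "d", "c", "e", "h"}

def complement(s):
    """Elements of U that are not in s."""
    return U - s

results = {
    "A ∪ B": A | B,
    "A ∩ B": A & B,
    "A ∩ B'": A & complement(B),
    "A' ∩ B": complement(A) & B,
    "(A ∪ B)'": complement(A | B),
    "(A ∩ B)'": complement(A & B),
    "A ∩ (B ∪ C)": A & (B | C),
    "A ∪ (B ∩ C)": A | (B & C),
}

for name, value in results.items():
    print(name, "=", sorted(value))

# De Morgan's laws hold for these sets
assert complement(A | B) == complement(A) & complement(B)
assert complement(A & B) == complement(A) | complement(B)
```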

Example 5.2. Consider U = {x : x ≥ 0} and three events A = {x : x ≤ 100},
B = {x : 50 ≤ x ≤ 200} and C = {x : x ≥ 150}. Find i) A ∪ B ii) A ∩ B iii) B ∩ C iv)
(A ∩ B)′

Example 5.3. In a survey conducted among 200 statistics major students, the number of
students who visited historical, religious and both sites are found to be 150, 130 and 80
respectively. Find the number of students who visited none of the sites.

5.2 Basic Probability Terms

1. Experiment: It is an activity or a trial that leads to well-defined results called
outcomes, although it is uncertain which result will occur. Hence, a probability
experiment is identified by two properties. First, each experiment has several (at least
two) possible outcomes, all of which are known in advance; second, none of
these outcomes can be predicted with certainty. For example, in the experiment of
tossing a fair coin, we cannot be certain whether the outcome will be a head.

2. Outcome: is a result of a single trial (experiment).

3. Sample Space (S): is a collection of all possible outcomes of an experiment. (In
this context, S represents the universal set described previously.)
Examples: Define the sample space for the following probability experiments.

(a) Tossing a coin: S={Head (H), Tail (T )}

(b) Tossing two coins: S = {HH, HT, T H, T T }
(c) Rolling a die: S={1,2,3,4,5,6}
(d) Selecting an item from a production lot: S={Defective, Non-defective}
(e) Introducing a new product: S={Success, Failure}

4. Event: is an outcome or a set of outcomes (having some common characteristic) of
an experiment. For example, in the experiment of tossing two coins simultaneously, if
the event E is defined as getting exactly one head, then E = {HT, TH}. In the experiment
of rolling a die, let E be getting an even number; then E = {2, 4, 6}.


Since S ⊂ S and ∅ ⊂ S, it follows that S and ∅ are also events. S is called the certain (sure)
event because every outcome is an element of S. The event ∅ is called the impossible event
because no outcome of the experiment can be an element of ∅.

Definition:
Mutually Exclusive Events: Two events A and B are said to be mutually exclusive if
they cannot occur together, i.e., A ∩ B = ∅. For example, in the experiment of rolling a
die, odd numbers and even numbers are mutually exclusive events.

Let us now use the various methods of combining sets (that is, events) and obtain the new
sets (that is, events) which are introduced earlier. Consider the outcome s and events A
and B:
• If s ∈ A, then A occurs. A′ is the event which occurs if A does not occur.
• If s ∈ (A ∪ B), then at least one of the events A or B occurs.
• If s ∈ (A ∩ B), then both events A and B occur.
• If s ∈ (A′ ∩ B′), then neither A nor B occurs.
• If s ∈ (A′ ∩ B), then B occurs but not A.
• If s ∈ (A ∩ B′) ∪ (A′ ∩ B), then exactly one of the events A or B occurs.
• If s ∈ (A ∩ B)′ = A′ ∪ B′, then A and B do not both occur.

5.3 Counting Techniques

Counting techniques are mathematical methods used to determine the number of possible
ways of arranging or ordering objects. They are used to find the size of a sample space
that is too large to list. Example: What is the size of the sample space if a coin is
tossed a large number of times, say 20 or more?
1. Addition Rule: Suppose there are k procedures (p1 , p2 , · · · , pk ), in which the ith
procedure can be done in ni ; i = 1, 2, · · · , k ways. Hence, the total number of ways of
performing p1 or p2 or · · · or pk is n1 + n2 + · · · + nk , provided that no two procedures
can be performed at the same time or one after the other.

Example 5.4. There are 2 bus and 3 train routes from city X to city Z. In how
many ways can a person go from city X to city Z? Ans: 2 + 3 = 5 ways

2. Multiplication Rule: Suppose there is a sequence of k events in which the ith
event has $n_i$ ($i = 1, 2, \cdots, k$) possibilities. Then the total number of possibilities of the
whole sequence is $n_1 \times n_2 \times \cdots \times n_k$.


Example 5.5. There are 2 bus routes from city X to city Y and 3 train routes from
city Y to city Z. In how many ways can a person go from city X to city Z? Ans:
2 × 3=6 ways
Example 5.6. There are 6 questions. Each question has 4 choices. How many
answer keys must be made? Ans: 4 × 4 × 4 × 4 × 4 × 4 = $4^6$ = 4096
Example 5.7. There are 5 hotels in a city. If 4 persons each check into a different hotel,
in how many ways can this be done? Ans: 5 × 4 × 3 × 2 = 120
Example 5.8. In how many ways can 6 persons be seated in a row? Ans: 6 × 5 × 4 ×
3 × 2 × 1 = 720
Example 5.9. Seven dice are rolled. How many different outcomes are there? Ans:
6 × 6 × 6 × 6 × 6 × 6 × 6 = $6^7$ = 279936
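
The multiplication-rule counts in Examples 5.6 to 5.9 are easy to reproduce in Python. The sketch below also enumerates a deliberately small case with `itertools.product` to show that the rule agrees with brute-force listing; the variable names are illustrative.

```python
import math
from itertools import product

answer_keys = 4 ** 6                         # Example 5.6: 6 questions, 4 choices each
hotel_assignments = math.prod([5, 4, 3, 2])  # Example 5.7: 4 persons, 5 hotels, all different
seatings = math.factorial(6)                 # Example 5.8: 6 persons in a row
dice_outcomes = 6 ** 7                       # Example 5.9: seven dice

# Brute-force check on a small case: 3 questions with 4 choices each
small_case = list(product("ABCD", repeat=3))
assert len(small_case) == 4 ** 3

print(answer_keys, hotel_assignments, seatings, dice_outcomes)
```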

3. Permutation: is the arrangement or selection of objects in a specific order.

(a) Permutation Rule 1: The number of permutations of n distinct objects tak-
ing all together is n! = n × (n − 1) × (n − 2) × · · · × (1). By definition 1! = 0! = 1

Example 5.10. In how many ways can 6 persons be seated in a row? Ans:
6! = 720
Example 5.11. Suppose a photographer must arrange 4 persons in a row for
a photograph. In how many different ways can the arrangement be done? Ans:
4! = 24
(b) Permutation Rule 2: The arrangement of n distinct objects in a specific order
using r objects at a time is called a permutation of n objects taking r objects
at a time, denoted ${}_nP_r$, where

${}_nP_r = \frac{n!}{(n - r)!}, \quad 0 \le r \le n.$

Example 5.12. In how many ways can 9 books be arranged on a shelf having
4 places? Ans: ${}_9P_4 = \frac{9!}{(9 - 4)!} = 9 \times 8 \times 7 \times 6 = 3024$
Example 5.13. How many 5 letter permutations can be formed from the letters
in the word 'DISCOVER'? Ans: ${}_8P_5 = \frac{8!}{(8 - 5)!} = 8 \times 7 \times 6 \times 5 \times 4 = 6720$
(c) Permutation Rule 3: The number of permutations of n objects in which $n_1$
are alike, $n_2$ are alike, $\cdots$, $n_r$ are alike is given by

$\frac{n!}{n_1! \times n_2! \times \cdots \times n_r!}$

where $n_1 + n_2 + \cdots + n_r = n$.

Example 5.14. How many different permutations can be made from the letters
in the word
a. STATISTICS. Ans: $\frac{10!}{3! \times 3! \times 1! \times 2! \times 1!}$
b. MISSISSIPPI. Ans: $\frac{11!}{1! \times 4! \times 4! \times 2!}$
c. EXERCISES. Ans: $\frac{9!}{3! \times 1! \times 1! \times 1! \times 1! \times 2!}$
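
The three permutation rules can be evaluated numerically as follows. This is a sketch that assumes Python 3.8 or later (for `math.perm` and `math.prod`); the helper `multiset_permutations` is a hypothetical name for Rule 3.

```python
import math
from collections import Counter

def multiset_permutations(word):
    """Permutation Rule 3: n! divided by the factorial of each letter count."""
    counts = Counter(word)
    denominator = math.prod(math.factorial(c) for c in counts.values())
    return math.factorial(len(word)) // denominator

books = math.perm(9, 4)      # Example 5.12: 9P4
letters = math.perm(8, 5)    # Example 5.13: 8P5

statistics_word = multiset_permutations("STATISTICS")
mississippi_word = multiset_permutations("MISSISSIPPI")
exercises_word = multiset_permutations("EXERCISES")

print(books, letters)
print(statistics_word, mississippi_word, exercises_word)
```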
4. Combination: is the arrangement or selection of objects without regard to order.
Here, order does not matter.
The number of combinations of r objects selected from n objects is denoted by
${}_nC_r = \binom{n}{r}$, where

${}_nC_r = \binom{n}{r} = \frac{n!}{(n - r)! \times r!}; \quad 0 \le r \le n$

Note: The difference between permutation and combination is that in combination, the or-
der of objects being selected (arranged) is not important, but order matters in permutation.

Example 5.15. In how many different ways can a secretary, a president and a manager
be selected from 5 persons?

Example 5.16. A committee of 3 persons is to be selected from 5 persons. In how many

different ways can this be done?

Example 5.17. A committee of 5 persons must be selected from 5 men and 8 women.
How many ways can the selection be done if there are at least 3 women in the committee?
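
Examples 5.15 to 5.17 contrast permutations with combinations. One possible way to compute them in Python (3.8 or later) is sketched below; treating Example 5.15 as an ordered selection is an interpretation, since the three posts are distinct.

```python
import math

# Example 5.15: three distinct posts filled from 5 persons (order matters)
officers = math.perm(5, 3)

# Example 5.16: a committee of 3 from 5 persons (order does not matter)
committee = math.comb(5, 3)

# Example 5.17: committee of 5 from 5 men and 8 women with at least 3 women,
# summed over committees with exactly 3, 4 and 5 women
at_least_3_women = sum(
    math.comb(8, women) * math.comb(5, 5 - women) for women in range(3, 6)
)

print(officers, committee, at_least_3_women)
```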

5.4 Definitions of Probability

1. Classical probability: It is also called mathematical probability. Suppose there
are N possible outcomes in the sample space S of an experiment. If, out of these N
outcomes, only n are favorable to the event E, then the probability that the event E
will occur is:

$P(E) = \frac{n(E)}{n(S)} = \frac{n}{N}.$
Example 5.18. Try the following examples.

(a) What is the probability of getting number 6 in rolling a die?

(b) What is the probability of getting one head in tossing two coins?


(c) A die is rolled. What is the probability of getting i) an odd number, ii) a number
greater than 4.
(d) An urn contains 6 white and 3 black balls. If one ball is selected, what is the
probability that the selected ball is black.
(e) A family plans to have three children. Describe the sample space for all possible
gender combinations. What is the probability that the family will have two
boys?
(f) Two dice are rolled. Describe the sample space. What is the probability of
getting i) a sum of 10 or more, ii) a pair in which at least one number is 3, iii) a
sum of 8, 9 or 10, iv) one number less than 4.
Solutions:
(a) S = {1, 2, 3, 4, 5, 6},
E = getting number 6 = {6}. Thus n(S) = 6 and n(E) = 1
$P(E) = \frac{n(E)}{n(S)} = \frac{1}{6}$
(b) S = {HH, HT, TH, TT},
E = getting one head = {HT, TH}. Thus n(S) = 4 and n(E) = 2
$P(E) = \frac{n(E)}{n(S)} = \frac{2}{4} = 0.5$
(c) S = {1, 2, 3, 4, 5, 6}, n(S) = 6
i. E = getting an odd number = {1, 3, 5}. Thus n(E) = 3
$P(E) = \frac{n(E)}{n(S)} = \frac{3}{6} = 0.5$
ii. E = getting a number > 4 = {5, 6}. Thus n(E) = 2
$P(E) = \frac{n(E)}{n(S)} = \frac{2}{6}$
2. Empirical probability: It is based on relative frequency. Given a frequency
distribution, the probability of an event being in a given class is

$P(E) = \frac{f}{\sum f}$

where f is the class frequency and $\sum f = n$ is the total number of observations.

The difference between classical and empirical probability is that the former uses
sample space to determine the numerical probability while the latter is based on fre-
quency distribution.


Example 5.19. Given the following frequency distribution.

Grade A B C D F
No. of students 10 20 50 15 5
What is the probability of selecting a student who scored B?
3. Subjective probability: It calculates probability based on an educated guess or
experience or evaluation of a problem. For example a physician might say that on the
basis of his/her diagnosis, there is a 30% chance the patient will need an operation.
Examples:
1. A committee of 5 persons must be selected from 5 men and 8 women. What is the
probability that the committee consists of at least 3 women?
2. An urn contains 6 white, 4 red and 9 black balls. If three balls are drawn at random,
what is the probability that:
(a) 2 of the balls drawn are white.
(b) 1 is of each color.
(c) none is red.
(d) at least one is white.
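Solutions:
1. Total number of ways of selection = ${}_{13}C_5$ = n(S),
At least three women = ${}_8C_3\,{}_5C_2 + {}_8C_4\,{}_5C_1 + {}_8C_5\,{}_5C_0$ = n(E)
$P(E) = \frac{{}_8C_3\,{}_5C_2 + {}_8C_4\,{}_5C_1 + {}_8C_5\,{}_5C_0}{{}_{13}C_5}$
2. Total = ${}_{19}C_3$
(a) E = 2 of the balls drawn are white. n(E) = ${}_6C_2\,{}_{13}C_1$
$P(E) = \frac{{}_6C_2\,{}_{13}C_1}{{}_{19}C_3}$
(b) E = 1 is of each color. n(E) = ${}_6C_1\,{}_4C_1\,{}_9C_1$
$P(E) = \frac{{}_6C_1\,{}_4C_1\,{}_9C_1}{{}_{19}C_3}$
(c) E = None is red. n(E) = ${}_4C_0\,{}_{15}C_3$
$P(E) = \frac{{}_4C_0\,{}_{15}C_3}{{}_{19}C_3}$
(d) E = At least one is white. n(E) = ${}_6C_1\,{}_{13}C_2 + {}_6C_2\,{}_{13}C_1 + {}_6C_3\,{}_{13}C_0$
$P(E) = \frac{{}_6C_1\,{}_{13}C_2 + {}_6C_2\,{}_{13}C_1 + {}_6C_3\,{}_{13}C_0}{{}_{19}C_3}$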
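
The expressions in these solutions can be evaluated exactly with Python's `fractions.Fraction` and `math.comb` (Python 3.8 or later). This is a sketch with illustrative variable names; part (d) is computed through the complement (no white ball), which gives the same value as the three-term sum.

```python
from fractions import Fraction
from math import comb

# 1. Committee of 5 from 5 men and 8 women, at least 3 women
p_committee = Fraction(
    sum(comb(8, w) * comb(5, 5 - w) for w in range(3, 6)),
    comb(13, 5),
)

# 2. Three balls drawn from 6 white, 4 red and 9 black
total = comb(19, 3)
p_two_white = Fraction(comb(6, 2) * comb(13, 1), total)        # (a)
p_one_each = Fraction(comb(6, 1) * comb(4, 1) * comb(9, 1), total)  # (b)
p_no_red = Fraction(comb(15, 3), total)                        # (c)
p_at_least_one_white = 1 - Fraction(comb(13, 3), total)        # (d)

print(p_committee, p_two_white, p_one_each, p_no_red, p_at_least_one_white)
```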


5.5 Some Rules of Probability

1. The probability of an event always lies in between 0 and 1, that is, 0 ≤ P (E) ≤ 1.

• P (E) = 0, means it is sure that E can never happen.

• P (E) = 1, means the event E is certain to occur (E occurs surely)

Example: What is the probability of getting

(a) number 9 in rolling a die. Ans: 0

(b) a number less than 7 in rolling a die. Ans: 1
2. The sum of the probabilities of each outcome in the sample space S is 1, i.e., $\sum p_i = 1$.

Example: Consider the experiment of rolling a die.

Outcome 1 2 3 4 5 6
Probability 1/6 1/6 1/6 1/6 1/6 1/6

$\Rightarrow \sum p_i = 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 6/6 = 1.$

Example: Suppose that only three outcomes are possible in an experiment: $a_1$, $a_2$
and $a_3$. Suppose furthermore that $a_1$ is twice as likely to occur as $a_2$, which is four
times as likely to occur as $a_3$. Find $p_1$, $p_2$ and $p_3$.

3. If there are two events A and B, the probability that at least one of these events will
occur is the sum of the probabilities that each event will occur minus the probability
that both events will occur at the same time. That is, P(A ∪ B) = P(A) + P(B) −
P(A ∩ B).

Example: A part time student is taking two courses, namely economics and statis-
tics. The probability that the student will pass economics course is 0.60 and the
probability of passing statistics course is 0.70. The probability that the student will
pass both courses is 0.50. Find the probability that the student

(a) will pass at least one course.

(b) will fail both courses.

Solution:

(a) P(E ∪ S) = P(E) + P(S) − P(E ∩ S) = 0.60 + 0.70 − 0.50 = 0.80

(b) P(E′ ∩ S′) = 1 − P(E ∪ S) = 1 − 0.80 = 0.20


4. The complement of an event E (denoted by E′) is the occurrence of any outcome or
event that precludes E from happening, i.e., the event that E does not occur. If E′ is
the complement of E, then P(E′) = 1 − P(E).

Example: Suppose that A and B are events for which P(A) = x, P(B) = y and
P(A ∩ B) = z. Express each of the following probabilities in terms of x, y and z: a)
P(A′ ∪ B′), b) P(A′ ∩ B), c) P(A′ ∩ B′), d) P(A ∩ B′).

Mutually exclusive events: If two events cannot occur simultaneously, that is, one excludes
the other, then the two events are said to be mutually exclusive. As a result, if
events A and B are mutually exclusive, then P(A ∩ B) = 0. For such events, the occurrence
of one prevents the occurrence of the other.

Example: What is the probability of getting head and tail in tossing a coin? Ans:
P (H ∩ T ) = 0.

5.6 Conditional Probability and Independence

5.6.1 Conditional Events
When the outcome or occurrence of an event affects the outcome or occurrence of another
event, the two events are said to be dependent (conditional).

If the events A and B are dependent on each other, the probability of event B occurring
given that event A has already occurred is called the conditional probability of B
given that event A has occurred:

$P(B/A) = \frac{P(A \cap B)}{P(A)}, \quad P(A) > 0$

$\Rightarrow P(A \cap B) = P(A)P(B/A).$

Similarly, the probability of event A occurring given that event B has already occurred
is called the conditional probability of A given that event B has occurred:

$P(A/B) = \frac{P(B \cap A)}{P(B)}, \quad P(B) > 0$

$\Rightarrow P(B \cap A) = P(B)P(A/B).$

Remarks:

i. 0 ≤ P(A/B) ≤ 1 and 0 ≤ P(B/A) ≤ 1

ii. P(S/A) = 1 and P(A/S) = P(A)

iii. $P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)P(A_2/A_1)P(A_3/A_1 \cap A_2) \cdots P(A_n/A_1 \cap A_2 \cap \cdots \cap A_{n-1})$

Examples:


1. Recall the previous example that a part time student who is taking two courses,
economics and statistics. Find P (E/S) and P (S/E).

2. A package contains 12 resistors, 3 of which are defective. If 3 are selected, find the
probability of getting

(a) no defective resistor.

(b) one defective resistor.
(c) all defective resistors.

3. Urn I contains 4 white balls and 5 red balls, and urn II contains 6 white balls and
8 red balls. A ball is chosen at random from urn I and put into urn II. Then a ball
is chosen at random from urn II. What is the probability that this ball is white?

4. An urn contains 6 green and 4 black balls. Another urn contains 7 green and 9 black
balls. Two balls are transferred from the first urn and placed in the second urn.
Then one ball is taken from the latter. What is the probability that the ball drawn
from the second urn is black?

Solutions:

1. P (E/S) =, P (S/E) =

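2. Three resistors can be selected from the 12 in 12C3 = 220 ways, and 9 of the 12 resistors are non-defective.

(a) P (no defective) = 9C3/12C3 = 84/220 = 21/55

(b) P (one defective) = (3C1 × 9C2)/12C3 = 108/220 = 27/55

(c) P (all defective) = 3C3/12C3 = 1/220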
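The resistor and urn examples above can be checked numerically. The following is a minimal sketch, not part of the original notes: it uses exact fractions, counts combinations for Example 2, and conditions on what was transferred for Examples 3 and 4. The helper name `defective_prob` is hypothetical.

```python
from fractions import Fraction
from math import comb

def defective_prob(k, total=12, defective=3, drawn=3):
    """P(exactly k defective resistors among those drawn without replacement)."""
    good = total - defective
    return Fraction(comb(defective, k) * comb(good, drawn - k), comb(total, drawn))

# Example 2: none, one, or all three selected resistors are defective.
p_none, p_one, p_all = (defective_prob(k) for k in (0, 1, 3))

# Example 3: condition on the colour of the ball moved from urn I to urn II.
# After the transfer, urn II holds 15 balls: 7 white if a white ball was moved, else 6.
p_white = Fraction(4, 9) * Fraction(7, 15) + Fraction(5, 9) * Fraction(6, 15)

# Example 4: condition on the number b of black balls among the 2 transferred.
# After the transfer, the second urn holds 18 balls, 9 + b of them black.
p_black = sum(
    Fraction(comb(4, b) * comb(6, 2 - b), comb(10, 2)) * Fraction(9 + b, 18)
    for b in range(3)
)

print(p_none, p_one, p_all)  # 21/55 27/55 1/220
print(p_white, p_black)      # 58/135 49/90
```

Both urn results are applications of P (A ∩ B) = P (A)P (B/A), summed over the mutually exclusive possibilities for the transferred balls.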

5.6.2 Independent Events

Two events are said to be independent if the occurrence of one does not affect the occurrence
of the other. If events A and B are independent, the probability of A occurring is
in no way affected by event B having occurred, and vice versa; hence, P (A∩B) = P (A)P (B).

Example

1. A coin is tossed and a die is rolled. What is the probability of getting a head on the
coin or the number 4 on the die?

2. An urn contains 6 white and 3 black balls. Three balls are drawn. What is the
probability that all the drawn balls will be black

(a) if the selection is without replacement.

(b) if the selection is done with replacement.

Solutions:


1. Let A = getting a head on the coin ⇒ P (A) = 1/2. Let B = getting the number 4 on the
die ⇒ P (B) = 1/6.
Since A and B are independent, P (A ∩ B) = P (A)P (B) = 1/12.
Thus P (A ∪ B) = P (A) + P (B) − P (A ∩ B) = 1/2 + 1/6 − 1/12 = 7/12.

2. N = 9, n = 3
Let E1 = the first ball selected is black.
Let E2 = the second ball selected is black.
Let E3 = the third ball selected is black.

(a) Without replacement: P (E1 ∩ E2 ∩ E3 ) = P (E1 )P (E2 /E1 )P (E3 /E1 ∩ E2 ) = 3/9 × 2/8 × 1/7 = 1/84

(b) With replacement: P (E1 ∩ E2 ∩ E3 ) = P (E1 )P (E2 )P (E3 ) = 3/9 × 3/9 × 3/9 = 1/27

If two events A and B are independent, then

• P (A/B) = P (A), P (B) > 0

• P (B/A) = P (B), P (A) > 0

Example: Let A and B be two events associated with an experiment. Suppose that
P (A) = 0.4, P (A ∪ B) = 0.7 and P (B) = p. For what choice of p are A and B independent?
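One way to approach this example: if A and B are independent, then P (A ∪ B) = P (A) + p − P (A)p, which can be solved for p. The sketch below is an illustration only, with a hypothetical helper name; it solves that equation with exact fractions and then confirms that the product rule holds.

```python
from fractions import Fraction

def p_for_independence(p_a, p_union):
    """Solve p_union = p_a + p - p_a * p for p = P(B)."""
    return (p_union - p_a) / (1 - p_a)

p_a = Fraction(2, 5)        # P(A) = 0.4
p_union = Fraction(7, 10)   # P(A ∪ B) = 0.7

p = p_for_independence(p_a, p_union)
p_intersection = p_a + p - p_union   # from the addition rule

print(p)                             # 1/2
print(p_intersection == p_a * p)     # True: P(A ∩ B) = P(A)P(B)
```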

EXERCISE: A certain travel club has 1000 members. 60% of these members are males.
45% of these members pay by credit card when they travel including 175 females. If a
member is selected from the travel club at random, what is the probability that:

1. the member is a female.

2. the member is a female and pays in cash.

3. the member is a male or a credit card user.

4. the member pays cash if we know that the member is a female.

Are the sex of the member and the mode of payment statistically independent events?

Chapter 6

Probability Distributions

6.1 Random Variable

A random variable is a variable whose values are determined by chance (with some
probability). It is denoted by a capital letter, for example X, and its values are denoted by
the corresponding small letter, xi . The set of all possible values of a random variable
is called its range space (RX ).

If the number of possible values of a random variable X (that is, RX ) is finite or countably
infinite, the random variable is called a discrete random variable. That is, the possible values
of X may be listed as x1 , x2 , · · · , xn , · · · . In the finite case the list terminates and in the
countably infinite case the list continues indefinitely.

On the other hand, if the random variable assumes an uncountably infinite number of
possible values, the random variable is called a continuous random variable.

6.1.1 Probability Distribution

A probability distribution is a definition of the probabilities of the values of a random
variable. Based on the type of a random variable, a probability distribution can be discrete
or continuous.

Discrete Probability Distribution

With each possible value xi of a discrete random variable, a number p(xi ) = P (X = xi ),
called probability of xi is associated. The numbers p(xi ), i = 1, 2, · · · must satisfy the
following conditions:

i. $0 \le p(x_i) \le 1$

ii. $\sum_i p(x_i) = 1$


This function p defined above is called probability mass function (pmf ) of the random vari-
able X. The collection of pairs (xi , p(xi )), i = 1, 2, · · · is sometimes called the probability
distribution of X.

Examples:
1. Construct a probability distribution for the number of heads observed in tossing a
coin two times. Also plot the probability distribution using bar diagram.
2. Construct a probability distribution for the number of heads observed in tossing a
coin three times and plot it.
3. Construct a probability distribution for the number of girls if a family plans to have
four children.
Solutions:
1. S = {HH, HT, TH, TT}
Let X be the number of heads observed in tossing a coin two times. RX = {0, 1, 2}

x        0     1     2     Total
P (x)    1/4   2/4   1/4   1

2. S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}

Let X be the number of heads observed in tossing a coin three times. RX = {0, 1, 2, 3}

x        0     1     2     3     Total
P (x)    1/8   3/8   3/8   1/8   1
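The tables above can be reproduced by listing the sample space and counting. The following sketch is an illustration only: the helper `count_pmf` is a hypothetical name, and for the third example it assumes that a boy and a girl are equally likely at each birth, which the notes do not state explicitly.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def count_pmf(n_trials, symbols="HT", target="H"):
    """pmf of the number of `target` symbols in n_trials equally likely trials."""
    outcomes = list(product(symbols, repeat=n_trials))   # the sample space S
    counts = Counter(outcome.count(target) for outcome in outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

two_tosses = count_pmf(2)                         # Example 1
three_tosses = count_pmf(3)                       # Example 2
girls = count_pmf(4, symbols="GB", target="G")    # Example 3 (equal chances assumed)

for name, pmf in [("two tosses", two_tosses), ("three tosses", three_tosses), ("girls", girls)]:
    print(name, {x: str(p) for x, p in pmf.items()}, "total =", sum(pmf.values()))
```

Each pmf satisfies the two conditions stated earlier: every p(xi ) lies between 0 and 1, and the values sum to 1.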

Continuous Probability Distribution

The graph of the distribution (the equivalent of the bar graph for a discrete distribution)
is usually a smooth curve. The curve is described by an equation or a function, f (x), often
called probability density function (pdf ) satisfying the following conditions:
i. $f(x) \ge 0$, for all $x \in (a, b)$
ii. $\int_a^b f(x)\,dx = 1$
iii. $P(c \le X \le d) = \int_c^d f(x)\,dx$, if $a \le c \le d \le b$
Examples:
1. Show that each of the following functions is a pdf.

(a) $f(x) = \begin{cases} 1, & 0 \le x \le 1; \\ 0, & \text{otherwise.} \end{cases}$

(b) $f(x) = \begin{cases} 2x, & 0 \le x \le 1; \\ 0, & \text{otherwise.} \end{cases}$

(c) $f(x) = \begin{cases} e^{-x}, & x \ge 0; \\ 0, & \text{otherwise.} \end{cases}$

2. Find the value of b for the following function to be a pdf.

$$f(x) = \begin{cases} bx^2, & 0 \le x \le 1; \\ 0, & \text{otherwise.} \end{cases}$$
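These examples can be checked numerically before working them by hand. The sketch below is an illustration under stated assumptions: it uses a simple midpoint rule (the helper `integrate` is hypothetical), and it truncates the infinite range in (c) at 50, where the remaining area is negligible. Since all three functions are nonnegative, an area of 1 confirms each is a pdf; for Example 2, b must be the reciprocal of the area under x² on [0, 1].

```python
import math

def integrate(f, a, b, steps=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

area_a = integrate(lambda x: 1.0, 0, 1)             # Example 1(a)
area_b = integrate(lambda x: 2 * x, 0, 1)           # Example 1(b)
area_c = integrate(lambda x: math.exp(-x), 0, 50)   # Example 1(c), upper limit truncated

# Example 2: b * (integral of x^2 over [0, 1]) must equal 1.
b = 1 / integrate(lambda x: x ** 2, 0, 1)

print(round(area_a, 4), round(area_b, 4), round(area_c, 4), round(b, 4))
```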

Because continuous random variables differ from discrete random variables, continuous
probability distributions also differ from discrete ones. Some of the most important
differences are listed as follows:
1. The function f (x) does not give the probability that X = x, as p(x) did in the discrete
case. This is because X can take on uncountably many values and, therefore, it
is impossible to assign a positive probability to each value x. In fact, the value of f (x) is
not a probability at all; hence f (x) can take any nonnegative value, including values
greater than 1.
2. Since the area under the curve corresponding to a single point is zero, the probability
of obtaining exactly a specific value is zero. Thus, for a continuous random variable,
P (a ≤ X ≤ b) and P (a < X < b) are equivalent, which is certainly not true for
discrete distributions.
3. Finding areas under curves representing continuous probability distributions involves
the use of calculus and may become quite difficult. For some distributions, areas
cannot even be directly computed and require special numerical techniques. Of course
statistical computer programs easily calculate such probabilities.

6.1.2 Expectations of a Random Variable

The mean of a random variable X is known as the expected value of X, denoted by E(X).
It is defined as:
$$\mu = E(X) = \begin{cases} \sum x\,p(x), & \text{if } X \text{ is a discrete r.v.;} \\ \int x f(x)\,dx, & \text{if } X \text{ is a continuous r.v.} \end{cases}$$
The variance of the random variable X is the expected value of the square of the deviation
of X from its mean.
$$\sigma^2 = E(X - \mu)^2 = \begin{cases} \sum (x - \mu)^2 p(x), & \text{if } X \text{ is a discrete r.v.;} \\ \int (x - \mu)^2 f(x)\,dx, & \text{if } X \text{ is a continuous r.v.} \end{cases}$$
$\Rightarrow \sigma^2 = E(X - \mu)^2 = E(X - E(X))^2 = E(X^2) - (E(X))^2$
Examples:


1. Find the mean number of heads observed in tossing a coin three times.

2. Find the mean of the following probability distributions.

(a) $f(x) = \begin{cases} 1, & 0 \le x \le 1; \\ 0, & \text{otherwise.} \end{cases}$

(b) $f(x) = \begin{cases} 3x^2, & 0 \le x \le 1; \\ 0, & \text{otherwise.} \end{cases}$

Solution:

1. S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}

Let X be the number of heads observed in tossing a coin three times. RX = {0, 1, 2, 3}

x        0     1     2     3     Total
P (x)    1/8   3/8   3/8   1/8   1

$$\mu = E(X) = \sum x\,p(x) = 0 \times 1/8 + 1 \times 3/8 + 2 \times 3/8 + 3 \times 1/8 = 1.5$$
$$\sigma^2 = E(X - \mu)^2 = \sum (x - \mu)^2 p(x) = (0 - 1.5)^2 \times 1/8 + (1 - 1.5)^2 \times 3/8 + (2 - 1.5)^2 \times 3/8 + (3 - 1.5)^2 \times 1/8 = 0.75$$
$\Rightarrow \sigma = \sqrt{0.75} \approx 0.866$

2. (a) $E(X) = \mu = \int x f(x)\,dx = \int_0^1 x\,dx = 0.5$ and $\sigma^2 = \int (x - \mu)^2 f(x)\,dx = \int_0^1 (x - 0.5)^2\,dx = 1/12 \approx 0.083$

(b) $E(X) = \mu = \int x f(x)\,dx = \int_0^1 3x^3\,dx = 3/4$
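The definitions of E(X) and σ² translate directly into a few lines of code. The following is a sketch for checking the examples, not part of the original notes: the discrete case uses exact fractions, and the continuous case uses a hypothetical midpoint-rule helper `integrate` as a stand-in for the integrals.

```python
from fractions import Fraction

# Discrete case: number of heads in three tosses of a coin.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
shortcut = sum(x ** 2 * p for x, p in pmf.items()) - mean ** 2   # E(X^2) - (E(X))^2
std_dev = float(variance) ** 0.5

def integrate(f, a, b, steps=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def mean_and_variance(pdf, a, b):
    """Mean and variance of a continuous r.v. whose pdf is zero outside [a, b]."""
    mu = integrate(lambda x: x * pdf(x), a, b)
    var = integrate(lambda x: (x - mu) ** 2 * pdf(x), a, b)
    return mu, var

mean_a, var_a = mean_and_variance(lambda x: 1.0, 0, 1)          # f(x) = 1
mean_b, var_b = mean_and_variance(lambda x: 3 * x ** 2, 0, 1)   # f(x) = 3x^2

print(mean, variance, shortcut, round(std_dev, 3))   # 3/2 3/4 3/4 0.866
print(round(mean_a, 4), round(var_a, 4), round(mean_b, 4), round(var_b, 4))
```

The agreement between `variance` and `shortcut` illustrates the identity σ² = E(X²) − (E(X))².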

6.2 Common Discrete Distributions

6.2.1 The Binomial Distribution
The binomial distribution is one of the simplest and most frequently used discrete distributions
and is very useful in many practical situations involving either/or types of events. A binomial
experiment satisfies the following conditions:

1. Each trial has only two mutually exclusive outcomes, or outcomes that can be reduced
to two. One of the outcomes is labeled "success" and the other "failure".

2. The outcome of each trial is independent. That is, the outcome of one trial does not
affect the outcome of another.

3. The probability of ’success’ remains the same from trial to trial.


4. The experiment (trial) is performed a fixed number of times, say n.

Let X be the number of successes in n Bernoulli trials. Hence, RX = {0, 1, 2, · · · , n}.
Then X follows a binomial distribution with parameters n, the number of trials performed,
and p, the probability of success, written as X ∼ Bin(n, p). The probability of
obtaining x successes in n trials is given by:

$$P(X = x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \cdots, n$$
where p is the probability of success, q = 1 − p is the probability of failure, n is the number of
trials and x is the number of successes.

This is called the binomial distribution. The mean is E(X) = np and the variance is V (X) =
npq.

Examples:
1. Suppose a coin is tossed 10 times. What is the probability of getting

(a) exactly 3 heads.

(b) no head.
(c) at most 3 heads.
(d) at least 3 heads.
(e) more than 3 heads

Find the average and variance of the number of heads.

2. The probability of a man kicking into the goal is 2/3. If a person kicks 5 times, what
is the probability of scoring

(a) two goals.

(b) at least one goal.
(c) at most 3 goals.

Find the average, variance and standard deviation of the number of goals.
Solution:
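1. Let X be the number of heads observed in tossing a coin 10 times, RX = {0, 1, 2, · · · , 10}.
$p = P(\text{Success}) = P(\text{Head}) = \frac{1}{2} \Rightarrow q = 1 - p = \frac{1}{2}$
X ∼ Bin(n = 10, p = 0.5)
$$\Rightarrow P(X = x) = \binom{10}{x} \left(\frac{1}{2}\right)^x \left(\frac{1}{2}\right)^{10-x}; \quad x = 0, 1, 2, \cdots, 10$$

(a) $P(X = 3) = \binom{10}{3} \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right)^{10-3} = 120/1024 \approx 0.1172$
(b) $P(X = 0) = \binom{10}{0} \left(\frac{1}{2}\right)^0 \left(\frac{1}{2}\right)^{10-0} = 1/1024 \approx 0.0010$
(c) P (X ≤ 3) = P (X = 0) + P (X = 1) + P (X = 2) + P (X = 3) = 176/1024 ≈ 0.1719
(d) P (X ≥ 3) = P (X = 3) + P (X = 4) + · · · + P (X = 10) = 1 − P (X < 3) = 1 − 56/1024 ≈ 0.9453
(e) P (X > 3) = P (X = 4) + P (X = 5) + · · · + P (X = 10) = 1 − P (X ≤ 3) = 1 − 176/1024 ≈ 0.8281

The average number of heads is E(X) = np = 5 and the variance is V (X) = npq = 2.5.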
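The binomial formula is easy to evaluate directly. The sketch below is an illustration only (the helper names `binom_pmf` and `binom_cdf` are hypothetical); it uses exact fractions to work through both examples: the coin tossed 10 times and the player who scores with probability 2/3 on each of 5 kicks.

```python
from fractions import Fraction
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Bin(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# Example 1: n = 10 tosses, p = 1/2.
n1, p1 = 10, Fraction(1, 2)
coin = {
    "exactly 3": binom_pmf(3, n1, p1),
    "no head": binom_pmf(0, n1, p1),
    "at most 3": binom_cdf(3, n1, p1),
    "at least 3": 1 - binom_cdf(2, n1, p1),
    "more than 3": 1 - binom_cdf(3, n1, p1),
}
coin_mean, coin_var = n1 * p1, n1 * p1 * (1 - p1)

# Example 2: n = 5 kicks, p = 2/3.
n2, p2 = 5, Fraction(2, 3)
kicks = {
    "two goals": binom_pmf(2, n2, p2),
    "at least one": 1 - binom_pmf(0, n2, p2),
    "at most 3": binom_cdf(3, n2, p2),
}
kick_mean, kick_var = n2 * p2, n2 * p2 * (1 - p2)
kick_sd = sqrt(kick_var)

for label, value in {**coin, **kicks}.items():
    print(f"{label}: {value} = {float(value):.4f}")
print(coin_mean, coin_var, kick_mean, kick_var, round(kick_sd, 4))
```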

6.2.2 The Poisson Distribution

The Poisson distribution is another theoretical discrete probability distribution that is useful
for modeling certain real situations. It differs from the binomial distribution in that
the number of failures cannot be counted even though the number of successes is
known. For example, in the case of patients coming to a hospital for emergency treatment,
the number of patients arriving in a given hour is known, but it is not possible to count
the number of patients not coming for emergency treatment in that hour. Accordingly,
the total number of outcomes (successes and failures) cannot be determined, and
hence the binomial distribution cannot be applied as a decision-making tool. In such a situation
the Poisson distribution should be used, provided the average number of patients arriving for
emergency treatment per hour is given. It is assumed that the arrival of patients is a random
phenomenon, and hence the exact number of patients arriving in any hour is not predictable.
Other examples of Poisson-distributed variables are the number of telephone calls arriving at a
switchboard, the number of cars in a certain parking lot, the number of customers coming to
a bank for service, and so on. All of these arrivals can be described by a discrete random
variable that takes on integer values (0, 1, 2, · · · ).

Let X be the number of successes in a specific period of time. Hence, RX = {0, 1, 2, · · · }.

Then X follows a Poisson distribution with parameter λ, the average number of events per
unit of time, written as X ∼ Poisson(λ). Hence, the probability of getting x successes
in the same time period is:

$$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \cdots$$
where λ is the average number of events per unit of time.
Properties:

1. The probability of success, p, is very small.

2. The experiment is performed indefinitely (n is very large).

3. The average number of events per unit of time (λ) is known.


The mean and variance of the number of successes for a Poisson distribution are the same, i.e.,
E(X) = V ar(X) = λ.
Examples:

1. On average a typist commits 3 errors per page. Find the probability that she will
make

(a) no mistake.
(b) two mistakes.
(c) more than one mistake.

2. Customers arrive at a photocopying machine at an average rate of two every 10 minutes.
What is the probability that there will be

(a) no arrivals during any period of ten minutes.

(b) exactly one arrival during this time period.
(c) more than two arrivals during this time period.

Solution:

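1. Let X be the number of errors committed per page, RX = {0, 1, 2, · · · }, E(X) = λ = 3

X ∼ Poisson(λ = 3)
$$\Rightarrow P(X = x) = \frac{e^{-3} 3^x}{x!}, \quad x = 0, 1, 2, \cdots$$
(a) $P(X = 0) = \frac{e^{-3} 3^0}{0!} = e^{-3} \approx 0.0498$
(b) $P(X = 2) = \frac{e^{-3} 3^2}{2!} = 4.5e^{-3} \approx 0.2240$
(c) $P(X > 1) = P(X = 2) + P(X = 3) + \cdots = 1 - P(X \le 1) = 1 - 4e^{-3} \approx 0.8009$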
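Both Poisson examples can be evaluated with a few lines of code. The following sketch is an illustration only (the helper `poisson_pmf` is a hypothetical name); "more than" probabilities are obtained from the complement, as in part (c) above.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** x / factorial(x)

# Example 1: typing errors, lambda = 3 per page.
no_mistake = poisson_pmf(0, 3)
two_mistakes = poisson_pmf(2, 3)
more_than_one = 1 - (poisson_pmf(0, 3) + poisson_pmf(1, 3))

# Example 2: arrivals at the photocopier, lambda = 2 per ten minutes.
no_arrival = poisson_pmf(0, 2)
one_arrival = poisson_pmf(1, 2)
more_than_two = 1 - sum(poisson_pmf(x, 2) for x in range(3))

print(round(no_mistake, 4), round(two_mistakes, 4), round(more_than_one, 4))
print(round(no_arrival, 4), round(one_arrival, 4), round(more_than_two, 4))
```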

6.3 Common Continuous Distributions

6.3.1 The Normal Distribution
The most frequently used continuous probability distribution is the normal, or Gaussian,
distribution. This distribution plays a pivotal role in statistical theory and
practice, particularly in the areas of statistical inference and statistical quality control. Its
importance is due to the fact that, in practice, experimental results very often seem to
follow the normal distribution, or bell-shaped curve.


If X is a continuous random variable having a normal distribution with mean µ and variance
σ 2 , written as X ∼ N (µ, σ 2 ), its density function is:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$$

Knowing the values of the two parameters, µ and σ 2 , completely determines the distribution.
Thus, the function describes a family of curves which may differ only with
regard to µ and σ 2 , but have the same characteristics.

Several interesting features can be determined from this function without really evaluating
it. Some of these features are:

1. The random variable X can take on any value from −∞ to ∞.

2. The curve is symmetric about the mean. This means that the area under the curve
below the mean equals the area above the mean, so the mean and median have
the same value.

3. The height of the curve is maximum at the mean value. Thus, the mean and mode
values coincide. This means the normal distribution has the same value for the mean,
median and mode.

4. The curve declines as we go in either direction from the mean, but never touches the
base (x-axis) so that the tails of the curve on both sides extend indefinitely.

5. Corresponding deciles, quartiles and percentiles on either side of the mean are equidistant from it.

Calculating Probabilities Under the Normal Distribution

The primary use of probability distributions is to find probabilities of the occurrence of
specified values of the random variable. For a normally distributed random variable X,
the probability between two values x1 and x2 is defined as:

$$P(x_1 < X < x_2) = \int_{x_1}^{x_2} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\,dx.$$
But integration of this function is quite complicated and is rarely carried out directly to
calculate such probabilities. Fortunately, a normal distribution can easily be standardized, which
allows us to use a single table for any normal distribution.

Suppose X has a normal distribution with mean µ and variance σ 2 , i.e., X ∼ N (µ, σ 2 ). If
we define $Z = \frac{X - \mu}{\sigma}$, then Z will have a normal distribution with mean 0 and variance
1, that is, Z ∼ N (0, 1).

Such a normal distribution with mean µ = 0 and variance σ 2 = 1 is called a standard normal
distribution. Hence, the pdf of the standard normal variate Z is given by:
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}, \quad -\infty < z < \infty$$
Now for any standard normal variate Z, the probability (area) between two values z1 and
z2 is defined as:
$$P(z_1 < Z < z_2) = \int_{z_1}^{z_2} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}\,dz.$$
Hence,
$$P(x_1 < X < x_2) = P\left(\frac{x_1 - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{x_2 - \mu}{\sigma}\right) = P(z_1 < Z < z_2)$$

The total area under the (standard) normal curve is 1. Hence, the area to the right and left
of the central value (µ = 0) of the standard normal distribution is 0.5 (as it is symmetric
about 0).
• P (Z > z) = P (Z < −z).

• P (Z < z) = 1 − P (Z > z).

Examples:
1. Find the area between 0 and 1.96; P (0 < Z < 1.96).
Solution: P (0 < Z < 1.96) = 0.4750 = P (−1.96 < Z < 0)

2. Find the area to the right of -1.96; P (Z > −1.96).

Solution: P (Z > −1.96) = P (Z > 0) + P (0 < Z < 1.96) = 0.5 + 0.4750 = 0.975

3. Find the area to the right of 2; P (Z > 2).

Solution: P (Z > 2) = P (Z > 0) − P (0 < Z < 2) = 0.5 − 0.4772 = 0.0228

4. Find the area to the left of -0.5; P (Z < −0.5).

Solution: P (Z < −0.5) = P (Z > 0.5) = P (Z > 0) − P (0 < Z < 0.5) = 0.5 − 0.1915 =
0.3085

5. Find the area between -1 and 1.5; P (−1 < Z < 1.5).
Solution: P (−1 < Z < 1.5) = P (−1 < Z < 0) + P (0 < Z < 1.5) = P (0 < Z <
1) + P (0 < Z < 1.5) = 0.3413 + 0.4332 = 0.7745

6. Suppose that X is normally distributed with µ = 10 and σ 2 = 20 (or σ = 4.472). Find

(a) P (X > 15)

(b) P (5 < X < 15)
(c) P (5 < X < 10)

Solutions:
(a) $P(X > 15) = P\left(Z > \frac{15 - \mu}{\sigma}\right) = P\left(Z > \frac{15 - 10}{4.472}\right) = P(Z > 1.12) = P(Z > 0) - P(0 < Z < 1.12) = 0.5 - 0.3686 = 0.1314$
(b) $P(5 < X < 15) = P\left(\frac{5 - \mu}{\sigma} < Z < \frac{15 - \mu}{\sigma}\right) = P\left(\frac{5 - 10}{4.472} < Z < \frac{15 - 10}{4.472}\right) = P(-1.12 < Z < 1.12) = 0.7372$
(c) $P(5 < X < 10) = P\left(\frac{5 - \mu}{\sigma} < Z < \frac{10 - \mu}{\sigma}\right) = P\left(\frac{5 - 10}{4.472} < Z < \frac{10 - 10}{4.472}\right) = P(-1.12 < Z < 0) = 0.3686$
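The table look-ups in these examples can be cross-checked in code. The sketch below is an illustration, not part of the original notes: it computes the standard normal cumulative probability through the error function, Φ(z) = ½(1 + erf(z/√2)), and the helper names `phi` and `normal_between` are hypothetical. Because the code does not round z to two decimals, results for Example 6 differ from the table-based answers in the third or fourth decimal place.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative probability P(Z < z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_between(lo, hi, mu=0.0, sigma=1.0):
    """P(lo < X < hi) for X ~ N(mu, sigma^2), by standardizing both limits."""
    return phi((hi - mu) / sigma) - phi((lo - mu) / sigma)

# Examples 1 to 5: areas under the standard normal curve.
area_1 = normal_between(0, 1.96)     # about 0.4750
area_2 = 1 - phi(-1.96)              # about 0.9750
area_3 = 1 - phi(2)                  # about 0.0228
area_4 = phi(-0.5)                   # about 0.3085
area_5 = normal_between(-1, 1.5)     # about 0.7745

# Example 6: X ~ N(10, 20), so sigma = sqrt(20).
mu, sigma = 10, sqrt(20)
p_above_15 = 1 - phi((15 - mu) / sigma)
p_5_to_15 = normal_between(5, 15, mu, sigma)
p_5_to_10 = normal_between(5, 10, mu, sigma)

print([round(a, 4) for a in (area_1, area_2, area_3, area_4, area_5)])
print([round(p, 4) for p in (p_above_15, p_5_to_15, p_5_to_10)])
```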

If the concern is to find the value of z for a given probability, the form of notation
often called the zα notation can be adopted. According to this notation, zα is the value
of z such that P (Z > zα ) = α. By the symmetry of the normal distribution, this definition
yields the equivalent statements P (Z < −zα ) = α and P (−zα/2 < Z <
zα/2 ) = 1 − α.
Examples:
1. Find the value z such that P (|Z| > z) = 0.10.

2. The IQ score of students is normally distributed with a mean of 120 and variance
400. What is the probability that a student will have an IQ

(a) between 100 and 130.

(b) above 140.
(c) below 150.
(d) between 140 and 150.

3. Let X be the variable representing the distribution of scores in a statistics course. It
can be assumed that these scores are normally distributed with µ = 75 and σ = 10.
If the instructor wants no more than 10% of the class to get an A, what should be
the cutoff grade? That is, what is the value of x such that P (X > x) = 0.10?
Solutions:
1. P (|Z| > z) = 0.10 ⇒ P (|Z| > z) = P (Z < −z) + P (Z > z) = 0.5 − P (0 < Z < z) +
0.5 − P (0 < Z < z) = 1 − 2P (0 < Z < z) = 0.10 ⇒ P (0 < Z < z) = 0.45 ⇒ z ≈ 1.65

2. Let X be the IQ score. X ∼ N (120, 400), so σ = 20.

$$P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right) = P\left(\frac{a - 120}{20} < Z < \frac{b - 120}{20}\right)$$

(a) $P(100 < X < 130) = P\left(\frac{100 - 120}{20} < Z < \frac{130 - 120}{20}\right) = P(-1 < Z < 0.5) = 0.5328$
(b) $P(X > 140) = P\left(Z > \frac{140 - 120}{20}\right) = P(Z > 1) = 0.1587$
(c) $P(X < 150) = P\left(Z < \frac{150 - 120}{20}\right) = P(Z < 1.5) = 0.9332$
(d) $P(140 < X < 150) = P\left(\frac{140 - 120}{20} < Z < \frac{150 - 120}{20}\right) = P(1 < Z < 1.5) = 0.0919$

3. $P(X > x) = P\left(Z > \frac{x - \mu}{\sigma}\right) = P\left(Z > \frac{x - 75}{10}\right) = 0.10$
$\Rightarrow z = \frac{x - 75}{10} = 1.28 \Rightarrow x = 87.8.$
Therefore, the instructor should assign an A grade to those students with scores 87.8
or higher.
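Finding z or x for a given probability is the inverse of a table look-up. As a sketch of how this could be done in software, the example below uses `NormalDist` from Python's standard `statistics` module (available in Python 3.8 and later); the variable names are illustrative choices and not part of the original notes.

```python
from statistics import NormalDist

# Example 1: P(|Z| > z) = 0.10 leaves 0.05 in each tail, so P(Z < z) = 0.95.
z_two_sided = NormalDist().inv_cdf(1 - 0.10 / 2)

# Example 2: IQ scores, X ~ N(120, 400), i.e. sigma = 20.
iq = NormalDist(mu=120, sigma=20)
p_100_130 = iq.cdf(130) - iq.cdf(100)
p_above_140 = 1 - iq.cdf(140)
p_below_150 = iq.cdf(150)
p_140_150 = iq.cdf(150) - iq.cdf(140)

# Example 3: cutoff x with P(X > x) = 0.10 for scores X ~ N(75, 100).
scores = NormalDist(mu=75, sigma=10)
cutoff = scores.inv_cdf(0.90)

print(round(z_two_sided, 3))   # close to the table value 1.65
print(round(p_100_130, 4), round(p_above_140, 4), round(p_below_150, 4), round(p_140_150, 4))
print(round(cutoff, 1))        # close to the table-based answer 87.8
```

The small differences from the hand solutions arise because the printed table gives z to only two decimal places.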

6.3.2 Other Continuous Distributions

The t Distribution
The t distribution is quite similar to the normal in that it is symmetric and bell shaped.
However, it has only one parameter, the degrees of freedom; hence the
t distribution with v degrees of freedom is denoted by t(v). The t distribution has heavier
("flatter") tails than the normal. That is, it has more probability in the extreme or tail areas than
does the normal distribution, although the difference is barely noticeable if the degrees of freedom
exceed 30 or so. In fact, as the degrees of freedom tend to ∞, the t distribution approaches the
standard normal distribution.

The χ2 Distribution
The χ2 distribution is usually denoted by χ2 (v), where v is the degrees of freedom. χ2
values are nonnegative. The shape of the distribution is different for each value of v. For
large values of v (usually greater than 30), the χ2 distribution can be approximated by a normal distribution.

The F Distribution
It is a continuous and right-skewed distribution. It is indexed by two degrees-of-freedom
parameters v1 and v2 ; these are usually integers, and the distribution is written as F (v1 , v2 ).

Chapter 7

Sampling Techniques

7.1 Basic Concepts

• Population: is defined statistically as the totality of all subjects having certain
common characteristics that are under study in a specific area and time.

• Population Size: It is the total number of elements in the population.

• Sample: is a subset of the population that is studied with the aim of estimating
the characteristics of the population.

• Sampling Frame: is a list of all elements of the population. The sampling frame
forms the basic material from which a sample is drawn. Hence, it should be complete
and up-to-date.

• Sampling Unit: The population may be regarded as consisting of units which are
to be used for the purpose of sampling. Each unit is regarded as individual and
indivisible when the selection is made. Such a unit is known as a sampling unit.

• Sample Size: It is the total number of elements in the sample. That is, the size of
the sample is the number of sampling units which are selected from the population
by a random method.

• Sampling: is the process of selecting a sample from the population.

• Sampling Technique: is a method of selecting a sample.

7.2 Reasons for Sampling

There are two broad classes of investigation: the census survey and the sample survey. In the
census method, a 100% inspection of the population is made, and each and every unit
of the population is enumerated. It enables us to obtain information about each and every
element in the population. The sample survey is a study in which some elements, which
are assumed to be representative of the population, are investigated. It is a statistical process
in which we select and examine a sample instead of considering the whole population.

In practice, it may not be possible to collect information on all units of the population. One
reason is lack of resources in terms of money, personnel and equipment. Another reason is
that a sample survey enables us to obtain results on time; hence, sampling is preferred when
quick results are needed. Sampling also helps to obtain data of good quality: since fewer
enumerators are needed, they can be trained and supervised well in the process of data
collection. Moreover, a complete investigation may be destructive in nature, and samples
reduce the damage caused by some tests in quality control. For example, in cooking food,
mothers check whether the food has enough salt, spices, butter and so on by
taking a small amount and testing it. What would happen if the whole dish were used for
the test?

7.3 Types of Errors

1. Sampling Errors: Sampling errors are the errors which are introduced in the selection
of a sample, or the discrepancies between population parameters and the
estimates derived from a random sample. These errors are due to sampling
fluctuations, which are the outcome of the random sampling process. They
can be controlled by a proper choice of sampling method and by increasing the sample size.

2. Nonsampling Errors: Experience shows that even studies based on complete enumeration
do not yield identical results in repeated enumerations. Such discrepancies are due
to errors which are termed nonsampling errors. Some of the sources of such
errors are observation or response errors and errors in editing and tabulating data.
These errors can be minimized through superior management of the survey, employing
suitable personnel and using modern computational aids.
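The claim that sampling error can be controlled by increasing the sample size can be illustrated with a small simulation. The following sketch rests on assumptions that are not in the original notes: an artificial population of 5000 values, the sample mean as the estimate, and a fixed random seed so that the run is repeatable.

```python
import random

def average_sampling_error(population, sample_size, repetitions, rng):
    """Average absolute gap between the sample mean and the population mean."""
    true_mean = sum(population) / len(population)
    total_error = 0.0
    for _ in range(repetitions):
        sample = rng.sample(population, sample_size)   # simple random sample
        total_error += abs(sum(sample) / sample_size - true_mean)
    return total_error / repetitions

rng = random.Random(7)                                   # hypothetical seed
population = [rng.gauss(50, 10) for _ in range(5000)]    # artificial population

error_small = average_sampling_error(population, 10, 200, rng)
error_large = average_sampling_error(population, 500, 200, rng)

print(round(error_small, 3), round(error_large, 3))
```

In a run of this kind the average error for samples of 500 is expected to be much smaller than for samples of 10. Nonsampling errors, by contrast, are not reduced by taking a larger sample.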

7.4 Types of Sampling Techniques

In the selection of a sample, the effort is always to make the sample truly representative
of the population. There are several sampling methods, which can be broadly classified into
two categories: probability and non-probability sampling methods.

In probability sampling, each unit in the population has a known, nonzero chance of being included
in the sample. In non-probability sampling, the units are drawn using a certain amount
of judgement.


7.4.1 Probability Sampling Techniques

1. Simple Random Sampling: In simple random sampling, each and every member
of the population has an equal and independent chance of being selected in the
sample. The items that get selected are purely a matter of chance. Before applying
this method, a complete list of all members, sampling frame, should be prepared so
that each member can be identified by a distinct number. There are two methods
that can be used in order to ensure the randomness of the selection. These are:

(a) Lottery Method: This method is useful for a comparatively small population.
All members of the population are numbered or named on separate
pieces of paper of identical size and shape. These slips of paper are then identically
folded and mixed up in a container. The probability of a particular slip
being selected first out of the total of N slips of paper is 1/N ; for the second
draw, this probability is 1/(N − 1), since N − 1 slips of paper are left
in the container after the first slip has been drawn. Similarly, the probability
of a particular slip being picked on the third draw is 1/(N − 2), and so on. Slips are drawn
from the container successively until the desired sample size is reached. The result
is a random sample called a simple random sample.
(b) Random Number Table Method: A random number table lists numbers
in a random order, usually generated by computer. In the lottery method,
the selection may be subject to human bias, as people may identify the slips (chits)
in many ways. The inconvenience of preparing slips of paper, shuffling them and
choosing the items one by one can be avoided by the use of a random number
table. The principle involved in this method is the same as that in the lottery
method.

Suppose N is a k digit number. Choose k digit numbers from the random
number table and read out the numbers continuously, vertically or horizontally.
If the number does not exceed N , include the unit with that number in the sample.
If the number is greater than N but less than the biggest multiple of N which
has k figures, divide that number by N , take the remainder r and include
the rth unit in the sample. Discard random numbers which are greater than
the biggest multiple of N with k figures. For example, if N = 43, take 2 digit
random numbers. If the number is, say, 23, include the unit with number 23
in the sample. If the second number is 68, since it is less than 86, the biggest
2 digit multiple of 43, divide 68 by 43, take the remainder, 25, and include
the unit with number 25 in the sample. If the number obtained is greater than
86, discard the number and go to the next in the table. This process continues
until n sampling units are selected.

2. Systematic Sampling: A systematic sample is formed by selecting the first unit at
random, after which the remaining units in the sample are selected automatically in some
predetermined pattern. The process requires that the members of the population
be arranged in some kind of order (alphabetical, numerical or any other
order), and every kth unit (k = N/n is called the sampling interval) is included in
the sample after the first item has been selected randomly. The sample may be considered
representative, as it is evenly distributed over the whole population. There
are two methods of systematic sample selection. These are:

(a) Linear Systematic Sampling: Suppose N is a multiple of n, that is, N = nk.
The procedure is to select a random number j such that 1 ≤ j ≤ k and then
select the jth unit and every subsequent (j + k)th, (j + 2k)th, · · · , (j + (n − 1)k)th positional
unit. This sampling plan is known as linear systematic sampling. However, N is
not always a multiple of n; in such a case a sample of
n − 1 units, instead of n, may be obtained.
(b) Circular Systematic Sampling: This is applied when N ≠ nk. Take k as
N/n rounded to the nearest integer. Select a random number m from 1
to N . Starting from the mth unit, select the (m + jk)th unit when m + jk ≤ N
and the (m + jk − N )th unit when m + jk > N , for j = 1, 2, · · · , until
n units are selected. This method always gives a sample of size n.

3. Stratified Sampling: When the population is heterogeneous with respect to the
characteristic in which one is interested, stratified sampling should be adopted so
that the sample is representative. The heterogeneous population is divided into
homogeneous sub-groups, called strata, ensuring maximum uniformity within each
stratum and the largest degree of variability among strata. From each stratum a separate
sample is selected using simple random sampling. This sampling method is known
as stratified sampling.

As an example of stratified sampling, assume that an investigator is interested in securing
a particular response that would be representative of undergraduate Haramaya
University students. S/he may stratify the population into four strata of freshman,
sophomore, junior and senior students, and take a simple random sample from each
stratum.

4. Cluster Sampling: In this sampling the population is divided into subpopulations
known as clusters. Unlike strata, the units within a cluster are relatively heterogeneous,
so that each cluster resembles the entire population. A random sample of clusters
is then selected, and the units in the selected clusters are studied. The value of cluster
sampling depends on how representative
each cluster is of the entire population. If all clusters are similar in this regard, then
sampling a small number of clusters will provide good estimates of the population
parameters.

5. Multistage Sampling: This method of sampling is useful when the population is
very widely spread and random sampling is not possible. Although cluster sampling is
advantageous under certain circumstances, it is generally less efficient than sampling
of individual units directly. In such a case the whole population is divided into a
number of first-stage (primary) units, each of which is composed of second-stage
units. A series of samples is then taken at successive stages. The sample size at
each stage is determined by the relative population size at each stage.
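Several of the selection procedures described in this section can be sketched in code. The example below is an illustration under stated assumptions rather than a definitive implementation: Python's `random.Random` stands in for the random number table, all function names are hypothetical, a remainder of 0 in the remainder rule is taken to mean unit N, repeated units are skipped so that sampling is without replacement, and the strata sizes and per-stratum sample sizes are invented for the demonstration.

```python
import random

def unit_from_table_number(number, N):
    """Remainder rule for one k digit random number; None means 'discard'."""
    k = len(str(N))
    largest_multiple = (10 ** k - 1) // N * N     # e.g. 86 when N = 43
    if number == 0 or number > largest_multiple:
        return None
    remainder = number % N
    return remainder if remainder else N          # assumption: remainder 0 -> unit N

def simple_random_sample(N, n, rng):
    """Draw n distinct units using k digit random numbers and the remainder rule."""
    k = len(str(N))
    chosen = []
    while len(chosen) < n:
        unit = unit_from_table_number(rng.randrange(10 ** k), N)
        if unit is not None and unit not in chosen:
            chosen.append(unit)
    return chosen

def linear_systematic(N, n, rng):
    """Assumes N = n * k: random start j in 1..k, then every kth unit."""
    k = N // n
    j = rng.randint(1, k)
    return [j + i * k for i in range(n)]

def circular_systematic(N, n, rng):
    """Random start m in 1..N, then every kth unit, wrapping around past N."""
    k = round(N / n)
    m = rng.randint(1, N)
    return [(m + j * k - 1) % N + 1 for j in range(n)]

def stratified_sample(strata, sizes, rng):
    """Simple random sample of sizes[name] units from each stratum."""
    return {name: rng.sample(units, sizes[name]) for name, units in strata.items()}

rng = random.Random(2013)                          # fixed seed for repeatability
srs = simple_random_sample(43, 5, rng)
linear = linear_systematic(20, 5, rng)
circular = circular_systematic(23, 5, rng)

# Hypothetical student ID lists for the four strata of the university example.
strata = {
    "freshman": list(range(1, 41)),
    "sophomore": list(range(41, 71)),
    "junior": list(range(71, 91)),
    "senior": list(range(91, 101)),
}
by_stratum = stratified_sample(strata, {"freshman": 4, "sophomore": 3, "junior": 2, "senior": 1}, rng)

print(srs, linear, circular, by_stratum, sep="\n")
```

With the text's own numbers, `unit_from_table_number(23, 43)` gives unit 23, `unit_from_table_number(68, 43)` gives unit 25, and any number above 86 is discarded.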

7.4.2 Non-probability Sampling Techniques

In some situations, judgement or purposive sampling is preferred to probability sampling.
For example, if one wants a sample of persons who are suffering from cancer, s/he has to
select cancer patients who happen to come to hospital(s).

Nonprobability sampling gives rise to those methods where the subjects are selected delib-
erately. No probability is attached or can be computed for an item being selected.

1. Quota Sampling: In case of stratified sampling if the cost of selecting sampling

units from each stratum is very high, then the investigator is assigned a quota (fixed
number of subjects) in each stratum. Then the actual selection of persons is left at
the discretion of the investigator.

2. Judgment Sampling: In this method, sampling units are selected on the judgement
of the person doing the study. The underlying assumption is that the unit selected
truly represent the entire population. For example to find out the potential of drip
irrigation technology, a researcher may go the teachers of Agricultural University.

3. Convenience Sampling: Here, an investigator selects the sample at his or her own
convenience. This method is based on the assumption that the population is homogeneous
and that the individuals selected and interviewed give similar information with regard
to the characteristic under study. For example, persons selected at gas stations or petrol
pumps to collect information about the quality of gas or petrol, the service, the
correctness of the measurement, etc. are supposed to represent the population of gasoline
buyers.

4. Snowball Sampling: Snowball sampling involves identifying an initial set of
respondents who can, in turn, help the investigator to identify other persons to be
included in the study. After interviewing one person, the investigator contacts the
next person referred and interviews him/her. In this way, the chain continues until
the required number of persons has been interviewed. This type of sampling
is most suitable in qualitative research. It is frequently used in marketing research.

Chapter 8

Statistical Inference for a Single Population

As noted earlier, one of the primary objectives of a statistical analysis is to use data from a
sample to make inferences about the population from which the sample was drawn. In this
section, the basic procedures for making such inferences are presented. Statistical inference
generally takes two forms, namely, estimation of the parameter and testing of a hypothesis.

8.1 Estimation
For the purpose of general discussion, let θ be the population parameter and θ̂ be the
corresponding statistic. As already stated, the parameter θ is unknown. The value of the
statistic θ̂ is computed from the random sample taken from the population.

The statistic θ̂ intended for estimating a parameter θ is called an estimator of θ. The
specific numerical value of an estimator calculated from the sample is called the estimate.
The process of obtaining an estimate of the unknown value of a parameter by a statistic
is called estimation. There are two types of estimation: point estimation and
interval estimation.

8.1.1 Point Estimation

Point estimation is the process of obtaining a single sample value (point estimate) that
is used to estimate the desired population parameter. The estimator is known as point
estimator. For example, X̄ is a point estimator of µ and S is a point estimator of σ.

The best estimator should be highly reliable and have desirable properties like unbiased-
ness, consistency, efficiency and sufficiency. These criteria are described as follows:

1. Unbiasedness: An estimator is a random variable since it is always a function of
the sample values. A sample statistic is considered to be an unbiased estimator if its
expected value equals the population parameter which is being estimated.
This means E(θ̂) = θ.

2. Consistency: It refers to the effect of sample size on the accuracy of the estimator. A
statistic is said to be a consistent estimator of the population parameter if it approaches
the parameter as the sample size increases, that is, θ̂ → θ as n → ∞ (for a finite
population of size N, as n → N).

3. Efficiency: An estimator is considered to be efficient if its value remains stable from
sample to sample. The best estimator is the one which has the least
variance from sample to sample. Of the three point estimators of central tendency,
namely, the mean, median and mode, the mean is considered the least variable and
hence is a better estimator of the population mean.

4. Sufficiency: An estimator is said to be sufficient if it uses all the information about
the population parameter contained in the sample. For example, the sample mean
uses all the sample values in its computation while the median and mode do not. Hence,
the mean is the better estimator in this sense.
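
The unbiasedness and efficiency criteria can be illustrated by simulation. The sketch below is an assumption-laden illustration, not part of the original text: it draws repeated samples from a hypothetical normal population (µ = 50, σ = 10) and compares how the sample mean and the sample median behave from sample to sample.

```python
import random
import statistics


def simulate_estimators(mu=50.0, sigma=10.0, n=25, reps=2000, seed=0):
    """Draw `reps` samples of size n and summarise the mean and the median."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return {
        "avg_of_means": statistics.fmean(means),         # close to mu: unbiased
        "var_of_means": statistics.variance(means),      # close to sigma^2 / n
        "var_of_medians": statistics.variance(medians),  # larger: less efficient
    }


sim = simulate_estimators()
```

For a normal population, the average of the simulated sample means should sit near µ, and the sample-to-sample variance of the mean should be smaller than that of the median.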

8.1.2 Interval Estimation

A point estimator has some drawbacks. First, a point estimate from the sample may not
exactly locate the population parameter, that is, the value of the point estimate is not likely
to be exactly equal to the value of the parameter, resulting in some margin of uncertainty.
If the sample value is different from the population value, the point estimate does not
indicate the extent of the possible error. Second, a point estimate does not specify
how confident we can be that the estimate is close to the parameter it is estimating. That
is, we cannot attach any degree of confidence to such an estimate. Because of these
limitations of point estimation, interval estimation is considered desirable. Interval
estimation involves the determination of an interval (a range of values) within which the
population parameter is expected to lie with a specified degree of confidence. It is the
construction of an interval on both sides of the point estimate within which we can be
reasonably confident that the true parameter will lie.

8.2 Hypothesis Testing

A statistical hypothesis is a conjecture (an assumption) about a population parameter
which may or may not be correct. Such a hypothesis usually results from speculation con-
cerning observed behavior, natural phenomena, or established theory.

Hypothesis testing is a statistical procedure for deciding, on the basis of sample data,
whether an assumption about the population parameter(s) is tenable. It
starts by making a set of two statements about the parameter(s) in question. These are
usually expressed in the form of simple mathematical relationships involving the
parameters. These two statements are mutually exclusive and exhaustive, which means that
exactly one of them must be true. The first statement is called the null hypothesis and is
denoted by H0, and the second is called the alternative hypothesis and is denoted by H1.

• Null Hypothesis: It is a statement about the values of one or more parameters. It
represents the status quo, that is, it states that there is no difference between a
parameter and a hypothesized value. For any parameter θ and an assumed value θ0,

H0 : θ = θ0 ⇔ θ − θ0 = 0.

• Alternative Hypothesis: It is often called the research hypothesis. An alternative
hypothesis is a statement that contradicts the null hypothesis, that is, it states that
there is a difference between a parameter value and a hypothesized value. Hence,
such a hypothesis may have three different forms:

– Two-sided test: H1 : θ ≠ θ0 ⇒ θ − θ0 ≠ 0
– One-sided test:
∗ Right tailed test: H1 : θ > θ0 ⇒ θ − θ0 > 0
∗ Left tailed test: H1 : θ < θ0 ⇒ θ − θ0 < 0

8.2.1 Basic Concepts in Hypothesis Testing

Errors in Hypothesis Testing
There are two types of errors in hypothesis testing.

• Type I Error: It is the error that occurs when one rejects a null hypothesis which is
actually true. The probability of making such an error is denoted by α and called the
significance level. This significance level (α) is the maximum acceptable probability
of rejecting a true null hypothesis.

• Type II Error: It is the error that occurs when one fails to reject a null hypothesis
which is actually false. The probability of making a type II error is denoted by β.
The power of a test is 1 − β, which is the probability of correctly rejecting
the null hypothesis when it is false.

Steps in Testing a Hypothesis

A statistical hypothesis test can be formally summarized as a five-step process. These are:

• Step 1: State both hypotheses; H0 and H1 .

• Step 2: Specify an acceptable level of significance, α.


• Step 3: Define a sample based test statistic (Tcal ) and rejection region (Ttab ) for H0 .

• Step 4: Make a decision, that is either reject or do not reject H0 .

• Step 5: Conclusion.

In step 1, H0 and H1 are the null and alternative hypotheses, respectively, as defined
before, and in step 2, α is the level of significance. The most common choices of
significance level are α = 0.1, α = 0.05 and α = 0.01. In step 3, the test statistic is a
sample statistic whose sampling distribution can be specified under both the null and the
alternative hypothesis (although the sampling distribution when the alternative hypothesis
is true may often be quite complex). After specifying the appropriate significance level α,
the sampling distribution of this statistic is used to define the rejection region. The
rejection (critical) region is the range of values of the test statistic that will lead to
rejection of the null hypothesis. It comprises the values of the test statistic for which
(1) the probability when the null hypothesis is true is less than or equal to the specified
α and (2) the probabilities when H1 is true are greater than they are under H0. Regarding
the decision in step 4, for a two-sided test reject H0 if |Tcal| ≥ Ttab, for a right tailed
test reject H0 if Tcal ≥ Ttab and for a left tailed test reject H0 if Tcal ≤ −Ttab.
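
The decision rules in step 4 can be written as a small helper. This is a sketch only: the function name `reject_h0` and the `tail` labels are illustrative choices rather than notation from the text, and the critical value `t_tab` is assumed to be supplied as a positive number read from the appropriate table.

```python
def reject_h0(t_cal, t_tab, tail="two-sided"):
    """Return True when the calculated statistic falls in the rejection region.

    t_cal: calculated test statistic (Z or t)
    t_tab: positive critical value from the table for the chosen alpha
    tail:  "two-sided", "right" or "left"
    """
    if tail == "two-sided":
        return abs(t_cal) >= t_tab
    if tail == "right":
        return t_cal >= t_tab
    if tail == "left":
        return t_cal <= -t_tab
    raise ValueError("tail must be 'two-sided', 'right' or 'left'")
```

For instance, `reject_h0(1.877, 1.96)` is `False`, while `reject_h0(3.10, 1.761, tail="right")` is `True`.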

8.2.2 Hypothesis Testing for a Population Mean

1. H0 : µ = µ0
H1 : µ ≠ µ0 / µ < µ0 / µ > µ0

2. Significance level = α.

3. Test Statistic and Rejection Region:

• If n is large, say n ≥ 30, the test statistic is
$$Z_{cal} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1),$$
with σ replaced by the sample standard deviation s when σ is unknown.
• If n is small, say n < 30, use the test statistic
$$t_{cal} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \sim t(n - 1).$$
4. Decision:

• For a two-tailed test, if |Tcal | ≥ Tα/2 , H0 will be rejected.

• For a right tailed test, if Tcal ≥ Tα , H0 will be rejected.
• For a left tailed test, if Tcal ≤ −Tα , H0 will be rejected.

5. Conclude.

Examples:


1. Assume that the average annual income for government employees in Ethiopia is
reported by the Ethiopian Statistical Agency Census Bureau to be birr 18750.00.
There was some doubt whether the average yearly income of government employees
in Ethiopia was representative of the national average. A random sample of 100
government employees in Ethiopia was taken and it was found that their average
salary was birr 19240.00 with a standard deviation of birr 2610.00. Can we say
that the average salary of government employees in Ethiopia is representative of the
national average at 5% level of significance?
2. A study by a graduating student reports that the average score of Haramaya
University students in a statistics course is less than 80. To test this claim, a random
sample of 10 students was taken and their scores in the course were recorded as: 65,
70, 80, 85, 60, 90, 80, 75, 85, 90. At the 0.05 level of significance, test the validity of
this claim.
Solutions
1. Given µ0 = 18750, n = 100, x̄ = 19240 and s = 2610.
(a) H0 : µ = 18750
H1 : µ ≠ 18750
(b) α = 0.05 ⇒ Ztab = Zα/2 = Z0.025 = 1.96
(c) $Z_{cal} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{19240 - 18750}{2610/\sqrt{100}} = 1.877$
(d) Since |Zcal| < Ztab, H0 should not be rejected.
(e) Thus, the average salary of government employees in Ethiopia is not significantly
different from the national average at 5% level of significance.
2. Given µ0 = 80, n = 10,
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{10}(65 + 70 + \cdots + 90) = 78$$
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = 106.67 \Rightarrow s = \sqrt{106.67} = 10.33$$

(a) H0 : µ = 80
H1 : µ < 80
(b) α = 0.05 ⇒ ttab = tα (n − 1) = t0.05 (9) = 1.833
(c) $t_{cal} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{78 - 80}{10.33/\sqrt{10}} = -0.612$
(d) Since tcal > −ttab, H0 should not be rejected.
(e) Thus, there is not enough evidence to conclude that the average score of Haramaya
University students in the statistics course is less than 80 at 5% level of significance.
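
The arithmetic in the two solutions can be reproduced with Python's standard library. This is a minimal sketch: the helper name `one_sample_statistic` is illustrative, the normal critical value is computed with `statistics.NormalDist`, and the t critical value 1.833 is simply copied from the t table because the standard library has no t distribution.

```python
import math
import statistics


def one_sample_statistic(xbar, mu0, s, n):
    """(xbar - mu0) / (s / sqrt(n)), used for both the Z and the t test."""
    return (xbar - mu0) / (s / math.sqrt(n))


# Example 1: large sample, two-sided Z test at alpha = 0.05.
z_cal = one_sample_statistic(19240, 18750, 2610, 100)
z_tab = statistics.NormalDist().inv_cdf(1 - 0.05 / 2)  # upper 2.5% point of N(0, 1)
reject_salary = abs(z_cal) >= z_tab

# Example 2: small sample, left-tailed t test at alpha = 0.05.
scores = [65, 70, 80, 85, 60, 90, 80, 75, 85, 90]
xbar = statistics.mean(scores)
s = statistics.stdev(scores)  # divides by n - 1
t_cal = one_sample_statistic(xbar, 80, s, len(scores))
t_tab = 1.833  # t_{0.05}(9), taken from a t table
reject_scores = t_cal <= -t_tab
```

In both examples the rejection flag comes out `False`, so H0 is not rejected.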


8.2.3 Confidence Interval for a Population Mean

The (1 − α)100% confidence interval for the population mean µ is:

• $\left(\bar{x} - t_{\alpha/2}(n - 1)\, s/\sqrt{n},\ \bar{x} + t_{\alpha/2}(n - 1)\, s/\sqrt{n}\right)$ if n is small.

• $\left(\bar{x} - Z_{\alpha/2}\, s/\sqrt{n},\ \bar{x} + Z_{\alpha/2}\, s/\sqrt{n}\right)$ if n is large.

Example: Construct the 95% confidence interval for the population mean of the previous
two examples.
Solutions:
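1. $\left(\bar{x} - Z_{\alpha/2}\, s/\sqrt{n},\ \bar{x} + Z_{\alpha/2}\, s/\sqrt{n}\right) = \left(19240 - 1.96 \times 2610/\sqrt{100},\ 19240 + 1.96 \times 2610/\sqrt{100}\right) = (18728.44, 19751.56)$

2. $\left(\bar{x} - t_{\alpha/2}(n - 1)\, s/\sqrt{n},\ \bar{x} + t_{\alpha/2}(n - 1)\, s/\sqrt{n}\right) = \left(78 - 2.262 \times 10.33/\sqrt{10},\ 78 + 2.262 \times 10.33/\sqrt{10}\right) = (70.61, 85.39)$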
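
Both intervals have the form "point estimate ± critical value × standard error", which the following sketch makes explicit. The function name is an illustrative choice, and the critical values 1.96 and 2.262 are the table values Z0.025 and t0.025 (9).

```python
import math


def mean_confidence_interval(xbar, s, n, critical):
    """Return (lower, upper) = xbar -/+ critical * s / sqrt(n)."""
    margin = critical * s / math.sqrt(n)
    return (xbar - margin, xbar + margin)


ci_salary = mean_confidence_interval(19240, 2610, 100, critical=1.96)
ci_scores = mean_confidence_interval(78, 10.33, 10, critical=2.262)
```

Note that the hypothesized values 18750 and 80 both fall inside their respective 95% intervals.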

Chapter 9

Inference for Two or More Populations

The inferences we have made so far have concerned a parameter from a single population.
There may be situations that need comparison of parameters from different populations.

Comparative studies are designed to discover and evaluate the difference between effects
rather than the effects themselves. In such studies, we must perform an experiment, collect
informative data and then reach a decision based on the results. In general discussion,
the statistical term 'treatment' is used to refer to the techniques that will be compared. In
performing an experiment, the basic units exposed to one or another treatment are called
experimental units (subjects). The characteristic recorded after the treatment is applied
to the units is called a response. The manner in which subjects are chosen and assigned to
treatments is called the experimental design.

9.1 Comparison of the Population Mean in Two groups

In studies involving the comparison of two groups, there are two ways of taking a sample:
paired sample and independent sample.

9.1.1 Paired Sample

Here pairs of similar (ideally identical) experimental units are selected and a treatment is
applied to one member of each pair. Then the response of interest is recorded for both
members of each pair. A common application occurs when the response is measured on the same
units on two different occasions. This is appropriate for pre-post treatment responses.

For two paired variables X1 and X2, the differences di = x1i − x2i,
i = 1, 2, · · · , n, are treated as if they were a single sample. The null hypothesis is that
the true mean difference of the two variables is µd = µ1 − µ2 = D0. The difference D0 is
typically assumed to be zero unless explicitly specified.

The steps to be followed are similar to those we have seen in the one sample case.
1. The null and alternative hypotheses to be tested are:
H0 : µd = 0
H1 : µd ≠ 0 or µd < 0 or µd > 0

2. Choose a level of significance (α)

3. The test statistic is:
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} \sim t(n - 1)$$
where $\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i$ is the sample mean of the differences,
$s_d^2 = \frac{1}{n-1}\sum_{i=1}^{n}(d_i - \bar{d})^2$ is
the sample variance of the differences and n is the sample size.
4. Decision:
• For a two sided test, H0 is rejected if |t| > tα/2 (n − 1).
• For a one sided test, H0 is rejected if t > tα (n − 1) (right tailed) or t < −tα (n − 1) (left tailed).
5. Conclude.

Also, the (1 − α)100% confidence interval for µd is $\bar{d} \pm t_{\alpha/2}(n - 1)\,\frac{s_d}{\sqrt{n}}$.
Example: A medical researcher wishes to determine whether a pill reduces the
blood pressure of individuals. The study involves recording the initial blood pressure of 15
women. After they took the pill for six months, their blood pressures were recorded again.
The data are:
Women 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Before 70 80 72 76 76 76 72 78 82 64 74 92 74 68 84
After 68 72 62 70 58 66 68 52 64 72 74 60 74 72 74
Do the data substantiate the claim that the pill reduced blood pressure? Also construct
the 95% confidence interval for the mean difference of blood pressure.

Solution: Let µd be the population mean of the difference in the blood pressure of women.
The differences of the before-after blood pressures are:
Women 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Before (x1i ) 70 80 72 76 76 76 72 78 82 64 74 92 74 68 84
After (x2i ) 68 72 62 70 58 66 68 52 64 72 74 60 74 72 74
di = x1i − x2i 2 8 10 6 18 10 4 26 18 -8 0 32 0 -4 10

The sample mean of the differences is
$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i = \frac{1}{15}(2 + 8 + \cdots + 10) = \frac{1}{15}(132) = 8.8.$$
The variance of the differences is
$$s_d^2 = \frac{1}{n-1}\sum_{i=1}^{n}(d_i - \bar{d})^2 = \frac{1}{15-1}\left\{(2 - 8.8)^2 + (8 - 8.8)^2 + \cdots + (10 - 8.8)^2\right\} = \frac{1}{14}(1686.4) = 120.457,$$
which implies the standard deviation $s_d = 10.98$.
1. Thus, the null and alternative hypotheses to be tested are:
H0 : µd = 0
H1 : µd > 0

2. The level of significance is α = 0.05. The critical value is tα (n − 1) = t0.05 (15 − 1) =
t0.05 (14) = 1.76.
3. The test statistic is:
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{8.8 - 0}{10.98/\sqrt{15}} = 3.10$$
4. Decision: Since t = 3.10 > t0.05 (14) = 1.76, H0 should be rejected.
5. Conclusion: The pill reduced the blood pressure of the women.
The 95% confidence interval is:
$$\bar{d} \pm t_{\alpha/2}(n - 1)\frac{s_d}{\sqrt{n}} = 8.8 \pm 2.145 \times \frac{10.98}{\sqrt{15}} = (2.72, 14.88).$$
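
The paired analysis reduces to a one-sample calculation on the differences, as the sketch below shows using the blood pressure data from the example. The variable names are illustrative, and the critical values 1.761 and 2.145 are t0.05 (14) and t0.025 (14) copied from a t table.

```python
import math
import statistics

before = [70, 80, 72, 76, 76, 76, 72, 78, 82, 64, 74, 92, 74, 68, 84]
after = [68, 72, 62, 70, 58, 66, 68, 52, 64, 72, 74, 60, 74, 72, 74]

d = [b - a for b, a in zip(before, after)]  # before minus after
n = len(d)
d_bar = statistics.mean(d)   # sample mean of the differences
s_d = statistics.stdev(d)    # sample standard deviation (divides by n - 1)
se = s_d / math.sqrt(n)      # standard error of d_bar

t_cal = (d_bar - 0) / se     # H0: mu_d = 0
T_ONE_SIDED = 1.761          # t_{0.05}(14), from a t table
T_TWO_SIDED = 2.145          # t_{0.025}(14), from a t table

reject = t_cal > T_ONE_SIDED  # right-tailed test, H1: mu_d > 0
ci = (d_bar - T_TWO_SIDED * se, d_bar + T_TWO_SIDED * se)
```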

9.1.2 Independent Samples

In many sampling situations, we will select independent random samples from two groups
in order to compare the population means. Let µ1 be the population mean of the first
group and µ2 be the population mean of the second group. First let us consider the small
sample size case.
1. The null and alternative hypotheses to be tested are:
H0 : µ1 = µ2
H1 : µ1 ≠ µ2 or µ1 < µ2 or µ1 > µ2

2. Choose a level of significance (α).

3. Under the assumption of equal variances in the two groups, the test statistic is:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t(n_1 + n_2 - 2)$$
where $\bar{x}_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} x_{1i}$ is the sample mean of the first group and
$\bar{x}_2 = \frac{1}{n_2}\sum_{i=1}^{n_2} x_{2i}$ is the sample mean of the second group,
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ is the pooled variance of both groups
(note that $s_1^2 = \frac{1}{n_1 - 1}\sum_{i=1}^{n_1}(x_{1i} - \bar{x}_1)^2$ is the sample variance
of the first group and $s_2^2 = \frac{1}{n_2 - 1}\sum_{i=1}^{n_2}(x_{2i} - \bar{x}_2)^2$ is the sample
variance of the second group), n1 is the sample size of the first group and n2 is the sample
size of the second group.

4. Decision:

• For a two sided test, H0 is rejected if |t| > tα/2 (n1 + n2 − 2).
• For a one sided test, H0 is rejected if t > tα (n1 + n2 − 2) (right tailed) or t < −tα (n1 + n2 − 2) (left tailed).

5. Conclude.

The (1 − α)100% confidence interval for the difference of the population means is:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}(n_1 + n_2 - 2)\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}.$$

The above test statistic is only used when the two distributions have the same variance.
When the two population variances are assumed to be different and hence must be
estimated separately, the test statistic is modified as:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim t(v)$$
where now the degrees of freedom are
$$v = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{(s_1^2/n_1)^2/(n_1 - 1) + (s_2^2/n_2)^2/(n_2 - 1)}.$$
Similarly, the (1 − α)100% confidence interval for the difference of the population means
when the population variances are different is:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}(v)\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}.$$

Example: Company officials were concerned about the length of time a particular drug
product retained its toxin’s potency. A random sample of 10 bottles of the product was
drawn from the production line and measured for potency. A second sample of 10 bottles
was obtained and stored in a regulated environment for a period of one year. The readings
obtained from each sample are given below.


Sample 1 10.2 10.5 10.3 10.8 9.8 10.6 10.7 10.2 10.0 10.6
Sample 2 9.8 9.6 10.1 10.2 10.1 9.7 9.5 9.6 9.8 9.9
Test the null hypothesis that the drug product retains its potency. Also, construct the
95% confidence interval for the difference of the population means.

Solution: Let µ1 be the mean potency of the product taken from the production line and
µ2 be the mean potency of the drug product that was stored for a year. The summary
statistics of the data are:
$$\bar{x}_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} x_{1i} = \frac{1}{10}(103.7) = 10.37$$
$$\bar{x}_2 = \frac{1}{n_2}\sum_{i=1}^{n_2} x_{2i} = \frac{1}{10}(98.3) = 9.83$$
$$s_1^2 = \frac{1}{n_1 - 1}\left[\sum_{i=1}^{n_1} x_{1i}^2 - \frac{1}{n_1}\Big(\sum_{i=1}^{n_1} x_{1i}\Big)^2\right] = \frac{1}{9}\left[1076.31 - \frac{1}{10}(103.7)^2\right] = 0.105$$
$$s_2^2 = \frac{1}{n_2 - 1}\left[\sum_{i=1}^{n_2} x_{2i}^2 - \frac{1}{n_2}\Big(\sum_{i=1}^{n_2} x_{2i}\Big)^2\right] = \frac{1}{9}\left[966.81 - \frac{1}{10}(98.3)^2\right] = 0.058$$
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(10 - 1)(0.105) + (10 - 1)(0.058)}{10 + 10 - 2} = 0.0815$$
$$s_p = 0.285$$
1. The hypotheses to be tested are:
H0 : µ1 = µ2
H1 : µ1 ≠ µ2

2. The level of significance is α = 0.05. Thus t0.05/2 (10 + 10 − 2) = t0.025 (18) = 2.101.
3. The test statistic is:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{(10.37 - 9.83) - 0}{0.285\sqrt{\frac{1}{10} + \frac{1}{10}}} = 4.24$$
4. Decision: Since |t| > t0.025 (18), H0 is rejected.
5. Conclusion: There is a significant difference in the mean potency of the drug product
from the production line and the drug that was stored for one year.
The 95% confidence interval for the difference of the population means, µ1 − µ2, is:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}(n_1 + n_2 - 2)\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = (10.37 - 9.83) \pm 2.101(0.285)\sqrt{\frac{1}{10} + \frac{1}{10}} = (0.272, 0.808).$$
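
The pooled-variance calculation can be checked directly from the raw potency readings. The following is a minimal sketch: `pooled_t` is a hypothetical helper name, and the critical value 2.101 is t0.025 (18) taken from a t table.

```python
import math
import statistics


def pooled_t(x1, x2):
    """Two-sample t statistic under the equal-variance assumption.

    Returns (t, pooled variance, standard error, degrees of freedom)."""
    n1, n2 = len(x1), len(x2)
    v1, v2 = statistics.variance(x1), statistics.variance(x2)
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (statistics.mean(x1) - statistics.mean(x2)) / se  # H0: mu1 - mu2 = 0
    return t, sp2, se, n1 + n2 - 2


sample1 = [10.2, 10.5, 10.3, 10.8, 9.8, 10.6, 10.7, 10.2, 10.0, 10.6]
sample2 = [9.8, 9.6, 10.1, 10.2, 10.1, 9.7, 9.5, 9.6, 9.8, 9.9]

t_cal, sp2, se, df = pooled_t(sample1, sample2)
T_CRIT = 2.101  # t_{0.025}(18), from a t table
diff = statistics.mean(sample1) - statistics.mean(sample2)
ci = (diff - T_CRIT * se, diff + T_CRIT * se)
reject = abs(t_cal) > T_CRIT
```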


Example: A quick but impressive method of estimating the concentration of a chemical in
a rat has been developed. The sample from this method has 8 observations and the sample
from the standard method has 4 observations. Assuming different population variances,
test whether the quick method underestimates the concentration. The data in the two samples
are:
Standard Method 25 24 25 26
Quick Method 23 18 22 28 17 25 19 16
Solution: Let µ1 = the population mean of the standard method and µ2 = the population
mean of the quick method.
1. The hypotheses to be tested are:

H0 : µ1 = µ2
H1 : µ1 > µ2

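2. The level of significance is α = 0.05. From the data, x̄1 = 25 and s1² = 0.67 for the
standard method (n1 = 4), and x̄2 = 21 and s2² = 17.71 for the quick method (n2 = 8). The
degrees of freedom under the unequal variances assumption are
$$v = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{(s_1^2/n_1)^2/(n_1 - 1) + (s_2^2/n_2)^2/(n_2 - 1)} = \frac{(0.67/4 + 17.71/8)^2}{(0.67/4)^2/(4 - 1) + (17.71/8)^2/(8 - 1)} \approx 8.$$
Thus t0.05 (8) = 1.86.

3. The test statistic is:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(25 - 21) - 0}{\sqrt{\frac{0.67}{4} + \frac{17.71}{8}}} = 2.59$$

4. Decision: H0 is rejected since t > t0.05 (8).

5. Conclusion: The quick method underestimates the concentration.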
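
The unequal-variance statistic and its approximate degrees of freedom can be computed from the raw data as follows. This is a sketch with an illustrative function name (`unequal_variance_t`); the critical value 1.860 is t0.05 (8) from a t table.

```python
import math
import statistics


def unequal_variance_t(x1, x2):
    """t statistic and approximate degrees of freedom v for unequal variances."""
    n1, n2 = len(x1), len(x2)
    a = statistics.variance(x1) / n1  # s1^2 / n1
    b = statistics.variance(x2) / n2  # s2^2 / n2
    t = (statistics.mean(x1) - statistics.mean(x2)) / math.sqrt(a + b)
    v = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, v


standard = [25, 24, 25, 26]
quick = [23, 18, 22, 28, 17, 25, 19, 16]

t_cal, v = unequal_variance_t(standard, quick)
T_CRIT = 1.860  # t_{0.05}(8), from a t table, with v rounded to 8
reject = t_cal > T_CRIT  # right-tailed test, H1: mu1 > mu2
```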

For large samples, the test statistic follows a normal distribution. That is,
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \sim N(0, 1).$$

Also the (1 − α)100% confidence interval is
$$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}.$$

Similarly, if the groups have a common variance, the pooled variance (standard deviation)
can be calculated as shown before in the small sample case.

Example: For a random sample of 120 adult females born in country A, the mean height
was 62.7 inches with standard deviation 2.50 inches. For another random sample of 150
adult females born in country B, the mean height was 61.8 inches with standard deviation
2.62 inches. Would you reject the null hypothesis that there is no difference in height
between adult females born in the two countries at 1% level of significance?

Solution: Let µ1 = the mean height of adult females born in country A and µ2 = the mean
height of adult females born in country B.

1. The hypotheses to be tested are:

H0 : µ1 = µ2
H1 : µ1 ≠ µ2

2. The level of significance is α = 0.01. Thus, Z0.01/2 = Z0.005 = 2.58.

3. The test statistic is:
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(62.7 - 61.8) - 0}{\sqrt{\frac{(2.50)^2}{120} + \frac{(2.62)^2}{150}}} = 2.88$$

4. Decision: H0 is rejected since |Z| > Z0.005.

5. Conclusion: There is a significant difference in the population mean height of adult
females born in the two countries.
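
For the large-sample case only the summary statistics are needed. The sketch below recomputes the Z statistic for the height example; `two_sample_z` is an illustrative name, and the critical value is obtained from `statistics.NormalDist` instead of a printed table.

```python
import math
from statistics import NormalDist


def two_sample_z(xbar1, s1, n1, xbar2, s2, n2):
    """Large-sample Z statistic for H0: mu1 - mu2 = 0."""
    return (xbar1 - xbar2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


z_cal = two_sample_z(62.7, 2.50, 120, 61.8, 2.62, 150)
z_crit = NormalDist().inv_cdf(1 - 0.01 / 2)  # upper 0.5% point of N(0, 1)
reject = abs(z_cal) > z_crit                 # two-sided test at alpha = 0.01
```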

9.2 Analysis of Variance (ANOVA)

The t and Z tests have been used for testing the hypothesis of a single population mean
equal to a specified value or equality of two populations means when the sample size is
small and large respectively. However in testing the equality of more than two population
means, techniques of analysis of variance (ANOVA) will be used. Thus, the analysis of
variance (ANOVA) is used to compare the means of two or more groups based on the
variance ratio test, i.e., an F -test, and relating it to the F -distribution.

There are many types of observational classifications. If the observations are classified
on the basis of a single criterion, the classification is called one-way classification. If the
observations are classified on the basis of two criteria, it is called two-way classification.


Here, one-way ANOVA will be discussed. The principle underlying the one-way ANOVA is
that the total variability in a data set is partitioned into two components: the variability
between groups and the variability within groups. Each component represents a different
source of variation. The between groups variation can be accounted for by the grouping, and
the within group variation is the unexplained (residual) variation resulting from
uncontrolled biological variation and technical error.

Suppose there is one basic variable or criterion of classification with k groups. The null
hypothesis to be tested is that all the k group means are equal and the alternative
hypothesis is that at least one of the group means is different from the others. That
is,
H0 : µ1 = µ2 = · · · = µk
H1 : not H0

To construct the test statistic, the total sum of squares (TSS) is decomposed into the
between sum of squares (BSS) and the within (error) sum of squares (ESS):
$$TSS = BSS + ESS$$
$$\sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij} - \bar{x})^2 = \sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2 + \sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^2$$
Here $x_{ij}$ is the j-th observation in group i, $n_i$ is the size of group i, $\bar{x}_i$ is
the mean of group i, $\bar{x}$ is the overall mean and $n = \sum_{i=1}^{k} n_i$.

The TSS has n − 1 degrees of freedom, the BSS has k − 1 degrees of freedom and the
ESS has n − k degrees of freedom. The ratios of the BSS and ESS to their corresponding
degrees of freedom are called the between mean square (BMS) and the error mean square (EMS),
respectively. The test statistic is the ratio F = BMS/EMS, which follows the F-distribution
with k − 1 and n − k degrees of freedom under H0. The critical value is Fα (k − 1, n − k),
and H0 is rejected if F exceeds it.

The ANOVA table is written as follows.

Source of Variation   SS                                                                   df      MS                  F
Between               $BSS = \sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2$                     k − 1   BMS = BSS/(k − 1)   F = BMS/EMS
Within (Error)        $ESS = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^2$          n − k   EMS = ESS/(n − k)
Total                 $TSS = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij} - \bar{x})^2$            n − 1

Assumptions of the one-way ANOVA

1. The samples are independently and randomly drawn from source population(s).

2. The source population(s) have a reasonably normal distribution.

3. The samples have approximately equal variances.

If the samples are of equal size, there is little cause for worry about these assumptions
because ANOVA is quite robust (relatively unperturbed by violations of its assumptions). But
if the samples are of different sizes and the assumptions are seriously violated, an
appropriate non-parametric alternative to one-way ANOVA, called the Kruskal-Wallis test,
should be used.

Example: Suppose a university wishes to compare the effectiveness of four teaching
methods (Slide, Self Study, Lecture, Discussion) for a particular course. Twenty four students
are randomly assigned to the teaching methods, with 5, 6, 6 and 7 students per method,
respectively. At the end of teaching the students with their assigned method, a test (out of
20%) was given and the performance of the students was recorded as follows:
Slide   Self Study   Lecture   Discussion
9       10           12        9
12      6            14        8
14      6            11        11
11      9            13        7
13      10           11        8
        5            16        6
                               7
Construct the ANOVA table. Also test the hypothesis that there is no difference among
the four teaching methods.

Solution: The summary statistics are calculated as:

              Slide (x1j)    Self Study (x2j)   Lecture (x3j)   Discussion (x4j)
              9              10                 12              9
              12             6                  14              8
              14             6                  11              11
              11             9                  13              7
              13             10                 11              8
                             5                  16              6
                                                                7
Sample Size   n1 = 5         n2 = 6             n3 = 6          n4 = 7
Sum           Σ x1j = 59     Σ x2j = 46         Σ x3j = 77      Σ x4j = 56
Mean          x̄1 = 11.800    x̄2 = 7.667         x̄3 = 12.833     x̄4 = 8.000

Also, the overall summaries are: $n = \sum_{i=1}^{4} n_i = 24$ and
$\sum_{i=1}^{4}\sum_{j=1}^{n_i} x_{ij} = 59 + 46 + 77 + 56 = 238$. Thus,
$\bar{x} = \frac{1}{n}\sum_{i=1}^{4}\sum_{j=1}^{n_i} x_{ij} = \frac{1}{24}(238) = 9.917$.

The between sum of squares is
$BSS = 5(11.800 - 9.917)^2 + 6(7.667 - 9.917)^2 + 6(12.833 - 9.917)^2 + 7(8.000 - 9.917)^2 = 124.87$.

S.V.      SS             df            MS                         F
Between   BSS = 124.87   4 − 1 = 3     BMS = 124.87/3 = 41.6233   F = 41.6233/3.7485 = 11.10
Within    ESS = 74.97    24 − 4 = 20   EMS = 74.97/20 = 3.7485
Total     TSS = 199.83   24 − 1 = 23

The calculated F value is then compared with the critical value F0.05 (3, 20) = 3.10.
Since the calculated F value exceeds this critical value, H0 should be rejected. This means
that there is a difference among the teaching methods.
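
The entries of the ANOVA table can be recomputed directly from the raw scores, which is a useful check on hand calculations. The following is a minimal sketch (the function name `one_way_anova` and the dictionary keys are illustrative); it follows the TSS = BSS + ESS decomposition defined above and does not compute the critical value, which still has to be read from an F table.

```python
import statistics

groups = {
    "Slide": [9, 12, 14, 11, 13],
    "Self Study": [10, 6, 6, 9, 10, 5],
    "Lecture": [12, 14, 11, 13, 11, 16],
    "Discussion": [9, 8, 11, 7, 8, 6, 7],
}


def one_way_anova(groups):
    """Return the sums of squares, mean squares and F ratio for one-way ANOVA."""
    all_values = [x for g in groups.values() for x in g]
    n, k = len(all_values), len(groups)
    grand_mean = statistics.fmean(all_values)
    # Between groups: group size times squared distance of group mean from grand mean.
    bss = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups.values())
    # Within groups: squared distance of each observation from its own group mean.
    ess = sum((x - statistics.fmean(g)) ** 2 for g in groups.values() for x in g)
    bms, ems = bss / (k - 1), ess / (n - k)
    return {
        "grand_mean": grand_mean,
        "BSS": bss, "ESS": ess, "TSS": bss + ess,
        "df": (k - 1, n - k),
        "BMS": bms, "EMS": ems, "F": bms / ems,
    }


table = one_way_anova(groups)
```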

Mean Separation
In the ANOVA, if the null hypothesis is rejected, then there is a need to identify which
pairs of group means differ significantly and which do not. There are several methods of mean
separation; of these, Fisher's Least Significant Difference (LSD) test is considered here.
In this method, first sort the group means in order so that two means can be compared at a
time. For comparing µi and µj, compute
$$LSD_{ij} = t_{\alpha/2}(n - k)\sqrt{EMS\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}.$$

Then, if |x̄i − x̄j| > LSDij, there is a significant difference between µi and µj. Otherwise,
no significant difference is observed.

Example: Recall the previous example and identify the significant pair of teaching method
means using LSD.

Solution:
Lecture: x̄3 = 12.833, Slide: x̄1 = 11.800, Discussion: x̄4 = 8.000, Self Study: x̄2 = 7.667
n1 = 5, n2 = 6, n3 = 6, n4 = 7, EMS = 3.7485 and, for α = 0.05, tα/2 (n − k) = t0.025 (20) = 2.086.

Methods               x̄i − x̄j                   LSD                                  Sign.
Lecture vs Slide      12.833 − 11.800 = 1.033   2.086 √(3.7485(1/6 + 1/5)) = 2.446   not sig.
Slide vs Discussion   11.800 − 8.000 = 3.800    2.086 √(3.7485(1/5 + 1/7)) = 2.365   sig.
Discussion vs Self    8.000 − 7.667 = 0.333     2.086 √(3.7485(1/7 + 1/6)) = 2.247   not sig.

Therefore, the lecture and slide teaching methods are better than the other two.
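
The pairwise comparisons can be automated once the group means, group sizes and EMS are known. The sketch below compares neighbouring means in the sorted list; the names are illustrative, and the critical value 2.086 is t0.025 (20) from a t table, which assumes α = 0.05.

```python
import math


def lsd(t_crit, ems, n_i, n_j):
    """Fisher's least significant difference for comparing groups i and j."""
    return t_crit * math.sqrt(ems * (1 / n_i + 1 / n_j))


means = {"Lecture": 12.833, "Slide": 11.800, "Discussion": 8.000, "Self Study": 7.667}
sizes = {"Lecture": 6, "Slide": 5, "Discussion": 7, "Self Study": 6}
EMS = 3.7485    # error mean square from the ANOVA table
T_CRIT = 2.086  # t_{0.025}(20) from a t table (assumes alpha = 0.05)

ordered = sorted(means, key=means.get, reverse=True)  # largest mean first
results = []
for a, b in zip(ordered, ordered[1:]):  # compare neighbours in the sorted list
    diff = abs(means[a] - means[b])
    threshold = lsd(T_CRIT, EMS, sizes[a], sizes[b])
    results.append((a, b, round(diff, 3), round(threshold, 3), diff > threshold))
```

Each tuple in `results` holds the two methods, their absolute mean difference, the LSD value and whether the difference is significant.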

Chapter 10

Simple Linear Regression and Correlation

In the previous chapters we have been dealing with a single variable. In this chapter we
will deal with bivariate data, i.e., data involving two variables.

10.1 Correlation
Correlation is a statistical tool designed to measure the degree of the relationship
(degree of association) between variables. If a change in one variable is accompanied by a
change in the other variable, then the variables are correlated.

Correlation that involves only two variables is called simple correlation. The simplest way
to present bivariate data is to plot them on the XY plane. For a bivariate distribution (X, Y),
the values (Xi, Yi), i = 1, 2, · · · , N are plotted in the XY plane. This is known as a scatter
plot. It gives an idea about the correlation of the two variables. However, it gives only a
vague idea about the presence or absence of correlation and the nature (direct or indirect)
of the correlation. It does not indicate the strength or degree of the relationship between
the two variables.

10.1.1 Covariance
Covariance is a measure of the joint variation between two variables, i.e., it
measures the way in which the values of the two variables vary together.

Recall that the sample and population variances of a certain variable X are calculated,
respectively, as:
$$S_x^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})(X_i - \bar{X}) = S_{xx}$$
and
$$\sigma_x^2 = \frac{1}{N}\sum_{i=1}^{N}(X_i - \mu_x)^2 = \frac{1}{N}\sum_{i=1}^{N}(X_i - \mu_x)(X_i - \mu_x) = \sigma_{xx}$$
where µx is the population mean of X. Similarly, the sample and population covariances
between two variables X and Y are defined, respectively, as:
$$S_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y}) = \frac{1}{n-1}\left[\sum_{i=1}^{n} X_iY_i - \frac{\sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i}{n}\right]$$
and
$$\sigma_{xy} = \frac{1}{N}\sum_{i=1}^{N}(X_i - \mu_x)(Y_i - \mu_y) = \frac{1}{N}\left[\sum_{i=1}^{N} X_iY_i - \frac{\sum_{i=1}^{N} X_i \sum_{i=1}^{N} Y_i}{N}\right].$$


If the covariance is zero, there is no linear relationship between the two variables. If it is
negative, there is an indirect linear relationship between them. If the covariance is positive,
there is a direct linear relationship between the variables.


10.1.2 Pearson’s Correlation Coefficient

The coefficient of correlation, which was developed by Karl Pearson, is a measure of the
degree or strength of the linear association between two variables. It is defined as a ratio
of the covariance between the two variables and the product of the standard deviations of
the two variables. The sample correlation coefficient is denoted by r and the population
correlation coefficient is denoted by the Greek letter ρ, rho.
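$$r = \frac{S_{xy}}{S_x S_y} = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}}$$

This can be written as:

$$r = \frac{n\sum_{i=1}^{n} X_iY_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i}{\sqrt{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}\,\sqrt{n\sum_{i=1}^{n} Y_i^2 - \left(\sum_{i=1}^{n} Y_i\right)^2}}$$
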

Interpretations of r: The value of the correlation coefficient can be positive or negative,
depending on the sign of the covariance between the two variables. It always lies between
the limits -1 and +1; that is, −1 ≤ r ≤ 1.

• If the value of r is -1 or +1, there is a perfect negative/inverse/indirect or
positive/direct linear relationship between the variables, respectively.

• If the value of r is close to -1 or +1, there is a strong negative/inverse/indirect
or positive/direct linear relationship between the variables, respectively.

• If the value of r is approximately -0.5 or +0.5, there is a medium negative/inverse/indirect
or positive/direct linear relationship between the variables, respectively.

• If the value of r is near zero, there is little or no linear association between the two
variables.

Limitations of r:

1. If X and Y are statistically independent, the correlation coefficient between them
is zero; but the converse is not always true. In other words, zero correlation does
not necessarily imply independence. The coefficient is a measure of linear association or
linear dependence only; it has no meaning for describing nonlinear relations. For
example, Y = X² is an exact relationship, yet r is zero whenever the values of X are
symmetric about zero. (Why?)

2. Although r is a measure of the linear association between variables, it does not
necessarily imply any cause-and-effect relationship.

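
The first limitation can be seen numerically. The sketch below is only an illustration: the function name `pearson_r` and the X values (chosen to be symmetric about zero) are assumptions, not taken from the text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation from sums of squares and cross-products."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical X values, symmetric about zero; Y depends on X exactly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

print(pearson_r(xs, ys))  # zero: no linear association despite exact dependence
```

The positive and negative deviations of X cancel in the cross-product sum, so the covariance (and hence r) vanishes even though Y is completely determined by X.
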
Example: A researcher wants to find out if there is a relationship between the heights of
sons and the heights of their fathers. In other words, do taller fathers have taller sons?
The researcher took a random sample of 6 fathers and their 6 sons. Their height in inches
is given below in an ordered array.
Father (X) 63 65 66 67 67 68
Son (Y ) 66 68 65 67 69 70
Find the covariance and correlation coefficient and interpret.

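Solution:
No.      X     Y      X²      Y²      XY
1       63    66    3969    4356    4158
2       65    68    4225    4624    4420
3       66    65    4356    4225    4290
4       67    67    4489    4489    4489
5       67    69    4489    4761    4623
6       68    70    4624    4900    4760
Total  396   405   26152   27355   26740

That is, ΣXi = 396, ΣYi = 405, ΣXi² = 26152, ΣYi² = 27355 and ΣXiYi = 26740, with n = 6.

$$S_{xy} = \frac{1}{5}\left[26740 - \frac{396 \times 405}{6}\right] = \frac{10}{5} = 2$$

$$r = \frac{6(26740) - (396)(405)}{\sqrt{6(26152) - 396^2}\,\sqrt{6(27355) - 405^2}} = \frac{60}{\sqrt{96}\,\sqrt{105}}$$

⇒ r = 0.597

The covariance is positive and r is moderately large, so there is a medium positive (direct)
linear relationship: taller fathers tend to have taller sons.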
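
The arithmetic for this example can be reproduced with a few lines of Python. This is a minimal sketch that applies the computational formulas to the six father and son heights; the variable names are illustrative.

```python
from math import sqrt

fathers = [63, 65, 66, 67, 67, 68]  # X, heights in inches
sons = [66, 68, 65, 67, 69, 70]     # Y, heights in inches
n = len(fathers)

# Column totals, as in the solution table.
sum_x = sum(fathers)
sum_y = sum(sons)
sum_x2 = sum(x * x for x in fathers)
sum_y2 = sum(y * y for y in sons)
sum_xy = sum(x * y for x, y in zip(fathers, sons))

# Sample covariance (computational form).
s_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)

# Pearson correlation (computational form).
r = (n * sum_xy - sum_x * sum_y) / (
    sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
)

print(s_xy)         # 2.0
print(round(r, 4))  # 0.5976
```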

10.1.3 Spearman’s Rank Correlation

It is not always possible to take measurements on units or objects. Many characteristics,
such as beauty, smartness and temperament, are expressed in comparative terms. In such cases
the units are ranked with respect to that characteristic instead of being measured.
Sometimes the units are also ranked according to a quantitative measure.
In this type of study, two situations arise: (i) the same set of units is ranked according
to two characteristics, or (ii) two judges independently rank the same set of units on
one characteristic. In both situations we get paired ranks for a set of units. For
example, students may be ranked according to their marks in Mathematics and Statistics,
or two judges may independently rank the contestants in a beauty competition. In all these
situations, a correlation coefficient based on ranks is used in place of the usual Pearson's
correlation coefficient.

Suppose that a group of n individuals is given grades or ranks with respect to two
characteristics. Let (Xi , Yi ), i = 1, 2, · · · , n be the ranks of the ith individual on the two
characteristics. Then, the Spearman's rank correlation coefficient is given by:

$$r_s = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2-1)}, \quad \text{where } d_i = R_{X_i} - R_{Y_i}$$

is the difference between the two ranks of the ith individual.


This formula is used when no rank is repeated. When ranks are repeated (tied), each tied
item is given the average of the ranks the tied items would otherwise occupy, and a
correction factor is required. With the correction applied, ties create no problem.

If m is the number of times an item is repeated, then the factor $\frac{m(m^2-1)}{12}$ is added to
$\sum_{i=1}^{n} d_i^2$. For each repeated value, this correction factor is to be added.

$$r_s = 1 - \frac{6\left(\sum_{i=1}^{n} d_i^2 + \sum CF\right)}{n(n^2-1)}$$

Note that −1 ≤ rs ≤ 1.

Example: The ranks of some 10 students in two courses Statistics and Economics are
given below. Calculate the rank correlation and interpret.
Statistics 5 2 9 8 1 10 3 4 6 7
Economics 10 5 1 3 8 6 2 7 9 4
Ans: rs = −0.31
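
As a check on this answer, the formula without ties can be sketched in Python. The function name `spearman_from_ranks` and the variable names are illustrative assumptions; the ranks are those given in the example.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's rank correlation for paired ranks with no ties."""
    n = len(rank_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


stat_ranks = [5, 2, 9, 8, 1, 10, 3, 4, 6, 7]
econ_ranks = [10, 5, 1, 3, 8, 6, 2, 7, 9, 4]

print(round(spearman_from_ranks(stat_ranks, econ_ranks), 2))  # -0.31
```

Here the sum of squared rank differences is 216, so the coefficient is 1 − 6(216)/990, a weak negative association between the two rankings.
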
Example: Obtain the rank correlation for the following data.
X 85 74 85 50 65 78 74 60 74 90
Y 78 91 78 58 60 72 80 55 68 70
Ans: Σd² = 72 and ΣCF = 0.5 + 2 + 0.5 = 3, so rs = 1 − 6(72 + 3)/990 = 0.545
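
The tied case can be worked through in code as well. The sketch below assumes that the largest value receives rank 1 and that tied values share the average of the ranks they occupy; the helper names are illustrative, not from the text. Under these assumptions the coefficient for the data above comes out positive, about 0.545.

```python
from collections import Counter


def average_ranks(values):
    """Rank from largest (rank 1) to smallest, averaging the ranks of ties."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for position, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every value that is repeated m times."""
    return sum(m * (m ** 2 - 1) / 12 for m in Counter(values).values() if m > 1)


def spearman_with_ties(xs, ys):
    """Spearman's coefficient with the correction factor for repeated ranks."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    correction = tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))


xs = [85, 74, 85, 50, 65, 78, 74, 60, 74, 90]
ys = [78, 91, 78, 58, 60, 72, 80, 55, 68, 70]

print(round(spearman_with_ties(xs, ys), 3))  # 0.545
```

In X the value 85 occurs twice and 74 three times, and in Y the value 78 occurs twice, giving correction factors 0.5, 2 and 0.5.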

10.2 Simple Linear Regression

Regression may be defined as the estimation of the unknown value of one variable from
the known values of one or more other variables. The variable whose values are to be estimated
is known as the dependent variable, while the variables which are used in determining the value
of the dependent variable are called independent variables.

The regression study that involves only two variables is called simple regression, and the
regression analysis that studies more than two variables is called multiple regression. If
the relationship between the two variables can be described by a straight line, then the
regression is known as linear regression; otherwise it is called non-linear.

The regression analysis involving only two variables and having a linear relationship is
called simple linear regression. This linear relationship between the two variables is repre-
sented by a straight line.


A regression line is a line that gives the best estimate of one variable for any given value
of another variable. The regression line which is used to estimate the values of Y for any
given value of X is called the regression line of Y on X.

Model: Yi = β0 + β1 Xi + εi ; i = 1, 2, · · · , n
where
Yi is the ith actual value of the dependent variable.

Xi is the ith actual value of the independent variable.

β0 is the intercept.

β1 is the slope.

εi is the ith value of the error term, with εi ∼ N (0, σ²).

This model is called the Population Regression Model. Its parameters are interpreted as
follows:
• β0 is the value of the dependent variable when the value of the independent variable
is zero.

• β1 is the change in the value of the dependent variable when the value of the
independent variable increases by 1 unit. The sign of β1 is the same as that of the
covariance and the correlation coefficient. That is, there is a direct linear relationship
between the two variables if β1 is positive, there is an indirect linear relationship
between the two variables if β1 is negative, and there is no linear relationship between
the two variables if β1 is zero.

10.2.1 Method of Estimation

The objective in the above model is to estimate the regression parameters, β0 and β1 using
sample data. The most common and widely used method of estimation is called Ordinary
Least Squares (OLS) which minimizes the error sum of squares. The estimated regression
model is, therefore,

Ŷi = β̂0 + β̂1 Xi ; i = 1, 2, · · · , n

where
Ŷi is the ith fitted/estimated value of the dependent variable.

Xi is the ith actual value of the independent variable.

β̂0 is the estimated intercept.

β̂1 is the estimated slope.


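The estimates of the parameters can be obtained as:

$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(X_i-\bar{X})^2} = \frac{n\sum_{i=1}^{n}X_iY_i - \sum_{i=1}^{n}X_i\sum_{i=1}^{n}Y_i}{n\sum_{i=1}^{n}X_i^2 - \left(\sum_{i=1}^{n}X_i\right)^2}$$

and

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X}$$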
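Example: Recall the previous example on heights of sons and heights of fathers.
a. Estimate the regression model of height of sons on height of fathers.
b. Interpret the estimated parameters.
c. What would be the predicted height of the son if the father's height is 70 inches?
Solutions:
a. β̂1 = 60/96 = 0.625 and β̂0 = 67.5 − 0.625 × 66 = 26.25, so the estimated model is
Ŷi = 26.25 + 0.625Xi .
b. For each additional inch in the father's height, the predicted height of the son
increases by 0.625 inches. The intercept, 26.25 inches, is the fitted value at X = 0, which
lies far outside the observed heights and has no practical meaning here.
c. Ŷ = 26.25 + 0.625 × 70 = 70 inches.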
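
These estimates can be reproduced with a short least-squares sketch in Python. The function name `fit_simple_ols` is an illustrative assumption; the data are the father and son heights from the earlier example.

```python
def fit_simple_ols(xs, ys):
    """Ordinary least squares estimates (intercept, slope) for Y on X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-products and squares; the 1/(n - 1) factors in
    # S_xy and S_xx cancel in the ratio, so they are omitted here.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


fathers = [63, 65, 66, 67, 67, 68]
sons = [66, 68, 65, 67, 69, 70]

b0, b1 = fit_simple_ols(fathers, sons)
print(b0, b1)        # 26.25 0.625
print(b0 + b1 * 70)  # 70.0, predicted son's height for a 70-inch father
```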

10.2.2 The Coefficient of Determination

So far, we have been concerned with estimating the parameters of the regression
model and the correlation coefficient between two variables. We now consider the goodness
of fit of the estimated model to a set of data; that is, we shall find out how "well" the
estimated model fits the data.

The coefficient of determination tells how well the estimated model fits the data. For
simple linear regression (the two-variable case), it is defined as the square of the sample
correlation coefficient, and is denoted by r². Hence r² measures the proportion or percentage of
the variation in the dependent variable explained by the independent variable. Variation
means the sum of the squares of the deviations of a variable from its mean value.
Generally, r² is a nonnegative quantity which lies between 0 and 1, i.e., 0 ≤ r² ≤ 1. A value
close to 1 indicates a good fit, while a value close to 0 indicates little or no linear
relationship between the variables.

If we consider the example on heights of sons and their fathers, we had r = 0.597 which
implies r2 = 0.357. This means 35.7% of the variation in the heights of sons is explained
by the heights of the fathers.
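
The "proportion of variation explained" reading can be verified directly. The sketch below relies on the standard identity r² = 1 − SSE/SST for a simple linear regression with an intercept, which the text does not state explicitly; it uses the estimates β̂0 = 26.25 and β̂1 = 0.625 from the heights example, and the variable names are illustrative.

```python
fathers = [63, 65, 66, 67, 67, 68]
sons = [66, 68, 65, 67, 69, 70]

b0, b1 = 26.25, 0.625  # estimated intercept and slope from the heights example
y_bar = sum(sons) / len(sons)

fitted = [b0 + b1 * x for x in fathers]

# Unexplained variation: squared residuals about the fitted line.
sse = sum((y - f) ** 2 for y, f in zip(sons, fitted))
# Total variation: squared deviations of Y about its mean.
sst = sum((y - y_bar) ** 2 for y in sons)

r_squared = 1 - sse / sst
print(sse, sst)             # 11.25 17.5
print(round(r_squared, 3))  # 0.357
```

Of the total variation of 17.5 in the sons' heights, 11.25 is left unexplained by the fitted line, so roughly 35.7% is explained, in agreement with r² above.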

Example: A study reported in a medical journal suggested that the peak heart rate
an individual can reach during intensive exercise decreases with age. A cardiologist
wants to do his own study. The next 9 patients were given a stress test on the treadmill
at 6 miles per hour, and their ages and heart rates were recorded as follows:


Age          30   30   40   20   20   45   30   45   50
Heart Rate  190  180  180  200  195  170  185  175  165

a. Identify the dependent and independent variable.

b. Estimate the regression model.

c. Can we predict the peak heart rate of an 80 year old man who is given a similar
stress test? If so, what peak heart rate do you predict.

d. Calculate the coefficient of correlation and coefficient of determination, and interpret
the results.

Solutions:
No.      X      Y      X²       Y²      XY
1       30    190     900    36100    5700
2       30    180     900    32400    5400
3       40    180    1600    32400    7200
4       20    200     400    40000    4000
5       20    195     400    38025    3900
6       45    170    2025    28900    7650
7       30    185     900    34225    5550
8       45    175    2025    30625    7875
9       50    165    2500    27225    8250
Total  310   1640   11650   299900   55525

That is, ΣX = 310, ΣY = 1640, ΣX² = 11650, ΣY² = 299900 and ΣXY = 55525.

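a. Dependent variable: heart rate (Y ). Independent variable: age (X).

b. Ŷi = 216.37 − 0.99Xi ; i = 1, 2, · · · , 9

c. Ŷ = 216.37 − 0.99 × 80 = 137.17. Note that 80 years lies well outside the observed
ages (20 to 50), so this prediction is an extrapolation and should be treated with caution.

d. r = −0.95 and r² = 0.90. There is a strong negative linear relationship between age and
peak heart rate, and about 90% of the variation in heart rate is explained by age.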
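
The same results can be obtained from the column totals alone, as is done by hand. This is a minimal Python sketch using the totals from the solution table; the variable names are illustrative.

```python
from math import sqrt

# Column totals from the solution table (n = 9 patients).
n = 9
sum_x, sum_y = 310, 1640
sum_x2, sum_y2, sum_xy = 11650, 299900, 55525

# Building blocks shared by the slope and the correlation coefficient.
num = n * sum_xy - sum_x * sum_y    # n*ΣXY − ΣX*ΣY = −8675
den_x = n * sum_x2 - sum_x ** 2     # n*ΣX² − (ΣX)² = 8750
den_y = n * sum_y2 - sum_y ** 2     # n*ΣY² − (ΣY)² = 9500

slope = num / den_x
intercept = sum_y / n - slope * sum_x / n
r = num / sqrt(den_x * den_y)

print(round(intercept, 2), round(slope, 2))  # 216.37 -0.99
print(round(r, 2), round(r ** 2, 3))         # -0.95 0.905
```

Without intermediate rounding r² comes out as about 0.905, which agrees with the reported 0.90 up to rounding.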

THE END!!!
