Standard Deviation and Standard Error
Many students confuse the standard deviation and standard error of the mean and
are unsure which, if either, to use in presenting data. In this article, we endeavour to
address these questions and cover some related ambiguities about these quantities.
1. Introduction
When reporting data in research papers, authors often use descriptive statistics to describe the basic
features of the data sets. Descriptive statistics provide simple summaries about the sample. In general,
descriptive statistics include the statistical measures of central tendency and dispersion. The standard
deviation (SD) is one of the common measures of dispersion; the SD is an index of variability of the
original data points and is reported in most studies (Streiner, 1996). The precise term 'standard deviation' was first used in writing by Karl Pearson in 1894, following his usage in lectures on the topic (Dodge, 2003).
The SD of the sampling distribution associated with the estimation method is the standard error of
estimation (Everitt, 2003). The standard error of the mean (SEM) is the SD of the sample mean
estimate of a population mean. It can also be viewed as the SD of the error in the sample mean
relative to the true mean, since the sample mean is an unbiased estimator of the population mean. The
SE reflects the variability of the mean values, as if the study were repeated a large number of times.
A question often posed by beginners in statistics is: 'when should we report the standard deviation and when should we report the standard error?' Students easily confuse the two and ask how the SD and SE are related and what the difference between them is. Does the SD describe the true spread of the measurements around the mean? Is the standard error just an estimate of this?
Moreover, authors in their research reports often use SE to describe the variability of their
sample (Nagele, 2003), which may seem to indicate that this problem is not simply confined
to beginners and students in statistics. For example, Nagele (2003) showed that even in some
leading journals, there are a significant number of published articles that misuse the SE in descrip-
tive statistics, which may be misinterpreted as showing the variability within the study sample.
In this article, we endeavour to address these questions and cover some related ambiguities about
SD and SE.
ß The Author 2010. Published by Oxford University Press on behalf of The Institute of Mathematics and its Applications.
All rights reserved. For permissions, please email: journals.permissions@oxfordjournals.org
H. HASSANI ET AL. 109
Section 2 discusses the SD, its construction, the structure of its formula and related features. Appropriate examples are included for further clarification. The standard error is considered in Section 3, along with the relation between the SE and the SD and the use of the SE in confidence intervals. Finally, Section 4 summarizes the comparison between these two quantities.
2. SD

A seemingly natural way to measure the dispersion of a sample is to sum the deviations of each observation from the mean. Logical as this may seem, it has two drawbacks. The first difficulty is that the answer will be zero, not just in this situation but in every case. By definition, the sum of the values above the mean is always equal to the sum of the values below it, and thus they cancel each other out. Let us first consider
the following example. Assume the observations 10, 12, 20, 8 and 15 are our sample. The mean of this sample is $\bar{X} = 13$; however, the sum in question is
$$\sum_{i=1}^{N} (X_i - \bar{X}) = (10 - 13) + (12 - 13) + (20 - 13) + (8 - 13) + (15 - 13) = 0.$$
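This zero-sum property is easy to check numerically. A minimal sketch in Python, using the five observations above:

```python
# Deviations from the mean always sum to zero; the sample values
# are taken from the example in the text.
data = [10, 12, 20, 8, 15]
mean = sum(data) / len(data)           # 13.0
deviations = [x - mean for x in data]  # -3, -1, 7, -5, 2

print(mean)             # 13.0
print(sum(deviations))  # 0.0
```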
We can get around this problem by taking the absolute value of each difference (i.e. ignoring the sign of the differences whenever they are negative), but for a number of reasons statisticians prefer not to work with absolute values. Another way to eliminate negative values is to square them, since the square of any number, negative or positive, is always positive. So, what we now have is
$$\sum_{i=1}^{N} (X_i - \bar{X})^2. \qquad (1)$$
The second problem is that the result of this equation will increase as we successively add more
objects; this is of little use particularly for large data sets. Let us imagine that we have a sample of 50
values, for which the result of (1) is 20. If we now add another 50 subjects who look exactly the same,
it makes intuitive sense that the dispersion of these 100 points should stay the same. Yet, the formula
as it now reads can result only in a larger sum as we add more data points. So, we can compensate
for this by dividing by the number of subjects, N, so that the equation now reads
$$\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N}. \qquad (2)$$
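A minimal numerical sketch of this point, using the earlier five observations: duplicating the sample doubles the raw sum of squares (1), while the averaged quantity (2) is unchanged.

```python
# Duplicating a sample should not change its dispersion: the raw sum
# of squared deviations doubles, but the mean squared deviation stays
# the same once we divide by the number of points N.
def sum_sq(data):
    m = sum(data) / len(data)
    return sum((x - m) ** 2 for x in data)

def mean_sq(data):
    return sum_sq(data) / len(data)

sample = [10, 12, 20, 8, 15]
doubled = sample * 2  # the same values, twice as many points

print(sum_sq(sample), sum_sq(doubled))    # the raw sum doubles
print(mean_sq(sample), mean_sq(doubled))  # the average is identical
```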
Suppose we are interested in the average height of all men in a population. To find out the true mean,
we would have to observe (i.e. measure) the height of each man in the population. Of course, in studies
we are unable to observe everyone in a population, because of the large size or the difficulty and
expense of reaching every member. Therefore, we use samples that can give us reasonable estimates.
110 A NOTE ON SD AND SE
Assume we found that the mean of a sample is $\bar{X} = 69.7$ cm and its dispersion according to (2) is 9 cm². This would seem to engender another set of questions: why is the dispersion measured in square centimetres, and what can we do to interpret this number? Fortunately, these issues are easily explained: the units naturally become cm², since each length $(X_i - \bar{X})$ is squared, hence we need to take the square root of the final result given by (2) to return to the correct dimension. This now provides us with an answer which is much easier to interpret as a measure of dispersion.
There is, however, one last problem when dealing with the SD of a sample such as the one described above. The result of the formula as it stands thus far produces a biased estimate, i.e. one that is consistently either higher or (as in this case) lower than the 'true' value, because we are limited to using a sample rather than observing the whole population. The usual remedy is to divide the sum of squared deviations by N − 1 rather than N before taking the square root; the result is the familiar sample SD.
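A short sketch of the square-root step, together with the common N − 1 correction for this bias, checked against Python's standard library:

```python
import math
import statistics

# Population-style SD (divide by N) versus sample SD (divide by N - 1),
# for the five observations used earlier in the text.
data = [10, 12, 20, 8, 15]
n = len(data)
m = sum(data) / n
ss = sum((x - m) ** 2 for x in data)  # the sum of squared deviations

sd_pop = math.sqrt(ss / n)            # divides by N: biased for a sample
sd_sample = math.sqrt(ss / (n - 1))   # divides by N - 1

print(round(sd_pop, 3), round(sd_sample, 3))
# statistics.stdev uses the N - 1 form, so the two should agree:
print(math.isclose(sd_sample, statistics.stdev(data)))
```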
3. Standard error
We mentioned previously that the purpose of most studies is to estimate some population parameter,
such as the mean, the SD, a correlation or a proportion. Once we have that estimate, another question
then arises: How accurate is our estimate? This may seem an unanswerable question; if we do not
know what the population value is, how can we evaluate how close we are to it?
Mere logic, however, has never stopped statisticians in the past, and it will not stop us now. What we
can do is resort to probabilities: What is the probability (P) that the true (population) mean falls within
a certain range of values? (To cite one of our mottos, ‘Statistics means you never have to say you are
certain.’). One way to answer the question is to repeat the study a few hundred times, which will give
us many estimates of the mean. We can then take the mean of these means, as well as figure out what
the distribution of means is; that is, we can get the SD of the mean values. Then, using the same table
of normal distribution as usual, we can estimate what range of values would encompass 90 or 95% of
the means. If each sample had been drawn from the population at random, we would be fairly safe in concluding that the true mean also falls within this range 90 or 95% of the time. We assign a new name to the SD of this distribution of sample means: the standard error of the mean.
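This repeat-the-study thought experiment is easy to simulate. A sketch in Python, where the population mean and SD echo the height example of Section 2, and the sample size of 25 and number of repetitions are illustrative choices:

```python
import random
import statistics

# Draw many samples from a known population, take each sample's mean,
# and look at the spread of those means.
random.seed(0)
population_mean, population_sd = 69.7, 9.0
n = 25  # size of each individual study

sample_means = [
    statistics.fmean(random.gauss(population_mean, population_sd) for _ in range(n))
    for _ in range(2000)  # number of repeated "studies"
]

# The SD of the sample means should be close to 9 / sqrt(25) = 1.8.
sd_of_means = statistics.stdev(sample_means)
print(round(sd_of_means, 2))
```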
4. Summary
Although the SD and the SE are related ($\mathrm{SE} = \mathrm{SD}/\sqrt{N}$), they give two very different types of information (Carlin & Doyle, 2000). Moreover, it is easy to be confused about the difference between SD
and SE. Whereas the SD estimates the variability in the study sample, the SE estimates the precision
and uncertainty of how the study sample represents the underlying population (Webster & Merry,
1997). In other words, the SD tells us the distribution of individual data points around the mean, and
the SE informs us how precise our estimate of the mean is (Streiner, 1996). In summary, we can
mention the following differences:
(1) The SD quantifies scatter—how much the values vary from one another.
(2) The SE quantifies how accurately you know the true mean of the population.
(3) The SE, by definition, is always smaller than the SD (whenever N > 1).
(4) The SE gets smaller as your samples get larger. This makes sense, because the mean of a large sample is likely to lie closer to the true population mean than the mean of a small one.
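Points (3) and (4) can be checked in a few lines of Python; the helper `standard_error` below is an illustrative definition of SE = SD/√N, not a library routine.

```python
import math
import statistics

def standard_error(data):
    """SE of the mean: the sample SD divided by the square root of N."""
    return statistics.stdev(data) / math.sqrt(len(data))

small = [10, 12, 20, 8, 15]
large = small * 20  # the same spread of values, many more of them

for data in (small, large):
    sd = statistics.stdev(data)
    se = standard_error(data)
    print(len(data), round(sd, 2), round(se, 2))
    assert se < sd  # point (3): SE is smaller than SD for N > 1
```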
REFERENCES
CARLIN, J. B. & DOYLE, L. W. (2000) Basic concepts of statistical reasoning: standard errors and confidence
intervals. J Paediatr Child Health, 36, 502–505.
DODGE, Y. (2003) The Oxford Dictionary of Statistical Terms, Oxford: Oxford University Press. ISBN
0-19-920613-9.
EVERITT, B. S. (2003) The Cambridge Dictionary of Statistics, Cambridge: Cambridge University Press. ISBN 0-521-81099-x.
NAGELE, P. (2003) Misuse of standard error of the mean (SEM) when reporting variability of a sample. A critical evaluation of four anaesthesia journals. Br J Anaesthesiol, 90, 514–516.
REICHMANN, J. (1961) Use and Abuse of Statistics, London: Methuen. Reprinted 1964–1970 by Pelican.
Appendix 8.
STREINER, D. L. (1996) Maintaining standards: differences between the standard deviation and standard error,
and when to use each. Can. J. Psychiatry, 41, 498–502.
VYSOCHANSKIJ, D. F. & PETUNIN, Y. I. (1980) Justification of the 3s rule for unimodal distributions. Theory
Probab Math Stat, 21, 25–36.
WEBSTER, C. S. & MERRY, A. F. (1997) The standard deviation and the standard error of the mean. Anaesthesia,
52, 183.
Hossein Hassani was born on 22 October 1976 in Tehran, Iran. He graduated with a BSc and MSc in Statistics (Iran) and a PhD in Statistics from the Cardiff School of Mathematics in December 2009. He has been involved in teaching statistics since 2004 at various levels and in various subjects, and has been working as a lecturer in Statistics at the Cardiff School of Medicine since December 2009. He has published several papers in highly regarded scholarly journals (some of them among the top journals in their fields) and has received several international prizes and grants for his research during his PhD study.
Mansoureh Ghodsi has been working towards her doctorate under the stewardship of the Statistics Group at Cardiff University since late 2006 and is due to submit her PhD thesis in 2011.
Gareth Howell has been working towards his doctorate under the stewardship of the Statistics Group at Cardiff University since late 2006 and is due to complete it very shortly.