The term “variance” was coined by Ronald Fisher in 1918 in his famous paper on population genetics, The Correlation Between Relatives on the Supposition of Mendelian Inheritance, published by the Royal Society of Edinburgh: “It is … desirable in analyzing the causes of variability to deal with the square of the standard deviation as the measure of variability. We shall term this quantity the Variance …” (p. 399). Interestingly, according to O. Kempthorne, this paper was initially rejected by the Royal Society of London; “probably the reason was that it constituted such a great advance on the thought in the area that the reviewers were unable to make a reasonable assessment.”

The variance of a random variable (or of a data set) measures the dispersion, or spread, of the values around the mean (expected value).

Definition. Let X be a random variable with finite second moment \(E(X^{2})\) and let \(\mu = E(X)\) be its mean. The variance of X is defined by (see, e.g., Feller 1968, p. 228)

$$\mathrm{Var}(X) = E\left[(X - \mu)^{2}\right] = E(X^{2}) - \mu^{2}.$$
(1)

The variance of a random variable is also frequently denoted by \(V(X)\), \(\sigma_{X}^{2}\), or simply \(\sigma^{2}\), when the context is clear. The positive square root of the variance is called the standard deviation.

From (1), the variance of X can be interpreted as the “mean of the squares of deviations from the mean” (Kendall 1945, p. 39). Since the deviations are squared, it is clear that variance cannot be negative. Variance is a measure of dispersion “since if the values of a random variable X tend to be far from their mean, the variance of X will be larger than the variance of a comparable random variable Y whose values tend to be near their mean” (Mood et al. 1974, p. 67). It is obvious that a constant has variance 0, since there is no spread. Because the deviations are squared, the variance is expressed in the original units squared (inches², euro²), which can be difficult to interpret.

To compute the variance of a random variable, one needs to know the probability distribution of X. If X is a discrete random variable, then

$$\mathrm{Var}(X) = \sum_{i}(x_{i} - \mu)^{2}\,P(X = x_{i}) = \sum_{i} x_{i}^{2}\,P(X = x_{i}) - \mu^{2}.$$
(2)
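As a concrete illustration of (2) (our sketch, not part of the original entry; the helper name `discrete_variance` is hypothetical), consider a fair six-sided die, whose variance is 35 ∕ 12:

```python
# Minimal sketch of formula (2): the variance of a fair six-sided die,
# computed in both of the equivalent forms given in (2).

def discrete_variance(values, probs):
    mu = sum(x * p for x, p in zip(values, probs))                    # E(X)
    central = sum((x - mu) ** 2 * p for x, p in zip(values, probs))   # E[(X - mu)^2]
    raw = sum(x * x * p for x, p in zip(values, probs)) - mu ** 2     # E(X^2) - mu^2
    assert abs(central - raw) < 1e-12  # the two forms in (2) agree
    return central

faces = [1, 2, 3, 4, 5, 6]
print(discrete_variance(faces, [1 / 6] * 6))  # 35/12 ≈ 2.9167
```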

When X is a continuous random variable with probability density function f(x), then

$$\mathrm{Var}(X) = \int_{-\infty}^{+\infty}(x - \mu)^{2}f(x)\,dx = \int_{-\infty}^{+\infty} x^{2}f(x)\,dx - \mu^{2}.$$
(3)
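Formula (3) can also be evaluated numerically. The following sketch (ours, under the assumption that scipy is available) integrates the Exponential(λ) density with scipy.integrate.quad and recovers the value 1 ∕ λ² listed in the table below:

```python
# Sketch of formula (3) via numerical integration: the Exponential
# density f(x) = lam * exp(-lam * x) on [0, inf) has variance 1 / lam^2.
import math
from scipy.integrate import quad

lam = 2.0
f = lambda x: lam * math.exp(-lam * x)

mu, _ = quad(lambda x: x * f(x), 0, math.inf)           # E(X)
second, _ = quad(lambda x: x ** 2 * f(x), 0, math.inf)  # E(X^2)
print(second - mu ** 2, 1 / lam ** 2)  # both ≈ 0.25
```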

Example 1. If X has a Uniform distribution on [a, b], with pdf \(f(x) = 1/(b - a)\), then

$$E(X) = \frac{1}{b - a}\int_{a}^{b} x\,dx = \frac{b^{2} - a^{2}}{2(b - a)} = \frac{a + b}{2},$$

and

$$E(X^{2}) = \frac{1}{b - a}\int_{a}^{b} x^{2}\,dx = \frac{b^{3} - a^{3}}{3(b - a)} = \frac{a^{2} + ab + b^{2}}{3}.$$

Hence the variance is equal to

$$\mathrm{Var}(X) = E(X^{2}) - \mu^{2} = \frac{a^{2} + ab + b^{2}}{3} - \left(\frac{a + b}{2}\right)^{2} = \frac{(b - a)^{2}}{12}.$$
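As a quick sanity check of Example 1 (our addition, not part of the original entry), a Monte Carlo draw from U(a, b) gives a sample variance close to (b − a)² ∕ 12:

```python
# Monte Carlo check of Example 1: the sample variance of many
# Uniform(a, b) draws should approach (b - a)^2 / 12.
import random

a, b = 2.0, 7.0
n = 200_000
draws = [random.uniform(a, b) for _ in range(n)]
mean = sum(draws) / n
var = sum((x - mean) ** 2 for x in draws) / (n - 1)
print(var, (b - a) ** 2 / 12)  # both ≈ 25/12 ≈ 2.083
```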

The following table provides expressions for the variance of some standard univariate discrete and continuous probability distributions.

Distribution       Notation        Variance
Bernoulli          Be(p)           pq
Binomial           Bin(n, p)       npq
Geometric          Ge(p)           q ∕ p²
Poisson            Po(λ)           λ
Uniform            U(a, b)         (b − a)² ∕ 12
Exponential        Exp(λ)          1 ∕ λ²
Normal             N(μ, σ)         σ²
Standard Normal    N(0, 1)         1
Student            t(ν)            ν ∕ (ν − 2) for ν > 2
F                  F(ν₁, ν₂)       \(\frac{2{\nu}_{2}^{2}({\nu}_{1}+{\nu}_{2}-2)}{{\nu}_{1}{({\nu}_{2}-2)}^{2}({\nu}_{2}-4)}\) for ν₂ > 4
Chi-square         Chi(ν)          2ν

The Cauchy distribution possesses neither mean nor variance.
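Several table entries can be spot-checked against a library. The sketch below (our choice of tool, not part of the original entry) uses scipy.stats, whose `.var()` method returns a distribution's variance, and nan where the variance does not exist:

```python
# Spot-check of a few table entries with scipy.stats.
from scipy import stats

n, p = 10, 0.3
print(stats.binom(n, p).var(), n * p * (1 - p))           # Binomial: npq
print(stats.poisson(4.0).var())                           # Poisson: lambda = 4.0
print(stats.uniform(loc=2, scale=5).var(), 5 ** 2 / 12)   # U(2, 7): (b - a)^2 / 12
print(stats.expon(scale=0.5).var(), 0.5 ** 2)             # Exp(lambda = 2): 1 / lambda^2
print(stats.cauchy().var())                               # nan: Cauchy has no variance
```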

Next, we list some important properties of variance.

1. The variance of a constant is 0; in other words, if all observations in a data set are identical, the variance takes its minimum possible value, which is zero.

2. If b is a constant, then

$$\mathrm{Var}(X + b) = \mathrm{Var}(X),$$

which means that adding a constant to a random variable does not change the variance.

3. If a and b are constants, then

$$\mathrm{Var}(aX + b) = a^{2}\,\mathrm{Var}(X).$$

4. If two random variables X and Y are independent, then

$$\begin{array}{rcl} \mathrm{Var}(X + Y) & = & \mathrm{Var}(X) + \mathrm{Var}(Y), \\ \mathrm{Var}(X - Y) & = & \mathrm{Var}(X) + \mathrm{Var}(Y). \end{array}$$

5. The previous property can be generalized: the variance of a sum of independent random variables is equal to the sum of the variances of these random variables,

$$\mathrm{Var}\left(\sum\limits_{i=1}^{n} X_{i}\right) = \sum\limits_{i=1}^{n}\mathrm{Var}(X_{i}).$$

This result is called the Bienaymé equality (see Loève 1977, p. 12, or Roussas, p. 171).

6. If two random variables X and Y are independent and a and b are constants, then (see the numerical sketch below)

$$\mathrm{Var}(aX + bY) = a^{2}\,\mathrm{Var}(X) + b^{2}\,\mathrm{Var}(Y).$$
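Properties 2, 3, 5, and 6 are easy to see numerically. A minimal sketch (ours, assuming numpy is available) with independent normal samples, where the equalities hold only approximately for finite samples:

```python
# Numerical illustration of properties 2, 3, 5, and 6 with independent
# pseudo-random samples; finite-sample values match only approximately.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0, 2, size=100_000)  # Var X = 4
y = rng.normal(5, 3, size=100_000)  # Var Y = 9, independent of x

print(np.var(x + 10), np.var(x))                              # property 2
print(np.var(3 * x + 1), 9 * np.var(x))                       # property 3
print(np.var(x + y), np.var(x) + np.var(y))                   # property 5 (n = 2)
print(np.var(2 * x - 3 * y), 4 * np.var(x) + 9 * np.var(y))   # property 6
```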

In practice, the variance of a population, \(\sigma^{2}\), is usually not known, and therefore it can only be estimated using the information contained in a sample of observations drawn from that population. If \(x_{1}, x_{2}, \ldots, x_{n}\) is a random sample of size n selected from a population with mean μ, then the sample variance is usually denoted by \(s^{2}\) and is defined by

$$s^{2} = \frac{\sum_{i=1}^{n}(x_{i} - \overline{x})^{2}}{n - 1},$$
(4)

where \(\overline{x}\) is the sample mean. The sample variance describes the dispersion of the sample observations around the sample mean. The squared deviations in (4) are divided by n − 1, not by n, in order to obtain an unbiased estimator of the population variance, \(E(s^{2}) = \sigma^{2}\): dividing by n − 1 rather than n inflates the sample variance just enough to remove the bias. This adjustment is known as Bessel’s correction (after Friedrich Bessel). Although the sample variance defined in (4) is an unbiased estimator of the population variance, the same does not hold for its square root: the sample standard deviation is a biased estimator of the population standard deviation.
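A short simulation (ours, not part of the original entry) makes Bessel’s correction visible: for samples of size 5 from a population with σ² = 9, dividing by n underestimates the variance on average, while dividing by n − 1 does not:

```python
# Simulation of Bessel's correction. With n = 5 the biased estimator has
# expectation sigma^2 * (n - 1) / n = 7.2; the unbiased estimator, 9.0.
import numpy as np

rng = np.random.default_rng(1)
biased, unbiased = [], []
for _ in range(20_000):
    sample = rng.normal(0, 3, size=5)        # population variance is 9
    biased.append(np.var(sample))            # divides by n
    unbiased.append(np.var(sample, ddof=1))  # divides by n - 1 (Bessel)
print(np.mean(biased), np.mean(unbiased))    # ≈ 7.2 and ≈ 9.0
```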

Example 2. The first column of the following table contains the first five measurements of the speed of light in suitable units (thousands of km/s) from the classical experiments performed by Michelson in 1879 (data obtained from Ernest N. Dorsey’s 1944 paper “The Velocity of Light”).

\(x_{i}\)       \(x_{i} - \overline{x}\)    \((x_{i} - \overline{x})^{2}\)    \(x_{i}^{2}\)
299.85       −0.048       0.002304       89,910.0225
299.74       −0.158       0.024964       89,844.0676
299.90       0.002        0.000004       89,940.0100
300.07       0.172        0.029584       90,042.0049
299.93       0.032        0.001024       89,958.0049
Σ 1499.49    0.000        0.057880       449,694.1099

Since the sample mean is equal to \(\overline{x} = \frac{\sum_{i=1}^{5} x_{i}}{5} = \frac{1499.49}{5} = 299.898\), using the formula given in (4) yields the variance value

$$s^{2} = \frac{\sum (x_{i} - \overline{x})^{2}}{n - 1} = \frac{0.057880}{4} = 0.01447.$$

In the past, instead of the “definitional” formula (4), the following (so-called shorthand) formula was commonly used, but it has become obsolete with the wide availability of statistical software, spreadsheets, and Internet Java applets:

$$s^{2} = \frac{\sum x_{i}^{2} - \frac{\left(\sum x_{i}\right)^{2}}{n}}{n - 1} = \frac{449{,}694.1099 - \frac{1499.49^{2}}{5}}{4} = 0.01447.$$
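Both formulas are easy to reproduce in a few lines; the sketch below (ours) recomputes Example 2:

```python
# Recomputing Example 2: the definitional formula (4) and the shorthand
# formula give the same sample variance for the Michelson data.
data = [299.85, 299.74, 299.90, 300.07, 299.93]
n = len(data)
xbar = sum(data) / n                                      # 299.898

s2_def = sum((x - xbar) ** 2 for x in data) / (n - 1)     # formula (4)
s2_short = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)
print(xbar, round(s2_def, 5), round(s2_short, 5))         # 299.898 0.01447 0.01447
```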

About the Author

Abdulbari Bener, Ph.D., has joined the Department of Public Health at the Weill Cornell Medical College as Research Professor of Public Health. Professor Bener is Director of the Medical Statistics and Epidemiology Department at Hamad Medical Corporation, Qatar. He is also an advisor to the World Health Organization and Adjunct Professor and Coordinator of the postgraduate and master of public health (MPH) programs of the School of Epidemiology and Health Sciences, University of Manchester. He is a Fellow of the Royal Statistical Society (FRSS) and a Fellow of the Faculty of Public Health (FFPH). Dr. Bener holds a Ph.D. in Medical Statistics (Biometry) and Genetics from University College London, and a B.Sc. from Ankara University, Faculty of Education, Department of Management, Planning and Investigation. He completed research fellowships in the Departments of Genetics and Biometry and of Statistics and Computer Sciences at University College London. He has held academic positions in public health, epidemiology, and statistics at universities in Turkey, Saudi Arabia, Kuwait, the United Arab Emirates, Qatar, and England. Professor Bener is the author or coauthor of more than 430 published journal articles; has served as Editor, Associate Editor, Advisory Editor, and Assistant Editor for several journals and as a referee for over 23 journals; has contributed to more than 15 book chapters; and has supervised the theses of 40 postgraduate students (M.Sc., MPH, M.Phil., and Ph.D.).

Cross References

Expected Value

Mean, Median and Mode

Mean, Median, Mode: An Introduction

Semi-Variance in Finance

Standard Deviation

Statistical Distributions: An Overview

Tests for Homogeneity of Variance