Introduction

Homogeneity of variance (homoscedasticity) is an important assumption shared by many parametric statistical methods: it requires that the populations under comparison (two or more, depending on the method) all have the same variance. For example, this assumption underlies the two-sample t-test and ANOVA. If the variances are not homogeneous, they are said to be heterogeneous, and the underlying populations, or random variables, are called heteroscedastic (sometimes spelled heteroskedastic).

In this entry we first discuss the comparison of the variances of two populations, and then extend the discussion to k populations.

Comparison of Two Population Variances

The standard F-test is used to test whether two populations have the same variance. The test statistic for testing the hypothesis \({\sigma }_{1}^{\;2} = {\sigma }_{2}^{\;2}\), where \({\sigma }_{1}^{\;2}\) and \({\sigma }_{2}^{\;2}\) are the variances of the two populations, is

$$F = \frac{{s}_{1}^{\;2}} {{s}_{2}^{\;2}},$$
(1)

where \({s}_{1}^{\;2}\) and \({s}_{2}^{\;2}\) are the sample variances for two independent random samples of \({n}_{1}\) and \({n}_{2}\) observations from normally distributed populations with variances \({\sigma }_{1}^{\;2}\) and \({\sigma }_{2}^{\;2}\), respectively. If the null hypothesis is true (i.e., \({H}_{0} : {\sigma }_{1}^{\;2} = {\sigma }_{2}^{\;2}\)), the test statistic has the F-distribution with \({n}_{1} - 1\) degrees of freedom for the numerator and \({n}_{2} - 1\) degrees of freedom for the denominator. The F-test is extremely sensitive to non-normality and should not be used unless there is strong evidence that the data do not depart from normality.

In practical applications, the F ratio in (1) is usually calculated so that the larger sample variance is in the numerator, that is, \({s}_{1}^{\;2}> {s}_{2}^{\;2}\). Thus, the F statistic is always greater than one and only the upper critical values of the F-distribution are needed. At the significance level α, the test rejects the hypothesis that the variances are equal if \(F> {F}_{(\alpha ;{n}_{1}-1;{n}_{2}-1)},\) where \({F}_{(\alpha ;{n}_{1}-1;{n}_{2}-1)}\) is the upper critical value of the F distribution with \({n}_{1} - 1\) and \({n}_{2} - 1\) degrees of freedom.
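As an illustration, the two-sample F-test can be carried out directly from formula (1). SciPy does not expose this test as a single function, so the statistic and its upper-tail p-value are computed from the F-distribution; the data below are hypothetical.

```python
import numpy as np
from scipy import stats

# Two hypothetical samples, assumed drawn from normal populations
x = np.array([4.2, 5.1, 3.8, 4.9, 5.5, 4.4, 5.0, 4.7])
y = np.array([3.9, 6.2, 2.8, 6.8, 2.5, 6.5, 3.1, 5.9])

s2_x, s2_y = np.var(x, ddof=1), np.var(y, ddof=1)

# Put the larger sample variance in the numerator so that F >= 1
if s2_x >= s2_y:
    F, df1, df2 = s2_x / s2_y, len(x) - 1, len(y) - 1
else:
    F, df1, df2 = s2_y / s2_x, len(y) - 1, len(x) - 1

# Upper-tail p-value; reject H0 at level alpha if p < alpha
p = stats.f.sf(F, df1, df2)
print(F, p)
```

Because only upper critical values are used here, the p-value is one-sided; for a two-sided alternative it would be doubled.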

Tests for Equality of Variances of k Populations

Bartlett’s test (Bartlett 1937) is used to test whether k groups (populations) have equal variances. The hypotheses are stated as follows:

$$\begin{array}{l} {H}_{0} : {\sigma }_{1}^{\;2} = {\sigma }_{2}^{\;2} = \ldots = {\sigma }_{k}^{\;2} \\ {H}_{1} : {\sigma }_{i}^{\;2}\neq {\sigma }_{j}^{\;2}\quad \mbox{ for at least one pair $(i,j).$}\end{array}$$

To test for equality of variance against the alternative that variances are not equal for at least two groups, the test statistic is defined as

$${\chi }^{2} = \frac{\left (N - k\right )\ln \left (\frac{{\sum \limits _{i=1}^{k}}\left ({n}_{ i}-1\right ){s}_{i}^{\;2}} {N-k} \right ) -{\sum \limits _{i=1}^{k}}\left ({n}_{i} - 1\right )\ln \left ({s}_{i}^{\;2}\right )} {1 + \frac{1} {3\left (k-1\right )}\left [\left ({\sum \limits _{i=1}^{k}} \frac{1} {{n}_{i}-1}\right ) - \frac{1} {N-k}\right ]}$$
(2)

where k is the number of samples (groups), \({n}_{i}\) is the size of the ith sample with sample variance \({s}_{i}^{\;2}\), and N is the sum of all sample sizes.

Under the null hypothesis, the test statistic follows a chi-square distribution with (k − 1) degrees of freedom, and the standard chi-square test is applied.

Bartlett’s test rejects the null hypothesis that the variances are equal if \({\chi }^{2}> {\chi }_{(\alpha ;k-1)}^{2}\), where \({\chi }_{(\alpha ;k-1)}^{2}\) is the upper critical value of the chi-square distribution with (k − 1) degrees of freedom at the significance level α.

The test is very sensitive to departures from normality and/or to differences in group sizes and is not recommended for routine use. However, if there is strong evidence that the underlying distribution is normal (or nearly normal), Bartlett’s test performs well.
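As a sketch, the Bartlett statistic can be computed directly from the formula above (with made-up data) and cross-checked against SciPy's `scipy.stats.bartlett`, which implements the same test:

```python
import numpy as np
from scipy import stats

# Three hypothetical groups, assumed approximately normal
groups = [np.array([24.1, 25.3, 26.2, 24.8, 25.9]),
          np.array([23.5, 27.0, 22.1, 28.3, 24.6]),
          np.array([25.0, 25.4, 24.9, 25.2, 25.1])]
k = len(groups)
n = np.array([len(g) for g in groups])
N = n.sum()
s2 = np.array([np.var(g, ddof=1) for g in groups])

# Pooled variance, then the Bartlett statistic term by term
sp2 = np.sum((n - 1) * s2) / (N - k)
num = (N - k) * np.log(sp2) - np.sum((n - 1) * np.log(s2))
den = 1 + (np.sum(1 / (n - 1)) - 1 / (N - k)) / (3 * (k - 1))
chi2 = num / den
p = stats.chi2.sf(chi2, k - 1)          # upper-tail chi-square p-value

# Cross-check against SciPy's implementation
stat, p_scipy = stats.bartlett(*groups)
print(chi2, stat)
```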

Levene’s test (Levene 1960) is an alternative to Bartlett’s test for testing whether k groups have equal variances. It is less sensitive to departures from normality and/or to differences in group sizes and is considered the standard test for homogeneity of variances. The idea of the test is to transform the original values of the dependent variable Y to obtain a new variable known as the “dispersion variable.” A standard analysis of variance on these transformed values then tests the assumption of homogeneity of variances.

The test has two options. Given a variable Y with a sample of size N divided into k subgroups, let \({Y }_{ij}\) be the jth individual score belonging to the ith subgroup. The first option is to define the transformed variable as the absolute deviation of the individual’s score from the mean of the subgroup to which the individual belongs, that is, as \({Z}_{ij} = \left \vert {Y }_{ij} -{\overline{Y }}_{i.}\right \vert\), where \({\overline{Y }}_{i.}\) is the mean of the ith subgroup. The transformed variable is known as the dispersion variable, since it “measures” how far the individual is displaced from its subgroup mean.

Levene’s test statistic is defined as

$${F}^{L} = \frac{\left (N - k\right ){\sum \limits _{i=1}^{k}}{n}_{i}{\left ({\overline{Z}}_{i.} -\overline{Z}\right )}^{2}} {\left (k - 1\right ){\sum \limits _{i=1}^{k}}{\sum \limits _{j=1}^{{n}_{i}}}{\left ({Z}_{ij} -{\overline{Z}}_{i.}\right )}^{2}}$$
(3)

where \({n}_{i}\) is the sample size of the ith subgroup, \({Z}_{ij} = \left \vert {Y }_{ij} -{\overline{Y }}_{i.}\right \vert\) is the dispersion variable, \({\overline{Z}}_{i.}\) are the subgroup means of \({Z}_{ij}\), and \(\overline{Z}\) is the overall mean of \({Z}_{ij}\).

The test statistic follows the F-distribution with (k − 1) and (N − k) degrees of freedom and the standard F-test is applied.

Levene’s test rejects the hypothesis that the variances are equal if \({F}^{L}> {F}_{(\alpha ;k-1;N-k)}\), where \({F}_{(\alpha ;k-1;N-k)}\) is the upper critical value of the F distribution with (k − 1) and (N − k) degrees of freedom at the significance level α.
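In practice the mean-based version is available as `scipy.stats.levene` with `center='mean'`. Because Levene's test is simply a one-way ANOVA on the dispersion variable, the same statistic is obtained by applying `scipy.stats.f_oneway` to \({Z}_{ij}\) directly, as this sketch with hypothetical data shows:

```python
import numpy as np
from scipy import stats

# Hypothetical groups
g1 = np.array([24.1, 25.3, 26.2, 24.8, 25.9])
g2 = np.array([23.5, 27.0, 22.1, 28.3, 24.6])
g3 = np.array([25.0, 25.4, 24.9, 25.2, 25.1])

# Mean-based Levene test
stat, p = stats.levene(g1, g2, g3, center='mean')

# Equivalent: one-way ANOVA on Z_ij = |Y_ij - subgroup mean|
z = [np.abs(g - g.mean()) for g in (g1, g2, g3)]
f_stat, f_p = stats.f_oneway(*z)
print(stat, f_stat)
```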

The second option is to define the dispersion variable as the square of the absolute deviation from the subgroup mean, that is, as \({Z}_{ij}^{\;2} ={ \left \vert {Y }_{ij} -{\overline{Y }}_{i.}\right \vert }^{2}.\) The Brown–Forsythe test (Brown and Forsythe 1974) is a modification of Levene’s test, based on the same logic, except that the dispersion variable \({Z}_{ij}\) is defined as the absolute deviation from the subgroup median rather than the subgroup mean, that is, \({Z}_{ij} = \left \vert {Y }_{ij} - {M}_{i.}\right \vert\), where \({M}_{i.}\) is the median of the ith subgroup. This definition, based on medians instead of means, provides good robustness against many types of non-normal data while retaining good power, and is therefore recommended in practical applications.
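In SciPy, this median-based variant is what `scipy.stats.levene` computes by default (`center='median'`); a minimal sketch with the same kind of hypothetical data:

```python
from scipy import stats

g1 = [24.1, 25.3, 26.2, 24.8, 25.9]
g2 = [23.5, 27.0, 22.1, 28.3, 24.6]
g3 = [25.0, 25.4, 24.9, 25.2, 25.1]

# center='median' (the default) gives the Brown-Forsythe statistic
stat, p = stats.levene(g1, g2, g3, center='median')
print(stat, p)
```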

The O’Brien test (O’Brien 1979) is a modification of Levene’s \({Z}_{ij}^{\;2}\) test. In the O’Brien test, the dispersion variable \({Z}_{ij}^{\;2}\) is modified to include an additional scalar W (weight) that accounts for the suspected kurtosis of the underlying distribution. The dispersion variable in the O’Brien test is defined as

$${Z}_{ij}^{\;B} = \frac{\left (W + {n}_{i} - 2\right ){n}_{i}{Z}_{ij}^{\;2} - W\left ({n}_{ i} - 1\right ){s}_{i}^{\;2}} {\left ({n}_{i} - 1\right )\left ({n}_{i} - 2\right )}$$
(4)

where \({Z}_{ij}^{\;2}\) is the square of the absolute deviation from the subgroup mean and \({n}_{i}\) is the size of the ith subgroup with sample variance \({s}_{i}^{\;2}\). W is a constant with values between 0 and 1 that adjusts the transformation. The most commonly used weight is W = 0.5, as suggested by O’Brien (1979).
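The transformation in (4) with W = 0.5 can be sketched as follows (hypothetical data). SciPy ships the W = 0.5 case as `scipy.stats.obrientransform`, against which the manual version is cross-checked; the homogeneity test itself is then the ordinary one-way ANOVA F-test on the transformed values.

```python
import numpy as np
from scipy import stats

def obrien(y, w=0.5):
    """O'Brien dispersion variable for one group, per Eq. (4)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    s2 = np.var(y, ddof=1)
    z2 = (y - y.mean()) ** 2     # squared deviation from the subgroup mean
    return ((w + n - 2) * n * z2 - w * (n - 1) * s2) / ((n - 1) * (n - 2))

groups = [[24.1, 25.3, 26.2, 24.8, 25.9],
          [23.5, 27.0, 22.1, 28.3, 24.6],
          [25.0, 25.4, 24.9, 25.2, 25.1]]

manual = [obrien(g) for g in groups]
builtin = stats.obrientransform(*groups)   # SciPy's W = 0.5 implementation

# The homogeneity test is a one-way ANOVA on the transformed values
stat, p = stats.f_oneway(*manual)
print(stat, p)
```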

The tests discussed above are the ones most widely used in empirical research and are readily available in most statistical software packages. However, there are also other homogeneity of variance tests, both parametric and nonparametric. Among them are Hartley’s Fmax test, David’s multiple test, and Cochran’s C test. The Bartlett–Kendall test (like Bartlett’s test) uses a log transformation of the variance to approximate the normal distribution. An example of a nonparametric test is the Siegel–Tukey test, which uses ranks and the chi-square approximation. A good discussion of the topic can be found in Zhang (1998).

Cross References

Analysis of Variance Model, Effects of Departures from Assumptions Underlying

Bartlett’s Test

Heteroscedasticity

Variance