1 Introduction

The expected value of the sample variance is usually obtained by first deriving its sampling distribution, which may be intractable in some situations. The objective of this paper is to derive a general formula for the mathematical expectation of the sample variance.

One may wonder whether there is any real-world situation that requires a generalization of the expected value formula for the sample variance. Such situations arise when the observations are not necessarily independent, for example, time series data or observations from a mixture distribution whose parameters themselves follow some other distribution. See, for example, [8].

It was long believed that rates of return on common stocks were adequately characterized by a normal distribution. More recently, several authors have observed that the empirical distribution of stock return rates has somewhat thicker tails (larger kurtosis) than the normal distribution. The univariate \(t\)-distribution has fatter tails and is therefore more appropriate than the normal distribution for characterizing stock return rates. For example, if, given \(\Upsilon =\upsilon \), the observations follow a normal distribution, say \(X_{i}\sim N(0,\upsilon ^{2})\), \((i=1,2,\ldots ,n)\), where \(\nu \Upsilon ^{-2}\sim \chi _{\nu }^{2}\), then the unconditional distribution of the sample is a \(t\)-distribution (Example 3.7). Examples 3.3 and 3.7 show the usefulness of the main result proved in Theorem 2.1 for a sample governed by a \(t\)-distribution. Samples can be drawn from distributions with dependent components by three methods: the conditional distribution method, the transformation method, and the rejection method [10, pp. 43–48].

Blattberg and Gonedes [3] assessed the suitability of the multivariate \(t\)-model, and [19] considered a regression model to study stock return data for a single stock. Interested readers may consult [14], who considered a multivariate \(t\)-model for the price change data for the stocks of four selected firms: General Electric, Standard Oil, IBM, and Sears. These are examples where the expected sample variance or covariance matrix cannot be derived by appealing to independence.

Let \(x_{1},x_{2},\ldots ,x_{n}\,(n\ge 2)\) be a sample with variance \(s^{2}\), where \((n-1)s^{2}=\sum \nolimits _{i=1}^{n}{(x_{i}-\bar{{x}})^{2}}\). A matrix \(W\) of the pair-wise differences among observations can be formed with entries \(w_{ij} =x_{i}-x_{j}\), where \(i\) and \(j\) are integers \((i,j=1,\,2,\ldots ,n)\), so that the set of elements of \(W=\{w_{ij} : 1\le i\le n,\,1\le j\le n\}\) can be ‘decomposed’ as

$$\begin{aligned} W_{l}&= \{w_{ij} : 1\le i\le n,\,1\le j\le n;\,i>j\}=\{w_{ij} : 1\le i>j\le n\}, \\ W_{u}&= \{w_{ij} : 1\le i\le n,\,1\le j\le n;\,i<j\}=\{w_{ij} : 1\le i<j\le n\}\quad \hbox {and}\\ W_{d}&= \{w_{ii} =0: 1\le i\le n\} \end{aligned}$$

which are the elements in the lower triangle, upper triangle, and in the diagonal of the matrix \(W\). Also

$$\begin{aligned} W_{l}&= \{w_{ij} : 2\le i\le n,\,1\le j\le i-1\}=\{w_{ij} : 1\le j\le n-1,\,j+1\le i\le n\}, \nonumber \\ W_{u}&= \{w_{ij} : 1\le i\le n-1,\,i+1\le j\le n\}=\{w_{ij} : 2\le j\le n,\,1\le i\le j-1\}.\nonumber \\ \end{aligned}$$
(1.1)

Then it is easy to check that \((n-1)s^{2}=\sum \nolimits _{i=1}^{n}{(x_{i}-\bar{{x}})^{2}} \) can also be represented by

$$\begin{aligned} \frac{1}{n}\sum \limits _{i=2}^{n}{\sum \limits _{j=1}^{i-1} {(x_{i}-x_{j})^{2}}} =\frac{1}{2n}\sum \limits _{i=1}^{n}{\sum \limits _{j=1}^{n}{(x_{i}-x_{j})^{2}}} . \end{aligned}$$
(1.2)
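
For readers who want a quick numerical confirmation of (1.2), the following sketch (ours, purely illustrative) checks the identity for an arbitrary sample using NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10)                      # arbitrary sample, n = 10
n = len(x)

lhs = (n - 1) * np.var(x, ddof=1)            # (n - 1) s^2
diffs = x[:, None] - x[None, :]              # matrix W with entries w_ij = x_i - x_j
rhs_lower = (diffs[np.tril_indices(n, k=-1)] ** 2).sum() / n   # sum over i > j
rhs_all = (diffs ** 2).sum() / (2 * n)                         # sum over all (i, j), halved

print(np.allclose(lhs, rhs_lower), np.allclose(lhs, rhs_all))  # True True
```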

See for example, [6] and [7]. The following theorem is due to [6].

Theorem 1.1

Let \(d_{i}=x_{i+1} -x_{i}\), \(i=1,2,\ldots ,n-1\) be the first-order differences of \(n(\ge 2)\) observations. Then the variance \((s^{2})\) of \(n\) observations is given by \(n(n-1)s^{2}={d}^{\prime }Cd\) where \(d=(d_{1},d_{2},\ldots ,d_{n-1} {)}^{\prime }\) and \(C=(c_{ij})\) is an \((n-1)\times (n-1)\) symmetric matrix with \(c_{ij} =(n-i)j\) for \(i,j=1,2,\ldots ,n-1\,(i\ge j)\).

Let the mean square successive difference (MSSD) of the sample observations be given by \(D=\sum \nolimits _{i=1}^{n-1} {d_{i}^{2}}\). The ratio of the MSSD to the sample variance, \(T=D/[(n-1)S^{2}]\), was suggested in [15–18] as a test statistic for the independence of the random variables \(X_{1},X_{2},\ldots ,X_{n}\,(n\ge 2)\), which are successive observations on a stationary Gaussian time series. In particular, the ratio actually studied by von Neumann was \(nT/(n-1)\). Bingham and Nelson [2] approximated the distribution of von Neumann’s \(T\) ratio.
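
Theorem 1.1 and the ratio \(T\) are likewise easy to check numerically; the sketch below (our illustration, not taken from the cited works) builds the matrix \(C\) from \(c_{ij}=(n-\max (i,j))\min (i,j)\), which reproduces \(c_{ij}=(n-i)j\) for \(i\ge j\) together with symmetry.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=8)                       # arbitrary sample, n = 8
n = len(x)
d = np.diff(x)                               # first-order differences d_i = x_{i+1} - x_i

idx = np.arange(1, n)                        # 1-based indices 1, ..., n-1
C = (n - np.maximum.outer(idx, idx)) * np.minimum.outer(idx, idx)

s2 = np.var(x, ddof=1)
print(np.allclose(n * (n - 1) * s2, d @ C @ d))   # Theorem 1.1: True

D = np.sum(d ** 2)                           # numerator of the MSSD statistic
T = D / ((n - 1) * s2)                       # the ratio discussed above
print(T, n * T / (n - 1))                    # T and von Neumann's version of it
```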

In the case of independently and identically distributed random variables, the expected value of the sample variance is often calculated by deriving the distribution of the sample variance. If a sample is drawn from a normal population \(N(\mu ,\sigma ^{2})\), then it is well known that the sample mean \(\bar{{X}}\) and variance \(S^{2}\) are independent, that \((n-1)S^{2}/\sigma ^{2}\sim \chi _{n-1}^{2}\), a chi-square distribution with \((n-1)\) degrees of freedom, and that \((n-1)E(S^{2})/\sigma ^{2}=E(\chi _{n-1}^{2})=n-1\), i.e., \(E(S^{2})=\sigma ^{2}\) (see, for example, [11, p. 213]).

We demonstrate that, in many general situations, the expected value of the sample variance can be derived without recourse to its sampling distribution. These situations include the expectation of the variance of observations that are not necessarily independent, as mentioned earlier. Suppose that the \(X_{i}\)’s \((i=1,2,\ldots ,n)\) are uncorrelated random variables from an unknown distribution with finite mean \(E(X_{i})=\mu \,(i=1,2,\ldots ,n)\) and finite variance \(V(X_{i})=E(X_{i}-\mu )^{2}=\sigma ^{2}\,(i=1,2,\ldots ,n)\); then \(E(S^{2})\) cannot be obtained from the chi-square distribution, and a more general approach is needed. In this case it still follows that \(E(S^{2})=\sigma ^{2}\), by virtue of \((n-1)E(S^{2})=\sum \nolimits _{i=1}^{n}{E(X_{i}-\mu )^{2}-} nE(\bar{{X}}-\mu )^{2}\), i.e., \((n-1)E(S^{2})=n\sigma ^{2}-n(\sigma ^{2}/n)\).

In this paper, we demonstrate alternatively that the expected value of the sample variance depends on the second moments of the differences of pairs of its constituent random variables. In Theorem 2.1, a general formula for the expected variance is derived in terms of natural quantities depending on the means, variances, and correlations. Some special cases are presented in Sect. 3 with examples. An application to textile engineering is presented in Sect. 4.

2 The Main Result

In what follows we will need the following:

$$\begin{aligned} n\bar{{\mu }}&= \sum \limits _{i=1}^{n}{\mu _{i}} ,\,(n-1)\sigma _{\mu }^{2}=\sum \limits _{i=1}^{n}(\mu _{i}- \bar{{\mu }})^{2}\nonumber \\&= \frac{1}{n}\sum \limits _{i=2}^{n}{\sum \limits _{j=1}^{i-1} {(\mu _{i}-\mu _{j})^{2}}} ,\,n\overline{\sigma ^{2}} =\sum \limits _{i=1}^{n}{\sigma _{i}^{2}} . \end{aligned}$$
(2.1)

We define the covariance between \(X_{i}\) and \(X_{j}\) by

$$\begin{aligned} Cov(X_{i},X_{j})\!=\!\sigma _{ij} \!=\!E(X_{i}\!-\!\mu _{i})(X_{j}\!-\!\mu _{j}),\,(i=1,2,\ldots n;\,j=1,2,\ldots ,n;\,i\!\ne \! j). \end{aligned}$$

Theorem 2.1

Let \(X_{i}\)’s \((i=1,2,\ldots ,n)\) be random variables with finite mean \(E(X_{i})=\mu _{i}\,(i=1,2,\ldots ,n)\) and finite variance \(V(X_{i})=\sigma _{i}^{2}\,(i=1,2,\ldots ,n)\) with \(\sigma _{ij} =\rho _{ij} \sigma _{i}\sigma _{j},\, (i=1,2,\ldots n;\,j=1,2,\ldots ,n; i\ne j)\) and

$$\begin{aligned} \bar{{\sigma }}_{..} =\binom{n}{2}^{-1}\sum \limits _{i=2}^{n}{\sum \limits _{j=1}^{i-1} {\rho _{ij} \sigma _{i}\,\sigma _{j}}} . \end{aligned}$$
(2.2)

Then

  (a)
    $$\begin{aligned} n(n-1)E(S^{2})=\sum \limits _{i=2}^{n}{\sum \limits _{j=1}^{i-1} {E(X_{i}-X_{j})^{2}}} =\frac{1}{2}\sum \limits _{i=1}^{n}{\sum \limits _{j=1}^{n}{E(X_{i}-X_{j})^{2}}}, \end{aligned}$$
    (2.3)
  (b)
    $$\begin{aligned} E(S^{2})=\overline{\sigma ^{2}} +\sigma _{\mu }^{2}-\bar{{\sigma }}_{..}, \end{aligned}$$
    (2.4)

where \(\sigma _{\mu }^{2}\) and \(\overline{\sigma ^{2}} \) are defined by (2.1).

Proof

Part (a) is obvious by (1.2). Since \(x_{i}-x_{j}=(x_{i}-\mu _{i})-(x_{j}-\mu _{j})+(\mu _{i}-\mu _{j})\), it can be checked that

$$\begin{aligned} n(n-1)s^{2}&= \sum \limits _{i=2}^{n}\sum \limits _{j=1}^{i-1} {[(x_{i}-\mu _{i})^{2}+(x_{j}-\mu _{j})^{2}+(\mu _{i}-\mu _{j})^{2}} \nonumber \\&-2(x_{i}-\mu _{i})(x_{j}-\mu _{j})+2(x_{i}-\mu _{i})(\mu _{i}- \mu _{j})\nonumber \\&-2(x_{j}-\mu _{j})(\mu _{i}-\mu _{j})]. \end{aligned}$$
(2.5)

Clearly \(\sum \limits _{i=2}^{n}{\sum \limits _{j=1}^{i-1} {E(X_{i}-\mu _{i})^{2}}} =\sum \limits _{i=2}^{n}{(i-1)} \sigma _{i}^{2}\).

Since \(2\le j+1\le i\le n\) (see (1.1)), it follows from (2.5) that

$$\begin{aligned} n(n-1)E(S^{2})\!=\!\sum \limits _{i=2}^{n}{(i-1)} \sigma _{i}^{2}\!+\!\sum \limits _{j=1}^{n-1} {(n\!-\!j)} \sigma _{j}^{2}+n(n-1)\sigma _{\mu }^{2}-2\sum \limits _{i=2}^{n}{\sum \limits _{j=1}^{i-1} {\rho _{ij} \sigma _{i}\sigma _{j}}} . \end{aligned}$$

Then the proof for part (b) follows by virtue of

$$\begin{aligned} \sum \limits _{i=2}^{n}{(i-1)} \sigma _{i}^{2}+\sum \limits _{j=1}^{n-1} {(n-j)} \sigma _{j}^{2}=\sum \limits _{i=2}^{n}{(i-1)} \sigma _{i}^{2}+\sum \limits _{i=1}^{n-1} {(n-i)} \sigma _{i}^{2}=(n-1)\sum \limits _{i=1}^{n}{\sigma _{i}^{2}} . \end{aligned}$$

\(\square \)

Alternatively, readers acquainted with matrix algebra may prefer the following proof of Theorem 2.1.

Consider the vector \(X:(n\times 1)\) of observations, and let \(\mu =E(X)\) be the vector of means and \(\Sigma =\{\sigma _{ij}\}\) the \((n\times n)\) covariance matrix. Then it is easy to check that \((n-1)S^{2}={X}^{\prime }MX\), where the \((n\times n)\) centering matrix is \(M=I_{n}-1_{n}{1}^{\prime }_{n}/n\), with \(I_{n}\) the identity matrix of order \(n\) and \(1_{n}\) an \((n\times 1)\) vector of 1’s. Then \((n-1)E(S^{2})=E({X}^{\prime }MX)\). But \(E({X}^{\prime }MX)=E(tr({X}^{\prime }MX))\) and \(E(X{X}^{\prime })=\Sigma +\mu {\mu }^{\prime }\), so that

$$\begin{aligned} (n-1)E(S^{2})=tr\left( {M\Sigma }\right) +\,{\mu }^{\prime }M\,\mu . \end{aligned}$$

That is, \(E(S^{2})=\overline{\sigma ^{2}} -\bar{{\sigma }}_{..} +\sigma _{\mu }^{2}\), since \(tr(M\Sigma )=(n-1)\overline{\sigma ^{2}} -(n-1)\bar{{\sigma }}_{..}\) and \({\mu }^{\prime }M\mu =\sum \nolimits _{i=1}^{n}{(\mu _{i}-\bar{{\mu }})^{2}} =(n-1)\sigma _{\mu }^{2}\), where \(\bar{{\sigma }}_{..} =\binom{n}{2}^{-1}\sum \nolimits _{i=2}^{n}{\sum \nolimits _{j=1}^{i-1} {\sigma _{ij}}}\) is the mean of the off-diagonal elements of \(\Sigma \). \(\square \)
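
Theorem 2.1(b) can also be verified by simulation for any joint distribution with the prescribed first two moments; the sketch below (a purely illustrative check under an assumed multivariate normal model with arbitrarily chosen \(\mu _{i}\), \(\sigma _{i}\), and \(\rho _{ij}\)) compares a Monte Carlo estimate of \(E(S^{2})\) with \(\overline{\sigma ^{2}}+\sigma _{\mu }^{2}-\bar{\sigma }_{..}\).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
mu = np.array([1.0, 2.0, 0.5, 3.0])                  # unequal means
sd = np.array([1.0, 0.5, 2.0, 1.5])                  # unequal standard deviations
R = np.array([[1.0, 0.3, 0.1, 0.0],                  # an arbitrary valid correlation matrix
              [0.3, 1.0, 0.2, 0.1],
              [0.1, 0.2, 1.0, 0.4],
              [0.0, 0.1, 0.4, 1.0]])
Sigma = np.outer(sd, sd) * R                         # covariance matrix

sigma2_bar = np.mean(sd ** 2)                        # average variance, see (2.1)
sigma2_mu = np.sum((mu - mu.mean()) ** 2) / (n - 1)  # variation among the means, see (2.1)
sigma_bar_dd = Sigma[np.triu_indices(n, k=1)].mean() # mean off-diagonal covariance, see (2.2)
rhs = sigma2_bar + sigma2_mu - sigma_bar_dd          # right-hand side of (2.4)

X = rng.multivariate_normal(mu, Sigma, size=200_000)
print(np.var(X, axis=1, ddof=1).mean(), rhs)         # the two values should nearly agree
```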

3 Some Deductions and Mathematical Application of the Result

In this section, we deduce a number of corollaries from Theorem 2.1. But first we have two examples to illustrate part (a) of Theorem 2.1.

Example 3.1

Suppose that \(f(x_{i})=\frac{2}{3}(x_{i}+1),\,0<x_{i}<1\,(i=1,2,3)\) and the dependent sample \((X_{1},X_{2},X_{3})\) is governed by the probability density function \(f(x_{1},x_{2},x_{3})=\frac{2}{3}(x_{1}+x_{2}+x_{3})\), (cf. [5, p. 128]). Then it can be checked that \(E(X_{i})=5/9, \,E(X_{i}^{2})=7/18, \,E(X_{i}X_{j})=11/36\), \(V(X_{i})=13/162\), and \(Cov(X_{i},X_{j})=-1/324 \, (i=1,2,3;j=1,2,3;i\ne j)\). Let \(S^{2}=\sum \nolimits _{i=1}^{3}{(X_{i}} -\bar{{X}})^{2}/2\) be the sample variance. Then by Theorem 2.1(a),

$$\begin{aligned} 6E(S^{2})=E(X_{1}-X_{2})^{2}+E(X_{1}-X_{3})^{2}+E(X_{2}-X_{3})^{2}. \end{aligned}$$

Since \(E(X_{i}-X_{j})^{2}=(7/18)+(7/18)-2(11/36)=1/6\,(i=1,2,3;\,j=1,2,3;\,i\ne j)\), we have \(6E(S^{2})=3\times (1/6)=1/2\), i.e., \(E(S^{2})=1/12\).
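
Because the density of Example 3.1 is bounded by 2 on the unit cube, a dependent sample can be drawn by the rejection method mentioned in Sect. 1, giving a numerical check of \(E(S^{2})=1/12\); the sketch below is only an illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_dependent_triples(size, rng):
    """Rejection sampling from f(x1, x2, x3) = (2/3)(x1 + x2 + x3) on (0, 1)^3."""
    rows = []
    while len(rows) < size:
        x = rng.uniform(size=(size, 3))              # uniform proposal g = 1 on the cube
        u = rng.uniform(size=size)
        accept = u < x.sum(axis=1) / 3.0             # accept with probability f / (M g), M = 2
        rows.extend(x[accept])
    return np.array(rows[:size])

X = sample_dependent_triples(200_000, rng)
print(np.var(X, axis=1, ddof=1).mean())              # close to 1/12 ≈ 0.0833
```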

Example 3.2

Let \(X_{i}\)’s \((i=1,2,\ldots ,n)\) be independently, identically, and normally distributed as \(N(\mu ,\sigma ^{2})\). Since \(X_{i}-X_{j}\sim N(0,\,2\sigma ^{2}),\,i\ne j\), by Theorem 2.1(a), we have \(n(n-1)E(S^{2})=\,\sum \nolimits _{i=2}^{n}{\sum \nolimits _{j=1}^{i-1} {E(X_{i}-X_{j})^{2}}} =\sum \nolimits _{i=2}^{n}{\sum \nolimits _{j=1}^{i-1} {(0^{2}+2\sigma ^{2})}} =\frac{n(n-1)}{2}\times (2\sigma ^{2})\).

That is \(E(S^{2})=\sigma ^{2}\).

The following corollary is the special case of Theorem 2.1(b) obtained when \(\rho _{ij} =0\) for all \(i\ne j\,(i,j=1,2,\ldots ,n)\).

Corollary 3.1

Let \(X_{i}\)’s \(\,(i=1,2,\ldots ,n)\) be uncorrelated random variables with finite mean \(E(X_{i})=\mu _{i}\,(i=1,2,\ldots ,n)\) and finite variance \(V(X_{i})=\sigma _{i}^{2}\,(i=1,2,\ldots ,n)\). Then \(E(S^{2})=\overline{\sigma ^{2}} +\sigma _{\mu }^{2}\).

Note that if we form a matrix of the correlation coefficients \(\rho _{ij}\,(i=1,2,\ldots ,n;\,j=1,2,\ldots ,n)\), then, by the symmetry of the correlation coefficients, the sum of the elements in the lower triangle (say \(\rho ^{*}\)) equals the sum of those in the upper triangle, i.e.,

$$\begin{aligned} \rho ^{*}=\sum \limits _{i=2}^{n}{\sum \limits _{j=1}^{i-1} {\rho _{ij}} =} \sum \limits _{i=1}^{n-1} {\sum \limits _{j=i+1}^{n}{\rho _{ij}}} . \end{aligned}$$

Hence \(\rho ^{*}+n+\rho ^{*}=n^{2}\bar{{\rho }}_{..}\) (the \(n\) diagonal elements each being 1), where

$$\begin{aligned} n^{2}\bar{{\rho }}_{..} =\sum \limits _{i=1}^{n}{\sum \limits _{j=1}^{n}{\rho _{ij}}} , \end{aligned}$$
(3.1)

so that \(\rho ^{*}=n(n\bar{{\rho }}_{..} -1)/2\).

If \(V(X_{i})=\sigma ^{2},\,(i=1,2,\ldots n)\) in Theorem 2.1 (b), then \(\overline{\sigma ^{2}} =\sigma ^{2}\) and \(\bar{{\sigma }}_{..} =(n\bar{{\rho }}_{..} -1)\sigma ^{2}/(n-1)\) where \(\bar{{\rho }}_{..}\) is defined by (3.1) and we have the following corollary.

Corollary 3.2

Let \(X_{i}\)’s \(\,(i=1,2,\ldots ,n)\) be random variables with \(E(X_{i})=\mu _{i},\,V(X_{i})=\sigma ^{2},\, Cov(X_{i},X_{j})=\rho _{ij} \sigma ^{2}\,(i=1,2,\ldots n;j=1,2,\ldots ,n;i\ne j)\) whenever they exist. Then \(E(S^{2})=\left( {1-\frac{n\bar{{\rho }}_{..} -1}{n-1}}\right) \sigma ^{2}+\sigma _{\mu }^{2}\).

If \(V(X_{i})=\sigma ^{2}\) and \(\rho _{ij} =\rho \,(i=1,2,\ldots ,n;\,j=1,2,\ldots ,n;\,i\ne j)\) in Theorem 2.1(b), then \(\overline{\sigma ^{2}} =\sigma ^{2}\) and \(\bar{{\sigma }}_{..} =\rho \sigma ^{2}\), and we have the following corollary.

Corollary 3.3

Let \(X_{i}\)’s \((i=1,2,\ldots ,n)\) be random variables with \(E(X_{i})=\mu _{i},\,V(X_{i})=\sigma ^{2}, \,Cov(X_{i},X_{j})=\rho \sigma ^{2}\,(i=1,2,\ldots n;j=1,2,\ldots ,n;i\ne j)\) whenever they exist. Then \(E(S^{2})=(1-\rho )\sigma ^{2}+\sigma _{\mu }^{2}\le 2\sigma ^{2}+\sigma _{\mu }^{2}\).

If \(V(X_{i})=\sigma ^{2}\) and \(\rho _{ij} =0\,(i=1,2,\ldots ,n;\,j=1,2,\ldots ,n;\,i\ne j)\) in Theorem 2.1(b), then \(\overline{\sigma ^{2}} =\sigma ^{2}\) and \(\bar{{\sigma }}_{..} =0\), and we have the following corollary.

Corollary 3.4

Let \(X_{i}\,(i=1,2,\ldots ,n)\)’s be random variables with \(E(X_{i})=\mu _{i},\,V(X_{i})=\sigma ^{2},\, Cov(X_{i},X_{j})=0,\, (i=1,2,\ldots n;j=1,2,\ldots ,n;\,i\ne j)\) whenever they exist. Then \(E(S^{2})=\sigma ^{2}+\sigma _{\mu }^{2}\).

An example is provided below to illustrate the situation.

Example 3.3

Let \(f(x_{i})=\frac{\Gamma ((\nu +1)/2)}{\sqrt{\nu \pi }\sigma \Gamma (\nu /2)}\,\left( {1+\frac{1}{\nu \sigma ^{2}}(x_{i}-\mu _{i})^{2}}\right) ^{-(\nu +1)/2},\,\nu >2,\,(i=1,2,3)\) and the dependent observations \((X_{1},X_{2},\,X_{3})\) be governed by the probability density function

$$\begin{aligned} f(x_{1},x_{2},x_{3})&= \frac{\Gamma ((\nu +3)/2)}{(\nu \pi )^{3/2}\sigma ^{3}\Gamma (\nu /2)}\,\nonumber \\&\quad \times \left( 1+\frac{1}{\nu \sigma ^{2}}\left[ (x_{1}-\mu _{1})^{2}+ (x_{2}-\mu _{2})^{2}+(x_{3}- \mu _{3})^{2}\right] \right) ^{-(\nu +3)/2}, \end{aligned}$$
(3.2)

\(-\infty <x_{i}<\infty \,(i=1,2,3),\,\sigma >0, \,\nu >2\) (cf. [1, p. 55]). Since \(V(X_{i})=\frac{\nu \sigma ^{2}}{\nu -2},\,\nu >2\), \((i=1,2,3)\), and \(Cov(X_{i},X_{j})=0,\,(i=1,2,3;\,j=1,2,3;\,i\ne j)\), it follows from Corollary 3.4 that \(E(S^{2})=\frac{\nu \sigma ^{2}}{\nu -2}+\sigma _{\mu }^{2},\,\nu >2\) where \(S^{2}\) is the sample variance and \(2\sigma _{\mu }^{2}=(\mu _{1}-\bar{{\mu }})^{2}+(\mu _{2}- \bar{{\mu }})^{2}+(\mu _{3}-\bar{{\mu }})^{2},\,3\bar{{\mu }} =\mu _{1}+\mu _{2}+\mu _{3}\).
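
A sample from (3.2) can be generated by the conditional distribution method of Sect. 1: draw \(\nu /\Upsilon ^{2}\sim \chi _{\nu }^{2}\) and then, given \(\Upsilon =\upsilon \), draw \(X_{i}\sim N(\mu _{i},\upsilon ^{2}\sigma ^{2})\) with one shared \(\upsilon \) per triple. The sketch below (with arbitrarily assumed values of \(\mu _{i}\), \(\sigma \), and \(\nu \)) illustrates \(E(S^{2})=\nu \sigma ^{2}/(\nu -2)+\sigma _{\mu }^{2}\).

```python
import numpy as np

rng = np.random.default_rng(4)
nu, sigma = 5.0, 2.0                         # assumed degrees of freedom and scale
mu = np.array([1.0, 3.0, 5.0])               # assumed component means
reps = 400_000

upsilon = np.sqrt(nu / rng.chisquare(nu, size=reps))   # nu / Upsilon^2 ~ chi^2_nu
Z = rng.standard_normal(size=(reps, 3))
X = mu + sigma * upsilon[:, None] * Z        # X_i | Upsilon = upsilon ~ N(mu_i, upsilon^2 sigma^2)

sigma2_mu = np.sum((mu - mu.mean()) ** 2) / 2
theory = nu * sigma ** 2 / (nu - 2) + sigma2_mu
print(np.var(X, axis=1, ddof=1).mean(), theory)        # both approximately 10.67
```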

Corollary 3.5

Let \(X_{i}\)’s \((i=1,2,\ldots ,n)\) be random variables with \(E(X_{i})=\mu ,\,V(X_{i})=\sigma _{i}^{2}\), \(\sigma _{ij} =\rho _{ij} \sigma _{i}\sigma _{j},\, (i=1,2,\ldots ,n;\,j=1,2,\ldots ,n;\,i\ne j)\) whenever they exist. Then \(E(S^{2})=\overline{\sigma ^{2}} -\bar{{\sigma }}_{..} \) where \(\overline{\sigma ^{2}} \) is defined by (2.1) and \(\bar{{\sigma }}_{..} \) is defined in (2.2).

Corollary 3.6

Let \(X_{i}\)’s \((i=1,2,\ldots ,n)\) be random variables with \(E(X_{i})=\mu ,\,V(X_{i})=\sigma _{i}^{2},\,Cov(X_{i},X_{j})=\rho \sigma _{i}^{2},\,(i=1,2,\ldots ,n;\,j=1,2,\ldots ,n;\,i\ne j)\) whenever they exist. Then \(E(S^{2})=\overline{\sigma ^{2}} -\binom{n}{2}^{-1}\rho \sum \nolimits _{i=2}^{n}{(i-1)\sigma _{i}^{2}}\).

Corollary 3.7

Let \(X_{i}\)’s \((i=1,2,\ldots ,n)\) be random variables with \(E(X_{i})=\mu ,\,V(X_{i})=\sigma _{i}^{2}\), \(Cov(X_{i},X_{j})=0,\,(i=1,2,\ldots n;j=1,2,\ldots ,n;i\ne j)\) whenever they exist. Then \(E(S^{2})=\overline{\sigma ^{2}}\).

Corollary 3.8

Let \(X_{i}\)’s \((i=1,2,\ldots ,n)\) be independently distributed random variables with finite mean \(E(X_{i})=\mu \,(i=1,2,\ldots ,n)\) and finite variance \(V(X_{i})=\sigma _{i}^{2}\,(i=1,2,\ldots ,n)\). Then \(E(S^{2})=\overline{\sigma ^{2}}\).

Corollary 3.9

Let \(X_{i}\)’s \((i=1,2,\ldots ,n)\) be random variables with \(E(X_{i})=\mu ,\,V(X_{i})=\sigma ^{2}\), \(Cov(X_{i},X_{j})=\rho _{ij} \sigma ^{2},\,(i=1,2,\ldots n;j=1,2,\ldots ,n;i\ne j)\) whenever they exist. Then \(E(S^{2})=\left( {1-\frac{n\bar{{\rho }}_{..} -1}{n-1}}\right) \sigma ^{2}\) where \(\bar{{\rho }}_{..} \) is defined by (3.1).

Corollary 3.10

Let \(X_{i}\)’s \((i=1,2,\ldots ,n)\) be identically distributed random variables i.e., \(E(X_{i})=\mu ,\,V(X_{i})=\sigma ^{2}\,(i=1,2,\ldots n)\) and \(Cov(X_{i},X_{j})=\rho \sigma ^{2}\,(i=1,2,\ldots n;\, j=1,2,\ldots ,n;i\ne j)\) whenever they exist. Then \(E(S^{2})=(1-\rho )\sigma ^{2}\) (cf. [13]).

Two examples are given below to illustrate the above corollary.

Example 3.4

In Example 3.1, \(\sigma ^{2}=13/162\) and \(\rho =-1/26\). Then by Corollary 3.10, we have \(E(S^{2})=(1-\rho )\sigma ^{2}=1/12\) where \(S^{2}=\sum \limits _{i=1}^{3}{(X_{i}} -\bar{{X}})^{2}/2\).
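
Corollary 3.10 can also be checked directly by simulation using exchangeable (equicorrelated) normal observations; the sketch below uses arbitrarily chosen \(\mu \), \(\sigma ^{2}\), and \(\rho \) (any \(\rho >-1/(n-1)\) gives a valid covariance matrix).

```python
import numpy as np

rng = np.random.default_rng(5)
n, mu, sigma2, rho = 5, 10.0, 4.0, 0.3
Sigma = sigma2 * ((1 - rho) * np.eye(n) + rho * np.ones((n, n)))   # equicorrelation structure

X = rng.multivariate_normal(np.full(n, mu), Sigma, size=200_000)
print(np.var(X, axis=1, ddof=1).mean(), (1 - rho) * sigma2)        # both approximately 2.8
```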

Example 3.5

Let \(X_{i}\sim N(0,1),\,(i=1,2)\) and the sample \((X_{1},X_{2})\) (not necessarily independent) have the joint density function

$$\begin{aligned} f(x_{1},x_{2})&= \frac{1}{2\pi }\exp \left[ {-\frac{1}{2}(x_{1}^{2}+x_{2}^{2})}\right] \left[ {1+x_{1}x_{2}\exp \left( {-\frac{1}{2}(x_{1}^{2}+x_{2}^{2}-2)}\right) }\right] , \nonumber \\&-\infty <x_{1},x_{2}<\infty , \end{aligned}$$
(3.3)

(cf. [4, p. 121]). Splitting the joint density function (3.3) into two parts, we can easily show that \(E(X_{1}X_{2})=E(X_{1})E(X_{2})+\frac{e}{2\pi }\,I^{2}\), where \(I=\int _{-\infty }^{\infty } {x^{2}e^{-x^{2}}} dx=\sqrt{\pi }/2\). But \(E(X_{i})=0\) and \(V(X_{i})=1\,(i=1,2)\); hence \(Cov(X_{1},X_{2})=E(X_{1}X_{2})-E(X_{1})E(X_{2})=e/8\), which is also the correlation coefficient \(\rho \) between \(X_{1}\) and \(X_{2}\). Hence, by virtue of Corollary 3.10, we have \(E(S^{2})=1-e/8\).
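
The value \(Cov(X_{1},X_{2})=e/8\approx 0.34\) can be confirmed by numerical integration of \(x_{1}x_{2}f(x_{1},x_{2})\); a short check using scipy (ours, for illustration only):

```python
import numpy as np
from scipy.integrate import dblquad

def f(x1, x2):
    """Joint density (3.3)."""
    base = np.exp(-0.5 * (x1 ** 2 + x2 ** 2)) / (2 * np.pi)
    return base * (1 + x1 * x2 * np.exp(-0.5 * (x1 ** 2 + x2 ** 2 - 2)))

# E(X1 X2); since the marginal means are 0, this is also Cov(X1, X2)
cov, _ = dblquad(lambda x2, x1: x1 * x2 * f(x1, x2),
                 -8.0, 8.0, lambda x1: -8.0, lambda x1: 8.0)
print(cov, np.e / 8, 1 - np.e / 8)           # Cov(X1, X2), e/8, and E(S^2) = 1 - e/8
```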

Part (b) of Theorem 2.1 is specialized below for uncorrelated but identically distributed random variables.

Corollary 3.11

Let \(X_{i}\)’s \((i=1,2,\ldots ,n)\) be uncorrelated but identically distributed random variables i.e., \(E(X_{i})=\mu ,\, V(X_{i})=\sigma ^{2},\,(i=1,2,\ldots ,n)\) and \(Cov(X_{i},X_{j})=0,\, (i=1,2,\ldots ,n;\,j=1,2,\ldots ,n;i\ne j)\) whenever they exist. Then \(E(S^{2})=\sigma ^{2}\).

Two examples are given below to illustrate Corollary 3.11.

Example 3.6

Let \(X_{i}\sim N(0,1),\,(i=1,2,3)\) and the sample \((X_{1},X_{2},\,X_{3})\) be governed by the joint density function

$$\begin{aligned} f(x_{1},x_{2},x_{3})&= \left( {\frac{1}{2\pi }}\right) ^{3/2}\exp \left[ {-\frac{1}{2}\left( x_{1}^{2}+x_{2}^{2}+x_{3}^{2} \right) }\right] \nonumber \\&\times \left[ {1+x_{1}x_{2}x_{3}\,\exp \left( {-\frac{1}{2}\left( x_{1}^{2}+x_{2}^{2}+x_{3}^{2} \right) }\right) }\right] , \end{aligned}$$
(3.4)

\(-\infty <x_{i}<\infty \,(i=1,2,3)\) (cf. [4, p. 121]). Then it can be proved that the sample observations are pair-wise statistically independent with each pair having a standard bivariate normal distribution. We thus have \(E(X_{i})=0\), \(V(X_{i})=1\) and \(Cov(X_{i},X_{j})=0\,(i=1,2,3;j=1,2,3;i\ne j)\). By virtue of Corollary 3.11, we have \(E(S^{2})=1\) where \(S^{2}=\sum \nolimits _{i=1}^{3}{(X_{i}} -\bar{{X}})^{2}/2\).

Example 3.7

Let \(X_{i}(i=1,2,3)\) have a univariate \(t\)-distribution with density function

$$\begin{aligned} f(x_{i})=\frac{\Gamma ((\nu +1)/2)}{\sqrt{\nu \pi }\Gamma (\nu /2)}\,\left( {1+\frac{1}{\nu }x_{i}^{2}}\right) ^{-(\nu +1)/2},\quad \nu >2,\;\; (i=1,2,3) \end{aligned}$$

and the sample \((X_{1},X_{2},\,X_{3})\) be governed by the joint density function of a trivariate \(t\)-distribution given by

$$\begin{aligned} f(x_{1},x_{2},x_{3})=\frac{\Gamma ((\nu +3)/2)}{(\nu \pi )^{3/2}\Gamma (\nu /2)}\,\left( {1+\frac{1}{\nu }(x_{1}^{2} +x_{2}^{2}+x_{3}^{2})}\right) ^{-(\nu +3)/2}, \end{aligned}$$
(3.5)

\(-\infty <x_{i}<\infty \,(i=1,2,3)\) [1, p. 55].

Obviously, \(X_{1},X_{2}\), and \(X_{3}\) are independent only in the limit as \(\nu \rightarrow \infty \). It can be shown that \((X_{i}|\Upsilon =\upsilon )\sim N(0,\upsilon ^{2})\,(i=1,2,3)\), where \(\nu /\Upsilon ^{2}\sim \chi _{\nu }^{2}\), and that the \(X_{i}\)’s \((i=1,2,3)\) are pair-wise uncorrelated, with each pair having a standard bivariate \(t\)-distribution with probability density function

$$\begin{aligned} f(x_{i},x_{j})=\frac{1}{2\pi }\,\left( {1+\frac{1}{\nu }(x_{i}^{2}+x_{j}^{2})}\right) ^{-(\nu +2)/2}, \end{aligned}$$
(3.6)

\(-\infty <x_{i},x_{j}<\infty \,(i,j=1,2,3;\,i\ne j)\). Since \(E(X_{i})=0\), \(V(X_{i})=\nu /(\nu -2),\,\nu >2\,(i=1,2,3)\), and \(Cov(X_{i},X_{j})=0\,(i=1,2,3;\,j=1,2,3;\,i\ne j)\), it follows from Corollary 3.11 that \(E(S^{2})=\nu /(\nu -2),\,\nu >2\), where \(S^{2}=\sum \nolimits _{i=1}^{3}{(X_{i}} -\bar{{X}})^{2}/2\) is the sample variance. A realistic example based on stock returns is considered in [14].

Corollary 3.12

Let the \(X_{i}\)’s \((i=1,2,\ldots ,n)\) be independently and identically distributed random variables, i.e., \(E(X_{i})=\mu ,\,V(X_{i})=\sigma ^{2}\,(i=1,2,\ldots ,n)\), whenever these exist. Then by Theorem 2.1(b), \(E(S^{2})=\sigma ^{2}\), which can also be written as \(E(S^{2})=\frac{1}{2}E(X_{1}-X_{2})^{2}=\sigma ^{2}\) by Theorem 2.1(a).

Example 3.8

Let \(X_{j}~(j=1,2,\ldots ,n)\)’s be independently and identically distributed Bernoulli random variables \(B(1,p)\). Then by Corollary 3.12, we have \(E(S^{2})=p(1-p)\).
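
Corollary 3.12 and Example 3.8 can be illustrated together: for i.i.d. Bernoulli(\(p\)) draws, the average of \(\frac{1}{2}(X_{1}-X_{2})^{2}\) over many pairs and the average sample variance both approach \(p(1-p)\). A minimal sketch with an assumed \(p\):

```python
import numpy as np

rng = np.random.default_rng(7)
p, n, reps = 0.3, 6, 200_000
X = rng.binomial(1, p, size=(reps, n)).astype(float)

half_sq_diff = 0.5 * (X[:, 0] - X[:, 1]) ** 2          # (1/2)(X1 - X2)^2, Theorem 2.1(a)
print(np.var(X, axis=1, ddof=1).mean(),                # Corollary 3.12
      half_sq_diff.mean(),
      p * (1 - p))                                     # all approximately 0.21
```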

Example 3.9

Let \(X_{j}(j=1,2,\ldots ,n)\)’s be independently and identically distributed as \(N(\mu ,\sigma ^{2})\). Then by Corollary 3.12, we have \(E(S^{2})=\sigma ^{2}\) which is well known [11].

Similarly, the expected sample variance equals the population variance in (i) an exponential population with mean \(E(X)=\beta \) and variance \(\beta ^{2}\), and (ii) a gamma population \(G(\alpha ,\beta )\) with mean \(E(X)=\alpha \beta \) and variance \(V(X)=\alpha \beta ^{2}\).

4 An Application in Textile Engineering

A textile company weaves fabric on a large number of looms. The company would like the looms to be homogeneous so that it obtains fabric of uniform strength. The process engineer suspects that, in addition to the usual variation in strength for samples of fabric from the same loom, there may be significant variation in mean strength between looms. To investigate this, the engineer selects four looms at random and makes four strength determinations on the fabric manufactured on each loom. The experiment is run in random order, and the data obtained are shown below.

Loom   Observations          Total
1      98   97   99   96     390
2      91   90   93   92     366
3      96   95   97   95     383
4      95   96   99   98     388

Montgomery [12, p. 514].

Consider a random effects linear model given by

$$\begin{aligned} y_{ij} =\mu +\tau _{i}+\varepsilon _{ij} , \quad i=1,2,\ldots ,a;\quad j=1,2,\ldots ,n_{i} \end{aligned}$$

where \(\mu \) is a constant, \(\tau _{i}\) has mean 0 and variance \(\sigma _{\tau }^{2}\), and the errors \(\varepsilon _{ij}\) have mean 0 and variance \(\sigma ^{2}\). Assume also that \(\tau _{i}\) and \(\varepsilon _{ij}\) are uncorrelated. Then \(\bar{{Y}}_{i.} =\frac{1}{n_{i}}\sum \nolimits _{j=1}^{n_{i}} {Y_{ij}}\) has mean \(E(\bar{{Y}}_{i.} )=\mu \) and variance

$$\begin{aligned} V(\bar{{Y}}_{i.} )=\sigma _{\tau }^{2}+\frac{1}{n_{i}}\sigma ^{2}. \end{aligned}$$
(4.1)

If we write \(S_{\bar{{Y}}_{i.}}^{2}\!\!=\!\!\frac{1}{a-1} \sum \nolimits _{i=1}^{a}{(\bar{{Y}}_{i.} -\bar{{Y}}_{..} )}^{2}\), then by Corollary 3.8, \(E(S_{\bar{{Y}}_{i.}}^{2})=\frac{1}{a} \sum \nolimits _{i=1}^{a}{V(\bar{{Y}}_{i.} )}\), which, by virtue of (4.1), can be written as

$$\begin{aligned} E\left( S_{\bar{{Y}}_{i.}}^{2}\right) =\frac{1}{a} \sum \nolimits _{i=1}^{a}{\left( {\sigma _{\tau }^{2}+ \frac{1}{n_{i}}\sigma ^{2}}\right) } , \end{aligned}$$

so that

$$\begin{aligned} E\left( S_{\bar{{Y}}_{i.}}^{2}\right) =\sigma _{\tau }^{2}+ \frac{\sigma ^{2}}{a}\sum \nolimits _{i=1}^{a}{\frac{1}{n_{i}}} . \end{aligned}$$
(4.2)
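
Equation (4.2) can be checked by simulating the random effects model with assumed (illustrative) values of \(\sigma _{\tau }^{2}\), \(\sigma ^{2}\), and unbalanced group sizes \(n_{i}\); a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(6)
a, n_i = 4, np.array([4, 6, 3, 5])            # number of groups and (assumed) group sizes
mu, sigma_tau2, sigma2 = 95.0, 6.0, 2.0       # assumed model parameters
reps = 200_000

tau = rng.normal(0.0, np.sqrt(sigma_tau2), size=(reps, a))
# Ybar_i. = mu + tau_i + mean of n_i errors; that mean is N(0, sigma^2 / n_i)
err_bar = np.column_stack([rng.normal(0.0, np.sqrt(sigma2 / m), size=reps) for m in n_i])
ybar = mu + tau + err_bar

theory = sigma_tau2 + sigma2 / a * np.sum(1.0 / n_i)
print(np.var(ybar, axis=1, ddof=1).mean(), theory)     # both approximately 6.475
```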

The Sum of Squares due to Treatment (looms here) is \(SST=\sum \nolimits _{i=1}^{a}{\sum \nolimits _{j=1}^{n_{i}} {(\bar{{y}}_{i.}}} -\bar{{y}}_{..} )^{2}\) which can be written as \(SST=(a-1)\sum \nolimits _{j=1}^{n_{i}} {S_{\bar{{Y}}_{i.}}^{2}} \) so that by (4.2), we have

$$\begin{aligned} E(SST)=(a-1)\sum \nolimits _{j=1}^{n_{i}} {\left( {\sigma _{\tau }^{2}+\frac{\sigma ^{2}}{a} \sum \nolimits _{i=1}^{a}{\frac{1}{n_{i}}}}\right) } \end{aligned}$$
(4.3)

which, in the balanced case, simplifies to \(E(SST)=(a-1)(n\sigma _{\tau }^{2}+\sigma ^{2})\). Then the expected mean Sum of Squares due to Treatment is given by

$$\begin{aligned} E(MST)=n\sigma _{\tau }^{2}+\sigma ^{2}. \end{aligned}$$
(4.4)

The Sum of Squares due to Error is given by \(SSE=\sum \nolimits _{i=1}^{a}{\sum \nolimits _{j=1}^{n_{i}} {(y_{ij} -\bar{{y}}_{i.}}})^{2}\), which can be written as \(SSE=\sum \nolimits _{i=1}^{a}{\sum \nolimits _{j=1}^{n_{i}} {(\varepsilon _{ij} -\bar{{\varepsilon }}_{i.}}} )^{2}\). Since \(E(\varepsilon _{ij} )=0\) and \(V(\varepsilon _{ij} )=\sigma ^{2}\), by Corollary 3.12 we have \(E\left( {\frac{1}{n_{i}-1}\sum \nolimits _{j=1}^{n_{i}} {(\varepsilon _{ij} -\bar{{\varepsilon }}_{i.}} )^{2}}\right) =\sigma ^{2}\), so that \(E(SSE)=\sum \nolimits _{i=1}^{a}{(n_{i}-1)\sigma ^{2}}\), which, in the balanced case, simplifies to \(E(SSE)=a(n-1)\sigma ^{2}\), so that the expected mean Sum of Squares due to Error is given by

$$\begin{aligned} E(MSE)=\sigma ^{2}. \end{aligned}$$
(4.5)

By virtue of (4.4) and (4.5), a test of \(H_{0}:\sigma _{\tau }^{2}=0\) against \(H_{1}:\sigma _{\tau }^{2}>0\) can be based on \(F=\frac{MST}{MSE}\), where the variance ratio has the usual \(F\) distribution with \((a-1)\) and \(a(n-1)\) degrees of freedom if the errors are normally distributed.

It can easily be checked that the Centered (total) Sum of Squares \((CSS=SST+SSE)\) and the Sum of Squares due to treatments \((SST=(a-1)MST)\) for the experiment above are given by

$$\begin{aligned} CSS&= (98)^{2}+(97)^{2}+\cdots +(98)^{2}-\left( {(1527)^{2}/4(4)}\right) \approx 111.94,\\ SST&= \frac{1}{4}\left( {(390)^{2}+(366)^{2}+\cdots +(388)^{2}}\right) -\left( {(1527)^{2}/4(4)}\right) =89.19, \end{aligned}$$

respectively, so that \(SSE=CSS-SST=22.75\) and \(F=\frac{MST}{MSE}=\frac{89.19/3}{22.75/12}=15.68\), which is much larger than \(f_{0.05}(3,12)=3.49\). The \(p\) value is smaller than 0.001. This suggests that there is significant variation in fabric strength between the looms.
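
The analysis above can be reproduced in a few lines; the sketch below (using scipy.stats.f only for the reference distribution) recomputes \(CSS\), \(SST\), \(F\), and the \(p\) value from the loom data.

```python
import numpy as np
from scipy import stats

looms = np.array([[98, 97, 99, 96],
                  [91, 90, 93, 92],
                  [96, 95, 97, 95],
                  [95, 96, 99, 98]], dtype=float)
a, n = looms.shape                               # a = 4 looms, n = 4 observations per loom

grand = looms.sum()
css = (looms ** 2).sum() - grand ** 2 / (a * n)  # centered (total) sum of squares
sst = (looms.sum(axis=1) ** 2).sum() / n - grand ** 2 / (a * n)
sse = css - sst

mst, mse = sst / (a - 1), sse / (a * (n - 1))
F = mst / mse
p = stats.f.sf(F, a - 1, a * (n - 1))
print(round(css, 2), round(sst, 2), round(F, 2), p)   # 111.94 89.19 15.68, p < 0.001
```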

5 Conclusion

The general method for the expectation of the sample variance developed here is important when observations have non-identical distributions, whether in their means, variances, or covariances. Part (a) of Theorem 2.1 states that the expected variance depends on the expected squared differences of pairs of observations, while part (b) states that it depends on the average of the variances of the observations, the variation among the true means, and the average of the covariances of pairs of observations. The theorem is potentially useful in time series analysis, design of experiments, and psychometrics, where the observations are not necessarily independently and identically distributed. Because no distributional form is assumed in obtaining the main results, the theorem applies without any normality assumption. The results (4.4) and (4.5) are usually derived in Design and Analysis or other statistics courses by distribution theory based on strong distributional assumptions, mostly normality. It is worth mentioning that the assumption of normality of the errors can be relaxed to a broader class of distributions, say, elliptical distributions. See for example [9].