1 Introduction

Coefficient alpha (Guttman 1945; Cronbach 1951) is a very popular lower bound to the reliability of the total score (the unweighted sum of the item scores). Sijtsma (2009) criticized the use of coefficient alpha for assessing the reliability of the total score and recommended the use of greater lower bounds, such as the greatest lower bound (Woodhouse and Jackson 1977; ten Berge et al. 1981) and coefficient lambda-2 (Guttman 1945). Despite the existence of greater lower bounds to the reliability of the total score, coefficient alpha continues to be used in practice. For an overview of many other lower bounds to the reliability of the total score, see Revelle and Zinbarg (2009).

It has been shown that coefficient alpha equals the communality of the total score if the item scores follow the essentially tau-equivalent model (Bentler 2009). The essentially tau-equivalent model is the unrealistic special case of the one-factor model in which all item scores have the same factor loading (Lord and Novick 1968). Under the essentially tau-equivalent model, coefficient alpha is only equal to the reliability of the total score if all unique factors only contain random measurement error. However, under the essentially tau-equivalent model, the communality of the total score equals the proportion of variance of the total score explained by the common factor. This means that under the essentially tau-equivalent model, coefficient alpha assesses the validity of the total score as a measure of the common factor.

A more realistic model for the measurement of a single factor by a set of items than the essentially tau-equivalent model is the one-factor model (Spearman 1950). Under the one-factor model, factor loadings are not restricted to be equal. Since, in practice, the items of a subtest are usually constructed to measure one and the same latent factor, the item scores of a subtest are often assumed to follow the one-factor model. Under the one-factor model, the communality of the total score is given by coefficient omega (Heise and Bohrnstedt 1970; McDonald 1978) and equals the proportion of variance of the total score explained by the common factor. So under the one-factor model, coefficient omega assesses the validity of the total score as a measure of the common factor.

In this chapter, a new expression for the communality of the total score under the one-factor model is presented. Whereas coefficient omega expresses the communality of the total score in terms of factor model parameters, this new expression is in terms of the variances of the item scores, the covariances between the item scores, and the number of item scores. It is shown that the new expression equals coefficient alpha if the item scores follow the essentially tau-equivalent model. Furthermore, new expressions of the communality of an arbitrary item score and the proportion of total variance explained are derived under the one-factor model. Since all new expressions are functions of the population variances of the item scores and the population covariances between the item scores, distribution-free closed-form estimates are obtained by replacing the population parameters with sample analogues.

First, however, the one-factor model is briefly outlined in the next section. Subsequently, the new communality expressions and their closed-form estimates are presented. Finally, the closed-form estimates of the new communality expressions are calculated for a classic example data set.

2 The One-Factor Model

Let the random variables \(X_{1},X_{2},\ldots ,X_{J}\) be J item scores for a randomly selected individual from a population. The means of \(X_{1},X_{2},\ldots ,X_{J}\) are denoted by \(b_{1},b_{2},\ldots ,b_{J}\); the variances are denoted by \(\sigma ^{2}_{1},\sigma ^{2}_{2},\ldots ,\sigma ^{2}_{J}\); and the covariance between two arbitrary item scores \(X_{j}\) and \(X_{k}\) is denoted by \(\sigma _{jk}\), for all j and k ≠ j. In the one-factor model, it is assumed that

$$\displaystyle \begin{aligned} X_{j}=b_{j}+a_{j}\xi+U_{j},\ \ \mbox{for all }j, \end{aligned} $$
(6.1)

where \(a_{j}\) is a constant factor loading, for all j, ξ is the common factor, and \(U_{j}\) is a unique factor, for all j. Note that \(U_{j}=S_{j}+E_{j}\), where \(S_{j}\) is an item-specific factor (only varying between persons) and \(E_{j}\) is random measurement error (varying between persons and within persons), so that \(T_{j}=b_{j}+a_{j}\xi +S_{j}\) is the item true score. The common factor ξ is assumed to be independent of all unique factors \(U_{1},U_{2},\ldots ,U_{J}\). The unique factors are assumed to be mutually independent. To identify the model, the variance of ξ is set to one. Let \(var(U_{j})=\delta _{j}\), for all j. Then, it follows that \(\sigma ^{2}_{j}=a_{j}^{2}+\delta _{j}\), for all j, and \(\sigma _{jk}=a_{j}a_{k}\), for all j and k ≠ j. Usually, the items are constructed such that it can be assumed that \(a_{j}>0\), for all j. If the items are not constructed this way, a transformation can be applied to some of the item scores such that it can be assumed that \(a_{j}>0\), for all j. Note that if \(a_{j}>0\), for all j, then \(\sigma _{jk}>0\), for all j and k ≠ j.
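The covariance structure implied by these assumptions can be checked numerically. The following Python sketch builds the model-implied covariance matrix from a set of loadings and unique variances. The function name `implied_covariance` and all parameter values are illustrative assumptions and do not come from the chapter.

```python
def implied_covariance(loadings, unique_variances):
    """Covariance matrix implied by the one-factor model with var(xi) = 1.

    Off-diagonal entries are a_j * a_k; diagonal entries are a_j**2 + delta_j.
    """
    J = len(loadings)
    sigma = [[loadings[j] * loadings[k] for k in range(J)] for j in range(J)]
    for j in range(J):
        sigma[j][j] += unique_variances[j]
    return sigma


# Hypothetical parameter values, chosen only for illustration.
a = [0.9, 0.7, 0.5, 0.8]
delta = [0.4, 0.6, 0.3, 0.5]
sigma = implied_covariance(a, delta)

# Print the implied matrix row by row, rounded for readability.
for row in sigma:
    print([round(value, 2) for value in row])
```

Because all hypothetical loadings are positive, every off-diagonal entry of the printed matrix is positive as well, in line with the last remark above.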

3 Communality/Validity

Let \(C_{j}=b_{j}+a_{j}\xi \), for all j. Then, the total score is given by \(X=\sum _{j}X_{j}=C+U\), where \(C=\sum _{j}C_{j}=\sum _{j}b_{j}+\sum _{j}a_{j}\xi \) and \(U=\sum _{j}U_{j}\). The communality of X is given by coefficient omega (Heise and Bohrnstedt 1970; McDonald 1978) and is the squared correlation between X and C, that is,

$$\displaystyle \begin{aligned} \omega=\rho^{2}_{XC}=\frac{var(C)}{\sigma_{X}^{2}}=\frac{\left(\sum_{j}a_{j}\right)^{2}}{\sum_{j}\sigma^{2}_{j}+\sum_{j}\sum_{k\neq j}\sigma_{jk}}=\frac{\sum_{j}a_{j}^{2}+\sum_{j}\sum_{k\neq j}a_{j}a_{k}}{\sum_{j}a_{j}^{2}+\sum_{j}\delta_{j}+\sum_{j}\sum_{k\neq j}a_{j}a_{k}}, \end{aligned} $$
(6.2)

where \(\sigma _{X}^{2}=var(X)\). Note that under the one-factor model, \(\rho ^{2}_{XC}\) is equal to the squared correlation between X and ξ given by

$$\displaystyle \begin{aligned} \rho^{2}_{X\xi}=\frac{\{cov(X,\xi)\}^{2}}{\sigma^{2}_{X}}=\frac{[E\{(\sum_{j}a_{j}\xi+\sum_{j}U_{j})\xi\}]^{2}}{\sigma^{2}_{X}}=\frac{(\sum_{j}a_{j})^{2}}{\sigma^{2}_{X}}. \end{aligned} $$
(6.3)

From this, it can be concluded that under the one-factor model, the communality coefficient \(\rho ^{2}_{XC}\) also assesses the validity of X as a measure of ξ. Now, since \(\sigma _{jk}=a_{j}a_{k}\), for all j ≠ k, it follows that

$$\displaystyle \begin{aligned} \frac{\sigma_{jk}\sigma_{jl}}{\sigma_{kl}}=\frac{a_{j}a_{k}a_{j}a_{l}}{a_{k}a_{l}}=a_{j}^{2},\ \ \mbox{for all }j,\ k\neq j\text{, and }l\neq j,k.\end{aligned}$$

Taking the average over all k ≠ j and l ≠ j, k gives

$$\displaystyle \begin{aligned} \frac{1}{(J-1)(J-2)}\sum_{k\neq j}\sum_{l\neq j,k}\frac{\sigma_{jk}\sigma_{jl}}{\sigma_{kl}}=a_{j}^{2},\ \ \mbox{for all }j. \end{aligned} $$
(6.4)
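Eq. 6.4 can be illustrated numerically: under the one-factor model every single ratio \(\sigma _{jk}\sigma _{jl}/\sigma _{kl}\) already equals \(a_{j}^{2}\), so their average does too. The Python sketch below checks this for a model-implied covariance matrix. The helper names `triad_ratios` and `squared_loading` and the parameter values are hypothetical and serve only as illustration.

```python
from itertools import permutations


def triad_ratios(sigma, j):
    """All ratios sigma_jk * sigma_jl / sigma_kl with k != j and l != j, k."""
    others = [k for k in range(len(sigma)) if k != j]
    return [sigma[j][k] * sigma[j][l] / sigma[k][l]
            for k, l in permutations(others, 2)]


def squared_loading(sigma, j):
    """Average of the (J - 1)(J - 2) ratios for item j, as in Eq. 6.4."""
    ratios = triad_ratios(sigma, j)
    return sum(ratios) / len(ratios)


# Hypothetical loadings and unique variances (illustration only).
a = [0.9, 0.7, 0.5, 0.8]
delta = [0.4, 0.6, 0.3, 0.5]
J = len(a)

# Model-implied covariance matrix: a_j * a_k off the diagonal,
# a_j**2 + delta_j on the diagonal.
sigma = [[a[j] * a[k] + (delta[j] if j == k else 0.0) for k in range(J)]
         for j in range(J)]

# Compare the recovered squared loading with the true value for each item.
for j in range(J):
    print(j + 1, round(squared_loading(sigma, j), 6), round(a[j] ** 2, 6))
```

With sample covariances the individual ratios generally differ from one another, which is one reason for averaging over all of them rather than picking a single pair.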

Substitution from Eq. 6.4 and \(\sigma _{jk}=a_{j}a_{k}\) into

$$\displaystyle \begin{aligned} \rho^{2}_{XC}=\frac{\sum_{j}a_{j}^{2}+\sum_{j}\sum_{k\neq j}a_{j}a_{k}}{\sigma^{2}_{X}} \end{aligned} $$
(6.5)

yields the new expression of the communality of the total score X given by

$$\displaystyle \begin{aligned} \rho^{2}_{XC}=\!\left\{\frac{1}{(J-1)(J-2)}\sum_{j}\sum_{k\neq j}\sum_{l\neq j,k}\frac{\sigma_{jk}\sigma_{jl}}{\sigma_{kl}}+\sum_{j}\sum_{k\neq j}\sigma_{jk}\right\}\!/\sigma_{X}^{2}. \end{aligned} $$
(6.6)
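As a consistency check, Eq. 6.6 can be evaluated on a model-implied covariance matrix and compared with the parameter form of omega in Eq. 6.2. The Python sketch below does this; the function names and parameter values are assumptions made for illustration.

```python
def squared_loading(sigma, j):
    """Covariance-based expression for a_j**2 (Eq. 6.4)."""
    J = len(sigma)
    total = 0.0
    for k in range(J):
        for l in range(J):
            if k != j and l != j and l != k:
                total += sigma[j][k] * sigma[j][l] / sigma[k][l]
    return total / ((J - 1) * (J - 2))


def total_score_communality(sigma):
    """Communality of the total score from variances and covariances (Eq. 6.6)."""
    J = len(sigma)
    common = sum(squared_loading(sigma, j) for j in range(J))
    off_diagonal = sum(sigma[j][k] for j in range(J) for k in range(J) if k != j)
    variance_of_total = sum(sum(row) for row in sigma)  # var(X)
    return (common + off_diagonal) / variance_of_total


# Hypothetical loadings and unique variances (illustration only).
a = [0.9, 0.7, 0.5, 0.8]
delta = [0.4, 0.6, 0.3, 0.5]
J = len(a)
sigma = [[a[j] * a[k] + (delta[j] if j == k else 0.0) for k in range(J)]
         for j in range(J)]

# Omega in terms of factor model parameters (Eq. 6.2):
# (sum of loadings)**2 divided by var(X) = (sum of loadings)**2 + sum of deltas.
omega_parameters = sum(a) ** 2 / (sum(a) ** 2 + sum(delta))
omega_covariances = total_score_communality(sigma)

print(round(omega_parameters, 6), round(omega_covariances, 6))
```

The two printed values agree because the covariance matrix was generated from the model itself; with sample covariances the covariance-based expression becomes an estimate.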

Note that substitution from Eq. 6.4 into \(\sigma ^{2}_{j}=a^{2}_{j}+\delta _{j}\) and solving for \(\delta _{j}\) yields

$$\displaystyle \begin{aligned} \delta_{j}=\sigma^{2}_{j}-\frac{1}{(J-1)(J-2)}\sum_{k\neq j}\sum_{l\neq j,k}\frac{\sigma_{jk}\sigma_{jl}}{\sigma_{kl}},\ \ \mbox{for all }j. \end{aligned} $$
(6.7)

Also note that if \(a_{j}>0\), for all j, then it follows from Eq. 6.4 that

$$\displaystyle \begin{aligned} a_{j}=\sqrt{\frac{1}{(J-1)(J-2)}\sum_{k\neq j}\sum_{l\neq j,k}\frac{\sigma_{jk}\sigma_{jl}}{\sigma_{kl}}},\ \ \mbox{for all }j. \end{aligned} $$
(6.8)

Under the essentially tau-equivalent model, \(a_{j}=a\), for all j, and hence \(\sigma _{jk}=a^{2}\), for all j and k ≠ j, so that

$$\displaystyle \begin{aligned} \sum_{l\neq j,k}\frac{\sigma_{jk}\sigma_{jl}}{\sigma_{kl}}=(J-2)\sigma_{jk},\ \ \mbox{for all }j\text{ and }k\neq j. \end{aligned} $$
(6.9)

Substitution from Eq. 6.9 into Eq. 6.6 and factoring out \(\sum _{j}\sum _{k\neq j}\sigma _{jk}\) yields coefficient alpha given by

$$\displaystyle \begin{aligned} \alpha=\frac{J}{J-1}\sum_{j}\sum_{k\neq j}\sigma_{jk}/\sigma_{X}^{2}. \end{aligned} $$
(6.10)
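The reduction to alpha can be verified numerically. With equal loadings, the first term of Eq. 6.6 becomes \(\sum _{j}\sum _{k\neq j}\sigma _{jk}/(J-1)\), and \(1/(J-1)+1=J/(J-1)\). The Python sketch below compares Eq. 6.6 with Eq. 6.10 for one covariance matrix with equal loadings and one with unequal loadings. All names and parameter values are hypothetical.

```python
def implied_covariance(loadings, unique_variances):
    """Model-implied covariance matrix of the one-factor model."""
    J = len(loadings)
    return [[loadings[j] * loadings[k] + (unique_variances[j] if j == k else 0.0)
             for k in range(J)] for j in range(J)]


def off_diagonal_sum(sigma):
    J = len(sigma)
    return sum(sigma[j][k] for j in range(J) for k in range(J) if k != j)


def total_score_communality(sigma):
    """Covariance-based communality of the total score (Eq. 6.6)."""
    J = len(sigma)
    common = 0.0
    for j in range(J):
        for k in range(J):
            for l in range(J):
                if k != j and l != j and l != k:
                    common += sigma[j][k] * sigma[j][l] / sigma[k][l]
    common /= (J - 1) * (J - 2)
    return (common + off_diagonal_sum(sigma)) / sum(sum(row) for row in sigma)


def coefficient_alpha(sigma):
    """Coefficient alpha (Eq. 6.10)."""
    J = len(sigma)
    return J / (J - 1) * off_diagonal_sum(sigma) / sum(sum(row) for row in sigma)


delta = [0.4, 0.6, 0.3, 0.5]                                  # hypothetical
equal = implied_covariance([0.8, 0.8, 0.8, 0.8], delta)       # tau-equivalent
unequal = implied_covariance([0.9, 0.7, 0.5, 0.8], delta)     # one-factor

print(round(coefficient_alpha(equal), 4), round(total_score_communality(equal), 4))
print(round(coefficient_alpha(unequal), 4), round(total_score_communality(unequal), 4))
```

For the equal-loading matrix the two coefficients coincide. For these particular unequal loadings, alpha comes out slightly below the communality.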

In addition to the communality of the total score, the communalities of the individual item scores might be of interest in practice. The communality of item score \(X_{j}\) is defined as the squared correlation between \(X_{j}\) and \(C_{j}\). Under the one-factor model, the communality of item score \(X_{j}\) is given by

$$\displaystyle \begin{aligned} h^{2}_{j}=\rho^{2}_{X_{j}C_{j}}=\frac{cov(X_{j},C_{j})^{2}}{\sigma^{2}_{j}var(C_{j})}=\frac{var(C_{j})}{\sigma^{2}_{j}}=\frac{a_{j}^{2}}{a_{j}^{2}+\delta_{j}},\ \ \mbox{for all }j. \end{aligned} $$
(6.11)

Note that under the one-factor model, \(\rho ^{2}_{X_{j}C_{j}}\) is equal to \(\rho ^{2}_{X_{j}\xi }\). So under the one-factor model, the communality of item score \(X_{j}\) also assesses the validity of \(X_{j}\) as a measure of ξ. Now, since \(h^{2}_{j}=a_{j}^{2}/\sigma _{j}^{2}\), dividing both sides of Eq. 6.4 by \(\sigma _{j}^{2}\) yields

$$\displaystyle \begin{aligned} h^{2}_{j}=\frac{1}{(J-1)(J-2)\sigma_{j}^{2}}\sum_{k\neq j}\sum_{l\neq j,k}\frac{\sigma_{jk}\sigma_{jl}}{\sigma_{kl}},\ \ \mbox{for all }j. \end{aligned} $$
(6.12)

The total variance is defined as \(\sum _{j}\sigma ^{2}_{j}\). Under the one-factor model, the proportion of total variance explained by the common factor is given by

$$\displaystyle \begin{aligned} \pi=\frac{\sum_{j}a^{2}_{j}}{\sum_{j}a^{2}_{j}+\sum_{j}\delta_{j}}=\frac{\sum_{j}a^{2}_{j}}{\sum_{j}\sigma^{2}_{j}} \end{aligned} $$
(6.13)

and assesses the extent to which the items measure the common factor relative to the unique factors. Substitution from Eq. 6.4 into Eq. 6.13 yields

$$\displaystyle \begin{aligned} \pi=\frac{1}{(J-1)(J-2)}\sum_{j}\sum_{k\neq j}\sum_{l\neq j,k}\frac{\sigma_{jk}\sigma_{jl}}{\sigma_{kl}}/\sum_{j}\sigma^{2}_{j}. \end{aligned} $$
(6.14)
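The item communalities of Eq. 6.12 and the proportion of total variance explained of Eq. 6.14 use the same building block as Eq. 6.4. The Python sketch below evaluates both on a model-implied covariance matrix and compares them with the parameter forms in Eqs. 6.11 and 6.13. Names and parameter values are illustrative assumptions.

```python
def squared_loading(sigma, j):
    """Covariance-based expression for a_j**2 (Eq. 6.4)."""
    J = len(sigma)
    total = 0.0
    for k in range(J):
        for l in range(J):
            if k != j and l != j and l != k:
                total += sigma[j][k] * sigma[j][l] / sigma[k][l]
    return total / ((J - 1) * (J - 2))


def item_communalities(sigma):
    """Communality of each item score (Eq. 6.12)."""
    return [squared_loading(sigma, j) / sigma[j][j] for j in range(len(sigma))]


def proportion_explained(sigma):
    """Proportion of total variance explained by the common factor (Eq. 6.14)."""
    J = len(sigma)
    common = sum(squared_loading(sigma, j) for j in range(J))
    return common / sum(sigma[j][j] for j in range(J))


# Hypothetical loadings and unique variances (illustration only).
a = [0.9, 0.7, 0.5, 0.8]
delta = [0.4, 0.6, 0.3, 0.5]
J = len(a)
sigma = [[a[j] * a[k] + (delta[j] if j == k else 0.0) for k in range(J)]
         for j in range(J)]

h2 = item_communalities(sigma)
pi = proportion_explained(sigma)

# Parameter forms (Eqs. 6.11 and 6.13) for comparison.
h2_parameters = [a[j] ** 2 / (a[j] ** 2 + delta[j]) for j in range(J)]
pi_parameters = sum(x ** 2 for x in a) / (sum(x ** 2 for x in a) + sum(delta))

print([round(value, 4) for value in h2], round(pi, 4))
```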

3.1 Estimates

Let \(x_{ij}\) be the observed score of individual i = 1, 2, …, N on item j = 1, 2, …, J. The sample mean score on item j is given by \(\bar {x}_{j}=\sum _{i=1}^{N}x_{ij}/N\), for all j. The observed total score of individual i is then given by \(x_{i}=\sum _{j=1}^{J}x_{ij}\), for all i, and the sample mean total score is then given by \(\bar {x}=\sum _{i=1}^{N}x_{i}/N\). A closed-form estimate of \(\rho ^{2}_{XC}\) is now given by

$$\displaystyle \begin{aligned} \hat{\rho}^{2}_{XC}=\!\left\{\frac{1}{(J-1)(J-2)}\sum_{j}\sum_{k\neq j}\sum_{l\neq j,k}\frac{s_{jk}s_{jl}}{s_{kl}}+\sum_{j}\sum_{k\neq j}s_{jk}\right\}\!/s^{2}, \end{aligned} $$
(6.15)

where \(s_{jk}=\sum _{i=1}^{N}(x_{ij}-\bar {x}_{j})(x_{ik}-\bar {x}_{k})/(N-1)\) is the estimate of \(\sigma _{jk}\), for all j and k ≠ j, and \(s^{2}=\sum _{i=1}^{N}(x_{i}-\bar {x})^{2}/(N-1)\) is the estimate of \(\sigma ^{2}_{X}\). Note that \(s^{2}=\sum _{j}s^{2}_{j}+\sum _{j}\sum _{k\neq j}s_{jk}\), where \(s^{2}_{j}=\sum _{i=1}^{N}(x_{ij}-\bar {x}_{j})^{2}/(N-1)\) is the estimate of \(\sigma ^{2}_{j}\), for all j. A closed-form estimate of \(\delta _{j}\) is given by

$$\displaystyle \begin{aligned} \hat{\delta}_{j}=s^{2}_{j}-\frac{1}{(J-1)(J-2)}\sum_{k\neq j}\sum_{l\neq j,k}\frac{s_{jk}s_{jl}}{s_{kl}},\ \ \mbox{for all }j. \end{aligned} $$
(6.16)

If \(s_{jk}>0\), for all j ≠ k, then a closed-form estimate of factor loading \(a_{j}\) is given by

$$\displaystyle \begin{aligned} \hat{a}_{j}=\sqrt{\frac{1}{(J-1)(J-2)}\sum_{k\neq j}\sum_{l\neq j,k}\frac{s_{jk}s_{jl}}{s_{kl}}},\ \ \mbox{for all }j. \end{aligned} $$
(6.17)

A closed-form estimate of the item communality \(\rho ^{2}_{X_{j}C_{j}}=h^{2}_{j}\) is given by

$$\displaystyle \begin{aligned} \hat{h}^{2}_{j}=\frac{1}{(J-1)(J-2)s_{j}^{2}}\sum_{k\neq j}\sum_{l\neq j,k}\frac{s_{jk}s_{jl}}{s_{kl}},\ \ \mbox{for all }j. \end{aligned} $$
(6.18)

Finally, a closed-form estimate of the proportion of total variance explained by the common factor is given by

$$\displaystyle \begin{aligned} \hat{\pi}=\frac{1}{(J-1)(J-2)}\sum_{j}\sum_{k\neq j}\sum_{l\neq j,k}\frac{s_{jk}s_{jl}}{s_{kl}}/\sum_{j}s^{2}_{j}. \end{aligned} $$
(6.19)
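The step from raw item scores to the closed-form estimate of Eq. 6.15 can be sketched in a few lines of Python. The score table below is a small hypothetical data set invented for illustration, and the function names are likewise assumptions. The sketch also checks the identity \(s^{2}=\sum _{j}s^{2}_{j}+\sum _{j}\sum _{k\neq j}s_{jk}\) stated above.

```python
def sample_covariance_matrix(x):
    """Sample covariances s_jk (divisor N - 1) from an N x J table of scores."""
    N, J = len(x), len(x[0])
    means = [sum(row[j] for row in x) / N for j in range(J)]
    return [[sum((row[j] - means[j]) * (row[k] - means[k]) for row in x) / (N - 1)
             for k in range(J)] for j in range(J)]


def communality_estimate(s):
    """Closed-form estimate of the total-score communality (Eq. 6.15)."""
    J = len(s)
    common = 0.0
    for j in range(J):
        for k in range(J):
            for l in range(J):
                if k != j and l != j and l != k:
                    common += s[j][k] * s[j][l] / s[k][l]
    common /= (J - 1) * (J - 2)
    off_diagonal = sum(s[j][k] for j in range(J) for k in range(J) if k != j)
    return (common + off_diagonal) / sum(sum(row) for row in s)


# Hypothetical scores of N = 6 individuals on J = 3 items (illustration only).
x = [[2, 1, 2], [4, 3, 3], [3, 4, 4], [5, 4, 5], [1, 2, 2], [3, 4, 2]]
s = sample_covariance_matrix(x)

# Sample variance of the observed total scores, computed directly.
totals = [sum(row) for row in x]
mean_total = sum(totals) / len(totals)
s2 = sum((t - mean_total) ** 2 for t in totals) / (len(totals) - 1)

# s2 should equal the sum of all entries of the sample covariance matrix.
print(round(s2, 4), round(sum(sum(row) for row in s), 4))
print(round(communality_estimate(s), 4))
```

The estimate requires at least three items and nonzero sample covariances \(s_{kl}\), since each covariance appears in a denominator.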

4 An Example

The data in this example are taken from Lord and Novick (1968, p. 91) and are the entries of the sample covariance matrix of four measures of English as a foreign language. The sample covariance matrix is based upon a sample size of 1416 and is given by

$$\displaystyle \begin{aligned} \left[\begin{array}{cccc} s^{2}_{1} \\ s_{21} & s^{2}_{2} \\ s_{31} & s_{32} & s^{2}_{3} \\ s_{41} & s_{42} & s_{43} & s^{2}_{4} \end{array}\right]=\left[\begin{array}{cccc} 94.7 \\ 87.3 & 212.0 \\ 63.9 & 138.7 & 160.5 \\ 58.4 & 128.2 & 109.8 & 115.4 \end{array} \right].\end{aligned}$$

Estimates of the parameters of the one-factor model are often obtained by maximum likelihood estimation under the assumption of multivariate normality of the item scores in the population. For comparison, both the maximum likelihood estimates and the closed-form estimates of \(a_{j}\), \(\delta _{j}\), and \(h^{2}_{j}\), for all j, and ω and π are calculated. The maximum likelihood estimates are denoted by \(\tilde {a}_{j}\), \(\tilde {\delta }_{j}\), and \(\tilde {h}^{2}_{j}\), for all j, and \(\tilde {\omega }\) and \(\tilde {\pi }\). The estimates for all item parameters and coefficients are given in Table 6.1. Note that the item order given by the closed-form estimates \(\hat {h}^{2}_{1}<\hat {h}^{2}_{3}<\hat {h}^{2}_{4}<\hat {h}^{2}_{2}\) is different from the item order given by the maximum likelihood estimates \(\tilde {h}^{2}_{1}<\tilde {h}^{2}_{3}<\tilde {h}^{2}_{2}<\tilde {h}^{2}_{4}\). The estimate of coefficient α is \(\hat {\alpha }=.891\). The maximum likelihood estimate of coefficient ω is \(\tilde {\omega }=.909\), and its closed-form estimate (Eq. 6.15) is \(\hat {\omega }=.912\). So, about 91% of the sample variance of the total score is explained by the common factor. The maximum likelihood estimate of the proportion of total variance explained, π, is \(\tilde {\pi }=.725\), and its closed-form estimate is \(\hat {\pi }=.735\). So, about 73% of the total sample variance is explained by the common factor.

Table 6.1 Estimates of all item parameters and coefficients under the one-factor model, for the Lord and Novick (1968) example data
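The closed-form values quoted in this section can be reproduced directly from the sample covariance matrix. The Python sketch below applies Eqs. 6.10, 6.15, 6.18, and 6.19 to the matrix given above. Variable and function names are illustrative, and the maximum likelihood estimates are not computed here.

```python
# Sample covariance matrix from the text (Lord and Novick 1968, p. 91).
s = [
    [94.7, 87.3, 63.9, 58.4],
    [87.3, 212.0, 138.7, 128.2],
    [63.9, 138.7, 160.5, 109.8],
    [58.4, 128.2, 109.8, 115.4],
]
J = len(s)


def squared_loading_estimate(s, j):
    """Closed-form estimate of a_j**2 (the quantity under the root in Eq. 6.17)."""
    size = len(s)
    total = 0.0
    for k in range(size):
        for l in range(size):
            if k != j and l != j and l != k:
                total += s[j][k] * s[j][l] / s[k][l]
    return total / ((size - 1) * (size - 2))


common = [squared_loading_estimate(s, j) for j in range(J)]
off_diagonal = sum(s[j][k] for j in range(J) for k in range(J) if k != j)
item_variances = sum(s[j][j] for j in range(J))
total_variance = item_variances + off_diagonal        # s**2

alpha_hat = J / (J - 1) * off_diagonal / total_variance   # Eq. 6.10 with sample values
omega_hat = (sum(common) + off_diagonal) / total_variance  # Eq. 6.15
pi_hat = sum(common) / item_variances                     # Eq. 6.19
h2_hat = [common[j] / s[j][j] for j in range(J)]          # Eq. 6.18

print("alpha:", round(alpha_hat, 3))
print("omega:", round(omega_hat, 3))
print("pi:   ", round(pi_hat, 3))
print("h2:   ", [round(value, 3) for value in h2_hat])
```

The printed coefficients can be compared with the values .891, .912, and .735 reported above, and the ordering of the item communality estimates with the ordering stated for the closed-form estimates.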

5 Conclusion

Coefficient alpha has traditionally been used to assess the reliability of the total score. Since coefficient alpha equals the communality of the total score under the essentially tau-equivalent model, coefficient alpha is a lower bound to the reliability of the total score. The communality of the total score under the more realistic one-factor model is also a lower bound to the reliability of the total score. Under the one-factor model, however, the communality of the total score equals the proportion of variance of the total score explained by the common factor and therefore assesses the extent to which the common factor is measured by the total score. If items are constructed to measure one and the same latent factor, then the one-factor model can be used to study whether the items actually measure a single common factor. Once it has been concluded that the items measure a single common factor, it is of interest to assess how well the single common factor is measured by the item scores or the total score. To assess how well the single common factor is measured by the total score, coefficient omega and its new expression can be used. Under the one-factor model, coefficient omega and its new expression give the proportion of variance of the total score explained by the common factor. In practice, the maximum likelihood estimate of coefficient omega is often used as the estimate of the communality of the total score under the one-factor model. The closed-form estimate of omega now provides a distribution-free alternative.