1 Factor Analysis and Principal Component Analysis

Factor analysis (FA) and principal component analysis (PCA) are frequently used multivariate statistical methods for data reduction. In FA (Anderson, 2003; Lawley & Maxwell, 1971), the p-dimensional mean-centered vector of the observed variables \( \varvec{y}_{i} \), \( i = 1, \ldots ,n \), is linearly related to an m-dimensional vector of latent factors \( \varvec{f}_{i} \) via \( \varvec{y}_{i} = {\varvec{\Lambda}}\varvec{f}_{i} + \varvec{\varepsilon}_{i} \), where \( {\varvec{\Lambda}} = (\varvec{\lambda}_{1} , \ldots ,\varvec{\lambda}_{m} ) \) is a p × m matrix of factor loadings (with p > m), and \( \varvec{\varepsilon}_{i} \) is a p-dimensional vector of errors. For the orthogonal factor model, the following three assumptions are typically imposed: (i) \( \varvec{f}_{i} \sim N_{m} ({\varvec{0}},\varvec{I}_{m} ) \); (ii) \( \varvec{\varepsilon}_{i} \sim N_{p} ({\varvec{0}},{\varvec{\Psi}}) \), where \( {\varvec{\Psi}} \) is a diagonal matrix with positive elements on the diagonal; (iii) \( {{Cov}}(\varvec{f}_{i} ,\varvec{\varepsilon}_{i} ) = {\varvec{0}} \). Under these three assumptions, the covariance matrix of \( \varvec{y}_{i} \) is given by \( {\varvec{\Sigma}} = {\varvec{\Lambda}} {\varvec{\Lambda}}^{\prime } + {\varvec{\Psi}} \). If \( \varvec{y}_{i} \) is standardized, \( {\varvec{\Sigma}} \) is a correlation matrix.
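To make the setup concrete, the following minimal sketch (not part of the original development; the loading values, seed, and variable names are hypothetical, and numpy is assumed) simulates data from the orthogonal factor model and checks that the model-implied covariance \( {\varvec{\Lambda}} {\varvec{\Lambda}}^{\prime } + {\varvec{\Psi}} \) matches the sample covariance for large n.

```python
# Illustrative sketch: simulate y_i = Lambda f_i + eps_i and verify
# Sigma = Lambda Lambda' + Psi (all values hypothetical).
import numpy as np

rng = np.random.default_rng(0)
p, m, n = 6, 2, 100_000

Lambda = np.array([[0.8, 0.0],
                   [0.7, 0.0],
                   [0.6, 0.0],
                   [0.0, 0.8],
                   [0.0, 0.7],
                   [0.0, 0.6]])                         # p x m loading matrix
Psi = np.diag(1.0 - np.sum(Lambda**2, axis=1))          # diagonal unique variances

F = rng.standard_normal((n, m))                         # f_i ~ N_m(0, I_m)
E = rng.standard_normal((n, p)) * np.sqrt(np.diag(Psi)) # eps_i ~ N_p(0, Psi)
Y = F @ Lambda.T + E                                    # mean-centered observations

Sigma_model = Lambda @ Lambda.T + Psi                   # implied covariance
Sigma_sample = np.cov(Y, rowvar=False)                  # sample covariance
print(np.max(np.abs(Sigma_model - Sigma_sample)))       # small for large n
```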

Let \( {\varvec{\Lambda}}^{ + } = (\varvec{\lambda}_{1}^{ + } , \ldots ,\varvec{\lambda}_{m}^{ + } ) \) be the p × m matrix whose columns are the standardized eigenvectors corresponding to the m largest eigenvalues of \( {\varvec{\Sigma}} \); \( {\varvec{\Omega}} = diag(\varvec{\omega}) \) be the m × m diagonal matrix whose diagonal elements \( \varvec{\omega}= (\omega_{1} , \ldots ,\omega_{m} )^{\prime} \) are the m largest eigenvalues of \( {\varvec{\Sigma}} \); and \( {\varvec{\Omega}}^{1/2} \) be the m × m diagonal matrix whose diagonal elements are the square roots of those in \( {\varvec{\Omega}} \). Then the principal components (PCs) (cf. Anderson, 2003) with m elements are obtained as \( \varvec{f}_{i}^{*} = {\varvec{\Lambda}}^{ + \prime } \varvec{y}_{i} \). Clearly, the PCs are uncorrelated, with diagonal covariance matrix \( {\varvec{\Lambda}}^{ + \prime } {\varvec{\Sigma \Lambda }}^{ + } = {\varvec{\Omega}} \). When m is properly chosen, \( {\varvec{\Sigma}} \approx {\varvec{\Lambda}}^{ + } {\varvec{\Omega}} {\varvec{\Lambda}}^{{ + {\prime }}} = {\varvec{\Lambda}}^{\text{*}} {\varvec{\Lambda}}^{*\prime } \), where \( {\varvec{\Lambda}}^{*} = {\varvec{\Lambda}}^{ + } {\varvec{\Omega}}^{1/2} \) is the p × m matrix of PCA loadings.
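As an illustration of these definitions, the PCA loadings \( {\varvec{\Lambda}}^{*} = {\varvec{\Lambda}}^{ + } {\varvec{\Omega}}^{1/2} \) can be computed directly from the eigendecomposition of \( {\varvec{\Sigma}} \). The sketch below is ours (the function name pca_loadings and the compound-symmetry example matrix are hypothetical choices for illustration).

```python
# Sketch: PCA loadings from the eigendecomposition of Sigma.
import numpy as np

def pca_loadings(Sigma, m):
    """Return the p x m matrix of PCA loadings Lambda* = Lambda^+ Omega^{1/2}."""
    eigvals, eigvecs = np.linalg.eigh(Sigma)       # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:m]            # indices of the m largest eigenvalues
    Lambda_plus = eigvecs[:, idx]                  # standardized eigenvectors
    return Lambda_plus * np.sqrt(eigvals[idx])     # scale each column by sqrt(omega_k)

# Usage with a hypothetical one-factor correlation matrix (p = 5, rho = 0.5):
p, rho = 5, 0.5
Sigma = (1 - rho) * np.eye(p) + rho * np.ones((p, p))
Lambda_star = pca_loadings(Sigma, m=1)
print(Lambda_star.ravel())                         # close to sqrt(rho) * 1_p for large p
```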

2 Closeness Conditions Between Factor Analysis and Principal Component Analysis

It is well known that FA and PCA often yield approximately the same results, especially their estimated loading matrices \( {\varvec{\hat{\Lambda }}} \) and \( {\varvec{\hat{\Lambda }}}^{*} \), respectively (e.g., Velicer & Jackson, 1990). Conditions under which the two matrices are close to each other are therefore of substantial interest. At the population level, two such conditions, identified by Guttman (1956) and Schneeweiss (1997), are among the most well known.

2.1 Guttman Condition

Consider the factor analysis model \( {\varvec{\Sigma}} = {\varvec{\Lambda}} {\varvec{\Lambda}}^{\prime } + {\varvec{\Psi}}, \) where \( {\varvec{\Psi}} \) is a diagonal unique variance matrix, with \( ({\varvec{\Sigma}}^{ - 1} )_{jj} = \sigma^{jj} \) and \( ({\varvec{\Psi}})_{jj} = \psi_{jj} \), \( j = 1, \ldots ,p \), and let m be the number of common factors. Guttman (1956; see also Theorem 1 of Krijnen, 2006) showed that if \( m/p \to 0 \) as \( p \to \infty \), then \( \psi_{jj} \sigma^{jj} \to 1 \) for almost all j. Here, “for almost all j” means \( \lim_{p \to \infty } \# \{ j :\,\psi_{jj} \sigma^{jj} < 1\} /p = 0 \). That is, the number of indices j satisfying \( \psi_{jj} \sigma^{jj} < 1 \) becomes negligible relative to p as p goes to infinity.
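A quick numeric check of this behavior (a sketch with hypothetical one-factor loadings drawn uniformly from (0.5, 0.9); not taken from Guttman or Krijnen) shows \( \psi_{jj} \sigma^{jj} \) rising toward 1 from below as p grows.

```python
# Sketch: psi_jj * sigma^jj approaches 1 as p grows (one hypothetical factor).
import numpy as np

rng = np.random.default_rng(1)

def guttman_products(p):
    lam = rng.uniform(0.5, 0.9, size=p)          # one-factor loadings
    psi = 1.0 - lam**2                           # unique variances (standardized variables)
    Sigma = np.outer(lam, lam) + np.diag(psi)
    sigma_jj = np.diag(np.linalg.inv(Sigma))     # diagonal of Sigma^{-1}
    return psi * sigma_jj

for p in (10, 100, 1000):
    print(p, guttman_products(p).min())          # rises toward 1 from below
```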

2.2 Schneeweiss Condition

The closeness condition between the loading matrix from FA and that from PCA given by Schneeweiss and Mathes (1995) and Schneeweiss (1997) is \( ev_{m} ({\varvec{\Lambda}}^{\prime} {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) \to \infty \), where \( ev_{k} (\varvec{A}) \) is the \( k \)-th largest eigenvalue of a square matrix \( \varvec{A} \). Because \( {\varvec{\Lambda}}^{\prime} {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}} \) is an m × m matrix, \( ev_{m} ({\varvec{\Lambda}}^{\prime} {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) \) is its smallest eigenvalue.

Related to the Schneeweiss condition, Bentler (1976) parameterized the correlation structure of the factor model as \( {\varvec{\Psi}}^{ - 1/2} {\varvec{\Sigma}} {\varvec{\Psi}}^{ - 1/2} = {\varvec{\Psi}}^{ - 1/2} {\varvec{\Lambda}} {\varvec{\Lambda}}^{\prime } {\varvec{\Psi}}^{ - 1/2} + \varvec{I}_{p} \) and showed that, under this parameterization, a necessary condition for \( ev_{m} ({\varvec{\Lambda}}^{\prime } {\varvec{\Psi}}^{ - 1} {\varvec{\Lambda}}) = ev_{m} ({\varvec{\Psi}}^{ - 1/2} {\varvec{\Lambda}} {\varvec{\Lambda}}^{\prime } {\varvec{\Psi}}^{ - 1/2} ) \to \infty \) is that, as \( p \) increases, the sum of squared loadings on each factor goes to infinity (\( \varvec{\lambda}_{k}^{\prime }\varvec{\lambda}_{k} \to \infty \), \( k = 1, \ldots ,m \), as \( p \to \infty \)).

2.3 Relationship Between Guttman and Schneeweiss Conditions

The relationship between the Guttman and Schneeweiss conditions is summarized in Table 1. The Schneeweiss condition \( ( {ev_{m} ({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) \to \infty } ) \) is sufficient for the Guttman condition (\( m/p \to 0 \) as \( p \to \infty \)) (Krijnen, 2006, Theorem 3). What we would like is for the converse (\( m/p \to 0 \) as \( p \to \infty \Rightarrow ev_{m} ({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) \to \infty \)) to hold in practical applications, as discussed in the next section.

Table 1 Relationships among conditions and results

First, the condition of \( m/p \to 0 \) as \( p \to \infty \) is sufficient for \( \psi_{jj} \sigma^{jj} \to 1 \) for almost all \( j \) (Guttman, 1956; Krijnen, 2006, Theorem 1). Also, \( \psi_{jj} \sigma^{jj} \to 1 \) for all \( j \) implies \( ev_{m} ({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) \to \infty \) (Krijnen, 2006, Theorem 4). Here, “\( \psi_{jj} \sigma^{jj} \to 1 \) for all \( j \)” is slightly stronger than “\( \psi_{jj} \sigma^{jj} \to 1 \) for almost all \( j \).” In practice, however, it seems reasonable to assume that the number of loadings on every factor increases proportionally with \( p \), as stated in Bentler (1976). Then the condition of \( m/p \to 0 \) as \( p \to \infty \) becomes equivalent to \( ev_{m} ({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) \to \infty \); that is, the Guttman and Schneeweiss conditions become interchangeable.
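The proportional-growth assumption can be checked numerically. In the sketch below (a hypothetical loading pattern in which each variable loads 0.7 on exactly one of m = 3 factors), \( ev_{m} ({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) \) grows roughly linearly in p, so the two conditions hold or fail together.

```python
# Sketch: when loadings per factor grow with p, ev_m(Lambda' Psi^{-1} Lambda) diverges.
import numpy as np

def smallest_eigenvalue(p, m=3, load=0.7):
    Lambda = np.zeros((p, m))
    Lambda[np.arange(p), np.arange(p) % m] = load          # each factor gets about p/m loadings
    Psi_inv = np.diag(1.0 / (1.0 - np.sum(Lambda**2, axis=1)))
    return np.linalg.eigvalsh(Lambda.T @ Psi_inv @ Lambda).min()

for p in (30, 300, 3000):
    print(p, smallest_eigenvalue(p))                       # grows roughly linearly in p
```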

3 Extended Guttman Condition

By far the most important consequence of the Schneeweiss condition is that, when \( ev_{m} ({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) \to \infty \), the second term on the right-hand side of the Sherman-Morrison-Woodbury formula (see, e.g., Chap. 16 of Harville, 1997):

$$ {\varvec{\Sigma}}^{ - 1} = {\varvec{\Psi}}^{ - 1} - {\varvec{\Psi}}^{ - 1} {\varvec{\Lambda}}(\varvec{I}_{m} + {\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}})^{ - 1} {\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} $$
(1)

vanishes, so that

$$ {\varvec{\Psi}}^{ - 1} - {\varvec{\Sigma}}^{ - 1} \to {\varvec{0}}\quad {\text{as}}\quad ev_{m} ({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) \to \infty $$
(2)

As we noted in the previous section, the condition of \( m/p \to 0 \) as \( p \to \infty \) can be equivalent to \( ev_{m} ({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) \to \infty \) in practical applications. Therefore, we have \( {\varvec{\Psi}}^{ - 1} - {\varvec{\Sigma}}^{ - 1} \to {\varvec{0}} \) in high dimensions with large \( p \). We call \( {\varvec{\Psi}}^{ - 1} - {\varvec{\Sigma}}^{ - 1} \to {\varvec{0}} \) the extended Guttman condition. It extends the original Guttman condition in the sense that \( \psi_{jj} \sigma^{jj} \to 1 \) can be expressed as \( \psi_{jj}^{ - 1} - \sigma^{jj} \to 0 \), as long as \( \psi_{jj} \) is bounded away from zero and bounded above \( (0 < \psi_{\inf } \le \psi_{jj} \le \psi_{\sup } < \infty ) \).
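The following sketch (again with a hypothetical loading pattern; the helper name woodbury_gap is ours) computes the correction term of Eq. (1) directly. By Eq. (1) that term equals \( {\varvec{\Psi}}^{ - 1} - {\varvec{\Sigma}}^{ - 1} \), and its largest element shrinks as p grows.

```python
# Sketch: the Woodbury correction term, i.e., Psi^{-1} - Sigma^{-1}, vanishes as p grows.
import numpy as np

rng = np.random.default_rng(2)

def woodbury_gap(p, m=2):
    Lambda = np.zeros((p, m))
    Lambda[np.arange(p), np.arange(p) % m] = rng.uniform(0.5, 0.9, size=p)
    Psi_inv = np.diag(1.0 / (1.0 - np.sum(Lambda**2, axis=1)))
    middle = np.linalg.inv(np.eye(m) + Lambda.T @ Psi_inv @ Lambda)
    correction = Psi_inv @ Lambda @ middle @ Lambda.T @ Psi_inv   # second term of Eq. (1)
    return np.max(np.abs(correction))                             # = max |Psi^{-1} - Sigma^{-1}|

for p in (10, 100, 1000):
    print(p, woodbury_gap(p))    # decreases roughly like 1/p
```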

Note that there exists a similar identity for the FA model:

$$ {\varvec{\Psi}}^{ - 1} - {\varvec{\Psi}}^{ - 1} {\varvec{\Lambda}}({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}})^{ - 1} {\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} = {\varvec{\Sigma}}^{ - 1} - {\varvec{\Sigma}}^{ - 1} {\varvec{\Lambda}}({\varvec{\Lambda}}^{\prime } {\varvec{\Sigma }}^{ - 1} {\varvec{\Lambda}})^{ - 1} {\varvec{\Lambda}}^{\prime } {\varvec{\Sigma }}^{ - 1} $$
(3)

(see, e.g., Hayashi & Bentler, 2001). Clearly, as \( ev_{m} ({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) \to \infty \), not only the second term on the left-hand side of Eq. (3) but also the second term on its right-hand side vanishes.

As we have just seen, the extended Guttman condition is a direct consequence of the Schneeweiss condition. Note that \( {\varvec{\Psi}}^{ - 1} {\varvec{\Lambda}}(\varvec{I}_{m} + {\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}})^{ - 1} {\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} < {\varvec{\Psi}}^{ - 1} {\varvec{\Lambda}}({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}})^{ - 1} {\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} \), and that \( \varvec{I}_{m} + {\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}} \) is only slightly larger than \( {\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}} \) when the latter is large, in the sense that \( ev_{m} (\varvec{I}_{m} + {\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) = ev_{m} ({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) + 1 \). Consequently, the speed of convergence in \( {\varvec{\Psi}}^{ - 1} - {\varvec{\Sigma}}^{ - 1} \to {\varvec{0}} \) is approximately the reciprocal of the smallest eigenvalue of \( {\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}} \), that is, \( 1/ev_{m} ({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) \).
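The claimed rate can be eyeballed numerically. Under a hypothetical one-factor model with equal loadings (a sketch, not part of the original argument), the product of the elementwise gap \( \max_{j,l} |({\varvec{\Psi}}^{ - 1} - {\varvec{\Sigma}}^{ - 1} )_{jl} | \) and \( ev_{m} ({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) \) stays roughly constant across p, consistent with a \( 1/ev_{m} \) rate.

```python
# Sketch: gap * ev_m stays roughly constant, suggesting a 1/ev_m convergence rate.
import numpy as np

def gap_times_eigenvalue(p, load=0.7):
    lam = np.full(p, load)                        # one-factor loadings, all equal
    psi = 1.0 - lam**2
    Sigma = np.outer(lam, lam) + np.diag(psi)
    gap = np.max(np.abs(np.diag(1.0 / psi) - np.linalg.inv(Sigma)))
    ev = lam @ (lam / psi)                        # scalar Lambda' Psi^{-1} Lambda (m = 1)
    return gap * ev

for p in (10, 100, 1000):
    print(p, gap_times_eigenvalue(p))             # roughly constant across p
```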

4 Approximation of the Inverse of the Covariance Matrix

An important point to note here is that the original Guttman condition of \( \psi_{jj} \sigma^{jj} \to 1 \) (for almost all \( j \)) involves only the diagonal elements of \( {\varvec{\Psi}} \) (or \( {\varvec{\Psi}}^{ - 1} \)) and \( {\varvec{\Sigma}}^{ - 1} \), whereas \( {\varvec{\Psi}}^{ - 1} - {\varvec{\Sigma}}^{ - 1} \to {\varvec{0}} \) involves both the diagonal and the off-diagonal elements of the matrices. The extended condition justifies the interchangeability of \( {\varvec{\Sigma}}^{ - 1} \) and \( {\varvec{\Psi}}^{ - 1} \) as \( ev_{m} ({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) \to \infty \) or, assuming that the number of loadings on every factor increases proportionally with \( p \), as \( m/p \to 0 \) with \( p \to \infty \). The important implication is that all the off-diagonal elements of \( {\varvec{\Sigma}}^{ - 1} \) approach zero in the limit. Thus, the result reflects the sparsity of the off-diagonal elements of the inverted covariance (correlation) matrix in high dimensions.

One obvious advantage of being able to approximate \( {\varvec{\Sigma}}^{ - 1} \) by \( {\varvec{\Psi}}^{ - 1} \) in high dimensions is that the matrix of unique variances \( {\varvec{\Psi}} \) is diagonal and thus can be inverted with only \( p \) operations. Note that, in general, inverting a \( p \)-dimensional square matrix requires on the order of \( O(p^{3} ) \) operations (see, e.g., Pourahmadi, 2013, p. 121).

Consequently, the single most important application of the extended Guttman condition is to approximate the inverse of the covariance matrix \( {\varvec{\Sigma}}^{ - 1} \) by \( {\varvec{\Psi}}^{ - 1} \) in high dimensions. This matters because \( {\varvec{\Sigma}}^{ - 1} \) appears in the quadratic form of the log likelihood function of the multivariate normal distribution. Even if \( {\varvec{\Sigma}} \) is positive definite so that \( {\varvec{\Sigma}}^{ - 1} \) exists in the population, the inverse \( \varvec{S}^{ - 1} \) of the sample covariance matrix \( \varvec{S} \) does not exist in high dimensions when \( p > n \). When \( \varvec{S}^{ - 1} \) does not exist, we cannot estimate \( {\varvec{\Psi}}^{ - 1} \) under the FA model using generalized least squares (GLS) or maximum likelihood (ML) without resorting to some regularization method(s). Thus, a natural choice is to employ the unweighted least squares (ULS) estimation method, which minimizes the fit function \( F_{ULS} (\varvec{S},{\varvec{\Sigma}}) = tr\{ (\varvec{S} - {\varvec{\Sigma}})^{2} \} \) and does not require computing \( \varvec{S}^{ - 1} \) or an estimate of \( {\varvec{\Sigma}}^{ - 1} \). Note that \( 1 - 1/s^{jj} \), a common initial value for the j-th communality, cannot be used because it requires the computation of \( \varvec{S}^{ - 1} \). Instead, we can use the value of 1 for the initial communality estimates; in this case, the initial solution is identical to PCA.
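A minimal sketch of one such ULS scheme is given below (an alternating algorithm written out for illustration, not code from this article): starting from unit communalities, so that the first iterate coincides with PCA, it alternates between the truncated eigendecomposition of \( \varvec{S} - {\varvec{\Psi}} \), which minimizes \( F_{ULS} \) over the loadings for fixed \( {\varvec{\Psi}} \), and the diagonal update \( {\varvec{\Psi}} = diag(\varvec{S} - {\varvec{\Lambda}} {\varvec{\Lambda}}^{\prime } ) \). No inverse of \( \varvec{S} \) is ever computed.

```python
# Sketch: ULS factor extraction by alternating eigendecomposition updates.
import numpy as np

def uls_fa(S, m, n_iter=200, tol=1e-8):
    """Unit initial communalities make the first iterate identical to PCA; S is never inverted."""
    p = S.shape[0]
    psi = np.zeros(p)                                # unit initial communalities (correlation matrix)
    for _ in range(n_iter):
        w, V = np.linalg.eigh(S - np.diag(psi))      # reduced covariance (correlation) matrix
        idx = np.argsort(w)[::-1][:m]                # m largest eigenvalues
        Lam = V[:, idx] * np.sqrt(np.clip(w[idx], 0.0, None))
        psi_new = np.diag(S) - np.sum(Lam**2, axis=1)    # Psi = diag(S - Lam Lam')
        if np.max(np.abs(psi_new - psi)) < tol:
            psi = psi_new
            break
        psi = psi_new
    return Lam, psi

# Usage with a hypothetical one-factor correlation matrix:
p, rho = 50, 0.5
S = (1 - rho) * np.eye(p) + rho * np.ones((p, p))
Lam, psi = uls_fa(S, m=1)
print(Lam[:3].ravel(), psi[:3])   # loadings near sqrt(rho) up to sign; uniquenesses near 1 - rho
```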

Alternatively, when \( p \) is huge, we can employ the following “approximate” FA model with equal unique variances (e.g., Hayashi & Bentler, 2000), using standardized variables, that is, applying it to the correlation matrix:

$$ {\varvec{\Sigma}} \approx {\varvec{\Lambda}}^{*} {\varvec{\Lambda}}^{{*^{\prime}}} + k\varvec{I}_{p} , $$
(4)

with a positive constant \( k \). Note that this model is also called probabilistic PCA in statistics (Tipping & Bishop, 1999). Use of the FA model with equal unique variances seems reasonable, because the eigenvectors of \( ({\varvec{\Sigma}} - k\varvec{I}_{p} ) \) are the same as the eigenvectors of \( {\varvec{\Sigma}} \), and the eigenvalues of \( ({\varvec{\Sigma}} - k\varvec{I}_{p} ) \) are smaller than the eigenvalues of \( {\varvec{\Sigma}} \) by exactly the constant \( k \). Thus, the FA model with equal unique variances can be considered a variant of PCA, and the FA and PCA loading matrices approach the same limiting values as \( ev_{m} ({\varvec{\Lambda}}^{\prime } {\varvec{\Psi }}^{ - 1} {\varvec{\Lambda}}) \to \infty \); that is, they become essentially equivalent in high dimensions.

In Eq. (4), let \( {\varvec{\Psi}}^{\text{*}} = k\varvec{I}_{p} \), so that \( {\varvec{\Psi}}^{* - 1} = k^{ - 1} \varvec{I}_{p} \). Thus, we can use \( {\varvec{\Psi}}^{* - 1} \) as a quick approximation to \( {\varvec{\Psi}}^{ - 1} \). A natural estimator of \( k \) is the MLE for \( k \) given \( {\varvec{\Lambda}}^{*} \) (Tipping & Bishop, 1999):

$$ \hat{k} = \frac{1}{p - m}\sum\limits_{j = m + 1}^{p} {ev_{j} (\varvec{S})} . $$
(5)

However, a more practical method seems to be the following: Once \( {\varvec{\Psi}}^{\text{*}} = k\varvec{I}_{p} \) is estimated by \( {\hat{\varvec{\Psi }}}^{\text{*}} = \hat{k}\varvec{I}_{p} \), we can compute the loadings \( \hat{\Lambda }^{*} \) using the eigenvalues and eigenvectors of \( (\varvec{S} - \hat{k}\varvec{I}_{p} ) \) and then update the estimate of \( {\varvec{\Psi}}^{*} \) as \( {\hat{\varvec{\Psi }}}^{*} = diag(\varvec{S} - \hat{\Lambda }^{*} \hat{\Lambda }^{*\prime } ) \). Note that this updated \( {\hat{\varvec{\Psi }}}^{*} \) is no longer a constant times the identity matrix. Now, invoke the estimator version of the extended Guttman condition, \( {\hat{\varvec{\Psi }}}^{* - 1} - {\hat{\varvec{\Sigma }}}^{ - 1} \approx {\varvec{0}} \), to obtain the approximate estimator \( {\hat{\varvec{\Sigma }}}^{ - 1} \) of \( {\varvec{\Sigma}}^{ - 1} \).
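Putting the recipe together, the sketch below (the function name approx_precision and the use of numpy are our illustrative choices) estimates k by Eq. (5), forms loadings from the eigenpairs of \( \varvec{S} - \hat{k}\varvec{I}_{p} \), updates \( {\hat{\varvec{\Psi }}}^{*} \), and returns its inverse, a diagonal matrix, as the approximation to \( {\varvec{\Sigma}}^{ - 1} \).

```python
# Sketch: approximate Sigma^{-1} via the extended Guttman condition, without inverting S.
import numpy as np

def approx_precision(S, m):
    """Estimate k by Eq. (5), form loadings from (S - k I), return diag(1 / psi*)."""
    w, V = np.linalg.eigh(S)
    order = np.argsort(w)[::-1]                       # eigenvalues, largest first
    w, V = w[order], V[:, order]
    k_hat = w[m:].mean()                              # Eq. (5): mean of the p - m smallest eigenvalues
    Lam = V[:, :m] * np.sqrt(np.clip(w[:m] - k_hat, 0.0, None))   # loadings from (S - k I)
    psi_star = np.diag(S - Lam @ Lam.T)               # updated unique variances (no longer k I)
    return np.diag(1.0 / psi_star)                    # diagonal approximation to Sigma^{-1}

# Usage with a hypothetical one-factor correlation matrix:
p, rho = 200, 0.5
S = (1 - rho) * np.eye(p) + rho * np.ones((p, p))
print(np.max(np.abs(approx_precision(S, m=1) - np.linalg.inv(S))))   # small for large p
```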

5 Illustration

The compound symmetry correlation structure is expressed as \( {\varvec{\Sigma}} = (1 - \rho )\varvec{I}_{p} + \rho {\varvec{1}}_{p} {\varvec{1}}_{p}^{\prime } \) with a common correlation \( \rho \), \( 0 < \rho < 1 \). Obviously, it is a one-factor model with the vector of factor loadings \( \varvec{\lambda}_{1} = \sqrt \rho {\varvec{1}}_{p} \) and the diagonal matrix of unique variances \( {\varvec{\Psi}} = (1 - \rho )\varvec{I}_{p} \). Because the first eigenvalue and the corresponding standardized eigenvector of \( {\varvec{\Sigma}} = (1 - \rho )\varvec{I}_{p} + \rho {\varvec{1}}_{p} {\varvec{1}}_{p}^{\prime} \) are \( \omega_{1} = 1 + (p - 1)\rho \) and \( \varvec{\lambda}_{1}^{ + } = (1/\sqrt p ){\varvec{1}}_{p} \), respectively, the first PC loading vector is

$$ \varvec{\lambda}_{1}^{*} =\varvec{\lambda}_{1}^{ + } \sqrt {\omega_{1} } = (1/\sqrt p )\sqrt {1 + (p - 1)\rho } \cdot {\varvec{1}}_{p} = \sqrt {1/p + (1 - 1/p)\rho } \cdot {\varvec{1}}_{p} , $$
(6)

which approaches the vector of factor loadings \( \varvec{\lambda}_{1} = \sqrt \rho {\varvec{1}}_{p} \) as \( m/p = 1/p \to 0 \) with \( p \to \infty \). The remaining p − 1 eigenvalues are \( \omega_{2} = \ldots = \omega_{p} = 1 - \rho \). Thus, the constant k in the FA model with equal unique variances is obviously \( k = 1 - \rho \). Note that the Schneeweiss condition also holds:

$$ \varvec{\lambda}_{1}^{\prime } {\varvec{\Psi}}^{ - 1}\varvec{\lambda}_{1} = (\sqrt \rho {\varvec{1}}_{p} )^{\prime}\{ (1/(1 - \rho ))\varvec{I}_{p} \} (\sqrt \rho {\varvec{1}}_{p} ) = p \cdot \rho /(1 - \rho ) \to \infty $$
(7)

with \( m/p = 1/p \to 0 \) as \( p \to \infty \). The inverse of the correlation matrix is:

$$ \begin{aligned} {\varvec{\Sigma}}^{ - 1} & = {\varvec{\Psi}}^{ - 1} - {\varvec{\Psi}}^{ - 1}\varvec{\lambda}_{1} (1 +\varvec{\lambda}_{1}^{\prime } {\varvec{\Psi}}^{ - 1}\varvec{\lambda}_{1} )^{ - 1}\varvec{\lambda}_{1}^{\prime } {\varvec{\Psi}}^{ - 1} \\ & = (\frac{1}{1 - \rho })\varvec{I}_{p} - (\frac{1}{1 - \rho })\varvec{I}_{p} \cdot (\sqrt \rho {\varvec{1}}_{p} ) \cdot (1 + \frac{\rho }{1 - \rho } \cdot p)^{ - 1} \cdot (\sqrt \rho {\varvec{1}}_{p}^{\prime } ) \cdot (\frac{1}{1 - \rho })\varvec{I}_{p} \\ & = (\frac{1}{1 - \rho })\varvec{I}_{p} - (\frac{\rho }{1 - \rho })(\frac{1}{(1 - \rho ) + \rho \cdot p})({\varvec{1}}_{p} {\varvec{1}}_{p}^{\prime } ) \to (\frac{1}{1 - \rho })\varvec{I}_{p} = {\varvec{\Psi}}^{ - 1} \\ \end{aligned} $$
(8)

with \( m/p = 1/p \to 0 \) as \( p \to \infty \).

For example, it is easy to show that if \( \rho = 0.5 \), then for p = 10, the diagonal elements of the inverse of the compound symmetry correlation structure are 2 − 1/5.5 = 1.818 and the off-diagonal elements are −1/5.5 = −0.182. At p = 100, the diagonal and the off-diagonal elements become 2 − 1/50.5 = 1.980 and −1/50.5 = −0.0198, respectively. Furthermore, at p = 1000, the diagonal and the off-diagonal elements become 2 − 1/500.5 = 1.998 and −1/500.5 = −0.001998. Again, we see the off-diagonal elements of \( {\varvec{\Sigma}}^{ - 1} \) approaching 0 as p increases. Also, the diagonal elements of \( {\varvec{\Sigma}}^{ - 1} \) approach 2, which is the value of the inverse of the unique variance \( 1/(1 - \rho ) = 2 \) in the FA model.
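These values are easy to reproduce with a short numeric check of the closed form in Eq. (8) (the code and rounding below are ours).

```python
# Sketch: numeric check of Eq. (8) for rho = 0.5.
import numpy as np

rho = 0.5
for p in (10, 100, 1000):
    Sigma = (1 - rho) * np.eye(p) + rho * np.ones((p, p))
    Sigma_inv = np.linalg.inv(Sigma)
    print(p, round(Sigma_inv[0, 0], 4), round(Sigma_inv[0, 1], 4))
    # p=10: 1.8182, -0.1818 ;  p=100: 1.9802, -0.0198 ;  p=1000: 1.998, -0.002
```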

6 Discussion

We discussed the matrix version of the Guttman condition for closeness between FA and PCA. It can be considered an extended Guttman condition in the sense that the matrix version involves not only the diagonal elements but also the off-diagonal elements of the matrices \( {\varvec{\Sigma}}^{ - 1} \) and \( {\varvec{\Psi}}^{ - 1} \). Because \( {\varvec{\Psi}}^{ - 1} \) is a diagonal matrix, the extended Guttman condition implies that the off-diagonal elements of \( {\varvec{\Sigma}}^{ - 1} \) approach zero as the dimension increases. We showed how this phenomenon arises with the compound symmetry example in the Illustration section. We also discussed some implications of the extended Guttman condition, including the ease of inverting \( {\varvec{\Psi}} \) compared with inverting \( {\varvec{\Sigma}} \). Because the ULS estimation method does not involve any inversion of either the sample covariance matrix S or the estimated model-implied population covariance matrix \( {\hat{\varvec{\Sigma }}} \), ULS should be the estimation method of choice when the sample size n is smaller than the number of variables p. Furthermore, we proposed a simple method to approximate \( {\varvec{\Sigma}}^{ - 1} \) by \( {\varvec{\Psi}}^{ - 1} \) using the FA model with equal unique variances, or equivalently, the probabilistic PCA model.

Some other implications of the extended Guttman condition (especially with respect to algorithms) are as follows. First, suppose we add a \( \left( {p + 1} \right) \)th variable at the end of the already existing p variables. Then, while the values of \( \sigma^{jj} \), \( j = 1, \ldots ,p \), can change, the values of \( \psi_{jj}^{ - 1} \), \( j = 1, \ldots ,p \), remain unchanged. Thus, with the extended Guttman condition, only one additional element needs to be computed.

Another implication concerns the ridge estimator, which is among the methods for dealing with singularity of S or of the estimated covariance matrix by introducing a small bias term (see, e.g., Yuan & Chan, 2008, 2016). Warton (2008, Theorem 1) showed that the ridge estimator of the covariance (correlation) matrix \( {\hat{\varvec{\Sigma }}}_{\eta } = \eta {\hat{\varvec{\Sigma }}} + (1 - \eta )\varvec{I}_{p} \) (with the tuning parameter \( \eta \)) is the maximum penalized likelihood estimator with the penalty term proportional to \( - tr({\varvec{\Sigma}}^{ - 1} ) \). Unfortunately, as the dimension p increases (or the ratio \( p/n \) increases), it becomes more difficult to obtain the inverse of the covariance matrix. Therefore, in high dimensions, it is not practical to express the ridge estimator of the covariance matrix in the form of the maximum penalized likelihood with a penalty term involving \( - tr({\varvec{\Sigma}}^{ - 1} ) \). This naturally leads to employing an “approximate” maximum penalized likelihood whose penalty term is approximately proportional to \( - tr({\varvec{\Psi}}^{ - 1} ) \) in place of the penalty term proportional to \( - tr({\varvec{\Sigma}}^{ - 1} ) \), assuming the factor analysis model, when the dimension p is large.

We are aware that, except perhaps for the approximation of the inverse of the covariance matrix, the majority of the implications discussed in this article may be of limited practical utility. For example, because the original Guttman condition, the Schneeweiss condition, and the extended Guttman condition are all conditions for closeness between FA and PCA, we can simply employ PCA as an approximation to FA when the conditions hold. Also, we did not discuss regularized FA with L1 regularization here, which is in itself a very interesting topic. Yet, we think the implications discussed are still of theoretical interest and should continue to be studied. The compound symmetry example used in the Illustration is probably only an approximation to the real world. Extensive simulations will be needed to develop empirical guidelines for how best to apply the theoretical results in practice.