2.1 Introduction

As shown in Chap. 1, a structural equation model can be fitted to the covariance or correlation matrix of the variables of interest, without requiring the raw data. Therefore, if articles report the correlations between the research variables (or information that can be used to estimate the correlations), the results can be used in a meta-analysis. MASEM combines structural equation modeling with meta-analysis by fitting a structural equation model to a meta-analyzed covariance or correlation matrix. As the primary studies in a meta-analysis often involve variables that are measured on different scales, MASEM is commonly conducted on a pooled correlation rather than covariance matrix. In the remainder of this book I will therefore focus on correlation matrices (but see Beretvas and Furlow 2006; Cheung and Chan 2009). MASEM typically consists of two stages (Viswesvaran and Ones 1995). In the first stage, the correlation coefficients are tested for homogeneity across studies and combined into a pooled correlation matrix. In the second stage, a structural equation model is fitted to this pooled correlation matrix. In the next sections I outline the different approaches to pooling correlation coefficients under the assumption that the correlations are homogeneous across studies (fixed effects approaches). Heterogeneity of correlation coefficients and random effects approaches are discussed in Chap. 3.

2.2 Univariate Methods

In the univariate approaches, the correlation coefficients are pooled across studies separately, based on bivariate information only. Dependency among correlation coefficients within a study is not taken into account (as opposed to the multivariate methods described in the next section). A population value is thus estimated for each correlation coefficient separately. For a given correlation coefficient, in each study i the observed coefficient is weighted by the inverse of its estimated sampling variance (the squared standard error), v_i. The sampling variance of the correlation between variables A and B is given by:

$$v_{i\_AB} = \left( 1 - \rho_{i\_AB}^{2} \right)^{2} / n_{i},$$
(2.1)

where n_i is the sample size in study i, and the observed correlation r_i_AB can be plugged in for the unknown population correlation ρ_i_AB. By taking the average of the weighted correlation coefficients across the k studies, one obtains the synthesized estimate of the population correlation:

$$\hat{\rho } = \frac{{\mathop \sum \nolimits_{i = 1}^{k} \frac{1}{{v_{i\_AB} }} r_{i\_AB} }}{{\mathop \sum \nolimits_{i = 1}^{k} \frac{1}{{v_{i\_AB} }} }}.$$
(2.2)

Weighting by the inverse sampling variance ensures that more weight is given to studies with larger sample sizes (and thus smaller sampling variances). Because the sampling variance of a correlation coefficient depends on the absolute value of the correlation itself, some researchers (e.g. Hedges and Olkin 1985) proposed applying Fisher's z-transformation to the correlation coefficients before synthesizing them. The estimated sampling variance v_i of a transformed correlation z in study i is equal to 1/(n_i − 3), where n_i is the sample size in study i. After obtaining the pooled z-value, it can be back-transformed to an r-value for interpretation.
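The two pooling schemes can be sketched in a few lines of Python/NumPy. This is a toy illustration of Eqs. (2.1) and (2.2) and of the Fisher z route, not the metaSEM implementation; the correlations and sample sizes are invented.

```python
import numpy as np

def pool_r(rs, ns):
    """Inverse-variance weighted pooling of one correlation (Eq. 2.2),
    plugging the observed r_i into the variance formula of Eq. 2.1."""
    rs, ns = np.asarray(rs, float), np.asarray(ns, float)
    v = (1 - rs**2) ** 2 / ns          # sampling variances, Eq. 2.1
    w = 1 / v                          # inverse-variance weights
    return np.sum(w * rs) / np.sum(w)

def pool_z(rs, ns):
    """Pooling via Fisher's z: transform, weight by n_i - 3, back-transform."""
    rs, ns = np.asarray(rs, float), np.asarray(ns, float)
    z = np.arctanh(rs)                 # Fisher's z-transformation
    w = ns - 3                         # 1/v_i with v_i = 1/(n_i - 3)
    z_pooled = np.sum(w * z) / np.sum(w)
    return np.tanh(z_pooled)           # back-transform to r

rs, ns = [0.30, 0.25, 0.35], [100, 150, 80]
print(pool_r(rs, ns), pool_z(rs, ns))
```

As the text notes, the two estimates typically differ only slightly; with these invented data both land near .29.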

There is no consensus on whether it is better to use the untransformed correlation coefficient r or the transformed coefficient z in meta-analysis (see Corey et al. 1998). Hunter and Schmidt (1990) argued that averaging r leads to better estimates of the population coefficient than averaging z. However, several simulation studies (Cheung and Chan 2005; Furlow and Beretvas 2005; Hafdahl and Williams 2009) showed that differences between the two methods are generally very small, and that when differences are present, the z approach tends to perform better. If a random effects model is assumed, however, Schulze (2004) recommends r over z.

Once the correlation coefficients are pooled across studies (using the r or z method), one pooled correlation matrix can be constructed from the separate coefficients. The hypothesized structural model is then fitted to this matrix, as if it were an observed matrix in a sample.

Apart from the problem that the correlations are treated as independent of each other within a study, the univariate methods have additional issues (Cheung and Chan 2005). Because not all studies may include all variables, some Stage 1 correlation coefficients will be based on more studies than others. This leads to several problems. First, it may lead to non-positive definite correlation matrices (Wothke 1993), as different elements of the matrix are based on different samples. Non-positive definite matrices cannot be analysed with structural equation modeling. Second, correlation coefficients that are based on fewer studies are estimated with less precision and should receive less weight in the analysis, which is ignored in the standard approaches. Third, if different sample sizes are associated with different correlation coefficients, it is not clear which sample size should be used in Stage 2. One could, for example, use the mean sample size, the median sample size or the total sample size, each leading to different results regarding fit indices and statistical tests in Stage 2. Due to these difficulties, univariate methods are not recommended for MASEM (Becker 2000; Cheung and Chan 2005).

2.3 Multivariate Methods

The two best-known multivariate methods for meta-analysis are the generalized least squares (GLS) method (Becker 1992, 1995, 2009) and the two-stage structural equation modeling (TSSEM) method (Cheung and Chan 2005). Both are explained in the next sections.

2.3.1 The GLS Method

Becker (1992, 1995, 2009) proposed using generalized least squares estimation to pool correlation matrices, taking the dependencies between correlations into account. This means that not only the sampling variances in each study are used to weight the correlation coefficients, but also the sampling covariances. The estimate of the population variance of a correlation coefficient was given in Eq. (2.1). The population covariance between two correlation coefficients, say between variables A and B and between variables C and D, is given by:

$$\begin{aligned} {\text{cov}}\,(\rho_{i\_AB} ,\rho_{i\_CD} ) & = \big( 0.5\,\rho_{i\_AB}\,\rho_{i\_CD} \left( \rho_{i\_AC}^{2} + \rho_{i\_AD}^{2} + \rho_{i\_BC}^{2} + \rho_{i\_BD}^{2} \right) \\ & \quad + \rho_{i\_AC}\,\rho_{i\_BD} + \rho_{i\_AD}\,\rho_{i\_BC} - \left( \rho_{i\_AB}\,\rho_{i\_AC}\,\rho_{i\_AD} + \rho_{i\_AB}\,\rho_{i\_BC}\,\rho_{i\_BD} \right. \\ & \quad \left. + \rho_{i\_AC}\,\rho_{i\_BC}\,\rho_{i\_CD} + \rho_{i\_AD}\,\rho_{i\_BD}\,\rho_{i\_CD} \right) \big) / n_{i} , \\ \end{aligned}$$
(2.3)

where ρ_i indicates a population correlation value in study i and n_i is the sample size in study i (Olkin and Siotani 1976). As the population parameters ρ_i are unknown, estimates of the covariances between correlations can be obtained by plugging in sample correlations for the corresponding ρ_i's in Eq. (2.3). However, because the estimate from a single study is not very stable, it is recommended to use pooled estimates of ρ, obtained as the (weighted) mean correlation across samples (Becker and Fahrbach 1994; Cheung 2000; Furlow and Beretvas 2005). These pooled estimates should then also be used to obtain the variances of the correlation coefficients (by plugging the pooled estimates into Eq. 2.1). In this way, a covariance matrix of the correlation coefficients, denoted V_i, is available for each study in the meta-analysis. The dimensions of V_i may differ across studies; if a study includes three variables and reports the three correlations between them, V_i has three rows and three columns. The values of V_i are treated as known (as opposed to estimated) in the GLS approach. The V_i matrices for all studies are collected in one large block diagonal matrix, V, with the V_i matrix for each study on its diagonal:

$${\mathbf{V}} = \left[ {\begin{array}{*{20}c} {\varvec{V}_{1} } & 0 & \cdots & 0 \\ 0 & {\varvec{V}_{2} } & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & {\varvec{V}_{k} } \\ \end{array} } \right].$$

V is a symmetric matrix with number of rows and columns equal to the total number of observed correlation coefficients across all studies.
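Equation (2.3) can be transcribed directly into code. The Python/NumPy sketch below (an illustration, not the metaSEM implementation) computes the sampling covariance between two correlations given a matrix of (pooled) correlation estimates; the example matrix is invented.

```python
import numpy as np

def cov_corr(r, pair1, pair2, n):
    """Sampling covariance between two correlations from the same study
    (Eq. 2.3; Olkin and Siotani 1976). `r` is a (pooled) correlation
    matrix indexed by variable; `pair1`/`pair2` are variable-index pairs."""
    a, b = pair1
    c, d = pair2
    term1 = 0.5 * r[a, b] * r[c, d] * (r[a, c]**2 + r[a, d]**2
                                       + r[b, c]**2 + r[b, d]**2)
    term2 = r[a, c] * r[b, d] + r[a, d] * r[b, c]
    term3 = (r[a, b] * r[a, c] * r[a, d] + r[a, b] * r[b, c] * r[b, d]
             + r[a, c] * r[b, c] * r[c, d] + r[a, d] * r[b, d] * r[c, d])
    return (term1 + term2 - term3) / n

# Invented pooled correlations for variables A, B, C
R = np.array([[1.0, 0.3, 0.4],
              [0.3, 1.0, 0.5],
              [0.4, 0.5, 1.0]])
print(cov_corr(R, (0, 1), (1, 2), 100))  # cov(r_AB, r_BC) for n = 100
```

A useful sanity check: the covariance of a correlation with itself, cov(r_AB, r_AB), algebraically reduces to the variance (1 − ρ_AB²)²/n of Eq. (2.1).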

To perform the multivariate meta-analysis using the GLS approach, two more matrices are needed: a vector r with the observed correlations from all studies, and a matrix of zeros and ones that indicates which correlation coefficients are present in each study. The vector r is created by stacking the observed correlations of each study into one column vector; its length equals the total number of correlations across all studies. If all k studies included all p variables, r would have k · p(p − 1)/2 elements (one per correlation per study). Most often, not all studies include all research variables, in which case a selection matrix, X, is needed. For example, for a study i that reports r_i_AB and r_i_AC but not r_i_BC, the selection matrix is created by constructing a 3 by 3 identity matrix (a matrix with ones on the diagonal and zeros off-diagonal) and removing the row of the missing correlation. For this study the selection matrix will thus look like this:

$$\left[ {\begin{array}{*{20}c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ \end{array} } \right],$$

and in a study which included all three correlations, the selection matrix will be an identity matrix:

$$\left[ {\begin{array}{*{20}c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{array} } \right].$$

Doing this for all k studies leads to k small matrices, each with three columns and as many rows as there are reported correlations in that study. These matrices are then stacked to create the matrix X, which has three columns and as many rows as the total number of correlation coefficients across studies. That is, X has the same number of rows as the stacked vector of observed correlations, r. Using matrix algebra with these three matrices, the estimates of the pooled correlation coefficients can be obtained:

$$\varvec{\widehat{\rho }} = \left( {\mathbf{X}}^{\text{T}} {\mathbf{V}}^{ - 1} {\mathbf{X}} \right)^{ - 1} {\mathbf{X}}^{\text{T}} {\mathbf{V}}^{ - 1} {\mathbf{r}},$$
(2.4)

where \(\widehat{\varvec{\rho }}\) is a column vector with the estimates of the population correlation coefficients, one element per distinct correlation. The asymptotic covariance matrix of these parameter estimates, V_GLS, is given by:

$${\mathbf{V}}_{\text{GLS}} = \left( {{\mathbf{X}}^{\text{T}} {\mathbf{V}}^{ - 1} {\mathbf{X}}} \right)^{ - 1} .$$
(2.5)
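As a sketch of Eqs. (2.4) and (2.5), the Python/NumPy fragment below pools correlations from two hypothetical studies. All correlations and sampling variances are invented, and for brevity the V_i blocks are taken as diagonal (ignoring the Eq. 2.3 covariances); a real GLS analysis would fill in those off-diagonal elements.

```python
import numpy as np

# Correlation order: (AB, AC, BC). Study 1 reports all three
# correlations; study 2 reports only r_AB.
r = np.concatenate([[0.30, 0.40, 0.50],    # study 1
                    [0.25]])               # study 2
X = np.vstack([np.eye(3),                  # study 1: full selection matrix
               [[1, 0, 0]]])               # study 2: row for r_AB only
# Block diagonal V with invented, diagonal V_i blocks (treated as known)
V = np.diag([0.008, 0.007, 0.006, 0.009])
V_inv = np.linalg.inv(V)
XtVX = X.T @ V_inv @ X
rho_hat = np.linalg.solve(XtVX, X.T @ V_inv @ r)   # Eq. (2.4)
V_gls = np.linalg.inv(XtVX)                        # Eq. (2.5)
print(rho_hat)
```

Note that only the pooled r_AB is a weighted combination of two studies; r_AC and r_BC come from study 1 alone, and V_GLS reflects that difference in precision.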

The only structural model that can be evaluated directly with the GLS method is the regression model. This is done by creating a matrix with the estimated pooled correlations among the independent variables, say R_INDEP, and a vector with the estimated pooled correlations of the independent variables with the dependent variable, say R_DEP, and using the following matrix equation to obtain the vector of regression coefficients B:

$${\mathbf{B}} = {{\mathbf{R}}_{\text{INDEP}}}^{ - 1} {\mathbf{R}}_{\text{DEP}} .$$
(2.6)

This approach is very straightforward (given a program to do the matrix algebra), but it is a major limitation that regression models are the only models that can be estimated this way. In order to fit path models or factor models, one has to use an SEM program with the pooled correlation coefficients as input. Treating the pooled correlation matrix as if it were an observed matrix shares the problems of the univariate methods: it is unclear which sample size should be used, and potential differences in the precision of the correlation coefficients are not taken into account. An alternative way to fit a structural equation model to the pooled correlation matrix is to use the V_GLS matrix as the weight matrix in WLS estimation, similar to TSSEM, which is explained in the next section. For a detailed and accessible description of the GLS method see Becker (1992) and Card (2012).
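As an illustration of Eq. (2.6), the fragment below computes regression coefficients from a hypothetical pooled correlation matrix with two predictors (A, B) and one outcome (Y); because the inputs are correlations, the resulting coefficients are standardized.

```python
import numpy as np

R_indep = np.array([[1.0, 0.3],    # pooled correlations among A and B
                    [0.3, 1.0]])
R_dep = np.array([0.5, 0.4])       # pooled correlations of A, B with Y
# Eq. (2.6): B = R_INDEP^{-1} R_DEP; solve() avoids the explicit inverse
B = np.linalg.solve(R_indep, R_dep)
print(B)
```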

2.3.2 Two Stage Structural Equation Modeling (TSSEM)

The TSSEM method was proposed by Cheung and Chan (2005). In TSSEM, multigroup structural equation modeling is used to pool the correlation coefficients at Stage 1. In Stage 2, the structural model is fitted to the pooled correlation matrix using weighted least squares (WLS) estimation. The weight matrix in the WLS procedure is the inverse of the matrix with asymptotic variances and covariances of the pooled correlation coefficients from Stage 1. This ensures that correlation coefficients that are estimated with more precision (based on more studies) in Stage 1 receive more weight in the estimation of the model parameters in Stage 2. The precision of a Stage 1 estimate depends on the number and the size of the studies that reported the specific correlation coefficient.

Stage 1: Pooling correlation matrices

Let R_i be the p_i × p_i sample correlation matrix and p_i be the number of observed variables in the ith study. Not all studies necessarily include all variables. For example, in a meta-analysis of three variables A, B and C, the correlation matrices of the first three studies may look like this:

$${\mathbf{R}}_{ 1} = \left[ {\begin{array}{*{20}c} 1 & {} & {} \\ {r_{1\_AB} } & 1 & {} \\ {r_{1\_AC} } & {r_{1\_BC} } & 1 \\ \end{array} } \right], \quad {\mathbf{R}}_{ 2} = \left[ {\begin{array}{*{20}c} 1 & {} \\ {r_{2\_AB} } & 1 \\ \end{array} } \right], \,{\text{and}}\quad {\mathbf{R}}_{ 3} = \left[ {\begin{array}{*{20}c} 1 & {} \\ {r_{3\_BC} } & 1 \\ \end{array} } \right].$$

Here, Study 1 contains all variables, Study 2 lacks Variable C, and Study 3 lacks Variable A. Similar to the GLS approach, selection matrices are needed to indicate which correlation coefficients each study included. Note, however, that in TSSEM the selection matrices filter out missing variables, as opposed to missing correlations in the GLS approach, and TSSEM is thus less flexible in handling missing correlation coefficients (see Chap. 4).

In TSSEM the selection matrices are not stacked into one large matrix. For the three studies above, the selection matrices are identity matrices with the rows of the missing variables excluded:

$${\mathbf{X}}_{ 1} = \left[ {\begin{array}{*{20}c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{array} } \right],\quad {\mathbf{X}}_{ 2} = \left[ {\begin{array}{*{20}c} 1 & 0 & 0 \\ 0 & 1 & 0 \\ \end{array} } \right], {\text{and}}\quad {\mathbf{X}}_{ 3} = \left[ {\begin{array}{*{20}c} 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{array} } \right].$$

Next, multigroup structural equation modeling is used to estimate the population correlation matrix R of all p variables (p is three in the example above). Each study is treated as a group. The model for group (study) i is:

$${\varvec{\Sigma}}_{i} = {\mathbf{D}}_{i} \left( {{\mathbf{X}}_{i} {\mathbf{RX}}_{i}^{\text{T}} } \right){\mathbf{D}}_{i} .$$
(2.7)

In this model, R is the p × p population correlation matrix with fixed 1's on its diagonal, X_i is the p_i × p selection matrix that accommodates smaller correlation matrices from studies with missing variables (p_i < p), and D_i is a p_i × p_i diagonal matrix that accounts for differences in the scaling of the variables across studies. Correct parameter estimates can be obtained with maximum likelihood estimation, minimizing the weighted sum of the fit functions across all studies:

$${\text{F}}_{\text{ML}} = \mathop \sum \limits_{i = 1}^{k} \frac{{N_{i} }}{N}{\text{F}}_{{{\text{ML}}_{i} }} ,$$
(2.8)

where N_i is the sample size in study i, N = N_1 + N_2 + ⋯ + N_k, and F_ML_i for each study is given in Eq. (1.3). In words, the model in Eq. (2.7) means that a model is fitted to the correlation matrices of all studies, with the restriction that the population correlations are equal across studies. The diagonal D_i matrices have no substantive meaning, other than reflecting differences in variances across studies. They are needed because the diagonal of R is fixed at 1, while the diagonals of Σ_i do not necessarily equal 1 due to differences in sample variances. Fitting the model from Eq. (2.7) with an SEM program yields estimates of the population correlation coefficients, as well as the associated asymptotic variance-covariance matrix.
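To make the decomposition in Eq. (2.7) concrete, the following NumPy sketch computes the model-implied covariance matrix for Study 3 from the earlier example, which observes only variables B and C. The values in R and D_3 are invented for illustration.

```python
import numpy as np

R = np.array([[1.0, 0.3, 0.4],
              [0.3, 1.0, 0.5],
              [0.4, 0.5, 1.0]])          # common population correlations
X3 = np.array([[0, 1, 0],
               [0, 0, 1]])               # selection matrix for study 3 (B, C)
D3 = np.diag([1.2, 0.9])                 # study-specific scaling (SDs)
# Eq. (2.7): select the 2x2 submatrix for B and C, then rescale
Sigma3 = D3 @ (X3 @ R @ X3.T) @ D3
print(Sigma3)
```

The selection step extracts the B-C correlation block of R, and the D_3 matrices turn those common correlations into study-specific covariances, which is exactly why the diagonal of Σ_i need not equal 1.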

A chi-square measure of fit for the model in Eq. (2.7) is available by comparing its minimum F_ML value with that of a saturated model, obtained by relaxing the restriction that all correlation coefficients are equal across studies. If a separate R_i is estimated for each study, the selection matrices X_i are no longer needed. The model for a specific study then becomes:

$${\varvec{\Sigma}}_{i} = {\mathbf{D}}_{i} {\mathbf{R}}_{i} {\mathbf{D}}_{i} .$$
(2.9)

The difference between the resulting minimum F_ML values of the models in Eqs. (2.9) and (2.7), multiplied by the total sample size minus the number of studies, follows a chi-square distribution with degrees of freedom equal to the difference in the numbers of free parameters. If the chi-square value of this likelihood ratio test is significant, the hypothesis of homogeneity must be rejected (see Chap. 3), and the fixed effects Stage 2 model should not be fitted to the pooled Stage 1 matrix. In the remainder of this chapter we assume that homogeneity holds.

Stage 2: Fitting structural equation models

Cheung and Chan (2005) proposed to use WLS estimation to fit structural equation models to the pooled correlation matrix R that is estimated in Stage 1. Fitting the Stage 1 model provides estimates of the population correlation coefficients in R as well as the asymptotic variances and covariances of these estimates, V. In Stage 2, hypothesized structural equation models can be fitted to R by minimizing the weighted least squares fit function (also known as the asymptotically distribution free fit function; Browne 1984):

$${\text{F}}_{\text{WLS}} = \left( {{\mathbf{r}} - {\mathbf{r}}_{\text{MODEL}} } \right)^{\text{T}} {\mathbf{V}}^{ - 1} \left( {{\mathbf{r}} - {\mathbf{r}}_{\text{MODEL}} } \right),$$
(2.10)

where r is a column vector with the unique elements of R, r_MODEL is a column vector with the unique elements of the model-implied correlation matrix (R_MODEL), and V⁻¹ is the inverse of the matrix of asymptotic variances and covariances, used as the weight matrix. For example, in order to fit a factor model with q factors, one would specify R_MODEL as

$${\mathbf{R}}_{\text{MODEL}} = {\varvec{\Lambda} \varvec{\Phi} \varvec{\Lambda }}^{\text{T}} + {\varvec{\Theta}},$$
(2.11)

where Φ is a q by q covariance matrix of the common factors, Θ is a p by p (diagonal) matrix with residual variances, and Λ is a p by q matrix with factor loadings. Minimizing the WLS fit function leads to correct parameter estimates with appropriate standard errors and a WLS-based chi-square test statistic, T_WLS (Cheung and Chan 2005; Oort and Jak 2015).
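A minimal numerical sketch of Eqs. (2.10) and (2.11) for a one-factor model with three indicators is given below. The Stage 1 output (r and V) and the loadings are invented; in practice the loadings would be found by minimizing F_WLS with an SEM program such as metaSEM, not fixed by hand.

```python
import numpy as np

lam = np.array([0.7, 0.6, 0.5])              # factor loadings (Lambda)
# Eq. (2.11) with Phi = 1; Theta only affects the diagonal, which is
# fixed at 1 in a correlation matrix, so it drops out of the unique
# off-diagonal elements used in Eq. (2.10).
R_model = np.outer(lam, lam)
r_model = R_model[np.tril_indices(3, k=-1)]  # unique elements (AB, AC, BC)
r = np.array([0.45, 0.38, 0.28])             # pooled Stage 1 correlations
V = np.diag([0.004, 0.005, 0.006])           # asymptotic (co)variances
diff = r - r_model
F_wls = diff @ np.linalg.inv(V) @ diff       # Eq. (2.10)
print(F_wls)
```

Because V⁻¹ weights each discrepancy by the precision of the corresponding pooled correlation, well-estimated correlations contribute more to F_WLS than poorly estimated ones.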

One can also use the pooled correlation matrix and asymptotic covariance matrix from the GLS approach to fit the Stage 2 model with WLS estimation. Cheung and Chan (2005) compared the TSSEM method with the GLS method and the univariate methods. The GLS method in their study was based on Eq. (2.3), using the individual study correlation coefficients rather than the pooled correlation coefficients recommended by Becker and Fahrbach (1994) to calculate the sampling weights. Their simulations showed that the GLS method rejects homogeneity of correlation matrices too often and leads to biased parameter estimates at Stage 2. The univariate methods lead to inflated Type 1 errors, while the TSSEM method yields unbiased parameter estimates and false positive rates close to the expected rates. The statistical power to reject an underspecified factor model was extremely high for all four methods. Overall, the TSSEM method performed best. Software to apply TSSEM is readily available in the R package metaSEM (Cheung 2015), which relies on the OpenMx package (Boker et al. 2011). This package can also be used for the GLS approach and the univariate approaches. More information about the software that can be used to perform MASEM can be found in Chap. 4.