Mean difference can be used as an effect size measure if the outcome variable has the same unit of measurement for both the treatment/intervention and placebo/control groups. The raw mean difference can be scaled by the inverse variance weight to define weighted mean difference (WMD). Unlike the SMD, the WMD retains the same unit of measurement as the outcome variable.

Meta-analysis of weighted mean difference (WMD) is covered in this chapter. It provides meta-analysis of WMD under different statistical models along with subgroup analysis with illustrative examples.

1 Weighted Mean Difference

For two-arm experiments/studies, the difference between the two means of an outcome variable is a good starting point for measuring the effect size. The raw mean difference is simply the difference of the means of the two arms. It is not essential to standardize the raw mean difference as an effect size measure unless the outcome variable is measured in different units. In many cases the raw mean difference is used as an effect size measure, but weighted by the inverse variance. This process produces the weighted mean difference (WMD) as a measure of effect size. The WMD retains the same unit of measurement as the outcome variable and is used in many meta-analyses.

2 Estimation of Effect Size

Consider an experiment or study with patients randomly divided into two arms: the treatment group, with mean of the outcome variable (say Y) equal to \( \mu_{1} \) (or \( \mu_{T} \)), and the control/placebo group, with mean \( \mu_{2} \) (or \( \mu_{P} \)). Based on a random sample of size \( n_{1} \) from the treatment group, let the sample mean be \( \hat{\mu }_{1} = \bar{Y}_{1} \) and the sample variance be \( \hat{\sigma }_{1}^{2} = S_{1}^{2} . \) Similarly, for another random sample of size \( n_{2} \) from the placebo group, let the sample mean be \( \hat{\mu }_{2} = \bar{Y}_{2} \) and the sample variance be \( \hat{\sigma }_{2}^{2} = S_{2}^{2} . \) The sample means and variances are used as estimates of the respective population means and variances. Assume that the two samples are independent and that the populations are normally distributed.

Then the population raw mean difference \( \delta = \mu_{1} - \mu_{2} \) is estimated by the sample mean difference \( \hat{\delta } = \bar{Y}_{1} - \bar{Y}_{2} . \) As discussed in the previous chapter, the standard error (SE) of the estimator of \( \delta \) depends on whether the two population standard deviations are equal or not. If it is unknown whether the population standard deviations are equal, we will assume that they are not equal and use the appropriate formula to calculate the variance and SE.

For any individual study, let us define the population weighted mean difference (WMD) as \( \theta = \omega \delta \), where \( \omega = 1/\sigma^{2} \) is the inverse variance weight of the population mean difference \( \delta \) and \( \sigma^{2} \) is the population variance of the estimator of \( \delta . \) The population WMD, \( \theta \), is an unknown parameter. An estimator of \( \theta \) is given by its sample counterpart \( \hat{\theta } = w\hat{\delta }, \) where \( w = 1/v \) is the sample weight, \( v = \hat{\sigma }^{2} \) is the sample variance, and \( \hat{\delta } = \bar{Y}_{1} - \bar{Y}_{2} \) is the estimate of the unknown population mean difference \( \delta . \)

In a meta-analysis with i = 1, 2, …, k independent studies, the sample WMD of the ith study is defined as \( \hat{\theta }_{i} = w_{i} \hat{\delta }_{i} \) with standard error \( S\!E_{i} = \sqrt {v_{i} } . \)

So the estimate of the common effect size of all studies is given by \( \hat{\theta } = \frac{{\sum\limits_{i = 1}^{k} {w_{i} \hat{\delta }_{i} } }}{{\sum\limits_{i = 1}^{k} {w_{i} } }} \) and the standard error of the estimator of \( \theta \) becomes \( S\!E(\hat{\theta }) = \sqrt {\frac{1}{{\sum\limits_{i = 1}^{k} {w_{i} } }}} . \)

Then the \( (1 - \alpha ) \times 100\% \) confidence interval for population WMD, \( \theta \) is given by the lower limit (LL) and upper limit (UL) as follows:

$$ \begin{aligned} L\!L &= \hat{\theta } - z_{{\tfrac{\alpha }{2}}} \times S\!E(\hat{\theta })\;\;\text{and} \\ U\!L &= \hat{\theta } + z_{{\tfrac{\alpha }{2}}} \times S\!E(\hat{\theta }), \end{aligned} $$

where \( z_{{\tfrac{\alpha }{2}}} \) is the critical value of standard normal distribution leaving \( \tfrac{\alpha }{2} \) area on the upper (or lower) tail of the normal curve.

Example 9.1

Consider the summary data on blood loss for the Laparoscopic-assisted Rectal Resection (LARR) versus Open Rectal Resection (ORR) for Carcinoma from eleven independent studies from Memon et al. 2018.

For the above data find the (i) raw mean difference (mean LARR–mean ORR) and standard error of the estimator of population mean difference, (ii) calculate 95% confidence interval for the population mean difference of each of the studies, and (iii) the weight for the first (Bonjer et al. 2015) study.

Solution:

The calculated values of the mean difference (MD), variance of mean difference (Var), standard error of mean difference (SE), weight as inverse variance (W), lower limit (LL) and upper limit (UL) of 95% confidence interval, and sum of W and sum of product of W and MD are shown in Table 9.2.

Explanations of calculations in Table 9.2

To answer the questions in Example 9.1, consider the calculations for the first study (Bonjer et al. 2015):

(i) The raw mean difference is \( \hat{\delta}_{1} =\) MD = mean of the LARR group − mean of the ORR group = 200 − 400 = −200.

The variance of the mean difference (assuming population variances are unequal) is

$$ {\text{Var}}\, = \,\frac{{S_{L}^{2} }}{{n_{L} }} + \frac{{S_{O}^{2} }}{{n_{O} }} = \frac{{222^{2} }}{699} + \frac{{370^{2} }}{345} = 467.318. $$

Then the standard error becomes

SE = \( \sqrt {467.318} = 21.6175. \)

(ii) The lower limit of the 95% confidence interval is

LL = \( - 200 - 1.96 \times 21.6175 = - 242.37 \) and upper limit is

UL = \( - 200 + 1.96 \times 21.6175 = - 157.63 \).

(iii) The weight (for the first study) is W = \( \frac{1}{Var} = \frac{1}{467.32} = 0.00214 \) and W × MD = \( 0.00214 \times ( - 200) = - 0.428. \)
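The arithmetic for the first study can be checked with a few lines of Python. This is only a minimal sketch: the function name `mean_difference_summary` and the dictionary keys are assumptions made for illustration, and the inputs are the values quoted above for the first study (means 200 and 400, standard deviations 222 and 370, sample sizes 699 and 345).

```python
import math


def mean_difference_summary(mean1, sd1, n1, mean2, sd2, n2, z=1.96):
    """Raw mean difference with unequal-variance SE, CI and inverse variance weight."""
    md = mean1 - mean2
    # Variance of the mean difference, assuming unequal population variances
    var = sd1 ** 2 / n1 + sd2 ** 2 / n2
    se = math.sqrt(var)
    return {
        "md": md,
        "var": var,
        "se": se,
        "ll": md - z * se,   # lower confidence limit
        "ul": md + z * se,   # upper confidence limit
        "w": 1.0 / var,      # inverse variance weight
    }


# First study: LARR (mean 200, SD 222, n 699) versus ORR (mean 400, SD 370, n 345)
first = mean_difference_summary(200, 222, 699, 400, 370, 345)
for name, value in first.items():
    print(f"{name}: {value:.5f}")
```

Applying the same function to each row of the summary data would give the remaining rows of the table of calculated values.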

3 Tests on Effect Size

To test the significance of the unknown common effect size, \( \theta \), test the null hypothesis

\( H_{0} :\theta = 0 \) against \( H_{A} :\theta \ne 0 \) using the test statistic

\( Z = \frac{{\hat{\theta }}}{{S\!E(\hat{\theta })}} \) which follows a standard normal distribution.

For a two-tailed test, reject \( H_{0} \) at the \( \alpha \) level of significance (in favour of the alternative hypothesis) if the observed (or calculated) value of Z statistic satisfies \( |z_{0} | \ge z_{\alpha /2} \); otherwise don’t reject the null hypothesis (of a two-sided test).

Example 9.2

Consider the blood loss data from eleven independent studies in Table 9.1.

Table 9.1 Summary statistics of blood loss of eleven studies on sample size, mean and standard deviation of LARR and ORR groups

Test the significance of the common effect size, \( \theta . \)

Solution:

To test the significance of the unknown common effect size \( \theta \), test the null hypothesis

\( H_{0} :\theta = 0 \) against \( H_{A} :\theta \ne 0 \) using the test statistic Z, whose observed value is

$$ z_{0} = \frac{{\hat{\theta }}}{{S\!E(\hat{\theta })}} = \frac{ - 73.2411}{3.043199} = - 24.0671. [\text{see Example 9.3 for details}]$$

The P-value is \( P(|Z| > 24.07) = 2 \times P(Z > 24.07) \approx 0. \)

Since the P-value is practically 0, the test is highly significant. Thus there is strong sample evidence that the common mean difference is different from 0. In other words, the mean blood loss in the LARR group is significantly different from that of the ORR group.
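The test can be reproduced numerically. The sketch below is illustrative only: the helper name `z_test` is an assumption, and the two-sided P-value is obtained from the complementary error function in the Python standard library rather than from a normal table. It shows that the P-value is extremely small rather than exactly zero.

```python
import math


def z_test(theta_hat, se):
    """Two-sided z test of H0: theta = 0. Returns (z, p_value)."""
    z = theta_hat / se
    # P(|Z| > |z|) for a standard normal Z equals erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Point estimate and standard error quoted in the example
z0, p = z_test(-73.2411, 3.043199)
print(f"z0 = {z0:.4f}, two-sided P-value = {p:.3e}")
```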

4 Fixed Effect (FE) Model

The fixed effect (FE) model is used if there is no significant heterogeneity of effect size among the independent studies. In this section, the FE model is presented in a general framework for the meta-analysis of WMD with an example. An introduction to the FE model is found in Borenstein et al. (2010).

In meta-analysis, results from all the k independent studies are combined by pooling the summary statistics of primary studies to a single point estimate and confidence interval for the common population effect size \( \theta \). Under the fixed effect model, the common effect size estimator, WMD (=\( \theta \)) is given by

\( \hat{\theta }_{F\!E} = \frac{{\sum\limits_{i = 1}^{k} {w_{i} \hat{\delta }_{i} } }}{{\sum\limits_{i = 1}^{k} {w_{i} } }} \) and the variance of the estimator of the common effect size is \( V\!ar(\hat{\theta }_{F\!E} ) = \frac{1}{{\sum\limits_{i = 1}^{k} {w_{i} } }} \). Hence the standard error of the estimator of the common effect size is \( S\!E(\hat{\theta }_{F\!E} ) = \sqrt {\frac{1}{{\sum\limits_{i = 1}^{k} {w_{i} } }}} \).

The confidence interval

The \( \left( {1 - \alpha } \right) \times 100\% \) confidence interval for the common population WMD \( \theta \) based on the sample estimates is given by the lower limit (LL) and upper limit (UL) as follows:

$$ \begin{aligned} L\!L &= \hat{\theta }_{F\!E} - z_{\alpha /2} \times S\!E(\hat{\theta }_{F\!E} )\;\;\text{and} \\ U\!L &= \hat{\theta }_{F\!E} + z_{\alpha /2} \times S\!E(\hat{\theta }_{F\!E} ). \end{aligned} $$

Here \( z_{\alpha /2} \) is the \( \tfrac{\alpha }{2} \) th cut-off point of standard normal distribution and \( S\!E(\hat{\theta }_{F\!E} ) = \sqrt {V\!ar(\hat{\theta }_{F\!E} )} \).

To compute the confidence interval and perform the test on the population effect size \( \theta \) under the FE model, we need to compute the point estimate and the standard error of the estimator for all studies.

Example 9.3

Consider the summary data on blood loss for the Laparoscopic-assisted Rectal Resection (LARR) versus Open Rectal Resection (ORR) for Carcinoma from Table 9.1.

Using the summary statistics at the bottom of Table 9.2 calculate the (i) point estimate of the population WMD, (ii) standard error of estimator and (iii) 95% confidence interval for the population WMD of blood loss under the fixed effect model.

Solution:

Table 9.2 Calculated values of the summary statistics for the mean difference of blood loss

To answer the above questions, we use the summary statistics in the last row of Table 9.2 as follows:

(i) The estimate of the population WMD is obtained as the sample WMD, \( \hat{\theta }_{F\!E} = \frac{{\sum\limits_{1}^{11} {w_{i} \hat{\delta }_{i} } }}{{\sum\limits_{1}^{11} {w_{i} } }} = \frac{{\sum\limits_{1}^{11} {W\!M\!D} }}{{\sum\limits_{1}^{11} W }} = \frac{ - 7.9085}{0.10798} = - 73.2411 \approx - 73.24. \)

(ii) The standard error is \( S\!E(\hat{\theta }_{F\!E} ) = \sqrt {\frac{1}{{\sum\limits_{1}^{11} {w_{i} } }}} = \sqrt {\frac{1}{{\sum\limits_{1}^{11} W }}} = \sqrt {\frac{1}{0.10798}} = 3.043199. \)

(iii) The 95% confidence interval for the common population WMD, \( \theta \), is given by

LL = \( \hat{\theta }_{F\!E} - 1.96 \times S\!E(\hat{\theta }_{F\!E} ) = - 73.2411 - 1.96 \times 3.043199 = - 79.2058 \approx - 79.21 \) and

UL = \( \hat{\theta }_{F\!E} + 1.96 \times S\!E(\hat{\theta }_{F\!E} ) = - 73.2411 + 1.96 \times 3.043199 = - 67.2764 \approx - 67.28. \)
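The fixed effect calculation can be sketched in Python as follows. The function name `fixed_effect` and the three-study input are hypothetical and chosen only to illustrate the formulas; they are not the values of Table 9.2. The last lines cross-check the worked example from the reported column totals, so small differences from the figures above can arise from rounding of those totals.

```python
import math


def fixed_effect(mds, variances, z=1.96):
    """Inverse variance fixed effect pooling of raw mean differences."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    theta = sum(w * d for w, d in zip(weights, mds)) / sum_w
    se = math.sqrt(1.0 / sum_w)
    return theta, se, theta - z * se, theta + z * se


# Hypothetical three-study input (NOT the Table 9.2 values)
theta, se, ll, ul = fixed_effect([-200.0, -50.0, -80.0], [400.0, 100.0, 250.0])
print(f"toy data: estimate = {theta:.2f}, 95% CI = ({ll:.2f}, {ul:.2f})")

# Cross-check of the worked example from the reported totals of W and W x MD
sum_w, sum_wmd = 0.10798, -7.9085
theta_table = sum_wmd / sum_w
se_table = math.sqrt(1.0 / sum_w)
print(f"from totals: estimate = {theta_table:.2f}, SE = {se_table:.4f}")
```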

Comment

The above point estimate (−73.24) and confidence limits (−79.21, −67.28) are displayed in the bottom row of forest plot and represented by the diamond as in Fig. 9.1.

Fig. 9.1 Forest plot of meta-analysis on blood loss for the LARR and ORR groups under FE model

Measuring Heterogeneity

Here we consider two popular methods to identify and measure the extent of heterogeneity among the effect sizes of independent studies.

Cochran’s Q Statistic (Cochran, 1973)

The Cochran’s Q is defined as

$$ Q = \sum\limits_{i = 1}^{k} {w_{i} } \hat{\theta }_{i}^{2} - \frac{{\left( {\sum\limits_{i = 1}^{k} {w_{i} } \hat{\theta }_{i} } \right)^{2} }}{{\sum\limits_{i = 1}^{k} {w_{i} } }}, $$

where \( w_{i} \) is the weight and \( \hat{\theta }_{i} \) is the effect size estimate of the ith study.

Under the null hypothesis of equal effect sizes, the above Q statistic follows a chi-squared distribution with \( d\!f = (k - 1), \) where k is the number of studies included in the meta-analysis. Since the expected value of a chi-squared variable is its degrees of freedom, the expected value of Q is (k-1), that is, \( E\left( Q \right) = (k - 1) = df. \)

Test of Heterogeneity

To test the null hypothesis of the equality of effect sizes (i.e. equality excluding random error) across all studies, test

\( H_{0} :\,\theta_{1} \, = \,\theta_{2} \, = \, \ldots \, = \,\theta_{k} \, = \,\theta \) against \( H_{a} : \) not all \( \theta_{i} \)’s are equal (at least one of them is different), using the Cochran’s Q statistic as defined above.

Reject the null hypothesis at the \( \alpha \) level of significance if the observed value of the Q statistic is larger than or equal to \( \chi_{k - 1,1 - \alpha }^{2} \), the level \( \alpha \) critical value of the chi-squared distribution with (k-1) df, such that \( P\left( {\chi_{k - 1}^{2} \ge \chi_{k - 1,1 - \alpha }^{2} } \right) = \alpha \); otherwise don’t reject it.

A small P-value leads to the conclusion that there is a true difference among the effect sizes. However, a non-significant P-value does not necessarily mean that the effect sizes are the same, as this could be due to the low power of the test. The test should not be used to measure the magnitude of the true dispersion.

The \( I^{2} \) Statistic (Higgins et al. 2003)

The \( I^{2} \) statistic is the ratio of excess variation to total variation, expressed as a percentage, as follows:

$$ I^{2} = \left( {\frac{Q - df}{Q}} \right) \times 100\% , $$

and is viewed as the proportion of the total variation (within plus between studies variation) that is due to between studies variation.

Comment

The values of Q and \( I^{2} \) statistics are calculated from the sample summary data and they are not dependent on any statistical models.

Example 9.4

Consider the summary data on blood loss for the Laparoscopic-assisted Rectal Resection (LARR) versus Open Rectal Resection (ORR) for Carcinoma from Table 9.1.

Find the value of (i) Q statistic and (ii) \( I^{2} \) statistic for the blood loss data.

Solution:

The following Table 9.3 provides summary calculations for finding Q and \( I^{2} \) statistics.

Table 9.3 Calculated values of the summary statistics for the mean difference of blood loss data

From the summary statistics in Table 9.3:

(i) The Q statistic is calculated as

$$ \begin{aligned} Q\, & = \,\sum\limits_{i = 1}^{k} {w_{i} } \hat{\delta }_{i}^{2} \, - \,\frac{{\left( {\sum\limits_{i = 1}^{k} {w_{i} } \hat{\delta }_{i} } \right)^{2} }}{{\sum\limits_{i = 1}^{k} {w_{i} } }}\, = \,\sum\limits_{i = 1}^{k} {W\!M\!D^{2} } \, - \,\frac{{\left( {\sum\limits_{i = 1}^{k} {W\!M\!D} } \right)^{2} }}{{\sum\limits_{i = 1}^{k} W }}\, \\ & = \,638.3923\, - \,\frac{{( - 7.9085)^{2} }}{0.10798}\, = \,59.17. \\ \end{aligned} $$

To test the heterogeneity of effect sizes the P-value is found from the chi-squared table (with df = (11 − 1) = 10) as

\( P\left( {\chi_{10}^{2} \ge 59.17} \right) \approx 0. \) Note that there is virtually no area left under the chi-squared density curve to the right of 59.17. Since the P-value is close to 0, we reject the null hypothesis (of equal effect sizes).

(ii) The \( I^{2} \) statistic is found to be

$$ \begin{aligned} I^{2} \, & = \,\left( {\frac{Q\, - \,d\!f}{Q}} \right)\, \times \,100\% \, = \,\left( {\frac{59.17\, - \,(11\, - \,1)}{59.17}} \right)\, \times \,100\% \, \\ & = \,0.83098\, \times \,100\% \, = \,83\% , \\ \end{aligned} $$

where \( d\!f = (k - 1) = (11 - 1) = 10. \)

The above value of Q = 59.17, its P-value = 0, and \( I^{2} = 83\% \) are presented in the forest plot produced by MetaXL on the left panel of Fig. 9.1.
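The values of Q and \( I^{2} \) can be reproduced from the three column totals used above. The sketch below is illustrative: the function names are assumptions, truncating \( I^{2} \) at 0 when Q is smaller than df is a common convention that is not stated in the text, and the tail probability uses a closed form that holds only for even degrees of freedom (here df = 10), so no statistical library is needed.

```python
import math


def cochran_q(sum_w, sum_wd, sum_wd2):
    """Cochran's Q from the totals of W, W x MD and W x MD^2."""
    return sum_wd2 - sum_wd ** 2 / sum_w


def i_squared(q, df):
    """I^2 in percent; set to 0 when Q < df (assumed convention)."""
    return max(0.0, (q - df) / q) * 100.0


def chi2_upper_tail_even_df(x, df):
    """P(chi-squared_df >= x) for even df, via the finite Poisson-type sum."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j      # (x/2)^j / j!
        total += term
    return math.exp(-half) * total


# Totals quoted in the example: sum W, sum W x MD, sum W x MD^2
q = cochran_q(0.10798, -7.9085, 638.3923)
df = 11 - 1
i2 = i_squared(q, df)
p_q = chi2_upper_tail_even_df(q, df)
print(f"Q = {q:.2f}, I^2 = {i2:.1f}%, P-value = {p_q:.2e}")
```

The computed P-value is tiny but positive, which is why it is reported as 0 to the displayed number of decimals.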

Forest plot for WMD under FE model using MetaXL

The forest plot under the FE model (indicated by “IV” in the code) is constructed using MetaXL code

=MAInputTable("Blood Loss WMD FE","WMD","IV",B6:H16)

Remark: Explanations of MetaXL Code

For this type of meta-analysis in MetaXL, the code opens with the MA Input Table function, ‘=MAInputTable’. This is followed by an opening parenthesis. Inside it, the first quoted argument contains the text that appears as the title of the forest plot output, e.g. “Blood Loss WMD FE” in the above code (the user may choose any appropriate title; FE is chosen here to indicate the fixed effect model). The second quoted argument is the type of effect measure, e.g. “WMD” in the above code, which indicates that the weighted mean difference is the effect size. The third quoted argument is the statistical model, e.g. “IV” in the above code stands for the fixed effect (FE) model. Each quoted argument is followed by a comma, and after the last comma comes the data range in the Excel worksheet, e.g. B6:H16 in the above code indicates that the data on the independent studies are taken from those cells. The code ends with a closing parenthesis.

The forest plot of the meta-analysis using the above MetaXL code is found in Fig. 9.1.

Interpretation

From the above forest plot of WMD under the FE model, the estimated common effect size is −73.24, and the 95% confidence interval is (−79.21, −67.28). The effect size is highly statistically significant (as 0 is not included in the 95% confidence interval).

Here Cochran’s Q = 59.17 with P-value = 0 indicates highly significant heterogeneity among the mean difference of blood loss between the LARR and ORR groups of independent studies. The \( I^{2} = \) 83% also reflects that there is high heterogeneity among the studies.

Remark

The sign of the mean difference (MD = \( \hat{\delta} \)) and the subsequent estimates of the common effect size are all negative because of the way the mean difference is defined here: mean of the LARR group minus mean of the ORR group. If the order of the difference is reversed, that is, if the mean of the LARR group is subtracted from that of the ORR group, then the signs of the MD and the other estimates are reversed (negative to positive and vice versa). The results of the meta-analysis on the reversed MD = \( \hat{\delta} \) are shown in Fig. 9.2.

Fig. 9.2 Forest plot of meta-analysis on blood loss for the LARR and ORR groups in reversed order (mean ORR − mean LARR) under FE model

Comment

The interpretation of the forest plot in Fig. 9.2, a significantly different WMD and significant heterogeneity, is the same as that in Fig. 9.1. The point estimates and confidence limits are the same in magnitude as in the previous forest plot, with the minus sign replaced by the plus sign (so the lower and upper limits swap roles), while the Q statistic and its P-value are unchanged. Thus the order of the raw mean difference does not affect the final conclusion of the meta-analysis.
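The effect of reversing the order of subtraction can be illustrated with a small numerical check. The data below are hypothetical (they are not the blood loss data) and the helper name `pool_fixed` is an assumption. Here Q is computed as \( \sum w_{i} (\hat{\delta}_{i} - \hat{\theta})^{2} \), which is algebraically the same as the computational form given earlier.

```python
import math


def pool_fixed(mds, variances):
    """Fixed effect estimate, its standard error and Cochran's Q."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    theta = sum(w * d for w, d in zip(weights, mds)) / sum_w
    q = sum(w * (d - theta) ** 2 for w, d in zip(weights, mds))
    return theta, math.sqrt(1.0 / sum_w), q


# Hypothetical mean differences and their variances
mds = [-200.0, -50.0, -80.0]
variances = [400.0, 100.0, 250.0]

theta, se, q = pool_fixed(mds, variances)
# Reversing the order of subtraction flips the sign of every mean difference
theta_rev, se_rev, q_rev = pool_fixed([-d for d in mds], variances)

print(f"original: estimate = {theta:.2f}, SE = {se:.3f}, Q = {q:.2f}")
print(f"reversed: estimate = {theta_rev:.2f}, SE = {se_rev:.3f}, Q = {q_rev:.2f}")
```

Only the sign of the pooled estimate (and hence of the confidence limits) changes; the standard error and Q are the same in both runs.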

5 Random Effects (REs) Model

The random effects (REs) model is used when the effect sizes across the independent studies are significantly heterogeneous. This model was introduced by DerSimonian and Laird (1986). In spite of its frequent use, some valid criticisms of this model and its poor performance compared with the inverse variance heterogeneity (IVhet) and quality effects (QE) models are provided in Doi et al. (2015a, b, c).

Under the random effects model, the population variance of the effect size is the sum of the within-study variance \( \sigma^{2} \) (the variance of \( \hat{\theta }_{i} \) about \( \theta \)) and the between-study variance \( \tau^{2} \). So, for the ith study, the unknown modified variance \( \sigma_{i}^{*2} = \sigma_{i}^{2} + \tau^{2} \) is estimated by its sample counterpart \( v_{i}^{*} = v_{i} + \hat{\tau }^{2} , \) where \( v_{i} \) is the estimate of \( \sigma_{i}^{2} \) and \( \hat{\tau }^{2} \) is the estimate of \( \tau^{2} \). Therefore, the weight assigned to the ith study is defined as

\( w_{i}^{*} = \frac{1}{{v_{i} + \hat{\tau }^{2} }} \) for i = 1, 2, …, k.

The common effect size under the REs model is estimated by

$$ \hat{\theta }_{R\!E} = \frac{{\sum\limits_{i = 1}^{k} {w_{i}^{*} \hat{\delta }_{i} } }}{{\sum\limits_{i = 1}^{k} {w_{i}^{*} } }}. $$

The standard error of the estimator of the common effect size is given by

$$ S\!E(\hat{\theta }_{R\!E} ) = \sqrt {\frac{1}{{\sum\limits_{i = 1}^{k} {w_{i}^{*} } }}} . $$

The \( (1 - \alpha ) \times 100\% \) confidence interval for the effect size \( \theta \) under the REs model is given by the lower limit (LL) and upper limit (UL) as follows:

$$ \begin{aligned} L\!L &= \hat{\theta }_{R\!E} - z_{\alpha /2} \times S\!E(\hat{\theta }_{R\!E} )\;\;\text{and} \\ U\!L &= \hat{\theta }_{R\!E} + z_{\alpha /2} \times S\!E(\hat{\theta }_{R\!E} ), \end{aligned}$$

where \( z_{\alpha /2} \) is the \( \tfrac{\alpha }{2} \) th cut-off point of standard normal distribution.

Estimation of \( \tau^{2} \)

The between studies variance is estimated as a scaled excess variation as follows

$$ \hat{\tau }^{2} = \frac{Q - d\!f}{C}, $$

where

$$ Q = \sum\limits_{i = 1}^{k} {w_{i} \hat{\delta }_{i}^{2} } - \frac{{\left( {\sum\limits_{i = 1}^{k} {w_{i} \hat{\delta }_{i} } } \right)^{2} }}{{\sum\limits_{i = 1}^{k} {w_{i} } }},\;\;C = \sum\limits_{i = 1}^{k} {w_{i} } - \frac{{\sum\limits_{i = 1}^{k} {w_{i}^{2} } }}{{\sum\limits_{i = 1}^{k} {w_{i} } }} $$

and \( d\!f = (k - 1) \) in which k is the number of studies.

Example 9.5

Consider the summary data on blood loss for the Laparoscopic-assisted Rectal Resection (LARR) versus Open Rectal Resection (ORR) for Carcinoma from Table 9.1.

Find the estimated value of between studies variance, \( \hat{\tau }^{2} \), for the blood loss data.

Solution:

Using the summary statistics in Table 9.3 of the previous example we calculate the values of Q and C statistics which are required to find the value of \( \hat{\tau }^{2} \). Here

$$ Q = \sum\limits_{i = 1}^{k} {w_{i} } \hat{\delta}_{i}^{2} - \frac{{\left( {\sum\limits_{i = 1}^{k} {w_{i} } \hat{\delta}_{i} } \right)^{2} }}{{\sum\limits_{i = 1}^{k} {w_{i} } }} = 638.3923 - \frac{{( - 7.9085)^{2} }}{0.10798} = 59.17, $$

\( C = \sum\limits_{i = 1}^{11} {w_{i} } - \frac{{\sum\limits_{i = 1}^{11} {w_{i}^{2} } }}{{\sum\limits_{i = 1}^{11} {w_{i} } }} = 0.10798 - \frac{0.007742}{0.10798} = 0.0363 \) and \( d\!f = k - 1 = 11 - 1 = 10. \)

Then, the estimate of the between studies variance becomes

\( \hat{\tau }^{2} = \frac{Q - d\!f}{C} = \frac{59.17 - 10}{0.0363} = 1355.029. \)
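The moment estimate of the between studies variance can be checked as follows. This is a sketch under stated assumptions: the function name `tau_squared` is hypothetical, and truncating the estimate at 0 when Q is smaller than df is a common convention not stated in the text. Because the inputs are the rounded totals quoted above, the result agrees with the reported value only approximately.

```python
def tau_squared(q, df, sum_w, sum_w2):
    """Moment estimate (Q - df) / C of the between studies variance.

    Returns (tau2, C). The estimate is truncated at 0 (assumed convention).
    """
    c = sum_w - sum_w2 / sum_w
    return max(0.0, (q - df) / c), c


# Q, df and the totals of W and W^2 quoted in the example
tau2, c = tau_squared(59.17, 10, 0.10798, 0.007742)
print(f"C = {c:.4f}, tau^2 = {tau2:.1f}")
```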

Illustration of REs Model for WMD

Example 9.6

Consider the summary data on blood loss for the Laparoscopic-assisted Rectal Resection (LARR) versus Open Rectal Resection (ORR) for Carcinoma from Table 9.1.

Find the (i) point estimate of the combined population WMD, (ii) standard error of the estimator, and (iii) 95% confidence interval of the population effect size, WMD under the random effects model.

In Table 9.4 the combined variance (Var*) is the sum of Var and Tau^2, and modified weight (W*) is the weight under the REs model calculated as the reciprocal of Var*.

Table 9.4 Calculated values of the summary statistics for the REs model

As an illustration, for Study 1 (Bonjer et al. 2015), the modified variance for the REs model is found to be

\( v_{1}^{*} = v_{1} + \hat{\tau }^{2} = 467.318 + 1355.03 = 1822.3 \) and the modified weight becomes \( w_{1}^{*} = \frac{1}{{V\!ar_{1}^{*} }} = \frac{1}{1822.3} = 0.00055. \)

Now using the summary statistics from Table 9.4 we get

(i) the point estimate of the common effect size, WMD, under the REs model to be

$$ \hat{\theta }_{R\!E} = \frac{{\sum\limits_{i = 1}^{11} {w_{i}^{*} \hat{\delta}_{i} } }}{{\sum\limits_{i = 1}^{11} {w_{i}^{*} } }} = \frac{{\sum\limits_{i = 1}^{11} {W^{*} MD} }}{{\sum\limits_{i = 1}^{11} {W^{*} } }} = \frac{ - 0.42723}{0.00475} = - 89.98. $$

(ii) The standard error of the estimator of the common effect size is

$$ S\!E(\hat{\theta }_{R\!E} ) = \sqrt {\frac{1}{{\sum\limits_{i = 1}^{11} {w_{i}^{*} } }}} = \sqrt {\frac{1}{{\sum\limits_{i = 1}^{11} {W^{*} } }}} = \sqrt {\frac{1}{0.00475}} = 14.513. $$

(iii) The 95% confidence interval for the population effect size \( \theta \) under the REs model is given by the lower limit (LL) and upper limit (UL) as follows:

    $$ \begin{aligned} L\!L &= \hat{\theta }_{R\!E} - 1.96 \times S\!E(\hat{\theta }_{R\!E} ) = - 89.98 - 1.96 \times 14.513 = - 118.43 \\ U\!L &= \hat{\theta }_{R\!E} + 1.96 \times S\!E(\hat{\theta }_{R\!E} ) = - 89.98 + 1.96 \times 14.513 = - 61.54. \end{aligned}$$
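The random effects calculation differs from the fixed effect one only in the weights. The following sketch is illustrative: the function name `random_effects` and the three-study input are hypothetical, and the cross-check at the end uses the rounded totals quoted above, so it matches the reported estimate and standard error only to within rounding.

```python
import math


def random_effects(mds, variances, tau2, z=1.96):
    """Pooling with modified weights w* = 1 / (v + tau^2)."""
    weights = [1.0 / (v + tau2) for v in variances]
    sum_w = sum(weights)
    theta = sum(w * d for w, d in zip(weights, mds)) / sum_w
    se = math.sqrt(1.0 / sum_w)
    return theta, se, theta - z * se, theta + z * se


# Hypothetical data: with tau^2 = 0 the result reduces to the fixed effect one
toy_mds, toy_vars = [-200.0, -50.0, -80.0], [400.0, 100.0, 250.0]
theta_0, se_0, _, _ = random_effects(toy_mds, toy_vars, tau2=0.0)
theta_t, se_t, _, _ = random_effects(toy_mds, toy_vars, tau2=500.0)
print(f"tau^2 = 0:   estimate = {theta_0:.2f}, SE = {se_0:.3f}")
print(f"tau^2 = 500: estimate = {theta_t:.2f}, SE = {se_t:.3f}")

# Modified weight of the first study, from the values quoted in the text
w1_star = 1.0 / (467.318 + 1355.03)

# Cross-check from the reported totals of W* and W* x MD
sum_w_star, sum_w_star_md = 0.00475, -0.42723
theta_re = sum_w_star_md / sum_w_star
se_re = math.sqrt(1.0 / sum_w_star)
print(f"w1* = {w1_star:.5f}, estimate = {theta_re:.2f}, SE = {se_re:.3f}")
```

Adding \( \hat{\tau }^{2} \) to every variance makes the weights more nearly equal, which is why the standard error grows and the estimate moves toward the unweighted mean of the study results.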

Comment

The above point estimate (−89.98) and the 95% confidence interval (−118.43, −61.54) are presented at the bottom row of the forest plot and represented by a diamond as shown in Fig. 9.3.

Fig. 9.3 Forest plot of meta-analysis on blood loss for the LARR and ORR groups under REs model

Forest plot for WMD under REs model using MetaXL

The forest plot under the REs model (indicated by “RE” in the code) is constructed using MetaXL code

=MAInputTable("Blood Loss WMD RE","WMD","RE",B6:H16)

Remark: Explanations of MetaXL Code

For this type of meta-analysis in MetaXL, the code opens with the MA Input Table function, ‘=MAInputTable’. This is followed by an opening parenthesis. Inside it, the first quoted argument contains the text that appears as the title of the forest plot output, e.g. “Blood Loss WMD RE” in the above code (the user may choose any appropriate title; RE is chosen here to indicate the random effects model). The second quoted argument is the type of effect measure, e.g. “WMD” in the above code, which indicates that the weighted mean difference is the effect size. The third quoted argument is the statistical model, e.g. “RE” in the above code stands for the random effects (REs) model. Each quoted argument is followed by a comma, and after the last comma comes the data range in the Excel worksheet, e.g. B6:H16 in the above code indicates that the data on the independent studies are taken from those cells. The code ends with a closing parenthesis.

The forest plot of the meta-analysis using the above MetaXL code is found in Fig. 9.3.

Interpretation

From the above forest plot of WMD of blood loss under the REs model, the estimated common effect size is −89.98, and the 95% confidence interval is (−118.43, −61.54). The effect size is highly statistically significant (as 0 is not included in the confidence interval).

Here Cochran’s Q = 59.17 with P-value = 0 indicates highly significant heterogeneity among the mean difference of blood loss between the LARR and ORR groups of independent studies. The \( I^{2} = \) 83% also reflects that there is high heterogeneity among the studies.

6 Inverse Variance Heterogeneity (IVhet) Model

The IVhet model is used when there is significant heterogeneity in the effect size across the independent studies. Details on this model are found in Doi et al. (2015a).

The estimator of the common effect size WMD (=\( \theta \)) under the inverse variance heterogeneity (IVhet) model is given by

$$ \hat{\theta }_{I\!Vhet} = \frac{{\sum\limits_{i = 1}^{k} {w_{i} \hat{\delta}_{i} } }}{{\sum\limits_{i = 1}^{k} {w_{i} } }}. $$

Then the variance of the estimator under the IVhet model is given by

$$ \begin{aligned} V\!ar(\hat{\theta }_{I\!Vhet} )\, & = \,\sum\limits_{i = 1}^{k} {\left[ {\left( {\frac{{1/v_{i} }}{{\sum\limits_{j = 1}^{k} {1/v_{j} } }}} \right)^{2} \,(v_{i} + \hat{\tau }^{2} )} \right]} \, \\ & = \,\sum\limits_{i = 1}^{k} {\left[ {\left( {\frac{{w_{i} }}{{\sum\limits_{j = 1}^{k} {w_{j} } }}} \right)^{2} \times v_{i}^{*} } \right]} . \\ \end{aligned} $$

For the computation of the confidence interval of the common effect size based on the IVhet model use the following estimated standard error

$$ S\!E(\hat{\theta }_{I\!Vhet} ) = \sqrt {V\!ar(\hat{\theta }_{IVhet} )} . $$

Then, the \( (1 - \alpha ) \times 100\% \) confidence interval for the common effect size \( \theta \) under the IVhet model is given by the lower limit (LL) and upper limit (UL) as follows:

$$ \begin{aligned} L\!L &= \hat{\theta }_{I\!Vhet} - z_{\alpha /2} \times S\!E(\hat{\theta }_{I\!Vhet} ) \\ U\!L &= \hat{\theta }_{I\!Vhet} + z_{\alpha /2} \times S\!E(\hat{\theta }_{I\!Vhet} ), \end{aligned}$$

where \( z_{\alpha /2} \) is the \( \tfrac{\alpha }{2} \) th cut-off point of standard normal distribution.

Illustration of IVhet Model for WMD

Example 9.7

Consider the summary data on blood loss for the Laparoscopic-assisted Rectal Resection (LARR) versus Open Rectal Resection (ORR) for Carcinoma from Table 9.1.

Find the (i) point estimate of the population WMD, (ii) standard error of the estimator, and (iii) 95% confidence interval of the population effect size, WMD under the inverse variance heterogeneity model.

Solution:

To answer the questions we need to compute the values in Table 9.5.

Table 9.5 Calculated values of the summary statistics for the IVhet model

In Table 9.5, Var* is the combined variance

\( \left( {V\!ar_{i}^{*} = v_{i} + \hat{\tau }^{2} } \right) \) and W* is the modified weight under the IVhet model calculated as \( W_{i}^{*} = \left( {\frac{{1/v_{i} }}{{\sum\limits_{j = 1}^{k} {1/v_{j} } }}} \right)^{2} (v_{i} + \hat{\tau }^{2} ) = \left( {\frac{{w_{i} }}{{\sum\limits_{j = 1}^{k} {w_{j} } }}} \right)^{2} \times V\!ar_{i}^{*} \) for the ith study. For example, for the first study (Bonjer et al. 2015),

\( V\!ar_{1}^{*} = 467.318 + 1355.03 = 1822.3 \), and

\( W_{1}^{*} = \left( {\frac{0.00214}{0.10798}} \right)^{2} \times 1822.3 = 0.71569. \)

Now using the summary statistics in Table 9.5, we answer the questions in Example 9.7.

(i) The point estimate of population WMD under the IVhet model is

$$ \hat{\theta }_{I\!Vhet} = \frac{{\sum\limits_{i = 1}^{11} {w_{i} \hat{\delta}_{i} } }}{{\sum\limits_{i = 1}^{11} {w_{i} } }} = \frac{{\sum\limits_{i = 1}^{11} {W \times MD} }}{{\sum\limits_{i = 1}^{11} W }} = \frac{ - 7.9085}{0.10798} = - 73.24. $$

(ii) The standard error of the estimator is

$$ S\!E(\hat{\theta }_{I\!Vhet} ) = \sqrt {\sum\limits_{i = 1}^{11} {W_{i}^{*} } } = \sqrt {908.96775} = 30.1491. $$
  1. (iii)

    The 95% confidence interval of the effect size under the IVhet model is given by the lower limit (LL) and upper limit (UL) as follows:

$$ \begin{aligned} L\!L &= \hat{\theta }^{*}_{I\!Vhet} - z_{\alpha /2} \times S\!E(\hat{\theta }^{*}_{I\!Vhet} ) = - 73.24 - 1.96 \times 30.1491 = - 132.33\;\;\text{and} \\ U\!L &= \hat{\theta }^{*}_{I\!Vhet} + z_{\alpha /2} \times S\!E(\hat{\theta }^{*}_{I\!Vhet} ) = - 73.24 + 1.96 \times 30.1491 = - 14.15. \end{aligned}$$
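Parts (i) to (iii) depend only on three column totals from Table 9.5, so they can be reproduced in a few lines. This is a minimal Python sketch under that assumption; the variable names are illustrative and the totals are those quoted in the solution above.

```python
import math

# Column totals quoted in the worked solution (Table 9.5).
sum_w = 0.10798         # sum of inverse variance weights W
sum_w_md = -7.9085      # sum of W x MD
sum_w_star = 908.96775  # sum of modified weights W*
z = 1.96                # standard normal critical value for a 95% interval

theta = sum_w_md / sum_w     # (i) point estimate of the WMD
se = math.sqrt(sum_w_star)   # (ii) standard error under IVhet
ll = theta - z * se          # (iii) lower limit
ul = theta + z * se          #       upper limit

print(f"WMD = {theta:.2f}, SE = {se:.4f}, 95% CI = ({ll:.2f}, {ul:.2f})")
```

The point estimate uses the ordinary inverse variance weights, while the standard error uses the modified weights W*, so heterogeneity widens the interval without moving the estimate.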

Comment

The above point estimate (−73.24) and 95% confidence interval (−132.33, −14.15) appear in the bottom row of the forest plot in Fig. 9.4, where they are represented by a diamond.

Fig. 9.4 Forest plot of meta-analysis on blood loss for the LARR and ORR groups under the IVhet model

Forest plot for WMD under IVhet model using MetaXL

The forest plot under the IVhet model (indicated by "IVhet" in the code) is constructed using the following MetaXL code:

=MAInputTable("Blood Loss WMD IVhet","WMD","IVhet",B6:H16)

Remark: Explanations of MetaXL Code

For this type of meta-analysis in MetaXL, the code opens with the MA Input Table function, '=MAInputTable'. This is followed by an opening parenthesis. The first quoted item is the text that appears as the title of the forest plot output, e.g. "Blood Loss WMD IVhet" in the above code (the user may choose any appropriate title; IVhet is included here to indicate the inverse variance heterogeneity model). The second quoted item is the type of effect measure, e.g. "WMD" in the above code, which specifies the weighted mean difference as the effect size. The third quoted item is the statistical model, e.g. "IVhet" in the above code, which stands for the inverse variance heterogeneity model. Each quoted item is followed by a comma, and after the last comma comes the data area in the Excel worksheet, e.g. B6:H16 in the above code, which tells MetaXL that the data on the independent studies are taken from those cells. The code ends with a closing parenthesis.
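The four-part structure described in this remark can be made explicit with a small helper that assembles the formula text. This is only an illustrative Python sketch: the helper name `ma_input_table` is hypothetical and is not part of MetaXL. It simply builds the string that would be typed into an Excel cell, using plain straight double quotes, which Excel formulas require.

```python
def ma_input_table(title, effect, model, data_range):
    """Assemble a MetaXL MAInputTable formula as text.

    title      : title shown on the forest plot output
    effect     : type of effect measure, e.g. "WMD"
    model      : statistical model, e.g. "IVhet"
    data_range : worksheet cells holding the study data, e.g. "B6:H16"
    """
    return f'=MAInputTable("{title}","{effect}","{model}",{data_range})'


formula = ma_input_table("Blood Loss WMD IVhet", "WMD", "IVhet", "B6:H16")
print(formula)
```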

Interpretation

From the above forest plot, the estimated common effect size is −73.24, and the 95% confidence interval is (−132.33, −14.15). The effect size is statistically significant at the 5% level, as 0 is not included in the confidence interval.

Here Cochran’s Q = 59.17 with P-value < 0.001 (displayed as 0 in the output) indicates highly significant heterogeneity among the mean differences in blood loss between the LARR and ORR groups of the independent studies. The \( I^{2} = \) 83% also reflects high heterogeneity among the studies.
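The link between Q and \( I^{2} \) can be checked directly, since \( I^{2} \) is the share of Q in excess of its degrees of freedom. The sketch below is a minimal Python illustration, not MetaXL's implementation. It assumes k = 11 studies (so 10 degrees of freedom), the function names are illustrative, and the P-value uses a closed form for the chi-square upper tail that holds only for even degrees of freedom.

```python
import math


def i_squared(q, df):
    """I^2 as a percentage: the share of Q in excess of its degrees of freedom."""
    return max(0.0, (q - df) / q) * 100.0


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid only for even df."""
    if df % 2 != 0:
        raise ValueError("closed form used here requires even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


q, k = 59.17, 11   # Q from the forest plot; k = 11 studies
df = k - 1
i2 = i_squared(q, df)
p_value = chi2_sf_even_df(q, df)
print(f"I^2 = {i2:.0f}%, P-value = {p_value:.1e}")
```

The P-value is extremely small rather than exactly zero; a printed value of 0 reflects rounding in the software output.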

7 Subgroup Analysis

Consider the blood loss data in Example 9.5. To illustrate subgroup analysis for these data, let us divide the studies into two groups, those published before 2010 (Old Studies) and those published after 2010 (Recent Studies), to see if there is any difference in the effect size between the two subgroups. Using MetaXL we produce the forest plots of the subgroup analysis in Figs. 9.5, 9.6 and 9.7, representing the meta-analyses of the subgroups under the FE, REs and IVhet models respectively.

Fig. 9.5 Subgroup analysis by Older and Recent Studies of blood loss after LARR and ORR procedures under the FE model

Fig. 9.6 Subgroup analysis by Older and Recent Studies of blood loss after LARR and ORR procedures under the REs model

Fig. 9.7 Subgroup analysis by Older and Recent Studies of blood loss after LARR and ORR procedures under the IVhet model

Interpretation (FE)

From the forest plot in Fig. 9.5, under the FE model, the estimated common effect size for the Recent Studies is −70.91 with 95% confidence interval (−85.83, −55.98), and that for the Old Studies is −73.68 with 95% confidence interval (−80.19, −67.18). The effect size is highly statistically significant (as 0 is not included in the confidence interval) for both subgroups as well as for the pooled result of all studies (−73.24, CI: −79.21, −67.28).

For the Recent Studies, Cochran’s Q = 48.05 with P-value < 0.001 (displayed as 0 in the output) indicates highly significant heterogeneity among the mean differences in blood loss between the LARR and ORR groups of the independent studies. The \( I^{2} = \) 90% also reflects high heterogeneity among the studies in this subgroup.

For the Old Studies, Cochran’s Q = 11.01 with P-value = 0.03 indicates significant heterogeneity among the mean differences in blood loss between the LARR and ORR groups of the independent studies. The \( I^{2} = \) 64% also reflects substantial heterogeneity among the studies in this subgroup, although there is more heterogeneity among the Recent Studies than among the Old Studies.
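As a check on the pooled FE result for all studies quoted in this interpretation, the interval can be recomputed from the totals of the inverse variance weights. This minimal Python sketch assumes the usual FE standard error, the square root of the reciprocal of the summed weights, and reuses the totals 0.10798 and −7.9085 quoted for Table 9.5; the variable names are illustrative.

```python
import math

sum_w = 0.10798     # sum of inverse variance weights (Table 9.5)
sum_w_md = -7.9085  # sum of W x MD (Table 9.5)

theta_fe = sum_w_md / sum_w      # pooled FE estimate
se_fe = math.sqrt(1.0 / sum_w)   # FE standard error
ll_fe = theta_fe - 1.96 * se_fe  # lower 95% limit
ul_fe = theta_fe + 1.96 * se_fe  # upper 95% limit

print(f"FE pooled WMD = {theta_fe:.2f}, 95% CI = ({ll_fe:.2f}, {ul_fe:.2f})")
```

Both limits come out negative, and they agree with the reported interval up to rounding of the inputs.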

Interpretation (REs)

From the forest plot of WMD on blood loss under the REs model (Fig. 9.6), the estimated effect size for the Recent Studies is −84.43 with 95% confidence interval (−136.90, −31.96), and that for the Old Studies is −103.37 with 95% confidence interval (−146.48, −60.26). The effect size is highly statistically significant (as 0 is not included in the confidence interval) for both subgroups.

The comments on heterogeneity (Q statistic and P-value) remain the same as those for Fig. 9.5, as these statistics do not depend on the model.

Interpretation (IVhet)

From the forest plot of WMD on blood loss under the IVhet model (Fig. 9.7), the estimated effect size for the Recent Studies is −70.91 with 95% confidence interval (−129.39, −12.42), and that for the Old Studies is −73.68 with 95% confidence interval (−137.47, −9.90). The effect size is statistically significant at the 5% level (as 0 is not included in the confidence interval) for both subgroups.
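The three sets of subgroup results can be placed side by side to see how the models differ. The following minimal Python sketch only tabulates the estimates and intervals reported in the interpretations above; it does not refit any model. The standard errors are back-calculated from the intervals, assuming each was formed as estimate ± 1.96 × SE, and the dictionary layout and names are illustrative.

```python
# Subgroup results as reported in the three interpretations: (estimate, LL, UL).
results = {
    ("FE", "Recent"): (-70.91, -85.83, -55.98),
    ("FE", "Old"): (-73.68, -80.19, -67.18),
    ("REs", "Recent"): (-84.43, -136.90, -31.96),
    ("REs", "Old"): (-103.37, -146.48, -60.26),
    ("IVhet", "Recent"): (-70.91, -129.39, -12.42),
    ("IVhet", "Old"): (-73.68, -137.47, -9.90),
}

Z = 1.96  # critical value assumed for the reported 95% intervals

summary = {}
for (model, subgroup), (est, ll, ul) in results.items():
    se = (ul - ll) / (2 * Z)          # standard error implied by the interval
    excludes_zero = ll > 0 or ul < 0  # significance at the 5% level
    summary[(model, subgroup)] = (se, excludes_zero)
    print(f"{model:6s}{subgroup:7s} WMD={est:8.2f}  SE~{se:6.2f}  excludes 0: {excludes_zero}")
```

In the reported figures, the IVhet point estimates coincide with the FE ones because both use the inverse variance weights, while the implied IVhet standard errors are several times larger.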

8 Publication Bias

The study of publication bias for WMD is very similar to that for SMD in Sect. 9.8.8 of the previous chapter, so it is not reproduced here. Readers interested in producing a funnel plot or Doi plot and interpreting it are referred to that section.

9 Conclusions

The weighted mean difference (WMD) method of meta-analysis is covered in this chapter. It is applicable to continuous (numerical) outcome variables in two-arm studies. In addition to introducing the WMD method of meta-analysis with step-by-step illustrations of its application to real-life data sets, forest plots are produced under different statistical models using MetaXL code.

The comparison of results for the different statistical models shows variation in the point estimates and confidence intervals. The heterogeneity among the studies is also examined using the Q and \( I^{2} \) statistics. Subgroup analysis is also provided for the Recent Studies and Old Studies.