
1 Basic Theory

1.1 The Classical Meta-Regression Method

Suppose \( \hat{\theta}_j \) is the effect estimate from the jth study. Then, under the fixed-effect model,

$$ \hat{\theta }_j\sim N(\mu , \sigma_{j}^{2} ) $$

The fixed-effect model assumes all the studies are from the same population so there is no heterogeneity between these studies (Thompson and Higgins 2002). Now let’s consider the random-effect model:

$$ \hat{\theta}_j \sim N(\theta_j, \sigma_{j}^{2});\quad \theta_j \sim N(\mu, \tau^{2}) $$

The heterogeneity term \( \tau^{2} \) arises from the assumption that the deviation of each study's population effect from the overall parameter (\( \mu \)), as modified by study population characteristics (e.g. differences in mean age), is normally distributed with a common variance (Thompson and Sharp 1999). The regression model is then

$$ \hat{\theta}_{j} = \mu + \beta_{1} x_{1} + \beta_{2} x_{2} + \ldots + \beta_{i} x_{i} + b_{j} + \varepsilon_{j} $$

Here \( x \) represents the study-level characteristics, \( \varepsilon_{j} \) represents the within-study random error with variance \( \sigma_{j}^{2} \), and \( b_{j} \) the between-study random effect with variance \( \tau^{2} \); both have an expectation (mean) of zero. Because all the characteristics (independent variables) are study-level means or medians, each study is independent of the others, and these variables are independent of each other. To incorporate the error-variance information into the meta-regression, the weighted least squares method can be used to obtain the parameter estimates.

A problem with fixed-effect meta-regression is that most sets of studies are heterogeneous, so the data are overdispersed relative to the model; random-effect meta-regression tries to address this (Harbord and Higgins 2008). However, it should be pointed out that with increasing heterogeneity of studies the random-effect weights become more equal, the regression therefore becomes increasingly unweighted, and overdispersion tends to persist under this model as well (Doi et al. 2015). As expected, when variables are added to (or dropped from) the regression model, the total weighted variance (\( Q \)) will change, while the within-study variance (\( \sigma_{j}^{2} \)) is known to us and stays the same. This changes the between-study variance (\( \tau^{2} \)): when it is reduced, the variable explains part of the heterogeneity; when it increases, adding the variable makes the model fit poorer, the variable should not be added, and it is of course not a source of heterogeneity. The proportion of heterogeneity explained by the added variables is then

$$ R^{2} = \max\left( \frac{\tau_{0}^{2} - \tau_{model}^{2}}{\tau_{0}^{2}},\; 0 \right) $$

The equation implies that when heterogeneity is reduced, \( \tau_{model}^{2} \le \tau_{0}^{2} \); when heterogeneity increases, \( \tau_{model}^{2} > \tau_{0}^{2} \) and the proportion is truncated at zero (Thompson and Higgins 2009). This proportion is equivalent to the R square of generic regression and is therefore indexed as R squared.

$$ R^{2} = \frac{{\tau_{0}^{2} - \tau_{model}^{2} }}{{\tau_{0}^{2} }} = 1 - \frac{{SS_{res} }}{{SS_{total} }} = \frac{{SS_{model} }}{{SS_{total} }} $$

Here \( \tau_{0}^{2} \) is the heterogeneity when no variables are added to the regression; obviously, the result of this null model is the pooled effect estimate of the population parameter \( \mu \) (the constant term).

$$ \hat{\theta }_j = \mu + b_{j} + \varepsilon_{j} $$
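The truncation at zero in the \( R^{2} \) definition above can be sketched in a few lines of Python (the tau-squared values are hypothetical):

```python
def r_squared(tau2_null, tau2_model):
    """Proportion of between-study variance explained, truncated at zero."""
    return max((tau2_null - tau2_model) / tau2_null, 0.0)

# Heterogeneity reduced: a positive proportion is explained.
print(round(r_squared(0.030, 0.021), 3))   # -> 0.3
# Heterogeneity increased: the proportion is truncated at zero.
print(r_squared(0.030, 0.035))             # -> 0.0
```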

1.2 The Robust Error Meta-Regression Method

The classical meta-regression model is based on the random-effect meta-analytic model, which has the limitation we noted previously. An alternative solution is to use generic regression with robust (Huber-Eicker-White sandwich) error variances to account for the underestimated variance in such analyses under the regression model (Hedges et al. 2010). These standard errors are usually larger than the ordinary least squares (OLS) standard errors when effect sizes further from the mean are more variable. Weights applied to this model are fixed-effect weights, and overdispersion is addressed through the use of robust standard errors.

2 Application in MetaXL/STATA

2.1 The Meta-Regression Dataset

The IHDChol example uses 28 randomized trials of serum cholesterol reduction (by various interventions) in which the risk of ischaemic heart disease (IHD) events was observed. Both fatal IHD and non-fatal myocardial infarction were counted as IHD events, and the analysis is based on the 28 trials reported by Law et al. (1994). In these trials, cholesterol had been reduced by a variety of means, namely dietary intervention, drugs, and, in one case, surgery. The meta-regression examines whether greater reduction in serum cholesterol is associated with increased benefit in terms of IHD risk reduction, in order to lend support to the efficacy of cholesterol reduction and to predict the expected IHD risk reduction consequent upon a specified decrease in serum cholesterol (Table 11.1).

Table 11.1 Comparisons on the IHD events of various interventions

2.2 The Robust Error Meta-Regression in STATA

We may first use the inverse-variance weights with the following command to conduct a generic meta-analysis. The reason we use the inverse-variance weights is that, with robust standard errors, this mimics the IVhet model (Doi et al. 2015) of meta-analysis, which is a robust error fixed-effect model, and results can then be compared against the latter. The pooled OR under the IVhet model is 0.83 (95%CI: 0.72, 0.95), the relative heterogeneity (I2) is 45.7%, and the between-study variance (\( \tau^{2} \)) is 0.0188.

figure a

From the results, we can see that there is moderate heterogeneity (I2 = 45.7%, tau2 = 0.0294) between studies. The total variance based on Mantel-Haenszel estimates is 49.69.

Using a robust error meta-regression without covariates, we can reproduce these results as follows:

figure b

We may further investigate whether the amount of cholesterol reduction is associated with the lnORs across studies using robust error meta-regression with inverse-variance weights, where _ES and _seES are the effect size and the standard error of the effect size respectively.

figure c

The meta-regression analysis suggests there is a significant association between the amount of cholesterol reduction and the lnORs (p < 0.001): each unit reduction in cholesterol leads to a 38% reduction in the odds (OR = 0.62, 95%CI: 0.52, 0.74). The proportion of between-study variance explained by cholesterol reduction was 23.8% (\( R^{2} = \frac{mss}{mss + rss} \), see below). Here \( mss \) denotes the model sum of squares (\( SS_{model} \)) while \( rss \) is the residual sum of squares (\( SS_{res} \)). The ereturn list command allows us to see the total variance when the chol_reduc variable is added to the model. The e(r2_a) gives the adjusted \( R^{2} \) (20.9%).

figure d

We may observe that the total variance is also reduced (\( F_{model} \) = 30.23). We can then use the total variance to calculate the I2 statistic:

$$ I_{model}^{2} = \frac{F_{model} - df\_r}{F_{model}} = \frac{30.23 - 26}{30.23} = 13.99\% $$
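A quick arithmetic check of this calculation (the 26 residual degrees of freedom are the 28 studies minus the two model parameters):

```python
q_model = 30.23      # total weighted variance with the covariate in the model
df_r = 26            # residual df: 28 studies minus 2 model parameters

i2_model = (q_model - df_r) / q_model
print(f"{i2_model:.2%}")   # -> 13.99%
```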

To depict this relationship we can create a twoway plot as follows:

twoway (scatter _ES chol_reduc [w = 1/(_seES^2)], msymbol(oh)) (lfit _ES chol_reduc [w = 1/(_seES^2)], yline(-0.193) ytitle("Effect size (interval scale)"))

Figure 11.1 presents the regression plot of the amount of cholesterol reduction against the lnORs. The figure may help us explain the reason for the reduction in total variance. The dashed line is the pooled lnOR by the IVhet method [ln(0.825) = −0.193] without the chol_reduc variable, and the solid line is the linear prediction of lnOR from cholesterol reduction. As we know, the total variance is the weighted sum of squared distances from the observed values to the predicted values (\( Q = \sum w_{j} \cdot (\theta - \hat{\theta})^{2} \)). Obviously, the weighted distance from the observed values to the dashed line differs from that to the linear prediction, and the latter shows the better fit.

Fig. 11.1
figure 1

The regression plot between amount of cholesterol reduction and lnORs

When we add the chol_reduc variable into the regression model, the constant term indicates that the risk of IHD is comparable between arms when the cholesterol reduction is zero (OR = 1.13, 95%CI: 0.98, 1.30).

The meta-regression may also be done with the classical random-effect meta-regression method via the metareg command. We then obtain the following results, where _seES is the standard error for the effect size (_ES) in each study from the admetan command described earlier:

figure e

The point estimates are similar but in this instance the confidence intervals are slightly different given the Knapp-Hartung modification (Knapp and Hartung 2003).

2.3 Meta-Regression in MetaXL

The MetaXL add-in program for Excel also provides solutions for meta-analysis, and it allows us to generate data for meta-regression. The MARegresData function in MetaXL allows the creation of a regression dataset that can be pasted directly into Stata and used to run meta-regression analyses under this framework. The dataset appears in a table under the Meta-Regression data tab that shows in the MAInputTable output pop-up window when a MARegresData function is linked to the MAInputTable function. The MARegresData function creates all the necessary variables and weights required for the analysis.

The regression dataset table consists of nine fixed columns that describe each study’s characteristics, and any number of user-defined columns that describe each study’s moderator variables. The fixed columns are defined in the table below (Table 11.2).

Table 11.2 Definition of variables for meta-regression in cholesterol reduction example

Please note that the regression is performed on the transformed variables: the transformed effect size called “t_es” as well as a weight under the model of interest called “weight”. (The un-transformed variables u_es and its CI are there only for the convenience of the user, useful when back-transformed outputs are cumbersome to obtain, such as with the double arcsine transformation for prevalence). The variable t_es is the outcome variable and this is regressed against the user-defined moderator variables in the dataset.

We open the IHDCholMetaRegres example module and use the MAInputTable and MARegresData functions to prepare the meta-regression data in MetaXL. The meta-regression data are then presented in the table (Fig. 11.2).

Fig. 11.2
figure 2

The output sheet for meta-regression data prepared for meta-regression

Right-click on the “Meta-regression data” table in the results window and click copy. Then we paste the data into Stata software and run the robust meta-regression.

figure f

3 Meta-Regression for Categorical Variables

In the above example we illustrated meta-regression for a continuous variable, which raises the question of how to conduct meta-regression when the variable is categorical. Let's use the same dataset to simulate a categorical variable by categorizing the cholesterol reduction into three levels (<0.5, 0.5 ~ 0.99, 1 ~ 1.5) and assigning the codes 0, 1, and 2 to these levels.

recode chol_reduc (min/0.499 = 0) (0.5/0.999999 = 1) (1/max = 2), gen(chol_grp)
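The same three-level grouping can be sketched outside Stata, for example with pandas, assuming a chol_reduc column of hypothetical values:

```python
import pandas as pd

# Hypothetical cholesterol-reduction values for a handful of studies.
df = pd.DataFrame({"chol_reduc": [0.3, 0.55, 0.9, 1.2, 0.45, 1.0]})

# Mirror the Stata recode: <0.5 -> 0, 0.5-0.99 -> 1, 1-1.5 -> 2.
# right=False gives left-closed intervals, so 1.0 falls in the top group.
df["chol_grp"] = pd.cut(df["chol_reduc"],
                        bins=[0, 0.5, 1.0, 1.5],
                        right=False,
                        labels=[0, 1, 2]).astype(int)
print(df)
```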

Now we get the dataset as shown in the following figure (Fig. 11.3).

Fig. 11.3
figure 3

Simulated categorical variable for meta-regression

Again, we run the meta-regression analysis with an indicator variable for group to allow a categorical robust meta-regression.

figure g

We may observe that when using the categorical variable, the proportion of between-study variance explained is much less than with the continuous one (18.6% versus 23.8%). The constant takes the value of the zero category (reference group).

4 Multivariable Meta-Regression

Both the classical meta-regression method and the robust error meta-regression method allow us to perform multivariable meta-regression, just like multivariable regression on individual-level data (Thompson and Higgins 2009). Sometimes multivariable meta-regression is necessary because a single covariate is generally only able to explain part of the between-study heterogeneity. In our example above, cholesterol reduction explains 23.8% of the between-study heterogeneity, not 100%. This means much between-study heterogeneity remains that may be due to other covariates, such as mean age, region, mean body mass index, and so forth. To address this, we may simply add these variables into the meta-regression model. For example, if we had another covariate such as age in the above example, we could put both cholesterol reduction and age into the model.

It is notable that more covariates mean we need more studies (one study is one data point) to ensure the statistical power of the meta-regression. Thus, before putting covariates into the meta-regression model, we should first ensure a sufficient number of studies, noting that every covariate added needs at least 10 additional studies; two covariates therefore need at least 20 studies. When the total number of studies is less than 10, it is not appropriate to employ meta-regression, and subgroup analysis may be employed as an alternative to detect the source of heterogeneity. Similarly, when the total number of studies is less than 20, we may use only one covariate in the meta-regression.

Some characteristics cannot be treated as covariates for meta-regression, for example the sample size. This is because the sample size of each study is highly correlated with the standard errors of the effect estimates; when entered into the meta-regression model, it breaks the assumption of orthogonality and makes the regression model invalid (Dobson and Barnett 2008).

It might be noted that subgroup analysis is a special case of meta-regression with categorical variables. The difference is that subgroup analysis can only deal with one variable at a time and does not provide a relative comparison to a reference group within the analysis. The advantage of subgroup analysis over meta-regression is that it does not carry the restriction regarding the minimum number of studies. Note, however, that for subgroup analysis the interaction test for potential differences of effects among subgroups is generally underpowered when there are three or more subgroups.

5 Summary

In this chapter, we have given a detailed introduction to the meta-regression method, including the basic theory, step-by-step applications in Stata and MetaXL, and multivariable meta-regression. We suggest that readers read this chapter together with Chap. 13, which introduces dose-response meta-analysis, as this may help readers acquire a deeper understanding of both methods.