Introduction

In a meta-analysis of epidemiological study results, a summary effect estimate is obtained by combining information from a set of study-specific estimates. A common approach is to calculate an inverse-variance-weighted average of the study-specific estimates of association [e.g., Sutton et al. (2000), United Nations Scientific Committee on the Effects of Atomic Radiation (UNSCEAR) (2018)]. This approach assigns more weight to studies with more precise study-specific estimates of association. In the context of unbiased linear sums of estimates, this approach is justified by the Gauss–Markov theorem (Plackett 1949).

For epidemiological results obtained from fitting log-linear regression models, it is easy to recover reasonable estimates of the variances associated with study-specific estimates of association using the information encoded in the confidence intervals. Standard meta-analytic techniques typically proceed by assuming that, given reported-effect measures and associated confidence intervals, one can derive the variances of estimates of association based on the assumptions typical of Wald-type statistics; these estimates of study-specific variances are used to calculate the inverse-variance-weighted average estimate of association which is reported as the summary effect estimate (Sutton et al. 2000).

However, such an approach is not straightforward for some estimators for which variances are rarely reported. In some application areas, effect measures are typically obtained from fitting linear relative risk regression models. For example, in epidemiological studies of a variety of carcinogens, including asbestos (Hein et al. 2007), benzene (Rinsky et al. 2002), radon progeny (National Research Council (U.S.) et al. 1988; Lubin et al. 1995; Darby et al. 2005), and external ionizing radiation (Boice et al. 1987; National Research Council (U.S.) et al. 1990; Preston et al. 2003), investigators have modeled the relative risk per unit exposure as a linear function of exposure rather than an exponential function of exposure. In radiation research, this convention follows from a long history of use of the linear relative risk model in analyses of the Life Span Study of Japanese atomic bomb survivors (Preston et al. 2007; Pawel et al. 2008) for which there is a biophysical basis (United Nations Scientific Committee on the Effects of Atomic Radiation (UNSCEAR) 1993; Little et al. 2009, Little 2010); and, in the contemporary epidemiological literature, the linear relative risk model has been applied in analyses of many radiation-exposed populations (Gilbert et al. 1993; Cardis et al. 1995, 2005; Muirhead et al. 2009; Metz-Flamant et al. 2013). The widespread use of the same model form has the advantage that it may facilitate comparison of results between studies. Unfortunately, a quantitative meta-analytic summarization of epidemiological results that have been quantified using linear relative risk models is more challenging than doing so with the results that have been quantified using standard log-linear regression models.

An important challenge in meta-analyses of results that have been quantified using a linear relative risk model is deriving reasonable estimates of study-specific variances. The methodology developed for quantitative summaries of epidemiological findings has largely focused on log-linear model forms, where symmetric Wald-type confidence intervals are routinely reported (DerSimonian and Laird 1986). In contrast, likelihood-based confidence intervals are commonly reported for estimates of association derived from linear relative risk models, and often these intervals are asymmetric (Cox and Hinkley 1974, Meeker and Escobar 1995). Consequently, a reasonable estimate of the variance associated with a point estimate may be difficult to infer from the information encoded in the likelihood-based confidence bounds by simply leveraging the assumptions typical of Wald-type statistics.

In this paper, we describe a method to address these challenges to meta-analysis of published studies that report estimates of association derived from linear relative risk models. The approach is based on an algebraic transformation of published results to yield an estimator with a more symmetrical distribution than that reported in the literature, and then derive an expression of variance of this transformed estimator assuming that the reported profile likelihood bounds for the estimate of association in the original scale conform well to a re-expression of Wald-type bounds of the transformed estimator. The effect on meta-analyses of non-normality in study-specific estimates has been recognized by prior authors (Jackson and White 2018); and, the Cochrane Handbook, for example, discusses transformation of results as an approach to reduce skew (Higgins and Cochrane Collaboration 2020). A meta-analytic summary and associated confidence interval are constructed and back transformed to the original scale. We address fixed-effect and random-effects meta-analyses; these approaches employ different assumptions (i.e., under a fixed-effect model, it is assumed that there is one true association that underlies all the studies in the analysis; and, under the random-effects model, it is assumed that there is an underlying distribution of true associations across studies). For illustration, the proposed methodology is implemented using an empirical example.

Methods

We assume that a systematic literature search was performed, study results (in terms of point estimates and associated confidence intervals) extracted, and study quality appraised. These important steps in a meta-analysis are not addressed here. Rather, here the focus is on the stage of data synthesis during which a quantitative summary of the study findings is calculated. We focus on a setting where epidemiological results have been obtained by fitting a model of the form

$$ \psi = 1 + \beta D, $$

where ψ denotes the risk ratio or odds ratio, D denotes the continuous exposure of interest, and the parameter of primary interest in the meta-analysis, β, denotes the excess relative risk or excess odds ratio per unit D (e.g., the excess relative risk per sievert (Sv) in a radiation epidemiology study), and likelihood-based confidence intervals have been reported.

For simplicity we will henceforth assume 95% confidence intervals but the approach is readily adapted to other bounds. First, a standard approach to meta-analytic summarization of epidemiological study results is described below. Second, the proposed alternative approach to meta-analytic summarization of epidemiological study results is described. Third, it is addressed how to proceed with a meta-analysis of results that have been quantified using a linear risk ratio model when a lower confidence bound was not determined for the reported estimate.

A standard approach to meta-analysis of published linear relative risk estimates

Typically, the data structure for a standard approach to summarization of epidemiological findings in a meta-analysis consists of a table of point estimates and associated confidence intervals. Let i = 1…k index the k studies to be summarized in the meta-analysis. Let \( {\hat{\beta }}_{i} \) denote the estimated excess relative risk or excess odds ratio per unit D for study i; and, let Li and Ui denote the associated lower and upper confidence limits for \( {\hat{\beta }}_{i} \).

For each study, i, we derive the standard error of the reported estimate of association, denoted se (\( {\hat{\beta }}_{i} \)) given the reported associated confidence intervals Li, Ui for the published results, by the following calculation: se (\( {\hat{\beta }}_{i} \)) = (UiLi)/(2 × 1.96).

This approach to estimation of the study-specific standard error follows from considering the framework typical of a linear regression model fitting that yields a point estimate (\( {\hat{\beta }}_{i} \)) and associated Wald-type confidence bounds (Li, Ui). Given this information, an estimate of se (\( {\hat{\beta }}_{i} \)) can be derived under the conditions typical of Wald-type statistics: Li = \( {\hat{\beta }}_{i} \) − 1.96 × se (\( {\hat{\beta }}_{i} \)) and Ui = \( {\hat{\beta }}_{i} \) + 1.96 × se (\( {\hat{\beta }}_{i} \)). With simple rearrangement one gets, Li + 1.96 × se (\( {\hat{\beta }}_{i} \)) = Ui − 1.96 × se (\( {\hat{\beta }}_{i} \)), and it follows that (Ui − Li) = 2 × (1.96 se (\( {\hat{\beta }}_{i} \))), leading to the above expression for se(\( {\hat{\beta }}_{i} \)) as a function of (Li, Ui).

Little et al. (2012) described an approach to deriving a fixed-effect inverse-variance-weighted estimate of the excess relative risk per unit exposure, where the meta-analytic summary is calculated as the sum of the study-specific estimates divided by its variance, over the sum of the inverse of the study-specific variances, \( {\hat{\beta }}_{\text{tot}}^{\text{Fixed}} = \frac{{ \sum \nolimits_{i = 1}^{k} { {{{\hat{\beta }}_{i} }} {\left/ {\vphantom {{{\hat{\beta }}_{i} } {{\text{se(}}{\hat{\beta }}_{i} )^{2} }}}\right.} {{{\text{se(}}{\hat{\beta }}_{i} )^{2} }}}}}{{ \sum \nolimits_{i = 1}^{k} { { 1 } {\left/ {\vphantom {1 {{\text{se(}}{\hat{\beta }}_{i} )^{2} }}}\right.} {{{\text{se(}}{\hat{\beta }}_{i} )^{2} }}}}} \). This inverse-variance-weighted average estimate of association is reported as the summary estimate of association.

Confidence intervals for this fixed-effect summary estimate of association are derived by calculation of an estimate of the standard error for the meta-analytic summary association. The estimate of the standard error is simply the reciprocal of the square-root of the sum of the study-specific variances,

$$ {\text{se}}({\hat{\beta }}_{\text{tot}}^{\text{Fixed}} ) = \frac{1}{{\left[ { \sum \nolimits_{i = 1}^{k} {{\left/ {\vphantom {1 {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{2} }}}\right. } {{{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{2} }}}} \right]^{0.5} }}; \,{\text{and,}} $$

a Wald-type confidence interval for the summary estimate of association is derived as, 95% CI (\( {\hat{\beta }}_{\text{tot}}^{\text{Fixed}} \)) = \( {\hat{\beta }}_{\text{tot}}^{\text{Fixed}} \)  ± 1.96 × se (\( {\hat{\beta }}_{\text{tot}}^{\text{Fixed}} \)).

A random-effects summary estimate of association may also be derived. Little et al. (2012) described how to calculate a random-effects summary estimate of association based on the method proposed by DerSimonian and Laird (1986) for a one-step estimation of the variance of the random effect, where the summary meta-analytic estimate of the association based on a random-effects model is calculated as,

$$ {\hat{\beta }}_{\text{tot}}^{\text{Random}} = \frac{{ \sum \nolimits_{i = 1}^{k} {{{{\hat{\beta }}_{i} }} {\left/ {\vphantom {{{\hat{\beta }}_{i} } {\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{2} + {{\Delta }}^{2} } \right]}}}\right.} {{\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{2} + {{\Delta }}^{2} } \right]}}}}}{{ \sum \nolimits_{i = 1}^{k} {{1} {\left/ {\vphantom {1 {\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{2} + {{\Delta }}^{2} } \right]}}}\right.} {{\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{2} + {{\Delta }}^{2} } \right]}}}}}, $$

where

$$ \Delta^{2} = { {\rm {max}} }\left[ {0,\frac{{Q - \left( {k - 1} \right)}}{{ \sum \nolimits_{i = 1}^{k} {{1} {\left/ {\vphantom {1 {\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{2} } \right] - \frac{{ \sum \nolimits_{i = 1}^{k} {{1} {\left/ {\vphantom {1 {\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{4} } \right]}}}\right.} {{\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{4} } \right]}}}}}{{ \sum \nolimits_{i = 1}^{k} {{1} {\left/ {\vphantom {1 {\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{2} } \right]}}}\right.} {{\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{2} } \right]}}}}}}}}\right.} {{\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{2} } \right] - \frac{{ \sum \nolimits_{i = 1}^{k} {{1} {\left/ {\vphantom {1 {\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{4} } \right]}}}\right.} {{\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{4} } \right]}}}}}{{ \sum \nolimits_{i = 1}^{k} { {1} {\left/ {\vphantom {1 {\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{2} } \right]}}}\right.} {{\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{2} } \right]}}}}}}}}}}} \right], $$

and

$$ Q = \sum \limits_{1}^{k} \left[ {\left( {{\hat{\beta }}_{i} - {\hat{\beta }}_{\text{tot}}^{\text{Fixed}} } \right)/{\text{se}}\left( {{\hat{\beta }}_{i} } \right)} \right]^{2} . $$

The associated standard error is calculated as,

$$ {\text{se}}\left( { {\hat{\beta }}_{\text{tot}}^{\text{Random}} } \right) = \frac{1}{{\left[ { \sum \nolimits_{i = 1}^{k} {{1} {\left/ {\vphantom {1 {\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{2} + \Delta^{2} } \right]}}}\right.} {{\left[ {{\text{se}}\left( {{\hat{\beta }}_{i} } \right)^{2} + \Delta^{2} } \right]}}}} \right]^{0.5} }} . $$

An alternative approach to meta-analysis of published linear relative risk estimates

In most contemporary epidemiological analyses that quantify associations under a linear relative risk model, the reported confidence interval is derived from likelihood-based methods rather than calculated as a Wald-type interval (McCullagh and Nelder 1989). This is because, in a given study, the distribution of maximum likelihood estimators for the parameter β may be far from normal unless the sample size is large. When maximum likelihood estimators are not approximately normal (e.g., in small or moderate samples), Wald-type intervals may not have nominal coverage (Cox and Hinkley 1974; Meeker and Escobar 1995); for this reason, it has become common practice for published results for fittings of linear relative risk models to report likelihood-based confidence intervals rather than Wald-type intervals (Prentice and Mason 1986; Moolgavkar and Venzon 1987). By extension, meta-analytic summaries that proceed under the assumption that typical Wald-type statistical assumptions hold may not yield an appropriately inverse-variance-weighted average estimate of association or confidence interval.

We describe an alternative approach to meta-analytic summarization of epidemiological study results that have been obtained from fitting of linear relative risk regression models. The data structure for the proposed approach includes a table of point estimates, associated confidence intervals, and maximum observed doses. Letting i = 1…k index study, \( {\hat{\beta }}_{i} \) denotes the point estimate of interest, and Li and Ui denote the associated lower and upper confidence bounds for the point estimate reported for study i. Further, let xi denote the maximum value of the dose reported in published study i, noting that the value of xi is often known and reported in an epidemiological study (or may be obtained from the authors).

For each study, we derive a transformed metric of the estimate of association (Eq. 1),

$${ \hat{A}}_{i} \, = \,{ \ln }(c{\hat{\beta }}_{i} \, + \, 1), $$
(1)

and associated standard error, denoted se(\({ \hat{A}}_{i} \)), as a function of reported values Li and Ui (Eq. 2)

$$ {\text{se}}\left( {\hat{A}_{i} } \right) = \ln \left( {\frac{{cU_{i} + 1}}{{cL_{i} + 1}}} \right)/\left( {2 \times 1.96} \right), $$
(2)

where c = min[xi: 1 ≤ ik], to ensure that \( \left( {\frac{{cU_{i} + 1}}{{cL_{i} + 1}}} \right) > 0 \) for 1 ≤ ik and therefore, se (\({ \hat{A}}_{i} \)) can be calculated for all studies in the meta-analysis. The proposed approach derives a variance estimate for this transformed quantity that is based on the reported likelihood-based confidence interval for the estimate of association on its original scale; however, estimates of the transformed quantity \({ \hat{A}}_{i} \) will tend to more closely approximate a normal distribution than \( {\hat{\beta }}_{i} \).

The justification for the proposed approach follows from considering that parameter transformations can improve asymptotic distributional approximations, as discussed in the context of the linear relative risk model by Barlow (1985a, b) and by Prentice and Mason (1986). Criteria for selecting such transformations have been discussed previously (Sprott 1974) and include removal of range restriction on \( {\hat{\beta }}_{i} \) and reduction of the asymmetry of the log-likelihood about \( {\hat{\beta }}_{i} \).

The transformation \({ \hat{A}}_{i} \) = ln(c \( {\hat{\beta }}_{i} \) + 1) can remove the range restriction on the excess relative risk parameters \( {\hat{\beta }}_{i} \). Consider study i in which the dose variable, Di, has compact support Ci, for which xi= sup[Ci]. Consequently, the possible range of the estimate of dose–response association, \( {\hat{\beta }}_{i} \), under a model RRi= 1+βiDi, is 1/− xi, infinity). Prentice and Mason (1986) proposed the simple transformation \( \alpha = { \ln }\left( {\beta + \beta_{0} } \right) \), where \( \beta_{0} = \frac{1}{{x_{i} }} \) to remove the range restriction; when c = xi, that simple transformation is equivalent to our proposed expression \( { \ln }\left( {c\beta + 1} \right) \).

The transformation \({ \hat{A}}_{i} \) = ln (c \( {\hat{\beta }}_{i} \) +1) also may reduce log-likelihood skewness, and, therefore, improve symmetry of confidence bounds on the transformed metric. The standard approach assumes that typical Wald-type statistical assumptions hold. Prentice and Mason illustrated that the simple transformation \( A = { \ln }\left( {\beta + \beta_{0} } \right) \) yields nearly complete symmetry about the transformed metric (Prentice and Mason 1986); and, when c = xi, that simple transformation is equivalent to our proposed expression \( A = { \ln }\left( {c\beta + 1} \right) \).Using Sprott’s index as a measure of the normality of the likelihood function, Barlow demonstrated that the transformation \( A = { \ln }\left( {\beta + 1} \right) \) improves the normality of estimates (Barlow 1985a, b). Our proposed transformation is equivalent to the transformation proposed by Barlow (1985a, b) when c = 1. It follows that the proposed transformation will tend to improve the symmetry of the likelihood-based confidence bounds on the transformed scale, and upon applying this transformation to the reported bounds, L and U, one can better approximate the variance by employing assumptions of Wald-type intervals to the likelihood-based bounds on this transformed scale than when applied to these bounds on their original scale.

A fixed-effect inverse-variance-weighted summary of this estimated quantity is calculated as follows, \({ \hat{A}}_{\text{tot}}^{\text{Fixed}} = \frac{{ \sum \nolimits_{i = 1}^{k} { {{\hat{A}_{i} }} {\left/ {\vphantom {{\hat{A}_{i} } {{\text{se}}(\hat{A}_{i} )^{2} }}}\right. } {{{\text{se}}(\hat{A}_{i} )^{2} }}}}}{{ \sum \nolimits_{i = 1}^{k} { {1} {\left/ {\vphantom {1 {{\text{se}}(\hat{A}_{i} )^{2} }}}\right. } {{{\text{se}}(\hat{A}_{i} )^{2} }}}}}. \)

The standard error of \({ \hat{A}}_{\text{tot}}^{\text{Fixed}} \) is given by \( {\text{se}}\left( {\hat{A}_{\text{tot}}^{\text{Fixed}} } \right) = \frac{1}{{\left[ { \sum \nolimits_{i = 1}^{k} { {1} {\left/ {\vphantom {1 {{\text{se}}\left( {\hat{A}_{i} } \right)^{2} }}}\right. } {{{\text{se}}\left( {\hat{A}_{i} } \right)^{2} }}}} \right]^{0.5} }}. \)

We then re-transform to the original scale and obtain the summary fixed-effect meta-analytic estimate of the association \( {\hat{\beta }}_{\text{tot}}^{\text{Fixed}} \), and associated confidence interval. This summary estimate is calculated as:

$$ {\hat{\beta }}_{\text{tot}}^{\text{Fixed}} = \frac{{{ \exp }\left( {\hat{A}_{\text{tot}}^{\text{Fixed}} } \right) - 1}}{c}, $$

and it is this form of the inverse-variance-weighted average estimate of association that is reported as the meta-analytic summary estimate of association.

A Wald-type confidence interval for the summary estimate of association is derived based on the estimate of se (\({ \hat{A}}_{\text{tot}}^{\text{Fixed}} \)),

$$ 95\% {\text{ CI (}}{\hat{\beta }}_{\text{tot}}^{\text{Fixed}} )= \frac{{{\text{exp(}}\hat{A}_{\text{tot}}^{\text{Fixed}} \mp 1.96{\text{se}}(\hat{A}_{\text{tot}}^{\text{Fixed}} ) )- 1}}{c}. $$

A similar approach can be used to calculate a random-effects summary meta-analytic estimate and associated confidence interval. Let \({ \hat{A}}_{\text{tot}}^{\text{Random}} \) denote the random-effects inverse-variance-weighted summary, calculated as follows,

$${ \hat{A}}_{\text{tot}}^{\text{Random}} = \frac{{ \sum \nolimits_{i = 1}^{k} { {{\hat{A}_{i} }} {\left/ {\vphantom {{\hat{A}_{i} } {\left[ {{\text{se(}}\hat{A}_{i} )^{2} + \Delta^{2} } \right]}}}\right. } {{\left[ {{\text{se(}}\hat{A}_{i} )^{2} + \Delta^{2} } \right]}}}}}{{ \sum \nolimits_{i = 1}^{k} { {1} {\left/ {\vphantom {1 {\left[ {{\text{se(}}\hat{A}_{i} )^{2} + \Delta^{2} } \right]}}}\right. } {{\left[ {{\text{se(}}\hat{A}_{i} )^{2} + \Delta^{2} } \right]}}}}},\;{\text{where}} $$
$$ {{\Delta }}^{2} = {{\rm {max}} }\left[ {0,\frac{{Q - \left( {k - 1} \right)}}{{ \sum \nolimits_{i = 1}^{k} { { 1 } {\left/ {\vphantom {1 {\left[ {{\text{se}}\left( {\hat{A}_{i} } \right)^{2} } \right] - \frac{{ \sum \nolimits_{i = 1}^{k} { { 1 } {\left/ {\vphantom {1 {\left[ {{\text{se}}\left( {\hat{A}_{i} } \right)^{4} } \right]}}}\right. } { {\left[ {{\text{se}}\left( {\hat{A}_{i} } \right)^{4} } \right]} }}}}{{ \sum \nolimits_{i = 1}^{k} { { 1 } {\left/ {\vphantom {1 {\left[ {{\text{se}}\left( {\hat{A}_{i} } \right)^{2} } \right]}}}\right. } { {\left[ {{\text{se}}\left( {\hat{A}_{i} } \right)^{2} } \right]} }}}}}}}\right. } { {\left[ {{\text{se}}\left( {\hat{A}_{i} } \right)^{2} } \right] - \frac{{ \sum \nolimits_{i = 1}^{k} { { 1 } {\left/ {\vphantom {1 {\left[ {{\text{se}}\left( {\hat{A}_{i} } \right)^{4} } \right]}}}\right. } { {\left[ {{\text{se}}\left( {\hat{A}_{i} } \right)^{4} } \right]} }}}}{{ \sum \nolimits_{i = 1}^{k} { { 1 } {\left/ {\vphantom {1 {\left[ {{\text{se}}\left( {\hat{A}_{i} } \right)^{2} } \right]}}}\right. } { {\left[ {{\text{se}}\left( {\hat{A}_{i} } \right)^{2} } \right]} }}}}} }}}}} \right],\;{\text{and}} $$

\( Q = \sum \limits_{1}^{k} \left[ {\left( {\hat{A}_{i} -{ \hat{A}}_{\text{tot}}^{\text{Fixed}} } \right)/{\text{se}}\left( {\hat{A}_{i} } \right)} \right]^{2} . \)

The associated standard error for this meta-analytic summary of the transformed estimates is,

$$ {\text{se}}\left( {\hat{A}_{\text{tot}}^{\text{Random}} } \right) = \frac{1}{{\left[ { \sum \nolimits_{i = 1}^{k} { { 1 } {\left/ {\vphantom {1 {\left[ {{\text{se}}(\hat{A}_{i} )^{2} + \Delta^{2} } \right]}}}\right. } { {\left[ {{\text{se}}(\hat{A}_{i} )^{2} + \Delta^{2} } \right]} }}} \right]^{0.5} }} . $$

We then re-transform to the original scale, and obtain the summary random-effects meta-analytic estimate of association and associated confidence interval, calculated as:

$$ {\hat{\beta }}_{\text{tot}}^{\text{Random}} = \frac{{{ \exp }\left( {\hat{A}_{\text{tot}}^{\text{Random}} } \right) - 1}}{c}, $$

with a Wald-type confidence interval for this random-effects summary estimate of association is derived,

$$ 95\% {\text{ CI }}({\hat{\beta }}_{\text{tot}}^{\text{Random}} ) = \frac{{{\text{exp(}}\hat{A}_{\text{tot}}^{\text{Random}} \mp 1.96{\text{se(}}\hat{A}_{\text{tot}}^{\text{Random}} ) )- 1}}{c}. $$

A simple computer code written for the SAS and R statistical packages is provided that calculates fixed and random-effects meta-analytic summary estimates as well as associated confidence intervals (Electronic Supplementary Material).

Meta-analysis when a lower confidence bound was not determined in a published report

Sometimes, a likelihood-based lower confidence bound, Li, is not determined in a particular analysis because, at the lower constraint on the parameter range (i.e., 1/− xi), the likelihood-based statistic that defines the lower confidence bound has not reached the specified critical value. In such instances, a lower bound is typically not reported; rather, authors may indicate that the bound is simply < − 1/xi.

To date, practice for how to address this has not been well described in the literature. It appears that what is done in standard meta-analyses of linear relative risk estimates is to impute a lower bound by assuming that the confidence bounds are symmetrical on the original scale (Little et al. 2012), such that the imputed lower bound is \( L_{i}^{'} = {\hat{\beta }}_{i} - (U_{i} - {\hat{\beta }}_{i} ) \). The standard approach then proceeds using \( L_{i}^{'} \) in place of the missing Li.

When using the proposed alternative method, a lower bound may be imputed by assuming symmetry on the transformed scale. If the transformation improves normality as compared to the original scale, this should be advantageous. For study i with no reported lower bound Li, impute the value \( L_{i}^{'} = \frac{{\exp (\hat{A}_{i} - ({\text{ln(}}cU_{i} + 1 )-{ \hat{A}}_{i} )) - 1}}{c} \). The data set can proceed with analysis described above using the imputed lower bound.

Sensitivity to observed exposure ranges

The proposed transformation involves selection of a constant, c, to ensure that one can calculate se (\({ \hat{A}}_{i} \)) for all studies in the meta-analysis. The proposed approach defines c = min[xi:1 ≤ ik]. The sensitivity of results to choice of c can be assessed by recalculating the meta-analytic summary measure under an alternative value, c′, under the constraint 0 < c′ ≤ min [xi: 1 ≤ ik], to ensure calculation of se (\({ \hat{A}}_{i} \)). This permits investigation of sensitivity of results, for example, to outliers or extreme values of the exposure variable in study samples. We suggest such sensitivity analyses proceed by calculating results under a value such as, c′ = 0.9c.

Simulations

The proposed approach is compared here to meta-analytic summaries to the standard fixed-effect and random-effects approaches in simulated data examples. For this, we simulated 1000 meta-analyses under scenarios in which the number of individuals in each study was small (1000–1500), moderate (2000–2500), large (4000–4500), or variable size (1000–4500); additionally, examples were considered in which the number of studies in a simulated meta-analysis was set to 5, 10, or 15 studies. In each simulation, the number of people in a study was drawn from a uniform distribution over the specified range of study size; for each cohort member, we generated an independent standard normal covariate Z. An exposure, E, was generated by sampling from a uniform distribution (0, 5). We generated a binary outcome, Y, with dependence of Y on Z and E encoded by specifying that Y took a value of 1 with odds = \( { \exp }({ \log }(\alpha ) + 0.1Z) \times (1 + \eta E) \), where the parameter describing the baseline odds of the outcome, α, was set to 0.15, 0.2, or 0.25, and the parameter describing the excess odds ratio per unit E, \( \eta \) was set to 1.0, 0.75, or 0.50. The excess odds ratio model was used for data generation to avoid numerical issues with generating data from a linear relative risk model. The coefficients associated with E for each study were estimated using maximum likelihood and profile likelihood confidence bounds were obtained. We calculated fixed-effect and random-effects meta-analyses using the standard approach; and, we calculated fixed-effect and random-effects meta-analyses using the proposed approach described above. To summarize the results, the average meta-analyzed estimate was calculated as well as the percentage of associated confidence intervals that cover the specified true effect.

Empirical examples

The calculation of a meta-analytic summary is illustrated here using both the standard approach and the proposed alternative approach in an empirical data example based on the data reported in a prior systematic review and meta-analysis of ischemic heart disease following exposure to low-level ionizing radiation (Little et al. 2012).

Results

Simulations: fixed-effect meta-analyses

Table 1 reports the results of simulations in which a meta-analytic summary estimate of the excess odds ratio per unit exposure was estimated using the standard fixed-effect meta-analytic approach and our proposed approach; in Table 1, the baseline odds of the outcome was set to 0.2 and the excess relative odds of the outcome was set to 1.0. In all simulation scenarios, the standard fixed-effect meta-analytic approach yielded summary effect measures that were null biased and had less than nominal confidence interval coverage. In simulations of meta-analyses of large cohorts (i.e., 4000–4500 people per study), the standard fixed-effect meta-analysis yielded a slightly biased meta-analytic summary result with 95% confidence interval coverage that was closest to nominal. In simulation scenarios involving meta-analyses of smaller cohorts, the standard fixed-effect meta-analysis exhibited greater null bias and the summary effect measure had less than nominal confidence interval coverage. As the number of studies per meta-analysis increased from 5 to 15 studies per meta-analysis, and other parameters remained unchanged, the bias in the standard fixed-effect meta-analysis remained unchanged; however, the confidence interval coverage for the standard meta-analytic summary worsened and was substantially less than the nominal level.

Table 1 Results for simulations, reporting the average of the standard fixed-effect and random-effects meta-analytic results, proposed fixed-effect and random-effects meta-analytic results, and confidence interval coverages

In all simulation scenarios, the proposed approach yielded fixed-effect meta-analytic summary results that were approximately unbiased and the confidence interval coverage for the proposed fixed-effect meta-analytic summary measure was close to the nominal 95% value in all simulation scenarios. Even in simulation scenarios involving meta-analyses of small cohorts (N = 1000–1500), the proposed meta-analytic summary was approximately unbiased and associated confidence interval coverage was close to the nominal 95% value. Figure 1 illustrates that the transformed metric of the estimate of association, \({ \hat{A}}_{i} \), appears more normally distributed than the excess odd ratio estimates.

Fig. 1
figure 1

Probability density functions of excess relative odds and the proposed transformed metric, A, across simulated studies with 500–750 subjects per study and 15 studies per meta-analysis

Simulations: random-effects meta-analyses

Table 1 also reports the results of simulations in which a meta-analytic summary estimate of the excess odds ratio per unit exposure was estimated using the standard random-effects meta-analytic approach and our proposed approach. Similar to the conclusions drawn for the fixed-effect meta-analyses, the standard random-effects meta-analytic approach yielded summary effect measures that were null biased and tended to have less than nominal confidence interval coverage. In simulations of meta-analyses of large cohorts (i.e., 4000–4500 people per study), the standard random-effects meta-analysis yielded a slightly biased meta-analytic summary result with 95% confidence interval coverage that was close to nominal. In simulation scenarios involving meta-analyses of smaller cohorts, the standard random-effects meta-analysis exhibited greater null bias and the summary effect measure had less than nominal confidence interval coverage. As the number of studies per meta-analysis increased from 5 to 15 studies per meta-analysis, and other parameters remained unchanged, the confidence interval coverage for the standard random-effects meta-analytic summary worsened and was substantially less than the nominal level.

The proposed approach yielded random-effects meta-analytic summary results that were approximately unbiased and the confidence interval coverage for the proposed meta-analytic summary effect measure was slightly conservative, yet very close to the nominal 95%.

Simulation results: sensitivity analyses

Simulations also were conducted under scenarios in which baseline odds of the outcome,α, was 0.15 and 0.25; again, under those simulation scenarios, the proposed approach yielded fixed-effect and random-effects meta-analytic summary results that were approximately unbiased and the confidence interval coverage for the proposed fixed-effect and random-effects meta-analytic summary measures were close to the nominal 95% value (Appendix Table A1). In addition, simulations were conducted in which the excess relative odds of the outcome per unit exposure was 0.75 and 0.5; under those simulation scenarios it was also observed that the proposed approach yielded fixed-effect and random-effects meta-analytic summary results that were approximately unbiased and the confidence interval coverage for the proposed fixed-effect and random-effects meta-analytic summary measures were close to the nominal 95% value (Appendix Table A2). Finally, simulations were conducted in which the number of subjects per study varied from 1000 to 2500 and from 1000 to 4500; under those simulation scenarios, it was also observed that the proposed approach yielded fixed-effect and random-effects meta-analytic summary results that were approximately unbiased and the confidence interval coverage for the proposed fixed-effect and random-effects meta-analytic summary measures was close to the nominal 95% value (Appendix Table A3).

Sensitivity of the simulation results to the value c was assessed by re-calculating values under the condition c′ = 0.9c and c′ = 0.8c; these sensitivity analyses yielded essentially very similar quantitative values (Appendix Table A4).

Empirical example

While high doses of ionizing radiation have a fairly well-established association with circulatory disease, evidence for an association at lower doses (e.g., < 0.5 Sv) remains more controversial. Little et al. reported on a meta-analysis of epidemiological findings of association between radiation exposure and circulatory disease involving moderate- or low-dose whole-body exposure to ionizing radiation (Little et al. 2012). Table 2 shows the point estimates and associated lower and upper confidence bounds for each study considered in the meta-analysis of radiation and ischemic heart disease, where the estimates represent the estimated excess relative rate per Sv whole-body dose (noting that the studies expressed radiation dose in Sv). The standard approach yields a fixed-effect estimate \( {\hat{\beta }}_{\text{tot}}^{\text{Fixed}} \) = 0.10 (95% CI 0.05, 0.15). The proposed alternative approach yields a fixed-effect estimate \( {\hat{\beta }}_{\text{tot}}^{\text{Fixed}} \) = 0.10 (95% CI 0.05, 0.15), equivalent to the results obtained using the standard approach to calculation of a fixed-effect estimate. These results correspond to the fixed-effect meta-analytical result reported by Little et al. (2012) (0.10; 95% CI 0.05, 0.15). The similarity of the results is expected given the large sample sizes of the studies included in this meta-analysis, demonstrating that the proposed approach is not influential in the case when conditions suggest the normality assumption is tenable (Table 2).

Table 2 Meta-analysis of epidemiological findings from eight individual studies regarding associations between ischemic heart disease and exposure to low-level radiation

We also report results derived under a random-effects meta-analysis. A standard approach to estimation of a random-effects estimate \( {\hat{\beta }}_{\text{tot}}^{\text{Random}} \) = 0.10 (95% CI 0.04, 0.15). The proposed alternative approach yields a random-effects estimate \( {\hat{\beta }}_{\text{tot}}^{\text{Random}} \) = 0.10 (95% CI 0.04, 0.16). Again, these results correspond closely to the random-effects meta-analytical result reported by Little et al. (0.10; 95% CI 0.04, 0.15).

A sensitivity analysis to the value c′ was performed by re-calculating values under the condition c′ = 0.9c and c′ = 0.8c. These sensitivity analyses yielded essentially equivalent quantitative values under these conditions, the proposed fixed-effect meta-analytic summary was \( {\hat{\beta }}_{\text{tot}}^{\text{Fixed}} \) = 0.10 (95% CI 0.05, 0.15) and the proposed random-effects meta-analytic summary was \( {\hat{\beta }}_{\text{tot}}^{\text{Random}} \) = 0.10 (95% CI 0.04, 0.16).

Discussion

Non-linear regression models fitted via maximum likelihood methods are known to suffer problems when data are sparse (Greenland et al. 2016). For example, the commonly used logistic and Cox regression models are susceptible to bias in small samples (Greenland et al. 2000). These biases translate into bias of meta-analyses based on them (Greenland et al. 2016). Maximum likelihood estimates of the linear odds ratio or linear risk ratio per unit exposure are much more prone to bias in small samples than standard log-linear regression models (Prentice and Mason 1986); and, unless the study size is very large, the resultant parameter estimates may have a profile likelihood-based confidence intervals that differ substantially from Wald-type intervals (Moolgavkar and Venzon 1987).

In the current paper, we focus on the implications for meta-analytic summarization of epidemiological study results derived from maximum likelihood fittings of linear relative risk models. Using simulations, we illustrate the potential for bias and lack of appropriate confidence interval coverage in meta-analyses of linear odds ratio models that employ a standard approach to fixed-effect meta-analysis. It was observed that bias increased as the size of the studies included in meta-analyses diminished. We further noted that as the number of studies included in meta-analyses increased from 5 to 15, while other parameters remained unchanged, the bias in the meta-analytic summary remained similar but the confidence interval coverage for the meta-analytic summary decreased. This is likely because confidence intervals for meta-analytic summary estimates become tighter as the number of studies in a meta-analysis increases while the bias remains; a similar phenomenon has been reported in simulations of sparse data bias in ordinary (i.e., loglinear) logistic regression (Lin 2018).

Barlow (1985a, b), and Prentice and Mason (1986), proposed re-parameterizations of linear relative risk models that substantially reduced bias and improved approximations of confidence intervals to those predicted by the asymptotic normal distribution. This prior body of work suggested that a transformation applied for conducting a meta-analysis of published study results should lead to a distribution of estimators for A that tend to more closely approximate a normal density than the maximum likelihood estimators for the excess relative risk per unit D (Prentice and Mason 1986). A similar transformation was applied in the present study to existing estimates of excess relative risk to better approximate the normality of the meta-analyzed parameter, which is a necessary assumption underlying typical inverse-variance-weighted meta-analyses. The proposed approach is based on the known improved symmetry of the confidence interval for the transformed metric relative to the untransformed (Prentice and Mason 1986). When \({ \hat{A}}_{i} \) and se (\({ \hat{A}}_{i} \)) better conform to normal distributions than \( {\hat{\beta }}_{i} \) and se(\( {\hat{\beta }}_{i} \)) the proposed analytical approach to deriving a meta-analytic summary based on inverse-variance weighting of the transformed quantities should improve estimation as the underlying assumptions will be better approximated (Barlow 1985a, b; Prentice and Mason 1986). In principle, this should provide improved approximations of parameters with asymptotic normal distributions for meta-analysis when working with the reported results from linear relative risk models than working directly with the published values, which may not conform well to underlying distributional assumptions in small and moderately sized studies (Barlow 1985a, b).

In contrast to standard meta-analytic approaches, the proposed approach requires information on the maximum exposure range in each study included in the meta-analysis. Often this value is reported; if not, often it can be ascertained from the study authors or inferred (e.g., from the range of observed data in an exposure–response plot, or from substantive knowledge about exposure conditions).

We have illustrated the proposed approach with an empirical example. In the example, the study sizes are very large and the published likelihood-based confidence intervals are highly symmetrical. The empirical example illustrates the important point that when the assumption of normality approximately holds, the proposed approach yields essentially equivalent results to those of the standard approach. Our simulations using larger sample sizes support the finding from the empirical example that this transformation does not distort estimates when it is used in settings where normality may be a tenable assumption. The situations under which the proposed approach is likely to perform much better than the standard approach will tend to be meta-analyses that encompass many small studies, as opposed to a few large studies. The simulations illustrate that when data are sparse the approaches may yield somewhat different meta-analytic summaries of a set of estimates of excess relative risk per unit exposure, with substantially different means and associated confidence intervals (Table 1). Our simulations demonstrate that the proposed transformation improved performance of meta-analyses in terms of bias and confidence interval coverage.

Often, in epidemiological analyses that use a linear relative risk model, the lower likelihood-based confidence bound is not determined. This poses a challenge for meta-analysis in which the published results of point estimates and confidence bounds are the basis for deriving estimates of variance that underpin inverse-variance-weighted meta-analytic summaries. It appears that a standard practice in meta-analysis of excess relative risk per unit dose estimates has been to impute a lower bound by assuming that the bounds are symmetrical on the original scale, so that given just the reported upper bound and point estimate, a lower bound is imputed. In the proposed approach, we suggest that a lower bound may be imputed by assuming symmetry on the transformed scale. If the transformation does improve normality as compared to the original scale, as illustrated in Fig. 1, this should be advantageous and improve practice for meta-analysis of radiation epidemiology results in settings where a profile likelihood-based lower confidence interval is not defined.

Conclusion

The simple approach we describe, that follows from the transformation proposed in Barlow (1985a, b) and Prentice and Mason (Prentice and Mason 1986), may offer a useful complement to standard methods that can be employed when undertaking meta-analysis of reported results based on the linear relative risk models.