Introduction

To evaluate the quality of evidence, experts have produced guides to identify and evaluate sources of bias [1, 2] to build confidence in robust, consistent scientific findings.

The Barker hypothesis, commonly known as the developmental origins of health and disease (DOHaD), fetal origins, and the thrifty gene hypothesis, proposes that early exposure to adverse conditions during fetal development and early life has strong detrimental consequences for an individual’s long-term health and susceptibility to chronic diseases [3,4,5]. According to the hypothesis, if infants who are deprived in utero are later exposed to adequate nutrition and experience rapid growth, they are then at risk of overweight and chronic weight-related diseases, including hypertension, metabolic syndrome, and cardiovascular diseases [4, 5]. Because of the DOHaD hypothesis, healthcare practitioners are concerned about a possible trade-off between supporting early growth and later metabolic complications in infants born with low birthweight [6].

However, some concerns have been raised about the methods used in such studies [7,8,9,10,11,12,13,14,15,16].

Appraisals of DOHaD studies have pointed out several weaknesses of the hypothesis, including important inconsistencies both within and between studies [7, 8, 13, 14, 16]. The hypothesis that was tested varied between studies and there was a lack of dose-response evidence [13, 16]. Furthermore, most studies inadequately addressed confounding factors related to social and economic disadvantages [13]. Critiques have also pointed out that DOHaD studies frequently adjusted for adult body weight measured at or near the time of the outcome assessment [7, 8, 10,11,12]. Adult body weight can be considered a proxy, that is, an indirect measure strongly correlated with the outcome of interest (such as adiposity, type 2 diabetes, and high blood pressure), and/or an intermediate (mediating) variable that lies in the causal pathway between the exposure and the outcome [7, 8, 10, 15]. Adjusting for such a variable leads to “statistical overadjustment”. These overadjustments can alter results, favoring the DOHaD hypothesis where no such relationship actually exists [7, 8, 10,11,12].

Thus, given the influence of articles published by Barker on the development and popularization of the DOHaD hypothesis, this review aims to critically appraise Barker’s highest cited publications (HCBarker) using a risk-of-bias assessment tool for observational analyses and to investigate the effects of statistical overadjustment for later body weight/body mass index.

Methods

Search strategy

We conducted a search in the Web of Science and SCOPUS databases with the assistance of a health sciences research librarian (DL). The search was last repeated on November 3, 2023, to find publications with Barker DJP listed as an author, having over 1000 citations. We limited the search to the period from January 1975 (early in his career) to December 2013 (four months after his death in August 2013).

Inclusion criteria

We included high-citation studies of any design where Barker, either alone or with colleagues, examined the relationships between fetal life or childhood exposures and cardiovascular, weight and/or metabolic outcomes later in life. We excluded narrative reviews and ecological studies that lacked data on individual participants. We predefined appropriate statistical adjustment for potential confounding by predictors of adverse cardiometabolic health outcomes in offspring later in life, i.e., social determinants of health (SDOH) [17,18,19] and maternal health conditions such as hypertensive disorders of pregnancy [20, 21], diabetes mellitus [22, 23], and obesity [23,24,25].

Review process

Two reviewers (LS, TRF) independently screened titles, abstracts and full texts for eligibility and discussed discrepancies to reach consensus. Data extraction and risk of bias quality assessment were performed independently by two reviewers (SJ, TRF) using the Risk of Bias in Non-randomized Studies of Interventions (ROBINS-I) tool [1] (Table 1) and presented graphically using RevMan 5.3.5 (The Cochrane Collaboration, Copenhagen). Any discrepancies in the ROBINS-I assessments were reviewed by a third reviewer (SE).

Table 1 ROBINS-I low risk of bias approaches/criteria [1].

The ROBINS-I tool [1] is endorsed by the Cochrane Collaboration to review non-randomized studies for risks of bias [26]. This tool encompasses various criteria, including bias related to confounding, selection, classification of exposure, deviation from intended questions, missing data, outcome measurements, and selection of reported results (Table 1).

Low risk of bias approaches to prevent confounding

In the Barker studies, certain variables could act as confounders since they are common causes of both the low birthweights and the outcomes being studied [27,28,29]. These potential confounding variables include prenatal health variables and SDOH before birth. For example, maternal hypertension could predict both lower birthweight and pressure-related outcomes [20, 21]. Similarly, maternal weight status may affect weight-related outcomes, such as the development of type 2 diabetes later in adult life [24, 30].

To address confounding, it is best to include potential confounders simultaneously in statistical models based on existing knowledge. Even if a covariate shows a non-significant association with the outcome or exposure, it can still have an important impact as a confounder. The examination of confounding is not a test of significance but rather an examination for important differences between crude and adjusted effect estimates [27,28,29]. Guidance on confounding has been consistent throughout the time (1982–2015) when the DOHaD studies were conducted [27,28,29].
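
The comparison of crude and adjusted estimates described above can be illustrated with a minimal simulation. This is only a sketch on hypothetical, standardized data: the variable names and effect sizes are assumptions chosen for illustration and are not taken from any of the reviewed studies. A single confounder (labelled maternal hypertension here) lowers the exposure and raises the outcome, while the exposure itself has no effect on the outcome.

```python
import numpy as np

def ols_slopes(y, *covariates):
    """Return OLS slope coefficients (intercept excluded) of y on the covariates."""
    X = np.column_stack([np.ones(len(y)), *covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating model (assumed, illustrative effect sizes).
maternal_htn = rng.normal(size=n)                        # confounder
birthweight = -0.5 * maternal_htn + rng.normal(size=n)   # exposure, lowered by the confounder
adult_bp = 0.5 * maternal_htn + rng.normal(size=n)       # outcome, raised by the confounder only

# Crude estimate: outcome regressed on exposure alone.
crude = ols_slopes(adult_bp, birthweight)[0]
# Adjusted estimate: the confounder is included in the same model.
adjusted = ols_slopes(adult_bp, birthweight, maternal_htn)[0]

print(f"crude slope:    {crude:.3f}")
print(f"adjusted slope: {adjusted:.3f}")
print(f"difference:     {crude - adjusted:.3f}")
```

In this sketch the crude slope is clearly negative even though birthweight has no effect on the outcome, and the adjusted slope is close to zero. The size of the difference between the two estimates, not a significance test on the covariate, is what signals confounding.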

Low risk of bias approaches to prevent selection bias

Loss to follow-up can introduce selection bias when the loss is related to both the exposure and the outcomes being studied. For example, individuals born to women with high weight status tend to have higher birthweights and an increased risk of higher body weight, cardiovascular disease, and poor glycemic control [23, 25]. If these individuals are more likely to refuse participation or to be lost to follow-up in a study that focuses on these outcomes, selection bias could be introduced. Selection bias can thus alter the effect estimates and affect whether the study accurately reflects important findings.
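
A small simulation can make this mechanism concrete. The sketch below uses hypothetical data in which the exposure and the outcome are generated independently, and it assumes (for illustration only) that participants with both a high exposure and a high outcome value have just a 20% chance of being retained. None of these numbers come from the reviewed studies.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Hypothetical standardized exposure and outcome, generated independently,
# so the true association in the full cohort is zero.
exposure = rng.normal(size=n)
outcome = rng.normal(size=n)

# Assumed retention model: participants high on both variables are
# retained with probability 0.2; everyone else is always retained.
high_both = (exposure > 0) & (outcome > 0)
p_retained = np.where(high_both, 0.2, 1.0)
retained = rng.random(n) < p_retained

corr_full = np.corrcoef(exposure, outcome)[0, 1]
corr_retained = np.corrcoef(exposure[retained], outcome[retained])[0, 1]

print(f"retained fraction:           {retained.mean():.2f}")
print(f"correlation, full cohort:    {corr_full:.3f}")
print(f"correlation, retained only:  {corr_retained:.3f}")
```

Under these assumptions, the retained sample shows an inverse association between exposure and outcome that does not exist in the full cohort, which is the kind of distortion that differential loss to follow-up can produce.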

Low risk of bias approaches to prevent bias due to classification of exposure

Since the ROBINS-I tool identifies a serious risk of bias if the determination of exposure status could have been influenced by knowledge of the participants’ associated outcomes [1], we assessed whether the researchers altered the exposure categories after obtaining outcome data, as this could introduce potential bias.

Results

Using our search criteria, we identified 564 papers, and 17 of them had more than 1000 citations. After screening titles and abstracts, we evaluated the full text of 11 studies, and 8 of them met our inclusion criteria (Table 2) [4, 5, 31,32,33,34,35,36]. The PRISMA flow diagram (Fig. 1) illustrates the number of records identified, included, excluded and the reasons for exclusion.

Table 2 Characteristics of Barker’s highest cited publications with over 1000 citations.
Fig. 1

PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow chart for inclusion of studies in the systematic review of Barker’s highest cited publications.

Of the included HCBarker studies, 7 (87.5%) were retrospective cohort studies, and 1 (12.5%) was a case-control study (Table 2, Supplemental file 1). Geographically, the data came from the United Kingdom (5 studies, 62.5%), Finland (2 studies, 25%) and the Netherlands (1 study, 12.5%). The outcomes examined in the included HCBarker studies were cardiovascular disease [4, 5, 35, 36], blood pressure [4, 31, 33], type 2 diabetes mellitus [35], and impaired glucose tolerance [34] (Table 2).

From the assessment of the HCBarker studies, we found that all these studies displayed high risks of bias, with particular concerns regarding confounding (8/8, Table 3), selection of reported results (8/8), classification of exposure (7/8), selection of participants (5/8), measurement of outcomes (2/8), deviations from intended question (1/8), and high rates of missing data (ranging from 15 to 87%) (Fig. 2, Supplemental file 2). Overadjustment for later body weight occurred in most (6/8) of the studies (Fig. 2, Supplemental file 2). A comparison between crude and adjusted values was not possible, as none of the included studies provided both crude and adjusted values.

Table 3 Adjustments for key confounding variables in highly cited Barker’s publications.
Fig. 2

Risk of bias graphs with review authors’ judgements about each risk of bias item presented for each included study: (green) Low risk, (yellow) Unclear, (red) High risk.

Discussion

Our systematic review critically evaluating the HCBarker studies has revealed a multitude of concerning issues. All the included studies in our review displayed high risks of bias due to confounding, along with remarkably high rates of missing data (ranging from 15 to 87%) and a penchant for selective reporting of results following multiple comparisons (ranging from 16 to 119) [4, 5, 31,32,33,34,35,36]. Our findings thus raise concerns about the reliability of conclusions drawn in HCBarker studies.
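
To give a sense of why 16 to 119 comparisons matter, the sketch below computes the probability of at least one false-positive result under two simplifying assumptions that are ours and not taken from the reviewed studies: every test is independent, and each is run at a significance level of 0.05. The function name is hypothetical.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Probability of at least one false positive among `num_tests`
    independent tests when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** num_tests

# Lower and upper ends of the range of comparisons noted in the review.
for m in (16, 119):
    print(f"{m:>3} tests: P(at least one false positive) = "
          f"{familywise_error_rate(m):.3f}")
```

Real analyses rarely involve fully independent tests, so these figures are rough guides rather than exact error rates. They still show that a nominally significant finding is likely to turn up by chance alone when many comparisons are made and only some are reported.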

While Barker and colleagues rightly acknowledged the influence of prenatal factors on offspring health, they oversimplified by attributing their findings solely to birthweight, placental size, and early growth. They often failed to account for important confounding by SDOH and poor maternal health (Table 3). Four HCBarker studies mentioned adjusting for SDOH around the time of birth [31,32,33], but they did not report any adjusted effect estimates (Table 2) to show whether these adjustments made any difference and/or to support statements such as “the association … was independent of duration of … of possible confounding variables including … social class.” [33].

A recent study by Lu et al., which analyzed 3.4 million singleton pregnancies in Sweden and Denmark, challenged the conventional wisdom associated with the Barker hypothesis [37]. Lu et al.’s research found that familial factors outweighed the impact of birthweight on cardiovascular disease risk [37]. They demonstrated that individuals born severely small for gestational age (SGA, ≤3rd percentile) or preterm were at increased risk of cardiovascular disease (hazard ratio (HR) = 1.38 (95% confidence interval (CI): 1.32, 1.45) and HR = 1.31 (95% CI: 1.25, 1.38), respectively). However, this association disappeared (HR = 1.11, 95% CI: 0.99, 1.25) when SGA individuals were compared with their appropriate-for-gestational-age siblings, an analysis that adjusted for shared genetic, social, and environmental influences [37]. Thus, this large study highlights that genetic, social, and environmental influences are more important than risks from being SGA or preterm.

Moreover, the HCBarker studies are also affected by overadjustment for intermediate variables. Six of eight studies adjusted at least one, or all, of their analyses for later weight status [4, 5, 31, 32, 34, 36]. Given that adult body weight is strongly correlated with conditions such as type 2 diabetes, high blood pressure, and acute coronary events, it acts as a mediator in the causal pathway between the exposure (birthweight) and the outcome (adult cardiovascular health). Similar to adjustments for SDOH at the time of outcome assessment, adjusting for adult body weight or its proxy can lead to statistical overadjustment, distorting conclusions [8,9,10,11].
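
The following simulation sketches how adjusting for later body weight can produce an inverse birthweight association where none exists. It is a hypothetical illustration, not a reanalysis of any HCBarker study. It assumes that birthweight raises adult weight, that an unmeasured later-life factor raises both adult weight and blood pressure, and that birthweight has no effect at all on blood pressure. All names and coefficients are invented for the example.

```python
import numpy as np

def ols_slopes(y, *covariates):
    """Return OLS slope coefficients (intercept excluded) of y on the covariates."""
    X = np.column_stack([np.ones(len(y)), *covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]

rng = np.random.default_rng(0)
n = 20_000

# Assumed data-generating model on standardized, hypothetical variables.
birthweight = rng.normal(size=n)
later_factor = rng.normal(size=n)  # unmeasured later-life cause of weight and blood pressure
adult_weight = 0.5 * birthweight + later_factor + 0.5 * rng.normal(size=n)
adult_bp = later_factor + rng.normal(size=n)  # no birthweight effect by construction

# Unadjusted model: blood pressure on birthweight only.
unadjusted = ols_slopes(adult_bp, birthweight)[0]
# Overadjusted model: adult weight added as a covariate.
overadjusted = ols_slopes(adult_bp, birthweight, adult_weight)[0]

print(f"birthweight slope, unadjusted:                {unadjusted:.3f}")
print(f"birthweight slope, adjusted for adult weight: {overadjusted:.3f}")
```

In this sketch the unadjusted slope is near zero, as it should be, whereas the model that conditions on adult weight yields a clearly negative birthweight slope. Under these assumptions, the apparent "lower birthweight, higher blood pressure" pattern is created entirely by the adjustment.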

We recommend that future life course epidemiology studies examining associations with early life variables (such as birthweight, placental weight, early growth, or any other early life measurement) adhere to established procedures to prevent bias [1, 8, 9, 13, 38, 39]. They should avoid adjusting or controlling for intermediate variables and proxies for the outcome, such as later weights or weight status, growth at intermediate time points, or later health behaviors. Mediation analysis is one approach that can help to clarify the role of an intermediate variable between the exposure and the outcome. It achieves this by delineating distinct direct and indirect pathways, offering insights into which variables serve as intermediaries in the causal pathway [40, 41]. Another helpful approach is to employ a directed acyclic graph [38].
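
As a rough illustration of how mediation analysis separates direct and indirect pathways, the sketch below applies the simple product-of-coefficients approach to hypothetical data. It assumes linear relationships and no unmeasured confounding of the exposure, mediator, or outcome, and all effect sizes are invented. Formal mediation methods [40, 41] handle more general settings.

```python
import numpy as np

def ols(y, *covariates):
    """Return OLS coefficients of y on the covariates, intercept first."""
    X = np.column_stack([np.ones(len(y)), *covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(2)
n = 10_000

# Assumed linear model: exposure -> mediator -> outcome, plus a direct path.
exposure = rng.normal(size=n)
mediator = 0.4 * exposure + rng.normal(size=n)
outcome = 0.1 * exposure + 0.3 * mediator + rng.normal(size=n)

# Total effect: outcome on exposure, mediator left out of the model.
total = ols(outcome, exposure)[1]
# Path a: exposure -> mediator.
a = ols(mediator, exposure)[1]
# Direct effect and path b (mediator -> outcome) from the joint model.
_, direct, b = ols(outcome, exposure, mediator)
indirect = a * b

print(f"total effect:    {total:.3f}")
print(f"direct effect:   {direct:.3f}")
print(f"indirect effect: {indirect:.3f}")
```

With ordinary least squares, the total effect equals the sum of the direct and indirect effects. The model without the mediator therefore answers the question about the overall effect of the exposure, while the model with the mediator estimates only the direct path.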

Additionally, we noted several other concerns in the HCBarker studies that likely also contributed to overstated findings. These include selection bias [4, 5, 31,32,33,34,35,36], instability in exposure classification, deviations from intended research questions, and selective reporting of results after conducting numerous tests [4, 5, 31,32,33,34,35,36] (Fig. 2). Furthermore, the high rates of missing data during follow-up across HCBarker studies, ranging from 15 to 87%, cast doubt on how well the samples represent the target populations, thereby undermining the credibility and generalizability of their findings.

In their studies, Barker and colleagues frequently assumed causality despite failing to meet well-established criteria for establishing causation. To establish a causal relationship, results should be strong, specific, dose-related, independent of recognized confounding factors, and consistently found in other studies [13, 42]. Often, Barker and his colleagues failed to adjust for baseline socioeconomic status and prenatal risk factors, including maternal weight, maternal hypertension, and prematurity, all of which are associated with offspring’s long-term cardiovascular health. They made conclusions such as, “If each individual in the cohort had been in the highest third of birth weight and had lowered their standard deviation score for BMI between age 3 and 11 years, the incidence of type two diabetes would have been reduced by 57% and the incidence of hypertension by 25%” [35], suggesting that their reported association reflected causation and is easily modifiable. Further, they described non-specific relationships of varying degrees of strength with no evidence of consistency in dose-response relationships.

The GRADE framework recommends that, for relationships to be considered likely important, risk estimates should not have serious concerns about confounding [43]. All of the HCBarker studies had residual confounding, which raises questions about the confidence that should be attributed to the estimates of effects.

Two recent systematic reviews and meta-analyses of 20 and 39 studies, respectively, observed that SGA preterm infants are not at an increased risk of developing high blood pressure [12], high BMIs, and/or adiposity [44] later in life compared to preterm infants born non-SGA. This recent work contrasts with earlier DOHaD papers related to lower birth weight and adult-onset diseases. Our findings in this study can reassure parents and clinicians that infants with lower birthweights can be safely fed to appetite and satiety without undue concern about excess adiposity and inducing chronic disease in adulthood.

Our study possesses several strengths. Firstly, two reviewers independently and meticulously assessed the risk of bias in all included studies. Additionally, we employed a validated tool, the ROBINS-I tool, for risk of bias assessment [45, 46]. However, it is important to acknowledge the limitations of our study, primarily stemming from the inclusion of only eight of Barker’s highly cited publications, each with more than 1000 citations. We acknowledge that our study examines only a small part of the extensive field related to the DOHaD hypothesis. The conclusions drawn might have varied if all the evidence were evaluated. Hence, the findings from this review may not be readily generalizable to all DOHaD publications. However, by including the highest cited papers by the hypothesis’ founder, we believe our study represents methods commonly employed in this area of research. These findings thus underscore the necessity for rigorous critical appraisals of additional Barker/DOHaD hypothesis studies using valid tools.

Conclusion

The high risks of bias in HCBarker studies and their potential to distort findings limit the ability to infer causal relationships between in-utero exposures and adult cardiovascular outcomes. The DOHaD hypothesis, which stems from an unnatural setting of maternal deprivation, should be revisited in light of advancements in methodology and expanded data sets with better control for confounding variables. We advocate for improved data analyses to better understand these relationships and find effective solutions. These analyses should examine relationships between maternal and infant health, SDOH, and later offspring health without excessive adjustment for important intermediate variables such as later weight status and intermediate growth. We recommend caution when interpreting the conclusions drawn from these studies and their application in patient care. Finally, we encourage future research to evaluate other highly cited life course epidemiology studies using robust risk of bias assessment tools.