Take-home message

Using genotypes as an instrument to predict plasma angiopoietin-2 (ANG2), we demonstrate that genetically predicted plasma ANG2 is associated with sepsis-associated ARDS risk, and thus infer that plasma ANG2 may play a causal role in ARDS development. Mediation analysis indicates that plasma ANG2 mediates a significant portion (34%) of the association between polymorphisms in ANGPT2 and ARDS. To our knowledge, this is the first application of Mendelian randomization analysis to investigate plasma biomarkers in ARDS.

Introduction

Acute respiratory distress syndrome (ARDS) is characterized by the failure of the alveolar-capillary barrier resulting in non-cardiogenic pulmonary edema and life-threatening hypoxemia [1, 2]. Although substantial progress has been made in understanding ARDS pathophysiology, therapy for ARDS remains limited, with no consistently proven pharmacologic options for prevention or treatment. Drug development in ARDS has been hindered by many factors, including the clinical and biologic heterogeneity of the syndrome and the lack of biologically defined endotypes [3, 4]. Further limiting pharmacologic breakthroughs for ARDS is the lack of a validated intermediate phenotype or biomarker with a proven causal role in ARDS that could be used to improve clinical trial efficiency.

Genetic tools may help to identify which biomarkers contribute to a clinical outcome, disentangling correlation from causation [5, 6]. For example, genetic variants regulating plasma low-density lipoprotein (LDL) levels are also associated with cardiovascular disease (CVD) risk, and the risk is statistically mediated through plasma LDL levels [5, 7]. Hence, pharmacologic targeting of plasma LDL is a major therapeutic goal for CVD. The analyses suggesting causality, termed Mendelian randomization (MR) studies, are a way to infer causality from observational data, considering each individual as “randomized” during gametogenesis to a high- or low-expressing genotype [8,9,10,11]. The MR framework is an adaptation of an instrumental variable analysis that tests an observed association while controlling for threats to its internal validity, including confounding variables, measurement error, spuriousness, simultaneity, and reverse causality [12]. The MR framework is increasingly used to assess intermediate traits and is appropriate when a potential causal intermediate is genetically predictable and associated with the outcome [8, 12, 13]. Genotype then acts as an instrument to predict the plasma marker to be assessed, with the assumption that if the genetically predicted portion of the marker retains its association with the outcome, then the measured marker has a true causal effect on the outcome.
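
The instrumental variable logic above can be illustrated with a small simulation. The following is a minimal sketch under assumptions that are not part of this study: a continuous outcome, a single biallelic variant, an arbitrary allele frequency and effect sizes, and an unmeasured confounder `u` that biases the ordinary regression slope. All names and numbers are hypothetical.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def simulate(n=20000, true_effect=0.3, seed=1):
    """Simulate a genotype instrument, a confounded marker, and an outcome."""
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        dosage = sum(rng.random() < 0.3 for _ in range(2))  # 0, 1, or 2 alleles
        u = rng.gauss(0, 1)                                 # unmeasured confounder
        marker = 0.5 * dosage + u + rng.gauss(0, 1)         # genotype and u both raise the marker
        outcome = true_effect * marker + 2.0 * u + rng.gauss(0, 1)
        g.append(dosage)
        x.append(marker)
        y.append(outcome)
    return g, x, y

g, x, y = simulate()
naive = cov(x, y) / cov(x, x)  # ordinary regression slope, biased by u
wald = cov(g, y) / cov(g, x)   # instrumental variable (Wald ratio) estimate
print(f"naive slope: {naive:.2f}, Wald ratio: {wald:.2f}, simulated truth: 0.30")
```

Because the simulated genotype is independent of the confounder, the ratio of the genotype–outcome covariance to the genotype–marker covariance should sit much closer to the simulated effect than the naive slope does.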

Plasma angiopoietin-2 (ANG2) is an established biomarker of endothelial activation and permeability that is strongly associated with ARDS risk and outcome [14,15,16,17] and thus a potential causal intermediate. Supporting its causal potential, mice genetically deficient in ANGPT2 are more resistant to hyperoxia, exogenous ANG2 placed on endothelial monolayers disrupts barrier integrity, and exogenous ANG2 potentiates lung injury in animal models [18,19,20,21]. However, determining whether plasma ANG2 contributes causally to human ARDS risk has been limited by potential unmeasured confounding of plasma ANG2 and the possibility that elevated plasma ANG2 concentration may result from, rather than cause, lung injury. We and others have replicated an association between genetic variation in the angiopoietin-2 gene (ANGPT2) and ARDS risk [22, 23]. We hypothesized that plasma ANG2 acts as a causal intermediate in determining ANGPT2-associated ARDS risk during sepsis. We focused on sepsis (pulmonary and non-pulmonary) because it is the most common cause of ARDS and carries a higher mortality than other causes [24, 25]. We used the MR framework in a prospective cohort of subjects with sepsis and we estimated the proportion of the genetic effect on ARDS risk mediated through plasma ANG2. Given the established association between ANGPT2 variation and ARDS risk [22, 23], we generated a genetic instrument using multiple ANGPT2 variants to predict plasma ANG2 in ancestry-specific populations. If plasma ANG2 has a causal role in ARDS risk due to sepsis, then reducing plasma ANG2 or inhibiting its signaling warrants testing for ARDS prevention and treatment.

Methods

The Molecular Epidemiology of SepsiS in the ICU (MESSI) cohort at the University of Pennsylvania has been described previously [26, 27]. Patients were eligible if they were admitted to the intensive care unit (ICU) with infection-associated organ failure [28, 29] and were excluded if an alternative diagnosis explained their systemic inflammatory response syndrome (SIRS) criteria, if they declined life support on admission, or if informed consent was not obtained. Whole blood was collected for DNA and plasma was collected within 24 h of ICU admission, as close to the time of ICU arrival as possible. Clinical data were abstracted from the electronic medical record. All chest imaging studies obtained during the first 6 days [30] were interpreted by trained physician investigators as described [31, 32]. ARDS was adjudicated in accordance with Berlin criteria requiring that chest radiograph and oxygenation criteria be met on the same calendar day while invasively ventilated [1, 27]. Mortality was determined at 30 days. Source of sepsis was adjudicated by critical care physician investigators. As a replication sample, the iSPAAR consortium study consisted of European-ancestry genotyped subjects whose ARDS risk factor was either sepsis or pneumonia and whose plasma was assayed for ANG2 [33,34,35,36]. We used the GTEx Portal to search individual single nucleotide polymorphisms (SNPs) for expression quantitative trait locus (eQTL) significance in three tissues: lung, aorta, and tibial artery. Further details are provided in the online supplement.

MESSI assay procedures

Day 0 plasma ANG2 was measured by an enzyme-linked immunosorbent assay optimized for human plasma (ELISA; R&D Systems, Minneapolis MN). Genomic DNA was extracted from whole blood using the QIAamp DNA Mini kit (Qiagen, Hilden Germany) and assayed with the Affymetrix Axiom TxArray v.1, a genome-wide platform comprising approximately 780,000 markers, of which 184 are within 70 kilobases (kb) of ANGPT2 [37].

Statistical analysis

Continuous data were compared using nonparametric methods and categorical data by chi-square test. The association between log(plasma ANG2) and ARDS was tested by multivariable logistic regression adjusting for APACHE III score and pulmonary source of infection [4]. Using all markers on the genotyping platform, we performed multidimensional scaling to identify four principal components, allowing for identification of genetic ancestry (Supplemental Fig. E2) [22, 32]. Plasma ANG2 values were log-transformed for normality as a result of a positive skew, and we determined the association between genotypes and logANG2 using linear regression, assuming an additive model of genetic risk. Models were performed separately for genetically European (EA) and African ancestry (AA) subjects. Subjects of other ancestry were excluded because of low numbers. We limited our search for genetic determinants of plasma ANG2 to variants within 70 kb of the ANGPT2 gene to find cis regulators given our prior replicated association between ANGPT2 intron 2 and ARDS [22, 38]. SNPs demonstrating an association with plasma ANG2 at p values less than 0.005 were considered significant since ANGPT2 has fewer than 10 linkage disequilibrium blocks [38]. ANG2-associated SNPs were then tested for an association with ARDS using multivariable logistic regression adjusting for potential confounders of the ARDS–ANG2 association, including APACHE III score, pulmonary (versus non-pulmonary) sepsis [4], and genetic ancestry; please see Supplemental Methods for further justification of covariates. To infer a causal association of plasma ANG2 with ARDS, we next regressed transformed ANG2 levels on the ANG2-associated SNPs and used post-estimation prediction to generate a genetically predicted plasma ANG2 value for each EA subject. Genetically predicted ANG2 values were then tested for an association with ARDS risk using multivariable logistic regression. 
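
The two-step prediction described above can be sketched as follows. This is an illustration only, not the Stata workflow used in the study: it assumes a single SNP instead of five, omits the covariates (APACHE III score, pulmonary source, ancestry), and uses eight hypothetical subjects. The helper names `ols_fit` and `logistic_fit` are invented for the example.

```python
import math

def ols_fit(x, y):
    """Simple linear regression of y on x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def logistic_fit(x, y, iters=25):
    """One-predictor logistic regression by Newton-Raphson; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p             # score for the intercept
            g1 += (yi - p) * xi      # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical toy data: allele dosage, log-transformed ANG2, ARDS status (1 = ARDS).
dosage = [0, 0, 0, 0, 1, 1, 1, 1]
log_ang2 = [1.0, 1.2, 0.8, 1.0, 1.5, 1.7, 1.3, 1.5]
ards = [0, 0, 0, 1, 0, 1, 1, 1]

# Stage 1: regress the marker on genotype and keep the fitted (predicted) values.
a0, a1 = ols_fit(dosage, log_ang2)
predicted = [a0 + a1 * d for d in dosage]

# Stage 2: regress the outcome on the genetically predicted marker.
b0, b1 = logistic_fit(predicted, ards)
print(f"stage 1 slope: {a1:.2f}; stage 2 log-odds per unit predicted logANG2: {b1:.2f}")
```

Only the genotype-driven part of the marker enters the second stage, which is why the predicted values are expected to be less sensitive to confounders of the measured marker.
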
To ensure that our genetic instrument was truly associated with plasma ANG2, we tested ANGPT2 SNPs for replication of the plasma ANG2–ANGPT2 association in the iSPAAR dataset and tested replicating SNPs in the GTEx Project databank. To test whether plasma ANG2 concentration mediated a significant portion of the association between ANGPT2 SNPs and ARDS, we undertook mediation analysis. This technique is a formal approach to explain the mechanism by which an explanatory variable (SNP) influences the outcome (ARDS) via an intermediate or mediator variable (plasma ANG2) [39, 40]. We used linear regression of logANG2 on SNP to estimate the change in plasma ANG2 per allele, and logistic regression of ARDS on SNP, logANG2, an interaction term [SNP*logANG2], pulmonary source, APACHE III score, and genetic ancestry to model the total effect of SNP on ARDS [39, 40]. The online data supplement describes the mediation effect in more detail along with a complementary MR framework analysis and a sensitivity analysis. We used R statistical packages for the principal components and mediation analyses, Plink for the QTL analysis, and Stata 15 (College Station, TX) for all other analyses. We estimated through simulations that with 250 subjects per ancestry we would have at least 80% power to detect ARDS odds ratios of at least 1.5 if we could explain approximately 10% of the variance in plasma ANG2 [41].
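
The idea behind the mediation decomposition can be shown with a deliberately simplified sketch. It assumes linear models throughout, a continuous outcome, no SNP-by-mediator interaction, and no covariates, so it is not the logistic model with an interaction term that was fitted in R for this study. The function name and the six toy observations are hypothetical.

```python
def centered(values):
    """Subtract the mean from each value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mediation_decomposition(snp, mediator, outcome):
    """Product-of-coefficients mediation using two linear regressions."""
    g, m, y = centered(snp), centered(mediator), centered(outcome)
    sgg, smm, sgm = dot(g, g), dot(m, m), dot(g, m)
    sgy, smy = dot(g, y), dot(m, y)
    a = sgm / sgg                           # SNP -> mediator
    det = sgg * smm - sgm ** 2
    b = (smy * sgg - sgy * sgm) / det       # mediator -> outcome, adjusted for SNP
    direct = (sgy * smm - smy * sgm) / det  # SNP -> outcome, adjusted for mediator
    indirect = a * b                        # effect transmitted through the mediator
    total = indirect + direct
    return {
        "indirect": indirect,
        "direct": direct,
        "total": total,
        "proportion_mediated": indirect / total,
    }

# Hypothetical toy data, constructed without noise in the outcome.
snp = [0, 0, 1, 1, 2, 2]
mediator = [0.5, 1.5, 1.0, 2.0, 1.5, 2.5]
outcome = [2.0 + 0.4 * m + 0.3 * g for g, m in zip(snp, mediator)]

result = mediation_decomposition(snp, mediator, outcome)
print(result)
```

The proportion mediated is the indirect effect divided by the total effect; with a binary outcome and an interaction term, as in this study, the corresponding quantities are defined on the odds ratio scale and require the counterfactual approach cited in [39, 40].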

Results

The MESSI population is depicted in Table 1. Between September 5, 2008 and February 9, 2015, 9265 intensive care unit (ICU) patients were screened, 2163 were identified as having sepsis, and 1263 were enrolled (Fig. E3). Of enrolled subjects, 703 had available DNA and plasma ANG2 measured. Reasons for a lack of DNA included inadequate DNA quantity, poor DNA quality, or failure to collect the DNA sample. The primary reason for a lack of plasma measurement was that the sample was not obtained within the 24-h time period immediately following ICU admission, including for subjects transferred from another facility. As shown in Table 1, patients who developed ARDS were more likely to be of European ancestry (EA), had higher severity of illness at presentation, and were more likely to have a pulmonary source of sepsis. Pneumonia was a major risk factor for ARDS, with an odds ratio of 2.63 (95% CI 1.75, 3.95), p < 0.001, compared to non-pulmonary sepsis. Overall mortality was high at 50% in this observational cohort, and ARDS subjects had a significantly higher mortality than non-ARDS subjects. Measured plasma ANG2 was strongly associated with ARDS independent of APACHE III score or pulmonary source of infection. For each log increase in plasma ANG2 measurements, the adjusted odds ratio for ARDS was 1.49 (95% CI 1.20, 1.77), p < 0.001. The association of plasma ANG2 with ARDS was significant in both EA and AA subgroups (Table E2).

Table 1 MESSI population with genotype and plasma ANG2 measured

Results of the cis-QTL analysis for EA (n = 404) and AA (n = 254) subjects are shown graphically in Fig. 1 and Fig. E4, respectively. Forty-five subjects were of Asian ancestry or were genetically admixed and were excluded from the MR analysis. Genetic analysis of EA subjects revealed strong cis regulation, whereas analysis of AA subjects did not. In EAs, five SNPs were associated with ANG2 at p < 0.005 (Table 2) and these SNPs demonstrated low linkage disequilibrium with one another (r² < 0.10 for all but one, r² = 0.24 with opposite directionality) (Fig. 1a) [42, 43]. Individually, each SNP explained approximately 2% of the variance in plasma ANG2 levels (R²), whereas collectively they explained 8.1% of the variance. We tested each SNP for an additive association with ARDS risk, adjusting for genetic ancestry, pulmonary source of infection, and severity of illness (APACHE III score). Two SNPs, rs2442608 and rs2442630, demonstrated a significant ARDS association (Table 2). In addition, rs2442608 is in moderate linkage disequilibrium (r² = 0.37) with the locus we previously identified as associated with trauma-associated ARDS risk (Fig. 1b) [22], providing additional replication for this locus, now in sepsis-associated ARDS.
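
For readers unfamiliar with the r² statistic, the sketch below computes the squared correlation between allele dosages at two SNPs, a common approximation to linkage disequilibrium when haplotype phase is unknown. It is not necessarily how the values cited in [42, 43] were derived, and the dosage vectors are hypothetical.

```python
def ld_r2(dosage_a, dosage_b):
    """Squared Pearson correlation between two allele-dosage vectors (0, 1, or 2)."""
    n = len(dosage_a)
    ma, mb = sum(dosage_a) / n, sum(dosage_b) / n
    saa = sum((a - ma) ** 2 for a in dosage_a)
    sbb = sum((b - mb) ** 2 for b in dosage_b)
    sab = sum((a - ma) * (b - mb) for a, b in zip(dosage_a, dosage_b))
    return sab ** 2 / (saa * sbb)

# Hypothetical dosages for eight subjects at two nearby SNPs.
snp_a = [0, 1, 2, 0, 1, 2, 1, 0]
snp_b = [0, 1, 1, 0, 2, 2, 1, 0]
print(f"r2 between snp_a and snp_b: {ld_r2(snp_a, snp_b):.2f}")
```

Values near 0 indicate that two SNPs carry largely independent information, which is the property that lets several weakly linked SNPs be combined into one instrument.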

Fig. 1

Regional association plots demonstrate a consistent region of association between the ANGPT2 gene and plasma ANG2 (a) and previously reported trauma-associated ARDS (b). a Depicts the regional association plot between loci on the ANGPT2 gene (chromosome 8) on the x-axis and the strength of association with plasma ANG2 in early sepsis among EA shown as −log(p value for ANG2 association) on the y-axis. Single nucleotide polymorphisms (SNPs) with association more extreme than p = 0.005 are labeled, with rs2442608 being the most extreme SNP from the QTL analysis. Color-coding depicts the strength of linkage disequilibrium (LD) between rs2442608 and other loci, with increasing LD represented by increasing red. Navy represents minimal LD with rs2442608. A schematic of the ANGPT2 gene is depicted as a red bar with blue exons and an arrow to indicate the direction of transcription. b Depicts the regional association plot between the same region of chromosome 8 with trauma-associated ARDS as reported in our prior study [22]. The most extreme SNP from that association, rs7825407, is in moderate LD with rs2442608 (r² = 0.37) in EA populations [42], replicating the importance of this ANGPT2 intron for ARDS and providing functional relevance for this locus. Plots were created using LocusZoom [43]

Table 2 Five unlinked SNPs are cis quantitative trait loci (QTL) for plasma ANG2 expression during sepsis in Europeans, and two demonstrate significant ARDS association

For the MR analysis, we genetically predicted plasma ANG2 using post-estimation analysis from linear regression models of QTL on measured plasma ANG2 (Fig. 2). Of 404 EA subjects, 16 were missing a genotype call on one or more of the five QTL and thus did not have a predicted value, leaving 388 EA subjects for the analysis. Genetically predicted ANG2 values were associated with ARDS, adjusting for genetic ancestry, pulmonary source of infection, and APACHE III score (Fig. 2), adjusted odds ratio 2.25 (95% CI 1.06, 4.78), p = 0.035. Furthermore, a complementary method of MR analysis known as two-stage residual inclusion, described in the online data supplement, also identified a potential causal role for plasma ANG2: OR 2.65 (1.22, 6.05) p = 0.047.
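
As a reading aid, the relationship between an odds ratio, its 95% confidence interval, and a Wald p value can be sketched as below. The calculation assumes a normal approximation on the log-odds scale, which may not match the exact procedure used by the statistical software; the function name is hypothetical.

```python
import math

def wald_from_or(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Back-calculate SE, z, and two-sided p from an OR and its 95% CI."""
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p value under the normal approximation
    return se, z, p

# Adjusted odds ratio for genetically predicted ANG2 reported above.
se, z, p = wald_from_or(2.25, 1.06, 4.78)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Applied to the reported estimate of 2.25 (95% CI 1.06, 4.78), this approximation gives a p value of about 0.035, in line with the value reported for the primary MR analysis.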

Fig. 2

Mendelian randomization conceptual model to infer whether plasma ANG2 has a causal effect on ARDS risk. We first used linear regression to identify five SNPs near ANGPT2 that were strongly associated with plasma ANG2 and jointly explained 8% of the variance in measured ANG2 among EA subjects. Each EA individual was then assigned a genetically predicted ANG2 value using post-estimation prediction. The predicted values should be less affected by unmeasured confounders beyond population stratification, since they derive from each individual’s genetic assortment of parental alleles. Genetically predicted plasma ANG2 values are then tested for an association with ARDS risk by multivariable logistic regression adjusting for genetic ancestry, severity of illness (APACHE III score), and pulmonary versus non-pulmonary source of sepsis. The statistically significant association between the genetically determined component of plasma ANG2 and ARDS risk is evidence for a potential causal effect of plasma ANG2 towards ARDS development. SNPs single nucleotide polymorphisms, ANGPT2 angiopoietin-2 gene, ANG2 angiopoietin-2 protein, EA European ancestry

Because instrumental variable methodology relies upon valid instruments, we sought to replicate the association between plasma ANG2 during sepsis and SNPs in the ANGPT2 gene in an independent dataset (Table E3). Two of the five SNPs were directly genotyped in the replication population and replicated their association in patient-level meta-analysis (Table E4 and Fig. E5). Both SNPs also exhibited differential expression of ANGPT2 in one or more relevant tissues, with rs2442608 differentially expressed in lung (p = 0.0060), aorta (p = 0.048), and tibial artery (p = 0.00090) and rs2515466 differentially expressed in tibial artery (p = 0.0051) (Table E5).

In mediation analysis, we determined the total effect and the mediation effect for the two replicating ANGPT2 cis QTL (Table 3). For both rs2442608 and rs2515466, the mediation effect was significant and the proportion of ARDS risk explained was greater than 30%, whereas no significant “direct” effect between each SNP independent of plasma ANG2 was detected.

Table 3 Causal mediation analysis demonstrates that 30–40% of the ARDS risk is mediated through changes in plasma ANG2 for replicating ANGPT2 SNPs

Discussion

We have demonstrated that the genetically determined portion of plasma ANG2 is associated with ARDS risk due to sepsis, suggesting that plasma ANG2 may serve as a causal intermediate phenotype in ARDS. Furthermore, we provide evidence that a significant proportion, over 30%, of the genetic association between ANGPT2 variants and ARDS risk is mediated by early plasma ANG2 concentration among septic European ancestry subjects. Thus, efforts to reduce plasma ANG2 or to block its signaling warrant testing to prevent and possibly treat ARDS in sepsis.

Interest in targeting the angiopoietin–TIE2 receptor axis as a potential ARDS strategy is not new, given strong evidence for this pathway’s contribution from in vitro, animal, and human studies. Mice that were genetically deficient in ANGPT2 were more resistant to hyperoxia-induced inflammation, permeability, cell death, and mortality [19]. Patients with ARDS had significantly elevated plasma and edema fluid ANG2 compared to control patients with hydrostatic edema [19]. Parikh and colleagues demonstrated that sera from septic patients with high circulating ANG2 induced stress fiber formation and endothelial intercellular gaps when applied to an endothelial monolayer [20]. Gap formation was reversed with a competitive inhibitor of ANG2, recombinant human angiopoietin-1 [20], suggesting that circulating ANG2 protein is sufficient to cause vascular permeability in sepsis and ARDS. More recent work has established the prognostic and predictive significance of plasma ANG2 for ARDS in human populations [14, 15]. Numerous approaches to pharmacologically block ANG2’s effects exist and have shown promise for reducing vascular permeability in vitro and in vivo, in some cases improving survival [44,45,46,47,48,49,50,51,52]. Despite this strong experimental rationale, no therapy targeting the ANG2–TIE2 pathway has yet been tested in human sepsis or ARDS. However, trials in both ARDS and sepsis are notable for the failure of many agents to translate to clinical benefit despite strong experimental evidence and evidence of clinical biomarker association [53, 54], suggesting that better methods of selecting lead candidates are warranted [55].

In this study, we used the principle of Mendelian randomization (MR) to infer a causal relationship of plasma ANG2 with ARDS in sepsis. Because parental alleles are randomly sorted during gametogenesis, ANGPT2 genotypes can be considered randomly assigned, theoretically independent of confounders such as propensity for pneumonia or sepsis. We could genetically predict approximately 8% of the variance in plasma ANG2 using five loci close to ANGPT2. Similar to measured levels of plasma ANG2, genetically predicted plasma ANG2 was associated with adjusted ARDS risk, suggesting that early plasma ANG2 may contribute to ARDS risk. Because we have demonstrated that plasma ANG2 levels are heterogeneous early in sepsis and that genetically predicted plasma ANG2 concentrations influence ARDS risk, we suspect that the benefits of blocking ANG2 activity will not be uniform. A precision approach using plasma ANG2 to decide which patients should be enrolled in trials to test anti-ANG2 therapy may be superior to an approach whereby all patients are eligible [56], unless the safety profile of new agents favors testing the drug in all septic subjects.

Our work highlights the utility of quantitative traits to maximize power in a complex genetic trait such as ARDS [55, 57]. The regulation of plasma ANG2 is much less complex than the regulation of sepsis-associated ARDS, and the use of a genetic MR approach helps to prioritize plasma ANG2 as a marker that seems to contribute to ARDS risk [8]. To strengthen the validity of our genetic instrument, we used the iSPAAR population to confirm differential plasma expression and the GTEx Portal to confirm differential RNA expression for two SNPs in our instrument. Because the replication population tested plasma ANG2 at variable time points following ICU admission, it may not be surprising that the QTL analysis yielded slightly different results in this population. Nonetheless, the meta-analysis results strengthen the functional significance of our prior replicated locus [22]. Our results also highlight the importance of simulating stress conditions for studies of quantitative traits such as plasma or mRNA expression to reduce the complexity of a trait like ARDS [57,58,59], as the identified ANG2 QTL during early sepsis differ from those in the quiescent state [60]. Our mediation analysis suggests that over 30% of rs2442608- and rs2515466-associated ARDS risk is explained by plasma ANG2. We used a multi-SNP predicted ANG2 to explain a higher proportion of plasma ANG2 variance [61], and observed a significant association between genetically predicted plasma ANG2 and ARDS risk. However, our genetic instrument was relatively weak [61] and could have been stronger if we had used genome-wide markers for plasma ANG2 in the QTL analysis, rather than limiting the search to SNPs close to the ANGPT2 gene. A genome-wide approach would also have been necessary to test our hypothesis in the AA subgroup, as cis variants did not have a strong enough effect on plasma ANG2 in this underpowered subpopulation. Our study rationale, however, was to test whether prior associations between ANGPT2 and ARDS risk were mediated by changes in plasma ANG2 concentration.

Our study has limitations. We used two methods of causal inference, instrumental variable and causal mediation, to infer a causal effect of plasma ANG2 on ARDS risk from observational data; however, neither method completely removes confounding and bias. Further, the assumptions of each methodology are somewhat in conflict: whereas Mendelian randomization specifies that the only path between SNP and ARDS travels through plasma ANG2, mediation analysis asks what proportion of the SNP–ARDS association travels through plasma ANG2, and acknowledges that some fraction of the association may be independent of plasma ANG2. However, the fact that each method detects a causal effect for plasma ANG2 is supportive of a true causal relationship [12]. Ultimately, further replication and an experimental approach that modifies plasma ANG2 and observes reduced ARDS risk are necessary to prove the causality of human plasma ANG2, which would be consonant with animal and in vitro experiments implicating ANG2 as sufficient to provoke lung injury [19, 20]. Further, replication of the causal effect of plasma ANG2 in non-septic precipitants of ARDS such as trauma or inhalational injury is warranted.

Although our intent was to perform plasma ANG2 QTL analysis for both EA and AA subjects with sepsis, our power was greater for the EA subpopulation because of enrollment trends and the strength of cis-QTL observed. Although we observed a significant association between plasma ANG2 and ARDS among AA subjects, none of the individual QTL identified in EA subjects was associated with plasma ANG2 in AA subjects. Our future studies will need to analyze a larger population of AA subjects to determine whether regulation of plasma ANG2 in cis is an ancestry-specific finding. Plasma ANG2 was tested at only one time point, and a repeated measures analysis might have captured a larger proportion of plasma ANG2 variance during critical illness. However, this early time point may be most useful to inform a possible biomarker-based approach to therapy. This was a single-center study, and the MESSI population is unusual in having enrolled over almost a decade, in having a high observed mortality (which we believe is due to the high proportion of comorbidities for which APACHE III may poorly account [62]), and in having a high proportion of severe ARDS. Our phenotyping was consistent with the Berlin definition but did not characterize lung vasculature permeability directly [1, 63]. Finally, it remains unknown whether subjects with ANGPT2-mediated high plasma ANG2 represent a distinct endotype of septic subjects at risk for ARDS, or whether they will respond differently to therapy [64]. With the appropriate test development and validation, plasma ANG2 could be readily assessed at the bedside, obviating the need to genotype subjects, and our findings suggest that therapies aimed at reducing ANG2 or ANG2 signaling in those with the highest circulating concentrations warrant testing.

In conclusion, we have (1) provided evidence that plasma ANG2 may have a causal role in increasing ARDS risk among European ancestry subjects; (2) demonstrated significant local (cis) genetic regulation of plasma ANG2 during early sepsis in EA subjects; and (3) replicated our prior association of the second intron in ANGPT2 and ARDS in a septic population. Reducing the levels of plasma ANG2 or blocking its signaling should be tested for ARDS prevention and/or therapy, particularly among patients with high circulating concentrations of ANG2.