Introduction

Cardiovascular disease (CVD) is the number one cause of death in developed nations and is the leading cause of years of life lost due to morbidity and mortality globally (Naghavi et al. 2017). In 2015, the prevalence of CVD in the United States was 41.5% and was expected to rise to 45% by 2035, when over 130 million Americans would have one or more forms of CVD. The yearly costs (direct and indirect) of CVD to Americans are currently several hundred billion dollars and are expected to exceed one trillion dollars by 2035 (American Heart Association 2017).

Decades of observational, controlled exposure, in vivo, and in vitro studies indicate there is a causal link between CVD and air pollution, particularly particulate matter air pollution (Brook et al. 2010; Newby et al. 2015). In 2015, particulate matter air pollution contributed to 32,406,000 ischemic heart disease (IHD) disability-adjusted life years (a combined measure of morbidity and mortality), where IHD is a primary form of CVD. This contribution of air pollution to disability and loss of life is comparable to the contribution of tobacco smoking (33,161,000 disability-adjusted life years). However, while the contribution of tobacco smoke is trending downwards (7.4% decrease from 2005), air pollution’s contribution is trending upwards (3.4% increase from 2005) (GBD 2015 Risk Factors Collaborators 2016). Outdoor air pollution is a global CVD mortality risk factor, as it contributed to > 1,500,000 CVD deaths in 2015 (Cohen et al. 2017).

CVD risk factors, including elevated blood pressure, metabolic risk factors, and inflammation, have been associated with air pollution exposure (Chuang et al. 2007, 2011; McGuinn et al. 2019; Simkhovich et al. 2008; Sørensen et al. 2012; Ward-Caviness et al. 2018; Ward-Caviness et al. 2015). Controlled exposure studies, animal models, and in vitro studies have made substantial contributions to our understanding of the biological pathways linking air pollution exposure to CVD, and suggest significant involvement of inflammation and oxidative stress pathways (Brook et al. 2010; Newby et al. 2015). However, researchers still lack a robust understanding of the factors that give rise to inter-individual variability in air pollution-related CVD risks, such as genetic variation.

Like air pollution, genetics is also a major contributor to CVD risk, with 50+ genetic loci robustly associated with CVD (Nikpay et al. 2015). Although CVD is heritable, only a relatively modest proportion (20–25%) is explained by rare and common genetic variation (Nikpay et al. 2015; So et al. 2011), and heritable non-genetic factors, e.g., epigenetics, and gene–environment interactions may contribute to the “missing heritability” (Manolio et al. 2009; Zuk et al. 2012). Gene–environment interactions are a promising area of research for new insights into CVD from both a mechanistic and public health perspective, and comprehensive characterization of gene–environment interactions can assist researchers in identifying individuals with elevated risks, providing novel biological insights, and improving disease prediction (Khoury 2017). Here, we undertake a review of CVD-associated gene–air pollution interactions, including interactions associated with CVD risk factors and emerging biomarkers, such as metabolites and DNA methylation. The previous comprehensive review on this topic was done in the early stages of the field, when contributions were limited in both the number of published articles (16) and the number of cohorts studied (3) (Zanobetti et al. 2011). Thus, an update on the current state of the field is warranted.

Scope of the review

This review covers the literature on gene–air pollution interactions in CVD. For the purposes of this review, CVD is defined as: coronary artery disease, hypertension/high blood pressure, peripheral arterial disease, heart failure, coronary atherosclerosis, myocardial infarction (MI), or coronary death. Additionally, this review covers gene–air pollution interactions for emerging biomarkers of CVD such as metabolomics, microRNAs, and DNA methylation. A search was made for manuscripts written in English using five databases/search engines (ProQuest Agricultural & Environmental, PubMed, Science Direct, Web of Science, and Google Scholar) using the following search criteria:

(“gene-environment interaction” OR “genetic marker” OR “gene expression” OR “genetic variant” OR “gene variant” OR “SNP interaction” OR “single-nucleotide polymorphism interaction” OR GSTM1 OR GSMT1 null OR PON1 OR “Interleukin 6” OR IL6 OR “Interleukin 8” OR IL8 OR cytokine OR “Glutathione S-Transferase”)

AND

(“myocardial infarction” OR “coronary heart disease” OR “coronary artery disease” OR “cardiovascular disease” OR “peripheral arterial disease” OR “peripheral vascular disease” OR “blood pressure” OR “hypertension” OR “atherosclerosis” OR “cardiovascular mortality” OR “cardiovascular morbidity” OR “cardiovascular hospitalization”)

AND

(“air pollut*” OR “particulate matter” OR PM10 OR PM2.5 OR “ultrafine particulate matter” OR “ultrafine particles” OR UFP OR “traffic-related air pollution” OR “distance to roadways” OR noise OR “noise pollution” OR “nitrogen dioxide” OR “nitrogen oxide” OR NOx OR NO2 OR ozone OR nitrate OR sulfate OR air quality OR urban air OR polluted air).

The terms were selected to broadly cover manuscripts which may involve gene–environment interactions and include specific terms for genes known to be involved in air pollution interactions based on a previous review (Zanobetti et al. 2011). Stroke was also considered as a potential outcome; however, a similar search strategy did not return any gene–air pollution interaction articles examining stroke risk. Results of the literature review are organized by the outcomes considered for the gene–environment interactions and presented in a hierarchy starting with manuscripts on CVD risk, then moving to CVD risk factors, and finally ending with inflammatory markers and emerging molecular biomarkers for CVD. Where possible, the interactions are interpreted in terms of the effect of specific genotypes on the association between an air pollution exposure and CVD outcome relative to a reference genotype.

Results of review

A total of 168 manuscripts were returned by the literature search. A review of the abstracts for these manuscripts revealed 56 which examined gene–air pollution interactions in CVD. Of these, 10 were reviews of the literature and 6 were animal model studies comparing strain differences; both groups were excluded from the review. Thus, this review focuses on the 40 manuscripts remaining. The following information was extracted from the manuscripts: the outcome, pollutant, exposure duration, variant(s) and gene region(s) considered, magnitude of association, direction of effect, and genotype-stratified effects. The manuscripts were also examined for racial/ethnic diversity within the participating cohort(s) and the use of functional follow-up and replication cohorts. The review is organized into an approximate “biological hierarchy” beginning with manuscripts examining CVD risk, then those examining CVD risk factors such as blood pressure and heart rate variability (HRV) measures. After this follows a section on inflammation, a primary mechanistic pathway underlying CVD, and finally a review of manuscripts related to emerging biomarkers for CVD such as microRNAs and metabolites. Each section begins with a brief overview before diving deeper into the specific interactions reported in the literature.

CVD

We begin the review with manuscripts covering gene–air pollution interactions for risk of CVD outcomes reported in the literature, which were: MI, hypertension, peripheral arterial disease, coronary atherosclerosis, and left ventricular mass. Most of the manuscripts were candidate gene studies. The majority of the genes found with interactions were linked to inflammation; however, this is in large part due to inflammatory genes being more commonly studied than genes in other pathways (Table 1).

Table 1 Summary of gene–air pollution interaction studies

In a study of short- and long-term exposure to SO2 and particulate matter < 10 µm in diameter (PM10), only short-term exposures had interactions associated with MI (Panasevich et al. 2013). In this study, the SO2–MI association was only observed in individuals with the GG genotypes for IL6-174 (rs1800795) and IL6-598 (rs1800797). In the same study, variants in TNF-α interacted with short-term PM10 exposure such that individuals with CC genotypes for TNF-863 (rs1800630) and TT genotypes for TNF-1031 (rs1799964) had a positive association between PM10 and MI, while all other individuals had a negative direction of association (though the latter estimates included the null) (Panasevich et al. 2013). In a multi-ethnic study of left ventricular mass, 12 candidate genes were examined for interactions with residential proximity to roadways, a measure of long-term exposure to traffic-related air pollution. In this study, two genes (AGTR1 [rs6801836] and ALOX15 [rs2664593]) showed significant interactions. AGTR1 encodes the type 1 receptor for angiotensin II, a vasopressor hormone which helps control blood pressure, while ALOX15 encodes a lipoxygenase enzyme that helps produce lipid mediators involved in inflammation. For both AGTR1 and ALOX15, an increased number of minor alleles was associated with a weakening of the association between traffic exposure and left ventricular mass (Van Hee et al. 2010). Interactions with the AGTR1 variant were stronger in individuals with poor blood pressure control.

There have been two genome-wide interaction studies to examine cardiovascular outcomes, both of which used residential proximity to roadways as an indicator of long-term traffic exposure. The first genome-wide interaction study examined interactions which were associated with peripheral arterial disease (Ward-Caviness et al. 2016). This study used both African- and European-American individuals from a cardiac catheterization cohort in race-stratified analyses that were later combined into a multi-ethnic meta-analysis. Researchers observed a genome-wide significant interaction between rs755249 (BMP8A) and residential proximity to roadways, and suggestive interactions were found across the entire BMP8A-MACF1 locus on chromosome 1. The interaction with rs755249 and most of the suggestive interactions had a positive multiplicative interaction term, indicating that an increase in minor alleles for each variant (additive genetic model) was associated with a stronger effect of residential proximity to traffic on peripheral arterial disease (Ward-Caviness et al. 2016). BMP8A belongs to the bone morphogenic protein family, which has been implicated in vascular calcification (Hruska et al. 2005), and inhibition of bone morphogenic protein family proteins may reduce vascular calcification and atherosclerosis (Derwall et al. 2012). The second genome-wide interaction study used the same cohort and study design, but examined coronary atherosclerosis burden. In this study no genome-wide significant interactions were found, but several suggestive interactions (P value < 1 × 10^−5) were found in inflammation-related genes PIGR and FCAMR (Ward-Caviness et al. 2017).
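To make the idea of a positive multiplicative interaction term under an additive genetic model concrete, the following is a minimal sketch rather than the model fitted in the studies above. It assumes a logistic model of the form logit(P) = b0 + bG·G + bE·E + bGE·G·E, where G is the minor allele count (0, 1, or 2) and E is the exposure. The coefficient values are hypothetical and chosen only for illustration.

```python
import math


def exposure_odds_ratio(beta_e, beta_ge, minor_alleles):
    """Odds ratio for a one-unit increase in exposure at a given genotype.

    Under logit(P) = b0 + bG*G + bE*E + bGE*G*E, the log odds ratio for
    exposure among carriers of G minor alleles is bE + bGE*G.
    """
    return math.exp(beta_e + beta_ge * minor_alleles)


# Hypothetical coefficients (not taken from any cited study).
BETA_E = 0.10   # exposure effect in individuals with no minor alleles
BETA_GE = 0.20  # positive multiplicative interaction term

genotype_ors = {g: exposure_odds_ratio(BETA_E, BETA_GE, g) for g in (0, 1, 2)}
for g, odds_ratio in genotype_ors.items():
    print(f"{g} minor allele(s): exposure OR = {odds_ratio:.2f}")
```

With a positive interaction coefficient, each additional minor allele multiplies the exposure odds ratio by the same factor, exp(bGE), which is what "stronger effect with an increase in minor alleles" means on the multiplicative scale.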

CVD risk factors and subclinical measures

Most gene–air pollution interactions for CVD risk factors have been candidate gene studies, with a few genes, e.g., GSTM1 and HFE, associated with multiple risk factors, and a few outcomes, e.g., heart rate variability (HRV) and blood pressure, examined across multiple studies. Yet, due to the multiplicity of exposures, genes, and risk factors surveyed, there has been no independent replication across studies. Still, there is mounting evidence that genes involved in detoxification (GSTM1), iron metabolism (HFE), inflammation (APOE, IL-6), and lipid metabolism (APOE, LPL) are associated with CVD risk factors via interactions with air pollution.

HRV measures have been the most studied CVD risk factor, with particulate matter (primarily PM2.5) being the most common exposure examined (Table 1). In the Normative Aging Study, a cohort of older, Caucasian, male veterans from the USA, the association between HRV and PM2.5 was modified by genetic variants in GSTM1 (Chahine et al. 2007; Schwartz et al. 2005), HFE (Park et al. 2006), APOE (Ren et al. 2010a), LPL (Ren et al. 2010a), and cSHMT, which is linked to methyl nutrient processing (Baccarelli et al. 2008). In a separate examination of Normative Aging Study participants, 48 h average exposure to PM2.5 was associated with HRV measures among wild-type genotype carriers in three of the LPL genotypes examined: LPL-G113C, LPL-N291S, and LPL-S447X (Ren et al. 2010a). In a study of genes involved in processing dietary methyl nutrients, associations between PM2.5 and multiple HRV measures were only seen amongst the CC genotype for cSHMT C1420T and the CT/TT genotypes for MTHFR C677T (Baccarelli et al. 2008). One study (also using the Normative Aging Study cohort) examined mitochondrial DNA methylation and observed that individuals with higher mitochondrial DNA methylation had a stronger association between PM2.5 and HRV measures (Byun et al. 2016). To date, no study has examined mitochondrial genetic variation. The only gene–air pollution interaction study for HRV not done using the Normative Aging Study utilized the Swiss cohort study on Air Pollution and Lung and Heart Disease in Adults (SAPALDIA). In this study, participants with two G alleles for rs1800795 (IL6-174) had an inverse association between traffic-related PM10 and two HRV measures: the standard deviation of normal-to-normal intervals and low frequency power (Adam et al. 2014).

Only a handful of other exposures beyond ambient particulate matter have been examined in relation to HRV. HFE and HMOX-1 are genes linked to iron metabolism, and in one study researchers observed an interaction between tibia bone lead levels and HMOX-1 genotypes. In this study, tibia bone lead levels were associated with a decrease in QT interval only in carriers of an HMOX-1 long allele (Park et al. 2009). In a study of QT interval in relation to short-term exposure to black carbon (BC), individuals with a higher genetic risk score, based on variants in genes related to detoxification and iron metabolism, had a stronger association between BC and QT interval (Baja et al. 2010).
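As a rough illustration of how a genetic risk score can be formed, the sketch below sums risk-allele counts across variants, with optional per-variant weights. This is an assumption made for illustration: the scoring scheme actually used by Baja et al. (2010) is not described here, and the variant names and genotypes are hypothetical.

```python
def genetic_risk_score(genotypes, weights=None):
    """Sum risk-allele counts (0, 1, or 2 per variant), optionally weighted.

    genotypes: dict mapping variant name -> number of risk alleles carried
    weights:   dict mapping variant name -> weight (defaults to 1.0 each)
    """
    if weights is None:
        weights = {variant: 1.0 for variant in genotypes}
    for variant, count in genotypes.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{variant}: allele count must be 0, 1, or 2")
    return sum(weights[variant] * count for variant, count in genotypes.items())


# Hypothetical participant and variant names.
participant = {"variant_a": 2, "variant_b": 0, "variant_c": 1}
unweighted_score = genetic_risk_score(participant)
weighted_score = genetic_risk_score(
    participant, weights={"variant_a": 0.5, "variant_b": 1.0, "variant_c": 2.0}
)
print(unweighted_score, weighted_score)
```

A score like this collapses many variants into one modifier, which reduces the number of interaction tests at the cost of assuming the variants act in the same direction.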

As one of the most studied genes in gene–air pollution interactions, GSTM1 has been examined for CVD risk factors beyond HRV measures. A small-panel study of 22 individuals suggested that associations between short-term exposure to PM2.5 and red blood cell count are primarily observable amongst individuals with the GSTM1 null genotype (Schneider et al. 2010).

Though investigated, no interactions with air pollution were found for GSTM1, or other genes in the glutathione S transferase family, in relation to blood pressure (Frampton et al. 2015; Mordukhovich et al. 2009) or hypertension (Levinsson et al. 2014), hinting that interactions involving glutathione S transferase family genes may be specific to a subset of CVD outcomes and risk factors.

In a study of 25 candidate genes (202 SNPs total), associations between short-term PM2.5 exposure and postural blood pressure changes were modified by variants in PHF11, which is linked to T-cell activation and inflammation-related outcomes (Jang et al. 2005; Rahman et al. 2010; Vercelli 2003), and by variants in MMP1 and ITPR2, which are renin–angiotensin-related genes (Wilker et al. 2009). In a study of five candidate genes related to microRNA processing, associations between short-term exposure to BC and blood pressure were stronger in individuals with wild-type or heterozygous genotypes (Wilker et al. 2010).

Inflammation

Inflammation is a causal risk factor for CVD (Siti et al. 2015) and a primary result of air pollution exposure is the triggering of inflammatory pathways. Many of the genes involved in air pollution interactions for CVD risk factors and outcomes, e.g., IL-6, TNF-α, and GSTM1, are also associated with inflammatory markers via gene–environment interactions. Observing the same genes acting at multiple levels of the “biological hierarchy” (molecular factor → risk factor → disease) may be explained by gene–air pollution interactions acting on molecular factors. Subsequent alterations in molecular pathways may then initiate changes in disease risk factors which can then impact downstream disease risk. Though such a chain of events linking genetic background and environmental exposures to disease risk is plausible and hinted at by the whole of the literature, it remains to be tested in either observational or experimental studies and thus should still be considered speculative at this point.

As with other outcomes, most interaction studies for inflammatory outcomes have been candidate gene studies using variants in inflammation-associated genes. Some of the most widely studied loci are IL-6, TNF-α, and the fibrinogen gene cluster, particularly for inflammation-related outcomes such as sVCAM and IL-6 blood concentrations (Table 2). In a multi-ethnic interaction study, investigators observed a significant interaction between the GSTM1 null genotype and short-term exposure to PM2.5 which was associated with blood IL-6 concentrations. Two independent studies showed that variants in the IL-6 gene region modified associations between blood IL-6 concentrations and short-term (24 h) exposure to carbon monoxide (CO) (Ljungman et al. 2009) and long-term exposure (1 year) to both NO2 and SO2 (Panasevich et al. 2013). These two studies examined different IL-6 variants, though rs2069832 from Ljungman et al. (2009) and rs1800795 from Panasevich et al. (2013) were in near-perfect linkage disequilibrium (r² > 0.99). For both short-term exposure to CO (Ljungman et al. 2009) and long-term exposure to NO2 (Panasevich et al. 2013), the association between exposure and IL-6 concentrations was stronger amongst minor allele carriers in an almost linear fashion. For SO2, its association with blood IL-6 concentrations was only observed amongst individuals with one or more copies of the minor (G) allele for the IL-6 variants rs1800795 and rs1800797 (Panasevich et al. 2013).
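For readers less familiar with the linkage disequilibrium statistic mentioned above, the following minimal sketch computes r² between two biallelic variants from haplotype and allele frequencies, using the standard definition r² = D² / (pA(1 − pA)pB(1 − pB)) with D = pAB − pA·pB. The frequencies are hypothetical and are not estimates for the IL-6 variants discussed.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Squared correlation (r^2) between allele A at one locus and B at another.

    p_ab: frequency of the haplotype carrying both A and B
    p_a:  frequency of allele A
    p_b:  frequency of allele B
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denominator


# Hypothetical frequencies: the two alleles almost always travel together.
r2_high = ld_r_squared(p_ab=0.399, p_a=0.40, p_b=0.40)
# Hypothetical frequencies: the two alleles are statistically independent.
r2_none = ld_r_squared(p_ab=0.16, p_a=0.40, p_b=0.40)
print(f"near-perfect LD: r^2 = {r2_high:.3f}; independence: r^2 = {r2_none:.3f}")
```

When r² is close to 1, two variants carry nearly the same information, so interaction findings at one variant can be treated as a proxy for the other.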

Table 2 Commonly found genes with gene–air pollution interactions for cardiovascular disease

The studies by Ljungman et al. (2009) and Panasevich et al. (2013) also examined genetic variants in the fibrinogen gene cluster (FGA, FGG, and FGB) for interactions but did not observe any significant interaction associations with blood IL-6 concentrations. However, in a study of short-term exposure to PM10, individuals with more copies of the minor allele for rs1800790 (located in FGB) had a greater association between PM10 exposure and fibrinogen concentrations (Peters et al. 2009). In a short-term exposure study of particle number count (PNC), associations between PNC and fibrinogen were higher for individuals with higher allelic risk profile scores for oxidative stress and metal processing pathways (Bind et al. 2014).

One study examined interactions between mitochondrial DNA haplogroups, where a haplogroup is a haplotype shared by a population of mitochondria, and air pollution exposure. This study observed that associations between short-term exposure to air pollutants (BC, CO, nitrogen oxides, and polycyclic aromatic hydrocarbons) and blood concentrations of IL-6 and TNF-α were stronger in the H mitochondrial haplogroup as compared to other haplogroups (Wittkopp et al. 2013). The other studies of inflammation-related outcomes examined blood concentrations of soluble vascular cell adhesion molecule (sVCAM) and soluble intercellular adhesion molecule (sICAM). One study observed that associations between sVCAM concentrations and BC exposure were elevated in individuals with the GSTM1 null genotype (Madrigano et al. 2009). The second study examined variants in five microRNA processing genes for interactions with short-term exposure to PM2.5. This study observed that the direction of association was flipped for individuals with one or more minor alleles for GEMIN4 variants as compared to those with no minor alleles (Wilker et al. 2011).

microRNAs, metabolites, DNA methylation

MicroRNAs, metabolites, and DNA methylation are novel risk factors, and potential biomarkers, for CVD (Barwari et al. 2016; McGarrah et al. 2018; Muka et al. 2016; Ono et al. 2011; Zhang et al. 2016). These molecular factors may represent the most primitive level at which gene–air pollution interactions may exert effects which can translate into downstream disease risk. Though these factors are still emerging molecular measures within the environmental and health landscapes, current research already indicates that genes with interactions for CVD outcomes and risk factors, e.g., HFE and GSTM1, also have air pollution interactions associated with emerging molecular CVD biomarkers.

There have been two studies to date which have examined gene–environment interactions in relation to CVD-associated metabolites. In one study researchers examined whether genetic variants modified associations between blood homocysteine and exposure to BC and PM2.5. Associations between 7-day average exposure to PM2.5 and BC and blood homocysteine concentrations were modified by genetic variants in HFE (C282Y), CAT (rs2300181), and GSTT1. Though interactions for PM2.5 did not show a discernable pattern, individuals with the GSTT1 null deletion had stronger positive associations between BC and homocysteine than those with the wild-type genotype (Ren et al. 2010b). Urinary 8-hydroxy-2′-deoxyguanosine (8-OHdG) is a marker of oxidative stress that is associated with environmental exposures. In a candidate gene study of oxidative stress-related genes, researchers observed that the association between 8-OHdG and 18–24 day average exposure to organic carbon is modified by genetic variants in CAT (rs2286367), GSTM1 (null deletion), and GC (rs2282679) (Ren et al. 2010c).

MicroRNAs are central regulators of gene expression and many cellular processes. In a study of short-term exposure to BC, PM2.5, and sulfate (SO₄²⁻), associations between air pollutant exposure and circulating concentrations of multiple microRNAs were modified by genetic variants in GEMIN4 and DGCR8, which encode proteins involved in microRNA processing (Fossati et al. 2014).

Long interspersed nuclear element-1 (LINE-1) is a transposable DNA element found throughout the human genome. Methylation at LINE-1 loci is often used as an indicator of global methylation and has been associated with CVD risk (Muka et al. 2016). In a multi-ethnic study of approximately 400 pregnant women, associations between LINE-1 methylation and exposure to PM2.5, PM10, NO2, and ozone in the first trimester were examined for modifications by 8 methylation-associated genes (262 variants total). After a false discovery rate correction there were 11 variants (from four genes) with significant interactions, nearly all with positive interaction coefficients. Rs16999714, located in the methyltransferase gene DNMT1, had significant interactions with all exposures examined (Breton et al. 2016). The only other study of DNA methylation was done in nearly 700 Caucasian males. This study found that BC had a stronger association with LINE-1 methylation in participants with the GSTM1 null genotype than in participants with the wild-type genotype (Madrigano et al. 2011).
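A false discovery rate correction of the kind mentioned above can be sketched as follows. The text does not state which procedure Breton et al. (2016) used, so the Benjamini-Hochberg step-up procedure shown here is an assumption, and the P values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking tests rejected at FDR level alpha.

    Benjamini-Hochberg step-up: find the largest rank k (1-based, after
    sorting P values in ascending order) with p_(k) <= (k / m) * alpha,
    then reject the k hypotheses with the smallest P values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * alpha:
            max_rank = rank
    rejected = [False] * m
    for index in order[:max_rank]:
        rejected[index] = True
    return rejected


# Hypothetical interaction P values for six variants.
interaction_p_values = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
significant = benjamini_hochberg(interaction_p_values)
print(significant)
```

Because the procedure is step-up, a P value that fails its own rank threshold can still be rejected when a larger P value further down the sorted list passes its threshold.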

Discussion

Challenges of gene–environment interaction studies

Despite the potential of gene–environment interaction studies to deepen our understanding of the mechanisms of disease and environmental health risks, these studies are rare as compared to studies of genetic or environmental “main effects”. This scarcity of studies is likely related to challenges such as gaining sufficient power, determining the appropriate scale of interactions, overcoming measurement error, and performing functional validation (McAllister et al. 2017). Power is the most frequent challenge to overcome in gene–environment interaction studies, since with equal sample sizes the power for interaction studies is almost always lower than the power for genetic main effect studies (Hunter 2005). Approaches to overcome power limitations include candidate gene approaches (which reduce multiple testing requirements), case-only analyses, two-stage analyses, and decreasing measurement error (Gauderman et al. 2017; Mukherjee et al. 2008; Murcray et al. 2009; Wong et al. 2003). These approaches can be quite efficient, and in some cases can decrease the number of samples needed by > 50% (Gauderman et al. 2017). In the case of decreasing measurement error, new exposure assessment methods such as remote sensing and molecular proxies for exposure may substantially improve the precision of exposure estimates and reduce inter-study heterogeneity (McAllister et al. 2017).
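The statement that candidate gene approaches reduce multiple testing requirements can be illustrated with a simple calculation. The sketch below assumes a Bonferroni correction, which is only one of several possible corrections and is not attributed to any study cited here; the numbers of tests are hypothetical.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical numbers of interaction tests, chosen only for illustration.
candidate_threshold = bonferroni_threshold(200)          # candidate gene panel
genome_wide_threshold = bonferroni_threshold(1_000_000)  # genome-wide scan

print(f"candidate gene panel: P < {candidate_threshold:.1e}")
print(f"genome-wide scan:     P < {genome_wide_threshold:.1e}")
```

The far less stringent per-test threshold for the smaller panel is the source of the power gain, at the cost of missing interactions outside the chosen genes.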

Even after successfully uncovering a gene–environment interaction, the relevance of this interaction may be difficult to interpret and communicate. To improve interpretation, studies should perform genotype-specific associations where possible, and always explicitly state the genetic model assumed. Even with improved reporting, direct intervention via in vivo and in vitro studies may be required to fully interpret interactions. Given the expense and complexity of in vivo and in vitro studies, researchers may choose to use tools such as the Genotype-Tissue Expression (GTEx) web portal (which contains information on tissue-specific associations between genotypes and RNA transcripts; https://gtexportal.org/home/) (The GTEx Consortium 2015) or public databases such as the Encyclopedia of DNA Elements (ENCODE Project Consortium 2012) or Kyoto Encyclopedia of Genes and Genomes (Kanehisa et al. 2016) to perform informative annotation and in silico functional validation. Correlations between environmental exposures and other factors also complicate interpretations. In the case of air pollution, correlations exist not only with other environmental factors, but also social factors, which may independently alter susceptibility to air pollution (Fuller et al. 2017). Additionally, certain racial and ethnic groups often reside closer to air pollution sources (Mikati et al. 2018), which can introduce correlations between air pollution exposure and race/ethnicity. Researchers must be aware of these potential correlations and properly account for them in interaction models and study designs.

Closely related to the interpretation of interaction is deciding on the scale of interactions. Interactions on an additive scale are rarely examined in the literature, and though a common assumption is that additive interactions can be captured via multiplicative interaction models, this may not always be true (Li and Chambless 2007). In some cases, additive interactions can better reflect underlying biology or be more appropriate for public health research objectives (Gauderman et al. 2017). Thus, the frequent decision to exclusively examine multiplicative gene–environment interactions may simplify studies at the expense of obscuring biological insights and complicating public health interpretations.
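The difference between the two scales can be shown with a small worked example. The sketch below computes a multiplicative interaction ratio and the relative excess risk due to interaction (RERI), a commonly used measure of additive interaction, from relative risks defined against individuals with neither the genetic variant nor the exposure. The relative risks are hypothetical.

```python
def interaction_measures(rr_gene, rr_exposure, rr_both):
    """Return (multiplicative ratio, RERI) from three relative risks.

    Each relative risk is relative to individuals with neither the genetic
    variant nor the exposure.
    multiplicative ratio = RR_both / (RR_gene * RR_exposure); 1 means none.
    RERI = RR_both - RR_gene - RR_exposure + 1; 0 means none.
    """
    multiplicative_ratio = rr_both / (rr_gene * rr_exposure)
    reri = rr_both - rr_gene - rr_exposure + 1
    return multiplicative_ratio, reri


# Hypothetical relative risks.
ratio, reri = interaction_measures(rr_gene=2.0, rr_exposure=3.0, rr_both=6.0)
print(f"multiplicative ratio = {ratio:.1f}, RERI = {reri:.1f}")
```

In this example there is no interaction on the multiplicative scale, yet the joint risk exceeds the sum of the separate excess risks, so a multiplicative-only analysis would miss an additive interaction that may matter for public health.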

Overview

There is substantial evidence in the CVD literature that associations between air pollution exposure and CVD outcomes are altered by underlying genetic variation. Overall, studies of gene–air pollution interactions in CVD have gathered substantial evidence that genes related to detoxification, inflammation, and microRNA processing harbor genetic variants which may alter the association between air pollution exposure (in both short- and long-term periods) and cardiovascular outcomes. Though this evidence comes primarily from Caucasian cohorts, there have been a few multi-ethnic studies. However, there is still substantial work to be done within the field.

Most published gene–air pollution interaction studies have been done using a candidate gene approach, which can limit researchers’ ability to find novel interactions. Additionally, while independent replication has become standard for studies of genetic main effects, this has not translated to interaction studies. Current interaction studies rarely examine the same exposure or outcome, so even though only a relatively small number of genes represent most of the interactions reported in the literature (Tables 1, 2), it remains difficult to determine if interactions can be replicated in independent studies. HRV measures and IL-6 blood concentrations are the most studied outcomes (Table 1), but even with these commonly studied outcomes, rarely are the same exposure and genetic locus examined in independent studies. Independent replication will be essential to increasing confidence in any given interaction and demonstrating that interactions are not cohort specific. In addition, the vast majority of gene–air pollution interaction studies have been performed in Caucasian cohorts. Pooling cohorts to create studies with larger sample sizes and increased diversity might allow for more genome-wide approaches, improve examination of associations across racial/ethnic groups, and facilitate discovery and replication analyses.

While the interpretation of interactions is a persistent challenge, within the gene–air pollution literature many studies stratify associations across genotypes, which facilitates interpretation and identification of genetic models. Functional follow-up for gene–air pollution interaction studies has been non-existent, possibly due to the expense and complexity of such an undertaking. However, as the field advances, pairing association studies with functional follow-up may be a key step in translating statistical interactions into public policy and mechanistic understandings.