Introduction

Although malignant melanoma can be caused by sunburn, fatty liver disease by excessive alcohol consumption, lung cancer by cigarette smoking, and type 2 diabetes (T2D) by obesity, disease is not an inevitable consequence of excessive exposures to these risk factors; by contrast, in Mendelian diseases like familial partial lipodystrophy [1] or phenylketonuria [2], there is much higher certainty that exposure to refined carbohydrates and phenylalanine respectively can cause severe health detriments. The high sensitivity to certain environmental exposures and pharmacotherapies that some people experience and others do not may be governed by genetic factors that interact with these exposures to determine risk. In some instances, genetic variation may be beneficial, rendering the bearers of these mutations especially sensitive to the health-enhancing effects of specific drugs, foods or types of exercise, for example, and in other people genetic factors may augment the detrimental effects of lifestyle and predispose those taking certain medicines to adverse events. Discovering, replicating, validating, and translating information about interactions between genetic variation and environmental exposures and medical therapies has important implications for the prediction, targeted prevention, and stratified treatment of T2D and many other diseases.

The literature on gene-environment interactions in diabetes-related traits is extensive, but few studies are accompanied by adequate replication data or compelling mechanistic explanations. Moreover, most studies are cross-sectional, from which temporal patterns and causal effects cannot be confidently ascertained. This has undermined confidence in many published reports of gene-environment interactions across many diseases; although interaction studies in psychiatry have been especially heavily criticized [3], many of the points made in that area relate to other diseases, not least to T2D, where the diagnostic phenotype (elevated blood glucose or HbA1c) is a consequence of underlying and usually unmeasured physiological defects (e.g., at the level of the pancreatic beta-cell, peripheral tissue, liver, and gut), and the major environmental risk factors are difficult to measure well. Nevertheless, several promising examples of gene-environment interactions relating to cardiometabolic disease exist, as discussed below and described in Table 1, and interaction studies with deep genomic coverage in large cohorts are now conceivable; the hope is that these studies will highlight novel disease mechanisms and biological pathways that will fuel subsequent functional and clinical translation studies. This is important, because diabetes medicine may rely increasingly on genomic stratification of patient populations and disease phenotype, for which gene-environment interaction studies might prove highly informative.

Table 1 Summary of key literature on gene-environment interactions in obesity and type 2 diabetes

How Are Gene-Environment Interactions Defined?

The term gene-environment interaction has different meanings to different biomedical researchers (see Supplement 1 for a glossary of terms used). Here, however, we focus on the concept of effect modification, where the genetic and environmental exposures convey synergistic effects, or, in other words, where the joint effects are more or less than additive and the estimated genetic effect on a trait differs in magnitude (and sometimes direction) across the spectrum of an environmental exposure. Figure 1 shows three types of interaction effects, and also illustrates why modeling interactions is challenged by scale dependency (i.e., where interaction effects are influenced by the scale on which the dependent variable is modeled). In clinical trials, gene-treatment interactions are usually considered to occur when the direction and/or magnitude of the treatment effects are conditional on the participant’s genotype.

Fig. 1

Types of gene-environment interactions. a A non-removable, “pure” interaction. b A non-removable “cross-over” interaction. c A removable interaction with the trait expressed on the linear scale. d Exactly the same data used in (c), but the interaction is removed by expressing the trait on the natural log scale
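The removable interaction in panels (c) and (d) can be illustrated with a small numerical sketch. The cell means below are hypothetical (they are not the data behind Fig. 1) and assume that genotype and exposure each double the trait, so their joint effect is purely multiplicative.

```python
import numpy as np

# Hypothetical trait means (illustrative only, not data from the figure).
# Rows: genotype (non-carrier, carrier); columns: environment (unexposed, exposed).
# Each factor doubles the trait, so the effects are purely multiplicative.
means = np.array([[10.0, 20.0],
                  [20.0, 40.0]])

def interaction_contrast(cell_means):
    """Genetic effect in the exposed minus genetic effect in the unexposed."""
    effect_unexposed = cell_means[1, 0] - cell_means[0, 0]
    effect_exposed = cell_means[1, 1] - cell_means[0, 1]
    return effect_exposed - effect_unexposed

# Linear scale: the genetic effect is 10 when unexposed and 20 when exposed.
linear_contrast = interaction_contrast(means)

# Natural log scale: the genetic effect is log(2) in both environments.
log_contrast = interaction_contrast(np.log(means))
```

On the linear scale the contrast is non-zero, which would be read as an interaction; on the log scale it vanishes. A "pure" or "cross-over" pattern, as in panels (a) and (b), cannot be removed by a monotonic transformation of this kind.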

The Rationale for Studying Gene-Environment Interactions

It is often said that T2D is the consequence of gene-environment interactions [17]. Indeed, both the environment and the genome are involved in diabetes etiology, and there are many genetic and environmental risk factors for which very robust evidence of association exists. But when epidemiologists and statisticians discuss gene-environment interactions, they are usually referring to the synergistic relationship between the two exposures, and there is limited empirical evidence for such effects in the etiology of cardiometabolic disease. Indeed, in non-monogenic human obesity, a condition widely believed to result from a genetic predisposition triggered by exposure to adverse lifestyle factors, of the >200 human gene-lifestyle interaction studies reported since 1995, only a few examples of gene-environment interactions have been adequately replicated [18], and because these results are derived primarily from cross-sectional studies with little or no experimental validation, even those that have been robustly replicated may not represent causal interaction effects. The evidence base for T2D is thinner still. Nevertheless, other data support the existence of gene-environment interactions in complex disease, thus motivating the search for empirically defined interactions in T2D.

Some of the earliest empirical examples of gene-environment interactions come from studies in Drosophila that show that eye facet number varies both by genotype and temperature [19–21]; similar examples exist for other morphological features of the fly’s eyes and head [22]. In agricultural genetics, the need to maintain or improve food security in the face of global population growth, climate change, and land challenges has demanded the cultivation of genetically engineered plants to maximize crop yields conditional on environmental characteristics (e.g., soil quality, precipitation, altitude, or temperature) [23]. Studies of gene-environment interactions in durum wheat, for example, illustrate that in low crop yield regions, the D3415 cultivar performs well, whereas other cultivars (Karel, W4267, M104 and Messapia) produce much higher yields than D3415 in high-yield regions [24]. Such studies emphasize how pairing a plant’s genes with its environment can optimize selected phenotypes; similarly, matching appropriate environments and medical interventions to genotype is likely to be necessary for the optimization of health phenotypes in humans.

Animal studies of obesity and diabetes also provide useful examples of interactions, where phenotypic differences between genetically engineered animals are augmented with interventions that perturb the molecular pathways upon which the gene(s) of interest reside. For example, high-fat feeding is a common intervention used to accentuate phenotypic differences between genetically distinct animals; in a study of glucose and lipid metabolism, the effects of 8-week high and low fat feeding regimes on metabolic phenotypes of five inbred mouse strains (C57BL/6J, 129X1/SvJ, BALB/c, DBA/2, FVB/N) were compared; the study showed that metabolic sensitivity to dietary fat varied considerably by genotype. Elsewhere, the NOD mouse strain has provided a longstanding murine model for autoimmune type 1 diabetes owing to its predisposition to early-onset disease [25]; the NOD mouse is especially susceptible when reared in a germ-free environment, but much less so when reared in standard “dirty” cages [26]. This phenomenon, which is not observed in wild-type mice, is thought to reflect immune adaptations in the NOD mouse that require exposure to foreign microbes early in life [26].

Complex metabolic diseases such as non-autoimmune diabetes are often uncommon in indigenous populations living traditional subsistence farming or hunter-gatherer lifestyles, yet phylogenetically similar people living industrialized lifestyles are often disproportionately afflicted [3]; these observations are consistent with the presence of susceptibility loci whose effects are triggered by environmental exposures. This phenomenon is most apparent in ethnic groups whose recent evolution is characterized by migration and frequent exposure to famine, cold, and other metabolic stressors. This process, which is described in detail elsewhere [27], might have led to enrichment of alleles that predispose to metabolic efficiency, particularly after meals. Other intriguing examples are those from certain populations that cope unusually well living at high altitudes [17], in nutrient deficient settings [18], or in cold climates [28]. Whilst these ecological observations are especially prone to confounding, bias, and reverse causality, they provide tentative support for gene-environment interactions in human disease.

Heritability studies conducted in intervention settings also provide suggestive evidence of gene-treatment interactions. Studies of overfeeding, underfeeding, and aerobic exercise training in twins and nuclear families indicate that changes in body composition are more highly correlated between members of the same kinship than between those of different kinships. For example, Bouchard et al. implemented a long-term overfeeding protocol (structured diet containing 1000 kcal/day above the baseline energy requirement) in 12 pairs of monozygotic (MZ) twins [29]; the intraclass correlation (ICC) for change in body weight in MZ pairs was r = 0.55. The ICC in non-twin pairs was not reported, but the ratio of the trait variance explained between pairs to that within pairs (F ratio) was 3.43, suggesting that body weight adaptation to long-term overfeeding is heritable. Elsewhere, adaptation of maximal oxidative capacity (a measure of aerobic fitness that is a strong predictor of diabetes) following a 20-week standardized exercise intervention protocol was examined in 720 individuals from 450 nuclear families [30]. As with the overfeeding study, aerobic adaptation was strongly correlated in biologically related participants, and much less so in those who were unrelated (F ratio = 2.50). Importantly though, defining heritability in this way incorporates both genetic and shared non-genetic (e.g., shared familial environment) sources of trait variance; moreover, the heritable basis of baseline body weight and aerobic fitness is substantial, and because these short-term studies did not partition out these factors, it is difficult to determine the extent to which phenotypic adaptation is under genetic control.
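The link between the F ratio and the ICC quoted for the twin study can be sketched as follows. The sketch assumes the standard one-way ANOVA relationship for equally sized groups, ICC = (F − 1)/(F + k − 1), which is a textbook identity rather than a formula taken from the cited study; the function names and the weight-gain values are hypothetical.

```python
def icc_from_f(f_ratio, group_size=2):
    """ICC implied by a one-way ANOVA F ratio for equally sized groups."""
    return (f_ratio - 1.0) / (f_ratio + group_size - 1.0)

def anova_pairs(pairs):
    """Between-pair and within-pair mean squares for (twin_a, twin_b) values."""
    n = len(pairs)
    pair_means = [(a + b) / 2.0 for a, b in pairs]
    grand_mean = sum(pair_means) / n
    # Between-pair mean square: 2 members per pair, n - 1 degrees of freedom.
    msb = 2.0 * sum((m - grand_mean) ** 2 for m in pair_means) / (n - 1)
    # Within-pair mean square: n degrees of freedom (one per pair).
    msw = sum((a - m) ** 2 + (b - m) ** 2
              for (a, b), m in zip(pairs, pair_means)) / n
    return msb, msw

# F ratio reported for the overfeeding study; with pairs (k = 2) this
# corresponds to an ICC of about 0.55, matching the value quoted in the text.
icc_reported = icc_from_f(3.43)

# Hypothetical weight gains (kg) for five twin pairs, for illustration only.
hypothetical_gain_kg = [(4.5, 5.0), (7.0, 8.5), (10.0, 9.0), (6.0, 7.5), (12.0, 11.0)]
msb, msw = anova_pairs(hypothetical_gain_kg)
icc_direct = (msb - msw) / (msb + msw)
```

The identity applies directly to the twin design, where every group has two members; it would need adapting for nuclear families of varying size.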

Discovery Strategies

Numerous approaches, varying by study design, data type and analytical method, have been used to discover gene-environment interactions; some approaches address similar objectives, whilst others are complementary and can be applied in sequence. Below we describe several of these approaches, and refer the reader to another excellent review of gene-environment interaction methods [31].

(a) Established statistical approaches

Until 2008, almost all studies of gene-environment interactions focused on testing hypotheses based on existing biological evidence, typically focusing on a small number of genetic variants. Linkage studies were the first generation of genome-wide interaction studies (GWIS) [32] but were generally unsuccessful and are seldom used in contemporary studies of complex traits. With few exceptions (see Table 1), neither approach led to convincing evidence of gene-environment interactions.

The advent of genome-wide association studies (GWAS) in 2005 facilitated a new era of genetic association studies and the rapid discovery of thousands of loci for many complex traits; GWAS triggered a quantum leap in population genetics, largely because it is agnostic to prior biological knowledge, which contrasts directly with most previous gene discovery approaches. By 2008, researchers were exploring whether environmental risk factors modified the effects of GWAS loci, an approach that now predominates in gene-environment interaction research. This approach is appealing because it is analytically simple and because few statistical tests are performed, which helps preserve statistical power. Indeed, several of the few adequately replicated examples of gene-environment interactions have been discovered in this way (Table 1). There are, however, good arguments for why loci derived from GWAS may not, on average, be good candidates for interactions [33]. For instance, heterogeneous SNP association signals are generally filtered out in standard GWAS meta-analyses, yet as we discuss below, variance across genotypes is a characteristic of interactions. Indeed most, perhaps all, comprehensive studies focused on determining whether established GWAS-derived loci interact with environmental risk factors or clinical interventions have yielded predominantly negative results [4, 5, 34–36].

With GWAS came the possibility to conduct GWIS at a much higher variant density, and in samples of unrelated individuals, not only in family pedigrees as with earlier linkage studies. The simplest approach involves testing all SNPs for interaction with one or more environmental variables. Whilst computationally feasible [37], conventional GWIS for complex traits often require unachievably large sample sizes to be adequately powered. Restricting the number of variants tested to those with nominally significant marginal associations (e.g., P < 0.10) may help preserve power [38]. Other statistical tricks to minimize multiple testing involve the joint estimation of SNP and SNP × environment regression coefficients (2 df tests), which are relatively powerful, especially when an interacting locus also conveys a detectable marginal effect [39]. This approach has also been adapted for meta-analysis [40], and in some empirical situations has been shown to be more powerful than testing for marginal or interaction effects separately [16], although no novel loci have yet been confirmed using this approach for T2D.
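A joint 2 df test of the SNP and SNP × environment coefficients can be sketched as a likelihood ratio comparison of two nested linear models. This is a minimal illustration on simulated data, not the implementation used in the cited studies; the function name, effect sizes, and allele frequency are assumptions, and the p-value uses the closed form of the chi-square survival function with 2 df, exp(−x/2).

```python
import numpy as np

def joint_2df_test(y, g, e):
    """Likelihood ratio test of the SNP and SNP x E terms jointly (2 df).

    Null model: y ~ 1 + e. Full model: y ~ 1 + e + g + g*e.
    """
    n = len(y)
    ones = np.ones(n)
    x_null = np.column_stack([ones, e])
    x_full = np.column_stack([ones, e, g, g * e])
    rss = []
    for x in (x_null, x_full):
        beta, *_ = np.linalg.lstsq(x, y, rcond=None)
        resid = y - x @ beta
        rss.append(float(resid @ resid))
    # Gaussian likelihood ratio statistic for nested linear models.
    lr_stat = n * np.log(rss[0] / rss[1])
    # Chi-square survival function with 2 df has the closed form exp(-x / 2).
    p_value = float(np.exp(-lr_stat / 2.0))
    return lr_stat, p_value

# Simulated example (all parameter values are hypothetical).
rng = np.random.default_rng(0)
n = 5000
g = rng.binomial(2, 0.3, size=n).astype(float)   # additive genotype coding
e = rng.normal(size=n)                           # standardized exposure
y = 0.1 * g + 0.3 * e + 0.1 * g * e + rng.normal(size=n)

lr_stat, p_value = joint_2df_test(y, g, e)
```

Because the marginal and interaction terms are tested together, a locus with a modest marginal effect and a modest interaction effect can reach significance even when neither term would do so alone.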

(b) Data reduction approaches

A number of data reduction strategies for the analysis of gene-environment interactions have been proposed for use in observational studies. A common feature of these approaches is reduction of multiple hypothesis testing through selection of a subset of variants (step 1) for explicit interaction testing (step 2). One such approach is the “case-only” design, whereby the association between SNPs and an interacting variable is first tested only in disease cases and associated SNPs are then tested for interaction in the full cohort of cases and controls. Statistical power is preserved because the first screening step only involves association tests, which generally yield higher power than interaction tests when all else is equal. Although somewhat counterintuitive, in the presence of gene-environment interactions, SNPs are associated with the interacting environmental exposure only in cases, providing an opportunity to shortlist candidate SNPs for subsequent pairwise interaction tests in the full cohort using an interaction effect test [41]. A caveat to this approach is that when the genetic and environmental variables are correlated in controls, variants will be inappropriately prioritized for interaction testing, thereby reducing the power of the test; this problem may be enhanced when using GWAS, owing to the large number of variants tested.
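The logic of the case-only screening step can be sketched with exact probabilities rather than sampled data. The sketch assumes a binary genotype (carrier vs. non-carrier), a binary exposure, a multiplicative risk model, and independence of genotype and exposure in the source population; all parameter values and the function name are hypothetical.

```python
import itertools

def case_only_or(p_g, p_e, base_risk, rr_g, rr_e, rr_ge):
    """Genotype-exposure odds ratio among cases.

    Assumes genotype and exposure are independent in the population and
    that risks combine multiplicatively, with rr_ge the interaction term.
    """
    cases = {}
    for g, e in itertools.product((0, 1), repeat=2):
        p_cell = (p_g if g else 1 - p_g) * (p_e if e else 1 - p_e)
        risk = base_risk * rr_g ** g * rr_e ** e * rr_ge ** (g * e)
        cases[(g, e)] = p_cell * risk   # proportional to P(G, E | case)
    return (cases[(1, 1)] * cases[(0, 0)]) / (cases[(1, 0)] * cases[(0, 1)])

# With an interaction risk ratio of 2, genotype and exposure are associated
# among cases even though they are independent in the population.
or_with_interaction = case_only_or(0.3, 0.4, 0.01, 1.2, 1.5, 2.0)

# Without interaction, no genotype-exposure association appears among cases.
or_without_interaction = case_only_or(0.3, 0.4, 0.01, 1.2, 1.5, 1.0)
```

Under these assumptions the case-only odds ratio equals the interaction risk ratio, which is why the association test in cases can serve as a screen. If genotype and exposure are correlated in the population, that identity no longer holds, which is the caveat described above.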

Analytical strategies have also emerged that focus on modeling genetic effects for quantitative signatures of gene-environment interactions. These approaches pivot on the notion that interaction effects are characterized by heteroscedastic phenotypic variances that are conditional upon genotype (termed variance heterogeneity) (see Fig. 2); various methods have been proposed that exploit this characteristic, approaches that have proven somewhat successful for discovering gene-environment interactions in cardiometabolic traits [42•, 43•]. Thus, identifying differences in variance conditional upon genotype allows for the shortlisting of SNPs for explicit interaction testing. In the seminal description of this approach [42•], SNPs with genome-wide significant (P < 5 × 10⁻⁸) heterogeneity of variance estimates were identified for plasma C-reactive protein and soluble ICAM1, which were subsequently shown to interact with BMI and smoking (P < 5 × 10⁻⁸). Although in this example, the interaction would have been detectable in a conventional GWIS analysis, in other examples, where the explicit interaction test (stage 2) is not genome-wide significant, a less conservative significance threshold might be sufficient, owing to the orthogonal nature of the two sets of evidence.

Fig. 2

Outline of how variance heterogeneity tests can be used to discover gene-environment interactions. a Conventional linear regression analysis in the presence of variance homogeneity. b Conventional linear regression analyses in the presence of variance heterogeneity. c Linear regression analyses intended to model variance heterogeneity. d Linear regression analyses intended to unmask the underlying gene-environment interaction

An important advantage of variance heterogeneity tests is that the environmental exposure does not need to be explicitly characterized, as heterogeneity of variance will be present even when the interacting environmental factor is unmeasured or unknown. Indeed, many large datasets exist with genetic and phenotypic data that lack good environmental exposure data, and even where environmental exposure data are available, standardizing measurements across cohorts can result in a substantial loss of power in meta-analyses [6]. A caveat of this approach, as with most tests of gene-environment interaction, is that it is prone to confounding by linkage disequilibrium (synthetic associations and rare variant effects), scale dependency, and population stratification.
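Why an unmeasured interacting exposure leaves a variance signature can be sketched with a short simulation. The model and all coefficients below are hypothetical: the exposure effect on the trait is assumed to grow with the number of risk alleles, and the exposure itself is then discarded, as it would be in a dataset lacking environmental data.

```python
import numpy as np

# Simulated cohort (all parameter values are hypothetical).
rng = np.random.default_rng(1)
n = 30000
g = rng.binomial(2, 0.3, size=n)   # genotype coded as 0, 1, or 2 risk alleles
e = rng.normal(size=n)             # interacting exposure, treated as unmeasured

# The exposure effect on the trait depends on genotype (0.3, 0.7, 1.1 per SD).
y = 0.2 * g + (0.3 + 0.4 * g) * e + rng.normal(size=n)

# Trait variance within each genotype group, computed without using `e`.
# Expected values under this model: (0.3 + 0.4 * g)**2 + 1, i.e. roughly
# 1.09, 1.49, and 2.21 for g = 0, 1, 2.
variance_by_genotype = {
    k: float(np.var(y[g == k], ddof=1)) for k in (0, 1, 2)
}
```

A formal screen would replace the descriptive comparison with a test of equal variances across genotype groups, followed by explicit interaction testing of the shortlisted SNPs where exposure data exist.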

(c) Causal inference models

Causality is often uncertain in epidemiology when an association between an exposure and outcome is observed. Genetics is well suited to causal inference, because genetic variants are randomly assorted at meiosis and are usually not correlated with factors that can confound non-genetic associations in epidemiology. Using an approach termed Mendelian randomization, genotypes can be used as instrumental variables in experiments that resemble randomized controlled trials (RCT) [44, 45]. Because there are now many established associations between gene variants and diabetes-related exposures (e.g., smoking [46], coffee consumption [47], macronutrient intake [48]), it is possible to undertake a special type of Mendelian randomization experiment that focuses on modeling gene-environment interactions using genotypes as proxies for environmental exposure, although interaction studies of this kind are yet to be reported. A limitation of this approach is that suitable instruments (genetic variants that are strongly correlated with the exposures of interest) for the environmental exposures in gene-environment interaction tests are often unavailable.
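The instrumental variable idea behind Mendelian randomization can be sketched with a simulated example using the Wald ratio estimator (the genotype-outcome slope divided by the genotype-exposure slope). This illustrates the basic estimator only, not the interaction extension discussed above; the data-generating model, effect sizes, and variable names are assumptions.

```python
import numpy as np

# Simulated data (all parameter values are hypothetical).
rng = np.random.default_rng(2)
n = 200_000
g = rng.binomial(2, 0.3, size=n).astype(float)   # instrument: genotype
u = rng.normal(size=n)                           # unmeasured confounder
x = 0.5 * g + u + rng.normal(size=n)             # exposure
y = 0.3 * x + u + rng.normal(size=n)             # outcome; true causal effect = 0.3

def slope(a, b):
    """Ordinary least squares slope of b regressed on a."""
    return float(np.cov(a, b)[0, 1] / np.var(a, ddof=1))

# The observational estimate is biased upward by the shared confounder.
observational = slope(x, y)

# Wald ratio: the genotype affects the outcome only through the exposure.
wald_ratio = slope(g, y) / slope(g, x)
```

The Wald ratio recovers the simulated causal effect because the genotype is independent of the confounder, whereas the observational slope does not. The estimate becomes unstable when the genotype-exposure association is weak, which reflects the limitation regarding suitable instruments noted above.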

Causal interactions between genetic and environmental factors can also be modeled using types of Bayesian Network Analysis, such as the Bayesian Epistasis Association Mapping tool [49] and hierarchical modeling [50]. Approaches like these utilize multiple layers of data to estimate directional relationships between variables and hence permit some degree of causal inference. Bayesian Network Analysis in general works well when accurate and precise data are included and where gene ontologies are well defined, and much less so when these conditions do not hold. One of the major appeals of Bayesian Network Analysis is its capacity to integrate data across multiple biologic systems gathered within the same participant, which is likely to be particularly relevant for the functional elucidation of gene-environment interaction effects.

Translation of Gene-Environment Interaction Effects

Research on the genetics of complex disease has two principal objectives: (i) to elucidate pathobiology and (ii) to aid the prevention or treatment of disease. The major advances in human genetics during the past 15 years, made possible primarily through huge developments in high-throughput genomic technologies combined with a greater willingness of scientists to collaborate, have facilitated discovery of thousands of disease-associated loci that with appropriate follow-up will substantially further our understanding of disease biology. The second objective, however, is yet to be realized to any meaningful degree.

(a) Theoretical considerations

Two common characteristics of established complex disease-associated variants discovered using hypothesis-free high-throughput approaches are that the magnitude of effect is relatively small and that it is homogeneous across a range of environmental settings and treatment arms of clinical trials [4, 34–36]. Whilst the discovery of these loci helps define novel aspects of human biology, this information has proven relatively ineffective for the stratification of medical interventions, probably in part because of the way in which the variants were discovered. To identify gene variants that are of use for stratified medicine will likely require explicit strategies that seek to discover loci that predict a person’s susceptibility to disease given specific environmental exposures or that predict treatment response. The strategies needed to detect such interactions will be distinct from those used to detect genetic associations per se.

The extent to which genetic information enhances the accuracy of established disease prediction models or improves the degree to which disease occurrence is correctly predicted in prospective analyses is likely to vary considerably across diseases. Importantly though, because germline DNA variants are salient biomarkers, their predictive accuracy relative to non-genetic biomarkers can improve as the time between the baseline assessment and disease incidence lengthens [51]. Thus, genotypes provide a rare example of disease biomarkers that could be measured very early in life to predict diseases occurring several decades later. Whilst many studies have reported on the discriminative or predictive accuracy of models including genetic and environmental data, most do not consider their joint, synergistic, effects and generally treat these two types of exposure as independent factors. However, Aschard et al. [52•] examined the discriminative value and reclassification potential of simulation models including two-way gene-gene and gene-environment interaction effects in relation to breast cancer, rheumatoid arthritis (RA), and T2D. The authors found that the inclusion of up to ten interaction effects of fairly modest magnitude improved discriminative accuracy (ROC AUC) for breast cancer by approximately 4 %, RA by approximately 2 %, and T2D by approximately 1 %. The net improvement in case–control classification for the model including all 10 interaction effects was approximately 30 % for each of these traits compared with the null model. Increasing the number (up to 20) and magnitude (risk ratio = 10) of the simulated interaction effects included in the model substantially increased both its discriminative accuracy and net reclassification.

Aschard et al.’s analyses focused on discriminating between people with and without prevalent disease, which is unlikely to be directly comparable with analyses focused on predicting incident events; although few discovery genetic association studies have been performed using longitudinal data, some prospective studies have estimated the predictive value of established prevalent disease-associated gene variants for change in quantitative biomarkers [53] or disease events [54]. Those studies suggest that genetic variants that are strongly associated with cross-sectional traits do not always predict change in the trait, and vice versa. Moreover, the primary metric used in these analyses was the C-statistic, a measure of discriminative accuracy whereby a value of 50 % reflects accuracy equivalent to tossing a coin and a value of 100 % reflects perfect discrimination; importantly, this particular approach to quantifying discriminative accuracy is sensitive to the frequency of the disease and its risk factors, with models focused on rarer diseases and exposures generally yielding lower values than those focused on common diseases and risk factors. Nevertheless, Aschard et al.’s study provides valuable information that may help quantify assumptions about the extent to which data on gene-environment interactions can help classify and predict disease events.
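The C-statistic described above can be computed directly from its definition: the probability that a randomly chosen case has a higher predicted risk than a randomly chosen control, with ties counted as one half. The sketch below is a plain pairwise implementation on hypothetical risk scores; the function name is an assumption.

```python
def c_statistic(case_scores, control_scores):
    """Proportion of case-control pairs in which the case scores higher.

    Ties count as half a concordant pair; 0.5 is equivalent to tossing
    a coin and 1.0 is perfect discrimination.
    """
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))

# Hypothetical predicted risks: three of the four case-control pairs
# are correctly ordered.
example = c_statistic([0.6, 0.3], [0.4, 0.2])
```

The pairwise loop is quadratic in sample size, so a rank-based calculation would be preferable for large cohorts; the definition is the same.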

(b) Mechanisms of action

The mechanisms underlying observations of gene-environment interactions in T2D are rarely discussed, probably because few functional studies have been performed around explicit interaction effects. However, more than half a century ago Jacob and Monod [55] outlined the mechanisms underlying the synthesis of enzymes in bacteria, which they described as requiring genetic repressors that can be activated or inactivated by specific metabolites present in the cellular environment. In pharmacogenetics, mechanisms are often eloquently described; take, for example, activating mutations in KCNJ11, the gene encoding the Kir6.2 subunit that controls gating of the ATP-sensitive K+ (KATP) channels in the pancreatic beta cells. Here, carriers of the mutations can produce but not secrete insulin in response to glucose; however, treatment with sulfonylureas, which bind to the SUR1 subunit of the sulfonylurea receptor/potassium channel complex on the beta-cell membrane, closes the KATP channels and depolarizes the membrane, leading to the activation of voltage-gated Ca2+ channels and thus increasing the secretion of insulin [56].

Most gene-environment interactions are likely to include one of four mechanisms: (i) ligand binding interactions (mutations that disrupt the binding of ligands to the cell membrane receptor(s) or the nuclear receptor(s)); (ii) epigenetic interactions (mutations that in the presence of certain environmental exposures cause epigenetic changes that differentially affect gene transcription); (iii) double hit interactions (where environmental exposures cause somatic mutations that interact with existing germline variants); and (iv) gating interactions (where mutations in regulatory elements pathogenically modulate the activity of biologic processes, such that, for example, without exercise, diet modification or pharmacotherapy, disease occurs).

Whilst understanding mechanisms of action may not be necessary for translating knowledge of interactions into the clinical context, defining mechanisms is necessary to identify therapeutic targets. Thus, emphasis should be placed on elucidating the functional processes underlying any valid observation of gene-environment interaction.

(c) Genotype-based recall (GBR)

Specially designed intervention studies, where large sample frames are used to identify two equally sized subgroups that are highly distinct in their genetic predisposition to disease (e.g., minor vs. major allele homozygotes at a given rare variant) and who are subsequently enrolled into a randomized controlled trial, represent a powerful test-bed through which gene-environment interaction effects can be validated (Fig. 3; Supplement 2). The earliest example of a genotype-based recall study focused on in vivo effects of the PPARG Pro12Ala genotypes on adipose tissue free fatty acid metabolism [57]. A second, more recent intervention study focused on administering 0, 10, or 20 mg of yohimbine in people selected for genotypes at the α(2A)-adrenergic receptor locus (ADRA2A) [58]. The main outcome was the early insulin response (insulin concentration at 30 min) following a 75-g oral glucose load. The study was one of the first GBR trials to be reported and showed that treatment response is conditional on ADRA2A genotype.

Fig. 3

Figure describing the genotype-based recall trial paradigm. Participants are selected from low- and high-burden genetic risk groups within a large sampling frame and are subsequently randomized to treatment (e.g., intensive lifestyle modification) or control arms of a clinical trial. Treatment allocation and genotype ideally remain masked until the trial has ended (although this is often difficult or impossible with lifestyle interventions), at which time gene-treatment interactions can be quantified

Barriers and Limitations

Epidemiology has yielded most of the evidence garnered during the past 20 years on gene-environment interactions in T2D and related traits, much of it from small cross-sectional studies. However, several large prospective cohort studies exist with good measures of environmental and genetic exposures, repeated measures of quantitative outcomes, and long-term follow-up for incident disease [59–61], rendering them excellent resources for generating hypotheses about gene-environment interactions. Nevertheless, epidemiological studies are prone to various forms of chance, bias, and confounding as well as reverse causality, which make the determination of causal effects especially challenging [62]. Owing to the salient nature of germline DNA variants, genetic association studies are robust to reverse causality, but there are other sources of bias and confounding, such as population stratification, synthetic association, and survival bias that may provide alternative explanations for an apparent effect of a genotype on a disease trait [63]. Epidemiological studies of gene-environment interactions are prone to the limitations of both genetic and non-genetic epidemiology, as well as other limitations that are idiosyncratic to this type of research. Scale dependency is one such limitation, which occurs when data conversions drive the presence or absence of statistically significant interactions (see Fig. 1).

The term error relates to the imprecision of an estimate and the term bias describes the extent to which error is disproportionate between two or more groups under investigation. The large size of many epidemiological studies necessitates that environmental exposures are usually assessed with fairly imprecise methods such as questionnaires and outcomes with proxy variables. This can cause underestimation of the true magnitude of the marginal and interaction effects and diminishes power to detect interactions [64]. Under a set of reasonable assumptions about interaction studies, Wong et al. [65] described sample size requirements to detect interactions with low type 1 and type 2 error rates; when the exposure and outcome are good proxies for the true (latent) exposure (ρ_Tx = 0.8) and outcome (ρ_Ty = 0.8), ∼2410 participants are required to detect a reasonably sized interaction, but when exposure and outcome are poorly assessed (ρ_Tx = 0.4; ρ_Ty = 0.4), the required sample size rises sharply (N ∼84,787).

Recognizing that many existing interaction studies may have been underpowered, studies of interaction are now often performed by combining results from large cohort collections using meta-analysis. Palla et al. illustrated why retrospective meta-analysis of published interaction studies may yield meaningless results [66•], owing largely to bias and confounding, and difficulty standardizing results. Thus, most gene-environment interaction studies involving multiple cohorts focus on prospective meta-analyses, where each participating cohort performs new analyses according to a standardized analysis plan, and their summary results are subsequently pooled.

Meta-analyses of data from multiple cohorts have obvious appeal, as sample sizes that far exceed those of most individual studies of gene-environment interaction can be collated. A caveat to the approach though is that the assessments of exposures and outcomes in these cohorts often differ on multiple levels (e.g., type and validity of measures, data structure, reference time-frame, data processing approaches). Methodological differences demand that environmental exposure variables are standardized before analysis. This typically involves collapsing exposure data to a parsimonious level, which can substantially reduce statistical power. We recently conducted a large meta-analysis examining the interaction between an FTO variant and physical activity in obesity [7]; the study involved meta-analyzing summary statistics from 45 adult and nine pediatric cohorts. Although some cohorts had very detailed physical activity data (e.g., objective continuous assessments of physical activity), others had very crude (binary) subjective physical activity data; thus, all cohorts were asked to reduce their physical activity exposure variables to a simple binary variable where approximately 80 % of participants were defined as physically active and the remaining 20 % were defined as inactive.

This approach, whilst pragmatic, diminishes statistical power in at least two key ways: first, stratification of continuous data often results in loss of power [67]; second, where interaction effects are approximately linear, asymmetrical stratification of exposure data also diminishes power. We provide several relevant examples elsewhere [6]; for example, a study with ∼15,000 participants and the environmental exposure variable stratified at the median of its distribution would be adequately powered (80 %) to detect the interaction effect. But if the exposure variable is stratified at the 80th centile of its distribution, with all else equal, a sample size of ∼24,000 would be equivalently powered to detect the same interaction effect. Power is lost primarily owing to increased variance in the exposure variable. Pooling multiple heterogeneous cohorts causes an increase in the dependent variable’s variance, which also leads to a substantial loss of statistical power. Hence, meta-analyses of gene-environment interactions composed of data from multiple diverse cohorts may not be as powerful a strategy for replication as many hope, and focusing on a handful of large, well-characterized and comparable cohorts for replication is likely to be a considerably more efficient strategy. The recent availability of large cohorts with data that are suitable for gene-environment interaction modeling, such as UK Biobank [68], seems set to change the research community’s dependency on meta-analysis to conduct large genetic studies, as many of the caveats to the latter (described in detail in [6]) are likely offset in this single large study.
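A back-of-envelope check gives a feel for the cost of asymmetrical stratification. It rests on a simplifying assumption of ours, not on the calculation reported in [6]: that for a binary exposure with a fraction p of participants in one stratum, the sampling variance of the interaction estimate is inversely proportional to N × p × (1 − p), so the required sample size scales with 1 / (p × (1 − p)). The function name `relative_sample_size` is hypothetical.

```python
def relative_sample_size(p_reference, p_alternative):
    """Factor by which N must grow when the stratum fraction changes from
    p_reference to p_alternative, assuming required N is proportional
    to 1 / (p * (1 - p))."""
    reference_information = p_reference * (1 - p_reference)
    alternative_information = p_alternative * (1 - p_alternative)
    return reference_information / alternative_information


n_median_split = 15000                      # sample size quoted for a median split
inflation = relative_sample_size(0.5, 0.8)  # median split versus 80/20 split
n_80th_centile_split = n_median_split * inflation
print(inflation, round(n_80th_centile_split))
```

Under this approximation an 80/20 split requires roughly 1.56 times as many participants as a median split, or about 23,400 when starting from 15,000, which is in the same region as the ∼24,000 quoted above.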

Interaction effects are also prone to a specific form of confounding that can occur when the outcome variable is a proxy for the phenotype of interest, as is often true in epidemiological studies. Consider the example of gene-lifestyle interactions in obesity, where anthropometric measures such as height and weight are used to derive BMI, a proxy for total adiposity. Because BMI is not a perfect correlate of adipose mass, there are relatively lean people within any population who are muscular and heavy, with a high BMI [69]. Those persons may plausibly exercise and avoid other unhealthful lifestyle behaviors (e.g., consuming fatty foods or sugar-sweetened beverages) more than those with a high BMI and a high fat percentage; thus, the magnitude of the effect of a genetic risk score on BMI will likely be stronger in inactive than in active people (causing a statistical interaction) purely because the outcome measure in the inactive group is more valid. This problem emphasizes the need to validate epidemiological observations of interaction in other studies that have the ability to elucidate the target phenotypes.

Conclusions/Perspective

The major recent breakthroughs in complex trait genetics have boosted confidence that similar successes might be achievable in the field of gene-environment interaction research. The derivation of massive amounts of genetic and phenotypic data, along with an understanding that those data should be used and reused, has encouraged investigators to dig deep into their databases to explore whether genetic association signals are modulated by non-genetic factors. Thus, the once esoteric topic of gene-environment interaction is now becoming mainstream and appealing to investigators across diverse disciplines; this has propelled major methodological innovations for the discovery, replication, validation and translation of gene-environment interactions. The exponential growth of data resources for these purposes has demanded analytical solutions that address data dimensionality reduction. Although not yet extensively implemented, systems-medicine approaches for interaction modeling in complex human disease, which might build on the eQTL-based methods developed in yeast [70, 71] and human dendritic cells [72], and other system-based approaches [73], are growing in popularity and will accelerate gene-environment interaction research as large systems genetics-focused studies come online [74] (Fig. 4).

Fig. 4. We can now quantify the key molecular events that link germline DNA variation with disease processes in a range of settings, from cell lines to human populations, and major advances have been made in coupling these complex datasets with information about extrinsic environmental exposures including drug prescription in ways that allow the logical interrogation of gene-drug and gene-lifestyle interactions. Doing so may teach us about disease etiology and help stratify type 2 diabetes (T2D) into subclasses that can be treated more effectively, with fewer side effects and at lesser cost than before.

The paucity of replicated gene-environment interaction effects may reflect an abundance of false-positive findings in the published literature [3], although other explanations for why true-positive interaction effects fail to replicate should not be dismissed [6]. Most if not all complex traits probably result from the accumulation of many small-magnitude gene-environment interactions, gene-gene interactions, and marginal effects. If so, most existing interaction studies will likely be underpowered to detect real effects owing to their small sample sizes. Accordingly, interaction meta-analyses are increasingly performed on data from multiple cohorts. Although successful for genetic association studies, meta-analysis may not work well in the context of gene-environment interaction owing to the diversity of measurements and data across cohorts, which degrades statistical power [6]. Thus, it seems logical to focus gene-environment interaction analyses on cohorts that are either very large and that include well-validated and standardized assessment methods, or those that are smaller in size but which include accurate and precise measures of exposures and outcomes [65]. By exception, variance prioritization meta-analyses are likely to be less prone to loss of power, because the environmental exposure is inferred by comparing phenotypic variances by genotype rather than through direct assessment. Although most published gene-environment interaction studies focus on cross-sectional data, longitudinal interaction studies are also needed, especially those that include repeated measures of exposures and outcomes, as this will facilitate temporal inference and help preserve statistical power [75].

Emphasis is frequently placed on translation when gene-environment interaction data are discussed. The logic is appealing, as identifying genetic markers that define patients who are at substantially greater or lesser risk of disease than the general population given exposure to modifiable risk factors, or who will respond much better or worse to treatment, could help optimize medical interventions. However, there are as yet no translatable examples of gene-environment interactions that are sufficiently convincing to guide medical interventions for T2D. Nevertheless, numerous examples from Mendelian disorders and pharmacogenetics fuel hope that genetic data may eventually help tailor prevention or treatment strategies for complex diseases focused on lifestyle modification. Specially designed intervention studies, such as genotype-based recall trials (Fig. 3), will also facilitate clinical translation of data on gene-environment interactions.