Introduction

Since the early 2000s, research into candidate gene–environment interactions involved in psychiatric disorders has been both exciting (Caspi et al. 2002; 2003) and contentious (Chabris et al. 2012; Duncan and Keller 2011; Risch et al. 2009). A landmark paper published in Science and the subsequent replication attempts illustrate this well. Caspi and colleagues (2003) showed that a specific genetic variant (the 5-HTTLPR serotonin transporter) was associated with an increased likelihood of depression, but only for those who had experienced rather severe stressful life events (e.g., childhood maltreatment). Such findings resonated with diathesis stress theory (e.g., Monroe and Simons 1991) as well as general theories of individual-environment interplay (Bronfenbrenner and Ceci 1994; Sameroff and Mackenzie 2003) and generated excitement across a variety of social science and biological fields. However, subsequent meta-analyses have contributed to a continuing debate on the validity of the candidate gene–environment effect. The meta-analysis by Risch and colleagues (2009) did not replicate this association, and similar null or negligible effects were reported by an independent group (Munafo et al. 2009a, b). However, another subsequent meta-analysis of 54 studies published in 2011 did find evidence for the gene–environment interaction (Karg et al. 2011). This body of work brings to light the need for replication of novel findings.

The validity of the original Caspi et al. (2003) finding and the positive meta-analytic finding is still hotly debated. Critical attention has been paid to the fact that the inclusion of only published articles (or in press or in review articles) in meta-analytic reviews increases vulnerability to publication bias (i.e., the difficulty of publishing null results; Duncan and Keller 2011; Duncan et al. 2014). To thoroughly test this, Duncan and Keller (2011) conducted an extensive review of all published candidate gene–environment interaction studies (N = 103) relevant to psychiatric traits (covering 2000–2009). They found that while nearly all novel candidate gene–environment interaction studies were significant, only about a quarter of replication attempts reached significance. Furthermore, power calculations showed that the sample sizes necessary to reach statistical significance for the kinds of small effect sizes expected reach well into the thousands, and many of the previously published studies reporting significant effects appeared to be vastly underpowered. Due to this, many of the published candidate gene results were hypothesized to be false positives (also see Dick et al. 2015; Duncan et al. 2014).

In response to these issues, replication has become the gold standard in any evaluation of candidate gene analysis and many respected journals have specific editorial policies concerning replication involving candidate gene work (c.f. Hewitt 2012; Johnston et al. 2013; Lesch 2014; Munafo and Gage 2013). Further, researchers have noted the need for replication studies that include not only the same genetic markers, but also similar measures of environmental risk and phenotypes of interest (Duncan and Keller 2011; Rutter 2012). In response to such calls, the present paper reports results from an on-going collaboration between five longitudinal studies investigating the etiology of substance use disorders (SUDs) and related psychopathology through models of complex gene– and person–environment interplay. The five longitudinal studies involved include two studies from the Minnesota Center for Twin and Family Research (MCTFR), specifically the Minnesota Twin and Family Study (MTFS; Iacono et al. 1999) and the Sibling Interaction and Behavior Study (SIBS; McGue et al. 2007), the Minnesota Drug Abuse and Attention-Deficit Hyperactivity Disorder study (MN ADHD; Winters 2015), and two studies from the Social Development Research Group, including the Seattle Social Development Project (SSDP; Hawkins et al. 2003; Hill et al. 2010), and the Raising Healthy Children study (RHC; Brown et al. 2005; Catalano et al. 2003; Haggerty et al. 2006). All five of the studies included in the collaboration are longitudinal studies of child/adolescent through young adulthood development, focusing specifically on the etiology or the general risk and protective factors associated with substance use problems and related psychopathology. The studies were selected because they include highly similar phenotype and environment measures, increasing their utility in terms of replication of findings.

Conceptual model guiding our on-going collaboration

Our groups began to work together in response to a call for collaborative research on longitudinal gene–environment interplay issued by the National Institute on Drug Abuse (The Genes, Environment, and Development Initiative, RFA-DA-07-012). Early in our work, a conceptual model was proposed by Bailey et al. (2011), to evaluate models of candidate gene–environment interaction in a general/specific framework. Specifically, Bailey and colleagues evaluated a phenotypic model which attended both to general externalizing problems in early adulthood as the shared variance between nicotine dependence, alcohol, and illicit drug use, antisocial behavior, sexual risk, and crime, and to the specific (unique) variance of each of these outcomes. In addition, the approach modeled general and specific environmental factors predictive of these outcomes. For example, the general family environment latent variable was measured using reports of general parent–child relationship quality and parenting practices (e.g., conflict, management), family tobacco-specific environments were measured by parent and sibling smoking during the child’s adolescence, and family alcohol-specific environment was measured by parent and sibling drinking during the child’s adolescence. Bailey et al. showed clear evidence linking the general family environment with the estimated latent variable of general externalizing problems (i.e., the poorer the family environment in adolescence, the greater the likelihood for externalizing problems in early adulthood), as well as evidence linking tobacco-specific and alcohol-specific family environments to unique variance in nicotine dependence and alcohol use disorder (i.e., the more parents and siblings smoked or drank during the target’s adolescence, the greater the likelihood for nicotine dependence and alcohol use disorder for the target in early adulthood, after accounting for any variance shared between adult nicotine dependence and alcohol use disorder with externalizing problems in general).

In the 2011 publication, Bailey and colleagues proposed that the phenotypic model could be further adapted to include candidate genes or polygenic risk scores to evaluate for gene–environment interplay. Candidate genes related to dopamine regulation, for example, could be evaluated as predictors of the latent factor of adult externalizing and substance use problems (reflecting genetic main effects), as correlates of the adolescent family environments (demonstrating gene–environment correlation), and/or as moderators of the association between general adolescent family environment and the latent factor of adult externalizing and substance use problems (demonstrating gene–environment interaction).

We have now replicated this basic phenotypic model in several data sets (Bailey et al. 2014; Samek et al. 2014) and at different developmental stages (Epstein et al. 2013). We have shown consistent evidence linking the general adolescent family environment (e.g., low parent–child conflict, high management) to the general externalizing latent factor in later adulthood—with a moderate effect size (i.e., the poorer the general family environment in adolescence, the greater the likelihood for externalizing problems in adulthood). There has been less consistent evidence linking the unique adolescent family smoking and drinking environments to unique variance in adult nicotine dependence and alcohol use disorder (Bailey et al. 2014; Samek et al. 2014), thus one effort is now focused on evaluating more complex patterns of gene–environment interplay relevant to the general pathway rather than substance-specific pathways. In addition, we are developing potential measures of general and drug-specific genetic influence for inclusion in this model.

Candidate gene review

The six candidate genes chosen for this work (MAOA, 5-HTTLPR, COMT, DRD2, DAT1, DRD4) are likely the most extensively studied candidate genes in relation to SUD and externalizing problems broadly that were available for analysis across our five studies. Table 1 lists the six candidate genes, gives an overview of their functions and coding, and cites meta-analyses supporting associations between the candidate genes and externalizing and substance use outcomes. We briefly review each candidate gene and its supporting meta-analytic literature, although caution is warranted in interpreting even these meta-analytic results given the concerns we have outlined above (see Dick et al. 2015; Duncan and Keller 2011; Duncan et al. 2014).

Table 1 Overview of included candidate genes

MAOA and 5-HTTLPR are probably the most well-known and widely studied candidate genes, likely due in part to earlier work by Caspi and colleagues (2002, 2003). As shown in Table 1, MAOA is involved in the deamination of several neurotransmitters relevant to substance use and externalizing problems, including serotonin, dopamine, norepinephrine, and epinephrine. The MAOA gene is located on the X-chromosome (Xp11.23–11.4), whose variants arise from a sequence of DNA that is repeated in tandem; the number of repetitions or “repeats” varies (2–5 repeats) among individuals. This type of polymorphism is called a variable number of tandem repeats (VNTR). For MAOA, alleles containing 2 or 3 repeats are considered the “risk alleles”, i.e., the variant that is more likely to be associated with externalizing problems relative to the “long allele” or the variant with the longer number of repeats. Similarly, 5-HTTLPR is another VNTR (or 43 base-pair insertion/deletion), where the allele with the short number of repeats is considered to be the risk allele.

Several meta-analyses have evaluated the association between MAOA and antisocial behavior, including MAOA as a moderator of the link between childhood maltreatment and adult antisocial behavior (see Byrd and Manuck 2014; Ficks and Waldman 2014; Kim-Cohen et al. 2006; Reif et al. 2014). Several meta-analyses have also looked at 5-HTTLPR and its relationship to antisocial behavior and other psychiatric outcomes (e.g., Ficks and Waldman 2014; Karg et al. 2011; Miller et al. 2013, and as discussed above). For example, Ficks and Waldman (2014) conducted a meta-analysis on studies evaluating the association between both MAOA and 5-HTTLPR with antisocial behavior and found evidence of small but significant effects (31 total studies included for MAOA, pooled OR = 1.08; 18 total studies included for 5-HTTLPR, pooled OR = 1.41). Moreover, in their meta-analysis of 27 peer-reviewed papers, Byrd and Manuck (2014) showed evidence of a significant MAOA × childhood maltreatment effect on antisocial behavior for males, but not females, consistent with an earlier meta-analysis (Kim-Cohen et al. 2006). Byrd and Manuck also demonstrated a small main effect of MAOA on antisocial behavior for females.

There has been comparatively less research on COMT and the dopamine candidate genes (DRD4, DAT1, DRD2) in relation to SUDs or antisocial behavior. Like MAOA and 5-HTTLPR, DAT1 and DRD4 are VNTRs (the shorter number of repeats is the “risk” allele for DAT1, and the longer number of repeats is the “risk” allele for DRD4; see Table 1 for details). COMT (rs4680) and DRD2 (rs1800497) are single nucleotide polymorphisms (SNPs; see Table 1 for details). It is known that rs1800497 is located within the eleventh ankyrin repeat of the ANKK1 gene on the opposite DNA strand (Neville et al. 2004). We refer to it as DRD2 throughout the text for consistency with previous literature on its association with SUDs. The majority of meta-analyses on COMT involve associations with other psychiatric outcomes such as schizophrenia (Munafo et al. 2005) and Alzheimer’s disease (Lee and Song 2014). There is some research, however, showing an interaction between the risk allele (Val) and early cannabis use predicting subsequent psychosis (Caspi et al. 2005; Estrada et al. 2011). It is important to note that results from Genome-Wide Association Studies, which correct for multiple testing of all available SNPs in the genome, have shown no evidence that either of the two SNPs we examine here (COMT: rs4680 and DRD2: rs1800497) is significantly associated with several addiction or externalizing problems, such as nicotine and alcohol dependence (e.g., see Bierut et al. 2010; Dick et al. 2011; Gelernter et al. 2014, 2015; Loukola et al. 2015; McGue et al. 2013; Vink et al. 2009, among others).

In terms of the effects of the other candidate genes related to dopamine regulation, Stapleton and colleagues (2007) showed evidence of a small but significant effect of DAT1 on smoking cessation (ORs = 1.15–1.20). With respect to DRD2, Ohmoto and colleagues (2013) subsequently showed the effect of this gene on smoking initiation, persistence, and cessation, and that this may be specific to Caucasian males. Additional meta-analyses seem to support this conclusion (Munafo et al. 2009a, b; Nikolaidis and Gray 2010). On the other hand, a review of 42 studies evaluated the link between DRD4 in relation to alcohol dependence and found no consistent evidence of effects (Forero et al. 2015), and similar null effects for alcohol dependence were reported for MAOA and several other candidate genes (Forero et al. 2015).

While publication bias may be at play in the meta-analyses and supporting literature for the six included candidate genes (as reviewed above, see Duncan and Keller 2011; Duncan et al. 2014), Table 1 nonetheless illustrates that a sizeable body of research supports the potential association between each candidate gene and the outcomes of interest here. Additionally, this review suggests there may be unique ethnicity and sex effects to many of these candidate genes (e.g., Ohmoto et al. 2013, Byrd and Manuck 2014) which are often not accounted for in analyses of gene–environment interplay (Keller 2014).

Goals of the current study

In the current work, we present main effect results for the associations between these six widely studied candidate genes (MAOA, 5-HTTLPR, COMT, DRD4, DAT1, DRD2) and five early adult SUD and externalizing outcomes (nicotine dependence, alcohol use disorder, cannabis use disorder, adult antisocial behavior, and an aggregate externalizing measure using all four externalizing disorder symptom counts) across the test sample (MCTFR) and the replicate samples (MN ADHD, SSDP, RHC). All studies include highly similar phenotype and environment measures as well as ages of assessment, which are key ingredients to testing our models. Following published guidelines (Dick et al. 2015), we provide power analyses for the test sample (MCTFR) and each of the replication samples (MN ADHD, SSDP, RHC), rule out potential confounding effects of sex and ancestry, and present meta-analyzed main effects results from the independent analyses across the five studies to give population estimates of the effects. We conclude with our plan for future analyses of candidate gene–environment interplay given the findings of the main effects analyses.

Method

Participants

MCTFR (test sample)

Two studies from MCTFR, the MTFS and the SIBS, were combined to make up the test sample. The general purpose of both MTFS and SIBS studies was to evaluate the genetic and environmental influences on SUDs and related psychopathology in the transition from adolescence through young adulthood. These studies have similarly aged participants with data available in young adulthood, comparable assessment batteries, and nearly identical laboratory protocols (PIs William Iacono and Matt McGue). Thus, pooling these samples for data analysis is appropriate. As each of these studies and their collaborative efforts have been described elsewhere (see Iacono et al. 1999; McGue et al. 2007), they will be only briefly reviewed here. MTFS is a community twin sample, which initially assessed Minnesota-born twins at age 11 or 17 (two separate cohorts) with follow-up assessments every 3–5 years through age 29 (assessed from 1990 to 2013). Over 90 % of identified twin families were located and over 80 % of the located, eligible families participated at baseline. Retention rates were high across follow-up assessments (range 88–93 %), and there has been limited impact of attrition on SUDs (see Iacono et al. 1999; Vrieze et al. 2014 for more detail). Consistent with state demographics from relevant birth years, the majority of the participants were of European ancestry (96 %). The total MTFS sample size is 2769 participants from 1382 twin pairs (65 % monozygotic twins, including 5 triplets, 52 % female).

SIBS includes three types of families where participating offspring were biologically related or related by adoption. Participating offspring were within 5 years of age; all adoptees were placed prior to their second birthday (M age of placement is 4.7 months, SD = 3.4; 96 % placed before their first birthday). Families were initially assessed when children were in early adolescence (M age = 14.7, SD = 1.9), with follow-up assessments every 3–5 years through early adulthood (at first follow-up: M age = 18.3, SD = 2.1; at second follow-up: M age = 22.3, SD = 1.9; assessed from 1998 to 2004). There was limited attrition over time (range 90–94 % retention). The total eligible SIBS sample size is 1226 participants from 613 eligible families (56 % adoptive, 55 % female; 53 % European ancestry, 39 % Asian ancestry, 1 % African ancestry, 7 % mixed or other ancestry; see McGue et al. 2007; Samek et al. 2014 for more detail).

For the present analyses, we excluded respondents who did not have data past age 17 from both MTFS and SIBS (in order to better assess a lifetime measure of SUD and externalizing problems in adulthood, central to the purpose of this paper and Bailey et al.’s 2011 conceptual model). The remaining 3487 participants (2514 MTFS, 973 SIBS; 87 % of total sample) comprised the test sample (54 % female; 84 % European, 11 % Asian ancestry, <1 % African ancestry, <4 % mixed or other ancestry).

MN ADHD study (replicate sample)

The purposes of the MN ADHD and Drug Abuse Study (PI Ken C. Winters; Co-Investigators George Realmuto and Gerald August) were twofold: (1) to examine the association of candidate genes and substance use disorders and (2) to evaluate long term functioning, including SUD and related drug use behaviors, as a function of childhood externalizing diagnostic status (see August et al. 1995 for details). Three types of participants were included, those in a treatment group (n = 443), those in a high risk group (n = 156), and those in a control group (n = 77). The majority of the treatment group was initially recruited from various drug treatment programs in the Twin Cities metro area; 12 % were recruited as an enrichment sample by virtue of having received drug treatment in the past. All participants in the treatment sample met criteria for at least one DSM-IV SUD at the time of their drug treatment.

Participants in the high risk sample were recruited from a larger, community-based sample of over 7000 students from 22 schools around Minneapolis and St. Paul (grades 2–3) (see August et al. 1995 for details). To be included in the high risk group, participants had to score 1.75 SD or above on a teacher rating of disruptive behavior, score 1.75 SD above the normative mean on a parent rating of disruptive behavior, and have a DSM-III-R diagnosis of attention-deficit/hyperactivity disorder, conduct disorder, or oppositional defiant disorder. Controls were demographically matched and drawn from the same schools as those for the high-risk group. They scored less than 1 SD above the mean on the teacher rating of disruptive behavior, had no history of psychotropic medication use, and had no prior school or clinic evaluation for academic or behavior problems. The majority of participants had 4–5 follow-up assessments from childhood to adolescence, roughly spaced every 2–3 years. There was limited attrition over time (participation range 72–80 %).

The total sample size used for this study included the 621 participants with genotype data (92 % of total sample; 40 % female; 87 % European ancestry, 6 % African ancestry, 7 % other or mixed ancestry).

SSDP (replicate sample)

The Seattle Social Development Project has been described previously (Hawkins et al. 1999, 2005). Briefly, the study sought to test a preventive intervention delivered in the elementary grades and to understand risk and protective factors for antisocial behavior from childhood into adulthood. The study began in 1985, and included 808 participants (77 % of those eligible) in 18 participating Seattle area schools. In addition to intervention analyses, the study has permitted a broad range of etiological analyses, including the current examination of gene–environment interplay (PI, Karl G. Hill). The present study makes use of data collected annually from participants at ages 10–16, and at ages 18, 21, 24, 27, 30, 33 and 35. Sample retention was high, with over 90 % of living respondents retained at each wave from ages 14–33, thus minimizing risk for bias due to attrition. About 49 % of respondents were female, and the sample was ethnically/racially diverse: 47 % were of European ancestry, 26 % of African ancestry, 22 % of Asian ancestry, and 5 % of Native American ancestry; of these, 5 % were Hispanic or Latino/a.

The total sample size used for this study included the 577 participants with genotype data (74 % of total sample; 54 % female; 46 % European ancestry, 19 % African ancestry, 35 % other or mixed ancestry).

RHC (replicate sample)

Raising Healthy Children began in 1992, and was intended as a replication and extension of SSDP (see Brown et al. 2005; Catalano et al. 2003; Haggerty et al. 2006 for additional study details). The sample included 1040 individuals (76 % of those eligible) in first (younger cohort) or second grade (older cohort) in 10 Seattle-area suburban elementary schools. Participants were surveyed annually in the spring from ages 5–6 to 24–25, with additional fall surveys added in the 2 years following high school. Retention has been 85 % or higher at each wave, suggesting minimal risk for bias due to attrition. About 47 % of participants were female. The racial/ethnic composition of the sample was 75 % European ancestry, 6 % Asian ancestry, 3 % African ancestry, 3 % Native American ancestry, 13 % Mixed ancestry; of these, 8 % were Hispanic or Latino/a.

It should be noted that DNA samples were not solicited from all participants; budgetary constraints allowed collection from a maximum of 670 participants. The total sample size used for this study included the 601 participants with genotype data (58 % of total sample, 90 % of targeted sample; 41 % female; 74 % European ancestry, 3 % African ancestry, 23 % other or mixed ancestry).

Phenotype measurement

MCTFR (test-sample)

Because DSM-III-R was the diagnostic system in place when the study began, phenotypes included a lifetime DSM-III-R symptom count for nicotine dependence, alcohol use disorder, cannabis use disorder, and adult antisocial behavior (alcohol and cannabis use disorders were each assessed as the maximum of abuse and dependence symptoms; the lifetime symptom count was calculated by taking the maximum symptom count across assessments). The Substance Abuse Module of the Composite International Diagnostic Interview (Robins et al. 1988) was used to assess all SUD symptoms. The Structured Clinical Interview for DSM-III-R Axis II (Spitzer et al. 1987) was adapted to assess adult antisocial behavior symptoms (the adult criteria for adult antisocial personality disorder). All symptoms were reviewed by at least two individuals with advanced training and consensus by both individuals was necessary to assign symptoms. Kappa coefficients were >.90 for all SUDs and .79 for adult antisocial behavior. The aggregate externalizing phenotype was computed by standardizing and averaging the nicotine, alcohol, cannabis, and adult antisocial behavior lifetime symptom counts. The average age at which the young adult SUD and adult antisocial behavior phenotypes were assessed was 27.20 years (SD = 3.5).
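The scoring rules described above can be illustrated with a minimal Python sketch. It is not the study's scoring code: the function names and data are hypothetical, complete data are assumed for the aggregate, and the use of the population standard deviation for standardizing is an assumption of the sketch.

```python
from statistics import mean, pstdev

def use_disorder_count(abuse, dependence):
    """Per-assessment count: the larger of the abuse and dependence symptom counts."""
    return max(abuse, dependence)

def lifetime_count(counts_by_assessment):
    """Lifetime count: the maximum across assessments, skipping missing (None) waves."""
    observed = [c for c in counts_by_assessment if c is not None]
    return max(observed) if observed else None

def zscores(values):
    """Standardize a list of counts (population SD is an assumption of this sketch)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def aggregate_externalizing(lifetime_counts):
    """Average the standardized lifetime counts within each participant.

    lifetime_counts maps a phenotype name to a list holding one complete
    (non-missing) lifetime count per participant, in the same order.
    """
    standardized = [zscores(column) for column in lifetime_counts.values()]
    return [mean(person) for person in zip(*standardized)]

# Hypothetical data: one participant's alcohol symptoms over three waves,
# given as (abuse, dependence) counts, with the second wave missing.
alcohol_waves = [use_disorder_count(1, 3), None, use_disorder_count(2, 0)]
alcohol_lifetime = lifetime_count(alcohol_waves)

# Hypothetical lifetime counts for three participants.
scores = aggregate_externalizing({
    "nicotine": [0, 2, 4],
    "alcohol": [1, 3, 5],
    "cannabis": [0, 1, 2],
    "antisocial": [2, 2, 5],
})
```

Taking the maximum at both steps means that a participant's lifetime score reflects the most severe presentation ever recorded, regardless of the wave at which it occurred.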

MN ADHD study (replicate sample)

Phenotypes included DSM-IV nicotine dependence, alcohol use disorder, and cannabis use disorder symptom counts (adult antisocial behavior as a component of adult antisocial personality disorder was not assessed; alcohol use disorder and cannabis use disorder were computed by taking the maximum across abuse and dependence symptoms). SUD symptoms were assessed using a DSM-IV Structured Diagnostic Interview or the Adolescent Diagnostic Interview (ADI; Winters and Henly 1993). The ADI is associated with favorable inter-rater reliability and convergent validity (ADI scores with independent ratings; Winters et al. 1999). Kappa coefficients for alcohol and cannabis use disorders ranged from .71 to .82 (see Winters et al. 1993). In lieu of measurement of adult antisocial behavior, the Deviant Behavior scale was used as a measure of general externalizing problems/deviant behavior (Winters 1999). This scale includes 10 items, such as “I have been in a fight where I used a weapon,” α = .91. The lifetime assessment of each symptom count was computed by taking the maximum of the symptom counts across assessments. The aggregate externalizing phenotype was computed by standardizing and averaging the nicotine, alcohol, and cannabis symptom counts and the externalizing measure.

SSDP (replicate sample)

Phenotypes included symptom counts for DSM-IV (APA 1994) alcohol and marijuana abuse/dependence and nicotine dependence as well as a crime variety count to capture antisocial behavior. Abuse and dependence symptom counts for alcohol and marijuana and dependence symptom counts for nicotine were obtained using the Diagnostic Interview Schedule (DIS; Robins et al. 1981). Only two of a possible 7 nicotine dependence symptoms were assessed at age 21 and 5 were assessed at age 24. At ages 21 and 24, marijuana use was grouped with other illicit drug use in the abuse/dependence assessments. For this paper, marijuana abuse/dependence symptom counts at ages 21 and 24 were created for those individuals who reported marijuana use but no other illicit drug use. For individuals reporting both marijuana and other illicit drug use, marijuana abuse/dependence symptom counts were treated as missing. At each wave, respondents reported whether they had engaged in a series of violent, nonviolent, and property crime behaviors in the past year. The number of different past year crimes was tallied at each age. The highest lifetime score for each of these constructs (e.g., highest alcohol abuse/dependence symptom count across ages) was used in the present analysis. An aggregate externalizing behavior measure was created by using factor analysis of the maximum lifetime abuse/dependence and crime scores to create a factor score.
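The handling of the combined marijuana/other-drug assessment and the crime variety count can be sketched as follows. This is an illustrative sketch only: all names and data are hypothetical, and because the text does not say how respondents reporting no marijuana use were scored, the sketch simply returns a missing value (None) for them as a placeholder.

```python
def marijuana_symptom_count(combined_count, used_marijuana, used_other_illicit):
    """Marijuana-specific count from a combined marijuana/illicit-drug assessment.

    Kept only for respondents who reported marijuana use and no other illicit
    drug use; treated as missing (None) when other illicit drugs were also used.
    Returning None for non-users of marijuana is an assumption of this sketch.
    """
    if used_marijuana and not used_other_illicit:
        return combined_count
    return None

def crime_variety(past_year_reports):
    """Number of different crime types endorsed for the past year."""
    return sum(1 for committed in past_year_reports.values() if committed)

def lifetime_max(scores_by_wave):
    """Highest score across waves, ignoring missing (None) waves."""
    observed = [s for s in scores_by_wave if s is not None]
    return max(observed) if observed else None

# Hypothetical respondent: marijuana only at age 21, marijuana plus other drugs at 24.
age_21 = marijuana_symptom_count(4, used_marijuana=True, used_other_illicit=False)
age_24 = marijuana_symptom_count(6, used_marijuana=True, used_other_illicit=True)
marijuana_lifetime = lifetime_max([age_21, age_24])

crimes_this_year = crime_variety({"assault": True, "theft": False, "vandalism": True})
```

One consequence of this rule is that the lifetime marijuana score for respondents who later used other illicit drugs rests only on the waves in which marijuana was the sole illicit drug reported.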

RHC (replicate sample)

Phenotypes included DSM-IV (APA 1994) symptom counts for alcohol and marijuana abuse/dependence and nicotine dependence as well as a crime variety measure. Alcohol and marijuana abuse/dependence symptom counts were assessed using the Composite International Diagnostic Interview (WHO 1990) in 2008 when the younger cohort was 21 and the older cohort was 22 and in 2011 at ages 24/25. At the age 21/22 survey, marijuana and other illicit drugs were combined in the abuse/dependence assessment. Age 21/22 marijuana symptom counts were created for those individuals who reported using only marijuana but no other illicit drugs; scores were left missing for those who used other drugs in addition to marijuana. The maximum number of alcohol and of marijuana abuse/dependence symptoms across ages 21/22 and 24/25 was used in analyses. Nicotine dependence symptoms were assessed once using the DIS in 2011 at age 24/25. A crime variety score similar to that used in SSDP was obtained each year at ages 19–24/25, and the maximum number of crimes reported in any of these years was used in analyses. As in the SSDP sample, the aggregate externalizing measure was a latent factor score obtained using confirmatory factor analysis of the maximum lifetime abuse/dependence and crime scores.

Candidate gene measurement and coding

Candidate gene polymorphisms were assessed following the protocols described in Haberstick et al. (2014) and Haberstick and Smolen (2004). Candidate genes were coded as an additive function of risk alleles (0–1–2) based on previous research (Table 1). For MAOA, which is X-linked, males have 1 copy whereas females have 2. Coding of MAOA for females has been treated in multiple ways in prior research: excluding females altogether, excluding females that are heterozygous, or conducting separate analyses for females in comparison to males using separate coding systems. To combine analyses across males and females and retain maximal power, we excluded heterozygous females (n = 470). To confirm that this did not greatly impact results, we separately tested the MAOA-phenotype associations in the female-only test sample (MCTFR, n = 1462) using the 0, 1, or 2 coding compared to the 0 versus 2 coding and found essentially identical results. Finally, we computed an aggregate genetic risk score by summing the risk alleles across the 6 candidate genes. Individuals missing more than 1 candidate gene score were coded as missing on this aggregate measure (n = 1537 included for MCTFR, n = 606 for MN ADHD Study, and n = 577 for SSDP; only COMT was collected for RHC, so RHC is not available for the aggregate genetic risk score analysis; see Footnote 1).
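The additive coding and the aggregate risk score can be sketched in Python as below. The sketch is illustrative rather than the authors' code: function names, allele labels, and genotypes are hypothetical, and because the text does not state how the sum was formed when exactly one gene was missing, the sketch sums the available genes (an assumption). MAOA coding for males is not shown because the text does not spell it out.

```python
def additive_code(allele_1, allele_2, risk_allele):
    """Additive coding: number of risk alleles (0, 1, or 2) in a genotype."""
    return int(allele_1 == risk_allele) + int(allele_2 == risk_allele)

def aggregate_risk_score(gene_codes, max_missing=1):
    """Sum of risk-allele counts across genes.

    Coded as missing (None) when more than `max_missing` genes are missing.
    Summing only the available genes when one gene is missing is an
    assumption of this sketch.
    """
    n_missing = sum(code is None for code in gene_codes.values())
    if n_missing > max_missing:
        return None
    return sum(code for code in gene_codes.values() if code is not None)

# Hypothetical 5-HTTLPR genotype, with the short ("S") allele as the risk allele.
httlpr = additive_code("S", "L", risk_allele="S")

# Hypothetical participant with one missing gene (still scored).
one_missing = aggregate_risk_score({
    "MAOA": 0, "5-HTTLPR": httlpr, "COMT": 2,
    "DRD2": 1, "DAT1": None, "DRD4": 0,
})

# Hypothetical participant with two missing genes (coded as missing).
two_missing = aggregate_risk_score({
    "MAOA": None, "5-HTTLPR": 1, "COMT": 2,
    "DRD2": 1, "DAT1": None, "DRD4": 0,
})
```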

Autosomal marker measurement

Autosomal marker data were collected using the Illumina Human660W-Quad Array (Illumina, Inc., San Diego, CA) in the MCTFR sample. Among all eligible MCTFR participants, 77 % provided a blood sample and 6 % a saliva sample (more details on MCTFR sample genotyping can be found in McGue et al. 2013 and Miller et al. 2012). For the MN ADHD study, the Illumina HumanOmni2.5-8 Bead Chip was used (Illumina, Inc., San Diego, CA) and 91 % of the sample provided DNA via buccal swabs (saliva). For SSDP and RHC, the Illumina Omni 2.5 chip (Illumina, Inc., San Diego, CA) was used. For SSDP, 100 % of the sample was ascertained via saliva sample. For RHC, 93 % of the 674 participants who provided DNA provided a blood sample, and 7 % provided a saliva sample (COMT was the only candidate gene available in the RHC sample). Across the test and replicate samples, and following the protocols described in McGue et al. (2013) and Miller et al. (2012), SNP markers were excluded under the following conditions: (1) a call rate <99 %, (2) minor allele frequency <1 %, (3) significant deviation from Hardy–Weinberg equilibrium, (4) more than two Mendelian inconsistencies across families or mismatch in duplicate samples, (5) an association with participant sex, or (6) previously identified as a bad marker on the array.
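The six exclusion conditions can be expressed as a single filter, sketched below in Python. The field names are hypothetical, and the significance thresholds for the Hardy–Weinberg and sex-association tests are placeholders: the text specifies only "significant" deviation or association, so the default values here are assumptions rather than the thresholds actually used.

```python
from dataclasses import dataclass, replace

@dataclass
class MarkerStats:
    """Per-marker summary statistics (all field names are hypothetical)."""
    call_rate: float          # proportion of samples with a genotype call
    maf: float                # minor allele frequency
    hwe_p: float              # p-value of the Hardy-Weinberg equilibrium test
    mendel_errors: int        # Mendelian inconsistencies across families
    duplicate_mismatch: bool  # genotype mismatch between duplicate samples
    sex_assoc_p: float        # p-value for association with participant sex
    known_bad: bool           # previously flagged as a bad marker on the array

def passes_qc(m, hwe_alpha=1e-7, sex_alpha=1e-7):
    """True if the marker survives all six exclusion rules.

    hwe_alpha and sex_alpha are placeholder thresholds (assumptions).
    """
    return (
        m.call_rate >= 0.99               # (1) call rate below 99 % is excluded
        and m.maf >= 0.01                 # (2) MAF below 1 % is excluded
        and m.hwe_p >= hwe_alpha          # (3) significant HWE deviation
        and m.mendel_errors <= 2          # (4) more than two Mendelian errors...
        and not m.duplicate_mismatch      #     ...or a duplicate-sample mismatch
        and m.sex_assoc_p >= sex_alpha    # (5) association with sex
        and not m.known_bad               # (6) known bad marker
    )

# Hypothetical marker that passes, and variants that each violate one rule.
good = MarkerStats(0.998, 0.12, 0.4, 0, False, 0.7, False)
low_call_rate = replace(good, call_rate=0.98)
rare = replace(good, maf=0.005)
mendel = replace(good, mendel_errors=3)
```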

Ancestry principal components

Ancestry principal components were calculated using all common autosomal markers across the test and replicate samples using PLINK (version 1.90b1a). We first selected the 81,320 autosomal markers that were shared by the 9147 subjects in the four data sets (MCTFR, MN ADHD, SSDP, RHC). We dropped one marker from every pair of markers where r² exceeded .30. After pruning, 61,936 markers remained. Using PLINK again, we computed realized genetic relationships on the remaining markers for all 41,829,231 pairs of subjects and identified 5738 subjects with pairwise values of less than .2 in the genetic relationship matrix for all possible pairs of subjects. We then used R (version 3.0.1) with the rARPACK library to compute the first ten eigenvectors and eigenvalues from the relationship data for those 5738 subjects. These eigenvectors and eigenvalues were then used to compute eigenvector values for the remaining participants by projecting from their genetic relationship values into the eigensystem determined by their relatives.
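Two steps of this procedure lend themselves to a short worked illustration in Python. The first confirms the pair count: 9147 subjects yield 9147 × 9146 / 2 = 41,829,231 unordered pairs. The second is a naive greedy version of the pruning rule, in which a marker is kept only if its r² with every marker already kept does not exceed the threshold. This is not PLINK's implementation, and the marker names and genotypes are hypothetical; the sketch assumes every marker is polymorphic, as the minor allele frequency filter would ensure.

```python
from math import comb

# Number of unordered subject pairs among the 9147 subjects.
n_pairs = comb(9147, 2)

def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 coding)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

def prune(markers, threshold=0.30):
    """Greedy pruning: keep a marker only if its r^2 with every kept marker
    is at or below the threshold (a naive sketch, not PLINK's algorithm)."""
    kept = []
    for name, genotypes in markers.items():
        if all(r_squared(genotypes, markers[k]) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical genotypes for six subjects at three markers.
markers = {
    "m1": [0, 1, 2, 1, 0, 2],
    "m2": [0, 1, 2, 1, 0, 2],  # identical to m1, so r^2 = 1 and it is dropped
    "m3": [1, 0, 1, 2, 2, 0],  # r^2 with m1 is 0.25, so it is kept
}
kept_markers = prune(markers)
```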

Analysis plan

We evaluated the effect of the independent variables (each of the 6 candidate genes) on each of the dependent variables (nicotine dependence, alcohol use disorder, cannabis use disorder, adult antisocial behavior, aggregate externalizing measure) after adjusting for age at most recent assessment and the first ten ancestry principal components as covariates. Our a priori alpha was set to .008 based on a multiple-testing correction for the number of independent candidate genes evaluated (alpha = .05 divided by six tests for the six independent candidate genes). All analyses were conducted in Mplus version 7.2 (Muthén and Muthén 1998–2012) using the maximum likelihood with robust standard errors (MLR) estimator to account for non-normality of the dependent variables. Analyses on the MCTFR sample accounted for non-independence of cases (i.e., twin/sib data) by clustering the analyses by family ID.
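As a small worked illustration of the correction described above, the Python sketch below computes the per-test alpha for six tests. The function name is hypothetical; the only inputs taken from the text are the family-wise alpha of .05 and the six candidate genes.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return family_alpha / n_tests


per_test_alpha = bonferroni_alpha(0.05, 6)   # .05 / 6 = .00833...
reported_alpha = round(per_test_alpha, 3)    # rounds to the .008 used in the text
```

Note that .05 / 6 is approximately .0083, so applying the rounded value of .008 is marginally more conservative than the unrounded threshold.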

Prior research has shown that genetic main effects on complex phenotypes like addiction and externalizing are likely to be very small in magnitude, with most accounting for less than 1 % of phenotypic variance (see Dick et al. 2015). A power analysis was conducted in Mplus 7.2 (Muthén and Muthén 1998–2012) to determine if we had an adequate test sample size to detect small effects (β = .07, r² = 0.5 %); power was estimated at 97 % for N = 3000, 60 % for N = 1000, 51 % for N = 800, and 41 % for N = 600.
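The order of magnitude of these power estimates can be checked with a simple normal approximation. The Python sketch below is not the Mplus procedure used in the study. It assumes a two-sided test at alpha = .05 (the alpha used for the power analysis is not stated in the text), treats the standardized coefficient as approximately normal with standard error 1/√N, and ignores covariates and clustering.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(beta, n, alpha=0.05):
    """Approximate power to detect a standardized effect `beta` with sample
    size `n`, using a two-sided z-test (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # critical value for a two-sided test
    ncp = beta * sqrt(n)                # expected z-statistic under the alternative
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


# Variance explained by beta = .07 is beta squared, about 0.5 %.
variance_explained = 0.07 ** 2

power_by_n = {n: approx_power(0.07, n) for n in (3000, 1000, 800, 600)}
```

Under these assumptions the approximation falls within about one percentage point of the reported estimates, which suggests, though does not establish, that the reported figures correspond to an uncorrected alpha of .05. Passing `alpha=0.008` gives the correspondingly lower power at the corrected threshold.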

Our analysis in each of the test and replicate samples followed a specific pattern. First, we analyzed results using the entire sample, and then confirmed results in those exposed to each substance (i.e., those that had ever used nicotine, alcohol, or cannabis for the analyses corresponding to nicotine dependence, alcohol use disorder and cannabis use disorder, respectively). No exposure variable was set for antisocial behavior. As the majority of the test sample is of European ancestry (~85 %), and much of the prior research involving these candidate genes is on European ancestry samples alone, we also confirmed results in each of the test and replicate samples for those of European ancestry only.

We also aggregated the independent effects for each sample to provide population estimates for each effect using the Comprehensive Meta-Analysis 2.0 software (Borenstein et al. 2005). Because the statistics for all independent effects for each sample were the same (standardized beta coefficients and standard errors) with the same covariates (age, the first ten ancestry principal components as covariates), we were able to calculate mean effect sizes using the beta coefficients (see Peterson and Brown 2005). Mean effect sizes across samples were calculated by weighting each individual effect size by the inverse of its variance. A random effects model, in which both random and systematic components are assumed to account for effect size variance, was used to fit the effect size data.
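The inverse-variance weighting and random-effects pooling described above can be sketched as follows. This Python example is illustrative and does not reproduce the Comprehensive Meta-Analysis software. It assumes the DerSimonian–Laird method-of-moments estimator for the between-study variance, which the text does not specify, and the input coefficients are made-up values rather than study results.

```python
from math import sqrt
from statistics import NormalDist


def pool(betas, ses, tau2=0.0):
    """Inverse-variance weighted mean effect.

    With tau2 = 0 this is a fixed-effect estimate; a positive tau2 adds
    between-study variance to each weight (random-effects estimate).
    Returns (pooled estimate, standard error, two-sided p-value).
    """
    weights = [1.0 / (se ** 2 + tau2) for se in ses]
    total = sum(weights)
    estimate = sum(w * b for w, b in zip(weights, betas)) / total
    se = sqrt(1.0 / total)
    p = 2 * (1 - NormalDist().cdf(abs(estimate / se)))
    return estimate, se, p


def dersimonian_laird_tau2(betas, ses):
    """Method-of-moments estimate of between-study variance (assumed estimator)."""
    if len(betas) < 2:
        return 0.0
    w = [1.0 / se ** 2 for se in ses]
    fixed, _, _ = pool(betas, ses)
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, betas))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    return max(0.0, (q - (len(betas) - 1)) / c)


# Hypothetical standardized betas and standard errors for three samples.
example_betas = [0.05, -0.01, 0.02]
example_ses = [0.02, 0.04, 0.05]

tau2 = dersimonian_laird_tau2(example_betas, example_ses)
pooled_beta, pooled_se, pooled_p = pool(example_betas, example_ses, tau2)
```

When the estimated between-study variance is zero, the random-effects result coincides with the fixed-effect result; as it grows, the weights become more equal across samples and the pooled standard error increases.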

Finally, exploratory analyses were subsequently conducted to evaluate potential sex differences in candidate gene-phenotype associations and to rule out any potential confounding (Keller 2014). This was done by evaluating all results in the test sample (MCTFR) separately for males and females. Significant differences were tested by constraining the candidate gene-phenotype association to be equivalent across males and females and using the Chi square difference test to evaluate for a decrement in model fit between free and constrained models. If any significant differences were found in the test-sample, follow-up analyses were conducted to test those associations in the replicate samples.
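The comparison of free and constrained models rests on a chi-square difference with one degree of freedom (one equality constraint). The Python sketch below shows the plain, unscaled version of that test with hypothetical fit statistics. With a robust estimator such as MLR, a scaling correction to the difference is generally required; the text does not state whether one was applied, and none is included here.

```python
from math import erfc, sqrt


def chi2_diff_p_1df(chi2_constrained, chi2_free):
    """Unscaled chi-square difference test for one added constraint.

    Returns (difference, p-value). For 1 df, the upper-tail probability
    of a chi-square variate x equals erfc(sqrt(x / 2)).
    """
    diff = chi2_constrained - chi2_free
    if diff < 0:
        raise ValueError("constrained model should not fit better than free model")
    return diff, erfc(sqrt(diff / 2.0))


# Hypothetical model chi-squares: a difference of about 3.84 sits at p = .05.
diff, p_value = chi2_diff_p_1df(13.8415, 10.0)
```

A p-value below the chosen threshold indicates that forcing the association to be equal across males and females significantly worsens model fit.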

Results

Table 2 shows descriptive statistics on study phenotypes across the test and replicate samples by sex for the European ancestral group and the non-European ancestral group (including African, Asian, Hispanic, and mixed ethnic ancestry). Non-European ancestral groups were grouped together as there was not sufficient power to test for candidate gene-phenotype associations within each sub-group (see power analyses for replicate samples, above). As demonstrated in Table 2, there was adequate prevalence and variability in each of the outcomes across all studies. The highest average symptom counts were found for the MN ADHD Study, which includes both treatment and high risk samples, whereas MCTFR, SSDP, and RHC had more comparable means as they are community-based samples. In the MCTFR and SSDP samples, males had significantly greater average SUDs than females with moderate effect sizes (Cohen’s d range .26–.61). There were few differences between those of European and mixed ancestral groups in the MCTFR, SSDP, and RHC samples, although there were some moderate effects in the MN Drug Abuse and MN ADHD studies (Cohen’s d range −.37 to .00).

Table 2 Means and standard deviations for study phenotypes across sex and ancestry by study

Table 3 shows the associations between candidate genes and phenotypes for each study. As illustrated in the MCTFR column, there were no significant associations between any of the candidate genes and phenotypic outcomes that met the threshold of p < .008. In fact, there were only three nominal associations that met a threshold of p < .05 (DAT1 and cannabis, DAT1 and cannabis for those exposed to cannabis only, and MAOA and antisocial behavior). Confirming the null findings in the test sample, no associations reached the threshold of p < .008 in any of the replicate samples, and the nominal associations (p < .05) were also not replicated. Meta-analyzed βs were generally estimated at .00 (ranged from −.03 to .03, see Table 3) and none reached statistical significance at p < .008 or p < .05.

Table 3 Associations between candidate gene and externalizing phenotypes by study

Table 4 shows results of the associations between candidate gene and phenotypes, by study, for those of European ancestry only. In the test (MCTFR) sample, a similar pattern of null effects was found with one exception: the association between DAT1 and cannabis for those who had been exposed to cannabis was found to be significant at the a priori threshold (β = .09, S.E. = .03, p = .006). This effect was not replicated in any of the replicate samples (MN ADHD study: β = −.03, S.E. = .05, p = .56; SSDP: β = −.04, S.E. = .06, p = .45; not available in RHC). The meta-analyzed β for the association between DAT1 and cannabis for those exposed and of European ancestry alone was .04 (p = .07).

Table 4 Associations between candidate gene and externalizing phenotypes by study: results for those with European ancestry only

The meta-analyzed βs for the sub-sample of those of European ancestry alone were generally estimated at .00 (ranged from −.04 to .04, see Table 4) and none reached statistical significance at p < .008 or p < .05.

Subsequently, sex differences were tested in the MCTFR sample. A detailed table showing the results for this analysis is given in the Supplementary Materials (eTable 1). In general, few sex differences were found, although a pattern of effects was detected between DAT1 and the alcohol, cannabis, and the externalizing aggregate phenotypes by sex for p-values at the nominal level (<.05). Specifically, the DAT1 risk allele was positively associated with these phenotypes for males (β’s = .09, p’s ranged from .02 to .03) but not females (β’s ranged from −.01 to −.07, p’s ranged from .04 to .86). Constraining these associations to be equal across the two sexes resulted in a significant decrement to model fit (χ²’s ranged from 4.14 to 11.97 on 1 df, p’s ranged from .007 to .04). These sex differences did not replicate in the other samples (see eTables 2–3). Similar results and non-replication were found when restricting the sample to individuals of European ancestry in the test sample (eTable 4) and the replicate samples (eTables 5–7). In short, there was no replicated evidence for sex differences in candidate gene-phenotype associations.

Discussion

This work represents an important step in an on-going collaboration among five longitudinal studies developing conceptual models of gene–environment interplay. Our goal for this analysis was to test main effects of six candidate genes on a range of SUDs and externalizing outcomes and replicate them in multiple samples. The six candidate genes tested were selected because they are the most extensively studied candidate genes to date (in relation to SUD and externalizing problems broadly) and because they were available for analysis across most of our five studies (see Table 1 for an overview of each gene, function, coding, and supporting literature). We found one “hit” in the test sample that met the significance threshold corrected for multiple testing: DAT1 was related to cannabis use disorder symptomology for those of European ancestry who had ever used cannabis. This association, however, was not replicated in pattern or significance in any of the replication data sets, and the meta-analyzed standardized regression coefficient was not significantly different from zero. Additionally, no replicated sex differences were found.

Plan and expectations for future collaborative efforts

Although we did not find candidate gene main effects, our plan is to continue on to test the general/specific conceptual model outlined by Bailey et al. (2011). This model posits that genetic effects interacting with general versus tobacco- and alcohol-specific family environments may differentially relate to a general latent factor of externalizing problems versus unique variance in nicotine dependence or alcohol use disorder. For example, it is possible that tobacco-specific environments may exacerbate nicotine-related genetic risk, while the dopaminergic candidate genes studied here may exacerbate general environmental risk. As the present results did not indicate that any particular candidate gene is more relevant to a variety of externalizing disorders, we plan to include all six candidate genes in future analyses of candidate gene × environment interaction, and to apply a multiple-testing correction for the six candidate gene tests as was done here.

Based on arguments made by Bakermans-Kranenburg and van IJzendoorn (2015), because we failed to find any candidate gene main effects on externalizing outcomes here, we might expect to be more likely to find evidence for differential susceptibility rather than diathesis stress models of gene–environment interaction in our subsequent analyses. Differential susceptibility refers to the notion that those most at risk in adverse environments will also fare the best in optimal environments, in part because their specific genotype is more sensitive to environmental context compared to other genotypes. Alternatively, diathesis stress refers to the notion that those at genetic risk tend to do worse in more adverse environments, in part because the stressful environment triggers their genetic risk. It has been argued that it is nearly impossible to detect a gene–environment interaction effect without a gene main effect for interactions in line with a diathesis stress framework (Risch et al. 2009), but that interactions involving differential susceptibility can and should be detected without a genetic main effect (see Bakermans-Kranenburg and van IJzendoorn 2015 for a detailed review).

In addition to testing the conceptual model of gene–environment interplay developed by Bailey and colleagues (2011), we have developed an additional model that accounts for potential gene × environment × development interaction (see Fig. 1). This model is based on recent work both in the Minnesota lab and others (Johnson et al. 2009; Kendler et al. 2011; Samek et al. 2015, 2016). This model is a simple extension of Bailey et al.’s (2011) original model in that it proposes an analysis of gene–environment correlation and interaction involving adolescent environmental contexts in relation to adolescent externalizing problems as well as subsequent young adult externalizing problems. This model also includes the adolescent peer context in addition to the adolescent family context. Testing this model will help determine whether gene–environment interaction involving adolescent environmental contexts may be developmentally limited to adolescence, or whether it has long-lasting effects into adulthood.

Fig. 1

Additional (newly developed) conceptual model developed to test candidate gene × environment × development interaction. AUD Alcohol use disorder, ND nicotine dependence, IDUD illicit drug use disorder, AAB adult antisocial behavior. Circles represent latent factors, squares represent observed variables. Indicators of adolescent parent–child relationship (e.g., parent–child conflict, management, bonding, involvement) and adolescent antisocial peer affiliation (e.g., antisocial peers, prosocial peers, substance using peers) are not shown for clarity of presentation; these adolescent environmental latent factors will be indicated by 3+ scales or may be measured as observed variables depending on what data is available across the five longitudinal studies. The model proposes an analysis of candidate genes as correlates of adolescent parent–child relationship quality and antisocial peer affiliation variables and moderators of the associations between adolescent parent–child relationship quality and antisocial peer affiliation in relation to a latent factor of externalizing disorders in both adolescence and young adulthood. Polygenic risk scores or other genetic variants could be easily included in the model instead of candidate genes. Note sex, age, and the first 10 principal components of ancestry will be included as covariates but are not shown in the figure for clarity of presentation

Strengths, limitations, and concluding thoughts

A major strength of this study is the utility of the test-replicate approach in evaluating the effect of six widely studied candidate genes in relation to SUD and externalizing behavior outcomes. While we provided evidence for one significant effect in the test sample, it failed to replicate in the other samples. We accounted for and thoroughly analyzed potential confounders (ancestry and sex differences), presented a power analysis demonstrating power to find small effects in each of our samples, and presented a plan for future candidate gene–environment interaction analyses based on the null effects reported here. These have been argued to be essential ingredients in any future work involving candidate genes (e.g., Dick et al. 2015), and we have taken great care to follow such guidelines.

This study is of course not without its limitations. While there was diversity in sample design and participant characteristics across the five longitudinal studies, each of the included samples was still predominantly of European ancestry. The five samples involved in the present study do not have sufficient power to test for these candidate gene-phenotype associations within specific racial/ethnic groups other than Caucasian. Rather than exclude ethnic minorities altogether, we dealt with this potential confounding effect by accounting for ancestry principal components in our model, and by providing results for the entire sample and then for those of European ancestry only (in order to understand if these effects may be specific to those of European ancestry alone). It would be ideal to replicate results in relatively large samples (e.g., 3000+) of those with predominantly African, Asian, Native American, and other U.S. ethnic minority ancestral groups.

In conclusion, this initial work represents an important step in an ongoing collaboration across five longitudinal studies. We did not find any replicated patterns in terms of candidate gene main effects. We presented two conceptual models of gene–environment interplay we aim to subsequently test, and offer a tentative expectation that any interaction effects we may find will be in line with a differential susceptibility rather than diathesis stress framework based on our null effects findings. We encourage others to test either of our proposed models of gene–environment interplay with similar candidate genes as well as other markers of genetic influence that may be relevant to the development of SUDs and externalizing problems.

Finally, it remains imperative to discuss the practical utility of demonstrating any gene effects, given that any effects we may find will likely be small. While gene effects are theoretically relevant to understanding the dynamic interplay between individual-level variables and social context, there are other measures of individual-level risk for SUDs that have long been demonstrated to have moderate to substantial influences on SUDs, such as personality traits including impulsivity or negative emotionality (Chassin et al. 2004; Durbin and Hicks 2014; Hicks et al. 2012; McGue et al. 1999; McGue et al. 1997; Quinn and Fromme 2011; Quinn and Harden 2013). It can be argued that such personality trait × environmental context interactions are understudied in comparison to gene × environment interactions and that personality-environment interplay may have more practical utility for SUD and externalizing prevention and treatment programs (given the relative ease and non-invasiveness of personality measurement and identification of those most at risk for SUD and externalizing problems). In addition, other aspects of environmental risk not limited to the family, such as peer, school/work, and community contexts (Hawkins et al. 1992), may be more relevant to future personality or gene–environment interaction work than family environmental risk alone. Further research should analyze these additional aspects of environmental risk in models of gene × environment or personality × environment interplay.