1 Introduction

Externalizing problems refer to a constellation of behaviors and/or disorders characterized by impulsive action and/or behavioral undercontrol. Externalizing problems can be contrasted with internalizing problems in that they typically reflect actions in the external world, rather than internalized processes within the self, such as anxiety, depression, or negative affect. Externalizing problems include a variety of behaviors such as alcohol or substance misuse, antisocial behaviors, aggression, and risk taking (Krueger et al. 2002; Salvatore and Dick 2018; Young et al. 2000).

Problems associated with externalizing behaviors have high social costs. Substance misuse remains one of the leading contributors to preventable mortality and morbidity worldwide. In 2016, alcohol use contributed to 4.2% of the total global burden of disease; other drug use contributed to 1.3% of the total global disease burden; and smoking contributed to approximately 12% of all deaths (Degenhardt et al. 2013; Reitsma et al. 2017). In 2017, over 47,000 Americans died as the result of an opioid overdose (Center for Disease Control and Prevention 2019). In addition to the health consequences, these behaviors have significant financial costs. Each year, excessive alcohol use is estimated to cost the United States $250 billion (Sacks et al. 2015). Illicit drugs cost the United States approximately $190 billion annually (National Drug Intelligence Center 2011), of which $78.5 billion is due to opioid use alone (Florence et al. 2016). And while difficult to calculate, the total cost of crime in the United States is estimated at between $690 billion and $3.41 trillion annually (Maurer 2017). Understanding the etiology of these behaviors is of utmost importance if practitioners and policy makers are to design effective prevention and intervention efforts.

2 Epidemiology of Externalizing Behaviors/Disorders

Behaviors and disorders across the externalizing spectrum are highly prevalent. In the United States, the 12-month prevalence is approximately 14% for alcohol use disorders (AUD) and 4% for other substance use disorders (SUD), while the lifetime prevalence is much higher (~29% for AUD and ~10% for SUD) (Grant et al. 2015, 2016). These disorders typically manifest during young adulthood, with mean ages of onset ranging from 23.9 for SUD to 26.2 for AUD (Grant et al. 2015, 2016). The lifetime prevalences of other psychiatric disorders related to impulse control, such as attention-deficit/hyperactivity disorder (ADHD) and conduct disorder (CD), are 5.1% and 9.5%, respectively, and these appear at earlier ages (7 and 13 years old, respectively) (Kessler et al. 2005a; Polanczyk et al. 2007). Taken together, the prevalence of any disorder related to impulse control is high: 24.8%, with a median age of onset of 11 years (Kessler et al. 2005a). Importantly, substance use and impulse control disorders do not manifest in isolation and show strong comorbidity in past 12-month diagnoses (Grant et al. 2015, 2016; Kessler et al. 2005b). Longitudinal analyses reveal that many externalizing problems, including heavy alcohol use (Chen and Jacobson 2012), illicit drug use (Chen and Jacobson 2012), and antisocial behaviors (Powell et al. 2010), increase across adolescence into young adulthood, followed by a steady decline. Overall, behaviors on the externalizing spectrum are common, with significant variation across the life course.

2.1 Genetic Epidemiology of the Externalizing Spectrum

Twin and family designs use information from close relatives to estimate the heritability (Footnote 1) of a trait. Twin studies allow researchers to decompose the variance in a trait into additive genetic, shared environmental, and unique environmental influences by comparing the phenotypic correlations of monozygotic (MZ) and dizygotic (DZ) twin pairs. We can estimate these variances because MZ twins share all of their genetic variation, while DZ twins share half of their genetic variation, on average. Shared environmental influences, which refer to environments that make twins more similar, include conditions such as neighborhood context, family socioeconomic status, and religion. Unique environmental influences refer to experiences that make twins more different from each other than expected based on their genetic sharing, for example, if one twin experiences a trauma or has a different peer group. When the within-pair MZ correlation for a phenotype is larger than the within-pair DZ correlation, this suggests the importance of genetic influences on the trait under study. When the DZ correlation for a phenotype is more than half of the MZ correlation, this suggests the presence of shared environmental influences. When the MZ correlation is less than unity, unique environmental influences are inferred (measurement error is also confounded with unique environmental influences) (Neale and Cardon 2013).
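The three comparisons described above correspond to Falconer's classic approximations, which can be sketched in a few lines. This is a simplified illustration only: the twin studies cited here typically fit structural equation models rather than applying these formulas directly, and the function name and example correlations below are hypothetical.

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Rough ACE variance components from MZ and DZ twin correlations."""
    a = 2 * (r_mz - r_dz)  # additive genetic: MZ pairs share all, DZ pairs half
    c = 2 * r_dz - r_mz    # shared environment: DZ correlation beyond half of MZ
    e = 1 - r_mz           # unique environment (includes measurement error)
    return {"A": a, "C": c, "E": e}

# Hypothetical correlations, chosen only for illustration.
estimates = falconer_ace(r_mz=0.60, r_dz=0.40)
print(estimates)
```

With these made-up inputs the DZ correlation exceeds half of the MZ correlation, so the shared environmental component comes out positive, matching the verbal rule in the text.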

Many of the individual phenotypes on the externalizing spectrum demonstrate modest to considerable heritability (h²). SUD have moderate genetic influences, with ~50% of the variance in AUD (Verhulst et al. 2015), 50–60% of the variance in problematic cannabis use (Verweij et al. 2010), ~40–80% of the variance in cocaine use disorders (Kendler et al. 2000, 2003a), 20–50% of the variation in opioid dependence (Kendler et al. 2003a; van den Bree et al. 1998), and ~60% of the variance in nicotine dependence (Maes et al. 2004) being due to genetic influences (h²). Related psychiatric and behavioral outcomes, such as ADHD (h² = 74%), antisocial behavior (h² = 32%), rule breaking (h² = 48%), and aggression (h² = 65%), are moderately-to-strongly heritable (Burt 2009; Faraone and Larsson 2019; Rhee and Waldman 2002).

Importantly, while the heritability for each of these individual phenotypes is moderate-to-strong, the genetic variation impacting each of these disorders appears to be largely shared. Each of these phenotypes loads onto a single highly heritable (h² ~80%) externalizing factor (Kendler et al. 2003b; Krueger et al. 2002; Young et al. 2000), which explains a large proportion of the genetic variance in each individual trait. For example, a general externalizing factor explains 74–80% of genetic influences for AUD, 62–74% for other SUD, and 57–92% for antisocial personality disorder (Kendler and Myers 2013). Other nonclinical risky behaviors that load onto this genetic factor for externalizing include driving while drunk, earlier age at first sex, and riskier sex (Harden et al. 2008b; Quinn and Harden 2013; Samek et al. 2014). Finally, in addition to behaviors, personality traits of novelty seeking, sensation seeking, lack of agreeableness, and lack of conscientiousness also load strongly on this externalizing factor (Kendler and Myers 2013; Krueger et al. 2002; Mann et al. 2015; Young et al. 2000). Overall, twin and family studies indicate that common genetic influences impact multiple traits on the externalizing spectrum.

2.2 Changes in the Etiology of Externalizing Problems Across Development

Like many other complex traits, genetic influences on externalizing problems change across the life course. Genetic influences generally become more important as individuals age and begin to achieve more independence (Dick 2011a; Kendler et al. 2008; Long et al. 2017). This is especially true for traits on the externalizing spectrum. Twin studies repeatedly show that shared environment has important effects on substance use/misuse in early life, whereas genetic influences become more important as individuals reach early adulthood (Dick 2011a; Kendler et al. 2008; Long et al. 2017). Figure 1 provides an overview of the changing relative influence of genetic and shared environmental variance over adolescence (Dick 2011a).

Fig. 1

Relative importance of additive genetic (A) and shared environmental (C) influences on alcohol initiation and frequency of use across adolescence (data reported in Dick 2011a). Across adolescence additive genetic influences (A) become more important, while shared environmental influences (C) become less important

While the importance of genetic influences appears to increase over the early life course, there is evidence that the source of genetic influences is relatively stable over time. Multivariate twin models find that the majority of genetic influence on externalizing is attributable to a single factor that explains a large portion of the variance across development (Wichers et al. 2013). Longitudinal models demonstrate that the genetic influences on initial levels of externalizing mostly overlap with the genetic influences on change over time (Hatoum et al. 2018). In other words, the genetic influences that influence externalizing problems in early development also affect behaviors during adolescence. These patterns are similar when we look at specific behaviors on the externalizing spectrum, including alcohol use (Long et al. 2017) and problematic alcohol use (van Beek et al. 2012): the sources of genetic influences are fairly stable across time. Stability in the source of genetic influences over development also occurs with the personality correlates of externalizing problems (Briley and Tucker-Drob 2017). Longitudinal analyses of lab-based tasks related to both impulsivity and delay discounting reveal developmentally stable, genetic influences on these measures meant to assess dimensions of personality (Anokhin et al. 2011; Niv et al. 2012). Overall, it appears that while genetic influences may become more important over time, the same genetic influences act across development.

Despite the evidence that genetic influences on externalizing are stable over development, there is some evidence that the specificity of genetic influences can change for certain phenotypes on the externalizing spectrum. This is especially apparent with regard to alcohol misuse. Genetic risk for broader externalizing problems and genetic risk for AUD both independently predict alcohol misuse across early development. However, the effect size for each form of genetic risk changes over time. In adolescence, broader externalizing risk has a stronger effect on alcohol misuse, while alcohol-specific risk becomes important during adulthood (see Fig. 2) (Kendler et al. 2011; Meyers et al. 2014). For other drug use, common genetic influences explain approximately half of the correlation between externalizing problems in childhood and drug initiation in late adolescence (Korhonen et al. 2012).

Fig. 2

Genetic risk and alcohol phenotypes from Kendler et al. (2011) and Meyers et al. (2014). Regression coefficients between genetic risk specific to AUD (AUD) or broader externalizing disorders (EXT) and alcohol consumption across age in the Virginia Adult Twin Study of Psychiatric and Substance Use Disorders (VATSPUD, left) and the Finnish Twin Cohort (FinnTwin12, right). Genetic risk for broader externalizing problems and genetic risk specific to AUD both independently predict alcohol misuse across development, though the effect size for each changes over time. In adolescence, EXT has a stronger effect on alcohol misuse, while AUD becomes important during adulthood

2.3 Gene-Environment Interactions in Externalizing Problems

Environmental conditions can alter the importance of genetic influences on externalizing behaviors. This phenomenon is referred to as gene-environment interaction, or GxE (Dick 2011b). Researchers have put forth a variety of theoretical models of GxE. For externalizing phenotypes, we see consistent evidence for two primary theoretical paradigms of GxE: the social control/opportunity model and the social distinction model (Boardman et al. 2013; Shanahan and Hofer 2005). Under the social control/opportunity model of GxE, genetic influences become more important under conditions of reduced social control or increased social opportunity (Shanahan and Hofer 2005). For example, environmental conditions related to low social control/increased social opportunity, such as peer deviance (Cooke et al. 2015; Harden et al. 2008a; Mann et al. 2016; Samek et al. 2016) or high neighborhood turnover rate (Dick et al. 2009), are associated with increases in genetic influences on various externalizing traits. On the opposite side, environmental conditions associated with greater social control/reduced social opportunity, such as greater parental monitoring (Cooke et al. 2015; Dick et al. 2007) or involvement in a committed relationship (Barr et al. 2017; Heath et al. 1989), are associated with reduced genetic influences on these phenotypes. It is important to note that this model of GxE spans various externalizing phenotypes including alcohol use, smoking, behavior problems, and delinquency (Cooke et al. 2015; Harden et al. 2008a; Mann et al. 2016; Samek et al. 2016).

Under the social distinction model of GxE (Boardman et al. 2013), certain social conditions “push” the phenotype and increase the importance of environmental influences on a trait at one end of the environmental spectrum. This increase in the importance of environmental variance reduces the importance of genetic influences on that same end of the spectrum. For externalizing problems, childhood socioeconomic status (SES) has consistently fit this model of GxE. Childhood SES moderates the effect of genetic variation on externalizing problems, such that under conditions of lower SES, environmental sources of variance are more important than genetic influences (Middeldorp et al. 2014; Tuvblad et al. 2006). Family SES moderates genetic liability for externalizing, whereby higher SES and higher genetic risk are associated with a steeper increase in alcohol problems across adolescence (Barr et al. 2018). Neighborhood-level SES moderates genetic risk for delinquency (Beaver 2011) and non-violent conduct problems (Burt et al. 2016) in the same direction as family-level SES, such that environmental influences are stronger under conditions of low neighborhood SES. Overall, GxE findings for the social distinction model and social control/opportunity model demonstrate the ways in which the importance of genetic influences can shift across environmental conditions.

3 Molecular Genetic Studies of Externalizing Problems

While twin and family data provide valuable insight into the genetics of externalizing problems and other complex traits, they use a latent approach that does not provide information about the specific genetic variants associated with a given trait. Over the past 20 years, research examining measured genetic variants has rapidly expanded. Much of the early work focused on candidate genes, which were proposed to be associated with a trait because of a hypothesized biological mechanism. Research in this tradition largely focused on genes or single nucleotide polymorphisms (SNPs) in the serotonergic or dopaminergic systems. However, candidate gene research has been plagued by false positives, publication bias, and low-powered studies (Duncan and Keller 2011). Recent large-scale meta-analyses reveal no support for much of the early work on candidate gene analyses (Border et al. 2019), suggesting that our “best guesses” for genes involved in the underlying biology were not very good. Importantly, candidate gene studies do not fit with our current polygenic understanding of complex traits, whereby phenotypes are influenced by many variants (in the hundreds, if not thousands) of very small effect (Visscher et al. 2017).

With the mapping of the human genome, the focus on candidate gene research has given way to agnostic methods of gene identification that scan the entire genome for SNPs associated with a given trait. Rather than focusing on a single variant with some hypothesized biological mechanism, genome-wide association studies, or GWAS, test the association between a phenotype and SNPs spanning the entire genome. Because of the large number of tests (the typical p-value for genome-wide significance, or GWS, is p < 5 × 10⁻⁸ to correct for approximately one million independent tests) and small effect sizes associated with individual variants, adequate sample sizes for discovery GWAS likely require hundreds of thousands to millions of individuals. Fortunately, with the growth of cheaply available genotyping arrays, large-scale biobanks that genotype large numbers of individuals, and direct-to-consumer genetic testing companies that amass genotypic information on large samples, the sample sizes for these GWAS have been rapidly increasing (Mills and Rahal 2019).
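The genome-wide significance threshold mentioned above follows from a Bonferroni-style correction. The short sketch below assumes a conventional family-wise error rate of 0.05 (an assumption; the text states only the number of independent tests) and uses illustrative variable names.

```python
ALPHA = 0.05                     # assumed family-wise error rate
N_INDEPENDENT_TESTS = 1_000_000  # approximate number of independent tests

# Bonferroni correction: divide the error rate by the number of tests.
gws_threshold = ALPHA / N_INDEPENDENT_TESTS
print(f"{gws_threshold:.0e}")
```

Dividing 0.05 by one million gives 5 × 10⁻⁸, the threshold quoted in the text.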

3.1 Current GWAS of Externalizing Phenotypes

Table 1 provides a sampling of the current GWAS for externalizing traits. To date, these GWAS have predominantly focused on single phenotypes that would be considered part of the externalizing spectrum, mostly substance use outcomes. The majority of GWAS on substance use have focused on alcohol-related phenotypes, including alcohol dependence (Walters et al. 2018) (3 GWS SNPs), alcohol use disorder (Kranzler et al. 2019) (10 GWS SNPs), number of alcoholic drinks per week (Liu et al. 2019) (156 GWS SNPs), and maximum alcohol intake (Gelernter et al. 2019) (6 GWS SNPs). A SNP in the ADH1B gene region (rs1229984) responsible for alcohol metabolism is the most consistently associated SNP across these alcohol GWAS (Gelernter et al. 2019; Kranzler et al. 2019; Liu et al. 2019; Walters et al. 2018); however, other genome-wide significant variants have also begun to emerge, such as those in GCKR which is involved in sugar metabolism in the liver and pancreas (Gelernter et al. 2019; Kranzler et al. 2019; Liu et al. 2019). Sample sizes for these alcohol phenotypes have ranged from moderately to extremely well powered (N’s ~50k – one million). Interestingly, these GWAS reveal that genetic influences on alcohol consumption only partially overlap with variants that impact alcohol-related problems (Sanchez-Roige et al. 2018; Walters et al. 2018).

Table 1 Current GWAS of externalizing phenotypes

GWAS of some illicit drugs, especially cannabis phenotypes, are beginning to reach sample sizes that have adequate power for detection of genetic effects. A recent GWAS of lifetime cannabis use (Pasman et al. 2018) (N ~180K) identified eight GWS SNPs. Three of these loci were in CADM2, which has also shown up in GWAS of impulsivity (Sanchez-Roige et al. 2019), alcohol consumption (Clarke et al. 2017), number of offspring (Day et al. 2016), and risk-taking behavior (Day et al. 2016; Karlsson Linnér et al. 2019). A GWAS of cannabis use disorder (Demontis et al. 2019a) in ~50K individuals identified a single GWS SNP that is a strong expression quantitative trait locus (eQTL, a variant that influences the expression of a gene or genes) for CHRNA2, a nicotine receptor gene related to smoking behavior (Liu et al. 2019). GWAS of other illicit substance use disorders, including cocaine dependence (Gelernter et al. 2014) and opioid dependence (Cheng et al. 2018), are currently underpowered due to extremely small sample sizes (N ~2,500–7,500). Overall, larger sample sizes are needed across illicit substance use to better detect variants associated with these phenotypes.

Beyond substance use, GWAS have focused on other disorders and personality traits related to the externalizing spectrum. A recent GWAS of ADHD (Demontis et al. 2019b) (N ~55K) identified 12 GWS loci. Several of the loci associated with ADHD are located in or near genes implicated in neurodevelopmental processes, including FOXP2, SORCS3, and DUSP6 (Demontis et al. 2019b). For other behavioral phenotypes, a GWAS of antisocial behaviors in ~16k individuals using a broad variety of behaviors (including conduct disorder, behavior check lists, and other scales of antisocial behaviors) did not identify any genome-wide significant loci. GWAS of impulsivity scales, including the Barratt Impulsiveness Scale, or BIS (Sanchez-Roige et al. 2019), the composite UPPS-P scale (urgency, premeditation, perseverance, sensation seeking, and positive urgency), and its subscales, identified GWS SNPs in the CADM2 gene region for the sensation seeking subscale of the UPPS-P and in the CACNA1I gene region (which encodes a protein thought to be involved in calcium signaling in neurons) for the negative urgency subscale (Sanchez-Roige et al. 2019).

Recently, a GWAS of general risk tolerance in approximately 940K individuals identified 124 independent GWS loci (Karlsson Linnér et al. 2019). This study also examined a composite index of risky behaviors (defined as the first principal component of ever smoking, drinks per week, automobile speeding, and number of sexual partners, N = 315,894) identifying 106 GWS SNPs. The top variants in this GWAS were in the MAPT, CADM2, and FOXP1 gene regions, again implicating genes thought to be involved in neurodevelopmental processes (Karlsson Linnér et al. 2019).

Overall, current gene identification efforts for externalizing traits have begun to detect robust associations with individual disorders/phenotypes on the externalizing spectrum, and more recently, with general externalizing behavior. However, much remains to be discovered as to the biological mechanisms through which these variants influence behavior. While some variants have well-known biological function (such as alcohol metabolism or nicotine receptor genes), others (such as those related to neurodevelopmental processes and brain function) need further scrutiny. Future research will need to move beyond simple associations. Integrating data from human GWAS into model organisms will allow us to directly test the biological function of genes identified in GWAS and whether or not these genes exert some causal influence on externalizing problems (Baker et al. 2011; Jay 2012). As the cost of whole genome sequencing comes down, we will also be better able to examine the impact of rare variants, which are largely excluded from current methods (which focus primarily on common variants). Finally, as we begin to think of genes and variants as parts of dense networks, we may better understand the underlying biological mechanisms between genotype and phenotype (Visscher et al. 2017).

3.2 Genetic Correlations and Multivariate Genomic Methods

Perhaps the most interesting finding to emerge from all of these GWAS of phenotypes on the externalizing spectrum is the strong genetic overlap between traits, confirming earlier results from twin and family studies. Alcohol use disorder, alcohol consumption, smoking status, lifetime cannabis use, risky behaviors, general risk tolerance, and polysubstance use all have significant genetic correlations with one another (Kranzler et al. 2019; Liu et al. 2019; Meyers et al. 2014; Sanchez-Roige et al. 2019; Walters et al. 2018). These externalizing phenotypes also overlap genetically with other socio-demographic outcomes related to externalizing, including age at first birth, number of children, and educational attainment (Kranzler et al. 2019; Sanchez-Roige et al. 2019; Walters et al. 2018). Figure 3 (Footnote 2) shows the genetic correlations between a subset of externalizing phenotypes, estimated using GWAS summary statistics (Demontis et al. 2019b; Karlsson Linnér et al. 2019; Liu et al. 2019; Pasman et al. 2018; Sanchez-Roige et al. 2018; Walters et al. 2018). Genetic correlations were calculated using bivariate LD score regression (Bulik-Sullivan et al. 2015).

Fig. 3

Genetic correlations calculated from published GWAS of externalizing phenotypes (using bivariate LD score regression). Heatmap of genetic correlation using effect sizes from currently published GWAS, with correlation estimates denoted in the cells. There is a strong pattern of significant genetic overlap between clinically relevant phenotypes (Problematic Alcohol Use, ADHD), other substance use (Lifetime Cannabis Use, Ever Smoker), risky sexual behaviors (Age at First Sex, Number of Sexual Partners), and personality (General Risk Tolerance). The results in the heatmap demonstrate a potential shared genetic influence across these phenotypes. All genetic correlations are significant after correcting for multiple testing ( p < 0.0024)
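The caption does not state how the multiple-testing threshold was derived. One reading that is consistent with the reported value, offered here only as an assumption, is a Bonferroni correction at 0.05 across every pairwise correlation among the seven phenotypes named in the caption:

```python
from math import comb

# The seven phenotypes listed in the Fig. 3 caption.
phenotypes = [
    "Problematic Alcohol Use", "ADHD", "Lifetime Cannabis Use", "Ever Smoker",
    "Age at First Sex", "Number of Sexual Partners", "General Risk Tolerance",
]

n_pairs = comb(len(phenotypes), 2)  # number of unique phenotype pairs
threshold = 0.05 / n_pairs          # assumed Bonferroni correction at alpha = 0.05
print(n_pairs, round(threshold, 4))
```

Seven phenotypes give 21 unique pairs, and 0.05 / 21 rounds to 0.0024.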

There are now concerted efforts to use information from these and other GWAS to move beyond univariate analyses and model the multivariate genetic architecture of externalizing problems identified in twin and family studies. One such ongoing project is the Externalizing Consortium (Dick et al. 2018). New multivariate gene identification methods such as Genomic Structural Equation Modeling, or Genomic SEM (Grotzinger et al. 2019), utilize genetic correlations to model the underlying factor structure of a set of phenotypes using GWAS summary statistics. While traditional SEM models the phenotypic covariance to measure a latent factor, Genomic SEM models a latent genetic factor based on the genetic covariance. Utilizing these new multivariate methods allows one to boost power to identify genetic variants by harnessing existing GWAS of genetically correlated phenotypes. This type of multivariate analysis illustrates the advantage of combining information across externalizing traits to detect genetic variants associated with a range of externalizing outcomes. As more well-powered GWAS of externalizing phenotypes become available, we will be able to model the underlying externalizing spectrum with even more power and precision.

3.3 Research Using PRS for Externalizing Phenotypes

Beyond identifying associations between individual variants and a phenotype, GWAS results can be used to create polygenic risk scores (PRS) that index an individual’s overall liability for the outcomes, in order to study associations between these aggregate measures of genetic risk and phenotypes in external samples. An important component of using PRS is that the sample in which they are used must be independent of the discovery GWAS sample. Figure 4 provides an overview of how PRS are constructed. PRS are computed as a weighted sum (or average) of the number of “risk” alleles that an individual carries, with weights given by the parameter estimates (e.g., betas, odds ratios, and Z-scores) identified in a GWAS. Because SNPs that are close to one another in the genome are correlated (referred to as linkage disequilibrium, or LD), PRS are generally constructed from a subset of independent SNPs. These SNPs can be selected using a variety of methods, including “pruning and thresholding,” where SNPs below a certain GWAS p-value and LD threshold (using r²) are included (International Schizophrenia Consortium 2009), or LDpred, which uses a Bayesian approach to model SNP effect sizes while accounting for LD from an external reference panel (Vilhjalmsson et al. 2015).
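The selection step can be sketched as a greedy filter, which is closer to what software packages often call clumping. The sketch is a simplified illustration under stated assumptions: the function name, the SNP identifiers, the p-value and r² cutoffs, and the LD values are all hypothetical, and real pipelines estimate LD from genotype data or a reference panel.

```python
def prune_and_threshold(snps, r2, p_max=0.05, r2_max=0.1):
    """Keep SNPs with p < p_max that are not in high LD with a kept SNP.

    snps: list of (snp_id, p_value) pairs.
    r2:   dict mapping frozenset({id_a, id_b}) to their squared correlation;
          pairs missing from the dict are treated as uncorrelated.
    """
    kept = []
    for snp_id, p in sorted(snps, key=lambda s: s[1]):  # most significant first
        if p >= p_max:
            break  # every remaining SNP has a larger p-value
        if all(r2.get(frozenset((snp_id, k)), 0.0) < r2_max for k in kept):
            kept.append(snp_id)
    return kept

# Hypothetical summary statistics and LD values.
snps = [("rs1", 1e-9), ("rs2", 1e-6), ("rs3", 0.03), ("rs4", 0.40)]
r2 = {frozenset(("rs1", "rs2")): 0.85, frozenset(("rs2", "rs3")): 0.02}
selected = prune_and_threshold(snps, r2)
print(selected)
```

Here rs2 is dropped because it is in high LD with the more significant rs1, and rs4 is dropped because its p-value exceeds the cutoff.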

Fig. 4

Hypothetical example for calculating polygenic risk scores. Example of using GWAS summary statistics (left) to calculate polygenic risk scores (PRS) in an independent sample. GWAS provides the effect sizes (Beta) and risk alleles used to calculate the weighted sum of risk alleles that an individual in the target sample carries. For example, Person 1 carries two risk alleles (A) at SNP 1, a single risk allele (T) at SNP 2, and a single risk allele (G) at SNP 3. Therefore, their PRS would be 2 × 0.874 + 1 × (−0.007) + 1 × 0.148 = 1.889. This process occurs across all SNPs included for calculating a given PRS for each person in the independent target sample
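The calculation in the caption can be reproduced with a short sketch. The effect sizes and allele counts are taken from the caption; the function and variable names are illustrative, and the unnormalized weighted sum is used here because that is what the caption shows.

```python
# Effect sizes (Beta) for SNP 1 to SNP 3, as given in the Fig. 4 caption.
betas = [0.874, -0.007, 0.148]
# Risk-allele counts (0, 1, or 2) carried by Person 1 at the same SNPs.
person_1_counts = [2, 1, 1]

def polygenic_score(counts, weights):
    """Weighted sum of risk-allele counts across SNPs."""
    if len(counts) != len(weights):
        raise ValueError("counts and weights must have the same length")
    return sum(c * w for c, w in zip(counts, weights))

prs_person_1 = polygenic_score(person_1_counts, betas)
print(round(prs_person_1, 3))  # 1.889
```

In practice the same function would be applied to every person in the target sample, over all SNPs retained after the selection step.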

PRS provide a flexible way of taking results from large-scale GWAS into samples with extensive phenotyping or longitudinal data to answer more nuanced questions about how genetic liability unfolds over time or how it changes across specific environments. Extending the twin-family literature, research that uses PRS for externalizing problems has also found evidence of GxE. Following the social control/social opportunity model of GxE from the twin literature, recent work using PRS has found evidence of romantic partnerships moderating the association between PRS and alcohol misuse (Barr et al. 2019); peer deviance and parental monitoring moderating the association between PRS and externalizing disorders (Salvatore et al. 2014); and neighborhood social cohesion moderating the association between PRS and nicotine use (Meyers et al. 2013). In each of the listed PRS analyses, the association between the PRS and the corresponding phenotype was stronger under conditions of reduced social control (e.g., not in a relationship, association with deviant peers, low parental monitoring, and low neighborhood social cohesion) compared to conditions of increased control/reduced opportunity.

PRS derived from GWAS of other outcomes that are genetically correlated to externalizing problems can also be used as proxies for PRS of externalizing problems. For example, PRS derived from a GWAS of educational attainment (Lee et al. 2018) predict antisocial behaviors across the life course, from early adolescence into adulthood (Wertz et al. 2018). PRS derived from GWAS of schizophrenia (Schizophrenia Working Group of the Psychiatric Genomics Consortium 2014) also predict childhood behavior problems (Jansen et al. 2018). Finally, PRS derived from a GWAS of educational attainment predict a variety of substance use disorders, including alcohol, tobacco, and cannabis use disorders (Salvatore et al. 2019). These analyses demonstrate that in the absence of well-powered GWAS of externalizing problems, PRS derived from large GWAS of genetically correlated phenotypes provide information as to how these forms of genetic liability unfold over time and relate to multiple externalizing phenotypes.

It is important to note that even though PRS have proven to be a useful tool for understanding genetic liability, even in large cohorts, PRS continue to predict only small portions of the variance in independent samples. For example, PRS from a recent GWAS of approximately one million individuals explained ~2.5% of variance in alcohol consumption (Liu et al. 2019). Additionally, PRS aggregate information from across the genome without regard to the biological function of the variants included. Future methods that can incorporate additional information from biological annotations or functional enrichment may improve our ability to predict these disorders from PRS (Márquez-Luna et al. 2019).

3.4 Increasing Diversity in Genetic Research

Concerted efforts to increase the diversity of participants in genomic research are vital. To date, large-scale GWAS are composed almost entirely of individuals of European ancestry (Mills and Rahal 2019). Inclusion of individuals of diverse ancestries is scientifically important because it increases the discovery power in GWAS (Dick et al. 2017; Wojcik et al. 2019) and because differences in LD structure across ancestries allow us to get closer to causal variants (Bigdeli et al. 2019). Creating more diverse samples is also important for ethical reasons. PRS derived from ancestral populations that differ from the target sample perform poorly (Martin et al. 2017). In the push towards precision medicine, the current GWAS will likely exacerbate health disparities rather than help solve them (Martin et al. 2019). Therefore, greater diversity in genomic research is both a moral and scientific imperative.

4 Conclusion

Research into the etiology of externalizing problems has found that each of the psychiatric disorders, nonclinical behaviors, and personality characteristics on the externalizing spectrum is heritable to varying degrees. Overall, these externalizing problems share a common genetic etiology that accounts for a large share of the genetic variance in each of the corresponding phenotypes. However, the importance of these genetic influences can change across developmental and environmental context, typically becoming more influential as individuals reach young adulthood and under conditions in which social control is limited. Recent work has moved beyond the latent genetic designs of twin and family research into designs using measured genome-wide data. While we have begun to robustly detect variants associated with these traits, there is still considerable work to be done on elucidating biological mechanisms of risk. Future multivariate GWAS efforts will help to better understand the underlying genetic architecture shared across individual phenotypes. As we better understand the genetic architecture, we will be able to use results to create more powerful PRS in independent samples to further explore the ways in which risk unfolds across time and environmental context. And as we move towards the era of precision medicine, we will need to ensure that we have even larger discovery samples of diverse ancestries so that our results can be used to improve the health of all individuals in the population.