Introduction

Earlier studies suggest that children manifesting externalizing behaviors in early adolescence are more likely to initiate the use of legal substances, such as tobacco, and then progress to use of illicit drugs (King et al. 2004; Korhonen et al. 2010a, b). For example, using longitudinal data from the Minnesota Twin Family Study, King et al. (2004) reported that children with externalizing psychopathology at age 11 were significantly more likely to have tried alcohol, tobacco or cannabis by age 14, as well as to have had regular and advanced experience with these substances. Two recent European studies, conducted among Finnish and Dutch adolescents, suggested that the influences of externalizing behaviors on initiation of use of cannabis and other illicit drugs were mediated by preceding cigarette smoking. That inference was made, because the direct path coefficients of certain externalizing behaviors, such as hyperactivity–impulsivity, on initiation of drug use were significantly attenuated when preceding cigarette smoking was taken into account (Korhonen et al. 2010a, b). However, these studies did not investigate whether the associations between externalizing behavior, smoking and drug use initiation have their origins in genetic and/or environmental influences common to all three phenotypes.

Several twin and family studies have examined the relative importance of genetic and environmental influences on problem behaviors and use of various psychoactive substances. Many earlier studies have limited the assessment of externalizing behaviors to aggressiveness. Miles and Carey (1997) conducted a meta-analysis of 24 genetically informative studies on aggression and reported an overall genetic effect up to 50%. Self-reports and parental ratings showed genes and family environment to be important in youth whereas later the influence of genes increased while that of family environment decreased (Miles and Carey 1997).

For use of substances such as tobacco and illicit drugs, studies find moderate to high heritability, with estimates varying as a function of age and gender (Rose et al. 2009; Agrawal and Lynskey 2008). The influence of genetic effects on initiation of tobacco and drug use tends to be lower in early adolescence and to rise afterwards. Findings on how gender modulates the magnitude of genetic and common environmental influences have been less consistent: some smoking initiation studies report higher heritability for males (Hamilton et al. 2006), whereas others report lower heritability (White et al. 2003; Li et al. 2003).

Concerning smoking initiation, studies of adolescent twins demonstrate the importance of genetic factors already at an early stage of development (Rose et al. 2009), although the estimates range widely. Heritability of smoking initiation in adolescence was as high as 84% in a Virginia twin population (Maes et al. 1999), but 38% among Colorado twins (Rhee et al. 2003) and only 15% among Australian twins (White et al. 2003). According to the extensive review by Rose et al. (2009), there is consistent evidence that the influence of genetic effects on smoking behaviors increases from early adolescence into adulthood.

For cannabis use initiation during adolescence, genetic factors have a modest effect, while the influence of environmental factors predominates (Agrawal and Lynskey 2006; Shelton et al. 2007). A recent meta-analysis on cannabis use initiation reported estimates for additive genetic (A), common environmental (C) and unique environmental (E) influences of 48, 25 and 27% in males and 40, 39 and 21% in females (Verweij et al. 2010).

Multivariate genetic analyses on substance use initiation in adolescent twins are still quite limited. Koopmans et al. (1997) reported that initiation of alcohol use and smoking in adolescents was substantially influenced by common environmental features shared by the co-twins. Multivariate modeling on use of tobacco, alcohol and illicit drugs in a Minnesota sample indicated that adolescent initiation of substance use is influenced primarily by environmental rather than genetic factors, and that covariation among the three substance use phenotypes could be accounted for by a common underlying substance use factor (Han et al. 1999). Genetic and environmental contributions to the initiation of use and progression to more serious use of tobacco, marijuana and alcohol during adolescence, as well as the relationship between initiation and progression of substance use, have been examined using a two-stage causal-common-contingent model (Fowler et al. 2007). For tobacco and marijuana use, the relation between initiation and progression to heavier use was strong, suggesting overlapping etiologies. For both substances, common environmental effects tended to be greater for initiation, with genetic influences stronger for heavier use (Fowler et al. 2007).

A recent analysis among Finnish adolescent twins compared a model describing a direct impact of liability to tobacco use on use of illicit drugs with a model including a shared underlying liability for both substances. The multivariate model, which included a direct impact of the initiation of tobacco use on initiation of illicit drug use, provided the best fit to the data. However, the influence of common genetic influences on use of both tobacco and illicit drugs could not be excluded (Huizink et al. 2010). Consistent with the Finnish study, a U.S. study tested 13 genetically informative models underlying the lifetime co-occurrence of tobacco and cannabis use in Virginian adolescent twins. In this study, the causation models fit the adolescent data best, but the correlated liabilities model with moderate genetic correlations could not be excluded either (Agrawal et al. 2010).

Considering substance use and abuse, multivariate analyses by Young et al. (2006) showed significant genetic correlations between tobacco, alcohol, and marijuana abuse, whereas significant shared environmental influences were found only for use of those substances. However, none of these multivariate models included externalizing behavior as a potential underlying common risk factor.

Co-occurrence of externalizing behaviors and substance use in youth has been tested in a few studies (Shelton et al. 2007; Krueger et al. 2002; Miles et al. 2002; Hicks et al. 2004) with inconsistent evidence on common genetic and environmental liability, depending on phenotype (initiation, use, abuse, dependence) and study design (cross-sectional, longitudinal) applied, as well as on age groups studied. For example, Shelton et al. (2007) found in their longitudinal study that conduct problems in childhood and early adolescence made a significant contribution to the risk for marijuana use eight years later, whereas Hicks et al. (2004) reported, based on cross-sectional data, that conduct disorder, antisocial behavior, alcohol dependence, and drug dependence share common genetic vulnerability. Studies on adults suggest more consistently substantial genetic overlap between use and abuse of different substances (e.g. Kendler et al. 2003, 2007).

In summary, although several earlier twin studies have investigated the genetic and environmental influences on externalizing behaviors, cigarette smoking and use of illicit drugs, few studies span the important developmental periods in adolescence, characterized by emotional and cognitive developmental tasks, such as separation from parents, forming a greater sense of personal identity and identification with a peer group, and increased capacity for impulse control and self-regulation (Hazen et al. 2008). These behavioral and psychological changes are accompanied by several developmental transitions in brain physiology, making adolescence a critical period of vulnerability for initiation of substance use and later also for addiction (Crews et al. 2007; Spear 2000). Importantly, there is a lack of genetically informative longitudinal studies conducted across adolescence. Thus, it remains unclear to what extent early observed problem behaviors and subsequent initiation of licit and illicit substances share common genetic and/or environmental influences. Moreover, most studies did not test potential gender differences.

The aim of the present study was to investigate the extent to which common genetic and environmental influences underlie the pathways between externalizing behaviors, initiation/use of tobacco and initiation of illicit drug use. We used longitudinal data from the FinnTwin12 cohort where externalizing behaviors were studied at the age of 12, tobacco use at the age of 14 and initiation of drug use at the age of 17.5, representing phases of adolescence when each of these behaviors can be observed at their early stages. Considering the age group under investigation, our focus was on initiation and frequency, while abuse and dependence were not a focus of our analysis. Based on the existing literature and utilizing our longitudinal data, our main objective was to focus on the influences that are common to early observed externalizing behaviors and later reported tobacco and drug use phenotypes. Our second objective was to test gender differences, i.e. whether the parameter estimates of genetically informative models could be constrained to equality for boys and girls.

Methods

Subjects

This investigation was based on longitudinal data of the FinnTwin12 study, started in 1994 to examine genetic and environmental determinants of precursors of health-related behaviors in initially 11–12-year-old twins (born 1983–1987). The study targeted five consecutive and complete birth cohorts of about 5,600 Finnish twins, including questionnaire assessments of both twins and about 5,000 parents at baseline in the year before the twins reached age 12 (87% participation rate). The following spring the twins’ classroom teachers rated the behavior of the twins, as described in detail elsewhere (Kaprio et al. 2002; Pulkkinen et al. 1999). All twins were re-tested at ages 14 (1997–2001) and 17.5 (2000–2005). The study protocol was approved by the IRB of Indiana University and the Ethical Committee of the University of Helsinki. The parents provided written informed consent for participation (Kaprio et al. 2002; Kaprio 2006).

At first follow-up, the mean age was 14.1 years. The response rate was 88% (4,740 questionnaires returned out of 5,362 mailed). The present study utilized information on cigarette smoking from this survey. At second follow-up at age 17.5, a questionnaire was sent to the twins of each family that returned the family questionnaire. This questionnaire provided information on illicit drug use. In all, 4,236 questionnaires were returned out of 4,594 mailed (response rate 92.2% for those participating in earlier questionnaires). Among those participating in all three surveys (n = 4,138), data on illicit drug use at age 17.5 were available from 4,129 individuals.

The preliminary analyses of the present study, such as testing assumptions of genetic modeling and univariate modeling, were conducted on all available data, including 737 monozygotic (MZ), 722 same-sex dizygotic (SS-DZ) and 670 opposite-sex (OS-DZ) twin pairs. However, in order to make the multivariate structural models more amenable for estimation, the final sample was restricted in the multivariate models to the same-sex pairs (737 MZ and 722 SS-DZ pairs).

Measures

Externalizing behaviors were rated by the twins’ teachers at age 11–12 using a Finnish scale, the Multidimensional Peer Nomination Inventory (MPNI). It had scales for hyperactivity–impulsivity (e.g. is restless; runs about and climbs everywhere in spite of warnings), aggression (e.g. teases other kids or attacks them for no reason at all; goes round telling people’s secrets to others), and inattention (e.g. is forgetful; ignores instructions), which formed a factor for externalizing problem behaviors (also called behavioral problems) (Pulkkinen et al. 1999). The formation of the scales, including psychometric information and individual items, is described by Pulkkinen et al. (1999). MPNI has been applied in several other studies (Barman et al. 2004; Happonen et al. 2002; Korhonen et al. 2010a; Pulkkinen et al. 2003; Vaalamo et al. 2002; Vierikko et al. 2003, 2004; Virtanen et al. 2004). In the present study we used the sum score of hyperactivity–impulsivity, aggressiveness and inattention. This highly skewed sum score was categorized into three categories comprising 60, 30 and 10% of participants. The resulting distribution passed the multivariate normality test, and the third category was considered an approximation of clinically significant behavioral problems.

Adolescent smoking at age 14 was assessed with a multipart question that first asked “Have you ever smoked (or tried smoking)?” to which adolescents responded “yes” or “no”. Adolescents who responded “yes” subsequently answered a question that asked “How many cigarettes have you smoked altogether up to now?” with four response options: “only one”, “about 2 to 10”, “about 11 to 50”, or “over 50”. Because ‘initiation’ and ‘amount’ are different dimensions of this trait, we created two phenotypes for the modeling. Initiation was a dichotomous trait, whereas amount of cigarettes smoked was a 4-class ordinal one with never smokers having missing values for that trait. This method of treating the analysis of twin data on initiation and progression as a special case of missing data, in which individuals who do not initiate are regarded as having missing data on progression measures, has been suggested and developed by Neale et al. (2006a) and can easily be applied by using the general framework for the analysis of ordinal data with missing values available in the statistical package Mx.
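The coding of the two smoking phenotypes described above can be sketched as follows. This is only an illustration in Python, not the Mx setup used in the study; the function name `code_smoking` and the numeric codes 0–3 are assumptions introduced for the example.

```python
from typing import Optional, Tuple

# Ordinal codes for the four response options of the follow-up question.
# The numeric values 0-3 are an assumption made for this illustration.
AMOUNT_CODES = {
    "only one": 0,
    "about 2 to 10": 1,
    "about 11 to 50": 2,
    "over 50": 3,
}


def code_smoking(ever_smoked: str,
                 amount: Optional[str] = None) -> Tuple[int, Optional[int]]:
    """Return (initiation, amount) for one adolescent.

    Initiation is dichotomous (0 = never, 1 = ever). Amount is a 4-class
    ordinal code that stays missing (None) for never smokers, so they
    contribute no information on the progression measure.
    """
    if ever_smoked == "no":
        return 0, None
    if ever_smoked == "yes":
        # An unanswered or unrecognized amount response is also left missing.
        return 1, AMOUNT_CODES.get(amount)
    raise ValueError(f"unexpected response: {ever_smoked!r}")
```

Under this sketch a never smoker is coded `(0, None)`, whereas an adolescent who reports "over 50" cigarettes is coded `(1, 3)`.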

Self-reported ever use of cannabis or other illicit drugs at age 17.5 was assessed with the item “Have you ever tried or used drugs, such as hashish, something to sniff, or other drugs or substances that would make you feel ‘intoxicated’?” The options were: 1 = I have never tried or used; 2 = 1–3 times; 3 = 4–9; 4 = 10–19, and 5 = 20 times or more. As frequent use was rare, for the analyses of this study these options were re-coded as a dichotomous variable, i.e. 0 = never used and 1 = ever used (all categories with any use).

Tests of bivariate normality were performed on the twin 1 and twin 2 scores on the ordinal variables with more than two categories, i.e. externalizing and smoking amount. The assumption of bivariate normality was reasonably met for these variables, with three of the four same-sex zygosity groups passing the test (P > 0.05) in both cases.

Statistical methods

As a preliminary analysis we calculated the phenotypic correlations; tetrachoric correlations were calculated for smoking initiation and drug use initiation, and polychoric correlations were calculated for other phenotypes. Then we calculated polychoric cross-twin within-trait correlations and cross-twin cross-trait correlations using the Stata statistical package, version 11 (StataCorp 2005). All phenotypes were analyzed as ordinal ones, i.e. externalizing behavior in three categories, initiation of smoking in two categories, amount of cigarettes smoked in four categories, and ever use of illicit drugs in two categories. The thresholds were modeled separately for male and female adolescents in all models. Those thresholds were initially estimated in Stata by ordered probit regression.
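To illustrate what a threshold means under the liability model, the sketch below converts ordered category proportions into thresholds on a standard normal liability scale. It is a simplification: it assumes no covariates and uses the pooled 60/30/10% split of the externalizing score, whereas the study estimated sex-specific thresholds by ordered probit regression in Stata. The function name `thresholds_from_proportions` is hypothetical.

```python
from statistics import NormalDist


def thresholds_from_proportions(proportions):
    """Liability thresholds implied by ordered category proportions.

    Assumes a standard normal liability and no covariates; each threshold
    is the normal quantile of the cumulative proportion below it.
    """
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    std_normal = NormalDist()
    cumulative = 0.0
    thresholds = []
    for p in proportions[:-1]:  # k categories give k - 1 thresholds
        cumulative += p
        thresholds.append(std_normal.inv_cdf(cumulative))
    return thresholds


# Three externalizing categories holding 60, 30 and 10% of participants.
externalizing_thresholds = thresholds_from_proportions([0.60, 0.30, 0.10])
```

With these proportions the two thresholds fall at roughly 0.25 and 1.28 standard deviations above the mean liability.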

Twin modeling is based on the assumption that MZ twins share 100% of their genomic sequence, whereas DZ twins share on average 50% of their segregating genes. Greater similarity for MZ twins compared with DZ twins supports the hypothesis that genetic transmission is important, assuming that MZ and DZ pairs share their phenotype-relevant environmental experiences to the same extent. In the model, the correlations for the genetic components are 1 among MZ pairs; among DZ pairs they are 0.5 for the additive genetic component (A) and 0.25 for the dominant genetic component (D). Environmental factors include the environment shared by the co-twins (C = common environment) and the environment not shared by the co-twins (E = unique environment), including measurement error. In the model, the correlation is 1 for common environment and 0 for unique environment within both MZ and DZ twin pairs (Boomsma et al. 2002). The Mx statistical package was used to estimate the proportion of trait variance accounted for by additive (A) or dominant (D) genetic factors, by shared/common environmental factors (C) and by factors unique for the co-twins (E). Based on twin correlations, the ACE model was selected as the starting point of the modeling. First, for each phenotype, a full model including ACE effects was fitted. Then, we tested the statistical significance of each component of the baseline model by fixing it to zero in order to find the most parsimonious model (Neale and Maes 2006; Neale et al. 2006b).
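The correlation structure stated above implies simple expected twin correlations for a standardized trait, which the following Python sketch makes explicit. The function names are hypothetical, and the second function uses Falconer-style formulas as rough starting values only; it is not the maximum-likelihood estimation carried out in Mx.

```python
def expected_twin_correlations(a2, c2, d2=0.0):
    """Model-implied twin-pair correlations for a standardized trait.

    a2, c2 and d2 are the proportions of variance due to additive genetic,
    common environmental and dominant genetic factors. Unique environment
    (E) is uncorrelated within pairs, so it does not enter either formula.
    """
    r_mz = 1.0 * a2 + 1.0 * d2 + 1.0 * c2
    r_dz = 0.5 * a2 + 0.25 * d2 + 1.0 * c2
    return r_mz, r_dz


def falconer_estimates(r_mz, r_dz):
    """Rough ACE values from twin correlations, assuming no dominance."""
    a2 = 2.0 * (r_mz - r_dz)  # MZ pairs share twice the additive effects of DZ pairs
    c2 = 2.0 * r_dz - r_mz    # resemblance not explained by additive effects
    e2 = 1.0 - r_mz           # whatever makes MZ co-twins differ
    return a2, c2, e2
```

For example, a trait with a2 = 0.4 and c2 = 0.3 gives expected correlations of about 0.7 for MZ and 0.5 for DZ pairs, and feeding those two values back into `falconer_estimates` recovers the same proportions. A DZ correlation above half the MZ correlation therefore points to a nonzero C component.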

Twin modeling was initiated with univariate structural modeling including OS-DZ twins and testing both quantitative and qualitative gender differences in the genetic influences on the phenotypes. Quantitative gender differences in the A, C and E influences are tested by constraining the path coefficients from these latent variance components to the phenotypes equal across gender. The inclusion of OS pairs also enables the testing of qualitative genetic gender differences, which are inferred if fixing the correlation between A influences in OS pairs to 0.5 results in a significant reduction in model fit.

Multivariate Cholesky models included 356 male and 381 female MZ pairs, and 383 male and 339 female same-sex DZ pairs. Although including the opposite-sex DZ twins into the modeling could provide valuable information, we acknowledge that multivariate modeling of sex differences with OS pairs is challenging. The basis of this challenge has been reported by Neale et al. (2006c) showing clear identification issues in the multivariate Cholesky models with OS pairs. This provided us the rationale for excluding OS twins from the multivariate models.

Based on the existing literature, we decided to focus on the covariation between early observed externalizing behaviors and later reported smoking and drug use phenotypes. We first tested gender differences, constraining the parameters to be equal for males and females in the full multivariate model. This resulted in a significant reduction in model fit (χ² = 53.28, df = 30, P = 0.006). However, inspection of the path coefficients of the full model revealed that many of the off-diagonal E paths of the Cholesky model (i.e. E influences on the covariance between the traits) were very small (raw path coefficients ranging from 2.6 × 10⁻⁷ to 0.10, with more than half of the path coefficients being smaller than 0.01). Consequently, we tested whether these off-diagonal E paths could be dropped as a block, making further model testing more straightforward. Dropping these 12 paths was indeed statistically possible, with only a negligible effect on model fit (χ² = 2.67, df = 12, P = 0.997). We then tested whether the estimates for boys and girls could be equated in this restricted model and found this possible (Table 1; comparison to the restricted model: χ² = 20.10, df = 20, P = 0.452; comparison to the full model: χ² = 22.77, df = 32, P = 0.89). Although this order of model fitting may be atypical, these fit statistics clearly indicate that estimating those very small (close to zero) E paths substantially impeded the model estimation. As a result, we chose to follow a model fitting procedure that was initially driven empirically in order to make the models with four ordinal variables more amenable to estimation, while testing specific and well-motivated research questions.
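The role of the E paths that link different traits can be illustrated with a small sketch. In a Cholesky decomposition each variance component has a lower-triangular path matrix, and the covariance it implies is that matrix multiplied by its transpose; the entries below the main diagonal are the ones that generate covariance between traits. The helper names and the two-trait path values below are hypothetical and chosen for illustration only.

```python
def implied_covariance(paths):
    """Covariance matrix implied by a lower-triangular path matrix (L times L transposed)."""
    n = len(paths)
    return [[sum(paths[i][k] * paths[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]


def count_paths(n_traits):
    """Number of (diagonal, below-diagonal) paths in an n-trait Cholesky factor."""
    diagonal = n_traits
    below_diagonal = n_traits * (n_traits - 1) // 2
    return diagonal, below_diagonal


# Hypothetical E paths for two traits; the cross-trait path (0.02) is tiny.
E_PATHS = [[0.35, 0.00],
           [0.02, 0.40]]

# The same matrix with the cross-trait path fixed to zero.
E_PATHS_DROPPED = [[0.35, 0.00],
                   [0.00, 0.40]]

cov_full = implied_covariance(E_PATHS)
cov_dropped = implied_covariance(E_PATHS_DROPPED)
```

Fixing the cross-trait path to zero removes the E covariance between the two traits while leaving each trait its own E variance. For four traits, `count_paths(4)` gives six cross-trait paths per variance component, which would correspond to the 12 paths dropped here if they were estimated separately for boys and girls.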

Table 1 Fit statistics for multivariate Cholesky decomposition models for externalizing behavior, smoking initiation, smoking amount, and drug use initiation

We continued the multivariate model fitting by collapsing boys and girls together and focusing on two major issues on the genetic and environmental covariance structure between externalizing behavior and substance use. We tested two primary questions: (A) whether there are significant A or C influences that are common to early adolescence (age 12) externalizing behaviors and later adolescence (14–17) smoking and drug use initiation phenotypes when the substance phenotypes are allowed to have additional (shared and specific) A and C factors affecting them, and (B) whether additional A or C influences related to substance initiation/use are required if the model contains the “general liability” A or C factor that influences both externalizing and substance use.

We compared the nested submodels with more saturated ones through Chi-square difference tests, wherein a P value less than 0.05 indicates that the submodel fits the data significantly worse than the less parsimonious model including more paths. When choosing the best fitting final model we additionally compared the Akaike’s Information Criterion (AIC) values between the models. Here, the lower AIC value—often a greater negative value—indicates the more parsimonious model (Neale and Maes 2006). Finally, in order to find the most parsimonious model, we started with the reduced model that had the lowest AIC, then arrived at the final model by dropping non-significant parameters from this model.
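The Chi-square difference test can be sketched in a few lines of Python. The function below is an assumption-laden illustration rather than the Mx routine: it uses a closed form for the upper-tail probability that holds only for even degrees of freedom, which happens to cover the multivariate comparisons reported in this section. The name `chi2_sf_even_df` is hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate, for even df only.

    For df = 2k the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires positive even df")
    half = x / 2.0
    term = 1.0   # the i = 0 term of the sum
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Differences in -2 log likelihood and df between nested models,
# taken from the comparisons reported in the text.
p_equate_sexes_full = chi2_sf_even_df(53.28, 30)  # text reports P = 0.006
p_drop_e_paths = chi2_sf_even_df(2.67, 12)        # text reports P = 0.997
p_equate_sexes_restricted = chi2_sf_even_df(20.10, 20)  # text reports P = 0.452
```

A P value below 0.05, as in the first comparison, means that the constrained submodel fits significantly worse; the other two comparisons give no such indication.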

Results

Descriptive results

When the sum score of teacher-rated externalizing behavior at age 12 was analyzed in three categories for all twins, a clear gender difference in distribution was seen: 16.5% of boys and 3.79% of girls belonged to the highest category of the sum score. Concerning cigarette smoking at age 14, 43.0% of boys and 41.4% of girls had smoked at least once, whereas 17.9% of boys and 19.4% of girls had smoked over 50 cigarettes. Finally, at age 17.5, 12.1% of boys and 15.1% of girls had used cannabis or other illicit drugs at least once. The phenotypic correlations are shown in Table 2; tetrachoric correlations were calculated for smoking initiation and drug use initiation and polychoric correlations for the other phenotypes. The correlations are shown separately for boys and girls. Correlations were highest between smoking initiation and initiation of drug use. Polychoric cross-twin within-trait correlations for externalizing behaviors, smoking and drug use across sex-zygosity groups are shown in Table 3, and cross-twin cross-trait correlations in Table 4. The cross-twin within-trait correlations were systematically larger among MZ than DZ pairs, suggesting the presence of genetic influences on the traits. However, all DZ correlations were more than half the size of the corresponding MZ correlations, implying significant influences of the C component. The cross-twin cross-trait correlations were often only slightly larger among MZ than DZ pairs, suggesting that the co-occurrence of the traits under study would be mostly due to shared C influences, whereas shared A effects would account for a smaller proportion of the covariance.

Table 2 Tetrachoric and polychoric correlations between the phenotypes in boys (top rows) and girls (bottom rows)
Table 3 Polychoric cross-twin within-trait correlations for externalizing behavior, smoking initiation, smoking amount and drug use initiation
Table 4 Cross-twin cross-trait correlations between externalizing behavior, smoking initiation, smoking amount and drug use initiation

Univariate modeling

ACE models with equal estimates for boys and girls turned out to be the best fitting univariate models for all phenotypes, suggesting that no quantitative gender differences in the etiology of these traits were present. In contrast, the genetic correlation in OS pairs could not be fixed to 0.5 for any of the phenotypes (χ² in the range of 4.46 to 20.42, df = 1, P in the range of 0.03 to <0.001), indicating qualitative genetic gender differences. The estimates of A, C and E effects in the univariate model for externalizing were 0.57 (95% CI: 0.43–0.73), 0.32 (95% CI: 0.16–0.45) and 0.12 (95% CI: 0.09–0.15), respectively. For smoking initiation, the estimates of A, C and E effects were 0.20 (95% CI: 0.15–0.31), 0.75 (95% CI: 0.65–0.79) and 0.05 (95% CI: 0.03–0.07), and for the amount of cigarettes smoked 0.39 (95% CI: 0.19–0.63), 0.39 (95% CI: 0.17–0.57) and 0.21 (95% CI: 0.16–0.28), respectively. The A, C and E estimates for initiation of drug use were 0.30 (95% CI: 0.15–0.56), 0.57 (95% CI: 0.33–0.70) and 0.13 (95% CI: 0.08–0.20), respectively.

Multivariate modeling

Although including opposite-sex DZ twins into the modeling could provide valuable information, multivariate modeling of gender differences with OS pairs is problematic, because of clear identification issues in the multivariate Cholesky models with OS pairs (Neale et al. 2006c). Therefore we excluded OS twins in the multivariate models.

From the full ACE multivariate Cholesky decomposition model (−2 log likelihood = 11629.61, df = 9,547) shown in Table 1 (model 1), several parameters could be dropped without a significant decrease in model fit. First, we were able to drop all off-diagonal unique environmental (E) paths and constrain the parameters to be equal for males and females, as explained above (models 2–3). Then, we allowed specific influences from externalizing behaviors to smoking and drug use and tested the general liability (part A of the test sequence, models 4–8). As shown in Table 1, neither the A nor the C influences common to externalizing behaviors and smoking could be dropped, and the A and C influences underlying externalizing behavior and drug use could not be dropped simultaneously. After that, we tested the alternative hypothesis, i.e. allowing general liability from externalizing behaviors to smoking and drug use while testing for specific influences (part B of the test sequence, models 9–15). As seen in Table 1, the A and C effects specific to initiation of illicit drug use could be dropped.

In order to find the most parsimonious model, we then combined parts A and B (models 16–18). Because model 15 had the best fit (AIC = −7508.47) among all models tested so far, we chose it as the starting point for the final testing. Here we tested whether any of the previously non-significant reductions could be made in addition to the reductions already present in this model. The reductions indicated in models 6 and 7 could each be added individually, yet the P values approached significance. Moreover, we found it very difficult to distinguish between these two models, whose difference in AIC was very small, and they offer two opposing interpretations of the factors underlying the association between externalizing behavior and drug use initiation: the first drops all genetic correlation between them and the second drops all shared environmental correlation. Because of these issues, we considered model 15 the most parsimonious one. It has the lowest AIC, and the P value compared to the very first full model was 0.84. Thus, our final model included both A and C paths that were common to externalizing behaviors and drug use initiation.
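The AIC values quoted here are negative, which is what one would expect if AIC is computed as −2 log likelihood minus twice the degrees of freedom. That formula is an assumption on our part (the text does not state it), and the function name below is hypothetical; the sketch simply applies it to the fit statistics reported for the full model and to the restricted, sex-equated model, whose values follow from the reported differences (χ² = 22.77, df = 32).

```python
def aic_from_fit(minus2ll, df):
    """AIC taken as -2 log likelihood minus twice the degrees of freedom.

    Assumed convention, not confirmed by the text: with df in the
    thousands, the result is a large negative number, and lower
    (more negative) values indicate the preferred model.
    """
    return minus2ll - 2 * df


# Full ACE Cholesky model: -2 log likelihood = 11629.61, df = 9,547.
full_model_aic = aic_from_fit(11629.61, 9547)

# Restricted, sex-equated model: fit worsens by 22.77 for 32 extra df.
restricted_model_aic = aic_from_fit(11629.61 + 22.77, 9547 + 32)
```

Under this assumed convention the full model has an AIC of about −7464.39 and the restricted model a lower value, in line with the restricted model being preferred.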

The results of the final model are presented in Fig. 1, which shows the unstandardized path coefficients, and in Table 5, which shows the proportions of phenotypic variance and covariance explained by the additive genetic, common environmental, and unique environmental factors as well as the genetic and environmental correlations. In summary, externalizing behavior is under relatively high genetic influence (56%), whereas initiation of smoking (75%), amount of cigarettes smoked (54%) and initiation of drug use (60%) are mainly under environmental influences shared by the co-twins. Considering the associations between the studied phenotypes, common environmental factors shared within a twin pair explained more than 50% of the covariance between all variables studied. The influence of common genetic factors was strongest for the association between externalizing behaviors and initiation of drug use (49%). Finally, there were no specific additive genetic or common environmental influences on initiation of drug use, as all genetic and common environmental influences were shared with preceding externalizing and smoking behaviors (Fig. 1).

Fig. 1 Multivariate Cholesky decomposition for externalizing behaviour at age of 12, smoking initiation and smoking amount at age of 14, and initiation of illicit drugs at age of 17: unstandardized path coefficients

Table 5 Proportions of phenotypic variance (on the diagonal, in boldface) and covariance (below the diagonal) due to additive genetic (A), common environmental (C), and unique environmental (E) factors from the best-fitting multivariate Cholesky decomposition model for externalizing behavior, smoking initiation, smoking amount and drug use initiation

Discussion

Summary of the results

In the present study, we set out to map the underlying genetic and environmental influences giving rise to the associations between externalizing behaviors in early adolescence and later initiation and use of tobacco and initiation of illicit drug use. In the multivariate models, parameters could be equated for males and females, and all unique environmental influences shared among the four phenotypes could be dropped from the model. To summarize the results of the multivariate models, the heritability was 56% for externalizing behaviors, 20% for smoking initiation, 32% for smoking amount, and 27% for illicit drug use. The corresponding C influences were 32, 75, 54, and 60%. In the best-fitting multivariate model, common environmental influences explained most of the covariance between externalizing behaviors and smoking initiation (69%) and amount (77%). Covariance between smoking initiation/amount and initiation of drug use was due to additive genetic (42/22%) and common environmental (58/78%) influences. About half of the covariance between externalizing behaviors and drug use initiation was due to common genetic influences and half to common environment shared by the co-twins. There were no specific additive genetic or common environmental influences on initiation of drug use, as all genetic and common environmental influences were shared with preceding externalizing and smoking behaviors.

Phenotype prevalence and correlations

Consistent with previous studies, we found that externalizing behaviors were more common in adolescent boys than girls, and that early externalizing problems predicted both tobacco smoking and use of cannabis and other drugs later in adolescence (King et al. 2004; Fergusson et al. 2007; Hayatbakhsh et al. 2008; Kirisci et al. 2009). In the present population-based sample of Finnish twins, more than 40% of boys and girls had some experience with smoking at the age of 14, an estimate that is close to other findings in Finland (Rimpelä et al. 2006). Compared to many other countries, the prevalence of cannabis use has been somewhat lower in Finland (United Nations International Drug Control Programme 1997), and the present estimates of approximately 12% of boys and 15% of girls reporting any use of cannabis or other drugs at the age of 17.5 are also relatively low in international comparison. However, earlier smoking strongly predicted illicit drug use also in the present study, as has been reported in several earlier studies (Vega and Gil 2005; Korhonen et al. 2008). A recent population-based study investigated the interplay of externalizing behavior problems, early-onset cigarette smoking and ever use of cannabis in Dutch adolescents (Korhonen et al. 2010b) and suggested that the influence of externalizing behaviors on cannabis use is often mediated through early-onset cigarette smoking. This finding was partially replicated among Finnish twins (Korhonen et al. 2010a). Although those analyses were adjusted for familial liability to substance dependence, they did not include genetic modeling.

Results of multivariate genetic modeling

Co-occurrence of externalizing behaviors and substance use has been tested earlier using data from 17-year-old twins. Krueger et al. (2002) reported that variance of the externalizing factor was mostly genetic, but both genetic and environmental factors accounted for distinctions among phenotypes. Hicks et al. (2004) investigated symptom counts of conduct disorder, the criteria for antisocial personality disorder, alcohol dependence, and drug dependence in a twin family study. Transmission of a highly heritable general vulnerability to all the externalizing disorders accounted for most familial resemblance, although disorder-specific vulnerabilities were also detected for conduct disorder, alcohol dependence, and drug dependence. A longitudinal multivariate modeling study on conduct problems in childhood and initiation of marijuana use in adolescence (Shelton et al. 2007) revealed that initiation was influenced by genetic, common environmental and unique environmental factors. The findings indicated high heritability of conduct problems per se, whereas the severity of such problem behaviors was more strongly environmentally influenced. Multivariate modeling indicated that conduct problems in childhood and early adolescence made a small but significant contribution to the risk for marijuana use 8 years later. This literature highlights covariance between more extreme, clinical phenotypes, such as conduct disorder and alcohol dependence, whereas our study focused on externalizing behaviors across the population, along with cigarette smoking initiation and amount as well as initiation of illicit drug use. Consequently, common genetic influences were smaller and common environmental influences larger in our study in comparison to those studies with more clinical phenotypes.
Interestingly, in our adolescent population-based twin data, no specific genetic or common environmental variance seems to be needed to explain the initiation of illicit drug use once the genetic and environmental background it shares with earlier observed externalizing and smoking behaviors is taken into account. We consider this a novel finding that needs replication in other data sets.

Gender differences

One aim of this study was to test whether gender modulates the magnitude of genetic influences on externalizing behavior and substance use initiation. We were able to equate the parameters of the multivariate models across gender, indicating no significant quantitative gender differences. We did, however, use gender-specific thresholds for the phenotypes, reflecting significantly different prevalences of the phenotypes studied, especially for externalizing behavior. In addition, univariate modeling with opposite-sex (OS) twin pairs suggested that there may be qualitative differences in the genetic background of externalizing, smoking and illicit drug use. In line with our study, many earlier twin studies on smoking in adolescence have failed to demonstrate significant quantitative gender differences in genetic or environmental influences (Rose et al. 2009). Where gender differences have been reported, they have been inconsistent, potentially at least partly because of random fluctuation in heritability estimates from data sets with limited power.

Methodological issues

Strengths of the present study include the use of a relatively large, population-based sample of adolescents providing prospective data that were tightly standardized for age. In addition, high response rates were obtained in all waves of data collection.

All data are self-reported, albeit at different time points. We based our assessment of externalizing problem behaviors on the MPNI teacher ratings (Pulkkinen et al. 1999), which are less widely used than, e.g., the Child Behavior Checklist (CBCL) by Achenbach (1991). Therefore, our study based on MPNI data may not be fully comparable to studies using the Achenbach questionnaires. Further, we used the sum score of hyperactivity–impulsivity, aggressiveness and inattention, rather than examining only hyperactivity–impulsivity. Because this was one of the first studies of the influences underlying the associations between substance use and early observed externalizing behaviors, we decided to start with a more general phenotype. We acknowledge, however, that future studies could examine more specific features of externalizing behaviors, i.e. whether the genetic and environmental architecture shared with initiation of substance use has its sources, for example, in hyperactivity–impulsivity or aggressiveness.

Considering substance use, the phenotype definition is very important. Although our study focused on substance use initiation, we have discussed above also phenotypes related to more frequent use, abuse and dependence. It is important to remember that the etiology of substance initiation, use, abuse, and dependence may have different aspects (Dick et al. 2011).

We acknowledge as a limitation that we did not calculate confidence intervals for the estimates of our final multivariate model. This is because of the infeasible amount of computer time needed to estimate confidence intervals for multivariate models with ordinal variables, and also because the confidence intervals that Mx provides for ordinal variables may not be very reliable.

Finally, we acknowledge that we may have had limited power, partly because we have only twins and no siblings in the data. We also acknowledge that twins-only data may overestimate genetic effects. As far as we know, limited power usually results in difficulties in detecting the effects of the environment shared by the co-twins. However, as our final models included those effects, we conclude that our data performed well at least in this respect.

Conclusions

Our multivariate genetic modeling of Finnish longitudinal twin data suggests that, in adolescence, the nature of the pathways from externalizing behaviors and cigarette smoking into experimentation with illicit drugs is more strongly influenced by environments shared by the co-twins, whereas genetic factors play a less important role. This inference is consistent with other findings on substance use initiation traits among adolescent populations. Our finding that early observed externalizing behavior seems to provide significant underlying genetic and environmental influences common to later substance use, eventually manifested as initiation of illicit substance use during late adolescence, is novel and offers a challenge for replication in other genetically informative samples.