Illicit substance use in adolescents is a major public health concern. In 2014, approximately 9.4% of 12- to 17-year-olds in a nationally representative sample were found to be current (i.e., past month) substance users (Substance Abuse and Mental Health Services Administration 2015a). Although rates of illicit substance use are higher in adults (Faggiano et al. 2014), understanding this behavior in adolescents is particularly important from a preventative standpoint, for a substantial proportion of individuals who initiate illicit substance use do so during their teenage years (Substance Abuse and Mental Health Services Administration 2015a). Indeed, around 56.6% of first-time marijuana users are adolescents (Substance Abuse and Mental Health Services Administration 2014), as are approximately half of all first-time users of inhalants and hallucinogens (Andersen and Teicher 2009). Furthermore, illicit substance use in adolescence has been associated with several negative course trajectories, such as neurocognitive impairments, academic difficulties, risky sexual behavior, and engagement in criminality (Aebi et al. 2014; Esch et al. 2014; Meier et al. 2012; Sarver et al. 2014; Squeglia et al. 2009).

Understanding the etiology of substance use disorders in adolescence is essential insofar as it may yield promising targets for prevention efforts in this age group. Of direct relevance to this objective is the longstanding nosological question of whether the latent structure of substance use disorders (SUDs) is categorical or dimensional (i.e., continuous) in nature (Helzer et al. 2006; Martin et al. 2008). Dimensional constructs are characterized by multi-causal etiologies (i.e., equifinality), whereas categorical ones tend to emerge from fewer pathogenic processes (Meehl 1977; Meehl and Golden 1982). That is, differences along a continuum are more likely due to the additive influences of multiple factors (for a more thorough discussion, see Ruscio et al. 2006). Consistent with the possibility that SUDs exist along a continuum of severity are some recent indications of equifinality in the development of problematic substance use from adolescence to early adulthood (i.e., multiple etiologies; Nelson et al. 2015). In favor of conceptualizing SUDs as discrete diagnostic entities, however, is the view that such a categorical approach is of practical value, particularly within clinical contexts (Cantwell 1996; Coghill and Sonuga-Barke 2012). That this phenomenological issue regarding the latent structure of SUDs remains largely unresolved is evident in the decision to include both dimensional and categorical elements for classifying SUDs in the most recent revision of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; American Psychiatric Association 2013). More specifically, SUDs are conceptualized within DSM-5 as consisting of multiple taxa (i.e., categories) with continuous variation within each taxon (for a detailed discussion of possible dimensionality within taxa, see Ruscio et al. 2006).

Despite the longstanding philosophical and theoretical interest in this issue, empirical study is required to address it directly (Beauchaine 2003; Sonuga-Barke 1998). Taxometric analysis is a family of statistical techniques specifically designed to evaluate the latent structure of a construct (Meehl 1995, 2004). In contrast to traditional statistical methods (e.g., cluster analysis, latent class analysis), which are more vulnerable to detecting spurious classes (Solomon et al. 2001), taxometric analysis is especially suited for addressing this question because, rather than assuming or imposing a specific latent structure on the data, it evaluates the existing data relative to both continuous and categorical models to determine which they fit better (Beauchaine 2003; Fraley and Waller 1998). This statistical approach has been empirically validated (Ruscio et al. 2004; Waller and Meehl 1998), and has been increasingly applied to various forms of psychopathology, including negative symptoms of schizophrenia (Ahmed et al. 2015), problematic gambling (James et al. 2014), psychopathy (Murrie et al. 2007), attention-deficit/hyperactivity disorder (Marcus and Barry 2011), and particularly depression (Hankin et al. 2005; Liu 2016; Richey et al. 2009).

Most of the taxometric research to date on substance use has focused on problematic alcohol use (Green et al. 2011; Kerridge et al. 2013; Slade et al. 2009; Walters 2008, 2009, 2015; Walters et al. 2010), with the lone exception among these eight studies involving marijuana (Denson and Earleywine 2006). The findings across these studies have been quite mixed, with the cannabis taxometric analysis and one alcohol study reporting support for dimensional latent structures (Denson and Earleywine 2006; Slade et al. 2009), three studies finding support for a taxonic solution (Green et al. 2011; Walters 2008, 2009), and yet another three yielding ambiguous or mixed findings (Kerridge et al. 2013; Walters 2015; Walters et al. 2010). Some researchers have interpreted such mixed findings to be consistent with the view that alcohol use disorders may be taxonic but also include a degree of dimensionality within the taxon (Kerridge et al. 2013). Nonetheless, it cannot be assumed that findings from taxometric studies on problematic alcohol use can be generalized to form inferences regarding the latent structure of other SUDs, particularly in the case of less frequently used and illicit drugs.

There is also a fair degree of homogeneity in the nature of the samples featured in these studies, with three drawing from the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC; Denson and Earleywine 2006; Green et al. 2011; Kerridge et al. 2013), and another four involving individuals with a criminal history (Walters 2008, 2009, 2015; Walters et al. 2010), the latter of which, in particular, limits the generalizability of study findings. Also of note, all of these studies focused on adult samples, and it cannot be assumed that findings in the adult literature are generalizable to adolescents. Past research has found several differences between adolescent-onset and adult-onset SUDs, with one study reporting higher lifetime rates of marijuana and hallucinogen use disorders in the case of adolescent-onset SUDs (Clark et al. 1998). Furthermore, adolescent-onset SUDs appeared to be associated with a worse trajectory, characterized by a shorter time from first use to the development of dependence, a shorter interval between dependence on a first substance and dependence on a second one, and greater psychiatric comorbidity. The neurobiological processes of risk underlying adolescent substance use behavior may similarly differ from those in adults. In particular, adolescents appear to be more sensitive to the positively reinforcing properties, and less sensitive to the aversive properties, of several commonly abused substances (Doremus-Fitzwater et al. 2010). Collectively, these findings suggest that adolescent-onset SUDs may be relatively distinct from their adult-onset counterparts. It is therefore unclear to what degree findings from taxometric studies of adults may be applicable to adolescent SUDs.

The current study aims to build on the existing empirical literature by applying taxometric methods to investigate the latent structure of SUDs in adolescents for four types of substances: marijuana, analgesics, hallucinogens, and inhalants. One potential reason for the relative lack of taxometric studies for substances other than alcohol may be the comparably lower prevalence rates for their use and associated disorders. The past-year prevalence rates in adolescents are approximately 4.7% for analgesics, 1.7% for hallucinogens, and 2.1% for inhalants (Substance Abuse and Mental Health Services Administration 2015b). Marijuana use is a notable exception, with past-year prevalence being estimated to be 13.1%. The past-year prevalence rates for SUDs in this age group are substantially lower still, ranging, for example, from 0.2% each for analgesics, hallucinogens, and inhalants to 2.7% for marijuana. These low prevalence rates pose considerable challenges for conducting taxometric studies, given the sample requirements for valid analysis. In particular, a minimum n of 300 is recommended for taxometric studies to avoid a bias toward spurious findings that may result from smaller samples (Meehl 1995). In the current context, this minimum n of 300 applies to the number of individuals who have used the substance under study. In the case of hallucinogen use in adolescents, with a past-year prevalence of 1.7%, a minimum overall sample size of 17,648 would therefore be required to obtain a subsample of 300 for analysis. The difficulty of conducting taxometric analysis for SUDs in adolescents is compounded by the requirement of taxon base rates of P ≥ 0.1 (Meehl 1995), so as to avoid a bias toward a dimensional solution. In situations where lower taxon base rates are present, a much larger overall sample is required to ensure enough cases of the putative taxon to detect its presence in the data (Holland et al. 2010).
The current study, with data from the National Survey on Drug Use and Health (NSDUH) pooled across the years 2004 to 2013, is uniquely suited to address these challenges.
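
The sample-size arithmetic described above can be illustrated with a short sketch. It is not part of the original analyses; the function name `required_overall_n` is hypothetical, and the sketch assumes that users of a substance are spread through the overall sample at exactly the past-year prevalence rates quoted in the text.

```python
import math

# Past-year prevalence of use among adolescents, as quoted in the text.
PAST_YEAR_USE = {
    "marijuana": 0.131,
    "analgesics": 0.047,
    "hallucinogens": 0.017,
    "inhalants": 0.021,
}

MIN_USERS = 300  # minimum n recommended for taxometric analysis (Meehl 1995)


def required_overall_n(min_users: int, prevalence: float) -> int:
    """Smallest overall sample expected to contain `min_users` users.

    Hypothetical helper: rounds up because a fractional respondent
    cannot be sampled.
    """
    return math.ceil(min_users / prevalence)


required = {
    substance: required_overall_n(MIN_USERS, rate)
    for substance, rate in PAST_YEAR_USE.items()
}

for substance, n in required.items():
    print(f"{substance}: overall n of at least {n}")
```

For hallucinogens this reproduces the figure of 17,648 given in the text. The same calculation applied to a putative taxon with a low base rate shows why a much larger overall sample is needed in that situation.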

Method

Participants

The NSDUH is a nationally representative survey conducted annually by the Substance Abuse and Mental Health Services Administration (SAMHSA) to provide national estimates of the prevalence of substance use and disorders. The NSDUH uses a multi-stage area probability sampling design with participants aged 12 and older within all 50 U.S. states and the District of Columbia. African-Americans, Hispanics, and youth were intentionally oversampled to increase the precision of estimates for these groups. Participants include individuals living in households, shelters, half-way houses, group homes, rooming or boarding houses, college dormitories, and military bases. Informed consent was obtained from all respondents. The overall sample in the current study was restricted to respondents aged 12 to 17 (unweighted n = 181,573; see Footnote 1). This sample was 48.92% female, with a mean age of 14.54 years (SE = 0.01), 58.28% non-Hispanic white, 14.80% non-Hispanic black, 19.52% Hispanic, 4.33% Asian, 0.60% Native American, 0.34% Hawaiian and Pacific Islander, and 2.13% multiracial.

Only respondents who endorsed using a substance at least once in the past 12 months (or on more than five days in the case of marijuana) were queried about the DSM-IV symptoms of substance abuse and dependence for that substance over the same 12-month period. As taxometric methods require response data for all symptom questions, only respondents who had used a given substance were included in the taxometric analyses for that substance. In the current study, this resulted in unweighted subsamples of 17,517 for marijuana, 10,336 for analgesics, 4,900 for hallucinogens, and 5,796 for inhalants (see Footnote 2).

Measure

The NSDUH interview modules for substance abuse and dependence over the past 12 months follow the DSM-IV criteria and include dichotomous response options, with a total of 15 items for analgesics, and 14 each for marijuana, hallucinogens, and inhalants. Rather than substance abuse being pre-empted by substance dependence as specified in DSM-IV, the symptom data for both abuse and dependence were treated inclusively in the analyses. This approach is consistent with that adopted in prior taxometric studies on SUDs (e.g., Kerridge et al. 2013; Slade et al. 2009; Walters 2008).

Data Analysis

Taxometric analysis requires several indicators, each reflecting a relatively distinct facet of the construct of interest. Therefore, following standard taxometric procedures (Beauchaine 2003; Cole 2004; Haslam 2003), multiple indicators were constructed from the symptom items for each SUD. Specifically, these indicators were constructed by first factor analyzing the abuse and dependence items for each substance using oblique (promax) rotation (Preacher and MacCallum 2003). For each substance, a four-factor (i.e., four-indicator) solution was supported by parallel analysis (Horn 1965; Weng and Cheng 2005) on 1,000 sets of random data (O’Connor 2000). This factor-analytic approach to indicator construction has been frequently adopted in prior taxometric studies (Denson and Earleywine 2006; Marcus et al. 2004; Ruscio et al. 2004). Each four-indicator solution was assessed for nuisance covariance. That is, the indicator correlations should be notably smaller within the putative taxon (i.e., a group of substance users qualitatively different from the rest of the sample) and complement (i.e., the sample excluding qualitatively different substance users), respectively, than within the full sample (Ruscio et al. 2006). In each case this condition was satisfied, with full sample rs ≥ 0.31 and taxon and complement rs ≤ 0.04 (see Footnote 3). Valid indicators are able to discriminate between putative taxa and their complements at d ≥ 1.25 (Meehl 1995). This condition was satisfied for the indicators for all four substances. Indicator correlations and validity are summarized in Table 1.
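
The logic of parallel analysis mentioned above can be sketched as follows. This is an illustrative simplification rather than the procedure used in the study: it assumes eigenvalues of the ordinary correlation matrix, normally distributed random comparison data, and a 95th-percentile retention threshold, none of which are specified in the text. The function name `parallel_analysis` and all of its parameters are hypothetical.

```python
import numpy as np


def parallel_analysis(data, n_sets=1000, percentile=95, seed=0):
    """Horn-style parallel analysis (illustrative sketch).

    Retains the leading components whose observed eigenvalues exceed
    the chosen percentile of eigenvalues from random data of the same
    shape. Returns (n_retained, observed_eigs, threshold_eigs).
    """
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    rng = np.random.default_rng(seed)

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    # Eigenvalues from n_sets random data sets with the same n and k.
    random_eigs = np.empty((n_sets, k))
    for i in range(n_sets):
        noise = rng.standard_normal((n, k))
        random_eigs[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = np.percentile(random_eigs, percentile, axis=0)

    # Count leading eigenvalues above threshold, stopping at the first failure.
    exceeds = observed > threshold
    n_retained = k if exceeds.all() else int(np.argmin(exceeds))
    return n_retained, observed, threshold
```

Because the NSDUH symptom items are dichotomous, an applied analysis would likely need comparison data and correlations suited to binary items; the sketch is meant only to show why a factor is retained when its eigenvalue exceeds what random data of the same size would produce.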

Table 1 Summary of taxometric analyses

Taxometric analysis involves the implementation of multiple mathematically non-overlapping procedures yielding non-redundant results. Each procedure functions as a consistency test for the others. Agreement across multiple procedures lends greater confidence to the conclusions drawn regarding the latent structure of the construct of interest. Two distinct taxometric procedures were adopted in the present study: MAMBAC (mean above minus below a cut; Meehl and Yonce 1994) and MAXEIG (maximum eigenvalue; Waller and Meehl 1998).

MAMBAC requires at least two valid indicators, one functioning as the input indicator and another as the output indicator. For each pair of indicators, a graph is plotted with the difference in mean scores of the output indicator above and below a sliding cut-off score on the input indicator on the y-axis and the input indicator cut-points on the x-axis. This procedure is repeated for every possible combination of indicators. Each indicator in a pair alternates as the input and output indicator, and consequently two graphical MAMBAC plots are produced for each indicator pair. In the current study, 50 cuts were made along each input indicator. A single MAMBAC curve is generated from the averaged results of these analyses.
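
The MAMBAC procedure described above can be sketched in a few lines. This is a simplified illustration, not the code used in the study: the function names are hypothetical, cuts are placed at evenly spaced positions among cases sorted on the input indicator, and the number of cases set aside beyond the extreme cuts (`n_end`) is an assumed parameter not given in the text.

```python
from itertools import permutations

import numpy as np


def mambac_curve(input_ind, output_ind, n_cuts=50, n_end=25):
    """Mean of the output indicator above minus below each sliding cut."""
    order = np.argsort(input_ind, kind="stable")
    sorted_out = np.asarray(output_ind, dtype=float)[order]
    n = sorted_out.size
    # Evenly spaced cut positions, keeping n_end cases beyond each extreme cut.
    cuts = np.linspace(n_end, n - n_end, n_cuts).astype(int)
    return np.array([sorted_out[c:].mean() - sorted_out[:c].mean() for c in cuts])


def averaged_mambac(indicators, n_cuts=50, n_end=25):
    """Average the curves from every ordered (input, output) indicator pair."""
    indicators = np.asarray(indicators, dtype=float)
    k = indicators.shape[1]
    curves = [
        mambac_curve(indicators[:, i], indicators[:, j], n_cuts, n_end)
        for i, j in permutations(range(k), 2)
    ]
    return np.mean(curves, axis=0), len(curves)
```

With four indicators, the ordered pairs number 4 × 3 = 12, which is the count of MAMBAC curves per substance. A peaked averaged curve is the pattern typically associated with taxonic structure, whereas a concave (dish-shaped) curve is typically associated with dimensional structure.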

MAXEIG differs from MAMBAC in requiring at least three indicators. One indicator, again, serves as the input indicator, and the interrelationship between the remaining indicators is analyzed in a series of overlapping “windows” (i.e., subsamples) ordered along the input indicator. Based on optimal analysis parameters (Walters and Ruscio 2010), the current sample was split into 25 windows with 90% overlap between adjacent windows. The covariance matrix for the output indicators (variance values are replaced with 0’s so that only covariances remain) in each window is factor analyzed. The resulting eigenvalue for the first principal factor is then plotted on the y-axis of a graph with the windows of the input indicator on the x-axis. This procedure is repeated with each indicator designated, in turn, as the input indicator.
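
A minimal sketch of the MAXEIG steps described above follows. It is an illustration under stated assumptions rather than the implementation used in the study: the function name `maxeig_curve` is hypothetical, the window size is derived from the number of windows and the overlap so that the windows span the full sample, and at least two output indicators are assumed.

```python
import numpy as np


def maxeig_curve(input_ind, output_inds, n_windows=25, overlap=0.90):
    """Largest eigenvalue of the output-indicator covariance matrix
    (diagonal zeroed) within overlapping windows along the input indicator."""
    order = np.argsort(input_ind, kind="stable")
    sorted_out = np.asarray(output_inds, dtype=float)[order]
    n = sorted_out.shape[0]

    # n = size + (n_windows - 1) * size * (1 - overlap), solved for size.
    size = int(n / (1 + (n_windows - 1) * (1 - overlap)))
    starts = np.linspace(0, n - size, n_windows).astype(int)

    eigenvalues = []
    for start in starts:
        cov = np.cov(sorted_out[start:start + size], rowvar=False)
        np.fill_diagonal(cov, 0.0)  # keep covariances only
        eigenvalues.append(np.linalg.eigvalsh(cov)[-1])  # largest eigenvalue
    return np.array(eigenvalues)
```

Calling the function once per indicator, with that indicator as the input and the remaining ones as outputs, and averaging the resulting curves would mirror the rotation of input indicators described in the text.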

For both taxometric procedures, simulated dimensional and taxonic comparison data were generated, approximating all distributional properties of the research data that may influence the shape of taxometric curves in the graphical outputs. Comparing the research data to simulated dimensional and taxonic data with identical statistical properties provides a much more accurate comparison than does an idealized, prototypical model. For this reason, the simulated data were identical to the research data in the surface-level statistical properties of the taxometric indicators (e.g., sample size, mean, standard deviation, indicator skew, and inter-indicator correlations), differing only in latent structure. Data for each model (i.e., dimensional and taxonic) were simulated 100 times to approximate their sampling distributions for each of the two taxometric procedures used in the current study. A comparison curve fit index (CCFI; Ruscio et al. 2007) was then used to determine objectively the extent to which the research data matched the simulated dimensional and taxonic data. CCFI values range from 0 (dimensional structure) to 1 (taxonic structure), with 0.50 being equally supportive of dimensional and taxonic structures (Ruscio et al. 2010). CCFI values falling between 0.45 and 0.55 must be interpreted with some measure of caution, as they are reflective of ambiguous results (Walters and Ruscio 2013). The CCFI is a relatively recent development in taxometric research, but an important one because it appears to reduce the likelihood of spurious taxa (Haslam et al. 2012). All analyses were conducted using Ruscio's (2012) taxometric programs in MRO 3.2.5.
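
The CCFI can be illustrated with a brief sketch. It assumes the index is the ratio of the root-mean-square residual between the research curve and the averaged simulated dimensional curve to the sum of that residual and the corresponding residual for the averaged simulated taxonic curve, which yields 0 for a perfect dimensional match and 1 for a perfect taxonic match, consistent with the range described above. The function names and the inclusive handling of the 0.45 and 0.55 boundaries are assumptions made for illustration.

```python
import numpy as np


def rmsr(curve_a, curve_b):
    """Root-mean-square residual between two curves of equal length."""
    diff = np.asarray(curve_a, dtype=float) - np.asarray(curve_b, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def ccfi(research_curve, sim_dim_curves, sim_tax_curves):
    """Comparison curve fit index: 0 = dimensional, 1 = taxonic.

    Undefined if the research curve matches both averaged simulated
    curves exactly (both residuals would be zero).
    """
    fit_dim = rmsr(research_curve, np.mean(sim_dim_curves, axis=0))
    fit_tax = rmsr(research_curve, np.mean(sim_tax_curves, axis=0))
    return fit_dim / (fit_dim + fit_tax)


def interpret_ccfi(value, lower=0.45, upper=0.55):
    """Label a CCFI value using the ambiguity band described in the text."""
    if value < lower:
        return "dimensional"
    if value > upper:
        return "taxonic"
    return "ambiguous"
```

Here `sim_dim_curves` and `sim_tax_curves` would each hold one curve per simulated data set (100 rows in the current design), produced by the same taxometric procedure that generated `research_curve`.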

Results

MAMBAC analyses yielded 12 curves for each SUD. The averaged graphical output for these curves relative to simulated dimensional and taxonic MAMBAC data is depicted for marijuana, analgesics, hallucinogens, and inhalants in the top panels of Figures 1, 2, 3, and 4, respectively. As seen in Table 1, support for a dimensional latent structure was found for all four substances, as indicated by CCFI values ≤ 0.317. MAXEIG procedures similarly produced data that more closely matched simulated dimensional than taxonic data (see the bottom panels of Figures 1, 2, 3, and 4), with CCFI values ≤ 0.31. Finally, mean CCFIs for all four substances across both taxometric procedures were consistent with a dimensional latent structure (CCFI values ≤ 0.30). CCFI values for all analyses are summarized in Table 1.

Fig. 1

Taxometric results for marijuana, with sample data shown relative to simulated taxonic and dimensional data. In each graph, the average curve for the sample data is represented by a dark line, with the gray area reflecting the middle 50% of the simulated values, and the light lines indicating the minimum and maximum simulated values at each data point. The top panels illustrate results for averaged MAMBAC curves, and the bottom panels depict results for averaged MAXEIG curves.

Fig. 2

Taxometric results for analgesics, with sample data shown relative to simulated taxonic and dimensional data. In each graph, the average curve for the sample data is represented by a dark line, with the gray area reflecting the middle 50% of the simulated values, and the light lines indicating the minimum and maximum simulated values at each data point. The top panels illustrate results for averaged MAMBAC curves, and the bottom panels depict results for averaged MAXEIG curves.

Fig. 3

Taxometric results for hallucinogens, with sample data shown relative to simulated taxonic and dimensional data. In each graph, the average curve for the sample data is represented by a dark line, with the gray area reflecting the middle 50% of the simulated values, and the light lines indicating the minimum and maximum simulated values at each data point. The top panels illustrate results for averaged MAMBAC curves, and the bottom panels depict results for averaged MAXEIG curves.

Fig. 4

Taxometric results for inhalants, with sample data shown relative to simulated taxonic and dimensional data. In each graph, the average curve for the sample data is represented by a dark line, with the gray area reflecting the middle 50% of the simulated values, and the light lines indicating the minimum and maximum simulated values at each data point. The top panels illustrate results for averaged MAMBAC curves, and the bottom panels depict results for averaged MAXEIG curves.

Discussion

The goal of the current study was to provide the first taxometric analysis of SUDs in adolescents, thereby directly addressing the question of whether SUDs in this age group are best conceptualized as dimensional or categorical clinical phenomena. It is also the first such study to investigate the latent structure of SUDs for analgesics, hallucinogens, and inhalants, and extends prior taxometric findings on marijuana SUDs in adults (Denson and Earleywine 2006) to adolescents. Across the analyses, based on mathematically non-redundant procedures, the results provided convergent and unambiguous evidence in support of a dimensional latent structure for SUDs in the case of all four substances under study. That is, these four SUDs appear to exist along natural continua of severity and do not possess natural cut-points qualitatively distinguishing those with from those without these disorders. Lending confidence to the consistency of these results, the large sample size in the current study ensured that a sufficient number of cases within the putative taxon for each SUD would be present to be detected in the analyses, even at a very low base rate. This point is particularly worth noting, as inadequate representation of a putative taxon may bias analyses toward a dimensional solution.

These findings are congruent with the theoretical conceptualization of SUDs as having multi-determined etiologies (Dierker et al. 1997; Stein et al. 1987; Tsuang et al. 2001), and are thus reflective of the challenge involved in modeling and predicting risk for these disorders. As such, they are indicative of the need to move beyond traditional analytical approaches for characterizing risk for SUDs toward a consideration of more nuanced computational methodologies (e.g., machine learning; Whelan et al. 2014). In addition to more accurate prediction of SUDs, such methodologies may advance our theoretical understanding of the different etiological pathways through which risk for these outcomes may arise. Alternatively, if the SUDs examined in the present study in fact possess taxonic latent structures, the current findings are not necessarily invalid. They would still be able to inform our theoretical understanding of SUDs, suggesting instead that the symptoms of substance abuse and dependence, as currently conceptualized in the DSM, may not accurately reflect these phenomena. One strategy for ruling out this potential explanation for the dimensional latent structures of the SUDs in the current study would be to conduct partial replication studies involving taxometric analyses with other potential indicators of problematic substance use. For example, clinically relevant indicators may first be identified by determining which aspects of substance use behaviors, beyond those that constitute DSM criteria, are associated with clinically meaningful outcomes (i.e., impairment), and these clinical indicators may then be submitted to taxometric analysis. Consistency in taxometric findings across several different approaches to indicator construction would lend weight to the theoretical importance of conceptualizing SUDs as phenomena existing along continua of clinical severity rather than as discrete diagnostic entities.

The current findings also have direct implications for current diagnostic classification systems for SUDs. Although the movement from purely categorical distinctions of SUDs in DSM-IV to a multi-taxonic system with elements of dimensionality within individual taxa in DSM-5 appears to be a definitional improvement for the study and clinical assessment of these disorders, the consistent evidence of a continuous latent structure across multiple substances suggests that a movement toward complete dimensionality may be warranted (i.e., observing the distinction between each substance, but conceptualizing each as occurring along a continuum of severity). This is important because the creation of arbitrary taxa out of purely dimensional constructs results in a corresponding reduction in statistical power (Cohen 1983; MacCallum et al. 2002) and measurement precision (Ruscio and Ruscio 2002), and in some cases may produce spurious statistical findings (Maxwell and Delaney 1993).

Also of note, the finding in the current study that SUDs exist along continua of severity has potential clinical implications. That is, the current findings suggest that attempts to dichotomize adolescent substance use in terms of severity create largely artificial and arbitrary distinctions. Relying solely on a diagnostic cut-off value may consequently create the risk of milder but still clinically important symptom presentations being undertreated (Coghill and Sonuga-Barke 2012). Thus, the practice of conceptualizing clinical significance in terms of diagnostic criteria should be viewed with a degree of caution. Although clinical categorization of substance use is still of potential practical utility for prevention and treatment efforts, care should be taken in clinical settings not to mistake it as indicative of an actual qualitatively meaningful distinction. Additionally, insofar as the SUDs examined in the current study have multi-factorial etiologies, as would be consistent with their dimensional latent structures, it appears to be inadvisable to rely predominantly on clinical cut-offs for prognostic, treatment, or intervention determinations. Instead, a more promising approach may be to consider symptom severity together with other important factors that may be associated with risk (e.g., poor academic performance; Wills et al. 2016; sensation-seeking; Charles et al. 2016; emotion regulation; Wills et al. 2016; stressful life events; Charles et al. 2015; Wills et al. 2016) in making clinical decisions. For example, if several of these risk factors are present, clinical intervention may be indicated even if an adolescent currently engages in relatively mild substance use that does not meet the relevant clinical cut-off score.

In addition to its large sample size, the interview-based assessment of DSM-IV SUD symptoms is a significant strength of this study, for self-report symptom measures have been previously shown to lead to biased findings in taxometric analysis (Beauchaine and Waters 2003; Haslam et al. 2012; Ruscio et al. 2009). Nevertheless, several limitations should be noted. First, this study focused on DSM-IV SUD symptoms. Despite their considerable overlap with DSM-5 SUD symptoms, the degree to which current results would hold for DSM-5 SUDs is unclear. It is also possible that operationalizing substance use severity in a manner that extends beyond DSM symptoms (e.g., frequency of use) may yield different results. Evidence of dimensionality in studies utilizing different taxometric indicators would therefore lend greater confidence in the current findings. Second, although support for a dimensional latent structure was uniformly found for all four substances under study, it cannot be assumed that these findings are generalizable to SUDs for other substances. This is especially evident when one considers significant differences across classes of drugs (e.g., “hard” versus “soft”, stimulants versus sedatives and opiates), including DSM-IV and DSM-5 symptom criteria (i.e., the presence of withdrawal symptoms for some substances but not others). Therefore, future taxometric research with other substances (e.g., heroin) is warranted. Third, it is possible that SUDs are more likely to be dimensional in adolescence and taxonic in adulthood inasmuch as their underlying risk factors crystallize in the transition to adulthood (cf. Beauchaine 2003; Cole et al. 2008). Thus, it cannot be assumed that the current findings are generalizable to adult populations. 
Finally, as has been noted of prior empirical inquiries into the latent structure of mental illness in children (Coghill and Sonuga-Barke 2012), the current focus on symptom-level data excludes from consideration other meaningful units of analysis (e.g., neurophysiological). The inclusion of such data would be particularly important in future investigations.