Introduction

Substance use disorders (SUDs) are a worldwide public health emergency (Gaertner et al., 2022; Kariisa et al., 2022), with an annual prevalence of 2.20% (Rehm & Shield, 2019) and 100,000 fatal overdoses in the United States every year (National Center for Drug Abuse Statistics, 2023; Pearce et al., 2020). Yet only 11–12% of those with SUDs receive treatment annually (Han et al., 2017; Substance Abuse & Mental Health Services Administration, 2022), and during treatment, abstinence remains an unreliable and infrequent outcome. In fact, abstinence rates range from 36% to 56% at treatment completion and from 16% to 53% at 3- to 6-month follow-ups, sometimes no different from controls (see Benishek et al., 2014 for a meta-analysis; Daigre et al., 2021; Frimpong et al., 2016; Pagano et al., 2013). Harm reduction, which focuses on functional improvement (Rosenberg et al., 2020), also shows unreliable success, as only 35–58% of those with SUDs ever achieve remission in their lifetime, irrespective of achieving abstinence (Dennis et al., 2005; see Fleury et al., 2016 for a meta-analysis). [Footnote 1] The health consequences of SUDs and their variable treatment outcomes put those with SUDs in an incredibly vulnerable position. I aimed to improve these outcomes by examining a potentially overlooked nuance of the role of emotion regulation problems (ERP) in SUD maintenance and treatments [Footnote 2]: although implicating generalized ERP (G-ERP) in SUDs is important, considering (substance use) disorder-specific ERP (DS-ERP) may be essential.

Growing evidence implicates ERP (Kober, 2014; Weiss et al., 2022) in the etiological roots (see Stellern et al., 2023 and Weiss et al., 2022 for meta-analyses), mechanistic characteristics (see Gratz et al., 2015 and Sloan et al., 2017 for systematic reviews), and co-occurrences of SUDs (see Lai et al., 2015 for a meta-analysis). Notably, evidence suggests that ERP may be a risk factor across numerous spectra of psychopathology (e.g., Aldao et al., 2010; Sheppes et al., 2015), including predicting SUDs from adolescence into adulthood (Siegel, 2015). Furthermore, those with SUDs tend to have more severe ERP (Stellern et al., 2023) and robust evidence supports the role of ERP in maintaining SUDs after disorder onset (Cheetham et al., 2010); neuroscientific evidence corroborates these conclusions (Wilcox et al., 2016). These studies provide an empirical foundation for targeting ERP in treatments (Garland, 2021; Stotts & Northrup, 2015), but the results remain largely variable (Zamboni et al., 2021). As such, this growing evidence base for implicating ERP in SUDs has much potential for future research and improving the lives of those with SUDs, but explanations for this large degree of treatment outcome and research effect size variability remain sparse and uncharacterized.

Given this potential, researchers conducted two meta-analyses examining ERP in SUDs. Stellern et al. (2023) examined 22 studies (n = 3503) and found that those with SUDs reported substantial ERP compared to controls (Hedges’ g = 1.05). Furthermore, Weiss et al. (2022) examined 95 studies (n = 156,025) and found a weak relationship (r = .190) between the two constructs, but the effects varied widely across studies. Despite these effects, a potential missing nuance and explanation of the treatment and research variability is a lack of consideration for DS-ERP for SUDs. G-ERP is a transdiagnostic mechanism (Cludius et al., 2020), but emerging evidence suggests that specific ERP vary markedly across disorders, including SUDs (Sorgi‐Wilson & McCloskey, 2022). For example, the meta-analyses by Stellern et al. (2023) and Weiss et al. (2022) examined 114 studies [Footnote 3] (n = 156,025), yet 88.60% of those studies (90.46% of participants) used the Difficulty in Emotion Regulation Scale (DERS; Gratz & Roemer, 2004), a measure of six processes involved in G-ERP. As such, an exclusive focus on non-substance-specific constructs may attenuate effect sizes in research and treatment effectiveness when applied to SUDs. Currently, the literature only supports implicating G-ERP in SUD treatment (e.g., Stellern et al., 2023; Weiss et al., 2022), warranting further investigations that move beyond the saturation of the DERS and G-ERP as the essential construct for SUDs. A new focus on DS-ERP may provide context for the behaviors, motivations, and emotional experiences occurring within those with SUDs.

Transdiagnostic and SUD-specific treatments that focus on G-ERP [Footnote 4], including cognitive-behavioral therapy (CBT), acceptance and commitment therapy (ACT), and dialectical behavior therapy (DBT), generally show small effect sizes, sometimes no different from controls (Lee et al., 2015; Stotts & Northrup, 2015; Vujanovic et al., 2017), whereas treatments focused on DS-ERP, such as motivational interviewing (MI; Apodaca & Longabaugh, 2009) and contingency management (CM; Witkiewitz et al., 2022), show more robust and consistent effect sizes (see Sayegh et al., 2017 for a meta-analysis). The discrepancy between treatments that target G-ERP and DS-ERP may stem from a newly defined construct called substance-induced emotion regulation (SIER), which I defined as a type of DS-ERP involving substance use to force changes in emotional states to improve subjective functioning and experiences (Stone, 2023). I explained my hypothesized theoretical rationale for the discrepancies in treatment outcomes and study effect sizes that focus exclusively on G-ERP in Fig. 1. The existing narrative in the literature is that G-ERP is the mechanism that leads to SUDs, which I displayed in Fig. 1A. Many treatments target G-ERP to redirect the response to emotions away from substances that cause impairment and toward an adaptive behavioral outcome, such as emotion regulation skills, as displayed in Fig. 1B. However, G-ERP as a mechanism of change for SUDs may be an incomplete representation of the maintenance of SUDs.

Fig. 1

Standard and Alternative Conceptualization of ERP in SUD Maintenance and Treatments. Note. A Standard conceptualization of G-ERP (i.e., a problem) contributing to SUDs. Researchers suggested a direct causal effect of G-ERP on SUDs. B Standard conceptualization of treatments targeting G-ERP (e.g., DBT) to redirect reactions to strong emotions from substance use to adaptive emotion regulation (ER) skills as a behavioral outcome. Adaptive ER skills may be less likely to lead to impairment than substance use, and this redirection reduces the need to use, thereby reducing SUD severity. C Alternative conceptualization of ERP, where G-ERP does not directly cause SUDs; rather, the impairment arises from using substances to force changes in emotional states to improve subjective functioning and experiences, called substance-induced emotion regulation (SIER), as a solution to life problems and strong negative emotions. G-ERP elicits a DS-ERP that leads to SUDs, but the life problems are not from G-ERP; rather, they are from the specific ERP causing the life problems (i.e., DS-ERP), simultaneously leading to impairment from substance use. D Conceptual explanation of why treatments focusing exclusively on G-ERP demonstrate variability and small effect sizes; ignoring DS-ERP may neglect the effect of using substances to improve functioning, partially supporting adaptive ER skills and reducing the motivation to use substances. However, substances remain appealing when neglecting to target a DS-ERP. E Alternative conceptualization of directly reducing substance use to improve functioning by replacing the substance use solution with a non-substance use alternative solution (e.g., inviting a friend instead of using alcohol to reduce social anxiety before a speech), eliminating the causal effect of both G-ERP and DS-ERP on impairment. F Alternative conceptualization of treatments exclusively focused on changing SIER (e.g., CM) to reduce impairment by replacing substance use with non-substance alternative behaviors. Individuals may improve irrespective of G-ERP.

I suggested that ERP may not directly contribute to SUDs, as the issues and symptoms individuals with SUDs experience may not come directly from ERP; instead, using substances to force emotion regulation (i.e., in an attempt to adapt to stressors, cope, or enhance emotional experiences) may lead to impairment, as shown in Fig. 1C. When treatments focus exclusively on G-ERP, the effect may be split or diluted, leading to adaptive ER skills and reducing the need to use substances, as displayed in Fig. 1D. This approach may not address the efficiency, effectiveness, and accessibility of substance use to regulate emotions (e.g., alcohol use in PTSD; Tripp et al., 2019), as many report that substances are a quick, easy, and effective means of changing their emotional states (Biolcati & Passini, 2019; Stone, 2023) – thereby enticing people to return to a quicker or more effective option to deal with life problems as evinced by low treatment interest (Rogers et al., 2019), high relapse rates (Bradizza et al., 2006), and unfavorable attitudes toward abstinence (Connor et al., 2021). Treatments that address or exclusively focus on eliciting alternative non-substance-related behavioral options or solutions to problems, such as MI or CM, may remove or replace the motivation to use substances to emotion regulate or solve life problems, thereby reducing impairment, as displayed in Fig. 1E and 1F. See Supplementary Fig. 1 for an example of how these processes manifest in patients.

These conceptual frameworks remain untested, despite insufficient explanations for (1) why MI and CM produce larger effects than CBT, ACT, and DBT for SUDs, (2) the high variability in effects between G-ERP and SUDs, (3) the saturation of research focused on G-ERP, which is at odds with emerging research on DS-ERP, (4) notable relapse rates, and (5) reluctance toward abstinence. I suggested a partial explanation for these issues by providing proof-of-concept [Footnote 5] evidence to shift away from G-ERP, as measured by the DERS, toward DS-ERP and SIER. I collected data on SUD-related constructs, the DERS, and SIER – a previously unmeasured construct. I compared the G-ERP and DS-ERP to find evidence that the DS-ERP measures a more relevant, nuanced construct. I hypothesized that (1) the DS-ERP can better account for SUD variation, (2) the DS-ERP more effectively distinguishes individuals with substance use problems (SUPs) from controls, and (3) the DS-ERP can fully account for the relationship between the G-ERP and SUDs, as displayed in Fig. 2, thereby providing proof-of-concept evidence that implicating G-ERP in the maintenance and treatment of SUDs may have more nuances than currently presented in the literature.

Fig. 2

Hypothesized Structural Model. Note. Although the model is recursive, directional paths do not necessarily imply causation 

Method

Participants

Participants were undergraduate students (N = 266) from Southern Illinois University who completed the study as part of their Introduction to Psychology course requirements. I removed individuals who failed any of the three attention checks (e.g., please select true; n = 33; 12.40%) or did not complete the entire study (n = 35; 13.25%). The final sample (n = 198) carefully attended to all the items and contained no missing data. Therefore, all analyses were adequately powered. [Footnote 6] I displayed the demographic information in Table 1. I have posted all data, scripts, and materials on this Open Science Framework page: https://doi.org/10.17605/OSF.IO/4FWZR.

Table 1 Demographic characteristics

Procedure

I used Qualtrics (Qualtrics LLC, Provo, UT, USA) to conduct the study. Participants volunteered via SONA Systems (SONA Systems, Ltd., Tallinn, Estonia) in the Fall of 2022. Participants completed an informed consent form, a demographic form, the study measures in a randomized order, and a debriefing, which together took no longer than 30 minutes. The Institutional Review Board at Southern Illinois University approved this study (Protocol: 22121; Assurance: FWA00005334).

Measures

I displayed the descriptive statistics in Tables 2 and 3. I did not eliminate outliers because they were all valid cases (Orr et al., 1991). There were no univariate non-normal distributions (|γ| ≥ 2 or |κ| ≥ 7; Hair et al., 2010). However, Mardia’s test revealed high multivariate right-skewness, γ = 1161.16, and multivariate leptokurtosis, κ = 6.34, ps < 0.001.

Table 2 Descriptive statistics and internal consistencies
Table 3 Raw bivariate Pearson correlations

Demographic Questionnaire

Participants completed a demographic questionnaire, which included questions about their age, gender, race, ethnicity, and substance use.

Difficulty in Emotion Regulation Scale (DERS)

The DERS is a multidimensional measure of processes of G-ERP (Gratz & Roemer, 2004). Participants rated 36 items on a 5-point scale from almost never (1) to almost always (5), such as item 23 (p. 48; “when I am upset, I feel out of control”). The scale has good concurrent, construct, and structural validity and contains a higher-order general factor. Higher scores indicate more G-ERP. The scale demonstrated good internal consistency in the current study, α = .925 (M = 89.30, SD = 21.10).
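
As a hedged illustration of the internal consistency estimates reported for the measures, the following minimal sketch computes Cronbach's α from a respondents-by-items matrix. The function name `cronbach_alpha` and the simulated responses are assumptions made for this example; they are not the study's data or scripts (the author reports using SPSS and R).

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a 2-D array (rows = respondents, columns = items)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed score
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Hypothetical data: six items that share one common trait plus noise.
rng = np.random.default_rng(0)
trait = rng.normal(size=(200, 1))
responses = trait + rng.normal(scale=1.0, size=(200, 6))
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}")
```

Items that share more common variance push the summed-score variance above the sum of the item variances, which is what moves α toward 1.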

Enthusiastic Substance Use Attitudes Scale (ESUAS)

The ESUAS is a multidimensional measure of the various reasons individuals report using substances (Stone, 2023). Participants rated 18 items on a 5-point scale on how the statements apply to them, from not at all (1) to very much (5), such as item 8 (p. 15; “substances help me stay calm”). Higher scores indicate stronger motivation to use substances for specific purposes. I used the substance-induced emotion regulation (SIER) subscale, which measured the use of substances to force changes in emotional states to improve subjective functioning and experiences, which demonstrated excellent internal consistency, structural validity, and construct validity. The scale demonstrated good internal consistency in the current study, α = .929 (M = 26.14, SD = 11.51).

Drug Abuse Screening Test-10 Item (DAST-10)

The DAST-10 is a unidimensional measure of SUD severity (Skinner, 1982). Participants rated ten items on a dichotomous yes or no scale, such as item 1 (p. 365; “Have you used drugs other than those required for medical purposes?”). Higher scores indicate a greater risk of developing a SUD and categorize individuals into five risk groups: no risk (0), low risk (1–2), moderate risk (3–5), substantial risk (6–8), and severe risk (9–10). The DAST-10 has demonstrated excellent psychometric properties (Yudko et al., 2007).

Drug Use Disorders Identification Test (DUDIT)

The DUDIT is a unidimensional measure of SUD severity (Berman et al., 2002). Participants rated 11 items on five- and three-point scales, such as item 2 (p. 1; “do you use more than one type of drug on the same occasion”). Higher scores indicate more severe SUD symptoms. The scale has demonstrated excellent psychometric properties across many studies (Hildebrand, 2015).

Software

I conducted the descriptive statistics and inferential tests using SPSS v.29 (IBM Corp, 2023) and the SEM in R v.4.2.2 (R Core Team, 2022) with the following packages: lavaan v.0.6–12 (Rosseel, 2012), lavaanPlot v.0.6.2 (Lishinski, 2021), semTools v.0.5–6 (Jørgensen et al., 2022), MVN v.5.9 (Korkmaz et al., 2014), and semPower v.1.2.0 (Moshagen & Erdfelder, 2016).

Results

Incremental Validity

I first aimed to determine if SIER explained unique variability beyond the DERS. I conducted two hierarchical linear regressions (HLRs), one with the DUDIT and one with the DAST-10 as the dependent variable. In step one, I entered the DERS as an independent variable to explain variability in SUD severity. In step two, I added the SIER to examine (1) the explained variability in SUD severity shared with the DERS and (2) the uniquely explained variability beyond the DERS. The analyses did not violate any assumptions, including linearity, multivariate normality, non-multicollinearity, homoskedasticity, and the absence of multivariate outliers. [Footnote 7] The results of the HLRs were the same for the DUDIT and DAST-10, as presented in Table 4. In step one, the DERS explained significant variability, but the addition of SIER explained ~25% unique variability beyond the DERS, a large effect size, Cohen’s f²s ≈ .350. The DERS was nonsignificant in step two in both HLRs, which suggested that SIER accounted for all the variability explained by the DERS and provided a more representative account of ERP for SUDs than the DERS.
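
The two-step logic above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's SPSS analysis: the variables `general`, `specific`, and `severity` are simulated stand-ins for the DERS, SIER, and a severity score, and the helper names are hypothetical. The effect size uses the standard definition for an added block of predictors, f² = ΔR² / (1 − R²_full).

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an ordinary least-squares fit that includes an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

def cohens_f2(r2_full, r2_reduced):
    """Cohen's f^2 for the predictors added between two nested models."""
    return (r2_full - r2_reduced) / (1.0 - r2_full)

# Simulated (hypothetical) scores; n matches the reported sample size only.
rng = np.random.default_rng(1)
n = 198
general = rng.normal(size=n)                    # stand-in for the DERS
specific = 0.3 * general + rng.normal(size=n)   # stand-in for SIER
severity = 0.6 * specific + rng.normal(size=n)  # stand-in for DUDIT/DAST-10

r2_step1 = r_squared(general[:, None], severity)                       # step one
r2_step2 = r_squared(np.column_stack([general, specific]), severity)   # step two
delta_r2 = r2_step2 - r2_step1
f2 = cohens_f2(r2_step2, r2_step1)
print(f"R2 step 1 = {r2_step1:.3f}, delta R2 = {delta_r2:.3f}, f2 = {f2:.3f}")
```

For orientation, with placeholder values of R² = .04 in step one and R² = .29 in step two (not values taken from Table 4), ΔR² = .25 and f² = .25 / .71 ≈ .352, which is how a ~25% increment can correspond to an f² near .350.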

Table 4 Hierarchical linear regressions of the DUDIT and DAST

Discriminant Validity

I then tested if SIER effectively distinguishes individuals with SUDs from those without SUDs using a receiver operating characteristic (ROC) analysis.

This analysis used a continuous measure to estimate the ability to differentiate between two levels of a binary outcome (i.e., the presence or absence of a condition) across all measurable degrees of a construct. I initially used the DUDIT cutoff of 25 to separate those with a SUD from those without, as the DUDIT is the “gold standard” measure of SUDs. However, this cutoff resulted in a sample of 12 people with a SUD, limiting analyses. Instead, I used the cutoff of eight, which indicates SUPs (i.e., risky substance use; Berman et al., 2002). This categorization resulted in a balanced proportion of 105 (53.03%) with SUPs and 93 (46.97%) without SUPs, allowing me to proceed with the ROC analysis.

I entered the DERS and SIER mean scores into the analysis with the dichotomous DUDIT SUPs outcome variable. Note that although I entered both measures into the analysis at once, I generated their curves and summary statistics separately. I displayed the results in Fig. 3. SIER was more effective at distinguishing those with SUPs than the DERS. The area under the curve, an indicator of the distinguishing effectiveness, was 77.00% for SIER, p < .001, and 59.50% for the DERS, p = .019, suggesting that SIER was 25.64% better at distinguishing than the DERS, χ²(198) = 6.70, p = .010, Φ = .183. These findings suggested that both measures significantly distinguished between the two groups; however, SIER was more effective than the DERS. [Footnote 8]
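
To make the area under the curve concrete, the sketch below computes it through its rank interpretation: the probability that a randomly chosen person with SUPs scores higher than a randomly chosen person without SUPs, with ties counted as one half. The function names and toy scores are assumptions for illustration only. The second helper shows one reading that reproduces the reported 25.64% figure, namely the difference between the two AUCs relative to their mean; whether this is the computation the author used is an assumption.

```python
import numpy as np

def auc_rank(scores, labels):
    """AUC as P(score of a positive case > score of a negative case), ties = 0.5."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def symmetric_percent_difference(a, b):
    """Difference between two values relative to their mean, in percent."""
    return 100.0 * (a - b) / ((a + b) / 2.0)

# Toy example (hypothetical scores): 1 = above the cutoff, 0 = below it.
toy_auc = auc_rank([1, 2, 3, 4], [0, 1, 0, 1])
print(toy_auc)

# Reported AUCs from the text: .770 for SIER and .595 for the DERS.
print(round(symmetric_percent_difference(0.770, 0.595), 2))
```

An AUC of .50 means the measure separates the groups no better than chance, so the DERS value of .595 sits much closer to chance than the SIER value of .770.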

Fig. 3

Receiver Operating Characteristic and Precision-Recall Curve Comparisons Between the DERS and SIER Measures. Note. n = 198. DERS, Difficulty in Emotion Regulation Scale; SIER, substance-induced emotion regulation; AUC, area under the curve. The DERS and SIER distinguished between those with or without substance use problems (SUPs) using the DUDIT cutoff of eight. SIER was significantly more effective at distinguishing between the two groups

I then tested if individuals significantly differed in emotion regulation using a 2 (between: low risk vs. high risk) by 2 (within: z-DERS [Footnote 9] vs. z-SIER) and a 3 (between: low risk vs. moderate risk vs. high risk) by 2 (within: z-DERS vs. z-SIER) mixed-subjects analysis of variance (ANOVA) using the DUDIT cutoff for SUPs and the DAST-10 [Footnote 10] cutoffs of low, moderate, and high risk. I displayed the results in Fig. 4. There was a significant interaction for both the DUDIT, F(1, 196) = 9.47, p = .002, \(\eta_p^2\) = .046 (86.46% power), and the DAST-10, F(2, 195) = 6.85, p = .001, \(\eta_p^2\) = .066 (91.84% power). For the DERS, there was a small, significant simple effect with the DUDIT, p = .017, \(\eta_p^2\) = .029, that suggested that those with SUPs have more G-ERP; however, I did not find these simple effects with the DAST-10, ps > .102. For the SIER, there was a large, significant simple effect with the DUDIT, p < .001, \(\eta_p^2\) = .194, and the DAST-10, ps < .001, Cohen’s ds = 0.67–1.79, which suggested that those whom the DUDIT and DAST-10 classified as having high SUPs engage in more SIER. Note that, for the DERS, none of the risk classification groups significantly differ from the mean, ps > .083. Overall, the results are straightforward: the DERS falls short of distinguishing those with and without SUPs; however, SIER is quite effective.
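
As a quick consistency check on the interaction effect sizes, partial eta-squared can be recovered from an F statistic and its degrees of freedom, because \(\eta_p^2 = SS_{effect} / (SS_{effect} + SS_{error})\) and \(F = (SS_{effect}/df_{effect}) / (SS_{error}/df_{error})\). The sketch below applies that identity to the two interaction tests reported above; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Interaction tests reported in the text.
dudit = partial_eta_squared(9.47, 1, 196)    # F(1, 196) = 9.47
dast10 = partial_eta_squared(6.85, 2, 195)   # F(2, 195) = 6.85
print(round(dudit, 3), round(dast10, 3))
```

Both values round to the reported effect sizes (.046 and .066), so the F statistics and effect sizes in the text are mutually consistent.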

Fig. 4

Comparisons of Scores on the DERS and SIER by Substance Use Disorder Risk Classification of the DUDIT and DAST. Note. n = 198. DERS, Difficulty in Emotion Regulation Scale; SIER, substance-induced emotion regulation. I standardized (z-transformed) the DERS and SIER to compare their scale scores. Scores below the x-axis suggested the group has a substance use risk below the mean, and scores above the x-axis showed a substance use risk above the mean. I provided one-sample t-tests to determine if the score differed significantly from zero. The figure contains standard error bars, Cohen’s d, and partial eta-squared effect sizes, **p < 0.05

Structural Equation Modeling (SEM)

Sample Size

The ideal case-to-parameter ratio ranges from 5:1 to 10:1 (Bentler & Chou, 1987; Tanaka, 1987). Therefore, I used the subscales as indicators to preserve power and model parsimony. The model in the current study has a ratio of 6:1. [Footnote 11] An a-priori power analysis (Wang & Rhemtulla, 2021) using the Root Mean Squared Error of Approximation (RMSEA = .050 cutoff; Sivo et al., 2006) indicated that the required sample size at 80% power, α = .050, and df = 75 was 193, suggesting that my sample was large enough to conduct the SEM.

Analysis

I used a two-step model fit process and a three-step fit assessment (Kline, 2016). Regarding the two-step process, I first fit a measurement model (i.e., a model with all latent variables covaried) to assess the uncontrolled relationships. I used theoretically driven modifications to improve the model fit and then fit the structural model (i.e., a model with the hypothesized direct effects). Regarding the three-step procedure, I fit the models but did not use the exact fit test to assess model fit because it may detect minor, insignificant data-model misfit (Stone et al., 2021). Instead, I examined correlational residuals outside of ± .100 with an associated standardized residual outside of ± 1.96 (i.e., a z-statistic at p = .050). If there were many large residuals, I rejected the model. Conversely, if there were few large residuals, I retained the model due to excellent global and local model fit. I reported the fit indices for consistency with the literature but did not use them to justify the model fit due to biases from loading strengths (Kline, 2016). The measurement and structural models in the current study are mathematically identical and fit the data equally. [Footnote 12] I used maximum likelihood estimation with robust standard errors and a Satorra-Bentler scaled test statistic to control for multivariate nonnormality.
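
The local-fit screening rule described above (a correlation residual outside ± .100 paired with a standardized residual outside ± 1.96) can be sketched as a simple filter over the two residual matrices. This is an illustrative sketch only: the function name and the small matrices are hypothetical, and in practice the residual matrices would come from the fitted SEM (the author reports using lavaan in R).

```python
import numpy as np

def flag_local_misfit(corr_resid, std_resid, r_cut=0.100, z_cut=1.96):
    """Return (row, col) index pairs whose residuals exceed both cutoffs."""
    corr_resid = np.asarray(corr_resid, dtype=float)
    std_resid = np.asarray(std_resid, dtype=float)
    rows, cols = np.tril_indices_from(corr_resid, k=-1)  # each pair once
    large = (np.abs(corr_resid[rows, cols]) > r_cut) & (
        np.abs(std_resid[rows, cols]) > z_cut
    )
    return [(int(i), int(j)) for i, j in zip(rows[large], cols[large])]

# Hypothetical residual matrices for three indicators.
corr_resid = [[0.00, 0.12, 0.02],
              [0.12, 0.00, -0.15],
              [0.02, -0.15, 0.00]]
std_resid = [[0.0, 2.5, 0.3],
             [2.5, 0.0, -1.0],
             [0.3, -1.0, 0.0]]
flagged = flag_local_misfit(corr_resid, std_resid)
print(flagged)
```

Only the pair (1, 0) is flagged here: the pair (2, 1) has a large correlation residual but a standardized residual inside ± 1.96, so it does not meet both criteria.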

Model Fit

I first fit the measurement model to the data, and the model failed the exact fit test, χ²(87) = 248.43, p < .001. I found 18 correlation residuals outside of ± .100 with significant standardized residuals throughout the model, so I rejected the initially proposed hypothesized model, RMSEA = .097, 90% CI[.084, .110]; SRMR = .077; CFI = .863. Three patterns among the correlational residuals guided my modifications. First, the non-acceptance subscale of the DERS was highly related to the SIER subscales, so I specified an indicator from SIER to the non-acceptance subscale to address its correlational residuals. The second pattern was underestimated relationships among the DERS indicators; I allowed their errors to covary because of the similarity in construct validity. The final pattern of residuals emerged from the model specifying orthogonality among the latent variable indicators, which are highly related; I allowed these errors to covary as well. This one modification and the 11 error covariances resolved all of the data-model fit issues, despite the fit indices detecting insignificant data-model misfit, χ²(75) = 146.10, p < .001, RMSEA = .069, 90% CI[.053, .085]; SRMR = .051; CFI = .940. An examination of the final standardized and correlational residual matrices revealed no local data-model fit issues. A post-hoc power analysis using the RMSEA = .069 at a sample size of 198, α = .05, and df = 75 indicated 99.00% power, suggesting that the model has a sufficient sample size to detect data-model misfit.
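
As a sanity check on the reported fit statistics, the textbook RMSEA point estimate can be computed from the chi-square, its degrees of freedom, and the sample size. The sketch below uses the common form with N − 1 in the denominator; this is an assumption, since some software uses N, and robust (scaled) variants can differ slightly from what is shown here. The degrees-of-freedom bookkeeping follows directly from the text: adding one loading and 11 error covariances frees 12 parameters.

```python
import math

def rmsea(chi2, df, n):
    """Textbook RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

n = 198
initial = rmsea(248.43, 87, n)    # initial measurement model
modified = rmsea(146.10, 75, n)   # model after the modifications
print(round(initial, 3), round(modified, 3))

# One added indicator plus 11 error covariances frees 12 parameters.
df_after = 87 - (1 + 11)
print(df_after)
```

Both estimates round to the reported values (.097 and .069), and the degrees of freedom drop from 87 to 75 as stated.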

The additional indicator from the DERS to SIER and error covariances, as displayed in Supplementary Fig. 2, reveal important information about how individuals engage in SIER as a direct response to the DERS components. First, the addition of non-acceptance of emotions as an indicator of SIER suggests that the essential underlying process of SIER is emotional avoidance. Other patterns emerged from the error covariances such as: (1) using substances for sociability improvement may be a response to non-acceptance of emotions and difficulty engaging in goal-directed behavior, (2) mental health improvement may be a response to limited emotion regulation strategies, and (3) relaxation improvement may be a response to limited emotional clarity. These modifications and error covariances improved the model fit because these relationships are nonnegligible and theoretically consistent.

I then fit the structural model. Although the relationships among the three latent variables were significant, ps < .002, the relationship between the DERS and the SUD severity factor became nonsignificant, p = .270, once I specified the directional paths, as displayed in Fig. 5. The indirect effect of the DERS through the SIER on the SUD severity factor was significant, p < .001. Thus, the SIER fully explained the relationship between the DERS and the SUD severity latent variables. I displayed the final structural model in Fig. 6. Note that the model explains significant variability among most endogenous variables and that the latent factors are internally consistent. In fact, SIER explained a substantial amount of variability in SUDs (R² = 52.40%). Yet, the DERS explained only a small portion of variability in SIER (R² = 10.90%), suggesting that the DERS does not explain much variability in the emotion-regulating processes involved in SUDs. These findings cast doubt on the utility of the DERS for SUDs.

Fig. 5

SEM Models Demonstrating the Explanatory Ability of Substance-Induced Emotion Regulation. Note. n = 198. A measurement model, B structural model. Parameter estimates are completely standardized, *p < 0.05

Fig. 6

Final Structural Model. Note. n = 198. Unstandardized parameter estimates with explained variability and factor internal consistency estimates. I removed the error covariances for ease of interpretation. See Supplementary Fig. 2 for a figure with error covariances, *p < 0.05

Discussion

I aimed to provide proof-of-concept evidence of a nuanced understanding of the role of ERP in SUD maintenance and treatment research, suggesting that an oversaturation of G-ERP may be limiting our understanding and conceptualizations of SUDs. I argued that a shift toward DS-ERP, including SIER, may explain gaps in the SUD literature and warrant preliminary investigations to provide a more nuanced view of SUDs (Stone, 2022a). The results confirmed my hypotheses that (1) DS-ERP far exceeded the G-ERP in explaining SUD severity, (2) DS-ERP outperformed the G-ERP as a distinguishing factor of those with and without SUPs, and (3) the DS-ERP may fully account for the effect of G-ERP on SUDs. Although there are notable limitations to the generalizability and external validity of these data, this proof-of-concept evidence suggests that the effects may be substantial, as DS-ERP produced large effect sizes, and the G-ERP was frequently nonsignificant when accounting for DS-ERP – in contrast to the literature that shows variable effects of G-ERP when not accounting for DS-ERP. This evidence provides a potential conceptual and empirical framework to warrant further investigations. I note the potential implications of this study and of further evidence from researchers who replicate and extend the current work in studies with special populations and more complex designs.

Theoretical Implications

I found preliminary support for the alternative conceptual understanding of ERP in SUDs presented in Fig. 1; namely, that the literature may be overstating the relevance of G-ERP in SUD treatments and maintenance research, as DS-ERP may mediate or explain the direct predictive or causal relationship between G-ERP and SUDs. The amount or frequency of use that precipitates SUDs is also subjective and not a diagnostic criterion (American Psychiatric Association, 2022). For example, using cannabis 53.6 times a month has a 31% cannabis use disorder (CUD) rate in middle-aged people, whereas 52.1 times has a 20% CUD rate in older individuals (Haug et al., 2017). Therefore, it may not be the substance use itself or G-ERP, but DS-ERP (e.g., SIER) causing the SUDs, suggesting that using substances to help with functioning or improve subjective experiences may lead to worse SUD symptomology. Although this study is preliminary, the nuance between a G-ERP and DS-ERP as mechanisms is nonnegligible and warrants further consideration.

G-ERP and DS-ERP have two notable distinguishing features. First, G-ERP, as measured by the DERS, is a transdiagnostic mechanism (see Gratz et al., 2015 and Sloan et al., 2017 for systematic reviews), whereas DS-ERP, as represented by SIER, is specific to SUDs and cannot occur in those who do not use substances. Essentially, the shared variability between SUDs and other disorders is notable, but the unique variability captured by DS-ERP may overshadow and diminish the relevance of G-ERP – thereby supporting emerging evidence of the relevance of DS-ERP over G-ERP (Cludius et al., 2020). Second, G-ERP comprises many cognitive and affective components (e.g., lack of emotional clarity; Gratz & Roemer, 2004), whereas DS-ERP is consistent with major behavioral conceptualizations of SUDs (e.g., behavioral economics; Bickel et al., 2014; Field et al., 2020; Dennhardt et al., 2019). In the Weiss et al. (2022) meta-analysis, the largest aggregate effect sizes came from negative strategies (i.e., behaviors that reduce negative emotions; r = 0.41), much larger than the cognitive components (rs = −0.01 to 0.17). The authors suggested that variation across study contexts may explain these differences. Although study variability is a probable contributor to this effect, the current preliminary investigation provides evidence that it may also reflect a lack of consideration of DS-ERP, which may involve stronger behavioral processes that strengthen behavioral effect sizes, and the oversaturation of G-ERP measured by the DERS, which likely comprises less-relevant cognitive and affective processes that diminish the effect size. As such, these effect size differences arising from the overfocus on G-ERP and the neglect of DS-ERP may generalize to clinical settings.

Clinical Implications

The preliminary evidence I presented here may provide a novel explanation and guidance for SUD treatment. This evidence partially confirms the alternative conceptual understanding of SUD presented in Fig. 1 and Supplementary Fig. 1, which explains why MI and CM may produce larger effects than CBT, ACT, and DBT for SUDs. The DS-ERP, a behavioral mechanism, may explain a confounded relationship between the G-ERP, a partially cognitive mechanism, and SUDs. Hence, treatments directly targeting behavioral replacements of substance use (e.g., MI and CM; see Sayegh et al., 2017 for a meta-analysis) produce larger effects than treatments focused on G-ERP or that indirectly or partially target behavioral replacements (e.g., CBT, ACT, and DBT for SUDs; Lee et al., 2015; Stotts & Northrup, 2015; Vujanovic et al., 2017). A potential benefit of directly replacing substance use behaviors with non-substance alternatives in G-ERP-focused treatments could be that their effectiveness for co-occurring disorders and non-SUD-related substance use may improve to complement their effectiveness for emotional disorders (A-Tjak et al., 2015; Bai et al., 2020; Keles & Idsoe, 2018; Panos et al., 2014). Essentially, this preliminary evidence suggests that the overfocus on G-ERP may limit the effectiveness of treatments for SUDs.

These findings may also partially explain (1) the generally poor treatment success with large relapse rates and (2) a reluctance to engage in abstinence-based goals. I argue that the current preliminary evidence may account for these treatment challenges because many transdiagnostic G-ERP-focused treatments may not directly address DS-ERP (e.g., Lee et al., 2015; Stotts & Northrup, 2015; Vujanovic et al., 2017). Individuals may be more likely to use substances and more reluctant to commit to abstinence if using substances to address life problems or unpleasant emotions remains the subjectively better option because of its perceived benefit and ease, regardless of the adaptive ER skills or other generalized strategies introduced to patients in transdiagnostic treatments. For example, researchers have observed that rats self-administer substances less when operant social and environmental rewards are available (Venniro et al., 2016, 2018), which is consistent with human research showing that individuals are more likely to use substances when non-substance rewards are not available (Higgins et al., 2004). This principle has roots in behavioral economics (Stone, 2022b), and emerging treatments, such as positive psychology, may better account for this principle and enhance treatment effects by offering a relatively simple and quick non-substance behavioral reward (Stone & Schmidt, 2022; Stone, 2022c). As such, treatments must make non-substance-based behaviors more appealing and accessible than substance use behaviors. Although this is a challenging task, G-ERP-focused treatments that do not account for DS-ERP and do not replace substance-based behaviors may continue to fall short of their potential as SUD treatments.

Limitations and Future Directions

The implications of this study are not conclusive or highly generalizable because of notable limitations that restrict the evidence to a proof-of-concept, including an undergraduate sample (DeRight & Jorgensen, 2015), a cross-sectional design (Agler & De Boeck, 2017), and self-reported measurement (Jordan & Troth, 2020; Lance et al., 2010). Further, because of the limited and non-clinical sample, I was not able to account for some extraneous participant characteristics, such as mood, anxiety, eating, and personality disorders (Sloan et al., 2017). Researchers may consider replicating and extending the findings by recruiting diagnostically relevant samples (e.g., those with SUDs or co-occurring disorders), controlling for clinician-confirmed (i.e., through assessment) extraneous psychopathology, establishing temporal precedence to claim mediation, using experimental designs to test causality, and integrating multimethod assessments (e.g., clinician interview, record review, and biochemical confirmation). Multimethod assessment would reduce method effects and add objective indicators of the measured constructs, which would improve validity and generalizability. The next studies may be longitudinal and experimental laboratory studies that examine whether DS-ERP mediates the effect of G-ERP on SUDs and whether DS-ERP is sensitive to intervention-elicited changes, which would help confirm DS-ERP as a maintenance factor of SUDs within behavioral treatment. Then, future studies may be randomized clinical trials (RCTs) that test the efficacy of integrating DS-ERP-targeting interventions (e.g., positive psychological interventions; Stone & Parks, 2018) into common G-ERP-focused treatments and assess changes in outcomes. Essentially, this study tested the relevance of DS-ERP for SUDs; laboratory studies may test its usefulness, and RCTs may test the feasibility, generalizability, and efficacy of DS-ERP-targeting interventions before they are intentionally integrated into SUD treatments.

Conclusion

SUDs remain a significant public health problem worldwide (Rehm & Shield, 2019), and both abstinence and harm reduction practices (Rosenberg et al., 2020) generally show unreliable success (Dennis et al., 2005; Fleury et al., 2016). I proposed and provided proof-of-concept evidence for shifting the focus from G-ERP to DS-ERP. The preliminary evidence suggests that DS-ERP is a potential explanation for inconsistent and understudied findings within the literature and a potential strategy for improving SUD treatments. More complex investigations are crucial if we want to initiate more precise work to bolster our prevention, treatment, and recovery efforts for those with SUDs. Such investigations would need to deviate from the current narrative, in which the DERS saturates measurement and G-ERP is treated as the essential component of SUD maintenance and treatment.