Introduction

Attention-deficit/hyperactivity disorder (ADHD) is a common childhood psychiatric disorder with a high worldwide prevalence of 2.6–4.5% [1]. It is considered a heterogeneous disorder, with a particularly high comorbidity rate of 40–70% with conduct problems (CP). Stimulant medication is the most common and effective treatment in severe ADHD, and about 70% of patients respond to this pharmacological approach [2]. However, adverse events [3], unwillingness to take medication over extended periods [4] and particularly the absence of positive long-term effects [5] are serious constraints of this treatment. Thus, there is a demand for alternative treatments with possible long-term effects such as neurofeedback (NF), which aims to improve self-regulation of certain brain activity patterns [6]. NF has gained encouraging empirical support in recent years. A meta-analysis of the effects of NF on ADHD symptoms showed medium to large effects for all three core domains of ADHD symptoms [7]. Although effects were substantially reduced for probably blinded raters in RCTs, NF effects remained significant in an exploratory analysis for studies using standard protocols [8]. Regarding sustained and long-term effects, a recent meta-analysis of ten studies [9] found small to medium effects for NF compared to non-active control conditions at follow-up and similar effects compared to active control conditions (pharmacotherapy and self-management). Moreover, the effects of NF treatment on CP and the role of this comorbidity in treatment response have not been widely studied in ADHD patients [10], although other behavioral ADHD treatments improve CP symptoms [11].

Slow cortical potential (SCP)-NF focuses on regulating cortical activation and inhibition. These slow electrical shifts form a phasic mechanism in the regulation of attention [12]. A well-studied SCP, the frontocentral contingent negative variation (CNV) reflecting cognitive activation and preparation, is reduced in ADHD children compared with healthy controls [13]. Promising effects of SCP-NF involving upregulation of CNV-like negative SCPs on ADHD have been reported in several studies [14,15,16,17,18].

The few studies investigating the impact of NF on comorbid CP generally found positive effects on CP symptoms. Gevensleben and colleagues [15] found significant reductions in parent-rated oppositional behavior (ODD) and CP compared to standardized computerized attention training. After theta/beta NF training, a reduction of ODD symptoms was reported, but without a group difference when compared with standard pharmacological intervention [19]. Furthermore, one study investigated SCP-NF in criminal psychopaths and reported less aggression and impulsivity [20].

A key question in NF is whether the ability to learn and self-regulate unconscious psychophysiological parameters relates to clinical outcomes and thereby supports the specificity of treatment effects. Two studies [14, 17] linked self-regulation outcome with impulsivity, inattention and hyperactivity subscales when participants were classified as learners. Gevensleben and colleagues [15] reported that successful initial increases of negativity (until the ninth session) correlated with inattention improvement. However, one recent frequency band NF study [21] could not find any association between self-regulation and symptom reduction. These analyses are important to disentangle specific from unspecific effects provided by NF treatment approaches.

The relation between long-term effects and self-regulation in ADHD participants was analyzed only in one study 6 months after SCP-NF treatment. Strehl et al. [17] reported medium to large effect sizes (ES), which were predicted by self-regulation performance during transfer conditions after training and as a trend at follow-up.

The main aim of this follow-up in our large randomized controlled multicenter trial, which demonstrated a superior primary ADHD outcome for SCP-NF compared to a semi-active control group [18], was to evaluate the clinical long-lasting effects on ADHD and CP symptoms and relate them to self-regulation capabilities.

Methods

Study design and participants

We conducted a multicenter, randomized controlled, parallel, superiority trial. The study was approved by all local ethics committees according to the Declaration of Helsinki. Written consent was obtained from all participants and the persons in charge of primary custody. For more details see [22] regarding the study protocol and randomization and [18] regarding the primary outcomes 4 weeks after treatment. Participants had to meet the diagnosis of ADHD combined type according to DSM-IV TR and be aged 7–9 years. Comorbid symptoms at baseline were assessed by the Child Behavior Checklist (CBCL). Exclusion criteria consisted of a diagnosis of bipolar disorder, psychosis, obsessive–compulsive disorder, chronic severe tics or Tourette syndrome, major neurological or physical illness, acute suicidal tendencies, pharmacotherapy for severe anxiety, mood disorders and psychosis, IQ below 80, lack of German language proficiency, no telephone, pregnancy and lactation, and current participation in other clinical trials. Since the interventions were considered an add-on to treatment as usual, pharmacotherapy for ADHD, ODD and CD was allowed.

Procedures

After screening, there was a washout period of 2 weeks for children taking psychostimulants and 4 weeks for participants taking atomoxetine. Assessments were carried out at pre-intervention (pre-test), after treatment (post-test 1), 1 month after treatment (post-test 2) and 6 months later (follow-up). Pre-test and post-test 2 were conducted without medication, and 6 months after treatment end participants underwent a naturalistic follow-up. Participants were trained one to two times per week for a total of 25 sessions within 3 months. Six months after training, a follow-up and booster session probed the sustainability of acquired self-regulation skills. Each session lasted about 1 h.

SCP-NF sessions were conducted with NEUROPRAX systems (neuroCare GmbH, Germany) using a monopolar setting (Cz, referenced to the right mastoid). Each training session consisted of three feedback runs (with visual feedback) and one transfer run (without feedback). A run consisted of 40 trials, each lasting 10 s, with three phases (2 s baseline and 8 s feedback, followed by a “sun” for reinforcement after successful trials). The participants had to differentiate between activation and deactivation of brain activity. During an “activation” task an electrically negative SCP shift was required, in contrast to the “deactivation” task, requiring an electrically positive shift. The baseline was set to zero. Trials were randomly distributed with a 50/50% rate for the first phase of the training (sessions 1–12). Thereafter, participants had a 3–4 weeks break. The second phase of the training (sessions 13–25) was more focused on “activation” with 80% negative SCP shifts.
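The composition of a single run described above can be illustrated with a short sketch. It is not the training software used in the trial; the function names are hypothetical, and fixing the exact share of activation trials within each run (rather than drawing each trial independently) is an assumption made here for simplicity.

```python
import random

TRIALS_PER_RUN = 40   # from the text: a run consisted of 40 trials
TRIAL_SECONDS = 10    # from the text: 2 s baseline + 8 s feedback per trial


def activation_share(session):
    """Share of activation (negativity) trials for a given session number.

    Sessions 1-12 used a 50/50 split, sessions 13-25 used 80% activation.
    """
    if not 1 <= session <= 25:
        raise ValueError("session must be between 1 and 25")
    return 0.5 if session <= 12 else 0.8


def make_run(session, rng):
    """Return a shuffled list of 40 task labels (hypothetical helper).

    Assumption: the share is met exactly within each run.
    """
    n_activation = round(TRIALS_PER_RUN * activation_share(session))
    tasks = ["activation"] * n_activation
    tasks += ["deactivation"] * (TRIALS_PER_RUN - n_activation)
    rng.shuffle(tasks)
    return tasks


rng = random.Random(0)  # fixed seed only so the illustration is reproducible
run_phase1 = make_run(5, rng)    # first training phase
run_phase2 = make_run(20, rng)   # second training phase
run_seconds = TRIALS_PER_RUN * TRIAL_SECONDS  # trial time per run, without reinforcement displays
```

Under these assumptions, a run in the first phase contains 20 activation trials and a run in the second phase contains 32.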

The semi-active control condition EMG-BF required coordinated activity of the supraspinatus muscles. Participants were instructed either to contract or to relax the left in relation to the right supraspinatus muscle. Setting, training devices, electrode montage, feedback and transfer trials, number of sessions, and follow-up assessments were the same as in the SCP-NF group.

Outcomes

The primary outcome was ADHD symptoms rated by parents. The secondary outcomes were teacher-rated ADHD scale, time course of comorbid symptoms which were rated by parents via the Strengths and Difficulties Questionnaire (SDQ) and NF training self-regulation performance (percentage of correct trials) and its relation to clinical outcomes. Psychometric properties of all pre-specified measures are reported in the protocol [22].

Statistical analysis

Statistical analyses were run using the Statistical Package for Social Sciences version 23.0 (SPSS). Post-intervention (post-test 2) effects have been reported previously [18]. This study evaluated sustained and long-term effects between treatments. Primary outcomes (ADHD parent ratings) were tested by an analysis of covariance (ANCOVA) to test the sustainability of effects (follow-up minus post-test 2), as predefined in our protocol [22], and the longitudinal course across all assessments was analyzed using a mixed model for repeated measures (MMRM). The ANCOVA included the covariates trial site, sex, age, baseline ADHD score, ADHD medication at pre-test, parenting style and parents’ expectations. The MMRM included fixed effects for group, site, time and group-by-time interaction, adding sex, age, baseline ADHD score, ADHD medication at pre-test, parenting style and parents’ expectations as covariates. We also repeated the same MMRM analysis substituting medication status at pre-test with medication at follow-up.

Secondary outcomes (ADHD teacher ratings) were tested by an analysis of covariance (ANCOVA) with trial site, sex, age, baseline ADHD score, ADHD medication at pre-test, parenting style and parents’ expectations as covariates. Differences were calculated between follow-up and post-test 2 assessments to test sustained effects and between follow-up and pre-test to test long-term clinical effects. Paired t tests were used for within-group analysis. Between-treatment effect sizes were calculated by dividing the treatment group differences by the pooled standard deviation at pre-test. Within-treatment effect sizes were calculated by dividing the mean of changes by the standard deviation at pre-test. The influence of baseline comorbid CP on the primary outcome was assessed by repeating the main analysis with conduct problems as an additional covariate. The course of comorbid conduct problems and other comorbid symptoms over time was assessed via the SDQ, measuring CP, emotional problems and peer problems in addition to total problems and hyperactivity. Non-parametric Wilcoxon signed rank tests were used for this statistical analysis. NF self-regulation was analyzed based on the regression slope of all selected mean training sessions (for details see [18]). Consolidation of performance was compared by paired t test between the follow-up training session and the first mean session using the online obtained reinforcement rate. Pearson’s or Spearman correlations were assessed to link linear regression of self-regulation performance and clinical outcome for ADHD and comorbid symptoms.
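The change scores and effect sizes defined above can be sketched as follows. This is an illustration only, not the SPSS procedure used in the study: the function names are hypothetical, the pooled standard deviation uses the conventional degrees-of-freedom weighted formula (an assumption, since the exact formula is not stated), and the between-treatment difference is taken as the raw difference in mean change rather than the covariate-adjusted ANCOVA estimate.

```python
import math
import statistics


def change_scores(later, earlier):
    """Per-participant change, e.g. follow-up minus pre-test."""
    return [l - e for l, e in zip(later, earlier)]


def pooled_sd(a, b):
    """Pooled SD of two groups (conventional df-weighted formula, assumed)."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)
    return math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))


def between_treatment_es(change_a, change_b, pre_a, pre_b):
    """Difference in mean change divided by the pooled SD at pre-test."""
    diff = statistics.mean(change_a) - statistics.mean(change_b)
    return diff / pooled_sd(pre_a, pre_b)


def within_treatment_es(change, pre):
    """Mean change divided by the SD at pre-test."""
    return statistics.mean(change) / statistics.stdev(pre)


# Invented toy ratings, not study data.
pre_a, follow_a = [1.0, 2.0, 3.0], [0.0, 1.0, 2.0]
pre_b, follow_b = [2.0, 3.0, 4.0], [2.0, 3.0, 4.0]

long_term_a = change_scores(follow_a, pre_a)
long_term_b = change_scores(follow_b, pre_b)
es_between = between_treatment_es(long_term_a, long_term_b, pre_a, pre_b)
es_within_a = within_treatment_es(long_term_a, pre_a)
```

Standardizing by the pre-test standard deviation keeps the denominator unaffected by treatment, so effect sizes for the sustained and long-term contrasts are expressed on the same scale.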

For the ANCOVA, data were analyzed primarily in the modified intention-to-treat (mITT) population, comprising all patients except those who received no treatment due to violation of inclusion criteria. Baseline observation carried forward (BOCF) was used to replace missing values for analysis of covariance.
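As a minimal sketch of baseline observation carried forward, assuming missing assessments are coded as None (the helper name and the toy values are hypothetical and not taken from the study data):

```python
def bocf(baseline, later):
    """Replace each missing later value (None) with that participant's baseline value."""
    if len(baseline) != len(later):
        raise ValueError("baseline and later must have the same length")
    return [b if x is None else x for b, x in zip(baseline, later)]


# Invented toy ratings, not study data.
pre_test = [1.8, 2.1, 1.5]
follow_up = [1.2, None, 1.0]   # second participant missed the follow-up

follow_up_bocf = bocf(pre_test, follow_up)
change = [f - p for f, p in zip(follow_up_bocf, pre_test)]
```

A participant with an imputed value contributes a change of zero from baseline, which makes BOCF a conservative choice when improvement is expected.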

Results

A total of 174 participants were recruited for screening between September 2009 and January 2013, 150 (86%) of whom were allocated to one of the two treatment groups; 144 (82%) participants started the treatment. The CONSORT flow diagram is depicted in Fig. 1. Finally, the mITT population comprised 75 (52%) participants in SCP-NF and 69 (48%) in EMG-BF. In SCP-NF 60 (41%) and in EMG-BF 51 (35%) participants completed treatment and took part in all assessment points. Baseline characteristics did not differ between groups and are depicted in Table 1.

Fig. 1

Trial profile. Modified from Strehl et al. [18]. SCP-NF slow cortical potential neurofeedback, EMG-BF electromyographic biofeedback, mITT modified intention to treat

Table 1 Baseline characteristics of the mITT population

As predefined in our protocol, we performed an ANCOVA assessing the sustained effects between groups (follow-up minus post-test 2) of the ADHD global score rated by parents, which revealed a trend for a superior improvement after EMG-BF versus SCP-NF (BOCF: treatment difference 0.15, p = 0.066, ES 0.32), while no effect of sex, trial site, medication, symptom severity at baseline, parenting style, parents’ expectation and age was observed. Regarding ADHD subdomains, ANCOVA yielded significant group differences for hyperactivity only (BOCF: treatment difference 0.19, p = 0.013, ES 0.44). No effect of sex, trial site, medication, parenting style and parents’ expectation was observed, but age (p = 0.051) showed a trend for a positive association with improved hyperactivity (Supplementary Table 1).

Analyzing the longitudinal course across all assessments from pre-test to end of 6 months follow-up together using the MMRM showed large within-group improvement on the ADHD global score for both treatments (time difference 0.43, p < 0.0001) with significant group-by-time interaction [F(3,4.376), p = 0.006]. Figure 2 shows the clinical trajectories for all assessments for primary outcome rated by parents and in Table 2 results of the MMRM are depicted. Both groups showed large initial improvement immediately after 25 training sessions (post-test 1). However, 1 month after treatment, following the medication washout, only the SCP-NF group remained stable and the EMG-BF group showed a significant relapse, resulting in significant group differences (group difference − 0.21, p = 0.019). However, at follow-up assessment group differences disappeared (group difference − 0.065, p = 0.534), indicating that the EMG-BF group significantly recovered (improved) from post-test 2 to follow-up assessment (time difference 0.16, p = 0.035). Regarding the covariates, age (p = 0.008) and symptom severity at baseline (p < 0.0001) showed significant impact on treatment outcome, reflecting more improvement with older age or more severe baseline ADHD (Supplementary Table 2). Further, when repeating the same analysis with medication status at follow-up, a significant interaction for time-by-medication [F(3,2.858), p = 0.045], but not for time-by-group-by-medication [F(3,0.365), p = 0.778] emerged. The post hoc tests indicated that only medicated participants showed a significant recovery from post-test 2 to follow-up (time difference 0.16, p = 0.048), while unmedicated participants showed a stable improvement after post-test 1 (Supplementary Tables 3, 4).

Fig. 2

Clinical trajectories of ADHD parent ratings. Pre-test and post-test 2 were conducted without medication. °p < 0.1, *p < 0.05

Table 2 Summary of primary outcome: ADHD FBB-HKS rated by parents

In exploratory additional medication subgroup analyses, the group-by-time interaction remained significant for parent ratings in consistently unmedicated patients [n = 25 vs 24; F(3,2.122), p = 0.025]. Analysis of the consistently medicated participants showed a significant group effect for the impulsivity subscale [n = 21 vs 19; F(1,8.020), p = 0.007]. Post hoc analysis revealed lower impulsivity for the SCP-NF group at trend level at post-test 1 (p = 0.054) and significantly lower impulsivity at post-test 2 (p = 0.003) and follow-up (p = 0.008). Changes in medication status during the study were comparable in both groups (see Supplementary Table 5). There was no evidence that more children reduced medication use in the SCP-NF group (n = 4) than in the EMG-BF group (n = 7).

ADHD subscales rated by parents are depicted in Fig. 2. Similar results as in the primary outcome were obtained. Inattention [F(3,110.26) = 27.753, p < 0.0001] and hyperactivity [F(3,107.28) = 18.316, p < 0.0001] achieved a significant effect of time. Hyperactivity subscale showed significant group-by-time interaction [F(3,107.24) = 3.476, p = 0.018] and inattention a trend [F(3,110.23) = 2.506, p = 0.062]. The impulsivity subscale also showed a significant effect of time [F(3,111.03) = 10.767, p < 0.0001], however, without a group-by-time interaction [F(3,111.00) = 1.724, p = 0.1661].

ANCOVA assessing the secondary outcome rated by teachers did not show any significant difference between groups, either for sustained effects (follow-up minus post-test 2) (BOCF: treatment difference − 0.09, p = 0.3559) or for long-term effects (follow-up minus pre-test) (BOCF: treatment difference − 0.15, p = 0.1480) (for details see Supplementary Tables 6, 7). Within-group analyses are depicted in Table 3. For long-term effects, SCP-NF showed significant improvement for the ADHD global score [t(64) = 3.055, p = 0.0032] and all subdomains, with small to medium effect sizes. For EMG-BF, teacher ratings showed only a trend toward improvement for the impulsivity subdomain [t(62) = 1.807, p = 0.0756].

Table 3 Summary of secondary outcomes: ADHD rating scale rated by teachers (mITT population N = 144, BOCF)

To assess the long-term effects of learning on self-regulation, we grouped participants into learners and non-learners based on the sign of their regression slope over sessions including the follow-up session for the feedback and transfer condition separately. For SCP-NF, 63.5% of the participants were classified as learners for the feedback condition and 58.3% for the transfer condition. In the semi-active control group, 70.2% were classified as learners during the feedback condition and 80.7% for the transfer condition. Paired T tests showed significant improvement of performance only during transfer trials between follow-up sessions and first training sessions for SCP-NF [t(42) = 2.438, p = 0.019] and EMG-BF [t(38) = 4.650, p < 0.0001]. For details see Supplementary Figure 8.
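The learner classification can be illustrated with a simple least-squares slope over sessions. This is a sketch under stated assumptions rather than the analysis code of the study: sessions are treated as equally spaced, performance is the percentage of correct trials per session, a strictly positive slope counts as learning, and the function names are hypothetical.

```python
def ols_slope(performance):
    """Least-squares slope of performance against session index 1..n."""
    n = len(performance)
    if n < 2:
        raise ValueError("at least two sessions are needed")
    sessions = range(1, n + 1)
    mean_x = (n + 1) / 2
    mean_y = sum(performance) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sessions, performance))
    sxx = sum((x - mean_x) ** 2 for x in sessions)
    return sxy / sxx


def is_learner(performance):
    """Assumption: a positive slope over sessions marks a learner."""
    return ols_slope(performance) > 0


# Invented toy percentages of correct trials, not study data.
improving = [40.0, 42.0, 44.0, 46.0]
declining = [50.0, 48.0, 45.0]

slope_improving = ols_slope(improving)
labels = [is_learner(improving), is_learner(declining)]
```

Because only the sign of the slope is used, this classification does not distinguish marginal from substantial gains, which is worth keeping in mind when comparing learner rates between groups.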

Long-term clinical effects (follow-up minus pre) and self-regulation performance did not show any significant correlation for SCP-NF. For the semi-active control group, we found significant correlations between linear performance increase and parent rating scale for ADHD global score [r(48) = 0.361, p = 0.011], inattention [r(48) = 0.302, p = 0.0370] and hyperactivity [r(48) = 0.367, p = 0.010], but no significant correlation with teacher ratings. As reported in our previous study [18], no significant correlations between training performance and parent-rated ADHD global score were found at post-test 2. However, the analysis of ADHD core symptom subdomains revealed a significant correlation of improvement of performance until post-test 2 for SCP-NF with parent [r(41) = 0.401, p < 0.009] and teacher ratings [r(36) = 0.339, p = 0.043] for improvement of impulsivity and a trend for hyperactivity [r(41) = 0.256, p < 0.0976] rated by parents. In the EMG-BF group, parent-rated hyperactivity correlated significantly negatively [r(41) = − 0.391, p = 0.036] with improved performance. For details see Supplementary Table 9.
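For readers less familiar with the two correlation coefficients used here, the following sketch shows how Pearson's r and Spearman's rank correlation relate: Spearman's coefficient is Pearson's r computed on ranks, with tied values receiving their average rank. The function names and toy values are hypothetical, and significance testing is omitted.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman correlation as Pearson's r on average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Invented toy values: performance slopes and symptom improvements.
slopes = [1.0, 2.0, 3.0, 4.0]
improvements = [1.0, 4.0, 9.0, 16.0]

r_linear = pearson_r(slopes, improvements)
r_rank = spearman_rs(slopes, improvements)
```

In this toy example the relationship is monotonic but not linear, so the rank correlation reaches 1 while Pearson's r stays slightly below it.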

Conduct problems at baseline did not significantly impact the clinical ADHD symptom change at follow-up on the FBB global scale (p = 0.576) or any subdomain rated by parents and teachers (all p > 0.1844). Regarding the clinical effects on comorbidity measured by the SDQ, Wilcoxon signed rank test showed significant improvement at follow-up compared to pre-test rated by parents for SDQ total score (U = 922.0, z = − 5.337, p < 0.0001) and the subdomain conduct problems (U = 843.5, z = 3.792, p < 0.0001), with no significant group differences. The other SDQ subdomains also improved [hyperactivity (U = 471.0, z = − 5.727, p < 0.0001), emotional problems (U = 471.0, z = 5.727, p < 0.0001) and peer problems (U = 1.012, z = 3.642, p < 0.0001)], except prosocial behavior (U = 1.474, z = − 1.062, p = 0.288). Significant group differences emerged only for the subdomain peer problems (in favor of SCP-NF: U = 1833.5, z = 2.617, p = 0.009). Significant correlations between self-regulation during the transfer condition and symptom reduction were found only in the SCP-NF group, and only for SDQ total score [rs(58) = − 0.285, p = 0.030], peer problems [rs(58) = − 0.349, p = 0.007] and at trend level for CP [rs(58) = − 0.255, p = 0.052] and hyperactivity [rs(58) = − 0.247, p = 0.061].

Discussion

We studied the long-term effects of SCP-NF compared to a semi-active control condition. Both treatments produced large improvements in ADHD core symptoms directly after treatment. Superior results for SCP-NF 1 month after treatment end became non-significant at follow-up for the primary outcome rated by parents. However, the improvements seen at post-test 1 remained stable 6 months after treatment end for SCP-NF, suggesting long-lasting effects. Interestingly, the semi-active control group showed a significant relapse during the medication washout from post-test 1 to post-test 2 with a significant recovery at follow-up, suggesting that these changes are driven by a medication effect. This finding might resemble the observation of Monastra and colleagues [23], where only the control group deteriorated after medication washout. However, in our study, medication did not show such group-specific effects, and the significant time-by-medication interaction at follow-up did not interact with group. Since the clinical trajectories suggested that the medicated SCP-NF subgroup improved more, we also performed subgroup analyses of consistently medicated and unmedicated participants. However, these revealed no new NF-specific improvements and did not change the findings for the entire sample. Nevertheless, age did significantly impact treatment outcome, suggesting that the long-term effect of these intense treatments may benefit from the common symptom reduction with development [24]. Also, baseline severity remained significantly associated with improvement at follow-up, which may reflect continued regression to the mean or more room for improvement.

Regarding the clinical effects after SCP-NF, our results are in line with a recent meta-analysis [9], which analyzed sustained effects after NF in comparison with active and non-active control groups. This meta-analysis showed that a superior clinical effect at follow-up for NF was observed only when it was compared with the non-active control groups; NF follow-up effects were similar to those of the active control conditions. Our study used a semi-active control group which might be closer to active control groups, as EMG-BF already showed clinical effects on ADHD symptoms [25, 26]. This, together with the short training duration of 25 sessions and the possible influence of additional confounders, may have contributed to the lack of superiority of SCP-NF 6 months after treatment. A recent study by Geladé and colleagues [27] showed that a significant advantage of medication over NF seen at post-intervention disappeared at follow-up. These findings suggest that in other study designs, NF-specific improvements may appear only at follow-up. Concerning teacher ratings, no differences between groups were found. However, within-group analysis showed significant improvement in the SCP-NF group only, with small to medium effect sizes. Teachers may be less biased but also tend to be less sensitive [28], although in a recent follow-up study [27] teacher ratings indicated an advantage of NF over a non-active group, comparable to medication. Further, reductions of comorbidity symptoms measured by SDQ were significant and independent of groups, except for peer problems which improved more in the SCP-NF group.

Considering the association between self-regulation and clinical outcome, only very few SCP-NF studies followed this relationship in participants with ADHD after the end of NF treatment [14, 17]. They related self-regulation outcome to impulsivity, inattention and hyperactivity at the end of treatment. We reported significant correlations between clinical improvement and self-regulation performance for both groups. At post-test 2, the SCP-NF group showed a significant correlation between self-regulation and symptom improvement for impulsivity and a trend for hyperactivity rated by parents and teachers, whereas the EMG-BF group showed a significant negative correlation between self-regulation and hyperactivity only. These outcomes might be interpreted as a specific effect of SCP-NF. However, at the follow-up 6 months after treatment, the EMG-BF group showed significant correlations between self-regulation performance and the ADHD global score and the inattention and hyperactivity subdomains, which might be due to unspecific effects, such as the developmental course or regression to the mean. Interestingly, symptom change measured with SDQ at follow-up showed specific correlations between self-regulation and symptom improvement only for the SCP-NF group. Overall, after these unexpected and mixed outcomes, no firm conclusions can be drawn regarding specific and unspecific effects related to self-regulation for the follow-up outcomes after NF.

As a limitation, our follow-up may not have been sufficiently powered to disentangle specific from unspecific effects between groups 6 months after treatment. Additionally, our SCP-NF setup was possibly suboptimal regarding the number of sessions and the amount of transfer trials (i.e., compared to earlier studies we had fewer training sessions and particularly fewer transfer trials) as well as the overall regulation performance during SCP-NF training. Our participants achieved a mean reinforcement rate of 44% for SCP-NF and 82% for EMG-BF. Still, these data are in line with the few published studies. Some SCP-NF studies [29, 30] showed reinforcement rates around 40% or less and similarly good performance for EMG-BF [25, 30], indicating as expected that EMG regulation is easier to learn. The rather low regulation performance (percentage of correct trials) of SCP-NF might be an important factor and partly explain the absence of group differences at follow-up for the primary outcome and for teacher ratings, as well as the modest relationship between self-regulation and clinical improvement. Successful self-regulation per se is known to be an important unspecific factor contributing to the clinical outcome in biofeedback treatments. Therefore, the substantially lower reward rates for SCP-NF compared to EMG-BF in this study may have interfered with the specific effects. Still, the influence of regulation performance alone cannot explain the clinical follow-up outcomes, since EMG-BF was not more effective despite superior regulation performance. Future studies should ensure sufficient regulation performance as well as learning and transfer and address the question of why participants show low SCP regulation performance.

In conclusion, the superiority of SCP-NF over the semi-active control group, which was reported in our previous paper, became non-significant 6 months after treatment end, but only the semi-active control group showed a relapse 1 month after treatment. This study adds important outcomes regarding the specificity of SCP-NF and the possible influence of unspecific variables on long-term treatment outcome.