Introduction

Autism spectrum disorder (ASD) is a group of pervasive neurodevelopmental disorders characterized by impairments in communication, social interactions and the presence of restricted and repetitive behaviors (DSM-5; American Psychiatric Association 2013). The main aim of an effective therapeutic intervention for individuals with ASD is to reduce symptom severity, while increasing cognitive functioning and adaptive skills. Over the past decade, naturalistic developmental behavioral interventions (NDBI), which emphasize a child’s early development of social communication by using developmentally appropriate behavioral techniques in a natural environment, have been at the forefront of research based on their positive outcomes (Dawson et al. 2010; Kasari et al. 2006; Koegel et al. 1999; Prizant et al. 2006; Schreibman et al. 2015). However, as the symptoms of ASD are heterogeneous, more research is needed to better understand the mechanisms of successful interventions and to identify which variables predict optimal outcomes. As emphasized by Vivanti et al. (2014), studying which variables predict what outcome is essential to being able to individualize early intervention programs based on a child’s clinical and developmental profile.

Age at Intervention Start

Throughout the literature, there is a consensus that a child’s age at the start of intervention is one of the most decisive variables influencing outcome (Dawson 2008; Flanagan et al. 2012; French and Kennedy 2017; Green et al. 2017; Sullivan et al. 2014; Harris and Handleman 2000; Klintwall et al. 2015; Reichow 2012; Fenske et al. 1985). Most authors speculate that the effectiveness of early intervention in young children with ASD relies on the high cerebral plasticity at this age (Dawson 2008; Ventola et al. 2013). The current recommendation is thus to intervene as early as possible, ideally before 3 years of age (Kasari et al. 2012; Landa et al. 2013; National Research Council 2001; Zwaigenbaum et al. 2015), and if possible before autistic symptoms are fully developed (Green et al. 2017; Rogers et al. 2014).

Intensity of Intervention

In addition to the age at which a child receives intervention, current guidelines advocate that the number of intervention hours received per week, or “intensity”, is important for outcome (Eldevik et al. 2009; Granpeesheh et al. 2009; Linstead et al. 2017a, b; Lovaas et al. 1974), with intensity reported to explain up to 60% of the outcome variance (Linstead et al. 2017a). However, studies do not always report benefits of a higher number of hours of intervention when compared to less intensive therapeutic interventions. For example, among a sample of children receiving a variety of intervention approaches, Darrou et al. (2010) did not identify any significant correlation between the number of hours of intervention and outcome. Similarly, Fernell et al. (2011) did not observe a better outcome in children receiving high-intensity ABA intervention compared to a group receiving lower-intensity ABA-based intervention. Finally, in a meta-analysis, Maw and Haga (2018) suggested that the benefits from more hours of intervention varied from one type of intervention to another, so that the type of intervention should be taken into account when assessing the effect of intervention intensity on outcome. Taken together, these discrepancies among studies suggest that more research is needed to establish a clear relationship between the number of hours of intervention and outcomes.

Cognitive Skills

Another predictor frequently reported as influencing intervention outcome is the child’s level of cognitive functioning at the onset of intervention. Considering that up to 30% of children with ASD have associated intellectual disability (Polyak et al. 2015), and that maladaptive behaviors associated with ASD are also related to lower cognitive functioning (Shattuck et al. 2007; Woodman et al. 2015), this relationship between cognitive skills and outcome appears highly relevant. Numerous studies report that children with higher cognitive skills are more likely to show better outcomes in terms of gains in verbal skills (Anderson et al. 2007), adaptive skills (Fernell et al. 2011; Tiura et al. 2017), a higher rate of attendance at regular school (Harris and Handleman 2000), and greater gains in communication or socio-emotional skills (Tiura et al. 2017) compared to children with lower cognitive skills at baseline. However, the relationship between cognitive functioning at baseline and outcome might be more nuanced. In a meta-analysis, Reed (2016) suggested an inverted U-shaped relationship between IQ levels at baseline and subsequent outcome, whereby studies including children with an average baseline IQ between 50 and 60 showed the largest cognitive or functional gains, while studies comprising children with a mean IQ lower than 40 or higher than 75 reported more modest gains. Taken together, these results suggest a complex relationship between IQ and outcome.

Social Orienting

A characteristic that has been less studied as a potential predictor of outcome, but that is generally acknowledged as a robust biomarker for ASD, is social orienting (Jones et al. 2014; Jones and Klin 2013; Morrisey et al. 2018; Pierce et al. 2011, 2016). Social orienting, or social attention, represents the extent to which the child attends to social information and is generally measured using eye-tracking tools. Pierce et al. (2011, 2016) developed a 1-min visual preference task displaying social versus geometric stimuli, which distinguished between two different patterns of visual exploration among children with ASD: on the one hand, the geometric responders (GR), who spent more time looking at the geometric stimuli, and on the other hand, the social responders (SR), who were more interested in the social stimuli. The authors observed that the GR group exhibited more autism symptoms and weaker cognitive abilities when compared to the SR group (Pierce et al. 2016). Using a similar paradigm, it was recently reported that young SR children showed a more pronounced decrease of autistic symptoms over time than GR children (Franchini et al. 2016, 2018), suggesting that social orienting at baseline could represent a promising predictor of outcome. However, another study measuring social orienting using a different eye-tracking paradigm (Vivanti et al. 2013) did not observe any significant relationship between social orienting and outcome after a year in a group of children who received Early Start Denver Model (ESDM) intervention. Given the important role of social orienting during early development, especially in the development of socio-communicative skills (Franchini et al. 2019; Schietecatte et al. 2011), more research is needed to establish how social orienting at baseline relates to intervention outcome.

European Context

Until relatively recently, most studies on autism intervention have been conducted in the United States, and predictors of intervention outcome have scarcely been studied in a European context. A recent survey highlighted great disparity in the provision of early intervention services across European countries for children under the age of 7 (Salomone et al. 2016). While 64% of the children with ASD received speech therapy, 55% received behavioral intervention and up to 10% of the children did not receive any intervention. The authors showed that the type of intervention received was influenced by the educational level of the parents, the verbal skills of the child, the time elapsed since the child’s diagnosis and the European region where the family resides. While the majority of European studies have focused on the importance of early diagnosis, the implementation of Early and Intensive Behavioral Intervention (EIBI) programs, and their feasibility and results (Colombi et al. 2016; Fernell et al. 2011; Freitag et al. 2012; Remington et al. 2007; Salt et al. 2001; Touzet et al. 2017), only a small number of studies have explored the factors that predict intervention outcome in a European context (Benvenuto et al. 2016; Bieleninik et al. 2017; Narzisi et al. 2015).

This lack of knowledge regarding the efficacy of interventions provided in Europe and their related predictors of outcome encouraged us to conduct the present study. We chose to use an observational approach, as promoted by Benvenuto et al. (2016), Rosenbaum (2010) or Worrall (2007), which allowed us to obtain a more realistic representation of the possible treatment outcome predictors in the French-speaking region around Geneva, Switzerland. We used a group of 60 preschoolers diagnosed with ASD to examine putative outcome predictors described in the literature, such as intensity of intervention, age, cognitive level, and social orienting at baseline. We then explored the relationship between these variables and intervention outcome after 1 year of treatment, measured by the improvement of autism symptom severity and cognitive functioning. We hypothesized that children who were younger, more socially oriented and/or had a higher level of cognitive functioning at intake and who received a more intensive intervention would show a greater decrease in their autism symptoms and better cognitive gains over the first year of treatment.

Method

Participants

The study included a sample of 60 preschoolers with ASD (all males), who were aged 1.6 to 5 years at their first assessment (mean = 3.0 ± 0.8 SD) (see Table 1). All children received a clinical diagnosis of ASD according to the DSM-5 (American Psychiatric Association 2013) before their inclusion in the study. We further confirmed the diagnosis using the Autism Diagnostic Observation Schedule, either the Generic version or the 2nd edition (ADOS; Lord et al. 2000, 2012). The ADOS-2 evaluation consists of a semi-structured assessment of restricted and repetitive behaviors (RRB), communication, and reciprocal social interactions (social affect, SA). Children with known Fragile X, Rett, Phelan-McDermid syndromes or neurofibromatosis, or with major somatic disorders, were excluded. All children received approximately 1 year of early intervention (mean time interval = 12.1 months ± 0.1 SD), at different intensities and with different treatment approaches. In our sample, 22 children received an early and intensive intervention based on the ESDM (Rogers and Dawson 2010), while the remaining 38 children received treatments available in their community (community treatment, CT). It is important to note that in both groups, most of the children received multiple interventions (70% of the total sample). Furthermore, as this study focused on the possible impact of different variables on intervention outcome, we did not include typically developing children as a control group. Lastly, all participants’ parents provided their written consent before the start of the evaluation, in accordance with protocols approved by the institutional review board of the institution where the research was carried out.

Table 1 Sample demographics

Procedure and Measures

First, an initial meeting with each child’s parents was scheduled to explain the research protocol. Before the evaluations started, parents were given a questionnaire to collect information regarding intervention frequency and specifications, along with a written consent form to take part in the study. To assess the symptom severity of RRB, SA and overall ASD symptom levels, we used the ADOS calibrated severity score algorithms (Gotham et al. 2009; Hus et al. 2014). The ADOS calibrated severity scores are divided into an “RRB” severity score, an “SA” severity score and a “Total” severity score. While the RRB and SA severity scores represent distinct symptom measures, the “Total” severity score combines the RRB and SA severity scores in order to estimate an overall symptomatology level. Using these calibrated scores allowed us to compare children with various developmental and language levels (across modules and editions of the ADOS). All ADOS assessments were administered by a trained examiner, videotaped and later rated as a team with at least one examiner who had established research reliability on the ADOS-2. Research reliability was assessed, following common procedures, by reaching an 80% cut-off of similar ratings with a certified trainer. Research-reliable clinicians were not blind to the intervention received, but did not take part in the intervention itself. Additionally, the Psychoeducational Profile, Third Edition (PEP-3; Schopler et al. 2005) was administered to evaluate the developmental profile of the child. The PEP-3 provides a measure of cognitive verbal and preverbal (CVP) skills that we then converted into a developmental quotient (DQ) by dividing the developmental level by the chronological age, as already done in many studies (e.g., Franchini et al. 2018; Kawabe et al. 2016).
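The DQ conversion described above can be illustrated with a minimal sketch. The function name, the use of months as the unit, and the multiplication by 100 (the usual quotient convention) are assumptions made for illustration; the text only states that the developmental level is divided by the chronological age.

```python
def developmental_quotient(developmental_age_months: float,
                           chronological_age_months: float) -> float:
    """Developmental level divided by chronological age.

    The x100 scaling is an assumed convention, not stated in the text.
    """
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * developmental_age_months / chronological_age_months


# Hypothetical child: developmental level of 24 months at 36 months of age.
dq = developmental_quotient(24, 36)
print(f"DQ = {dq:.1f}")
```

Under this convention, a child whose developmental level matches their chronological age obtains a DQ of 100.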

Finally, we used a visual preference eye-tracking task (biological vs. geometric motion) to estimate each child’s level of social orienting (Franchini et al. 2016, 2017, 2018), inspired by the task designed by Pierce et al. (2011). We applied the same metrics as those described in previous studies conducted by Franchini et al. (2016, 2017, 2018). The task consisted of a 1-min split-screen simultaneous presentation of dynamic geometric motion (similar to that of screensavers) on one side, and dynamic biological motion, in the form of videos of children moving around, on the other side of the screen. The task was administered using Tobii Studio software 3.1.6 on a TX300 Tobii eye-tracker system. Children were seated either alone on a chair or on their parent’s lap, at an approximate distance of 60 cm from the screen. After completing a five-point calibration adapted to toddlers, children looked freely at the screen without any prior specific instruction. We drew areas of interest on the videos to delimit biological and geometric motion and identify the participant’s preference. We then derived a percentage of social orienting (using Tobii software 3.1.6) by dividing the time spent fixating biological motion by the total time spent looking at the screen. As in several previous studies (Franchini et al. 2016, 2017, 2018; Pierce et al. 2011, 2016), we split children into two groups: participants looking at the biological stimuli for more than 50% of the total viewing time were categorized as Social Responders (SR), and children looking mostly at the geometric stimuli were considered Geometric Responders (GR). To avoid any bias, participants who looked at the screen for less than 50% of the task were removed from our sample. Participants repeated this protocol approximately 1 year later to measure changes following intervention. For the outcome measures, we calculated a raw change over time for each measure [e.g., (ADOS SA score at Time 2 − ADOS SA score at Time 1)].
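The grouping and change-score rules above can be sketched as follows. This is an illustration only, not the Tobii Studio export or the authors’ pipeline: the function names and input values are hypothetical, the percentage is assumed to be the share of on-screen viewing time spent on biological motion, and a child at exactly 50% is assumed to fall in the GR group.

```python
from typing import Optional

TASK_DURATION_S = 60.0  # the task lasts one minute


def social_orienting_pct(bio_s: float, screen_s: float) -> float:
    """Share of on-screen viewing time spent on biological motion, in percent."""
    return 100.0 * bio_s / screen_s


def responder_group(bio_s: float, screen_s: float,
                    task_s: float = TASK_DURATION_S) -> Optional[str]:
    """Return "SR", "GR", or None when the recording is excluded."""
    # Exclusion rule: the child looked at the screen for < 50% of the task.
    if screen_s < 0.5 * task_s:
        return None
    # More than 50% of viewing time on biological motion -> social responder.
    return "SR" if social_orienting_pct(bio_s, screen_s) > 50.0 else "GR"


def raw_change(score_t1: float, score_t2: float) -> float:
    """Raw change over time: Time 2 score minus Time 1 score."""
    return score_t2 - score_t1


# Hypothetical recordings: (seconds on biological motion, seconds on screen).
for bio, screen in [(30.0, 45.0), (10.0, 40.0), (15.0, 20.0)]:
    print(bio, screen, responder_group(bio, screen))
```

With this sign convention, a negative raw change on an ADOS severity score corresponds to a reduction in symptom severity, whereas a positive raw change in DQ corresponds to a cognitive gain.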

Ultimately, our design included the following five possible predictors of outcome: (1) age at baseline: age at the first visit; (2) intensity of intervention: number of hours per week of intervention the child received during the year; (3) intervention group: dichotomous variable of the intervention received (ESDM or CT); (4) social orienting group: dichotomous variable of social orienting at baseline (SR or GR); and (5) developmental quotient at baseline: cognitive functioning at baseline assessed by the PEP-3 cognitive verbal and preverbal (CVP) subdomain, as described above.

We evaluated these variables to measure their relation to four outcome measures: (1) ADOS RRB change: restricted interest and repetitive behaviors change over the year; (2) ADOS SA change: social communication skills change over the year; (3) ADOS Total change: overall symptom level change over the year; and (4) DQ change: cognitive functioning change over the year.

Analysis Strategy

We performed a repeated measures ANCOVA in order to identify changes over time as a main effect, as well as potential interactions of intervention group and social orienting group with the outcome. To do so, we used severity scores at baseline and severity scores 1 year later as dependent variables, and intervention group (ESDM vs. CT) and social orienting group (SR vs. GR) as between-subject factors. In addition, we controlled for age at baseline, intensity of intervention and developmental quotient at baseline using mean values (see Table 1). The model was thus a 2 (time) × 2 (intervention group) × 2 (social orienting group) repeated measures ANCOVA in which age at baseline, intensity of intervention and developmental quotient were included as covariates. In addition, Bonferroni-corrected pairwise comparisons were used to determine between- and within-group differences. These analyses were performed using IBM SPSS Statistics for Macintosh, Version 24.0 (Armonk, NY: IBM Corp.), and graphs were plotted using GraphPad Prism 7.0a (GraphPad Software, La Jolla California USA, www.graphpad.com) version for Macintosh. All data underwent an outlier identification test using GraphPad Prism 7.0a (ROUT, 1%), a method combining regression and outlier removal (1% corresponding to the false discovery rate; Motulsky and Brown 2006). When more than one variable significantly influenced the outcome, we performed an additional stepwise regression in order to establish a hierarchy between the significant predictors.

Finally, we performed post-hoc analyses to examine whether the inverted U-shaped relationship between IQ and outcome suggested by Reed (2016) could be related to a relationship between DQ scores and the presence of maladaptive behaviors, as we believe that such behaviors could impact the test-taking ability of children. To do so, we used the PEP-3 “Maladaptive behavior” composite score, which evaluates inappropriate social interactions, idiosyncratic language, and restricted and repetitive behaviors. All items are very specific to maladaptive behaviors occurring in ASD and aim to orient diagnosis. We used standard scores to assess the level of maladaptive behaviors at baseline, where lower scores imply more maladaptive behaviors. We used regressions to explore whether the Maladaptive scores at baseline were predictive of the DQ scores at baseline, at T2, and of the change over time. Finally, we used regression to test whether the changes in Maladaptive scores were predictive of the DQ changes over time.

Results

RRB Change

A repeated measures ANCOVA with a Greenhouse–Geisser correction, including age at baseline, intensity of intervention and DQ at baseline as covariates, showed that RRB severity scores did not significantly differ between T1 and T2 (p > 0.05; see Table 2; Fig. 1a). In the overall sample, children’s RRB severity scores remained stable after 1 year of intervention. Moreover, between-subject factors such as intervention group or social orienting group did not impact RRB severity scores at baseline or 1 year later. In other words, children belonging to the CT or ESDM intervention group (see Fig. 1b), or classified as SR or GR (see Fig. 1c), had similar mean RRB severity scores at T1 and T2.

Table 2 Repeated measures ANCOVA including mean age at baseline, intensity of intervention and DQ at baseline as covariates

Fig. 1 Restricted and Repetitive Behavior symptom severity changes over time. a RRB overall mean severity scores at baseline and after 1 year of intervention; b RRB mean severity scores at baseline and after 1 year of intervention by Intervention group (ESDM vs. CT); c RRB mean severity scores at baseline and after 1 year of intervention by Social orienting group (Social responders vs. Geometric responders); d RRB mean severity scores at baseline and after 1 year of intervention in ESDMxGeo, ESDMxSoc, CTxGeo and CTxSoc. Framed values represent results from the ANCOVA; values in the graphs represent pairwise comparisons

Age at baseline, intensity of intervention and DQ at baseline were not predictive of the RRB mean change over time (all p > 0.05; see Table 2). Interactions between the between-subject factors and time were not significant, meaning that the changes observed in mean RRB severity scores from T1 to T2 were statistically equivalent in both intervention groups (p > 0.05; see Table 2) and in both social orienting groups (p > 0.05; see Table 2).

Finally, there was no interaction between time, intervention group and social orienting group (p > 0.05; see Table 2), reflecting the fact that the overall mean RRB change did not differ according to the combination of between-subject factors (intervention group and social orienting group) over time. However, post hoc tests using the Bonferroni correction revealed that children receiving CT and categorized as GR at baseline tended to increase their mean RRB severity scores after 1 year of intervention by an average of 0.811 (p = 0.054; see Fig. 1d), which resulted in a significant average difference of 1.150 (p = 0.048) at T2 between CT children categorized as SR and CT children categorized as GR (see Fig. 1d).

SA Change

A repeated measures ANCOVA with a Greenhouse–Geisser correction controlling for age at baseline, intensity of intervention and DQ at baseline showed that mean SA severity scores did not differ significantly between T1 and T2 (p > 0.05; see Table 2; Fig. 2a). Age at baseline, intensity of intervention and DQ at baseline were not predictive of the SA mean change over time (all p > 0.05; see Table 2).

Fig. 2 Social Affect symptom severity changes over time. a SA overall mean severity scores at baseline and after 1 year of intervention; b SA mean severity scores at baseline and after 1 year of intervention by Intervention group (ESDM vs. CT); c SA mean severity scores at baseline and after 1 year of intervention by Social orienting group (Social responders vs. Geometric responders); d SA mean severity scores at baseline and after 1 year of intervention in ESDMxGeo, ESDMxSoc, CTxGeo and CTxSoc. Framed values represent results from the ANCOVA; values in the graphs represent pairwise comparisons

However, we identified a significant interaction between time and the intervention group (F(1,53) = 5.072, p = 0.029; see Table 2; Fig. 2b), suggesting that the intervention received had an impact on the changes observed in SA severity scores over time. Bonferroni-corrected post-hoc tests indicated no significant differences between the intervention groups’ mean SA severity scores at T1 or at T2, but a significant change over time was observed in children receiving ESDM-based intervention, with an average change of −2.114 (p = 0.003; see Fig. 2b). There was no significant interaction between time and social orienting group (p > 0.05; see Table 2; Fig. 2c). SA mean severity scores were equivalent at baseline between SR and GR and did not differ after 1 year of intervention, despite a significant average change of −1.266 (p = 0.004; see Fig. 2c) over time in the SR group.

Finally, there was no significant interaction between time, intervention group and social orienting group (p > 0.05; see Table 2; Fig. 2d). Bonferroni-corrected post-hoc tests indicated no differences at T1 or T2 between any combinations of between-subject factors (see Fig. 2d). However, they indicated that the mean SA severity scores of children receiving ESDM-based intervention and belonging to the SR group changed significantly, by −2.665 (p = 0.001; see Fig. 2d), after 1 year of intervention.

Total Change

A repeated measures ANCOVA with a Greenhouse–Geisser correction controlling for age at baseline, intensity of intervention and DQ at baseline showed no significant difference between T1 and T2 regarding mean Total severity scores (p > 0.05; see Table 2; Fig. 3a).

Fig. 3 Total symptom severity changes over time. a Total overall mean severity scores at baseline and after 1 year of intervention; b Total mean severity scores at baseline and after 1 year of intervention by Intervention group (ESDM vs. CT); c Total mean severity scores at baseline and after 1 year of intervention by Social orienting group (Social responders vs. Geometric responders); d Total mean severity scores at baseline and after 1 year of intervention in ESDMxGeo, ESDMxSoc, CTxGeo and CTxSoc. Framed values represent results from the ANCOVA; values in the graphs represent pairwise comparisons

Age at baseline, intensity of intervention and DQ at baseline were not predictive of the mean Total change over time (all p > 0.05; see Table 2). However, we identified a trend toward an interaction between time and the intervention group (F(1,53) = 3.834, p = 0.056; see Table 2; Fig. 3b), implying that the type of intervention received slightly impacted the mean change observed in Total severity scores over time. Bonferroni-corrected post-hoc tests indicated that there were no differences between intervention groups at T1 or T2, but children receiving ESDM-based intervention significantly decreased their mean Total severity scores over time, with an average change of −1.587 (p = 0.016; see Fig. 3b). Regarding the social orienting groups, there was no interaction with time (p > 0.05; see Table 2; Fig. 3c). Despite the absence of interaction and no significant differences at T1 or T2 between SR and GR, post-hoc tests using the Bonferroni correction revealed that SR children exhibited a significant change of −0.945 (p = 0.019; see Fig. 3c) in their Total severity scores over time.

Finally, there was no significant interaction effect between time, intervention group and social orienting group on the mean Total severity scores (p > 0.05; see Table 2; Fig. 3d). Pairwise comparisons did not identify any differences between groups at T1 or T2 (all p > 0.05). However, SR children who received ESDM-based intervention experienced a significant change over time of −2.055 (p = 0.005; see Fig. 3d) in their mean Total severity scores.

DQ Change

A repeated measures ANCOVA with a Greenhouse–Geisser correction controlling for age at baseline, intensity of intervention and DQ at baseline indicated a significant 10.410 increase of DQ mean scores over time (F(1,53) = 20.927, p < 0.001; see Table 2, Fig. 4a). Children included in the study improved their DQ by an average of 10.4 points during their first year of intervention.

Fig. 4 Developmental Quotient changes over time. a DQ overall mean at baseline and after 1 year of intervention; b DQ mean at baseline and after 1 year of intervention by Intervention group (ESDM vs. CT); c DQ mean at baseline and after 1 year of intervention by Social orienting group (Social responders vs. Geometric responders); d DQ mean at baseline and after 1 year of intervention in ESDMxGeo, ESDMxSoc, CTxGeo and CTxSoc. Baseline values represent the mean DQ while the changes represent the mean change for each group. Framed values represent results from the ANCOVA; values in the graphs represent pairwise comparisons

Age at baseline significantly predicted DQ change over time (F(1,53) = 4.881, p = 0.032; see Table 2; Fig. 5a). DQ at baseline also predicted DQ change over time (F(1,53) = 15.415, p < 0.001; see Table 2; Fig. 5b). Intensity of intervention did not impact the DQ change over time (p > 0.05). A stepwise regression indicated that the combination of both predictors yielded a significant model (F(2, 58) = 3585.335, p < 0.001) with an R2 of 0.260. The resulting regression equation was: DQ change = 62.228 − 8.083 × Age (SE = 3.057) − 0.396 × DQ at baseline (SE = 0.101). DQ change was best explained by DQ level at baseline (R2 = 0.169, p = 0.001), followed by age at baseline (R2 = 0.091, p = 0.011). There was no significant interaction between time and intervention group (p > 0.05), meaning that the mean DQ change over time was equivalent between CT and ESDM. Intervention groups did not differ in mean DQ scores at T1 or T2, despite a significant increase of 6.328 (p = 0.038; see Fig. 4b) in DQ mean scores over time for children in the ESDM-based intervention group. Regarding the social orienting groups, there was no interaction with time (p > 0.05; see Table 2; Fig. 4c). In addition, the SR and GR groups did not show any differences at T1 or T2 regarding their mean DQ scores (all p > 0.05), but both showed a significant increase after 1 year of intervention (SR = 12.841, p = 0.002; GR = 8.356, p = 0.042; see Fig. 4c).
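As a worked illustration of the stepwise regression equation reported above, the following sketch plugs values into the fitted coefficients. The coefficients come from the text; the function name, the assumption that age is expressed in years, and the example child are hypothetical and serve only to show how the two predictors combine.

```python
# Coefficients of the reported stepwise regression for DQ change.
INTERCEPT = 62.228
B_AGE = -8.083          # per unit of age at baseline (assumed to be years)
B_DQ_BASELINE = -0.396  # per DQ point at baseline


def predicted_dq_change(age_years: float, dq_baseline: float) -> float:
    """Predicted DQ change over 1 year from age and DQ at baseline."""
    return INTERCEPT + B_AGE * age_years + B_DQ_BASELINE * dq_baseline


# Hypothetical child: 3.0 years old with a baseline DQ of 70.
print(round(predicted_dq_change(3.0, 70.0), 1))

# Younger age and lower baseline DQ both raise the predicted gain.
print(predicted_dq_change(2.0, 70.0) > predicted_dq_change(3.0, 70.0))
print(predicted_dq_change(3.0, 50.0) > predicted_dq_change(3.0, 70.0))
```

Under these assumptions, the hypothetical child has a predicted gain of roughly 10 DQ points. Because both coefficients are negative, the predicted gain shrinks as age or baseline DQ increases.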

Fig. 5 Regressions between significant predictors and DQ change. a Age at baseline and DQ change; b DQ at baseline and DQ change

Finally, there was no interaction between time, intervention group and social orienting group (p > 0.05; see Table 2; Fig. 4d). Pairwise comparisons did not identify significant differences between groups at baseline or after 1 year of intervention (all p > 0.05), but there was a significant DQ increase of 4.441 (p = 0.040; see Fig. 4d) in SR children receiving ESDM-based intervention.

Finally, levels of maladaptive behaviors at baseline were significantly correlated with and predictive of the DQ scores at baseline, such that lower maladaptive scores, i.e. more maladaptive behaviors, were associated with lower DQ scores at baseline (r = 0.66, R2 = 0.43, p < 0.001; see Fig. 6a). In addition, levels of maladaptive behaviors at baseline were significantly correlated with and predictive of the DQ scores after 1 year of intervention (T2), such that lower maladaptive scores, i.e. more maladaptive behaviors, were associated with lower DQ scores after 1 year of intervention (r = 0.55, R2 = 0.30, p < 0.001; see Fig. 6b). In other words, levels of maladaptive behaviors were predictive of cognitive scores both at baseline and after 1 year of intervention. However, we did not identify any relationship between the levels of maladaptive behaviors at baseline and the change in DQ over time (r = −0.11, R2 = 0.01, p > 0.05; see Fig. 6c). These results suggest that despite a relationship between maladaptive behaviors and DQ scores, children might experience a large change in DQ scores regardless of their initial levels of maladaptive behaviors. However, it appears that the children who experienced the greater increase in their DQ scores over time were the ones who also greatly reduced their levels of maladaptive behaviors over time (r = 0.36, R2 = 0.13, p = 0.007; see Fig. 6d).

Fig. 6 Association between DQ and Maladaptive behaviors. a Regression and correlation between Maladaptive behaviors and DQ scores at baseline; b Regression and correlation between Maladaptive behaviors scores at baseline and DQ scores at T2; c Regression and correlation between Maladaptive behaviors scores at baseline and DQ change over time; d Regression and correlation between Maladaptive behaviors and DQ changes over time

An additional post-hoc power analysis was conducted using the software package, G*Power3 (Faul et al. 2007). The sample size of 60 was used for the statistical power analyses, number of groups was 4 and 8, when looking at main effects and interactions respectively, with 3 covariates included in the model and using an α of 0.05. The recommended effect sizes used for this assessment were as follows: small (f = 0.10), medium (f = 0.25), and large (f = 0.40) (Cohen 1988). The post hoc analyses revealed the statistical power for this study was 0.12 for detecting a small effect, 0.48 for detecting a medium effect and 0.86 for detecting a large effect size. In consequence, there was adequate power (> 0.80) at the large effect size level but not enough statistical power for the small to moderate effect size level. Additional power analysis using similar parameters showed that, in order to reach a power of 0.80 for small and medium effects, sample size should increase up to 787 and 128 participants respectively.

Discussion

This study aimed to explore predictors of intervention outcome in a European context. In line with numerous studies advocating for early and intensive intervention, we observed that access to a comprehensive program, such as the ESDM, was the main predictor of decreased socio-communicative deficits after 1 year of intervention (Fig. 2b). Taken as individual factors, neither a higher number of hours nor a younger age at baseline showed a significant impact on outcome. This might suggest that, in order to be effective, an intervention should combine both parameters. In addition, we observed that a gain in cognitive skills was best predicted by a combination of lower DQ and younger age at baseline (Fig. 5a, b). We suggest that this greater cognitive gain for children with lower DQ at baseline can be explained by the fact that they have a wider margin for progress. Furthermore, we hypothesize that better cognitive scores at follow-up could also rely on a reduction of maladaptive behaviors (Fig. 6). Finally, despite the fact that our main ANCOVA model did not identify social orienting as a significant predictor of outcome, results from our pairwise analyses suggest that social orienting had a meaningful impact on outcome. Indeed, only SR children showed a decrease in their autism symptoms (Fig. 3c), led by improvements in the social affect domain (Fig. 2c). Further, we observed that a child’s social orienting potentiates the effect of early and intensive intervention; only SR children receiving ESDM showed a significant decline in autistic symptoms (Figs. 2d, 3d) or cognitive gains over time (Fig. 4d).

Early and Intensive Intervention

Our results showed that receiving a comprehensive early and intensive intervention program (here, the ESDM) was the best predictor of ASD symptom decrease over time, providing further support for the well-established finding that early and intensive intervention is critical for therapeutic outcome (Elder et al. 2016; Eldevik et al. 2009; Fenske et al. 1985; Flanagan et al. 2012; French and Kennedy 2017; Klintwall et al. 2015; Linstead et al. 2017a, b; Mathews et al. 2018; Stahmer et al. 2019). However, our results do not support a higher number of hours of therapy as a standalone predictor of better intervention outcome. This result is interesting considering that Rogers et al. (2012) suggested that interventions should combine both early and intensive factors in order to be effective. This combination of factors is supported by previous studies linking the larger gains occurring in early and intensive intervention with higher levels of brain plasticity during critical developmental windows (Dawson 2008; Pascual-Leone et al. 2005; Sullivan et al. 2014). The importance of higher intervention intensity during a critical developmental period is supported by the results of Granpeesheh et al. (2009), who showed a positive relationship between the number of hours received and treatment outcome for children between 2 and 5 years of age but not for children older than 7 years. Together, previous and present results suggest that a younger age at baseline or a more intensive intervention, taken independently, might have a moderate effect on outcome, whereas a combination of both factors could have a stronger influence on outcome by taking advantage of a critical developmental window. Finally, our results highlight specific improvement in social communication skills in the group receiving early and intensive intervention, but not in the CT group. We hypothesize that this specific gain might be associated with the specificity of the ESDM intervention.
Indeed, ESDM was developed as an ASD-specific intervention, targeting all areas of development, with particular attention to social communication (e.g., joint attention, non-verbal communication and imitation; Rogers and Dawson 2010), which is particularly altered in ASD (Mundy 1995; Thorup et al. 2018). ESDM is also a manualized, data-driven approach, where all therapists work in a systematic way to target common developmental objectives specific to the needs of the child. Emphasizing social communication in a coordinated and systematic way during a period of early brain development may be key to improving core symptoms of autism, whereas nonspecific interventions, such as those provided in the CT group, may target more transversal skills (such as language skills in speech therapy; see Ganz and Simpson 2004) and have a more diffuse effect on core features of autism.

Lower Cognitive Skills and Younger Age at Baseline are Associated with Greater Cognitive Gains Over Time

In the present study, we observed that children with lower cognitive levels at baseline showed larger gains in their cognitive abilities over time than children who had higher cognitive levels at baseline. While numerous studies report that children with higher cognitive scores at baseline are more likely to have a better outcome (Anderson et al. 2007; Fernell et al. 2011; Harris and Handleman 2000; Tiura et al. 2017), a meta-analysis by Reed (2016) suggests a more complex relationship between baseline IQ and cognitive gain. Similar to our results, Reed (2016) observed a negative relationship between baseline IQ and cognitive gain over the baseline IQ range of 50 to 80, whereby children with the lowest baseline IQ scores showed the largest cognitive gain over time. As suggested by Reed (2016), this phenomenon could be explained by the fact that children with low cognitive functioning at baseline were also the ones with the most room for progress, compared to children with high levels of cognitive functioning at baseline. In addition, we investigated whether this larger cognitive gain observed in children with lower baseline functioning might be explained by improvement in their behavior and a subsequent improvement in test-taking ability. Results from our post hoc analyses showed that, before intervention, the level of maladaptive behaviors was predictive of mean DQ scores, whereby more maladaptive behaviors were associated with lower DQ, potentially because maladaptive behaviors impaired children’s ability to follow instructions and respond adequately during testing. Furthermore, levels of maladaptive behaviors at baseline did not predict change in DQ, meaning that children with both high and low levels of maladaptive behaviors could improve their cognitive skills, while children showing a larger decrease in maladaptive behavior were the ones who had more cognitive gain over time.
As such, we hypothesize that, at this age, reducing maladaptive behaviors may result in substantial cognitive gains through the improvement of test-taking ability. These results are also consistent with several studies reporting a relationship between lower IQ and the presence of more maladaptive behaviors (Shattuck et al. 2007; Woodman et al. 2015), even within a non-autistic population (Ando and Yoshimura 1978).

Social Orienting: A Potential Outcome Contributor

Finally, we did not find social orienting at baseline to be a significant predictor of outcome. However, our results suggest that levels of social orienting at baseline may predict dissimilar symptom patterns between subgroups, whereby children who preferred geometric stimuli tended to have increased levels of RRB symptoms and children who preferred the biological stimuli showed a decrease in levels of SA symptoms 1 year after the start of intervention. These results are in line with previous studies looking at developmental trajectories using similar tools (Franchini et al. 2016, 2018; Pierce et al. 2011). Franchini et al. (2016) showed that Social Responders tended to increase their social abilities, resulting in a decrease in their autistic symptoms after 1 year of intervention, whereas Geometric Responders tended to stay stable or show an increase in their autistic symptoms. Our results support the social motivation theory (Chevallier et al. 2012a, b; Dawson 2008; Klin et al. 2002; Mundy 1995), which holds that a deficit in early social attention has a cascading effect on the development of a child with autism, leading to autistic symptomatology. Indeed, typically developing children have been shown to automatically orient to social cues during early childhood (Morrisey et al. 2018), whereas children with ASD appear to orient less to these cues (Franchini et al. 2016; Jones et al. 2014; Jones and Klin 2013; Morrisey et al. 2018; Pierce et al. 2011, 2016). This early difference is commonly explained by a lack of reward perceived from social cues by individuals with ASD (Chevallier et al. 2012a, b). Consequently, as children with ASD pay less attention to social cues, they may also miss learning opportunities (Dawson et al. 1998; Franchini et al. 2019).
As with previously cited studies, our results support the idea that children who showed more interest in social stimuli before intervention had a faster rate of improvement in their social communication skills, whereas children who were mostly attracted by the geometric stimuli at baseline showed an increase in repetitive behaviors over time, potentially leading them to miss crucial learning opportunities while their attention was more focused on non-social stimuli. Furthermore, our results showed that, despite a non-significant interaction between the type of intervention and social orienting group factors, the only children making significant progress in both ASD symptomatology and cognitive levels were those who received an early and intensive intervention program and who were already Social Responders at baseline. Children receiving a similar intervention but who were Geometric Responders improved their ASD symptoms at a slower pace, and this improvement did not reach significance. On a speculative basis, we could extend these results to the question of “timing of treatment response” raised by Vivanti et al. (2014) in their theoretical paper. Vivanti et al. (2014) highlighted the lack of knowledge regarding the timing of treatment response and questioned whether children who do not respond to an intervention during the first year would show significant changes during the following year. We thus speculate that SR children at baseline would show an earlier response to treatment, especially in an ESDM-based intervention, which emphasizes social engagement as a main principle of intervention (Rogers and Dawson 2010), as the SR child may be more inclined to engage in social interactions than a GR child. In line with this hypothesis, GR children might take more time to benefit from the intervention, as they are less likely to socially engage at baseline.
Consequently, increasing their social orienting level as a first step of intervention might elicit better subsequent progress in the following period, which could be confirmed by further exploring the association between both parameters (Type of intervention × SO) over an extended time frame.

Limitations

One limitation of our study is that our sample includes only males with ASD. We chose to exclusively include males for multiple reasons. First, autism affects more males than females, with a sex ratio of approximately 4:1 (Christensen et al. 2016). While we could have included females at an equivalent sex ratio of 4:1, important phenotypic differences between males and females have been identified in the literature (Frazier et al. 2014), including more difficulties in social communication, lower cognitive abilities, less-restricted interests and less-developed adaptive behaviors in females, as well as underlying genetic differences (Chen et al. 2017). Considering these sex differences, we chose to exclude females from our sample to avoid any misleading effects that could arise from this sex-specific phenotype.

Another limitation lies in our statistical analyses, which were based on a limited sample size, as illustrated by the power analysis described above. These analyses showed that a larger sample size, up to 787 participants, may be needed to reach high statistical power for small effect sizes, which is far beyond the number of participants currently included in our longitudinal protocol. Consequently, and despite our efforts to control for covariates and to apply corrections for multiple comparisons, these results should be considered preliminary and interpreted with caution.

Moreover, we decided to focus our analyses on a set of selected predictors, but many other predictors could have been explored, such as parental involvement (Chen et al. 2017; Narzisi et al. 2015). Similarly, other outcome measures could have been taken into consideration, such as quality of life, as suggested by Bieleninik et al. (2017), or necessary skills for future functioning and stress reduction, as suggested by parents reviewed by McConachie et al. (2018). We chose our predictors because most of them were widely acknowledged throughout the literature but had not yet been examined in a European context. However, future research should explore alternative predictors and outcome variables.

While observational studies might provide a more naturalistic assessment of the outcome of currently available interventions than RCTs, many parameters remained uncontrolled and unmatched (e.g., hours of treatment) between our intervention groups. As a result, our results should not be used to favor one intervention over the other (ESDM vs. CT).

Finally, as mentioned in the Method section, raters were not blind to the intervention received but did not take part in the intervention. In the context of our study, blinding was not possible for the examiners, as issues related to the type and intensity of intervention often came up in discussions with the families (e.g., when scheduling appointments, or when families asked for advice). While some prospective studies have achieved blinding by hiring a naïve rater who assessed the children at several timepoints (see Bieleninik et al. 2017 for a review), we did not consider this option when we started this study.

Conclusion

The present study brings additional support for early and intensive intervention to reduce autistic symptoms and improve cognitive levels. The importance of improving early screening for ASD and increasing access to comprehensive early intervention programs in Europe is evident. Furthermore, this study showed that cognitive gains over time are mostly demonstrated by children with lower cognitive levels at baseline, especially when maladaptive behaviors are reduced over time. Finally, this study provides support for the use of eye-tracking as a promising tool to distinguish between subgroups of children who might show different trajectories of their autistic symptoms over time and who respond differentially to specific types of intervention. Our study provided preliminary data suggesting that children who are more socially engaged at baseline might respond faster to interventions which emphasize socio-communicative interaction, compared to children who are less interested in social stimuli. Further studies should explore whether or not increasing social orienting is associated with subsequent clinical improvement over time.