What This Paper Adds

To our knowledge, the current study is the first to examine the effectiveness of a physical exercise intervention on different types of stereotypic behavior in children with autism spectrum disorder (ASD). The overall behavioral benefits of physical exercise in children with ASD have been indicated previously, and this study confirms that finding. Furthermore, we suggest that the behavioral benefits of physical exercise are specific to stereotypic behavior that shares similar biomechanics with the exercise. Our findings provide guidance on the design of exercise programs to reduce specific stereotypic behaviors. Moreover, the current study provides insight into the underlying mechanism by which physical exercise potentially influences stereotypy in the ASD population.

Introduction

ASD is a neurodevelopmental disorder evident from early childhood, and its prevalence has rapidly increased in recent years (World Health Organization 2016). In Hong Kong, the prevalence of ASD among children was reported to be 14.6 per 1000, and this trend is expected to continue (Centre for Disease Control and Prevention, HKSAR 2012). One of the major challenges when working with children with ASD is overcoming their repetitive motor and vocal behavior, which is commonly referred to as stereotypy (American Psychiatric Association 2016). A considerable percentage of children with ASD have been reported to engage in at least one form of stereotypy, with estimates reaching as high as 88% of children with ASD (see Chebli et al. 2016 for a review). Stereotypy is generally defined as behavior that is involuntary, repetitive, maladaptive and inappropriate in nature (Bahrami et al. 2012; Lanovaz et al. 2013; Turner 1999). Examples of stereotypic behaviors include movements of any or all body parts (e.g., body rocking and jumping), hand movements (e.g., hand flapping, waving and non-contextual pointing), object manipulation (e.g., spinning or flipping objects and lining up objects) and vocal stereotypy (e.g., unrecognizable vocalizations and phrasal repetition) (MacDonald et al. 2007). These behaviors have been shown to interfere with children’s social interaction (Jones et al. 1990), communication (Shriberg et al. 2001) and skill acquisition (Ming et al. 2007). For example, children engaged in stereotyped and ritual behaviors were reported to be at an elevated risk for isolation and exclusion from their peers (Church et al. 2000). Additionally, learning performance on a task suffered when children engaged in stereotypic behavior (Morrison and Rosales-Ruiz 1997). Considering these negative behavioral outcomes, it is important to develop effective interventions to alleviate stereotypic behaviors in this population.

In the past few decades, interventions such as sensory integration (Devlin et al. 2011; Smith et al. 2005) and functional communication training (Kennedy et al. 2000) have been proposed to treat stereotypic behaviors in children with ASD (see Turner 1999 for a review). One intervention that has received considerable attention in the field is physical exercise (Bremer et al. 2016; Lang et al. 2010). Many studies have shown that individuals with ASD demonstrate significantly reduced motor stereotypic behaviors after various exercises, including jogging (Kern et al. 2011; Prupas and Reid 2001; Rosenthal-Malek and Mitchell 1997), horseback riding (Kern et al. 2011), dancing (Rosenblatt et al. 2011), stretching (Reid et al. 1988) and martial arts (Bahrami et al. 2012). Different dosage parameters of exercise, such as intensity level (Elliott et al. 1994) and frequency (Prupas and Reid 2001), have also been explored. For example, exercise of moderate to vigorous intensity was shown to be more effective than low-intensity exercise in decreasing the number of occurrences of the stereotypic behavior (Elliott et al. 1994), and exercise at a higher frequency was found to possibly reduce stereotypic behaviors more (58.9%) than exercise at a single frequency (51.6%) (Prupas and Reid 2001). While many studies have reported on the behavioral benefits of physical exercise in children with ASD, the underlying mechanism of such benefits remains unclear. Some have argued that the issue remains unresolved because the sensory stimulation (or sensory consequence) of physical exercise and that of stereotypy are similar (Bremer et al. 2016; Lang et al. 2010).

Stereotypic behavior has long been described as a learned, operant behavior that is maintained by its pleasant sensory consequences (Berkson 1983; Cunningham and Schreibman 2008; Lovaas et al. 1987). For example, a child with ASD may rock their body to obtain vestibular stimulation. In this case, body rocking may be automatically reinforced by its sensory consequence (i.e., vestibular stimulation). Within this theoretical framework of sensory consequence, the stereotypic behavior could be eliminated or replaced by an activity that produces a matched or similar sensory consequence (Berkson 1983; Lang et al. 2010; Rapp et al. 2004). Previous reviews speculated that physical exercise that closely resembles the biomechanics of stereotypic behavior may provide similar sensory stimulation to individuals with ASD, and therefore reduce the stereotypic behavior (Bremer et al. 2016; Lang et al. 2010). However, to the best of our knowledge, no study has examined this speculation.

In the present study, we aimed to examine this speculation by investigating the impact of a ball-tapping exercise on two stereotypic behaviors: repetitive hand flapping and body rocking in children with ASD. Considering the different topographies of these two behaviors (one focused on hand-and-arm movement and one focused on whole-body movement), we hypothesized that ball tapping would be more effective in reducing hand flapping behavior than in reducing body-rocking behavior.

Method

Design

The present study applied a crossover design to test our hypothesis. All participants were exposed to both control and experimental conditions in an A-B sequence with a 1-month wash-out period. Each condition comprised 24 sessions (two sessions per week, 20 min per session). Each session was conducted in the morning between 8:30 a.m. and 9:30 a.m. by a trained research assistant who was assisted by student helpers in a hall/gymnasium of each participating school. The staff-to-participant ratio for both groups was 1:2 to 1:1, depending on attendance. Each participant was assigned a student helper as a partner and this partnership was fixed throughout the study.

Participants

Participants were recruited from three local special schools for children with intellectual disabilities. Written consent was obtained from all participants’ parents/guardians and schools. The study was approved by the university’s ethics committee. The inclusion criteria were: (1) aged 9–12 years; (2) ASD diagnosis from a physician based on the Diagnostic and Statistical Manual of Mental Disorders, 5th edition; (3) non-verbal IQ over 40; (4) ability to follow instructions and perform requested motor skills; (5) no regular participation in physical exercise in the past 6 months outside of physical education classes at their school; and (6) demonstrated motor stereotypies of repetitive hand flapping and body-rocking movement, as confirmed by an independent physician. Participants were excluded from the experiment if they had (1) one or more co-morbid psychiatric disorders; (2) a complex neurological disorder (e.g., epilepsy, phenylketonuria, fragile X syndrome, tuberous sclerosis); or (3) visual or auditory deficits. As a result of screening, 30 participants (22 males and eight females) were recruited for the study. Descriptive demographics of the participants are presented in Table 1.

Table 1 Demographic statistics of participants

Measurements

Stereotypy was measured 1 day before the first session and 1 day after the last session of each condition. Measurement was conducted during school recess, when the participants were allowed to engage in free play and free time in the classroom. Measurement was performed using a video camera and the Gilliam Autism Rating Scale, 3rd edition (GARS-3; Gilliam 2014). The GARS-3 is a survey consisting of 42 items divided into three subscales to measure stereotypic behavior, communication and social interaction. The items are scored on a 4-point Likert scale ranging from (0) “never observed” to (3) “frequently observed”, which means the individual behaves in this manner at least 5–6 times per 6-h period (Anderson-Hanley et al. 2011). For the purpose of this study, only the repetitive behavior scale was used. All participants were video recorded during the first 10 min of the school recess period. The video recordings were divided into 15-s intervals and then coded independently by two raters for any stereotypic behaviors. The stereotypic behaviors observed were coded according to the repetitive behavior scale of the GARS-3 to obtain the scaled score. Higher scaled scores indicate greater severity of stereotypic behavior. In addition, the raters recorded the number of occurrences of both repetitive hand flapping and body-rocking movement. The scaled scores and the frequency counts recorded by raters 1 and 2 were averaged for analysis.
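As a rough illustration of the coding scheme described above, the following Python sketch splits a 10-min recording into 15-s intervals and averages two raters' frequency counts. The function names and the rater counts are hypothetical and are not taken from the study.

```python
def interval_bounds(total_seconds, interval_seconds):
    """Return (start, end) second offsets for consecutive coding intervals."""
    return [
        (start, min(start + interval_seconds, total_seconds))
        for start in range(0, total_seconds, interval_seconds)
    ]


def average_raters(count_rater1, count_rater2):
    """Average the two raters' counts for one participant and one behavior."""
    return (count_rater1 + count_rater2) / 2


# 10 min of recess video divided into 15-s coding intervals.
intervals = interval_bounds(10 * 60, 15)
print(len(intervals), intervals[0], intervals[-1])

# Hypothetical hand-flapping counts from rater 1 and rater 2 (not study data).
print(average_raters(12, 14))
```

Under these assumptions, each recording yields 40 coding intervals per participant per assessment.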

Procedure

The control condition served as a placebo. Participants were seated side-by-side with their partnered student helper while the research assistant read a story aloud to them (see Appendix). This condition served as a control not only for the effect of the exercise intervention, but also for any interaction effect between the participant and the staff. The experimental condition commenced 1 month after the control condition and comprised 15 min of ball tapping and 5 min of seated stretching. In the ball-tapping activity, participants were asked to tap the ball (Little Tikes 10-inch playground ball) as many times as they could. To achieve a consistent period of ball tapping, their teachers used prompts such as verbal reinforcements and verbal cueing. If the student helpers were unable to prompt the participants to tap the ball, an alternate mode of exercise, i.e., throwing the ball, was used. After throwing the ball several times, the participants were asked to return to ball tapping as soon as possible. After 15 min of ball tapping, the participants were asked to perform gentle seated stretching for 5 min before returning to their classrooms. The participants’ parents were reminded to maintain a normal daily routine for their children, without any additional physical activity/exercise program, throughout the entire study period.

Data Analysis

Intraclass correlation coefficients were calculated to examine the reliability between raters’ scaled scores and between their frequency counts of the stereotypic behaviors observed in each video recording. Wilcoxon signed-rank tests were used to compare the scaled score of the repetitive behavior scale from the GARS-3 and the frequency of the different stereotypic behaviors between conditions. Nonparametric tests were used because the assumptions of normality and homogeneity of variance, assessed with Shapiro–Wilk and Levene’s tests, were not met (all ps < 0.05). A generalized estimating equation (GEE) was used to estimate the effects of the different conditions, adjusting for age, sex, and IQ; all results were consistent with the unadjusted analysis. The SPSS-24 statistical program was used, and the level of significance was set at α = 0.05/6 = 0.0083 to control the type I error inflation arising from multiple comparisons.
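The following Python sketch illustrates the two calculations named above: the Bonferroni-adjusted significance level and a Wilcoxon signed-rank Z statistic. It uses the normal approximation with a tie correction and drops zero differences; this is a simplified illustration and not necessarily the exact procedure SPSS applies. The function names and the paired scores are hypothetical, not study data.

```python
import math


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni correction."""
    return alpha / n_comparisons


def wilcoxon_signed_rank_z(before, after):
    """Return (W_plus, W_minus, z) for paired samples.

    Zero differences are dropped, tied absolute differences get average
    ranks, and z uses the normal approximation with a tie correction.
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (min(w_plus, w_minus) - mean) / math.sqrt(variance)
    return w_plus, w_minus, z


def two_sided_p(z):
    """Two-sided p-value of a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


alpha = bonferroni_alpha(0.05, 6)

# Hypothetical pre/post frequencies for eight children (not study data).
before = [12, 13, 11, 14, 15, 12, 13, 16]
after = [10, 10, 10, 9, 9, 5, 5, 7]
w_plus, w_minus, z = wilcoxon_signed_rank_z(before, after)
p = two_sided_p(z)
print(alpha, w_plus, w_minus, z, p, p < alpha)
```

In this hypothetical example every child improves, yet with only eight pairs the p-value falls below 0.05 but not below the corrected threshold, which shows how the correction tightens the criterion. Six happens to be the number of pairwise comparisons among four assessment time points, although reading the divisor this way is an assumption.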

Results

Interrater Reliability

Intraclass correlation coefficients (2, k) were calculated for the scaled scores obtained by the two raters in the different conditions, and the coefficients ranged from 0.93 to 1.0. For the frequency of the two stereotypic behaviors recorded by the two raters, the coefficients ranged from 0.89 to 0.99. Both ranges were satisfactory.
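For readers unfamiliar with the statistic, the sketch below computes ICC(2, k) (two-way random effects, absolute agreement, average of k raters) from a subjects-by-raters table using the standard mean-square formulation. The function name and the ratings are hypothetical and serve only to show the calculation.

```python
def icc_2k(ratings):
    """ICC(2, k) for a table with one row per subject and one column per rater."""
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects (rows), raters (columns) and residual.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)


# Hypothetical scaled scores from two raters for four children (not study data).
example = [[10, 11], [12, 12], [14, 13], [9, 10]]
icc_example = icc_2k(example)

# Identical ratings from both raters give perfect agreement.
icc_perfect = icc_2k([[10, 10], [12, 12], [14, 14]])
print(icc_example, icc_perfect)
```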

Scaled Scores of the GARS-3 Subscale

A Wilcoxon signed-rank test indicated that scaled scores of the GARS-3 repetitive behavior subscale in the post-experimental condition (Mdn = 10.00) were significantly lower than those in the pre-control condition (Mdn = 12.00) (Z = −3.66, p < .001), the post-control condition (Mdn = 13.00) (Z = −3.57, p < .001), and the pre-experimental condition (Mdn = 12.00) (Z = −3.77, p < .001). These findings suggest that the exercise intervention was effective in lowering the severity of stereotypic behaviors. However, no significant difference in the scaled score was found between the pre-control (Mdn = 12.00) and post-control conditions (Mdn = 13.00) (Z = −0.49, p > .05), suggesting that the story-telling activity had no effect on the number of stereotypic behaviors. In addition, no significant difference in the scaled score was found between the post-control and pre-experimental conditions (Z = −1.27, p > .05), suggesting that the number of stereotypic behaviors did not change during the 1-month wash-out period. Changes to scaled scores during the study are shown in Fig. 1.

Fig. 1 Changes of scaled score of the GARS-3 subscale

Hand Flapping Stereotypy

A Wilcoxon signed-rank test indicated that the frequency of hand-flapping stereotypy in the post-experimental condition (Mdn = 10.00) was significantly lower than that in the pre-control condition (Mdn = 13.00) (Z = −3.84, p < .001), the post-control condition (Mdn = 14.00) (Z = −3.68, p < .001), and the pre-experimental condition (Mdn = 12.00) (Z = −3.47, p < .001). These findings suggest that the ball-tapping exercise was effective in reducing hand-flapping stereotypy. Meanwhile, no significant difference in frequency was found between the pre-control (Mdn = 15.00) and post-control conditions (Mdn = 14.00) (Z = −0.19, p > .05), suggesting that the story-telling activity had no effect on the frequency of the stereotypy. In addition, no significant difference in frequency was found between the post-control and pre-experimental conditions (Z = −1.32, p > .05), suggesting that the frequency of hand-flapping stereotypy did not change during the 1-month wash-out period.

Body-Rocking Stereotypy

A Wilcoxon signed-rank test indicated no significant difference in the frequency of body-rocking stereotypy between the post-experimental (Mdn = 12.00) and post-control conditions (Mdn = 13.00) (Z = −1.17, p > .05), between the post-experimental and pre-control conditions (Mdn = 13.00) (Z = −1.95, p > .05), or between the pre-experimental (Mdn = 13.00) and post-experimental conditions (Z = −1.66, p > .05). These findings suggest that the ball-tapping exercise did not effectively reduce body-rocking stereotypy. Likewise, no significant difference in frequency was found between the pre-control (Mdn = 14.00) and post-control (Mdn = 13.00) conditions (Z = −1.45, p > .05), suggesting that the story-telling activity had no effect on the frequency of the stereotypy. In addition, no significant difference in frequency was found between the post-control and pre-experimental conditions (Z = −0.26, p > .05) during the 1-month wash-out period. Changes to these two stereotypic behaviors during the study are shown in Fig. 2.

Fig. 2 Changes of body-rocking behavior and hand-flapping behavior are shown in boxplot with median, percentiles and p-value presented

Discussion

The present study examined, for the first time, the differential impacts of a physical exercise intervention on different stereotypic behaviors in children with ASD. We hypothesized that a ball-tapping exercise, which requires biomechanics similar to those of repetitive hand flapping, would reduce hand flapping more effectively than repetitive body rocking. In agreement with this hypothesis, the frequency of hand-flapping movements was significantly reduced by the exercise intervention. In contrast, the frequency of body-rocking movement was not significantly reduced by the exercise intervention.

One possible explanation for our results relates to the theorized operant nature of stereotypy (Lovaas et al. 1987). According to this theory, stereotyped motor behavior is learned and is maintained by its sensory consequence, which satisfies internal sensory needs in individuals with ASD. Therefore, providing equivalent but alternative sensory stimulation could reduce the stereotypic behavior (Lancioni et al. 2009; Turner 1999). Within this theoretical framework, the similar biomechanics of the ball-tapping exercise and the hand-flapping stereotypy may produce similar pleasant internal consequences that satisfy participants’ sensory needs. As a result, the hand-flapping stereotypy was reduced. In contrast, because the biomechanics of the body-rocking stereotypy (whole-body movement) differed from those required for the ball-tapping exercise, the sensory stimulation provided by this stereotypy may not have been matched by that of the exercise, and therefore no benefit from the exercise was obtained. Nevertheless, in terms of the severity of overall restricted and repetitive behavior, the exercise intervention had a positive influence, reducing the scaled score of the corresponding GARS-3 subscale, a finding consistent with those of previous studies (e.g., Lang et al. 2010; Levinson and Reid 1993; Prupas and Reid 2001).

These findings, although preliminary, are encouraging because they not only confirm the positive impact of physical exercise on stereotypic behavior in children with ASD, but also suggest that exercise should be matched with the stereotypic behavior to yield a significant behavioral benefit. Moreover, these results provide some insight into the mechanism by which exercise impacts behavior and provide valuable information for parents, teachers and practitioners seeking to design more efficient behavioral treatments for children with ASD.

Despite the strengths of this study and its findings, we must address several limitations. First, because the parents did not want us to video-tape their children during class, only a 10-min video was taken at each assessment time point (i.e., pre- and post-tests) to measure the change in frequency of stereotypic behaviors. This may not be sufficient to capture possible daily fluctuation in stereotypic behaviors. Future studies should consider taking at least three more independent video assessments on three occasions at each time point to enhance the reliability of the research findings. Second, only one exercise intervention (i.e., ball tapping) was used to examine two stereotypic behaviors. Without an effective active control, it is unclear whether the reduction is related to the intervention itself or simply to the increased physical activity level of the participants. The reliability of the results would be greatly strengthened if we could show a positive impact of one additional exercise intervention, such as jogging, on body-rocking stereotypy. Third, the sequence of conditions (control and experimental) was not balanced, which weakened the control condition’s role as a control for the interaction effect between the participants and the staff. The interaction effect may exist and could confound the findings of the study. Future studies should balance the condition order. Finally, motor ability was not assessed at baseline. Without this assessment, we do not know whether the children’s baseline motor ability affected their engagement in the ball-tapping exercise intervention, which in turn would affect the effectiveness of the intervention. Future studies should include such an assessment at baseline to address this question.

Conclusion

This preliminary study confirmed the positive influence of physical exercise on stereotypic behavior in children with ASD. It also provided initial evidence that physical exercise should be matched with stereotypic behavior in children with ASD to deliver significant behavioral benefits. Additional research should be conducted to replicate and extend these findings to examine the sustainability of this benefit for children with ASD.