Arousal may be defined as the “degree of feeling stimulated” (Bolte et al. 2008, p. 776) and is considered a complex and difficult construct to study or operationalize as it encompasses feelings, behaviors, and physiology (Mayes 2000). Physiologically, arousal refers to activation of the central and peripheral nervous systems which can be examined through measures of heart rate (HR), blood pressure, skin conductance, muscle tone, respiration, and levels of certain neurochemicals (Humphreys and Revelle 1984). Physiological arousal occurs in response to stimulation that can come from the external world or from the individual’s own body (Humphreys and Revelle 1984), and affects a variety of important executive functions including information processing, attention, learning, memory, and stress reactivity (Mayes 2000). During the early years, individuals typically learn to regulate their arousal levels in response to stimulation and to achieve a balance that allows them to optimally perform executive functions (Mayes 2000).

For many years, researchers have proposed that physiological arousal, or rather the regulation of physiological arousal, is abnormal in individuals with Autism Spectrum Disorder (ASD; Edelson 1984; Helt et al. 2008; Hutt and Hutt 1965; Hutt et al. 1964; Leekam et al. 2011). This theory was first put forward by Hutt et al. (1964), who claimed that individuals with ASD experienced hyper-arousal which led to their increased reactivity to environmental stimuli, their failure to habituate appropriately to these stimuli, and their avoidance of unfamiliar situations and novelty. These hypotheses arose from observations that children with ASD showed greater EEG activation than typically developing children and displayed more stereotypy in response to greater environmental stimulation, and that EEG activation correlated with stereotypy. More recently, Helt et al. (2008) suggested that those diagnosed with ASD may receive an overwhelming amount of sensory input from the environment, which produces uncomfortable levels of physiological arousal and results in the disengagement and lack of interest in surroundings and social interactions that is characteristic of ASD.

Hyper-arousal alone does not explain many of the atypical behaviors that are common in ASD. Hence, it has been proposed that arousal in the population is not homogeneous and that certain individuals may experience hypo-arousal, in which arousal is low, little attention is paid to external stimuli in the environment, and autonomic activity is minimal (Leekam et al. 2011; Schoen et al. 2008). It has been suggested that some behaviors emitted by individuals with autism, such as stereotypy and self-injurious behavior (SIB), can be explained by an increased sensory threshold that causes individuals to create their own forms of stimulation so that they are adequately aroused (Edelson 1984). Thus, the anxious, tense feelings characteristic of hyper-arousal and the lethargic, dulled feelings characteristic of hypo-arousal may each contribute to many of the difficulties experienced by individuals with ASD (Goodwin et al. 2006).

Such theories have prompted investigations of physiological arousal in those with ASD in response to different stimuli and situations. Studies have identified significantly different levels of arousal among those with ASD, when compared to typically developing controls, in response to neutral stimuli and stimuli designed to provoke emotions (Bolte et al. 2008), sensory stimuli (Woodard et al. 2012), differentially arousing conditions (Graveling and Brooke 1978), mental tasks (Toichi and Kamio 2003), psychosocial stress (Jansen et al. 2006), and stressful situations (Goodwin et al. 2006). Schoen et al. (2008) identified a low arousal group, which included children who were experiencing hypo-arousal and who were slow to react, and a high arousal group, which included children who were hyper-aroused and quick to react, among their sample of children with high-functioning autism and children with Asperger’s disorder. Kootz et al. (1982) found that physiological arousal in response to a stressor task, during interactions with others, and while at rest, was normal among higher functioning individuals with ASD, while lower functioning individuals showed much greater cardiovascular arousal during the same situations, suggesting that an individual’s level of functioning may affect their physiological responsivity. However, not all studies provide evidence of altered physiological reactivity. Sigman et al. (2003) reported no significant differences in physiological arousal, measured using HR, between individuals with ASD and typically developing controls in response to emotional videos, interactions with unfamiliar individuals, or separation from parents. Furthermore, findings that individuals with ASD present with significantly higher baseline HR that makes further HR increases unlikely may explain the reduced responsivity of individuals with ASD shown in several studies (Goodwin et al. 2006; Toichi and Kamio 2003). In their review of the literature, Rogers and Ozonoff (2005) concluded that there are a number of studies that support the hypothesis that arousal levels are atypical within this population, with greater evidence in favour of the hypo-arousal theories of ASD rather than the hyper-arousal theories.

The importance of research on physiological arousal in those with ASD is increased by the hypothesised relationship between challenging behavior and levels of arousal. Few studies have examined the relationship between physiological arousal and SIB. Freeman et al. (1999) measured the HR and blood pressure of two adults with severe intellectual disabilities as they engaged in their everyday routines. It was found that incidents of SIB were typically followed by an increase in HR. This pattern was present for both participants, even when an operant function of the SIB had been identified, and SIB appeared to be reinforced by a subsequent increase in internal arousal. Barrera et al. (2007) included measures of HR during functional behavioral assessments of three individuals with developmental disabilities, two of whom had a diagnosis of ASD or pervasive developmental disorder not otherwise specified. A distinct HR pattern consisting of an increase in HR, either pre-SIB or during SIB, followed by a decrease in HR following the incident was observed for each individual. These HR changes were not attributable to respiration, movement, or the external factors manipulated as part of the functional analysis. This pattern appeared almost the opposite of that observed by Freeman et al. (1999); however, the discrepancy could be attributed to differences in equipment sensitivity, the inclusion of participants with ASD, or individual differences in baseline physiological arousal. Hoch et al. (2010) utilised a measure of HR during an analysis of activity choice and observed that SIB was significantly more likely to occur when the participant was engaging in highly arousing activities than when they were engaged in less arousing activities. Theories have been developed to explain findings such as these. Romanczyk et al. (1992) proposed an “operant-respondent model” of SIB whereby SIB is elicited by physiological arousal but shaped and maintained by the environmental consequences that it produces. Later, Brain et al. (1998) put forward a “tension reduction” hypothesis which postulates that SIB is negatively reinforced as its occurrence reduces internal arousal.

Researchers have also suggested that stereotypy, a form of challenging behavior characteristic of ASD, may be related to levels of arousal. Hutt and Hutt (1965) first proposed that stereotypy occurs during periods of high arousal and functions to block new sensory input that would serve to increase levels of arousal. Hutt et al. (1975) suggested that increased HR variability noted during occurrences of stereotypy (Hutt et al. 1975; Lewis et al. 1984), and HR decreases observed following stereotypy (Hutt et al. 1975), support the hypothesis that stereotypy is a mechanism for arousal modulation. Willemsen-Swinkels et al. (1998) found that the physiological effects of stereotypy differed according to the mood state accompanying the behavior. Stereotypy during periods of calm had no effect on HR, stereotypy during periods of excitement resulted in an increase after onset, and stereotypy during periods of distress was preceded by an increase in HR and led to an HR decrease. The authors suggest that stereotypy serves different functions during different moods: releasing excess energy and excitement during periods of happiness, serving a social function during periods of calm, or reducing arousal during periods of distress. Taken together, these studies suggest that stereotypy may act as a coping strategy by allowing individuals with ASD to modulate levels of arousal that are uncomfortable.

Further research on the relationship between physiological arousal and challenging behavior in ASD is important as earlier studies are limited due to the changed DSM criteria for ASD, poorly described samples, less sophisticated measurement equipment, the non-inclusion of baseline measures of autonomic activity, and the failure to control for factors such as comorbid psychopathology and psychotropic medication (Goodwin et al. 2006; Rogers and Ozonoff 2005). The purpose of this naturalistic research study was to examine, via HR measurement, whether occurrences of challenging behaviors, including SIB and stereotypy, were related to particular patterns of physiological arousal in three children diagnosed with ASD.

Method

Participants and Setting

The participants were three males, aged between 10 and 16 years, with a diagnosis of ASD. Each participant was selected because they frequently engaged in challenging behavior, primarily in the form of SIB and stereotypy. Each of the participants attended a special school that delivered ABA educational interventions.

Participant 1 was a 12-year-old boy, diagnosed with ASD, a moderate intellectual disability, and cranial frontal nasal dysplasia. He also presented with hypotonia. His weight was 37 kg and his height was 153 cm. Participant 1 presented with a history of SIB, tantrum behaviors, and stereotypy. These behaviors were reported in both school and home settings. His behavior intervention plan included differential reinforcement of other behaviors and self-management (SM) techniques to treat his SIB and his tantrum behaviors. Staff reported that these interventions had led to significant reductions in the frequency and duration of these behaviors with a corresponding increase in SM behaviors observed. No behavioral interventions were in place for his stereotypy.

Participant 2 was a 10-year-old nonverbal boy, who had been diagnosed with ASD and a severe intellectual disability. His weight was 37 kg and his height was 142 cm. Parents, and school staff, reported that Participant 2 engaged in several challenging behaviors including SIB, destructive behavior, and stereotypy. His behavior intervention plan included extinction, in the form of planned-ignoring, to treat his self-injurious behavior. His destructive behavior (DB) was being treated using response cost, whereby he lost access to a preferred reinforcer following the occurrence of DB. No interventions were in place for his stereotypy.

Participant 3 was a 16-year-old nonverbal boy, who had been diagnosed with ASD and a significant cognitive delay. His weight was 51.5 kg and his height was 163.5 cm. He presented with frequent stereotypy and SIB at school and at home. His behavior intervention plan included extinction, in the form of a hand protector, as a behavioral intervention for SIB (self-biting). This hand protector was available at all times but was not always worn during instances of hand-biting. No behavioral interventions were in place for stereotypy.

For each participant, data collection was conducted over the course of five schooldays within their classroom. Behavioral and cardiovascular data were collected throughout typical school activities including 1:1 teaching sessions, classes with peers, speech and language therapy, and meal times. Data were not collected during physical education classes, self-care routines, showers, or community outings.

Behavioral and Heart Rate Recording

The occurrence of a target behavior (TB), topography of the behavior, time of occurrence, presence or absence of noise, and severity of the observed behavior, rated on a three-point scale of mild (1), moderate (2), or severe (3), were recorded by the observer assigned to each participant and by a second observer during periods of inter-observer reliability. To record the time of occurrence of TBs, both observers used small handheld stopwatches that displayed time in hours, minutes, and seconds. TBs were identified through consultation with the school’s director of education, the school’s consultant behavior analyst, and the participant’s key instructor. Operational definitions for the TBs of each participant are presented in Table 1. Although not classified as challenging behavior, Participant 1’s SM behaviors were also included within the analysis. These included behaviors such as hands covering ears and use of an oral-chewy tube which had replaced SIB and tantrum behaviors as the primary response to distressing experiences (primarily loud noises). It was hypothesized that if SIB and tantrums were accompanied by a specific HR pattern, as found in previous studies (Barrera et al. 2007; Freeman et al. 1999), SM behaviors might also be accompanied by the same HR waveform, and that this participant might have learned alternative behavioral responses to such HR changes. Participant 2’s DB was also included in the analysis as it was identified by staff at his school as a challenging behavior with no clear function. As DB occurred without apparent antecedent and was proving resistant to behavioral intervention, it was hypothesised that this behavior might be related to internal physiological arousal and thus it was included within the analysis.

Table 1 Operational Definitions for Each Participant’s Target Behaviors

A second observer was present during 48.5 % of the total recording time for Participant 1, 71.5 % of the total recording time for Participant 2, and 54.8 % of the total recording time for Participant 3. Interobserver agreement was measured by comparing observers’ responses on measures of topography of target behavior, presence or absence of noise, and severity of the behavior. Percentage agreement was calculated by dividing the number of agreements by the number of agreements plus disagreements and multiplying this by 100. For occurrences recorded by both observers, percentage agreement for each of these variables was 100 %. To compare the accuracy of observers’ recording of time of occurrence of target behaviors, Cohen’s Kappa (a measure of agreement that takes the possibility of chance agreements into account; Cohen 1960) was calculated. The proportion of agreements expected by chance was subtracted from the observed proportion of agreements, and this difference was divided by 1 minus the proportion of agreements expected by chance. Time of occurrence was transformed into a dichotomous variable in order to calculate a Kappa score. To do this, the recording of time of occurrence was categorised as “same” if the time of occurrence recorded was exact or differed by less than 2 s, or “different” if the time of occurrence recorded differed by more than 2 s or if the occurrence was only recorded by one observer. Participant 1’s Kappa score was .92, which indicates almost perfect agreement (Landis and Koch 1977). Participant 2’s Kappa score was .3, which indicates fair agreement (Landis and Koch 1977). For Participant 3, Cohen’s Kappa was .68, which is suggestive of substantial agreement (Landis and Koch 1977).
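The two agreement indices described above can be illustrated with a short sketch. This is not the authors' computation: the function names, the example labels, and the treatment of a difference of exactly 2 s (counted here as “different”, since the text does not specify it) are assumptions made for illustration, and the way the dichotomised timing variable was entered into the Kappa calculation is not described in enough detail to reproduce, so the standard two-rater formula is shown.

```python
from collections import Counter


def percentage_agreement(agreements, disagreements):
    """Agreements divided by agreements plus disagreements, times 100."""
    return 100.0 * agreements / (agreements + disagreements)


def same_time(time_a_s, time_b_s, tolerance_s=2.0):
    """True if two recorded times (in seconds) differ by less than the tolerance.

    Assumption: a difference of exactly 2 s is treated as "different".
    """
    return abs(time_a_s - time_b_s) < tolerance_s


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters need the same non-zero number of ratings")
    n = len(rater_a)
    # Observed proportion of agreement.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Proportion of agreement expected by chance, from each rater's marginals.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical ratings from two observers (not data from the study).
observer_1 = ["yes", "yes", "no", "no"]
observer_2 = ["yes", "no", "no", "no"]
kappa = cohens_kappa(observer_1, observer_2)
agreement = percentage_agreement(3, 1)
```

In this toy example the observers agree on three of four ratings (75 % agreement), but because half of that agreement would be expected by chance, Kappa is only .5.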

Physiological arousal was examined through the measurement of HR, one of the most commonly used indexes of internal arousal (Romanczyk et al. 1992). The popularity of HR measures is not surprising as HR monitors are “inexpensive, easy to equip, and relatively non-invasive” (Chok et al. 2010, p. 326). However, Romanczyk and Matthews (1998) have advised caution in the utilisation of HR measures and in the interpretation of HR data. HR is highly variable and factors such as mental stress, emotions, body position, and movement or physical exertion can influence HR (Palatini 2009; Wolf 1979), and individual variability in cardiovascular functioning and reactivity can impact on data and data interpretation (Romanczyk and Matthews 1998). Given the precedent of using HR in research similar to the current study (e.g. Barrera et al. 2007; Freeman et al. 1999), and the suitability of single-case research for analysing data that are highly individually variable, HR was deemed a suitable index of physiological arousal. HR recording in this study was conducted using a Polar© RS800cx HR monitor (Polar Electro OY, Kempele, Finland). Polar HR monitors have been referred to “as the most accurate tools for heart rate monitoring and registering in the field” (Laukkanen and Virtanen 1998, p. S6). Several studies have compared older Polar HR monitors to ECG and found their reliability to be adequate (Laukkanen and Virtanen 1998; Terbizan et al. 2002). Many studies have used Polar HR monitors with individuals diagnosed with ASD (e.g., Chok et al. 2010; Hoch et al. 2010; Willemsen-Swinkels et al. 1998). The Polar RS800cx includes both a chest strap and a wrist watch. The chest strap was worn around the ribcage and under the pectoral muscles and was responsible for recording the electrical signal of the heart every 5 s and transmitting these data to a small wrist watch. The wrist watch displayed this information and stored the data for later analysis. At the end of each day, data from the wrist watch were transferred to a computer, using Polar ProTrainer© 5 software, for analysis.

Behavioral Function

The Questions About Behavioral Function (QABF; Matson and Vollmer 1995), an indirect functional assessment instrument, was administered to each participant’s key instructor to indicate the function of each TB included in this study. Subscale scores of the QABF are used to identify whether the behavior is maintained by an attention, escape, physical, tangible, or a non-social behavioral function and to discern the strength of the maintaining function. There is much empirical support for the QABF with a recent review reporting that numerous studies have shown it to have good to excellent reliability, good internal consistency, and good validity (Matson et al. 2012; Healy et al. 2013). The QABF is administered by interviewing an individual who is familiar with the person being assessed.

Experimental Design and Procedure

This study employed a naturalistic design and all data were collected with minimal interference to the participants’ routine school day. Each participant was habituated to wearing the HR monitor for 1–2 h during the previous week. Although brief, this habituation was considered sufficient as participants neither exhibited, nor reported, any distress or discomfort. Each morning, the times on the Polar© wristwatch and the observer’s stopwatch were synchronised to ensure the accuracy of the observer’s recording of time of occurrence. Participants were fitted with the HR monitor following their arrival at school and the observer began to record data on all TBs. Behavioral observation, and HR monitoring, took place over five school days for a total of 20 h 16 min and 25 s for Participant 1, 23 h 24 min and 15 s for Participant 2, and 19 h 20 min and 50 s for Participant 3.

Results

Frequency of Target Behaviors

Figure 1 shows the frequency of all TBs for each Participant across 5 days. Each participant engaged in high frequencies of stereotypy while rates of SIB were lower. For Participant 1, tantrum behavior occurred at a low frequency across 5 days with only five occurrences observed, all of which co-occurred with SIB. SM behaviors occurred at high frequencies during each day of recording. For Participant 2, DB was only recorded eight times during the study and a decision was subsequently made to remove this behavior from further analyses due to its low frequency.

Fig. 1 Frequency of Participant 1’s (upper panel), Participant 2’s (middle panel) and Participant 3’s (lower panel) target behaviors across days of measurement. SM (self-management), ST (stereotypy), SIB (self-injurious behavior), TM (tantrum behavior), and DB (destructive behavior) are shown

Function of Target Behaviors

Results from the functional assessment (QABF), presented in Fig. 2, suggested that all behaviors included in the analysis were multiply controlled. Participant 1’s SM served primarily as an indication of physical pain or discomfort, but some instances functioned to provide escape from aversive situations. Stereotypy was identified as serving non-social or physical functions, and SIB was shown to be maintained by escape, physical pain, or tangibles. Tantrum behaviors functioned as escape, non-social, physical, and access to tangibles. Participant 2’s SIB was identified as serving attention, escape, non-social, physical, and tangible functions while stereotypy appeared to be primarily a non-social behavior with some instances functioning to allow escape or to indicate physical pain. Participant 3’s SIB was primarily maintained by escape although some instances appeared non-social or functioned to allow access to tangibles. Each function was endorsed for stereotypy although the strongest maintaining variable was non-social reinforcement.

Fig. 2 QABF endorsement scores and severity scores for each function of Participant 1’s (upper panel), Participant 2’s (middle panel), and Participant 3’s (lower panel) target behaviors

Cardiovascular Behavior

Due to the high frequency of certain TBs, it was not possible to analyse every occurrence. As such, every fourth occurrence rated as mild was analysed, together with all occurrences rated as moderate or severe. HR data were analysed by examining the HR pattern from the 5 s measure preceding the occurrence of a TB to the 5 s measure following the occurrence of a TB; a 15 s period was therefore analysed for each occurrence. Each occurrence was compared to a non-occurrence of the TBs. A non-occurrence was defined as a 15 s period during which none of the TBs occurred and which was preceded and followed by at least 30 s during which TBs did not occur. The percentage difference from the mean HR of the session in which the TB occurred was calculated for each data point of each occurrence and non-occurrence, and graphs were produced to allow a visual comparison to be conducted. Figure 3 displays a sample graph showing a comparison between an occurrence of SIB and tantrum behavior and a non-occurrence. By examining mean HR during instances of TBs across participants, we found that a 5 % increase in bpm, from one 5 s measurement period to the next, resulted in a minimum increase of 5 bpm. We categorised HR waveforms before, during, or after occurrences as an “increase”, a “decrease”, or “stability”. If HR, in bpm, rose from one HR measurement point to the next, this was referred to as an “increase”. An increase of 4.9 % or less was categorised as a “slight” increase. If HR declined from one HR measurement point to the next, this was referred to as a “decrease”. A decrease of 4.9 % or less was categorised as a “slight” decrease. If HR, in bpm, did not change from one HR measurement point to the next, the trend observed was termed “stability”. Mean HR, over all sessions, was found to be 99.48 bpm for Participant 1, 106.36 bpm for Participant 2, and 93.26 bpm for Participant 3.
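The categorisation rules above can be expressed as a brief sketch. The function names and example readings are hypothetical, and two details are assumptions rather than statements from the text: the percentage change is taken relative to the earlier of the two readings, and any change below 5 % is treated as “slight” (the text gives 4.9 % or less and does not define values between 4.9 % and 5 %).

```python
def percent_diff_from_mean(hr_bpm, session_mean_bpm):
    """Percentage difference of one HR reading from the session mean HR."""
    return 100.0 * (hr_bpm - session_mean_bpm) / session_mean_bpm


def classify_change(previous_bpm, current_bpm, slight_threshold_pct=5.0):
    """Label the change between two consecutive 5 s HR readings."""
    if current_bpm == previous_bpm:
        return "stability"
    # Assumption: percentage change is relative to the earlier reading.
    change_pct = 100.0 * abs(current_bpm - previous_bpm) / previous_bpm
    direction = "increase" if current_bpm > previous_bpm else "decrease"
    if change_pct < slight_threshold_pct:
        return "slight " + direction
    return direction


def describe_window(pre_bpm, during_bpm, post_bpm):
    """Return (pre-behavior pattern, post-behavior pattern) for a 15 s window.

    Assumption: the pre pattern is the change into the reading at the
    behavior, and the post pattern is the change out of it.
    """
    return (
        classify_change(pre_bpm, during_bpm),
        classify_change(during_bpm, post_bpm),
    )


# Hypothetical readings (bpm) around one occurrence; not data from the study.
patterns = describe_window(98, 104, 104)
relative = percent_diff_from_mean(110, 100)
```

Here the reading rises by about 6 % into the behavior (an “increase”) and then does not change (“stability”).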

Fig. 3 An example of the percentage difference from the mean session heart rate from the 5 s measure before until the 5 s measure following self-injurious behavior and tantrum behavior (circles) or following a non-occurrence (squares) for Participant 1

Self-Injurious Behaviors and Tantrum Behaviors

As tantrum behaviors did not occur independently of SIB for Participant 1, these behaviors were analysed together. HR during occurrences of SIB and tantrum behavior and during non-occurrences was compared using a dependent t-test. Mean HR during occurrences of SIB and tantrum behavior was lower (M = 102.71, SD = 13.92) than during non-occurrences (M = 107.85, SD = 24.98) but this difference was not significant, t (6) = −.48, p = n.s. There was no HR pattern that consistently occurred prior to SIB onset while post-SIB increases were most common, accompanying four of the seven occurrences. The low frequency of these behaviors limited our analyses for Participant 1.
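For readers unfamiliar with the dependent (paired) t-test used in this and the following comparisons, the statistic can be sketched as below. This is a generic illustration, not the authors' analysis script: the function name and the HR values are invented, and it assumes that each occurrence is paired with one non-occurrence.

```python
import math
import statistics


def paired_t(occurrence_hr, non_occurrence_hr):
    """Return (t, degrees of freedom) for paired HR measurements."""
    if len(occurrence_hr) != len(non_occurrence_hr) or len(occurrence_hr) < 2:
        raise ValueError("need at least two matched pairs")
    # Work on the within-pair differences.
    diffs = [a - b for a, b in zip(occurrence_hr, non_occurrence_hr)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, n - 1 in the denominator
    if sd_diff == 0:
        raise ValueError("t is undefined when all differences are identical")
    t_value = mean_diff / (sd_diff / math.sqrt(n))
    return t_value, n - 1


# Hypothetical paired readings in bpm (not data from the study).
t_value, df = paired_t([101, 102, 103], [100, 100, 100])
```

The resulting t is then compared against the t distribution with n − 1 degrees of freedom, which is why the degrees of freedom reported in each comparison are one fewer than the number of pairs.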

HR during occurrences of Participant 2’s SIB and during non-occurrences was compared using a dependent t-test. HR during occurrences of SIB (M = 107.93, SD = 12.87) was not significantly different, t (13) = .28, p = n.s., from HR during non-occurrences of SIB (M = 106.57, SD = 15.38). There was also no particular HR pattern that consistently preceded occurrences of SIB. Although decreases in HR preceding SIB were most common, this did not deviate from chance levels, as represented by non-occurrences. Following occurrences of SIB, increases were most common, although most were categorised as slight (46.7 %). The percentage occurrence of each possible HR pattern before and after occurrences of SIB is presented in the upper panel of Table 2.

Table 2 Percentage Occurrence of Each Heart Rate Pattern, Pre- and Post- Target Behavior, for Self-injurious Behavior Compared to Non-occurrences of Self-injurious Behavior (upper panel), Stereotypy Compared to Non-occurrences of Stereotypy (middle panel), and Stereotypy Occurring Above the Mean Compared to Stereotypy Occurring Below the Mean (lower panel), for Participant 2

A dependent t-test was used to compare HR during occurrences and non-occurrences of Participant 3’s SIB. HR during SIB was found to be significantly higher (M = 106.25, SD = 23.43), t (56) = 3.42, p = .001, than HR during non-occurrences of SIB (M = 92.93, SD = 16.24). There was no particular HR waveform that consistently preceded occurrences of SIB. Increases were most common but this did not deviate from chance levels, as represented by non-occurrences. Following occurrences of SIB, increases were most common, although most were slight (31.6 %). The percentage occurrence of each possible HR pattern before and after occurrences of SIB is presented in the upper panel of Table 3.

Table 3 Percentage Occurrence of Each Heart Rate Pattern, Pre- and Post- Target Behavior, for Self-injurious Behaviour Compared to Non-occurrences of Self-injurious Behavior (upper panel), Stereotypy Compared to Non-occurrences of Stereotypy (Middle Panel), and Stereotypy Occurring Above the Mean Compared to Stereotypy Occurring Below the Mean (lower panel), for Participant 3

Self-Management Behaviors

HR patterns surrounding Participant 1’s SM behaviors were also analyzed. A dependent t-test was used to compare HR during occurrences of SM (M = 99.07, SD = 17.33) and during non-occurrences (M = 98.00, SD = 18.06), but no significant difference was identified, t (53) = −.28, p = n.s. There was no evidence that a specific HR pattern consistently preceded or followed the occurrence of SM and the percentage occurrence of each HR pattern did not deviate from chance levels, as represented by non-occurrences. The percentage occurrence of each possible HR pattern is presented for occurrences of SM and non-occurrences in the upper panel of Table 4.

Table 4 Percentage Occurrence of Each Heart Rate Pattern, Pre- and Post- Target Behavior, for Self-Management Compared to Non-occurrences of Self-Management (upper panel), Self-management With Noise Compared to Self-Management Without Noise (upper middle panel), Stereotypy Compared to Non-occurrences of Stereotypy (lower middle panel), and Stereotypy Occurring Above the Mean Compared to Stereotypy Occurring Below the Mean (lower panel), for Participant 1

A further analysis of incidents of SM that were triggered by noise, the primary antecedent of Participant 1’s challenging behavior, was conducted. There were 22 occurrences of SM that were in response to noise and these were compared to 32 incidents that did not occur in response to noise. Occurrences of SM triggered by noise were most typically preceded by increases and followed by decreases, although this pattern did not accompany every incident. Occurrences of SM in the absence of noise were most commonly preceded by a decrease in HR and followed by an increase in HR, the opposite of the pattern seen in the presence of noise. These results are presented in the upper middle panel of Table 4. During 78.3 % of SM occurrences triggered by noise, HR was either below mean session HR (56.5 %) or less than 5 % above it (21.7 %).

Stereotypy

For Participant 1, a total of 31 incidents of stereotypy were analysed. HR during occurrences of stereotypy and during non-occurrences was compared using a dependent t-test. HR during stereotypy (M = 98.52, SD = 18.23) did not differ significantly, t (28) = −.16, p = n.s., from HR during non-occurrences (M = 97.79, SD = 15.32). Occurrences of stereotypy were not consistently preceded by a specific HR waveform. Increases were most common following an occurrence of stereotypy and most (29 %) were categorized as an increase of 5 % or more. The percentage occurrence of each possible HR pattern before and after occurrences of stereotypy is presented in the lower middle panel of Table 4.

In order to test theories which suggest that stereotypy functions to regulate arousal and to return arousal to more normal levels during periods of low or high arousal, we further analysed occurrences of stereotypy that took place below or above mean session HR, to determine whether stereotypy functioned to return HR to more normal levels during these incidents. Incidents defined as “below the mean” were at least 5 % below mean HR, during that session, at the 5 s measure preceding the occurrence of stereotypy and did not cross the x-axis (the session mean) before, during, or after the occurrence of stereotypy. There were 14 incidents that met this definition, with HR for these incidents having a range of 67–105 bpm, and an average of 88 bpm. Only 42.68 % of these incidents led to HR becoming closer to mean levels. The same analysis was performed for 10 incidents classified as “above the mean”. Incidents considered “above the mean” were at least 5 % above mean HR, during that session, at the 5 s measure preceding the occurrence of stereotypy and did not cross the x-axis before, during, or after the occurrence of stereotypy. Mean HR during stereotypy that occurred above the mean was 119 bpm, and ranged from 97 to 150 bpm. Half of these occurrences resulted in HR becoming closer to mean levels. The percentage occurrence of specific HR waveforms before and after the onset of stereotypy for incidents occurring “below the mean” and “above the mean” is presented in the lower panel of Table 4. Among these data, increases were most common following stereotypy that occurred below the mean, while no trend was evident prior to the occurrence of this type of stereotypy. When HR was above mean levels, increases were most common both before and after the occurrence of stereotypy.
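The “below the mean” and “above the mean” definitions can be summarised in a short sketch. The function names and example readings are hypothetical. Two points are assumptions not stated in the text: “not crossing the x-axis” is read as all three readings in the 15 s window staying on the same side of the session mean, and HR “becoming closer to mean levels” is read as the post-behavior reading lying nearer the session mean than the pre-behavior reading.

```python
def classify_incident(pre_bpm, during_bpm, post_bpm, session_mean_bpm,
                      margin_pct=5.0):
    """Return "below the mean", "above the mean", or None for one incident."""
    # Percentage difference of each reading from the session mean.
    diffs = [
        100.0 * (bpm - session_mean_bpm) / session_mean_bpm
        for bpm in (pre_bpm, during_bpm, post_bpm)
    ]
    pre_diff = diffs[0]
    # Assumption: never crossing the mean means all readings share one sign.
    if pre_diff <= -margin_pct and all(d < 0 for d in diffs):
        return "below the mean"
    if pre_diff >= margin_pct and all(d > 0 for d in diffs):
        return "above the mean"
    return None


def moved_toward_mean(pre_bpm, post_bpm, session_mean_bpm):
    """Assumed criterion: the post reading is nearer the mean than the pre."""
    return abs(post_bpm - session_mean_bpm) < abs(pre_bpm - session_mean_bpm)


# Hypothetical incidents against a session mean of 100 bpm (not study data).
low_incident = classify_incident(90, 92, 95, 100)
crossing_incident = classify_incident(90, 101, 95, 100)
high_incident = classify_incident(110, 108, 104, 100)
```

Under these assumptions, the percentage of incidents in each group for which `moved_toward_mean` is true corresponds to the proportion reported as bringing HR closer to mean levels.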

A total of 148 incidents of stereotypy were analysed for Participant 2. HR during occurrences of stereotypy and during non-occurrences was compared using a dependent t-test. HR during occurrences of stereotypy (M = 107.08, SD = 12.22) was higher than HR during non-occurrences (M = 105.31, SD = 13.40), but this difference was not significant, t (159) = 1.23, p = n.s. It was found that incidents of stereotypy were most commonly preceded by an increase in HR, although this was most often a slight increase (39.19 %). Increases were also most common following an occurrence of stereotypy, although most (39.86 %) were categorized as slight increases. The percentage occurrence of each possible HR pattern before and after occurrences of stereotypy is presented in the middle panel of Table 2.

Analysis of stereotypy occurring above and below the mean was also conducted for Participant 2. There were 60 incidents that were categorized as occurring below the mean. The mean HR during these incidents was 94.7 bpm, with a range of 76–115 bpm. It was found that, among these incidents, increases pre-stereotypy and increases post-stereotypy were most common. The majority (68.33 %) of stereotypy that occurred below mean HR led to an increase in HR that brought it nearer to mean levels. A similar analysis was performed for the 41 incidents that were categorised as occurring “above the mean”. Mean HR during these incidents was 123.66 bpm, and had a range of 113–140 bpm. Decreases pre-stereotypy were most common, while no clear trend was consistently evident post-stereotypy. Only 46.34 % of incidents of stereotypy that occurred above the mean resulted in HR becoming closer to the mean level. The percentage occurrence of all HR trends pre- and post-stereotypy for incidents that occurred below or above the mean is presented in the lower panel of Table 2.

A total of 156 incidents of stereotypy were analysed for Participant 3. HR during occurrences of stereotypy and during non-occurrences was compared using a dependent t-test. HR during occurrences of stereotypy (M = 96.89, SD = 15.56) was significantly higher, t (198) = 6.24, p < .001, than HR during non-occurrences (M = 87.92, SD = 12.68). Incidents of stereotypy were most commonly preceded by an increase in HR, although this was typically only a slight increase (26.9 %). Increases were also most common following stereotypy, with most increases (32.1 %) involving a 5 % or greater change. The percentage occurrence of each possible HR pattern before and after occurrences of stereotypy for Participant 3 is presented in the middle panel of Table 3.

Occurrences of stereotypy above and below the mean were also further analyzed. There were 29 incidents that were categorized as below the mean; these incidents had a mean HR of 82.24 bpm and a range of 66–125 bpm. Among these incidents, increases pre-stereotypy and increases post-stereotypy were most common. The majority (69 %) of stereotypy that occurred below mean HR led to an increase in HR that brought it nearer to mean levels. There were 48 incidents of stereotypy that occurred above the mean. Average HR during these incidents was 113.58 bpm, and had a range of 93–171 bpm. The majority (58.3 %) resulted in HR becoming closer to the mean level. Decreases were most common before these incidents, while no consistent HR pattern followed them. The percentage occurrence of all HR trends pre- and post-stereotypy for all incidents that occurred below or above the mean for Participant 3 is presented in the lower panel of Table 3.

Discussion

Although exploratory in nature, this study has revealed some interesting associations between HR and challenging behavior. For all three participants, the same HR patterns were found to co-occur with SIB, an abnormal HR response to seemingly stressful experiences was noted, and stereotypy co-occurred with specific HR waveforms and showed some utility in regulating arousal for two of our participants.

The majority of SIB, or SIB and tantrum behavior for Participant 1, appeared to result in HR increases, while no HR waveform consistently preceded SIB. While the percentage occurrence of increases following SIB did not deviate substantially from chance, it is still worth noting that this pattern was present during the majority of occurrences. These findings do not support theories or previous studies which have proposed that SIB is elicited by physiological arousal, or that SIB acts as a tension reduction mechanism (Barrera et al. 2007; Brain et al. 1998; Romanczyk et al. 1992). Instead, the post-SIB HR increase identified here mirrors the findings of Freeman et al. (1999), who proposed that problem behavior may serve as a discriminative stimulus for increases in physiological arousal. Our analysis suggests that SIB is an effective means of increasing physiological arousal, which may be desirable for some individuals with ASD, and this may explain why these behaviors occur and often prove highly difficult to treat and eliminate. Not all incidents of SIB were followed by increases in HR; thus, this explanation does not completely account for their occurrence. However, the SIB evinced by all three participants was identified as having multiple behavioral functions, so it may be that the behavior’s physiological impact differs according to the function of the occurrence. An alternative account must also be considered: research has shown that both pain (Moltner et al. 1990) and psychological stress (Delaney and Brodie 2000) lead to HR increases. As participants were physically harming their own bodies, and often appeared distressed during the incidents, these factors, rather than the reinforcing effects of increased physiological arousal, could also explain the post-SIB HR increases observed.

Although not a statistically significant difference, it was unexpected to find that Participant 1 was less physiologically aroused during occurrences of SIB and tantrum behaviors, all of which were triggered by noise, than during non-occurrences. During four of the seven occurrences of the behaviors in this study, HR during the 15 s period analyzed was either below mean HR or less than 5 % above mean HR. This was surprising considering the physical signs and verbal indications of stress during these occurrences, the high severity of the behaviors, and the HR increases that typically occur in response to pain and stress (Delaney and Brodie 2000; Moltner et al. 1990). Given that the ability to respond physiologically to stress is an evolutionary response (Nesse and Young 2000), Participant 1’s display of the overt stress response but not of the physiological response may highlight possible physiological abnormality in ASD. Previous studies (Goodwin et al. 2006; Toichi and Kamio 2003) have identified under-arousal in response to stressors among individuals with ASD, although not in response to naturally occurring stimuli identified as being stressful to the participants. As such, there is support for the existence of such atypicality. Goodwin et al. (2006) suggested that the lack of HR response to stressors may have been due to chronic hyper-arousal that meant HR was already so elevated that further increases were not probable. However, Participant 1 did not present with an abnormally elevated HR. Jansen et al. (2006) suggested that chronic hyper-arousal may result in a down-regulation of the central nervous system and a reduced stress response. Anecdotal reports suggested that Participant 1 had a long history of SIB and tantrum behaviors in response to noise. Thus, it is possible that years of highly aroused reactions to noise had reduced the capacity of his central nervous system to respond appropriately to such experiences.

For Participants 2 and 3, HR during SIB occurrences was higher than HR during non-occurrences, although this difference was only significant for Participant 3. It is possible that elevated HR during SIB occurrences is in response to the antecedent, or stressor, that occasioned the behavior, or the pain caused by the SIB. However, as with Participant 1, HR remained below its mean level before, during, and after a number of SIB occurrences. For Participant 2, HR was below average during four of the 15 occurrences (26.67 %), while this effect was observed for 25 of the 74 SIB occurrences (33.78 %) for Participant 3. The analysis of behavioral function, however, suggests a different explanation than that proposed for Participant 1. Participant 1’s SIB occurred in response to physical pain and aversive stimuli and was accompanied by overt distress, whereas the QABF indicated that SIB emitted by Participants 2 and 3 had social functions. It is reasonable to believe that SIB that occurred for social reasons, such as access to attention or tangibles, may not be accompanied by changes in physiological arousal, while occurrences of SIB related to escape from aversive situations, or physical pain, may be accompanied by distress and elevated HR. During this study, data were not taken on the function of each occurrence of SIB and therefore we were not able to determine whether HR changes differed according to the function of the behavior. Certainly this hypothesis warrants further investigation.

Participant 1’s SM behaviors were not associated with consistent HR patterns. However, as it was hypothesised that Participant 1 had learned these behaviors as an alternative response to aversive stimulation (noise), which typically triggered challenging behavior, we further analysed occurrences of SM that were triggered by noise and occurrences that were not triggered by noise. SM in the presence of noise was most commonly preceded by an increase and followed by a decrease. This analysis of SM behaviors suggests that, in the presence of identified stressors, such behaviors may serve as coping mechanisms and help the individual to regulate increased arousal.

A further analysis of HR during SM was conducted due to the atypical HR levels that co-occurred with Participant 1’s SIB and tantrum behaviors. Although Participant 1 was effectively self-managing his behavior through well-acquired behavioral intervention, he exhibited pronounced physical and verbal signs of distress during SM in the presence of noise and it is somewhat surprising that his HR would remain below or even near mean HR. These findings are suggestive of hypo-arousal in response to stressors or a down-regulated central nervous system (Jansen et al. 2006).

HR during occurrences of stereotypy was also examined within this study. Increases prior to and following stereotypy were observed in two of the three participants, while only post-stereotypy increases were observed in Participant 1. These patterns contrast with previous findings of HR decreases following stereotypy and suggestions that stereotypy occurs during periods of high arousal and functions to block further sensory input and reduce physiological arousal (Hutt and Hutt 1965). Given that participants’ stereotypy led to increases in arousal, an explanation may be proposed whereby these behaviors are reinforced by the subsequent increase in arousal, as Freeman et al.’s (1999) analysis of SIB suggested. Participants 2 and 3, for example, may have engaged in the behavior either to increase physiological arousal, or to prolong the sensations and feelings associated with increased arousal. Despite the common view that elevated arousal is uncomfortable and unpleasant, some researchers have suggested that individuals may enjoy the feelings and sensations that accompany high arousal and that it might be a pleasurable state for them (Svebak and Stoyva 1980). If this is the case, stereotypy would be positively reinforced by allowing individuals to access, or extending their access to, a state of high physiological arousal. The finding that HR during stereotypy was higher than during non-occurrence for all participants, although this difference was only significant for Participant 3, supports this idea. If the periods of higher arousal during stereotypy were aversive, then stereotypy should not occur at the high frequencies that were observed. A review by Lang et al. (2010) on the effects of physical exercise for individuals with ASD lends credence to this theory. Their review found that physical exercise most commonly led to reductions in stereotypy, and sometimes in SIB, particularly if the exercise was vigorous. The authors proposed that exercise, which leads to increases in physiological arousal, might produce the same internal sensations as stereotypy, thus satiating individuals’ desire for this form of reinforcement and eliminating their motivation to engage in stereotypy for a period. Thus, it is possible that challenging behaviors, such as stereotypy and SIB, are reinforced and maintained by the increases in physiological arousal that they occasion. Given that not all instances of stereotypy resulted in HR increases, this theory does not entirely explain our findings. However, the QABF results for Participants 2 and 3 indicated that their stereotypy had social functions. In this way, some occurrences may have social functions and may not co-occur with the typical HR waveforms.

While many theories have linked stereotypy and physiological arousal, there is a dearth of experimental research demonstrating a relationship between the two variables. We tested theories which propose that stereotypy functions to regulate arousal levels and to return arousal to more normal, comfortable levels when an individual is experiencing hypo-arousal or hyper-arousal (Leekam et al. 2011; Lewis and Bodfish 1998). However, support for this hypothesis in our data was minimal. Participant 1’s stereotypy did not seem to result in arousal moving towards mean levels. For Participants 2 and 3, stereotypy occurring during periods of low arousal typically resulted in a move towards mean levels, while stereotypy occurring above the mean brought arousal closer to mean levels for Participant 3 during most occurrences, but led to physiological arousal increasing further for Participant 2. As stereotypy functioned to increase arousal when HR was below mean levels for two participants, but only reduced arousal when HR was above mean levels for one participant, it may be that periods of lower HR are somewhat aversive and that stereotypy is employed to escape the sensations that accompany a low HR, while periods of elevated HR are not as unpleasant and do not provoke reaction, as with Participants 1 and 2. This explanation is congruent with our earlier suggestion that participants might find states of high arousal pleasurable. This might also explain why stereotypy appears in high and low arousal periods, as the participant attempts to increase arousal during periods of low HR and to maintain high levels of arousal when they are present. Thus, our data, albeit from a small sample, do little to validate theories of arousal modulation.

Although exploratory, this research has interesting implications for the study and assessment of challenging behavior. Findings such as these draw attention to the concept of automatic reinforcement. Behaviors are often classified as automatically reinforced, or non-social, without any investigation of what, if anything, is happening internally during their occurrence. Barrera et al. (2007) have previously questioned the utility of the classification of automatic reinforcement, which they refer to as “a theoretical requirement of operant psychology” (p. 30). Research on the physiological impact of behaviors categorized as automatically reinforced may help us to develop this concept, to understand precisely what reinforcement the individual acquires during these occurrences, and to increase the utility of the concept for treatment development. Furthermore, if our hypothesis that some individuals with ASD seek out, and enjoy, states of elevated physiological arousal is borne out by future research, it has implications for the treatment of behaviors that function to elevate arousal or to sustain elevated arousal. Behavior intervention plans for these behaviors could incorporate appropriate activities, such as physical exercise (Lang et al. 2010) or other highly arousing behaviors, which may provide individuals with ASD with the heightened arousal they seek but in an appropriate manner.

The current study differed from previous studies in this area in a number of ways that may have impacted on our findings. Our participants were younger and displayed less severe, less frequent forms of SIB than those who participated in previous studies (e.g., Barrera et al. 2007; Freeman et al. 1999). Our equipment was less sensitive than Barrera et al.’s (2007) and as such may not have registered effects that were extremely subtle. Additionally, the current study was limited by the inclusion of several TBs that occurred at relatively low levels during the study and produced a less than ideal number of occurrences for analysis. We did not take data on mood state or behavioral function, although Willemsen-Swinkels et al. (1998) found mood state to affect HR patterns surrounding stereotypy, and thus we were not able to examine the influence that these variables may have on HR before, during, and after TBs. Our non-inclusion of a measure of HR variability prevented us from comparing our results to some previous investigations (Graveling and Brooke 1978; Hutt et al. 1975; Lewis et al. 1984) of physiological activity during stereotypy. These factors may have contributed somewhat to the differences between our findings and those of previous research. However, given that the results of our study and other research (Freeman et al. 1999; Lang et al. 2010) were in line, this is unlikely.

This study, and several preceding studies, have demonstrated abnormalities in physiological responses in ASD. Researchers should attempt to determine whether the same abnormalities are present in all individuals with ASD or whether they differ according to the individual, or other variables such as level of functioning, gender, or age. Physiological responses to identified stressors, rather than arbitrarily selected stressors, should be examined. The disconnect observed in this study between overt behaviors and covert physiological arousal also warrants further investigation. It is important to determine whether this is common among individuals with ASD, why it may occur, what implications it has, and whether it is an adaptive or maladaptive response for individuals with ASD.

Further empirical investigation of the arousal modulation theories of ASD is also required. This research suggests that the concepts of hypo-arousal, hyper-arousal, and optimal stimulation may not be appropriate. Given that our results suggest that periods of higher arousal were preferred, an investigation of what “optimal stimulation” is for individuals with ASD is necessary. The inclusion of behaviors categorised as “non-social” or “automatically reinforced” by functional assessments or analyses is also important so that we can determine whether non-social behaviors register physiologically or whether such a classification is misleading and not useful. Finally, future studies examining such behavior could consider incorporating direct manipulations of the environment over longer periods and including measures of HR variability along with analysis of mood states and behavioral function.