Introduction

Vygotsky (1962), in his seminal theory of cognitive development, suggested that the use of inner speech facilitates problem solving and what would now be referred to as ‘executive functioning’ (EF). In Vygotsky’s view, children’s self-regulatory (including EF) skills originate in interactions with others and are only later internalized for independent usage. Language is thought to serve a self-regulatory function during problem solving, and internalized language is considered a tool for thinking, planning, and self-organization. This concept has been supported by research showing that interfering with inner speech in neurotypical adults can significantly hinder performance on tests of problem solving ability (Baldo et al. 2005), even when contrasted with another concurrent, non-auditory task, such as finger tapping (Emerson and Miyake 2003).

Children with autism spectrum disorders (hereafter referred to as autism), including autistic disorder, Asperger disorder, and pervasive developmental disorder-not otherwise specified, share significant deficits in the ability to successfully navigate the social world and engage in effective social interaction. From a Vygotskian perspective, these social impairments may hinder development of self-regulatory EF skills. Indeed, individuals with autism demonstrate impaired EF, as indicated by difficulty with problem solving tasks like the Tower of London (TOL) (for review, see Hill 2004 and Kenworthy et al. 2008). Russell (1997) linked EF deficits in autism to a failure of internalized, self-directed speech to regulate non-routine behaviors. He has described difficulties in autism with maintaining arbitrary rules in working memory (Biro and Russell 2001) and reported that children with autism perform comparably to typically developing (TD) children on tasks requiring nonverbal rule use, but worse than TD children on tasks requiring verbal rule use (Russell et al. 1999). Deficient use of covert verbal mediation strategies also has been associated with inferior performance on a verbal working memory task in children with autism when compared with controls (Joseph et al. 2005). Another method for assessing inner speech usage, articulatory suppression (AS), requires individuals to speak aloud while problem solving, thereby suppressing their inner speech. In the only study applying this technique to autism, verbal mental age-matched TD children experienced greater costs from AS during task switching (i.e., alternating between addition and subtraction of mathematical problems in which the function and equals signs were omitted) than did children with autism (Whitehouse et al. 2006). These findings have led to the suggestion that limited use of inner speech may contribute to cognitive (including EF) deficits in children with autism (though see Williams et al. 2008 for an alternative account). However, this remains an open question, since at least one study suggests an intact developmental precursor of inner speech among children with autism: the explicit use of overt self-talk facilitated executive problem solving in children with autism and TD children alike (Winsler et al. 2007).

The TOL and other variants of the tower task are useful for exploring inner speech in autism. Tower performance is consistently cited as deficient in individuals with autism; in their extensive review, Pennington and Ozonoff (1996) found larger effect sizes for autism-related deficits on tower tasks and the Wisconsin Card Sorting test than for any executive function measures in other developmental disorders (i.e., ADHD, conduct disorder, and Tourette’s syndrome). The tower task is generally considered to be a measure of planning ability, but there is little consensus on the core executive skills underlying successful completion of this multi-step, complex task (Bull et al. 2004; Riccio et al. 2004; and see Unterrainer and Owen 2006 for review). It requires the ability to follow arbitrary rules spoken by the examiner, and to engage in multi-step decision making. As such, tower performance would appear to benefit from inner speech usage. In a study of TD 5–6 year olds, Fernyhough and Fradley (2005) reported that private speech was positively correlated with tower performance. Furthermore, tower performance is predicted by language ability in TD controls, but not in children with autism (Joseph et al. 2005).

Given evidence of idiosyncratic use of inner speech in autism, we sought to examine what effects AS or inner speech disruption would have on tower performance in matched groups of TD adolescents and high functioning adolescents with autism. Using AS to disrupt inner speech should negatively impact tower performance for the TD adolescents. In contrast, if poor use of inner speech contributes to difficulties on the tower task among adolescents with autism, suppressing inner speech should have little impact on their performance.

Methods

Participants

Twenty-five TD adolescents (24 males, one female) between 12 and 19 years of age and 28 high-functioning adolescents with autism (26 males, two females) between 12 and 20 years of age were recruited for the study. Participants with autism were recruited from an autism clinic in which clinical diagnoses were based on DSM-IV criteria. Four were diagnosed with high functioning autism, 16 with Asperger disorder, four with pervasive developmental disorder-not otherwise specified, and four with an autism spectrum disorder but exact diagnosis unknown because of sparse developmental data. According to criteria established by the NICHD/NIDCD Collaborative Programs for Excellence in Autism (see Lainhart et al. 2006) using the Autism Diagnostic Interview (ADI; LeCouteur et al. 1989)/Autism Diagnostic Interview-Revised (ADI-R; Lord et al. 1994) and/or the Autism Diagnostic Observation Schedule (ADOS; Lord et al. 2000), all 28 participants with autism also met criteria for ‘broad autism spectrum disorder’. Indeed, mean ADI social (M = 20.04, SD = 5.10), verbal communication (M = 15.75, SD = 4.61), and repetitive behavior (M = 6.43, SD = 2.82) scores as well as the mean ADOS social + communication score (M = 11.65, SD = 5.00) were in the autism range. Exclusion criteria for the autism group included any known co-morbid medical conditions, such as fragile X syndrome, other genetic disorder, or neurological disorder which may affect cognitive functioning. TD participants were recruited from the community and parents of all TD participants underwent telephone screenings. TD participants were excluded from participation if they had ever received mental health treatment for anxiety, depression, or any other psychiatric condition, taken psychiatric medications, required special services in school, or had trauma/injury that could potentially affect cognitive functioning and/or brain development. 
All participants in both groups had Full Scale IQs (FSIQ) above 80, as measured by the Wechsler Abbreviated Scale of Intelligence (autism: n = 20, TD: n = 25), Wechsler Adult Intelligence Scale-III (autism: n = 2), Wechsler Intelligence Scale for Children-III (autism: n = 2) or Wechsler Intelligence Scale for Children-IV (autism: n = 4). Participants were group-matched on FSIQ (see Table 1 for details). Informed assent and/or consent were obtained for all participants.

Table 1 Participant characteristics

Procedure

The Tower of London-Drexel (TOL-Dx) (Culbertson and Zillmer 1998) requires participants to move blue, red, and green beads across three differently sized pegs, as quickly as possible and without making mistakes, so as to copy a pattern demonstrated by the experimenter. This target pattern remains in full view at all times. A modified version of the TOL-Dx was administered to all participants: five of the ten trials were completed under normal conditions, and the other five were completed under articulatory suppression (AS).

Participants were given standard TOL-Dx instructions to follow two rules while completing this task: (1) not to place any more beads on a peg than it will hold (the largest peg can hold three beads, the middle-sized peg can hold two beads, and the smallest can hold one bead) and (2) to move only one bead at a time; not to move two beads off the pegs at the same time. The examiner also informed participants that some of the trials would be “metronome trials” (AS trials). In these AS trials, participants were instructed to say a one syllable word (“up”) to the beat of a metronome (one beat per second) during task completion. To familiarize participants with this novel procedure, they were asked to practice saying “up” to the metronome prior to completing any trials. On the rare occasion that a subject stopped saying “up”, the experimenter prompted him/her by saying “don’t forget to say ‘up’.”

Five TOL difficulty levels were sequentially presented after two practice trials, one under AS and one under normal testing conditions: two 3-move trials, two 4-move trials, two 5-move trials, two 6-move trials, and two 7-move trials for a total of 10 trials. Trials were alternately administered under AS so that there was one AS and one non-AS trial at each difficulty level. The order of AS and non-AS trials was counterbalanced within subject so that an equal number of problems began with the AS or non-AS condition. Following standard procedure, a “move score” was calculated for each trial by subtracting the minimum number of moves a trial required from the number the participant took to complete the trial.

Consistent with standard administration of the TOL-Dx, trials involving 3–7 moves were administered. However, limited variance due to ceiling effects among 3-move trials led to exclusion of these data from all analyses. Previous studies have suggested that nearly 100% of TD adolescents achieve a perfect score on 3-move trials in tower tasks (Luciana and Nelson 1998) and other investigators have found that the 3-move trials tap different cognitive abilities and fail to activate pre-frontal brain regions, as the longer move trials do (Dagher et al. 1999).
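The scoring rule and the exclusion of 3-move trials described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the trial records, the field names (`min_moves`, `condition`, `moves_taken`), and the helper `mean_move_score` are hypothetical, and averaging move scores across trials within a condition is an assumption, since the text does not say whether scores were averaged or summed.

```python
from statistics import mean


def move_score(moves_taken, minimum_moves):
    """Extra moves on one trial: moves taken minus the minimum required."""
    return moves_taken - minimum_moves


def mean_move_score(trials, condition, excluded_levels=(3,)):
    """Mean move score for one condition, skipping excluded difficulty levels.

    Averaging across trials is an assumption made for this sketch.
    """
    scores = [
        move_score(t["moves_taken"], t["min_moves"])
        for t in trials
        if t["condition"] == condition and t["min_moves"] not in excluded_levels
    ]
    return mean(scores)


# Hypothetical data for one participant: one AS and one non-AS trial
# at each of the five difficulty levels (3 to 7 moves).
trials = [
    {"min_moves": 3, "condition": "non-AS", "moves_taken": 3},
    {"min_moves": 3, "condition": "AS", "moves_taken": 4},
    {"min_moves": 4, "condition": "AS", "moves_taken": 6},
    {"min_moves": 4, "condition": "non-AS", "moves_taken": 4},
    {"min_moves": 5, "condition": "non-AS", "moves_taken": 7},
    {"min_moves": 5, "condition": "AS", "moves_taken": 9},
    {"min_moves": 6, "condition": "AS", "moves_taken": 8},
    {"min_moves": 6, "condition": "non-AS", "moves_taken": 9},
    {"min_moves": 7, "condition": "non-AS", "moves_taken": 7},
    {"min_moves": 7, "condition": "AS", "moves_taken": 11},
]

non_as_score = mean_move_score(trials, "non-AS")
as_score = mean_move_score(trials, "AS")
print(non_as_score, as_score)
```

A move score of zero means the problem was solved in the minimum number of moves, so higher values indicate poorer performance.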

An initial 2 × 2 mixed model ANOVA was run to examine the presence of a significant group (autism versus TD) by condition (AS versus non-AS) interaction. Based on our a priori predictions, follow-up independent t-tests were performed to examine group differences in TOL performance while paired t-tests were run to assess the effects of AS on TOL performance.

Results

The ANOVA revealed a significant main effect of Condition (F(1,51) = 6.38, p = .02), such that more moves were required to reach solutions on AS than on non-AS trials, and a trend towards a significant main effect of Group (F(1,51) = 3.99, p = .05), with TD individuals requiring fewer moves overall to reach solutions than individuals with autism. The Group × Condition interaction (F(1,51) = 1.08, p = .30) was not significant. Follow-up t-tests, however, showed that TD participants (M = 2.89, SD = 2.01) performed significantly better on the TOL under normal conditions, as indicated by lower move scores, than did participants with autism (M = 4.11, SD = 1.88; t(51) = 2.28, p = .03; Cohen’s d = 0.63) (see Fig. 1). Moreover, AS clearly affected the tower performance of the TD individuals, who took an average of 1.22 (SD = 2.61) more moves to complete the task under AS than under normal (i.e., non-AS) conditions (t(24) = 2.34, p = .03; Cohen’s d = 0.47) (see Fig. 1). In contrast, individuals with autism took an average of only 0.51 (SD = 2.38) more moves to complete the tower task under AS than under normal conditions, a non-significant difference (t(27) = 1.13, p = .27; Cohen’s d = 0.21). Finally, tower performance for the autism group under normal conditions (M = 4.11, SD = 1.88) was indistinguishable (t(51) = 0.01, p = .99; Cohen’s d < 0.01) from tower performance under AS in the TD group (M = 4.11, SD = 2.14) (see Fig. 1).
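As a rough check on the follow-up tests, the t statistics and effect sizes can be recomputed from the summary statistics reported above. The sketch below is not the authors' analysis code. It assumes a pooled-variance (Student) independent-samples t-test with Cohen's d based on the pooled SD, and, for the paired tests, Cohen's d computed as the mean difference divided by the SD of the differences. The function names are illustrative.

```python
import math


def independent_t(m1, sd1, n1, m2, sd2, n2):
    """Pooled-variance t-test from summary statistics.

    Returns (t, Cohen's d using the pooled SD, degrees of freedom).
    """
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    t = (m1 - m2) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    d = (m1 - m2) / pooled_sd
    return t, d, df


def paired_t(mean_diff, sd_diff, n):
    """Paired t-test from the mean and SD of within-person differences.

    Returns (t, d = mean_diff / sd_diff, degrees of freedom).
    The d formula is an assumption; the text does not specify it.
    """
    t = mean_diff / (sd_diff / math.sqrt(n))
    d = mean_diff / sd_diff
    return t, d, n - 1


# Group difference under normal (non-AS) conditions: autism vs. TD
t_group, d_group, df_group = independent_t(4.11, 1.88, 28, 2.89, 2.01, 25)

# Cost of AS within each group (AS minus non-AS move scores)
t_td, d_td, df_td = paired_t(1.22, 2.61, 25)
t_asd, d_asd, df_asd = paired_t(0.51, 2.38, 28)

for label, t, d, df in [
    ("autism vs. TD, non-AS", t_group, d_group, df_group),
    ("AS cost, TD", t_td, d_td, df_td),
    ("AS cost, autism", t_asd, d_asd, df_asd),
]:
    print(f"{label}: t({df}) = {t:.2f}, d = {d:.2f}")
```

Under these assumptions the recomputed values agree with the reported statistics to two decimal places, which suggests the reported tests are internally consistent with the reported means and standard deviations.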

Fig. 1 Mean number of extra moves (±SEM) needed to complete the tower problems under articulatory suppression (AS) or without articulatory suppression (non-AS) for adolescents with autism spectrum disorder and typically developing adolescents

Discussion

Corroborating the extant literature (Hill 2004; Pennington and Ozonoff 1996), TD adolescents were more proficient at the TOL when tested under standard conditions than were adolescents with autism. In contrast, under AS adolescents with autism and TD adolescents demonstrated comparable TOL performance. A comparison of TOL performance under AS and non-AS conditions indicated that AS had little to no effect on participants with autism, but significantly impaired the TD group. Furthermore, TOL performance under AS for TD adolescents was equivalent to TOL performance under normal conditions for individuals with autism. This suggests that inner speech may support performance on the TOL in TD adolescents. These results also support the notion that impaired inner speech may contribute to executive dysfunction among individuals with autism.

The present study replicates and extends experiment 3 from Whitehouse et al. (2006). They found more pronounced AS effects in TD children relative to children with autism; however, under normal (non-AS) conditions, task switching was not impaired in the autism group. Thus, we show not only greater AS cost for TD individuals (as opposed to individuals with autism), but also that the magnitude of interference from AS reduces tower performance among TD individuals to a level comparable to the baseline impairment observed within the autism group.

Beyond suggesting one potential mechanism underlying some forms of executive dysfunction in autism, the present findings point to a potential target for intervention. Our findings support the idea that children with autism are impaired in their ability to use internally generated language to guide independent problem solving. Language is a cultural tool originating in the social world (Vygotsky 1962) and therefore it is perhaps unsurprising that a fundamentally social disorder such as autism results in idiosyncratic use of language. Self-talk links the social and private world, insofar as it allows an individual to use the language of others (e.g., directions, explanations) to guide his/her problem solving and decision-making. The value of explicit training in self-talk strategies for children with autism could be explored. Additionally, providing children with autism with written as opposed to oral instructions could reduce requirements for inner speech and facilitate independent problem solving.

Replication and extension of these results are needed. The present study contained no dual-task control for the AS condition. To ensure that these effects are specific to disruption of inner speech usage and not due to more general dual-task interference, future research should include a control (e.g., finger tapping) condition. Similarly, using a modified version of a standardized clinical task for assessing the effects of AS on EF, though increasing our external validity, may not have provided ideal reliability given the limited number of trials. Future research should address this limitation. Additionally, the validity of the inner speech-EF connections in autism could be assessed by similarly using AS procedures during completion of the Wisconsin Card Sorting Test (Baldo et al. 2005). It would also prove revealing to examine the specificity of these findings to autism by using this procedure in other clinical groups with demonstrated executive dysfunction (e.g., schizophrenia, ADHD). Finally, numerous tasks have been used to assess inner speech difficulties in autism (including word length [experiment 2 from Whitehouse et al. 2006], phonological, and visuospatial similarity [Williams et al. 2008] effects, in addition to AS), resulting in mixed findings; a clearer understanding may be reached by testing the same participants using multiple methods.