Introduction

Clinicians have long observed that children with autism spectrum disorder (ASD) are impaired in their ability to generalize—that is, to relate new stimuli to past experiences (Rimland 1964). For example, imagine a child who learns a social script to respond to “hi,” but then fails to apply this script when someone says, “hey.” Generalizing a skill learned in treatment to everyday use is one of the most significant barriers to treatment success (for reviews, see Karkhaneh et al. 2010; Vismara and Rogers 2010; Wass and Porayska-Pomsta 2013). In an early study of this phenomenon, nearly half of children with ASD who learned new behaviors in a treatment room failed to transfer these skills to a new setting (Rincover and Koegel 1975). Many current treatment studies make generalization to everyday settings an explicit treatment goal (e.g., Ingersoll et al. 2007; Koegel et al. 2012; Laski et al. 1988; Pierce and Schreibman 1997; Taylor and Harris 1995), emphasizing the critical role that generalization is thought to have in child outcomes.

Despite the critical importance of generalization impairments to intervention in ASD, experimental work on generalization in ASD has been strikingly limited. Most experimental studies tapping generalization have focused on categorization and word learning. In contrast to the intervention literature, these studies have not supported robust impairments in generalization in ASD; rather, children with ASD appear to be less efficient in their approach to generalization. For example, individuals with ASD can form categories and correctly extend category structure to new exemplars; however, they are both slower (Gastgeb et al. 2006) and less consistent (Naigles et al. 2013) than matched controls in how they make these extensions. Word learning studies have demonstrated that generalization in ASD is specifically related to language level (Hani et al. 2013; Hartley and Allen 2014), providing a clue about the domains that are supported by generalization (or potentially the domains that support generalization itself).

Here we use a novel paradigm to test a different form of generalization—the ability to transfer a strategy utilized in one context to a similar but not identical context—in verbally fluent children and adolescents with ASD. Specifically, we tested participants’ tendency to generalize the mutual exclusivity strategy—a strategy used to learn new words—to make inferences about facts. Mutual exclusivity refers to children’s tendency to treat object labels exclusively; that is, when hearing a new word, children are more likely to assume the word applies to an object for which they do not already have a name (Markman and Wachtel 1988). This lexical constraint emerges early in development (i.e., before age 2; Graham et al. 1998; Halberda 2003; Littschwager and Markman 1994) and is associated with expressive vocabulary growth (Graham et al. 1998). Several groups have now shown that children with and at-risk for ASD effectively apply mutual exclusivity when making word-object mappings (Bedford et al. 2013; de Marchena et al. 2011; Preissler and Carey 2005).

To test generalization of the mutual exclusivity strategy, youth with ASD and typically developing (TD) controls completed two tasks (reanalysis of data presented in de Marchena et al. 2011), originally designed by Diesendruck and Markson (2001) to test mechanisms underlying the mutual exclusivity constraint. The first task—a relatively straightforward word learning task—generally elicited the use of an exclusivity strategy. We predicted that participants with ASD would be less likely than TD participants to apply this exclusivity strategy to an analogous task in which new facts were learned instead of new words. That is, participants with ASD would be less likely to generalize the exclusivity strategy to a new context. Further, we predicted that youth with ASD would generalize less consistently across trials. Finally, based on the literature demonstrating an association between language skills and generalization in ASD (Hani et al. 2013; Hartley and Allen 2014), we predicted that generalization weaknesses would be associated with underlying language skills in the ASD sample.

Method

Participants

Youth with ASD. Participants were 48 verbally fluent children and adolescents with ASD. Diagnoses were confirmed through (1) administration of the Social Communication Questionnaire, Lifetime Version (SCQ; Rutter et al. 2003), and (2) either review of clinical diagnostic reports provided by the parents (n = 32) or administration of the Autism Diagnostic Observation Schedule (ADOS; Lord et al. 2002) Module 3 or 4 by a research-reliable clinician (n = 16).

Receptive vocabulary standard scores of 85 or higher, as assessed by the Peabody Picture Vocabulary Test (PPVT; Dunn and Dunn 1997), were required for study inclusion. Six participants with ASD were excluded for the following reasons: ASD diagnoses not confirmed (n = 2), PPVT below 85 (n = 4). See Table 1 for characteristics of the final sample.

Table 1 Participant characteristics for participants with autism spectrum disorder (ASD) and typically developing (TD) controls

TD youth. Participants were 68 youth with a typical developmental history, including no first-degree relatives with an ASD diagnosis, no developmental delays, and no known neurological impairments. Twenty-eight participants were excluded for the following reasons: failure to match to the ASD group (n = 20), high score (above nine) on the SCQ (n = 4), experimenter error in task administration (n = 3), and current concerns regarding social impairments (n = 1). The 40 remaining youth were matched to the ASD group on chronological age and receptive vocabulary.

Experimental Task

The task compared children’s tendency to treat words exclusively with their tendency to treat facts exclusively (Diesendruck and Markson 2001, Study 1). Two within-subjects conditions varied only in whether labels (i.e., words) were used to label and request novel objects or whether facts were used; see Fig. 1 for sample objects. Each of six trials per condition had an information phase, followed by a test phase; see Table 2.

Fig. 1

Image of experimenter requesting object during task. All objects used during the task were novel objects: either household objects likely to be unfamiliar to children, or made-up objects constructed in the lab. The object set used for each condition was counterbalanced across participants, as was the side of presentation

Table 2 Task administration procedures, by condition

Counterbalancing

The specific stimuli used for each condition and the side of presentation of these stimuli were fully counterbalanced. The first condition administered (label or fact) was also counterbalanced, with participants pseudo-randomly assigned to receive label or fact first while maintaining balance within diagnostic groups. Of the 42 participants with ASD, 21 received the label condition first and 21 received the fact condition first. Of the 40 participants with TD, 19 received the label condition first and 21 received the fact condition first. For the original study, it was important that participants not apply strategies that they had formed in the first condition to the second condition. To minimize the chance that participants would do this, the second condition was administered no sooner than 2 weeks after the first, with the exception of two adolescents with ASD. In all cases, the experimenter was the same for both days of testing.
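The assignment of condition order can be illustrated with a short sketch. The report does not describe the actual randomization procedure, so the function name, the participant IDs, and the fixed seed below are all hypothetical; the sketch shows only one simple way to assign order pseudo-randomly while keeping a diagnostic group balanced.

```python
import random


def assign_first_condition(participant_ids, seed=0):
    """Pseudo-randomly split one diagnostic group between label-first and fact-first.

    Hypothetical helper: the source does not specify how assignment was done.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    # The first half of the shuffled list gets the label condition first;
    # the remainder gets the fact condition first.
    return {pid: ("label" if i < half else "fact") for i, pid in enumerate(ids)}


# 42 made-up IDs standing in for the ASD group.
asd_ids = [f"ASD{n:02d}" for n in range(1, 43)]
asd_order = assign_first_condition(asd_ids)
label_first = sum(1 for cond in asd_order.values() if cond == "label")
print(label_first, len(asd_order) - label_first)  # 21 21
```

Running the function once per diagnostic group keeps the split even within each group; with an odd group size, the extra participant falls in the fact-first half.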

Analysis Plan

Detailed task performance results are presented in de Marchena et al. (2011), and will not be repeated here. Critically, participants across groups had significantly higher scores on the label condition (mean 86% correct) than on the fact condition (mean 71% correct), t(81) = 3.97, p < .001, Cohen’s d = 0.44, demonstrating that they tended to find the label condition easier. To test generalization in the current study, we compared participants who completed the fact condition first (referred to as fact-first participants) to those who completed the fact condition second (fact-second participants). The hypothesis was that participants who had experience applying an exclusivity strategy on the relatively straightforward label condition would perform better on the more ambiguous fact condition. An initial 2 × 2 univariate ANOVA was run to test for a significant diagnostic group (ASD vs. TD) by condition order (fact-first vs. fact-second) interaction, with performance on the fact task as the dependent variable. Based on our a priori predictions, planned independent-samples t tests were used both to compare fact performance across diagnostic groups based on condition order, and to compare fact-first versus fact-second participants within diagnostic groups. As a measure of generalization consistency, a χ² test for independence was used to compare perfect performers (6/6 correct trials) to imperfect performers (anything less than 6/6 trials correct).
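As an illustration of the within-subjects comparison above, the following sketch computes a paired-samples t statistic and converts it to an effect size. It assumes the effect size is d_z (mean difference divided by the standard deviation of the differences, equivalently t divided by the square root of the number of pairs); the report does not state which variant of Cohen's d was used, although the reported value is consistent with this one. The function names and the demo scores are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(scores_a, scores_b):
    """Paired-samples t statistic for two equal-length lists of scores."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / standard_error


def cohens_d_from_paired_t(t_value, n_pairs):
    """Convert a paired t statistic to d_z (assumed variant of Cohen's d)."""
    return t_value / math.sqrt(n_pairs)


# Reported test: t(81) = 3.97, so there were 81 + 1 = 82 paired observations.
d_reported = cohens_d_from_paired_t(3.97, 82)
print(round(d_reported, 2))  # 0.44

# Made-up trials-correct scores (out of 6), for illustration only.
label_scores = [6, 5, 6, 6, 4, 6]
fact_scores = [5, 4, 6, 3, 4, 5]
t_demo = paired_t(label_scores, fact_scores)
d_demo = cohens_d_from_paired_t(t_demo, len(label_scores))
```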

Individual difference analyses were conducted to examine generalization effects in fact-second participants only (i.e., those who had the opportunity to generalize from the label condition). As a proxy for individual participants’ tendency to generalize, gain scores were computed by subtracting mean performance of all fact-first participants from each individual fact-second participant’s score on the fact condition. Gain scores were compared to chronological age and PPVT standard scores using bivariate correlations. Given limited variability in gain in the TD group, these analyses were conducted in the ASD group only.
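The gain-score computation described above can be sketched as follows. All scores and vocabulary values are made up for illustration, and the function names are hypothetical; only the definition of the gain score (an individual fact-second score minus the mean of all fact-first scores) and the use of a bivariate (Pearson) correlation come from the text.

```python
import math
from statistics import mean


def gain_scores(fact_second_scores, fact_first_scores):
    """Subtract the fact-first group mean from each fact-second participant's score."""
    baseline = mean(fact_first_scores)
    return [score - baseline for score in fact_second_scores]


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up trials-correct scores (out of 6) and PPVT standard scores.
fact_first = [4, 3, 5, 4]          # baseline group: no prior label experience
fact_second = [6, 4, 5, 6, 3]      # group that completed the label condition first
ppvt = [120, 100, 105, 118, 92]    # one PPVT score per fact-second participant

gains = gain_scores(fact_second, fact_first)
r_gain_ppvt = pearson_r(gains, ppvt)
print(gains, round(r_gain_ppvt, 2))
```

Because every fact-second participant is compared against the same baseline, the gain score is a linear shift of the raw fact score, so its correlations with age and PPVT equal those of the raw fact scores.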

Results

Prior to testing generalization itself, performance on the label condition for fact-first versus fact-second participants was examined to establish that fact-second participants were not, by chance, more likely to use exclusivity for word learning than fact-first participants. An independent-samples t test demonstrated that the groups performed similarly, t(80) = 0.67, p = .50, Cohen’s d = 0.15, suggesting a similar overall tendency toward exclusivity.

The ANOVA revealed no main effect of diagnostic group, F(1,78) = 1.65, p = .20, partial η² = 0.02, indicating that, overall, performance on the fact condition did not differ significantly between the TD and ASD groups. The main effect of condition order was significant, F(1,78) = 12.91, p = .001, partial η² = 0.14, with fact-second participants performing significantly better on the fact condition than fact-first participants. The diagnostic group by condition order interaction was not significant, F(1,78) = 1.94, p = .17, partial η² = 0.02; see Fig. 2.
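The reported effect sizes can be recovered from the F statistics and their degrees of freedom using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the three reported tests; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics as reported in the text, each with df = (1, 78).
reported_f = {
    "diagnostic group": 1.65,
    "condition order": 12.91,
    "group x order interaction": 1.94,
}

effect_sizes = {
    name: round(partial_eta_squared(f, 1, 78), 2) for name, f in reported_f.items()
}
print(effect_sizes)
# {'diagnostic group': 0.02, 'condition order': 0.14, 'group x order interaction': 0.02}
```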

Fig. 2

Percent correct on the fact condition by participants with TD and ASD. “Fact first” bars indicate participants who completed the fact condition first. “Label first” bars indicate participants who completed the analogous label condition 2 weeks prior to completing the fact condition. Chance performance is 50% correct. Error bars represent standard errors. Inferential statistics describing group differences are presented in the text

ASD and TD groups performed similarly when the fact condition was administered first, t(40) = −0.07, p = .94, Cohen’s d = −0.02; however, the TD group performed better than the ASD group when the fact condition was second, t(38) = 1.99, p = .05, Cohen’s d = 0.62 (see Fig. 2). Examining diagnostic groups separately, the TD fact-second group performed significantly better than the TD fact-first group, t(38) = 4.12, p < .001, Cohen’s d = 1.31, a very large effect. In contrast, there was no significant difference between fact-first and fact-second participants with ASD, t(40) = 1.40, p = .17, Cohen’s d = 0.43 (see Fig. 2).
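For readers who wish to relate the t statistics above to the effect sizes, a minimal sketch follows. It assumes Cohen's d was computed with a pooled standard deviation, in which case d = t × sqrt(1/n1 + 1/n2); the report does not confirm this. Group sizes are taken from the Counterbalancing section (ASD: 21 fact-first, 21 fact-second; TD: 21 fact-first, 19 fact-second). The function and variable names are illustrative.

```python
import math


def cohens_d_from_independent_t(t_value, n1, n2):
    """Cohen's d implied by an independent-samples t (pooled-SD assumption)."""
    return t_value * math.sqrt(1 / n1 + 1 / n2)


# (description, reported t, n1, n2, reported d)
comparisons = [
    ("TD vs ASD, fact first", -0.07, 21, 21, -0.02),
    ("TD vs ASD, fact second", 1.99, 19, 21, 0.62),
    ("TD, fact second vs fact first", 4.12, 19, 21, 1.31),
    ("ASD, fact second vs fact first", 1.40, 21, 21, 0.43),
]

recomputed = {}
for description, t_value, n1, n2, d_reported in comparisons:
    recomputed[description] = cohens_d_from_independent_t(t_value, n1, n2)
    print(f"{description}: recomputed d = {recomputed[description]:.2f}, "
          f"reported d = {d_reported:.2f}")
```

The recomputed values agree with the reported ones to within about 0.01, which is the size of discrepancy expected when working backward from t values rounded to two decimals.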

Among fact-first participants, there was no diagnostic group difference in the number of perfect performers, χ²(1, N = 42) = 0.17, p = .68, demonstrating that, at baseline, participants in both diagnostic groups were equally likely to use exclusivity 100% of the time. In contrast, among fact-second participants, TD participants were significantly more likely to be perfect performers than participants with ASD, χ²(1, N = 40) = 6.51, p = .01, as shown in Fig. 3.
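The χ² test for independence on a 2 × 2 table (diagnostic group by perfect versus imperfect performance) can be sketched as below. The cell counts in the demo are made up, because the report gives only the test statistics, and the sketch assumes no continuity correction was applied, which the report does not state. For one degree of freedom, the p value follows from the complementary error function, which makes it possible to check the reported p values against the reported statistics.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi_square):
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))


# p values implied by the statistics reported in the text.
print(round(p_value_df1(0.17), 2))  # 0.68
print(round(p_value_df1(6.51), 2))  # 0.01

# Made-up counts, NOT the study's data.
# Rows: group A, group B; columns: perfect, imperfect performers.
demo_statistic = chi_square_2x2(8, 2, 3, 7)
demo_p = p_value_df1(demo_statistic)
```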

Fig. 3

Percent of participants from each group who attained perfect performance (in dark gray) or less than perfect performance (in light gray) on the fact condition. Note that while perfect performance was equally likely across diagnostic groups in fact-first participants, among fact-second participants, those with TD were more likely to exhibit perfect performance than those with ASD

Gain scores, computed for fact-second participants only (i.e., each participant’s fact score minus the mean fact score of all fact-first participants), were marginally higher in TD participants than in participants with ASD; see Table 1. Among participants with ASD, age was not significantly correlated with gain scores, r(21) = .25, p = .28. PPVT standard scores were positively correlated with gain in the ASD group, r(21) = .56, p = .01: participants with larger receptive vocabularies for their chronological age showed greater gains.

Discussion

This study focused on the tendency of children and adolescents with ASD to generalize a problem-solving strategy (i.e., exclusivity) from one context to another. Prior experience and success with the label condition did transfer to fact performance, as demonstrated by a main effect of condition order, and by the finding that TD participants showed a dramatic improvement in performance on the fact condition when they had already seen and succeeded on the parallel label condition. In contrast, youth with ASD did not significantly improve fact performance based on experience with the label condition, resulting in stronger fact-second performance in TD participants relative to ASD despite equivalent fact-first performance across diagnostic groups. That is, TD youth were more successful on the fact condition due to their experience with the label condition; this was not the case for participants with ASD. Further, among fact-first participants, youth with ASD and TD were equally likely to be perfect performers (i.e., 100% use of exclusivity) on the fact condition. In contrast, among fact-second participants, youth with TD were more likely to be perfect performers, suggesting decreased consistency of generalization in ASD, as has been demonstrated by others (Hartley and Allen 2014; Naigles et al. 2013). In the current study, it appears that some participants with ASD may have recognized the similarity between the two contexts, and increased their use of exclusivity accordingly; however, they did not commit to applying this strategy consistently in the same way that the majority of TD participants did.

One major limitation of this study is that we did not explicitly teach participants the exclusivity strategy. Participants did not receive any feedback on either condition, so not only was it unclear whether exclusivity was the “correct” approach to the ambiguous fact condition, but it was also unclear whether it was the correct approach to the label condition, from which they generalized (see Footnote 1). While speculative, it appears that when task demands are ambiguous, TD individuals have a bias to generalize from past experience, even in the absence of specific feedback. This bias appears to be less robustly developed in ASD. Our data do not speak directly to how generalization might work in ASD when skills are taught explicitly, a process that more closely parallels the problems observed in intervention. Our design reflects generalization of spontaneously acquired skills; it is possible that impaired generalization observed in our study would be attenuated if the to-be-generalized skill were taught explicitly. However, an alternative possibility is that generalization impairments would be even greater when testing generalization of a skill that does not come as naturally (verbal children with ASD acquire the mutual exclusivity bias during language acquisition without support; Bedford et al. 2013; de Marchena et al. 2011; Preissler and Carey 2005), even when explicitly taught. The point of intervention itself is to teach skills that do not come naturally; thus, research in this area is of great importance. Strategy transfer designs, such as ours, can contrast generalization of explicitly taught skills versus spontaneously (and implicitly) acquired skills to address these questions.

With respect to individual differences in the ASD group, the tendency to generalize was uncorrelated with age; however, it was strongly correlated with receptive vocabulary, such that participants who showed a stronger, or more consistent, tendency to generalize had larger receptive vocabularies for their age. The current study did not include a measure of nonverbal IQ, thus it is unknown whether nonverbal reasoning skills are also related to generalization. The finding that generalization was uncorrelated with age suggests that it may be specifically related to vocabulary growth and verbal reasoning. Further, this is not the first study to find a relationship between generalization and vocabulary, when relationships were not observed in other domains. For example, in children with ASD and intellectual disability, receptive language, but not age or nonverbal developmental level, was positively correlated with generalization skill (Hartley and Allen 2014). Similarly, children with ASD who passed a word generalization task had stronger expressive and receptive language skills than failures, but did not differ in age (Hani et al. 2013). There is likely a dynamic relationship between generalization and receptive language, such that children who are stronger generalizers are able to use this skill to build larger vocabularies; as development unfolds, children may also be able to use verbal reasoning skills to extract meaningful relationships between familiar and novel contexts, thereby improving their ability to generalize. These hypotheses can be addressed in future studies.

The current study demonstrates subtle weaknesses in generalization in a large sample of verbally fluent children and adolescents with ASD. This study was not originally designed to assess generalization, a fact that brings with it several limitations, and a call for more research in this area. These findings represent a first step toward understanding strategy transfer in ASD, a phenomenon that parallels the weaknesses often noted by interventionists, allowing for a new experimental perspective on generalization in ASD. A limitation of the field broadly is that while we agree that generalization is a problem in ASD, there is no consensus on what generalization is exactly or how to best test it. Strategy transfer designs such as ours may serve as a useful bridge between the bulk of the experimental generalization literature, which has primarily focused on categorization and word learning, and the intervention literature, which typically looks at the spontaneous use of newly learned skills across a range of contexts.

Several theories have been proposed to account for the generalization weaknesses observed in ASD, for example, stimulus over-selectivity, originally described by Lovaas et al. (1979), weak central coherence (Happé and Frith 2006), and enhanced discrimination of perceptually similar stimuli (Plaisted 2001). While a full discussion of these theories is beyond the scope of this brief report, strategy transfer designs could be used to experimentally manipulate features such as perceptual similarity to test the validity of these theories in explaining generalization weaknesses in ASD. Strategy transfer may also be a form of abstract analogical reasoning, in that it requires recognition of similarities across contexts to generate a problem-solving strategy. Studies of abstract analogical reasoning in ASD are also very limited; however, strategy transfer via analogical reasoning in ASD appears intact when paired with explicit cueing to generalize (Green et al. 2014; Morsanyi and Holyoak 2010), a hypothesis that has important implications for intervention. More research needs to be done in this area to understand both intervention strategies to enhance spontaneous generalization, and the theoretical underpinnings of generalization in both TD and ASD.