Introduction

Autism spectrum disorder (ASD) comprises a range of conditions that all show difficulties with reciprocal social interaction and communication as well as repetitive and stereotyped behaviours, often accompanied by characteristic language difficulties or global cognitive impairment (American Psychiatric Association 2000). Individuals with minimal language difficulties and normal or higher-than-normal IQ are often referred to as ‘high-functioning’, a category that includes Asperger disorder (Volkmar and Klin 2000). There is now a large body of evidence showing that high-functioning individuals with ASD experience subtle but characteristic memory difficulties (see Bowler and Gaigg 2008; Boucher et al. 2012, for reviews), the patterning of which can provide both clues to underlying neuropsychological functioning and potential pointers to intervention. Relatively undiminished performance on recognition (Bowler et al. 2000; Minshew et al. 1992) contrasts with greater difficulty on free recall tasks, particularly when materials are semantically related (Bowler et al. 1997; Smith et al. 2007, but see Leekam and Lopez 2003) or when learning is assessed over a number of trials or lists (Bowler et al. 1997; Bowler et al. 2008; Minshew and Goldstein 1993, 2001; Smith et al. 2007). This patterning of performance led Bowler et al. (2004) to propose a task support hypothesis (TSH), which states that memory in ASD will be better on any task with a test procedure that provides information about the studied material.

Other aspects of the pattern of memory performance in ASD also point to a difficulty in the flexible processing of relations among elements of multi-dimensional material as well as the binding together of subsets of these elements in ways that are task-relevant or unique to the episode in which they were studied (see Zimmer et al. 2006; Shimamura 2010). Diminished recall of categorised word lists in the context of relatively spared recognition performance suggests enhanced processing of individual items of material, coupled with difficulties in processing of relations among items. Both enhanced processing of individual items and diminished processing of inter-item relations were observed in ASD participants by Gaigg et al. (2008). These researchers found that recall performance of adults with ASD was poorer under conditions that relied heavily on the ability to relate relatively infrequent members of a category together (e.g. two items of fruit in a list of 52 words) but not under conditions that relied heavily on focussing on the characteristics of each item belonging to a particular category (e.g. remembering a 13th, 14th and 15th item of furniture in the same list). People with ASD also show a reduced ability to re-experience the spatio-temporal context that characterises the recollection of a personally experienced episode (Bowler et al. 2000; Bowler et al. 2007) as well as difficulties with episodic future thinking (EFT), i.e. in imagining themselves in future situations (Lind and Bowler 2010) and in scene construction, which is the ability to accurately create an imaginary scene or re-create a previously experienced scene (Lind et al. 2014). These last observations reflect abnormalities in relational binding—the capacity to bind disparate aspects of experience into flexible configurations—and to relate this configural experience either to stored representations or to a personally constructed view of the world.

Difficulty with recall in the presence of intact recognition is also seen in healthy ageing, especially when accompanied by frontal lobe decline (Craik and Anderson 1999), an observation that led Bowler (2007) to propose an ‘ageing analogy’ as a heuristic for developing our understanding of memory in ASD. Furthermore, relational processing difficulties have been demonstrated in healthy older adults by Chalfonte and Johnson (1996), who found that when these participants studied grids in which some cells contained drawings of objects in non-canonical colours (e.g. a pink banana), their recognition of item-colour and item-location combinations was significantly worse than their recognition of individual items, colours or places. The existing behavioural findings on memory difficulties in ASD would lead us to predict similarly diminished recognition of combinations of features in the presence of intact recognition of individual features in adults with high-functioning ASD if tested with Chalfonte and Johnson’s paradigm. Such a finding would represent a strong test of relational memory difficulties in ASD because the paradigm uses a supported test procedure (recognition), which is known to pose fewer difficulties for people with ASD than an unsupported one (free recall). Because of the importance of executive difficulties both in ASD and in the memory functioning of older individuals, in Experiment 2 we administered the Color Trails Test (CTT, D’Elia et al. 1996). This consists of two trials. Trial 1 measures sustained attention and requires participants to join up, in numerical order, circles containing the numbers 1–25 that are randomly distributed on the page. Trial 2 measures attentional shifting and comprises two sets of 25 circles, one yellow and one pink, each set containing the numbers 1–25. The participant has to join the circles in numerical order, alternating between pink and yellow circles.

Experiment 1

Method

Participants

Eighteen individuals with ASD (5 female, 13 male) and 18 typical individuals (4 female, 14 male) participated in Experiment 1. Groups were closely matched in terms of chronological age, number of years of formal education and cognitive ability measured by the Wechsler Adult Intelligence Scale (WAIS-III UK; The Psychological Corporation, 2000, see Table 1). Participants with ASD were recruited from a panel maintained by the Autism Research Group at City University London. A review of available medical records confirmed that all ASD participants had received a clinical diagnosis according to DSM-IV-TR criteria (American Psychiatric Association 2000) by clinical psychologists or psychiatrists in the UK. The Autism Diagnostic Observation Schedule Generic (ADOS-G; Lord et al. 2000) was also administered to all individuals in this group by persons trained to research reliability on this instrument. The observations confirmed difficulties in the areas of reciprocal social and communicative behaviours consistent with the clinical diagnosis (Communication score: M = 2.5, SD = 1.5, Range = 0–5, ASD cut-off = 2; Reciprocal Social Interaction score: M = 7.2, SD = 3.0, Range = 3–12, ASD cut-off = 4; Total score: M = 9.7, SD = 3.7, Range = 5–17, ASD cut-off = 7). For Experiment 2, the ADOS was also available for 14 participants (Communication score: M = 2.6, SD = 1.6, Range = 0–6, ASD cut-off = 2; Reciprocal Social Interaction score: M = 6.7, SD = 2.7, Range = 3–12, ASD cut-off = 4; Total score: M = 9.3, SD = 3.6, Range = 5–17, ASD cut-off = 7). The typical comparison participants were recruited via local newspaper advertisements. All were free of psychotropic medication and did not report any family history of neuropathology or psychiatric illness. Analysis of the age, educational and IQ data set out in Table 1 revealed no significant differences (max t = 0.53, min p = .60). The Ishihara Tests for Colour Deficiency (Ishihara 1999) confirmed that no one had colour vision deficits. All participants gave their informed consent and were paid standard University fees for their time.

Table 1 Average age and IQ scores for the ASD and Typical Comparison Group in Experiment 1

Materials and Design

Following Chalfonte and Johnson (1996), we generated six sets of 22 line drawings from the Snodgrass and Vanderwart (1980) corpus, using its norms for familiarity, visual complexity, naming agreement (i.e. agreement amongst participants in naming the objects depicted) and image agreement (i.e. how closely the drawings resembled participants’ mental images of the objects). Ratings for drawings in our selected six sets matched closely on all these variables (all ts < 1.5). We also ensured even distribution of categories (e.g. fruit, furniture…) across the six sets. The 22 drawings in each set were randomly allocated to locations in a 6 × 6 grid, which measured 18 cm × 18 cm on a 30 cm laptop monitor. Each line drawing was presented in one of 32 unique colours in the centre of the 3 cm × 3 cm grid locations (see Fig. 1).
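The random allocation described above can be illustrated with a short sketch. It is not the authors' stimulus-generation code: the function name `make_study_array`, the placeholder drawing and colour labels, and the fixed random seed are assumptions introduced purely for illustration. The only details taken from the text are the 6 × 6 grid, the 22 drawings per array and the pool of 32 colours, with each drawing receiving its own cell and its own colour.

```python
import random

GRID_SIZE = 6     # 6 x 6 grid, as described in the text
N_DRAWINGS = 22   # drawings per study array
N_COLOURS = 32    # size of the colour pool


def make_study_array(drawings, colours, rng):
    """Give every drawing a distinct grid cell and a distinct colour."""
    if len(drawings) != N_DRAWINGS:
        raise ValueError("expected 22 drawings")
    if len(colours) != N_COLOURS:
        raise ValueError("expected a pool of 32 colours")
    cells = [(row, col) for row in range(GRID_SIZE) for col in range(GRID_SIZE)]
    # Sampling without replacement guarantees no cell or colour is reused.
    chosen_cells = rng.sample(cells, len(drawings))
    chosen_colours = rng.sample(colours, len(drawings))
    return [
        {"item": item, "cell": cell, "colour": colour}
        for item, cell, colour in zip(drawings, chosen_cells, chosen_colours)
    ]


# Placeholder labels (hypothetical); the real stimuli were line drawings.
rng = random.Random(0)
drawings = [f"drawing_{i:02d}" for i in range(N_DRAWINGS)]
colours = [f"colour_{i:02d}" for i in range(N_COLOURS)]
study_array = make_study_array(drawings, colours, rng)
```

Because 22 of the 36 cells are filled, 14 cells remain empty in every array, which is what makes a location recognition test with unfilled lure locations possible.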

Fig. 1 Example of a study array used in both experiments

Different pairs of the six study arrays served as the to-be-remembered materials for the Item, Colour and Location tests (see Fig. 2). For the Item recognition test, 10 black drawings from the studied item set and 10 from the unstudied item set were displayed in five rows of four items each (see Fig. 2a). For the Colour recognition test, 20 colour patches (10 studied and 10 unstudied) were presented in five rows of four patches each (see Fig. 2b). For the Location recognition test, 6 × 6 grids were displayed in which 20 locations were marked with a black ‘X’ (see Fig. 2c). Half of these locations had contained a line drawing during study; the other half had not. Care was taken to ensure counterbalancing of drawing, colour and location elements across participants.

Fig. 2 Examples of the item recognition test (a), colour recognition test (b) and location recognition test (c) of Experiment 1

Procedure

Participants were tested individually in a sound-attenuated laboratory dimly lit with a fluorescent desk lamp. The three experimental conditions were separated by several weeks in order to avoid interference effects. On each occasion participants were told that they would be shown a set of coloured line drawings that would appear in random locations of a 6 × 6 grid and that they should try to remember either what the line drawings were (Item condition), what colours they saw (Colour condition) or which of the locations in the grid were filled with a drawing (Location condition). It was made clear to participants that they could ignore aspects of the study array they were not asked to remember. Participants studied the array for 1 min and, after a brief description of the recognition test procedure, they were shown the appropriate test array and asked to indicate which items, colours or locations they remembered. Participants were allowed unlimited time during the test and were instructed to try not to guess.

Results and Discussion

Analysis of between-group false alarm rates revealed no significant differences (max t = 0.61, df = 34; min p = .55). Corrected recognition rates (hits minus false alarms), set out in Table 2, were analysed with a 2 (Group) × 3 (Experimental Condition) mixed ANOVA. This revealed a main effect of Experimental Condition (F (1,56.81) = 76.40, Greenhouse-Geisser correction, p < .001; effect size r = .76) but no main effect of Group (F (1,34) = 0.05, ns; effect size r = .11) or interaction between the factors (F (2,56.03) = 0.06, ns, Greenhouse-Geisser correction; effect size r = .12). These findings extend existing demonstrations of relatively undiminished verbal recognition memory in higher-functioning individuals with ASD (Bowler et al. 2000) to line drawings, colours and item locations. The question of whether recognition of combinations of these features is also undiminished is addressed in Experiment 2.
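The corrected recognition measure can be made concrete with a minimal sketch. The text specifies only that the score is hits minus false alarms; expressing both as proportions of the 10 studied and 10 unstudied test elements is an assumption made here, as are the function name `corrected_recognition` and the example counts.

```python
def corrected_recognition(hits, false_alarms, n_old=10, n_new=10):
    """Hit rate minus false-alarm rate for a single recognition test.

    hits         -- studied elements the participant endorsed
    false_alarms -- unstudied elements the participant endorsed
    n_old, n_new -- number of studied and unstudied elements in the test
                    (10 and 10 in the single-feature tests described here)
    """
    if not (0 <= hits <= n_old and 0 <= false_alarms <= n_new):
        raise ValueError("counts must lie within the number of test elements")
    return hits / n_old - false_alarms / n_new


# Hypothetical participant: endorses 8 studied and 1 unstudied element.
score = corrected_recognition(hits=8, false_alarms=1)
print(round(score, 2))
```

A score of 0 corresponds to endorsing studied and unstudied elements equally often, which is why the correction matters when participants differ in their willingness to guess.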

Table 2 Mean and standard deviations of corrected recognition rates for the item, colour and location conditions of experiment 1 as a function of group

Experiment 2

Method

Participants

Fourteen individuals with ASD (3 female, 11 male) and 15 typical individuals (2 female, 13 male) participated in Experiment 2. In order to achieve sufficiently large group sizes, nine ASD and 10 typical participants who had participated in Experiment 1 also took part in this experiment. As in Experiment 1, participants were closely matched in terms of chronological age, number of years of formal education and cognitive ability (see Table 3). All were selected in a similar manner and all met the same inclusion criteria as set out for Experiment 1. Analysis of the data in Table 3 revealed no significant differences between the groups (max t = 0.79, min p = .44).

Table 3 Average age and IQ scores for the ASD and typical comparison group in experiment 2

Materials and Design

Four new study arrays were generated in the same manner as described for Experiment 1. Each array thus comprised 22 uniquely coloured line drawings (chosen from Snodgrass and Vanderwart 1980) arranged in random locations in a 6 × 6 grid. Two arrays served as the to-be-remembered materials for the Item-Colour condition (one serving as study items and the other as lures, counterbalanced across participants) and two for the Item-Location condition with half of the participants studying one array whilst the remaining participants studied the other. The same matching for complexity, familiarity, naming and image agreement as in Experiment 1 was ensured.

Sample recognition test arrays for the Item-Colour and Item-Location conditions are depicted in Fig. 3a and b respectively. For the Item-Colour test, 20 line drawings from the studied set were presented, ten of which were shown in exactly the same colour as during study whilst the colours of the remaining ten were randomly reassigned. Thus, the test included no new item or colour information but rather old and new combinations of studied items and colours. The Item-Location test was constructed in a similar manner. The CTT (D’Elia et al. 1996) was also administered to all participants. The score used here was the time taken to complete the test standardised with a mean of 100 and an S.D. of 15. Mean scores are set out in Table 3. There was no group difference in either test measure (max t = 1.69, min p = .10).

Fig. 3 Examples of the item-colour recognition test (a) and item-location recognition test (b) of Experiment 2

Procedure

Stimulus presentation was identical to that of Experiment 1. For the Item-Colour condition, participants were instructed to try to remember both the identity and colour of each of the line drawings in the grid and told that the test would not include any new drawings or new colours but that the drawings would either appear in their original colour or swap colours with the other drawings. Similar instructions were given for the Item-Location test. After participants indicated that they had understood the instructions, they studied the grid for 1 min. During the test, they were allowed unlimited time to indicate their responses and they were asked to try not to guess.
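The structure of the Item-Colour test (intact versus recombined pairings, with no new items or colours) can be sketched as follows. This is an illustrative reconstruction, not the authors' procedure: the function name `build_item_colour_test`, the placeholder labels and the rule that every recombined drawing must receive a colour different from its studied one are assumptions. The text states only that the colours of ten drawings were randomly reassigned.

```python
import random


def build_item_colour_test(studied, rng, n_test=20):
    """Build intact and recombined item-colour test trials.

    studied -- dict mapping each studied item to its (unique) studied colour
    Returns a shuffled list of (item, colour, trial_type) tuples.
    """
    items = rng.sample(sorted(studied), n_test)
    intact, recombined = items[: n_test // 2], items[n_test // 2:]

    old_colours = [studied[item] for item in recombined]
    new_colours = old_colours[:]
    # Assumption: reshuffle until no recombined item keeps its own colour,
    # so every "recombined" trial really is a new pairing.
    while any(old == new for old, new in zip(old_colours, new_colours)):
        rng.shuffle(new_colours)

    trials = [(item, studied[item], "intact") for item in intact]
    trials += [(item, colour, "recombined")
               for item, colour in zip(recombined, new_colours)]
    rng.shuffle(trials)
    return trials


# Placeholder study array (hypothetical labels): 22 uniquely coloured items.
studied = {f"item_{i:02d}": f"colour_{i:02d}" for i in range(22)}
trials = build_item_colour_test(studied, random.Random(1))
```

Because recombined colours are drawn only from colours that were themselves studied, a participant cannot succeed by recognising items or colours alone; only memory for the pairing distinguishes the two trial types. An Item-Location test could be built the same way by substituting grid cells for colours.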

Results and Discussion

Analysis of false alarm rates revealed no significant group differences (max t = 1.2, df = 27, min p = .24). Corrected recognition rates (hits-false alarms) are set out in Table 4. A 2 (Group) × 2 (Experimental Condition) mixed ANOVA revealed no significant effect for Experimental Condition (F (1,27) = 1.90, n.s.; effect size r = .26). There was a significant main effect of Group (F (1,27) = 13.01, p < .001; effect size r = .57) but no interaction between the factors (F (1,27) = 0.48, ns; effect size r = .13). Thus, the ASD group performed significantly worse than the comparison group on both of the combination conditions.
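The text does not state how the effect sizes r were obtained. The values reported for this ANOVA are consistent with the common conversion r = sqrt(F / (F + df_error)), which is strictly defined for effects with one numerator degree of freedom. The sketch below applies that conversion to the three F ratios above; treating it as the formula actually used is an assumption, and the function name `effect_size_r` is introduced here for illustration.

```python
import math


def effect_size_r(f_value, df_error):
    """Convert an F ratio to the effect size r = sqrt(F / (F + df_error))."""
    if f_value < 0 or df_error <= 0:
        raise ValueError("F must be non-negative and df_error positive")
    return math.sqrt(f_value / (f_value + df_error))


# F ratios from the 2 x 2 mixed ANOVA of Experiment 2 (error df = 27).
reported = {"Experimental Condition": 1.90, "Group": 13.01, "Interaction": 0.48}
for effect, f_value in reported.items():
    print(f"{effect}: r = {effect_size_r(f_value, 27):.2f}")
```

Under this assumption the computed values round to .26, .57 and .13, matching the effect sizes given in the text.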

Table 4 Mean and standard deviations of corrected recognition rates for the item-colour and item-location conditions of experiment 2 as a function of group

In addition, because Chalfonte and Johnson (1996) reported an association between binding and frontal lobe function, we repeated the analysis using trials 1 and 2 of the CTT as covariates. We also included VIQ and PIQ as covariates because, as can be seen in Table 5, these correlated significantly with performance on the relational memory task. The resulting 2 (Group) × 2 (Experimental Condition) mixed ANCOVA revealed no significant effect of Experimental Condition and no significant interactions between Condition and any of the covariates (maximum F = 2.30, minimum p = .14, maximum effect size r = .30). There was a main effect of Group (F (1,23) = 12.77, p < .001; effect size r = .60) but no interaction between the experimental factors (F (1,23) = 0.07, ns; effect size r = .05). Thus, with executive function and IQ controlled, the ASD group still performed significantly worse than the comparison group on both of the combination conditions.

Table 5 Pearson product-moment correlations between corrected recognition scores and verbal IQ (VIQ), performance IQ (PIQ) and trials 1 and 2 of the color trails test (CT1 and CT2)

Finally, to compare performance across the two experiments for the 19 participants who had taken part in both studies, we first compared the VIQ and PIQ scores and chronological ages of the two groups, which revealed no significant differences (maximum t = 0.76, minimum p = .51). The single-element data from Experiment 1 were combined into an average element score, which, together with the Item-Colour and Item-Location data from Experiment 2, was entered into a 2 (Group) × 3 (Item-type) repeated measures ANOVA. This revealed significant main effects of Group (F (1,17) = 5.64, p < .03, effect size r = .50) and Item-type (F (2,16) = 69.59, p < .001, effect size r = .90) and a significant Group × Item-type interaction (F (2,16) = 6.44, p < .001, effect size r = .54). Inspection of Fig. 4 reveals that the data confirm the earlier analyses by showing that although both groups show better recognition of single than multiple-attribute stimuli, the magnitude of the difference is greater in the ASD than in the typical group.

Fig. 4 Corrected hit-rates for item-colour and item-location combinations from Experiment 2 and mean single-element scores from Experiment 1

The overall findings of Experiment 2 show that the ASD participants experienced difficulty in recognising previously studied combinations of features even when their recognition memory for the constituent elements of the combinations was relatively undiminished and when individual differences in intellectual ability and executive functioning were statistically controlled.

General Discussion

Taken together, the findings of the two experiments support the hypothesis that individuals with ASD experience difficulty with recognition memory for episodically defined combinations of features of visual stimuli despite showing little evidence of reduced recognition of the features separately. As such, the findings replicate with individuals on the autism spectrum those of Chalfonte and Johnson’s (1996) healthy ageing participants. The findings have a number of implications for our understanding of memory in ASD. They are consistent with the view that the ASD-related episodic memory difficulties documented by Bowler et al. (2000, 2007) and by Lind and Bowler (2010) are likely to result in part from diminished relational binding, i.e. a difficulty in holding together in memory the separable features that define a particular episode. It is clear that such a difficulty would adversely affect the reconstruction of the spatio-temporal context that is characteristic of the episodic experience (Tulving 2001) as well as the associated difficulty with scene construction documented by Lind et al. (2014). The present findings also help to refine an account developed by Minshew, Williams and colleagues (see Williams et al. 2008). Our demonstration of difficulties with episodic binding of elements of experience provides a potential operationalization of the problem of dealing with complex memory that Minshew and colleagues identify as characteristic of individuals with ASD, but which as yet has not been fully spelled out. Further studies could explore the relation between the measure of relational binding used here and measures argued by Minshew and colleagues as involving complex memory processes.

By showing diminished recognition of element combinations alongside intact recognition of the elements themselves, the findings of Experiment 2 also place an important constraint on the TSH (Bowler et al. 2004). As pointed out earlier, the TSH predicts undiminished performance in the two experiments because they both utilise a procedure (recognition) that provides a high degree of support at test and should thus be relatively easy for the ASD participants. Yet the requirement to bind disparate elements together in memory, even under conditions of task support, increases memory demands sufficiently to compromise performance in ASD. The present findings also extend the ageing analogy of memory in ASD (Bowler 2007). Although this analogy was intended simply as a heuristic for the development of experimental paradigms for the further study of the neuropsychology of memory in ASD, it also has implications for how we might address the as-yet under-researched area of cognitive ageing in this population (see Happé and Charlton 2012; Mukaetova-Ladinska et al. 2012).

The present findings prompt several further avenues of research. The first is the extent to which the phenomenon of diminished relational binding is associated with the broader clinical manifestation of ASD as well as with difficulties experienced by individuals on the spectrum in their everyday lives. Second, the parallel between patterns of memory difficulties found in ASD and in healthy ageing prompts further investigation of the effectiveness of interventions on relational memory performance, and of whether such interventions have knock-on effects on profiles of wider adaptive or cognitive functioning. A third strand of investigation could explore possible neural and neuropsychological underpinnings of impaired relational memory. The capacity for relational binding is widely agreed to involve the medial temporal lobe and in particular the hippocampus (Brown and Aggleton 2001), which encodes objects, events and relations among them rapidly and in a way that allows the adaptive use of encoded information in different settings (Squire 2004). These structures have long been suspected to play a role in the development of ASD (e.g. Damasio and Maurer 1978), and structural hippocampal abnormalities have been identified in post mortem (Bauman and Kemper 1985) and imaging studies of morphology (Nicolson et al. 2006) in individuals with ASD.

It has been known for some time that although individuals with ASD have little difficulty in recognition memory, this intact performance can diminish in certain specific circumstances. The two studies reported here confirm previous observations of intact recognition memory for single-element stimuli and further demonstrate that when studied materials comprise episodically defined, arbitrary combinations of multiple elements of experience, these individuals’ recognition memory is severely compromised. The potential implications of this finding, both clinically and in terms of understanding the brain mechanisms underlying memory, although as yet unexplored, are of considerable importance.