Introduction

Memory underpins all learning (DeLong 2003). Accordingly any impairment of memory early in life may result in substantial difficulties in terms of how and what is learned across the remainder of the lifespan (Baron-Cohen 1995) and thus is likely to shape the ways an individual will understand and relate to others in their daily lives (Bayley and Squire 2002).

A wealth of research shows that individuals with autism spectrum disorder (ASD) all experience some form of memory anomaly possibly from very early in life (see Boucher and Bowler 2008, 2010). But there is considerable diversity in terms of the type and extent of memory problems which present across the entire autistic spectrum and this needs to be explored (Russo et al. 2005).

Currently we know that people with autism can sometimes present with outstanding memory for specific domains such as mathematical calculations (Pring 2008), music (Hermelin et al. 1987; Miller 1999), calendar calculations (Heavey et al. 1999) or reproducing visually accurate, complex art works (Heaton et al. 2007; Selfe 1977). However, less than 10 % of all individuals diagnosed with autism show signs of such savant skills (Rimland 1978), and there is much debate surrounding the memory systems underpinning consolidation, storage, and retrieval in such cases (Pring et al. 1995).

Aside from savant memory, much of memory in ASD is interpreted through the lens of Tulving’s (1985) distinction between semantic and episodic memory (Bigham et al. 2010). Semantic memory typically deals with factual information and is said to depend upon the medial temporal lobe (MTL) cortices of the brain, most notably the perirhinal cortex (see Aggleton and Brown 2006). Tulving further argued that semantic memory is strongly underpinned by a level of consciousness or awareness that is automatic and thus not open to conscious control (Schacter and Tulving 1994).

Many intelligence scales include subtests assessing the degree of factual information a child ‘knows’. These intelligence scales test areas such as general knowledge or vocabulary, and as such, a wide variety of academic studies contain information about semantic memory in autism (Salmond 2001). Whilst it must be acknowledged that scores on many subtests of intelligence scales reflect educational experience as much as semantic memory systems, research does show that in high functioning autism (HFA) and Asperger syndrome, memory for decontextualized facts appears to be comparable to, if not superior to, that of age and ability matched typically developing cohorts (Minshew et al. 1992; Siegel et al. 1996).

Episodic memory on the other hand, refers to memory for personally experienced events (Tulving 1985). This memory process is said to be reliant upon the hippocampus (Skinner and Fernandes 2007) and involves effortful, consciously controlled levels of awareness (Schacter and Tulving 1994). Measuring episodic memory thus involves assessing what a child remembers (R), as opposed to what a child knows (K), which is assessed via their semantic memory system (Gardiner et al. 1994; Tulving 1985).

To date, assessment of episodic memory has focused on individuals with HFA, and this typically involves tests of recognition (Bennetto et al. 1996), tests of free recall (Boucher and Warrington 1976), or tests of cued recall (Bowler et al. 1997). It would appear that recognition of spoken words is typically unimpaired if not superior in this cohort (Breversdorf et al. 2000; Hillier et al. 2007) as is recognition of lists of written words (Boucher et al. 2005). Individuals with HFA show no difficulties in recognition of pictures of everyday items (Ambery et al. 2006), or that of meaningless shapes (Bigham et al. 2010).

Free recall is generally unimpaired in HFA (Ambery et al. 2006; Bowler et al. 2008; Minshew et al. 1992) but research suggests that individuals with HFA will show superior free recall for lists of items over that of semantically related word lists (Smith et al. 2007).

A number of studies suggest that cued recall and paired associate learning (PAL) are intact in HFA (Ambery et al. 2006; Boucher and Warrington 1976; Minshew and Goldstein 1993). However, research implies that HFA performance is somewhat diminished when the material to be recalled involves family scenes (Williams et al. 2006) and severely diminished for face-to-name tests (Salmond et al. 2005).

Therefore, individuals with HFA demonstrate certain episodic memory difficulties when materials to be recalled are social in their composition, such as the recognition of faces (Boucher et al. 1998, 2000), or the recollection of events or activities the individual with HFA personally participated in (Millward et al. 2000). Memory for the order of items presented at test is frequently impaired in HFA also (Boucher et al. 2007; Poirier et al. 2011).

Accordingly, using Tulving’s (1985) categorisation system, memory in HFA is typically interpreted as a combination of mildly diminished episodic memory with spared semantic memory (Bowler et al. 2000; Ben Shalom 2003; Toichi and Kamio 2003).

Much less is known about memory in individuals with low functioning autism (LFA) or what is termed ‘autistic disorder’ within the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV: American Psychiatric Association 1994). Unlike their HFA counterparts, individuals with LFA frequently present with some form of learning disability (Bolton et al. 1994). Furthermore, 25–50 % of all individuals with LFA never acquire functional speech across the lifespan (Tager-Flusberg 2000). Such severe cognitive deficits suggest that LFA is underscored by significantly graver impairments of memory than those experienced by their HFA counterparts (DeLong 2003).

Recently it has been proposed that two additional and very distinct processes underpin episodic memory, namely recollection and familiarity (Mayes et al. 2007; Joseph et al. 2005; Wixted 2007; Yonelinas 2002). Recollection here refers to the recall of information typically associated with an item from an earlier encounter whilst familiarity relates more to the feeling that an item was encountered before without necessarily recalling any additional information (Bigham et al. 2010).

Evidence from animal studies (Eacott and Easton 2007) and neurophysiological research (Skinner and Fernandes 2007) suggests that the hippocampus region of the brain is critical for recollection whilst familiarity is dependent upon other MTL cortices (Brown and Aggleton 2001). To this effect, it has been suggested that whereas recollection (especially for social stimuli) is somewhat impoverished in HFA leaving familiarity intact, both recollection and familiarity are impaired in LFA (Bigham et al. 2010; Boucher et al. 2008). Indeed using this distinction, impaired familiarity has recently been proposed as the major contributor to language and learning difficulties typically associated with LFA (Bigham et al. 2010: 879).

Assessing recollection and familiarity separately is very difficult however as both processes typically contribute to performance on standard memory tests (Bigham et al. 2010). Research within the last decades strongly suggests that the type of test used is also crucial for assessing each domain separately of each other (Holdstock et al. 2002b). For example, the Complementary Learning Systems (CLS) model of recognition (Norman and O’Reilly 2003) supports the notion that neocortical components facilitate familiarity whilst recollection is dependent upon hippocampal structures in the brain. This model predicts that participant ability to discriminate between recollection and familiarity in a standard yes/no test will rely more on the recall of the target item from study and less on whether the target is familiar to them (Migo et al. 2009). However the model also predicts that in a forced choice corresponding (FCC) version whereby the target is presented with three similar but slightly different foils, the participant must rely on neocortical structures of the brain and the feeling that the target item was seen before and is thus familiar.

Evidence supporting the CLS model of memory stems from research with adult patients who have suffered hippocampal damage leaving them with impaired recollection yet spared levels of familiarity (Migo et al. 2009). Such patients are impaired on yes/no (Y/N) object recognition memory tests but unimpaired when tested via an FCC version of the test (Holdstock et al. 2002a, b). This anomaly may have occurred because Y/N recognition is more heavily dependent upon recollection and the hippocampus than upon familiarity, whilst the reverse appears true of FCC tests (Yonelinas 2002).

Consequently, Bigham et al. (2010) devised a 4-choice forced choice test of familiarity using targets and foils comprised of highly similar but somewhat different abstract shapes. As with Holdstock et al., the hypothesis was that in order to succeed on this test the participant must rely heavily on their sense of familiarity or the feeling that they know the target was an item they had seen before (Bigham et al. 2010). This paradigm was devised as a useful measure to assess familiarity as distinct from recollection in young children and teenagers with LFA (Bigham et al. 2010).

In order to test recollection separately from familiarity, a temporal source memory (TSM) task was developed by Bigham and her colleagues. Source memory involves remembering the contextual elements which occurred around a particular event or item (see Boucher and Bowler 2010). Previously tested in HFA via word lists (Bennetto et al. 1996; Bowler et al. 2004), the TSM task by Bigham and her colleagues was designed specifically to assess recollection of 16 everyday items in teenagers with LFA compared to an ability matched group of typically developing (TD) children and an age and ability matched group of teenagers with developmental delay (DD). Participants had to try to remember which everyday items they had been shown before they saw a tube of sweets and which of those items had been presented after the tube of sweets (Bigham et al. 2010). The findings showed that memory for contextual information in LFA was not only impoverished, but that impaired source memory was specific to LFA as opposed to their TD and DD counterparts.

These findings are very important but based on a single trial they may not be robust and need to be repeated (Robson 2002). To address this issue a series of ten trials was conducted in this study. The intention was to obtain more data, yielding potentially more findings.

In addition, the single trial conducted by Bigham et al. (2010) did not allow for the examination of possible differences in participants’ recall of the order of presentation of items in the testing phase (Craik and Lockhart 1972). Given previous findings of impaired temporal memory in HFA this may be of intrinsic interest and may help explain discrepant performances across groups (Boucher et al. 2007; Poirier et al. 2011). In particular, the current research set out to assess the ability of participants with LFA to recall which item was presented just before and which item was presented just after a target stimulus in a temporal source memory task. Poor performance for items appearing just before the target stimulus may be reflective of impaired long-term memory in LFA (Toichi and Kamio 2003), whilst any deficit in recall of items presented just after the target stimulus may potentially be more reflective of short-term memory impairments (Craik 1971; Poirier et al. 2011).

Finally, in the Bigham et al. study, it may have been the case that the participants with autism recognised the everyday items less well than the TD or the DD groups and that this negatively impacted on their performance. Consequently, Bigham and colleagues recommended that future researchers control for familiarity in any tests of recollection with ASD participants (Bigham et al. 2010: 888). This study therefore aimed to extend the work of Bigham and her colleagues, beginning with an assessment of familiarity.

Study 1: A test of Familiarity in Children and Adolescents with Low Functioning Autism

As noted previously, familiarity can be assessed independently of recollection when participants are shown target stimuli that are abstract or non-meaningful in their design at study and their subsequent recognition is then tested via a 4-choice forced recognition task using foils very similar to the target item (Bigham et al. 2010; Holdstock et al. 2002a, b; Migo et al. 2009). Past research strongly indicates that success on this type of task is heavily dependent on familiarity, or the feeling that the target item has been viewed previously (Holdstock et al. 2002a, b).

A number of studies have demonstrated how individuals with HFA perform as well as ability matched typically developing individuals on this type of test (Bigham et al. 2010; Minshew and Goldstein 2001; Williams et al. 2006). Tests of this kind have not as yet been conducted with individuals with LFA. Therefore, although the prediction is that the LFA group will be impaired relative to two comparison groups this hypothesis has not been tested previously.

Methods

Participants

Three groups took part in this study. The experimental group consisted of thirty-three children and adolescents with autism. These children were attending one of five schools catering specifically for students with autism in the Republic of Ireland. All had been diagnosed with autism spectrum disorder by independent psychiatrists or psychologists (National Educational Psychologists (NEPs): Department of Education and Science 1999), using the DSM-IV (APA 1994) criteria. The children and teens were either non-verbal or minimally verbal (defined here as having ten words/phrases or less).

A group of thirty-three typically developing (TD) children matched with the autism group for non-verbal mental age (NVMA) were recruited from a large national school in the Republic of Ireland. The TD children were selected to be of average ability with no reports of social, emotional or cognitive difficulties.

Twenty-seven children and adolescents with developmental delay (DD) without autism were recruited. The DD group were matched with the autism group for chronological age (CA), verbal and nonverbal ability. The children and teenagers within the DD group attended one of two special schools in the Republic of Ireland and had no diagnosis of autism. Descriptive data for classroom allocation of the participants is shown in Table 1.

Table 1 Participant classroom details

Participant receptive language scores were established using the British Picture Vocabulary Scale (BPVS; Dunn et al. 1997), with non-verbal ability measured using subtests of the Wechsler Abbreviated Scale of Intelligence (WASI; Wechsler 1999). Descriptive data for the three groups are shown in Table 2.

Table 2 Participant details, including between-group similarities and differences on baseline measures

General Procedural Points

All participants were tested in their respective schools in a quiet, familiar room. Testing typically took place over a 2–3 day period. Baseline scores of verbal and non-verbal ability were obtained in strict accordance with procedures for administration of the BPVS and the WASI. In line with ethical guidelines the children and teenagers were at all times accompanied by an adult member of staff (Child First Guidelines 2001).

Materials and Procedure

A pilot/practice phase was conducted to ensure that all participants fully understood what was expected of them (Bigham et al. 2010). To this effect, a set of 4 laminated white A4 cards was prepared with four abstract shapes printed on each in blue ink. One of the abstract shapes was the target shape and the other three were foils. A separate set of 4 laminated A4 cards was also created, each with the target shape printed in blue ink. Each card had a number from 1–4 printed on the back to ensure that the tester presented the cards one at a time in a predetermined order. The procedure used at practice was exactly as outlined within the Test phase. Three practice runs per child were conducted with prompts and verbal praise given to ensure correct responding. Two children in the DD group and three from the ASD group showed mild distress and were excluded from the Test phase. All remaining participants (n = 88) correctly identified at least three shapes on two of the three practice runs.

For the main test a set of sixteen white laminated A4 cards was created, each with four abstract shapes printed on it in black ink (Bigham et al. 2010). One of these four shapes was the target shape whilst the other three shapes were foils. A second set of sixteen white laminated cards was also created. This set comprised target shapes only. The back of each target card and each target-plus-foils card had a number from 1–16 printed on it to facilitate presentation of the cards in a predetermined order at test. An example of a target card and a target-plus-foils card replicated from the Bigham study is shown in Fig. 1.

Fig. 1 An example of the target and target-plus-foils card used in Study 1: A test of Familiarity in LFA children and teenagers

Children and adolescents who progressed from the practice phase were asked to engage a second time, but were told that this time more cards would be used. The participants were reminded to look at each card carefully. They were shown 4 of the 16 target cards one at a time for 3–5 s in a predetermined order (Bigham et al. 2010). Following the presentation of four of the target cards, the children were shown each of the target-plus-foils cards corresponding to the just-shown target cards, placed in an upright position on the table.

The children were asked to point to the shape they had seen before. When the participant had responded the card was removed from the table and the next one was presented. If a child hesitated for more than 5 s they were encouraged to make a choice. This procedure was repeated until all 16 cards were shown.

Each correctly identified target card was awarded one point with zero for an incorrect response. Scores ranged from 0–16.
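The block structure and scoring described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' materials: the function names `make_blocks`, `score_responses` and `chance_score`, and the coding of each response as the position (0–3) of the chosen shape on a card, are assumptions introduced here. The sketch also gives the score expected from blind guessing on a four-alternative test, which is one quarter of the items.

```python
from typing import List, Sequence

N_CARDS = 16        # target cards in the main test (from the text)
BLOCK_SIZE = 4      # targets shown before each recognition phase (from the text)
N_ALTERNATIVES = 4  # one target plus three foils per test card (from the text)


def make_blocks(n_cards: int = N_CARDS, block_size: int = BLOCK_SIZE) -> List[List[int]]:
    """Split card numbers 1..n_cards into consecutive study blocks."""
    cards = list(range(1, n_cards + 1))
    return [cards[i:i + block_size] for i in range(0, n_cards, block_size)]


def score_responses(chosen: Sequence[int], targets: Sequence[int]) -> int:
    """One point per correctly identified target, zero otherwise.

    `chosen` and `targets` hold, per test card, the position of the shape
    the participant pointed to and the position of the true target
    (a hypothetical coding scheme, assumed for illustration).
    """
    if len(chosen) != len(targets):
        raise ValueError("one response is needed per test card")
    return sum(1 for c, t in zip(chosen, targets) if c == t)


def chance_score(n_cards: int = N_CARDS, n_alternatives: int = N_ALTERNATIVES) -> float:
    """Expected score if every response were a blind guess."""
    return n_cards / n_alternatives
```

Under these assumptions `make_blocks()` yields four blocks of four cards, and `chance_score()` evaluates to 4.0 for 16 cards with four alternatives each.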

Results

Mean scores for the three groups on the Test of Familiarity are shown in Table 3. Floor effects (defined here as scoring 8 or less) occurred within the developmental delay (DD) group; however, all three groups scored significantly above chance on this task.

Table 3 Results of Study 1 by group: mean scores, SD’s, ranges, median scores, numbers tested, mean scores for males, mean scores for females, and numbers at ceiling and at floor

A one-way between-groups analysis of variance (ANOVA) showed that the effect of group (ASD/DD/TD) was significant, F (2, 85) = 19.72, p < .0005. The difference in mean scores between the groups was large, with an effect size, calculated using eta squared, of .31. Post hoc comparisons using the Tukey HSD test indicated that the mean score on the Test of Familiarity for participants with autism (m = 11.40, SD = 3.02) differed significantly from those of both the participants with developmental delay (m = 12.8, SD = 3.0) and the young typically developing children (m = 15.2, SD = 1.2).
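For a one-way between-groups ANOVA, eta squared is the between-groups sum of squares divided by the total sum of squares, so it can be recovered from the F ratio and its degrees of freedom. The sketch below is illustrative only: the function name `eta_squared_from_f` is an assumption introduced here, and the result depends on the rounding of the reported F value.

```python
def eta_squared_from_f(f_value: float, df_between: int, df_within: int) -> float:
    """Eta squared for a one-way ANOVA: SS_between / SS_total.

    Uses the identity F = (SS_between / df_between) / (SS_within / df_within),
    which gives eta^2 = F * df_between / (F * df_between + df_within).
    """
    if f_value < 0 or df_between <= 0 or df_within <= 0:
        raise ValueError("F must be non-negative and degrees of freedom positive")
    explained = f_value * df_between
    return explained / (explained + df_within)


# Values reported in the text for the group effect: F(2, 85) = 19.72
eta_sq = eta_squared_from_f(19.72, 2, 85)
print(f"eta squared = {eta_sq:.3f}")
```

With the reported values this comes to approximately .32, close to the reported effect size of .31 once rounding of F is allowed for.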

Next, a one-way ANOVA was conducted to explore the effect of gender on scores of familiarity across the three groups (ASD/DD/TD). There was no statistically significant effect of gender in either the ASD or the TD group. However, the difference in mean scores between males (m = 14.0, SD = 3.0) and females (m = 11.3, SD = 2.4) for the participants with developmental delay was significant at the p < .05 level on the Test of Familiarity: F (1, 23) = 5.5, p = .02. The effect size, calculated using eta squared, was large at 0.1.

The three groups of participants were matched for NVMA scores, and NVMA did not correlate significantly with performance on the Test of Familiarity (r = .11, n = 87, p < .29). Entering NVMA score as a covariate did not affect the overall result.
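The test statistic behind a Pearson correlation can be computed from r and n alone, as t = r·sqrt(n − 2)/sqrt(1 − r²) on n − 2 degrees of freedom. The sketch below is illustrative only: the function name `t_from_r` is an assumption introduced here, and no p value is computed because that would require a t-distribution routine from a statistics library.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    if n <= 2:
        raise ValueError("n must exceed 2")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Values reported in the text: r = .11, n = 87
t_stat = t_from_r(0.11, 87)
print(f"t({87 - 2}) = {t_stat:.2f}")
```

For these values t is about 1.02 on 85 degrees of freedom, which is below the two-tailed .05 critical value of roughly 1.99.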

Comments on Study 1

Study 1 involved conducting a relatively pure test of familiarity with participants diagnosed with low functioning autism, an age and ability matched group of children and teenagers with developmental delay, and an ability matched group of young typically developing school-children. The test involved showing the participants 16 abstract non-significant shapes which were used as target stimuli and a forced choice test consisting of three foils which were very similar to the target stimuli (Bigham et al. 2010; Holdstock et al. 2002a, b; Migo et al. 2009). Past research strongly suggests that to do well on this test the participant must rely heavily on the feeling that they have seen the target stimuli before, as recollection of such meaningless images from study would be far too difficult (Holdstock et al. 2002a, b).

Some ceiling effects (scores of 16) and some floor effects (scores of 8 or less) occurred (see Table 3). However, no floor effects occurred in the ASD group indicating that this entire group understood the task required of them.

It could be argued that the participants with autism performed less well than controls in this test as a function of the chosen stimuli. Williams et al. (2006) have indicated that memory in autism is most impaired when stimuli are complex. Evidence against this argument comes from the fact that all 30 participants with low functioning autism (LFA) performed above chance, with none at floor (scores of 8 or less) and 7 scoring at ceiling (scores of 16). Furthermore, the participants with autism performed only moderately less well than an age and ability matched group of participants with developmental delay.

Significantly, the results imply that not only is familiarity impaired in LFA, something that has not previously been tested via this particular forced choice paradigm, but in addition that the impairment in familiarity is specific to the LFA group as opposed to TD and DD groups without autism. Therefore the results are significant and consistent with predictions made by Bigham et al. (2010:879).

Nevertheless, caution should be exercised when interpreting these results, as the test procedures used here were modified from the original procedures set down by Bigham et al. (2010), in which the performance of HFA children and teens (n = 18; mean mental age 117 months) was compared to that of an ability matched TD cohort (n = 29; mean mental age 110 months). Within the Bigham et al. (2010) study, one child from each group performed at ceiling, leading the researchers to suggest recalibration of the paradigm for future use with participants of that age.

To this effect, a brief pilot of the original recognition test procedures was conducted with the participants with LFA in the present study. The pilot confirmed that the presentation of all 16 stimuli in a single run was overwhelming for the participants with LFA. It was decided therefore to present the target cards in sets of four at a time in a predetermined order at a rate of 1 every 5 s rather than the entire 16 at once. In spite of this recalibration, participants with LFA showed significantly lower levels of familiarity at test compared to those with developmental delay and the young typically developing group.

In relation to the typically developing (TD) cohort, having the 16 targets presented in a series of 1–4 rather than 1–16 possibly over-simplified the task for this group whilst also facilitating easier recall of the target stimuli in TD working memory (Craik and Lockhart 1972). However, given the mean scores of both the DD and ASD participants in this study, any increase in the presentation rate may have negatively impacted upon these two groups thus potentially inflating floor effects.

Gender differences were not reported in the study by Bigham et al. (2010), and in this study there were no significant differences in the mean scores of males and females in either the ASD or the TD group. A large difference in mean scores for males and females was noted in the DD group however with females scoring significantly less well than their male counterparts. This effect merits further research in subsequent studies.

Previous research suggested that the similarity of the foils and target stimuli could be increased or decreased to recalibrate difficulty levels across experimental and control groups (Bigham et al. 2010). However, this alternative may once again disadvantage or advantage participants with learning difficulties and TD groups respectively. As such, it may not be possible to reconfigure this particular test for use with typically developing children and individuals with LFA simultaneously (even if matched for mental age). However, this paradigm remains extremely useful in terms of both assessing familiarity distinct from recollection, and in terms of clarifying the extent of impaired familiarity within LFA individuals relative to comparison groups.

Study 2: A Test of Recollection in Children and Adolescents with Low Functioning Autism

For this part of the study, a temporal source memory (TSM) recall task originally devised by Bigham et al. (2010) was conducted. The main aim was to assess recollection of contextual information in children and teenagers with low functioning autism (LFA), an age and ability matched group of participants with developmental delay, and an ability matched group of young typically developing children. Previous research requiring participants to recall one of two previously presented word lists suggested that recollection is impaired somewhat in individuals with high functioning autism (HFA) (Bennetto et al. 1996). An alternative source memory task involving lists of words also showed source memory impairment in HFA for unassisted recall of previously presented words (Bowler et al. 2004). More recently, Bigham et al. (2010) reported impaired recall of a set of everyday objects in individuals with LFA, but also noted that an omission to explicitly test for impaired familiarity in the sample may have resulted in the participants recognising the objects used at test less well than the comparison groups which in turn may have negatively affected their overall performance.

To this effect, the test of familiarity was carried out first in this study. This test indicated lower overall mean scores of familiarity in the LFA group than in the age and ability matched controls. Given this finding, and with the benefit of findings and recommendations from the Bigham study, the participants with LFA in this study were given the 16 items to be used in the TSM recall task 48 h before test so that they could familiarise themselves with them.

The TSM recall task to be used was a test that also included participant memory for the order in which items were presented. Bennetto et al. (1996) noted impaired performance on a Corsi block task in HFA, whilst Martin et al. (2006) showed that although HFA participants exhibited similar recall scores to controls, the HFA group made many more errors in terms of the order in which items were to be recalled, implying an impaired sense of temporal sequence in HFA.

In line with the above findings, it was predicted that (a) participants with LFA would be impaired relative to both the DD and TD groups on recall of contextual information, and (b) participants with LFA would be impaired in recalling the sequencing of items shown just before and just after a target stimulus.

Methods

Participants

Participants were those used in study 1 (see Table 2).

Materials and Procedure

General procedural points were the same as those used in the first study. Again testing took place over a 2–3 day period in the schools of participating children. However, as Bigham and colleagues previously suggested that the children with autism in their study may not have recognised the everyday objects used as well as the two comparison groups, an attempt was made to control for this in the current study. To this effect, on day one, several boxes, each containing 12 identical everyday items, were allocated to the ASD children’s classrooms. Audio-visual copies of a Classic Sesame Street song called ‘After Me’ (Sesame Street.org. 2010) were also given to teaching staff. Teachers were encouraged to engage the children with the 12 everyday items frequently over the next 2 days. Teaching staff also agreed to incorporate the ‘After Me’ song in class work throughout the 2 day period.

Before administration of the main test a set of three pre-tests was conducted. The first test took approximately 5–8 min to administer and involved ensuring participants’ general understanding of ‘before’ and ‘after’. The second test was devised to ensure participant understanding of a stimulus to be presented just before and just after an object and this also took approximately 5–8 min. Finally, practice in the actual procedure to be used in the main test was given to ensure that all participants fully understood what was expected of them.

Training Test 1: Understanding Before and After

The central aim of this test was to ensure participant understanding of ‘Before’ and ‘After’. A Sesame Street Board Game (Sesame Street.org. 2010) was used whereby a bowl of sand was placed in front of three small toys with a card placed in front of each toy. The first card had ‘Before’ printed on it, the second card had ‘Between’ printed on it and the third card came with the word ‘After’ printed on it (see Fig. 2).

Fig. 2 The labelled and positioned toys

The first step involved the tester pointing to each card clearly saying the words, ‘before, in between, and after.’ The tester then said ‘Look at this toy’ pointing to the toy nearest the bowl of sand ‘this toy will play in the sand BEFORE all the other toys’. The tester then pointed clearly to the last toy saying ‘Look at this toy, this toy will play in the sand AFTER all the other toys.’ Finally the tester pointed to the middle toy saying ‘And this toy is in between’.

Secondly, the toys were placed in the sand one-by-one by the tester as the tester said ‘Look, he got to go BEFORE the others. This one was in between. And look! This one went AFTER all the others.’

The third step saw the tester place the toys back in a row at the bowl making sure they all took different positions in the line. This was to ensure that the participant could see that the labels ‘before’ ‘between’ and ‘after’ did not apply to a specific toy, but rather to a position in the line.

The tester now said ‘Can you show me who will go BEFORE the other toys? Who is in between, and who will go AFTER all the other toys?’ Three practice runs were given, with prompting and support given on the first run if needed to ensure correct responding. Any child who failed on more than one response in each of the practice runs was considered unable to understand the procedure. No child failed this pre-test which took between 5 and 8 min to complete.

Training Test 2: Understanding What Comes Immediately Before and Immediately After an Object

For this pre-test a laminated strip of card with three rectangles drawn on it was placed on the desk in front of the participant. The words BEFORE and AFTER were written in the left and right rectangles with the word BETWEEN in the middle rectangle. The tester pointed to each word saying ‘Look, here are those words again! This is the word before, this is the word after, and this one says the word between’.

A set of plastic fridge magnet numbers 1, 2, 3, 4, 5, 6, 7, 8, and 9 were placed approximately two inches apart along the middle of the desk above the before-and-after strip. The tester proceeded to place the plastic (fridge magnet) number 5 in the centre rectangle.

The tester pointed to the group of numbers 1–4 saying ‘All these numbers come BEFORE the number 5!’ then the tester pointed to the group of numbers 6–9 saying ‘And all these numbers come AFTER the number 5’. The tester then picked up the number 4 and said clearly ‘However, THIS number comes JUST before the number 5’. The tester then placed the number 4 slowly and deliberately on the rectangle of the laminated strip printed with the word BEFORE. The tester proceeded to pick up the number 6 and slowly placed it on the rectangle of the laminated strip printed with the word AFTER, saying ‘And this number comes JUST after the number 5’.

After 5 s looking time, the numbers were removed from the before-and-after strip and returned to their rightful place amongst the numbers 1–9. The tester said ‘Let’s play again with different numbers. Will you help me this time?’

The number 5 was placed back on the centre rectangle of the before-and-after strip and the participant was prompted to pick up the numbers that came just before and just after that number. Verbal praise was given to promote good responding.

Next the tester placed the number 2 on the centre of the before-and-after strip. The tester pointed to the number 1 (slightly separated from the numbers 3–9) saying ‘Oh look, this comes before the number 2’. The tester then pointed to the group of numbers 3–9 saying ‘And all these numbers come after the number 2.’

The tester then looked at the child and asked ‘Can you show me which number comes JUST before the number 2?’ The participant was encouraged to place the correct number (1) on the rectangle of the laminated strip printed with the word BEFORE, but pointing or a clear indication of choice was acceptable. This procedure was repeated for the number (3) that comes just after the number 2. Prompting was given if required.

Two further practice runs were given with no prompting on these occasions. Any child who failed to indicate either the number coming just before or the number to come just after the appointed number in each of the practice runs was deemed not to understand the test and thus excluded from the main test. No child failed this test which took approximately 5–8 min to complete.

Training Test 3: Practice in the Procedure to be Used in the Main Test

Practice in the procedure to be employed in the main test was then conducted to ensure participants’ full understanding of what was expected of them (Bigham et al. 2010). Six of the twelve everyday items (see Appendix 1) to be used in the main test were therefore used, plus a glass paperweight and the laminated ‘before and after’ strip. The procedures used at this juncture were exactly as those outlined below for the main test. Participants were allocated three practice runs, with prompting given on the first run to facilitate errorless learning. When participants made 3 consecutive correct responses it was considered appropriate to continue to the Testing phase. This test took approximately 8–10 min. No participant failed this procedure.

Main Test

The main test consisted of re-introducing participants to the full set of twelve everyday items plus the ‘BEFORE and AFTER’ strip from the pre-test phase. The glass paperweight used at pre-test was replaced with a wooden blue cat for the main test.

At study the tester said ‘Will we play that game again? I have some more things to show you, look’. The twelve items were presented on the table in front of the participants one at a time in a pre-determined order at a rate of 1 every 5 s looking time. Each item was clearly named by the tester before being removed from sight in order to present the next item. This was repeated ten times, with the blue wooden cat presented in a predetermined and sequential manner across each presentation (see Appendix 2).

After each sequenced presentation the participant was shown six of the original twelve items and asked to identify which singular item came just before the blue wooden cat and which singular item came just after the blue wooden cat. Participants were encouraged to place the items onto the before/after strip but pointing, or a clear indication of choice, was accepted. If a participant perseverated (chose either just the before or just the after rectangle three times in a row) the researcher indicated that it was sometimes the other side of the strip (Bigham et al. 2010: 881).

There was no retention interval other than the time allocated to test instructions. Participants scored 1 point for each just BEFORE item correctly identified and 1 point for each just AFTER item correctly identified. Scores thus ranged from 0–10 for ‘Before’ scores and 0–10 for ‘After’ scores, with a total of 0–20 for ‘Total Recollection’ scores (Before scores + After scores).
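The scoring rule described above can be illustrated with a minimal sketch. The function name, the data layout (one pair of chosen items per trial) and the item names in the example are assumptions made purely for illustration; they are not taken from the study materials or from Appendix 2.

```python
from typing import NamedTuple, Sequence, Tuple


class TsmScore(NamedTuple):
    before: int  # 0-10: correct 'just BEFORE' choices
    after: int   # 0-10: correct 'just AFTER' choices
    total: int   # 0-20: 'Total Recollection' (before + after)


def score_tsm(
    responses: Sequence[Tuple[str, str]],
    answer_key: Sequence[Tuple[str, str]],
) -> TsmScore:
    """Score one participant: 1 point per correct BEFORE item and
    1 point per correct AFTER item on each trial."""
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must cover the same trials")
    before = sum(r[0] == k[0] for r, k in zip(responses, answer_key))
    after = sum(r[1] == k[1] for r, k in zip(responses, answer_key))
    return TsmScore(before, after, before + after)


# Hypothetical example: ten trials, BEFORE always correct, AFTER wrong on three.
answer_key = [("comb", "crayon")] * 10
responses = [("comb", "crayon")] * 7 + [("comb", "cup")] * 3
score = score_tsm(responses, answer_key)
print(score)
```

Keeping the Before and After tallies separate, rather than storing only the total, is what allows the later within-group comparison of items presented just before versus just after the blue cat.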

Results

Mean scores for each of the three groups on Study 2 are shown in Table 4. Seventeen participants with ASD scored at chance or below at this test. Despite this, there was a statistically significant difference at the p < .05 level in TSM scores for the three groups: F (2, 89) = 38.14, p < .0005. Post hoc comparisons using the Tukey HSD test indicated that the mean score of the autism group (M = 10.71, SD = 5.00) was significantly lower than those of either the participants with developmental delay (M = 16.22, SD = 2.72) or the typically developing children (M = 17.93, SD = 1.81). The difference in group means was large with the effect size calculated using eta squared at .46.

Table 4 Results of TSM by group: mean scores, SD’s, ranges, median scores, numbers tested, mean scores for males, mean scores for females, and numbers at ceiling and chance

Given the high number of participants with autism scoring at or below chance, a re-analysis of the data with the seventeen at floor removed was conducted. The 15 participants with autism who remained consisted of 12 males and 3 females ranging in age from 74–185 months of age. There was statistical significance at the p < .05 level for the three groups of participants: F (2, 72) = 8.1, p = .0003. Post hoc comparisons using the Tukey HSD test indicated that the mean recollection score for children and teens with autism (M = 15.00, SD = 3.18) remained significantly less than those scores of recall for both the participants with developmental delay (M = 16.22, SD = 2.72) and the young typically developing children (M = 17.93, SD = 1.81). The effect size, calculated using eta squared was large at 0.18. There was no effect for gender or chronological age on scores of recollection in any of the three groups (see Table 4).
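For readers wishing to check the reported effect sizes, eta squared for a one-way ANOVA can be recovered from the F ratio and its degrees of freedom, since eta squared = SS_between / (SS_between + SS_within) = (F × df_between) / (F × df_between + df_within). The sketch below applies this identity to the two F values reported above; the function name is an assumption introduced for illustration, and the calculation is offered as a plausibility check rather than as the authors' own procedure.

```python
def eta_squared_from_f(f_value: float, df_between: int, df_within: int) -> float:
    """Eta squared recovered from a one-way ANOVA F ratio.

    Uses eta^2 = (F * df_between) / (F * df_between + df_within),
    which follows from F = (SS_between / df_between) / (SS_within / df_within).
    """
    weighted_f = f_value * df_between
    return weighted_f / (weighted_f + df_within)


# F values and degrees of freedom as reported in the text.
eta_full = eta_squared_from_f(38.14, 2, 89)     # all participants
eta_reduced = eta_squared_from_f(8.1, 2, 72)    # floor-level scorers removed

print(round(eta_full, 2), round(eta_reduced, 2))
```

Both values round to the figures given in the text (.46 and .18), and both exceed the conventional threshold of .14 for a large effect.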

It was then necessary to consider the impact on participants’ recall of having used six of the twelve Main Test items at the earlier Pre-Test stage. To this effect a series of three paired-samples t tests were conducted (see Table 4). It is apparent that increased familiarity with 6 of the 12 test items enhanced recall for all participants at Main Test. Such enhanced recall would be anticipated in the young typically developing children, and the greater recall scores noted for the DD group indicate that increased practice facilitates learning, which again would be expected for participants with delayed development. Conversely, in spite of access to the entire 12 everyday items for 2 days prior to Main Test, exposure to the ‘After Me’ song from Sesame Street, and encountering 6 of the 12 items at Pre-Test prior to the Main Test, the ASD group still demonstrated the weakest recall of items overall (see Table 5).

Table 5 Participant scores for items encountered at pre-test and main test

Following on from this, it was necessary to statistically analyse which of the three groups benefitted most from seeing the six items at pre-test. Three independent-sample t tests were conducted to this effect. The 15 participants with autism differed significantly (m = 4.60, SD = 1.12) from the 27 participants with developmental delay (m = 5.33, SD = 1.0) t (40) = −2.181, p < .03 (two-tailed). The difference in the means was large at 0.12.

There was no statistical difference between the DD (m = 5.33, SD = 1.0) and the TD group (m = 5.45, SD = .13) t (58) = −.523, p < .20. Indeed the magnitude in difference between the means (0.01) was very small.

There was a statistically significant difference between the 15 children and teens with autism (m = 4.60, SD = 1.12) and the 33 young typically developing children (m = 5.45, SD = .13), t (46) = −3.028, p < .004 (two-tailed). The extent of the difference in the means (mean difference = −.85, 95 % CI: −1.42 to −.286) was large (eta squared = 0.15).

As such, the DD and TD groups benefitted similarly from encountering 6 of the items to be recalled at Main Test at the earlier Pre-Test stage. The ASD and DD groups in this study were matched for chronological age and NVMA whilst the ASD and TD groups were matched for NVMA. These factors significantly correlated with performance on overall scores of recollection, yet entering them as covariates did not affect overall results.

Next it was necessary to explore whether any group differences occurred for participant recall for the item presented just before the cat as opposed to the item presented just after the cat via a paired samples t test.

For the ASD group, recall scores were significantly lower for items presented just before the blue cat (m = 5.8, SD = 3.25) than for those presented just after the blue cat (m = 9.20, SD = 1.26), t (14) = 3.48, p < .004 (two-tailed). The mean decrease in scores for items presented before rather than after the blue cat was 3.40, with a 95 % confidence interval ranging from 1.30 to 5.49. The eta squared statistic (.46) indicated a large effect size.
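The eta squared value reported for this paired-samples comparison can be checked with the commonly used identity eta squared = t² / (t² + df), where df = N − 1 for a paired design. The sketch below is illustrative only; the function name is an assumption, and the inputs are the t value and degrees of freedom reported above.

```python
def eta_squared_from_t(t_value: float, df: int) -> float:
    """Eta squared for a t test: t^2 / (t^2 + df).

    For a paired-samples t test, df is the number of pairs minus one.
    """
    t_squared = t_value ** 2
    return t_squared / (t_squared + df)


# ASD group, before versus after the blue cat: t(14) = 3.48.
eta_asd = eta_squared_from_t(3.48, 14)
print(round(eta_asd, 2))
```

The result rounds to .46, in agreement with the reported statistic. The sign of t is irrelevant here because t is squared, so eta squared always lies between 0 and 1.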

There was minimal difference in the mean recollection scores of the DD group in relation to items presented just before the blue cat (m = 7.66, SD = 2.14) versus items presented just after the blue cat (m = 8.55, SD = 1.98). Likewise, recall of items which appeared just before the blue cat in the TD group (m = 8.93, SD = 1.39) did not vary greatly from recall of items presented just after the blue cat (m = 8.93, SD = 1.32).

In spite of significantly lower mean scores of recollection for the ASD group relative to comparison groups, a somewhat greater mean recall performance (just after the blue cat) was exhibited in the ASD group compared to mean recall performance for items presented just before the cat. Given that these particular participants with autism were exposed to a Sesame Street song called ‘After Me’, it was necessary to investigate the effect of classroom on these scores.

A separate ANOVA was conducted to consider the effect of attending one of five classrooms for just the ASD students (n = 33) on mean scores of recollection. There was no statistically significant difference at the p < .05 level in remembering the 12 everyday items across the five classrooms: F (4, 28) = 1.27, p = .305.

Whilst performing at a somewhat better rate for items which came after the blue cat (m = 7.36, SD = 2.6) the participants with autism showed no enhanced effect of classrooms in relation to DD scores (m = 8.55, SD = 1.98) or that of the TD group (m = 8.93, SD = 1.32) for the same items presented after the cat.

Nonetheless, these results should be interpreted with caution as classroom sizes are highly variable, making Type 1 error impossible to rule out. It must be noted however that despite increased familiarity (Bigham et al. 2010) with both the ‘After Me’ song and the items used at test, the participants from ASD classrooms displayed mean recall scores that are statistically lower than those in a number of the DD and TD classes who had had no such increased familiarity.

Correspondingly, it would appear that recall is significantly impaired in LFA in spite of explicitly controlling for improved familiarity with test items which is consistent with the predictions of Boucher et al. (2008) and Bigham et al. (2010).

Comments on Study 2

Study 2 involved conducting a temporal source memory (TSM) recall task with participants diagnosed with low functioning autism (LFA), an age and ability matched group of children and teenagers with developmental delay, and an ability matched group of young typically developing school-children. Recollection was assessed across a series of ten trials in which participants had to recall which of twelve everyday items had been presented just before a blue wooden cat and which had been presented just after the blue wooden cat.

The overall results of the TSM recall task are interesting, as it appears not only that memory for temporal contextual information is impaired in LFA, but also that this impairment of memory for temporal source is specific to LFA as opposed to individuals with learning difficulties without autism. The results are consistent with previous studies showing impaired recollection in participants with low functioning autism (Bigham et al. 2010).

In line with the results from Study 1 indicating significantly impaired levels of familiarity in children with LFA, plus observations of previous research in this field suggesting participants with autism may recognise everyday objects less well than comparison groups (Bigham et al. 2010: 882), the participants with LFA were allocated use of the 12 test items 2 days before testing.

The teaching staff of this group were also encouraged to play the Sesame Street song ‘After Me’ to the students over this 48 h period. Furthermore, in light of research implying poor generalisation in autism (Wing 1996), six of the items to be used at Main test were used at Pre-Test to facilitate the participants with autism as much as possible.

Despite these steps to enhance familiarisation in the autism group, more than half the participants with LFA (n = 18) scored at or below chance (a score of 10 or less) on this task. Therefore the data were re-analysed with the remaining 15 participants with LFA (see Table 5). The gender of participants showed no effect on the overall recall scores of the three groups. Interestingly, the mean chronological age of the remaining participants with LFA (m = 111 months) was lower than that of the 18 participants with autism (m = 124 months) who scored at or below chance on the TSM task. This might imply that recall memory declined with increased age for LFA participants in this study (Boucher and Warrington 1976; Minshew and Goldstein 1993). In relation to BPVS scores, the mean age of the 18 participants who scored at chance or below on the TSM task, and were thus excluded from the final analysis, was also higher (m = 132.8 months, SD = 36.2) than that of the 15 remaining at test (m = 118.1 months, SD = 38.05), implying that the 18 LFA participants who scored at or below chance on the TSM task understood the instructions but found the task overly difficult. Alternatively, and in line with the work of Klin (1991), the 18 participants with LFA who clearly understood the task but failed to score above chance at main test may have demonstrated an overall lack of attention to the task at hand.

The fact that all 93 participants attended a range of schools and classrooms in this study is another methodological concern when interpreting the results. It could be argued that the teachers in the ASD classrooms may have played the ‘After Me’ song differentially across all five classrooms catering only for students with autism. Equally it could be argued that the ASD classrooms were hugely advantaged over the DD and TD classrooms due to the time the pupils with LFA had with items to be used at test, and the exposure to the song ‘After Me’. In relation to these two issues, it may be wise to allocate the 12 everyday items and copies of the ‘After Me’ song to all participants equitably. However, this may cause ceiling effects in more able participants.

Nonetheless it is noteworthy that performance on the TSM task in this study did not increase in the LFA classrooms as opposed to the DD or TD classes. Nor was there any effect noted across classes teaching students with autism only, implying that enhanced exposure to the ‘After Me’ song and the 12 everyday items to be used at test failed to increase overall scores of recollection for this group.

It was noted within this TSM recall task how participants with LFA demonstrated a significantly better recall for items presented just after the blue cat than those presented just before the cat. This effect was somewhat specific to the LFA group as opposed to the participants with developmental delay and the young typically developing children. To this effect, as participants really only had to concentrate on recalling 2 of 12 items (the item just before and the item just after the blue cat), the DD and TD group may have used verbal mediation during study phase (comb before and crayon after) which contributed to a more even recall of items encountered before and after the cat. Research does suggest that typically developing children are capable of storing visually presented material in memory via verbal mediation from the age of 4 (Algria and Pignot 1979). However less than 30 % of the TD children scored at ceiling in this task, with less than 20 % of the DD group demonstrating scores of 20. Instead the poor recall of items just before the blue cat in the LFA group may reflect an impoverished serial position effect (Boucher 1978, 1981; Hermelin and O’Connor 1970; Toichi and Kamio 2003). Recall of items encountered earlier in test may be considered more indicative of long term memory (Craik 1971) whilst recall of items encountered later in test may just reflect greater use of short term memory (STM) systems.

Further evidence for this argument rests in the fact that the 12 everyday items used in this TSM recall task were all clearly named out by the researcher at test, and research suggests STM is mediated mainly through auditory or phonological components. LTM on the other hand is mediated primarily through semantic components (Baddeley 2002) and, as we have seen, the participants with LFA in this study have shown impairments in areas subserved by short term memory (familiarity and automatic consciousness) but not to the extent that their long term memory (source memory and controlled thinking) appears impaired within the present study.

Alternatively the participants with LFA within this study may have a specific deficit in time-related thinking (Boucher et al. 2007). Pons and Montangero (1999) suggest that time-related thinking is a particular skill independent of other skills typically considered as indicators of intelligence (Bigham et al. 2010). As such, and in line with previous observations by Bigham et al. (2010), the low scores in overall recollection demonstrated by the LFA group in this study may be more reflective of a specific impoverishment in temporal information processing than a specific impairment of recollection.

Finally any argument that the high floor effect in the LFA group resulted from failure to understand the task is unlikely given that all 33 passed the series of pre-tests designed to ensure participant understanding prior to undertaking the Main Test.

Discussion

Memory in autism spectrum disorder has been widely researched over the past decades (see Boucher and Bowler 2008, 2010). In particular, individuals with a diagnosis of Asperger syndrome or high functioning autism (HFA) have received a high level of attention in this regard (Boucher et al. 2010). Whilst behavioural findings on memory in HFA are most commonly interpreted via Tulving’s (1985) categorization of memory systems, of late an additional distinction has been made within the literature regarding the episodic memory system. Joseph et al. (2005) distinguished between recollection (a significant factor in the memory of personally experienced events) and familiarity (a significant factor in the memory of facts and decontextualized items) suggesting that a profile of impaired recollection and intact familiarity characterises memory profiles in HFA.

Regrettably, much less research has been dedicated to the memory of individuals with low functioning autism (LFA). Nevertheless, it has recently been hypothesised that both recollection and familiarity may be impaired in LFA (Boucher et al. 2008, 2010). It is thought that the uneven profile of spared and impaired ability across HFA and LFA would explain the additional learning and language impairments commonly associated with LFA as opposed to individuals with HFA (Bigham et al. 2010).

With this aim, this study tested recollection and familiarity separately in children and adolescents with LFA, an ability matched group of young typically developing school-children, and an age and ability matched group with learning difficulties. It appears that the formats used in both paradigms are unsuitable for simultaneous use with typically developing groups and cognitively challenged groups such as DD and ASD.

In both test formats used here, ceiling and floor effects were unavoidable when attempting to devise methods and procedures that were sensitive to the needs of all groups even when matched for mental ages. However, when employed solely with DD and ASD groups, the results are robust and shed some light on the impairments specific to LFA across memory profiles.

A limitation with this set of findings concerns the omission of error responses in the temporal source memory recall task. In hindsight, it would be of interest to record how many of the DD and ASD participant error responses were in fact correct in as far as the items identified by the individual did come before or after the blue cat, despite being incorrect in as far as these items did not come just before or just after the wooden cat. Such data would potentially contribute to a greater understanding of time based discriminations in memory representations in low functioning autism (Boucher 2001). Future work would also benefit from the measurement and analysis of latency as a factor of interest in both the test of familiarity and the temporal source memory task.

It is also a concern that the stimuli used to assess levels of familiarity in participants in Study 1 and the stimuli used to assess recollection in Study 2 are vastly different. A series of abstract shapes was used to test familiarity in Study 1, and Williams et al. (2006) have stipulated that memory in autism may be further compromised if the stimuli used at test are particularly complex. Conversely, the items used in Study 2 were everyday common objects, which may have inflated participants’ recall if indeed familiarity aids recall, or impaired it if participants experienced interference as a result of encountering the items used at Main Test at Pre-test stage. Future work will incorporate the paradigm recently devised by Migo et al. (2009) combined with the tests of familiarity and recollection used here to investigate the extent to which recollection and familiarity are impaired in LFA as opposed to individuals with HFA.