Communication impairment is one of the core diagnostic criteria for autism, although the extent of the impairment is variable (American Psychiatric Association 1994). Some children may linger in the prelinguistic period of development and a key challenge for these children is how to support them to become more intentional and symbolic communicators (Lord 1997).

Augmentative and alternative communication (AAC) systems have been used successfully to assist children with autism at the prelinguistic stage of development to communicate their needs and wants (Ganz and Simpson 2004). Even in the early years, children can learn to use graphic symbols. One of the most widely adopted AAC approaches to teach the use of graphic symbols to young children with autism is the Picture Exchange Communication System (PECS; Bondy and Frost 1994). Research conducted to date into the efficacy of this approach suggests that individuals with autism who have little or no functional speech can benefit from PECS (Preston and Carter 2009). PECS has its foundations in applied behavior analysis and uses a step-by-step, six element program to develop functional communication and spontaneous initiations. The program consists of teaching requesting, generalizing the skills across communicative partners and environments, discrimination between symbols, and commenting (Bondy and Frost 1994). An important component of PECS is the use of items the child with autism finds interesting. These items are used to motivate requesting behavior which can then be shaped and prompted around a picture exchange.

For some children with autism, it can be difficult to find items and activities that are highly motivating. There has been some research to suggest that music can be a motivating and engaging learning context for children with autism. For example, Stephens (2008) found that imitation of actions and words increased when a music context was used to enhance social engagement. Music has also been found to increase joint attention and engagement when compared to a play context for children with autism (Kim et al. 2008, 2009). Music is a common component of most early childhood programs and can provide a naturally occurring context in which to embed the teaching of communication skills. The use of systematic instruction within naturally occurring routines has been recommended for the development of functional communication for beginning communicators (Keen et al. 2002). In addition to creating an engaging learning context, music is thought to aid language acquisition by increasing arousal and attention, and through the consistent mapping of music to linguistic structures (Schon et al. 2008).

There have been few well-designed studies that have assessed the effectiveness of teaching graphic symbols through music (Accordino et al. 2007). Edgerton (1994) examined the effects of interactive instrument playing on communicative behaviors and found a significant increase in communicative acts and responses during the intervention period for all participants. Kim et al. (2008, 2009) compared the effects of improvisational music therapy and play sessions on joint attention and positive emotional communication. They found increased joint attention during the music therapy sessions compared with play sessions. However, little change was observed in either condition for initiating joint attention behaviors, such as pointing and showing. During the music session, participants recorded a significant effect for “joy”, “emotional synchronicity” and initiation of engagement, and participants were more responsive to the researcher. Buday (1995) compared musical and non-musical cues for facilitating memory for signed words. The number of signs and spoken words correctly imitated was significantly higher when the words were sung than when they were spoken. During the music condition the participants were observed to be more attentive. Kern et al. (2007) embedded songs into the morning routine of two children with autism in order to increase independent functioning. Each song was paired with a visual symbol and was used to greet teachers and peers. Kern et al. reported an increase in independent functioning using the song. An increase in peer interaction with the participants was also recorded.

In general, these music intervention studies provide some evidence of positive outcomes but have not reported generalization and maintenance data. Generalization of skills from the training context to other situations is a significant challenge for children with autism (Lovaas et al. 1973). They appear to display overselectivity whereby the recognition of a symbol, for example, is dependent on cues specific to the training situation. Failure to recognize the symbol in a different context may result when the salient cues of the symbol from the training context are absent (Wilkinson and McIlvane 1997).

The aim of this study was to investigate whether children with autism could learn to receptively label animal symbols incorporated within an interactive song using a PowerPoint presentation and Interactive Whiteboard (IWB) and to generalize the learning to other non-music contexts. The song “Old MacDonald” was chosen to embed animal symbols and both the song and animal symbols were presented using a PowerPoint presentation and IWB. This format enabled reliable and consistent delivery of identical material “without the many idiosyncratic and incidental (and sometimes undetectable) behaviors that accompany adult or peer presentation” (Panyan 1984, p. 378). Given that some children with autism have shown a preference for computer instruction over typical teaching instruction (Moore and Calvert 2000), the use of a computer-based presentation provided an opportunity to avoid researcher bias while possibly enhancing the engagement of participants.

Method

The design of the study was a single-subject multiple baseline across participants (Baer et al. 1968). Prior to the collection of baseline data, the participants were assessed to determine their knowledge of animal symbols (symbol selection) and their ability to follow instructions using an IWB (task competence). Intervention consisted of two phases: intensive teaching of the animal symbols during the song using a single array presentation (Phase A); and teaching of symbols with a three-symbol array (Phase B).

Participants

Three males with a diagnosis of Autism Spectrum Disorder, Uri (4yrs 3mths), Ayrton (4yrs 1mth) and Oscar (3yrs 2mths), were recruited for the study from an Early Childhood Development Program (ECDP). The three participants attended the ECDP 1 day per week. They had all been previously exposed to the use of symbols and were able to make a choice from two symbols. All participant names are pseudonyms.

Ethical clearance was granted by the authors’ university ethics committee. An information pack outlining the purpose and procedure of the study and a consent form requesting the use of the setting to recruit participants and implement the study were sent to the ECDP Principal, who forwarded information packs and consent forms to the children’s parents/guardians. Only children with a verified diagnosis of ASD from a paediatrician were included in the study.

Language assessments were conducted by a speech language pathologist for Ayrton and Oscar using the Preschool Language Scale 4th Edition (PLS) (Zimmerman et al. 2002). Ayrton scored 50 on the Auditory comprehension subtest and 50 on the Expressive communication subtest, resulting in an overall score of 50, which ranked him in the 1st percentile. Oscar scored 59 on the Auditory comprehension subtest and 74 on the Expressive communication subtest, and obtained an overall score of 63. The overall score ranked Oscar in the 1st percentile. Uri was assessed using the Expressive Vocabulary Test (EVT-2) (Williams 2007) and the Peabody Picture Vocabulary Test (PPVT-4) (Dunn and Dunn 2007). Uri scored below the 1st percentile on both tests.

Materials and Settings

The IWB was chosen as the presentation mode as symbol selection using this interface was not reliant on the eye-hand coordination skills required when using a mouse (Huguenin 2004). The interactive song was developed using Microsoft PowerPoint® 2003, colour line symbols, and sound files created by a musician. The line symbols were created using Boardmaker V. 6 (Mayer-Johnson 2007) to which the participants had previous exposure. Particular animal symbols were chosen in terms of diversity of shape, colour, size and unfamiliarity. The animal names differed in terms of beginning sounds and number of syllables.

The traditional song “Old MacDonald” was pre-recorded and embedded into the PowerPoint program. Following a correct response from a participant, a custom animated slide of the animal symbol was displayed. The slide consisted of a 1m x 1m symbol of the correctly selected animal moving across the screen with an associated pseudo animal sound.

Five animal symbols were selected for intervention. These were presented in three-symbol arrays during trials so that the probability of a child randomly selecting a correct symbol during a trial was 33%. Each animal symbol was named in the song three times per session, thereby reducing the probability of a child randomly selecting the correct symbol on all three occasions to 3.7%. The use of five animal symbols generated 60 possible array presentations. A program developer used the C# programming language (Microsoft .NET 2008) to generate random displays of three animal symbols selected from five possibilities, ensuring that each animal was named in the song three times during a session. A total of 12 PowerPoint music sessions and 10 PowerPoint no-music sessions were created with different array configurations. Random allocation was used to avoid positional bias selection (Duker et al. 2004).
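The figures in the preceding paragraph can be checked with a short calculation, and the array generation can be illustrated with a sketch. The code below is written in Python rather than the C# used in the study, and it is not the authors' program: the function name build_session, the fixed random seed, and the way distractors are drawn are assumptions made purely for illustration.

```python
import random
from math import perm

ANIMALS = ["emu", "goat", "kookaburra", "mosquito", "lizard"]
ARRAY_SIZE = 3          # symbols shown per slide
TRIALS_PER_ANIMAL = 3   # times each animal is named per session

# Chance of touching the target in one three-symbol array (about 33%).
p_single = 1 / ARRAY_SIZE
# Chance of doing so on all three trials for one animal: 1/27 (about 3.7%).
p_all_three = p_single ** TRIALS_PER_ANIMAL
# Ordered arrays of three distinct symbols drawn from five: 5 * 4 * 3 = 60.
n_arrays = perm(len(ANIMALS), ARRAY_SIZE)


def build_session(rng):
    """Hypothetical generator: 15 (target, array) trials, each animal named 3 times."""
    targets = ANIMALS * TRIALS_PER_ANIMAL
    rng.shuffle(targets)
    trials = []
    for target in targets:
        others = [a for a in ANIMALS if a != target]
        array = rng.sample(others, ARRAY_SIZE - 1) + [target]
        rng.shuffle(array)  # randomize target position to limit positional bias
        trials.append((target, tuple(array)))
    return trials


# The seed is arbitrary; it only makes the illustration repeatable.
session = build_session(random.Random(0))
```

Each call to build_session with a differently seeded generator would yield another session layout, which is one plausible way the separate music and no-music session files could have been given different array configurations.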

One-to-one training sessions of 5–10 min duration were conducted at the child’s ECDP in a 4x6m room which normally functioned as a computer room, storeroom and staffroom. The first author sat at the computer and the participant was provided a chair in front of the IWB. The participant was required to use an IWB cursor to activate the whiteboard. The array symbols were presented at the participant’s eye level. The use of the IWB projected the array symbols to a size of 40cm square.

Procedure

Selection of Animal Symbols

Screening was conducted with individual participants to identify unfamiliar or novel animal symbols. With the participant sitting at a table in his classroom, the researcher placed three colour symbols simultaneously in front of him on the table. The researcher then instructed the participant to “touch the (animal name)”. The three symbols were then removed and three new animal symbols were presented. The symbol name, position and participant’s response were recorded following each presentation. Each animal symbol was presented on three occasions. The animal symbol was classified as unfamiliar or novel if the participant recorded an incorrect response or made no response on two or more presentations. The five symbols finally selected for the study using this procedure were: emu, goat, kookaburra, mosquito, and lizard.
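The screening rule described above can be written as a small function. This is a minimal sketch under stated assumptions: the function name is_novel and the response labels "correct", "incorrect" and "none" are hypothetical and do not come from the study's recording sheets.

```python
def is_novel(responses):
    """Classify a symbol from its three screening presentations.

    responses: three labels, each "correct", "incorrect" or "none"
    (hypothetical coding). The symbol counts as unfamiliar or novel when
    two or more presentations were answered incorrectly or not at all.
    """
    misses = sum(1 for r in responses if r != "correct")
    return misses >= 2


# Example: one correct touch out of three still counts as novel.
example = is_novel(["incorrect", "none", "correct"])
```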

Task Competence

Using an IWB, participants were individually tested to determine their competency in following the verbal prompt “Touch the (animal name)”. This was to ensure that any incorrect responses during intervention were not related to lack of competence in executing the task. Familiar animal symbols that participants had expressively identified during the screening described above were used for this assessment. A participant was considered competent when he touched the screen in response to the instruction to do so on three consecutive occasions with sufficient pressure to activate the screen. All three participants demonstrated task competence according to these criteria.

Baseline

Baseline consisted of five sessions. Each session employed 15 PowerPoint slides that ensured the five animal symbols were presented three times in a randomized display. Each slide contained three graphic symbols simultaneously presented in a horizontal array on an IWB. A pre-recorded voice embedded into the PowerPoint provided the customized instruction, “(child’s name) touch the (animal name)”. Touching any of the symbols with the IWB cursor activated the presentation of the next slide. The position of the symbols and the response to the verbal instruction were recorded. No feedback was provided during the procedure. Following five baseline sessions, Uri commenced intervention and baseline probes continued for Ayrton and Oscar until they commenced intervention according to the multiple baseline design.

Intervention Phase

Intervention was similar in design to baseline with the addition of music. Intervention consisted of two phases. Phase A involved five teaching sessions designed to develop an association between the animal name and animal symbol using a single array presentation. Phase B involved presentation of the symbol in a 3-symbol array. Phases A and B differed only in the number of symbols presented in the array. This method was based on the procedures used in PECS whereby learning of individual symbols (Phase A) occurred prior to the introduction of a discrimination task (Phase B). This procedure was considered important to symbol acquisition, particularly given the task difficulty of learning five new animal symbols simultaneously. The format of the slide presentation for the PowerPoint music intervention is outlined in Table 1.

Table 1 Format of interactive presentation

A response latency of 20 sec was allowed prior to prompting for a response. Although this is a longer interval than generally applied (Duker et al. 2004), it was required to allow the participant time to approach the screen and make a selection. The screen remained static if the participant made an incorrect response, defined as selection of an incorrect symbol or failure to respond within 20 sec. Errorless learning, whereby a prompt is provided before the participant has the opportunity to make a mistake, is recommended for effective teaching of symbols (Sigafoos et al. 1996). As the study required the identification of correct and incorrect responses, this was not feasible. However, following the recording of an incorrect response, an error correction procedure employing “least-to-most” prompts was implemented (Duker et al. 2004). The prompt hierarchy consisted of a verbal plus gestural prompt, whereby the researcher pointed to the correct symbol and provided the animal name; and a verbal plus physical prompt, where the researcher physically guided the participant to perform the task while providing the animal name. Participant responses and level of prompts were recorded at each session. The criterion for acquisition of an animal symbol was three correct responses per session over three consecutive sessions.
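The acquisition criterion stated above (three correct responses per session over three consecutive sessions) can be expressed as a short check on a per-symbol record of correct responses. The following is an illustrative sketch only: the function name sessions_to_criterion, its parameters, and the example sequences are hypothetical and are not data from the study.

```python
def sessions_to_criterion(correct_per_session, required=3, run_length=3):
    """Return the 1-based session at which criterion is first met, else None.

    correct_per_session: unprompted correct responses for one symbol in
    each session (0 to 3). Criterion is `required` correct responses in
    each of `run_length` consecutive sessions.
    """
    run = 0
    for session_number, n_correct in enumerate(correct_per_session, start=1):
        run = run + 1 if n_correct >= required else 0
        if run == run_length:
            return session_number
    return None


# Made-up sequences for illustration.
met_at = sessions_to_criterion([1, 3, 3, 3])      # criterion met at session 4
not_met = sessions_to_criterion([3, 3, 2, 3, 3])  # run broken, never met
```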

In addition to the PowerPoint music sessions, the baseline condition was extended into the intervention phase to collect data on a PowerPoint no-music presentation. PowerPoint no-music sessions were conducted at the end of every third PowerPoint music session. Three PowerPoint music sessions and one PowerPoint no-music session were implemented 1 day per week when the participants attended their ECDP. The duration of the intervention for the participants was determined by school holidays and study constraints which limited the number of intervention sessions that could be offered.

Generalization and Maintenance

At the end of the intervention period, generalization to a no-song, no-IWB condition was assessed. Procedures were similar to those used during baseline but with coloured Boardmaker symbols presented as laminated symbols. Each symbol was presented a total of three times within a 3-symbol array. The placement of the symbols in the array was randomized and the first author requested the child to touch one of the animal symbols which were placed on a table in front of the child. At the conclusion of the intervention period no further training was provided for a period of 3 weeks. Maintenance data were recorded for each participant using the procedures for the PowerPoint music and PowerPoint no-music conditions.

Reliability

The use of the computer allows for infinite presentations without degradation of fidelity (Mineo et al. 2009). This provided an inbuilt fidelity of the training implementation. Activation of the animal movement and sound slide was dependent on the participants’ selection of the correct symbol. This provided validation of the observer’s record of correct response and therefore separate interrater reliability data were not collected.

Results

Figure 1 shows the number of correct (unprompted) responses during baseline, intervention, generalization and maintenance conditions. There was an increase in the number of correct responses during the PowerPoint music intervention for all three participants with Uri reaching criteria for all five animal symbols and Ayrton and Oscar reaching criteria on three animal symbols. During baseline, Uri had a mean of 1.8 (13%) correct responses rising to 15 (100%) correct responses in the final three intervention sessions. For Ayrton, a mean of 2.7 (18%) correct responses during baseline increased to a mean of 13.3 (88.7%) correct responses for the last three intervention sessions. Oscar’s baseline showed a mean correct response rate of 2.1 (14%) and a mean correct response rate in his final three sessions of 12.7 (84.7%).

Fig. 1 Number of correct responses for each participant

Correct response rates for the PowerPoint no-music sessions varied across participants. For Uri, correct responses for this condition were similar to his performance in the PowerPoint music condition and he achieved 100% correct responses in the last two sessions, reaching criteria for all five animal symbols. Correct responding for Ayrton and Oscar in the no-music condition was lower than in the music condition. The mean number of correct responses in the no-music condition for Ayrton was 6.2 (41%) while the mean number of correct responses in the music condition was 8.6 (55.1%). Ayrton reached criteria for only one animal symbol in the no-music condition compared to three animal symbols in the music condition. The mean number of correct responses in the no-music condition for Oscar was 5 (33.3%) while the mean number of correct responses in the music condition was 11.1 (73.7%). Oscar failed to reach criteria for any of the animal symbols in the no-music condition compared to reaching criteria for three animal symbols in the music condition.

Generalization probes were conducted at the completion of the intervention. Uri had 10 correct responses during the probe session with 100% correct responses for the emu and kookaburra symbols. Ayrton had six correct responses during the generalization probe session while Oscar had two correct responses.

Results of maintenance probes conducted 3 weeks post-intervention are also shown in Fig. 1. Uri had 100% correct responses in both conditions. Ayrton had 100% correct responses during the music condition and 33.3% correct responses during the no-music condition. Oscar’s correct response rate was 93.3% in the music condition compared to 33.3% in the no-music condition.

The rate and order of symbol acquisition during the music condition are displayed in Table 2. Uri reached criteria for his first animal symbol after three sessions, while Ayrton and Oscar required six sessions. While the rate and order varied across participants, “goat” was the initial animal symbol acquired by all participants. Mosquito was the last symbol acquired by Uri, while Ayrton and Oscar failed to reach criteria on this animal symbol by the end of the study.

Table 2 Number of sessions required to reach criteria in music condition

Discussion

The results of this study indicate that three young children with an Autism Spectrum Disorder were able to learn symbols through an interactive song presented using PowerPoint slides and an IWB. In addition, one participant, Uri, was able to receptively label the animal symbols when presented using PowerPoint and an IWB but no music. Uri also demonstrated some generalization to another context which did not involve music or a computer-based presentation. At follow-up, receptive labelling of symbols was maintained for all three participants. This study adds to the research literature by demonstrating that the use of an interactive song presented using PowerPoint and an IWB can facilitate receptive labelling in young children with an Autism Spectrum Disorder.

For two of the participants, the use of music led to a higher rate of correct responding than when music was absent. During the PowerPoint and music condition the participants were exposed to a song, changes in screen presentation and repetitive presentations of the sequence. These characteristics have been found to be appealing to typically developing toddlers (Ellis and Blashki 2004) and to aid in memory recall (Ricks and Wing 1975). It appears that, for two participants, elements of the music condition may have aided symbol acquisition. There were, however, a number of differences between the music and no-music conditions that may have contributed to the difference in response rates for these two participants.

First, the number of sessions in each condition differed, with approximately three times as many music as no-music sessions. Due to time constraints and the availability of participants only 1 day a week, it was not feasible to offer the no-music condition at the same frequency as the music condition. The difference in response rates for Ayrton and Oscar may therefore reflect the more frequent opportunities offered in the music compared to the no-music condition. However, it should be noted that Uri’s response rates were similar in both conditions, suggesting that the frequency of exposure to the different conditions did not influence his acquisition of symbols.

Second, in the music condition, a correct response was immediately followed by the naming of the selected animal in the song and movement of the animal symbol across the screen paired with an animal sound. In the no-music condition, correct selection was immediately followed by a new array of symbols. The movement and animal sound may have acted as a reinforcer for the children, subsequently increasing the rate of responding. The naming of the animal following selection may also have supported symbol acquisition by modelling language immediately following each symbol selection. Such modelling has been found to positively impact on speech acquisition (Romski and Sevcik 1996).

While all the participants reached criteria on at least three animal symbols, the rate and order of acquisition varied. The first symbol acquired by all participants was goat. Kookaburra was the second symbol acquired by Uri and the third acquired by Oscar. Mosquito was the last symbol to be acquired by Uri and was not acquired by either of the other participants. It would seem unlikely given the order of acquisition that the number of syllables was a factor, as this would suggest kookaburra with four syllables would be the most difficult animal symbol for the participants to acquire. In the absence of a larger sample size and symbol array, it is difficult to determine what the salient features of these animals may be in terms of symbol acquisition. Factors to consider may include prior experience and exposure to the animal, animal properties, the sound the animal makes and individual child preference.

Once a symbol had been acquired, this may have affected future symbol acquisition. Once Ayrton had acquired the initial symbol “goat”, he then proceeded to select goat each time it was presented in an array irrespective of the symbol requested. This tendency was also observed to a lesser extent in Uri. According to Gershkoff-Stowe et al. (2006) “lingering activation from previous retrievals influences the competition for lexical selection by enhancing the activation levels of some competitors over others” (p. 476). In their study with typically developing children, Gershkoff-Stowe et al. observed that the overgeneralization behavior was more apparent in younger children (2 yrs old) and less present in older children (4 yrs old). This may have been a factor that influenced symbol acquisition for these children.

Individuals with autism experience difficulty in generalizing their learning from one context to another and comprehension is often limited to highly familiar contexts (Lovaas et al. 1973). At times, a child with autism may be responding to environmental cues rather than speech and it is common for receptive language to lag behind expressive language (Noens and Van Berckelaer-Onnes 2004). The generalization context in this study paralleled more typical AAC presentations in that laminated symbols of the animals were placed on a table in front of the participant. This context differed from the intervention contexts in that music was absent and the IWB was not used. Generalization was limited with only one participant (Uri) correctly labelling two of the five symbols. For Uri, responses in the generalization probe were much lower than in the PowerPoint music and PowerPoint no-music conditions in which he performed equally well. The lack of generalization to picture symbols presented alone suggests that he was reliant on cues specific to the song and/or PowerPoint/IWB to produce correct responses.

For Ayrton and Oscar, there was a failure to generalize and their performance in the generalization probe was similar to their performance during the PowerPoint no-music condition. The learning of symbols for these children appeared to be particularly contextualised to the use of PowerPoint slides, an IWB and music. Replication of this study with a condition where symbols are presented with music but no PowerPoint slides or IWB would help to determine the extent to which the PowerPoint slides and IWB played a role in performance or whether it was the music alone that was the salient variable for these children. The failure to generalize, however, has important implications for the use of music as a medium for teaching communication skills to children with autism who are early communicators. Interventions need to incorporate strategies that promote generalization and maintenance (Schlosser and Lee 2003). Typically in early childhood settings, learning occurs around a themed unit. This provides the opportunity for teaching skills across a range of exemplars. Extending opportunities to learn symbols that are taught using the PowerPoint music approach to other themed activities within the educational setting would provide multiple exposures across different contexts and communication partners and may aid generalization of communication skills beyond the context of the song/IWB presentation. The use of the interactive song presented using PowerPoint slides and IWB appeared to be an engaging, enjoyable and novel experience for the participants. Providing initial intensive instruction using this approach, which is then reinforced through themed activities across the day, may enhance engagement and motivation in learning and lead to improved generalization.

Care should be taken in generalizing these findings beyond these three children and the following factors should be considered in any future research of this kind. Time constraints played an important part in the research. Two of the participants did not reach criteria on all five symbols by the time intervention ceased. Also, attendance of the participants at the educational setting 1 day per week meant that intervention sessions were delivered in concentrated periods (3 PowerPoint music sessions and 1 PowerPoint no-music session per day) once a week. A more dispersed timeframe for intervention sessions may have enhanced performance and increased the number of symbols acquired. Similarly, performance may have been improved by reducing the number of novel symbols taught simultaneously.

Future research could help to identify the contribution of specific presentation variables on performance by including a condition of music without PowerPoint and IWB and of the ‘song’ without music. However, this study provides preliminary evidence that three children with an Autism Spectrum Disorder displaying low level verbal comprehension skills were able to learn to receptively label symbols by embedding the animal names and symbols in an interactive song.