Imitation is regarded as a critical repertoire in child development (Bekkering et al. 2000; Gleissner et al. 2000; Wohlschlager et al. 2003). In early behavioral analysis of child development (Baer et al. 1967; Gewirtz and Stingle 1968), generalized imitation (GI), or the capability of imitating movements that are not directly trained, came to be regarded as a higher-order class of responding (Baer and Sherman 1964; Catania 2007; Greer and Ross 2008; Hayes et al. 2001; Zentall 2006). In subsequent research, GI emerged as a foundational social-learning developmental cusp (Keohane et al. 2009; Greer and Longano 2010; Greer and Ross 2008; Greer and Speckman 2009; Tomasello 1999). As a result of the developmental evidence about the importance of imitation, generalized imitation is currently regarded as a critical objective in the application of behavior analysis to the education and treatment of children with autism, as well as in early childhood education for typically developing children.

In cognitive psychology, imitation is seen to play a role in the development of children’s perspective-taking. “…The child must learn to use a symbol toward the adult in the same way the adult used it toward her. This is clearly a process of imitative learning in which the child aligns herself with the adult in terms of both the goal and the means for attaining the goal….” (Tomasello 1999, p. 105). Thus, in some developmental theories, instances of non-mirrored imitation, in which a child reverses motor imitation in accordance with the perspective of the person observed rather than the child’s own perspective, are regarded as a milestone of cognitive development. In that view, imitation involves not only the emission of point-to-point topographical similarity but also reversal of visual perspective. That is, when the experimenter faces the participant and raises her right hand, a participant who takes the perspective of the experimenter raises her own right hand; this is a non-mirrored response. On the other hand, if the experimenter demonstrates the same right-hand response and the participant raises her left hand, this is a mirrored response, and the participant is not taking the perspective of the experimenter. A search of the literature revealed little research that directly tested whether adults do indeed emit non-mirrored gestural responses.

Research on topographical similarity has concentrated on attempting to train generalized imitation when it was missing in young children or testing for the presence or absence of GI in typically developing young children (e.g., Erjavec 2002; Horne and Erjavec 2007; Erjavec and Horne 2008; Erjavec et al. 2009; Poulson and Kymissis 1988; Poulson et al. 1991, 2002). Erjavec and colleagues identified limitations in the research on GI. They noted that early studies in behavior analysis used common imitative movements that children were likely to encounter, such as clapping their hands or blowing kisses. They also identified limitations in the training procedures of some prior studies (Erjavec et al. 2009, p. 357) that they controlled for in their program of research.

One of the critical determinants of GI according to the seminal work by Baer and Sherman (1964) was the emission of untrained imitative responses that were not reinforced by the experimenter. According to Baer and colleagues (Baer and Deguchi 1985; Baer et al. 1967), the nature of the consequence that served as the reinforcement for the imitative response determined the presence of GI. In order for GI to be present, the reinforcement must be the correspondence between what is observed and the imitated response. Others have suggested that the emission of unreinforced correspondence is a reinforcement schedule effect resulting from multiple exemplars that reinforced seeing and doing (e.g., Gewirtz and Stingle 1968).

The three key issues in the research involve the following questions: (a) Is the emission of non-mirrored responding central to GI and to the teaching of GI? (b) Current evidence suggests that GI does not emerge from face-to-face training; however, can GI be taught using a mirror when stringent controls over possible prior experiences are in place? (c) What is the source of reinforcement for GI?

Erjavec and colleagues instituted a program of research to address two of these questions. Their studies appear to be the most carefully controlled in the literature to date, and improved on procedures and controls found to be limitations in prior research. They developed and used a collection of movements that provided the most stringent control for instructional history in the literature for what constituted generalized imitation (GI). In contrast to prior studies, they carefully selected responses that did not exist in the participants’ repertoires for their GI assessment. They also took into consideration some possible variables that might affect the participants’ responses. For example, the experimenter smiled as she modeled the actions instead of having a “still face”, because they found that a “still face” may suppress the participants’ responses. Over several studies they found that GI did not emerge after extensive instruction with young, typically developing participants across a range of ages (Horne and Erjavec 2007; Erjavec and Horne 2008; Erjavec et al. 2009). These findings suggested that prior studies reporting GI did not, in fact, demonstrate the emergence of GI because the criterion was not stringent enough to control for commonly and incidentally learned imitative responses. Erjavec et al. also concluded from their findings that correspondence between seeing and doing was not the reinforcer for the responses and that Baer and Sherman’s theory (Baer and Sherman 1964), that correspondence between seeing and doing is the reinforcer, was faulty. Thus, these studies revised our understanding of generalized imitation in regard to whether it is teachable at certain ages to typically developing children. Of course, if this is indeed the case, views on whether it can be taught, or when it should be taught, to children with developmental delays must also be revised. Similarly, they argue that there is no convincing evidence that the reinforcer for GI is correspondence. However, because they were not able to establish GI in their studies, they could not observe possible reinforcers for responses that did not emerge.

Another issue that has been problematic in the research on generalized imitation concerns whether GI requires that participants emit non-mirrored responding showing a perspective-taking repertoire. Erjavec and colleagues noted this as an issue in their early studies and published the data for both non-mirrored and mirrored responding; however, they recorded non-mirrored responses that were topographically similar as correct matching responses but accepted both non-mirrored and mirrored responses in the matching response category in their training.

Although its usefulness is disputed, the mirror has long been considered and used as an indispensable tool in dance training (e.g., Dearborn and Ross 2006; Ehrenberg 2010). Some argue that the mirror could make the learning environment for the dancers so complex that it becomes dysfunctional and potentially impedes the dancer’s learning. However, follow-up results from the studies supported the role of the mirror in facilitating the learning process in the long run (Dearborn and Ross 2006).

There are three issues concerning GI that need to be addressed. (a) Should the emission of non-mirrored responding be part of the criteria for GI? (b) If GI cannot be taught to young children using face-to-face training, might GI be taught using a mirror? (c) What is the likely source of reinforcement for GI? We attempted to address these issues in two experiments.

First, we tested whether a sample of adults demonstrated perspective taking by emitting non-mirrored GI. Unless it can be established whether typically developing adults emit mirrored or non-mirrored responses in GI, the criteria for the components of GI remain speculative. If non-mirrored responding is characteristic, then training or testing of GI should require non-mirrored responding. Findings from this study were to determine whether we should require mirrored or non-mirrored responding in our second experiment, which was devoted to teaching GI.

With regard to the teaching of GI, we posited that the way in which Erjavec and Horne taught their participants might have affected their outcomes. That is, the literature suggests that GI consists of correspondence as a class of responding, not specific behaviors. Some prior studies on the emergence of untaught behavior suggested that the emergence of class responding requires the rotation of multiple exemplars. Greer et al. (2007) found that training that involved rotation of listener and speaker responses across stimuli resulted in naming, while teaching the exact same numbers of trials for separate responses did not. Thus, we posited that: (a) if the responses were taught in a mirror, allowing the children to see the correspondence between their responses and the model, under conditions in which (b) several responses were rotated in the training sessions, the general case of imitative responding might emerge. Finally, if GI emerges from training using a mirror, we can provide a test, to some degree, of the theory that correspondence between seeing and doing is a conditioned reinforcer by providing pre-intervention and post-intervention probes that provide no consequences.

Experiment One: Do Adults Mirror?

Participants

In the first study that tested the presence or absence of non-mirrored responding in GI, we recruited 128 adults from 19 to 56 years old (with a mean of 28 years old). There were 98 females (77 %) and 30 males (23 %). Sixty-one were staff from a private preschool, and the remainder of the adults were undergraduate and graduate students chosen randomly from the Engineering Department of a major university.

Setting

For the 61 adult participants from the preschool, the study was conducted in a classroom in the school. The classroom was about 4.5 × 4 m and had a horseshoe-shaped table in the middle and three smaller rectangular tables placed at three sides of the room. During the study, the researcher and the participant sat in the center of the classroom, directly facing each other, approximately 1-1.5 m apart. For the other participants, who were recruited from the university, the study was conducted in a small office room (3 m × 3 m) in the engineering department. There were two chairs and two tables in the room. The procedures were conducted in the exact same fashion as at the preschool.

The direction given at the onset of the study to each participant was, “In the next few minutes, I want you to imitate what I do after I say, ‘Do this’. Please do not ask me any questions or talk about the study with your coworkers. Thank you!” Each probe session took between 1 and 2 minutes. If the participants attempted to raise any questions before or during the study, the researcher told them that the questions would be answered when the study was completed. None of the participants were aware of the purpose of the experiment prior to the study.

Response Definition and Data Collection

In the present study, we adopted and slightly expanded the actions introduced by Erjavec and Horne in their studies (Horne and Erjavec 2007; Erjavec and Horne 2008; Erjavec et al. 2009). This allowed the results of our investigation to be compared with their results. The dependent variable in this and the following study was the number of correct GI responses emitted by the adults during pre-intervention and post-intervention probe sessions. A total of 26 responses were demonstrated to each participant while he or she sat face to face with the experimenter. After the experimenter demonstrated each movement, the participant was asked to imitate it with one-to-one correspondence within 3 s.

As shown in Table 1, there were 11 same body side (ipsilateral) responses (eight with one hand and three with two hands), and 15 cross-body (contralateral) responses (12 with one hand and three with two hands). No feedback or reinforcement was given to the participants during the probe trials. A mirrored response occurred when the participant emitted a response that had topographical correspondence with the model (e.g., raised hand) but the participant responded with a different hand (e.g., left hand for the experimenter and right hand for the participant) from that of the experimenter within 3 s of the command, “Do this.” A non-mirrored response occurred when the participant emitted a response that had topographical correspondence with the model (e.g., raised hand) by using the same hand (e.g., right hand for the experimenter and right hand for the participant) within 3 s of the command, “Do this.” Using the example of response number 24 in Table 1, “Left hand same ear”, the experimenter first modeled the action by using her left hand to touch her left ear while facing the participant. If the participant responded by using her right hand to touch her right ear, this constituted a mirrored response. If the participant used her left hand to touch her left ear, this constituted a non-mirrored response. A non-mirrored response is considered an example of taking the perspective of the model, where the body orientation determines the hand usage (Erjavec and Horne 2008).

Table 1 Actions Presented to Adults during Experimental Probes

Given that a relatively large number of actions (26) was used in the probe session, it was not clear whether the sequence of these responses would affect the participants’ emission of mirrored or non-mirrored responses, and of left-hand versus right-hand responses. We therefore arranged the 26 movements into four different orders for this experiment (see Table 2).

Table 2 Four Different Versions of Probe Lists Used In Experiment 1

Coding

To avoid cues as to the correctness or incorrectness of the participants’ actions and possible impact on their future responses, we used numeric codes rather than pluses for correct and minuses for incorrect. All responses were coded as Arabic numerals consistent with the procedures used by Erjavec and Horne (Erjavec and Horne 2008). The responses to the modeled actions were classified as mirrored responses (1), non-mirrored responses (2), two-handed responses (3), and not related responses (4). For the six responses that required two-hand movements, the mirrored responses would be identical to the non-mirrored ones. Therefore, they were coded (3) when participants responded correctly. Any irrelevant responses were coded (4).
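
The four-category coding scheme can be expressed as a small decision rule. The following Python sketch is illustrative only: the function name, the argument names, and the treatment of a one-handed model answered with both hands (coded here as not related) are assumptions, not part of the reported procedure.

```python
from enum import IntEnum


class ResponseCode(IntEnum):
    """Numeric codes described in the Coding section."""
    MIRRORED = 1
    NON_MIRRORED = 2
    TWO_HANDED = 3
    NOT_RELATED = 4


def code_response(model_hand, participant_hand, topography_matches):
    """Return the code for one probe trial (hypothetical helper).

    model_hand and participant_hand are "left", "right", or "both".
    topography_matches is True when the participant's movement has
    point-to-point correspondence with the modeled movement.
    """
    if not topography_matches:
        return ResponseCode.NOT_RELATED
    if model_hand == "both":
        # Two-handed actions look the same mirrored or non-mirrored.
        if participant_hand == "both":
            return ResponseCode.TWO_HANDED
        return ResponseCode.NOT_RELATED
    if participant_hand == model_hand:
        # Same anatomical hand as the model: perspective was reversed.
        return ResponseCode.NON_MIRRORED
    if participant_hand in ("left", "right"):
        # Opposite anatomical hand: a mirror image of the model.
        return ResponseCode.MIRRORED
    # Assumption: a one-handed model answered with both hands is not related.
    return ResponseCode.NOT_RELATED
```

For the “Left hand same ear” example, `code_response("left", "right", True)` yields the mirrored code and `code_response("left", "left", True)` yields the non-mirrored code.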

Interobserver Agreement and Interscorer Agreement

Interobserver agreement (IOA) was obtained by two independent observers with extensive experience in research. They were naïve to the purpose of the study and recorded the participants’ responses independently and simultaneously. IOA was obtained for 91 participants (71 % of the participants) during the probe sessions, with a mean agreement of 98 % (range from 96 % to 100 %).

Interscorer agreement (ISA) was obtained by having an independent scorer calculate the point-by-point agreement on the numbers of responses in different categories recorded by observers on the preprinted data collection forms. The independent scorer was naïve to the purposes of the study. ISA was conducted for 100 % of all the participants with a mean agreement of 97 % (range from 95 % to 100 %).
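
As a minimal sketch of how a point-by-point agreement percentage can be computed from two records of the same session, assuming the conventional formula (agreements divided by total trials, times 100); the function name and the example data are hypothetical.

```python
def point_by_point_agreement(record_a, record_b):
    """Percentage of trials on which two records contain the same code.

    record_a and record_b are equal-length sequences of response codes,
    one entry per probe trial, recorded independently.
    """
    if len(record_a) != len(record_b):
        raise ValueError("Both records must cover the same trials.")
    if not record_a:
        raise ValueError("At least one trial is required.")
    agreements = sum(a == b for a, b in zip(record_a, record_b))
    return 100.0 * agreements / len(record_a)


# Hypothetical 26-trial session in which the observers disagree once.
observer_1 = [1] * 13 + [2] * 7 + [3] * 6
observer_2 = [1] * 12 + [2] * 8 + [3] * 6
session_ioa = point_by_point_agreement(observer_1, observer_2)
```

With 26 trials per probe session, a single disagreement gives 25/26, or roughly 96 %.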

Results

There was a significant difference between the numbers of mirrored and non-mirrored responses emitted by the adults, t(127) = 4.6, p < .01. The mean discrepancy between mirrored and non-mirrored responses was 5.5703; that is, the adults tended to emit five or six more mirrored responses than non-mirrored responses for the 26 imitative actions. Adults do not consistently emit non-mirrored responses in a face-to-face setting. As a result, we did not require non-mirrored responding as a criterion for testing children for the presence or absence of GI. We concluded that both mirrored and non-mirrored responses were characteristic of typical adults and that the presence of non-mirrored responses is not a developmental stage that should be considered in determining the accuracy of GI responses by children or in establishing GI.
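
A t statistic with 127 degrees of freedom for 128 participants is consistent with a paired (within-participant) comparison of each adult's mirrored and non-mirrored counts. The sketch below shows that computation under this assumption; the function name and the four-participant data are hypothetical and are not the study data.

```python
import math
import statistics


def paired_t(mirrored_counts, non_mirrored_counts):
    """Return (mean difference, t statistic, degrees of freedom).

    Each list holds one count per participant, in the same order.
    """
    diffs = [m - n for m, n in zip(mirrored_counts, non_mirrored_counts)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Sample standard deviation of the difference scores (n - 1 denominator).
    sd_diff = statistics.stdev(diffs)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, t_stat, n - 1


# Hypothetical counts for four participants (illustration only).
mean_diff, t_stat, df = paired_t([14, 12, 10, 16], [6, 8, 10, 4])
```

The p value would then be read from a t distribution with `df` degrees of freedom, for example with a statistics package.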

A 2 (gender) × 4 (probe list) analysis of variance performed on the discrepancy of mirrored responses revealed a significant main effect of gender, F(1, 120) = 10.883, p < .01. The female participants were more likely to emit mirrored responses than the male participants. A 2 (gender) × 4 (probe list) analysis of variance performed on the discrepancy of hand preference revealed a significant main effect of probe list, F(3, 120) = 6.47, p < .01. Only when Probe List 4 was used did the participants not show any hand preference (mean = 0.16). When the other three probe lists were used, the participants tended to emit more right-hand responses during imitation, especially with Probe List 3.

Discussion

The present results indicate that the failure to differentiate between left and right perspectives is pervasive even among adults during face-to-face imitation. Most adult participants did not visually reverse the actions to adopt the experimenter’s perspective. This result could be due to the inexplicit vocal direction (“Do this”) given by the experimenter at the onset of the task. Nevertheless, in light of the results, the discrimination between left and right during imitation should not be considered a component in the assessment of the accuracy of GI for young children.

It is an interesting side note that the male and female participants differed markedly in their preferences for hand use. Why and how this happens is still unknown. Although earlier research reported sex differences in the human corpus callosum (the brain structure that bridges the two cerebral hemispheres) (DeLacoste-Utamsing and Holloway 1982), later studies with more advanced technology found that the limited difference between the genders could be due to confounding factors, such as age and body weight (e.g., Allen et al. 1991; Bishop and Wahlsten 1997; Holloway and de Lacoste 1986). The neuroanatomical basis is thus controversial and tenuous, and does not come close to supporting the gender difference. An alternative explanation lies in the theory that males have an advantage in mental rotation (Harris 1978). That is, men are more likely than women to perform well in spatial skills. Still other studies found no significant gender difference (Sherman 1978).

Experiment Two

Participants

The participants in this experiment were six preschool aged students, ranging in age from 3 years to 4 years and 4-months. All participants were identified as children with autism spectrum disorders by psychologists from the referring school districts. They all functioned as beginning listeners (i.e., follow simple vocal directions with visual cues) and did not have any vocal verbal or vocal substitutes for speaker behavior (Greer and Ross 2008). Participant M1 was female and the other five participants were males.

According to the C-PIRK (Greer and McCorkle 2009) and the Verbal Behavior Assessment (Greer and Ross 2008), which were conducted prior to the experiment, the following preverbal foundational developmental cusps and capabilities were in the repertoire of all participants except one (M3): teacher presence resulted in instructional control, conditioned reinforcement for observing 3D visual stimuli on the desktop, and generalized match-to-sample for 2D/3D stimuli. The repertoires of moving from vocal imitation to labeling (echoic-to-tact; Greer and Ross 2008) and from vocal imitation to requesting (echoic-to-mand; Greer and Ross 2008) were also present in Participant NM3’s repertoire. Participant M3 responded such that teacher presence resulted in instructional control, 3D object visual stimuli on the desktop functioned as conditioned reinforcement, and adult voices functioned as conditioned reinforcers for observing responses. Table 3 reports the ages and standardized test scores for the six participants in the mirror-trained group and non-mirror trained group.

Table 3 The Standardized Test Scores for the Six Participants in the Mirror-Trained Group and Non-Mirror Trained Group

Setting

Pre-Intervention and Post-Intervention Probes

During the pre-intervention and post-intervention probes, the participant sat on a child-size chair facing the experimenter, while the experimenter sat on a smaller child-size chair so that she was at approximately the same eye level as the young participant. They sat 0.5-1 m away from each other. No other furniture or equipment was necessary.

Training Sessions

During the training sessions for the mirror-trained group, the experimenter sat on a chair behind the participant (slightly to the side so that each participant could see her own motions and the experimenter’s in the mirror) in front of a full-length safety-glass mirror (width 55 cm × height 150 cm). The mirror was placed securely on the floor against the wall during the instructional sessions. For the non-mirror trained group, the set-up was identical to the pre-intervention and post-intervention probes, in which the experimenter sat face to face with the participants.

Design

A combined experimental-control group design, with a nested time-lagged multiple probe design across participants within the mirror-trained group (Greer et al. 2007), was used to test the effectiveness of the mirror instruction. The combination of the two designs allowed control for instructional histories and maturation both within groups and between groups. The experimental group received the mirror training in a staggered multiple probe design and the control group simultaneously received the face-to-face training. In this particular case, the implementation of the group design allowed the comparison of the rate of learning between the two intervention groups under yoked instructional trial conditions (i.e., each individual in the matched pairs received the same number of instructional trials) so that the effects of the mirror could be isolated. It should be noted that the “group” design component was limited to a very small sample.

Each participant in the mirror-trained group was matched with one participant in the non-mirror trained group who had similar repertoires and test scores. They were matched in order to ensure that the participants who received either the face-to-face training or the mirror procedure had comparable repertoires and were close in age. Age is regarded as an important variable because it is a strong predictor of the onset of children’s new repertoires. However, because our participants were developmentally delayed, controlling for test score ages was also considered important. Once matched on repertoires, mental ages, and test scores, the participants were randomly assigned to either the mirror training or the face-to-face training group. The two participants in each matched pair received their respective interventions simultaneously, with one exception. That is, the second member of the pair did not start the intervention until the first member of the pair had completed the first post-intervention probes, in order to control for instructional presentations.

The sequence of the experiment was as follows: 1) pre-experimental probes for GI for all participants to determine if they had GI in their repertoire, 2) imitation sets were presented to the mirror-trained group participants and the non-mirror trained group participants controlling for the amount of instruction (the same numbers of instructional trials), 3) once a participant from either group mastered three sets of imitation responses, instruction ceased for both groups and GI probes were conducted for both, 4) if GI did not emerge for any of the participants, we reinstituted instruction with three new sets of responses, and 5) Steps 3 and 4 were repeated until GI was demonstrated with a paired participant in one of the groups. Figure 1 shows the sequence of the experiment. Both groups simultaneously received multiple probe designs in implementing the two procedures providing another test of the mirror and face-to-face intervention.

Fig. 1 Design Sequence for the Mirror-Trained Group and Non-Mirror Trained Group

Dependent Variable

The dependent variable was the number of correct imitative responses during the probe sessions. The 26 novel imitative responses were the same as the ones used in Experiment 1. During each of the probe sessions, the experimenter faced the participant and demonstrated a block of 26 untaught responses (a single session). None of the responses were consequated during probes. Correct responses were operationally defined as the participant’s emission of a response that matched the model within 3 s of the command and the presentation of the model, regardless of which hand she was using. That is, both mirrored and non-mirrored responses were accepted as correct in this experiment provided the topographies were the same. The criterion that we accepted as evidence of the presence of GI for probe sessions was 90 % accuracy, or better, for one session.
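
The 90 % criterion over a 26-trial probe session translates into a minimum of 24 correct responses (90 % of 26 is 23.4). The short sketch below makes that arithmetic explicit; the constant and function names are hypothetical, and integer arithmetic is used to avoid floating-point rounding at the boundary.

```python
PROBE_TRIALS = 26          # untaught responses per probe session
GI_CRITERION_PERCENT = 90  # accuracy accepted as evidence of GI


def min_correct_for_gi(trials=PROBE_TRIALS, percent=GI_CRITERION_PERCENT):
    """Smallest number of correct responses that reaches the criterion."""
    # Ceiling division on integers: ceil(trials * percent / 100).
    return -(-trials * percent // 100)


def meets_gi_criterion(correct, trials=PROBE_TRIALS,
                       percent=GI_CRITERION_PERCENT):
    """True when correct/trials is at least the criterion percentage."""
    return correct * 100 >= percent * trials
```

Under this reading, a probe session with 24 correct responses (about 92 %) meets the criterion, whereas 23 correct (about 88 %) does not.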

Data Collection

During the probe sessions, the participants’ responses were recorded using the same coding system as in Experiment One. Because the young participants were sometimes non-responsive during the probe sessions, a fifth code (5) was added to the coding categories to distinguish the absence of a response from a non-related response. During the instructional sessions, which involved imitation instruction using actions not found in the pre-intervention and post-intervention probes, correct responses to the target actions were recorded with a plus (+), and incorrect responses were recorded with a minus (-).

Procedure

Habituation

Prior to the onset of the experiment, the experimenter asked the participants’ teachers and other service providers about the participants’ preferred edibles, toys, activities, and songs, as well as things they did not prefer. The experimenter arranged play sessions in order to habituate the children to the experimenter. During these sessions the experimenter played with the participants and conducted non-related instructional programs to familiarize the participants with her, consistent with Rothstein’s (2010) findings about the importance of habituation in conducting experiments with young children.

Pre-Training and Post-Training Generalized Imitation Probes

The pre-intervention probes were conducted prior to the instruction with all six participants to assess if they had GI in their repertoires. The post-intervention probe sessions were conducted in the same manner as the pre-intervention probes after mastery of every three sets of imitation responses. During the probe sessions (block of 26 imitation actions), the researcher and the participant sat on the child-sized chairs, directly facing each other. The researcher obtained the participant’s attention by calling her name or showing her the pre-determined reinforcers determined from the participant’s instructional history. After obtaining the participant’s attention, the researcher delivered the vocal antecedent “Do this” together with a model of the target action. Consistent with findings from the first experiment, both mirrored and non-mirrored responses were accepted as correct. An incorrect response was recorded when the participant failed to demonstrate a response with one to one correspondence or did not respond at all within 3 s. During the probes, no reinforcement or corrections were delivered.

Intervention

During the intervention, three participants received the mirror training (the mirror group) and three received the face-to-face training (the non-mirror group). Three novel actions for each training set were taught to the participants and each training session consisted of blocks of 20 instructional trials. Both mirrored and non-mirrored responses were accepted as correct.

We used instructional trials that met the criteria for learn units (Albers and Greer 1991; Emurian et al. 2000; Greer 1994, 2002; Greer and Ross 2008). Instructional trials were yoked for the matched pairs such that the numbers of instructional trials were the same for the individuals in the mirror and non-mirror instruction. Each learn unit included the following components. The experimenter first obtained the participant’s attention before presenting the vocal command, “Do this.” If the participant was not looking, the experimenter repeated the presentation. Once the experimenter had the participant’s attention, she modeled the target action (in the mirror for the mirror-trained group) and presented the vocal command, “Do this.” The participant was given 3 s to respond. The experimenter provided reinforcement contingent on correct responses. Following an incorrect response, the experimenter demonstrated a correct response and then repeated the direction, “Do this.” If the participant then emitted a correct response, she was not reinforced. If the participant did not emit the correct response in the correction procedure, the experimenter prompted the participant through the target response, again with no reinforcement.

The numbers of learn units were yoked for the matched pairs to rule out amount of instruction as a variable. If a participant from either group mastered the imitation sets before a participant in the other group and the amount of instruction was controlled, the differences were attributable to the use of the mirror and not the amount of instruction. Criterion for intervention sessions was set at 90 % accuracy across two consecutive sessions or 100 % accuracy in one session.
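
The mastery rule for instructional sets (90 % accuracy across two consecutive sessions, or 100 % accuracy in one session) can be checked mechanically from a list of session scores. This is a sketch under the assumption that each session consists of one block of 20 instructional trials; the function name is hypothetical.

```python
def set_mastered(correct_per_session, trials_per_session=20):
    """Return True once the session history meets the mastery criterion.

    correct_per_session lists the number of correct responses in each
    consecutive instructional session, oldest first.
    """
    previous_met_90 = False
    for correct in correct_per_session:
        if correct == trials_per_session:
            return True  # 100 % accuracy in a single session
        met_90 = correct * 100 >= 90 * trials_per_session
        if met_90 and previous_met_90:
            return True  # 90 % or better in two consecutive sessions
        previous_met_90 = met_90
    return False
```

For example, a history of 15, 18, and 18 correct out of 20 meets the criterion on the third session, whereas 18, 15, and 18 does not because the two 90 % sessions are not consecutive.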

Post-intervention probes were conducted when one of the participants in a dyad mastered three instructional sets (each of which included three actions) and a rotation of all nine actions from the three sets. Learn units were yoked in the matched pairs such that, regardless of the training condition to which they were assigned, the participants in each dyad received the same number of learn units. The number of learn units was yoked in order to control for the amount of instruction when determining which intervention, if any, was more effective.

Each participant had different existing repertoires at the time of the study and some of the participants had physical limitations. Therefore, the target responses for the training sets were individualized based on the participants’ then-current repertoires, as well as their instructional history. During the intervention, if the participant responded correctly to a model on the first trial, this action was replaced with another one that was not in the participant’s repertoire. For example, Participant M1 was taught tap chin, touch hair, and touch tummy in Set 1, and Participant NM1 was taught touch head, one hand tap lap, and two hands tap chest in Set 1.

Mirror Trained Group

During the instructional sessions, the experimenter sat behind, and slightly to the side of, the participant in front of a mirror. The participant was given a few minutes to adapt to the mirror. The experimenter then gained the participant’s attention via the mirror and modeled the target action together with the vocal antecedent “Do this.” A correct response occurred when the participant emitted the response with point-to-point correspondence with the model within 3 s. The experimenter delivered vocal praise, gentle physical touches, and preferred edibles for correct responses based on knowledge of each child’s community of reinforcers. If the participant did not respond, or did not emit a response with point-to-point correspondence within 3 s, the trial was counted as incorrect and a correction procedure was provided. In the correction procedure, the participant was required to emit the correct response, whether independently or with least-to-most physical prompts, and no reinforcement was given for the corrected response.

Non-Mirror Trained Group

The non-mirror trained group participants were taught the imitation sets in the standard face-to-face fashion. The experimenter sat on a child-sized chair facing the participant. The imitative actions were taught with the vocal antecedent “Do this.” Correct responses with one-to-one correspondence were followed by reinforcement, and corrections were delivered following incorrect responses. Each child in this group received the same number of learn units, delivered face-to-face, as did her matched peer in the mirror group.

Interobserver Agreement Data

Interobserver agreement (IOA) data were collected during pre-intervention and post-intervention probes for all six participants in the mirror trained and non-mirror trained groups. In advance of the experiment, the researchers trained and calibrated all IOA observers, and all achieved 100 % agreement with the researcher in at least two consecutive sessions before observing videos of the probe sessions. At least one of the trained IOA observers was present in addition to the experimenter during each session. See Table 4 for the IOA scores for each participant in the training and probe sessions.

Table 4 Number of Sessions and Mean Percentage of IOA Collected on Each Participant during Probes and Instructional Sessions
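The text does not state how the IOA percentages were computed. A minimal sketch, assuming the common trial-by-trial method (agreements divided by total trials, multiplied by 100), is given below. The function name and the observer records are hypothetical and are used only for illustration; they are not data from the study.

```python
def trial_by_trial_ioa(observer_a, observer_b):
    """Return the percentage of trials on which two observers agree.

    Assumption: IOA is scored trial by trial; the source does not
    specify the formula that was actually used.
    """
    if len(observer_a) != len(observer_b):
        raise ValueError("Both observers must score the same number of trials")
    if not observer_a:
        raise ValueError("At least one trial is required")
    agreements = sum(a == b for a, b in zip(observer_a, observer_b))
    return 100.0 * agreements / len(observer_a)


# Hypothetical records for one 26-trial probe (True = scored as correct).
experimenter = [True] * 5 + [False] * 21
second_observer = [True] * 4 + [False] * 22

print(f"IOA: {trial_by_trial_ioa(experimenter, second_observer):.1f} %")
```

In this made-up example the observers disagree on a single trial, so agreement is 25 of 26 trials.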

Results

In the first pair, Participant M1 (the mirror participant) emitted one correct response during the initial pre-intervention probe of 26 trials. Her correct responses increased to five in the first post-intervention probe, 13 in the second, 16 in the third, and 24 in each of the last two post-intervention probes (see Fig. 2). Participant NM1 (the matched non-mirror participant) did not have any correct responses during his initial pre-intervention probe. His correct responses during the post-intervention probes were 2, 1, 3, 4, and 4, respectively (see Fig. 3).

Fig. 2

The Number of Correct Responses Emitted by the Three Participants in Mirror-Trained Group (M1, M2, M3) Pre and Post Mastery of Each of the Three Mirror Instructional Sets

Fig. 3

The Number of Correct Responses Emitted by the Three Participants in Non-Mirror Trained Group (NM1, NM2, NM3) Pre and Post Mastery of Each of the Three Instructional Sets

In the second pair, Participant M2 correctly responded to six and three novel actions during the pre-intervention GI probes. After the implementation of the intervention in the mirror, his correct responses increased to 19, 20, 23, and 24, respectively, in the post-intervention probes (see Fig. 2). Participant NM2 did not imitate any physical movements correctly during his initial pre-intervention probes. His correct responses did not increase during the later post-intervention probes (see Fig. 3).

In the third pair, Participant M3 emitted 1 and 0 correct responses during the pre-intervention probe sessions. After being taught imitation sets in the mirror, his correct responses increased to 16 in the first post-intervention probe, 19 in the second, and 23 and 24 in the last two sessions (see Fig. 2). Participant NM3 imitated 1 and 0 physical movements correctly during the initial pre-intervention probes. After the implementation of the intervention, his correct responses increased to 8, 9, 10, and 11 in the post-intervention probe sessions, respectively (see Fig. 3).

It is worth noting that every participant was probed for two consecutive sessions following the intervention. This was done to ensure that GI had emerged in the participants who responded with 88 % accuracy in the first post-intervention probe, which was slightly below 90 %. As a result, all of the remaining participants also received two post-intervention probes.
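The relation between the probe counts and the percentages mentioned above can be checked with a short calculation. The sketch below assumes a 90 % criterion (inferred from the phrase "slightly below 90 %") and the 26-trial probe described earlier; the constant and function names are hypothetical. The 88 % figure is consistent with 23 of 26 correct responses, which is an inference from the reported counts rather than something the text states.

```python
PROBE_TRIALS = 26         # probe length reported in the Results
CRITERION_PERCENT = 90.0  # assumed criterion, inferred from the text


def percent_correct(correct, trials=PROBE_TRIALS):
    """Convert a count of correct probe responses to a percentage."""
    return 100.0 * correct / trials


def meets_criterion(correct, trials=PROBE_TRIALS, criterion=CRITERION_PERCENT):
    """Return True when the probe score reaches the assumed criterion."""
    return percent_correct(correct, trials) >= criterion


for count in (23, 24):
    print(count, round(percent_correct(count), 1), meets_criterion(count))
```

Under these assumptions, 23 correct responses fall just short of the criterion and 24 exceed it, which matches the rationale given for running a second consecutive probe.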

The number of instructional sessions required for the participants from each group to meet criterion on imitation sets is shown in Table 5. The participants from the mirror-trained group mastered all target responses in their instructional sets (Fig. 4). The participants from the non-mirror trained group not only needed more instructional trials to meet criteria during the training sessions, but also typically required various research-based tactics to help them master the training sets (Fig. 5), such as zero second time delay (Shuster et al. 1988; Terrace 1963; Touchette and Howard 1984), graduated physical guidance (Hourcade 1988), and response blocks (Lerman et al. 2003).

Table 5 Number of 20-Learn-Unit Instructional Sessions Required to Meet Criterion on Imitation Sets for All Participants

Fig. 4

The Number of Correct Responses Emitted by Three Participants in Mirror-Trained Group during 20-Learn-Unit Instructional Sessions. (Note the Lack of Necessity to Use Prompts)

Fig. 5

The Number of Correct Responses Emitted by Three Participants in Non-Mirror Trained Group during 20-Learn-Unit Instructional Sessions. (Note the Necessity of Using Prompts)

Figure 6 shows the increase in correct responses emitted in the probe sessions by the participants from the two intervention groups. It is apparent that the mirror-trained procedure worked and the face-to-face procedure did not. In total, the mirror-trained group mastered 30 sets (M1 mastered 12; M2 and M3 mastered nine each), and the non-mirror trained group mastered six sets (NM1 mastered 2, NM2 mastered 0, and NM3 mastered 4) (see Fig. 7). This discrepancy in mastered imitative responses occurred because the instructional trials were yoked between the members of each pair. The face-to-face trained participants continued to be taught sets of imitative responses that they had not mastered while the mirror-trained participants mastered sets.

Fig. 6

The Increase in Correct Responses Emitted from Pre-Intervention Probes to Post-Intervention Probes for the Paired Participants in the Mirror-Trained Group and Non-Mirror Trained Group

Fig. 7

Number of Criteria Achieved during Training Sessions for Paired Participants from Mirror-Trained Group and Non-Mirror Trained Group

Discussion

It would appear that the mirror provided the children with immediate and complete visual feedback on their own behavior relative to the modeled behavior. In addition, the actions were taught side-to-side rather than face-to-face. Because imitation requires a comparison of one’s own behavior with that of others, the imitator needs to see both herself and the model at the same time to determine whether the response corresponds with the model. In the non-mirror trained group, however, the child sat on a chair facing the model. This traditional method of imitation instruction can provide only half of the “learning picture” to the imitator. Without a mirror to build the connection between her own response and that of the model, the child can only kinesthetically feel, or guess at, the visual correspondence. The reflective feature of mirrors enables the child to see not only the model but also herself in the mirror; therefore, the “learning picture” is intact and “what the response feels like” (kinesthetically) is connected with “what the response looks like” (visually) (Mitchell 1992, 1993). Moreover, we rotated three responses in each set and taught multiple sets to mastery, providing multiple exemplars of responding by duplication, instead of teaching one response at a time to mastery. The former should encourage the formation of a class, while the latter might not.

All three participants in the mirror-trained group learned, and the participants in the face-to-face group did not. Thus, consistent with Erjavec et al. (Erjavec and Horne 2008; Erjavec et al. 2009; Horne and Erjavec 2007), the children did not learn under face-to-face instruction. GI can be taught with the use of a mirror and apparently cannot be taught without one. However, within the mirror-trained group, the three participants did not progress at the same rate. Participant M1 required the greatest number of instructional sessions before GI emerged, compared to M2 and M3 (Fig. 8). This finding was consistent with the physical development of the three participants, as Participant M1 had the lowest standardized test scores for gross motor skills. Therefore, there might be some correlation between children’s physical development and their rate of learning to imitate gestural movements.

Fig. 8

Number of 20-Learn-Unit Training Sessions Completed by Three Pairs

One of the most notable limitations of the current experiment was the uneven number of learn units assigned to each action taught in the 20-learn-unit instructional sessions. That is, there was always one response in the session that received one fewer learn unit than the other two. This required considerable care in arranging the presentation of the stimuli so that there were equal numbers of learn units for each response across sessions, and it could have been avoided by arranging the sessions differently. We recommend that future studies either set the number of target responses at a number that divides evenly into the session length (e.g., 2 or 4), so that every response has the same opportunity to be presented, or simply run 21-learn-unit blocks.
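The arithmetic behind this limitation can be illustrated with a short sketch. It assumes one plausible way of balancing the schedule, namely rotating which response receives the fewer learn units from session to session; the text does not describe the exact arrangement that was used, and the function name is hypothetical. The response names are those reported for Participant M1 in Set 1.

```python
def allocate_learn_units(responses, units_per_session, session_index):
    """Split a session's learn units across responses as evenly as possible.

    Assumption: leftover units are rotated across responses from one
    session to the next so that totals even out over several sessions.
    """
    base, extra = divmod(units_per_session, len(responses))
    counts = {}
    for position, response in enumerate(responses):
        # Shift the position by the session index to rotate the leftovers.
        rotated = (position - session_index) % len(responses)
        counts[response] = base + (1 if rotated < extra else 0)
    return counts


SET_1 = ["tap chin", "touch hair", "touch tummy"]

# With 20 learn units, one response gets 6 and the other two get 7.
for session in range(3):
    print(session, allocate_learn_units(SET_1, 20, session))

# With 21 learn units, every response gets 7 in every session.
print(allocate_learn_units(SET_1, 21, 0))
```

With three responses and 20 learn units, the split in any single session is 7, 7, and 6, and the totals equalize only after three rotated sessions (20 each). A 21-learn-unit block removes the need for rotation altogether.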

General Discussion

The major finding was that training with a mirror resulted in the acquisition of the GI capability in children with autism spectrum disorders, and presumably it would work equally well with typically developing children like those studied by Erjavec and colleagues (Erjavec 2002; Erjavec and Horne 2008; Erjavec et al. 2009; Horne and Erjavec 2007). One explanation for the differences between the two procedures concerns the differences in what is reinforced. That is, in the non-mirror training procedure, the children are reinforced for specific behaviors that they emit but cannot see from the same perspective from which they see the modeled behavior. If the model touches her right shoulder and the child responds by touching her shoulder, what the child sees is very different from what she sees if her response is reflected in the mirror. The mirror experience teaches the child the correspondence between what is seen and what is felt in producing correspondence. Moreover, it is possible that mirror training may facilitate children seeing themselves as others see them, which may be an important aspect of audience control.

Another aspect of the procedure that probably contributed to its success is that we arranged the instruction such that the general case, or class of responding, was more likely to emerge. If several responses are taught where both the child’s response and the model’s response are viewed from the same perspective, we can surmise that this contributes to learning see-and-do relations as a class of responding. That is, it is important to rotate different responses rather than teach one or two at a time, since the former teaches a class of responding and the latter teaches specific responses. However, when we taught the general case with the mirror, GI emerged, but it did not emerge without the mirror. Both the general case procedure and the mirror procedure were important.

As to the source of reinforcement for GI, some argue that the emission of GI is a function of a reinforcement schedule history (Gewirtz and Stingle 1968, Erjavec). It is undeniable that the participants in Experiment 2 received many instructional trials in which each correct response was reinforced and each incorrect response was not. On the one hand, it is possible that a reinforcement history in instructional trials alone could result in the emission of 80 % or more correct responses across 26 unconsequated probe trials. On the other hand, in studies comparing known reinforcers with neutral stimuli as consequences, numerous sessions were required for extinction effects to be observed (Greer and Singer-Dudek 2008; Greer et al. 2008; Singer-Dudek et al. 2011). Thus, it remains possible that the correct responding in the post-mirror training probes, where the children demonstrated GI, might be attributed to a reinforcement schedule effect. In summary, while the reinforcement schedule may have been the source for continuing to imitate, the correspondence between seeing and doing as a reinforcer seems to be a more parsimonious explanation. Moreover, reinforcement schedules may act to condition the correspondence as a reinforcer. While future research should develop procedures to test the two hypotheses, we think that our data support the correspondence theory. In order to test the source of reinforcement, experiments need to (a) establish GI, and then (b) test the possible source of reinforcement. It would appear that our study is one of the few (Moreno 2012; Moreno & Greer) that has resulted in GI.

We discovered in Experiment 1 that the order of presentation was important with regard to whether the adults did or did not mirror. The adult participants performed quite differently across the four probe lists, in which the 26 responses were arranged in different sequences. That is, more right hand responses were emitted when same body side responses were presented sequentially (e.g., left hand movement, left hand movement, left hand movement, right hand movement, right hand movement, right hand movement). Similarly, more left hand responses were observed when the same responses were required to be emitted from each side of the body (e.g., left hand same shoulder, right hand same shoulder). Thus, there was an interaction between order of presentation and responding. We controlled for this in Experiment 2 by using Probe List 4, which had the least sequence effects.

Regardless of the limitations, the data suggest that (a) adults do not typically emit non-mirrored responses and hence do not show the perspective taking that some have argued to be developmentally important (Ozonoff and Miller 1995), and (b) face-to-face GI emerged with training that used a mirror and did not emerge when face-to-face training was done without a mirror, at least with participants like ours. We conclude that the use of a mirror acted to reinforce duplication of movements rather than the emission of individual behaviors. Face-to-face training did not allow the participants to see that duplication of the target response, as a class, was what was reinforced. Thus, it is possible that the participants in the Erjavec et al. studies could have acquired GI had they been taught with training sets using a mirror. Our participants were older than those in the Erjavec et al. studies, but were developmentally younger than some of their participants. Our data show that GI can be induced, and we surmise this is because the mirror facilitated the learning of the appropriate response class; however, alternative explanations are possible.

Interestingly, most typically developing children do acquire generalized imitation without special mirror training at some point, since adults have GI. How they come to do so, and why, are important research questions that remain to be answered. Neuropathologic research found that abnormalities in the corpus callosum might result in defective long-range connections (Deweerdt 2013). Many individuals with autism may have such corpus callosum abnormalities (e.g., Hardan et al. 2009; Keary et al. 2009). However, the present study demonstrated that appropriate physiological treatment could help them overcome difficulties from their generic disabilities.

Additional research also needs to be done on the actual benefits that accrue from establishing GI. Questions about the difference between selective imitation and generalized imitation remain. Apparently, there is a difference between generalized imitation and the selective imitation that is obviously present in young children. Infants imitate certain actions of caretakers in what might be selective imitation (Meltzoff and Moore 1977, 1989); generalized imitation, however, involves a range of actions. We suggest that the difference between the two types is a function of the reinforcer for each, as we described above. Unresolved issues like these simply emphasize the importance of locating the source of reinforcement for GI.