“If a child cannot learn in the way we teach, we must teach in a way the child can learn”—Dr. Lovaas’s famous call to action is as relevant to behavioral practitioners today as it was in the 1990s when he uttered those words at conferences in the United States and throughout the world. The research of Lovaas (1987); the follow-up study of McEachin, Smith, and Lovaas (1993); and the replication studies of Sallows and Graupner (2005) and Cohen, Amerine-Dickens, and Smith (2006) have laid the foundation of effective early intensive behavioral intervention (EIBI). One component of EIBI is the acquisition of receptive language, also referred to as listener responding. An article by Grow and LeBlanc (2013) provides a set of basic implementation guidelines to follow when first beginning receptive language programming. These strategies are meant to decrease the likelihood of encountering common difficulties associated with receptive language development in children with autism, such as faulty stimulus control, overselection, and failure to attend to the stimuli. Through a careful analysis of receptive labeling procedures and with evidence from research to support their recommendations, Grow and LeBlanc (2013) established a strong foundation upon which to develop receptive language programming.

However, several strategies have been effective in helping children with autism gain receptive language that either are not captured within the guidelines or are contrary to the guidelines. Table 1 lists those strategies and the general guideline they may violate. Two articles have already been written that include a list of programming variations to use when children struggle to acquire receptive language (Chesnut, Williamson, & Morrow, 2003; Pelios & Sucharzewski, 2004). Although lists of behavioral strategies for receptive language can be helpful in informing practitioners of the wide variations available for programming, practitioners must do their part not to fall into the trap of randomly choosing a strategy or attempting to implement an uninformed shotgun approach of multiple strategies. First, careful consideration should be given to the evidence. Does a particular strategy have more or less evidence to back it up? How similar or different are the program variations under consideration to the exact procedure found in research? Second, careful consideration must be given to the rationale. What is the hypothesis for why a particular strategy will work? What are the underlying behavioral principles that make it reasonable to assume that the strategy will be effective? Third, the individual child must be assessed carefully. How is this child similar or different from the participants in the research? How or why does the rationale for this strategy fit for this particular child’s profile?

Table 1 Guidelines from Grow and LeBlanc (2013)

This article is meant to build upon the existing literature and help behavior analysts become better problem solvers when difficulties with receptive language arise. The article identifies, through a literature review, possible alternatives to teaching receptive language when general guidelines fail. In addition, it identifies strategies that have not yet been studied experimentally but that hold promise based on their underlying rationale and effectiveness with a few learners during the course of practice over the past 22 years by the authors of this study. Finally, the article identifies specific strengths and weaknesses of a child that can help practitioners determine which alternatives may be most beneficial to attempt.

Potential Strategies from an Analysis of Current Skill Level

Behavioral practitioners often pride themselves on their ability to break down complex skills into smaller prerequisite skills, teach those prerequisite skills first, and then gradually combine those skills to teach more complex skills. When difficulty with a complex skill such as receptive labeling occurs, one approach available to behavioral practitioners is to focus on smaller prerequisite skills.

Receptive labeling is a type of receptive language skill that requires a conditional discrimination rather than a simple discrimination. A simple discrimination is a basic three-term contingency composed of a discriminative stimulus, a response, and a differential consequence for the correct response. For example, in a receptive instructions program, a simple discrimination results when (a) the therapist provides an auditory stimulus (e.g., she says “Wave”), (b) the child responds to the stimulus (e.g., the child waves), and (c) the therapist delivers reinforcement only for the correct behavior. A conditional discrimination is a more complex four-term contingency that requires an additional comparison to ensure a correct response. In an auditory–visual conditional discrimination, such as in a receptive labeling program, the words spoken by the therapist make, for that moment, one visual item the discriminative stimulus (SD) and the other visual items S-deltas. For example, (a) the therapist provides an auditory stimulus (e.g., she says “Elmo”) while (b) an array of visual items (e.g., a car, Elmo, and a plastic cup) is in front of the child, so that (c) for the moment, the child selects Elmo from the array and not the car or plastic cup, and (d) the therapist delivers reinforcement only for the correct behavior. The therapist should also make sure that Elmo is established as both an SD (when the therapist says “Elmo”) and an S-delta (when the therapist labels a different object while Elmo is still in the array). If the therapist always asks for Elmo when Elmo is present, then the child does not have to attend to the auditory stimulus but rather only needs to visually discriminate where Elmo is located. Behavioral practitioners should not underestimate the complexity of a conditional discrimination or of discriminations in general (Sidman, 1986, 2010); conditional discriminations can be challenging to establish.
Once a discrimination is established, we often assume that the stimuli we wanted to control the behavior are, in fact, controlling the behavior (Sidman, 2008). However, there are many variables in the environment that can inadvertently control the behavior, and we may overlook their impact on what we teach. Establishing a strong foundation of prerequisite skills in a child with autism becomes important so that we can focus specifically on the conditional discriminations we wish to develop.
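The requirement that each item serve as both an SD and an S-delta can be illustrated with a small trial-planning sketch. This is only an illustration under stated assumptions, not a procedure taken from the cited sources: the function name `conditional_trials`, the number of trials per item, and the choice to reshuffle item positions on every trial are all hypothetical.

```python
import random


def conditional_trials(items, trials_per_item, seed=None):
    """Build a trial list in which every item is the target equally often.

    Each trial is (spoken_label, array_left_to_right). Because every item is
    requested on some trials and merely present on the others, each item
    serves as both an SD and an S-delta across the session.
    """
    rng = random.Random(seed)
    targets = list(items) * trials_per_item
    rng.shuffle(targets)          # randomize the order of spoken labels
    trials = []
    for target in targets:
        array = list(items)
        rng.shuffle(array)        # assumption: positions change on every trial
        trials.append((target, array))
    return trials


session = conditional_trials(["Elmo", "car", "cup"], trials_per_item=4, seed=1)
for label, array in session[:3]:
    print(f'Therapist says "{label}"; array = {array}')
```

A plan built this way never asks for Elmo on every trial in which Elmo is present, so position or the mere presence of an item cannot reliably predict reinforcement.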

The assessments of Kodak et al. (2015), which built upon the Assessment of Basic Learning Abilities (Kerr et al., 1977; Sakko, Martin, Vause, Martin, & Yu, 2004), serve as a useful starting point in identifying potential prerequisite skills for receptive labeling. The authors found correlations between the ability to complete all skills in the assessment and the ability to receptively identify objects. Five prerequisite skills were identified. First, in imitation of pointing, the therapist points at a picture in an array of two and the child points at the same picture. Second, in simple visual discrimination, the child touches a picture in an array of two pictures whose position is randomly rotated. One picture results in reinforcement and the other picture does not. Third, visual–visual identity matching is a type of visual–visual conditional discrimination that is often taught through match-to-sample procedures. For this skill, a therapist hands the child a picture and the child places the picture on top of a matching picture in a field of three cards. Fourth, by scanning, the child looks at each stimulus in the array during visual–visual identity matching. Finally, in simple auditory discrimination, the child touches a white card in the presence of a sound and keeps his hands in his lap in the presence of a different sound. Failure to demonstrate one of the five skills provides direction for prerequisite programming that may benefit a child prior to attempting a typical receptive labeling program.

A couple of combinations of the aforementioned skills may also be helpful prerequisites to the auditory–visual conditional discrimination required in receptive labeling. A simple auditory discrimination followed by a simple visual discrimination is more complex than either simple discrimination in isolation but is not as difficult as a conditional discrimination. An example of such a program would be having a therapist say the word go and then having a child always touch the same picture in an array of two pictures that are randomly rotated on the table. The discriminations remain simple because the spoken word is not directly related to the picture. The child must wait to respond until he hears the word go, but the word does not indicate which picture to touch. The child always touches the same picture. This is the type of discrimination present in the earlier example if a therapist always says “Elmo” when Elmo is on the table and never names another object on the table. Green (2001) and Grow and LeBlanc (2013) caution against such a procedure because it may inadvertently teach a child that he or she does not need to attend to the actual spoken word or he or she may learn not to attend to all the stimuli in the array. However, as indicated in the following sections, procedures that include this type of discrimination have been helpful in teaching some children with autism to gain receptive language, perhaps in part because they need more practice with simple discriminations. If necessary, it may be possible to allay some concerns by using arbitrary nonfunctional sounds or arbitrary nonfunctional objects while a child gains this prerequisite skill and then using actual words and functional objects in a typical receptive labeling program.

Also, although an auditory–auditory conditional discrimination skill has been demonstrated to come after auditory–visual conditional discriminations (Marion et al., 2003), an auditory–visual conditional discrimination that includes auditory identity matching may facilitate correct responding. In fact, neuropsychology research has demonstrated that auditory sounds associated with an object can facilitate recognition of that object (Kassuba, Menz, Röder, & Siebner, 2013).

In such a program, the child’s response includes a sound that is the same as the sound in the SD. For example, a therapist reaches into a bag and pushes the button on a train that makes a train whistle noise. The child has three objects that make noise in front of him (the same train, an electronic piano, and a maraca). The child pushes the button on the train. Assessing these two skills in the example formats described previously may also help identify prerequisite skills to teach.

Finally, two other prerequisite skills worth assessing are a child’s ability to respond to shortened stimulus presentations and delayed matching-to-sample tasks. An auditory stimulus is transient, and children may be more successful with receptive labels after learning to respond to other stimuli that are present for only a short period of time. In addition, because a child must scan an array of objects before responding, the amount of time before a response can occur may be longer than in a simple discrimination. Research on delayed matching-to-sample tasks in both humans (Arntzen, 2006) and animals (Lind, Enquist, & Ghirlanda, 2015) may hold answers in helping children learn to remember the auditory sound while searching for the visual stimuli.

The following strategies may be helpful for children who demonstrate difficulty with one or more of the aforementioned prerequisite skills. Table 2 lists a synopsis of which of these strategies may be helpful based on an analysis of the prerequisite skills taught in each procedure and an initial assessment of a child’s ability to demonstrate those skills.

Table 2 Beneficial strategies based on strengths and weaknesses a child demonstrates during an assessment of prerequisite skills

Strategy 1: Selection-Based Imitation

The general format of the program that Lund (2004) calls selection-based imitation starts with two identical sets of pictures placed directly across from each other on a table. The therapist says “Do this” and points to one of the pictures closest to her side of the table (e.g., a picture of a house). The child points to the same picture close to his side of the table (e.g., an identical house picture closest to the child). The program progresses from pictures lined up in a field of three to a line of six, followed by varying the picture location (so the pictures are not directly across from each other but are still in a line) and then finally varying the pictures in a random pile rather than in a straight line. We are unaware of any additional research articles that evaluate selection-based imitation, and the original article is a discussion of the procedure and its theoretical underpinnings based on its success with a few learners rather than an examination of the procedure using an experimental design.

The procedure teaches several prerequisite skills, including imitation of pointing and scanning. Children who benefited from the procedure also demonstrated “impulsive” responding, immediately grabbing for stimuli on the table, which means that they probably would have failed a simple visual discrimination test. Interestingly, the article notes that picture-to-picture matching skills are typically acquired prior to implementing selection-based imitation, so one would expect visual identity matching to be a strength for the child prior to implementing this program.

Strategy 2: Simple Auditory Discrimination

A basic receptive instructions program is one way to teach a simple auditory discrimination. The assessment of Kodak et al. (2015) provides another format that may be worth pursuing. The program would require the child to touch a card (e.g., a picture of a duck) in the presence of one auditory stimulus (e.g., the sound of a duck quacking) and not in the presence of other auditory stimuli (e.g., other sounds from a Listening Lotto game). One can extrapolate to other versions of this simple discrimination to include: (a) silence versus a target sound (e.g., a duck quacking), (b) auditory sounds versus a vocal target sound (e.g., du), (c) unbroken vocal sounds (mmmmm, hhhhhhh) versus a target word (e.g., duck), and (d) other words (e.g., elephant, juice) versus the target word (e.g., duck). In all of these formats, only one card would remain on the table to touch because this is meant to be a simple auditory discrimination, not a conditional discrimination.

Whether or not such a program would be beneficial as a prerequisite for receptive language is unknown, but it demonstrates the breadth of possibilities still worth studying, both in research and in practice, just in the area of simple auditory discrimination for a child who demonstrates difficulty acquiring receptive labels.

Strategy 3: Touch Same

A common program we have conducted in the past called “touch same” often follows other visual identity matching programs. The therapist holds up a picture (e.g., a frog) for a brief period (e.g., 1 s), and the child learns to touch an identical picture in a large field size (e.g., a field of 24 cards).

The touch same program targets another skill that may be worth assessing in future research: responding to briefly presented stimuli. The program may help children learn to respond to visual stimuli that are displayed for gradually shorter periods of time before they learn to respond to auditory stimuli, which are inherently brief. The format of the program also includes a delayed matching-to-sample component: as the field size increases, the amount of time it takes for the child to find the correct response also increases.

Strategy 4: Order of Stimulus Presentation

Whether the therapist delivers the auditory stimulus first (e.g., “dinosaur”) or presents the visual stimuli first (e.g., placing three objects in front of the child) may affect client learning. Petursdottir and Aguilar (2016) recently conducted research to determine which delivery method was more effective for four typically developing children. They found that delivering the auditory word first followed by showing pictures on a computer screen resulted in faster acquisition. In contrast, most applied settings present the visual stimuli first (e.g., putting objects on a table) followed by delivering the auditory SD. Although it is unknown whether results would be the same for children with autism, this is a component modification that could be manipulated and tracked by practitioners working with an individual child.

Such a program may be incorporated with a variety of the strategies that follow. The success of either format may be linked to the specific deficits a child exhibits. For example, if a child demonstrates difficulty scanning, presenting the objects first may also be used to require a type of observing response, during which the child is expected to shift his gaze to each object as it is placed on the table before the next item is presented and then the auditory SD is finally delivered. However, if the child demonstrates difficulty with simple auditory discriminations, the child could be required to engage in a differential observing response (e.g., touching a blank card) prior to the therapist repeating the SD and showing the visual stimuli (Green, 2001).

Strategy 5: Simple-to-Conditional Discrimination

The simple-to-conditional procedure is a nine-step process that ends with the conditional-only procedure. To get there, the therapist would (a) ask for “horse” with only the horse on the floor, (b) ask for “star” with only the star on the floor, (c) ask only for “horse” with both the horse and the star on the floor, (d) ask only for “star” with both the horse and the star on the floor, (e) randomly intermix “horse” and “star,” (f) ask for “Lightning McQueen” with only Lightning McQueen on the floor, (g) randomly intermix “Lightning McQueen” and “horse” with those two objects out, (h) randomly intermix “Lightning McQueen” and “star” with those two objects out, and finally (i) randomly intermix asking for the horse, star, and Lightning McQueen with all three objects on the floor (Lovaas, 2003).

Although multiple recent studies have indicated an advantage to using the conditional-only method that immediately starts at Step 9 (Grow, Carr, Kodak, Jostad, & Kisamore, 2011; Grow, Kodak, & Carr, 2014; Holmes, Eikeseth, & Schulze, 2015; Vedora & Grandelski, 2015), there are some limitations to the research. In particular, children were not initially assessed to determine whether or not they could already respond correctly to simple auditory discriminations and simple visual discriminations. In fact, many of the children immediately responded correctly to steps with simple auditory discriminations (e.g., Steps 1, 2, and 6) and responded within one or two sessions to steps with a simple auditory discrimination followed by a simple visual discrimination (e.g., Steps 3 and 4). However, for the few who demonstrated difficulty with the initial steps of simple discrimination, mastery often occurred faster or nearly as quickly in the simple-to-conditional procedure as the conditional-only strategy.
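The nine steps and the kinds of discrimination they involve can be written out as a small data structure. The sketch below is hypothetical: the names `STEPS`, `discrimination_type`, and `trials_for_step` are illustrative, the item names come from the example above, and the classification rule is only one way of encoding the grouping described in this section (Steps 1, 2, and 6 as simple auditory discriminations; Steps 3 and 4 as a simple auditory discrimination followed by a simple visual discrimination).

```python
import random

# (labels the therapist may say, objects on the floor) for Steps 1-9.
STEPS = [
    (["horse"], ["horse"]),
    (["star"], ["star"]),
    (["horse"], ["horse", "star"]),
    (["star"], ["horse", "star"]),
    (["horse", "star"], ["horse", "star"]),
    (["Lightning McQueen"], ["Lightning McQueen"]),
    (["Lightning McQueen", "horse"], ["Lightning McQueen", "horse"]),
    (["Lightning McQueen", "star"], ["Lightning McQueen", "star"]),
    (["horse", "star", "Lightning McQueen"],
     ["horse", "star", "Lightning McQueen"]),
]


def discrimination_type(labels, objects):
    """Classify a step by what the child must attend to."""
    if len(objects) == 1:
        return "simple auditory"
    if len(labels) == 1:
        return "simple auditory then simple visual"
    return "conditional"


def trials_for_step(step_number, n_trials, seed=None):
    """Return (spoken_label, shuffled_array) pairs for one step (1-based)."""
    rng = random.Random(seed)
    labels, objects = STEPS[step_number - 1]
    trials = []
    for _ in range(n_trials):
        array = list(objects)
        rng.shuffle(array)  # assumption: positions are rotated on every trial
        trials.append((rng.choice(labels), array))
    return trials


for number, (labels, objects) in enumerate(STEPS, start=1):
    print(number, discrimination_type(labels, objects))
```

Under this encoding, the conditional-only method corresponds to calling `trials_for_step(9, ...)` from the outset, whereas the simple-to-conditional procedure works through the list in order.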

Strategy 6: Blocked Trials

In blocked trials, the therapist delivers one SD (e.g., “Nemo”) in a field size of only two. The therapist repeats the same SD for a block of trials such as 10 trials. The therapist then switches to the other SD (e.g., “pizza”) for a second set of blocked trials. Based on meeting specific mastery criteria, blocks of trials are gradually decreased and SDs are randomly intermixed.

Blocked trials have a long history of success in teaching receptive labels to some individuals (Kodak et al., 2015; Pérez-González & Williams, 2002; Saunders & Spradlin, 1989). Pérez-González and Williams (2002) successfully used the procedure to teach receptive object labels to three children with autism who had already demonstrated difficulty acquiring the skill. Their procedure consisted of six steps:

  1. Blocks of 10 trials were carried out with objects remaining in the same location.
  2. Blocks of five trials were carried out with objects still remaining in the same location.
  3. Blocks of two or three trials were carried out with objects still remaining in the same location.
  4. The two object names were randomly intermixed with objects still remaining in the same location.
  5. The two object names were randomly intermixed with objects in the opposite location.
  6. The object names were randomly intermixed and the object location was randomly chosen.

Interestingly, the researchers did not find the procedure problematic for the reasons one might typically associate with this strategy (i.e., faulty stimulus control created based on the location of the object or matching by exclusion based on only two objects present in the field).
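The six steps can be sketched as a trial generator. This is a rough illustration rather than the published protocol: the function name `step_trials`, the default of 20 trials per step, and the random choice between blocks of two and three in Step 3 are assumptions, and the mastery criteria that govern movement between steps are not modeled.

```python
import random


def step_trials(step, labels, n_trials=20, seed=None):
    """Return (spoken_label, left_to_right_array) pairs for Steps 1-6."""
    if step not in range(1, 7):
        raise ValueError("step must be between 1 and 6")
    rng = random.Random(seed)
    a, b = labels

    # Spoken labels: blocks of 10 (Step 1), 5 (Step 2), 2 or 3 (Step 3),
    # or random intermixing (Steps 4-6).
    if step <= 3:
        spoken = []
        current = a
        while len(spoken) < n_trials:
            if step == 1:
                size = 10
            elif step == 2:
                size = 5
            else:
                size = rng.choice([2, 3])
            spoken.extend([current] * size)
            current = b if current == a else a
        spoken = spoken[:n_trials]
    else:
        spoken = [rng.choice([a, b]) for _ in range(n_trials)]

    # Object locations: same (Steps 1-4), opposite (Step 5), random (Step 6).
    trials = []
    for label in spoken:
        if step <= 4:
            array = (a, b)
        elif step == 5:
            array = (b, a)
        else:
            array = rng.choice([(a, b), (b, a)])
        trials.append((label, array))
    return trials


for label, array in step_trials(2, ("Nemo", "pizza"), n_trials=10):
    print(label, array)
```

Separating the spoken-label schedule from the object-location schedule mirrors the way the steps vary one dimension at a time: block length shrinks across Steps 1 to 4 while location stays fixed, and location changes only in Steps 5 and 6.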

As with simple-to-conditional discrimination, repeating one label prior to changing to a different label in blocked trials sets up a simple discrimination that may be easier for the child to learn prior to learning a conditional discrimination. Research indicates that too frequent or too few reversals in a conditional discrimination can hinder acquisition (Saunders & Spradlin, 1989). Blocked trials alleviate this concern by systematically programming the reversals. In addition, blocked trials may be helpful for a child who demonstrates prompt dependency with physical and gestural prompts because the child has the opportunity to learn to correct his error by switching to the only other available object rather than through other forms of prompting.

Strategy 7: Sound Discrimination

The sound discrimination program progresses through a series of steps so that a child responds to the sound of an object by making the same sound. For example, the therapist shakes a rattle behind a barrier so that the child cannot see which object is making the sound. The child then selects the correct object (e.g., a rattle) from a field of three objects (e.g., bells, a rattle, and a drum) and shakes the rattle to make the same sound.

Eikeseth and Hayward (2009) successfully taught this skill to children who had not been able to acquire receptive labels. Further, children demonstrated transfer in responding from the auditory sounds to the actual words. After teaching the child to discriminate between the sounds of two different objects, the vocal word was added as part of the SD and the sound was gradually faded for those two objects so that eventually the child picked up the rattle to shake it when hearing the word rattle and beat on the drum when hearing the word drum.

Responding to a variety of auditory stimuli may be easier to learn first before learning to respond to vocal stimuli (i.e., spoken words), which are a subset of auditory stimuli and have many more features in common than other auditory sounds (Eikeseth & Hayward, 2009). A whistle blowing, bells ringing, and a drum banging sound much different than the words whistle, bells, and drum. In addition, having the auditory sound occur in both the initial stimulus and the response may also create an easier auditory–visual conditional discrimination to acquire because it includes auditory matching. Because the program requires the child to make sounds with objects, the ability to manipulate objects, often already practiced in EIBI through some form of object imitation, should be a strength of the child.

Appendix 1 outlines a series of modifications to the sound discrimination program to help a child gradually switch from a broader auditory stimulus (e.g., the sound of a drum) that requires an auditory response (e.g., banging an identical drum) to a vocal stimulus (e.g., the word pizza) that leads to a nonvocal response (e.g., touching a toy pizza). For example, one variation changes the SD to a vocal sound that sounds similar to the object (e.g., making a high-pitched “ding-ding-ding” vs. a low-pitched “bum-bum-bum” sound for bells and a drum, respectively) as a potential next step in generalization from object sounds to vocal sounds. As another step, an app such as SpeakAll (Boesch, Wendt, Subramanian, & Hsu, 2013) can be used to record the therapist’s voice on one iPad (e.g., saying “rocket”) while the learner’s iPad can be programmed to repeat the therapist’s voice when the child touches the picture of the rocket. These and other variations of the sound discrimination program deserve further study to determine their usefulness as intermediate steps in the acquisition of receptive labeling.

Strategy 8: Receptive Video Labeling

In receptive video labeling, the therapist plays a short video clip from a movie or TV show (such as part of the Mickey Mouse Clubhouse theme song) and then pauses the video. The child does not see the video. The child then picks up the correct character associated with the movie from a field of three characters and is allowed to watch the remainder of the video clip.

The receptive video labeling program was conducted successfully with two children with autism by the authors of this article. The program includes elements of the stimulus-specific reinforcement strategy discussed later. Future research into the effectiveness of this program and the necessary components to make it effective would be beneficial.

In general, the program is an auditory–visual conditional discrimination just like receptive labels, but the auditory SD is an excerpt from a video rather than a word. Children who already watch a variety of television shows or movies may benefit from this program because of either familiarity with or motivation for those videos. The program was first considered because the parents of one child who demonstrated difficulty acquiring receptive language noted that he always came running into the family room from anywhere in the house if the opening song from the Disney movie Cars was played on the television. In addition, a video clip lasts longer than a spoken word, potentially eliminating the difficulty posed by a shortened stimulus presentation.

Strategy 9: Receptive Singing Label

One version of the receptive singing label program involves the therapist singing an object label to a specific tune (e.g., “fire engine” to the two words in the tune “London Bridges” or “blanket, blanket” to the two words in the tune “twinkle, twinkle” as in “Twinkle, Twinkle, Little Star”). The child then hands the correct object to the therapist.

This singing program has been used successfully with three children with autism by the authors of this study. Simpson and Keen (2010) found that the use of song facilitated receptive labeling. However, there was little generalization when the music was not present. A follow-up study found that singing the SD in a receptive labeling task led to greater engagement and learning than in the spoken condition (Simpson, Keen, & Lamb, 2013). Computer-based software delivered the SD to the tune of “Old MacDonald,” ending with one of five animal names. The child moved the computer mouse to the correct animal and clicked on it.

A vocal response that includes additional auditory cues is key to the receptive singing label program. Sung words may be easier to discriminate than spoken words, with variations in rhythm, melody, and tone. Although this program violates the guidelines established by Grow and LeBlanc (2013) to stay away from voice modulation, it is important to recognize that the purpose of the receptive singing label program is again to establish an introductory level of auditory discrimination as a prerequisite skill. Its purpose is not, in fact, to teach the receptive labeling of objects using only words. Interestingly, however, two of the children with whom the authors worked were able to successfully transfer the skill and respond to the spoken label alone when the song was faded. Children for whom this program may be especially beneficial are those for whom music is a strong reinforcer (e.g., musical sounds, musical instruments, or songs in general).

Strategy 10: Voice Inflection

Emphasizing different parts of the actual label (e.g., “Darth Vader” in a low, slow voice vs. “Puppy!” in a high-pitched voice) is the essence of a voice inflection program. In a recent study, Simpson, Keen, and Lamb (2015) found that there was little difference between sung words and spoken words when an elevated pitch was used in each. They noted that some research indicates that children with autism respond better to linguistic and musical pitch, which may be the relevant feature in the intervention.

Using voice modulation is not recommended by Grow and LeBlanc (2013) because of the possibility that the child will overselect on the way in which the word is said rather than select the word itself. However, for a child who is simply learning to focus on the auditory sound, using voice inflection may be appropriate. Acquiring a few labels with voice inflection may be a gradual step toward more subtle vocal discriminations. In this case, it would be better to consider the inflected sound as the actual target rather than as a prompt to be faded. The initial goal is to obtain a basic discrimination between a high pitch and a low pitch. If the inflections cannot be faded, then add another object whose label is said slowly in a high pitch and another whose label is said slowly in a low pitch, and continue to introduce different pitch and duration variations. To mitigate the concern of having to spend an exorbitant amount of time in the future programming appropriate stimulus control, behavioral practitioners should consider using arbitrary objects or only a small subset of items.

Strategy 11: Response Delay

Dyer, Christian, and Luce (1982) created a program in which the therapist labels an object (e.g., “Spiderman”) with three objects in front of the child. The therapist waits 3 s before signaling the child to respond and prevents any earlier response during the delay (e.g., by holding down the child’s hands). The child must wait until his or her hands are released before pointing to the correct object.

Lamela and Tincani (2012) extended the research in wait times by comparing a brief wait time (approximately 1 s) with a longer wait time (approximately 4 s) in two children with autism who demonstrated off-task behavior during one-on-one therapy. Their results indicated that the brief wait time led to more correct responding, which is comparable to the findings of one study (Tincani & Crozier, 2008) and contrary to the findings of other studies, two of which focused on receptive language development (Dyer et al., 1982; Valcante, Roberson, Reid, & Wolking, 1989). It appears that the appropriate wait time is a balance between allowing enough time for a child to stop engaging in other off-task behaviors and attend to the relevant cues and being short enough to keep the child attending to the task without engaging in other inappropriate behaviors.

The response delay program may enhance one skill identified in the study by Kodak et al. (2015): attending to the task by scanning. Such a program may be incorporated with many of the strategies we have already discussed. However, it may also be appropriate for children who have met all other prerequisite skills for auditory–visual conditional discriminations but who still have a tendency to engage in off-task behavior during therapy, look away from the materials in front of them, or engage in impulsive behavior and immediately grab for the objects in front of them even before the SD is delivered. The program can also be modified to focus on delayed matching to sample by not allowing the child to see the objects in the array until after a predetermined length of time after delivering the auditory SD.

Potential Strategies from an Analysis of Program Implementation

If a child demonstrates all of the prerequisite skills for auditory–visual conditional discriminations but still demonstrates difficulty with receptive labeling programs, another source of information to help determine which variations of conditional discrimination programs may be helpful is observations of the child’s performance in other programs. The relative ease that accompanies learning other skills when using specific treatment techniques may transfer to other programs. Table 3 summarizes the types of behaviors that may have already been observed in other early intervention programming. The strategies discussed in the following sections may be effective for an individual child based on his or her demonstrated weaknesses and strengths.

Table 3 Beneficial strategies based on strengths and weaknesses a child demonstrates in earlier early intensive behavioral intervention programming

Strategy 12: Similar Task Interspersal with Expansion Trials

Three forms of task interspersal have been clearly outlined by Volkert, Lerman, Trosclair, Addison, and Kodak (2008), including similar task interspersal. Expansion trials include the systematic increase of time or demands between when a current target SD is delivered and the next time it is delivered. In similar task interspersal with expansion trials, one SD is the target (e.g., “Touch the boat”) and other acquired SDs from the same program (e.g., “Touch the cake,” “Touch Thomas the Tank Engine”) are gradually included. Thus, the SD sequence of “Touch the boat,” “Touch the cake,” and then “Touch the boat” would be considered an expansion of one because one acquired SD was interspersed between the two presentations of the target SD. The SD sequence of “boat–cake–Thomas–Thomas–boat” would be considered an expansion of three because three acquired SDs were interspersed between the two presentations of the target SD (“boat”).
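The counting rule for expansion size can be illustrated with a short Python sketch. The function name `expansion_sizes` and the representation of a session as a list of SD labels are assumptions made for illustration only; they do not come from Volkert et al. (2008) or from any data-collection system.

```python
def expansion_sizes(sequence, target):
    """Count the interspersed trials between consecutive target trials.

    sequence: list of SD labels in the order they were delivered.
    target:   the label of the current target SD.
    Returns one count per pair of consecutive target presentations.
    """
    # Positions in the session at which the target SD was delivered.
    positions = [i for i, sd in enumerate(sequence) if sd == target]
    # The gap between two target trials, minus one, is the expansion size.
    return [later - earlier - 1 for earlier, later in zip(positions, positions[1:])]


# The two sequences described in the text.
expansion_of_one = expansion_sizes(["boat", "cake", "boat"], "boat")
expansion_of_three = expansion_sizes(["boat", "cake", "Thomas", "Thomas", "boat"], "boat")
```

Under these assumptions, a practitioner could use such a count to check that the planned expansion for a session actually matches the trials that were delivered.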

Grow and LeBlanc (2013) recommend that behavior analysts use task interspersal in the form of either similar task interspersal or dissimilar task interspersal when teaching receptive labels. Smith (1994) demonstrated that children were more likely to retain skills with such a systematic approach to task interspersal compared to a mass-trial condition. Further research should compare a systematic expansion trial approach with a more random interspersal of trials, a systematic expansion approach with similar responses versus dissimilar responses, and the number of expansion trials necessary for most children to maintain a skill from one day to the next.

Because the procedure gradually and systematically increases the amount of time or work between when newer skills are practiced, it may be particularly helpful for a child who has difficulty maintaining skills once they are acquired.

Strategy 13: Time Expansion

In this program, the therapist delivers the SD (e.g., “tambourine”) with a tambourine located in one corner of the room. The therapist prompts the first response (e.g., walking to the tambourine and shaking it), waits for 5 s, and redelivers the SD. If the child is successful, the therapist continues to systematically increase the time before redelivering the SD (e.g., 10 s, 30 s, 1 min, 2 min, 5 min, 10 min, 20 min, 30 min, 1 h, 2 h, 3 h, 6 h, overnight). If the child responds incorrectly, the therapist decreases the time to the previous level. Between the 5-s and 5-min time period, the therapist engages the child in other preferred activities. From the 5-min time period forward, the therapist continues with other programs. Once one target has been mastered, a second target is introduced the next day. The goal is for the learner to acquire a label in 1 day and recall it the next morning. Once a child has acquired two labels, each in a day, those two labels are randomly intermixed.
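The delay schedule described above can be sketched as a simple ladder in Python. This is only an illustration under stated assumptions: the names `LADDER`, `next_step`, and `filler_activity` are hypothetical, the overnight step is kept as a label because it has no fixed duration, and the 5-min step is assumed to be filled with other programs because the text says “from the 5-min time period forward.”

```python
# Delay ladder from the text, in seconds. "overnight" has no fixed
# duration, so it is kept as a label rather than a number.
LADDER = [5, 10, 30, 60, 120, 300, 600, 1200, 1800,
          3600, 7200, 10800, 21600, "overnight"]


def next_step(step, correct):
    """Return the index of the next delay on the ladder.

    A correct response moves one level up (capped at the last level);
    an incorrect response returns to the previous level (floored at 0).
    """
    if correct:
        return min(step + 1, len(LADDER) - 1)
    return max(step - 1, 0)


def filler_activity(step):
    """Return what fills the delay at this level of the ladder.

    Assumption: delays shorter than 5 min are filled with preferred
    activities; delays of 5 min or longer are filled with other programs.
    """
    delay = LADDER[step]
    if delay != "overnight" and delay < 300:
        return "preferred activities"
    return "other programs"
```

For example, an error at the 5-min level (`step = 5`) returns the child to the 2-min level, which is again filled with preferred activities rather than other programs.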

This teaching procedure is another area for future research: It was used successfully with one child by the authors of this article but was unsuccessful with another child. The procedure is similar to distributed trial instruction, in which breaks of 5 s to a few minutes occur between trials (Majdalany, Wilder, Greif, Mathisen, & Saini, 2014).

Time expansion balances the frequency with which a skill is practiced with the amount of time that passes between trials. It also incorporates a dissimilar task interspersal procedure so that the skill is interspersed with all other skills that are practiced throughout the day. The procedure may be helpful for children who need a large number of trials to learn the skill and therefore may benefit from a more systematic increase in the frequency of practice.

Strategy 14: Touch Object Versus Hand Object

Behavior practitioners should consider the topography of the child’s response in receptive labeling. One format is to have the child touch an object. A second format is to have the child hand an object to the therapist. A third format is to require the child to stand up, walk to the object, and then either touch the object or bring it back to the therapist.

Booth (1978) noted in his research of receptive object identification that children with disabilities responded best when they were required to hand objects to the therapist. The response associated with picking up an object and handing it to the therapist (or placing it in a container) makes it more difficult to give multiple responses (e.g., pointing to one object and then immediately pointing to another). Other behaviors a child exhibits (e.g., the likelihood the child will throw an object or elope) may also make one response format more effective than the other.

Strategy 15: Embedded Discrete Trial Teaching

Embedded discrete trial teaching incorporates motivation and natural reinforcers within the teaching format. For example, one child may jump to the correct picture based on his preference for a jumping game. Other response format variations might include using a pointer to point to the correct response, shining a flashlight on the correct response, dropping the correct response in water, or slapping the correct picture with a flyswatter. A list of 25 different response formats, originally posted to the Me-List listserv in 1997, is included in Appendix 2.

Geiger et al. (2012) demonstrated that the strategy was more efficient than traditional discrete trials in teaching receptive labeling. A child’s particular preferences become important in the selection of a response format, and a variety of strategies are available to help determine the child’s preference for particular activities (Reid, DiCarlo, Schepis, Hawkins, & Stricklin, 2003).

Such a strategy attempts to increase the motivation associated with the response to help maintain the child’s attention. It is also helpful for children who do not respond to other typical forms of contrived reinforcement. At the same time, it is important to ensure that the response does not add undue complexity to the child’s basic discrimination task.

Strategy 16: Verb–Noun Combination

A verb–noun combination requires a response that includes both a discrete action and an object (e.g., “Push car,” “Wave flag,” “Blow bubbles”). The same action is always conducted with the same object.

Curiel, Sainato, and Goldstein (2016) implemented a matrix training procedure with one child in which five actions, each with a different object, were first taught, and then therapists probed for generalization to other combinations of the same actions and objects (e.g., “Push flag” and “Wave car”). Although the child in the study also had limited receptive language skills and did not respond to receptive instructions, the focus of the verb–noun program is not to probe for generalization, which would include a more complex conditional discrimination. Instead, the focus is to teach initial auditory–visual discriminations with objects and actions that are as radically different from each other as possible, rather than always responding with the same action (e.g., touching objects). In our experience, other indications that this format should be attempted are success in object imitation programs and prior acquisition of a variety of simple receptive instructions.

Strategy 17: Modified Incidental Teaching

In this program, the therapist brings the child to an area associated with a preferred activity (e.g., into the kitchen with items on the counter) and asks if he is ready to make a snack. When the child indicates that he is ready for a snack (e.g., pointing, nodding his head, using augmentative communication, or saying “Yes”), the therapist asks for the items needed to make the snack in random order. Once all items are successfully identified receptively (with prompts if necessary), the child is allowed to make the snack.

McGee, Krantz, Mason, and McClannahan (1983) created this strategy by combining elements of both incidental teaching and discrete trial teaching. Increased motivation in the program may increase a child’s attention to the objects as well as ensure that a high level of reinforcement is delivered. This strategy may be beneficial for learners who have shown rapid development in other skill areas when an incidental teaching approach was used (e.g., in requesting, imitation, or play).

Strategy 18: Two-Item Field

A two-item field in which only two objects are placed in the array is a common format in blocked trials and can teach a basic problem-solving strategy of trying something different (i.e., changing answers) if the first response is incorrect.

The preferred field size suggested by both Grow and LeBlanc (2013) and Green (2001) is a three-item field because it decreases the likelihood that the child will respond correctly by chance. However, Leaf, Sheldon, and Sherman (2010) used a no-no prompt strategy with a two-item field to successfully teach receptive labels to three children with autism. During the program, the therapist delivers an SD (e.g., “garbage truck”) with two objects on the table (a garbage truck and a Lego). If the child responds incorrectly, the therapist says “No” in a neutral tone and repeats the SD. If the child responds incorrectly again, the therapist says “No” again and delivers the SD a third time while also delivering the least intrusive prompt that is effective for the child.
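The sequence of therapist actions in the no-no prompt format, as summarized above, can be written out as a small Python sketch. The function name `therapist_action` and the wording of the returned strings are assumptions for illustration, not terminology from Leaf et al. (2010). Because the description covers only three deliveries of the SD, the sketch deliberately does not guess at what follows a third error.

```python
def therapist_action(prior_errors):
    """Return the therapist's next action within one no-no prompt trial.

    prior_errors: number of incorrect responses the child has already
    made to this SD within the current trial (0, 1, or 2).
    """
    if prior_errors == 0:
        # First delivery: the SD alone, with no prompt.
        return "deliver SD"
    if prior_errors == 1:
        # First error: neutral "No," then the SD is repeated unprompted.
        return 'say "No" in a neutral tone and repeat SD'
    if prior_errors == 2:
        # Second error: "No," then the SD is delivered with a prompt.
        return 'say "No" and deliver SD with the least intrusive effective prompt'
    # The text does not describe more than three deliveries of the SD.
    raise ValueError("procedure describes at most two prior errors per trial")
```

Writing the steps this way makes one design feature visible: the child has two unprompted opportunities to respond before any prompt is delivered.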

As discussed in blocked trials, the format allows other forms of prompting to be faded and sets up a situation in which learning occurs through differential reinforcement of all responses. The format can initially be attempted in an easier program such as visual identity matching. Responding to such differential reinforcement is key to the strategy. One must be cautious of children who do not find enough differential reinforcement associated with immediately responding correctly and who instead just randomly choose an object and, if the response is incorrect, choose the other object.

Strategy 19: Modes of Stimuli

Four common modes of stimuli include objects (e.g., a My Little Pony figurine), pictures (e.g., a picture of a tiger), other people (e.g., the child points to the therapist’s nose), and the child himself (e.g., the child points to his own feet). Different learners may attend better or be more motivated by different modes of stimuli.

There is no current research comparing the acquisition rate of receptive language with different modes of stimuli. However, in a study of tacts, Pérez-González, Cereijo-Blanco, and Carnerero (2014) found that children showed more emergence of novel skills with objects than with pictures, demonstrating that the mode of stimuli can matter in the development of some skills.

Practitioners may be able to determine which stimuli are likely to be more effective by evaluating the child’s acquisition rate on other tasks that use different modes of stimuli (e.g., matching pictures vs. matching objects) or by allowing the child to choose between program formats. Whenever stimuli are used that may be motivating, one must also be cautious that the items are not so motivating that the child is always grabbing for the items just to gain access to them.

Potential Strategies from an Analysis of Equivalence Class Formations

The use of equivalence classes to, in a sense, work around a child’s difficulty with receptive language is one final strategy to consider. A large body of research exists concerning equivalence class formation in individuals with autism (McLay, Sutherland, Church, & Tyler-Merrick, 2013). Results are mixed, with equivalence classes emerging for some individuals but not for others. The following strategies may be worth attempting with children who already demonstrate equivalence class formation with visual identity matching tasks.

Strategy 20: Audio-Specific Consequences

With audio-specific consequences, a child is initially handed an ambulance to match to a picture of the ambulance in an array of three or four pictures. When the child matches the object to the picture, the child is then given an edible as reinforcement and an ambulance sound is played at the same time. The child learns to match four objects to pictures in this format (e.g., after matching a stuffed lion to a picture of a lion, the sound of a lion is played during reinforcement; after matching a spaceship to a picture of a spaceship, the sound of a spaceship blasting off is played during reinforcement). In Phase 2 of the program, the sound is delivered as the stimulus (e.g., the sound of the ambulance) and the child selects the correct picture (e.g., in an array of the ambulance, lion, rooster, and spaceship), hopefully without the need for additional teaching.

Varella and de Souza (2014) demonstrated the emergence of auditory–visual relations when a specific sound was presented as part of the consequence for each stimulus. Although Varella and de Souza’s results were promising, the four 7- to 15-year-old children with autism in the research already had some receptive language skills, although only in the range of those of a 3-year-old.

The procedure and its rationale have long been studied in both animals and humans (Dube, McIlvane, Mackay, & Stoddard, 1987; Zaine, Domeniconi, & de Rose, 2014). If a specific reinforcer is used for each comparison stimulus in a conditional discrimination procedure, the reinforcer may become part of the equivalence class, and equivalence relations that include the reinforcer may emerge without deliberate teaching.

Strategy 21: Stimulus-Specific Reinforcement

Rather than deliver the same or random reinforcers for correct responding, stimulus-specific reinforcement always delivers one specific reinforcer for each specific response (e.g., a cookie is given for correct responding to “spoon” and M&Ms are given for correct responding to “Buzz Lightyear”).

Litt and Schreibman (1981) initially published a study demonstrating the value of stimulus-specific reinforcement in learning receptive labeling discriminations. Chong and Carr (2010) did not replicate the results. However, the children in the study conducted by Litt and Schreibman (1981) were all nonvocal, whereas the children in the study conducted by Chong and Carr (2010) were categorized as advanced vocal learners.

Stimulus-specific reinforcement is potentially the most puzzling strategy in this article. There are multiple theories behind how the procedure works (Goeters, Blakely, & Poling, 1992; Urcuioli, 2005). Goeters et al. (1992) boldly stated that it does not matter if we know why it works—the fact that it works is reason enough to use it. However, Chong and Carr (2010) noted that the procedure is consistently successful with animals but has mixed results in human populations.

Conclusion

The 21 strategies included in this article are not meant to be an exhaustive list of alternative strategies for teaching receptive language. For example, many of the strategies can be used together to create additional alternatives. By the time all options have been exhausted, there are literally hundreds of different potential combinations. Also, other alternatives have been suggested, such as teaching expressive language in the form of tacts or mands first (Pelios & Sucharzewski, 2004) or focusing on joint attention skills (Yoder, Watson, & Lambert, 2015). While research continues to assess each strategy, this list is meant to serve as an additional resource upon which behavior analysts can continue to build based on current research and practice, conceptually systematic rationale, and individual child profiles.

The strategies are also not meant to replace the general guidelines of Grow and LeBlanc (2013). Many of the suggested strategies require additional research. Many of the strategies violate at least one of the general guidelines. But the fact remains that some children with autism continue to demonstrate difficulty with receptive labeling when general guidelines are followed, and some children with autism do make progress with the aforementioned strategies. A cursory review of the research studies in this article that included data on the number of sessions to mastery indicated that children acquired an initial discrimination in receptive labeling for the first two to three items, typically within nine to 14 sessions. That is around 2 weeks in most EIBI programs. If something is not working, what assessments are behavioral practitioners conducting, and what changes are occurring? A child who cannot learn in the way we teach is depending on us to find a way to help him or her learn.

In our rush to find what will work, we must remain careful not to just find what is different. At a minimum, when a child is demonstrating difficulties gaining receptive language, behavior analysts should critically review the progress of a learner and make ongoing changes to standard programming based on data. Insights from applied behavior analysis will grow most rapidly and accurately when there is a symbiotic relationship between behavior analysts in research and behavior analysts in practice. In research, rigorous, narrowly controlled experiments test the validity of what we do. Research keeps us grounded in evidence-based practice. But it is impossible to study all of the decisions behavior analysts make on a daily, weekly, and monthly basis. In fact, some of those decisions become the spark for future research. When working with children with autism, insights from research without insights from practice become lethargic. Insights from practice without insights from research become impulsive. Insights from research and practice together become transformative. Behavioral practitioners hold themselves accountable to that ideal. Behavioral practitioners never settle.