Individuals with autism spectrum disorder (ASD) typically have deficits in verbal and nonverbal communication, social interactions, and engagement in play activities (American Psychiatric Association, 2013; Centers for Disease Control and Prevention, 2015). In neurotypical individuals, vocal behavior development begins with babbling, the utterance of different syllables (Petursdottir & Lepper, 2015). According to Lovaas (2003), once babbling occurs at high rates, it can be brought under the control of antecedent stimuli (e.g., therapist vocalizations), prompted, and shaped into verbal operants (e.g., echoics, mands, tacts). Some children with ASD, however, do not readily acquire babbling, speaking, or the use of gestures (American Psychiatric Association, 2013; Centers for Disease Control and Prevention, 2015). Furthermore, many children diagnosed with ASD lack functional vocalizations, have limited echoic skills, or both (Williams White et al., 2007). Research has shown that establishing an echoic repertoire in people who lack vocal-verbal behavior can be difficult (Drash et al., 1999; Koegel et al., 1988). Therefore, additional research on procedures that may facilitate the acquisition of or increase vocalizations that can subsequently be shaped into echoic responding is imperative.

According to Goldstein et al. (2009), typically developing infant vocalizations are generally maintained by positive social reinforcement. However, babbling may also be maintained and shaped through automatic reinforcement (Palmer, 1996; Skinner, 1957). More specifically, speech sounds may acquire reinforcing properties through naturally occurring pairings with established reinforcers such that when the child produces sounds that resemble these conditioned auditory stimuli, the auditory products of the sounds emitted by the child serve as reinforcers for the emission of these sounds. The auditory product of these vocalizations increases the probability of the child emitting these sounds again in the future. Given the plausible role of automatic reinforcement in the acquisition of vocal behavior, one potential way to facilitate the acquisition of vocalizations in children with ASD is to condition vocalizations as reinforcers so the emission of these vocalizations may result in automatic reinforcement.

Previous research has evaluated methods to condition vocalizations as reinforcers (Esch et al., 2009; Lepper et al., 2013; Miguel et al., 2001; Petursdottir & Lepper, 2015). Two such procedures are stimulus–stimulus pairing (SSP; Esch et al., 2009) and response-contingent pairing (RCP; Dozier et al., 2012). SSP involves pairing a neutral stimulus (e.g., vocalization) with an established reinforcer, independent of a response from the participant. During each SSP trial, the researcher delivers a reinforcing stimulus while presenting the target vocalization a specific number of times (e.g., Esch et al., 2009), and the delivery of the reinforcer is not contingent on the participant emitting the target response. In fact, the emission of the target response by the participant may result in a delay in reinforcer delivery to prevent adventitious reinforcement (e.g., Miguel et al., 2001). In some studies, SSP increased vocalizations of all participants (e.g., Barry et al., 2019; Esch et al., 2009); however, in others, SSP was effective for only some of the participants (e.g., Carroll & Klatt, 2008; Miguel et al., 2001; Stock et al., 2008; Yoon & Feliciano, 2007). Overall, the literature suggests that SSP does not reliably increase the vocalizations of children with ASD (Petursdottir et al., 2011). Similar to SSP, RCP involves the pairing of a neutral stimulus (e.g., vocalization) with an established reinforcer. However, in RCP, the reinforcer and neutral stimulus are delivered contingent on the participant emitting a specified response (e.g., disk sorting; Dozier et al., 2012). The conditioning effect of this procedure is determined by comparing the rate of the target response before and after pairing sessions. If there is an increase in the rate of the target response during the postpairing sessions, then a conditioning effect has occurred. For example, Dozier et al. (2012) assessed whether RCP established praise as a conditioned reinforcer for adults with disabilities by pairing novel praise statements with an established reinforcer on a fixed schedule. The results indicated that RCP was an effective conditioning procedure, but only for 50% of the participants. Moreover, Lepper and Petursdottir (2017) compared the effects of RCP and SSP on the rate of vocalizations of children diagnosed with ASD. Results indicated that RCP produced a higher rate of the target vocalizations, as compared to SSP, across all participants. Additionally, during Phase 2 of the study, they evaluated the effects of RCP on the rate of sounds previously assigned to the SSP condition. In this phase, RCP led to an immediate increase in these vocalizations across all participants.

Another procedure for conditioning stimuli as reinforcers is observational conditioning (OC), which is a type of observational learning (Greer & Singer-Dudek, 2008). Observational learning is the process of acquiring a new skill, or set of skills, as a result of observing another person contacting the contingencies of reinforcement or punishment for engaging in these responses (Greer et al., 2006). Thus, OC occurs when a participant observes a model come in contact with a stimulus for the emission of an arbitrary response (e.g., Greer et al., 2008; Singer-Dudek et al., 2018). Greer and Singer-Dudek (2008) first employed the OC procedure to establish plastic disks and strings as reinforcers for five children diagnosed with mild to moderate language delays. The experimenters assessed the effects of the OC procedure across performance tasks (i.e., previously learned tasks) in a pre- and postintervention reversal design and learning tasks (i.e., response acquisition) in a pre- and postintervention assessment. Following the preintervention phases, they implemented the OC procedure. During conditioning sessions, a peer confederate sat at a table next to the target participant, across from the experimenter. A partition board was placed on the table so the target participant could not see the confederate’s correct or incorrect responses, but could see the experimenter’s delivery of the plastic disk to the confederate. At the start of each trial, the experimenter simultaneously prompted the target participant and confederate to engage in the target response. The confederate received one plastic disk contingent on correct responses, and there were no programmed consequences for the target participant’s responses. That is, the target participant did not receive the plastic disks for correct responses, but instead observed the confederate’s receipt of the stimulus. Following conditioning, the experimenters implemented the postintervention performance and learning tasks. 
Overall, the OC procedure was effective at conditioning plastic disks or strings as reinforcers for all five participants. Furthermore, correct responses increased following conditioning in both the performance and learning tasks. In previous research, the OC procedure has also been effective in establishing other neutral stimuli as conditioned reinforcers, such as books (Singer-Dudek et al., 2011) and praise (Greer et al., 2011). However, the effects of OC on conditioning vocalizations as reinforcers have not been examined.

The aforementioned research suggests that RCP and OC may be effective procedures for establishing neutral stimuli as conditioned reinforcers; however, the efficacy of these procedures varied across studies (e.g., Dozier et al., 2012; Lepper & Petursdottir, 2017), populations (e.g., Axe & Laprime, 2017), and stimuli (e.g., Rodriguez & Gutierrez, 2017). Furthermore, although RCP is somewhat effective at conditioning vocalizations as reinforcers, it appears OC has not yet been used to condition participants’ vocalizations as reinforcers. Therefore, the purpose of the current study was to determine whether RCP and OC were effective in conditioning vocalizations and to assess and compare their effects on the overall rate of vocalizations of children with ASD.

Method

Participants and Setting

Participants were three children diagnosed with ASD—Thomas, Arthur, and Mozart—who emitted infrequent vocalizations (i.e., 10 or fewer instances of utterances) as determined by direct observations completed during the preassessments. We recruited participants from a local early intervention clinic through flyers and word of mouth. After attaining informed consent, experimenters asked each caregiver to complete the participant screening questionnaire, which consisted of a modified version of the Behavior Language Assessment (Sundberg & Partington, 1998). The questionnaire included questions about the child’s medical diagnosis, vocal repertoire, ability to follow instructions and imitate actions, and disruptive behaviors emitted by the child. If the questionnaire responses indicated that the child had limited vocal skills but did not engage in severe problem behavior, then preassessments were conducted to directly assess the child’s skills repertoire and determine whether the child met participation criteria.

Thomas was a 5-year-old male, who, according to his caregiver, was learning to communicate using American Sign Language (ASL) and through a speech-generating application, Proloquo2Go. Thomas communicated his wants and needs to the experimenter using basic ASL. Arthur was a 10-year-old male whose primary mode of communication consisted of vocal approximations. Finally, Mozart was a 9-year-old male who communicated using vocal approximations. At the time of their participation in this study, Thomas and Arthur were receiving applied behavior analysis services, whereas Mozart was receiving speech and occupational therapy services.

The experimenters conducted sessions in each participant’s home, in a quiet area, which included at least one table and three chairs. During most sessions, the participant and experimenter were the only individuals present in the room; however, because Arthur’s sessions were completed in the family room, occasionally his parents were also present. In these cases, we requested that his parents not interact with Arthur during sessions. Sessions were conducted 1 to 2 days per week, dependent on participant availability.

Materials

The materials differed across phases and conditions. The Observational Learning Prerequisite Assessment (OLPA) included toys (i.e., blocks, trains, puzzles, and stacking cups), 2D identical matching pictures approximately 10 cm in diameter, and edibles. The paired-stimulus preference assessments included edibles and colored circles. The Early Echoic Skills Assessment (EESA; Esch, 2008) and structured observations included edibles and toys. Sessions of OC, RCP, the reinforcer assessments, and the social validity assessment included arbitrary task materials (i.e., matching 2D pictures of shapes, button pressing, stacking cups, matching colors, 2D identical animals, and colored circles, approximately 10 cm in diameter, used for target touching). Finally, the OC sessions included a partition wall to prevent the participants from viewing the confederate’s responding.

Response Measurement

Trained observers collected data using the Countee (Peić & Hernández, 2016) application and corresponding data collection sheets. The trained observers were graduate students in an applied behavior analysis program. The primary dependent variables were the frequency of a free-operant response during the reinforcer assessments and vocalizations during the conditioning sessions. The free-operant response for all participants consisted of target touching, and this response was selected because it was not associated with a history of reinforcement. We defined target touching as physical contact between the participant’s open palm, an isolated finger, or multiple fingers and a colored circle taped to the table in front of the participant. Observers collected data on the frequency of free-operant responses, which were later converted to a rate by dividing the frequency of responses by the session duration. Additionally, the proportional change in responding from extinction to reinforcement (i.e., proportion from extinction) was calculated for each session by dividing the frequency of responding toward the stimuli associated with reinforcement by the frequency of responding toward the stimuli associated with extinction.
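The two calculations described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' analysis code: the function names are hypothetical, the rate is expressed per minute on the assumption that session duration is recorded in minutes, and returning `None` when no responses were made toward the extinction stimulus is an assumption, because the text does not say how that case was handled.

```python
def response_rate(frequency, session_minutes):
    """Responses per minute: frequency divided by session duration."""
    if session_minutes <= 0:
        raise ValueError("session duration must be positive")
    return frequency / session_minutes


def proportion_from_extinction(reinforcement_count, extinction_count):
    """Responses toward the reinforcement stimulus divided by responses
    toward the extinction stimulus, as described in the text."""
    if extinction_count == 0:
        # Assumption: the ratio is undefined here; the source does not
        # state how sessions with zero extinction responses were treated.
        return None
    return reinforcement_count / extinction_count


# Hypothetical session: 15 target touches in a 5-min session,
# 12 toward the reinforcement circle and 3 toward the extinction circle.
rate = response_rate(15, 5)                 # 3.0 responses per minute
ratio = proportion_from_extinction(12, 3)   # 4.0
```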

Observers also collected data on vocalizations emitted by the participant during the structured observations, echoic assessments, conditioning sessions, and reinforcer assessments. During the structured observations, observers collected frequency data on all vocalizations (i.e., separate utterances of sounds, words, or approximations of sentences) emitted by the participant. During the EESA and the brief echoic assessment, observers recorded participants’ echoics. We defined echoic responses as the emission of a sound with point-to-point correspondence and formal similarity to the experimenter’s sound that occurred within 5 s of the onset of the trial. Based on the results of these assessments, we assigned a target vocalization that was in the participant’s repertoire but emitted at a low rate to the RCP, OC, and control conditions. During the reinforcer assessment and conditioning sessions, observers collected data on the emission of the target vocalization and other vocalizations. Any vocalization that resembled an English sound (i.e., vowel, single- or multisyllable words), other than the target vocalization, was recorded as an instance of other vocalizations. We calculated the rates by dividing the frequency of each type of vocalization by the session duration.

Finally, during the color preference assessment and social validity assessment, observers collected data on stimulus selection. We defined stimulus selection as pointing to, touching, or grabbing one of the presented stimuli within 5 s of the onset of the trial. Data were converted to a percentage of opportunities with stimulus selection by dividing the number of times each stimulus was selected by the total number of times that that stimulus was available, and multiplying by 100.
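As a worked illustration of the selection measure, the sketch below tallies how often each stimulus was available and how often it was selected, then converts the result to a percentage. The data structure (a list of presented-array and choice pairs), the function name, and the example colors are assumptions made for illustration.

```python
from collections import Counter


def selection_percentages(trials):
    """trials: list of (presented, choice) pairs, where `presented` is a
    tuple of stimuli and `choice` is the selected stimulus, or None if no
    selection occurred within the response window."""
    available = Counter()
    selected = Counter()
    for presented, choice in trials:
        available.update(presented)
        if choice is not None:
            if choice not in presented:
                raise ValueError("choice must be one of the presented stimuli")
            selected[choice] += 1
    # Percentage of opportunities on which each stimulus was selected.
    return {s: 100 * selected[s] / available[s] for s in available}


# Hypothetical paired-stimulus trials.
example_trials = [
    (("red", "blue"), "red"),
    (("red", "green"), "red"),
    (("blue", "green"), "green"),
    (("red", "blue"), None),   # no selection within 5 s
]
percentages = selection_percentages(example_trials)
```

In this hypothetical data set, green was available twice and selected once, so its value is 50%.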

Interobserver Agreement and Procedural Integrity

Interobserver agreement was calculated for 67% of sessions for Thomas, 96% of sessions for Arthur, and 93% of sessions for Mozart. Interobserver agreement for the OLPA, structured observation, preference assessments, EESA, brief echoic assessment, and social validity was calculated on a trial-by-trial basis by dividing the number of agreements by the number of agreements plus disagreements and converting the result to a percentage. Mean interobserver agreement across participants for the OLPA, the structured observation, the color preference assessment, and the social validity assessment was 100%. For the structured observation, interobserver agreement was calculated only for the listener-responding trials. Mean interobserver agreement across participants was 96.5% (range 93%–100%) for the edible stimulus preference assessment, 98% (range 98%–100%) for the EESA, and 97.2% (range 91.6%–100%) for the brief echoic assessment.
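The trial-by-trial agreement formula can be expressed as a short function. This is a minimal sketch with hypothetical names and data; it assumes each observer's record is a list with one entry per trial, in the same trial order.

```python
def trial_by_trial_ioa(primary, secondary):
    """Agreements / (agreements + disagreements), as a percentage."""
    if len(primary) != len(secondary) or not primary:
        raise ValueError("records must be non-empty and of equal length")
    agreements = sum(a == b for a, b in zip(primary, secondary))
    disagreements = len(primary) - agreements
    return 100 * agreements / (agreements + disagreements)


# Hypothetical 10-trial session scored 1 (correct) or 0 (incorrect);
# the observers disagree on Trial 5 only.
observer_1 = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
observer_2 = [1, 1, 0, 1, 1, 1, 1, 1, 0, 1]
ioa = trial_by_trial_ioa(observer_1, observer_2)  # 90.0
```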

Interobserver agreement for the reinforcer assessments and the conditioning sessions was calculated using proportional agreement through the Countee application’s website. To calculate proportional interobserver agreement, the total observation period was divided into 10-s intervals. Agreement was calculated by dividing the smaller number of responses by the larger number of responses within each interval to create a ratio. Then, the ratios were summed, divided by the total number of intervals, and multiplied by 100 to yield a percentage. Mean interobserver agreement across participants for the reinforcer assessments was 95.6% (range 88.3%–100%). Mean interobserver agreement for the conditioning sessions was 96% (range 92%–100%) for OC and 97% (range 90%–100%) for RCP.
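The proportional agreement steps above can be sketched as follows. This is not the Countee implementation; it is an illustrative reconstruction from the description in the text. The function name and the timestamp-based input are assumptions, as is the treatment of intervals in which neither observer recorded a response, which are scored here as full agreement.

```python
import math


def proportional_ioa(primary_times, secondary_times, session_seconds,
                     interval_seconds=10):
    """Mean per-interval ratio (smaller count / larger count) x 100."""
    if session_seconds <= 0 or interval_seconds <= 0:
        raise ValueError("durations must be positive")
    n_intervals = math.ceil(session_seconds / interval_seconds)

    def counts(times):
        bins = [0] * n_intervals
        for t in times:
            if not 0 <= t <= session_seconds:
                raise ValueError("timestamp outside the session")
            # A response stamped exactly at session end goes in the last bin.
            index = min(int(t // interval_seconds), n_intervals - 1)
            bins[index] += 1
        return bins

    ratios = []
    for a, b in zip(counts(primary_times), counts(secondary_times)):
        if a == 0 and b == 0:
            ratios.append(1.0)  # assumption: empty intervals count as agreement
        else:
            ratios.append(min(a, b) / max(a, b))
    return 100 * sum(ratios) / n_intervals


# Hypothetical 40-s observation with response times in seconds.
# Interval ratios are 1/2, 1/2, 2/2, and 1.0 (both empty): mean 0.75.
ioa = proportional_ioa([1, 4, 12, 25, 27], [2, 13, 14, 26, 28], 40)
```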

Procedural integrity was calculated for the preference assessment, conditioning sessions, and reinforcer assessments for Thomas, Arthur, and Mozart for 82%, 93%, and 81% of all sessions, respectively. Procedural integrity data were collected using checklists that described the steps to complete during each assessment/session. For example, items included in the procedural integrity checklist were (a) presenting a pair of items (i.e., paired-stimulus preference assessment), (b) presenting the target sound (i.e., echoic assessments), (c) presenting the target sound following each target touch (i.e., reinforcer assessment), (d) presenting the RCP sound five times and delivering the reinforcer simultaneously with the fifth presentation of the RCP target sound (i.e., RCP conditioning sessions), and (e) presenting the OC sound five times to the confederate contingent on the emission of a correct response and presenting identical task materials to the participant and confederate (i.e., OC conditioning sessions). The mean procedural integrity score across participants was 96.6% (range 90%–100%) for the edible and color preference assessments, 98% (range 90%–100%) for the reinforcer assessments completed prior to the conditioning phase, 96% (range 92%–100%) for the OC conditioning sessions, and 96% (range 92%–100%) for the RCP conditioning sessions. The mean procedural integrity score for the brief echoic assessment, the color preference assessment, and the reinforcer assessments completed during and after conditioning was 100%.

Experimental Design

During the conditioning phase, an adapted alternating-treatments design was used to compare target responding during the RCP and OC conditions and the control condition. Pre- and postconditioning reinforcer assessments were conducted using a multielement design to determine whether either procedure was effective in conditioning the participants’ vocalizations as reinforcers.

Preassessments

Before the conditioning evaluation, a series of preassessments was completed with each participant to directly evaluate the participant’s skills repertoire and identify appropriate target vocalizations. These included an OLPA, the EESA, structured observations, and a brief echoic assessment.

OLPA

The OLPA assessed four critical skills for observational learning—attending to a model, imitation, delayed imitation, and consequence discrimination (MacDonald & Ahearn, 2015). Each of these skills was assessed in one session consisting of 10 trials, and no consequences were provided for performance during these trials; however, a preferred edible was delivered every two to three trials for appropriate session behavior (e.g., sitting at the table). During trials for attending to a model, the experimenter attempted to gain the participant’s attention by stating “Watch me.” The participant’s target behavior during these trials consisted of orienting their head toward the experimenter and making brief eye contact (1 s) within 5 s of the onset of the trial. During imitation trials, the participants observed the experimenter model a one-step action while stating “Do this.” The experimenter recorded whether the participant imitated the action within 5 s of the experimenter stating “Do this.” During delayed-imitation trials, the experimenter modeled a one-step action but did not allow the participant to imitate the action until 5 s had elapsed after the model. After 5 s, the experimenter stated, “Now it’s your turn.” The experimenter recorded whether the participant imitated the action previously shown. To assess the participants’ consequence-discrimination skills, the experimenter first modeled two specific responses that required task materials (e.g., identifying the color of a blue or red train for Thomas): one associated with a positive consequence (i.e., edibles plus praise) and another with a neutral consequence (e.g., book for Thomas). The consequences were selected based on caregivers’ reports of participants’ preferences. 
During the subsequent consequence-discrimination trials, participants were given 5 s to choose between the tasks (i.e., identifying the color of the blue or red train for Thomas) that had been previously followed by positive or neutral consequences. During the OLPA, Thomas scored 100% on attending to a model and 90% on imitation, delayed imitation, and consequence discrimination. Arthur scored 100% on attending to a model, delayed imitation, and consequence discrimination and 90% on imitation. Finally, Mozart scored 100% on consequence discrimination and 90% on attending to a model, imitation, and delayed imitation.

EESA

The EESA was completed to directly assess each participant’s echoic repertoire. During this assessment, the experimenter presented a vocal model of each of the target sounds from the EESA, starting with sounds from Group 1 (one- to two-syllable sounds) and ending with sounds from Group 5 (testing prosody). Each sound was presented up to three times, and the participant’s response was assigned a score of 0 (incorrect), 0.5 (recognizable), or 1 (correct) based on the participant’s best response out of the three opportunities. In cases where the participant echoed the modeled response during the first trial, the sound was not presented again. In addition, if the participant received a 0 on three consecutive sounds, the assessment was terminated, and the scores for all previously presented sounds were totaled. Thomas’s overall EESA score was 3, and he echoed only one-syllable vowel and consonant sounds. Arthur’s overall EESA score was 23, and he correctly echoed 13 one-syllable sounds and words and 5 two-syllable words. He also emitted approximations of 9 other sounds. Mozart’s overall EESA score was 18.5, and he correctly echoed 10 one-syllable sounds (i.e., vowels and consonants) and words and 4 two-syllable words and sounds, and he emitted approximations of 8 one-syllable and 1 two-syllable combination sounds. The EESA scores of all three participants indicated an echoic repertoire in the 0- to 18-month-old range.
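The scoring and termination rules described above can be sketched as follows. The function names are hypothetical and this is not an official EESA scoring tool. The example uses the counts reported for Mozart (14 correct echoics and 9 approximations); the order in which the sounds were scored is an assumption, since only the totals are reported.

```python
VALID_SCORES = (0, 0.5, 1)


def best_score(attempt_scores):
    """Best response across up to three presentations of one sound."""
    if not 1 <= len(attempt_scores) <= 3:
        raise ValueError("each sound is presented one to three times")
    if any(s not in VALID_SCORES for s in attempt_scores):
        raise ValueError("scores must be 0, 0.5, or 1")
    return max(attempt_scores)


def total_eesa_score(best_scores):
    """Sum per-sound scores in presentation order, stopping once three
    consecutive sounds have been scored 0."""
    total = 0.0
    consecutive_zeros = 0
    for score in best_scores:
        if score not in VALID_SCORES:
            raise ValueError("scores must be 0, 0.5, or 1")
        total += score
        consecutive_zeros = consecutive_zeros + 1 if score == 0 else 0
        if consecutive_zeros == 3:
            break  # assessment terminated; later sounds are not presented
    return total


# Counts reported for Mozart; the ordering here is hypothetical.
mozart_scores = [1] * 14 + [0.5] * 9
mozart_total = total_eesa_score(mozart_scores)  # 18.5
```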

Structured Observations

Two 10-min sessions were conducted to directly assess each participant’s listener responding, as well as to identify potential target vocalizations to be used in the study. During the first observation, the experimenter and participant were seated at a table, and listener responding was assessed. The participant did not have access to play items. The experimenter vocally presented 10 simple instructions (e.g., “Clap your hands”) and allowed 5 s for the participant to respond. Thomas completed 90% of the listener-responding tasks correctly, Arthur 80%, and Mozart 100%. The second observation occurred during free play. The participants had access to various toys, and the experimenter responded to participant-initiated interactions. Data were recorded on all vocalizations emitted by the participant during both observations. Across both observations, Thomas emitted 7 different utterances but repeated 1 of them (8 total). Arthur emitted 4 different utterances (5 total), and Mozart emitted 7 different utterances (10 total). These vocalizations were then included in the brief echoic assessment to identify appropriate targets for the conditioning evaluation.

Brief Echoic Assessment

Sounds the participants emitted in the structured observation were included in a brief echoic assessment that was completed prior to and following the conditioning evaluation. If during the structured observation the participant emitted a vocalization that consisted of a combination of sounds (e.g., “push me”), during the echoic assessment we presented these sounds together (e.g., “push me”) and in isolation (e.g., “push” in a trial and “me” in another trial). In addition, if we did not identify at least three sounds for the conditioning assessment, additional sounds emitted by the participant during other assessments (e.g., preference assessments) were added to the brief echoic assessment. Thus, we evaluated Thomas’s, Arthur’s, and Mozart’s ability to echo a mean of 7, 29, and 28 different vocalizations, respectively.

The initial assessment was completed to determine each participant’s ability to echo potential target sounds, whereas the second assessment evaluated whether exposure to the conditioning procedures had an impact on the participants’ echoics (see Figure 1). During this assessment, each sound was presented 10 times, and the participant’s response was recorded verbatim. Each session consisted of 10 rapidly alternated trials. During each trial, the experimenter said “Say . . .” and then emitted a targeted sound or word. The participant was allotted 5 s to echo the sound. No consequences were provided for correct or incorrect responding, but a preferred edible or tangible item was provided every two to three trials for appropriate session behavior. Once all sounds and words were assessed, we calculated the percentage of trials with correct responding for each sound. Sounds that participants emitted correctly in fewer than 10% of the trials were chosen as targets for the participants. Arthur’s selected targets were “it” for OC, “up” for RCP, and “go” for control. Mozart’s selected targets were “bread” for OC, “help” for RCP, and “door” for control. Finally, Thomas’s target sounds were “mm” for OC, “woo” for RCP, and “bee” for control.
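The target-selection criterion can be illustrated with a short sketch. The function names, the data structure, and the placeholder sounds are hypothetical. With 10 trials per sound, a criterion of fewer than 10% correct means that only sounds with no correct echoics qualify.

```python
def percent_correct(trial_outcomes):
    """Percentage of trials scored as a correct echoic."""
    if not trial_outcomes:
        raise ValueError("at least one trial is required")
    return 100 * sum(bool(o) for o in trial_outcomes) / len(trial_outcomes)


def candidate_targets(assessment, criterion=10.0):
    """Sounds echoed correctly on fewer than `criterion` percent of trials."""
    return [sound for sound, outcomes in assessment.items()
            if percent_correct(outcomes) < criterion]


# Hypothetical 10-trial results for three placeholder sounds.
assessment = {
    "sound_a": [False] * 10,               # 0% correct: qualifies
    "sound_b": [True] + [False] * 9,       # 10% correct: does not qualify
    "sound_c": [True] * 7 + [False] * 3,   # 70% correct: does not qualify
}
targets = candidate_targets(assessment)  # ["sound_a"]
```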

Fig. 1. Results of the Pre- and Postconditioning Brief Echoic Assessments. Note. OC = observational conditioning; RCP = response-contingent pairing; CRT = control.

Conditioning Evaluation

We evaluated the effects of RCP and OC via reinforcer assessments completed preconditioning, during conditioning, and postconditioning to determine whether the target vocalizations assigned to the conditioning procedures acquired reinforcing properties. A vocalization was randomly assigned to the RCP, OC, or control condition. The sound assigned to the control condition was included in the reinforcer assessments but was not exposed to any conditioning procedures. In addition, we recorded the frequency of the target and other vocalizations emitted by the participants throughout the conditioning sessions.

Reinforcer Assessments

Reinforcer assessments were conducted in the same manner throughout the study. However, during the conditioning phase, each reinforcer assessment session was preceded by five consecutive sessions of the corresponding RCP or OC conditioning procedure. For Arthur only, due to an extended gap between the conditioning sessions and the postconditioning reinforcer assessment sessions, the experimenter conducted a booster conditioning session with three trials of the OC condition immediately before the first and fourth postconditioning reinforcer assessment sessions for the OC condition.

All reinforcer assessment sessions were 5 min in duration with three forced-exposure trials implemented for each response option before the start of a session. Reinforcer assessments were conducted in a concurrent-operant arrangement that included two sets of identical task stimuli (colored circles), only differing in color and the consequence provided for responses (i.e., reinforcement in the form of the associated vocalization from the OC, RCP, or control condition or extinction). Therefore, a different colored circle was assigned to each consequence. The colors of the circles were selected based on the results of a color preference assessment, completed using procedures similar to Heal et al. (2009). Colors were ranked from most to least preferred based on the assessment results, and the three colors in the middle of the hierarchy were selected for inclusion. Thomas’s selected colors were blue (OC), black (RCP), orange (control), and purple (extinction). Arthur’s selected colors were orange (OC), white (RCP), yellow (control), and black (extinction). Mozart’s selected colors were purple (OC), red (RCP), white (control), and black (extinction). The placement of two circles (e.g., RCP on the left, extinction on the right) remained constant throughout each session; however, placement was rotated (e.g., RCP on the right, extinction on the left) across each reinforcer assessment session for all participants. This was done to control for potential side bias.

At the start of each session, the experimenter told participants that they could do as much or as little work as they wanted, and then placed the two colored circles on the table—one associated with extinction and the other associated with the sound assigned to one of the conditions (i.e., OC, RCP, or control). Touching the colored circle associated with the RCP, OC, or control condition resulted in the experimenter’s emission of the sound assigned to that condition. The experimenter did not deliver any consequences for the participant touching the colored circle associated with extinction. The experimenter did not interact with the participant unless the participant engaged in off-task behavior (i.e., turning their body away from the experimenter, looking away from the table) for at least 10 consecutive seconds. If this occurred, the experimenter provided a vocal instruction (e.g., “Look this way”) to prompt the participant to face the table.

Conditioning Procedures

During the conditioning phase, sessions of the OC and RCP conditioning procedures were completed prior to the reinforcer assessments. An arbitrary mastered task that was similar in response effort (i.e., required a single motor response from the participant) was assigned to each conditioning procedure per participant. For Thomas and Arthur, their tasks were button pressing (RCP) and matching 2D nonidentical shapes (OC). Mozart’s tasks were button pressing (RCP) and matching colors with colored clothespins (OC). Mozart’s original OC task was stacking blocks, which was changed to matching colors with colored clothespins at OC Session 12 due to problem behavior associated with the task materials. Each session for both conditioning procedures consisted of 10 trials and lasted 3–5 min. After five consecutive sessions of the same conditioning procedure, a reinforcer assessment session was conducted.

Conditioning sessions were completed until visual inspection of graphs depicting responding during the reinforcer assessments indicated a reinforcing effect (i.e., higher responding than in preconditioning) for at least one of the conditioning procedures or until a maximum of eight reinforcer assessment sessions and 40 conditioning sessions were conducted per condition. Once the termination criteria were met for a condition, the corresponding conditioning sessions were no longer conducted, and the postconditioning reinforcer assessments were conducted for that specific condition.

OC

A confederate, the participant, and the primary experimenter were present. The participant sat next to the confederate at a table. An opaque partition was placed between the participant and the confederate on the table so that neither could view the other’s responses. However, the participant was able to see and hear the consequences (model of the vocalization) provided contingent on the confederate’s completion of the arbitrary task. At the start of each trial, the experimenter simultaneously prompted the participant and confederate to engage in the arbitrarily selected tasks (i.e., “Match”). For the target participant, no consequences were delivered for correct or incorrect responses. For the confederate, correct responses resulted in the immediate delivery of the target sound emitted by the experimenter five times with 1 s between presentations (e.g., “ba, ba, ba, ba, ba”). The confederate did not emit any incorrect responses. Additionally, edibles were provided for appropriate sitting behavior. Edibles were also given once for sitting at the table, between Trials 5–7, and at the end of the 10-trial session. To avoid directly reinforcing vocalizations and correct responding during OC, the experimenter waited 10 s to deliver the edible following the occurrence of a correct response or vocalization.

RCP

Only the participant and the experimenter were present. The participant sat across from the experimenter at the table. At the beginning of each trial, the experimenter placed the arbitrary task stimuli (i.e., a button) within reach of the participant on the table and waited for the participant to press the button. If the participant did not press the button within 5 s, the experimenter positioned the button directly in front of the participant but did not provide any further prompts. Button presses resulted in the experimenter immediately emitting the target sound five times with 1 s between presentations. The preferred edible was delivered simultaneously with the fifth presentation of the target sound. Following this pairing, the experimenter removed the task materials. There was a 15-s intertrial interval to allow for consumption of the edible. To prevent direct reinforcement of vocalizations, if the participant emitted the target sound prior to the scheduled delivery of the preferred edible, the experimenter delayed edible delivery by 10 s.

Social Validity

To assess the social validity of the procedures implemented during the conditioning phase, a concurrent-chains preference assessment was completed with each participant following procedures in Hanley (2010). During the concurrent-chains preference assessment, three colored cards (38 cm by 38 cm) were presented to the participant. These cards were the same color as the cards associated with the OC (i.e., blue for Thomas, orange for Arthur, purple for Mozart), RCP (i.e., black for Thomas, white for Arthur, red for Mozart), and extinction (i.e., purple for Thomas and black for Arthur and Mozart) response options in the reinforcer assessment.

Prior to completing choice trials, three forced exposures to each condition were completed. At the start of each choice trial, the three colored cards were presented on the table, and the experimenter instructed the participant to “pick one.” Contingent on a selection, the other cards were removed and the participant was exposed to the consequences associated with that card. For example, if the participant selected the OC card, then the participant was exposed to three trials of the OC condition, whereas selection of the RCP card resulted in exposure to three trials of the RCP condition. If the participant selected the extinction card, the participant and the experimenter sat at the table for 20 s and did not interact. Each selection trial was followed by a 1-min (Arthur) or 30-s (Thomas and Mozart) break. Choice trials were conducted until a maximum of 20 trials were completed. However, a response-restriction component was implemented if the participant selected the same option for three consecutive trials (Hanley et al., 2003). The conditioning procedure that was repeatedly chosen was removed from the choices for one trial and then returned to the array for subsequent trials. This procedure was implemented on Trials 8 and 15 with Thomas; Trials 4, 9, 13, and 20 with Arthur; and Trials 4, 9, and 13 with Mozart.
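The response-restriction rule can be illustrated with a minimal sketch. The option labels and function name are hypothetical, and two points that the description leaves open are treated here as assumptions: that the rule applies to any of the three options, and that a selection made on a restricted trial counts toward a later run of three.

```python
from typing import List, Sequence

# Hypothetical labels for the three response options in the choice array.
OPTIONS = ("OC", "RCP", "EXT")
RUN_LENGTH = 3  # consecutive identical selections that trigger restriction


def available_options(history: Sequence[str],
                      options: Sequence[str] = OPTIONS,
                      run_length: int = RUN_LENGTH) -> List[str]:
    """Return the options to present on the next choice trial.

    If the last `run_length` selections were all the same option, that
    option is withheld for the next trial only. Because the participant
    must then select something else, the run is broken and the option
    returns to the array on the following trial.
    """
    recent = list(history[-run_length:])
    if len(recent) == run_length and len(set(recent)) == 1:
        restricted = recent[-1]
        return [option for option in options if option != restricted]
    return list(options)
```

For example, after three consecutive OC selections the fourth trial would present only the RCP and extinction cards, and the full array would return on the fifth trial.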

Results

Figure 2 depicts Thomas’s rate of target touching toward the stimuli associated with the delivery of the OC (top), RCP (middle), and control (bottom) vocalizations (i.e., consequence) and extinction (i.e., no consequences). Thomas engaged in low levels of target touching during the preconditioning phases of the OC, RCP, and control conditions and during the conditioning phase of the control condition. During the conditioning phase, he engaged in marginally higher levels of target touching during the OC condition. In the RCP condition, a large increase in target touching was observed initially during conditioning, but rates decreased to the same level as those observed during preconditioning. Target touching remained stable in the control condition. Thomas engaged in higher levels of target touching during the postconditioning phase across the OC, RCP, and control conditions. However, during OC, target touching toward the stimuli associated with reinforcement increased initially but then decreased to levels similar to those observed in the preconditioning phase, whereas higher levels of target touching persisted in the RCP condition and higher levels of target touching associated with extinction persisted only in the control condition. Furthermore, rates of target vocalizations remained low across all phases and conditions, but rates of other vocalizations increased during the postconditioning phase of the OC, RCP, and control conditions.

Fig. 2

Thomas’s Responding During the Reinforcer Assessment Sessions. Note. TT. Ext. = target touching toward extinction component; TT. Con. = target touching toward consequence component; RCP = response-contingent pairing; OC = observational conditioning; rpm = responses per minute.

Figure 3 depicts Arthur’s rate of target touching toward the stimuli associated with the delivery of the OC (top), RCP (middle), and control (bottom) vocalizations (i.e., consequence) and extinction. Arthur rarely engaged in target touching during the preconditioning and conditioning phases of the OC, RCP, and control conditions. Target touching continued to occur at low levels during the postconditioning phase of the OC and control conditions. However, during the postconditioning phase for RCP, Arthur engaged in high levels of target touching toward the stimuli associated with reinforcement. Arthur also did not emit any vocalizations during the preconditioning phases of all three conditions. The rate of target vocalizations increased to high levels during the OC conditioning phase and remained high during the OC postconditioning phase. Arthur’s other vocalizations increased to low levels toward the end of the conditioning phase of both the OC and control conditions, and these occurred at low-to-moderate levels during the postconditioning sessions of the OC, RCP, and control conditions.

Fig. 3

Arthur’s Responding During the Reinforcer Assessment Sessions. Note. TT. Ext. = target touching toward extinction component; TT. Con. = target touching toward consequence component; RCP = response-contingent pairing; OC = observational conditioning; rpm = responses per minute.

Figure 4 depicts Mozart’s rate of target touching toward the stimuli associated with the delivery of the OC (top), RCP (middle), and control (bottom) vocalizations (i.e., consequence) and extinction. Mozart did not engage in target touching during any sessions of the preconditioning, conditioning, and postconditioning phases for either the OC, RCP, or control condition. In addition, Mozart rarely emitted the target vocalization across all phases and conditions. However, he emitted high rates of other vocalizations during the conditioning and postconditioning phases of the OC and control conditions and low rates of other vocalizations during the RCP condition.

Fig. 4

Mozart’s Responding During the Reinforcer Assessment Sessions. Note. TT. Ext. = target touching toward extinction component; TT. Con. = target touching toward consequence component; RCP = response-contingent pairing; OC = observational conditioning; rpm = responses per minute.

Figure 5 depicts the proportional change in target touching from extinction to reinforcement (i.e., consequence). Consistent with previous research (e.g., Ahearn et al., 2003; Nevin & Shahan, 2011; Sweeney et al., 2014), a change in responding from extinction to reinforcement greater than 1.0 was interpreted as an indicator of a change in response rate (i.e., reinforcement effect for the current study). These data indicate that during the OC and RCP postconditioning phases, Thomas emitted slightly more target touch responses toward the stimuli associated with reinforcement than the stimuli associated with extinction, indicating a reinforcement effect for these two conditions. However, a decreasing trend in responding was observed for both the OC and control conditions. Arthur engaged in more responding toward the stimuli associated with reinforcement than extinction during the RCP and OC postconditioning phases, indicating a reinforcement effect for RCP and OC. Mozart engaged in similar levels of target touching toward the stimuli associated with reinforcement and extinction in the OC, RCP, and control postconditioning phases. These data show a lack of a reinforcement effect for Mozart.

Fig. 5

Proportional Change in Responding (Target Touching) From Reinforcement (i.e., Consequence) to Extinction in Each of the Reinforcer Assessment Sessions. Note. Data points on the horizontal line at the value of 1.0 denote that the rate of responding during reinforcement (i.e., consequence) was identical to the rate of responding during extinction. RCP = response-contingent pairing; CRT = control.
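The proportional-change measure can be illustrated with a short sketch. It assumes the measure is the ratio of the response rate toward the consequence component to the response rate toward the extinction component, which is consistent with a value of 1.0 denoting identical rates. The function names are hypothetical, and returning no value when the extinction rate is zero is an assumption of this sketch, because the handling of that case is not described.

```python
from typing import Optional


def responses_per_minute(count: int, session_minutes: float) -> float:
    """Convert a response count to a rate in responses per minute (rpm)."""
    if session_minutes <= 0:
        raise ValueError("session_minutes must be positive")
    return count / session_minutes


def proportional_change(rate_consequence: float,
                        rate_extinction: float) -> Optional[float]:
    """Ratio of responding toward consequence relative to extinction.

    Returns None when the extinction rate is zero, because the ratio is
    undefined in that case (an assumption of this sketch).
    """
    if rate_extinction == 0:
        return None
    return rate_consequence / rate_extinction


def indicates_reinforcement_effect(ratio: Optional[float],
                                   threshold: float = 1.0) -> bool:
    """Apply the criterion that a ratio above 1.0 indicates an effect."""
    return ratio is not None and ratio > threshold
```

With hypothetical rates of 3.0 rpm toward the consequence component and 1.5 rpm toward the extinction component, the ratio is 2.0 and meets the criterion, whereas equal rates yield 1.0 and do not.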

Figure 1 depicts data from the brief echoic assessment completed before and after the conditioning evaluation. All participants responded correctly in 10% or fewer of trials in the preconditioning assessment. However, correct responding increased for all participants in the postconditioning assessment. Thomas initially scored 0% in all conditions preconditioning, whereas he scored 0%, 50%, and 50% in the OC, RCP, and control conditions, respectively, postconditioning. Arthur initially scored 10% in all conditions preconditioning, whereas he scored 70%, 60%, and 100% in the OC, RCP, and control conditions, respectively, postconditioning. Mozart initially scored 10% in all conditions preconditioning, whereas he scored 60%, 40%, and 50% in the OC, RCP, and control conditions, respectively, postconditioning. Results of these assessments indicate that echoic responding increased for eight out of the nine target sounds.

Figure 6 depicts the rate of target and other vocalizations emitted by each participant during the conditioning sessions. The rate of vocalizations increased for Thomas and Mozart, but Arthur rarely emitted vocalizations during these sessions. Thomas emitted a similar number of vocalizations in the OC (top left) and RCP (top right) conditions, although rates of the target response increased only in the OC condition. Arthur emitted a few instances of target and other vocalizations in the OC and RCP conditions, but these did not persist. Mozart engaged in low-to-zero rates of target vocalizations in the OC condition and variable rates during the RCP condition. Other vocalizations occurred at variable rates in the OC and RCP conditions; however, RCP produced higher rates of other vocalizations and target responses than OC.

Fig. 6

Target and Other Vocalizations During the Conditioning Sessions. Note. rpm = responses per minute.

Finally, during the social validity assessment, participants displayed a preference for different conditioning procedures (data available upon request). Thomas selected RCP in 50% of trials, extinction in 33% of trials, and OC in 17% of trials, suggesting a slight preference for RCP. Arthur selected OC in 60% of trials, RCP in 35% of trials, and extinction in 5% of trials, suggesting a preference for OC. Mozart selected extinction in 70% of trials, OC in 18% of trials, and RCP in 12% of trials, suggesting a preference for the extinction condition.

Discussion

This study evaluated the relative effects of RCP and OC procedures on the rate of vocalizations for three children with ASD. Additionally, this study assessed whether these procedures were effective in conditioning vocalizations as reinforcers. Both OC and RCP led to an increase in target vocalizations and target touching. However, responding varied across participants. Target touching increased from the pre- to postconditioning phase for OC and RCP, but only for Thomas and Arthur, suggesting these procedures were effective in conditioning vocalizations for these two participants. There was also an increase in the percentage of echoic responses for all participants in the echoic assessment. Postconditioning, all three participants engaged in increased percentages of echoic responses in the RCP and control conditions, and Arthur and Mozart also showed a higher percentage of echoic responses in the OC condition.

This study extends the previous literature on conditioning procedures in several ways. First, this appears to be the first study evaluating the effects of OC that assessed participants’ current repertoires to determine whether they had the skills necessary for observational learning (MacDonald & Ahearn, 2015). The inclusion of a similar prerequisite skills assessment in future studies will help determine individual characteristics that may be associated with the efficacy of conditioning procedures. Second, this study appears to be the first to evaluate the effects of OC with individuals diagnosed with ASD. In previous studies, participants included children with other health impairments (e.g., Greer et al., 2008), mild-to-moderate language or developmental delays (e.g., Singer-Dudek et al., 2011), or other disabilities (e.g., Greer et al., 2008). In addition, in the current study, RCP and OC were used to condition vocalizations as reinforcers, whereas in previous studies, the neutral stimuli included praise (Dozier et al., 2012; Greer et al., 2008), recorded voices (i.e., voices recorded reading stories for auditory feedback; Greer et al., 2011), books (Singer-Dudek et al., 2011), and plastic disks and strings (Greer & Singer-Dudek, 2008). Given that conditioning procedures are not always effective in establishing vocalizations as reinforcers, future research should consider evaluating these conditioning procedures with physical stimuli (e.g., books, toys) first to see if they are effective. Only then should these procedures be used to condition vocalizations. That is, a conditioning effect with physical stimuli such as toys should be a participation criterion in future studies so that individual characteristics that may be responsible for the lack of an effect when conditioning vocalizations as reinforcers can be ruled out. Furthermore, this sequence would allow future research to determine if conditioning failures are due to the intangible nature of vocalizations.

Procedures employed in the current study also differed from those in previous research. In the current study, confederates in the OC sessions were research assistants (i.e., college-aged individuals), whereas same-aged peers were included in previous research (Greer & Singer-Dudek, 2008; Greer et al., 2008). In these previous studies, OC was effective in conditioning praise delivered by familiar people for two participants (Greer et al., 2008) and conditioning disks and strings as reinforcers for all six participants (Greer & Singer-Dudek, 2008). Thus, it is plausible that characteristics of the confederate (e.g., age, gender) may correlate with the efficacy of OC, and this potential relation between characteristics of the confederate and the efficacy of OC should be evaluated in future research. Also, the current study included specific criteria for ending conditioning sessions (i.e., a reinforcing effect in one of the conditions during conditioning or a maximum of eight reinforcer assessment sessions during the conditioning phase), whereas Singer-Dudek et al. (2008) implemented conditioning until the individually determined termination criteria were met. Specifically, Singer-Dudek et al. (2008) terminated conditioning for one participant when correct responding decreased and nonvocal mands increased for two consecutive sessions. For another participant, they terminated conditioning when nonvocal mands increased across three consecutive sessions. Given the differing criteria for terminating conditioning, our study also differs from previous studies in regard to the number of conditioning trials that were conducted. For instance, Singer-Dudek et al. (2008) and Greer and Singer-Dudek (2008) exposed each participant to 90 to 300 conditioning trials, completed across 9 to 30 conditioning sessions. Conversely, our study implemented an average of 733 (400 trials of RCP and 200–400 trials of OC) conditioning trials per participant. Additionally, Lepper and Petursdottir (2017) implemented 20 randomized sound presentations per session that included 10 target and 10 nontarget sound presentations, whereas in our study, RCP sessions only consisted of 10 target-sound presentations per session. It is possible that the number of conditioning trials and the inclusion of nontarget sounds within conditioning sessions influence the effectiveness of the conditioning procedure.

Additional differences between the current study and previous research are the inclusion of a reinforcer assessment during the conditioning phase and the format of the control condition. In regard to the reinforcer assessments, previous studies completed these only before (preconditioning) and after (postconditioning) conditioning (Greer & Singer-Dudek, 2008; Greer et al., 2008; Singer-Dudek et al., 2008). The inclusion of these assessments during the conditioning phase allowed us to end the conditioning phase sooner for one of the participants. Also, the control response was only included in the reinforcer assessments, whereas previous studies presented the control vocalization during the RCP conditioning sessions (e.g., Lepper & Petursdottir, 2017), interspersed the control vocalization during pairing trials with the target vocalization (Barry et al., 2019), did not include control vocalizations (Carroll & Klatt, 2008), or simply measured control vocalizations (Esch et al., 2005). Results of our study suggest that the inclusion of control vocalizations in the reinforcer assessments may suffice to demonstrate experimental control. By including the control condition, we were able to compare the reinforcing effects of sounds that were (RCP and OC) and were not (control) exposed to conditioning procedures, which allowed us to identify whether extraneous variables might have influenced the reinforcing effects of the sounds, irrespective of the conditioning procedures, or if generalization of the effects occurred.

There are several limitations to the current study. First, because an adapted alternating-treatments design was used during the conditioning phase, it is possible that some of the results are due to a carryover effect. However, because each reinforcer assessment was preceded by a block of five conditioning sessions of a particular conditioning procedure, carryover effects are less likely. Another potential limitation is that we did not collect data on participants’ attending responses during the conditioning sessions. It is plausible that incorrect responding during the conditioning sessions and lower responding during the reinforcer assessment were due to participants not attending to the task, as opposed to a potential lack of a conditioning effect. Future researchers should record data on attending and consider waiting to present a conditioning trial until the participant is attending.

Increases in echoic responding may have resulted from repeated testing or maturation and not from the conditioning procedures. Though the echoic assessments were brief and did not include reinforcement of correct responding, these were completed twice (pre- and postconditioning phases). Also, echoic responding with the sound from the control condition and other vocalizations increased for all participants, which partially supports the hypothesis that variables other than the conditioning procedures may have been responsible for some of the outcomes of this study. However, it is also plausible that the participants’ echoic responding was generalizing to novel vocalizations (i.e., generalized echoic responding) and that the conditioning procedures conditioned vocalizations in general as reinforcers (i.e., auditory feedback produced by vocalizations) instead of the specific vocalizations included in the conditioning sessions. In addition, we did not stagger the number of preconditioning assessments across participants, which would have helped to ensure that increases in responding occurred only as a result of conditioning rather than following repeated exposure to the procedures. Additionally, we employed a response-restriction component during the concurrent-chains preference assessment. Thus, at least some of the participants’ responding was influenced by the number of options available during a given choice trial. Future research might consider conducting an entire assessment without the response-restriction component and then adding it, if necessary.

Another limitation is the potential aversive properties of the conditioning procedures due to the lack of reinforcement for correct responses. Anecdotally, OC conditioning sessions began to appear aversive for Thomas and Mozart. For instance, Thomas would repeatedly sign “potty” at the onset of the OC session, attempt to elope from the study room instead of going to the bathroom, and swipe the OC materials off the table. Mozart slid out of his chair and crawled under the work desk or engaged in other task-refusal behaviors during OC sessions. To minimize problem behavior occurring during the OC sessions, we delivered a preferred edible for appropriate session behavior. However, this modification was only used during OC. Future research should evaluate using an edible component with OC and RCP or other ways to minimize problem behavior in OC sessions. Furthermore, different tasks (i.e., button pressing for RCP, matching stimuli for OC) were used during the OC and RCP conditioning sessions. Although the tasks selected were mastered by the participants and similar in difficulty, this could be a confounding variable. Future research should counterbalance tasks assigned to the conditioning procedures across participants.

Furthermore, an extinction effect could explain the variability of correct responding when completing the arbitrary task in OC sessions (i.e., extinction-induced variability), as we did not reinforce participants’ correct responses during the conditioning sessions. Previous research has noted that OC may result in an extinction effect across conditioning sessions (Greer & Singer-Dudek, 2008; Singer-Dudek et al., 2008). For example, although responding increases initially, these responses either decrease or cease to occur due to a lack of reinforcement. In the current study, correct responding during OC was variable for two of the three participants and decreased for the third participant (Thomas). Future research should assess the effects of delivering a reinforcer for correct responding during the OC condition.

Finally, in the current study a change in responding from extinction to reinforcement greater than 1.0 was deemed an indicator of a reinforcer effect; however, the change in responding for Thomas was small relative to Arthur. Future research should consider additional indicators of reinforcer efficacy, such as break points attained during a progressive ratio reinforcer assessment (e.g., Roane et al., 2001).

Overall, the current study demonstrated that both RCP and OC are effective in increasing vocalizations for some children with ASD, and, in some cases, these procedures can establish vocalizations as conditioned reinforcers. In regard to clinical implications, neither procedure is time-consuming to implement. For example, conducting five conditioning sessions and one reinforcer assessment took an average of 20 min. Therefore, these interventions may be feasible and appropriate to conduct in clinical settings with individuals who have limited vocal repertoires or who do not vocalize.