Introduction

In 2015, the United States (US) Census Bureau (2015) estimated that approximately 21.5% of school-aged children spoke languages other than English at home. In the 2014–2015 school year, it was estimated that 9.7% of students with disabilities in the USA, between the ages of 6 and 21, were also limited English proficient (LEP; US Department of Education 2015). Previous research has shown similar patterns of language development for bilingual and monolingual children with disabilities and language delays (Hambly and Fombonne 2012; Ohashi et al. 2012; Reetzke et al. 2015; Valicenti-McDermott et al. 2013). Despite this evidence, instruction and support are often delivered only in English to students who come from bilingual homes (Cheatham and Ro 2010; Mueller et al. 2004; Paneque and Rodriguez 2009).

There is limited behavioral research on the impact of language of instruction on skill acquisition and challenging behavior for children with language delays. Lang et al. (2011) examined the effects of the language of instruction on correct responding and the presence of challenging behavior during discrete trial teaching (DTT) sessions for a 4-year-old girl with autism. The participant received most of her instruction in English at school, but her parents only spoke to her in Spanish at home. During DTT sessions, the therapist delivered task demands based on her current level of performance according to the school’s evaluation and in consultation with her family. Sessions were conducted during the participant’s regularly scheduled DTT period and consisted of 2–3 sessions per day that were approximately 15 min long. The same task demands, reinforcers, and prompting hierarchy were used throughout all phases of the study. The task demands included identification of common objects by pointing, manding specific preferred tangibles, and motor imitation. Within each trial of the DTT sessions, the teacher presented a discriminative stimulus (e.g., “Touch the red doll”), prompted the correct response as needed, delivered a programmed consequence, and paused for 1–10 s before beginning the next trial. Discriminative stimuli were presented at a rate of about 2 per min in both English and Spanish conditions. The order of trial presentation was randomized within sessions in both languages, but the specific trials were held constant across sessions and languages. The results of this study showed that the participant emitted more correct responses and fewer instances of challenging behavior when instruction was presented in her home language, Spanish.

In a similar study, Rispoli et al. (2011) examined levels of challenging behavior during a functional analysis conducted in English and Spanish for a 5-year-old girl with severe intellectual disability and cerebral palsy. The participant received instruction in both English and Spanish at school, but her parents only spoke in Spanish to her at home. Each session lasted 5 min and challenging behavior was assessed across four conditions: attention, play-verbal, play-nonverbal, and demand. The influence of language was examined by alternating phases in English (i.e., the implementer only spoke in English to the participant) and in Spanish (i.e., the implementer only spoke in Spanish to the participant). Results of this study showed higher levels of challenging behavior during functional analysis conditions when the experimenter spoke in English.

Collectively, results of these studies show that bilingual children with disabilities, including language delays, demonstrate lower levels of challenging behavior and greater response accuracy when instruction is provided in the language with which they are most familiar (in this case, Spanish). These findings also suggest that bilingual children with language delays may have a language preference, but preference was not directly evaluated by this group of researchers.

Preference for language of instruction and manding or obtaining reinforcement was evaluated in a study by Padilla Dalmau et al. (2011). The experimenters evaluated the effects of implementing functional communication training (FCT) in English and Spanish to decrease levels of challenging behavior and increase levels of communication and task completion with a 6-year-old boy and a 5-year-old girl with developmental disabilities who received instruction in English at school and were exposed to both English and Spanish at home. During FCT sessions in English, the participants received instruction only in English. During FCT sessions in Spanish, the participants received instruction only in Spanish.

Participants were taught to mand for reinforcement in English or Spanish by pressing one of two voice output microswitches after completing a required number of tasks per trial. The reinforcement period was delivered in the language selected and lasted 1–2 min with access to parent attention and preferred toys. The percentage of trials of language choice (i.e., reinforcement period in English or Spanish) was compared for each participant during FCT. Results of this study showed the intervention was effective in both languages and neither participant exhibited a language preference.

Given the growing prevalence of bilingual children with developmental disabilities, including children with language delays who live in the USA (US Department of Education 2015), it is important to identify and implement best practices to instruct and support this population. Currently, there is disagreement and inconclusive evidence on whether bilingual children with language delays should be instructed and supported using English, their home language, or both (Cummins 2009; Paneque and Rodriguez 2009). To our knowledge, the direct impact of teaching in multiple languages on the acquisition of verbal behavior in multiple languages has not yet been examined for bilingual children with developmental disabilities, including language delays.

Skill acquisition programs for children with developmental disabilities often include tact instruction. The tact was defined by Skinner (1957) as a verbal operant under the control of a specific object or event or property of an object or event (i.e., a nonverbal discriminative stimulus) that is reinforced by generalized conditioned reinforcement (pp. 81–82). To date, only one study (Lang et al. 2011) has examined the influence of language of instruction on the acquisition of tacts among bilingual children with developmental disabilities. Lang et al. (2011) provided instruction in one language in isolation (i.e., English or the home language) and so the effects of training in a bilingual format remain unknown.

The purpose of the present study was to expand upon the findings of Lang et al. (2011) by directly evaluating the effects of tact training when instruction was presented in a bilingual format (English and the home language, Portuguese) compared to instruction in English alone. Specifically, we sought to investigate skill acquisition during bilingual instruction as conducted in some bilingual educational settings, where the language of instruction and responding is varied momentarily during lessons (Creese and Blackledge 2010). We compared the number of trials to reach mastery criterion during tact training for two sets of stimuli: one taught only in English and the other taught in both English and Portuguese (the participant’s home language), and evaluated generalization and maintenance of the skills acquired for both training sets.

Method

Participant, Setting, and Materials

Paulo was 6 years and 8 months at the start of the study and had an educational diagnosis of communication impairment. His parents had expressed interest in their child learning more words in their home language, Portuguese. At the start of the study, Paulo received all of his academic instruction at school in English, but was exposed to and spoke some Portuguese at home with his family. He attended a partial inclusion classroom for most of the school day. A language exposure questionnaire was developed for the study based on the “Parent Interview Form” used by Padilla Dalmau (2012). This questionnaire was completed by Paulo’s parents in Portuguese in order to obtain information regarding his exposure to English and Portuguese. Paulo’s mother and his school speech-language pathologist (SLP) reported that he had a large vocabulary but difficulty with articulation of words in both languages. His mother reported that he first spoke in both languages, all of his academic instruction was conducted in English, he tended to perform better when academics were presented in English, and that he understood English and Portuguese at about the same level. Results of the questionnaire indicated that both of his parents spoke in Portuguese to him on a daily basis, but that he did not speak Portuguese well and did not read or write in Portuguese.

We assessed Paulo’s language abilities in English using the Verbal Behavior Milestones Assessment and Placement Program (VB-MAPP; Sundberg 2008), the Peabody Picture Vocabulary Test, Fourth Edition (PPVT-4), and the Receptive and Expressive One-Word Picture Vocabulary Tests (ROWPVT-4, EOWPVT-4). We assessed his language skills on four milestones of the VB-MAPP (tact, listener responding, intraverbal, and echoic), and he achieved the maximum score on all four: 5 out of 5 points on Level 3 for the tact, listener responding, and intraverbal domains, and 5 out of 5 points on Level 2 of the echoic domain. On the PPVT-4, his score fell in the 14th percentile, which placed him in the moderately low range and at an age equivalence of 5 years and 7 months. We also assessed his language skills in English with the bilingual edition of the ROWPVT-4 and EOWPVT-4. Results were lower on the receptive measure than on the expressive measure. On the ROWPVT-4, his score fell in the 23rd percentile and his age equivalence was 6 years. On the EOWPVT-4, his score fell below the 37th percentile and his age equivalence was 9 years and 11 months. We are unable to report a standardized measure for Paulo’s Portuguese proficiency because the three standardized language assessments included in the study were not available in Portuguese. Thus, we collected and report data on his Portuguese proficiency only via parental report, as indicated in the section above.

All initial sessions were conducted in a classroom workspace. When the school year ended, the experimenter continued to conduct sessions with Paulo in a quiet room in his home. Each training session included exposure to the two training conditions that were evaluated in the study. Sessions were conducted 3–4 days per week for 20–25 min. Data collection continued for a total of 15 weeks. During all sessions, the experimenter sat next to the participant at a table with two chairs and presented materials created in Microsoft PowerPoint on an Apple iPad. Two 3-stimulus sets were created based on results of pre-training (see Fig. 1). Word pairs were matched on the number of syllables in both languages (i.e., two to three syllables), and words that were too similar when translated were excluded (e.g., “jacket” and “jaqueta”). All stimuli are commonly found in children’s home and community environments. The stimuli were selected from a list of common targets, but were not based on any standardized assessments. We obtained pictures of target stimuli using an internet search engine and presented them on the iPad. Images were resized to 10.41 by 11.43 cm and presented on a yellow (for Portuguese) or blue (for English) background. An inter-trial interval (ITI) screen (i.e., a white background PowerPoint slide) was presented after each target stimulus and was preset to advance automatically after 2 s. We devised data sheets to record the participant’s responses, interobserver agreement (IOA), and procedural integrity (PI).

Fig. 1

Stimuli presented for Set A (top panel) and Set B (bottom panel). Stimuli presented in both English and Portuguese for Set A consisted of basket/cesto, suitcase/mala, and notebook/caderno, respectively. Stimuli presented in English for Set B consisted of oven, teapot, and ladle, respectively

Experimental Design

An adapted alternating treatments design (Sindelar et al. 1985) was used to evaluate training effects for the two training conditions. This experimental design involves the comparison of two methods, each associated with a unique set of instructional items that are equivalent and functionally independent. The two tact training conditions evaluated in this study were: (1) bilingual instruction, which consisted of training in both English and Portuguese; (2) English-only instruction, which consisted of training only in English. One set of three stimuli was assigned to the bilingual instruction condition (Set A) and another set of three stimuli was assigned to the English-only instruction (Set B), using quasi-random assignment. Pretest probes were conducted to ensure that tact responses for the stimuli to be used in the study were not within the participant’s repertoire in either language prior to the start of the study. Posttest probes were conducted to assess the acquisition of tact responses. The same task demands, reinforcers, and prompting hierarchy were used throughout the training phase for both conditions (see details below).

Dependent Variable and Response Measurement

The primary dependent variable was the number of correct tact responses emitted by the participant. A correct tact response was defined as an independent response in the language corresponding to the condition within 10 s following presentation of a visual and auditory discriminative stimulus. For example, following presentation of the instruction, “What is it?” and an image of a chair on the iPad, a correct response was scored if the participant said “Chair” within 10 s. If the participant named any one item correctly in the language that did not correspond to the condition, an additional probe was provided. For example, if the participant responded correctly in Portuguese following the instruction “What is it?” in English, the experimenter asked “What is it in English?” An incorrect response was defined as a vocalization that did not correspond to the name of the target stimulus, a response emitted more than 10 s after presentation of the discriminative stimulus, or any response in the language that did not correspond to the training condition, including prompted responses (e.g., the participant responding accurately following an echoic prompt such as “Chair” from the experimenter). For the purposes of data collection, only the first response given by the participant was included in the data analysis.
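The scoring rule described above can be sketched in code. This is a minimal illustration only, not the data sheet used in the study: the `FirstResponse` record, its field names, and the `score_trial` function are hypothetical, and the sketch assumes that a response at exactly 10 s still counts as falling within the window.

```python
from dataclasses import dataclass

RESPONSE_WINDOW_S = 10.0  # response window stated in the text


@dataclass
class FirstResponse:
    """First response on a trial (hypothetical record, for illustration)."""
    names_target: bool  # vocalization corresponds to the target's name
    language: str       # language the participant responded in
    latency_s: float    # seconds since the discriminative stimulus
    prompted: bool      # True if the response followed a model/echoic prompt


def score_trial(response: FirstResponse, condition_language: str) -> str:
    """Score one trial as "correct" or "incorrect".

    A response is correct only if it is independent, names the target,
    is in the language of the condition, and occurs within the window.
    """
    is_correct = (
        response.names_target
        and response.language == condition_language
        and response.latency_s <= RESPONSE_WINDOW_S  # assumption: inclusive
        and not response.prompted
    )
    return "correct" if is_correct else "incorrect"


# An accurate Portuguese response on an English trial is scored incorrect.
print(score_trial(FirstResponse(True, "Portuguese", 3.0, False), "English"))
```

Only the first response on a trial is passed to the function, which mirrors the statement that only the first response entered the data analysis.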

Procedures

The independent variable in this study was the language of instruction. That is, the experimenter presented instructions, modeled correct responses, and provided verbal praise in the language that corresponded to each condition. For example, if the language of instruction during tact training was English and the target stimulus was Chair, the experimenter presented the instruction, “What is it?” and provided a model prompt, “Chair.” If the language of instruction during tact training was Portuguese, the experimenter presented the instruction, “O que é?” and provided a model prompt, “Cadeira.”

Echoic Pre-training Probes

A flowchart of the experimental conditions is presented in Fig. 2. Echoic pre-training probes were first conducted to determine if the participant could articulate the words for all potential stimuli in both English and Portuguese. A potential list of stimuli was devised according to the criteria described above. The experimenter provided an echoic prompt (e.g., “Say ‘chair’”) for each of the identified stimuli. If the participant did not articulate the word correctly, it was replaced with another word from the potential list of stimuli. That is, words that the participant did not articulate correctly were excluded from the list of potential training stimuli. Probes were presented once for each stimulus, first in English and then in Portuguese with a 5-min break in between probes in each language. Probes were presented across two sessions on separate days so that additional probes could be presented in both languages during the second session. No response-specific consequences were delivered, but the experimenter delivered generic praise statements for compliance on a VI-1 min schedule of reinforcement (e.g., “You’re doing such a great job today!”). The schedule of reinforcement was facilitated by an application on an iPhone, R + Remind™.

Fig. 2

Flowchart of experimental conditions

Tact Pre-training Probes

Following echoic pre-training probes, tact pre-training probes were conducted to identify items that were not already within the participant’s repertoire in either language. The experimenter presented the initial instruction, “I am going to show you some pictures, tell me what you see” and then presented one stimulus at a time on the iPad together with the auditory discriminative stimulus, “What is it?” If the participant emitted a correct tact, that stimulus was replaced by another stimulus from the list in the following session. The experimenter waited up to 5 s before presenting the next trial. Probes were presented first in English and then in Portuguese, with a 5-min break in between probes in each language. Probes were presented across two sessions on separate days so that additional probes could be presented in both languages during the second session. In addition, a conversational prompt was added immediately before presentation of the probes in Portuguese. This consisted of the experimenter speaking to the participant in Portuguese by saying: “I like how you are sitting and doing such a good job! We are going to practice speaking in Portuguese now” before presenting the first probe. The same reinforcement procedure employed during echoic pre-training probes was used during tact pre-training probes.

Pretest Probes

Once eight stimuli were identified in the tact pre-training probes, six of these stimuli were selected at random and assigned in a quasi-random fashion to either the bilingual training condition (Set A) or the English-only condition (Set B; see Fig. 1). Pretest probes for tact responses were conducted in 9-trial blocks with each stimulus presented three times in a counterbalanced order in the language that corresponded to the training condition. Stimuli in Set A were presented in two 9-trial blocks; one in English and one in Portuguese. That is, the experimenter presented each stimulus three times in English during one 9-trial block and in Portuguese during the other 9-trial block. Stimuli in Set B were presented in one 9-trial block, with each stimulus presented three times in English. If the participant emitted two or more correct responses for the same target stimulus, another stimulus was selected at random from the list of stimuli identified in the tact pre-training probes to replace the target stimulus, and a new 9-trial block was conducted with the new target stimulus before moving to pre-generalization probes.

Pre-generalization Probes

Pre-generalization probes were conducted to assess tact responses for a variation of each pre-training stimulus (i.e., a different picture of the same stimulus) in Set A and Set B, using the same procedures employed during pretest probes. Each stimulus was presented in 9-trial blocks in the same manner as pretest probes and the mastery criterion was set at eight out of nine correct independent responses in one trial block.

Bilingual Tact Training

Following pre-generalization probes, the participant was exposed to the two training conditions in an alternating format (see Fig. 2). The order of presentation and the order of language of instruction (i.e., for Set A) was determined a priori in a quasi-random fashion. During the bilingual tact training condition, the experimenter and the participant spoke in the language that corresponded to each trial. Prior to each tact training session, the experimenter provided the initial instruction, “I am going to show you some pictures, tell me what you see” in both English and Portuguese.

Tact training was conducted in 18-trial blocks. Each stimulus in Set A was presented three times in English (e.g., “notebook”) and three times in Portuguese (e.g., “caderno”) for a total of 18 trials. The order of presentation of each stimulus was counterbalanced using quasi-random selection, where the same combination of the three stimuli was not presented twice consecutively. Each stimulus was presented first in one language (e.g., English) and in the alternate language (e.g., Portuguese) on the following trial. Following the presentation of a stimulus in both languages, one of the two remaining stimuli was randomly selected for presentation so that the same stimulus was not presented twice consecutively. The order of language of instruction for the first presentation of each stimulus was counterbalanced by flipping a coin. For example, if basket, suitcase, and notebook were selected and the languages selected were English, Portuguese, and English, then the first six trials consisted of basket-English, basket-Portuguese, suitcase-Portuguese, suitcase-English, notebook-English, and notebook-Portuguese. If the same language of instruction was selected twice consecutively, then the alternate language was always selected for the following trial. Additionally, a contextual cue was designated for the language of instruction used during each trial. That is, the background color for stimuli presented during English-language trials was blue and the background color for stimuli presented during Portuguese-language trials was yellow. The order of trial presentations was determined prior to the start of the study on data sheets created for this study. A PowerPoint presentation that depicted the order designated for each trial block as indicated on the data sheets was prepared prior to the start of each session.
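The ordering rules for one bilingual trial block can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the procedure actually used to fill in the data sheets: the coin flip is replaced by a pseudo-random choice, the rule about the same language being selected twice consecutively is read as applying to the first-presentation language of successive stimuli, and dead ends in stimulus selection are handled by simply redrawing the order. All function names are hypothetical.

```python
import random

STIMULI = ["basket", "suitcase", "notebook"]  # Set A, as named in the text
LANGUAGES = ("English", "Portuguese")


def other_language(language):
    """Return the alternate language of instruction."""
    return LANGUAGES[1] if language == LANGUAGES[0] else LANGUAGES[0]


def stimulus_order(rng, stimuli, presentations=3):
    """Order the stimuli so each appears `presentations` times and the
    same stimulus is never chosen twice in a row (redraw on dead ends)."""
    total = presentations * len(stimuli)
    while True:
        remaining = {s: presentations for s in stimuli}
        order = []
        while len(order) < total:
            options = [s for s, n in remaining.items()
                       if n > 0 and (not order or s != order[-1])]
            if not options:
                break  # dead end: start over with a fresh draw
            pick = rng.choice(options)
            remaining[pick] -= 1
            order.append(pick)
        else:
            return order


def first_languages(rng, n):
    """Simulated coin flips for the first language of each presentation;
    after two identical flips in a row, the alternate language is forced."""
    firsts = []
    for _ in range(n):
        if len(firsts) >= 2 and firsts[-1] == firsts[-2]:
            firsts.append(other_language(firsts[-1]))
        else:
            firsts.append(rng.choice(LANGUAGES))
    return firsts


def build_block(seed=None):
    """Return an 18-trial block as (stimulus, language) tuples."""
    rng = random.Random(seed)
    order = stimulus_order(rng, STIMULI)
    firsts = first_languages(rng, len(order))
    trials = []
    for stimulus, first in zip(order, firsts):
        trials.append((stimulus, first))
        trials.append((stimulus, other_language(first)))
    return trials


for trial in build_block(seed=1)[:6]:
    print(trial)
```

Because every stimulus is always presented in both languages back to back, each stimulus and language pairing occurs exactly three times per block regardless of the random draws.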

Each trial consisted of three steps: (a) the experimenter required an observing response from the participant (i.e., the participant looked toward the experimenter); (b) the experimenter modeled the correct response after a designated period of time without a response or following an incorrect response, using a progressive prompt delay (i.e., 0, 2, 4, 6, 8 s, and no prompt; Handen and Zane 1987; Touchette and Howard 1984); and (c) the experimenter delivered a programmed consequence. A 2-s ITI was programmed into the PowerPoint presentation. During this time, the participant viewed an empty slide with a white background color. Correct independent responses and correct prompted responses were reinforced with descriptive verbal praise (e.g., “Great job saying chair!”). Incorrect responses were followed by re-presentation of the trial at a 0-s prompt delay. If the participant did not emit a response within the designated period of time for the current prompt delay, the experimenter modeled the correct response, waited for the participant to emit the correct response, delivered descriptive verbal praise, and presented the next trial.

The criterion for increasing the prompt delay was one trial block with 16/18 correct independent responses. The prompt delay was reset to the previous level following three consecutive incorrect responses within a block. The criterion for mastery of a stimulus set was 17/18 correct independent responses in one trial block. In addition, a criterion for incorrect responding was set at three trial blocks with no additional correct independent responses to indicate a lack of progress with the current prompt delay. If the criterion for incorrect responding was met, the prompt delay was reset to 0 s.
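The block-level decision rules above can be summarized in a short sketch. It is an illustration under several assumptions rather than the study's actual decision protocol: the criteria are read as "at least" 16/18 and 17/18, the rules are checked once at the end of a block (the text suggests the three-consecutive-errors rule may have been applied within a block), and the order in which the rules are checked is our own choice. The function and argument names are hypothetical.

```python
# Prompt-delay steps from the text; None stands for "no prompt".
PROMPT_DELAYS_S = [0, 2, 4, 6, 8, None]

ADVANCE_MIN_CORRECT = 16  # of 18 trials, to increase the delay
MASTERY_MIN_CORRECT = 17  # of 18 trials, to master the set
ERROR_RUN_LIMIT = 3       # consecutive incorrect responses within a block
STALLED_BLOCK_LIMIT = 3   # blocks with no additional correct responses


def update_after_block(level, n_correct, longest_error_run, stalled_blocks):
    """Return (new_level, mastered) after one 18-trial block.

    level             -- index into PROMPT_DELAYS_S for the block just run
    n_correct         -- correct independent responses in the block
    longest_error_run -- longest run of consecutive incorrect responses
    stalled_blocks    -- consecutive blocks without additional correct
                         independent responses
    """
    if n_correct >= MASTERY_MIN_CORRECT:
        return level, True
    if stalled_blocks >= STALLED_BLOCK_LIMIT:
        return 0, False                      # lack of progress: back to 0 s
    if longest_error_run >= ERROR_RUN_LIMIT:
        return max(level - 1, 0), False      # back to the previous delay
    if n_correct >= ADVANCE_MIN_CORRECT:
        return min(level + 1, len(PROMPT_DELAYS_S) - 1), False
    return level, False


new_level, mastered = update_after_block(1, 16, 0, 0)
print(PROMPT_DELAYS_S[new_level], mastered)
```

In this example a block run at the 2-s delay with 16 correct independent responses moves the next block to the 4-s delay without meeting mastery.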

Procedural Modification to Bilingual Tact Training

Procedural changes were implemented during the bilingual tact training condition in an attempt to bring correct independent responses under stimulus control. First, a modified blocked trials procedure (Saunders and Spradlin 1989, 1990; Slocum et al. 2012) was implemented between trial blocks 30 and 31. The stimuli included in the modified blocked trials procedure were only those stimuli in Set A that had not produced consistent correct independent responses (i.e., basket and notebook). Each stimulus was presented in isolation across two 9-trial blocks, one in English and one in Portuguese (e.g., a 9-trial block where the correct response was always “notebook” followed by a 9-trial block where the correct response was always “caderno”). The participant was allotted 5 s to respond during each trial. The order for selection of the stimulus and language of instruction was predetermined by flipping a coin. Mastery criterion during the blocked trials procedure was eight out of nine correct independent responses in one trial block. Once the mastery criterion was met for a stimulus, that stimulus was no longer presented and the remaining stimuli were presented in blocked trials until mastery criterion was met. The participant earned a 1–4 min break after every two 9-trial blocks. A total of seven trial blocks were conducted during the blocked trials phase. Starting with trial block 36, blocked trials for basket (Portuguese) and notebook (Portuguese) were conducted at the beginning of each session, due to continued incorrect responding to those stimuli only.

The second procedural modification began at trial block 44. At this point in the training, the progressive prompt delay procedure was applied only to basket (Portuguese) and notebook (Portuguese) and the participant was provided with a full 10 s to respond for all remaining stimuli (i.e., the progressive prompt delay was no longer implemented). The third and final procedural modification occurred starting with trial block 51. At this time, a token system was implemented to promote independent responses. The experimenter delivered a token (i.e., drew a star on a blank sheet of paper) following each correct independent response. The contingencies of the token system were described to the participant at the start of each session. That is, the experimenter stated: “I’m going to draw a star on this paper each time you give the right answer. Once you get X stars, you can play on the iPad for 5 min.” Following criterion-level performance (i.e., at least one correct independent response above the number of correct independent responses on the previous trial block), Paulo exchanged all earned tokens for the backup reinforcer of access to the iPad. The iPad was selected as the backup reinforcer because the experimenter observed this to be a highly preferred activity for the participant.

English Tact Training

During the English tact training condition, the experimenter spoke only in English and followed the same procedures outlined in the bilingual tact training condition. The order of presentation (i.e., for Set B) was determined a priori in a quasi-random fashion. Prior to each tact training session, the experimenter provided the initial instruction, “I am going to show you some pictures, tell me what you see.” Tact training was conducted in 18-trial blocks. Each stimulus in Set B was presented six times in English (e.g., “oven”) for a total of 18 trials. The order of presentation of each stimulus was counterbalanced using the same procedure as outlined for the bilingual tact training condition (i.e., for Set A). Stimuli for Set B were always presented in English, and thus the color of the background for each stimulus was always blue. The experimenter followed the same steps during each trial and employed the same prompting procedure as in bilingual tact training. The criteria for increasing and decreasing the prompt delay, for mastery, and for incorrect responding were also the same.

Probes Conducted Under Extinction

After mastery criterion was met for both sets, an 18-trial block under extinction was conducted for each set of stimuli. The same reinforcement procedure employed during pretest probes was used for probes conducted under extinction, except reinforcement for compliance with unrelated instructions was no longer provided. When the participant met the mastery criterion during these probes, the next experimental condition was implemented (i.e., tact posttest probes). If mastery criterion was not met for either stimulus set, a remedial training phase was implemented until mastery criterion was met. The purpose of conducting probes under extinction was to ensure that tacts acquired as a result of tact training were emitted by the participant in the absence of programmed consequences for correct and incorrect responses which would also be absent during posttest probes.

Remedial Training

We implemented remedial training if the participant did not meet the mastery criterion during probes conducted under extinction or posttest probes. The remedial training employed the same procedures as tact training and continued until the participant demonstrated mastery criterion performance. This training was implemented after trial block 59 for the set assigned to bilingual tact training (Set A), and after trial block 2 for the set assigned to English-only tact training (Set B). After mastery criterion was met for a stimulus set during remedial training, probes conducted under extinction and posttest probes were conducted once more.

Posttest Probes

After mastery was met for a stimulus set during probes conducted under extinction, posttest probes were conducted using the same procedures as in pretest probes. The criterion for mastery of a stimulus set was eight out of nine correct independent responses in one trial block.

Post-generalization Probes

Post-generalization probes were conducted once mastery criterion was demonstrated during posttest probes. The same procedures and mastery criterion employed during pretest probes were used, but with the same stimuli as pre-generalization probes. Post-generalization probes could not be conducted immediately following posttest probes for Set A because Paulo and his family went on vacation. These probes were scheduled upon his return, four weeks after the mastery criterion was met. For Set B, post-generalization probes were conducted on the day after mastery criterion was demonstrated.

Follow-Up Probes

Follow-up probes were scheduled to take place 2–6 weeks following the implementation of post-generalization probes. The same procedures and mastery criterion employed during posttest probes were used. Given the time constraints with data collection, the experimenters scheduled follow-up probes for Set A 1 week after the completion of post-generalization probes. Follow-up probes for Set B were conducted six weeks after the completion of post-generalization probes.

Interobserver Agreement

All sessions were video-recorded for the purpose of IOA and PI measures. Two trained observers scored participant responses during 26–100% of trial blocks for each phase of the study. One of the observers scored participant responses in vivo and the other observer scored participant responses from video-recorded sessions. IOA data were collected for 100% of trial blocks during pretest probes (M = 100%); 100% of trial blocks during pre-generalization probes (M = 100%); 26% of trial blocks during the training phase (M = 94.4%, range 77.8–100%); 60% of trial blocks during the posttest probes (M = 100%); 33.3% of trial blocks during the post-generalization probes (M = 100%); and 33.3% of trial blocks during follow-up probes (M = 100%). IOA data were calculated by dividing the number of agreements by the number of agreements plus disagreements and then multiplying by 100 to yield a percentage for each trial block.
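The IOA calculation described above (agreements divided by agreements plus disagreements, multiplied by 100) can be written as a short function. This is a minimal sketch assuming trial-by-trial comparison of two observers' scores for one trial block; the function name and the example records are hypothetical and are not data from the study.

```python
def percent_agreement(observer_a, observer_b):
    """Trial-by-trial IOA for one trial block, as a percentage.

    Each argument is a sequence with one score per trial
    (e.g., "correct" or "incorrect").
    """
    if len(observer_a) != len(observer_b) or len(observer_a) == 0:
        raise ValueError("Both observers must score the same non-empty block")
    agreements = sum(a == b for a, b in zip(observer_a, observer_b))
    disagreements = len(observer_a) - agreements
    return 100.0 * agreements / (agreements + disagreements)


# Hypothetical 18-trial block in which the observers disagree on 4 trials.
in_vivo = ["correct"] * 18
from_video = ["correct"] * 14 + ["incorrect"] * 4
print(round(percent_agreement(in_vivo, from_video), 1))  # 77.8
```

With 18-trial blocks, each disagreement lowers the block's IOA by about 5.6 percentage points, so a small number of disagreements can move the per-block percentage noticeably.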

Procedural Integrity

PI measures were also scored by a third trained observer from video-recorded sessions to ensure that the experimenter implemented the training procedures consistently. A checklist for each phase of the study was created for the purposes of scoring PI. The checklist for pre- and posttest probe sessions included 11 possible experimenter responses. The checklist for tact training included 14 possible experimenter responses. Examples of responses scored were: (a) the experimenter cleared the work area and prevented access to preferred items, (b) the experimenter asked “What is it?” and presented each picture in the corresponding language of instruction, and (c) the experimenter recorded the participant’s response on the data sheet before presenting the next trial (data sheets available from the first author upon request). PI data were summarized by summing the number of responses that were correctly implemented and dividing by the total number of available responses per trial block and multiplying by 100 to yield a percentage. PI data were collected for 100% of trial blocks during pretest probes (M = 98.7%, range 96.1–100%); 66.6% of trial blocks during pre-generalization probes (M = 99%, range 98–100%); 32.8% of trial blocks during the training phase (M = 99.2%, range 92.2–100%); 40% of trial blocks during the posttest probes (M = 100%); 33.3% of trial blocks during the post-generalization probes (M = 100%); and 33.3% of trial blocks during follow-up probes (M = 100%).

Results

Figure 3 shows Paulo’s correct tact responses during pre- and posttest probes, pre- and post-generalization probes, and follow-up probes. His responses during pretest and pre-generalization probes ranged from zero to two across the two training conditions. In the bilingual tact training condition (Set A) during pretest probes, there were zero correct responses in both English and Portuguese. In the English tact training condition (Set B) during pretest probes, there was one correct response. For Set A during pre-generalization probes, there were two correct responses in English and zero correct responses in Portuguese. For Set B during pre-generalization probes, there were zero correct responses. Paulo met and exceeded the predetermined mastery criterion with Set A on posttest probes in both English and Portuguese. He also met the mastery criterion with Set B, in English only, on posttest probes, but did not meet the criterion with Set B during post-generalization and follow-up probes (see Fig. 3). For Set A during posttest probes, there were eight correct responses in English and nine correct responses in Portuguese. For Set B during posttest probes, there were nine correct responses. For Set A during post-generalization probes, there were nine correct responses in English and eight correct responses in Portuguese. For Set B during post-generalization probes, there were three correct responses. For Set A during follow-up probes, there were eight correct responses in English and nine correct responses in Portuguese. For Set B during follow-up probes, there were six correct responses.

Fig. 3 Paulo’s number of correct tact responses with the bilingual condition (in English), the bilingual condition (in Portuguese), and the English-only condition during pretest, posttest, pre-/post-generalization, and follow-up probes

Results of Paulo’s tact training are shown in Fig. 4. This figure shows that Paulo met the mastery criterion with Set B following the second trial block, although he required six additional trial blocks of remedial training to meet the mastery criterion on probes under extinction and posttest probes. The level of correct responses for Set A remained lower than that for Set B throughout the first 56 trial blocks. For Set A, Paulo met the mastery criterion following the 59th training trial block and required nine additional trial blocks of remedial training to meet the mastery criterion during probes conducted under extinction and posttest probes. During the first seven trial blocks with Set A, correct independent responses ranged from two to nine. From trial blocks 8 to 20, there was an increase in the level of correct independent responses (range 3–13). From trial blocks 21 to 30, the level of correct independent responses decreased slightly (range 1–14). After 30 trial blocks of tact training, the number of correct independent responses with Set A ranged from 1 to 14, and there was no evidence of an increasing trend. Due to continued incorrect responding and in an effort to bring responding under stimulus control, a modified blocked trials procedure was implemented. Following its introduction, Paulo’s level of correct independent responses increased (range 6–14). He met the mastery criterion with Set A following 59 trial blocks of tact training and an additional seven blocked trials.

Fig. 4 Tact training results for Paulo. Remedial training sessions represented by an arrow accompanied by the text “RT,” following a break in the data path

Figure 5 shows the number of correct independent tact responses in English and in Portuguese during tact training in the bilingual training condition (Set A). This figure shows that the overall level of correct independent responses was higher in English than in Portuguese across all phases of tact training in the bilingual condition. The number of correct independent responses emitted in English was greater than in Portuguese across the first five trial blocks (range 2–6 correct independent responses in English vs. range 0–1 in Portuguese). Starting with the eighth trial block, there were as many as eight correct independent responses in English, whereas the number of correct independent responses in Portuguese remained below eight until the 59th trial block.

Fig. 5 Tact training results for Paulo with the bilingual condition only. Remedial training sessions represented by an arrow accompanied by the text “RT,” following a break in the data path

Figure 6 shows the cumulative number of correct independent responses during tact training in the bilingual condition (Set A). This figure shows that Paulo had nearly twice as many correct independent tact responses in English (i.e., 451) as in Portuguese (i.e., 236) at the conclusion of the study. The difference in correct independent responses was clear after the first five trial blocks (i.e., 26 cumulative correct independent responses in English versus 2 cumulative correct independent responses in Portuguese), and this difference continued to widen as training continued.
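A cumulative record of this kind is a running total of per-block counts. The sketch below shows one way to compute it. The per-block counts are invented for illustration and are not Paulo's actual data. They were chosen only so that the five-block totals equal the reported 26 and 2. The function name `cumulative_record` is likewise hypothetical.

```python
from itertools import accumulate


def cumulative_record(per_block_counts):
    """Return the running total of correct independent responses per block."""
    return list(accumulate(per_block_counts))


# Invented per-block counts for the first five trial blocks (NOT study data).
english_counts = [2, 6, 6, 6, 6]
portuguese_counts = [0, 1, 0, 1, 0]

english_cumulative = cumulative_record(english_counts)
portuguese_cumulative = cumulative_record(portuguese_counts)
print(english_cumulative)     # prints [2, 8, 14, 20, 26]
print(portuguese_cumulative)  # prints [0, 1, 1, 2, 2]

# Ratio of the end-of-study totals reported in the text (451 vs. 236).
reported_ratio = 451 / 236
print(f"{reported_ratio:.2f}")  # prints 1.91
```

The ratio of the reported totals, about 1.91, corresponds to the statement that English responses were nearly twice as frequent as Portuguese responses.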

Fig. 6 Cumulative number of correct independent responses for Paulo during tact training with the bilingual condition. Open circles represent correct independent responses in English for the bilingual condition and closed circles represent correct independent responses in Portuguese for the bilingual condition

In summary, the results of this study show that tact training resulted in fewer trials to mastery criterion when instruction was provided only in English (Set B) compared to bilingual instruction (Set A). Paulo required two trial blocks to reach the mastery criterion during tact training in English, whereas he required 59 trial blocks and an additional seven blocked trials to reach the mastery criterion during the bilingual tact training condition. However, Paulo learned novel tact responses in both languages despite rapid alternation of the language of instruction. The findings also show greater generalization and maintenance of acquired tact responses for stimuli taught with bilingual instruction.

Discussion

The purpose of the present study was to directly evaluate the effects of tact training when instruction was presented in a bilingual format (English and the home language, Portuguese) compared to instruction in English alone. Specifically, we sought to investigate skill acquisition during bilingual instruction as conducted in some bilingual educational settings, where the language of instruction and responding is varied momentarily during lessons (Creese and Blackledge 2010). The results show that Paulo emitted a higher level of correct responding during the English-only condition. It is important to note that this study examined the effects of bilingual instruction and not the home language in isolation, as has been done in previous studies (Lang et al. 2011). The participant in this study had a longer history of reinforcement for speaking in English relative to Portuguese prior to the start of the study, and he continued to receive all of his academic instruction outside of the study in English. These factors may have contributed to his overall superior performance with the English-only tact training condition. The difference between the cumulative number of correct independent responses emitted in English and the number emitted in Portuguese was also evident during training (see Fig. 6). That is, despite the fact that responses were prompted and reinforced when they were incorrect in Portuguese, Paulo practiced more correct responding in English. Thus, these findings may not be generalizable to bilingual children with language delays who present with a different history of exposure to their home language.

Despite these findings, generalization and maintenance of acquired tact responses were better for the bilingual instruction training set. One potential explanation for the outcomes of the generalization and follow-up probes is that the bilingual training condition continued for a total of 68 trial blocks compared to 8 trial blocks for the English-only training condition. A potential explanation for better maintenance for the bilingual instruction training set is the shorter interval of time between generalization probes and follow-up probes for the bilingual training instruction set, relative to the English-only training instruction set. For the bilingual training condition, follow-up probes were conducted one week after the completion of post-generalization probes, whereas for the English-only training condition, follow-up probes were conducted six weeks after the completion of post-generalization probes. As mentioned above, we scheduled sessions at different time intervals due to the family’s and experimenter’s conflicting schedules after the academic year ended.

The lack of stimulus control during the bilingual training condition may also be partially explained by the manner in which stimuli were presented during this condition. Each target was presented in English and Portuguese in a rapidly alternating and quasi-random fashion within each 18-trial block. This required the participant to alternate between two languages after every response. Previous studies have delivered bilingual instruction using a variety of different procedures. For example, some studies have alternated and combined English and the home language within and across sessions (i.e., English, home language, and English/home language; Ebert et al. 2014), alternated English and the home language across sessions (Lang et al. 2011; Padilla Dalmau et al. 2011; Rispoli et al. 2011), alternated English and the home language within sessions (Creese and Blackledge 2010), and alternated English and the home language within sessions based on the participant’s selection on a preference assessment (Aguilar 2013; Padilla Dalmau 2012). This study sought to investigate skill acquisition during bilingual instruction as conducted in some bilingual educational settings, where the language of instruction and responding is varied momentarily during lessons (Creese and Blackledge 2010).

It is unclear whether the contextual cue for each language condition (i.e., the color of the screen background) and the auditory discriminative stimulus (i.e., “What is it?” or “O que é?”) were sufficient for the participant to associate each trial with its corresponding language condition. During bilingual instruction, the contextual cue and the language of the auditory discriminative stimulus were alternated after almost every response. Thus, response effort for correct responding during bilingual instruction may have been greater relative to the English-only condition. Anecdotally, the participant sometimes emitted the correct tact in the incorrect language, although data were not collected for each occurrence. For instance, on trial block 32 the participant emitted the correct tact in the incorrect language when the stimulus for basket was presented in English (i.e., “cesto” instead of “basket”). The efficacy of bilingual instruction in which language is varied momentarily is limited, and a different outcome may be obtained if another method is used; for example, if the language of instruction is alternated within sessions based on the child’s preference. Future studies should examine the effects of different bilingual training procedures on correct tact responses for learners with developmental disabilities and language delays. It will also be important to assess listener relations, or the selection of a stimulus following an instruction by the experimenter (i.e., “point to ____” when items are presented in an array of three), which was not the main focus of this study.

These findings are preliminary and should be interpreted with caution, given the study’s limitations. First, there was unequal exposure to the stimuli in each set. That is, during pre- and posttests, the participant was exposed to the stimuli in Set A twice as many times as to the stimuli in Set B. Set A was presented in two blocks: one in English and the other in Portuguese. Presenting Set A in this way was necessary to assess the participant’s responses to each stimulus in both languages. Additionally, in a trial block during tact training, each stimulus in Set B was presented six times in the same language (i.e., English), whereas each stimulus in Set A was presented only three times in English and three times in Portuguese. Stimuli were presented in this way in order to equate the number of trials in each block between sets during tact training. Another limitation is the unequal length of time that elapsed before post-generalization probes (i.e., 4 weeks vs. 1 day) and follow-up probes (i.e., 1 vs. 6 weeks) were conducted, between Set A and Set B. It was not always possible to control the timing of post-generalization and follow-up probes, due to the limited availability of the participant and time constraints for data collection. Lastly, there were technical difficulties associated with using the iPad. The ITI screen was preset to advance automatically after 2 s, but the participant occasionally pressed the screen before this time had elapsed. This occasionally resulted in unequal lengths of exposure to each stimulus.

This is the first study to examine the efficacy of bilingual tact training with this population, and the results present an avenue for future research in this area. First, future studies may wish to examine the effects of training in the home language alone and compare this with training in English only. Second, this study presents data for a single participant, and future studies should implement this procedure with other bilingual children with developmental disabilities and language delays who present with unique histories of language exposure (e.g., academic history in both English and their home language). Third, future studies should aim to examine the effects of training in two languages with this population in early childhood, when decisions regarding the optimal language of instruction and support are often made.