According to the Diagnostic and Statistical Manual of Mental Disorders (5th ed.), an individual diagnosed with autism must exhibit impairments (excesses or deficits) in social communication and interaction (American Psychiatric Association 2013). In fact, it has been estimated that 76 % of individuals diagnosed with autism will not develop fluent speech and 30 % will fail to develop any vocal output (Wodka et al. 2013). For such individuals, Augmentative and Alternative Communication (AAC) systems can effectively assist with (i.e., augment) or act as the primary means (i.e., alternative) of social communication (Cafiero and Meyer 2008). Forms of AAC include picture exchange, the Picture Exchange Communication System (PECS), manual sign language, and speech-generating devices (SGD) or Voice Output Communication Aids (VOCA). Although there are various types of AAC, recent studies have provided preliminary support that handheld devices outfitted to function as SGDs are as effective in teaching communication skills as other methods of AAC and are generally preferred by individuals with developmental disabilities (Couper et al. 2014; Lorah et al. 2013; Lorah et al. 2014).

Given the characteristics of individuals with autism, the development of communication and social interaction skills typically requires specific intervention or the implementation of effective intervention strategies (Sundberg and Michael 2001). One such skill associated with social interaction is an individual’s ability to respond to a verbal discriminative stimulus (or a verbal cue) with a verbal response (i.e., demonstrate an intraverbal repertoire). The intraverbal is an elementary operant that does not involve point-to-point correspondence between the verbal discriminative stimulus and the verbal response (Skinner 1957). Intraverbal skills range in complexity from fill-in-the-blank statements, such as “A car goes…”, to typing an email in response to a coworker or engaging in and maintaining a conversation.

Verbal behavior is behavior that is mediated by another person (Skinner 1957). Since all verbal behavior is mediated by another individual, all verbal behavior is inherently social in nature. In order for an individual to communicate effectively, he or she must possess an intraverbal repertoire, for without an intraverbal repertoire a person’s ability to interact socially can be severely impaired (Goldsmith et al. 2007). Thus, intraverbal training is an essential element of any intervention procedure used with a child with autism to improve impairments in socialization and is therefore incorporated into many curricular sequences (Partington and Sundberg 1998; Leaf et al. 1999; Sundberg 2008; Partington 2006).

As previously discussed, recent technological advancements in the development of handheld computing devices, such as the iPad®, have made high-tech SGDs more affordable and readily available than other types of SGD, and perhaps more socially acceptable and more easily interpreted than other forms of AAC, such as picture-based communication and manual signing (Alzrayer et al. 2014). That said, Alzrayer et al. (2014) cited only four studies conducted since 2007 demonstrating the effectiveness of teaching intraverbal skills with a SGD, despite the associated advantages. Additionally, a recent review of handheld computing technology outfitted to function as a SGD emphasized the need for additional research on developing an intraverbal repertoire, as none of the 17 studies reviewed were self-described as investigating the acquisition of an intraverbal repertoire (Lorah et al. 2014).

Although, to date, no studies have specifically evaluated the use of handheld devices as SGD in the acquisition of an intraverbal repertoire, several studies, such as those listed by Alzrayer et al. (2014), have generally evaluated such acquisition through an analysis of multiply controlled operants. Multiply controlled operants are those that present the possibility of more than one occasioning stimulus or antecedent condition (Bondy et al. 2004). For example, an intraverbal-mand-tact could be occasioned by an establishing operation (e.g., hunger; as is characteristic of a mand or request), plus a verbal discriminative stimulus (e.g., “Do you want something to eat?”), and some aspect of the environment (e.g., food; as is characteristic of a tact or label). An example of an intraverbal-mand-tact is the statement “I’m hungry for pizza” made in the presence of a pizza parlor, following the verbal stimulus “What do you want to eat?”, while being in a state of food deprivation (an establishing operation).

van der Meer et al. (2011) assessed the acquisition of two intraverbal mand responses with a multiple-probe across participants design when participants were presented with the discriminative stimulus (SD), “Let me know if you want something.” The participants’ ages ranged from 13 to 23 years and each was diagnosed with autism or a developmental disability. Participants were instructed to respond by selecting a picture on an iPod Touch®, which represented a toy or snack. The training procedures included a 10 s time delay with full physical guidance. At the conclusion of training, two of the three participants had acquired the ability to respond using the device. However, the degree of multiple control is far greater in this intraverbal mand, as the trained response could have largely been occasioned by the establishing operation of deprivation alone, which presents a theoretical limitation to the research findings in terms of intraverbal acquisition. Additionally, the reinforcement produced by the verbal behavior, the delivery of a tangible reinforcer that had one-to-one correspondence with the verbal behavior, is characteristic of a mand and not of an intraverbal. In other words, within this research design, it is impossible to determine whether the verbal behavior demonstrated by the participant possessed any characteristics of an intraverbal.

In a similar study, researchers assessed the acquisition of two intraverbal mand responses while comparing the participants’ AAC mode preference (van der Meer et al. 2012a, b, c). In a multiple probe across participants design, the participants, who each had a diagnosis of autism and whose ages ranged from 5.5 years to 10 years, were presented with the SD, “Let me know if you want something” (van der Meer et al. 2012a, b, c). The participants then responded by either selecting a picture on the device which represented a toy or snack or manually signing the request. The training procedures involved a 10 s time delay with graduated guidance. After training, three of the four participants demonstrated the ability to request their preferred snack or toy with a SGD. Similar to the previously discussed study, a major limitation within this design is the response could have largely been under the control of the establishing operation, deprivation of the toy or snack, instead of the verbal SD. In other words, it is difficult to determine if the acquired responses were a mand, an intraverbal, or multiply-controlled verbal behavior.

Finally, Strasberger and Ferreri (2014) investigated the acquisition of participants’ ability to answer “What do you want?” and “What is your name?” using a multiple baseline design. The participants included four children, ranging in age from 5 to 12 years, with a diagnosis of autism. For the purpose of the study, the participants used an iPod Touch® and Proloquo2Go™ as the SGD, and training consisted of peer-assisted communication application, in which each participant was taught to complete the two-step sequence by a trained peer participant and researcher. The results of the study indicated that three of four participants were able to demonstrate acquisition, maintenance, and generalization of the response to “What do you want?”, while only two of the four participants met criteria when responding to “What is your name?” Although the first response would be an example of an intraverbal mand, the second response acquired is a true intraverbal, demonstrating acquisition of the ability to respond to social questions when using an SGD (Strasberger and Ferreri 2014). However, the results of this investigation should be considered preliminary and, therefore, additional investigations of the use of handheld computing devices as SGD in the acquisition of intraverbal responding are indicated.

Because the ability to communicate is key to effective social interaction, teaching an intraverbal repertoire with a SGD merits additional research to determine the best methods to assist in acquisition of intraverbal skills. The present study sought to further the evidence base regarding the effectiveness of using the iPad® and the application Proloquo2Go™ as a SGD for acquisition of intraverbal skills. As such, the focus of the current investigation was to (a) evaluate the use of the iPad® and the application Proloquo2Go™ as a SGD and (b) evaluate the use of a 5 s time delay with full physical prompts in terms of acquisition of an intraverbal response to a question regarding personal information (i.e., a social question), using a multiple baseline across intraverbal responses design.

Method

Participants

One boy and one girl participated in the study, both of whom had been diagnosed with autism spectrum disorder (ASD) and were receiving behaviorally based instruction in a university center-based program for children with a diagnosis of ASD. The university program met for 2 h, 2 days per week, for a 12-week semester. Both participants presented with absent or weak mand, tact, echoic, and intraverbal repertoires (a score of 3 or 4) and were assessed with a score of 3 or 4 (weak or absent) in the articulation category, as indicated by a VB-MAPP Barriers Assessment. Therefore, for both participants, the use of a SGD is clearly indicated as educational and clinical best practice.

As indicated in Table 1, Cate was 12 years old at the time of the study; she was non-vocal and manded primarily using gestures such as pointing. Although she had some training on the use of a SGD, she was highly prompt bound in terms of manding. Levi was 8 years old at the time of the study and manded using 1–2 word utterances, though his mands were highly dependent on either the presence of the item or the vocal prompt “What do you want?” A third participant was initially recruited for this study, but his participation was discontinued after three baseline sessions due to excessive absences.

Table 1 Participant information

Setting & Materials

All sessions were conducted in a therapy room within the center-based learning environment where the participants received their instruction. The room included a child-sized table, chairs, and a variety of toys. The materials used in the study were a 32 GB iPad Version II covered with a black OtterBox and the application Proloquo2Go, which functioned as the SGD.

Dependent Measures

Probe data (a targeted response was scored as either correct or incorrect) were collected for a total of three trials, per target, per session. A response was scored as correct if the participant selected and pressed the accurate picture symbol that corresponded to the intraverbal statement (i.e., the social question asked) with enough force to evoke the digitized output. For example, if the participant were asked “What is your name?” and responded by selecting the picture symbol that represented “My name is…” with enough force to evoke the digitized message, the trial was scored as correct. Alternatively, if the participant were asked “What is your name?” and responded by selecting the picture symbol that represented “I live in…” with enough force to evoke the digitized message, or if the participant did not respond within a 5-s latency, the trial was scored as incorrect. Following the end of each session, the probe data were converted into a percentage correct for each respective intraverbal statement (i.e., response) and graphed for visual inspection.
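The conversion of probe data into a session percentage can be illustrated with a minimal sketch. The function name and the sample session below are hypothetical and are not taken from the study's data; the sketch only assumes that each trial is recorded as correct or incorrect.

```python
def percent_correct(trial_scores):
    """Return the percentage of probe trials scored correct in one session.

    trial_scores: list of booleans, one per trial (True = correct).
    """
    if not trial_scores:
        raise ValueError("a session needs at least one scored trial")
    return 100.0 * sum(trial_scores) / len(trial_scores)


# Hypothetical session: three probe trials for one social question.
session = [True, False, True]
print(round(percent_correct(session), 1))  # 66.7
```

With three trials per target, a session can only take the values 0, 33.3, 66.7, or 100 %.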

Research Design

A multiple-baseline design across target behaviors or responses (social questions) was used to evaluate the effects of the procedure on the establishment of intraverbal skills, in the form of responding to social questions.

General Procedures

The study was broken into sessions and trials, with no more than two sessions occurring per day, and each session containing three to four trials, or opportunities to respond, for each of the three social questions (only on one occasion were there four trials, and this was due to low procedural fidelity within that particular session). During both baseline and training, the screen of the iPad contained the sentence frame window on top and five picture symbols (three targeted symbols and two distractor symbols) depicted on the screen. During training trials, a 5 s time delay with full physical prompts was used to evoke correct responding if the participant made an error during responding or did not respond within 5 s. Following both prompted and independent responses, vocal social praise was delivered, for example “You’re right.” or “Good work.” During baseline and training, the iPad was positioned within three inches and to the front of the participant. The instructor/researcher was seated on a chair, to the right or left, but within six inches, of the participant. For Cate, the targeted social questions were “How old are you?”, “What is your favorite toy?”, and “Where do you live?” For Levi, the targeted social questions were “What is your favorite toy?”, “What is your favorite food?”, and “Where do you live?”

Baseline

During baseline, the participant was presented with the iPad and the experimenter then vocally presented the targeted discriminative stimuli/asked the targeted social question. If the participant did not respond within 5 s, or if the participant made an error in responding, the response was scored as incorrect, all materials were removed, and the trial was ended. If the participant responded correctly, the experimenter provided social praise (e.g., “Yes, that is your name.”), the trial was scored as correct, all materials were removed, and the trial ended. No prompting occurred during baseline sessions and baseline continued in this manner until stable responding, across 3 consecutive data points, was established.

Training

During training, the participant was presented with the iPad and the experimenter then presented the targeted vocal stimulus/question (e.g., “What is your name?”). If the participant did not respond within 5 s, or if the participant made an error in responding, the response was scored as incorrect and the experimenter used a full-physical prompt to evoke correct responding, vocal social praise was delivered, all materials were then removed, and the trial ended. If the participant responded correctly, the experimenter provided social praise (e.g., “Yes, that is your name.”), the trial was scored as correct, all materials were removed, and the trial ended. Training continued in this manner until the participant reached a mastery criterion of at least 80 % correct responding, across two consecutive sessions.
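The mastery criterion described above (at least 80 % correct across two consecutive sessions) can be expressed as a short check. This is a sketch under stated assumptions: the function name, its parameters, and the sample percentages are hypothetical and do not reproduce the study's data.

```python
def sessions_to_mastery(session_percentages, criterion=80.0, consecutive=2):
    """Return the 1-based session number at which mastery is first met.

    Mastery is assumed to mean `consecutive` sessions in a row at or
    above `criterion` percent correct. Returns None if never met.
    """
    run = 0  # length of the current streak of sessions at or above criterion
    for index, pct in enumerate(session_percentages, start=1):
        run = run + 1 if pct >= criterion else 0
        if run == consecutive:
            return index
    return None


# Hypothetical training phase for one social question.
training = [33.3, 66.7, 100.0, 100.0]
print(sessions_to_mastery(training))  # 4
```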

Maintenance

Maintenance probes were conducted for each targeted response following the achievement of mastery and were continued until all three target responses met the mastery criterion. Maintenance probes were identical to baseline probe procedures, in that no prompting took place contingent upon incorrect responding or non-responsiveness.

Interobserver Agreement & Procedural Fidelity

A second independent observer collected agreement data on the dependent variables for 42 % of baseline sessions and 33 % of training and maintenance sessions. Agreement was defined as both observers scoring the occurrence of independent and/or prompted responses. A disagreement was scored if one observer identified a response as independent and the other observer identified the response as prompted. Inter-observer agreement was calculated by dividing the number of agreements by the number of agreements plus disagreements and multiplying by 100. The average inter-observer agreement was 98 % during baseline and 100 % during training and maintenance. Following completion of each session, the experimenter/instructor completed a procedural fidelity checklist. Analysis of checklist data indicates that the procedures were followed, as outlined by the primary experimenter, at 90 % fidelity during all baseline, training, and maintenance trials. Additionally, the physical presence of the primary experimenter at all baseline, training, and maintenance trials helped to ensure procedural fidelity.
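The agreement calculation described above, agreements divided by agreements plus disagreements, multiplied by 100, can be sketched as follows. The function name, the "independent"/"prompted" labels as strings, and the sample records are illustrative assumptions, not the study's data.

```python
def interobserver_agreement(primary, secondary):
    """Trial-by-trial agreement percentage between two observers.

    primary, secondary: equal-length lists of trial labels, for example
    "independent" or "prompted", one entry per observed trial.
    """
    if len(primary) != len(secondary):
        raise ValueError("both observers must score the same trials")
    if not primary:
        raise ValueError("at least one trial is required")
    agreements = sum(a == b for a, b in zip(primary, secondary))
    disagreements = len(primary) - agreements
    return 100.0 * agreements / (agreements + disagreements)


# Hypothetical records for four trials scored by both observers.
observer_1 = ["independent", "prompted", "independent", "independent"]
observer_2 = ["independent", "prompted", "prompted", "independent"]
print(interobserver_agreement(observer_1, observer_2))  # 75.0
```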

Experimenter/Instructor

Two experimenters/instructors were involved in this research project. The first was the primary experimenter, who was an Assistant Professor in a University special education department. She was also a Board Certified Behavior Analyst at the Doctorate Level and supervised all practica within the University clinic. The second experimenter/instructor was a doctoral-level student in special education. She was also seeking board certification as a behavior analyst and was completing her practicum hours within the university clinic. The primary experimenter trained the secondary experimenter on the procedures to a mastery criterion of 100 % fidelity.

Results

Levi

As indicated in Fig. 1, Levi never responded accurately and independently to the vocal stimulus “What is your favorite toy?” during the six baseline sessions. He met mastery criteria for this statement after three sessions of training, during which he averaged 85 % accurate responding. There were six maintenance sessions for this statement, during which he averaged 97 % accurate and independent responding. Levi averaged 17 % accurate and independent responding to the vocal stimulus “What is your favorite food?” during the nine baseline sessions that were conducted. He met mastery criteria for this statement after two sessions of training, during which he averaged 100 % accurate responding. There were four maintenance sessions for this statement, during which he averaged 100 % accurate and independent responding. Levi averaged 11 % accurate and independent responding to the vocal stimulus “Where do you live?” during the 11 baseline sessions. He met mastery criteria for this statement after four sessions of training, during which he averaged 88 % accurate responding. There were no maintenance sessions for this statement, as it was the last statement to be trained and mastered.

Fig. 1 Levi’s percentage of independent social questions. This figure depicts the percentage at which Levi independently and accurately answered social questions during baseline, training, and maintenance

In terms of visual analysis of Levi’s data, there are clear indications of experimental effect. For all three trained social questions, the change in responding between baseline sessions and the first sessions of training was large in magnitude and immediate. On average, Levi required only three training sessions to reach the mastery criterion of 80 % accurate and independent responding. Furthermore, in terms of percentage of non-overlapping data, 100 % of the data are non-overlapping, indicating that the treatment was highly effective. Finally, with regard to maintenance, both statements for which maintenance data were collected clearly indicate stability and general maintenance of the acquired intraverbals.

Cate

As indicated in Fig. 2, Cate never responded accurately and independently to the vocal stimulus “What is your favorite toy?” during the five baseline sessions. She met mastery criteria for this statement after 10 sessions of training, during which she averaged 55 % accurate responding. There were seven maintenance sessions for this statement, during which she averaged 96 % accurate and independent responding. Cate averaged 3 % accurate and independent responding to the vocal stimulus “How old are you?” during the nine baseline sessions that were conducted. She met mastery criteria for this statement after three sessions of training, during which she averaged 75 % accurate responding. There were four maintenance sessions for this statement, during which she averaged 94 % accurate and independent responding. Cate averaged 1 % accurate and independent responding to the vocal stimulus “Where do you live?” during 18 baseline sessions. She met mastery criteria for this statement after four sessions of training, during which she averaged 63 % accurate responding. There were no maintenance sessions for this statement, as it was the last statement to be trained and mastered.

Fig. 2 Cate’s percentage of independent social questions. This figure depicts the percentage at which Cate independently and accurately answered social questions during baseline, training, and maintenance

In terms of visual analysis of Cate’s data, there are clear indications of experimental effect. For all three trained social questions, the change in responding between baseline sessions and the first sessions of training was moderate in magnitude and immediacy. On average, Cate required six training sessions to reach the mastery criterion of 80 % accurate and independent responding. That said, the majority of those sessions were for the first acquired intraverbal, which required ten sessions for mastery. The remaining statements required only three and four training sessions, respectively. The highly differential session requirements may have been due to a scheduled holiday break that occurred in the middle of the training sessions. In terms of percentage of non-overlapping data, 90 % of the data are non-overlapping for intraverbal one, indicating treatment was highly effective. For intraverbal two, 66 % of the data are non-overlapping, indicating treatment was minimally effective. For intraverbal three, 100 % of the data are non-overlapping, again indicating treatment was highly effective. Taken together, 85 % of the data are non-overlapping, indicating moderately effective treatment. Finally, with regard to maintenance, both statements for which maintenance data were collected clearly indicate stability and general maintenance of the acquired intraverbals.
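The percentage of non-overlapping data reported above is conventionally computed, for a behavior expected to increase, as the share of treatment-phase data points that exceed the highest baseline data point. The sketch below assumes that convention; the function name and the sample values are hypothetical and do not reproduce either participant's data.

```python
def percent_nonoverlapping(baseline, treatment):
    """Percentage of treatment points above the highest baseline point.

    Assumes the target behavior is expected to increase, so a treatment
    point is non-overlapping only if it exceeds the baseline maximum.
    """
    if not baseline or not treatment:
        raise ValueError("both phases need at least one data point")
    ceiling = max(baseline)
    nonoverlapping = sum(1 for point in treatment if point > ceiling)
    return 100.0 * nonoverlapping / len(treatment)


# Hypothetical session percentages for one social question.
baseline_phase = [0.0, 0.0, 33.3, 0.0]
training_phase = [33.3, 66.7, 100.0, 100.0]
print(percent_nonoverlapping(baseline_phase, training_phase))  # 75.0
```

Because the statistic depends on the single highest baseline point, one elevated baseline session can lower it sharply even when the overall change in level is clear.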

Discussion

This study sought to add to the existing literature on the use of handheld computing devices as speech-generating devices (SGD) by investigating the use of the iPad® and the application Proloquo2Go™ as a SGD for the acquisition of intraverbal responses. This study also evaluated the use of a 5 s time delay with full physical prompting in the acquisition of an intraverbal repertoire, using the iPad® as a SGD. Visual inspection of the data indicates that the use of the iPad® and the application Proloquo2Go™, together with the prompting procedure, was effective in establishing the participants’ ability to respond to an intraverbal statement. Additionally, it is clear that for these participants the use of generalized social reinforcement was effective in terms of establishing and maintaining the acquired intraverbal repertoire. However, as noted by Sobsey and Reichle (1989), it is possible that the synthetic speech output may have functioned as a reinforcer, thereby maintaining the participants’ responses.

Thus, this study offers preliminary evidence that the iPad® as a SGD can be used for the acquisition of operants beyond the basic mand, an area of research lacking in the current literature base. An interesting finding of this research is the rapid rate at which each response was acquired. Taken together, the participants required an average of four training sessions to acquire each intraverbal statement. These data indicate a very rapid rate of acquisition despite the fact that Cate’s first introduced statement required an unusually high 10 training sessions. An additional strength of the results is the clear indication that the acquired repertoires maintained following mastery, with an average of 95 % accurate and independent responding, across 21 maintenance sessions.

Despite the strength of the data and the clear demonstration of experimental effect, the results of this study should be taken as preliminary due to several theoretical and procedural limitations. The first concerns the subtle distinction between the listener operant pliance and the speaker operant intraverbal in terms of the current research design. According to Zettle and Hayes (1982), pliance is a listener response to a mand and is a form of rule-governed behavior. In the current investigation, participants acquired the ability to respond to an intraverbal statement. Given that this speaker statement was also a mand, in that it requested personal information, one may argue that the participants acquired pliance, not an intraverbal repertoire. That said, an intraverbal is defined by Skinner (1957) as a verbal operant that is occasioned by a verbal discriminative stimulus; thus, the acquired repertoire in this investigation also meets the criteria for an intraverbal, as all intraverbals incorporate a degree of listener responding. As such, it would perhaps be more descriptive to identify the acquired repertoire as a pliance/intraverbal.

A second limitation of the described research is the inclusion of only two participants. As previously mentioned, three participants were initially recruited for this investigation; however, one was discontinued due to excessive absences. It was at the point of participant discontinuation that the study design was changed from a multiple-baseline across participants to a multiple-baseline across behaviors/responses. This change in design accounts for why there were more than three stable baseline data points for the first response to be trained with the included participants. Additionally, the change in design allowed us to obtain three demonstrations of experimental effect per participant, despite having only two participants.

Additionally, within this research design we did not evaluate the generalization of the acquired repertoires. Generalization is an important consideration in all applied research endeavors and was neglected in the current investigation due to time constraints and the onset of a university break. Thus, future investigations of such repertoires should take into account the generative effects of the training. Finally, for the participants included within this investigation, it is evident that the use of generalized social reinforcement was effective for the acquisition and maintenance of the acquired repertoire. Future research may evaluate alternative reinforcement procedures in terms of the acquisition of an intraverbal repertoire. However, such alternative reinforcement should be used cautiously, as the definition of an intraverbal requires the consequence of generalized conditioned reinforcement (Skinner 1957).

Despite the limitations discussed, the current investigation does provide preliminary evidence for the effectiveness of the iPad® and the application Proloquo2Go™ as a SGD, beyond the mand repertoire. Additionally, the results demonstrate the effectiveness of a time delay with full physical prompting as an instructional strategy. Future research should continue to investigate such new SGD in terms of more advanced verbal operants. As this research base continues to expand, so can our use of such devices as an evidence-based best practice.