Reinforcement procedures are a cornerstone of applied behavior analytic programming for consumers with autism spectrum disorders. Because such consumers often have limited or no ability to vocalize or otherwise demonstrate choices, it is important that clinical staff demonstrate the skills necessary to conduct stimulus preference assessments (SPAs) as an objective means of identifying consumer preferences. The results of a recent survey suggest that the use of SPAs is a ubiquitous clinical practice (Graff and Karsten 2012a). This is not surprising considering the ongoing need to identify potential reinforcers for use during behavioral programming. Because SPAs are widely used in clinical practice, it may be important to develop strategies to teach clinical staff to implement SPAs with high fidelity. More specifically, the current study focuses on the single-stimulus (SS), paired-stimulus (PS), and multiple-stimulus-without-replacement (MSWO) assessments, as data from Graff and Karsten (2012a) indicate that these SPAs are commonly used by certified behavior analysts.

The SS preference assessment involves presenting various stimuli one at a time to a consumer and observing whether the consumer approaches each item (Pace et al. 1985). The PS preference assessment involves presenting stimuli in pairs and observing which stimulus in each pair is approached (Fisher et al. 1992). The MSWO assessment involves presenting a number of stimuli (e.g., six) in a horizontal array in front of the consumer and observing the order in which stimuli are approached; after a stimulus is approached, it is removed from the array, and the remaining items' positions are rotated before the array is presented again (DeLeon and Iwata 1996).

Regardless of type, at least four components are involved in the independent implementation of SPAs: (a) the selection of a SPA given certain consumer characteristics, (b) the identification of stimuli to use during the chosen SPA, (c) implementation of the SPA, and (d) scoring and interpreting data obtained during the SPA. Each of these components will be discussed in turn.

Given the availability of multiple SPAs, clinicians must select one assessment over the other available options. Although a clinician's decision will likely be influenced by a number of factors (e.g., history with each assessment, training), the consumer's skill repertoire and learning history are important considerations in the decision-making process. Along these lines, Karsten et al. (2011) outlined the strengths and limitations of the PS, SS, and MSWO preference assessments. The SS assessment is likely to identify multiple reinforcers and is useful for consumers who cannot scan an array or who respond negatively to vocal instructions. A limitation of this assessment is the possibility of identifying items as preferred when they are not (i.e., false positives). The PS assessment is likely to identify multiple reinforcers and accommodates larger tabletop items and a greater number of items. Limitations of this assessment are the possibility that positional biases may confound data and the longer time required to conduct the assessment relative to the SS and MSWO assessments. The MSWO assessment is likely to identify multiple reinforcers in a relatively shorter amount of time compared to the SS and PS assessments. However, results may be confounded by a positional bias, and only smaller items can be assessed due to the practical limitations of simultaneously presenting items in an array.

One option that may help guide clinicians in selecting the most appropriate SPA is a flow chart job aid. A flow chart job aid is a tool that can help lead clinicians through a decision-making process and has been suggested as an option to guide clinicians through selecting function-based treatments for escape-maintained problem behavior (Geiger et al. 2010).
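To make the logic of such a job aid concrete, the following Python sketch encodes one possible decision sequence based on the strengths and limitations summarized above. The branch order, criteria, and function name are illustrative assumptions; the sketch does not reproduce the job aid actually used in this study (Appendix E).

```python
# Hypothetical sketch of a flow-chart job aid for selecting an SPA.
# The criteria and their ordering are illustrative assumptions drawn from
# the strengths and limitations of each assessment summarized above.

def select_spa(can_scan_array: bool,
               responds_negatively_to_instructions: bool,
               items_are_large: bool,
               time_is_limited: bool) -> str:
    """Return a suggested SPA type for a given consumer profile."""
    if not can_scan_array or responds_negatively_to_instructions:
        # SS requires neither scanning an array nor following instructions.
        return "single-stimulus (SS)"
    if items_are_large:
        # PS accommodates larger tabletop items than MSWO.
        return "paired-stimulus (PS)"
    if time_is_limited:
        # MSWO typically requires the least time of the three.
        return "multiple-stimulus-without-replacement (MSWO)"
    return "paired-stimulus (PS)"

# Example: a consumer who can scan an array, tolerates instructions,
# uses small items, and has limited session time.
print(select_spa(True, False, False, True))  # -> MSWO
```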

After selecting an appropriate SPA, clinicians must identify stimuli to use during the assessment. Studies suggest that stimuli generated by caregivers or teachers result in the identification of more potent reinforcers compared to stimuli selected from a standard list (Cote et al. 2007; Fisher et al. 1996). These studies suggest that best practice is to ask caregivers or teachers to nominate stimuli to enhance the effectiveness of SPAs in identifying reinforcers.

Next, clinicians must conduct the preference assessment. Although no study has systematically evaluated the effect of low levels of integrity during the implementation of SPAs, implementing ABA technology with a high degree of integrity leads to improved outcomes (e.g., Carroll et al. 2013; Wilder et al. 2006). Thus, it seems prudent that SPAs be implemented with a high level of integrity to increase the likelihood that valid results are obtained.

Finally, as the ultimate goal of SPAs is to guide the selection of stimuli most likely to function as reinforcers, it is necessary that clinicians correctly score and interpret preference assessment data. If preference assessment data are improperly scored or interpreted, clinicians may make incorrect decisions regarding consumer preference, which may lead to less effective instructional arrangements.

In a search of the extant literature, we identified ten studies that trained staff to implement SPAs (Bishop and Kenzer 2012; Graff and Karsten 2012b; Lavie and Sturmey 2002; Lerman et al. 2004; Lerman et al. 2008; Pence et al. 2012; Rosales et al. 2015; Roscoe and Fisher 2008; Roscoe et al. 2006; Weldy et al. 2014). No previous study trained staff to implement all four components previously discussed. Considering each component individually: no study trained staff to choose the most appropriate type of SPA to conduct with consumers; one study reported training staff to identify stimuli to use during SPAs based on best practice (Bishop and Kenzer 2012); all ten studies trained staff to conduct one or more SPAs; and six studies trained staff to score and interpret data obtained from SPAs (Bishop and Kenzer 2012; Graff and Karsten 2012b; Rosales et al. 2015; Roscoe and Fisher 2008; Roscoe et al. 2006; Weldy et al. 2014). More studies are needed to identify training approaches that lead to staff implementing all of the components necessary to independently administer SPAs.

When considering the training approaches used in the previous studies, all included some combination of antecedent- and/or consequence-based approaches (i.e., instruction, modeling, rehearsal, feedback, or some combination of these). Furthermore, the majority of training procedures relied heavily on the presence of a staff trainer (Bishop and Kenzer 2012; Lavie and Sturmey 2002; Lerman et al. 2004; Lerman et al. 2008; Pence et al. 2012; Roscoe and Fisher 2008; Roscoe et al. 2006). Given the limited resources that may be available in clinical settings, it may be advantageous to adopt training approaches that require reduced staff trainer presence. One option may be to use a self-instruction package (SIP). Graff and Karsten (2012b) used a SIP to train 11 teachers to implement PS and MSWO preference assessments. The SIP consisted of a detailed data sheet and step-by-step instructions written without technical jargon and supplemented with diagrams.

Another possible training option, video modeling (VM), involves modeling skills the viewer should imitate and emit in appropriate situations (Catania et al. 2009). VM has been used in the SPA staff training literature as one component of a comprehensive training package (Bishop and Kenzer 2012; Lavie and Sturmey 2002) and has been used to successfully train staff to implement functional analysis methodology (Moore and Fisher 2007), problem-solving interventions (Collins et al. 2009), and discrete-trial instruction (Catania et al. 2009; Vladescu et al. 2012). VM is often combined with voiceover instruction to increase the saliency of certain portions of the video, and may be useful in settings where access to qualified supervisors may be limited (e.g., rural areas).

Two recent studies examined using VM to teach staff trainees to implement different types of SPAs. Weldy et al. (2014) used VM with voiceover instruction within the context of two 30-min PowerPoint presentations to train groups of staff trainees to implement both the MSWO and free-operant (FO) preference assessments. Rosales et al. (2015) used VM with embedded instructions (i.e., on-screen written instructions) to train staff trainees to implement the PS, MSWO, and FO preference assessments. Although both studies successfully trained staff to implement the targeted assessments, both used a separate video to train each individual type of assessment. Due to the similarity of steps among certain types of SPAs, it seems worthwhile to evaluate the extent to which one training video can be used to simultaneously teach multiple types of SPAs. As shown in Table 1, the SS, PS, and MSWO assessments share a number of steps; training might take advantage of this overlap by teaching multiple SPAs with a single training video.

Table 1 Steps for completing SS, PS, and MSWO preference assessments

The purpose of the current study was to extend the staff training literature by evaluating the effectiveness of VM with voiceover instruction for simultaneously training staff to conduct the SS, PS, and MSWO preference assessments, while also providing training on the four components necessary for the independent implementation of these three SPAs. Developing less resource-intensive training packages (e.g., VM) may make it easier to identify effective reinforcers in clinical settings. Additionally, the current study evaluated the extent to which staff trainees demonstrated generalized responding and whether correct implementation of the three SPAs maintained during a follow-up session. Finally, the validity of the goals, procedures, and outcomes was assessed.

Method

Participants

Two males and two females served as participants (hereafter referred to as staff trainees) in the study. The staff trainees ranged in age from 18 to 22 years and had 3 to 60 months of experience working with individuals with autism and other developmental disabilities. Each staff trainee (Susan, Rick, Jackie, and James) completed a 19-question multiple-choice pretest composed of questions about SPAs (M = 46 %; range, 39 to 53 %). Informed consent was obtained prior to participation in the study. Staff trainees were paid a stipend of 50 dollars for participating.

The first author served as the simulated consumer during baseline, training, and maintenance sessions. Six other female staff members working in the autism center where the study was conducted served as simulated instructors during these sessions. A 16-year-old male with autism served as the actual consumer during generalization sessions. The consumer had an extensive history of receiving services based on the principles of applied behavior analysis and of participating in SPAs, and he was currently receiving services at the university autism center where the study was conducted. He was chosen to participate because of his diagnosis and availability, and his caregiver signed a consent form for him to participate. Three of his female instructors at the autism center served as the actual instructors during generalization sessions.

Setting and Materials

All sessions were conducted in a university-based autism center. Each room contained a table, chairs, and the materials necessary for each session (i.e., data sheets, timers, calculators). Staff trainees and simulated consumers sat on opposite sides of the table, while the simulated instructor stood away from the table.

During sessions with the simulated consumer, the stimuli (i.e., toys and leisure items) for the session were located on a small bookcase. These stimuli consisted of three exemplars drawn from eight categories (i.e., vehicles, books, action figures, instruments, sports toys, building toys, stuffed toys, and construction toys). Four stimuli from four different categories were randomly selected to generate an instructor survey for use during each SPA. Additionally, the training video (described below) used in the study contained stimuli from four of the eight categories. These stimuli were never present during any sessions.

Generalization sessions with simulated consumers were conducted in rooms not associated with training at the autism center. During these sessions, 24 stimuli not associated with training from the eight categories were used. The simulated instructor generated a list of six stimuli from the 24 stimuli available during those sessions.

Generalization sessions with the actual consumer were conducted in his classroom in the autism center. During these sessions, actual instructors generated lists of stimuli from items specific to the actual consumer. Stimuli used during sessions with simulated consumers were not present during sessions with actual consumers.

The rooms also contained a video camera to record sessions. During training sessions, a laptop with headphones was used to play the training video for staff trainees.

Design and Measurement

A concurrent multiple-baseline across-participants design was used to evaluate the effects of video modeling with voiceover instruction (hereafter referred to as video modeling) on staff trainees' implementation of the three SPAs (i.e., SS, PS, MSWO). Video modeling continued until a staff trainee correctly implemented at least 90 % of the steps of each of the three assessments for two consecutive sessions. If a staff trainee mastered one SPA prior to the others, probes of that SPA were conducted every other session until mastery was achieved for all assessments.

Data for each session were collected from video using data sheets created for the study. The dependent variable was the percentage of steps implemented correctly by the staff trainee for each SPA (see Appendices A, B, and C for complete task analyses of the steps and definitions for the SS, PS, and MSWO assessments, respectively). As in Weldy et al. (2014), correct and incorrect implementation was scored only once per assessment for each step of the task analysis; thus, for a step to be scored as correct, the staff trainee was required to complete the step correctly across all possible opportunities during the assessment. The percentage of steps implemented correctly for each session was calculated by dividing the number of steps performed correctly by the total number of steps and multiplying by 100.
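As a minimal sketch of how this dependent variable and the mastery criterion could be computed, the following Python snippet assumes that each session yields one true/false record per task-analysis step (true only if the step was correct across all opportunities); the data and function names are hypothetical.

```python
# Minimal sketch: scoring the dependent variable and checking mastery.

def percent_steps_correct(steps: list[bool]) -> float:
    """Percentage of task-analysis steps implemented correctly."""
    return 100 * sum(steps) / len(steps)

def met_mastery(session_scores: list[float], criterion: float = 90.0) -> bool:
    """Mastery: at least `criterion` % correct on two consecutive sessions."""
    return any(a >= criterion and b >= criterion
               for a, b in zip(session_scores, session_scores[1:]))

session_records = [            # hypothetical per-session step records
    [True, False, True, True], # 75 % of steps correct
    [True, True, True, True],  # 100 %
    [True, True, True, False], # 75 %
]
scores = [percent_steps_correct(s) for s in session_records]
print(scores, met_mastery(scores))  # [75.0, 100.0, 75.0] False
```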

Interobserver Agreement and Procedural Integrity

Two observers independently scored data by viewing video of sessions for 47 % of Susan's, 42 % of Rick's, 63 % of Jackie's, and 47 % of James' sessions across conditions. For each step of the task analysis, an agreement was defined as both data collectors scoring its implementation identically, and a disagreement was defined as the data collectors scoring its implementation differently. Agreement was calculated as the number of agreements divided by the number of agreements plus disagreements, multiplied by 100. The mean interobserver agreement (IOA) scores were 99 % (range, 97 to 100 %) for Susan, 97 % (range, 90 to 100 %) for Rick, 97 % (range, 92 to 100 %) for Jackie, and 97 % (range, 93 to 100 %) for James.
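The agreement calculation can be expressed compactly. The sketch below assumes two observers' step-by-step records from the same session; it is illustrative only and mirrors the formula stated above.

```python
# Point-by-point agreement across the steps of one session.

def interobserver_agreement(obs1: list[bool], obs2: list[bool]) -> float:
    """Agreements / (agreements + disagreements) x 100."""
    agreements = sum(a == b for a, b in zip(obs1, obs2))
    return 100 * agreements / len(obs1)

print(interobserver_agreement([True, True, False, True],
                              [True, False, False, True]))  # 75.0
```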

Procedural integrity data were collected from video for 47 % of Susan’s, 42 % of Rick’s, 63 % of Jackie’s, and 47 % of James’ sessions across conditions. Data were collected using a checklist that listed the components of baseline and video modeling conditions. The percentage of correctly implemented components was calculated by dividing the number of correctly implemented steps by the total number of applicable steps, multiplied by 100. Mean procedural integrity scores were 100 % for Susan, 99 % (range, 94 to 100 %) for Rick, 97 % (range, 90 to 100 %) for Jackie, and 99 % (range, 92 to 100 %) for James. A second observer also collected procedural integrity data for 47 % of Susan’s, 42 % of Rick’s, 63 % of Jackie’s, and 47 % of James’ sessions for IOA purposes. Agreement was calculated as the number of agreements divided by agreements plus disagreements, multiplied by 100. Procedural integrity IOA was 100 % for Susan, 99 % (range, 94 to 100 %) for Rick, 98 % (range, 94 to 100 %) for Jackie, and 99 % (range, 94 to 100 %) for James.

Preference Assessments for Simulated and Actual Consumers

Sessions consisted of staff trainees choosing the appropriate SPA to conduct for three hypothetical consumers (according to hypothetical consumer descriptions); identifying items to use in each SPA; implementing the SPA; and summarizing and interpreting the results of the SS, PS, and MSWO preference assessments. Similar to previous studies (e.g., Graff and Karsten 2012b), modifications to the published procedures were made to specify how staff trainees should respond following specific consumer responses not explicitly addressed in previous publications (described below). Staff trainees evaluated consumer preference for four items in each assessment (except during generalization sessions with simulated consumers, during which six items were used). The relatively small number of items was selected to keep session duration manageable and to keep the number of trials per assessment roughly equal across sessions.

SS Assessments

The SS preference assessment was based on procedures described by Pace et al. (1985). During each session with a consumer, staff trainees provided an opportunity for the consumer to individually approach four different items across four trials. After identifying the items to use, the staff trainee was to place each item, one at a time, approximately 0.3 m in front of the consumer. If the consumer approached the item within 10 s, the staff trainee was to allow the consumer access to the item for 10 s. Following this access period, the staff trainee was to take the item back from the consumer and present the item for the next trial. If the consumer did not approach the item within 10 s, the staff trainee was to remove the item, wait approximately 3 s, and then re-present the item. If the consumer did not approach the item within 10 s during the re-presentation trial, the staff trainee was to remove the item and begin the next trial. Staff trainees were to record data between trials and to continue presenting items until all trials were completed.
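For readers who find the within-trial contingencies easier to follow in code, the sketch below models a single SS trial. The `approached` predicate is a hypothetical stand-in for observing whether the consumer approaches the item within the allotted time; it is not part of the published procedure.

```python
# Illustrative sketch of one SS trial, following the contingencies above.

import time

def run_ss_trial(item, approached, access_s=10, timeout_s=10, pause_s=3):
    """Return True if the item was approached on the initial presentation
    or on the single re-presentation; otherwise return False."""
    for presentation in (1, 2):
        # Place the item approximately 0.3 m in front of the consumer.
        if approached(item, timeout_s):
            time.sleep(access_s)  # allow 10 s of access, then retrieve the item
            return True
        if presentation == 1:
            time.sleep(pause_s)   # remove the item; wait ~3 s before re-presenting
    return False                  # no approach on re-presentation; begin next trial
```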

PS Assessments

The PS preference assessment was based on procedures described by Fisher et al. (1992). During each session with a consumer, the staff trainee was to present items two at a time, according to the data sheet provided. After identifying the items to use, the staff trainee was to place two items in a horizontal array approximately 0.3 m in front of the consumer and approximately 0.15 m apart, and then present the instruction "pick one." If the consumer approached an item within 10 s, the staff trainee was to allow the consumer to access the item for 10 s and remove the item that was not approached. Following the access period, the staff trainee was to retrieve the item from the consumer and present the next trial. If the consumer did not approach an item within 10 s, or approached both items simultaneously (i.e., touched both items at the same time) or consecutively (i.e., touched one item and then the other immediately after or while still touching the first), the staff trainee was to remove both items, wait approximately 3 s, and re-present the trial. If the consumer did not approach an item within 10 s, or again attempted to approach both items simultaneously or consecutively, on the re-presentation trial, the staff trainee was to remove the items and begin the next trial. Staff trainees were to record data between trials and to present items until all trials were complete.

MSWO Assessments

The MSWO preference assessment was based on procedures described by DeLeon and Iwata (1996). After identifying the items to use, the staff trainee was to place all items in a horizontal array approximately 0.3 m in front of the consumer and approximately 0.15 m apart from one another, and then instruct the consumer to "pick one." If the consumer approached an item within 10 s, the staff trainee was to place a divider blocking the non-approached items and allow access to the approached item for 10 s. After the 10-s access period, the staff trainee was to retrieve the item from the consumer and present the remaining items in an array in front of the consumer, rotating the left-most or right-most item to the farthest position on the opposite side of the array before presenting the next trial. If the consumer did not approach an item within 10 s, the staff trainee was to place the divider, wait approximately 3 s, and re-present the array of items. If the consumer attempted to approach more than one item (either simultaneously or consecutively), the staff trainee was to block the response, place the divider, and re-present the trial after approximately 3 s. If the consumer did not approach an item, or again attempted to approach more than one item, on the re-presentation trial, the staff trainee was to immediately terminate the assessment. Staff trainees were to record data between trials and to present items until all items were approached or until the consumer did not approach an item within 10 s on two consecutive presentations (in which case the assessment was terminated, and fewer trials than items were conducted).
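The array bookkeeping that distinguishes the MSWO assessment (removing each approached item and rotating an end item to the opposite side of the array) can be sketched as follows. The `pick` function is a hypothetical stand-in for the consumer's selection (returning the chosen item, or None if no item is approached within 10 s), and the termination rule follows the description above.

```python
# Sketch of the MSWO presentation loop described above.

def rotate(array: list) -> list:
    """Move the left-most item to the far right of the array."""
    return array[1:] + array[:1]

def run_mswo(items: list, pick) -> list:
    """Return items in the order approached; terminate after a trial with
    no approach on both the initial and re-presented arrays."""
    array, order = list(items), []
    while array:
        choice = pick(array) or pick(array)  # one re-presentation after ~3 s
        if choice is None:
            break                            # terminate the assessment
        order.append(choice)
        array.remove(choice)
        array = rotate(array)                # reposition before the next trial
    return order

# Toy example: a picker that always chooses the left-most item.
print(run_mswo(["ball", "book", "car", "drum"], pick=lambda arr: arr[0]))
# -> ['ball', 'car', 'book', 'drum']
```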

Assessment Scripts for Simulated Consumers

Simulated consumers responded during the SPAs according to predetermined scripts. Four scripts were created for each SPA. Each script specified the type of response (i.e., typical or atypical) the simulated consumer would provide during each trial of a session. On 50 % of trials, the simulated consumer engaged in a typical response, which consisted of approaching an item within 10 s of the presentation of an item (during the SS assessment) or of the staff trainee's instruction (during the PS and MSWO assessments; Graff and Karsten 2012b). Atypical responses occurred during the other 50 % of trials and included (a) attempts to approach more than one item (simultaneously or consecutively), (b) not approaching an item within 10 s of its presentation or the staff trainee's instruction, and (c) engaging in problem behavior or stereotypy. The scripts also required staff trainees to prompt the simulated consumer's hands to his lap on about 50 % of trials of each assessment. Simulated consumer scripts were chosen randomly without replacement before each session.
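A script meeting these constraints could be generated as in the sketch below; the trial count and response labels are illustrative assumptions rather than the scripts used in the study.

```python
import random

# Hypothetical generator for a simulated-consumer script: half of the
# trials are typical approaches, and half are drawn from the atypical
# response types listed above.

ATYPICAL = ["approach to multiple items", "no approach within 10 s",
            "problem behavior or stereotypy"]

def make_script(n_trials: int) -> list[str]:
    """Return a shuffled list of per-trial response types (50 % typical)."""
    half = n_trials // 2
    script = ["typical approach"] * half
    script += [random.choice(ATYPICAL) for _ in range(n_trials - half)]
    random.shuffle(script)
    return script

print(make_script(6))  # e.g., one possible script for a six-trial PS session
```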

General Procedure

Maximum allowed times to complete each of the four components of a session were yoked to the longest time required by three staff members at the autism center who were not associated with the study to complete the same sequence of components. Staff trainees entered the session room, which contained all the materials necessary for the session. An experimenter provided the staff trainee with three written descriptions of hypothetical consumers and stated, "Do your best to choose the most appropriate preference assessment for each consumer. I cannot answer any questions. Please let me know when you are finished." Each description contained information about a hypothetical consumer that should be used to determine the most appropriate SPA to conduct. Staff trainees needed to attend to the pertinent information in the description to decide which SPA to conduct (e.g., ability to pick from an array, number of stimuli to be assessed, size of stimuli to be assessed) and disregard extraneous information not relevant to choosing a type of SPA (e.g., name, age, gender, diagnosis). Each description was followed by a multiple-choice question listing five types of preference assessments (i.e., single-stimulus, paired-stimulus, MSWO, multiple-stimulus-with-replacement [MSW], and free-operant). Staff trainees were given 5 min to choose a type of preference assessment for each description.

Next, staff trainees were provided with a clear plastic bin; a folder containing the data sheets necessary to conduct the SS, PS, or MSWO assessment; and the materials needed to conduct the SPAs (i.e., pencil, calculator, divider for the MSWO). The order in which staff trainees conducted the three types of SPAs was determined by randomization without replacement. The folder contained an instructor survey, a data sheet specific to the SPA being conducted, and a calculation data sheet. The experimenter then stated, "Do your best at conducting a (type) preference assessment. I cannot answer any questions or give any feedback. Please let me know when you are finished."

The staff trainee was then given 1 min to administer the instructor survey (Appendix D). The survey was an open-ended survey that prompted the simulated consumer's instructor to generate a list of items. During sessions with the simulated consumer, the simulated instructor completed the survey using a predetermined list of items (i.e., four items, each from a different category). If the staff trainee did not administer the instructor survey within 1 min, the simulated instructor gave the staff trainee an already-completed instructor survey. Once survey results were obtained, the staff trainee was then to locate the items from the array of 24 items on a bookshelf in the room. If the staff trainee did not correctly obtain the items listed on the completed survey, place them in the plastic bin, and return to the table within 1 min, the simulated instructor instructed the staff trainee to leave the room. While the staff trainee was outside the room, the experimenter placed the items in a clear plastic bin and placed it by the staff trainee's chair. The staff trainee then returned to the room.

Staff trainees were to conduct four trials of the SS assessment, six trials of the PS assessment, and three or four trials of the MSWO assessment (depending on simulated consumer responses). Staff trainees recorded whether the consumer approached any presented item(s). During all three SPAs, the staff trainee was to ignore problem behavior (e.g., inappropriately playing with items, not returning items when the staff trainee retrieved an item) or stereotypy (e.g., hand flapping, body rocking) and continue with the assessment. On occasion during each of the three assessments, the staff trainee may have needed to re-present a trial due to certain responses by consumers (see above). The staff trainee was to only collect data on the re-presentation trial. At the conclusion of each trial, the staff trainee was to record which item was approached, if any, by circling the number corresponding to that item on the provided data sheet.

After administering the SPA, the staff trainee was to score the results of the preference assessment just conducted. Staff trainees were given the opportunity to score their actual data sheets; if a staff trainee did not collect data during a SPA administration, hypothetical data were provided so that the staff trainee still had an opportunity to score and interpret results. For each item, staff trainees were required to calculate the percentage of trials in which that item was approached. For all three SPAs, this was calculated by dividing the number of trials in which the item was approached by the total number of trials in which the item was presented and multiplying by 100. At the bottom of the calculation data sheet, staff trainees were asked to rank the items in order of preference (i.e., percentage of trials approached) and to identify the item that should be used for teaching (i.e., the item with the highest percentage of trials approached). If two or more items were ranked the same, the staff trainee was to pick any one of the items with the highest percentage of trials approached and write the name of that item in the appropriate location on the calculation data sheet.
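As a worked example of this scoring step, the sketch below computes the percentage of trials approached for each item and ranks the items; the item names and counts are hypothetical.

```python
# Hypothetical scoring example for one four-item assessment.

presented  = {"ball": 4, "book": 4, "car": 4, "drum": 4}  # trials presented
approached = {"ball": 3, "book": 1, "car": 4, "drum": 0}  # trials approached

percent = {item: 100 * approached[item] / presented[item] for item in presented}
ranking = sorted(percent, key=percent.get, reverse=True)

print(percent)     # {'ball': 75.0, 'book': 25.0, 'car': 100.0, 'drum': 0.0}
print(ranking[0])  # 'car' -> the item to use for teaching
```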

Baseline

During baseline, sessions were conducted as described above. At the beginning of each session, staff trainees were brought into the session room by the experimenter. The experimenter then provided the following instruction, “Do your best to choose the most appropriate preference assessment for each consumer. I cannot answer any questions. Please let me know when you are finished.” Staff trainees first read the descriptions presented to them, and then chose the most appropriate preference assessment to conduct based on each description. After choosing a preference assessment to conduct for each consumer description, staff trainees were then given the opportunity to complete each of the three SPAs. For each individual preference assessment, the staff trainee was to have the simulated consumer’s instructor complete an instructor survey asking which items the instructor predicted the consumer would work for, conduct the preference assessment, and score and interpret the results of that preference assessment. This was repeated until all three SPAs were conducted. The experimenter did not answer questions or provide feedback.

Video Modeling

Prior to each training session, staff trainees watched a video (19 min 28 s) depicting a simulated staff trainee conducting the three SPAs with a simulated consumer. It is important to note that the staff trainee watched the video only once prior to completing the SS, PS, and MSWO preference assessments. That is, the staff trainee watched the training video and then completed each of the preference assessments before watching the training video again. The only exception occurred when the staff trainee had performed above the mastery criterion for one of the preference assessments; in that case, he or she still watched the entire training video but was asked to complete only the preference assessments that had not yet been mastered. The video consisted of individual clips demonstrating how to select which preference assessment to implement (more specifically, the video explained how to use a job aid to guide the decision-making process of selecting a preference assessment; see Appendix E), how to survey the consumer's instructor to identify items to use during the administration of the SPAs, how to implement each of the SPAs, and how to score and interpret the results of each assessment. Voiceover instruction was used to increase the saliency of certain aspects of the video clips (script available from the second author). Steps that were consistent across multiple preference assessments (e.g., present item in correct location) were reviewed only once, but corresponding video clips demonstrated the step being implemented within the context of each assessment. For some steps, on-screen text and diagrams were shown to further clarify important aspects of the video clip being shown.

Within 5 min of watching the video, the staff trainee was brought into the session room to begin the session. Sessions were conducted as in baseline, except that staff trainees were also provided with a job aid (see Appendix E) containing a flowchart that depicted the decision-making process of choosing the most appropriate SPA to conduct. The experimenter did not provide feedback or answer questions.

Pre- and Post-Training Generalization Sessions

Single-session probes were conducted with an actual consumer during baseline and after the staff trainees met the mastery criterion with the simulated consumer. Each generalization probe consisted of the staff trainee choosing a type of preference assessment to implement for three novel descriptions of hypothetical consumers. Each staff trainee then conducted each of the three SPAs with the actual consumer. During generalization probes, the actual consumer's instructor completed the survey recommending items the actual consumer might work for.

Staff trainees also completed a generalization probe with a simulated consumer using a pool of six (rather than four) novel items for each assessment. This was done to more closely simulate the conditions of a preference assessment in the natural environment, in which more than four items would most likely be assessed. The experimenter acted as the simulated consumer in these sessions. Novel scripts were used in these generalization probes, and sessions were conducted in a room not associated with training. Staff trainees did not view the video prior to either type of generalization probe session. The job aid was also provided along with the novel hypothetical descriptions during post-training probes. No feedback was provided, and no questions were answered.

Follow-Up Probe

Follow-up data were collected 1 week after staff trainees met the mastery criterion with a simulated consumer. Follow-up sessions were conducted in the same manner as baseline sessions. Although the staff trainees did not have access to the training video following mastery and did not view the video prior to follow-up probes, they were provided with the job aid during the probe.

Validity

Content Validity

Prior to implementing the video training, three BCBAs with 18 months to 6 years’ experience administering SPAs viewed the video and completed a questionnaire evaluating the content of the training video. These informants were asked their opinion about the extent to which the video included all of the important steps for conducting the SS, PS, and MSWO preference assessments, and whether the video failed to address any important aspects relevant to conducting these assessments.

Social Validity

Following completion of training, the staff trainees were asked to confidentially complete a modified version of the Treatment Acceptability Rating Form-Revised (TARF-R; Reimers and Wacker 1988) to evaluate the acceptability of the training methods used in the study.

Additionally, the social validity of the study's outcomes was assessed by having 11 graduate students not involved in the study watch 30-s clips of sessions in which a staff trainee implemented steps of the SS, PS, and MSWO preference assessments with a simulated consumer both prior to and after training (Reeve et al. 2007). There was one pre-training and one post-training video for each type of SPA, for a total of six videos shown for each staff trainee. The order in which pre- and post-training videos were presented was randomized. The graduate students were asked to select the clip in which the staff trainee more competently conducted the SPA. Although all 11 graduate students had completed graduate coursework, they may have had little or no experience with the SPAs taught in the study.

Results

Figure 1 depicts the percentage of correctly implemented steps for the SS, PS, and MSWO preference assessments by the four staff trainees. During baseline, Susan (first panel), Rick (second panel), Jackie (third panel), and James (fourth panel) implemented low to moderate percentages of correct steps for the SPAs during sessions with the simulated consumer. Although there was some variability in the baseline data (particularly for James), all staff trainees implemented the SPAs well below the mastery criterion. During baseline probes with actual and simulated consumers, Susan did not correctly implement any steps during the three preference assessments; the other staff trainees demonstrated low to moderate, though variable, levels of correctly implemented steps.

Fig. 1

Percentage of correctly implemented steps for the single-stimulus (SS), paired-stimulus (PS), and multiple-stimulus-without-replacement (MSWO) preference assessments during baseline, video modeling, and no video modeling. Black-filled shapes represent sessions with simulated consumers, gray-filled shapes represent generalization sessions with simulated consumers, and open shapes represent generalization sessions with actual consumers

Following VM, all staff trainees demonstrated immediate and substantial increases in the percentage of correctly implemented steps for all three SPAs. Susan met the mastery criterion for the SS, MSWO, and PS preference assessments in 3, 5, and 6 sessions, respectively, and watched the training video six times (total viewing time: 116 min, 48 s). Rick met the mastery criterion for the SS, PS, and MSWO preference assessments in 3, 3, and 6 sessions, respectively, and also watched the training video six times (total viewing time: 116 min, 48 s). Due to time constraints related to completing training, it was necessary to provide performance feedback to Rick before the fifth MSWO preference assessment training session to address a consistent error of omission (i.e., not administering the survey and not retrieving the toys to use during the preference assessment). No other staff trainee received feedback. Jackie met the mastery criterion for the SS, MSWO, and PS preference assessments in 2, 2, and 3 sessions, respectively, and watched the training video three times (total viewing time: 58 min, 24 s). James met the mastery criterion in 2 sessions for all three preference assessments and watched the training video twice (total viewing time: 38 min, 56 s). During post-training generalization probes with the simulated consumer, all four staff trainees performed at or above the mastery criterion. During post-training generalization probes with the actual consumer, three staff trainees performed at or above the mastery criterion; James' performance during the first probe was below the mastery criterion for the SS preference assessment, so we conducted another probe, during which he performed above 90 %.

During 1-week follow-up probes, Susan, Jackie, and James all demonstrated responding at or above 90 %. Unfortunately, we were unable to obtain maintenance data for Rick.

With regard to content validity, the three BCBAs surveyed rated the video as containing all the steps necessary to correctly implement the SS, PS, and MSWO preference assessments. One BCBA recommended that additional information regarding the positioning of the items in the PS preference assessment (i.e., left- or right-hand side) be provided. This information was added to the training video prior to beginning VM.

Three of the four staff trainees completed the modified TARF-R questionnaire. The questionnaire involved rating ten items on a seven-point Likert scale (e.g., 1 = Not at All; 7 = Very Much). Staff trainees indicated that they understood the SPA procedures (M = 6), that they found the VM to be acceptable (M = 6, range 5–7), that they were willing to implement the SPAs (M = 6.7, range 6–7), that they thought there would be little disadvantage to using VM (M = 2.7, range 2–4), that it would not be costly to train staff using VM (M = 6.7, range 6–7), that it would not be difficult to implement these SPAs in their clinical work (M = 5.3, range 4–7), that VM was likely to make permanent improvements in behavior (M = 5, range 3–6), that they liked receiving the VM (M = 5, range 4–6), that they did not experience discomfort during VM (M = 4.7, range 4–6), and that they believed VM was effective for training staff to conduct SPAs (M = 5, range 3–6).

Finally, for the social validity of the outcomes, the 11 graduate students selected the post-training clip as demonstrating more competent implementation of the SPAs for 96 % of the pre-training/post-training video clip comparisons.

Discussion

The current study successfully used VM to train four staff trainees to select the most appropriate SPA to implement given hypothetical consumer characteristics; survey instructors to generate stimuli to use during SPAs; implement the SS, PS, and MSWO preference assessments; and score and interpret data collected during the assessment. In addition, the staff trainees exhibited generalized responding and performance that maintained 1 week after training. Staff trainees provided ratings that indicated acceptability of the VM training and graduate students largely indicated that staff trainees more competently implemented the SPAs following training.

The results of the current study are important for several reasons. First, except for one session with Rick, the presence of a staff trainer was not required during training. Although a trainer was required to assess whether staff trainees' performance was adequate and to re-administer the training video if performance remained below criterion, using a video to train staff may reduce the need for a trainer to be present. This is important because a staff trainer may not be available to provide SPA training in some applied settings; training methods that do not require the presence of a staff trainer may be one way to circumvent such situations. The current study adds to the small, but growing, body of literature (i.e., Graff and Karsten 2012b; Rosales et al. 2015; Weldy et al. 2014) suggesting that some individuals may not require the presence of a trainer in order to implement a procedure with fidelity.

The second important aspect was that staff trainees were trained to implement the four components seemingly necessary to independently conduct SPAs. Prior to the present study, no studies had trained staff to implement these components together. By training staff in these four components, it may be more likely that staff will not only correctly implement SPA methodology when working with consumers, but they will also select the most appropriate type of SPA for a given consumer.

The third important aspect of the current study was that staff trainees were simultaneously trained to implement three different types of SPAs using a single video. By taking advantage of the similarities in steps among these SPAs, it was possible to train staff to implement three SPAs simultaneously. Such training approaches are advantageous because they reduce the need to create separate training materials for different preference assessments, such as those used by Weldy et al. (2014) and Rosales et al. (2015). Future research could further explore this approach to training by evaluating the extent to which one set of training materials (e.g., one training video) can train staff to implement behavioral procedures that involve similar components (e.g., direct teaching strategies, functional analysis conditions).

Similar to previous research (Graff and Karsten 2012b), staff trainees' responding in the current study generalized from simulated to actual consumers. This was not unexpected, as we programmed for generalization. First, we exposed staff trainees to a wide variety of stimuli to use during SPAs; by including different categories of stimuli and multiple exemplars in each category, we made it more likely that staff trainees would successfully implement the SPAs with novel stimuli. Second, we exposed the staff trainees to a variety of potential consumer responses during sessions and, in the training video, provided training on how to respond appropriately to those responses. Lastly, we programmed common stimuli by creating instructor surveys, data sheets, and calculation data sheets that shared similar features across sessions with simulated and actual consumers.

Correct implementation of the three SPAs maintained during 1-week follow-up probes for three staff trainees. Unfortunately, we were unable to obtain these data for Rick. These follow-up data, which suggest that VM may promote maintenance of the skills taught, align with previous research showing short-term maintenance of skills taught using VM (e.g., Catania et al. 2009). Future studies could examine maintenance of SPA skills for longer periods of time following training.

The current study contributes information regarding the content validity of the training procedures used, as well as the acceptability of VM as a training procedure. Prior to implementing VM, three BCBAs rated the training video as having all of the steps necessary for the implementation of the SS, PS, and MSWO preference assessments. To our knowledge, this evaluation of content validity has not been conducted in any published studies on training staff to implement SPAs. In addition, no published studies have included data on the staff trainees’ opinions of VM as a training procedure. In our study, staff trainees provided mostly positive ratings of VM. Taken together with the favorable training results, these findings suggest that VM is not only an effective means for training staff, but that staff may also enjoy VM. More research is needed to determine the relative acceptability of VM compared to other training procedures (e.g., behavioral skills training, self-instruction packages).

To our knowledge, this is the first published study that trained staff to identify pertinent characteristics in written descriptions of hypothetical consumers, and then select which SPA was most appropriate to conduct with that consumer. Giving staff trainees a tool (i.e., the job aid) that can be used to identify important consumer characteristics to consider when choosing SPAs may lead to clinicians being more likely to collect meaningful data about the preferences of the consumers they are working with. Although we did not train staff to select a SPA based on the characteristics of actual consumers with whom they work, these efforts represent an important step forward in the staff training literature. Future research should continue to identify methods to train staff to competently complete the components needed to independently use behavioral procedures so these individuals achieve higher levels of autonomy in their work. For example, future research could evaluate the possibility of training staff to select the most appropriate type of data collection procedures to use given certain characteristics of consumer problem behaviors or to select an appropriate sequence of functional analysis conditions to conduct given certain types of data or problem behavior.

Three of the four staff trainees in the study learned to implement three different SPAs from viewing a single training video; however, one staff trainee required feedback regarding the correct implementation of one step (i.e., administering the instructor survey and retrieving the toys identified in the survey). Although feedback has been described as an essential component of staff training (Reid and Fitch 2011), the majority of staff trainees in the current study demonstrated mastery without feedback from the experimenter. These results align with recent research that suggests that performance feedback may not be a necessary component of training for some individuals (Catania et al. 2009; Collins et al. 2009; Moore and Fisher 2007; Rosales et al. 2015; Vladescu et al. 2012; Weldy et al. 2014). More research is needed to identify under what specific conditions feedback may be necessary when training staff.

It could be argued that the baseline condition in the current study did not represent an appropriate comparison condition. Some previous staff training studies (e.g., Moore and Fisher 2007; Roscoe et al. 2006; Vladescu et al. 2012) administered written instructions to staff trainees prior to any sessions being conducted, whereas others examined the effects of training against a baseline condition in which staff trainees did not receive any written instructions prior to training (e.g., Bishop and Kenzer 2012; Lavie and Sturmey 2002). Across these studies, baseline performance both with and without written instructions was below criterion levels, which suggests that written instructions alone are not an effective method for increasing performance to adequate levels. In the current study, the experimenters chose not to include written instructions during baseline for two reasons. First, staff in our clinic do not usually have access to written instructions when conducting SPAs; thus, we omitted written instructions to better approximate conditions at our clinic. Second, we wanted to evaluate the effects of VM training in the absence of other training procedures. Had we exposed staff trainees to written instructions, the observed results might have been attributable to some interaction between the written instructions and the VM. Moreover, nearly all staff trainees demonstrated some correct responding during baseline, suggesting that baseline was a fair comparison for the VM condition.

There are several avenues for future research based on the methodology and results of the current study. Although staff trainees were exposed to written descriptions containing hypothetical consumer characteristics during training, they were not taught to assess and obtain those characteristics through their own observations, as would be required in actual clinical environments. Future research could teach staff which behaviors to look for when determining a consumer's skill repertoire and then how to choose an appropriate SPA based on those observations. Future research could also examine training additional types of SPAs. Although the current study trained three types of SPAs that may be useful for a wide array of consumers, other types of SPAs (e.g., free-operant, duration-based assessments) are likely to be more appropriate for some consumers than the ones trained in the study. It is feasible that these other types of SPAs share similar components as well, making it likely that training for them could be combined into a single training package.

As noted earlier, although the current study sought to devise a method to reduce the need for the presence of a trainer, a trainer was necessarily present to identify when staff trainees met the mastery criteria or when additional training was needed. To address this concern, future research could examine self-monitoring of procedural integrity to further reduce the need for trainers to be present during the majority of training.

Although it was beyond the scope of the current study, it would be important to determine whether accurate SPA choice improves consumer responding. Future research could compare consumer responding when an appropriate SPA is conducted to consumer responding when a SPA is conducted without taking consumer characteristics into account.

Finally, future research could more specifically evaluate which features of VM training are most important for properly training staff. The video in the current study contained multiple exemplars of possible staff trainee and consumer behaviors, a voiceover script, and on-screen text and figures to make certain aspects of the video more salient. It is possible that some of these features were more effective than others, or that certain features were distracting or confusing. Future studies could conduct a component analysis of these features, as well as examine the effectiveness of different video durations and the effectiveness of VM for training groups of staff.