According to the Centers for Disease Control and Prevention, one in 54 children aged 8 years meets the criteria for a diagnosis of autism spectrum disorder (ASD), which occurs across ethnic, racial, and socioeconomic groups (Maenner et al. 2020). Children with ASD experience deficits in social communication (American Psychological Association 2017) and also marked difficulties with, or an absence of, spoken language (Mirenda and Iacono 2009). In fact, between 25 and 30% of children with ASD present with little or no functional speech even after years of intervention (Anderson et al. 2009; Rose et al. 2016; Tager-Flusberg and Kasari 2013). Receptive language issues, however, have received relatively less attention (Dada et al. 2020; Sevcik 2006), even though children with ASD may have considerable difficulty understanding language (Mechling and Hunnicutt 2011). Finally, children with ASD may also experience difficulties with executive function such as following the steps of an activity and transitioning between activities and settings (e.g., Quill 1997).

Visual supports have the potential to help alleviate difficulties in each of the above areas (Quill 1997). As an umbrella term, visual supports include a variety of visuals that target a range of outcomes. Rutherford et al. (2019) noted that what is considered a visual support under this umbrella is sometimes unclear and the boundaries of the construct are often ill-defined. In broad terms and for purposes of this paper, visual supports are either aided (i.e., involving something external to the body) or unaided in the pursuit of a variety of learner outcomes. On the aided side, visual supports include static or dynamic two-dimensional representations or three-dimensional objects presented using low-tech means (e.g., a paper-based activity schedule) or high-tech means, such as dedicated (e.g., speech-generating devices) or repurposed technologies (e.g., iPads, smartwatches); on the unaided side, visual supports involve gestures, manual signs, fingerspelling, and the like that can be produced by the learner or communication partner without any external aid.

Several systematic reviews have documented the efficacy of visual supports for individuals with ASD (Arthur-Kelly et al. 2009; Banda and Grimmett 2008; Bellini and Akullian 2007; Knight et al. 2015; Lequia et al. 2012). These reviews, however, emphasize visual schedules and video modeling. Although both visual schedules and video modeling have been declared “evidence-based” (National Autism Center 2015; Steinbrenner et al. 2020), the narrow range in type of visual supports as well as potential outcomes in the cited reviews is not reflective of visual supports as an umbrella term (see also Rutherford et al. 2019). Furthermore, the breadth of visual support strategies and potential outcomes had not been considered within a unifying assessment and intervention approach.

The Visual Immersion System™ (VIS™) is a proprietary clinical strategy aimed at improving expressive and receptive communication and language proficiency as well as functioning in natural environments for individuals with moderate-to-severe ASD. The principles of the VIS™ are contained in a clinical textbook (Shane et al. 2014). The VIS™ includes three instruction modes: (a) the Visual Instructional Mode (VIM), in which visual supports are used to aid comprehension and are imposed as an alternative to, or in conjunction with, speech; (b) the Visual Organizational Mode (VOM), in which visual supports are used to represent the organization of an activity, routine, script, or schedule; and (c) the Visual Expressive Mode (VEM), in which visual supports are used to aid expressive communication. Additionally, the instructional focus is not restricted to expressive and receptive language training but also provides guidance using a plethora of visual supports that help clarify meaning around temporal concepts, academic subjects, and activities of daily living.

In terms of overall significance, the VIS™ builds upon the processing strengths of individuals with ASD in the visual modality (Althaus et al. 1996; Ashwin et al. 2009; Shah and Frith 1993; Thaut 1987), benefits of the visual modality for learning (Bellini and Akullian 2007; Mesibov et al. 1994; Quill 1997), and their proclivity for electronic screen media (Charlop-Christy et al. 2000; Shane & Albert 2008; Shane et al. 2012). This is bolstered by instructional mode-specific theories of change: Strategies within the VIM (e.g., video modeling, dynamic scene cues) are grounded in social learning theory (Bandura 1977) and augmented input (e.g., Allen et al. 2017) or alternative input; augmented and alternative input permits children to supplement or replace having to glean information from spoken language depending on their respective language skills. Strategies within the VOM benefit from some of the same processes in addition to visuals offering a sustained referent whereas spoken language is ephemeral (Hogdon 1995). Strategies within the VEM are largely those that are associated with augmentative and alternative communication (AAC) systems and modalities such as speech-output technologies (Schlosser and Koul 2015), nonelectronic communication boards, and manual signs (Mirenda and Iacono 2009).

At the heart of the VIS™ approach is a visually rich language environment analogous to the immersive environment created for using sign language with individuals who are deaf (Adamo-Villani et al. 2006) and those who acquire a second spoken and/or written language (Cummins 1998). In this immersive and broad-based approach to visual supports, instruction is focused on seven operations that include multiple pragmatic functions (i.e., protesting, requesting, directives, commenting, questions, and social pragmatics) as well as instructions affecting organization/transitions. The visual supports incorporated within the VIS™ are created, stored, and accessed using a wide variety of tools that may be low-tech, mid-tech, or high-tech in nature (Shane et al. 2012). Examples of low-tech tools are paper-based topic displays, activity schedules, graphic symbols on flashcards, etc. Mid-tech tools are simple speech-output technologies such as the BigMac, Step-by-Step, and GoTalk4. High-tech tools include both dedicated speech-generating devices such as NovaChat, and/or repurposed consumer-level technologies such as smartwatches, smart speakers, and iPads running AAC-specific applications (e.g., TouchChat and GoTalk NOW).

Many visual supports and instructional components used in the VIS™ enjoy varying degrees of empirical support, including scene cues (Dauphin et al. 2004; Mechling and Gustafson 2008; Pierce and Schreibman 1994; Remner et al. 2016; Schlosser et al. 2013), visual schedules (Knight et al. 2015; Watanabe and Sturmey 2003), animated graphic symbols (Harmon et al. 2014; Schlosser et al. 2019; Schlosser et al. 2012, 2014), smartwatches (O’Brien et al. 2016, 2020; Schlosser et al. 2017), smart speakers (Allen et al. 2018; Yu et al. 2018), and speech-output technologies (Schlosser and Koul 2015). Likewise, several instructional methods used in the VIS™ have been found effective, including video modeling (Bellini and Akullian 2007; Hong et al. 2016), augmented input (e.g., Allen et al. 2017), and miniature linguistic training (Curiel et al. 2020; Nigam et al. 2006). The effectiveness of the VIS™ as a treatment package, however, remains unstudied.

In order to effect change in students with ASD served in classroom settings, research teams can either work directly with the students or use a train-the-trainer model. While the former approach typically ensures high-fidelity implementation of an intervention, there is a danger that the attained results do not maintain once the research team leaves the setting. A train-the-trainer approach, on the other hand, has the potential to achieve not only greater maintenance in targeted students but also to subsequently reach a larger number of nontargeted students (Andzik and Cannella-Malone 2017; Schlosser et al. 2000, 2006; World Health Organization 2005). A systematic review indicated that train-the-trainer models are effective for children with ASD across a variety of outcomes including cognition, language, and autism symptoms (Shire and Kasari 2014). Hence, the current study employed a train-the-trainer approach.

Recognizing that workshops by themselves are rarely sufficient to result in practice changes (e.g., Joyce and Showers 2002), coaching was chosen as a “supportive professional development practice” (Reinke et al. 2014, p. 151) for training the school team in the implementation of the VIS™. Coaching improves the fidelity with which teachers implement empirically supported treatments in classrooms (Graves Kretlow and Bartholomew 2010). Coaching interventions have also met with some success with teachers who serve children with ASD. For example, Suhrheinrich (2011, 2015) conducted several studies in which they successfully coached special educators to implement pivotal response training in classrooms. Coaching interventions targeting interdisciplinary school teams (e.g., involving special educators and speech-language pathologists or other related service professionals) (Schlosser et al. 2000), however, are rarer.

Given the physical distance between the classroom and the trainer team, using telehealth was a natural choice for delivering the coaching-based intervention. Within the speech-language context, telepractice has been defined as “the application of telecommunications technology to deliver professional services at a distance by linking clinician to client, or clinician to clinician for assessment, intervention, and/or consultation” (American Speech and Hearing Association n.d., p. 1). Telehealth interventions in ASD produce a range of benefits for individuals with ASD as documented in a systematic review by Sutherland et al. (2018). Yet, only one of the 14 included studies targeted special educators serving students with ASD (Ruble et al. 2013). In the current study, telehealth was supplemented by in-person consultation and coaching.

Although the ASD field can increasingly rely on interventions that are evidence-based (e.g., National Autism Center 2015; Steinbrenner et al. 2020), a secondary analysis of this research base revealed that only a small fraction of these interventions have been studied in school settings (Martin et al. 2020). Those interventions that have been studied in school settings tend to be (a) carried out in schools designed for students with disabilities rather than public schools that serve both students with disabilities and typically developing peers and (b) researcher-implemented rather than practitioner-implemented. With the emergence of implementation science, there is a growing recognition that research must not only tackle the efficacy of interventions (i.e., controlled environments, ideal conditions, researcher-implemented) but also the effectiveness of interventions (i.e., natural settings, typical conditions, practitioner-implemented). Interventions developed under ideal conditions often do not get adopted in school-based settings because “…these interventions were not developed for implementation with the barriers and constraints (e.g. number of staff, quality of training, level of expertise)…” (Locke et al. 2019, p. 136) that may be present in public school settings. This supports the argument that these interventions should be studied in school settings from the outset (i.e., early on in the development of these interventions) (Kasari and Smith 2013) as well as early in the research progression toward becoming an evidence-based intervention. Additionally, mixed-methods approaches (i.e., the deliberate combined use of quantitative and qualitative methodologies; Johnson et al. 2007) are perhaps uniquely suited to examine the effectiveness of school-based interventions because they enable demonstrations of effectiveness while also elucidating the perspectives of relevant stakeholders regarding implementation issues. Especially in the early stages of research progression, the perspectives gleaned on implementation can feed back into the improvement of emerging school-based interventions. It is therefore not surprising that mixed methods are gaining increasing traction for the study of school-based implementation of interventions (Dickson et al. 2020; Fairweather et al. 2016; Locke et al. 2016).

Using a mixed-methods approach, the purpose of this proof-of-concept study was to (a) evaluate the effectiveness of a coaching-based intervention on implementing the VIS™ with an interdisciplinary school team supporting seven children with ASD in a self-contained classroom of an elementary school and to (b) assess its social validity by relevant stakeholders. In terms of effectiveness, we aimed to evaluate the effects of the coaching intervention on child outcomes and school team outcomes. Child outcomes pertained to the seven operations/functions targeted by the VIS™ at an individual and aggregate level using both objective criterion-referenced measures (i.e., goal attainment scaling; Kiresuk et al. 1994) and indirect measures (i.e., Communication Matrix; Rowland 2011). In terms of school team outcomes, we were interested in changes in perceived self-efficacy. Perceived self-efficacy refers to “beliefs in one’s capabilities to organize and execute the courses of action required to produce given attainments” (Bandura 1977, p. 3). Specifically, these beliefs “are an individual’s perception of confidence in his or her ability to perform a specific task” (Pasupathy and Bogschutz 2013, p. 151). Here, we are interested in whether the school team’s confidence in implementing the VIS™ changes over time. In terms of social validation (Schlosser 1999), indirect relevant stakeholders provided subjective evaluation of the acceptability of the VIS™ intervention using a questionnaire. Furthermore, stakeholder groups (speech-language pathologists and the special educator, educational assistants, and parents) participated in separate focus groups to assess the social validity of the goal method, the implementation of the intervention, and the attained outcomes.

Method

Participants

A newly formed self-contained classroom for children with ASD in the northeastern United States in a suburban public elementary school agreed to implement the VIS as the routine practice method of choice. Hence, a convenience sample of participants included all students in that self-contained classroom and some of the staff supporting the students. In terms of students, seven children, aged 6 to 8 years, with moderate-to-severe autism, participated in the study. Participant characteristics at pretest are summarized in Table 1 in terms of diagnosis, gender, chronological age, Communication Matrix score, autism severity score and classification, and primary communication method. Additional participants were the interdisciplinary school team, consisting of one special educator (SpED), two speech-language pathologists (SLPs), and one educational technology specialist. Educational assistants (EAs) serving the classroom and mothers of the participating children participated in a postintervention focus group but were not part of the coaching intervention. Table 2 provides a summary of the characteristics of the interdisciplinary school team and EAs. The Institutional Review Board of the first author’s university approved this project.

Table 1 Participant characteristics at pretest
Table 2 Demographic characteristics of school team

Procedures

Setting

The study was conducted in a newly formed self-contained classroom for children with ASD that was part of a suburban elementary school (K–4) in the State of New York, USA. The school’s total population was 408 students and approximately 15% of the student body qualified for a free or reduced lunch. About 77% of students were Caucasian, 10% were Asian, ~ 4% African-American, ~ 4% Hispanic, and 5% were from other minorities. The classroom was staffed by a special educator, seven EAs, and two SLPs who were both based in the class part-time; each was in the classroom for 1 ½ h in the morning and afternoon, respectively. Additionally, the district’s Director of Instructional Technology was a member of the team and assisted the teachers with hardware, software, video creation, and other on-site technological support. The coaching sessions conducted via telepractice involved the interdisciplinary school team in the classroom (SLPs, SpED) and the trainer/research team from Boston, Massachusetts.

Experimental Design

A mixed-methods approach was utilized to evaluate the effectiveness of the coaching-based intervention, consisting of both quantitative research methods (i.e., pretest–posttest design, quantitative direct measures [goal attainment scaling], and quantitative indirect measures [parent report of autism severity, staff and parent report of expressive communication skills, treatment acceptability, self-efficacy]) and qualitative measures (i.e., focus group interviews) (Johnson et al. 2007). In terms of the specific mixed-methods approach, quantitative and qualitative methods were implemented concurrently, but greater emphasis was placed on quantitative research (Christensen et al. 2014). The purpose of mixing quantitative with qualitative methods was to gain complementarity in that the focus group interviews were aimed to elaborate, enhance, and illustrate the results of some of the quantitative measures (e.g., treatment acceptability, fidelity of implementation) (Greene et al. 1989).

A one-group pretest–posttest design (Hegde and Salvatore 2019) was used to evaluate the effectiveness of the coaching intervention on student outcomes and perceived self-efficacy by staff. Repeated outcome measurements (i.e., goal attainment scaling) were taken consisting of one pretest administered at the beginning of the school year, one probe after 4 months, and a third probe 2 months later. It is not uncommon to encounter the use of one-group pretest–posttest designs at the proof-of-concept stages of research progression toward establishing empirically supported treatment packages for classrooms with students with ASD (e.g., Locke et al. 2018; Schwartz et al. 2004). Additionally, the social validity of the coaching intervention and the VIS™ was assessed.

Experimental and Control Conditions

VIS Assessment Battery

Under the guidance of a research assistant (RA), the school team completed the VIS assessment battery as delineated in Shane et al. (2014) for each of the students. The assessment results provided the basis for developing subsequent goal areas.

Pretests and baseline probe

Goal attainment was evaluated at baseline for each of the goals developed for participants. Additionally, the Communication Matrix, the CARS2-QPC, the Self-Efficacy Inventory for Implementing the Visual Immersion System™, and the adapted Treatment Evaluation Inventory-Short Form (TEI-SF) were administered.

Coaching Intervention

Prior to the onset of this study, the school staff had received a workshop on the VIS™ by the second author. Consistent with adult education strategies, workshops by themselves are insufficient for evoking actionable learning; actionable learning is achievable if workshops are supplemented with ongoing coaching (National Research Council 2001; Odom et al. 2013; Schlosser et al. 2006). According to Snyder et al. (2015), effective practice-based coaching involves (a) planning goals and action steps, (b) engaging in focused observation, and (c) reflecting on and sharing feedback. This is what was implemented in this classroom-based coaching intervention.

During scheduled biweekly videoconferences (30–45 min), researchers fielded questions from school staff and provided training in intervention procedures. Topics addressed included problem behaviors, prioritizing goal areas, technology applications and approaches, classroom design, curricular issues and applications, and materials creation. Between videoconference connections, researchers interacted with staff via email or telephone to answer questions or provide brief clarifications of procedures. For the first 3 months, these exchanges occurred once or twice per week; for the remainder of the school year, the virtual interactions lessened to once per week. In the beginning, additional phone conferences were held between one of the researchers and the school SLPs to assist with goal writing (wording, criteria, etc.). There were three on-site visits by the researchers during the school year. Each visit consisted of a meeting between researchers and school staff, plus observations of the students by the researcher team. Additionally, the first meeting included a meeting with parents, and the third meeting included focus groups with parents, SpED/SLPs, and EAs.

Goal-Specific Intervention and Fidelity of Implementation Assessments

The school team was asked to work on the goals with the same frequency and intensity with which they would normally target IEP goals. Staff were also supplied with treatment fidelity checklists for each of the targeted goals and were asked to self-monitor fidelity (Belfiore et al. 2008; Smith et al. 2007). The percentage of correctly followed steps was calculated by dividing the number of steps completed by the total number of steps and multiplying by 100.
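The fidelity calculation described above can be illustrated with a minimal sketch. The function name and the example checklist counts are hypothetical and are not taken from the study's checklists.

```python
def fidelity_percentage(steps_completed: int, total_steps: int) -> float:
    """Percentage of checklist steps that were carried out as planned."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if not 0 <= steps_completed <= total_steps:
        raise ValueError("steps_completed must lie between 0 and total_steps")
    # Multiply first so that whole-number percentages stay exact.
    return 100 * steps_completed / total_steps


# Hypothetical checklist: 9 of 10 steps followed correctly.
print(fidelity_percentage(9, 10))  # 90.0
```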

One additional goal per participant was targeted for an assessment of treatment fidelity by an independent RA (Schlosser 2002). To do so, one intervention session of one goal each (involving five trials) per participant was videotaped. Using goal-specific fidelity checklists, the RA viewed the videotapes and scored whether the procedures were carried out as planned.

Intervention probes and interobserver agreement checks

Four months into the intervention, goal attainment was evaluated for each of the goals across participants. Additionally, the staff completed the self-efficacy scale once more. Approximately 2 months later, goal attainment was measured once more along with the Communication Matrix, the adapted TEI-SF, and the Self-efficacy Inventory for Implementing the Visual Immersion System™.

Interobserver agreement (IOA) checks on goal attainment scores were completed for 33.33% of randomly selected probes from baseline and intervention probes. There were 22 goals in total with three probes each for a total of 66 probes. For three probes (from three goals), probe 2 was not implemented because the best expected outcome (i.e., + 2) had been reached in probe 1, reducing the total number of probes to 63. An independent RA reviewed videotapes of 21 probes (33.33%). Prior to watching a videotape, the RA reviewed the goal definitions.
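The probe counts reported above can be retraced in a few lines. This is only a sketch: the counts come from the text, whereas the sampling helper, its name, and the seed are assumptions for illustration, because the randomization procedure is not described.

```python
import random


def administered_probes(goals: int, probes_per_goal: int, skipped: int) -> int:
    """Number of probes actually administered."""
    return goals * probes_per_goal - skipped


def sample_for_ioa(total_probes: int, share: float, seed: int) -> list:
    """Hypothetical helper: draw a random subset of probe indices for review."""
    k = round(total_probes * share)
    return sorted(random.Random(seed).sample(range(total_probes), k))


total = administered_probes(goals=22, probes_per_goal=3, skipped=3)
selected = sample_for_ioa(total, share=1 / 3, seed=0)  # the seed is arbitrary
print(total, len(selected), round(100 * len(selected) / total, 2))  # 63 21 33.33
```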

Focus Groups

At the end of the intervention, three separate focus groups were conducted with the mothers of participating children (n = 5, of 7 mothers), the EAs supporting the participating children (n = 7, of 7), and the SLPs and SpED teacher (n = 3, of 3). The first and the seventh author served as moderator and assistant moderator, respectively. The moderator led the discussion, whereas the assistant moderator recorded the session and handled logistics. The interviews were videotaped for subsequent transcription and analyses.

Measures

Table 3 offers a summary of what measures were taken at what measurement occasions (baseline, probe 1, and probe 2) and who the respective respondents were. Each of the measures is explained as follows.

Table 3 Measures taken across three measurement occasions and respondents

Goal Attainment Scaling

Goal attainment scaling (GAS) was used as an objective and direct criterion-referenced outcome measure. GAS is a technique for evaluating individual progress toward goals while also allowing for aggregation across individuals (Kiresuk et al. 1994; Ruble et al. 2012; Schlosser 2004). Unlike traditional goals, GAS features a gradation of goal attainment, ranging from − 2 (worst expected outcome) to + 2 (best expected outcome), and the following attainment levels in between: − 1 (worse than expected outcome), 0 (expected outcome), and + 1 (better than expected outcome) (see ESM Appendix A for an example). Goals are meant to be calibrated in such a manner that the expected outcome level is commensurate with what the school team realistically anticipates accomplishing given the context surrounding this goal. As a measurement and goal-setting technique, GAS features these attributes (Schlosser 2004): (a) comparability across goals and participants through aggregation; (b) adaptability to any levels and domains of the International Classification of Functioning, Disability, and Health; (c) versatility across populations and interventions; (d) linkage tied to expected outcomes; (e) facilitator of goal attainment; and (f) a focal point for team energies.

After participating in a 2-h GAS workshop, the school team was provided with a template for developing goals using the GAS format along with a set of three example goal matrices from an earlier project. Following this orientation, they developed three to four preliminary goals per student that reflected the principles of the VIS™ instructional approach. The researcher team reviewed the draft goals, paying particular attention to the documented potential pitfalls in formulating GAS matrices, including overlapping levels, gaps between levels, and multidimensional scales (Becker et al. 2000; Smith 1994). Upon receiving the feedback, the school team made revisions and resubmitted a second round of draft goals to the research team. For the second set of goals, the research team focused on the underlying operational definitions for determining attainment at each level as well as the procedures for making these assessments. As a final step, the school team approved the proposed definitions, upon which they were finalized. ESM Appendix A provides an example of a GAS matrix along with underlying definitions and procedures. The school team was asked to reach a consensus on both the perceived importance and perceived difficulty of each goal, using a four-point scale ranging from 0 = not at all to 3 = extremely. Once the goals and measurement aspects were finalized, the school team was instructed to proceed with baseline measurements (see ESM Appendix B). The same measurements were repeated 4 months and 6 months into the intervention, respectively.

CARS2-QPC

The Childhood Autism Rating Scale 2—Questionnaire for Parents or Caregivers (CARS2-QPC) (Schopler et al. 2010) is part of the larger CARS2 family of scales (Schopler et al. 2010) and a measure of autism severity based on indirect report, in this case by parents. The CARS2 family of measures has sound psychometric characteristics (Vaughan 2011). One parent per child completed the CARS2-QPC once at baseline to provide an indication of autism severity at the onset of the project.

Communication Matrix

Parents and the two SLPs completed the Communication Matrix for each participant at pretest and posttest. The Communication Matrix is an on-line assessment tool that queries respondents about 24 states, functions, and intents with a focus on expressive skills in any modality (Rowland 2011; Rowland and Fried-Oken 2010). These include (a) A1: Expresses discomfort, (b) A2: Expresses comfort, (c) A3: Expresses interest in other people, (d) B1: Protests, (e) B2: Continues action, (f) B3: Obtains more of something, (g) B4: Attracts attention, (h) C1: Refuses/rejects, (i) C2: Requests more action, (j) C3: Requests new action, (k) C4: Requests more object, (l) C5: Makes choices, (m) C6: Requests new object, (n) C7: Requests absent objects, (o) C8: Requests attention, (p) C9: Shows affection, (q) C10: Greets people, (r) C11: Offers/shares, (s) C12: Directs your attention, (t) C13: Polite social forms, (u) C14: Answers yes/no questions, (v) C15: Asks questions, (w) C16: Names things/people, and (x) C17: Makes comments. Some of these states, functions, and intents repeat (for a total of 80 cells) across two or more of the seven levels of representation listed on the left-hand side of the profile: level 1 (pre-intentional behavior), level 2 (intentional behavior), level 3 (unconventional communication [presymbolic behavior]), level 4 (conventional communication [presymbolic behavior]), level 5 (concrete symbols), level 6 (abstract symbols), and level 7 (language).

Across the bottom of the profile, the cells are further organized into four major reasons to communicate, namely, to (a) refuse things that a learner does not want (7 cells), (b) obtain items that a learner wants (31 cells), (c) socially interact with others (28 cells), and (d) provide or seek information (14 cells) for a total of 80 cells.

Items are scored using a 3-point scale that designates (a) mastered or surpassed (2 points), (b) emerging (1 point), and (c) not used (0 points). Total scores range from 0 to 160 (2 × 80 cells). Some parents received support from a research assistant to navigate the tool, whereas the SLPs completed the matrix independently. The Communication Matrix has been shown to be sensitive to change as a result of a year-long intervention as documented by the developer’s work involving children with pervasive developmental disorders (Rowland and Schweigert 2002) and a group of youngsters with a variety of severe and multiple disabilities (Rowland and Fried-Oken 2010).
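The scoring scheme just described can be sketched as follows. The cell counts and point values are those given in the text; the dictionary keys, the function name, and the example profile are hypothetical and do not reproduce the on-line tool's own implementation.

```python
# Cells per reason to communicate, as listed in the text (80 cells in total).
CELLS_PER_REASON = {"refuse": 7, "obtain": 31, "social": 28, "information": 14}

# Points per rating on the 3-point scale.
POINTS = {"not used": 0, "emerging": 1, "mastered": 2}

TOTAL_CELLS = sum(CELLS_PER_REASON.values())   # 80
MAX_TOTAL = max(POINTS.values()) * TOTAL_CELLS  # 160


def total_score(ratings):
    """Sum the points over one rating per cell."""
    if len(ratings) != TOTAL_CELLS:
        raise ValueError(f"expected {TOTAL_CELLS} ratings, got {len(ratings)}")
    return sum(POINTS[r] for r in ratings)


# Hypothetical profile: 20 cells mastered, 30 emerging, 30 not used.
profile = ["mastered"] * 20 + ["emerging"] * 30 + ["not used"] * 30
print(total_score(profile), "of", MAX_TOTAL)  # 70 of 160
```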

Treatment Acceptability

The two SLPs, the SpED, and the seven EAs completed an adapted version of the TEI-SF (Kelley et al. 1989) at baseline and again postintervention. The original TEI-SF is a 9-item rating of treatment acceptability with strong evidence of internal consistency and validity (Kelley et al. 1989; Njardvik and Kelley 2008). In our adaptation, two items were added to the TEI-SF to more specifically address the VIS™ as the target treatment, resulting in an 11-item instrument. As with the original TEI-SF, the items on the adapted form are scored using a 5-point Likert scale, with 1 representing “strongly disagree” and 5 equaling “strongly agree” on items 1 to 5 and 7 to 11. Item 6 is reverse scored because it is negatively worded. The TEI-SF is interpreted based on the total score. A “moderate” acceptability would result from a midpoint rating of 3 on each item, or a total score of 33 across the 11 items.
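A minimal sketch of how a total score on the adapted 11-item form could be computed is given below. The reverse scoring of item 6 and the 1 to 5 scale follow the text; the function name and the example responses are hypothetical.

```python
from typing import Sequence


def tei_sf_total(ratings: Sequence[int],
                 reverse_items: Sequence[int] = (6,),
                 scale_max: int = 5) -> int:
    """Total score with the negatively worded item(s) reverse scored."""
    total = 0
    for item_number, rating in enumerate(ratings, start=1):
        if not 1 <= rating <= scale_max:
            raise ValueError(f"item {item_number}: rating out of range")
        if item_number in reverse_items:
            rating = scale_max + 1 - rating  # 1 <-> 5, 2 <-> 4, 3 stays 3
        total += rating
    return total


# A midpoint rating of 3 on every item yields the "moderate" total of 33.
print(tei_sf_total([3] * 11))  # 33

# Hypothetical respondent: strongly agrees everywhere, strongly disagrees on item 6.
print(tei_sf_total([5, 5, 5, 5, 5, 1, 5, 5, 5, 5, 5]))  # 55
```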

Self-Efficacy

On three occasions (baseline, probe 1, and probe 2), the school staff (SpED, SLPs, and EAs) completed a self-efficacy scale that was developed specifically for this project. As such, its use in this study is exploratory. The Self-efficacy Inventory for Implementing the Visual Immersion System™ is a 48-item checklist grouped into the following categories: (a) technology, (b) assessment, (c) strategies, and (d) operation-specific (e.g., related to requesting). To develop this checklist, we followed the guide to the construction of self-efficacy scales by Bandura (2006). Consistent with these guidelines, the response scale ranged from 0 (cannot do at all) to 100 (highly certain can do) in 10-point intervals, with a midpoint of 50 (moderately certain can do).

Focus Groups

Following probe 2, focus groups were conducted. Consistent with focus group methodology (Greenbaum 1998; Krueger and Casey 2014), interview guides were prepared for each of the focus groups (SLPs and SpED, EAs, and parents) in order to guide the discussion (see ESM Appendix C for one of the guides). The school team (SLPs and SpED) interview guide included guiding questions around these themes: (a) definition of the VIS™, (b) VIS™ implementation and future, (c) goal setting and measurement, (d) impact on their role, (e) trainer team support, and (f) VIS™ outcomes. The EA interview guide included questions pertaining to the (a) definition of the VIS™, (b) VIS™ implementation and the future, (c) impact on their role, and (d) VIS™ outcomes. The interview guide for parents targeted (a) the definition of the VIS and (b) VIS™ outcomes.

Data Analyses

GAS raw scores were converted to T-scores using the formula demonstrated by Turner-Stokes (2009) and applied by others (Raghavendra et al. 2018). A one-way repeated measures ANOVA was carried out to establish any differences between baseline and intervention probes.
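For readers unfamiliar with the conversion, the T-score formula described by Turner-Stokes (2009) can be sketched as follows. The sketch assumes attainment levels coded from −2 to +2, the conventional inter-goal correlation of 0.3, and optional goal weights (for example, importance × difficulty); the function name and the weighting choice are illustrative assumptions rather than the analysis code used in this study.

```python
from math import sqrt

def gas_t_score(levels, weights=None, rho=0.3):
    """Convert one person's GAS attainment levels into a T-score.

    levels: attainment level per goal, each in {-2, -1, 0, 1, 2}.
    weights: optional weight per goal (defaults to 1 for every goal).
    rho: assumed correlation between goal scores (0.3 by convention).
    """
    if not levels:
        raise ValueError("at least one goal is required")
    if weights is None:
        weights = [1] * len(levels)
    if len(weights) != len(levels):
        raise ValueError("one weight per goal is required")
    weighted_sum = sum(w * x for w, x in zip(weights, levels))
    denominator = sqrt((1 - rho) * sum(w * w for w in weights)
                       + rho * sum(weights) ** 2)
    return 50 + 10 * weighted_sum / denominator
```

Under this formula, a person who attains the expected level (0) on every goal receives a T-score of 50, and three unweighted goals that all sit at −2 yield a T-score of about 22.6.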

The total scores (0 to 160) of the Communication Matrix pre- and posttest for the participants as a group were compared using a nonparametric Wilcoxon signed-ranks test due to the small sample size (Wilcoxon 1945). Consistent with Rowland (2013), individual participant data were further analyzed descriptively as follows: (a) primary level—the level at which a student has the highest percentage of cells scored as mastered; (b) highest level—the level beyond the primary level that contains at least one intent with mastery; and (c) pattern-mastery—the percentage of mastered intents within each of four reasons to communicate: to refuse (7 cells), to obtain (31 cells), for social interaction (28 cells), and for information (14 cells).
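The sketch below illustrates the Wilcoxon signed-ranks procedure on paired pre/post totals. It computes the positive and negative rank sums and an exact two-sided p value by enumerating all sign assignments, which is feasible for samples as small as seven. It is an assumption-laden illustration: it does not reproduce the Z approximation reported in the Results, the function name is hypothetical, and any scores passed to it here are made up.

```python
from itertools import product

def wilcoxon_signed_rank(pre, post):
    """Return (W_plus, W_minus, exact two-sided p) for paired scores."""
    # Zero differences carry no sign information and are dropped.
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    # Exact p: share of the 2**n sign assignments at least as extreme as w.
    total = sum(ranks)
    extreme = 0
    for signs in product((0, 1), repeat=n):
        s = sum(r for r, keep in zip(ranks, signs) if keep)
        if min(s, total - s) <= w:
            extreme += 1
    return w_plus, w_minus, extreme / 2 ** n
```

For five hypothetical participants who all improve by different amounts, for example `wilcoxon_signed_rank([10, 20, 30, 40, 50], [11, 22, 33, 44, 55])`, the negative rank sum is 0 and the exact two-sided p value is 2/32 = 0.0625, which shows how little room very small samples leave for reaching conventional significance levels.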

The self-efficacy data were analyzed to determine changes across the three measurement occasions (baseline, probes 1 and 2) using the mean rather than the total score. Self-efficacy analyses typically use the total score. However, because several of the EAs chose the N/A option for several of the items, it was decided to use the mean in order to keep the data for the trained school team and the nontrained EAs comparable. Due to the small sample (n = 3), the self-efficacy data for the trained school team were analyzed using a nonparametric Friedman test. Self-efficacy data for the EAs were not analyzed statistically because they were not the direct targets of the intervention.
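The reason for preferring the mean over the total can be made concrete with a short sketch. It assumes that an N/A response is recorded as `None` and that rated items use the 0 to 100 scale described earlier; the function name and the example responses are hypothetical.

```python
def mean_self_efficacy(responses):
    """Mean rating over answered items; N/A responses (None) are skipped."""
    rated = [r for r in responses if r is not None]
    if not rated:
        raise ValueError("no rated items")
    return sum(rated) / len(rated)
```

With the made-up responses `[40, None, 100, 70]`, the mean is 70.0 whether or not the N/A item is present, whereas a total score would be lowered by every skipped item and would therefore not be comparable across respondents.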

The focus group videotapes were transcribed verbatim by research assistants blind to the purpose of the study; the transcripts served as the basis for subsequent analyses. The constant comparative method (Corbin and Strauss 1990) was used to analyze the transcripts. To do so, the first author read the entire transcript once and then re-read it, grouping data into small units that seemed to belong together and assigning each a temporary unique label. As appropriate, the smaller units were then placed into broader categories, which either came from the interview guide (and the moderator’s line of questioning) or emerged from the data. A second analyst independently engaged in the same process, and the analysts subsequently compared notes with the goal of establishing a consensus on the broad categories, the subthemes supporting these categories, and the best underlying quotations in support of each subtheme.

Results

IOA and Fidelity of Implementation

IOA was calculated as the number of agreements divided by the number of agreements plus disagreements, multiplied by 100: 16/(16 + 5) × 100 = 76.2%. The five disagreements were distributed across five participants and each reflected a 1-level discrepancy between the school team and the independent observer. Self-monitoring of fidelity resulted in a mean of 100% implementation fidelity across seven goals for the seven participants. The observations by the research assistant also resulted in 100% implementation fidelity.
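The percent-agreement calculation can be checked with a minimal sketch; the function name is illustrative.

```python
def percent_agreement(agreements, disagreements):
    """Interobserver agreement as a percentage of all scored comparisons."""
    total = agreements + disagreements
    if total == 0:
        raise ValueError("no comparisons to score")
    return agreements / total * 100
```

With 16 agreements and 5 disagreements, `round(percent_agreement(16, 5), 1)` gives 76.2.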

Goal Attainment Scaling

Inferential Findings

A one-way repeated measures ANOVA was conducted to evaluate whether the GAS scores of the students as a group improved over time as a result of the coaching intervention. There was a significant effect of the coaching intervention as an independent variable, Wilks’ lambda = 0.188, F(2, 5) = 10.776, p = 0.015. Table 4 provides the means, standard deviations, and confidence intervals across probes (baseline, probes 1 and 2), whereas Table 5 contains multiple paired t tests with a Bonferroni correction to keep the type I error at 5% overall. There was a significant difference in the scores between baseline (M = 25.571, SD = 1.166) and probe 1 (M = 37.929, SD = 4.013), baseline and probe 2 (M = 51.70, SD = 5.666), and between probe 1 and probe 2. From the “Mean difference” column, it is evident that probe 1 and probe 2 scores are significantly higher than baseline scores. This suggests that the participants with ASD, as a group, made significant progress on individualized goals as a result of the VIS™ coaching intervention.
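The Bonferroni correction applied to the pairwise comparisons can be illustrated as follows. This sketch assumes the common formulation in which each raw p value is multiplied by the number of comparisons (capped at 1) and then compared with the overall alpha; the function name and the p values in the usage note are hypothetical and are not taken from Table 5.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p values and significance flags."""
    m = len(p_values)
    # Multiplying by m is equivalent to testing each raw p against alpha / m.
    adjusted = [min(1.0, p * m) for p in p_values]
    significant = [p < alpha for p in adjusted]
    return adjusted, significant
```

For three comparisons (baseline vs. probe 1, baseline vs. probe 2, probe 1 vs. probe 2), each raw p value is effectively tested against 0.05/3 ≈ 0.0167. With made-up raw values of 0.001, 0.02, and 0.5, only the first remains significant after adjustment.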

Table 4 GAS scores at baseline and probes 1 and 2: means, standard deviations, and confidence intervals
Table 5 Post hoc pairwise comparisons of probes in terms of GAS scores

Descriptive Group and Individual Findings

For the participants as a group, 22 goals spread across seven participants were assessed at baseline, 4 months into the intervention (probe 1), and 6 months into the intervention (probe 2). Table 6 provides a summary of the 22 goals in terms of participants, VIS™ goal targeted, importance, difficulty, and performance levels across the three probes as well as total change. Overall, the intervention resulted in improved performance at probe 2 for 16 of the 22 goals (72.7%). For these 16 goals, the improvement ranged from 1 level (4 goals) to 2 levels (3 goals), 3 levels (4 goals), to 4 levels (5 goals). Given that the expected level of performance (i.e., level 0) is considered the gold standard of desired change in the GAS literature (Kiresuk et al. 1994), 14 out of 16 goals that showed improvement (87.5%) or 14 out of a total of 22 goals (63.6%) attained this standard. From baseline to probe 1 (4 months), the intervention improved performance on 12 goals (54.5%), whereas 10 goals showed no improvement (45.5%) (three of the improved goals reached the maximum level at + 2 and were no longer targeted for probe 2). From probe 1 (4 months) to probe 2 (6 months), additional improvements were noted for 12 goals out of 19 goals (63.2%). Four out of 22 goals (18.2%) showed no improvement. The total change varied, ranging from a mean change of 0.33 levels (Jack) to 3.33 levels (Brian).

Table 6 Goal attainment levels across participants and probes along with associated changes

Descriptive Findings for Functions and Expressive or Receptive Focus

The data displayed in Table 7 show the GAS data grouped by communicative function, beginning with those that are expressive followed by those that are receptive. Half of all goals (n = 11) were focused on expressive and the other half (n = 11) on receptive language skills. The summed change (of GAS score levels) across all expressive goals yielded 29 (with a mean of 2.64 per goal), whereas the summed change across receptive goals yielded 13 (with a mean of 1.18 per goal). Ignoring the social pragmatics goal (there was only one), most progress was made for requesting (+ 3.33), followed by commenting (+ 2.14), directives (+ 1.5), and answering WH questions (+ 0.8). Given the expected change in levels for GAS goals is + 2 (assuming a − 2 baseline) resulting in an attainment of the expected level (0), most communicative functions met or nearly met this expectation. This also suggests that the goals were appropriately calibrated.

Table 7 Goal attainment levels across probes, summed change levels, and mean change levels grouped by VIS™ goal, and expressive/receptive status

Communication Matrix

Inferential Findings

Table 8 provides the pretest and posttest results for the Communication Matrix as completed by SLPs and parents. For the SLPs, the pretest scores ranged from 40 to 128 with a mean of 77, while the posttest scores ranged from 59 to 150 with a mean of 98.1. A Wilcoxon signed-ranks test indicated that the median posttest ranks were significantly higher than the median pretest ranks, Z = − 2.02, p = 0.04. For the parents, the pretest scores ranged from 6 to 144 with a mean of 86.1 and the posttest scores ranged from 56 to 146 with a mean of 92.3. For parents, a Wilcoxon signed-ranks test indicated that the median posttest ranks were not significantly different from the median pretest ranks, Z = − 1.78, p = 0.75.

Table 8 Communication Matrix scores (maximum score of 160) by parents and speech-language pathologists

Descriptive Group Findings

Table 9 provides data for each of the four types of scoring of the Communication Matrix by SLPs, including the (a) total score, (b) primary level, (c) highest level, and (d) reason-pattern score. To derive a profile based on the total score, 5 of 7 participants improved from pre- to posttest, whereas two participants were nonresponders (i.e., Katie, Jack). Among the 5 that did improve, Mary may be described as a “medium responder,” while all others were “high responders” (Brian, David, Peter, and Tom).

Table 9 Communication Matrix total score, primary level, highest level, and reason-pattern across participants pre- and posttest

At pretest, four participants were operating at level 2 and three at level 3 as their primary level (i.e., the level with the highest percentage of cells marked as mastered). At posttest, four participants had improved their primary level, whereas three participants had not, including Jack, who had been identified as a nonresponder on GAS goals. Three of the four participants who improved did so by one level, while one participant (i.e., Brian) improved by 5 levels; this seems extraordinary, although one has to keep in mind that his pretest scores at these levels were high to begin with. In terms of the highest level, defined as the level at which at least one intent was mastered, three participants began at level 4 (including Jack), two participants at level 6, and two participants at level 7. At posttest, three participants improved their highest level by at least one level and four participants did not improve, including Jack, who remained at level 4 as his highest level. Two others who did not change levels had reached the ceiling with a score of 7 at baseline.

In terms of the four major reasons to communicate, all but one participant (Jack) showed across-the-board improvements. Participants tended to communicate less for social interaction and providing or seeking information than expressing refusal and obtaining desirable things at pretest. Although this pattern tended to remain consistent at posttest, three participants who did not provide or seek information at pretest made some gains here as well. Furthermore, two participants (Brian and Peter) achieved a balanced pattern in that they communicated for all four reasons.

Comparison of Communication Matrix with GAS Goals

“Requests new objects (C6)” was the only matrix function that matched one of the GAS goals, namely the requesting GAS goal for three of the participants (David, Tom, and Katie). At first glance, the “makes comments” (C17) function on the matrix appears to correspond to the GAS commenting goals of each participant, but closer examination reveals that the former is focused on spontaneous comments only, whereas the GAS goals targeted prompted comments. In Table 10, the pre- and posttest scores attained on the matrix for each level of representation are compared to those obtained on the GAS goals. The matrix results for Jack corroborate his status as a “nonresponder” on GAS goals. Specifically, there was no improvement across all four types of matrix analyses (total score, primary level, highest level, and reason-pattern).

Table 10 Communication Matrix and GAS pre- and posttest scores for requesting new objects

Treatment Acceptability

Results from the adapted TEI-SF for the SLPs/SpED and the EAs are summarized in Table 11. The data for the two SLPs and the SpED indicate that the intervention was perceived as highly acceptable both before (i.e., 47.6 out of 55) and after the intervention (i.e., 44.5), corresponding roughly to an average rating of 4 (“agree”) for each of the 11 items. Although statistical analyses were not carried out due to the small n, it seems that participating in the intervention did not significantly alter their perception.

Table 11 Treatment acceptability total scores (TEI-SF adapted—minimum score of 11 and a maximum of 55) at pretest and posttest

The data for the EAs show an overall lower acceptability rating at both pretest and posttest. A much larger standard deviation is noticeable for this group. Upon closer inspection, the mean score at pretest was influenced by one outlier who rated the intervention as 11 at pretest, in effect marking “1” on every item. At posttest, however, this EA indicated substantively higher acceptability. Similar to the SLPs/SpED, the data for the EAs did not change in a meaningful manner from pretest to posttest, negating the need to conduct a statistical analysis.

Self-Efficacy

Descriptive self-efficacy ratings of the school team (i.e., SLPs, SpED) that received the coaching intervention are presented in Table 12. Participants seemed to have increased their perceived self-efficacy from baseline to probe 2 within each of four subindices as well as the overall VIS™ self-efficacy index. Across indices, the increases were more pronounced from baseline to probe 1 compared to increases from probe 1 to probe 2. A nonparametric Friedman test of differences among repeated measures (baseline, probe 1, probe 2) was conducted and rendered a chi-square value of 29.18 which was significant (p = 0.01).

Table 12 Self-efficacy ratings by speech-language pathologists and special educators

Descriptive self-efficacy ratings of the educational assistants are presented in Table 13.

Table 13 Self-efficacy ratings by educational assistants

A nonparametric Friedman test of differences among repeated measures (baseline, probe 1, probe 2) was conducted and rendered a chi-square value of 55.53 which was significant (p = 0.02). Consistent with the trained team, the increases were more prominent from baseline to probe 1 than from probes 1 to 2.

Focus Group Results

School Team Focus Group

The trained school team, consisting of the SpED and two SLPs, reported several positive child outcomes and school team outcomes as a result of participating in the VIS™ coaching intervention (see Tables 14 and 15).

Table 14 Positive child outcomes
Table 15 Positive school team outcomes

Although the majority of comments were indicative of positive child outcomes, the school team did identify one participant as a nonresponder to the VIS™ coaching intervention and attributed this to the student’s learner characteristics as well as concomitant diagnosis:

I can say that the least responsive, I would say J because he auditorily, I do not think that’s a strength either but it’s a little bit stronger than the visual. The visual, it’s very hard to keep him in tune. And even last year, he was one of the ones that was working outside this program, it was very difficult. I just think he does not have it, you know the interest in the videos, you know, as a classic child with autism

SLP1: It’s just his profile is a little bit different. Special Educator: Yeah, his profile is a little different. SLP2: Right, he has more than just autism but, like his joint attention visually is not as strong. Special Educator: Well and that’s what we need to keep in mind. Our two lowest students have other coexisting conditions, it’s not just autism, so...

Educational Assistant Focus Group

Four major themes emanated from the transcripts involving the EAs, namely, (a) perceived improvements as a result of the intervention, (b) understanding of the VIS™, (c) perceived challenges in implementing the VIS™, and (d) future supports. In terms of improvements, there was widespread agreement that the children made considerable communication gains: “I think with our non-verbal children, in my opinion I – I have seen a big increase in their communication.”

Two of the seven EAs, however, had a limited or incorrect understanding regarding the nature of the VIS™. This first quote shows that there was a tendency by some to devalue augmentative modalities in favor of speech, and that is not the intent of the VIS™:

But a lot of times I say to her ‘use your words’ as well as device. So, I think she needs an extra prompt. But she will do it but it’s a re- it’s something she needs to be reminded. To do. Because otherwise she will rely, more on the device than she will her own words.

The following statement illustrates a lack of understanding the importance of providing augmented input—that is visual supports plus natural speech:

I took out the visuals and pointed to them. Without using any words and she got right up.

One of the perceived challenges for implementing the VIS™ pertained to missed opportunities to fade visual supports when they are no longer needed, resulting in a perceived loss of fluidity of unfolding events:

And, but when it comes to like directions to the next uhm, uhm, place (that) she needs to be. I do not feel she needs it. So, a lot of times we are told you know “Get the book. Show her the book,” but I feel like when she unders- She’s moving – She is going. She is going to – She is out – she is going already and now we have to, you know get the book and, you know it’s just (stops herself). But, I get it, there is a reason for that, but I find for her its not as useful. ..it can be frustrating for her.

Other perceived challenges related to the lack of training in new technologies (e.g., smartwatches) and breaking through years of programmed dependence on adults in order to strive toward independence with the help of technology (here, a smartwatch):

Because, it’s a child who’s trained since, probably preschool. To stay with the adult that is with him. So, he had to relearn that. So, that’s right. He’s gotten a little better. At first, he would not He would just stand there and say “Are you coming.” And I would just say, do what the watch says.

In terms of future supports, the EAs wished that they had been better informed about each of the students’ goals: “So yeah knowing what their goal is – its gonna be huge.”

Parent Focus Group

Three major themes emerged from the analysis of the parent transcript, namely, (a) perceived outcomes, (b) visual supports implemented at home, and (c) desired future supports. In terms of outcomes, there was consensus among the parents that their children improved in terms of spoken language as exemplified by the following quotes: “the verbal part has just really blossomed a lot” and “I think she is a lot more verbal, coming up with her own ideas and not just choosing from choices that are given.” Additionally, there was pervasive support for improvement in directive-following as illustrated by this statement: “That [Smartwatch, added by author] helps him a lot to be independent. Cause he is so used to being with an adult all the time wherever he goes. He is hesitant a bit but then his carrying on whatever command is being given to him he carries on and following that.” Finally, at least two parents reported that their children appeared to be calmer.

When asked by the moderator whether they had tried any of the VIS™ strategies at home, several parents mentioned the use of video modeling and reported that they found it to be effective. The provision of more video modeling materials was also viewed as a desirable support going forward: “I think I’d like videos of somebody modeling something so I could show Jack…continue at home in doing like daily living skills or chores.” Additionally, the parents wished to receive more frequent videos of their children documenting progress.

Discussion

The purpose of this mixed-methods proof-of-concept study was to (a) evaluate the effectiveness of a coaching-based intervention on implementing the VIS™ with an interdisciplinary school team supporting seven children with ASD in a self-contained classroom of an elementary school and (b) assess its social validity as judged by relevant stakeholders. This study provided “proof-of-concept” that a coaching-based intervention on the VIS™ is feasible in an authentic educational setting, that is, a classroom serving students with ASD, based on favorable outcomes, reliable data collection by classroom staff, and adequate fidelity of implementation. The data suggest that the intervention was effective in improving the language and communication outcomes of participating students and the self-efficacy of classroom staff.

Goal Attainment

GAS served as the primary outcome measure for documenting change. Statistical analysis suggests that the participants, as a group, improved across the three measurement occasions and across the 22 goals. This suggests that the VIS™ coaching intervention was effective for this group of students with ASD. This application of GAS adds to its growing role in evaluating outcomes in settings serving children with ASD (Ruble and McGrew 2013) and school-based interventions more generally (Fairweather et al. 2016). In 2007, Steenbeek et al. concluded their critical review of GAS in pediatric settings as follows: “…, current knowledge about its reliability when used with children is insufficient. There is a need for further development of GAS and its application for children of different ages and disabilities, across therapists of different disciplines” (p. 550). This study answers this call to action in multiple ways. First, our data show the measurement of attainment levels occurred reliably for this group of children; considering the IOA was between practitioners and trained research assistants rather than between research assistants only, the IOA is in the acceptable range. Second, to our knowledge, this may be the first application of GAS as part of a classroom-based intervention for children with ASD. Third, this study contributes goal-specific treatment fidelity data that are often unreported in studies using GAS; goal-focused instruction was implemented with fidelity based on self-monitoring and independent monitoring.

The descriptive data showed that more progress was made on expressive goals compared to receptive goals. Two data points may help contextualize this finding. First, unlike typical development, some children with ASD show greater impairment in receptive relative to expressive language (e.g., Hudry et al. 2010; Kover et al. 2013; McDaniel et al. 2018). Second, a recent meta-analysis found that while early naturalistic developmental behavioral interventions were very effective in developing expressive language in preschoolers with ASD, they were less effective in improving receptive language abilities (Tiede and Walton 2019).

Descriptive analyses of goal attainment by communicative function revealed that the function of “requesting” yielded the greatest level of improvement. This is not surprising given that requesting emerges early in typical language development and is a motivating function to teach to children with developmental disabilities because it allows them to exert immediate control over their environment (Sigafoos et al. 2003). Answering WH questions, on the other hand, showed the least progress among the targeted functions. This may be explained by the complexity of the psycholinguistic process underlying the answering of questions (Shane et al. 2014) and empirical evidence regarding the difficulties associated with teaching this function to children with ASD and other developmental disabilities (Daar et al. 2015).

At an individual level, changes ranged widely. A range is to be expected given the heterogeneity of children with ASD, but the question has to be raised as to why Jack was the only student who may be considered a nonresponder to the VIS™ coaching intervention based on the GAS data. There are a number of possible explanations. To begin with, Jack was one of only two children who had a concomitant diagnosis along with ASD and exhibited relatively frequent problem behaviors. Given that Beckwith–Wiedemann syndrome features physical symptoms rather than cognitive ones, it seems unlikely, however, to be an underlying reason for his performance. Additionally, a plausible explanation rests with the fact that requesting, a relatively “easy-to-teach” and immediately reinforcing communicative function, was not among his targeted goals; instead, his goals included the answering of WH questions, directive-following, and commenting, which arguably tend to be on the more difficult-to-teach side.

Communication Matrix

The significant difference between pre- and posttest total scores by SLPs indicates that the Communication Matrix might be a tool that is sensitive to changes as a result of a classroom-based exposure to the VIS™ and the VIS™ coaching intervention. This is an exploratory finding within the confines of this proof-of-concept study. However, the findings add to the previously documented sensitivity by its developer (Rowland 2011; Rowland and Fried-Oken 2010). The individual data presented a more nuanced picture with the help of additional analyses including the primary level, the highest level, and reason-pattern scores. Most participants improved their highest level of representation used to communicate as well as the frequency with which they communicated for each of the four reasons captured by the matrix, including reasons for which they did not communicate at baseline. Two of the participants even achieved a balanced reason-pattern communicating nearly equally across the four reasons.

The combined use of GAS and Communication Matrix is novel and therefore exploratory. Hence, the question has to be asked how the reported outcomes using these measures compare to one another. In terms of nonresponders, Jack was identified as a nonresponder with both measures. The matrix data corroborated his nonresponder status on GAS in that he not only did not improve in total score, but he also remained at the same level of representation, and did not alter his reason-pattern. Interestingly, an examination of his reason-pattern score revealed that his score for “obtaining items” went down, which is consistent with the absence of a GAS goal on requesting. Katie was identified as a nonresponder by the Communication Matrix and to some degree also by GAS—she showed no improvements on her commenting and answering goals, but improved substantially on her requesting goal. So, overall, there appears to be fairly good consistency across the two measures as far as nonresponders are concerned.

Another way to draw comparisons is to examine the actual overlap between the communicative functions/operations targeted by the GAS goals and the states, functions, and intents of the matrix. It turns out that only the requesting GAS goals of three participants overlap with the “C6. Requests new object” function of the matrix, and there was good overlap in scores between the two measures. That being said, it has to be recognized that the primary focus of the matrix is on expressive communication and half of the GAS goals targeted were receptive in nature. Furthermore, the GAS scores are based on direct observations of criterion-referenced performance, whereas the matrix was used as a generalized measure that is indirect based on relevant stakeholder perspectives.

The Communication Matrix results by the parents, on the other hand, showed no significant differences. Parents primarily observe their children in the home environment and in the community. Since the intervention was classroom-based and did not train parents or target strategies to promote generalization to the home environment (Schlosser and Lee 2000), it is likely that the classroom changes did not transfer to the home/community, and therefore, failed to change scores on the Communication Matrix. The focus group results as discussed below, however, yielded a more nuanced look at generalized language and communication outcomes suggesting that the changes were noted but simply did not rise to the level of statistical significance with the Communication Matrix.

Treatment Acceptability

The school team who received the VIS™ coaching intervention perceived the intervention to be highly acceptable going into the intervention and remained positive following the intervention. This suggests social validity of the VIS™ and the coaching intervention with which it had been delivered to the school team. Within an evidence-based practice framework, interventions that are objectively effective as well as socially valid are deemed of greater value relative to interventions that are objectively effective without evidence in terms of social validity (Schlosser and Raghavendra 2004). The high scores at pretest are possibly due to expectations created by the VIS™ workshop provided prior to the pretest. The overall lower treatment acceptability among the EAs may be attributed to the fact that they had neither received the VIS™ workshop prior to the pretest nor been directly part of the VIS™ coaching intervention. The focus group results for this group support this interpretation.

Self-Efficacy

The data indicated marked improvements in perceived self-efficacy by the trained school team (SLPs, special educator). Their increased confidence in implementing the VIS™ may have been influenced by one or a combination of the following processes: (a) enactive mastery, (b) vicarious learning, and (c) verbal persuasion (Bandura 1977; Schunk 1991). In terms of enactive mastery, the completion of the VIS™ assessment under the supervision of a trained research assistant may have enhanced their confidence. The focused observations resulting from the coaching intervention may have contributed to vicarious learning as well. Finally, the manner in which the trainer team provided performance feedback in biweekly videoconferences may have improved how the staff appraised their self-efficacy.

The self-efficacy of the EAs also improved over time, although at levels that were lower than those of the trained school team. The EAs had less pre-existing knowledge about the VIS™. Because the EAs were not targeted directly in this coaching intervention, it is plausible that their confidence was influenced exclusively by vicarious learning. There are several conditions under which perceived self-efficacy is especially sensitive to vicarious learning, and each may have played out in the current project. First, when individuals have little prior experience (as the EAs did with the VIS™) on which to base their evaluation of their capabilities, their perceived self-efficacy is susceptible to greater influence through vicarious learning (Bandura 1977; Schunk 1991; Takata and Takata 1976). Second, ambiguity about task demands influences the appraisal of self-efficacy (Schunk 1995) and could have resulted in overall lower levels relative to the trained school team. At least in the initial phases of the coaching intervention, the EAs likely were less clear as to what was expected of them, and only later, through observations of the trained school team, was this ambiguity reduced. Third, the greater the assumed similarity with the model, the stronger the influence on self-efficacy. Here, the EAs only observed SLPs who have a specialized skill set and knowledge, which may have had a weaker influence on their self-efficacy (Bandura 1977; Wood 1989).

Focus Groups

The focus group results for the school team fulfilled their intended complementary role for the quantitative indirect measures of treatment acceptability and self-efficacy. In terms of treatment acceptability, the focus group results for both the school team and the EAs corroborated the high acceptability of the VIS™ found on the adapted TEI-SF, particularly as it pertains to perceived effectiveness in producing positive child outcomes. In terms of self-efficacy of the school team, the quantitative measure showed significant improvements over the course of the intervention. The focus group analyses not only substantiated these improvements but also illustrated them more clearly; over the course of the intervention, the school team did feel more comfortable in implementing visual supports in general and technology in particular.

The focus group results for the parents elaborated upon the lack of statistically significant change from pretest to posttest reported on the Communication Matrix. That is, the parents did indeed note improvements in spoken language, directive-following, and receptive language, but these changes did not rise to statistical significance on the matrix. The primary expressive focus of the matrix might explain this, given that the improvements noted in the focus group were largely receptive in nature (i.e., directive-following, receptive language in general). The parents identified a need to receive training in visual supports in the home environment, which is consistent with the identified gap in the research literature on visual supports (Rutherford et al. 2019). The focus group results of the school team bolstered and corroborated the positive child language and communication outcomes yielded over time as shown by significant changes in GAS scores and the matrix. Finally, the identification of one nonresponder via GAS was further corroborated by the focus group results of the school team.

Limitations and Directions for Future Research

Because of the design employed, it is not clear whether the outcomes attained are solely attributable to the intervention and not to other services the participants may have received. That being said, the study represents a proof-of-concept and a reasonable stepping-stone for more rigorous designs in the future. Research into other treatment packages for ASD that are now considered evidence-based, such as the Early Start Denver Model, began with case studies and similar pre-experimental group designs (e.g., Vismara and Rogers 2008). Also, newly emerging school-based interventions in ASD have begun their journey toward being evidence-based in authentic school settings with pre-experimental designs (Locke et al. 2019; Suhrheinrich 2015) and some with mixed methods (e.g., Locke et al. 2016).

Although self-efficacy is considered an important outcome variable for coaching-based intervention studies such as this one, it is acknowledged that the VIS™-specific measure was developed specifically for this intervention and has not been validated. The small n precluded an analysis of its psychometric characteristics. Therefore, the results obtained ought to be viewed as exploratory. Also, the small n made it impossible to keep the self-efficacy responses as well as the treatment acceptability responses anonymous. Thus, it cannot be ruled out that the responses were influenced by social desirability considerations on the part of the respondents.

On several measures (e.g., GAS, Communication Matrix, self-efficacy), the changes from baseline to probe 1 were more pronounced than those from probe 1 to probe 2. This can be explained by the unequal intervals between measurements. Whereas probe 1 was conducted 4 months into the coaching intervention, probe 2 followed only 2 months later. Hence, the difference in interval length is an important contextual factor for putting the differential probe results into perspective.

The obtained results need to be viewed within the context of the staffing, qualifications, and experience of the school team and others who supported the seven students. The members of the school team and support staff were not only exceedingly well-qualified but also had many years of experience working with children with ASD. The adoption of the VIS™ also enjoyed the support of the school district administration. It remains to be seen whether similar results are possible with a less experienced and less qualified school team or in a classroom or school setting that is under-resourced.

In terms of future directions, the focus group results revealed the need for greater parent involvement in the coaching-based intervention. Parents clearly expressed their desire to learn more about ways to implement more visual supports and technology at home. Likewise, the EAs asked to be directly involved in the training. They are the frontline adults who provide an extensive amount of instructional face-to-face time with the students; therefore, they are critical to the success of classroom-based interventions (Giangreco et al. 1997). The objective treatment fidelity data, while positive, were based on a small sample of observations. Hence, it is unclear whether they are representative of practitioner behavior during unobserved times. Although these data are promising, future replications should base the fidelity data on a larger sample. Finally, this coaching-based intervention needs to be replicated with a more rigorous design and in other classrooms to further establish internal validity and enhance generality.

In conclusion, this project provides proof-of-concept that a coaching intervention involving the VIS™ might be an effective approach toward improving language, communication, and/or executive function skills (i.e., transitioning between activities) in children with ASD in a classroom setting. As such, this study represents the first step toward establishing the VIS™ as an empirically supported treatment for elementary school-aged children with ASD. The coaching intervention was deemed successful in terms of both criterion-referenced direct and indirect measures of student outcomes as well as perceived self-efficacy by an experienced school staff. Additionally, both questionnaires and focus group methodology revealed that relevant stakeholders view the coaching intervention involving the VIS™ as a socially valid treatment package.