In this paper, a multi-stage virtual reality (VR) intervention named Virtuoso (a play on the words “virtual” and “social”) is presented. Virtuoso is designed to provide training for and promote generalization of adaptive skills. The target population is adults with autism spectrum disorder (ASD) in need of substantial supports (i.e., level 2) who are enrolled in an adult day program at a large Midwestern university. Interest in VR interventions for individuals with ASD has been steadily growing for over 20 years (Aresti-Bartolome & Garcia-Zapirain, 2014). Since Strickland and colleagues’ seminal investigations into the acceptability of VR equipment and potential learning effects (1996, 1997), researchers have continued to explore VR as a means to deliver interventions. This research has contributed to the emergence of a promising, but preliminary, basis of support for the efficacy of VR as an intervention modality (Mesa-Gresa et al., 2018).

ASD is a lifelong condition that manifests as a cluster of neurodevelopmental disorders and is characterized by persistent deficits in social communication/interaction and restricted, repetitive patterns of behavior (American Psychiatric Publishing, 2013). In the United States, the prevalence of autism is increasing, with recent reports indicating that one in 54 children receives an ASD diagnosis (Baio et al., 2018). Comorbidities include cognitive impairments, epilepsy/seizures, ADHD, generalized anxiety disorder, and sensory problems (Simonoff et al., 2008). ASD can severely influence an individual’s independent functioning and quality of life, exacerbate employment problems, impact ability to live independently, and lead to social isolation (Eaves & Ho, 2008; Hedley et al., 2017; Müller et al., 2008).

VR is believed to be particularly appealing to individuals with ASD, in part due to their affinity for computers and strong visual-spatial skills (Strickland, 1997). VR conveys concepts, meanings, and activities through highly realistic scenarios that mimic the real world, thereby providing rich and meaningful contexts for embodiment and practice of activities, behaviors, and skills (Wallace et al., 2017; Wang et al., 2016). The literature points to a number of potential benefits of VR for individuals with ASD, such as predictability, structure, customizable task complexity, control, realism, immersion, and automation of feedback, assessment, and reinforcement (Bozgeyikli et al., 2018). In addition to these benefits, VR can increase access to services by overcoming logistical barriers such as distance, cost, and limited access to providers (Zhang et al., 2018).

While many benefits have been ascribed to VR technologies for individuals with ASD, the empirical evidence supporting the effectiveness of VR is fragmentary, unsystematic, and exhibits significant methodological limitations (Lorenzo et al., 2013; Parsons et al., 2019; Wang & Anagnostou, 2014). Contributing to this are issues such as cost, technological complexity, development challenges, and short technology life cycles (Schmidt et al., 2019). We argue in the current article that another contributor is the historically technocentric stance toward VR in the field, which has led to ontologically narrow research that privileges intervention efficacy over alternate methods of inquiry. We also expose the near-complete absence of documented design precedent or theoretical foundations in this area and reveal how these gaps disadvantage learning designers. Compounded by the universally acknowledged heterogeneity of this population, designers are presented with a truly “wicked problem” (Becker, 2007; Schmidt, 2014).

Systematic consideration of the inherent complexity and multidisciplinarity of VR design for individuals with ASD is largely absent in the literature, but is needed if we as a field are to extend our understanding beyond narrow lines of questioning related to technological efficacy towards broader questions that consider how to make VR work better, on a larger scale, and for a broader proportion of individuals with ASD (e.g., adults, those who are nonverbal, those who require more significant supports, etc.). On this basis we advocate for a reframing of VR interventions through the lens of complexity theory (Rowland, 2007; Solomon, 2000; You, 1993). Supporting this view, we present findings from a usage test of Virtuoso. This research was guided by two central research questions. First, we sought to understand how the design goals of being acceptable, feasible, easy to use, and relevant to the unique needs of participants manifested in usage test findings. Second, we sought to explore the nature of participants’ learner experience. Standing as a counterpoint to dominant research discourse in the field, our context and research methods serve as a model for how researchers can move beyond reductive questions of “what works” to questions of how and why something works and, ultimately, how to do things better (Honebein & Reigeluth, 2020). Further, our case provides a much-needed exemplar that addresses identified shortcomings in the literature and contributes specifically to questions of optimizing the VR learning experience for participants with ASD.

Literature review

Recent reductions in hardware costs and the potential for increasing access to services have boosted interest in information and communication technologies (ICT) for teaching adaptive skills to individuals with ASD (Beaumont & Sofronoff, 2008; Grynszpan et al., 2014; Knight et al., 2013; Parsons, 2016). People with ASD tend to express a strong affinity for technology (Bozgeyikli et al., 2018) and respond positively to visual stimuli, use of visual cues, and instruction that digital interfaces can provide (Reed et al., 2011). ICTs provide opportunities for learning in controlled contexts devoid of nuanced social customs, which individuals with ASD may have challenges navigating (Grynszpan et al., 2014). As such, VR is considered to hold particular promise for individuals with ASD (Bellani et al., 2011; Wang et al., 2017).

There is a problematic lack of semantic clarity and little consensus around what constitutes VR in general (Girvan, 2018), which extends to the literature on the use of VR for individuals with ASD. Many interventions have been erroneously described as VR that are not, including avatar-based conversational agents, driving simulations, virtual worlds, and video games. This imprecise use of terminology confounds classification and comparison efforts. For example, most publications in this area have focused on desktop-based VR; however, some authors incorrectly classify desktop-based VR as virtual worlds or 3D games. With no clear standard of what constitutes VR in this field, we offer the caveat that the literature reviewed in following sections is classified based on how authors describe their work (not on the basis of current research definitions), which falls into the categories of desktop-based and headset-based VR.

Desktop-based VR interventions for individuals with ASD

Desktop-based VR interventions for individuals with ASD have been designed around a variety of social and adaptive skills (Wang & Anagnostou, 2014). Specific examples include (1) riding a bus and engaging within a cafe (Mitchell et al., 2007; Parsons et al., 2004), (2) safety skills (Self et al., 2007), (3) street-crossing (Josman et al., 2011), (4) emotion recognition (Moore et al., 2005), (5) engaging in conversation at social gatherings (Ke & Im, 2013), (6) practicing public speaking (Jarrold et al., 2013), (7) interviewing for a job (Kandalaft et al., 2013), (8) learning to think imaginatively (Herrera et al., 2008), and (9) preparing for a court appearance (Standen & Brown, 2005). Research on desktop-based VR has shown some evidence of effectiveness. For example, Brooks et al. (2002) investigated a VR environment modeled on an actual kitchen designed to teach food preparation. After training, participants showed improvement on tasks for which they had no prior training. Brown and colleagues (2002) found improvements in terms of fewer inquiries and mistakes by participants using a street crossing and road safety VR intervention.

Most VR interventions for individuals with ASD are single-user experiences (Glaser & Schmidt, 2018). Conversely, the iSocial project developed a multi-user 3D virtual learning environment for a 10-week social competence curriculum (Schmidt et al., 2011; Wang et al., 2017). This intervention followed a curricular structure in which participants would first learn a target skill and then rehearse that skill in controlled contexts (Schmidt et al., 2008). Another project, the Virtual Reality Social Cognition Training (Didehbani et al., 2016; Kandalaft et al., 2013), utilized social scenario simulations that allowed participants to practice meeting someone or to confront a bully. Findings suggest improved measures of social cognition related to theory of mind and emotion recognition. In another study, Parsons (2005) investigated a game-based, collaborative virtual environment called Block Challenge designed to promote collaborative and communicative reciprocity among dyads of children with ASD and typically developing peers. Findings suggest that dyads communicated similarly, although participants with ASD may have communicated less efficiently.

Headset-based VR applications for individuals with ASD

Very few studies have investigated the use of VR head-mounted displays (HMDs). Of those that do, nearly all are single-user. Strickland and her colleagues (1996, 1997) were among the first to assess how young children with autism might accept HMDs. Findings indicated that study participants generally accepted the headsets and were able to complete the virtual tasks. However, the research was limited by a small sample size (n = 2). HMD-based research in this area remained dormant until recently, when affordable, consumer-grade HMDs became commercially available (Newbutt et al., 2016). Newbutt et al. (2016) provide a brief report on perceptions of HMDs among people with autism. Generally speaking, users reported positive perceptions. Most participants consented to return for multiple virtual experiences and found HMDs to be comfortable and enjoyable. Importantly, the researchers concluded that the adoption of HMDs for interventions with this population could introduce ethical concerns. They urge designers to seriously consider the sensory processing disorders that individuals with ASD face when designing VR interventions.

A criticism of VR interventions for individuals with ASD in general and HMD-based VR interventions specifically has to do with substantial investments of time, money, and specialized expertise required for development, as well as the challenges of short technology longevity and prohibitive hardware costs (Grynszpan et al., 2014). Schmidt et al. (2019) present 360-degree video-based VR (VBVR) as a less complex and less expensive alternative to digitally-modeled VR environments, with findings suggesting the medium can provide above-average usability and user-friendliness, high participant engagement, and a sense of enjoyment. VBVR uses 360-degree video to represent the virtual environment through an HMD. Given the nascent state of VBVR as an emerging technology, nearly no research on VBVR for ASD exists beyond this work. Anecdotal case studies are emerging, such as a prototype for teaching social stories (Gelsomini et al., 2017) and an adaptive skills intervention related to grocery shopping (Wickham, 2016). More general research on VBVR suggests it could in certain cases be equivalent to computer-simulated environments in terms of sense of presence, anxiety reduction, improving emotional state, etc. (Brivio et al., 2020; Li et al., 2017).

Conceptual framework

VR technologies, such as the systems outlined in the previous literature review, have long been lauded as having particular promise for individuals with ASD. Historically, the majority of research has placed principal emphasis on VR as a rehabilitative medium, focusing “mostly on improvements in skill acquisition or development based on positivist or post-positivist paradigms” (Parsons et al., 2019, p. 6). This stance positions VR chiefly as an advantageous treatment medium, a therapeutic instrument, or an intervention modality, relegating the role of the technology to that of “a cognitive or behavioural [sic] prosthesis” (p. 6). However, this position is troubling not only because it reflects a medical model of disability, but also in that it reflects a particularly technocentric perspective. By technocentric, we allude to Seymour Papert’s “fallacy of referring all questions to the technology” (1988, p. 4), which, as famously demonstrated in Cuban’s (2009) work, tends to overemphasize the importance of technology’s role in solving problems. Interrogating the critique of technocentrism leads to the question of what role VR technology should play in ASD interventions. To approach this question, we turn to scholars of instructional media and educational technology who have long debated the role of technology in learning (e.g., Clark, 1994; Kozma, 1994).

Central to this ongoing debate is the question of whether technology (media) alone is sufficient to influence learning. However, Jonassen et al. (1994) suggest the focus of this debate on technology is misguided. They advocate for a more learner-centered approach that focuses “less on media attributes vs. instructional methods and more on the role of media in supporting, not controlling the learning process” (p. 31). They support this position in their presentation of learning as a holistic, complex process that defies objective or causal characterizations, sharing affinity with the more teleological nature of quantum mechanics and chaos theory. The perspective of Jonassen and colleagues underscores the futility of attempts to predict outcomes based solely on an analysis of the component parts that comprise underlying learning processes (e.g., attributes, activities, learner characteristics, context)—what the majority of research in this area has attempted to do to date—and highlights the remarkable complexity of learning design. Such complexity undergirds our own positionality relative to the learner experience of individuals with ASD in HMD-based VR interventions, which we characterize as an emergent property of a complex, interconnected and interdependent system that includes, but is not limited to, the designed intervention, learner-as-user, and learning context. We situate this characterization within the frame of complexity theory in the following section.

Reframing VR experiences from the perspective of complexity theory

Complexity theory has established a limited foothold in the field of instructional design (e.g., Jacobson & Kapur, 2012; Jonassen, 2000; Reigeluth, 2004). This theory eschews deterministic and causal models of predictability, instead adopting a more organic, holistic, and non-linear perspective. Particularly relevant to our perspective are the complex system properties of dynamics and emergence. These properties can be applied to describe the learner experience of individuals with ASD with VR interventions. Complex systems are composed of many component parts that themselves can be simple (Clancy et al., 2008). What leads these systems to be complex are the connections and dependencies between the system components, which all interact in a non-linear manner (J. H. Kim et al., 2008; Savioja et al., 2014). On a small scale, causal inferences can be made regarding the interactions between system components; however, this becomes impossible at a large scale (Simon, 2018). How the system behaves cannot be predicted by considering the component parts. Instead, behavior is determined by the dynamic nature of interaction between interconnected and interdependent component parts (Mercer, 2011). Moreover, the system’s behavior is also dependent upon its context or environment (Simon, 1977). Therefore, system behavior emerges from a complex interplay of contextualized dynamic interactions, or system dynamics, with any attempts to describe the complex system being inherently reductive (Cilliers, 2013).

Similarly, VR interventions for individuals with ASD consist of a constellation of complex, interconnected, and interdependent systems of technologies, intervention strategies, participants, stakeholders, etc. (Bozgeyikli et al., 2018; Parsons, 2016). No single factor in this complex ecosystem individually affects the learner experience; instead, it is the nature of interaction between the interconnected and interdependent components that gives rise to the phenomenon of learner experience as an individually perceived, unique phenomenon (Schmidt, Glaser, et al., 2020; Schmidt, Tawfik, et al., 2020). This individual experience, we argue, is an emergent property of the complex system, and thus cannot be predicted reliably on the basis of simple causal inference (Kreps, 2014; Reeves, 1999).

Research implications

Reframing of the VR experience as a complex, emergent phenomenon stands counter to dominant research discourse in the field. For example, a central assumption used to justify the use of VR for autistic populations is the assumption of veridicality. This assumption asserts a causal relationship between VR realism and generalization of learning from the VR context to novel contexts and situations (McComas et al., 1998; Strickland, 1997; Wang & Reid, 2011). This assumption is grounded in the affordance of VR to provide virtual experiences with a high degree of ecological validity (G. J. Kim & Rizzo, 2005); that is, the ability of VR to present avatars, environments, and objects accurately and with a remarkably high level of photographic and behavioral fidelity. Many individuals with ASD have a tendency to be concrete thinkers (Grandin, 1995) and have difficulties generalizing skills learned in one context to another (Yerys et al., 2009), leading to difficulties in establishing intervention effects across settings (Plaisted, 2001). Because VR provides high-fidelity photographic and behavioral realism, this medium is believed to align well with the tendency towards concrete thinking in individuals with ASD. However, the assumption of veridicality is not strongly supported by empirical evidence. Although a handful of studies have reported modest findings relative to generalization of skills from VR to the real world for individuals with ASD (e.g., Josman et al., 2011; Parsons et al., 2006), findings in general have been inconclusive and limited (Wang & Anagnostou, 2014).

Problematically, the premise that technology alone can provide sufficient ecological validity to promote generalization of treatment outcomes ignores the myriad ecological variables that could potentially constrain or promote generalization. The complexity of promoting generalization has long been recognized in autism intervention research and, indeed, is widely considered to be the most pervasive challenge of that field (Yerys et al., 2009). It is hardly surprising, therefore, that a functional relationship between generalization and VR realism has yet to be firmly established in the literature. Moreover, the assumption of veridicality underscores a fallacy of traditional perspectives that position VR principally as a treatment medium. Placing technology at the forefront of any solution with insufficient consideration of the individual experience of usage and the contexts of usage, unsurprisingly, has led to narrow and limited outcomes (Karami et al., 2020). The assumption that a causal inference can be made between realism and generalization ignores entirely the nature of complexity that circumscribes the learner experience within VR interventions. Returning to Jonassen and colleagues, “We delude ourselves when we manipulate attributes of the medium and expect these manipulations to have a predictable effect on a process as complex as learning” (1994, p. 35).

Although we have singled out the assumption of veridicality specifically, its underlying flaw of reductivism is evident in the majority of research that has been performed to date in the area of VR interventions for individuals with ASD. To be clear, we do not contest the notion that the ecological validity of VR environments could play an important role in generalization. However, we argue that deterministically inspired attempts to map potential affordances onto already complex phenomena such as skills generalization are misguided. This is in large part because such attempts neglect the inherent complexity of VR intervention design, compounded by the perceptual, sensory, and cognitive differences of many people with ASD (Parsons & Cobb, 2011). VR interventions are complex systems; therefore, research that seeks to prove their effectiveness “will have limited generalizability and external validity because of the innumerable interactions among conditions, values, and methods” (Honebein & Reigeluth, 2020, p. 11). We therefore assert an urgent and timely need exists for research in this field to look beyond reductive attempts to establish intervention effects on the basis of technology alone.

Our perspective is supported by Parsons’ (2016) call for research in this area to adopt a more holistic framing of research problems, to shift the focus of research away from technology alone, and instead to position more centrally the complex interaction between users with ASD and the VR intervention within the local context. Key to Parsons’ call is consideration of the nature of the VR technology and how the technology should be leveraged to promote specific objectives, which includes consideration of how users are uniquely impacted by autism and what kinds of supports might be needed. This view is supported by Kozma (2000), who calls for a rethinking of traditional research methodologies in light of the messiness introduced by real-world contexts, which introduces confounds that cannot be controlled. However, instead of trying to control for such issues, Kozma suggests they should be recognized as endemic to the real-world context. On this basis, he calls for alternative research approaches that can lead researchers to better understand “what is working, what is not working, why it is working or not, and what can be done about it” (p. 10). Attending to this call has implications from the perspectives of both design and theory, which we discuss in the following sections.

Design implications

Documented design precedent informs and enriches learning design (Gray & Boling, 2016), but is sorely lacking in the complex design topography of immersive learning interventions (Huang & Lee, 2019; Kasurinen, 2017). Nearly all research on VR for ASD rests on the assumption that VR will produce beneficial outcomes, as evidenced by the overwhelming majority of published research in this area being focused on establishing positive intervention effects (Karami et al., 2020; Mesa-Gresa et al., 2018). Although a variety of heuristics and design principles exist for developing 2D computer interfaces for individuals with ASD (Benton et al., 2012; Khowaja & Salim, 2013), far less is known about how to design 3D interfaces and environments. Absent in the literature are guiding principles for designing rich and stimulating three-dimensional interfaces that might impact learner experience (Jerald, 2016; Nelson & Erlandson, 2008; van der Land et al., 2013). Indeed, according to Bozgeyikli et al. (2018), “[…] there is no well-established literature on the best practices in designing VR user interface attributes for individuals with ASD yet” (p. 22). This privileging of outcomes-oriented research over reports of design processes, evaluation efforts, effective design principles, and considerations of learner experience disadvantages learning designers. The implication is that learning designers working in this space are confronted with the paradox that VR is widely regarded as a promising tool that potentially could lead to promising outcomes for individuals with ASD, but practically no design precedent exists to inform the development of interventions. This lack of design precedent is compounded by a concomitant absence of theoretical guidance, which we discuss in the following section.

Theoretical implications

As discussed previously, research on VR for ASD generally rests on the assumption that VR will produce beneficial outcomes. To date, however, nearly no theoretical grounding has been established to justify this assumption (Howard & Gutworth, 2020). By theoretical grounding, we refer primarily to psychological theories of autism. The general lack of theoretical reporting in the literature is problematic because theory informs selection and vetting of intervention strategies (Ertmer & Newby, 2013) and is the vehicle through which findings are interpreted and broader connections are made (diSessa & Cobb, 2004). An outlier is Wang and colleagues’ research (Wang et al., 2016, 2018) that connects the embodied social presence of learners with ASD with avatar interaction patterns in a collaborative 3D virtual learning environment. Their research supports the position that intentional design of VR features can lead to higher levels of embodied social presence, which they maintain has implications for transforming collaborative learning for youth with ASD. Importantly, Wang and colleagues’ research does not report intervention effectiveness, but instead foregrounds the critical importance of learning design to shape learner experience. Another outlier is Rajendran (2013), who is perhaps the only scholar to attempt to formally connect some of the psychological theories of ASD with VR interventions. Rajendran presents a range of psychological theories that have found resonance in the tradition of autism research, including theory of mind, executive function, and weak central coherence, and explores how they might be considered from the perspective of VR interventions.

Central to our own design is reduced generalization theory (Plaisted, 2001). As mentioned previously, generalization is a pervasive challenge in all autism research. Reduced generalization theory maintains that individuals with ASD have limited capability to process similarities between contexts. This suggests that generalization takes place in a gradual and incremental manner. Rajendran (2013) argues that the infinitely plastic nature of VR allows it to be manipulated so as to gradually change the virtual environment and make it progressively more similar to the real environment. This, in turn, could promote generalization for individuals with ASD. However, given the issues of complexity discussed previously, this position alone is at odds with our theoretical foundation of complexity, as it rests on the assumption of causal inference and is therefore problematically reductive. Hence, we bolster our theoretical stance on the basis of generalization theory established in the area of applied behavior analysis (Stokes & Baer, 1977; Stokes & Osnes, 2016).

Stokes and colleagues’ generalization theory rests on nine related techniques (Stokes & Baer, 1977) that have been categorized into three broad heuristics (Stokes & Osnes, 2016). The first heuristic, taking advantage of natural communities of reinforcement, refers to using elements of the natural environment that already function to maintain the target behavior (Stokes et al., 1978). The second heuristic, train diversely, refers to maintaining the minimal level of training control possible while still producing behavior change (Stokes & Osnes, 2016). The third heuristic, incorporate functional mediators, refers to taking advantage of relevant discriminative stimuli in the training environment that can be transferred to other environments to promote generalization (Stokes & Osnes, 2016). We have reported elsewhere how these heuristics informed the design of Virtuoso (Schmidt, Glaser, et al., 2020; Schmidt, Tawfik, et al., 2020). We demonstrate how these theories have been instantiated in the design of Virtuoso in the Project Description section below.

Implications for the current research

In light of the research, design, and theoretical implications presented above, we made the decision to privilege questions of design and learner experience over establishing intervention outcomes. We did not focus on how to recreate and rehearse a discrete task in a virtual environment (arguably the approach of the majority of extant research). Rather, we sought to approach a wicked problem and explore possibilities to better understand the problem through the tradition of learning design. Starting with questions of how one might embed generalization heuristics in the design of an immersive learning environment naturally led to questions of how to sequence the design, what scaffolds and supports might be needed, how these might be faded, etc. This approach bears some resemblance to that proposed by Dorst (2006) in which the designer seeks to transcend established discourse (and associated paradox) by stepping “out of the ways of thinking embodied in the [dominant] discourses” (p. 15) and confronting the paradoxical situation on the basis of experience, deep understanding of the discourses that inform the problem, and—importantly—designerly intuition. However, Dorst argues, in order for any solution to be considered a true solution, it must be acceptable to stakeholders.

Dorst’s (2006) positionality reflects our own values about priorities, that is, “[s]tatements about which priorities should be used to judge the success of the instruction” (Reigeluth & Carr-Chellman, 2009, p. 23). These values ultimately serve as the criteria through which learning design is evaluated, and commonly focus on the relative effectiveness, efficiency, and appeal of the learning design. Honebein and Honebein (2015) argue that designers must make trade-offs between these three elements, sometimes having to sacrifice one in favor of others. They characterize this negotiation of values prioritization using what they call the Instructional Design Iron Triangle, claiming that “a designer’s choice of method yields good success in only two of the three learning outcomes: effectiveness, efficiency, and appeal” (p. 939). Returning to Dorst, we adopt the position that the individual perceptions of our stakeholders—the individuals with ASD for whom Virtuoso was designed—regarding the quality of their experience (arguably its appeal and efficiency) using Virtuoso is the ultimate criterion through which we can make judgments about our methods. Therefore, we choose to make a tradeoff. We sacrifice the possibility of establishing a causal connection between designed intervention and learning outcomes (effectiveness) for deeper insights about the efficiency and appeal of our designed intervention. On this note, we turn to our evaluation of Virtuoso from these perspectives.

Material and methods

A usage study was conducted in Summer 2018 and consisted of two phases, beginning with expert testing (Phase 1: n = 4) that incorporated semi-structured interviews and survey methods, and concluding with participant testing (Phase 2: n = 5) that incorporated observational and survey methods.

Project description

Virtuoso was designed for adults with significant communication and behavioral challenges associated with ASD enrolled in a remarkably innovative day program called Impact Innovation at the University of Cincinnati. Over 20 adults participate in this program year-round. Participants follow a daily schedule with the assistance of a peer mentor, during which they take part in vocational internships and work on developing adaptive skills. Adaptive skills are considered to be a core challenge for individuals with ASD (Gilotty et al., 2002), and include practical, everyday skills needed to function and meet the demands of one's environment. Examples include getting dressed, self-care, and safety-related activities (National Academy Press, 2001). Developing these skills is critical for attaining more independent levels of functioning (Ditterline & Oakland, 2009).

Virtuoso provides the day program a collection of immersive learning technology interventions that promote the skill of using public transportation. This use of immersive technology for discrete skills training is in alignment with current practices and trends for special populations (Olakanmi et al., 2020). Public transportation was identified in consultation with the director of the adult day program (hereafter referred to as the subject matter expert, or SME) as a promising application of immersive training. Research suggests the ability to access and use public transportation promotes independence through higher access to employment, medical care, community, etc. (Felce, 1997). However, transportation is one of the most cited barriers for individuals with disabilities (Allen & Mor, 1997; Carmien et al., 2005). Research shows that transportation is among the greatest hurdles in getting to and maintaining a vocation and that many individuals with disabilities are unable to keep medical appointments due to the lack of access to transportation (Shier et al., 2009).

Prior to developing Virtuoso, however, no formal or systematic shuttle training existed in Impact Innovation. When a need to use the university shuttle arose as part of a participant’s individualized programming, a staff member would attempt to orient and train the participant in an ad-hoc manner. This led to a number of challenges. Day program participants need to use public transportation to travel to vocational training sites, yet real-world public transportation training exposes participants to a variety of risks. For Impact Innovation, a systematic approach to shuttle training using immersive technologies was attractive because it would allow training to be experienced safely and repeatedly in controlled scenarios. Additional benefits were identified: participants could be taught to use public transportation to return safely to the program offices if they were to get lost, and the synthesis of important subordinate skills (e.g., interpreting a map, interpreting a schedule) associated with using public transportation could be useful across a range of other vocational scenarios.

A procedural task analysis was performed (Jonassen et al., 1999) based on staff interviews, ride-alongs, and interviews with the SME. Detailed flow charts representing the task were developed. Based on these flowcharts, learning objectives were identified. The overarching learning objective states that, upon completion of shuttle training, learners (with prompting by a trained staff member) will be able to successfully board the correct university shuttle. Specific learning objectives state that learners will be able to:

  1. Identify that it is time for shuttle training using their daily schedule,

  2. Plan a route from their workspace to the shuttle stop using a map,

  3. Follow the planned route from their workspace to the shuttle stop,

  4. Check current shuttle location using mobile app,

  5. Identify correct shuttle when it arrives at shuttle stop, and

  6. Board the correct shuttle in a timely manner.

To promote these learning objectives, our design was informed by reduced generalization theory. Reflecting sensitivity to the perceived need for a gradual and graded approach to skills training, Virtuoso uses a stage-wise technique that progresses from simple to complex across a spectrum of low-tech to high-tech tools (and then to real-world application). The stage-wise approach consists of: (1) skill introduction, (2) 360-degree video modeling of the skill, (3) VR rehearsal of the skill, and (4) real-world practice of the skill (Fig. 1). In the first stage, training is introduced using a low-tech social narrative that breaks down the overarching activity into a series of tasks accompanied by visual supports (e.g., photographs and icons). This low-tech social narrative is presented on a tablet as paginated comic strips that users review with the assistance of a trained staff member. In the second stage, an Android application is used to present 360-degree videos of the task in a video-based virtual reality (VBVR) environment. This software is called Virtuoso-VBVR and supports both Google Cardboard and Google Daydream HMDs, which allows for the use of many commercially available mobile phones. During this usage test, participants used Motorola Z Force devices that we provided. In the third stage, participants engage with an online guide who controls an avatar and leads users through instructional content presented in a multi-user, HMD-based VR environment to rehearse the skill. This environment, called Virtuoso-VR, was developed using the open-source High Fidelity virtual reality toolkit and supports both HTC Vive and Oculus Rift HMDs (see Fig. 2). In the fourth stage, participants practice the skills in the real world with a trained staff member. The design of all stages of training was informed by evidence-based practices related to task structure, instructional scaffolding, prompting, generalization, accessibility, etc. (National Autism Center, 2015; Wong et al., 2015).

Fig. 1

Intervention architecture of Virtuoso as a whole, with Stages 2 and 3 emphasized as the focus of the current study

Fig. 2

A participant navigates his avatar to the shuttle stop in Virtuoso-VR while the online guide (bottom-right) provides verbal prompting as needed

Although we incorporated Stage 1 to orient participants to the activity (the low-tech, social narrative), we did not evaluate it as it was not central to our focus on better understanding a wicked problem of VR design. Indeed, social narratives are among the most well-researched and best-established types of interventions for individuals with ASD, with widely published and implemented guidelines for best practice (Hale & Schmidt, 2018). In addition, Stage 4 was not performed or evaluated in this usage study, although it is included in our intervention architecture (Fig. 1). Adopting the view of learning with media vs. media as conveyor of instruction (Jonassen et al., 1994), understanding the role of VR in the complex learner-media relationship is paramount. If barriers such as design flaws, technology glitches, or cybersickness disrupt this relationship, then learning effects could be negatively impacted. Therefore, these flaws must be identified and remediated such that barriers are removed and the learner-media relationship is made as seamless as possible. This is particularly important in VR, where immersion and sense of presence are critical (Slater, 2018). Barriers that interrupt these important perceptive phenomena can lead to deleterious effects (e.g., reduced sense of immersion, broken flow state, increased cognitive load, reduced ability to concentrate). Given that Virtuoso targets a population that is highly sensitive to changes, such interruptions could lead to negative behaviors/outbursts (Kaat & Lecavalier, 2013), increased anxiety (Bellini, 2006), and/or a lack of willingness to continue (Carr et al., 2016).

Methods

The purpose of this user-centered, multi-phase usage study was to evaluate the efficiency and appeal of the VR-based stages of the Virtuoso intervention (Stage 2: 360-degree video modeling of the skill, and Stage 3: VR rehearsal of the skill) so as to reveal design flaws and uncover opportunities to improve the overall learner experience for participants with autism in an adult day program at a large Midwestern university. The questions that guided our inquiry are presented below:

RQ1: How do the Virtuoso design goals of being acceptable, feasible, easy to use, and relevant to the unique needs of participants manifest in findings from the usage test?

RQ2: What is the nature of participants’ learner experience in Virtuoso VR and Virtuoso VBVR?

Usage testing of Virtuoso took place across two research phases (Table 1). All research performed was approved by our university’s institutional review board. In Phase 1 (expert testing), expert reviewers (n = 4) engaged in a structured usage test with the Virtuoso-VBVR application, followed by a semi-structured interview. Phase 1 focused only on research question 1. This phase took place in May of 2018 in the offices of the respective expert reviewers. Experts explored Virtuoso-VBVR, completed the System Usability Scale (SUS; Brooke, 1996) and then responded to questions from a semi-structured interview protocol. Expert responses were audio recorded for later transcription and analysis.

Table 1 Study procedures across phases

In Phase 2 (participant testing), participants with ASD (n = 5) engaged in usage testing of the Virtuoso-VBVR and the Virtuoso-VR software. Phase 2 focused on both research questions 1 and 2. After informed consent and/or assent was obtained, participants took part in two different sessions of approximately 30 min each. In the first session, participants used Virtuoso-VBVR. In the second session, participants used Virtuoso-VR. After each session, participants completed the SUS and one user-friendliness question (Bangor et al., 2009). Video, audio, and screen recordings were captured for later transcription and analysis. Detailed field notes were taken during all sessions.

Participants

For Phase 1 (expert testing), study participants were purposively sampled. Identification of experts was performed in consultation with the SME, a Board Certified Behavior Analyst-Doctoral (BCBA-D), who suggested a need for both autism and usability experts (Table 2). Autism experts were identified based on expertise in the field of autism research and prior clinical interactions with the adults enrolled in the adult day program. The usability expert was identified based on expertise in usability evaluation.

Table 2 Demographics and description of expert review participants

For Phase 2 (participant testing), five adults with ASD were purposively sampled from the adult day program. These participants were identified by the SME based on (1) level of independence, (2) acuity scores, (3) scores on standardized assessments, and (4) a clinical diagnosis of ASD. Usage test participants had an average age of 26.2 years (range: 22 to 34 years). An overview of participant demographics is provided in Table 3.

Table 3 Participant demographics and measures of the Peabody Picture Vocabulary Test (PPVT), Social Responsiveness Scale (SRS), and Behavior Rating Inventory of Executive Function (BRIEF)

Data collection

Qualitative and quantitative data were collected using a variety of measures and methods. These are described in the following sections.

System usability scale

The SUS was administered to both expert and participant testers. Given participants’ differing literacy levels, SUS items were read aloud by a master’s-level staff member whom the participants knew, and each item was explained in concrete terms with examples. Each response option was also read aloud, after which participants were asked to choose a response. Participants were continually prompted to confirm their understanding.

Adjectival ease of use scale

The adjectival ease of use scale (Bangor et al., 2009), a single-item measure of user-friendliness, was administered to participant testers. This scale rates ease of use using adjectives. The item states, “Overall, I would rate the user-friendliness of this product as: Worst Imaginable, Awful, Poor, Ok, Good, Excellent, Best Imaginable.” This item was administered to participants on the same sheet as the SUS.

Semi-structured expert interviews

A semi-structured interview was conducted with each expert tester by a trained graduate student. The interview protocol consisted of two prompts focused on learner experience and design principles, respectively. The first prompt sought experts’ opinions on the design of the system, including hardware and software. The second prompt sought opinions on the evidence-based practices incorporated into the system’s design. Interviews took approximately five minutes and were audio recorded for later transcription and analysis.

Screen, webcam, and audio recordings

Screen, webcam, and audio recordings were captured for participant testers. Videos were captured for each session. In total, 12 interaction videos were captured for Virtuoso-VR (six from the perspective of the online guide, six from participants) and six interaction videos were captured for Virtuoso-VBVR. All videos were transcribed using a professional transcription service.

Unstructured, post-usage testing interviews

Unstructured interviews were conducted by the first author following each participant usage testing session. Interview duration was between five and 15 min. Interviews began with the interviewer asking participants about their experiences, exploring what participants liked best, least, and what they might change. Depending on what was observed during each usage test, the interviewer asked follow-up questions. All interviews were recorded and transcribed for later analysis.

Field notes

During Phase 2 (participant testing), field notes were taken by a trained graduate student, who made observations in handwritten notes related to participants’ preparation for usage testing, actual usage of Virtuoso-VBVR and Virtuoso-VR, and participants’ post-usage testing surveys and interviews. Specific focus areas included the nature of user interaction, participant responses, commentary during participant sessions, and any particularly salient moments. These handwritten notes were scanned, typed into a word processor, and stored for later analysis.

Analysis

Analysis adopted a multi-methods approach. Quantitative data were analyzed using methods appropriate to usability evaluation. Qualitative data were analyzed using inductive and deductive methods. A constant comparative approach was applied across all phases of analysis, with specific attention given to coding reliability.

Quantitative analysis

Data from the SUS and the User-friendliness Adjectival Rating Scale were analyzed using quantitative methods. Methods outlined in Brooke (1996) were used to calculate the SUS score. Scores above 68 are considered to represent above-average usability. These data were aggregated into tables for analysis (see Table 9).
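The standard scoring procedure from Brooke (1996) can be sketched as follows. This is an illustrative implementation only; the item responses shown are hypothetical, not study data.

```python
def sus_score(responses):
    """Compute the System Usability Scale score (0-100) from ten item
    responses, each on a 1-5 Likert scale (Brooke, 1996).

    Odd-numbered items are positively worded: contribution = response - 1.
    Even-numbered items are negatively worded: contribution = 5 - response.
    The summed contributions (0-40) are multiplied by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    total = sum(
        (r - 1) if i % 2 == 0 else (5 - r)  # i is 0-based, so even i = odd-numbered item
        for i, r in enumerate(responses)
    )
    return total * 2.5

# Hypothetical respondent: strongly agrees with positive items and
# strongly disagrees with negative items.
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # -> 100.0
```

A uniformly neutral respondent (all 3s) scores exactly 50, which is why 68, not 50, serves as the empirical benchmark for above-average usability.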

User-friendliness was measured using the single item User-friendliness Adjectival Rating Scale (Bangor et al., 2009). The seven possible responses on this scale include: Worst Imaginable, Awful, Poor, Ok, Good, Excellent, Best Imaginable. These categorical data were converted to ordinal data, with 1 representing “worst imaginable” and 7 representing “best imaginable.” These data were aggregated into spreadsheet tables for analysis.
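The categorical-to-ordinal conversion described above can be expressed as a simple lookup. This is a minimal sketch of the coding scheme just described (function and variable names are our own).

```python
# Ordinal coding of the User-friendliness Adjectival Rating Scale
# (Bangor et al., 2009): 1 = "Worst Imaginable" ... 7 = "Best Imaginable".
ADJECTIVAL_TO_ORDINAL = {
    "Worst Imaginable": 1,
    "Awful": 2,
    "Poor": 3,
    "Ok": 4,
    "Good": 5,
    "Excellent": 6,
    "Best Imaginable": 7,
}

def to_ordinal(rating):
    """Convert a categorical user-friendliness rating to its ordinal value."""
    return ADJECTIVAL_TO_ORDINAL[rating]

print(to_ordinal("Good"))  # -> 5
```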

Qualitative analysis

Two independent, qualitative analyses were performed—one deductive, and one inductive. The deductive analysis focused on exploring acceptability, feasibility, ease-of-use, and relevance of Virtuoso prototypes. The inductive analysis focused on the nature of participants’ experiences while using Virtuoso prototypes. Procedures are described in the following sections.

Deductive analysis was performed by applying an existing coding methodology to further our understanding of the topic (Creswell & Poth, 2016). The coding scheme used was developed by Kushniruk and Borycki (2015) to analyze usability and usefulness within the context of medical interventions. Whereas many usability coding methods focus on general heuristics, this coding scheme was developed specifically for video analysis within intervention contexts. Minor modifications to the coding scheme were made to better align it to our specific context as the coding scheme was originally developed to evaluate usability and usefulness of 2D interfaces. In addition, we augmented the coding scheme with four supplemental codes (Table 4) related to technology-induced errors due to instability in our beta-level software.

Table 4 Supplemental codes appended to Kushniruk and Borycki’s (2015) video analysis coding scheme

This coding was performed by two trained graduate students serving as independent observers (a primary observer and an agreement observer), with codes intermittently reviewed by the lead researcher. After multiple training and calibration sessions, the primary observer coded 100% of the videos, and the agreement observer coded 50% of the videos. Coder drift was controlled for by comparing and discussing any discrepancies in coded videos. Two separate comparisons between raters were performed based on independent application of codes and estimates of duration: interobserver agreement (IOA) and Cohen’s Kappa. Results from IOA and Kappa analyses (Table 5) are indicative of high agreement between coders.
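Cohen’s Kappa corrects raw percent agreement for the agreement expected by chance, given each coder’s marginal label distribution. A minimal sketch of the computation (the code labels and segments below are hypothetical, not the study’s data):

```python
from collections import Counter

def cohens_kappa(coder1, coder2):
    """Cohen's Kappa for two coders labeling the same items:
    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement
    and p_e is chance agreement from the coders' marginal proportions."""
    assert len(coder1) == len(coder2), "coders must label the same items"
    n = len(coder1)
    p_o = sum(a == b for a, b in zip(coder1, coder2)) / n
    m1, m2 = Counter(coder1), Counter(coder2)
    p_e = sum((m1[label] / n) * (m2[label] / n) for label in m1)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes assigned by two observers to eight video segments.
primary   = ["nav", "nav", "icon", "crash", "nav", "icon", "crash", "nav"]
agreement = ["nav", "icon", "icon", "crash", "nav", "icon", "crash", "nav"]
print(cohens_kappa(primary, agreement))
```

Here the coders agree on 7 of 8 segments (p_o = 0.875), but roughly a third of that agreement would be expected by chance, so Kappa (about 0.81) is lower than raw agreement.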

Table 5 Interobserver agreement and Kappa for coding and duration in Virtuoso-VBVR and Virtuoso-VR

Inductive analysis was conducted on screen, webcam, and audio recordings, as well as interview transcripts and field notes (Benaquisto, 2008), to look for themes related to the nature of learner experience. Axial coding procedures were used to create a set of preliminary codes and operational definitions. Emergent categories and subcategories were continually refined across three major iterations through a constant comparative method (Denzin & Lincoln, 2011). Two overarching themes emerged as particularly relevant to the nature of learner experience in our context: accessibility and user affect. Preliminary evidence of transfer was also identified and coded. A listing of the categories and codes that emerged from this process is provided in Table 6.

Table 6 Qualitative codes and operationalizations that emerged from inductive analysis

Results

Findings are presented below. First, quantitative and qualitative results are triangulated from the perspectives of expert reviewers and usage test participants related to how the design goals of Virtuoso prototypes were found to be acceptable, feasible, easy to use, and relevant to the unique needs of participants. Next, qualitative findings are reported in relation to the nature of learner experience using the Virtuoso-VR and Virtuoso-VBVR prototypes.

RQ1: Manifestation of design goals

Findings from this research question are organized across multiple phases of the research study to present a holistic view of how expert reviewers and usage test participants perceived how the design goals of the Virtuoso prototypes were realized. Perceptions of usability are presented from both user groups as well as perceptions of feasibility and relevance from the perspective of expert reviewers.

Expert testers’ perceptions of usability

Experts evaluated the Virtuoso-VBVR prototype using the Google Cardboard and Google Daydream View. Average SUS scores were 79.38 for the Google Cardboard version of Virtuoso-VBVR and 84.38 for the Google Daydream version. The overall average SUS score was 81.88, nearly 14 points above the benchmark of 68, suggesting good usability. Among the SUS items, the lowest mean and median scores were for the item concerning how quickly the Daydream could be learned. Conversely, findings suggest that the Daydream was not cumbersome.

Experts identified specific issues as impacting the usability of the Virtuoso-VBVR app. These issues were categorized as relating to hardware and software, videos, and task design (Table 7), and suggest important differences between the Cardboard and Daydream headsets. For the Cardboard, expert testers preferred its simple button located on the headset. It required less initial assistance to use, and was also less likely to face issues requiring intervention such as head strap discomfort, or pressing the wrong button. Experts were slightly better able to get started with little instruction using the Cardboard. For the Daydream, some experts had issues navigating with its multi-function remote pointer. Across both devices, expert testers indicated that they experienced some symptoms of cybersickness and anticipated the need for a high degree of support due to insufficient explicit directions. Ultimately, expert responses suggested that they anticipated substantial support would be needed to be able to use the VBVR system.

Table 7 Usability issues identified during expert testing

Participant testers’ perceptions of usability

Participant testers evaluated both the Virtuoso-VBVR and Virtuoso-VR experiences, with both being rated as above average on the SUS in terms of ease-of-use (Table 8). Mean computed SUS scores across all participants for Virtuoso-VBVR were 79.58 (SD = 0.99) and for Virtuoso-VR were 73.33 (SD = 1.08), both above the cutoff score of 68. In addition, participants completed a one-item adjectival scale (Bangor et al., 2009) to rate user-friendliness. On average, participants rated Virtuoso-VBVR as “good” and Virtuoso-VR as “excellent.” These ratings are in contrast to participants’ SUS ratings, which suggested Virtuoso-VBVR was more favorably received.

Table 8 Mean System Usability Scale (SUS) scores across participant testers

Qualitative evidence from deductive analysis suggests both prototypes were easy to use. Results from this deductive analysis also suggest that participants encountered fewer usability problems with Virtuoso-VBVR, perceived the usefulness of its content to be greater, and encountered fewer technology-induced errors. Considering the total number of codes assigned across technologies and sessions (with more assigned codes suggesting more usability challenges), only 16 codes were applied to the Virtuoso-VBVR application, while the Virtuoso-VR platform had 69.

Assigned codes also varied across categories. For instance, the majority of codes assigned to the Virtuoso-VR software were related to users’ problems with (1) understanding instructions, (2) system crashes, and (3) graphical issues. Users of the Virtuoso-VBVR application tended to have some of these problems, but less frequently. For Virtuoso-VBVR, the most frequently applied codes were related to (1) navigation issues, (2) determining the meaning of icons and/or system terminology, and (3) understanding instructions.

The inductive analysis of video data uncovered instances of participants stating that they were having difficulties. As Evan was using Virtuoso-VR within the HTC Vive, his avatar got stuck and he commented, “it’s kind of hard to use this.” He later stated, “it got difficult though” when asked what he thought of the overall experience. Despite these remarks, Evan still rated the system as “Best Imaginable” for user-friendliness.

Expert testers’ perceptions of feasibility and relevance

Feasibility and relevance were investigated in expert testing not as binary constructs but as interrelated and interdependent. In the interviews, experts were asked to provide general commentary on the design of the hardware and software. Generally speaking, experts found the design of Virtuoso-VBVR to be inclusive of participants’ unique needs. The task and task sequences were found to be relevant to learning to use the shuttle. Jacob commented that Virtuoso-VBVR was “overall, a good tool.” Jennifer highlighted the relevance of the approach when she noted, “This has interesting potential for immersive modeling for individuals with ASD.” Barb suggested, “It would be worth asking several of our participants [adults in the day program] to try this,” speaking to the potential feasibility of the approach. Experts agreed the design promoted accessibility. For example, Jennifer commented specifically on the use of visual cues that directed the attention of participants, although she felt that more cues were needed. Jennifer, Barb, and Daria all noted the use of visual symbols that learners would already be familiar with.

Many comments pertained specifically to the videos themselves, for example, how realistic the videos looked, video length, and video stability. Experts found the fidelity of the videos to be adequate. For example, Daria indicated the environment was detailed and realistic, although the resolution was sometimes “fuzzy,” which led to minor visual distortion and difficulty reading text. Experts also noted that the viewport shaking or rotating during movement was potentially overwhelming. Jacob commented, “The shaky cam (sic) was disorienting.” Daria indicated that the short length of the videos was a strength.

Experts provided a number of suggestions for improvement and recommendations for implementation. In line with our design strategy, Jennifer noted, “This system does not, through a single experience, prepare learners for the real-world task,” and recommended repeated use. Importantly, Daria expressed concerns related to the high degree of variability between members of the learner population due to varied sensory processing sensitivities. She suggested this could be difficult to control for in instances of cybersickness.

RQ2: Nature of learner experience

Findings from this research question are presented from the perspective of usage test participants as they used both the Virtuoso-VBVR and Virtuoso-VR prototypes. Participant testers’ learner experience was investigated using inductive methods and stratified using the coding categories outlined in Table 6. In usage testing, exposure to the VR technology was short. Mean computed time to completion across all participants for Virtuoso-VBVR was 0:08:12 (SD = 0:02:06) and for Virtuoso-VR was 0:09:18 (SD = 0:02:17). Participant experiences across both applications fell within our parameters of limiting HMD exposure to < 10 min per session so as to reduce potential adverse effects. However, Virtuoso-VR had somewhat unpredictable stability due to the immaturity of the underlying beta-level software. Frequent system instability and crashes were observed. This has implications for feasibility. Error rates were, on average, 0.32 errors per minute for participants using Virtuoso-VBVR, and 1.24 errors per minute for participants using Virtuoso-VR. Across all sessions, no participants asked if they could leave the usage test. No participants expressed dissatisfaction with either of the intervention platforms. All participants expressed a desire to return and use Virtuoso again, suggesting high acceptability.
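The per-minute error rates reported above follow from dividing the number of coded errors in a session by the session duration in minutes. An illustrative sketch (the error count and duration here are hypothetical, not the study’s data):

```python
def duration_to_minutes(hms):
    """Convert an 'H:MM:SS' duration string to minutes as a float."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 60 + m + s / 60

def errors_per_minute(error_count, hms):
    """Errors per minute for a session of the given duration."""
    return error_count / duration_to_minutes(hms)

# Hypothetical session: 3 coded errors over a 0:09:18 session (9.3 min).
print(round(errors_per_minute(3, "0:09:18"), 2))  # -> 0.32
```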

Generally speaking, usage test participants’ experiences were enjoyable, although errors and bugs were observed. When Travis came back to complete his second time through the Virtuoso-VR activities, the software suffered several process-hindering crashes, which left Travis with time to explore while the online guide restarted the application. However, Travis stated that he was still having fun:

RESEARCHER: So, you also said you had fun. What did you find fun?

Travis: Just trying to mess around with the controls.

It was observed that Travis’ avatar was walking around, waving his hands, and exploring during these crashes. Even though his progress was stalled by crashes, he still found being in the environment enjoyable, suggesting that the technology is perhaps intrinsically reinforcing.

Some participants indicated they were excited to tell their friends about Virtuoso. Evan stated, “I would love to tell my friends about that.” Andy also said he would tell his friends about Virtuoso. Several participants also expressed a desire to return, and Travis did return for a second session. Participants seem to have found their experiences with the system interesting or appealing. Many described Virtuoso as “cool.” Andy asked when we would be releasing our project to the public: “How do the others use the app or get it?… Is this a mobile app that you can get for the iPad?” He followed up by asking when it would be available after we said it was still in development: “How long will you think it’ll be ready?”.

Accessibility was considered from the perspectives of physical accessibility, cognitive accessibility, and cybersickness. Physical accessibility issues were observed primarily as participants sought to gain fluency using the controllers for Virtuoso, including the Google Daydream remote and the HTC Vive controllers. While most users were able to gain fluency quickly with controls, some encountered challenges. For instance, Kevin has challenges with fine-motor skills. He struggled to operate the Vive’s default controllers, which are flat and require some dexterity. Virtuoso supports alternative input devices for such situations; thus, the Vive controllers were replaced with a Microsoft Xbox 360 controller, which resulted in Kevin being able to navigate without further issues.

Another example of a physical accessibility issue was when Evan struggled to navigate his avatar with the HTC Vive default controllers. He repeatedly asked for assistance:

Evan: How do you do that?

RESEARCHER: Alright. You take your thumb… and remember, you push forward. Push down with that thumb.

Evan: Push down with this thumb?

RESEARCHER: Push down… yeah!!! You got it.

Evan: [still struggling with controls] That’s okay. It’s a little kind of… it’s kind of hard.

In contrast to Kevin overcoming physical accessibility challenges by using a different controller, Evan was not able to gain a high degree of fluency. It appears that Evan struggled with mapping controller inputs with avatar control. It is unclear whether Evan forgot the controls, whether he needed more practice, or whether he needed to use an alternative controller.

In designing Virtuoso, we set out to instantiate design standards that could address challenges related to using technologies for people with cognitive disabilities. In doing so, we paid particular attention to supporting executive functioning, language, literacy, perception, and reasoning. Participants benefited from these features with the exception of Jermaine when using Virtuoso-VBVR. Jermaine is pre-literate and had trouble understanding terms and icons used to select the 360-degree videos, inadvertently skipping one of the tasks when he was unable to correctly identify the button he was asked to select. In this instance Jermaine was asked to watch the third video of the Virtuoso-VBVR application. Although the buttons are labeled both numerically and with graphical representations of the activity, he was unable to make the correct selection and watched the fourth video instead. In the remaining Virtuoso-VR tests, the online guide read all text and instructions to the participant to help address these accessibility issues.

Following each usage testing session, we asked our participants if they were feeling any kind of physical discomfort that might suggest cybersickness. Most participants did not report headaches, eyestrain, dizziness, or nausea. Three participants stated they were feeling fine and free of symptoms. Andy and Travis, however, reported feelings of discomfort. Andy related the following:

RESEARCHER: Do your eyes feel weird?

Andy: Little bit.

RESEARCHER: Do you feel at all dizzy?

Andy: Maybe a tiny bit. Not too bad.

RESEARCHER: Yeah? And does your stomach feel okay? Any nausea?

Andy: It feels a little weird but it’s mostly okay.

RESEARCHER: Yeah? So, you think that might be because you’re hungry because it’s almost lunchtime or…?

Andy: Yeah. It could be because it’s almost lunchtime.

While it is unclear if Andy was exhibiting symptoms of cybersickness, or simply feeling hungry, evidence from Travis suggests a clear connection from his symptoms to cybersickness. After using the Google Daydream, Travis leaned back and seemed to be feeling uncomfortable.

RESEARCHER: So, I see you’re kind of leaning back, why are you leaning back?

Travis: I’m disorientated.

RESEARCHER: You’re disorientated. Can you…

Travis [distressed]: Oh.

RESEARCHER: What do you mean by that when you say that you’re ‘disorientated?’ Can you explain?

Travis: Like that… [inaudible]. Okay. [exasperated tone] Oh boy!

RESEARCHER: So, what do you mean disoriented? Can you tell me how you physically feel?

Travis: Like [tone rising] whooooo boy!

RESEARCHER: Like dizzy or a headache?

Travis: Just it’s interesting to be outside… Oh no, it’s somewhat dizzy and somewhat out of the roof.

RESEARCHER: So, when you took off the headset…

Travis: Yeah. It felt like… give me a minute.

RESEARCHER: Okay. Do you need a drink or water or something?

Travis: I’m fine.

RESEARCHER: Okay...

Travis: So, I don’t think virtual reality is for me.

Despite claiming that “I don’t think virtual reality is for me,” Travis insisted on returning to complete the second part of that day’s session and later asked to return the following day for a second usage test. In subsequent testing sessions, Travis did not demonstrate or report any further symptoms of cybersickness.

Analysis suggests that users found the environment to be useful and realistic, and that they were making some connections between Virtuoso and the real world. Most participants commented on the recognizability of the assets and were able to match their locations within the virtual environment to the corresponding real-world locations. All participants recognized the office spaces and, when prompted, were able to walk to their personal workspaces and those of their friends. Using Virtuoso-VBVR, Travis immediately recognized his location in the office space and reacted positively: “Oh. Wait. I’m a little— that’s the plant thing… … before you come in. And that’s the work… how did you formulate this? This is awesome.” Evan was also able to recognize many of the assets: “I know where those seats are… It’s outside by Dyer Hall. That’s right. Where the new café is. That’s right. And it’s University Square straight ahead.”

Not only did participants find the assets recognizable and realistic, they also gave indications of a virtual sense of presence, that is, a feeling that they were “really there.” Andy had the following exchange:

RESEARCHER: Do you feel like you were there with [the online guide]?

Andy: I do.

RESEARCHER: You do? Do you feel like you were really there?

Andy: I do.

RESEARCHER: You do?

Andy: I really do when I’m going out the door.

RESEARCHER: Did it feel like you were in the office?

Andy: It did.

RESEARCHER: Did it feel like you were outside?

Andy: It did.

RESEARCHER: Did it feel like you were going on a bus?

Andy: Yeah.

Realism of assets and environments was noted by nearly all usage test participants, who were readily able to recognize where they were, both inside the office space and outside on the virtual university campus. After usage testing concluded, some participants were also able to identify their current location relative to the environments and tasks portrayed in the Virtuoso environment, suggesting some degree of transfer. For example, when Evan was asked, “Do you think you can find that shuttle stop outside?” he responded positively and was able to look out the window and point to the shuttle stop he had visited in the virtual environment. He then indicated that the environment was “pretty real” and that he could tell it was the university campus. In another session, Andy indicated he might be able to identify the location of the bus stop:

RESEARCHER: Do you feel like it was realistic?

Andy: I do.

RESEARCHER: Do you feel like if I asked you to point out where that bus stop was, do you think you could point where it was?

Andy: Probably.

Discussion

In this paper, a prototype VR adaptive skills intervention for adults with ASD was presented. The intervention aimed to help participants learn to use a university shuttle in a safe and appropriate manner. In the current research, participants watched 360-degree videos that modeled the task, followed by rehearsal of the task in a multi-user VR environment. We sought to answer two research questions. First, we sought to understand how the design goals of being acceptable, feasible, easy to use, and relevant to the unique needs of participants manifested in the usage test findings. Second, we sought to explore the nature of the learner experience. In addressing these questions, we sought to reveal design flaws and uncover opportunities to improve the overall learner experience for participants. Borrowing from Rogers’ (2003) adoption characteristics, our findings have implications from the perspectives of (a) relative advantage, (b) compatibility, (c) complexity, and (d) trialability. A range of improvements is evident in terms of Virtuoso’s relative advantage over its precursor. Prior to Virtuoso, no formal shuttle training existed; when training was performed, it was done in an impromptu and unsystematic manner. Not only did Virtuoso formalize shuttle training through task analysis and explication of learning outcomes, it also created an environment in which activities that are dangerous, impossible, counterproductive, or expensive (DICE) could be performed (Bailenson, 2018). From this perspective, Virtuoso was highly compatible with the training needs communicated by the Impact Innovation director, needs stemming from staff turnover, a lack of systematic training, participants’ anxiety about using the shuttle, and a need for consistency and fidelity in training. Further, the intervention attended to both the physical and cognitive accessibility needs of participants.

Regarding the complexity of the intervention, which Rogers (2003) casts as the degree to which it is perceived as difficult to use, findings suggest both prototypes were easy to use. Mean SUS scores were well above the accepted threshold for a system to be considered usable. All participants were able to complete all phases of usage testing successfully (although not without help), and some came back for multiple trials. Qualitative inquiry into the nature of participants’ learning experiences indicates a largely positive experience characterized by positive affect. Further, the research presented here unveils important insights into the trialability of Virtuoso, or the degree to which it could be experimented with before adoption. Overall, evaluation results for the Virtuoso-VBVR and Virtuoso-VR prototypes are positive. Expert review findings suggest that the intervention is feasible and relevant. Experts’ perceptions of usability suggested that Virtuoso-VBVR was usable, but they cautioned that substantial training would be needed and that the VBVR intervention alone would not be sufficient for skill acquisition. However, participants’ perceptions of the usability of Virtuoso-VBVR were high, and participants required very little training before they were able to use the intervention. In addition, all participants were able to complete the Virtuoso-VBVR activities, although some support was needed initially to show how the software worked and to remind participants of the sequence in which to watch the videos. While experts indicated a preference for the Cardboard HMD over the Daydream, findings suggest that participants were able to operate both devices with relative ease and few errors.
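For readers unfamiliar with the System Usability Scale referenced above, scores are derived from ten 1–5 Likert items using a standard arithmetic procedure, with a score of roughly 68 commonly treated as the benchmark for acceptable usability. A minimal sketch of that scoring arithmetic follows; the example responses are illustrative only and do not correspond to data reported in this study.

```python
def sus_score(responses):
    """Compute a System Usability Scale score from ten 1-5 Likert responses.

    Odd-numbered items (positively worded) contribute (response - 1);
    even-numbered items (negatively worded) contribute (5 - response).
    The summed contributions are multiplied by 2.5, yielding a 0-100 score.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS requires ten responses on a 1-5 scale")
    total = sum(
        (r - 1) if i % 2 == 0 else (5 - r)  # items 1, 3, 5, ... sit at even indices
        for i, r in enumerate(responses)
    )
    return total * 2.5

# Illustrative (hypothetical) response set: strongly positive on odd items,
# strongly negative agreement on even items yields the maximum score.
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # 100.0
```

A mean of these per-participant scores is what is compared against the usability threshold.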

Limitations

A range of limitations was noted in the execution of this research. In line with nearly all research performed in this area, our sample size was limited and the research did not incorporate a comparison group. Future research should consider incorporating comparison groups, such as a neurotypical peer comparison group. However, given that individuals with ASD represent a low-incidence disability group, challenges with small sample sizes are likely to remain. Importantly, participants spent a limited amount of time within the treatment conditions, so questions remain as to whether a novelty effect is associated with the technology; it is unclear whether further exposure and multiple applications of Virtuoso might change this. Moreover, we used multiple hardware configurations, chosen to reduce potential adverse effects; this choice, however, could itself have influenced participants’ acceptance of the technology. Future research should consider how best to implement HMDs with this population, as there are currently no standards or guidelines. Furthermore, while learner experience was generally positive, analysis uncovered accessibility challenges that could be exacerbated by prolonged exposure to the system. Although most participants were able to learn to use the system quickly, notable concerns arose around physical accessibility, cognitive accessibility, and cybersickness.

Implications

Rogers (2003) maintains that trials of a given innovation can lead to important discoveries that allow for reinvention. An example of this from the current study is the set of participant anecdotes that aligned with the generalization heuristics we incorporated into the current design (Stokes & Osnes, 2016). For example, participants noted connections between real-world objects and objects they experienced in the virtual world, and made explicit connections between their locations in the virtual world and in the real world. Future research will be needed to expose these connections more explicitly. However, we are concerned by the tendency of HMD devices to induce adverse effects (e.g., cybersickness), particularly among a population with significant sensory processing differences. While there is an emerging base of evidence suggesting that people with ASD find desktop-based VR and its use acceptable, it is unclear whether those findings extend to HMD-based VR (Bozgeyikli et al., 2018). Prior research suggests that the majority of HMD users will experience adverse effects, including nausea, headaches, eye strain, dizziness, and an array of psychosomatic irregularities (Cobb et al., 1999; Dennison et al., 2016). Importantly, the vast majority of research on adverse effects has been conducted with neurotypical individuals. Very little research exists on HMD use by individuals with ASD, and we have been unable to locate any research on multi-user HMD-based VR interventions for this population. The lack of research in this area has led to ethical concerns regarding adoption (Newbutt et al., 2016). We as researchers have a special obligation and a greater responsibility to minimize the real or potential risks of HMD-based VR for individuals with ASD (American Educational Research Association, 2011).

Addressing issues of generalization and cybersickness in future research will necessitate a return to values about priorities (Honebein & Honebein, 2015). The current iteration of research and development unveiled important insights regarding Virtuoso’s appeal and efficiency, but required that we sacrifice the opportunity to investigate generalization effects (effectiveness). While it may be tempting to question whether this was the “right” trade-off, we argue that this question is misguided. Designing interventions in this research area constitutes a wicked problem, steeped in complexity. Compounding this are multiple gaps, including (1) a paucity of design precedent in this research area, (2) a lack of substantial theoretical guidance, and (3) an unfortunate failure to include participants in the design of interventions. Our approach to addressing these gaps reflects our values about priorities: values situated in principles of respect and dignity for our participants and in an obligation to engage in research with this population in the most principled and virtuous manner we are able. We therefore urge future researchers to adopt practices that include and involve participants, that are sensitive to the vulnerabilities of the target population, and that honor researchers’ special obligation and greater responsibility when working with this population.

Exploring the impact of Virtuoso on generalization in future iterations will require a re-balancing of effectiveness, efficiency, and appeal. Moreover, designing a learning experience that seeks to minimize adverse effects like cybersickness reveals yet another potential trade-off within this dynamic. We have begun developing careful implementation procedures to gradually acclimate participants to the technology and potentially desensitize them to adverse effects (Schmidt et al., 2021), but research on the effectiveness of this approach is still needed. From the perspective of the Instructional Design Iron Triangle, prioritizing effectiveness (whether in terms of establishing generalization outcomes or minimizing adverse effects) could require a sacrifice in terms of either efficiency or appeal. Given the limitations of extant research and design precedent already discussed, a range of approaches will likely be needed to inform this decision. As Honebein and Honebein (2015) maintain, “While this approach may likely diminish efficiency on the design side (that is, it is more costly for the designer to blend multiple methods), it may help balance what the learner experiences on the delivery side” (p. 953). Further guidance is needed to inform designers in this research area regarding trade-offs and sacrifices, an area for future research.