Introduction

Many business analyses and reports (e.g., Belini et al. 2016; Greenlight and Roadtovr 2016) predict that virtual reality (VR) could be the biggest future computing platform of all time. Furthermore, over $4 billion has been invested in VR start-ups since 2010 (Benner and Wingfield 2016) with the expectation that VR could revolutionize the entertainment, gaming, and education industries (e.g., Blascovich and Bailenson 2011; Standen and Brown 2006; Taylor and Disinger 1997). All of the attention surrounding VR in mainstream media, as well as investment by large technology companies like Apple, Facebook, Google, Microsoft, and Samsung, indicates that VR will be used for many applications including learning. Several educational simulations already exist, including Google’s simulation software Expedition, which allows students to go on virtual field trips, and NASA’s PlayStation VR demo, which allows operators to practice using robotic arms, and many more are on the way (Dredge 2016; Singer 2015). Given this emerging trend, it is important to gain a better understanding of the utility and impact of VR when it is applied in an educational context. The emotional impact of immersion is one important factor to consider when investigating the utility of VR. This is important because VR fosters a higher level of immersion than standard media, which in turn could facilitate learning through positive emotions such as enjoyment (e.g., Picard et al. 2004). According to the control value theory of achievement emotions (CVTAE; Pekrun 2000), this is possible to the extent that added immersion fosters appraisals of control and positive value for the task and object of learning. However, there is limited empirical evidence of the affective value of immersive VR, and even less research that investigates the psychological process by which added immersion impacts students’ interest and motivation, or whether it could facilitate self-regulation and performance in the learning process.

In order to fully understand how learners process a virtual environment (VE), it is necessary to consider their emotional responses to this environment (Plass and Kaplan 2016). This is especially important when designing educational material: an understanding of the underlying mechanisms that impact learners’ perceptions and motivations can guide the development of VR learning simulations that incorporate emotional considerations, which in turn can lead to increased use and better cognitive outcomes (i.e. learning). Consequently, the main objectives of this study are to: (A) investigate the emotional value of an immersive VR science simulation as compared to a desktop VR version of the simulation; and (B) use structural equation modeling (SEM) to investigate the process by which the level of immersion in a VR simulation impacts non-cognitive outcomes including satisfaction, perceived learning, and intentions to use the simulation.

Existing definitions and types of VR

According to Burdea and Coiffet (1994), VR can be defined as “a high end user interface that involves real-time simulation and interaction through multiple sensorial channels”. Similarly, Lee and Wong (2014) argue that VR is a way of simulating or replicating an environment which a person can explore and interact with. Furthermore, Biocca (1992) defined virtual reality as “an environment created by a computer or other media, an environment in which the user feels present”. Although these three definitions vary somewhat, they all emphasize that VR is a way of simulating or replicating an environment.

Currently, several different VR systems exist, including cave automatic virtual environment (CAVE), head-mounted displays (HMD) and desktop VR. CAVE is a projection-based VR system with display-screen faces surrounding the user (Cruz-Neira et al. 1992). As the user moves around within the bounds of the CAVE, the correct perspective and stereo projections of the VE are displayed on the screens. The user wears 3D glasses inside the CAVE to see 3D structures created by the CAVE, thus allowing for a very lifelike experience. An HMD usually consists of a pair of head-mounted goggles with two LCD screens portraying the VE by obtaining the user’s head orientation and position from a tracking system (Sousa Santos et al. 2008). An HMD may present the same image to both eyes (monoscopic), or two separate images (stereoscopic), making depth perception possible. Like the CAVE, an HMD offers a very realistic and lifelike experience by allowing the user to be completely surrounded by the VE. As opposed to CAVE and HMD, desktop VR does not allow the user to be surrounded by the VE. Instead, desktop VR enables the user to interact with a VE displayed on a computer monitor using keyboard, mouse, joystick or touch screen (Lee and Wong 2014; Lee et al. 2010). According to Cummings and Bailenson (2016), the immersiveness of VR is largely dependent on system configurations or specifications, as opposed to aspects of the mediated content itself. Accordingly, Cummings and Bailenson (2016) argue that immersion can be regarded as an objective measure of the extent to which the VR system presents a vivid VE while shutting out physical reality. By this account, VR systems such as CAVE and HMD, which effectively shut out the physical reality while offering high fidelity, can be characterized as immersive VR. On the other hand, VR systems such as desktop VR, which have little or no ability to shut out the physical reality and offer limited fidelity, can be characterized as non-immersive VR.

Existing research about the impact of immersive VR

Several studies have found that desktop VR simulations can have a positive impact on cognitive (e.g., Lee and Wong 2014; Bonde et al. 2014), and non-cognitive (e.g., Makransky et al. 2016a, b; Thisgaard and Makransky 2017) outcomes. Results from a recent meta-analysis suggest that students who receive a combination of non-immersive VR and traditional teaching outperform students who either receive traditional teaching, 2-D images or no treatment (Merchant et al. 2014). However, early research into the effectiveness of using immersive VR in education has been inconclusive. Several studies have found immersive virtual training procedures to produce positive cognitive outcomes in a number of settings including engineering (Alhalabi 2016), military (Webster 2016) and robotic surgery (Bric et al. 2015). For instance, Alhalabi (2016) compared traditional teaching with an immersive learning environment presented via a Corner CAVE System (CCS) or HMD. In this study Alhalabi (2016) reported that engineering students learn significantly more about astronomy, transportation, networking and inventors when using an immersive learning environment presented via either CCS or HMD, as compared with traditional teaching. Similarly, Webster (2016) compared lecture-based and immersive VR-based multimedia instruction in terms of declarative knowledge acquisition in a military setting. The results from this study indicate that military personnel learn significantly more about corrosion prevention and control principles and theory when using an immersive VR simulation presented via HMD as compared to a traditional lecture with PowerPoint slides. Furthermore, through a review of the existing literature Bric et al. (2015) concluded that training with immersive VR simulations (e.g. the Da Vinci surgical simulator) significantly improves basic robotic surgical skills. 
On the other hand, other studies that have tested whether immersive VR leads to better cognitive outcomes compared to desktop VR have found neutral or negative results. For instance, Moreno and Mayer (2002) compared the results of an immersive VR and a desktop VR simulation while sitting and walking. Two separate experiments were conducted in which students were asked to use either an immersive VR or a desktop VR simulation designed to teach students about botany. The desktop VR and the immersive VR simulation were identical in terms of content and differed only in terms of mediation and controls. In each of the two experiments Moreno and Mayer (2002) found that the immersive VR simulation did not increase performance on measures of retention and transfer (i.e. learning). Similarly, Stepan et al. (2017) found that medical students do not learn more about neuroanatomy when watching a 3D video and interacting with a 3D model of the human brain in an immersive VE presented via HMD as compared to reading online textbooks for the same duration of time. Finally, Makransky et al. (2017b) found that immersive VR led to higher levels of presence but less learning. The study also found that the immersive VR condition led to higher cognitive load, which was measured using EEG. The authors suggest that the specific affordances of the media, and the factors that influence learning, should be considered in designing learning content for immersive VR.

The limited and inconclusive results suggest that there are many factors that can play a role in how immersive VR leads to educational outcomes. Therefore, from an instructional design perspective it is important to understand the process by which learners interact with a VE, and how this leads to educational outcomes. Furthermore, although immediate cognitive outcomes are relevant to determine the immediate value of an educational intervention, non-cognitive outcomes such as emotions (Plass and Kaplan 2016), intrinsic motivation (Ryan and Deci 2000), enjoyment, and intrinsic value of the learning activity (Pekrun 2006) have all been shown to have long-term positive effects on learning and transfer. Consequently, non-cognitive outcomes may be more relevant for determining the ultimate value of immersive VR based on the expectation that positive emotions and the intrinsic value of the tool will lead to more use and ultimately higher long-term cognitive outcomes. Therefore, the approach used in this study is to investigate the process by which the level of immersion through technology impacts non-cognitive and perceived learning outcomes.

Virtual simulations in training and education

One field where immersive VR could play a particularly central role is virtual simulations used for training and education (Bodekaer 2015). Most areas of industry need highly skilled employees. Many of these skills require mastery through intensive repeated practice, training, and hands-on practical experience, which are often both time-consuming and expensive. Virtual learning simulations are practical and economical, and can supplement or be an alternative to real-life skills training (e.g. Herrmann-Werner et al. 2013; Issenberg 1999; McGaghie et al. 2010; National Research Council 2011). Furthermore, virtual learning simulations provide students and trainees with cost-effective and elaborate teaching methods that enhance both cognitive and non-cognitive outcomes (Bonde et al. 2014; Makransky et al. 2016a, b). By learning and training in a VE, students and trainees can practice uncommon scenarios and time-consuming work whenever the need arises, without having to wait for the correct materials. In comparison, if trainees had to acquire the same amount of practical experience using traditional real-world training methods, the cost would far exceed that of using a virtual learning simulation. Several meta-analyses and empirical studies investigating the efficiency of simulations have shown that overall, the use of simulations results in at least as good or better cognitive outcomes and attitudes toward learning than do more traditional teaching methods (Bayraktar 2000; Rutten et al. 2012; Smetana and Bell 2012; Vogel et al. 2006). However, a recent report concludes that there are still many questions that need to be answered regarding the value of simulations in education (National Research Council 2011). In the past, virtual learning simulations were primarily accessed through desktop VR.
With the increased use of immersive VR it is now possible to obtain a much higher level of immersion in the virtual world, which enhances many virtual experiences (Blascovich and Bailenson 2011). However, empirical research is needed to investigate whether immersive VR enhances the benefits of virtual learning simulations.

Theory and predictions

Recent advances in motivational theory (Renninger and Hidi 2016; Wentzel and Miele 2016) suggest that an understanding of how to harness the emotional appeal of e-learning tools is a central issue for learning and instruction, since research shows that initial situational interest can be a first step in promoting learning (Renninger and Hidi 2016). Furthermore, a learner’s emotional reaction to instruction can have a great influence on academic achievement (Pekrun 2016). There are several educational theories that describe the affective, emotional, and motivational factors that play a role in multimedia learning which are relevant for understanding the role of immersion in VR learning environments. Some theories, such as the cognitive-affective theory of learning with media (Moreno and Mayer 2007), and the integrated cognitive affective model of learning with multimedia (ICALM; Plass and Kaplan 2016), include general emotional factors; but they do not describe specific emotional or motivational constructs and their relationships in detail. One theory that provides a more detailed description of the emotional process during learning is CVTAE (Pekrun 2000). CVTAE posits that learning can be facilitated through positive achievement emotions such as enjoyment (Pekrun 2006; Pekrun and Stephens 2010). According to CVTAE, enjoyment arises to the extent that instructional design elicits and promotes appraisal of control and intrinsic value for the educational content (Plass and Kaplan 2016). Consequently, enjoyment can be predicted to be the strongest when high intrinsic value is combined with an appraisal that the learning activity is sufficiently controllable (Pekrun 2006 p. 323). CVTAE highlights two design features as crucial for instructional design. The first is to give the learner a sense of autonomy (Pekrun and Stephens 2010). The second is to evoke intrinsic value for the task and object of learning (Pekrun 2006). 
By establishing both autonomy and intrinsic value, instructional designs can reduce the amount of negative emotions, such as anger and frustration, and facilitate enjoyment (Plass and Kaplan 2016). As such, intrinsic motivation and enjoyment are important affective factors with regard to the process of learning. Furthermore, cognitive factors such as the appraisal of cognitive benefits and the amount of control or autonomy also play an important role in the learning process.

To further understand how immersive VR technology can facilitate learning, it is necessary to build on the CVTAE with a conceptual framework that incorporates the relevant constructs and possible relationships that play a role in the learning process while using VR technology. There have been several models developed specifically for learning within VEs. Using a different approach based on media technology models, Lee et al. (2010) developed and tested a framework that specifies the causal relationships between factors that play a role in desktop VR-based learning environments. The framework is based on Salzman et al. (1999) and the technology-mediated learning models of Alavi and Leidner (2001), Piccoli et al. (2001), and Wan et al. (2007). The framework used in the current study is grounded in CVTAE and the model by Lee et al. (2010). Figure 1 illustrates the framework of the a priori model that describes the hypothesized relationships between the variables used in this study. Lee et al. (2010) have shown that most of these variables play a role in the learning process when using a desktop VR platform; but our model includes the addition of immersive/desktop VR, and a distinction between affective and cognitive variables based on CVTAE. In the a priori model presented in Fig. 1 we predict that the level of immersion in a science simulation (immersive/desktop VR) will predict VR features and usability. These will in turn predict the affective as well as the cognitive variables, which will predict the perceived learning outcomes. Below, we provide a quick introduction to and description of the relevance of the variables that are used in the model; a more detailed overview can be found in Lee et al. (2010) and Salzman et al. (1999).

Fig. 1 A priori model

VR features: representational fidelity and immediacy of control

Research has shown that simulation features play a significant role in mediating the experience of learning and interaction, which in turn improves educational outcomes (Choi and Baek 2011; Dalgarno and Lee 2010; Lee et al. 2010). For instance, Lee et al. (2010) investigated how desktop VR affects cognitive outcomes, and found representational fidelity and immediacy of control (i.e. VR features) to be directly and indirectly linked to a number of non-cognitive outcomes (e.g. motivation, presence, usability etc.), which in turn affected students’ cognitive outcomes. Similarly, Choi and Baek (2011) used a desktop VR simulation to identify media characteristics which influence students’ experience of flow while learning in a VE. Using exploratory factor analysis and multiple regression analysis, Choi and Baek (2011) found representational fidelity and interactivity (i.e. variables of ‘control’ and ‘immediacy’ combined) to be linked with students’ experience of flow. Lastly, through a review of the existing literature Dalgarno and Lee (2010) also identified representational fidelity and learner interaction as two important factors for learning in VEs. According to Witmer and Singer (1998), the factors which influence the experience of learning and interaction are control factors (i.e., the amount of control that users have in the VE) and realism factors (i.e., the degree of realism of the objects and situations in the environment). Consistent with previous studies (e.g., Lee et al. 2010), this study operationalized the realism factors as representational fidelity and the control factors as the immediacy of control. Representational fidelity is characterized by the degree of realism offered by the 3-D images and scene content, the degree of realism provided by smooth object and view changes, and the degree of consistency in object behavior (Dalgarno and Lee 2010).
Immediacy of control refers to the user’s ability to change his or her point of view, as well as the ability to manipulate and interact with objects within the VE (Lee et al. 2010).

Usability: perceived usefulness and perceived ease of use

Usability is both a dependent variable that is influenced by the VR features of the VE and an independent variable that influences the affective and cognitive factors in the framework. Previous research has identified perceived usefulness and perceived ease of use as important components that influence students’ interactional experiences when using educational technology (Salzman et al. 1999). Perceived usefulness refers to the degree to which students believe that using the platforms will enhance their performance. Perceived ease of use is defined as the degree to which students believe that using the platforms is easy or difficult (Davis 1989).

Affective factors

Presence, intrinsic motivation, enjoyment, and control and active learning are the affective factors used in this study. Presence is defined as the psychological state in which the virtuality of the experience goes unnoticed (Lee 2004). According to Witmer and Singer (1998), both involvement (i.e., focusing one’s attention on a coherent set of stimuli) and immersion (i.e., perceiving oneself as enveloped by, included in, and interacting with a VE) are necessary to experience presence (Schuemie et al. 2001). Intrinsic motivation is defined as performing an action for the inherent satisfaction of the performance itself (Deci et al. 1991; Ryan and Deci 2000). Intrinsic motivation has often been linked to positive educational outcomes including those with regard to attention, effort, behavior, and grades (Hardré and Sullivan 2008; Linnenbrink and Pintrich 2002). Perceived enjoyment is the degree to which a student finds a VE pleasant, fun, and enjoyable (Tokel and İsler 2015). Control and active learning refer to the amount of autonomy available in a VE, which allows the students to actively take control of their own learning experience.

Cognitive factors

Cognitive factors in this framework include cognitive benefits and reflective thinking. Cognitive benefits are described as improved understanding and application as well as a more positive perception of the learned material (Lee et al. 2010). Dewey (1933, p. 9) defined reflective thinking as the “active, persistent, and careful consideration of any belief or supposed form of knowledge in the light of the grounds that support it and the conclusion to which it tends”.

Outcome variables

The dependent outcomes of the proposed framework were behavioral intentions, satisfaction, and perceived learning. Behavioral intentions are the degree to which the student intends to use the simulation for learning in the future. Satisfaction refers to the degree to which the student finds the simulation satisfactory, while perceived learning represents the degree to which the student perceives the simulation as educational. These outcome variables were included in the framework because previous studies identified these factors as specifically relevant when using VR technology in an educational context (Lee et al. 2010).

Materials and Methods

Sample

The sample consisted of 104 students (39 females and 65 males; average age = 23.8 years) from a large European university. All participants were provided with written and oral information describing the research aims and the experiment before participation. Written consent was collected from all participants in accordance with the ethical regulations of the Health Research Ethics Committee in Denmark.

Procedure

The experiment used a crossover repeated-measures design that involved all of the participants using both the immersive VR (Samsung Gear VR with Samsung Galaxy S6) and the desktop VR version of a virtual laboratory simulation (on a standard computer). The participants were randomly assigned to two groups: the first used the immersive VR followed by the desktop VR version, and the second used the two platforms in the opposite sequence. Both groups began with a preliminary test to measure their individual background information. The participants had a maximum of 20 min to play the virtual simulation on each platform to enable comparability. After using each virtual simulation platform, the participants completed a survey that measured their experience of playing the simulation on a specific platform.
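The random assignment to the two platform orders could be sketched as follows. This is only an illustration: the text does not state how the randomization was carried out or whether the two groups were of equal size, so the even split, the seed, and the function name `assign_crossover_orders` are all assumptions.

```python
import random


def assign_crossover_orders(participant_ids, seed=0):
    """Randomly split participants into the two platform orders.

    Assumption: an even split; the study does not report group sizes.
    """
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "immersive_then_desktop": shuffled[:half],
        "desktop_then_immersive": shuffled[half:],
    }


# 104 participants, identified here by hypothetical IDs 1..104
groups = assign_crossover_orders(range(1, 105))
```

Counterbalancing the order in this way means that any carry-over from the first platform to the second is spread across both conditions rather than confounded with one of them.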

Survey

The survey included demographic questions such as age, gender, and year of study in addition to items measuring the 13 constructs used in this study. A full list of items and the source of the scale is included in “Appendix”. Representational fidelity was measured with three items adapted from Lee et al. (2010; e.g., The realism of the 3-D helps enhance my understanding). Immediacy of control was measured with four items adapted from Lee et al. (2010; e.g., The ability to manipulate the objects in real time helps to enhance my understanding). Perceived usefulness was measured with four items adapted from Davis (1989; e.g., This type of virtual reality/computer simulation is useful in supporting my learning). Perceived ease of use was measured with four items adapted from Davis (1989; e.g., Overall, I think that this type of virtual reality/computer program is easy to use). Presence was measured with 10 items adapted from Sutcliffe et al. (2005; e.g., My experiences in the virtual environment seemed consistent with real world experiences). Motivation was measured with seven items adapted from Lee et al. (2010; e.g., I would describe the virtual laboratories as very interesting). Perceived enjoyment was measured with three items adapted from Tokel and İsler (2015; e.g., I have fun using virtual reality/computer simulations). Control and active learning was measured with five items adapted from Lee et al. (2010; e.g., This type of virtual reality/computer program allows me to have more control over my own learning). Cognitive benefits was measured with four items adapted from Lee et al. (2010; e.g., This type of virtual reality/computer program makes the comprehension easier). Reflective thinking was measured with four items adapted from Lee et al. (2010; e.g., Virtual reality/computer simulations enable me to reflect on how I learn). Perceived learning was measured with eight items adapted from Lee et al. (2010; e.g., I gained a good understanding of the basic concepts of the materials). Satisfaction was measured with seven items adapted from Lee et al. (2010; e.g., I was satisfied with this type of virtual reality/computer-based learning experience). Behavioral intention to use was measured with four items adapted from Tokel and İsler (2015; e.g., I would use virtual reality/computer simulations frequently in the future). All of the items were scored on a five-point Likert scale ranging from (1) strongly disagree to (5) strongly agree.
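As a minimal sketch of how a construct score might be derived from these items, the example below averages one participant's responses to a scale. Scoring by item mean is an assumption (the text does not state the scoring rule), and the responses and the function name `score_scale` are hypothetical.

```python
from statistics import mean


def score_scale(responses):
    """Average one participant's 1-5 Likert responses for a single construct.

    Assumption: scale scores are item means; the text does not say so explicitly.
    """
    if not responses:
        raise ValueError("at least one item response is required")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    return mean(responses)


# Hypothetical answers to the three perceived enjoyment items
enjoyment_score = score_scale([4, 5, 4])
```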

Statistical analyses

Statistical analyses were performed using IBM SPSS version 23.0 and Mplus version 7.31 (Muthén and Muthén 2012). The items were treated as ordinal variables, and model fit is reported using the following goodness-of-fit indices according to Hu and Bentler (1999): the comparative fit index (CFI), the Tucker-Lewis Index (TLI), and the root mean square error of approximation (RMSEA). Acceptable fits were indicated by CFI and TLI scores ≥ 0.90 and an RMSEA score ≤ 0.06.
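The decision rule for acceptable fit can be written as a short check. The sketch below encodes only the cutoffs stated in this section; the function name `acceptable_fit` is hypothetical, and the three sets of fit values are those reported in the Results section of this paper.

```python
def acceptable_fit(cfi, tli, rmsea, min_cfi=0.90, min_tli=0.90, max_rmsea=0.06):
    """Return True when all three indices meet the cutoffs stated in the text."""
    return cfi >= min_cfi and tli >= min_tli and rmsea <= max_rmsea


# Fit values as reported in the Results section
measurement_model = acceptable_fit(cfi=0.96, tli=0.95, rmsea=0.05)
a_priori_model = acceptable_fit(cfi=0.90, tli=0.90, rmsea=0.07)
final_model = acceptable_fit(cfi=0.91, tli=0.91, rmsea=0.06)
```

Under these cutoffs the a priori structural model falls short only on RMSEA, which is consistent with its description as almost acceptable.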

VR learning simulation

The VR learning simulation used in this experiment was developed by the company Labster and designed to facilitate learning within the field of biology at a university level. The VR simulation was based on a realistic murder case in which the participants were required to investigate a crime scene, collect blood samples and perform DNA analysis in a high-tech laboratory in order to identify and implicate the murderer (see Labster 2017 for a video description of the simulation).

The main learning objective was to develop an understanding of DNA profiling and short tandem repeats. The VR simulation started off with the user being introduced to a crime scene. After investigating the crime scene and collecting blood samples, the user accesses a virtual laboratory to perform a DNA analysis. In the virtual laboratory a PCR kit, purified DNA from the crime scene, and a full lab bench setup are available to the user. Once in the laboratory the user is asked to mix the correct reagents and perform a PCR in the PCR machine. Next the user has to run a gel on the collected sample and compare the patterns emerging on the gel with other already prepared samples from suspects. Finally, the user is asked to identify the murderer.

The VR simulation utilized an inquiry-based approach (Bonde et al. 2014), allowing the user to virtually work through the procedures of DNA analysis by using and interacting with the relevant laboratory equipment. Furthermore, the simulation was designed based on the guided activity principle (Moreno and Mayer 2007), so that the user received step-by-step guidance from a virtual female laboratory assistant. According to the guided activity principle, students learn more when they have the opportunity to interact with a pedagogical agent who guides their learning (Mayer 2004). The theoretical background for the guided activity principle is that prompting students to actively engage in selection, organization, and integration of information stimulates essential and generative processing (Moreno and Mayer 2007).

The VR simulation used in this study included several different forms of interactivity: i.e. dialoguing, manipulating, and controlling (Moreno and Mayer 2007). In the simulation dialoguing was achieved through an interaction with the female laboratory assistant and through optional selection of additional information through Wiki-links. Manipulation was attained through the opportunity to control and move objects around the screen. Moreover, the user had to find the appropriate tools and prepare them correctly. Lastly, controlling was achieved by letting the user decide when to proceed with the experiment, and by letting the user choose whether to read additional information. The desktop VR version of the simulation was optimized for a computer screen presentation, while the immersive VR version was optimized for immersive VR.

Apparatus

The desktop VR version of the simulation was administered on a high-end laptop with a 15-inch screen. A standard touchpad was used by the participants to control input in the PC condition. The participants used the touchpad to both navigate from the different static points of view and to select answers to multiple-choice questions. In general, the touchpad functioned as a way to select which object the participant wanted to interact with through cursor movement and left-clicks.

In the immersive VR condition the simulation was administered using Samsung Galaxy S6 phones, and stereoscopically displayed through a Samsung GearVR head-mounted display (HMD). This condition required the participants to use the touchpad on the right side of the HMD in order to select which objects to interact with. In this condition head movement was used to move the participant’s field of view and the centered dot-cursor around the dynamic 360-degree VE.

Results

The structural validity of the scales used in the study was assessed using confirmatory factor analysis prior to conducting any analyses. The results indicated that two items from the presence scale had non-significant loadings to the latent construct. An acceptable fit was obtained for the model after eliminating these two items (CFI = 0.96, TLI = 0.95, RMSEA = 0.05) indicating that the remaining items measured the intended constructs as hypothesized (items and their standardized loadings are presented in “Appendix”). The scales used in this study also had an acceptable level of reliability, with Cronbach’s alpha values ranging from 0.69 to 0.91 (see Table 1).
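For readers unfamiliar with the reliability coefficient reported here, Cronbach's alpha for a scale with k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below applies that standard formula to hypothetical responses; it is not the study's data, and the function name `cronbach_alpha` is illustrative.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items table of scores.

    rows: one list per respondent, each containing that person's item responses.
    """
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("need at least two items and equal-length rows")
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from five participants to a three-item scale
example_rows = [[4, 5, 4], [3, 3, 4], [5, 5, 5], [2, 3, 2], [4, 4, 5]]
alpha = cronbach_alpha(example_rows)
```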

Table 1 Mean values, standard deviations, reliability coefficients, p values, and effect sizes of the differences between the immersive VR and desktop VR versions of the simulation for each construct used in the study

Is there a difference between using immersive VR as a platform for virtual learning simulations as compared with desktop VR?

Paired-samples t-tests were conducted to investigate if significant differences were present between the two platforms on each of the constructs used in this study. The mean values, standard deviations, reliability coefficients, t-values, p values, and effect sizes of the differences are reported in Table 1. The results showed that significant differences were found between the two platforms on 11 of the 13 constructs. The largest differences (effect sizes over 0.8) were found for the variables of presence (d = 1.67), motivation (d = 1.28), immediacy of control (d = 0.99), and enjoyment (d = 0.94). Therefore, we conclude that the emotional value of the immersive VR version of the learning simulation is significantly greater than that of the desktop VR version. This is a major empirical contribution of this study.
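A paired-samples comparison of this kind can be sketched as below. The scores are hypothetical, and the effect size shown is the mean difference divided by the standard deviation of the differences; this is one common variant of Cohen's d for paired data, and the text does not state which variant was used for Table 1.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_and_d(immersive, desktop):
    """Return the paired-samples t statistic and a paired-data Cohen's d.

    Assumption: d = mean difference / SD of the differences.
    """
    if len(immersive) != len(desktop) or len(immersive) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [a - b for a, b in zip(immersive, desktop)]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)
    t_value = mean_diff / (sd_diff / sqrt(len(diffs)))  # df = n - 1
    d_value = mean_diff / sd_diff
    return t_value, d_value


# Hypothetical scale scores for five participants on both platforms
immersive_scores = [4.2, 3.8, 4.5, 4.0, 3.9]
desktop_scores = [3.5, 3.6, 3.9, 3.2, 3.7]
t_value, d_value = paired_t_and_d(immersive_scores, desktop_scores)
```

With this definition the two quantities are linked by t = d × √n, so a given effect size yields a larger t statistic as the sample grows.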

What is the process by which the level of immersion in a VR simulation impacts outcomes including satisfaction, perceived learning, and intentions to use the simulation?

The relationships between the constructs in the a priori model presented in Fig. 1 were investigated by conducting SEM. The fit of this model was almost acceptable (CFI = 0.90, TLI = 0.90, RMSEA = 0.07). Therefore, adjustments were made to the a priori model iteratively because there were several non-significant loadings. Each of these paths was evaluated and removed step by step, resulting in a simplified model containing significant loadings only. Furthermore, the iterative analyses made it clear that presence did not predict the outcome variables as anticipated by the a priori model, but rather played a mediating role between the simulation platform (immersive/desktop VR) and VR features on the one hand, and motivation and enjoyment on the other, to predict the outcomes. These changes were made and an acceptable fit was obtained for the final model shown in Fig. 2 (CFI = 0.91, TLI = 0.91, RMSEA = 0.06). Figure 2 only shows the significant (i.e., p < 0.01) unstandardized path coefficients according to Mplus. The results indicate that the level of immersion in the science learning simulation (immersive/desktop VR) predicted perceived learning outcomes indirectly as expected in our a priori model; however, not all of the a priori predictions were significant. The results show two distinct paths between the level of immersion in the virtual learning simulation and perceived learning outcomes: these are labeled the affective and the cognitive paths. This is another major empirical contribution of this study.

Fig. 2 Final model with significant unstandardized path coefficients

The affective path

The most direct path in our final model showed that the increased immersion in the VR simulation leads to greater VR features and usability, and a higher sense of presence. This makes the experience more fun and motivating, resulting in higher perceived learning outcomes.

In other words, the simulation platform (immersive VR/desktop VR) was a significant antecedent to presence (beta = −0.52, p < 0.001) and VR features (beta = −0.31, p < 0.001), but was not significantly related to any other construct directly. Given that the variable was coded 0 for immersive and 1 for desktop VR, the negative relationship indicates that the desktop VR version was associated with lower levels of presence and VR features. VR features was a significant antecedent to presence (beta = 0.21, p < 0.001), control and active learning (beta = 0.32, p < 0.001), and usability (beta = 0.11, p < 0.001). Also, usability was a significant antecedent of presence (beta = 0.36, p < 0.001), motivation (beta = 0.28, p < 0.001), enjoyment (beta = 0.20, p < 0.001), and control and active learning (beta = 0.62, p < 0.001). Furthermore, presence played an unexpected role in predicting the outcomes in the study. Although presence did not directly predict the outcomes, it was a strong significant antecedent to both motivation (beta = 0.68, p < 0.001) and enjoyment (beta = 0.63, p < 0.001). Motivation (beta = 0.41, p < 0.001) and enjoyment (beta = 0.15, p < 0.001) were in turn significant antecedents of the perceived learning outcomes in the study. Furthermore, control and active learning was also a significant antecedent of perceived learning outcomes (beta = 0.23, p < 0.001).
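To make the mediating role of presence concrete, the sketch below multiplies the path coefficients reported above along two routes from platform to perceived learning outcomes (the product-of-coefficients approach). This is an illustration only: the indirect effects are not reported in the study, the construct labels in the code are shorthand introduced here, only routes through presence are included, and no significance test (e.g., bootstrapping) is implied.

```python
from math import prod

# Path coefficients as reported in the text (platform: 0 = immersive, 1 = desktop).
# The dictionary keys are shorthand labels chosen for this sketch.
PATHS = {
    ("platform", "presence"): -0.52,
    ("presence", "motivation"): 0.68,
    ("presence", "enjoyment"): 0.63,
    ("motivation", "learning"): 0.41,
    ("enjoyment", "learning"): 0.15,
}


def indirect_effect(route):
    """Multiply the coefficients along consecutive steps of a route."""
    return prod(PATHS[(a, b)] for a, b in zip(route, route[1:]))


via_motivation = indirect_effect(["platform", "presence", "motivation", "learning"])
via_enjoyment = indirect_effect(["platform", "presence", "enjoyment", "learning"])
print(f"via motivation: {via_motivation:.3f}, via enjoyment: {via_enjoyment:.3f}")
```

Because desktop VR is coded 1, a negative product indicates that the desktop version is associated with lower perceived learning outcomes along that route.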

The cognitive path

A secondary path from the increased immersion in the VR simulation to the outcomes in this study went through the constructs of VR features and usability, then through the cognitive variable of cognitive benefits. That is, the variable VR features was also an antecedent to cognitive benefits (beta = 0.51, p < 0.001), and reflective thinking (beta = 0.83, p < 0.001). Usability was also a significant antecedent of cognitive benefits (beta = 0.45, p < 0.001), but was not significantly related to reflective thinking.

Only one of the two cognitive variables predicted the outcomes in this study: cognitive benefits was a significant antecedent of perceived learning outcomes (beta = 0.35, p < 0.001); however, reflective thinking did not predict the outcomes in this study.

Discussion

Empirical contributions

The first main empirical contribution of this study is the finding that students prefer using an immersive rather than a desktop VR version of a virtual learning simulation, with the largest effect sizes observed for presence, motivation, enjoyment, and immediacy of control, as well as the outcome variable of behavioral intentions.

The largest difference between the platforms was with regard to presence. The effect size difference of 1.67 in favor of immersive VR is larger than the results from a recent meta-analysis investigating the impact of immersion (e.g., immersive VR with head tracking compared to a desktop display), which found a medium impact of immersion on presence of r = 0.339 (Cummings and Bailenson 2016). Large effect sizes were also observed with regard to the affective variables of intrinsic motivation and enjoyment. Increasing students’ motivation to learn science has been highlighted as one of the most important potential benefits of using simulations in education (National Research Council 2011). Furthermore, previous research supports the motivational value of using desktop virtual learning simulations in education (e.g., Adams et al. 2008a, b; Makransky et al. 2016a, b; Thisgaard and Makransky 2017; Edelson et al. 1999). Therefore, the finding that the intrinsic motivation to use immersive VR as a learning platform was higher than the desktop VR version is appealing because previous research has found that students who are intrinsically motivated are more likely to set higher learning goals (Archer et al. 1999), engage more in deep approaches to learning (Kyndt et al. 2011), and have higher academic achievement (Hattie 2009). Finally, a large effect size was found for the dependent variable behavioral intention. According to the technology acceptance model (TAM), behavioral intention to use predicts actual use (Venkatesh and Bala 2008). Several studies have supported this notion, and the correlation between behavioral intention to use and actual use ranges from 0.44 to 0.57 (Venkatesh and Bala 2008). This finding suggests that immersive VR technology could lead students to use virtual learning simulations more than a desktop VR version.
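Because the presence effect in this study is expressed as d and the meta-analytic estimate as r, the two are easier to compare on a common metric. The sketch below uses the standard conversion between r and d, which assumes two groups of equal size. It is therefore only a rough guide here, since the present study used a paired design and the meta-analysis pooled different designs.

```python
import math


def r_to_d(r):
    """Convert a correlation r to Cohen's d (equal group sizes assumed)."""
    return 2 * r / math.sqrt(1 - r ** 2)


def d_to_r(d):
    """Convert Cohen's d to a correlation r (equal group sizes assumed)."""
    return d / math.sqrt(d ** 2 + 4)


meta_d = r_to_d(0.339)   # meta-analytic estimate reported in the text
study_r = d_to_r(1.67)   # presence effect reported in this study
print(f"r = 0.339 corresponds to d of about {meta_d:.2f}")
print(f"d = 1.67 corresponds to r of about {study_r:.2f}")
```

On either metric, the presence effect observed here is roughly twice the size of the meta-analytic estimate under these assumptions.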

The second major empirical contribution of this paper was the finding that a structural equation model with two general paths best describes the relationship between the level of immersion in a VR science simulation and perceived learning outcomes.

The first path is the affective path. Presence played a key role in the affective path and was directly affected by the VR platform, VR features, and usability. These results are consistent with previous research which has proposed that presence is influenced by control factors, realism factors, distraction factors, and sensory factors which result from the VR platform (Lee et al. 2010; Witmer and Singer 1998; Makransky et al. 2017). However, unlike the results from a previous study by Lee et al. (2010), who found that presence directly predicted perceived learning outcomes using desktop VR, the findings of this study indicate that presence plays a mediating role in this relationship. Specifically, presence predicts students’ intrinsic motivation and enjoyment, which in turn predict the outcomes of behavioral intention, perceived learning, and satisfaction. Intrinsic motivation was the strongest predictor of the outcomes of the study, which is consistent with previous studies which have found that intrinsic motivation is important for predicting educational outcomes. This suggests that these findings can also be generalized to immersive VR environments (e.g., Archer et al. 1999; Deci et al. 1991; Hattie 2009; Kyndt et al. 2011).

Control and active learning also predicted the perceived learning outcomes in this study. One of the key factors in designing educational technology is for design features to support the learners’ sense of autonomy, or control over their environment (Pekrun and Stephens 2010). In this study immediacy of control was one of the variables in which the largest differences were observed between the immersive VR and desktop VR versions of the simulation. Therefore, the results suggest that immersive VR can increase perceived learning outcomes by affording learners a higher sense of autonomy through better control over the environment.

The second path was the cognitive path. In this path the variable cognitive benefits played an important mediating role between the learning platform and perceived learning outcomes. Most educational theories include cognitive benefits as being central to the learning process (e.g., Mayer 2009; Norman 1993); therefore, this finding is consistent with the expectation that students are willing to exert more cognitive effort when they can see the value of the lesson. Finally, reflective thinking did not predict the perceived learning outcomes in this model, and did not differ significantly between the immersive and desktop VR platforms. This is consistent with learning theories which suggest that more immersion can increase, but also impede, reflective thinking. For instance, the cognitive theory of multimedia learning (Mayer 2009) and cognitive load theory (Sweller et al. 2011) suggest that immersive VEs could foster generative processing by providing a more realistic experience (Slater and Wilbur 1997). However, they also suggest that any material that is not related to the instructional goal should be removed in order to reduce extraneous processing (Moreno and Mayer 2002).

Theoretical contributions

What is the theoretical explanation for the results favoring the immersive VR version of the simulation in this study? CVTAE suggests that when a learning activity (e.g., VR simulation) is positively valued, and the activity is perceived as being sufficiently controllable, enjoyment is instigated. The immersive VR simulation leads to a higher level of enjoyment because the VR features (representational fidelity and immediacy of control) were greater, resulting in a higher sense of presence. Enjoyment is one component in fostering a sense of engagement and flow (Csikszentmihalyi 2000), which can provide a sense of perceived learning and satisfaction. A learning activity which is positively valued can also be attractive, so students would have positive intentions about participating in similar activities.

Furthermore, the cognitive affective model of learning with media (Moreno and Mayer 2007) suggests that immersive VREs could foster generative processing by providing a more realistic experience, which would result in a higher sense of presence (Slater and Wilbur 1997) and a higher level of generative processing. This is supported by the interest theories of learning starting with Dewey (1913), who argued that students learn through practical experience in ecological situations and tasks by actively interacting with the environment. The finding that increased immersion can lead to positive educational outcomes is particularly relevant for immersive VR because the sense of presence experienced by the student can have a powerful emotional impact (Milk 2015).

Practical contributions

The findings of this study suggest that immersive VR has significant potential for use in simulations and other e-learning applications because immersive VR is superior to desktop VR in arousing, engaging, and motivating students. Furthermore, the relationships found in this study provide a better understanding of how technology features and students’ interactive experiences can influence important affective and cognitive factors, as well as how these relationships predict important educational outcomes.

The results can guide instructional design decisions when developing VR learning environments. The results indicate that VR features, specifically immediacy of control, are an important factor to take into account in designing VEs. The immersive VR version of the simulation was perceived to have a significantly higher level of immediacy of control as compared with the desktop VR version. This was probably because the immersive VR simulation is controlled through head-motion tracking, so when users move their heads to look around they move their field of view inside of the virtual 360-degree environment correspondingly (Moreno and Mayer 2002). This seems to be important for making learners feel that they have a greater sense of control and autonomy in the learning process. On the other hand, the control mechanism in the immersive VR version of the simulation was still quite primitive. The technology used was the Samsung Gear VR, which requires the learner to use a touchpad on the right side of the HMD in order to select which objects to interact with in the lab. The simulation in this study was designed to create a setting wherein students could perform an experiment in which they could manipulate different items in a lab using two hands which are controlled by the touchpad. However, several participants commented that the touchpad was not intuitive. More advanced VR technology would thus likely afford more natural control systems and even higher levels of immediacy of control.

Usability also played an important role in the model. Usability was a significant antecedent of five of the six affective and cognitive variables included in the study. The simulations used in this study functioned with few technical difficulties. This is reflected by the high level of perceived ease of use among students. Therefore, the high level of usefulness and perceived ease of use suggests that technological factors did not distract students during the learning process.

The results of the study point to two general means of impacting learners’ satisfaction, perceived learning outcomes, and intention to use VR learning environments. The first is to design VR environments that are enjoyable and motivating by creating a high level of usability and good VR features, which give students a sense of presence. The second is to ensure that students have a high level of autonomy through a sense of control and active learning, and to make sure that students see the cognitive benefits of the VR lesson.

Limitations and future directions

One limitation of this study was that most of the participants had never tried immersive VR before; therefore, the positive results favoring the immersive VR platform might be partly due to the novelty of the technology. Interest theories such as the four phase theory of interest development (Renninger and Hidi 2016) posit that learning activities can spark situational interest, but that this does not necessarily develop into well-developed individual interest. The potential that immersive VR has for sparking situational interest could fade as the technology becomes more widely used. It is therefore important to develop an understanding of how to design instructional content that leads to positive emotional as well as cognitive outcomes, rather than relying on the technology. Because research on the use of VR technology in education is in its infancy, the list of future research topics is substantial. One promising future direction is to replicate the results in this study among students who are more familiar with immersive VR.

Another limitation in this study was that the outcome variables only included self-report measures rather than objective measures of learning such as retention or transfer, or objective measures of affect. Future research should include objective measures of cognitive and emotional constructs including tests of retention and transfer, as well as physiological measures of affect such as analyses of facial expressions or galvanic skin response (e.g., Picard et al. 2004; Kai et al. 2015). A further limitation in this study is that the sample size was quite small relative to the number of factors that were used in the structural equation model. Future research should investigate whether the results generalize to larger samples and to other VR content. Future research should also investigate the long-term potential benefits and consequences of using immersive VR in education. In addition, it is important to investigate how individual characteristics such as prior knowledge, age, gender, and culture influence the process of learning in VR environments. The potential negative side effects of using immersive VR (e.g., motion sickness and nausea) should also be considered. Finally, future studies should investigate the potential use of immersive VR for organizational training (e.g., competency development and continuing education) and in other fields.