1 Introduction

Virtual reality (VR) systems have come a long way since the 1960s, when VR technologies were first proposed as systems that can present data and information to all of the user’s senses at a resolution equal to or greater than that achievable naturally in the real world (Richardson 2017; Wolfartsberger 2019). VR provides an indirect experience through a virtual space that interacts with the human sensory systems and overcomes the spatial and physical constraints of the real world (Richardson 2017). Rheingold (1991) defined VR as an experience in which users are “surrounded by a three-dimensional computer-generated representation, and are able to move around in the virtual world and see it from different angles, to reach into it, grab it, and reshape it”.

VR’s goal is to fully immerse a user in a virtual environment by simulating the same kinds of physical and psychological reactions they would experience in the real world and by providing the feeling of presence, which is the illusion of being inside the virtual world (Slater 2009; Slater et al. 2010). Fidelity is a common and useful concept for distinguishing different VR systems. The ultimate goal for VR is to provide a high-fidelity experience similar to the real world. Harris et al. (2020) proposed that the key elements of psychological, affective, and ergonomic fidelity are the real determinants of VR system fidelity. Psychological fidelity is assessed by measuring and comparing mental effort, gaze behavior, neural activity, etc., between real and virtual tasks. Affective fidelity is assessed through the self-reported experiences of users or online monitoring of psychophysiological indices of affect. Ergonomic fidelity, in turn, assesses the realism of interaction and tracking of the VR system.

Currently, although much progress has been made in developing VR technologies that can be easily adopted in home environments using Head-Mounted Display (HMD) devices, the technology is still being improved to make computer-generated worlds as believable as reality. Because VR technologies are still under development, the measurement of VR system fidelity is not yet comprehensive enough to address all human-sensory systems. VR applications have different purposes and end-objectives, and VR simulations do not need to engage all human-sensory systems to achieve the objectives for which they were created. As a result, the digital sensory systems used in the virtual environment (VE) are considered the main aspect that affects the overall fidelity of the VR experience, specifically the visual system fidelity. This led us to a question: Do current measurements of VR applications’ fidelity effectively and accurately measure fidelity? Hence, future studies are needed to capture objective and subjective measures of VR systems’ fidelity (Ragan et al. 2015).

1.1 Immersive modeling infrastructure

In VR technologies, the devices and programs used in the creation of virtual space play an integral role in the fidelity and immersion of the virtual experience. VR immersive systems provide a complete simulated experience due to the support of several human-sensory output devices, such as VR 3D shutter glasses and HMDs that enrich the stereoscopic view of the virtual environment during navigation.

Immersion is one of the central features of VR technologies; it refers to the number of human senses involved in the VR experience, and it can be evaluated through the degree of interaction involved, as well as the degree of realism of the objects used to create the VR environment (Cipresso et al. 2018). Immersion is directly related to the physical configuration of the VR system through stereoscopic vision and spatial sound so that users can perceive the virtual environment as in real life. Good design of virtual environments creates a deep sense of immersion and strongly affects the overall VR fidelity; “the more immersion, the more identity, the more believability, the more change to the user” (Grimshaw 2014).

Previously, VR display systems could be classified with respect to immersion as the following: (a) non-immersive systems that use desktops to reproduce images of the space, (b) semi-immersive systems that provide stereoscopic 3D scenes on a monitor using projection, and (c) immersive systems that provide a complete simulation of the space with the support of sensory outputs (Kyriakou et al. 2017). However, such classification of VR systems has numerous limitations. For instance, this classification criterion is limited to visual display specifications. VR is a system composed of various hardware and software. Although the characteristics of the visual display have a dominant influence on the subjective sense of immersion, the immersive feeling can be improved or reduced by factors other than the visual display. Also, there is no objective criterion for the semi-immersive level. For this reason, the same system can be classified differently. In addition, the immersion level may differ even for the same classification condition. For example, typical desktop VRs are classified as non-immersive VRs. However, there is no clear basis for whether desktop VRs with added tracking technology should be classified as non-immersive.

Technologically, VR systems include input and output devices; input devices include bend-sensing gloves or haptics. Haptics are dedicated to communication between VR users and virtual space as well as tracking and detecting users’ movement in the virtual space. Moreover, output devices, such as VR glasses and high-end multiscreen systems, are dedicated to simulating human body senses, like vision, touch, hearing, and smell, and provide realism to the virtual experience (Cipresso et al. 2018).

Humans experience virtuality in many domains and thus have a conception of the virtual as being a part of reality (Grimshaw 2014). Presence is a concept that explains how we perceive our relationship(s) to virtuality and reality (Grimshaw 2014). It is a mental state in which users recall VR experiences as if they had actually occurred. It provides a sense of embodied cognition, which comes from interactions and expressions simulated by a user’s avatar and a visible and tangible effect within the virtual environment. Moreover, the use of an avatar, which is a digital representation of the user inside the VR experience, can affect and alter human behavior outside virtuality (Grimshaw 2014). According to Slater and Wilbur (1997), “Presence is a state of consciousness, the (psychological) sense of being in the virtual environment” (p.603). Therefore, the degree of interactivity, fidelity of human-sensory systems, and other psychological stimuli are important to create a sense of presence in an artificial environment.

Indeed, two main factors affect the degree of presence (spatial and self-presences) in VR experiences; the first factor is vividness, which refers to the technology’s ability to produce a rich human-sensory interaction through the integration of visual, auditory, and haptics elements. The second factor is interactivity, which refers to the degree to which the VR environment can respond to user input (responsiveness) and the ability of the user to respond to the VR environment (multimodality). Higher degrees of vividness and interactivity provide higher levels of presence and enrich the VR experience (Yim et al. 2017).

2 Related work

Technological advances in VR systems have expanded VR applications in different fields and led to VR becoming a dominant tool for training and experiments across several fields (Cabrera and Wachs 2017). A VR application aims to simulate specific aspects of a task to create alternatives and solutions virtually without using the physical aspects and real conditions of the task (Harris et al. 2020). Yet, the evaluation of VR systems' fidelity is still ambiguous and debatable. Based on the literature review, three main methods were found for evaluating the fidelity of VR systems: practical, evidence-based, and controlled methods.

Numerous studies have been conducted to evaluate different VR systems through practical evaluations. These evaluations consist of a demonstration of VR content using different VR systems to evaluate the level of fidelity and determine the effects of high fidelity on task performance. For instance, Pala et al. (2021) compared the fidelity of Cave Automatic Virtual Environment (CAVE) and HMD systems using interactive pedestrian simulators that investigate street-crossing behavior to improve pedestrian safety. They found that the HMD produces a stronger feeling of presence than the CAVE system.

Similarly, Elor et al. (2020) compared CAVE and HMD immersive VR exergaming for adults with mixed abilities. In their study, users were recorded playing a VR game with both systems: electroencephalography sensors (EEG), galvanic skin response (GSR), and heart rate (HR) were collected at runtime, as well as post gameplay surveys. The study results show that across all abilities, the HMD excelled in in-game performance, biofeedback response, and player engagement compared to the CAVE.

In summary, the practical evaluations demonstrated significant benefits of using high-fidelity systems compared to low-fidelity systems in most cases, but the results of these evaluations cannot be generalized, as each VR application has its own objectives and therefore its own set of hardware and software. Moreover, practical evaluations involve a large number of complex applications and interdependent tasks, which makes them unsuitable for generalizing fidelity evaluation results across different VR systems.

Other researchers have used evidence-based methods, developing frameworks and criteria to evaluate the fidelity and validity of VR systems. Harris et al. (2020) developed a framework for testing and validating a VR system in training perceptual-motor skills, such as for applications to sport, surgery, rehabilitation, and the military. They found that successful implementation of training and psychological experimentation using VR requires, first, establishing whether the simulation captures fundamental features of the real task and environment and elicits realistic behaviors, and second, applying evidence-based methods for establishing fidelity and validity during simulation design. The framework outlined a categorization of fidelity and validity subtypes. It suggests face validity, construct validity, physical fidelity, psychological fidelity, affective fidelity, and ergonomics fidelity as the six aspects of VR system fidelity and illustrates in general terms how to test each aspect. However, the framework focuses on VR applications in training, without taking other domains into consideration. Also, the framework is limited to the general classification of the fidelity and validity aspects, without demonstrating the sub-elements of these aspects and how they can be evaluated.

Furthermore, controlled evaluation methods were used to overcome the limitations of practical evaluations. Controlled evaluations of fidelity usually involve the direct comparison of similar systems or setups while controlling one (univariate) or more (multivariate) aspects of fidelity, to determine their impact on the overall fidelity of the system. For instance, Pastel et al. (2021) used the univariate method to analyze gaze accuracy and gaze precision using eye-tracking devices in reality and VR. Moreover, Trepkowski et al. (2019) used the multivariate method to address the interrelationships between the field of view, information density, and search performance.
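The distinction between univariate and multivariate controlled evaluations can be illustrated by how the experimental conditions are generated. The following Python sketch is illustrative only: the factor names, levels, and baseline configuration are hypothetical and are not taken from the cited studies.

```python
from itertools import product

# Hypothetical fidelity factors and levels, chosen only for illustration.
FACTORS = {
    "field_of_view_deg": [60, 110],
    "frame_rate_hz": [30, 90],
    "display_resolution": ["low", "high"],
}

# Hypothetical baseline: the level each factor is held at when it is not varied.
BASELINE = {
    "field_of_view_deg": 110,
    "frame_rate_hz": 90,
    "display_resolution": "high",
}


def univariate_conditions(factors, varied, baseline):
    """Vary a single factor; hold every other factor at its baseline level."""
    return [{**baseline, varied: level} for level in factors[varied]]


def multivariate_conditions(factors, varied, baseline):
    """Cross all levels of several factors (a full factorial design)."""
    names = list(varied)
    combos = product(*(factors[name] for name in names))
    return [{**baseline, **dict(zip(names, combo))} for combo in combos]


if __name__ == "__main__":
    for condition in univariate_conditions(FACTORS, "field_of_view_deg", BASELINE):
        print("univariate:", condition)
    for condition in multivariate_conditions(
        FACTORS, ["field_of_view_deg", "frame_rate_hz"], BASELINE
    ):
        print("multivariate:", condition)
```

In the univariate case only one factor differs between conditions, so any difference in the outcome can be attributed to that factor; the factorial design additionally exposes interactions between factors at the cost of more conditions.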

Despite having better generality and fewer variables than practical evaluations of fidelity, prior controlled evaluations have also had limitations; for example, advances in technology have made some of these studies and evaluations outdated. Moreover, compared with the pace of development of VR technology, fidelity evaluation through user experiences needs further study, especially with respect to the technical specifications of VR devices and the research method. Users’ evaluation of fidelity can also differ according to the VR content (Kim et al. 2020). For instance, Hontvedt and Øvergård (2020) developed a framework for configuring simulation fidelity with training objectives. The framework considered fidelity in the design of a ship simulator for training purposes. It conceptualized fidelity requirements in simulator training through three types of fidelity (technical, psychological, and interactional) and linked them to different levels of training and targeted learning outcomes. The study demonstrated how the fidelity of the simulation relates to the main objectives of the simulation. Likewise, McMahan and Herrera (2016) suggested three aspects of system fidelity (interaction, scenario, and display fidelities) for analyzing and designing VR techniques in learning and training. They described how the system fidelity can be altered by manipulating these aspects. However, Srivastava et al. (2019) found that the differential effect of visual versus interactional fidelity on human performance depends on the nature of the cognitive and functional behavior that users employ and on the usability of VR systems.

It is commonly assumed that more advanced, high-fidelity VR technologies are associated with better performance (Zizza et al. 2018; Liu et al. 2019a, b; Franzluebbers and Johnsen 2018). Nevertheless, whether a high-fidelity system leads to a higher level of performance depends on the expertise level of the user. Frithioff et al. (2020) found that ultra-high-fidelity graphics reduced the performance level of surgeons compared with conventional graphics, whereas the high-fidelity graphics increased the cognitive level of the training. They suggested that high-fidelity training might be better suited to intermediate or experienced surgeons.

Also, several empirical studies investigated how high-fidelity VR technologies can affect virtual experiences and overall performance. Their findings imply that increasing the fidelity of one or multiple aspects of the VR system can be beneficial to performance (Zizza et al. 2018; Liu et al. 2019a, b; Franzluebbers and Johnsen 2018; Frithioff et al. 2020). This implies that the overall fidelity of the virtual environment may not be as important a factor for overall performance. Another example of a fidelity evaluation framework is “the framework for evaluating based on a simulation’s display, interaction, and scenario components” presented by Ragan et al. (2015). The experimental evaluation of fidelity was designed to test the effects of fidelity on training effectiveness for a visual scanning task. It was found that the wider the field of view, the better the performance of the VR system, while high visual realism worsened performance. However, they suggested that future evaluation criteria are needed in order to cover more realistic settings.

3 Methods and guidelines

In this paper, the building blocks and set of factors required to evaluate the fidelity of a VR system are presented. Additionally, the objective and subjective measures are introduced considering the core modules required for producing a virtual experience suggested by Spanlang et al. (2014). These core modules include: (1) the display module, which consists of the hardware devices used to display the virtual experience, (2) the motion tracking module, which maps the user’s real-world movements into the virtual world, (3) the VR content, which consists of the 3D presentation of the simulation content, and (4) the integration module, which handles the creation, management, and rendering of all virtual entities and integrates all other modules.

The framework and the fidelity scale in this paper were developed based on the existing literature on the evaluation of fidelity of VR technologies and aim to provide an objective evaluation based on human biological abilities and advances in VR technologies. Accordingly, the factors and elements affecting the overall fidelity of a VR system were classified into four interrelated aspects: (1) digital sensory system fidelity, (2) tracking systems fidelity, (3) simulation system fidelity, and (4) integration among these aspects to produce high-fidelity virtual experiences.

There are numerous definitions of presence and immersion in the literature on fidelity evaluation. The interrelations between presence and immersion are not well understood, as they depend on the VR simulation’s objectives. In the suggested framework, immersion and presence were measured by breaking them down into subcomponents. The principal aim in designing high-fidelity VR systems is to immerse users in the virtual world to such an extent that they accept it as ‘real’. The fidelity of the VR system is a measure of the degree to which a simulation system represents a real-world system (Pan et al. 2006; Meyer et al. 2012). Fidelity of VR systems is considered through two constructs known as ‘presence’ and ‘immersion’ (Slater and Usoh 1994). Therefore, the primary purpose of VR interaction is not limited to creating a VE that is easy to use; it is about making users feel that they are inside the virtual world.

The definition of immersion has been the subject of considerable debate among researchers. Notably, Slater (Slater et al. 1994; Slater 2018) defined the term ‘immersion’ as the objective level of sensory fidelity a VR system provides. Accordingly, immersion can be derived from the hardware and software constraints, such as the field of view, stereoscopic imaging quality, effective display size, display resolution, frame rate, and refresh rate (Slater et al. 1994; Slater 2018). Immersion is thus a technological aspect of the VE. The fidelity evaluation framework in this paper proposes scales for measuring the fidelity of visual, auditory, and haptic systems. In these scales, each factor affecting fidelity was quantified and classified from low to high on a separate scale, along with a description of the factor’s features at each fidelity classification. These scales can be used to evaluate the immersion concept.
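The idea of quantifying each factor and classifying it on its own low-to-high scale can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the factor names, the three-level scale, and all cut points below are hypothetical placeholders and are not the values of the scales proposed in this paper.

```python
from bisect import bisect_right

LEVELS = ["low", "medium", "high"]

# Hypothetical cut points per factor (NOT the paper's values).
# A value below the first cut point is "low"; a value at or above the
# second cut point is "high"; anything in between is "medium".
SCALES = {
    "field_of_view_deg": [60, 110],
    "frame_rate_hz": [30, 60],
    "refresh_rate_hz": [60, 90],
}


def classify(factor, value):
    """Map one measured factor value onto its own low/medium/high scale."""
    return LEVELS[bisect_right(SCALES[factor], value)]


def fidelity_profile(measurements):
    """Classify every measured factor separately; no overall score is computed."""
    return {factor: classify(factor, value) for factor, value in measurements.items()}


if __name__ == "__main__":
    hypothetical_hmd = {"field_of_view_deg": 100, "frame_rate_hz": 90, "refresh_rate_hz": 90}
    print(fidelity_profile(hypothetical_hmd))
```

Keeping one scale per factor, rather than collapsing everything into a single number, mirrors the description above, in which each factor is classified separately.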

3.1 Virtual reality fidelity framework objectives

Considering the recent advances in VR systems, the proposed fidelity evaluation framework in Fig. 1 was designed to answer the following questions:

  • What are the significant aspects of the digital sensory system’s fidelity to the user experience?

  • How can fidelity be evaluated objectively and subjectively?

Fig. 1 Virtual reality system fidelity evaluation framework

Fig. 2 Visual system fidelity: the main factors affecting visual system fidelity, along with different types of virtual reality systems, namely high-end multiscreen, head-mounted display, and mobile VR

It was hypothesized that increasing fidelity in the digital sensory and tracking systems, with accurate integration between the factors, would result in the best user performances and high-fidelity experiences (Cabrera and Wachs 2017; Cooper et al. 2018; Zizza et al. 2018; Liu et al. 2019a, b; Franzluebbers and Johnsen 2018; Slater et al. 2010). The olfactory and taste factors of the digital sensory system were excluded, as only limited research was found discussing the factors affecting the fidelity of these two sensory systems.

The proposed framework in Fig. 1 classifies the factors affecting the fidelity of VR systems into four main fidelity components: the digital sensory system, the tracking system, the simulation system, and the integration and synchronization of the VR system data and devices. It also defines high-level common and useful concepts for distinguishing different VR systems with respect to fidelity.

The following sections describe the core factors of the framework. For each module, we first define its role within the VR system along with its technical aspects. It is outside the scope of this paper to go into full detail for all technologies that could be used in the digital sensory and tracking systems. We, therefore, point to a review of subsystems where possible and put an emphasis on the technology that is used at Eastern Michigan University (EMU).

3.2 Fidelity evaluation building blocks

In this section, the building blocks and set of factors required to evaluate the fidelity of a VR system are presented. High-fidelity VR systems should be able to present simulations with high-quality graphics and accurately track the users’ real bodies, thereby providing a match between the human-sensory systems and VR systems. The building blocks of evaluating and designing a VR system are as follows.

3.2.1 Digital sensory system fidelity

The digital sensory system fidelity consists of the sensory stimuli necessary to simulate a specific event or task virtually. For instance, a field of flowers can be simulated using specific sensory stimuli, such as visual, olfactory, and auditory stimuli, to evoke the impression of being in the field of flowers. Accordingly, the choice of digital sensory systems in the VR experience is highly dependent on the VR simulation objectives and applications.


Visual sensory system The fundamental aspect of VR systems is the visual system (Spanlang et al. 2014). Consequently, the framework suggests that the elements of visual system fidelity are the quality of 3D stereoscopic graphics, field of view (FOV), field of regard (FOR), frame rate, effective display size, and display resolution pixel. Figure 2 summarizes the factors affecting the fidelity of the visual system.

These visual system elements have significant effects on aspects of the overall user experience fidelity (Ragan et al. 2015; Menzies et al. 2016). As a result, visual system fidelity induces and highly influences the feeling of presence (Cummings and Bailenson 2016). In terms of visual system fidelity, prior research has shown that high-fidelity VR display systems (e.g., 3D graphics and audio quality) facilitate immersion and presence (Bowman et al. 2012; Cummings and Bailenson 2016; Nabiyouni et al. 2015; Witmer and Singer 1998).

A stereoscopic 3D view is a unique feature that differentiates VR systems from the majority of other visualization systems. In VR systems, stereoscopic 3D is implemented through the rendering of left and right images using a graphics card; this rendering technique is broadly known as quad-buffering (Norman 2010). Quality of 3D stereoscopic graphics is one of the sub-elements of the visual system fidelity. It refers to the degree of realism of the visual graphics displayed by the system with respect to the real world in terms of both accuracy and complexity. A low degree of visual realism results from using low-quality 3D models in terms of geometry and textures, and it may also result from aliasing in the graphics. A high degree of visual realism results from graphics that mirror the real world. Volonte et al. (2019) found that photorealistic rendering affects users’ perception, with cartoon characters considered highly appealing compared with characters of human-like appearance. Menzies et al. (2016) found that stereoscopy positively affects user performance on tasks while also providing a greater sense of presence.
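To illustrate where the left and right images come from, the following Python sketch computes the two eye (camera) positions from which a stereoscopic pair would be rendered. It is a simplified illustration and not a description of how quad-buffering is implemented by any graphics API. The function name, the right-handed coordinate convention (forward along negative z, up along positive y), and the default interpupillary distance of 0.063 m are assumptions made for this example.

```python
import math


def _normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def eye_positions(head, forward, up, ipd_m=0.063):
    """Return (left_eye, right_eye) positions for a stereo image pair.

    Each eye is displaced by half the interpupillary distance (ipd_m,
    an assumed value) along the viewer's right-pointing axis.
    """
    right = _normalize(_cross(forward, up))
    half = ipd_m / 2.0
    left_eye = tuple(h - half * r for h, r in zip(head, right))
    right_eye = tuple(h + half * r for h, r in zip(head, right))
    return left_eye, right_eye


if __name__ == "__main__":
    left, right = eye_positions(
        head=(0.0, 1.7, 0.0), forward=(0.0, 0.0, -1.0), up=(0.0, 1.0, 0.0)
    )
    print("left eye:", left)
    print("right eye:", right)
```

The scene is then rendered once from each position, and the display system presents each image to the corresponding eye.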

The display system is an essential element of a VR system, and it has a major effect on the visual system fidelity of the VR experience. The display system allows the user to view the VR experience and interact with it. Thus, the specifications and quality of the display system, such as the display resolution pixels and the effective display size, are important to obtain high-quality stereoscopic vision. Effective display size is the actual physical dimensions of the display system. Ni et al. (2006) found that increasing effective display size and resolution reliably improves task performance on large displays. Also, users performed navigation, search, and comparison tasks better in information-rich virtual environments when using large high-resolution displays.

A suitable method for obtaining a large display area and high image resolution is to acquire a CAVE system, which is equipped with high brightness projectors that project images on multiple large surfaces that can be viewed by multiple users at the same time in a collaborative interactive approach. Display resolution pixel is the degree of exactness with which real-world graphics stimuli are reproduced by a display system. Dmitrenko et al. (2017) found that display resolution affects the presence and the overall effectiveness of VR devices.

Field of view (FOV) FOV is considered another significant aspect of immersive system modeling. It refers to the angular size of the area of the scene that a user can see directly in the VR experience. The human visual field spans a forward-facing horizontal arc of slightly over 210° and a vertical range of 150° (Norman 2010). Figure 3 illustrates the limits and definition of the human field of view. A wider FOV allows the user to see more of the scene at once and to use peripheral vision, while a narrower FOV allows the user to focus on the region of interest in the VR experience. Common VR systems have a wide range of FOVs, from less than 30° (e.g., in some consumer-level, head-mounted displays) to 180° or more (e.g., in surround-screen displays like CAVE). The wider the field of view, the more present the user is likely to feel in the experience. Dmitrenko et al. (2017) reported that increasing FOV improves user performance for tasks that require high navigation accuracy and that it has a positive effect on the effectiveness of VR devices.
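As a worked illustration of FOV as an angular size, the Python sketch below applies the standard geometric relation FOV = 2·arctan(w / 2d) for a flat display of width w viewed head-on from distance d, and compares the result with the human visual field limits quoted above. The function names are hypothetical; the relation assumes a flat screen centered on the line of sight and does not directly describe HMDs, whose lenses change the apparent angular size. The pixels-per-degree figure is only an average across the FOV.

```python
import math

# Human visual field limits as quoted in the text (Norman 2010);
# the horizontal value is stated there as "slightly over" 210 degrees.
HUMAN_HORIZONTAL_FOV_DEG = 210.0
HUMAN_VERTICAL_FOV_DEG = 150.0


def angular_fov_deg(extent_m, distance_m):
    """Angle subtended by a flat display of the given extent at the given distance."""
    return math.degrees(2.0 * math.atan(extent_m / (2.0 * distance_m)))


def coverage_of_human_fov(fov_deg, human_fov_deg):
    """Fraction of the human visual field covered by the display, capped at 1."""
    return min(fov_deg / human_fov_deg, 1.0)


def pixels_per_degree(pixel_count, fov_deg):
    """Average number of pixels per degree along one display axis."""
    return pixel_count / fov_deg


if __name__ == "__main__":
    # Hypothetical projection wall: 3 m wide, viewed from 1.5 m.
    fov = angular_fov_deg(3.0, 1.5)
    print(f"horizontal FOV: {fov:.1f} deg")
    print(f"coverage: {coverage_of_human_fov(fov, HUMAN_HORIZONTAL_FOV_DEG):.2f}")
    print(f"pixels per degree at 1920 px: {pixels_per_degree(1920, fov):.1f}")
```

The relation also shows why FOV and display resolution interact: for a fixed pixel count, widening the FOV spreads the same pixels over more degrees.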

Fig. 3 The field of view description

Field of regard (FOR) FOR refers to the total area that can be captured by a movable sensor. The FOR is important in virtual worlds, as well as in the real world. For example, when using the CAVE system, the user needs to look at the ceiling and floor screens, and sometimes at a back screen. When using a head-mounted display (HMD) in a virtual world, the user needs to look around. Thus, the human FOR is important to consider along with the FOV in visual displays including wide HMDs, large displays, and multiple screens (Jang et al. 2016).

Finally, the frame rate refers to the degree of exactness with which real-world graphics frames are reproduced by a display system. Frame rate has been shown to affect user performance as increasing frame rates appears to increase users’ sense of presence, and it affects the overall effectiveness of VR devices (Sargunam et al. 2017).
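In common usage, frame rate is expressed as the number of frames displayed per second, which fixes the time available to render each frame. The short Python sketch below illustrates this relation; the function names and the sample render times are hypothetical and serve only as an example.

```python
def frame_budget_ms(frame_rate_hz):
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / frame_rate_hz


def missed_frames(render_times_ms, frame_rate_hz):
    """Count frames whose render time exceeded the per-frame budget."""
    budget = frame_budget_ms(frame_rate_hz)
    return sum(1 for t in render_times_ms if t > budget)


if __name__ == "__main__":
    # Hypothetical render times (ms) for four consecutive frames.
    samples = [10.0, 12.0, 9.0, 15.0]
    for rate in (60, 90):
        print(
            f"{rate} Hz: budget {frame_budget_ms(rate):.2f} ms, "
            f"missed {missed_frames(samples, rate)} of {len(samples)}"
        )
```

A higher target frame rate shortens the per-frame budget, so the same content may meet a 60 Hz target and still miss frames at 90 Hz.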

Auditory sensory system Developers of VR systems tend to focus mainly on the visual sensory system as it is considered to be the fundamental aspect of VR systems (Spanlang et al. 2014). However, the auditory system is powerful and the technology exists to bring high-fidelity audio experiences into VR (Cooper et al. 2018).

The fidelity of the auditory system consists of three main factors. (1) Quality of auditory stimuli refers to the degree to which the VR system’s auditory stimuli correspond with the cues of the visual and other interaction systems. Several contributions suggest the potential benefits of integrating multiple sensory systems in virtual experiences (Cooper et al. 2018; Bowman et al. 2012; Dmitrenko et al. 2017).

Cooper et al. (2018) found significant main effects of audio and tactile cues on task performance and on participants’ subjective ratings. Dmitrenko et al. (2017) found that sound waves are readily turned on and off to correspond with the values of digitization. (2) Realism of the surrounding audio refers to the degree to which the VR system’s auditory stimuli are a presentable reproduction of real-world sounds. Bowman et al. (2012) found that high-fidelity realism of the surrounding audio improves immersion and presence in virtual experiences. (3) Audio resolution refers to the degree of exactness with which real-world sound stimuli are reproduced by an auditory system. Bowman et al. (2012) found that high audio resolution can improve immersion and presence.

Haptics sensory system VR systems integrate haptics and modern sensor technology. More advanced systems like CAVE use multiple stereoscopic displays, force-feedback devices (haptic interfaces), and modern sensing techniques to capture accurate data for further analysis related to planning, ergonomics, virtual prototyping, etc. These systems should be designed for broader-scale adoption using high-quality input and output devices.

In particular, haptic devices track the movement of users’ fingers at the joints within the VR experience and allow the streaming of data from gloves over the network. Haptics are useful as an interactive and training tool. Furthermore, VR applications that combine motion sensor technology and high-fidelity graphics can motivate users to engage in the VR experience. A drawback of motion sensors is that gloves still lack consistent, natural finger movement and force feedback that produce physical reactions similar to those in the real world. Haptics is an emerging technology that needs further advancement in reproducing the desired force feedback. Also, wearing sensors can cause inconvenience and discomfort for some users. Figure 4 illustrates the definition of haptics as well as the factors affecting the fidelity of haptic systems.

Fig. 4 Haptics system fidelity

The fidelity of the haptics system consists of four main factors. (1) Haptic device design refers to the number of degrees of freedom (DOF) a user has when using a haptic device, the accuracy with which the operating software interprets haptic movement, and the refresh rate. Liu et al. (2019a, b) demonstrated that their proposed glove-based design yields a higher success rate in various tasks in VR. Also, Spanlang et al. (2014) found that some haptic devices prevent participants from moving freely during the experiment, thus limiting their range of motion. (2) Haptic movement capability refers to the accuracy of the design and physical features through which users can perform the required task accurately and effectively. For instance, such features can be the haptic design, shape, material specifications, or user interface. Spanlang et al. (2014) found that the haptic module should be designed to be simple, flexible, and compatible with different hardware and software parts, so that parts can be unplugged or replaced with new functionality when needed. (3) Haptic navigation accuracy refers to the degree of accuracy with which an input/output device estimates a position and an orientation in the VR simulation. Hence, the choice of a full-body motion tracking system will greatly depend on the VR simulation content and the system setup (Rogers et al. 2019). (4) Haptic force feedback refers to the degree of accuracy with which an input/output device simulates the feedback of the user’s applied force during interaction with objects in the VR simulation. Spanlang et al. (2014) found that the delivery of haptic stimuli is complicated by the variety and specialization of touch receptor cells in our skin for the different types of haptic stimuli, such as pressure, vibration, force, and temperature.

For the other digital sensory systems, such as olfactory and taste, several contributions have summarized challenges revolving around scent delivery, detection, and dispersal with respect to digital olfactory systems. Moreover, smell and taste are notoriously difficult to generate and control with respect to the user’s movement (Dmitrenko et al. 2017; Kerruish 2019). Because the number of detectors involved in human smell is in the thousands, it is difficult to code odors as a mixture of a small number of “primary odors” (Rouby et al. 2002). Accordingly, these two senses were not discussed thoroughly in the framework. Furthermore, as explained by Hoedt et al. (2017), one of the challenges that can occur in haptics integration is a lack of consistency in the development and application of virtual assembly systems in terms of haptic feedback. Also, in 3D virtual interactions, there are limited possibilities to present haptic feedback (Nabiyouni et al. 2015).

3.2.2 Tracking system fidelity

Motion tracking systems can be classified by their underlying technique, such as magnetic, acoustic, optical, and inertial tracking. This research focused on optical tracking systems, as they are used extensively in engineering applications. Each class has advantages and disadvantages, but describing them is beyond the scope of this article. Regardless of the technology used, tracking systems can also be classified according to their performance parameters, such as accuracy and latency. Figure 5 shows an example of an optical tracking mechanism along with a hardware sample.

Fig. 5 The fidelity of the interaction system: an example of a motion tracking system using optical tracking technology at the virtual reality laboratory at Eastern Michigan University

Different physical factors have a direct effect on the performance of tracking systems, which in turn affects the fidelity of the overall virtual experience (van der Kruk and Reijne 2018). For instance, poor accuracy can affect the synchronization between a movement and its representation. In addition, poor performance can corrupt the outcome measures of the VR simulation. Synchronization between VR users and VR systems is one of the main determinants of effective audio-visual integration (Harrison et al. 2010) and an important determinant of simulator fidelity (Grant and Lee 2007). According to the suggested model, tracking system fidelity depends on two sub-elements: (1) Motion tracking system accuracy, which refers to how accurately the VR system tracks the user’s position and orientation. van der Kruk and Reijne (2018) found that the accuracy of each tracking system depends on system specifications such as the weight and size of the sensors, the maximum capture volume per camera, and the sampling frequency. (2) Synchronization of tracking data, which refers to how exactly the user’s movements for a task in the real world are reproduced in the virtual world.
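The two sub-elements above can be illustrated with a minimal sketch. It assumes that accuracy is summarized as the root-mean-square positional error against a reference measurement and that synchronization is summarized as the mean delay between a real movement and its virtual reproduction. Both summaries, and the function names `positional_rmse` and `mean_lag_ms`, are illustrative assumptions rather than measures prescribed by the framework.

```python
import math

def positional_rmse(tracked, reference):
    """Root-mean-square Euclidean error between tracked and reference 3D positions.

    Both arguments are equal-length sequences of (x, y, z) tuples in the same unit.
    """
    if len(tracked) != len(reference) or not tracked:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [
        sum((t - r) ** 2 for t, r in zip(p, q))
        for p, q in zip(tracked, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def mean_lag_ms(real_timestamps_ms, virtual_timestamps_ms):
    """Mean delay between each real movement event and its virtual counterpart."""
    lags = [v - r for r, v in zip(real_timestamps_ms, virtual_timestamps_ms)]
    return sum(lags) / len(lags)

# Hypothetical samples (positions in millimetres, timestamps in milliseconds).
tracked = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
reference = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
print(positional_rmse(tracked, reference))
print(mean_lag_ms([0.0, 10.0, 20.0], [5.0, 16.0, 24.0]))
```

A lower error and a lower lag would both indicate higher tracking fidelity under these assumptions.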

In turn, to create a high-fidelity 3D interaction in the virtual environment, it is necessary to establish accurate synchronization between movements in the real and the virtual world. Users’ interaction can be supported by input/output devices, such as motion trackers, control devices (joysticks), eye trackers, and data gloves.

Motion tracking systems are responsible for locating the position and orientation of specific markers or sensors attached to the user’s body or the device used in the real world for interaction, and then transferring that information to the middleware (Maran and Glavin 2003; Aggarwal et al. 2007). Consequently, the middleware interprets and renders the data to visualize it for the user. Motion tracking system fidelity is highly concerned with how accurately the user’s movement corresponds to and synchronizes with position and orientation in the virtual world.

In motion tracking systems, the user’s movement in the virtual space can be tracked through trackers on the full body, head, hands, and other body parts, depending on the simulation objectives. For instance, Pan and Steed (2019) investigated how users’ behavior is affected by a self-avatar that allows them to see their feet, legs, and other body parts. Spanlang et al. (2014) distinguished between head tracking and body tracking. Body tracking ideally tracks the movement of the user’s body parts, including facial expressions. Slater and Sanchez-Vives (2016) argued that the virtual body can be designed to look like the real one or not, and that with body tracking it can be programmed to move with the real body’s movements. In their review of recent literature on using VR for pediatric pain, Won et al. (2017) found that the tracking capabilities inherent in embodied VR experiences offer clinicians the ability to monitor their patients’ physical movements and quantify rehabilitative effort without relying on self-report data, using wearable monitors for tracking physiological data.

Consequently, a more sophisticated VR experience requires more advanced hardware and software systems that are capable of producing a higher fidelity experience. Therefore, one of the biggest challenges with current VR technology is the need for powerful hardware and software to generate, simulate, and render data to create high-fidelity VR experiences.

3.2.3 Simulation system fidelity

Generic methods for modeling and generating animations and motions have been studied extensively in engineering, biomechanics, robotics, and computer graphics. The literature offers a wide variety of methods for generating 3D models and character motions based on a set of controllers. These methods are usually stable and quite suitable for robotics purposes, but the motions they generate may lack certain human-like characteristics. The VR content is the core element for making decisions about the VR display and VR interaction systems used to build a VR system (Cabrera and Wachs 2017).

The architecture of a VR simulation consists of the 3D model design and the 3D interaction design. The 3D interaction design can use natural and magical interactions, and the design process depends on the purpose and main objectives of the VR simulation and the nature of the tasks performed in different VEs. Donald Norman (2010) argued that natural interactions performed using realistic interfaces are not necessarily superior for enhancing VR simulation fidelity, whereas magical interactions allow users to extend their interactions and make them more powerful. Designing a magical interaction in a VR simulation therefore means recreating real-world interactions and extending their range and power. In some cases, users simply want to do things that they cannot do in the real world. On the one hand, Bowman and McMahan (2007) have shown that magical interfaces can be much more efficient and more usable than realistic interfaces in 3D interaction design. On the other hand, using magical interactions in VR simulations can reduce the plausibility of the interaction, reducing the feeling of presence.

Donald Norman (2010) discussed one of the most critical concepts in 3D interaction design: affordances. An affordance refers to the functionality of a 3D object inside the VR scene, along with a presentation of the relationship between the object’s properties and the technique for interacting with it (Norman 2010). It is simply how users perceive a user interface element in the VE (Gibson 2014). For instance, in a VR scene for training machine operators in manufacturing, when an operator sees the machine start button, they realize that pressing it starts the machine. So, an affordance is the relationship between the properties of an object (the machine start button) and the ability to act on it (pressing the button by hand). Accordingly, an affordance in the design of 3D interactions provides an experience closer to the real world than an interface does. For illustration, workers can assemble and disassemble a complex automotive part with a six-degree-of-freedom controller. In other words, the type of controller used in the VE should be capable of interacting efficiently with virtual objects based on their properties. In summary, the simulation content should be designed with visual signals, signs, and gestures that indicate the presence of each affordance in the VE. Properly designed affordances help overcome frustration in the VE.

3.2.4 Integration of the system data

One significant aspect of VR display and tracking fidelity is the integration of the system data. Proper integration of the VR system data allows users in a VE to observe, interact with, and manipulate the surrounding VE through real-time updates of the graphics according to their viewpoint and interactions. The VR system data originates from several sources, including VR hardware, the 2D user interface, the 3D user interface, head position and rotation tracking, hand position and rotation tracking, and motion tracking of any other body parts. Different user inputs from different VR systems support VR interaction. Consequently, the main task in integrating VR system data is to translate the data captured from different VR system components, such as interactions performed by users in real time, into a language that the computer system understands and reacts to in real time.
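As a rough illustration of combining data from several sources, the sketch below merges timestamped samples from separate streams into one time-ordered sequence using Python's standard `heapq.merge`. The `Sample` record, the stream names, and the values are hypothetical, and the sketch assumes each stream is already sorted by timestamp; it is one simple way to picture the integration step, not the mechanism used by any particular VR middleware.

```python
import heapq
from typing import NamedTuple, Tuple

class Sample(NamedTuple):
    """One timestamped reading from a single VR data source (hypothetical record)."""
    timestamp_ms: float
    source: str
    value: Tuple[float, ...]

def merge_streams(*streams):
    """Merge per-source streams (each sorted by time) into one time-ordered list."""
    return list(heapq.merge(*streams, key=lambda sample: sample.timestamp_ms))

# Hypothetical head and hand tracking streams.
head = [Sample(0.0, "head", (0.0, 1.7, 0.0)), Sample(11.0, "head", (0.0, 1.7, 0.1))]
hand = [Sample(5.0, "hand", (0.3, 1.2, 0.2)), Sample(16.0, "hand", (0.3, 1.3, 0.2))]

for sample in merge_streams(head, hand):
    print(sample.timestamp_ms, sample.source, sample.value)
```

Ordering all inputs on a common clock is what lets the system update the graphics consistently with the latest viewpoint and interaction data.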

In addition to tracking of the user’s body motion, VR system data can include physiological signals and eye gaze. Physiological signals can include heart rate, brain activity, or muscle activity. Processing them normally requires some expertise in real-time data processing, and they can be used in training and therapy. For instance, a tracking system can be used to design manufacturing processes that take health and safety issues into consideration, for example by monitoring heart rate while a certain manufacturing activity is performed. Furthermore, a VR simulation can be integrated with physical devices that reproduce the movements corresponding to the interaction in the simulation, such as using VR to simulate and validate a new car model with actual steering and driving controls, as in the real car.

Middleware acts as the motor of the VR system integration process. Its role is to link personal computers (PCs) and servers by coordinating running VR applications and enabling them to communicate with other applications. It provides the VR engine that enables internal and external VR functions to pass data between each other.

3.3 Fidelity evaluation scale

After the building blocks and the set of factors required to evaluate the fidelity of a VR system were identified in the previous section, the factors were classified as objective or subjective measures of a VR experience. Table 1 summarizes the fidelity evaluation criteria: it shows how each factor can be evaluated, along with the evaluation tools and items required to evaluate it objectively, subjectively, or with a mix of both.

Table 1 Fidelity evaluation criteria summary for virtual reality systems

As mentioned in the previous section, the dimensions of each variable affecting immersion and presence were determined. Presence was evaluated by applying the theoretical approach created by Sheridan (1992), who identified the underlying factors of presence as sensory information, sensory control, and motor control. Subsequently, Witmer and Singer (2005) developed a 32-item presence questionnaire with four subscales for measuring presence: sensory, control, distraction, and realism factors. However, the questionnaire by Witmer and Singer was criticized for lacking objective measures and for not capturing presence comprehensively, since it needs items that capture and measure psychological factors effectively.

Another approach to measuring presence was developed by Slater et al. (1994) over multiple studies. The Slater et al. (1994) questionnaires are based on questions relevant to three themes: (1) the embodiment illusion, which refers to the sense of being physically present in the VE; (2) the plausibility illusion, which is the extent to which the VE becomes the dominant reality; and (3) the place illusion, which is the extent to which the VE is remembered as a “place” and the user feels inside the VE.

The aforementioned questionnaires, together with the Igroup Presence Questionnaire (IPQ) developed by Schubert et al. (2001), can be used to derive the items that measure presence for VR applications in Table 1. The IPQ is a scale for measuring the sense of presence experienced in a VE. The presence questionnaires by Witmer et al. (2005), Schubert et al. (2001), and Slater et al. (1994) are currently the three most cited presence questionnaires applicable to VEs on Google Scholar, with 5155, 1259, and 1226 citations, respectively (Nov 2020).

The fidelity of visual, auditory, and haptic systems can be evaluated using the suggested scale in Fig. 6, which also includes a general evaluation of the other fidelity factors. The objective of the fidelity scale is to classify the fidelity of VR systems objectively. In these scales, each factor affecting fidelity was quantified and classified from 1 to 5, or low to high, on a separate scale, along with a description of the factor’s features at each fidelity class. The scale formulates the details of the framework by comprehensively analyzing and referring to previous research contributions related to fidelity evaluation (Richardson 2017; Wolfartsberger 2019; Harris et al. 2020; Pala et al. 2021; Pastel et al. 2021; Trepkowski et al. 2019; Kim et al. 2020; Witmer et al. 2005; Schubert et al. 2001). In addition, the evaluation criteria were determined according to recent advances in VR technologies and human biological abilities.

Fig. 6 Motion to photon time

The measurement of fidelity concentrated specifically on digital sensory system fidelity. The importance of each factor affecting fidelity depends on the objectives of the VR simulation and the user’s expertise in the tasks or events in the simulation.

Overall, the ideal VR system is the one with the highest fidelity on all four building-block scales. It would display all digital sensory system aspects, including the five senses, in a real-time interaction system with full-body tracking (Cabrera and Wachs 2017; Slater 2009; Slater et al. 2010). The VR content of the simulation for the ideal VR system should be high-quality 3D graphics close to the real world. Also, the data generated in the ideal system should be integrated and synchronized to produce a realistic 3D graphics VE.

The digital sensory fidelity scale depends on the number of senses involved in the VE: the more digital sensory systems are used, the higher the fidelity that can be reached. The ideal VR system would display visual, auditory, haptic, taste, and olfactory stimuli. Furthermore, high-fidelity visual systems are those capable of providing highly realistic 3D stereoscopic graphics by accurately rendering high-density geometry models at a high frame rate, display resolution, and field of view (FOV). The human visual field spans a forward-facing horizontal arc of slightly over 210 degrees and a vertical range of 150 degrees. Accordingly, the ideal FOV would match that of the human visual field. The wider the FOV, the more present the user is likely to feel in the experience. HMDs commonly have an FOV of around 100 degrees, whereas high-end multiscreen systems have an FOV of around 170 degrees. Therefore, the FOV of HMDs is significantly narrower than the actual human field of view. On the other hand, most high-end multiscreen systems run at 60 frames per second, while HMDs run at 90 frames per second.
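The comparison above can be made concrete with a small calculation. The sketch below uses only the figures quoted in the text (a 210-degree human horizontal field, 100 and 170 degrees for the two display types, and 90 and 60 frames per second). The coverage ratio is an illustrative quantity introduced here, not a metric defined by the fidelity scale, and the function names are hypothetical.

```python
HUMAN_HORIZONTAL_FOV_DEG = 210.0  # horizontal arc quoted in the text

def fov_coverage(display_fov_deg, human_fov_deg=HUMAN_HORIZONTAL_FOV_DEG):
    """Fraction of the human horizontal visual field covered by a display."""
    return min(display_fov_deg, human_fov_deg) / human_fov_deg

def frame_time_ms(frames_per_second):
    """Time available to produce one frame at the given frame rate."""
    return 1000.0 / frames_per_second

# (horizontal FOV in degrees, frame rate in frames per second), as quoted in the text.
systems = {"HMD": (100.0, 90.0), "high-end multiscreen": (170.0, 60.0)}

for name, (fov_deg, fps) in systems.items():
    print(f"{name}: covers {fov_coverage(fov_deg):.0%} of the horizontal field, "
          f"{frame_time_ms(fps):.1f} ms per frame")
```

Under these figures, the multiscreen system covers more of the visual field while the HMD has the shorter frame time, which is the trade-off the paragraph describes.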

High-fidelity auditory systems improve immersion and presence in the VE. The scale suggests three factors affecting the fidelity of auditory systems. The quality of auditory stimuli was determined according to the mouth-to-ear latency, which refers to the time delay between the production of a sound at its source and its arrival at the listener’s ear. Becher et al. (2018) evaluated the accuracy of a mouth-to-ear latency measurement method in which a buzzer was attached directly to the microphone of the measurement system. They found that the mouth-to-ear delay can be measured very precisely with the measurement system used.

Similarly, the fidelity of haptic systems depends strongly on the number of DOF available to the user of a haptic device, the accuracy with which the operating software interprets haptic movement, and the refresh rate.

The fidelity of VR system data integration depends on producing a high-fidelity 3D interaction in the virtual environment by establishing accurate synchronization between the movements of the user and the display system, along with the other systems. A fundamental fact about VR is that the graphics are updated according to changes in the head viewpoint. The time this update takes, from the user's movement to the rendered image appearing on the display, is the motion-to-photon time shown in Fig. 7. To avoid nausea and to obtain the smoothest effects in the VE, it is recommended that rendering time be under 20 ms; that is, display systems must have a total motion-to-photon latency of no more than 20 ms to prevent motion sickness and nausea in the VE (Carmack 2013).
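The 20 ms recommendation can be read as a latency budget shared by every stage between the user's movement and the displayed image. The following sketch sums per-stage latencies and checks the total against that budget. The stage names and their values are hypothetical placeholders chosen for illustration; real values would have to be measured on the system under evaluation.

```python
MOTION_TO_PHOTON_BUDGET_MS = 20.0  # upper limit quoted in the text (Carmack 2013)

def motion_to_photon_ms(stage_latencies_ms):
    """Total motion-to-photon latency as the sum of sequential stage latencies."""
    return sum(stage_latencies_ms.values())

def within_budget(stage_latencies_ms, budget_ms=MOTION_TO_PHOTON_BUDGET_MS):
    """True if the total latency is no larger than the budget."""
    return motion_to_photon_ms(stage_latencies_ms) <= budget_ms

# Hypothetical breakdown of one pipeline, in milliseconds.
example_pipeline = {
    "tracking": 2.0,
    "application": 3.0,
    "rendering": 11.0,
    "display_scanout": 3.0,
}

total = motion_to_photon_ms(example_pipeline)
print(f"total = {total} ms, within budget: {within_budget(example_pipeline)}")
```

Treating the stages as strictly sequential is a simplifying assumption; pipelined systems may overlap some of them.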

Fig. 7 The fidelity evaluation scale of virtual reality systems

The simulation system scale classifies VR systems according to the quality of 3D imaging, from monoscopic imaging to high-realism 3D graphics. Moreover, the consistency and accuracy of the simulation system depend on the displacement in the user’s view within the VE; the displacement must be less than 1 arcmin to avoid nausea in the VE (Knecht et al. 2012).
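To give a sense of scale for the 1 arcmin limit, the sketch below converts between an on-screen offset and the angle it subtends at the viewer's eye, using basic trigonometry (1 arcmin is 1/60 of a degree). The 2 m viewing distance and the function names are assumptions made for illustration only.

```python
import math

ARCMIN_PER_DEGREE = 60.0

def angular_displacement_arcmin(offset_m, viewing_distance_m):
    """Angle, in arcminutes, subtended by an offset seen from a given distance."""
    return math.degrees(math.atan2(offset_m, viewing_distance_m)) * ARCMIN_PER_DEGREE

def max_offset_m(viewing_distance_m, limit_arcmin=1.0):
    """Largest offset that stays within the angular limit at a given distance."""
    limit_rad = math.radians(limit_arcmin / ARCMIN_PER_DEGREE)
    return viewing_distance_m * math.tan(limit_rad)

# Hypothetical viewing distance of 2 m from a projection screen.
distance_m = 2.0
limit_m = max_offset_m(distance_m)
print(f"1 arcmin at {distance_m} m is about {limit_m * 1000:.2f} mm")
```

At this assumed distance the limit corresponds to well under a millimetre on the screen, which shows how tight the requirement is.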

The interaction system fidelity scale classifies VR systems according to body tracking coverage and the consistency of the interaction experience between users. Accordingly, motion tracking system fidelity is highly concerned with how accurately the user’s movement corresponds to and synchronizes with position and orientation in the virtual world.

The purpose of a VR system is to provide users with real-time interactivity, tracked by the system, through a 3D stereoscopic display. Head-mounted displays (HMDs) achieve this by using small display screens that move with the viewer. HMDs isolate users from their surrounding real environment, which can be highly intrusive and confusing. In high-end multiscreen systems, by contrast, the projection plane is fixed and does not move with the viewer’s position and angle as it does in HMDs.

With respect to immersion, HMDs provide a highly immersive experience, as users are completely isolated from their surroundings. In a high-end multiscreen system, by contrast, immersion might be lost if the system does not have six display screens. For instance, in a four-wall VR system, the user might lose the sense of immersion if the projectors are not synchronized to process the images and produce a combined visual rendering across all the screens in the system.

3.4 Fidelity evaluation process

Fidelity evaluation consists of reviewing several studies to capture the objective and subjective measures of all factors affecting VR systems' fidelity (Grant and Lee 2007). Immersion can be evaluated by the degree to which VR systems deliver an objectively measurable match to the real world in an interactive way (Slater and Sanchez-Vives 2016). The objective measures include physiological measures (Shi et al. 2007; Oviatt et al. 2004), technical measures (van der Kruk and Reijne 2018), time recordings (Meyer et al. 2005), eye movement (Rey et al. 2010), and task performance (Jennett et al. 2008; Grajewski et al. 2013). The visual system has also been proposed as an objective measure of presence in the VR environment (Cooper et al. 2018). However, to capture subjective evaluation, including the psychological and other subjective factors listed in Table 1, questionnaires are a rigorous method of gaining insight into target users' opinions and subjective measures (Youngblut 2003). Interaction techniques have a strong effect on fidelity and presence (Seibert and Shafer 2018); they play an integral role in creating collaborative, interactive virtual experiences and are an important determinant of simulator fidelity. Accordingly, Fig. 8 illustrates the suggested process of evaluating fidelity using the proposed fidelity scales along with the evaluation items for each factor.

Fig. 8 The fidelity evaluation process

The first stage in the creation of VR simulation content is to identify its objectives through a simulation scope statement. Identifying the simulation objectives helps ensure that the development team fully accomplishes all of the expected results and outcomes. The VR simulation objectives provide a documented basis for decision-making during the VR simulation development process and are used to direct the simulation development team. The project team creates the simulation scope statement by defining the simulation's primary expected outcomes, purposes, and system requirements. The second stage is to determine and prioritize the factors affecting the fidelity of the VR system. The investigator should identify each factor and assign it an importance weight to determine the critical fidelity factors. Then, the investigator evaluates fidelity objectively and subjectively. Finally, the results of the evaluation are consolidated and validated.

Even though several evaluations have demonstrated significant benefits of high-fidelity systems over low-fidelity systems in most cases, the results of these evaluations cannot be generalized, because different VR applications have different objectives and therefore different configurations. Moreover, the fidelity evaluation process involves a large number of complex applications and interdependent variables. In addition, VR simulations do not need to alert all human-sensory systems to achieve the objectives for which they were created; the fidelity of the simulation relates to the main objectives of the simulation (Hontvedt and Øvergård 2020). Therefore, the first step in evaluating fidelity is a careful identification of the VR simulation objectives and requirements. Accordingly, the priorities of the fidelity factors are weighted according to the main objectives of the simulation. The importance of each factor should be assigned by experts in the field of application. Then, from the fidelity evaluation criteria table, each objective factor can be evaluated using the fidelity scales in Fig. 8. Similarly, each subjective factor is evaluated using the evaluation items in the fidelity evaluation criteria in Table 1. The following example shows how the fidelity framework was used to evaluate the VR system at EMU objectively. To this end, a case study of a car door inner panel was established and used to implement the framework.

The EMU CAVE system is equipped with three rigid acrylic rear-projection screens with a 16:9 aspect ratio. The projection system is based on four projectors; each projector projects the VR content onto its corresponding screen. The user wears 3D shutter glasses with retro-reflective markers for head tracking. The haptic system is based on Vicon APEX and ManusVR gloves. Nine cameras are available in the CAVE at the EMU VR lab, capturing at speeds of up to 250 frames per second (FPS).

The proposed VR simulation for this paper, as shown in Fig. 9, consists of a set of 3D models of a conventional car door and manufacturing systems required to perform the stamping process in it. This simulation was created for layout planning.

Fig. 9 Sample of the proposed virtual reality simulation

To prioritize the factors affecting fidelity, the investigator should identify the VR simulation objectives and the expected end results of the VR experience. Accordingly, the VR simulation requirements are determined, including visual, auditory, haptics, 3D interactions, etc., based on the purpose of the VR simulation. Then, the investigator should assign an importance weight to each factor to determine the critical fidelity factors. Table 2 shows the proposed VR simulation's objective, the requirements for each factor, and the critical fidelity factors: the visual, haptics, and tracking systems, which are assumed to be highly important for achieving the VR simulation objective and performing the layout planning accurately and effectively.

Table 2 VR simulation objectives

Table 3 shows the results for each factor that can be evaluated objectively according to Table 1. The actual technical specifications of the VR system used for this study were evaluated using the fidelity scale in Fig. 6. For instance, the FOV of the EMU CAVE system is 170 degrees. According to the scale, this is considered high fidelity, so it was evaluated as 4 out of 5, and the importance of this factor with respect to the objectives was given as 3 out of 5. Accordingly, the score of the FOV was 12, obtained by multiplying the objective evaluation by the importance. The same procedure was followed for the other factors, and the average overall fidelity was found to be 3.38 out of 5, which can be classified using Fig. 10 as a medium-fidelity system for this specific VR simulation.
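The scoring arithmetic can be sketched as follows. The FOV entry (rating 4, importance 3, score 12) comes from the text; the other two factors and their values are hypothetical placeholders, since the contents of Table 3 are not reproduced here. The sketch also assumes that the overall figure is an importance-weighted average of the ratings, which is one plausible reading of "average overall fidelity" and not a confirmed description of the authors' calculation.

```python
def factor_score(rating, importance):
    """Score of one factor: objective rating (1-5) times importance weight (1-5)."""
    return rating * importance

def weighted_overall_fidelity(factors):
    """Importance-weighted average rating across factors (assumed aggregation)."""
    total_score = sum(factor_score(r, w) for r, w in factors.values())
    total_weight = sum(w for _, w in factors.values())
    return total_score / total_weight

# (rating, importance) per factor. Only field_of_view is taken from the text;
# the remaining entries are made-up values for illustration.
factors = {
    "field_of_view": (4, 3),
    "frame_rate": (3, 5),
    "tracking_coverage": (2, 4),
}

print(factor_score(*factors["field_of_view"]))
print(round(weighted_overall_fidelity(factors), 2))
```

Because the example values are invented, the result is not expected to match the 3.38 reported for the EMU system.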

Table 3 The objective evaluation results

Fig. 10 The overall fidelity score classification

The proposed framework and scales in this paper were developed after analyzing previous contributions to evaluating VR technologies' fidelity. The scale limits were assigned based on the ideal systems and human biological abilities. However, this framework is general, and some factors can be added to the evaluation according to the objectives or expected results from the VR simulation.

The literature review results show that the levels of the digital sensory and tracking systems are significant factors in determining fidelity and performance in most cases. Combined with the example provided, we have contributed to the overall understanding and scaling of the factors affecting VR systems' fidelity. The framework's major strengths are that it considers the diversity and complexity of the VR tasks involved in the VR experience and that it scales to various hardware and software configurations. It defines the sub-elements of each aspect, along with evaluation criteria, methods, and tools. Also, it identifies a medium-fidelity level for each factor along with a suggested objective evaluation. However, the scale's potential limitations include the inability to capture all the factors affecting fidelity objectively and subjectively, as various VR applications require different VR systems and configurations.

In summary, although much progress has been made in the development of VR technologies, VR applications do not need to alert all human-sensory systems to accomplish the objectives they were created to achieve. Also, the hardware and software of VR systems are still not mature enough to create a realistic VR experience with all sensations. As a result, the digital sensory system is considered the main aspect that affects the overall fidelity of the VR experience, specifically the visual system fidelity. Several contributions have summarized the challenges facing the other digital sensory systems, such as olfaction and taste.

4 Conclusions

This paper discussed the factors and elements affecting the overall fidelity of a VR system and provided a comprehensive framework for the evaluation process with respect to four interrelated aspects: (1) digital sensory system fidelity, (2) interaction systems fidelity, (3) simulation system fidelity, and (4) integration among these aspects to produce high-fidelity virtual experiences. The paper also included a description of each factor and element involved in the four aspects of fidelity.

Different physical and psychological factors have a direct effect on the performance of VR systems, which in turn affects the fidelity of the overall virtual experience. Accordingly, fidelity evaluation consists of reviewing several studies to capture objective and subjective measures of all factors affecting VR systems' fidelity. The proposed framework presents evaluation criteria divided into subjective and objective measures, according to a systematic literature review of previous findings related to each aspect that were based on experimental measures. The fidelity of the simulation depends on the main objectives of the simulation (Hontvedt and Øvergård 2020). Therefore, it was found that increasing fidelity in the digital sensory and tracking systems, with accurate integration between the factors, would not always result in the best user performance and high-fidelity experiences, as this depends strongly on the simulation objectives.

Finally, we believe that fidelity should be evaluated objectively prior to the experience, and with a mix of subjective and objective evaluations after the virtual experience, to effectively evaluate the performance of the users and the VR systems.