9.1 Introduction and Overview

When conceiving applications in VR and AR, designers face a large design space with an unmanageable number of conceivable realization alternatives. The large number of available input and output devices alone, which are themselves available in different variants and which can be combined in different ways, makes a systematic analysis and evaluation of all implementation alternatives difficult. This is especially true since a sufficient theoretical foundation for such an analysis is not available today. Case studies in the sense of best practices therefore provide useful orientation. VR/AR designers often take existing successful case studies as a starting point for an initial conception. In case studies, one can see how different technologies interact and how interaction techniques can be selected and adapted to the technical circumstances in a meaningful way. Case studies are an important source of experience. Since most VR and AR applications today are “one-offs” for a specific VR/AR setup and a specific application goal, one cannot consult any standards, but one can try to benefit from the experiences of previous successful applications.

This chapter contains a selected collection of case studies. On the one hand, they illustrate the basic principles of VR and AR taught in the other chapters and show examples of how virtual worlds have actually been realized or reality has been enhanced with virtual content. On the other hand, they provide an insight into how case studies can serve as a basis or inspiration for the development of future VR/AR applications. Each case study is self-contained. Since the context in which case studies were created is also of interest, each case study names not only the authors but also the organization or company in which it was created.

The first case study shows that the use of VR in certain applications, such as the construction of automobiles in the automotive industry, is already very well established. VR and AR are therefore not only something that researchers in academia are dealing with in the prototype stage but something that is being used in a commercial environment. In the assessment of Technology Readiness Levels (TRLs), as defined in the ISO 16290:2013 standard (ISO 2013), the maturity of VR/AR technologies today comprises all stages from basic technology research to system test, launch and operations. The next three case studies provide further examples of successful commercial use of VR/AR in different application domains, such as entertainment/infotainment, life sciences and diagnostics, as well as civil engineering. The case studies illustrate the added value of VR/AR. This includes cost savings, for example, when physical models in the design process are at least partially replaced by virtual models that can be created more cheaply, or when costly excavation damage during construction work is avoided. But the examples also illustrate other benefits, such as the improvement of human–machine interaction in Case Study 9.3 or the realization of telepresence and computer-supported collaboration in Case Study 9.4, as well as completely new possibilities of visualization.

While Case Study 9.5 uses mobile devices such as smart phones or tablets, the next two case studies (9.6 and 9.7) show examples of large installations that use specially equipped rooms. In contrast to Case Study 9.5, Case Study 9.6 does not visualize construction data such as blueprints in reality, but utilizes Spatial Augmented Reality in a permanently installed, dedicated hardware setup that can display construction data flexibly and at life size in its spatial context. Case Study 9.7 shows how a CAVE, a sophisticated hardware infrastructure, can be used to convincingly present a virtual world. This VR hardware is located in an academic environment and also highlights the added value of VR for scientific applications.

Case Study 9.8 is an example of the use of VR/AR in the field of medicine and health. It shows that AR can also be used for treatment, in this case for therapy of people who have suffered a stroke. It also shows how ideas and approaches for the use of VR/AR are developed in the academic environment. Case Study 9.9 shows how a transition from the academic environment to commercial exploitation can be accomplished.

The next three Case Studies, 9.9 through 9.11, demonstrate the value of integrating not only objects but also virtual characters into VR/AR. These virtual characters can be used either to graphically represent users as avatars in the VR/AR environment, or to populate the world with virtual people, e.g., in the form of virtual agents capable of acting autonomously. All three case studies also demonstrate the potential that VR/AR offers for teaching and training. For example, Case Study 9.9 illustrates basic research on collaborative virtual trainers. Case Study 9.10 shows how virtual patients already serve as established support in medical education. Case Study 9.11 is an example of how embodied social XR also supports social skills training. Furthermore, this case study also shows how avatars can be created and what effects avatars may have on the users they represent. This feedback effect can also be used for therapeutic purposes, for example. Case Study 9.12 is another example of how VR can be used for rehabilitation and training. This case study also shows that diverse user groups can benefit from VR/AR. In this case, VR opens up new possibilities for training that rely on playful effects, which can extend to serious games (Doerner et al. 2016) that are realized in VR/AR.

All in all, the 11 case studies show the wide range of possible applications of VR and AR technologies and the associated objectives, which can range from training to visualization, therapy, design, construction and entertainment.

9.2 Using Virtual Reality for Design Processes in the Automotive Industry

The design process of a car consists of various consecutive steps where several qualities such as aesthetics and feasibility are reviewed. For this purpose, physical mock-ups are manufactured on a 1:1 scale and presented at specific milestones. In Fig. 9.1 (left), an example of a partial physical model is shown. However, the production of these mock-ups is time-consuming, as it can take weeks until the whole prototype is ready for presentation. As a consequence, prototypes do not represent the current state of the car development project. Moreover, they often lack several components, as the manufacturing costs for a fully detailed mock-up would be too high. The overall cost of one prototype varies with the desired quality and can amount to several hundred thousand US dollars.

Fig. 9.1

A physical mock-up serving as a real model of the front part of a car (left) and its virtual counterpart (right). (©Volkswagen AG. All rights reserved)

To mitigate the problems associated with physical prototypes, today’s design process employs Virtual Reality (VR). VR can support decision-making already in the early phases of development. Using powerwalls, CAVEs and VR Head Mounted Displays (HMDs), car components are visualized and reviewed in different variants, leading to a reduction in the number of physical mock-ups needed. With VR being part of the daily work in a variety of fields, such as ergonomics or lighting design, it can be considered a standard technology throughout the development phase of a car.

For certain design reviews, a highly immersive virtual environment is required to ensure that an executive can make a valid decision based on the VR visualization. Common examples of such reviews are ergonomics and visibility checks in a car’s interior. Besides photo-realistic rendering techniques, adjustable seating bucks are used to achieve a high degree of immersion in both examples. Figure 9.2 shows an example of such a seating buck. These seating bucks are physical car seats in combination with VR HMDs. They provide the user with the feeling of being fully surrounded by the interior with a natural view out of the windows. For ergonomic checks, the alignment of the virtual seating position with the physical seat is key for creating a highly immersive experience, as the executives are typically experts in the field of interior design and consequently highly sensitive to any positional discrepancies. They are capable of noticing offsets and height differences of only a few millimeters between the real seating position and the virtual seating position. These discrepancies can result in a significant reduction in the feeling of presence, which in turn could make it impossible to continue with a meaningful evaluation. Thus, a precise alignment of the virtual and physical world is crucial for a valid evaluation result. Visibility checks deal with questions such as: To what extent do the C-pillars affect the driver’s visibility? Does the front vent glass restrict the driver’s view of pedestrians? Does the car’s shape limit visibility through the side- or rear-view mirrors? While the first two questions might be answered by varying the car’s geometry or exchanging certain components directly in VR, virtual mirrors require a correct simulation of how light rays behave.

Fig. 9.2

A seating buck as shown in this picture is used to provide haptic feedback for an immersive experience in VR. (©Volkswagen AG. All rights reserved)

For the evaluation of the car’s surfaces on the exterior and interior, the demands for a realistic depiction are particularly high. Here, a virtual presentation on a powerwall represents the standard tool as it allows for an agile demonstration during which numerous variants can be presented instantaneously under the direct control of a presenter. Figure 9.3 shows an example of a presentation room equipped with a powerwall. The VR system allows the presenter to dynamically change the virtual environment as well as the materials on the interior’s surfaces. As a result, the visualization of different variants of a component not only shows the changes made to the geometry but also emphasizes the impact on the impression of the whole car. As the model used in the VR visualization is automatically derived from the most recent version of the construction data, it is guaranteed that the presented component is always up to date. Furthermore, a rendering cluster enables the rendering of virtual models on a powerwall with global illumination. Therefore, the powerwall can offer a high-quality presentation of the car. With advances in graphics hardware and rendering algorithms, the need for physical prototypes throughout the design process might be further reduced. On the other hand, a powerwall also enables the user to investigate the model from perspectives that no customer is likely to take. This in turn can lead to inappropriate decisions and high costs.

Fig. 9.3

A prototype is visualized on a powerwall. (©Volkswagen AG. All rights reserved)

A VR presentation using an HMD offers the possibility to confine the user to natural viewing perspectives. This enforces an examination of the car similar to how a potential buyer of this car would look at it. With the focus on surface evaluation, an expert study has shown that a surface analysis with a VR HMD can achieve almost equivalent results to the use of physical mock-ups (Tesch and Doerner 2020). Furthermore, a VR presentation on an HMD enables the user to experience the models with natural dimensions while also being able to interactively change not only the model itself but also the virtual scene. For instance, a virtual parking lot can be provided as a context, exhibiting a variety of different car models for comparison purposes. Another example is the provision of a virtual studio with sophisticated lighting controls that facilitates the design evaluation of exterior surfaces.

There are several challenges when using virtual reality with an HMD. One major problem is the occurrence of cybersickness in a variety of different scenarios, such as in driving simulations. Another drawback of a VR presentation with an HMD is that experts have low confidence in the validity of the appearance of virtual objects. One reason for this is imperfections in VR presentations. For instance, displays in an HMD still exhibit the screen-door effect, i.e., the pixel grid can be perceived. Even though technologies such as ray tracing can achieve an almost realistic look of the virtual data, a real-time presentation of sufficient quality on an HMD is currently not feasible.

In summary, there are multiple use cases where VR has the potential to reduce costs and to accelerate the development phase of a car. Among these use cases are visibility checks and surface evaluations. More examples of VR applications can be found in Berg and Vance (2017). VR opens up new ways of interaction between multiple users as well as between the user and the virtual data. For example, while two experts or executives can only discuss a physical prototype from different perspectives (while sitting next to each other), a virtual environment enables them to be in the same position, allowing them to have a similar view. Nonetheless, physical models remain the most trusted basis for final decisions and are still indispensable. Even though the use of VR has proven to be successful and has already led to a reduction in the need for physical mock-ups, there is room for improvement (e.g., in the usability of VR tools or in dedicated VR authoring processes). The potential of VR for design processes in the automotive industry has not been fully exploited yet. Besides VR, Augmented Reality (AR) is also set to be used frequently in a variety of areas, such as the construction and design of vehicles. For instance, AR offers the possibility to enrich simple physical mock-ups by superimposing virtual data on top of these prototypes (Zimmermann 2008). Dummy components on these models can be replaced with the most recent construction data when viewing a live video image augmented with AR methods on a tablet. Thus, AR also facilitates new ways to reduce the level of detail worked into the physical mock-up and thereby further reduces manufacturing cost. By using the camera of a tablet and a precise tracking algorithm that merges a virtual model with its physical counterpart, virtual exterior details can be superimposed on the image of the physical mock-up. Thus, a simple physical object can be augmented with additional virtual details in the correct position. Examples of such details are gaps, different car paints or car components such as different headlights.

9.3 AR/VR Revolutionizes Your In-Car Experience

Although VR and AR have become commodity technologies with high awareness even in the non-tech community, employing AR/VR in a car is still a challenging objective. Several conditions need to be considered to ensure a seamless and well-received in-vehicle AR/VR experience.

The advantages are obvious and numerous use-cases exist, e.g., getting the right information at the right place via AR without the need to take the driver’s eyes off the road, or gaining a new quality of in-vehicle entertainment leveraging VR. However, aligning the largely unpredictable vehicle motion and vehicle space with an augmented or virtual environment requires a careful and well-defined transition to achieve consistent storytelling. In addition, specifically regarding AR, a precise localization of the vehicle is of particular importance.

This case study describes our journey, lasting more than 15 years, from prototyping and research up to the MBUX 2.0 Augmented Reality Head-Up Display available within the new Mercedes S-Class launched at the end of 2020 (see Fig. 9.4). Furthermore, we present how Head-Mounted Displays (HMDs) can become part of an immersive in-car gaming or entertainment solution.

Fig. 9.4

Different AR features (left to right): Distronic (adaptive cruise control), lane departure warning, assisted lane change, route guidance, destination. (©Daimler Protics GmbH. All rights reserved)

Given the recent advances in navigation and driver assistance systems, there are three major questions that arise on the path to fully autonomous driving:

  1. How can the driver oversee, understand and leverage the growing number of increasingly powerful assistance systems of a car?

  2. How can the driver gain trust and lie back while the car takes over part of their job?

  3. How can an entirely new in-car experience be created when there is no human driver?

AR can play a key role in all these questions and it is happening already. From a technical point of view, there are multiple ways in which AR can be used within a car. These include showing an augmented video on a screen, using AR glasses or projecting the virtual content directly onto the windshield. The latter can be achieved by using a recent head-up display with a comparatively large field of view (e.g., 10° × 5°).

After the introduction of the video-based MBUX Augmented Reality for Navigation in 2018, Mercedes-Benz introduced a novel AR HUD starting with the S-Class presented in 2020. It features contact-analog visualization of both navigation and assistance systems, addressing the first two questions in particular. So why are there no other AR HUDs available yet? Apart from the hardware, implementing such a system is much more challenging than it might seem at first.

Pose Estimation and Sensor Fusion

For the navigation use case especially, it is crucial to know the exact position and orientation of the car. Every little bump on the road needs to be considered, as the contact analogy would suffer otherwise. A precise and high-resolution pose is important for the quality of AR in general, but it is even more crucial for the HUD, as it acts like a magnifier that makes changes of orientation of well below 0.1° obvious. This is aggravated by the fact that each piece of sensor data comes with a different frequency, latency, reliability, coordinate system and resolution.
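
To illustrate only the timing aspect of this fusion problem – not Daimler's actual implementation – the following Python sketch dead-reckons a high-rate gyroscope yaw signal and blends in a low-rate absolute heading whenever one becomes available; all sensor names, rates and the filter constant are assumptions.

```python
def complementary_yaw(gyro_samples, heading_samples, alpha=0.98):
    """Fuse a high-rate gyro yaw rate with a low-rate absolute heading.

    gyro_samples:    list of (timestamp_s, yaw_rate_rad_s), high frequency
    heading_samples: list of (timestamp_s, heading_rad), low frequency, possibly delayed
    alpha:           how strongly the smooth gyro estimate is trusted between heading updates
    """
    heading_iter = iter(heading_samples)
    next_heading = next(heading_iter, None)

    yaw = heading_samples[0][1]           # initialize from the first absolute heading
    last_t = gyro_samples[0][0]
    fused = []

    for t, rate in gyro_samples:
        yaw += rate * (t - last_t)        # dead-reckon with the gyro
        last_t = t

        # consume every absolute heading measurement that has become available by time t
        while next_heading is not None and next_heading[0] <= t:
            yaw = alpha * yaw + (1.0 - alpha) * next_heading[1]
            next_heading = next(heading_iter, None)

        fused.append((t, yaw))
    return fused

# toy data: 100 Hz gyro (constant 0.1 rad/s turn), 1 Hz absolute heading with slight noise
gyro = [(i * 0.01, 0.1) for i in range(300)]
headings = [(i * 1.0, 0.1 * i + 0.01) for i in range(4)]
print(complementary_yaw(gyro, headings)[-1])
```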

Projection

The content of the head-up display must appear in a way that fits exactly to the reality in front of the car. To achieve this, every piece of hardware involved needs to be calibrated. Additionally, the head position of the driver (head tracking) as well as the windshield distortion need to be taken into account (warping).
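
As a purely geometric illustration of these steps (not the production pipeline), the sketch below intersects the line of sight from the tracked eye position through a world point with a virtual HUD image plane and converts the hit point to pixel coordinates; the warping step is reduced to a placeholder, and all coordinate frames, dimensions and numbers are assumptions.

```python
import numpy as np

def world_to_hud_pixel(p_world, eye_pos, hud_origin, hud_right, hud_up,
                       hud_size_m=(0.60, 0.20), resolution=(1200, 400)):
    """Intersect the ray from the tracked eye through a world point with the
    virtual HUD image plane and convert the hit point to pixel coordinates.

    All vectors are 3D numpy arrays in an assumed car frame (x forward, z up).
    hud_origin is the lower-left corner of the virtual image plane;
    hud_right/hud_up are unit vectors spanning that plane."""
    normal = np.cross(hud_right, hud_up)
    ray = p_world - eye_pos
    denom = np.dot(ray, normal)
    if abs(denom) < 1e-9:
        return None                       # ray parallel to the HUD plane
    t = np.dot(hud_origin - eye_pos, normal) / denom
    if t <= 0:
        return None                       # point lies behind the viewer
    hit = eye_pos + t * ray

    u = np.dot(hit - hud_origin, hud_right)   # plane coordinates in metres
    v = np.dot(hit - hud_origin, hud_up)
    px = u / hud_size_m[0] * resolution[0]
    py = v / hud_size_m[1] * resolution[1]
    return warp(px, py)                   # compensate windshield distortion

def warp(px, py):
    # placeholder: a real system would use a calibrated 2D lookup table here
    return px, py

# example: a navigation arrow 20 m ahead and 1 m below the tracked eye position
eye = np.array([0.0, 0.0, 1.2])
print(world_to_hud_pixel(np.array([20.0, 0.0, 0.2]), eye,
                         hud_origin=np.array([2.0, -0.30, 1.00]),
                         hud_right=np.array([0.0, 1.0, 0.0]),
                         hud_up=np.array([0.0, 0.0, 1.0])))
```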

Latency

A head-up display does not forgive visual inaccuracies. As you directly see the reality behind the windshield, the requirement is no less than to have zero latency on the screen as well. As this cannot be achieved, the goal is at least to reduce the latency as much as possible and then use proper prediction to make up for the rest.
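
A minimal sketch of such a prediction, assuming a constant yaw rate over the residual latency; a real system would predict the full 6-DoF pose of car and head, and the numbers below are illustrative only.

```python
def predict_yaw(yaw_now_deg, yaw_rate_deg_s, latency_s):
    """Extrapolate the car's yaw over the remaining pipeline latency so that the
    rendered overlay matches the road at the moment the image is displayed."""
    return yaw_now_deg + yaw_rate_deg_s * latency_s

# assumed numbers: 20 deg/s turn rate and 50 ms of sensing + rendering latency
# -> 1 degree of error without prediction, an order of magnitude above the
#    0.1 degree sensitivity mentioned above
print(predict_yaw(0.0, 20.0, 0.050))
```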

User Interface

Although HUDs have become bigger and bigger over time, the display is still limited to a rather small area just in front of the car. Thus, it easily happens that relevant information leaves the field of view. So, the challenge for the UI is to get the most out of AR, but at the same time not to overload the screen and to handle information outside the screen.
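
One common way to handle this – shown here as a hedged sketch, not the MBUX implementation – is to clamp markers that leave the HUD's field of view to the screen border and switch them to a reduced "off-screen" representation such as an arrow.

```python
def clamp_to_hud(px, py, width=1200, height=400, margin=16):
    """If an AR marker falls outside the HUD's limited field of view, clamp it to
    the nearest screen edge and report that it should be drawn as an off-screen
    indicator instead of the full annotation."""
    cx = min(max(px, margin), width - margin)
    cy = min(max(py, margin), height - margin)
    off_screen = (cx != px) or (cy != py)
    return cx, cy, off_screen

print(clamp_to_hud(1500, 200))   # target to the right of the visible HUD area
```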

Autonomous driving is going to release huge amounts of free time in the car. But rear-seat passengers are already looking for distraction during long-haul trips. While traditional activities such as reading, working, playing games or watching movies on smartphones and in-car displays will have their place, car movements easily induce motion sickness during these activities. In addition, unfavorable viewing angles for handheld devices (eyes below road level) and the confined space of the car can further limit perceived comfort and enjoyment for a significant number of people during longer periods of transit.

Head-Mounted Displays (HMDs) that allow full immersion into a virtual or (in the future) augmented reality promise to alleviate many of these problems. They can present any content at eye level (AR/VR) and, considering VR, include the car motion to counteract motion sickness and place the user in completely new environments to escape the confines of the car. In addition, the interior of a car, with all its sensors/actuators and the computable driving forces that act on human bodies, represents an instrumented environment with unprecedented opportunities for new types of entertainment and gaming pertaining especially, but not only, to HMDs. Potential use-cases for VR headsets include:

  • gaming (e.g., a space shooter where the story and route depend upon the real navigation route)

  • entertainment (e.g., VR roller coaster or scenic ride through a fictitious or historic landscape (Haeling et al. 2018))

  • working (e.g., virtually larger office inside the confined car space)

  • recreation scenarios (e.g., sailing over a calm sea while listening to relaxing music)

However, enabling the use of HMDs inside cars differs from use in living rooms in multiple respects. First, the space for body movement is much more confined. To one side of the seat, in particular, there is usually almost no space available. Thus, the virtual interfaces must be intelligently adapted so that users do not accidentally collide with the car interior. Yet more importantly, the driving forces work on the body and induce motion sickness if the visual perception cues contradict those that the body senses through the vestibular system. That means that for any VR content shown, some visual representation of the movement of the real car and the surrounding environment is beneficial for the user.

Other remedies against motion sickness are the visualization of landmarks, rest frames in the real world (e.g., a part of the car that appears in the virtual environment to hold on to) and subtle information cues about the expected acceleration or deceleration of the car. These cues have even been shown to reduce motion sickness rather than increase it (Carter et al. 2018). For AR, digital content has to match the real-world environment precisely, analogous to the head-up display. Because motion sickness, once experienced, can take hours for symptoms to resolve, solving this problem is a key enabler for prolonged use of both AR and VR HMDs inside cars.

However, all these visualizations require precise alignment of the virtual and physical worlds (see Fig. 9.5). The previous section has already elaborated the challenges of calculating a precise car pose in relation to the real world. In combination with HMDs, the precise alignment is further impeded, since conventional tracking methods for HMDs will not work out of the box in a vehicle. This is because the physical forces measured by the HMD’s Inertial Measurement Unit (IMU) will reflect the car and head motion combined (e.g., accelerating will have your virtual head turn slightly down), while other tracking sensor data (e.g., optical tracking) may still only provide evidence for the head motion. This creates observation conflicts during the sensor fusion for the final HMD pose. This sensor fusion, however, is required, as the drift of today’s IMUs is much too high for them to deliver sufficient tracking quality while driving on their own.
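
A simplified sketch of the compensation idea follows, assuming both readings have already been rotated into a common frame; the real fusion is considerably more involved.

```python
import numpy as np

def head_relative_acceleration(hmd_accel_world, car_accel_world):
    """Remove the car's measured acceleration (available from the vehicle's own IMU
    or bus data) from the acceleration sensed by the HMD, so that the headset's
    orientation filter sees only gravity plus genuine head motion and no longer
    mistakes braking or cornering forces for head rotation.

    Both vectors are assumed to be expressed in the same world frame; a real
    system would first rotate the raw sensor readings into that common frame."""
    return hmd_accel_world - car_accel_world

# simplified world-frame readings (z up), braking at 3 m/s^2:
# without compensation the filter would tilt the virtual horizon
hmd_reading = np.array([-3.0, 0.0, -9.81])
car_reading = np.array([-3.0, 0.0, 0.0])
print(head_relative_acceleration(hmd_reading, car_reading))   # gravity only remains
```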

Fig. 9.5

Different types of dynamic virtual entertainment matched to the route ahead (left to right): original view – dynamic procedural environment as basis for, e.g., virtual cinema – space shooter game – edutainment – all viewed inside a (mobile) VR headset. (©Daimler Protics GmbH. All rights reserved)

Regarding storytelling, the biggest challenge is to create thrilling experiences on-the-fly for dynamic routes at dynamic paces (e.g., sudden stops at red lights or traffic jams), both of which can change at any time during the course of the experience. To this end, the car can provide a lot of interesting information, like the current route, the traffic situation along the route, the immediate traffic surroundings and possible alternative routes the driver could take. This data, which could be exposed through an SDK offered by car manufacturers, can help game designers adapt their game story and content positions to these dynamic properties, as sketched below.
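
The sketch below illustrates what such an SDK-based mapping could look like; the interface and event names are entirely hypothetical and not taken from any existing manufacturer SDK.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class RouteEvent:
    kind: str          # e.g. "red_light", "curve_left", "highway_entry"
    distance_m: float  # distance ahead on the current route
    duration_s: float  # estimated duration (e.g. expected stop time)

class InCarGameDirector:
    """Hypothetical mapping layer between vehicle route data and game content."""
    def __init__(self):
        self._handlers: Dict[str, Callable[[RouteEvent], str]] = {}

    def on(self, kind: str, handler: Callable[[RouteEvent], str]):
        self._handlers[kind] = handler

    def plan(self, upcoming: List[RouteEvent]) -> List[str]:
        # turn the predicted route into a list of scheduled game beats
        return [self._handlers[e.kind](e) for e in upcoming if e.kind in self._handlers]

director = InCarGameDirector()
director.on("red_light", lambda e: f"boss fight lasting ~{int(e.duration_s)} s while stopped")
director.on("curve_left", lambda e: f"asteroid field curving left in {int(e.distance_m)} m")

route = [RouteEvent("curve_left", 300, 8), RouteEvent("red_light", 900, 45)]
print(director.plan(route))
```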

Finally, the crash safety of HMDs is a topic that must be researched and solved for wide adoption. Overall, while technical challenges such as fusing precise car and HMD poses at low latency are coming into reach (e.g., as demonstrated by Haeling et al. (2018)), questions regarding social acceptability (Is it okay to use VR HMDs over the whole duration of a family trip?), safety concerns (e.g., crash safety and driver distraction) and business plans (What are users willing to pay for these experiences?) increasingly gain importance, for which McGill et al. (2020) provide a good overview. Both software in general and user interfaces in particular will gain even more importance to deal with the increasing complexity of driving situations, especially considering the trend towards autonomous driving. In addition, technical innovations such as waveguides or holographic displays, as well as more natural and holistic user interaction, will lead towards a seamless human–machine interface. For gaming and entertainment especially, the usage of HMDs has already enabled new applications in the short term.

9.4 VR-Based Service Training in the Life Sciences and Diagnostics Industry

The adoption of virtual reality (VR) and augmented reality (AR) technologies has expanded rapidly across various enterprise sectors, and the technology is now experiencing accelerated integration within the life sciences and the analytical and diagnostics, pharmaceutical, chemical, and processing industries. While VR and AR are considered breakthrough technologies and are expected to substitute for computers and smartphones in the coming decades, the global COVID-19 crisis has forced many companies to accelerate the digitalization of their workforce and operations, leading to increasing general adoption of the technology. VR represents an ideal technology for companies to continue conducting training courses and holding group events without any risk to health and safety. In addition to making an important contribution to a company’s overall digitalization and global sustainability strategy – as drastic reductions in air travel result in a significant decrease in carbon emissions – VR-based training offers a cost-saving potential of around US$450,000–650,000 per product per year.

Along with the continuous growth of businesses in the life sciences and diagnostics industry, the global headcount of their Field Service Engineers (FSEs) is consistently increasing. Additionally, as product lines have become increasingly complex, the knowledge and skill requirements for FSEs have also grown. Previously, FSEs conducted only basic maintenance and repair tasks, but now they also provide expanded services. Because of these factors, the post-service and support teams are reaching their full training capacities, resulting in extended waiting periods for newly hired FSEs to conduct their on-site training. By implementing VR into the training process, businesses can significantly increase their global training capacities and will be able to meet both their short- and long-term training needs. VR provides the global FSEs with highly realistic and fully interactive training scenarios created by service specialists that can be accessed at any time from any location.

In addition to the pre-recorded training content prepared for FSEs, any internal specialist within a company can easily connect to the VR environment and join the FSEs at any time to answer specific questions or even test an FSE, without the need for anyone to leave the office or home. Also, FSEs can always revisit any training material on their own to refresh their knowledge of particular aspects of the instruments and equipment.

The VR-based software platform developed by realworld one now serves as a standard for applications in training, sales, marketing and service (see Fig. 9.6). This software platform has been specifically designed for the life sciences and analytical and diagnostics industries and includes the following functionalities:

  • The CAN functionality enables users to create and preserve their own VR content and share it with others across the globe. People can record and save interactive training sessions, product explanations, events, meetings, and more within their VR environments.

  • The multiuser-based software allows users from all over the world to meet and collaborate in real time, as well as interact directly with products in virtual environments (see Fig. 9.7).

  • Users can upload 3D and CAD data, PDFs, PowerPoint presentations, images, videos, and notes from their desktop into VR to share them with colleagues and business partners.

  • The virtual desktop function lets people use their personal computers in VR. One can browse the web, view files, answer emails or work with BI (business intelligence), CRM (customer relationship management) or ERP (enterprise resource planning) systems on a giant virtual screen.

  • The avatar configurator gives users the option to select and configure their own avatars for a personalized virtual experience.

  • realworld one provides multipurpose rooms, including user, conference, training and showrooms, as well as an auditorium hall for larger events. Users can host and invite people to join at any time.

  • The realworld one software is designed to be used with the latest virtual and mixed reality head-mounted display devices from various manufacturers.

  • The non-VR mode enables users to connect to virtual environments without requiring a VR headset.

Fig. 9.6

Example view within a multiuser VR training session. (© realworld one. All rights reserved)

Fig. 9.7

Example interaction in VR training. (© realworld one. All rights reserved)

Fig. 9.8

Using smart devices with a location sensor to visualize underground infrastructure such as buried pipes and cables for construction: (a) with a tablet and (b) with a smart phone. (© vGIS Inc. All rights reserved)

The implementation of VR solutions into the service training process provides companies with the following performance enhancements:

  • Consistency: Access to highly consistent information throughout the entire organization, while providing a coherent format for product user education.

  • Efficiency: Significant gains for global service and support teams through a VR-based strategy that provides easier and quicker access to service experts, while simultaneously reducing the resources required from these experts.

  • Time: Significant reduction in the time required to train personnel on instruments. This shortens the length of the certification process for staff, as there is no waiting period to participate in training sessions.

  • Capacity: The capacity to conduct training on certain instruments is determined by training staff, facilities and hardware availability. By implementing VR-based training with virtual instruments, these dependencies can be significantly reduced.

  • Cost savings: As the number of training participants has considerably increased over the years, the costs incurred by companies for hosting trainees, by providing travel, accommodation and food expenses, have risen markedly. In addition, the depreciation and maintenance of instruments required for training also represent a significant cost that can be saved by moving them into a virtual environment.

  • Flexibility: The ability to receive/conduct training can be made as flexible as is necessary, using a VR-based approach, as content is available at any time and experts can quickly connect into the various environments for one-on-one sessions.

The typical rollout of a VR software implementation at realworld one is a 3- to 6-month process, requiring extensive consultation with the client’s technical team to bring the full spectrum of training features into a virtual environment. The initial phase calls for the complete 3D rendering of all technical equipment involved in the training. After the client provides feedback on the VR prototype module, the final version is completed for international distribution.

9.5 Utilizing Augmented Reality for Visualizing Infrastructure

Municipalities and utility companies maintain vast networks of underground and aboveground infrastructure. This infrastructure is difficult to access – many assets such as pipes, cables, valves, etc., are buried underground – and often complex, as multiple utility types reside densely near each other. The combination of complexity and inaccessibility leads to the high cost of any infrastructure-related initiative. Additionally, utility workers’ inability to see buried assets directly occasionally leads to excavation damage, which is estimated at US$6 billion annually for North America alone.

The traditional approach to locating utility assets relies on using printed and digital maps in conjunction with specialized equipment such as electromagnetic locator devices. The locator then paints the horizontal location of the asset on the ground, produces a sketch and compiles a report. The sketch and report are then provided to the excavator. Often, locations are independently validated by another person through a quality assurance process. The location work process is complicated, relies on records that can – at times – be inaccurate or incomplete, involves personnel with varying degrees of experience and is an important component of the damage prevention and workplace safety programs of the construction industry.

In the AEC (architecture, engineering and construction) industry, unseen infrastructure can cause design errors or construction problems in the field. At the design phase, it can be costly to redesign an already developed plan. If issues come up during construction, they can be extraordinarily costly, leading to long delays and project cost overruns. Furthermore, it can be difficult for engineers to analyze blueprints to understand 3D spatial relationships with regard to construction projects. As a result, it takes longer to work on designs, and those designs are more likely to have errors, which could lead to delays, rework and cost overruns.

Emerging technologies such as Mixed Reality (MR) and especially Augmented Reality (AR) have great potential to positively influence fieldwork (see Fig. 9.8). Using AR tools, field workers and engineers can see the unobstructed physical world in front of them, as well as virtual representations of lines, pipes and proposed structures that are perceived much like holograms (see Fig. 9.9). By interacting with virtual ‘digital twins’, the user should be able to perform the job faster, more easily, more safely and more accurately.

Fig. 9.9

Screenshots from the vGIS application showing AR scenes with additional annotations such as distances measured. (© vGIS Inc. All rights reserved)

vGIS is an AR/MR application designed by vGIS Inc. for high-accuracy field services operations (vGIS 2021). The app uses either the HoloLens – a holographic headset by Microsoft equipped with cameras, audio and various sensors – or traditional smartphones and tablets to display underground pipes and other assets as holograms. While wearing the HoloLens or using the smart device, workers see an unobstructed physical world in front of them as well as carefully placed virtual imagery of proposed buildings and bridges, lines of wastewater pipes underground and reality capture displays. The virtual representations are color-coded and projected to scale at job sites, while advanced positioning algorithms designed by vGIS Inc. keep the virtual imagery, created in real time, positioned at the correct physical location with up to 1 cm accuracy (see Fig. 9.10).
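
As an illustration of the underlying georeferencing step only – vGIS's proprietary positioning algorithms are not public – the following sketch converts a surveyed asset position into local east/north/up offsets around the device using a simple flat-earth approximation; accuracy-critical details such as heading alignment, geoid corrections and drift compensation are omitted.

```python
import math

EARTH_RADIUS_M = 6378137.0

def geo_to_local_enu(asset_lat, asset_lon, asset_elev,
                     device_lat, device_lon, device_elev):
    """Convert a surveyed asset position (WGS84 degrees plus elevation in metres)
    into east/north/up offsets relative to the AR device, using a flat-earth
    approximation that is adequate over the few tens of metres visible on a
    job site."""
    lat0 = math.radians(device_lat)
    d_lat = math.radians(asset_lat - device_lat)
    d_lon = math.radians(asset_lon - device_lon)

    east = EARTH_RADIUS_M * d_lon * math.cos(lat0)
    north = EARTH_RADIUS_M * d_lat
    up = asset_elev - device_elev
    return east, north, up

# a buried pipe joint 1.8 m below street level, a few metres north-east of the user
print(geo_to_local_enu(43.65109, -79.38310, 74.2, 43.65100, -79.38320, 76.0))
```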

Fig. 9.10

Visualizing BIM data in AR. (© vGIS Inc. All rights reserved)

The vGIS platform combines client-provided BIM (building information modeling), GIS (geographic information system), Reality Capture and other types of spatial data with third-party information from multiple sources to create visuals to power purpose-built applications. The information is converted into unified 3D visuals in real time to display on the end user’s devices (see Fig. 9.10).

The broad range of devices covered by vGIS allows AR users to deploy tools that work better in specific environments. Phones and tablets offer a unique combination of accessibility and convenience. They are familiar, easy to use and always on, which enables apps to run within a few seconds or less after unlocking the phone. Depending on the model and screen size, they are fast and offer excellent visuals, even in bright light. On top of this, they already run numerous apps that comprise a standard toolkit of any enterprise. It is not surprising that approximately 90% of vGIS app deployments are on mobile devices.

HoloLens and other dedicated AR devices are the best tools for complex or busy visualizations. These include visualizations of sophisticated BIM models, structures, multi-layered utility corridors, subsurface utilities of a busy downtown street, intertwining fibre-optic cables, etc. HoloLens delivers depth perception, which helps the user understand complex 3D objects almost instantly. The superiority of the stereoscopic 3D visuals exclusive to HoloLens and similar devices warrants deploying at least a few of these units to support advanced construction and engineering jobs, critical utility maintenance tasks (e.g., field crew supervisors), utility location validators, public works and similar roles where speed, deeper understanding and accuracy are important.

The hands-free environment is another type of deployment where HoloLens shines. If the user needs to remain hands-free to perform his or her job, paper records and tablet/phone-based tools will not suffice. HoloLens provides a rich and interactive user experience for displaying manuals, guides and collaboration tools while keeping the user’s hands free to do the job.

vGIS helps field technicians close service tickets more quickly by reducing the time required to locate assets. Depending on the complexity of the location and the availability of utility data, the system can save up to several hours on a single locate job. A study conducted by vGIS clients found that utility locators could reduce the time required to complete jobs by 50%. At the same time, QA validation time was reduced by 66–85%. This translated to cumulative savings of 12–20 h per locator per month.

Additionally, vGIS helps avoid costly repairs and line breaks. A line strike means that work comes to a halt until repairs are made. Many of those problems occur because the aboveground markings are inaccurate or incomplete. A simple two-hour markup may easily turn into a $23,000 dig up and repair. vGIS helps reduce the number of such strikes.

The impact in the AEC space is yet to be measured. However, early deployments conducted by several multinational corporations have demonstrated tangible improvements in infrastructure-related projects, such as light rail construction and road work.

9.6 Enhancing the Spatial Design Process with CADwalk

Building design remains a uniquely challenging problem, involving a variety of stakeholders (novices to experts), waterfall development and high costs. In addition to these problems, given the huge physical size of the buildings being created and the fact that they must be scaled down for planning, design becomes increasingly abstract and complex in nature. This is especially difficult for clients who are not architects themselves, but instead are stakeholders who will have to utilize the end product. Fundamentally, clients require some way to bring abstract CAD plans into the real world for collaborative validation and optimization of the proposed project. Design experts can visualize the designs as the final built constructs, but this is a complicated process for clients working from plans at a non-1:1 scale with many layers of complexity (electrical, heating and cooling, etc.). Ideally, clients would be able to see the life-size end result as early in the design phase as possible, and throughout the entire process.

Projection mapping as a research topic enables the real world to be augmented and enhanced. More specifically, Spatial Augmented Reality (SAR) allows large-scale collaboration with a blend of physical and virtual experiences. Given the unique affordances of SAR as a display and interaction medium, the question arose: how could these affordances be leveraged for visualizing and editing large-scale, life-size building designs (Thomas et al. 2011)? In exploring this problem, a joint project between the University of South Australia and Jumbo Vision International (now CADwalk Digital) was established to explore how SAR could be employed. The end result of the research project and subsequent commercialization is CADwalk Lifesize, a large-scale, projection-based, collaborative building design tool that allows end-users (novices and experts alike) to explore their plans in real time and at life size.

Built using the Unreal Engine, CADwalk utilizes multiple floor-facing projectors working in concert with a wall-facing projector in large warehouse-style spaces (see Fig. 9.11). The floor-facing 2D projectors display life-size blueprints and CAD designs of buildings, enabling end-users to physically walk through their new spaces before they have been built. Using the wall-facing projector, a 3D view is projected, showing the 3D textured real-time rendering of the current plans, allowing users to see both the 1:1 blueprints on the floor and the rendered 3D view of the space simultaneously. A roaming Surface tablet is used as a control screen for the session facilitator.

Fig. 9.11
figure 11

CADwalk session showing blueprints on the floor and perspective correct 3D rendering of the scene on the end wall. Trees are visible in the scene as the thin vertical stands. (© 2020 CADwalk Global Group Pty Ltd. All rights reserved)

To collaboratively edit the plans a novel interaction device is used: a “tree”, which consists of an aluminum pole approximately 2 m high, on a wheeled base, with retro-reflective balls attached to the top. Using an optical-tracking system present throughout the whole space, users can wheel the trees onto content, rapidly spin the tree one way, and then back, to have the tree “pick up” the content underneath, which is then fixed to the tree to be moved and rotated around the scene. The user then rapidly spins the tree back-and-forth again to uncouple the projected content from the tree, leaving it in its new location. The trees act as a shared, mobile method for directly interacting with projected content, along with other functions, such as a digital tape measure showing the distance between multiple trees. For plans larger than the physical space available, blueprints can be panned and scaled as desired, including moving between floors in multi-floor structures.
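
A simplified sketch of how such a spin gesture could be detected from the optically tracked yaw of a tree follows; the tracker rate, thresholds and time window are assumptions rather than CADwalk's actual values.

```python
def detect_spin_gesture(yaw_samples_deg, sample_dt=1.0 / 120.0,
                        rate_threshold_deg_s=540.0, window_s=1.0):
    """Detect a rapid spin of a tracked 'tree' in one direction followed by a spin
    back, which toggles picking up / dropping the projected content beneath it.

    yaw_samples_deg: recent yaw angles at a fixed tracker rate (assumed 120 Hz)."""
    n = int(window_s / sample_dt)
    recent = yaw_samples_deg[-n:]
    rates = [(b - a) / sample_dt for a, b in zip(recent, recent[1:])]

    fast_pos = [i for i, r in enumerate(rates) if r > rate_threshold_deg_s]
    fast_neg = [i for i, r in enumerate(rates) if r < -rate_threshold_deg_s]
    if not fast_pos or not fast_neg:
        return False
    # one burst of fast rotation must be completed before the opposite burst starts
    return max(fast_pos) < min(fast_neg) or max(fast_neg) < min(fast_pos)

# toy trace: quick turn to +90 degrees and straight back within roughly half a second
trace = ([0.0] * 30
         + [i * 6.0 for i in range(16)]
         + [90.0 - i * 6.0 for i in range(16)]
         + [0.0] * 30)
print(detect_spin_gesture(trace))   # -> True
```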

Additional functions, such as adding/removing models, are performed with the aid of a user at a desktop placed to the side of the main space. Newer versions of CADwalk seek to leverage tablet input, and employ head-mounted displays (i.e., Microsoft HoloLens) to let users visualize and interact with the full 3D CAD model rendered above the projected blueprints. A miniature version of CADwalk (CADwalk Mini) also allows users without access to a purpose-built installation to still leverage the collaborative and direct interaction offered by CADwalk, albeit at a much smaller, non-1:1 scale. A Virtual Reality (VR) view is also offered, allowing the current scene to be immediately viewed by users in a VR headset. While VR obviously also allows users to view the plans at 1:1 scale, the lack of natural collaboration and the spatial perception issues present in VR (Henry and Furness 1993) impact its effectiveness when an accurate representation of structural plans must be conveyed to end users. Multiple CADwalk installations can be networked together, enabling remote collaboration at real-world scale.

A CADwalk session starts with ingesting the CAD models from the designers/architects. Given the plethora of data formats in use, data must first be prepped for import to the system in a compatible format. As CAD models are increasingly complex, data preparation may involve polygon reduction, among other tools, to create a scene that can be rendered effectively by the system. This process is done offline, before the session begins.

When the session commences, a CADwalk staff member (facilitator) is present to facilitate the session and operate the system, enabling the stakeholders present to focus on their discussions about the space, not on the system itself. Users are able to freely roam the space and use the multiple trees to modify the layout of the environment, measuring, moving and rotating items in the scene, such as doors, walls, furniture or other fixtures. Scenarios can be saved, and actions can be undone/redone and recalled for final decision-making from all stakeholders.

Project stakeholders can then discuss points of concern, either previously identified or newly discovered by being able to view the plans at scale in CADwalk. These include not just cosmetic changes, but legal requirements (safe distances, minimum clearances, etc.), clash detection (e.g., does the air-conditioning duct interfere with the placement of other elements?) and domain-specific investigations.

Given the wide range of application domains in which spatial design occurs, e.g., manufacturing/industrial, domestic housing, aerospace, defense and city planning, CADwalk has demonstrated its ability to improve the current design process through its fast, efficient and cost-saving properties. CADwalk Lifesize Studios are used for everything from kitchen and bathroom design validation and optimizing “dream home layouts” to highly specialized mission-critical control centers. The European Space Agency (ESA) utilized CADwalk to understand current workflows and spaces for their current and future space exploration missions, and for the subsequent validation and optimization of new workstations for their highly specialized operators. This will be the blueprint for all future ESA facilities globally.

While largely ignored for consumer AR, the use of projection in SAR provides unique affordances for commercial and industrial applications, where requirements such as having a fixed setup are not a restriction to adoption. In representing structural plans at life size, CADwalk allows novice and expert end users to collaboratively explore plans on an equal footing. Whereas novice users looking at traditional blueprints or CAD plans may only be able to visualize and understand a subset of the overall plans, including spatial relationships, the intuitive representation of those plans in CADwalk means structural plans are now accessible to all stakeholders, for both viewing and modification.

9.7 The aixCAVE at RWTH Aachen University

At a large technical university like RWTH Aachen, there is enormous potential to use VR as a tool in research. In contrast to applications from the entertainment sector, many scientific application scenarios – for example, a 3D analysis of result data from simulated flows – depend not only on a high degree of immersion, but also on the high resolution and excellent image quality of the display. In addition, the visual analysis of scientific data is often carried out and discussed in smaller teams. For these reasons, but also for simple ergonomic aspects (comfort, cybersickness), many technical and scientific VR applications cannot just be implemented on the basis of head-mounted displays. To this day, it is therefore desirable for the VR labs of universities and research institutions to install immersive large-screen rear-projection systems (CAVEs) to adequately support the scientists (Kuhlen and Hentschel 2014). Due to the high investment costs, such systems are used at larger universities such as Aachen, Cologne, Munich or Stuttgart, often operated by the computing centers as a central infrastructure accessible to all scientists at the university.

At RWTH Aachen University, the challenge was to establish a central VR infrastructure for the various schools of the university with their very different requirements for VR solutions. In cooperation between the RWTH IT Center and the Belgian company Barco, a concept was therefore developed and implemented as aixCAVE (Aachen Immersive eXperience CAVE), which, as a universal VR display, equally meets the requirements of full immersion and high-quality projection.

To achieve the highest possible degree of immersion, a configuration consisting of four vertical projection walls was chosen, completely surrounding the user. To enter and exit the system, an entire wall can be moved using an electric drive. This avoids door elements that interfere with immersion – when closed, no difference to the other projection walls is visible. However, extensive safety measures had to be implemented so that no one could be locked in the CAVE in an emergency. Although a ceiling projection would have further contributed to the degree of immersion in the system, it was not used, as the complex audio and tracking integration planned for the CAVE in Aachen would not have been possible then. To nevertheless achieve largely complete immersion, the vertical screens are 3.3 m high.

The 5.25 × 5.25 m area, which is quite large compared to conventional CAVE installations, offers smaller teams of scientists enough space for collaborative analysis sessions, enables natural navigation (“physical walking”) within certain limits, and creates a realistic feeling of space in the virtual environment (“spatial awareness”). Since the floor should not bend noticeably even with such a large base, 6.5 cm thick glass was used, on which thinner acrylic glass was placed as the actual display. This two-stage structure decouples the static requirements from the display requirements. Glass has better rigidity, while the acrylic glass has very similar properties to the sidewalls, which are also made of acrylic glass. For structural reasons, two glass elements lying next to each other had to be installed instead of a single glass plate. This inevitably creates a gap that, with a suitable mechanical design and skillful alignment of the projectors, turned out to be very narrow at 2 mm.

Figure 9.12 shows the basic structure of the solution with a total of 24 projectors. To achieve the required high image quality, projector and screen technologies were used that guarantee sufficiently high resolution, brightness, brightness uniformity and luminance. The final solution (see Fig. 9.13) is based on active stereo projection technology with 3-chip DLP projectors, each with a light output of 12,000 lumens and a WUXGA resolution (1920 × 1200 pixels). To meet the requirements for the resolution of the system as a whole, four of these projectors were used for each vertical side and eight for the floor, each in a 2 × 2 tiled display configuration with soft edge blending (see also Sect. 5.2).
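
The principle of soft edge blending in the overlap region of two adjacent tiles can be sketched as follows; the overlap width and gamma value are illustrative assumptions, not the aixCAVE calibration data.

```python
def blend_weight(x_norm, overlap=0.15, gamma=2.2):
    """Attenuation applied to a projector near its right-hand edge so that two
    overlapping tiles add up to uniform brightness. x_norm runs from 0 (left)
    to 1 (right edge of this projector); the neighbouring projector applies the
    mirrored ramp. The ramp is linear in light output and therefore pre-corrected
    for the projector's gamma."""
    if x_norm < 1.0 - overlap:
        return 1.0
    t = (1.0 - x_norm) / overlap          # 1 at the start of the overlap, 0 at the edge
    return t ** (1.0 / gamma)             # gamma-corrected fall-off

# at the centre of the overlap both projectors contribute about half the light each
for x in (0.80, 0.925, 1.0):
    print(x, round(blend_weight(x), 3))
```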

Fig. 9.12

Concept of the aixCAVE with 24 projectors. (© TW Kuhlen, G Matthys. All rights reserved)

Fig. 9.13

Complex installation of the glass plates for the floor rear projection of the aixCAVE. (© TW Kuhlen, G Matthys. All rights reserved)

Apart from the resolution and the brightness of the selected projectors, the properties of the rear-projection screens are critical for the resulting image quality. These should provide a uniform brightness distribution without hotspots, so that users can walk from one corner to another within the CAVE without the image quality or perceived brightness suffering from the different perspectives. This requirement was met by using screen materials with excellent diffuse properties (low peak and half gain, see also Sect. 5.2.2).

Figure 5.2 shows the fully installed aixCAVE in operation. By combining a precise mechanical construction with high-quality projection technology, a CAVE system could be implemented that allows an intuitive visual analysis of high-resolution scientific data in three-dimensional space. Ergonomic factors such as high luminance and brightness uniformity, high contrast and excellent channel separation of the stereo projection, as well as small gaps between the individual screens have been consistently taken into account. As a result, the Aachen CAVE goes beyond a pure presentation system, providing a valuable tool that users from science and industry actually use in longer, intensive sessions for exploratory data analysis. In particular, the clear ergonomic advantages over HMDs, as well as the possibilities of a combined analysis of geometric and abstract data resulting from the high resolution, justify – at least at RWTH Aachen University – the very high installation and operating costs. Since its inauguration in 2013, the aixCAVE has proven to be a valuable tool in research projects in production technology, fluid mechanics, architecture, psychology and neurosciences. In addition, the CAVE is not only used as a tool for data analysis, but also as a tool for basic VR research by the computer scientists at RWTH to develop new navigation and interaction paradigms in virtual environments (Kuhlen 2020).

9.8 Augmented Reflection Technology: Stroke Rehabilitation with XR

Millions of people experience a stroke and require rehabilitation therapy every year. Most stroke survivors are left with unilateral impairments, e.g., the inability to move one arm, and have to undergo a very long period of rehabilitation and training to regain motor function. The efficacy of this training depends on four intertwined factors: (1) the patient’s (stroke survivor’s) motivation, (2) the meaningfulness of the tasks in training, (3) the training intensity, and last but not least (4) the provision and effectiveness of stimuli for neuroplastic change. XR techniques, i.e., the full spectrum of computer mediation of reality between Virtual Reality and Augmented Reality, can play a major role here, and we present two systems based on the concept of augmented reflection technology (ART) that we have developed and empirically and clinically tested.

With ART we are focusing on the factor of neuroplasticity, i.e., the brain’s ability to lastingly change in response to environmental stimuli, while maintaining patient engagement with the other three factors for rehabilitation efficacy (motivation, meaningfulness, intensity). The neuroplastic effect is achieved by “fooling the brain” (Regenbrecht et al. 2011) about what it is “perceiving”, e.g., by visually exaggerating movement capabilities of a limb or by mirroring over the healthy limb’s movements to the impaired side (Regenbrecht et al. 2012; Hoermann et al. 2017). XR offers great possibilities here for (1) precisely directing what the patient controls and perceives, (2) suppressing potential disbelief, i.e., believing in the virtual magic of the technology and (3) keeping patients engaged with the rehabilitation process.

ART is based on the principle of decoupling what the patient is doing from what they are seeing. We sense and capture patients’ limb movements (here upper limbs), feed this into an XR system and manipulate the perceivable output in a way that the (neurorehabilitation) effect can be achieved. Over the last decade we have built different versions of ART using tailored input, computing and output modalities.

ART4 (Fig. 9.14, left) comprises two closed boxes into which the patient puts their hands and lower arms. The boxes are closed with curtains, like magician’s boxes, so that the patient cannot see their actual hand movements. Both boxes are equipped with a particular form of diffuse lighting and cameras, which capture what is inside the boxes. The camera feeds are used to (1) foreground-segment the hands and (2) track the hand movements. The segmented hands are placed inside (in front of) a virtual environment, so that the user gets the impression of interacting within that space. These segmented hands can then be selectively shown, hidden and/or mirrored at the therapist’s discretion. We can also augment the users’ perceived hand movement: for instance, an actual movement of, say, 10 mm will result in a 30 mm movement as perceived by the patient on the screen. ART4 was designed for use in clinical settings for the treatment of chronic stroke patients. However, XR features were implemented to allow for its use in other rehabilitation scenarios. These include hot/cold virtual environments for burn victims, enlarged or smaller hands for pain management and the ability to change the color appearance of hands (for complex regional pain).
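
A minimal sketch of the two manipulations described above – movement amplification and mirroring – applied to a tracked hand position; the 3× gain matches the 10 mm to 30 mm example, while the coordinate conventions and function names are assumptions.

```python
def manipulate_hand_position(tracked_mm, rest_mm, gain=3.0, mirror=True,
                             screen_centre_x_mm=0.0):
    """Map the tracked position of the (unaffected) hand to the position at which
    the segmented hand image is shown to the patient.

    tracked_mm / rest_mm: (x, y) positions of the hand and its resting reference.
    gain:   amplification of the movement relative to the rest position
            (a 10 mm real movement appears as 30 mm with gain=3.0).
    mirror: reflect the result across the body midline so the movement is seen
            as being performed by the other (affected) hand."""
    dx = (tracked_mm[0] - rest_mm[0]) * gain
    dy = (tracked_mm[1] - rest_mm[1]) * gain
    x = rest_mm[0] + dx
    y = rest_mm[1] + dy
    if mirror:
        x = 2 * screen_centre_x_mm - x    # reflect across the vertical midline
    return x, y

# right hand moves 10 mm to the right of its rest position at (120, 0):
# shown as a 30 mm movement, mirrored to the left side
print(manipulate_hand_position((130.0, 0.0), (120.0, 0.0)))
```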

Fig. 9.14

Augmented Reflection Technology systems in action. Left: ART4 with “magician’s boxes” and operator; center: ART6 for home use without an operator; right: ART6 mirroring a stroke survivor’s right (unaffected) hand movement and presenting it to them in XR as their left (affected) hand carrying out the mirrored hand movements. (© H Regenbrecht, C Heinrich. All rights reserved)

If we want to apply ART in users’ homes, then we have to (1) allow the system to be self-controlled and (2) make it suitable for installation in people’s homes. ART6 (Fig. 9.14, center and right) utilizes a head-mounted display, a Leap Motion controller, big arcade-style push buttons and foot pedals, individualized virtual hands and machine learning-based feedback mechanisms in conjunction with a tailored rehabilitation protocol (Heinrich et al. 2020). Our stroke application scenario has unique requirements in that our user has an impaired arm (no/limited movement), their unaffected arm is carrying out the mirrored hand movements (and thus cannot be used to control the system while in VR), and survivors can have low technical competency, which means the system has to be easy and intuitive to use. To account for these requirements, we developed an interface that consists of arcade-style push buttons for the user to interact with the system outside of VR (start/stop the system, switch between system modules). While in VR, the user can interact with the system using two foot pedals (move on to the next hand exercise or show a virtual training hand which demonstrates the hand exercise to the survivor in VR). Our XR hardware was chosen for survivors’ home use because it provides an inherent decoupling of the survivors’ view from their home (real) environment into our XR environment. For our stroke rehabilitation scenario, this serves three purposes. (1) Survivors are completely immersed in our virtual illusion, and this can lead to a more convincing “fooling of the brain” because of the mixing of what is real (hand movements, real-world/virtual environment correspondence) and what is augmented (mirrored hand position and mirrored movement), which can help produce the suppression of disbelief that is desired for neuroplastic effects to occur. (2) It allows the mirrored virtual hand to be observed in the most spatially congruent and natural position for the survivor. (3) It disconnects the survivor from their home environment, which often contains various distracting stimuli, and allows them to focus their complete attention/gaze on their mirrored virtual hand and rehabilitation exercises.
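
The core of the mirroring step can be sketched as reflecting the tracked hand position across the body midline. The sketch below assumes a body-centred coordinate system whose x axis points laterally and, unlike ART6 itself, ignores hand orientation and finger joints.

```python
import numpy as np

def mirror_hand_pose(position, mirror_plane_x=0.0):
    """Mirror a tracked hand position across a vertical plane through the
    body midline (the plane x = mirror_plane_x), so that the unaffected
    hand drives the virtual hand shown on the affected side."""
    p = np.asarray(position, dtype=float).copy()
    p[0] = 2.0 * mirror_plane_x - p[0]   # reflect the lateral coordinate
    return p

# A right hand 20 cm to the right of the midline appears as a left hand
# 20 cm to the left of the midline.
print(mirror_hand_pose([0.2, 1.1, 0.4]))  # -> [-0.2  1.1  0.4]
```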

Besides the neurorehabilitation effects of ART, both systems are valuable instruments for patient engagement. The “newness” of XR, the game elements of the training tasks, the control of the exercises, including the individually tailored pace, and the realism and meaningfulness of the experience lead to increased patient engagement.

To make ART more widely available – currently, our systems are used with patients and users in Dunedin, New Zealand (Dunedin Hospital) and Berlin, Germany (MEDIAN Klinik Kladow) – we are going to bring our systems to market in the near future. The improvements in the quality of XR technology in combination with increasing affordability of that technology will allow more and more users to benefit from our ART approach. While stroke rehabilitation is our main focus at the moment, ART can be used with other conditions, like traumatic brain injuries, (phantom limb) pain management and hand therapy, but also for education and training, entertainment and other related sectors.

9.9 Collaborative Virtual Trainers in VR Applications

We use the term “virtual trainer” to refer to a simulated human-like character that can collaborate with humans to complete a given task with the use of interactive verbal and/or non-verbal movements and behaviors. Collaboration between virtual trainers and human users can be achieved in different ways. Here we discuss two important types of collaboration that are representative of indirect and direct types of interaction. We speak of indirect collaboration when the virtual trainer collaborates with the user only by providing verbal or non-verbal feedback as instructions, therefore helping the user to complete a given task but letting the user perform the task independently. In a direct collaboration, the virtual trainer will instead jointly complete the task with the user. Here we focus on the particular case of collaborative object manipulation, where both the virtual trainer and the user need to manipulate a virtual object together in order to complete the given task. We summarize in this case study our current work on implementing both indirect and direct types of collaborative virtual trainers. Both of our projects are being developed with the Unity game engine.

To achieve effective interactions when assisting humans to perform tasks in a given scenario, a feedback strategy has to be identified and implemented. In general, feedback is a verbal or gestural signal given by the virtual trainer that might change the user’s thinking or behavior in order to improve their learning or training performance (Arif et al. 2017; Blair 2013). A feedback strategy specifies how feedback is provided, including the types of feedback and several other parameters, such as frequency and adaptation. We have investigated two particular types of feedback strategies for virtual trainers assisting participants in a VR task, as illustrated in Fig. 9.15. Strategies based on Correctness Feedback (CF) and Suggestive Feedback (SF) were compared as possible feedback strategies used by the virtual trainer to help users memorize the relative areas of given countries.

Fig. 9.15

In this VR training environment the virtual trainer provides feedback to assist the user to sort virtual cubes such that the represented countries appear in increasing area order. (© X Shang, M Kallmann. All rights reserved)

A scenario was designed in which the virtual trainer assists the user in sorting cubes representing countries according to the countries’ areas. The user needs to complete the sorting task at different levels of difficulty, implemented by increasing the number of countries to be sorted. Under this task scenario, CF is defined as providing correctness feedback by fully correcting human responses at each stage of the task, and SF is defined as providing suggestive feedback by only notifying the user if and how a response can be corrected. We conducted a pilot user study with four participants and a formal user study with 14 participants to investigate the effects of the feedback strategies provided by the virtual trainer on the user’s performance. Our final study results show that CF was more effective: it achieved higher user preference and shorter task completion times with equivalent performance outcomes. This study exemplifies the importance of implementing an appropriate feedback strategy for a given scenario and application. More details are available in our previous work (Shang et al. 2019).
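
To make the difference between the two strategies concrete, here is a minimal Python sketch assuming a toy representation of the sorting task. The country areas are illustrative values only, and the actual study delivered this feedback through the virtual trainer's verbal and gestural behavior rather than as text.

```python
def correctness_feedback(user_order, correct_order):
    """CF: fully correct the response by returning the right ordering."""
    return list(correct_order)

def suggestive_feedback(user_order, areas):
    """SF: only notify if and how the response can be corrected, here by
    pointing out the first adjacent pair of cubes that is out of order."""
    for i in range(len(user_order) - 1):
        a, b = user_order[i], user_order[i + 1]
        if areas[a] > areas[b]:
            return f"Check the order of {a} and {b}."
    return "The order is correct."

# Illustrative (approximate) country areas in square kilometres.
areas = {"Portugal": 92_000, "Germany": 357_000, "France": 644_000}
user = ["Portugal", "France", "Germany"]

print(correctness_feedback(user, sorted(areas, key=areas.get)))
print(suggestive_feedback(user, areas))
```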

Using virtual trainers to assist users during direct manipulation tasks, in either simulated environments or physical environments, requires some specific approach for achieving adaptive motion control. While in some cases a hard-coded solution involving a step-by-step procedure for the virtual trainer to follow may be possible, such a trainer will not be able to adapt to and execute a similar but different task, or to address the same task in a different environment. To increase the adaptability of this type of collaborative virtual trainer, different machine learning methods can be applied. A common approach is to rely on imitation learning methods that learn human behaviors using some type of action mapping and then to apply the learned knowledge to the robotic or virtual trainer so that it can cooperate with human users on given tasks. Another popular approach is to apply reinforcement learning to improve a robotic or virtual trainer’s sequential decision-making policy through repeated interaction with the environment.

Previous work (Yu et al. 2020) has demonstrated the effectiveness of using deep reinforcement learning (DRL) for virtual trainers or robotic agents, and for agent–human collaboration. We focus on applying the DRL methodology to a virtual trainer collaborating with a human user immersed in a VR environment. In our simulated environment we have designed a task involving two virtual trainers collaboratively moving a tray from a random position to a target position in a dynamic environment with an object on top of the tray. The goal is to reach the target location while avoiding collisions with obstacles and while keeping the tray balanced. Based on this design, we have trained an efficient initial policy in this virtual environment, as illustrated in Fig. 9.16.
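
The publication does not spell out the reward design used for this policy, but the goals named above (reach the target, avoid collisions, keep the tray balanced) can be sketched as a shaped reward. The following Python fragment is only an assumed illustration, with hypothetical weights.

```python
import numpy as np

def tray_reward(tray_pos, target_pos, tray_tilt_deg, collided,
                w_dist=1.0, w_tilt=0.05, collision_penalty=10.0):
    """Illustrative shaped reward for the collaborative tray task:
    approach the target, keep the tray level, avoid collisions."""
    reward = -w_dist * np.linalg.norm(np.asarray(tray_pos, float) -
                                      np.asarray(target_pos, float))
    reward -= w_tilt * abs(tray_tilt_deg)   # tilting risks dropping the object
    if collided:
        reward -= collision_penalty         # contact with an obstacle
    return reward

# One time step: tray 0.5 m from the target, tilted by 3 degrees, no collision.
print(tray_reward([0.5, 1.0, 0.0], [0.0, 1.0, 0.0],
                  tray_tilt_deg=3.0, collided=False))
```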

Fig. 9.16

Two virtual trainers move a tray collaboratively in the VR environment. (© X Shang, M Kallmann. All rights reserved)

Fig. 9.17

Learning scenario with a virtual patient. (© B Lok, FA Jimenez, C Wilson. All rights reserved)

The use of virtual trainers to assist humans in a variety of scenarios represents a promising application of VR technologies, and the study of collaborative behaviors is key to achieving effective virtual trainers. When properly implemented, the discussed types of collaborative virtual trainers can significantly enhance users’ learning and training experiences through interactions that closely resemble intuitive human–human exchanges.

9.10 Virtual Patients: A Case Study from Research to Real-World Impact

In this case study, we will explore the journey of virtual patient technology from research to a commercial system that is educating hundreds of thousands of healthcare students a year. Virtual patients are computer simulations of a real patient encounter. Virtual patients are used in the training of healthcare students, including nursing, medical, pharmacy and physical therapy students. Virtual patients provide students with opportunities for practice, remediation, feedback and exposure to a wide range of conditions and symptoms. Virtual patients are diverse in their backgrounds, being able to present patient scenarios that involve various ages, genders, ethnicities, races and personalities. Virtual patients are used by educators to develop psychomotor, cognitive and social skills in learners. This case study will cover the research conducted by the Virtual Experiences Research Group at the University of Florida, lessons learned through commercialization of the research by Shadow Health® from Elsevier, and implications for nursing education and virtual reality, as their simulations are the most used virtual patient platform in the world.

Research began in the early 2000s into using virtual patients to improve healthcare students’ conversational skills. Early systems experimented with a wide range of modalities including head-mounted displays, large projection displays and desktop monitors (Johnsen and Lok 2008). Research studies evaluated multiple input modalities, including enabling the user to speak to the virtual patient, type questions to the virtual patient and gesture to the virtual patient.

Dozens of user studies were conducted with healthcare students to explore the potential and limitations of virtual patients, including exploring the validity of virtual patients (Johnsen et al. 2007), learning empathy with virtual patients (Deladisma et al. 2007), the impact of different display types (Johnsen and Lok 2008), physical mannequin integration (Kotranza et al. 2008), reflection with virtual patient training (Raij and Lok 2008) and team training (Robb et al. 2014).

The resulting body of publications demonstrated the educational benefits of virtual patients, including developing clinical reasoning, empathy and communication skills. With the benefits and limitations identified through scientific study, the next stage was to identify how to help as many healthcare students as possible with a curriculum of virtual patients. Designing a curriculum of virtual patients would require resources beyond the standard academic mechanisms of grants and collaborations.

The researchers worked with the University of Florida Office of Technology Licensing to identify pathways to commercialization. In 2011, a team of entrepreneurs and some of the core researchers founded Shadow Health.

Three important pivots occurred during the transition from a research platform to a commercial product: market identification, change in delivery mechanism, and adapting the virtual patient to curriculums. First, the nursing student market was identified as the healthcare group that had the largest need for virtual patient training. There are over 400,000 nursing students in the United States and Canada alone. Second, an effective method for delivery of the virtual patients was identified. As head-mounted displays were not widely available at the time, standard laptop/desktop computers with both typed and speech recognition capabilities were used. Finally, the virtual patients moved from short 15-min scenarios used in the research studies to a series of virtual patient assignments that could be integrated throughout a course and provide over a dozen hours of educational content.

As of 2020, thousands of universities and colleges use virtual patients from Shadow Health® in their curriculum. Each year, over 100,000 nursing students use Shadow Health® products in their classes, reaching over 25% of nursing students in the United States and Canada. When they graduate, these nursing students will see approximately half of the US and Canadian population, making the impact of the research into virtual patients a reality that is improving healthcare.

Each Shadow Health® product has a set of Digital Clinical Experiences™ (DCE). The DCE is the virtual patient encounter (see Fig. 9.17). Each DCE simulation starts with a pre-brief with a virtual preceptor that introduces the scenario, provides goals and instructions, and delineates what is expected from the learner in terms of performance. Next, the learner conducts a patient interview and physical assessment with the virtual patient, engages in therapeutic and non-judgmental communication, documents findings, and applies clinical reasoning skills to develop nursing diagnoses, care plans or interventions relevant to the scenario (e.g., administer medications, write a prescription or conduct a mental status exam). Upon completion of the patient exam in each DCE simulation, the learner is presented either with self-reflection prompts or with a structured debrief in which they can revisit actions and decisions taken throughout the simulation as well as reflect on how they could improve in future patient interactions.

After submitting their attempt to their instructor for review, the learner is automatically scored on their clinical reasoning. Shadow Health’s team of instructional designers, psychometricians, nurse educators and computer scientists have collaborated with educators to develop the scoring for each DCE simulation. This development process includes rigorous discovery, design, construction, pilot testing and psychometric evaluation of each instrument so that it is aligned to the learning objectives and target learner population of each DCE simulation.
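
Shadow Health's actual scoring instruments are developed and psychometrically validated as described above and are not reproduced here. Purely as a toy illustration of automatically scoring documented findings against an expected list, a sketch might look as follows; all findings are hypothetical.

```python
def score_documentation(documented, expected):
    """Hypothetical rubric-style scoring: fraction of expected clinical
    findings that the learner documented during the encounter."""
    documented_lower = {f.lower() for f in documented}
    found = [f for f in expected if f.lower() in documented_lower]
    missed = [f for f in expected if f.lower() not in documented_lower]
    return len(found) / len(expected), missed

score, missed = score_documentation(
    documented=["shortness of breath", "wheezing"],
    expected=["Shortness of breath", "Wheezing", "Chest tightness"])
print(f"{score:.0%} of expected findings documented; missed: {missed}")
```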

Shadow Health® DCE is addressing evolving nursing education needs. The landscape of nursing education allows for increased innovation and technological advancement in education programs. Students who have grown up as digital natives embrace the use of technology in their training programs. Nursing faculty have recognized the impact that the technology has on the learning potential of their students.

With the development of technology delivering virtual patients, faculty time can be devoted to translating the virtual patient experience into clinically relevant applications instead of developing, implementing, debriefing and evaluating the simulation experience. Faculty can also be assured that their students are participating in a standardized experience. The integration of virtual patient experiences has allowed faculty to see how their students develop communication skills and clinical reasoning throughout a course.

Virtual patients are one of a growing number of virtual reality technologies that are transitioning from research to commercial products that impact our daily lives. So the next time you interact with a nurse or physician, you will know that your healthcare provider has likely practiced and improved their interpersonal skills with a virtual human.

9.11 Embodied Social XR for Teaching, Learning and Therapy

The Breaking Bad Behaviors (BBB) system utilizes the power of embodied social VR to teach and test classroom management skills with student teachers (Latoschik et al. 2016). The system simulates individual and group behavior through a parameterized AI-based model. The model includes typical patterns of student behaviors and their dynamic development, derived from real classroom situations. Users can then slip into a teacher’s role in front of a simulated class and experience different, even critical, situations in a realistic way. BBB lets them try out and reflect on suitable response strategies alone or in groups and acquire important media skills in the process. Figure 9.18 shows snapshots from real-life use; the system has been employed for several years in the teacher training program at the Julius-Maximilians-Universität Würzburg. Initial empirical findings show significant advantages compared to the previous gold standard.
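
The published BBB model is considerably richer and grounded in observed classroom behavior patterns; the following Python sketch only illustrates the general idea of a parameterized, dynamically evolving behavior model in which each simulated student becomes more likely to act out over time and calms down after a teacher intervention. All parameter values are hypothetical.

```python
import random

class VirtualStudent:
    """Toy parameterized student: the probability of disruptive behavior
    grows slowly over time and drops when the teacher intervenes."""

    def __init__(self, restlessness=0.02, response_to_intervention=0.5):
        self.restlessness = restlessness          # growth per time step
        self.response = response_to_intervention  # damping after intervention
        self.p_disrupt = 0.05                     # initial disruption probability

    def step(self, teacher_intervened):
        if teacher_intervened:
            self.p_disrupt *= self.response
        else:
            self.p_disrupt = min(1.0, self.p_disrupt + self.restlessness)
        return random.random() < self.p_disrupt   # disruptive in this step?

student = VirtualStudent()
for t in range(5):
    print(t, student.step(teacher_intervened=(t == 3)))
```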

Fig. 9.18

Virtual training of classroom management skills in the Breaking Bad Behaviors project, winner of the 2018 FraMediale award. Left: a user within a virtual class of AI-simulated virtual agents. Right: a student teacher discusses her classroom management experiences with fellow students, showing her first-person view. (© ME Latoschik et al. All rights reserved)

In principle, Virtual Reality has the power to release us from the need to physically meet at the same places and times and thus significantly increase the potential for participation. Virtual agents can realize group experiences for individuals at any time. At the same time, the virtual worlds can include and support the ever-increasing volume of digital data, multimedia content, and information required by almost every aspect of collaborative knowledge work, specifically in the domain of learning and teaching.

The project ViLeArn explores teaching and learning with avatars and agents in immersive social VR (Latoschik et al. 2019). ViLeArn preserves the diversity of embodied interpersonal communication for digital teaching. For example, a heterogeneous group of avatars with non-uniform representations (see Fig. 9.19, left) does provoke some eeriness but also increases the perceived possibility of interaction. In this context, an immersive, realistic, personalized embodiment increases body ownership, presence and emotional response (Waltemate et al. 2018). Moreover, non-verbal communication signals such as gestures, facial expressions or gaze and eye contact are important mediators of, for example, our intentions (Roth et al. 2018). These are important factors, especially for the intended collaborative learning progress.

Fig. 9.19

Left: A virtual classroom with differing embodiments of participants during work in small groups. Right: Discussion in front of an interactive screen. The personalized photorealistic avatars maintain important non-verbal communication cues while providing a shared spatial reference system for communication. (© ME Latoschik et al. All rights reserved)

The work on ViLeArn has contributed, among other things, to the first non-commercial German social VR platform that supports a wide range of avatar embodiments up to photorealistic avatars. The system provides access to multimedia and text-based teaching/learning content: it supports a markdown-to-HTML5 processing pipeline and integrates personal and shared virtual large-screen interactive HTML5 panels. It also supports necessary functions for text input and sketch creation. The platform is largely independent of big IT service providers and also takes into account important data protection and privacy issues.
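
The details of ViLeArn's content pipeline are not given here; as a rough illustration of the markdown-to-HTML5 step that feeds such virtual panels, the following Python sketch uses the third-party markdown package, with hypothetical lesson text.

```python
# Requires the third-party "markdown" package (pip install markdown).
import markdown

lesson_md = """
# Embodied Social VR
- Avatars preserve non-verbal cues
- Shared panels display *teaching content*
"""

# Convert the markdown source to HTML that a virtual panel can render.
lesson_html = markdown.markdown(lesson_md)
print(lesson_html)
```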

In general, avatars are our digital replicas in virtual worlds. The acceptance of virtual bodies as our own is called the Virtual Body Ownership (VBO) illusion. The VBO illusion is significantly determined by three different factors. These are (a) the perception and acceptance of the virtual body as our own body and thus as the source of sensory input (body ownership), (b) the perception of control over the virtual body and thus control over actions taken in the environment (agency), and (c) the change in the perceived body schema evoked by the stimulation (change). Figure 9.20 illustrates these three factors. A VBO illusion, in turn, is one of the central initiators and promoters of the Proteus effect (Yee et al. 2009). The Proteus effect describes a change in behavior induced in the user/wearer of the avatar solely by the appearance of the virtual body and the properties the user/wearer associates with this body.

Fig. 9.20

Illustration of the three identified embodiment factors (from left to right): body ownership (a), agency (b) and change (c). The user appears in gray, the avatar in orange. Illustration after Roth and Latoschik (2020). (© ME Latoschik et al. All rights reserved)

The plasticity of one’s own body schema opens up far-reaching possibilities for therapies, e.g., in the treatment of chronic pain, or of eating disorders such as obesity and anorexia, which, in indicated cases, also correlate with a disturbance of the body schema. The goals of the project ViTraS (Virtual Reality Therapy through Stimulation of Modulated Body Perception) are the development of the necessary avatar technologies and the design of appropriate therapy concepts. ViTraS utilizes the plasticity of one’s own body schema for therapeutic interventions to help patients who suffer from obesity. The project explores different approaches from the wide spectrum and design space of XR-based therapies, including interactive sketch systems, social VR group therapies, and mirror exposures, as shown in Fig. 9.21.

Fig. 9.21

Mirror confrontation with the digital self. Left: Illustrating the consequences of obesity by looking into one’s virtual body. Right: A user testing a mirror therapy with a modified (heavier) avatar. The overlay shows the user from outside VR, surrounded by a camera-based tracking system. (© ME Latoschik et al. All rights reserved)

The application scenario of the ViTraS project combines new methods for virtual embodiment, self-(mis-)perception, and faithful avatar reconstruction and its manipulation using digital XR-based interventions. Among other things, the developed solutions increase participation, as they also support distributed therapies for the rampant worldwide health problem of eating disorders, especially obesity, which has far-reaching negative individual as well as overall social and economic consequences. The project strongly demonstrates the great potential of embodiment, especially embodied XR with photorealistic avatars.

The avatars for XR-assisted therapy are created via an optimized photogrammetry-based approach (Wenninger et al. 2020). The method combines 3D reconstruction of geometry and textures with an automated rigging process. As a result, personalized, fully animated, photorealistic virtual replicas of a user’s body are created within a few minutes (see Fig. 9.22). These avatars can then instantly be used with common XR platforms (e.g., Unity 3D). Therapeutically, they can be used to realistically modify and simulate body proportions at the push of a button (or the change of a slider). The avatars in Fig. 9.19 were created by the same process. Personalization and photorealism are important for increasing the efficacy of XR exposures and of the therapeutic interventions. The accompanying user studies identified personalization and photorealism as strong promoters, especially of the VBO illusion and of other important XR factors such as presence, acceptance and emotional response (Waltemate et al. 2018).
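
The slider-driven modification of body proportions can be thought of as a blend-shape style interpolation between the reconstructed body and a modified target shape. The sketch below is only an assumed illustration of that idea, not the Wenninger et al. (2020) pipeline, and the vertex data are hypothetical.

```python
import numpy as np

def morph_body_shape(base_vertices, target_vertices, slider):
    """Blend-shape style morph: interpolate each vertex between the
    reconstructed body (slider = 0) and a modified target shape
    (slider = 1), e.g., a heavier or lighter version of the avatar."""
    base = np.asarray(base_vertices, dtype=float)
    target = np.asarray(target_vertices, dtype=float)
    return (1.0 - slider) * base + slider * target

# Two hypothetical vertices of a reconstructed mesh and a "heavier" target.
base = np.array([[0.00, 1.0, 0.10], [0.20, 1.0, 0.10]])
heavier = np.array([[0.00, 1.0, 0.15], [0.25, 1.0, 0.15]])
print(morph_body_shape(base, heavier, slider=0.5))
```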

Fig. 9.22

Photogrammetry system at the Chair of HCI at the University of Würzburg with about 100 SLR cameras for photorealistic 3D reconstruction of user avatars. Left: The multi-camera system that was initially used. Center: A user during the 3D scan process. Right: The result of the reconstructed avatar in a virtual scene. Figure adapted from Latoschik et al. (2019). (© ME Latoschik et al. All rights reserved)

9.12 Virtual Reality for Teaching Literacy to Prisoners

Numeracy and literacy skills are very low in corrections facilities around the world – New Zealand not being an exception. A large proportion of prisoners are illiterate to a degree that their reading skills do not allow them to participate in normal social life, e.g., being able to comprehend job advertisements or to write a job application. Hence, when released from prison they often cannot reintegrate successfully into society and the chances are that they will end up in criminal activity again. This negative cycle can be broken by, for example, giving prisoners better opportunities to learn how to read and write.

While in prison, prisoners’ motivation to learn is usually much lower than that of average people outside prison – for many, complex reasons. Classes in literacy are offered within the prison, but in rather traditional classroom settings, i.e., front-of-class teaching using standard literacy teaching methods. For some prisoners, those settings have positive effects, but many drop out of classes or do not fully engage in learning. The question arises: How can we motivate and engage prisoners in literacy learning? Immersive Virtual Reality (VR) might be one promising vehicle for this – at least it is new and potentially exciting for a number of prisoners; for many it is probably their first encounter with such technology.

The Methodist Mission South, a provider of learning services to our local corrections facility, approached us at the Otago University Human-Computer Interaction (HCI) lab about developing a VR system that can be used for literacy training with prisoners. This task is not without challenges (McLauchlan and Farley 2019): Which technology can be used within a prison? Which virtual environment is exciting and motivating enough to carry the literacy learning task? How to test and evaluate solutions and how to bring them sustainably into the prison environment? We addressed all of those challenges and developed a prototypical system, the “Virtual Mechanic”, which was tested in a lab and in the prison environment, and handed over to a commercial partner for product development and market introduction (Collins et al. 2020).

For inherent reasons, corrections facilities are closed off from the rest of society. Being allowed to bring a VR system comprising a head-mounted display, a high-end computer and the necessary wiring and peripherals into such a facility requires a huge amount of willingness, motivation and constructive cooperation from corrections facility staff. The primary concerns of staff include the potential for outside communication, access to unmediated content and any other type of unauthorized behavior that could be facilitated by the technology. Prisoners are highly creative when it comes to exploiting the materials around them for their own purposes; therefore, what comes into and out of the facility is highly regulated.

Because our main focus was on how to motivate and engage prisoners in literacy learning, we tried to develop a virtual environment which aligns with the existing interests of prisoners. We learned that a common interest amongst prisoners is automotive engineering and cars in general. We selected this common interest as our context and built an environment that simulates a car workshop. We took 360° panoramic photos of an existing car workshop, stitched them together, and used the result as a background (Fig. 9.23, left), complemented by ambient workshop noise. We explored other environments as a context for learning; however, the other most common interests identified among prisoners were not ethically viable.
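
When a stitched 360° panorama is used as a background, the rendering engine typically maps the equirectangular image onto the inside of a sphere or skybox. The small Python sketch below shows the underlying lookup from a viewing direction to panorama texture coordinates, with an assumed axis convention that need not match the one used in the Virtual Mechanic.

```python
import math

def direction_to_equirect_uv(x, y, z):
    """Map a normalized 3D viewing direction to (u, v) texture coordinates
    of an equirectangular (360-degree) panorama. u runs around the horizon
    (longitude), v from bottom to top (latitude)."""
    lon = math.atan2(x, -z)                    # angle around the vertical axis
    lat = math.asin(max(-1.0, min(1.0, y)))    # elevation angle
    u = 0.5 + lon / (2.0 * math.pi)
    v = 0.5 + lat / math.pi
    return u, v

# Looking straight ahead (along -z) hits the centre of the panorama.
print(direction_to_equirect_uv(0.0, 0.0, -1.0))  # -> (0.5, 0.5)
```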

Fig. 9.23

Virtual environment with panoramic background (left), virtual brake system broken apart, showing (1) the syllabic version of a word as a voice reads it aloud to the user (middle), and (2) an active task in which the user attempts to complete rhyming words (right). Tasks are embedded in the context of the environment. (© H Regenbrecht, J Collins. All rights reserved)

Throughout the stages of development, an Oculus Rift HMD was used as the visual medium. During the prototyping stage, we opted for an Xbox controller combined with gaze-based selection to allow users to interact with the environment. In this way they could explore the different virtual components and activities that were available. In the later commercial development iteration, the Oculus Touch controllers were used to enable interactions with the virtual world. Compared to the prior gaze-based approach combined with an Xbox controller, the Oculus Touch controllers lead to a more embodied experience, as users’ real hand movements are mapped directly into the environment for interaction. This is a more intuitive form of interaction and can therefore lead to higher levels of engagement.
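
Gaze-based selection is commonly implemented by picking the interactable object whose direction deviates least from the user's gaze ray. The following Python sketch illustrates that general idea only, under the assumption of a simple angular threshold; the object names and positions are hypothetical rather than taken from the Virtual Mechanic.

```python
import math
import numpy as np

def gaze_select(head_pos, gaze_dir, objects, max_angle_deg=5.0):
    """Return the object whose direction from the head deviates least
    from the gaze direction, if that deviation is below max_angle_deg."""
    gaze = np.asarray(gaze_dir, dtype=float)
    gaze /= np.linalg.norm(gaze)
    best, best_angle = None, max_angle_deg
    for name, pos in objects.items():
        to_obj = np.asarray(pos, dtype=float) - np.asarray(head_pos, dtype=float)
        to_obj /= np.linalg.norm(to_obj)
        cos_angle = float(np.clip(np.dot(gaze, to_obj), -1.0, 1.0))
        angle = math.degrees(math.acos(cos_angle))
        if angle < best_angle:
            best, best_angle = name, angle
    return best

objects = {"brake_disc": [0.0, 1.2, -1.0], "caliper": [0.4, 1.2, -1.0]}
print(gaze_select([0.0, 1.6, 0.0], [0.0, -0.35, -1.0], objects))  # -> brake_disc
```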

The actual task we chose was to disassemble and reassemble the brakes of a virtual car. To this end, we introduced a virtual car model with detailed parts modeled for the front disc brake, animated so as to reveal the inner structure of the brake step by step. This task was then used as the medium to deliver literacy skills training by giving (interactive) instructions with words. The instructions were given in three different ways: displayed as words next to the parts of the virtual brake, decomposed into syllables, and read aloud by a computer-generated voice (Fig. 9.23, middle). In addition, we also developed some word rhyming exercises as part of the instructions in a multiple-choice, quiz-like style (Fig. 9.23, right).
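
The syllable decompositions themselves may well have been authored by hand; as a rough, assumed illustration of how such a breakdown could be generated automatically, hyphenation dictionaries approximate syllable boundaries, as in the following Python sketch using the third-party pyphen package (the example words are hypothetical).

```python
# Requires the third-party "pyphen" hyphenation package (pip install pyphen).
import pyphen

dic = pyphen.Pyphen(lang="en_US")

def syllabify(word, separator="·"):
    """Approximate a syllable breakdown via hyphenation points, as a stand-in
    for the syllable display shown next to the virtual brake parts."""
    return dic.inserted(word, hyphen=separator)

for word in ["caliper", "rotor", "hydraulic"]:
    print(word, "->", syllabify(word))
```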

Due to the prototypical nature of the application and therefore the lack of actual content during prisoner exposures to date, tangible learning gains have been difficult to evaluate empirically. However, we have gained some insights from our sessions. For instance, trust emerged as an issue with some prisoners, as wearing an HMD meant impeding their view of the real-world environment, which was shared with a small number of other prisoners. Issues arose regarding exposure times, as some prisoners’ attention spans and patience levels are more volatile. We also found that a self-directed lesson approach is desirable, as outside intervention reduces a user’s momentum and presence in/engagement with the system. The project is currently in the hands of the commercial sector, where it is in continued development. Hopefully, more robust evaluations of the application’s educational impact will be conducted soon and can eventually lead to wider dissemination.

The entire process of research and development of this prototype application has been a very enlightening exercise for us. Everyone involved saw this as a clear step forward, especially the prisoners themselves. Virtual Reality carries a lot of potential for delivering training in these kinds of challenging environments. Despite the current lack of content and the inability to robustly measure learning outcomes, collectively we could show that implementing VR-based, contextual learning applications in a prison can be done. The idea of “piggy-backing” a less engaging task, here literacy training, on a more exciting and motivating task, here immersive VR car maintenance, seems to work well. Whether this approach will lead to transferable results for the prisoners when they leave prison remains to be shown. VR has the potential to make a real difference here.