1 Introduction

The way information is presented to students is evolving to integrate the abilities and advantages of new media. Educators have already replaced blackboards with digital media such as projectors and interactive whiteboards. However, all of these presentation forms remain two-dimensional displays. To convey an understanding of a three-dimensional object, lecturers have had to resort to physical 3D artifacts such as plastic models of organs or actual dissections in anatomy. The use of such artifacts in education poses challenges regarding their availability. In presentations, this leads to difficulties: the lecturer places the object at the front and viewers can only see it from afar or filmed on a two-dimensional projector display. During the COVID-19 pandemic, this challenge is even more prominent in remote education, as students can only view the objects in isolation on a screen in video conferences. Mixed reality (MR) enables new possibilities for conveying 3D content in an immersive and intuitive way. For instance, augmented reality (AR) can have a positive impact on education, such as an increased desire for self-learning, improved memory, and improved spatial understanding [12]. Since AR can embed virtual 3D objects into real-world settings, 3D scans of physical models can be distributed to a large number of students [8]. Students can embed the objects into a real environment and inspect them from multiple angles by walking around them, gaining an understanding of their true scale and spatial structure. With the introduction of new MR technologies like the Microsoft HoloLens and software libraries like ARCore and ARKit, MR experiences are becoming more widely available. The mobile libraries in particular enable students to view MR content on their own smartphones, paving the way for widespread use in education. This allows them to view the content anywhere and at any time, fitting into their schedules.

In this paper, we elaborate on the concept and realization of an MR presentation framework. It is a general-purpose immersive 3D presentation system for lecturers and students. With its collaborative features, synchronous co-located and remote presentations can be supported and mediated. It presents a cross-platform, cross-device approach that combines the advantages of 2D slide editors with the exact 3D placement options on the HoloLens and the wide availability of smartphones as AR viewers.

The remainder of this paper is structured as follows. In Sect. 2, we investigate related approaches for MR presentations. After that, we describe our concept for combining the different media for presentations in Sect. 3. Section 4 highlights implementation details and the resulting architecture of the application. With this implementation, we conducted evaluations and the results are laid out in Sect. 5. These results are discussed in Sect. 6. Finally, the paper closes with a conclusion and an outlook on future work in Sect. 7.

2 Related Work

Our work can be classified on Milgram and Kishino’s MR continuum in the AR range [10]. The continuum describes a spectrum between the real world and virtual reality (VR). Apart from AR, augmented virtuality (AV) defines a second intermediate form. AR and AV are differentiated by the ratio of real objects to virtual elements: in AR, the real world is predominant with some virtual objects integrated into it; for AV experiences, this ratio is reversed.

In the related work, a series of approaches can be found where augmented reality enhances storytelling and conveys information in presentations. Here, static information systems that augment objects can be distinguished from systems that support a presenter. For instance, Saquib et al. created a video-based presentation system where 2D virtual content can be embedded into a video feed [13]. Using a Kinect camera, the presenter can interact with the virtual content at previously set up interaction points. This allows the presenter, e.g., to carry a virtual element around or to control a visualization.

Information presentations that do not require a speaker can be found, e.g., in museums and exhibitions. For instance, Sommerauer and Müller showed in 2014, in an AR-supported mathematics exhibition, that teaching experiences can be improved with AR-compatible smartphones [14]. Their results showed that the AR-augmented parts of the exhibition were better understood by visitors. The effects and possible applications of AR have also been investigated specifically for higher education. This includes, for example, a textbook that was supplemented with 3D content that can be displayed in AR [1]. Alrashidi et al. found that the groups that received AR support outperformed the other groups [2].

The success of these specialized applications has led to research into generalized cross-discipline systems. Another influence here can be seen in the work of Karsten et al., who have shown that the best learning outcomes can be achieved through a combination of traditional learning practices and AR [9]. In 2018, Antoun et al. created the SlidAR system, which allows lecturers to add AR content to their slides [3]. Students can then view this AR content by scanning the slide with a mobile app. Results showed that both students and professors would like to use the system. The work also shows that the 3D content for the slides needs to be intuitive to set up and place in the 3D environment.

The related work shows that previous concepts for 3D presentations either focus on replacing traditional techniques with AR 3D content or depend on reference points like slides or physical objects. We did not encounter a related approach that is independent of physical markers on the slides but still integrates slides into the AR content. Our approach has the advantage that it allows teachers to reuse existing presentations and adds a layer of 3D AR content to them.

3 3D Presentation Concept

Our 3D presentation approach combines the advantages of immersive scenes and 2D presentations. To this end, we have chosen the following structure for our presentation framework. A presentation consists of stages that are arranged sequentially. These give the presentation a structure similar to the slides of a 2D presentation. Each stage can consist of three presentation elements. First, there is the canvas, which displays 2D content. This element can show existing slides, e.g. imported from traditional 2D presentations. In addition, a stage can contain a scene and a handout. These two elements can each contain any number of 3D objects that are positioned relative to a starting point. Scenes and handouts are distinguished by the way the 3D objects are displayed to the participants and how participants can interact with them. Objects in the scene are synchronized in time and space for all participants in a presentation. This allows the presenter to refer to the 3D objects while the entire audience sees them at the same position. The objects from the handout are distributed locally to each participant in a presentation. Thus, each member of the audience can inspect the objects on their own by moving, rotating, and scaling them. The handout integrates interactivity into the presentation model, as students are motivated to explore the provided 3D objects on their own in their private space.
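
The following minimal sketch illustrates how this stage-based structure could be modeled; all class and property names are our own assumptions for illustration and do not mirror the actual ImPres source code.

```csharp
// Hypothetical model of the stage-based presentation structure described above.
using System.Collections.Generic;
using System.Numerics;

public class Presentation
{
    public string Title { get; set; }
    // Stages are ordered sequentially, similar to slides in a 2D presentation.
    public List<Stage> Stages { get; } = new List<Stage>();
}

public class Stage
{
    // 2D canvas content, e.g. a slide imported from a PDF.
    public string CanvasImagePath { get; set; }
    // Scene objects: synchronized in time and space for all participants.
    public List<PlacedObject> Scene { get; } = new List<PlacedObject>();
    // Handout objects: distributed locally so each participant can
    // move, rotate, and scale their own copy.
    public List<PlacedObject> Handout { get; } = new List<PlacedObject>();
}

public class PlacedObject
{
    public string ModelFile { get; set; }
    // Pose relative to the stage's starting point (the spatial anchor).
    public Vector3 Position { get; set; }
    public Quaternion Rotation { get; set; }
    public Vector3 Scale { get; set; } = Vector3.One;
}
```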

4 Implemented Presentation System

To gather user experiences and evaluate the concept, we implemented the presentation system ImPres. The resulting implementations are available on GitHub under an open-source license.

4.1 System Architecture

The system consists of five parts, each of which is assigned a separate task. They are illustrated in Fig. 1.

Fig. 1. The system architecture of the ImPres system.

The first element of the system is a 2D editor that runs on desktop PCs and was created using the Windows Presentation Foundation (WPF) framework. Complementing this 2D basis, we also implemented a 3D editor in the Unity 3D engine using Microsoft’s open-source Mixed Reality Toolkit (MRTK). The visualization of 2D presentation content is handled by the 2D editor, while the 3D content is visualized in AR by the 3D editor on smartphones using ARCore and on the Microsoft HoloLens. The 3D editor also contains a viewer mode which can, e.g., be used by students to follow presentations. Both the 2D editor and the 3D editor communicate with a backend coordinator. It consists of a Node.js server which, e.g., stores created presentations. Moreover, it administers a login system to keep track of users and their activities. Apart from the built-in login system, the presentation framework also supports OpenID Connect login. Access rights are granted via the backend, with which all system parts communicate via a RESTful API. The synchronization of presentations comprises two primary tasks, which are handled by two services. The temporal synchronization, which ensures that all clients share the same state of the presentation in real-time, is implemented with the Photon engine. We integrated Photon synchronization both in our desktop 2D editor and the MR 3D editor to allow them to communicate with each other. For the spatial synchronization in the 3D editor and viewer, Azure Spatial Anchors are used.
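
As an illustration of this architecture, the following sketch shows how a client could interact with the backend coordinator’s RESTful API. The endpoint paths, the placeholder URL, and the BackendClient class are assumptions for illustration only; the actual ImPres routes may differ.

```csharp
// Minimal sketch of a client for the backend coordinator's RESTful API.
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

public class BackendClient
{
    private readonly HttpClient http = new HttpClient
    {
        BaseAddress = new Uri("https://impres-backend.example.com/")  // placeholder URL
    };

    public BackendClient(string accessToken)
    {
        // Token obtained via the built-in login system or OpenID Connect.
        http.DefaultRequestHeaders.Authorization =
            new AuthenticationHeaderValue("Bearer", accessToken);
    }

    // Store a presentation (serialized as JSON) in the backend's database.
    public async Task SavePresentationAsync(string id, string presentationJson)
    {
        var content = new StringContent(presentationJson, Encoding.UTF8, "application/json");
        var response = await http.PutAsync($"api/presentations/{id}", content);
        response.EnsureSuccessStatusCode();
    }

    // Load a presentation so the 2D or 3D editor can display it.
    public async Task<string> LoadPresentationAsync(string id)
    {
        return await http.GetStringAsync($"api/presentations/{id}");
    }
}
```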

4.2 2D Editor

The standard workflow for creating AR presentations starts in the 2D editor, where the presenter sets up the slides and defines which 3D content is related to each slide. First, the presenter has to log in using a system-specific account or an OpenID Connect account. The structure of the user interface of the 2D editor is shown in Fig. 2. In the large main view, text and images can be added to the slide. The slide stack on the left allows navigating through the set of slides. To support existing presentation slides, the 2D editor provides a PDF import option for externally generated LaTeX slides. Alternatively, PowerPoint presentations of existing lecture slides can be exported to PDF and then imported as a basis for the 3D presentations. 3D elements can be added to a presentation by dragging and dropping a 3D model file into the handout or scene panel. While it is possible to define the position of 3D models already in the 2D editor by specifying numeric coordinates, it is more intuitive to switch to the 3D editor at this point to position the model in the 3D environment. To do this, the presentation must first be saved in the 2D editor, which transfers the data to the backend where it is stored in the database.
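
A drag-and-drop handler for 3D model files in a WPF editor could look like the following sketch. The panel wiring, the accepted file extensions, and the AddModelToScene helper are illustrative assumptions.

```csharp
// Sketch of handling drag-and-drop of a 3D model file onto the scene or
// handout panel in a WPF editor.
using System.IO;
using System.Windows;

public partial class EditorWindow : Window
{
    // Wired up in XAML, e.g.: <Border AllowDrop="True" Drop="ScenePanel_Drop" ... />
    private void ScenePanel_Drop(object sender, DragEventArgs e)
    {
        if (!e.Data.GetDataPresent(DataFormats.FileDrop))
            return;

        foreach (string file in (string[])e.Data.GetData(DataFormats.FileDrop))
        {
            // Accept common 3D model formats; the exact set is an assumption.
            string ext = Path.GetExtension(file).ToLowerInvariant();
            if (ext == ".glb" || ext == ".gltf" || ext == ".obj" || ext == ".fbx")
                AddModelToScene(file);  // hypothetical helper that registers the model
        }
    }

    private void AddModelToScene(string file) { /* ... */ }
}
```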

Fig. 2. The 2D editor of the ImPres system.

4.3 3D Editor

The presenter can log in to the 3D editor with the same account, as shown in Fig. 3 on the left. Based on the account, the presenter can access the previously created presentations via the backend coordinator. In the 3D view, the slide is presented on a 2D canvas in space and the associated 3D models are placed in the environment, as can be seen in Fig. 3 in the middle. The presenter can now proceed with the creation process by positioning, rotating, and scaling the 3D models precisely. Using a menu, the presenter can navigate through the different stages and thereby edit the 3D content for each slide. The spatial anchor of the scene is represented by a three-dimensional model of an X, shown in Fig. 3 on the right. The objects that belong to the scene are positioned relative to this anchor point. The anchor associated with the presentation ensures that the presentation content always appears at the defined position in space and with the given orientation and size.
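
In Unity, anchor-relative positioning of this kind typically amounts to working in the anchor transform’s local space, as in the following sketch. The class and method names are our own illustrations.

```csharp
// Sketch of positioning scene objects relative to the spatial anchor.
// Storing poses in the anchor's local space keeps them stable across
// sessions and devices.
using UnityEngine;

public class AnchoredScene : MonoBehaviour
{
    // The Transform at the location of the spatial anchor (the "X" model).
    public Transform anchor;

    // Place a model using a pose that was stored relative to the anchor.
    public void PlaceRelativeToAnchor(GameObject model,
                                      Vector3 localPosition,
                                      Quaternion localRotation,
                                      Vector3 localScale)
    {
        model.transform.SetParent(anchor, worldPositionStays: false);
        model.transform.localPosition = localPosition;
        model.transform.localRotation = localRotation;
        model.transform.localScale = localScale;
    }

    // Convert a freely placed object's world pose back into anchor-relative
    // coordinates so it can be saved with the presentation.
    public (Vector3 pos, Quaternion rot) ToAnchorSpace(Transform obj)
    {
        return (anchor.InverseTransformPoint(obj.position),
                Quaternion.Inverse(anchor.rotation) * obj.rotation);
    }
}
```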

Fig. 3. The 3D editor of the ImPres system. Left: login menu. Middle: editor mode. Right: presentation mode.

4.4 Conducting Presentations Using the System

The presentation can be started either from the 2D editor or the 3D editor. In a co-located setting, slides from the 2D editor can be projected onto a wall just like traditional PowerPoint presentations. At startup, a short numeric code is displayed, which allows students to join the presentation via the 3D editor on their smartphones. The 3D editor client then automatically connects to the Photon engine service. Through this service, the state of the presentation is shared in real-time with all participants of a presentation. In this scenario, we use the Photon engine as a 1-to-n communication channel: only the presenter, as the master client, is allowed to broadcast status updates and switch between the stages of the presentation. Through this temporal synchronization, the teacher’s auditory explanations match the current visual impressions for all students. In order to establish a common spatial understanding, where the presenter can walk up to objects and point at them, a spatial synchronization was added to the system. At the start of the presentation, this feature first asks the user to scan the room in the 3D editor by walking around. On the HoloLens, the built-in tracking system automatically creates a spatial scan. On smartphones, a visual space reconstruction takes place based on the camera feed. Independent of the device used, a new spatial anchor can then be created based on the reference points in the established spatial scan. The anchor is stored in the Azure Spatial Anchors service and shared with the students in real-time via the Photon service. Azure Spatial Anchors allows an anchor to be compatible with a variety of devices, which enables broader accessibility of the system. This spatial anchor forms a coordinate system that is firmly anchored in space. Students who load the spatial anchor then have the same coordinate system at their disposal as the teacher. This ensures spatial synchronicity, so that all 3D objects in the scene are visible at the same place for all participants.
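
To make the 1-to-n channel concrete, the following sketch shows how a stage change could be broadcast with Photon (PUN 2). The event code, the StageSync class, and the ShowStage method are our own illustrative assumptions; the actual ImPres implementation may be structured differently.

```csharp
// Sketch of the 1-to-n temporal synchronization using Photon (PUN 2):
// only the master client (the presenter) broadcasts stage changes, and
// all viewers apply them.
using ExitGames.Client.Photon;
using Photon.Pun;
using Photon.Realtime;

public class StageSync : MonoBehaviourPunCallbacks, IOnEventCallback
{
    private const byte StageChangedEvent = 1;  // application-defined event code

    // Called by the presenter's UI when navigating to another stage.
    public void GoToStage(int stageIndex)
    {
        if (!PhotonNetwork.IsMasterClient)
            return;  // only the presenter may switch stages

        var options = new RaiseEventOptions { Receivers = ReceiverGroup.Others };
        PhotonNetwork.RaiseEvent(StageChangedEvent, stageIndex,
                                 options, SendOptions.SendReliable);
        ShowStage(stageIndex);
    }

    // Every other participant receives the event and updates the local view.
    public void OnEvent(EventData photonEvent)
    {
        if (photonEvent.Code == StageChangedEvent)
            ShowStage((int)photonEvent.CustomData);
    }

    private void ShowStage(int stageIndex) { /* load canvas, scene, handout */ }
}
```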

ImPres also supports remote presentations, which are especially important during the COVID-19 pandemic. Here, remote participants can establish their own spatial anchor for their local room. In addition, students can activate the canvas in their 3D viewer in order to see the associated slides directly in AR, as shown in Fig. 3 on the right. This way, remote learners can follow the entire presentation on their own smartphones.
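
The spatial side can be sketched with the Azure Spatial Anchors SDK following its general usage pattern, whether an anchor is uploaded by the presenter or established by a remote participant for their own room. The AnchorSharing wrapper and its method names are assumptions for illustration; session configuration and error handling are omitted.

```csharp
// Sketch of sharing a coordinate system with Azure Spatial Anchors (ASA).
// A device uploads an anchor and distributes its identifier (via Photon in
// ImPres); other participants locate it to align their coordinate systems.
using System.Threading.Tasks;
using Microsoft.Azure.SpatialAnchors;

public class AnchorSharing
{
    private readonly CloudSpatialAnchorSession session;

    public AnchorSharing(CloudSpatialAnchorSession configuredSession)
    {
        session = configuredSession;  // assumed to be configured and started
    }

    // Presenter side: upload a local anchor once the spatial scan has
    // collected enough reference points, then share the returned identifier.
    public async Task<string> PublishAnchorAsync(System.IntPtr localAnchorHandle)
    {
        var cloudAnchor = new CloudSpatialAnchor { LocalAnchor = localAnchorHandle };
        await session.CreateAnchorAsync(cloudAnchor);
        return cloudAnchor.Identifier;
    }

    // Participant side: watch for the shared anchor; once located, the
    // device shares the presenter's coordinate system.
    public void LocateAnchor(string anchorId)
    {
        session.AnchorLocated += (sender, args) =>
        {
            if (args.Status == LocateAnchorStatus.Located)
            {
                // Attach the presentation's root object to args.Anchor here.
            }
        };
        var criteria = new AnchorLocateCriteria { Identifiers = new[] { anchorId } };
        session.CreateWatcher(criteria);
    }
}
```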

5 Evaluation

We conducted three evaluations at different stages of the project, following the “iterative cycle of human-centered design” [11]. In this method, user evaluations are carried out in the smallest possible iterations. The fidelity of the prototypes increases with each iteration, and the feedback from each iteration serves as input for improvements to the next prototype. We performed two iterations to collect user impressions, first using a paper prototype and then, in the second user evaluation, the implemented software prototype of ImPres. Finally, a technical evaluation was conducted.

5.1 Paper Prototype

The paper prototype was created directly after the initial concept for immersive 3D group presentations was ideated. It consists of paper elements that represent both the 2D and the 3D user interface elements. In a Wizard of Oz experiment, we simulated the behavior and functionality by positioning the paper elements in the room according to the user’s inputs [6].

We limited our paper prototype vertically, restricting ourselves to the 3D editor functionalities. We chose this limitation since the layout of the 2D editor follows other slide creation tools and is therefore already well-known to users. Moreover, the main goal of the evaluation was to inspect the concept of the 3D presentation. The paper prototype was used in a user evaluation with five users, only one of whom had previous experience with MR applications. During the evaluation, the users were given two tasks in succession. First, a prepared presentation was to be opened and presented. The goal of this task was primarily to observe how the users interact with the given menu elements to control the presentation. In the second task, the users were asked to enter the edit mode. Here, they should navigate to a specific slide in the presentation and rotate an object. The goal of this task was to evaluate whether the planned interactions for placing 3D content could be used intuitively. During the evaluation, we observed the users and noted their steps. We also obtained additional insights into the users’ thoughts by letting them fill out a qualitative questionnaire afterward. The questionnaire contained the following questions:

  • Was there a time during your use when you felt uncertain about how to perform the task? If so, then please tell me about that situation.

  • What aspects of the user interface do you remember positively?

  • What aspects of the user interface do you remember negatively?

  • What improvements do you want to see in future versions?

The most prominent result of the paper prototype evaluation was that all users understood the structure of the 3D presentations quickly. According to their statements, they already felt confident presenting and dealing with 3D content in the presentation. The placement method was also evaluated positively by the users, and the concept was therefore retained in the development of the ImPres system. The aspect that bothered the users the most was the input of text to sign in or to join a presentation. For the software prototype, we therefore focused on reducing text input to a minimum. Presentations can now be joined with a short numeric code and it is no longer necessary to enter the complete presentation name and a password. Moreover, this finding validated the importance of the 2D editor in the system, as users can type the text for the 2D slides on conventional desktop keyboards. In addition, the paper prototype evaluation revealed several user interface improvements that streamlined the menu structure. Since these improvement requests surfaced so early, we were able to integrate them directly into the software prototype.

5.2 Software Prototype

The software prototype is a fully functional software solution that realizes all features of the system described in Sect. 4. We evaluated the resulting ImPres system with 16 students. Since the underlying concept had already been examined in the previous evaluation, the focus of this evaluation was on usability. Users were able to try out the 2D and 3D editors in order to create new presentations and to view existing ones. To enable comparability with other systems, we used two questionnaires. The System Usability Scale (SUS), created by Brooke, provides a score with which the usability of a system can be quantified [5]. The score does not allow a linear comparison, since a doubling of the score does not correspond to a doubling of the usability, but Bangor et al. have divided the scale into blocks with adjective ratings [4]. Averaging the computed scores, the evaluated software prototype of the ImPres system achieved an SUS score of 86.5, which indicates good usability.
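
For reference, the standard SUS scoring procedure maps ten 1-to-5 responses to a 0-to-100 score, as in the following sketch. This is the general SUS formula, not code from ImPres.

```csharp
// Computing a SUS score from one participant's responses.
// Odd-numbered items are positively worded, even-numbered items negatively.
public static class Sus
{
    public static double Score(int[] responses)  // responses[0..9], values 1..5
    {
        int sum = 0;
        for (int i = 0; i < 10; i++)
        {
            // Items 1,3,5,7,9 (even index): contribution = response - 1.
            // Items 2,4,6,8,10 (odd index): contribution = 5 - response.
            sum += (i % 2 == 0) ? responses[i] - 1 : 5 - responses[i];
        }
        return sum * 2.5;  // scale the 0-40 raw sum to 0-100
    }
}
```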

In order to gain detailed insights into how demanding the system is, we also used the NASA Task Load Index (NASA-TLX) questionnaire [7]. In addition to an overall score, the NASA-TLX provides individual ratings for the categories “Mental Demand”, “Physical Demand”, “Temporal Demand”, “Overall Performance”, “Effort”, and “Frustration Level”. The participants were asked to rate the workload in each category on a scale from 0 to 100 in steps of 5 after performing a given set of tasks. These tasks comprised editing a presentation by adding new 3D objects and changing their position and scale. Participants were also asked to hold a presentation and to join an existing one held by the instructor using the ImPres system. After performing the tasks, they had to rank the NASA-TLX categories in terms of their importance. A category received five points each time it was selected as the most important and one point less for each position behind the first, so that a category received zero points if it was considered least important. By normalizing these scores, we obtained the weight for each category. The scores and weights for the individual dimensions of the questionnaire are laid out in Table 1.

Table 1. The individual scores and their weights in the NASA-TLX questionnaire evaluation.

We observed that the perceived workload was lowest for the physical demand, directly followed by the temporal demand. Combined with the low weight of the physical demand, this shows that participants were not bothered by holding a smartphone or wearing a Microsoft HoloLens. The low temporal demand is a good indication that lecturers are able to create an immersive presentation in a time-efficient manner with ImPres. The weight of 0.21 for the temporal demand shows that users perceive the time perspective as crucial. Noticeable is the relatively high value of 44 for the effort dimension, which is markedly higher than the values of the other categories. One possible reason could be that most of the users who participated in the evaluation had not been in contact with MR applications or devices before. Therefore, they had to learn the interaction paradigms of immersive applications while participating in our user study. We believe that this additional learning curve can lead to a higher perceived effort; we plan to investigate this further with future iterations of the ImPres system. Nevertheless, participants assigned a lower weight to this dimension, thereby rating it as less important. The individual category scores were combined into an overall NASA-TLX score of 24.81. Since this score lies in the lower quarter of the scale, which stretches from 0 to 100, it can be deduced that the workload was perceived as fairly low by most users. The NASA-TLX helped us to get a better estimate of the workload, and this value can also serve as a benchmark for future iterations and improvements of the ImPres system.
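
The ranking-based weighting and the aggregation into the overall score can be summarized in a short sketch. This implements the procedure as we describe it above, with illustrative names; note that it is a simplified variant of the original pairwise NASA-TLX weighting.

```csharp
// Worked sketch of the weighting scheme described above: each participant
// ranks the six NASA-TLX categories; a category gets 5 points when ranked
// most important, down to 0 points for least important. Normalizing the
// points yields weights, and the overall score is the weighted sum of the
// 0-100 ratings.
public static class TlxWeighting
{
    // ranking[i] = index of the category placed at position i (0 = most important)
    public static double[] Weights(int[] ranking)
    {
        int n = ranking.Length;                    // 6 categories
        var points = new double[n];
        for (int pos = 0; pos < n; pos++)
            points[ranking[pos]] = (n - 1) - pos;  // 5, 4, ..., 0

        double total = n * (n - 1) / 2.0;          // 15 for six categories
        for (int c = 0; c < n; c++)
            points[c] /= total;                    // weights sum to 1
        return points;
    }

    public static double OverallScore(double[] weights, double[] ratings)
    {
        double score = 0;
        for (int c = 0; c < weights.Length; c++)
            score += weights[c] * ratings[c];      // weighted 0-100 ratings
        return score;
    }
}
```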

Furthermore, we subjected the system to a technical evaluation. We used a computer with 8 GB RAM and a 2.50 GHz Intel(R) Core(TM) i5-6300U processor. The 2D editor runs continuously at 60 frames per second (fps) with a CPU utilization of 5%. Only when saving the presentation does the CPU load briefly increase to about 30%. The 3D editor also runs at a stable 60 fps. The only drops in the framerate can be observed when a presentation is loaded. To improve the user experience here, loading indicators were added to signal to the user that the system is working on a background task. Overall, the technical analysis showed that ImPres provides users with a smooth interaction that supports usability in MR.

6 Discussion

ImPres provides students access to 3D models during presentations. Its remote support in particular has potential for remote teaching: students can use their own smartphones to gain access to interactive 3D content independent of their location. This way, education that used to rely on physical 3D models can be maintained during the pandemic and is enhanced by digital models.

Regarding the conducted studies, the initial focus is on the usability and perceived task load of users. We plan to conduct further studies that take a closer look at other aspects of the system, e.g., its learning effect.

The cross-platform support for the Microsoft HoloLens and smartphones opens up suitable use cases. Since the Microsoft HoloLens is less widely available than smartphones, which students can even provide themselves, the HoloLens is mainly intended for the lecturer. As the lecturer has to author the presentation content, the HoloLens is beneficial because of its intuitive in-air interactions for placing the 3D models. Smartphones only provide touch-screen interactions, which require some practice to master 3D placement. Hence, smartphones predominantly serve as viewers for the audience. Thus, the lecturer can prepare the 3D content, which is then accessible to students anywhere and at any time.

7 Conclusion and Future Work

In this paper, we introduced ImPres, an immersive presentation framework for MR that enhances traditional 2D slides with 3D content. On a conceptual level, we extended the slide-based structure of a presentation to stages. Each stage contains a slide, a 3D scene for displaying spatially anchored 3D objects, and a handout. With the handout, students can inspect designated 3D models in their personal space. The implemented cross-platform solution is available as an open-source project and runs on the Microsoft HoloLens and smartphones. During the development, we followed an iterative design approach: we started with a paper prototype that was evaluated in a Wizard of Oz study. Based on the results, which, e.g., showed that typing in MR should be avoided, we created the fully functional prototype. Its practicability was investigated in a user evaluation that focused on usability and in a technical evaluation. The SUS questionnaire yielded an average value of 86.5 and the NASA-TLX also showed an adequate average value of 24.81.

We plan to use the developed presentation system in our lectures and our MR software lab. Especially in the MR lab, where we teach the fundamentals of MR development, 3D visualizations can be beneficial for conveying coordinate systems and geometric operations to students. Moreover, students can gain an impression of an MR application, and different 3D interaction metaphors can also be conveyed using 3D visualizations. Regarding features, we plan to extend the system with animations. Animations can, e.g., be applied to 3D objects, such as a beating heart. Alternatively, the scene itself can be animated by recording the movement of objects in the anchored space. This would allow objects to appear and move into a highlighted zone at the click of a button during the presentation. Currently, audio is not transmitted via ImPres, so a separate audio call is still necessary. Thus, we plan to include an audio stream in the presentation where the built-in microphones of the MR devices record the presenter’s voice.

All in all, the presentation framework ImPres enables new opportunities to enhance traditional presentation slides by adding 3D content to them. It offers a new approach that combines existing slide-based practices with interactive 3D models for both co-located and remote presentations in formal learning.