1 Introduction

Education is often slow to adopt technological improvements [1]. In the last decade, the democratization of immersive Virtual Reality (iVR) technologies has allowed their growing use in educational contexts. Although iVR remains costly, it offers clear benefits over traditional education and represents a further step in the incorporation of technology in the classroom (as was previously the case with videos and computers). Its effectiveness has been evidenced in repeated analyses, which highlight a substantial improvement in student engagement and interest in these environments [2]. The immersive nature of iVR allows learners to actively participate and interact with the virtual world, leading to heightened levels of engagement and motivation. Moreover, iVR has demonstrated its potential to facilitate the understanding of complex concepts by providing interactive and experiential learning experiences [3]. By simulating realistic scenarios and environments, iVR enables learners to visualize abstract ideas and concepts, leading to enhanced comprehension and knowledge retention [4].

In recent years, multi-user iVR environments have gained significant attention for their potential to facilitate collaborative learning experiences [5]. These environments provide ideal settings for social and collaborative learning, as well as for experiencing situated and experiential learning. It is also evident that the creation of these immersive Virtual Reality Environments (iVREs) is a very time-consuming activity [6]. For that reason, some researchers or professors use iVREs based on pre-created applications. However, publicly available options for multi-user iVR learning environments often come with limitations that hinder their widespread adoption in educational settings. These limitations include heavy reliance on proprietary tools provided by developers and the uncertainty of ongoing support and updates. Additionally, existing multi-user iVR environments are not specifically designed solely for educational purposes; instead, they often require adaptation to be effectively utilized for learning experiences.

Addressing these challenges, this paper presents a framework for developing multi-user iVR learning environments. The framework provides a structured approach to design and development, empowering educators and developers to create immersive and interactive iVR learning experiences independently. By integrating key components such as user interaction, data tracking, and visualization functionalities, the framework aims to enhance learning outcomes and promote collaborative experiences.

To illustrate the functionality and potential of the framework, a case study is presented in this paper. We showcase a multi-user virtual reconstruction of San Juan de Arce, a historical site located in La Rioja, Spain, for cultural heritage learning. Through this case study, we demonstrate how the framework can be applied to create engaging and educational experiences that immerse learners in the exploration of cultural heritage, reducing reliance on external tools, and fostering meaningful engagement and knowledge acquisition.

This paper is structured as follows: Sect. 2 presents the state of the art. Sect. 3 then covers the design and development of the multiplayer iVR application used for the case study, which is described in Sect. 4. Finally, Sect. 5 summarizes and analyzes this research in the conclusions.

2 State of the Art

iVR learning experiences have the potential to achieve learning objectives across cognitive processes and knowledge dimensions. Several key features have been identified in the literature as advantages of iVR for learning enhancement. The first is motivation and engagement. Maintaining learner interest and motivation is a challenge for any educator. A motivated learner will be more engaged and more determined to try to understand the learning material, as well as more resilient to potential obstacles to its understanding [10]. Most investigations measure motivation and engagement and conclude that the use of iVR leads to increased interest and engagement when compared to conventional learning environments or other 2D multimedia systems [3, 11]. Secondly, iVR provides higher levels of interaction than conventional educational methodologies, which are usually language-based, conceptual and abstract, characteristics that compromise the implementation of practical learning. iVR supports ‘doing’ rather than only observing, which leads to a constructivist approach, where students learn through interaction and even collaborate with other students. In this way, students can experiment, investigate and obtain instant feedback in a personalized experience that can improve learning [12]. They can learn experientially and proceed at their own pace. iVR offers enhanced learning through active participation, in which learners create their knowledge through practice, using motor and cognitive skills and receiving frequent feedback. This makes learning content easier to connect to the real-world context [13]. Thirdly, iVR can also create a window to another place and time, so that students can discover places and sites that are far away, inaccessible or that no longer exist today. Likewise, iVR can also be used for empathy training, enabling students to empathize with others and to broaden their range of perspectives and experiences beyond their normal spheres of interaction.
Finally, as the COVID-19 pandemic has highlighted, there is a need for tools that facilitate e-learning, and iVR has been shown to be effective in distance learning processes [14]. The immersive nature of iVR helps block out other distractions, making students more focused and concentrated on learning objectives [15].

It is very difficult to find suitable learning content or content with the potential to be applied in a classroom. Commercial solutions usually have to be found in game software engines, usually with poor labeling and a very large catalog. Likewise, it is common to find iVR applications with an absence of learning theories, designed without taking into account either the rationality of the design or the user experience [9]. A prerequisite for an effective iVR educational application is its pedagogical approach and the learning theory it applies [7]. These learning theories can be categorized according to how learners assimilate, process and retain the information they have learned [8]. The promotion of iVR-based learning is linked to a fusion of principles from multiple pedagogical perspectives. Regardless of which learning theories under each paradigm are used, it is crucial for the development of iVR applications to be firmly grounded in existing learning theories, because these theories offer guidelines on the motivations, learning process and learning outcomes of the learners.

Researchers also face additional challenges when determining which application to use, as there are several categories of iVR experiences. According to the technical characteristic of interactivity, they can be categorized into four classes: explorative interaction, explorative, interactive and passive. First, explorative interaction experiences allow the user to explore and to interact freely with the virtual environment. They allow users to move freely through space and interact with most of the objects involved in the simulation. Second, explorative experiences are more restricted solutions, which allow free exploration of the virtual environment but no direct interaction. The user can navigate freely but has almost no direct interaction capabilities with the objects present in the simulation. Third, interactive experiences permit user interaction with the environment, but no free movement through it. Generally, users can interact with any nearby objects. However, they do not have the ability to move around the virtual environment, beyond a very limited surrounding area. Finally, passive experiences are the most restricted solution, in which user interactivity and movement are very limited. The use of passive experiences is clearly related to the use of 3DoF devices, due to their technical limitations affecting movement and interaction.
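The four classes above can be read as the combinations of two capabilities: free movement and object interaction. The following minimal Python sketch illustrates that reading; the names `Interactivity` and `classify_experience` are hypothetical and introduced only for illustration, and the two boolean flags are a simplification of the graded descriptions in the text.

```python
from enum import Enum


class Interactivity(Enum):
    """The four interactivity classes named in the text."""
    EXPLORATIVE_INTERACTION = "explorative interaction"
    EXPLORATIVE = "explorative"
    INTERACTIVE = "interactive"
    PASSIVE = "passive"


def classify_experience(free_movement: bool, object_interaction: bool) -> Interactivity:
    """Map the two capabilities to one of the four classes (simplified to booleans)."""
    if free_movement and object_interaction:
        return Interactivity.EXPLORATIVE_INTERACTION
    if free_movement:
        return Interactivity.EXPLORATIVE
    if object_interaction:
        return Interactivity.INTERACTIVE
    return Interactivity.PASSIVE
```

Under this simplification, a seated 3DoF video experience would be classified with both flags set to false, that is, as passive.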

In addition to these categories, there are also those inherent to multi-user experiences. Multi-user iVR refers to experiences that allow multiple participants to access shared virtual spaces, events, and experiences simultaneously. Multi-user iVR environments designed for learning purposes should possess certain common affordances that contribute to their effectiveness. They should provide a sense of social presence, allowing participants to feel connected and engaged with others. This sense of presence enhances collaboration, communication, and a feeling of being part of a shared learning community. Within these immersive environments, participants are represented by avatars, enabling them to interact and engage with one another in real time. These environments should also strive to replicate real-world social interactions and dynamics as closely as possible. By simulating realistic social cues, gestures, and communication mechanisms, participants can engage in authentic and meaningful social interactions, fostering deeper engagement and immersion. The new Head Mounted Displays (HMDs) integrate eye gaze and gesture tracking systems, which bring a significant improvement in this aspect. Finally, these environments should facilitate immediate and seamless communication among participants. This allows for real-time interactions, discussions, and exchanges of ideas, enhancing collaborative learning experiences and promoting active engagement.

A review of the available tools shows a wide variety of multi-user iVR learning environments on the market, which can be classified into social, collaborative, experimental development, and exhibition multi-user virtual reality learning environments. Social refers to interactive multi-user iVR learning environments where participants can engage with one another, communicate, and interact in real time. It involves fostering connections and promoting social interactions. Collaboration refers to the cooperative and collective efforts of participants to work together towards a common goal or task in a collaborative multi-user virtual reality learning environment. It involves sharing ideas, resources, and knowledge, as well as engaging in joint problem-solving, group discussions, and teamwork. Collaboration fosters active learning, mutual support, and the exchange of diverse perspectives. Experimental development refers to the iterative and exploratory process of designing, testing, and refining new approaches, techniques, or tools. It involves experimenting with different features, functionalities, and instructional strategies within the virtual environment to enhance the learning experience. Experimental development allows for innovation, adaptation, and continuous improvement in iVR learning environments. Finally, exhibition refers to the presentation of educational content or experiences. It involves showcasing educational materials, virtual replicas of real-world objects, interactive simulations, or multimedia presentations to engage learners and provide them with immersive learning experiences. Exhibitions in iVR environments offer learners the opportunity to explore, interact with, and gain knowledge about specific subjects or themes in a visually rich and immersive manner. Table 1 shows the commercial multi-user iVR learning environments found in each category.
Most of these multi-user environments could be classified in several categories but they were grouped in the one with the most shared attributes or in the one that most corresponded to their main purpose.

Table 1. Applications available in each multi-user iVR learning environment category.

As can be seen, there is a wide variety of multi-user content with the potential to be used for learning purposes. This diversity of options is also a challenge for educators and developers. The challenge lies in selecting the right multi-user content that aligns with educational objectives, promotes effective learning outcomes, and maximizes learner engagement. Added to this is the continuous development and evolution of multi-user content, which adds even more complexity to the selection process. Educators and developers must stay informed about emerging technologies, advancements in iVR, and updates to existing multi-user platforms. This dynamic landscape demands ongoing research, exploration, and evaluation to ensure that the chosen multi-user content remains relevant, effective, and up to date.

For instance, educators may invest significant effort in creating their educational experiences within existing multi-user platforms, only to face the risk of sudden closure or discontinuation, as exemplified by the case of Altspace VR. Such drawbacks highlight the dependency on external platforms and the potential disruption it can cause to educational endeavors.

To overcome these challenges, educators and developers have increasingly adopted the creation of custom iVR learning environments. This approach enables them to exert greater control over the content, functionalities, and longevity of their educational experiences. By tailoring the virtual environment to their specific needs, they ensure continuity and circumvent the limitations and uncertainties associated with relying on third-party platforms. Creating a self-contained environment provides educators and developers with the freedom to design and curate their content, incorporate relevant pedagogical strategies, and implement the desired features and interactions. It grants them the autonomy to adapt and iterate their virtual reality learning environment as educational needs evolve, without being reliant on external parties for ongoing support or updates. This approach fosters stability, consistency, and the ability to align the virtual environment closely with the intended educational goals and objectives.

3 Development of a Multiplayer iVR Application

3.1 Design Procedure of Multi-user iVR Educational Experiences

The use of iVR by itself does not automatically improve learning, even when learners report very high satisfaction rates [16]. Most research neither considers nor explains how Immersive Virtual Reality Learning Environments (iVRLEs) are designed and used to enhance learning. As already mentioned, there are many ready-to-use solutions, but using them means adapting to what they already offer.

Nowadays, developing an iVR application is expensive (in terms of time and money) and needs a multidisciplinary team. This section presents some of the key features needed for the design and successful use of an iVRLE in education. Three stages are followed for the development of an educational application in iVR: pre-design, design and evaluation.

In the first stage, the pre-design, a breakdown of the requirements includes the definition of the target audience and the application domain. The learning objective must be important and enhanced by the introduction of iVR technologies. For example, it can focus on difficult-to-understand problems or learning that has proven to be resistant to conventional pedagogy. In this initial phase, the iVR experience is defined by taking into account the four key objectives for an iVRLE: interaction, immersion, user involvement and, to a lesser extent, photorealism [17]. Depending on the target audience and the scope of application, each objective will play a different role. Finally, the educational objectives must also be defined. Learning goals must be well established, so that the user does not become lost in the amusement of the iVRLE.

In the second phase, the design phase, it is best to raise some questions. First, which iVR technology is best suited for the proposed application? As described above, there is a wide variety of devices with substantial differences in functionality, portability and price. The interaction interfaces are subdivided into general, customized and automatic. The general ones include keyboards, mice and the iVR controllers that are normally included with each HMD. The second group refers to interfaces customized by the application developer, usually found in educational experiences related to medical fields, which enhance the user’s ability to learn specific tools. The third group includes all sensors that collect data automatically; they can be integrated into HMDs, such as accelerometers or eye-tracking sensors, or adapted to the user, such as biometric devices. These sensors are essential tools that provide insight into the user’s performance and decision-making capabilities. Second, which game design is best for this application? In the learning experience, the learner has to select, organize, and integrate information within a limited working memory, so the iVR learning environment should be directly designed to support these processes. For example, interactivity should be designed to be easy to use; a well-designed learning curve should be developed for novices to the iVR technology; and preferably a game structure that offers genuine game play, rather than quiz-like questions and answers, should be created. A balance between immersion, freedom and comfort must be sought in the design of an iVR experience. In addition, the design should incorporate game-based learning elements that support the motivational needs of competence, autonomy and relatedness, so that the motivation and the engagement of the student are maintained.
Finally, the application should be designed in such a way that it can be modified, customized and easily updated by the instructors, so that they can adapt it to the needs of their individual classes and students. An advanced graphical application programming interface (API) for game engines is usually employed. Unity 3D and Unreal Engine are the two most common game engines in iVR. These two engines include tools such as physical force simulators, graphics engines (responsible for generating 3D graphics using methods such as rasterization, ray tracing, etc.) and interaction modules that integrate devices such as iVR controllers, custom interfaces or sensors into the experiences in a simple way.

Finally, the third stage consists of the evaluation of the iVRLE. The evaluation should take into account which key factors are to be evaluated and how they can best be evaluated. If iVR is to gain widespread acceptance as a reliable pedagogical method, it must demonstrate that it can confer a tangible benefit in terms of learning outcomes over less immersive and traditional teaching methods [7]. It has been observed that most studies on iVR as a learning tool have no well-defined evaluation method or perform no comparison with other methods of education [3]. Most studies used only one of the following evaluation procedures: questionnaires, user interviews, data recording, and direct user observation. A combination of two of these procedures, especially questionnaires and indicators extracted from data recording, would increase confidence in the results, especially if standardized questionnaires were used. This strategy would increase the validity and reliability of the conclusions, as previous studies have pointed out [18]. Figure 1 summarizes the flow chart for the design and implementation of an iVRLE.

Fig. 1. Proposed process for the design and implementation of an iVRLE.

3.2 Framework Development

This framework is based on an extension of the previous version [19] and also has been designed inside Unreal Engine 5™. The framework is made up of four main components: Player, Scene, Utilities and Metrics. The player section comprises the options available in the framework for the user of the iVR experience. It offers personalized teleportation techniques, including parabolic teleportation and direct movement, and supports various controllers, with the representation ranging from controllers to realistic hands with a watch/display for user interaction. The scene section allows the implementation of gamified tasks and objectives in educational or training applications, with an evaluation manager enabling task completion tracking. The evaluation manager can be accessed through the hand watch or panels, providing additional scene information like maps to aid spatial comprehension. The utilities section of the framework includes features such as scene transfer, enabling seamless level changes while preserving progress. It also provides a load menu for contextual level selection. Additionally, gaze view events allow for actions triggered by direct eye contact, enhancing passive and informative experiences. The spectator view tool allows non-immersed observers to monitor and evaluate the participant’s performance from a separate viewpoint. Furthermore, the framework can be adjusted for 2D screen monitor usage, facilitating comparative studies between iVR and other methods like 2D video games. Finally, the metric section addresses one of the weaknesses of many studies that use virtual reality for education. Having data collected during the process is nowadays essential in the creation of this type of experience. It also allows the analysis of the extracted data and the import of feedback into the game in real time (Fig. 2).

Fig. 2. Diagram of the framework and pipeline of the creation of experiences.
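The gaze view events mentioned in the Utilities description can be illustrated with a dwell-time trigger: an action fires once the user has looked at a target for long enough. The Python sketch below is only an assumption about how such an event could work, not the framework's Unreal Engine implementation; `is_looking_at`, `GazeTrigger`, the 10 degree cone and the 1.5 s dwell time are all hypothetical.

```python
import math


def is_looking_at(head_pos, gaze_dir, target_pos, max_angle_deg=10.0):
    """True when the angle between the gaze direction and the target is small."""
    to_target = [t - h for t, h in zip(target_pos, head_pos)]
    dist = math.sqrt(sum(c * c for c in to_target))
    norm = math.sqrt(sum(c * c for c in gaze_dir))
    if dist == 0.0 or norm == 0.0:
        return False
    cos_angle = sum(a * b for a, b in zip(to_target, gaze_dir)) / (dist * norm)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle)) <= max_angle_deg


class GazeTrigger:
    """Fires once after the target has been looked at for `dwell_s` seconds."""

    def __init__(self, dwell_s=1.5):
        self.dwell_s = dwell_s
        self.elapsed = 0.0
        self.fired = False

    def update(self, looking, dt):
        """Advance by dt seconds; return True only on the frame the event fires."""
        if not looking:
            # Looking away resets the timer and re-arms the trigger.
            self.elapsed = 0.0
            self.fired = False
            return False
        self.elapsed += dt
        if not self.fired and self.elapsed >= self.dwell_s:
            self.fired = True
            return True
        return False
```

In a passive or informative experience, `update` would be called once per frame with the result of `is_looking_at`, and a positive return value would start the associated action, for example showing an information panel.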

Until now, the framework only allowed single-user experiences, but it has been redesigned to create multi-user experiences. Extending the framework with network replication to support multi-user experiences involved significant modifications and enhancements. It required the implementation of networking capabilities to enable communication and synchronization between multiple users in the virtual environment. This includes features such as user avatars, real-time interaction, and collaborative activities. The framework now allows multiple participants to engage and interact with each other within the same virtual space. The replication process involved careful consideration of network optimization, data synchronization, and ensuring a smooth and immersive experience for all users involved. Epic Online Services (EOS) was chosen because it is a free service and independent of the game platform. It makes it possible to launch, operate and expand a game regardless of the gaming platform on which the clients are running, and it provides functionality such as authentication, player progression tracking, matchmaking, voice chat and statistics.
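The synchronization requirement can be illustrated with a generic server-authoritative pattern, in which one node holds the shared state and clients accept only snapshots newer than the one they already hold. The Python sketch below is a conceptual illustration under that assumption; it does not reproduce Unreal Engine replication or any EOS API, and `Server`, `Client` and `AvatarState` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class AvatarState:
    """Minimal per-user state that every participant needs to see."""
    position: tuple = (0.0, 0.0, 0.0)
    yaw_deg: float = 0.0


@dataclass
class Server:
    """Holds the authoritative state and hands out versioned snapshots."""
    avatars: dict = field(default_factory=dict)
    version: int = 0

    def apply_update(self, user_id, position, yaw_deg):
        # Each accepted update replaces the user's state and bumps the version.
        self.avatars[user_id] = AvatarState(tuple(position), yaw_deg)
        self.version += 1

    def snapshot(self):
        return self.version, dict(self.avatars)


@dataclass
class Client:
    """Keeps a local copy and ignores snapshots older than the one it holds."""
    user_id: str
    version: int = -1
    avatars: dict = field(default_factory=dict)

    def receive(self, snapshot):
        version, avatars = snapshot
        if version <= self.version:
            return False  # stale or duplicate snapshot, e.g. delivered out of order
        self.version, self.avatars = version, avatars
        return True
```

The version check is what keeps all participants on the same sequence of states when network messages arrive late or out of order.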

Several adjustments were made to the player section of the framework to support multi-user experiences. Instead of a single player character, the framework now allows for the creation and control of individual avatars for each participant, providing a unique identity and presence within the virtual environment. Ready Player Me, a platform that allows avatar customization, was implemented. In this way, participants in the virtual environment were able to personalize their avatars with unique appearances, including various clothing options, hairstyles, and facial features, as seen in Fig. 3. This level of customization helped to foster a sense of individuality and personalization for each user. Additionally, the player component incorporates features for real-time communication and collaboration among users. This includes voice chat communication channels, enabling participants to interact, exchange information, and coordinate their actions within the shared virtual space. To facilitate user interactions and enhance immersion, the player component was enhanced with synchronized hand movements and gestures. This allows users to engage in gestures and hand-based interactions with virtual objects and environments, fostering a sense of presence and natural interaction within the multi-user experience. Moreover, the player component integrates mechanisms for spatial awareness and collision detection to ensure that users can navigate and move within the virtual environment without interfering with each other’s movements. This involves implementing techniques such as collision avoidance, user tracking, and positional tracking to maintain a smooth and seamless experience for all participants.

Fig. 3. Example of a personalized avatar.

Likewise, several adjustments were made to the scene component of the framework to support multi-user experiences. Mechanisms for synchronizing and replicating events and activities were developed to ensure that all participants experienced the same sequence of events and had equal opportunities to engage with the learning content. This synchronization was crucial for maintaining coherence and consistency within the multi-user experience. Furthermore, specific menus were created and made accessible through the user’s wristwatch interface. These menus provided convenient control and customization options for interacting with fellow participants. For example, users could mute or unmute their colleagues’ audio, allowing them to selectively listen to specific individuals or maintain silence when needed. In addition, the wristwatch interface offered additional scene information and functionalities to enhance navigation and collaboration. Users could access 3D maps displaying the virtual environment and real-time location markers of other participants. This feature enabled users to locate and track the movements of their peers, promoting better coordination and communication within the shared space. The wristwatch interface served as a central hub for users, providing quick and easy access to various scene-related features and interactions. By incorporating these menus into the wristwatch interface, the framework facilitated seamless control over social interactions and access to scene-related information, ultimately enhancing the overall user experience in multi-user iVR environments.
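The selective mute and unmute behaviour described above can be modelled as a per-listener set of muted speakers. The following Python sketch is a minimal illustration under that assumption; `AudioMixer` and its methods are hypothetical names, and the participant identifiers in the usage are invented for the example.

```python
class AudioMixer:
    """Tracks, for every listener, which other participants are muted."""

    def __init__(self, participants):
        self.participants = set(participants)
        self.muted = {p: set() for p in self.participants}

    def mute(self, listener, speaker):
        self.muted[listener].add(speaker)

    def unmute(self, listener, speaker):
        self.muted[listener].discard(speaker)

    def mute_all(self, listener):
        """Silence everyone else for this listener."""
        self.muted[listener] = self.participants - {listener}

    def audible_to(self, listener):
        """Sorted list of participants whose voice this listener still hears."""
        return sorted(self.participants - {listener} - self.muted[listener])
```

Keeping the mute state per listener means that one participant silencing a colleague does not affect what the others hear.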

Within the Utilities section of the framework, a significant adjustment was made to introduce a tool that allows users to point and highlight specific objects or locations within the virtual environment, making it visible to all participants. This feature, known as the “pointing tool,” enhances communication and collaboration by enabling users to draw attention to specific elements and facilitate shared focus. By activating the pointing tool, users can extend a virtual pointer or laser-like beam from their hand or controller to highlight an object or designate a specific area of interest. The beam is visible to all participants in the multi-user experience, making it easier for everyone to follow and understand the direction of attention. This tool proves particularly useful during discussions, presentations, or group activities where visual cues are essential for effective communication. The pointing tool enhances the social dynamics and engagement within the virtual environment by promoting shared attention and facilitating discussions around specific objects or locations. It serves as a powerful communication tool, allowing users to direct focus and guide the collective exploration and interaction within the multi-user experience.
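Deciding which object a pointer beam highlights is commonly done with a ray cast from the hand or controller. The Python sketch below illustrates the idea with ray-sphere intersection, assuming each object is approximated by a bounding sphere; this is a generic technique rather than the framework's actual implementation, and `ray_sphere_distance`, `pick_target` and the 20-unit range are hypothetical.

```python
import math


def ray_sphere_distance(origin, direction, center, radius):
    """Distance along a unit-length ray to the sphere, or None if it misses."""
    oc = [c - o for c, o in zip(center, origin)]
    t_closest = sum(a * b for a, b in zip(oc, direction))
    d2 = sum(c * c for c in oc) - t_closest * t_closest
    if d2 > radius * radius:
        return None  # the ray passes outside the sphere
    half_chord = math.sqrt(radius * radius - d2)
    t_hit = t_closest - half_chord
    if t_hit < 0.0:
        t_hit = t_closest + half_chord  # ray origin is inside the sphere
    return t_hit if t_hit >= 0.0 else None


def pick_target(origin, direction, objects, max_range=20.0):
    """Return the name of the nearest object hit by the pointer beam, if any.

    `objects` maps a name to a (center, radius) bounding sphere.
    """
    length = math.sqrt(sum(c * c for c in direction))
    if length == 0.0:
        return None
    unit = [c / length for c in direction]
    best_name, best_t = None, max_range
    for name, (center, radius) in objects.items():
        t = ray_sphere_distance(origin, unit, center, radius)
        if t is not None and t <= best_t:
            best_name, best_t = name, t
    return best_name
```

In a multi-user session, the picked object name (or the beam end point) would be the piece of state shared with the other participants so that everyone sees the same highlight.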

Under the Metrics section of the framework, significant adjustments were made to enhance the data collection and analysis capabilities. Now, all participant data is stored and can be analyzed later, providing valuable insights into user behavior and performance within the virtual environment. The framework includes a comprehensive data logging system that captures various metrics, such as movement patterns, interaction events, task completion times, and user interactions with objects and elements within the virtual scene. This data is then stored in a structured format, allowing for in-depth analysis and evaluation.
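A structured log of this kind can be sketched as a list of timestamped events serialized as JSON Lines, from which indicators such as task completion times are derived afterwards. The Python sketch below assumes this format for illustration only; the paper does not specify the storage format, and `LogEvent`, `SessionLog` and the event kinds `task_start` and `task_end` are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class LogEvent:
    timestamp_s: float  # seconds since the session started
    user_id: str
    kind: str           # e.g. "position", "interaction", "task_start", "task_end"
    payload: dict


class SessionLog:
    """Collects events in memory and serializes them as JSON Lines."""

    def __init__(self):
        self.events = []

    def record(self, timestamp_s, user_id, kind, **payload):
        self.events.append(LogEvent(timestamp_s, user_id, kind, payload))

    def to_jsonl(self):
        """One JSON object per line, in recording order."""
        return "\n".join(json.dumps(asdict(e)) for e in self.events)

    def task_durations(self):
        """Seconds between each user's task_start and matching task_end."""
        starts, durations = {}, {}
        for e in self.events:
            key = (e.user_id, e.payload.get("task"))
            if e.kind == "task_start":
                starts[key] = e.timestamp_s
            elif e.kind == "task_end" and key in starts:
                durations[key] = e.timestamp_s - starts.pop(key)
        return durations
```

Storing raw events rather than precomputed indicators keeps the log usable for analyses that were not planned when the session was recorded.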

In addition to real-time data analysis, the framework also enables the playback of recorded sessions, allowing researchers or educators to review and analyze user interactions from a third-person perspective. This feature is particularly valuable for post-experience evaluation and gaining insights into the overall user experience and learning outcomes. Furthermore, this replay can generate heatmaps based on user interactions and movements. These heatmaps visually represent the frequency and intensity of user activity within specific areas of the virtual environment as can be seen in Fig. 4. They provide a powerful tool for identifying points of interest, areas of high engagement, or potential bottlenecks in the user experience. Furthermore, the framework allows for the identification of outliers or unusual behavior patterns among participants. By analyzing the collected data, it becomes possible to detect outliers, anomalies, or unexpected user actions that may warrant further investigation or adjustments in the virtual environment design. By incorporating these adjustments to the Metrics section, the framework provides a robust foundation for data-driven analysis and evaluation of user behavior and performance within the iVRLE. It empowers researchers and educators to gain valuable insights, identify areas for improvement, and make informed decisions based on the collected data.

Fig. 4. Example of a generated heatmap based on the user interactions and movements.
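A heatmap of this kind can be obtained by binning logged positions into a regular grid over the floor plan and normalizing the counts. The Python sketch below shows one simple way to do this, assuming 2D positions and a rectangular mapped area; it is not the framework's implementation, and `build_heatmap` and `normalise` are hypothetical names.

```python
def build_heatmap(positions, x_range, y_range, cells_x, cells_y):
    """Count how many samples fall in each cell of a regular floor-plan grid."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    grid = [[0] * cells_x for _ in range(cells_y)]
    for x, y in positions:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # ignore samples outside the mapped area
        # Samples exactly on the upper edge are clamped into the last cell.
        col = min(int((x - x_min) / (x_max - x_min) * cells_x), cells_x - 1)
        row = min(int((y - y_min) / (y_max - y_min) * cells_y), cells_y - 1)
        grid[row][col] += 1
    return grid


def normalise(grid):
    """Scale counts to the range 0..1 so they can be mapped to a colour ramp."""
    peak = max(max(row) for row in grid)
    if peak == 0:
        return [[0.0 for _ in row] for row in grid]
    return [[count / peak for count in row] for row in grid]
```

If positions are sampled at a fixed rate, each cell count is proportional to the time spent in that cell, which is what makes high-count cells candidates for points of interest or bottlenecks.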

4 Case Study: San Juan de Arce, La Rioja (Spain)

This framework has been implemented in a case study for its validation. The aim of this implementation is to demonstrate its capabilities in the field of creating multi-user iVR experiences focused on cultural heritage.

The Pilgrims’ Hospital of San Juan de Arce in Navarrete has been recreated virtually because only the ruins of the floor plan of the chapel, the doorway and the windows are preserved, which were moved to the local cemetery. Through this virtual reconstruction, it has been possible to recreate the appearance and distribution of the hospital in the 14th century. Visitors have the opportunity to explore the inner courtyard and enter the hospital’s rooms. In addition, they will be able to appreciate and examine objects related to the pilgrims, providing a more immersive and detailed experience of what life was like in the hospital and the care provided to pilgrims on the Camino de Santiago who passed through. It is important to note that this virtual reconstruction has been created to provide a historical representation of the Pilgrims’ Hospital of San Juan de Arce in the 14th century. It seeks to provide a visual and interactive experience to better understand the importance of this place for the pilgrims of yesteryear.

The educational multi-user visit, as seen in Fig. 5, is a directed tour where a guide provides detailed explanations about each part of the virtual reconstruction. Thanks to the tools provided by the framework, the guide can point out specific points in the virtual space and refer to them. Additionally, the framework enables interactive features that allow visitors to participate by speaking, while the guide can mute them all when needed, creating a collaborative and engaging experience.

Fig. 5. Visitors on a guided tour of the Pilgrims’ Hospital of San Juan de Arce in the 14th century.

During the tour, the guide can utilize the virtual environment to enhance the explanations and create an immersive learning experience. They can highlight important features, artifacts, or architectural details by pointing directly at them in the virtual space as seen in Fig. 6. This visual aid helps to enrich the understanding of the visitors and fosters a more interactive and dynamic tour. Moreover, the framework allows for two-way communication, enabling visitors to ask questions, share their insights, or engage in discussions with the guide and other participants. The guide has the ability to manage the audio settings, muting or unmuting visitors when necessary to maintain a smooth and organized tour.

Fig. 6. Example of the guide highlighting important architectural details by pointing them out directly in the virtual space.

Additionally, each user is equipped with a wristwatch that displays a mini-map for orientation purposes as seen in Fig. 7. This mini-map provides a visual representation of the virtual environment, allowing users to navigate and understand their location within the reconstruction. Users can easily identify different areas, landmarks, and points of interest. Furthermore, the wristwatch also displays volume indicators, indicating the proximity and direction of sounds within the virtual space. This feature helps users to identify audio cues and enhances the overall immersive experience. By following the volume indicators, users can locate specific audio sources or engage in conversations with fellow participants or the guide. Finally, the wristwatch facilitates social interaction by showing the presence and positions of other users within the virtual experience. Users can easily identify and locate their companions, enabling them to collaborate, discuss, or simply observe the actions and movements of others.

Fig. 7. Wristwatch showing the user the 3D view of the reconstruction.
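The volume indicators on the wristwatch convey the proximity and direction of a sound. One way to compute such an indicator is sketched below in Python: the level falls off with distance and the bearing is expressed relative to the direction the listener is facing. The linear falloff, the 15-unit maximum distance and the yaw convention (0 degrees along +x, positive counter-clockwise) are assumptions made for this illustration, and `volume_indicator` is a hypothetical name.

```python
import math


def volume_indicator(listener_pos, listener_yaw_deg, source_pos, max_distance=15.0):
    """Return (level, bearing_deg) for a sound source on the horizontal plane.

    level falls linearly from 1.0 at the listener to 0.0 at max_distance.
    bearing_deg is the angle to the source relative to where the listener
    faces, in the range (-180, 180], positive counter-clockwise.
    """
    dx = source_pos[0] - listener_pos[0]
    dy = source_pos[1] - listener_pos[1]
    distance = math.hypot(dx, dy)
    level = max(0.0, 1.0 - distance / max_distance)
    world_angle = math.degrees(math.atan2(dy, dx))
    # Wrap the relative angle into [-180, 180), then move -180 to +180.
    bearing = (world_angle - listener_yaw_deg + 180.0) % 360.0 - 180.0
    if bearing == -180.0:
        bearing = 180.0
    return level, bearing
```

A bearing close to zero would be drawn as an indicator pointing straight ahead, while a level of zero would hide the indicator because the source is out of range.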

5 Conclusions

The present work represents the collaborative efforts of a multidisciplinary team, resulting in the development of a versatile framework tailored for multi-user educational experiences in virtual reality. iVR brings several advantages compared to traditional education, including improved student engagement, interest, and motivation. It enables active participation and interaction with the virtual world, leading to enhanced comprehension and knowledge retention. Multi-user iVR environments have gained attention for facilitating collaborative learning experiences and providing opportunities for social and experiential learning. The main drawback is that developing iVR learning environments is a time-consuming activity, and existing publicly available options often come with limitations, such as reliance on proprietary tools and lack of ongoing support and updates.

To address these challenges, the framework presented here takes into account the importance of the design of Immersive Virtual Reality Learning Environments (iVRLEs). The use of iVR by itself does not automatically improve learning, even when learners report very high satisfaction rates. By considering the design and evaluation aspects described above, the framework aims to offer educators and developers the autonomy to create engaging and educational iVR experiences. It empowers educators and developers to create interactive iVR learning experiences independently, integrating key components such as user interaction, data tracking, and visualization functionalities. Finally, a case study of a multi-user virtual reconstruction of a historical site demonstrates the functionality and potential of the framework in creating engaging and educational experiences that immerse learners in cultural heritage learning. Special attention has been given to the development of comprehensive metrics to extract meaningful results for the final validation with students. This addresses a common weakness observed in the validation of educational iVR experiences, and the framework aims to overcome this limitation. As the ongoing efforts continue to validate and integrate the framework into different types of experiences, it holds promise to become a reliable and effective tool for educational and training purposes in iVR, benefiting both researchers and educators.