Introduction

Cultural heritage inevitably degrades over time, and natural and anthropogenic factors further accelerate this deterioration. In recent years, preventive conservation has been recognized as one of the most effective methods of safeguarding cultural heritage [1, 2]. Its primary objective is proactive maintenance: anticipating potential damage in order to preserve the authenticity and integrity of cultural heritage to the greatest extent possible [3]. The International Centre for the Study of the Preservation and Restoration of Cultural Property (ICCROM) has been dedicated to researching preventive conservation of cultural heritage from the perspectives of risk management and response. This focus has become particularly evident since United Nations member states adopted the Sendai Framework for Disaster Risk Reduction 2015–2030 (SFDRR) in Sendai, Japan, which marked a paradigm shift in the global discourse on cultural heritage. International organizations such as UNESCO, ICCROM, and ICOMOS have incorporated cultural heritage into a broader framework of Disaster Risk Management (DRM) theory and practice [4, 5]. Consequently, cultural heritage risk management has become a focal point for cultural heritage organizations and scholars [6].

The concept of digital twins (DTs), which transcends physical reality, is increasingly embraced by researchers in the field of cultural heritage. Within the DT framework, combinations of 3D technologies, HBIM, the Internet of Things (IoT), artificial intelligence, and other technological means [7] have found widespread application in the documentation [8], monitoring [9, 10], exhibition [11,12,13], diagnosis [14, 15], intervention [16,17,18], and management [19, 20] of cultural heritage. International standards organizations such as ISO have established global standards for risk management (ISO 31000:2009, Risk management - Principles and guidelines). Building upon this standard, ICCROM and the Canadian Conservation Institute (CCI) have outlined the implementation steps for cultural heritage risk management in "A Guide to Risk Management of Cultural Heritage": identifying relevant factors, then recognizing, analyzing, assessing, addressing, and monitoring risks. Given this international recognition of cultural heritage risk management, DTs offer significant advantages [3]. The theoretical framework of "historical DTs" proposed in [21] applies DTs to the development and evolution of risk prediction, damage, and possible intervention measures in heritage.

DT provides an effective technological pathway for cultural heritage risk management, particularly with breakthroughs in emerging technologies such as artificial intelligence, big data, and cloud computing. Cultural heritage risk management based on DT is gradually transitioning towards autonomous data analysis and simulated decision-making [22, 23]. While many view automation as beneficial, existing literature on artificial intelligence has identified various issues and pitfalls associated with automation [24]. Fundamentally, the mission of DT is to enhance and amplify human performance [25]. As noted in studies by [26,27,28], the role of DT is to provide additional information to humans, allowing operators to anticipate future system behaviors and assist in improving proactive maintenance strategies [29]. In the context of cultural heritage risk management, particularly during unforeseeable risk occurrence processes, management personnel still need to intervene directly in the real world by synthesizing various information sources at cultural heritage sites [4]. However, the real-time capabilities, high precision, and high integration of DT determine that its intervention in cultural heritage risk management presents a system characterized by high stress and dynamics. Users often cannot accurately perceive all safety-related elements [30]. The substantial volume of information and complex information sources significantly increases the cognitive burden for management personnel in understanding on-site situations. Therefore, the focus is on how to effectively utilize cognitive resources in DT systems to enhance management personnel's perception of risks and decision-making [6].

To enhance users' cognitive processes within DT systems, researchers initially delve into the exploration of DT interface design [31, 32]. They strive to better present system-relevant information to operators based on their individual preferences and capabilities, aiming to improve operators' cognitive abilities and execution efficiency [33]. It is believed that dynamically adaptive user interfaces can effectively enhance situational awareness [34]. However, the design of DT interfaces is closely related to specific application domains and types of visualized information, resulting in high coupling [31]. In comparison to DT interfaces, some researchers argue that in the virtual space of DT, a user-friendly operating environment is more crucial [35]. Within the contextual information represented by DT, immersive views akin to real-life scenarios are believed to enhance human adaptability [36]. Therefore, in the design of DT systems, an increasing number of developers are considering immersive technologies such as VR, AR, and MR as interfaces for human–computer interaction [37, 38]. DTs are seen as a further realization of VR and AR technologies [39], with immersive technologies becoming essential tools for integrating virtual and real interactions in highly integrated DT systems.

The rapid development of X-reality technologies is merging with DT, providing new interactive gateways and serving as a crucial support for the visualization and interaction platforms of DT [40,41,42]. Empowering DT with X-reality technologies provides immersive, realistic views and natural modes of interaction [43,44,45], offering effective visualization methods that present information in DT systems intuitively. X-reality technologies have been extensively applied in the cultural heritage domain, offering new technological means for heritage management, visualization displays, and remote interaction [6, 11, 46,47,48], and giving management personnel more user-friendly methods of remote operation and observation for cultural heritage conservation and utilization. Despite extensive research on X-reality technologies, they remain relatively unfamiliar to most people, and understandings of them vary within the academic and professional communities. Some studies define X-reality technology as Extended Reality (XR) [49, 50], suggesting that XR encapsulates various forms of reality including VR, AR, and MR [51]. However, others argue that the "X" in XR should be a placeholder for all new formats of reality, existing as a variable [52]. Furthermore, the definitions of VR, AR, and MR are themselves ambiguous. [53] suggests that virtual technologies (VR/AR/MR) all belong to the category of VR and should exist as a unified continuum, and virtual technologies are often studied as a unified class of immersive technologies [54]. This terminological ambiguity significantly hinders the development of visualization-based risk management service systems, especially those concerning cultural heritage risks.
The ambiguity of technologies can affect the outcome variables related to management, and users' cognition and acceptance of information vary under different technological empowerments [55], leading to different effects on user perception [56, 57].

Situation awareness (SA) is defined as the perception of elements in the environment within a certain time and space, the understanding of their meaning, and the prediction of their future states [58]. The state of SA reflects the extent to which users perceive, understand, and project context-relevant elements they wish to comprehend. Studies have shown that higher SA can lead to better decision-making and performance in risk management [59]. SA is a key attribute for effectively responding to organizational events [60]. In the context of cultural heritage risk management, considering SA will facilitate more efficient risk assessment, analysis, and strategic decision-making by management personnel, thereby reducing damage to cultural heritage [58]. Particularly within the complex systems of cultural heritage DT, evaluating the performance of management personnel from the perspective of SA will significantly enhance their understanding of risks, reduce cognitive load related to risk perception, and improve attentional performance.

In the realm of cultural heritage risk management, the diversity of human–computer interaction (HCI) can significantly influence managers' cognitive load and situational awareness [61, 62]. Within the context of DT, cultural heritage managers, experts, and stakeholders need to integrate various real-time information sources to effectively perceive current situations and forecast future states. Despite the rapid advancement of information technology, particularly the integration of X-reality technologies, which offer new possibilities for enhancing users' situational awareness [63,64,65], they also present new challenges. Some studies suggest that immersion in X-reality environments may lead to insufficient situational awareness among users, resulting in slower reaction times and increased cognitive loads [63, 66]. Although research has begun to explore the potential of X-reality technologies in enhancing situational awareness among operators, there remains a lack of in-depth investigation into their effectiveness in the field of cultural heritage risk management. Specifically, there is a dearth of research addressing the impact of different X-reality interaction modalities on user situational awareness. Therefore, further exploration of how X-reality technologies influence the functional mechanisms of user situational awareness is warranted. Such research efforts will not only shed light on the potential role of DT systems in cultural heritage risk management but also provide more reliable decision support and guidance for practitioners.

In this context, evaluating the intervention of DT cultural heritage risk management from the perspective of situational awareness becomes crucial. Particularly, assessing the effectiveness and impact mechanisms of different X-reality technologies in assisting users' risk cognition is imperative. This study begins by conducting an in-depth analysis of the academic definition of X-reality technologies to determine their specific manifestation modes. Subsequently, through the establishment of control experimental procedures, it explores the effectiveness of different technological modes in cultural heritage risk situational awareness. Furthermore, it investigates how the core features under different technological modes influence various dimensions of cultural heritage situational awareness, including the demand for attention resources, the supply of attention resources, and the understanding of the situation. Such research not only contributes to the development of visualization applications for DT in cultural heritage risk management but also provides theoretical and managerial insights for heritage site managers, heritage experts, and related stakeholders adopting X-reality technologies.

The structure of the remaining sections of this paper is outlined in Fig. 1. Sect. "Main concepts" reviews related work on DTs, cultural heritage risk management, X-reality technologies, and situational awareness theories. Sect. "Materials and methods" establishes the research hypotheses based on key concepts and details the experimental design and technical specifics. Sect. "Results" presents the experimental results and analyzes the proposed hypotheses. Sect. "Discussion" provides a summary and discussion of the research findings. Finally, the study's applicability and limitations are discussed.

Fig. 1 Flowchart of the research process

Main concepts

Cultural heritage risk management and digital twin

ICCROM and CCI emphasize the risks faced by cultural heritage, including both sudden and catastrophic events such as earthquakes, floods, fires, and armed conflicts, as well as gradual process risks such as chemical, physical, and biological damage, with the severity and frequency of disasters increasing in recent years [67]. Despite the efforts of international organizations to strengthen cultural heritage risk management through a series of documents, frameworks, conventions, and guidelines, the process requires collaboration among professionals from different backgrounds, which brings additional risks and challenges to project organization and management and hinders progress in cultural heritage risk management [4, 68]. Based on the current status of heritage, researchers are keenly interested in enhancing interdisciplinary understanding among disciplines and departments through the identification, analysis, and evaluation of cultural heritage risks using contextual information, physical principles, and mechanisms.

To describe the real-time operation of physical equipment or systems, Dr. Grieves proposed the Information Mirror Model in 2003, thereby introducing the concept of DTs for the first time [69, 70]. A DT is a set of virtual information structures that can fully describe a potential or actual physical manufactured product, from the microscopic level of atoms to the macroscopic level of geometry, and that includes precise simulations of expert knowledge [71, 72]. The virtual-real integration and real-time interaction features of DTs provide technical support for visual real-time monitoring and full-state real-time simulation of cultural heritage, enabling full-scale, full-process digitization and offering new solutions for the collaborative and intelligent management of target objects.

Cultural heritage risk management is an ongoing process that requires continuous monitoring of risks and timely adjustment of actions to minimize adverse impacts. Previous studies have explored cultural heritage risk management under the concept of DTs, integrating technologies such as 3D, virtual reality, HBIM, IoT, and big data. The literature shows that risk monitoring methods were applied to cultural heritage early on [2, 73, 74], and in recent years numerous structural health monitoring (SHM) studies have further highlighted the importance of risk monitoring in cultural heritage conservation, which saves substantial maintenance costs in later stages. However, regular or long-term monitoring of heritage, and monitoring of specific aspects of heritage, often does not consider real-time dynamic monitoring. This causes a lag in the transmission and analysis of risk monitoring data and results in low sensitivity in heritage risk monitoring [75].

IoT technology has provided a new paradigm for cultural heritage risk monitoring [76, 77]. With the assistance of smart sensors, monitoring data can be transmitted, analyzed, and visualized in real time [78, 79], making cultural heritage an open and comprehensive smart object capable of real-time dynamic interaction between digital and physical objects under different situations and environmental changes. However, current applications of IoT in cultural heritage monitoring remain at the data level: studies such as [80] mainly focus on real-time data collection and on monitoring heritage through data anomalies. This provides an effective approach for heritage professionals, albeit at additional learning costs for heritage management. Particularly in risk management for architectural heritage, it is necessary to identify, analyze, and evaluate cultural heritage risks based on the current state of heritage, using contextual information, physical laws, and mechanisms, which has not been widely explored in current IoT research [81].
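As a rough illustration of what monitoring "through data anomalies" can look like at the data level, the sketch below flags sensor readings that deviate strongly from a short window of preceding readings. It is not taken from the cited studies: the function name `flag_anomalies`, the window size, the threshold, and the humidity values are all hypothetical choices made for this example.

```python
from collections import deque
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate from the mean of the
    preceding `window` readings by more than `threshold` standard deviations.

    Assumption: a simple rolling z-score rule; real deployments would
    likely need sensor-specific models and calibrated thresholds.
    """
    history = deque(maxlen=window)  # most recent `window` readings
    flagged = []
    for index, value in enumerate(readings):
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            # A flat history (sigma == 0) gives no basis for a z-score.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                flagged.append(index)
        history.append(value)  # flagged values also enter the history
    return flagged


# Hypothetical relative-humidity readings (%) from a single sensor.
humidity = [55.0, 55.2, 54.9, 55.1, 55.0, 55.3, 71.0, 55.1]
print(flag_anomalies(humidity))  # [6]
```

A rule of this kind only reports that a value is unusual. It carries no contextual information or physical mechanism, which is the limitation of data-level monitoring noted above.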

The European project HeritageCare (SOE1/P5/P258) has initiated the implementation of a digitalized hierarchical preventive protection scheme. GIS [2, 82] and BIM [83,84,85,86] are increasingly being employed to visualize risk analysis in cultural heritage and database elements related to the geometric features of heritage [82], and researchers have begun to use both to support cultural heritage risk management [87,88,89]. However, each of these two widely used technologies has limitations in practical applications of cultural heritage risk management. GIS is primarily used to establish geospatial databases for heritage and has been described as the most "natural" visualization platform for different types of data in heritage management [90]. However, traditional two-dimensional methods are no longer sufficient to meet the current demands of cultural heritage risk management. Although technologies such as laser scanning and photogrammetry have been integrated [91, 92], enabling 3D digital graphics to be linked with information [93], connecting them via external links has not effectively mapped specific data characteristics onto the heritage itself [94]. Specialized visualization data is insufficient for heritage managers to identify and analyze risks effectively. Importantly, cultural heritage risk management is an ongoing process, and GIS cannot meet the requirements of all its steps. In contrast to GIS, BIM can store accurate and interoperable 3D building information records and, in particular, supports real-time data input and updates, enhancing building operation and maintenance and enabling closed-loop design [81, 95, 96]. Because of these advantages, researchers have applied BIM to architectural heritage, where it is known as HBIM [97].
Subsequently, with the implementation of HBIM, there is an increasing amount of management literature on historical architectural heritage, and some research case studies have established HBIM platforms as crucial gateways for visualized information on architectural heritage [83]. Furthermore, HBIM has been used for potential decision-making in heritage maintenance and intervention actions [98], providing more effective pathways for cultural heritage risk management.

In past literature, DTs have been defined in various ways across different domains [99, 100]. Despite some differences in practical applications, researchers unanimously consider bidirectional communication between physical and digital objects as a fundamental characteristic of DTs [99, 101, 102]. Furthermore, some studies have proposed six levels of DT applications in cultural heritage. At the highest level, researchers position it as a means to achieve cultural heritage preservation through the integration of emerging technologies such as big data, cloud computing, 5G, and artificial intelligence [7, 22]. This aligns with the DT theoretical framework proposed by Tao [103, 104], wherein the construction of a DT entity requires the integration of geometric, physical, behavioral, and rule models, with the construction means of behavioral and rule models involving the integration of emerging technologies such as artificial intelligence, neural networks, and machine learning. According to this theoretical definition, we can establish the operational logic of a DT system for cultural heritage risk management as shown in Fig. 2. The DT entity contains all basic information and contextual process data of physical objects, with the capability for real-time information updates, self-awareness, and self-understanding. It can support cultural heritage management through high-precision, visualization, and autonomous data analysis and decision simulation [22]. In theory, physical objects with executable capabilities can achieve visualized autonomous management of cultural heritage risks through data connections [105].

Fig. 2 Operational logic of DTs in cultural heritage risk management

However, when cultural heritage is the physical object of a DT, data collection, transmission, and control at the level of the physical entity can only be achieved through detection devices and monitoring equipment. The heritage entity itself cannot be controlled through signal input or output, so it cannot act as an executor with a bidirectional control port. This matters especially in the face of sudden cultural heritage risks, when management personnel need to intervene directly in the real world by integrating various sources of on-site information [4]. Therefore, when DTs intervene in cultural heritage risk management, how to enhance and amplify human cognition of risks in DTs has become a research focus [58]. For DT systems oriented towards cultural heritage risk management, it is crucial to explore ways of enhancing cultural heritage managers' identification and analysis of risks, thereby achieving effective, informed decision-making regarding risks [44].

X-reality technology

Since Milgram and Kishino [106] introduced the concept of the "Reality-Virtuality Continuum", the classification of different X-reality technologies has been debated in academia. Based on this concept, previous studies have categorized XR into VR, AR, Augmented Virtuality (AV), and Mixed Reality (MR), with MR considered a continuum between the real and fully virtual environments that encompasses AR and AV [56]. However, the limitations of this classification have become apparent as technology advances. [107] suggests that, because current technology allows for clear boundaries between different realities, MR is no longer a continuum containing AR and AV but rather an independent dimension between AR and AV, termed Pure Mixed Reality (PMR). Building on this, some studies have begun to define XR as Extended Reality, incorporating VR, AR, and MR [50, 51]. However, this definition also raises questions, particularly regarding the distinction between AR and MR in practical applications. Although [107] discusses the differences between AR and PMR, the boundaries between them are not clearly defined. In particular, the authors state that PMR can only be achieved through devices such as HoloLens and Magic Leap. Yet in practice, researchers have used HoloLens 2 to test AR applications in fields such as healthcare [108], industry [109], and architecture [110]. Additionally, [111] states (p. 5) that "AR is sometimes subdivided into Mixed Reality, with MR representing another form of AR". From the VR-centric perspective, VR is considered the primary medium above all other formats, with other types of reality being variants of VR [112]. Some studies argue that the term "Extended" does not include VR because reality in VR is replaced. Nevertheless, VR and AR are the two most prominent modes across all of these viewpoints.
Therefore, a new X-reality framework was proposed in [52], where X is a placeholder for any form of reality, emphasizing the strict differentiation between VR and AR in hardware and devices. Additionally, in June 2023, Apple announced the Apple Vision Pro, which provides further evidence for the new XR framework. While the Apple Vision Pro is widely considered a Mixed Reality device, Apple has not officially defined it as such and has not positioned the product as either an AR or a VR device, referring to it only as a spatial computing device. However, its applications cover the entire spectrum from fully virtual to fully real, and the modes are simply referred to as AR and VR modes; the novel interactive experiences the device offers in AR mode are not defined as MR. Hence, in exploring the application of X-reality technologies in cultural heritage risk management through DTs, AR and VR modes emerge as the most prominent. Figure 3 illustrates the various definitions of X-reality technology.

Fig. 3 Definitions of X-reality technology

VR/AR technologies, as immersive technologies, are widely applied in interacting with complex systems, providing users with interactivity, visual behavior, and immersive experiences [54, 112]. They enrich the presentation of visual content for users, bringing about new human–computer interaction paradigms for users' information perception and visual information processing [113]. When exploring the mechanisms of VR/AR in different applications, concepts such as presence, perception of reality, and psychological imagery are considered intermediate nodes of user experience [114,115,116]. However, from the perspective of human–computer interaction, interactivity and immersion remain important system factors for enhancing user experience during the deployment of VR/AR systems [117]. Immersive environments and multimodal interaction methods can assist users in completing information cognition and accurately performing tasks in a shorter time [118]. Although VR and AR are often studied as a category of immersive technologies in many studies [112, 113], it is argued that the degree of immersion in VR and AR affects user experience differently [119]. Additionally, from the perspective of human–computer interaction, it is pointed out that VR and AR should be clearly distinguished in terms of user experience. When developing specific VR/AR applications, it is necessary to consider the specific application service requirements and combine them with the users' expected goals to select appropriate technical means reasonably [120].

Situation awareness

Situation Awareness (SA) is a critical criterion in the design and evaluation of human–machine systems [121,122,123]. Its function lies in driving the reduction of uncertainty in information within complex political, social, and industrial organizations [58]. Among the various definitions of Situation Awareness, Endsley's three-level model based on information processing is widely acknowledged: Level 1 involves the perception of environmental elements; Level 2 involves the understanding of the current situation; and Level 3 involves predicting future states [121]. The concept of SA is widely embraced in the fields of human factors and safety science [30, 124], finding extensive applications in areas such as military combat management [125], law enforcement [126], safety operation management [127], and autonomous driving [128]. It serves as a vital metric for evaluating the effective task performance of operators. Despite ongoing theoretical debates surrounding the structure of SA in academia [129], higher levels of SA are still considered effective tools for promoting operators' shared understanding and calibration of information [130]. The absence of situation awareness can lead to an increased risk of accidents [64, 131, 132].

In the study of situational awareness, another key aspect is its measurement [123]. Previous research has categorized the measurement of situational awareness into three groups: post-accident investigation, direct system performance measurement, and simulation-based direct experimental techniques [131]. Different measurement methods have their advantages under different conditions. For instance, post-accident investigation and direct system performance measurement have greater advantages in real-world environments, but they are limited in how well they can account for the potential influence of other factors. In current situational awareness measurements, some studies have provided physiological measurements, subjective measurements, and other methods [122, 127]. Because they are easy to use, low in cost, and sensitive to different conditions, subjective assessment techniques are frequently employed [133]. The Situation Awareness Global Assessment Technique (SAGAT) [121] and the Situation Awareness Rating Technique (SART) [134] are the most commonly used methods for individuals and teams to date. Research indicates that SAGAT and SART are two entirely different approaches: SAGAT is more suitable for known tasks and outcomes, while SART focuses on general, overall task characteristics and does not involve specific task-related elements [135]. In the context of cultural heritage risk management within the DT framework, the creation of workflows follows open standards [21] to promote interoperability between different information systems and ensure proper maintenance and management. Moreover, considering the diversity of cultural heritage risks, especially with the increasing attention of local stakeholders to heritage management [4], cultural heritage risk management has become a collective effort of heritage managers, experts, and relevant stakeholders. In such circumstances, selecting SART as the measurement indicator for situational awareness is more conducive to an accurate assessment of the system.

To measure SA, SART employs three dimensions: Situation Attention Resource Demand (SART-DAR), Situation Attention Resource Supply (SART-SAR), and Situation Understanding (SART-UOS) [136]. The SA index is calculated using the formula SA = Understanding − (Demand − Supply). According to this formula, the combined perception of the three dimensions determines the user's overall SA. However, each dimension is influenced by its own independent logic, and an imbalance among the dimensions significantly affects the understanding of SA factors [123].
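The index formula can be written as a one-line function. The following is a minimal sketch; the function name `sart_index` and the example dimension scores are illustrative assumptions, not values from this study.

```python
def sart_index(understanding: float, demand: float, supply: float) -> float:
    """Combine the three SART dimension scores into a single SA index.

    Implements SA = Understanding - (Demand - Supply).
    """
    return understanding - (demand - supply)


# Hypothetical dimension scores for one participant.
print(sart_index(understanding=15, demand=12, supply=20))  # 23
```

Because demand enters with a negative sign and supply with a positive sign, the same understanding score yields a higher index when attentional supply exceeds demand, and a lower index when demand exceeds supply.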

Materials and methods

Research hypotheses

VR technology immerses users in a completely virtual world by simulating virtual environments. This immersive experience not only enhances emotional involvement but also deepens users' understanding and experience. Studies have shown that compared to conventional media, virtual reality environments are more conducive to information transmission [137]. Additionally, VR provides various interactive methods, allowing users to explore scenes more intuitively and enhance perception and understanding of on-site scenarios [65]. AR overlays virtual elements onto the real world, providing users with an enhanced visual experience. By integrating virtual objects with the natural environment and overlaying real-time information, AR enables users to gain a comprehensive understanding of on-site scenarios [63]. Research has indicated that using AR can reduce cognitive workload and errors when performing tasks [138]. Based on the aforementioned analysis, the following hypothesis is proposed:

H1

Compared to 2D desktop environments, VR/AR facilitates enhanced situational awareness of cultural heritage risks.

SART, developed by Taylor, evaluates user situational awareness from the perspective of workload paradigms [134], focusing on the knowledge, cognition, and expectations of events, factors, and variables that affect the safe, rapid, and effective completion of tasks. It consists of three dimensions: SART-DAR, SART-SAR, and SART-UOS. SART-DAR measures the attentional demand placed on participants during task execution through system instability, complexity, and variability, representing participants' perception of external tasks. SART-SAR assesses participants' understanding of the scenario during task execution through arousal level, attention, attention allocation, and mental capacity. SART-UOS measures individuals' understanding of complex situations and level of situational awareness through information quantity, information quality, and familiarity. SART-DAR and SART-SAR are primarily used to capture workload. As seen from the SART index formula, the relationship between supply and demand significantly affects users' final level of understanding. SART-UOS is influenced by the relationship between supply and demand as well as the specific performance of the system [123]. In the development of DT VR/AR systems for cultural heritage risk management, distinguishing how different modes affect different dimensions under the same scenario will provide effective support for specific development work.
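The grouping of the ten questionnaire items into the three dimensions can be sketched as follows. The item keys mirror the items named above. The 1 to 7 rating scale and summation within each dimension are assumptions based on the common form of the questionnaire, since the scoring procedure is not stated here, and the example ratings are hypothetical.

```python
# Item-to-dimension grouping, following the items listed in the text.
SART_ITEMS = {
    "demand": ("instability", "complexity", "variability"),
    "supply": ("arousal", "attention", "attention_allocation", "mental_capacity"),
    "understanding": ("information_quantity", "information_quality", "familiarity"),
}


def dimension_scores(ratings: dict[str, int]) -> dict[str, int]:
    """Sum item ratings into the three SART dimension scores.

    Assumption: each item is rated on a 1-7 scale and items within a
    dimension are summed without weighting.
    """
    scores = {}
    for dimension, items in SART_ITEMS.items():
        for item in items:
            if not 1 <= ratings[item] <= 7:
                raise ValueError(f"rating for {item!r} must be between 1 and 7")
        scores[dimension] = sum(ratings[item] for item in items)
    return scores


# Hypothetical ratings from one participant.
example = {
    "instability": 5, "complexity": 6, "variability": 4,
    "arousal": 6, "attention": 5, "attention_allocation": 4, "mental_capacity": 3,
    "information_quantity": 5, "information_quality": 6, "familiarity": 3,
}
print(dimension_scores(example))
# {'demand': 15, 'supply': 18, 'understanding': 14}
```

Keeping the three dimension scores separate, rather than collapsing them immediately into one index, makes it possible to examine how a given technology mode affects demand, supply, and understanding individually.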

A core feature of VR/AR experiences is immersion [54]. From a technological perspective, immersion is defined as the degree to which a computer display can provide participants with inclusive, extensive, surrounding, and vivid illusions of reality [139], offering a wider range of information content and more vivid sensory stimulation compared to traditional media. Psychologically, immersion is a mental state that allows users to perceive themselves as being protected, included, and interacting with the environment. Research has shown that higher levels of immersion environments can provide users with higher quality information presentation and greater accuracy in information understanding [140]. Additionally, immersion can directly promote information perception from sensory dimensions and the existence of non-mediated illusions [141]. Based on the above analysis, the following hypotheses are proposed:

H2

Immersion can reduce users' demands for attention resources (a: VR, b: AR).

H3

Immersion can increase the supply of attention resources (a: VR, b: AR).

H4

Immersion can enhance the understanding of the situation (a: VR, b: AR).

Interactivity is considered another typical feature of VR/AR at the technological level, where stronger interactivity means users can more easily interact with target objects and engage with content [142]. Users can actively participate and gain a deeper understanding of the content in real-time. Psychologically, interactivity is perceived subjectively by users and is related to individual attentional motivation [143]. In complex DT systems, natural and effortless real-time interaction is expected to facilitate users in receiving more information, thus enhancing their comprehensive understanding of the surrounding environment [63]. Based on the above analysis, the following hypotheses are proposed:

H5

Interactivity can reduce users' demands for attention resources (a: VR, b: AR).

H6

Interactivity can increase the supply of attention resources (a: VR, b: AR).

H7

Interactivity can enhance the understanding of the situation (a: VR, b: AR).

Conceptual model

Based on the aforementioned analysis, it is essential to analyze VR and AR as distinct interactive tools and examine their impact on the situational awareness of heritage management professionals. Additionally, we explore the specific mechanisms through which their two core features, immersion and interactivity, influence the various dimensions of situational awareness. This study aims to provide insights into situational awareness factors for the design of DT systems in cultural heritage risk management. Furthermore, considering the diverse requirements in the field of cultural heritage risk management, adopting appropriate interactive methods will better facilitate the cognitive tasks and execution performance of management professionals. Drawing upon the analytical logic of situational awareness [136], we have formulated the research framework as illustrated in Fig. 4.

Fig. 4

Proposed research framework

Scene setting

The Green Integration and Ecological Innovation Team at Guangdong University of Technology provided the foundation for our testing. They have long collaborated with institutions such as the Guangzhou Uprising Memorial Hall and the Autumn Harvest Uprising Memorial Hall in China, focusing on exploring the digital preservation and planning of Chinese cultural heritage. With rich cultural heritage data and advanced management experience, they selected the site of the Autumn Harvest Uprising in China as a case study for this research. This site served as the command center for the Autumn Harvest Uprising and is designated as a key cultural relic protection unit in China, as shown in Fig. 5.

Fig. 5

The site of the Autumn Harvest Uprising

The development of DT systems for cultural heritage risk management revolves around the construction of DT entities. Initially, we systematically reviewed the relevant regulatory frameworks for heritage sites and hierarchically classified the sites. Drawing on the concept of the five-dimensional DT model, we treated individual artifacts as unit-level entities, site areas as system-level entities, and site entities along with surrounding environments and human activities as complex system-level entities, thus constructing a multi-domain model of the heritage site. Subsequently, based on the definitions and attributes of geometric, physical, behavioral, and rule models, we established a multi-dimensional DT of the heritage site and conducted consistency validation, as illustrated in Fig. 6.

Fig. 6

Multi-domain/multi-dimensional entity modeling method for heritage

Next, based on the operational logic of DTs in cultural heritage risk management as illustrated in Fig. 2, we established a cloud/fog/edge collaborative data collection framework for multi-source data perception in the "people-machine-material-environment" context of heritage sites. This framework manages real-time data connection, preprocesses data, converts data, and handles commands to efficiently perceive and process heritage-related information. The perceived data information serves as the basis for modeling the DT ecosystem of heritage sites. Through data feature extraction and the utilization of methods such as self-organizing map neural networks, correlation analysis, and hierarchical analysis, we established the mapping relationship between dynamic heritage data and real-time heritage status. Finally, in accordance with the digital protection requirements of heritage sites, we developed multi-dimensional real-time situation models. Refer to Fig. 7 for details.

Fig. 7

Perception and dynamic modeling method for heritage data

Drawing on the analysis above, we integrated the specific concepts of the relevant models with the requirements of cultural heritage risk management. Based on the content requirements and implementation path of the DT quasi-state model, we first integrated the relevant data of the heritage site in Unity 2020. We then endowed the heritage entity with physical attributes and constructed a DT management system for the Autumn Harvest Uprising site based on a 2D desktop interface, as depicted in Fig. 8.

Fig. 8

Digital twin management system based on 2D desktop

The VR/AR modes were developed on top of the existing databases using the Unity3D development engine, with the assistance of the Steam and MRTK development plugins. Two modes were developed, as depicted in Fig. 9: (1) a VR interactive experience DT management system based on the HTC VIVE head-mounted display, and (2) an AR experience DT management system based on Microsoft HoloLens 2. In these modes, users have basic functionalities such as free movement in space, accessing and viewing information, and controlling facilities and equipment. It is important to note that, due to certain management constraints, we could not directly integrate real-time monitoring data into the system. Instead, we approximated the characteristics of the DT by dynamically transforming virtual data. This allowed users to control infrastructure and provide feedback in the context of dynamically displayed data. The design of our simulated dynamic data received guidance from relevant management personnel.

Fig. 9

Development of a testing program

Participants

Researchers recruited participants from the Environmental Design program at Guangdong University of Technology through online channels. The primary objective of this study was to evaluate the impact of three different modes on users’ subjective perceptions. We ensured that participants had no known history of visual, cognitive, cardiovascular, or neurological issues, and excluded those with a history of dizziness, epilepsy, or inability to tolerate virtual reality [144]. Ultimately, 184 students from the Environmental Design program participated in the study. The participants, aged 22 ± 1.8 years, self-reported being more receptive to VR/AR technologies and capable of adapting to different immersive devices. Although they had heard of our test heritage site, none had visited it in person. Participants were randomly divided into two groups: one group performed tasks in the VR mode (94 participants), while the other group operated in the AR mode (90 participants). This grouping aimed to compare the specific differences between VR and AR in situational awareness across different dimensions, while avoiding the interference of increased experience factors from multiple tests. Despite the slight difference in sample sizes, the total sample size was sufficiently large to ensure statistical power. Participants were instructed to avoid caffeine, alcohol, and prescription drugs 24 h before and during the experiment. All experiments were conducted in a standardized environment using calibrated equipment and predetermined procedures. Data collection utilized standardized scales and tools, and SEM was employed for data analysis to ensure the stability and reliability of the results.

Measurement items

Through a theoretical review of relevant literature, we devised the measurement framework for our study (Table 1), encompassing five scales. To distinguish and verify the impact on different dimensions of situational awareness, we referred to the study by [136] and reorganized the ten SART scales [134] into three independent measurement scales. All measurement items were sourced from existing studies, ensuring sufficient reliability and validity.

Table 1 Measurements

The fourth scale assesses user immersion. While devices can provide technical immersion and the potential for bodily immersion, scholars argue that the user's psychological immersion, encompassing involvement and presence, is the primary measure of immersion [117, 140]. Presence is a subjective psychological response influenced by the user’s psyche and the external real environment [116], while involvement is the experiential focus on the stimulus set [145]. Measurement indicators were formulated based on references [116, 117].

The fifth scale gauges interactivity. Referring to existing literature [143], we adopted four interactivity metrics, with all measurement items evaluated using a seven-point Likert scale [146]. It's worth noting that for specific heritage risk management processes, VR/AR primarily manifests concrete behavioral features in interaction. Consequently, adjustments were made to the interactivity metrics to align with the standards in [57].

Design and procedure

In the context of target-oriented risk management design, discussions were held with the Green Integration and Ecological Innovation Team of Guangdong University of Technology and heritage site managers. As illustrated in Fig. 5, multiple field surveys showed that the most prominent damages at the site were the peeling of wooden artifacts and the weathering of stone artifacts. Analysis by the team revealed that, because the target site contains two courtyard structures, prolonged exposure to high humidity during the rainy season was a major contributing factor to heritage risks. In response, heritage site managers implemented measures such as re-coating wooden artifacts and applying protective materials to stone artifacts. For this experiment, the risk of water seepage in the stone platform of the heritage site was chosen as the test target. Following the procedure outlined in [147], experimenters conducted the experiment as depicted in Fig. 10. Initially, users underwent a pre-experiment phase, estimated at 1 min, in which they were introduced to the system background and experimental requirements and familiarized themselves with equipment usage. Subsequently, users entered the system for risk identification and analysis. The system simulated real rainfall conditions ranging between 10 mm/day and 30 mm/day and was programmed to raise an alert when rainfall exceeded 10 mm/day. The system panel displayed dynamic changes in monitored rainfall, and each alert prompted users to consult expert knowledge and historical records on the risk of water seepage in stone platforms. The system then gradually increased the rainfall amount, allowing users to intervene selectively based on their perceptions, for example by laying plastic protective film, and to decide on post-intervention measures such as filling the stone platform with silicate cement. This phase was estimated at 5 min. Upon completion of system testing, a rating scale appeared, and users rated each parameter based on their experience until they submitted the final question and exited the experiment.
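The alert logic described above can be sketched as follows. This is an illustrative Python sketch, not the Unity implementation used in the study: the 10 mm/day threshold and the 10 to 30 mm/day range come from the text, while the linear ramp, the number of steps, and all function names are assumptions made for the example.

```python
ALERT_THRESHOLD_MM_PER_DAY = 10.0  # alert threshold stated in the procedure


def simulate_rainfall(start=10.0, stop=30.0, steps=5):
    """Return a linearly increasing rainfall series in mm/day (assumed ramp shape)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    delta = (stop - start) / (steps - 1)
    return [start + i * delta for i in range(steps)]


def alert_flags(readings, threshold=ALERT_THRESHOLD_MM_PER_DAY):
    """Flag every reading that strictly exceeds the threshold."""
    return [reading > threshold for reading in readings]


readings = simulate_rainfall()
flags = alert_flags(readings)
for reading, flag in zip(readings, flags):
    status = "ALERT: check seepage risk" if flag else "normal"
    print(f"{reading:5.1f} mm/day -> {status}")
```

With a strict "exceeds" comparison, the lowest simulated value of 10 mm/day does not itself raise an alert; alerts begin only once the ramp moves above the threshold.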

Fig. 10

The experimental process

The experiment is designed to simulate a scenario under the ideal state of DTs, with the experimental content possessing a certain degree of subjectivity and singularity, and the specific time of participants being controlled within the system. Our primary objective is to assess users' situational awareness rather than evaluate the level of risk or determine their ability to adopt correct intervention measures. This rationale underlies the adoption of the SART subjective measurement of users' situational awareness in the experiment. Upon completion of program setup, all participants are required to undergo the first round of experiments in the 2D desktop program. Users view the site model in 360° on a computer display, select the stone platform as prompted by the system, and review relevant information. All interactions are conducted via mouse operations. Subsequently, the two groups of participants undertake the second round of experiments wearing either the HMD or HoloLens 2. In HMD mode, users immerse themselves in the heritage scene and freely navigate using the controller touchpad to select targets and view spatial interface information. In HoloLens 2 mode, spatial models and interfaces are provided, allowing users to comprehensively inspect the site and select targets to view relevant information, aiding in understanding the risk situation (refer to Fig. 11). Post-experiment, participants independently rate their situational awareness, immersion, and interactivity. Given the experiment's division into two separate groups, potential biases stemming from the sequence are not considered, and participant order is randomly assigned [148].

Fig. 11

The experiment demonstration

Statistical analysis

The experiment aimed to initially ascertain the effectiveness of three modes across three levels of situational awareness. To test this, we employed a single-factor repeated measures analysis of variance (ANOVA) to compare differences among all variables of situational awareness across the three levels under different modes [136]. Additionally, two independent experiments were conducted to explore the internal mechanisms of situational awareness in risk management within a DT context. Given the multidimensional relationship between the research structure and hypotheses, we opted for Structural Equation Modeling (SEM) techniques. SEM enables simultaneous testing of relationships between latent structures and offers more robust estimation of data. We utilized the Statistical Package for the Social Sciences (SPSS) and AMOS software for all analyses conducted in this study.

Results

Preliminary analysis

Firstly, we conducted exploratory factor analysis to validate the reliability of the measurement scales. We analyzed two groups of participants: one group experienced the 2D desktop program, while the other group experienced the VR and AR programs separately. The analysis revealed high internal consistency for all scales, as shown in Table 1. Evaluation of Cronbach’s alpha values indicated α > 0.70 for all aggregated scores. Composite reliability (CR) was computed post-factor analysis, with all values exceeding 0.80, and average variance extracted (AVE) was above 0.50. Subsequently, we conducted a normality test on the data using the K-S test in SPSS, which revealed no significant deviations (p > 0.05). With the normal distribution of data confirmed, we proceeded with the first group's comparative analysis.
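For readers who wish to reproduce the reliability thresholds quoted above (α > 0.70, CR > 0.80, AVE > 0.50), the following Python sketch shows the standard formulas. It assumes standardized factor loadings for CR and AVE; the function names and sample values are illustrative and are not the study's data, and the study itself computed these statistics with SPSS and AMOS.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of item columns (one list per item)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent scale totals
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def composite_reliability(loadings):
    """CR from standardized loadings: (sum l)^2 / ((sum l)^2 + sum(1 - l^2))."""
    squared_sum = sum(loadings) ** 2
    error_sum = sum(1 - loading ** 2 for loading in loadings)
    return squared_sum / (squared_sum + error_sum)


def average_variance_extracted(loadings):
    """AVE from standardized loadings: mean of the squared loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


# Hypothetical standardized loadings for a three-item scale.
loadings = [0.8, 0.8, 0.8]
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)

# Hypothetical responses: three perfectly consistent items, four respondents.
alpha = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
print(round(alpha, 3), round(cr, 3), round(ave, 3))
```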

Hypothesis testing

We compared the two groups of participants to distinguish the immersive technologies VR and AR from the conventional 2D desktop. After Bonferroni correction, significant differences were found in the three dimensions of situational awareness between VR mode (SART-DAR: M = 5.6, SART-SAR: M = 5.7, SART-UOS: M = 5.7) and 2D desktop (SART-DAR: M = 5.9, SART-SAR: M = 5.4, SART-UOS: M = 5.4) (t = -3.36, p = 0.008 < 0.05; t = 4.22, p = 0.03 < 0.05; t = 3.59, p = 0.04 < 0.05), indicating that VR can reduce the demand for attentional resources and is superior to 2D desktop in attentional supply and state understanding. Similarly, AR mode (SART-DAR: M = 5.7, SART-SAR: M = 5.8, SART-UOS: M = 5.8) differed from 2D desktop (SART-DAR: M = 5.8, SART-SAR: M = 5.7, SART-UOS: M = 5.6) in the three dimensions of user situational awareness, but the difference was significant only for state understanding (t = 2.83, p = 0.01 < 0.05); the slight advantages in attentional resource demand and supply were not significant. Compared to the traditional 2D desktop mode, VR and AR demonstrated more efficient situational awareness, supporting Hypothesis 1. Additionally, VR and AR showed certain differences, prompting us to further analyze the specific mechanisms affecting situational awareness.
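The comparison logic can be illustrated with a short Python sketch. It assumes a paired design (each participant rates both the 2D desktop and one immersive mode) and a plain Bonferroni adjustment across the three SART dimensions; the study reports only the resulting t and p values, so the paired test, the function names, and all numbers below are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Paired-samples t statistic and degrees of freedom for two rating lists."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t_value = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_value, n - 1


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical SART-UOS ratings for four participants (immersive vs. desktop).
t_value, df = paired_t([6, 5, 7, 6], [5, 5, 6, 5])

# Hypothetical raw p-values for the three SART dimensions.
adjusted = bonferroni([0.01, 0.02, 0.5])
print(t_value, df, adjusted)
```

The adjusted values, rather than the raw ones, are then compared against the 0.05 level.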

We first conducted single-path analysis of the structural models of VR and AR using Amos software, and assessed model fit using tools provided by Gaskin and Lim (2016). The results showed acceptable fit indices for both modes in single-path analysis, meeting the standards for fit indices [149]. Next, we performed multi-group path analysis with VR and AR as grouping variables to test unconstrained basic models, obtaining acceptable fit indices (see Table 2).

Table 2 Model fit measures

We employed SEM to estimate the structural relationships specified by the hypotheses, and the hierarchical SEM results are depicted in Fig. 12. The model fit indices are summarized in Table 2 and compared with the respective critical values, demonstrating a good fit between the model and the empirical data. With the fit of the measurement model being acceptable, we followed the convention that a p-value less than 0.05 is the minimum criterion for statistical significance [150]. Based on these findings, we can preliminarily draw conclusions regarding the hypotheses, as detailed in Table 3.

Fig. 12

Structural model and path coefficients. *p < 0.05; **p < 0.01; ***p < 0.001. Dotted lines indicate non-significant paths, while solid lines indicate significant paths

Table 3 Summary of hypothesis testing

The hypothesis results and associated path coefficients clearly indicate that, in VR mode, user immersion reduces the demand for attentional resources. Simultaneously, it enhances overall situational awareness by increasing attentional supply and understanding of the environment's state. For both VR and AR modes, interactivity enhances situational awareness by increasing attentional supply and understanding of the environment's state. However, in the case of interactivity, a positive impact on attentional resource demand is observed but without statistical significance. In AR mode, two distinct paths show differences: (1) the impact of immersion on attentional resource demand and (2) the positive impact of interactivity on attentional resource demand. To further analyze the significance of these differences, a comparison of the measurement structural models is needed, considering VR and AR as grouping variables. Differential comparisons based on the Critical Ratios Matrix for multiple groups are presented in Table 4. VR and AR exhibit differences primarily in the impact of immersion on attentional demand, and although differences are also observed in interactivity, they are not statistically significant. Therefore, direct attribution of the differences in user experience between VR and AR cannot be made.
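The idea behind a critical-ratio comparison of one path across the two groups can be sketched as below. This is a simplified illustration that assumes the VR and AR samples are independent, so that the standard error of the difference is the root of the summed squared standard errors; it is not a reproduction of the AMOS output, and the coefficients and standard errors are hypothetical.

```python
from math import erfc, sqrt


def critical_ratio(b_first, se_first, b_second, se_second):
    """z statistic for the difference between two independent path estimates."""
    return (b_first - b_second) / sqrt(se_first ** 2 + se_second ** 2)


def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return erfc(abs(z) / sqrt(2))


# Hypothetical unstandardized estimates for one path in each group.
z = critical_ratio(b_first=0.5, se_first=0.3, b_second=0.1, se_second=0.4)
p = two_sided_p(z)
print(round(z, 3), round(p, 3))
```

An absolute critical ratio above 1.96 corresponds to a difference that is significant at the 0.05 level, which is the criterion applied when reading a critical ratios matrix.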

Table 4 Multi-group analysis for the moderating effects of technology types

Discussion

This study begins with the theoretical application of DTs in cultural heritage risk management and explores the effectiveness of X-Reality technologies (VR/AR) in enhancing risk perception in cultural heritage risk management from the perspective of situational awareness. Specifically, it examines the impact of VR/AR application features (immersion and interactivity) on three dimensions of situational awareness (SART-DAR, SART-SAR, SART-UOS). The current research confirms the effectiveness of VR and AR in cultural heritage risk situational awareness and demonstrates similarities and differences in the functional mechanisms influencing situational awareness.

Effectiveness of VR/AR in risk situation awareness

Our study first revealed the effectiveness of VR and AR in cultural heritage risk situation awareness. Notably, the development of SART is influenced by the workload paradigm [123] and SA theory, defined as the difference between comprehension and (demand–supply) [134], providing a theoretical basis for measuring the effectiveness of situation awareness. Our findings indicate that compared to the 2D desktop, users exhibit significantly enhanced situation awareness in VR mode. Specifically, the use of VR reduces SART-DAR and significantly increases SART-SAR, thereby enhancing users' SA. Previous research has demonstrated that in realistic, first-person perspective environments, various contextual dimensions are translated into coherent real scenes [151]. Therefore, compared to the 2D desktop, users' contextual understanding of the target situation has partially integrated into the dynamic virtual space. When perceiving risks in practice, VR reduces users' demand for identifying information while providing them with more cognitive elements of information [152]. Additionally, studies have shown that VR-based interaction can enhance cognitive factors [153]. In our study, VR increased SART-UOS, consistent with previous research logic, allowing users to better recognize decisive factors and support their understanding of the situation. Furthermore, VR facilitates the transmission of implicit knowledge, enhancing users' understanding of information [154].

Regarding AR, our results indicate that compared to the 2D desktop, AR mode is more effective in enhancing situation awareness. We found that AR outperforms the 2D desktop in terms of SART-UOS. This conclusion is supported by previous studies, which have shown that AR can reduce errors, lower cognitive workload, and expedite completion time [138, 155,156,157]. However, in terms of SART-DAR and SART-SAR, we did not observe significant differences. Despite the logical independence of situation awareness across these three dimensions, our experimental results did not reflect causal relationships among them. We first conducted follow-up interviews with participants, revealing that AR and 2D desktop experiences did not provide additional information supply. Participants still relied on objective data and relevant information guidance to understand the current situation, with dynamic DT data display remaining their primary basis for judgment. Upon revisiting past research on AR's role in enhancing situation awareness, we found that most studies focused on information overlay and presentation [63]. However, in this study, the heritage site context did not provide additional information cues, leaving users to rely solely on the information provided by the system for objective judgment.

Similarities and differences in immersion in VR/AR

In our study, we further investigated the impact of immersion and interactivity features in VR and AR on situational awareness. According to our findings, immersion significantly influenced situational awareness in VR mode, consistent with previous research. However, in AR mode, we observed differences in the effect of immersion, particularly on SART-DAR, providing additional evidence for varying levels of user experience between VR and AR [56]. This result can be attributed first to the inherent technical attributes of VR and AR and the corresponding experiential devices. VR places the user entirely within a digital environment, emphasizing remote presentation [158], whereas AR focuses on mutual presentation with reality [53]. In the actual process of remote risk perception, better immersion effectively substitutes for sensory inputs from the real world. As previously analyzed, dynamic and realistic environments are more conducive to reducing users' demand for attentional resources. However, AR's technical attributes necessitate direct involvement in real-world locations, conflicting with remote risk perception in the system context discussed here. Secondly, although we employed HoloLens 2 immersive devices in our study instead of the Pad devices used in previous research [159], the transition between the real and virtual worlds in HoloLens 2 to some extent caused "location illusions" for users, diverting their attention and affecting their perception of the real environment. Both VR and AR positively influenced SART-SAR and SART-UOS.

Similarities and differences in interactivity in VR/AR

The current results indicate that the impact of interactivity on situational awareness differs between VR and AR modes, but the difference is not statistically significant. We found that in both VR and AR modes, interactivity positively influences SART-DAR. Although this effect is not significant in VR, the corresponding path coefficients still provide some reference. This result is unexpected, as interactivity is related to individual attention motivation [143], and previous studies have also shown that VR and AR can enhance situational awareness by providing better interactivity [160]. To further explain this phenomenon, we promptly revisited a few test subjects. According to their recall surveys, the first reason might be the lack of adaptation to hand-based interaction in both devices. In VR, using HTC Vive controllers for interaction may add extra workload, especially when additional hand control is needed in dynamic spaces. More critically, in AR situations, subjects often experienced a cross-over between gestures and reality during information viewing, requiring them to re-identify gestures. This could lead to a negative user experience; previous studies [62] have likewise reported that situational awareness cognitive capabilities are significantly reduced in ARUI mode. Although interactivity imposes a certain burden on SART-DAR, it is undeniable that it enhances situational awareness by positively impacting SART-SAR and SART-UOS.

Conclusion

In the context of DT technology intervention in cultural heritage risk management, this study explores the influence of different X-reality technologies applied on the server side on users' perception of dynamic risks. The results of this study indicate that, in the scenario of remote cultural heritage risk management, the choice of developing VR applications by heritage institutions would be more conducive to promoting the understanding of risks among relevant personnel and guiding decision-making. Compared to AR and traditional 2D desktop programs, VR programs are more advantageous in breaking through spatial barriers, enhancing users' awareness of risk information, reducing the need for active information search, and facilitating a better understanding of the direct data provided by the environment, thus accelerating users' understanding of risk situations. Therefore, in the actual development process of VR programs, especially in the context of DT technology, developers need to fully consider the authenticity of cultural heritage in the virtual space, as vivid virtual spaces will be more conducive to the transmission of implicit knowledge covered by the system. In addition, it is necessary to consider multi-sensory interaction, as interaction in the form of device interaction will increase users' workload. Especially in fully immersive virtual spaces, multi-sensory interactive information queries can timely receive feedback within different spatial ranges of heritage.

AR mode is usually adopted along with VR mode as virtual technologies by heritage sites. Although AR shows certain advantages compared to traditional 2D desktop applications, in the scenario of remote risk management, there is no significant difference between AR and traditional applications in terms of information acquisition and supply. Users’ direct perception of on-site situations and environmental understanding in AR mode do not significantly differ from those in traditional 2D desktop mode, and users still need to rely on the information and actual data provided by the system for judgment. Therefore, AR mode is more suitable for integration with physical entities of heritage. Managers can use AR mode for on-site inspections or risk assessments of cultural heritage. AR adds extra information supply to heritage in the real space, thereby providing users with an understanding of risk situations. In actual DT application services, developers need to consider the presentation method and visual effects of AR information in real space to enhance users' risk perception work. Especially in the context of dynamic information transmission of DT, the information supply of AR mode on the server side should be further studied from the perspective of specific information types, information volume, and cognitive workload, and immersion should not be considered as the main consideration for AR application development in this scenario.

Limits and perspectives

This study has certain limitations, as with all research. Firstly, and most importantly, we did not examine the performance of situational awareness measurement throughout the experiment. We only conducted subjective measurements based on the SART method, exploring users' subjective perceptions under different interaction modes, without assessing whether these modes actually contribute to cultivating correct situational awareness. Secondly, the sample sizes (n = 94 and n = 90) are relatively small, especially considering the subjective measurement method used in this study. However, the sample size was chosen to meet the minimum sample recommended for statistical analysis, indicating that it can generate reliable statistical results to a certain extent. Future research may expand on this by collecting data from a broader range of samples. Lastly, in terms of the selection of VR and AR impact mechanisms, we only considered immersion and interactivity as technological attributes. However, exploring the impact of X-reality technologies on situational awareness should take into account more factors (such as perception of realism and user enjoyment), which were not deeply explored in this study.