1 Introduction

Digital twins (DTs) have gained traction in recent years in both academia and industry. Although there is no consensus on the formal definition of a DT, most published research agrees that a DT is a cyber-physical system (CPS) consisting of at least three components, namely, a physical system, a virtual model and the connections between the two [1,2,3].

In academia, a wide range of DT applications in science and engineering has been studied, including machine tool life management, product health management (PHM), smart cities and patient health monitoring [4,5,6,7,8]. In industry, many companies, such as Siemens, already offer software and platform solutions for the creation of DTs to their industrial partners. However, some critics have claimed that these are not true DTs but digital shadows because they lack bidirectional communication [9], i.e., changes in the physical system affect the virtual model, but changes in the virtual model do not affect the physical system. A true DT must include bidirectional communication rather than merely a virtual model that updates according to a physical system.

DT technology is garnering interest among non-academic entities, such as governments and industries, because accurate, real-time simulations have the potential to aid decision-making processes. The initial idea of a twin was proposed by NASA in the 1980s as a method to monitor the status of an aircraft in space from Earth [10]. This concept was that of a physical twin rather than a DT, i.e., two identical copies of the same machine would be built, one kept on Earth and the other sent to space. The twin on Earth would be subjected, as closely as possible, to the same conditions as the twin in space. This method produced many inaccuracies due to differences in manufacturing flaws between the two twins and the inability to ensure that both twins were subject to identical operating conditions. With advances in computation and Internet of Things (IoT) technologies, the DT concept was proposed by Grieves to address these issues [11]. Sensors gather real-time data from the physical system and send them through an IoT network to the virtual twin, which updates itself to reflect accurately the current state of the physical system being twinned. Figure 1 illustrates the concept of a DT.

Fig. 1 General framework of a DT [1]

The Land Transport Authority (LTA) of Singapore is carrying out a DT project as part of Singapore's Smart Nation Initiative. LTA will map and virtualize the entire country, creating a tool that will guide policy making by allowing potential policies to be tested virtually before implementation [6]. This shows the great potential of DTs that can be realized with current and possibly future technologies. As such, interest in DTs will likely continue to grow in Singapore, since the government is leading the way in this new technology.

As interest in DTs grows, more users with less technical knowledge of data processing and simulation will have to make sense of the information presented by DTs. Real-time data and information bring no benefit if there is no proper visualization method to present the complex information to the users in a concise and uncluttered manner [12]. However, in the literature reviewed to date, the focus has been on the application development of DTs, and little work has been reported on human interaction with DTs and the wealth of information available. One of the purposes of a DT is to assist users in decision making on the systems being simulated; thus, it is important that an intuitive and natural mode of interaction exists between non-expert users and a DT model to make the most of the information available. A potential solution is augmented reality (AR), which overlays virtual information on physical objects and is therefore highly suitable for data visualization.

AR is a technology that overlays virtual objects over the physical world to provide users with more information, allowing users to interact with the digital world. Thus far, users have been used to interacting with digital objects and information in the 2D space of screens and monitors. AR seeks to expand user interaction with digital information and objects by bringing them to the real world, creating 3D experiences of digital information in a setting that users are most familiar with [13].

Research has demonstrated that maintenance actions can be translated effectively from words to symbols and displayed via AR in such a way that they are easy to understand [14]. As users tend to learn and absorb information more effectively through visuals than words [15], an AR projection of maintenance instructions onto a machine would allow users to interact with and perform maintenance actions much more efficiently than looking up the instructions in a physical manual [16].

It has already been demonstrated that AR can be utilized to aid users in a multitude of manufacturing processes, from providing guidance for maintenance procedures to training novice workers [17]. Therefore, a DT can be integrated with AR by using the DT to monitor the state of a system and AR to display data and information about the system to the users accordingly. Figure 2 shows a framework for the implementation of an AR human-machine interface (HMI) for DT. In this framework, sensors collect data from a physical twin and send them to the DT for analysis and storage. The digital model uses both real-time and historical data for data analysis, the results of which can be viewed by the users through an AR HMI. The users can interact with the DT through the user interface.

Fig. 2 AR human-machine interface (HMI) for DT

A comprehensive literature review of DT research has been conducted. A search for the keyword “digital twin” was performed on Google Scholar for publications from 2016 to 2020, and the results were sorted by relevance; the first 100 of the 8 800 results were reviewed in this paper. A further search with keywords “augmented reality” and “digital twin” was performed for publications from 2017 to 2021, and the results were sorted by relevance; the first 20 publications were reviewed in this paper. Table 1 gives the abbreviations that are used in Tables 2 and 3. Table 2 summarizes the research articles and reports on DT while Table 3 summarizes research reported on integration of DT with AR.

Table 1 Abbreviations
Table 2 Summary of DT papers
Table 3 Summary of DT-AR papers

2 Related works

2.1 DT definitions

There are many definitions of DTs. As the volume of DT-related research increases, a few core components that are present in every DT development have emerged.

A large number of researchers have defined DT in their own terms, as shown in Table 4, while other researchers have cited existing definitions. Among the definitions reported, “an integrated multiphysics, multiscale, probabilistic simulation of an as-built vehicle or system that uses the best available physical models, sensor updates, fleet history, etc., to mirror the life of its corresponding flying twin” [106] is the most cited DT definition.

Table 4 Definitions of DT

Despite the many different definitions of DTs, they share the commonality that there must exist a physical system with a digital representation or virtual model of that system, and that the digital model mirrors the physical system. The mirroring capability can be achieved only through data exchange, which necessitates sensors installed on the physical system to collect and transmit data through a network to the virtual model, as shown in Fig. 3. In Fig. 3, the sensors collect data from the physical system and send them to the digital model through an IoT network. The digital model processes the input data and determines the appropriate control data to send to the physical system through the IoT network. The physical system actuates according to the control data received from the digital model. Therefore, the three essential components of a DT system are as follows.

  (i) A self-aware physical system enabled by sensors.

  (ii) A virtual representation that updates using available data.

  (iii) A network that connects all elements in the physical system and the virtual representation.

Fig. 3 Essential components of a DT
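The data flow described for Fig. 3 can be sketched as a single update cycle. The following is a minimal illustration only, not an implementation taken from any of the reviewed papers: the class names, the temperature variable, the threshold of 75 and the toy plant dynamics are all hypothetical, and the IoT network is reduced to direct function calls.

```python
from dataclasses import dataclass


@dataclass
class PhysicalSystem:
    """Stand-in for a sensor-equipped machine (hypothetical example)."""
    temperature: float = 80.0  # assumed sensor quantity, in degrees Celsius
    fan_on: bool = False

    def read_sensor(self) -> float:
        return self.temperature

    def actuate(self, fan_on: bool) -> None:
        self.fan_on = fan_on
        # Toy plant dynamics: the fan cools the machine, otherwise it heats up.
        self.temperature += -5.0 if fan_on else 2.0


@dataclass
class VirtualModel:
    """Virtual representation that mirrors the last reported state."""
    mirrored_temperature: float = 0.0
    limit: float = 75.0  # assumed threshold for switching the fan on

    def update(self, reading: float) -> None:
        self.mirrored_temperature = reading

    def control(self) -> bool:
        return self.mirrored_temperature > self.limit


def run_cycle(physical: PhysicalSystem, model: VirtualModel) -> bool:
    """One pass through the loop: sense, mirror, decide, actuate."""
    reading = physical.read_sensor()  # sensors collect data
    model.update(reading)             # data reach the virtual model
    command = model.control()         # model determines control data
    physical.actuate(command)         # physical system actuates
    return command
```

In a real deployment, the two direct calls between the objects would be replaced by messages carried over the network component (iii).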

2.2 Reviews

There are eight review papers [2, 9, 23,24,25,26, 28, 42] that focus on DT research, with Ref. [26] being the most recent. Many of these reviews categorize DTs based on the level of direct control between the virtual models and the physical systems, namely, no interaction, unidirectional and bidirectional, or digital models, digital shadows and digital twins, respectively [9, 24, 25]. Unidirectional models requiring human-in-the-loop feedback are the most dominant, with the most implementation cases reported, whereas implementations of true bidirectional DTs are rare. Although unidirectional models dominate DT research, effective HMI between human users and the DT has not been discussed in these reviews. The review articles state clearly that big data are a key element of DTs. However, none of them mentions the use of machine learning to leverage big data.

The review presented in this paper builds upon the foundation laid by past reviews and categorizes existing DT research based on its features to identify the current state of the art. New research issues and challenges are identified and presented, and possible future directions for DT research are discussed.

2.3 Notable DT applications

In recent years, DT research has moved out of its infancy into more practical usage in different engineering services. Luo et al. [4] proposed a method by which a DT could be used to predict faults in CNC machine tools and display solutions to these faults. Using an open-source network protocol, a connection is established among a database of past DT data, the CNC control system and a DT model. Data from these three systems are fed into an expert system, which uses a series of rules to match the predicted faults with potential solutions.

Tao et al. [5] established a framework where DT could be used to provide engineering services. The model suggests that beyond the three basic components of the virtual model, the physical system and the connection between them, the DT should also provide services and collect data. In the application of DT in product health management (PHM), services provided would include the prediction of faults and suggestion of maintenance strategies, based on the data from both the virtual model and the physical system.

As cities become increasingly populated and resources run scarce, there is a pressing need to increase the efficiency of resource utilization. One method is to develop a smart city where technology and smart monitoring of stockpiles and usage patterns can help policy makers deploy resources more efficiently. In the DT project in the Singapore Smart Nation initiative, various elements of the city are mapped and 3D scanned using aerial and street-level LiDAR to create accurate 3D models of the city [6]. With a digital model of the entire country, governmental entities, and companies with the approval of the government, can run simulations of possible outcomes before committing to policy decisions.

DTs have been used in the healthcare sector to monitor individual patients, providing healthcare workers with the real-time status of patients and potential tools to treat patients remotely if it is inconvenient for them to visit healthcare facilities. Although DTs are not mentioned specifically, Mamatha [7] created a patient monitoring and robotic treatment system that falls under the definition of a DT. In her research, an Algorithmic State Machine chart (ASM chart) was implemented within LabVIEW as a digital prototype of a planned patient care system that would monitor the status of patients through a series of sensors. A microcontroller would then make decisions based on the ASM chart and actuate a robot arm to perform the appropriate actions. However, there is no mention of a connection between the physical system and the virtual model in this research. Doukas and Maglogiannis [8] proposed a method of creating body sensor networks using Arduinos and IoT to collect and analyze data for healthcare purposes.

Implementation of DT that updates with sensor readings from a physical system in real-time has been proven possible. Information can be extracted from the virtual model and viewed via an Android device. Frontoni et al. [107] demonstrated the creation of a web-based DT with an update delay of about 7 s between the physical system and the virtual model.

2.4 Enabling technologies

2.4.1 IoT and underlying network protocols

IoT is a key enabling technology for DT, utilizing networking protocols to connect multiple devices [90, 108]. To realize a full-scale DT, a large number of sensors across multiple devices must be connected to a common network, which stores and routes data such that relevant data are always available to the virtual model for computation and prediction. The management of connections and routing in a network is achieved through protocols. These protocols facilitate the transmission of data among sensor nodes, data storage and the digital model. In this section, some common protocols used to support DTs are discussed. Each protocol has different properties and is suitable for different applications. Popular protocols include open platform communication (OPC), OPC unified architecture (OPC UA), transmission control protocol/internet protocol (TCP/IP), message queuing telemetry transport (MQTT) and Zigbee [9].

OPC is a Microsoft Windows-based protocol that uses the distributed component object model (DCOM) to create a network between devices. The original OPC was built only on and for machines with Windows operating systems and lacks security features. The OPC Foundation has since updated the OPC protocol to OPC UA to address these issues. OPC UA is a service-oriented protocol that can be deployed across machines with different operating systems [109]. It was designed for communication across networks of machines, making it highly appropriate for IoT and DT applications, where many different machines must interface and share data.

TCP/IP is the most common protocol suite for inter-machine connections, as it underlies the Internet and the World Wide Web (WWW). Though not built specifically for IoT purposes, its ubiquity is convenient for any system designed to communicate through it, as most modern devices with internet connectivity can connect to other devices with TCP/IP [110]. Thus, using TCP/IP as the underlying networking protocol allows a system to be ported more easily to another set of machines, since devices that can run TCP/IP are readily available and the infrastructure to support it is prevalent.

MQTT runs on top of TCP/IP and provides a method to route data reliably and easily through a star network topology using a publish/subscribe paradigm. MQTT networks comprise clients connected to a broker through TCP. Each client can subscribe to a set of topics, and any client can publish messages to any topic. When a client publishes a message to a topic, it sends the message to the broker, and the broker forwards the same message to every client subscribed to that topic, including the sender if the sender is also subscribed [111, 112]. This protocol allows for convenient data management: each machine receiving information can limit its subscriptions to the topics carrying data it is interested in, while a message from a sending machine is delivered only to the machines that are interested in its topic.
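The publish/subscribe routing described above can be illustrated with a small in-memory model of a broker. This is a sketch of the paradigm only, not of the MQTT wire protocol: there is no TCP connection, no quality-of-service level and no wildcard topic matching, and the class and topic names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A subscriber is modelled as a callback receiving (topic, payload).
Callback = Callable[[str, str], None]


class Broker:
    """In-memory stand-in for the central broker of a star topology."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        """Register a client callback for one exact topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, payload: str) -> int:
        """Forward the payload to every subscriber of the topic.

        Returns the number of subscribers that received the message.
        """
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)
```

A virtual model that subscribes only to, say, a hypothetical `machine1/temperature` topic never receives messages published to other topics, which is the data management benefit noted in the text.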

Zigbee is an IEEE 802.15.4 compliant protocol with low power consumption and low data transmission rates that runs on short-range radio frequency devices. The Zigbee protocol relays information from node to node in a mesh network topology until the data reach their intended target node. This relaying overcomes the short-range limitation between two communicating devices. Zigbee is suitable for applications where sensors are installed at remote locations without easy access to a power source (such as wearable sensors for healthcare and sensors in hard-to-reach locations), as the low-power nature of the protocol allows it to run for prolonged periods on battery-powered radio transceivers [113].
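The node-to-node relaying can be pictured as finding a multi-hop path through a graph of short-range links. The sketch below uses a plain breadth-first search as a stand-in; it does not reproduce the route discovery mechanism of an actual Zigbee stack, and the node names and link table are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional


def relay_path(links: Dict[str, List[str]], source: str,
               target: str) -> Optional[List[str]]:
    """Return a path with the fewest hops from source to target, or None.

    `links` maps each node to the neighbours within its radio range.
    """
    previous: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the predecessors to rebuild the path.
            path: List[str] = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbour in links.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None
```

Here a sensor that cannot reach the coordinator directly still delivers its data as long as a chain of intermediate nodes exists.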

2.4.2 Available software

Industry and academia differ in their software preferences for DT implementation. In industry, proprietary all-in-one software solutions designed specifically for DT and IoT, such as General Electric’s Predix, Siemens’ MindSphere, PTC’s ThingWorx, IBM’s Watson IoT platform, Dassault’s 3DEXPERIENCE, Foxconn’s Beacon and ANSYS, dominate development and implementation [25, 53]. In academia, more flexible and general analytics and simulation software, such as Matlab [9, 18, 41, 82] and Unity [46, 69, 93, 94], is preferred.

2.5 Augmented reality

AR can create a more interactive and immersive experience for complex engineering software solutions. Huang et al. [114] created a framework and system that integrated finite element analysis (FEA) with AR. In this system, users can interact with a physical object and an FEA model of the same object simultaneously. Through AR, an FEA model of an object is overlaid onto the physical object itself. When a user applies force(s) to the physical object, the sensors read and send these force signals to the FEA software, which updates the FEA model and computation accordingly, and displays the updated results over the physical object. This provides an intuitive method for visualizing and interacting with FEA models while at the same time providing users with a wealth of information beyond what is normally available for a physical prototype.

Bruno et al. [115] created a video see-through (VST) AR interface for computational fluid dynamics (CFD) using a webcam and a tablet computer. Reverse engineering methods are employed to scan a real object and create a CAD model that is fed into CFD software. Flow simulations from the CFD software are superimposed over the physical object as visible streamlines via the VST device.

AR can be used as a platform that allows for a high level of interactivity between non-experts in data science and complex data, as demonstrated by Selter et al. [116]. In the project reported in Ref. [116], participants of an island planning workshop were shown a 3D representation of the island that had been altered to reflect the future states of different features under the influence of different policies that could be implemented. The participants could move through the simulated environment and experience the possible consequences of different policy decisions interactively, as opposed to reading abstract numbers and descriptions of consequences. This allows non-experts in numerical simulation to gain a much more intuitive understanding of the potential impacts of different policies.

In medicine, AR has been used to overlay magnetic resonance (MR) images over a patient’s body, allowing the doctors and surgeons to have an “X-ray vision” [117]. Although the research is not related directly to engineering services, a similar “X-ray vision” concept can be applied to monitoring and maintenance of machines through integration of AR with DT. Using a DT, the state of a machine can be monitored and simulated; using an AR interface, the internal structures and states of the machine can be overlaid onto the machine similar to how an MR image can be overlaid onto a patient. This provides users of the machine with an intuitive and comprehensive method to diagnose faults in a machine without having to dismantle the entire machine.

2.5.1 Display and interaction methods for AR

AR is a technology for overlaying virtual objects and information onto physical spaces. The display methods and the forms of interaction between a user and these digital objects vary depending on the applications and their end goals. Common display methods include head mounted displays (HMDs), handheld devices (HHDs), monitors with peripherals and the cave automatic virtual environment (CAVE).

HMDs are wearable devices that project images directly into a user’s eye or onto a small screen close to the front of the eye. These HMDs can be classified as optical see-through (OST) or video see-through (VST) devices. An OST device projects images onto a transparent surface in front of a user’s eye, such that the user can see the displayed content and the environment on the opposite side of the transparent surface at the same time. Examples of OST devices include the Microsoft HoloLens and Google Glass. A VST device has a camera and either a projector or a screen. The camera captures images of the environment that a user would normally see, and the VST device displays these images, augmented with virtual content, to the user using either the projector or the screen [13, 118]. While most of these devices are relatively expensive and less widely adopted, they offer the benefit of being hands-free, as the users are free to use their hands to perform other tasks while viewing the AR content through the displays.

Handheld devices (HHDs) are the most ubiquitous and familiar devices for most users. They are interactive devices with cameras and screens that can be held in the hand. Examples include smartphones and tablets, which are intuitive for the public to interact with [119]. These devices are similar to VST HMDs in that virtual objects are overlaid onto a captured image rather than projected onto a real space. AR applications for HHDs are easy to develop because the devices are ubiquitous and software development kits are readily available [120]. However, users cannot use one or both of their hands for other tasks while interacting with HHD-based AR applications.

A cave automatic virtual environment (CAVE) is a fully immersive AR environment where an entire room is used to display virtual content. Three to six walls of a room have virtual content projected onto them and a user can interact with these contents using their actions and gestures within the room. The interaction is achieved through motion capture or with elements on user interfaces (UIs) projected onto the walls [121]. However, CAVE has the constraint of being immobile and costly.

2.5.2 Software

Almost all implementations of AR in recent research are developed using the Unity 3D engine with the Vuforia plug-in [96, 97, 101,102,103]. Unity 3D is a 3D game engine that has gained popularity within research, as virtual 3D environments, objects and their interactions can be created easily within the engine environment. The behaviors of these environments, objects and interactions can be controlled through scripts and plug-ins. Vuforia is a plug-in created specifically for the implementation of AR applications; it has a robust tracking algorithm and supports the use of 2D markers, object markers and ground plane detection for anchoring a virtual space onto a physical space. The Unity 3D engine supports network connectivity through various protocols, including TCP and MQTT, allowing for data exchange between AR interfaces created using Unity 3D and DT models connected to IoT networks.

3 Classification of DTs

Currently, there are no unifying standards for DTs and their implementation. There exists a multitude of different DT implementations with different features.

The application, key implementation features and functions provided by the DTs that have been reported are identified to classify these DT implementations and better understand the trends and future directions in DT research. The main implementation differences of DTs include the level of interaction between the physical system and virtual model, the method of modeling the virtual twin, the software used to realize the DT, and the IoT protocol used to connect the physical system with the virtual model.

3.1 Data flow between physical system and virtual model

Many DT implementations have different levels of control and integration between the physical system and digital model. These different levels of interaction can be classified into three broad categories as shown in Table 5.

Table 5 Levels of control

In this paper, the level of control is defined as the direct machine-to-machine communication between a physical system and its virtual model and vice versa without human interaction. Based on this definition, a traditional numerical simulation of a system would be classified as no interaction, as there is no physical system sending data to the virtual model. The vast majority of the reported DTs have unidirectional communication, as there exists a physical system actively sending data to a virtual model, which self-updates according to the data received. However, these systems require human input for the insights provided by the virtual model to actualize a change in the physical system. Bidirectional communication is an extension of the unidirectional system where the virtual model automatically actuates the physical system based on insights gained from accurate simulation, without human intervention. Figure 4 illustrates the three possible levels of control.

Fig. 4 Levels of control between physical system and digital model
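The three levels of control reduce to two questions about automatic data flow, which can be written down as a small decision rule. The sketch below is an illustration with hypothetical names; the labels follow the digital model, digital shadow and digital twin terminology cited in Section 2.2. The case of a virtual model that actuates a physical system without receiving data from it is not covered by the three categories, so it is rejected here as an assumption of this sketch.

```python
from enum import Enum


class ControlLevel(Enum):
    NO_INTERACTION = "digital model"
    UNIDIRECTIONAL = "digital shadow"
    BIDIRECTIONAL = "digital twin"


def classify(physical_to_virtual: bool, virtual_to_physical: bool) -> ControlLevel:
    """Classify an implementation by its automatic (human-free) data flows."""
    if physical_to_virtual and virtual_to_physical:
        return ControlLevel.BIDIRECTIONAL
    if physical_to_virtual:
        return ControlLevel.UNIDIRECTIONAL
    if virtual_to_physical:
        # Not one of the three categories discussed in the text.
        raise ValueError("virtual-to-physical flow alone is not classified")
    return ControlLevel.NO_INTERACTION
```

A system in which a human reads the model output and then adjusts the machine counts as unidirectional, because the return path is not machine-to-machine.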

3.2 Model type

Virtual modeling and simulation are the core aspects of a DT. Various types of virtual representations of physical systems exist within the DT research as shown in Table 6. Most DT implementations represent the physical system with more than one type of model, depending on the overall requirements of the DT.

Table 6 Types of model representation

3.3 Network protocol

Rapid progress in IoT technologies has made DTs viable, as data transmission rates now allow for real-time updates of complex systems. The control of IoT networks and the facilitation of communication between different devices require the use of network protocols. Many protocols exist, and each has its pros and cons. The majority of DT research is application focused and employs different protocols to suit its needs. Table 7 shows the different protocols used in the literature reviewed.

Table 7 IoT protocols

3.4 Simulation software

The software used for creating and running the virtual models is diverse. With the notable exceptions of Matlab and Unity, the simulation software used in DT research varies widely and is often unique to each study. Table 8 shows the different simulation software used.

Table 8 Simulation software

4 State of the art

4.1 AR integration

DTs contain a large amount of data, and the virtual models are comprehensive, high-fidelity simulations of the physical systems. To allow users to interact with these data effectively and intuitively, immersive HMI technologies, such as AR [93] and VR [94], can be employed.

Schroeder et al. [93] devised a method to view DT data through an AR interface via web services. Using HTTP requests, the AR client device can query for data about the current state of the DT from a server running the DT simulation. The server can return a message formatted in JSON or XML back to the client, which will process and use this message to update the AR display. This allows users to access their desired information remotely.
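The request-and-response exchange can be illustrated by its message format alone. The sketch below is not the interface of Ref. [93]: the HTTP transport is omitted, and the function names and state fields (`spindle_rpm`, `temperature_c`) are hypothetical. It shows a server serialising the DT state as JSON and a client turning the message into text labels for an AR display.

```python
import json
from typing import Dict, List, Union

Number = Union[int, float]


def encode_state(state: Dict[str, Number]) -> str:
    """Server side: serialise the current DT state as a JSON message."""
    return json.dumps(state, sort_keys=True)


def overlay_labels(message: str) -> List[str]:
    """Client side: parse the JSON message into labels for the AR overlay."""
    state = json.loads(message)
    return [f"{name}: {value}" for name, value in sorted(state.items())]
```

Because the message is plain text, the same server can serve any client device that is able to issue the request and parse JSON.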

Wu et al. [94] chose to display DT information through VR by mapping the physical space onto a virtual space. Using Unity 3D, they created a virtual space filled with machines identical to those on an existing shop floor. Using a network implemented through OPC, data are fed from the physical machines to the virtual environment to update the virtual machines; web services handle the formatting of the data circulated in the network. Users can be immersed in the shop floor even if they are not physically present, as the virtual environment is updated in real time. This VR implementation has the potential to allow engineers and managers to monitor production lines remotely.

Zhu et al. [96] created an AR display system that overlaid a DT model and process data onto a CNC milling machine using a Microsoft HoloLens. A DT of a milling machine is created, which stores data collected from a physical CNC milling machine during machining processes and updates accordingly. The data are used to simulate the machining processes and predict potential problems, such as collisions. The digital model of the machine, the collected data and the DT predictions are overlaid onto the physical milling machine through a HoloLens. Users can use the audio and gesture control supported by the HoloLens to interact with and control the physical machine through the AR interface. Liu et al. [104] created a similar system using more sophisticated simulation methods, so that the DT was able to simulate the roughness and surface temperature of a workpiece undergoing machining. The AR display is able to overlay the roughness and temperature information onto the workpiece being machined.

A key motivation for displaying DT information using AR interfaces is to provide non-technical users with intuitive access to pertinent information. Sepasgozar [100] developed a pedagogy that delivered construction and engineering lessons using AR and DT technologies. Sepasgozar’s research was motivated by the fact that most students were unable to visit an actual construction site or construction machines due to spatial constraints. Furthermore, COVID-19 has elevated the importance of remote learning. Five different mixed reality systems were created for education purposes. Data collection and cloud storage of information from a real construction site and construction machinery are the core of these systems. The collected data were used to create the different mixed reality systems, including the DT of an excavator, where users could use an AR interface to interact with the excavator.

Müller et al. [105] utilized AR tracking technologies to determine the position of a smart-pointing-device (SPD) to perform programming by demonstration of a robot arm. A motion tracking system is used to track the movement of an SPD operated by a human worker to perform a specific task. The tracked motion of the SPD is converted into a trajectory for the end-effector, movements of individual joints, and controls for the individual actuators in the robot arm, to allow the robot arm to perform the demonstrated action.

4.2 Data collection

Some researchers have suggested the utilization of data fusion to analyze and increase the fidelity of the data collected and generated by the DT in order to increase the reliability of sensor data. Tao et al. [2] suggested that data fusion is an enabler of DTs due to the massive amount of data that DTs would be required to process. Nikolakis et al. [38] developed a method to apply data fusion in the creation of a DT of a human worker. They used multiple different sensors to track the movements of a human factory worker and reassembled the data through sensor fusion to synchronize the movements of the DT counterpart in the virtual environment.

Fuller et al. [25] suggested that for DT applications, data fusion could be used to augment physical data with virtual data. Virtual data can be obtained from the predictions of the virtual model, and sensor fusion can be applied to these data together with data from the physical system to gain deeper insight into the physical system.
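One simple way to combine a physical measurement with a virtual prediction is inverse-variance weighting, in which the less uncertain source receives the larger weight. The reviewed papers do not prescribe a particular fusion method, so the sketch below is only one possible choice, and the function and parameter names are hypothetical.

```python
from typing import Tuple


def fuse(measured: float, measured_var: float,
         predicted: float, predicted_var: float) -> Tuple[float, float]:
    """Fuse a sensor reading with a model prediction.

    Each value is weighted by the inverse of its variance. Returns the
    fused estimate and its variance, which is never larger than either
    input variance.
    """
    if measured_var <= 0 or predicted_var <= 0:
        raise ValueError("variances must be positive")
    w_measured = 1.0 / measured_var
    w_predicted = 1.0 / predicted_var
    total = w_measured + w_predicted
    fused = (w_measured * measured + w_predicted * predicted) / total
    return fused, 1.0 / total
```

With equal variances the result is the plain average of the two values; as the prediction becomes less certain, the estimate moves toward the sensor reading.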

4.3 Modelling

A physical system can be represented virtually in many forms, and many different types of virtual models exist among DT implementations. Some models reflect the systems’ geometry, position and rotation, while others track the states of the systems using mathematical formulations. Zheng et al. [30] produced a DT that reflected the geometry and position of a system by creating the DT using CAD models of the physical system. The virtual model and physical system are fully integrated, and delays between the two are kept below 1 s. Ganguli et al. [67] proposed a method to model dynamic physical systems as DTs using differential equations. Zhang et al. [52] created a method that combined both geometric and analytical modeling, where a geometric model of a shop floor was created and updated according to the data collected. At the same time, predictive information about machine tools, such as the expected tool life, is estimated using a neural network that is fed both simulation and physical data.
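To make the idea of a differential-equation-based virtual model concrete, the sketch below tracks the state of a generic mass-spring-damper system, m·x'' + c·x' + k·x = F. It does not reproduce the formulation of Ganguli et al. [67]. The class name, the semi-implicit Euler integrator and the simple measurement correction in `synchronize` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MassSpringDamperTwin:
    """Hypothetical virtual model governed by m*x'' + c*x' + k*x = F."""
    mass: float        # kg
    damping: float     # N*s/m
    stiffness: float   # N/m
    position: float = 0.0  # m
    velocity: float = 0.0  # m/s

    def step(self, force: float, dt: float) -> None:
        """Advance the model state by dt seconds (semi-implicit Euler)."""
        acceleration = (
            force - self.damping * self.velocity - self.stiffness * self.position
        ) / self.mass
        self.velocity += acceleration * dt   # update velocity first
        self.position += self.velocity * dt  # then position with the new velocity

    def synchronize(self, measured_position: float, gain: float = 0.5) -> None:
        """Nudge the model toward a sensor reading (assumed correction rule)."""
        self.position += gain * (measured_position - self.position)


# Simulate 20 s under a constant 8 N load; the state should settle near F/k.
twin = MassSpringDamperTwin(mass=1.0, damping=2.0, stiffness=4.0)
for _ in range(20000):
    twin.step(force=8.0, dt=0.001)
```

In a DT setting, `step` would be called between sensor updates to predict the state, and `synchronize` would be called whenever a new measurement arrives from the physical system.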

4.4 Data networks

DT is a technology that relies on big data to be effective. Therefore, network management and data storage and distribution technologies are essential to the success of DTs. As discussed in Section 2.4.1, network management is achieved via the OPC protocol and its derivatives. The OPC and OPC-UA protocols are convenient for industrial applications as many IoT-enabled industrial machines use these protocols. Many proprietary DT software packages (e.g., Siemens NX) also connect to machines using these protocols [92].

For data storage and distribution, Qi et al. [53] identified cloud-based NoSQL and NewSQL as the emerging frontiers as they were able to store and distribute big data effectively unlike traditional data servers, which were not designed to deal with the current data explosion.

4.5 DT as a service

The knowledge and software necessary to create DTs are available and well researched at the current stage. As some researchers have suggested, DT research should go beyond basic implementation and delve into the potential services and functions that can be provided by such a DT [1, 2, 5, 9, 22, 23, 25, 29, 31, 34, 42, 49, 51,52,53, 59, 61, 70, 73, 77, 83, 93, 94]. The most common services mentioned are cloud-based DTs and product health management.

Cloud-based DTs, which are hosted on a distributed network, allow users to access the DTs wherever there is a connection to the host. Compared to a localized DT, where users must be onsite to access the DT, this offers the advantage of retrieving data from the twinned systems and controlling them remotely. Alam et al. [31] suggested a reference model for cloud-based DTs. In their model, the connections, data processing and data formatting are defined clearly. The cloud-based DT in this model augments the physical system by offering services that would be impossible without a virtual implementation of the physical system, such as data processing and monitoring network connection status within the physical system.

Liu et al. [32] suggested using a cloud-based DT system for healthcare purposes. A DT of a patient is created and updated using data from sensors attached to the body of the patient. Since this DT is hosted on the cloud, patients and their doctors can access the DT from anywhere using any device with internet connectivity. This DT can provide early warnings to patients when sudden anomalies or future complications are detected through the DT.

5 Research challenges

5.1 Difference between industry and academia DT implementation

As previously mentioned, most DT implementations in the industry are achieved through proprietary software solutions that are tailored specifically to the development and creation of DTs. In academic research, by contrast, software suitable for general simulation and computation is used. This could lead to a widening rift between industrial implementations of DT and those created in academic research.

5.2 Different DT definitions

Although the definitions of DT have been converging as DT research matures, there is currently no consensus on the basic features that a DT should have. A few current differences in DT definitions are as follows.

  (i) Bidirectional human-out-of-the-loop communication between the physical system and the virtual model. Most reported studies state that bidirectional communication is the goal of a DT and that CPSs with no communication or only unidirectional communication should be considered distinct from true DTs. Nevertheless, a large number of CPS implementations claim to be DTs without exhibiting two-way communication [1, 5, 9, 21, 27, 30, 32, 33, 35,36,37,38,39,40, 44, 47,48,49, 51, 54, 56, 57, 60, 63,64,65,66,67,68,69,70, 72, 74, 78, 79, 81, 83, 85, 87, 88, 93, 94].

  (ii) Optimal digital models for implementing DTs. There exist many methods to implement the virtual models of a physical system. As DT research progresses, modularity and scalability will become an issue. With different possible representations of a physical system available, standardization is needed between these representations so that each DT implementation can interact seamlessly with another DT to achieve scalability and modularity.

  (iii) Optimal scale for DT implementation. Some DT implementations have twinned an entire plant [44, 50, 64, 66, 69, 83] or shop floor, while others have only twinned a single part [47, 48, 65, 88] of a physical system.

5.3 Scalability

To the best of the authors’ knowledge, no research to date has addressed the scalability and modularity of a DT. Scalability and modularity refer to the ability to combine twins of individual parts or machines into a twin of a larger system, such as a machine or a shop floor, and to combine these further into a twin of a plant. To achieve scalability and modularity, there must be methods to represent a DT digitally so that different DTs can exchange data with each other in a consistent and predictable manner.

5.4 Lack of intuitive interactions

Many highly technical and complex models have been chosen for DT implementation. As DT technology progresses, more non-technical users, who do not have the prerequisite understanding of how a DT model functions, would inevitably be interacting with DT systems. Hence, intuitive HMIs to visualize and interact with DT models and the systems they are mirroring should be developed. The majority of research is still focused on developing the DT and not delivering the DT to non-technical users.

6 Future directions

6.1 Standards, modularity and scalability

Standardized frameworks for DT modeling and data transmission can be implemented for ease of communications between different DT models. This would allow for the modularization of DTs, which will increase the scalability of DT as smaller DTs can act as parts of a larger DT system. An example would be connecting multiple DTs of machines to form the DT of a shop floor. The current methods would require either the creation of DTs of the machines or a DT for the entire shop floor, which requires more work and is difficult to troubleshoot. With a modular DT, each individual module can be verified for accuracy before being fitted into the overall DT. DTs created for individual machines can be combined to form the DT of an entire shop floor or plant.

Furthermore, the standardization of DT would increase the value of DT research in academia, as results from standardized DT models could be implemented easily using proprietary software and would thus be of more value to the industry.
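The modular composition described in this subsection can be sketched as follows. This is only an illustration under stated assumptions: no standard DT exchange format is identified in this review, so a JSON-serializable dictionary stands in for one, and the names `Twin`, `MachineTwin`, `CompositeTwin` and the field `spindle_rpm` are hypothetical.

```python
import json
from typing import List, Protocol


class Twin(Protocol):
    """Minimal contract every twin is assumed to satisfy (hypothetical)."""
    name: str

    def state(self) -> dict:
        """Return the current state as a JSON-serializable dict."""
        ...


class MachineTwin:
    """Leaf twin of a single machine."""

    def __init__(self, name: str, spindle_rpm: float) -> None:
        self.name = name
        self.spindle_rpm = spindle_rpm

    def state(self) -> dict:
        return {"kind": "machine", "name": self.name,
                "spindle_rpm": self.spindle_rpm}


class CompositeTwin:
    """Twin assembled from smaller twins, e.g. a shop floor or a plant."""

    def __init__(self, name: str, kind: str) -> None:
        self.name = name
        self.kind = kind
        self.children: List[Twin] = []

    def add(self, child: Twin) -> "CompositeTwin":
        self.children.append(child)
        return self  # allow chained calls

    def state(self) -> dict:
        # The composite state is the states of its modules in the same format.
        return {"kind": self.kind, "name": self.name,
                "children": [child.state() for child in self.children]}


# Machines combine into a shop floor, and shop floors combine into a plant.
shop_floor = CompositeTwin("floor-1", "shop_floor")
shop_floor.add(MachineTwin("lathe-1", 1200.0)).add(MachineTwin("mill-1", 800.0))
plant = CompositeTwin("plant-A", "plant").add(shop_floor)
message = json.dumps(plant.state())  # one message in the shared format
```

Because every module exposes the same `state()` contract, each machine twin can be verified on its own before being fitted into the larger twin, and a consumer such as an HMI only needs to understand one format.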

6.2 Bidirectional communication

Although bidirectional communication is commonly regarded as a defining feature of DT, the majority of DT implementations do not have this feature. This could be due to the difficulty of implementation using current technologies, as many systems are not yet network compatible. Bidirectional communication should gradually become the norm as the industry realizes the power of IoT and new machines are designed with network connections as a common feature, thus making bidirectional communication easier to implement.
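The difference between one-way and bidirectional communication can be illustrated with a small in-memory sketch. It involves no real network or industrial protocol. The classes `SimulatedMachine` and `VirtualModel`, the temperature threshold and the rule of halving the speed are all hypothetical choices made for illustration.

```python
class SimulatedMachine:
    """Stand-in for a network-connected physical machine (hypothetical)."""

    def __init__(self, temperature_c: float, speed_rpm: float) -> None:
        self.temperature_c = temperature_c
        self.speed_rpm = speed_rpm

    def read_sensors(self) -> dict:
        return {"temperature_c": self.temperature_c, "speed_rpm": self.speed_rpm}

    def apply_command(self, command: dict) -> None:
        self.speed_rpm = command["speed_rpm"]


class VirtualModel:
    """Virtual counterpart that mirrors the machine and may issue commands."""

    def __init__(self, max_temperature_c: float) -> None:
        self.max_temperature_c = max_temperature_c
        self.state: dict = {}

    def update(self, sensor_data: dict) -> None:
        # Physical -> virtual: mirror the latest sensor data.
        self.state = dict(sensor_data)

    def recommend(self):
        # Virtual -> physical: assumed rule, halve the speed when overheated.
        if self.state.get("temperature_c", 0.0) > self.max_temperature_c:
            return {"speed_rpm": self.state["speed_rpm"] / 2}
        return None


def sync_once(machine, model, bidirectional: bool):
    """Run one synchronization cycle and return any command the model produced."""
    model.update(machine.read_sensors())
    command = model.recommend()
    if bidirectional and command is not None:
        machine.apply_command(command)  # only a two-way link closes the loop
    return command


machine = SimulatedMachine(temperature_c=90.0, speed_rpm=1000.0)
model = VirtualModel(max_temperature_c=80.0)
sync_once(machine, model, bidirectional=False)  # one-way: machine is unchanged
speed_after_one_way = machine.speed_rpm
sync_once(machine, model, bidirectional=True)   # two-way: command is applied
speed_after_two_way = machine.speed_rpm
```

In the one-way cycle the virtual model detects the overheating but cannot act on it, whereas in the two-way cycle the same decision changes the state of the machine.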

6.3 AR integration

Integration of DT with cyber-physical HMIs, such as AR [94] and VR [95], exists. However, in these studies, AR is used only as a display for the DT, which does not achieve the full potential of AR and DT integration. Furthermore, many of the current HMI methods are used in conjunction with one-way DT implementations, which do not allow users to directly control the DT or the twinned physical system. Future exploration of HMI integration with DTs could include elements of visual programming and other methods through which users can control and interact directly with the DT and the twinned physical system, thereby maximizing the potential of AR as an HMI medium.

Work has already been done to explore the combination of AR and DT outside of fields directly related to manufacturing and monitoring, such as education [101]. Building upon this, other non-engineering fields in which some DT-related research already exists, such as healthcare and policy making, could implement an integrated AR and DT system so that users have an intuitive interface when interacting with the DT.

7 Conclusions

This paper presents a review of DTs based on the underlying technologies, implementations and features. An emphasis is placed on AR as a technology for interacting with the wealth of information contained within a DT. The potential of DT is highlighted by the vast differences between use cases, and there are large variations among the different DT implementations. This paper categorizes the reported DT research to establish some commonalities between different implementations. Researchers have sought to integrate DT and AR by using AR as an HMI for DT. In the research on the integration of AR with DT, AR was mostly implemented using the Unity 3D engine with the Vuforia plug-in. As DT research advances, a common input and output data format or representation can be formed, such that different DTs would be able to interact with each other. AR interfaces could use the standardized data exchange format and representation to display and interact with a multitude of different DT models and their corresponding physical systems.

From the research challenges identified in this paper, a more scalable and unified DT model would help greatly in increasing the value of DT research in the commercial sector, including manufacturing, healthcare and city planning. An AR interface would enhance the DT by allowing interactions that are more intuitive and rendering insights gained from complex DT simulations more accessible.