
1 Introduction

The evolution of the Internet in recent decades has taken several leaps, but the most fundamental influence on our lives stems from the Internet of Things (IoT) phenomenon. A recent IHS Markit report chronicles this quick-moving arena and gauges its explosive growth: the installed base of Internet-connected devices reached an estimated 12.1 billion in 2013, a number that is expected to more than quadruple to nearly 50 billion by 2025 [1]. The use of micro sensors and networking technologies connects the everyday physical world to the Internet. This is not only about connecting things, but also about providing services and opportunities to improve our lives. 95% of corporate executives surveyed by The Economist say they plan to launch an IoT business within 3 years. According to PRWeb.com, 87% of surveyed manufacturers have not yet taken advantage of IoT to transform their facilities, but among the 13% who have implemented IoT solutions, there were reports of increased efficiency, fewer product defects and higher customer satisfaction. In addition, the automotive industry affects everyone's life, and Parks Associates claims that 89% of new cars sold worldwide will have embedded connectivity by 2024 [2].

These are only some of the facts, but the global tendencies and direction are nevertheless quite clear. In the whole context of IoT technologies, where the world of physical things is brought to life, it is important to provide a natural interface between it and IoT elements through the use of visualization technologies. In communication involving people, things, processes and data, feedback is crucial. The lower the communication barrier between humans and IoT elements, the more precise and correct the decisions that will be made. Undeniably, M2P (machine-to-people), M2M (machine-to-machine) and P2P (people-to-people) communication already provide successful execution of these processes [3], but there is still plenty of room for innovation and improvement to reach the next level. Accordingly, the core idea of this research is to use real-time augmented reality (AR) during the communication process. The information flow between IoT parties is bi-directional, so the more naturally information is perceived by the human senses, the more precise the reaction and the shorter the response time. This is crucial in many areas and industries (medicine, emergency services, military, logistics, smart cities, manufacturing in Industry 4.0, etc.). Nowadays information perception via computer monitors and smartphone screens is quite typical, but the rapid development of augmented reality technologies should be taken into account, such as the future possibilities of devices like Magic Leap, Microsoft HoloLens 2, Meta 2, Daqri and others, which have only recently appeared on the US market or will appear in the immediate future [4,5,6]. At the annual Internet of Things World Forum (IoTW) held in Dubai [7], it was announced that besides services such as connected parking, connected lighting and waste management, alongside other vertical industries, what matters even more is the opportunity to visualize these solutions for attendees. Also, Horizon 2020 projects financed by the European Commission have a focus area on IoT technologies, emphasizing IoT wearables, user interfaces and object virtualization for an IoT ecosystem [8].

Standard visualization provides visual data depiction on the screens of various devices, whereas augmented reality (AR) depicts virtual objects (text, images, graphics, video, 3D models) on top of the real world. The AR concept dates back to the 1950s in cinematography, and modern definitions appeared twenty years ago [9], but real use cases were developed only in the last decade, when computing performance, graphical resolution and sensor (gyroscope, magnetometer, accelerometer, etc.) precision notably increased.

CCS Insight predicts that the dedicated augmented reality device market (AR smart glasses and mixed reality glasses) will reach 5 million devices by 2021, and augmented reality technology is already widely used in education for advanced learning and teaching technologies [10]. Owing to their great fascination and potential, augmented and virtual reality technologies will contribute to projects with smart innovations in the future. At the moment, several AR mobile platforms (Vuforia, HP Reveal, Augment, Wikitude, Blippar, etc.) [11] are available, which mostly use fiducial or image-based markers for virtual object positioning. Future AR solutions will mostly integrate markerless approaches. For now, growth in the markerless AR area is fostered by the smartphone industry and its widely available sensors, such as GPS positioning, compass, video camera and Internet connectivity. Unfortunately, GPS positioning is insufficient for solutions where high precision is demanded and indoor capability is required. That is why markerless solutions are currently oriented mostly towards entertainment (PokemonGo) [12], advertising and training needs. Besides precise virtual object positioning, an important step in AR technology development will be switching from smartphone displays to head-mounted displays (HMD), thereby providing an essential increase in the environment's immersion level (HoloLens, Magic Leap) [4, 6]. At the end of 2018, the U.S. Army awarded Microsoft a 480 million USD contract to supply the military branch with as many as 100,000 HoloLens augmented reality headsets for training and combat purposes [13]. This not only makes the U.S. Army the most important HoloLens customer, but also highlights the readiness of AR technologies for serious applications.

2 Problem Area and Global Practice in Object and Avatar Positioning

The facts stated above demonstrate the undoubted increase in AR's significance in the future. However, there are still factors that limit the use of AR in a wider range of areas and industries. To specify the research goal and the use of AR technologies in the context of IoT, several problem areas for which solutions are being devised can be identified. The following functionality should be achieved:

  1. Indoor and outdoor AR, regardless of weather conditions and lighting. Marker-based solutions are currently quite precise, but they have a fundamental disadvantage with respect to immersion level, interactivity and mobility; furthermore, they are inconvenient to use outdoors. There are some AR solutions available for markerless outdoor use (SightSpace, PokemonGo) [12, 14] based on the calculation of GPS coordinates, but in reality the inaccuracy is too high, preventing the participant from moving freely in the augmented environment. Basically, a target 3D model is statically positioned, and internal sensors' data are used for further projection calculations [15].

  2. Virtual 3D model positioning at distances greater than 5 m. An additional disadvantage of marker-based AR systems is the short operating distance within which AR libraries must recognize the marker in video frames and place a virtual object on top of it. The latest solutions replace marker-based close-range approaches with spatial mapping and the use of depth cameras. However, phones with depth cameras, such as the Asus ZenfoneAR and the Lenovo Phab2, are quite expensive, and the latest libraries, such as Google ARCore and Apple ARKit, recognize the environment and the distance to physical objects using a regular camera (not a depth camera) by identifying interesting points, called features, and tracking how those points move over time. By combining the movement of these points with readings from the phone's inertial sensors, the library determines both the position and the orientation of the phone as it moves through space [16]. A 3D model of the surroundings is constructed in real time, thereby allowing precise virtual object placement at distances from 10 cm to 5 m [17]. This approach offers high precision, participant mobility and a high immersion level in typical living room or office conditions (indoor, close range).

  3. The provision of high precision and stability of depicted 3D models. The main disadvantage of GPS-based AR solutions is insufficient precision of the virtual object coordinate system, resulting in a disturbing effect where the virtual object jumps and its position is unstable, especially when detailed 3D models are observed. Stabilization algorithms such as the Kalman filter are used (see the sketch after this list), but they do not solve the issue when a participant is in motion while observing a virtual object, because displacement occurs [18]. Quite a lot of research has been done in the field of line and blob detection algorithms to recognize real-world objects by analysing video frames [19, 20], but in real-time solutions this is impracticable because of changing weather and lighting conditions, as well as the fact that participant movement is not supported.

  4. Real-time depiction of sets of 3D models in AR mode, including animations and photorealistic rendering. At present there are no AR platforms which offer the simultaneous depiction of independent 3D models; current AR solutions mostly provide the depiction of single or grouped 3D models (Vuforia, Augment, Wikitude, Blippar, etc.) in marker-based systems. By overcoming this limitation, more intelligent environments could be developed, which are necessary for bi-directional communication between humans and IoT elements in the form of visualized objects.

3 The Importance of Position Tracking for Multiple Entities in a Dynamic Environment

In virtual and augmented reality (VR/AR) environments, collaboration among different parties can be implemented at various levels, each with its own challenges. There are meaningful differences in the visualization and position calculation aspects. If the environment participant's position is static and the position of the virtual object is also static, visualization can be achieved quite easily, even if the viewing angle is changing. But if the environment's participant is in motion and the virtual object or objects are also moving, the complexity increases. In general, sixteen modes can be distinguished (see Fig. 1), combining single or multiple, static or moving participants with single or multiple, static or moving virtual objects; the most challenging is to provide an environment with several participants moving around and several moving virtual objects.

Fig. 1. General collaboration modes in VR/AR systems

For individual modes, sub-modes with additional requirements can also be defined, especially in a VR environment, depending on the implemented scenario. A significant difference lies in whether several participants' avatars are visualized for one participant or for all participants: either only one person wears a VR HMD at a time and sees the other persons' 3D avatars, or everyone in the environment wears a VR HMD and sees each other's avatars. The first case is a useful implementation as well, because a participant in a VR environment can still easily communicate with the real persons surrounding him and avoid collisions while moving. The second case should be implemented using a multiplayer mode, where the synchronization process and scene depiction on several VR HMDs are handled via a data transmission network.

To depict virtual objects with the correct size, position and rotation, calculations should be done for each frame according to the flowchart in Fig. 2.

Fig. 2. General flowchart of a virtual object in an AR environment

These are the general steps which were implemented in the City 3D-AR project [18] based on GPS positioning; unfortunately, due to a lack of precision, this project remained only a test platform. Nowadays these calculations can rely on the libraries of the Unity or Unreal engines, which provide basic positioning in a more natural way; nevertheless, issues with the precision of multiple dynamic objects in multi-user large-scale environments are still topical.
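The flowchart itself is not reproduced here, but a minimal Unity sketch of the per-frame work it describes could look as follows; the component name, fields and smoothing factor are assumptions for illustration, not the project's actual scripts.

```csharp
using UnityEngine;

// Per-frame update for one virtual object, assuming the tracked target
// position and orientation have already been converted into Unity world space.
public class TrackedObjectView : MonoBehaviour
{
    public Vector3 targetPosition;      // latest position from the positioning source
    public Quaternion targetRotation;   // latest orientation, if provided
    [Range(0f, 1f)] public float smoothing = 0.2f; // illustrative value

    void Update()
    {
        // Interpolate instead of snapping, to reduce the visible jitter
        // caused by noisy coordinates (cf. the stability issues in Sect. 2).
        transform.position = Vector3.Lerp(transform.position, targetPosition, smoothing);
        transform.rotation = Quaternion.Slerp(transform.rotation, targetRotation, smoothing);
        // Scale is left constant; apparent size follows from the camera projection.
    }
}
```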

4 The Necessity for Precision Improvement and Flexibility in Large Scale AR Environments

As there are various modes of collaboration among parties in a VR/AR environment, it is important to test these modes in real 3D scenarios without the use of a real positioning system, e.g. the different technologies offered as real-time locating systems (RTLS). Therefore, a simulator was developed which generates coordinates for static and dynamic objects in space (see Fig. 3).

Fig. 3. Coordinate generator for the simulation of objects' positions

It is possible to specify the dimensions of the area, the number of agents and the generation frequency of the JSON data objects (see Fig. 4). As further enhancements of the simulator, different types of movement can be implemented instead of random positions, e.g. a variety of trajectories, a range of movement speeds, avatar movement rules and collision detection. After the JSON data objects are formatted, they are delivered over the network to a specified IP address and UDP port number, as sketched below.
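A minimal sketch of such a generator is given below as a console application; the JSON field names (id, x, y, z, t), IP address and port are illustrative assumptions, and the actual format shown in Fig. 4 may differ.

```csharp
using System;
using System.Globalization;
using System.Net.Sockets;
using System.Text;
using System.Threading;

// Coordinate generator sketch: random agent positions inside a given area,
// serialized as JSON data objects and pushed over UDP at random intervals.
class CoordinateGenerator
{
    static void Main()
    {
        const string host = "127.0.0.1";        // target IP address (assumption)
        const int port = 5005;                  // target UDP port (assumption)
        const int agents = 10;
        const double areaX = 9.0, areaY = 12.0; // area dimensions in metres

        var udp = new UdpClient();
        var rnd = new Random();

        while (true)
        {
            for (int id = 0; id < agents; id++)
            {
                // e.g. {"id":3,"x":4.27,"y":11.03,"z":0.0,"t":1554712345678}
                string json = "{\"id\":" + id +
                    ",\"x\":" + (rnd.NextDouble() * areaX).ToString("F2", CultureInfo.InvariantCulture) +
                    ",\"y\":" + (rnd.NextDouble() * areaY).ToString("F2", CultureInfo.InvariantCulture) +
                    ",\"z\":0.0,\"t\":" + DateTimeOffset.UtcNow.ToUnixTimeMilliseconds() + "}";
                byte[] payload = Encoding.UTF8.GetBytes(json);
                udp.Send(payload, payload.Length, host, port);
            }
            // Random interval between 50 and 500 ms, matching the experiments below.
            Thread.Sleep(rnd.Next(50, 501));
        }
    }
}
```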

Fig. 4. Generated JSON data object

The next phase involves Unity scripts that correctly apply the data objects to 3D models and cameras in the VR/AR environment (see Fig. 5). By changing the simulation parameters, the number of agents and the complexity of the 3D models, it is possible to evaluate the performance of the environment. Evaluation can be done qualitatively, by visual inspection during runtime, or various quantitative values can be acquired using the Unity Profiler. General values include CPU usage for frame processing during rendering and script processing.
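A sketch of such a script is shown below, assuming the hypothetical JSON format from the generator sketch above: a background UDP listener deserializes incoming data objects into a queue, and the main thread applies them to agent transforms.

```csharp
using System;
using System.Collections.Concurrent;
using System.Net;
using System.Net.Sockets;
using System.Text;
using UnityEngine;

[Serializable]
public class AgentData { public int id; public float x; public float y; public float z; }

// Receives JSON data objects over UDP and applies them to agent transforms.
public class AgentReceiver : MonoBehaviour
{
    public Transform[] agents;   // one transform per simulated agent
    private UdpClient udp;
    private readonly ConcurrentQueue<AgentData> queue = new ConcurrentQueue<AgentData>();

    void Start()
    {
        udp = new UdpClient(5005);              // same port as the generator (assumption)
        udp.BeginReceive(OnReceive, null);
    }

    void OnReceive(IAsyncResult result)
    {
        IPEndPoint source = null;
        byte[] data;
        try { data = udp.EndReceive(result, ref source); }
        catch (ObjectDisposedException) { return; }   // socket closed on shutdown

        // JsonUtility.FromJson may be called from background threads;
        // transforms, however, must only be touched on the main thread.
        queue.Enqueue(JsonUtility.FromJson<AgentData>(Encoding.UTF8.GetString(data)));
        udp.BeginReceive(OnReceive, null);
    }

    void Update()
    {
        // Apply all pending position updates on the main thread.
        while (queue.TryDequeue(out AgentData d))
            if (d.id >= 0 && d.id < agents.Length)
                agents[d.id].position = new Vector3(d.x, d.z, d.y); // floor plane mapped to Unity XZ
    }

    void OnDestroy() { udp?.Close(); }
}
```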

Fig. 5. Unity JSON deserialize and transformation scripts for 3D visualization

In this case, measurements were made with 1 to 10 agents, generating JSON data objects at random intervals ranging from 50 ms to 500 ms (see Fig. 6).

Fig. 6. Script execution performance

Random intervals were used because raw data from positioning systems are usually processed to remove noise, e.g. with a Kalman filter, so intervals can vary within one session. To also evaluate the influence of network latency and jitter, data objects were sent over 1 Gbps Ethernet and Wi-Fi IEEE 802.11ac; there was no significant difference in performance, because in both cases latency is 1 ms or less under perfect conditions with two connections. However, the situation can be very different if other Wi-Fi standards and frequencies are used, because of higher latency and high jitter (from 1 ms to 200 ms), especially if there are several Wi-Fi connections. The tests were executed on a VR-compatible MSI GT75VR Titan laptop with a 7th Gen. i7 processor and a GeForce GTX 980M SLI graphics adapter. For each test, 30,000 frames were captured and an average value was calculated, yielding results from 0.795 ms with one agent to 1.090 ms with 10 moving agents. Rendering time was even lower, with lower variance. In this specific Unity composition there are no noticeable performance issues, but this slight tendency must be considered if more complicated environments are developed, meaning not only higher quality animated 3D models, but also more serious collaboration logic and game mechanics implemented through scripting.
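The sketch below illustrates one way such per-frame script timings could be gathered: a named Profiler sample marks the region of interest for the Unity Profiler window, while a running average mirrors the 30,000-frame averaging described above. The component and sample names are hypothetical.

```csharp
using UnityEngine;
using UnityEngine.Profiling;

// Measures the per-frame cost of the script work under test and logs a
// running average, roughly reproducing the averaging procedure above.
public class ScriptTimingProbe : MonoBehaviour
{
    private double totalMs;
    private long frames;

    void Update()
    {
        float start = Time.realtimeSinceStartup;
        Profiler.BeginSample("AgentUpdate");    // shows up in the Unity Profiler window
        // ... per-frame agent/JSON processing under test goes here ...
        Profiler.EndSample();
        totalMs += (Time.realtimeSinceStartup - start) * 1000f;

        if (++frames % 30000 == 0)
            Debug.Log($"Average script time: {totalMs / frames:F3} ms over {frames} frames");
    }
}
```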

Thanks to the data generator for movement simulation, it is possible to save considerable financial resources and time, and to specify the general requirements for choosing the most appropriate positioning system for a VR/AR system.

5 UWB Tracking as the Solution for AR and VR Systems

To prepare a real testbed, ultra-wideband (UWB) positioning technology was chosen. In recent years, high precision has been achieved in UWB tracking, meaning that this technology could be suitable not only for sports tracking and logistics, but also for multi-user collaborative VR/AR environments. A 9 × 12 m room was set up to test the precision and also to validate the accuracy of the coordinate generator (see Fig. 7).

Fig. 7. UWB RTLS testbed for real-time 3D visualizations in Unity

Ultra-wideband uses short-range radio communication; in contrast to Bluetooth Low Energy and Wi-Fi, position determination is based not on the measurement of signal strength (Received Signal Strength Indicator, RSSI), but on a propagation-time method: Time of Flight (ToF) or Time Difference of Arrival (TDoA). The propagation time of the signal between an object and several anchors is measured. At least three receivers are required for the exact localization of an object using trilateration, and there must be a direct line of sight between the receiver and the transmitter [21]. UWB utilizes a train of impulses rather than a modulated sine wave to transmit information. This unique characteristic makes it well suited for precise ranging applications: since the pulse occupies such a wide frequency band (3–7 GHz according to the IEEE 802.15.4a standard), its rising edge is very steep, which allows the receiver to measure the arrival time of the signal very accurately. The pulses themselves are very narrow, typically no more than two nanoseconds [22]. UWB positioning systems offer 5–30 cm accuracy both indoors and outdoors and provide both 2D and 3D data [23]. Companies like Sewio, Eliko, Insoft and Decawave offer setups for the implementation of UWB wireless real-time locating systems.
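As a minimal numeric illustration of trilateration, the sketch below recovers a 2D tag position from ToF-derived distances to three fixed anchors by linearizing the circle equations; the anchor coordinates are whatever the installation provides, e.g. three corners of the 9 × 12 m room above.

```csharp
using System;

// 2D trilateration sketch: subtract anchor 1's circle equation from those of
// anchors 2 and 3 to eliminate the quadratic terms, then solve the resulting
// 2x2 linear system A * [x, y]^T = b with Cramer's rule.
static class Trilateration
{
    public static (double x, double y) Solve(
        (double x, double y) a1, double d1,
        (double x, double y) a2, double d2,
        (double x, double y) a3, double d3)
    {
        double A11 = 2 * (a2.x - a1.x), A12 = 2 * (a2.y - a1.y);
        double A21 = 2 * (a3.x - a1.x), A22 = 2 * (a3.y - a1.y);
        double b1 = d1 * d1 - d2 * d2 + a2.x * a2.x - a1.x * a1.x + a2.y * a2.y - a1.y * a1.y;
        double b2 = d1 * d1 - d3 * d3 + a3.x * a3.x - a1.x * a1.x + a3.y * a3.y - a1.y * a1.y;

        double det = A11 * A22 - A12 * A21;
        if (Math.Abs(det) < 1e-9)
            throw new ArgumentException("Anchors must not be collinear.");
        return ((b1 * A22 - b2 * A12) / det, (A11 * b2 - A21 * b1) / det);
    }
}
```

With noisy real-world distances, more than three anchors and a least-squares solution are typically used instead of this exact three-anchor case.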

Similar to the data generator, UWB systems provide data delivery over the network via a UDP port in the form of lightweight JSON data-interchange objects.

6 Conclusions

This work focused on the large-scale indoor and outdoor positioning necessary for augmented and virtual reality systems. Such systems are becoming more and more relevant in the context of smart cities and Industry 4.0, where a more natural involvement of society and professionals can be achieved. To determine and evaluate the requirements of 3D environments, a data generator was developed to simulate coordinates and different parameters related to current positioning technologies (frequency, speed, precision, etc.). This study allowed the potential of UWB tracking to be tested before a real system was bought and set up. A UWB tracking system has now been set up at the VR lab of Vidzeme University of Applied Sciences, in collaboration with EchoSports, for further tests and real-time experiments with high-quality 3D models for various surroundings and avatars, allowing the evaluation of a large-scale free-room concept, where physical movement is not limited and is performed in a natural way, without the use of controllers. These experiments will allow us to find out more about the mitigation of cybersickness, use cases and performance in augmented reality, as well as the potential of multi-user collaboration locally and remotely, by connecting remote free-rooms via 5G technology.