1 Introduction

The evolution of edge and mobile devices and the related technologies, at both the hardware and software levels, enables the execution of increasingly heavy Machine Learning tasks on the edge, providing new possibilities for research and innovation. Smartphones are nowadays the mainstream computing paradigm for users of all types, and their commodity hardware incorporates advanced capabilities for communicating with other systems and interacting with the physical environment [3]. In parallel, the advancements in Computer Vision and the extensive use of Machine Learning technologies create new opportunities in all application domains by exploiting the capabilities and the performance of the hardware, especially when the operations are performed locally. Augmented Reality (AR) is a technological area which has benefited from these advancements and, in its recent form, offers smartphone users a means of interacting with the physical world by utilizing raw data from a mobile device’s sensors. Use cases of AR can be found in a variety of scientific fields, from architecture and engineering [2] to health and education [14]. The proposed solution is based on extracting the features of the scene in front of the device, using pattern recognition to identify surfaces on which virtual Anchors are placed. After the system is trained, it is able to discover these Anchors at runtime and translate them to points of interest which can be used for localization and indoor guidance. AR also supports user interactivity by creating a series of indoor waypoints to indicate a path, or other information, introducing an innovative navigation system which can be adapted to support different scenarios and applications [22].

The rest of the paper is structured as follows: Sect. 2 highlights related works and studies in the field. In Sect. 3, the technological foundation is presented along with an overview of the system architecture and implementation. Results from the system in practice are demonstrated in Sect. 4. A discussion of the evaluation of the system is presented in Sect. 5, while Sect. 6 concludes the work.

2 Related Work

Traditional mobile navigation methods retrieve the position of the device either via the cellular network [16] or via satellite using GPS [7]. While these methods perform well when giving driving, cycling or walking directions, they lose precision indoors. To overcome this limitation, solutions utilizing static hardware attached to a building have been introduced. Adam Satan’s system uses Bluetooth Beacons which emit radio frequency signals identified by the device, before using Dijkstra’s algorithm to find the shortest path [18]. Following the same concept, but using WiFi signals instead of Bluetooth, indoor navigation was achieved in the COEX complex in Seoul [6]. Another approach was to identify landmarks and create magnetic maps for multiple corridors of a floor in a building using a phone’s built-in magnetometer [5]. All aforementioned examples require modifications and sensor installations in order to produce the desired result. Another proposal, which does not require any kind of sensor installation, is navigation by estimating steps using the accelerometer and 5G signals [19]. In keeping with the onboard-sensors-only approach, more integrated hardware can be utilized and combined, such as the device’s camera. This is where AR comes in. AR, in comparison with Virtual Reality (VR), captures the outside world and interacts with the area in front of the device by attaching and visualizing augmented information [1]. A generic implementation of AR gamification techniques with a physical activity goal, combined with AR navigation, can also be found in Nature-Based solutions [13], where the user is navigated through a park’s attractions. Target indication in facility maintenance operations is another proof-of-concept scenario of AR’s localization ability [11], along with freight car routing [21]. In combination with WiFi/Bluetooth Low Energy signals, Jehn-Ruey Jiang et al. introduced an AR indoor navigation framework [8] which is applicable to both mobile AR and VR glasses.
The scientific base behind these solutions is a combination of feature extraction from a series of images with data retrieved from device sensors, called Simultaneous Localization and Mapping (SLAM). SLAM is identified as a problem [4] with solutions in robotics [20] and, more recently, in combination with AR [17]. Google’s ARCore library facilitates indoor space recognition by utilizing SLAM in a way that enables a device to identify locations previously recorded by other devices. Features from the device’s camera feed are recorded, processed and stored in the cloud. They are then retrieved by other devices, which compare them to what they are recording at runtime [15]. Such feature extraction techniques are used by handheld PCs to identify similar locations in an image database and display location-related information [10], or in a simulated physical shopping mall environment by utilizing the Vuforia engine [9].

3 Design and Implementation

3.1 Background Technologies

For a better understanding of the applied computer vision concepts and technologies, a brief introduction to the required terminology follows:

Augmented Reality: An immersive human-machine-interaction experience is achieved without the need of additional hardware. By utilizing onboard camera and IMU sensors, the phone’s video feed can be supplemented with additional augmented information such as labels, images, markers and other kinds of multimedia. A marker’s position remains attached at the predefined location regardless of any device movement or environment change. All items attached to a surface are generally referred to as Anchors in AR terminology. In order to attach an Anchor to the scene, the area needs to be scanned using specific software that implements SLAM.

SLAM and Cloud Anchors: Local area identification and localization is achieved by extracting features from an image feed along with data from the device’s IMU sensors. At first, the area around the device needs to be scanned slowly and steadily. While scanning, the SLAM algorithm parses the camera frame feed and extracts feature points from each frame. SLAM algorithms are optimized to focus on specific, feature-dense segments of each image to achieve better data processing. Extracted features are combined with data from the IMU sensors to determine the exact distance, rotation and orientation of the recorded frames. Segments of the feed that contain a sufficient number of features allow Anchors to be attached to the scene, since the features near an Anchor can be easily recognized once the device reaches that spot. The Cloud Anchors functionality uses the SLAM output as a base to store Anchor locations for remote use. The goal of Cloud Anchors is the ability to display Anchors that were placed by another device in the past. Hence, feature points in the range of an Anchor are captured and stored in a cloud database. By providing a camera feed from the same area and comparing the currently extracted features with the database, the position of an Anchor can be precisely estimated. A limitation of this capability is the bounded number of Cloud Anchors that can be searched for at the same time. The ARCore Cloud Anchors implementation, which is used in the current work, allows up to 20 simultaneous Anchor scans. It should be mentioned, though, that once an Anchor is attached, it is removed from the scanning stack, allowing an additional Anchor to be scanned.
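The scanning limit described above can be illustrated with a minimal sketch. It is not ARCore code and does not call any ARCore API; the class name ScanWindow, its methods and the anchor identifiers are hypothetical and serve only to show how a bounded set of up to 20 Anchors under scan could be refilled each time one Anchor is attached.

```python
from collections import deque

# Limit stated in the text for simultaneous Cloud Anchor scans.
MAX_SIMULTANEOUS_SCANS = 20


class ScanWindow:
    """Hypothetical helper: keeps at most `limit` anchor ids under scan."""

    def __init__(self, anchor_ids, limit=MAX_SIMULTANEOUS_SCANS):
        self.limit = limit
        self.pending = deque(anchor_ids)  # waiting for a free scan slot
        self.active = set()               # currently being searched for
        self.resolved = []                # attached, in order of attachment
        self._fill()

    def _fill(self):
        # Promote pending anchors while free scan slots remain.
        while self.pending and len(self.active) < self.limit:
            self.active.add(self.pending.popleft())

    def mark_resolved(self, anchor_id):
        # An attached anchor leaves the scan set and frees one slot.
        if anchor_id not in self.active:
            raise KeyError(f"{anchor_id} is not currently being scanned")
        self.active.remove(anchor_id)
        self.resolved.append(anchor_id)
        self._fill()


window = ScanWindow([f"a{i}" for i in range(25)])
window.mark_resolved("a3")  # frees a slot, so "a20" starts being scanned
```

Under this assumption, a route with more than 20 Anchors is still processed in full, because every attachment makes room for the next pending Anchor.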

3.2 Methodology

To address the requirements of developing an AR navigation system, a set of algorithms and techniques is designed and implemented, aiming at achieving indoor localization, performing efficient path planning and visualizing the guidance system.

Indoor Positioning and Key Anchors: A plausible observation arising from the Cloud Anchor functionality is that it can be used to achieve indoor positioning. If information about a location is related to an Anchor instance, the nearest Anchor identified by the device reveals the current position. With multiple Anchors acting as reference points in a complex space, we introduce a new term, Key Anchors. Such Anchors do not contain any visual information and are instead used to determine the device’s position in the area.
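As a minimal sketch of the Key Anchor idea, assuming each identified Key Anchor carries a location label and a position in the session’s coordinate frame, the current location can be taken from the Key Anchor nearest to the device. The function name, the labels and the coordinates below are hypothetical and are not taken from the implemented system.

```python
import math


def locate(resolved_key_anchors, device_position):
    """Return the location label of the nearest identified Key Anchor.

    resolved_key_anchors: list of (location_label, (x, y, z)) tuples for
    Key Anchors the device has already identified (assumed data shape).
    device_position: (x, y, z) of the device in the same coordinate frame.
    """
    if not resolved_key_anchors:
        return None  # nothing identified yet, position is unknown
    label, _ = min(
        resolved_key_anchors,
        key=lambda item: math.dist(item[1], device_position),
    )
    return label


# Hypothetical Key Anchors and device position, in metres.
anchors = [("Lobby", (0.0, 0.0, 0.0)), ("Corridor B", (5.0, 0.0, 0.0))]
current = locate(anchors, (4.0, 0.0, 1.0))
```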

Routing: Apart from identifying the device’s position, Cloud Anchors can be used for a variety of other use cases. Our proposed system prompts the user to follow an on-screen visualized path by using Anchors as route checkpoints. Routing entities are categorized as ArPaths, ArRooms and ArRoutes. These entities refer to relations between Anchors and locations, offering a state-of-the-art solution to the AR routing problem. An ArPath entity contains a list of Anchors that lead from a starting point to a gateway point. The order of the list defines the flow of the navigation. A gateway point is a special Anchor which includes references to the next ArPaths that begin from there. The ArRoom entity corresponds to a room of a building and includes ArPaths along with a set of additional satellite Anchors. Finally, the ArRoute entity acts as a connecting link between paths and indicates the final gateway of the route. Once a gateway is reached, it is first checked whether it is the final point. If it is not, the first path of the ArRoute that is included in this gateway’s next ArPaths is shown. This naive approach achieves completely modular functionality, which has been tested under real scenarios with successful results. Following the flow of this algorithm, a device can be navigated through different rooms, corridors and stories of the same building while downloading the fewest Cloud Anchors required.
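The gateway check described above can be sketched as follows. This is an illustrative reading of the text, not the system’s actual data model: the field names are hypothetical, ArRoom is omitted, and the gateway’s references to the next ArPaths are stored on the ArPath object for brevity, with the last Anchor of each path assumed to be its gateway.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ArPath:
    path_id: str
    anchors: List[str]  # ordered checkpoints; the last one is the gateway
    next_path_ids: List[str] = field(default_factory=list)  # gateway's next ArPaths

    @property
    def gateway(self) -> str:
        return self.anchors[-1]


@dataclass
class ArRoute:
    path_ids: List[str]  # paths belonging to the route, in route order
    final_gateway: str   # gateway Anchor that ends the route


def next_path(route: ArRoute, reached: ArPath,
              paths: Dict[str, ArPath]) -> Optional[ArPath]:
    """Decide what to show once the gateway of `reached` has been reached."""
    if reached.gateway == route.final_gateway:
        return None  # final point: navigation is complete
    # Otherwise show the first path of the route that starts at this gateway.
    for path_id in route.path_ids:
        if path_id in reached.next_path_ids:
            return paths[path_id]
    raise LookupError("no path of the route continues from this gateway")


# Hypothetical example: gateway g1 offers two continuations, the route uses p3.
paths = {
    "p1": ArPath("p1", ["a1", "a2", "g1"], ["p2", "p3"]),
    "p2": ArPath("p2", ["g1", "a3", "g2"]),
    "p3": ArPath("p3", ["g1", "a4", "g3"]),
}
route = ArRoute(path_ids=["p1", "p3"], final_gateway="g3")
following = next_path(route, paths["p1"], paths)  # the ArPath "p3"
```

Because only the paths listed in the route are ever selected, the Anchors of path p2 never need to be downloaded in this example, which is consistent with the stated goal of keeping the number of downloaded Cloud Anchors low.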

Navigation Elements Placement Algorithm: To overcome the possibly long distance between two checkpoints, a series of helper Anchors is programmatically placed following our Navigation Elements Placement (NEP) algorithm. NEP locations are generated by:

  (a) subtracting the first Anchor’s transformation vector from the second to get the angle;

  (b) getting the distance between these vectors;

  (c) programmatically attaching an Anchor along the line of the subtracted angle every 0.2 m, which is our interval, until the distance is reached.
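The three NEP steps can be sketched in a few lines. The sketch assumes that only the translation part of each Anchor’s transformation is used, that the "angle" is represented by a unit direction vector, and that helper positions lie strictly between the two checkpoints (the endpoints already exist as Anchors). The function name and the small floating-point tolerance are illustrative choices, not part of the published algorithm.

```python
import math

INTERVAL_M = 0.2  # spacing between helper Anchors stated in the text


def nep_positions(first, second, interval=INTERVAL_M):
    """Return helper positions on the segment from `first` to `second`.

    first, second: (x, y, z) translations of two consecutive checkpoints.
    """
    # (a) subtract the first vector from the second to get the direction
    delta = tuple(b - a for a, b in zip(first, second))
    # (b) distance between the two checkpoints
    distance = math.sqrt(sum(d * d for d in delta))
    if distance == 0:
        return []
    direction = tuple(d / distance for d in delta)
    # (c) step along the direction every `interval` metres until the
    #     distance is reached (tolerance guards against rounding error)
    positions = []
    k = 1
    while k * interval < distance - 1e-9:
        step = k * interval
        positions.append(tuple(a + step * u for a, u in zip(first, direction)))
        k += 1
    return positions


# Two checkpoints 1 m apart yield helpers at 0.2, 0.4, 0.6 and 0.8 m.
helpers = nep_positions((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
```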

3.3 System Overview

The proposed AR indoor navigation system integrates the aforementioned algorithms, manages their parameters and offers a client application which is accessible to end-users. The system consists of two main elements:

  A. Hybrid AR Platform: A unified framework which includes a client platform and requires integration of a client library. It is responsible for handling AR-related information and consists of four components: the Creator Module, the Administrator Module, the Backend Service and the Query Engine.

  B. Player Module: A subsystem which can be integrated into any third-party mobile application.

Users who have the client application installed on their mobile devices are able to reach their selected destination guided by on-screen AR instructions, without the need for additional hardware interventions. AR components are configured to identify the specific area and initialize the routing algorithm. Configuration of the system’s functionality is performed through limited-access applications which are responsible for content creation and management.

3.4 Implementation

Figure 1 highlights the architecture of the proposed system. All applications follow object-oriented programming design principles and are built using Flutter, a cross-platform development environment. For AR services, the ARCore platform and its Cloud Anchors environment were selected due to their ability to retrieve and share feature maps between both Android and iOS, thus resulting in seamless integration and interoperability.

Fig. 1. System Architecture

The key advantage of the Hybrid AR Platform approach is that AR operations are separated from client applications, and are integrated into them through a software library.

The platform’s Creator Module is an application that hosts Anchors to the cloud and additionally manages other system aspects and parameters. Hosting is performed by initializing an AR session which allows Anchor placement in the scene in front of the device. Anchors are uploaded to the cloud and their references, along with other metadata, are stored in the database. Additionally, this module modifies an Anchor’s mesh position, location and orientation. The Administrator Module is an extended version of the Creator Module, without AR capabilities but with the ability to upload 3D model files and better manage the related content. This web application creates and manages instances of routing entities, modifies other administrative parameters that affect the workflow of the Query Engine and applies changes globally to the whole system configuration. A backend for AR applications was previously introduced to monitor physical activity via AR exergames [12]. The platform’s Backend Service is an extension of that implementation, managing the storage of information regarding Anchors, Routes, Paths, Rooms and the assets related to them. The Query Engine acts as the interface of the platform. It retrieves AR information and produces Localization material, which is then transmitted to the Player Module.

The Player Module can be integrated into any third-party application. The integrated Localization Controller searches for nearby Key Anchors and identifies the user’s current position by communicating with the platform’s Query Engine. At the same time, the Routing Controller executes the routing algorithm, starting from the current location towards the destination. Each time the Routing Controller retrieves new Anchors, they are passed to the AR screen in order to be visualized. In addition, the AR screen signals that a user has reached either a gateway or a destination by returning this information to the Routing Controller.

4 Results

4.1 System in Practice

All previously demonstrated technologies, concepts and algorithms have been implemented in a state-of-the-art application which is not only a proof-of-concept prototype, but also an end-user product. The “UNIPI: AR Experience” application is available to download for both Android and iOS, and the innovative AR integration is in the initial release phase at the university. When opening the app, users select their destination, which is organized under a two-level categorization. The first level refers to the building where the navigation will take place and the second to the type of destination.

Currently two buildings are supported, “Central” and “Venentokleio”, and for each building there are two types of routes, “Faculty” and “Classrooms”, which refer to directions towards faculty offices and classrooms respectively. An additional route type called “Erasmus” is available only in the “Central” building, providing directions to Erasmus-related rooms. Users have to tap the category and then the desired destination.

Fig. 2. System in practice

The starting point can either be selected from the next screen’s list, or the user can scan the area in front of the device so that the Key Anchor system identifies their current location. After a while, arrows are displayed on the user’s screen showing the direction to follow, as demonstrated in Fig. 2. The case of a floor difference between the route’s endpoints is also covered by navigating the user to either the elevator or the stairs, and then indicating the next floor via an on-screen dialog. While in the navigation phase, additional AR-related content containing localized information may be displayed on the user’s AR screen.

4.2 Experimentation

The most important aspect of such an application is the time required for a device to first localize itself and then correctly display the desired checkpoints at their precise locations. We concluded that two metrics are important and attempted to optimize them: the duration of the first Anchor identification and the Anchor displacement error. Two Anchor hosting methods were used: Method 1, hosting each Anchor in an individual AR session, and Method 2, hosting all Anchors in the same session. For each test we compared the results for each Anchor hosting method, using two devices: an iPhone SE 2020 and a Huawei H20 Pro. Table 1 presents the results for these metrics and clearly indicates why we finally selected Method 2 for the route recording process. In a single AR session, the surrounding feature points are visited more than once and are recorded more precisely, thus resulting in better SLAM.

Table 1. Session hosting

Additionally, 100 tests were performed to find the percentage of successful navigations for the “Erasmus Office” route using the iPhone SE 2020 device.

Table 2. Success rate per missed Anchors

Table 2 relates the number of successful tests to the number of missed Anchors. 58% of the tests had no missed Anchors, while tests with one, two or even three missed Anchors still resulted in a successful navigation, giving a final 85% of successful navigations. The routing algorithm is developed in such a way that if a checkpoint is not recognized but the next one is, navigation proceeds to the next checkpoint, skipping the current one.
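The skipping behaviour can be illustrated with a short sketch. It assumes a look-ahead of exactly one checkpoint, as the sentence above describes; the function name and the representation of recognized Anchors as a set of identifiers are hypothetical.

```python
def advance(checkpoints, current_index, recognized):
    """Return the index of the checkpoint the user should head to next.

    checkpoints: ordered list of checkpoint Anchor ids along the path.
    current_index: index of the checkpoint currently being searched for.
    recognized: set of Anchor ids the device has recognized so far.
    """
    if checkpoints[current_index] in recognized:
        return current_index + 1  # normal case: move on to the next one
    following = current_index + 1
    if following < len(checkpoints) and checkpoints[following] in recognized:
        return following + 1      # current one was missed, skip past it
    return current_index          # keep searching for the current one


route = ["c0", "c1", "c2", "c3"]
next_index = advance(route, 1, {"c2"})  # c1 missed, c2 seen: continue at c3
```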

5 Discussion

A significant limitation of our system is the requirement for ARCore-compatible devices. Concerning the efficiency of SLAM, if a corridor or place contains no distinctive characteristics other than plain walls, or is filled with people, localization will take longer. It should be noted, though, that the aforementioned limitations are directly bound to the quality of the Anchor placement procedure.

Low GPS/cellular coverage or the lack of precision in small displacements is a deterrent to using these technologies, while the installation of transmitters (sensors) can be expensive and slow down the integration of such a system. Although existing studies try to follow a SLAM-oriented direction, they face many limitations, obstacles and restrictions which this work overcomes. Other works are proof-of-concept prototypes with limited functionality, lack a user/tester-friendly experience or do not meet hardware independence standards. The between-checkpoint-distancing problem, which occurred not only in our development process but was also indicated as an issue in other published works, is solved using our NEP algorithm. Moreover, instead of using Dijkstra’s or other shortest path algorithms, the proposed system follows a multi-floor-centric model with support for all use cases a visitor will create. Our state-of-the-art routing algorithm, introducing the “room to gateway” model, ensures that the correct path will be presented while keeping the use of AR resources to the minimum required.

6 Conclusion

Advancements in Computer Vision technologies, along with the evolution of microprocessors, sensors and cameras, form a rich set of assets which facilitate the implementation of innovative solutions for indoor localization and navigation. The proposed solution not only improves and expands the AR Anchor “checkpoint” navigation approach, but also introduces a new routing algorithm and offers interoperability and platform independence.

Future extensions of this work may include wider monitoring of system usage. Moreover, other cross-platform AR frameworks (e.g. Unity AR Foundation) can be considered and compared with the current implementation. SLAM’s environmental understanding provides an indication of a device’s deviation from a route path. For people with special needs, and particularly the visually impaired, a voice command module indicating directions and deviations is a possible future proposal. AR navigation systems can be tested in other scenarios with limited signal coverage and short distances between targets (e.g. in a museum or hospital). Considering the route creation procedure, an extension of the system is expected to further automate this process, and possibly eliminate completely the need for an experienced system administrator, by delegating this functionality to an AI tool integrated into the Creator Module.

The case of “UNIPI: AR Experience” highlights the development process of a complete functional system with a Hybrid AR Platform and a client application, which can be applied in real conditions, providing satisfactory results while keeping cost and interventions at the lowest level.