1 Introduction

Simultaneous localization and mapping (SLAM) technology is the main solution to a key problem in intelligent robot research: how a robot can build a map of an unknown environment while simultaneously localizing itself within that map, thereby providing technical support for subsequent tasks.

Because a single robot has difficulty completing complicated tasks, owing to limited movement capability, insufficient computing power, poor anti-interference capability and so on, a multi-robot system is demanded. Facing a large-scale complex environment, multiple robots can be dispersed into different regions of the environment. Each robot uses SLAM technology to establish a local environment map/model, and the system then fuses these local models into a larger-range environment model. Compared with single-robot SLAM, SLAM completed collaboratively by a multi-robot system is more accurate, more efficient and more robust. The main challenges in implementing a multi-robot SLAM system include bandwidth limitations, map fusion, asynchronous communication, coherent information integration, and data association between robots [1].

Most multi-robot SLAM methods are based on corresponding single-robot SLAM algorithms. Earlier implementations were based on filtering methods, such as the Extended Kalman Filter (EKF) [2, 3], and inherit some of the shortcomings of filter-based SLAM. The current mainstream multi-robot SLAM algorithms are based on graph optimization or pose graph methods [4, 5].

At present, most research on the multi-robot SLAM problem is still at the simulation stage, where the sensed information and the communication conditions are often assumed to be completely ideal. This study focuses on the map fusion problem in multi-robot SLAM, proposes a new method based on image stitching [6], and verifies it experimentally in a real scene on a mobile robot platform developed by ourselves. With the help of the ROS data transmission mechanism, our map fusion method does not depend on any specific single-robot SLAM algorithm, as long as the maps it generates can be converted to grayscale images. The method can process maps from any number of robots within the limits of the available computing power, and allows robots to be dynamically added to or removed from the system.

The method closest to this work was proposed by Hörner [7]; both draw on the principle of image stitching in computer vision. The differences are that this work uses deep features and includes a relocalization function based on the global map. In addition, this work has been verified by experiments in a real scene.

The rest of this paper is organized as follows. Section 2 presents the method overview. The map fusion method based on image stitching for multi-robot SLAM is detailed in Sect. 3. The experiments are presented in Sect. 4, and conclusions are drawn in Sect. 5.

2 Method Overview

The proposed method fuses the 2D occupancy grid maps independently established by multiple robots; the method pipeline is shown in Fig. 1.

Fig. 1. Method pipeline

Assume that n robots participate in the mapping task. The map built by a single robot through lidar SLAM is called a local occupancy grid map (local map for short), \(Lmap_i\) (\(i=1, 2,...,n\)), and the fused occupancy grid map is called the global map \(\mathcal {M}\). The local environment maps are represented as common occupancy grids. A mapping \(\mathcal {F}: \text {Occupancy Grid Map} \rightarrow \text {Grayscale Image}\) is established from occupancy grid maps to grayscale images. Obviously, the proposed method also supports other map formats, as long as the format can be mapped to a grayscale image.
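As a concrete illustration, the sketch below gives one possible implementation of \(\mathcal {F}\) and \(\mathcal {F}^{-1}\) in Python with NumPy, following the cell-value conventions detailed in Sect. 3.1; how the known probabilities are scaled onto pixel values is our own assumption, not fixed by the method.

```python
import numpy as np

def occupancy_to_grayscale(grid: np.ndarray) -> np.ndarray:
    # Mapping F: occupancy grid -> standard 8-bit grayscale image.
    # Cells hold obstacle probabilities in [0, 100]; -1 means unknown
    # and is mapped to pixel value 255 (Sect. 3.1). Scaling the known
    # probabilities onto 0..254 is our own illustrative choice.
    img = np.empty(grid.shape, dtype=np.uint8)
    unknown = grid < 0
    img[unknown] = 255
    img[~unknown] = (grid[~unknown].astype(np.float32) * 254.0 / 100.0).astype(np.uint8)
    return img

def grayscale_to_occupancy(img: np.ndarray) -> np.ndarray:
    # Inverse mapping F^{-1}: pixel 255 becomes unknown (-1) again.
    known = np.rint(img.astype(np.float32) * 100.0 / 254.0)
    return np.where(img == 255, -1, known).astype(np.int8)
```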

After each \(Lmap_i\) has been converted into a grayscale image \(Imap_i\) (\(i=1, 2,...,n\)), the neural network SuperPoint [8] is used to extract feature points and compute descriptors, which yields higher image matching accuracy than traditional ORB [9] or SIFT [10] features. After feature matching, the coordinate transformation relationships between local maps are estimated, that is, the homography matrices are solved. The local maps and their matching relationships form a weighted graph, called the matching topology graph G. The maximum spanning tree Tr is built in the largest connected component H of G, and map fusion is achieved by exploring Tr. Finally, the inverse mapping \(\mathcal {F}^{-1}\) converts the global map from a grayscale image back into an occupancy grid map \(\mathcal {M}\).

The obtained global map can then provide support for the robot to perform other advanced tasks in the area, such as target search and path planning.

3 Multi-robot SLAM Map Fusion

3.1 Estimate the Coordinate Transformations Between Local Maps

In order to fuse the local maps, it is necessary to calculate the transformations between the local occupancy grid maps established by the individual robots. The method follows these steps (a code sketch of steps (2)-(4) is given after the list):

(1) Convert the local grid map \(Lmap_i\) (\(i=1, 2,...,n\)) built by each robot into a grayscale image \(Imap_i\) (\(i=1, 2,...,n\)). The value of each cell lies in the range [0, 100] and represents the probability that the cell contains an obstacle; an unknown probability is represented by \(-1\). When a local map in this form is converted to a grayscale image, a value of \(-1\) is mapped to the pixel value 255, which yields a standard 8-bit grayscale image.

(2) Use the SuperPoint network to extract the feature points and calculate the feature descriptors of each local grid map image \(Imap_i\).

(3) Carry out feature matching for each pair of local maps, using the brute-force matching algorithm. If there are a large number of robots, a parallel hierarchical clustering algorithm is used to speed up the matching.

(4) Solve the coordinate transformation matrix \(T_{i, j}\) between each pair of local grid maps \((Imap_i, Imap_j)\). The RANSAC [11] method is used to filter the matched features, the SVD method is used to solve \(T_{i, j}\), and the matching confidence \(c_{i, j}\) of the corresponding match is calculated as

$$\begin{aligned} c_{i, j}=\frac{{number\_inliers}_{i, j}}{8+0.3 \times {number\_matches}_{i, j}}, \end{aligned}$$
(1)

where \({number\_inliers}_{i, j}\) is the number of inliers found by RANSAC, and \({number\_matches}_{i, j}\) is the number of matched feature points between the pair of local maps \((Imap_i, Imap_j)\).

(5) Eliminate matches with a confidence less than 1, and form the matching topology graph G. The vertices of G are the local maps \(Imap_i\) built by the single robots, and each edge stores \(T_{i, j}\), \(c_{i, j}\), \({number\_inliers}_{i, j}\) and \({number\_matches}_{i, j}\).
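As a concrete illustration of steps (2)-(4), the minimal sketch below uses OpenCV. ORB stands in here for the SuperPoint network (whose deployment is framework-specific), and the built-in RANSAC of cv2.findHomography replaces the separate RANSAC-plus-SVD estimation described in step (4); the confidence follows Eq. (1).

```python
import cv2
import numpy as np

def match_local_maps(img_i, img_j):
    # Steps (2)-(4) in miniature: detect features, brute-force match,
    # estimate T_{i,j} with RANSAC, and score the match with Eq. (1).
    orb = cv2.ORB_create()  # stand-in for SuperPoint
    kp_i, des_i = orb.detectAndCompute(img_i, None)
    kp_j, des_j = orb.detectAndCompute(img_j, None)
    if des_i is None or des_j is None:
        return None, 0.0
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_i, des_j)
    if len(matches) < 4:  # findHomography needs at least 4 pairs
        return None, 0.0
    src = np.float32([kp_i[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_j[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    T, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    n_inliers = int(inlier_mask.sum()) if inlier_mask is not None else 0
    c = n_inliers / (8.0 + 0.3 * len(matches))  # Eq. (1)
    return T, c
```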

Each time a robot updates its local map, it uploads only the updated incremental map to the central node instead of the entire local map, which effectively reduces the amount of data transmitted. The central node then regenerates the corresponding grayscale image.
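A minimal sketch of how the central node might apply such an incremental patch, assuming the local map is kept as a NumPy array and the update carries the fields of a map_msgs/OccupancyGridUpdate message (cf. Sect. 4.1):

```python
import numpy as np

def apply_update(local_map: np.ndarray, upd) -> np.ndarray:
    # Write a map_msgs/OccupancyGridUpdate patch into the stored map:
    # upd carries the patch origin (x, y), its size, and row-major data.
    patch = np.asarray(upd.data, dtype=np.int8).reshape(upd.height, upd.width)
    local_map[upd.y:upd.y + upd.height, upd.x:upd.x + upd.width] = patch
    return local_map
```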

3.2 Map Fusion Based on Matching Topology Graph

After the matching topology graph is established, if one of the local maps has strong matching relationships with all the others, the coordinate system of this local map is fixed as the world coordinate system, and coordinate transformations are applied to the remaining maps according to the results of feature matching. The global map is thus established.

If there are a large number of robots, or there are maps that need to be eliminated, such as local maps with large errors or isolated local maps, the graph method is used for map fusion [7]. It is common for some local grid maps not to match successfully. In order to cover as large an environmental area as possible, the weighted maximum connected component H of the matching topology graph G is considered, and only the local maps contained in H are fused. The coordinate system of one of the local maps in H is selected as the global coordinate system. The maximum spanning tree Tr is built in H, and by exploring Tr, the transformations between the local maps and the global coordinate system are determined: each local map obtains its final transformation by composing the pairwise transformations along its path in Tr. After all transformations are applied, the map fusion is complete.
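The following compact sketch illustrates this graph-based fusion with NetworkX and NumPy. The edge attribute names ('T', 'confidence', 'source'), the orientation convention for stored transformations, and the choice of the largest component by node count are our own assumptions for illustration:

```python
import networkx as nx
import numpy as np

def fuse_transforms(G: nx.Graph) -> dict:
    # Vertices of G are local maps; each edge stores the estimated 3x3
    # transformation 'T' (mapping the 'source' map's coordinates into
    # the other endpoint's frame) and its weight 'confidence'.
    H = G.subgraph(max(nx.connected_components(G), key=len))  # largest component
    Tr = nx.maximum_spanning_tree(H, weight="confidence")
    root = next(iter(Tr.nodes))  # this map's frame becomes the global frame
    world_T = {root: np.eye(3)}
    for parent, child in nx.bfs_edges(Tr, root):
        T = Tr.edges[parent, child]["T"]
        if Tr.edges[parent, child]["source"] != child:
            T = np.linalg.inv(T)  # orient T so that it maps child -> parent
        world_T[child] = world_T[parent] @ T  # compose along the tree path
    return world_T
```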

3.3 Robot Relocalization

Relocalization with maps built by lidar SLAM is often more difficult than visual relocalization, because the information stored in a lidar SLAM map is not rich enough. At present, lidar relocalization usually relies on the loop detection components of Cartographer [12] or Karto-SLAM [13]. In this study, after the global map \(\mathcal {M}\) has been obtained, a newly entering robot k uses its own lidar to build a local grid map \(Rmap_k\), and the method described in Sect. 3.1 is used to perform feature matching between \(Rmap_k\) and the global map \(\mathcal {M}\). The coordinate transformation between the local grid map and the global map is estimated, from which the position of the newly entering robot in the global coordinate system is obtained, realizing relocalization and providing support for tasks such as navigation and target search.
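Reusing the hypothetical match_local_maps helper sketched in Sect. 3.1, relocalization reduces to one additional matching step; conversion from map pixels to metric coordinates via the map resolution is omitted here:

```python
import numpy as np

def relocalize(rmap_k_img, global_img, pose_px):
    # Match the new robot's local map against the global map and carry
    # its pose (given in pixels of Rmap_k) into the global frame.
    T, conf = match_local_maps(rmap_k_img, global_img)
    if T is None or conf < 1.0:  # same confidence gate as step (5)
        return None
    p = np.array([pose_px[0], pose_px[1], 1.0])
    q = T @ p  # homogeneous point through the estimated homography
    return q[0] / q[2], q[1] / q[2]
```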

4 Experiments

4.1 Multi-robot Map Fusion Experiment

A real-world experiment is conducted using three omnidirectional ground mobile robots, shown in Fig. 2(a). The experimental area is an indoor environment, shown in Fig. 2(b). The three mobile robots, each carrying a laser range scanner, are distributed in different areas to measure the environment and perceive environmental information. In the experiment, the robots are remotely controlled to move within separate parts of the field and build local maps, so the map established by each robot covers only part of the entire environment. The fused map on the central node includes the contents of all three local maps.

Fig. 2. (a) Mobile robot running single-robot SLAM. To avoid mutual lidar interference, the lidars of the three robots are mounted at different heights. (b) Experimental area.

Each robot uploads the environmental information it obtains to the central node. In this experiment, the central node is only responsible for controlling the movement of the mobile robots and for performing map fusion; single-robot SLAM is implemented independently on each mobile robot and does not depend on the central node. The central node is a laptop with an Intel Core i7-8750H and 8 GB of RAM. Communication between the central node and the mobile robots is over WiFi, and our method runs on ROS. The local map established by a single robot is published in ROS as a nav_msgs/OccupancyGrid message, and the corresponding incremental map updates as map_msgs/OccupancyGridUpdate messages. Common SLAM algorithms in ROS that support this format include Karto-SLAM [13], Gmapping [14], and so on.
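On the ROS side, the central node only needs to subscribe to each robot's map topics. A minimal rospy sketch for one robot follows; the topic names /robot_1/map and /robot_1/map_updates are our assumption following common ROS conventions, not names fixed by the paper:

```python
import rospy
from nav_msgs.msg import OccupancyGrid
from map_msgs.msg import OccupancyGridUpdate

def on_full_map(msg: OccupancyGrid):
    # msg.info holds resolution/size/origin; msg.data is the row-major
    # grid with values in [0, 100] and -1 for unknown cells.
    rospy.loginfo("map %dx%d @ %.3f m/cell",
                  msg.info.width, msg.info.height, msg.info.resolution)

def on_map_update(msg: OccupancyGridUpdate):
    # Incremental patch with origin (msg.x, msg.y) and size width x height.
    rospy.loginfo("update %dx%d at (%d, %d)", msg.width, msg.height, msg.x, msg.y)

if __name__ == "__main__":
    rospy.init_node("map_fusion_central_node")
    rospy.Subscriber("/robot_1/map", OccupancyGrid, on_full_map)
    rospy.Subscriber("/robot_1/map_updates", OccupancyGridUpdate, on_map_update)
    rospy.spin()
```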

The map fusion process observed in the display interface of the central node is shown in Fig. 3.

Fig. 3. Map fusion in the exploration process of three ground mobile robots.

As shown in Fig. 3, after 380 s of exploration, the three ground mobile robots completed the task of building a global map. On the display interface of the central node, the localization and mapping process of the mobile robots could be observed. Initially, the fusion error between the three local maps was relatively large because there were few feature points. After 95 s, the fusion of the mobile robot maps achieved a good result. After 380 s, the three robots together had covered most of the site, the global map contained enough area information, and the task was completed.

4.2 Multi-robot Target Search Experiment

In the same field, 7 new mobile robots are placed; the configuration of these robots and the lidars they carry differs from the three robots used for mapping. In this experiment, the robots themselves do not have a localization function; each robot only detects environmental information and uploads it to the central node. The central node uses the previously obtained global map to provide a localization service for the robots and guide them in a collaborative search task. The seven mobile robots use an improved PSO algorithm for target search; the specific implementation follows our previous work [15]. The experiment process is shown in Fig. 4.

Fig. 4. Multi-robot target search experiment. (a) Seven robots were in their initial positions. (b) Seven robots were searching. (c) Target search task completed.

5 Conclusions

In this study, a multi-robot SLAM map fusion method based on image stitching is proposed. In the method, the occupancy grid maps are mapped to grayscale images, and the transformation relationships between maps are estimated through deep feature matching. On this basis, map fusion is performed using the matching topology graph. The method can fuse multiple local maps while consuming few computing resources. We carried out experiments with three self-developed mobile robots in an indoor environment, and the experiments showed that the method has good real-time performance and robustness. After the global map was obtained, new robots were deployed in the environment and a multi-robot target search task was completed successfully using the relocalization function.

Future work might include conducting experiments with more robots in real scenes and realizing autonomous robotic exploration.