1 Introduction

The annual investments in the infrastructure sector represent a significant percentage of the Gross Domestic Product (GDP) of developed and developing countries, e.g. 3.9% of the GDP for the old European states, 5.07% for the new European states and 9% for China [1]. A variety of methods and approaches are adopted to address the challenges of infrastructure inspection; as an example, specialized personnel perform visual inspection, nondestructive testing and maintenance tasks using scaffolds, ropes or even manned helicopters in order to obtain access to the sites of interest. According to the Helicopter Association International and the Utilities, Patrol and Construction Committee (UPAC) Safety Guide for Helicopter Operators, 2009 [2], thousands of flight hours are accumulated each day conducting manned aerial work in support of utility infrastructure (electricity, gas, water), as it is now well understood that such aerial work brings down the cost and time requirements.

In order to decrease the risk to human life and to increase the performance of the overall procedure, autonomous ground, aerial or maritime vehicles are employed for executing the inspection tasks. Examples of such applications include power-line monitoring using autonomous mobile robots [3], bridge inspection [4], boiler power-plant 3D reconstruction [5], urban structure coverage [6], forest fire inspection [7] and aerial manipulation [8] using UAVs, inspection of underwater structures by autonomous underwater vehicles [9], and cooperative sensing [10]. In most of these scenarios, there is a priori knowledge about the infrastructure, since 3D or 2D models are available or can be derived using CAD software.

Unmanned Aerial Vehicles (UAVs) equipped with remote sensing instrumentation have been emerging in recent years due to their mechanical simplicity, agility, stability and outstanding autonomy in performing complex manoeuvres [11]. A variety of remote sensors, such as visual, laser, sonar and thermal sensors, can be mounted, while the information acquired during the UAV's mission can be analyzed to produce sparse or dense surface models, hazard maps, access assessments, and other area characteristics. However, the main problem in these approaches is to guarantee full coverage of the area, a fundamental problem that is directly related to the autonomous path planning of the aerial vehicles.

This article demonstrates a novel aerial tool for the inspection of complex 3D structures with multiple agents. In this approach, the a priori coverage path is divided and assigned to each agent, based on the architectural characteristics of the infrastructure, in order to reduce the inspection time. Furthermore, to guarantee full coverage and a 3D reconstruction, the introduced path planning for each agent creates an overlapping visual inspection area that enables the off-line cooperative reconstruction. Based on the aforementioned state of the art, the major contribution stems from the direct demonstration of the applicability and feasibility of the overall cooperative coverage and inspection scheme with UAVs in outdoor scenarios, without the utilization of any external reference system, e.g. motion capture systems. This demonstration has significant novelty and impact as an enabler for the continuation of research efforts towards real-life aerial cooperative inspection of aging infrastructure, a concept that, to the best of the authors' knowledge, has never been presented before outdoors and with a real infrastructure as a test case. In the outdoor demonstrations, the UAVs operated autonomously based on odometry information from visual and inertial sensor fusion, without any other localization support, which adds complexity and impact to the acquired results. The image and pose data recorded on board the platforms were post-processed to build a 3D representation of the structure.

The rest of the article is structured as follows. First, a brief review of related work is presented in Sect. 2. Then, the general formulation of the problem is described in Sect. 3. The proposed method is presented in Sect. 4, followed by a brief description of the 3D reconstruction from multiple agents. In Sects. 5 and 6, the hardware setup for autonomous navigation and the experimental results are presented, respectively. Finally, the article concludes in Sect. 7.

2 Related Work

Navigation of multi-agent systems for infrastructure inspection is an area of increasing interest from both a research and an application viewpoint. In recent years, multiple approaches have been proposed regarding obstacle-free path generation for robotic platforms, and the application of potential field-based methods has been explored [12, 13] as a promising research direction for such algorithms. Coverage Path Planning (CPP) is the task of determining a path that passes over all points of an area or volume of interest while avoiding obstacles [14]. The task of coverage is fundamental in many robotic applications, such as visual inspection of complex structures [15], painter robots [16], wall-climbing robots for inspection [17], inspection of complex underwater structures [18], vacuum cleaning robots [19], etc. In [14], a complete survey of CPP methods in 2D and 3D was presented. This article is mainly focused on the application of CPP in aerial robotics for infrastructure inspection, while also providing the required theoretical background.

The task of CPP has received significant attention over the last years in different application scenarios; however, there are still limited CPP approaches in the case of aerial robotics, and fewer approaches that address coverage of 3D spaces [20]. Especially when the CPP concept is extended to a collaborative approach (C-CPP), with the utilization of multiple aerial agents instead of a single one, the overall coverage time has the potential to be dramatically reduced, and the coverage can be achieved realistically by multiple UAVs when taking the flight times and the levels of autonomy into consideration. Thus, inspired by this vision, the main objective of this article is to establish a C-CPP method that is based on a priori knowledge of the infrastructure (e.g. a CAD model) and has the ability to generate proper waypoints for multiple agents, while ensuring full coverage and collision avoidance among the flying agents. As will be presented, the proposed novel scheme creates a sub-coverage path planning for cooperative inspection of the whole infrastructure, while having the capability to detect branches of complicated infrastructures, which can be assigned to different agents. Subsequently, supplementary 3D reconstruction routines can be performed to provide an updated 3D mesh of the structure by using either monocular or stereo cameras.

Towards 3D CPP, Atkar et al. [16] presented an offline 3D CPP method for spray-painting of automotive parts. Their method used a CAD model, and the resulting CPP had to satisfy certain requirements for paint decomposition. In [21], the authors presented a CPP with real-time re-planning for inspection of 3D underwater structures, where the planning assumed a priori knowledge of a bathymetric map; the methodology was adapted for the case of an autonomous underwater vehicle, and the overall approach considered structures without branches. The authors in [18] introduced a new algorithm for producing paths that cover complex 3D environments. In this case, the algorithm was based on off-line sampling, with the application of autonomous ship hull inspection, and was able to generate paths for structures of unprecedented complexity.

In the area of aerial inspection, [6] presented a time-optimal UAV trajectory planning approach for 3D urban structure coverage. In this approach, the structures to be covered (buildings) were initially simplified into hemispheres and cylinders, and at a later stage the trajectories were planned to cover these simple surfaces. In [22], the authors studied the problem of 3D CPP via viewpoint re-sampling and tour optimization for aerial robots. More specifically, they presented a method that supports the integration of multiple sensors with different fields of view and considered the motion constraints of aerial robots. Moreover, in the area of multi-robot aerial coverage, [23] proposed a coverage algorithm with multiple UAVs for remote sensing in agriculture, where the target area was partitioned into k non-overlapping sub-tasks and, in order to avoid collisions, different altitudes were assigned to each UAV and security zones that the vehicles were not allowed to enter were defined.

3 Problem Statement

Multiple aerial robots are employed to address the problem of autonomous, complete, and efficient execution of infrastructure inspection and maintenance operations. To facilitate the necessary primitive functionalities, an inspection path-planner that can guide a team of UAVs to efficiently and completely inspect a structure is implemented. The collaborative team of UAVs should be able to understand the area to be inspected and ensure complete coverage and an accurate 3D reconstruction in order to accomplish complex infrastructure inspection. Relying on the accurate state estimation as well as the dense reconstruction capabilities of the collaborative aerial team, algorithms for autonomous inspection planning should be designed to ensure full coverage.

It should be highlighted that this work is an extension of our preliminary work presented in [24], which studied the problem formulation in 2D. Let \(\varOmega \subset \mathbb {R}^3\) be a given region that can have multiple connected components (complex structure), and consider the finite set

$$\begin{aligned} \varvec{\varLambda }=\{C_i:i\in I_n = \{1,2, \dots , n\}\} \end{aligned}$$
(1)

of cells

$$\begin{aligned} C_i= \left\{ (x,y,z)\in \mathbb {R}^3: (x,y,z) \sim \text {camera specification and position}\right\} . \end{aligned}$$
(2)

The placement of the cells \(C_i\) can be defined by the translation vector \(u_i=(x_i,y_i,z_i)\) and the orientation vector \(o_i=(\phi _i,\theta _i, \psi _i)\), \(i \in I_n\), while the set of translated and oriented cells \(C_i(u_i,o_i)\) is expressed by \(\varLambda (u,o)\), where \(u=\{u_1,u_2,\dots ,u_n\}\in \mathbb {R}^{3n}\) and \(o=\{o_1,o_2,\dots ,o_n\}\in \mathbb {R}^{3n}\) with \(0\le o_i \le 2\pi \).

The 3D polygonal region

$$\begin{aligned} P (u_i,o_i,n)= \bigcup \limits _{i=1}^{n} C_{i} (u_i,o_i) \end{aligned}$$
(3)

represents the region covered by the union of the cells \(C_i\), while \(\varLambda ^*\) is a cover of \(\varOmega \) if there exists a solution such that

$$\begin{aligned} \varOmega \subset P(u_i,o_i,n) = \bigcup \limits _{i \in I_n} C_i (u_i,o_i) \end{aligned}$$
(4)

Moreover, several cases arise in the interaction between two cells \(C_i(u_i,o_i)\) and \(C_j(u_j,o_j)\) with \(i \ne j\), \(u_i=(x_i,y_i,z_i)\), \(o_i=(\phi _i,\theta _i,\psi _i)\), \(u_j=(x_j,y_j,z_j)\) and \(o_j=(\phi _j,\theta _j,\psi _j)\), which are mainly determined by:

$$\begin{aligned} \begin{aligned} C_{i} (u_i,o_i)&\cap C_{j} (u_j,o_j) = \emptyset \\ C_{i} (u_i,o_i)&\cap C_{j} (u_j,o_j) \ne \emptyset \end{aligned} \end{aligned}$$
(5)

Additionally, the cases \(C_i(u_i,o_i) \subset C_j(u_j,o_j)\) and \(C_j(u_j,o_j) \subset C_i(u_i,o_i)\) are not considered when dealing with the coverage problem, since having one cell entirely contained in another introduces substantial overlap beyond what is needed for visual processing, which is contrary to the optimality of a path covering the whole surface of the object under inspection.
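
To make the set-based formulation concrete, the following sketch (in Python, purely illustrative and not part of the original implementation) verifies a cover in the sense of (4) by sampling surface points of \(\varOmega \) and testing their membership in each cell; the conical cell model, the 78\({^\circ }\) field of view and the 6 m depth range are assumptions borrowed from the sensor described in Sect. 5.

```python
import numpy as np

def in_cell(p, u, o, fov=np.deg2rad(78), d_max=6.0):
    """Illustrative membership test for a cell C_i(u_i, o_i): the point p
    belongs to the cell if it lies inside a cone of half-angle fov/2 around
    the camera's optical axis (derived from the yaw psi only, for brevity)
    and within the sensor depth range. The conical model is an assumption."""
    _, _, psi = o
    axis = np.array([np.cos(psi), np.sin(psi), 0.0])
    v = p - u
    dist = np.linalg.norm(v)
    if dist == 0.0 or dist > d_max:
        return False
    return np.arccos(np.clip(np.dot(v / dist, axis), -1.0, 1.0)) <= fov / 2

def is_cover(samples, cells):
    """Numerical check of Eq. (4): every sampled surface point of Omega must
    lie in at least one translated and oriented cell of Lambda(u, o)."""
    return all(any(in_cell(p, u, o) for u, o in cells) for p in samples)
```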

4 Methodology

Collaborative UAVs can be deployed, equipped with advanced environmental perception and 3D reconstruction, intelligent task planning, and multi-agent collaboration capabilities. Such a team of UAVs should be capable of autonomously inspecting infrastructure facilities, and this section presents an experimental setup towards multi-robot collaboration, path planning, localization, and cooperative environmental perception and reconstruction. Furthermore, the mission-oriented planning algorithms are integrated with the control and localization components of the platform in outdoor environments. A major component that affects the overall performance of the inspection task is the path planning strategy. In this work, an extended version of CPP is implemented in a collaborative manner (C-CPP) by the utilization of multiple aerial agents instead of a single one. The attributes of this approach are the full coverage of a complex structure and the reduced mission time [25] for the overall coverage, considering the level of autonomy and flight time of each agent. A full reference on the developed C-CPP can be found in [15].

Briefly, the established C-CPP method is based on a priori knowledge of the infrastructure (e.g. a CAD model) and has the ability to generate proper waypoints for multiple agents, while ensuring collision avoidance among the flying agents. The implemented method creates a sub-coverage path planning for cooperative inspection of the whole infrastructure, while having the capability to detect branches of complicated infrastructures, which can be assigned to different agents. Subsequently, supplementary 3D reconstruction routines can be performed to provide an updated 3D mesh of the structure by using various sensors, such as cameras or lidars. Additionally, the generated waypoints guarantee a sufficiently overlapping Field of View in order to build the 3D model of the structure from the processed sensor data.

The resulting waypoints are then converted into position-velocity-yaw trajectories, which can be directly provided to the utilized linear model predictive controller, cascaded over an attitude-thrust controller [26]. This is done by taking into account the position controller's sampling time \(T_s\) and the desired velocity along the path \(\mathbf {V}_d\). The trajectory points are obtained by linear interpolation between the waypoints, in such a way that the distance between two consecutive trajectory points equals the step size \(h = T_s ||\mathbf {V}_d||\). The velocities are set parallel to each waypoint segment, and the yaw angles are also linearly interpolated with respect to the position within the segment. This trajectory generation scheme was used in the experimental realization of the proposed C-CPP.
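
A minimal sketch of this waypoint-to-trajectory conversion is given below; the function name and array layout are illustrative assumptions, but the step size \(h = T_s ||\mathbf {V}_d||\), the segment-parallel velocities and the linear yaw interpolation follow the description above.

```python
import numpy as np

def waypoints_to_trajectory(waypoints, yaws, v_d, t_s):
    """Convert waypoints (N, 3) and yaw angles (N,) into position-velocity-yaw
    references for the position controller: consecutive points are spaced
    h = t_s * ||v_d|| apart, velocities are parallel to the current segment,
    and yaw is linearly interpolated within each segment (no wrap handling)."""
    speed = np.linalg.norm(v_d)
    h = t_s * speed                          # step size along the path
    refs = []
    for p0, p1, y0, y1 in zip(waypoints[:-1], waypoints[1:],
                              yaws[:-1], yaws[1:]):
        seg = p1 - p0
        length = np.linalg.norm(seg)         # waypoints assumed distinct
        vel = speed * seg / length           # velocity parallel to the segment
        n_steps = max(int(np.ceil(length / h)), 1)
        for k in range(n_steps):
            s = k / n_steps                  # progress within the segment
            refs.append((p0 + s * seg, vel, (1.0 - s) * y0 + s * y1))
    refs.append((waypoints[-1], np.zeros(3), yaws[-1]))  # hover at the end
    return refs
```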

As stated throughout this article, the C-CPP method addresses the case of autonomous cooperative inspection by multiple aerial UAVs. Each aerial platform is equipped with a camera to record image streams and provide a 3D reconstruction of the infrastructure [11]. More specifically, monocular mapping is considered to obtain the 3D model of the infrastructure, while the overall aim is to merge the processed data from multiple agents into a global representation. A Structure from Motion (SfM) approach is used to provide the 3D reconstruction. While the aerial agents follow their assigned paths around the object of interest, the image streams from the monocular cameras of all agents are stored in a database.

The process starts with the correspondence search step, which identifies overlapping scene parts among the input images. During this stage, feature extraction and matching between frames are performed to extract information about image scene coverage. Next, a filtering step is implemented to remove outliers using epipolar geometry [27]. The algorithm requires an initial image pair \(I_1\) and \(I_2\) with enough parallax as the starting point, and then incrementally registers new frames. Briefly, image matches are extracted and the camera extrinsics for \(I_1\) and \(I_2\) are estimated using the 5-point algorithm [28]. Then, projection matrices, including the relative pose between frames, are estimated and used for triangulation of the detected image points to recover their 3D positions \(X^{3D}\). Afterwards, a two-frame Bundle Adjustment refines the initial set of 3D points by minimizing the reprojection error.
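
As an illustration of this initialization, the following OpenCV-based sketch estimates the relative pose of \(I_1\) and \(I_2\) with the 5-point algorithm and triangulates the matched points; the feature type (ORB) and the RANSAC threshold are assumptions, since the article does not specify the exact pipeline configuration.

```python
import cv2
import numpy as np

def two_view_init(img1, img2, K):
    """Two-frame SfM initialization: match features between I1 and I2,
    estimate the essential matrix with the 5-point algorithm inside RANSAC,
    recover the relative pose and triangulate the inlier matches."""
    orb = cv2.ORB_create(4000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    p1 = np.float32([k1[m.queryIdx].pt for m in matches])
    p2 = np.float32([k2[m.trainIdx].pt for m in matches])
    E, inliers = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC,
                                      threshold=1.0)
    _, R, t, mask = cv2.recoverPose(E, p1, p2, K, mask=inliers)
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # projection of I1
    P2 = K @ np.hstack([R, t])                         # projection of I2
    good = mask.ravel() > 0
    X = cv2.triangulatePoints(P1, P2, p1[good].T, p2[good].T)
    return R, t, (X[:3] / X[3]).T           # Euclidean 3D points X^3D
```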

The aforementioned process constitutes the initialization step. The remaining images of the dataset are incrementally registered against the current camera and point sets using Perspective-n-Point (PnP) [29]. The newly extracted points are triangulated and processed by a global Bundle Adjustment to correct drift in the process. The absolute scale of the reconstructed object can be recovered by combining the full-pose annotated images from the onboard localization of the camera systems.
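
The incremental registration of a new frame can be sketched similarly: given 2D detections matched against already-triangulated 3D points, PnP with RANSAC recovers the camera pose of the new image. Again, this is an illustrative sketch with assumed parameters, not the article's actual implementation.

```python
import cv2
import numpy as np

def register_frame(pts3d, pts2d, K):
    """Register a new image against the current 3D point set using PnP with
    RANSAC; pts3d (M, 3) are triangulated points matched to the 2D detections
    pts2d (M, 2) of the new frame. Returns the camera pose and the inliers."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.float32(pts3d), np.float32(pts2d), K, None,
        reprojectionError=4.0, flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        raise RuntimeError("PnP registration failed for this frame")
    R, _ = cv2.Rodrigues(rvec)               # rotation vector to matrix
    return R, tvec, inliers
```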

5 Setup for Autonomous Navigation

The proposed method has been evaluated with the utilization of the Ascending Technologies NEO hexacopter, depicted in Fig. 1. The platform has a diameter of 0.59 m and a height of 0.24 m, while the length of each propeller is 0.28 m. The platform provides a flight time of 26 min, can reach a maximum airspeed of 15 m/s and a maximum climb rate of 8 m/s, and has a maximum payload capacity of 2 kg. It carries an onboard Intel NUC computer with a Core i7-5557U and 8 GB of RAM. The NUC runs Ubuntu Server 14.04 with the Robot Operating System (ROS) installed; ROS is a collection of software libraries and tools used for developing robotic applications [30]. Additionally, multiple external sensory systems (e.g. cameras, laser scanners, etc.) can be operated in this setup. Regarding the onboard sensory system, the Visual-Inertial (VI) sensor (weight 0.117 kg) developed by Skybotix AG (Fig. 1) is attached below the hexacopter with a 45\({^\circ }\) tilt from the horizontal plane. The VI sensor is a monochrome global-shutter stereo camera with a 78\({^\circ }\) FOV, housing an Inertial Measurement Unit (IMU) [31]. The cameras and the IMU are tightly aligned and hardware-synchronized. The camera was operated at 20 fps with a resolution of 752\(\,\times \,\)480 pixels, while the depth range of the stereo camera lies between 0.4 and 6 m.

Fig. 1 AscTec NEO platform with the VI sensor attached

Fig. 2 Software and hardware components used for conducting inspections

The proposed C-CPP method, established in Sect. 4, has been entirely implemented in MATLAB. The inputs of the method are an approximate 3D model of the object of interest and specific parameters, namely the number of agents (n), the offset distance from the object (\(\varOmega \)), the FOV of the camera (\(\alpha \)), the desired velocity of the aerial robot (\(V_d\)) and the position controller sampling time (\(T_s\)). The generated paths are sent to the NEO platforms through the utilization of the ROS framework.
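
For reference, the planner inputs listed above can be summarized as in the following sketch; the field names and concrete values are hypothetical examples (the actual implementation is in MATLAB), with the offset and velocity chosen to match the experiments in Sect. 6.

```python
from dataclasses import dataclass

@dataclass
class CCPPConfig:
    """Planner inputs as listed above; the field names are illustrative and
    the actual implementation is in MATLAB, not Python."""
    n_agents: int     # number of agents n
    offset: float     # offset distance from the object [m]
    fov: float        # camera field of view alpha [rad]
    v_d: float        # desired velocity of the aerial robot [m/s]
    t_s: float        # position controller sampling time [s]
    model: str        # approximate 3D model of the object of interest

# Hypothetical values: offset and velocity match the experiments in Sect. 6,
# the FOV matches the VI sensor (78 deg); the sampling time and the model
# file name are assumptions.
config = CCPPConfig(n_agents=2, offset=2.0, fov=1.36, v_d=0.2, t_s=0.01,
                    model="fountain.stl")
```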

The platform contains three main components providing autonomous flight: visual-inertial odometry, a Multi-Sensor-Fusion Extended Kalman Filter (MSF-EKF) [32] and a linear Model Predictive Control (MPC) position controller [26, 33, 34]. The visual-inertial odometry is based on the Robust Visual Inertial Odometry (Rovio) [35] algorithm for pose estimation. It consists of an EKF that uses inertial measurements from the VI sensor's IMU (accelerometer and gyroscope) during the state propagation, while the visual information is utilized during the filter correction step. The outputs of the visual-inertial odometry are the position-orientation (pose) and the velocity (twist) of the aerial robot. Afterwards, the MSF-EKF component fuses the obtained pose information with the NEO IMU measurements. It consists of an error-state Kalman filter, based on inertial odometry, performing sensor fusion as a generic software package, and it has the unique feature of being able to handle delayed and multi-rate measurements while staying within computational bounds. The linear MPC position controller [34] generates attitude and thrust references for the NEO's predefined low-level attitude controller. The image stream from the overall experiment is processed using the method discussed in Sect. 4, while the overall schematic of the experimental setup is presented in Fig. 2.
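
The handling of delayed and multi-rate measurements mentioned above can be sketched as a buffer of time-stamped states that is rewound and re-propagated when a late measurement arrives. The following is a schematic illustration of the idea behind MSF-EKF [32], not its actual implementation; the class and callback names are assumptions.

```python
from collections import deque

class DelayedFusionBuffer:
    """Schematic of delayed and multi-rate measurement handling: time-stamped
    states are buffered, and when a measurement with an old timestamp arrives
    the state is rewound to it, corrected, and re-propagated to the present."""
    def __init__(self, propagate, correct, maxlen=1000):
        self.propagate = propagate            # f(state, imu, dt) -> state
        self.correct = correct                # g(state, measurement) -> state
        self.history = deque(maxlen=maxlen)   # (t, state, imu) tuples

    def on_imu(self, t, state, imu):
        self.history.append((t, state, imu))

    def on_measurement(self, t_meas, z):
        past = [(t, s, u) for (t, s, u) in self.history if t <= t_meas]
        if not past:
            return None                       # older than the whole buffer
        t_prev, state, _ = past[-1]
        state = self.correct(state, z)        # apply the delayed correction
        for (t, _, imu) in self.history:      # re-propagate to the present
            if t > t_meas:
                state = self.propagate(state, imu, t - t_prev)
                t_prev = t
        return state
```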

6 Experimental Results

To evaluate the performance of the method in a real autonomous inspection task, an outdoor experiment was conducted. For this purpose, the Luleå University campus fountain was selected to represent an actual infrastructure for cooperative aerial inspection. The fountain has a radius of 2.8 m and a height of 10.1 m, without branches. The area is considered rural, although it is surrounded by multiple buildings and vegetated in some places. The UAVs navigated in a constrained area around the fountain, kept 2 m away from the structure for safety purposes, and the experiment was bounded in a cylindrical area with a radius of 10 m. In order to achieve fully autonomous flight, the localization of the UAVs relied only on the onboard sensory system; thus, the UAVs followed the assigned paths based on visual-inertial odometry localization. Before the beginning of the experiment, an initialization process was followed to fix the origin of the coordinate frame of the UAVs close to the base of the fountain. This initialization step was adopted for simplicity, due to the robocentric approach of the localization component, which fixes the origin at the position where the algorithm is initiated. It should be noted that there was little wind during the described flight tests and the background was mainly static; however, people passing by or standing to observe the experiment constituted dynamic elements. Overall, the localization algorithm showed promising results despite these small variations in the environment, which can be attributed to the sufficient texture of the inspected structure. The starting points of the two agents had a 180\({^\circ }\) difference, and the cooperative scheme reduced the flight time from 327 s in the case of one agent to 166 s with two agents. The average velocity along the path was 0.2 m/s, and the waypoints were fed to the agents in a way that guarantees maximum inter-agent distance and avoids collisions. It should be highlighted that during the experiment all processes were executed on board the aerial platforms, while the 3D mesh was built offline on a ground station. During the experiment, the UAVs took off manually and switched to autonomous navigation when they reached a specific height; a similar strategy was followed for the landing of the vehicles. These steps were mainly taken for safety reasons during take-off and landing. The actual and reference trajectories followed by both platforms are depicted in Fig. 3.

Fig. 3 Trajectories followed in the outdoor experiment

The 3D model of the structure was built offline using the dataset collected from both aerial agents. The extracted images were combined and processed by the SfM algorithm described in Sect. 4. The fountain and its sparse 3D model are presented in Fig. 4.

Fig. 4 On the left, the Luleå University outdoor fountain; on the right, the cooperative pointcloud of the structure with the estimated flight trajectories

Fig. 5 Cooperative 3D mesh of the outdoor structure

In this experiment, the same strategy as in the indoor experiment was followed for two agents: the starting positions were separated by 180\({^\circ }\), maximizing the inter-agent distance. The overall flight time was reduced from 370 to 189 s, and the average velocity along the path was 0.5 m/s. The sparse reconstruction provides an initial insight into the object under inspection, while an extra step is followed to create a 3D mesh. To retrieve the 3D mesh of the structure, Autodesk ReCap 360 was used [36]; ReCap 360 is an online photogrammetry software suited for accurate 3D modeling. The reconstructed surface obtained from the image data is shown in Fig. 5. The results show that the collaborative scheme of the path planner can be successfully integrated for automating inspection tasks (https://youtu.be/c4q2T5eqYRk).

7 Conclusions

This article presents an aerial tool for the autonomous cooperative coverage and inspection of 3D infrastructure using multiple Unmanned Aerial Vehicles (UAVs). The proposed approach deploys multiple aerial robots and generates collision-free trajectories for the inspection of the 3D structure. The aim is to assign different parts of the scene to each agent for complete structure coverage in a short time, while the agents navigate autonomously. The visual information collected from the aerial team is collaboratively processed to create a dense 3D model, which can be used for inspection purposes. The experimental evaluation of the proposed inspection system demonstrated substantial performance in realistic outdoor cases and could act as an enabler for further developments.