Keywords

1 Introduction

Oceans cover around the 70% of the Earth’s surface and they represent the largest basin of mineral and biological resources. For this reason, the exploration of the underwater environment has always been a relevant field for science and technology. Moving from shallow to deep waters, specifically after the continental slope, the exploration and monitoring become always more challenging, being characterized by high hydrostatic pressure, turbidity, poor light, limited availability of communication technologies (the state of art technologies mainly rely on acoustic communications). Furthermore, the energy consumption and the limited battery life are crucial problems in any underwater mission, given the difficulty to have underwater recharging base stations. However, for the above-mentioned reasons, it is a priority to create new technologies for exploration of the oceans, particularly at deeper water depths.

Nowadays satellite images are increasingly available. They provide coarse-grained maps. To have higher resolutions images, we need to use underwater vehicles. In fact terrain based navigation is a common component in underwater exploration [6, 14]. The monitoring of a wide area can be more efficient when performed with a fleet of underwater vehicles because it can cover larger distances than a single one and can achieve longer missions, being each robot equipped with its own batteries’ system. More relevantly, at the perception level, the use of a fleet of robots is much more convenient than a single robot. In fact, in order to cover an area with an extension comparable with that of a fleet, the single robot should be equipped with much more advanced and costly sensors in order to perform long-range underwater navigation and exploration. For this reason an appropriate algorithm for marine coverage path planning is needed.

The goal of this paper is to determine the best path to be assigned to each robot of an underwater fleet in order to cover the most of the area and to enlarge the perception of specific features. We explore both cases where a single or many features are explored. When dealing with a group of robots the identification of the best path to be covered by each robot is in fact an optimization problem in the large class of Vehicle Routing Problems (VRPs), where the goal is to find the optimal routes for a number of vehicles visiting specific given locations.

The rest of the paper is organized as follows: In Sect. 2 we analyze the existing literature on similar problems, looking at applications both to the marine environment and different environments. In Sect. 3 the problem is clearly defined and outlined, while the mathematical formulation is given in Sect. 4. Finally, the data are presented in Sect. 5 and the experiment results are reported in Sect. 6. The Conclusions summarize the main results of our study.

2 Related Works

Coverage path planning is a well established task in Robotics where many approaches have been proposed  [5]. Nevertheless, the extension to multi-robot system is a very active field of research with field applications  [1, 5, 18].

In Marine Robotics, the topic of autonomous underwater robotic exploration and path planning attracted research attention, as summarized in a recent very comprehensive review  [16]. A work quite similar to our approach, but applied to only one single AUV, is proposed by McMahon and Plaku  [12, 13]. They focus on trajectory planning for a vehicle that has to reach a few target regions within a specified time limit, with applications also to mine countermeasure missions. They implement an algorithm based on the physical traveling salesman problem, including a complex bathimetry of the area and time-varying currents derived from the Operational Forecast System of Chesapeake Bay where they operate.

Nevertheless, the number of projects on underwater multi-vehicles systems or swarms  [3] are increasing. In subCULTron project [20], an heterogenous swarm of AUVs is developed, including free to move a-fishes used for navigation and exploration, fixed a-mussels used for sensing and floating a-pads used to keep a connection with a base station. Within this project, they propose an algorithm of a decentralized exploration of interesting patches in an underwater environment with some randomly placed obstacles. This model is very interesting being decentralized and bio-inspired, but it is still quite simplistic. Another bioinspired monitoring system is proposed in [11], where a AUVs fleet is supported by a set of communication nodes connected to the sea surface. Based on the exploration of the AUVs, a kind of pheromone map is created and shared among the nodes, that is used by the AUVs to understand where the interesting targets are located. This approach is interesting from a theoretical point of view, but it cannot be adapted to different areas, relying on communication nodes that have to be deployed before the exploration starts. Li et al. [10] develop a strategy for AUVs’ mission planning, generating sets of trajectories, including rendez-vous with energy-carrier agents and placements of recharging stations. Fu et al. [4] consider a mixed fleet of underwater and surface marine vehicles to perform data muling from measurements data obtained by underwater sensors. They applied Reinforcement Learning techniques to simultaneously maximize fairness in data transmissions and minimize the travel distance of the surface nodes.

In this paper, we suggest a methodology based on Vehicle Routing Problem (VRP). VRP is a well known combinatorial optimization problem that aims at finding the optimal set of routes for a fleet of vehicles [19]. The literature of exact [9] and heuristic [17] algorithms for the VRP and its variants is vast, both in terms of applications and methodologies. For more details we refer the reader to the following surveys: [7,8,9, 15, 19].

To the best of our knowledge, none of the mentioned works provide optimal exploration behaviour by a large number of robots. In this work, we present for the first time an exact method for finding simultaneously both the optimal routing for a fleet of drones and their starting point. Moreover, we apply this method to real case scenarios, for underwater exploration and monitoring, as explained in the Sect. 3.

3 Problem Formulation

In our real scenario, a first coarse resolution map is available to a centralized controller. The input data of our paper derive from satellite imagery, where bathymetry map with soil and vegetation description is provided by the Abu Dhabi Environmental Authority. The map distinguishes 3 features out of sand: grass, coral, rock. This first low resolution map allows the operator to identify the areas of interest.

In order to have higher resolution data, a fleet of robots has to be deployed. The exploring AUVs are “small” and equipped with simple sensors. Each one is able to get close to the sea bottom to get good resolution images but it can only cover a limited area. Therefore the single robot is unable to solve this task without cooperating with the other AUVs of the fleet. The full fleet, on the contrary, is able to guarantee the coverage of an extended area, collecting at the same time high resolution images and videos. For this reason, after an operator selects the area of interest, a centralized controller calculates the deployment point and the optimal path for each robot in order to maximize the coverage of one or more specific feature (rock, coral, sea-grass) to be monitored. The controller assigns to each robot the proper path and the low resolution map. An ultra-short baseline system (USBL) allows the localization of each AUV. The set of robots is subsequently deployed in the specific location from a service boat. The AUVs can move to explore the features taking pictures and videos, following the predetermined paths and eventually reaching back the deployment point within their individual battery life.

Fig. 1.
figure 1

Sketch of the underwater monitoring scenario, which is one of the main applications of the routing optimization here proposed. An USBL allows the localization of each AUV.

For example, a specific use case particularly relevant for environmental sustainability purposes is the monitoring of the health status of the coral reef and data collection (images and videos) of the 3D structure of this complex habitat, like in the sketch of Fig. 1 .

Paper Contributions. From a more technical perspective, this project aims at formulating a Team Orienteering Problem (TOP) [2] applied to an underwater multi-robot system, that would allow efficient underwater exploration. Many variants of the problem are present in the literature [15]. However, this is the first exact algorithm that includes the choice of the starting point as part of the decision process. The provided routes must satisfy a limit in terms of autonomy. Moreover, the speed of the vehicles is considered constant and equal for all the vehicles. Finally, being the focus on the exploration of the sea bottom, the space is considered 2-dimensional. In order to solve this problem, a new exact mathematical model has been developed, as described in the next section. To the best of our knowledge, this is the first exact algorithm that includes the choice of the starting point as part of the decision process.

4 Mathematical Model

Parameters and Variables. In Table 1 and Table 2 we present the decision variables and the parameters used in the models presented in this section. The space is divided in a grid of squares. Each square is represented by a node. A graph G is used to represent the seabed, where to each node i is associated a given quantity \(p^f_i\) of feature f, representing the abundance of the feature f on that node (i.e. square).

Table 1. Decision variables

In this section we present the mathematical models used to simultaneously compute the optimal starting point and routes for a fleet of underwater AUVs, according to different optimization criteria.

Parameters and Variables. In Table 1 and Table 2 we present the decision variables and the parameters used in the models presented in this section. The space is divided in a grid of squares. Each square is represented by a node. A graph G is used to represent the seabed, where to each node i is associated a given quantity \(p^f_i\) of feature f, representing the abundance of the feature f on that node (i.e. square).

Constraints. All the proposed mathematical models use the following sets of constraints:

$$\begin{aligned}&y_i - \sum _{k=1}^{n_K} \sum _{j \in N^-(i)} x_{j,i}^k \le 0&\forall i \in N \end{aligned}$$
(1)
$$\begin{aligned}&\sum _{j \in N} w_i =1&\end{aligned}$$
(2)
$$\begin{aligned}&1 \ge \sum _{j \in N^-(i)} x_{j,i}^k \ge w_i&\forall i \in N, k \in k \end{aligned}$$
(3)
$$\begin{aligned}&1 \ge \sum _{j \in N^+(i)} x_{i,j}^k \ge w_i&\forall i \in N, k \in k \end{aligned}$$
(4)
$$\begin{aligned}&\sum _{j \in N^+(i)} x_{i,j}^k - \sum _{j \in N^-(i)} x_{j,i}^k = 0&\forall i \in N, k \in K \end{aligned}$$
(5)
$$\begin{aligned}&\sum _{\{i,j\}\in A} d_{i,j}x_{i,j}^k \le d_{max}^k&\forall k \in K \end{aligned}$$
(6)
$$\begin{aligned}&t_i^k +d_{i,j} -M (1-x_{i,j}^k +w_j) \le t_j^k&\forall ~\{i,j\}\in A,k \in K\end{aligned}$$
(7)
$$\begin{aligned}&t_i^k \le M (1- w_i)&\forall i \in N, k \in K \end{aligned}$$
(8)
$$\begin{aligned}&t_i^k \le M \sum _{j \in N^-(i)} x_{j,i}^k&\forall i \in N, k \in K \end{aligned}$$
(9)
Table 2. Parameters

Constraint (1) allows to count one node i as visited only if at least one of the arcs entering in i is equal to one. Constraint (2) imposes that one (and only one) node can be used as base station. Constraints (3) and (4) ensure that the route of each AUVs starts and end at the node selected as base station. Constraint (5) imposes that if a AUV visits node i it must also leave it. Constraint (6) imposes a limit on the maximum distance that each AUV can travel. Constraint (7) imposes the correct order of the nodes visited by vehicle k and avoids the generation of subtours. In our computation we used a value for the “big-M” constant equal to \(M=d_{max}+\max _{(i,j)\in A} d_{i,j}\). Constraint (8) sets to zero the variable t associated to the base station. Constraint (9) sets to zero all the variables t associated to nodes that are not visited by any of the vehicles. In the Sect. 6 we show that the choice of the starting position is crucial for achieving good results in practice.

Single Feature Objective Function. In this work we have \(n_F\) features and to each feature f we associate a set of profits \(p_i^f\), with \(f=1,\dots ,n_F\) and \(i=1,\dots ,n_N\). To obtain a model able to find the set of paths that allow to obtain the maximum collected quantity of a given feature f, we use the following objective function:

$$\begin{aligned}&\max ~\sum _{i=1}^{n_N} p^f_i y_i\;.&\end{aligned}$$
(10)

The objective function (10) is used to sum up the profits of feature f associated to each node.

Multiple Feature Objective Function. On the other hand, we are also interested in finding solutions that allows to obtain a good mix of all the features considered. In this case it is not trivial to understand how to compare two solutions, because of the vectorial nature of the objective function. We therefore propose a maxmin approach, that maximizes the minimum amount of the gathered quantities of the \(n_F\) features. In this context, we add the following objective functions and additional constraints:

$$\begin{aligned}&\max ~\beta&\end{aligned}$$
(11)
$$\begin{aligned}&\beta \le \sum _{i=1}^{n_N} p^f_i y_i&\forall f \in F\;. \end{aligned}$$
(12)

Constraints (12) allows to the non-negative variable \(\beta \) to be equal to the minimum quantity gathered among all the considered features. To better investigate different solutions we also consider situation where we use the following generalization of constraints (12):

$$\begin{aligned}&\beta \le \sum _{i=1}^{n_N} \alpha ^f p^f_i y_i&\forall f \in F \end{aligned}$$
(13)

where the coefficients \(\alpha \)s are used to weight the contribution of each feature.

Model Improvements. To improve model (1)–(9) we add two set of inequalities. The first set of inequalities allows to have a tighter link between the y and the w variables. For each node \(i \in N\), let \(\mathcal {C}(i) \subseteq N\) be the set of nodes that are reachable from node i. A node \(i'\) is reachable from node i if there exists a path of length \(d_{max}\) that starts and ends in i that reaches node \(i'\). With this definition, we define the following set of inequalities:

$$\begin{aligned}&\sum _{j \in \mathcal {C}(i)} w_{j} \ge y_i&\forall i \in N\;. \end{aligned}$$
(14)

Inequalities (14) allows to have a variable \(y_i\) with a value of one only if at least one of the nodes that can be reached by i is selected as starting position.

5 Basic Data and Methodology

Fig. 2.
figure 2

Map showing the mainland and marine environment studied and the 12 different snapshots used in the experimental section.

The experiments are based on a map provided by the Abu Dhabi Environment Agency based on satellite imagery (see Fig. 2). The habitat map reports 54 different features (original features) on both marine and terrestrial habitats.

The 54 starting features are grouped into three groups: relevant features (distinguished in rock, coral and grass), other seabed (not of interest) and mainland. We focus on 12 snapshots of size 16 km \(\times \) 9 km providing an interesting combination of the three basic features considered.

We tested two options for the size of the square grids: High granularity, corresponding to squares of \(200\,\times \,200\) and Low granularity, corresponding to squares of \(1000\,\times \,1000\)  m. In Figs. 3a, and 3b, we show an example of how a snapshot is preprocessed according to the chosen granularity. In both figures, each square is filled with four colors: light blue (seabed), dark green (sea grass), light green (rock), red (coral) and peach (mainland). The height of each color is proportional to the percentage of the abundance of the corresponding feature.

Fig. 3.
figure 3

Low and high granularity (Color figure online)

6 Experimental Results

Impact of the Starting Point. As a first step in our analysis, we aim at assessing how the choice of the starting point can impact the overall quality of the obtained solution.

To assess the impact of the starting position it is not required to have an accurate description of the environment. For this reason, to perform this analysis we use the 12 snapshots presented in Sect. 5 with a low granularity and with a fleet of 3 AUVs and 4 km of autonomy each. The time limit imposed to the solver is of two hours. We computed the best solution according to the following objectives: maximizing the amount of rock (R), of coral (C), of grass (G), the minimum among the three features (MM), the minimum among the amount of normalized quantity of each feature, divided by the total availability of each feature (MMN).

The first three options are obtained by solving model (1)–(9) with objective function (10), with f equal to either rocks, coral or grass. The option MM is obtained by solving model (1)–(9) with objective function (11) and constraints (12). The option MMN is obtained by solving model (1)–(9) with objective function (11) and constraints (13) with \(\alpha ^f = \dfrac{1}{\sum _{i \in N}p^f_i}\). The goal of options MM and MMN is to search for a solution that balances the collected quantity of the three features. The option MMN is a normalized version of MM, where the contribution of each feature is divided by its total presence in the map, in order to better handle the situations where one feature is significantly more scarce than the others.

The results are presented in Table 3. Table 3a presents the amount of collected feature by each option. For example, if we consider the first row, associated with the option R, we have that the average quantity gathered of rocks is 7.11 and it is (as theoretically expected) the highest among the five options. However, a (relatively small) quantity of the other features is still gathered on average.

Table 3b shows the average distance (in Km) between the starting point selected by the solution of the given option and the starting point of the other options. In this table, we see for example that the distance between the optimal starting point that maximizes the rocks and the one that maximizes the corals is on average 7.04 Km. The results provided by Table 3b clearly show that a careful choice of the starting position has a strong impact on the overall performances. Preliminary studies also showed that for many instances it is sufficient to move 1 Km to have a drastic decrease/increase of the collected quantity of a given feature. Coral and grass are significantly scarcer in comparison to rocks. This implies that if we are not including the collection of them in the objective function, it is unlikely to collect a significant quantity of the feature. For example, the quantity of coral collected with the options R and G is 0.08 and 0.33; a similar behaviour happens for grass. For this reasons, it seems a good idea to consider also the more balanced solutions provided by MM and MMN.

Table 3. a) Amount of the collected features; b) Relative distance of the starting position and c) Amount of rocks gathered (km\(^2\)).

Impact of the Fleet Characteristics.

To assess the impact of the number of AUVs and their autonomy we need a more accurate description of the environment. For this reason, to perform this analysis we use snapshots with high granularity. To obtain more realistic information we run different combinations of number of AUVs \(n_K\in \{3,5,10\}\) and autonomy \(d^k_{max}=\{200\,m, 300\,m, 500\,m\}\). Such high granularity, combined with a significant number of AUVs leads to models that are really challenging to solve to optimality. To keep the computational effort under control, we focus only on 4 of the snapshots presented in Sect. 5 and we focus only on the solutions aiming at maximizing the quantity of collected rocks. In Table 3c we present the average collected quantity of rocks over the selected snapshots.

A possible issue of dealing with a variable fleet of AUVs is the risk that an increase of the size of the fleet does not produce the same increase of the observed feature, but a sublinear relationship is shown. The aim of Table 3c is to assess this sublinear behavior in our application. The table shows that the value of the gathered quantity scales well with the increase of the number of AUVs. Going from 3 to 10 AUVs allows to almost double the area covered. Therefore the use of a fleet of AUVs is highly recommended for large areas exploration. Moreover, Table 3c shows also that an increase of autonomy (represented in this paper by the maximum full path \(d_{max}\) covered by each AUV) is beneficial in terms of the gathered information. In fact the table shows that an increased autonomy is more effective than the fleet size.

Multi Objective Analysis - Pareto Frontier. To better understand the interaction between the three features we investigate in this section the Pareto frontier for pairs of features.

We focus on studying the Pareto frontier on instances with low granularity, with a fleet of 3 AUVs and 4 km of autonomy each. This allows us to work with a time limit of two hours. We consider the following combinations of \(\alpha \) parameters: \((\alpha ^{rocks}, \alpha ^{coral}, \alpha ^{grass}) \in \{\) (1, 5, 5), (5, 1, 5), (5, 5, 1), (1, 10, 10), (10, 1, 10), (10, 10, 1), (1, 5, 10), (1, 10, 5), (5, 1, 10), (10, 1, 5), (10, 5, 1), \((5,10,1)\}\). Figure 4 shows the three Pareto frontiers that we obtain when comparing pairs of features. We decided to show the Pareto frontier for only a single snapshot for conciseness reasons and because in general the plots do not change significantly from one snapshot to the other.

Fig. 4.
figure 4

Pareto Frontiers

Considering Fig. 4a, we notice that if we look for a solution with mixed quantities of rocks and coral it is likely to obtain solutions with significantly fewer quantities for both features. The same trend happens for Fig. 4b. Conversely, Fig. 4c shows that finding a mixed solution of coral and grass seems easier. This last behaviour is well explained by the nature of the two features: coral and grass are clearly more likely to be part of the same biome. The data seems to suggest that, when planning the exploration of an area, it is recommendable to plan a mixed exploration only if the two features involved are grass and coral. On the other hand, if we are interested in gathering information about rocks, it is more fruitful to plan a dedicated exploration.

7 Conclusions

In this paper, we propose the use of an underwater fleet for a cheaper and more reliable underwater area coverage. We provide a tool that can be used for real underwater monitoring, changing the input parameters based on the specific mission. Compared to the use of a single vehicle, a fleet is cheaper, more flexible, scalable and robust, and -most relevantly- has improved perception, because it can provide detailed information (images, videos) of large areas even in complex environment.

In our scenario, a detailed (high-resolution) exploration and mapping is performed by a fleet of AUVs. A centralized controller has a map with coarse resolution, as can be obtained from satellite’s images. Based on this it runs an optimization algorithm able to identify the best paths each vehicle has to cover, in order to get the most possible information for a specific feature (coral, grass, rock) or a set of them, minimizing the energy consumption and guaranteeing all the robots complete their mission. Then, the controller transfers to each AUV its best path and the map. The fleet is deployed to get high resolution data.

The algorithm has been applied to real environmental data, specifically underwater maps based on satellite imagery and provided by the Abu Dhabi Environmental Authority. An example of a use case is the collection of detailed information on the health and conservation status of corals, or different species of seagrass, as it is needed by local Environmental Authorities. The optimization problem has been exactly solved. To the best of our knowledge, this is the first exact algorithm that includes the choice of the starting point in the decision process.

Our results show that the use of a fleet is beneficial, and this effect increases enlarging the size of the fleet. Larger fleets are able to collect more relevant information, as well as longer battery life increases the area covered by the fleet. We also observe that an increase of the fleet size is useful only if also the AUVs autonomy is increased. The Pareto frontiers have been analyzed, providing useful information for the mission decision plans: the monitoring missions of corals and seagrass can be performed together, while possible missions focused to investigate minerals and rocks need dedicated plans.

From a practitioner’s point of view, we demonstrate that the proposed model is effective in representing the main salient aspects of submarine exploration. In case it is needed to solve scenarios with significantly larger dimensions than those considered in this paper, it will be more appropriate to use heuristic algorithms, which allow to obtain good solutions in a shorter time.

In the next future, we plan to improve our model introducing a more detailed environment (sea currents), by also considering communications, that guarantee the connection among the members of the fleet during the exploration.