1 Introduction

Unmanned aerial vehicles (UAVs) have being received ever increasing attention in the scientific literature. Recent advances in their technology have allowed the emergence of a wide new range of applications, such as military operations (Schumacher et al. 2007), disaster management (Quaritsch et al. 2010), agricultural surveillance (Israel 2011), urban terrain surveillance (Gross et al. 2006), etc.

Without the need of an on-board pilot, an aerial vehicle can be designed to accomplish the so-called D-cube (Dull, Dangerous, Dirty) missions (Ingham 2008). Dull operations are too monotonous or require excessive endurance for human occupants. Dirty operations are hazardous and could pose a health risk to a human crew. Dangerous operations could result in the loss of life for the on-board pilot.

Nevertheless, as outlined in Pascarella et al. (2013), nowadays UAVs are mostly Remotely Piloted Vehicles (RPVs), since their operations are remotely performed by large teams of human operators. Ground operators must be endowed with the proper expertise and this represents a significant cost factor. Furthermore, dull tasks could relieve the remote pilots if they were performed autonomously.

This consideration pushes the necessity to extend aerial platforms capabilities related to autonomy. Data link constraints are another important motivation for autonomy in unmanned systems. Bandwidth is a limited resource and communications with the ground station could be not available in emergency situations. On-board autonomous subsystems could also provide faster reactions to external context changes due to their direct access to sensor data.

Generating optimal flight trajectories that are compliant with the problem constraints is a common problem in autonomous UAVs (Mattei and Blasi 2010). Constraints derive from the mission scenario (such as the avoidance of no-fly zones) and from the vehicle dynamics (e.g., the turn radius, the climb and descent angles, etc.). These algorithms pose a not negligible computational cost which, while not affecting offline applications, has a remarkable impact on possible online applications.

The online trajectory optimization involves an online processing of the flight trajectories to allow for a time-variant mission scenario, such as the occurrence of new obstacles. The problem is further complicated because real time replanning to account any change into the mission objectives and scenarios is a mandatory requirement in these kinds of applications and implies the invariable completion of the reaction to an external stimulus within an hard deadline.

Thus, the reference problem for an online application is a real time planning and replanning problem, that is a dynamic optimization problem. As a consequence, a strategy for an improvement of the temporal behavior is essential for an effective implementation.

A promising method for the speed-up of the trajectory planning process is the parallelization of the computation, since current processors have more cores per micro-chip instead of single higher clock rates. Moreover, such an algorithm may be fitted in with future developments related to collaborative networks of UAVs. This is a suitable solution for complex planning in a distributed environment (i.e., partially controllable and not observable by a single entity).

As stated in Shima and Rasmussen (2009), a loose collection of vehicles that have some objectives in common is a collaborative team. If the vehicles are working together to achieve a common objective, then they are a cooperative team. The main motivation for team cooperation stems from the possible synergy, as the group performance is expected to exceed the sum of the performance of the single UAVs. Such cooperation entails: global information, because each UAV can share its sensor information with the rest of the group via a communication network; resource management, because a cooperative decision algorithm allows efficient allocation of the overall group resources over the multiple targets; robustness, because the team can reconfigure its distributed architecture in order to minimize the performance degradation if a failure occurs.

Here we investigate the parallelization of an algorithm that converts the trajectory optimization problem into a minimum cost path search in a weighted and oriented graph, called core paths graph (CPG) in Mattei and Blasi (2010).

The CPG defines a discrete set of admissible connection paths in the space domain. The weights of the CPG arcs are obtained solving convex quadratic programming optimization problems. The particular case of piecewise polynomial trajectories minimizing flight paths length is fully developed in Mattei and Blasi (2010). The CPG process can also exploit a distributed implementation that is endorsed on a fleet of cooperative UAVs. In detail, we are interested in broadening the basic CPG algorithm discussed in Mattei and Blasi (2010) in order to reduce its lead time for online operations.

This paper is an extended version of Pascarella et al. (2014). Compared with the previous work, here we provide a detailed state-of-the-art survey, an in-depth analysis of the test results and some further considerations about the distributed variation of the CPG algorithm.

2 Related work

During the past years, a lot of work has been carried out on the trajectory optimization for many kinds of vehicles. The variational formulation is probably the most natural and rigorous one for this class of problems. However, the possibility to solve complex problems with variational methods is very poor.

Many papers deal with numerical direct and indirect methods based on the solution of a non linear programming (NLP) problem (Betts 1998; Hargraves and Paris 1987). Unfortunately NLP turns out to be quite burdensome for many practical applications.

By means of some approximations, feasible trajectories can be generated following a purely geometrical approach based on topological techniques creating a sequence of way points. This sequence can derive from probabilistic or potential methods (Hwang and Ahuja 1992; Cen et al. 2007). A purely geometrical approach has been proposed in Asseo (1998) for flight trajectories generation using circular arcs combined with straight connection paths. The A* search algorithm is applied in Yang and Zhao (2004) to generate trajectories over a four-dimensional discretized search space.

An interesting technique, taking into account also flight dynamics, is based on the so called “motion primitives” (Dever et al. 2006; Frazzoli et al. 2002), where flight path is defined through a series of trim conditions and manoeuvres. Definition of smooth trajectories based on Dubins curves compliant with curvature as well as operational constraints are presented in Anderson et al. (2005) and in Chitsaz and LaValle (2007). Different approaches based on mixed integer logical programming (MILP) with a Receding Horizon philosophy have also been proposed (Schouwenaars et al. 2004; Borrelli et al. 2006).

Non-deterministic methods also have received much attention showing their effectiveness and robustness in a wide range of engineering problems. An example of a space plane re-entry trajectory optimization can be found in Yokoyama and Suzuki (2005), whereas hybrid soft computing and evolutionary techniques were used in Vaidyanathan et al. (2001) to compute optimum flight path for UAVs under several aerodynamic constraints.

A real-time free flight path optimization based on improved genetic algorithms is reported in Hu et al. (2004). A numerical potential field combined with a genetic optimizer has been applied for mobile robot path planning in Lei et al. (2006). Again in the field of evolutionary techniques, an example of path planning based on a particle swarm optimizer (PSO) can be found in Raja and Pugazhenhi (2009).

In Li et al. (2006) a mixed fuzzy-PSO technique has been implemented and applied to identify the optimum route over a planning area represented by a mesh of equal square cells. In Saska et al. (2006) smooth trajectories are described through cubic splines whose parameters are optimized by a PSO-based procedure. A further example of PSO application in the field of flight path optimization in a constrained environment can be found in Blasi et al. (2013). An example of path planning in a constrained environment based on a particle swarm optimizer (PSO) can be found in Blasi et al. (2013). Alternative approaches related to hybrid simulation are used in Ficco et al. (2014) and in Gigante et al. (2015).

A few methods for the parallelization of trajectory planning have been discussed in the scientific literature. For example, Kopřiva et al. (2012) presents a parallelization approach of the AA* algorithm. The authors in Kider et al. (2010) introduce R*GPU, a parallel extension of R*, whereas a parallel extension of probabilistic roadmaps is reported in Pan et al. (2010).

Several works are also available as regards the multi-vehicle routing problem by means of decentralized and cooperative management, such as Murray (2007) and Faied et al. (2010).

Anyway, this is the first proposal of a parallelized and a distributed extension of the CPG algorithm.

3 The mathematical problem formulation

Let us consider a two-dimensional space domain \(\mathrm {\Delta }\subseteq \mathbb {R}^{2}\) in which to fly, possibly non connected, with a boundary \(\delta \mathrm {\Delta }\) that is determined by the presence of no-fly zones and/or obstacles. A parametric trajectory function in \(\mathrm {\Delta }\) is denoted with

$$\begin{aligned} s\left( \cdot \right) : t\in \left[ t_{0},t_{f}\right] \rightarrow s\left( t\right) =\Big (x\left( t\right) ,y\left( t\right) ,z\left( t\right) \Big )\; \end{aligned}$$
(1)

where \(\left( x,y,z\right)\) are the coordinates in a given Cartesian reference system.

We denote with \(\dot{s}=\frac{ds}{dt}\) the first-order temporal derivative of the position \(s\) and with \(E=\left( s,\dot{s}\right)\) the vector of positions and their first derivatives.

A flight trajectory from \(E_{0}=\left( s_{0},\dot{s}_{0}\right)\) to \(E_{f}=\left( s_{f},\dot{s}_{f}\right)\) in \(\mathrm {\Delta }\) is a continuously differentiable \(C^1\) oriented curve \(\theta _{E_{0},E_{f}}\) in the independent variable \(t\), connecting the point \(s_{0}\) to the point \(s_{f}\) with assigned first derivatives (tangents) \(\dot{s_{0}}\) and \(\dot{s_{f}}\) and at the extremes. Thus, the flight trajectory \(\theta _{E_{0},E_{f}}\) has the following expression

$$\begin{aligned} \theta _{E_{0},E_{f}} \triangleq \ \Big \{s_{\left[ t_{0},t_{f}\right] }\left( \cdot \right) \in C^{1}_{\left[ t_{0},t_{f}\right] }:&\ s\left( t_{0}\right) =s_{0},\ \ s\left( t_{f}\right) =s_{f}, \nonumber \\&\ \dot{s}\left( t_{0}\right) =\dot{s}_{0},\ \ \dot{s}\left( t_{f}\right) =\dot{s}_{f} \Big \} \end{aligned}$$
(2)

Other essential properties for a flight trajectory (e.g., \(\mathrm {\Delta }\)-compatibility, \(\mathrm {\Delta }\)-admissibility, \(\mathrm {\Sigma }\)-admissibility, controllability, etc.) are thoroughly described in Mattei and Blasi (2010).

The cost of every admissible flight trajectory \(\theta _{E_{0},E_{f}}\) is a real nonnegative function \(C\left( \cdot \right)\), which may be related to the path length or height, to the fuel consumption, to the risk or to other variables inherent to the flight mission.

The search of the optimal flight trajectory connecting \(E_{0}\) to \(E_{f}\) consists in finding a flight trajectory \(\theta ^{*}_{E_{0},E_{f}}\) that is compliant with the desired properties and is a solution of the minimization problem

$$\begin{aligned} \theta ^{*}_{E_{0},E_{f}}=\underset{\theta _{E_{0},E_{f}}\in \mathrm {\Theta }^{\mathrm {\Delta },\mathrm {\Sigma }}}{{\text {argmin}}} \ C\left( \theta _{E_{0},E_{f}}\right) \; \end{aligned}$$
(3)

The set of all the \(\mathrm {\Sigma }\)-admissible optimal flight trajectories connecting every \(\mathrm {\Delta }\)-compatible \(E\) with each other is called the optimal \(\mathrm {\Delta }\)-connection set.

The problem (3) and the search of the optimal \(\mathrm {\Delta }\)-connection set are complex tasks because the searching space \({\mathrm {\Theta }}^{\mathrm {\Delta },{\mathrm {\Sigma }}}_{E_{0},E_{f}}\) is infinite dimensional, the cost function can be nonlinear and non-convex and the constraints are generally nonlinear and non-convex.

Some assumptions and approximations are introduced in order to convert the optimization problem (3) into a minimum cost path search over a graph. A \(\mathrm {\Delta }\)-core paths graph (\(\mathrm {\Delta }\)CPG) is a discrete approximation of the optimal \(\mathrm {\Delta }\)-connection set, wherein the nodes are suitable \(\mathrm {\Delta }\)-compatible \(E\) vectors and the arcs are \(\mathrm {\Sigma }\)-admissible optimal flight trajectories connecting the nodes. We denote with \({\mathrm {N}}_{CPG}\) the set of CPG nodes and with \({\mathrm {\Upsilon }}_{CPG}\) the set of CPG arcs. The weight \(W_{i,j}\) of the arc connecting the i-th node with the j-th node represents the cost of the optimal flight trajectory from \(E_{i}\) to \(E_{j}\).

The authors in Mattei and Blasi (2010) prove that the infinite dimensional problem (3) to calculate the CPG weights \(W_{i,j}\) can be converted into the following finite dimensional problem

$$\begin{aligned} W_{i,j}=\min _{\delta \in \ {\mathrm {\Omega }}} \left( \theta ^{f}_{E_{i},E_{j}}\left( \delta \right) \right) , \ \ \ \forall E_{i},E_{j}\in {\mathrm {N}}_{CPG}\; \end{aligned}$$
(4)

wherein \(\mathrm {\Omega }\) is the set of parameters \(\delta\) ensuring \(\mathrm {\Sigma }\)-admissible flight trajectories.

Once a CPG has been built and the weights \(W_{i,j}\) have been calculated, the optimal flight trajectory between two nodes of the graph can be determined adopting a search algorithm for a minimum cost path in a graph. In Mattei and Blasi (2010) polynomial functions are used for the flight trajectory parameterization. The trajectory \(\theta ^{f}_{E_{i},E_{j}}\left( \delta \right)\) can be expressed as

$$\begin{aligned} \theta ^{f}_{E_{i},E_{j}}\left( \delta \right)&\triangleq \left\{ y\left( x\right) =\delta _{1}x^{p-1}+\cdots +\delta _{p-1}x+\delta _{p}: \right. \nonumber \\&\qquad \left. x\in \left[ x_{0},x_{f}\right] , \ \delta \in {\mathrm {\Omega }}\subseteq \mathbb {R}^{p},\right. \nonumber \\&\qquad \left. y\left( x_{0}\right) =x_{0}, \ y\left( x_{f}\right) =y_{f},\right. \nonumber \\&\qquad \left. \dot{y}\left( x_{0}\right) =\dot{y}_{0}, \ \dot{y}\left( x_{f}\right) =\dot{y}_{f} \right\} \end{aligned}$$
(5)

The equalities constraints in (5) and the obstacle avoidance requirements can be converted in a matrix notation, as described in Mattei and Blasi (2010). Moreover, if we assume that the cost of a trajectory is proportional to its length, the cost function can be represented as a quadratic function of \(\delta\) and the optimization problem (4) can be converted into

$$\begin{aligned} W_{i,j}=&\min _{\delta \in {\mathrm {\Omega }}\subseteq \mathbb {R}^{p}} \delta ^{T}Q\delta \nonumber \\& {\mathrm {s.t.}} \nonumber \\&A_{I}\delta \le b_{I} \nonumber \\&A_{E}\delta \le b_{E} \end{aligned}$$
(6)

The problem (6) is a convex quadratic programming problem and can be efficiently solved using quadratic programming algorithms. The matrices \(Q\), \(A_{I}\), \(A_{E}\) and the vectors \(b_{I}\), \(b_{E}\) are accurately described in Mattei and Blasi (2010).

4 CPG algorithm

The CPG computation includes first an offline processing phase, which entails the setting of the CPG topology and the evaluation of the weights of the arcs. The topology is built by choosing the set \({\mathrm {N}}_{CPG}\) of CPG nodes. A criterion can be a uniform grid of points and directions with increased density in the proximity of no-fly zones and obstacles. Once a cost function \(C\left( \cdot \right)\) has been selected, the weights \(W_{i,j}\) of the arcs can be determined by solving the problem (6) if we adopt polynomial trajectories.

The verification of the desired properties for \(W_{i,j}\) is another important step. This verification is carried out into the following three different phases:

  1. 1.

    evaluation of the properties that can be directly verified on the relations between \(E_{i}\) and \(E_{j}\) (such as distance limitations among the extreme points \(s_{i}\) and \(s_{j}\)): if these properties are not satisfied, there is not an arc between \(E_{i}\) and \(E_{j}\);

  2. 2.

    evaluation of the solvability of the problem (6): if this has no admissible solutions for \(\left( E_{i},E_{j}\right)\), then there is not an arc between \(E_{i}\) and \(E_{j}\);

  3. 3.

    evaluation of a posteriori checks on the optimal trajectory between \(E_{i}\) and \(E_{j}\): if the solution of the problem (6) does not respect other \(\mathrm {\Sigma }\)-properties (not verified at point 1, such as obstacles avoidance), then the optimal trajectory for \(\left( E_{i},E_{j}\right)\) is not feasible and there is not an arc between \(E_{i}\) and \(E_{j}\).

The pseudo-code for the CPG algorithm is composed by Algorithm 1 and Algorithm 2. The parametric structure scenario contains the relevant environment data (i.e., the space domain, no-fly zones, obstacles, etc.), \({\mathrm {N}}_{CPG_{points}}\) and \({\mathrm {N}}_{CPG_{directions}}\) are respectively the set of points and the set of directions of the CPG nodes and \(A_{CPG}\) is the adjacency matrix of the CPG. The check functions perform the properties verification.

figure a
figure b

The lead time of the CPG algorithm depends on the number of nodes \(\left| {\mathrm {N}}_{CPG}\right| =\sum ^{\left| {\mathrm {N}}_{CPG_{points}}\right| }_{i=0}\left| {\mathrm {N}}_{CPG_{directions}}\right| _{i}\), wherein \(\left| \mathrm {N}_{CPG_{directions}}\right| _{i}\) is the number of directions for the \(i\)-th point.

A theoretical estimation of the lead time should take into account the computational complexity of the CPG_arcs_computing function. If we suppose that the number of directions is the same for every point, the computational complexity is

$$\begin{aligned} O\left( \left| {\mathrm {N}}_{CPG_{directions}}\right| ^{2} \cdot \left| {\mathrm {N}}_{CPG_{points}}\right| \cdot \left( \left| {\mathrm {N}}_{CPG_{points}}\right| -1\right) \cdot T_{QP}\right) \end{aligned}$$
(7)

wherein \(T_{QP}\) is the computational complexity of the quadratic programming solver. For instance, primal-dual interior point methods for convex quadratic programming usually exhibit polynomial-time complexity (Monteiro et al. 1990).

The online adaptation of the CPG algorithm solicits an enlargement of the CPG against any change in the scenario. The enlargement is in way of including the vertexes that are related to new no-fly zones or obstacles and the current vehicle position. As a consequence of the update of the CPG topology, CPG_arcs_computing function is performed in online fashion starting from the new nodes.

5 Parallel CPG algorithm

The CPG algorithm is computationally heavy due to the size of the graph in a real scenario and some expedients are convenient in order to make the proposed approach less time-consuming for an online adaptation of the algorithm. A proper choice of the discrete set of CPG nodes, of the \(\mathrm {\Sigma }\)-properties and of the cost function can already reduce the required time for the CPG construction.

Anyway, the algorithm can benefit from the use of multi-core and/or multi-processor platforms since the CPG construction can be formulated as a parallel process. It does not present loop-carried dependencies, namely one iteration of the loop does not depend upon the results of other iterations of the loop and the loop can be executed in any order, even in parallel. There exists a strict one-to-one relation between the iteration order \(\left( i,j,k,l\right)\) and the evaluated arc, therefore a CPG parallel process does not imply any potential race condition. Indeed, the arc weights computations do not depend on the sequence or timing of the parallel threads.

An OpenMP implementation of a parallel CPG algorithm is addressed here. OpenMP is an application program interface (API), jointly defined by a group of major computer hardware and vendors (OpenMP 2013). It provides a portable and scalable model for developers of shared memory and multi-threaded parallel applications. The API supports C/C++ and Fortran on several architectures. The adopted programming language for the CPG algorithm is C, wherein the OpenMP consists of a set of compiler pragmas that control how the program parallelism works.

The application begins with a single thread, that is the master thread. As the program executes, the application may encounter parallel regions in which the master thread creates a thread team. Each thread member of the team executes within the parallel region. The end of the parallel region marks implicitly a barrier synchronization for all the threads, that will wait here until the last thread completes. Afterwards, the thread team is removed and the master thread continues the sequential execution.

The OpenMP for directive splits a for-loop into a parallel region so that every thread gets different sections of the loop and it executes its own sections in parallel. In the case of the CPG algorithm, one or more of the for-loops can be parallelized amongst a team of threads. Each thread can handle some iterations of the loop, i.e., some arcs of the graph. As a consequence of the one-to-one relations between iterations and arcs, the CPG data structure does not have to be protected from concurrent accesses and the overhead due to OpenMP parallelism management is minimum.

Here we report the parallel implementation of the second-level for-loop of the CPG algorithm by way of example. This loop scans all the directions starting from the current CPG point and proceeds with the feasibility checks and the arcs processing by executing the lower-levels for-loops.

The OOQP (Object-Oriented software for quadratic programming) package (Gertz and Wright 2001) has been used as regards the resolution of the problem (6). OOQP is a C++ package for solving convex quadratic programming problems. It employs object-oriented programming techniques for a primal-dual interior point algorithm. It contains code that can be used out of the box to solve a variety of quadratic programming problems. It can be used as a standalone off-the-shelf solver, as an embeddable solver (i.e., as an integrated code in a custom application) or as a development framework.

Within this work, we have used OOQP as an embedded solver into a C application for CPG algorithm. In the following, we briefly describe the test scenario and the test results.

5.1 Test scenario and test results

The considered test scenario is a two-dimensional example with two circular no-fly zones in a rectangular space region \(\mathrm {\Delta }\) (Fig. 1), so we suppose that the aerial vehicle moves only in a plane (i.e., at a constant height). We also assume that the CPG nodes are the combination of the 115 points marked with circles in Fig. 1 and of a discrete set of 36 equally spaced directions identified with the star of arrows centred in the point \(\left( -1,3\right)\) in such a figure.

This choice produces a set of \(4140\) nodes and a number \({\mathrm {\Upsilon }}_{CPG}\) of arcs that is equal to \(\left| {\mathrm {N}}_{CPG}\right| \cdot \left( {\mathrm {N}}_{CPG}-1\right)\) in the case of a complete graph without auto-connections. Nevertheless, some nodes are not \(\mathrm {\Delta }\)-compatible and further reduction of the arcs is obtained by applying the following \(\mathrm {\Sigma }\) properties: a distance limitation of 5 km among the connected points and a difference limitation of 70° among the extreme directions. Finally, we consider 8th-order polynomial trajectories to connect nodes, namely p = 9. Constraints on the radius of curvature of the vehicle have not been considered.

Figure 1 shows two sample trajectories through respectively a green and a blue line. They both start from the point (−1, −1) with a direction of −40° and arrive at the point (4, 0) (the former with a direction of −40°, the latter with a direction of 30°).

Fig. 1
figure 1

Two-dimensional test scenario

A general purpose workstation with Intel Core i5- 2520 M has been used as testbed. It is a processor with a clock speed of 2.50 GHz and two physical cores. It supports hyper-threading technology, so every physical core consists of two logical cores, which share the execution resources, but have their own architectural state. From the operating system’s point of view, logical cores are handled like physical cores, therefore the Intel Core i5-2520 M processor exhibits 4 cores. The authors in Russ and Stütz (2012) and in Meier et al. (2011) point out some UAVs applications which have a on-board computing power that is comparable to our test platform’s computing power.

Table 1 reports the runtime performances for the proposed parallel CPG implementation. In detail, we have measured the wall-clock times of the first 5 iterations of the parallelized for-loop (the second-level for-loop) of the CPG algorithm. Measurements have been accomplished by means of omp_get_wtime function. Each measure in Table 1 is the average of 10 runs. A single optimal trajectory is computed in about 0.3 s. Best results have been obtained with 8 threads.

Table 1 Measures in seconds of the wall-clock time of the parallelized for-loop iteration

Figures 2 and 3 respectively show the speedup and the efficiency of the parallel CPG implementation. Speedup and efficiency are defined by the following formulas

$$\begin{aligned} Speedup = \frac{T_{sequential}}{T_{parallel}} \end{aligned}$$
(8)
$$\begin{aligned} Efficiency = \frac{Speedup}{N_{cores}} \end{aligned}$$
(9)

wherein \(T_{sequential}\) is the wall-clock time of the sequential for-loop iteration, \(T_{parallel}\) is the wall-clock time of the parallelized for-loop iteration, \(N_{threads}\) is the number of threads in the team and \(N_{cores}\) is the number of (logical) processor cores (4 for the Intel Core i5-2520M). \(T_{sequential}\) and \(T_{parallel}\) have been evaluated by summing the measured wall-clock times of each iteration fixed the thread number.

The speedup and the efficiency (Figs. 2, 3) initially grow because the number of threads is lower than the number of cores. They settle in the region in which \(N_{threads}\) is slightly larger than \(N_{cores}\) because all the available resources have been employed for the parallelization and the overhead effects are significant.

Fig. 2
figure 2

Average speedup of the parallel CPG implementation

Fig. 3
figure 3

Average efficiency of the parallel CPG implementation

6 Distributed CPG algorithm

A fleet of collaborative UAVs can be arranged as a distributed system, that is a system in which the individual UAVs are located on a network and coordinate their actions by passing messages. In this case, the problem (6) can be distributed amongst the team of vehicles. Indeed, a distributed CPG algorithm would reduce the global processing time by taking advantage of the computing powers of the single vehicle platforms. Moreover, a distributed approach may be compliant with the key requirements of robustness, scalability and flexibility for the high-level control of a fleet of UAVs. For example, a centralized planner represents a single point of failure and its complexity quickly grows with the problem dimension, accordingly to the expression (7). For this reason, we focus on the use of a distributed approach in which the individual vehicles can work on a partition of the CPG.

However, an effective protocol for the fleet allocation shall be set up and the coordination may require the vehicles to exchange large quantities of information about the environment and the optimal trajectories. Constraining the communication may be unavoidable, although it potentially limits the situational awareness of the single UAVs and their cooperative capacities. Hence, a compromise between a global situational awareness and acceptable performances shall be achieved.

Here we assume that the fleet is homogeneous, so the problem (6) has the same formulation and the same parameters for all the UAVs members. The pseudo-code for the distributed strategy is composed by Algorithm 2 and Algorithm 3.

In Algorithm 3, \(N_{v}\) is the number of vehicles in the team, \({\mathrm {N}}^{\left( i\right) }_{CPG_{points}}\) is the set of CPG points that are assigned to the i-th vehicle and \(A^{\left( i\right) }_{CPG}\) is the adjacency matrix that is processed by the i-th vehicle. This strategy distributes the processing of a starting point and then merges the single adjacency matrices into \(A_{CPG}\).

The robustness of our solution in real scenarios would be affected by an UAV failure if it occurs during the CPG processing. In this case, the starting points that were assigned to the failed UAV shall be redistributed to the remaining alive UAVs.

figure c

Two policies have been considered for the allocation of CPG points (Fig. 4): a geographic criterion and a fair criterion. The former assigns a CPG point to the closest UAV, which should be the most engaged vehicle by the processing of such a starting point. The latter randomly assigns CPG points in equal parts in order to supply a uniform distribution of the processing workload.

Fig. 4
figure 4

Geographic criterion (left) and fair criterion (right) for the allocation of CPG points amongst the team of vehicles \(\left( V_1,V_2,V_3,V_4 \right)\)

Besides, the distributed strategy has been simulated on a single platform by means of a variation of the parallel CPG algorithm. In more detail, a parallel OpenMP implementation of the first-level for-loop of CPG_arcs_computing function has been used in order to reproduce the distribution amongst the vehicles. Indeed, this loop scans all CPG points and processes the optimal flight trajectories starting from the current CPG point. Hence, a parallelization of the first-order for-loop of CPG_arcs_computing function is equivalent to the distribution of CPG points processing to some degree. The outcome is a double-level parallel CPG process.

Table 2 reports the test results with four vehicles. The first row concerns the sequential criterion. The test scenario and the testbed are the same of Sect. 5.1. The vehicles are located at the coordinates (−1, −1), (−1, 3), (3, −1) and (3, 3). The measurement of the global wall-clock time goes from the setting of the topology to the merging of adjacency matrices. Table 3 shows the speedup and the efficiency of the proposed implementation of the distributed CPG algorithm.

The results in Table 2 highlight non-uniform lead times of the vehicles because their workload are imbalanced, even for the fair criterion. More balanced criteria are difficult to arrange since the CPG algorithm is irregular: its distribution of work and data cannot be characterized a priori because these quantities are input-dependent (e.g., they are related to the feasibility of the arcs). Thus, the irregularity of the CPG algorithm inhibits an equal distribution of work over vehicles.

Table 2 Measures in seconds of the wall-clock time of the distributed CPG implementation
Table 3 Speedup and efficiency of the distributed CPG implementation

7 Conclusion

Parallel and distributed extensions of a CPG algorithm have been proposed in order to improve the lead time for an online CPG application. The parallelization, even by high level primitives such as OpenMP and over off-the-shelf hardware, shows improving performance figures. However, absolute turnaround does not allow for real time exploitation in dynamically changing scenarios yet, at least without dedicated hardware like massive parallel architecture.

For this reason, future work aims at exploring the effectiveness of such kind of test-beds like GPU. On the other hand, the feasibility of a heuristic approach which could provide a subset of solutions, but in real time, will be considered. The design of a protocol for an effective CPG implementation on a fleet of UAVs is an additional objective to take into account the communication overhead in a distributed environment.