1 Introduction

Compared with a single unmanned aerial vehicle (UAV), a multi-UAV (Multi-UAVs) system offers higher speed, reliability and flexibility through cooperation and coordination [1]. Many combat tasks, such as Wide Area Search, Suppression of Enemy Air Defense (SEAD), and Intelligence, Surveillance and Reconnaissance (ISR), must be executed cooperatively by multiple UAVs in complex battlefield situations [2, 3]. Thus, Multi-UAVs systems have gradually replaced single UAVs in performing a variety of complex tasks. However, the cooperation of multiple UAVs increases the problem scale, so the computational complexity grows exponentially. Moreover, cooperation is affected by other issues, such as environmental constraints, equipment degradation, resource competition, cooperative constraints and conflicts among UAVs [4–9]. Therefore, searching for the optimal solution for Multi-UAVs cooperation at minimum cost is a challenge.

The problem of Multi-UAVs cooperative multi-target assignment (MUCMTA) plays an important role in the task planning system. The key is to properly assign a set of UAVs to multiple targets so that the total cost is minimized or the total benefit is maximized [6]. In particular, the target assignment problem is a well-known discrete combinatorial optimization problem with multiple constraints and multiple models. It is NP-hard, so finding an optimal solution is difficult [7], especially in a three-dimensional (3D) environment.

The MUCMTA problem has been tackled with classical optimization methods, including mathematical programming and distributed approaches. Mathematical programming approaches include the Hungarian algorithm [8–10], mixed integer linear programming [11], and the Dijkstra algorithm [12]. Assignments based on these techniques work well with simple models in a two-dimensional (2D) scenario, but they are not suitable for the complicated MUCMTA in 3D environments. The distributed approaches include negotiation schemes based on the contract net [13] and market mechanisms [14–19]. These distributed methods balance conflicts through consultation and information sharing, so they can solve the multi-target assignment problem even at larger scales. However, they emphasize conflict avoidance without taking into account the practical cost between UAVs and targets, so they cannot provide an accurate assignment result.

Inspired by the good performance of artificial intelligence, intelligent algorithms have been widely used in the field of MUCMTA. For example, the genetic algorithm (GA) has been extended to seek the optimal assignment [20–23], because it mimics the process of natural selection and is not restricted by continuity, differentiability, or unimodality of the search domain. However, GA often converges prematurely and easily falls into local optima. The particle swarm optimization (PSO) algorithm is another commonly used intelligent method, and it has been improved in [24–28] for solving the assignment problem. This approach finds the best solution by particles circling around the search space, but it may fail to reach the global optimum. Moreover, ant colony optimization (ACO) and bee colony methods have also been introduced [29–33] and perform well on MUCMTA. However, their performance degrades quickly as the situation becomes more complicated, especially in 3D, so they are hardly suitable for large-scale assignment problems.

MUCMTA research in a 3D environment has more practical applications than in 2D [34]. However, it is difficult to establish an effective representation using a graph structure in a 3D environment because the search space is much larger. On the other hand, the UAVs must fly through complex uneven terrain, avoid various threat zones and no-fly zones, and satisfy both their own and cooperative constraints. The less the 3D environment is taken into account, the greater the damage probability the UAV suffers, which may lead to task failure. Therefore, the realistic task cost should be estimated from the flight distance, execution time and other constraints, rather than from the straight-line distance between the start location and the target position or the shortest path on a graph. Task assignment in 3D environments has also been studied [34, 35]. A self-organizing map (SOM) neural network has been used for task assignment and path planning of multiple autonomous underwater vehicles (AUVs) in a 3D workspace [34]. That approach combines task assignment with path planning, and an AUV is similar to a UAV in a 3D environment.

The differential evolution (DE) algorithm is a stochastic, parallel, population-based search method that uses the distance and direction differences between individuals to guide the search process [36–39]. Operating on continuous parameters, it has strong search capability and fast convergence. Improved DE variants have proved to be quite appropriate for combinatorial optimization problems with discrete parameters, and they are effective and competitive with other approaches in this domain [39]. In our previous work [40], we used a DE algorithm with partially mapped crossover (PMX) to deal with a unified assignment model, and it provided good simulation results. However, because that approach performs differential operations only on sequences of targets and rounds the differential results directly to integers, the physical properties of the problem are ignored and the rationality of the solution is reduced.

In this paper, we further improve the DE algorithm to solve the MUCMTA problem more elaborately. Although DE performs well on combinatorial optimization problems, several major challenges remain for practical problems. They are summarized as follows:

  1.

    How do we design a unified gene code that indicates the task assignment? A population with unified coding can not only represent various assignment models, but can also be processed by a uniform algorithm framework in a 3D environment.

  2.

    How do we define reasonable mapping rules for DE with discrete parameters in MUCMTA? The rules should properly transform the problem from the discrete space to a continuous space using physical properties, so that the search direction can be guided correctly.

  3.

    How do we dynamically select a differential strategy during the evolutionary process? Since DE improves efficiency by balancing exploration and exploitation, the selection of the differential strategy is another challenge.

Therefore, this paper presents a novel discrete mapping differential evolution (DMDE) approach based on a unified gene-code strategy in 3D, which can handle various assignment models in a uniform algorithm framework. We improve the differential operation with discrete mapping and inverse mapping rules. The mapping of the flight path cost transforms the problem from discrete space to continuous space, so that a rational optimal result with physical meaning can be found easily. The inverse mapping with matching strategies transforms the differential result back into an assignment relationship to generate feasible DE offspring. Furthermore, a hybrid differential strategy with a dynamic crossover rate is used to improve the efficiency of the DMDE algorithm. Finally, we replace the constraint of arriving at the same time with an asynchronous initial waiting time, which further enhances the cooperative capacity of the UAVs.

The rest of this paper is organized as follows. Section 2 describes the unified algorithm framework for MUCMTA. The unified encoding strategy based on the cost of the flight path is given in Sect. 3. In Sect. 4 we present the differential mapping rules for discrete parameters. Simulation results are shown in Sect. 5 before the conclusion is given in Sect. 6.

2 The unified algorithm framework

2.1 Unified assignment model

We have previously worked on the unified assignment model for the MUCMTA problem in a 3D environment and estimated the flight path cost using the vertical section between a UAV and a target [40]. The assignment situations were summarized as three basic cases, N > M, N < M and N = M, classified by the numbers of UAVs and targets.

Based on this previous research, the MUCMTA scenario is described as a set of UAVs \(N=\{U_1,U_2,\ldots ,U_n\}\) attacking a set of targets \(M=\{T_1,T_2,\ldots ,T_m\}\) in a complex 3D environment. The objective is to find an assignment sequence with the optimal total flight path cost and the shortest flying time for all UAVs, while all UAVs satisfy a variety of constraints and avoid radar threats by cooperating with each other. The typical scenario and simulation environment are shown in Fig. 1.

Fig. 1 The scenario of MUCMTA in a 3D environment

The objective function of the unified assignment model can be formulated as follows:

$$\begin{aligned} {\min (f(x))}=\sum _{i=1}^{n} \sum _{j=1}^{m}w_j d_{(i,j)} x_{(i,j)} + \alpha \left( \max \left( \sum _{i=1}^{n} \sum _{j=1}^{m} t_{(i,j)} x_{(i,j)}\right) \right) +\beta \sum _{k=1}^{K}c_k \end{aligned}$$
(1)

where \(d_{(i,j)}\) is the flight path distance from UAV \(i\) to target \(j\), \(t_{(i,j)}\) is the flying time of UAV \(i\) to target \(j\), \(c_k\) is the penalty corresponding to the \(k\)-th constraint, and \(w_j\) is the weight of target \(j\) with \(0<w_j\le 1\). \(\alpha\) and \(\beta\) are scale factors that balance the terms of the objective function and can be set empirically. \(x_{(i,j)}\) is the decision variable, which determines the correspondence between a UAV and a target. The forms of \(x_{(i,j)}\) for the different assignment cases are as follows.

$$\begin{aligned} x_{(i,j)} = {\left\{ \begin{array}{ll} x_{(i,j)} &{} N= M\\ x_{(i^\prime ,j)} , i^\prime \in [i_s,\ldots ,i_l], s,l \in N &{}N>M\\ x_{(i,j^\prime )} , j^\prime \in [j_p,\ldots ,j_q], p,q \in M &{}N<M \end{array}\right. } \end{aligned}$$
(2)

In (2), when N > M, \(x_{(i^\prime ,j)}\) is the decision variable between several UAVs and one target, which means several UAVs can attack the same target. When N < M, \(x_{(i,j^\prime )}\) is the decision variable between one UAV and several targets, which means a single UAV can attack a group of targets.
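For illustration, the following minimal Python sketch evaluates Eq. (1) for a binary assignment matrix; the toy cost values, the default scale factors and the function name are illustrative assumptions rather than part of the model.

```python
import numpy as np

def objective(x, d, t, w, penalties, alpha=2.5, beta=1.5):
    """Evaluate Eq. (1) for a binary assignment matrix x (n UAVs x m targets).

    d[i, j]   -- flight-path cost from UAV i to target j
    t[i, j]   -- flying time of UAV i to target j
    w[j]      -- weight of target j, 0 < w[j] <= 1
    penalties -- iterable of constraint penalties c_k
    """
    weighted_path_cost = np.sum(w * d * x)          # sum_i sum_j w_j * d_ij * x_ij
    mission_time = np.max(np.sum(t * x, axis=1))    # the slowest UAV decides the mission time
    return weighted_path_cost + alpha * mission_time + beta * sum(penalties)

# toy example: 3 UAVs, 3 targets, one-to-one assignment (N = M)
d = np.array([[4., 9., 7.], [8., 3., 6.], [5., 7., 2.]])
t = d / 2.0                                         # pretend a constant speed of 2
w = np.array([1.0, 0.8, 0.6])
x = np.eye(3)                                       # UAV i attacks target i
print(objective(x, d, t, w, penalties=[0.0]))
```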

According to the discussion in the previous research [40], a variety of constraints can be described as follows.

$$\begin{aligned} \sum _{i=1}^{n}X_{ij}=1, \forall j=1,\ldots , m, \quad n\ge m \end{aligned}$$
(3)
$$\begin{aligned} \sum _{j=1}^{m}X_{ij}=1, \forall i=1,\ldots , n,\quad n \le m \end{aligned}$$
(4)
$$\begin{aligned} {\left\{ \begin{array}{ll} \tau = T(j)_{min}-T(i)_{max} \\ t_j \ge t_i + \tau \quad i,j \in {1,\ldots , m} \end{array}\right. } \end{aligned}$$
(5)
$$\begin{aligned} {\left\{ \begin{array}{ll} [T_{k-1},T_k]\ne \varPhi \\ \max {\{t_i, \ldots , t_j\}} \le T_k \quad i,j \in {1, \ldots , m}\\ \min {\{t_i, \ldots , t_j\}} \ge T_{k-1} \quad i,j \in {1, \ldots , m}\\ \end{array}\right. } \end{aligned}$$
(6)
$$\begin{aligned} \sum _{i=1}^{n}\sum _{j=1}^{m} d_{(i,j)}x_{(i,j)}\le \sum _{k=1}^{n} D_k \quad \forall k= 1, \ldots , n \end{aligned}$$
(7)
$$\begin{aligned} \max {\left\{ \sum _{i=1}^{n}\sum _{j=1}^{m}t_{(i,j)}x_{(i,j)} \right\} }\le T_k \quad \forall k=1, \ldots , n \end{aligned}$$
(8)
$$\begin{aligned} \bigcap _{i=1}^{K}\left( [t(i)_{min},t(i)_{max}]\right) \ne \varPhi \quad K \in {1, \ldots , n} \end{aligned}$$
(9)

where (3) and (4) indicate the relationship between UAVs and targets; (5) represents the execution sequence of targets, where \(\tau\) is the minimum time interval; (6) is the time window constraint; (7) is the maximum flight path constraint; (8) is the maximum flying time constraint; and (9) is the constraint of arriving at the same time.

Moreover, in order to ensure that the UAVs arrive simultaneously, we introduce the vector of maximum initial waiting times \(T_{wait}\), which allows some UAVs to delay their takeoff within a certain time period. It can be described as:

$$\begin{aligned} T_{wait}=[t_{wait}(1),t_{wait}(2),\ldots ,t_{wait}(k)],\quad k\in \lbrace 1,\ldots ,n\rbrace \end{aligned}$$
(10)

The penalty of the simultaneous-arrival constraint can then be reformulated as:

$$\begin{aligned} C_{constrain} = {\left\{ \begin{array}{ll} 0 \quad \bigcap \limits _{i=1}^{L}([T_{min}(i),(T_{max}(i)+T_{wait}(i))])\ne \varPhi , &{} L \in N \\ K_{p}\quad \bigcap \limits _{i=1}^{L}([T_{min}(i),(T_{max}(i)+T_{wait}(i))])= \varPhi , &{} L \in N \\ \end{array}\right. } \end{aligned}$$
(11)

where \(T_{min}(i)\) is the flying time of UAV \(i\) at its highest speed, \(T_{max}(i)\) is its flying time at its lowest speed, and \(K_p\) is the penalty term applied when the arrival intervals of the cooperating UAVs have no intersection.
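A minimal sketch of this penalty check, assuming the arrival interval of UAV \(i\) is \([T_{min}(i), T_{max}(i)+T_{wait}(i)]\) as in Eq. (11); the function name and the numbers are illustrative.

```python
def arrival_penalty(t_min, t_max, t_wait, K_p=1000.0):
    """Penalty for the simultaneous-arrival constraint, Eq. (11).

    Each UAV i can reach its target within [t_min[i], t_max[i] + t_wait[i]],
    where t_wait[i] is its maximum initial waiting time.  If all intervals
    share a common point the team can arrive together and the penalty is 0,
    otherwise the penalty term K_p is returned.
    """
    lower = max(t_min)                                     # latest earliest arrival
    upper = min(tm + tw for tm, tw in zip(t_max, t_wait))  # earliest latest arrival
    return 0.0 if lower <= upper else K_p

# three cooperating UAVs: the waiting time of UAV 1 makes a common window possible
print(arrival_penalty(t_min=[10, 14, 12], t_max=[13, 20, 18], t_wait=[5, 0, 0]))  # 0.0
print(arrival_penalty(t_min=[10, 25, 12], t_max=[13, 30, 18], t_wait=[2, 0, 0]))  # 1000.0
```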

The objective function indicates the direction of the optimal search and measures the merit of evolving individuals. Therefore, we use the objective function as the fitness function of DE. Individuals with high fitness have more chances to survive and reproduce during the evolutionary process, so the fitness function drives the search toward the optimal solution.

2.2 The DE algorithm framework

The DE algorithm, developed by Storn and Price in 1995, is similar to other evolutionary algorithms in its use of population transfer, crossover and mutation [41–44]. However, it differs significantly from other evolutionary methods by using the distance and direction difference vectors between individuals to guide the search, which enables fast and efficient search in continuous space. The main performance advantage of DE lies in its differential strategies [36, 37, 42]. The general strategies are denoted DE/\(x\)(base)/\(y\)(number)/\(z\)(cross), where \(x\) is the base of the differential perturbation, \(y\) is the number of difference vectors involved, and \(z\) is the crossover type. The crossover \(z\) is usually chosen between binomial (bin) and exponential (exp): binomial crossover selects crossover points randomly from the set of possible points, while exponential crossover selects a sequence of adjacent crossover points in one loop. The study in [37] states that exponential and binomial crossover yield similar results for discrete optimization problems, so we select the commonly used binomial crossover in our research. The DE strategies most frequently used in the literature are shown in Table 1.

Table 1 Common formulas of DE strategies
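For reference, the sketch below implements one generation of the classical DE/rand/1/bin strategy on a continuous population; it is a textbook baseline, not the discrete DMDE variant developed later in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def de_rand_1_bin(pop, fitness, F=0.5, CR=0.9):
    """One generation of classic DE/rand/1/bin on a real-valued population.

    pop     -- array of shape (NP, D)
    fitness -- callable mapping a D-vector to a scalar to be minimised
    """
    NP, D = pop.shape
    new_pop = pop.copy()
    for i in range(NP):
        r1, r2, r3 = rng.choice([k for k in range(NP) if k != i], size=3, replace=False)
        mutant = pop[r1] + F * (pop[r2] - pop[r3])          # differential mutation
        j_rand = rng.integers(D)                            # guarantee at least one gene crosses
        cross = rng.random(D) < CR
        cross[j_rand] = True
        trial = np.where(cross, mutant, pop[i])             # binomial crossover
        if fitness(trial) <= fitness(pop[i]):               # greedy selection
            new_pop[i] = trial
    return new_pop

# minimise the sphere function as a smoke test
pop = rng.uniform(-5, 5, size=(20, 4))
for _ in range(100):
    pop = de_rand_1_bin(pop, lambda v: float(np.sum(v ** 2)))
print(np.min([np.sum(v ** 2) for v in pop]))
```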

However, traditional DE only handles problems with continuous parameters. To use DE for a combinatorial optimization problem with discrete parameters, the discrete parameters must be translated into a continuous space; with such a translation, DE can solve the discrete problem as if it were continuous. Therefore, in order to avoid the defects of DE for MUCMTA, we developed a translation method with mapping and inverse mapping strategies. The proposed strategy not only takes advantage of the DE algorithm, but also reduces the differential error by incorporating the physical properties of MUCMTA.

In this paper, we first propose a unified gene encoding for the assignment models N = M, N > M and N < M, so that different models can be processed in a consistent DE framework. Then we convert the discrete parameters into a continuous space by mapping them through the flight path cost matrix, and compute the differential result in that space. Next, we map the differential result back to an assignment relationship by the inverse mapping rules. These translation steps produce reasonable DE offspring with practical properties. Finally, a cooperative fitness function with the various constraints keeps the evolution moving toward the optimum of MUCMTA. In addition, a hybrid differential strategy is used to enhance the efficiency of DMDE. The algorithm framework is shown in Fig. 2.

Fig. 2 Framework of the unified target assignment algorithm based on the improved discrete DE

3 The unified encoding strategies

3.1 Encoding strategies

The efficiency and complexity of a search algorithm rely heavily on the representation scheme [36]; therefore, the key to an evolutionary algorithm is an efficient representation. For MUCMTA, an evolutionary individual is usually represented by a permutation of the targets: each gene is the serial number of a target, and the length of the individual equals the total number of combat tasks. If the number of UAVs exceeds the number of targets, a target can be randomly assigned to multiple UAVs; if the number of UAVs is less than the number of targets, several targets form a sequence for a UAV to cruise. This approach represents MUCMTA easily, but it is difficult to handle the different models in a uniform framework. Furthermore, because the UAVs are assigned to targets randomly, the algorithm searches blindly for the optimal solution during evolution, which reduces its efficiency.

Therefore, we propose a uniform gene encoding for the various assignment models (N = M, N > M and N < M) based on the flight path costs between UAVs and targets. In [40], we showed that the cost matrices of the various models have a similar form, so the gene encoding can be unified by the assignment relationship among UAVs, targets and their flight path costs. The uniform gene encoding of DE can then be represented by a triple \((U_i, T_j, C(i,j))\) or \((T_j, T_{(j+1)}, T_c(j,(j+1)))\), where \(U_i\) denotes UAV \(i\), \(T_j\) denotes target \(j\), \(C(i,j)\) is the cost of UAV \(i\) flying to target \(j\), and \(T_c(j,(j+1))\) is the cost of flying from target \(j\) to the next target \((j+1)\) (target \(j\) is treated as a new starting point when a UAV cruises through a group of targets). Thus, the genes of an individual compose a feasible solution of MUCMTA. Figure 3 gives an example of the cost matrix and the individual encoding.

Fig. 3 Uniform code based on the cost matrix

In Fig. 3, \(\{U_1,\ldots ,U_6\}\) denotes the UAVs, \(\{T_1,\ldots ,T_6\}\) denotes the targets, \(C(i,j)\) is the flight path cost of UAV \(i\) to target \(j\), and \(T_c(i,j)\) is the flight path cost from target \(i\) to the next target \(j\). In the left part of Fig. 3, the area enclosed by the black solid lines shows the cost matrix for the case in which the number of UAVs equals the number of targets (N = M). The area enclosed by the blue dotted lines represents the cost matrix for the case with more UAVs than targets (N > M), and the area enclosed by the red lines indicates the cost matrix for the case with fewer UAVs than targets (N < M). The right part of Fig. 3 shows the gene encodings of the individuals for the three models.
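As a concrete illustration, the following minimal Python sketch shows one possible in-memory form of this triple encoding for the N = M case; the Gene type, the cost values and the example individual are illustrative assumptions rather than the implementation used in our experiments.

```python
from typing import NamedTuple, List

class Gene(NamedTuple):
    """One gene of the unified encoding: (U_i, T_j, C(i, j))."""
    uav: int        # index i of the UAV (or of the previous target when a UAV cruises on)
    target: int     # index j of the target
    cost: float     # flight-path cost C(i, j) stored inside the gene

Individual = List[Gene]

# hypothetical cost matrix C[i][j] for N = M = 3 and one feasible individual
C = [[4.0, 9.0, 7.0],
     [8.0, 3.0, 6.0],
     [5.0, 7.0, 2.0]]
individual: Individual = [Gene(0, 0, C[0][0]), Gene(1, 1, C[1][1]), Gene(2, 2, C[2][2])]
print(sum(g.cost for g in individual))   # total flight-path cost of this assignment: 9.0
```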

3.2 The rules of code generation

Fig. 4 The gene code sequence of individuals when N = M

The unified encodings of the various models have a similar form but different meanings, so the rules of code generation differ slightly. They are described as follows:

  1.

    When N = M, the relationship between UAVs and targets is one to one. Each gene is therefore a set of a UAV \(U_i\), a target \(T_j\) and their cost \(C(i,j)\), where \(C(i,j)\) is the flight path cost of \(U_i\) to \(T_j\). Because each target is performed by exactly one UAV and each UAV executes exactly one target, the code of an evolutionary individual contains no repeated \(U_i\) or \(T_j\). Figure 4 shows the gene code of such individuals.

  2.

    When N > M, the relationship between UAVs and targets is many to one. Each gene has the same meaning as in the N = M case. However, because several UAVs can attack the same target, no \(U_i\) may repeat, but \(T_j\) may repeat in the code of an evolutionary individual. Moreover, each target is performed by at least one UAV, so each \(T_j\) occurs at least once in an individual. This is shown in Fig. 5.

  3.

    When N < M, the relationship between UAVs and targets is one to many. A gene may be related to other genes because a UAV can cruise through a group of targets. The targets assigned to the same UAV are grouped into a swarm team, and the UAV executes all targets in the team sequentially. Therefore, UAV \(U_i\) corresponds to a group of targets \([T_j,\ldots ,T_k]\), and \(C(i,j)\) is the flight path cost of \(U_i\) to \(T_j\). The currently finished target \(T_j\) is then set as a new starting point for the UAV, and \(T_c(j,k)\) is the flight path cost from the new starting point \(T_j\) to the next target \(T_k\). The sequence \([T_j,\ldots ,T_k]\) follows the shortest-path principle, so the UAV can cruise through all targets in the team at minimum cost. In the code of an evolutionary individual, no \(T_j\) may repeat, but \(U_i\) may repeat; each \(U_i\) occurs at least once, because every UAV must be involved. The gene code of such an individual is shown in Fig. 6.

Fig. 5 The gene code sequence of individuals when N > M

Fig. 6 The gene code sequence of individuals when N < M

The above coding rules represent the evolutionary individuals of the three basic MUCMTA missions, and they unify the various assignment models so that they can be handled within one consistent algorithm framework. Meanwhile, the flight path cost is stored in each gene, so the distance information between a UAV and a target can be used to guide the search direction, which makes the algorithm more general and more reasonable.
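To make the N > M rules concrete, the sketch below generates one feasible individual that satisfies them (every UAV used exactly once, every target covered at least once, targets may repeat). It is only one simple construction under the stated assumptions, not the generation routine used in the paper.

```python
import random

def random_individual_n_gt_m(cost, seed=None):
    """Generate one feasible individual for the N > M model.

    cost[i][j] is the flight-path cost of UAV i to target j.  The rules of
    Sect. 3.2 require every UAV to appear exactly once and every target to
    appear at least once, while targets may repeat.
    """
    rng = random.Random(seed)
    n, m = len(cost), len(cost[0])
    uavs = list(range(n))
    rng.shuffle(uavs)
    genes = []
    # the first m shuffled UAVs cover each target once, the rest pick targets freely
    for k, i in enumerate(uavs):
        j = k if k < m else rng.randrange(m)
        genes.append((i, j, cost[i][j]))
    return genes

cost = [[4, 9], [8, 3], [5, 7], [6, 2]]          # 4 UAVs, 2 targets
print(random_individual_n_gt_m(cost, seed=1))
```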

4 Improved DE by discrete mapping rules

4.1 Discrete mapping rules

A feasible solution of MUCMTA is a discrete set of UAVs, targets and their costs, so the classical DE algorithm cannot solve this problem directly. If only the integer part of the floating-point differential result is used, the new individual may contain invalid duplicate, zero, negative or out-of-bounds values. Such an individual cannot be used in subsequent evolutionary computation, which inevitably leads to the failure of the search.

Approaches such as the partially matched crossover method [23, 40] or sorting the differential results [37] avoid these invalid individuals. However, they are essentially variants of random search that neglect the actual physical features of MUCMTA, so it is difficult for them to achieve optimal results.

To solve this problem, the improved algorithm uses the cost values stored in the gene code to map the discrete UAV-target relationship into a continuous space, where the differential operation is performed. The search result is then mapped from the continuous space back to the original discrete space by the inverse mapping strategies, yielding a new feasible individual. This space conversion with mapping and inverse mapping strategies generates feasible individuals and avoids the failures of the other approaches.

In the physical interpretation of the MUCMTA problem, the UAV-target relationship is linked to the flight path cost, and the main goal of the search is to find the assignment with minimum total cost. Therefore, DMDE computes the differential of individuals using the cost values corresponding to the UAVs and targets. Since an individual can be encoded as an \(n_d\)-dimensional bivariate vector, we define the discrete space as \(\{(U_i,T_j)\in Z^{n_d}\}\) and the continuous cost space as \(R_c^n\), so the mapping function for the conversion can be expressed as:

$$\begin{aligned} \varphi : \{(U_i,T_j)\in Z^{n_d}\}\rightarrow R_c^n,\quad \varphi \big ((U_i,T_j)\big )=C_{(i,j)} \end{aligned}$$
(12)

The cost matrix \(C_{cost}\) serves as the permutation matrix of the mapping function. As discussed above, the flight path cost is already stored in the unified gene encoding, so the cost values of the permutation matrix can easily be read from the genes of the individuals. According to the DE strategies, the algorithm then computes the temporary differential solution from the cost values as follows:

$$\begin{aligned} C_j^{(i)^\prime }(t)\!=\! {\left\{ \begin{array}{ll} C_j^{(r_1)}(t)\!+\!F\cdot (C_j^{(r_2)}(t)\!-\!C_j^{(r_3)}(t)), &{} rand[0,1]<CR\\ C_j^{(i)}(t), &{}Otherwise \end{array}\right. } \end{aligned}$$
(13)

In (13), \(F\) is the scale factor that controls the amplification of the differential variation, and \(CR\) is the crossover rate. \(C_j^{(i)^\prime }(t)\) is the temporary result given by the DE differential strategy, which represents the distance difference between parent individuals. It may be a reasonable value, or it may be an invalid duplicate, zero, negative or out-of-bounds value. Therefore, in order to obtain feasible offspring, the temporary result must be inverse-mapped to a UAV-target relationship through matching rules. The inverse mapping function can be defined as:

$$\begin{aligned} \varphi ^{-1}: R_c^n\rightarrow \{(U_i,T_j)\in Z^{n_d}\} \end{aligned}$$
(14)

The inverse mapping adopts three matching rules, which ensure that the temporary result is transformed into a new feasible offspring individual. The matching rules are described as follows:

  1.

    Rule 1 (Nearest neighbor matching): For each reasonable value in the differential temporary result, we find the nearest value \(C(i,j)\) in the cost matrix and make the corresponding \((U_i,T_j,C(i,j))\) a new gene of the offspring individual. The nearest value is the one with the minimum difference from the temporary value among all cost values in the matrix, as described below:

    $$\begin{aligned} C_{(i,j)}^\prime =C_{(i,j)}|\min \{(x_j^{(i)^\prime }(t)-C_{(i,j)}), 0<i<n,0<j<m\} \end{aligned}$$
    (15)
  2.

    Rule 2 (Unique matching): The matching should guarantee that each reasonable value corresponds uniquely to a UAV-target relationship, and prevent duplicate values from mapping to the same position. Therefore, strategies are needed for choosing the gene positions in the cost matrix, and the different assignment models require different strategies, as follows:

    a.

      When N = M, neither UAVs nor targets may repeat in a gene sequence. After the matched element of the cost matrix is found and added to the new offspring individual, we delete the row and the column of this element from the matrix, so that their cost values are no longer involved in subsequent matching.

    b.

      When N > M, the UAVs may not repeat, but the same target can be matched repeatedly by different UAVs. When an element of the cost matrix is matched and added to the new offspring individual, we delete only the row of this element, which prevents any UAV duplication.

    c.

      When N < M, a UAV may cruise through a group of different targets, and the targets may not repeat in a gene sequence. We therefore adopt an iterative replacement method. First, all UAVs are matched in the upper part of the cost matrix, so that each UAV matches at least one target. Then a row from the lower part of the matrix replaces the corresponding UAV row in the upper part: the replacement takes the currently visited target as the UAV's new starting point, so the UAV can fly to the next target from this new location. This replacement is executed iteratively, matching targets in the updated upper part of the cost matrix until all remaining targets are matched.

  3.

    Rule 3 (Invalid-value mutation matching): After the reasonable values of the temporary result have been matched by Rules 1 and 2, the invalid values and the unmatched part of the matrix remain. The invalid zero, negative and out-of-bounds values of the temporary result are matched by local random mutation: an element of the unmatched matrix is randomly assigned to each invalid value, and the unmatched matrix is updated according to Rule 2, until all invalid values are matched and the unmatched matrix is empty.

During inverse mapping with the matching rules, the operation of deleting rows or columns seriously reduces the algorithm's efficiency. Therefore, we reset those rows or columns with an arbitrarily large value \(Inf\) instead of deleting them, which improves the efficiency of the algorithm and ensures that the matching process runs smoothly. Figure 7 shows how the matching rules work, and a minimal sketch of the matching procedure is given after the figure.

Fig. 7 The schematic of the matching action
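The following Python sketch illustrates the spirit of Rules 1-3 for the N = M and N > M cases, using \(Inf\) masking in place of row or column deletion; the function name, the toy cost matrix and the simplified validity test are illustrative assumptions, and the iterative replacement for N < M is omitted.

```python
import numpy as np

def inverse_map(temp_costs, cost, model="N=M", seed=None):
    """Map a differential temporary result back to (UAV, target, cost) genes.

    Each reasonable value is matched to the nearest entry of the cost matrix
    (Rule 1); matched rows (and, for N = M, columns) are masked with Inf
    instead of being deleted (Rule 2); invalid values (non-positive, NaN, or
    out of bounds) fall back to a random still-unmatched entry (Rule 3).
    """
    cost = np.asarray(cost, dtype=float)
    work = cost.copy()
    c_max = float(cost.max())
    rng = np.random.default_rng(seed)
    offspring = []
    for v in temp_costs:
        if np.isfinite(v) and 0.0 < v <= c_max:
            i, j = np.unravel_index(np.argmin(np.abs(work - v)), work.shape)  # Rule 1
        else:
            free = np.argwhere(np.isfinite(work))                             # Rule 3
            i, j = free[rng.integers(len(free))]
        offspring.append((int(i), int(j), float(cost[i, j])))
        work[i, :] = np.inf                                                   # Rule 2: UAV i is used up
        if model == "N=M":
            work[:, j] = np.inf                                               # target j is used up too
    return offspring

cost = [[4.0, 9.0, 7.0], [8.0, 3.0, 6.0], [5.0, 7.0, 2.0]]
print(inverse_map([3.8, -1.0, 6.5], cost, seed=0))   # one valid, one invalid, one valid value
```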

4.2 DMDE with hybrid evolution strategies

When applying DMDE to the MUCMTA problem, a main issue is finding the trade-off between exploration and exploitation through the choice of parameters and differential strategies. Exploration identifies promising regions of the search space, while exploitation accelerates convergence toward a local optimum [44]. To improve the performance of DMDE, the exploration must be strong enough to cover more of the space, and the exploitation must be strong enough to refine the search. However, the two aspects interact: excessive exploration makes it difficult for the algorithm to converge to the optimal solution, while excessive exploitation causes it to fall into a local optimum. Consequently, only by keeping a balance between exploration and exploitation can the performance of the DMDE algorithm improve during the search.

A DE algorithm with a single strategy tends toward either better exploration (DE/rand/\(y\)/\(z\)) or faster, exploitation-driven convergence (DE/best/\(y\)/\(z\)), focusing on only one aspect of the optimization. The DE/rand-to-best/\(y\)/\(z\) strategy keeps a better balance between the two. However, it also limits the optimization capability, because the proportion of the random and best components remains fixed throughout the evolutionary process.

For these reasons, we adopt a hybrid strategy that dynamically mixes two evolutionary strategies. It uses a dynamic crossover rate to adjust the proportion of DE/rand/1/bin and DE/best/2/bin during the evolutionary process. In the early stages of evolution the algorithm is dominated by DE/rand/1/bin, which maintains population diversity; it then gradually shifts toward DE/best/2/bin, which performs a more refined search for the optimal solution. In this way the hybrid scheme dynamically trades off exploration against exploitation and effectively improves the efficiency of the DMDE algorithm. The dynamic crossover rate \(CR\) is defined as follows:

$$\begin{aligned} CR = 1 - \left( \dfrac{\log (x)}{\log (S)} \right) ^\theta , \quad x \in (1,2,\ldots ,S),\quad \theta \in N \end{aligned}$$
(16)

where \(x\) is the current generation, \(S\) is the total number of generations, and \(\theta\) is the exponent controlling the curvature. The curves of \(CR\) for different \(\theta\) are shown in Fig. 8.

Figure 8 shows that the value of \(\theta\) determines the shape of the curve, so an appropriate \(\theta\) can better balance exploration and exploitation. For example, when \(\theta = 3\), the dynamic crossover rate decreases quickly in the early part of the search, so the proportion of exploration by random search quickly approaches the proportion of exploitation by refined search. The crossover rate then changes more and more slowly and exploitation gradually outweighs exploration, which ensures that the refined search dominates at the end of the search process.

Fig. 8 Crossover rate curve
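A direct implementation of Eq. (16) is straightforward; the values below are only a small numerical illustration of how \(CR\) decays over the generations (the total generation count is an assumed example).

```python
import math

def dynamic_cr(x, S, theta=3):
    """Dynamic crossover rate CR of Eq. (16): x is the current generation, S the total."""
    return 1.0 - (math.log(x) / math.log(S)) ** theta

S = 200
print(dynamic_cr(2, S), dynamic_cr(S // 2, S), dynamic_cr(S, S))  # decays from ~1 to 0
```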

Then, the new individual generated according to the hybrid strategy and the dynamic crossover rate is as follows:

$$\begin{aligned} C_j^{(i)^\prime }(t)\!=\! {\left\{ \begin{array}{ll} C_j^{(r_1)}(t)\!+\!F\!\cdot \!(C_j^{(r_2)}(t)\!-\!C_j^{(r_3)}(t)), &{} rand[0,1]\!\leqslant \!CR\\ C_j^{(best)}(t)\!+\!F\!\cdot \!(C_j^{(r_1)}(t)\!+C_j^{(r_2)}(t)\! -\!C_j^{(r_3)}(t)\!-\!C_j^{(r_4)}(t)), &{}rand[0,1]\!>\!CR \end{array}\right. } \end{aligned}$$
(17)
$$\begin{aligned} F\!=\! {\left\{ \begin{array}{ll} CR, &{} rand[0,1]\!\leqslant \!CR\\ (1-CR)/2, &{}rand[0,1]\!>\!CR \end{array}\right. } \end{aligned}$$
(18)

Equation (17) shows that the DE/rand/1/bin strategy helps to explore more individuals at the beginning, when \(CR\) is large, and that the proportion gradually changes with \(CR\). After many generations the solutions are close to the optimum and \(CR\) approaches its minimum, so the DE/best/2/bin strategy guides the search toward exploiting the optimal solution. Moreover, Eq. (18) shows that the scale factor \(F\) is tied to \(CR\) and therefore also changes dynamically.
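The following sketch combines Eqs. (17) and (18): one random draw selects the branch, and the scale factor \(F\) follows the same draw. The population of mapped cost vectors, the function name and the toy values are illustrative assumptions, not the exact implementation.

```python
import numpy as np

def hybrid_mutation(pop_costs, best_idx, CR, rng):
    """One hybrid mutant per Eqs. (17)-(18), operating on the mapped cost vectors.

    With probability CR the DE/rand/1 difference is used, otherwise DE/best/2;
    the scale factor F is set according to the same draw, as in Eq. (18).
    """
    NP, D = pop_costs.shape
    r1, r2, r3, r4 = rng.choice(NP, size=4, replace=False)
    if rng.random() <= CR:
        F = CR
        return pop_costs[r1] + F * (pop_costs[r2] - pop_costs[r3])              # DE/rand/1
    F = (1.0 - CR) / 2.0
    return (pop_costs[best_idx]
            + F * (pop_costs[r1] + pop_costs[r2] - pop_costs[r3] - pop_costs[r4]))  # DE/best/2

rng = np.random.default_rng(1)
pop_costs = rng.uniform(1.0, 10.0, size=(10, 6))     # mapped flight-path costs of 10 individuals
print(hybrid_mutation(pop_costs, best_idx=0, CR=0.8, rng=rng))
```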

In addition, we utilize a generational mutation mechanism in DMDE to enlarge the search coverage. The generational mutation preserves a few elite individuals and replaces all other ordinary individuals in the population during the generation iteration, which significantly changes the previous search tendency and enhances both population diversity and random search coverage. We set the generational mutation rate as \(GMR=1-CR\), which gradually increases as \(CR\) decreases.

5 Experimental results

To demonstrate the effectiveness of the improved DMDE algorithm, several simulation cases were implemented in MATLAB R2011b on Windows 7, on a PC with a 3.20 GHz quad-core Core i5 processor and 8 GB of RAM. We used the same 3D terrain, randomly placed radar coordinates, and numbers of UAVs and targets as in the previous research, so that the improved approach can be compared with the previous common approach under the unified assignment model.

5.1 The effectiveness and efficiency of DMDE

In order to verify that the DMDE algorithm can find the global optimal solution, we designed experiments at two different scales. Exp1 uses a small number of UAVs and targets to verify the feasibility of the algorithm and to compare it with the method of the previous research. Exp2 uses a large number of UAVs and targets to prove the algorithm's effectiveness in complex situations. The initial data are formatted as follows: the velocity matrix of the UAVs \(V= \left( \begin{array}{cc} V_{min}^1 & V_{max}^1\\ \vdots & \vdots \\ V_{min}^n & V_{max}^n \end{array}\right)\), the constraint vector of maximum flight path cost \(D=[d_{max}^1,\ldots ,d_{max}^n]\), the constraint vector of maximum flying time \(T=[ \dfrac{d_{max}^1}{v_{min}^1},\ldots , \dfrac{d_{max}^n}{v_{min}^n}]\), the weight vector of targets \(W=[\omega _1,\ldots ,\omega _n]\), the constraint matrix of target order \(T_{sort}= \begin{pmatrix} t(i) & t(j)\\ \vdots & \vdots \end{pmatrix}\), the constraint vector of simultaneous arrival \(T_s=[k_1,k_2,\ldots ,k_l]\) (where \(k\) is the number of assignment teams), the vector of maximum initial waiting times \(T_{wait}=[t_{wait}^1,t_{wait}^2,\ldots ,t_{wait}^k]\), and the time window constraint \(T_{win}=[t_{start},t_{end}]\). The initial data for Exp1 are shown in Table 2. The scale factors in the objective function are chosen as \(\alpha =2.5\), \(\beta =1.5\) for \(N\ge M\) and \(\alpha =1.5\), \(\beta =1\) for \(N<M\). The dynamic crossover rate \(CR\) follows Eq. (16) with exponent \(\theta =3\). The other evolution parameters of DMDE for each experiment are listed in Table 3.

Table 2 Initial data for experiment 1
Table 3 Evolution parameters of each experiment

In Table 3, 'Pop' denotes the population size, 'Gen' is the number of generations, and 'Num' is the number of repeated runs of each experiment.

The statistical results of the experiments are listed in Table 4.

Table 4 Statistical data of the multi-group assignment results

In Table 4, 'Avg time' is the average running time, 'Avg cost' is the average total cost of the flight path, 'Best cost' is the optimal total cost of the flight path over the repeated runs, 'Violation' is the cooperative violation rate, defined as the percentage of the total cost of the optimal solution contributed by constraint violations, and 'Optimal' is the percentage of runs in which the optimal result was found. 'Avg cost' and 'Best cost' are fitness scores composed of the optimization objectives and the various constraint violations, so they are dimensionless values.

We can draw two conclusions from Table 4: (1) the improved DMDE algorithm obtains reasonable target assignment results for the various MUCMTA models; (2) the algorithm also handles the large-scale MUCMTA problem effectively.

For Exp2, although the larger scale increases the complexity of the problem, the DMDE algorithm can still search for an optimal solution efficiently. Furthermore, the average running times of Exp1 and Exp2 are both low and of the same order, which shows that the unified DMDE algorithm is efficient for the various assignment models and is suitable for solving larger-scale problems. Moreover, the average total cost is very close to the optimal cost, indicating that the search approaches the global optimum. Meanwhile, the smaller the cooperative violation rate, the stronger the cooperative capacity of the solution, so the algorithm is less likely to violate the constraints. In addition, the high optimal rate shows that the experiments have a high probability of finding the optimal solution.

To present the assignment results directly, Table 5 lists the optimal assignment results of Exp1 for the various models, and Fig. 9 shows the corresponding results in the 3D simulated battlefield environment with radar threats and multiple constraints.

Table 5 Experimental assignment results of Exp1
Fig. 9 Simulation results of Exp1 in the 3D simulated battlefield environment

In Table 5, 'U' is the UAV, 'T' is the target, and each value is the flight path cost between the UAV and the target. In Fig. 9, the yellow circles indicate the starting positions of the UAVs, the yellow stars indicate the target positions, and the red lines show the assignment relationships between UAVs and targets as well as the corresponding flight path costs.

Figure 9 shows that the DMDE algorithm deals effectively with the assignment for the different models when the radars are densely distributed and the targets are randomly distributed in the 3D environment. Moreover, when the radar coverage in the battlefield is less dense, the estimated flight path cost is closer to the real path cost, which provides an important reference for the subsequent phase of UAV trajectory planning.

For comparison with the previous study [40], the data of the two experiments are listed in Table 6.

Table 6 Comparison with the previous experimental data

Table 6 shows that the average running time of the experiments with DMDE is shorter than that in the previous research, because DMDE not only modifies the gene generation strategy but also introduces the differential mapping rules. The optimal rate likewise shows that DMDE is more capable of finding the optimal result than the previous approach.

5.2 Performance of DMDE

The convergence curves of the three models in Exp1 are shown in Fig. 10, from which the behaviour of DMDE can be analyzed in the following three aspects:

Fig. 10 The single convergence curve and average convergence curve of the assignment results in Exp1

First, the blue line represents the convergence curve of a single run that reached the optimal point during the generation iterations, and the red line shows the average convergence curve over many runs. The curves of the different models have similar convergence trends: all of them converge near the optimal solution after about 150 generations, which means the convergence speed is high at the beginning of the search. Consequently, the DMDE algorithm has good convergence performance.

Secondly, a red circle in Fig. 10 indicates that the current optimum was obtained by differential mutation, while a blue star indicates that it was obtained by generational mutation. We can see that the generational mutation not only increases population diversity but also occasionally finds a new optimum at random, while differential mutation still plays the dominant role during the DMDE search. On the other hand, the DMDE algorithm maps physical properties to guide the search direction, so it moves rapidly toward the global optimum, and the hybrid strategy reduces the scale of the random search. The DMDE algorithm therefore has substantially stronger search capabilities.

Finally, the algorithm continues to converge even in the last phase of the search, which shows that the dynamic crossover rate, generational mutation and hybrid strategy increase the algorithm's capacity to escape from local optima and maintain the balance between exploration and exploitation.

We also compared different DE crossover strategies for the search; the average convergence curves are shown in Fig. 11. It can be seen that the hybrid strategy is better than either single strategy (DE/rand/1/bin or DE/best/2/bin). Moreover, when the hybrid strategy is built from DE/rand/2 and DE/best/1 instead, the search results and convergence curves are very similar.

Fig. 11 Average convergence curves of the assignment results with different crossover strategies

5.3 A comparison between DMDE and other algorithms

We also compared our algorithm with a standard GA, the Sort DE algorithm, and the PMX DE algorithm. The experiments used the same evolution parameters, population size and number of iterations for all four algorithms. The results are shown in Table 7, where \(CR\) is the crossover rate, \(MR\) is the mutation rate and \(GMR\) is the generational mutation rate of DE. The average convergence curves are shown in Fig. 12.

Table 7 A comparison between DMDE and other algorithms
Fig. 12 Average convergence curves of the assignment results with different algorithms

From Table 7 and Fig. 12, it can be seen that the DMDE algorithm obtains the optimal solution with the minimum cost, the fastest convergence and the shortest running time among the compared algorithms. The Sort DE algorithm has a shorter running time than GA and PMX DE when N = M and N > M. Although these three algorithms can all converge near the optimal solution, they are worse than DMDE and need more running time.

Furthermore, we analyze the time complexity of the DMDE algorithm. Let \(n\) and \(m\) denote the scale of an individual (the numbers of UAVs and targets), and \(d\) and \(l\) the scale of DE (population size and number of generations). The time complexity of DMDE is mainly determined by the initialization \((n\cdot m+m)\), the differential operation \((d\cdot l)\), the mutation operation \(((n\cdot m+m)\cdot l)\), the fitness and constraint calculation \((6n)\), and the inverse mapping matching \((m \cdot d\cdot l)\). The time complexity is therefore:

$$\begin{aligned} \begin{array}{ll} &{}f(n)=(n\cdot m+m)+(d\cdot l)+((n\cdot m+m)\cdot l)+(6n)+(m \cdot d\cdot l)\\ &{}\quad \quad =2n^3+3n^2+2n\\ \end{array} \end{aligned}$$
(19)

Thus the inverse mapping matching and the mutation operation account for most of the computational effort, but the overall efficiency of the algorithm is not reduced, because the differential is computed directly from the flight path costs. Taking \(m\), \(d\) and \(l\) to be of the same order as \(n\), the worst-case time complexity is \(O(n^3)\).

In contrast, GA and PMX DE perform crossover and mutation on the discrete sequence of the individual, so the time complexity of their evolution is mainly \((d\cdot l \cdot (n \cdot m))\), which costs more than DMDE; their worst-case time complexity is \(O(n^4)\). The Sort DE algorithm is simpler than GA and PMX DE in the differential operation and has a time complexity similar to DMDE, with a worst case of \(O(n^3)\).

These experimental results indicate that the DMDE algorithm outperforms the other algorithms in solving the MUCMTA problem, because the other algorithms are still variants of random search. A blind random search not only reduces efficiency but also degrades rapidly as the problem scale increases. In contrast, the DMDE algorithm is to some degree free from the restriction of random searching, so it is suitable for solving large-scale MUCMTA problems.

6 Conclusions

A novel method for UAV cooperative multi-target assignment based on a unified model has been proposed in this paper. The main contributions of our work are as follows:

  1.

    A unified gene coding strategy is proposed to unify the various target assignment models in a consistent framework by analyzing the sequences of targets and UAVs and their flight path costs. The flight path cost is included in the gene code, which makes it easy to transform the discrete space into a continuous space.

  2.

    Mapping and inverse mapping rules are proposed to handle the combinatorial optimization problem in DE. The improved DMDE algorithm converts the discrete space of UAV-target relationships into a continuous space of flight path costs and performs the differential operations on the cost values. To obtain rational offspring individuals, the temporary differential results are mapped back to the space of UAV-target relationships according to the inverse mapping rules. In addition, the distance information is used to guide the search toward the global optimum, which helps to speed up the search process.

  3.

    In order to avoid falling into a local optimum, a dynamic crossover rate and hybrid evolution strategies are applied to balance the exploration and exploitation during the searching process.

Compared with the current algorithms, the improved DMDE algorithm is more efficient. Simulation results show that feasible optimal solutions can be obtained in a reasonable running time.

Future work will apply the results of MUCMTA to guide multi-UAV cooperative trajectory planning.