1 Introduction

Unmanned aerial vehicles (UAVs) [1] and other unmanned flight systems are highly manoeuvrable and can perform tasks in hazardous areas without onboard personnel. They are convenient and intelligent to use and reduce pilot training costs, and are therefore widely used for reconnaissance, strike, surveillance, and search and rescue. With the increasing popularity of drone applications [2], research on autonomous algorithms, intelligent control algorithms [3], and collaborative planning algorithms for drones has become increasingly prominent. In particular, the ability of drones to perform tasks in environments that are dry, harsh, or dangerous for humans has attracted a growing number of researchers to study formation flight algorithms, task allocation algorithms, trajectory planning algorithms, and efficiency evaluation methods. Drones play an increasingly important role in the military field and are also in high demand in civilian fields such as Internet of Things (IoT) networks. As disasters of various kinds become more frequent, drones are increasingly used to monitor disaster scenes, for example in mountain fire monitoring, maritime emergency rescue, and medical material transportation.

The scenario considered in this paper is the aftermath of a severe earthquake. The early warning measurement center needs to collect soil, pressure, water quality, and other data gathered by wireless sensor nodes pre-deployed at key points in the earthquake area, such as plate boundaries, in order to analyze the disaster situation, predict whether a barrier lake will form, and estimate the arrival time of aftershocks. Because of the threat of aftershocks and the damage to the terrain, collecting the data manually by land transport is not feasible. Instead, exploiting the high mobility of drones, each early warning measurement center sends a drone cluster to collect data from the task target points. On a large scale, the wireless sensor nodes are distributed in clusters, so when an earthquake affects a wide area it is difficult to network the groups of sensor nodes with one another. There is also no support from communication facilities such as base stations, so the data from the various locations cannot be gathered centrally. As a result, the locations that require data collection far outnumber the available drones. On a small scale, nodes may be damaged during the earthquake and have limited energy. Since batteries cannot be replaced, keeping a node in a data aggregation state would quickly deplete its battery. Therefore, the groups of nodes do not each choose a separate data aggregation node but use the drones as aggregation nodes, and energy efficiency is maximized by planning the collection paths of the drones. By allocating tasks reasonably and planning the drone flight paths, the data collection tasks across the entire earthquake area can be completed efficiently and reliably.

2 Task Allocation Model

The multi-drone task allocation problem [4] can generally be described as follows: given the drone formation groups, the target point information, and partial environmental information, assign a sequence of target points to each drone so that all tasks are completed at minimum cost and with the best efficiency.

The drone clusters in this project meet the following conditions:

  1. Isomorphism: All the drones have the same structure and function.

  2. Communication guarantee: All the drones have established communication links through flying ad hoc networks.

  3. Functionality: All drones are equipped with relevant equipment for data collection.

Assume that the number of drones is N, and the set of drones is U = {U1, U2, …, UN}. The number of target points to traverse is M, and the set of tasks is T = {T1, T2, …, TM}. The task allocation problem meets the following principles:

  1. Each drone has the same structure, which satisfies the same resource and dynamic constraints;

  2. Tasks at any location can only be assigned to a single drone cluster, meaning that a single target point can only be accessed once;

  3. The drones visit all target points at the shortest possible distance and return to the base, while ensuring the shortest total length of the drone formation flight route;

  4. The task allocation of all drone clusters participating in the task should be as fair as possible;

  5. Each drone cluster should try to avoid meteorological threat areas as much as possible.

The task allocation problem is a combinatorial optimization problem with multiple constraints, which needs to meet the following constraints:

  1. Maximum flight distance constraint: The maximum flight distance of a single drone cluster is limited by the amount of fuel carried. Let the maximum flight distance be Dmax and the route of the ith drone be Li. The constraint is then expressed as,

    $$D\left( {L_{i} } \right) \le D_{\max } ,\forall i = 1,...,N$$

  2. Target traversal constraint: Every target point must be assigned exactly once. This constraint can be expressed as,

    $$\bigcup\limits_{i = 1}^{N} {\left\{ {T_{i}^{1} ,...,T_{i}^{{M_{i} }} } \right\}} = T,\quad \left\{ {T_{i}^{1} ,...,T_{i}^{{M_{i} }} } \right\} \cap \left\{ {T_{k}^{1} ,...,T_{k}^{{M_{k} }} } \right\} = \emptyset ,\forall i,k \in \left\{ {1,...,N} \right\},i \ne k$$

where \(T_{i}^{j}\) is the jth target point that the ith drone will visit, and \(M_{i}\) is the number of target points assigned to the ith drone.

In dynamic scenarios, task allocation should consider the total length of the drone cluster’s execution path, the flight load of each drone cluster, and the penalty for passing through the threat zone. Considering the above factors, the fitness function is established as,

$$Fitness = \alpha \sum\limits_{i = 1}^{N} {D\left( {L_{i} } \right)} + \beta \mathop {\max }\limits_{1 \le i \le N} D\left( {L_{i} } \right) + \gamma \sum\limits_{i = 1}^{N} {D_{o} \left( {L_{i} } \right)}$$

where \(\sum\limits_{i = 1}^{N} {D\left( {L_{i} } \right)}\) is the total route length of all the drones, \(\mathop {\max }\limits_{1 \le i \le N} D\left( {L_{i} } \right)\) is the length of the longest single route (minimizing it balances the load across drones), and \(D_{o} \left( {L_{i} } \right)\) is the penalty path length incurred when the ith drone flies through threat zones. \(\alpha ,\beta ,\gamma\) are three weighting coefficients and,

$$\alpha + \beta + \gamma = 1$$

By minimizing the fitness function, an optimal solution can be obtained for the comprehensive evaluation of the total distance, the balanced degree of task allocation, and the flight safety degree. The final optimization problem can be expressed as,

$$\begin{gathered} \min Fitness \hfill \\ s.t. \, D\left( {L_{i} } \right) \le D_{\max } ,\forall i = 1,...,N \hfill \\ \, \bigcup\limits_{i = 1}^{N} {\left\{ {T_{i}^{1} ,...,T_{i}^{{M_{i} }} } \right\}} = T,\quad \left\{ {T_{i}^{1} ,...,T_{i}^{{M_{i} }} } \right\} \cap \left\{ {T_{k}^{1} ,...,T_{k}^{{M_{k} }} } \right\} = \emptyset ,\forall i,k \in \left\{ {1,...,N} \right\},i \ne k \hfill \\ \end{gathered}$$
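The fitness function and the two constraints can be illustrated with a minimal Python sketch. It rests on several assumptions that the text does not specify: targets are 2-D points, each route is a closed tour from the base through its targets and back, and the penalty \(D_{o}\) is taken to be the route length lying inside a single circular threat zone, approximated by sampling. All function names are hypothetical.

```python
import math

def route_length(route, base):
    """Length of the closed tour base -> targets in order -> base."""
    stops = [base, *route, base]
    return sum(math.dist(p, q) for p, q in zip(stops, stops[1:]))

def threat_length(route, base, center, radius, samples=200):
    """Assumed penalty model: approximate the tour length inside a circular
    threat zone by sampling evenly spaced points along each leg."""
    stops = [base, *route, base]
    inside = 0.0
    for p, q in zip(stops, stops[1:]):
        leg = math.dist(p, q)
        for s in range(samples):
            t = (s + 0.5) / samples
            x = p[0] + t * (q[0] - p[0])
            y = p[1] + t * (q[1] - p[1])
            if math.dist((x, y), center) <= radius:
                inside += leg / samples
    return inside

def fitness(routes, base, center, radius, alpha, beta, gamma):
    """Weighted sum of total length, longest route, and threat penalty."""
    lengths = [route_length(r, base) for r in routes]
    penalty = sum(threat_length(r, base, center, radius) for r in routes)
    return alpha * sum(lengths) + beta * max(lengths) + gamma * penalty

def is_feasible(routes, targets, base, d_max):
    """Check the traversal constraint (every target exactly once) and the
    maximum flight distance constraint for every route."""
    visited = [t for r in routes for t in r]
    each_once = sorted(visited) == sorted(targets)
    within_range = all(route_length(r, base) <= d_max for r in routes)
    return each_once and within_range
```

Lower values are better, so an allocation that violates `is_feasible` would have to be rejected or repaired before its fitness is compared with others.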

3 Task Allocation Based on Improved Immune Algorithm

Based on the concepts of the artificial immune algorithm [5] and multi-agent systems, an improved immune multi-agent algorithm is proposed to solve the task allocation problem. By restricting some operations to the neighborhood, the algorithm improves the ability of traditional immune algorithms to escape local optima while retaining their fast convergence in the initial stage. It is suitable for both dynamic and static task allocation problems.

3.1 Memory Population Initialization

The immunological memory follows the concept of the memory population in the clonal selection algorithm and stores the best agents found during the evolution process. Because the memory agent group has a limited size, it must be updated and pruned in every iteration. The specific steps are as follows,

  1. Let the size of the memory agent group be N (here N denotes the memory size, not the number of drones). Select the k antibody agents with the best fitness in the current antibody agent grid and try to add them to the memory agent group. If the memory agent group has enough free capacity, the k antibody agents are added directly.

  2. If the memory agent group is full, or its remaining capacity is less than k, an elimination operation is performed: the fitness of all antibody agents in the current memory agent group is compared with that of the k new antibody agents, and the agents with the worst fitness are removed until the group no longer exceeds its size limit.

The overall process can be expressed as,

$$N_{best} = SelectN(N_{best} ,L)$$

where Nbest is the memory agent group, L is the current antibody agent grid, and \(SelectN( \cdot )\) selects the N agents with the best fitness from its arguments.
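The memory update can be sketched in a few lines of Python. This is only an illustration under the assumption that a lower fitness value is better (the fitness function is minimized); `update_memory` and its parameters are hypothetical names.

```python
def update_memory(memory, grid_agents, capacity, k, fitness):
    """Merge the k best grid agents into the memory group, then keep only
    the `capacity` agents with the best (lowest) fitness."""
    candidates = sorted(grid_agents, key=fitness)[:k]
    merged = memory + candidates
    return sorted(merged, key=fitness)[:capacity]
```

Merging first and truncating afterwards covers both steps at once: when there is free capacity nothing is dropped, and otherwise the worst agents are eliminated.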

3.2 Neighborhood Cloning Operation

Neighborhood cloning is derived from the antibody cloning operation of the clonal selection algorithm, one of the artificial immune algorithms. In the clonal selection algorithm, cloning only multiplies the best few antibodies in the population and then mutates them, which is unfavorable for maintaining population diversity. In this paper, the cloning process is restricted to the neighborhood. For each antibody agent, the agent with the best fitness among itself and the agents in its neighborhood is selected, cloned, and mutated, and the resulting antibody agent with the best fitness replaces the original antibody agent. The neighborhood cloning operation can be expressed as,

$$A_{i,j} = Select\left( {Mutation\left( {Clone\left( {Select\left( {\left\{ {A_{i,j} ,Neigh.A_{i,j} } \right\}} \right)} \right)} \right)} \right)$$

where \(Select\left( \bullet \right)\) represents selecting the best agent, \(Mutation\left( \bullet \right)\) represents the mutation operation, and \(Clone\left( \bullet \right)\) represents the cloning operation.
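A minimal Python sketch of this operator follows. Several details are assumptions rather than statements from the text: agents are permutations stored in a square grid, the neighborhood is the four adjacent cells with wrap-around, mutation swaps two positions, lower fitness is better, and the unmutated parent is kept in the final selection so the result is never worse than the best neighbor. All names are hypothetical.

```python
import random

def neighbors(grid, i, j):
    """Assumed neighborhood: up, down, left, right with wrap-around."""
    size = len(grid)
    return [grid[(i - 1) % size][j], grid[(i + 1) % size][j],
            grid[i][(j - 1) % size], grid[i][(j + 1) % size]]

def swap_mutation(agent, rng):
    """Assumed mutation: exchange two randomly chosen positions."""
    child = list(agent)
    a, b = rng.sample(range(len(child)), 2)
    child[a], child[b] = child[b], child[a]
    return child

def neighborhood_clone(grid, i, j, fitness, clones, rng):
    """Pick the best agent among cell (i, j) and its neighbors, mutate
    `clones` copies of it, and return the best of parent and mutants."""
    parent = min([grid[i][j], *neighbors(grid, i, j)], key=fitness)
    mutants = [swap_mutation(parent, rng) for _ in range(clones)]
    return min([parent, *mutants], key=fitness)
```

The caller would write the returned agent back into cell (i, j). With the clone multiplier reported in the simulation section, `clones` would be 10.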

3.3 Neighborhood Suppression Operation

The neighborhood suppression operator draws on the suppression concept in artificial immune algorithms, which suppresses high-affinity antibodies to maintain population diversity. Because cloning in this algorithm is carried out within the neighborhood, a locally optimal antibody agent becomes, owing to its good fitness, the parent of the clones of all antibody agents in its neighborhood. The antibodies within a small range therefore all preserve the characteristics of the local optimum, their affinity is naturally high, and they need to be suppressed. Within each neighborhood, the neighborhood suppression operation can be expressed as,

$$A_{i,j} = \left\{ \begin{gathered} B_{i,j} ,if \, A_{i,j} \in Neigh.\left\{ {A_{i,j} } \right\} \hfill \\ A_{i,j} ,otherwise \hfill \\ \end{gathered} \right.$$

where Bi,j is a randomly generated new antibody agent. Because the neighborhood is small, each agent simply checks whether an identical agent exists in its neighborhood. If so, the agent is replaced by a randomly generated antibody agent to maintain the diversity of the agent group.
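The suppression rule can be sketched as follows, assuming (as an illustration only) a square grid with a four-cell wrap-around neighborhood and agents stored as lists. The random replacement is produced here by shuffling the agent's own elements, which is one possible way to generate a new permutation. The names are hypothetical.

```python
import random

def neighbors(grid, i, j):
    """Assumed neighborhood: up, down, left, right with wrap-around."""
    size = len(grid)
    return [grid[(i - 1) % size][j], grid[(i + 1) % size][j],
            grid[i][(j - 1) % size], grid[i][(j + 1) % size]]

def neighborhood_suppress(grid, i, j, rng):
    """Return a random new agent if an identical agent exists in the
    neighborhood of cell (i, j); otherwise return the agent unchanged."""
    agent = grid[i][j]
    if any(agent == other for other in neighbors(grid, i, j)):
        fresh = list(agent)
        rng.shuffle(fresh)  # random permutation of the same targets
        return fresh
    return agent
```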

3.4 Population Crossover Operation

The crossover operation in this algorithm differs from the conventional one in how the parents are selected. For each antibody agent, roulette wheel selection is used to choose a second parent from the entire grid, the two are crossed, and the better offspring replaces the original agent. This gives every antibody agent, regardless of its fitness, the opportunity to take part in crossover, which improves population diversity and supports global search. At the same time, good individuals still reinforce one another, which preserves the direction of evolution. If the partially matched crossover (PMX) operator is used, the population crossover operation can be expressed as,

$$A_{i,j} = \left\{ \begin{gathered} PMX\left( {A_{i,j} ,Roulette\left( L \right)} \right),if \, unifrnd(0,1) < \left( {times - i} \right)/times \hfill \\ A_{i,j} ,otherwise \hfill \\ \end{gathered} \right.$$

where Roulette(L) represents the roulette wheel selection function over the entire grid L, unifrnd(0, 1) represents a random number drawn from a uniform distribution between 0 and 1, times represents the total number of iterations, and i in the condition indicates the current iteration number (not the grid row index). The probability of the population crossover operation thus decreases gradually as the iterations proceed, with the aim of accelerating population convergence and preventing the loss of good information through excessive crossover.
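The following Python sketch shows one way this operator could look. PMX is implemented in its standard form. The roulette weights (inverse of a non-negative fitness, since lower is better) and the random choice of the two cut points are assumptions, and all function names are hypothetical.

```python
import random

def pmx(p1, p2, lo, hi):
    """Partially matched crossover: the child keeps p1[lo:hi]; the other
    positions take genes from p2, remapped to avoid duplicates."""
    child = [None] * len(p1)
    child[lo:hi] = p1[lo:hi]
    mapping = {p1[k]: p2[k] for k in range(lo, hi)}
    for k in list(range(lo)) + list(range(hi, len(p1))):
        gene = p2[k]
        while gene in mapping:  # gene already used inside the segment
            gene = mapping[gene]
        child[k] = gene
    return child

def roulette(agents, fitness, rng):
    """Assumed weighting for minimization: probability ~ 1 / fitness."""
    weights = [1.0 / (1e-12 + fitness(a)) for a in agents]
    return rng.choices(agents, weights=weights, k=1)[0]

def population_crossover(grid, i, j, fitness, iteration, total, rng):
    """Cross cell (i, j) with a roulette-selected mate with probability
    (total - iteration) / total and return the better offspring."""
    if rng.random() >= (total - iteration) / total:
        return grid[i][j]
    agents = [a for row in grid for a in row]
    mate = roulette(agents, fitness, rng)
    lo, hi = sorted(rng.sample(range(len(mate) + 1), 2))
    children = [pmx(grid[i][j], mate, lo, hi), pmx(mate, grid[i][j], lo, hi)]
    return min(children, key=fitness)
```

For example, `pmx([1, 2, 3, 4, 5, 6, 7, 8], [3, 7, 5, 1, 6, 8, 2, 4], 3, 6)` keeps the segment `4, 5, 6` from the first parent and returns `[3, 7, 8, 4, 5, 6, 2, 1]`.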

3.5 Self Learning Operation

The self-learning operation is based on the self-learning ability of agents in a multi-agent system. In this paper, the multi-agent system is the antibody agent group, and its most representative individual is the one with the best fitness. This individual is cloned and mutated, and the best individual after mutation replaces the original best agent. This completes a local search in a small range and lets the antibody agent group evolve further. The self-learning operation can be expressed as,

$$A_{best} = Select\left( {Mutation\left( {Clone\left( {A_{best} } \right)} \right)} \right)$$

3.6 Partial 2-opt Operator

The 2-opt operator was first used for the traveling salesman problem (TSP), where the goal is to find a Hamiltonian circuit with the minimum total weight. The TSP is one of the most widely studied problems in combinatorial optimization. It is NP-hard, and without additional assumptions it admits no polynomial-time constant-factor approximation unless P = NP. Researchers have therefore focused on designing better approximation algorithms for the TSP in metric spaces. The 2-opt algorithm is a local search heuristic from this line of work and forms the basis of the classic Lin-Kernighan algorithm. It starts from an arbitrary feasible solution and searches it repeatedly. If a pair of crossing edges is found, the part of the encoding between the two edges is reversed. This exchange is repeated until a local optimum is reached, which is returned as the algorithm's solution. The algorithm proceeds as follows.

Algorithm 2-opt local optimization algorithm

Input: feasible solution, distance matrix D between nodes

Output: 2-opt optimized feasible solution

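     improved = true

     While improved % Repeat until no improving exchange remains

         improved = false

         For a = 1:length(Solution)-2

             For b = a + 2:length(Solution)-1 % Traverse the entire feasible solution

                 If D(Solution(a),Solution(a + 1)) + D(Solution(b),Solution(b + 1)) > D(Solution(a),Solution(b)) + D(Solution(a + 1),Solution(b + 1)) % If cross edges are found

                     Solution(a + 1:b) = Solution(b:-1:a + 1) % Reverse the segment between the two edges

                     improved = true

                 End

             End

         End

     End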
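For reference, the same procedure can be written as a short runnable Python function. This is a sketch rather than the authors' implementation: it treats the solution as an open path whose first and last nodes stay fixed, indexes the distance matrix by node identifier, and repeats the sweep until no improving exchange remains. The helper `distance_matrix` is a hypothetical convenience for 2-D points.

```python
import math

def distance_matrix(points):
    """Euclidean distance between every pair of points."""
    return [[math.dist(p, q) for q in points] for p in points]

def two_opt(tour, dist):
    """Apply improving 2-opt exchanges until a local optimum is reached."""
    tour = list(tour)
    improved = True
    while improved:
        improved = False
        for a in range(len(tour) - 3):
            for b in range(a + 2, len(tour) - 1):
                old = dist[tour[a]][tour[a + 1]] + dist[tour[b]][tour[b + 1]]
                new = dist[tour[a]][tour[b]] + dist[tour[a + 1]][tour[b + 1]]
                if new < old - 1e-12:  # crossing edges found
                    # Reverse the segment between the two edges.
                    tour[a + 1:b + 1] = reversed(tour[a + 1:b + 1])
                    improved = True
    return tour
```

On the four corners of a unit square, `two_opt([0, 2, 1, 3], distance_matrix([(0, 0), (1, 0), (1, 1), (0, 1)]))` removes the two crossing diagonals and returns `[0, 1, 2, 3]`.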

In the task allocation problem in this paper, 2-opt cannot be applied directly to the whole antibody. However, a partial 2-opt operation can be applied to the target points assigned to each drone to accelerate the convergence of the algorithm. The specific operation is as follows,

$${\text{Solution}} = \left\{ \begin{gathered} 2 - opt({\text{Solution}}\left[ {1:\left\lfloor \frac{N}{M} \right\rfloor } \right]),..., \hfill \\ ...,2 - opt({\text{Solution}}\left[ {(M - 1)*\left\lfloor \frac{N}{M} \right\rfloor + 1:N} \right]) \hfill \\ \end{gathered} \right\}$$

where \(2 - opt\left( \bullet \right)\) represents performing 2-opt optimization on one segment, N is the number of target points, M is the number of drones (note that this reverses the roles of N and M used in Section 2), and \(\left\{ \bullet \right\}\) represents the concatenation of the optimized segments.
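The segment-wise application can be sketched in Python as follows. The sketch assumes that the antibody is a single list of target indices cut into per-drone segments by a truncation vector, such as the vector [7, 14, 21] used for 30 targets and 4 drones in the simulation section. The local optimizer is passed in as a function, so any 2-opt routine that maps a list to a list can be used. Both function names are hypothetical.

```python
def even_cuts(n_targets, n_drones):
    """Cut positions at multiples of floor(n_targets / n_drones); the last
    segment takes the remaining targets."""
    step = n_targets // n_drones
    return [step * k for k in range(1, n_drones)]

def partial_local_search(solution, cuts, local_opt):
    """Optimize each drone's segment separately and splice the results."""
    bounds = [0, *cuts, len(solution)]
    out = []
    for lo, hi in zip(bounds, bounds[1:]):
        out.extend(local_opt(solution[lo:hi]))
    return out
```

Because each segment is optimized on its own, targets never move between drones, so the allocation itself is left unchanged and only the visiting order within each route improves.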

4 Simulation Results

The simulation parameters are as follows: the grid parameter is 6 (a 6 × 6 grid), the size of the agent group is 36, the memory population size is 36, the number of iterations is 5000, and the clone multiplier is 10. The simulation scene is 50 km × 50 km × 400 m, the number of drones is 4, the number of targets is 30, and the truncation vector is [7, 14, 21].

The fuel carried by the drones is assumed to be limited, so the flight distances should be as short as possible. In addition, post-earthquake areas are often subject to meteorological threats such as strong wind, so the route plan should take these threats into account and steer the drones away from such areas. In the simulation scene, a meteorological threat area with a radius of 3 km is placed at a random location. When a meteorological threat is present, the task allocation problem becomes: while keeping the total distance short and the load balanced, stay as far away from the meteorological threat as possible. Because the threat zone is a dominant factor, the penalty path weighting coefficient is set to γ = 1/3. α should be larger than β, so we set α = 7/15 and β = 1/5, which satisfies α + β + γ = 1. The simulation results are shown in Table 1.

Table 1 Simulation results

It can be seen that the proposed algorithm reaches the theoretical optimal solutions. Moreover, the total flight distance obtained by the clonal selection algorithm is 349.6837 km, whereas that of the proposed algorithm is 285.1336 km, a reduction of about 18.5%.