1 Introduction

Unmanned air vehicles (UAVs) are unpiloted aircraft used for both military and civilian purposes such as reconnaissance, surveillance against crime, agricultural applications, weather forecast, hurricane detection, transportation, and minimizing hazardous effects of natural disasters. UAVs are far less costly in initial and operational expenses (fuel and maintenance) and do not need on-board pilots, in contrast with conventional aircraft. They have high-resolution video-recording capabilities. Owing to these advantages, the use of UAVs has become widespread in recent years.

In a military context, route planning for UAVs comprises determining the path the UAV follows while visiting a number of targets in a hostile terrain. This terrain is typically monitored by radars to protect the targets from hostile intrusion. Radars pose a threat to UAVs as there is a possibility that they could be destroyed if detected by a hostile force. Criteria, such as the total distance of the route, flight duration, flight altitude, average distance from the radar sites, safety, radar detection threat, or risk posed by a UAV to third parties on the ground could be considered. Although many studies consider a single criterion, there are an increasing number of studies that consider multiple criteria (see Rojas Viloria et al. 2021, for a review). There are recent studies that include a second criterion in addition to a travel distance-based criterion. In this study, we consider two conflicting objectives: minimizing the total distance of the route and minimizing the radar detection threat. The first objective is to minimize the distance traveled by the UAV, and the second objective is to minimize a detection measure for the route based on a cumulative measure of the probability that the radars will detect the UAV on its route. These are two relevant objectives in military applications of UAV route planning.

The studies that consider similar criteria (such as fuel consumption, path length, risk of threat, distance to radar sites), mainly consider a static environment where conditions stay the same throughout the mission of the UAV. Li et al. (2006) linearly aggregate path length and threat exposure into a single objective and search for the path of a UAV using particle swarm optimization. Zhenhua et al. (2008) consider similar objectives and find alternative routes for a UAV using an ant colony system without aggregating the objectives. Swartzentruber et al. (2009) use particle swarm optimization to construct paths for a UAV using a linear aggregation of fuel consumption, risk of threats, and reconnaissance criteria. They find alternative paths by varying the weights of the criteria in the linear aggregation. Pfeiffer et al. (2009) develop efficient paths for a UAV between an origin and a destination minimizing the flight duration and the probability of being detected in excess of a threshold number of times. Dolicanin et al. (2018) linearly aggregate fuel consumption and threat exposure and use a brain storm optimization algorithm to structure the routes of a UAV. Tezcaner and Köksalan (2011) develop an interactive approach to find the best route of a route planner (RP) who is assumed to have an implicit underlying linear preference function of travel distance and radar detection threat. They consider the routing of a single UAV visiting multiple targets in a hostile terrain. Dasdemir et al. (2020) also consider the same problem and search for efficient routes close to the reference points specified by the RP to represent their preferences using an evolutionary algorithm. In all these approaches, the UAV follows the route selected at the beginning of the mission ignoring any changes that might occur in the environment during the mission.

A more realistic version of this problem is to take into account the dynamic nature of the environment. Some approaches consider pop-up or mobile threats by linearly aggregating several objectives into a single objective and employ heuristic approaches to find approximate solutions (see Zheng et al. 2005, Besada-Portas et al. 2010, Peng et al. 2013 for examples of routing multiple UAVs, and Yao and Zhao 2015, for an example of routing a single UAV). Studies that model the realistic case of mobile targets are rare. Qu and Xi (2010) determine the route of a fleet of UAVs for targets that move in known directions. They use a mixed integer linear program that minimizes the total route length to determine the coordinates of the points at which the UAVs meet their targets. Chen and Xu (2016) find the path of a UAV that visits several moving targets in a radar-monitored terrain where the visiting sequence is known in advance. They linearly aggregate the fuel consumption and radar threat objectives, restrict the path of the UAV to move through the edges of a graph they form based on the locations of the radars, and solve shortest path problems between consecutive targets on this graph.

Natalizio et al. (2020) introduce a different context where UAVs are used to film a sports event. They optimize a linear aggregation of the two criteria: the travel distance and timeliness of arrival to capture an action. They find the routes of a team of UAVs as the locations of the actions (to be filmed) keep changing.

Another dimension in multiobjective UAV route planning is to consider the existence of multiple trajectories when traveling between a pair of targets. In the classical multiobjective traveling salesperson problem (TSP), a single connection is assumed to exist between any two targets (target pairs). However, under multiple objectives, this is not a realistic setting since different trajectories could favor different objectives at different levels, leading to multiple efficient trajectories. In a continuous space where the UAV can move to any point in the terrain, this implies the existence of multiple efficient trajectories between target pairs. The problem, therefore, requires searching for efficient trajectories to use between target pairs as well as determining the visiting order to the targets. A few studies allow for multiple trajectories between target pairs using the objectives of minimizing flight distance and detection threat. Tezcaner and Köksalan (2011) introduced this problem for a discrete terrain where a UAV is restricted to move between imaginary grid points that are assumed to be uniformly located over the terrain. Tezcaner Öztürk and Köksalan (2016) and Köksalan and Tezcaner Öztürk (2017) used different approaches for the same problem, also considering a discrete terrain. Discretizing the terrain limits the possible moves of the UAV compared to the actual situation that allows the UAV to move to any point in the space. Tezcaner Öztürk and Köksalan (2023), Türeci (2017), and Dasdemir et al. (2020) worked on this more realistic version of the problem where the UAV is routed in a continuous space. Tezcaner Öztürk and Köksalan (2023) develop methods to generate the nondominated frontier of the routes for this problem. For the same problem structure, Türeci (2017) develops interactive algorithms to find the most preferred solution of a RP having different preference functions of the two objectives. Dasdemir et al. (2020) develop a reference point-based evolutionary algorithm that approximates the preferred regions of the nondominated frontier and apply the algorithm to the biobjective UAV route planning problem in continuous space.

In this paper, we consider a more general case than the problem setting of Tezcaner Öztürk and Köksalan (2023), Türeci (2017), and Dasdemir et al. (2020). We form the route of a single UAV that has a mission to visit a set of targets by moving freely in continuous space considering flight distance and detection threat objectives. These targets are typically located on ground vehicles that move at a much slower speed relative to the UAV. As is commonly done in the literature, we assume that the UAV moves to the location of each target. In practice, the exact location the UAV needs to move could change slightly depending on its mission. In general, it needs to move close enough to accomplish its mission. Within the context of this study, such slight differences do not affect the routing decisions. Once the UAV reaches the vicinity of a target, it can make small adjustments to get to the desired location to achieve its mission. We consider multiple trajectories between target pairs and, different from the previous studies, a dynamic environment where the targets change their locations throughout the mission of the UAV. We assume that the movements of the targets are unknown to us in advance. Our approach starts with finding an initial route considering the observed locations of the targets when the UAV is at the base. The route is constructed taking the RP’s preferences into account. When the UAV is en route to the first target, all targets keep moving in directions unknown in advance but can be observed by the UAV. The UAV locks on to the next target and moves toward it, adjusting its route as necessary. At the point the UAV completes its reconnaissance mission for that target, it observes the new locations of the remaining targets and updates its route based on the observed locations. Our algorithms keep constructing and updating preferred routes of the RP, taking the changing conditions into account each time the UAV reaches a target. To the best of our knowledge, this is the only study that considers a dynamic environment in a continuous space while allowing for multiple trajectories that could be used between target pairs.

UAVs are commonly used for military missions that have characteristics similar to the problem structure we consider. Pursuing mobile targets is an important mission of UAVs (Glade 2000), and our approach captures the fundamental aspects of such missions well. A specific application of UAVs that can be modelled with our problem structure is gathering intelligence from static or mobile targets using sensors (cameras, recording devices, etc.). Mufalli et al. (2012) explicitly address the problem of which sensors to place on a fleet of UAVs and they determine the routes of the UAVs to maximize the information collected from static targets. Our approach addresses the dynamic version of this problem under two objectives for a UAV that has a fixed set of sensors.

The paper unfolds as follows. We present definitions and state our problem in Sect. 2. We develop solution mechanisms and an algorithm in Sect. 3 and demonstrate it in Sect. 4. We conclude and discuss possible future work in Sect. 5.

2 Problem definition

We first introduce our notation and definitions.

Let \(p\) denote the number of objectives and assume without loss of generality that they are all of minimization type. Let \({\varvec{x}}\) be a solution (that is, the decision variable vector indicating the route to follow in our case) and \(X\) be the feasible set. \(z\left({\varvec{x}}\right)=\left({z}_{1}\left({\varvec{x}}\right), {z}_{2}\left({\varvec{x}}\right),\dots ,{z}_{p}\left({\varvec{x}}\right)\right) \in Z\) denotes the objective function vector corresponding to solution \({\varvec{x}}\in X,\) where \({z}_{k}({\varvec{x}})\) is the performance of \({\varvec{x}}\) in objective \(k,\) and \(Z\) is the image of \(X\) in the objective function space.

We take the following definitions directly from Tezcaner Öztürk and Köksalan (2016).

Definition 2.1

A solution \({\varvec{x}}\in X\) is efficient if there does not exist \({\varvec{x}}\mathbf{^{\prime}}\in X\) such that \({z}_{k}\left({{\varvec{x}}}^{\boldsymbol{^{\prime}}}\right)\le {z}_{k}({\varvec{x}})\) for \(k=\mathrm{1,2},\dots ,p\) and \({z}_{k}\left({{\varvec{x}}}^{\boldsymbol{^{\prime}}}\right)<{z}_{k}({\varvec{x}})\) for at least one objective. If there is such an \({\varvec{x}}\boldsymbol{^{\prime}}\), \({\varvec{x}}\) is said to be inefficient. All efficient solutions constitute the efficient frontier (set).

Definition 2.2

If a solution \({\varvec{x}}\in X\) is efficient, then \(z({\varvec{x}})\) is said to be nondominated, and if \({\varvec{x}}\) is inefficient, and then \(z({\varvec{x}})\) is said to be dominated. All nondominated points constitute the nondominated frontier (set).

Definition 2.3

An extreme efficient solution \({\varvec{x}}\) is an efficient solution that has the minimum possible value in at least one of the objectives.

We next explain the problem setting, the structure we create to represent the trajectories between target pairs, and the route we construct.

2.1 The problem setting

We consider the same problem setting as in Tezcaner Öztürk and Köksalan (2023), Türeci (2017), and Dasdemir et al. (2020), where the route of a UAV is constructed in continuous space. In these studies, the UAV is assumed to start from a base, visit each target once, and return to the base. They minimize distance and radar detection threat objectives. Tezcaner Öztürk and Köksalan (2023) generate the nondominated frontier of this problem. Türeci (2017) and Dasdemir et al. (2020) find preferred solutions using exact and heuristic approaches, respectively. They all assume static targets. In contrast, we consider mobile targets that are ground vehicles that move at slow speeds relative to the speed of the UAV. Our approach can work for different relative speeds for the ground vehicles and the UAVs.

The exact movements of the targets are unknown to us, but we assume that the UAV observes the targets throughout its mission. The initial route is constructed when the UAV is at the base for the observed locations of the targets. The UAV then locks on to the next target and flies towards it adjusting its route according to the movements of that target. Once the UAV reaches the point it can collect information, and we update the remaining legs of the route based on the new locations of the unvisited targets. We further discuss the details of constructing the route later.

During its travel toward the targets, the UAV can move to any point in a two-dimensional terrain that is monitored by radars. The UAV tries to minimize both the distance traveled and the radar detection threat. The total distance traveled is the sum of the lengths of all the paths the UAV traverses. We measure the possibility of detection using the radar detection threat measure developed in Gudaitis (1994). Based on this measure, the effectiveness of radar in detecting a UAV depends on how close it is. If the UAV is within a certain distance, it is detected with certainty. This detection probability decreases proportional to its distance from the radar and reaches 0 beyond a certain distance. The region of effectiveness of the radar is shown in Fig. 1a with two concentric circular areas assuming that the radar is located at the center. The detection probability is 1.0 in the inner circular region (the core region) depicted with a grey shading. The outer region is the remaining white area in the circular radar region, where the detection probability starts at 1.0 at the circumference of the core region and gradually decreases toward the circumference of the outer region, and reaches a value of 0.0 at the outer circumference. To create an overall measure of radar detection threat objective, we accumulate all detection probabilities on the route of the UAV, as the overall risk to the UAV depends on the instantaneous probability of detection weighted with the duration of being exposed to that probability.

Fig. 1
figure 1

Types of target pairs (adapted from Tezcaner Öztürk and Köksalan, 2023)

The details of the computation of this objective function are given in Appendix.

2.2 Trajectories

As the UAV can move to any point in the continuous space, there are multiple efficient trajectories between each target pair the UAV can use, each performing better than any other trajectory in one objective and performing worse in the other objective. These trajectories differ by the points at which they enter the effective radar region, by the paths they follow in the radar region, and by the points they exit the effective radar region. It would be necessary to determine all these components (the entrance and exit points to and from the radar region, and the path to follow in the radar region) in order to structure all efficient trajectories exactly. In a dynamic setting with moving targets, the routes need to be updated in real time, regularly updating the trajectories based on the new locations of the targets. It is not practical to find and update all these trajectories precisely in such a dynamic setting. We utilize a simplified approach that approximates the nondominated frontiers of the efficient trajectories to be used in the updated route. For each target pair, we find several efficient trajectories precisely and approximate the nondominated frontiers of all efficient trajectories by fitting functions that pass through the objective function values of these few trajectories using the method developed by Tezcaner Öztürk and Köksalan (2023). We then select the preferred points of a RP (corresponding to the objective function values of the two objectives) on each of these frontiers and use the corresponding trajectories to form routes. We next explain Tezcaner Öztürk and Köksalan’s (2023) representation of the nondominated frontiers between target pairs.

Tezcaner Öztürk and Köksalan (2023) enumerate many trajectories between a target pair for different entrance and exit points to and from the radar region, and the path followed within the radar region. They then eliminate the inefficient trajectories and form the nondominated frontier of the enumerated efficient solutions. They observe that the nondominated frontiers of different target pairs exhibit similar structures. They classify these under three categories based on the positioning of the target pairs relative to the radars. If the shortest path between two targets does not pass through a radar region, there is a single efficient trajectory that simultaneously minimizes both the distance and the radar detection threat objectives when moving between those targets (target pair Type 1, see Fig. 1a). The target pairs for which the shortest path passes through only the outer radar region are categorized as Type 2 (Fig. 1b). Finally, the target pairs for which the shortest path passes through both the outer and the core radar regions are categorized as Type 3 (Fig. 1c).

Tezcaner Öztürk and Köksalan (2023) structure the nondominated frontiers of Types 2 and 3 target pairs as demonstrated in Fig. 2a and b, respectively. The trajectories connecting Type 3 target pairs have two segments: a straight-line segment that is made up of the objective function values of the efficient trajectories passing through both the core and the outer radar regions, and a curved segment that is made up of the objective function values of the trajectories passing through only the outer radar region. An example efficient trajectory between a Type 3 target pair is shown in Fig. 1c. The points (\({x}_{en},{y}_{en})\) and (\({x}_{ex},{y}_{ex})\) represent the entrance and exit points to and from the outer radar region, respectively. The entrance and exit points to and from the core radar region are the points (\({x}_{ien},{y}_{ien})\) and (\({x}_{iex},{y}_{iex}),\) respectively. The nondominated frontier of Type 2 target pairs is a curved frontier and is similar in structure to the curved segment of Type 3 target pairs. An example trajectory is shown in Fig. 1b. Type 1 target pairs, on the other hand, have a single efficient trajectory connecting them by a straight path.

Fig. 2
figure 2

Nondominated frontiers of type 2 and 3 target pairs (adapted from Dasdemir et al., 2020)

Based on these structures, Tezcaner Öztürk and Köksalan (2023) characterize the nondominated frontiers of trajectories of Types 2 and 3 by fitting functions that represent their shapes. In this approach, depending on whether the trajectories are of Type 2 or 3, three or four actual efficient trajectories are found between each target pair, respectively. Then the objective values of these trajectories corresponding to each target pair are used to fit a function to represent the nondominated frontier of the trajectories between that target pair. We next explain Algorithm NDF that fits functions to represent the nondominated frontiers for Type 2 and Type 3 target pairs.

Algorithm NDF

Although the algorithms for Type 2 and Type 3 target pairs have many aspects in common, we write two separate algorithms. We briefly discuss how to find trajectories here and elaborate on its details in a later section. To find the trajectory type, we check the trajectory that corresponds to the minimum distance solution \(\left(\mathrm{MDS}\right)\) connecting target pair \(\left(i,j\right).\) Let the distance and radar detection threat values of this trajectory be \({z}_{\mathrm{MDS}}^{1}\) and \({z}_{\mathrm{MDS}}^{2},\) respectively. If this trajectory passes through the core radar region, the target pair is of Type 3. If it only passes through the outer radar region, the target pair is of Type 2.

Algorithm NDF for Type 2 Target Pairs

Step 1. Find the shortest path trajectory between target pair \((i,j)\) that does not pass through any radar region. Let the objective values of this trajectory (minimum radar solution, \(\mathrm{MRS})\) be \({z}_{\mathrm{MRS}}^{1}\) and \({z}_{\mathrm{MRS}}^{2},\) respectively.

Step 2. Find a central solution with distance value that is the midpoint of \({z}_{\mathrm{MRS}}^{1}\) and \({z}_{\mathrm{MDS}}^{1}\). Let the distance and radar detection threat values of the central solution be \({z}_{\mathrm{CS}}^{1}\) and \({z}_{\mathrm{CS}}^{2},\) respectively.

Step 3. Fit an Lq function passing through \(\mathrm{CS},\) \(\mathrm{MRS},\) and \(\mathrm{MDS}\) using Eq. (1).

$${\left(1-\frac{{z}_{\mathrm{CS}}^{1} - {z}_{\mathrm{MDS}}^{1}}{{z}_{\mathrm{MRS}}^{1}- {z}_{\mathrm{MDS}}^{1}}\right)}^{q}+{\left(1-\frac{{z}_{\mathrm{CS}}^{2} - {z}_{\mathrm{MRS}}^{2}}{{z}_{\mathrm{MDS}}^{2}- {z}_{\mathrm{MRS}}^{2}}\right)}^{q}=1$$
(1)

Algorithm NDF for Type 3 Target Pairs

Step 1. Find the shortest path trajectory between target pair \((i,j)\) that avoids the core radar region. Let the objective values of this trajectory (minimum distance outer solution, \(\mathrm{MDOS})\) be \({z}_{\mathrm{MDOS}}^{1}\) and \({z}_{\mathrm{MDOS}}^{2},\) respectively.

Step 2. Find the shortest path trajectory between target pair \((i,j)\) that does not pass through any radar region. Let the objective values of this trajectory (minimum radar solution, \(\mathrm{MRS})\) be \({z}_{\mathrm{MRS}}^{1}\) and \({z}_{\mathrm{MRS}}^{2},\) respectively.

Step 3. Find a central solution with distance value that is the midpoint of \({z}_{\mathrm{MRS}}^{1}\) and \({z}_{\mathrm{MDOS}}^{1}\). Let the distance and radar detection threat values of the central solution be \({z}_{\mathrm{CS}}^{1}\) and \({z}_{\mathrm{CS}}^{2},\) respectively.

Step 4. Fit an Lq function passing through \(\mathrm{CS},\) \(\mathrm{MRS},\) and \(\mathrm{MDOS}\) using Eq. (2).

$${\left(1-\frac{{z}_{\mathrm{CS}}^{1} - {z}_{\mathrm{MDOS}}^{1}}{{z}_{\mathrm{MRS}}^{1}- {z}_{\mathrm{MDOS}}^{1}}\right)}^{q}+{\left(1-\frac{{z}_{\mathrm{CS}}^{2} - {z}_{\mathrm{MRS}}^{2}}{{z}_{\mathrm{MDOS}}^{2}- {z}_{\mathrm{MRS}}^{2}}\right)}^{q}=1$$
(2)

Step 5. Estimate the straight-line piece of the nondominated frontier using the straight-line equation that passes through the objective function values of the two solutions, \(\mathrm{MDS}\) and \(\mathrm{MDOS}.\)

We start the algorithm with finding the extreme efficient trajectory with the minimum distance value and maximum radar detection threat value, \(\mathrm{MDS},\) and classify the target pair as Type 2 or 3. If the target pair is Type 2, we additionally find two points, \(\mathrm{MRS}\) and \(\mathrm{CS},\) and characterize the curved nondominated frontier. A similar approach is employed for Type 3 target pairs. We find three points, \(\mathrm{MDOS},\mathrm{ MRS},\) and \(\mathrm{CS},\) on the curved segment. We then fit an Lq function passing through the end points and a central point on these curved segments. These functions are developed in Köksalan (1999), demonstrated on several problems in Köksalan and Lokman (2009), and used in different contexts in Köksalan and Soylu (2010) and Köksalan and Tezcaner Öztürk (2017). The unknown in Eqs. (1) and (2) is the value of \(q\), and it can be found solving a nonlinear mathematical model, with \(q\) as its only decision variable.

The straight-line segment of Type 3 moves is found with the equation of the line passing through two points, \(\mathrm{MDS}\) and \(\mathrm{MDOS}.\) We next explain how we structure efficient trajectories and, more specifically, how we find efficient trajectories in Algorithm NDF.

2.3 Structuring the efficient trajectories

All efficient trajectories are structured by their entrance points to and exit points from the radar region, and the circular paths they follow in the radar region. The structures of \(\mathrm{MDS}, \mathrm{MRS},\) and \(\mathrm{MDOS}\) are common to any position of targets with respect to the radar as established in Tezcaner Öztürk and Köksalan (2023). They can be found by solving a small set of linear equations simultaneously in negligible computational time.

As demonstrated in Fig. 2, there is a continuous nondominated frontier of the trajectories between Type 2 and Type 3 target pairs and hence a continuum of efficient trajectories as each distinct point on the nondominated frontier corresponds to a distinct trajectory. To structure any of the remaining efficient trajectories between targets \((i,j)\) we employ the search-based algorithm developed in Tezcaner Öztürk and Köksalan (2023). The algorithm finds the efficient trajectory for any value of the first objective function, \({d}_{ij},\)\({z}_{\mathrm{MDS}}^{1}\le {d}_{ij}\le {z}_{\mathrm{MRS}}^{1}.\)

To obtain a central solution, \(\mathrm{CS},\) in Algorithm NDF, we set \({d}_{ij}\) to \({z}_{\mathrm{CS}}^{1},\) the midpoint of \({z}_{\mathrm{MRS}}^{1}\) and \({z}_{\mathrm{MDOS}}^{1}\), in the search-based algorithm.

2.4 Progress of the route

When the UAV is at the base, its route should be formed by solving a generalized multiobjective TSP (MOTSP), where the target pairs are connected by multiple efficient trajectories.

To decide on which of the efficient trajectories to follow, we need to reflect the preferences of the RP. If we know or estimate the preference function of the RP, we can solve the generalized MOTSP right before the UAV starts its flight and have the UAV move toward the first target of this route. However, as the targets keep changing their locations, the initial route may not stay preferred or may not even be nondominated. We may improve by restructuring the route periodically based on new information. One extreme may be to restructure the routes continuously, accounting for all small changes in the targets’ locations. The other extreme is to let the UAV follow the initial route without any modifications. In between these two extremes are the alternatives of restructuring the route at certain time intervals. Since the UAV moves toward the next target at a much faster speed than that of the targets, it is neither practical nor useful to recalculate the routes in short time intervals.

One could choose different reasonable time intervals to reevaluate whether the current route remains a preferred route. We reevaluate every time the UAV reaches its next target. An earlier reevaluation would be beneficial only if it would require rerouting by changing the next target to be visited. This is unlikely since the UAV has been approaching to this target at a much faster speed than the speed of this target (as well as all other targets). Once the UAV reaches the next target, all other targets could have changed their locations relative to each other and relative to the current location of the UAV. Therefore, the remaining legs of the UAV may change at this point in time and it is an opportune time to reevaluate.

The initial problem (with all targets to be visited) is a generalized MOTSP and yields a tour. Each time we complete the next leg of the trip we only need to find a route through the unvisited targets. Hence, the problem becomes a multiobjective shortest Hamiltonian path problem (SHPP) after visiting the first leg. Therefore, we solve a new multiobjective SHPP (MOSHPP) after visiting each target in real time. We refer to the UAV’s visiting order to the targets (starting from and ending at the base) as the tour in the rest of the manuscript. The visiting order to targets and the trajectories used between target pairs together constitute the route of the UAV.

Tezcaner Öztürk and Köksalan (2016) define the generalized biobjective traveling salesperson problem (GBOTSP). We provide their mathematical model below and then build upon their formulation to develop the mathematical model of the biobjective SHPP. In this model all nodes are connected with multiple efficient edges, and hence we refer to it as the generalized biobjective shortest Hamiltonian path problem (GBOSHPP). Let set \(M\) include the unvisited targets and the target the UAV is currently at, and \(T\) be the set of all targets. In TSP, \(M=T\) since we aim to visit all targets when the UAV is at the base (Target 1). For both formulations, we assume that the number of efficient trajectories between targets \((i,j)\) is finite and equal to \({H}_{ij}.\) We use the binary variable \({x}_{ij}^{h}\) to denote whether the efficient trajectory,\(h,\) connecting targets \((i,j)\) is used or not and \({c}_{ijh}^{k}\) denotes its coefficient in objective \(k.\)

(GBOTSP)

$${\text{Min}}z_{1} \left( x \right) = \sum\limits_{{i \in M}} {\sum\limits_{{j \in M}} {\sum\limits_{{h = 1}}^{{H_{{ij}} }} {c_{{ijh}}^{1} x_{{ij}}^{h} } } }$$
(3)
$${\mathrm{Min }z}_{2}\left(x\right)=\sum \limits_{i\in M}\sum\limits_{j\in M}\sum\limits_{h=1}^{{H}_{ij}}{c}_{ijh}^{2}{x}_{ij}^{h}$$
(4)

Subject to:

$$\sum_{j\in M}\sum\limits_{h=1}^{{H}_{ij}}{x}_{ij}^{h}=1\, \quad\forall i\in M$$
(5)
$$\sum_{i\in M}\sum\limits_{h=1}^{{H}_{ij}}{x}_{ij}^{h} =1 \,\quad\forall j\in M$$
(6)
$${u}_{i}-{u}_{j}+\sum\limits_{h=1}^{{H}_{ij}}\left(\left|M\right|-1\right){x}_{ij}^{h}+\sum\limits_{h=1}^{{H}_{ij}}\left(\left|M\right|-3\right){x}_{ji}^{h}\ge \left|M\right|-2 \,\quad\forall i,j \in M-\left\{1\right\},i\ne j$$
(7)
$$1\le {u}_{i}\le \left|M\right|-1\,\quad \forall i \in M-\left\{1\right\}$$
(8)
$${x}_{ij}^{h}\in \left\{\mathrm{0,1}\right\}\,\quad \forall i,j \in M, h=1,\dots ,{H}_{ij}$$
(9)

We minimize the two sum-type objectives in (3) and (4), respectively. Departure from and arrival to each target are satisfied with (5) and (6), respectively. Constraints (7) and (8) are the subtour elimination constraints (Desrochers and Laporte 1991) that are the strengthened version of subtour elimination constraints of Miller et al. (1960).

We solve the GBOSHPP once we arrive at a target, say \(\ell\), and search for the path starting from Target \(\ell,\) reaching Target 1 (base), visiting all unvisited targets in set \(M.\)

(GBOSHPP)

(3), (4)

Subject to:

(7)–(9)

$$\sum_{j\in M-\{\ell\}}\sum\limits_{h=1}^{{H}_{ij}}{x}_{ij}^{h} =1\,\quad \forall i\in M-\{1\}$$
(10)
$$\mathop \sum \limits_{{i \in M - \left\{ 1 \right\}}} \mathop \sum \limits_{h = 1}^{{H_{ij} }} x_{ij}^{h} = 1\,\quad \forall { }j \in M - \left\{ \ell \right\}$$
(11)

Constraints (10) and (11) are the modified versions of (5) and (6), respectively, such that only the targets that are not visited yet should be arrived at and departed from.

3 The solution approach

We develop a general real-time algorithm that constructs and restructures the route of the UAV in a dynamic environment as discussed above. The first phase of the algorithm constructs a route for the UAV when it is at the base by solving a GBOTSP for the initial locations of all targets. The UAV then moves toward the next target. At their meeting point, we update the route by solving a GBOSHPP that starts at the current location of the UAV, visits all unvisited targets (based on their current locations), and terminates at the base. The UAV then moves in the direction of the next target on the updated route. The process repeats until the UAV visits all targets and reaches the base. We refer to the first execution of the algorithm, where an initial tour is found, as Phase 1. We refer to the phase where we successively construct shortest Hamiltonian paths as Phase 2.

3.1 The real-time algorithm

We next present the steps of our real-time algorithm and explain how we execute each step in each phase.

Let the base be Target 1. Recall that set \(M\) includes the unvisited targets as well as the target the UAV is currently at, and \(T\) is the set of all targets. Let \(Q=\{\left(i,j\right):i,j\in M\},\) be the set of all target pairs.

Algorithm R-T

Step R.0. Find the objective function values of the efficient trajectories that can be used between target pairs \(\left(i,j\right)\in Q.\)

Step R.1. Find the most preferred route of the RP, and the target to visit next, say Target \(\ell .\)

Step R.2. Find the new location of and the trajectory to follow to Target \(\ell .\)

3.1.1 Using algorithm R-T in phase 1

We first execute Steps R.0-R.2 of Algorithm R-T when the UAV is at the base.

Step R.0.: finding efficient trajectories between target pairs \(\left(i,j\right)\in Q\)

Since the UAV moves in a continuous space, the efficient trajectories between target pairs are not finite. We employ Algorithm NDF to characterize the nondominated frontiers corresponding to these efficient trajectories.

Since the nondominated frontiers corresponding to the trajectories between target pairs are continuous, the nondominated frontier made up of tours that use these trajectories is also continuous. Furthermore, the environment is dynamic and keeps changing rapidly and the RP needs to make routing decisions quickly. In order to make decisions quickly in real time, we approximate the preferences of the RP with a weighted linear function of the objectives as given in (12), where \(w\) and \(1-w\) represent the relative importances of the first and the second objectives, respectively, to the RP. The most preferred solution of the RP is the solution that minimizes \(U\left(x\right).\)

$$U\left(x\right)=w {z}_{1}\left(x\right)+(1-w) {z}_{2}\left(x\right)\,where \,0<w<1$$
(12)

We interact with the RP before the execution of the real-time algorithm to estimate the value of \({w}^{*}\) that structures the RP’s preference function. For this purpose, we use Türeci’s (2017) interactive algorithm for underlying linear preference functions. This algorithm finds the most preferred route of a RP for the biobjective UAV routing problem in continuous space, with mechanisms addressing the structure of its continuous nondominated frontier. Briefly, in this algorithm, she finds the range of values, \(\left[{w}_{L},{w}_{R}\right],\) that \(w\) can take based on past preferences of the RP. This range can be narrowed down by asking for pairwise comparisons between solutions. We keep asking for pairwise comparisons from the RP and set the mid-point, \({w}^{*},\) as the estimated weight once the range is sufficiently narrow.

Once we estimate \({w}^{*}\) and characterize the nondominated frontier, we next identify the objective function values of the trajectory between each target pair \(\left(i,j\right)\) that optimizes (13). Here, \({d}_{ij}\) and \({r}_{ij}\) are the first and second objective function values of the identified most preferred trajectory between targets \(\left(i,j\right),\) respectively.

$$U={w}^{*} {d}_{ij}+(1-{w}^{*}) {r}_{ij}$$
(13)

We next briefly explain the procedure developed in Türeci (2017) (and used in Dasdemir et al. 2020, in a similar context) that finds the objective function values of the most preferred trajectory between a target pair for an underlying linear preference function with known \(w.\)

We first check the type of each target pair \(\left(i,j\right)\in Q\). We then find the point \(({d}_{ij},{r}_{ij})\) on the corresponding nondominated frontier that optimizes (13) using Procedure BT.

Procedure BT

  • If target pair \(\left(i,j\right)\) is of Type 1, there is a single efficient trajectory with objective values \(({d}_{ij},{r}_{ij})\) connecting them.

  • If target pair \(\left(i,j\right)\) is of Type 2, we find the point \(({d}_{ij},{r}_{ij})\) minimizing (13) on Lq function (14).

    $${\left(1-\frac{{d}_{ij} - {z}_{\mathrm{MDP}}^{1}}{{z}_{\mathrm{MRP}}^{1}- {\mathrm{z}}_{\mathrm{MDP}}^{1}}\right)}^{q}+{\left(1-\frac{{r}_{ij} - {z}_{\mathrm{MRP}}^{2}}{{z}_{\mathrm{MDP}}^{2}- {z}_{\mathrm{MRP}}^{2}}\right)}^{q}=1$$
    (14)
  • If target pair \(\left(i,j\right)\) is of Type 3, we find the point \(({d}_{ij},{r}_{ij})\) minimizing (13) on Lq function (15) and on the straight line parts separately. We then find the best trajectory as the point that gives a smaller value for (13).

    $${\left(1-\frac{{d}_{ij} - {z}_{M\mathrm{DOP}}^{1}}{{z}_{\mathrm{MRP}}^{1}- {z}_{M\mathrm{DOP}}^{1}}\right)}^{q}+{\left(1-\frac{{r}_{ij} - {z}_{\mathrm{MRP}}^{2}}{{z}_{M\mathrm{DOP}}^{2}- {z}_{\mathrm{MRP}}^{2}}\right)}^{q}=1$$
    (15)

More details on this procedure can be found in Türeci (2017).

Step R.1.: finding the most preferred route and the next target to be visited

After finding the objective values of the efficient trajectory to be used between each target pair, \(({d}_{ij},{r}_{ij})\) for all \(\left(i,j\right)\in Q,\) we find the most preferred route of the RP. For this, we solve the single-objective version of GBOTSP (3)-(9) where \({H}_{ij}=1 \mathrm{and} \left({c}_{ij1}^{1},{c}_{ij1}^{2}\right)=({d}_{ij},{r}_{ij})\) for all \(\left(i,j\right)\in Q.\) As the objective function, we use (12), combining (3) and (4) with \({w}^{*}.\) The resulting tour, \(1-{O}_{1}-{O}_{2}-\dots -{O}_{\left|T\right|-1}-1\) is a symmetrical tour starting and terminating at the base, where \({O}_{i}\) indicates the target to be visited in the \({i}^{th}\) order. We have the option of going to targets \({O}_{1}\) or \({O}_{\left|T\right|-1}\) from the base. Of these, we choose the target that has the better first-connection value. That is, we choose the target to visit comparing the preference function values of the trajectories connecting 1 with \({O}_{1}\) and 1 with \({O}_{\left|T\right|-1}.\) The trajectory with the smaller preference value is selected, and the chosen target is denoted as Target \(\ell .\)

Step R.2.: find the new location of and the trajectory to follow to target \(\ell\)

In Step R.2, we observe the direction of movement of Target \(\ell\) and predict the point at which the UAV meets Target \(\ell\) given its travel speed. We then execute Algorithm NDF to form the nondominated frontier of the trajectories between Target 1 and the new location of Target \(\ell,\) and utilize Procedure BT to find the two objective function values of the most preferred trajectory on the characterized nondominated frontier. Until this point, we have worked in the objective function space. As a final step, knowing the approximate objective function values, (\({d}_{1\ell},{r}_{1\ell})\), of the most preferred trajectory, we employ the search-based algorithm of Tezcaner Öztürk and Köksalan (2023) to find where the UAV specifically enters the radar region, how it moves in the radar region, and where it leaves the radar region. This trajectory corresponds to the efficient trajectory that has the distance value \({d}_{1\ell}\) and the true radar detection threat value \({r}_{1\ell}^{^{\prime}}.\) In our experiments, \({r}_{1\ell}\) turned out to be very close to \({r}_{1\ell}^{^{\prime}}\). In the rest of the paper, we will refer to this procedure as structuring the trajectory of the UAV.

In practice the target may deviate from its original direction of movement. In this case, the UAV can make small adjustments to follow the target. Given the UAV operates at a much faster speed than the targets, we expect such adjustments to result in negligible changes in the overall route.

3.1.2 Using algorithm R-T in phase 2

When the UAV visits Target \(\ell\), we assume that all targets have changed their locations. We thus start executing Algorithm R-T for the new locations of the targets. The main difference in using Algorithm R-T in Phase 2 is in forming the route in Step R.1. We approximate fewer nondominated frontiers considering only unvisited targets and solve a GBOSHPP instead of a GBOTSP to form the route. We explain the steps of Phase 2 briefly, pointing out its similarities with Phase 1.

Step R.0.: finding the efficient trajectories between target pairs \(\left(i,j\right)\in Q\)

We characterize the nondominated frontiers between target pairs that have not been visited yet using Algorithm NDF and find the objective function values of the most preferred trajectory between each target pair using Procedure BT, as in Phase 1.

Step R.1.: finding the most preferred route and the next target to be visited

We solve the single-objective version of the GBOSHPP combining (3) and (4) using \({w}^{*}\) as in (12). The solution gives a path that starts from Target \(\ell,\) ends at Target 1, and visits all unvisited targets in \(M.\) We then update \(M\leftarrow M-\left\{\ell\right\},\) and let the next target to be visited be denoted as \(\ell .\)

Step R.2.: find the new location of and the trajectory to follow to target \(\ell\)

As in Phase 1, we observe the movement of Target \(\ell,\) and find the location where Target \(\ell\) and the UAV meets. We then use Algorithm NDF and Procedure BT to find the objective function values of RP’s best trajectory to \(\ell.\) Using the first objective function value of this trajectory, we find its structure (the trajectory’s entry point to and exit point from the radar region, and the circular path it follows in the radar region) as in Phase 1. When the UAV reaches \(\ell\) following this trajectory, we run Algorithm R-T to determine the new locations of the targets in set \(M.\)

We terminate the algorithm when there is only one target left, say Target \(k,\) to visit after the UAV visits Target \(\ell,\) before returning to the base (Target 1). We thus structure the remaining path as \(\ell-k-1,\) and find the trajectories to follow between Targets \(\ell\) and \(k,\) and \(k\) and 1.

We next give the detailed steps of using Algorithm R-T in Phase 2.

R-T Algorithm – Phase 2

Step P2.0. Approximate the nondominated frontiers between all target pairs \(\left(i,j\right),i,j\in M,\) using Algorithm NDF for the updated map. Find the objective function values of the most preferred trajectories between target pairs \(\left(i,j\right),i,j\in M,\) using Procedure BT.

Step P2.1. Find the most preferred route solving GBOSHPP using preference function (12)\(.\) Update \(M\leftarrow M-\left\{\ell \right\}.\) Set \(\ell\) as the next target in the route.

Step P2.2. Find the new location of Target \(\ell .\) and structure the trajectory to follow to Target \(\ell .\)

  • If \(\left|M\right|=2,\) where \(M=\{1,k\}\), structure the path to follow as \(\ell-k-1.\) Structure the trajectories to follow between Targets \(\ell\) and \(k\), and \(k\) and 1, and terminate the algorithm.

  • Otherwise, go to Step P2.0 when the UAV reaches Target \(\ell .\)

Note that we utilize the representations of the nondominated frontiers of the trajectories throughout our algorithm. For each target pair, we find several efficient trajectories and use these trajectories to characterize their nondominated frontiers. We then work on these characterized frontiers in the objective function space to find the most preferred route. After determining the next target in the last step, we structure the specific trajectory to that target. Postponing the structuring of the actual trajectories of the UAV to the last phase reduces the computational burden substantially.

Although the algorithm finds a solution quickly, there may still be benefits to further reduce the computation time since the routes need to be updated in real time. We next develop a heuristic approach that attempts to further reduce the solution time without sacrificing much from solution quality.

3.2 The k-closest heuristic

In Algorithm R-T, a considerable part of the computational effort is spent in step R.0. In this step, we find the objective function values of the most preferred trajectories between all target pairs. However, some target pairs are very distant and are not likely to be visited consecutively. Based on this observation, we may be able to skip some of the calculations in Step R.0 for distant target pairs without compromising solution quality. In order to implement this idea, we find the most preferred trajectory of each target considering its \(k\) closest targets only and use rough approximate trajectories for targets farther away. We next discuss the necessary modifications to Step R.0 of Algorithm R-T.

Let \(Q=\{\left(i,j\right):i,j\in M\}\) be the set of all target pairs to be evaluated.

Modifications to step R.0 in the k-closest heuristic

We break step R.0 into four substeps as follows:

Step R.0.1. For each target pair \(\left(i,j\right)\in Q,\) find \(\mathrm{MDS}\) and \(\mathrm{MRS}\) having objective function values \(({z}_{ij,\mathrm{MDS}}^{1},{z}_{ij,\mathrm{MDS}}^{2})\) and \(\left({z}_{ij,\mathrm{MRS}}^{1},{z}_{ij,\mathrm{MRS}}^{2}\right),\) respectively.

Step R.0.2. For each target pair \(\left(i,j\right)\in Q\), compute the preference function value of the preferred one of \(\mathrm{MDS}\) and \(\mathrm{MRS},\) \({U}_{ij}={\mathrm{min}}_{t=\mathrm{MDS},\mathrm{MRS}}\left\{{w}_{d}^{*} {z}_{ij,t}^{1}+{w}_{r}^{*} {z}_{ij,t}^{2}\right\},\) \({t}_{\mathrm{min}}={\mathrm{argmin}}_{t=\mathrm{MDS},\mathit{ }\mathrm{MRS}}\left\{{w}_{d}^{*} {z}_{ij,t}^{1}+{w}_{r}^{*} {z}_{ij,t}^{2}\right\},\) and \(\left({d}_{ij},{r}_{ij}\right)=\left({z}_{ij,{t}_{\mathrm{min}}}^{1},{z}_{ij,{t}_{\mathrm{min}}}^{2}\right).\)

Step R.0.3. For each \(i\in M,\) let \(j\) be the order statistics such that \({U}_{i(1)}\le {U}_{i\left(2\right)}\le \dots \le {U}_{i\left(\left|M\right|-1\right)}.\) Let \({Q}_{i}=\left\{\left(j\right):j=1,\dots ,k\right\}.\)

Step R.0.4. For target pairs \(\left(i,j\right),\) \(i\in M,j\in {Q}_{i},\) construct their nondominated frontiers using Algorithm NDF, and update the objective function values of their most preferred trajectories \(\left({d}_{ij},{r}_{ij}\right)\) using Procedure BT.

In Step R.0.1, we find the two extreme efficient solutions for all target pairs. We then assign the objective function values of the efficient solution that has a smaller preference function value as an approximation of the objective values, \(\left({d}_{ij},{r}_{ij}\right)\), of the best trajectory between Targets \(i\) and \(j,\) in Step R.0.2. For each Target \(i,\) its \(k\) neighbors with the smallest \({U}_{ij}\) values are identified and the nondominated frontiers between Target \(i\) and each of those \(k\) targets are characterized in Step R.0.3. The objective function values of the most preferred trajectories are selected in Step R.0.4. For targets that are not among the \(k\) closest targets, their most preferred trajectories are kept at their initially approximated values in Step R.0.2.

After finding the objective function values of the most preferred trajectories between target pairs, we continue with Steps R.1 and R.2 of Algorithm R-T as before.

Setting \(k\) to \(|T|-1\) corresponds to the original algorithm (i.e., finding the objective function values of the most preferred trajectories between all target pairs). As \(k\) gets smaller, there is a possibility that the solution quality deteriorates.

3.3 The 0-closest heuristic

This is the special case of the \(k\)-closest heuristic where \(k=0.\) In this case, we do not construct the nondominated frontiers between target pairs. Instead, we use the extreme efficient trajectory of each target pair that has the smaller preference function value as the most preferred trajectory between that target pair.

Strictly speaking, Algorithm \(NDF\) structures the trajectories for targets that are in the regions radars are ineffective. If targets end up in the effective regions of the radars small adjustments may be necessary to determine the trajectories. Under such circumstances, we can employ the 0-closest heuristic using \(\mathrm{MDS}\) and an approximation of the other extreme solution, \(\mathrm{MRS}\). We demonstrate an example in Fig. 3 where Target 1 falls in the effective region of a radar. Here, the light-colored (blue) trajectory that connects the two targets with the shortest path is the \(\mathrm{MDS}\). The dark-colored (red) trajectory is an approximation for \(\mathrm{MRS}\). In this trajectory, we first exit the radar region using the shortest path possible (using the straight-line path between Target 1 and point (\(e,f\))). We then reach Target 2 by traversing the trajectory with the shortest distance that does not enter the radar region. If the UAV visits these two targets consecutively, the trajectory (\(\mathrm{MDS}\) or \(\mathrm{MRS}\)) that gives a better preference function value can be used. We can handle other configurations in a similar manner. We are currently working on structuring further efficient trajectories when the trajectories are directly under the regions covered by the radars.

Fig. 3
figure 3

Extreme trajectories when a target enters the effective radar region

4 Results

Although there are some differences, UAVs are restricted in the ranges they could fly and this affects the number of targets a UAV could typically visit. In order to consider a range of realistic cases, we demonstrate our algorithms on three test problems with 5, 9, and 15 targets. Similar number of targets have been used in many applications (see Venkatachalam 2018, for routing one UAV to 10 to 15 targets on average, Dasdemir et al. 2020, for routing to 15 targets at most, and Santin et al. 2021, for routing one UAV to 6 to 20 targets on average). In our implementation, we use an average speed of 220 km/h for the UAV, and a constant speed of 36 km/h for the ground vehicles carrying the targets (see Jayaweera and Hanoun 2022), which is approximately one-sixth of the speed of the UAV. We let the ground vehicles to move in random directions. For the radar detection threat measure, we set \(C=2270, {\mathrm{LB}}_{S/N}=15,\) and \({\mathrm{UB}}_{S/N}=30.\)

While the UAV is traveling toward the next target, we assume that the target moves at a constant speed without changing its direction. Using the direction and speed information, we predict the point at which the UAV will reach the target. This is not a strong assumption as it encompasses a short time period during which the UAV moves from one target to the next. If the target changes its direction or speed, the UAV can make small adjustments to reach the target. Since the UAV moves at a much faster speed than the ground vehicle, we expect such small adjustments will not affect the selected route or the trajectory.

4.1 Five-target UAV route planning problem

In this case, we consider 5 targets (a base plus 4 targets) and 4 radar regions in a square terrain of 400 km2. The x mark shows the initial location of each target \(i\)=1, …, 5, and concentric circular regions represent the effective ranges of the radars in Fig. 4. The first target is the base where the UAV starts and terminates its route.

Fig. 4.
figure 4

5-Target UAV route planning problem

We run the algorithm in MATLAB R2017a and use optimization package CPLEX 12.9 to solve the mathematical models on Intel(R) Core(TM) i7-8550U CPU PC @ 1.80 GHz (8CPUs) with 16,384 MB RAM.

4.1.1 Illustration of the algorithm for \({\varvec{k}}=4,\boldsymbol{ }{{\varvec{w}}}^{\boldsymbol{*}}=0.2,\) and \({{\varvec{w}}}^{\boldsymbol{*}}=0.8\)

We illustrate the results of the real-time algorithm by setting \(k\) = 4 for two different weights, \({w}^{*}=0.2\) and \({w}^{*}=0.8,\) in Fig. 5. The UAV is at the base (Target 1) at time 0. \({w}^{*}\) is the relative importance the RP assigns to the distance-traveled (D) objective, and \(1-{w}^{*}\) to the radar-detection-threat (RDT) objective. The RP gives a higher relative importance to the D-objective when \({w}^{*}=0.8\), and to the RDT-objective when \({w}^{*}=0.2.\)

Fig. 5
figure 5

Illustration of the algorithm, \(k=4\), \({w}^{*}=0.2\) (the left column) and \({w}^{*}=0.8\) (the right column)

Both cases start with the same visiting order to the targets in Fig. 5a and a’, but the trajectories followed between consecutive target pairs are different. In both cases, the initial tour, that is 1-4-3-5-2-1, is changed to 1-4-3-2-5-1 after the UAV visits Target 4. After visiting Target 3, the change in the locations of the unvisited targets does not result in a change in the visiting order. The decision of visiting Target 2 after Target 3 leaves no other path alternatives than the path 2-5-1 to reach Target 1. In general, for \({w}^{*}=0.2,\) the UAV avoids radar regions at the expense of longer D, whereas for \({w}^{*}=0.8\) shorter trajectories are used taking the risk of larger RDT values, as would be expected.

4.1.2 Results of example problems

We next give the results of the algorithm for \(k=0, 2, 3,\) and 4 for \(w\) values: 0.2, 0.4, 0.6, and 0.8. The results are summarized in Table 1. For all \(w\) values, we obtain the same results when \(k=0, 2\), 3, and \(4,\) therefore we report their results in a single column. This is an important observation implying that the extreme solutions \(\mathrm{MDS}\) and \(\mathrm{MRS}\) are good indicators of the relative performances of the efficient frontiers of trajectories between a target and its different neighbors. We will further elaborate on this later.

Table 1 Results of 5-target problem

In columns 2 to 4 of Table 1, we report the results for the case the UAV follows the initial route obtained in Phase 1, without any updating due to moving targets. We refer this case as IR (initial route). In columns 5 to 7, we give the results of Algorithm R-T. If the two algorithms result in the same visiting order to the targets, the trajectories to follow between consecutive target pairs will be found and structured using the same algorithms (Procedure BT followed with the search-based algorithm) and the resulting routes will be identical. Only if the visiting order changes, the resulting route changes. We indeed obtain results in accordance with the latter, for all \(w\) values. Although the visiting order is the same for the first two targets (1-4-3) in all tours, at the third iteration, the real-time algorithm changes the initial route. We plot the two objective values of the solutions obtained by both Algorithm R-T and IR, together with the resulting tour for each \(w\) value in Fig. 6. It can be seen that, for all \(w\) values, the routes of Algorithm R-T strictly dominate the routes obtained by IR (has better values in both objectives). As we increase the relative importance of D (higher \(w\) values), in general, the total D value decreases and the total RDT value increases.

Fig. 6
figure 6

Results of IR and algorithm R-T for the 5-target problem

We report the average solution times of four \(w\) values for \(k=0, 2, 3,\) and 4, after visiting each target in Table 2. The solution times generally decrease after each visit (as expected since the remaining problem is smaller in size). Also as expected, the solution times increase as the value of \(k\) increases, implying that the heuristic substantially reduces the computational burden.

Table 2 CPU times of algorithm R-T (seconds)—5-target problem

4.2 Nine-target UAV route planning problem

We use the setting of Türeci (2017) with a base, eight targets, and four radar regions, slightly modifying the locations of the targets to demonstrate our approach better. We solve the problem for \(k=0, 2, 3,\) and 8. When \(k=8\), we characterize at most 36 nondominated frontiers in the first iteration of the algorithm. This value decreases to 27 and 18, for \(k=3\) and \(k=2\), respectively. Similar to the 5-target case, the results for \(k=\mathrm{0,2},3,\) and 8 are the same. For only \(w=0.8\), the IR and Algorithm R-T find identical routes, with the same visiting order to the targets, and the same trajectories used between target pairs. For \(w=0.2, 0.4,\) and \(0.6,\) updating the route progressively improves the objective values of the resulting tour and the results with Algorithm R-T strictly dominate those with IR (see Fig. 7). The average solution times of four \(w\) values for \(k=0, 2, 3,\) and 8 can be seen in Table 3. Again, the heuristic improves the computational burden substantially as \(k\) value gets smaller.

Fig. 7
figure 7

Results of IR and algorithm R-T for the 9-target problem

Table 3 CPU times of algorithm R-T (seconds)—9-target problem

4.3 15-Target UAV route planning problem

Next problem we solve has one base, 14 targets, and 4 radar regions. We solve the problem for \(k=0, 2, 3,\) and 14, where we characterize at most 0, 30, 45, and 105 nondominated frontiers, respectively. All k values again resulted in the same tours. For all \(w\) values, the solutions we obtain with Algorithm R-T have better objective values (strictly dominate) than those with IR (see Fig. 8). The solution times for \(k=0, 2, 3,\) and 14 given in Table 4 again show that the heuristic reduces the solution times substantially.

Fig. 8
figure 8

Results of IR and algorithm R-T for the 15-target problem

Table 4 CPU times of algorithm R-T (seconds)—15-target problem

For all \(T\)-target problems, we obtain the same results when \(k\) is set to 0, 2, 3, or \(T-1\). Although it might be possible to create a problem setting where this changes, we did not observe such a case in our further experiments, changing the locations of targets and radars. This is an important observation that the nondominated frontiers of target pairs have been substantially different from each other in terms of the magnitudes of the objectives in all our experiments so that even their extreme points are representative enough to develop the path to follow. Based on this observation, it may be possible to use the fastest version of the algorithm, especially when there are sizeable differences in the extreme points of the objective values of trajectories of different target pairs.

5 Conclusions

In this paper, we consider the real-time routing problem of a UAV in a two-dimensional dynamic environment where the locations of the targets change in time. The UAV, starting from a base, visits a set of targets and returns to the base. We consider two objectives to be minimized: distance traveled and radar detection threat. The UAV moves in continuous space to visit multiple targets that change their locations to destinations unknown to us. To the best of our knowledge, this is the only study that considers such a dynamic environment in a continuous space to find and keep updating the best route of a RP under the two objectives.

We develop a real-time algorithm to find the most preferred route for a RP having an underlying linear preference function. The algorithm starts with an initial route and updates the route every time it reaches a new target, if necessary. The characterizations of the nondominated frontiers of the trajectories allow us to work in the objective function space throughout the algorithm, without the need to find the exact trajectories that require determining their entrance and exit points to and from the radar region, and the path they use within the radar region, as explained in Sect. 2.3. Only after determining the target to be visited next, Target \(\ell,\) we find the exact trajectory between the current location of the UAV and Target \(\ell .\) Although this keeps the computational burden low, we also develop a \(k\)-closest heuristic to further reduce the computational burden, keeping in mind that the algorithm needs to be solved in real time. In the heuristic approach, we make a detailed analysis to find the objective function values of the most preferred trajectory between relatively “close” targets and approximate the most preferred trajectories between the remaining pairs of targets with trajectories that are easier to compute. We demonstrate our algorithm on three test problems with 5, 9, and 15 targets. The algorithm finds identical tours for \(k=0, 2, 3,\) and \(\left|T\right|-1,\) for all problem sizes. Updating the route at each visit to the targets, rather than using the initial route, improves the objective function values in general. The results also show that we reduce the computation times considerably using the k-closest heuristic without sacrificing from the quality of solutions.

Although we allow the targets to change their locations keeping the radars static, our algorithm is applicable to dynamic radar locations as well. To implement the algorithm for changing radar locations, we need to characterize the nondominated frontiers between target pairs for the new locations of the targets or radars at each execution of Phase 2.

Although UAV route planning has been studied well in recent years, the literature on real-time UAV route planning with multiple objectives is rather scarce. There are many possible extensions of this study. Our approach can be extended considering a three-dimensional continuous space. In this case, the altitude of the UAV and the ground structure should also be considered. Additionally, the computations of the two objectives, total distance and total radar detection threat, must be revised. Currently, we consider three different placement configurations of targets relative to the radars. If the targets reside in the effective regions, we may employ the procedure suggested in Sect. 3, and find a route using 0-closest heuristic. We are working on further enriching the efficient trajectories for targets lying directly in the effective radar regions.

If all movements of all targets during the mission of the UAV were known to the RP in advance (which is a strong assumption), finding efficient routes would still not be straightforward. In this case, the distances between all targets would change depending on the time interval at which the efficient trajectory between them is traversed. The decisions of this problem would then include the visiting order to the targets, the trajectories to use between targets, as well as the distance between targets. This is a complex research question due to the combinatorial nature of all possible movements and could be a topic for future research. Another possible extension could be routing multiple UAVs to multiple targets in a dynamic environment. This would require making additional decisions of assigning UAVs to subsets of the targets to be visited. Considering additional dynamic components such as pop-up targets and radars is another interesting future study.