1 Introduction

Monitoring a moving target in dynamic environments is a challenging problem with many practical applications such as studying and following marine animals [27], search and rescue [38], convoy protection [13], or tracking the spread of wildfires [5]. A collection of multiple agents can be demonstrably more powerful than a single unit of the collective [15], however, this introduces additional complexities such as scheduling and avoiding collisions. This paper addresses this problem by proposing a framework for generating optimal trajectories for multiple autonomous agents to monitor a target.

The problem of tracking a target by autonomous systems has received a great deal of attention in the research community, see [29, 30] and citations therein. Methods for tracking a target with a known position include cell decomposition methods [4], path planning-based approach [24], Lyapunov-based nonlinear control [25], oscillatory control [21], differential games [14], and dynamic programming-based methods [22]. Tracking a target with unknown motion and dynamics properties requires an approach that takes the uncertainty into account. In [1] the target’s motion is assumed to be described by a 2D stochastic process which is represented as 2D Brownian motion and the Markov chain approximation method is employed for numerically determining the optimal control policy. The work presented in [28] seeks to continuously keep a stochastic target in the view of a gimbaled camera while avoiding blind spots, like those produced by landing. A cutting-plane approach is presented in [34], which enables multiple vehicles to seek and detect one or more uncertain targets. In [6] a temporal and spatial discretization method is proposed to formulate the optimal target tracking problem as a nonlinear programming problem subject to turn-rate constraints of the monitoring agents. In [26] a monitoring agent is equipped with a camera on a gimbal allowing the agent to monitor the target from several different line-of-sight angles.

When dealing with multi-vehicle tracking missions in a real-world environment, control problems that arise from this application are very complex since they must take into account vehicle’s dynamic constraints, inter-vehicle safety, multiple mission objectives and environmental uncertainties. These lead to optimal control problems with cost function and constraints that can be highly nonlinear, non-smooth, and non-convex, and closed-loop solutions cannot be sought. Thus, solutions to these problems must be found using numerical methods such as direct methods [31]. Direct methods are based on approximating optimal control problems as nonlinear programming problems (NLPs) using some discretization scheme [2, 3, 12, 31]. Then, these NLPs can be solved using off-the-shelf optimization solvers (e.g., MATLAB, SNOPT, etc.). A wide range of direct methods that use different discretization schemes have been developed, which include Euler [16], Runge–Kutta [35, 36], Pseudospectral [33], as well as methods based on Bernstein approximants [9, 10].

In the context of the existing literature on multi-vehicle target tracking, this work differs from other approaches in a fundamental way. Rather than dealing with simplified problems and designing closed-loop (reactive) controllers, we formulate the multi-vehicle target tracking problem as a general nonlinear optimal planning problem and find a solution using a direct method. This approach allows us to consider the cooperative target tracking problem in its full complexity, and rigorously ensure satisfaction of dynamics and safety constraints and mission objectives. Furthermore, the method enhances human operators’ trust through transparent and predictable operations, since the solutions are generated open loop, in one shot, for long temporal windows.

We employ the direct method based on Bernstein approximants that was initially proposed in [8]. Bernstein polynomials (BPs) were originally introduced by Sergei Natanovich Bernstein (1880–1968) as a means to develop a constructive proof of the Weierstrass theorem. Due to their slow convergence as function approximants, Bernstein polynomials were not widely used until two automotive engineers from competing companies used them as a parametric representation for automotive bodies. Today, Bernstein polynomials (sometimes referred to as Bézier curves) are frequently used in graphics software and font representations. Bernstein polynomials enjoy the benefit of low computational requirements since many of their properties can be discerned from their coefficients. For example, a Bernstein polynomial is guaranteed to reside in the convex hull defined by its coefficients, leading to safe estimates of its bounds at a low computational cost, allowing generation of trajectories in near real-time fashion. Here we exploit the properties and algorithms of Bernstein polynomials to address the problem of target tracking. This approach allows us to consider the complex scenario where the vehicles must coordinate in order to ensure continuous monitoring of a partially unknown moving target by gimbeled cameras mounted on-board the UAVs. To ensure robustness to a target’s motion uncertainties, we propose an event-triggered approach to re-plan trajectories using up-to-date measurements. For more in depth discussion on the benefits of Bernstein polynomial-based approaches as compared to other direct methods for trajectory generation, we refer to [8, 11, 17, 19].

Our proposed method comprises three main parts: sequential planning, approximation via BPs, and target prediction re-planning. Sequential planning allows each agent to plan their own trajectory which offers a lower computational expense and does not require a centralized processor. BPs allow the optimal target tracking problem, an infinite dimensional optimal control problem, to be formulated as a finite dimensional nonlinear optimization problem problem. The latter can be solved using off-the-shelf solvers, e.g., fmincon, SciPy, SNOPT, and IPOPT. The target prediction re-planning allows trajectories to be re-planned when the expected future position of the target varies further than a predefined distance away from the original prediction used to generate the initial optimal trajectory. A preliminary version of this problem was approached in [37].

The paper begins with the problem formulation of the trajectory generation for target tracking in Sect. 2. Section 3 presents the polynomial and sequential approximations to the problem at hand and describes both the sequential approach and the Bernstein approximation method adopted. Section 4 presents numerical results of simulated missions and Sect. 5 draws conclusions.

2 Trajectory Generation for Target Tracking

In what follows, vectors are denoted by bold letters, e.g. \(\mathbf {p} =[p_x \, , \, p_y]^\top \). \(||\cdot ||\) denotes the Euclidean norm (or magnitude), e.g., \(||\mathbf {p}|| = \sqrt{p_x^2 + p_y^2}\), and \(\angle \cdot \) denotes the direction angle with respect to the x axis, e.g., \(\angle \mathbf {p} = \tan ^{-1} \frac{p_y}{p_x}\).

The objective of this work is to maintain constant camera coverage of a ground target using multiple unmanned aerial vehicles (UAVs) traveling at a constant altitude. The UAVs are assumed to be of the fixed wing variety which allows for efficient power usage. This means that the UAVs are both nonholonomic and lack the ability to hover. The camera is assumed to be fixed in order to reduce both the cost and mechanical complexity of the UAVs; therefore, the target will only lie in the camera footprint when the UAV is facing towards the target.

Consider an example monitoring mission involving 2 UAVs which takes place over the time interval [0, T], \(T\gg 0\), and let the target’s position be denoted as \(\varvec{p}_\tau \) (depicted as a red star in Fig. 1). For the safety of the monitoring vehicles, a no-fly-zone with radius \(R_T\) is imposed around the target (depicted as a red circle in Fig. 1). Similarly, an outer monitoring radius \(R_O\) represents the maximum distance at which the monitoring vehicles begin their monitoring trajectories (depicted as a black circle in Fig. 1). The value for \(R_O\) is user defined and can depend on several considerations, including the cameras being used, the number of vehicles involved in the mission, or the target’s dynamics. The initial monitoring points \(I_i\) lie on the outer monitoring circle and are randomly chosen from a uniform distribution to mitigate the target predicting future positions of the UAVs. At time \(t_1\), \(\text {UAV}_1\) must reach the point \(I_1\) while also facing directly towards the target. Between times \(t_1\) and \(t_2\), \(\text {UAV}_1\) will follow the monitoring trajectory depicted as a blue dotted line between points \(I_1\) and \(F_1\). At time \(t_2\), \(\text {UAV}_1\) will reach final point \(F_1\) and simultaneously \(\text {UAV}_2\) will reach its initial point \(I_2\). Between times \(t_2\) and \(t_3\), \(\text {UAV}_1\) will follow a trajectory (depicted as a greed dotted line in Fig. 1) to its new initial point \(I_3\) while \(\text {UAV}_2\) will follow its monitoring trajectory between points \(I_2\) and \(F_2\). At time \(t_3\), \(\text {UAV}_1\) will reach its initial monitoring point \(I_3\) and \(\text {UAV}_2\) will reach the end of its monitoring trajectory at point \(F_2\). From there, the process begins anew and repeats until some final time T. The monitoring vehicles are given a fixed time interval \(T_\varDelta \) to complete their trajectories, e.g., \(t_2 = t_1 + T_\varDelta \). The fixed time interval \(T_\varDelta \) is a constant value which depends on the length of the monitoring trajectory and the monitoring speed of the vehicle, i.e., \(T_\varDelta = \frac{\text {monitoring trajectory length}}{\text {monitoring speed}}\).

Fig. 1
figure 1

Graphical depiction of the problem

This monitoring scenario can be extended to an arbitrary number of UAVs, where \(\text {UAV}_i\) begins its monitoring trajectory at time \(t_i = t_{i-1} + T_\varDelta \), when the previous agent, namely \(\text {UAV}_{i-1}\), completes its monitoring maneuver. To minimize agents predictability, the points \(I_i\) are selected randomly on the outer circle. Furthermore, since the agents are assumed to be equipped with non-gimballed sensors, the heading angle of the agents at \(I_i\) must be equal to the line-of-sight angle to the target. Similarly, in order to guarantee monitoring, the difference between the UAVs’ heading angle and the line-of-sight angle to the target must be minimized throughout the execution of the monitoring maneuvers. The problem of trajectory generation for multi-vehicle continuous target monitoring missions can be formulated as a general optimal control problem, i.e.,

Consider a mission involving m UAVs. Find m trajectories to be tracked by the UAVs over the time interval \([t_0 , t_f]\), that minimize a cost function, \(J(\cdot )\), subject to boundary conditions, feasible flight constraints, and mission specific constraints.

Notice that the proposed formulation accounts for re-planning during mission execution in order to react to unpredicted events (i.e. external obstacles, target’s unknown dynamics, etc.). In fact, the above problem is defined over the time interval \([t_0, t_f]\), where the initial time \(t_0\) is the time at which planning occurs (different from the initial time of the overall monitoring mission, \(t=0\)).

In order to formulate the trajectory generation problem mathematically, let the trajectory assigned to the ith vehicle be defined as \(\mathbf {p}^{[i]}: \mathbb {R}^+ \rightarrow \mathbb {R}^2\). Moreover, let the position of the target be defined as \(\mathbf {p}_\mathcal {T}: \mathbb {R}^+ \times \varTheta \rightarrow \mathbb {R}^2\). The target’s position is supposed to be conditionally deterministic, i.e. there exists a stochastic variable \(\varvec{\theta } = [\theta _1 , \theta _2]^\top \in \varTheta \subset \mathbb {R}^2\) such that the target’s position conditioned on \(\varvec{\theta }\) at time t is given by \(\mathbf {p}_\mathcal {T}(t,\varvec{\theta })\), with dynamics governed by \({\dot{\mathbf{p}}}_\mathcal {T}(t, \varvec{\theta }) = [ \theta _1 \cos (\theta _2), \, \theta _1 \sin (\theta _2) ]^\top .\) The variables \(\theta _1\) and \(\theta _2\) represent the stochastic speed and heading angle of the target, respectively.

We assume that the probability density of \(\varvec{\theta }\) over \(\varTheta \) is known and is denoted by \(p_\theta : \varTheta \rightarrow \mathbb {R}\). Then, assuming that the position of the target at \(t_0\) is known, the expected position of the target at \(t>t_0\) is given by

$$\begin{aligned} \mathbb {E}[ \mathbf {p}_\mathcal {T}(t, \varvec{\theta }) ] = \int _{\varTheta } \left( \int _{t_0}^{t} {\dot{\mathbf{p}}}_\mathcal {T}(\tau , \varvec{\theta }) \hbox {d}\tau \right) p_\theta (\varvec{\theta }) \hbox {d} \varvec{\theta }. \end{aligned}$$
(1)

For notation simplicity, let us define \( E_{\mathbf {p}_\mathcal {T}}(t_0,t) \equiv \mathbb {E}[ \mathbf {p}_\mathcal {T}(t, \varvec{\theta }) ] \), in order to highlight the dependency of the expected position of the target on the initial position (measured at \(t_0\)). Moreover, we denote the expected line-of-sight angle to the target at time t by \( E_{\lambda ^{[i]}}(t_0,t) \equiv \angle \left( E_{p_{\mathcal {T}}}(t_0,t) - \mathbf {p}^{[i]}(t)\right) \, . \) The target’s position (and, therefore, the line-of-sight angle) can be measured on-line, and the planning algorithm is triggered if the actual position of the target deviates from its expected value, computed previously when planning occurred.

Let

$$\begin{aligned} \mathcal {I}^{[i]} = \left\{ t^{[i]}_1,t^{[i]}_2, \ldots , t^{[i]}_{n_m} \right\} , n_m \ge 1, \quad t_0 \le t^{[i]}_1 < t^{[i]}_{n_m} + T_\varDelta \le T \end{aligned}$$
(2)

be the set of time points at which \(\text {UAV}_i\) is required to start a monitoring maneuver, where \(T_\varDelta > 0\) is the pre-specified monitoring time. In order to maximize the probability that, at time \(t^{[i]}_j\), \(\text {UAV}_i\) is positioned along a circle of radius \(R_o\) centered around the target, the cost function

$$\begin{aligned} \begin{aligned} J_{I_j} (\mathbf {p}^{[i]}(t))&= \left\| \mathbf {p}^{[i]}\left( t^{[i]}_j\right) - E_{\mathbf {p}_\mathcal {T}}\left( t_0,t^{[i]}_j\right) + \mathbf {R}_o \right\| ^2 \end{aligned} \end{aligned}$$
(3)

must be minimized, where \(\mathbf {R}_o = R_o [\cos (\alpha _r), \sin (\alpha _r)]^\top \), and \(\alpha _r \in [0,2\pi ]\) is a uniformly random angle generated before the flight trajectory is planned, in order to minimize agents’ predictability. Finally, one additional objective is to maximize the probability that, for the interval of time \(t \in [t^{[i]}_j,t^{[i]}_j + T_{\varDelta }]\), the heading angle of \(\text {UAV}_i\), denoted by \(\psi ^{[i]}(t)\), is equal to the line-of-sight angle to the target. This is captured by the following cost:

$$\begin{aligned} \begin{aligned}&J_{F_j} (\mathbf {p}^{[i]}(t)) = \int _{t^{[i]}_j}^{t^{[i]}_j+T_\varDelta } \left\| E_{\lambda ^{[i]}}(t_0,\tau ) - \psi ^{[i]}(\tau ) \right\| ^2 \hbox {d}\tau . \end{aligned} \end{aligned}$$
(4)

With this setup, the cost function to be minimized is

$$\begin{aligned} J (\mathbf {p}_1(t),\ldots ,\mathbf {p}_m(t)) = w_1 J_{F_j} (\mathbf {p}^{[i]}(t)) + w_2 \sum _{i=1}^m \sum _{j=1}^{n_m} J_{I_j} (\mathbf {p}^{[i]}(t)), \end{aligned}$$
(5)

where \(w_1,w_2>0\) are weights.

Boundary conditions on the trajectories can be imposed, such as initial position, speed, heading angle, and angular rate constraints:

$$\begin{aligned} \begin{aligned}&\mathbf {p}^{[i]}({t_0}) = \mathbf {p}_{0}^{[i]}, \quad ||\dot{\mathbf{p}}^{[i]}(t_0)|| = v_{0}^{[i]}, \quad \varvec{\psi }^{[i]}(t_0) = \varvec{\psi }_{0}^{[i]}, \quad \dot{\varvec{\psi }}^{[i]}(t_0) = \varvec{\omega }_{0}^{[i]} \end{aligned} \end{aligned}$$
(6)

\(\forall i=1 , \ldots , m\), where \(\mathbf {p}_{0}^{[i]}\), \(v_{0}^{[i]}\), \(\varvec{\psi }_{0}^{[i]}\), and \(\varvec{\omega }_{0}^{[i]}\) are the actual position, speed, heading angle, and angular rate of \(\text {UAV}_i\) at time \(t_0\).

Feasible and flyable trajectories are trajectories that can be tracked by the UAVs without having them exceed pre-specified bounds on the vehicles’ states and control inputs and that ensure collision avoidance. The feasibility and flyability constraints considered are minimum and maximum speed, maximum angular rate, and inter-vehicle safety:

$$\begin{aligned}&v_{\min } \le ||\dot{\mathbf{p}}^{[i]}(t)|| \le v_{\max }, \end{aligned}$$
(7)
$$\begin{aligned}&||{{\dot{\varvec{\psi }}}}^{[i]}(t)|| \le \omega _{\max }, \end{aligned}$$
(8)
$$\begin{aligned}&||\mathbf {p}^{[j]}(t) - \mathbf {p}^{[i]}(t)|| \ge D_s , \end{aligned}$$
(9)

\(\forall t \in [t_0,t_f]\) and \(\forall i,j = 1, \ldots ,m\), \(i \ne j\), where \(v_{\max }> v_{\min } > 0\), \(\omega _{\max } > 0\), and \(D_s >0\).

Finally, in order to prevent detection by the target, we impose no-fly-zone constraints, namely the trajectory cannot enter a circle of radius \(R_T\) around the target, which can be expressed as follows:

$$\begin{aligned} ||\mathbf {p}^{[i]}(t) - E_{\mathbf {p}_\mathcal {T}}(t_0,t)|| \ge R_{T}, \quad \forall t \in [t_0,t_f], \quad \forall i = 1, \ldots , m. \end{aligned}$$
(10)

With the equations defined above, we now formulate the trajectory generation problem.

Problem 2.1

(Trajectory generation) Given the prespecified parameters \(R_0\), T, and \(T_\varDelta \) find m trajectories for the UAVs to track over the time interval \([t_0, t_f]\) that minimize the cost function \(J (\mathbf {p}_1(t),\ldots ,\mathbf {p}_m(t))\) (Eq. 5) subject to boundary conditions (Eq. 6), feasible flight constraints (Eqs. 7, 8, and 9), and mission specific constraints (Eq. 10).

Remark 2.1

Problem 2.1 is concerned with the generation of desired trajectories to be tracked by the UAVs. We consider UAVs that are equipped with controllers that stabilize the vehicles’ dynamics and provide trajectory tracking capabilities (for example, see [7, 18, 32]). Notice that these capabilities are provided by most commercially available autonomous vehicle platforms. For simplicity, in this paper we assume that the on-board controllers provide ideal tracking performance, i.e. the UAVs’ position and attitude coincide with the desired ones for all times. Nevertheless, the present method can be easily extended to the more realistic case on non-ideal tracking performance by properly including additional constraints in Problem 2.1.

3 Polynomial and Sequential Problem Approximation

This section is divided into two parts. First, we present our approach to separate the overall trajectory generation problem into different sub-problems that can be solved sequentially in a decentralized fashion, in order to handle scalability issues and the problem’s uncertainties. Second, we introduce a method based on Bernstein polynomials that approximates the cost function and constraints at hand, in order to transcribe the trajectory generation problems into nonlinear programming (NLP) problems.

3.1 Sequential Planning

The trajectory generation problem formulated in Sect. 2 minimizes the cost function for all vehicles simultaneously over the mission duration [0, T]. Consequently, scalability issues arise when the time interval and the number of vehicles involved in the mission become large. To address this issue, we propose a sequential approach, wherein the trajectories for each vehicle are computed independently from the other vehicles’ trajectories. Moreover, each UAV trajectory is divided into two components, each of which is computed separately. These two components are the trajectory from the current UAV position to the outer-circle centered around the target and the monitoring trajectory.

Recall that Eq. (2) is the set of time instants at which \(\text {UAV}_i\) is required to start a monitoring maneuver. Then, \(\mathcal {F}^{[i]} = \{t^{[i]}_j + T_\varDelta \}_{j=1}^{n_m}\) is the set of time points at which the same UAV is required to end its monitoring maneuver. Given a monitoring mission involving m UAVs, each UAV is tasked with solving \(n_m\) flight trajectory generation problems, and \(n_m\) monitoring trajectory generation problems. The jth flight trajectory generation problem for \(\text {UAV}_i\) is stated as follows:

Problem 3.1

(Flight trajectory generation) At given time \(t_0 < t^{[i]}_j\), find \(\mathbf {p}^{[i]}(t)\), \(t \in [t_0,t_f]\), \(t_f = t^{[i]}_j\) that minimizes the following:

$$\begin{aligned} \begin{aligned}&\left\| \mathbf {p}^{[i]}(t_f) - E_{\mathbf {p}_\mathcal {T}}(t_0,t_f) + \mathbf {R}_{i, p} \right\| ^2 + \left\| E_{\lambda ^{[i]}}(t_0,t_f) - \psi ^{[i]}(t_f) \right\| ^2, \end{aligned} \end{aligned}$$
(11)

subject to the boundary conditions described in Eqs. (6)–(10).

The jth monitoring trajectory generation problem for \(\text {UAV}_i\) is formulated as follows:

Problem 3.2

(Monitoring trajectory generation) At given time \(t_0 \ge t^{[i]}_j\), find \(\mathbf {p}^{[i]}(t)\), \(t \in [t_0,t_f]\), \(t_f=t^{[i]}_j+T_\varDelta \) that minimizes

$$\begin{aligned} \int _{t_0}^{t_f} \left\| E_{\lambda ^{[i]}}(t_0,\tau ) - \psi ^{[i]}(\tau ) \right\| ^2 \hbox {d}\tau \end{aligned}$$
(12)

subject to Eqs. (6)–(10).

Under nominal conditions, the initial time, \(t_0\), of the monitoring trajectory is equal to the final time, \(t_f\), of the previous flight trajectory. Similarly, the initial time of a flight trajectory corresponds to the final time of the previous monitoring trajectory. However, in the presence of unpredicted events, such as unexpected target detours, the (flight or monitoring) trajectory planner can be triggered, with the initial time set as the current time. In this work, re-planning is triggered when the expected value of the future position of the target deviates more than a certain distance from the original expected value. In particular, recall that \(E_{\mathbf {p}_\mathcal {T}}(t_{0},t_f)\) is the expected position of the target computed when planning occurred, while the expected position computed at current time t is \(E_{\mathbf {p}_\mathcal {T}}(t,t_f)\). If at time t the distance between the two values is greater than a pre-defined constant \(\epsilon \), then re-planning is triggered.

3.2 Bernstein Approximation

In what follows, we present a method based on the use of BPs to transcribe Problems 3.1 and 3.2 into discrete, finite-dimensional optimization problems. For additional details on the transcription method, the reader is referred to [9]. Let the position of the ith UAV involved in the monitoring mission be defined over an arbitrary interval \([t_0,t_f]\) by a 2-dimensional nth order BP, i.e.,

$$\begin{aligned} \mathbf {p}^{[i]}(t) \triangleq \mathbf {p}_{n}^{[i]}(t) = \sum _{k=0}^n {{\bar{\mathbf{p}}}}_{k,n}^{[i]} b_{k, n}(t), \end{aligned}$$
(13)

where \({{\bar{\mathbf{p}}}}_{k,n}^{[i]} \in \mathbb {R}^{2}\) is kth Bernstein coefficients, and \(b_{k, n}(t)\) is the Bernstein basis (see Eq. (24) in the Appendix). The derivative of \(\mathbf {p}^{[i]}(t)\), i.e. \(\dot{\mathbf{p}}^{[i]}(t)\), is a 2-dimensional nth order Bernstein polynomial computed as

$$\begin{aligned} \dot{\mathbf{p}}^{[i]}(t) = \sum _{\ell =0}^n {\left( \sum _{k=0}^n {{\bar{\mathbf{p}}}}_{k,n}^{[i]}\mathbf {D}_{k,\ell } \right) } b_{\ell , n}(t) = \sum _{\ell =0}^n {{\bar{\mathbf{p}}}}_{\ell ,n}^{[i] \prime } b_{\ell , n}(t),, \end{aligned}$$
(14)

where \(\mathbf {D}_{k,\ell }\) is the the \((k,\ell )\)th entry of the \((n+1)\) by \((n+1)\) differentiation matrix \(\mathbf {D}_n = \mathbf {D}_{n-1} \mathbf {E}_{n-1}^n \) that is obtained by combining the differentiation and degree elevation properties of BPs (see Properties A.3 and A.4). Note that higher order derivatives of BPs can be computed similarly. The speed, \(v^{[i]}\), and the heading angle, \(\psi ^{[i]}(t)\), are

$$\begin{aligned} v^{[i]}(t) = ||\dot{\mathbf{p}}^{[i]}(t)|| , \quad \psi ^{[i]}(t) = \angle \dot{\mathbf{p}}^{[i]}(t) , \end{aligned}$$
(15)

and the angular rate, \(\omega ^{[i]}(t)\), is given by

$$\begin{aligned} \omega ^{[i]}(t) = {\frac{\dot{\mathbf {p}}^{[i]}(t) \times \ddot{\mathbf {p}}^{[i]}(t)}{\left\| \dot{\mathbf {p}}^{[i]}(t) \right\| ^2}} = \sum _{j=0}^n {{\bar{\omega }}}_{j,n}^{[i]} b_{j,n} . \end{aligned}$$
(16)

The Bernstein coefficients \({{\bar{\omega }}}_{j,n}^{[i]}\), \(j=0,\ldots ,n\), can be easily computed from the Bernstein coefficients of \(\dot{\mathbf {p}}^{[i]}(t)\) and \(\ddot{\mathbf {p}}^{[i]}(t)\) by using the arithmetic properties of BPs (Property A.5).

Using BPs, the cost function in Eq. (11) can be approximated as follows:

$$\begin{aligned} \begin{aligned}&\left\| {{\bar{\mathbf{p}}}}^{[i]}_{n, n} - E_{\mathbf {p}_\mathcal {T}}(t_0,t_f) + \mathbf {R}_{i, p} \right\| ^2 + \left\| E_{\lambda ^{[i]}}(t_0,t_f) - \angle {{\bar{\mathbf{p}}}}_{n,n}^{[i] \prime } \right\| ^2, \end{aligned} \end{aligned}$$
(17)

with \( E_{\lambda ^{[i]}}(t_0,t_f) = \angle \left( E_{p_{\mathcal {T}}}(t_0,t_f) - \mathbf {p}^{[i]}_{n,n} \right) , \) where we used the end point value property of BPs (Property A.1). Moreover, we notice that the cost function in Eq. (12) can be rewritten as

$$\begin{aligned} \int _{t_0}^{t_f} \left\| \angle \left( E_{p_{\mathcal {T}}}(t_0,\tau ) - \mathbf {p}^{[i]}(\tau ) \right) - \angle \dot{\mathbf {p}}^{[i]}(\tau ) \right\| ^2 \hbox {d}\tau . \end{aligned}$$

Minimization of the above cost function is equivalent to minimizing

$$\begin{aligned} \int _{t_0}^{t_f}\eta ^{[i]}(\tau ) d\tau \triangleq \int _{t_0}^{t_f} \left\| \left( E_{p_{\mathcal {T}}}(t_0,\tau ) - \mathbf {p}^{[i]}(\tau ) \right) \times \dot{\mathbf {p}}^{[i]}(\tau ) \right\| ^2 \hbox {d}\tau . \end{aligned}$$
(18)

It is straightforward to show that the argument of the integral above is a Bernstein polynomial of order n, i.e., \(\eta ^{[i]}(t) = \sum _{j=0}^n \bar{\eta }_{j,n}^{[i]}b_{j,n}\), and the integral reduces to the sum of its Bernstein coefficients by virtue of Property A.2, namely \( \int _{t_0}^{t_f} \eta ^{[i]}(\tau ) d\tau = \sum _{j=0}^n \bar{\eta }_{j,n}^{[i]} \, . \) The Bernstein coefficients \(\bar{\eta }_{j,n}^{[i]}\), \(j = 0,\ldots ,n\), can be computed using Property A.5 of BPs.

The boundary conditions can be easily enforced by using the end point value property as follows:

$$\begin{aligned} \begin{aligned}&{{{\bar{\mathbf{p}}}}}^{[i]}_{0,n} = \mathbf {p}_{0}^{[i]}, \qquad \left\| {{\bar{\mathbf{p}}}}_{0,n}^{[i] \prime } \right\| = v_{0}^{[i]}, \qquad \angle {{\bar{\mathbf{p}}}}_{0,n}^{[i] \prime } = \varvec{\psi }_{0}^{[i]}, \qquad {{\bar{\omega }}}^{[i]}_{0,n} = \varvec{\omega }_{0}^{[i]}, \\ \end{aligned} \end{aligned}$$
(19)

To impose the feasible and flyable constraints, we notice that the square of the speed of the UAV (see Eq. (15)) is a 2nth order Bernstein polynomial, and its coefficients can be computed from the coefficients \(\dot{\mathbf {p}}^{[i]}\). Thus, it is convenient to express the constraint in Eq. (7) as follows:

$$\begin{aligned} v^2_{\min } \le ||\dot{\mathbf {p}}_{i,n}(t)||^2 \le v^2_{\max }, \quad \forall t \in [t_0,t_f] . \end{aligned}$$
(20)

The latter can be enforced using the Evaluating Bounds and Evaluating Extrema algorithms presented in [19]. Similarly, the angular rate constraint

$$\begin{aligned} ||\omega ^{[i]}(t)|| \le \omega _{\max }, \quad \forall t \in [t_0,t_f], \end{aligned}$$
(21)

can be enforced using the same methods, since the angular rate is a rational BP (see Eq. (16) and Property A.5). The inter-vehicle safety constraint

$$\begin{aligned} ||\mathbf {p}^{[j]}(t) - \mathbf {p}^{[i]}(t)|| \ge D_s , \end{aligned}$$
(22)

\(\forall i,j = 1, \ldots ,m\), \(i \ne j\), and the no-fly-zone constraint

$$\begin{aligned} ||\mathbf {p}^{[i]}(t) - E_{\mathbf {p}_\mathcal {T}}(t_0,t)|| \ge R_{T}, \end{aligned}$$
(23)

\(\forall i = 1, \ldots , m\), can be efficiently imposed using the Minimum Distance and Collision Detection algorithms presented in [19].

Then, the continuous-time jth flight and monitoring trajectory generation problems for \(\text {UAV}_i\) (Problem 3.1) can be reformulated as finite dimensional optimization problems as follows:

Flight Trajectory Generation For given \(t_0\) and \(t_f = t^{[i]}_j\), find \(\mathbf {p}^{[i]}_{j,n}\), \(j = 0,\ldots ,n\), that minimize the cost function given by Eq. (17) subject to Eq0.  (19)–(23).

Similarly, the (continuous-time) jth monitoring trajectory generation problem for \(\text {UAV}_i\) (Problem 3.2) can be reformulated as follows:

Monitoring Trajectory Generation For given \(t_0\) and \(t_f = t^{[i]}_j + T_\varDelta \), find \(\mathbf {p}^{[i]}_{j,n}\), \(j = 0,\ldots ,n\), that minimize the cost function given by Eq. (18) subject to Eqs. (19)–(23).

Remark 3.1

Due to the geometric properties of BPs (see Appendix and [7, 19]), constraints (20)–(22) can be enforced for the whole trajectories (i.e. \(\forall t \in [t_0,t_f]\)) by means of arithmetic operations on the Bernstein coefficients, independently of the order of the polynomials. This also implies that, to guarantee inter-vehicle safety, each vehicle must communicate only a finite (possibly small) number of coefficients to the rest of the fleet.

Remark 3.2

The above reformulated flight and monitoring trajectory generation problems can be solved using an off-the-shelf optimization solver such as MATLAB’s fmincon or SNOPT. The computation time required to solve is in the order of milliseconds, and this makes it feasible for real time implementation. This, in turn, guarantees robustness and safety of the mission even with unpredictable target movements.

4 Results

The BeBOT library [19] as well as Scipy’s Optimize subpackage, Numpy, Matplotlib, and Numba are used to generate the following results. The tests were run on a Lenovo Thinkpad P52s laptop with 8 GB of RAM and an Intel Core i7-8550U processor running at 1.8 GHz. The implementation used to run the tests is available on GitHub at https://github.com/caslabuiowa/stochastic_target.

Consider a mission involving three UAVs. At \(t^{[1]}_1 = 30\)s, \(\text {UAV}_1\) is tasked with starting its monitoring trajectory. The prescribed monitoring maneuver mission is \(T_{\varDelta }=10\)s. Thus, we have \(\mathcal {I}^{[1]} = \{30,60,90,\ldots \}\)s, \(\mathcal {I}^{[2]} = \{40,70,100,\ldots \}\)s, and \(\mathcal {I}^{[3]} = \{50,80,110,\ldots \}\)s. The minimum safe distance, \(D_s = 1\)m, outer monitoring radius, \(R_O = 75\)m, maximum speed, \(v_{\max } = 10 \mathrm {\frac{m}{s}}\), minimum speed, \(v_{\min } = 1 \mathrm {\frac{m}{s}}\), maximum angular rate, \(\omega _{\max } = 0.6 \, \mathrm {\frac{rad}{s}}\), and re-planning radius, \(\epsilon = 5 \, m\) are selected. The UAVs are initially positioned at \(\mathbf {p}^{[1]}_0 = [0, 0]^\top \)m, \(\mathbf {p}^{[2]}_0=[0, 25]^\top \)m, and \(\mathbf {p}^{[3]}_0=[0, 50]^\top \)m, respectively. They have initial headings \(\psi _0^{[1]}=\psi _0^{[2]}=\psi _0^{[3]}= 0 \, rad\), initial speeds \(v_0^{[1]}=v_0^{[2]}=v_0^{[3]}=3 \mathrm {\frac{m}{s}}\), and initial angular rates \(\omega _0^{[1]}=\omega _0^{[2]}=\omega _0^{[3]}=0 \mathrm {\frac{rad}{s}}\).

Scenario 1 Consider a static target, with \(\mathbf {p}_{\mathcal {T}}(t) = [25, 25]^\top \)m. The flight trajectories are computed at \(t_0=0\), see Fig. 2a. The red dotted lines show the outer monitoring radii of the target at specified times. The gray circle in the center represents the no-fly-zone. The red star is the target and the solid colored lines are the agent flight trajectories. As the mission progresses, the agents generate and follow optimal monitoring trajectories (dashed lines), Fig. 2b. The previous flight trajectories are shown in light cyan, light green, and light blue. At \(t = 60 s\) in Fig. 2b, Agent 1 is about to start its monitoring trajectory while Agent 3 is finishing its monitoring trajectory. On the other hand, Agent 2 is on its way to the outer radius and will begin its monitoring trajectory once Agent 1 completes its monitoring trajectory. Agent 2 had been monitoring the target before Agent 3. As expected, the monitoring trajectories all face radially inward towards the static target.

New flight trajectories are planned and bring the monitoring agents to a new, random position on the outer monitoring radius. The time history of the speeds and angular rates of the vehicles remain within the prescribed bounds, as shown in Fig. 2c, d.

Fig. 2
figure 2

Mission Scenario #1

Scenario 2 Consider a target initially positoned at \(\mathbf {p}_{\mathcal {T}}(0) = [25, 25]^\top \)m, with initial heading angle of 0rad, initial speed of \(3 \mathrm {\frac{m}{s}}\), and initial angular rate of \(0\mathrm {\frac{rad}{s}}\). At time \(t = 1\)s, the target starts moving with speed \(3 \mathrm {\frac{m}{s}}\), \(\forall t \ge 1\)s. It is assumed that the UAVs can only measure the target’s state once per second. Thus, it is not until \(t = 2 \, s\) that the expected future position of the target is found to be outside of the radius \(\epsilon \). As a result, re-planning is triggered, see Fig. 3. Note that the new trajectories still retain the same final time as the original flight trajectories, but were simply re-planned to accommodate the new motion of the target. As the mission progresses, the agents continue to monitor the target. At time \(t = 50\)s, the monitoring trajectories track the position of the target, shown in Fig. 4a. Unfilled star symbols are used to show the past positions of the target at times \(t = 30\)s and \(t = 40\)s. The previously computed flight trajectories are shown in light cyan, light green, and light blue. At time \(t = 60\)s, Fig. 4b, \(\text {UAV}_3\) finishes its monitoring trajectory.

Fig. 3
figure 3

Mission Scenario #2 for \(t = 2s\)

Fig. 4
figure 4

Mission Scenario #2 for \(t = 50s\) and \(t = 60s\)

Scenario 3 The target is initially positioned at \(\mathbf {p}_{\mathcal {T}}(0) = [25, 25]^\top \)m, with initial heading angle of 0 radians, initial speed of \(3 \mathrm {\frac{m}{s}}\), and initial angular rate of \(0\mathrm {\frac{rad}{s}}\). At time \(t = 1\)s, the target quickly changes its heading from 0 radians to \(\frac{\pi }{4}\) radians. Similarly to Scenario 2, the agents detect the change in the predicted future position of the target and re-plan their trajectories. At time \(t = 60\)s the agents can be seen to be following the new trajectory of the target, shown in Fig. 5b.

Fig. 5
figure 5

Mission Scenario #3

5 Conclusions

The problem of monitoring a stochastic target using a team of autonomous agents was formulated as a continuous time infinite dimensional optimal control problem. This problem was simplified using the sequential approach and then transcribed into a nonlinear optimization problem by restricting to the class of solutions defined by Bernstein polynomials. Two separate trajectories were computed for each agent: a flight trajectory and a monitoring trajectory. The flight trajectory aimed to safely navigate the agent to the outer monitoring radius and the monitoring trajectory optimized the line of sight with respect to the target position. To improve the success of monitoring the uncertain target, the trajectories were re-planned any time the predicted future position of the target deviated more than a radius of \(\epsilon \) from the original predicted position. The proposed method was demonstrated in simulation using three monitoring agents. The agents were able to safely reach the outer monitoring radius and then produced optimal monitoring trajectories. When the expected position of the target was predicted to lie outside the radius, \(\epsilon \), of the original prediction, a new flight trajectory was generated to guarantee monitoring without lapses. Plots of these simulations were shown along with plots of the speed and angular rates to verify that constraints were satisfied.

While the examples shown only include a team of three vehicles, this method can be extended to a team with more or less vehicles. In the case for a team of two, one vehicle would be monitoring the trajectory while the second vehicle travels to its initial monitoring position. Two vehicles would provide a cheaper mission cost at the expense of less redundancy in monitoring. Similarly, a team of four vehicles could be employed to improve the chances of a successful mission and provide more time for each vehicle to reach the initial monitoring position. Depending on hardware being used, the number of monitoring vehicles can be increased even further. The interested reader is referred to [20], where numerical results involving multiple vehicle missions with larger number of agents are presented.

The direct method approach using Bernstein polynomials provides the user with an intuitive interface for including additional mission objectives or constraints by simply tweaking the cost and constraint functions. In addition to that, Bernstein polynomials provide computationally efficient properties and algorithms with which the cost and constraint functions can be computed. Due to the curse of dimensionality, this method does not scale to swarms of hundreds or more vehicles. However, the cost and feasibility of such a swarm for monitoring an individual target would likely limit this approach.

Future work includes incorporating stochastic dynamics of the target into the cost function. This will potentially be achieved by leveraging Wiener’s Polynomial Chaos [40] and Gaussian Process Regression (e.g. [39]) to express the future position of an unknown target in polynomial form, which can be easily incorporated into the problem formulation presented in this paper. We are also interested in developing methods to make the system more robust with respect to challenges such as an unreliable communication network, a target actively evading monitoring, and limited fuel availability.