Keywords

1 Introduction

This paper presents a proof of concept numerical implementation of a motion planning algorithm related to a two player game. A first player selects, within an admissible class \(\varXi \), an integral cost function on paths, which takes into account their position, orientation, and possibly curvature. The second player selects a path, within an admissible class \(\varGamma \), with prescribed endpoints and an intermediate keypoint. The players objective is respectively to maximize and minimize the path cost

$$\begin{aligned} {\mathfrak C}(\varXi , \varGamma )&:= \sup _{\xi \in \varXi } \inf _{\gamma \in \varGamma } {\mathfrak C}(\xi ,\gamma ),&\text {where } {\mathfrak C}(\xi ,\gamma )&:= \int _0^{T(\gamma )} {\mathcal C}_\xi (\gamma (t), \gamma '(t), \gamma ''(t)) \, {\mathrm d}t, \end{aligned}$$
(1)

where the path \(\gamma \) is parametrized at unit Euclidean speed, and the final time \(T(\gamma )\) is free. From a game theoretic point of view, this is a non-cooperative zero-sum game, where player \(\varXi \) has no information and player \(\varGamma \) has full information over the opponent’s strategy.

The game (1) typically models a surveillance problem [17], and \(\exp (-{\mathfrak C}(\varXi , \varGamma ))\) is the probability for player \(\varGamma \) to visit a prescribed keypoint without being detected by player \(\varXi \). For instance player \(\varXi \) is responsible for the installation of radar [1] or sonar detection systems [17], and would like to prevent vehicles sent by player \(\varGamma \) from spying on some objectives without being detected.

The dependence of the cost \({\mathcal C}_\xi \) w.r.t. the path tangent \(\gamma '(t)\) models the variation of a measure of how detectable the target is (radar cross section, directivity index, etc.) w.r.t. the relative positions and orientations of the target and sensor. The dependence of \({\mathcal C}_\xi \) on the path curvature \(\gamma ''(t)\) models the airplane maneuverability constraints, such as the need to slow down in tight turns [9], or even a hard bound on the path curvature [8].

Strode [17] has shown the interplay of motion planning and game theory in a similar setting, on a multistatic sonar network use case, but using isotropic graph-based path planning. The same year, Barbaresco [2] used fast-marching for computing threatening paths toward a single radar, but without taking into account curvature constraints and without considering a game setting.

The main contributions of this paper are as follows:

  1. 1.

    Anisotropy and curvature penalization: Strategy optimization for player \(\varGamma \) is an optimal motion planning problem, with a known cost function. This is addressed by numerically solving a generalized eikonal PDE posed on a two or three dimensional domain, and which is strongly anisotropic in the presence of a curvature penalty and a detection measurement that depends on orientation. A Fast-Marching algorithm, relying on recent adaptive stencils constructions, based on tools from lattice geometry, is used for that purpose [9, 12, 13]. In contrast, the classical fast marching method [14] used in [5] is limited to cost functions \({\mathcal C}_\xi (\gamma (t))\) independent of the path orientation \(\gamma '(t)\) and curvature \(\gamma ''(t)\).

  2. 2.

    Gradient computation for sensors placement: Strategy optimization for player \(\varXi \) is typically a non-convex problem, to which various strategies can be applied, yet gradient information w.r.t. the variable \(\xi \in \varXi \) is usually of help. For that purpose, we implement efficient differentiation algorithms, forward and reverse, for estimating the gradient of the value function of player \(\varXi \)

    $$\begin{aligned}&\nabla _\xi {\mathfrak C}(\xi ,\varGamma ),&\text {where } {\mathfrak C}(\xi , \varGamma ) := \inf _{\gamma \in \varGamma } {\mathfrak C}(\xi , \gamma ). \end{aligned}$$
    (2)

    Reverse mode differentiation reduced the computation cost of \(\nabla _\xi {\mathfrak C}(\xi , \varGamma )\) from \({\mathcal O}(N^2)\), as used in [5], to \({\mathcal O}(N\ln N)\), where N denotes the number of discretization points of the domain. As a result, we can reproduce examples from [5] with computation times reduced by several orders of magnitude, and address complex three dimensional problems.

Due to space constraints, this paper is focused on problem modeling and numerical experiments, rather than on mathematical aspects of wellposedness and convergence analysis. Free and open source codes for reproducing (part of) the presented numerical experiments are available on the first author’s webpageFootnote 1.

2 Mathematical Background of Trajectory Optimization

We describe in this section the PDE formalism, based on generalized eikonal equations, used to compute the value function \(\min _{\gamma \in \varGamma } {\mathfrak C}(\xi ,\gamma )\) of the second player, where \(\xi \) is known and fixed. Their discretization is discussed in Sect. 3. We distinguish two cases, depending on whether the path local cost function \({\mathcal C}_\xi (x,\dot{x},\ddot{x})\) appearing in (1) depends on the last entry \(\ddot{x}\), i.e. on path curvature.

2.1 Curvature Independent Cost

Let \(\varOmega \subset {\mathbb E}:= {\mathbb R}^2\) be a bounded domain, and let the source set \(\varUpsilon \) and target set \(\varTheta \) be disjoint subsets of \(\varOmega \). For each \(x \in \varOmega \), let \(\varGamma _x\) denote the set of all paths \(\gamma \in C^1([0,T], \varOmega )\), where \(T = T(\gamma )\) is free, such that \(\gamma (0) \in \varUpsilon \), \(\gamma (T) = x\) and \(\forall t \in [0,T],\, \Vert \gamma '(t)\Vert = 1\). The problem description states that the first player needs to go from \(\varUpsilon \) to \(\varTheta \) and back, hence its set of strategies is \(\varGamma = \bigcup _{x \in \varTheta } \varGamma ^+_x \times \varGamma ^-_x\), where \(\varGamma ^+_x = \varGamma ^-_x := \varGamma _x\), and

$$\begin{aligned} {\mathfrak C}(\xi ,\varGamma )&= \inf _{x \in \varTheta } u^+_\xi (x) + u^-_\xi (x),&\text {where } u^\pm _\xi (x)&:= \inf _{\gamma \in \varGamma ^\pm _x} {\mathfrak C}^\pm (\xi ,\gamma ). \end{aligned}$$
(3)

Here and below, the symbol “±” must be successively replaced with “\(+\)” and then “−”. We denoted by \({\mathfrak C}^\pm \) the path cost defined in terms of the local cost \({\mathcal C}_\xi (x,\pm \dot{x})\). In practice though, we only consider symmetric local costs, obeying \({\mathcal C}_\xi (x,\dot{x}) = {\mathcal C}_\xi (x,-\dot{x})\), hence the forward and return paths are identical and we denote \(u_\xi := u^+_\xi = u^-_\xi \). Define the 1-homogenous metric \({\mathcal F}_\xi : \varOmega \times {\mathbb E}\rightarrow [0,\infty ]\), the Lagrangian \({\mathcal L}_\xi \) and the Hamiltonian \({\mathcal H}_\xi \) by

$$\begin{aligned} {\mathcal F}_\xi (x,\dot{x})&:= \Vert \dot{x}\Vert {\mathcal C}_\xi (x,\dot{x}/\Vert \dot{x}\Vert ),&{\mathcal L}_\xi&:= \frac{1}{2} {\mathcal F}_\xi ^2,&{\mathcal H}_\xi (x,\hat{x})&:= \sup _{\dot{x}\in {\mathbb E}} \langle \hat{x},\dot{x}\rangle - {\mathcal L}_\xi (x,\dot{x}). \end{aligned}$$

Here and below, symbols denoting tangent vectors are distinguished with a “dot”, e.g. \(\dot{x}\), and co-vectors with a “hat”, e.g. \(\hat{x}\). Under mild assumptions [3], the function \(u_\xi : \varOmega \rightarrow {\mathbb R}\) is the unique viscosity solution to a generalized eikonal equation \(\forall x \in \varOmega \setminus \varUpsilon ,\, {\mathcal H}_\xi (x,\nabla _x u_\xi (x)) = 1/2\), \(\forall x \in \varUpsilon ,\, u_\xi (x) = 0\), with outflow boundary conditions on \(\partial \varOmega \). The discretization of this PDE is discussed in Sect. 3. We limit in practice our attention to Isotropic costs \({\mathcal C}_\xi (x)\), and Riemannian costs \({\mathcal C}_\xi (x, \dot{x}) = \sqrt{\langle \dot{x},M_\xi (x) \dot{x}\rangle }\) where \(M_\xi (x)\) is symmetric positive definite, for which efficient numerical strategies have been developed [11, 12].

2.2 Curvature Dependent Cost

Let \(\varOmega \subset {\mathbb R}^2 \times {\mathbb S}^1\) be a bounded domain, within the three dimensional space of all positions and orientations. As before, let \(\varUpsilon ,\varTheta \subset \varOmega \). For all \(x \in \varOmega \) let \(\varGamma ^\pm _x\) be the collection of all \(\gamma \in C^2([0,T], \varOmega )\), such that \(\eta := (\gamma ,\pm \gamma ')\) satisfies \(\eta (0) \in \varUpsilon \), \(\eta (T) = x\) and \(\forall t \in [0,T],\, \Vert \gamma '(t)\Vert = 1\). Since the first player needs to go from \(\varUpsilon \) to \(\varTheta \) and back, its set of strategies is \(\varGamma = \bigcup _{x \in \varTheta } \varGamma _x^+ \times \varGamma _x^-\). Equation (3) holds, where \({\mathfrak C}^\pm \) denotes the path cost defined in terms of the local cost \({\mathcal C}_\xi (p,\pm \dot{p},\ddot{p})\).

Consider the 1-homogeneous metric \({\mathcal F}^\pm _\xi : {\mathrm T} \varOmega \rightarrow [0,\infty ]\), defined on the tangent bundle to \(\varOmega \subset {\mathbb R}^2 \times {\mathbb S}^1\) by

$$\begin{aligned} {\mathcal F}^\pm _\xi ( (p,n), (\dot{p},\dot{n})) := {\left\{ \begin{array}{ll} +\infty &{} \text {if } \dot{p} \ne \Vert \dot{p}\Vert n,\\ \Vert \dot{p}\Vert {\mathcal C}_\xi (p,n,\pm \dot{n}/\Vert \dot{p}\Vert ) &{}\text {else}, \end{array}\right. } \end{aligned}$$

where \(p \in {\mathbb R}^2\), \(n \in {\mathbb S}^1\) is a unit vector, and the tangent vector satisfies \(\dot{p} \in {\mathbb R}^2\), \(\dot{n} \perp n\). This choice is motivated by the fact that \(\int _0^T {\mathcal F}^\pm _\xi (\eta (t),\eta '(t)) {\mathrm d}t\) is finite iff \(\eta :[0,T]\rightarrow \varOmega \) is of the form \((\gamma ,\pm \gamma ')\), and then it equals \(\int _0^T {\mathcal C}_\xi (\gamma (t), \pm \gamma '(t), \gamma ''(t)) {\mathrm d}t\).

Introducing the Lagrangian \({\mathcal L}^\pm _\xi = \frac{1}{2} ({\mathcal F}^\pm _\xi )^2\) on \({\mathrm T} \varOmega \), and its Legendre-Fenchel dual the Hamiltonian \({\mathcal H}^\pm _\xi \), one can again under mild assumptions characterize \(u_\xi ^\pm \) as the unique viscosity solution to the generalized eikonal PDE \({\mathcal H}^\pm _\xi (x,\nabla u^\pm _\xi (x)) = 1/2\) with appropriate boundary conditions [3]. In practice, we choose cost functions of the form \({\mathcal C}_\xi (p,\dot{p}, \ddot{p}) = {\mathcal C}^\circ _\xi (p, \dot{p}) {\mathcal C}_\star (|\ddot{p}|)\), where \({\mathcal C}_\star \) is the Reeds-Shepp car or Dubins car [8] curvature penalty, with respective labels \(\star =\mathrm{RS}\) and \(\mathrm{D}\), namely

$$\begin{aligned} {\mathcal C}_\mathrm{RS}(\kappa )&:= \sqrt{1+\rho ^2\kappa ^2},&{\mathcal C}_\mathrm{D}(\kappa )&:= {\left\{ \begin{array}{ll} 1 &{}\text {if } |\rho \kappa |\le 1,\\ +\infty &{}\text {otherwise}, \end{array}\right. } \end{aligned}$$

where \(\rho >0\) is a parameter which has the dimension of a curvature radius. The Dubins car can only follow paths which curvature radius is \(\le \rho \), whereas the Reeds-Shepp car (in the sense of [9] and without reverse gear), can rotate into place if needed. The Hamiltonian then has the explicit expression \({\mathcal H}((p,n),(\hat{p}, \hat{n})) = \frac{1}{2} {\mathcal C}^0_\xi (p,n)^{-2} {\mathcal H}_*(n,(\hat{p}, \hat{n}))\) where \({\mathcal H}_\mathrm{RS} = \frac{1}{2} (\langle \hat{p},n\rangle _+^2+\Vert \hat{n}/\rho \Vert ^2)\) and \({\mathcal H}_\mathrm{D} = \frac{1}{2} \max \{0,\langle \hat{p}, n\rangle +\Vert \hat{n}/\rho \Vert \}^2\).

3 Discretization of Generalized Eikonal Equations

We construct a discrete domain X by intersecting the computational domain with an orthogonal grid of scale \(h>0\): \(X = \varOmega \cap (h {\mathbb Z})^d\), where \(d=2\) for the curvature independent models, and \(d=3\) for the other models which are posed on \({\mathbb R}^2 \times ({\mathbb R}/2 \pi {\mathbb Z})\) —using the angular parametrization \({\mathbb S}^1 \cong {\mathbb R}/2 \pi {\mathbb Z}\) (in the latter periodic case, \(2\pi /h\) must be an integer). We design weights \(c_\xi (x,y)\), \(x,y \in X\) such that for any tangent vector \(\dot{x}\) at \(x\in \varOmega \) one has

$$\begin{aligned} {\mathcal H}_\xi (x,\dot{x}) \approx h^{-2} \sum _{y \in X} c^2_\xi (x,y)\langle x-y, \dot{x}\rangle _+^2, \end{aligned}$$
(4)

where \(a_+ := \max \{0,a\}\) (expression (4) is typical, although some models require a slight generalization). The weights \(c_\xi (x,y)\) are non-zero for only few \((x,y)\in X\) at distance \(\Vert x-y\Vert = {\mathcal O}(h)\). Their construction exploits the additive structure of the discretization grid X and relies on techniques from lattice geometry [16], see [9, 12, 13] for details. The generalized eikonal PDE \({\mathcal H}_\xi (x,\nabla _x u_\xi (x)) = 1/2\), which solution \(u_\xi (x)\) should be regarded as a distance map, is discretized as

$$\begin{aligned} \sum _{y \in X} c^2_\xi (x,y) (U_\xi (x) - U_\xi (y))_+^2 = h^2/2, \end{aligned}$$
(5)

with adequate boundary conditions. The solution \(U_\varepsilon : X \rightarrow {\mathbb R}\) to this system of equations is computed in a single pass with \({\mathcal O}(N \ln N)\) complexity [14], using a variant of the Fast-Marching algorithm. This is possible since the l.h.s. of (5) is a non-decreasing function of the positive parts of the finite differences \((U_\xi (x) - U_\xi (y))_{y\in X}\). Note that the eikonal PDE discretization (5), based on upwind finite differences, differs from the semi-Lagrangian approach [15], which can also be solved in a single pass but is usually less efficient due to the large cardinality and radius of its stencils. Image segmentation techniques relying on the numerical solutions to anisotropic eikonal PDEs were proposed in [6] using Riemannian metrics, and in [4, 7] based on the reversible Reeds-Shepp car and Euler elastica curvature penalized models respectively. However these early works rely on non-causal discretizations, which have super-linear complexity \({\mathcal O}(N^{1+1/d})\) where the unspecified constant is large for strongly anisotropic and non-uniform metrics. This alternative approach yields (much) longer solve times, incompatible our application - where strongly anisotropic three dimensional eikonal PDEs are solved as part of an inner loop of an optimization procedure.

To be able to use the gradient to solve the problem (1), we need to differentiate the cost \({\mathfrak C}(\xi ,\varGamma )\) w.r.t. the first player strategy \(\xi \in \varXi \). In view of (3), this only requires the sensitivity of the discrete solution values \(U_\xi (x_*)\) at the few points \(x_* \in X \cap \varTheta \), w.r.t to variations in the weights \(c_\xi (x,y)\), \(x,y \in X\). For that purpose we differentiate (5) w.r.t. \(\xi \) at an arbitrary point \(x \in X\setminus \varUpsilon \), and obtain

$$\begin{aligned} \sum _{y \in X} \omega _\xi (x,y) \left( {\mathrm d}U_\xi (x)-{\mathrm d}U_\xi (y) + (U_\xi (x) - U_\xi (y)) \, {\mathrm d}\ln c_\xi (x,y) \right) = 0, \end{aligned}$$

where \(\omega _\xi (x,y) := c^2_\xi (x,y) (U_\xi (x) - U_\xi (y))_+\). Therefore

$$\begin{aligned} {\mathrm d}U_\xi (x) = \sum _{y \in X} \alpha _\xi (x,y) {\mathrm d}U_\xi (y) + \sum _{y \in X} \beta _\xi (x,y) {\mathrm d}c_\xi (x,y), \end{aligned}$$
(6)

where \(\alpha _\xi (x,y):=\omega _\xi (x,y)/\sum _y \omega _\xi (x,y)\), and \(\beta _\xi (x,y) := \alpha _\xi (x,y)/c_\xi (x,y)\). We first choose \(x=x_*\) in (6), and then recursively eliminate the terms \({\mathrm d}U_\xi (y)\) by applying the same formula at these points, except for points in the source set \(y\in \varUpsilon \) for which one uses the explicit expression \({\mathrm d}U_\xi (y) = 0\) (since \(U_\xi (y) = 0\) is in this case independent of \(\xi \)). This procedure terminates: indeed, whenever \({\mathrm d}U_\xi (x)\) depends on \({\mathrm d}U_\xi (y)\) in (6), one has \(\alpha _\xi (x,y)>0\), thus \(\omega (x,y)>0\), hence \(U_\xi (x)>U_\xi (y)\). It is closely related to automatic differentiation by reverse accumulation [10], and has the modest complexity \({\mathcal O}(N)\).

4 Numerical Results

The chosen physical domain R is the rectangle \([0,2]\times [0,1]\) minus some obstacles, as illustrated on Fig. 1. Source point is (0.2, 0.5) and target keypoint (1.8, 0.5). The computational domain is thus \(\varOmega =R\) for curvature independent models and \(\varOmega = R\times {\mathbb S}^1\) for curvature dependent models, which is discretized on a \(180 \times 89\) or \(180 \times 89 \times 60\) grid.

No intervention from the first player. The cost function is \({\mathcal C}_\xi (p, \dot{p}, \ddot{p}) = {\mathcal C}_*(|\ddot{p}|)\), where \({\mathcal C}_*(\kappa )\) is respectively 1, \(\sqrt{1+\rho ^2\kappa ^2}\) and (1 if \(\rho \kappa \le 1\), otherwise \(+\infty \)), with \(\rho := 0.3\). The differences between the three models are apparent: the curvature independent model uses the same path forward and back; the Reeds-Shepp car spreads some curvature along the way but still makes an angle at the target point; the Dubins car maintains the radius of curvature below the bound \(\rho \), and its trajectory is a succession of straight and circular segments. A referee notes that following an optimal trajectory for the Dubins model is dangerous in practice, since any small deviation is typically impossible to correct locally, and may drive into an obstacle; these trajectories are also easier to detect due to the large circular arc motions.

Fig. 1.
figure 1

Shortest path from the blue point (left) to the red keypoint (right) and back. (Color figure online)

Next we study three games where player one aims to detect player two along its way from the source set \(\varUpsilon \) to the target \(\varTheta \) and back, using different means. If the first player does not intervene, see Fig. 1, or if its strategy is not optimized, see Fig. 3, then there is typically a unique optimal path (optimal loop in our games) for player two. In contrast, an interesting qualitative property of the optimal strategy \(\xi \in \varXi \) for the first player is that it has a large number of optimal responses from player two, see Fig. 4, in some cases even a continuum, see Fig. 2 (bottom) and [5]. This is typical of two player games.

Fresh paint based detection. In this toy model, see Fig. 2, the first player spreads some fresh paint over the domain, and the second player is regarded as detected if he comes back covered in it from his visit to the keypoint. The cost function is \({\mathcal C}_\xi (p,\dot{p}, \ddot{p}) = \xi (p) {\mathcal C}_*(|\ddot{p}|)\), where \(\xi : R \rightarrow {\mathbb R}_+\) is the fresh paint density, decided by the first player, and \({\mathcal C}_*(\kappa )\) is as above. For wellposedness, we impose upper and lower bounds on the paint density, namely \(0.1 \le \xi (p) \le 1\), and subtract the paint supply cost \(\int _R \xi (p) {\mathrm d}p\) to (1). The main interest of this specific game, also considered in [5], is that \({\mathfrak C}(\xi ,\varGamma )\) is concave w.r.t. \(\xi \in \varXi \). The observed optimal strategy for player \(\varXi \) is in the curvature independent case to make some “fences” of paint between close obstacles, and in the curvature penalized models to deposit paint at the edges of obstacles, as well as along specific circular arcs for the Dubins model.

Fig. 2.
figure 2

Top: Optimal distribution of paint, to mark a path from the blue point (left) to the red keypoint (right) and back. Bottom: Geodesic density at the optimal paint distribution. (Color figure online)

Visual detection. The first player places some cameras, e.g. with 360-degree field of view and mounted at the ceiling, which efficiency at detecting the second player decreases with distance and is blocked by obstacles, see Fig. 3. The cost function is

(7)

where \(\xi \in \varXi \) is a subset of R with prescribed cardinality, two in our experiments. The green arrows on Fig. 3 originate from the current (non optimal) camera position, and point in the direction of greatest growth \(\nabla {\mathfrak C}(\xi ,\varGamma )\) for the first player objective function.

Fig. 3.
figure 3

Field of view of the cameras (black gradients), optimal furtive paths (red lines), local direction of improvement of the camera position (green arrows). (Color figure online)

Radar based detection. The first player places some radars on the domain \(R=[0,2]\times [0,1]\), here devoid of obstacles, and the second player has to fly by undetected. The cost function is

$$\begin{aligned} {\mathcal C}_\xi (p,\dot{p}, \ddot{p}) = {\mathcal C}_*(|\ddot{p}|) \sqrt{\sum _{q \in \xi } \frac{\langle \dot{p},n_{\varvec{pq}}\rangle ^2+\delta ^2\langle \dot{p},n^\perp _{\varvec{pq}}\rangle }{\Vert p-q\Vert ^4}} \end{aligned}$$
(8)

where \(n_{\varvec{pq}} := (q-p)/\Vert q-p\Vert \). The first player strategy \(\xi \) contains the positions of three radars, constrained to lie in the subdomain \([0.4,1.6]\times [0,1]\). The parameter \(\delta \) is set to 1 for an isotropic radar cross section (RCS), or to 0.2 for an anisotropic RCS. In the latter case a plane showing its side to radar is five times less likely to be detected than a plane showing its nose or back, at the same position. Green arrows on Fig. 4 point from the original position to the (locally) optimized position for player \(\varXi \). At this position, several paths are optimal for player \(\varGamma \), shown in red on Fig. 4.

Fig. 4.
figure 4

Optimal radar placement with an isotropic (top) or anisotropic (bottom) radar cross section.

Computational cost. On a standard Laptop computer (2.7 Ghz, 16 GB ram), optimizing the second player objective, by solving a generalized eikonal equation, takes \({\approx }1\) s in the curvature dependent case, and \(\approx 60\) times less in the curvature independent case thanks to the absence of angular discretization of the domain. Optimizing the first player objective takes \({\approx }100\) L-BFGS iterations, each one taking at most 8 s. For the stability of the minimization procedure, the problems considered were slightly regularized by the use of soft-minimum functions and by “blurring” the target keypoint over the \(3\times 3\) box of adjacent pixels.

5 Conclusion

We have modeled a motion planning problem that minimize an anisotropic probability of detection, taking into account navigation constraints while computing the gradient of the value function related to the sensors location. This model is thus useful for surveillance applications modeled as a two-player zero-sum game involving a target that tries to avoid detection.