1 Introduction

This paper is devoted to time-optimal differential games with a lifeline and a numerical design scheme for the value function of such games. In games of this type, the first player tries to guide the system into a given closed target set as soon as possible; at the same time, the system’s trajectory should stay within some open set in which the game takes place. The second player counteracts the first player and wins as soon as the system’s trajectory leaves this open set or if he manages to forever keep the system outside the target set.

Apparently, problems with a lifeline were first formulated by Isaacs in the book [1, 18]. In his statements, the lifeline was understood as a set upon reaching which the second player absolutely wins. Petrosjan made a significant contribution to the study of games with a lifeline; for example, see [2, 5, 6, 23]. However, the authors are not aware of works in which games of this type are exhaustively investigated: Petrosjan mainly studied problems with simple motions of the players (i.e., the controls are the velocities of the objects controlled by the players). In the books [3, 19] by Krasovskii and Subbotin, such problems were treated as problems with state-space constraints: the first player should not guide the system beyond the boundary of a given set.

The problems ideologically very close to the games with a lifeline were studied by the French mathematicians Cardaliaguet, Quincampoix, and Saint-Pierre [12–15]. For the case of controlled systems, using the framework of nonsmooth analysis, the theory of differential inclusions, and viability theory, they considered sets (viability kernels) in which the controller can forever keep the system. In differential games, these authors analyzed situations with two target sets (one for the first player and the other for the second player, respectively) into which each of the players tries to guide the system, still avoiding the opponent’s target set. Another modification of the game described therein is a game with state-space constraints for the first player. In such a situation, the main goal is to study the victory domains of the players, i.e., the sets from which a player can reach the target set without getting into the target set of the opponent (or without violating the state-space constraints). Also in terms of viability theory, the upper value of such games (the guaranteed result of the first player) was characterized as a function whose epigraph is the viability set of the first player. Cardaliaguet, Quincampoix, and Saint-Pierre proposed geometric grid algorithms for the approximation of viability kernels (and hence of the upper value). Nevertheless, as far as we know, these authors have not proven the existence of the value function for games of this type and/or its coincidence with the generalized solution to the corresponding boundary-value problem for the Hamilton–Jacobi equation (although this connection was explicitly mentioned by them).

The main motivation for studying time-optimal differential games with a lifeline is to develop numerical methods for solving classical time-optimal games. In particular, the Italian mathematicians Bardi, Falcone, and Soravia [9] suggested a theoretical design method for the value function of a time-optimal game as the generalized (viscosity) solution to the corresponding boundary-value problem for the Hamilton–Jacobi equation. The procedures proposed therein involve grids, and their convergence (pointwise as well as uniform on any compact set) was established under the assumption that the grid is infinite and covers the entire space of the game. But practical computer implementations operate with finite grids only, which cover just a limited part of the space. This immediately leads to the following problem: what boundary conditions should be imposed on the outer boundary of the area covered by a grid? Bardi and Falcone considered infinite boundary conditions: the second player wins as soon as the system reaches the outer boundary of the game set. Thus, the practical implementation of this procedure actually deals with games with a lifeline, but was not justified for this class of games. Therefore, we decided to fill the gaps associated with the time-optimal problem with a lifeline in a fairly general statement, namely, to prove the existence of the value function of the game with a lifeline in the case of general nonlinear dynamics; to prove the existence of a generalized solution to the corresponding boundary-value problem for the Hamilton–Jacobi equation; to develop an explicit numerical scheme for such a boundary-value problem; and, finally, to establish the convergence of the numerical scheme to the generalized solution.

In this paper, we propose a numerical design method for the value function of a time-optimal differential game with a lifeline as a viscosity solution to the corresponding boundary-value problem for the Hamilton–Jacobi equation; also, we prove its convergence to the value function, pointwise everywhere and uniformly on any compact set.

2 Problem Statement

Consider the time-optimal problem

$$\dot{x}=f(x,a,b),\quad t\geqslant 0,\quad a\in A,\quad b\in B,$$
(2.1)

where \(x\in {{\mathbb{R}}}^{d}\) denotes the state vector of the system; a and b are the control actions of the first and second players, respectively, that are constrained by compact sets A and B in their Euclidean spaces. A compact set \({\mathcal{T}}\) and an open set \({\mathcal{W}}\subset {{\mathbb{R}}}^{d}\) such that \({\mathcal{T}}\subset {\mathcal{W}}\) are given. Introduce the notations \({\mathcal{G}}:={\mathcal{W}}\setminus {\mathcal{T}}\) and \({\mathcal{F}}:={{\mathbb{R}}}^{d}\setminus {\mathcal{W}}\) (see Fig. 1). The game takes place on the set \({\mathcal{G}}\) in the following way. The first player, who chooses the control action a, seeks to guide the system into the set \({\mathcal{T}}\) as soon as possible, still keeping the system’s trajectory within the set \({\mathcal{G}}\); the second player, who chooses the control action b, seeks to guide the system’s trajectory into the set \({\mathcal{F}}\) or, if impossible, to guarantee that the system will reach the set \({\mathcal{T}}\) as late as possible (probably, never).

Fig. 1

Sets \({\mathcal{T}}\), \({\mathcal{F}}\) and \({\mathcal{G}}\).

Such a game can be called a game with a lifeline: the boundary \(\partial {\mathcal{F}}\) of the set \({\mathcal{F}}\) is the lifeline, i.e., the set upon reaching which the second player absolutely wins.

Let the following conditions hold.

C.1. The function \(f:{{\mathbb{R}}}^{d}\times A\times B\mapsto {{\mathbb{R}}}^{d}\) is jointly continuous in all its variables and satisfies the Lipschitz condition in the variable x: for all \({x}^{(1)},{x}^{(2)}\in {{\mathbb{R}}}^{d}\), \(a\in A\), and \(b\in B\),

$$\left\Vert f\left({x}^{(1)},a,b\right)-f\left({x}^{(2)},a,b\right)\right\Vert \leqslant L\left\Vert {x}^{(1)}-{x}^{(2)}\right\Vert ;$$
(2.2)

in addition, it satisfies the saddle point condition in the small game [3, p. 56] (the Isaacs condition):

$$\mathop {\min } \limits_{a\in A}\,\mathop {\max } \limits_{b\in B}\,\left\langle p,f(x,a,b)\right\rangle =\mathop {\max } \limits_{b\in B}\,\mathop {\rm min}\limits_{a\in A}\,\left\langle p,f(x,a,b)\right\rangle \quad \forall p\in {{\mathbb{R}}}^{d}.$$
(2.3)

Hereinafter, ⟨⋅ , ⋅⟩ indicates the scalar product.
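
A standard sufficient condition for (2.3), given here only as an illustration (it is not assumed anywhere in the paper): if the controls enter the dynamics additively, i.e., \(f(x,a,b)={f}_{1}(x,a)+{f}_{2}(x,b)\), then

$$\mathop {\min } \limits_{a\in A}\,\mathop {\max } \limits_{b\in B}\left\langle p,f(x,a,b)\right\rangle =\mathop {\min } \limits_{a\in A}\left\langle p,{f}_{1}(x,a)\right\rangle +\mathop {\max } \limits_{b\in B}\left\langle p,{f}_{2}(x,b)\right\rangle =\mathop {\max } \limits_{b\in B}\,\mathop {\min } \limits_{a\in A}\left\langle p,f(x,a,b)\right\rangle ,$$

so (2.3) holds for every p. In particular, the dynamics of all examples in Section 5 have this separated structure.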

C.2. The boundary \(\partial {\mathcal{G}}\) of the set \({\mathcal{G}}\) (that is, the set composed of \(\partial {\mathcal{T}}\) and \(\partial {\mathcal{F}}\)) is smooth and compact.

The goals of the players can be formalized in terms of a payoff function as follows. For given strategies a(⋅) and b(⋅) of the players and for an initial point x0, we define the values

$$\begin{array}{l}{t}_{* }={t}_{* }\left({x}_{0},a(\cdot),b(\cdot)\right)={\rm min}\left\{t\geqslant 0:x\left(t;{x}_{0},a(\cdot),b(\cdot)\right)\in {\mathcal{T}}\right\},\\ {t}^{* }={t}^{* }\left({x}_{0},a(\cdot),b(\cdot)\right)={\rm min}\left\{t\geqslant 0:x\left(t;{x}_{0},a(\cdot),b(\cdot)\right)\in {\mathcal{F}}\right\},\end{array}$$

which are the first time instants when the trajectory \(x\left(\cdot ;{x}_{0},a(\cdot),b(\cdot)\right)\) generated by the strategies a(⋅) and b(⋅) reaches the sets \({\mathcal{T}}\) and \({\mathcal{F}}\), respectively. If the trajectory does not reach the set \({\mathcal{T}}\) (or \({\mathcal{F}}\)), then the value \({t}_{*}\) (respectively, \({t}^{*}\)) is set equal to +∞. Further analysis can be performed using the formal framework with nonanticipating strategies or the positional formalization developed by Krasovskii and Subbotin [3, 19]. In the latter case, by the feedback strategies of the first and second players we mean functions \(a(\cdot):{{\mathbb{R}}}^{d}\mapsto A\) and \(b(\cdot):{{\mathbb{R}}}^{d}\mapsto B\), respectively.

We define the game result by

$$\tau \left({x}_{0},a(\cdot),b(\cdot)\right)=\left\{\begin{array}{ll}+\infty &{\rm{if}}\,{t}_{* }=+\infty \,{\rm{or}}\,{t}^{* }<{t}_{* }\\ {t}_{* }&{\rm{otherwise}}.\end{array}\right.$$
(2.4)

In the paper [20], the authors proved the existence of the value function

$$T(x)=\mathop {\rm inf} \limits_{a(\cdot)\in {\mathcal{A}}}\,\mathop {\rm sup} \limits_{b(\cdot)\in {\mathcal{B}}}\tau \left(x,a(\cdot),b(\cdot)\right)=\mathop {\rm sup} \limits_{b(\cdot)\in {\mathcal{B}}}\,\mathop {\rm inf} \limits_{a(\cdot)\in {\mathcal{A}}}\tau \left(x,a(\cdot),b(\cdot)\right)$$

for this game. Here the supremum and infimum are calculated over the classes \({\mathcal{A}}\) and \({\mathcal{B}}\) of all admissible (nonanticipating or feedback) strategies of the first and second players, respectively.

The payoff and value functions are unbounded, which causes some inconvenience for the numerical study of game (2.1), (2.4). Therefore, using Kruzhkov’s transform, the unbounded payoff function is often replaced by the bounded one

$$J\left({x}_{0},a(\cdot),b(\cdot)\right)=\left\{\begin{array}{ll}1-\exp\, \left(-\tau \left({x}_{0},a(\cdot),b(\cdot)\right)\right)&\,{\rm{if}}\,\tau \left({x}_{0},a(\cdot),b(\cdot)\right)<+\infty \\ 1&{\rm{otherwise}}.\end{array}\right.$$
(2.5)

In this case, the value function

$$v(x):=\mathop {\rm inf} \limits_{a(\cdot)\in {\mathcal{A}}}\,\mathop {\rm sup}_{b(\cdot)\in {\mathcal{B}}}J\left(x,a(\cdot),b(\cdot)\right)=\mathop {\rm sup}_{b(\cdot)\in {\mathcal{B}}}\,\mathop {\rm inf} \limits_{a(\cdot)\in {\mathcal{A}}}J\left(x,a(\cdot),b(\cdot)\right)$$
(2.6)

also becomes bounded, with all its values lying in the segment [0, 1].
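
To fix ideas, the game result (2.4) and the Kruzhkov-transformed payoff (2.5) can be evaluated along an Euler-discretized trajectory generated by a pair of feedback strategies. The following is a minimal Python sketch; the time step, the horizon cut-off `t_max`, and the function names (`game_result`, `in_T`, `in_F`) are illustrative assumptions and are not part of the formal statement.

```python
import math
import numpy as np

def game_result(x0, a_fb, b_fb, f, in_T, in_F, h=0.01, t_max=50.0):
    """Euler simulation of (2.1) under feedback strategies a_fb, b_fb.
    Returns the game result tau from (2.4) and the payoff J from (2.5).
    A trajectory that reaches neither set before t_max is treated as never
    reaching the target (a truncation assumption made for this sketch)."""
    x, t = np.asarray(x0, dtype=float), 0.0
    while t < t_max:
        if in_T(x):                       # target reached while staying in G
            return t, 1.0 - math.exp(-t)
        if in_F(x):                       # lifeline reached first: tau = +inf
            return math.inf, 1.0
        x = x + h * f(x, a_fb(x), b_fb(x))
        t += h
    return math.inf, 1.0                  # target not reached within t_max
```

Here `a_fb` and `b_fb` are feedback strategies in the positional formalization mentioned above, i.e., functions mapping a state x to control actions in A and B.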

3 Numerical Scheme

In general, the numerical scheme for this problem is designed and its convergence is established in the same fashion as in the paper [9] for the classical time-optimal problem. The value function of the game is constructed as a generalized (viscosity) solution to the corresponding boundary-value problem for the Hamilton–Jacobi equation (HJE).

3.1 Discrete Scheme

Using time discretization, we replace the continuous-time dynamics by the discrete ones with a time step h > 0:

$${x}_{n}={x}_{n-1}+hf({x}_{n-1},{a}_{n-1},{b}_{n-1}),\quad n=1,\ldots ,N,\quad {x}_{0}\,{\rm{is}}\,\,{\rm{given}},$$

where \({a}_{n}\in A\) and \({b}_{n}\in B\).

Then the value function wh(⋅) of the discrete-time problem can be represented as

$${w}_{h}(x)=\left\{\begin{array}{ll}\gamma \mathop {\rm max} \limits_{b\in B}\,\mathop {\rm min} \limits_{a\in A}\ {w}_{h}\left(z(x,a,b)\right)+1-\gamma &\,{\rm{if}}\,x\in {\mathcal{G}}\\ 0&\,\,{\rm if}\,\,x\in {\mathcal{T}}\\ 1&\,\,{\rm{if}}\,x\in {\mathcal{F}}.\end{array}\right.$$

Here \(\gamma ={e}^{-h}\) and z(x, a, b) = x + hf (x, a, b). This representation is based on the principle of dynamic programming.
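
A brief sketch of where this relation comes from (under the discrete dynamics above): if the players apply constant controls a and b during one step of length h and the remaining game from z(x, a, b) lasts a time τ′, then the Kruzhkov-transformed cost of the whole game satisfies

$$1-{e}^{-(h+\tau ^{\prime})}=\left(1-{e}^{-h}\right)+{e}^{-h}\left(1-{e}^{-\tau ^{\prime}}\right)=(1-\gamma)+\gamma \left(1-{e}^{-\tau ^{\prime}}\right),$$

which is also consistent with the convention for τ′ = +∞ (both sides then equal 1). Optimizing the factor \(1-{e}^{-\tau ^{\prime}}\) over the players’ controls gives exactly the displayed representation of wh.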

Next, we perform spatial discretization with step k. We choose in the space \({{\mathbb{R}}}^{d}\) a grid \({\mathcal{L}}\) with step k that consists of the nodes \({q}_{{i}_{1},\ldots ,{i}_{n}}=({x}_{{i}_{1}},\ldots ,{x}_{{i}_{n}})\), \({i}_{1},\ldots ,{i}_{n}\in {\mathbb{Z}}\), \({x}_{{i}_{j}}=k{i}_{j}\); here and below, n = d denotes the dimension of the state space. (In principle, the steps of this grid in different coordinates may differ, but this feature will not affect the idea of numerical scheme design.) Hereinafter, for the nodes of the grid \({\mathcal{L}}\) a linear indexing qν, \(\nu \in {\mathbb{Z}}\), will be generally used. The symbols \({{\mathcal{L}}}_{{\mathcal{T}}}\), \({{\mathcal{L}}}_{{\mathcal{G}}}\) and \({{\mathcal{L}}}_{{\mathcal{F}}}\) denote the sets of all nodes of the grid \({\mathcal{L}}\) that belong to the sets \({\mathcal{T}}\), \({\mathcal{G}}\) and \({\mathcal{F}}\), respectively. In the theoretical considerations below, the grid is supposed to be infinite.

For any point \(x\in {{\mathbb{R}}}^{d},\) it is possible to find a simplex S(x) with vertices \(\left\{{q}_{l}(x)\right\}\) from \({\mathcal{L}}\) such that the point x belongs to the simplex S(x) and S(x) does not contain any other nodes of the grid. It is assumed that by introducing the grid \({\mathcal{L}},\) we also choose a partition of the entire space into simplexes with vertices located at the nodes of this grid. The barycentric (local) coordinates λl(x) of a point x with respect to the vertices ql(x) of the corresponding simplex S(x) are given by

$$x=\mathop{\sum }\limits_{l=1}^{n+1}{\lambda }_{l}(x){q}_{l}(x),\qquad {\lambda }_{l}(x)\geqslant 0,\quad \mathop{\sum }\limits_{l=1}^{n+1}{\lambda }_{l}(x)=1.$$

Sometimes, the arguments of the coefficients λ and vertices q will be omitted, whenever the context makes it totally clear for which point the local coordinates are constructed.

We replace the function wh(⋅) by a new function w(⋅), whose values w(qν) at the nodes qν of the grid \({\mathcal{L}}\) are given by an infinite-dimensional vector

$$W={\left(w({q}_{\nu })\right)}_{\nu \in {\mathbb{Z}}}.$$

The value w(x) at a point x that is not a node of the grid can be restored by piecewise-linear interpolation based on the local coordinates of x:

$${w}_{loc}(x,W)=\mathop{\sum }\limits_{l=1}^{n+1}{\lambda }_{l}\ w({q}_{l}).$$
(3.1)

Thus, the value function for the completely discretized problem is described by

$$w({q}_{\nu })=\left\{\begin{array}{ll}\gamma \mathop {\rm max }\limits_{b\in B}\,\mathop {\rm min}\limits_{a\in A}{w}_{loc}\left(z({q}_{\nu },a,b),W\right)+1-\gamma &\,{\rm{if}}\,{q}_{\nu }\in {{\mathcal{L}}}_{{\mathcal{G}}}\\ 0&\,{\rm{if}}\,{q}_{\nu }\in {{\mathcal{L}}}_{{\mathcal{T}}}\\ 1&\,{\rm{if}}\,{q}_{\nu }\in {{\mathcal{L}}}_{{\mathcal{F}}}.\end{array}\right.$$

This relation is recursive: the value w(qν) at a node qν depends on the value of the local reconstruction wloc, which in turn depends on the values of the function w(⋅) at the nodes of the grid (possibly including its value at the node qν itself). Such relations are typical of dynamic programming. In what follows, this formula will be used to construct an iterative numerical method for computing the vector W and the function w. In accordance with this relation, for a practical implementation of the method it suffices to store the values of w(⋅) only at the nodes from \({{\mathcal{L}}}_{{\mathcal{G}}}\), i.e., the grid nodes that belong to the set \({\mathcal{G}}\). If the set \({\mathcal{G}}\) is bounded, then \({{\mathcal{L}}}_{{\mathcal{G}}}\) contains just a finite number of nodes and can be stored in computer memory. An illustrative implementation sketch is given below.
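
The relation above translates almost directly into code. Below is a minimal Python sketch of the resulting successive-approximation loop, written for the material point example of Section 5.3 on a deliberately coarse grid. The Kuhn-type triangulation of the grid, the finite sampling of the control sets, the helper names (`barycentric`, `interp`, `solve`, `in_T`, `in_F`), and the treatment of nodes outside the stored window as lifeline nodes are illustrative assumptions rather than details fixed by the paper.

```python
import numpy as np

def barycentric(x, k):
    """Vertices (integer grid multi-indices) of the Kuhn simplex containing x
    and the barycentric weights of x in it; this triangulation is an assumption."""
    x = np.asarray(x, dtype=float)
    base = np.floor(x / k).astype(int)        # grid cell containing x
    t = x / k - base                          # local cell coordinates in [0, 1]^d
    order = np.argsort(-t)                    # coordinates in decreasing order
    verts, v = [base.copy()], base.copy()
    for j in order:                           # walk along the edges of the simplex
        v = v.copy()
        v[j] += 1
        verts.append(v)
    ts = np.concatenate(([1.0], t[order], [0.0]))
    return verts, ts[:-1] - ts[1:]            # weights are nonnegative and sum to 1

def solve(f, A, B, in_T, in_F, nodes, h, k, n_iter):
    """Successive approximations for the fully discretized relation above.
    `nodes` lists the stored grid multi-indices; `A`, `B` are finite samples
    of the control sets; unknown nodes are treated as lifeline nodes (w = 1)."""
    gamma = np.exp(-h)

    def interp(W, z):                         # local reconstruction w_loc(z, W)
        verts, lams = barycentric(z, k)
        return sum(lam * W.get(tuple(v), 1.0) for v, lam in zip(verts, lams))

    def node_value(W, idx):
        x = k * np.asarray(idx, dtype=float)
        if in_T(x):
            return 0.0
        if in_F(x):
            return 1.0
        best = max(min(interp(W, x + h * f(x, a, b)) for a in A) for b in B)
        return gamma * best + 1.0 - gamma

    W = {tuple(idx): 1.0 for idx in nodes}    # any starting vector from M_1
    for _ in range(n_iter):                   # converges since the operator is a contraction
        W = {idx: node_value(W, idx) for idx in W}
    return W

# Illustration: the material point of Section 5.3 on a deliberately coarse grid.
f = lambda x, a, b: np.array([x[1], a])       # dot x = y, dot y = a
A, B = np.linspace(-1.0, 1.0, 3), [0.0]       # the second player is fictitious
in_T = lambda x: max(abs(x[0]), abs(x[1])) <= 0.5   # box with side 1
in_F = lambda x: max(abs(x[0]), abs(x[1])) >= 1.5   # outside the box with side 3
nodes = [(i, j) for i in range(-6, 7) for j in range(-6, 7)]
W = solve(f, A, B, in_T, in_F, nodes, h=0.1, k=0.25, n_iter=100)
```

The operator applied at each node in this sketch is exactly the operator Fs introduced below, and the loop is the method of successive approximations whose convergence is justified by Lemma 3.1.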

For the grid \({\mathcal{L}}={\{{q}_{\nu }\}}_{\nu \in {\mathbb{Z}}}\), denote by \({\mathcal{M}}\) the set of all infinite-dimensional vectors \(W={\left(w({q}_{\nu })\right)}_{\nu \in {\mathbb{Z}}}\). In addition, denote by \({{\mathcal{M}}}_{1}\) those vectors in \({\mathcal{M}}\) whose elements w(qν) satisfy the inequality 0 ⩽ w(qν) ⩽ 1. With the vector \(W={\left(w({q}_{\nu })\right)}_{\nu \in {\mathbb{Z}}}\), for each \(s\in {\mathbb{Z}}\) we define an operator \({F}_{s}:{\mathcal{M}}\to {\mathbb{R}}\) by

$${F}_{s}(W)=\left\{\begin{array}{ll}\gamma \mathop {\rm max }\limits_{b\in B}\,\mathop {\rm min}\limits_{a\in A}\,{w}_{loc}\left(z({q}_{s},a,b),W\right)+1-\gamma &\,{\rm{if}}\,{q}_{s}\in {{\mathcal{L}}}_{{\mathcal{G}}}\\ 0&\,{\rm{if}}\,{q}_{s}\in {{\mathcal{L}}}_{{\mathcal{T}}}\\ 1&\,{\rm{if}}\,{q}_{s}\in {{\mathcal{L}}}_{{\mathcal{F}}}.\end{array}\right.$$

Here \({w}_{loc}:{{\mathbb{R}}}^{d}\times {\mathcal{M}}\to {\mathbb{R}}\) is the local reconstruction (3.1) of the function w(⋅) defined using the vector W. The aggregate of all values of the operators Fs over all indexes s (all nodes qs) actually defines the operator \(F:{\mathcal{M}}\to {\mathcal{M}}\).

In the set \({\mathcal{M}},\) a partial order of element-wise comparison can be introduced:

$${W}_{1}\leqslant {W}_{2}\iff \forall \nu \in {\mathbb{Z}}:\ {w}_{1}({q}_{\nu })\leqslant {w}_{2}({q}_{\nu }).$$

Also in this set, it is reasonable to introduce the norm

$$\parallel W{\parallel }_{\infty }={\rm sup}\left\{\left|w({q}_{\nu })\right|:\nu \in {\mathbb{Z}}\right\}.$$

Now, we prove a lemma on the operator F, which is similar to [9, pp. 124–125, Proposition 2.1].

Lemma 3.1.

The operator \(F:{\mathcal{M}}\to {\mathcal{M}}\) has the following properties.

(1) \(F:{{\mathcal{M}}}_{1}\to {{\mathcal{M}}}_{1}\).

(2) F is monotonic with respect to a partial order in \({\mathcal{M}}\).

(3) F is a contraction mapping with respect to the norm ‖ ⋅ ‖∞.

Proof. In general, the proof repeats the same steps as in [9].

(1) Let \(W\in {{\mathcal{M}}}_{1}\) and \({q}_{s}\in {{\mathcal{L}}}_{{\mathcal{G}}}\). Then

$${F}_{s}(W)=\gamma \mathop {\rm max}\limits_{b\in B}\,\mathop {\rm min}\limits_{a\in A}\,\mathop{\sum }\limits_{m=1}^{n+1}{\lambda }_{m}\left(z({q}_{s},a,b)\right){W}_{m}\left(z({q}_{s},a,b)\right)+1-\gamma .$$

Here Wm(z) is an element of the vector W that corresponds to the node representing the mth vertex of the simplex \(S\left(z({q}_{s},a,b)\right)\).

Since \({\lambda }_{m}\left(z({q}_{s},a,b)\right)\geqslant 0\), \(\sum {\lambda }_{m}\left(z({q}_{s},a,b)\right)=1\) and 0 ⩽ Wm ⩽ 1, it follows that

$$0\leqslant {F}_{s}(W)\leqslant \gamma \mathop {\rm max}\limits_{b\in B}\,\mathop {\rm min}\limits_{a\in A}\,\mathop{\sum }\limits_{m=1}^{n+1}{\lambda }_{m}\left(z({q}_{s},a,b)\right)+1-\gamma =\gamma +1-\gamma =1.$$

If \({q}_{s}\notin {{\mathcal{L}}}_{{\mathcal{G}}}\), then Fs(W) = 0 or Fs(W) = 1; therefore, we obtain \(F:{{\mathcal{M}}}_{1}\to {{\mathcal{M}}}_{1}\).

(2) Let \(U,\,V\in {\mathcal{M}}\) and UV. If \({q}_{s}\in {{\mathcal{L}}}_{{\mathcal{G}}}\), then

$$\begin{array}{c}{F}_{s}(V)-{F}_{s}(U)\,=\,\gamma \mathop {\rm max }\limits_{b\in B}\,\mathop {\rm min}\limits_{a\in A}\,\mathop{\sum }\limits_{m=1}^{n+1}{\lambda }_{m}\left(z({q}_{s},a,b)\right){V}_{m}\left(z({q}_{s},a,b)\right)\\ -\gamma \mathop {\rm max }\limits_{b\in B}\,\mathop {\rm min}\limits_{a\in A}\,\mathop{\sum }\limits_{m=1}^{n+1}{\lambda }_{m}\left(z({q}_{s},a,b)\right){U}_{m}\left(z({q}_{s},a,b)\right).\end{array}$$

We take the control action \(\overline{a}(b)\) of the first player that minimizes the expression for Fs(U) for a fixed value b. Then the first summand does not decrease, because \(\overline{a}(b)\) does not necessarily minimize the expression for Fs(V), while the second summand remains the same. As a result,

$$\begin{array}{c}\gamma \mathop {\rm max }\limits_{b\in B}\,\mathop {\rm min}\limits_{a\in A}\,\mathop {\sum }\limits_{m=1}^{n+1}{\lambda }_{m}\left(z({q}_{s},a,b)\right){V}_{m}\left(z({q}_{s},a,b)\right)-\gamma \mathop {\rm max }\limits_{b\in B}\,\mathop {\rm min}\limits_{a\in A}\,\mathop {\sum }\limits_{m=1}^{n+1}{\lambda }_{m}\left(z({q}_{s},a,b)\right){U}_{m}\left(z({q}_{s},a,b)\right)\\ \leqslant \gamma \mathop {\rm max }\limits_{b\in B}\,\mathop {\sum }\limits_{m=1}^{n+1}{\lambda }_{m}\left(z\left({q}_{s},\overline{a}(b),b\right)\right){V}_{m}\left(z\left({q}_{s},\overline{a}(b),b\right)\right)-\gamma \mathop {\rm max }\limits_{b\in B}\,\mathop{\sum }\limits_{m=1}^{n+1}{\lambda }_{m}\left(z\left({q}_{s},\overline{a}(b),b\right)\right){U}_{m}\left(z\left({q}_{s},\overline{a}(b),b\right)\right).\end{array}$$

Now, we take the control action \(\overline{b}\) of the second player that maximizes the expression of the minuend, that is

$$\overline{b}\in {\rm{Arg}}\,\mathop {\rm {max}} \limits_{b\in B}\,\left[\gamma \mathop{\sum }\limits_{m=1}^{n+1}{\lambda }_{m}\left(z\left({q}_{s},\overline{a}(b),b\right)\right){V}_{m}\left(z\left({q}_{s},\overline{a}(b),b\right)\right)\right].$$

Consequently,

$$\begin{array}{c}\gamma\, \mathop {\max }\limits_{b\in B}\,\mathop{\sum }\limits_{m=1}^{n+1}{\lambda }_{m}\left(z\left({q}_{s},\overline{a}(b),b\right)\right){V}_{m}\left(z\left({q}_{s},\overline{a}(b),b\right)\right)-\gamma\, \mathop {\max }\limits_{b\in B}\,\mathop{\sum }\limits_{m=1}^{n+1}{\lambda }_{m}\left(z\left({q}_{s},\overline{a}(b),b\right)\right){U}_{m}\left(z\left({q}_{s},\overline{a}(b),b\right)\right)\\ \leqslant \gamma \mathop{\sum }\limits_{m=1}^{n+1}{\lambda }_{m}(z({q}_{s},\overline{a}(\overline{b}),\overline{b}))({V}_{m}(z({q}_{s},\overline{a}(\overline{b}),\overline{b}))-{U}_{m}(z({q}_{s},\overline{a}(\overline{b}),\overline{b})))\leqslant 0.\end{array}$$

If \({q}_{s}\in {{\mathcal{L}}}_{{\mathcal{T}}}\) or \({q}_{s}\in {{\mathcal{L}}}_{{\mathcal{F}}}\), then Fs(V) − Fs(U) = 0; therefore, F is a monotonic operator.

(3) Let \(U,\,V\in {\mathcal{M}}\). If \({q}_{s}\in {{\mathcal{L}}}_{{\mathcal{G}}}\), then

$$\begin{array}{c}\left|{F}_{s}(V)-{F}_{s}(U)\right|\leqslant \gamma \mathop{\sum }\limits_{m=1}^{n+1}{\lambda }_{m}(z({q}_{s},\overline{a}(\overline{b}),\overline{b}))\times \left|{V}_{m}(z({q}_{s},\overline{a}(\overline{b}),\overline{b}))-{U}_{m}(z({q}_{s},\overline{a}(\overline{b}),\overline{b}))\right|\\ \leqslant \gamma\, \mathop {\rm max}\limits_{m}\left|{V}_{m}(z({q}_{s},\overline{a}(\overline{b}),\overline{b}))-{U}_{m}(z({q}_{s},\overline{a}(\overline{b}),\overline{b}))\right|\times \mathop{\sum }\limits_{m=1}^{n+1}{\lambda }_{m}(z({q}_{s},\overline{a}(\overline{b}),\overline{b}))\leqslant \gamma \parallel V-U{\parallel }_{\infty }.\end{array}$$

This inequality holds for any \(s\in {\mathbb{Z}}\).

If \({q}_{s}\in {{\mathcal{L}}}_{{\mathcal{T}}}\) or \({q}_{s}\in {{\mathcal{L}}}_{{\mathcal{F}}}\), then Fs(V) − Fs(U) = 0. Hence, it follows that F is a contraction mapping, since \(\gamma ={e}^{-h}<1\). □

Note that items (2) and (3) of Lemma 3.1 hold for F not only as an operator from \({{\mathcal{M}}}_{1}\) into \({{\mathcal{M}}}_{1}\) but also as an operator from \({\mathcal{M}}\) into \({\mathcal{M}}\).

As a corollary of this lemma, we establish the existence of a unique fixed point W of the operator F that determines the function w(⋅) in \({{\mathbb{R}}}^{d}\). This function depends on the discretization steps h (time) and k (space) of the original problem:

$${\bf{w}}(x)=\left\{\begin{array}{ll}\mathop {\sum} \limits_{m}{\lambda }_{m}{\bf{w}}({q}_{m})\quad\quad {\rm{if}}\,x\notin {\mathcal{L}}\,{\rm{and}}\,x=\,\mathop {\sum} \limits_{m}{\lambda }_{m}{q}_{m}\\ \gamma \mathop {\rm max }\limits_{b\in B}\,\mathop {\rm min}\limits_{a\in A}{{\bf{w}}}_{loc}\left(z({q}_{s},a,b),{\bf{W}}\right)+1-\gamma &{\rm{if}}\,{q}_{s}\in {{\mathcal{L}}}_{{\mathcal{G}}}\\ 0&{\rm{if}}\,{q}_{s}\in {{\mathcal{L}}}_{{\mathcal{T}}}\\ 1&{\rm{if}}\,{q}_{s}\in {{\mathcal{L}}}_{{\mathcal{F}}}.\end{array}\right.$$
(3.2)
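
The contraction property also yields an explicit a priori error estimate for the method of successive approximations (a standard consequence of the Banach fixed-point theorem, recorded here for reference): if \({W}_{m+1}=F({W}_{m})\) with an arbitrary \({W}_{0}\in {{\mathcal{M}}}_{1}\), then

$${\left\Vert {W}_{m}-{\bf{W}}\right\Vert }_{\infty }\leqslant \frac{{\gamma }^{m}}{1-\gamma }{\left\Vert {W}_{1}-{W}_{0}\right\Vert }_{\infty },\qquad \gamma ={e}^{-h}<1,$$

so the error of the mth approximation decays geometrically with ratio γ, and in practice the iteration can be stopped once \({\left\Vert {W}_{m+1}-{W}_{m}\right\Vert }_{\infty }\) becomes small.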

3.2 Viscosity Solution to HJE

Consider the following boundary-value problem for the HJE:

$$\begin{array}{c}z+H(x,Dz)=0,\quad x\in {\mathcal{G}},\\ z(x)=0\,\,{\rm{for}}\,\,x\in \partial {\mathcal{T}},\\ z(x)=1\,\,{\rm{for}}\,\,x\in \partial {\mathcal{F}}.\end{array}$$
(3.3)

Here Dz denotes the gradient of the function z. The function H is called the Hamiltonian and, in the case of dynamics (2.1), is given by

$$H(x,p)=\mathop {\rm min}_{a\in A}\,\mathop {\rm max }_{b\in B}\left\langle -f(x,a,b),\,p\right\rangle -1,\quad x\in {\mathcal{G}},\,p\in {{\mathbb{R}}}^{d}.$$
(3.4)

The structure of Eq. (3.3) (in particular, the presence of the term z) is due to the fact that Kruzhkov’s transform has been applied to the value function.
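
Once the control sets are sampled, the Hamiltonian (3.4) can be evaluated directly; below is a minimal Python sketch (the function name `hamiltonian` and the finite samples of A and B are assumptions made for illustration), which can be useful, for instance, when checking the scheme on smooth test functions.

```python
import numpy as np

def hamiltonian(x, p, f, A, B):
    """Hamiltonian (3.4) evaluated on finite samples A and B of the control sets:
    min over a of max over b of <-f(x, a, b), p>, minus 1."""
    return min(max(float(np.dot(-f(x, a, b), p)) for b in B) for a in A) - 1.0

# usage with the material point dynamics of Section 5.3 (fictitious second player)
f = lambda x, a, b: np.array([x[1], a])
print(hamiltonian(np.array([0.3, -0.2]), np.array([1.0, 0.5]), f,
                  A=np.linspace(-1.0, 1.0, 11), B=[0.0]))
```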

Problems of this type are difficult to solve because the existence of a classical solution to a partial differential equation requires sufficient smoothness of the functions involved. For the time-optimal game, however, we can only rely on the continuity of the Hamiltonian, and hence there is no guarantee that the HJE has a classical solution.

For eliminating this obstacle, the concept of a generalized viscosity solution [16] was adopted. In the book [7], an alternative approach to the generalized solution of the HJE was suggested, which was called the generalized minimax solution. Also, in [7] it was demonstrated that the viscosity and minimax solutions coincide with each other at the points of continuity.

The authors proved [4] that the value of game (2.1), (2.5) is a viscosity solution to problem (3.3). The proof relied on the assumption that each of the players has a dynamic advantage at the boundary of a corresponding set:

$$\begin{array}{l}\forall x\in \partial {\mathcal{T}}\,\mathop {\rm min}\limits_{a\in A}\,\mathop {\rm max }\limits_{b\in B}\left\langle {n}_{{\mathcal{T}}}(x),f(x,a,b)\right\rangle <0,\\ \forall x\in \partial {\mathcal{F}}\,\mathop {\rm max }\limits_{b\in B}\,\mathop {\rm min}\limits_{a\in A}\left\langle {n}_{{\mathcal{F}}}(x),f(x,a,b)\right\rangle <0.\end{array}$$
(3.5)

Here \({n}_{{\mathcal{T}}}(x)\) and \({n}_{{\mathcal{F}}}(x)\) are the normal vectors to the boundaries \(\partial {\mathcal{T}}\) and \(\partial {\mathcal{F}}\) of the sets \({\mathcal{T}}\) and \({\mathcal{F}}\), respectively, at a point x; they are directed outside the corresponding set (equivalently, into the set \({\mathcal{G}}\)). These relations have the following sense: if the system appears at the boundary of the set \({\mathcal{T}}\) (\({\mathcal{F}}\)), then the first (second) player has enough dynamical capabilities to guide its trajectory inside the corresponding set independently of the opponent’s actions. Taken together, these assumptions actually imply the continuity of the value function inside the set \({\mathcal{G}}\).

Definition 3.1.

An upper semicontinuous function \(u(\cdot):{\rm{cl}}\Omega \to {\mathbb{R}}\) will be called the lower viscosity solution to Eq. (3.3) in a domain Ω if for any function φ ∈ C1(Ω) and for any point x ∈ Ω at which a local maximum of the function u − φ is achieved, the inequality \(u(x)+H\left(x,D\varphi (x)\right)\leqslant 0\) holds.

Definition 3.2.

A lower semicontinuous function \(u(\cdot):{\rm{cl}}\Omega \to {\mathbb{R}}\) will be called the upper viscosity solution to Eq. (3.3) in a domain Ω if for any function φ ∈ C1(Ω) and for any point x ∈ Ω at which a local minimum of the function u − φ is achieved, the inequality

$$u(x)+H\left(x,D\varphi (x)\right)\geqslant 0$$

holds.

Definition 3.3.

Consider two sequences of real numbers hn > 0 and kn > 0 (discretization steps in time and space). They will be called admissible if hn → 0 and kn/hn → 0 as n → ∞ (for example, \({h}_{n}=1/n\) and \({k}_{n}=1/{n}^{2}\)).

Consider admissible sequences hn > 0 and kn > 0 and a sequence of solutions wn to problem (3.2) that corresponds to them.

The facts below are proved using the weak viscosity limit, which was introduced in [10]. The upper and lower weak viscosity limits of a sequence wn are defined as

$$\begin{array}{l}\mathop {\rm lim\,sup}\limits_{(y,n)\to (x,\infty)}{{\bf{w}}}_{n}(y):=\mathop {\rm lim}\limits_{\delta \to 0+}\,{\rm sup}\left\{{{\bf{w}}}_{n}(y):| x-y| \leqslant \delta ,n\geqslant 1/\delta \right\},\\ \mathop {\rm lim\,inf}\limits_{(y,n)\to (x,\infty)}{{\bf{w}}}_{n}(y):=\mathop {\rm lim}\limits_{\delta \to 0+}\,{\rm inf}\left\{{{\bf{w}}}_{n}(y):| x-y| \leqslant \delta ,n\geqslant 1/\delta \right\}.\end{array}$$
(3.6)

Note that, in particular, condition C.2 is required for the existence of these limits.
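
As a simple illustration of these definitions (not used in the proofs): for the oscillating sequence \({{\bf{w}}}_{n}(y)=\sin (ny)\), which has no pointwise limit, the upper weak viscosity limit equals 1 and the lower one equals −1 at every point x, since for any δ > 0 the values sin(ny) with ∣x − y∣ ⩽ δ and n ⩾ 1/δ fill the whole segment [−1, 1].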

Definition 3.4.

An upper semicontinuous function \(\overline{v}:{\rm{cl}}\Omega \to {\mathbb{R}}\) satisfies at the boundary ∂Ω the inequality \(\overline{v}+H(x,D\overline{v})\leqslant 0\) in the viscosity sense if for any function φ ∈ C1(cl Ω) and any point x ∈ ∂Ω at which the function \(\overline{v}-\varphi \) achieves a local maximum, the inequality \(\overline{v}(x)+H(x,D\varphi (x))\leqslant 0\) holds.

Definition 3.5.

A lower semicontinuous function \(\overline{v}:{\rm{cl}}\Omega \to {\mathbb{R}}\) satisfies at the boundary ∂Ω the inequality \(\overline{v}+H(x,D\overline{v})\geqslant 0\) in the viscosity sense if for any function φ ∈ C1(cl Ω) and any point x ∈ ∂Ω at which the function \(\overline{v}-\varphi \) achieves a local minimum, the inequality

$$\overline{v}(x)+H(x,D\varphi (x))\geqslant 0$$

holds.

4 Convergence of Numerical Scheme

For the time-optimal problem with a lifeline, we formulate and prove a lemma similar to [9, p. 127, Lemma 2.2]. In the original lemma, some calculations were omitted: there was no proof for the upper viscosity solution; the inequalities similar to (4.6) and (4.7) were not established in full; some essential comments were missing (e.g., in the original lemma the domain of the function φ was the closure of the game domain, but then this function was used as if defined on the entire space \({{\mathbb{R}}}^{d}\)). In this paper, all necessary calculations and comments will be restored.

Lemma 4.1.

Consider admissible sequences of real numbers hn > 0 and kn > 0, and let wn be the corresponding sequence of solutions to (3.2). Denote

$$\overline{v}(x):=\mathop {\rm lim sup}_{(y,n)\to (x,\infty)}{{\bf{w}}}_{n}(y),\quad \underline{v}(x):=\mathop {\rm lim inf}_{(y,n)\to (x,\infty)}{{\bf{w}}}_{n}(y).$$

The functions \(\overline{v}\) and \(\underline{v}\) are the lower and upper viscosity solutions, respectively, to the boundary-value problem (3.3) with the boundary conditions

$$\underline{v}\geqslant 0\,\,{at}\,\,\partial {\mathcal{T}},$$
(4.1)
$$\overline{v}\leqslant 0\,{or}\,\overline{v}+H\left(x,D\overline{v}(x)\right)\leqslant 0\,\,{at}\,\,\partial {\mathcal{T}},$$
(4.2)
$$\underline{v}\geqslant 1\,{or}\,\underline{v}+H\left(x,D\underline{v}(x)\right)\geqslant 0\,\,{at}\,\,\partial {\mathcal{F}},$$
(4.3)
$$\overline{v}\leqslant 1\,\,{at}\,\,\partial {\mathcal{F}}.$$
(4.4)

The second inequalities in (4.2) and (4.3) are interpreted in the viscosity sense.

Proof. The boundary conditions (4.1), (4.2) as well as the fact that \(\overline{v}\) is the lower viscosity solution are established by analogy with the paper [9]. The last condition (4.4) obviously follows from the design of the function \(\overline{v}\). Hence, it remains to show that the function \(\underline{v}\) is the upper viscosity solution, and also that the boundary condition (4.3) holds. We will prove these simultaneously (in (4.3), the second inequality).

We take any function \(\varphi \in {C}^{1}({{\mathbb{R}}}^{d})\) and a point y that is a local minimum of the function \(\underline{v}-\varphi \). Although in the definition of an upper viscosity solution the function φ is considered only on the set \({\rm{cl}}\ {\mathcal{G}}\), we define it on the entire space \({{\mathbb{R}}}^{d}\), as will be required below; the restriction of the function φ to the set \({\rm{cl}}\ {\mathcal{G}}\) is a smooth function. Adding a constant to the function φ does not affect the property of the point y; hence, assume \(\varphi (y)=\underline{v}(y)\). The point y either belongs to the set \({\mathcal{G}}\) or lies on \(\partial {\mathcal{F}}\). The case in which the point y belongs to \(\partial {\mathcal{T}}\) need not be considered because it is covered by (4.1). If \(y\in \partial {\mathcal{F}}\) and \(\underline{v}(y)\geqslant 1\), then inequality (4.3) holds; therefore, in what follows, let \(\underline{v}(y)<1\) for \(y\in \partial {\mathcal{F}}\), and \(\underline{v}(y)\leqslant 1\) for \(y\in {\mathcal{G}}\).

It is necessary to prove that

$$\underline{v}(y)+H\left(y,D\varphi (y)\right)\geqslant 0.$$

We select a sequence of points xn such that

$$\mathop {\rm min}_{{\rm{cl}}\left({\mathcal{G}}\cap B(y,1)\right)}({{\bf{w}}}_{n}-\varphi)=({{\bf{w}}}_{n}-\varphi)({x}_{n}).$$

Hereinafter, B(z, r) is a closed ball of radius r centered at a point z. The basic property of weak viscosity limits [8, 11, 17] is the existence of a subsequence (for which we keep the notation xn) such that \({x}_{n}\to y\) and \({{\bf{w}}}_{n}({x}_{n})\to \underline{v}(y)\) as n → ∞. Hence, we may choose ε > 0 so that \(B(y,\varepsilon)\subset {\mathcal{G}}\) if \(y\in {\mathcal{G}}\), or \(\varphi (y^{\prime})<1-\varepsilon \) for any \(y^{\prime} \in B(y,\varepsilon)\) if \(y\in \partial {\mathcal{F}}\). This can always be achieved by an appropriate decrease of ε, since if \(y\in \partial {\mathcal{F}}\), then \(\varphi (y)=\underline{v}(y)<1\). Moreover, we may choose a sufficiently large number n so that the following conditions hold.

(1) \({x}_{n}\in B(y,\varepsilon /3)\) due to the convergence of the sequence xn to the point y as n → ∞.

(2) \(\left|{h}_{n}f({x}_{n},a,b)\right|\leqslant \varepsilon /3\) due to the convergence of hn to 0.

(3) (2 + σ)knε/3, \(\sigma =\max \left\{\left|D\varphi (z)\right|:z\in B(y,1)\right\}\), due to the convergence of kn to 0.

(4) φ(xn) − wn(xn) > − ε due to the assumption \(\varphi (y)=\underline{v}(y)\) (in this case, φ(xn) < wn(xn), since \(\varphi ({x}_{n})<\underline{v}({x}_{n})\) and \(\underline{v}({x}_{n})\leqslant {{\bf{w}}}_{n}({x}_{n})\) in the neighborhood of the point y).

The following calculations are carried out for a fixed number n; therefore, this index will be omitted for hn, kn, wn, and xn.

(1) Let \(y\in {\mathcal{G}}\). The local coordinates of the point x are written in terms of the vertices qs of the corresponding simplex: x = ∑sλsqs. Note that \({q}_{s}\in B(y,\varepsilon)\), as \(x\in B(y,\varepsilon /3)\) and (2 + σ)k ⩽ ε/3. Therefore, \({q}_{s}\in {\mathcal{G}}\), and consequently we have the representation

$${\bf{w}}({q}_{s})=\gamma \mathop {\rm max }_{b\in B}\,\mathop {\rm min}_{a\in A}\,{{\bf{w}}}_{loc}\left(z({q}_{s},a,b),{\bf{W}}\right)+1-\gamma .$$

(2) Let \(y\in \partial {\mathcal{F}}\). Then −ε < φ(x) − w(x) and φ(x) < 1 − ε, whence w(x) < φ(x) + ε < 1. As a result, if x = ∑sλsqs, there exists a node qs such that λs ≠ 0 and w(qs) < 1 (note that \({q}_{s}\notin {{\mathcal{L}}}_{{\mathcal{T}}}\) for sufficiently small ε, since the compact set \({\mathcal{T}}\) lies at a positive distance from \(\partial {\mathcal{F}}\)). Hence, we have the representation

$${\bf{w}}({q}_{s})=\gamma \mathop {\rm max }_{b\in B}\,\mathop {\rm min}_{a\in A}\,{{\bf{w}}}_{loc}\left(z({q}_{s},a,b),{\bf{W}}\right)+1-\gamma .$$

Note that

$${\bf{w}}({q}_{s})=\gamma \mathop {\rm max }_{b\in B}\,\mathop {\rm min}_{a\in A}\,{{\bf{w}}}_{loc}\left(z({q}_{s},a,b),{\bf{W}}\right)+1-\gamma \geqslant \gamma \mathop {\rm min}_{a\in A}\,{{\bf{w}}}_{loc}\left(z({q}_{s},a,b),{\bf{W}}\right)+1-\gamma $$

for any bB. In addition, for any ρ > 0 there exists a point as(ρ) (e.g., the one at which a corresponding minimum is achieved) such that

$$\gamma \mathop {\rm min}_{a\in A}\,{{\bf{w}}}_{loc}\left(z({q}_{s},a,b),{\bf{W}}\right)+1-\gamma >\gamma \,{{\bf{w}}}_{loc}\left(z\left({q}_{s},{a}_{s}(\rho),b\right),{\bf{W}}\right)+1-\gamma -\rho h.$$

Denote

$${z}_{s}(\rho)=z\left({q}_{s},{a}_{s}(\rho),b\right)={q}_{s}+hf\left({q}_{s},{a}_{s}(\rho),b\right).$$

Hence, for any ρ > 0, we obtain

$${\bf{w}}({q}_{s})-\gamma {{\bf{w}}}_{loc}\left({z}_{s}(\rho),{\bf{W}}\right)-(1-\gamma)>-\rho h\quad \forall b\in B.$$
(4.5)

Let

$${z}_{s}=\sum _{p}{\mu }_{p}{q}_{p}.$$

Now, we prove the inequality

$${\bf{w}}(x)-\varphi (x)\leqslant {{\bf{w}}}_{loc}\left({z}_{s}(\rho),{\bf{W}}\right)-\varphi \left({z}_{s}(\rho)\right)+\sigma k+{o}_{1},$$
(4.6)

where \({o}_{1}=o(\left|{z}_{s}(\rho)-{q}_{{p}^{\star }}\right|)\) and \({q}_{{p}^{\star }}\) is a vertex of the simplex \(S\left({z}_{s}(\rho)\right)\) at which the function φ attains its minimum over the vertices of this simplex. Hereinafter, all values o are understood as n → ∞.

If \({z}_{s}(\rho)\in {\rm{cl}}\ {\mathcal{G}}\), then by condition 2 we have zs(ρ) ∈ B(qs, ε/3). Since qs ∈ B(x, ε/3), zs(ρ) ∈ B(x, 2ε/3) ⊂ B(x, ε). As a result, inequality (4.6) holds, because x is a local minimum point of the function w − φ.

Now, let \({z}_{s}(\rho)\notin {\rm{cl}}\ {\mathcal{G}}\). Two cases are possible as follows.

(1) There exists a term in the representation of zs such that μp ≠ 0 and \({q}_{p}\in {\rm{cl}}\ {\mathcal{G}}\). Then, by analogy, we have \({q}_{p}\in B\left({z}_{s}(\rho),\varepsilon /3\right)\), \({z}_{s}(\rho)\in B({q}_{s},\varepsilon /3)\), and \({q}_{s}\in B(x,\varepsilon /3)\). Hence, \({q}_{p}\in B(x,\varepsilon)\), and consequently w(x) − φ(x) ⩽ w(qp) − φ(qp), because x is a local minimum point of w − φ.

(2) For all p such that μp ≠ 0, we have \({q}_{p}\notin {\rm{cl}}\ {\mathcal{G}}\). Recall that φ is defined on the entire space \({{\mathbb{R}}}^{d}\) and also the inequality \(\varphi (y^{\prime})<1-\varepsilon \) holds for any \(y^{\prime} \in B(y,\varepsilon).\) In this case, due to condition 4,

$${\bf{w}}(x)-\varphi (x)<\varepsilon <1-\varphi ({q}_{p})={\bf{w}}({q}_{p})-\varphi ({q}_{p}),$$

since w(qp) = 1 at the node \({q}_{p}\in {\mathcal{F}}\).

Then

$$\begin{array}{c}{\bf{w}}(x)-\varphi (x)\leqslant \mathop {\sum \limits_{p}}{\mu }_{p}\left({\bf{w}}({q}_{p})-\varphi ({q}_{p})\right)=\mathop {\sum \limits_{p}}{\mu }_{p}{\bf{w}}({q}_{p})-\mathop {\sum \limits_{p}}{\mu }_{p}\varphi ({q}_{p})\\ \leqslant {{\bf{w}}}_{loc}\left({z}_{s}(\rho),{\bf{W}}\right)-\mathop {\sum \limits_{p}}{\mu }_{p}\varphi ({q}_{{p}^{\star }})={{\bf{w}}}_{loc}\left({z}_{s}(\rho),{\bf{W}}\right)-\varphi ({q}_{{p}^{\star }}),\end{array}$$

where the index \({p}^{\star }\) has been defined above.

Note that

$$\left|\varphi \left({z}_{s}(\rho)\right)-\varphi ({q}_{{p}^{\star }})\right|\leqslant \sigma \left|{z}_{s}(\rho)-{q}_{{p}^{\star }}\right|+o(\left|{z}_{s}(\rho)-{q}_{{p}^{\star }}\right|)<\sigma k+o(\left|{z}_{s}(\rho)-{q}_{{p}^{\star }}\right|).$$

Consequently, \(-\varphi ({q}_{{p}^{\star }})\leqslant -\varphi \left({z}_{s}(\rho)\right)+\sigma k+{o}_{1}\). Hence, we arrive at inequality (4.6).

Now, we demonstrate that \(\left|{\bf{w}}(x)-{\bf{w}}({q}_{s})\right|\leqslant \sigma k\).

As x and qs belong to one simplex S, the function w is affine on the segment X = [x, qs]. As the function \({\left.({\bf{w}}-\varphi)\right|}_{X}\) achieves minimum at the point x, we obtain

$${\left\Vert {D}_{X}{\bf{w}}\right\Vert }_{C({\mathcal{G}})}=| {D}_{X}{\bf{w}}| =| {D}_{X}\varphi | \leqslant \sigma .$$

The symbol \({D}_{X}g\) denotes the derivative of the restriction of g to the segment X, understood as the derivative of a function of one variable.

Also note that

$$\begin{array}{c}\left|\varphi \left({z}_{s}(\rho)\right)-\varphi \left(x+hf(x,{a}_{s}(\rho),b)\right)\right|\leqslant \sigma \left|{z}_{s}(\rho)-x-hf(x,{a}_{s}(\rho),b)\right|+{o}_{2}\\ \leqslant \sigma \left|{q}_{s}+hf({q}_{s},{a}_{s}(\rho),b)-x-hf(x,{a}_{s}(\rho),b)\right|+{o}_{2}\\ \leqslant \sigma \left(| {q}_{s}-x| +h\left|f({q}_{s},{a}_{s}(\rho),b)-f(x,{a}_{s}(\rho),b)\right|\right)+{o}_{2}\leqslant \sigma (k+hLk)+{o}_{2},\end{array}$$
(4.7)

where \({o}_{2}=o(\left|{z}_{s}(\rho)-x-hf(x,{a}_{s}(\rho),b)\right|)\).

Now, we apply the inequalities derived above to (4.5):

$$-\rho h<{\bf{w}}({q}_{s})-\gamma {{\bf{w}}}_{loc}\left({z}_{s}(\rho),{\bf{W}}\right)-(1-\gamma)$$

(as ∣w(x) − w(qs)∣ ⩽ σk)

$$\leqslant {\bf{w}}(x)-\gamma {{\bf{w}}}_{loc}\left({z}_{s}(\rho),{\bf{W}}\right)-(1-\gamma)+\sigma k$$

(adding and subtracting γw(x))

$$=(1-\gamma){\bf{w}}(x)+\gamma \left({\bf{w}}(x)-\,{{\bf{w}}}_{loc}\left({z}_{s}(\rho),{\bf{W}}\right)\right)-(1-\gamma)+\sigma k$$

(by inequality (4.6))

$$\leqslant (1-\gamma)\,{\bf{w}}\,(x)+\gamma \left(\varphi (x)-\varphi \left({z}_{s}(\rho)\right)\right)-(1-\gamma)+(1+\gamma)\sigma k+\gamma {o}_{1}$$

(by inequality (4.7))

$$\leqslant (1-\gamma){\bf{w}}(x)+\gamma \left(\varphi (x)-\varphi \left(x+hf(x,{a}_{s},b)\right)\right)-(1-\gamma)+(1+2\gamma +\gamma hL)\sigma k+\gamma ({o}_{1}+{o}_{2}),$$

where L is the Lipschitz constant of the function f from condition (2.2).

Since the choice of ρ > 0 is arbitrary,

$$\begin{array}{c}0\leqslant \frac{1-{\gamma }_{n}}{{h}_{n}}{{\bf{w}}}_{n}({x}_{n})+\mathop {\rm min} \limits_{b\in B}\,\mathop {\rm max } \limits_{a\in A}\left\{{\gamma }_{n}\frac{\varphi ({x}_{n})-\varphi \left({x}_{n}+{h}_{n}f({x}_{n},a,b)\right)}{{h}_{n}}-\frac{1-{\gamma }_{n}}{{h}_{n}}\right\}\\ +\sigma \frac{{k}_{n}}{{h}_{n}}(1+2{\gamma }_{n}+{\gamma }_{n}{h}_{n}L)+\frac{{\gamma }_{n}}{{h}_{n}}({o}_{1}+{o}_{2}).\end{array}$$

Letting n → ∞, we obtain \(0\leqslant \underline{v}(y)+H\left(y,D\varphi (y)\right)\). Thereby, we have established relation (4.3) and also the fact that \(\overline{v}\) and \(\underline{v}\) are the lower and upper viscosity solutions, respectively, to problem (3.3) with the boundary conditions (4.1)–(4.4) in the viscosity sense. □

Now, we can prove a theorem on the convergence of this numerical scheme that is similar to [9, pp. 125–129, Theorem 2.3]. Prior to this, note that an auxiliary assertion corresponding to [9, pp. 117–118, Theorem 1.10], but for the time-optimal problem with a lifeline, is proved by analogy, with the set Ω replaced by the set \({\mathcal{G}}\).

Theorem. Suppose that conditions C.1 and C.2 are satisfied. Also, let the value function v (2.6) of game (2.1), (2.5) be continuous on \({\rm{cl}}\ {\mathcal{G}}\). Then wn converges to the function \(v=\overline{v}=\underline{v}\) as n → ∞, uniformly on any compact set \({\mathcal{K}}\subset {{\mathbb{R}}}^{d}\) on which the value function is continuous.

Note that the value function v is continuous under Assumptions (3.5).

Proof. In accordance with Lemma 4.1, the function \(\overline{v}\) is the lower solution to the boundary-value problem (3.3), while the function v is its upper solution due to [9, pp. 115–116, Theorem 1.6], a standard result for boundary-value problems for the HJ equation. Then, using an assertion similar to [9, pp. 117–118, Theorem 1.10], we derive the inequality \(\overline{v}\leqslant v\) on \({\rm{cl}}\ {\mathcal{G}}\). In the same fashion, it is proved that \(v\leqslant \underline{v}\), and because \(\underline{v}\leqslant \overline{v}\), we obtain \(\underline{v}=\overline{v}=v\).

Finally, we check the uniform convergence of wn to v on compact sets. Assume, on the contrary, that there exist ε > 0, a sequence \({n}_{m}\to \infty \), and points \({x}_{m}\in {\mathcal{K}}\) such that \({x}_{m}\to x\) and \(\left|{{\bf{w}}}_{{n}_{m}}({x}_{m})-v({x}_{m})\right|>\varepsilon \).

Generally speaking, the viscosity limits are defined as the supremum and infimum of a function in a contracting neighborhood; see relations (3.6). However, we consider a continuity point of the value function and, consequently, a continuity point of the upper and lower viscosity solutions. Therefore, from limits in the viscosity sense we can pass to classical limits.

In this inequality, let m → ∞; by the continuity of the function v, we obtain \(\left|\overline{v}(x)-v(x)\right|\geqslant \varepsilon \), which is an obvious contradiction. □

5 Numerical Examples

The algorithm described above was implemented by the authors as a Microsoft Windows application using .Net 4.0 and C#. Initially, the application was a single-threaded program; at later stages, it was improved to a multi-threaded one using the C# language functionality to speed up computations on multicore processors.

In this section, the results of calculations for three classical systems are presented. The first example is the homicidal chauffeur game proposed by Isaacs [1], for some set of parameters. The second example considers the Dubins car, the simplest model of the motion of a car, aircraft, or sea vessel. The third example deals with the classical control problem of a material point. Formally, the last two examples are not games, but they can be reduced to a game-theoretic form by introducing an additive (fictitious) disturbance whose values are constrained to a singleton consisting of the zero of the corresponding space.

Also, note that formally these examples do not belong to the class of games for which the convergence of the numerical method has been proved: in all examples below, the optimal result is a discontinuous function because the first of conditions (3.5) (which requires the entire boundary of the terminal set \({\mathcal{T}}\) to be usable) is violated. Moreover, the second of conditions (3.5) is violated at the outer boundary of the set \({\mathcal{G}}\) (on the lifeline \(\partial {\mathcal{F}}\)). Nevertheless, the results obtained by the new algorithm of this paper agree fairly well with those obtained by other methods [21, 22], as a direct comparison shows. Time-optimal problems with a lifeline in the case of a discontinuous optimal result function are the subject of further research.

5.1 Homicidal Chauffeur Game

In the homicidal chauffeur game, the pursuer (a car with a limited radius of turn) tries to capture the evader with the dynamics of simple motions.

The reduced dynamics of the system [1] have the form

$$\begin{array}{c}\dot{x}=-5ya+\frac{1}{2}\sin (b),\\ \dot{y}=5xa+\frac{1}{2}\cos (b)-1,\end{array}$$

where a ∈ [−1, 1] and b ∈ [−π, π]. Here 5 is the constant radius of turn of the first player, while 1/2 is the maximum speed of the second player. The terminal set \({\mathcal{T}}\) is a circle of radius 0.1 with center at the origin. The set \({\mathcal{G}}\) is a square box with sides of length 1 and center at the origin. For numerical modeling, the number of iterations was set equal to n = 300. The discretization steps were h = 0.05 (time) and k = 0.015 (space). The calculations took 7 h 15 min. The contour lines of the value function, reconstructed by the local (piecewise-linear) approximation from the computed values at the nodes of the grid, are shown in Fig. 2. They have some angularity due to the approximation. The contour lines correspond to the values 3.5 × i/200, i = 0, 1, …, 200, of the value function.
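
For completeness, here is how this example plugs into the general scheme of Section 3 (a sketch under the parameters stated above; the function names and the finite sampling of the control sets are assumptions made for illustration):

```python
import numpy as np

# reduced homicidal chauffeur dynamics with the parameters stated above
def f(x, a, b):
    return np.array([-5.0 * x[1] * a + 0.5 * np.sin(b),
                      5.0 * x[0] * a + 0.5 * np.cos(b) - 1.0])

A = np.linspace(-1.0, 1.0, 11)                     # samples of the first player's controls
B = np.linspace(-np.pi, np.pi, 25)                 # samples of the second player's controls
in_T = lambda x: np.hypot(x[0], x[1]) <= 0.1       # target circle of radius 0.1
in_F = lambda x: max(abs(x[0]), abs(x[1])) >= 0.5  # outside the square box with side 1
h, k = 0.05, 0.015                                  # discretization steps used above
# these objects can be passed to a solver such as the illustrative `solve`
# sketch given in Section 3.1
```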

Fig. 2

Homicidal chauffeur game, contour lines of value function.

The white circle in the center of Fig. 2 is the target set \({\mathcal{T}}\), where the value function equals zero (and the contour lines disappear). The white plateau in the lower part of Fig. 2 corresponds to the value +∞ of the value function. In the game with the same dynamics and players’ capabilities but without a lifeline, the evader is caught in finite time from any initial point; in other words, in the game without a lifeline, the value function is less than +∞. The occurrence of infinite values after introducing the lifeline can be explained by the fact that the optimal trajectories emanating from some initial points reach the lifeline \(\partial {\mathcal{F}}\), so the second player wins from these points.

Due to this fact, the following question immediately arises: do the value functions of the games with and without a lifeline, but with the same dynamics, terminal sets, and dynamic capabilities of the players, coincide with each other? The authors have some results concerning this question.

Clearly, the image in Fig. 2 has a certain asymmetry with respect to the vertical axis, although the original problem is symmetric. Most likely, this effect is caused by the asymmetry of the plane’s partition into simplexes (triangles), over which the piecewise-linear approximation of the value function is performed. The basic simplexes were the right triangles obtained by cutting the square grid with the lines parallel to the bisector of the second and fourth quadrants and passing through the vertices of the grid. For obtaining a symmetric result, we should use a grid and an approximation method that respect the symmetry axes of the problem (e.g., a hexagonal grid of equilateral triangles or a bilinear approximation on a square grid). However, as has been proved, the asymmetry decreases as the discretization steps are reduced, since the numerically obtained value function converges to the ideal one.

5.2 Dubins Car

The reduced dynamics of the system are described [1] by

$$\dot{x}=-ya,\quad \dot{y}=xa-1,$$

where a ∈ [ − 1, 1]. The target set is given by \({\mathcal{T}}=\left\{(x,y)\in {{\mathbb{R}}}^{2}:\max \{| x| ,| y| \}\le 0.1\right\}\). The set \({\mathcal{G}}\) is a square box with the sides of length 6 and center at the origin. For numerical modeling, the discretization steps were set equal to h = 0.15 (time) and k = 0.025 (space). The number of iterations was specified as n = 50. In fact, the Dubins car is a control problem, but we treated it as a differential game with a fictitious second player. The contour lines of the optimal result function for the values 7 × i/50, i = 0, 1, …, 50, are demonstrated in Fig. 3.

Fig. 3

Dubins car, contour lines of value function.

The white plateau in the lower part of Fig. 3 corresponds to the value +∞ of the value function; the white square box in the center is the target set.

The differential time-optimal game with a lifeline and a fictitious second player corresponds to a controlled system with state-space constraints. Therefore, the domain with infinite values of the optimal result function appears at the bottom because the system cannot be guided from there to the terminal set along trajectories remaining inside the square box.

5.3 Material Point

This control problem has the elementary linear dynamics given by

$$\dot{x}=y,\quad \dot{y}=a,$$

where a ∈ [−1, 1]. The target set \({\mathcal{T}}\) is a square box with the sides of length 1 and center at the origin. The set \({\mathcal{G}}\) is also a square box, but with the sides of length 3. For numerical modeling, the number of iterations was specified as n = 100. The discretization steps were set equal to h = 0.02 (time) and k = 0.025 (space). The time of calculations was a bit less than 30 min. The contour lines of the optimal result function for the values 3.5 × i/200, i = 0, 1, …, 200, are shown in Fig. 4. The white segments in the lower left-hand and upper right-hand parts of this figure correspond to the value +∞ of the value function; the white square box at the center is the target set.

Fig. 4

Material point, contour lines of value function.

6 Conclusions

In this paper, we have proposed a numerical method to design the value function of a time-optimal differential game with a lifeline as a generalized (viscosity) solution to the corresponding boundary-value problem for the HJ equation, and we have also proven its convergence. In earlier publications, we established the existence of a generalized solution and its coincidence with the value function under very strong assumptions (3.5) on the dynamic advantage of each of the players at the boundary of the corresponding set. As is well known, the simultaneous fulfillment of these two inequalities implies the continuity of the value function; under this assumption, the convergence of the numerical method has been proven. In the future, we expect to establish the existence of a generalized solution and its coincidence with the value function under weaker assumptions, as well as to propose a numerical method converging to the discontinuous value function of a time-optimal game with a lifeline.

7 Funding

This work was supported in part by the Russian Foundation for Basic Research, project no. 18-01-00410.