1 Introduction

The static Hamilton–Jacobi–Bellman (HJB) equation with a value prescribed on the boundary of a region \({\varOmega } \subset {\mathbb {R}}^n\), where the solution is sought on the interior of \({\varOmega }\), arises in a number of optimization problems. Applications include optimal escape from a region [1], area patrol and perimeter surveillance [15], modelling folds in structural geology [17] and reactive fluxes [8].

There are two classes of semi-Lagrangian approximations [19] to the solution of the static HJB equation. These approximations are known as semi-Lagrangian because the solution is approximated along short segments of characteristics that depend on the discretization. Both are solved on a fixed simplicial mesh or grid that discretizes the region of interest. They differ in the manner in which the control is approximated.

In the first approach, the control is assumed to be held constant within an element of a mesh [16]. Non-iterative schemes such as the ordered upwind method (OUM), monotone acceptance ordered upwind method (MAOUM) [2] and fast marching method (FMM) [18] use this approximation. In OUM, MAOUM and FMM, the order in which the solution is found on the vertices of the mesh (or grid) is determined explicitly, much like in Dijkstra's algorithm [11], resulting in a significant speedup in computation despite the coupling between vertices.

In the other semi-Lagrangian approximation, the control is assumed to be held fixed for a small time \(\triangle t\). To determine the solution at a mesh point, a first-order reconstruction from nearby points on the discretization is required. An error bound \(\mathcal {O}(\triangle t)\) has been shown for controls that have bounded variation [6]. Results of higher-order convergence rates using higher-order semi-Lagrangian approximation schemes of this type exist [14]. Many iterative algorithms [5, 10] have been devised that use this approximation.

Convergence rate results exist for the related time-dependent Hamilton–Jacobi equation, where similar half-order convergence is observed in terms of the longest time step (rather than edge length). These results have been proven for grid-like discretizations [9, 23] and have been extended to triangular meshes [4], both using finite difference schemes. In [5], convergence rate results are given using similar schemes that include both time step and spatial discretizations. The proof of the main result in this work draws on similar ideas, such as doubling the variables in an auxiliary function, as in [5] and [13, Chapter 10].

It is proven in this paper that the convergence rate of the approximate solution provided by OUM to the viscosity solution of the static HJB boundary value problem is at least \(\mathcal {O}(\sqrt{h_{max}})\) in terms of maximum error, where \(h_{max}\) is the longest edge length of the mesh. In [21], the OUM was shown to provide an approximate solution to the static HJB equation that converges as \(h_{max} \rightarrow 0\), but no convergence rate was obtained. The proof in this work is based on a similar result for FMM in [20]. The OUM, however, is a different algorithm that solves a wider class of problems, in which the weight (or speed) function can depend on both position and direction and the boundary function can depend on position. The result in [20] is proven on a uniform grid, whereas the result here holds on a simplicial mesh. Simplicial meshes are better suited to discretizing regions with complex geometries; a finer discretization may be required to obtain the same accuracy when the discretization is restricted to grids. A key step in the proof of the OUM convergence rate is showing the existence of a directionally complete stencil that is consistent with the result of OUM, an idea first presented in [2].

The optimal control problem along with an introduction to viscosity solutions will be presented in Sect. 2. In Sect. 3, a general discretization of \({\varOmega } \subset {\mathbb {R}}^n\), known as a simplicial mesh, will be described. The ordered upwind method [21] will be reviewed in Sect. 4. Properties of the OUM algorithm required in the proof of the main result will be presented in Sect. 5. The convergence rate result will be proven in Sect. 6. An example demonstrating numerical convergence close to the proven theoretical rate will be presented in Sect. 7. Conclusions and directions of future work will be discussed in Sect. 8.

2 Problem Formulation

A point is denoted \(\mathbf x \in {\mathbb {R}}^n\) and the Euclidean norm is denoted \(\left\| \cdot \right\| \). The set of positive real numbers is denoted \({\mathbb {R}}_+\). Let \({\varOmega } \subset {\mathbb {R}}^n\) be open, connected, bounded with non-empty interior and boundary \(\partial {\varOmega }\). Let \(\overline{{\varOmega }} = {\varOmega } \cup \partial {\varOmega }\) be the closure of \({\varOmega }\).

Let \(\mathcal {U} = \{\mathbf{u }(\cdot ): {\mathbb {R}}_+\cup \{0\} \rightarrow {\mathbb {S}}^{n-1}\,|\,\mathbf{u }(\cdot ) \text { is measurable}\}\), where \({\mathbb {S}}^{n-1} = \{\mathbf{u } \in {\mathbb {R}}^n \,|\, \left\| \mathbf u \right\| = 1 \}\), be the set of admissible controls, and let the trajectory \(\mathbf y : {\mathbb {R}}_+\cup \{0\} \rightarrow \overline{{\varOmega }}\) be governed by the control \(\mathbf u (\cdot ) \in \mathcal {U}\),

$$\begin{aligned} \dot{\mathbf{y }}(t) = \mathbf u (t), \quad \mathbf y (0) = \mathbf{x }_0, \quad \mathbf{x }_0 \in \overline{{\varOmega }}. \end{aligned}$$
(1)

The control problem is to steer \(\mathbf y (\cdot )\) from \(\mathbf{x }_0 \in \overline{{\varOmega }}\) to any point on the boundary \(\mathbf{x }_f \in \partial {\varOmega }\). The trajectory with initial condition \(\mathbf y (0) = \mathbf{x }_0\) may be written \(\mathbf y _{\mathbf{x }_0}(\cdot )\).

Definition 1

The exit-time \(T: \overline{{\varOmega }} \times \mathcal {U} \rightarrow {\mathbb {R}}_+\cup \{0\}\) is the first time \(\mathbf y _{\mathbf{x }_0}(\cdot )\) reaches \(\mathbf{x }_f \in \partial {\varOmega }\) under the influence of the control \(\mathbf u (\cdot )\),

$$\begin{aligned} T(\mathbf{x }_0, \mathbf u (\cdot )) = \inf \{t | \mathbf y _{\mathbf{x }_0}(t) \in \partial {\varOmega } \}. \end{aligned}$$
(2)

To discuss optimality, a cost is assigned to each control.

Definition 2

The cost function, Cost: \( \overline{{\varOmega }} \times \mathcal {U} \rightarrow {\mathbb {R}}\) is

$$\begin{aligned} \text {Cost}(\mathbf{x }_0,\mathbf u (\cdot )) = \int _0^{T(\mathbf{x }_0,\mathbf u (\cdot ))} g(\mathbf y _{\mathbf{x }_0}(s),\mathbf u (s))ds + q(\mathbf y _{\mathbf{x }_0}(T(\mathbf{x }_0,\mathbf u (\cdot )))),\quad \text {for}\quad \mathbf{x }_0 \in \overline{{\varOmega }} \end{aligned}$$
(3)

where \(q: \partial {\varOmega } \rightarrow {\mathbb {R}}\) is the boundary exit-cost and \(g: \overline{{\varOmega }} \times {\mathbb {S}}^{n-1} \rightarrow {\mathbb {R}}_+\) is the weight.

The optimal control problem is to find a control \(\mathbf u ^*(\cdot )\) that minimizes (3).

Definition 3

The value function \(V: \overline{{\varOmega }} \rightarrow {\mathbb {R}}\) at \(\mathbf x \in \overline{{\varOmega }}\) is the cost associated with the optimal control \(\mathbf u ^*(\cdot )\) for reaching any \(\mathbf{x }_f \in \partial {\varOmega }\) from \(\mathbf x \),

$$\begin{aligned} V(\mathbf x ) = \inf _\mathbf{u (\cdot ) \in \mathcal {U}} \text {Cost}(\mathbf x ,\mathbf u (\cdot )). \end{aligned}$$
(4)

The value function at \(\mathbf x \in \overline{{\varOmega }}\) is the lowest cost to reach \(\partial {\varOmega }\) from \(\mathbf x \). The value function satisfies the continuous Dynamic Programming Principle (DPP).

Theorem 1

(Dynamic Programming Principle [13, Theorem 10.3.1]) For \(h > 0\), \(t \ge 0\), such that \(0 \le t+h \le T(\mathbf{x }_0,\mathbf u ^*(\cdot ))\),

$$\begin{aligned} V(\mathbf y _{\mathbf{x }_0}(t)) = \inf _\mathbf{u (\cdot ) \in \mathcal {U}} \left\{ \int _t^{t+h} g(\mathbf y _{\mathbf{x }_0}(s),\mathbf u (s)) ds + V(\mathbf y _{\mathbf{x }_0}(t+h)) \right\} . \end{aligned}$$
(5)

For V to be continuous on \(\overline{{\varOmega }}\), continuity between V on \({\varOmega }\) and q on \(\partial {\varOmega }\) must be established. Let \(L: \overline{{\varOmega }} \times \overline{{\varOmega }} \rightarrow {\mathbb {R}}\) be

$$\begin{aligned} L(\mathbf{x }_1,\mathbf{x }_2) = \inf _\mathbf{u (\cdot ) \in \mathcal {U}} \left\{ \int _0^{\tau } g(\mathbf y _{\mathbf{x }_1}(s),\mathbf u (s))ds \text { } \Big | \text { }{} \mathbf y _{\mathbf{x }_1}(\tau ) = \mathbf{x }_2, \mathbf y _{\mathbf{x }_1}(t) \in \overline{{\varOmega }}, t \in (0,\tau ) \right\} . \end{aligned}$$
(6)

Definition 4

The exit-cost q is compatible (with the continuity of V) if

$$\begin{aligned} q(\mathbf{x }_1) - q(\mathbf{x }_2) \le L(\mathbf{x }_1, \mathbf{x }_2) \end{aligned}$$
(7)

for all \(\mathbf{x }_1,\mathbf{x }_2 \in \partial {\varOmega }\).

Definition 5

The speed profile of \(g(\mathbf x ,\mathbf u )\) is

$$\begin{aligned} \mathcal {U}_g(\mathbf x ) = \left\{ \frac{t\mathbf u }{g(\mathbf x ,\mathbf u )} \Big | \mathbf u \in {\mathbb {S}}^{n-1} \text { and } t \in [0,1] \right\} . \end{aligned}$$

In \({\mathbb {R}}^2\), the speed profile is the shape centred at \(\mathbf x \) with radius \(1/g(\mathbf x ,\mathbf u )\) at the angle corresponding to the direction \(\mathbf u \).

The optimal control problem (1), (3) will be assumed to satisfy the following:

(P1) :

The boundary function q is compatible with the continuity of V.

(P2) :

There exist constants \(G_{min}, G_{max} \in {\mathbb {R}}_+\) and continuous functions \(g_{min}, g_{max}: \overline{{\varOmega }} \rightarrow {\mathbb {R}}_+\) such that for all \(\mathbf x \in \overline{{\varOmega }}\) and \(\mathbf u \in {\mathbb {S}}^{n-1}\),

$$\begin{aligned} 0 < G_{min} \le g_{min}(\mathbf x ) \le g(\mathbf x , \mathbf u ) \le g_{max}(\mathbf x ) \le G_{max} < \infty . \end{aligned}$$
(8)
(P3) :

There exists \(L_g \in {\mathbb {R}}_+\) such that for \(\mathbf{x }_1, \mathbf{x }_2 \in \overline{{\varOmega }}\) and \(\mathbf u \in {\mathbb {S}}^{n-1}\),

$$\begin{aligned} |g(\mathbf{x }_1,\mathbf u ) - g(\mathbf{x }_2,\mathbf u )| \le L_g\left\| \mathbf{x }_1 - \mathbf{x }_2\right\| . \end{aligned}$$
(9)
(P4) :

For all \(\mathbf{x }_1,\mathbf{x }_2 \in \overline{{\varOmega }}\) and \(\lambda \in (0,1)\), \(\lambda \mathbf{x }_1 + (1-\lambda )\mathbf{x }_2 \in \overline{{\varOmega }}\).

(P5) :

The speed profile \(\mathcal {U}_g(\mathbf x )\) is convex for all \(\mathbf x \in {\varOmega }\).

Assumption (P5) is needed to guarantee uniqueness in the optimizing direction in the approximated problem provided \(\nabla V\) exists [2, 25].

Lemma 1

The boundary function \(q: \partial {\varOmega } \rightarrow {\mathbb {R}}\) is Lipschitz-continuous.

The proof follows from (P1), (P2), and (P4) with Lipschitz constant \(2G_{max}\).

Since q is Lipschitz-continuous on a compact subset of \({\mathbb {R}}^n\), there exist \(q_{min}, q_{max} \in {\mathbb {R}}\) such that

$$\begin{aligned} q_{min} \le q(\mathbf x ) \le q_{max}. \end{aligned}$$
(10)

Define the Hamiltonian \(H: {\varOmega } \times {\mathbb {R}}^n \rightarrow {\mathbb {R}}\)

$$\begin{aligned} H(\mathbf x , \mathbf p ) = -\min _\mathbf{u \in {\mathbb {S}}^{n-1}} \{ \mathbf p \cdot \mathbf u + g(\mathbf x ,\mathbf u ) \}. \end{aligned}$$
(11)

The corresponding static Hamilton–Jacobi–Bellman (HJB) equation which can be derived from a first-order approximation of (5) [25] is

$$\begin{aligned} H(\mathbf x ,\nabla V) = -\min _\mathbf{u \in {\mathbb {S}}^{n-1}} \{(\nabla V(\mathbf x ) \cdot \mathbf u )+ g(\mathbf x ,\mathbf u )\} = 0,&\quad \mathbf x \in {\varOmega }, \nonumber \\ V(\mathbf x ) = q(\mathbf x ),&\quad \mathbf x \in \partial {\varOmega }. \end{aligned}$$
(12)
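To make the derivation concrete (a heuristic sketch; the rigorous argument is in [25]), set \(\mathbf y _{\mathbf{x }_0}(t) = \mathbf x \) in (5), hold the control constant at \(\mathbf u \) over \([t,t+h]\), and expand V to first order:

$$\begin{aligned} V(\mathbf x ) \approx \min _\mathbf{u \in {\mathbb {S}}^{n-1}} \left\{ h\,g(\mathbf x ,\mathbf u ) + V(\mathbf x ) + h\,\nabla V(\mathbf x )\cdot \mathbf u \right\} . \end{aligned}$$

Subtracting \(V(\mathbf x )\) and dividing by h recovers the first line of (12). In the isotropic special case \(g(\mathbf x ,\mathbf u ) = c(\mathbf x )\), the minimum is attained at \(\mathbf u = -\nabla V(\mathbf x )/\left\| \nabla V(\mathbf x )\right\| \), so \(\min _\mathbf{u } \nabla V(\mathbf x )\cdot \mathbf u = -\left\| \nabla V(\mathbf x )\right\| \) and (12) reduces to the eikonal equation \(\left\| \nabla V(\mathbf x )\right\| = c(\mathbf x )\).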

Definition 6

The characteristic direction \(\mathbf u ^*: {\varOmega } \rightarrow {\mathbb {S}}^{n-1}\) at \(\mathbf x \in {\varOmega }\) is an optimizer of (12) at \(\mathbf x \).

Even for smooth \(g(\mathbf x ,\mathbf u )\), \(q(\mathbf x )\) and \(\partial {\varOmega }\), \(\nabla V\) (and hence a unique \(\mathbf u ^*\)) may not exist over all of \({\varOmega }\). The weaker notion of viscosity solutions [5] is used to describe solutions of (12). Let \(C^k({\varOmega })\), \(k\in {\mathbb {N}}\cup \{\infty \}\), denote the space of functions on \({\varOmega }\) that are k-times continuously differentiable.

Definition 7

[5] A function \(\underline{V}: \overline{{\varOmega }} \rightarrow {\mathbb {R}}\) is a viscosity subsolution of (12) if for any \(\phi \in C^\infty ({\varOmega })\),

$$\begin{aligned} H(\mathbf{x }_0, \nabla \phi (\mathbf{x }_0)) \le 0, \end{aligned}$$
(13)

at any local maximum point \(\mathbf{x }_0 \in {\varOmega }\) of \(\underline{V} - \phi \).

Definition 8

[5] A function \(\overline{V}: \overline{{\varOmega }} \rightarrow {\mathbb {R}}\) is a viscosity supersolution of (12) if for any \(\phi \in C^\infty ({\varOmega })\),

$$\begin{aligned} H(\mathbf{x }_0, \nabla \phi (\mathbf{x }_0)) \ge 0, \end{aligned}$$
(14)

at any local minimum point \(\mathbf{x }_0 \in {\varOmega }\) of \(\overline{V} - \phi \).

Definition 9

[5] A viscosity solution of the static HJB (12) is both a viscosity subsolution and a viscosity supersolution of (12).

3 Simplicial Meshes

Viscosity solutions are often difficult to find analytically. The region \(\overline{{\varOmega }}\) will be discretized using a simplicial mesh on which V (4) will be solved approximately.

Definition 10

A set of points \(F = \{\mathbf{x }_0,\ldots ,\mathbf{x }_k \} \subset {\mathbb {R}}^n\) is affinely independent if the vectors \(\{\mathbf{x }_1 - \mathbf{x }_0, \ldots , \mathbf{x }_k - \mathbf{x }_0 \}\) are linearly independent.

Definition 11

A k-simplex (plural k-simplices) \(\mathbf s = \mathbf{x }_0^\mathbf{s }\mathbf{x }_1^\mathbf{s }\ldots \mathbf{x }_k^\mathbf{s }\) is the convex hull of an affinely independent set of points \(F=\{\mathbf{x }_0^{\mathbf{s }},\mathbf{x }_1^{\mathbf{s }},\ldots ,\mathbf{x }_k^{\mathbf{s }} \}\).

Definition 12

Suppose s is a k-simplex defined by the convex hull of F. A face of s is any m-simplex (\(-1\le m \le k\)) forming the convex hull of a subset of F containing \(m+1\) elements.

Definition 13

A simplicial mesh, X is a set of simplices such that

1. Any face of a simplex in X is also in X.

2. The intersection of two simplices \(\mathbf s _1,\mathbf s _2 \in X\) is a face of both \(\mathbf s _1\) and \(\mathbf s _2\).

Definition 14

A k-simplicial mesh is a simplicial mesh where the highest dimension of any simplex in X is k.

Denote \(X_j\), \(0 \le j \le n\) the set of j-simplices of X. Elements of \(X_0\), the 0-simplices of X are denoted \(\mathbf{x }_i\) and known as vertices. Elements of \(X_1\), the 1-simplices of X, are known as edges.

Suppose \(X \subset {\mathbb {R}}^n\) is an n-simplicial mesh. For \(0 \le k \le n\), define

$$\begin{aligned} {\varXi }_k = \left\{ (\zeta _0,\zeta _1,\ldots ,\zeta _k) \in {\mathbb {R}}^{k+1} \Big | \sum _{j=0}^k \zeta _j = 1, \zeta _j \in [0,1] \quad \forall \quad 0 \le j \le k \right\} . \end{aligned}$$
(15)

Definition 15

The barycentric coordinates of \(\mathbf x \in {\mathbb {R}}^n\) belonging to a k-simplex \(\mathbf s \) are the entries of a vector \(\zeta = (\zeta _0,\ldots ,\zeta _{k}) \in {\varXi }_k\) such that \(\mathbf x = \sum _{j=0}^{k} \zeta _j\mathbf{x }_j^\mathbf{s }\).

Definition 16

A closed region \(A \subset {\mathbb {R}}^n\) is contained in an n-simplicial mesh X if for every \(\mathbf x \in A\), there exist \(\mathbf s = \mathbf{x }_0^\mathbf{s }\mathbf{x }_1^\mathbf{s }\ldots \mathbf{x }_n^\mathbf{s }\) and \(\zeta = (\zeta _0,\zeta _1,\ldots ,\zeta _n) \in {\varXi }_{n}\) such that \(\mathbf x = \sum _{j=0}^n \zeta _j \mathbf{x }_j^\mathbf{s }\).

Definition 17

The maximum edge length \(h_{max}\) is the length of the longest edge of X.

Definition 18

Let \(1 \le k\le n\). A neighbour of simplex \(\mathbf{x }_0\mathbf{x }_1\ldots \mathbf{x }_{k-1} \in X_{k-1}\), is a vertex \(\mathbf{x }_k \in X_0\) such that \(\mathbf{x }_0\mathbf{x }_1\ldots \mathbf{x }_k \in X_k\).

Definition 19

The minimum simplex height \(h_{min}\) of X is the shortest perpendicular distance between any \(\mathbf s \in X_{n-1}\) and its neighbours.

If \(n=2\), then \(h_{min}\) is the shortest triangle height. The following assumptions will be made on the (n-simplicial) mesh \(X \subset {\mathbb {R}}^n\) on which the approximation of V in the optimal control problem (1), (3) will be found.

(M1) :

There exists \(M \in {\mathbb {R}}_+\) such that \(1 \le \frac{h_{max}}{h_{min}} \le M\).

(M2) :

The region \(\overline{{\varOmega }}\) is contained (Definition 16) in the mesh X.

(M3) :

The mesh X is bounded and has a finite number of vertices \(X_0\).

The value M is a measure of the worst-case degeneracy for a mesh X. An example of \(\overline{{\varOmega }} \subset {\mathbb {R}}^2\) being contained in a mesh X is shown in Fig. 1. With the discretization definitions and assumptions stated, the OUM will now be presented.
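As a concrete illustration, the constant M in (M1) can be estimated for a given 2-simplicial mesh. The following is a minimal sketch (assuming NumPy and SciPy are available; `mesh_quality` is a hypothetical helper, and for n = 2 the height of a triangle onto an edge equals twice its area divided by that edge's length):

```python
import numpy as np
from scipy.spatial import Delaunay

def mesh_quality(points):
    """Return (h_max, h_min, M) for the Delaunay triangulation of
    `points` in R^2: h_max is the longest edge (Definition 17) and
    h_min the shortest triangle height (Definition 19)."""
    tri = Delaunay(points)
    h_max, h_min = 0.0, np.inf
    for simplex in tri.simplices:
        p = points[simplex]                              # 3 x 2 vertex array
        lengths = [np.linalg.norm(p[i] - p[j]) for i, j in ((0, 1), (1, 2), (0, 2))]
        # Twice the triangle area from the 2x2 determinant.
        area2 = abs((p[1, 0] - p[0, 0]) * (p[2, 1] - p[0, 1])
                    - (p[1, 1] - p[0, 1]) * (p[2, 0] - p[0, 0]))
        h_max = max(h_max, max(lengths))
        # Height onto each edge = (2 * area) / edge length.
        h_min = min(h_min, min(area2 / l for l in lengths))
    return h_max, h_min, h_max / h_min

# Example: a lightly perturbed 11 x 11 grid on the unit square.
rng = np.random.default_rng(0)
pts = np.array([(i / 10, j / 10) for i in range(11) for j in range(11)])
print(mesh_quality(pts + 0.01 * rng.standard_normal(pts.shape)))
```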

Fig. 1: An example of \(\overline{{\varOmega }} \subset {\mathbb {R}}^2\) contained in a 2-simplicial mesh X

4 Review of the Ordered Upwind Method

The OUM [21] is used to find an approximation \({\widetilde{V}}: X_0 \rightarrow {\mathbb {R}}\) of V in (4) on the vertices of an n-simplicial mesh \(X \subset {\mathbb {R}}^n\) satisfying (M1)–(M3).

Each vertex in \(X_0\) is assigned one of the following labels, which are updated throughout the execution of the OUM.

\({\mathbf {Far}}\) :

These vertices have values \({\widetilde{V}}(\mathbf{x }_i) = K\), where K is a large value. Computation of \({\widetilde{V}}\) has not yet started.

\({\mathbf {Considered}}\) :

These vertices have tentative values \({\widetilde{V}} < K\) and are computed using an update formula.

\({\mathbf {Accepted}}\) :

These vertices have finalized values \({\widetilde{V}}\).

At any instant of the algorithm, each vertex in X must be labelled exactly one of Accepted, Considered or Far. Accepted vertices, and the \((n-1)\)-simplices they form, are further classified.

Accepted Front :

The subset of vertices of \(X_0\) with Accepted label that have a neighbour labelled Considered.

AF :

The subset of \(X_{n-1}\) consisting of simplices whose vertices all lie on the Accepted Front and that have a neighbouring vertex labelled Considered.

Definition 20

Let \({\varGamma } = \frac{G_{max}}{G_{min}}\) denote the global anisotropy coefficient where \(G_{min}\) and \(G_{max}\) are described in (8).

\({\mathbf {Near Front}}\) of \(\mathbf{x }_i\) (\(\mathbf NF (\mathbf{x }_i)\)) - Let \(\mathbf{x }_i\) be labelled Considered. Define

$$\begin{aligned} \mathbf NF (\mathbf{x }_i) = \left\{ \mathbf s \in \mathbf AF \,\Big |\, \exists \, {\widetilde{\mathbf{x }}} \in \mathbf s \text { such that } \left\| {\widetilde{\mathbf{x }}} - \mathbf{x }_i \right\| \le {\varGamma } h_{max}\right\} . \end{aligned}$$
(16)

See Fig. 2. The sets \(\mathbf AF \), \(\mathbf NF (\mathbf{x }_i) \subset X_{n-1}\) change throughout the execution of the OUM due to the vertices of X being relabelled from Far to Considered to Accepted.

Fig. 2: OUM labels: an example for \(\overline{{\varOmega }} \subset {\mathbb {R}}^2\). The vertex \(\mathbf{x }_i\) with Considered label is updated from the set of directions provided by \(\mathbf NF (\mathbf{x }_i)\). Vertices labelled Accepted are shaded, including vertices on the edges that make up AF and the Near Front of \(\mathbf{x }_i\), \(\mathbf NF (\mathbf{x }_i)\). Vertices outside \({\varOmega }\) are also labelled Accepted. \(\overline{B}_{{\varGamma } h_{max}}(\mathbf{x }_i)\) is the closed ball with radius \({\varGamma } h_{max}\) and centre \(\mathbf{x }_i\). Vertices labelled Considered are marked with a triangle. Unmarked vertices are labelled Far

Define the discrete set of controls \({\widetilde{\mathcal {U}}}\)

$$\begin{aligned} {\widetilde{\mathcal {U}}} = \left\{ {\widetilde{\mathbf{u }}}(\cdot ) \in \mathcal {U} \Big | {\widetilde{\mathbf{u }}}(t) = \widetilde{\mathbf{u }}_i, {\widetilde{\mathbf{u }}}_i \in {\mathbb {S}}^{n-1} \text { while } \mathbf y (t) \in \mathbf s \in X \right\} . \end{aligned}$$
(17)

The distance between vertex \(\mathbf{x }_i\) and \(\mathbf x \in \mathbf s \in X_{n-1}\), where \(\mathbf x = \sum _{j=0}^{n-1} \zeta _j\mathbf{x }_j^\mathbf{s }\), is denoted \(\tau _\mathbf{s }(\mathbf{x }_i,\zeta )=\left\| \sum _{j=0}^{n-1} \zeta _j\mathbf{x }_j^\mathbf{s } - \mathbf{x }_i \right\| = \left\| \mathbf x -\mathbf{x }_i\right\| \). The direction from \(\mathbf{x }_i\) to \(\mathbf x \) is \(\mathbf{u }_{\mathbf{s }}(\mathbf{x }_i,\zeta )=\frac{\mathbf{x }-\mathbf{x }_i}{\tau _{\mathbf{s }}(\mathbf{x }_i,\zeta )}\). The update for \(\mathbf{x }_i\) provided by \(\mathbf s = \mathbf{x }_0^\mathbf{s }\mathbf{x }_1^\mathbf{s }\ldots \mathbf{x }_{n-1}^\mathbf{s }\) is a first-order approximation of the DPP (5),

$$\begin{aligned} {\widetilde{C}}_\mathbf{s }(\mathbf{x }_i) = \min _{\zeta \in {{\varXi }}_{n-1}} \left\{ \sum _{j = 0}^{n-1} \zeta _j {{\widetilde{V}}}(\mathbf{x }_j^{\mathbf{s }}) + \tau _\mathbf{s }(\mathbf{x }_{i}, \zeta )g(\mathbf{x }_i,\mathbf{u }_\mathbf{s }(\mathbf{x }_i,\zeta )) \right\} , \end{aligned}$$
(18)

where \(\zeta = (\zeta _0,\zeta _1,\ldots ,\zeta _{n-1}) \in {\varXi }_{n-1}\). The optimizing direction is captured by updating \(\mathbf{x }_i\) from its Near Front [21]. The update formula over all of \(\mathbf{NF }(\mathbf{x }_i)\) is

$$\begin{aligned} {\widetilde{C}}(\mathbf{x }_i) = \min _\mathbf{s \in \mathbf NF (\mathbf{x }_i)} {\widetilde{C}}_\mathbf{s }(\mathbf{x }_i). \end{aligned}$$
(19)

Note that the minimizing update along all of \(\mathbf NF (\mathbf{x }_i)\) (19) does not necessarily come from \(\mathbf s \in X_{n-1}\) where \(\mathbf{x }_i\) is a neighbour of \(\mathbf s \).
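For illustration, in \({\mathbb {R}}^2\) the single-simplex update (18) reduces to a one-dimensional minimization over the barycentric coordinate along an edge. A minimal sketch follows (assuming NumPy and SciPy; `update_from_edge` is a hypothetical helper, not code from [21]); the full update (19) is then the minimum of such values over all simplices of \(\mathbf NF (\mathbf{x }_i)\):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def update_from_edge(xi, x0, x1, V0, V1, g):
    """Single-simplex update (18) in R^2: minimize over the barycentric
    coordinate zeta in [0, 1] parameterizing the edge x0-x1. V0 and V1
    are the current values at the endpoints; g(x, u) is the weight.
    Assumes xi does not lie on the edge (so tau > 0)."""
    def objective(zeta):
        x = zeta * x0 + (1.0 - zeta) * x1        # point on the edge
        tau = np.linalg.norm(x - xi)             # distance tau_s(x_i, zeta)
        u = (x - xi) / tau                       # direction u_s(x_i, zeta)
        return zeta * V0 + (1.0 - zeta) * V1 + tau * g(xi, u)
    res = minimize_scalar(objective, bounds=(0.0, 1.0), method="bounded")
    return res.fun

# Example with an anisotropic weight: travel is cheaper along the x-axis.
g = lambda x, u: 1.0 + 0.5 * abs(u[1])
xi = np.array([0.0, 0.0])
print(update_from_edge(xi, np.array([1.0, 0.0]), np.array([1.0, 1.0]),
                       0.3, 0.6, g))
```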

The algorithm can now be stated. Recall that any vertex \(\mathbf{x }_i \in X_0\) is labelled only one of Accepted, Considered or Far at any instant of the algorithm.

1. Label all vertices \(\mathbf{x }_i \in X_0\) Far, assigning \({\widetilde{V}}(\mathbf{x }_i) = K\) (where K is large).

2. For each vertex \(\mathbf{x }_i \in X_0 \cap {\varOmega }^c\), relabel \(\mathbf{x }_i\) Accepted, and set \({\widetilde{V}}(\mathbf{x }_i) = q(\hat{\mathbf{x }})\) where \(\hat{\mathbf{x }} = \hbox {arg min}_{{\widetilde{\mathbf{x }}} \in \partial {\varOmega }}\left\| \mathbf{x }_i-{\widetilde{\mathbf{x }}}\right\| \).

3. Relabel all neighbours of Accepted vertices \(\mathbf{x }_i\) that have Far label, to Considered. For these vertices, compute \({\widetilde{V}}(\mathbf{x }_i) = {\widetilde{C}}(\mathbf{x }_i)\) according to (19).

4. Relabel the vertex \(\overline{\mathbf{x }}_i\) with Considered label and lowest value \({\widetilde{V}}(\overline{\mathbf{x }}_i)\) as \(\textit{Accepted}\). If all vertices in X are labelled Accepted, terminate the algorithm.

5. Relabel all neighbouring vertices \(\mathbf{x }_i\) of \(\overline{\mathbf{x }}_i\) with Far label to Considered. For these vertices, compute \({\widetilde{C}}(\mathbf{x }_i)\) using (19) and set \({\widetilde{V}}(\mathbf{x }_i) = {\widetilde{C}}(\mathbf{x }_i)\).

6. Recompute \(\widetilde{C}(\mathbf{x }_i)\) using (19) for all other \(\mathbf{x }_i\) with Considered label such that \(\overline{\mathbf{x }}_i \in \mathbf NF (\mathbf{x }_i)\), using only \(\mathbf s \in \mathbf NF (\mathbf{x }_i)\) such that \(\overline{\mathbf{x }}_i \in \mathbf s \). If \(\widetilde{V}(\mathbf{x }_i) > \widetilde{C}(\mathbf{x }_i)\), then update \(\widetilde{V}(\mathbf{x }_i) = \widetilde{C}(\mathbf{x }_i)\). Go to Step 4.
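The steps above follow the label-setting pattern of Dijkstra's algorithm. A skeletal sketch of the loop is given below (Python with a binary heap; the callbacks `near_front`, `update` and `boundary_value` are hypothetical stand-ins for (16), (19) and Step 2, and Step 6 is simplified to recompute every Considered vertex rather than only those whose Near Front contains the newly accepted vertex):

```python
import heapq

def oum(vertices, neighbors, near_front, update, boundary_value, K=1e12):
    """Skeleton of the OUM label-setting loop (Steps 1-6).
    `near_front(i, accepted)` returns the simplices of NF(x_i);
    `update(i, simplices, V)` evaluates (19); `boundary_value(i)`
    returns q at the nearest boundary point, or None for interior
    vertices."""
    V = {i: K for i in vertices}                     # Step 1: all Far
    accepted, considered, heap = set(), set(), []
    for i in vertices:                               # Step 2: exit-cost
        q = boundary_value(i)
        if q is not None:
            V[i] = q
            accepted.add(i)
    for i in accepted:                               # Step 3
        for j in neighbors[i]:
            if j not in accepted and j not in considered:
                considered.add(j)
                V[j] = update(j, near_front(j, accepted), V)
                heapq.heappush(heap, (V[j], j))
    while heap:
        v, i = heapq.heappop(heap)                   # Step 4: lowest value
        if i in accepted or v > V[i]:
            continue                                 # stale heap entry
        accepted.add(i)
        for j in neighbors[i]:                       # Step 5: Far -> Considered
            if j not in accepted:
                considered.add(j)
        for j in considered - accepted:              # Steps 5-6: recompute
            c = update(j, near_front(j, accepted), V)
            if c < V[j]:
                V[j] = c
                heapq.heappush(heap, (c, j))
    return V
```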

The domain of \(\widetilde{V}\) will be extended from \(X_0\) to all of X. Define

$$\begin{aligned} \overline{{\varOmega }}_X = \left\{ \bigcup _\mathbf{s \in X_n} \bigcup _{\zeta \in {\varXi }_{n}} \sum _{j=0}^n \zeta _j\mathbf{x }_j^\mathbf{s } \right\} . \end{aligned}$$
(20)

From (M2), \(\overline{{\varOmega }} \subseteq \overline{{\varOmega }}_X\).

The domains of the spatial argument of the value function V and of g (and, as a result, of H) are extended from \(\overline{{\varOmega }}\) to \(\overline{{\varOmega }}_X\). For \(\mathbf x \in \overline{{\varOmega }}^c \cap \overline{{\varOmega }}_X\), let

$$\begin{aligned} \hat{\mathbf{x }} = \mathop {\hbox {arg min}}\limits _{\widetilde{\mathbf{x }}\in \partial {\varOmega }} \left\| \mathbf x - \widetilde{\mathbf{x }}\right\| , V(\mathbf x ) = q(\hat{\mathbf{x }}), \text { and } g(\mathbf x ,\mathbf u ) = g(\hat{\mathbf{x }},\mathbf u ). \end{aligned}$$

The domain of \(\widetilde{V}\) is extended from \(X_0\) to \(\overline{{\varOmega }}_X\) by linear interpolation using barycentric coordinates. For \(\mathbf x \in \mathbf s = \mathbf{x }_0^\mathbf{s }\mathbf{x }_1^\mathbf{s }\ldots \mathbf{x }_n^\mathbf{s } \in X_n\),

$$\begin{aligned} \widetilde{V}(\mathbf x ) = \sum _{j=0}^n \zeta _j \widetilde{V}(\mathbf{x }_j^\mathbf{s }),\quad \text {where}\quad \mathbf x = \sum _{j=0}^n \zeta _j\mathbf{x }_j^\mathbf{s }. \end{aligned}$$
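The barycentric coordinates themselves are obtained by solving an \((n+1)\times (n+1)\) linear system. A minimal sketch of this interpolation follows (NumPy; `barycentric` and `interpolate` are hypothetical helper names):

```python
import numpy as np

def barycentric(simplex, x):
    """Barycentric coordinates zeta of x with respect to an n-simplex
    whose n+1 vertices are the rows of `simplex`: solves
    sum_j zeta_j x_j = x together with sum_j zeta_j = 1."""
    n = len(x)
    A = np.vstack([simplex.T, np.ones(n + 1)])   # (n+1) x (n+1) system
    return np.linalg.solve(A, np.append(x, 1.0))

def interpolate(simplex, values, x):
    """Linear interpolation of vertex values, as in the extension of
    V-tilde from X_0 to the whole mesh region."""
    return barycentric(simplex, x) @ values

tri = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
vals = np.array([0.0, 1.0, 2.0])
print(interpolate(tri, vals, np.array([0.25, 0.25])))   # 0.75
```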

Most of the effort in the implementation of the OUM occurs in the maintenance and searching of \(\mathbf AF \) and \(\mathbf NF (\mathbf{x }_i)\). The focus of this paper, however, is on the accuracy of the approximation and its convergence to the true solution in relation to properties of the discretization. Additional discussion on the implementation and computational complexity of OUM can be found in [21].

5 Properties of the Approximated Value Function and Numerical Hamiltonian

An approximation of the Hamiltonian H (11) known as the numerical Hamiltonian will be defined on the vertices \(X_0\) of X. A similar numerical Hamiltonian was proposed in [2]. As in [2], the numerical Hamiltonian will be shown to be both monotonic and consistent with the Hamiltonian (11). The consistency statement here resembles that in [20], which was given as an assumption for the half-order convergence proof for FMM. The proof of consistency relies on directional completeness introduced in [2].

Consider the OUM algorithm at the instant the vertex \(\mathbf{x }_i \in X_0 \cap {\varOmega }\) is about to be relabelled \(\textit{Accepted}\). The Near Front of \(\mathbf{x }_i\) at this instant is denoted \(\overline{\mathbf{NF }}(\mathbf{x }_i)\).

Definition 21

The approximated characteristic direction \(\widetilde{\mathbf{u }}_{\widetilde{\mathbf{s }}}^*: X_0 \cap {\varOmega } \times {\varXi }_{n-1} \rightarrow {\mathbb {S}}^{n-1}\) at \(\mathbf{x }_i \in X_0 \cap {\varOmega }\) from the OUM algorithm is

$$\begin{aligned} \widetilde{\mathbf{u }}_{\widetilde{\mathbf{s }}}^*(\mathbf{x }_i,\widetilde{\zeta }^*) =\frac{\widetilde{\mathbf{x }}^*-\mathbf{x }_i}{\left\| \widetilde{\mathbf{x }}^*-\mathbf{x }_i\right\| }=\frac{\sum _{j=0}^{n-1}\widetilde{\zeta }_j^*\mathbf{x }_j^{\widetilde{\mathbf{s }}^*}-\mathbf{x }_i}{\tau _{\widetilde{\mathbf{s }}^*}(\mathbf{x }_i,\widetilde{\zeta }^*)} \quad \text {where}\quad \widetilde{\mathbf{x }}^*=\sum _{j=0}^{n-1}\widetilde{\zeta }_j^*\mathbf{x }_j^{\widetilde{\mathbf{s }}^*} \end{aligned}$$
(21)

where \(\widetilde{\mathbf{s }}^* \in \overline{\mathbf{NF }}(\mathbf{x }_i)\) and \(\widetilde{\zeta }^* \in {\varXi }_{n-1}\) are the minimizers of (18), (19) when \(\mathbf{x }_i\) is labelled Accepted.

Definition 22

Let \(\phi : X_0 \cap {\varOmega } \rightarrow {\mathbb {R}}\). The numerical Hamiltonian \(\widetilde{H}:X_0 \cap {\varOmega } \times {\mathbb {R}} \rightarrow {\mathbb {R}}\) is

$$\begin{aligned} \widetilde{H}[\mathcal {S},\phi [\mathcal {S}]](\mathbf{x }_i,\mu ) = - \min _\mathbf{s \in \mathcal {S}} \min _{\zeta \in {\varXi }_{n-1}} \left\{ \frac{\sum _{j=0}^{n-1} \zeta _j \phi (\mathbf{x }_j^\mathbf{s }) - \mu }{\tau _\mathbf{s }(\mathbf{x }_i,\zeta )} + g(\mathbf{x }_i,\mathbf u _\mathbf{s }(\mathbf{x }_i,\zeta )) \right\} , \end{aligned}$$
(22)

where \(\mathcal {S}\subset X_{n-1}\).

The argument \(\phi [\mathcal {S}]\) of \(\widetilde{H}\) denotes the use of the values of \(\phi \) on the vertices that make up the \((n-1)\)-simplices of \(\mathcal {S}\) in the optimization of (22). For notational brevity, the argument of \(\phi \) will be dropped.

The numerical HJB equation for the OUM algorithm for all \(\mathbf{x }_i \in X_0 \cap {\varOmega }\) is

$$\begin{aligned} \widetilde{H}[\overline{\mathbf{NF }}(\mathbf{x }_i), \widetilde{V}](\mathbf{x }_i,\widetilde{V}(\mathbf{x }_i)) = 0. \end{aligned}$$
(23)

Theorem 2

[1, Prop 5.3] Let \(\mathcal {S} \subset X_{n-1}\). The solution \(\mu \) to \(\widetilde{H}[\mathcal {S},\widetilde{V}](\mathbf{x }_i,\mu ) = 0\) with \(\widetilde{H}\) defined by (22) is unique, and is given by

$$\begin{aligned} \mu = \min _\mathbf{s \in \mathcal {S}} \min _{\zeta \in {\varXi }_{n-1}} \left\{ \sum _{j=0}^{n-1} \zeta _j \widetilde{V}(\mathbf{x }_j^\mathbf s ) + \tau _\mathbf{s }(\mathbf{x }_i, \zeta )g(\mathbf{x }_i,\mathbf u _\mathbf{s }(\mathbf{x }_i,\zeta ))\right\} . \end{aligned}$$
(24)

Furthermore, if \(\widetilde{\mathbf{s }}^* \in \mathcal {S}\) and \(\widetilde{\zeta }^* \in {\varXi }_{n-1}\) are the minimizers in (22), then \(\widetilde{\mathbf{s }}^*\) and \(\widetilde{\zeta }^*\) also minimize (24).

From Theorem 2, finding the solution \(\widetilde{V}(\mathbf{x }_i)\) to (23) is equivalent to solving the update (19) in the OUM algorithm for \(\mathcal {S}=\overline{\mathbf{NF }}(\mathbf{x }_i)\).
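A short justification, reconstructed here from (22) rather than quoted from [1]: for fixed \(\mathbf s \) and \(\zeta \), the inner expression in (22) is strictly decreasing in \(\mu \) (since \(\tau _\mathbf{s }(\mathbf{x }_i,\zeta ) > 0\)), so \(\widetilde{H}[\mathcal {S},\widetilde{V}](\mathbf{x }_i,\cdot )\) is strictly increasing and has at most one root. Substituting \(\mu \) from (24) makes every inner term non-negative,

$$\begin{aligned} \frac{\sum _{j=0}^{n-1} \zeta _j \widetilde{V}(\mathbf{x }_j^\mathbf{s }) - \mu }{\tau _\mathbf{s }(\mathbf{x }_i,\zeta )} + g(\mathbf{x }_i,\mathbf u _\mathbf{s }(\mathbf{x }_i,\zeta )) \ge 0, \end{aligned}$$

with equality at the minimizing pair \((\widetilde{\mathbf{s }}^*, \widetilde{\zeta }^*)\), so the minimum in (22) is exactly zero and \(\mu \) is the root.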

Definition 23

[2, Section 2.2] The set \(\mathcal {S} \subseteq X_{n-1}\) is directionally complete for a vertex \(\mathbf{x }_i \in X_0\) if for all \(\mathbf u \in {\mathbb {S}}^{n-1}\) there exists \(\mathbf x \in \mathbf s \) where \(\mathbf s \in \mathcal {S}\) such that

$$\begin{aligned} \mathbf{u } = \frac{\mathbf{x } - \mathbf{x }_i}{\left\| \mathbf{x }-\mathbf{x }_i\right\| }. \end{aligned}$$

A subset \(A \subset {\mathbb {R}}^n\) has no holes if its complement \(A^c\) is connected.

Lemma 2

Prior to each instance of Step 4 of the OUM algorithm, the \((n-1)\)-simplices of AF form the boundaries \(\mathbf AF _k\) of j (\(1 \le k \le j<\infty \)) bounded open subsets \({\varOmega }_{\mathbf{AF }_k} \subset \overline{{\varOmega }}_X\), such that each \({\varOmega }_{\mathbf{AF }_k}^c\) is connected and \(\bigcup _{k=1}^j \mathbf AF _k = \mathbf AF \).

Fig. 3: Lemma 2: three cases in \({\mathbb {R}}^2\). a After relabelling \(\mathbf{x }_i\) Accepted, \(\mathbf AF _3\) is no longer part of \(\mathbf AF \); b after relabelling \(\mathbf{x }_i\) Accepted, the other vertices in the interior of \(\mathbf AF _1\) are still not yet Accepted; and c relabelling \(\mathbf{x }_i\) Accepted splits \(\mathbf AF _1\) into two regions, each only containing not yet Accepted vertices in their interiors

Furthermore, if \(\mathbf{x }_m \in X_0 \cap {\varOmega }_\mathbf{AF _k}\), then

1. the set of \((n-1)\)-simplices \(\mathbf AF _k\) is directionally complete for \(\mathbf{x }_m\), and

2. \(\mathbf{x }_m\) is not labelled Accepted.

Proof

At the initialization (Steps 1–3) of the OUM algorithm, only vertices in \(X_0 \cap {\varOmega }^c\) are labelled Accepted. From (M2) and (P4), \(j=1\) and \(\mathbf AF _1 = \mathbf AF \) form a single boundary that encloses \({\varOmega }_\mathbf{AF _1} \supseteq {\varOmega } \). The lemma is satisfied in the first instance of Step 4.

The Accepted Front and AF change only in Step 4 of the OUM. Proof by induction will be used: the lemma is assumed to hold prior to Step 4 of the OUM. Let \(\mathbf{x }_i \in X_0 \cap {\varOmega }_{\mathbf{AF }_k}\) be the vertex to be relabelled Accepted for some \(1 \le k \le j\). Only \(\mathbf AF _k\) and \({\varOmega }_{\mathbf{AF }_k}\) may change; the regions \({\varOmega }_{\mathbf{AF }_l}\), \(l \ne k\), remain unchanged.

If \(\mathbf{x }_i\) has no neighbours in \(X_0 \cap {\varOmega }_\mathbf{AF _k}\), then the resulting \({\varOmega }_\mathbf{AF _k}\) and \(X_0\cap {\varOmega }_\mathbf{AF _k}\) are both empty. See Fig. 3a.

If \(\mathbf{x }_i\) has a neighbour in \(X_0 \cap {\varOmega }_{\mathbf{AF }_k}\), then \(\mathbf{x }_i\) is added to the Accepted Front. If \({\varOmega }_{\mathbf{AF }_k}\) remains a single open connected subset of \({\mathbb {R}}^n\), then for every \(\mathbf{x }_m \in X_0\cap {{\varOmega }}_{\mathbf{AF }_k}\backslash \{\mathbf{x }_i\}\), \(\mathbf{AF }_k\) remains directionally complete for \(\mathbf{x }_m\) and \(\mathbf{x }_m\) is not labelled Accepted. See Fig. 3b.

Otherwise, \({\varOmega }_{\mathbf{AF }_k}\) is no longer a single open connected subset of \({\mathbb {R}}^n\). Thus, \({\varOmega }_{\mathbf{AF }_k}\) has been split into \(p \ge 2\) non-intersecting open connected regions \({\varOmega }_{\mathbf{AF }_{k1}},{\varOmega }_{\mathbf{AF }_{k2}},\ldots ,{\varOmega }_{\mathbf{AF }_{kp}}\) with a subset of the resultant \(\mathbf AF _k\) as the boundary of each. Vertices \(\mathbf{x }_m \in X_0\cap {\varOmega }_{\mathbf{AF }_k}\backslash \{\mathbf{x }_i\}\) are still not labelled Accepted, and \(\mathbf AF _{kl}\) is directionally complete for \(\mathbf{x }_m \in {\varOmega }_{\mathbf{AF }_{kl}}\). See Fig. 3c. \(\square \)

Definition 24

For every \(\mathbf{x }_i \in X_0\cap {\varOmega }\), let \(S(\mathbf{x }_i) \subset X_{n-1}\) be a set such that

1. \(\overline{\mathbf{NF }}(\mathbf{x }_i) \subseteq S(\mathbf{x }_i)\),

2. \(S(\mathbf{x }_i)\) is directionally complete for \(\mathbf{x }_i\),

3. for all \(\mathbf s \in S(\mathbf{x }_i)\), if a point \(\mathbf x \in \mathbf s \), then

   $$\begin{aligned} \left\| \mathbf x - \mathbf{x }_i\right\| \le (2{\varGamma } + 1)h_{max}, \end{aligned}$$

4. \(\widetilde{H}[S(\mathbf{x }_i),\widetilde{V}](\mathbf{x }_i,\widetilde{V}(\mathbf{x }_i)) = \widetilde{H}[\overline{\mathbf{NF }}(\mathbf{x }_i),\widetilde{V}](\mathbf{x }_i,\widetilde{V}(\mathbf{x }_i))\).

Such \(S(\mathbf{x }_i)\) will now be constructed for all \(\mathbf{x }_i \in X_0 \cap {\varOmega }\) and shown to satisfy Definition 24. For \(r \in {\mathbb {R}}_+\), let \(\overline{B}_r(\mathbf x ) = \{\widetilde{\mathbf{x }} \in {\mathbb {R}}^n \,|\, \left\| \mathbf x - \widetilde{\mathbf{x }}\right\| \le r \}\).

Definition 25

Assume the OUM algorithm is at the instant that vertex \(\mathbf{x }_i\) labelled Considered is about to be relabelled Accepted. Let \(\overline{\mathbf{AF }}(\mathbf{x }_i)\) be the subset \(\mathbf AF _k\) of \(\mathbf AF \) described in Lemma 2 whose region \({\varOmega }_{\mathbf{AF }_k}\) contains \(\mathbf{x }_i\).

Two cases are considered.

Case 1

The set \(\overline{\mathbf{AF }}(\mathbf{x }_i)\) lies in the interior of \(\overline{B}_{2{\varGamma } h_{max}}(\mathbf{x }_i)\), where \(h_{max}\) and \({\varGamma }\) have been defined in Definitions 17 and 20 respectively. Let

$$\begin{aligned} S(\mathbf{x }_i) = \overline{\mathbf{AF }}(\mathbf{x }_i)\cup \overline{\mathbf{NF }}(\mathbf{x }_i). \end{aligned}$$

Case 2

Otherwise, let \(R(\mathbf{x }_i)\) be the region described by the smallest subset of \(X_n\) in which \(\overline{{\varOmega }}_X \cap \overline{B}_{2{\varGamma } h_{max}}(\mathbf{x }_i)\) is contained, and \(\partial R(\mathbf{x }_i)\) its boundary.

Let \(S_{\overline{\mathbf{AF }}R}(\mathbf{x }_i) \subset X_{n-1}\) form the boundary of the compact region \(\overline{{\varOmega }}_{\overline{\mathbf{AF }}(\mathbf{x }_i)} \cap R(\mathbf{x }_i)\). Finally for Case 2,

$$\begin{aligned} S(\mathbf{x }_i) = S_{\overline{\mathbf{AF }}R}(\mathbf{x }_i) \cup \overline{\mathbf{NF }}(\mathbf{x }_i). \end{aligned}$$
(25)

See Fig. 4.

Fig. 4: \(S(\mathbf{x }_i)\) in \({\mathbb {R}}^2\). Left: edges of \(\overline{\mathbf{NF }}(\mathbf{x }_i)\), \(\overline{\mathbf{AF }}(\mathbf{x }_i)\) and \(\partial R(\mathbf{x }_i)\) are shown. Right: \(S(\mathbf{x }_i)\) is the union of \(\overline{\mathbf{NF }}(\mathbf{x }_i)\) with the boundary of the intersection of the regions \(R(\mathbf{x }_i)\) and \(\overline{\mathbf{AF }}(\mathbf{x }_i)\). Vertices strictly inside \(S(\mathbf{x }_i)\) are not labelled Accepted

In both cases, the union with \(\overline{\mathbf{NF }}(\mathbf{x }_i)\) ensures that simplices \(\mathbf s \in \overline{\mathbf{NF }}(\mathbf{x }_i) \backslash \overline{\mathbf{AF }}(\mathbf{x }_i)\) are still included in \(S(\mathbf{x }_i)\), just as in OUM.

By construction, \(S(\mathbf{x }_i)\) satisfies the first three properties of Definition 24. It remains to show Property 4 in Definition 24 is satisfied.

For \(\mathbf{x }_i \in X_0\cap {\varOmega }\), let \(\widetilde{V}_{min}^{\mathbf{AF }_{\mathbf{x }_i}}\) be the minimum value of \(\widetilde{V}\) on the Accepted Front AF just before \(\mathbf{x }_i\) is labelled Accepted.

Lemma 3

[21, Lemma 7.3(i) and (iii)] Assume the vertex \(\mathbf{x }_i \in X_0\) is about to be labelled Accepted. Then

1. \(\widetilde{V}_{min}^{\mathbf{AF }_{\mathbf{x }_i}} + h_{min}G_{min} \le \widetilde{V}(\mathbf{x }_i) \le \widetilde{V}_{min}^{\mathbf{AF }_{\mathbf{x }_i}} + h_{max}G_{max}.\)

2. If \(\mathbf{x }_i\) is labelled Accepted before \(\mathbf{x }_j\), then \(\widetilde{V}_{min}^{\mathbf{AF }_{\mathbf{x }_i}} \le \widetilde{V}_{min}^{\mathbf{AF }_{\mathbf{x }_j}}\).

Lemma 4

Let \(\widetilde{\mathbf{x }} = \sum _{j=0}^{n-1} \zeta _j\mathbf{x }_j^\mathbf{s }\) where \(\mathbf s = \mathbf{x }_0^\mathbf{s }\mathbf{x }_1^\mathbf{s }\ldots \mathbf{x }_{n-1}^\mathbf{s } \in X_{n-1}\), \(\zeta \in {\varXi }_{n-1}\). If \(\mathbf{x }_i \in X_0\) is labelled Accepted before all of \(\mathbf{x }_0^\mathbf{s }\), \(\mathbf{x }_1^\mathbf{s }, \ldots ,\mathbf{x }_{n-1}^\mathbf{s }\) and \(\left\| \widetilde{\mathbf{x }} - \mathbf{x }_i\right\| > {\varGamma } h_{max}\), then

$$\begin{aligned} \widetilde{V}(\mathbf{x }_i) < \sum _{j=0}^{n-1} \zeta _j\widetilde{V}(\mathbf{x }_j^\mathbf{s }) + \left\| \widetilde{\mathbf{x }} - \mathbf{x }_i\right\| g\left( \mathbf{x }_i,\frac{\widetilde{\mathbf{x }} - \mathbf{x }_i}{\left\| \widetilde{\mathbf{x }} - \mathbf{x }_i\right\| }\right) . \end{aligned}$$
(26)

Proof

From Lemma 3, (P2), Definition 20 and \(\widetilde{V}_{min}^{\mathbf{AF }_{\mathbf{x }_j^\mathbf{s }}} < \widetilde{V}(\mathbf{x }_j^\mathbf{s })\) for \(j = 0,\ldots ,n-1\),

$$\begin{aligned} \widetilde{V}(\mathbf{x }_i)&\le \widetilde{V}_{min}^\mathbf{AF _{\mathbf{x }_i}} + h_{max}G_{max}, \\&\le \sum _{j=0}^{n-1}\zeta _j \min \left\{ \widetilde{V}_{min}^\mathbf{AF _{\mathbf{x }_0^\mathbf{s }}}, \widetilde{V}_{min}^\mathbf{AF _{\mathbf{x }_1^\mathbf{s }}},\ldots , \widetilde{V}_{min}^\mathbf{AF _{\mathbf{x }_{n-1}^\mathbf{s }}} \right\} + {\varGamma } h_{max} G_{min}\\&< \sum _{j=0}^{n-1} \zeta _j \widetilde{V}(\mathbf{x }_j^\mathbf{s }) + \left\| \widetilde{\mathbf{x }}- \mathbf{x }_i\right\| g\left( \mathbf{x }_i,\frac{\widetilde{\mathbf{x }} - \mathbf{x }_i}{\left\| \widetilde{\mathbf{x }} - \mathbf{x }_i\right\| }\right) . \end{aligned}$$

\(\square \)

Lemma 5

[21, Lemma 7.1] Let \(\mathbf{x }_i\) be the vertex with Considered label that is about to be relabelled Accepted. Let

$$\begin{aligned} \widetilde{W}(\mathbf{x }_i) = \min _\mathbf{s \in \mathbf AF } \min _{\zeta \in {\varXi }_{n-1}} \left\{ \sum _{j = 0}^{n-1} \zeta _j\widetilde{V}(\mathbf{x }_j^\mathbf{s }) + \tau _\mathbf{s }(\mathbf{x }_i,\zeta )g(\mathbf{x }_i,\mathbf u _\mathbf{s } (\mathbf{x }_i,\zeta ))\right\} . \end{aligned}$$
(27)

Then \(\widetilde{W}(\mathbf{x }_i) = \widetilde{V}(\mathbf{x }_i)\).

That is, the minimizing update over all of AF must come from \(\overline{\mathbf{NF }}(\mathbf{x }_i)\). The next theorem states that the minimizing update \(\widetilde{V}(\mathbf{x }_i)\) over \(S(\mathbf{x }_i)\) must also come from \(\overline{\mathbf{NF }}(\mathbf{x }_i)\).

Theorem 3

Let \(\widetilde{V}: X_0 \rightarrow \mathbb {R}\) be computed by the OUM on mesh X, with weight function g and boundary function q. Then for \(\mathbf{x }_i \in X_0\cap {\varOmega }\),

$$\begin{aligned} \widetilde{V}(\mathbf{x }_i) = \min _\mathbf{s \in S(\mathbf{x }_i)} \min _{\zeta \in {\varXi }_{n-1}} \left\{ \sum _{j=0}^{n-1} \zeta _j \widetilde{V}(\mathbf{x }_j^\mathbf{s }) + \tau _\mathbf{s }(\mathbf{x }_i,\zeta )g(\mathbf{x }_i, \mathbf u _\mathbf{s }(\mathbf{x }_i,\zeta ))\right\} . \end{aligned}$$
(28)

Proof

Let the OUM algorithm be at the instant where the vertex \(\mathbf{x }_i\) with Considered label is about to be relabelled Accepted.

Recall Case 1, where \(\overline{\mathbf{AF }}(\mathbf{x }_i)\) is entirely inside \(\overline{B}_{2{\varGamma } h_{max}}(\mathbf{x }_i)\) and \(S(\mathbf{x }_i) = \overline{\mathbf{AF }}(\mathbf{x }_i) \cup \overline{\mathbf{NF }}(\mathbf{x }_i)\). Since \(\overline{\mathbf{AF }}(\mathbf{x }_i) \subseteq \mathbf AF \) and \(\overline{\mathbf{NF }}(\mathbf{x }_i) \subseteq \mathbf AF \), \(S(\mathbf{x }_i) \subseteq \mathbf AF \). By Lemma 5, \(\overline{\mathbf{NF }}(\mathbf{x }_i)\) must contain the minimizers \(\widetilde{\mathbf{s }}^*\) and \(\widetilde{\zeta }^*\) of (28).

Recall Case 2, where \(S(\mathbf{x }_i) = S_{\overline{\mathbf{AF }}R}(\mathbf{x }_i) \cup \overline{\mathbf{NF }}(\mathbf{x }_i)\). The minimizing \(\widetilde{\mathbf{s }}^*\), \(\widetilde{\zeta }^*\) of \(S(\mathbf{x }_i)\) will be shown to come from \(\overline{\mathbf{NF }}(\mathbf{x }_i)\) by showing the updates of \(S(\mathbf{x }_i) \backslash \overline{\mathbf{NF }}(\mathbf{x }_i) = (\overline{\mathbf{AF }}(\mathbf{x }_i)\backslash \overline{\mathbf{NF }}(\mathbf{x }_i)) \cup (S(\mathbf{x }_i) \cap \partial R(\mathbf{x }_i))\) are at least the value from OUM. By Lemma 5, the minimizers are not from \(\overline{\mathbf{AF }}(\mathbf{x }_i)\backslash \overline{\mathbf{NF }}(\mathbf{x }_i)\).

It remains to show that updates (18) from \(\mathbf s \in S(\mathbf{x }_i) \cap \partial R(\mathbf{x }_i)\) (which are just outside \(\overline{B}_{2{\varGamma } h_{max}}(\mathbf{x }_i)\)) are at least the value obtained from OUM. Because vertices of \(\mathbf s \) lie on or inside \(\overline{\mathbf{AF }}(\mathbf{x }_i)\), they must either be on the Accepted Front or not yet \(\textit{Accepted}\) (Lemma 2). Three cases are considered.

1. If none of the vertices of \(\mathbf s \) have been labelled Accepted, Lemma 4 applies. The update for \(\mathbf{x }_i\) from \(\mathbf s \in S(\mathbf{x }_i) \cap \partial R(\mathbf{x }_i)\) is greater than \(\widetilde{V}(\mathbf{x }_i)\) from OUM.

2. If the vertices of \(\mathbf s \) are all on the Accepted Front, then \(\mathbf s \in \mathbf AF \) and Lemma 5 applies. The update from \(\mathbf s \) is at least \(\widetilde{V}(\mathbf{x }_i)\) from OUM.

3. If at least one but not all the vertices of \(\mathbf s \) are on the Accepted Front, then the remaining vertices of \(\mathbf s \) (those not labelled Accepted) must be labelled Considered. Let the Accepted and Considered vertices of \(\mathbf s \) be denoted \(\{\mathbf{x }_1^{\mathbf{sa }},\ldots ,\mathbf{x }_l^{\mathbf{sa }} \}\) and \(\{\mathbf{x }_1^{\mathbf{sc }},\ldots ,\mathbf{x }_k^{\mathbf{sc }} \}\) respectively, so that \(\mathbf s = \mathbf{x }_1^{\mathbf{sa }}\ldots \mathbf{x }_l^{\mathbf{sa }}\mathbf{x }_1^{\mathbf{sc }}\ldots \mathbf{x }_k^{\mathbf{sc }}\), where \(l+k = n\) since \(\mathbf s \) has n vertices. Let \(\zeta = (\zeta _1^{\mathbf{sa }},\ldots ,\zeta _l^{\mathbf{sa }},\zeta _1^{\mathbf{sc }},\ldots ,\zeta _k^{\mathbf{sc }})\) be the barycentric coordinates for \(\mathbf x \in \mathbf s \). By Lemma 3, \(\widetilde{V}(\mathbf{x }_i) > \widetilde{V}^{\mathbf{AF }_{\mathbf{x }_i}}_{min}\), and by Definition 20, for all \(1 \le j \le k\),

   $$\begin{aligned} \widetilde{V}(\mathbf{x }_i) \le \widetilde{V}^{\mathbf{AF }_{\mathbf{x }_i}}_{min} + h_{max}G_{max} < \widetilde{V}(\mathbf{x }_j^{\mathbf{sc }}) + {\varGamma } h_{max} G_{min}. \end{aligned}$$

   For all \(1 \le j \le k\) and \(1 \le m\le l\), \(\mathbf{x }_j^{\mathbf{sc }}\) is labelled Considered and \(\mathbf{x }_m^{\mathbf{sa }}\) is on its Near Front \(\overline{\mathbf{NF }}(\mathbf{x }_j^{\mathbf{sc }})\). Thus,

   $$\begin{aligned} \widetilde{V}(\mathbf{x }_j^{\mathbf{sc }})&\le \widetilde{V}(\mathbf{x }_m^{\mathbf{sa }}) + \left\| \mathbf{x }_m^{\mathbf{sa }} - \mathbf{x }_j^{\mathbf{sc }}\right\| g\left( \mathbf{x }_j^{\mathbf{sc }},\frac{\mathbf{x }_m^{\mathbf{sa }} - \mathbf{x }_j^{\mathbf{sc }}}{\left\| \mathbf{x }_m^{\mathbf{sa }} - \mathbf{x }_j^{\mathbf{sc }}\right\| }\right) \le \widetilde{V}(\mathbf{x }_m^{\mathbf{sa }}) + {\varGamma } h_{max} G_{min}, \\ \widetilde{V}(\mathbf{x }_m^{\mathbf{sa }})&\ge \widetilde{V}(\mathbf{x }_j^{\mathbf{sc }}) - {\varGamma } h_{max}G_{min} > \widetilde{V}(\mathbf{x }_i) - 2{\varGamma } h_{max}G_{min}. \end{aligned}$$

   Consider the update for \(\mathbf{x }_i\) (18) from \(\mathbf s \in S(\mathbf{x }_i)\cap \partial R(\mathbf{x }_i)\). For any \(\zeta \in {\varXi }_{n-1}\),

   $$\begin{aligned}&\sum _{j=0}^{n-1} \zeta _j \widetilde{V}(\mathbf{x }_j^\mathbf{s }) + \tau _\mathbf{s }(\mathbf{x }_i,\zeta ) g(\mathbf{x }_i,\mathbf u _\mathbf{s }(\mathbf{x }_i,\zeta )) \\&\quad = \left( \sum _{m=1}^l \zeta _m^{\mathbf{sa }} \widetilde{V}(\mathbf{x }_m^{\mathbf{sa }}) \right) + \left( \sum _{j=1}^k \zeta _j^{\mathbf{sc }} \widetilde{V}(\mathbf{x }_j^{\mathbf{sc }})\right) + \tau _\mathbf{s }(\mathbf{x }_i,\zeta )g(\mathbf{x }_i,\mathbf u _\mathbf{s }(\mathbf{x }_i,\zeta )),\\&\quad > \sum _{m=1}^l \zeta _m^{\mathbf{sa }} (\widetilde{V}(\mathbf{x }_i)- 2{\varGamma } h_{max}G_{min}) + \sum _{j=1}^k \zeta _j^{\mathbf{sc }}(\widetilde{V}(\mathbf{x }_i) - {\varGamma } h_{max}G_{min}) + 2{\varGamma } h_{max}G_{min},\\&\quad \ge \widetilde{V}(\mathbf{x }_i), \end{aligned}$$

   since \(\tau _\mathbf{s }(\mathbf{x }_i,\zeta ) > 2{\varGamma } h_{max}\) and \(\left( \sum _{m=1}^l \zeta _m^{\mathbf{sa }}\right) + \left( \sum _{j=1}^k \zeta _j^{\mathbf{sc }}\right) = 1\).

Therefore \(\mathbf s \in S(\mathbf{x }_i) \cap \partial R(\mathbf{x }_i)\) provides an update larger or equal to OUM. By Lemma 5, a minimizing update (28) in \(S(\mathbf{x }_i)\) must always come from \(\overline{\mathbf{NF }}(\mathbf{x }_i)\). \(\square \)

By Theorems 2 and 3,

$$\begin{aligned} \widetilde{H}[S(\mathbf{x }_i),\widetilde{V}](\mathbf{x }_i,\widetilde{V}(\mathbf{x }_i)) = \widetilde{H}[\overline{\mathbf{NF }}(\mathbf{x }_i),\widetilde{V}](\mathbf{x }_i,\widetilde{V}(\mathbf{x }_i)). \end{aligned}$$

Therefore, for \(\mathbf{x }_i \in X_0 \cap {\varOmega }\), \(S(\mathbf{x }_i)\) satisfies Definition 24.

The monotonicity and consistency of the numerical Hamiltonian will now be discussed.

Theorem 4

(Monotonicity) [2, Proposition 2.1] For \(\underline{\phi },\overline{\phi }:X_0 \rightarrow \mathbb {R}\) that satisfy \(\underline{\phi }(\mathbf{x }_j) \le \overline{\phi }(\mathbf{x }_j)\) for all \(\mathbf{x }_j \in X_0 \cap {\varOmega }\), and \(\phi (\mathbf{x }_i) = \overline{\phi }(\mathbf{x }_i) = \underline{\phi }(\mathbf{x }_i) \in \mathbb {R}\),

$$\begin{aligned} \widetilde{H}[S(\mathbf{x }_i),\underline{\phi }](\mathbf{x }_i,\phi (\mathbf{x }_i)) \ge \widetilde{H}[S(\mathbf{x }_i),\overline{\phi }](\mathbf{x }_i,\phi (\mathbf{x }_i)). \end{aligned}$$

Theorem 5

(Consistency) There exists \(C_1 \in {\mathbb {R}}_+\) (not dependent on \(h_{max}\)) such that for all \(\mathbf{x }_i \in X_0 \cap {\varOmega }\) and \(\phi \in C^2({\varOmega })\),

$$\begin{aligned} |H(\mathbf{x }_i,\nabla \phi )-\widetilde{H}[S(\mathbf{x }_i),\phi ](\mathbf{x }_i,\phi (\mathbf{x }_i))| \le C_1\left\| \nabla ^2\phi \right\| _2 h_{max}, \end{aligned}$$

where \(\left\| A \right\| _2\) is the maximum singular value of \(A \in {\mathbb {R}}^{n\times n}\).

Proof

Let \(\phi \in C^2({\varOmega })\) and \(\mathbf{x }_i \in X_0 \cap {\varOmega }\). Recall \(S(\mathbf{x }_i)\) is directionally complete, so the characteristic direction (Definition 6) \(\mathbf u ^*\) can be described using barycentric coordinates \(\zeta ^* =(\zeta _0^*,\zeta _1^*,\ldots ,\zeta _{n-1}^*)\in {\varXi }_{n-1}\) from an appropriate simplex \(\mathbf s ^* \in S(\mathbf{x }_i)\). Let \(\mathbf x ^*=\sum _{j=0}^{n-1}\zeta ^*_j\mathbf x ^\mathbf{s ^*}_j\). Taylor’s theorem will be used on H (11). Let \(\mathbf c ^*\) and \(\mathbf c _j^{*}\) for \(j=0,1,\ldots ,n-1\) denote the points arising from Taylor’s theorem on the line segments between \(\mathbf x ^*\) and \(\mathbf{x }_i\) and \(\mathbf x ^*\) and \(\mathbf x ^\mathbf{s ^*}_j\) respectively. Since \(\sum _{j=0}^{n-1}\zeta ^*_j\nabla \phi (\mathbf x ^*)^T(\mathbf x ^\mathbf{s ^*}_j-\mathbf x ^*)=\nabla \phi (\mathbf x ^*)^T(\mathbf x ^*-\mathbf x ^*)=0\), evaluating both H and \(\widetilde{H}\) at \(\mathbf s ^*\) and \(\zeta ^*\),

$$\begin{aligned}&H(\mathbf{x }_i,\nabla \phi ) - \widetilde{H}[S(\mathbf{x }_i),\phi ](\mathbf{x }_i,\phi (\mathbf{x }_i))\\&\quad \le -\frac{\sum _{j=0}^{n-1} \frac{{\zeta }_j^*}{2}(\mathbf{x }_j^{\mathbf{s }^*} - \mathbf{x }^*)^T\nabla ^2\phi (\mathbf c _j^*)(\mathbf{x }_j^{\mathbf{s }^*} - \mathbf{x }^*) +\frac{1}{2}(\mathbf{x }^* - \mathbf{x }_i)^T\nabla ^2\phi (\mathbf c ^*)(\mathbf{x }^* - \mathbf{x }_i )}{\tau _{\mathbf{s }^*}(\mathbf{x }_i,{\zeta }^*)},\\&\quad \le \frac{1}{h_{min}}\left( \frac{\sum _{j=0}^{n-1} \zeta _j^*}{2} \left\| \nabla ^2 \phi \right\| _{2} h_{max}^2 + \frac{1}{2} \left\| \nabla ^2\phi \right\| _{2} (2{\varGamma }+1)^2 h_{max}^2\right) , \\&\quad \le \frac{M}{2}\left\| \nabla ^2\phi \right\| _{2}(1 + (2{\varGamma }+1)^2)h_{max}, \end{aligned}$$

since the point \(\mathbf x ^* \in \mathbf s ^* \in S(\mathbf{x }_i)\) is at most \((2{\varGamma } + 1)h_{max}\) from \(\mathbf{x }_i\) and at most \(h_{max}\) away from any of the vertices of \(\mathbf s ^*\). The distance \(\tau _\mathbf{s ^*}(\mathbf{x }_i,\zeta ^*)\) is at least the minimum simplex height \(h_{min}\), and M from (M1) satisfies \(1 \le \frac{h_{max}}{h_{min}} \le M\). The proof for \(\widetilde{H}[S(\mathbf{x }_i),\phi ](\mathbf{x }_i,\phi (\mathbf{x }_i)) - H(\mathbf{x }_i,\nabla \phi )\) yields the same estimate using the minimizers of \(\widetilde{H}\), the \((n-1)\)-simplex \(\widetilde{\mathbf{s }}^* \in S(\mathbf{x }_i)\) and \(\widetilde{\zeta }^* \in {\varXi }_{n-1}\). The theorem is proved with \(C_1 = \frac{M}{2}(1+(2{\varGamma }+1)^2)\). \(\square \)

A similar consistency property was assumed in [20] for the half-order convergence proof for FMM. A similar proof, without a rate, was given in [2, Prop. 2.2] for the Monotone Acceptance OUM.

6 OUM Error Bound

The error bound proof will be presented. Several definitions and results are first required.

Lemma 6

[3] Let \(\mathbf x \in {\mathbb {R}}^n\). If \(\overline{{\varOmega }}\) is convex, then \(\mathbf z ^* = \hbox {arg min}_\mathbf{z \in \overline{{\varOmega }}} \left\| \mathbf x - \mathbf z \right\| \) is unique, and satisfies

$$\begin{aligned} (\mathbf x -\mathbf z ^*) \cdot (\mathbf w - \mathbf z ^*) \le 0 ,\quad \text {for all}\quad \mathbf w \in \overline{{\varOmega }}. \end{aligned}$$
(29)

Lemma 7

The value function V is globally Lipschitz-continuous over \(\overline{{\varOmega }}_X\). That is, there exists \(L_V \in {\mathbb {R}}_+\) such that for any \(\mathbf{x }_1, \mathbf{x }_2 \in \overline{{\varOmega }}_X\),

$$\begin{aligned} |V(\mathbf{x }_1) - V(\mathbf{x }_2)| \le L_V\left\| \mathbf{x }_1 - \mathbf{x }_2\right\| . \end{aligned}$$

An outline of the proof is given using three cases.

Case 1

\(\mathbf{x }_1, \mathbf{x }_2 \in \overline{{\varOmega }}_X \cap {\varOmega }^c\). This is an exercise in [7, Exercise 2.8d], which can be shown using the Cauchy–Schwarz inequality and Lemma 1.

Case 2

\(\mathbf{x }_1, \mathbf{x }_2 \in {\varOmega }\). This is shown in [25, Lemma 2.2.7] with constant \(G_{max}\).

Case 3

\(\mathbf{x }_1 \in {\varOmega } \) and \(\mathbf{x }_2 \in \overline{{\varOmega }}_X \cap {\varOmega }^c\). This can be shown using Lemma 6 and

$$\begin{aligned} L(\mathbf a ,\mathbf b ) \le L(\mathbf a ,\mathbf c ) + L(\mathbf c ,\mathbf b ), \end{aligned}$$

for \(\mathbf a , \mathbf b ,\mathbf c \in \overline{{\varOmega }}\). For \(\mathbf{x }_1,\mathbf{x }_2 \in \overline{{\varOmega }}_X\), a valid Lipschitz constant is \(L_V=2G_{max}\).

Lemma 8

[25, Lemma 2.2.9] Let \(\mathbf x \in \overline{{\varOmega }}_X\) and \(\widetilde{\mathbf{x }} = \hbox {arg min}_\mathbf{z \in \partial {\varOmega }} \left\| \mathbf x - \mathbf z \right\| \). The value function V satisfies

$$\begin{aligned} q_{min} \le V(\mathbf x ) \le G_{max}\left\| \mathbf x - \widetilde{\mathbf{x }}\right\| + q_{max}. \end{aligned}$$

The proof is shown in [25] for \(\mathbf x \in \overline{{\varOmega }}\). The proof is trivial for \(\mathbf x \in \overline{{\varOmega }}_X \cap \overline{{\varOmega }}^c\).

Lemma 9

[21, Lemma 7.5] Let \(\widetilde{V}: X_0 \rightarrow \mathbb {R}\) be obtained by the ordered upwind method. There exists \(L_{\widetilde{V}} \in {\mathbb {R}}_+\) such that for any \(\mathbf{x }_i, \mathbf{x }_j \in X_0\),

$$\begin{aligned} |\widetilde{V}(\mathbf{x }_i) - \widetilde{V}(\mathbf{x }_j)| \le L_{\widetilde{V}} \left\| \mathbf{x }_i - \mathbf{x }_j\right\| . \end{aligned}$$

A possible Lipschitz constant for \(\widetilde{V}\) is \(L_{\widetilde{V}} = M^2 G_{max}\) [21], where M is described in (M1). A proof similar to Cases 1 and 3 of Lemma 7 is valid, with the restriction \(\mathbf x \in X_0\) and with the function L (6) replaced by \(\widetilde{L}: X_0 \times X_0 \rightarrow \mathbb {R}\),

$$\begin{aligned} \widetilde{L}(\mathbf{x }_1,\mathbf{x }_2) = \inf _\mathbf{u (\cdot ) \in \widetilde{\mathcal {U}}} \left\{ \int _0^{\tau } g(\mathbf y _{\mathbf{x }_1}(s),\mathbf u (s))ds \text { } \Big | \text { }{} \mathbf y _{\mathbf{x }_1}(\tau ) = \mathbf{x }_2, \mathbf y _{\mathbf{x }_1}(t) \in \overline{{\varOmega }}, t \in (0,\tau ) \right\} , \end{aligned}$$
(30)

where \(\widetilde{\mathcal {U}}\) is defined in (17).

Lemma 10

[21, Lemma 7.2] Let \(\mathbf x \in \mathbf s \) where \(\mathbf s \in X_n\) and \(\widetilde{\mathbf{x }} = \hbox {arg min}_\mathbf{z \in \partial {\varOmega }} \left\| \mathbf x - \mathbf z \right\| .\) Then

$$\begin{aligned} q_{min} \le \widetilde{V}(\mathbf x ) \le G_{max}\left\| \mathbf x - \widetilde{\mathbf{x }}\right\| + q_{max}. \end{aligned}$$

The proof is shown in [21] for \(\mathbf x \in \overline{{\varOmega }}\). The proof is trivial for \(\mathbf x \in \overline{{\varOmega }}^c\).

The next lemma states that any point on the boundary \(\partial {\varOmega }\) must be at most \(h_{max}\) away from its nearest vertex of X outside of \({\varOmega }\).

Lemma 11

If \(\mathbf x \in \partial {\varOmega }\), there exists \(\mathbf{x }_i \in X_0 \cap {\varOmega }^c\) such that

$$\begin{aligned} \left\| \mathbf x - \mathbf{x }_i\right\| \le h_{max}. \end{aligned}$$
(31)

Proof

Assumption (M2) states that \(\overline{{\varOmega }}\) is contained in X, so the point \(\mathbf x \) lies in some \(\mathbf s \in X_n\). Since \(\overline{{\varOmega }}\) is convex (P4) and \(\mathbf x \) can be described by barycentric coordinates of \(\mathbf s \), at least one of the vertices of \(\mathbf s \) must be outside \({\varOmega }\). Furthermore, for all \(0 \le j \le n\),

$$\begin{aligned} \left\| \mathbf x - \mathbf{x }_j^\mathbf{s }\right\| \le \max _{0\le k \le n} \left\| \mathbf{x }_k^\mathbf{s } - \mathbf{x }_j^\mathbf{s }\right\| \le h_{max}. \end{aligned}$$

\(\square \)

The following definitions provide a weaker description of the gradient for functions that are not necessarily differentiable. Let A be a bounded subset of \({\mathbb {R}}^n\).

Definition 26

The vector \(\mathbf p \in {\mathbb {R}}^n\) is a subgradient of a function \(f: A \rightarrow \mathbb {R}\) at \(\mathbf{x }_0 \in A\) if there exists \(\delta > 0\) such that for any \(\mathbf x \in B_{\delta }(\mathbf{x }_0)\),

$$\begin{aligned} f(\mathbf x ) - f(\mathbf{x }_0) \ge \mathbf p \cdot (\mathbf x - \mathbf{x }_0). \end{aligned}$$

Definition 27

The vector \(\mathbf p \in {\mathbb {R}}^n\) is a supergradient of a function \(f: A \rightarrow \mathbb {R}\) at \(\mathbf{x }_0 \in A\) if there exists \(\delta > 0\) such that for any \(\mathbf x \in B_{\delta }(\mathbf{x }_0)\),

$$\begin{aligned} f(\mathbf x ) - f(\mathbf{x }_0) \le \mathbf p \cdot (\mathbf x - \mathbf{x }_0). \end{aligned}$$

Let \(D^-f(\mathbf{x }_0)\) and \(D^+f(\mathbf{x }_0)\) denote the sets of all subgradients and supergradients of f at \(\mathbf{x }_0\) respectively.

Lemma 12

Let \(f: A \rightarrow \mathbb {R}\) be globally Lipschitz-continuous with Lipschitz constant C and \(\mathbf{x }_0 \in A\). If \(\mathbf p \in D^-f(\mathbf{x }_0)\cup D^+f(\mathbf{x }_0)\), then

$$\begin{aligned} \left\| \mathbf p \right\| \le C . \end{aligned}$$

Proof

Let \(\mathbf{x }_0 \in A\), \(\mathbf b \in {\mathbb {S}}^{n-1}\), \(\delta > 0\), such that \(\mathbf{x }_0 + \delta \mathbf b \in A\). Let \(\mathbf p \in D^-f(\mathbf{x }_0)\) (Definition 26). The Lipschitz continuity of f gives

$$\begin{aligned} C\left\| \mathbf{x }_0 + \delta \mathbf b - \mathbf{x }_0\right\| \ge f(\mathbf{x }_0 + \delta \mathbf b ) - f(\mathbf{x }_0) \ge \mathbf p \cdot (\mathbf{x }_0 + \delta \mathbf b - \mathbf{x }_0). \end{aligned}$$

Choosing \(\mathbf{b }=\frac{\mathbf{p }}{\left\| \mathbf{p }\right\| }\) gives \(\left\| \mathbf{p }\right\| \le C\). The proof is analogous for \(\mathbf{p } \in D^+f(\mathbf{x }_0)\). \(\square \)

Lemma 13

[5, Lemma 1.7] A vector \(\mathbf p \in D^-f(\mathbf{x }_0)\) if and only if there exists \(\phi \in C^1({\varOmega })\) such that \(\nabla \phi (\mathbf{x }_0) = \mathbf p \) and \(f - \phi \) has a local minimum at \(\mathbf{x }_0\). Similarly, a vector \(\mathbf p \in D^+f(\mathbf{x }_0)\) if and only if there exists \(\phi \in C^1({\varOmega })\) such that \(\nabla \phi (\mathbf{x }_0) = \mathbf p \) and \(f - \phi \) has a local maximum at \(\mathbf{x }_0\).

The approximate value function \(\widetilde{V}\) is, in a sense, a viscosity solution of the numerical HJB equation (23).

Definition 28

Let \(\hat{\mathbf{x }} = \hbox {arg min}_{\mathbf{x } \in \partial {\varOmega }} \left\| \mathbf{x }_i - \mathbf x \right\| \). A function \(\underline{\widetilde{V}}: X_0 \rightarrow \mathbb {R}\) is a subsolution of the numerical HJB equation (23) if it satisfies

$$\begin{aligned} \left\{ \begin{array}{ll} \underline{\widetilde{V}}(\mathbf{x }_i) \le q(\hat{\mathbf{x }}) &{}\quad \text {for}\quad \mathbf{x }_i \in X_0 \cap {\varOmega }^c,\\ \widetilde{H}[\overline{\mathbf{NF }}(\mathbf{x }_i),\underline{\widetilde{V}}](\mathbf{x }_i,\underline{\widetilde{V}}(\mathbf{x }_i)) \le 0 &{} \quad \text {for}\quad \mathbf{x }_i \in X_0 \cap {\varOmega }. \end{array}\right. \end{aligned}$$

Definition 29

Let \(\hat{\mathbf{x }} = \hbox {arg min}_{\mathbf{x } \in \partial {\varOmega }} \left\| \mathbf{x }_i - \mathbf x \right\| \). A function \(\widetilde{\overline{V}}: X_0 \rightarrow \mathbb {R}\) is a supersolution of the numerical HJB equation (23) if it satisfies

$$\begin{aligned} \left\{ \begin{array}{ll} \widetilde{\overline{V}}(\mathbf{x }_i) \ge q(\hat{\mathbf{x }}) &{} \quad \text {for}\quad \mathbf{x }_i \in X_0 \cap {\varOmega }^c, \\ \widetilde{H}[\overline{\mathbf{NF }}(\mathbf{x }_i),\widetilde{\overline{V}}](\mathbf{x }_i,\widetilde{\overline{V}}(\mathbf{x }_i)) \ge 0 &{} \quad \text {for}\quad \mathbf{x }_i \in X_0 \cap {\varOmega }. \end{array}\right. \end{aligned}$$

Definition 30

A function \(\widetilde{V}: X_0 \rightarrow \mathbb {R}\) is a solution of the numerical HJB equation (23) if it is both a subsolution and a supersolution of (23).

By Theorem 2, the approximate value function \(\widetilde{V}\) produced by the OUM algorithm is a solution of the numerical HJB equation. Hence, it is both a subsolution and supersolution of the numerical HJB equation. Recall the definition of \(\overline{{\varOmega }}_X\) (20).

Theorem 6

Let \(V: \overline{{\varOmega }}_X \rightarrow \mathbb {R}\) be a viscosity solution of (12) and \(\widetilde{V}: X_0 \rightarrow \mathbb {R}\) be a solution of the numerical HJB equation (23). There exist \(C, h_0 > 0\), both independent of \(h_{max}\), such that

$$\begin{aligned} \max _{\mathbf{x }_i \in X_0} |V(\mathbf{x }_i) - \widetilde{V}(\mathbf{x }_i)| \le C \sqrt{h_{max}}, \end{aligned}$$
(32)

for all \(h_{max} < h_0\).

Proof

The proof is trivial for \(\mathbf{x }_i \in X_0\cap \overline{{\varOmega }}^c\). Otherwise, \(\mathbf{x }_i \in X_0 \cap \overline{{\varOmega }}\). Since \(\overline{{\varOmega }} \subseteq {\mathbb {R}}^n\) is bounded, define

$$\begin{aligned} d_{{\varOmega }}&= \max _{\mathbf{x },\widetilde{\mathbf{x }} \in \partial {\varOmega }} \left\| \mathbf x - \widetilde{\mathbf{x }}\right\| , \end{aligned}$$
(33)
$$\begin{aligned} C_0&= \max \{L_V, L_{\widetilde{V}}, |q_{min}|, d_{{\varOmega }} G_{max} + |q_{max}| \}, \end{aligned}$$
(34)

where \(L_V\), \(L_{\widetilde{V}}\), \(|q_{min}|\), \(G_{max}\) and \(|q_{max}|\) are from Lemmas 7, 8, 9 and 10.

For \(\mathbf{x }_i \in X_0 \cap \overline{{\varOmega }}\), the result of the theorem is shown for \(V(\mathbf{x }_i) - \widetilde{V}(\mathbf{x }_i)\). A similar argument for \(\widetilde{V}(\mathbf{x }_i) - V(\mathbf{x }_i)\) can be made.

Two parameters \(\epsilon \) and \(\lambda \) are used to determine the error bound. For \(\epsilon > 0\) and \(0 < \lambda < 1\), define \({\varPhi }: \overline{{\varOmega }} \times X_0 \rightarrow \mathbb {R}\) by

$$\begin{aligned} {\varPhi }(\mathbf x ,\mathbf{x }_i) = \lambda V(\mathbf x ) - \widetilde{V}(\mathbf{x }_i) - \frac{\left\| \mathbf x - \mathbf{x }_i\right\| ^2}{2\epsilon }. \end{aligned}$$
(35)

Let \(\overline{\mathbf{x }} \in \overline{{\varOmega }}\) and \(\overline{\mathbf{x }}_i \in X_0\) maximize \({\varPhi }\) over the compact set \(\overline{{\varOmega }} \times X_0\). Define

$$\begin{aligned} M_{\epsilon ,\lambda } = \max _{\mathbf{x } \in \overline{{\varOmega }},\, \mathbf{x }_i \in X_0} {\varPhi }(\mathbf x ,\mathbf{x }_i) = {\varPhi }(\overline{\mathbf{x }},\overline{\mathbf{x }}_i). \end{aligned}$$
(36)

For \(\mathbf{x }_i \in X_0 \cap \overline{{\varOmega }}\), since \({\varPhi }(\mathbf{x }_i,\mathbf{x }_i) = \lambda V(\mathbf{x }_i) - \widetilde{V}(\mathbf{x }_i) \le M_{\epsilon ,\lambda }\) by (35) and (36), and \(V(\mathbf{x }_i) \le C_0\) by Lemma 8 (boundedness of V),

$$\begin{aligned} V(\mathbf{x }_i) - \widetilde{V}(\mathbf{x }_i) \le (1-\lambda ) V(\mathbf{x }_i) + M_{\epsilon ,\lambda } \le C_0(1-\lambda ) + M_{\epsilon ,\lambda }. \end{aligned}$$
(37)

Choose \(\lambda \) such that

$$\begin{aligned} 1-\lambda = \frac{2}{G_{min}}\left( \frac{C_1}{\epsilon }h_{max} + C_0L_g\epsilon \right) , \end{aligned}$$
(38)

where \(L_g\) is defined in (9), and \(C_1 = \frac{M(1+(2{\varGamma } + 1)^2)}{2}\) is defined in Theorem 5 with M in (M1) and \({\varGamma } = \frac{G_{max}}{G_{min}}\).

The result of the theorem will be shown with \(\epsilon = \sqrt{h_{max}}\). It is therefore sufficient to pick \(h_0\) small enough that \(0 < 1-\lambda < 1\) for all \(h_{max} < h_0\). Setting (38) less than 1 with \(\epsilon = \sqrt{h_{max}}\) yields \(h_{max} < \frac{G_{min}^2}{4(C_1+C_0L_g)^2}\). Let \(h_0 = \min \{\frac{G_{min}^2}{4(C_1+C_0L_g)^2},1\}\).
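
Explicitly, substituting \(\epsilon = \sqrt{h_{max}}\) into (38) gives

$$\begin{aligned} 1-\lambda = \frac{2}{G_{min}}\left( C_1 + C_0L_g\right) \sqrt{h_{max}}, \end{aligned}$$

which is positive and less than 1 exactly when \(\sqrt{h_{max}} < \frac{G_{min}}{2(C_1+C_0L_g)}\), recovering the stated bound.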

The point \(\overline{\mathbf{x }}\) in (35) must belong to either \({\varOmega }\) or \(\partial {\varOmega }\), while \(\overline{\mathbf{x }}_i\) must belong to either \(X_0\cap {\varOmega }\) or \(X_0\cap {\varOmega }^c\). An outline of the remainder of the proof is as follows.

Step 1 :

Show that at most one of \(\overline{\mathbf{x }}\) and \(\overline{\mathbf{x }}_i\) may be in \({\varOmega }\).

Step 2 :

Find an upper bound for \(M_{\epsilon ,\lambda }\) in (36) given the restriction in Step 1.

Step 3 :

Find an upper bound on \(V(\mathbf{x }_i) - \widetilde{V}(\mathbf{x }_i)\) in (37) in terms of \(h_{max}\).

Step 1 :

Define \(\phi : \overline{{\varOmega }} \rightarrow \mathbb {R}\),

$$\begin{aligned} \phi (\mathbf{x }) = \frac{1}{\lambda } \left( M_{\epsilon ,\lambda } + \widetilde{V}(\overline{\mathbf{x }}_i) + \frac{\left\| \mathbf{x } - \overline{\mathbf{x }}_i\right\| ^2}{2\epsilon }\right) \text { and so } \nabla \phi (\mathbf{x }) = \frac{1}{\lambda } \left( \frac{\mathbf{x } - \overline{\mathbf{x }}_i}{\epsilon }\right) . \end{aligned}$$
(39)

Using (35), (36), (39), and \(M_{\epsilon ,\lambda } \ge {\varPhi }(\mathbf x ,\overline{\mathbf{x }}_i)\), it can be shown that \(V(\mathbf x ) \le \phi (\mathbf x )\) for all \(\mathbf x \in \overline{{\varOmega }}\), with \(V(\overline{\mathbf{x }}) = \phi (\overline{\mathbf{x }})\). Therefore \(V - \phi \) has a local maximum at \(\overline{\mathbf{x }}\). By Lemma 13, \(\mathbf p = \nabla \phi (\overline{\mathbf{x }}) \in D^+V(\overline{\mathbf{x }})\). By Lemma 12, \(\left\| \nabla \phi (\overline{\mathbf{x }})\right\| \) is bounded by the Lipschitz constant \(L_V \le C_0\), which with (34) and (39) gives

$$\begin{aligned} \left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| \le \lambda \left\| \nabla \phi (\overline{\mathbf{x }})\right\| \epsilon \le \lambda C_0\epsilon . \end{aligned}$$
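
The claim \(V(\mathbf x ) \le \phi (\mathbf x )\) above follows by rearranging \(M_{\epsilon ,\lambda } \ge {\varPhi }(\mathbf x ,\overline{\mathbf{x }}_i)\):

$$\begin{aligned} \lambda V(\mathbf x ) - \widetilde{V}(\overline{\mathbf{x }}_i) - \frac{\left\| \mathbf x - \overline{\mathbf{x }}_i\right\| ^2}{2\epsilon } \le M_{\epsilon ,\lambda } \iff V(\mathbf x ) \le \frac{1}{\lambda }\left( M_{\epsilon ,\lambda } + \widetilde{V}(\overline{\mathbf{x }}_i) + \frac{\left\| \mathbf x - \overline{\mathbf{x }}_i\right\| ^2}{2\epsilon }\right) = \phi (\mathbf x ), \end{aligned}$$

with equality at \(\mathbf x = \overline{\mathbf{x }}\) by (36).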

From (38), the bound \(\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| \le \lambda C_0\epsilon \) above, and \(0 < \lambda < 1\),

$$\begin{aligned} (1-\lambda ) > \frac{1}{G_{min}}\left( \frac{C_1}{\epsilon }h_{max} + \lambda L_g\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| \right) . \end{aligned}$$
(40)

Define \(\psi : \overline{{\varOmega }}_X \rightarrow \mathbb {R}\),

$$\begin{aligned} \psi (\mathbf{x }_i) = -M_{\epsilon ,\lambda } + \lambda V(\overline{\mathbf{x }}) - \frac{\left\| \overline{\mathbf{x }} - \mathbf{x }_i\right\| ^2}{2\epsilon }, \text { and so } \nabla \psi (\mathbf{x }_i) = \frac{\overline{\mathbf{x }}-\mathbf{x }_i}{\epsilon }. \end{aligned}$$
(41)

Let \(\mathbf u _{\overline{\mathbf{x }}_i}^*\) optimize the Hamiltonian (11) for arguments \(\overline{\mathbf{x }}_i\) and \(\nabla \psi (\overline{\mathbf{x }}_i)\),

$$\begin{aligned} H(\overline{\mathbf{x }}_i,\nabla \psi (\overline{\mathbf{x }}_i)) = -\nabla \psi (\overline{\mathbf{x }}_i) \cdot \mathbf u _{\overline{\mathbf{x }}_i}^* - g(\overline{\mathbf{x }}_i,\mathbf u ^*_{\overline{\mathbf{x }}_i}). \end{aligned}$$

From (40), assumptions (P2), (P3) and the definitions of \(\nabla \phi \) (39) and \(\nabla \psi \) (41),

$$\begin{aligned} (1-\lambda )&g(\overline{\mathbf{x }}_i,\mathbf u _{\overline{\mathbf{x }}_i}^*) > \frac{C_1}{\epsilon }{h_{max}} + \lambda \left( g\left( \overline{\mathbf{x }},\mathbf u _{\overline{\mathbf{x }}_i}^*\right) - g\left( \overline{\mathbf{x }}_i,\mathbf u _{\overline{\mathbf{x }}_i}^*\right) \right) , \nonumber \\&\frac{\overline{\mathbf{x }}-\overline{\mathbf{x }}_i}{\epsilon } \cdot \mathbf u ^*_{\overline{\mathbf{x }}_i} + g(\overline{\mathbf{x }}_i, \mathbf u ^*_{\overline{\mathbf{x }}_i}) - \lambda \left( \frac{1}{\lambda }\cdot \frac{\overline{\mathbf{x }}-\overline{\mathbf{x }}_i}{\epsilon } \cdot \mathbf u ^*_{\overline{\mathbf{x }}_i} + g(\overline{\mathbf{x }},\mathbf u _{\overline{\mathbf{x }}_i}^*) \right) > \frac{C_1}{\epsilon }h_{max}, \nonumber \\&\nabla \psi (\overline{\mathbf{x }}_i) \cdot \mathbf u ^*_{\overline{\mathbf{x }}_i} + g(\overline{\mathbf{x }}_i, \mathbf u ^*_{\overline{\mathbf{x }}_i}) - \lambda (\nabla \phi (\overline{\mathbf{x }}) \cdot \mathbf u ^*_{\overline{\mathbf{x }}_i} + g(\overline{\mathbf{x }},\mathbf u ^*_{\overline{\mathbf{x }}_i})) > \frac{C_1}{\epsilon }h_{max}. \end{aligned}$$
(42)

Since \(\mathbf u _{\overline{\mathbf{x }}_i}^*\) is not necessarily the maximizer of \(H(\overline{\mathbf{x }},\nabla \phi (\overline{\mathbf{x }}))\),

$$\begin{aligned} - \lambda \left( \nabla \phi \left( \overline{\mathbf{x }}\right) \cdot \mathbf u ^*_{\overline{\mathbf{x }}_i} + g\left( \overline{\mathbf{x }},\mathbf u ^*_{\overline{\mathbf{x }}_i}\right) \right) \le \lambda H\left( \overline{\mathbf{x }}, \nabla \phi \left( \overline{\mathbf{x }}\right) \right) . \end{aligned}$$
(43)

It will now be shown that at most one of \(\overline{\mathbf{x }}\) and \(\overline{\mathbf{x }}_i\) can be in \({\varOmega }\). Combining (42) with the definition of the Hamiltonian (11), the bound (43), and the fact that \(\mathbf u _{\overline{\mathbf{x }}_i}^*\) optimizes \(H(\overline{\mathbf{x }}_i,\nabla \psi (\overline{\mathbf{x }}_i))\) yields

$$\begin{aligned} - H(\overline{\mathbf{x }}_i,\nabla \psi (\overline{\mathbf{x }}_i)) + \lambda H(\overline{\mathbf{x }},\nabla \phi (\overline{\mathbf{x }})) > \frac{C_1}{\epsilon }{h_{max}}. \end{aligned}$$
(44)

Case 1

Let \(\overline{\mathbf{x }} \in {\varOmega }\). From Definition 7, \(H(\overline{\mathbf{x }},\nabla \phi (\overline{\mathbf{x }})) \le 0.\) From (44),

$$\begin{aligned} H(\overline{\mathbf{x }}_i,\nabla \psi (\overline{\mathbf{x }}_i) ) < - \frac{C_1}{\epsilon }h_{max}. \end{aligned}$$
(45)

For all \(\mathbf{x }_i \in X_0\), \(\psi (\mathbf{x }_i) \le \widetilde{V}(\mathbf{x }_i)\), with \(\psi (\overline{\mathbf{x }}_i) = \widetilde{V}(\overline{\mathbf{x }}_i)\), since \(M_{\epsilon ,\lambda } \ge {\varPhi }(\overline{\mathbf{x }},\mathbf{x }_i)\) with equality at \(\mathbf{x }_i = \overline{\mathbf{x }}_i\). By Definition 24 and Theorem 4,

$$\begin{aligned} \widetilde{H}[\overline{\mathbf{NF }}(\overline{\mathbf{x }}_i),\widetilde{V}](\overline{\mathbf{x }}_i,\widetilde{V}(\overline{\mathbf{x }}_i)) = \widetilde{H}[S(\overline{\mathbf{x }}_i),\widetilde{V}](\overline{\mathbf{x }}_i,\widetilde{V}(\overline{\mathbf{x }}_i)) \le \widetilde{H}[S(\overline{\mathbf{x }}_i),\psi ](\overline{\mathbf{x }}_i,\psi (\overline{\mathbf{x }}_i)). \end{aligned}$$
(46)

It will be shown by contradiction that \(\overline{\mathbf{x }}_i \in X_0\cap {\varOmega }^c\). Since \(\widetilde{V}\) is a solution of the numerical HJB equation (23), it is a supersolution of the numerical HJB equation (Definition 29). If \(\overline{\mathbf{x }}_i \in X_0\cap {\varOmega }\), then

$$\begin{aligned} \widetilde{H}[\overline{\mathbf{NF }}(\overline{\mathbf{x }}_i),\widetilde{V}](\overline{\mathbf{x }}_i,\widetilde{V}(\overline{\mathbf{x }}_i)) = \widetilde{H}[S(\overline{\mathbf{x }}_i),\widetilde{V}](\overline{\mathbf{x }}_i,\widetilde{V}(\overline{\mathbf{x }}_i)) \ge 0. \end{aligned}$$
(47)

Furthermore if \(\overline{\mathbf{x }}_i \in X_0 \cap {\varOmega }\), Theorem 5 must also hold. That is, since \(\left\| \nabla ^2 \psi \right\| _2 = \frac{1}{\epsilon }\),

$$\begin{aligned} |H(\overline{\mathbf{x }}_i,\nabla \psi (\overline{\mathbf{x }}_i)) - \widetilde{H}[S(\overline{\mathbf{x }}_i),\psi ](\overline{\mathbf{x }}_i,\psi (\overline{\mathbf{x }}_i))| \le \frac{C_1}{\epsilon }h_{max}. \end{aligned}$$
(48)

It will be shown that (47) and (48) cannot simultaneously be true, implying \(\overline{\mathbf{x }}_i \in X_0 \cap {\varOmega }^c\). If (47) is true, then by (46), \(\widetilde{H}[S(\overline{\mathbf{x }}_i),\psi ](\overline{\mathbf{x }}_i,\psi (\overline{\mathbf{x }}_i)) \ge 0\). By (45),

$$\begin{aligned} H(\overline{\mathbf{x }}_i, \nabla \psi (\overline{\mathbf{x }}_i)) - \widetilde{H}[S(\overline{\mathbf{x }}_i),\psi ](\overline{\mathbf{x }}_i,\psi (\overline{\mathbf{x }}_i)) < -\frac{C_1}{\epsilon }h_{max}. \end{aligned}$$

Therefore (48) is false.

Otherwise, if (48) were true, using (45),

$$\begin{aligned} H(\overline{\mathbf{x }}_i, \nabla \psi (\overline{\mathbf{x }}_i)) - \widetilde{H}[S(\overline{\mathbf{x }}_i),\psi ](\overline{\mathbf{x }}_i,\psi (\overline{\mathbf{x }}_i)) \ge -\frac{C_1}{\epsilon } h_{max} > H(\overline{\mathbf{x }}_i, \nabla \psi (\overline{\mathbf{x }}_i)). \end{aligned}$$

Hence with (46),

$$\begin{aligned} \widetilde{H}[\overline{\mathbf{NF }}(\overline{\mathbf{x }}_i),\widetilde{V}](\overline{\mathbf{x }}_i,\widetilde{V}(\overline{\mathbf{x }}_i)) =\widetilde{H}[S(\overline{\mathbf{x }}_i),\widetilde{V}](\overline{\mathbf{x }}_i,\widetilde{V}(\overline{\mathbf{x }}_i)) \le \widetilde{H}[S(\overline{\mathbf{x }}_i),\psi ](\overline{\mathbf{x }}_i,\psi (\overline{\mathbf{x }}_i)) < 0. \end{aligned}$$

Therefore (47) is false. Hence \(\overline{\mathbf{x }}_i \in X_0 \cap {\varOmega }^c\).

Case 2

If \(\overline{\mathbf{x }}_i \in X_0 \cap {\varOmega }\), from Theorem 5,

$$\begin{aligned} \widetilde{H}[S(\overline{\mathbf{x }}_i),\psi ](\overline{\mathbf{x }}_i,\psi (\overline{\mathbf{x }}_i)) - H(\overline{\mathbf{x }}_i,\nabla \psi (\overline{\mathbf{x }}_i)) \le \frac{C_1}{\epsilon }h_{max}. \end{aligned}$$
(49)

From (46), Definition 24, and the fact that \(\widetilde{V}\) is a supersolution of the numerical HJB equation (23) (Definition 29),

$$\begin{aligned} \widetilde{H}[S(\overline{\mathbf{x }}_i),\psi ](\overline{\mathbf{x }}_i, \psi (\overline{\mathbf{x }}_i)) \ge \widetilde{H}[S(\overline{\mathbf{x }}_i),\widetilde{V}](\overline{\mathbf{x }}_i, \widetilde{V}(\overline{\mathbf{x }}_i)) = \widetilde{H}[\overline{\mathbf{NF }}(\overline{\mathbf{x }}_i),\widetilde{V}](\overline{\mathbf{x }}_i, \widetilde{V}(\overline{\mathbf{x }}_i)) \ge 0. \end{aligned}$$

From (44) and (49),

$$\begin{aligned} \frac{C_1}{\epsilon }h_{max} + \widetilde{H}[S(\overline{\mathbf{x }}_i),\psi ](\overline{\mathbf{x }}_i,\psi (\overline{\mathbf{x }}_i)) -\lambda H(\overline{\mathbf{x }},\nabla \phi (\overline{\mathbf{x }})) < \frac{C_1}{\epsilon } h_{max}. \end{aligned}$$
(50)

Since \(\overline{\mathbf{x }}_i \in X_0 \cap {\varOmega }\), \(\widetilde{H}[S(\overline{\mathbf{x }}_i),\psi ](\overline{\mathbf{x }}_i,\psi (\overline{\mathbf{x }}_i)) \ge 0\). From (50) and \(0<\lambda <1\),

$$\begin{aligned} H(\overline{\mathbf{x }},\nabla \phi (\overline{\mathbf{x }})) > 0, \end{aligned}$$

which, by Definition 7 of the viscosity subsolution, implies \(\overline{\mathbf{x }} \in \partial {\varOmega }\). Hence at most one of the maximizers \(\overline{\mathbf{x }}\) and \(\overline{\mathbf{x }}_i\) of \(M_{\epsilon ,\lambda }\) can belong to \({\varOmega }\).

Step 2 :

An upper bound on \(M_{\epsilon ,\lambda }\) (36) will be found.

Case 1

\(\overline{\mathbf{x }} \in \overline{{\varOmega }},\, \overline{\mathbf{x }}_i \in X_0 \cap {\varOmega }^c\).

Let \(\check{\mathbf{x }} = \hbox {arg min}_{\mathbf{x } \in \partial {\varOmega }} \left\| \overline{\mathbf{x }}_i - \mathbf x \right\| \), and let \(\underline{\mathbf{x }}_i\) be the point where the line segment from \(\overline{\mathbf{x }}\) to \(\overline{\mathbf{x }}_i\) intersects \(\partial {\varOmega }\). For \(\overline{\mathbf{x }} \in \partial {\varOmega }\), \(\underline{\mathbf{x }}_i = \overline{\mathbf{x }}\). Since \(\overline{{\varOmega }}\) is convex, by Lemma 6, the angle between the vectors \(\underline{\mathbf{x }}_i - \check{\mathbf{x }}\) and \(\overline{\mathbf{x }}_i - \check{\mathbf{x }}\) is nonacute. Using the cosine law,

$$\begin{aligned} \left\| \underline{\mathbf{x }}_i - \overline{\mathbf{x }}_i\right\| ^2&= \left\| \underline{\mathbf{x }}_i - \check{\mathbf{x }}\right\| ^2 + \left\| \overline{\mathbf{x }}_i - \check{\mathbf{x }}\right\| ^2 - 2 ( \underline{\mathbf{x }}_i - \check{\mathbf{x }}) \cdot ( \overline{\mathbf{x }}_i -\check{\mathbf{x }}), \\&\ge \left\| \underline{\mathbf{x }}_i - \check{\mathbf{x }}\right\| ^2, \\ \left\| \underline{\mathbf{x }}_i - \overline{\mathbf{x }}_i\right\|&\ge \left\| \underline{\mathbf{x }}_i - \check{\mathbf{x }}\right\| . \end{aligned}$$

Since \(\underline{\mathbf{x }}_i\) lies on the line segment from \(\overline{\mathbf{x }}\) to \(\overline{\mathbf{x }}_i\), \(\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| = \left\| \overline{\mathbf{x }} - \underline{\mathbf{x }}_i\right\| + \left\| \underline{\mathbf{x }}_i - \overline{\mathbf{x }}_i\right\| \). With the triangle inequality,

$$\begin{aligned} \left\| \overline{\mathbf{x }} - \underline{\mathbf{x }}_i\right\| + \left\| \underline{\mathbf{x }}_i - \overline{\mathbf{x }}_i\right\|&\ge \left\| \overline{\mathbf{x }} - \underline{\mathbf{x }}_i\right\| + \left\| \underline{\mathbf{x }}_i - \check{\mathbf{x }}\right\| , \nonumber \\ \left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\|&\ge \left\| \overline{\mathbf{x }} - \check{\mathbf{x }}\right\| . \end{aligned}$$
(51)

By the Lipschitz-continuity of V with constant \(C_0\), \(0 < \lambda < 1\), \(|\widetilde{V}| \le C_0\), and since \(\widetilde{V}\) is a supersolution to the numerical HJB equation (23), \(\widetilde{V}(\overline{\mathbf{x }}_i) \ge q(\check{\mathbf{x }})\),

$$\begin{aligned} M_{\epsilon ,\lambda }&= \lambda V(\overline{\mathbf{x }}) - \widetilde{V}(\overline{\mathbf{x }}_i) - \frac{\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| ^2}{2\epsilon }, \nonumber \\&= \lambda ( V(\overline{\mathbf{x }}) - \widetilde{V}(\overline{\mathbf{x }}_i)) - (1-\lambda )\widetilde{V}(\overline{\mathbf{x }}_i) - \frac{\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| ^2}{2\epsilon }, \nonumber \\&\le \lambda ( V(\overline{\mathbf{x }}) - q(\check{\mathbf{x }})) + (1-\lambda )C_0 - \frac{\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| ^2}{2\epsilon } , \end{aligned}$$
(52)

If \(\overline{\mathbf{x }} \in {\varOmega }\), then \(V(\check{\mathbf{x }}) \le q(\check{\mathbf{x }})\), and from (52),

$$\begin{aligned} M_{\epsilon ,\lambda } \le \lambda ( V(\overline{\mathbf{x }}) - V(\check{\mathbf{x }})) +(1-\lambda )C_0 - \frac{\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| ^2}{2\epsilon }. \end{aligned}$$
(53)

Otherwise \(\overline{\mathbf{x }} \in \partial {\varOmega }\), so \(V(\overline{\mathbf{x }}) \le q(\overline{\mathbf{x }})\), and from (52),

$$\begin{aligned} M_{\epsilon ,\lambda } \le \lambda (q(\overline{\mathbf{x }}) - q(\check{\mathbf{x }})) + (1-\lambda )C_0 - \frac{\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| ^2}{2\epsilon }. \end{aligned}$$
(54)

The Lipschitz continuity of both q and V with constant \(C_0\) in (53) and (54), the bound \(\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| \ge \left\| \overline{\mathbf{x }} - \check{\mathbf{x }}\right\| \) from (51), and \(0<\lambda <1\) yield

$$\begin{aligned} M_{\epsilon ,\lambda } \le C_0\left\| \overline{\mathbf{x }} - \check{\mathbf{x }}\right\| + (1-\lambda )C_0 - \frac{\left\| \overline{\mathbf{x }} - \check{\mathbf{x }}\right\| ^2}{2\epsilon }, \end{aligned}$$
(55)

which is quadratic in \(\left\| \overline{\mathbf{x }} - \check{\mathbf{x }}\right\| \) and is maximized at \(\left\| \overline{\mathbf{x }} - \check{\mathbf{x }}\right\| = C_0\epsilon \). Thus,

$$\begin{aligned} M_{\epsilon ,\lambda } \le (1-\lambda )C_0 + \frac{C_0^2 \epsilon }{2}. \end{aligned}$$
(56)
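
This maximization can be checked directly: writing \(t = \left\| \overline{\mathbf{x }} - \check{\mathbf{x }}\right\| \), the bound (55) is \((1-\lambda )C_0 + C_0t - \frac{t^2}{2\epsilon }\), and

$$\begin{aligned} \frac{d}{dt}\left( C_0t - \frac{t^2}{2\epsilon }\right) = C_0 - \frac{t}{\epsilon } = 0 \iff t = C_0\epsilon , \qquad C_0(C_0\epsilon ) - \frac{(C_0\epsilon )^2}{2\epsilon } = \frac{C_0^2\epsilon }{2}. \end{aligned}$$

The same quadratic maximization reappears in Case 2 below.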

Case 2

\(\overline{\mathbf{x }} \in \partial {\varOmega }\), \(\overline{\mathbf{x }}_i \in X_0 \cap {\varOmega }\).

From Lemma 11, there exists \(\hat{\mathbf{x }}_i \in X_0 \cap {\varOmega }^c\) such that

$$\begin{aligned} \left\| \overline{\mathbf{x }} - \hat{\mathbf{x }}_i\right\| \le h_{max}. \end{aligned}$$
(57)

Let \(\widetilde{\mathbf{x }} = \hbox {arg min}_{\mathbf{x } \in \partial {\varOmega }} \left\| \hat{\mathbf{x }}_i - \mathbf x \right\| \). Using \(0 < \lambda < 1\), \(\widetilde{V}(\hat{\mathbf{x }}_i) \ge q(\widetilde{\mathbf{x }})\), \(V(\overline{\mathbf{x }}) \le q(\overline{\mathbf{x }})\), and the Lipschitz continuity of q and \(\widetilde{V}\), both with constant \(C_0\),

$$\begin{aligned} M_{\epsilon ,\lambda }&= \lambda V(\overline{\mathbf{x }}) - \widetilde{V}(\overline{\mathbf{x }}_i) - \frac{\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| ^2}{2\epsilon },\\&= \lambda ( V(\overline{\mathbf{x }}) - \widetilde{V}(\overline{\mathbf{x }}_i)) - (1-\lambda )\widetilde{V}(\overline{\mathbf{x }}_i) - \frac{\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| ^2}{2\epsilon }, \\&\le \lambda (q(\overline{\mathbf{x }}) - q(\widetilde{\mathbf{x }}) + q(\widetilde{\mathbf{x }}) - \widetilde{V}(\overline{\mathbf{x }}_i)) + (1-\lambda )C_0 - \frac{\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| ^2}{2\epsilon }, \\&\le \lambda C_0\left\| \overline{\mathbf{x }} - \widetilde{\mathbf{x }}\right\| + \lambda (\widetilde{V}(\hat{\mathbf{x }}_i) - \widetilde{V}(\overline{\mathbf{x }}_i)) + (1-\lambda )C_0 - \frac{\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| ^2}{2\epsilon }, \\&\le C_0 (\left\| \overline{\mathbf{x }} - \widetilde{\mathbf{x }}\right\| + \left\| \hat{\mathbf{x }}_i - \overline{\mathbf{x }}_i\right\| ) + (1-\lambda )C_0 - \frac{\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| ^2}{2\epsilon }, \end{aligned}$$

Using the triangle inequality, \(\left\| \hat{\mathbf{x }}_i - \overline{\mathbf{x }}_i\right\| \le \left\| \hat{\mathbf{x }}_i - \widetilde{\mathbf{x }}\right\| + \left\| \widetilde{\mathbf{x }} - \overline{\mathbf{x }}\right\| + \left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| \), hence

$$\begin{aligned} M_{\epsilon ,\lambda } \le (1-\lambda )C_0 + C_0 \left( \left\| \overline{\mathbf{x }} - \widetilde{\mathbf{x }}\right\| + \left\| \hat{\mathbf{x }}_i - \widetilde{\mathbf{x }}\right\| + \left\| \widetilde{\mathbf{x }} - \overline{\mathbf{x }}\right\| + \left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| \right) - \frac{\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| ^2}{2\epsilon }. \end{aligned}$$

By Lemma 6 and the cosine law, \(\left\| \overline{\mathbf{x }} - \widetilde{\mathbf{x }}\right\| \le \left\| \overline{\mathbf{x }} - \hat{\mathbf{x }}_i\right\| \). From the definition of \(\widetilde{\mathbf{x }}\) as the nearest boundary point to \(\hat{\mathbf{x }}_i\), and since \(\overline{\mathbf{x }} \in \partial {\varOmega }\), \(\left\| \hat{\mathbf{x }}_i-\widetilde{\mathbf{x }}\right\| \le \left\| \overline{\mathbf{x }} - \hat{\mathbf{x }}_i\right\| \). Therefore,

$$\begin{aligned} M_{\epsilon ,\lambda } \le (1-\lambda )C_0 + 3C_0\left\| \overline{\mathbf{x }} - \hat{\mathbf{x }}_i\right\| + C_0\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| - \frac{\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| ^2}{2\epsilon }. \end{aligned}$$

From (57), and maximizing the quadratic in \(\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| \) at \(\left\| \overline{\mathbf{x }} - \overline{\mathbf{x }}_i\right\| = C_0\epsilon \),

$$\begin{aligned} M_{\epsilon ,\lambda } \le (1-\lambda )C_0 + 3C_0h_{max} + \frac{C_0^2\epsilon }{2}. \end{aligned}$$
(58)
Step 3 :

The upper bound on \(M_{\epsilon ,\lambda }\) in (58) is larger than that in (56), so (58) covers both cases. From (37),

$$\begin{aligned} V(\mathbf{x }_i) - \widetilde{V}(\mathbf{x }_i)&\le C_0(1 - \lambda ) + M_{\epsilon ,\lambda }, \\&\le 2C_0(1-\lambda ) + 3C_0h_{max} + \frac{C_0^2 \epsilon }{2},\\&\le 2C_0\frac{2}{G_{min}}\left( \frac{C_1}{\epsilon }h_{max} + C_0L_g\epsilon \right) + 3C_0h_{max} + \frac{C_0^2 \epsilon }{2}, \\&\le \left( \frac{4C_0C_1}{G_{min}} + \frac{4C_0^2 L_g}{G_{min}} + \frac{C_0^2}{2} \right) \left( \frac{h_{max}}{\epsilon } + \epsilon \right) + 3C_0h_{max}, \end{aligned}$$

The function \(\epsilon \mapsto \frac{h_{max}}{\epsilon } + \epsilon \) attains its global minimum \(2\sqrt{h_{max}}\) at \(\epsilon = \sqrt{h_{max}}\). Since \(h_{max} < h_0 \le 1\) implies \(h_{max} \le \sqrt{h_{max}}\), setting \(C = 2\left( \frac{4C_0C_1}{G_{min}} + \frac{4C_0^2 L_g}{G_{min}} + \frac{C_0^2}{2} + 3C_0 \right) \) gives, for \(h_{max} <h_0 = \min \{\frac{G_{min}^2}{4(C_1+C_0L_g)^2},1\}\),

$$\begin{aligned} V(\mathbf{x }_i) - \widetilde{V}(\mathbf{x }_i) \le C \sqrt{h_{max}}. \end{aligned}$$
(59)

Finally, a symmetrical argument, using that V is a viscosity supersolution of (12) (Definition 8) and that \(\widetilde{V}\) is a subsolution of the numerical HJB equation (23) (Definition 28), shows (59) with \(V(\mathbf{x }_i)\) and \(\widetilde{V}(\mathbf{x }_i)\) reversed. Hence for \(h_{max} < h_0\),

$$\begin{aligned} \max _{\mathbf{x }_i \in X_0} |\widetilde{V}(\mathbf{x }_i) - V(\mathbf{x }_i)| \le C\sqrt{h_{max}}. \end{aligned}$$

\(\square \)

Theorem 6 will now be extended to all of \(\overline{{\varOmega }}_X\). Define \(\hat{V}:\overline{{\varOmega }}_X \rightarrow \mathbb {R}\) by

$$\begin{aligned} \hat{V}(\mathbf x ) = \sum _{j=0}^n \zeta _j V(\mathbf{x }_j^\mathbf{s }) \text { for } \mathbf x = \sum _{j=0}^n \zeta _j \mathbf{x }_j^\mathbf{s }. \end{aligned}$$

At each vertex \(\mathbf{x }_i \in X_0\), \(\hat{V}(\mathbf{x }_i) = V(\mathbf{x }_i)\).
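
As an illustration of this piecewise-linear interpolation, the following minimal MATLAB sketch evaluates such an interpolant on a single triangle in \({\mathbb {R}}^2\) (MATLAB being the language used for the experiments in Section 7). The function name and interface are illustrative only, not part of the OUM implementation.

function vhat = bary_interp(Vx, v, x)
% Piecewise-linear (barycentric) interpolation on one triangle.
% Vx: 3x2 matrix of vertex coordinates, one vertex per row.
% v:  3x1 vector of values at the vertices, e.g. V(x_j^s).
% x:  1x2 query point, assumed to lie in the triangle.
A = [Vx'; 1 1 1];    % rows: x-coordinates, y-coordinates, ones
b = [x'; 1];
zeta = A \ b;        % barycentric coordinates; nonnegative, summing to 1
vhat = zeta' * v;    % interpolated value: sum_j zeta_j * v_j
end

For example, bary_interp([0 0; 1 0; 0 1], [1; 2; 3], [0.25 0.25]) returns 1.75, the value of the linear interpolant at (0.25, 0.25).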

Lemma 14

There exists \(D_1>0\) such that for all \(\mathbf x \in \overline{{\varOmega }}_X\),

$$\begin{aligned} |V(\mathbf x ) - \hat{V}(\mathbf x )| \le D_1h_{max}. \end{aligned}$$
(60)

Proof

Let \(\zeta \in {\varXi }_n\) and \(\mathbf x \in \mathbf s \) be such that \(\mathbf x = \sum _{j=0}^n \zeta _j \mathbf{x }_j^\mathbf{s }\). Using \(V(\mathbf{x }_i) = \hat{V}(\mathbf{x }_i)\) for all vertices \(\mathbf{x }_i \in X_0\), \(\sum _{j=0}^n \zeta _j = 1\), and Lemma 7 with Lipschitz constant \(L_V=2G_{max}\),

$$\begin{aligned} |V(\mathbf x ) - \hat{V}(\mathbf x )| \le \sum _{j=0}^n \zeta _j |V(\mathbf x ) - V(\mathbf{x }_j^\mathbf{s })| \le 2G_{max}h_{max}. \end{aligned}$$

Hence (60) holds with \(D_1 = 2G_{max}\).

\(\square \)

Fig. 5 A numerical example: (a) rectangular speed profile \(\mathcal {U}_{g}(\mathbf x )\) with length 6 in the x-direction and 2 in the y-direction; (b) the exact solution V in a three-dimensional view; (c) the error between \(\widetilde{V}\) and V (viewed from above), which is greatest at points where \(\nabla V\) is not defined

Corollary 1

There exists \(D_2>0\) such that for all \(\mathbf x \in \overline{{\varOmega }}_X\),

$$\begin{aligned} |V(\mathbf x ) - \widetilde{V}(\mathbf x )| \le D_2\sqrt{h_{max}}, \end{aligned}$$
(61)

for \(h_{max} < h_0\) as described in Theorem 6.

Proof

Let \(\zeta \in {\varXi }_n\) and \(\mathbf x \in \mathbf s \) be such that \(\mathbf x = \sum _{j=0}^n \zeta _j \mathbf{x }_j^\mathbf{s }\). For \(\mathbf x \in \overline{{\varOmega }}_X\), \(\widetilde{V}(\mathbf x ) = \sum _{j=0}^n \zeta _j \widetilde{V}(\mathbf{x }_j^\mathbf{s })\). From Lemma 14 and Theorem 6,

$$\begin{aligned} V(\mathbf x ) - \widetilde{V}(\mathbf x )&\le D_1h_{max} + \hat{V}(\mathbf x ) - \widetilde{V}(\mathbf x ) \\&= D_1h_{max} + \sum _{j=0}^n \zeta _j \left( \hat{V}(\mathbf{x }_j^\mathbf{s }) - \widetilde{V}(\mathbf{x }_j^\mathbf{s })\right) \\&\le (D_1+C)\sqrt{h_{max}}, \end{aligned}$$

for \(h_{max} < h_0 \le 1\), using \(h_{max} \le \sqrt{h_{max}}\). The proof for \(\widetilde{V}(\mathbf x ) - V(\mathbf x )\) is symmetrical. Hence (61) holds with \(D_2 = D_1 + C\). \(\square \)

7 Numerical Convergence of OUM Example

An example of the error computed using OUM for the boundary value problem is given. The OUM algorithm was programmed in MATLAB\(^{{\textregistered }}\) on an ASUS X550L laptop with an Intel\(^{\textregistered }\) Core\(^\text {TM}\) i5-4210U processor (1.7 GHz/2.4 GHz) and 4 GB RAM. As in [21], the update for the OUM algorithm (19) was solved using golden section search. For \(\overline{{\varOmega }} = [-500,500]\times [-500,500]\) and \(\partial {\varOmega } = \{(x,y)\in \overline{{\varOmega }}\,|\, |x| = 500 \text { or } |y| = 500\}\), the weight g corresponded to a rectangular speed profile (Definition 5) centred about \(\mathbf x \) with dimensions 6 in the x-direction and 2 in the y-direction (Fig. 5a). The same speed profile was used for all \(\mathbf x \in {\varOmega }\). The boundary function was \(q(\mathbf x ) = 0\) for \(\mathbf x \in \partial {\varOmega }\). The analytic solution is the concatenation of four planes within \({\varOmega }\): \(y + z = 500\), \(x + 3z = 500\), \(-y + z = 500\) and \(-x + 3z = 500\) (Fig. 5b).
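
Golden section search is a standard derivative-free method for minimizing a unimodal function over an interval. The following MATLAB sketch shows the generic method, assuming a unimodal cost f on [a,b]; it is an illustration, not the code used for the experiments.

function [xmin, fmin] = golden_section(f, a, b, tol)
% Minimize a unimodal function f on [a,b] by golden section search.
phi = (sqrt(5) - 1)/2;      % inverse golden ratio, approximately 0.618
c = b - phi*(b - a);        % left interior point
d = a + phi*(b - a);        % right interior point
while (b - a) > tol
    if f(c) < f(d)
        b = d;              % the minimizer lies in [a, d]
    else
        a = c;              % the minimizer lies in [c, b]
    end
    c = b - phi*(b - a);    % recompute the interior points
    d = a + phi*(b - a);
end
xmin = (a + b)/2;
fmin = f(xmin);
end

For clarity, this sketch re-evaluates f at both interior points on every iteration; the classical variant reuses one of the two evaluations, halving the number of function calls.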

Table 1 Accuracy of OUM for a boundary value problem: the OUM was used to solve the static HJB problem with a rectangular profile on five meshes
Fig. 6 Average and maximum error for the OUM convergence example: average error shown in red (below), maximum error shown in black (above). The overall convergence rates measured were \(r_{avg} = 1.043\) and \(r_{max} = 0.523\) (Color figure online)

Given a set of boundary points, meshes with uneven triangles were generated using Mesh2D [12]. The error values are given in Table 1 and plotted in Fig. 6. Using polyfit in MATLAB on the data in Table 1, affine least-squares fits of the errors in log–log coordinates were computed. Using all five data points, overall convergence rates of \(r_{avg} = 1.043\) and \(r_{max} = 0.523\) were obtained for the average and maximum error across the vertices respectively. The convergence rate for maximum error in this example closely matches the theoretical result shown earlier. In average error, the OUM algorithm is at most first-order accurate (as described in [21]), since the update formula (19) is a first-order approximation. Since V is Lipschitz continuous, by Rademacher's theorem \(\nabla V\) can be undefined only on a set of measure zero. The error for all discretizations had the same general shape, appearing greatest near where \(\nabla V\) was undefined (Fig. 5c). Characteristics flow into, but not out of, points where \(\nabla V\) is undefined, preventing the error from propagating further [19]; hence the expected first-order convergence rate in average error.
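
The reported rates can be reproduced from Table 1 by a least-squares affine fit in log–log coordinates. A minimal MATLAB sketch, assuming column vectors hmax, err_avg and err_max hold the tabulated mesh sizes and errors (the variable names are illustrative):

% Fit log(error) = r*log(h_max) + const by least squares.
p_avg = polyfit(log(hmax), log(err_avg), 1);
p_max = polyfit(log(hmax), log(err_max), 1);
r_avg = p_avg(1);    % observed slope, 1.043 in this example
r_max = p_max(1);    % observed slope, 0.523, consistent with O(sqrt(h_max))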

8 Conclusions and Future Work

It was proven in this paper that the rate of convergence of the approximate solution provided by OUM to the viscosity solution of the static HJB equation with prescribed boundary values is at least \(\mathcal {O}(\sqrt{h_{max}})\) in maximum error. The basic idea of the proof is an extension of a similar proof for FMM in [20]. A key step was to show the existence of a directionally complete stencil, which, by existing results, implies that the numerical Hamiltonian for the OUM is both consistent and monotonic.

An extension of this work would be to prove a convergence rate for OUM in the single-source point formulation of the static HJB equation. This would extend the applicability of the result shown here to point-to-point path planning problems, such as for rovers [22] and other robots [24]. Constructing a directionally complete stencil as done here may be difficult near the source point.

Another direction of research could be to prove that the convergence in average error of OUM is at a rate of \(\mathcal {O}(h_{max})\), as was observed in the example in this paper. This could follow because OUM is a first-order method and V is typically non-differentiable only on a set of measure zero. Additional regularity assumptions, such as a continuously differentiable speed profile, may lead to a proof of first-order convergence in average error applicable to many problems.