
1 Introduction

Finite horizon problems of optimal control have been studied intensively since the pioneering work of Pontryagin, Boltyanskii, Gamkrelidze and Mishchenko [27], Hestenes [18], Bellman [9] and Isaacs [19, 20] during the cold war—see for instance [7, 22, 23] for major references, or [14] for a short, clear introduction. A classical model considers the following controlled dynamic over \({\mathbb{R}}_{+}\)

$$\left\{\begin{array}{@{}l@{\quad }l@{}} y\prime(s) = f(y(s),u(s))\quad \\ y(0) = {y}_{0} \quad \end{array} \right.$$
(10.1)

where y is a function from ℝ  +  to ℝ n, y 0 is a point in ℝ n, u is the control function which belongs to \(\mathcal{U}\), the set of Lebesgue-measurable functions from ℝ  +  to a metric space U and the function f : ℝ n ×U → ℝ n satisfies the usual conditions, that is: Lipschitz with respect to the state variable, continuous with respect to the control variable and bounded by a linear function of the state variable, for any control u.

Together with the dynamic, an objective function g is given, interpreted as the cost function which is to be minimized and assumed to be Borel-measurable from ℝ n ×U to [0, 1]. For each finite horizon t ∈ ]0, +∞[, the average value of the optimal control problem with horizon t is defined as

$${V }_{t}({y}_{0}) {=\inf }_{u\in \mathcal{U}}\frac{1} {t}{\int \nolimits }_{0}^{t}g(y(s,u,{y}_{ 0}),u(s))\,\mathrm{d}s.$$
(10.2)

Whenever the trajectories considered are infinite, it is also natural to define, for any discount factor λ > 0, the λ-discounted value of the optimal control problem as

$${W}_{\lambda }({y}_{0}) {=\inf }_{u\in \mathcal{U}}\lambda {\int \nolimits }_{0}^{+\infty }\mathrm{{e}}^{-\lambda s}g(y(s,u,{y}_{ 0}),u(s))\,\mathrm{d}s.$$
(10.3)

In this framework the problem was initially to determine whether, for a given finite horizon T and a given starting point y 0, a minimizing control u exists, i.e. a solution of the optimal control problem (T, y 0). Systems with large, but fixed, horizons were considered and, in particular, the class of “ergodic” systems (that is, those in which any starting point in the state space Ω is controllable to any point in Ω) has been thoroughly studied [2, 3, 5, 6, 8, 11, 25]. These systems are asymptotically independent of the starting point as the horizon goes to infinity. When the horizon is infinite, the literature on optimal control has mainly focused on properties of given trajectories as time tends to infinity. This approach corresponds to the uniform approach in a game-theoretical framework and is often opposed to the asymptotic approach (described below), which we consider in what follows, and which has received considerably less attention.

In a game-theoretical, discrete-time framework, the same kind of problem has been considered since [29], but with several differences in the approach: (1) the starting point may be chosen at random (a probability μ may be given on Ω, which randomly determines the point from which the controller will start the play); (2) the controllability-ergodicity condition is generally not assumed; (3) because of the inherent recursive structure of the process played in discrete time, the problem is generally considered for all initial states and time horizons.

For these reasons, what is called the “asymptotic approach”—the behavior of V t ( ⋅) as the horizon t tends to infinity, or of W λ( ⋅) as the discount factor λ tends to zero—has been studied more in this discrete-time setup. Moreover, when it is considered in optimal control, in most cases [4, 10] an ergodic assumption is made which not only ensures the convergence of V t (y 0) to some V, but also forces the limit function V to be independent of the starting point y 0. The general asymptotic case, in which no ergodicity condition is assumed, has to our knowledge been studied only recently. In [11, 28] the authors prove, in different frameworks, the convergence of V t ( ⋅) and W λ( ⋅) to some non-constant function V (y 0).

Some important, closely related questions are the following: does the convergence of V t ( ⋅) imply the convergence of W λ( ⋅)? Or vice versa? If they both converge, do the limits coincide? A partial answer to these questions goes back to the beginning of the twentieth century, when Hardy and Littlewood proved (see [17]) that for any bounded sequence of real numbers, the convergence of the Cesàro means is equivalent to the convergence of the Abel means, and that the limits are then the same:

Theorem 10.1 ([17]).

For any bounded sequence of reals {a n } n≥1 , define \({V }_{n} = \frac{1} {n} \sum\limits_{i=1}^{n}{a}_{i}\) and \({W_\lambda} = \lambda\sum\limits_{i=1}^{+\infty}{\left(1-\lambda\right)^{i-1}}a_i\) . Then,

$$\mathop{lim inf }\limits_{n \rightarrow +\infty }\ {V }_{n} \leq \mathop{ lim inf }\limits_{\lambda \rightarrow 0}\ {W}_{\lambda } \leq \mathop{ lim sup}\limits_{\lambda \rightarrow 0}\ {W}_{\lambda } \leq \mathop{ lim sup}\limits_{n \rightarrow +\infty }\ {V }_{n}.$$

Moreover, if the central inequality is an equality, then all inequalities are equalities.
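
As a purely numerical illustration of Theorem 10.1 (this sketch is ours and not part of the original argument; the sequence a n below is an arbitrary convergent example), the following Python snippet compares a Cesàro mean V n with an Abel mean W λ; both are close to the common limit.

```python
import numpy as np

# Arbitrary bounded sequence a_n (n >= 1); here a_n = 1/2 + (-1)^n / n, converging to 1/2.
N = 200_000
n = np.arange(1, N + 1)
a = 0.5 + (-1.0) ** n / n

# Cesàro mean V_N = (1/N) * sum_{i <= N} a_i.
V_N = a.mean()

# Abel mean W_lambda = lambda * sum_{i >= 1} (1 - lambda)^(i-1) * a_i, truncated at N terms
# (the truncation error is of order (1 - lambda)^N, negligible here).
lam = 1e-4
W_lam = lam * np.sum((1.0 - lam) ** (n - 1) * a)

print(V_N, W_lam)   # both values are close to 1/2
```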

Noticing that {a n } can be viewed as a sequence of costs for some deterministic (uncontrolled) dynamic in discrete time, this result gives the equivalence between the convergence of V n and the convergence of W λ, to the same limit. In 1971, setting \({V }_{t} = \frac{1} {t} { \int \nolimits }_{0}^{t}g(s)\,\mathrm{d}s\) and \({W}_{\lambda } = \lambda { \int \nolimits }_{0}^{+\infty }\mathrm{{e}}^{-\lambda s}g(s)\,\mathrm{d}s\), for a given Lebesgue-measurable, bounded, real function g, Feller proved that the same result holds for continuous-time uncontrolled dynamics (particular case of Theorem 2, p. 445 in [15]).

Theorem 10.2 ([15]).

$$\mathop{lim inf }\limits_{t \rightarrow +\infty }\ {V }_{t} \leq \mathop{ lim inf }\limits_{\lambda \rightarrow 0}\ {W}_{\lambda } \leq \mathop{ lim sup}\limits_{\lambda \rightarrow 0}\ {W}_{\lambda } \leq \mathop{ lim sup}\limits_{t \rightarrow +\infty }\ {V }_{t}.$$

Moreover, if the central inequality is an equality, then all inequalities are equalities.

In 1992, Lehrer and Sorin [24] considered a discrete-time controlled dynamic, defined by a correspondence Γ : Ω ⇉ Ω with nonempty values, and by g, a bounded real cost function defined on Ω. A feasible play at z ∈ Ω is an infinite sequence y = { y n } n ≥ 1 such that y 1 = z and y n + 1 ∈ Γ(y n ). The average and discounted value functions are defined respectively by \({V }_{n}(z) =\inf \ \frac{1} {n} \sum\limits_{i=1}^{n}g({y}_{i})\) and \({W}_{\lambda }(z) =\inf \ \lambda \sum\limits_{i=1}^{+\infty }{(1 - \lambda )}^{i-1}g({y}_{i})\), where the infima are taken over the feasible plays at z.

Theorem 10.3 ([24]).

$$\mathop{\lim }\limits_{n \rightarrow +\infty }\ {V }_{n}(z) = V (z)\text{ uniformly on }\Omega \Longleftrightarrow\mathop{\lim }\limits_{\lambda \rightarrow 0}\ {W}_{\lambda }(z) = V (z)\text{ uniformly on }\Omega.$$

This result establishes the equivalence between uniform convergence of W λ(z) as λ tends to 0 and uniform convergence of V n (z) as n tends to infinity, in the general case where the limit may depend on the starting point z. The uniformity condition is necessary: in the same article, the authors provide an example where only pointwise convergence holds and the limits differ.
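
To make the discrete-time definitions above concrete, here is a short Python sketch (ours, with an arbitrary toy correspondence on four states, not taken from [24]) computing V n by backward induction and W λ by value iteration; on a finite state space both quantities approach the same limit, in line with Theorem 10.3.

```python
# Toy finite model (our arbitrary choice): states 0..3 with correspondence Gamma and cost g.
Gamma = {0: [1, 2], 1: [1], 2: [3], 3: [0]}
g = {0: 1.0, 1: 0.0, 2: 1.0, 3: 1.0}
states = list(Gamma)

def V(n):
    # S[z] = minimal total cost over feasible plays of length k starting at z (k = 1, ..., n).
    S = {z: g[z] for z in states}
    for _ in range(n - 1):
        S = {z: g[z] + min(S[y] for y in Gamma[z]) for z in states}
    return {z: S[z] / n for z in states}

def W(lam, iters=20_000):
    # Value iteration for the fixed point W(z) = lam*g(z) + (1 - lam)*min_{y in Gamma(z)} W(y).
    w = {z: 0.0 for z in states}
    for _ in range(iters):
        w = {z: lam * g[z] + (1 - lam) * min(w[y] for y in Gamma[z]) for z in states}
    return w

print(V(1000))    # approximately the limit value at each state
print(W(1e-3))    # close to the same limit
```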

In 1998, Arisawa (see [4]) considered a continuous-time controlled dynamic and proved the equivalence between the uniform convergence of W λ and the uniform convergence of V t in the specific case of limits independent of the starting point.

Theorem 10.4 ([4]).

Let d ∈ ℝ, then

$$\mathop{\lim }\limits_{t \rightarrow +\infty }\ {V }_{t}(z) = d,\text{ uniformly on }\Omega \Longleftrightarrow\mathop{\lim }\limits_{\lambda \rightarrow 0+}\ {W}_{\lambda }(z) = d,\text{ uniformly on }\Omega.$$

This does not settle the general case, in which the limit function may depend on the starting point. For a continuous-time controlled dynamic in which V t (y 0) converges, as t goes to infinity, to some function V (y 0) depending on the state variable y 0, we prove the following:

Theorem 10.5.

V t (y 0 ) converges to V (y 0 ) uniformly on Ω if and only if W λ (y 0 ) converges to V (y 0 ) uniformly on Ω.

In fact, we will prove this result in a more general framework, described in Sect. 10.2. Some basic lemmas which turn out to be important tools will also be proven in that section. Section 10.3 will be devoted to the proof of our main result. Section 10.4 will conclude by pointing out, via an example, that uniform convergence is a necessary requirement for Theorem 10.5 to hold. A very simple dynamic is described, in which the pointwise limits of V t ( ⋅) and W λ( ⋅) exist but differ. It should be noted that our proofs (as well as the counterexample in Sect. 10.4) are adaptations to this continuous-time framework of ideas employed in a discrete-time setting in [24]. In the appendix we also point out that an alternative proof of our theorem can be obtained using the main theorem in [24] together with a discrete/continuous equivalence argument.

For completeness, let us briefly mention the other approach, referred to above as the uniform approach, which has also been deeply studied; see for example [12, 13, 16]. In these models, the optimal average cost is not taken over a finite period of time [0, t], which is then studied as t grows to infinity, as in [4, 15, 17, 24, 28] or in our framework. On the contrary, only infinite trajectories are considered, among which the value \(\overline{{V }_{t}}\) is defined as \(\inf {}_{u\in \mathcal{U}}{\sup }_{\tau \geq t}\frac{1} {\tau }{ \int }_{0}^{\tau }g(y(s,u,{y}_{0}),u(s))\,\mathrm{d}s\), or some closely related variation. The asymptotic behavior, as t tends to infinity, of the function \(\overline{{V }_{t}}\) has also been studied in that framework. In [16], both λ-discounted and average evaluations of an infinite trajectory are considered and their limits are compared. However, we stress that the asymptotic behavior of those quantities is in general not related to the asymptotic behavior of V t and W λ.

Finally, let us point out that in the framework of zero-sum differential games, that is, when the dynamic is controlled by two players with opposite goals, a Tauberian theorem is given in the ergodic case by Theorem 2.1 in [1]. However, to our knowledge the general, non-ergodic case is still an open problem.

2 Model

2.1 General Framework

We consider a deterministic dynamic programming problem in continuous time, defined by a measurable set of states Ω, a subset \(\mathcal{T}\) of Borel-measurable functions from \({\mathbb{R}}_{+}\) to Ω, and a bounded Borel-measurable real-valued function g defined on Ω. Without loss of generality we assume g : Ω → [0, 1]. For a given state x, define \(\Gamma (x) :=\{ X \in \mathcal{T} ,\ X(0) = x\}\), the set of all feasible trajectories starting from x. We assume Γ(x) to be nonempty for all x ∈ Ω. Furthermore, the correspondence Γ is closed under concatenation: given a trajectory X ∈ Γ(x) with X(s) = y, and a trajectory Y ∈ Γ(y), the concatenation of X and Y at time s is

$$\begin{array}{rcl} X {\circ }_{s}Y := \left\{\begin{array}{l@{\quad }l} X(t) \quad &\text{ if }t \leq s\\ Y (t - s)\quad &\text{ if } t \geq s \\ \quad \end{array} \right.& &\end{array}$$
(10.4)

and we assume that X ∘  s YΓ(x).

We are interested in the asymptotic behavior of the average and the discounted values. It is useful to denote the average and discounted payoffs of a play (or trajectory) X ∈ Γ(x) by:

$$\begin{array}{rcl}{ \gamma }_{t}(X)& :=& \dfrac{1} {t}{\int \nolimits }_{0}^{t}g(X(s))\,\mathrm{d}s\end{array}$$
(10.5)
$$\begin{array}{rcl}{ \nu }_{\lambda }(X)& :=& \lambda {\int \nolimits }_{0}^{+\infty }\mathrm{{e}}^{-\lambda s}g(X(s))\,\mathrm{d}s.\end{array}$$
(10.6)

These quantities are defined for t, λ ∈ ]0, +∞[. Naturally, we define the values as:

$$\begin{array}{rcl}{ V }_{t}(x)& =& {\inf }_{X\in \Gamma (x)}{\gamma }_{t}(X)\end{array}$$
(10.7)
$$\begin{array}{rcl}{ W}_{\lambda }(x)& =& {\inf }_{X\in \Gamma (x)}{\nu }_{\lambda }(X).\end{array}$$
(10.8)

Our main contribution is Theorem 10.5:

$$(A)\ {W}_{\lambda } \rightarrow _{\lambda \rightarrow 0}V,\text{ uniformly on }\Omega \Longleftrightarrow(B)\ {V }_{t} \rightarrow _{t \rightarrow \infty }V,\text{ uniformly on }\Omega.$$
(10.9)

Notice that our model is a natural adaptation to the continuous-time framework of deterministic dynamic programming problems played in discrete time; as pointed out in the introduction, this theorem is an extension to the continuous-time framework of the main result of [24], and our proof uses similar techniques.

This result can be applied to the model presented in Sect. 10.1: let \(\widetilde{\Omega } = {\mathbb{R}}^{n} \times U\) and for any \(({y}_{0},{u}_{0}) \in \widetilde{ \Omega }\), define \(\widetilde{\Gamma }({y}_{0},{u}_{0}) =\{ (y(\cdot ),u(\cdot ))\ \vert \ u \in \mathcal{U},\ u(0) = {u}_{0}\text{ and }y\text{ is the solution of (10.1)}\}\). Then \(\widetilde{\Omega }\), \(\widetilde{\Gamma }\) and g satisfy the assumptions of this section. Defining \(\widetilde{{V }}_{t}\) and \(\widetilde{{W}}_{\lambda }\) as in (10.7) and (10.8) respectively, since the solution of (10.1) does not depend on u(0) we get that

$$\begin{array}{rcl} \widetilde{{V }}_{t}({y}_{0},{u}_{0})& =& {V }_{t}({y}_{0}) \\ \widetilde{{W}}_{\lambda }({y}_{0},{u}_{0})& =& {W}_{\lambda }({y}_{0})\end{array}$$

Theorem 10.5 applied to \(\widetilde{V }\) and \(\widetilde{W}\) thus implies that V t converges uniformly to a function V in Ω if and only if W λ converges uniformly to V in Ω.

2.2 Preliminary Results

We follow the ideas of [24], and start by proving two simple yet important lemmas that will be used in the proof. The first establishes that the value increases along trajectories. Then, we prove a convexity result linking the finite-horizon average payoffs and the discounted evaluations along any given trajectory.

Lemma 10.1.

Monotonicity (compare with Proposition 1 in [24]). For all \(X \in \mathcal{T}\) , for all s ≥ 0, we have

$$\begin{array}{rcl} {lim inf}_{t\rightarrow \infty }{V }_{t}(X(0))& \leq & {lim inf}_{t\rightarrow \infty }{V }_{t}(X(s))\end{array}$$
(10.10)
$$\begin{array}{rcl} {lim inf}_{\lambda \rightarrow 0}{W}_{\lambda }(X(0))& \leq & {lim inf}_{\lambda \rightarrow 0}{W}_{\lambda }(X(s)).\end{array}$$
(10.11)

Proof.

Set y := X(s) and x := X(0). For ε > 0, take T > 0 such that \(\frac{s} {s+T} < \epsilon \). Let t ≥ T and take an ε-optimal trajectory for V t (y), i.e. Y ∈ Γ(y) such that γ t (Y ) ≤ V t (y) + ε. Define the concatenation of X and Y at time s as in (10.4); X ∘  s Y is in Γ(x) by assumption. Hence

$$\begin{array}{rcl}{ V }_{t+s}(x) \leq {\gamma }_{t+s}(X {\circ }_{s}Y )& =& \dfrac{s} {t + s}{\gamma }_{s}(X) + \frac{t} {t + s}{\gamma }_{t}(Y ) \\ & \leq & \epsilon + {\gamma }_{t}(Y ) \\ & \leq & 2\epsilon + {V }_{t}(y)\end{array}$$

Since this is true for any tT the result follows.

Similarly, for the discounted case let λ0 > 0 be such that

$${\lambda }_{0}{ \int \nolimits }_{0}^{s}\mathrm{{e}}^{-{\lambda }_{0}r}\,\mathrm{d}r = 1 -\mathrm{ {e}}^{-{\lambda }_{0}s} < \epsilon.$$

Let λ ∈ ]0, λ0] and take YΓ(y) an ε-optimal trajectory for W λ(y). Then:

$$\begin{array}{rcl}{ W}_{\lambda }(x) \leq {\nu }_{\lambda }(X {\circ }_{s}Y )& =& \lambda {\int \nolimits }_{0}^{s}\mathrm{{e}}^{-\lambda r}g(X(r))\,\mathrm{d}r + \lambda {\int \nolimits }_{s}^{+\infty }\mathrm{{e}}^{-\lambda r}g(Y (r - s))\,\mathrm{d}r \\ & \leq & \epsilon +\mathrm{ {e}}^{-\lambda s}{\nu }_{ \lambda }(Y ) \\ & \leq & 2\epsilon + {W}_{\lambda }(y)\end{array}$$

Again, this is true for any λ ∈ ]0, λ0], and the result follows. □ 

Lemma 10.2.

Convexity (compare with the analogous identity in [24]). For any play \(X \in \mathcal{T}\) , for any λ > 0:

$${\nu }_{\lambda }(X) ={ \int \nolimits }_{0}^{+\infty }{\gamma }_{ s}(X){\mu }_{\lambda }(s)\,\mathrm{d}s,$$
(10.12)

where \({\mu }_{\lambda }(s)\,\mathrm{d}s := {\lambda }^{2}s\,\mathrm{{e}}^{-\lambda s}\,\mathrm{d}s\) defines a probability density on [0,+∞[.

Proof.

It is enough to notice that the following relation holds, by integration by parts:

$${\nu }_{\lambda }(X) = \lambda {\int \nolimits }_{0}^{+\infty }\mathrm{{e}}^{-\lambda s}g(X(s))\,\mathrm{d}s = {\lambda }^{2}{ \int \nolimits }_{0}^{+\infty }s\mathrm{{e}}^{-\lambda s}\left(\frac{1} {s}{\int \nolimits }_{0}^{s}g(X(r))\,\mathrm{d}r\right)\,\mathrm{d}s,$$

and that \({\int \nolimits }_{0}^{+\infty }{\lambda }^{2}s\,\mathrm{{e}}^{-\lambda s}\,\mathrm{d}s = 1\). □ 
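
The convexity formula (10.12) is easy to check numerically; the following sketch (ours, for an arbitrary bounded cost profile s ↦ g(X(s)) chosen for illustration) compares the two sides of the identity by discretizing both integrals.

```python
import numpy as np

lam = 0.05
h = lambda s: 0.5 * (1.0 + np.cos(s))     # stands for s -> g(X(s)), with values in [0, 1]

T = 20.0 / lam                            # truncation horizon; the neglected tail is ~ e^{-20}
n = 400_000
ds = T / n
s = (np.arange(n) + 1) * ds               # grid ds, 2*ds, ..., T (avoids division by 0 below)

# Left-hand side of (10.12): nu_lambda(X) = lam * int_0^infty e^{-lam s} h(s) ds.
lhs = lam * np.sum(np.exp(-lam * s) * h(s)) * ds

# Right-hand side: int_0^infty gamma_s(X) mu_lambda(s) ds, with mu_lambda(s) = lam^2 s e^{-lam s}
# and gamma_s(X) = (1/s) int_0^s h(r) dr approximated by a cumulative sum.
gamma_s = np.cumsum(h(s)) * ds / s
rhs = np.sum(lam**2 * s * np.exp(-lam * s) * gamma_s) * ds

print(lhs, rhs)   # the two numbers agree up to discretization error
```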

The probability measure μλ plays an important role in the rest of the paper. Denoting

$$M(\alpha ,\beta ;\lambda ) := {\int \nolimits }_{\alpha }^{\beta }{\mu }_{ \lambda }(s)\,\mathrm{d}s =\mathrm{ {e}}^{-\lambda \alpha }(1 + \lambda \alpha ) -\mathrm{ {e}}^{-\lambda \beta }(1 + \lambda \beta ),$$

we prove here two estimates that will be helpful in the next section.

Lemma 10.3.

The two following results hold (compare with Lemma 3 in [24]):

  1. (i)

    \(\forall t > 0,\exists {\epsilon }_{0}\ such\ that\ \forall \epsilon \leq {\epsilon }_{0},M\left((1 - \epsilon )t,t; \frac{1} {t} \right) \geq \frac{\epsilon } {2\mathrm{e}}.\)

  2. (ii)

    ∀δ > 0,∃ε 0  such that ∀ε ≤ ε 0 , ∀t > 0, \(M\left(\epsilon t,(1 - \epsilon )t; \frac{1} {t\sqrt{\epsilon }}\right) \geq 1 - \delta.\)

Proof.

Notice that in these particular cases, M does not depend on t:

  1. (i)

    \(M(t(1 - \epsilon ),t; \frac{1} {t} ) = (2 - \epsilon )\mathrm{{e}}^{-1+\epsilon } - 2\mathrm{{e}}^{-1} = \frac{1} {\mathrm{e}} (\epsilon + o(\epsilon )) \geq \frac{\epsilon } {2\mathrm{e}}\), for ε small enough.

  2. (ii)

    \(M(t\epsilon ,t(1 - \epsilon ); \frac{1} {t\sqrt{\epsilon }}) = (1 + \sqrt{\epsilon })\mathrm{{e}}^{-\sqrt{\epsilon }} - (1 + 1/\sqrt{\epsilon } - \sqrt{\epsilon })\exp \left(-1/\sqrt{\epsilon } + \sqrt{\epsilon }\right)\). This expression tends to 1 as ε → 0, hence the result. □ 
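
Since M(α,β;λ) has the closed form above, both estimates of Lemma 10.3 can be checked numerically; the sketch below (ours) evaluates the two quantities for a few small values of ε, using the same parametrizations as in the lemma (both are indeed independent of the horizon t).

```python
import numpy as np

def M(alpha, beta, lam):
    """M(alpha, beta; lam) = int_alpha^beta lam^2 s e^{-lam s} ds, in closed form."""
    F = lambda x: np.exp(-lam * x) * (1.0 + lam * x)
    return F(alpha) - F(beta)

t = 7.0   # arbitrary horizon; the printed quantities do not depend on it
for eps in (0.1, 0.01, 0.001):
    est_i = M((1 - eps) * t, t, 1.0 / t)                          # estimate (i): >= eps / (2e)
    est_ii = M(eps * t, (1 - eps) * t, 1.0 / (t * np.sqrt(eps)))  # estimate (ii): tends to 1
    print(eps, est_i, eps / (2 * np.e), est_ii)
```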

3 Proof of Theorem 10.5

3.1 From V t to W λ

Assume (B) : V t ( ⋅) converges to some V ( ⋅) as t goes to infinity, uniformly on Ω. Our proof follows Proposition 4 and Lemmas 8 and 9 in [24].

Proposition 10.1.

For all ε > 0, there exists λ 0 > 0 such that W λ (x) ≥ V (x) − ε for every x ∈ Ω and for all λ ∈]0,λ 0 ].

Proof.

Let T be such that \(\|{V }_{t} - {V \|}_{\infty }\leq \epsilon /2\) for every t ≥ T. Choose λ0 > 0 such that

$${\lambda }^{2}{ \int \nolimits }_{T}^{+\infty }s\mathrm{{e}}^{-\lambda s}\,\mathrm{d}s = 1 - (1 + \lambda T)\mathrm{{e}}^{-\lambda T} \geq 1 -\frac{\epsilon } {4}$$

for every λ ∈ ]0, λ0]. Fix λ ∈ ]0, λ0] and take a play Y ∈ Γ(x) which is ε ∕ 4-optimal for W λ(x). Since γ s (Y ) ≥ 0, the convexity formula (10.12) from Lemma 10.2 gives:

$$\begin{array}{rcl}{ W}_{\lambda }(x) + \frac{\epsilon } {4} \geq {\nu }_{\lambda }(Y )& \geq & 0 + {\lambda }^{2}{ \int \nolimits }_{T}^{+\infty }s\mathrm{{e}}^{-\lambda s}{\gamma }_{ s}(Y )\,\mathrm{d}s \\ & \geq & {\lambda }^{2}{ \int \nolimits }_{T}^{+\infty }s\mathrm{{e}}^{-\lambda s}{V }_{ s}(x)\,\mathrm{d}s \\ & \geq & \left(1 -\frac{\epsilon } {4}\right)\left(V (x) -\frac{\epsilon } {2}\right) \\ & =& V (x) -\frac{\epsilon } {4}V (x) -\frac{\epsilon } {2} + \frac{{\epsilon }^{2}} {8} \\ & \geq & V (x) -\frac{3\epsilon } {4}. \\ & & \\ \end{array}$$

 □ 

Lemma 10.4.

∀ε > 0,∃M such that for all t ≥ M,∀x ∈ Ω, there is a play X ∈ Γ(x) such that γ s (X) ≤ V (x) + ε for all s ∈ [εt,(1 − ε)t].

Proof.

By (B) there exists M such that \(\|{V }_{r} - V \| \leq {\epsilon }^{2}/3\) for all r ≥ εM. Given t ≥ M and x ∈ Ω, let X ∈ Γ(x) be a play (from x) such that γ t (X) ≤ V t (x) + \({\epsilon }^{2}/3\). For any s ≤ (1 − ε)t, we have t − s ≥ εt ≥ εM, so this uniform estimate together with Lemma 10.1 (Monotonicity) implies that

$${V }_{t-s}(X(s)) \geq V (X(s)) -\frac{{\epsilon }^{2}} {3} \geq V (x) -\frac{{\epsilon }^{2}} {3}.$$
(10.13)

Since V (x) + ε2 ∕ 3 ≥ V t (x), we also have:

$$\begin{array}{rcl} t\left(V (x) + \frac{2{\epsilon }^{2}} {3} \right)& \geq & t\left({V }_{t}(x) + \frac{{\epsilon }^{2}} {3} \right) \\ & \geq & t{\gamma }_{t}(X) ={ \int \nolimits }_{0}^{s}g(X(r))\,\mathrm{d}r +{ \int \nolimits }_{s}^{t}g(X(r))\,\mathrm{d}r \\ & \geq & s{\gamma }_{s}(X) + (t - s){V }_{t-s}(X(s)) \\ & \geq & s{\gamma }_{s}(X) + (t - s)\left(V (x) -\frac{{\epsilon }^{2}} {3} \right),\text{ by (10.13)}\end{array}$$

Isolating γ s (X) we get:

$$\begin{array}{rcl}{ \gamma }_{s}(X)& \leq & V (x) + \frac{t{\epsilon }^{2}} {s} \\ & \leq & V (x) + \epsilon ,\quad \mathrm{for}\ s/\epsilon \geq t, \\ \end{array}$$

and we have proved the result for all s ∈ [εt, (1 − ε)t]. □ 

Proposition 10.2.

∀δ > 0,∃λ 0 such that ∀x ∈ Ω, for all λ ∈]0,λ 0 ], we have W λ (x) ≤ V (x) + δ.

Proof.

By Lemma 10.3(ii), one can choose ε small enough such that

$$M\left(\epsilon t,(1 - \epsilon )t; \frac{1} {t\sqrt{\epsilon }}\right) \geq 1 -\frac{\delta } {2},$$

for any t. In particular, we can take ε ≤ δ ∕ 2. Applying Lemma 10.4 with this ε, we get a t 0 such that for any t ≥ t 0 (and thus for \(\lambda (t) := \frac{1} {t\sqrt{\epsilon }} \leq \frac{1} {{t}_{0}\sqrt{\epsilon }}\)) and any x ∈ Ω, there exists a play X ∈ Γ(x) with γ s (X) ≤ V (x) + ε ≤ V (x) + δ ∕ 2 for all s ∈ [εt,(1 − ε)t], and hence

$$\begin{array}{rcl}{ \nu }_{\lambda (t)}(X)& \leq & \frac{\delta } {2} +{ \lambda (t)}^{2}{ \int \nolimits }_{\epsilon t}^{(1-\epsilon )t}s\mathrm{{e}}^{-\lambda (t)s}{\gamma }_{ s}(X)\,\mathrm{d}s \\ & \leq & \frac{\delta } {2} + V (x) + \frac{\delta } {2}.\end{array}$$

Since every λ ∈]0,λ 0 ], with \({\lambda }_{0} := \frac{1} {{t}_{0}\sqrt{\epsilon }}\), is of the form λ(t) for some t ≥ t 0 , this gives W λ (x) ≤ V (x) + δ for all such λ.

 □ 

Propositions 10.1 and 10.2 establish the first part of Theorem 10.5: (B) ⇒ (A).

3.2 From W λ to V t

Now assume (A): W λ( ⋅) converges to some W( ⋅) as λ goes to 0, uniformly on Ω. Our proof follows Proposition 2 and Lemmas 6 and 7 in [24]. We start with a technical lemma:

Lemma 10.5.

Let ε > 0. For all x ∈ Ω and t > 0, and for any trajectory Y ∈ Γ(x) which is ε∕2-optimal for the problem with horizon t, there is a time L ∈ [0,t(1 − ε∕2)] such that, for all T ∈]0,t − L]:

$$\frac{1} {T}{\int \nolimits }_{L}^{L+T}g(Y (s))\,\mathrm{d}s \leq {V }_{ t}(x) + \epsilon.$$

Proof.

Fix YΓ(x) some ε ∕ 2-optimal play for V t (x). The function s↦γ s (Y ) is continuous on ]0, t] and satisfies γ t (Y ) ≤ V t (x) + ε ∕ 2. The bound on g implies that γ r (Y ) ≤ V t (x) + ε for all r ∈ [t(1 − ε ∕ 2), t].

Consider now the set {s ∈ ]0, t]  |  γ s (Y ) > V t (x) + ε}. If this set is empty, then take L = 0 and observe that for any r ∈ ]0, t],

$$\frac{1} {r}{\int \nolimits }_{0}^{r}g(Y (s))\,\mathrm{d}s \leq {V }_{ t}(x) + \epsilon.$$

Otherwise, let L be the supremum of this set. Notice that L ≤ t(1 − ε ∕ 2) and that, by continuity, γ L (Y ) = V t (x) + ε. Now, for any T ∈]0,t − L],

$$\begin{array}{rcl}{ V }_{t}(x) + \epsilon & \geq & {\gamma }_{L+T}(Y ) \\ & =& \frac{L} {L + T}{\gamma }_{L}(Y ) + \frac{T} {L + T}\left( \frac{1} {T}{\int \nolimits }_{L}^{L+T}g(Y (s))\,\mathrm{d}s\right) \\ & =& \frac{L} {L + T}\left({V }_{t}(x) + \epsilon \right) + \frac{T} {L + T}\left( \frac{1} {T}{\int \nolimits }_{L}^{L+T}g(Y (s))\,\mathrm{d}s\right) \\ \end{array}$$

and the result follows. □ 

Proposition 10.3.

∀ε > 0,∃T such that for all t ≥ T we have V t (x) ≥ W(x) − ε, for all x ∈ Ω.

Proof.

Proceed by contradiction and suppose that there exists ε > 0 such that for every T, there exist t 0 ≥ T and a state x 0 ∈ Ω with \({V }_{{t}_{0}}({x}_{0}) < W({x}_{0}) - \epsilon \). Let λ be such that \(\|{W}_{\lambda } - W\| \leq \epsilon /8\), and then T such that

$${\lambda }^{2}{ \int \nolimits }_{T\epsilon /4}^{+\infty }s\mathrm{{e}}^{-\lambda s}\,\mathrm{d}s < \frac{\epsilon } {8}.$$

Fix such a t 0 ≥ T and such a state x 0 ∈ Ω.

Applying Lemma 10.5 with ε ∕ 2 (to an ε ∕ 4-optimal play), we get a play Y ∈ Γ(x 0) and a time L ∈ [0, t 0 (1 − ε ∕ 4)] such that, ∀s ∈]0, t 0 − L] (and, in particular, ∀s ∈]0, t 0 ε ∕ 4]),

$$\frac{1} {s}{\int \nolimits }_{L}^{L+s}g\left(Y (r)\right)\,\mathrm{d}r \leq {V }_{{ t}_{0}}({x}_{0}) + \frac{\epsilon } {2} < W({x}_{0}) -\frac{\epsilon } {2}.$$

Thus,

$$\begin{array}{rcl} W(Y (L)) -\frac{\epsilon } {8}& \leq & {W}_{\lambda }(Y (L)) \\ & \leq & \lambda {\int \nolimits }_{0}^{+\infty }\mathrm{{e}}^{-\lambda s}g(Y (L + s))\,\mathrm{d}s \\ & \leq & {\lambda }^{2}{ \int \nolimits }_{0}^{{t}_{0}\epsilon /4}s\mathrm{{e}}^{-\lambda s}\left(\frac{1} {s}{\int \nolimits }_{L}^{L+s}g\left(Y (r)\right)\,\mathrm{d}r\right)\,\mathrm{d}s +\ \frac{\epsilon } {8} \\ & \leq & W({x}_{0}) -\frac{\epsilon } {2} + \frac{\epsilon } {8} \\ & =& W({x}_{0}) -\frac{3\epsilon } {8} \end{array}$$

This gives us W(Y (L)) ≤ W(x 0) − ε ∕ 4, contradicting Lemma 10.1 (Monotonicity). □ 

Proposition 10.4.

∀ε > 0,∃T such that for all t ≥ T we have V t (x) ≤ W(x) + ε, for all x ∈ Ω.

Proof.

Otherwise, ∃ε > 0 such that ∀T,   ∃tT and xΩ with V t (x) > W(x) + ε. For any XΓ(x) consider the (continuous in s) payoff function \({\gamma }_{s}(X) = \frac{1} {s}{ \int \nolimits }_{0}^{s}g(X(r))\,\mathrm{d}r\). Of course, γ t (X) ≥ V t (x) > W(x) + ε. Furthermore, because of the bound on g,

$${\gamma }_{r}(X) \geq W(x) + \epsilon /2,\ \forall r \in \left[t\left(1 - \epsilon /2\right),t\right].$$

By Lemma 10.3(i), we can take ε small enough so that, for all t,

$$M\left(t(1 - \epsilon /2),t; \frac{1} {t} \right) \geq \frac{\epsilon } {4\mathrm{e}}$$

holds. We set \(\delta := \frac{\epsilon } {4\mathrm{e}}\). By Proposition 10.3, there is a K such that \({V }_{t}(x) \geq W(x) -\frac{\delta \epsilon } {8}\) for all t ≥ K. Fix K and consider

$$M(0,K;1/t) = 1 -\mathrm{ {e}}^{-K/t}(1 + K/t)$$

as a function of t. Clearly, it tends to 0 as t tends to infinity, so let t be such that this quantity is smaller than \(\frac{\delta \epsilon } {16}\). Also, let t be big enough so that \(\parallel {W}_{1/t} - W\parallel < \frac{\delta \epsilon } {5}\), which is a consequence of assumption (A).

We now set \(\tilde{\lambda } := 1/t\) and consider the \(\tilde{\lambda }\)-payoff of some play X ∈ Γ(x). We split [0, +∞[ into three parts: \(\mathcal{K} = [0,K],\mathcal{R} = [t(1 - \epsilon /2),t]\), and \({(\mathcal{K}\cup \mathcal{R})}^{c}\). The three parts are disjoint for t large enough, so the convexity formula (10.12) gives

$${\nu }_{\tilde{\lambda }}(X) = \left({\int \nolimits }_{\mathcal{K}}{\gamma }_{s}(X){\mu }_{\tilde{\lambda }}(\mathrm{d}s) +{ \int \nolimits }_{\mathcal{R}}{\gamma }_{s}(X){\mu }_{\tilde{\lambda }}(\mathrm{d}s) +{ \int \nolimits }_{{(\mathcal{K}\cup \mathcal{R})}^{c}}{\gamma }_{s}(X){\mu }_{\tilde{\lambda }}(\mathrm{d}s)\right)$$

where μλ(s) ds = λ2 se− λs ds. Recall that

$$\begin{array}{rcl} {\gamma }_{s}{(X)}_{\vert \mathcal{K}}& \geq & 0 \\ {\gamma }_{s}{(X)}_{\vert {(\mathcal{K}\cup \mathcal{R})}^{c}}& \geq & W(x) -\frac{\delta \epsilon } {8} \\ {\gamma }_{s}{(X)}_{\vert \mathcal{R}}& \geq & W(x) + \frac{\epsilon } {2}\end{array}$$

It is thus straightforward that

$$\begin{array}{rcl}{ \nu }_{\tilde{\lambda }}(X)& \geq & 0 + \delta \times \left(W(x) + \frac{\epsilon } {2}\right) + \left(1 - \delta -\frac{\delta \epsilon } {16}\right) \times \left(W(x) -\frac{\delta \epsilon } {8} \right) \\ & \geq & W(x) + \delta \epsilon \left(\frac{1} {2} - \frac{1} {16} -\frac{1} {8} -\frac{\delta } {8} + \frac{\delta \epsilon } {64}\right) \\ & \geq & W(x) + \frac{\delta \epsilon } {4} \end{array}$$

This is true for any play, so its infimum also satisfies \({W}_{\tilde{\lambda }}(x) \geq W(x) + \frac{\delta \epsilon } {4}\), which is a contradiction, for we assumed that \({W}_{\tilde{\lambda }}(x) < W(x) + \frac{\delta \epsilon } {5}\). □ 

Propositions 10.3 and 10.4 establish the second half of Theorem 10.5: (A) ⇒ (B).

4 A Counterexample for Pointwise Convergence

In this section we give an example of an optimal control problem in which both V t ( ⋅) and W λ( ⋅) converge pointwise on the state space, but to two different limits. As implied by Theorem 10.5, the convergence is not uniform on the state space.

Lehrer and Sorin were the first to construct such an example [24], in the discrete-time framework. We consider here one of its adaptations to continuous time, which was studied as Example 5 in [28], where the notations are the same as in Sect. 10.1:

  • The state space is \(\Omega = {\mathbb{R}}_{+}^{2}\).

  • The payoff function is given by g(x, y) = 0 if x ∈ [1, 2], 1 otherwise.

  • The set of controls is U = [0, 1].

  • The dynamic is given by f(x, y, u) = (y, u) (thus Ω is forward invariant).

An interpretation is that the couple (x(t), y(t)) represents the position and the speed of a moving object on an axis, whose acceleration u(t) is controlled. Observe that since U = [0, 1], the speed y(t) is non-decreasing along any play. We claim that for any \(({x}_{0},{y}_{0}) \in {\mathbb{R}}_{+}^{2}\), V t (x 0, y 0) (resp. W λ(x 0, y 0)) converges to V (x 0, y 0) as t goes to infinity (respectively to W(x 0, y 0) as λ tends to 0), where:

$$\begin{array}{rcl} V ({x}_{0},{y}_{0})& =& \left\{\begin{array}{@{}l@{\quad }l@{}} 1 \quad &\text{ if }{y}_{0} > 0\text{ or }{x}_{0} > 2 \\ 0 \quad &\text{ if }{y}_{0} = 0\text{ and }1 \leq {x}_{0} \leq 2 \\ \frac{1-{x}_{0}} {2-{x}_{0}} \quad &\text{ if }{y}_{0} = 0\text{ and }{x}_{0} < 1 \end{array} \right. \\ W({x}_{0},{y}_{0})& =& \left\{\begin{array}{@{}l@{\quad }l@{}} 1 \quad &\text{ if }{y}_{0} > 0\text{ or }{x}_{0} > 2 \\ 0 \quad &\text{ if }{y}_{0} = 0\text{ and }1 \leq {x}_{0} \leq 2 \\ 1 -\frac{{(1-{x}_{0})}^{1-{x}_{0}}} {{(2-{x}_{0})}^{2-{x}_{0}}} \quad &\text{ if }{y}_{0} = 0\text{ and }{x}_{0} < 1. \end{array} \right. \end{array}$$

Here we only prove that \(V (0,0) = \frac{1} {2}\) and \(W(0,0) = \frac{3} {4}\); the proof for y 0 = 0 and 0 < x 0 < 1 is similar and the other cases are easy.

First of all we prove that for any t or λ and any admissible trajectory (that is, any function X(t) = (x(t), y(t)) compatible with a control u(t)) starting from (0, 0), \({\gamma }_{t}(X) \geq \frac{1} {2}\) and \({\nu }_{\lambda }(X) \geq \frac{3} {4}\). This is clear if x(t) is identically 0, so assume this is not the case. Then y(s) > 0 for some s, and since the speed is non-decreasing, x(t) tends to infinity; we can thus define t 1 and t 2 as the times at which x(t 1) = 1 and x(t 2) = 2, respectively. Moreover, since the second unit of distance is covered at speeds at least as large as the first, we have t 2 − t 1 ≤ t 1 , that is t 2 ≤ 2t 1 . Then,

$$\begin{array}{rcl}{ \gamma }_{t}(X)& =& \frac{1} {t}\left({\int \nolimits }_{0}^{\min (t,{t}_{1})}\,\mathrm{d}s +{ \int \nolimits }_{\min (t,{t}_{2})}^{t}ds\right) \\ & =& 1 +\min \left(1, \frac{{t}_{1}} {t} \right) -\min \left(1, \frac{{t}_{2}} {t} \right) \\ & \geq & 1 +\min \left(1, \frac{{t}_{2}} {2t}\right) -\min \left(1, \frac{{t}_{2}} {t} \right) \\ & \geq & \frac{1} {2} \\ \end{array}$$

and

$$\begin{array}{rcl}{ \nu }_{\lambda }(X)& =& {\int \nolimits }_{0}^{{t}_{1} }\lambda \mathrm{{e}}^{-\lambda s}\,\mathrm{d}s +{ \int \nolimits }_{{t}_{2}}^{+\infty }\lambda \mathrm{{e}}^{-\lambda s}\,\mathrm{d}s \\ & =& 1 -\exp \left(-\lambda {t}_{1}\right) +\exp \left(-\lambda {t}_{2}\right) \\ & \geq & 1 -\exp \left(-\lambda {t}_{1}\right) +\exp \left(-2\lambda {t}_{1}\right) \\ & {\geq} & \min _{a>0}\{1 - a + {a}^{2}\} \\ & =& \frac{3} {4}\end{array}$$

On the other hand, one can prove [28] that \(\limsup_{t\rightarrow +\infty }{V}_{t}(0,0) \leq 1/2\): in the problem with horizon t, consider the control “u(s) = 1 until s = 2 ∕ t and then 0”. Similarly one proves that \(\limsup_{\lambda \rightarrow 0}{W}_{\lambda }(0,0) \leq 3/4\): in the λ-discounted problem, consider the control “u(s) = 1 until s = λ ∕ ln 2 and then 0”.
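
These bounds can also be checked numerically; the following sketch (ours) integrates the bang-bang controls just described, starting from (0, 0), and evaluates γ t and ν λ directly. The averages approach 1 ∕ 2 and the discounted payoffs approach 3 ∕ 4, in agreement with V (0,0) = 1 ∕ 2 and W(0,0) = 3 ∕ 4.

```python
import numpy as np

def x_of_s(s, a):
    """Position at time s under the control u = 1 on [0, a] and u = 0 afterwards, from (0, 0)."""
    return np.where(s <= a, s**2 / 2.0, a**2 / 2.0 + a * (s - a))

g = lambda x: np.where((x >= 1.0) & (x <= 2.0), 0.0, 1.0)   # cost: 0 on [1, 2], 1 elsewhere

def gamma_t(t, n=1_000_000):
    a = 2.0 / t                                   # switching time used in the text
    s = np.linspace(0.0, t, n)
    return float(np.mean(g(x_of_s(s, a))))        # (1/t) int_0^t g(x(s)) ds

def nu_lam(lam, n=1_000_000):
    a = lam / np.log(2.0)                         # switching time used in the text
    s = np.linspace(0.0, 40.0 / lam, n)           # truncation: the e^{-40} tail is negligible
    ds = s[1] - s[0]
    return float(np.sum(lam * np.exp(-lam * s) * g(x_of_s(s, a))) * ds)

print(gamma_t(1000.0))   # ~ 0.5,  consistent with the limit 1/2 of V_t(0,0)
print(nu_lam(0.01))      # ~ 0.75, consistent with the limit 3/4 of W_lambda(0,0)
```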

So the functions V t and W λ converge pointwise on Ω, but their limits V and W are different, since we have just shown V (0, 0)≠W(0, 0). One can verify that neither convergence is uniform on Ω by considering V t (1, ε) and W λ(1, ε) for small positive ε.

Remark 10.1.

One may object that this example is not very regular since the payoff g is not continuous and the state space is not compact. However a related, smoother example can easily be constructed:

  1. 1.

    The set of controls is still [0, 1].

  2. 2.

    The continuous cost g(x) is equal to 1 outside the segment [0.9, 2.1], to 0 on [1, 2], and linear on the two remaining intervals.

  3. 3.

    The compact state space is \(\Omega =\{ (x,y)\vert 0 \leq y \leq \sqrt{2x} \leq 2\sqrt{2}\}\).

  4. 4.

    The dynamic is the same as in the original example for x ∈ [0, 3], and f(x, y, u) = ((4 − x)y, (4 − x)u) for 3 ≤ x ≤ 4. The inequality y(t)y′(t) ≤ x′(t) is thus satisfied on any trajectory, which implies that Ω is forward invariant under this dynamic.

With these changes the values V t ( ⋅) and W λ( ⋅) still both converge pointwise on Ω to some \(\widetilde{V }(\cdot )\) and \(\widetilde{W}(\cdot )\) respectively, and \(\widetilde{V }(0,0)\neq \widetilde{W}(0,0)\).

5 Possible Extensions

  • We considered the finite horizon problem and the discounted one, but it should be possible to establish similar Tauberian theorems for other, more complex, evaluations of the payoff. This was settled in the discrete time case in [26].

  • It would be very fruitful to establish necessary or sufficient conditions for uniform convergence to hold. In this direction we mention [28] in which sufficient conditions for the stronger notion of Uniform Value (meaning that there are controls that are nearly optimal no matter the horizon, provided it is large enough) are given in a general setting.

  • In the discrete case an example is constructed in [26] in which there is no uniform value despite uniform convergence of the families V t and W λ. It would be of interest to construct such an example in continuous time, in particular in the framework of Sect. 10.1.

  • It would be very interesting to study Tauberian theorems for dynamic systems that are controlled by two conflicting controllers. In the framework of differential games this has been done recently (Theorem 2.1 in [1]): an extension of Theorem 10.4 has been established for two-player games in which the limit of V t or W λ is assumed to be independent of the starting point. The analogous result in the discrete-time framework is a consequence of Theorems 1.1 and 3.5 in [21]. The existence of Tauberian theorems in the general setup of two-person zero-sum games with no ergodicity condition remains open in both the discrete and the continuous settings.