Abstract
In the last decades, control problems with infinite horizons and discount factors have become increasingly central not only for economics but also for applications in artificial intelligence and machine learning. The strong links between reinforcement learning and control theory have led to major efforts toward the development of algorithms to learn how to solve constrained control problems. In particular, discounting plays a role in addressing the challenges that come with models that have unbounded disturbances. Although algorithms have been extensively explored, few results take into account time-dependent state constraints, which are imposed in most real-world control applications. For this purpose, here we investigate feasibility and sufficient conditions for Lipschitz regularity of the value function for a class of discounted infinite horizon optimal control problems subject to time-dependent constraints. We focus on problems with data that allow nonautonomous dynamics, and Lagrangian and state constraints that can be unbounded with possibly nonsmooth boundaries.
1 Introduction
Infinite time horizon models arising in mathematical economics and engineering typically involve control systems with restrictions on both controls and states. Models of optimal allocation of economic resources were, in the late 1950s, among the key incentives for the creation of the mathematical theory of optimal control. Practical control applications often require solving constrained optimal control problems, which are more challenging to deal with than unconstrained ones.
Over the last few decades, an increasingly central role has been given to infinite horizon control problems with discount factors, not only for applications in finance but also for applications to artificial intelligence and machine learning. Strong connections between reinforcement learning and control theory have prompted a major effort toward developing algorithms to learn optimal solutions. Discounting plays a role in addressing the challenges that come with models where unbounded disturbances are present. The discounted configuration is common in many stochastic control problems [7, 23, 24, 33], reinforcement learning [8], and financial engineering [19, 30]. Discount factors ensure the feasibility of constrained optimal control problems with potentially unbounded perturbations. In dynamic programming, discounting is often used to ensure well-posedness of problems with infinite horizons and possibly unbounded costs [6, 9]. Moreover, with an appropriate value of the discount factor, stability is guaranteed [31].
Much of the present literature focuses on managing constraints in control problems. In general, state constraints imply non-convex feasible sets. There are therefore several ways to provide amenable approximations in deterministic and probabilistic frameworks, e.g., using available information from probability distributions [32], deterministic approximation methods jointly with confidence sets [25], deterministic and stochastic tubes [5, 12], attainable sets [22], or conservative approaches with probabilistic inequalities [17, 21] and random methods [11, 26]. Although deterministic algorithms have been widely investigated to solve optimal regulation problems, few results consider the solution of optimal synthesis in the presence of time-dependent state constraints, which are needed for most real-world control applications (cfr. Sect. 2). A fundamental point in constrained cases is how to ensure desirable properties, such as existence of viable solutions and stability, including regularity of the value function. This is particularly evident in reinforcement learning, where representing value functions requires a function approximator with a limited set of parameters. Several researchers have emphasized that integrating reinforcement learning algorithms subject to state constraints with general approximation systems, such as neural networks, fuzzy sets, or polynomial approximators, can lead to unstable or divergent outcomes, even for straightforward problems (cfr. [2, 10, 20]).
In this setting, a key role is played by the dynamic programming principle and the Hamilton–Jacobi–Bellman (HJB) equation associated with the control problem [6, 34]. The value function, when differentiable, solves the HJB equation in the classical sense. However, it is well known that this notion of solution is quite unsatisfactory for HJB equations arising in control theory and the calculus of variations (we refer the interested reader to the pioneering works [13, 14] and [3] for further discussions). Indeed, the value function loses differentiability whenever there are multiple optimal solutions at the same initial condition or additional state constraints are present. The lack of classical (smooth) solutions to HJB equations for regular data led to the need for a new notion of solution of this equation (weak, or viscosity, solution) in the class of Lipschitz continuous functions. Such regularity cannot be taken for granted, especially when time-dependent state constraints are imposed on the control problem [4, 6].
In this paper, we focus on analyzing the Lipschitz regularity of the value function of infinite horizon control problems, specifically those with discount factors and time-dependent state constraints of a functional type. Our approach is designed to provide a comprehensive and rigorous analysis of this problem. We carefully consider the impact of time-dependent state constraints and discount factors, as these can significantly alter the optimal control strategy and lead to unexpected system behavior. To ensure feasibility and obtain neighboring estimates on the set of feasible trajectories, sufficient conditions on the constraint set are imposed by means of inward pointing conditions (cfr. Sects. 3 and 4). More specifically, by employing recent viability results that were investigated in [3], we establish Lipschitz regularity of the value function and viability of the system. We also demonstrate that the value function vanishes at infinity on the feasible set for all sufficiently large discount factors. This result is significant, as it implies that the value function is bounded on this set, which has important implications for the stability of the system. Overall, our analysis sheds light on the behavior and regularity of weak (or viscosity) solutions of HJB equations.
The outline of the present paper is as follows. In Sect. 2 we describe the general formulation of the optimal control problem addressed here, with notation and background on nonsmooth analysis. Section 3 is devoted to a controllability condition on the constraint set. In Sect. 4 we give viability and neighboring estimate results for feasible trajectories on the infinite horizon. Finally, in Sect. 5 we show the desired Lipschitz continuity of the value function.
2 Problem formulation and background
In this paper, we address the following infinite horizon control problem subject to functional constraints
We assume:
-
the controls u take values in \(\mathbb {R}^m\) and are Lebesgue measurable;
-
U is a measurable set-valued map with nonempty closed images in \(\mathbb {R}^m\);
-
\(h_i\)’s are real-valued functions, measurable in time and \(\Gamma ^{1,\theta }\)-regular in space, uniformly in time;
where \(\Gamma ^{1,\theta }\) stands for the class of continuously differentiable functions with \(\theta \)-Hölder continuous and bounded differential, i.e., for \(\theta \in ]0,1[\)
The optimal control problem described above is applicable to several scenarios within the fields of economics and engineering sciences (cfr. [15, 18, 27]). In these fields of application, functional constraints often appear as functions affine in space with measurable time-dependent terms, specifically, \(h_i(s,x)=A(s)x_k+B(s)\), which falls under the framework of the proposed model. This family of functions extends to include \(h_i(s,x)=A(s)\psi _i(x)+B(s)\), with \(c_i\in {\mathbb {R}}^n\) a parameter and \(\psi _i\in \Gamma ^{1,\theta }\). It is worth noting that the autonomous case with \(\theta =1\) was previously studied in [6].
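To make the affine family concrete, the following minimal sketch evaluates two constraints of the form \(h_i(s,x)=A(s)x_k+B(s)\) and tests membership in the constraint set. The coefficient functions, the names `h1`, `h2`, `in_omega`, and the sign convention \(\Omega (s)=\{x:\,h_i(s,x)\leqslant 0\ \text{for all } i\}\) are assumptions made for illustration only; they are not taken from the paper.

```python
import math

# Hypothetical data (not from the paper): two affine-in-space constraints
# h_i(s, x) = A_i(s) * x[k_i] + B_i(s) on R^2 with time-dependent coefficients.
def h1(s, x):
    # Encodes x[0] <= 2 + sin(s): a moving upper bound on the first coordinate.
    return 1.0 * x[0] - (2.0 + math.sin(s))

def h2(s, x):
    # Encodes x[1] >= -1 for s < 1 and x[1] >= -2 afterwards.
    # B is only piecewise constant in time, hence measurable but not continuous.
    lower = -1.0 if s < 1.0 else -2.0
    return -1.0 * x[1] + lower

CONSTRAINTS = (h1, h2)

def in_omega(s, x):
    """Assumed convention: Omega(s) = {x : h_i(s, x) <= 0 for every i}."""
    return all(h(s, x) <= 0.0 for h in CONSTRAINTS)

print(in_omega(0.0, (1.5, 0.0)))   # inside: 1.5 <= 2 and 0 >= -1
print(in_omega(0.0, (2.5, 0.0)))   # outside: 2.5 > 2 + sin(0)
print(in_omega(2.0, (0.0, -1.5)))  # inside once the lower bound has moved to -2
```

The same point can be feasible at one time and infeasible at another, which is the feature that distinguishes time-dependent constraints from the autonomous case.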
2.1 Preliminaries and notations
Let \(B(x,\delta )\) stand for the closed ball in \({\mathbb {R}}^n\) with radius \(\delta >0\) centered at \(x\in {\mathbb {R}}^n\) and set \({\mathbb {B}}=B(0,1)\), \(S^{n-1}=\partial {\mathbb {B}}\). Denote by \(|\,\cdot \,|\) and \(\langle \cdot , \cdot \rangle \) the Euclidean norm and scalar product, respectively. Let \(C\subset {\mathbb {R}}^n\) be a nonempty set. We denote the interior of C by \(\textrm{int}\,C\) and the convex hull of C by \(\textrm{co}\,C\). The distance from \(x\in {\mathbb {R}}^n\) to C is defined by \(d_C(x):=\inf \{|x-y|:\,y\in C\}\). If C is closed, we let \(\Pi _C(x)\) be the set of all projections of \(x\in {\mathbb {R}}^n\) onto C.
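As an illustration of \(d_C\) and \(\Pi _C\), the sketch below computes both for a finite set C of points in the plane. The function name `dist_and_projections` and the tolerance `tol` are hypothetical choices; a finite set is used only because the infimum is then a minimum that can be evaluated directly.

```python
import math

def dist_and_projections(x, C, tol=1e-12):
    """Return d_C(x) and the list of points of the finite set C nearest to x."""
    distances = [math.dist(x, y) for y in C]
    d = min(distances)
    # Pi_C(x) may contain several points when C is not convex.
    nearest = [y for y, dy in zip(C, distances) if dy <= d + tol]
    return d, nearest

C = [(1.0, 0.0), (-1.0, 0.0), (0.0, 3.0)]
print(dist_and_projections((0.0, 0.0), C))  # distance 1, two projections
print(dist_and_projections((2.0, 0.0), C))  # distance 1, unique projection (1, 0)
```

The first call shows why \(\Pi _C(x)\) is defined as a set: for nonconvex C the nearest point need not be unique.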
For \(p\in {\mathbb {R}}^+\cup \{\infty \}\) and a Lebesgue measurable set \(I\subset {\mathbb {R}}\) we denote by \(L^p(I;{\mathbb {R}}^n)\) the space of \({\mathbb {R}}^n\)-valued Lebesgue measurable functions on I endowed with the norm \(\Vert \cdot \Vert _{p,I}\). We say that \(f\in L^p_{\textrm{loc}}(I;{\mathbb {R}}^n)\) if \(f\in L^p(J;{\mathbb {R}}^n)\) for any compact subset \(J\subset I\). Let I be an open interval in \({\mathbb {R}}\). For any \(f\in L^1_{\textrm{loc}}({\overline{I}};{\mathbb {R}}^n)\) we define \(\theta _f:[0,\mu (I))\rightarrow \mathbb {R}^+ \) by
We denote by \({\mathcal {L}}_{\textrm{loc}}\) the set of all functions \(f\in L^1_{\textrm{loc}}(\mathbb {R}^+;{\mathbb {R}}^+)\) such that \(\lim _{\sigma \rightarrow 0}\theta _{ f}(\sigma )=0\). Notice that \(L^{\infty }(\mathbb {R}^+;{\mathbb {R}}^+)\subset {\mathcal {L}}_{\textrm{loc}}\) and, for any \(f\in {\mathcal {L}}_{\textrm{loc}}\), \(\theta _{ f}(\sigma )<\infty \) for every \(\sigma >0\).
Let \(\Omega :\mathbb {R}\rightsquigarrow \mathbb {R}^n\), \(F:{\overline{I}}\times {\mathbb {R}}^n\rightsquigarrow {\mathbb {R}}^n\), and \(G:{\mathbb {R}}^m\rightsquigarrow {\mathbb {R}}^n\) be set-valued maps with nonempty values. G is said to be L-Lipschitz continuous, for some \(L\geqslant 0\), if \(G(x)\subset G({\tilde{x}})+ L|x-{\tilde{x}}|{\mathbb {B}}\) for all \(x,\,{\tilde{x}}\in {\mathbb {R}}^m\). We say that F has a sub-linear growth (in x) if, for some \(c\in L^1_{\textrm{loc}}({\overline{I}};{\mathbb {R}}^+)\),
Definition 2.1
Let \(\gamma \in L^1_{\textrm{loc}}({\overline{I}};{\mathbb {R}}^+)\). We say that F is \(\gamma \)-left absolutely continuous, uniformly wrt \(\Omega \), if
If F does not depend explicitly on x, we simply say that F is \(\gamma \)-left absolutely continuous.
If \({\overline{I}}=[S,T]\), then we have the following characterization of uniform absolute continuity from the left: F is \(\gamma \)-left absolutely continuous, uniformly wrt \(\Omega \), for some \(\gamma \in L^1_{\textrm{loc}}({\overline{I}};{\mathbb {R}}^+)\), if and only if for every \(\varepsilon >0\) there exists \(\delta >0\) such that for any finite partition \(S\leqslant t_1<\tau _1\leqslant t_2<\tau _2\leqslant ...\leqslant t_m<\tau _m\leqslant T\) of [S, T],
where the excess of A given B is defined by
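For readers who want a computational handle on the excess, the sketch below assumes the usual definition \(exc(A|B):=\sup \{d_B(a):\,a\in A\}\) and evaluates it for finite nonempty sets, where the supremum and infimum become a maximum and a minimum. The function name `excess` and the sample sets are illustrative assumptions.

```python
import math

def excess(A, B):
    """exc(A|B) for finite nonempty sets: largest distance from a point of A to B."""
    return max(min(math.dist(a, b) for b in B) for a in A)

A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]

print(excess(A, B))  # 0.0, since A is contained in B
print(excess(B, A))  # 3.0, the point (4, 0) lies at distance 3 from A

# The Hausdorff distance is the symmetrization of the two excesses.
hausdorff = max(excess(A, B), excess(B, A))
print(hausdorff)
```

The excess is not symmetric, which is why one-sided ("left") continuity notions built on it can be weaker than continuity in the Hausdorff distance.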
3 Controllability
In what follows, we take the notation
Consider the following condition
Assumptions 3.1
Let \(\theta \in ]0,1[\) and \(h_i:\mathbb {R}^+ \times {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) be m real-valued functions satisfying for any \(i=1,...,m\):
-
\(h_i(.,x)\) is measurable for any x.
-
\(h_i(t,.)\) is \(\Gamma ^{1,\theta }\) regular, uniformly wrt t.
The proposition below states a geometric result for an Inward Pointing Field Condition (also known as Inward Pointing Condition) on infinite horizon wrt the constraints \(\Omega (t)\) and a vector field F(t, x).
Proposition 3.1
[Inward Pointing Fields Condition] Consider the Assumptions 3.1. Let \(F:\mathbb {R}\times \mathbb {R}^n \rightsquigarrow \mathbb {R}^n\) be a set-valued map with nonempty closed values satisfying
Assume that
where \(I(z)=\{i\in I:\, z\in \partial \Omega _i(t)\}\) and \(I:=\{1,...,m\}\). Then
Proof
Let us set \(J(x):=\bigcup _{z\in B(x,\delta )} I(z)\) for all \(x\in \partial \Omega (s)\) and \(s\ge 0\). Fix \(t\in \mathbb {R}^+,\,x\in \partial \Omega (t)\), and \(v\in \textrm{co}\,F(t,x)\) satisfying \(\langle \nabla h_i(t,x), v \rangle \leqslant -r\) for all \(i\in J(x)\). Pick
We proceed by steps.
(i): We claim that there exists \(\eta '>0\), not depending on (t, x), such that for all \(y\in B(x,\eta ')\) we can find \(w\in \textrm{co}\,F(t,y)\), with \(|w-v|\leqslant r/4\,L\), satisfying for all \(i\in J(x)\),
Indeed, for all \(i\in J(x)\) and \(y \in B(x,\root \theta \of {r/4\,kM})\) we have
and for all \(w\in {\mathbb {R}}^n\) such that \(|w-v|\leqslant r/4L\)
Since \(F(t,\cdot )\) is \(\varphi \)-Lipschitz continuous, there exists \(w\in \textrm{co}\,F(t,y)\) such that \(|w-v|\leqslant r/4L\) whenever \(|y-x|\leqslant r/4\varphi L\). So the claim follows with \(\eta '=\min \{r/4\varphi L, \root \theta \of {r/4kM}\}\).
(ii): We claim that there exists \(\varepsilon '>0\), not depending on (t, x), such that for all \(y \in B(x,\eta ')\) we can find \(w\in \textrm{co}\,F(t,y)\) such that
Indeed, let \(y\in B(x,\eta ')\) and \(w\in \textrm{co}\, F(t,y)\) be as in (i). Then for any \({\tilde{w}}\in {\mathbb {R}}^n\) such that \(|\tilde{w}-w|\leqslant r/8\,L\) and for all \(i\in J(x)\) and \(z\in {\mathbb {R}}^n\),
So the claim follows with \(\varepsilon '= \min \{k^{-1}(M+r/2\,L)^{-1} r/8,\,r/8\,L\}\).
(iii): We prove that there exist \(\eta>0,\,\varepsilon >0\), not depending on (t, x), such that for all \(y\in B(x,\eta )\cap \Omega (t)\) we can find \(w\in \textrm{co}\,F(t,y)\) satisfying
Let \(y\in B(x,\eta ')\cap \Omega (t)\) and \(w\in \textrm{co}\, F(t,y)\) be as in (ii). Then, by the mean value theorem, for any \(\tau \geqslant 0\), any \(z\in B(y,\varepsilon ')\cap \Omega (t)\), any \({\tilde{w}}\in B(w,\varepsilon ')\), and any \(i\in J(x)\) there exists \(\sigma _\tau \in [0,1]\) such that
Choosing \(\eta \in ]0, \eta ']\) and \(\varepsilon \in ]0, \varepsilon ']\) such that \(\eta +\varepsilon (M+r/4L+\varepsilon )\leqslant \delta \) and \(\varepsilon \leqslant {k^{-1}(M+r/4L+\varepsilon ')^{-2}r/4}\), it follows that for all \(z\in B(y,\varepsilon )\cap \Omega (t),\,{\tilde{w}}\in B(w,\varepsilon )\), and all \(0\leqslant \tau \leqslant \varepsilon \)
and
Furthermore, by (3.4) and since \(B(x,\delta )\subset \Omega _j(t)\) for all \(j\in I\backslash J(x)\), we have for all \(z\in B(y,\varepsilon )\cap \Omega (t),\,{\tilde{w}}\in B(w,\varepsilon )\), and all \(0\leqslant \tau \leqslant \varepsilon \)
4 Viability and distance estimates on trajectories
We provide here sufficient conditions for uniform linear \(L^\infty \) estimates on intervals of the form \(I=[t_0,t_1]\), with \(0\leqslant t_0<t_1\), for the state constrained differential inclusion
where \(F:\mathbb {R}^+ \times {\mathbb {R}}^n \rightsquigarrow {\mathbb {R}}^n\) is a given set-valued map. A function \(x:[t_0,t_1] \rightarrow {\mathbb {R}}^n\) is said to be:
-
F-trajectory if it is absolutely continuous and \(x' (t) \in F(t,x (t) )\) for a.e. \(t\in [t_0,t_1]\).
-
feasible F-trajectory if \(x (\cdot ) \) is an F-trajectory and \(x (t) \in \Omega (t)\) for all \( t\in I\).
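The two notions above can be explored numerically. The following is a minimal sketch, assuming a single-valued selection \(x'=f(t,x,u(t))\) of the inclusion, an explicit Euler discretization, and a constraint set of the form \(\{x:\,h(t,x)\leqslant 0\}\); the dynamics, the constraint, and all function names are hypothetical. A grid check of this kind can only suggest feasibility, since it says nothing between grid points.

```python
def euler_trajectory(f, u, x0, t0, t1, steps):
    """Explicit Euler approximation of x'(t) = f(t, x(t), u(t)), x(t0) = x0."""
    dt = (t1 - t0) / steps
    ts, xs = [t0], [x0]
    for i in range(steps):
        t, x = ts[-1], xs[-1]
        xs.append(x + dt * f(t, x, u(t)))
        ts.append(t0 + (i + 1) * dt)
    return ts, xs

def feasible_on_grid(ts, xs, h):
    """Check h(t, x(t)) <= 0 at the grid points only."""
    return all(h(t, x) <= 0.0 for t, x in zip(ts, xs))

# Hypothetical scalar example: x' = u and Omega(t) = {x : x - (1 + t) <= 0}.
f = lambda t, x, u: u
h = lambda t, x: x - (1.0 + t)

ts, slow = euler_trajectory(f, lambda t: 0.5, 0.0, 0.0, 2.0, 200)
_, fast = euler_trajectory(f, lambda t: 3.0, 0.0, 0.0, 2.0, 200)

print(feasible_on_grid(ts, slow, h))  # speed 0.5 stays below the moving bound
print(feasible_on_grid(ts, fast, h))  # speed 3 crosses the bound 1 + t
```

Both curves are F-trajectories for the same dynamics and start in \(\Omega (t_0)\); only the first one is feasible.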
Assumptions 4.1
We assume the following on \(F(\cdot ,\cdot )\):
-
(1)
F has closed and nonempty values, a sub-linear growth, and \(F(\cdot ,x)\) is Lebesgue measurable for all \(x\in {\mathbb {R}}^n\).
-
(2)
There exist \(M\geqslant 0\) and \(\alpha >0\) such that
$$\begin{aligned} \sup \{|v|\,:\,v \in F(t,x),\,t\in \mathbb {R}^+,\,x\in {\partial \Omega (t)+\alpha {\mathbb {B}}}\} \le M. \end{aligned}$$
-
(3)
There exists \(\varphi \in {\mathcal {L}}_{\textrm{loc}}\) such that \(F(t,\cdot )\) is \(\varphi (t) \)-Lipschitz continuous for all \(t\in {\mathbb {R}}^+\).
-
(4)
There exist \({\tilde{\eta }}>0\) and \(\gamma \in \mathcal L_{\textrm{loc}}\) such that F is \(\gamma \)-left absolutely continuous, uniformly wrt \({\partial \Omega +{\tilde{\eta }} {\mathbb {B}}}\).
Before stating the main result of this section, we recall a definition and a viability result for tubes ([3], Corollary 4.5).
Definition 4.1
Consider a closed interval \(I\subset \mathbb {R}\). We say that a set-valued map \(\Phi :I\rightsquigarrow \mathbb {R}^k\) is of locally bounded variations if:
-
\(\Phi \) takes nonempty closed images.
-
For any \([a,b]\subset I\)
$$\begin{aligned} \sup \;\sum _{i=1}^{m-1} exc(\Phi (t_{i+1})\cap \mathscr {K}| {\Phi (t_i)})\vee exc(\Phi (t_i)\cap \mathscr {K}|{\Phi (t_{i+1})})<+\infty \end{aligned}$$
where the supremum is taken over all compact subsets \(\mathscr {K}\subset \mathbb {R}^k\) and all finite partitions \(a=t_1< t_2<...< t_{m-1}<t_m= b\).
In the next result, we need to recall the definition of the Bouligand (or contingent) cone. Consider a closed set \(G\subset \mathbb {R}^n\). The Bouligand tangent cone at \(x\in G\) is defined by \(\mathscr {T}_G(x):=\{ v\in \mathbb {R}^n:\exists t_i\rightarrow 0+, \exists v_i\rightarrow v, x+t_iv_i \in G \,\forall i \}\).
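The contingent cone can be probed numerically through the equivalent characterization \(v\in \mathscr {T}_G(x)\) if and only if \(\liminf _{t\rightarrow 0+} d_G(x+tv)/t=0\). The sketch below evaluates this ratio for the closed unit disc in the plane at a boundary point; the choice of G, the function names, and the sample step sizes are assumptions for illustration, and a few finite values of t only suggest the limit.

```python
import math

def dist_to_disc(y):
    """Distance from y in R^2 to the closed unit disc G."""
    return max(math.hypot(y[0], y[1]) - 1.0, 0.0)

def tangency_ratio(x, v, t):
    """d_G(x + t v) / t; v is tangent to G at x iff the liminf as t -> 0+ is 0."""
    y = (x[0] + t * v[0], x[1] + t * v[1])
    return dist_to_disc(y) / t

x = (1.0, 0.0)  # boundary point of the disc
for v in [(-1.0, 0.0), (0.0, 1.0), (1.0, 0.0)]:
    print(v, [tangency_ratio(x, v, 10.0 ** (-k)) for k in (2, 4, 6)])
```

The inward direction gives ratio zero, the direction along the boundary gives a ratio decaying like t/2, and the outward normal gives a ratio close to 1, so only the first two belong to the cone, which here is the half-plane \(\{v:\,v_1\leqslant 0\}\).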
Proposition 4.1
(Existence of Viable Trajectories, [3]) Let \(E:\mathbb {R}^+ \rightsquigarrow \mathbb {R}^{d}\) be continuous (see Footnote 1), of locally bounded variations in the sense of Definition 4.1, and consider \(\Phi :\mathbb {R}^+ \times \mathbb {R}^d\rightsquigarrow \mathbb {R}^d\) a set-valued map with nonempty convex closed values such that:
If for a.e. \(t>0\) and all \(y\in E(t)\) it holds
then for any \(t_0\in \mathbb {R}^+ \) and \(x_0\in E(t_0)\) there exists an absolutely continuous viable solution
Remark 4.1
-
(1)
Proposition 4.1 extends classical viability results under restricted conditions on the regularity of the tube E (we refer the interested reader to [3] and the bibliography therein). Furthermore, it is straightforward to see that Lipschitz continuity of a set-valued map implies the locally bounded variations property.
-
(2)
We notice that, whenever Proposition 3.1 applies, condition (3.1) on \(\Phi (t,x)=\{\mathbf{{f}}(t,x,u):u\in U(t)\}\) ensures the nonempty intersection condition (4.1) for \(E(t)=\Omega (t)\).
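The proof of Proposition 3.1 works with velocities v satisfying \(\langle \nabla h_i(t,x), v \rangle \leqslant -r\) for the active indices. The sketch below tests that single inequality over a finite sample of velocities; the dynamics \(\mathbf{f}(t,x,u)=u\), the corner geometry, and the function names are hypothetical, and searching only sampled velocities is a sufficient test, not a necessary one, because the condition is stated on the convex hull.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def inward_velocity(velocities, active_gradients, r):
    """Return a velocity v with <g, v> <= -r for every active gradient g, or None."""
    for v in velocities:
        if all(dot(g, v) <= -r for g in active_gradients):
            return v
    return None

# Hypothetical data: f(t, x, u) = u with sampled controls, at the corner
# x = (0, 0) of the set {x1 <= 0} and {x2 <= 0}. Both constraints
# h_1(x) = x1 and h_2(x) = x2 are active there, with gradients e_1 and e_2.
controls = [(1.0, 0.0), (-1.0, 0.0), (0.0, -1.0), (-1.0, -1.0)]
gradients = [(1.0, 0.0), (0.0, 1.0)]

print(inward_velocity(controls, gradients, r=0.5))      # (-1.0, -1.0)
print(inward_velocity(controls[:3], gradients, r=0.5))  # None among these samples

# A convex combination of the first three samples still qualifies:
mid = (-0.5, -0.5)  # 0.5 * (-1, 0) + 0.5 * (0, -1)
print(all(dot(g, mid) <= -0.5 for g in gradients))
```

The last check illustrates why the condition is formulated with \(\textrm{co}\,F(t,x)\): at a corner, no single available velocity may point inward with respect to all active constraints, while a convex combination does.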
We have the following
Theorem 4.1
(Neighboring Trajectories Estimates) Consider Assumptions 4.1. Suppose that \(h_i\)’s satisfy the viability condition (3.1) and there exists \(L\ge 0\) such that
Then for every \(\delta >0\) there exists a constant \(\beta >0\) such that for any \([t_0,t_1]\subset \mathbb {R}^+ \) with \(t_1-t_0 = \delta \), any F-trajectory \({\hat{x}} (\cdot ) \) defined on \([t_0,t_1]\) with \({\hat{x}}(t_0)\in \Omega (t_0)\), and any \(\varrho >0\) satisfying
we can find an F-trajectory \(x (\cdot ) \) on \([t_0,t_1]\) such that
Proof
Fix \(\delta >0\) and let \([t_0,t_1]\subset \mathbb {R}^+\) with \(t_1-t_0=\delta \).
We first show the statement whenever \(F=\text {co }F\). Let
be such that
and
Notice that all the constants appearing in (4.3) do not depend on the time interval \([t_0,t_1]\), the trajectory \({\hat{x}} (\cdot ) \), and \(\varrho \).
1) \(\varrho \leqslant {\bar{\varrho }}\) and \(\delta \leqslant \Delta \).
We observe that, by the last inequality in (4.4), if
then \(x (\cdot ) ={\hat{x}} (\cdot ) \) is as desired. Indeed, without loss of generality, assume \({\hat{x}}(t_0)\in (\partial \Omega (t_0)+\frac{{\hat{\eta }} }{2}S^{n-1} )\cap \Omega (t_0)\) and suppose by contradiction that
Put \(s:=\inf R\) and notice that \(s\ne t_0\). Then, we have
where \(\text {dist}(A,B)\) stands for the standard Euclidean distance between two sets A and B. Since
it follows that
a contradiction. Next we assume that \({\hat{x}}(t_0)\in (\partial \Omega (t_0)+\frac{{\hat{\eta }} }{2}{\mathbb {B}})\cap \Omega (t_0) \).
From Proposition 3.1, let \(v\in F(t_0,{\hat{x}}(t_0))\) be as in (3.2) and define
by
where \(J=\{s\in ]t_0+k\varrho ,t_1]:\,{\hat{x}}'(s-k\varrho )\; \; \textrm{exists}\}\). Hence
By Filippov’s theorem (cfr. [1]) there exists an F-trajectory \(x (\cdot ) \) on \([t_0,t_1]\) such that \(x(t_0)=y(t_0)\) and
for all \(t\in [t_0,t_1]\). Then, using Assumptions 4.1–3, (2.1), and (4.7), it follows that
Hence, we obtain for any \(t\in [t_0,(t_0+k\varrho )\wedge t_1]\)
and, using the Fubini theorem, for any \(t\in ]t_0+k\varrho ,t_1]\)
Thus, by (4.9), for all \(t\in [t_0,(t_0+k\varrho )\wedge t_1]\)
and
Finally, taking note of (4.8), it follows that
where we put \(\beta _1=2 (M+e^{\theta _\varphi (\Delta )}(\theta _\gamma (\Delta )+\theta _\varphi (\Delta )M))k\).
We claim next that
Indeed, if \(t\in ]t_0,(t_0+k\varrho )\wedge t_1]\), then from (3.2), the first condition in (4.4), and (4.7) it follows that
and it is enough to use (4.10) and the first inequality in (4.5).
On the other hand, if \(t\in ]t_0+k\varrho ,t_1]\), then for \( \pi (t) \in \Pi _{\Omega (t)}({\hat{x}}(t-k\varrho )) \) we have \(|{\hat{x}}(t-k\varrho )-\pi (t) |= d_{\Omega (t)}({\hat{x}}(t-k\varrho ))\leqslant \varrho \), and, from (4.7), it follows that
Now, since \(|\pi (t) -{\hat{x}}(t_0)|\leqslant |{\hat{x}}(t-k\varrho )-\pi (t) |+ |{\hat{x}}(t-k\varrho )-{\hat{x}}(t_0)| \leqslant {\bar{\varrho }}+M\Delta \), from Proposition 3.1 and the \(2^{\text {nd}}\) inequality in (4.4)
Finally, (4.12) and (4.13) imply that \( y (t) +(k\varepsilon -1)\varrho {\mathbb {B}}\subset \Omega (t). \) So, the claim follows from (4.5)-(ii) and (4.11).
2) \(\varrho > {\bar{\varrho }}\) and \(\delta \leqslant \Delta \).
By Proposition 4.1, there exists a feasible F-trajectory \({\bar{x}} (\cdot ) \) on \([t_0,t_1]\) starting from \({\hat{x}}(t_0)\). Note that \(d_{\Omega (t)}({\bar{x}}(t))=0\) for all \(t\in [t_0,t_1]\). By Case 1, replacing \({\hat{x}} (\cdot ) \) with \({\bar{x}} (\cdot ) \), it follows that there exists a feasible F-trajectory \(x (\cdot ) \) on \([t_0,t_1]\) such that \(x(t_0)={\hat{x}}(t_0)\) and \(x(t)\in \text {int }\Omega (t)\) for all \(t\in ]t_0,t_1]\). Hence, by Assumption 4.1-2, we have \( \Vert {\hat{x}}-x\Vert _{\infty ,[t_0,t_1]}\leqslant 2\,M\Delta \leqslant \beta _2\varrho , \) with \(\beta _2=\frac{2\,M\Delta }{{\bar{\varrho }}}\).
3) \(\delta > \Delta \).
The above proof implies that in Cases 1 and 2, \(\beta _1, \, \beta _2\) can be taken the same if \(\delta \) is replaced by any \(0< \delta _1 < \delta \). Define \({\tilde{\beta }}=\beta _1\vee \beta _2\) and let \(\{[\tau _-^i,\tau _+^i]\}_{i=1}^m\) be a partition of \([t_0,t_1]\) into intervals of length at most \(\delta /m\). Put \(x_0 (\cdot ):={\hat{x}} (\cdot ) \). From Cases 1 and 2, replacing \([t_0,t_1]\) by \([\tau _-^1,\tau _+^1]\) and setting
we conclude that there exists an F-trajectory \(x_1 (\cdot ) \) on \([\tau _-^1,\tau _+^1]=[t_0,\tau _+^1]\) such that \(x_1(t_0)={\hat{x}}(t_0)\), \(x_1(t)\in \text {int } \Omega (t)\) for all \(t\in ]t_0,\tau _+^1]\), and
Using Filippov’s theorem, we can extend the trajectory \(x_1 (\cdot ) \) to the whole interval \([t_0,t_1]\) so that
where \(K:=e^{\theta _{\varphi }(\delta )}\). Repeating recursively the above argument on each time interval \([\tau _-^i,\tau _+^i]\), we conclude that there exists a sequence of F-trajectories \(\{x_i (\cdot ) \}_{i=1}^m\) on \([t_0,t_1]\), such that:
-
\(x_i(t_0)={\hat{x}}(t_0)\) for all \(i=1,...,m\);
-
\(x_i(t)\in \text {int }\Omega (t)\) for all \(t\in ]t_0,\tau _+^i]\) and all \(i=1,...,m\);
-
\(x_j (\cdot ) |_{[t_0,\tau _+^{j-1}]}=x_{j-1} (\cdot ) \) for all \(j=2,...,m\);
and
where
Notice that
Taking note of (4.14) and (4.15) we get for all \(i=1,...,m\)
Then, letting \(x (\cdot ):=x_m (\cdot ) \) and observing that \(\varrho _0\leqslant \varrho \), we obtain
where \( \beta _3=(1+K{\tilde{\beta }})^m-1. \)
Then all conclusions of the theorem follow with \(\beta ={\tilde{\beta }} \vee \beta _3\). Observe that \(\beta \) depends only on \(\varepsilon ,{\hat{\eta }} \), M, \(\delta \), and on functions \(\gamma (\cdot ) \) and \(\varphi (\cdot ) \).
Now, assume \(F\ne \text {co }F\). From the first part of the proof, we have that there exist \(\beta >0\) (that does not depend on the reference trajectory \(\hat{x}(\cdot )\) on \([t_0, t_1]\)) and a \(\text {co }F\) trajectory \( {\bar{x}} (\cdot ):[t_0, t_1] \rightarrow {\mathbb {R}}^{n}\), strictly feasible on \( ]t_0, t_1]\), such that
Let \(\left\{ s_{i}\right\} _i\subset ]t_0, t_1]\) with \(s_{1}=t_1\) be a decreasing sequence such that \(s_{i} \rightarrow t_0\). Since \({\bar{x}}(\cdot )\) is strictly feasible on \( ]t_0, t_1]\) we can find a sequence of decreasing numbers \(\{\varepsilon _{i}\}_i \subset ]0, \varrho [\) such that \(\varepsilon _{i} \rightarrow 0\) and
Without loss of generality, we can assume that \(\varepsilon _i\le \frac{1}{4}\wedge \varrho \) for all \(i\in \mathbb {N}^+\). Put \(C:=e^{ \int _{t_0}^{t_1} \varphi (\sigma ) d \sigma }\) and define \(a_k:=\frac{\varepsilon _k}{C^k}\) for all \(k\in \mathbb {N}\). Notice that
We recall the following known relaxation result.
Lemma 4.1
(Relaxation, [34]) Consider a measurable set-valued map \(F: [S,T]\times \mathbb {R}^n \rightsquigarrow \mathbb {R}^{n}\) with closed and nonempty values. Assume that there exist \(\varphi ,\psi \in L^{1}(S,T;\mathbb {R})\) such that
Take any feasible \(\text {co }F\)-trajectory \(x(\cdot )\) and any \(\varepsilon >0\). Then there exists an F-trajectory \(y(\cdot )\) that satisfies \(y(S)=x(S)\) and
From the above lemma, there exist a sequence of F-trajectories \(x_{i}:\left[ s_{i}, t_1\right] \rightarrow {\mathbb {R}}^{n}\) such that, for all \(i \geqslant 2\), we have \(x_{i}\left( s_{i}\right) = {\bar{x}} \left( s_{i}\right) \) and
For each integer \(j \geqslant 2\), we construct an F-trajectory \(y_{j}:\left[ s_{j}, t_1\right] \rightarrow {\mathbb {R}}^{n}\) as follows: \(y_2(\cdot ):=x_2(\cdot )\) and for all \(j>2\)
-
\(y_{j}(\cdot )\) is the restriction of \(x_{j}(\cdot )\) on \( ]s_{j}, s_{j-1} ]\).
-
for all \(1\le k\le j-2\), \(y_{j}(\cdot )\) restricted to \( ]s_{j-k}, s_{j-k-1}]\) is an F-trajectory with initial state \(y_{j}\left( s_{j-k}\right) \), obtained by applying Filippov’s theorem with reference trajectory \(y_{j-1}(\cdot )\).
Now fix an integer \(j>2\). From Filippov’s theorem and since \( x_{i} (s_{i} )= {\bar{x}} (s_{i} )\), for any \(2 \leqslant i<j\)
From these relations and (4.17), it follows that for each \(2 \leqslant i<j\) and any \(l\in \mathbb {N}^+\)
Notice that \(y_{j} (s_{j} )= {\bar{x}} (s_{j} )\) for any \(j \geqslant 2\). Hence, we can extend each F-trajectory \(y_{j}\) as a co F-trajectory to the whole interval \([t_0, t_1]\), by setting \(y_{j}(\sigma )= {\bar{x}} (\sigma )\) for \(\sigma \in \left[ t_0, s_{j}\right] \). Since the trajectories \(\{y_{i}\}_i\) have initial value \(\hat{x}(t_0)\) and owing to the sub-linear growth of F, taking a subsequence and keeping the same notation, we have
We conclude by showing that \(x(\cdot )\) satisfies all the conclusions with \(\beta \) replaced by \(\beta +1\). Indeed, due to (4.19), for each \(k \geqslant 2\) the F-trajectories \(\{y_{i}\}_i\), restricted to \(\left[ s_{k}, s_{k-1}\right] \), form a Cauchy sequence in \(W^{1,1}\left( s_{k}, s_{k-1}\right) \) (see Footnote 2). So, it follows that the limiting co F-trajectory \(x(\cdot )\) is an F-trajectory and, since \(\varepsilon _{i} \leqslant \varrho \) for all \(i\ge 2\),
Moreover, notice that \(x(\cdot )\) is strictly feasible on \(]t_0,t_1]\). Indeed, consider \(\sigma \in ]t_0, t_1]\). We have \(\sigma \in ]s_{i}, s_{i-1}]\) for some \(i \geqslant 2\). From (4.18), (4.17), and (4.16) we get
Since the \(\{y_{j}\}_j\) converge uniformly to x,
This concludes our proof. \(\square \)
Now, consider the following state constrained differential inclusion:
where \(t_0\geqslant 0\). A function \(x:[t_0,+\infty [ \rightarrow {\mathbb {R}}^n\) is said to be:
-
\(F_{\infty }\)-trajectory if \(x|_{[t_0,t_1]} (\cdot ) \) is an F-trajectory for all \(t_1> t_0\).
-
feasible \(F_{\infty }\)-trajectory if \(x|_{[t_0,t_1]} (\cdot ) \) is a feasible F-trajectory for all \(t_1> t_0\).
Theorem 4.2
Consider Assumptions 4.1. Suppose that conditions in (3.2) hold true and
Then there exist \(C>1\) and \(K>0\) such that for any \(t_0\geqslant 0\), any \(x^0,x^1\in \Omega (t_0)\), and any feasible \(F_{\infty }\)-trajectory \(x:[t_0,+\infty [ \rightarrow {\mathbb {R}}^n\), with \(x(t_0)=x^0\), we can find a feasible \(F_{\infty }\)-trajectory \(\tilde{x}:[t_0,+\infty [ \rightarrow {\mathbb {R}}^n\), with \({\tilde{x}}(t_0)=x^1\), such that
Proof
Let \(\delta =1\) and \(\beta >0\) be as in Theorem 4.1. Consider \(K_1>0,K_2>0,\) and \({\tilde{k}}>0\) such that
Fix \(t_0\ge 0\), \(x^0,x^1\in \Omega (t_0)\), with \(x^1\ne x^0\), and a feasible \({F}_{\infty }\)-trajectory \(x:[t_0,+\infty [ \rightarrow {\mathbb {R}}^n\) with \(x(t_0)=x^0\). By Filippov’s theorem, there exists an F-trajectory \(y_0:[t_0,t_0+1] \rightarrow {\mathbb {R}}^n\) such that \(y_0(t_0)=x^1\) and
Denote by \(x_0:[t_0,t_0+1] \rightarrow {\mathbb {R}}^n\) the feasible F-trajectory, with \(x_0(t_0)=x^1\), satisfying the conclusions of Theorem 4.1 with \({\hat{x}} (\cdot ) =y_0 (\cdot ) \). Thus
and therefore
Now, applying again Filippov’s theorem on \([t_0+1,t_0+2]\), there exists an F-trajectory \(y_1:[t_0+1,t_0+2] \rightarrow {\mathbb {R}}^n\), with \(y_1(t_0+1)=x_0(t_0+1)\), such that, thanks to (4.21),
Denoting by \(x_1:[t_0+1,t_0+2] \rightarrow {\mathbb {R}}^n\) the feasible F-trajectory, with \(x_1(t_0+1)=x_0(t_0+1)\), satisfying the conclusions of Theorem 4.1, for \({\hat{x}} (\cdot ) =y_1 (\cdot ) \), we deduce from (4.22), that
Hence, taking note of (4.22) and (4.23),
Continuing this construction, we obtain a sequence of feasible F-trajectories \(x_i:[t_0+i,t_0+i+1] \rightarrow {\mathbb {R}}^n\) such that \(x_j(t_0+j)=x_{j-1}(t_0+j)\) for all \(j\geqslant 1\), and
Define the feasible \({F}_{\infty }\)-trajectory \({\tilde{x}}:[t_0,+\infty [ \rightarrow {\mathbb {R}}^n\) by \( {\tilde{x}}(t):=x_i(t)\) if \(t\in [t_0+i,t_0+i+1] \) and observe that \({\tilde{x}}(t_0)=x^1\). Let \(t\geqslant t_0\). Then there exists \(i\in {\mathbb {N}}\) such that \(t\in [t_0+i,t_0+i+1]\). So, from (4.24) and (4.20), it follows that
where \(K=K_1+K_2\) and \(C=e^{{\tilde{k}}}(2\beta +1)\). \(\square \)
5 Lipschitz continuity
Now we give an application of the results of previous sections to the Lipschitz regularity of the value function for a class of infinite horizon optimal control problems subject to state constraints.
Let us consider (see Footnote 3) the problem (\(\mathcal {P}\)) stated in Sect. 2.
Assumptions 5.1
We take the following assumptions on \(\textbf{f}\) and \(\textbf{L}\):
-
(1)
For all \(x\in {\mathbb {R}}^n\) the mappings \(\textbf{f}(\cdot ,x,\cdot ),\, \textbf{L}(\cdot ,x,\cdot )\) are Lebesgue-Borel measurable.
-
(2)
There exists \(\alpha >0\) such that \(\textbf{f}\) and \(\textbf{L}\) are bounded functions on
$$\begin{aligned} \{(t,x,u)\,:\,t\geqslant 0,\, x\in (\partial \Omega (t)+\alpha {\mathbb {B}}),\, u\in U (t) \}. \end{aligned}$$
-
(3)
For all \((t,x)\in \mathbb {R}^+ \times {\mathbb {R}}^{n}\) the set
$$\begin{aligned} \{(\textbf{f}(t,x,u), \textbf{L}(t,x,u))\,:\,u\in U (t) \} \end{aligned}$$
is closed.
-
(4)
There exist \(c\in L^1_{\textrm{loc}}(\mathbb {R}^+;{\mathbb {R}}^+)\) and \(k\in {\mathcal {L}}_{\textrm{loc}}\) such that for any \(t\in {\mathbb {R}}^+,\, x,\,y\in {\mathbb {R}}^n\), and \(u\in U (t) \),
$$\begin{aligned} |\textbf{f}(t,x,u)-\textbf{f}(t,y,u)|+ | \textbf{L}(t,x,u)- \textbf{L}(t,y,u)|\leqslant k (t) |x-y|, \end{aligned}$$
$$\begin{aligned} |\textbf{f}(t,x,u)|+ | \textbf{L}(t,x,u)|\leqslant c (t) (1+|x|). \end{aligned}$$
-
(5)
There exist \(\tilde{\eta }>0\) and \(\gamma \in {\mathcal {L}}_{\textrm{loc}}\) such that
$$\begin{aligned} t\rightsquigarrow \{(\textbf{f}(t,x,u), \textbf{L}(t,x,u)): u\in U(t)\} \end{aligned}$$
is \(\gamma \)-left absolutely continuous, uniformly wrt \( \partial \Omega +\tilde{\eta } {\mathbb {B}}\).
-
(6)
\(\limsup _{t \rightarrow \infty }\,\frac{1}{t}\int _0^{t} (c(s)+k(s))\,ds <\infty \).
We consider, for any \(\lambda >0\), the relaxed infinite horizon state constrained problem
where
- \(W:\mathbb {R}^+ \rightsquigarrow {\mathbb {R}}^{(n+1)m}\times {\mathbb {R}}^{n+1}\) is the measurable set-valued map defined by:
$$\begin{aligned} W (s) :=(\times _{i=0}^n U (s) ) \times \{(\alpha _0,...,\alpha _n)\in {\mathbb {R}}^{n+1}\,:\, \sum _{i=0}^n \alpha _i=1,\, \alpha _i\geqslant 0\,\, \forall \, i\} \end{aligned}$$
for all \(s \geqslant 0\).
- \( \textbf{f}^\star : \mathbb {R}^+ \times {\mathbb {R}}^n\times {\mathbb {R}}^{(n+1)m}\times {\mathbb {R}}^{n+1} \rightarrow {\mathbb {R}}^n\) and \( \textbf{L}^\star : \mathbb {R}^+ \times {\mathbb {R}}^n\times {\mathbb {R}}^{(n+1)m}\times {\mathbb {R}}^{n+1} \rightarrow {\mathbb {R}}\) are defined by:
$$\begin{aligned} \textbf{f}^\star (s,x,w)&:=\sum _{i=0}^n \alpha _i \textbf{f}(s,x,u_i)\\ \textbf{L}^\star (s,x,w)&:=\sum _{i=0}^n\alpha _i \textbf{L}(s,x,u_i) \end{aligned}$$
for all \(s \geqslant 0\), \(x\in {\mathbb {R}}^n\), and \(w=(u_0,...,u_n,\alpha _0,...,\alpha _n)\in {\mathbb {R}}^{(n+1)m}\times {\mathbb {R}}^{n+1}\).
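The definitions of \(\textbf{f}^\star \) and \(\textbf{L}^\star \) can be illustrated numerically. The sketch below is not taken from the paper: the dynamics `f` (unit speed in direction `u`) and the cost `L` are toy choices assumed for illustration, with \(n=2\), \(m=1\), and the control constraint \(U(s)\) is not enforced. It shows that a relaxed control can produce a velocity (here zero) that no ordinary control attains, which is the situation where the velocity set is not convex.

```python
import numpy as np

def f(s, x, u):
    """Toy dynamics (assumed for illustration): unit speed in direction u."""
    return np.array([np.cos(u), np.sin(u)])

def L(s, x, u):
    """Toy running cost (assumed for illustration)."""
    return float(x @ x) + u**2

def relaxed(s, x, controls, alphas):
    """Return (f_star, L_star) at w = (u_0, ..., u_n, alpha_0, ..., alpha_n)."""
    controls = np.asarray(controls, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    # The weights must lie on the unit simplex, as in the definition of W(s).
    if np.any(alphas < 0) or not np.isclose(alphas.sum(), 1.0):
        raise ValueError("alphas must be nonnegative and sum to 1")
    f_star = sum(a * f(s, x, u) for a, u in zip(alphas, controls))
    L_star = sum(a * L(s, x, u) for a, u in zip(alphas, controls))
    return f_star, L_star

x = np.array([1.0, 0.0])
controls = [0.0, 2 * np.pi / 3, 4 * np.pi / 3]  # n + 1 = 3 controls for n = 2
alphas = [1 / 3, 1 / 3, 1 / 3]
f_star, L_star = relaxed(0.0, x, controls, alphas)
print(f_star, L_star)
```

Every ordinary control gives \(|\textbf{f}|=1\) in this toy model, whereas the three equally weighted directions average to the zero velocity, so the relaxed system can stand still while the original one cannot.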
Remark 5.1
1. For control systems, condition (3.2) takes the following form: for some \(\varepsilon>0,\,\eta >0\) and every \(t\in \mathbb {R}^+,\,x\in (\partial \Omega (t)+\eta {\mathbb {B}})\cap \Omega (t)\) there exist \(\{\alpha _i\}_{i=0}^n\subset [0,1]\), with \(\sum _{i=0}^n\alpha _i=1\), and \(\{u_i\}_{i=0}^n\subset U(t)\) satisfying
$$\begin{aligned} \{y+[0,\varepsilon ] (\sum _{i=0}^n\alpha _i\textbf{f}(t,x,u_i)+\varepsilon {\mathbb {B}} )\,:\, y\in (x+\varepsilon {\mathbb {B}})\cap \Omega (t)\}\subset \Omega (t). \end{aligned}$$
2. If there exist \({\tilde{\eta }}>0,\,\gamma ,\,{\tilde{\gamma }}\in {\mathcal {L}}_{\textrm{loc}}\), and \(k\geqslant 0\) such that \((\textbf{f},\textbf{L} )\) is \(\gamma \)-left absolutely continuous, uniformly wrt \( (\partial \Omega +{\tilde{\eta }} {\mathbb {B}})\times {\mathbb {R}}^m\), \(U (\cdot ) \) is \({\tilde{\gamma }}\)-left absolutely continuous, and \(\textbf{f}(t,x,\cdot )\) is \(k\)-Lipschitz continuous for all \(t\in \mathbb {R}^+,\,x\in (\partial \Omega (t)+{\tilde{\eta }} {\mathbb {B}})\), then Assumption 5.1-5 holds true.
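The inclusion in item 1 can be probed numerically in a scalar toy setting. The sketch below makes several assumptions that are not in the paper: \(\Omega (t)=[0,+\infty [\) for all \(t\), \(\textbf{f}(t,x,u)=u\), \(U(t)=[-1,1]\), and the helper name `inward_pointing_holds` is hypothetical. Because it only samples finitely many points, it can refute the inclusion but cannot prove it.

```python
import numpy as np

def inward_pointing_holds(x, v, eps, in_omega, samples=21):
    """Sampled check of  y + [0, eps] * (v + eps*B) inside Omega
    for every y in (x + eps*B) that belongs to Omega.

    Scalar case only, so eps*B is the interval [-eps, eps].
    """
    ys = np.linspace(x - eps, x + eps, samples)
    steps = np.linspace(0.0, eps, samples)
    perturbations = np.linspace(-eps, eps, samples)
    for y in ys:
        if not in_omega(y):
            continue  # only points of Omega near x are tested
        for s in steps:
            for e in perturbations:
                if not in_omega(y + s * (v + e)):
                    return False
    return True

# Assumed toy data: Omega = [0, +inf), f(t, x, u) = u, U = [-1, 1].
in_omega = lambda y: y >= 0.0
eps, eta = 0.5, 0.5
boundary_layer = np.linspace(0.0, eta, 6)  # points of Omega within eta of the boundary

# alpha = (1, 0) with u_0 = 1 gives the velocity sum_i alpha_i f(t, x, u_i) = 1.
inward_ok = all(inward_pointing_holds(x, 1.0, eps, in_omega) for x in boundary_layer)
# The outward velocity -1 violates the inclusion at the boundary point x = 0.
outward_ok = inward_pointing_holds(0.0, -1.0, eps, in_omega)
print(inward_ok, outward_ok)
```

With the inward velocity every sampled point stays in \(\Omega \), since \(v+e\geqslant 1-\varepsilon >0\); with the outward velocity the sampled points leave \(\Omega \) immediately.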
Definition 5.1
We denote by
the value functions of the infinite horizon control problems (\(\mathcal {P}\)) and (\(\mathcal {P}^\star \)), respectively, where
Next, we state the main result of this section.
Theorem 5.1
Consider Assumptions 5.1. Suppose that (3.1) and (4.2) hold true. Then there exist \(b>1\) and \(K>0\) such that for all \(\lambda >K\) we have
(i) \( V^\star (t,\cdot )\) is \( b \cdot e^{-(\lambda -K)t}\)-Lipschitz continuous on \(\Omega (t)\), for any \(t\geqslant 0\).
(ii) \(\lim _{t \rightarrow \infty } V^\star (t,x (t) )=0\) for any feasible trajectory \(x (\cdot ) \).
(iii) \(V^\star =V\) on \(\mathcal {Q}_\Omega \).
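To see what statements (i) and (ii) look like in a case that can be computed by hand, consider the following toy problem, which is an assumption made here for illustration and is not treated in the paper: \(n=1\), no state constraint, \(x'=u\) with \(|u|\leqslant 1\), and running cost \(|x|\). Moving toward the origin at full speed minimizes \(|x(s)|\) at every time, which gives the closed form \(V(t,x)=e^{-\lambda t}\big (|x|/\lambda -(1-e^{-\lambda |x|})/\lambda ^2\big )\). Its derivative in \(x\) is bounded by \(e^{-\lambda t}/\lambda \), so the Lipschitz constant decays exponentially in \(t\), in the same spirit as (i). The sketch checks this bound on a grid.

```python
import math

def value(t, x, lam):
    """Closed-form value of the assumed toy problem x' = u, |u| <= 1, cost |x|."""
    a = abs(x)
    return math.exp(-lam * t) * (a / lam - (1.0 - math.exp(-lam * a)) / lam**2)

def lipschitz_estimate(t, lam, xs):
    """Largest difference quotient of value(t, .) over consecutive grid points."""
    return max(
        abs(value(t, x1, lam) - value(t, x0, lam)) / (x1 - x0)
        for x0, x1 in zip(xs, xs[1:])
    )

lam = 2.0
xs = [-5.0 + 0.01 * i for i in range(1001)]  # increasing grid on [-5, 5]
for t in (0.0, 1.0, 2.0):
    bound = math.exp(-lam * t) / lam  # Lipschitz bound at time t for the toy problem
    print(t, lipschitz_estimate(t, lam, xs), bound)
```

Since trajectories of the toy system satisfy \(|x(t)|\leqslant |x_0|+t-t_0\), the same formula shows that the value along any trajectory tends to zero, as in (ii).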
Proof
We notice that, by Proposition 4.1, the problem (\(\mathcal {P}^\star \)) admits feasible trajectory-control pairs for any initial condition; using the sub-linear growth of \(\textbf{f}\) and the Gronwall lemma, we have \(1+|x (t) |\leqslant (1+|x_0|)e^{\int _{t_0}^{t}c(s)\,ds}\) for all \(t\geqslant t_0\) and for any trajectory-control pair \((x (\cdot ),u (\cdot ) )\) at \(t_0\in \mathbb {R}^+,\,x_0\in \Omega (t_0)\).
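For completeness, the Gronwall step can be spelled out as follows. This is a routine sketch; the auxiliary function \(\varphi \) is notation introduced here. The same computation applies to \(\textbf{f}^\star \), since a convex combination of vectors bounded by \(c(t)(1+|x|)\) obeys the same bound. By Assumption 5.1-(4), any trajectory-control pair with \(x(t_0)=x_0\) satisfies \(|x'(s)|\leqslant c(s)(1+|x(s)|)\) for a.e. \(s\geqslant t_0\), hence
$$\begin{aligned} 1+|x(t)|\leqslant 1+|x_0|+\int _{t_0}^{t}c(s)\,(1+|x(s)|)\,ds\quad \text {for all } t\geqslant t_0. \end{aligned}$$
Applying the Gronwall lemma to \(\varphi (t):=1+|x(t)|\) gives
$$\begin{aligned} \varphi (t)\leqslant (1+|x_0|)\,e^{\int _{t_0}^{t}c(s)\,ds}\quad \text {for all } t\geqslant t_0. \end{aligned}$$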
In what follows, we define for all \( (t,x,z)\in \mathbb {R}^+ \times {\mathbb {R}}^n\times {\mathbb {R}} \) the time-measurable set-valued maps
Next, we show (i). Let \(a_1>0,\,a_2>0\) be such that
For all \(T>t_0\), we have
Then, by (5.1) and denoting \(\psi (t) =\int _{t_0}^{t}c(s)\,ds\), for any \(\lambda >a_1\)
Passing to the limit when \(T \rightarrow \infty \), we deduce that for every feasible trajectory-control pair \((x (\cdot ),w (\cdot ) )\) at \((t_0,x_0)\)
From now on, assume that \(\lambda >a_1\). Fix \(t\geqslant 0\) and \(x^1,x^0\in \Omega (t)\) with \(x^1\ne x^0\). Then, for any \(\delta >0\) there exists a feasible trajectory-control pair \((x_\delta (\cdot ),w_\delta (\cdot ) )\) at \((t,x^0)\) such that
Hence
for any feasible trajectory-control pair \(( x (\cdot ), w (\cdot ) )\) satisfying \(x(t)=x^1\). Consider the following state constrained differential inclusion in \({\mathbb {R}}^{n+1}\)
Putting \(z_\delta (s)=\int _{t}^{s} \textbf{L}^\star (\xi ,x_\delta (\xi ),w_\delta (\xi ))\,d\xi \), by Theorem 4.2 applied on \(\Omega (t)\times {\mathbb {R}}\) and the measurable selection theorem, there exist \(C>1\) and \(K>0\) such that for all \(\delta >0\) we can find a \( G^\star _{\infty }\)-trajectory \(({\tilde{x}}_\delta (\cdot ), {\tilde{z}}_\delta (\cdot ) )\) on \([t,+\infty [\), and a measurable selection \({\tilde{w}}_\delta (s)\in W (s) \) a.e. \(s\geqslant t\), satisfying
and for any \(s\geqslant t\)
Now, relabelling by K the constant \(K\vee a_1\), by (5.5) and integrating by parts, for all \(\lambda >K\), all \(\tau \geqslant t\), and all \(\delta >0\)
Taking note of (5.4), (5.6), and putting \(\delta =\lambda -K\), for all \(\lambda >K\) we get
By the symmetry of the previous inequality with respect to \(x^1\) and \(x^0\), and since \(\lambda \), C, and K do not depend on t, \(x^1\), and \(x^0\), the statement (i) follows.
Now, let \((t_0,x_0)\in \mathcal {Q}_\Omega \) and consider a feasible trajectory \(X (\cdot ) \) at \((t_0,x_0)\). Let \(t> t_0\) and \((x (\cdot ),w (\cdot ) )\) be a feasible trajectory-control pair at \((t,X(t))\) such that \( V^\star (t,X (t) )>\int _{t}^{\infty }e^{-\lambda s} \textbf{L}^\star (s,x (s),w (s) )\,ds-\frac{1}{t}\). Then
From (5.1) and (5.2), we have for all \(T>t\)
Then, arguing as in (5.3) with \(t_0\) replaced by t and taking the limit when \(T \rightarrow \infty \), we deduce that
Since \(K\geqslant a_1\), (ii) follows passing to the limit when \(t \rightarrow + \infty \).
Next, we show (iii). Notice that \( V^\star (t,x)\leqslant V(t,x)\) for any \((t,x)\in \mathcal {Q}_\Omega \), and \( V^\star (t,\cdot )\) is Lipschitz continuous on \(\Omega (t)\) for all \(t\geqslant 0\) whenever \(\lambda >0\) is sufficiently large. Fix \(t_0\in \mathbb {R}^+,\,x_0\in \Omega (t_0)\), and \(\varepsilon >0\). We claim that for all \(j\in {\mathbb {N}}^+\) there exists a finite set of trajectory-control pairs \(\{(x_k (\cdot ),u_k (\cdot ) )\}_{k=1,...,j}\) satisfying the following: \(x_k' (s) =\textbf{f}(s,x_k (s),u_k (s) )\) a.e. \(s\in [t_0,t_0+k]\) and \(x_k(s)\in \Omega (s)\) for all \(s\in [t_0,t_0+k]\) and for all \(k=1,...,j\); if \(j\geqslant 2\), \(x_k|_{[t_0,t_0+k-1]} (\cdot ) =x_{k-1} (\cdot ) \) for all \(k=2,...,j\); and for all \(k=1,...,j\)
We prove the claim by induction on \(j\in {\mathbb {N}}^+\). By the dynamic programming principle, there exists a trajectory-control pair \(({\tilde{x}} (\cdot ),{\tilde{w}} (\cdot ) )\) on \([t_0,t_0+1]\), feasible for the problem (\(\mathcal {P}^\star \)) at \((t_0,x_0)\), such that
By the relaxation theorem for finite horizon problems (cf. [34]), for any \(h>0\) there exists a measurable control \({\hat{u}}^h(t)\in U(t)\) a.e. \(t\in [t_0,t_0+1]\) such that the solution of the equation \(({\hat{x}}^h)' (t) =\textbf{f}(t,{\hat{x}}^h (t),{\hat{u}}^h (t) )\) a.e. \(t\in [t_0,t_0+1]\), with \({\hat{x}}^h(t_0)=x_0\), satisfies
and
Now, consider the following state constrained differential inclusion in \({\mathbb {R}}^{n+1}\)
Letting \({\hat{X}}^h(\cdot )=({\hat{x}}^h(\cdot ),{\hat{z}}^h(\cdot ))\), with \({\hat{z}}^h (t) =\int _{t_0}^{t} e^{-\lambda s} \textbf{L}(s,{\hat{x}}^h (s),{\hat{u}}^h (s) )\,ds\), by Theorem 4.1 and the measurable selection theorem, there exists \(\beta >0\) (not depending on \((t_0,x_0)\)) such that for any \(h>0\) we can find a feasible G-trajectory \(X^h (\cdot ) =(x^h (\cdot ),z^h (\cdot ) )\) on \([t_0,t_0+1]\), with \(X^h(t_0)=(x_0,0)\), and a measurable control \(u^h (s) \in U (s) \) a.e. \(s\in [t_0,t_0+1]\), such that
and
Since \(\sup _{s\in [t_0,t_0+1]}d_{\Omega (s)\times {\mathbb {R}}}({\hat{X}}^h(s))\leqslant \Vert {\tilde{x}}-{\hat{x}}^h\Vert _{\infty ,[t_0,t_0+1]}\), we have
and
Hence, choosing \(0<h<\varepsilon /4(2\beta +1)\) sufficiently small, we can find a trajectory-control pair \((x^h (\cdot ),u^h (\cdot ) )\) on \([t_0,t_0+1]\), with \(u^h (s) \in U (s) \) and \((x^h)' (s) =\textbf{f}(s,x^h (s),u^h (s) )\) a.e. \(s\in [t_0,t_0+1]\), \(x^h(t_0)=x_0\), and \(x^h(s)\in \Omega (s)\) for \(s\in [t_0,t_0+1]\), such that, by (5.8) and continuity of \( V^\star (t_0+1,\cdot )\)
Letting \((x_1 (\cdot ),u_1 (\cdot ) ):=(x^h (\cdot ),u^h (\cdot ) )\), the conclusion follows for \(j=1\). Now, suppose we have shown that there exist \(\{(x_k (\cdot ),u_k (\cdot ) )\}_{k=1,...,j}\) satisfying the claim. Let us prove it for \(j+1\). By the dynamic programming principle there exists a trajectory-control pair \(({\tilde{x}} (\cdot ),{\tilde{w}} (\cdot ) )\) on \([t_0+j,t_0+j+1]\), feasible for the problem (\(\mathcal {P}^\star \)) at \((t_0+j,x_j(t_0+j))\), such that
As before, for every \(h>0\) there exist a feasible G-trajectory \(X^h (\cdot ) =(x^h (\cdot ),z^h (\cdot ) )\) on \([t_0+j,t_0+j+1]\), with \(X^h(t_0+j)=(x_j(t_0+j),0)\), and a measurable control \(u^h (s) \in U (s) \) a.e. \(s\in [t_0+j,t_0+j+1]\), such that
satisfying
and
Putting
and choosing \(0<h<\varepsilon /2^{j+2}(2\beta +1)\) sufficiently small, it follows from (5.9) that
So, taking note of (5.10) and (5.11), we obtain
Hence \(\{(x_k (\cdot ),u_k (\cdot ) )\}_{k=1,...,j+1}\) also satisfy our claim. Now, let us define the trajectory-control pair \((x (\cdot ),u (\cdot ) )\) by \((x (t),u (t) ):=(x_k (t),u_k (t) )\) if \(t\in [t_0+k-1,t_0+k]\). Then \((x (\cdot ),u (\cdot ) )\) is a feasible trajectory-control pair for the problem (\(\mathcal {P}\)) at \((t_0,x_0)\). Since, by (ii), \( V^\star (t,x (t) ) \rightarrow 0\) when \(t \rightarrow +\infty \), from (5.7) we have
Hence, we deduce that \((t_0,x_0)\) lies in the domain of the value function V and so \( V^\star (t_0,x_0)\geqslant V(t_0,x_0)-\varepsilon \). From the arbitrariness of \(\varepsilon \), the conclusion follows. \(\square \)
Corollary 5.1
Consider any \(N>0\) with
Then, for any \(\lambda >0\) sufficiently large, for any \(t\geqslant 0\) and any \(x\in \Omega (t)\), the function \(V(\cdot ,x)\) is Lipschitz continuous on \([t,+\infty [\) with constant \(\left( L(t)+2e^{-\lambda t}\right) N\) and \(L(t):=b e^{-(\lambda -K)t}\).
Proof
From Theorem 5.1, when \(\lambda >0\) is large enough, \(V(t,\cdot )\) is \(L(t)\)-Lipschitz continuous on \(\Omega (t)\). Fix \(t\geqslant 0\) and \(x\in \Omega (t)\). Let \(s,{\tilde{s}}\in [t,+\infty [\).
Suppose that \(s\geqslant {\tilde{s}}\). Then, by the dynamic programming principle, there exists a feasible trajectory-control pair \(({\bar{x}} (\cdot ),{\bar{u}} (\cdot ) )\) at \(({\tilde{s}},x)\) such that
Arguing in a similar way, we get (5.12) when \(s<{\tilde{s}}\). Hence, by the symmetry with respect to s and \(\tilde{s}\) in (5.12), the conclusion follows. \(\square \)
Remark 5.2
The relaxation result in Theorem 5.1 is crucial when convexity assumptions on the data are absent. By passing to the relaxed problem, we ensure that both convergence and Lipschitz regularity remain guaranteed. This approach aligns naturally with the needs of machine learning algorithms, for which Lipschitz regularity of the value function significantly improves convergence rates (cf. [7, 8]). As mentioned in the Introduction, it is well known that incorporating state constraints in the learning process can introduce instability or lead to error divergence in function approximation techniques. Hence, to bolster the overall robustness and reliability of the methods, inward pointing conditions such as (3.1) play a critical role. We refer the reader to [16, 28, 29, 35] for a more comprehensive account of the roles of inward pointing conditions and Lipschitz continuity of value functions in convergence guarantees for reinforcement learning with uncertainties and in path planning algorithms for autonomous vehicles.
6 Conclusions
This paper presents a method for recovering the feasibility and Lipschitz regularity of the value function for discounted infinite horizon control problems with time-dependent state constraints. These results are essential for addressing optimal synthesis and weak solutions to the Hamilton–Jacobi–Bellman equation. We establish sufficient conditions on the constraint set to ensure feasibility and obtain estimates on the neighboring set of feasible trajectories, based on recent viability results. An important contribution of this paper is the demonstration of the equivalence between the original and relaxed infinite horizon problems. Additionally, we prove that the value function tends to zero at infinity along every feasible trajectory when the discount factor is sufficiently large.
Notes
1. In the sense of set-valued maps, see e.g. [1], Section 1.4.
2. Here \(W^{1,1}(a,b)\) stands for the space of all absolutely continuous functions on [a, b] endowed with the norm \(\left\| g \right\| =|g(a)|+\int _a^b |g'(s)|\,ds\).
3. We recall that for a function \(q\in L^1_{\textrm{loc}}([t_0,+\infty [;{\mathbb {R}})\) the integral
$$\begin{aligned} \int _{t_0}^\infty q (t) \,dt:=\lim _{T \rightarrow \infty }\int _{t_0}^T q (t) \,dt, \end{aligned}$$
provided this limit exists.
References
1. Aubin J-P, Frankowska H (2009) Set-valued analysis. Modern Birkhäuser Classics. Birkhäuser Boston Inc, Boston, MA
2. Baird L (1995) Residual algorithms: reinforcement learning with function approximation. In: Machine learning proceedings. Elsevier, pp 30–37
3. Basco V (2022) Weak epigraphical solutions to Hamilton–Jacobi–Bellman equations on infinite horizon. J Math Anal Appl 515(2):126452
4. Basco V, Cannarsa P, Frankowska H (2018) Necessary conditions for infinite horizon optimal control problems with state constraints. Math Control Relat Fields 8(3–4):535–555
5. Basco V, Frankowska H (2019) Hamilton–Jacobi–Bellman equations with time-measurable data and infinite horizon. Nonlinear Differ Equ Appl 26(1):7
6. Basco V, Frankowska H (2019) Lipschitz continuity of the value function for the infinite horizon optimal control problem under state constraints. In: Alabau-Boussouira F, et al (eds) Trends in control theory and partial differential equations, vol 32 of Springer INdAM Series. Springer International Publishing, pp 15–52
7. Bertsekas D (2022) Dynamic programming and optimal control, volume 1. Athena Scientific
8. Bertsekas D (2019) Reinforcement learning and optimal control. Athena Scientific
9. Blackwell D (1965) Discounted dynamic programming. Ann Math Stat 36(1):226–235
10. Boyan J, Moore A (1994) Generalization in reinforcement learning: safely approximating the value function. Adv Neural Inf Process Syst 7
11. Calafiore GC, Fagiano L (2012) Robust model predictive control via scenario optimization. IEEE Trans Autom Control 58(1):219–224
12. Cannon M, Kouvaritakis B, Raković SV, Cheng Q (2010) Stochastic tubes in model predictive control with probabilistic constraints. IEEE Trans Autom Control 56(1):194–200
13. Crandall MG, Evans LC, Lions P-L (1984) Some properties of viscosity solutions of Hamilton–Jacobi equations. Trans Amer Math Soc 282(2):487–502
14. Crandall MG, Lions P-L (1983) Viscosity solutions of Hamilton–Jacobi equations. Trans Amer Math Soc 277(1):1–42
15. De Jager B, Van Keulen T, Kessels J (2013) Optimal control of hybrid vehicles. Springer
16. De Pinho MR, Foroozandeh Z, Matos A (2016) Optimal control problems for path planing of AUV using simplified models. In: 2016 IEEE 55th conference on decision and control (CDC), pp 210–215. IEEE
17. Farina M, Giulioni L, Magni L, Scattolini R (2013) A probabilistic approach to model predictive control. In: 52nd IEEE conference on decision and control. IEEE, pp 7734–7739
18. Feichtinger G, Kovacevic RM, Tragler G (2018) Control systems and mathematical methods in economics, volume 687. Lecture Notes in Economics and Mathematical Systems. Springer
19. Frankel A (2016) Discounted quotas. J Econ Theory 166:396–444
20. Gordon GJ (1995) Stable function approximation in dynamic programming. In: Machine learning proceedings. Elsevier, pp 261–268
21. Hashimoto T (2013) Probabilistic constrained model predictive control for linear discrete-time systems with additive stochastic disturbances. In: 52nd IEEE conference on decision and control. IEEE, pp 6434–6439
22. Hewing L, Zeilinger MN (2018) Stochastic model predictive control for linear systems using probabilistic reachable sets. In: 2018 IEEE conference on decision and control (CDC), pp 5182–5188. IEEE
23. Kamgarpour M, Summers T (2017) On infinite dimensional linear programming approach to stochastic control. IFAC-PapersOnLine 50(1):6148–6153
24. Kouvaritakis B, Cannon M, Couchman P (2006) MPC as a tool for sustainable development integrated policy assessment. IEEE Trans Autom Control 51(1):145–149
25. Kouvaritakis B, Cannon M, Raković SV, Cheng Q (2010) Explicit use of probabilistic distributions in linear predictive control. Automatica 46(10):1719–1724
26. Margellos K, Goulart P, Lygeros J (2014) On the road between robust optimization and the scenario approach for chance constrained optimization problems. IEEE Trans Autom Control 59(8):2258–2263
27. Menon PKA, Briggs MM (1990) Near-optimal midcourse guidance for air-to-air missiles. J Guid Control Dyn 13(4):596–602
28. Munos R (1998) A general convergence method for reinforcement learning in the continuous case. In: European conference on machine learning. Springer, pp 394–405
29. Munos R (2000) A study of reinforcement learning in the continuous case by the means of viscosity solutions. Mach Learn 40:265–299
30. Nystrup P, Boyd S, Lindström E, Madsen H (2019) Multi-period portfolio selection with drawdown control. Ann Oper Res 282(1):245–271
31. Postoyan R, Buşoniu L, Nešić D, Daafouz J (2016) Stability analysis of discrete-time infinite-horizon optimal control with discounted cost. IEEE Trans Autom Control 62(6):2736–2749
32. Schildbach G, Goulart P, Morari M (2015) Linear controller design for chance constrained systems. Automatica 51:278–284
33. Van Parys BPG, Goulart PJ, Morari M (2013) Infinite horizon performance bounds for uncertain constrained systems. IEEE Trans Autom Control 58(11):2803–2817
34. Vinter RB (2000) Optimal Control. Birkhäuser, Boston, MA
35. Weston J, Tolić D, Palunko I (2022) Mixed use of Pontryagin’s principle and the Hamilton–Jacobi–Bellman equation in infinite-and finite-horizon constrained optimal control. In: International conference on intelligent autonomous systems. Springer, pp 167–185
Ethics declarations
Conflict of interest
The author declares no conflicts of interest in this paper.
About this article
Cite this article
Basco, V. Control problems on infinite horizon subject to time-dependent pure state constraints. Math. Control Signals Syst. 36, 423–450 (2024). https://doi.org/10.1007/s00498-023-00372-3
DOI: https://doi.org/10.1007/s00498-023-00372-3