Abstract
In a Hilbert framework, we introduce a new class of second-order dynamical systems that combine viscous and geometric damping but also a time rescaling process for nonsmooth convex minimization. A main feature of these systems is to produce trajectories that lie in the graph of the Fenchel subdifferential of the objective. Moreover, they do not incorporate any regularization or smoothing processes. This new class originates from some combination of a continuous Nesterov-like dynamic and the Minty representation of subdifferentials. These models are investigated through first-order reformulations that amount to dynamics involving three variables: two solution trajectories (including an auxiliary one) and another one associated with subgradients. We prove the weak convergence towards equilibria for the solution trajectories, as well as properties of fast convergence to zero for their velocities. Remarkable convergence rates (possibly of exponential-type) are also established for the function values. We additionally state notable properties of fast convergence to zero for the subgradients trajectory and for its velocity. Some numerical experiments are performed so as to illustrate the efficiency of our approach. The proposed models offer a new and well-adapted framework for discrete counterparts, especially for structured minimization problems. Inertial algorithms with a correction term are then suggested relative to this latter context.
1 Introduction
Let \({\mathcal {H}}\) be a real Hilbert space with inner product and induced norm denoted by \(\langle .,. \rangle \) and \(\Vert . \Vert \), respectively. This paper aims at proposing fast continuous Newton-like dynamics for solving the nonsmooth minimization problem
where \(f: {\mathcal {H}}\rightarrow \mathrm{I\!R}\cup \{+\infty \}\) is a proper, convex and lower semicontinuous function such that \(S:=\textrm{argmin} f \ne \emptyset \). This problem has received particular attention over the last decade through second-order dissipative dynamical models with asymptotically vanishing (isotropic linear) damping (see, e.g., [12,13,14, 23, 24, 31, 38]), possibly coupled with geometric damping [3, 7, 9, 10, 16, 19]. Note that the aforementioned dynamics can also incorporate a time rescaling process for acceleration purposes [13,14,15,16, 19]. Nevertheless, these studies (in which the two kinds of damping occur) are mostly concerned with the case when the objective f is smooth. Some related strategies were recently proposed to handle the nonsmooth case by replacing the objective with an appropriate smooth regularization (see, e.g., [7, 19]).
It is our purpose here to propose and investigate a different but simpler approach to this issue. Our methodology is inspired by the recent models in [30] (for computing zeroes of a maximally monotone operator), whose discrete counterparts gave rise to very efficient forward-backward algorithms with a correction term (see [26, 29]). Specifically, based on the work [30], we discuss fast continuous models that generate dynamics \(\{x( \cdot ),\xi ( \cdot )\}\) lying in the graph of \(\partial f\) (the Fenchel subdifferential of f). It is worth noticing that similar dynamics can be deduced from the systems studied in [30] (in the special case of the potential operator \(\partial f\)) with nice features such as \(\Vert \dot{x}(t)\Vert =o(t^{-1})\) and \(\Vert \xi (t)\Vert =o(t^{-1})\) (as \(t \rightarrow + \infty \)), among others. However, in the absence of a time rescaling process, the typical convergence rate \(f(x(t))- \min f= o(t^{-2})\) (as \(t \rightarrow + \infty \)) is not obtained, which is somewhat restrictive for numerical purposes regarding structured minimization. This drawback can be overcome with our new models, which are slight modifications of those derived from [30] (in the special case of potential operators). This approach additionally leads us to noteworthy convergence rates for the trajectories. As discrete counterparts of our models, in view of solving structured minimization problems, we also suggest new forward-backward algorithms with a correction term (besides the momentum term).
\({{\textbf{Notations}}}\) In what follows, for any given function \(u: [0, \infty ) \rightarrow {\mathcal {H}}\), we will sometimes write \(\big ( u(\cdot ) \big )^{(1)}\) and \(\big ( u(\cdot ) \big )^{(2)}\) for the first and second time derivatives of u, respectively.
Furthermore, given two time-dependent functions \(a: \mathrm{I\!R}\rightarrow \mathrm{I\!R}\) and \(b: \mathrm{I\!R}\rightarrow \mathrm{I\!R}\) we recall the notation \(a(t) \sim b(t)\) as \(t \rightarrow +\infty \), which means that there exists a real mapping \(h: \mathrm{I\!R}\rightarrow \mathrm{I\!R}\) for which \(a(\cdot ) = h(\cdot ) b(\cdot )\) and \(\lim _{t \rightarrow + \infty } h(t) = 1\). In particular, if \(b(\cdot )=b^*\) is a nonzero constant, \(a(t) \sim b^*\) as \(t \rightarrow +\infty \) is equivalent to \(\lim _{t \rightarrow + \infty } a(t) = b^*\).
1.1 A Second-Order Dynamical System
As a particular case of the second-order model initiated in [30], we intend here to exploit the dynamics \((x,\xi ): [0, \infty ) \rightarrow {\mathcal {H}}^2\) generated by
where \(\{ \alpha ( \cdot ), \beta ( \cdot ), b( \cdot ), \sigma ( \cdot ) \}\) are positive functions from \(\mathrm{I\!R}\) to \(\mathrm{I\!R}\). Recall that this system was inspired by the Minty representation of maximally monotone operators and the approach due to Attouch–Chbani–Fadili–Riahi [16]. The term \({\big (x( \cdot ) + \sigma ( \cdot ) \xi ( \cdot )\big )^{(2)}}(t)\) acts as a singular perturbation of the possibly degenerate classical continuous Newton dynamical system (see, e.g., [3]) in which a time scaling parameter \(\sigma ( \cdot )\) is incorporated. In addition to the time scaling parameter \(\sigma ( \cdot )\), system (1.2) embeds a geometric damping (through the term \({\dot{\xi }} ( \cdot )\)), but also an isotropic damping coefficient \(\alpha ( \cdot )\) that may vanish asymptotically.
1.2 An Equivalent First-Order System
A main step in our methodology is to rewrite (1.2) as an equivalent first-order dynamical system, by means of a phase-space lifting method. This was done in [30] only for sufficiently regular pairs \(\{x( \cdot ), \xi ( \cdot )\}\) verifying (1.2) together with parameters \(\{ \alpha ( \cdot ), \beta ( \cdot ), b( \cdot )\}\) of the form
where \(\kappa \) is some positive constant, while \(\{\theta ( \cdot ), \omega ( \cdot )\} \) are positive mappings of class \(C^1\). Let us stress that we will prove that (1.2)–(1.3) can be alternatively formulated in some sense (even for a nonregular pair \(\{x( \cdot ), \xi ( \cdot )\}\)) as the first-order dynamical system (see Proposition 2.1):
Observe that the simplicity of the latter model makes it particularly interesting with regards to numerical developments. In the sequel of this work, we consider the above system with a particular choice of parameters \(\{\theta ( \cdot ), \sigma ( \cdot ), \omega ( \cdot )\}\) taken such that
where \(\delta \) is a nonnegative constant, \(\{e_*, \sigma _0\}\) are positive constants and \(\{\nu ( \cdot ), \vartheta ( \cdot )\}\) are positive mappings of class \(C^1\) that play crucial roles.
Note that the mapping \(\vartheta ( \cdot )\) will be assumed to be nonincreasing and such that \(\vartheta (t) \sim \vartheta _{\infty }\) (as \(t \rightarrow \infty \)) for some positive value \(\vartheta _{\infty }\).
Remark 1.1
The coefficients used here are different from those used in [30] relative to the computation of zeroes of an arbitrary maximally monotone operator. We also stress that \(\vartheta ( \cdot )\) could be chosen as a constant, but such a choice would be restrictive with regard to numerical purposes (see Sect. 5.2).
1.3 Connection with the State-of-the-Art
Many of the inertial approaches to minimizing a smooth convex function f fall within the following model
where \({\bar{\alpha }}( \cdot )\) (viscous damping coefficient) and \({\bar{b}}( \cdot )\) (time scale parameter) are positive mappings, while \({\bar{\beta }}( \cdot )\) is nonnegative. The parameter \({\bar{b}}( \cdot )\) plays a key role in the acceleration of the asymptotic convergence properties of the trajectories \(x( \cdot )\) whenever \({\bar{b}}(t) \rightarrow +\infty \) (as \(t \rightarrow \infty \)). Nonetheless, it is worth underlining that the use of a bounded scale parameter \({\bar{b}}( \cdot )\) remains, to date, of great importance for numerical purposes in structured minimization problems (by means of proximal-like algorithms). In addition, this model originates from two important classes of second-order systems (depending on the presence or absence of the geometric damping) that follow the seminal works on inertial dynamics initiated by Polyak [34], Su–Boyd–Candès [38] and Attouch–Peypouquet–Redont [11].
1.3.1 A First Class with Only Viscous Damping
The first class (which only involves a viscous damping) enters (1.6) with \({\bar{\beta }} \equiv 0\) and writes
where f is of class \(C^1\) and \(\{{\bar{\alpha }} ( \cdot ), {\bar{b}}( \cdot )\}\) are positive mappings. The special case of (1.7) when \({\bar{\alpha }}(t) \equiv {\bar{\alpha }} >0\) and \({\bar{b}}(t) \equiv 1\) corresponds to the (classical) heavy ball with friction method (initiated by Polyak [34]). The special case of (1.7) when \({\bar{\alpha }} (t)= \alpha _*t^{-1}\) (for \(\alpha _* \ge 3\)) was discussed by Attouch–Chbani–Riahi [13, 14] as the time-rescaled (AVD) (whose terminology stands for Asymptotic Vanishing Damping) given by
The special case of (1.8) when \({\bar{b}} (t) \equiv 1\) is nothing but the classical system (AVD), which is the dynamic version of the popular Nesterov method introduced by Su–Boyd–Candès [38] (see also Apidopoulos–Aujol–Dossal [4], Attouch–Chbani–Peypouquet–Redont [12]). Let us underline that the asymptotic convergence rate of the function values, \({{\mathcal {O}}} (t^{-1})\) as \(t \rightarrow + \infty \) for the heavy ball with friction method, was improved to \(o(t^{-2})\) for the classical (AVD). Moreover, the trajectories of (1.8) were shown to verify (see [14, Corollary 5]), under the growth condition \(\limsup _{t \rightarrow \infty }\frac{t}{{\bar{b}}(t)}\frac{d}{dt}{\bar{b}}(t) < \alpha _*-3 \), the fast convergence rates (for some \(t_0 \ge 0\)): \(f(x(t))-\min f= o \big (\frac{1}{\int _{t_0}^t s{\bar{b}}(s)ds}\big )\) and \(\Vert \dot{x}(t)\Vert ^2= o \big (\frac{{\bar{b}} (t)}{\int _{t_0}^t s{\bar{b}}(s)ds}\big )\) as \(t \rightarrow + \infty \). So, for the polynomial time scaling function \({\bar{b}}(t)=t^p\) together with \(\alpha _* > p+3\), these rates read: \(f(x(t))-\min f= o \big (t^{-(p+2)}\big )\) and \(\Vert \dot{x}(t)\Vert = o \big (t^{-1}\big )\) (as \(t \rightarrow + \infty \)).
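These rates can be observed numerically. The following minimal sketch (ours, not from [13, 14]) integrates the time-rescaled (AVD) dynamic \(\ddot{x}(t) + \frac{\alpha _*}{t}\dot{x}(t) + {\bar{b}}(t)\nabla f(x(t)) = 0\) by a standard RK4 scheme, for the illustrative choices \(f(x)=x^2/2\), \({\bar{b}}(t)=t^p\) with \(p=1\), and \(\alpha _*=5 > p+3\); all names and parameter values are assumptions made for the example.

```python
def simulate_avd(p=1.0, alpha_star=5.0, t0=1.0, T=50.0, dt=1e-3):
    """RK4 integration of  x'' + (alpha_*/t) x' + t^p f'(x) = 0
    for the toy objective f(x) = x^2 / 2 (so f'(x) = x and min f = 0)."""
    def rhs(t, x, v):
        # first-order reformulation: (x, v)' = (v, -(alpha_*/t) v - t^p x)
        return v, -(alpha_star / t) * v - (t ** p) * x

    t, x, v = t0, 1.0, 0.0
    for _ in range(int(round((T - t0) / dt))):
        k1x, k1v = rhs(t, x, v)
        k2x, k2v = rhs(t + dt/2, x + dt/2*k1x, v + dt/2*k1v)
        k3x, k3v = rhs(t + dt/2, x + dt/2*k2x, v + dt/2*k2v)
        k4x, k4v = rhs(t + dt, x + dt*k3x, v + dt*k3v)
        x += dt/6 * (k1x + 2*k2x + 2*k3x + k4x)
        v += dt/6 * (k1v + 2*k2v + 2*k3v + k4v)
        t += dt
    return x, v

x_end, v_end = simulate_avd()
```

For this run one can check that \(f(x(T)) - \min f\) falls well below \(T^{-(p+2)}\), consistent with the \(o(t^{-(p+2)})\) prediction.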
Continuous approaches to nonsmooth convex minimization based upon model (1.8) were furthermore addressed by means of either Moreau-Yosida regularizations (see Attouch-Cabot [5]) or smoothing techniques (see Qu-Bian [35]). A nonsmooth setting based on the more general model (1.7) was also investigated by Luo [28] by means of the concept of energy-conserving solution, leading to additional substantial convergence results such as a rate of \({{\mathcal {O}}} (e^{-t})\) as \(t \rightarrow + \infty \) for the function values.
1.3.2 A Second Class with Both Viscous and Geometric Damping
The second class (linked with Newton’s method by combining both viscous and geometric damping) writes as system (1.6) in which f is of class \(C^2\) and \(\{{\bar{\alpha }} ( \cdot ), \bar{\beta }( \cdot ), {\bar{b}}( \cdot )\}\) are positive mappings. The special case of (1.6) when \({\bar{\alpha }} (t)= \alpha _*t^{-1}\) (for some constant \(\alpha _* \ge 1\)) was introduced by Attouch–Chbani–Fadili–Riahi [16, 17] (see also, Attouch–Peypouquet–Redont [11] and Shi–Du–Jordan–Su [36]) so as to neutralize the oscillations observed for system (1.8). This modification of (1.8) gave rise to the time re-scaled (DIN-AVD) that equivalently writes
It has been shown in [16] (under appropriate conditions on the parameters) that the convergence properties of (AVD) regarding the function values are preserved, in addition to the strong convergence to zero of \(\nabla f (x( \cdot ))\) and other estimates on this last term. The authors also established, among other results, the asymptotic properties below (see [17, Section 2.4]):
- If \(\alpha _*>3\), \({\bar{\beta }} (t) \equiv \beta \) (for some constant \(\beta >0\)) and \({\bar{b}} (t) \equiv 1\), then the trajectories of (1.9) satisfy \(f(x(t))-\min f=o \left( \frac{1}{t^2} \right) \) as \(t \rightarrow + \infty \), together with \(\int _{t_0} ^{+\infty } t^2 \Vert \nabla f (x(t)) \Vert ^2dt < \infty \) (property of fast decaying gradient) and \(\int _{t_0} ^{+\infty } t \Vert \dot{x}(t) \Vert ^2dt <\infty \).
- If \({\bar{\beta }} (t) = t^{\beta } \) and \({\bar{b}} (t) = ct^{\beta -1} \) (for some positive constants \(\beta \) and c), along with \(\beta <c-1\) and \(\beta \le \alpha _* -2\), then the trajectories of (1.9) satisfy the rate \(f(x(t))-\min f=o \left( \frac{1}{t^{\beta +1}} \right) \) as \(t \rightarrow + \infty \).
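The stabilizing effect of the geometric (Hessian-driven) damping in (1.9) can also be illustrated numerically. The sketch below (our own, not from [16, 17]) integrates \(\ddot{x} + \frac{\alpha _*}{t}\dot{x} + \beta \nabla ^2 f(x)\dot{x} + b \nabla f(x) = 0\) for an ill-conditioned quadratic \(f(x)=\frac{1}{2}x^\top H x\), for which \(\nabla ^2 f \equiv H\); the matrix \(H\) and all parameter values are illustrative assumptions.

```python
import numpy as np

H = np.diag([1.0, 100.0])   # ill-conditioned Hessian of f(x) = 0.5 x^T H x

def simulate_din_avd(alpha_star=4.0, beta=1.0, b=1.0, t0=1.0, T=20.0, dt=1e-3):
    """RK4 integration of  x'' + (alpha_*/t) x' + beta H x' + b H x = 0,
    i.e. the DIN-AVD dynamic for the quadratic f(x) = 0.5 x^T H x."""
    def rhs(t, x, v):
        return v, -(alpha_star / t) * v - beta * (H @ v) - b * (H @ x)

    t, x, v = t0, np.array([1.0, 1.0]), np.zeros(2)
    for _ in range(int(round((T - t0) / dt))):
        k1x, k1v = rhs(t, x, v)
        k2x, k2v = rhs(t + dt/2, x + dt/2*k1x, v + dt/2*k1v)
        k3x, k3v = rhs(t + dt/2, x + dt/2*k2x, v + dt/2*k2v)
        k4x, k4v = rhs(t + dt, x + dt*k3x, v + dt*k3v)
        x = x + dt/6 * (k1x + 2*k2x + 2*k3x + k4x)
        v = v + dt/6 * (k1v + 2*k2v + 2*k3v + k4v)
        t += dt
    return x

x_end = simulate_din_avd()
```

Running the same integrator with `beta=0.0` (plain (AVD)) exhibits the oscillations along the stiff eigendirection that the term \(\beta \nabla ^2 f(x)\dot{x}\) is designed to neutralize.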
Later on, the nonsmooth setting based upon (1.9) was addressed by Attouch–László [7] and Boţ–Karapetyants [19]. Their approaches consist in replacing f in (1.9) with its Moreau envelope \(f_{\lambda ( \cdot )}\) with time-dependent parameter \(\lambda ( \cdot )\). Such a strategy was first considered in [7] for constants \(\{ {\bar{\beta }}, {\bar{b}}\}\subset (0,\infty )\), and then extended in [19] to the case of positive mappings \(\{ \bar{\beta }( \cdot ), {\bar{b}}( \cdot )\}\) through the model
It was stated (see [19, Theorems 2, 4 and 5]), under appropriate assumptions including \(\alpha _* >1\) and \(\limsup _{t \rightarrow \infty } \frac{t}{{\bar{b}}(t)}\frac{d}{dt}{{\bar{b}}(t)} <\infty \), that \(x( \cdot )\) converges weakly to a minimizer of f together with the following convergence rates as \(t \rightarrow + \infty \) (in which \(\textrm{prox}_{\lambda ( \cdot ) f}\) denotes the proximal operator of f):
- \({ f_{\lambda (t)} \big (x(t)\big ) - \min f = o\left( \frac{1}{t^2 \bar{b}(t)} \right) , \quad f \big (\textrm{prox}_{\lambda (t) f} \big (x(t)\big )\big ) - \min f = o\left( \frac{1}{t^2 {\bar{b}}(t)} \right) ,}\)
- \({\Vert \nabla f_{\lambda (t)} \big (x(t)\big ) \Vert = o\left( \frac{1}{t \sqrt{{\bar{b}}(t) \lambda (t)} } \right) , \quad \Vert \dot{x}(t) \Vert = o \left( \frac{1}{t} \right) }\),
- \({ \int _{t_0}^{\infty } t\Vert \dot{x}(t) \Vert ^2 dt<\infty , \quad \int _{t_0}^{\infty } t {\bar{b}}(t) \big (f_{\lambda (t)} (x(t)) - \min f\big ) dt <\infty }.\)
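For intuition on the quantities appearing in these rates, recall the standard closed forms in the model case \(f=|\cdot |\) on \(\mathrm{I\!R}\): the proximal operator \(\textrm{prox}_{\lambda f}\) is soft-thresholding, the Moreau envelope \(f_{\lambda }\) is the Huber function, and \(\nabla f_{\lambda }(x) = \big (x - \textrm{prox}_{\lambda f}(x)\big )/\lambda \). The following sketch (ours, for illustration only) encodes these classical formulas:

```python
def prox_abs(x, lam):
    """Proximal operator of f = |.| with parameter lam: soft-thresholding."""
    return max(abs(x) - lam, 0.0) * (1.0 if x >= 0 else -1.0)

def moreau_env_abs(x, lam):
    """Moreau envelope f_lam of f = |.|, via its defining decomposition
    f_lam(x) = f(prox(x)) + |x - prox(x)|^2 / (2 lam); this is the Huber function."""
    p = prox_abs(x, lam)
    return abs(p) + (x - p) ** 2 / (2.0 * lam)

def grad_moreau_abs(x, lam):
    """Yosida gradient of f = |.|:  (x - prox_{lam f}(x)) / lam."""
    return (x - prox_abs(x, lam)) / lam
```

For instance, with \(\lambda =1\) one gets \(f_{\lambda }(x)=x^2/2\) for \(|x|\le 1\) and \(|x|-1/2\) otherwise.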
More recently, Boţ, Csetnek and László proposed in [20] a slightly modified version of system (1.10) which incorporates a Tikhonov regularization term, in order to get strong convergence for the trajectories.
Unlike the methodology proposed in [7, 18,19,20], our new model (1.2) (which is also linked with Newton's method) incorporates the same types of damping terms (even in the nonsmooth setting) without resorting to any regularization of the objective.
1.4 Overview of the Main Results
We prove existence and uniqueness of a strong (global) solution \((x,\xi , y)\) to (1.4)–(1.5) (see Propositions 2.2 and 2.3), for which \((x,\xi )\) equivalently solves (1.2)–(1.3)–(1.5). Next, focusing on (1.4)–(1.5), we bring out some important asymptotic features of its trajectories with respect to \(\nu ( \cdot )\) (see Theorems 4.1, 4.2, 4.3 and 4.4). Theorems 4.1 and 4.2 are concerned with the general setting of \(\nu ( \cdot )\). They establish the weak convergence of \(x( \cdot )\) and \(y( \cdot )\) towards the same equilibria, but also the strong convergence to zero of both \(\xi ( \cdot )\) and \({\dot{\xi }}( \cdot )\), with fast decay properties. Theorems 4.3 and 4.4 deal with the particular case \(\nu (t) = \nu _0^{1-\gamma } (t + \nu _0)^{\gamma }\) for \(t \ge 0\) (with \(\nu _0 >0\) and \(\gamma \in [0,1]\)), for which the parameters in (1.3) satisfy, as \(t \rightarrow \infty \) (see Proposition 4.3): \(\alpha (t) \sim \alpha _* t^{-\gamma }\), \(\beta (t) \sim \beta _*\) and \(b(t) \sim b_*\) (for some \(\{\alpha _*, \beta _*, b_*\} \subset (0,\infty )\)). In particular, remarkable convergence rates are established relative to \(\gamma \) and the involved parameters \(\{ \delta , \kappa , \sigma _0, \nu _0, \vartheta _{\infty }, e_*\}\), as described below:
- Case \(\gamma =1\) \((\nu (t) = t + \nu _0)\). For any \(\delta \ge 0\) and \(\{\kappa , \sigma _0, \nu _0\} \subset (0,\infty )\), together with \(e_*>\kappa ^{-1}(\delta +2)\) and \(\vartheta _{\infty } \ge 1\), we obtain for some \(t_0 \ge 0\), and as \(t \rightarrow + \infty \):
- Case \({\gamma \in [0,1)}\). For any \(\delta \ge 0\) and \(\{ \kappa , \sigma _0, \nu _0\} \subset (0,\infty )\), together with \(e_*=\lambda \kappa ^{-1} \delta \) (for some \(\lambda >1\)) and \( \vartheta _{\infty } \ge 1\), we get for \(c:= \big ( \frac{\nu _0^{\gamma }}{(1-\gamma )(\nu _0+\lambda )} \big ) \frac{\delta \kappa }{\max \{\delta ,\kappa \}}\) and for some \(t_0 \ge 0\) and as \(t \rightarrow + \infty \):
Remark 1.2
As a consequence of the latter case, we deduce (see Corollary 4.1) that, for any \(\{ \delta , \sigma _0, \nu _0\} \subset (0,\infty )\) together with \(\kappa =\delta \), \(e_*>1\) and \( \vartheta _{\infty } \ge 1\), the rates in (1.12) still hold with \(c:= \big ( \frac{\nu _0^{\gamma }}{(1-\gamma )(\nu _0+e_*)} \big ) \delta \).
Note that when \(\gamma =1\), our results in terms of convergence rates are as good as those of [19] (regarding model (1.10)), despite the simplicity of model (1.4)–(1.5). Moreover, we observe that, in the absence of a time rescaling process (namely \(\delta =0\)), better theoretical convergence rates are obtained for the case \(\gamma =1\), while our model can be easily adapted to solve structured minimization problems. Concerning the case \(\gamma \in [0,1)\) in the presence of a time rescaling process (namely \(\delta >0\)), we get better convergence rates than for \(\gamma =1\), except for the two rates on \(\Vert \dot{x}( \cdot )\Vert \). In particular, through a certain trade-off in the specific case \(\gamma =0\), together with the choices \(e_*>1\) and \(\kappa =\delta \), we can reach the following exponential-like rates as \(t \rightarrow + \infty \) (for the subgradient and the function values):
where the parameter \(\delta \) can be arbitrarily chosen.
The proofs of our results rely on Lyapunov properties of functionals \({\mathcal {L}}_{s,q}( \cdot )\) (related to (1.4)–(1.5)) defined for \((s,q) \in (0,\infty ) \times S\) and \(t\ge 0\), by
Main assumptions Throughout this paper, we assume the condition
The other assumptions required on the parameters are detailed below. Given a positive constant \(\kappa \) and positive mappings \(\{\nu ( \cdot ), \vartheta ( \cdot )\}\) we set \(\rho ( \cdot ):= \kappa -\frac{{\dot{\nu }}( \cdot )}{\nu ( \cdot )} \) while assuming the following conditions:
We also consider the following additional condition on \(\vartheta ( \cdot )\):
1.5 Organization of the Paper
An outline of this paper is as follows: In §2, we show some equivalence between systems (1.2) and (1.4) as well as the well-posedness of (1.4). In §3, we exhibit some Lyapunov functional associated with (1.4). §4 is devoted to the convergence analysis of (1.4)–(1.5). §5 is concerned with numerical experiments and suggestions for discrete models. The last section is an Appendix that contains several proofs.
2 Equivalence and Well-Posedness of the Considered Models
In this section, we establish some equivalence between absolutely continuous solutions to (1.2) and (1.4). On the basis of this approach, we introduce a notion of strong solution relative to each of these systems, for which we also state existence and uniqueness results.
2.1 Reminders on the Notion of Absolute Continuity
Let us recall some notions concerning vector-valued functions of a real variable (see, e.g., [2]).
Definition 2.1
Given \({\bar{c}} \in [0, \infty )\), a function \(z: [0,{\bar{c}}] \rightarrow {\mathcal {H}}\) is said to be absolutely continuous if one of the following equivalent properties (i1)-(i3) holds:
(i1) There exists an integrable function \(g: [0,{\bar{c}}] \rightarrow {\mathcal {H}}\) such that
\(z(t) = z(0)+\int _0^t g(s)ds\), \(\forall t \in [0,{\bar{c}}]\);
(i2) z is continuous, and its distributional derivative is Lebesgue integrable on \([0,{\bar{c}}]\);
(i3) \(\forall \epsilon >0\), \(\exists \eta >0\) such that, for finitely many intervals \(I_k=(a_k,b_k) \subset [0,{\bar{c}}]\),
(\(I_k \cap I_j = \emptyset \) (for \(k \ne j\)) and \(\sum _k |b_k-a_k| \le \eta \)) \(\Rightarrow \) \(\sum _k \Vert z(b_k)-z(a_k)\Vert \le \epsilon \).
For simplicity, we say that a function \(z:[0,\infty ) \rightarrow {\mathcal {H}}\) is absolutely continuous whenever it is so on every bounded interval, and we denote by \({\mathcal {A}}_c\) the set of such mappings, that is
Remark 2.1
Recall that z belongs to \({\mathcal {A}}_c\) whenever it is Lipschitz continuous on every bounded interval. It is also well-known that any element of \({\mathcal {A}}_c\) is differentiable almost everywhere and that its derivative coincides with its distributional derivative almost everywhere.
2.2 Notions of Strong Solutions
We introduce here two notions of strong solutions (through the next definitions) regarding (1.2) and (1.4). Let us begin by defining a notion of strong solution for the second-order system (1.2).
Definition 2.2
We say that \((x,\xi ): [0,\infty ) \rightarrow {\mathcal {H}}^2\) is a strong (global) solution to (1.2), for initial data \((x_0,\xi _0,q_0) \in {\mathcal {H}}^3\) such that \(\xi _0 \in \partial f(x_0)\), if \(\{x( \cdot ), \xi ( \cdot )\}\) are two elements of \({\mathcal {A}}_c\) such that, (for \(\zeta ( \cdot ):=\sigma ( \cdot ) \xi ( \cdot )\)), \({x}( \cdot ) + \zeta ( \cdot )\) is of class \(C^1\) and \(\big ( {x}( \cdot ) + \zeta ( \cdot ) \big )^{(1)} \in {\mathcal {A}}_c\), and if:
We proceed with a notion of strong solution for system (1.4).
Definition 2.3
We say that the triplet \((x,\xi , y): [0,\infty ) \rightarrow {\mathcal {H}}^3\) is a strong (global) solution to (1.4), for some Cauchy data \((x_0,\xi _0,y_0) \in {\mathcal {H}}^3\) such that \(\xi _0 \in \partial f(x_0)\), if the functions \(\{x( \cdot ), \xi ( \cdot ), y( \cdot ) \} \subset {\mathcal {A}}_c\) and if they satisfy the following properties:
The previous two definitions will be shown further on to be equivalent in a suitable sense.
2.3 From a Second-Order to a First-Order System
Let us prove some equivalence regarding the second-order system (1.2) and the first-order system (1.4).
Proposition 2.1
Let (CF) hold, let \(\kappa >0\), let \(\{\theta ( \cdot ), \omega ( \cdot ), \sigma ( \cdot )\}\) be positive mappings of class \(C^1\) and suppose that \(\{ \alpha ( \cdot ), \beta ( \cdot ), b( \cdot ) \}\) are given by (1.3).
Then, for \((x_0,\xi _0,q_0)\in {\mathcal {H}}^3\), the statements (i1) and (i2) below are equivalent:
(i1) \((x,\xi ): [0, \infty ) \rightarrow {\mathcal {H}}^2\) is a strong (global) solution to (1.2) with initial data \((x_0,\xi _0,q_0)\);
(i2) \((x, \xi , y): [0, \infty )\rightarrow {\mathcal {H}}^3\), for some auxiliary variable \(y( \cdot )\), is an element of \({\mathcal {A}}_c\times {\mathcal {A}}_c\times {\mathcal {A}}_c\) that satisfies (when denoting \(\zeta ( \cdot ):=\sigma ( \cdot ) \xi ( \cdot )\)) the first-order system
Proof
See Appendix A.1. \(\square \)
Observe that any triplet \((x, \xi , y)\) satisfying (i2) is nothing but a strong solution to (1.4) with Cauchy data \((x_0, \xi _0, y_0)\), where \(y_0=x_0-\frac{1}{\theta (0)} \left( q_0 + \sigma (0) \omega (0) \xi _0 \right) \). It will also be proved (see Proposition 2.3) that, under some additional condition on \(\sigma ( \cdot )\), such a strong solution to (1.4) is uniquely defined.
2.4 Existence and Uniqueness of Strong Solutions
From now on, we denote by \(J_{\sigma }^{\partial f}\) and \(\big (\partial f\big )_{\sigma }\) the resolvent and the Yosida approximation of \( \partial f\) (with index \(\sigma \)). Existence and uniqueness of strong solutions to (1.4) and (1.2) are established through the next proposition under the following assumption:
Proposition 2.2
Let (CF) and (CG) hold, and let \( \kappa >0\). Then, for any Cauchy data \((x_0,\xi _0, y_0)\in {\mathcal {H}}^3\), with \(\xi _0 \in \partial f (x_0)\), there exists a unique strong solution \((x( \cdot ),\xi ( \cdot ), y( \cdot ))\) to (1.4). Moreover, we have
where \(v( \cdot )\) is obtained from the unique \(C^1\times C^1\) couple \(\big (v( \cdot ),y( \cdot )\big ) \) satisfying, for \(t \ge 0\),
Furthermore, \(\big (v( \cdot ),y( \cdot )\big ) \) is the unique strong solution to (2.6) (namely, there is no other couple of mappings belonging to \({\mathcal {A}}_c\times {\mathcal {A}}_c\) that satisfies (2.6) for almost every \(t \ge 0\)).
Proof
See Appendix A.2. \(\square \)
Proposition 2.3
Let (CF) and (CG) hold, let \( \kappa >0\), and let \(\{ \alpha ( \cdot ), \beta ( \cdot ), b( \cdot ) \}\) be given by (1.3).
Then, there exists a unique strong solution \((x( \cdot ),\xi ( \cdot ))\) to (1.2) for any initial data \((x_0,\xi _0, q_0)\in {\mathcal {H}}^3\), with \(\xi _0 \in \partial f (x_0)\). Moreover, the trajectories \(\{x( \cdot ),\xi ( \cdot )\}\) are obtained from the unique strong solution \((x( \cdot ),\xi ( \cdot ),y( \cdot ))\) to (1.4) with Cauchy data \((x_0,\xi _0,y_0)\) such that \(y_0=x_0-\frac{1}{\theta (0)}(q_0+\omega (0) \sigma (0)\xi _0)\), where \(q_0=\big ( x( \cdot ) + \sigma ( \cdot ) \xi ( \cdot )\big )^{(1)}(0)\).
Proof
Observe from Proposition 2.1 that a strong solution \((x( \cdot ), \xi ( \cdot ))\) to (1.2) equivalently solves system (1.4) (for some auxiliary variable \(y( \cdot ) \in {\mathcal {A}}_c\) with \(y_0=x_0-\frac{1}{\theta (0)}(q_0+\omega (0) \sigma (0)\xi _0)\)). It is not difficult to see from Proposition 2.2 that this latter system (1.4) admits a unique solution given by (2.5)–(2.6). Combining these previous two observations yields existence and uniqueness of a strong solution to (1.2). \(\square \)
3 Preparatory Results for a Lyapunov Analysis
In this section, we derive estimates by exhibiting an energy-like functional associated with model (1.4)–(1.5).
3.1 Exhibiting a Lyapunov Functional
Consider \(\{e_*, \kappa \} \subset (0,\infty )\) and positive mappings \(\{\sigma ( \cdot ), \nu ( \cdot ), \vartheta ( \cdot )\}\), as parameters involved in (1.5), and denote \(\rho ( \cdot ):=\kappa - \frac{{\dot{\nu }}( \cdot )}{\nu ( \cdot )}\) (as a recurrent term in our analysis). With the trajectories \(\{x( \cdot ), \xi ( \cdot ), y( \cdot ) \}\) produced by the system (1.4)–(1.5), we associate the functionals \( {\mathcal {L}}_{s,q}( \cdot )\) and \({{\mathcal {T}}}_s( \cdot )\) defined with \((s,q) \in [0, \infty ) \times S\) and \(t \ge 0\) by
The following result will serve as a basis for establishing Lyapunov properties for \({\mathcal {L}}_{s,q}( \cdot )\).
Proposition 3.1
Consider \(\{\kappa , e_*,\sigma _0\}\subset (0,\infty )\), \(\delta \ge 0\), positive mappings \(\{\nu ( \cdot ), \vartheta ( \cdot )\}\) of class \(C^1\), and let \(\{\omega ( \cdot ), \sigma ( \cdot ), \theta ( \cdot )\}\) be given by (1.5). Suppose also that \((x( \cdot ),\xi ( \cdot ),y( \cdot ))\) is a strong solution to (1.4)–(1.5), along with parameters satisfying:
Then, for \((s,q) \in [0,\infty )\times {\mathcal {H}}\) and for a.e. \(t \in [t_0, \infty )\), we have
where \(\rho ( \cdot ):=\kappa - \frac{{\dot{\nu }}( \cdot )}{\nu ( \cdot )}\), \({\mathcal {L}}_{s,q}( \cdot )\) and \(\mathcal {T}_s( \cdot )\) are given by (3.1)–(3.2), and \(\psi _1(s,t)\) is defined by
The above proposition will be proved in the next section.
3.2 Proof of Proposition 3.1
3.2.1 Preliminaries
Before proving Proposition 3.1, we recall three results that will be instrumental in our methodology. The first one is a key result established in [30].
Proposition 3.2
[30, Proposition 3.1]. Let \(\{\kappa , e_*\} \subset (0,\infty )\), let \(\{\nu ( \cdot ),\omega ( \cdot ),\sigma ( \cdot )\}\) be positive mappings of class \(C^1\) such that \(\nu ( \cdot )\) satisfies (3.3), and suppose that \(\big (x( \cdot ),\xi ( \cdot ),y( \cdot ) \big )\) is a strong solution to (1.4), along with \(\theta ( \cdot )\) given by (1.5a). Then, for any \((s,q)\in [0,\infty )\times {\mathcal {H}}\), we have, for a.e. \(t \ge 0\),
where \(\rho (t)=\kappa - \frac{\dot{\nu }(t)}{\nu (t)}\), \({\mathcal {T}}_s( \cdot )\) is given by (3.2), while \({\mathcal {E}}_{s,q}( \cdot )\) is defined by
The second result provides a chain rule of differentiation for the convex lsc (lower semicontinuous) objective, which was explicitly stated in [1, Lemma 1.9].
Lemma 3.1
[21, Lemma 3.3] Let \({\textbf {(CF)}}\) hold, let \({\bar{c}} >0\), and let \(\{x, \xi \}: [0,{\bar{c}}] \rightarrow {\mathcal {H}}\) satisfy the following conditions (i1)–(i3):
(i1) \(\xi (t) \in \partial f(x(t))\), for a.e. \(t \in [0,{\bar{c}}]\); (i2) \(\xi ( \cdot ) \in L^2([0, {\bar{c}}];{\mathcal {H}})\); (i3) \(\dot{x}( \cdot ) \in L^2([0, {\bar{c}}];{\mathcal {H}})\).
Then, \(f(x( \cdot ))\) is absolutely continuous on \([0,{\bar{c}}]\) and we have
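The conclusion of this lemma is the a.e. identity \(\frac{d}{dt} f(x(t)) = \langle \xi (t), \dot{x}(t) \rangle \), valid for any selection \(\xi (t) \in \partial f(x(t))\). As a sanity check in a genuinely nonsmooth case, consider the toy data \(f=|\cdot |\), \(x(t)=t-1\) and the a.e. selection \(\xi (t)=\textrm{sign}(x(t))\); the sketch below (ours, all names are illustrative) compares a finite-difference derivative of \(f(x( \cdot ))\) with \(\langle \xi (t), \dot{x}(t)\rangle \):

```python
def f(x):
    return abs(x)                 # nonsmooth convex objective

def x_of(t):
    return t - 1.0                # absolutely continuous (affine) trajectory

def xdot(t):
    return 1.0

def xi(t):
    # a.e. selection xi(t) in the subdifferential of |.| at x(t)
    return 1.0 if x_of(t) > 0 else -1.0

def check_chain_rule(t, h=1e-6):
    """Return (finite-difference derivative of f(x(.)) at t, <xi(t), xdot(t)>)."""
    fd = (f(x_of(t + h)) - f(x_of(t - h))) / (2 * h)
    return fd, xi(t) * xdot(t)
```

Away from the kink time \(t=1\), the two values agree, as the lemma asserts for a.e. \(t\).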
Next, a useful property of the considered dynamics is given through the following lemma in which \( \textrm{gra}(\partial f)\) denotes the graph of \(\partial f\), that is \( \textrm{gra} ( \partial f) = \{ (x,x^*) \in {\mathcal {H}}^2; x^* \in \partial f (x)\}\).
Lemma 3.2
[30, Lemma 4.1] Let \(f: {\mathcal {H}}\rightarrow \mathrm {I\!R}\cup \{ +\infty \}\) be a proper convex function. For any couple of absolutely continuous functions \((x,\xi ): [0,\infty )\rightarrow \textrm{gra}(\partial f)\), we have \(\langle \dot{\xi }(t),\dot{x}(t)\rangle \ge 0\), for a.e. \(t \ge 0\).
Proof
See Appendix A.3. \(\square \)
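This lemma reflects the monotonicity of \(\partial f\): for any two points \((x_1,\xi _1), (x_2,\xi _2) \in \textrm{gra}(\partial f)\) one has \(\langle x_2-x_1, \xi _2-\xi _1 \rangle \ge 0\), and \(\langle \dot{\xi }(t), \dot{x}(t)\rangle \ge 0\) follows by applying this to divided differences along the trajectory. A quick numerical check of the discrete inequality for \(f=|\cdot |\) (our own illustrative sketch):

```python
import random

def subgrad_abs(x):
    """One admissible selection from the subdifferential of f = |.| at x."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return random.uniform(-1.0, 1.0)   # any value of [-1, 1] is admissible at x = 0

random.seed(0)
points = [random.uniform(-2.0, 2.0) for _ in range(50)]
pairs = [(x, subgrad_abs(x)) for x in points]
# monotonicity of gra(d f): <x2 - x1, xi2 - xi1> >= 0 for every pair of graph points
products = [(x2 - x1) * (g2 - g1) for (x1, g1) in pairs for (x2, g2) in pairs]
```

Every product is nonnegative, as guaranteed by the monotonicity of the subdifferential of a convex function.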
3.2.2 Proving the Main Inequality (3.4)
For brevity, we set \(\tau ( \cdot ):=e_*+\nu ( \cdot )\). Clearly, given \((s,q)\in [0,\infty )\times S\), by applying Proposition 3.2 we obtain, for a.e. \(t \in [0,\infty )\),
where \(a_1(t)\) and \(a_2(t)\) are defined by
In order to estimate the right-hand side of (3.9), we observe that (1.5b)–(1.5c) also give us \(\frac{\dot{\sigma }(t)}{\sigma (t)} = \frac{\delta }{\tau (t)}\) and \(\omega (t)=\rho (t) \vartheta (t)-\frac{\delta }{\tau (t)}\) (recalling \(\rho ( \cdot ):=\kappa - \frac{\dot{\nu }( \cdot )}{\nu ( \cdot )}\)), whence (3.10) reduces to
So, it is readily checked from condition (3.3b) that \(a_1( \cdot )\) is nonnegative on \([t_0,\infty )\). Moreover, setting \({\bar{f}}:= f - \min f\), by the classical convexity inequality we have
A chain rule of differentiation for \({\bar{f}}(x( \cdot ))\) can also be established by verifying assumptions (i1)–(i3) of Lemma 3.1. Indeed, given \({\bar{c}} >0\), (i1) is obvious, while (i2) is satisfied because \(\xi ( \cdot )\) is continuous on \([0, {\bar{c}}]\) (since \((x( \cdot ),\xi ( \cdot ),y( \cdot )) \in {\mathcal {A}}_c^3\), as a strong solution to (1.4)). Regarding (i3), by (2.3b) we equivalently have, for a.e. \(t \in [0,\infty )\),
\(\dot{x}(t) + {\dot{\zeta }}(t)+ \theta (t) u(t) + \omega (t) \zeta (t)=0\), (where \(u( \cdot ) = y( \cdot ) -x( \cdot )\) and \(\zeta ( \cdot ) = \sigma ( \cdot ) \xi ( \cdot )\)).
This, by \(\langle \dot{x}(t), {\dot{\xi }} (t)\rangle \ge 0\) (from Lemma 3.2), readily yields
\(\Vert \dot{x}(t) \Vert ^2\le \Vert \dot{x}(t) + {\dot{\zeta }}(t)\Vert ^2 = \Vert \theta (t) u(t) + \omega (t) \zeta (t)\Vert ^2\),
which clearly ensures (i3). Then, applying Lemma 3.1 entails, for a.e. \(t \ge 0\),
Consequently, by (3.9) along with (3.12)–(3.13), while noticing that \( {{\mathcal {L}}}_{s,q}( \cdot )= {{{\mathcal {E}}}}_{s,q}( \cdot )+ a_2( \cdot ) {\bar{f}} (x( \cdot ))\), we infer that, for a.e. \(t \ge t_0\),
Regarding the second term in the right side of the above inequality, by \(\frac{{\dot{\sigma }}(t)}{\sigma (t)}=\frac{\delta }{\tau (t)}\) (from (1.5b)) and \(\dot{\tau }(t)=\dot{\nu }(t)\) (as \(\tau ( \cdot ):=\nu ( \cdot )+e_*\)), while using (3.11), we simply get (omitting the variable t to shorten the equations)
Then, combining (3.14) with the previous argument leads us to (3.4). \(\square \)
3.3 Specificities on the Parameters and the Dynamics
In view of our next computations, we make some observations regarding the parameters and the dynamics.
Remark 3.1
The following arguments (j0)–(j2) will be very useful in our study:
(j0) Observe from (1.2) and \(\rho ( \cdot ):= \kappa -\frac{{\dot{\nu }}( \cdot )}{\nu ( \cdot )}\) that \(\theta ( \cdot )\) can be rewritten as \(\theta ( \cdot )= \frac{\nu ( \cdot ) \rho ( \cdot ) }{\nu ( \cdot )+e_*}\).
(j1) From (j0) we have \(\theta (t) = \rho (t) \big ( 1+\frac{e_*}{\nu (t)} \big )^{-1}\). Then, as \(\nu ( \cdot )\) and \(\rho ( \cdot )\) are positive (from (1.15a)–(1.15b)), and \(e_*>0\), we infer that \(\theta ( \cdot )\) is positive. Moreover, observing that \(\rho ( \cdot )\) and \(( 1+\frac{e_*}{\nu ( \cdot )}) ^{-1}\) are nondecreasing entails that \(\theta ( \cdot )\) is nondecreasing. In addition, we obviously have \(\rho (t) \in (0, \kappa ]\) for \(t\ge 0\) and \(\rho (t) \rightarrow \kappa \) as \(t \rightarrow \infty \) (from (1.15b)–(1.15c)). So, by (j0) we obtain \(\theta (t) \in \big [ \theta (0), \kappa \big )\) for \(t\ge 0\). If, in addition, \(\nu (t) \rightarrow \infty \), we get \(\theta (t) \rightarrow \kappa \) (as \(t\rightarrow \infty \)).
(j2) From (1.15d) and the positivity of \(\nu ( \cdot )\) we have \(0< \nu (t) \le M t + \nu (0)\) for \(t\in [0, \infty )\). This simply yields \(\int _{0}^{+\infty } \frac{1}{\nu (t)} dt = +\infty \) (hence \(\int _{0}^{+\infty } \frac{1}{e_*+\nu (t)} dt = +\infty \)).
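The divergence claim in (j2) follows by integrating the linear upper bound on \(\nu ( \cdot )\) (written here for \(M>0\); when \(M=0\), \(\nu ( \cdot )\) stays bounded and the claim is immediate):

```latex
\int_{0}^{T} \frac{dt}{\nu(t)}
\;\ge\; \int_{0}^{T} \frac{dt}{M t + \nu(0)}
\;=\; \frac{1}{M}\,\ln\!\left( \frac{M T + \nu(0)}{\nu(0)} \right)
\;\longrightarrow\; +\infty
\quad \text{as } T \to \infty .
```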
4 Convergence Analysis and Estimations
This section is devoted to the asymptotic behavior of the strong solution to (1.4)–(1.5). As standing assumptions we suppose that \(f: {\mathcal {H}}\rightarrow \mathrm{I\!R}\cup \{+\infty \}\) is a proper convex and l.s.c. function such that \(S:=\textrm{argmin}_{{\mathcal {H}}} f \ne \emptyset \), and we denote \({\bar{f}}( \cdot ):= f( \cdot )- \min f\). Estimations are established first in the general case of parameters (under somewhat theoretical conditions) and then in interesting specific cases of parameters (under classical conditions).
4.1 Intermediate Results by a Lyapunov Analysis
For the sake of simplicity and legibility, we start by assuming that \(\{\delta , e_*, \nu ( \cdot ), \vartheta ( \cdot )\}\) (occurring in (1.5)) are such that:
Another useful condition on the parameters is needed for our methodology. Let us recall the definitions of \(a_1( \cdot )\) and \(\psi _1(.,.)\) (used in (3.11) and (3.5), respectively) and, denoting \(\rho ( \cdot ):= \kappa - \frac{{\dot{\nu }}( \cdot )}{\nu ( \cdot )}\), let us introduce a new mapping \(\psi _2(.,.)\). These mappings are given for \(s \ge 0\) and \(t \ge 0\) by:
We focus here on a full study of (1.4)–(1.5) along with parameters satisfying (CP) and the additional theoretical conditions which consist of assuming for some \(s_0 \in (0,e_*)\) and some \(t_0 \ge 0\) that:
We begin by showing Lyapunov properties for the functional \({\mathcal {L}}_{s_0,q}( \cdot )\), which allows us to derive two series of estimates (through the next Propositions 4.1 and 4.2).
Proposition 4.1
Let \(\{\delta , \kappa , e_*, \sigma _0, \nu ( \cdot ), \vartheta ( \cdot )\}\) satisfy (CP) and (1.15a)–(1.15b), let \(\{\omega ( \cdot ), \theta ( \cdot ), \sigma ( \cdot )\} \) be given by (1.5), and suppose that \((x( \cdot ),\xi ( \cdot ),y( \cdot ))\) is a strong solution to (1.4)–(1.5). Assume furthermore that \(\{a_1( \cdot ), \psi _1(.,.), \psi _2(.,.)\}\) (introduced in (4.1)) satisfy (4.2) for some \(t_0 >0\) and \(s_0 \in (0,e_*)\). Then, for any \(q\in S\), \({\mathcal {L}}_{s_0,q}( \cdot )\) is nonincreasing on \([t_0, \infty )\), convergent, and we have
together with the following integral estimates:
Proof
With a view to applying Proposition 3.1, we first check that (1.15a)–(1.15b) and (4.2) altogether guarantee the two conditions in (3.3). This is obvious regarding (3.3a). In addition, for \(t \ge t_0\), denoting \(\tau (t)=e_*+\nu (t)\) and \(\rho (t)=\kappa - \frac{{\dot{\nu }} (t)}{\nu (t)}\), by (4.2) we have \(a_1(t):=\sigma (t) \big ( \rho (t) \vartheta (t) \tau (t) -\delta -{\dot{\nu }}(t) \big ) \ge 0\), or equivalently \(\rho (t) \vartheta (t) \tau (t) \ge \delta + \dot{\nu }(t)\) (as \(\sigma ( \cdot )\) is assumed to be positive), that is (3.3b). Thus, condition (3.3) is fulfilled. Next, given \(q \in S\), we prove that \({\mathcal {L}}_{s_0,q}( \cdot )\) is nonincreasing on \([t_0, \infty )\) and convergent. Indeed, denoting \({\bar{f}}=f -\min f\) and \(u( \cdot ):=y( \cdot )-x( \cdot )\), by Proposition 3.1 with \(s=s_0 \in (0,e_*)\), we get, for a.e. \(t \ge t_0\),
where \(\psi _1(.,.)\) is given by (4.1b), while \({\mathcal {L}}_{s_0,q}( \cdot )\) and \({\mathcal {T}}_ {s_0}( \cdot )\) are given (from (3.1) and (3.2)) by
Concerning the terms appearing in the above formulations of \({\mathcal {T}}_{s_0}(t)\) and \(\mathcal {L}_{s_0,q}(t)\), we have \(\langle \xi (t), x(t)-q\rangle \ge 0 \) (as \(\partial f\) is monotone) and \(\psi _2(s_0,t) {\bar{f}} (x(t)) \ge 0\); indeed, (4.2) guarantees that \(\psi _2(e_*,t)\ge 0\), while the expression of \(\psi _2(\cdot , \cdot )\) given in (4.1c), together with \(e_*>s_0\), yields \(\psi _2(s_0,t) \ge \psi _2(e_*,t)\). Hence, \({\mathcal {T}}_{s_0}( \cdot )\) and \( {{\mathcal {L}}}_{s_0,q}( \cdot )\) are nonnegative. Moreover, regarding the last two terms in the right side of (4.6), we have \(\langle {\dot{\xi }}(t), \dot{x}(t)\rangle \ge 0\) (from Lemma 3.2) and \(\psi _1(s_0,t) \ge 0\) (from (4.2)). So, from (4.6) and the previous arguments, we classically deduce that \(\mathcal {L}_{s_0,q}( \cdot )\) is nonincreasing and bounded below on \([t_0,\infty )\), hence \(\mathcal {L}_{s_0,q}(t)\) converges as \(t \rightarrow +\infty \).
We proceed by proving the other estimates separately:
- Let us prove (4.4). As \(\mathcal {L}_{s_0,q}( \cdot )\) is nonincreasing and nonnegative on \([t_0,\infty )\), we clearly have (for \(t\ge t_0\)) \(0 \le \mathcal {L}_{s_0,q}(t) \le \mathcal {L}_{s_0,q}(t_0)\). We also recall that the four terms arising in the definition of \(\mathcal {L}_{s_0,q}( \cdot )\) (given by (4.7a)) are nonnegative. Consequently, each of these terms is bounded by \(\mathcal {L}_{s_0,q}(t_0)\). So, the boundedness of the second term yields the boundedness of \(x( \cdot )\), namely the first part of item (4.4), while the boundedness of the first term guarantees that \(\nu ( \cdot ) \Vert u( \cdot )\Vert \) is also bounded. This, in light of \(\dot{y}(t)=-\kappa u(t)\) (from (2.3c)), proves the second part of item (4.4).
- Let us prove (4.5a)–(4.5b)–(4.5c)–(4.5d). Integrating inequality (4.6) between \(t_0\) and \(t\ge t_0\), in light of the nonnegativity of \({{\mathcal {L}}}_{s_0,q}( \cdot )\), entails
Then, remembering that the terms in the above integrands are nonnegative, we classically deduce that \( \int _{t_0}^{\infty } \psi _1(s_0,t) {\bar{f}} \big (x(t)\big ) dt \le {{\mathcal {L}}}_{s_0,q}(t_0)\) (that is (4.5a)) and the following estimates:
We also recall from Remark 3.1 that conditions (1.15a)–(1.15b) imply that
So, (4.9a), (4.10) and \(\tau ( \cdot ) \ge \nu ( \cdot )\) yield \( \int _{t_0}^{\infty } \sigma (r) \frac{ \nu ^2(r)}{\kappa } \langle \dot{\xi }(r),\dot{x}(r)\rangle dr \le {{\mathcal {L}}}_{s_0,q}(t_0)\) (that is (4.5b)). Furthermore, using the formulation of \(\mathcal {T}_{s_0}( \cdot )\) (from (4.7b)) and noticing that \(\frac{1}{4} ( \frac{s_0+3e_*}{\vartheta (t)} + 4 ) \ge 1\), by (4.9b) we obtain
Combining (4.11a) and (4.10) amounts to \(\frac{(e_*-s_0)}{\kappa } \int _{t_0}^\infty \nu (r)\Vert \dot{x}(r)\Vert ^2dr \le \mathcal {L}_{s_0,q}(t_0)\) (that is (4.5c)). In addition, for \(t \ge 0\), by Remark 3.1 we have \(0<\theta (0) \le \theta (t) \le \rho (t) \le \kappa \), which implies that \(\frac{\rho (t)}{\theta ^2(t)} \ge \frac{1}{\theta (t)}\ge \frac{1}{\kappa }\). Therefore, inequality (4.11b) immediately yields
where \(g(t):=1 -\frac{e_*-s_0}{2 \tau (t) }\). It is not difficult to check (for \(t \ge 0\)) that \(0 \le g(t) \le 1\) (as \(s_0 \le e_*\)), hence, using the obvious decomposition
\(\Vert u(t)\Vert ^2 = \frac{4}{\theta ^2(t)}\Vert \frac{1}{2}(\theta (t) u(t)+g(t) \dot{x}(t))-\frac{1}{2}g(t) \dot{x}(t)\Vert ^2\),
besides the convexity of the square norm, together with \(0<\theta (0) \le \theta (t)\), we obtain
Then, by \(\int _{t_0}^\infty \nu (r)\Vert \theta (r) u(r)+ g(r) \dot{x}(r)\Vert ^2 dr< \infty \) (from (4.12) and \(\nu ( \cdot ) \ge \nu (0)>0\)) together with \( \int _{t_0}^\infty \nu (r)\Vert \dot{x}(r)\Vert ^2 dr <\infty \) (from (4.5c)), we deduce that \(\int _{t_0}^{\infty } \nu (t) \Vert u(t)\Vert ^2 dt <\infty \), that is the first inequality of item (4.5d). The second one follows in light of \(\dot{y}( \cdot )=-\kappa u( \cdot )\).
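For completeness, the convexity bound used in this step can be spelled out: with \(a:=\theta (t) u(t)+g(t) \dot{x}(t)\) and \(b:=g(t) \dot{x}(t)\), the decomposition of \(\Vert u(t)\Vert ^2\), together with \(\Vert \frac{1}{2}a-\frac{1}{2}b \Vert ^2 \le \frac{1}{2}\Vert a \Vert ^2 + \frac{1}{2}\Vert b \Vert ^2\), \(0 \le g(t) \le 1\) and \(\theta (t) \ge \theta (0)>0\), gives:

```latex
\Vert u(t)\Vert^2
\;=\; \frac{4}{\theta^2(t)} \Big\Vert \tfrac{1}{2}a - \tfrac{1}{2}b \Big\Vert^2
\;\le\; \frac{2}{\theta^2(t)} \Big( \Vert a \Vert^2 + \Vert b \Vert^2 \Big)
\;\le\; \frac{2}{\theta^2(0)} \Big( \Vert \theta(t)u(t)+g(t)\dot{x}(t) \Vert^2
        + \Vert \dot{x}(t) \Vert^2 \Big).
```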
- Let us prove (4.5e). For \(t \ge 0\), using the notation \(g(t):=1 -\frac{e_*-s_0}{2 \tau (t) }\) obviously yields
\(\theta (t) u(t) + \dot{x}(t) = \theta (t) u(t) + g(t) \dot{x}(t) + \frac{e_*-s_0}{2 \tau (t)}\dot{x}(t)\), hence, noticing that \(\nu (t) \le \tau (t)\), we deduce that
This, by \(\int _{t_0}^\infty \nu ^2(r)\Vert \theta (r) u(r)+ g(r) \dot{x}(r)\Vert ^2 dr < \infty \) (from (4.12)) and \( \int _{t_0}^\infty \Vert \dot{x}(r)\Vert ^2 dr <\infty \) (from (4.5c) and \(\nu ( \cdot ) \ge \nu (0)>0\)), amounts to \(\int _{t_0}^\infty \nu ^2(r)\Vert \theta (r) u(r)+\dot{x}(r)\Vert ^2 dr< \infty \), that is (4.5e).
- Let us prove (4.5f). From a quick computation and using (2.4b) we have
\(\begin{array}{l} \big ( \sigma ( \cdot ) \xi ( \cdot ) \big )^{(1)}(t) + \omega (t)\big ( \sigma (t) \xi (t) \big ) = \sigma (t) {\dot{\xi }}(t) + ({\dot{\sigma }}(t) + \omega (t) \sigma (t)) \xi (t)\\ = -\dot{x}(t) - \theta (t) u(t), \end{array}\)
which yields \(\begin{array}{l} \Vert \big ( \sigma ( \cdot ) \xi ( \cdot ) \big )^{(1)}(t) + \omega (t)\big ( \sigma (t) \xi (t) \big ) \Vert ^2 = \Vert \dot{x}(t) + \theta (t) u(t)\Vert ^2,\end{array}\)
thus, item (4.5f) follows from this last equality and (4.5e).
- It remains to prove (4.5g). Applying Proposition 3.1 with \(s=e_*\), in light of the definition of \({\mathcal {T}}_ {e_*}( \cdot )\) (given in (3.2)) and \( \langle {\dot{\xi }}(t),\dot{x}(t)\rangle \ge 0\) (from Lemma 3.2), implies that, for a.e. \(t \ge t_0\),
Moreover, by \(\psi _1(s_0,t) \ge 0\) (from (4.2)) and the definition of \(\psi _1(.,.)\) (given in (4.1b)), we get
Therefore, using (4.14), in light of (4.15), entails
Clearly, \({{\mathcal {L}}}_{e_*,q}( \cdot )\) and the term in the right side of (4.16) are nonnegative. Consequently, integrating (4.16) between \(t_0\) and \(t\ge t_0\), in light of the nonnegativity of \({{\mathcal {L}}}_{e_*,q}( \cdot )\), while recalling that \(\rho ( \cdot )\) is positive and nondecreasing (from Remark 3.1) and that \(\nu ( \cdot ) \le \tau ( \cdot )\), yields \((e_*- s_0)\rho (t_0)\int _{t_0}^t (\sigma \nu \vartheta )(r) {\bar{f}} (x(r))dr \le {{\mathcal {L}}}_{e_*,q}(t_0)\), which obviously leads us to (4.5g). \(\square \)
The next proposition establishes general convergence results, besides other estimations.
Proposition 4.2
Let \(\{\delta , \kappa , e_*, \sigma _0, \nu ( \cdot ), \vartheta ( \cdot )\}\) satisfy (CP) and (1.15), and let \(\{\omega ( \cdot ), \theta ( \cdot ), \sigma ( \cdot )\} \) be given by (1.5). Assume furthermore that \((x( \cdot ),\xi ( \cdot ),y( \cdot ))\) is a strong solution to (1.4)–(1.5) and that conditions (4.2)–(4.3) hold for some \(t_0 >0\) and \(s_0 \in (0,e_*)\). Then the following estimates are reached:
Proof
For simplification we set \({\bar{f}}= f-\min f\), \(u( \cdot )=y( \cdot )-x( \cdot )\) and \(\tau ( \cdot )=e_*+\nu ( \cdot )\). The proof will be divided into several steps:
- Let us prove (4.17a) and (4.17b). Given \(q \in S\), by Proposition 3.1 with \(s=0\), and recalling that \(\langle \dot{\xi }(t),\dot{x}(t)\rangle \ge 0\), we have, for a.e. \(t \ge t_0\),
where \({{\mathcal {L}}}_{0,q}( \cdot )\) and \({\mathcal {T}}_0( \cdot )\) are given (from (3.1) and (3.2)) by
It can also be checked that condition \(\psi _1(s_0,t) \ge 0\) (from (4.2)) can be rewritten as
\( \sigma (t) \left( \tau (t) \vartheta (t) \delta + \big (\tau ^2( \cdot ) \vartheta ( \cdot ) \big ) ^{(1)}(t) \right) \le s _0 \sigma (t) \tau (t) \vartheta (t) \rho (t)\).
Hence, by (4.18) and this last inequality, while noticing that \({\mathcal {T}}_0( \cdot )\) is nonnegative (in light of (4.19b)) and that \(\rho (t) \le \kappa \) (from Remark 3.1), we obtain
Therefore, by \(\int _{t_0}^{\infty } (\sigma \tau \vartheta )(r) {\bar{f}} (x(r))dr < \infty \) (from (4.5g)), together with the nonnegativity of \({{\mathcal {L}}}_{0,q}(t)\), we classically deduce that \({{\mathcal {L}}}_{0,q}(t)\) is convergent as \(t \rightarrow \infty \). Thus, there exists \(l \ge 0\) such that \(\lim _{t \rightarrow \infty } {{\mathcal {L}}}_{0,q}(t)=l\). Let us prove by contradiction that \(l=0\). Indeed, using the above definition of \({\mathcal {L}}_{0,q}( \cdot )\) while noticing that \(\frac{\nu (t)}{\tau (t)} \le 1\) yields
which, by \(\int _0^{\infty } \nu (t) \Vert u(t) \Vert ^2 dt < \infty \) (from (4.5d)) and \(\int _{t_0}^{\infty } (\sigma \tau \vartheta )(t) {\bar{f}} (x(t))dt < \infty \) (from (4.5g)), entails that \(\int _0^{\infty } \frac{1}{\tau (t)}{\mathcal {L}}_{0,q}(t) dt < \infty \). Then, assuming \(l > 0\) would give \(\frac{1}{\tau (t)}{\mathcal {L}}_{0,q}(t) \sim \frac{l}{\tau (t)}\) as \(t \rightarrow \infty \), which is absurd since \(\int _0^{\infty } \frac{1}{\tau (t)}dt =\infty \) (from Remark 3.1). We infer that \(\lim _{t \rightarrow \infty }{{\mathcal {L}}}_{0,q}(t)=0\), which, by the definition of \({{\mathcal {L}}}_{0,q}( \cdot )\), amounts to \({\lim _{t \rightarrow \infty } \nu ^2(t) \Vert u(t) \Vert ^2=0}\) (that is the first estimate of (4.17a)) together with
The second part of item (4.17a) is a direct consequence of the first one in light of \(\dot{y}(t)=-\kappa u(t)\), while item (4.17b) clearly follows from (4.21) in light of \(\tau (t)\ge \nu (t)\).
- Next, we prove item (4.17c). We denote \(\zeta (t):=\sigma (t) \xi (t)\) and \(\Gamma (t):= {\dot{\zeta }}(t) + \omega (t) \zeta (t)\). From the definition of \(\Gamma ( \cdot )\) and noticing that \(\langle {\dot{\zeta }}(t), \zeta (t) \rangle = \frac{1}{2} \left( \Vert \zeta ( \cdot ) \Vert ^2 \right) ^{(1)}(t)\), we have
A direct computation also gives us \(\left( \nu ^2( \cdot )\Vert \zeta ( \cdot ) \Vert ^2 \right) ^{(1)} (t)= 2 \nu (t) {\dot{\nu }}(t) \Vert \zeta (t) \Vert ^2 + \nu ^2(t) \left( \Vert \zeta ( \cdot ) \Vert ^2 \right) ^{(1)}(t)\),
which combined with (4.22) amounts to
Observe that \(\omega (t)\) (introduced in (1.5c)) can be rewritten as \(\omega (t):=\rho (t) \vartheta (t) -\frac{\delta }{\tau (t)}\), while by condition (4.3) we assume (for some positive constant \(\omega _0\)) that \(\omega ( \cdot ) \ge \omega _0 > 0\) on \([t_0, \infty )\). By condition (1.15c), we also know that \(\frac{{\dot{\nu }}(t)}{\nu (t)} \rightarrow 0\) as \(t \rightarrow \infty \). So, given any constant \(h \in (0,1)\), we can find \(\epsilon > 0\) and \(t_1 \ge t_0\) such that \(t \ge t_1\) yields \(\omega (t) \ge \frac{1}{1-h} \left( \frac{\epsilon }{2} + \frac{\dot{\nu }(t)}{\nu (t)} \right) \),
which can be equivalently written as
Moreover, by the Peter-Paul inequality, we classically obtain
Hence, by (4.25), in light of (4.23) and (4.24), we infer that
Whence, recalling that \(\omega ( \cdot )\ge \omega _0> 0\) (from condition (4.3)), (4.26) entails
Then, by this last inequality together with \(\int _{t_1}^{\infty }\nu ^2(t) \Vert \Gamma (t) \Vert ^2dt <\infty \) (from (4.5f)), we classically deduce that
Clearly, the first estimate in (4.28) yields \(\int _{t_0}^{\infty } \nu ^2(t) \Vert \zeta (t) \Vert ^2 dt <\infty \) (because of the continuities of \(\nu ( \cdot )\), \(\sigma ( \cdot )\) and \(\xi ( \cdot )\) on \([t_0, t_1]\)), namely the first estimate in (4.17c). It is also obviously checked from the two arguments in (4.28) that \(l = 0\), which proves the second result in item (4.17c).
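For reference, the Peter–Paul inequality used in the derivation of (4.25) states that, for any \(a, b \in {\mathcal {H}}\) and any \(\epsilon > 0\),

```latex
2\,\langle a, b \rangle \;\le\; \epsilon\,\Vert a \Vert^2 \;+\; \frac{1}{\epsilon}\,\Vert b \Vert^2 ,
```

which follows by expanding \(0 \le \Vert \sqrt{\epsilon }\, a - \frac{1}{\sqrt{\epsilon }}\, b \Vert ^2\).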
- Let us prove (4.17d) and (4.17e). Noticing that \(\omega (t)+ \frac{{\dot{\sigma }}(t)}{\sigma (t)}= \rho (t) \vartheta (t)\) (from (1.5c)) yields
So, in light of (4.29), a quick computation gives us
Therefore, according to (4.30), using the Young inequality yields
We also underline that \(\rho ( \cdot )\) is bounded (from Remark 3.1) and so is \(\vartheta ( \cdot )\) (from condition (1.16)). So, by \(\int _{t_0}^{\infty } \nu ^2(t) \Vert \big ( \sigma ( \cdot ) \xi ( \cdot ) \big )^{(1)}(t) + \omega (t)\big ( \sigma (t) \xi (t) \big ) \Vert ^2 dt <\infty \) (from (4.5f)) together with \( \int _{t_0}^{\infty } \sigma ^2(t) \nu ^2 (t) \Vert \xi (t) \Vert ^2dt <\infty \) (from (4.17c)), we deduce from these last arguments that \(\int _{t_0}^{\infty }\sigma ^2(t) \nu ^2 (t) \Vert {\dot{\xi }}(t) \Vert ^2dt <\infty \), that is the first estimate in item (4.17d).
Furthermore, from (1.4b) together with \({\dot{\sigma }}(t) + \sigma (t) \omega (t) = \sigma (t) \rho (t) \vartheta (t)\) (from (4.29)) and \(\omega (t)+ \frac{{\dot{\sigma }}(t)}{\sigma (t)}= \rho (t) \vartheta (t)\) (from (1.5c)) we obtain
from which we immediately derive that
We also know that \(\displaystyle \lim _{t\rightarrow \infty }\nu (t) \Vert u(t) \Vert =\lim _{t \rightarrow \infty } \nu (t) \sigma (t) \Vert \xi (t) \Vert =0\) (from (4.17a) and (4.17c)). So, by the boundedness of \(\{\theta ( \cdot ), \rho ( \cdot )\}\) (from Remark 3.1) and that of \(\vartheta ( \cdot )\) (from (1.16)), we infer that
Moreover, by \(\langle \dot{x}(t), \dot{\xi }(t) \rangle \ge 0\) (from Lemma 3.2), we obviously have
\(\Vert \dot{x}(t)\Vert ^2 + \sigma ^2(t)\Vert \dot{\xi }(t) \Vert ^2 \le \Vert \dot{x}(t) + \sigma (t) \dot{\xi }(t)\Vert ^2\). This, together with (4.33), yields \(\lim _{t \rightarrow \infty } \nu (t) \Vert \dot{x}(t)\Vert =\lim _{t \rightarrow \infty } \nu (t) \sigma (t) \Vert \dot{\xi }(t)\Vert =0\), namely the second result in item (4.17d) and item (4.17e), respectively. \(\square \)
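The inequality invoked at the end of the proof is simply the expansion of the squared norm, in which the cross term is nonnegative by Lemma 3.2:

```latex
\Vert \dot{x}(t) + \sigma(t)\dot{\xi}(t) \Vert^2
\;=\; \Vert \dot{x}(t) \Vert^2 + \sigma^2(t)\,\Vert \dot{\xi}(t) \Vert^2
      + 2\,\sigma(t)\,\underbrace{\langle \dot{x}(t), \dot{\xi}(t) \rangle}_{\ge\, 0}
\;\ge\; \Vert \dot{x}(t) \Vert^2 + \sigma^2(t)\,\Vert \dot{\xi}(t) \Vert^2 .
```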
Now we claim the main result of this section regarding our model (1.4)–(1.5).
Theorem 4.1
Let \(\delta \ge 0\) and \(\{\kappa , e_*, \sigma _0\} \subset (0,\infty )\), let \(\{\nu ( \cdot ),\vartheta ( \cdot )\}\) be positive mappings of class \(C^1\) satisfying conditions (1.15), and suppose that (4.2)–(4.3) hold for some \(t_0 > 0\) and \(s_0 \in (0, e_*)\). Then, for any strong solution \((x,\xi ,y): [0,\infty ) \rightarrow {\mathcal {H}}^3\) to (1.4)–(1.5), we have the following properties:
Proof
Items (4.34a) to (4.34f) are direct consequences of Propositions 4.1 and 4.2 whose hypotheses are fulfilled under the assumptions of Theorem 4.1:
- Items (4.34a) and (4.34b) are given by (4.5g) and (4.17b), respectively.
- The two results in item (4.34c) are given by (4.17a) and (4.5d), respectively.
- The two results in item (4.34d) are given by (4.17e) and (4.5c), respectively.
- (4.34e) is derived from (4.17c) and item (4.34f) follows from (4.17d).
It remains to prove (4.34g), that is the weak convergence of the trajectories. For simplification we set \(\tau ( \cdot ) = e_*+ \nu ( \cdot )\), \(u( \cdot ) = y( \cdot ) - x( \cdot )\) and \(\bar{f}(\cdot )= f(\cdot ) - \min f\). Given \(q \in S\), by the definition of \({\mathcal {L}}_{s_0,q}( \cdot )\) (given in (3.1)) we have, for \(t \in [0, \infty )\),
The above equality can be equivalently written as
Let us analyze separately the behavior as \(t \rightarrow \infty \) of each term in the right side of (4.36). Regarding the first term, we know that \(\lim _{t\rightarrow +\infty } {\mathcal {L}}_{s_0,q}(t)\) exists (from Proposition 4.1). Next, we show that the other terms converge to zero. Concerning the second term, we simply have \(\lim _{t \rightarrow +\infty } \nu ^2(t) \Vert y(t)-x(t)\Vert ^2 = 0\) (from (4.17a)). In order to estimate the third term, using the Cauchy–Schwarz inequality yields
\(s_0 \nu (t) | \langle q-x(t),y(t)-x(t) \rangle | \le s_0 \nu (t) \Vert q-x(t) \Vert \Vert y(t)-x(t) \Vert \),
which, by the boundedness of \(x(\cdot )\) and \( \lim _{t \rightarrow \infty } \nu (t) \Vert y(t)-x(t) \Vert =0\) (from (4.17a)), entails that
\(\lim _{t \rightarrow \infty } s_0 \nu (t) \langle q-x(t),y(t)-x(t) \rangle =0\).
Regarding now the fourth term, by the monotonicity of \(\partial f\) and the Cauchy–Schwarz inequality we have
\(0 \le \sigma (t) \tau (t)\langle \xi (t), x(t)-q\rangle \le \sigma (t) \tau (t)\Vert \xi (t)\Vert \times \Vert x(t)-q\Vert \).
Hence, by remembering that (as \(t\rightarrow \infty \)) \( \sigma (t) \nu (t)\Vert \xi (t)\Vert \rightarrow 0\) (from (4.17c)) (thus \( \sigma (t) \tau (t)\Vert \xi (t)\Vert \rightarrow 0\) since \(\tau ( \cdot ) = e_*+ \nu ( \cdot )\) and since \(\nu (\cdot )\) is nondecreasing), we deduce from the boundedness of \(x( \cdot )\) that
\(\lim _{t \rightarrow +\infty } \sigma (t) \tau (t) \langle \xi (t), x(t)-q\rangle = 0\). Concerning the last term, recalling that \(\tau (\cdot ) = e_*+ \nu (\cdot )\), we observe that
Furthermore, we have \(\lim _{t \rightarrow \infty } \sigma (t) \nu ^2(t) \vartheta (t) {\bar{f}}(x(t))=0\) (from (4.17b)), hence recalling that \(\vartheta (\cdot )\) is bounded away from zero, we readily get \(\lim _{t \rightarrow \infty } \sigma (t) \nu ^2(t) {\bar{f}}(x(t))\)\(=0\). Thus, by noticing that the quantity \((1+\frac{e_*}{\nu (t)})^2 | \vartheta (t) - \frac{s}{\tau (t)} |\) is bounded (since \(\tau (\cdot )\) is nondecreasing and since \(\vartheta (\cdot )\) is bounded as a positive and nonincreasing mapping), we infer that
\(\lim _{t \rightarrow \infty } \sigma (t) \tau (t) \big (\tau (t) \vartheta (t) -s \big ) {\bar{f}}(x(t))=0\). So equality (4.36) in light of the previous arguments gives us
which implies that \(\lim _{t\rightarrow +\infty } \Vert x(t) - q\Vert \) exists.
Now, let \(\bar{x}\) be a weak sequential cluster point of \(x( \cdot )\), namely, there exists a sequence \((t_n)_{n\ge 0} \subset (0, \infty )\) such that \(\lim _{n\rightarrow +\infty } t_n = +\infty \) and for which the sequence \((x(t_n))_{n\ge 0}\) weakly converges to \(\bar{x}\) as \(n\rightarrow +\infty \). Then, for \(n \ge 0\), we readily have
Observe that \(\xi (t_n) \rightarrow 0\) strongly in \({\mathcal {H}}\) as \(n\rightarrow +\infty \), since \(\lim _{t\rightarrow +\infty } \sigma (t) \nu (t)\Vert \xi (t)\Vert = 0\) (from (4.17c)) and since \(\sigma (t) \ge \sigma (0) > 0\) and \(\nu (t) \ge \nu (0) > 0\) (from (1.5b) and (1.15a)). Therefore, passing to the limit in (4.37) as \(n\rightarrow +\infty \) and using the fact that \(\partial f\) is sequentially semi-closed (as \(\partial f\) is maximally monotone), we obtain \(0\in \partial f (\bar{x})\), that is \(\bar{x} \in (\partial f)^{-1}(0)\). Thus, we conclude by means of the well-known Opial's lemma [33], which completes the proof. \(\square \)
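For the reader's convenience, we recall the statement of Opial's lemma invoked in the last step (as in [33]):

```latex
\textbf{Opial's lemma (continuous version).}
\textit{Let } S \subset \mathcal{H} \textit{ be nonempty and let }
x : [0,\infty) \to \mathcal{H} \textit{ be such that:}
\begin{itemize}
  \item[(i)] $\lim_{t \to \infty} \lVert x(t) - q \rVert$
             \textit{exists for every } $q \in S$;
  \item[(ii)] \textit{every weak sequential cluster point of } $x(t)$,
              \textit{as } $t \to \infty$, \textit{belongs to } $S$.
\end{itemize}
\textit{Then } $x(t)$ \textit{converges weakly, as } $t \to \infty$,
\textit{to some element of } $S$.
```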
4.2 Main Estimates and Asymptotic Convergence Results
4.2.1 The General Setting of Parameters
The next result can be regarded as Theorem 4.1 in which conditions (4.2)–(4.3) are simplified.
Theorem 4.2
Let \(\delta \ge 0\), \(\{e_*, \kappa , \sigma _0 \} \subset (0,\infty )\), let \(\{\nu , \vartheta \}: [0,\infty )\rightarrow (0,\infty )\) satisfy (1.15)–(1.16) (for some \( \vartheta _{\infty }>0\)). Assume that \((x,\xi ,y):[0,\infty ) \rightarrow {\mathcal {H}}^3\) is a strong solution to (1.4)–(1.5) and that the following conditions (a) and (b) are satisfied:
Then the conclusions of Theorem 4.1 are still valid.
Proof
In light of Theorem 4.1, we just prove that there exist two constants \(s_0 \in (0, e_*)\) and \(t_0 \ge 0\) for which (4.2)–(4.3) hold. For simplification, we set \(\tau ( \cdot ):= e_*+ \nu ( \cdot )\) and \(\rho ( \cdot ):=\kappa - \frac{{\dot{\nu }}( \cdot )}{\nu ( \cdot )}\). From the definitions of \(\psi _1(.,.)\), \(\psi _2(.,.)\) and \(a_1( \cdot )\) (given in (4.1)) we readily have
The rest of the proof can be divided into the following steps (i1)–(i4):
(i1): Let us prove (for t large enough) that \(\psi _1(s_0,t) \ge 0\) for some \(s_0 \in (0, e_*)\). Indeed, (4.38)-(a) reads as \(e_*> \frac{\delta + 2M}{\kappa }\), where \(M:=\limsup _{t \rightarrow +\infty } {\dot{\nu }}(t)\) is well-defined (from (1.15d)). So, we can take \(s_0 \in (\frac{\delta + 2M}{\kappa }, e_*)\). Moreover, for \(t \ge 0\), by the definition of \(\psi _1(s_0,t)\) and by \({\dot{\vartheta }} ( \cdot ) \le 0\) (from (1.16)) we successively obtain
In order to estimate the right side of this last inequality, we recall that \(\lim _{t \rightarrow \infty } \rho (t)=\kappa \) (from Remark 3.1). It is then immediately observed that \(\liminf _{t \rightarrow \infty } \big ( s_0 \rho (t) - \delta - 2 {\dot{\nu }}(t) \big )= s_0 \kappa - \delta - 2 M > 0 \) (since \(s_0 \in (\frac{\delta + 2\,M}{\kappa }, e_*)\)).
Thus, by \(\sigma (t) \ge \sigma _0>0\), \(\tau (t) \ge e_*>0\) and \(\vartheta (t) \ge \vartheta _{\infty }>0\) (from (1.16)) for all \(t \ge 0\), we readily infer that \(\liminf _{t \rightarrow \infty } \psi _1(s_0,t) > 0\). Whence, for \(t_1\) large enough, \(t \ge t_1\) yields \(\psi _1(s_0,t) > 0\).
(i2): Setting \(\omega _0:= \frac{1}{2} (\kappa \vartheta _{\infty }- \frac{\delta }{\tau (t_*)})\), we prove (for t large enough) that \(\omega (t) \ge \omega _0\). Indeed, by definition of \(\omega ( \cdot )\) (given in (4.3)) and by \(\vartheta ( \cdot ) \ge \vartheta _{\infty }>0\) (from (1.16)), we obtain, for \(t\ge t_*\) (\(t_*\) being the constant arising in condition (4.38)-(b)), \(\omega (t) = \rho (t) \vartheta (t)- \frac{\delta }{\tau (t)}\ge \rho (t) \vartheta _{\infty }- \frac{\delta }{\tau (t_*)}\). Recall that \(\lim _{t \rightarrow \infty } \rho (t)=\kappa \) (from Remark 3.1). Moreover, by \(\tau (t_*) \ge \frac{e_*}{\vartheta _{\infty }}\) (from (4.38)-(b)) and \(e_*> \frac{\delta }{\kappa }\) (from (4.38)-(a)), we additionally have \(\tau (t_*) > \frac{ \delta }{\kappa \vartheta _{\infty }} \) (hence \(\omega _0>0\)). It follows that \(\liminf _{t \rightarrow + \infty } \omega (t) \ge \kappa \vartheta _{\infty }- \frac{\delta }{\tau (t_*)}= 2 \omega _0>0\). So we readily deduce for some \(t_2 \ge t_*\) that \(t \ge t_2\) implies \(\omega (t) \ge \omega _0>0\).
(i3): Let us prove (for t large enough) that \(\psi _2(e_*,t)\ge 0\) and \(a_1(t) \ge 0\). Indeed, we have \(\vartheta ( \cdot ) \ge \vartheta _{\infty }>0\) (from (1.16)) and \(\tau (t_*) \ge \frac{e_*}{\vartheta _{\infty }}\) (from (4.38)-(b)), hence \(t \ge t_*\) yields \(\psi _2(e_*,t) \ge \sigma (t) \tau (t) \big (\tau (t_*) \vartheta _{\infty } -e_*\big ) \ge 0\). In addition, by \(\inf _{t \ge t_2} \omega (t) \ge \omega _0 >0\) (from item (i2)), we can observe for \(t \ge t_2\) that \(a_1(t) \ge \sigma (t) \tau (t) \big ( \omega _0-\frac{{\dot{\nu }}(t)}{\tau (t)} \big ) \). Note that \(\lim _{t \rightarrow \infty } \frac{{\dot{\nu }}(t)}{\tau (t)}=0\) (from (1.15)). So, we classically deduce for some \(t_3 \ge t_*\) that \(t\ge t_3\) yields \(\psi _2(e_*,t) \ge 0\) and \(a_1(t) \ge 0\).
(i4): The desired result follows from (i1), (i2) and (i3) altogether. \(\square \)
4.3 Specific Cases of Parameters
Let us start by stressing, under appropriate conditions on the parameters, some properties regarding the isotropic damping coefficient \(\alpha ( \cdot )\) that occurs in the equivalent second-order formulation (1.2)–(1.3) of system (1.4)–(1.5).
Proposition 4.3
Let \(\delta \ge 0\), \(\{\kappa , e_*\} \subset (0,\infty )\), let \(\{ \nu ( \cdot ), \vartheta ( \cdot )\}\) be positive mappings of class \(C^2\) satisfying (1.15) and for which \(\vartheta (t) \sim \vartheta _{\infty }\) as \(t \rightarrow \infty \) (for some \(\vartheta _{\infty }>0\)), and suppose that (as \(t \rightarrow \infty \)):
Then the parameters \(\{\alpha ( \cdot ), \beta ( \cdot ), b( \cdot ) \}\) defined by (1.3) (depending on \(\theta ( \cdot )\) and \(\omega ( \cdot )\) given in (1.5)) satisfy (as \(t \rightarrow \infty \)):
In particular, for \(\nu (t) = \nu _0^{1-\gamma } (t + \nu _0)^{\gamma }\) with \(\nu _0 >0\) and \(\gamma \in (0,1]\), we get (as \(t \rightarrow \infty \))
Proof
See Appendix A.4. \(\square \)
We now specialize Theorem 4.1 to two particular cases of \(\nu ( \cdot )\) through the next two results (Theorems 4.3 and 4.4).
Theorem 4.3
(Case \(\nu (t) = t + \nu _0\)). Let \(\delta \ge 0\), \(\{\kappa , e_*, \nu _0, \sigma _0\} \subset (0,\infty )\), set \(\nu (t) = t + \nu _0\) and let \(\vartheta ( \cdot )\) be any positive mapping of class \(C^1\) satisfying (1.16). Suppose also that \((x,\xi ,y):[0,\infty ) \rightarrow {\mathcal {H}}^3\) is a strong solution to (1.4)–(1.5) with parameters such that
Then there exists \(\bar{x}\in S\) such that \(x( \cdot )\rightharpoonup \bar{x}\) weakly in \({\mathcal {H}}\), and, for some \(t_0 \ge 0\), we obtain:
Proof
Let \(\sigma ( \cdot )\) be defined by (1.5b), along with \(\nu (t) = t + \nu _0\). A simple computation yields \(\sigma (t) =\sigma _0 \left[ \frac{e_*+\nu _0+t}{e_*+\nu _0} \right] ^\delta \), so that \(\sigma (t) \sim \frac{\sigma _0}{(e_*+\nu _0)^\delta } t^\delta \) as \(t \rightarrow \infty \). In this situation, in which \({\dot{\nu }}( \cdot ) = 1\), (4.38)-(a) reduces to (4.43), while (4.38)-(b) is obviously satisfied (since \(\nu (t) \rightarrow \infty \) as \(t \rightarrow \infty \)). Hence, Theorem 4.3 follows directly from Theorem 4.2 and \(\sigma (t) \sim \frac{\sigma _0}{(e_*+\nu _0)^\delta } t^\delta \) as \(t \rightarrow \infty \).
\(\square \)
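The closed form of \(\sigma ( \cdot )\) used in the above proof follows from a direct integration (consistent with the expression \(\sigma (t)= \sigma _0 e^{\delta \int _0^t \frac{ds}{e_*+\nu (s)}}\) used later in the proof of Theorem 4.4):

```latex
\sigma(t)
\;=\; \sigma_0\, e^{\,\delta \int_0^t \frac{ds}{e_*+\nu_0+s}}
\;=\; \sigma_0\, e^{\,\delta \,\ln \frac{e_*+\nu_0+t}{e_*+\nu_0}}
\;=\; \sigma_0 \left[ \frac{e_*+\nu_0+t}{e_*+\nu_0} \right]^{\delta}
\;\sim\; \frac{\sigma_0}{(e_*+\nu_0)^{\delta}}\; t^{\delta}
\qquad (t \to \infty).
```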
Theorem 4.4
(Case \(\nu (t) = \nu _0^{1-\gamma } (t + \nu _0)^{\gamma }\) with \(\gamma \in [0,1)\)). Let \(\delta \ge 0\), \(\{\kappa , e_*, \nu _0, \sigma _0\} \subset (0,\infty )\), \(\gamma \in [0,1)\), set \(\nu (t) = \nu _0^{1-\gamma }(t+ \nu _0)^\gamma \) and let \(\vartheta ( \cdot )\) be any positive mapping of class \(C^1\) satisfying (1.16). Suppose furthermore that \((x,\xi ,y): [0,\infty ) \rightarrow {\mathcal {H}}^3\) is a strong solution to (1.4)–(1.5) with parameters such that:
Then there exists \(\bar{x}\in S\) such that \(x( \cdot )\rightharpoonup \bar{x}\) weakly in \({\mathcal {H}}\). Moreover, denoting \({\bar{\alpha }} := \frac{\delta \nu _0^\gamma }{(1-\gamma )(\nu _0 + e_*)} \), we have the following properties (for some \(t_0 \ge 0\)):
Proof
Let \(\sigma ( \cdot )\) be defined by (1.5b) with \(\nu (t) = \nu _0^{1-\gamma } (t+ \nu _0)^\gamma \), where \(\nu _0 >0 \) and \(\gamma \in [0,1)\). Then, denoting \({\bar{\alpha }} := \frac{\delta \nu _0^\gamma }{(1-\gamma )(\nu _0 + e_*)} \), a simple computation yields, for \(s \ge 0\),
\( \frac{\delta }{\nu _0^{1-\gamma } (s+ \nu _0)^{\gamma } + e_*} \ge \frac{\delta }{(s+ \nu _0)^{\gamma }\big (\nu _0^{1-\gamma } + \frac{e_*}{ \nu _0^\gamma }\big )}= (1-\gamma ) {\bar{\alpha }} (s+ \nu _0)^{-\gamma }\).
Consequently, by \(\sigma (t)= \sigma _0e^{\delta \int _0^t \frac{1}{\nu _0^{1-\gamma }(s+ \nu _0)^{\gamma } + e_*} ds }\) we immediately deduce that
In this situation, in which \({\dot{\nu }} (t) = \frac{\gamma \nu _0^{1-\gamma }}{ (t + \nu _0)^{1 - \gamma }}\), we have \(\limsup _{t \rightarrow \infty } {\dot{\nu }}(t) = 0\). Hence, (4.38)-(a) reduces to (4.45)-(a). Moreover, if \(\gamma \in (0,1)\), (4.38)-(b) is obviously satisfied (since \(\nu (t) \rightarrow \infty \) as \(t \rightarrow \infty \)), while otherwise (if \(\gamma = 0\)) (4.38)-(b) follows from (4.45)-(b). Thus, Theorem 4.4 follows directly from Theorem 4.2 and \(\sigma (t) \ge \sigma _0 e^{-{\bar{\alpha }} \nu _0^{1-\gamma } } e^{{\bar{\alpha }} t^{1-\gamma } }\). \(\square \)
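The integration behind the exponential lower bound on \(\sigma ( \cdot )\) can be spelled out as follows, using the pointwise estimate established at the beginning of the proof:

```latex
\sigma(t)
\;\ge\; \sigma_0\, e^{\,(1-\gamma)\bar{\alpha}\int_0^t (s+\nu_0)^{-\gamma}\,ds}
\;=\; \sigma_0\, e^{\,\bar{\alpha}\left[(t+\nu_0)^{1-\gamma}-\nu_0^{1-\gamma}\right]}
\;\ge\; \sigma_0\, e^{-\bar{\alpha}\,\nu_0^{1-\gamma}}\, e^{\,\bar{\alpha}\, t^{1-\gamma}} .
```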
The next result can be regarded as an important consequence of the previous theorem that highlights the possible effects of the parameters \(\{\delta , \kappa \}\) on the estimates and convergence rates in (1.12).
Corollary 4.1
Let \(\{\kappa , e_*, \nu _0, \sigma _0\} \subset (0,\infty )\), \(\delta \ge 0\), \(\gamma \in [0,1)\), set \(\nu (t) = \nu _0^{1-\gamma }(t+ \nu _0)^\gamma \) and let \(\vartheta ( \cdot )\) be any positive mapping of class \(C^1\) satisfying (1.16). Suppose furthermore that \((x,\xi ,y):[0,\infty ) \rightarrow {\mathcal {H}}^3\) is a strong solution to (1.4)–(1.5) with parameters such that:
Then the conclusions of Theorem 4.4 still hold when replacing \({\bar{\alpha }}\) with \(c:= \big ( \frac{\nu _0^{\gamma }}{(1-\gamma )(\nu _0+\lambda )} \big ) \frac{\delta \kappa }{\max \{\delta ,\kappa \}}\). Furthermore, in the special case when \(\kappa = \delta >0\), (4.48)-(a) reduces to the condition \(e_*>1\) and we get \(c= \frac{\nu _0^{\gamma }}{(1-\gamma )(\nu _0+e_*)} \delta \).
Proof
Clearly, under condition (4.48), the conclusions of Theorem 4.4 (including the estimates and rates in (1.12)) hold with \({\bar{\alpha }}:= \frac{\kappa \delta \nu _0^{\gamma }}{(1-\gamma ) (\kappa \nu _0+\lambda \delta ) }\) (since \(e_*= \frac{\lambda }{\kappa } \delta \)). Hence we readily deduce that \({\bar{\alpha }} \ge \big ( \frac{\nu _0^{\gamma }}{(1-\gamma ) (\nu _0+\lambda )} \big ) \frac{\delta \kappa }{\max \{\delta ,\kappa \}}\). This leads us immediately to the two claims of Corollary 4.1. \(\square \)
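The comparison between \({\bar{\alpha }}\) and c used in this proof rests on an elementary bound, which can be made explicit as follows (a sketch in the notation of the corollary):

```latex
\kappa \nu_0 + \lambda \delta
\ \le\ \max\{\delta,\kappa\}\,\nu_0 + \max\{\delta,\kappa\}\,\lambda
= \max\{\delta,\kappa\}\,(\nu_0+\lambda),
\qquad\text{hence}\qquad
\bar{\alpha} = \frac{\kappa\delta\,\nu_0^{\gamma}}{(1-\gamma)(\kappa\nu_0+\lambda\delta)}
\ \ge\ \frac{\nu_0^{\gamma}}{(1-\gamma)(\nu_0+\lambda)}\cdot\frac{\delta\kappa}{\max\{\delta,\kappa\}} = c.
```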
5 Numerical Experiments and Discrete Perspectives
5.1 Numerical Experiments
We carry out some numerical experiments regarding the dynamics \(\{x( \cdot ), \xi ( \cdot )\}\) generated by our models, relative to three examples of problem (1.1) when \({\mathcal {H}}=\mathrm{I\!R}^2\). The first one (which deals with a smooth objective) is intended to compare our model with DIN-AVD. The last two examples deal with nonsmooth objectives (one is strongly convex and the other is not), so as to provide insight into the influence and relevance of the parameters. For the sake of legibility, we make the following observations.
Remark 5.1
Recall from Proposition 2.2 that existence and uniqueness of a strong solution \((x,\xi )\) to (1.2) require initial conditions:
\((x(0),\xi (0))=(x_0, \xi _0)\) and \(\big ( x( \cdot ) + \sigma ( \cdot ) \xi ( \cdot ) \big )^{(1)}(0) = q_0\), such that \(\xi _0 \in \partial f(x_0)\).
As suggested by Proposition 2.3, we know that \((x, \xi )\) uniquely solves (for some auxiliary variable y) system (1.4) with Cauchy data: \(~x(0)=x_0\), \(~\xi (0)=\xi _0\) and \(y(0) =x_0-\frac{1}{\theta (0)} \left( q_0 + \sigma (0) \omega (0) \xi _0 \right) \). So, from Proposition 2.2, we focus on computing \((x, \xi )\) through the unique solution \((x, \xi , y)\) given (for \(t \ge 0\)) by
\((v( \cdot ),y( \cdot ))\) being the unique classical solution to (2.6) that can be alternatively written as
along with: \(y(0)= x_0-\frac{1}{\theta (0)} \left( q_0 + \sigma (0) \omega (0) \xi _0 \right) \) and \(v(0)=x_0+ \sigma (0) \xi _0\). In our forthcoming experiments, we compute the trajectories produced by DIN-AVD and (5.1) using MATLAB.
In all the following examples, we denote \({\bar{f}}= f -\min f\) and we consider our model with \(\nu (t) = \nu _0^{1-\gamma }(t+ \nu _0)^\gamma \) for some \(\gamma \in [0,1]\) and \(\nu _0>0\), together with \(\vartheta (t) \equiv \vartheta _\infty \) (for some \(\vartheta _\infty >0\)). Our experiments will be mainly focused on the two special cases: (i) \(\delta =0\) (useful for numerical purposes); (ii) \(\delta >0\) and \(\gamma =0\) (which ensures exponential convergence rates).
5.1.1 Example 1 (Comparing Our Model with DIN-AVD)
Our first example aims at comparing the classical DIN-AVD [11] (namely, (1.9) where \({\bar{b}}(t) \equiv \beta _*\) for some constant \(\beta _*>0\)) with our model in the special case when \(\delta =0\) (namely, in the absence of a time rescaling process) and \(\gamma =1\) (for which \(\alpha (t) \sim \frac{1+\kappa e_*}{t}\) as \(t \rightarrow \infty \), from Proposition 4.3). To that end, we consider the smooth objective used in [11] to illustrate the former dynamic, defined for \(x=(x_1,x_2) \in \mathrm{I\!R}^2\) by \(f(x)= \frac{1}{2}(x_1^2+1000 x_2^2)\). This function is quadratic but somewhat ill-conditioned. From its separable form (so as to use (5.1)), we classically obtain, for \(\sigma >0\) and \(v=(v_1,v_2) \in \mathrm{I\!R}^2 \),
As in [11], concerning DIN-AVD, we use the near-optimal parameters \(\alpha _*=3.1\) and \(\beta _*=1\), with \(x_0 = (1,1)\) and \(\dot{x}_0 = (0,0)\). Concerning our model, we set \(x_0 = y_0= (1,1)\) and \(\xi _0=(1, 1000)\) (so \(\xi _0 \in \partial f(x_0)\)), and we highlight the influence of the parameters \(\kappa \) and \(\vartheta _\infty \) on its trajectories. Figures 1 and 2 show that our model outperforms DIN-AVD as soon as \(\kappa \) and \(\vartheta _\infty \) are large enough; moreover, its performance improves as \(\kappa \) and \(\vartheta _\infty \) increase. Further experiments (not reported here) suggest that increasing \(e_*\) tends to slightly damp the oscillations.
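For illustration, the separable resolvent underlying (5.1) in this example can be sketched as follows (a Python sketch, not the MATLAB code used in the experiments; the function names are ours, and the componentwise formula is the standard resolvent of a separable quadratic):

```python
import numpy as np

def resolvent_quadratic(v, sigma, lams=(1.0, 1000.0)):
    """Resolvent J_{sigma * grad f}(v) for f(x) = 0.5*(x1^2 + 1000*x2^2).

    For a separable quadratic, (I + sigma * grad f)^{-1} acts
    componentwise: v_i is mapped to v_i / (1 + sigma * lam_i).
    """
    v = np.asarray(v, dtype=float)
    return v / (1.0 + sigma * np.asarray(lams, dtype=float))

def minty_split(v, sigma):
    """Minty decomposition v = x + sigma*xi with xi = grad f(x)."""
    x = resolvent_quadratic(v, sigma)
    xi = (v - x) / sigma
    return x, xi
```

For instance, \(v=(2,1001)\) with \(\sigma =1\) splits into \(x=(1,1)\) and \(\xi =(1,1000)\), which matches the initial data \(x_0=(1,1)\), \(\xi _0=(1,1000)\) chosen above.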
5.1.2 Example 2 (Influence of the Parameters on the Trajectories)
We aim here at illustrating the influence of the parameters \(\{\delta , \gamma , \kappa , \vartheta _\infty \}\) on the trajectories \(\{x( \cdot ), \xi ( \cdot )\}\) produced by our model. For this purpose, we consider the nonsmooth objective defined for \(x\in \mathrm{I\!R}^2\) by \(f(x) = \frac{1}{2} \Vert x -b\Vert _2^2 + \Vert x\Vert _1\) for some \(b \in \mathrm{I\!R}^2\), which is linked to the Lasso problem. In our experiments, we set \(b = (0,10)\), and it can be checked that the minimum of f is then attained at \(x^*=(0,9)\). As a typical result for solving (5.1), we have, for \(\sigma >0\) and \(v \in \mathrm{I\!R}^2\),
We also choose the initial conditions \(x_0=y_0=(10,10)\), \(\xi _0= (11,1)\) (so \(\xi _0 \in \partial f(x_0)\)).
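Although the display solving (5.1) is not reproduced here, the resolvent of \(\partial f\) for this objective admits a standard closed form via soft-thresholding, which can be sketched as follows (a Python sketch with function names of our own; the formula follows from solving \(v \in x + \sigma (x-b) + \sigma \partial \Vert x\Vert _1\) componentwise):

```python
import numpy as np

def soft_threshold(u, tau):
    """Componentwise soft-thresholding, i.e. the prox of tau*||.||_1."""
    return np.sign(u) * np.maximum(np.abs(u) - tau, 0.0)

def resolvent_lasso(v, sigma, b):
    """Resolvent J_{sigma * df}(v) for f(x) = 0.5*||x - b||^2 + ||x||_1.

    Solving v in x + sigma*(x - b) + sigma*d||x||_1 componentwise gives
    x = soft_threshold((v + sigma*b)/(1 + sigma), sigma/(1 + sigma)).
    """
    v, b = np.asarray(v, dtype=float), np.asarray(b, dtype=float)
    return soft_threshold((v + sigma * b) / (1.0 + sigma),
                          sigma / (1.0 + sigma))
```

As a sanity check, with \(b=(0,10)\) the minimizer \(x^*=(0,9)\) is a fixed point of \(v \mapsto J_{\sigma \partial f}(v)\) at \(v=x^*\), since \(0 \in \partial f(x^*)\).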
Figures 3–5 illustrate the influence of \(\{ \gamma , \kappa , \vartheta _\infty \}\) in the useful context of no time rescaling. Figure 3 shows that, when \(\delta =0\) (absence of time rescaling), the convergence is slightly better for \(\gamma =1\) (as anticipated from Theorems 4.3 and 4.4).
Figures 4 and 5 are concerned with the influence of \(\kappa \) and \(\vartheta _\infty \) on our model, in the special case when \(\delta =0\) and \(\gamma =1\). As in Example 1, these figures show the effectiveness of the model for sufficiently large values of \(\kappa \) and \(\vartheta _\infty \).
Now, we focus on the influence of \(\{ \gamma , \kappa , \vartheta _\infty \}\) in the context of time rescaling. In this context, the effectiveness of the model for sufficiently large values of \(\kappa \) and \(\vartheta _\infty \) can also be observed in additional experiments (not reported here for conciseness). Figure 6 suggests that, when \(\delta >0\) (in the presence of a time rescaling process), the fastest convergence is obtained for \(\gamma =0\) (which, once again, is consistent with our theoretical results). Figures 7 and 8 show, under particular choices of parameters entering Theorem 4.3 (see Fig. 7) and Corollary 4.1 (see Fig. 8), that the convergence improves as \(\delta \) increases.
On Fig. 8 we observe that, after an initial transient phase, the profiles stabilize to straight lines on a semi-logarithmic scale. This clearly indicates an exponential convergence rate of the form \(O \left( e^{-A t} \right) \) for \(\gamma =0\). Recall that, in this specific case, Theorem 4.4 states the rate \({\bar{f}}(x(t))= o \left( e^{-{\bar{\alpha }} t} \right) \) with \({\bar{\alpha }} := \frac{\delta }{\nu _0 + e_*} \). Through a linear regression of \(\ln {\bar{f}}(x( \cdot ))\) with respect to time (by means of a classical least squares method), we can easily estimate the values of A so as to compare them with \({\bar{\alpha }} \) (see Fig. 9). For all considered values of \(\delta \), we get \(A > {\bar{\alpha }} \), which confirms the decay rate \(o \left( e^{-{\bar{\alpha }} t} \right) \).
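The regression step just described can be sketched as follows (a Python sketch on synthetic data with a known decay rate, standing in for the computed values \({\bar{f}}(x(t))\)):

```python
import numpy as np

# Synthetic value curve fbar(t) = C * exp(-A_true * t), standing in for
# the computed values of fbar(x(t)); the initial transient is discarded.
t = np.linspace(5.0, 50.0, 200)
A_true, C = 0.8, 3.0
fbar_vals = C * np.exp(-A_true * t)

# Least-squares linear fit of ln fbar against t: the slope estimates -A.
slope, intercept = np.polyfit(t, np.log(fbar_vals), 1)
A_est = -slope
```

On exactly exponential data the fit recovers the true rate; on computed trajectories, the fitted slope plays the role of the estimated A compared with \({\bar{\alpha }}\) in Fig. 9.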
5.1.3 Example 3 (Influence of \(\gamma \) for a Non Strongly Convex Objective)
In this last example, we consider the nonsmooth objective defined for \(x\in \mathrm{I\!R}^2\) by \(f(x) = \Vert x\Vert _1\). Even though this is a very simple problem, it has the advantage of addressing the case of a convex objective which, unlike in the two previous examples, is not strongly convex. We assess the trajectories generated by the dynamics for various values of \(\gamma \) and the following setting of parameters: \(\delta =0\), \(e_*=2.5\), \(\nu _0=1\) and \(\vartheta _\infty =5\). The experiment was conducted for \(x_0=(200,200)\) (a starting point away from the minimizer of f) and \(\xi _0=y_0= (1,1)\); it can be easily checked that \(\xi _0 \in \partial f(x_0)\).
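For \(f=\Vert \cdot \Vert _1\), the Minty split \(v = x + \sigma \xi \) with \(\xi \in \partial f(x)\) reduces to componentwise soft-thresholding; the following Python sketch (function name ours) makes this explicit:

```python
import numpy as np

def minty_split_l1(v, sigma):
    """Minty split v = x + sigma*xi with xi in d||x||_1.

    The resolvent of sigma*d||.||_1 is soft-thresholding at level sigma,
    and xi = (v - x)/sigma then recovers a subgradient of ||.||_1.
    """
    v = np.asarray(v, dtype=float)
    x = np.sign(v) * np.maximum(np.abs(v) - sigma, 0.0)
    xi = (v - x) / sigma
    return x, xi
```

With the data above, \(v=x_0+\sigma \xi _0=(201,201)\) and \(\sigma =1\) split back into \(x_0=(200,200)\) and \(\xi _0=(1,1)\), and every returned \(\xi \) has components in \([-1,1]\), as required for a subgradient of \(\Vert \cdot \Vert _1\).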
It can be seen on Fig. 10 that the convergence is better for \(\gamma =1\), which is in accordance with the results of Theorems 4.3 and 4.4.
The last Fig. 11 highlights (through a zoom on the special case \(\gamma =0.5\) of the previous figure) regularity properties of the solution \((x(\cdot ), \xi (\cdot ))\), in which \(x(\cdot )=\big ( x_1(\cdot ), x_2(\cdot ) \big )\) and \(\xi (\cdot )=\big ( \xi _1(\cdot ), \xi _2(\cdot ) \big )\). It can be noticed that \(x_1(\cdot )\) behaves as an (absolutely) continuous function that reaches the minimizer of f (around the time \(t=8\)), while \(\xi _1(\cdot )\) appears to be differentiable almost everywhere and \(x_1(\cdot )+\sigma (\cdot ) \xi _1(\cdot )\) seems to be twice differentiable, in accordance with Proposition 2.3.
5.2 Perspectives on Discrete Variants
Inspired by system (1.4)–(1.5), and following the methodology of [30], we suggest a new inertial and corrected proximal algorithm for solving the structured convex minimization problem:
where \(f: {\mathcal {H}}\rightarrow (-\infty ,\infty ]\) is proper convex and l.s.c. while \(g: {\mathcal {H}}\rightarrow (-\infty , \infty )\) is convex and continuously differentiable.
However, the analysis of this algorithm is beyond the scope of this paper and will be carried out in future work.
In what follows, given some positive mappings \(\nu ( \cdot )\) and \(\vartheta ( \cdot )\), we set \(t_n= hn\) (for some positive value h), \(\nu _n=\nu (t_n)\), \(\vartheta _n=\vartheta (t_n)\), \(\theta _n=\theta (t_n)\) and \(\rho _n=\rho (t_n)\) for all \(n\ge 0\).
We also introduce the operator \(M_{\mu }\) defined for any \(\mu >0\) and any \(x\in {\mathcal {H}}\) by \(M_{\mu }(x):=\mu ^{-1}\big ( x - J_{\mu \partial f}(x-\mu \nabla g(x))\big )\). It is well-known that \(M_{\mu }\) satisfies \(M_{\mu }^{-1}(0)=(\partial f +\nabla g)^{-1}(0)=S\) and that it enjoys co-coercivity properties whenever \(\nabla g\) is Lipschitz continuous and \(\mu \) is small enough (see, e.g., [6]). For this reason, algorithms based on the computation of zeroes of \(M_{\mu _n}\) (for some \((\mu _n) \subset (0,\infty )\)) generally require bounded indices \((\mu _n)\), which precludes the benefit of a time rescaling process in structured minimization (see, e.g., Boţ-Hulett [18]).
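As an illustration of the residual operator \(M_\mu \), the following Python sketch instantiates it for the structured objective of Example 2 (\(f=\Vert \cdot \Vert _1\), \(g=\frac{1}{2}\Vert \cdot -b\Vert ^2\)); the function names are ours, and \(J_{\mu \partial f}\) is soft-thresholding:

```python
import numpy as np

def soft_threshold(u, tau):
    """Prox of tau*||.||_1, i.e. the resolvent J_{tau * d||.||_1}."""
    return np.sign(u) * np.maximum(np.abs(u) - tau, 0.0)

def M(x, mu, b):
    """Residual M_mu(x) = (x - J_{mu*df}(x - mu*grad g(x))) / mu
    for f = ||.||_1 and g = 0.5*||x - b||^2.

    M_mu vanishes exactly on S = (df + grad g)^{-1}(0).
    """
    x, b = np.asarray(x, dtype=float), np.asarray(b, dtype=float)
    grad_g = x - b
    return (x - soft_threshold(x - mu * grad_g, mu)) / mu
```

For \(b=(0,10)\), one checks that \(M_\mu \) vanishes at the minimizer \(x^*=(0,9)\) and is nonzero away from it.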
In order to solve (5.5), without time rescaling process, we consider the discrete model which consists of the sequences \(((z_n,x_n,y_n)) \subset {\mathcal {H}}^3\) generated by the following numerical scheme.
A discrete model Let \(\mu \) and t be positive constants and consider any starting elements \(\{z_{-1}, x_0,y_0\} \subset {\mathcal {H}}\). For \(n \ge 0\), given elements \(\{z_{n-1}, x_{ n}, y_{ n} \}\), we compute the updates by:
where “\(\eta _n (z_{ n-1}-x_{n} )\)” is a correction term with coefficient \(\eta _n:=1-h\omega _n\).
Remark 5.2
From an easy computation (noticing for \(\delta =0\) that \(\omega _n=\rho _n\vartheta _n\)) we get \(\eta _n =1-h \vartheta _n \rho _n\), or equivalently \(\vartheta _n =\frac{1-\eta _n}{h\rho _n}\). This suggests conversely that we can consider (5.6) with any nondecreasing sequence \((\eta _n) \subset [0,\epsilon ]\) (for some \(\epsilon \in [0,1) \)), just by taking \(\vartheta _n =\frac{1-\eta _n}{h\rho _n}\), since \((\rho _n)\) is positive and nondecreasing. So, \((\eta _n)\) is indeed a nondecreasing sequence that is bounded away from one.
Remark 5.3
The specificity of this scheme lies in the fact that the inertial corrected algorithms studied in the literature generally involve a correction coefficient such that \(\eta _n \rightarrow 0\) as \(n \rightarrow \infty \) (see, e.g., [26, 29, 30]), contrary to model (5.6), for which \(\eta _n \rightarrow 1-h\kappa \vartheta _{\infty }\).
The next result shows us that (5.6) can be regarded as a discrete counterpart of (1.4)–(1.5) in which \(\delta =0\) and f is replaced with \(f+g\) (with a multiplicative factor).
Proposition 5.1
Let \(((z_n,x_n,y_n))\) be any sequence generated by (5.6). For \(n\ge 1\), setting \(\xi _{ n}:=z_{n-1}-x_{n} \), we have
Proof
For \(n \ge 1\), by (5.6b) we have \(\xi _n=t M_{\mu }(z_{n-1})\), while it is easily checked that \(M_{\mu }(z_{n-1})\in \partial f (x_{n} )+ \nabla g(z_{n-1})\), which leads us to (5.7a). In addition, (5.6) readily yields \(z_n=x_n - h \theta _n (y_{ n} -x_{ n}) + (1-h\omega _{n})\xi _n \), which, by \(z_n=\xi _{n+1} + x_{n+1}\), entails
(5.7b) follows immediately, while (5.7c) is obvious from (5.6c). \(\square \)
Data Availability
We do not analyse or generate any datasets, because our work proceeds within a purely theoretical and mathematical approach. The relevant materials can be obtained from the references below.
References
Abbas, B., Attouch, H.: Dynamical systems and forward-backward algorithms associated with the sum of a convex subdifferential and a monotone cocoercive operator. Optimization 64, 2223–2252 (2015)
Abbas, B., Attouch, H., Svaiter, B.F.: Newton-like dynamics and forward-backward methods for structured monotone inclusions in Hilbert spaces. J. Optim. Theory Appl. 161(2), 331–360 (2014)
Alvarez, F., Attouch, H., Bolte, J., Redont, P.: A second-order gradient-like dissipative dynamical system with Hessian driven damping. Application to optimization and mechanics. J. Math. Pures Appl. 81(8), 747–779 (2002)
Apidopoulos, V., Aujol, J.-F., Dossal, Ch.: The differential inclusion modeling the FISTA algorithm and optimality of convergence rate in the case \(b \le 3\). SIAM J. Optim. 28(1), 551–574 (2018)
Attouch, H., Cabot, A.: Convergence of damped inertial dynamics governed by regularized maximally monotone operators. J. Differ. Equ. 264, 7138–7182 (2018)
Attouch, H., Cabot, A.: Convergence of a relaxed inertial forward-backward algorithm for structured monotone inclusions. Appl. Math. Optim. 80, 547–598 (2019)
Attouch, H., László, S.C.: Continuous Newton-like inertial dynamics for monotone inclusions. Set Valued Var. Anal. 29, 555–581 (2021)
Attouch, H., Peypouquet, J.: Convergence of inertial dynamics and proximal algorithms governed by maximally monotone operators. Math. Program. 174, 391–432 (2019). https://doi.org/10.1007/s10107-018-1252-x
Attouch, H., Svaiter, B.F.: A continuous dynamical Newton-like approach to solving monotone inclusions. SIAM J. Control Optim. 49(2), 574–598 (2011)
Attouch, H., Bolte, J., Redont, P.: Optimizing properties of an inertial dynamical system with geometric damping: Link with proximal methods. Control Cybern. 31, 643–657 (2002)
Attouch, H., Peypouquet, J., Redont, P.: Fast convex minimization via inertial dynamics with Hessian driven damping. J. Differ. Equ. 261, 5734–5783 (2016)
Attouch, H., Chbani, Z., Peypouquet, J., Redont, P.: Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity. Math. Program. 168(1–2), 123–175 (2018)
Attouch, H., Chbani, Z., Riahi, H.: Fast proximal methods via time scaling of damped inertial gradient dynamics. SIAM J. Optim. 29(3), 2227–2256 (2019)
Attouch, H., Chbani, Z., Riahi, H.: Fast convex optimization via time scaling of damped inertial gradients dynamics. Pure Appl. Funct. Anal. 6(6), 1081–1117 (2021)
Attouch, H., Balhag, A., Chbani, Z., Riahi, H.: Fast convex optimization via inertial combining viscous and Hessian-driven damping with time rescaling dynamics. Evol. Equ. Control Theory 11(2), 487–514 (2022)
Attouch, H., Chbani, Z., Fadili, J., Riahi, H.: First-order optimization algorithms via inertial systems with Hessian driven damping. Math. Program. 193, 113–155 (2022). https://doi.org/10.1007/s10107-020-01591-1
Attouch, H., Chbani, Z., Fadili, J., Riahi, H.: Convergence of iterates for first-order optimization algorithms with inertia and Hessian driven damping. Optimization 72(5), 1199–1238 (2023). https://doi.org/10.1080/02331934.2021.2009828
Boţ, R.I., Hulett, D.A.: Second order splitting dynamics with vanishing damping for additively structured monotone inclusions. J. Dyn. Differ. Equ. (2022). https://doi.org/10.1007/s10884-022-10160-3
Boţ, R.I., Karapetyants, M.A.: A fast continuous time approach with time scaling for nonsmooth convex optimization. Adv. Cont. Disc. Models (2022). https://doi.org/10.1186/s13662-022-03744-2
Boţ, R.I., Csetnek, E., László, S.C.: On the strong convergence of continuous Newton-like inertial dynamics with Tikhonov regularization for monotone inclusions. J. Math. Anal. Appl. (2023). https://doi.org/10.13140/RG.2.2.20539.18729
Brezis, H.: Opérateurs maximaux monotones et semi-groupes de contractions dans les espaces de Hilbert. Math. Stud., vol. 5. North-Holland, Amsterdam (1973)
Brezis, H.: Functional Analysis, Sobolev Spaces and Partial Differential Equations. Springer, New York (2010)
Cabot, A., Engler, H., Gadat, S.: On the long time behavior of second order differential equations with asymptotically small dissipation. Trans. Am. Math. Soc. 361, 5983–6017 (2009)
Cabot, A., Engler, H., Gadat, S.: Second order differential equations with asymptotically small dissipation and piecewise flat potentials. Electron. J. Differ. Equ. 17, 33–38 (2009)
Haraux, A.: Systèmes dynamiques dissipatifs et applications, RMA17. Masson, Paris (1991)
Kim, D.: Accelerated proximal point method for maximally monotone operators. Math. Program. 190, 57–87 (2021)
Labarre, F., Maingé, P.E.: First-order frameworks for continuous Newton-like dynamics governed by maximally monotone operators. Set Valued Var. Anal. 20(2), 425–451 (2022). https://doi.org/10.1007/s11228-021-00593-1
Luo, H.: Accelerated differential inclusion for convex optimization. Optimization 72(5), 1139–1170 (2023). https://doi.org/10.1080/02331934.2021.2002327
Maingé, P.E.: Accelerated proximal algorithms with a correction term for monotone inclusions. Appl. Math. Optim. 84(Suppl 2), 2027–2061 (2021)
Maingé, P.E., Weng-Law, A.: Fast continuous dynamics inside the graph of maximally monotone operators. Set Valued Var. Anal. (2023). https://doi.org/10.1007/s11228-023-00663-6
May, R.: Asymptotic for a second order evolution equation with convex potential and vanishing damping term. Turk. J. Math. 41(3), 681–785 (2015). https://doi.org/10.3906/mat-1512-28
Minty, G.J.: Monotone (nonlinear) operators in Hilbert spaces. Duke Math. J. 29, 341–346 (1962)
Opial, Z.: Weak convergence of the sequence of successive approximations for nonexpansive mappings. Bull. Am. Math. Soc. 73, 591–597 (1967)
Polyak, B.T.: Some methods of speeding up the convergence of iterative methods. USSR Comput. Math. Math. Phys. 4(5), 1–17 (1964)
Qu, X., Bian, W.: Fast inertial dynamic algorithm with smoothing method for nonsmooth convex optimization. Comput. Optim. Appl. 83, 287–317 (2022)
Shi, B., Du, S.S., Jordan, M.I., Su, W.J.: Understanding the acceleration phenomenon via high-resolution differential equations. Math. Program. 195, 79–148 (2022). https://doi.org/10.1007/s10107-021-01681-8
Sontag, E.D.: Mathematical Control Theory, 2nd edn. Springer, New York (1998)
Su, W., Boyd, S., Candès, E.J.: A differential equation for modeling Nesterov’s accelerated gradient method: theory and insights. Neural Inf. Process. Syst. 27, 2510–2518 (2014)
Funding
The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
1.1 Proof of Proposition 2.1
Let us prove that (i1) \(\Rightarrow \) (i2). Consider a solution \((x,\xi ) \in {\mathcal {A}}_c\times {\mathcal {A}}_c\) to (2.2). Set \(\zeta ( \cdot )=\sigma ( \cdot ) \xi ( \cdot )\) and suppose that \(x( \cdot )+ \zeta ( \cdot )\) is of class \(C^1\) and that \(\left( x( \cdot )+ \zeta ( \cdot ) \right) ^{(1)} \in {\mathcal {A}}_c\). Clearly, for \(t \ge 0\), as \(\dot{x}\), \({\dot{\zeta }}\) and \(\zeta \) are integrable on [0, t] (since x and \(\zeta \) belong to \({\mathcal {A}}_c\)), we can set as a well-defined quantity
Hence, \(z \in {\mathcal {A}}_c\), and by differentiating (1.9) we obtain
Therefore, by (2.2b) together with the above equality, we get
Moreover, recalling that \(\big (x( \cdot ) + \zeta ( \cdot ) \big )^{(1)} \in {\mathcal {A}}_c\) and \(z \in {\mathcal {A}}_c\), we get
\(\frac{d}{dt} \left( \big (x( \cdot ) + \zeta ( \cdot ) \big )^{(1)} + z( \cdot ) \right) (t) = \big (x( \cdot ) + \zeta ( \cdot ) \big )^{(2)}(t)+ \dot{z}(t)\), for a.e. \(t \ge 0\). Hence, we straightforwardly deduce
It follows immediately that
which, by the initial condition \(\big ( x( \cdot ) + \sigma ( \cdot )\xi ( \cdot )\big )^{(1)}(0) = q_0 \), yields
which readily implies that
Multiplying (1.15) by \(\beta (t)\) and adding the resulting equality to (1.10) give us
Now, since \(\theta ( \cdot )\) is positive, we introduce the function \(y( \cdot )\) defined for \(t \ge 0\) by
For simplification we also set \(u(t)=y(t)-x(t)\). Observe from (1.17) that we equivalently have
Thus, for \(t \ge 0\), (1.14) in light of the above equality entails
that is (2.4b). We now prove (2.4c). Differentiating (1.18), while noticing that \(\{x, \xi , u\} \subset {\mathcal {A}}_c\), readily implies, for a.e. \(t\ge 0\),
Moreover, using the definitions of \(\alpha ( \cdot )\), \(\beta ( \cdot )\) and \(b( \cdot )\) given by (1.3), namely \(\alpha (t) =-\frac{\dot{\theta }(t) }{\theta (t)} + \kappa -\theta (t)\), \(\beta (t) = -\frac{{\dot{\theta }}(t) }{\theta (t)} + \kappa +\omega (t) \) and \(b(t)= \omega (t) \left( \kappa +\frac{{\dot{\omega }} (t)}{\omega (t)}-\frac{\dot{\theta }(t)}{\theta (t)} \right) \), yields
Hence, by (1.16) and using (1.20), (1.18), \(\dot{u}=\dot{y} - \dot{x}\) and (1.21a), successively, we get, for a.e. \(t\ge 0\),
namely
In addition, by (1.19), while recalling that \(\{x,\xi \} \subset {\mathcal {A}}_c\), we obtain
Thus, combining (1.22) and (1.23), in light of \(\theta \ne 0\), yields
\(\dot{y}(t)+\kappa \, u(t)=0\) for a.e. \(t\ge 0\),
that is (2.4c).
Finally, regarding the initial conditions, we have \((x(0),\xi (0))= (x_0,\xi _0)\) and \(\big (x( \cdot ) + \zeta ( \cdot )\big )^{(1)}(0)= q_0\) (according to (i1)), while (2.4b) at time \(t=0\) ensures that
\(\big ( x( \cdot ) + \zeta ( \cdot ) \big )^{(1)} (0) + \theta (0) \left( y(0)-x(0) \right) + \omega (0) \zeta (0) =0\).
Hence, we deduce that \(y(0) =x_0-\frac{1}{\theta (0)} \left( q_0 + \sigma (0) \omega (0) \xi _0 \right) \).
Let us prove that (i2) \(\Rightarrow \) (i1). Consider a solution \((x,\xi ,y) \in {\mathcal {A}}_c\times {\mathcal {A}}_c\times C^1\) to (2.4). For simplification, we set again \(u( \cdot )=y( \cdot )-x( \cdot )\) and \(\zeta ( \cdot )=\sigma ( \cdot ) \xi ( \cdot )\). Clearly, by (2.4b), we have, for \(t \ge 0\),
This, by \((x,\xi ,y) \in {\mathcal {A}}_c\times {\mathcal {A}}_c\times C^1\) and by \(\{\omega ( \cdot )\), \(\theta ( \cdot )\), \(\sigma ( \cdot )\} \subset C^1([0, \infty ))\), entails that \(x( \cdot )+\zeta ( \cdot )\) is of class \(C^1\) and that \(\big ( x( \cdot ) + \sigma ( \cdot )\xi ( \cdot ) \big )^{(1)} \in {\mathcal {A}}_c\). Then, differentiating (1.24) gives us, for a.e. \(t \ge 0\),
while we know from (2.4c) that \(\dot{y}(t)= -\kappa u(t) \). Consequently, we readily obtain, for a.e. \(t \ge 0\),
Furthermore, for a.e. \(t \ge 0\), by (1.24) we readily have
\(u(t)=-\frac{1}{\theta (t)} \big ( ( x( \cdot )+ \zeta ( \cdot )) ^{(1)} + \omega (t) \zeta (t) \big )\), which, by (1.26), entails
Hence, in light of the expressions of \(\alpha ( \cdot ), \beta ( \cdot )\) and \(b( \cdot )\) defined in (1.3), this amounts to (2.2b). In addition, from the initial conditions in (i2), we have \(~x(0)=x_0\), \(~\xi (0)=\xi _0\) and \(y(0) =x_0-\frac{1}{\theta (0)} \left( q_0 + \sigma (0) \omega (0) \xi _0 \right) \). This implies that \(y(0) =x(0)-\frac{1}{\theta (0)} \left( q_0 + \sigma (0) \omega (0) \xi (0) \right) \), while (2.4b) at time \(t=0\) yields
\(\big ( x( \cdot ) + \zeta ( \cdot ) \big )^{(1)} (0) + \theta (0) \left( y(0)-x(0) \right) + \omega (0) \zeta (0) =0\).
Hence, substituting the former of the last two equalities into the latter gives us \(\big ( x( \cdot ) + \zeta ( \cdot )\big )^{(1)}(0) = q_0 \). \(\square \)
1.2 Proof of Proposition 2.2
1.2.1 The Yosida Regularization
Some useful properties of the Yosida regularization are recalled through the lemma below established in [27] (see also [8, 21, 22]).
Lemma 1.1
Let \(A: {\mathcal {H}}\rightrightarrows {\mathcal {H}}\) be a maximally monotone operator such that \(S:=A^{-1}(0) \ne \emptyset \). Let \(\gamma ,\delta > 0\) and \(x, y \in {\mathcal {H}}\). Then, for \(z \in S\), we have
Proof
The proof of (1.27) can be found in [8]. For proving (1.28), we simply observe that \(A_{\gamma }x- A_{\delta }y= \frac{1}{ \delta } \left( \delta A_{\gamma }x- \delta A_{\delta }y\right) = \frac{1}{ \delta } \left( ( \delta -\gamma ) A_{\gamma }x + ( \gamma A_{\gamma }x -\delta A_{\delta }y )\right) \), from which we get \(\Vert A_{\gamma }x- A_{\delta }y \Vert \le \frac{1}{ \delta } \left( | \delta -\gamma | \, \Vert A_{\gamma }x \Vert + \Vert \gamma A_{\gamma }x -\delta A_{\delta }y\Vert \right) \).
Consequently, by \( \Vert A_{\gamma }x\Vert \le \frac{1}{\gamma } \Vert x-z\Vert \) and using (1.27), we obtain (1.28). \(\square \)
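The Lipschitz-type dependence of \(A_{\gamma }x\) on the index \(\gamma \) stated in (1.28) can be probed numerically; the sketch below (Python, our own instantiation) checks the bound used later in step (a) of the proof of Proposition 2.2, for the one-dimensional operator \(A=\partial | \cdot |\) and \(z=0 \in A^{-1}(0)\):

```python
import numpy as np

def yosida_abs(x, gamma):
    """Yosida approximation A_gamma of A = d|.| in one dimension:
    A_gamma(x) = x/gamma if |x| <= gamma, and sign(x) otherwise."""
    return float(np.clip(x / gamma, -1.0, 1.0))

# Check ||A_gamma(y) - A_delta(y)|| <= 3*|gamma - delta|/sigma0^2 * |y - z|
# on a grid, with z = 0 and indices gamma, delta >= sigma0 > 0.
sigma0 = 0.5
ok = True
for y in np.linspace(-5.0, 5.0, 101):
    for gamma in np.linspace(sigma0, 3.0, 26):
        for delta in np.linspace(sigma0, 3.0, 26):
            lhs = abs(yosida_abs(y, gamma) - yosida_abs(y, delta))
            rhs = 3.0 * abs(gamma - delta) / sigma0**2 * abs(y)
            if lhs > rhs + 1e-12:
                ok = False
```

In this one-dimensional case the bound in fact holds without the factor 3, so the check passes with room to spare.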
1.2.2 Main Proof of the Proposition
The proof follows the same lines as in [30] (see also [1, 2]), but it is developed with full details through the following steps (s1)–(s3):
(s1) We begin by reformulating the (possibly) existing strong global solutions to (1.4) (that are supposed to satisfy (2.3)) by means of the Minty representation of the maximally monotone operator \(\partial f\) (see [32]). Set \(J_{ \sigma ( \cdot )}^{ \partial f}=\big (I+\sigma ( \cdot ) \partial f\big )^{-1}\) and \((\partial f)_{\sigma ( \cdot ) }= \frac{1}{\sigma ( \cdot )} \big (I-J_{\sigma ( \cdot ) } ^{ \partial f}\big )\), namely the resolvent and the Yosida approximation of \(\partial f\) (with index \(\sigma ( \cdot )\)), respectively, which are well-known to be single-valued and everywhere defined. Associated with any strong global solution \((x( \cdot ),\xi ( \cdot ),y( \cdot ))\) to (1.4), we introduce the new unknown function
It is readily seen that \(v( \cdot )\) belongs to \({\mathcal {A}}_c\) (the set of absolutely continuous functions) and that
\(v(0)= x_0+ \sigma (0) \xi _0\).
Moreover, for \(t \ge 0\), by \(\xi (t) \in \partial f (x(t))\) we obtain \(v(t) \in x(t)+ \sigma (t) \partial f(x(t))\) and \( \xi (t)=\frac{1}{\sigma (t)}(v(t)-x(t))\), hence, by Minty’s representation we simply have
Differentiating (1.29), in light of (2.3b), gives us, for a.e. \(t \ge 0\),
\(\dot{v}(t)=\dot{x}(t)+ (\sigma ( \cdot ) \xi ( \cdot ))^{(1)}(t) = -\theta (t) (y(t)-x(t))-\omega (t) \sigma (t) \xi (t)\), hence, by (1.30), we obtain
Hence, from (2.3), we deduce that \((v( \cdot ),y( \cdot ))\) are implicitly given, for a.e. \(t \in [0,\infty )\), by
together with \(y(0)=y_0\) and \(v(0)=x_0+ \sigma (0) \xi _0\).
This shows us that any strong global solution \((x( \cdot ),\xi ( \cdot ),y( \cdot ))\) to (1.4) is entirely determined (thanks to the two formulas in (1.30)) by some (strong) solution \((v( \cdot ),y( \cdot ))\) to (1.31). So, for proving existence and uniqueness of a strong global solution to (1.4), it suffices to establish (as argued below) the existence and uniqueness of a (strong) global solution \((v( \cdot ),y( \cdot ))\) to (1.31), together with the existence of a strong global solution \((x( \cdot ),\xi ( \cdot ),y( \cdot ))\) to (1.4).
(s2) Existence, uniqueness and regularity of a (strong) global solution \((v( \cdot ),y( \cdot ))\) to (1.31). First, we show that (1.31) falls within the scope of the Cauchy–Lipschitz theorem. Indeed, (1.31) can be expressed as
where \(U( \cdot )=(v( \cdot ),y( \cdot ))\) and \(F(t,.): {\mathcal {H}}^2 \rightarrow {\mathcal {H}}^2\) is defined for any \(t \ge 0\) and \( ({\bar{v}}, {\bar{y}}) \in {\mathcal {H}}^2\) by \(F(t,({\bar{v}},{\bar{y}}))= \big ( \phi _1(t,({\bar{v}},{\bar{y}})), \phi _2(t,({\bar{v}},{\bar{y}})) \big )\), together with
In view of applying the global Cauchy–Lipschitz theorem, we establish two main properties on F(., .) through the following items (a) and (b):
(a) Given \(({\bar{v}},{\bar{y}}) \in {\mathcal {H}}^2\), we prove that \(F(., (\bar{v},{\bar{y}}))\) is continuous on \([0,\infty )\). Indeed, let \(z \in (\partial f)^{-1}(0)\) and \((t_1,t_2) \in [0,\infty )^2\). By Lemma 1.1 with \(A= \partial f\) and \(\sigma ( \cdot ) \ge \sigma _0 >0\) (from (1.5b)), we obtain \( \Vert (\partial f)_{\sigma (t_1) }{\bar{y}}- (\partial f)_{\sigma (t_2) }{\bar{y}} \Vert \le 3 \frac{ | \sigma (t_1) - \sigma (t_2) |}{\sigma _0^2}\Vert {\bar{y}}-z\Vert \).
Then, the continuity of \(\sigma ( \cdot )\) on \([0,\infty )\) yields that the mappings \(t \rightarrow (\partial f)_{\sigma (t) }{\bar{y}}\) and \(t \rightarrow J_{\sigma (t)}^{\partial f}{\bar{v}} \) (given by \(J_{\sigma (t)}^{\partial f}{\bar{v}}:={\bar{v}}-\sigma (t) (\partial f)_{\sigma (t)}{\bar{v}}\)) are also continuous on \([0,\infty )\). So, in light of the definition of \(\phi _1(.,.)\) and \(\phi _2(.,.)\), together with the continuity of \(\{ \theta ( \cdot ), \omega ( \cdot ) \}\), we infer that \(F(.,(\bar{v},{\bar{y}}))\) is continuous on \([0,\infty )\) (as are \(\phi _1(.,.)\) and \(\phi _2(.,.)\)).
(b) Given \(t \ge 0\), we prove that F(t, .) is \(\iota (t)\)-Lipschitz continuous on \({\mathcal {H}}^2\), for some continuous mapping \(\iota : [0,\infty ) \rightarrow [0,\infty )\). Indeed, for \((v_i,y_i)\in {\mathcal {H}}^2\) (for \(i=1,2\)), while noticing that \(J_{\sigma (t)}^{\partial f}\) and \(\frac{1}{2}\sigma (t) (\partial f)_{\sigma (t)}\) are nonexpansive on \({\mathcal {H}}\), by (1.33) we get
while an easy computation gives us
It follows from the previous arguments that F(t, .) satisfies
hence \(F(t, .)\) is \( \iota (t)\)-Lipschitz continuous on \({\mathcal {H}}^2\) with \( \iota ( \cdot )=\theta ( \cdot )+2 \omega ( \cdot )+ \kappa \), which is continuous (by the continuity of \(\theta ( \cdot )\) and \(\omega ( \cdot )\)).
Thus, for any given \((x_0,y_0,\xi _0)\in {\mathcal {H}}^3\), applying the global Cauchy–Lipschitz theorem yields existence and uniqueness of a global classical solution \((v( \cdot ),y( \cdot ))\) to (1.31) (namely, \(y( \cdot )\) and \(v( \cdot )\) are of class \(C^1\)) such that \(y(0)=y_0\) and \(v(0)=x_0+ \sigma (0) \xi _0\). Furthermore, the previous arguments (a) and (b) guarantee existence and uniqueness of a strong global solution \((v( \cdot ),y( \cdot ))\) to the same problem (1.31), by invoking the version of the Cauchy–Lipschitz theorem involving absolutely continuous trajectories, see for example [25, Proposition 6.2.1.], [37, Theorem 54].
(s3) Existence of a strong global solution \((x( \cdot ),\xi ( \cdot ),y( \cdot ))\) to (1.4). Let \((x_0,\xi _0,y_0) \in {\mathcal {H}}^3\) be such that \( \xi _0 \in \partial f (x_0)\). Given a global classical solution \((v( \cdot ),y( \cdot ))\) to (1.31) such that \(y(0)=y_0\) and \(v(0)=x_0+ \sigma (0) \xi _0\), we consider the functions \(x( \cdot )\) and \(\xi ( \cdot )\) defined by (1.30), and we show through the following items (s3-a)–(s3-b) that \((x( \cdot ),\xi ( \cdot ),y( \cdot ))\) is a strong global solution to (1.4):
(s3-a) Let us prove the absolute continuity of \(x( \cdot )\) and \(\xi ( \cdot )\) on any bounded subset of \([0, \infty )\). As \(v( \cdot )\) is of class \(C^1\) on \([0, \infty )\), we immediately see that \(v( \cdot )\) is absolutely continuous from the characterization (i1) of Definition 2.1. Hence, given \(\epsilon >0\) and finitely many intervals \(I_k=(a_k,b_k)\) such that \(I_k \cap I_j = \emptyset \) (for \(k \ne j\)), by Definition 2.1-(i3) there exists \(\eta >0\) such that \(\sum _k |b_k-a_k| \le \eta \) implies \(\sum _k \Vert v(b_k)-v(a_k)\Vert \le \min (\epsilon , \frac{\sigma _0 \epsilon }{2}).\) So, invoking the non-expansiveness of \(J_{\sigma ( \cdot )}^{ \partial f}\) and \( \frac{1}{2} \sigma ( \cdot ) (\partial f)_{\sigma ( \cdot )}\) while recalling that \(\sigma ( \cdot ) \ge \sigma _0 >0\) entails that
\(\sum _k \Vert J_{\sigma (t)}^{ \partial f} v(b_k)- J_{\sigma (t)}^{\partial f} v (a_k)\Vert \le \sum _k \Vert v(b_k)- v (a_k)\Vert \le \epsilon \)
and that
\(\sum _k \Vert (\partial f)_{\sigma (t)} v(b_k)-(\partial f)_{\sigma (t)} v(a_k)\Vert \le \sum _k \frac{2}{\sigma (t)}\Vert v(b_k)-v(a_k)\Vert \le \frac{2}{\sigma _0} \sum _k \Vert v(b_k)-v(a_k)\Vert \le \epsilon \).
Consequently, the mappings \(x( \cdot )= J_{\sigma ( \cdot )}^{\partial f }v( \cdot )\) and \(\xi ( \cdot ) = (\partial f)_{\sigma ( \cdot )} v( \cdot )\) also comply with characterization (i3) of Definition 2.1, which proves that \(x( \cdot )\) and \(\xi ( \cdot )\) are absolutely continuous on any bounded subset of \([0, \infty )\).
(s3-b) Let us show that the triplet \((x( \cdot ),y( \cdot ),\xi ( \cdot ))\) satisfies system (2.3). Indeed, by \(x( \cdot )= J_{\sigma ( \cdot )}^{\partial f }v( \cdot )\) (from (1.30)), and \(\sigma ( \cdot ) \xi ( \cdot )=v( \cdot )-x( \cdot )\), we readily deduce that \(\sigma ( \cdot ) \xi ( \cdot ) \in \left( \sigma ( \cdot ) \partial f \right) (x( \cdot ))\) (because \(v( \cdot ) \in x( \cdot ) + {\sigma ( \cdot )}\partial f (x( \cdot ))\)), which by the positivity of \(\sigma ( \cdot )\) proves (2.3a). Moreover, in (1.31), substituting \(v( \cdot )\), \( J_{\sigma ( \cdot )}^{\partial f }v( \cdot )\) and \({\sigma ( \cdot )} (\partial f)_{\sigma ( \cdot )}v( \cdot )\) by \(x( \cdot ) + \sigma ( \cdot ) \xi ( \cdot )\), \(x( \cdot )\) and \(\sigma ( \cdot ) \xi ( \cdot )\), respectively, immediately gives (2.3b) and (2.3c). In addition, regarding the initial conditions we obtain \(x(0)=J_{\sigma (0)}^{\partial f } v(0)=x_0\) (since \(v(0)=x_0+\sigma (0) \xi _0\) and \( \xi _0 \in \partial f (x_0)\)), \(y(0)=y_0\) and \(\sigma (0) \xi (0)=v(0)-x(0)=\sigma (0) \xi _0\).
Consequently, by items (s3-a)–(s3-b), we get the existence of a strong global solution to (1.4). \(\square \)
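The recovery mechanism of step (s3) — encoding a pair \((x,\xi )\) with \(\xi \in \partial f(x)\) through the single variable \(v = x + \sigma \xi \), then retrieving \(x = J_{\sigma }^{\partial f}v\) and \(\xi = (\partial f)_{\sigma }v\) — can be illustrated on the model function \(f(x)=|x|\); the numerical values below are hypothetical, chosen only so that \(\xi _0 \in \partial f(x_0)\):

```python
# Illustration (hypothetical data, f(x) = |x| as a model objective):
# the Minty parametrization stores the pair (x, xi), with xi in df(x),
# in the single variable v = x + s*xi, and recovers
# x = J_s^{df} v and xi = (df)_s v.  We check this round trip.

def resolvent(v, s):          # soft-thresholding = J_s^{df} for f = |.|
    return v - s if v > s else (v + s if v < -s else 0.0)

def yosida(v, s):             # (df)_s v = (v - J_s^{df} v)/s
    return (v - resolvent(v, s)) / s

s = 2.0
for x0, xi0 in [(3.0, 1.0), (-1.5, -1.0), (0.0, 0.4)]:  # xi0 in d|.|(x0)
    v0 = x0 + s * xi0         # Minty parametrization v = x + s*xi
    assert abs(resolvent(v0, s) - x0) < 1e-12
    assert abs(yosida(v0, s) - xi0) < 1e-12
print("round trip OK")
```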
1.3 Proof of Lemma 3.2
(See [30, Lemma 4.1]). Given \(t\in [ 0,\infty )\) and \(h\in (0,\infty )\), we have \(\xi (t)\in \partial f(x(t))\) and \(\xi (t+h)\in \partial f (x(t+h))\); hence, by monotonicity of \(\partial f\), we simply have
\( \left\langle \frac{\xi (t+h)-\xi (t)}{h},\frac{x(t+h)-x(t)}{h}\right\rangle \ge 0. \)   (1.35)
Clearly, assuming that \(x( \cdot )\) and \(\xi ( \cdot )\) are absolutely continuous on \([0,\infty )\) yields that, for a.e. \(t \in [0,\infty )\), and as \(h \rightarrow 0^+\), we have
\( \frac{x(t+h)-x(t)}{h} \rightarrow \dot{x}(t) \quad \text { and } \quad \frac{\xi (t+h)-\xi (t)}{h} \rightarrow \dot{\xi }(t). \)
Thus, letting h tend to \(0^+\) in (1.35) yields \(\langle \dot{\xi }(t),\dot{x}(t)\rangle \ge 0\), which is the desired result. \(\square \)
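A discrete illustration of this difference-quotient argument (hypothetical data, with the model function \(f(x)=|x|\) and a trajectory crossing the kink at the origin):

```python
# Discrete check (illustrative, not from the paper): for f(x) = |x| and
# a trajectory x(t) with a selection xi(t) in df(x(t)), the quotient
# <(xi(t+h)-xi(t))/h, (x(t+h)-x(t))/h> is nonnegative for every h > 0,
# by monotonicity of the subdifferential.

def x(t):
    return 1.0 - t            # hypothetical trajectory; crosses 0 at t = 1

def xi(t):                    # a selection xi(t) in d|.|(x(t))
    return 1.0 if x(t) > 0 else (-1.0 if x(t) < 0 else 0.0)

ok = True
for t in [0.0, 0.5, 0.9, 1.0, 1.3]:
    for h in [1e-3, 1e-2, 0.5]:
        q = ((xi(t + h) - xi(t)) / h) * ((x(t + h) - x(t)) / h)
        ok = ok and q >= 0.0
print(ok)
```

The interesting cases are those where the trajectory crosses the kink: there the subgradient jumps from \(1\) to \(-1\) while \(x\) decreases, and the quotient is strictly positive.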
1.4 Proof of Proposition 4.3
By Remark 3.1, we know under condition (1.15) that \(\theta ( \cdot )\) is well-defined and positive on \([0,\infty )\). Moreover, since \(\nu ( \cdot )\) is of class \(C^2\) and \(\theta ( \cdot )=\frac{\kappa \nu ( \cdot ) -\dot{\nu }( \cdot )}{\nu ( \cdot )+ e_*}\) (hence \( \theta ( \cdot )=\kappa -\frac{ \dot{\nu }( \cdot )+\kappa e_*}{\nu ( \cdot ) + e_*}\)), we can see that \(\theta ( \cdot ) \in C^1([0,\infty ))\) and (omitting the variable t) we readily get
\( \dot{\theta } = \frac{(\dot{\nu }+\kappa e_*)\dot{\nu } - \ddot{\nu }(\nu + e_*)}{(\nu + e_*)^2}. \)
Let us recall (from (1.3)) that \(\alpha := -\frac{\dot{\theta }}{\theta } + \kappa - \theta \). Consequently, by the previous arguments we obtain
\(\alpha := \frac{1}{\theta } (\theta ( \kappa - \theta ) - {\dot{\theta }}) = \frac{1}{\kappa \nu - {\dot{\nu }}} \left( (\dot{\nu } +\kappa e_*) \frac{\kappa \nu - 2 {\dot{\nu }}}{\nu + e_*} + \ddot{\nu }\right) \).
So, as \(t \rightarrow \infty \), since \(\nu (t) \rightarrow +\infty \), \(\dot{\nu }(t) \rightarrow l \in [0,\infty )\) and \(\ddot{\nu }(t) \rightarrow 0\) (from (4.40)), we immediately obtain \(\alpha (t) \sim \frac{ l +\kappa e_*}{\nu (t) + e_*}\). We also recall (from (1.3)) that \(\beta :=-\frac{\dot{\theta }}{{\theta }}+ \kappa +\omega \); hence, by the definition of \(\alpha \), we equivalently have \(\beta =\alpha + \theta + \omega \). Moreover, since \(\omega :=\big (\kappa -\frac{\dot{\nu }}{\nu } \big )\vartheta -\frac{\delta }{\nu +e_*}\) (from (1.5c)), while \(\nu (t) \rightarrow \infty \), \({\dot{\nu }}(t) \rightarrow l\) and \(\vartheta (t) \rightarrow \vartheta _\infty \), we readily deduce that \(\omega (t) \rightarrow \kappa \vartheta _\infty \) as \(t \rightarrow \infty \). Then, by the latter formulation of \(\beta \), and remembering that \(\theta (t) \rightarrow \kappa \), \(\alpha (t) \rightarrow 0\) and \(\omega (t) \rightarrow \kappa \vartheta _\infty \) (as \(t \rightarrow \infty \)), we get \(\beta (t) \rightarrow \kappa (1+\vartheta _\infty )\). Again from (1.3) we simply have
\(b(t):=\omega (t) \left( \kappa +\frac{{\dot{\omega }}(t)}{ \omega (t)}- \frac{\dot{\theta }(t)}{{\theta (t)}} \right) = \omega (t) \left( \kappa - \frac{\dot{\theta }(t)}{{\theta (t)}} \right) + {\dot{\omega }}(t)\).
In addition, we obviously see from its expression that \(\omega ( \cdot )\) is of class \(C^1\), and (omitting the variable t) we have
\( \dot{\omega } = \Big ( \frac{\dot{\nu }^2 - \ddot{\nu }\nu }{\nu ^2} \Big ) \vartheta + \Big ( \kappa -\frac{\dot{\nu }}{\nu } \Big ) \dot{\vartheta } + \frac{\delta \dot{\nu }}{(\nu +e_*)^2}. \)
It is also classically deduced from the convergence of \(\vartheta \) and the Lipschitz continuity of \({\dot{\vartheta }}\) that \(\dot{\vartheta }(t) \rightarrow 0\) (as \(t \rightarrow \infty \)). This, in light of condition (4.40) and \(\lim _{t \rightarrow \infty } \vartheta (t)= \vartheta _\infty \), entails that \({\dot{\omega }}(t) \rightarrow 0\) (as \(t \rightarrow \infty \)). Consequently, as \(t \rightarrow \infty \), by the previous formulation of b together with \(\frac{\dot{\theta }(t)}{{\theta }(t)} \rightarrow 0\), \(\omega (t) \rightarrow \kappa \vartheta _\infty \) and \({\dot{\omega }}(t) \rightarrow 0\), we deduce that \( b(t) \rightarrow \kappa ^2 \vartheta _\infty \).
\(\square \)
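The limits obtained in this proof can be probed numerically for a concrete choice of data satisfying (4.40); the values of \(\nu \), \(\kappa \), \(e_*\), \(\delta \) and \(\vartheta \) below are illustrative assumptions, not taken from the paper:

```python
# Numerical check (illustrative, not from the paper) of the limits
# derived above, with the hypothetical data nu(t) = t + 2 (so
# dnu -> l = 1, ddnu = 0), kappa = 2, e_* = 1, delta = 1, and a
# constant vartheta(t) = vartheta_inf = 0.5.

kappa, e_star, delta, v_inf = 2.0, 1.0, 1.0, 0.5

def nu(t):  return t + 2.0
def dnu(t): return 1.0

def theta(t):                 # theta = (kappa*nu - dnu)/(nu + e_*)
    return (kappa * nu(t) - dnu(t)) / (nu(t) + e_star)

def omega(t):                 # omega = (kappa - dnu/nu)*vartheta - delta/(nu + e_*)
    return (kappa - dnu(t) / nu(t)) * v_inf - delta / (nu(t) + e_star)

def dtheta(t, h=1e-6):        # central finite difference for d(theta)/dt
    return (theta(t + h) - theta(t - h)) / (2.0 * h)

def alpha(t):                 # alpha = -dtheta/theta + kappa - theta
    return -dtheta(t) / theta(t) + kappa - theta(t)

def alpha_nu(t):              # closed form of alpha in terms of nu (ddnu = 0)
    return ((dnu(t) + kappa * e_star)
            * (kappa * nu(t) - 2.0 * dnu(t)) / (nu(t) + e_star)) \
           / (kappa * nu(t) - dnu(t))

assert abs(alpha(5.0) - alpha_nu(5.0)) < 1e-6   # identity check at finite t

T = 1e6                       # probe the asymptotic regime
assert abs(theta(T) - kappa) < 1e-3             # theta(t) -> kappa
assert abs(alpha(T)) < 1e-3                     # alpha(t) -> 0
assert abs(omega(T) - kappa * v_inf) < 1e-3     # omega(t) -> kappa*vartheta_inf
beta = alpha(T) + theta(T) + omega(T)
assert abs(beta - kappa * (1.0 + v_inf)) < 1e-2 # beta -> kappa*(1+vartheta_inf)
print("asymptotics OK")
```

With this choice, \(l = 1\) and \(\alpha (t) \sim (l+\kappa e_*)/(\nu (t)+e_*)\) decays like \(3/t\), consistent with the equivalence stated above.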
Maingé, PE., Weng-Law, A. Fast Continuous Dynamics Inside the Graph of Subdifferentials of Nonsmooth Convex Functions. Appl Math Optim 89, 1 (2024). https://doi.org/10.1007/s00245-023-10055-9
Keywords
- Nonsmooth minimization
- Differential equations
- Dissipative dynamical systems
- Nonsmooth convex minimization
- Damped inertial dynamics
- Yosida approximation
- Coupled systems
- Nesterov acceleration