On the convergence of a class of inertial dynamical systems with Tikhonov regularization

Xu, Bo; Wen, Bo

doi:10.1007/s11590-020-01663-3


Original Paper
Published: 14 November 2020

Volume 15, pages 2025–2052, (2021)

Optimization Letters


Abstract

We consider a class of inertial second order dynamical systems with Tikhonov regularization, which can be applied to the minimization of a smooth convex function. Under appropriate choices of the parameters in the dynamical system, we first show that the function value along the trajectories converges to the optimal value, and prove that the convergence rate can be faster than $o(1/t^2)$. Moreover, by constructing a proper energy function, we prove that the trajectories converge strongly to a minimizer of the objective function of minimum norm. Finally, some numerical experiments are conducted to illustrate the theoretical results.


1 Introduction

In recent years, convex optimization problems have drawn considerable attention because they arise in many application areas, such as machine learning [10, 30], statistics [18], image processing [20, 32] and so on. Hence, various algorithms have been proposed for solving different structured convex optimization problems. One simple and often powerful algorithm is Nesterov’s accelerated gradient algorithm, whose convergence rate can be $O(1/t^2)$. Many accelerated algorithms based on Nesterov’s acceleration technique have been proposed since then; we refer the reader to [11, 19, 25, 26, 29, 31] and the references therein for an overview of these algorithms.

Most of the literature studies Nesterov’s accelerated method using numerical optimization techniques. However, differential equations are also important and efficient tools for studying numerical algorithms. Recently, Su, Boyd and Candès [28] proposed a class of second order differential equations to study Nesterov’s accelerated gradient method, namely

$$\begin{aligned} \left\{ \begin{aligned}&\ddot{x} + \frac{\alpha }{t}\dot{x} + \nabla \Phi \left( x \right) = 0 \\&x\left( {{t_0}} \right) = {u_0},\dot{x}\left( {{t_0}} \right) = {v_0}, \\ \end{aligned} \right. \end{aligned}$$

(1.1)

where $\Phi $ is convex and differentiable, $\nabla \Phi $ is Lipschitz continuous, and $t_0>0$. They show that this system can be seen as the continuous version of Nesterov’s accelerated gradient method. In addition, they prove that the convergence rate of the function value along the trajectories of (1.1) is $O(1/t^2)$ if $\alpha $ is chosen as 3, which is the same as the convergence rate of Nesterov’s accelerated gradient method. Moreover, they show that 3 is the minimum constant that guarantees the convergence rate of $O(1/t^2)$.

The work of Su, Boyd and Candès [28] motivated subsequent studies on the second order differential equation (1.1); see, for example, [3, 8, 13, 14, 15, 16, 24, 33]. In particular, Attouch, Chbani, Peypouquet and Redont [3] establish the weak convergence of the trajectory if $\alpha > 3$, and they also show that the convergence rate of the objective function along the trajectory is $o(1/t^2)$. Later, Attouch, Chbani and Riahi [5] consider the convergence properties under the condition that $\alpha <3$. They prove that the convergence rate is

$$\begin{aligned} \Phi \left( {x\left( t \right) } \right) - \min \Phi = O\left( {\frac{1}{{{t^{\frac{{2\alpha }}{3}}}}}} \right) . \end{aligned}$$

In order to establish the strong convergence of the trajectory, Attouch, Chbani and Riahi [4] propose the following second order dynamical system:

$$\begin{aligned} \left\{ \begin{aligned}&\ddot{x} + \frac{\alpha }{t}\dot{x} + \nabla \Phi \left( x \right) + \varepsilon \left( t \right) x\left( t \right) = 0 \\&x\left( {{t_0}} \right) = {u_0},\dot{x}\left( {{t_0}} \right) = {v_0} , \\ \end{aligned} \right. \end{aligned}$$

(1.2)

which adds a Tikhonov regularization term to system (1.1). They show that the function value along the trajectory converges fast to the optimal value if $\varepsilon (t)$ decreases to 0 rapidly. In addition, they establish the strong convergence of the trajectory x(t) to the element of minimum norm of $\arg \min \Phi $ if $\varepsilon (t)$ tends slowly to zero. Many other works consider Tikhonov regularization techniques; the reader can consult the references [1, 2, 9, 17, 23].

In 2019, Attouch, Chbani and Riahi [6, 7] study another differential equation:

$$\begin{aligned} \left\{ \begin{aligned}&\ddot{x} + \gamma \left( t \right) \dot{x} + \beta \left( t \right) \nabla \Phi \left( x \right) = 0 \\&x\left( {{t_0}} \right) = {u_0},\dot{x}\left( {{t_0}} \right) = {v_0} ,\\ \end{aligned} \right. \end{aligned}$$

(1.3)

where $\gamma (t)$ and $\beta (t)$ are scalar functions. They first consider the convergence properties of (1.3). They then propose a discretized numerical algorithm, based on this differential equation, for solving structured convex composite optimization problems. Inspired by the proof of the convergence of the trajectory of (1.1), they establish the convergence and convergence rate of the algorithm. Concretely, they obtain that the convergence rate of $\Phi (x(t))$ is

$$\begin{aligned} \Phi \left( {x\left( t \right) } \right) - \min \Phi = O\left( {\frac{1}{{\beta \left( t \right) \Gamma {{\left( t \right) }^2}}}} \right) , \end{aligned}$$

where $\Gamma \left( t \right) = p\left( t \right) \int _t^{ + \infty } {\frac{1}{{p\left( u \right) }}du} $, $p\left( t \right) = {e^{\int _{{t_0}}^t {\gamma \left( u \right) du} }}$. In particular, if $\gamma (t)$ is chosen as ${\frac{\alpha }{t}}$, the convergence rate becomes

$$\begin{aligned} \Phi \left( {x\left( t \right) } \right) - \min \Phi = O\left( {\frac{1}{{\beta \left( t \right) {t^2}}}} \right) . \end{aligned}$$

According to the above relation, the convergence rate of $\Phi (x(t))$ can be faster than $O\left( {\frac{1}{{{t^2}}}} \right) $ if $\beta (t)$ is chosen properly. For nonsmooth optimization problems, in which the objective function is not differentiable, differential equations cannot be applied directly; we refer the reader to [12, 21] for details.
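To make the rate $O\left( {1/(\beta (t){t^2})} \right) $ above concrete, consider the power-law scaling $\beta (t) = {t^{\delta }}$ with $\delta > 0$. This choice is only an illustrative assumption, and it presumes that $\beta $ meets the growth conditions imposed in [6, 7], which are not restated here. Under that assumption the estimate becomes

$$\begin{aligned} \Phi \left( {x\left( t \right) } \right) - \min \Phi = O\left( {\frac{1}{{{t^{\delta }}\,{t^2}}}} \right) = O\left( {\frac{1}{{{t^{2 + \delta }}}}} \right) , \end{aligned}$$

which improves on $O(1/t^2)$ by the factor ${t^{ - \delta }}$.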

From the above literature, we note that some works consider the strong convergence of the trajectory x(t), while others study faster convergence rates of the objective $\Phi (x(t))$. A natural question is whether these two lines of analysis can be combined. In this work, both the strong convergence of the trajectory x(t) and the fast convergence rate of the objective $\Phi (x(t))$ are studied under different choices of the parameters. To this end, this paper mainly considers the following differential equation:

$$\begin{aligned} \left\{ \begin{aligned}&\ddot{x}\left( t \right) + \frac{\alpha }{t}\dot{x}\left( t \right) + \beta \left( t \right) \left( {\nabla \Phi \left( {x\left( t \right) } \right) + \varepsilon \left( t \right) x\left( t \right) } \right) = 0 \\&\begin{array}{*{20}{c}} {x\left( {{t_0}} \right) = {u_0}}, &{} {\dot{x}\left( {{t_0}} \right) = {v_0}}, \\ \end{array} \\ \end{aligned} \right. \end{aligned}$$

(1.4)

where $\Phi $ is convex and differentiable, $\nabla \Phi $ is Lipschitz continuous, ${u_0},{v_0} \in {\mathcal {H}}$, $t_0>0$, $\alpha $ is a positive parameter, $\beta (t)$ is a time scaling parameter, and ${\varepsilon \left( t \right) x\left( t \right) }$ is a Tikhonov regularization term. Throughout the whole paper, we assume that

$$\begin{aligned} \begin{array}{*{20}{c}} {{H_1}} &{} {\left\{ \begin{aligned} &{}t_0>0,\varepsilon :\left[ {{t_0}, + \infty } \right) \rightarrow {\mathbb {R}^ + } \text { is a nonincreasing function ; } \\ &{}\varepsilon (t) \text { is continuously differentiable and}\mathop {\lim }\limits _{t \rightarrow + \infty } \varepsilon \left( t \right) = 0; \\ &{}\beta :\left[ {{t_0}, + \infty } \right) \rightarrow {\mathbb {R}^ + } \text { is a non-negative continuous function.} \\ \end{aligned} \right. } \\ \end{array} \end{aligned}$$

Our main contributions are as follows:

By constructing a proper energy function, we first prove the existence and uniqueness of the global solution of dynamical system (1.4);
We establish the fast convergence rate of $\Phi (x(t))$ and strong convergence of the trajectory x(t) of system (1.4). In detail, under the condition that $\int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \varepsilon \left( t \right) dt < + \infty }$, we establish the global convergence rate of $\Phi (x(t))$, which is
$$\begin{aligned} \Phi \left( {x\left( t \right) } \right) - \min \Phi = o\left( {\frac{1}{{{t^2}\beta \left( t \right) }}} \right) . \end{aligned}$$
Moreover, if $\int _{{t_0}}^{ + \infty } {\frac{{\varepsilon \left( t \right) \beta \left( t \right) }}{t}dt = + \infty } $, we show that the global solution x(t) of (1.4) satisfies the following ergodic convergence result:
$$\begin{aligned} \mathop {\lim }\limits _{t \rightarrow \infty } \frac{1}{{\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }d\tau } }}\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }\left\| {x\left( \tau \right) - p} \right\| ^2d\tau } = 0, \end{aligned}$$
where p is the element of minimal norm of $\arg \min \Phi $. In addition, we prove that $\mathop {\lim \inf }\limits _{t \rightarrow \infty } \left\| {x\left( t \right) - p} \right\| = 0.$
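To see how the two integral conditions above separate, one can test them on power laws. The choices $\beta (t) = {t^{\delta }}$ with $\delta \ge 0$ and $\varepsilon (t) = {t^{ - r}}$ with $r > 0$ are illustrative assumptions, not taken from the paper; they satisfy $H_1$. A direct computation gives

$$\begin{aligned} \begin{aligned} \int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \varepsilon \left( t \right) dt} = \int _{{t_0}}^{ + \infty } {{t^{1 + \delta - r}}dt}< + \infty&\iff r > \delta + 2, \\ \int _{{t_0}}^{ + \infty } {\frac{{\varepsilon \left( t \right) \beta \left( t \right) }}{t}dt} = \int _{{t_0}}^{ + \infty } {{t^{\delta - r - 1}}dt} = + \infty&\iff r \le \delta . \\ \end{aligned} \end{aligned}$$

So, in this family, the fast-rate regime corresponds to a rapidly vanishing $\varepsilon $ and the ergodic convergence regime to a slowly vanishing one. The two regimes never overlap, even outside this family: since $\frac{1}{t} \le \frac{t}{{t_0^2}}$ for $t \ge {t_0}$, finiteness of the first integral forces finiteness of the second.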

The rest of the paper is organized as follows: Section 2 presents some basic notation and preliminary materials. In Sect. 3, the global existence and uniqueness result is established for (1.4). In Sect. 4, we first establish the fast convergence rate of $\Phi (x(t))$ based on the condition $\int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \varepsilon \left( t \right) dt < + \infty }$, and then show that the trajectory x(t) of (1.4) converges to a minimizer of the objective function of minimum norm. In Sect. 5, we perform some numerical experiments to illustrate our theoretical results.

2 Notation and preliminaries

All problems considered in this paper are posed in a real Hilbert space $\mathcal {H}$; we denote its inner product by $\langle \cdot ,\cdot \rangle $ and the corresponding norm by $\left\| \cdot \right\| $.

For the real valued convex and differentiable function $\Phi :\mathcal {H}\rightarrow \mathbb R$, the gradient of $\Phi $ is said to be $L_{\Phi }$-Lipschitz continuous, if

$$\begin{aligned} \left\| {\nabla \Phi \left( x \right) - \nabla \Phi \left( y \right) } \right\| \leqslant L_{\Phi }\left\| {x - y} \right\| ,\forall x,y \in \mathcal {H}. \end{aligned}$$

We say that $\Phi $ is $\sigma $-strongly convex, with $\sigma >0$, if and only if $\Phi \left( \cdot \right) - \frac{\sigma }{2}{\left\| \cdot \right\| ^2}$ is convex. Moreover, if $\Phi $ is continuously differentiable, then for all $x,y \in \mathcal {H}$,

$$\begin{aligned} \Phi \left( y \right) \geqslant \Phi \left( x \right) + \left\langle {\nabla \Phi \left( x \right) ,y - x} \right\rangle + \frac{\sigma }{2}{\left\| {y - x} \right\| ^2}. \end{aligned}$$

A function $x:\left[ {{t_0}, + \infty } \right) \rightarrow \mathcal {H}$ is called locally absolutely continuous if it is absolutely continuous on every compact interval $\left[ {{t_0},T} \right] $, which means that there exists an integrable function $y:\left[ {{t_0}, T } \right] \rightarrow {\mathcal {H}}$ such that

$$\begin{aligned} \begin{array}{*{20}{c}} {x\left( t \right) = x\left( {{t_0}} \right) + \int _{{t_0}}^t {y\left( s \right) ds} } &{} {\forall t \in \left[ {{t_0},T} \right] .} \\ \end{array} \end{aligned}$$

For a locally absolutely continuous function, we would like to point out the following property, which will be used in the following sections.

Remark 2.1

Every locally absolutely continuous function $x:\left[ {{t_0}, + \infty } \right) \rightarrow {\mathcal {H}}$ is differentiable almost everywhere and its derivative coincides with its distributional derivative almost everywhere and one can recover the function from its derivative $\dot{x} = y$ by the integration formula above.

Before ending this section, we state some lemmas which will be used in our convergence analysis.

Lemma 2.1

Suppose that $F:\left[ {0, + \infty } \right) \rightarrow \mathbb {R}$ is locally absolutely continuous and bounded below, and that there exists $G \in {L^1}\left( {\left[ {0, + \infty } \right) } \right) $ such that for almost every $t \in \left[ {0, + \infty } \right) $

$$\begin{aligned} \frac{d}{{dt}}F\left( t \right) \leqslant G\left( t \right) . \end{aligned}$$

Then there exists $\mathop {\lim }\limits _{t \rightarrow \infty } F\left( t \right) \in \mathbb {R}$.

We now introduce the energy function used in this paper, which is

$$\begin{aligned} W\left( t \right) = \frac{1}{2}{\left\| {\dot{x}\left( t \right) } \right\| ^2} + \beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - \min \Phi } \right) , \end{aligned}$$

(2.1)

where ${\Phi _t}\left( x \right) = \Phi \left( x \right) + \frac{{\varepsilon \left( t \right) }}{2}{\left\| x \right\| ^2}$.

Next, we give two results that play a central role in the analysis of the asymptotic behavior of system (1.4).

Lemma 2.2

Let W be defined by (2.1). Then we have

$$\begin{aligned} \frac{{dW\left( t \right) }}{{dt}} \leqslant - \frac{\alpha }{t}{\left\| {\dot{x}\left( t \right) } \right\| ^2} + \dot{\beta }\left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - \min \Phi } \right) . \end{aligned}$$

Proof

From the definition of $\Phi _{t}(x(t))$, we immediately have

$$\begin{aligned} \nabla {\Phi _t}\left( {x\left( t \right) } \right) = \nabla \Phi \left( {x\left( t \right) } \right) + \varepsilon \left( t \right) x\left( t \right) . \end{aligned}$$

(2.2)

On the other hand, by taking the derivative of the energy function (2.1) and using the definition of ${\Phi _t}\left( x \right) = \Phi \left( x \right) + \frac{{\varepsilon \left( t \right) }}{2}{\left\| x \right\| ^2}$, we have

$$\begin{aligned} \begin{aligned} \dot{W}\left( t \right) =&\left\langle {\dot{x}\left( t \right) ,\ddot{x}\left( t \right) } \right\rangle + \dot{\beta }\left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - \min \Phi } \right) \\&+\beta \left( t \right) \left( {\left\langle {\nabla {\Phi }\left( {x\left( t \right) } \right) ,\dot{x}\left( t \right) } \right\rangle + \dot{\varepsilon }\left( t \right) \frac{{{{\left\| {x\left( t \right) } \right\| }^2}}}{2}} + \varepsilon (t)\langle x(t), \dot{x}(t)\rangle \right) \\ =&\left\langle {\dot{x}\left( t \right) ,\ddot{x}\left( t \right) } \right\rangle + \dot{\beta }\left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - \min \Phi } \right) \\&+\beta \left( t \right) \left( {\left\langle {\nabla {\Phi }\left( {x\left( t \right) } \right) +\varepsilon (t)x(t),\dot{x}\left( t \right) } \right\rangle + \dot{\varepsilon }\left( t \right) \frac{{{{\left\| {x\left( t \right) } \right\| }^2}}}{2}} \right) .\\ \end{aligned} \end{aligned}$$

Combining (2.2) with the above relation, we obtain further that

$$\begin{aligned} \dot{W}\left( t \right)&= \left\langle {\dot{x}\left( t \right) ,\ddot{x}\left( t \right) } \right\rangle + \dot{\beta }\left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - \min \Phi } \right) \\&\quad + \beta \left( t \right) \left( {\left\langle {\nabla {\Phi _t}\left( {x\left( t \right) } \right) ,\dot{x}\left( t \right) } \right\rangle + \dot{\varepsilon }\left( t \right) \frac{{{{\left\| {x\left( t \right) } \right\| }^2}}}{2}} \right) . \end{aligned}$$

By rearranging terms and using the fact that $ \varepsilon (t)$ is continuously differentiable and nonincreasing, we have

$$\begin{aligned} \begin{aligned} \dot{W}\left( t \right)&= \left\langle {\dot{x}\left( t \right) ,\ddot{x}\left( t \right) } \right\rangle + \beta \left( t \right) \left\langle {\nabla {\Phi _t}\left( {x\left( t \right) } \right) ,\dot{x}\left( t \right) } \right\rangle + \dot{\beta }\left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - \min \Phi } \right) \\&\quad +\beta (t)\dot{\varepsilon }\left( t \right) \frac{{{{\left\| {x\left( t \right) } \right\| }^2}}}{2} \\&=\left\langle {\dot{x}\left( t \right) ,\ddot{x}\left( t \right) + \beta \left( t \right) \nabla {\Phi _t}\left( {x\left( t \right) } \right) } \right\rangle + \dot{\beta }\left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - \min \Phi } \right) \\&\quad + \beta \left( t \right) \dot{\varepsilon }\left( t \right) \frac{{{{\left\| {x\left( t \right) } \right\| }^2}}}{2}\\&\le \left\langle {\dot{x}\left( t \right) ,\ddot{x}\left( t \right) + \beta \left( t \right) \nabla {\Phi _t}\left( {x\left( t \right) } \right) } \right\rangle + \dot{\beta }\left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - \min \Phi } \right) , \end{aligned} \end{aligned}$$

(2.3)

where the last inequality is from the fact that $\dot{\varepsilon }(t) \le 0$ and $\beta (t)\ge 0$. Moreover, according to system (1.4), we have

$$\begin{aligned} {\ddot{x}\left( t \right) + \beta \left( t \right) \nabla {\Phi _t}\left( {x\left( t \right) } \right) = \ddot{x}\left( t \right) + \beta \left( t \right) \left( {\nabla \Phi \left( {x\left( t \right) } \right) + \varepsilon \left( t \right) x\left( t \right) } \right) = - \frac{\alpha }{t}\dot{x}\left( t \right) }. \end{aligned}$$

(2.4)

Combining (2.3) and (2.4) together, we obtain that

$$\begin{aligned} \begin{aligned} \dot{W}\left( t \right)&\leqslant \left\langle {\dot{x}\left( t \right) , - \frac{\alpha }{t}\dot{x}\left( t \right) } \right\rangle + \dot{\beta }\left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - \min \Phi } \right) \\&= - \frac{\alpha }{t}{\left\| {\dot{x}\left( t \right) } \right\| ^2} + \dot{\beta }\left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - \min \Phi } \right) ,\\ \end{aligned} \end{aligned}$$

which implies our desired conclusion immediately.
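The inequality of Lemma 2.2 can be sanity-checked numerically. The sketch below is only an illustration under assumptions that are not from the paper: $\mathcal {H} = \mathbb {R}^2$, $\Phi (u) = \frac{1}{2}(u_1 + u_2 - 1)^2$ (so $\min \Phi = 0$), $\alpha = 3$, $\beta (t) \equiv 1$ (hence $\dot{\beta } = 0$) and $\varepsilon (t) = 1/t$. With these choices the lemma predicts that $W$ is nonincreasing; the classical Runge-Kutta scheme used here is just a convenient integrator.

```python
# Illustrative assumptions: ALPHA, T0, step H, eps(t) and the quadratic Phi
# are hypothetical choices made for this sketch.
ALPHA, T0, H = 3.0, 1.0, 0.01

def eps(t):
    return 1.0 / t  # nonincreasing, continuously differentiable, tends to 0

def rhs(t, X):
    # First-order form of (1.4) with beta(t) = 1; X = (u1, u2, v1, v2).
    u1, u2, v1, v2 = X
    r = u1 + u2 - 1.0  # grad Phi(u) = (r, r) for Phi(u) = 0.5 * r**2
    return (v1, v2,
            -ALPHA / t * v1 - (r + eps(t) * u1),
            -ALPHA / t * v2 - (r + eps(t) * u2))

def energy(t, X):
    # W(t) from (2.1) with beta = 1 and min Phi = 0.
    u1, u2, v1, v2 = X
    phi_t = 0.5 * (u1 + u2 - 1.0) ** 2 + 0.5 * eps(t) * (u1 * u1 + u2 * u2)
    return 0.5 * (v1 * v1 + v2 * v2) + phi_t

def rk4_step(t, X, h):
    # One classical fourth-order Runge-Kutta step.
    def shifted(K, a):
        return tuple(x + a * k for x, k in zip(X, K))
    k1 = rhs(t, X)
    k2 = rhs(t + h / 2, shifted(k1, h / 2))
    k3 = rhs(t + h / 2, shifted(k2, h / 2))
    k4 = rhs(t + h, shifted(k3, h))
    return tuple(x + h / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(X, k1, k2, k3, k4))

def energy_trace(X0, t_end):
    # Values of W on the time grid T0, T0 + H, ..., t_end.
    n = round((t_end - T0) / H)
    X, W = X0, [energy(T0, X0)]
    for k in range(n):
        X = rk4_step(T0 + k * H, X, H)
        W.append(energy(T0 + (k + 1) * H, X))
    return W

W = energy_trace((2.0, -3.0, 0.0, 0.0), t_end=20.0)
# Largest increase of W between consecutive grid points (expected <= 0 up to rounding).
print(W[0], W[-1], max(b - a for a, b in zip(W, W[1:])))
```

A time-dependent $\beta $ would require adding the $\dot{\beta }\left( t \right) \left( {{\Phi _t} - \min \Phi } \right) $ term to the comparison, which this sketch deliberately avoids.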

In the following, we introduce another auxiliary function

$$\begin{aligned} {h_z}\left( t \right) = \frac{1}{2}{\left\| {x\left( t \right) - z} \right\| ^2}, \end{aligned}$$

(2.5)

where $z\in \arg \min \Phi $. We now give the following properties of $h_z$.

Lemma 2.3

Suppose ${h_z}\left( t \right) $ is defined as (2.5), then

(i)
${{\ddot{h}}_z}\left( t \right) + \frac{\alpha }{t}{{\dot{h}}_z}\left( t \right) \leqslant {\left\| {\dot{x}\left( t \right) } \right\| ^2} - \beta \left( t \right) \left( {\Phi _t}\left( {x\left( t \right) } \right) - \min \Phi - \frac{{\varepsilon \left( t \right) }}{2}{{\left\| z \right\| }^2} + \frac{{\varepsilon \left( t \right) }}{2} \left\| x\left( t \right) - z \right\| ^2 \right) .$
(ii)
$\mathop {\sup }\limits _{t \geqslant {t_0}} \frac{{\left| {{{\dot{h}}_z}\left( t \right) } \right| }}{t} < + \infty $ if $\mathop {\sup }\limits _{t \geqslant {t_0}} \left\| {\dot{x}\left( t \right) } \right\| < + \infty .$

Proof

(i): From the definition of $h_{z}(t)$, we immediately obtain that

$$\begin{aligned} {{\dot{h}}_z}\left( t \right) = \left\langle {\dot{x}\left( t \right) ,x\left( t \right) - z} \right\rangle , ~~{{\ddot{h}}_z}\left( t \right) = {\left\| {\dot{x}\left( t \right) } \right\| ^2} + \left\langle {\ddot{x}\left( t \right) ,x\left( t \right) - z} \right\rangle . \end{aligned}$$

(2.6)

Hence

$$\begin{aligned} \begin{aligned} {{\ddot{h}}_z}\left( t \right) + \frac{\alpha }{t}{{\dot{h}}_z}\left( t \right)&= {\left\| {\dot{x}\left( t \right) } \right\| ^2} + \left\langle {\ddot{x}\left( t \right) ,x\left( t \right) - z} \right\rangle + \frac{\alpha }{t}\left\langle {\dot{x}\left( t \right) ,x\left( t \right) - z} \right\rangle \\&= {\left\| {\dot{x}\left( t \right) } \right\| ^2} + \left\langle {\ddot{x}\left( t \right) + \frac{\alpha }{t}\dot{x}\left( t \right) ,x\left( t \right) - z} \right\rangle . \\ \end{aligned} \end{aligned}$$

(2.7)

According to (1.4), we have

$$\begin{aligned} \ddot{x}\left( t \right) + \frac{\alpha }{t}\dot{x}\left( t \right) = - \beta \left( t \right) \left( {\nabla \Phi \left( {x\left( t \right) } \right) + \varepsilon \left( t \right) x\left( t \right) } \right) = - \beta \left( t \right) \nabla {\Phi _t}\left( {x\left( t \right) } \right) . \end{aligned}$$

Combining this relation with (2.7), we obtain further that

$$\begin{aligned} {{\ddot{h}}_z}\left( t \right) + \frac{\alpha }{t}{{\dot{h}}_z}\left( t \right) = {\left\| {\dot{x}\left( t \right) } \right\| ^2} - \beta \left( t \right) \left\langle {\nabla {\Phi _t}\left( {x\left( t \right) } \right) ,x\left( t \right) - z} \right\rangle . \end{aligned}$$

(2.8)

On the other hand, since $\Phi $ is convex and ${\Phi _t} = \Phi + \frac{{\varepsilon \left( t \right) }}{2}{\left\| \cdot \right\| ^2}$, the function $\Phi _{t}$ is ${\varepsilon \left( t \right) }$-strongly convex. Hence, we have

$$\begin{aligned} {\Phi _t}\left( z \right) - {\Phi _t}\left( {x\left( t \right) } \right) \geqslant \left\langle {\nabla {\Phi _t}\left( {x\left( t \right) } \right) ,z - x\left( t \right) } \right\rangle + \frac{{\varepsilon \left( t \right) }}{2}{\left\| {z - x\left( t \right) } \right\| ^2}, \end{aligned}$$

which implies that

$$\begin{aligned} \left\langle {\nabla {\Phi _t}\left( {x\left( t \right) } \right) ,x\left( t \right) - z} \right\rangle \ge {\Phi _t}\left( {x\left( t \right) } \right) - \Phi _t(z) + \frac{{\varepsilon \left( t \right) }}{2}{\left\| {x\left( t \right) - z} \right\| ^2}. \end{aligned}$$

(2.9)

Since z is a minimizer of $\Phi $, by (2.9) and the definition of ${\Phi _t}\left( z \right) = \Phi \left( z \right) + \frac{{\varepsilon \left( t \right) }}{2}{\left\| z \right\| ^2} = \min \Phi + \frac{{\varepsilon \left( t \right) }}{2}{\left\| z \right\| ^2}$, we obtain further that

$$\begin{aligned} \left\langle {\nabla {\Phi _t}\left( {x\left( t \right) } \right) ,x\left( t \right) - z} \right\rangle \geqslant {\Phi _t}\left( {x\left( t \right) } \right) - \min \Phi - \frac{{\varepsilon \left( t \right) }}{2}{\left\| z \right\| ^2} + \frac{{\varepsilon \left( t \right) }}{2}{\left\| {x\left( t \right) - z} \right\| ^2}. \end{aligned}$$

Using this, the fact $\beta (t)\ge 0$ from $H_1$ and (2.8), we have

$$\begin{aligned} \begin{aligned}&{{\ddot{h}}_z}\left( t \right) + \frac{\alpha }{t}{{\dot{h}}_z}\left( t \right) \\&\quad = {\left\| {\dot{x}\left( t \right) } \right\| ^2} - \beta \left( t \right) \left\langle {\nabla {\Phi _t}\left( {x\left( t \right) } \right) ,x\left( t \right) - z} \right\rangle \\&\quad \le {\left\| {\dot{x}\left( t \right) } \right\| ^2} - \beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - \min \Phi - \frac{{\varepsilon \left( t \right) }}{2}{{\left\| z \right\| }^2} + \frac{{\varepsilon \left( t \right) }}{2}{{\left\| {x\left( t \right) - z} \right\| }^2}} \right) . \\ \end{aligned} \end{aligned}$$

This completes the proof of (i).

Now we prove (ii). By the expression of ${{\dot{h}}_z}\left( t \right) $ in (2.6), the assumption that $\mathop {\sup }\limits _{t \geqslant {t_0}} \left\| {\dot{x}\left( t \right) } \right\| < + \infty $ and the Cauchy-Schwarz inequality, we obtain that

$$\begin{aligned} \left| {{{\dot{h}}_z}\left( t \right) } \right| \leqslant \left\| {\dot{x}\left( t \right) } \right\| \left\| {x\left( t \right) - z} \right\| \leqslant \mathop {\sup }\limits _{s \geqslant {t_0}} \left\| {\dot{x}\left( s \right) } \right\| \left\| {x\left( t \right) - z} \right\| . \end{aligned}$$

(2.10)

In addition,

$$\begin{aligned} \left\| {x\left( t \right) - z} \right\| \leqslant \left\| {x\left( t \right) - x\left( {{t_0}} \right) } \right\| + \left\| {x\left( {{t_0}} \right) - z} \right\| \leqslant \left( {t - {t_0}} \right) \mathop {\sup }\limits _{s \geqslant {t_0}} \left\| {\dot{x}\left( s \right) } \right\| + \left\| {x\left( {{t_0}} \right) - z} \right\| . \end{aligned}$$

Combining the above inequality, the assumption $\mathop {\sup }\limits _{t \geqslant {t_0}} \left\| {\dot{x}\left( t \right) } \right\| < + \infty $ with (2.10), we immediately deduce that

$$\begin{aligned} \left| {{{\dot{h}}_z}\left( t \right) } \right| \leqslant \mathop {\sup }\limits _{s \geqslant {t_0}} \left\| {\dot{x}\left( s \right) } \right\| \left( {\left( {t - {t_0}} \right) \mathop {\sup }\limits _{s \geqslant {t_0}} \left\| {\dot{x}\left( s \right) } \right\| + \left\| {x\left( {{t_0}} \right) - z} \right\| } \right) \leqslant {\tilde{C}}\left( {1 + t} \right) , \end{aligned}$$

where ${\tilde{C}}>0$ is a constant. Thus, $\mathop {\sup }\nolimits _{t \geqslant {t_0}} \frac{{\left| {{{\dot{h}}_z}\left( t \right) } \right| }}{t} < + \infty .$ This completes the proof.

3 Existence and uniqueness of the solution of (1.4)

In this section, we will prove the existence and uniqueness of a global solution of dynamical system (1.4). We first give the definition of a strong global solution of (1.4).

Definition 3.1

We say that $x:\left[ {{t_0}, + \infty } \right) \rightarrow {\mathcal {H}}$ is a strong global solution of (1.4), if it satisfies the following properties:

(a)
$x,\dot{x}:\left[ {{t_0}, + \infty } \right) \rightarrow {\mathcal {H}}$ are locally absolutely continuous, in other words, absolutely continuous on each interval $\left[ {{t_0},T} \right] $ for ${t_0}< T < + \infty $;
(b)
$\ddot{x}\left( t \right) + \frac{\alpha }{t}\dot{x}\left( t \right) + \beta \left( t \right) \left( {\nabla \Phi \left( {x\left( t \right) } \right) + \varepsilon \left( t \right) x\left( t \right) } \right) = 0$ for almost every $t \geqslant {t_0}$;
(c)
$x\left( {{t_0}} \right) = {u_0}$ and $\dot{x}\left( {{t_0}} \right) = {v_0}$.

We are now ready to prove the existence and uniqueness of the solution of system (1.4). We mainly use the Cauchy-Lipschitz-Picard theorem for absolutely continuous trajectories (see, for example, [22, Proposition 6.2.1], [27, Theorem 54]) to establish the result. The proof is based on rewriting (1.4) as a particular first order dynamical system in a suitably chosen product space (see also [8, 17]).

Theorem 3.1

For any initial points ${u_0},{v_0} \in {\mathcal {H}}$, there exists a unique $C^2$-global solution of the dynamical system (1.4).

Proof

Define $X\left( t \right) = \left( {x\left( t \right) ,\dot{x}\left( t \right) } \right) $, and $F:\left[ {{t_0}, + \infty } \right) \times {\mathcal {H}} \times {\mathcal {H}} \rightarrow {\mathcal {H}} \times {\mathcal {H}}$ as

$$\begin{aligned} F\left( {t,u,v} \right) = \left( {v, - \frac{\alpha }{t}v - \beta \left( t \right) \left( {\nabla \Phi \left( u \right) + \varepsilon \left( t \right) u} \right) } \right) . \end{aligned}$$

(3.1)

Hence, from (1.4), (3.1) and the definition of X(t), we see that (1.4) can be rewritten as a first order dynamical system, which is

$$\begin{aligned} \left\{ \begin{aligned} \dot{X}\left( t \right) =&F\left( {t,X\left( t \right) } \right) = F\left( {t,x\left( t \right) ,\dot{x}\left( t \right) } \right) \\ X\left( {{t_0}} \right) =&\left( {{u_0},{v_0}} \right) . \\ \end{aligned} \right. \end{aligned}$$

(3.2)
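The first order form (3.2) is also the natural starting point for a numerical simulation. The following sketch is not the authors' experiment: it assumes $\mathcal {H} = \mathbb {R}^2$, the test function $\Phi (u) = \frac{1}{2}(u_1 + u_2 - 1)^2$ (whose minimizers form a line, with $\min \Phi = 0$), $\alpha = 3$, $\beta (t) \equiv 1$, $\varepsilon (t) = 1/t^3$, and a classical Runge-Kutta integrator with a fixed step. All names are hypothetical.

```python
# Illustrative assumptions: ALPHA, T0, beta, eps and Phi are hypothetical
# choices made for this sketch; they are not taken from the paper.
ALPHA = 3.0
T0 = 1.0

def beta(t):
    return 1.0  # constant time scaling

def eps(t):
    return 1.0 / t ** 3  # nonincreasing Tikhonov parameter tending to 0

def grad_phi(u):
    # Gradient of Phi(u) = 0.5 * (u1 + u2 - 1)**2.
    r = u[0] + u[1] - 1.0
    return (r, r)

def phi_gap(u):
    # Phi(u) - min Phi, where min Phi = 0.
    return 0.5 * (u[0] + u[1] - 1.0) ** 2

def F(t, X):
    # Right-hand side (3.1); X = (u1, u2, v1, v2) stacks position and velocity.
    u, v = X[:2], X[2:]
    g = grad_phi(u)
    acc = tuple(-ALPHA / t * v[i] - beta(t) * (g[i] + eps(t) * u[i])
                for i in range(2))
    return (v[0], v[1], acc[0], acc[1])

def rk4_step(t, X, h):
    # One classical fourth-order Runge-Kutta step for X' = F(t, X).
    def shifted(K, a):
        return tuple(x + a * k for x, k in zip(X, K))
    k1 = F(t, X)
    k2 = F(t + h / 2, shifted(k1, h / 2))
    k3 = F(t + h / 2, shifted(k2, h / 2))
    k4 = F(t + h, shifted(k3, h))
    return tuple(x + h / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(X, k1, k2, k3, k4))

def integrate(u0, v0, t_end, h=0.01):
    # Integrate (3.2) from T0 to t_end with X(T0) = (u0, v0).
    n = round((t_end - T0) / h)
    X = (u0[0], u0[1], v0[0], v0[1])
    for k in range(n):
        X = rk4_step(T0 + k * h, X, h)
    return X

u0 = (2.0, -3.0)
X_T = integrate(u0, (0.0, 0.0), t_end=50.0)
gap_0 = phi_gap(u0)
gap_T = phi_gap(X_T[:2])
print(gap_0, gap_T)  # objective gap at T0 and at t = 50
```

Swapping in another `beta` or `eps` only requires editing the two small functions; the fixed step `h` is a simplification, and an adaptive solver would be a reasonable alternative for stiff choices of $\beta $.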

We first show that $F\left( {t, \cdot , \cdot } \right) $ is L(t)-Lipschitz continuous for every $t\geqslant {t_0}$, where the Lipschitz constant is a function of time with the property that $L\left( \cdot \right) \in L_{loc}^1\left( {\left[ {{t_0}, + \infty } \right) } \right) $. Concretely, for every $\left( {u,v} \right) ,\left( {\overline{u} ,\overline{v} } \right) \in {\mathcal {H}} \times {\mathcal {H}}$, by (3.1), we have

$$\begin{aligned} \begin{aligned}&\left\| {F\left( {t,u,v} \right) - F\left( {t,\overline{u} ,\overline{v} } \right) } \right\| \\&\quad = \sqrt{{{\left\| {v - \overline{v} } \right\| }^2} + {{\left\| {\frac{\alpha }{t}\left( {\overline{v} - v} \right) + \beta \left( t \right) \left( {\nabla \Phi \left( {\overline{u} } \right) - \nabla \Phi \left( u \right) + \varepsilon \left( t \right) \left( {\overline{u} - u} \right) } \right) } \right\| }^2}} . \\ \end{aligned} \end{aligned}$$

Using the fact that ${\left( {a + b} \right) ^2} \leqslant 2{a^2} + 2{b^2}$ and the above formula, we have

$$\begin{aligned} \begin{aligned}&\left\| {F\left( {t,u,v} \right) - F\left( {t,\overline{u} ,\overline{v} } \right) } \right\| \\&\quad \leqslant \sqrt{{{\left\| {v - \overline{v} } \right\| }^2} + 2{{\left\| {\frac{\alpha }{t}\left( {v - \overline{v} } \right) } \right\| }^2} + 2\beta {{\left( t \right) }^2}{{\left\| {\nabla \Phi \left( {\overline{u} } \right) - \nabla \Phi \left( u \right) + \varepsilon \left( t \right) \left( {\overline{u} - u} \right) } \right\| }^2}} \\&\quad \leqslant \sqrt{\left( {1 + 2\frac{{{\alpha ^2}}}{{{t^2}}}} \right) {{\left\| {v - \overline{v} } \right\| }^2} + 4\beta {{\left( t \right) }^2}{{\left\| {\nabla \Phi \left( u \right) - \nabla \Phi \left( {\overline{u} } \right) } \right\| }^2} + 4\beta {{\left( t \right) }^2}\varepsilon {{\left( t \right) }^2}{{\left\| {u - \overline{u} } \right\| }^2}}. \\ \end{aligned} \end{aligned}$$

From this relation and the fact ${\nabla \Phi }$ is $L_{\Phi }$-Lipschitz continuous in the assumption, we obtain further that

$$\begin{aligned} \begin{aligned}&\left\| {F\left( {t,u,v} \right) - F\left( {t,\overline{u} ,\overline{v} } \right) } \right\| \\&\quad \leqslant \sqrt{\left( {1 + 2\frac{{{\alpha ^2}}}{{{t^2}}}} \right) {{\left\| {v - \overline{v} } \right\| }^2} + \left( {4L_\Phi ^2\beta {{\left( t \right) }^2} + 4\beta {{\left( t \right) }^2}\varepsilon {{\left( t \right) }^2}} \right) {{\left\| {u - \overline{u} } \right\| }^2}} \\&\quad \leqslant \sqrt{1 + 2\frac{{{\alpha ^2}}}{{{t^2}}} + 4L_\Phi ^2\beta {{\left( t \right) }^2} + 4\beta {{\left( t \right) }^2}\varepsilon {{\left( t \right) }^2}} \sqrt{{{\left\| {v - \overline{v} } \right\| }^2} + {{\left\| {u - \overline{u} } \right\| }^2}} \\&\quad \leqslant \left( {1 + \sqrt{2} \frac{\alpha }{t} + 2{L_\Phi }\beta \left( t \right) + 2\beta \left( t \right) \varepsilon \left( t \right) } \right) \left\| {\left( {u,v} \right) - \left( {\overline{u} ,\overline{v} } \right) } \right\| ,\\ \end{aligned} \end{aligned}$$

where the last inequality follows from the fact that $\sqrt{a^2+b^2}\le a+b$ for $a\ge 0, b\ge 0$, together with $\alpha >0, \beta (t)\ge 0, \varepsilon (t)\ge 0$. Define $L(t)=1 + \sqrt{2} \frac{\alpha }{t} + 2{L_\Phi }\beta \left( t \right) + 2\beta \left( t \right) \varepsilon \left( t \right) $; then we have

$$\begin{aligned} \left\| {F\left( {t,u,v} \right) - F\left( {t,\overline{u} ,\overline{v} } \right) } \right\| \le L(t)\left\| {\left( {u,v} \right) - \left( {\overline{u} ,\overline{v} } \right) } \right\| . \end{aligned}$$

(3.3)

Hence $F\left( {t, \cdot , \cdot } \right) $ is L(t)-Lipschitz continuous for every $t\geqslant {t_0}$. Recall that $\frac{\alpha }{t}$, $\beta (t)$ and $\varepsilon (t)$ are continuous on $\left[ {{t_0}, + \infty } \right) $. Thus L(t) is integrable on $\left[ {{t_0},T} \right] $ for every ${t_0}< T < + \infty $; consequently, $L\left( \cdot \right) \in L_{loc}^1\left( {\left[ {{t_0}, + \infty } \right) } \right) $.
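The bound (3.3) can be probed on random inputs. The sketch below is a numerical spot check, not a proof, and rests on illustrative assumptions: $\mathcal {H} = \mathbb {R}^2$, $\Phi (u) = \frac{1}{2}(u_1 + u_2 - 1)^2$ (for which $L_\Phi = 2$), $\alpha = 3$, $\beta (t) = t$ and $\varepsilon (t) = 1/t$. It reports the largest observed ratio between the left-hand side of (3.3) and its right-hand side.

```python
import math
import random

# Illustrative assumptions: ALPHA, beta, eps and the quadratic Phi are
# hypothetical choices; L_PHI = 2 is the Lipschitz constant of this grad Phi.
ALPHA, L_PHI = 3.0, 2.0

def beta(t):
    return t

def eps(t):
    return 1.0 / t

def F(t, u, v):
    # Right-hand side (3.1) for Phi(u) = 0.5 * (u1 + u2 - 1)**2.
    r = u[0] + u[1] - 1.0
    return (v[0], v[1],
            -ALPHA / t * v[0] - beta(t) * (r + eps(t) * u[0]),
            -ALPHA / t * v[1] - beta(t) * (r + eps(t) * u[1]))

def lipschitz_bound(t):
    # L(t) as defined before (3.3).
    return (1.0 + math.sqrt(2.0) * ALPHA / t
            + 2.0 * L_PHI * beta(t) + 2.0 * beta(t) * eps(t))

def norm(w):
    return math.sqrt(sum(c * c for c in w))

def worst_ratio(trials=1000, seed=0):
    # Largest value of ||F(t,u,v) - F(t,ub,vb)|| / (L(t) * ||(u,v) - (ub,vb)||).
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        t = rng.uniform(1.0, 20.0)
        u = [rng.uniform(-5.0, 5.0) for _ in range(2)]
        v = [rng.uniform(-5.0, 5.0) for _ in range(2)]
        ub = [rng.uniform(-5.0, 5.0) for _ in range(2)]
        vb = [rng.uniform(-5.0, 5.0) for _ in range(2)]
        dist = norm([a - b for a, b in zip(u + v, ub + vb)])
        if dist == 0.0:
            continue  # identical points carry no information
        diff = norm([a - b for a, b in zip(F(t, u, v), F(t, ub, vb))])
        worst = max(worst, diff / (lipschitz_bound(t) * dist))
    return worst

ratio = worst_ratio()
print(ratio)  # stays at or below 1 when (3.3) holds on the sampled points
```

Because the derivation of $L(t)$ applies several loose inequalities in succession, the observed ratio is expected to sit well below 1 rather than close to it.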

Next, we will show that $F\left( { \cdot ,u,v} \right) \in L_{loc}^1\left( {\left[ {{t_0}, + \infty } \right) ,{\mathcal {H}} \times {\mathcal {H}}} \right) $ for all $u,v \in {\mathcal {H}}$. Take any $u,v \in {\mathcal {H}}$, by the definition of F, for ${t_0}< T < + \infty $, we have

$$\begin{aligned} \begin{aligned}&\int _{{t_0}}^T {\left\| {F\left( {t,u,v} \right) } \right\| } dt \\&\quad = \int _{{t_0}}^T {\sqrt{{{\left\| v \right\| }^2} + \left\| { - \frac{\alpha }{t}v - \beta \left( t \right) \left( {\nabla \Phi \left( u \right) + \varepsilon \left( t \right) u} \right) } \right\| ^2} } dt \\&\quad \le \int _{{t_0}}^T {\sqrt{\left( {1 + \frac{{2{\alpha ^2}}}{{{t^2}}}} \right) {{\left\| v \right\| }^2} + 4\beta {{\left( t \right) }^2}{{\left\| {\nabla \Phi \left( u \right) } \right\| }^2} + 4\beta {{\left( t \right) }^2}\varepsilon {{\left( t \right) }^2}{{\left\| u \right\| }^2}} } dt\\&\quad \le \sqrt{{{\left\| v \right\| }^2} + {{\left\| {\nabla \Phi \left( u \right) } \right\| }^2} + {{\left\| u \right\| }^2}} \int _{{t_0}}^T {\sqrt{1 + \frac{{2{\alpha ^2}}}{{{t^2}}} + 4\beta {{\left( t \right) }^2} + 4\beta {{\left( t \right) }^2}\varepsilon {{\left( t \right) }^2}} } dt, \end{aligned} \end{aligned}$$

(3.4)

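where the first inequality follows from the fact $\Vert a+b\Vert ^2\le 2\Vert a\Vert ^2 + 2\Vert b\Vert ^2$ (applied twice), and the last inequality bounds each coefficient under the square root by the sum of all of them and uses that the points $u,v\in {\mathcal {H}}$ are fixed, so that the first factor is a constant.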

Hence, by (3.4) and the fact $\frac{\alpha }{t}$, $\beta (t)$ and $\varepsilon (t)$ are continuous for any $t\ge t_0$, we immediately obtain that

$$\begin{aligned} F\left( { \cdot ,u,v} \right) \in L_{loc}^1\left( {\left[ {{t_0}, + \infty } \right) ,{\mathcal {H}} \times {\mathcal {H}}} \right) . \end{aligned}$$

Combining this relation with (3.3) and the result $L\left( \cdot \right) \in L_{loc}^1\left( {\left[ {{t_0}, + \infty } \right) } \right) $, and using the Cauchy-Lipschitz-Picard theorem, we see that there exists a unique global solution of system (3.2), which implies the existence of a unique $C^2$-global solution of (1.4) by the Lipschitz continuity of $\nabla \Phi $ and the continuity of $\beta (t)$ and $\varepsilon (t)$. This completes the proof.
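The Lipschitz estimate (3.3) can be checked numerically on a toy instance. The following is a minimal sketch under stated assumptions, not part of the original analysis: we take $\mathcal {H}=\mathbb {R}$, $\Phi (x)=\frac{1}{2}x^2$ (so $L_\Phi =1$), $\alpha =4$, $\beta (t)=1$ and $\varepsilon (t)=t^{-3}$. These values, and the names `F`, `lipschitz_modulus` and `check_lipschitz`, are illustrative choices only.

```python
import math
import random

ALPHA = 4.0   # damping parameter alpha (illustrative value)
L_PHI = 1.0   # Lipschitz constant of grad Phi for Phi(x) = x^2 / 2


def beta(t):
    return 1.0          # assumed time-scaling coefficient


def eps(t):
    return t ** -3      # assumed Tikhonov parameter


def grad_phi(u):
    return u            # gradient of Phi(x) = x^2 / 2


def F(t, u, v):
    """Right-hand side of the first-order system (u, v)' = F(t, u, v)."""
    return (v, -(ALPHA / t) * v - beta(t) * (grad_phi(u) + eps(t) * u))


def lipschitz_modulus(t):
    """L(t) as defined just before estimate (3.3)."""
    return (1.0 + math.sqrt(2.0) * ALPHA / t
            + 2.0 * L_PHI * beta(t) + 2.0 * beta(t) * eps(t))


def check_lipschitz(num_samples=1000, seed=0):
    """Largest observed ratio ||F - F_bar|| / (L(t) ||(u,v) - (u_bar,v_bar)||)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(num_samples):
        t = rng.uniform(1.0, 20.0)
        u, v, ub, vb = (rng.uniform(-5.0, 5.0) for _ in range(4))
        f, fb = F(t, u, v), F(t, ub, vb)
        lhs = math.hypot(f[0] - fb[0], f[1] - fb[1])
        rhs = lipschitz_modulus(t) * math.hypot(u - ub, v - vb)
        if rhs > 0.0:
            worst = max(worst, lhs / rhs)
    return worst


if __name__ == "__main__":
    print("largest observed ratio:", check_lipschitz())
```

A ratio not exceeding 1 on every sample is what (3.3) predicts; since the estimate is not sharp, the observed ratios are expected to stay strictly below 1.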

4 Convergence analysis of the trajectory of (1.4)

In this section, we will establish the convergence and convergence rate of the trajectory x(t) of (1.4). The proof of convergence is divided into the following two cases.

Case 1: $\int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \varepsilon \left( t \right) dt < + \infty }$. In this case, we will show in Theorem 4.1 that for any global solution trajectory of (1.4), the function $\Phi \left( {x\left( t \right) } \right) $ satisfies the fast convergence property

$$\begin{aligned} \Phi \left( {x\left( t \right) } \right) - \min \Phi = o\left( {\frac{1}{{{t^2}\beta \left( t \right) }}} \right) . \end{aligned}$$

Case 2: $\int _{{t_0}}^{ + \infty } {\frac{{\varepsilon \left( t \right) \beta \left( t \right) }}{t}dt = + \infty } $. In this case, we will show in Theorem 4.2 that for any global solution trajectory of (1.4), the following ergodic convergence result holds

$$\begin{aligned} \mathop {\lim }\limits _{t \rightarrow \infty } \frac{1}{{\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }d\tau } }}\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }\left\| {x\left( \tau \right) - p} \right\| ^2d\tau } = 0, \end{aligned}$$

where p is the element of minimal norm of $\arg \min \Phi $. Moreover, we will show that the trajectory x(t) comes arbitrarily close to p in norm, that is,

$$\begin{aligned} \mathop {\lim \inf }\limits _{t \rightarrow \infty } \left\| {x\left( t \right) - p} \right\| = 0. \end{aligned}$$

Now we are ready to present the convergence results case by case.

4.1 Case $\int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \varepsilon \left( t \right) dt < + \infty }$

For simplicity, we set $m = \min \Phi $. Taking any $z \in \arg \min \Phi $, we introduce, for $\alpha \ne 1$, another auxiliary function, which is

$$\begin{aligned} E\left( t \right) = \frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}\beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) + \frac{1}{2}{\left\| {x\left( t \right) - z + \frac{t}{{\alpha - 1}}\dot{x}\left( t \right) } \right\| ^2}, \end{aligned}$$

(4.1)

where $\Phi _t$ is the same function as defined in (2.1). Let ${g\left( t \right) = \frac{t}{{\alpha - 1}}}$; since $\dot{g}\left( t \right) = \frac{1}{{\alpha - 1}}$, a direct computation gives

$$\begin{aligned} 1 + \dot{g}\left( t \right) = \frac{\alpha }{t}g\left( t \right) . \end{aligned}$$

(4.2)

Using the definition of $g\left( t \right) $, we can rewrite $E\left( t \right) $ in (4.1) in the form

$$\begin{aligned} E\left( t \right) = g{\left( t \right) ^2}\beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) + \frac{1}{2}{\left\| {x\left( t \right) - z + g\left( t \right) \dot{x}\left( t \right) } \right\| ^2}. \end{aligned}$$

(4.3)

Combining (4.3) with the definition of $W\left( t \right) $ in (2.1) , $h_z$ in (2.5) and $\dot{h}_z$ in (2.6), we obtain that

$$\begin{aligned} E\left( t \right) = g{\left( t \right) ^2}W\left( t \right) + {h_z}\left( t \right) + g\left( t \right) {{\dot{h}}_z}\left( t \right) . \end{aligned}$$

(4.4)

Next, we will prove that the global convergence rate of $\Phi \left( {x\left( t \right) } \right) $ is $o\left( {\frac{1}{{{t^2}\beta \left( t \right) }}} \right) $.

Theorem 4.1

Let $\Phi : \mathcal {H} \rightarrow \mathbb {R}$ be a convex continuously differentiable function such that $\arg \min \Phi $ is nonempty. Assume that $\varepsilon (t)$, $\beta (t)$ satisfy condition $(H_1)$, $\alpha >1$, $\int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \varepsilon \left( t \right) dt < + \infty }$ and there exists $b>0$ such that $t\dot{\beta }\left( t \right) \leqslant \left( {\alpha - 3-b} \right) \beta \left( t \right) $. Let $x(\cdot )$ be a classical global solution of (1.4), and consider the energy function (4.4). Then

(i)
$\dot{E}\left( t \right) \leqslant \frac{1}{{2\left( {\alpha - 1} \right) }}t\beta \left( t \right) \varepsilon \left( t \right) {\left\| z \right\| ^2}$.
(ii)
$\int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) dt < } + \infty $.
(iii)
The following fast convergence of $\Phi (x(t))$ holds:
$$\begin{aligned} \Phi \left( {x\left( t \right) } \right) - m = o\left( {\frac{1}{{\beta \left( t \right) {t^2}}}} \right) . \end{aligned}$$
(iv)
Moreover, the trajectory $x\left( \cdot \right) $ is bounded on $\left[ {{t_0}, + \infty } \right) $ and
$$\begin{aligned} \int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \varepsilon \left( t \right) {{\left\| {x\left( t \right) } \right\| }^2}dt < } + \infty . \end{aligned}$$

Proof

We first prove (i). From the definition of E(t) in (4.4), we immediately have

$$\begin{aligned} \begin{aligned} \dot{E}\left( t \right) =&g{\left( t \right) ^2}\dot{W}\left( t \right) + 2g\left( t \right) \dot{g}\left( t \right) W\left( t \right) + {{\dot{h}}_z}\left( t \right) + \dot{g}\left( t \right) {{\dot{h}}_z}\left( t \right) + g\left( t \right) {{\ddot{h}}_z}\left( t \right) \\ \leqslant&g{\left( t \right) ^2}\left[ { - \frac{\alpha }{t}{{\left\| {\dot{x}\left( t \right) } \right\| }^2} + \dot{\beta }\left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) } \right] \\&+ 2g\left( t \right) \dot{g}\left( t \right) \left[ {\frac{1}{2}{{\left\| {\dot{x}\left( t \right) } \right\| }^2} + \beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) } \right] \\&+ {{\dot{h}}_z}\left( t \right) \left( 1 + \dot{g}\left( t \right) \right) + g\left( t \right) {{\ddot{h}}_z}\left( t \right) , \end{aligned} \end{aligned}$$

(4.5)

where the inequality follows from Lemma 2.2 and the definition of $W\left( t \right) $ in (2.1).

On the other hand, according to (4.2) and Lemma 2.3, we have

$$\begin{aligned} \begin{aligned}&{{\dot{h}}_z}\left( t \right) \left( 1 + \dot{g}\left( t \right) \right) + g\left( t \right) {{\ddot{h}}_z}\left( t \right) = g\left( t \right) \left( {{{\ddot{h}}_z}\left( t \right) + \frac{\alpha }{t}{{\dot{h}}_z}\left( t \right) } \right) \\&\quad \leqslant g\left( t \right) \left[ {{{\left\| {\dot{x}\left( t \right) } \right\| }^2} - \beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m - \frac{{\varepsilon \left( t \right) }}{2}{{\left\| z \right\| }^2} + \frac{{\varepsilon \left( t \right) }}{2}{{\left\| {x\left( t \right) - z} \right\| }^2}} \right) } \right] . \end{aligned}\nonumber \\ \end{aligned}$$

(4.6)

Combining (4.5) and (4.6), we obtain further that

$$\begin{aligned} \begin{aligned} \dot{E}\left( t \right) \leqslant&g{\left( t \right) ^2}\left[ { - \frac{\alpha }{t}{{\left\| {\dot{x}\left( t \right) } \right\| }^2} + \dot{\beta }\left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) } \right] \\&+ 2g\left( t \right) \dot{g}\left( t \right) \left[ {\frac{1}{2}{{\left\| {\dot{x}\left( t \right) } \right\| }^2} + \beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) } \right] \\&+ g\left( t \right) \left[ {{{\left\| {\dot{x}\left( t \right) } \right\| }^2} - \beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m - \frac{{\varepsilon \left( t \right) }}{2}{{\left\| z \right\| }^2} + \frac{{\varepsilon \left( t \right) }}{2}{{\left\| {x\left( t \right) - z} \right\| }^2}} \right) } \right] .\\ \end{aligned} \end{aligned}$$

By rearranging terms, we have

$$\begin{aligned} \begin{aligned} \dot{E}\left( t \right) \leqslant&g\left( t \right) {\left\| {\dot{x}\left( t \right) } \right\| ^2}\left( {1 + \dot{g}\left( t \right) - \frac{\alpha }{t}g\left( t \right) } \right) + g\left( t \right) \beta \left( t \right) \frac{{\varepsilon \left( t \right) }}{2}{\left\| z \right\| ^2}\\&+ \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) g\left( t \right) \left( {g\left( t \right) \dot{\beta }\left( t \right) + 2\dot{g}\left( t \right) \beta \left( t \right) - \beta \left( t \right) } \right) - g\left( t \right) \beta \left( t \right) \\&\frac{{\varepsilon \left( t \right) }}{2}{\left\| {x\left( t \right) - z} \right\| ^2}\\ =&g\left( t \right) \beta \left( t \right) \frac{{\varepsilon \left( t \right) }}{2}{\left\| z \right\| ^2}+ \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) g\left( t \right) \left( {g\left( t \right) \dot{\beta }\left( t \right) + 2\dot{g}\left( t \right) \beta \left( t \right) - \beta \left( t \right) } \right) \\&- g\left( t \right) \beta \left( t \right) \frac{{\varepsilon \left( t \right) }}{2}{\left\| {x\left( t \right) - z} \right\| ^2}, \end{aligned} \end{aligned}$$

(4.7)

where the equality follows from the relation $1 + \dot{g}\left( t \right) = \frac{\alpha }{t}g\left( t \right) $ in (4.2), which makes the coefficient of ${\left\| {\dot{x}\left( t \right) } \right\| ^2}$ vanish.

Recall that ${g\left( t \right) = \frac{t}{{\alpha - 1}}}$ and $1 + \dot{g}\left( t \right) = \frac{\alpha }{t}g\left( t \right) $ from (4.2), combining this with the assumption $t\dot{\beta }\left( t \right) \leqslant \left( {\alpha - 3-b} \right) \beta \left( t \right) $ and $\alpha >1$, we obtain that

$$\begin{aligned} \begin{aligned} g\left( t \right) \dot{\beta }\left( t \right) + 2\dot{g}\left( t \right) \beta \left( t \right) - \beta \left( t \right)&= \frac{t}{\alpha -1}\dot{\beta }\left( t \right) + 2\left( \frac{\alpha }{t}g(t) - 1 \right) \beta (t) - \beta (t)\\&=\frac{t}{\alpha -1}\dot{\beta }\left( t \right) + 2\left( \frac{\alpha }{t}\cdot \frac{t}{\alpha -1}-1\right) \beta (t)-\beta (t)\\&=\frac{t}{{\alpha - 1}}\dot{\beta }\left( t \right) + \frac{{3 - \alpha }}{{\alpha - 1}}\beta \left( t \right) \\&\leqslant -\frac{b}{\alpha -1}\beta (t)\\&\leqslant 0. \end{aligned} \end{aligned}$$

Thus, according to this relation, the fact that $g(t)\ge 0$, $\beta (t)\ge 0$, $\varepsilon (t)\ge 0$, $ {{\Phi _t}\left( {x\left( t \right) } \right) - m} \ge 0$ and (4.7), we obtain further that

$$\begin{aligned} \dot{E}\left( t \right) \leqslant \frac{1}{2}g\left( t \right) \beta \left( t \right) \varepsilon \left( t \right) {\left\| z \right\| ^2} - \frac{1}{2}g\left( t \right) \beta \left( t \right) \varepsilon \left( t \right) {\left\| {x\left( t \right) - z} \right\| ^2}. \end{aligned}$$

(4.8)

Moreover, using the fact that $g\left( t \right) ,\beta \left( t \right) ,\varepsilon \left( t \right) \geqslant 0$ and ${g\left( t \right) = \frac{t}{{\alpha - 1}}}$, we finally have

$$\begin{aligned} \dot{E}\left( t \right) \leqslant \frac{1}{2}g\left( t \right) \beta \left( t \right) \varepsilon \left( t \right) {\left\| z \right\| ^2} = \frac{1}{{2\left( {\alpha - 1} \right) }}t\beta \left( t \right) \varepsilon \left( t \right) {\left\| z \right\| ^2}, \end{aligned}$$

(4.9)

which proves (i).

We now prove (ii). From the assumption $\int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \varepsilon \left( t \right) dt < + \infty }$ and (4.9), we deduce that the positive part of $\dot{E}\left( t \right) $ belongs to ${L^1}\left( {{t_0}, + \infty } \right) $. Using this and the fact that E is bounded from below, we obtain from Lemma 2.1 that E(t) has a limit as $t \rightarrow + \infty $. Hence, there exists $C_1>0$ such that $|E(t)|\le C_1$ for all $t\ge t_0$.

In addition, according to (4.7), we have

$$\begin{aligned}&\dot{E}\left( t \right) + g\left( t \right) \left( {\beta \left( t \right) - g\left( t \right) \dot{\beta }\left( t \right) - 2\dot{g}\left( t \right) \beta \left( t \right) } \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) \\&\quad \leqslant \frac{1}{2}g\left( t \right) \beta \left( t \right) \varepsilon \left( t \right) {\left\| z \right\| ^2}. \end{aligned}$$

Recalling that ${g\left( t \right) = \frac{t}{{\alpha - 1}}}$, $\beta (t)\ge 0$, and using the assumptions $t\dot{\beta }\left( t \right) \leqslant \left( {\alpha - 3-b} \right) \beta \left( t \right) $ and $\alpha >1$, we obtain further that

$$\begin{aligned} \dot{E}\left( t \right) + \frac{b}{{{{\left( {\alpha - 1} \right) }^2}}}t\beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) \leqslant \frac{1}{{2(\alpha - 1)}}t\beta \left( t \right) \varepsilon \left( t \right) {\left\| z \right\| ^2}. \end{aligned}$$

Integrating this inequality, and using that $\int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \varepsilon \left( t \right) dt < + \infty }$ and the fact $E\left( t \right) $ is bounded from below, we have

$$\begin{aligned} \int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) dt < } + \infty , \end{aligned}$$

which proves (ii).

We now prove (iii). According to Lemma 2.2, we have

$$\begin{aligned} \frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}\dot{W}\left( t \right) + \frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}\frac{\alpha }{t}{\left\| {\dot{x}\left( t \right) } \right\| ^2} \leqslant \frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}\dot{\beta }\left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) . \end{aligned}$$

After integration by parts on $(t_0 ,t)$, we obtain

$$\begin{aligned} \begin{aligned}&\frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( t \right) - \frac{{{t_0}^2}}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( {{t_0}} \right) - \frac{2}{{{{\left( {\alpha - 1} \right) }^2}}}\int _{{t_0}}^t {sW\left( s \right) ds} \\&\quad + \frac{\alpha }{{{{\left( {\alpha - 1} \right) }^2}}}\int _{{t_0}}^t {s{{\left\| {\dot{x}\left( s \right) } \right\| }^2}ds} \\&\quad \leqslant \frac{1}{{{{\left( {\alpha - 1} \right) }^2}}}\int _{{t_0}}^t {{s^2}\dot{\beta }\left( s \right) \left( {{\Phi _s}\left( {x\left( s \right) } \right) - m} \right) ds.} \\ \end{aligned} \end{aligned}$$

From the definition of W(t) in (2.1), we immediately have

$$\begin{aligned}&\frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( t \right) + \frac{1}{{\alpha - 1}}\int _{{t_0}}^t {s{{\left\| {\dot{x}\left( s \right) } \right\| }^2}ds} \leqslant \frac{{{t_0}^2}}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( {{t_0}} \right) \\&\quad + \frac{1}{{{{\left( {\alpha - 1} \right) }^2}}}\int _{{t_0}}^t {s\left( {s\dot{\beta }\left( s \right) + 2\beta \left( s \right) } \right) \left( {{\Phi _s}\left( {x\left( s \right) } \right) - m} \right) ds}. \end{aligned}$$

From the assumption $t\dot{\beta }\left( t \right) \leqslant \left( {\alpha - 3-b} \right) \beta \left( t \right) $, we obtain further that

$$\begin{aligned}&\frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( t \right) + \frac{1}{{\alpha - 1}}\int _{{t_0}}^t {s{{\left\| {\dot{x}\left( s \right) } \right\| }^2}ds} \leqslant \frac{{{t_0}^2}}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( {{t_0}} \right) \\&\quad + \frac{\alpha -1-b}{{(\alpha - 1)^2}}\int _{{t_0}}^t {s\beta \left( s \right) \left( {{\Phi _s}\left( {x\left( s \right) } \right) - m} \right) ds.} \end{aligned}$$

According to (ii), there exists a constant ${\tilde{C}}$ such that $\frac{\alpha -1-b}{{(\alpha - 1)^2}}\int _{{t_0}}^{+\infty } s\beta \left( s \right) \left( {\Phi _s}\left( {x\left( s \right) } \right) - m \right) ds < {\tilde{C}}$. Then,

$$\begin{aligned} \frac{1}{{\alpha - 1}}\int _{{t_0}}^{+\infty } {s{{\left\| {\dot{x}\left( s \right) } \right\| }^2}ds} < \frac{{{t_0}^2}}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( {{t_0}} \right) + {\tilde{C}}. \end{aligned}$$

From the definition of W(t) in (2.1), we have

$$\begin{aligned} \begin{aligned}&\frac{1}{{\alpha - 1}}\int _{{t_0}}^{ + \infty } {tW\left( t \right) dt} \\&\quad = \frac{1}{{\alpha - 1}}\int _{{t_0}}^{ + \infty } {\frac{1}{2}t{{\left\| {\dot{x}\left( t \right) } \right\| }^2}dt} + \frac{1}{{\alpha - 1}}\int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) dt} \\&\quad < \frac{{{t_0}^2}}{{2{{\left( {\alpha - 1} \right) }^2}}}W\left( {{t_0}} \right) + \frac{1}{2}{\tilde{C}} + \frac{\alpha -1}{\alpha -1-b}{\tilde{C}}, \\ \end{aligned} \end{aligned}$$

which implies that

$$\begin{aligned} \int _{{t_0}}^{ + \infty } {tW\left( t \right) dt < } + \infty . \end{aligned}$$

(4.10)

In addition, according to Lemma 2.2, we have

$$\begin{aligned} \begin{aligned}&\frac{d}{{dt}}\left( {\frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( t \right) } \right) \\&\quad = \frac{{2t}}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( t \right) + \frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}\dot{W}\left( t \right) \\&\quad \leqslant \frac{{2t}}{{{{\left( {\alpha - 1} \right) }^2}}}\left( {\frac{1}{2}{{\left\| {\dot{x}\left( t \right) } \right\| }^2} + \beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) } \right) \\&\qquad + \frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}\left( { - \frac{\alpha }{t}{{\left\| {\dot{x}\left( t \right) } \right\| }^2} + \dot{\beta }\left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) } \right) \\&\quad = \frac{1}{{{{\left( {\alpha - 1} \right) }^2}}}\left( t{{\left\| {\dot{x}\left( t \right) } \right\| }^2} + 2t\beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) - \alpha t{{\left\| {\dot{x}\left( t \right) } \right\| }^2} \right. \\&\qquad \left. + {t^2}\dot{\beta }\left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) \right) \\&\quad = \frac{1}{{{{\left( {\alpha - 1} \right) }^2}}}\left( {\left( {1 - \alpha } \right) t{{\left\| {\dot{x}\left( t \right) } \right\| }^2} + t\left( {2\beta \left( t \right) + t\dot{\beta }\left( t \right) } \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) } \right) \\&\quad = \frac{1}{{1 - \alpha }}t{\left\| {\dot{x}\left( t \right) } \right\| ^2} + \frac{1}{{{{\left( {\alpha - 1} \right) }^2}}}t\left( {2\beta \left( t \right) + t\dot{\beta }\left( t \right) } \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) \\&\quad \leqslant \frac{1}{{{{\left( {\alpha - 1} \right) }^2}}}t\left( {2\beta \left( t \right) + t\dot{\beta }\left( t \right) } \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) , \\ \end{aligned} \end{aligned}$$

where the last inequality is due to $\alpha >1$. Recalling the assumption that $t\dot{\beta }\left( t \right) \leqslant \left( {\alpha - 3-b} \right) \beta \left( t \right) $, we immediately have

$$\begin{aligned} \frac{d}{{dt}}\left( {\frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( t \right) } \right) \leqslant \frac{\alpha -1-b}{{(\alpha - 1)^2}}t\beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) . \end{aligned}$$

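According to (ii), we see that the positive part of $\frac{d}{{dt}}\left( {\frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( t \right) } \right) $ belongs to ${L^1}\left( {{t_0}, + \infty } \right) $. Since $\frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( t \right) $ is nonnegative, and hence bounded from below, Lemma 2.1 yields that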

$$\begin{aligned} \mathop {\lim }\limits _{t \rightarrow + \infty } \frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( t \right) \text { exists,} \end{aligned}$$

which implies that

$$\begin{aligned} \mathop {\lim }\limits _{t \rightarrow + \infty } \frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( t \right) = 0. \end{aligned}$$

Otherwise, there would exist a constant $\bar{C} > 0$ such that $\frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( t \right) \geqslant \bar{C}$ for t sufficiently large, i.e. $\frac{t}{{{{\left( {\alpha - 1} \right) }^2}}}W\left( t \right) \geqslant \frac{{\bar{C}}}{t}.$ According to (4.10), we would then have $\int _{{t_0}}^{ + \infty } {\frac{{\bar{C}}}{t}dt < } + \infty $, which is a contradiction since this integral diverges. According to the definition of W(t) in (2.1) again, we have

$$\begin{aligned} \mathop {\lim }\limits _{t \rightarrow + \infty } \frac{{{t^2}}}{{{{\left( {\alpha - 1} \right) }^2}}}\beta \left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m} \right) = 0. \end{aligned}$$

Recall that ${\Phi _t}\left( x(t) \right) = \Phi \left( x(t)\right) + \frac{{\varepsilon \left( t \right) }}{2}{\left\| x(t) \right\| ^2}$ with $\varepsilon \left( t \right) \ge 0$, so that $0 \le \Phi \left( x(t)\right) - m \le {\Phi _t}\left( x(t) \right) - m$. Combining this with the above relation, we obtain further that

$$\begin{aligned} \Phi \left( {x\left( t \right) } \right) - m = o\left( {\frac{1}{{\beta \left( t \right) {t^2}}}} \right) , \end{aligned}$$

which proves (iii).

Finally, we prove (iv). According to (4.8), we have

$$\begin{aligned} \dot{E}\left( t \right) + g\left( t \right) \beta \left( t \right) \frac{{\varepsilon \left( t \right) }}{2}{\left\| {x\left( t \right) - z} \right\| ^2} \leqslant g\left( t \right) \beta \left( t \right) \frac{{\varepsilon \left( t \right) }}{2}{\left\| z \right\| ^2}. \end{aligned}$$

(4.11)

By integrating (4.11) from $t_0$ to $\infty $, we obtain that

$$\begin{aligned} \int _{{t_0}}^{ + \infty } {g\left( t \right) \beta \left( t \right) \frac{{\varepsilon \left( t \right) }}{2}{{\left\| {x\left( t \right) - z} \right\| }^2}dt < } + \infty , \end{aligned}$$

(4.12)

which follows from the fact ${g\left( t \right) = \frac{t}{{\alpha - 1}}}$, the assumption $\int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \varepsilon \left( t \right) dt < + \infty }$, and the fact that E is bounded from below. Using again the definition of $g(t) = \frac{t}{{\alpha - 1}}$, the assumption $\alpha >1$, and (4.12), we immediately have

$$\begin{aligned} \int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \frac{{\varepsilon \left( t \right) }}{2}{{\left\| {x\left( t \right) - z} \right\| }^2}dt < } + \infty . \end{aligned}$$

Combining this relation with the assumption that $\int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \varepsilon \left( t \right) dt < + \infty }$ and the fact $\left\| {x\left( t \right) } \right\| ^2 \le 2\left\| {x\left( t \right) - z} \right\| ^2 + 2\left\| z \right\| ^2$, $t>0$, $\beta (t)\ge 0$, $\varepsilon (t)\ge 0$, we obtain further that

$$\begin{aligned} \int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \varepsilon \left( t \right) {{\left\| {x\left( t \right) } \right\| }^2}dt < } + \infty . \end{aligned}$$

(4.13)

Next, we establish the boundedness of the trajectory of (1.4). Recalling the definition of E(t) in (4.3) and the boundedness of E(t) obtained in the proof of (ii), we see that there exists $C_2>0$ such that

$$\begin{aligned} \frac{1}{2}{\left\| {x\left( t \right) - z + g\left( t \right) \dot{x}\left( t \right) } \right\| ^2} \leqslant C_2. \end{aligned}$$

(4.14)

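Expanding the square in (4.14) and dropping the nonnegative term $g{\left( t \right) ^2}{\left\| {\dot{x}\left( t \right) } \right\| ^2}$, we have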

$$\begin{aligned} {\left\| {x\left( t \right) - z} \right\| ^2} + 2g\left( t \right) \left\langle {x\left( t \right) - z,\dot{x}\left( t \right) } \right\rangle \leqslant 2C_2. \end{aligned}$$

(4.15)

After dividing (4.15) by $p\left( t \right) = {\left( {\frac{t}{{{t_0}}}} \right) ^\alpha }$, we obtain that

$$\begin{aligned} \frac{\left\| {x\left( t \right) - z} \right\| ^2}{p(t)} + \frac{2g\left( t \right) }{p(t)}\left\langle {x\left( t \right) - z,\dot{x}\left( t \right) } \right\rangle \leqslant \frac{2C_2}{p(t)}. \end{aligned}$$

Combining this relation with the definition of ${h_z}\left( t \right) = \frac{1}{2}{\left\| {x\left( t \right) - z} \right\| ^2}$ in (2.5) and $ {{\dot{h}}_z}\left( t \right) = \left\langle {\dot{x}\left( t \right) ,x\left( t \right) - z} \right\rangle $ from (2.6), we obtain further that

$$\begin{aligned} \frac{{{h_z}\left( t \right) }}{{p\left( t \right) }} + q\left( t \right) {{\dot{h}}_z}\left( t \right) \leqslant \frac{C_2}{{p\left( t \right) }}, \end{aligned}$$

(4.16)

where $q(t)= \frac{{g\left( t \right) }}{{p\left( t \right) }}$. Using the definition of ${g\left( t \right) = \frac{t}{{\alpha - 1}}}$ and $p\left( t \right) = {\left( {\frac{t}{{{t_0}}}} \right) ^\alpha }$, we can easily compute q(t) as $q(t)= \frac{{g\left( t \right) }}{{p\left( t \right) }} = \frac{t}{{\alpha - 1}}\cdot \frac{t_0^\alpha }{t^\alpha } = \frac{{t_0^\alpha {t^{1 - \alpha }}}}{{\alpha - 1}}$. Hence, we have $\dot{q}(t) = - \frac{1}{{p\left( t \right) }}$, and q(t) is bounded on $\left[ {{t_0}, + \infty } \right) $ because $\alpha >1$. From this discussion, we can rewrite (4.16) as

$$\begin{aligned} q\left( t \right) {{\dot{h}}_z}\left( t \right) - \dot{q}\left( t \right) \left( {{h_z}\left( t \right) - C_2} \right) \leqslant 0, \end{aligned}$$

dividing this inequality by $q{\left( t \right) ^2} > 0$, we have

$$\begin{aligned} \frac{q\left( t \right) {{\dot{h}}_z}\left( t \right) - \dot{q}\left( t \right) \left( {{h_z}\left( t \right) - C_2} \right) }{q{\left( t \right) ^2}} \leqslant 0, \end{aligned}$$

which is equivalent to

$$\begin{aligned} \frac{d}{{dt}}\left( {\frac{{{h_z}\left( t \right) - C_2}}{{q\left( t \right) }}} \right) \leqslant 0. \end{aligned}$$

Hence, by integrating the above inequality from $t_0$ to t, we see that there exists $C_3>0$ such that

$$\begin{aligned} {h_z}\left( t \right) \leqslant C_3\left( {1 + q\left( t \right) } \right) . \end{aligned}$$

Note that q(t) is bounded due to the fact $\alpha >1$ from the assumption, combining this with the definition of $h_z(t) = \frac{1}{2}{\left\| {x\left( t \right) - z} \right\| ^2}$, we immediately obtain that x(t) is bounded. This completes the proof.
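To illustrate item (i) of Theorem 4.1, the following minimal sketch integrates (1.4) with a hand-written classical Runge-Kutta scheme and monitors the integrated form of (i), namely $E(t) \le E(t_0) + \frac{\Vert z\Vert ^2}{2(\alpha -1)}\int _{t_0}^{t} s\beta (s)\varepsilon (s)ds$. Everything concrete here is an assumption made for illustration: $\mathcal {H}=\mathbb {R}^2$, $\Phi (x)=\frac{1}{2}x_1^2$ (whose minimizers are not unique), $z=(0,1)$, $\alpha =4$, $\beta (t)=1$, $\varepsilon (t)=t^{-3}$, the initial data, the step size, and all function names.

```python
import math

ALPHA = 4.0                      # with beta constant, the growth condition holds with b = 1
T0, T_END, H = 1.0, 30.0, 0.002  # time window and step size (illustrative)
Z = (0.0, 1.0)                   # a minimizer of Phi that is not of minimal norm


def beta(t):
    return 1.0


def eps(t):
    return t ** -3


def phi(x):
    return 0.5 * x[0] ** 2       # min Phi = 0, argmin Phi = {x1 = 0}


def grad_phi(x):
    return (x[0], 0.0)


def rhs(t, s):
    """First-order form of (1.4); the state is s = (x1, x2, v1, v2)."""
    x, v = s[:2], s[2:]
    g = grad_phi(x)
    acc = tuple(-(ALPHA / t) * v[i] - beta(t) * (g[i] + eps(t) * x[i])
                for i in range(2))
    return v + acc


def rk4_step(t, s, h):
    def shift(a, b, c):
        return tuple(ai + c * bi for ai, bi in zip(a, b))
    k1 = rhs(t, s)
    k2 = rhs(t + h / 2, shift(s, k1, h / 2))
    k3 = rhs(t + h / 2, shift(s, k2, h / 2))
    k4 = rhs(t + h, shift(s, k3, h))
    return tuple(si + h / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))


def energy(t, s):
    """E(t) from (4.1) with m = min Phi = 0."""
    x, v = s[:2], s[2:]
    g = t / (ALPHA - 1.0)
    phi_t = phi(x) + 0.5 * eps(t) * (x[0] ** 2 + x[1] ** 2)
    w = tuple(x[i] - Z[i] + g * v[i] for i in range(2))
    return g * g * beta(t) * phi_t + 0.5 * (w[0] ** 2 + w[1] ** 2)


def max_violation():
    """Largest value of E(t) minus the integrated bound from (i)."""
    t, s = T0, (1.0, -1.0, 0.0, 0.0)
    e0 = energy(t, s)
    z_norm2 = Z[0] ** 2 + Z[1] ** 2
    worst = -math.inf
    for k in range(round((T_END - T0) / H)):
        s = rk4_step(t, s, H)
        t = T0 + (k + 1) * H
        # closed form of the integral of s*beta(s)*eps(s) for beta = 1, eps = s^-3
        integral = 1.0 / T0 - 1.0 / t
        bound = e0 + z_norm2 / (2.0 * (ALPHA - 1.0)) * integral
        worst = max(worst, energy(t, s) - bound)
    return worst


if __name__ == "__main__":
    print("max of E(t) - bound:", max_violation())
```

A nonpositive return value, up to discretization error, is consistent with (i); the experiment does not replace the proof and only covers this particular instance.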

Remark 4.1

From Theorem 4.1, we see that if $\beta (t) = 1$, then we have

$$\begin{aligned} \Phi \left( {x\left( t \right) } \right) - m = o\left( {\frac{1}{{{t^2}}}} \right) , \end{aligned}$$

which is just the result obtained by Attouch, Chbani, Riahi [4]. Furthermore, the assumption $t\dot{\beta }\left( t \right) \leqslant \left( {\alpha - 3-b} \right) \beta \left( t \right) $ and the assumption $\int _{{t_0}}^{ + \infty } {t\beta \left( t \right) \varepsilon \left( t \right) dt < + \infty }$ in Theorem 4.1 reduce to $\alpha > 3$ and $\int _{{t_0}}^{ + \infty } {t\varepsilon \left( t \right) dt < + \infty }$ in [4]. Hence our results are more general.

Remark 4.2

From Theorem 4.1, we see that if $\beta (t) = t$, then we have

$$\begin{aligned} \Phi \left( {x\left( t \right) } \right) - m = o\left( {\frac{1}{{{t^3}}}} \right) . \end{aligned}$$

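Furthermore, the assumption $t\dot{\beta }\left( t \right) \leqslant \left( {\alpha - 3-b} \right) \beta \left( t \right) $ and the assumption $\int _{{t_0}}^{ + \infty } t\beta \left( t \right) \varepsilon \left( t \right) dt < + \infty $ in Theorem 4.1 reduce to $\alpha \geqslant 4+b$ and $\int _{{t_0}}^{ + \infty } t^2\varepsilon \left( t \right) dt < + \infty $, respectively. This also shows that the convergence rate of the function value can be faster than $O\left( {\frac{1}{{{t^2}}}} \right) $ if $\alpha >3$: for instance, $\beta (t)=t^{\delta }$ with $0<\delta \leqslant \alpha -3-b$ satisfies $t\dot{\beta }\left( t \right) = \delta \beta \left( t \right) \leqslant \left( {\alpha - 3-b} \right) \beta \left( t \right) $ and, provided that $\int _{{t_0}}^{ + \infty } t^{1+\delta }\varepsilon \left( t \right) dt < + \infty $, yields the rate $o\left( {\frac{1}{{{t^{2+\delta }}}}} \right) $.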
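The two remarks can be compared on a one-dimensional toy problem. The sketch below is only an illustration under assumptions that are not taken from the paper: $\Phi (x)=\frac{1}{2}x^2$ (so $m=0$ and $z=0$), initial data $x(t_0)=1$, $\dot{x}(t_0)=0$ with $t_0=1$, the pairs $(\alpha ,\beta ,\varepsilon )=(4,1,t^{-3})$ and $(5,t,t^{-4})$, and a hand-written Runge-Kutta integrator named `simulate`. Since $z=0$, item (i) of Theorem 4.1 gives $\dot{E}\le 0$, hence $t^2\beta (t)\left( \Phi (x(t))-m\right) \le (\alpha -1)^2E(t_0)$, which is the bound returned alongside the scaled gap.

```python
def simulate(alpha, beta, eps, t0=1.0, t_end=40.0, h=0.002, x0=1.0, v0=0.0):
    """Integrate x'' + (alpha/t) x' + beta(t) (x + eps(t) x) = 0 by classical RK4.

    Returns the pair (t^2 beta(t) Phi(x(t)) at t = t_end, (alpha - 1)^2 E(t0)),
    where Phi(x) = x^2 / 2 and E is the energy (4.1) with z = 0 and m = 0.
    """
    def f(t, x, v):
        return v, -(alpha / t) * v - beta(t) * (1.0 + eps(t)) * x

    g0 = t0 / (alpha - 1.0)
    phi_t0 = 0.5 * (1.0 + eps(t0)) * x0 ** 2          # Phi_t(x0) - m at t = t0
    e0 = g0 ** 2 * beta(t0) * phi_t0 + 0.5 * (x0 + g0 * v0) ** 2

    t, x, v = t0, x0, v0
    for k in range(round((t_end - t0) / h)):
        k1x, k1v = f(t, x, v)
        k2x, k2v = f(t + h / 2, x + h / 2 * k1x, v + h / 2 * k1v)
        k3x, k3v = f(t + h / 2, x + h / 2 * k2x, v + h / 2 * k2v)
        k4x, k4v = f(t + h, x + h * k3x, v + h * k3v)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t = t0 + (k + 1) * h

    scaled_gap = t ** 2 * beta(t) * 0.5 * x ** 2
    return scaled_gap, (alpha - 1.0) ** 2 * e0


if __name__ == "__main__":
    # Setting of Remark 4.1: beta = 1, so alpha > 3 is required.
    print("beta = 1:", simulate(4.0, lambda t: 1.0, lambda t: t ** -3))
    # Setting of Remark 4.2: beta = t, so alpha >= 4 + b is required.
    print("beta = t:", simulate(5.0, lambda t: t, lambda t: t ** -4))
```

In both runs the first returned number should stay below the second; the theorem further predicts that the scaled gap tends to zero as the final time grows, which can be explored by increasing `t_end`.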

4.2 Case $\int _{{t_0}}^{ + \infty } {\frac{{\varepsilon \left( t \right) \beta \left( t \right) }}{t}dt = + \infty } $

For each $\varepsilon > 0$, we use $x_\varepsilon $ to denote the unique solution of the strongly convex minimization problem

$$\begin{aligned} {x_\varepsilon } = \mathop {\mathrm{arg}\,\mathrm{min}}\limits _{x \in \mathcal {H}} \left\{ {\Phi \left( x \right) + \frac{\varepsilon }{2}{{\left\| x \right\| }^2}} \right\} . \end{aligned}$$

From the first order optimality condition, we immediately have

$$\begin{aligned} \nabla \Phi \left( {{x_\varepsilon }} \right) + \varepsilon {x_\varepsilon } = 0. \end{aligned}$$

Let us recall the Tikhonov approximation curve, $\varepsilon \mapsto {x_\varepsilon }$, which satisfies the well-known strong convergence property:

$$\begin{aligned} \mathop {\lim }\limits _{\varepsilon \rightarrow 0} {x_\varepsilon } = p, \end{aligned}$$

(4.17)

where p is the element of minimal norm of the closed convex nonempty set $\arg \min \Phi $. Moreover, by the monotonicity property of $\nabla \Phi $ , and $\nabla \Phi \left( p \right) = 0$, $\nabla \Phi \left( {{x_\varepsilon }} \right) = - \varepsilon {x_\varepsilon }$, we have

$$\begin{aligned} \left\langle {{x_\varepsilon } - p, - \varepsilon {x_\varepsilon }} \right\rangle \geqslant 0, \end{aligned}$$

which, after dividing by $\varepsilon > 0$, and by Cauchy-Schwarz inequality gives

$$\begin{aligned} \left\| x_{\varepsilon }\right\| \le \Vert p\Vert \quad \text{ for } \text{ all } \varepsilon >0. \end{aligned}$$

(4.18)
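The properties (4.17) and (4.18) of the Tikhonov approximation curve can be seen explicitly on a small example. The sketch below assumes $\mathcal {H}=\mathbb {R}^2$ and $\Phi (x)=\frac{1}{2}(x_1+x_2-2)^2$, a choice made here purely for illustration. Its solution set is the line $x_1+x_2=2$, whose element of minimal norm is $p=(1,1)$, and the optimality condition $\nabla \Phi (x_\varepsilon )+\varepsilon x_\varepsilon =0$ can be solved by hand, giving $x_\varepsilon =\frac{2}{2+\varepsilon }(1,1)$.

```python
import math

P = (1.0, 1.0)  # minimal-norm element of argmin Phi = {x1 + x2 = 2}


def tikhonov_point(eps):
    """Minimizer of Phi(x) + (eps/2)||x||^2 for Phi(x) = 0.5 * (x1 + x2 - 2)^2."""
    c = 2.0 / (2.0 + eps)
    return (c, c)


def optimality_residual(x, eps):
    """grad Phi(x) + eps * x, which vanishes at x = x_eps."""
    s = x[0] + x[1] - 2.0
    return tuple(s + eps * xi for xi in x)


def norm(x):
    return math.hypot(x[0], x[1])


def distance_to_p(eps):
    x = tikhonov_point(eps)
    return norm((x[0] - P[0], x[1] - P[1]))


if __name__ == "__main__":
    for eps in (1.0, 0.1, 0.01, 0.001):
        x = tikhonov_point(eps)
        print(eps, x, norm(x) <= norm(P), distance_to_p(eps))
```

In this example $\Vert x_\varepsilon \Vert =\frac{2\sqrt{2}}{2+\varepsilon }<\sqrt{2}=\Vert p\Vert $ and $\Vert x_\varepsilon -p\Vert =\frac{\sqrt{2}\,\varepsilon }{2+\varepsilon }\rightarrow 0$, in agreement with (4.18) and (4.17).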

Theorem 4.2

Let $\Phi :\mathcal {H} \rightarrow \mathbb {R}$ be a convex continuously differentiable function such that $\arg \min \Phi $ is nonempty. Suppose that $\varepsilon (t)$, $\beta (t)$ satisfy condition $(H_1)$, that $\beta (t)$ is a nonincreasing function with $\int _{{t_0}}^{ + \infty } {\frac{{\varepsilon \left( t \right) \beta \left( t \right) }}{t}dt = + \infty } $, and that $\alpha > 1$. Let $x(\cdot )$ be a classical global solution of (1.4). Then $\mathop {\lim \inf }\limits _{t \rightarrow \infty } \left\| {x\left( t \right) - p} \right\| = 0,$ where p is the element of minimal norm of $\arg \min \Phi $. Moreover, the ergodic convergence property holds, which is

$$\begin{aligned} \mathop {\lim }\limits _{t \rightarrow \infty } \frac{1}{{\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }} d\tau }}\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }\left\| {x\left( \tau \right) - p} \right\| ^2} d\tau = 0. \end{aligned}$$

Proof

From Lemma 2.2, we have

$$\begin{aligned} \frac{{dW\left( t \right) }}{{dt}} \leqslant - \frac{\alpha }{t}{\left\| {\dot{x}\left( t \right) } \right\| ^2} + \dot{\beta }\left( t \right) \left( {{\Phi _t}\left( {x\left( t \right) } \right) - m } \right) . \end{aligned}$$

According to the assumption that $\beta (t)$ is a nonincreasing function, we have

$$\begin{aligned} \frac{{dW\left( t \right) }}{{dt}} \leqslant - \frac{\alpha }{t}{\left\| {\dot{x}\left( t \right) } \right\| ^2}. \end{aligned}$$

Hence, W(t) is nonincreasing and, being nonnegative, $\mathop {\lim }\limits _{t \rightarrow + \infty } W\left( t \right) $ exists in $\mathbb {R}$. Then, by the definition of W(t) in (2.1), we obtain further that $\mathop {\sup }\limits _{t \geqslant {t_0}} \left\| {\dot{x}\left( t \right) } \right\| < + \infty $ and that

$$\begin{aligned} \int _{{t_0}}^{ + \infty } {\frac{{{{\left\| {\dot{x}\left( t \right) } \right\| }^2}}}{t}} dt \leqslant \frac{1}{\alpha }\left( {W\left( {{t_0}} \right) - \mathop {\lim }\limits _{t \rightarrow + \infty } W\left( t \right) } \right) < + \infty . \end{aligned}$$

(4.19)

Now, we introduce an auxiliary function $h_{p}(t)$, which is defined by

$$\begin{aligned} {h_p}\left( t \right) = \frac{1}{2}{\left\| {x\left( t \right) - p} \right\| ^2}, \end{aligned}$$

(4.20)

where p is the element of minimal norm of $\arg \min \Phi $. Computing the first and second derivatives of $h_{p}(t)$, we have

$$\begin{aligned} \dot{h}_{p}(t)=\langle \dot{x}(t), x(t)-p\rangle , ~~~~~~~~~ \ddot{h}_{p}(t)=\Vert \dot{x}(t)\Vert ^{2}+\langle \ddot{x}(t), x(t)-p\rangle . \end{aligned}$$

(4.21)

Hence, we deduce that

$$\begin{aligned} \ddot{h}_{p}(t)+\frac{\alpha }{t} \dot{h}_{p}(t)=\Vert \dot{x}(t)\Vert ^{2}+\left\langle \ddot{x}(t)+\frac{\alpha }{t} \dot{x}(t), x(t)-p\right\rangle . \end{aligned}$$

(4.22)

Moreover, recall the definition ${\Phi _t}\left( x \right) = \Phi \left( x \right) + \frac{{\varepsilon \left( t \right) }}{2}{\left\| x \right\| ^2}$ in (2.1). Since $\Phi $ is convex and $\varepsilon (t)\ge 0$, the function $\Phi _t$ is strongly convex with modulus $\varepsilon (t)$. Then we have

$$\begin{aligned} {\Phi _t}\left( p \right) \geqslant {\Phi _t}\left( {x\left( t \right) } \right) + \left\langle {\nabla {\Phi _t}\left( {x\left( t \right) } \right) ,p - x\left( t \right) } \right\rangle + \frac{{\varepsilon \left( t \right) }}{2}{\left\| {p - x\left( t \right) } \right\| ^2}. \end{aligned}$$

Combining this relation with system (1.4), we obtain further that

$$\begin{aligned} {\Phi _t}\left( p \right) \geqslant {\Phi _t}\left( {x\left( t \right) } \right) + \frac{1}{{\beta \left( t \right) }}\left\langle { - \ddot{x}\left( t \right) - \frac{\alpha }{t}\dot{x}\left( t \right) ,p - x\left( t \right) } \right\rangle + \frac{{\varepsilon \left( t \right) }}{2}{\left\| {p - x\left( t \right) } \right\| ^2}. \end{aligned}$$

From this relation and the definition of $h_p$ in (4.20), we have

$$\begin{aligned} \frac{1}{{\beta \left( t \right) }}\left\langle {x\left( t \right) - p,\ddot{x}\left( t \right) + \frac{\alpha }{t}\dot{x}\left( t \right) } \right\rangle + \varepsilon \left( t \right) {h_p}\left( t \right) \leqslant {\Phi _t}\left( p \right) - {\Phi _t}\left( {x\left( t \right) } \right) . \end{aligned}$$

(4.23)

By the definition of $x_\varepsilon $ and $\Phi _t$, we immediately get

$$\begin{aligned} {\Phi _t}\left( {{x_\varepsilon }\left( t \right) } \right) = \Phi \left( {{x_\varepsilon }\left( t \right) } \right) + \frac{{\varepsilon \left( t \right) }}{2}{\left\| {{x_\varepsilon }\left( t \right) } \right\| ^2} \leqslant \Phi \left( {x\left( t \right) } \right) + \frac{{\varepsilon \left( t \right) }}{2}{\left\| {x\left( t \right) } \right\| ^2} = {\Phi _t}\left( {x\left( t \right) } \right) . \end{aligned}$$

Combining (4.23) with the above relation, we obtain further that

$$\begin{aligned} \frac{1}{{\beta \left( t \right) }}\left\langle {x\left( t \right) - p,\ddot{x}\left( t \right) + \frac{\alpha }{t}\dot{x}\left( t \right) } \right\rangle + \varepsilon \left( t \right) {h_p}\left( t \right) \leqslant {\Phi _t}\left( p \right) - {\Phi _t}\left( {{x_\varepsilon }\left( t \right) } \right) . \end{aligned}$$

(4.24)

Since p is the element of minimal norm of $\arg \min \Phi $, we have $\Phi \left( p \right) \leqslant \Phi \left( {{x_\varepsilon }\left( t \right) } \right) $. Using this and the definition of $\Phi _t$, we obtain that

$$\begin{aligned} \begin{aligned} {\Phi _t}\left( p \right) - {\Phi _t}\left( {{x_\varepsilon }\left( t \right) } \right)&= \Phi \left( p \right) + \frac{{\varepsilon \left( t \right) }}{2}{\left\| p \right\| ^2} - \Phi \left( {{x_\varepsilon }\left( t \right) } \right) - \frac{{\varepsilon \left( t \right) }}{2}{\left\| {{x_\varepsilon }\left( t \right) } \right\| ^2} \\&\leqslant \frac{{\varepsilon \left( t \right) }}{2}\left( {{{\left\| p \right\| }^2} - {{\left\| {{x_\varepsilon }\left( t \right) } \right\| }^2}} \right) . \end{aligned} \end{aligned}$$

(4.25)

Combining (4.24) with (4.25) together, we obtain further that

$$\begin{aligned} \frac{1}{{\beta \left( t \right) }}\left\langle {x\left( t \right) - p,\ddot{x}\left( t \right) + \frac{\alpha }{t}\dot{x}\left( t \right) } \right\rangle + \varepsilon \left( t \right) {h_p}\left( t \right) \leqslant \frac{{\varepsilon \left( t \right) }}{2}\left( {{{\left\| p \right\| }^2} - {{\left\| {{x_\varepsilon }\left( t \right) } \right\| }^2}} \right) . \end{aligned}$$

Multiplying both sides of the above inequality by $\beta (t)$, we have

$$\begin{aligned} \left\langle {x\left( t \right) - p,\ddot{x}\left( t \right) + \frac{\alpha }{t}\dot{x}\left( t \right) } \right\rangle + \beta \left( t \right) \varepsilon \left( t \right) {h_p}\left( t \right) \leqslant \frac{{\beta \left( t \right) \varepsilon \left( t \right) }}{2}\left( {{{\left\| p \right\| }^2} - {{\left\| {{x_\varepsilon }\left( t \right) } \right\| }^2}} \right) . \end{aligned}$$

(4.26)

Combining (4.22) with (4.26) together, we obtain that

$$\begin{aligned} {{\ddot{h}}_p}\left( t \right) + \frac{\alpha }{t}{{\dot{h}}_p}\left( t \right) + \beta \left( t \right) \varepsilon \left( t \right) {h_p}\left( t \right) \leqslant {\left\| {\dot{x}\left( t \right) } \right\| ^2} + \frac{{\beta \left( t \right) \varepsilon \left( t \right) }}{2}\left( {{{\left\| p \right\| }^2} - {{\left\| {{x_\varepsilon }\left( t \right) } \right\| }^2}} \right) . \end{aligned}$$

(4.27)

On the other hand, a direct computation gives

$$\begin{aligned} {{\ddot{h}}_p}\left( t \right) + \frac{\alpha }{t}{{\dot{h}}_p}\left( t \right) = \frac{1}{{{t^\alpha }}}\frac{d}{{dt}}\left( {{t^\alpha }{{\dot{h}}_p}\left( t \right) } \right) . \end{aligned}$$
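The identity above is simply the product rule applied to $t^\alpha \dot{h}_p(t)$. Written out as a short check (no new assumptions are needed beyond $t \geqslant t_0 > 0$):

$$\begin{aligned} \frac{1}{{{t^\alpha }}}\frac{d}{{dt}}\left( {{t^\alpha }{{\dot{h}}_p}\left( t \right) } \right) = \frac{1}{{{t^\alpha }}}\left( {\alpha {t^{\alpha - 1}}{{\dot{h}}_p}\left( t \right) + {t^\alpha }{{\ddot{h}}_p}\left( t \right) } \right) = \frac{\alpha }{t}{{\dot{h}}_p}\left( t \right) + {{\ddot{h}}_p}\left( t \right) . \end{aligned}$$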

Hence, from the above relation and (4.27), we obtain that

$$\begin{aligned} \beta \left( t \right) \varepsilon \left( t \right) {h_p}\left( t \right) \leqslant {\left\| {\dot{x}\left( t \right) } \right\| ^2} + \frac{{\beta \left( t \right) \varepsilon \left( t \right) }}{2}\left( {{{\left\| p \right\| }^2} - {{\left\| {{x_\varepsilon }\left( t \right) } \right\| }^2}} \right) - \frac{1}{{{t^\alpha }}}\frac{d}{{dt}}\left( {{t^\alpha }{{\dot{h}}_p}\left( t \right) } \right) . \end{aligned}$$

Dividing both sides of the above formula by t, we deduce that

$$\begin{aligned} \frac{{\beta \left( t \right) \varepsilon \left( t \right) {h_p}\left( t \right) }}{t} \leqslant \frac{{{{\left\| {\dot{x}\left( t \right) } \right\| }^2}}}{t} + \frac{{\beta \left( t \right) \varepsilon \left( t \right) }}{{2t}}\left( {{{\left\| p \right\| }^2} - {{\left\| {{x_\varepsilon }\left( t \right) } \right\| }^2}} \right) - \frac{1}{{{t^{\alpha + 1}}}}\frac{d}{{dt}}\left( {{t^\alpha }{{\dot{h}}_p}\left( t \right) } \right) . \end{aligned}$$

Define $\delta \left( t \right) = \frac{1}{2}\left( {{{\left\| p \right\| }^2} - {{\left\| {{x_\varepsilon }\left( t \right) } \right\| }^2}} \right) $. From the assumption $\lim \limits _{t\rightarrow \infty } \varepsilon (t) =0$ and (4.17), (4.18), we see that $\mathop {\lim }\limits _{t \rightarrow \infty } \delta \left( t \right) = 0$. Moreover,

$$\begin{aligned} \frac{{\beta \left( t \right) \varepsilon \left( t \right) {h_p}\left( t \right) }}{t} \leqslant \frac{{{{\left\| {\dot{x}\left( t \right) } \right\| }^2}}}{t} + \frac{{\beta \left( t \right) \varepsilon \left( t \right) \delta \left( t \right) }}{t} - \frac{1}{{{t^{\alpha + 1}}}}\frac{d}{{dt}}\left( {{t^\alpha }{{\dot{h}}_p}\left( t \right) } \right) . \end{aligned}$$

By rearranging terms, we have

$$\begin{aligned} \frac{{\beta \left( t \right) \varepsilon \left( t \right) }}{t}\left( {{h_p}\left( t \right) - \delta \left( t \right) } \right) \leqslant \frac{{{{\left\| {\dot{x}\left( t \right) } \right\| }^2}}}{t} - \frac{1}{{{t^{\alpha + 1}}}}\frac{d}{{dt}}\left( {{t^\alpha }{{\dot{h}}_p}\left( t \right) } \right) . \end{aligned}$$

(4.28)

By integrating (4.28) on $[t_0,t]$, there exists $C_4>0$ such that

$$\begin{aligned} \int _{{t_0}}^t {\frac{{\varepsilon \left( s \right) \beta \left( s \right) }}{s}} \left( {{h_p}\left( s \right) - \delta \left( s \right) } \right) ds \leqslant {C_4} - \int _{{t_0}}^t {\frac{1}{{{s^{\alpha + 1}}}}} \frac{d}{{ds}}\left( {{s^\alpha }{{\dot{h}}_p}\left( s \right) } \right) ds, \end{aligned}$$

(4.29)

which follows from (4.19).

Next, we analyze the last term on the right-hand side of the above formula, i.e., $\int _{{t_0}}^t {\frac{1}{{{s^{\alpha + 1}}}}} \frac{d}{{ds}}\left( {{s^\alpha }{{\dot{h}}_p}\left( s \right) } \right) ds$. Integrating by parts twice, we have

$$\begin{aligned} \begin{aligned}&\int _{{t_0}}^t {\frac{1}{{{s^{\alpha + 1}}}}} \frac{d}{{ds}}\left( {{s^\alpha }{{\dot{h}}_p}\left( s \right) } \right) ds \\&\quad = \left. {\frac{1}{s}{{\dot{h}}_p}\left( s \right) } \right| _{{t_0}}^{t} + \left( {\alpha + 1} \right) \int _{{t_0}}^t {\frac{1}{{{s^2}}}} {{\dot{h}}_p}\left( s \right) ds \\&\quad = \frac{1}{t}{{\dot{h}}_p}\left( t \right) - \frac{1}{t_0}{{\dot{h}}_p}\left( t_0 \right) + \frac{{\alpha + 1}}{{{t^2}}}{h_p}\left( t \right) - \frac{{\alpha + 1}}{{t_0^2}}{h_p}\left( {{t_0}} \right) + 2\left( {\alpha + 1} \right) \int _{{t_0}}^t {\frac{1}{{{s^3}}}} {h_p}\left( s \right) ds \\&\quad = C_5 + \frac{1}{t}{{\dot{h}}_p}\left( t \right) + \frac{{\alpha + 1}}{{{t^2}}}{h_p}\left( t \right) + 2\left( {\alpha + 1} \right) \int _{{t_0}}^t {\frac{1}{{{s^3}}}} {h_p}\left( s \right) ds, \end{aligned} \end{aligned}$$

where $C_5$ is another constant. Recalling the definition of $h_{p}(t)$ in (4.20), we see that $h_{p}(t)$ is nonnegative. From this and the fact that $\alpha >1$, we have

$$\begin{aligned} \int _{{t_0}}^t {\frac{1}{{{s^{\alpha + 1}}}}} \frac{d}{{ds}}\left( {{s^\alpha }{{\dot{h}}_p}\left( s \right) } \right) ds \geqslant C_5 + \frac{1}{t}{{\dot{h}}_p}\left( t \right) . \end{aligned}$$

Combining the above formula with (4.29), we have

$$\begin{aligned} \int _{{t_0}}^t {\frac{{\varepsilon \left( s \right) \beta \left( s \right) }}{s}} \left( {{h_p}\left( s \right) - \delta \left( s \right) } \right) ds \leqslant {C_4} -C_5 - \frac{1}{t}{{\dot{h}}_p}\left( t \right) \leqslant C_6 + \frac{1}{t}\left| {{{\dot{h}}_p}\left( t \right) } \right| , \end{aligned}$$

(4.30)

where $C_6$ is another constant.

From the fact that $\mathop {\sup }\limits _{t \geqslant {t_0}} \left\| {\dot{x}\left( t \right) } \right\| < + \infty $, arguing as in Lemma 2.3 (ii), we have $\mathop {\sup }\limits _{t \geqslant {t_0}} \frac{{\left| {{{\dot{h}}_p}\left( t \right) } \right| }}{t} < + \infty .$ Using this result and (4.30), we obtain that there exists another constant $\bar{C}>0$ such that

$$\begin{aligned} \int _{{t_0}}^t {\frac{{\varepsilon \left( s \right) \beta \left( s \right) }}{s}} \left( {{h_p}\left( s \right) - \delta \left( s \right) } \right) ds \leqslant \bar{C}. \end{aligned}$$

(4.31)
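The bound on $\left| {{{\dot{h}}_p}\left( t \right) } \right| /t$ used to pass from (4.30) to (4.31) can also be checked directly. The following is only a sketch; the symbol $M$ is introduced here for $\mathop {\sup }\limits _{t \geqslant {t_0}} \left\| {\dot{x}\left( t \right) } \right\| $, and $t_0>0$ is assumed:

$$\begin{aligned} \begin{aligned} \left\| {x\left( t \right) - p} \right\|&\leqslant \left\| {x\left( {{t_0}} \right) - p} \right\| + \int _{{t_0}}^t {\left\| {\dot{x}\left( s \right) } \right\| ds} \leqslant \left\| {x\left( {{t_0}} \right) - p} \right\| + M\left( {t - {t_0}} \right) , \\ \frac{{\left| {{{\dot{h}}_p}\left( t \right) } \right| }}{t}&\leqslant \frac{{\left\| {\dot{x}\left( t \right) } \right\| \left\| {x\left( t \right) - p} \right\| }}{t} \leqslant \frac{{M\left\| {x\left( {{t_0}} \right) - p} \right\| }}{{{t_0}}} + {M^2}, \qquad t \geqslant {t_0}. \end{aligned} \end{aligned}$$

The second line uses the Cauchy-Schwarz inequality in (4.21) together with $t \geqslant t_0$.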

Since $\int _{{t_0}}^{ + \infty } {\frac{{\varepsilon \left( t \right) \beta \left( t \right) }}{t}dt = + \infty } $ from the assumption, by (4.31), we obtain further that

$$\begin{aligned} \mathop {\lim \inf }\limits _{t \rightarrow \infty } \left( {{h_p}\left( t \right) - \delta \left( t \right) } \right) \leqslant 0. \end{aligned}$$

Note that $\mathop {\lim }\limits _{t \rightarrow \infty } \delta \left( t \right) = 0$ and that $h_p(t)$ is nonnegative. Hence, $\mathop {\lim \inf }\limits _{t \rightarrow \infty } {h_p}\left( t \right) = 0,$ which implies that $\mathop {\lim \inf }\limits _{t \rightarrow \infty } \left\| {x\left( t \right) - p} \right\| = 0.$ This proves the first assertion of the theorem, i.e., the strong convergence of the trajectory x(t) to p in the lim inf sense.
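The step from (4.31) to the lim inf bound can be spelled out by contradiction. In the sketch below, the constants $c>0$ and $T\geqslant t_0$ are introduced only for this argument. Suppose that $\mathop {\lim \inf }\limits _{t \rightarrow \infty } \left( {{h_p}\left( t \right) - \delta \left( t \right) } \right) > 0$. Then there exist $c>0$ and $T \geqslant t_0$ such that ${h_p}\left( s \right) - \delta \left( s \right) \geqslant c$ for all $s \geqslant T$, and for $t \geqslant T$ we would have

$$\begin{aligned} \int _{{t_0}}^t {\frac{{\varepsilon \left( s \right) \beta \left( s \right) }}{s}} \left( {{h_p}\left( s \right) - \delta \left( s \right) } \right) ds \geqslant \int _{{t_0}}^T {\frac{{\varepsilon \left( s \right) \beta \left( s \right) }}{s}} \left( {{h_p}\left( s \right) - \delta \left( s \right) } \right) ds + c\int _T^t {\frac{{\varepsilon \left( s \right) \beta \left( s \right) }}{s}} ds \rightarrow + \infty \quad \left( {t \rightarrow \infty } \right) , \end{aligned}$$

which contradicts the uniform bound $\bar{C}$ in (4.31).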

In the following, we prove that the trajectory x(t) converges ergodically to p, the element of minimal norm of $\arg \min \Phi $. Note that

$$\begin{aligned} \begin{aligned}&\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }{h_p}\left( \tau \right) } d\tau \\&\quad = \int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }\left( {{h_p}\left( \tau \right) - \delta \left( \tau \right) } \right) } d\tau + \int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }\delta \left( \tau \right) } d\tau \\&\quad \leqslant \bar{C} + \int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }\delta \left( \tau \right) } d\tau , \\ \end{aligned} \end{aligned}$$

where the last inequality follows from (4.31). Dividing both sides of the above inequality by $\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }} d\tau $ and taking the upper limit, we have

$$\begin{aligned} \begin{aligned}&\mathop {\lim \sup }\limits _{t \rightarrow \infty } \frac{1}{{\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }d\tau } }}\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }{h_p}\left( \tau \right) d\tau } \\&\quad \leqslant \mathop {\lim \sup }\limits _{t \rightarrow \infty } \left( {\frac{{{\bar{C}}}}{{\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }d\tau } }} + \frac{{\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }\delta \left( \tau \right) d\tau } }}{{\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }d\tau } }}} \right) \\&\quad \leqslant \mathop {\lim \sup }\limits _{t \rightarrow \infty } \frac{{{\bar{C}}}}{{\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }d\tau } }} + \mathop {\lim \sup }\limits _{t \rightarrow \infty } \frac{{\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }\delta \left( \tau \right) d\tau } }}{{\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }d\tau } }} \\&\quad = 0, \\ \end{aligned} \end{aligned}$$

where the first inequality follows from the fact $\varepsilon (t)\ge 0, \beta (t)\ge 0$, and the last equality follows from the assumption that $\mathop {\lim }\limits _{t \rightarrow \infty } \int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }} d\tau = + \infty $ and the fact that $\mathop {\lim }\limits _{t \rightarrow \infty } \delta \left( t \right) = 0.$ Then, by the definition of $h_p$, we have
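The vanishing of the weighted average of $\delta $ is a Cesàro-type argument, which can be sketched as follows. The shorthand $w(\tau ) = \varepsilon (\tau )\beta (\tau )/\tau \geqslant 0$ and the numbers $\eta >0$, $T \geqslant t_0$ are introduced here only for illustration. Given $\eta >0$, choose $T$ such that $\left| {\delta \left( \tau \right) } \right| \leqslant \eta $ for all $\tau \geqslant T$. Then, for $t \geqslant T$,

$$\begin{aligned} \left| {\frac{{\int _{{t_0}}^t {w\left( \tau \right) \delta \left( \tau \right) d\tau } }}{{\int _{{t_0}}^t {w\left( \tau \right) d\tau } }}} \right| \leqslant \frac{{\int _{{t_0}}^T {w\left( \tau \right) \left| {\delta \left( \tau \right) } \right| d\tau } }}{{\int _{{t_0}}^t {w\left( \tau \right) d\tau } }} + \eta \frac{{\int _T^t {w\left( \tau \right) d\tau } }}{{\int _{{t_0}}^t {w\left( \tau \right) d\tau } }} \leqslant \frac{{\int _{{t_0}}^T {w\left( \tau \right) \left| {\delta \left( \tau \right) } \right| d\tau } }}{{\int _{{t_0}}^t {w\left( \tau \right) d\tau } }} + \eta . \end{aligned}$$

Since the numerator of the first term does not depend on $t$ and the denominator tends to $+\infty $, the upper limit of the left-hand side is at most $\eta $; as $\eta $ is arbitrary, the limit is zero.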

$$\begin{aligned} \mathop {\lim \sup }\limits _{t \rightarrow \infty } \frac{1}{{\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }} d\tau }}\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }\left\| {x\left( \tau \right) - p} \right\| ^2} d\tau \leqslant 0. \end{aligned}$$

(4.32)

Since all the terms on the left-hand side of (4.32) are nonnegative, we obtain further that

$$\begin{aligned} \mathop {\lim }\limits _{t \rightarrow \infty } \frac{1}{{\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }} d\tau }}\int _{{t_0}}^t {\frac{{\varepsilon \left( \tau \right) \beta \left( \tau \right) }}{\tau }\left\| {x\left( \tau \right) - p} \right\| ^2} d\tau = 0. \end{aligned}$$

(4.33)

This completes the proof.

5 Numerical experiments

In this section, we perform numerical experiments to illustrate our theoretical results on the dynamical system (1.4). All the experiments are performed in MATLAB 2014b on a 64-bit ThinkPad laptop with an Intel(R) Core(TM) i7-6600U CPU (2.60GHz) and 12GB of RAM.

In our numerical tests, we consider three optimization problems. The first two are two-dimensional problems, one strongly convex and one convex. The third, taken from [17], is a one-dimensional convex and twice continuously differentiable problem whose minimizer is not unique. We solve the resulting systems with the adaptive Runge-Kutta 4-5 method.

The first two examples mainly illustrate the fast convergence rate of the function value (Theorem 4.1), and the third example shows the strong convergence of the trajectory (Theorem 4.2). A detailed description is given below.

In the next two subsections, we choose $b \in (0,1)$ and $(\alpha , \beta (t), \varepsilon (t))=\left( 5, t, \frac{1}{t^4}\right) $, $(\alpha , \beta (t), \varepsilon (t))=\left( 7, t^3, \frac{1}{t^6}\right) $, $(\alpha , \beta (t), \varepsilon (t))=\left( 9, t^5, \frac{1}{t^8}\right) $ respectively. All the choices of $\alpha , \beta (t), \varepsilon (t)$ satisfy the assumptions in Theorem 4.1. Hence, by Theorem 4.1, the function value along the trajectory converges fast.
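For all three parameter choices, the product $t\beta (t)\varepsilon (t)$ equals $t^{-2}$, so the integrability condition of Sect. 4.1 holds. The following Python sketch checks this numerically; it is only an illustration, and the starting time `t0 = 1`, the truncation point `t1 = 1e4` and the helper names are assumptions that do not come from the paper.

```python
import math

# Parameter triples (alpha, beta, eps) listed in the text above.
CHOICES = [
    (5, lambda t: t, lambda t: t ** -4),
    (7, lambda t: t ** 3, lambda t: t ** -6),
    (9, lambda t: t ** 5, lambda t: t ** -8),
]


def integrand(beta, eps, t):
    """Integrand t * beta(t) * eps(t) of the integrability condition."""
    return t * beta(t) * eps(t)


def integral(beta, eps, t0=1.0, t1=1.0e4, n=20_000):
    """Midpoint rule on [t0, t1] after the substitution t = exp(u)."""
    a, b = math.log(t0), math.log(t1)
    du = (b - a) / n
    total = 0.0
    for k in range(n):
        t = math.exp(a + (k + 0.5) * du)
        total += integrand(beta, eps, t) * t * du  # dt = t du
    return total


# Assumed t0 = 1: the exact value on [1, 1e4] is 1 - 1e-4 for every choice.
results = [integral(beta, eps) for _, beta, eps in CHOICES]
for (alpha, _, _), value in zip(CHOICES, results):
    print(f"alpha = {alpha}: integral over [1, 1e4] is about {value:.6f}")
```

Since the integrand is $t^{-2}$ in each case, extending the upper limit to $+\infty $ changes the value by at most $10^{-4}$, so the integral is finite.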

5.1 Strongly convex function

In this subsection, we consider the strongly convex optimization problem:

$$\begin{aligned} \min {\Phi _1}\left( {{x_1},{x_2}} \right) = 2{x_1}^2 + 5{x_2}^2 - 4{x_1} + 10{x_2} + 7. \end{aligned}$$

A direct computation gives $\nabla {\Phi _1}\left( {{x_1},{x_2}} \right) = \left( {4{x_1} - 4,10{x_2} + 10} \right) ^T$, and ${x^ * } = \left( {1, - 1} \right) ^T$ is the unique minimizer of $\Phi _1$; hence the optimal value is ${\Phi _1}^ * = {\Phi _1}\left( {1, - 1} \right) = 0$.

To illustrate the fast convergence rate of $\Phi _1(x(t))$, we plot in Fig. 1 the value $|{\Phi _1}\left( {x\left( t \right) } \right) - {\Phi _1}^ * |$ versus the time t (horizontal axis). The initial point is chosen as $u_0=v_0=(-5,30)^T$. According to Fig. 1a, $\Phi _1(x(t))$ converges to $\Phi _1^ *$ fast for all the choices of $\alpha $, $\beta (t)$ and $\varepsilon (t)$. Fig. 1b compares the behavior of $|{\Phi _1}\left( {x\left( t \right) } \right) - {\Phi _1}^ * |$ under the choice $\alpha =5$, $\beta (t)=t$, $\varepsilon (t)=1/t^4$ with the case $\alpha =5$, $\beta (t)=1$, $\varepsilon (t)=1/t^4$, where the latter choice is from [4]. We see from Fig. 1b that the choice $\beta (t)=t$ in (1.4) is comparable with $\beta (t)=1$.
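An experiment of this kind can be sketched in a few lines of Python. The sketch assumes that (1.4) has the form $\ddot{x}(t) + \frac{\alpha }{t}\dot{x}(t) + \beta (t)\left( \nabla \Phi (x(t)) + \varepsilon (t)x(t)\right) = 0$, which is the relation used in the proof of Theorem 4.2. It also assumes that $u_0$ and $v_0$ denote the initial position and velocity and that $t_0=1$, and it replaces the adaptive Runge-Kutta 4-5 solver by a classical fixed-step fourth-order Runge-Kutta scheme. The function names and the step size are hypothetical.

```python
def phi1(x):
    """Objective of Sect. 5.1; its optimal value is 0."""
    return 2 * x[0] ** 2 + 5 * x[1] ** 2 - 4 * x[0] + 10 * x[1] + 7


def grad_phi1(x):
    return [4 * x[0] - 4, 10 * x[1] + 10]


def simulate(grad, x0, v0, alpha, beta, eps, t0=1.0, t_end=20.0, h=2.0e-3):
    """Fixed-step RK4 for x'' + (alpha/t) x' + beta(t) (grad(x) + eps(t) x) = 0."""

    def acc(t, x, v):
        g = grad(x)
        return [-(alpha / t) * vi - beta(t) * (gi + eps(t) * xi)
                for xi, vi, gi in zip(x, v, g)]

    def shift(a, b, s):
        return [ai + s * bi for ai, bi in zip(a, b)]

    n = int(round((t_end - t0) / h))
    t, x, v = t0, list(x0), list(v0)
    for _ in range(n):
        k1x, k1v = v, acc(t, x, v)
        v2 = shift(v, k1v, h / 2)
        k2x, k2v = v2, acc(t + h / 2, shift(x, k1x, h / 2), v2)
        v3 = shift(v, k2v, h / 2)
        k3x, k3v = v3, acc(t + h / 2, shift(x, k2x, h / 2), v3)
        v4 = shift(v, k3v, h)
        k4x, k4v = v4, acc(t + h, shift(x, k3x, h), v4)
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1x, k2x, k3x, k4x)]
        v = [vi + h / 6 * (a + 2 * b + 2 * c + d)
             for vi, a, b, c, d in zip(v, k1v, k2v, k3v, k4v)]
        t += h
    return x


start = [-5.0, 30.0]  # assumed: x(t0) = u0 and x'(t0) = v0
x_scaled = simulate(grad_phi1, start, start, 5.0, lambda t: t, lambda t: t ** -4)
x_unit = simulate(grad_phi1, start, start, 5.0, lambda t: 1.0, lambda t: t ** -4)
gap_start = phi1(start)
gap_scaled = abs(phi1(x_scaled))
gap_unit = abs(phi1(x_unit))
print("initial gap:", gap_start)
print("gap at t = 20, beta(t) = t:", gap_scaled)
print("gap at t = 20, beta(t) = 1:", gap_unit)
```

The fixed step is chosen small relative to the fastest oscillation on $[1,20]$; for longer horizons with $\beta (t)=t$ the system becomes stiffer, and an adaptive solver as used in the paper is the more robust choice.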

5.2 Convex function

In this subsection, we consider the convex optimization problem:

$$\begin{aligned} \min {\Phi _2}\left( {{x_1},{x_2}} \right) = {x_1}^4 + 5{x_2}^2 - 4{x_1} - 10{x_2} + 8. \end{aligned}$$

We can easily deduce that $\nabla {\Phi _2}\left( {{x_1},{x_2}} \right) = \left( {4{x_1}^3 - 4,10{x_2} -10} \right) ^T$ and ${x^ * } = \left( {1, 1} \right) ^T$ is the minimizer of $\Phi _2$, thus the optimal value is ${\Phi _2}^ * = {\Phi _2}\left( {1, 1} \right) = 0.$
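The stated gradients and minimizers of $\Phi _1$ and $\Phi _2$ can be double-checked with central finite differences. This is a minimal sketch; the test point `(0.3, -0.7)` and the step `h` are arbitrary choices made here, not values from the paper.

```python
def phi1(x1, x2):
    return 2 * x1 ** 2 + 5 * x2 ** 2 - 4 * x1 + 10 * x2 + 7


def grad_phi1(x1, x2):
    return (4 * x1 - 4, 10 * x2 + 10)


def phi2(x1, x2):
    return x1 ** 4 + 5 * x2 ** 2 - 4 * x1 - 10 * x2 + 8


def grad_phi2(x1, x2):
    return (4 * x1 ** 3 - 4, 10 * x2 - 10)


def numerical_grad(f, x1, x2, h=1.0e-5):
    """Central finite-difference approximation of the gradient of f."""
    return ((f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h),
            (f(x1, x2 + h) - f(x1, x2 - h)) / (2 * h))


def max_error(f, grad, x1, x2):
    """Largest componentwise gap between the formula and finite differences."""
    exact = grad(x1, x2)
    approx = numerical_grad(f, x1, x2)
    return max(abs(a - b) for a, b in zip(exact, approx))


err1 = max_error(phi1, grad_phi1, 0.3, -0.7)
err2 = max_error(phi2, grad_phi2, 0.3, -0.7)
print("finite-difference error for Phi_1:", err1)
print("finite-difference error for Phi_2:", err2)
print("Phi_1(1, -1) =", phi1(1, -1), " gradient:", grad_phi1(1, -1))
print("Phi_2(1, 1)  =", phi2(1, 1), " gradient:", grad_phi2(1, 1))
```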

The computational results are presented in Fig. 2. We plot $|{\Phi _2}\left( {x\left( t \right) } \right) - {\Phi _2}^ * |$ versus the time t (horizontal axis), and the initial point is chosen as $u_0=v_0=(-1,5)^T$. From Fig. 2a, we see that $\Phi _2(x(t))$ converges to $\Phi _2^ *$ fast for all the choices of $\alpha $, $\beta (t)$ and $\varepsilon (t)$. Fig. 2b shows the comparison between the case $\alpha =5$, $\beta (t)=t$, $\varepsilon (t)=1/t^4$ and the case $\alpha =5$, $\beta (t)=1$, $\varepsilon (t)=1/t^4$, where the latter case is from [4]. From the numerical results, we see that the convergence rate of $\Phi _2(x(t))$ is comparable under both choices of $\alpha $, $\beta (t)$, $\varepsilon (t)$.

5.3 One-dimensional function

In this subsection, we conduct numerical experiments to illustrate the influence of Tikhonov regularization on the strong convergence of the trajectory x(t). We consider $(\alpha , \beta (t), \varepsilon (t))=\left( 3, \frac{1}{{\sqrt{1 + \ln t} }}, \frac{1}{{\sqrt{1 + \ln t} }}\right) $ and $(\alpha , \beta (t), \varepsilon (t))=\left( 3, \frac{1}{{\sqrt{1 + \ln t} }}, 0\right) $ respectively. The first choice satisfies the assumptions in Theorem 4.2.
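The divergence condition of Theorem 4.2 can be verified in closed form for the regularized choice. The computation below is a short check under the assumption $t_0 \geqslant 1$, which is not stated in this section; condition $(H_1)$ is not re-examined here.

$$\begin{aligned} \int _{{t_0}}^t {\frac{{\varepsilon \left( s \right) \beta \left( s \right) }}{s}ds} = \int _{{t_0}}^t {\frac{{ds}}{{s\left( {1 + \ln s} \right) }}} = \ln \left( {1 + \ln t} \right) - \ln \left( {1 + \ln {t_0}} \right) \rightarrow + \infty \quad \left( {t \rightarrow \infty } \right) . \end{aligned}$$

Moreover, $\beta (t) = 1/\sqrt{1 + \ln t}$ is nonincreasing for $t \geqslant 1$, and $\alpha = 3 > 1$.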

The optimization problem we consider in this part is as follows:

$$\begin{aligned} \min \Phi : \mathbb {R} \rightarrow \mathbb {R}, \quad \Phi (x)=\left\{ \begin{array}{ll} -(x+1)^{3}, &{}\quad \text{ if } x<-1 \\ 0, &{}\quad \text{ if } -1 \le x \le 1 \\ (x-1)^{3}, &{}\quad \text{ if } x>1 \end{array}\right. \end{aligned}$$

It is easy to see that $\arg \min \Phi =[-1,1]$ and that $x^*=0$ is its minimum norm solution.

Our computational results are presented in Fig. 3, where we plot the trajectory x(t) generated by (1.4) versus the time t (horizontal axis). We see from the figure that x(t) generated by (1.4) with the choice $(\alpha , \beta (t), \varepsilon (t))=\left( 3, \frac{1}{{\sqrt{1 + \ln t} }}, \frac{1}{{\sqrt{1 + \ln t} }}\right) $ converges to the minimum norm solution $x^*=0$, which agrees with our theory. In contrast, the trajectory x(t) in the case $(\alpha , \beta (t), \varepsilon (t))=\left( 3, \frac{1}{{\sqrt{1 + \ln t} }}, 0\right) $ (without the Tikhonov regularization) converges to a minimizer that is not the minimum norm solution.
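The qualitative difference between the two trajectories can be reproduced with a short Python sketch. It assumes that (1.4) reads $\ddot{x}(t) + \frac{\alpha }{t}\dot{x}(t) + \beta (t)\left( \nabla \Phi (x(t)) + \varepsilon (t)x(t)\right) = 0$, as used in the proof of Theorem 4.2. The initial data $x(t_0)=2$, $\dot{x}(t_0)=0$, the starting time $t_0=1$, the horizon and the step size are hypothetical, since this section does not state them, and a fixed-step classical Runge-Kutta scheme stands in for the adaptive Runge-Kutta 4-5 solver.

```python
import math


def grad_phi(x):
    """Derivative of the piecewise cubic objective of Sect. 5.3."""
    if x < -1.0:
        return -3.0 * (x + 1.0) ** 2
    if x > 1.0:
        return 3.0 * (x - 1.0) ** 2
    return 0.0


def beta(t):
    return 1.0 / math.sqrt(1.0 + math.log(t))


def simulate(eps, x0=2.0, v0=0.0, alpha=3.0, t0=1.0, t_end=100.0, h=0.01):
    """Fixed-step RK4 for x'' + (alpha/t) x' + beta(t) (Phi'(x) + eps(t) x) = 0."""

    def acc(t, x, v):
        return -(alpha / t) * v - beta(t) * (grad_phi(x) + eps(t) * x)

    n = int(round((t_end - t0) / h))
    t, x, v = t0, x0, v0
    for _ in range(n):
        k1x, k1v = v, acc(t, x, v)
        k2x = v + 0.5 * h * k1v
        k2v = acc(t + 0.5 * h, x + 0.5 * h * k1x, k2x)
        k3x = v + 0.5 * h * k2v
        k3v = acc(t + 0.5 * h, x + 0.5 * h * k2x, k3x)
        k4x = v + h * k3v
        k4v = acc(t + h, x + h * k3x, k4x)
        x += h / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
        v += h / 6.0 * (k1v + 2.0 * k2v + 2.0 * k3v + k4v)
        t += h
    return x


x_tikhonov = simulate(beta)           # eps(t) = 1 / sqrt(1 + ln t), same as beta
x_plain = simulate(lambda t: 0.0)     # no Tikhonov term
print("with Tikhonov term,    x(100) =", x_tikhonov)
print("without Tikhonov term, x(100) =", x_plain)
```

Without the regularization term the force vanishes on $[-1,1]$, so the trajectory simply coasts to rest wherever the damping stops it; with the term, the additional restoring force $\beta (t)\varepsilon (t)x$ keeps pulling it toward the origin.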

6 Conclusion, perspective

In this paper, we mainly study the convergence behavior of a second order gradient system with Tikhonov regularization (1.4). We first prove the existence and uniqueness of the $C^2$-global solution of (1.4). Next, under the assumption $\int _{{t_0}}^{ + \infty }t\beta \left( t\right) \varepsilon \left( t\right) dt < +\infty $, we establish the global convergence of $\Phi \left( {x\left( t \right) } \right) $ to the optimal value of $\Phi $. Moreover, we show that the convergence rate of $\Phi (x(t))$ to $\min \Phi $ is $o\left( 1/(t^2\beta (t))\right) $, which can be faster than $o(1/t^2)$. In the case $\int _{{t_0}}^{ + \infty } {\frac{{\varepsilon \left( t \right) \beta \left( t \right) }}{t}dt = + \infty } $, by constructing a proper energy function, we show that the trajectory x(t) strongly converges to p, where p is the element of minimal norm of $\arg \min \Phi $. In addition, we also prove the ergodic convergence of x(t). Finally, we conduct some numerical experiments to illustrate the theoretical results.

At the end of this paper, we would like to list some possible directions of future research related to the dynamical system (1.4):

(i) A natural direction is to propose proper numerical algorithms via time discretization of (1.4). Furthermore, one can investigate their theoretical convergence properties and confirm them with numerical experiments;
(ii) One can also consider (1.4) endowed with an additional Hessian driven damping, see for example [8, 17];
(iii) Another direction is to consider non-smooth optimization problems, in which the objective functions are not differentiable, so that (1.4) cannot be applied directly. One can use monotone inclusions to solve them, see for example [8, 12, 21].

References

1. Attouch, H., Cominetti, R.: A dynamical approach to convex minimization coupling approximation with the steepest descent method. J. Differ. Equ. 128(2), 519–540 (1996)
2. Attouch, H., Czarnecki, M.O.: Asymptotic control and stabilization of nonlinear oscillators with non-isolated equilibria. J. Differ. Equ. 179(1), 278–310 (2002)
3. Attouch, H., Chbani, Z., Peypouquet, J., Redont, P.: Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity. Math. Program. 168(1–2), 123–175 (2018)
4. Attouch, H., Chbani, Z., Riahi, H.: Combining fast inertial dynamics for convex optimization with Tikhonov regularization. J. Math. Anal. Appl. 457(2), 1065–1094 (2018)
5. Attouch, H., Chbani, Z., Riahi, H.: Rate of convergence of the Nesterov's accelerated gradient method in the subcritical case $\alpha \le 3$. ESAIM: Control Optim. Calc. Var. 25, 2 (2019)
6. Attouch, H., Chbani, Z., Riahi, H.: Fast convex optimization via time scaling of damped inertial gradient dynamics. https://hal.archives-ouvertes.fr/hal-02138954
7. Attouch, H., Chbani, Z., Riahi, H.: Fast proximal methods via time scaling of damped inertial dynamics. SIAM J. Optim. 29(3), 2227–2256 (2019)
8. Attouch, H., Peypouquet, J., Redont, P.: Fast convex minimization via inertial dynamics with Hessian driven damping. J. Differ. Equ. 261(10), 5734–5783 (2016)
9. Bolte, J.: Continuous gradient projection method in Hilbert spaces. J. Optim. Theory Appl. 119(2), 235–259 (2003)
10. Bach, F.: Learning with submodular functions: a convex optimization perspective. Found. Trends Mach. Learn. 6(2–3), 145–373 (2013)
11. Becker, S., Bobin, J., Candès, E.J.: NESTA: a fast and accurate first-order method for sparse recovery. SIAM J. Imag. Sci. 4(1), 1–39 (2011)
12. Bot, R.I., Csetnek, E.R.: Second order forward-backward dynamical systems for monotone inclusion problems. SIAM J. Control Optim. 54(3), 1423–1443 (2016)
13. Bot, R.I., Csetnek, E.R.: Approaching nonsmooth nonconvex optimization problems through first order dynamical systems with hidden acceleration and Hessian driven damping terms. Set-Valued Variation. Anal. 26(2), 227–245 (2018)
14. Bot, R.I., Csetnek, E.R.: A second-order dynamical system with Hessian-driven damping and penalty term associated to variational inequalities. Optimization 68(7), 1265–1277 (2019)
15. Bot, R.I., Csetnek, E.R., László, S.C.: Second-order dynamical systems with penalty terms associated to monotone inclusions. Anal. Appl. 16(05), 601–622 (2018)
16. Bot, R.I., Csetnek, E.R., László, S.C.: A second-order dynamical approach with variable damping to nonconvex smooth minimization. Appl. Anal. 99(3), 361–378 (2020)
17. Bot, R.I., Csetnek, E.R., László, S.C.: Tikhonov regularization of a second order dynamical system with Hessian driven damping. Math. Program. (2020). https://doi.org/10.1007/s10107-020-01528-8
18. Bach, F., Jenatton, R., Mairal, J., Obozinski, G.: Structured sparsity through convex optimization. Stat. Sci. 27(4), 450–468 (2012)
19. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imag. Sci. 2(1), 183–202 (2009)
20. Condat, L.: A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms. J. Optim. Theory Appl. 158(2), 460–479 (2013)
21. Csetnek, E.R.: Continuous dynamics related to monotone inclusions and non-smooth optimization problems. Set-Valued Variation. Anal. (2020). https://doi.org/10.1007/s11228-020-00548-y
22. Haraux, A.: Systèmes dynamiques dissipatifs et applications. Rech. Math. Appl., vol. 17. Masson, Paris (1991)
23. May, R.: On the strong convergence of the gradient projection algorithm with Tikhonov regularizing term. arXiv preprint arXiv:1910.07873 (2019)
24. Muehlebach, M., Jordan, M.I.: A dynamical systems perspective on Nesterov acceleration. arXiv preprint arXiv:1905.07436 (2019)
25. Mitchell, D., Ye, N., Sterck, H.D.: Nesterov acceleration of alternating least squares for canonical tensor decomposition. Numer. Linear Algebra Appl. (2020). https://doi.org/10.1002/nla.2297
26. Nesterov, Y.: A method of solving a convex programming problem with convergence rate $O(1/k^2)$. Soviet Math. Doklady 27(2), 372–376 (1983)
27. Sontag, E.D.: Mathematical Control Theory: Deterministic Finite-Dimensional Systems. Texts in Applied Mathematics, vol. 6, 2nd edn. Springer, New York (1998)
28. Su, W., Boyd, S., Candès, E.J.: A differential equation for modeling Nesterov's accelerated gradient method: theory and insights. J. Mach. Learn. Res. 17, 1–43 (2016)
29. Sutskever, I., Martens, J., Dahl, G., et al.: On the importance of initialization and momentum in deep learning. In: International Conference on Machine Learning, pp. 1139–1147 (2013)
30. Sra, S., Nowozin, S., Wright, S.J.: Optimization for Machine Learning. MIT Press, Cambridge (2012)
31. Tseng, P.: On accelerated proximal gradient methods for convex-concave optimization. http://pages.cs.wisc.edu/~brecht/cs726docs/Tseng.APG.pdf (2008). Accessed 2019
32. Wright, J., Ganesh, A., Rao, S.: Robust principal component analysis: exact recovery of corrupted low-rank matrices via convex optimization. Advances in Neural Information Processing Systems 22, 2080–2088 (2009)
33. Wilson, A.C., Recht, B., Jordan, M.I.: A Lyapunov analysis of momentum methods in optimization. arXiv preprint arXiv:1611.02635 (2016)

Acknowledgements

The authors would like to thank the editor and anonymous reviewers for their insightful and helpful comments and suggestions, which improved the quality of the paper. This work is also supported in part by NSFC 11801131, Natural Science Foundation of Hebei Province (Grant No. A2019202229), Science and Technology Project of Hebei Education Department (Grant No. QN2018101).

Author information

Authors and Affiliations

School of Science, Hebei University of Technology, Tianjin, People’s Republic of China
Bo Xu
Institute of Mathematics, Hebei University of Technology, Tianjin, People’s Republic of China
Bo Wen

Corresponding author

Correspondence to Bo Wen.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xu, B., Wen, B. On the convergence of a class of inertial dynamical systems with Tikhonov regularization. Optim Lett 15, 2025–2052 (2021). https://doi.org/10.1007/s11590-020-01663-3

Received: 09 June 2020
Accepted: 30 October 2020
Published: 14 November 2020
Issue Date: September 2021
DOI: https://doi.org/10.1007/s11590-020-01663-3
