Abstract
In this paper, a novel robust tracking control strategy for nonlinear systems with unmatched uncertainties is formulated using the event-based adaptive dynamic programming (ADP) approach. First, an augmented system is constructed from the nonlinear system and the reference trajectory. Then, by forming an auxiliary system and introducing a discounted cost function, the event-based robust tracking control problem is transformed into the event-based optimal control problem of the auxiliary system. The event-based Hamilton–Jacobi–Bellman (HJB) equation associated with this optimal control problem is solved using a single critic neural network (NN) under the ADP framework. A novel weight tuning rule for the critic network is formulated that avoids the need for an initial admissible control at the beginning of the weight tuning process. The obtained event-based controller is updated only at the triggering instants decided by the designed triggering condition, which significantly reduces the computational and communication resources used. Meanwhile, it is demonstrated that the obtained event-based controller guarantees the uniform ultimate boundedness of the tracking error. Furthermore, using the Lyapunov method, it is shown that the established novel event-triggering rule ensures uniform ultimate boundedness of all signals associated with the closed-loop auxiliary system. Finally, the applicability of the proposed control scheme is demonstrated through two simulation examples.
1 Introduction
Uncertainties are inevitable in practical nonlinear systems because of external disturbances and modeling errors. Hence, given the requirement that the designed feedback controller be robust to uncertainties, many robust control design schemes have been developed over the past several decades [1,2,3]. In particular, the method developed by Lin [3], in which the optimal control approach is utilized to obtain the robust controller, has received remarkable attention [4, 5]. In the case of linear systems, the optimal controller can be derived conveniently by solving the associated algebraic Riccati equation (ARE) [6]. However, for nonlinear systems, instead of the ARE, one needs to find the solution of the Hamilton–Jacobi–Bellman (HJB) equation [7]. Since the HJB equation is a nonlinear partial differential equation, solving it analytically is challenging. Although dynamic programming is generally used to solve the optimal control problem of nonlinear systems, it suffers heavily from the notorious “curse of dimensionality” [8]. The neural network (NN)-based function approximation technique called adaptive dynamic programming (ADP) has been employed to address this difficulty [9, 10]. The ADP approach was initially developed by Werbos to determine the solution of the optimal control problem effectively [11]. The ADP algorithm is closely related to the reinforcement learning (RL) technique [12]. In the literature, ADP is also known as approximate dynamic programming [13], adaptive critic designs (ACDs) [14], neural dynamic programming (NDP) [15], and Q-learning [16].
In most practical applications, the system states need to track a desired trajectory rather than merely converge to zero [17, 18]. In the past several years, significant work has been done on tracking control by combining the aforementioned robust control method and ADP algorithms [19,20,21,22]. In [19], the integral reinforcement learning technique is utilized to develop an optimal tracking controller for constrained-input systems. For nonlinear matched uncertain systems, a robust tracking controller is designed via a neural network approach in [20], and a guaranteed cost tracking controller is developed in [21]. In [22], a tracking controller for nonlinear systems with unmatched uncertainties is derived via ACDs. However, all the works mentioned above are developed under the classical time-triggered framework, which suffers from inefficient use of computational and communication resources.
Compared to the classical time-triggered approach, in the event-based or event-triggered strategy, the controller is updated only when a predefined triggering rule is not fulfilled, which enables effective use of computational and communication resources [23,24,25,26]. Thus, many works have combined the ADP-based robust control approach and the event-based framework. In [27], an actor-critic NN structure is utilized to derive an event-based optimal adaptive controller for nonlinear systems. In [28], an event-based guaranteed cost controller is derived for nonlinear systems utilizing a single critic NN. In [29], an adaptive critic is used to design an event-based near-optimal controller for heating, ventilation, and air conditioning (HVAC) systems. In [30], an event-triggered optimal controller is designed for the continuous stirred tank reactor (CSTR) system using the ADP approach. The event-based ADP framework has been further utilized in designing controllers for nonlinear systems with constrained input [31], with matched uncertainties [32], and with unmatched uncertainties [33].
Under the event-based ADP framework, tracking controllers for nonlinear systems have been designed in [34,35,36,37,38,39,40,41,42,43]. In [34,35,36, 38, 44], the ADP approach is used to formulate event-based optimal tracking controllers for nonlinear systems without considering any uncertainties. The event-based ADP approach is utilized to derive an optimal tracking controller for modular reconfigurable robots in [39], and in [40], the tracking controller is designed with application to wastewater treatment. In [41], event-based ADP is utilized to develop a tracking controller for constrained-input systems. In [42], Cui et al. established an event-based \(H_{\infty }\) tracking controller via the RL method. In our previous work [43], a nonlinear matched uncertain system was considered while designing the event-based robust tracking controller. In [44], the event-triggered ADP approach is used to design a tracking controller for partially unknown matched uncertain constrained systems. Nonetheless, the existing literature has not focused on developing a robust tracking controller for continuous-time nonlinear systems with unmatched uncertainty via the event-based ADP approach, particularly without using the \(H_{\infty }\) control approach [42]. Unlike matched uncertainty, unmatched uncertainty enters the system through a different channel than the control input. Unmatched uncertainties are a more general class of uncertainties and arise widely in practical systems. Therefore, it is vital to consider unmatched uncertainty while designing a controller for nonlinear systems. This motivates the research developed in this paper.
The following are the major contributions of this work.
1. Compared with [34,35,36,37,38, 41], in this work, uncertainty is considered while designing the ADP-based event-triggered robust tracking controller. As unmatched uncertainties are the most common form of uncertainty, they must be taken into account while developing a controller for nonlinear systems.
2. Unlike [42], in this work, the event-based robust tracking controller is derived without using the \(H_{\infty }\) control strategy. In the \(H_{\infty }\) optimal control approach, the existence of a saddle point must be verified, which is a challenging task.
3. Rather than considering matched uncertainty as in [43, 44], unmatched uncertainty is considered in this work. Moreover, unlike [43], the need for an initial stabilizing control at the beginning of the critic weight tuning process is relaxed by modifying the tuning rule.
The remaining part of this work is organized in the following manner. In Sect. 2, the original tracking control problem is transformed into the optimal control problem of an auxiliary system. The event-based HJB equation is formulated, and the event-triggering rule is derived in Sect. 3. In Sect. 4, the HJB equation is solved via the ADP approach. In Sect. 5, the Lyapunov approach is used to show that all the signals associated with the closed-loop auxiliary system are uniformly ultimately bounded. In Sect. 6, two simulation examples are presented. Finally, a concluding remark is given in Sect. 7. Moreover, limitations and future scope of the proposed work are also mentioned in Sect. 7.
Notation: In this work, the maximum and minimum eigenvalues of a matrix are denoted by \(\lambda _{M}(\cdot )\) and \(\lambda _{m}(\cdot )\), respectively. The transpose operation is represented by the superscript \(\top \). \(\nabla (\cdot )\) denotes the gradient operation. \(I_n\) is the identity matrix of dimension \(n\times n\) and \(0_{n\times m}\) is the zero matrix of dimension \(n\times m\). \(\Omega \) is a compact subset of \(\mathbb {R}^{2n}\).
2 Problem transformation
Consider the continuous-time nonlinear uncertain system given in the form
where \( x(t)\in \mathbb {R}^n\) and \( u(t)\in \mathbb {R}^b\) are the state vector and control input, respectively. Let \(x(0)=x_0\) be the initial state. \(f(\cdot )\) and \(g(\cdot )\) are smooth functions of their arguments with \(f(0)=0\), and \(f+gu\) satisfies Lipschitz continuity. The unmatched uncertainty is \(\Delta f(x)=l(x)d(x)\), where \(l(x)\in \mathbb {R}^{n\times p}\), \(d(x)\in \mathbb {R}^p\), and if \(b=p\) then \(l(x)\ne g(x)\). Let d(x) be bounded by a known function \(\lambda _d(x)\), i.e., \(\left\Vert d(x)\right\Vert \le \lambda _d(x)\). Furthermore, \(\lambda _d(0)=0\) and \(d(0)=0\). In addition, there exists a nonnegative function \(g_{M}(x)\) satisfying
where \(g^+(x)\) is the pseudoinverse of g(x). Let the desired trajectory \(x_d(t)\in \mathbb {R}^n\) be generated from
where \(\Theta (x_d)\) satisfies Lipschitz continuity and \(\Theta (0)=0\). Let \(x_d(0)=x_{d0}\) be the initial condition.
The objective of this work is to derive an event-based robust controller for system (1) so that the system state x(t) follows the desired trajectory \(x_d(t)\). Define the tracking error as \({e}_{t}(t)=x(t)-x_d(t)\). From (1) and (2), the tracking error dynamics can be presented as
Now, based on the tracking error and the desired trajectory, an augmented state vector \(\xi (t)=[{e}_{t}^{\top }(t),x_d^{\top }(t)]^{\top } \in \mathbb {R}^{2n}\) is formed. Then, using (2) and (3), the augmented system dynamics is formulated as
where \(F:\mathbb {R}^{2n}\rightarrow \mathbb {R}^{2n}\) and \(G:\mathbb {R}^{2n}\rightarrow \mathbb {R}^{2n\times b}\) are new system matrices while \(\Delta F(\xi (t))\in \mathbb {R}^{2n}\) is the new uncertain term. They can be expressed as
and
The terms \(d(\xi )\) and \({G^+(\xi )\Delta F(\xi )} \) are still upper bounded, and the bounds can be derived as
and
respectively.
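To make the construction above concrete, the following minimal sketch assembles the augmented dynamics numerically, assuming the standard forms \(F(\xi )=[(f(e_t+x_d)-\Theta (x_d))^{\top },\Theta ^{\top }(x_d)]^{\top }\), \(G(\xi )=[g^{\top }(e_t+x_d),0]^{\top }\), and \(\Delta F(\xi )=[\Delta f^{\top }(e_t+x_d),0]^{\top }\). The plant functions below are illustrative placeholders, not the systems considered in this paper.

```python
import numpy as np

# Illustrative plant pieces (placeholders, not the paper's examples).
def f(x):          # drift dynamics f(x), with f(0) = 0
    return np.array([-x[0] + x[1], -0.5 * x[1]])

def g(x):          # input matrix g(x), here n = 2, b = 1
    return np.array([[0.0], [1.0]])

def Theta(xd):     # reference dynamics, with Theta(0) = 0
    return np.array([xd[1], -xd[0]])

def Delta_f(x):    # unmatched uncertainty l(x) d(x)
    return np.array([0.1 * np.sin(x[1]) * x[0], 0.0])

def augmented_dynamics(xi, u):
    """xi = [e_t; x_d] in R^{2n}; returns F(xi) + G(xi) u + Delta_F(xi)."""
    n = xi.size // 2
    e, xd = xi[:n], xi[n:]
    x = e + xd                       # recover the plant state
    F = np.concatenate([f(x) - Theta(xd), Theta(xd)])
    G = np.vstack([g(x), np.zeros((n, g(x).shape[1]))])
    DF = np.concatenate([Delta_f(x), np.zeros(n)])
    return F + G @ u + DF

xi = np.array([0.3, -0.2, 0.3, -0.3])
print(augmented_dynamics(xi, np.array([0.0])))
```

Note that the augmented drift vanishes at the origin, consistent with \(f(0)=0\), \(\Theta (0)=0\), and \(d(0)=0\).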
Next, the uncertain term \(L(\xi )d(\xi )\) is projected onto the range of the matrix \(G(\xi )\) and decomposed into the sum of a matched and an unmatched component, that is
Then the following auxiliary system is formed
where \(v(\xi )\in \mathbb {R}^p\) is an auxiliary control that handles the unmatched component.
3 Event-based robust tracking control strategy
In this section, the event-based HJB equation is developed for the auxiliary system (7). Moreover, the event-triggering rule is also obtained using Lyapunov approach. The cost function associated with the auxiliary system (7) is defined as
where \(\gamma \) and \(\beta \) are positive constants, \(U(\xi , u(\xi ),v(\xi )) =\xi ^{\top }\bar{Q}\xi +u^{\top }(\xi )Ru(\xi )+\beta ^2v^\top (\xi )Mv(\xi )\) and \(\bar{Q}=diag\{Q,0_{n\times n}\}\). Q, M, and R are positive definite matrices of appropriate dimensions. Let r and m be lower triangular matrices of appropriate dimensions. Then, using the Cholesky decomposition, one can write \(R=rr^{\top }\) and \(M=mm^{\top }\).
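The Cholesky factors of the weighting matrices can be computed directly; a small numerical check with illustrative \(R\) and \(M\) (numpy's routine returns exactly the lower-triangular factor used here):

```python
import numpy as np

# Positive definite weighting matrices (illustrative values).
R = np.array([[4.0, 1.0],
              [1.0, 3.0]])
M = np.array([[2.0, 0.5],
              [0.5, 1.0]])

# np.linalg.cholesky returns the lower-triangular factor,
# matching the decomposition R = r r^T used in the cost function.
r = np.linalg.cholesky(R)
m = np.linalg.cholesky(M)

assert np.allclose(r @ r.T, R)
assert np.allclose(np.triu(r, k=1), 0.0)   # r is lower triangular
print(r)
```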
Remark 1
The discount term \(\mathrm {e}^{-\gamma (\tau -t)}\) in the cost function (8) is employed to ensure that (8) is bounded. Otherwise, the control policy pair \( [u^{\top }(e_{t}(t),x_d(t)), v^{\top }(e_t(t),x_d(t))]^{\top }\) may cause (8) to become unbounded, since it depends on the reference trajectory \(x_d(t)\). In many practical systems, we need to consider a reference trajectory that does not converge to zero. In that situation, \(x_d(t)\) makes (8) unbounded [45, 46].
Let \(\Psi (\Omega )\) be the set of admissible controls on \(\Omega \). We assume that the optimal control policy pair is admissible. If the cost function \({J}(\xi )\) is continuously differentiable then one can write
with \(J(0)=0\). Here (9) is called the infinitesimal version of (8). The Hamiltonian for the auxiliary system (7) is given as
The optimal cost function is given by
By Bellman's principle of optimality, \(J^{*}(\xi (t))\) satisfies the HJB equation
with \(J^{*}(0)=0\). Define \((I-G(\xi )G^+(\xi ))L(\xi )=K(\xi )\). The optimal control policies are obtained as
and
Substituting (13) and (14) into (12), we present the HJB equation as
3.1 The event-based HJB equation formulation
Here, we present the HJB equation (15) in event-based form. Before proceeding, the event-based strategy is explained.
Let us consider a monotonically increasing sequence \(\left\{ t_k\right\} _{k=0}^{\infty }\), where the kth triggering instant is represented as \(t_k\) and \(k \in \mathbb {N}\). Let the system state be sampled at every triggering instant, and let \(\left\{ {\xi }_k\right\} _{k=0}^{\infty }\) be the sequence of sampled states, where \(\xi _k=\xi (t_k)\) is the sampled state at \(t_k\). The triggering error is defined as the difference between the current state \(\xi (t)\) and the sampled state \(\xi _k\) and is represented as
Based on (16), the event-based mechanism can be explained as follows. If a predefined triggering rule is not satisfied, the state is resampled so that the triggering error is reset to zero, i.e., \(e_{k}(t)=0\), and the control law is updated. While the triggering rule is fulfilled, the control law is held constant between two consecutive triggering instants. This principle is similar to the familiar zero-order hold (ZOH) principle, and it can be expressed as
From (16), the event-based control policy is obtained as
Now, using the control law (17), we obtain the sampled version of auxiliary system (7) as
The optimal control (13), under event-triggered mechanism, can be expressed as
Now, using (19), we formulate the HJB equation under event-based framework as
that is,
where \(J^{*}(0)=0\).
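To illustrate the event-based mechanism underlying the sampled HJB formulation above, the following toy simulation applies a zero-order-hold state-feedback law to a scalar plant and updates it only when a simple surrogate triggering rule is violated. The plant, gain, and relative threshold are illustrative stand-ins for the state-dependent rule (21).

```python
import numpy as np

# Event-triggered control with zero-order hold on the scalar system
# xdot = a*x + b*u, with u = -k * x_k held between triggering instants.
# The relative-error threshold is a surrogate for the paper's
# state-dependent triggering rule, for illustration only.
a, b, k = 1.0, 1.0, 3.0
dt, T = 0.001, 5.0
x, x_k = 1.0, 1.0                    # state and last sampled state
updates, steps = 0, int(T / dt)
for i in range(steps):
    e_trig = x - x_k                 # triggering error e_k(t)
    if abs(e_trig) > 0.05 * abs(x):  # rule violated -> sample and update
        x_k = x                      # triggering error reset to zero
        updates += 1
    u = -k * x_k                     # ZOH: constant between events
    x += dt * (a * x + b * u)        # forward-Euler integration
print(f"controller updates: {updates} out of {steps} time steps")
```

The controller updates far less often than the integration step count while still driving the state toward zero, which is the resource saving the event-based strategy targets.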
3.2 Event-triggering condition
In this subsection, we obtain the event-triggering condition using the Lyapunov approach. Before continuing, the following assumption, which is required to derive the triggering rule, is made. This assumption is satisfied in many applications when the controller is affine with respect to the event-triggering error signal [27, 47].
Assumption 1
Let \(\mathcal {L}\) be a positive constant. We consider that the optimal control policy \(u^{*}(\xi )\) fulfills the Lipschitz continuity on \(\Omega \) such that
Theorem 1
Let Assumption 1 be true, \(J^{*}(\xi )\) satisfies the HJB equation (12), the control policies are described by (14) and (19), and the event-triggering law is formulated as
then for \(\eta _1\in (0,1)\) and \(\gamma =0\), the closed-loop augmented system (4) is asymptotically stable under \(\mu ^*(\xi _k)\) and for \(\gamma \ne 0\) the tracking error \(e_{t}\) is uniformly ultimately bounded.
Proof
Consider \(J^{*}(\xi )\) as the Lyapunov function candidate. Differentiating \(J^{*}(\xi )\) along the trajectory of \(\dot{\xi }(t)=F(\xi (t))+G(\xi (t))\mu ^{*}({\xi }_k)+\Delta F(\xi (t))\), one can write
From (12), we obtain
from (13), we can write
and from (14), we obtain
Using (23), (24), and (25), we derive
Now,
and
Since, \(\bar{Q}=diag\{Q,0_{n\times n}\}\), one can write \(\xi ^{\top }\bar{Q}\xi =e_{t}^\top Q e_{t}\). Now, using (27), (28), and Assumption 1 we derive
Hence, when the triggering rule stated in Theorem 1 is satisfied and \(\gamma =0\), using (29) we can write
Thus, the system is asymptotically stable for \(\gamma =0\). When \(\gamma \ne 0\), then
Since \(J^{*}(\xi )\) is positive definite and bounded on \(\Omega \), let \(J^{*}_{max}\) be the maximum value of \(J^{*}(\xi )\). So, from (31), \(\dot{J}^{*}(\xi )\le 0\) only if \(e_{t}\) lies outside the set
Thus we conclude that for \(\gamma \ne 0\), the tracking error \(e_t(t)\) is uniformly ultimately bounded and the ultimate bound is \(\frac{1}{\eta _1}\sqrt{\frac{\gamma J^{*}_{max}}{\lambda _{m}(Q)}}\). \(\square \)
Remark 2
In this work, the control policy \(\mu ^{*}({\xi }_k)\) is formulated under the event-triggered framework, but the augmented control policy \(v^{*}({\xi })\) is formulated under the time-triggered framework. There are two reasons behind this. First, the control policy to be used in the uncertain system is \(\mu ^{*}({\xi }_k)\) not the augmented control \(v^{*}({\xi })\). Second, if we also consider the augmented control in the event-triggering framework, then it becomes very difficult to obtain the event-triggering rule (21).
Remark 3
The lower bound of the minimum event interval \(\Delta t_{\min }\) can be expressed as
where
and \(e_k(t_{k+1})=\xi _k-\xi (t_{k+1})\); \(\mathcal {P}\) and \(\pi \) are positive constants satisfying \(F(\xi )+G(\xi )u(\xi )+\Delta F(\xi )\le \mathcal {P}\Vert \xi \Vert +\pi \). Note that the positive constants \(\mathcal {P}\) and \(\pi \) exist because \(F(\xi )+G(\xi )u\) is Lipschitz continuous and the terms \(d(\xi )\) and \({G^+(\xi )\Delta F(\xi )} \) are upper bounded. The theoretical proof is similar to [32] and is omitted here to avoid repetition. The simulation results show that the inter-sample time indeed has a lower limit larger than zero. As a result, the infamous Zeno behavior is avoided.
4 ACDs for solving event-based HJB equation
In this section, a single critic network is employed to approximate the optimal value of the cost function under the ADP framework. The optimal cost function can be reconstructed on \(\Omega \), utilizing the neural network’s universal approximation property and \(l\) hidden-layer neurons, as
where \(\omega _c\in \mathbb {R}^l\) is the actual weight vector of critic network, \(\sigma _c(\xi )\in \mathbb {R}^l\) is the activation function, and \(\epsilon _c(\xi )\) is the reconstruction error. Next, we obtain the gradient of (33) as
Due to the unavailability of the actual weight vector \(\omega _c\), the approximate weight vector \(\hat{\omega }_c\) is used to form a critic network to estimate the value of the optimal cost function \({J}^{*}(\xi )\) as follows
Then the gradient of (35) is
Considering (34) we present the augmented control law (14) and the event-based control law (19) as
and
respectively. Then by using (36), the approximate value of \({v}^{*}({\xi })\) and \({\mu }^{*}({\xi _k})\) can be obtained as
and
respectively. Substituting \(J^{*}(\xi )\) from (33) into (10), we obtain
where \(e_{cH}=-(\nabla \epsilon _c(\xi ))^\top (F(\xi )+G(\xi )\mu ^{*}({\xi }_k)+K(\xi )v^{*}(\xi ))+\gamma \epsilon _c(\xi )\) is the residual error because of the reconstruction error associated with the NN approximation. Now the Hamiltonian (10) is approximated as
From the HJB equation it is evident that \(H(\xi ,\omega _c,\mu ^{*} (\xi _k),v^{*}(\xi ))=0\). So, the approximation error of the Hamiltonian is given by
where \(\phi =\nabla \sigma _c(\xi )(F(\xi )+G(\xi )\hat{\mu }(\xi _k)+K(\xi )\hat{v}(\xi ))-\gamma \sigma _c(\xi )\).
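As a sketch of how the critic reconstruction above yields the control signals, the snippet below evaluates a quadratic critic basis and forms \(\hat{\mu }\) and \(\hat{v}\) from its gradient, assuming the standard ADP forms \(\hat{\mu }(\xi _k)=-\frac{1}{2}R^{-1}G^{\top }(\xi _k)(\nabla \sigma _c(\xi _k))^{\top }\hat{\omega }_c\) and \(\hat{v}(\xi )=-\frac{1}{2\beta ^2}M^{-1}K^{\top }(\xi )(\nabla \sigma _c(\xi ))^{\top }\hat{\omega }_c\); all numerical values (basis, channels, weights) are illustrative.

```python
import numpy as np

# Critic evaluation J_hat(xi) = w_hat^T sigma(xi) with a quadratic basis,
# and the controls reconstructed from the critic gradient (standard ADP
# forms assumed; G, K, R, M, beta, and w_hat are illustrative).
def sigma(xi):                       # quadratic monomials of xi in R^2
    x1, x2 = xi
    return np.array([x1**2, x1 * x2, x2**2])

def grad_sigma(xi):                  # Jacobian d sigma / d xi, shape (3, 2)
    x1, x2 = xi
    return np.array([[2 * x1, 0.0],
                     [x2,     x1],
                     [0.0,    2 * x2]])

R_inv = np.array([[1.0]])
M_inv = np.array([[1.0]])
beta = 0.85
G = np.array([[0.0], [1.0]])         # matched (input) channel
K = np.array([[1.0], [0.0]])         # unmatched channel
w_hat = np.array([0.5, -0.2, 0.8])   # critic weights (illustrative)

def mu_hat(xi_k):
    return -0.5 * R_inv @ G.T @ grad_sigma(xi_k).T @ w_hat

def v_hat(xi):
    return -(1.0 / (2 * beta**2)) * M_inv @ K.T @ grad_sigma(xi).T @ w_hat

xi = np.array([0.4, -0.1])
print(mu_hat(xi), v_hat(xi))
```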
Now, to ensure that \(e_c\) given in (43) is sufficiently small, we need to train the critic network to obtain appropriate weights. To this end, the objective function \(E=(1/2)e_c^\top e_c\) is minimized using the steepest descent technique. Based on this approach, the tuning rule is given as
where \(l_c>0\) is a design parameter, also known as the critic network’s learning rate, and \(1/{(1+\phi ^\top \phi )^2}\) is introduced to normalize \(\phi \). However, the tuning rule (44) has the following drawbacks:
1. An initial stabilizing control is needed at the beginning of the critic weight learning process when using the tuning rule provided in (44). However, in some practical applications, determining an initial admissible control can be difficult.
2. The term \(\phi /(1+\phi ^{\top }\phi )^2\) in (44) should remain persistently exciting to guarantee the convergence of the critic weights to valid optimal values. To meet the persistency of excitation (PE) condition, usually a probing noise is applied to the control input during the initial period of the critic weight tuning process. However, the probing noise can cause the system to become unstable.
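The second drawback is typically mitigated by using a probing signal whose amplitude decays over time, so that any destabilizing effect is transient. A minimal sketch of such a signal follows; the frequencies and decay rate are illustrative choices, not values prescribed by the paper.

```python
import numpy as np

# Exponentially decaying probing noise, as commonly added to the control
# input during early training to satisfy the PE condition.
def probing_noise(t, decay=0.2):
    envelope = np.exp(-decay * t)                  # decaying amplitude
    rich = (np.sin(2 * t) ** 2 * np.cos(1.7 * t)   # mix of frequencies
            + np.sin(-1.2 * t) * np.cos(0.5 * t))
    return envelope * rich

t = np.linspace(0.0, 30.0, 3001)
noise = probing_noise(t)
print(f"max |noise| over first 10 s: {np.abs(noise[t <= 10]).max():.3f}")
print(f"max |noise| after 20 s:      {np.abs(noise[t >= 20]).max():.3f}")
```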
To overcome the above drawbacks, we modify the tuning rule (44) via the Lyapunov approach. Before continuing, the following assumption, which is similar to [20], is presented.
Assumption 2
Let \(V(\xi )\) be a continuously differentiable Lyapunov function candidate for the system (7) under the action of the control policies given by (37) and (38), satisfying
Moreover, there exists a symmetric positive definite matrix \(\Lambda \in \mathbb {R}^{2n}\) defined on \(\Omega \) ensuring
Remark 4
\(F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi )\) is frequently considered to be bounded by a positive constant on a compact set \(\Omega \) [48]. In other words, there exists a constant \(z_1\) such that \(\Vert F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi )\Vert \le z_1\). Here we assume that \(F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi )\) is bounded by a function of \(\xi \), which is less stringent than the constant-upper-bound assumption. Without loss of generality, we consider that \(\Vert (F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi ))\Vert \le z_2 \Vert \nabla V(\xi )\Vert \), where \(z_2\) is a positive constant. Consequently, we can write \(\Vert (\nabla V(\xi ))^{\top }(F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi ))\Vert \le z_2 \Vert \nabla V(\xi )\Vert ^2\). Observing (45), one can see that (46) is reasonable. In the simulation, a polynomial in \(\xi \) is chosen as \(V(\xi )\).
When the approximated control policies (39) and (40) are applied to the auxiliary system (7), to avoid instability we need to exclude the possibility
To avoid (47), the training process is enhanced by introducing an additional term which is obtained using the steepest descent method as given below
where \(l_s>0\) is a design parameter. Now, the modified critic weights tuning rule is obtained by adding the stabilizing term (48) to the traditional tuning rule (44) as
Remark 5
The new tuning rule (49) eliminates the need for an initial admissible control. Hence, we can initialize the critic weight vector to zero when learning the appropriate critic weights. Moreover, the risk of instability due to the addition of probing noise to fulfill the PE condition is also eliminated.
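The effect described in this remark can be illustrated on synthetic data: with the normalized-gradient part of the update and zero-initialized weights, the estimate still converges. The regressors, targets, and the inactive stabilizing hook below are illustrative stand-ins for the quantities in (43) and (48), not the paper's exact expressions.

```python
import numpy as np

# Toy demo of the normalized gradient part of the critic tuning rule:
#   w <- w - l_c * phi / (1 + phi^T phi)^2 * e_c,
# with the weights initialized to zero (no initial admissible control).
# The stabilizing term of (48) is represented by a hook that is
# inactive in this synthetic setting.
rng = np.random.default_rng(0)
w_star = np.array([1.0, -2.0, 0.5])      # "ideal" weights (illustrative)
w_hat = np.zeros(3)                      # zero initialization
l_c = 1.5

def stabilizing_term(w):
    return np.zeros_like(w)              # placeholder for the term in (48)

for _ in range(5000):
    phi = rng.standard_normal(3)         # persistently exciting regressor
    e_c = phi @ (w_hat - w_star)         # Hamiltonian approximation error
    w_hat -= l_c * phi / (1 + phi @ phi) ** 2 * e_c
    w_hat += stabilizing_term(w_hat)

print(np.round(w_hat, 3))
```

The normalization by \((1+\phi ^{\top }\phi )^2\) keeps the step size bounded regardless of the regressor magnitude, which is what allows the zero initialization to be safe in this sketch.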
The critic weights approximation error \(\tilde{\omega }_c\) is defined as the difference between the ideal and the approximate weight vector, i.e., \( \tilde{\omega }_c=\omega _c-\hat{\omega }_c \). From (41) and (43) we obtain
Then, using (49) and (50), the critic weights approximation error dynamics is presented as
Under the event-based control law, the closed-loop system functions as an impulsive dynamical system consisting of flow dynamics and jump dynamics. Let us consider an augmented state vector \(\psi =[\xi ^\top ,{\xi }_k^\top ,\tilde{\omega }_c^\top ]^\top \). Then, the flow dynamics of the closed-loop system, which occur for all \(t\in [t_k,t_{k+1})\), can be presented as
and the jump dynamics of the closed-loop system, which occur at \(t=t_{k+1}\), can be presented as
where \(\psi (t^+)=\lim _{\varsigma \rightarrow 0^+}\psi (t+\varsigma )\) and \(\varsigma \in (0,t_{k+1}-t_k)\).
5 Stability analysis
In this section, the stability of the impulsive dynamical representation of the closed-loop system, given by (52) and (53), is studied. Prior to moving forward, some assumptions that are common in the literature are stated below [32].
Assumption 3
The augmented system dynamics \(G(\xi )\) and \(K(\xi )\) satisfy the following conditions, where A, \(G_{M}\), and \(K_M\) are positive constants.
1. The dynamics \(G(\xi )\) satisfies Lipschitz continuity such that \(\Vert G(\xi )-G({\xi }_k)\Vert \le A\Vert e_{k}(t)\Vert \).
2. The dynamics \(G(\xi )\) and \(K(\xi )\) are upper bounded by \(G_M\) and \(K_M\), respectively.
Assumption 4
The following conditions hold on \(\Omega \), where B, \(\nabla \sigma _{cM}\), \(\nabla \epsilon _{cM}\), \(\omega _{cM}\), and \(e_{cHM}\) are positive constants.
1. The gradient of the activation function satisfies Lipschitz continuity such that \(\Vert \nabla \sigma _c(\xi )-\nabla \sigma _c({\xi }_k)\Vert \le B\Vert e_{k}(t)\Vert \).
2. The gradient of the activation function \(\nabla \sigma _c(\xi )\) and the gradient of the neural approximation error \(\nabla \epsilon _c(\xi )\) are upper bounded by \( \nabla \sigma _{cM}\) and \(\nabla \epsilon _{cM}\), respectively.
3. The ideal weight vector \(\omega _c\) and the residual error \(e_{cH}\) are upper bounded by \(\omega _{cM}\) and \(e_{cHM}\), respectively.
Theorem 2
Let Assumptions 1 to 4 be true. Then, under the control policies (39) and (40), the closed-loop auxiliary system (7) is asymptotically stable and the critic weights approximation error is uniformly ultimately bounded if the inequalities (54) and (55) hold, where \(\eta _2\in (0,1)\) is a design parameter, and the values of \(\varGamma _1\) and \(\varGamma _2\) are given by (63) and (69), respectively.
Proof
In light of the flow dynamics (52) and the jump dynamics (53) we consider the Lyapunov function candidate as
where \(\Upsilon _1(t)=J^*(\xi )\), \(\Upsilon _2(t)=J^*({\xi }_k)\), \(\Upsilon _3(t)=\frac{1}{2}\tilde{\omega }_c^\top \tilde{\omega }_c\) and \(\Upsilon _4(t)=l_sV(\xi )\). Now, the analysis is separated into two cases.
Case 1. Events are not triggered, i.e., \(t\in [t_k,t_{k+1})\). Differentiating (56), one can write
It is evident that, for \(t\in [t_k,t_{k+1})\), \(\dot{\Upsilon }_2(t)=0\). Now, differentiating \(\Upsilon _1(t)\) along the trajectory of \(\dot{\xi }(t)=F(\xi )+G(\xi )\hat{\mu }(\xi _k)+K(\xi )\hat{v}(\xi )\), we obtain
Now, using Eqs. (23) and (24), we derive
We can write
and
Based on the above three inequalities, (58) can be expressed as
where \(\varGamma _1\) is a positive constant and it is expressed as
We have \(\omega _c^\top \phi =\phi ^\top \omega _c\). Let \(\varphi =\phi /({1+\phi ^\top \phi })\). Now, using (51), the time derivative of \(\Upsilon _3(t)\) is found as
Let \(\lambda _{M}(\varphi \varphi ^\top )=\lambda _{\varphi M}\) and \(\lambda _{m}(\varphi \varphi ^\top )=\lambda _{\varphi m}\). Then, considering Young’s inequality \(2c^\top d \le c^\top c+d^\top d\) and Assumption 4, (64) can be expressed as
Now, substituting \(\tilde{\omega }_c=\omega _c-\hat{\omega }_c\) into the last two terms of (65) and considering the control policies (39) and (40), one can write
The derivative of \({\Upsilon }_4(t)\) is
Now, combining (66) and (67) and using the control policies (37) and (38), one can write
Now, utilizing Assumptions 1 to 4, we can write
where \(\kappa =\frac{1}{2}\nabla \epsilon _{cM}(G_M^2\Vert R^{-1}\Vert +\frac{1}{\beta ^2}K_M^2\Vert M^{-1}\Vert )\) and the positive constant \(\varGamma _2\) is given by
Substituting (62) and (68) into (57), we obtain
Since \(\bar{Q}=diag\{Q,0_{n\times n}\}\), one can write \(\xi ^\top (t)\bar{Q} \xi (t)=e_{t}^\top (t)Qe_{t}(t)\). Now, introducing the design parameter \(\eta _2\), (70) can be presented as
If the inequalities (54) and (55) mentioned in Theorem 2 hold then (71) implies
i.e., the proposed Lyapunov function candidate has negative time derivative for all \(t\in [t_k,t_{k+1})\).
Case 2. Events are triggered, i.e., \(t=t_{k+1}\). We derive the difference of the Lyapunov function candidate as
where \(\xi (t^+_k)=\lim _{\varsigma \rightarrow 0^+}\xi (t_k+\varsigma )\) and \(\varsigma \in (0,t_{k+1}-t_k)\). In Case 1 we derived that \(\dot{\Upsilon }(t)<0\) for all \(t\in [t_k,t_{k+1})\), so
Thus, one can write
Hence, we can further express that
where \(\vartheta \) is a class \(\mathcal {K}\) function and \(e_{k+1}(t_k)={\xi }_{k+1}-{\xi }_{k}\). The inequalities (74) and (75) imply that the proposed Lyapunov function candidate is monotonically decreasing at every triggering instant \(t_{k+1}\).
Thus from the two cases presented above, we conclude that the closed-loop system is asymptotically stable and the critic weights approximation error is uniformly ultimately bounded. \(\square \)
A flowchart is given in Fig. 1 to explain the fundamental methodology of the proposed work, which comprises the learning and implementation phases. In the learning phase, the converged critic weights are obtained after sufficient iterations while using the event-triggering rule (54). Then the converged weights are passed to the implementation phase to obtain the approximate values of the optimal control policies \( \mu ^{*}(\xi _k)\) and \(v^{*}(\xi )\) as \( \hat{\mu }(\xi _k)\) and \(\hat{v}(\xi )\), respectively. The approximated event-based control policy \( \hat{\mu }(\xi _k)\) is applied to the uncertain nonlinear system while using the event triggering rule (21) to track the desired trajectory.
Remark 6
The values of the design parameters \(\eta _1\) and \(\eta _2\) are chosen such that the terms \(\Vert e_T\Vert ^2\) and \(\Vert \hat{e}_T\Vert ^2\) remain positive, respectively. Increasing \(\eta _1\) and \(\eta _2\) increases the sampling frequency and the number of event-triggering instants, and it improves the tracking performance. However, the values must be selected so that there is a trade-off between the number of triggering instants and the tracking performance. Similar to the relevant literature [38], the other parameters are chosen heuristically such that the convergence time of the critic weights and the number of triggering instants are minimized while maintaining acceptable tracking performance.
6 Simulation illustration
In this section, two simulation examples are presented to exhibit the efficacy of the proposed event-based robust trajectory tracking scheme. In Example 1, we have considered a linear system with unmatched uncertainty, and in Example 2, the spring-mass-damper system with nonlinear spring constant and unmatched uncertainty is considered.
6.1 Example 1
Consider the following linear unmatched uncertain system [21]
where \(x=[x_1,x_2]^\top \in \mathbb {R}^2\) is the state vector, \(u\in \mathbb {R}\) denotes the control input, \( \Delta f(x)=l(x)d(x)\) and \(l(x)=[1,0]^\top \). The perturbation is \(d(x)=0.5 \theta _1 x_1 \sin (x_2+\theta _2)\), where the parameters \(\theta _1\) and \(\theta _2\) are unknown. We consider \(\theta _1\in [-1,1]\), \(\theta _2\in [-5,5]\), and the upper bound of the perturbation d(x) is \(\lambda _d(x)=\vert x_1\vert \). Let \(x_0=[0.6,-0.5]^\top \) be the initial state. The desired trajectory \(x_d(t)\) is generated from
where \(x_d=[x_{d1},x_{d2}]^{\top }\in \mathbb {R}^2\) with the initial condition \(x_{d0}=[0.3,-0.3]^\top \). The tracking error \(e_t\) is defined as \(e_t=x-x_d\), where \(e_t=[e_{t1},e_{t2}]^{\top }\in \mathbb {R}^2\), with the initial condition \(e_{t0}=x_0-x_{d0}\). Then an augmented state vector \(\xi =[\xi _1,\xi _2,\xi _3,\xi _4]^{\top }\in \mathbb {R}^4\) is defined and the following augmented system is formed
where \(L(\xi )=[1,0,0,0]^\top \) and \(d(\xi )=0.5\theta _1 (\xi _1+\xi _3) \sin ((\xi _2+\xi _4)+\theta _2)\). The initial condition is \(\xi _0=[e_{t0}^{\top },x_{d0}^{\top }]^{\top }=[0.3,-0.2,0.3,-0.3]^\top \). The upper bound of \(d(\xi )\) is derived as \(\lambda _d(\xi )=\vert \xi _1+\xi _3\vert \). We obtain \(G^+(\xi )=[0,1,0,0]\), so \((I-G(\xi )G^+(\xi ))L(\xi )=[1, 0, 0, 0]^{\top }\). As in (7), the auxiliary system is formulated as
Since \(\Vert G^+(\xi )L(\xi )d(\xi )\Vert =0\), we take \(g_M(\xi )=0\). Let \(R=I_1\), \(M=I_1\), and \(Q=500I_2\). For the simulation, we set \(\gamma =0.5\) and \(\beta =0.85\).
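The matched/unmatched decomposition above can be checked numerically. The following sketch assumes \(G(\xi )=[0,1,0,0]^\top \), which is consistent with the reported \(G^+(\xi )=[0,1,0,0]\); it verifies the pseudoinverse and the unmatched component \((I-GG^+)L\).

```python
import numpy as np

# Hedged sketch: verifying the decomposition reported for Example 1.
# G below is an assumption consistent with the stated pseudoinverse
# G+ = [0, 1, 0, 0]; L is the uncertainty direction from the text.
G = np.array([[0.0], [1.0], [0.0], [0.0]])   # input vector G(xi), assumed
L = np.array([[1.0], [0.0], [0.0], [0.0]])   # uncertainty direction L(xi)

G_plus = np.linalg.pinv(G)                   # Moore-Penrose pseudoinverse, 1x4
unmatched = (np.eye(4) - G @ G_plus) @ L     # (I - G G+) L, the unmatched part

print(G_plus.ravel())      # -> [0. 1. 0. 0.]
print(unmatched.ravel())   # -> [1. 0. 0. 0.]
```

Since \(G^+L=0\), the uncertainty is entirely unmatched here, which is why the auxiliary-system construction is needed at all.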
Our aim is to develop an event-based robust controller for the system (76) to track the reference trajectory generated by (77). As described in the theoretical analysis, to achieve this design criteria, the augmented system (78) is formed and then the original control problem is transformed to designing an event-based optimal controller for auxiliary system (79). Based on (8), the cost function for (79) can be presented as
The critic network (35) is employed to find an approximate solution of the event-based optimal control problem. We consider \(l=10\) hidden-layer neurons, and the weight vector of the critic network is represented as \( \hat{\omega }_c=[\hat{\omega }_{c1},\ldots ,\hat{\omega }_{c10}]^\top \). The activation function for the critic network is selected as \(\sigma _c(\xi )=[\xi _1^2,\xi _2^2,\xi _3^2,\xi _4^2,\xi _1\xi _2,\xi _1\xi _3,\xi _1\xi _4,\xi _2\xi _3,\xi _2\xi _4, \xi _3\xi _4]^\top \). The weights are trained using the tuning rule (49), and the triggering condition (54) is used during the training process. The parameters used during the tuning process are \(l_c=3\), \(l_s=0.1\), \((A^2\nabla \sigma ^2_{cM}+B^2G^2_M)=8\) and \(\eta _2=0.7\).
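The quadratic activation vector and its Jacobian (needed to form the approximate value-function gradient \(\nabla \hat{V}=\nabla \sigma _c^\top \hat{\omega }_c\)) can be written out explicitly. This is an illustrative sketch of those two maps, not the authors' implementation:

```python
import numpy as np

def sigma_c(xi):
    """Activation vector sigma_c(xi): all quadratic monomials of the
    4-dimensional augmented state, in the order used for Example 1."""
    x1, x2, x3, x4 = xi
    return np.array([x1**2, x2**2, x3**2, x4**2,
                     x1*x2, x1*x3, x1*x4, x2*x3, x2*x4, x3*x4])

def grad_sigma_c(xi):
    """Jacobian d(sigma_c)/d(xi), a 10x4 matrix; row i is the gradient of
    the i-th monomial with respect to xi."""
    x1, x2, x3, x4 = xi
    return np.array([
        [2*x1, 0, 0, 0], [0, 2*x2, 0, 0], [0, 0, 2*x3, 0], [0, 0, 0, 2*x4],
        [x2, x1, 0, 0], [x3, 0, x1, 0], [x4, 0, 0, x1],
        [0, x3, x2, 0], [0, x4, 0, x2], [0, 0, x4, x3]])
```

With a weight vector `w_c` of length 10, the approximate value-function gradient at a state `xi` is then `grad_sigma_c(xi).T @ w_c`.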
To satisfy the PE condition, a small exponentially decreasing probing noise is applied to the control input for the first 10 s of the training process. All elements of the weight vector are initialized to zero. As shown in Fig. 2, the critic weight vector converges to \(\hat{\omega }_c=[0.46, 3.64, 0.01, -0.69, -0.85, -0.32, -0.02, 0.71, -3.89, -1.60]^{\top }\). During the training process, the event-based controller updates 5714 times, whereas, under the same design criteria, the time-based controller updates 18,000 times.
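A probing signal of this kind can be sketched as follows; the amplitude, decay rate, and frequency content here are illustrative assumptions, since the paper does not report them.

```python
import numpy as np

def probing_noise(t, amp=0.5, decay=0.3):
    """Exponentially decaying multi-sine probing signal for the PE condition,
    switched off after the first 10 s of training. All parameter values are
    illustrative, not those used by the authors."""
    if t > 10.0:
        return 0.0
    envelope = amp * np.exp(-decay * t)
    return envelope * (np.sin(2*t) + np.sin(5*t) + np.cos(9*t))
```

The mix of incommensurate frequencies keeps the regressor \(\sigma _c(\xi )\) persistently exciting early in training, while the decaying envelope lets the closed loop settle once the weights converge.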
Then, we use the converged weights to obtain the approximate control policies (39) and (40). Next, we select \(\theta _1=-0.3\) and \(\theta _2=5\) to demonstrate the trajectory tracking ability of the designed control policy \( \hat{\mu }(\xi _k)\) and the triggering rule described in (21). We consider \(\eta _1=0.65\) and \(\mathcal {L}=2.5\). The sampling period is taken as 0.01 s. The tracking performance of the designed controller is displayed in Figs. 3 and 4. The obtained event-based control policy \( \hat{\mu }(\xi _k)\) is shown in Fig. 5.
The sampling frequency \(\eta _1\) is chosen in the interval \((0,1)\). Table 1 illustrates the relationship between \(\eta _1\) and the number of triggering instants \(N_s\): as \(\eta _1\) increases, the number of event-triggering instants \(N_s\) increases.
The evolution of the triggering condition with \(\Vert e_T\Vert ^2\) and \(\Vert e_k\Vert ^2\) is displayed in Fig. 6, and the sampling period is shown in Fig. 7. The minimal intersample time is found to be 0.01 s, which means the infamous Zeno behavior is excluded. Furthermore, Fig. 7 also shows that only 435 state samples are used during the tracking process, so the controller is updated only 435 times. In contrast, the time-triggering method requires 1600 samples under the same conditions. Hence, the developed event-based tracking control strategy significantly reduces the resources used.
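The update-count comparison reflects the generic event-triggered loop below. This is a hedged sketch: `threshold` stands in for the state-dependent bound \(\Vert e_T\Vert ^2\) of rule (21), and the plant, controller, and threshold used in the usage example are placeholders, not the paper's.

```python
import numpy as np

def run_event_triggered(f, mu, threshold, xi0, dt=0.01, T=16.0):
    """Generic event-triggered closed loop (forward-Euler sketch).

    f(xi, u): plant dynamics; mu(xi): controller; threshold(xi): stand-in
    for the state-dependent triggering bound ||e_T||^2 of rule (21).
    Returns the number of control updates (event-triggering instants)."""
    xi, xi_k = xi0.copy(), xi0.copy()
    u, n_updates = mu(xi_k), 1
    for _ in range(int(T / dt)):
        gap = xi_k - xi                    # event error e_k
        if gap @ gap > threshold(xi):      # triggering condition violated
            xi_k = xi.copy()               # sample the state
            u = mu(xi_k)                   # recompute the control
            n_updates += 1
        xi = xi + dt * f(xi, u)            # integrate the plant one step
    return n_updates

# Placeholder closed loop: a stable linear plant with a constant threshold.
n = run_event_triggered(f=lambda xi, u: -xi + u,
                        mu=lambda xi: -0.5 * xi,
                        threshold=lambda xi: 1e-3,
                        xi0=np.array([0.6, -0.5]))
print(n)   # far fewer updates than the 1600 time-triggered steps
```

Between events the control input is held constant, which is exactly why the number of updates can fall well below the number of integration steps.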
Next, to show that the derived controller is robust, we take \(\theta _1=0.4\) and \(\theta _2=-1\). The tracking performance for the new values of \(\theta _1\) and \(\theta _2\) is shown in Figs. 8 and 9. In this scenario, the event-based controller updates only 468 times, whereas the conventional time-triggered controller updates 1600 times under the same design criteria.
6.2 Example 2
Consider the spring-mass-damper system [36]
where \(x=[x_1,x_2]^\top \), \(x_1\) is the position and \(x_2\) is the velocity, m represents the mass of the object, k denotes the spring constant, and c is the damping coefficient. Let \(m=1\,\text {kg}\), \(c=0.5\,\text {N}\cdot \text {s/m}\), and let the spring be nonlinear with \(k(x)=-5 x^3\,\text {N/m}\). After adding an unmatched uncertainty \( \Delta f(x)\), the system dynamics becomes
where \( \Delta f(x)=l(x)d(x)\) and \(l(x)=[1,0]^\top \). The perturbation is \(d(x)=0.5 \theta _1 x_1 x_2 \sin (x_1) \cos (x_2+\theta _2)\), where the parameters \(\theta _1\) and \(\theta _2\) are unknown. We consider \(\theta _1\in [-1,1]\), \(\theta _2\in [-5,5]\), and the upper bound of the perturbation d(x) is \(\lambda _d(x)=\vert x_2\vert \). Let \(x_0=[0.5,0.2]^\top \) be the initial state. The desired trajectory \(x_d(t)\) is generated from
where \(x_d=[x_{d1},x_{d2}]^{\top }\in \mathbb {R}^2\) with the initial condition \(x_{d0}=[0.2,-0.2]^\top \). Then an augmented state vector \(\xi =[\xi _1,\xi _2,\xi _3,\xi _4]^{\top }\in \mathbb {R}^4\) is defined and the following augmented system is formed
where \(L(\xi )=[1,0,0,0]^\top \) and \(d(\xi )=0.5\theta _1 (\xi _1+\xi _3) (\xi _2+\xi _4) \sin (\xi _1+\xi _3) \cos ((\xi _2+\xi _4)+\theta _2)\). The initial condition is \(\xi _0=[0.3,0.4,0.2,-0.2]^\top \). The upper bound is derived as \(\lambda _d(\xi )=\vert \xi _2+\xi _4\vert \). The auxiliary system is formulated as
Since \(\Vert G^+(\xi )L(\xi )d(\xi )\Vert =0\), we take \(g_M(\xi )=0\). Let \(R=I_1\), \(M=I_1\), and \(Q=300I_2\). For the simulation, we set \(\gamma =1.2\) and \(\beta =0.9\).
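As a concreteness check, the spring-mass-damper dynamics of Example 2 can be sketched in code. The sign convention for the nonlinear spring term below is an assumption of the common form \(\dot{x}_1=x_2\), \(m\dot{x}_2=-k(x_1)x_1-cx_2+u\); the exact arrangement follows the paper's display equation, which is not reproduced here.

```python
import numpy as np

# Hedged sketch of the Example 2 dynamics, assuming the common form
#   x1' = x2,   m*x2' = -k(x1)*x1 - c*x2 + u,   k(x1) = -5*x1**3,
# with m = 1 kg and c = 0.5 N s/m as reported. The sign convention of the
# nonlinear spring term is an assumption.
m, c = 1.0, 0.5

def smd(x, u):
    x1, x2 = x
    k = -5.0 * x1**3                          # nonlinear spring "constant"
    return np.array([x2, (-k * x1 - c * x2 + u) / m])

# One forward-Euler step of length 0.01 s from the reported x0 = [0.5, 0.2]
x0 = np.array([0.5, 0.2])
x_next = x0 + 0.01 * smd(x0, 0.0)
```

Because \(k(x_1)=-5x_1^3\) is negative for positive displacement under this convention, the spring term is destabilizing, which makes the tracking task nontrivial even before the unmatched uncertainty is added.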
Based on (8), the cost function for (85) can be presented as
We consider \(l=10\) hidden-layer neurons, and the activation function is chosen as \(\sigma _c(\xi )=[\xi _1^2,\xi _1\xi _2,\xi _1\xi _3,\xi _1\xi _4,\xi _2^2,\xi _2\xi _3,\xi _2\xi _4,\xi _3^2,\xi _3\xi _4, \xi _4^2]^\top \). The parameters used during the tuning process are \(l_c=4\), \(l_s=0.5\), \((A^2\nabla \sigma ^2_{cM}+B^2G^2_M)=8\) and \(\eta _2=0.7\).
To satisfy the PE condition, a small exponentially decreasing probing noise is applied to the control input for the first 10 s of the training process. All elements of the weight vector are initialized to zero. The critic weight vector \(\hat{\omega }_c\) converges to \([3.53,19,11.46,2.49,1.93,0.23,0.01,10.07,-0.73, 2.05]^{\top }\), as shown in Fig. 10. During the training process, the event-based controller updates 8947 times, whereas, under the same design criteria, the time-based controller updates 16,000 times.
Then, we use the converged weights to obtain the control policies (39) and (40). Next, we select \(\theta _1=-0.9\) and \(\theta _2=-0.3\) to check the trajectory tracking performance of the designed control policy \( \hat{\mu }(\xi _k)\) and the triggering rule described in (21). We consider \(\eta _1=0.7\) and \(\mathcal {L}=10\). The sampling period is taken as 0.01 s. The performance of the designed tracking controller is displayed in Figs. 11 and 12. The obtained event-based control policy \( \hat{\mu }(\xi _k)\) is displayed in Fig. 13.
The evolution of the triggering condition with \(\Vert e_T\Vert ^2\) and \(\Vert e_k\Vert ^2\) is displayed in Fig. 14, and the sampling period is shown in Fig. 15. The minimal intersample time is found to be 0.01 s, which means the infamous Zeno behavior is excluded. Furthermore, Fig. 15 also shows that only 1452 state samples are used during the tracking process, so the controller is updated only 1452 times. In contrast, the time-triggering method requires 8000 samples under the same conditions. Hence, the developed event-based tracking control strategy significantly reduces the resources used.
Next, to show that the derived controller is robust, we take \(\theta _1=0.8\) and \(\theta _2=4\). The tracking performance for the new values of \(\theta _1\) and \(\theta _2\) is shown in Figs. 16 and 17. Here, only 1438 state samples are used during the tracking process; in other words, the event-based controller updates only 1438 times, whereas the conventional time-triggered controller updates 8000 times under the same design criteria.
7 Conclusion
In this work, an event-based robust tracking strategy for unmatched uncertain systems is developed. By forming an auxiliary system and decomposing the unmatched uncertainty, the original control problem is transformed into obtaining an optimal controller for the auxiliary system. The associated event-based HJB equation is solved via the ADP approach. The critic weight tuning law is modified to avoid the need for an initial stabilizing control at the beginning of the tuning process. Meanwhile, a novel event-triggering law is developed, and the uniform ultimate boundedness of the tracking error is verified using the Lyapunov method. The closed-loop auxiliary system’s asymptotic stability and the uniform ultimate boundedness of the critic approximation error are also assured. Finally, two simulation examples are included to demonstrate the usefulness of the proposed methodology.
The main limitations of the proposed work and the future scope are as follows.

1. The work developed in this article needs complete knowledge of the system dynamics. However, we may not know the system dynamics completely in many applications.
2. The proposed method is not suitable for time-delay systems. In the future, a tracking controller will be designed for uncertain nonlinear systems with time delay using the event-based ADP approach.
Availability of data
Data sharing not applicable to this paper as no datasets were generated or analyzed during this work.
References
Kravaris, C., Palanki, S.: A Lyapunov approach for robust nonlinear state feedback synthesis. IEEE Trans. Autom. Control 33(12), 1188–1191 (1988)
Lewis, F., Jagannathan, S., Yesildirak, A.: Neural Network Control of Robot Manipulators and Non-linear Systems. CRC Press, London (2020)
Lin, F.: An optimal control approach to robust control design. Int. J. Control 73(3), 177–186 (2000)
Karimi-Ghartemani, M., Khajehoddin, S.A., Jain, P., Bakhshai, A.: Linear quadratic output tracking and disturbance rejection. Int. J. Control 84(8), 1442–1449 (2011)
Zribi, M., Almutairi, N., Abdel-Rohman, M., Terro, M.: Nonlinear and robust control schemes for offshore steel jacket platforms. Nonlinear Dyn. 35(1), 61–80 (2004)
Kwakernaak, H., Sivan, R.: Linear Optimal Control Systems, vol. 1. Wiley-Interscience, New York (1972)
Modares, H., Sistani, M.-B.N., Lewis, F.L.: A policy iteration approach to online optimal control of continuous-time constrained-input systems. ISA Trans. 52(5), 611–621 (2013)
Zhang, H.-G., Zhang, X., Yan-Hong, L., Jun, Y.: An overview of research on adaptive dynamic programming. Acta Autom. Sin. 39(4), 303–311 (2013)
Adhyaru, D., Kar, I., Gopal, M.: Fixed final time optimal control approach for bounded robust controller design using Hamilton–Jacobi–Bellman solution. IET Control Theory Appl. 3(9), 1183–1195 (2009)
Zhao, B., Jia, L., Xia, H., Li, Y.: Adaptive dynamic programming-based stabilization of nonlinear systems with unknown actuator saturation. Nonlinear Dyn. 93(4), 2089–2103 (2018)
Werbos, P.: Beyond regression: New tools for prediction and analysis in the behavioral sciences. Ph.D. dissertation, Harvard University (1974)
Lewis, F.L., Vrabie, D.: Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits Syst. Mag. 9(3), 32–50 (2009)
Si, J., Barto, A.G., Powell, W.B., Wunsch, D.: Handbook of Learning and Approximate Dynamic Programming, vol. 2. John Wiley & Sons, New York (2004)
Prokhorov, D.V., Wunsch, D.C.: Adaptive critic designs. IEEE Trans. Neural Netw. 8(5), 997–1007 (1997)
Si, J., Wang, Y.-T.: Online learning control by association and reinforcement. IEEE Trans. Neural Netw. 12(2), 264–276 (2001)
Dorigo, M., Birattari, M., Stutzle, T.: Ant colony optimization. IEEE Comput. Intell. Mag. 1(4), 28–39 (2006)
Cheein, F.A., Scaglia, G.: Trajectory tracking controller design for unmanned vehicles: a new methodology. J. Field Robot. 31(6), 861–887 (2014)
Al Issa, S., Kar, I.: Design and implementation of event-triggered adaptive controller for commercial mobile robots subject to input delays and limited communications. Control Eng. Pract. 114, 104865 (2021)
Modares, H., Lewis, F.L.: Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning. Automatica 50(7), 1780–1792 (2014)
Wang, D., Liu, D., Zhang, Y., Li, H.: Neural network robust tracking control with adaptive critic framework for uncertain nonlinear systems. Neural Netw. 97, 11–18 (2018)
Yang, X., Liu, D., Wei, Q., Wang, D.: Guaranteed cost neural tracking control for a class of uncertain nonlinear systems using adaptive dynamic programming. Neurocomputing 198, 80–90 (2016)
Mu, C., Zhang, Y., Gao, Z., Sun, C.: ADP-based robust tracking control for a class of nonlinear systems with unmatched uncertainties. IEEE Trans. Syst. Man Cybern. Syst. 50(11), 4056–4067 (2019)
Tabuada, P.: Event-triggered real-time scheduling of stabilizing control tasks. IEEE Trans. Autom. Control 52(9), 1680–1685 (2007)
Al Issa, S., Chakravarty, A., Kar, I.: Improved event-triggered adaptive control of non-linear uncertain networked systems. IET Control Theory Appl. 13(13), 2146–2152 (2019)
Al Issa, S., Kar, I.: Event-triggered adaptive control of uncertain non-linear systems under input delay and limited resources. Int. J. Dyn. Control (2021). https://doi.org/10.1007/s40435-021-00767-7
Mu, C., Wang, D., Sun, C., Zong, Q.: Robust adaptive critic control design with network-based event-triggered formulation. Nonlinear Dyn. 90(3), 2023–2035 (2017)
Vamvoudakis, K.G.: Event-triggered optimal adaptive control algorithm for continuous-time nonlinear systems. IEEE/CAA J. Autom. Sin. 1(3), 282–293 (2014)
Wang, D., Liu, D.: Learning and guaranteed cost control with event-based adaptive critic implementation. IEEE Trans. Neural Netw. Learn. Syst. 29(12), 6004–6014 (2018)
Dhar, N.K., Verma, N.K., Behera, L.: Adaptive critic-based event-triggered control for HVAC system. IEEE Trans. Ind. Inf. 14(1), 178–188 (2017)
Yang, X., Wei, Q.: Adaptive critic designs for optimal event-driven control of a CSTR system. IEEE Trans. Ind. Inf. 17(1), 484–493 (2020)
Dong, L., Zhong, X., Sun, C., He, H.: Event-triggered adaptive dynamic programming for continuous-time systems with control constraints. IEEE Trans. Neural Netw. Learn. Syst. 28(8), 1941–1952 (2016)
Yang, X., He, H.: Adaptive critic designs for event-triggered robust control of nonlinear systems with unknown dynamics. IEEE Trans. Cybern. 49(6), 2255–2267 (2018)
Wang, D., Mu, C., He, H., Liu, D.: Event-driven adaptive robust control of nonlinear systems with uncertainties through NDP strategy. IEEE Trans. Syst. Man Cybern. Syst. 47(7), 1358–1370 (2016)
Vamvoudakis, K.G., Mojoodi, A., Ferraz, H.: Event-triggered optimal tracking control of nonlinear systems. Int. J. Robust Nonlinear Control 27(4), 598–619 (2017)
Zhang, K., Zhang, H., Xiao, G., Su, H.: Tracking control optimization scheme of continuous-time nonlinear system via online single network adaptive critic design method. Neurocomputing 251, 127–135 (2017)
Zhang, K., Zhang, H., Jiang, H., Wang, Y.: Near-optimal output tracking controller design for nonlinear systems using an event-driven ADP approach. Neurocomputing 309, 168–178 (2018)
Xue, S., Luo, B., Liu, D., Gao, Y.: Adaptive dynamic programming-based event-triggered optimal tracking control. Int. J. Robust Nonlinear Control 31(15), 7480–7497 (2021)
Xu, N., Niu, B., Wang, H., Huo, X., Zhao, X.: Single-network ADP for solving optimal event-triggered tracking control problem of completely unknown nonlinear systems. Int. J. Intell. Syst. 36(9), 4795–4815 (2021)
Zhao, B., Liu, D.: Event-triggered decentralized tracking control of modular reconfigurable robots through adaptive dynamic programming. IEEE Trans. Industr. Electron. 67(4), 3054–3064 (2019)
Wang, D., Hu, L., Zhao, M., Qiao, J.: Adaptive critic for event-triggered unknown nonlinear optimal tracking design with wastewater treatment applications. IEEE Trans. Neural Netw. Learn. Syst. (2021). https://doi.org/10.1109/TNNLS.2021.3135405
Cui, L., Xie, X., Wang, X., Luo, Y., Liu, J.: Event-triggered single-network ADP method for constrained optimal tracking control of continuous-time non-linear systems. Appl. Math. Comput. 352, 220–234 (2019)
Cui, L., Qu, W., Wang, L., Luo, Y., Wang, Z.: Event-triggered \(H_{\infty }\) tracking control of nonlinear systems via reinforcement learning method. In: 2019 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2019). https://doi.org/10.1109/IJCNN.2019.8851956
Dahal, R., Kar, I.: Event-triggered robust tracking controller for uncertain nonlinear systems using adaptive critic. In: 2020 IEEE 17th India Council International Conference (INDICON), pp. 1–6 (2020). IEEE
Xue, S., Luo, B., Liu, D., Gao, Y.: Event-triggered ADP for tracking control of partially unknown constrained uncertain systems. IEEE Trans. Cybern. (2021). https://doi.org/10.1109/TCYB.2021.3054626
Modares, H., Lewis, F.L.: Linear quadratic tracking control of partially-unknown continuous-time systems using reinforcement learning. IEEE Trans. Autom. Control 59(11), 3051–3056 (2014)
Kiumarsi, B., Lewis, F.L.: Actor-critic-based optimal tracking for partially unknown nonlinear discrete-time systems. IEEE Trans. Neural Netw. Learn. Syst. 26(1), 140–151 (2014)
Zhang, Q., Zhao, D., Wang, D.: Event-based robust control for uncertain nonlinear systems using adaptive dynamic programming. IEEE Trans. Neural Netw. Learn. Syst. 29(1), 37–50 (2016)
Dierks, T., Jagannathan, S.: Optimal control of affine nonlinear continuous-time systems. In: Proceedings of the 2010 American Control Conference, pp. 1568–1573 (2010). IEEE
Funding
This work received no specific grant from any funding agency.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Dahal, R., Kar, I. Robust tracking control of nonlinear unmatched uncertain systems via event-based adaptive dynamic programming. Nonlinear Dyn 109, 2831–2850 (2022). https://doi.org/10.1007/s11071-022-07594-1