1 Introduction

Uncertainties are inevitable in practical nonlinear systems because of the presence of external disturbances and modeling errors. Hence, considering the requirement that the designed feedback controller be robust to uncertainties, many robust control design schemes have been developed over several decades [1,2,3]. In particular, the method developed by Lin [3], in which the optimal control approach is utilized to obtain the robust controller, has received remarkable attention [4, 5]. In the case of linear systems, the optimal controller can be derived conveniently by solving the associated algebraic Riccati equation (ARE) [6]. However, for nonlinear systems, instead of the ARE, one needs to find the solution of the Hamilton–Jacobi–Bellman (HJB) equation [7]. Since the HJB equation is a nonlinear partial differential equation, solving it analytically is challenging. Although dynamic programming is generally used to solve the optimal control problem of nonlinear systems, it suffers heavily from the notorious “curse of dimensionality” [8]. The neural network (NN)-based function approximation technique called adaptive dynamic programming (ADP) has been employed to address this difficulty [9, 10]. The ADP approach was initially developed by Werbos to determine the solution of the optimal control problem effectively [11]. The ADP algorithm is closely related to the reinforcement learning (RL) technique [12]. In the literature, ADP is also known as approximate dynamic programming [13], adaptive critic designs (ACDs) [14], neural dynamic programming (NDP) [15], and Q-learning [16].

In most practical applications, the system states need to track a desired trajectory rather than merely converge to zero [17, 18]. In the past several years, significant work has been done on tracking control by combining the aforementioned robust control method and ADP algorithms [19,20,21,22]. In [19], the integral reinforcement learning technique is utilized to develop an optimal tracking controller for constrained-input systems. For nonlinear matched uncertain systems, a robust tracking controller is designed via a neural network approach in [20], and a guaranteed cost tracking controller is developed in [21]. In [22], a tracking controller for nonlinear systems with unmatched uncertainties is derived via ACDs. However, all the work mentioned above is developed under the classical time-triggering framework, which suffers from inefficient use of computational and communication resources.

Compared to the classical time-triggered approach, in the event-based or event-triggered strategy the controller is updated only when a predefined triggering rule is violated, which helps in the effective use of computational and communication resources [23,24,25,26]. Thus, much work has been done by combining the ADP-based robust control approach and the event-based framework. In [27], an actor-critic NN structure is utilized to derive an event-based optimal adaptive controller for nonlinear systems. In [28], an event-based guaranteed cost controller is derived for nonlinear systems utilizing a single critic NN. In [29], an adaptive critic is used to design an event-based near-optimal controller for heating, ventilation, and air conditioning (HVAC) systems. In [30], an event-triggered optimal controller is designed for a continuous stirred tank reactor (CSTR) system using the ADP approach. The event-based ADP framework has been further utilized in designing controllers for nonlinear systems with constrained input [31], with matched uncertainties [32], and with unmatched uncertainties [33].

Under the event-based ADP framework, tracking controllers for nonlinear systems have been designed in [34,35,36,37,38,39,40,41,42,43]. In [34,35,36, 38, 44], the ADP approach is used to formulate event-based optimal tracking controllers for nonlinear systems without considering any uncertainties. The event-based ADP approach is utilized to derive an optimal tracking controller for modular reconfigurable robots in [39], and in [40], the tracking controller is designed with application to wastewater treatment. In [41], event-based ADP is utilized to develop a tracking controller for constrained-input systems. In [42], Cui et al. established an event-based \(H_{\infty }\) tracking controller via the RL method. In our previous work [43], a nonlinear matched uncertain system was considered while designing the event-based robust tracking controller. In [44], the event-triggered ADP approach is used to design a tracking controller for partially unknown matched uncertain constrained systems. Nonetheless, the existing literature has not focused on developing a robust tracking controller for continuous-time nonlinear systems with unmatched uncertainty via the event-based ADP approach, particularly without using the \(H_{\infty }\) control approach [42]. Unlike matched uncertainty, unmatched uncertainty enters the system through a different channel than the control input. Unmatched uncertainties are a more general class of uncertainty and are widely encountered in practical systems. So, it is vital to consider unmatched uncertainty while designing a controller for nonlinear systems. This is what drives the research developed in this paper.

The following are the major contributions of this work.

  1.

    Compared with [34,35,36,37,38, 41], in this work uncertainty is considered while designing the ADP-based event-triggered robust tracking controller. As unmatched uncertainties are the most common form of uncertainty in practice, they must be taken into account while developing a controller for nonlinear systems.

  2.

    Unlike [42], in this work the event-based robust tracking controller is derived without using the \(H_{\infty }\) control strategy. In the \(H_{\infty }\) optimal control approach, the existence of the saddle point must be verified, which is a challenging task.

  3.

    Rather than considering matched uncertainty as in [43, 44], unmatched uncertainty is considered in this work. Moreover, unlike [43], the need for an initial stabilizing control at the beginning of the critic weight tuning process is relaxed by modifying the tuning rule.

The remaining part of this work is organized in the following manner. In Sect. 2, the original tracking control problem is transformed into the optimal control problem of an auxiliary system. The event-based HJB equation is formulated, and the event-triggering rule is derived in Sect. 3. In Sect. 4, the HJB equation is solved via the ADP approach. In Sect. 5, the Lyapunov approach is used to show that all the signals associated with the closed-loop auxiliary system are uniformly ultimately bounded. In Sect. 6, two simulation examples are presented. Finally, concluding remarks, along with the limitations and future scope of the proposed work, are given in Sect. 7.

Notation: In this work, the maximum and minimum eigenvalues of a matrix are denoted by \(\lambda _{M}(\cdot )\) and \(\lambda _{m}(\cdot )\), respectively. The transpose operation is represented by the superscript \(\top \). \(\nabla (\cdot )\) denotes the gradient operation. \(I_n\) is the identity matrix of dimension \(n\times n\) and \(0_{n\times m}\) is the zero matrix of dimension \(n\times m\). \(\Omega \) is a compact subset of \(\mathbb {R}^{2n}\).

2 Problem transformation

Consider the continuous-time nonlinear uncertain system given in the form

$$\begin{aligned} \dot{x}(t)=f(x(t))+g(x(t))u(t)+\Delta f(x(t)), \end{aligned}$$
(1)

where \( x(t)\in \mathbb {R}^n\) and \( u(t)\in \mathbb {R}^b\) are the state vector and control input, respectively. Let \(x(0)=x_0\) be the initial state. \(f(\cdot )\) and \(g(\cdot )\) are smooth functions of their arguments with \(f(0)=0\), and \(f+gu\) is Lipschitz continuous. The unmatched uncertainty is \(\Delta f(x)=l(x)d(x)\), where \(l(x)\in \mathbb {R}^{n\times p}\), \(d(x)\in \mathbb {R}^p\), and \(l(x)\ne g(x)\) if \(b=p\). Let d(x) be bounded by a known function \(\lambda _d(x)\), i.e., \(\left\Vert d(x)\right\Vert \le \lambda _d(x)\). Furthermore, \(\lambda _d(0)=0\) and \(d(0)=0\). In addition, there exists a nonnegative function \(g_{M}(x)\) satisfying

$$\begin{aligned} \left\Vert g^+(x)\Delta f(x)\right\Vert \le g_{M}(x), \end{aligned}$$

where \(g^+(x)\) is the pseudoinverse of g(x). Let the desired trajectory \(x_d(t)\in \mathbb {R}^n\) be generated from

$$\begin{aligned} \dot{x}_d(t)=\Theta (x_d(t)), \end{aligned}$$
(2)

where \(\Theta (x_d)\) is Lipschitz continuous and \(\Theta (0)=0\). Let \(x_d(0)=x_{d0}\) be the initial condition.

The objective of this work is to derive an event-based robust controller for system (1) so that the system state x(t) follows the desired trajectory \(x_d(t)\). Define the tracking error as \({e}_{t}(t)=x(t)-x_d(t)\). From (1) and (2), the tracking error dynamics can be presented as

$$\begin{aligned} \dot{e}_{t}(t)&=f({e}_{t}(t)+x_d(t))+g({e}_{t}(t)+x_d(t))u(t)\nonumber \\&\quad +\Delta f({e}_{t}(t)+x_d(t))-\Theta (x_d(t)). \end{aligned}$$
(3)

Now, based on the tracking error and the desired trajectory, an augmented state vector \(\xi (t)=[{e}_{t}^{\top }(t),x_d^{\top }(t)]^{\top } \in \mathbb {R}^{2n}\) is formed. Then, using (2) and (3), the augmented system dynamics is formulated as

$$\begin{aligned} \dot{\xi }(t)=F(\xi (t))+G(\xi (t))u(\xi (t))+\Delta F(\xi (t)), \end{aligned}$$
(4)

where \(F:\mathbb {R}^{2n}\rightarrow \mathbb {R}^{2n}\) and \(G:\mathbb {R}^{2n}\rightarrow \mathbb {R}^{2n\times b}\) are the new system functions, while \(\Delta F(\xi (t))\in \mathbb {R}^{2n}\) is the new uncertain term. They can be expressed as

$$\begin{aligned} F(\xi (t))&=\begin{bmatrix}f({e}_{t}(t)+x_d(t))-\Theta (x_d(t))\\ \Theta (x_d(t))\end{bmatrix}, \\ G(\xi (t))&=\begin{bmatrix}g({e}_{t}(t)+x_d(t))\\ 0 \end{bmatrix} \end{aligned}$$

and

$$\begin{aligned} \Delta F(\xi (t))&=\begin{bmatrix}\Delta f({e}_{t}(t)+x_d(t))\\ 0\end{bmatrix}=L(\xi (t))d(\xi (t)). \end{aligned}$$

The terms \(d(\xi )\) and \({G^+(\xi )\Delta F(\xi )} \) remain upper bounded, and the bounds can be derived as

$$\begin{aligned} \Vert {d(\xi )}\Vert =\Vert {d(x)}\Vert \le \lambda _d(x)=\lambda _d({e}_{t}+x_d)\triangleq \lambda _d(\xi ) \end{aligned}$$
(5)

and

$$\begin{aligned} \left\| G^{+} (\xi )\Delta F(\xi )\right\|&=\left\| g^{+} (x)\Delta f(x)\right\| \nonumber \\&\le g_M(x) =g_M({e} _{t}+x_d)\triangleq g_M(\xi ), \end{aligned}$$
(6)

respectively.

Next, the uncertain term \(L(\xi )d(\xi )\) is projected onto the range of the matrix \(G(\xi )\) and decomposed into the sum of a matched and an unmatched component, that is,

$$\begin{aligned} L(\xi )d(\xi )&=G(\xi )G^+(\xi )L(\xi )d(\xi )\nonumber \\ {}&\quad +(I-G(\xi )G^+(\xi ))L(\xi )d(\xi ). \end{aligned}$$

Then the following auxiliary system is formed

$$\begin{aligned} \dot{\xi }&=F(\xi )+G(\xi )u(\xi )\nonumber \\&\quad +(I -G(\xi )G^+(\xi ))L(\xi )v(\xi ), \end{aligned}$$
(7)

where \(v(\xi )\in \mathbb {R}^p\) is an auxiliary control that handles the unmatched component.

3 Event-based robust tracking control strategy

In this section, the event-based HJB equation is developed for the auxiliary system (7). Moreover, the event-triggering rule is obtained using the Lyapunov approach. The cost function associated with the auxiliary system (7) is defined as

$$\begin{aligned} J(\xi (t))&= \int \nolimits _{t}^{\infty }\mathrm {e}^{-\gamma (\tau -t)}\{U(\xi (\tau ), u(\xi (\tau )), v(\xi (\tau )))\nonumber \\ {}&\quad +\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )+\Vert r^\top \Vert ^2g_M^2(\xi )\}\mathrm {d}\tau , \end{aligned}$$
(8)

where \(\gamma \) and \(\beta \) are positive constants, \(U(\xi , u(\xi ),v(\xi )) =\xi ^{\top }\bar{Q}\xi +u^{\top }(\xi )Ru(\xi )+\beta ^2v^\top (\xi )Mv(\xi )\), and \(\bar{Q}=diag\{Q,0_{n\times n}\}\). Q, M, and R are positive definite matrices with appropriate dimensions. Let r and m be lower triangular matrices with appropriate dimensions. Then, using the Cholesky decomposition, one can write \(R=rr^{\top }\) and \(M=mm^{\top }\).
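To make the structure of (8) concrete, the following minimal Python sketch (an illustration of ours, not part of the original derivation) assembles the running cost for the sizes used later in Example 1, where \(n=2\), \(b=p=1\), \(Q=500I_2\), \(R=M=I_1\), and \(\beta =0.85\). Note that `np.linalg.cholesky` returns a lower triangular factor, matching the roles of r and m.

```python
import numpy as np

# Minimal sketch of the running cost in (8); sizes follow Example 1.
n, b, p = 2, 1, 1
Q = 500 * np.eye(n)
Q_bar = np.block([[Q, np.zeros((n, n))],
                  [np.zeros((n, n)), np.zeros((n, n))]])  # diag{Q, 0}
R, M = np.eye(b), np.eye(p)
beta = 0.85

# Lower triangular Cholesky factors: R = r r^T and M = m m^T.
r = np.linalg.cholesky(R)
m = np.linalg.cholesky(M)

def U(xi, u, v):
    """Utility U(xi, u, v) = xi^T Qbar xi + u^T R u + beta^2 v^T M v."""
    return xi @ Q_bar @ xi + u @ R @ u + beta**2 * (v @ M @ v)
```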

Remark 1

The discount term \(\mathrm {e}^{-\gamma (\tau -t)}\) in the cost function (8) is employed to ensure that (8) is bounded. Otherwise, the control policy pair \( [u^{\top }(e_{t}(t),x_d(t)), v^{\top }(e_t(t),x_d(t))]^{\top }\) may cause (8) to become unbounded, since it depends on the reference trajectory \(x_d(t)\). In many practical systems, we need to consider a reference trajectory that does not converge to zero. In that situation, \(x_d(t)\) would make (8) unbounded [45, 46].

Let \(\Psi (\Omega )\) be the set of admissible controls on \(\Omega \). We assume that the optimal control policy pair is admissible. If the cost function \({J}(\xi )\) is continuously differentiable, then one can write

$$\begin{aligned}&\Vert r^\top \Vert ^2g_M^2(\xi )+\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}+U(\xi , u(\xi ),v(\xi ))\nonumber \\&\qquad -\gamma J(\xi )+\dot{J}(\xi )=0 \end{aligned}$$
(9)

with \(J(0)=0\). Here (9) is called the infinitesimal version of (8). The Hamiltonian for the auxiliary system (7) is given as

$$\begin{aligned}&H(\xi ,u(\xi ),v(\xi ),\nabla J(\xi ))\nonumber \\&\quad =(\nabla J(\xi ))^{\top }(F(\xi )+G(\xi )u(\xi )\nonumber \\&\qquad +(I-G(\xi )G^+(\xi ))L(\xi )v(\xi ))+\Vert r^\top \Vert ^2g_M^2(\xi )\nonumber \\&\qquad +\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}+U(\xi , u(\xi ),v(\xi ))-\gamma J(\xi ). \end{aligned}$$
(10)

The optimal cost function is given by

$$\begin{aligned}&J^{*}(\xi (t))\nonumber \\&=\min \limits _{u,v\in \Psi (\Omega )}\int \nolimits _{t}^{\infty }\mathrm {e}^{-\gamma (\tau -t)}\{U(\xi (\tau ), u(\xi (\tau )), v(\xi (\tau )))\nonumber \\&\quad +\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )+\Vert r^\top \Vert ^2g_M^2(\xi )\}\mathrm {d}\tau . \end{aligned}$$
(11)

By Bellman’s principle of optimality, \(J^{*}(\xi (t))\) satisfies the HJB equation

$$\begin{aligned} \min \limits _{u,v\in \Psi (\Omega )}H(\xi ,u(\xi ),v(\xi ),\nabla J^{*}(\xi ))=0 \end{aligned}$$
(12)

with \(J^{*}(0)=0\). Define \((I-G(\xi )G^+(\xi ))L(\xi )=K(\xi )\). The optimal control policies are obtained as

$$\begin{aligned} u^{*}(\xi )=- \frac{1}{2}R^{-1}G^{\top }(\xi )\nabla J^{*}(\xi ) \end{aligned}$$
(13)

and

$$\begin{aligned} v^{*}(\xi )=- \frac{1}{2\beta ^2}M^{-1}K^{\top }(\xi )\nabla J^{*}(\xi ). \end{aligned}$$
(14)

Substituting (13) and (14) into (12), we present the HJB equation as

$$\begin{aligned}&(\nabla J^{*}(\xi ))^{\top }F(\xi )+\xi ^{\top }\bar{Q}\xi +\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )\nonumber \\&\qquad +\Vert r^\top \Vert ^2g_M^2(\xi )-\gamma J^{*}(\xi )\nonumber \\&\qquad - \frac{1}{4}\nabla (J^{*}(\xi ))^\top G(\xi )R^{-1}G^{\top }(\xi )\nabla J^{*}(\xi )\nonumber \\&\qquad - \frac{1}{4\beta ^2}\nabla (J^{*}(\xi ))^\top K(\xi )M^{-1}K^{\top }(\xi )\nabla J^{*}(\xi )=0. \end{aligned}$$
(15)

3.1 The event-based HJB equation formulation

Here, we present the HJB equation (15) in event-based form. Before proceeding, the event-based strategy is explained.

Let us consider a monotonically increasing sequence \(\left\{ t_k\right\} _{k=0}^{\infty }\), where \(t_k\) denotes the kth triggering instant and \(k \in \mathbb {N}\). Let the system state be sampled at every triggering instant and \(\left\{ {\xi }_k\right\} _{k=0}^{\infty }\) be the sequence of sampled states, where \(\xi _k=\xi (t_k)\) is the state sampled at \(t_k\). The triggering error is defined as the difference between the sampled state \(\xi _k\) and the current state \(\xi (t)\) and is represented as

$$\begin{aligned} e_{k}(t)={\xi }_k-\xi (t), \quad \forall t \in [t_k , t_{k+1}), k \in \mathbb {N}. \end{aligned}$$
(16)

Based on (16), the event-based mechanism can be explained. If the predefined triggering rule is violated, an event is triggered: the state is sampled, the triggering error is reset to zero, i.e., \(e_{k}(t)=0\), and the control law is updated. While the triggering rule is fulfilled, the control law is held constant between two consecutive triggering instants. This principle is similar to the familiar zero-order hold (ZOH) principle, and it can be expressed as

$$\begin{aligned} u(t)=u({\xi }_k)\triangleq \mu (\xi _k),\quad \forall t \in [t_k , t_{k+1}), k \in \mathbb {N}. \end{aligned}$$

From (16), the event-based control policy is obtained as

$$\begin{aligned} u(t)=\mu (\xi (t)+e_{k}(t)), \quad \forall t \in [t_k , t_{k+1}), k \in \mathbb {N}. \end{aligned}$$
(17)

Now, using the control law (17), we obtain the sampled version of auxiliary system (7) as

$$\begin{aligned} \dot{\xi }=F(\xi )+G(\xi )\mu (\xi (t)+e_k(t))+K(\xi )v(\xi ). \end{aligned}$$
(18)

The optimal control (13), under the event-triggered mechanism, can be expressed as

$$\begin{aligned} \mu ^{*}(\xi _k)=- \frac{1}{2}R^{-1}G^{\top }(\xi _k)\nabla J^{*}(\xi _k). \end{aligned}$$
(19)

Now, using (19), we formulate the HJB equation under event-based framework as

$$\begin{aligned} H(\xi ,\mu ^{*}(\xi _k),v^{*}(\xi ),\nabla J^{*}(\xi ))=0, \end{aligned}$$

that is,

$$\begin{aligned}&(\nabla J^{*}(\xi ))^{\top }F(\xi )+\xi ^{\top }\bar{Q}\xi +\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )\nonumber \\&\qquad +\Vert r^\top \Vert ^2g_M^2(\xi )-\gamma J^{*}(\xi )\nonumber \\&\qquad - \frac{1}{2}\nabla (J^{*}(\xi ))^\top G(\xi )R^{-1}G^{\top }(\xi _k)\nabla J^{*}(\xi _k)\nonumber \\&\qquad + \frac{1}{4}\nabla (J^{*}(\xi _k))^\top G(\xi _k)R^{-1}G^{\top }(\xi _k)\nabla J^{*}(\xi _k)\nonumber \\&\qquad - \frac{1}{4\beta ^2}\nabla (J^{*}(\xi ))^\top K(\xi )M^{-1}K^{\top }(\xi )\nabla J^{*}(\xi )=0, \end{aligned}$$
(20)

where \(J^{*}(0)=0\).

3.2 Event-triggering condition

In this subsection, we obtain the event-triggering condition using the Lyapunov approach. Before continuing, the following assumption is made, which will be required to derive the triggering rule. It is satisfied in many applications when the controller is affine with respect to the event-triggering error signal [27, 47].

Assumption 1

Let \(\mathcal {L}\) be a positive constant. We assume that the optimal control policy \(u^{*}(\xi )\) is Lipschitz continuous on \(\Omega \) such that

$$\begin{aligned} \Vert u^*(\xi (t))-u^*({\xi }_{k})\Vert&=\Vert u^*(\xi (t))-u^*(\xi (t)+e_{k}(t))\Vert \\&\le \mathcal {L}\Vert e_{k}(t)\Vert . \end{aligned}$$

Theorem 1

Let Assumption 1 be true, let \(J^{*}(\xi )\) satisfy the HJB equation (12), and let the control policies be described by (14) and (19). If the event-triggering law is formulated as

$$\begin{aligned} \Vert e_{k}(t)\Vert ^2&\le \frac{(1-\eta _1^2)\lambda _{m}(Q)\Vert e_{t}\Vert ^2-2\beta ^2\Vert m^\top v^*(\xi )\Vert ^2}{2\Vert r^\top \Vert ^2\mathcal {L}^2}\nonumber \\&\triangleq \Vert e_T\Vert ^2, \end{aligned}$$
(21)

then for \(\eta _1\in (0,1)\) and \(\gamma =0\), the closed-loop augmented system (4) is asymptotically stable under \(\mu ^*(\xi _k)\) and for \(\gamma \ne 0\) the tracking error \(e_{t}\) is uniformly ultimately bounded.

Proof

Consider \(J^{*}(\xi )\) as the Lyapunov function candidate. Differentiating \(J^{*}(\xi )\) along the trajectory of \(\dot{\xi }(t)=F(\xi (t))+G(\xi (t))\mu ^{*}({\xi }_k)+\Delta F(\xi (t))\), one can write

$$\begin{aligned} \dot{J}^{*}(\xi )&=(\nabla J^{*}(\xi ))^{\top } (F(\xi )+G(\xi )\mu ^{*}({\xi } _k)+\Delta F(\xi (t)))\nonumber \\&=(\nabla J^{*}(\xi ))^{\top }F(\xi ) +(\nabla J^{*}(\xi ))^{\top }G(\xi )\mu ^{*}({\xi }_k)\nonumber \\&\quad +(\nabla J^{*}(\xi ))^{\top }(G(\xi )G^+(\xi )L(\xi ) +K(\xi ))d(\xi ). \end{aligned}$$
(22)

From (12), we obtain

$$\begin{aligned}&(\nabla J^{*}(\xi ))^{\top }F(\xi )\nonumber \\&=-\xi ^{\top }\bar{Q}\xi -\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )-\Vert r^\top \Vert ^2g_M^2(\xi )\nonumber \\&\quad +\gamma J^{*}(\xi )+\frac{1}{4}\nabla (J^{*}(\xi ))^\top G(\xi )R^{-1}G^{\top }(\xi )\nabla J^{*}(\xi )\nonumber \\&\quad +\frac{1}{4\beta ^2}\nabla (J^{*}(\xi ))^\top K(\xi )M^{-1}K^{\top }(\xi )\nabla J^{*}(\xi ), \end{aligned}$$
(23)

from (13), we can write

$$\begin{aligned} G^{\top }(\xi )\nabla J^{*}(\xi )=-2Ru^{*}(\xi ) \end{aligned}$$
(24)

and from (14), we obtain

$$\begin{aligned} K^{\top }(\xi )\nabla J^{*}(\xi )=-2\beta ^2Mv^{*}(\xi ). \end{aligned}$$
(25)

Using (23), (24), and (25), we derive

$$\begin{aligned} \dot{J}^{*}(\xi )&= -\xi ^{\top }\bar{Q}\xi -\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )-\Vert r^\top \Vert ^2g_M^2(\xi )\nonumber \\&\quad +\gamma J^{*}(\xi )+u^{{*}{\top }}(\xi )Ru^{*}(\xi )\nonumber \\&\quad +\beta ^2 v^{{*}{\top }}(\xi )Mv^{*}(\xi )-2u^{{*}{\top }}(\xi )R\mu ^{*}(\xi _k)\nonumber \\&\quad -2u^{{*}{\top }}(\xi )R G^+(\xi )L(\xi )d(\xi )\nonumber \\&\quad -2\beta ^2 v^{{*}{\top }}(\xi )Md(\xi ). \end{aligned}$$
(26)

Now,

$$\begin{aligned}&u^{{*}{\top }}(\xi )Ru^{*}(\xi )-2u^{{*}\top }(\xi )R\mu ^{*}({\xi }_k)\nonumber \\&\qquad -2u^{{*}\top }(\xi )RG^+(\xi )L(\xi )d(\xi )\nonumber \\&\quad =\Vert r^{\top }(u^{*}(\xi )-u^{*}({\xi }_k)-G^+(\xi )L(\xi )d(\xi ))\Vert ^2\nonumber \\&\qquad - \Vert r^{\top }(u^{*}(\xi _k)+G^+(\xi )L(\xi )d(\xi ))\Vert ^2\nonumber \\&\quad \le 2 \Vert r^{\top }\Vert ^2\Vert u^{*}(\xi )-u^{*}({\xi }_k)\Vert ^2+2\Vert r^\top G^+(\xi )L(\xi )d(\xi )\Vert ^2\nonumber \\&\qquad - \Vert r^{\top }(u^{*}(\xi _k)+G^+(\xi )L(\xi )d(\xi ))\Vert ^2\nonumber \\&\quad \le 2 \Vert r^{\top }\Vert ^2\mathcal {L}^2\Vert e_k\Vert ^2+2\Vert r^\top G^+(\xi )L(\xi )d(\xi )\Vert ^2 \nonumber \\&\qquad - \Vert r^{\top }(u^{*}(\xi _k)+G^+(\xi )L(\xi )d(\xi ))\Vert ^2 \end{aligned}$$
(27)

and

$$\begin{aligned} -2\beta ^2v^{{*}\top }(\xi )Md(\xi )\le \beta ^2(\Vert m^\top v^{*}(\xi )\Vert ^2\nonumber \\ +\Vert m^\top d(\xi )\Vert ^2). \end{aligned}$$
(28)

Since \(\bar{Q}=diag\{Q,0_{n\times n}\}\), one can write \(\xi ^{\top }\bar{Q}\xi =e_{t}^\top Q e_{t}\). Now, using (27), (28), and Assumption 1, we derive

$$\begin{aligned} \dot{J}^{*}(\xi )&\le -\lambda _m(Q)\Vert e_t\Vert ^2-\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )\nonumber \\&\quad -\Vert r^\top \Vert ^2g_M^2(\xi ) +2\beta ^2 v^{{*}{\top }}(\xi )Mv^{*}(\xi ) +\gamma J^{*}(\xi )\nonumber \\&\quad +\beta ^2\Vert m^\top d(\xi )\Vert ^2 +2\Vert r^{\top }\Vert ^2\mathcal {L}^2\Vert e_k\Vert ^2\nonumber \\&\quad +2\Vert r^\top G^+(\xi )L(\xi )d(\xi )\Vert ^2\nonumber \\&\quad - \Vert r^{\top }(u^{*}(\xi _k)+G^+(\xi )L(\xi )d(\xi ))\Vert ^2\nonumber \\&\le -\lambda _m(Q)\Vert e_t\Vert ^2-\Vert r^\top \Vert ^2(g_M^2(\xi )\nonumber \\&\quad -2\Vert G^+(\xi )L(\xi )d(\xi )\Vert ^2)+2\Vert r^{\top }\Vert ^2\mathcal {L}^2\Vert e_k\Vert ^2\nonumber \\&\quad -\beta ^2\Vert m^\top \Vert ^2(\lambda _{d}^{2}(\xi )-\Vert d(\xi )\Vert ^2)\nonumber \\ {}&\quad +2\beta ^2 v^{{*}{\top }}(\xi )Mv^{*}(\xi )+\gamma J^{*}(\xi )\nonumber \\&\quad - \Vert r^{\top }(u^{*}(\xi _k)+G^+(\xi )L(\xi )d(\xi ))\Vert ^2\nonumber \\&\le -\eta _1^2\lambda _m(Q)\Vert e_t\Vert ^2+(\eta _1^2-1)\lambda _m(Q)\Vert e_t\Vert ^2\nonumber \\&\quad +2\Vert r^{\top }\Vert ^2\mathcal {L}^2\Vert e_k\Vert ^2+2\beta ^2 \Vert m^\top v^{*}(\xi )\Vert ^2\nonumber \\&\quad +\gamma J^{*}(\xi ). \end{aligned}$$
(29)

Hence, when the triggering rule stated in Theorem 1 is satisfied and \(\gamma =0\), then using (29) we can write

$$\begin{aligned} \dot{J}^{*}(\xi )\le -\eta _1^2\lambda _{m}(Q)\Vert {e_{t}(t)}\Vert ^2. \end{aligned}$$
(30)

Thus, the system is asymptotically stable for \(\gamma =0\). When \(\gamma \ne 0\), then

$$\begin{aligned} \dot{J}^{*}(\xi )\le \gamma J^{*}(\xi )-\eta _1^2\lambda _{m}(Q)\Vert {e_{t}(t)}\Vert ^2. \end{aligned}$$
(31)

Since \(J^{*}(\xi )\) is positive definite and bounded on \(\Omega \), let \(J^{*}_{max}\) be the maximum value of \(J^{*}(\xi )\). So, from (31), \(\dot{J}^{*}(\xi )\le 0\) only if \(e_{t}\) lies outside the set

$$\begin{aligned} \Omega _{e_{t}}=\Bigg \lbrace e_{t}:\Vert e_{t}\Vert \le \frac{1}{\eta _1}\sqrt{\frac{\gamma J^{*}_{max}}{\lambda _{m}(Q)}}\Bigg \rbrace . \end{aligned}$$
(32)

Thus we conclude that for \(\gamma \ne 0\), the tracking error \(e_t(t)\) is uniformly ultimately bounded and the ultimate bound is \(\frac{1}{\eta _1}\sqrt{\frac{\gamma J^{*}_{max}}{\lambda _{m}(Q)}}\). \(\square \)
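For implementation, the triggering rule (21) reduces to a scalar comparison at every sampling step. The following illustrative sketch (ours, with \(\beta \), r, and m as in the cost sketch above; `L_u` stands for the Lipschitz constant \(\mathcal {L}\)) returns True when the rule is violated, i.e., when an event should fire and the control should be recomputed.

```python
import numpy as np

def event_fired(xi, xi_k, e_t, v_star, eta_1, L_u, lam_m_Q):
    """Check the triggering rule (21); lam_m_Q = lambda_min(Q)."""
    e_k = xi_k - xi                                   # triggering error (16)
    num = ((1.0 - eta_1**2) * lam_m_Q * (e_t @ e_t)
           - 2.0 * beta**2
           * np.linalg.norm(m.T @ np.atleast_1d(v_star))**2)
    den = 2.0 * np.linalg.norm(r.T)**2 * L_u**2
    return e_k @ e_k > num / den                      # True -> event fires
```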

Remark 2

In this work, the control policy \(\mu ^{*}({\xi }_k)\) is formulated under the event-triggered framework, but the auxiliary control policy \(v^{*}({\xi })\) is formulated under the time-triggered framework. There are two reasons behind this. First, the control policy to be applied to the uncertain system is \(\mu ^{*}({\xi }_k)\), not the auxiliary control \(v^{*}({\xi })\). Second, if we also consider the auxiliary control in the event-triggering framework, then it becomes very difficult to obtain the event-triggering rule (21).

Remark 3

The lower bound of the minimum inter-event time \(\Delta t_{\min }\) can be expressed as

$$\begin{aligned} \Delta t_{\min }\ge \frac{1}{ \mathcal {P}} \ln (1+T_{min}) , \end{aligned}$$

where

$$\begin{aligned} T_{min}= \min _{k\in \mathbb {N}}\Bigg \lbrace \frac{||{e_k(t_{k+1})||}}{||{\xi _k}||+\pi }\Bigg \rbrace >0 \end{aligned}$$

and \(e_k(t_{k+1})=\xi _k-\xi (t_{k+1})\); \(\mathcal {P}\) and \(\pi \) are positive constants satisfying \(\Vert F(\xi )+G(\xi )u(\xi )+\Delta F(\xi )\Vert \le \mathcal {P}\Vert \xi \Vert +\pi \). Note that the positive constants \(\mathcal {P}\) and \(\pi \) exist because \(F(\xi )+G(\xi )u\) is Lipschitz continuous and the terms \(d(\xi )\) and \({G^+(\xi )\Delta F(\xi )} \) are upper bounded. The theoretical proof is similar to [32] and is excluded here to avoid repetition. In the simulation results, we show that the intersample time indeed has a lower limit larger than zero. As a result, the infamous Zeno behavior is avoided.
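As a numeric illustration of this bound (with purely hypothetical values of \(\mathcal {P}\) and \(T_{min}\), not taken from the paper):

```python
import math

P, T_min = 12.0, 0.05          # hypothetical values, for illustration only
dt_min_lower = math.log(1.0 + T_min) / P
print(dt_min_lower)            # ~0.0041 s > 0, so Zeno behavior is excluded
```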

4 ACDs for solving event-based HJB equation

In this section, a single critic network is employed to approximate the optimal cost function under the ADP framework. Utilizing the neural network’s universal approximation property with \(l\) hidden layer neurons, the optimal cost function can be reconstructed on \(\Omega \) as

$$\begin{aligned} J^{*}(\xi )=\omega _c^\top \sigma _c(\xi )+\epsilon _c(\xi ), \end{aligned}$$
(33)

where \(\omega _c\in \mathbb {R}^l\) is the ideal weight vector of the critic network, \(\sigma _c(\xi )\in \mathbb {R}^l\) is the activation function, and \(\epsilon _c(\xi )\) is the reconstruction error. Next, we obtain the gradient of (33) as

$$\begin{aligned} \nabla J^{*}(\xi )=(\nabla \sigma _c(\xi ))^\top \omega _c+\nabla \epsilon _c(\xi ). \end{aligned}$$
(34)

Due to the unavailability of the ideal weight vector \(\omega _c\), the approximate weight vector \(\hat{\omega }_c\) is used to form a critic network that estimates the value of the optimal cost function \({J}^{*}(\xi )\) as follows

$$\begin{aligned} \hat{J}(\xi )=\hat{\omega }_c^\top \sigma _c(\xi ). \end{aligned}$$
(35)

Then the gradient of (35) is

$$\begin{aligned} \nabla \hat{J}(\xi )=(\nabla \sigma _c(\xi ))^\top \hat{\omega }_c. \end{aligned}$$
(36)
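The following illustrative sketch (ours) instantiates the critic (35)–(36) with the quadratic polynomial basis that is used later in Example 1, where \(\xi \in \mathbb {R}^4\) and \(l=10\):

```python
import numpy as np

def sigma_c(xi):
    """Quadratic basis sigma_c(xi), l = 10 terms for xi in R^4."""
    x1, x2, x3, x4 = xi
    return np.array([x1*x1, x2*x2, x3*x3, x4*x4,
                     x1*x2, x1*x3, x1*x4, x2*x3, x2*x4, x3*x4])

def grad_sigma_c(xi):
    """Jacobian d sigma_c / d xi, shape (10, 4)."""
    x1, x2, x3, x4 = xi
    return np.array([[2*x1, 0, 0, 0], [0, 2*x2, 0, 0],
                     [0, 0, 2*x3, 0], [0, 0, 0, 2*x4],
                     [x2, x1, 0, 0], [x3, 0, x1, 0],
                     [x4, 0, 0, x1], [0, x3, x2, 0],
                     [0, x4, 0, x2], [0, 0, x4, x3]])

def J_hat(xi, w_hat):
    return w_hat @ sigma_c(xi)               # approximate cost (35)

def grad_J_hat(xi, w_hat):
    return grad_sigma_c(xi).T @ w_hat        # its gradient (36)
```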

Considering (34), we present the auxiliary control law (14) and the event-based control law (19) as

$$\begin{aligned} {v}^{*}({\xi })=- \frac{1}{2\beta ^2}M^{-1}K^{\top }({\xi })((\nabla \sigma _c({\xi }))^\top {\omega }_c +\nabla \epsilon _c(\xi )) \end{aligned}$$
(37)

and

$$\begin{aligned} {\mu }^{*}({\xi _k})=- \frac{1}{2}R^{-1}G^{\top }({\xi _k})((\nabla \sigma _c({\xi _k}))^\top {\omega }_c +\nabla \epsilon _c(\xi _k)), \end{aligned}$$
(38)

respectively. Then, by using (36), the approximate values of \({v}^{*}({\xi })\) and \({\mu }^{*}({\xi _k})\) can be obtained as

$$\begin{aligned} \hat{v}({\xi })=- \frac{1}{2\beta ^2}M^{-1}K^{\top }({\xi })(\nabla \sigma _c({\xi }))^\top \hat{\omega }_c \end{aligned}$$
(39)

and

$$\begin{aligned} \hat{\mu }({\xi _k})=- \frac{1}{2}R^{-1}G^{\top }({\xi _k})(\nabla \sigma _c({\xi _k}))^\top \hat{\omega }_c , \end{aligned}$$
(40)

respectively. Substituting \(J^{*}(\xi )\) from (33) into (10), we obtain

$$\begin{aligned}&H(\xi ,\omega _c,\mu ^{*}(\xi _k),v^{*}(\xi ))\nonumber \\ {}&\quad =\omega _c^{\top }\nabla \sigma _c(\xi )(F(\xi )+G(\xi )\mu ^{*}(\xi _k)+K(\xi )v^{*}(\xi ))\nonumber \\ {}&\qquad +\Vert r^\top \Vert ^2g_M^2(\xi )\nonumber +\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )\nonumber \\ {}&\qquad +U(\xi , \mu ^{*}(\xi _k),v^{*}(\xi ))-\gamma \omega _{c}^\top \sigma _c(\xi )\nonumber \\ {}&\quad \triangleq e_{cH}, \end{aligned}$$
(41)

where \(e_{cH}=-(\nabla \epsilon _c(\xi ))^\top (F(\xi )+G(\xi )\mu ^{*}({\xi }_k)+K(\xi )v^{*}(\xi ))+\gamma \epsilon _c(\xi )\) is the residual error arising from the reconstruction error associated with the NN approximation. Now the Hamiltonian (10) is approximated as

$$\begin{aligned}&\hat{H}(\xi ,\hat{\omega }_c,\hat{\mu }({\xi }_k),\hat{v}(\xi ))\nonumber \\&\quad =\hat{\omega }_c^{\top }\nabla \sigma _c(\xi )(F(\xi )+G(\xi )\hat{\mu }(\xi _k)+K(\xi )\hat{v}(\xi ))\nonumber \\&\qquad +\Vert r^\top \Vert ^2g_M^2(\xi )+\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )\nonumber \\&\qquad +U(\xi , \hat{\mu }(\xi _k),\hat{v}(\xi ))-\gamma \hat{\omega }_{c}^\top \sigma _c(\xi ). \end{aligned}$$
(42)

From the event-based HJB equation (20), it is evident that the Hamiltonian evaluated with the optimal policies vanishes, i.e., \(H(\xi ,\mu ^{*} (\xi _k),v^{*}(\xi ),\nabla J^{*}(\xi ))=0\). So, the approximation error of the Hamiltonian is given by

$$\begin{aligned} e_c&=\hat{H}(\xi ,\hat{\omega }_c,\hat{\mu }({\xi }_k),\hat{v}(\xi ))-H(\xi ,\omega _c,\mu ^{*}(\xi _k),v^{*}(\xi ))\nonumber \\&=\hat{\omega }_c^{\top }\nabla \sigma _c(\xi )(F(\xi )+G(\xi )\hat{\mu }(\xi _k)+K(\xi )\hat{v}(\xi ))\nonumber \\&\quad +\Vert r^\top \Vert ^2g_M^2(\xi )+\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )\nonumber \\&\quad +U(\xi , \hat{\mu }(\xi _k),\hat{v}(\xi ))-\gamma \hat{\omega }_{c}^\top \sigma _c(\xi )\nonumber \\&=\Vert r^\top \Vert ^2g_M^2(\xi )+\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )\nonumber \\&\quad +U(\xi , \hat{\mu }(\xi _k),\hat{v}(\xi ))+\hat{\omega }_c^{\top }\phi , \end{aligned}$$
(43)

where \(\phi =\nabla \sigma _c(\xi )(F(\xi )+G(\xi )\hat{\mu }(\xi _k)+K(\xi )\hat{v}(\xi ))-\gamma \sigma _c(\xi )\).

Now, to ensure that \(e_c\) given in (43) is sufficiently small, we need to train the critic network to obtain appropriate weights. For that, the objective function \(E=(1/2)e_c^\top e_c\) is minimized using the steepest descent technique. Based on this approach, the tuning rule is given as

$$\begin{aligned} \dot{\hat{\omega }}_{c1}=&\frac{-l_c}{(1+\phi ^\top \phi )^2}\frac{{\partial E}}{{\partial \hat{\omega }_c}}\nonumber \\ =&\frac{-l_c\phi }{(1+\phi ^\top \phi )^2}(\hat{\omega }_c^\top \phi +\Vert r^\top \Vert ^2g_M^2(\xi )\nonumber \\&+\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )+U(\xi , \hat{\mu }(\xi _k),\hat{v}(\xi ))), \end{aligned}$$
(44)

where \(l_c>0\) is a design parameter, also known as the critic network’s learning rate, and the factor \(1/{(1+\phi ^\top \phi )^2}\) is introduced to normalize \(\phi \). However, the tuning rule (44) has the following drawbacks.

  1.

    An initial stabilizing control is needed at the beginning of the critic weight vector learning process while using the tuning rule provided in (44). However, in some practical applications, determining an initial admissible control can be difficult.

  2.

    The term \(\phi /(1+\phi ^{\top }\phi )^2\) in (44) must remain persistently exciting to guarantee the convergence of the critic weights to valid optimal values. To meet the persistence of excitation (PE) condition, a probing noise is usually applied to the control input during the initial period of the critic weight tuning process. However, the probing noise can cause the system to become unstable.

To overcome the above drawbacks, we modify the tuning rule (44) via the Lyapunov approach. Before continuing, the following assumption, which is similar to [20], is presented.

Assumption 2

Let \(V(\xi )\) be a continuously differentiable Lyapunov function candidate for the system (7) under the action of the control policies given by (37) and (38), satisfying

$$\begin{aligned}&\dot{V}{(\xi )}\nonumber \\&\quad =(\nabla V(\xi ))^{\top }(F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi ))\nonumber \\&\quad < 0. \end{aligned}$$
(45)

Moreover, there exists a symmetric positive definite matrix \(\Lambda \in \mathbb {R}^{2n\times 2n}\) defined on \(\Omega \) ensuring

$$\begin{aligned}&(\nabla V(\xi ))^{\top }(F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi ))\nonumber \\&\quad =-(\nabla V(\xi ))^{\top }\Lambda \nabla V(\xi )\nonumber \\&\quad \le -\lambda _{m}(\Lambda )\Vert \nabla V(\xi )\Vert ^2. \end{aligned}$$
(46)

Remark 4

\(F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi )\) is frequently considered to be bounded by a positive constant on a compact set \(\Omega \) [48]. In other words, there exists a constant \(z_1\) such that \(\Vert F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi )\Vert \le z_1\). Here we assume that \(F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi )\) is bounded by a function of \(\xi \), which is less stringent than the constant upper bound assumption. Without loss of generality, we consider that \(\Vert F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi )\Vert \le z_2 \Vert \nabla V(\xi )\Vert \), where \(z_2\) is a positive constant. In this regard, we can write \(\Vert (\nabla V(\xi ))^{\top }(F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi ))\Vert \le z_2 \Vert \nabla V(\xi )\Vert ^2\). Observing (45), one can see that (46) is reasonable. In the simulations, a polynomial in \(\xi \) is chosen as \(V(\xi )\).

When we apply the approximated control policies (39) and (40) to the auxiliary system (7), to avoid instability we need to rule out the possibility that

$$\begin{aligned} (\nabla V(\xi ))^{\top }(F(\xi )+G(\xi )\hat{\mu }({\xi _k})+K(\xi )\hat{v}(\xi ))> 0. \end{aligned}$$
(47)

To avoid (47), the training process is enhanced by introducing an additional term, obtained using the steepest descent method, as given below:

$$\begin{aligned}&\dot{\hat{\omega }}_{c2}\nonumber \\&\ =-l_s\frac{{\partial ((\nabla V(\xi ))^{\top }(F(\xi )+G(\xi )\hat{\mu }({\xi _k})+K(\xi )\hat{v}(\xi ))})}{{\partial \hat{\omega }_c}}\nonumber \\&\ =\frac{l_s}{2}(\nabla \sigma _c({\xi _k})G({\xi _k})R^{-1}G^{\top }(\xi )\nabla V(\xi )\nonumber \\&\qquad +\frac{1}{\beta ^2}\nabla \sigma _c({\xi })K(\xi )M^{-1}K^{\top }(\xi )\nabla V(\xi )), \end{aligned}$$
(48)

where \(l_s>0\) is a design parameter. Now, the modified critic weights tuning rule is obtained by adding the stabilizing term (48) to the traditional tuning rule (44) as

$$\begin{aligned} \dot{\hat{\omega }}_c&=\dot{\hat{\omega }}_{c1}+\dot{\hat{\omega }}_{c2}\nonumber \\&\quad =\frac{-l_c\phi }{(1+\phi ^\top \phi )^2}(\hat{\omega }_c^\top \phi +\Vert r^\top \Vert ^2g_M^2(\xi )\nonumber \\&\qquad +\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )+U(\xi , \hat{\mu }(\xi _k),\hat{v}(\xi )))\nonumber \\&\qquad +\frac{l_s}{2}(\nabla \sigma _c({\xi _k})G({\xi _k})R^{-1}G^{\top }(\xi )\nabla V(\xi )\nonumber \\&\qquad +\frac{1}{\beta ^2}\nabla \sigma _c({\xi })K(\xi )M^{-1}K^{\top }(\xi )\nabla V(\xi )). \end{aligned}$$
(49)
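Putting (43), (44), and (48) together, the right-hand side of (49) can be evaluated as in the following sketch (ours; it reuses `sigma_c`, `grad_sigma_c`, and the cost quantities from the earlier sketches, while `F`, `G`, `K`, `grad_V`, `g_M`, `lam_d`, and `gamma` denote the dynamics, the chosen Lyapunov gradient \(\nabla V\), the bounds, and the discount factor, assumed supplied by the surrounding setup):

```python
import numpy as np

def critic_rhs(w_hat, xi, xi_k, F, G, K, grad_V, l_c=3.0, l_s=0.1):
    """Evaluate the modified critic tuning rule (49)."""
    gs, gs_k = grad_sigma_c(xi), grad_sigma_c(xi_k)
    mu = -0.5 * np.linalg.inv(R) @ G(xi_k).T @ (gs_k.T @ w_hat)         # (40)
    v = -(0.5 / beta**2) * np.linalg.inv(M) @ K(xi).T @ (gs.T @ w_hat)  # (39)
    xi_dot = F(xi) + G(xi) @ mu + K(xi) @ v
    phi = gs @ xi_dot - gamma * sigma_c(xi)

    # Approximate Hamiltonian error e_c of (43)
    U_val = xi @ Q_bar @ xi + mu @ R @ mu + beta**2 * (v @ M @ v)
    e_c = (np.linalg.norm(r.T)**2 * g_M(xi)**2
           + beta**2 * np.linalg.norm(m.T)**2 * lam_d(xi)**2
           + U_val + w_hat @ phi)

    # Normalized steepest-descent term (44)
    w_dot = -l_c * phi * e_c / (1.0 + phi @ phi)**2

    # Stabilizing term (48): removes the need for an initial
    # stabilizing control during learning (Remark 5)
    gV = grad_V(xi)
    w_dot = w_dot + 0.5 * l_s * (
        gs_k @ G(xi_k) @ np.linalg.inv(R) @ G(xi).T @ gV
        + (1.0 / beta**2) * gs @ K(xi) @ np.linalg.inv(M) @ K(xi).T @ gV)
    return w_dot
```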

Remark 5

The new tuning rule (49) eliminates the need for an initial admissible control. Hence, the critic weight vector can be initialized to zero while learning the appropriate critic weights. Moreover, the risk of instability due to the addition of probing noise to fulfill the PE condition is also eliminated.

The critic weights approximation error \(\tilde{\omega }_c\) is defined as the difference between the ideal and the approximate weight vector, i.e., \( \tilde{\omega }_c=\omega _c-\hat{\omega }_c \). From (41) and (43) we obtain

$$\begin{aligned} e_c&=-\tilde{\omega }_c^\top \phi +e_{cH}. \end{aligned}$$
(50)

Then, using (49) and (50), the critic weights approximation error dynamics is presented as

$$\begin{aligned} \dot{\tilde{\omega }}_c&=\frac{-l_c\phi }{(1+\phi ^\top \phi )^2}(\tilde{\omega }_c^\top \phi -e_{cH})\nonumber \\&\qquad -\frac{l_s}{2}(\nabla \sigma _c({\xi _k})G({\xi _k})R^{-1}G^{\top }(\xi )\nabla V(\xi )\nonumber \\&\qquad +\frac{1}{\beta ^2}\nabla \sigma _c({\xi })K(\xi )M^{-1}K^{\top }(\xi )\nabla V(\xi )). \end{aligned}$$
(51)

Under the event-based control law, the closed-loop system functions as an impulsive dynamical system consisting of flow dynamics and jump dynamics. Let us consider an augmented state vector \(\psi =[\xi ^\top ,{\xi }_k^\top ,\tilde{\omega }_c^\top ]^\top \). Then, the flow dynamics of the closed-loop system, which occur for all \(t\in [t_k,t_{k+1})\), can be presented as

$$\begin{aligned} \dot{\psi }(t)=\begin{bmatrix} F(\xi )+G(\xi )\hat{\mu }({\xi }_k)+K(\xi )\hat{v}(\xi )\\ 0\\ \dot{\tilde{\omega }}_c \end{bmatrix},\quad \forall t \in [t_k , t_{k+1}), \end{aligned}$$
(52)

where \(\dot{\tilde{\omega }}_c\) is given by (51),

and the jump dynamics of the closed-loop system, which occur at the triggering instants \(t= t_{k+1}\), can be presented as

$$\begin{aligned} \psi (t^+)=\psi (t)+\begin{bmatrix} 0\\ {\xi }_k-\xi (t)\\ 0 \end{bmatrix},\quad t= t_{k+1}, k \in \mathbb {N}, \end{aligned}$$
(53)

where \(\psi (t^+)=\lim _{\varsigma \rightarrow 0^+}\psi (t+\varsigma )\) and \(\varsigma \in (0,t_{k+1}-t_k)\).

5 Stability analysis

In this section, the stability of the impulsive dynamical representation of the closed-loop system, given by (52) and (53), is studied. Prior to moving forward, some assumptions that are common in the literature are stated below [32].

Assumption 3

The augmented system dynamics \(G(\xi )\) and \(K(\xi )\) satisfy the following conditions, where A, \(G_{M}\), and \(K_M\) are positive constants.

  1.

    The dynamics \(G(\xi )\) satisfies the Lipschitz continuity such that \(\Vert G(\xi )-G({\xi }_k)\Vert \le A\Vert e_{k}(t)\Vert \).

  2.

    The dynamics \(G(\xi )\) and \(K(\xi )\) are upper bounded by \(G_M\) and \(K_M\), respectively.

Assumption 4

The following conditions hold on \(\Omega \), where B, \(\nabla \sigma _{cM}\), \(\nabla \epsilon _{cM}\), \(\omega _{cM}\), and \(e_{cHM}\) are positive constants.

  1.

    The gradient of activation function satisfies the Lipschitz continuity such that \(\Vert \nabla \sigma _c(\xi )-\nabla \sigma _c({\xi }_k)\Vert \le B\Vert e_{k}(t)\Vert \).

  2.

    The gradient of activation function \(\nabla \sigma _c(\xi )\) and the gradient of neural approximation error \(\nabla \epsilon _c(\xi )\) are upper bounded by \( \nabla \sigma _{cM}\) and \(\nabla \epsilon _{cM}\), respectively.

  3.

    The ideal weight vector \(\omega _c\) and the residual error \(e_{cH}\) are upper bounded by \(\omega _{cM}\) and \(e_{cHM}\), respectively.

Theorem 2

Let Assumptions 1 to 4 be true. Then, under the control policies (39) and (40), the closed-loop auxiliary system (7) is asymptotically stable and the critic weights approximation error is uniformly ultimately bounded if the inequalities (54) and (55) hold, where \(\eta _2\in (0,1)\) is a design parameter, and the values of \(\varGamma _1\) and \(\varGamma _2\) are given by (63) and (69), respectively.

$$\begin{aligned}&\Vert e_k(t)\Vert ^2\nonumber \\&\quad \le \frac{(1-\eta ^2_2)\lambda _{m}(Q)\Vert e_{t}(t)\Vert ^2+\Vert r^\top \hat{\mu }(\xi _k)\Vert ^2-\beta ^2\Vert m^{\top }\Vert ^2\Vert \hat{{v}}(\xi )\Vert ^2}{\Vert R^{-1}\Vert ^2\Vert r^{\top }\Vert ^2(A^2\nabla \sigma ^2_{cM}+B^2G^2_M)\Vert \hat{\omega }_c\Vert ^2}\nonumber \\&\qquad \triangleq \Vert \hat{e}_T\Vert ^2 \end{aligned}$$
(54)
$$\begin{aligned}&\Vert \tilde{\omega }_c\Vert >\sqrt{\frac{2\Vert R\Vert ^2(1+\phi ^\top \phi )(\varGamma _1+\varGamma _2+\gamma J^*_{max})}{2(1+\phi ^\top \phi )(l_c\Vert R\Vert ^2\lambda _{\varphi m}-G^2_{M}\nabla \sigma _{cM}^2)-l_c\Vert R\Vert ^2\lambda _{\varphi M}}} \end{aligned}$$
(55)

Proof

In light of the flow dynamics (52) and the jump dynamics (53), we consider the Lyapunov function candidate

$$\begin{aligned} \Upsilon (t)=\Upsilon _1(t)+\Upsilon _2(t)+\Upsilon _3(t)+\Upsilon _4(t), \end{aligned}$$
(56)

where \(\Upsilon _1(t)=J^*(\xi )\), \(\Upsilon _2(t)=J^*({\xi }_k)\), \(\Upsilon _3(t)=\frac{1}{2}\tilde{\omega }_c^\top \tilde{\omega }_c\) and \(\Upsilon _4(t)=l_sV(\xi )\). Now, the analysis is separated into two cases.

Case 1. Events are not triggered, i.e., \(t\in [t_k,t_{k+1})\). Differentiating (56), one can write

$$\begin{aligned} \dot{\Upsilon }(t)=\dot{\Upsilon }_1(t)+\dot{\Upsilon }_2(t)+\dot{\Upsilon }_3(t)+\dot{\Upsilon }_4(t). \end{aligned}$$
(57)

It is evident that \(\dot{\Upsilon }_2(t)=0\) for \(t\in [t_k,t_{k+1})\). Now, differentiating \(\Upsilon _1(t)\) along the trajectory of \(\dot{\xi }(t)=F(\xi )+G(\xi )\hat{\mu }(\xi _k)+K(\xi )\hat{v}(\xi )\), we obtain

$$\begin{aligned} \dot{\Upsilon }_1(t)&=(\nabla J^{*}(\xi ))^{\top }(F(\xi )+G(\xi )\hat{\mu }(\xi _k)\\&\quad +K(\xi )\hat{v}(\xi )). \end{aligned}$$

Now, using (23), (24), and (25), we derive

$$\begin{aligned} \dot{\Upsilon }_1(t)&= -\xi ^{\top }\bar{Q}\xi -\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )-\Vert r^\top \Vert ^2g_M^2(\xi )\nonumber \\&\quad +\gamma J^{*}(\xi )+u^{{*}{\top }}(\xi )Ru^{*}(\xi )\nonumber \\&\quad -2u^{{*}{\top }}(\xi )R\hat{\mu }(\xi _k)+\beta ^2 v^{{*}{\top }}(\xi )Mv^{*}(\xi )\nonumber \\&\quad -2\beta ^2 v^{{*}{\top }}(\xi )M\hat{v}(\xi ). \end{aligned}$$
(58)

We can write

$$\begin{aligned}&u^{{*}{\top }}(\xi )Ru^{*}(\xi )-2u^{{*}{\top }}(\xi )R\hat{\mu }(\xi _k)\nonumber \\&\quad \le \Vert r^\top (u^*(\xi )-\hat{\mu }(\xi _k))\Vert ^2-\Vert r^\top \hat{\mu }(\xi _k)\Vert ^2\nonumber \\&\quad \le \Vert r^\top \Vert ^2\Vert \frac{1}{2}R^{-1}G^{\top }({\xi })(\nabla \sigma _c({\xi }))^\top \hat{\omega }_c \nonumber \\&\qquad - \frac{1}{2}R^{-1}G^{\top }({\xi _k})(\nabla \sigma _c({\xi _k}))^\top \hat{\omega }_c \nonumber \\&\qquad +\frac{1}{2}R^{-1}G^{\top }({\xi })((\nabla \sigma _c({\xi }))^\top \tilde{\omega }_c+\nabla \epsilon _c(\xi ))\Vert ^2 \nonumber \\&\qquad -\Vert r^\top \hat{\mu }(\xi _k)\Vert ^2\nonumber \\&\quad \le \frac{\Vert r^\top \Vert ^2}{2}\Vert R^{-1}(G^{\top }({\xi })(\nabla \sigma _c({\xi }))^\top \nonumber \\&\qquad -G^{\top }({\xi _k})(\nabla \sigma _c({\xi _k}))^\top )\hat{\omega }_c \Vert ^2-\Vert r^\top \hat{\mu }(\xi _k)\Vert ^2 \nonumber \\&\qquad +\frac{1}{2}\Vert R^{-1}G^{\top }({\xi })((\nabla \sigma _c({\xi }))^\top \tilde{\omega }_c+\nabla \epsilon _c(\xi ))\Vert ^2 \nonumber \\&\quad \le \Vert r^\top \Vert ^2\Vert R^{-1}\Vert ^2(A^2\nabla \sigma ^2_{cM}+B^2G^2_M)\Vert e_k\Vert ^2\Vert \hat{\omega }_c\Vert ^2 \nonumber \\&\qquad +\frac{1}{\Vert R\Vert ^2}G^2_M\nabla \sigma ^2_{cM}\Vert \tilde{\omega }_c\Vert ^2+\frac{1}{\Vert R\Vert ^2}G^2_M\nabla \epsilon ^2_{cM}\nonumber \\&\qquad -\Vert r^\top \hat{\mu }(\xi _k)\Vert ^2, \end{aligned}$$
(59)
$$\begin{aligned}&\qquad -2\beta ^2 v^{{*}{\top }}(\xi )M\hat{v}(\xi )\nonumber \\&\quad \le \beta ^2 v^{{*}{\top }}(\xi )M{v}^{*}(\xi )+\beta ^2 \hat{{v}}^{{\top }}(\xi )M\hat{{v}}(\xi ) \end{aligned}$$
(60)

and

$$\begin{aligned}&2\beta ^2 v^{{*}{\top }}(\xi )Mv^{*}(\xi )\nonumber \\&\quad \le \frac{1}{2\beta ^2\Vert M\Vert }\Vert K^{\top }(\xi )((\nabla \sigma _c({\xi }))^\top \omega _c+\nabla \epsilon _c)\Vert ^2\nonumber \\&\quad \le \frac{1}{2\beta ^2\Vert M\Vert }K^2_M\nabla \sigma ^2_{cM}(\omega ^2_{cM}+\nabla \epsilon ^2_{cM}). \end{aligned}$$
(61)

Based on the above three inequalities, (58) can be expressed as

$$\begin{aligned}&\dot{\Upsilon }_1(t)\nonumber \\&\quad \le -\xi ^{\top }\bar{Q}\xi -\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )-\Vert r^\top \Vert ^2g_M^2(\xi )\nonumber \\&\qquad +\gamma J^{*}(\xi )+\beta ^2\Vert m^{\top }\Vert ^2\Vert \hat{{v}}(\xi )\Vert ^2-\Vert r^\top \hat{\mu }(\xi _k)\Vert ^2\nonumber \\&\qquad +\Vert r^\top \Vert ^2\Vert R^{-1}\Vert ^2(A^2\nabla \sigma ^2_{cM}+B^2G^2_M)\Vert e_k\Vert ^2\Vert \hat{\omega }_c\Vert ^2\nonumber \\&\qquad +\frac{1}{\Vert R\Vert ^2}G^2_M\nabla \sigma ^2_{cM}\Vert \tilde{\omega }_c\Vert ^2+\varGamma _1, \end{aligned}$$
(62)

where \(\varGamma _1\) is a positive constant and it is expressed as

$$\begin{aligned} \varGamma _1=\frac{G^2_M}{\Vert R\Vert ^2}\nabla \epsilon ^2_{cM}+\frac{K^2_M\nabla \sigma ^2_{cM}}{2\beta ^2\Vert M\Vert } (\omega ^2_{cM}+\nabla \epsilon ^2_{cM}). \end{aligned}$$
(63)

Note that \(\omega _c^\top \phi =\phi ^\top \omega _c\). Let \(\varphi =\phi /({1+\phi ^\top \phi })\). Now, using (51), the time derivative of \(\Upsilon _3(t)\) is found as

$$\begin{aligned} \dot{\Upsilon }_3(t)&=-l_c\tilde{\omega }_c^\top \varphi \varphi ^\top \tilde{\omega }_c+\frac{l_c}{(1+\phi ^\top \phi )}\tilde{\omega }_c^\top \varphi e_{cH}\nonumber \\&\quad -\frac{l_s}{2}\tilde{\omega }_c^\top \nabla \sigma _c(\xi _k)G(\xi _k)R^{-1}G^{\top }(\xi )\nabla V(\xi )\nonumber \\&\quad -\frac{l_s}{2\beta ^2}\tilde{\omega }_c^\top \nabla \sigma _c(\xi )K(\xi )M^{-1}K^{\top }(\xi )\nabla V(\xi ). \end{aligned}$$
(64)

Let \(\lambda _{M}(\varphi \varphi ^\top )=\lambda _{\varphi M}\) and \(\lambda _{m}(\varphi \varphi ^\top )=\lambda _{\varphi m}\). Then, considering Young's inequality \(2c^\top d \le c^\top c+d^\top d\) and Assumption 4, (64) can be expressed as

$$\begin{aligned}&\dot{\Upsilon }_3(t)\nonumber \\&\quad \le -l_c\lambda _{\varphi m}\Vert \tilde{\omega }_c\Vert ^2\nonumber \\&\qquad +\frac{l_c}{2(1+\phi ^\top \phi )}(\lambda _{\varphi M}\Vert \tilde{\omega }_c\Vert ^2+e^2_{cHM})\nonumber \\&\qquad -\frac{l_s}{2}\tilde{\omega }_c^\top \nabla \sigma _c(\xi _k)G(\xi _k)R^{-1}G^{\top }(\xi )\nabla V(\xi )\nonumber \\&\qquad -\frac{l_s}{2\beta ^2}\tilde{\omega }_c^\top \nabla \sigma _c(\xi )K(\xi )M^{-1}K^{\top }(\xi )\nabla V(\xi ) . \end{aligned}$$
(65)

Now, substituting \(\tilde{\omega }_c=\omega _c-\hat{\omega }_c\) into the last two terms of (65) and considering the control policies (39) and (40), one can write

$$\begin{aligned}&\dot{\Upsilon }_3(t)\nonumber \\&\quad \le -l_c\lambda _{\varphi m}\Vert \tilde{\omega }_c\Vert ^2\nonumber \\ {}&\qquad +\frac{l_c}{2(1+\phi ^\top \phi )} (\lambda _{\varphi M}\Vert \tilde{\omega }_c\Vert ^2+e^2_{cHM})\nonumber \\ {}&\qquad - \frac{l_s}{2}(\nabla V(\xi ))^\top G(\xi )R^{-1}G^{\top }(\xi _k)\nabla \sigma _c(\xi _k)\omega _c\nonumber \\ {}&\qquad -{l_s}(\nabla V(\xi ))^\top G(\xi )\hat{\mu }(\xi _k)\nonumber \\ {}&\qquad -\frac{l_s}{2\beta ^2}(\nabla V(\xi ))^\top K(\xi )M^{-1}K^{\top }(\xi )\nabla \sigma _c(\xi )\omega _c\nonumber \\ {}&\qquad -{l_s}(\nabla V(\xi ))^\top K(\xi )\hat{v}(\xi ). \end{aligned}$$
(66)

The derivative of \({\Upsilon }_4(t)\) is

$$\begin{aligned} \dot{\Upsilon }_4(t)=l_s(\nabla V(\xi ))^{\top }(F(\xi )+G(\xi )\hat{\mu }(\xi _k)+K(\xi )\hat{v}(\xi )). \end{aligned}$$
(67)

Now, combining (66) and (67) and using the control policies (37) and (38), one can write

$$\begin{aligned}&\dot{\Upsilon } _3(t)+\dot{\Upsilon } _4(t)\\&\quad \le - l_c\lambda _{\varphi m}\Vert \tilde{\omega }_c\Vert ^2 +\frac{l_c}{2(1+\phi ^\top \phi )} (\lambda _{\varphi M}\Vert \tilde{\omega }_c\Vert ^2+e^2_{cHM}) \\&\qquad +l_s(\nabla V(\xi ))^{\top }(F(\xi )+G(\xi ){\mu } ^{*}(\xi _k) +K(\xi ){v}^{*}(\xi ))\\&\qquad +\frac{l_s}{2}(\nabla V(\xi ))^\top G(\xi )R^{-1}G^{\top }(\xi _k)\nabla \epsilon _c(\xi _k)\\&\qquad +\frac{l_s}{2\beta ^2} (\nabla V(\xi ))^\top K(\xi )M^{-1} K^{\top }(\xi )\nabla \epsilon _c(\xi ). \end{aligned}$$

Now, utilizing Assumptions 1 to 4, we can write

$$\begin{aligned}&\dot{\Upsilon }_3(t)+\dot{\Upsilon }_4(t)\nonumber \\&\quad \le -l_c\lambda _{\varphi m}\Vert \tilde{\omega }_c\Vert ^2 \nonumber \\&\qquad +\frac{l_c}{2(1+\phi ^{\top }\phi )}(\lambda _{\varphi M}\Vert \tilde{\omega }_c\Vert ^2+e^2_{cHM})\nonumber \\&\qquad -l_s\lambda _{m}(\Lambda )\Vert \nabla V(\xi )\Vert ^2 +{l_s}\kappa \Vert \nabla V(\xi )\Vert \nonumber \\&\quad \le -l_c\lambda _{\varphi m}\Vert \tilde{\omega }_c\Vert ^2 +\frac{l_c}{2(1+\phi ^\top \phi )}\lambda _{\varphi M}\Vert \tilde{\omega }_c\Vert ^2\nonumber \\&\qquad +\frac{l_c }{2(1+\phi ^\top \phi )}e^2_{cHM} +\frac{l_s\kappa ^2}{4\lambda _{m}(\Lambda )}\nonumber \\&\qquad -l_s\lambda _{m}(\Lambda ) \left( \Vert \nabla V(\xi )\Vert -\frac{\kappa }{2\lambda _{m}(\Lambda )}\right) ^2\nonumber \\&\quad \le -l_c\lambda _{\varphi m}\Vert \tilde{\omega }_c\Vert ^2+\frac{l_c \lambda _{\varphi M}}{2(1+\phi ^\top \phi )}\Vert \tilde{\omega }_c\Vert ^2+\varGamma _2, \end{aligned}$$
(68)

where \(\kappa =\frac{1}{2}\nabla \epsilon _{cM}(G_M^2\Vert R^{-1}\Vert +\frac{1}{\beta ^2}K_M^2\Vert M^{-1}\Vert )\) and the positive constant \(\varGamma _2\) is given by

$$\begin{aligned} \varGamma _2=\frac{l_c e^2_{cHM}}{2(1+\phi ^\top \phi )}+\frac{l_s\kappa ^2}{4\lambda _{m}(\Lambda )}. \end{aligned}$$
(69)

Substituting (62) and (68) into (57), we obtain

$$\begin{aligned}&\dot{\Upsilon }(t)\nonumber \\&\quad \le -\xi ^{\top }\bar{Q}\xi -\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )-\Vert r^\top \Vert ^2g_M^2(\xi )\nonumber \\&\qquad +\gamma J^{*}(\xi )+\beta ^2\Vert m^{\top }\Vert ^2\Vert \hat{{v}}(\xi )\Vert ^2\nonumber \\&\qquad +\Vert r^\top \Vert ^2\Vert R^{-1}\Vert ^2(A^2\nabla \sigma ^2_{cM}+B^2G^2_M)\Vert e_k\Vert ^2\Vert \hat{\omega }_c\Vert ^2\nonumber \\&\qquad -\Vert r^\top \hat{\mu }(\xi _k)\Vert ^2+\frac{1}{\Vert R\Vert ^2}G^2_M\nabla \sigma ^2_{cM}\Vert \tilde{\omega }_c\Vert ^2\nonumber \\&\qquad -l_c\lambda _{\varphi m}\Vert \tilde{\omega }_c\Vert ^2+\frac{l_c \lambda _{\varphi M}}{2(1+\phi ^\top \phi )}\Vert \tilde{\omega }_c\Vert ^2+\varGamma _1+\varGamma _2. \end{aligned}$$
(70)
Fig. 1 Flowchart of the proposed method

Since \(\bar{Q}=diag\{Q,0_{n\times n}\}\), one can write \(\xi ^\top (t)\bar{Q} \xi (t)=e_{t}^\top (t)Qe_{t}(t)\). Now, introducing the design parameter \(\eta _2\), (70) can be presented as

$$\begin{aligned}&\dot{\Upsilon }(t)\nonumber \\&\quad \le -\eta ^2_2 \lambda _{m}(Q)\Vert e_{t}(t)\Vert ^2-(1-\eta ^2_2)\lambda _{m}(Q)\Vert e_{t}(t)\Vert ^2\nonumber \\&\qquad -\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )-\Vert r^\top \Vert ^2g_M^2(\xi )\nonumber \\&\qquad +\beta ^2\Vert m^{\top }\Vert ^2\Vert \hat{{v}}(\xi )\Vert ^2-\Vert r^\top \hat{\mu }(\xi _k)\Vert ^2\nonumber \\&\qquad +\Vert r^\top \Vert ^2\Vert R^{-1}\Vert ^2(A^2\nabla \sigma ^2_{cM}+B^2G^2_M)\Vert e_k\Vert ^2\Vert \hat{\omega }_c\Vert ^2\nonumber \\&\qquad +\frac{G^2_M}{\Vert R\Vert ^2}\nabla \sigma ^2_{cM}\Vert \tilde{\omega }_c\Vert ^2 -l_c\lambda _{\varphi m}\Vert \tilde{\omega }_c\Vert ^2\nonumber \\&\qquad +\frac{l_c \lambda _{\varphi M}}{2(1+\phi ^\top \phi )}\Vert \tilde{\omega }_c\Vert ^2+\gamma J^{*}_{max}+\varGamma _1+\varGamma _2. \end{aligned}$$
(71)

If the inequalities (54) and (55) mentioned in Theorem 2 hold, then (71) implies

$$\begin{aligned} \dot{\Upsilon }(t)&\le -\eta ^2_2 \lambda _{m}(Q)\Vert e_{t}(t)\Vert ^2-\beta ^2\Vert {m^\top }\Vert ^2 \lambda _{d}^{2}(\xi )\nonumber \\&\quad -\Vert r^\top \Vert ^2g_M^2(\xi )\\ {}&<0, \end{aligned}$$

i.e., the proposed Lyapunov function candidate has a negative time derivative for all \(t\in [t_k,t_{k+1})\).

Case 2. Events are triggered, i.e., \(t= t_{k+1}\). We derive the difference of the Lyapunov function candidate as

$$\begin{aligned} \Delta \Upsilon (t_k)&=J^{*}(\xi (t_k^+))-J^{*}(\xi (t_k))\nonumber \\&\quad +\frac{1}{2}\tilde{\omega }^\top _c(t^+_k)\tilde{\omega }_c(t^+_k) -\frac{1}{2}\tilde{\omega }^\top _c(t_k)\tilde{\omega }_c(t_k)\nonumber \\&\quad +J^{*}({\xi }_{k+1})-J^{*}({\xi }_{k})+l_s(V(t^+_k)-V(t_k)), \end{aligned}$$
(72)

where \(\xi (t^+_k)=\lim _{\varsigma \rightarrow 0^+}\xi (t_k+\varsigma )\) and \(\varsigma \in (0,t_{k+1}-t_k)\). In Case 1 we derived that \(\dot{\Upsilon }(t)<0\) for all \(t\in [t_k,t_{k+1})\), so

$$\begin{aligned} \Upsilon (t_k)&\ge \lim _{\varsigma \rightarrow 0^+}\Upsilon (t_{k}+\varsigma ) \quad \forall \varsigma \in (0,t_{k+1}-t_k), k \in \mathbb {N}\nonumber \\&\triangleq \Upsilon (t_k^+). \end{aligned}$$
(73)

Thus, one can write

$$\begin{aligned}&J^{*}(\xi (t_k^+))+\frac{1}{2}\tilde{\omega }^\top _c(t^+_k)\tilde{\omega }_c(t^+_k)+l_s V(t^+_k)\nonumber \\&\qquad -J^{*}(\xi _k)-\frac{1}{2}\tilde{\omega }^\top _c(t_k)\tilde{\omega }_c(t_k)-l_s V(t_k)\le 0. \end{aligned}$$
(74)

Hence, we can further express that

$$\begin{aligned} (J^{*}({\xi }_{k+1})-J^{*}({\xi }_{k}))\le -\vartheta \Vert e_{k+1}(t_k)\Vert , \end{aligned}$$
(75)

where \(\vartheta \) is a class \(\mathcal {K}\) function and \(e_{k+1}(t_k)={\xi }_{k+1}-{\xi }_{k}\). The inequalities (74) and (75) imply that the proposed Lyapunov function candidate is monotonically decreasing at the triggering instants \(t= t_{k+1}\).

Thus from the two cases presented above, we conclude that the closed-loop system is asymptotically stable and the critic weights approximation error is uniformly ultimately bounded. \(\square \)

A flowchart is given in Fig. 1 to explain the fundamental methodology of the proposed work, which comprises the learning and implementation phases. In the learning phase, the converged critic weights are obtained after sufficient iterations while using the event-triggering rule (54). Then the converged weights are passed to the implementation phase to obtain the approximate values of the optimal control policies \( \mu ^{*}(\xi _k)\) and \(v^{*}(\xi )\) as \( \hat{\mu }(\xi _k)\) and \(\hat{v}(\xi )\), respectively. The approximated event-based control policy \( \hat{\mu }(\xi _k)\) is applied to the uncertain nonlinear system, using the event-triggering rule (21), to track the desired trajectory.

Remark 6

The values of the sampling-frequency parameters \(\eta _1\) and \(\eta _2\) are chosen such that the terms \(\Vert e_T\Vert ^2\) and \(\Vert \hat{e}_T\Vert ^2\), respectively, remain positive. Increasing \(\eta _1\) and \(\eta _2\) increases the sampling frequency and the number of event-triggering instants; it also improves the tracking performance. Hence, the sampling frequency has to be selected as a trade-off between the number of triggering instants and the tracking performance. Similar to the relevant literature [38], the other parameters are chosen heuristically such that the convergence time of the critic weights and the number of triggering instants are minimized while maintaining acceptable tracking performance.

6 Simulation illustration

In this section, two simulation examples are presented to exhibit the efficacy of the proposed event-based robust trajectory tracking scheme. In Example 1, we consider a linear system with unmatched uncertainty, and in Example 2, a spring-mass-damper system with a nonlinear spring constant and unmatched uncertainty is considered.

6.1 Example 1

Consider the following linear unmatched uncertain system [21]

$$\begin{aligned} \dot{x}=\begin{bmatrix} x_2\\ -100x_1-2x_2 \end{bmatrix}+\begin{bmatrix} 0\\ 1 \end{bmatrix}u+\Delta f(x), \end{aligned}$$
(76)

where \(x=[x_1,x_2]^\top \in \mathbb {R}^2\) is the state vector, \(u\in \mathbb {R}\) denotes the control input, \( \Delta f(x)=l(x)d(x)\), and \(l(x)=[1,0]^\top \). The perturbation is \(d(x)=0.5 \theta _1 x_1 \sin (x_2+\theta _2)\), where the parameters \(\theta _1\) and \(\theta _2\) are unknown. We consider \(\theta _1\in [-1,1]\) and \(\theta _2\in [-5,5]\), and the upper bound of the perturbation d(x) is \(\lambda _d(x)=\vert x_1\vert \). Let \(x_0=[0.6,-0.5]^\top \) be the initial state. The desired trajectory \(x_d(t)\) is generated from

$$\begin{aligned} \dot{x}_d(t)= \begin{bmatrix} x_{d2}\\ -100x_{d1} \end{bmatrix}, \end{aligned}$$
(77)

where \(x_d=[x_{d1},x_{d2}]^{\top }\in \mathbb {R}^2\) with the initial condition \(x_{d0}=[0.3,-0.3]^\top \). The tracking error is defined as \(e_t=x-x_d\), where \(e_t=[e_{t1},e_{t2}]^{\top }\in \mathbb {R}^2\), with initial condition \(e_{t0}=x_0-x_{d0}\). Then an augmented state vector \(\xi =[\xi _1,\xi _2,\xi _3,\xi _4]^{\top }\in \mathbb {R}^4\) is defined and the following augmented system is formed

$$\begin{aligned} \dot{\xi }&= \begin{bmatrix} \xi _2\\ -100\xi _1-2(\xi _2+\xi _4)\\ \xi _4\\ -100\xi _3 \end{bmatrix} +\begin{bmatrix} 0\\ 1\\ 0\\ 0 \end{bmatrix}u(\xi )\nonumber \\ {}&\quad +L(\xi )d(\xi ), \end{aligned}$$
(78)

where \(L(\xi )=[1,0,0,0]^\top \) and \(d(\xi )=0.5\theta _1 (\xi _1+\xi _3) \sin ((\xi _2+\xi _4)+\theta _2)\). The initial condition is \(\xi _0=[e_{t0}^{\top },x_{d0}^{\top }]^{\top }=[0.3,-0.2,0.3,-0.3]^\top \). The upper bound is derived as \(\lambda _d(\xi )=\vert \xi _1+\xi _3\vert \). We have obtained \(G^+(\xi )=[0,1,0,0]\), so \((I-G(\xi )G^+(\xi ))L(\xi )=[1, 0, 0, 0]^{\top }\). As in (7), the auxiliary system is formulated as

$$\begin{aligned} \dot{\xi }&= \begin{bmatrix} \xi _2\\ -100\xi _1-2(\xi _2+\xi _4)\\ \xi _4\\ -100\xi _3 \end{bmatrix}\nonumber \\ {}&\quad +\begin{bmatrix} 0\\ 1\\ 0\\ 0 \end{bmatrix}{u}(\xi ) +\begin{bmatrix} 1\\ 0\\ 0\\ 0 \end{bmatrix}{v}(\xi ). \end{aligned}$$
(79)

Since \(\Vert G^+(\xi )L(\xi )d(\xi )\Vert =0\), we take \(g_M(\xi )=0\). Let \(R=I_1\), \(M=I_1\), and \(Q=500I_2\). For the simulation, we consider \(\gamma =0.5\) and \(\beta =0.85\).
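The projection step used to pass from (78) to (79) can be verified numerically; the following sketch (ours) reproduces \(G^+(\xi )\) and \((I-G(\xi )G^+(\xi ))L(\xi )\):

```python
import numpy as np

G = np.array([[0.0], [1.0], [0.0], [0.0]])   # input channel in (78)
L = np.array([[1.0], [0.0], [0.0], [0.0]])   # uncertainty channel
G_plus = np.linalg.pinv(G)                   # [[0., 1., 0., 0.]]
K = (np.eye(4) - G @ G_plus) @ L             # unmatched component channel
print(K.ravel())                             # [1. 0. 0. 0.], as in (79)
print(G_plus @ L)                            # [[0.]]  ->  g_M(xi) = 0
```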

Our aim is to develop an event-based robust controller for the system (76) to track the reference trajectory generated by (77). As described in the theoretical analysis, to achieve this design objective, the augmented system (78) is formed, and the original control problem is transformed into designing an event-based optimal controller for the auxiliary system (79). Based on (8), the cost function for (79) can be presented as

$$\begin{aligned} J(\xi (t))&= \int \nolimits _{t}^{\infty }\mathrm {e}^{-0.5(\tau -t)}\{500\Vert e_t\Vert ^2+\Vert u(\xi )\Vert ^2\nonumber \\&\quad +0.72\Vert v(\xi )\Vert ^2+0.72\vert \xi _1+\xi _3\vert ^2\}\mathrm {d}\tau . \end{aligned}$$
(80)
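For later reuse, the utility inside the cost (80) can be written directly in code; this sketch builds on the snippet above and uses \(0.72\approx \beta ^2=0.85^2\) as in the text:

```python
def running_cost(xi, u, v):
    """Utility inside (80); the discount exp(-0.5 (tau - t)) multiplies this
    term along the trajectory. The tracking error e_t is (xi1, xi2)."""
    e_t = xi[:2]
    return (500.0 * (e_t @ e_t) + u**2
            + 0.72 * v**2 + 0.72 * (xi[0] + xi[2])**2)
```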

The critic network (35) is employed to find an approximate solution of the event-based optimal control problem. We have considered \(l=10\) hidden layer neurons, and the weight vector of the critic network is represented as \( \hat{\omega }_c=[\hat{\omega }_{c1},\ldots ,\hat{\omega }_{c10}]^\top \). The activation function for the critic network is selected as \(\sigma _c(\xi )=[\xi _1^2,\xi _2^2,\xi _3^2,\xi _4^2,\xi _1\xi _2,\xi _1\xi _3,\xi _1\xi _4,\xi _2\xi _3,\xi _2\xi _4, \xi _3\xi _4]^\top \). The weights are trained using the tuning rule (49), and the triggering condition (54) is used during the training process. The parameters used during the tuning process are \(l_c=3\), \(l_s=0.1\), \((A^2\nabla \sigma ^2_{cM}+B^2G^2_M)=8\), and \(\eta _2=0.7\).
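The quadratic basis and the resulting approximate controller can be sketched as follows. The policy expression assumes the standard ADP form \(u=-\frac{1}{2}R^{-1}G^{\top }(\nabla \sigma _c)^{\top }\hat{\omega }_c\), which we take as corresponding to (39); the exact expression appears earlier in the paper:

```python
def sigma_c(xi):
    """Quadratic critic basis of Example 1 (10 terms, in the order listed above)."""
    x1, x2, x3, x4 = xi
    return np.array([x1*x1, x2*x2, x3*x3, x4*x4,
                     x1*x2, x1*x3, x1*x4, x2*x3, x2*x4, x3*x4])

def grad_sigma_c(xi):
    """Analytic Jacobian d(sigma_c)/d(xi), shape (10, 4)."""
    x1, x2, x3, x4 = xi
    return np.array([[2*x1, 0, 0, 0], [0, 2*x2, 0, 0],
                     [0, 0, 2*x3, 0], [0, 0, 0, 2*x4],
                     [x2, x1, 0, 0], [x3, 0, x1, 0], [x4, 0, 0, x1],
                     [0, x3, x2, 0], [0, x4, 0, x2], [0, 0, x4, x3]])

def policy(xi, w_c):
    """Approximate control u(xi) with R = I_1 and G = [0, 1, 0, 0]^T."""
    G = np.array([0.0, 1.0, 0.0, 0.0])
    return -0.5 * G @ grad_sigma_c(xi).T @ w_c
```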

To satisfy the PE condition, a small exponentially decreasing probing noise is applied to the control input for the initial 10 seconds of the training process. All elements of the weight vector are initialized to zero. As shown in Fig. 2, the critic weight vector converges to \(\hat{\omega }_c=[0.46, 3.64, 0.01, -0.69, -0.85, -0.32, -0.02, 0.71, -3.89, -1.60]^{\top }\). During the training process, the event-based controller updates 5714 times, whereas, under the same design criteria, the time-based controller updates 18000 times.
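For orientation, the structure of one critic update can be illustrated as below. This is a generic normalized gradient step on the discounted HJB residual, not the paper's modified tuning law (49) or training trigger (54), which differ in detail:

```python
def critic_step(w_c, xi, u, v, dt, lc=3.0, gamma=0.5):
    """One generic critic update: normalized gradient descent on the
    HJB residual  grad(V).xi_dot - gamma*V + U  (illustrative only)."""
    xi_dot = aux_dynamics(xi, u, v)
    phi = grad_sigma_c(xi) @ xi_dot - gamma * sigma_c(xi)   # d(residual)/d(w_c)
    delta = w_c @ phi + running_cost(xi, u, v)              # HJB residual
    return w_c - lc * dt * delta * phi / (1.0 + phi @ phi)**2
```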

Fig. 2 Convergence process of critic weights

Then, we used the converged weights to obtain the approximate control policies (39) and (40). Next, we select \(\theta _1=-0.3\) and \(\theta _2=5\) to demonstrate the trajectory tracking ability of the designed control policy \( \hat{\mu }(\xi _k)\) and the triggering rule described in (21). We considered \(\eta _1=0.65\) and \(\mathcal {L}=2.5\). The sampling period is taken as 0.01 second. The tracking performance of the designed controller is displayed in Figs. 3 and 4, and the obtained event-based control policy \( \hat{\mu }(\xi _k)\) is shown in Fig. 5.
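The closed-loop evaluation then follows the pattern sketched below, in which the threshold of the triggering rule (21) is abstracted as a user-supplied function `threshold` (a hypothetical placeholder; the concrete rule appears earlier in the paper) and the auxiliary input from (40) is omitted for brevity:

```python
def simulate(xi0, w_c, threshold, T=16.0, dt=0.01):
    """Event-triggered closed loop; returns the number of controller updates.
    T = 16.0 s matches the 1600-sample time-triggered horizon noted below."""
    xi, xi_k = xi0.copy(), xi0.copy()       # xi_k holds the last sampled state
    n_updates = 1
    for _ in range(int(T / dt)):
        e_k = xi - xi_k                     # event error between samples
        if e_k @ e_k >= threshold(xi):      # stand-in for triggering rule (21)
            xi_k = xi.copy()                # sample the state, update controller
            n_updates += 1
        u = policy(xi_k, w_c)               # zero-order hold between events
        xi = xi + dt * aux_dynamics(xi, u, 0.0)   # forward-Euler integration
    return n_updates
```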

The sampling frequency \(\eta _1\) is restricted to \(\eta _1\in (0,1)\). Table 1 illustrates the relationship between \(\eta _1\) and the number of triggering instants \(N_s\): as \(\eta _1\) increases, \(N_s\) increases.
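The trend of Table 1 could be reproduced with a sweep such as the following; the threshold shape here is a hypothetical stand-in chosen only so that a larger \(\eta _1\) yields a smaller threshold and hence more triggering instants:

```python
xi0 = np.array([0.3, -0.2, 0.3, -0.3])
w_c = np.array([0.46, 3.64, 0.01, -0.69, -0.85,
                -0.32, -0.02, 0.71, -3.89, -1.60])   # converged weights (Fig. 2)

for eta1 in (0.3, 0.5, 0.65, 0.8):
    thr = lambda xi, s=eta1: (1.0 - s) * 0.01 * (xi @ xi)  # placeholder, not (21)
    print(eta1, simulate(xi0, w_c, thr))
```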

Fig. 3 Tracking performance of \(x_1\) for \(\theta _1=-0.3\) and \(\theta _2=5\)

Fig. 4 Tracking performance of \(x_2\) for \(\theta _1=-0.3\) and \(\theta _2=5\)

Fig. 5 Event-based control input

Fig. 6 Evolution of the triggering condition with \(\Vert e_T\Vert ^2\) and \(\Vert e_k\Vert ^2\)

Fig. 7 Triggering instants during the tracking process

The evolution of the triggering condition with \(\Vert e_T\Vert ^2\) and \(\Vert e_k\Vert ^2\) is displayed in Fig. 6, and the triggering instants are shown in Fig. 7. The minimal intersample time is found to be 0.01 second, which means the infamous Zeno behavior is excluded. Furthermore, Fig. 7 also conveys that only 435 state samples are used during the tracking process, so the controller is updated only 435 times. In contrast, the time-triggering method requires 1600 samples under the same conditions, i.e., one update per 0.01-second sampling period. Hence, the developed event-based tracking control strategy reduces the resources used significantly.

Fig. 8 Tracking performance of \(x_1\) for \(\theta _1=0.4\) and \(\theta _2=-1\)

Fig. 9 Tracking performance of \(x_2\) for \(\theta _1=0.4\) and \(\theta _2=-1\)

Next, in order to show that the derived controller is robust, we take \(\theta _1=0.4\) and \(\theta _2=-1\). The tracking performance for the new values of \(\theta _1\) and \(\theta _2\) is shown in Figs. 8 and 9. In this scenario, the event-based controller updates only 468 times, whereas the conventional time-triggered controller updates 1600 times under the same design criteria.

Table 1 Effect of \(\eta _1\) on number of triggering instants

6.2 Example 2

Consider the spring-mass-damper system [36]

$$\begin{aligned} \dot{x}=\begin{bmatrix} 0 & 1\\ -\frac{k}{m} & -\frac{c}{m} \end{bmatrix}x+\begin{bmatrix} 0\\ \frac{1}{m} \end{bmatrix}u, \end{aligned}$$
(81)

where \(x=[x_1,x_2]^\top \), \(x_1\) is the position and \(x_2\) is the velocity, m represents the mass of the object, k denotes the spring constant, and c is the damping coefficient. Let \(m=1\,\text {kg}\), \(c=0.5\,\text {N}\cdot \text {s/m}\), and let the spring be nonlinear with the nonlinearity \(k(x)=-5 x^3\,\text {N/m}\). After adding an unmatched uncertainty \( \Delta f(x)\), the system dynamics is obtained as

$$\begin{aligned} \dot{x}=\begin{bmatrix} x_2\\ -5x_1^3-0.5x_2 \end{bmatrix}+\begin{bmatrix} 0\\ 1 \end{bmatrix}u+\Delta f(x), \end{aligned}$$
(82)

where \( \Delta f(x)=l(x)d(x)\) and \(l(x)=[1,0]^\top \). The perturbation is \(d(x)=0.5 \theta _1 x_1 x_2 \sin (x_1) \cos (x_2+\theta _2)\), where the parameters \(\theta _1\) and \(\theta _2\) are unknown. We consider \(\theta _1\in [-1,1]\) and \(\theta _2\in [-5,5]\), and the upper bound of the perturbation \(d(x)\) is \(\lambda _d(x)=\vert x_2\vert \). Let \(x_0=[0.5,0.2]^\top \) be the initial state. The desired trajectory \(x_d(t)\) is generated from

$$\begin{aligned} \dot{x}_d(t)= \begin{bmatrix} x_{d2}\\ -5x_{d1} \end{bmatrix}, \end{aligned}$$
(83)

where \(x_d=[x_{d1},x_{d2}]^{\top }\in \mathbb {R}^2\) with the initial condition \(x_{d0}=[0.2,-0.2]^\top \). Then an augmented state vector \(\xi =[\xi _1,\xi _2,\xi _3,\xi _4]^{\top }\in \mathbb {R}^4\) is defined and the following augmented system is formed

$$\begin{aligned} \dot{\xi }&= \begin{bmatrix} \xi _2\\ -5(\xi _1+\xi _3)^3-0.5(\xi _2+\xi _4)+5\xi _3\\ \xi _4\\ -5\xi _3 \end{bmatrix}\nonumber \\ {}&\quad +\begin{bmatrix} 0\\ 1\\ 0\\ 0 \end{bmatrix}u(\xi ) +L(\xi )d(\xi ), \end{aligned}$$
(84)

where \(L(\xi )=[1,0,0,0]^\top \) and \(d(\xi )=0.5\theta _1 (\xi _1+\xi _3) (\xi _2+\xi _4) \sin (\xi _1+\xi _3) \cos ((\xi _2+\xi _4)+\theta _2)\). The initial condition is \(\xi _0=[0.3,0.4,0.2,-0.2]^\top \). The upper bound of the perturbation is derived as \(\lambda _d(\xi )=\vert \xi _2+\xi _4\vert \). The auxiliary system is formulated as

$$\begin{aligned} \dot{\xi }&= \begin{bmatrix} \xi _2\\ -5(\xi _1+\xi _3)^3-0.5(\xi _2+\xi _4)+5\xi _3\\ \xi _4\\ -5\xi _3 \end{bmatrix}\nonumber \\ {}&\quad +\begin{bmatrix} 0\\ 1\\ 0\\ 0 \end{bmatrix}{u}(\xi ) +\begin{bmatrix} 1\\ 0\\ 0\\ 0 \end{bmatrix}{v}(\xi ). \end{aligned}$$
(85)

Since \(\Vert G^+(\xi )L(\xi )d(\xi )\Vert =0\), we have taken \(g_M(\xi )=0\). Let \(R=I_1\), \(M=I_1\), and \(Q=300I_2\). For simulation purposes, consider \(\gamma =1.2\) and \(\beta =0.9\).
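Paralleling Example 1, the uncertain plant (82) and the auxiliary dynamics (85) can be sketched as follows (illustrative only, reusing the conventions of the earlier snippets):

```python
def smd_dynamics(x, u, theta1, theta2):
    """Uncertain spring-mass-damper (82); the perturbation d(x) enters through
    the unmatched channel l(x) = [1, 0]^T."""
    x1, x2 = x
    d = 0.5 * theta1 * x1 * x2 * np.sin(x1) * np.cos(x2 + theta2)
    return np.array([x2 + d,
                     -5.0 * x1**3 - 0.5 * x2 + u])

def aux_dynamics2(xi, u, v):
    """Auxiliary system (85) of the spring-mass-damper example."""
    xi1, xi2, xi3, xi4 = xi
    drift = np.array([xi2,
                      -5.0 * (xi1 + xi3)**3 - 0.5 * (xi2 + xi4) + 5.0 * xi3,
                      xi4,
                      -5.0 * xi3])
    return drift + np.array([0.0, 1.0, 0.0, 0.0]) * u \
                 + np.array([1.0, 0.0, 0.0, 0.0]) * v
```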

Fig. 10 Convergence process of critic weights

Fig. 11 Tracking performance of \(x_1\) for \(\theta _1=-0.9\) and \(\theta _2=-0.3\)

Fig. 12 Tracking performance of \(x_2\) for \(\theta _1=-0.9\) and \(\theta _2=-0.3\)

Based on (8), the cost function for (85) can be presented as

$$\begin{aligned} J(\xi (t))&= \int \nolimits _{t}^{\infty }\mathrm {e}^{-1.2(\tau -t)}\{300\Vert e_t\Vert ^2+\Vert u(\xi )\Vert ^2\nonumber \\&\quad +0.81\Vert v(\xi )\Vert ^2+0.81\vert \xi _2+\xi _4\vert ^2\}\mathrm {d}\tau . \end{aligned}$$
(86)

We have considered \(l=10\) hidden layer neurons, and the activation function is chosen as \(\sigma _c(\xi )=[\xi _1^2,\xi _1\xi _2,\xi _1\xi _3,\xi _1\xi _4,\xi _2^2,\xi _2\xi _3,\xi _2\xi _4,\xi _3^2,\xi _3\xi _4, \xi _4^2]^\top \). The parameters used during the tuning process are \(l_c=4\), \(l_s=0.5\), \((A^2\nabla \sigma ^2_{cM}+B^2G^2_M)=8\), and \(\eta _2=0.7\).

To fulfill the PE condition, a small exponentially decreasing probing noise is applied to the control input for the initial 10 seconds of the training process. All elements of the weight vector are initialized to zero. The critic weight vector \(\hat{\omega }_c\) converges to \([3.53,19,11.46,2.49,1.93,0.23,0.01,10.07,-0.73, 2.05]^{\top }\), as shown in Fig. 10. During the training process, the event-based controller updates 8947 times, whereas, under the same design criteria, the time-based controller updates 16000 times.

Fig. 13 Event-based control input

Fig. 14 Evolution of the triggering condition with \(\Vert e_T\Vert ^2\) and \(\Vert e_k\Vert ^2\)

Then, we used the converged weights to obtain the control policies (39) and (40). Now, we select \(\theta _1=-0.9\) and \(\theta _2=-0.3\) to check the trajectory tracking performance of the designed control policy \( \hat{\mu }(\xi _k)\) and the triggering rule described in (21). We considered \(\eta _1=0.7\) and \(\mathcal {L}=10\). The sampling period is taken as 0.01 second. The performance of the designed tracking controller is displayed in Figs. 11 and 12, and the obtained event-based control policy \( \hat{\mu }(\xi _k)\) is displayed in Fig. 13.

The evolution of the triggering condition with \(\Vert e_T\Vert ^2\) and \(\Vert e_k\Vert ^2\) is displayed in Fig. 14, and the triggering instants are shown in Fig. 15. The minimal intersample time is found to be 0.01 second, which means the infamous Zeno behavior is excluded. Furthermore, Fig. 15 also conveys that only 1452 state samples are used during the tracking process, so the controller is updated only 1452 times. In contrast, the time-triggering method requires 8000 samples under the same conditions. Hence, the developed event-based tracking control strategy reduces the resources used significantly.

Fig. 15 Triggering instants during the tracking process

Fig. 16 Tracking performance of \(x_1\) for \(\theta _1=0.8\) and \(\theta _2=4\)

Fig. 17 Tracking performance of \(x_2\) for \(\theta _1=0.8\) and \(\theta _2=4\)

Next, in order to show that the derived controller is robust, we take \(\theta _1=0.8\) and \(\theta _2=4\). The tracking performance for the new values of \(\theta _1\) and \(\theta _2\) is shown in Figs. 16 and 17. Here, only 1438 state samples are used during the tracking process; in other words, the event-based controller updates only 1438 times, whereas the conventional time-triggered controller updates 8000 times under the same design criteria.

7 Conclusion

In this work, an event-based robust tracking strategy for unmatched uncertain systems is developed. By forming an auxiliary system and decomposing the unmatched uncertainty, the original control problem is transformed into obtaining an optimal controller for the auxiliary system. The related event-based HJB equation is solved via the ADP approach. The critic weight tuning law is modified to avoid the need for an initial stabilizing control at the beginning of the tuning process. Meanwhile, a novel event-triggering law is developed, and the uniform ultimate boundedness of the tracking error is verified using the Lyapunov method. The closed-loop auxiliary system's asymptotic stability and the uniform ultimate boundedness of the critic approximation error are assured. Finally, two simulation examples are included to demonstrate the usefulness of the proposed methodology.

The main limitations of the proposed work and future scope are as follows.

  1. The work developed in this article needs complete knowledge of the system dynamics. However, in many applications the system dynamics may not be completely known.

  2. The proposed method is not suitable for time-delay systems. In the future, a tracking controller will be designed for uncertain nonlinear systems with time delay using the event-based ADP approach.