1 Introduction

Within the framework of network-based systems, control loops are often closed through a communication medium. As a hot topic in the systems and control community, it is significant to carry out both systematic theoretical research and meaningful industrial applications of network-based control design. The growing demand for reducing the computational load of networked control systems, or more broadly, of the emerging cyber-physical systems, has brought great attention to the development of event-triggered control mechanisms [1,2,3,4]. Dolk et al. [1] proposed a popular framework for output-based dynamic event-triggered control design under denial-of-service attacks. Wu et al. [3] dealt with the event-based optimal control of heating, ventilation and air-conditioning systems of buildings for the purpose of energy saving. Within these general event-based control approaches, the actuators are updated only under certain triggering conditions, such that both the control performance and the stability of the controlled plant can be guaranteed.

Robustness is an important criterion for evaluating the performance of a designed controller with respect to uncertain disturbances and parameters of the controlled plant. In particular, the \(H_{\infty }\) method is usually concerned with constructing a control law for the worst-case uncertain plant. From the viewpoint of minimax optimization, an \(H_{\infty }\) control problem can be regarded as a two-player zero-sum differential game, where a controller is sought that minimizes the cost function under the worst-case disturbance. In this case, the Nash equilibrium solution must be obtained via the Hamilton–Jacobi–Isaacs equation. It is well known that the analytic Nash equilibrium solution is difficult to obtain for nonlinear systems. Fortunately, the methodology of adaptive/approximate dynamic programming has been developed to solve this class of optimal control problems effectively and forward-in-time [5,6,7], with neural networks [8,9,10,11] as well as some newly developed function approximation architectures such as incremental support vector machines [12]. The adaptive/approximate dynamic programming approach has made great progress in the optimal control of discrete-time nonlinear systems [13,14,15,16,17] and continuous-time nonlinear systems [18,19,20,21], together with some related applications [22,23,24]. Moreover, the problems of nonlinear \(H_{\infty }\) control and nonzero-sum games have been revisited and studied with the adaptive/approximate dynamic programming approach in [25,26,27,28,29].

Adaptive critic control, as one method of adaptive dynamic programming-based control, originates from [30], where Prokhorov et al. proposed adaptive critic designs with neural networks. Since then, adaptive critic control has developed into an important class of approximate optimal control approaches. To improve the robustness of adaptive critic control, the robust adaptive critic control methodology was proposed in [31, 32] and has recently achieved great development in [33,34,35,36,37]. However, these existing results are obtained in the traditional time-based design manner, which causes the actuators to be adjusted frequently and hence consumes considerable energy. Therefore, the time/event control structure has become an outlet to fulfill the event-based design and enhance the control efficiency [29, 36, 38,39,40]. In recent years, the event-based adaptive critic design method has been developed as a new channel for the adaptive optimal stabilization of nonlinear systems [36, 40,41,42]. With the new time/event control mechanism, the developed controller is updated only when an event is triggered, which reduces the computational cost. Considering that most existing work addresses optimal regulation without involving output information, such as [18, 19, 21, 26, 40, 41, 43], we are motivated to extend it to the nonlinear event-based zero-sum differential game problem with output information.

In this paper, the event-based robust \(H_{\infty }\) control with output information is investigated under the framework of adaptive critic designs. The contributions of this paper are as follows. First, the framework of event-based adaptive critic control with output information is established to study nonlinear \(H_{\infty }\) feedback control. The two-player zero-sum differential game problem with output information is formulated, and the event-based minimax optimization involving a suitable triggering condition is designed within this framework. Second, by involving output information, both the event-based optimal control law and the time-based worst-case disturbance law are derived with a stability proof, and the Zeno behavior in the event-based control is effectively avoided. This improves the results of traditional adaptive critic designs such as [18, 19, 21, 26] and event-based control designs such as [40, 41, 43]. The rest of this paper is organized as follows: In Sect. 2, a succinct transformation of nonlinear \(H_{\infty }\) control with output information is described. The event-based adaptive critic design for the nonlinear \(H_{\infty }\) feedback control problem is investigated in Sect. 3, together with the analysis of closed-loop stability and Zeno behavior exclusion. The application to a robot arm plant is provided in Sect. 4, and some concluding remarks are finally drawn in Sect. 5.

For effective presentation, the following notations are defined and used in the subsequent sections. \({\mathbb R}\), \({\mathbb R}^n\) and \({\mathbb R}^{n \times m}\) denote the set of all real numbers, the Euclidean space of all n-dimensional real vectors and the space of all \(n \times m\) real matrices, respectively. \(\mathbb {N}=\{0,1,2,\ldots \}\) denotes the set of all nonnegative integers. \(I_{n}\) is the identity matrix in \({\mathbb R}^{n \times n}\). \(\lambda _{\max }(\cdot )\) and \(\lambda _{\min }(\cdot )\) represent the maximal and minimal eigenvalues of a matrix, while \(\text {diag}\{\xi _{1}, \xi _{2}, \ldots , \xi _{n}\}\) denotes the \(n \times n\) diagonal matrix with diagonal elements \(\xi _{1}\), \(\xi _{2}\), \(\ldots \), \(\xi _{n}\). \(\Vert \cdot \Vert \) denotes the 2-norm of a vector and the induced norm of a matrix. \({\varOmega }\) is a compact subset of \({\mathbb R}^{n}\), and \(\mathscr {A}({\varOmega })\) is the set of admissible control policies on \({\varOmega }\). The superscript “\(\mathsf {T}\)” and \(\nabla (\cdot )\triangleq \partial (\cdot )/\partial x\) denote the transpose and the gradient operations, respectively.

2 Problem description and preliminaries

The following class of affine nonlinear continuous-time systems is considered in this paper with external perturbations:

$$\begin{aligned} \dot{x}(t)&=f(x)+g(x)u(t)+h(x)\nu (t), \end{aligned}$$
(1a)
$$\begin{aligned} y(t)&= Cx(t). \end{aligned}$$
(1b)

In (1), x(t) denotes the state vector belonging to \({\varOmega }\subset { \mathbb R}^n\), \(u(t) \in \mathbb R^m\) is the control input, \(\nu (t)\in \mathbb R^q\) is the perturbation belonging to \(L_{2}[0,\infty )\), \(y(t)\in { \mathbb R}^p\) is the output vector, and \(C \in { \mathbb R}^{p \times n}\) is a constant output matrix. f(x), g(x) and h(x) are differentiable with \(f(0)=0\). x(0) is the initial state vector, recorded as \(x_{0}\), and \(x=0\) is the equilibrium point of the system.

Assumption 1

The nonlinear system (1) is controllable. The system function f(x) is Lipschitz continuous on \({\varOmega }\), which contains the origin.

With this assumption, for the nonlinear \(H_{\infty }\) control design of system (1), a feedback control law u(x) is sought to make the closed-loop system (1) asymptotically stable with an \(L_{2}\)-gain no larger than \(\iota \), i.e.,

$$\begin{aligned} \int _{0}^{\infty }\big (y^{\mathsf {T}}(\tau ) P y(\tau )+u^{\mathsf {T}}(\tau )u(\tau )\big )\text {d}\tau \le \iota ^{2}\int _{0}^{\infty }\Vert \nu (\tau )\Vert ^{2}\text {d}\tau , \end{aligned}$$
(2)

where P is a positive semidefinite matrix of appropriate dimension. Recalling (1b), it is obvious that (2) can be rewritten as

$$\begin{aligned} \int _{0}^{\infty }\big (x^{\mathsf {T}}(\tau )Qx(\tau )+u^{\mathsf {T}}(\tau )u(\tau )\big )\text {d}\tau \le \iota ^{2}\int _{0}^{\infty }\Vert \nu (\tau )\Vert ^{2}\text {d}\tau , \end{aligned}$$
(3)

where \(Q=C^{\mathsf {T}}P C \) is nonnegative definite. If the closed-loop form of system (1) satisfies condition (3), then it has an \(L_{2}\)-gain no larger than \(\iota \). It is well known that nonlinear \(H_{\infty }\) control can be translated into a two-player zero-sum differential game, where the minimizing player is regarded as the control and the maximizing player as the disturbance [25, 26]. Therefore, the solution of the nonlinear \(H_{\infty }\) control problem is a control pair of the form \((u^{*}, \nu ^{*})\), where \(u^{*}\) and \(\nu ^{*}\) are the optimal control and the worst-case disturbance, respectively. Define the utility function \(U\big (x(\tau ), u(\tau ), \nu (\tau )\big )\) as

$$\begin{aligned}&U\big (x(\tau ),u(\tau ),\nu (\tau )\big ) \nonumber \\&\quad = x^{\mathsf {T}}(\tau )Qx(\tau ) + u^{\mathsf {T}}(\tau )u(\tau ) - \iota ^{2} \nu ^{\mathsf {T}}(\tau )\nu (\tau ), \end{aligned}$$

and the corresponding cost function is

$$\begin{aligned} J(x, u, \nu )=\int _{t}^{\infty }U\big (x(\tau ), u(\tau ), \nu (\tau )\big )\text {d}\tau , \end{aligned}$$
(4)

where the cost function \(J(x, u, \nu )\) is abbreviated as J(x) in the sequel. The initial cost at \(t=0\) is denoted as \(J(x_{0})\). In the two-player zero-sum game problem, the feedback control pair \((u^{*}, \nu ^{*})\) satisfies the Nash condition, i.e.,

$$\begin{aligned} J^{*}(x_{0})=\min _{u}\max _{\nu }J(x_{0}, u, \nu )=\max _{\nu }\min _{u}J(x_{0}, u, \nu ). \end{aligned}$$

For an admissible control policy \(u \in \mathscr {A}({\varOmega })\), if the cost function in (4) is differentiable, then the following nonlinear Lyapunov equation is obtained

$$\begin{aligned} U(x, u, \nu )+\big (\nabla J(x)\big )^\mathsf {T}\big (f(x)+g(x)u+h(x)\nu \big )=0 \end{aligned}$$

with an initial condition \(J(0)=0\). Correspondingly, the Hamiltonian function of system (1) is defined as

$$\begin{aligned}&H\big (x, u, \nu ,\nabla J(x)\big ) \nonumber \\&\quad = U(x, u, \nu ) + \big (\nabla J(x)\big )^\mathsf {T} \big (f(x)+g(x)u+h(x)\nu \big ). \end{aligned}$$

According to Bellman’s optimality principle, the optimal cost \(J^{*}(x)\) ensures that the Hamilton–Jacobi–Isaacs equation

$$\begin{aligned} \min _{u }\max _{\nu } H\big (x, u, \nu , \nabla J^{*}(x)\big )=0 \end{aligned}$$

holds. That is to say, the control pair \((u^{*}, \nu ^{*})\) can be obtained from the following stationarity conditions:

$$\begin{aligned} \frac{\partial H\big (x, u, \nu , \nabla J^{*}(x)\big )}{\partial u}=&0,\\ \frac{\partial H\big (x, u, \nu , \nabla J^{*}(x)\big )}{\partial \nu }=&0. \end{aligned}$$

Therefore, the optimal control and the worst-case disturbance are calculated by

$$\begin{aligned} u^{*}(x)&=-\frac{1}{2}g^\mathsf {T}(x) \nabla J^{*}(x), \end{aligned}$$
(5a)
$$\begin{aligned} \nu ^{*}(x)&=\frac{1}{2\iota ^{2}}h^\mathsf {T}(x)\nabla J^{*}(x). \end{aligned}$$
(5b)

By using (5), the Hamilton–Jacobi–Isaacs equation turns into the following expression

$$\begin{aligned} 0=&x^{\mathsf {T}}Qx + \big (\nabla J^{*}(x)\big )^\mathsf {T}f(x) - \frac{1}{4}\big (\nabla J^{*}(x)\big )^\mathsf {T}g(x)g^\mathsf {T}(x) \nonumber \\&\times \nabla J^{*}(x)+\frac{1}{4\iota ^{2}}\big (\nabla J^{*}(x)\big )^\mathsf {T}h(x)h^\mathsf {T}(x)\nabla J^{*}(x) \end{aligned}$$
(6)

with \(J^{*}(0)=0\). It should be noted that (6) is the classical time-based Hamilton–Jacobi–Isaacs equation, whose analytic solution is generally unavailable for nonlinear systems. Hence, an approximate solution is pursued instead, and the adaptive critic control method is taken as an effective approach to handle this problem.
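To make the structure of (5) concrete, a minimal Python sketch of the control-pair evaluation is given below. The signatures are illustrative assumptions: in practice the analytic gradient \(\nabla J^{*}(x)\) is unknown for nonlinear systems and is later replaced by the critic approximation of Sect. 3.

```python
import numpy as np

def control_pair(x, grad_J, g, h, iota):
    """Evaluate the control pair (5) for a supplied value gradient.

    grad_J, g, h are callables returning nabla J*(x) of shape (n,),
    g(x) of shape (n, m) and h(x) of shape (n, q); iota is the L2-gain level.
    """
    dJ = grad_J(x)
    u_star = -0.5 * g(x).T @ dJ            # optimal control, Eq. (5a)
    nu_star = 0.5 / iota**2 * h(x).T @ dJ  # worst-case disturbance, Eq. (5b)
    return u_star, nu_star
```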

3 Event-based robust adaptive critic control design and implementation

3.1 Event-based control design with Zeno behavior exclusion

In industrial practice, a sampling component is often incorporated into a networked system. With the event-triggered control method, a monotonically increasing sequence of triggering instants \(\{s_{j}\}_{j=0}^{\infty }\) is defined, where \(s_{j}\) denotes the jth sampling instant with \(j \in \mathbb {N}\). The sampled state vector is denoted as \(x(s_{j}) \triangleq \hat{x}_{j}\) for all \(t \in [s_{j}, s_{j+1})\). The event-triggered error is defined as the gap between the sampled and current states, i.e., \( \sigma _{j}(t)=\hat{x}_{j}-x(t), ~ \forall t \in [s_{j}, s_{j+1}) \).

In event-based control, the triggering condition determines the triggering instants. That is to say, at the triggering instant \(t=s_{j}\), when the triggering condition is activated, the system is sampled such that the event-triggered error \(\sigma _{j}(t)\) is reset to zero, and the control law \(u(x(s_{j}))=u(\hat{x}_{j}) \triangleq \mu (\hat{x}_{j})\) is updated accordingly. By introducing a zero-order holder, the control sequence \(\{\mu (\hat{x}_{j})\}_{j=0}^{\infty }\) is turned into a continuous-time signal in the form of a piecewise constant function taking the value \(\mu (\hat{x}_{j})\) on the interval \([s_{j}, s_{j+1})\), \(j \in \mathbb {N}\). When the event-triggering mechanism is employed, the feedback control law in (5a) becomes

$$\begin{aligned} \mu ^{*}(\hat{x}_{j})=-\frac{1}{2}g^\mathsf {T}(\hat{x}_{j}) \nabla J^{*}(\hat{x}_{j}), \end{aligned}$$
(7)

where \(\nabla J^*({\hat{x}}_j)=\big (\partial J^*(x)/\partial x\big )\big |_{x={\hat{x}}_j}\). The disturbance law remains unchanged during the time/event structure transformation.
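As an illustration of the mechanism just described, the following sketch simulates the sample-and-hold loop with a forward-Euler integrator; the plant callables, the laws mu and nu, and the predicate triggered are assumptions standing in for (1), (7), (5b) and the condition designed in Theorem 1 below.

```python
import numpy as np

def run_sample_and_hold(f, g, h, mu, nu, triggered, x0, dt=1e-3, T=10.0):
    """Sketch of the zero-order-hold loop: the control is recomputed only
    at triggering instants and held constant in between."""
    x, x_hat = x0.copy(), x0.copy()   # x_hat is the last sampled state
    u = mu(x_hat)                     # value held by the zero-order holder
    instants = [0.0]
    for k in range(1, int(T / dt) + 1):
        x = x + dt * (f(x) + g(x) @ u + h(x) @ nu(x))  # flow dynamics (1a)
        if triggered(x, x_hat):       # e.g., the bound in (8) is violated
            x_hat = x.copy()          # sample: sigma_j(t) is reset to zero
            u = mu(x_hat)             # update the held control
            instants.append(k * dt)
    return x, instants
```

Additionally, we make the following assumptions, which are reasonable and conventional in event-based designs.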

Assumption 2

(cf. [41]) The control law u(x) is Lipschitz continuous with regard to the event-triggered error \(\sigma _{j}(t)\), which is formulated as \( \Vert u(x(t))-u(\hat{x}_{j})\Vert \le M_{u}\Vert \sigma _{j}(t)\Vert \), where \(M_{u}\) is a positive constant.

Assumption 3

The control function matrix g(x) is Lipschitz continuous with respect to the event-triggered error \(\sigma _{j}(t)\) and is also upper-bounded, which means \(\Vert g(x)-g({\hat{x}}_j)\Vert \le M_{g}\Vert \sigma _j(t)\Vert \) and \(\Vert g(x)\Vert \le B_{g}\), where \(M_{g}\) and \(B_{g}\) are positive constants. The disturbance matrix h(x) is bounded by a positive constant \(B_{h}\), i.e., \(\Vert h(x)\Vert \le B_{h}\).

The following theorem is provided to design a triggering condition.

Theorem 1

Considering the nonlinear system (1) and its related cost function (4), for all \(t \in [s_{j},s_{j+1})\) with \(j \in \mathbb {N}\), if the disturbance law and the event-based control law are given by (5b) and (7), respectively, and the triggering condition is given as

$$\begin{aligned} \Vert \sigma _{j}(t)\Vert ^{2} \le \sigma _{T}=\frac{x^\mathsf {T}Qx+\Vert \mu ^{*}(\hat{x}_{j})\Vert ^{2}-\iota ^{2}\Vert \nu ^{*}(x)\Vert ^{2}}{M_{u}^2 } , \end{aligned}$$
(8)

where \(\sigma _{T}\) is the threshold of the triggering condition, then the closed-loop system (1) is asymptotically stable.

Proof

Select \(L_{1}(t)=J^{*}(x(t))\) as the Lyapunov function candidate. Using (5b) and (7), we take the time derivative of \(L_{1}(t)\) along the trajectory of system (1a), i.e., \(\dot{L}_{1}(t)=\text {d}J^{*}(x(t))/\text {d}t\), which gives

$$\begin{aligned} \dot{L}_{1}(t) = \big (\nabla J^{*}(x)\big )^\mathsf {T}\big (f(x)+ g(x)\mu ^{*}(\hat{x}_{j})+ h(x)\nu ^{*}(x)\big ). \end{aligned}$$

Note that formula (5) implies that

$$\begin{aligned} \big (\nabla J^{*}(x)\big )^\mathsf {T} g(x)&=-2u^{*\mathsf {T}}(x), \end{aligned}$$
(9a)
$$\begin{aligned} \big (\nabla J^{*}(x)\big )^\mathsf {T} h(x)&=2\iota ^{2} \nu ^{*\mathsf {T}}(x). \end{aligned}$$
(9b)

Besides, Eq. (6) reveals

$$\begin{aligned} \big (\nabla J^{*}(x)\big )^\mathsf {T}f(x)= -x^{\mathsf {T}}Qx+u^{*\mathsf {T}}(x)u^{*}(x)-\iota ^{2}\nu ^{*\mathsf {T}}(x)\nu ^{*}(x). \end{aligned}$$
(10)

By using (9) and (10), we derive

$$\begin{aligned} \dot{L}_{1}(t) =&-x^\mathsf {T}Qx + u^{*\mathsf {T}}(x)u^{*}(x) - 2u^{*\mathsf {T}}(x)\mu ^{*}(\hat{x}_{j})\nonumber \\&+ \iota ^{2} \nu ^{*\mathsf {T}}(x)\nu ^{*}(x). \end{aligned}$$
(11)

By completing the square and then applying Assumption 2, \(\dot{L}_{1}(t)\) can be bounded as

$$\begin{aligned} \dot{L}_{1}(t) =&-x^\mathsf {T}Qx+\big (u^{*}(x)-\mu ^{*}(\hat{x}_{j})\big )^{\mathsf {T}}\big (u^{*}(x)-\mu ^{*}(\hat{x}_{j})\big )\nonumber \\&-\mu ^{*\mathsf {T}}(\hat{x}_{j})\mu ^{*}(\hat{x}_{j})+\iota ^{2}\nu ^{*\mathsf {T}}(x)\nu ^{*}(x)\nonumber \\ \le&-x^\mathsf {T}Qx+M_{u}^2\Vert \sigma _{j}( t)\Vert ^{2}\\&-\Vert \mu ^{*}(\hat{x}_{j})\Vert ^{2}+\iota ^{2}\Vert \nu ^{*}(x)\Vert ^{2}. \end{aligned}$$

It is obvious that \(\dot{L}_{1}(t) < 0\) holds for any \(x\ne 0\) if the triggering condition (8) is satisfied, which completes the proof. \(\square \)

For the proposed network-based event-triggered \(H_{\infty }\) control problem, the jth inter-sample time is \( s_{j+1}-s_{j}\). Denote the minimal inter-sample time as

$$\begin{aligned} \Delta s_{\min }= \min _{j\in \mathbb {N}} \{ s_{j+1}-s_{j} \}, \end{aligned}$$

which might be zero and thus lead to an accumulation of event times, i.e., the infamous Zeno behavior.

By using Assumptions 1 and 3, and considering that the optimal control function and the worst-case disturbance function are upper-bounded, there exist two positive constants \(\kappa _{1}\) and \(\kappa _{2}\) such that

$$\begin{aligned} \Vert \dot{x}\Vert =\Vert f(x)+g(x)\mu ^{*}+h(x)\nu ^{*}\Vert \le \kappa _{1}\Vert x\Vert +\kappa _{2} \end{aligned}$$
(12)

holds, where \(\kappa _{2}\) is a bounded term depending on the control matrix, the optimal control, the disturbance matrix and the worst-case disturbance. Taking the derivative of the triggering error \(\sigma _{j}(t)\) yields \(\dot{\sigma }_{j}(t)=-\dot{x}\) for \(t\in [s_j,s_{j+1})\). Based on (12), it can be further found that

$$\begin{aligned} \Vert \dot{\sigma }_{j}\Vert&\le \kappa _{1}\Vert \hat{x}_{j}-\sigma _{j}\Vert +\kappa _{2} \nonumber \\&\le \kappa _{1}\Vert \sigma _{j}\Vert +\kappa _{1}\Vert \hat{x}_{j}\Vert +\kappa _{2}, \forall t\in [s_j,s_{j+1}). \end{aligned}$$
(13)

By using the initial condition \(\sigma _{j}(s_{j})={\hat{x}}_{j}-x(s_{j})=0\) and the comparison lemma (see [44]), the following inequality can be derived from the solution of (13):

$$\begin{aligned} \Vert \sigma _{j}\Vert \le \frac{\kappa _{1}\Vert \hat{x}_{j}\Vert +\kappa _{2}}{\kappa _{1}} \big (e^{\kappa _{1}(t-s_{j})}-1\big ) \end{aligned}$$
(14)

for any \(t\in [s_j,s_{j+1})\). According to (14), we obtain that the jth inter-sample time satisfies

$$\begin{aligned} s_{j+1}-s_{j} \ge \frac{1}{\kappa _{1}} \ln (1+\bar{\kappa }_{j})>0, \end{aligned}$$
(15)

where the term \(\bar{\kappa }_{j} = \kappa _{1} \bar{\sigma }_T/ (\kappa _{1}\Vert \hat{x}_{j}\Vert +\kappa _{2}) \) is positive with \(\bar{\sigma }_T=\Vert \sigma _{j}(s_{j+1})\Vert \) and \(\sigma _{j}(s_{j+1})={\hat{x}}_{j}-x(s_{j+1})\). The minimum of \(\bar{\kappa }_{j}\) over all \(j\in \mathbb {N}\) is defined as \(\kappa _{\min }=\min _{j\in \mathbb {N}} \bar{\kappa }_{j} > 0\). Taking the minimum on both sides of (15) leads to the following remark.

Remark 1

Consider the nonlinear system (1) with the disturbance law (5b) and the event-based control law (7). The minimal inter-sample time \(\Delta s_{\min }\) determined by (8) is lower-bounded such that

$$\begin{aligned} \Delta s_{\min } \ge \frac{1}{\kappa _{1}} \ln (1+\kappa _{\min })>0, \end{aligned}$$
(16)

where \(\kappa _{1}\) and \(\kappa _{\min }\) are positive constants. Hence, the Zeno behavior in this event-based control design is avoided.
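As a worked instance of the bound (16), once the constants \(\kappa _{1}\) and \(\kappa _{\min }\) have been estimated for a given plant, the guaranteed dwell time follows by direct evaluation; the numbers below are hypothetical.

```python
import numpy as np

def dwell_time_lower_bound(kappa1, kappa_min):
    """Lower bound (16) on the minimal inter-sample time."""
    return np.log1p(kappa_min) / kappa1

# With the hypothetical estimates kappa1 = 2 and kappa_min = 0.1, the bound
# is about 0.0477, i.e., consecutive events are separated by a strictly
# positive dwell time, so Zeno behavior cannot occur.
print(dwell_time_lower_bound(2.0, 0.1))
```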

3.2 Neural network implementation with stability analysis

The adaptive critic control design with neural networks is a practical approach to obtaining the approximate optimal control solution of nonlinear control problems [5, 18, 21, 27, 40]. In the neural network implementation, let \(l_c\) denote the number of hidden-layer neurons. By the universal approximation property of neural networks, the cost function J(x) is reconstructed by a single-hidden-layer neural network as

$$\begin{aligned} J(x)=\omega _{c}^{\mathsf {T}}\varphi _{c}(x)+\epsilon _{c}(x), \end{aligned}$$

where \(\omega _{c}\in \mathbb {R}^{l_c}\) is the desired weight vector, \(\varphi _{c}(x)\in \mathbb {R}^{l_c}\) denotes the activation function of the neural network, and \(\epsilon _{c}(x)\in \mathbb {R}\) is the reconstruction error. The gradient of J(x) is expressed as

$$\begin{aligned} \nabla J(x)=\big (\nabla \varphi _{c}(x)\big )^\mathsf {T}\omega _{c}+\nabla \epsilon _{c}(x). \end{aligned}$$

Since the desired weight vector \(\omega _{c}\) is unknown, a critic neural network with an estimated weight vector \(\hat{\omega }_{c}(t)\) is used to approximate the cost function, i.e.,

$$\begin{aligned} \hat{J}(x)=\hat{\omega }_{c}^{\mathsf {T}}(t)\varphi _{c}(x). \end{aligned}$$

Similarly, the gradient of the estimated cost function \(\hat{J}(x)\) can be formulated as

$$\begin{aligned} \nabla \hat{J}(x)=\big (\nabla \varphi _{c}(x)\big )^\mathsf {T}\hat{\omega }_{c}(t). \end{aligned}$$

Therefore, the event-based optimal control and the time-based worst-case disturbance are formulated as

$$\begin{aligned} \mu (\hat{x}_{j})&=-\frac{1}{2}g^\mathsf {T}(\hat{x}_{j}) \Big ( \big (\nabla \varphi _{c}(\hat{x}_{j})\big )^\mathsf {T}\omega _{c}+\nabla \epsilon _{c}(\hat{x}_{j})\Big ),\nonumber \\ \nu (x)&=\frac{1}{2\iota ^{2}}h^\mathsf {T}(x) \Big ( \big (\nabla \varphi _{c}(x)\big )^\mathsf {T}\omega _{c}+\nabla \epsilon _{c}(x)\Big ). \end{aligned}$$

By introducing the critic neural network, the approximate values of the above control pair are

$$\begin{aligned} \hat{\mu }(\hat{x}_{j})&=-\frac{1}{2}g^\mathsf {T}(\hat{x}_{j})\big (\nabla \varphi _{c}(\hat{x}_{j})\big )^\mathsf {T}\hat{\omega }_{c}(t), \end{aligned}$$
(17a)
$$\begin{aligned} \hat{\nu }(x)&=\frac{1}{2\iota ^{2}}h^\mathsf {T}(x) \big (\nabla \varphi _{c}(x)\big )^\mathsf {T}\hat{\omega }_{c}(t). \end{aligned}$$
(17b)
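A minimal sketch of the approximate control pair (17) is given below; the helper grad_phi, which returns the Jacobian of the activation vector, and the remaining signatures are illustrative assumptions.

```python
import numpy as np

def approximate_control_pair(x, x_hat, w_hat, grad_phi, g, h, iota):
    """Critic-based approximations (17): the control (17a) uses only the
    sampled state x_hat, while the disturbance (17b) uses the current
    state x. grad_phi(x) has shape (l_c, n); w_hat has shape (l_c,)."""
    mu_hat = -0.5 * g(x_hat).T @ (grad_phi(x_hat).T @ w_hat)   # Eq. (17a)
    nu_hat = 0.5 / iota**2 * h(x).T @ (grad_phi(x).T @ w_hat)  # Eq. (17b)
    return mu_hat, nu_hat
```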

In the sequel, we apply the neural network expression to the Hamiltonian function and derive that

$$\begin{aligned}&H\big (x, \mu (\hat{x}_{j}), \nu (x), \omega _{c}\big )&\nonumber \\&\quad =U\big (x, \mu (\hat{x}_{j}), \nu (x)\big )+\omega _c^{\mathsf {T}}\nabla \varphi _c(x)\big (f(x)+g(x)\mu (\hat{x}_{j})\nonumber \\&\qquad +h(x)\nu (x)\big ) \nonumber \\&\quad \triangleq e_{cH}, \end{aligned}$$
(18)

where the term

$$\begin{aligned} e_{cH}=-\big (\nabla \epsilon _{c}(x)\big )^\mathsf {T}\big (f(x)+g(x)\mu (\hat{x}_{j})+h(x)\nu (x)\big ) \end{aligned}$$

represents the residual error arising from the approximation. Meanwhile, the approximate Hamiltonian function is

$$\begin{aligned}&{\hat{H}}\big (x, \mu (\hat{x}_{j}), \nu (x), \hat{\omega }_{c}\big )\nonumber \\&\quad = U\big (x, \mu (\hat{x}_{j}), \nu (x)\big )+{\hat{\omega }}_c^{\mathsf {T}}(t) \nabla \varphi _c(x) \big (f(x)\nonumber \\&\qquad +g(x)\mu (\hat{x}_{j}) +h(x)\nu (x)\big ) \nonumber \\&\quad \triangleq e_{c}. \end{aligned}$$
(19)

Let us define the weight error vector as \({\tilde{\omega }}_c(t)= \omega _c-{\hat{\omega }}_c(t)\). Then, combining (18) with (19) yields

$$\begin{aligned} e_{c}=&-\tilde{\omega }_c^{\mathsf {T}}(t)\nabla \varphi _c(x)\big (f(x)+g(x)\mu (\hat{x}_{j})+h(x)\nu (x)\big ) \nonumber \\&+e_{cH}. \end{aligned}$$

Next, we show how to train the critic neural network. We aim to minimize the objective function \(E_{c}=0.5e_{c}^{2}\) so as to obtain \(\hat{\omega }_{c}(t)\). It should be pointed out that the control pair (17) is adopted during the learning process because the exact optimal control and the worst-case disturbance are unavailable. Based on (19), the normalized steepest descent algorithm is employed to update the weight vector \(\hat{\omega }_{c}(t)\):

$$\begin{aligned} \dot{\hat{\omega }}_{c}(t)=&-\alpha _{c}\frac{1}{(1+\psi ^{\mathsf {T}}\psi )^{2}}\bigg (\frac{\partial E_{c}}{\partial \hat{\omega }_{c}(t)}\bigg )\nonumber \\ =&-\alpha _{c}\frac{\psi }{(1+\psi ^{\mathsf {T}}\psi )^{2}}\big (U(x, {\hat{\mu }}({\hat{x}}_j), \hat{\nu }(x))\nonumber \\&+ \psi ^{\mathsf {T}}{\hat{\omega }}_c(t)\big ), \end{aligned}$$
(20)

where \(\alpha _{c}>0.5\) is the learning rate of the critic neural network,

$$\begin{aligned} \psi =\nabla \varphi _c(x)\big ( f(x)+g(x) \hat{\mu }({\hat{x}}_j)+h(x)\hat{\nu }(x)\big ) \end{aligned}$$

is an \(l_{c}\)-dimensional column vector, and \((1+\psi ^{\mathsf {T}}\psi )^{2}\) is a regularization term [45].
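In implementation terms, (20) defines the right-hand side of an ordinary differential equation for the critic weights, which any standard solver (or a simple Euler step) can integrate. A minimal sketch under assumed signatures:

```python
import numpy as np

def critic_weight_derivative(w_hat, x, xdot, grad_phi, utility, alpha_c):
    """Right-hand side of the normalized steepest descent rule (20).

    xdot    : f(x) + g(x) mu_hat + h(x) nu_hat, evaluated with (17)
    utility : U(x, mu_hat, nu_hat) = x'Qx + mu'mu - iota^2 * nu'nu
    """
    psi = grad_phi(x) @ xdot            # the l_c-dimensional vector psi
    e_c = utility + psi @ w_hat         # approximate Hamiltonian error (19)
    return -alpha_c * psi * e_c / (1.0 + psi @ psi)**2
```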

For the sake of clarity, a schematic diagram of the adaptive critic-based nonlinear \(H_{\infty }\) control design that integrates the event-based component is depicted in Fig. 1, where the solid blocks exhibit the network-based computation modules, while the dashed blocks reveal the time/event transformation components. The solid lines denote the signal flow path of the \(H_{\infty }\) control design, while the dashed lines represent the back-propagation path for neural network training.

Fig. 1 Simple structure of the adaptive critic \(H_{\infty }\) control design with the network-based event-triggered mechanism

By using \(\dot{{\tilde{\omega }}}_c(t)= -\dot{{\hat{\omega }}}_c(t)\) and introducing the following notations

$$\begin{aligned} \psi _{1}=\frac{\psi }{(1+\psi ^{\mathsf {T}}\psi )},\psi _{2}=1+\psi ^{\mathsf {T}}\psi , \end{aligned}$$

the error dynamics of the critic neural network can be further obtained as

$$\begin{aligned} \dot{{\tilde{\omega }}}_c(t) = -\alpha _c \psi _{1}\psi _{1}^\mathsf {T}{\tilde{\omega }}_c (t)+\alpha _c \frac{\psi _{1}}{\psi _{2}}e_{cH}. \end{aligned}$$
(21)

It is well known that persistence of excitation is necessary for system identification [46]. This assumption is also required in this paper, since the parameters of the critic neural network need to be identified such that the cost function can be approximated.

Assumption 4

(cf. [18]) The signal \(\psi _{1}\) satisfies the persistent excitation condition on the time interval \([t, t+T]\), \(T>0\), i.e., there exist two constants \(\varsigma _{1}>0\) and \(\varsigma _{2}>0\) such that

$$\begin{aligned} \varsigma _{1}I_{l_{c}} \le \int _{t}^{t+T}\psi _{1}(\tau ) \psi _{1}^{\mathsf {T}}(\tau ) \text {d}\tau \le \varsigma _{2}I_{l_{c}} \end{aligned}$$

holds for all t.

Based on Assumption 4, the persistent excitation condition means that \(\lambda _{\min }(\psi _{1} \psi _{1}^{\mathsf {T}}) > 0\), which is useful in the following stability analysis.
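In practice, Assumption 4 can be checked numerically on a recorded window of \(\psi _{1}\) samples: the windowed Gram matrix in the displayed inequality must be positive definite. A minimal sketch, with an assumed array layout, is the following.

```python
import numpy as np

def excitation_level(psi1_samples, dt):
    """Approximate the windowed Gram matrix of Assumption 4 and return its
    smallest eigenvalue; psi1_samples has shape (N, l_c), sampled every dt
    seconds. A strictly positive result indicates persistent excitation."""
    gram = dt * sum(np.outer(p, p) for p in psi1_samples)
    return float(np.linalg.eigvalsh(gram).min())
```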

In event-triggered control, the closed-loop sampled-data system contains flow dynamics for all \(t \in [s_j,s_{j+1})\) and jump dynamics at \(t=s_{j+1}\) with \(j \in \mathbb {N}\). Before proceeding to the stability analysis of the closed-loop system, Assumption 5 is required, which is similar to those in [27, 36, 42].

Assumption 5

The derivative of the adopted activation function is Lipschitz continuous, i.e., \(\Vert \nabla \varphi _c(x)-\nabla \varphi _c({\hat{x}}_j)\Vert \le M_{\varphi }\Vert \sigma _j(t)\Vert \), where \(M_{\varphi }\) is a positive constant. In addition, \(\nabla \varphi _c(x)\), \(\nabla \epsilon _{c}(x)\) and \(e_{cH}\) are upper-bounded as \(\Vert \nabla \varphi _c(x)\Vert \le B_{\varphi }\), \(\Vert \nabla \epsilon _{c}(x)\Vert \le B_{\epsilon }\) and \(\Vert e_{cH}\Vert \le B_{e}\), where \(B_{\varphi }\), \(B_{\epsilon }\) and \(B_{e}\) are positive constants.

Theorem 2

With Assumptions 3 and 5, for the nonlinear system (1), let the event-based approximate optimal control law be given by (17a) and the time-based approximate worst-case disturbance law by (17b), where the weight vector of the critic neural network is updated according to (20). Then, the closed-loop system (1) is asymptotically stable and the weight error vector is uniformly ultimately bounded under the triggering condition

$$\begin{aligned} \Vert \sigma _j(t)\Vert ^2 \le {\hat{\sigma }}_T = \frac{x^{\mathsf {T}}Qx+\Vert {\hat{\mu }}({\hat{x}}_j)\Vert ^2-\iota ^{2}\Vert {\hat{\nu }}(x)\Vert ^{2}}{2 M_{\mathcal {L}}\Vert {\hat{\omega }}_c(t)\Vert ^2} , \end{aligned}$$
(22)

where the inequality

$$\begin{aligned} \Vert {\tilde{\omega }}_c(t) \Vert >\sqrt{\frac{2 B_{g}^2 B_{\epsilon }^2 + \alpha _c^2 B_{e}^2}{(2\alpha _c-1) \lambda _{\min }(\psi _{1} \psi _{1}^{\mathsf {T}})-2B_{g}^2B_{\varphi }^2}} \end{aligned}$$
(23)

is satisfied when \(M_{\mathcal {L}}=M_{g}^2 B^2_{\varphi }+M_{\varphi }^2B^2_{g}\) and \(\alpha _{c}>0.5\).

Proof

Construct a Lyapunov function candidate as

$$\begin{aligned} L_{2}(t)=L_{21}(t)+L_{22}(t)+L_{23}(t), \end{aligned}$$

where

$$\begin{aligned} L_{21}(t)=&J^*(x), L_{22}(t)=J^*({\hat{x}}_j),\\ L_{23}(t)=&\frac{1}{2}{\tilde{\omega }}_c^\mathsf {T}(t){\tilde{\omega }}_c(t). \end{aligned}$$

When \( t \in [s_j,s_{j+1})\), no event is triggered. The time derivative of \(L_{2}(t)\) is calculated term by term as

$$\begin{aligned} \dot{L}_{21}(t)=&\big (\nabla J^*(x)\big )^\mathsf {T}\big (f(x)+g(x){\hat{\mu }}({\hat{x}}_j) \\&+h(x)\hat{\nu }(x)\big ), \end{aligned}$$

\(\dot{L}_{22}(t)=0\), and

$$\begin{aligned} \dot{L}_{23}(t)= -\alpha _c {\tilde{\omega }}_c^\mathsf {T}(t)\psi _{1}\psi _{1}^\mathsf {T}{\tilde{\omega }}_c(t) +\alpha _c \frac{{\tilde{\omega }}_c^\mathsf {T}(t)\psi _{1}}{\psi _{2}}e_{cH}. \end{aligned}$$

For the term \(\dot{L}_{21}(t)\), based on (5) and (6), and by adding and subtracting \(\hat{\mu }^{\mathsf {T}}(\hat{x}_{j})\hat{\mu }(\hat{x}_{j})\), we obtain

$$\begin{aligned} \dot{L}_{21}(t)=&-x^\mathsf {T}Qx+u^{*\mathsf {T}}(x)u^*(x)-2u^{*\mathsf {T}}(x){\hat{\mu }}({\hat{x}}_j)\nonumber \\&-\iota ^{2}\nu ^{*\mathsf {T}}(x)\nu ^{*}(x)+2\iota ^{2}\nu ^{*\mathsf {T}}(x)\hat{\nu }(x)\nonumber \\ \le&-x^\mathsf {T}Qx + \Vert u^*(x)- {\hat{\mu }}({\hat{x}}_j)\Vert ^2- \Vert {\hat{\mu }}({\hat{x}}_j)\Vert ^2 \nonumber \\&+ \iota ^{2}\Vert \hat{\nu }(x)\Vert ^{2}. \end{aligned}$$

Considering (5a) and using the neural network expression, the time-based optimal control can be reformulated as

$$\begin{aligned} u^*(x)&=-\frac{1}{2}g^\mathsf {T}(x) \Big ( \big (\nabla \varphi _{c}(x)\big )^\mathsf {T}\omega _{c}+\nabla \epsilon _{c}(x)\Big ). \end{aligned}$$
(24)

Using \({\hat{\mu }}({\hat{x}}_j)\) in (17a) and \(u^*(x)\) in (24), it follows from \(\omega _c= {\hat{\omega }}_c(t)+{\tilde{\omega }}_c(t)\) that

$$\begin{aligned}&\Vert u^*(x)-{\hat{\mu }}({\hat{x}}_j)\Vert ^2 \nonumber \\ \le&\big \Vert \big [ g^\mathsf {T}({\hat{x}}_j)\big (\nabla \varphi _c({\hat{x}}_j)\big )^\mathsf {T}-g^\mathsf {T}(x)\big (\nabla \varphi _c(x)\big )^\mathsf {T} \big ] {\hat{\omega }}_c(t) \big \Vert ^2\nonumber \\&+\big \Vert g^\mathsf {T}(x)\Big (\big (\nabla \varphi _c(x)\big )^\mathsf {T}{\tilde{\omega }}_c(t)+\nabla \epsilon _c(x)\Big )\big \Vert ^2. \end{aligned}$$

Recalling Assumptions 3 and 5, it yields

$$\begin{aligned}&\big \Vert g^{\mathsf {T}}({\hat{x}}_j)\big (\nabla \varphi _c({\hat{x}}_j)\big )^{\mathsf {T}}-g^{\mathsf {T}}(x)\big (\nabla \varphi _c(x)\big )^{\mathsf {T}}\big \Vert ^2 \nonumber \\ =&\ \big \Vert \big (\nabla \varphi _c({\hat{x}}_j)-\nabla \varphi _c(x)\big )g({\hat{x}}_j)+\nabla \varphi _c(x) \big (g({\hat{x}}_j)-g(x)\big )\big \Vert ^2 \nonumber \\ \le&\ 2M_{\mathcal {L}}\Vert \sigma _j(t)\Vert ^2. \end{aligned}$$

Thus, the following inequality can be obtained

$$\begin{aligned}&\dot{L}_{21}(t) \le -x^{\mathsf {T}}Qx-\Vert {\hat{\mu }}({\hat{x}}_j)\Vert ^2+\iota ^{2}\Vert \hat{\nu }(x)\Vert ^{2}+B_{g}^2B_{\epsilon }^2\nonumber \\&\quad +2M_{\mathcal {L}}\Vert {\hat{\omega }}_c(t)\Vert ^2\Vert \sigma _j(t)\Vert ^2+ B_{g}^2B_{\varphi }^2\Vert {\tilde{\omega }}_c(t)\Vert ^2. \end{aligned}$$
(25)

For the term \(\dot{L}_{23}(t)\), applying Young’s inequality to its second term gives

$$\begin{aligned} \dot{L}_{23}(t) \le&-\bigg (\alpha _c-\frac{1}{2}\bigg ) \lambda _{\min }(\psi _{1} \psi _{1}^{\mathsf {T}})\Vert {\tilde{\omega }}_c(t)\Vert ^2\nonumber \\&+\frac{1}{2}{\alpha _c^2B_{e}^2}, \end{aligned}$$
(26)

where Assumption 5 and the fact \(\psi _{2} \ge 1\) are used. Combining (25) and (26), the overall time derivative of \(L_{2}(t)\) satisfies

$$\begin{aligned} \dot{L}_{2}(t) \le&-x^{\mathsf {T}}Qx+2M_{\mathcal {L}}\Vert {\hat{\omega }}_c(t)\Vert ^2\Vert \sigma _j(t)\Vert ^2-\Vert {\hat{\mu }}({\hat{x}}_j)\Vert ^2 \nonumber \\&+\iota ^{2}\Vert \hat{\nu }(x)\Vert ^{2}+B_{g}^2B_{\epsilon }^2+\frac{1}{2}\alpha _c^2B_{e}^2 \nonumber \\&-\bigg [\bigg (\alpha _c-\frac{1}{2}\bigg ) \lambda _{\min }(\psi _{1} \psi _{1}^{\mathsf {T}})-B_{g}^2B_{\varphi }^2\bigg ]\Vert {\tilde{\omega }}_c(t)\Vert ^2. \end{aligned}$$
(27)

Therefore, if (22) and (23) are satisfied, then \(\dot{L}_{2}(t) < 0\) holds for any \(x\ne 0\) according to (27).

When \(t=s_{j+1}\), an event is triggered. The difference of \(L_{2}(t)\) is expressed as

$$\begin{aligned} \Delta L_{2}(t)&= L_{2}({\hat{x}}_{j+1})-L_{2}\big (x(s_{j+1}^{-})\big )\nonumber \\&=\Delta L_{21}(t)+\Delta L_{22}(t)+\Delta L_{23}(t), \end{aligned}$$

where \(x(s_{j+1}^{-}) = \lim _{\varepsilon \rightarrow 0^{+}} x( s_{j+1}-\varepsilon )\). For all \(t\in [s_j, s_{j+1})\), \(\dot{L}_{2}(t)<0\) has been derived from (22), (23) and (27). Considering that the system states and the cost function are all continuous, we acquire

$$\begin{aligned} \Delta L_{21}(t)= J^*({\hat{x}}_{j+1}) - J^*\big (x(s_{j+1}^{-})\big ) \le 0 \end{aligned}$$

and \(\Delta L_{23}(t) \le 0\), where

$$\begin{aligned} \Delta L_{23}(t)=&\frac{1}{2}\big [{\tilde{\omega }}_c^{\mathsf {T}}({\hat{x}}_{j+1})\tilde{\omega }_c({\hat{x}}_{j+1})- {\tilde{\omega }}_c^{\mathsf {T}}\big (x(s_{j+1}^{-})\big ) \nonumber \\&\times {\tilde{\omega }}_c\big (x(s_{j+1}^{-})\big )\big ]. \end{aligned}$$

Hence, we obtain

$$\begin{aligned} \Delta L_{2}(t)&\le \Delta L_{22}(t)= J^*({\hat{x}}_{j+1})-J^*({\hat{x}}_{j}) \nonumber \\&\le -\mathcal {K} (\Vert \sigma _{j+1}(s_{j})\Vert ), \nonumber \end{aligned}$$

where \(\mathcal {K}(\cdot )\) is a class-\(\mathcal {K}\) function [44] and \(\sigma _{j+1}(s_{j})={\hat{x}}_{j+1}-{\hat{x}}_j\). This shows that \(L_{2}(t)\) is decreasing at the jump instants \(t=s_{j+1}\).

Based on these two cases, with the triggering condition (22) and the uniform ultimate boundedness of the weight error ensured by (23), the closed-loop system (1) is asymptotically stable, which completes the proof. \(\square \)

Remark 2

If we regard the first term of the weight error dynamics (21) as a nominal system, written as \(\dot{{\tilde{\omega }}}_{c\text {n}}(t) = -\alpha _c \psi _{1}\psi _{1}^\mathsf {T}{\tilde{\omega }}_{c\text {n}}(t)\), we can verify that it is exponentially stable. To this end, we choose a Lyapunov function of the form \(L_{c\text {n}}(t)=0.5{\tilde{\omega }}_{c\text {n}}^\mathsf {T}(t){\tilde{\omega }}_{c\text {n}}(t)\) and differentiate it along the nominal dynamics to yield \(\dot{L}_{c\text {n}}(t)= - \alpha _{c}{\tilde{\omega }}_{c\text {n}}^\mathsf {T}(t)\psi _{1}\psi _{1}^\mathsf {T}{\tilde{\omega }}_{c\text {n}}(t) \), which clearly reveals that \(\dot{L}_{c\text {n}}(t) \le 0\) and establishes the stability of the nominal system. Moreover, the solution \({\tilde{\omega }}_{c\text {n}}(t)\) can be given by \({\tilde{\omega }}_{c\text {n}}(t)=\mathcal {T}(t,0){\tilde{\omega }}_{c\text {n}}(0)\), where the state transition matrix satisfies \(\dot{\mathcal {T}}(t,0)=-\alpha _{c}\psi _{1}\psi _{1}^\mathsf {T}\mathcal {T}(t,0)\). Hence, according to [44], there exist two positive constants \(\varsigma _{3}\) and \(\varsigma _{4}\) such that

$$\begin{aligned} \Vert \mathcal {T}(t,0)\Vert \le \varsigma _{3} e^{-\varsigma _{4}t}, \forall t \ge 0. \end{aligned}$$

Under such circumstance, we can derive that

$$\begin{aligned} \Vert {\tilde{\omega }}_{c\text {n}}(t)\Vert \le \Vert \mathcal {T}(t,0)\Vert \Vert {\tilde{\omega }}_{c\text {n}}(0)\Vert \le \varsigma _{3}\Vert {\tilde{\omega }}_{c\text {n}}(0)\Vert e^{-\varsigma _{4}t}. \end{aligned}$$

Thus, for the nominal part of the critic error dynamics (21), the equilibrium point is exponentially stable provided that \(\psi _{1}\) satisfies the persistence of excitation condition. Note that this kind of stability of the nominal system is stronger than the uniform ultimate boundedness of the whole error dynamics established in Theorem 2. Nevertheless, the residual error-related term is indeed indispensable due to the neural network approximation, which eventually results in the weaker stability of the critic error dynamics.

It should be mentioned that although two triggering thresholds \(\sigma _T\) and \({\hat{\sigma }}_T\) are provided in Theorems 1 and 2, respectively, they work in different design stages: \(\sigma _T\) relies on the exact optimal solution, whereas \({\hat{\sigma }}_T\) is employed in the neural network implementation. Overall, the event-based robust adaptive critic control algorithm is summarized in Algorithm 1, with a simulation-style sketch given below.

Algorithm 1 Event-based robust adaptive critic control algorithm
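To make the overall procedure concrete, the following Euler-discretized sketch combines the approximate laws (17), the weight update (20) and the triggering threshold (22). It is a minimal illustration under assumed plant callables, not the exact implementation used in this paper; in particular, the probing noise for persistent excitation is omitted.

```python
import numpy as np

def event_based_critic_learning(f, g, h, grad_phi, Q, iota,
                                x0, w0, alpha_c, M_L, dt=0.1, T=300.0):
    """Sketch of Algorithm 1. w0 should be nonzero so that the
    threshold (22) is well defined."""
    x, x_hat, w = x0.copy(), x0.copy(), w0.copy()
    mu = -0.5 * g(x_hat).T @ (grad_phi(x_hat).T @ w)          # Eq. (17a)
    for _ in range(int(T / dt)):
        nu = 0.5 / iota**2 * h(x).T @ (grad_phi(x).T @ w)     # Eq. (17b)
        sigma = x_hat - x
        thr = (x @ Q @ x + mu @ mu - iota**2 * (nu @ nu)) \
            / (2.0 * M_L * (w @ w))                           # threshold (22)
        if sigma @ sigma > thr:                               # event triggered
            x_hat = x.copy()                                  # resample state
            mu = -0.5 * g(x_hat).T @ (grad_phi(x_hat).T @ w)  # update control
        xdot = f(x) + g(x) @ mu + h(x) @ nu                   # flow dynamics
        psi = grad_phi(x) @ xdot
        U = x @ Q @ x + mu @ mu - iota**2 * (nu @ nu)         # utility
        w = w - dt * alpha_c * psi * (U + psi @ w) \
            / (1.0 + psi @ psi)**2                            # Euler step of (20)
        x = x + dt * xdot
    return w
```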

4 Simulation analysis

In this section, a numerical example is conducted to demonstrate the effectiveness of the event-based nonlinear \(H_{\infty }\) control design. We consider the single-link robot arm described in [7, 40, 47, 48], whose mechanical dynamics are given by

$$\begin{aligned} \bar{G}\ddot{\theta }(t)&=-{M\bar{g}\bar{H}}\sin (\theta (t))-D\dot{\theta }(t)+u(t)+\nu (t), \end{aligned}$$
(28a)
$$\begin{aligned} y(t)&=\theta (t), \end{aligned}$$
(28b)

where \(\theta (t)\) denotes the angle position, u(t) is the control input, and \(\nu (t)\) is the perturbation. \(M=10\) and \(\bar{H}=0.5\) are the mass and the length of the robot arm, respectively, \(\bar{g}=9.81\) is the gravitational acceleration, \(D=2\) is the viscous friction coefficient, and \(\bar{G}=10\) is the moment of inertia.

Define \(x=[x_{1},x_{2}]^\mathsf {T}\) with \(x_{1}=\theta \) and \(x_{2}=\dot{\theta }\), so that the dynamics of system (28) are rewritten as

$$\begin{aligned} \left[ \begin{array}{c} \dot{x}_{1}\\ \dot{x}_{2}\end{array}\right]&=\left[ \begin{array}{c} x_{2}\\ -4.905\sin x_{1}-0.2x_{2} \end{array}\right] +\left[ \begin{array}{c} 0\\ 0.1 \end{array}\right] (u+\nu ), \end{aligned}$$
(29a)
$$\begin{aligned} y&=x_{1}. \end{aligned}$$
(29b)

Obviously, the control and disturbance matrices are constant and hence upper-bounded; for instance, we can choose \(B_{g}=B_{h}=0.1\). The initial state vector of (29) is set as \(x_{0}=[1,-1]^{\mathsf {T}}\), and we choose \(P=2\) so that \(Q=\text {diag}\{2,0\}\). The adaptive critic controller is designed for system (29) in the following.

Fig. 2 Convergence curves of the critic network weights

Fig. 3 Adaptive regulation of the state trajectories

In the simulation, the critic neural network is constructed as

$$\begin{aligned} \hat{J}(x)=\hat{\omega }_{c}^{\mathsf {T}}\varphi _{c}(x)= \hat{\omega }_{c1}x_{1}^{2}+\hat{\omega }_{c2}x_{1}x_{2}+\hat{\omega }_{c3}x_{2}^{2}, \end{aligned}$$

where \(\hat{\omega }_{c}=[\hat{\omega }_{c1},\hat{\omega }_{c2},\hat{\omega }_{c3}]^{\mathsf {T}}\) and \(\varphi _{c}(x)=[x_{1}^{2},x_{1}x_{2},x_{2}^{2}]^{\mathsf {T}}\). Clearly, the derivative of the activation function is a \(3 \times 2\) function matrix of the form

$$\begin{aligned} \nabla \varphi _{c}(x) =\left[ \begin{array}{ccc} 2x_{1} &{} x_{2} &{} 0\\ 0 &{} x_{1} &{} 2x_{2} \end{array}\right] ^{\mathsf {T}}. \end{aligned}$$

The number of hidden-layer neurons is often decided by computer experiment. Any number of hidden neurons could be chosen, but the computational complexity should also be taken into account. In this case study, we find that three hidden neurons lead to satisfactory simulation results; in this sense, the choice of the activation function is more of an art than a science. For tuning the critic network, we experimentally set \(\alpha _{c}=1.2\), \(\iota =2\), and \(M_{\mathcal {L}}=36\). The sampling time in the learning process is selected as 0.1 s. Note that a probing noise is also employed to ensure the persistence of excitation condition during training. The simulation results of the learning stage are shown in Figs. 2, 3 and 4. In Fig. 2, it can be observed that the critic network weight vector converges to \([0.6050, 0.2418, 0.1310]^{\mathsf {T}}\).
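For reproducibility, the following minimal sketch instantiates the plant (29) and the quadratic critic basis; only the converged weight vector reported in Fig. 2 is taken from the results above, and the helper names are illustrative.

```python
import numpy as np

# Plant (29) and the quadratic critic basis used in this case study.
f = lambda x: np.array([x[1], -4.905 * np.sin(x[0]) - 0.2 * x[1]])
g = lambda x: np.array([[0.0], [0.1]])   # constant, hence B_g = 0.1
h = lambda x: np.array([[0.0], [0.1]])   # constant, hence B_h = 0.1
Q = np.diag([2.0, 0.0])                  # Q = C'PC with C = [1, 0], P = 2

phi = lambda x: np.array([x[0]**2, x[0] * x[1], x[1]**2])
grad_phi = lambda x: np.array([[2 * x[0], 0.0],
                               [x[1], x[0]],
                               [0.0, 2 * x[1]]])

# Converged weights from Fig. 2; the learned event-based law (17a) is then
# evaluated at the sampled state, e.g., at the initial state x0 = [1, -1]'.
w = np.array([0.6050, 0.2418, 0.1310])
mu_hat = lambda x_hat: -0.5 * g(x_hat).T @ (grad_phi(x_hat).T @ w)
print(mu_hat(np.array([1.0, -1.0])))
```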

Fig. 4 Adjustment of the triggering condition with the relationship between \(\Vert \sigma _{j}(t)\Vert ^{2}\) and \({\hat{\sigma }}_{T}\)

The adaptive regulation process of the state trajectories and the triggering condition is displayed in Fig. 3, where the system is trained under the persistence of excitation condition and the states are regulated to zero once the excitation signal is stopped. Figure 4 provides the adjustment process of the triggering condition, showing the relationship between \(\Vert \sigma _{j}(t)\Vert ^{2}\) and \({\hat{\sigma }}_{T}\). It can be observed that the time-based controller uses 3000 state samples, while the event-based controller needs only 1501 samples, thereby resulting in an evident reduction of the data transmission.

For the controlled plant (29), the obtained control law is applied for 60 s with the external perturbation \(\nu (t)=5e^{-t}\cos (t)\), \(t > 0 \), to evaluate the robust \(H_{\infty }\) control performance. We set \(M_{u}=5\), and the sampling time is 0.05 s. The simulation results with the \(H_{\infty }\) feedback control are exhibited in Figs. 5, 6, 7 and 8. Specifically, the system state trajectories and the control input trajectory are depicted in Figs. 5 and 6, respectively. Figure 7 shows the adjustment of the triggering condition under the robust \(H_{\infty }\) control.

Fig. 5 System state trajectories under the robust \(H_{\infty }\) control

Fig. 6 Robust \(H_{\infty }\) control input

Fig. 7 Adjustment of the triggering condition under the robust \(H_{\infty }\) control

Fig. 8 Evolution curve of the ratio function

Then, referring to the common definition in [27,28,29], a ratio function \(\bar{\iota }(t)\) is defined as

$$\begin{aligned} \bar{\iota }(t)=\sqrt{{\displaystyle \int _{0}^{t}\big (x^{\mathsf {T}}(\tau )Qx(\tau )+u^{\mathsf {T}}(\tau )u(\tau )\big )\text {d}\tau }\bigg /{\displaystyle \int _{0}^{t}\Vert \nu (\tau )\Vert ^{2}\text {d}\tau }}, \end{aligned}$$

which reflects the disturbance attenuation performance of the \(H_{\infty }\) control design. In Fig. 8, the ratio \(\bar{\iota }(t)\) gradually converges to 1.2440 over time, which implies that the designed \(H_{\infty }\) controller indeed attains the prespecified \(L_2\)-gain performance level (i.e., \(\bar{\iota }(t) < \iota =2\)).
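Numerically, the ratio function can be approximated from the recorded closed-loop trajectories; the sketch below assumes uniformly sampled arrays and illustrative names.

```python
import numpy as np

def attenuation_ratio(x_traj, u_traj, nu_traj, Q, dt):
    """Discrete approximation of the ratio function over [0, t]: the
    trajectory arrays have shapes (N, n), (N, m) and (N, q), sampled
    every dt seconds."""
    num = dt * sum(x @ Q @ x + u @ u for x, u in zip(x_traj, u_traj))
    den = dt * sum(nu @ nu for nu in nu_traj)
    return float(np.sqrt(num / den))
```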

These simulation results substantiate the effectiveness of the event-based robust adaptive critic control strategy with regard to the external disturbance; consequently, the closed-loop system possesses an excellent ability of disturbance rejection.

5 Conclusion

In this paper, the event-based \(H_{\infty }\) feedback control of nonlinear dynamic systems involving output information has been intensively studied under the event-based adaptive critic design framework. The \(H_{\infty }\) control problem of the disturbed nonlinear system was formulated as a two-player zero-sum differential game. The event-based mechanism and the adaptive critic approach were adopted to pursue the Nash equilibrium solution of this game, such that the event-based approximate optimal control law and the time-based worst-case disturbance law were derived through the learning process of the critic network, together with the triggering condition and its related threshold. The stability of the closed-loop system and the uniform ultimate boundedness of the critic weight estimation error were also analyzed. The theoretical results have been well demonstrated through the experimental verification on a single-link robot arm. Along this direction of event-triggered adaptive critic control, some interesting topics can be studied in future work, such as event-triggered approximate optimal tracking control for affine nonlinear systems with unmatched uncertainties and for nonaffine nonlinear systems with uncertainties and unknown dynamics.