1 Introduction

Switched systems are a special class of hybrid systems consisting of multiple subsystems with different dynamics, whose decision variable comprises the continuous control input and the discrete switching signal [1,2,3]. The control input determines how to stabilize the system state, and the switching signal determines which subsystem is active. Switched systems have attracted many researchers for decades due to their widespread applications, such as electrical circuits [4], chemical process control [5], hybrid electric vehicles [6], and sensor fault estimation [7]. As is well known, uncertainties induced by exogenous disturbances widely exist in engineering practice. If these exogenous disturbances are not properly accounted for in the controller design procedure, they will seriously affect the smooth operation of industrial systems. As a result, it is nontrivial to address robust control issues for switched systems [8, 9]. In addition, the actuator saturation phenomenon is also pervasive, due mainly to the actuators' protection facilities or physical limits, and its adverse effects degrade the performance of switched systems and threaten their stability. Thus, it is of tremendous significance to handle controller design problems in the presence of actuator saturation [6]. Nevertheless, in the existing literature, most research on uncertainties or actuator saturation is restricted to general discrete-time systems and has not been extended to discrete-time switched systems in the simultaneous presence of uncertainties and actuator saturation. Filling this gap is, of course, our first motivation.

When it comes to the optimal controller design procedure for discrete-time switched systems, one of the basic ideas is to apply dynamic programming to solve the optimal control problem backward in time. However, the main disadvantage of traditional dynamic programming is that the computation becomes intractable as the problem dimension increases. To circumvent this disadvantage, adaptive dynamic programming (ADP) has been proposed to obtain feasible numerical solutions of Hamilton–Jacobi–Bellman (HJB) equations using neural network approximation in a time-forward manner [10, 11]. Currently, the main ADP structures include heuristic dynamic programming (HDP), dual heuristic programming (DHP), globalized dual heuristic programming, and their action-dependent variants [12,13,14,15,16,17]. Recently, considerable attention has been paid to ADP-based optimal control for switched systems. For instance, a two-stage ADP method has been presented to obtain the hybrid state-feedback control policy for infinite-horizon nonlinear switched systems in [18]. A two-stage DHP method has been used to cope with the constrained optimal control issue for switched systems in [19]. An ADP-based method has been developed for solving the optimal switching control problem with fixed switching time in [20]. However, few results have been reported on robust optimal controller design for discrete-time switched systems subject to exogenous disturbances and actuator saturation, which motivates us to study further.

Generally, the number of alternative switching sequences in the optimal controller design procedure increases exponentially with the number of switching moments, which results in heavy computational and communication burdens. To solve this knotty problem, the event-triggered mechanism (ETM) serves as an effective tool that executes sampling and control operations only when certain triggering conditions are violated. Note that the ETM saves communication resources while leaving the closed-loop stability unaffected [21,22,23, 25]. For this reason, more and more research attention has been dedicated to event-triggered optimal control problems. For instance, several ETM-based optimal control approaches have been proposed for continuous-time nonlinear systems without knowledge of the internal dynamics in [26,27,28]. An event-triggered HDP strategy with input-to-state stability (ISS) analysis has been developed for discrete-time affine nonlinear systems in [29]. The event-triggered DHP approach has been implemented to address optimal control problems with fewer communication resources [30, 31]. Even so, due to the strict conditions required for the ISS property of uncertain systems [32], the event triggering conditions mentioned above cannot be applied to discrete-time uncertain systems, let alone uncertain switched systems subject to actuator saturation. Naturally, the main objective of the current study is to address such a knotty issue.

Summarizing the discussions above, in this paper we endeavor to address the event-triggered robust optimal control problem for discrete-time switched systems in the simultaneous presence of uncertainties and actuator saturation. Taking all these engineering-oriented complexities into account, it is of significance to address this control problem. The main difficulties and challenges are as follows: (1) How to design a reasonable event triggering condition to guarantee that the constrained uncertain switched systems are stable under this ETM? (2) How to tackle the indecomposable perturbation uncertainty existing in the ISS conditions so as to take full account of exogenous disturbances? (3) How to develop an effective approach to obtain the robust optimal hybrid feedback control policy in the event triggering context? To overcome the above challenges, the main contributions of this paper are summarized as follows:

(1) An adaptive event triggering condition is designed for uncertain switched systems with input saturation, which reduces the communication burden while ensuring the underlying system's robustness. Compared with existing works [18, 19, 33], the uncertainties induced by exogenous disturbances are specifically considered in the adaptive ETM design for uncertain switched systems.

(2) The asymptotic stability of switched systems with uncertainties is proved by the Lyapunov approach under the adaptive ETM. Benefiting from the transformation of the indecomposable perturbation uncertainty in the ISS conditions, a more relaxed sufficient condition than that of [32] is given to facilitate the asymptotic stability proof.

(3) A neural network (NN)-based robust optimal control approach is proposed to obtain the optimal hybrid feedback control policy for uncertain switched systems, and the convergence of the proposed approach is analyzed. Compared with [18, 19], the weights of the actor-critic NNs are updated only when the triggering rules are violated, which reduces the computational burden of the designed controller and saves communication resources in the controller-to-actuator network.

The rest of this paper is organized as follows. In Sect. 2, the robust optimal control problem is presented. In Sect. 3, an adaptive event triggering condition is designed, and the corresponding asymptotic stability analysis is given. In Sect. 4, the robust optimal control algorithm is illustrated, and the convergence of the suggested approach is analyzed. In Sect. 5, the implementation of the critic-actor NNs is described. In Sect. 6, two numerical examples are shown to verify the validity of the proposed approach. Finally, the paper is concluded in Sect. 7.

Notations: \(\varOmega _v=\{1,2,\cdots ,M\}\) denotes the subsystem index set, where M is the number of subsystems of the switched system. \(\Vert \cdot \Vert \) denotes the norm of a real number or vector. \({\mathcal {L}}_i=\{1,\cdots ,M^i\}\), where \(i \in \{0, 1, 2,...\}\). For positive integers A and B, mod(A, B) and \(\lfloor A\rfloor \) denote the remainder of A divided by B and the largest integer less than or equal to A, respectively. The gradient operator \(\nabla C(x)\) is defined as \(\nabla C(x)=\frac{\partial C(x)}{\partial x}\), where C(x) is a differentiable function with respect to x.

2 Problem formulation

The following discrete-time nonlinear switched system with matched uncertainty is considered

$$\begin{aligned} x(k+1)=f_{v(k)}(x(k))+g_{v(k)}(x(k))(u_{v(k)}(k)+\varpi (k)) \end{aligned}$$
(1)

where \(x(k)\in {\mathbb {R}}^n\) denotes the system state, and \(u_{v(k)}(k) \in {\mathbb {R}}^m\) denotes the constrained control law with \(|u_{v(k)}^j(k)|\le \bar{u}\), \(\forall j\in \{1,2,\cdots ,m\}\), where \(\bar{u}>0\) is the actuator saturation bound. \(\varpi (k)\) denotes the uncertain disturbance term. \(v(k) \in \varOmega _v=\{1,2,\cdots ,M\}\) denotes the switching signal, where M is the number of subsystems. Besides, \(f_v:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^n\) and \(g_v:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^{n\times m}\) are Lipschitz continuous on the compact set \(\varOmega _x\subset {\mathbb {R}}^n\) containing the origin.

For switched system (1), Assumptions 1 and 2 hold throughout the paper.

Assumption 1

The switched system (1) is controllable, i.e., there exists at least one hybrid control policy sequence \(\varPi _0^\infty \left<u,v\right>\) such that the switched system (1) is stabilized.

Assumption 2

Suppose that the unknown disturbance term \(\varpi (k)\) depends on x(k) and is upper bounded by a differentiable function of x(k). That is, there exist \(\zeta \in {\mathcal {K}}_\infty \) and \(\varUpsilon \in {\mathbb {R}}_+\) such that \(\Vert \varpi (k)\Vert =\zeta (\Vert x(k)\Vert )\le \varUpsilon \Vert x(k)\Vert \), \(\forall k\in {\mathbb {Z}}_{+}\).

The event-triggered constrained state-feedback control law under ETM is defined as

$$\begin{aligned} u_{v(k)}(k)=\mu _{v(k)}(x(k_t)), \quad k\in \left[ k_t,k_{t+1}\right) \end{aligned}$$
(2)

where \(\{k_t\}_{t=0}^\infty \) is the monotonically increasing sequence of triggering instants and \(x(k_t)\) is the sampled state at the triggering instant \(k_t\). Note that \(k_t=0\) when \(t=0\). Next, we define the event-triggered gap between \(x(k_t)\) and x(k) as

$$\begin{aligned} e(k)=x(k_t)-x(k), \quad k\in \left[ k_t,k_{t+1}\right) \end{aligned}$$
(3)

where the last sampled state \(x(k_t)\) is held by a zero-order holder. The event-triggered gap determines the frequency of control and communication. Then, by substituting (3) into (2), the event-triggered constrained control law is represented as

$$\begin{aligned} u_{v(k)}(k)=\mu _{v(k)}(x(k)+e(k)), k\in \left[ k_t,k_{t+1}\right) . \end{aligned}$$
(4)

Hence, the closed-loop switched system (1) could be reformulated as

$$\begin{aligned} x(k+1)= & {} f_{v(k)}(x(k))+g_{v(k)}(x(k))(\mu _{v(k)}(x(k)\nonumber \\&+e(k))+\varpi (k)). \end{aligned}$$
(5)
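For illustration, the following minimal sketch shows how the closed-loop update (5) can be stepped in simulation, with the zero-order holder supplying the last sampled state \(x(k_t)\). The function and argument names are illustrative placeholders of ours, not part of the formulation above:

```python
import numpy as np

def step_closed_loop(x, x_kt, v, f, g, mu, disturbance):
    """One step of the event-triggered closed-loop dynamics (5).

    x: current state x(k); x_kt: sampled state x(k_t) held by the zero-order
    holder; v: active subsystem index; f, g: dicts of subsystem dynamics;
    mu: dict of feedback laws mu_v; disturbance: callable returning varpi(k).
    All of these names are illustrative placeholders.
    """
    u = mu[v](x_kt)                     # control computed from the held sample, cf. (2)
    w = disturbance(x)                  # matched uncertainty varpi(k)
    return f[v](x) + g[v](x) @ (u + w)  # x(k+1) as in (5)
```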

To cope with the matched uncertainty, a special cost function should be devised to evaluate the performance for the switched system. Here, the cost function is defined as

$$\begin{aligned} J(x(k),v(k),u_{v(k)}(k))=\sum _{\ell =k}^{\infty }{\mathcal {M}}(x(\ell ),v(\ell ),u_{v(\ell )}(\ell )) \end{aligned}$$
(6)

with

$$\begin{aligned}&{\mathcal {M}}(x(k),v(k),u_{v(k)}(k))= \rho (\varUpsilon \Vert x(k)\Vert )^2\nonumber \\&\quad +x(k)^{T}{\mathcal {Q}}_{v(k)}x(k)+{\mathcal {W}}(u_{v(k)}(k)) \end{aligned}$$
(7)

where \({\mathcal {M}}\) is the utility function, \(\rho \) is a positive discount factor, and \({\mathcal {Q}}_{v(k)}\in {\mathbb {R}}^{n\times n}\) is the positive semidefinite matrix of the state penalty term. The positive definite nonquadratic functional term \({\mathcal {W}}(u(k))\) is employed to evaluate the impact of the constrained control law on the cost function, and is defined as

$$\begin{aligned} {\mathcal {W}}(u(k))=\int _{0}^{u(k)}\bar{u}(\phi ^{-1}(s/\bar{u}))^TR_{v(k)}ds \end{aligned}$$
(8)

where \(R_{v(k)}=\mathrm {diag}\{r^1_{v(k)},r^2_{v(k)},\cdots ,r^m_{v(k)}\} \in {\mathbb {R}}^{m\times m}\) is a symmetric positive definite matrix, and \(\phi (\cdot )\) is a bounded surjective function satisfying \(|\phi (\cdot )|\le 1\) and belonging to \({\mathcal {C}}^p(p\ge 1)\) and \(L_2(\varOmega )\). Note that \(\phi (s)\in L_2(\varOmega )\) means that \(\big (\int _{\varOmega }\phi ^T(s)\phi (s)ds\big )^{1/2}<\infty \), where \(\int _{\varOmega }\phi ^T(s)\phi (s)ds\) is the Lebesgue integral on \(\varOmega \) [34, 35]. Meanwhile, \(\phi (\cdot )\) is a monotonically increasing nonlinear odd function with its derivative bounded by a scalar \(\phi _m\). Without loss of generality, we choose \(\phi (\cdot ) = \mathrm {tanh}(\cdot )\) in this paper.
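For reference, the scalar (elementwise, diagonal \(R_{v(k)}\)) case of (8) with \(\phi =\tanh \) admits the closed form \(\int _{0}^{u}\bar{u}\,\mathrm{artanh}(s/\bar{u})\,r\,ds=r\bar{u}\big [u\,\mathrm{artanh}(u/\bar{u})+\tfrac{\bar{u}}{2}\ln (1-u^2/\bar{u}^2)\big ]\). A small sketch evaluating (8) this way and cross-checking it against numerical quadrature is given below; the function names are ours, not part of the paper:

```python
import numpy as np
from scipy.integrate import quad

def saturation_cost(u, R_diag, u_bar):
    """Nonquadratic control cost (8) with phi = tanh, diagonal R, elementwise."""
    u = np.asarray(u, dtype=float)
    r = np.asarray(R_diag, dtype=float)
    z = u / u_bar                                    # assumed |u_j| < u_bar
    inner = u * np.arctanh(z) + 0.5 * u_bar * np.log(1.0 - z**2)
    return float(np.sum(r * u_bar * inner))

# cross-check the closed form against numerical quadrature for one component
u_bar, r, u = 0.5, 0.1, 0.3
num, _ = quad(lambda s: u_bar * np.arctanh(s / u_bar) * r, 0.0, u)
assert abs(saturation_cost([u], [r], u_bar) - num) < 1e-6
```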

Remark 1

In practical engineering, matched uncertainty caused by exogenous disturbances exists widely in industrial processes. It is well known that the uncertain term cannot be acquired precisely by the sensors on industrial equipment, which is why the perturbation penalty term cannot be written in terms of \(\varpi (k)\) itself in the cost function. After repeated experiments and tests, the upper bound of \(\varpi (k)\) can be obtained in the form given in Assumption 2. Therefore, in this paper, the penalty term \(\rho (\varUpsilon \Vert x(k)\Vert )^2\), related to the upper bound of the perturbation, is inserted into the cost function to obtain the robust optimal hybrid control policy. A good robust performance can be guaranteed by choosing a reasonable discount factor \(\rho \).

Definition 1

A sequence \(\varPi _0^\infty (u,v)\) is said to be admissible with respect to (6) if it renders the cost function (6) finite for any initial state x(0) and stabilizes the switched system asymptotically.

In this paper, our goal is to seek the robust optimal hybrid feedback control sequence \(\varPi _0^\infty (u,v)\) in line with Definition 1 that minimizes the cost function (6). From the definition of the cost function (6), the value function can be given by

$$\begin{aligned} {\mathcal {V}}(x(k))=\sum _{\ell =k}^{\infty }\{{\mathcal {M}}(x(\ell ),v(\ell ),u_{v(\ell )}(\ell ))\}. \end{aligned}$$
(9)

In light of the Bellman optimality principle [36], the robust optimal value function \({\mathcal {V}}^*(x(k))\) can be obtained as

$$\begin{aligned} ~{\mathcal {V}}^*(x(k))= & {} \underset{v(k),u_{v(k)}(k)}{\min }\{{\mathcal {M}}(x(k),v(k),u_{v(k)}(k))\nonumber \\&+{\mathcal {V}}^*(x(k+1))\}. \end{aligned}$$
(10)

In Sect. 3, the event triggering condition will be derived from the ISS property, and the asymptotic stability will be proved by the Lyapunov approach.

3 Stability analysis of event-triggered switched systems

In this section, considering uncertain discrete-time systems with the ISS property, an event triggering condition is designed to schedule the communication between sensors and controllers. It is worth mentioning that stability is critical for switched systems; if subsystem v cannot be stabilized under the event triggering condition, the ETM becomes meaningless. Therefore, how to employ the Lyapunov method to examine the stability of the closed-loop switched system under the ETM is the main focus of this section.

For closed-loop switched system (1), the event triggering condition is defined as

$$\begin{aligned} \Vert e(k)\Vert \le e_T \end{aligned}$$
(11)

where \(e_T\) denotes the threshold of the ETM, which is adaptively adjusted with the last sampled state and the current state. Only when condition (11) is violated, i.e., an event is triggered, will the robust hybrid control policy be updated by the ADP controller.

Remark 2

It is noteworthy that when condition (11) is violated, the system state information is transmitted from the sensors to the ADP controller, and the robust hybrid control policy is then updated. In other words, the robust optimal hybrid control approach designed in Sect. 4 is activated, the constrained control law is recalculated by the value iteration method, and the switching signal is updated by minimizing the cost function. However, it should be noticed that whether the subsystem is switched or not depends on whether the cost function becomes smaller after switching. Therefore, the switching signal may not change; in other words, the subsystem may not be switched even when an event is triggered, which is also called "quasi-asynchronous switching" [33].

For the sake of focusing on the ETM design, suppose that switched system (1) has a stabilizing control input \(u_{v(k)}(k)\) and that the closed-loop system is ISS with respect to e(k) and \(\varpi (k)\) [32]. Therefore, based on the related works on ISS and robust optimal control for discrete-time nonlinear systems with exogenous disturbance [32, 37, 38, 40], the following assumptions are given before proceeding.

Assumption 3

[32]. For any \(v\in \varOmega _v\), switched system (1) is ISS with e(k) and \(\varpi (k)\) as inputs, and admits an ISS-Lyapunov function \(V:{\mathbb {R}}^n \rightarrow {\mathbb {R}}_+\) such that the following conditions are satisfied:

(1) there exist \({\mathcal {K}}_\infty \) functions \(\underline{\alpha }\) and \(\overline{\alpha }\), such that

    $$\begin{aligned} \underline{\alpha }(\Vert x(k)\Vert )\le V(x(k)) \le \overline{\alpha }(\Vert x(k)\Vert ), \forall x\in {\mathbb {R}}^n; \end{aligned}$$
    (12)
(2) there exist positive constants \(\alpha _1, \gamma _2, \gamma _3\), such that

    $$\begin{aligned} \begin{aligned}&V(f_v(x(k))+g_v(x(k))(\mu _v(x(k)+e(k)) \\&\quad +\varpi (k)))-V(x(k))\\&\quad \le -\alpha _1V(x(k))+\max \{\gamma _2\Vert e(k)\Vert ,\gamma _3\Vert \varpi (k)\Vert \}. \end{aligned} \end{aligned}$$
    (13)

The cyclic-small-gain theorem [41, 42] has been employed to deal with the term \(\max \{\gamma _2\Vert e(k)\Vert ,\gamma _3\Vert \varpi (k)\Vert \}\) in [32]. Loosely speaking, the closed-loop systems in [32] are stable only when the cyclic-small-gain conditions hold, which limits the applicability of the ISS results for discrete-time systems. By contrast, in this paper we do not require the closed-loop systems to satisfy the cyclic-small-gain condition; instead, we use a variable substitution method to tackle this term, which relaxes the stability conditions for closed-loop switched systems in a certain sense.

Here, the variable substitution method is applied to (13) to deal with the term \(\max \{\gamma _2\Vert e(k)\Vert ,\gamma _3\Vert \varpi (k)\Vert \}\). We define \(\kappa (\varrho _1,\varrho _2)\) as a binary function that equals 1 for \(\varrho _1\ge \varrho _2\) and 0 otherwise. With the help of this binary function, the term can be substituted by

$$\begin{aligned}&\max \{\gamma _2\Vert e(k)\Vert ,\gamma _3\Vert \varpi (k)\Vert \}\nonumber \\&\quad :=\kappa \gamma _2\Vert e(k)\Vert +(1-\kappa )\gamma _3\Vert \varpi (k)\Vert \end{aligned}$$
(14)

where \(\kappa \left( \gamma _2\Vert e(k)\Vert ,\gamma _3\Vert \varpi (k)\Vert \right) \) is written as \(\kappa \) for convenience. Defining \(\alpha _2=\kappa \gamma _2\) and \(\alpha _3=(1-\kappa )\gamma _3\), (13) is rewritten as

$$\begin{aligned} \begin{aligned}&V(f_v(x(k))+g_v(x(k))(\mu _v(x(k)+e(k))\\&\qquad +\varpi (k))) -V(x(k))\\&\quad \le -\alpha _1V(x(k))+\alpha _2\Vert e(k)\Vert +\alpha _3\Vert \varpi (k)\Vert .\\ \end{aligned} \end{aligned}$$
(15)

Considering that the term \(\Vert \varpi (k)\Vert \) is upper bounded by \(\varUpsilon \Vert x(k)\Vert \) according to Assumption 2, inequality (15) can be further bounded as

$$\begin{aligned} \begin{aligned}&V(f_v(x(k))+g_v(x(k))(\mu _v(x(k)+e(k)) \\&\qquad +\varpi (k))) -V(x(k))\\&\le -\alpha _1V(x(k))+\alpha _2\Vert e(k)\Vert +\alpha _3\varUpsilon \Vert x(k)\Vert .\\ \end{aligned} \end{aligned}$$
(16)

Assumption 4

[29]. For any \(v\in \varOmega _v\), there exist positive constants \(L_1, L\), such that the following properties hold:

$$\begin{aligned}&\Vert f_v(x(k))+g_v(x(k))(\mu _v(x(k)+e(k))+\varpi (k))\Vert \nonumber \\&\quad \le L\Vert e(k)\Vert +L\Vert x(k)\Vert +L\Vert \varpi (k)\Vert \end{aligned}$$
(17)
$$\begin{aligned}&\underline{\alpha }^{-1}(x(k))\le L_1\Vert x(k)\Vert ,\forall x\in {\mathbb {R}}^n \end{aligned}$$
(18)

where \(\underline{\alpha }\) is the \({\mathcal {K}}_\infty \) function defined in Assumption 3.

According to Assumption 2, \(\varpi (k)\) is upper bounded by \(\varUpsilon \Vert x(k)\Vert \), so inequality (17) can be rewritten as

$$\begin{aligned} \begin{aligned}&~\Vert f_v(x(k))+g_v(x(k))(\mu _v(x(k)+e(k))+\varpi (k))\Vert \\&~\le L\Vert e(k)\Vert +L(1+\varUpsilon )\Vert x(k)\Vert . \end{aligned} \end{aligned}$$
(19)

Assumption 5

For closed-loop system (1), suppose that the next event-triggered error is upper bounded by the next state value in the metric space, i.e., \(\Vert e(k+1)\Vert \le \chi \Vert x(k+1)\Vert \) with \(e(k+1)=x(k_t)-x(k+1)\), for \(k\in \left[ k_t,k_{t+1}\right) \), holds under the adaptive ETM, where \(\chi \ge 1\) is a positive scalar.

Remark 3

Note that (13) is transformed into (15) by the variable substitution for two merits. One merit is that the transformation does not change the conservativeness: it is clear that when \(\kappa \) equals 0 or 1, both sides of (14) are equal. For instance, when \(\gamma _2\Vert e(k)\Vert \) is larger than \(\gamma _3\Vert \varpi (k)\Vert \), both sides of (14) are equal under \(\kappa =1\), and then (13) is equivalent to (15). The other merit is to facilitate the following asymptotic stability analysis by converting the unsimplifiable term \(\max \{\gamma _2\Vert e(k)\Vert ,\gamma _3\Vert \varpi (k)\Vert \}\) into the terms \(\kappa \gamma _2\Vert e(k)\Vert \) and \((1-\kappa )\gamma _3\Vert \varpi (k)\Vert \), which can be split.
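As a simple numerical illustration of the two cases discussed above, the substitution (14) can be checked directly; here a and b stand for \(\gamma _2\Vert e(k)\Vert \) and \(\gamma _3\Vert \varpi (k)\Vert \), and the snippet is only a sanity check, not part of the design:

```python
def kappa(a: float, b: float) -> int:
    """Binary function used in (14): 1 if a >= b, otherwise 0."""
    return 1 if a >= b else 0

def max_via_kappa(a: float, b: float) -> float:
    k = kappa(a, b)
    return k * a + (1 - k) * b   # right-hand side of (14)

for a, b in [(0.3, 0.7), (1.2, 0.5), (0.4, 0.4)]:
    assert max_via_kappa(a, b) == max(a, b)
```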

Lemma 1

Under Assumptions 2–5, the event-triggered gap for the closed-loop switched system satisfies the following condition:

$$\begin{aligned} \Vert e(k)\Vert\le & {} L_\chi (1+\varUpsilon )\frac{1-\left[ L_\chi (2+\varUpsilon )\right] ^{k-k_t}}{1-L_\chi (2+\varUpsilon )}\Vert x(k_t)\Vert , \quad \nonumber \\&k\in \left[ k_t,k_{t+1}\right) . \end{aligned}$$
(20)

Proof

Based on Assumptions 4–5, for each \(k\in \left[ k_t,k_{t+1}\right) \), it is not difficult to see that

$$\begin{aligned} \Vert e(k+1)\Vert \le&\chi \Vert x(k+1)\Vert \nonumber \\ =&\chi \Vert f_v(x(k))+g_v(x(k))(u_v(k)+\varpi (k))\Vert \nonumber \\ \le&L_\chi \Vert e(k)\Vert +L_\chi (1+\varUpsilon )\Vert x(k)\Vert \nonumber \\ \le&L_\chi (2+\varUpsilon )\Vert e(k)\Vert +L_\chi (1+\varUpsilon )\Vert x(k_t)\Vert \end{aligned}$$
(21)

where \(L_\chi =L\chi \) for brevity.

Therefore, by expanding (21), for \(k\in (k_t,k_{t+1})\), it follows that

$$\begin{aligned} \begin{aligned}&\Vert e(k)\Vert \le L_\chi (2+\varUpsilon )\Vert e(k-1)\Vert \\ {}&\quad +L_\chi (1+\varUpsilon )\Vert x(k_t)\Vert \\&\quad \le \left[ L_\chi (2+\varUpsilon )\right] ^2\Vert e(k-2)\Vert \\&\quad +L_\chi ^2(2+\varUpsilon )(1+\varUpsilon )\Vert x(k_t)\Vert \\ {}&\quad +L_\chi (1+\varUpsilon )\Vert x(k_t)\Vert \\&\quad \le \left[ L_\chi (2+\varUpsilon )\right] ^{k-k_t}\Vert e(k_t)\Vert \\&\qquad +\left[ L_\chi (2+\varUpsilon )\right] ^{k-k_t-1}L_\chi (1+\varUpsilon )\Vert x(k_t)\Vert \\&\quad +\left[ L_\chi (2+\varUpsilon )\right] ^{k-k_t-2}L_\chi (1+\varUpsilon )\Vert x(k_t)\Vert +\cdots \\&\qquad +L_\chi (1+\varUpsilon )\Vert x(k_t)\Vert \end{aligned} \end{aligned}$$
(22)

Since \(e(k_t)=0\), the first term vanishes; hence, by the sum formula of a geometric series, one has

$$\begin{aligned} \Vert e(k)\Vert \le L_\chi (1+\varUpsilon )\frac{1-\left[ L_\chi (2+\varUpsilon )\right] ^{k-k_t}}{1-L_\chi (2+\varUpsilon )}\Vert x(k_t)\Vert . \end{aligned}$$

The proof is thus completed. \(\square \)
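In implementation terms, the bound (20) plays the role of the adaptive threshold \(e_T\) in (11): an event is declared as soon as the gap (3) exceeds it. A minimal sketch is given below, assuming \(L_\chi (2+\varUpsilon )\ne 1\); the function names are our own choices for illustration:

```python
import numpy as np

def trigger_threshold(k, k_t, x_kt_norm, L_chi, Upsilon):
    """Adaptive threshold on ||e(k)|| given by the right-hand side of (20)."""
    lam = L_chi * (2.0 + Upsilon)
    return L_chi * (1.0 + Upsilon) * (1.0 - lam**(k - k_t)) / (1.0 - lam) * x_kt_norm

def event_triggered(x_k, x_kt, k, k_t, L_chi, Upsilon):
    """True when the gap (3) violates (20), i.e., a new sample must be transmitted."""
    gap = np.linalg.norm(np.asarray(x_kt, dtype=float) - np.asarray(x_k, dtype=float))
    return gap > trigger_threshold(k, k_t, float(np.linalg.norm(x_kt)), L_chi, Upsilon)
```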

Theorem 1

Consider event triggering condition (20) and Assumptions 3–4, and define \(\lambda =L_\chi (2+\varUpsilon )\) and \(\delta =\alpha _1-\alpha _3L_1\varUpsilon \). If \(0<\lambda <1\), \(0<\delta <1\), and there exists a positive number \(\xi \in (0,\frac{1}{k-k_t})\) such that the following inequality is satisfied, then the switched nonlinear system is asymptotically stable.

$$\begin{aligned} \frac{\alpha _2}{\delta ^2}\le \frac{(1-\lambda )\left[ 1-\xi (k-k_t)\right] }{L_\chi (1+\varUpsilon )L_1}, \quad k\in \left[ k_t,k_{t+1}\right) . \end{aligned}$$
(23)

Proof

Considering whether the event is triggered or not, this proof is divided into two cases.

Case I: The event triggering condition is not violated for \(k\in (k_t, k_{t+1})\). According to inequalities (12) and (18), we have

$$\begin{aligned} \Vert x(k)\Vert \le \underline{\alpha }^{-1}(V(x(k)))\le L_1V(x(k)) \end{aligned}$$
(24)

Substituting (20) into (16) yields

$$\begin{aligned} \begin{aligned}&V(f_v(x(k))+g_v(x(k))(\mu _v(x(k)\\ {}&\qquad +e(k))+\varpi (k)))-V(x(k))\\&\quad \le \alpha _2L_\chi (1+\varUpsilon )\frac{1-\lambda ^{k-k_t}}{1-\lambda }\Vert x(k_t)\Vert \\&\quad -\alpha _1V(x(k))+\alpha _3\varUpsilon \Vert x(k)\Vert . \end{aligned} \end{aligned}$$
(25)

According to (24), we rewrite (25) as

$$\begin{aligned} \begin{aligned}&V(x(k+1))\le (1-\delta )V(x(k))\\&\quad +\alpha _2L_\chi (1+\varUpsilon )\frac{1-\lambda ^{k-k_t}}{1-\lambda }L_1V(x(k_t)). \end{aligned} \end{aligned}$$
(26)

Further, for \(k\in (k_t,k_{t+1})\), it is easy to see from (26) that

$$\begin{aligned} \begin{aligned}&V(x(k))\le (1-\delta )V(x(k-1))\\&\quad +\alpha _2L_\chi (1+\varUpsilon )\frac{1-\lambda ^{k-1-k_t}}{1-\lambda }L_1V(x(k_t)). \end{aligned} \end{aligned}$$
(27)

Substituting (27) into (26), we have

$$\begin{aligned} \begin{aligned}&V(x(k+1))\\&\le (1-\delta )\bigg [(1-\delta )V(x(k-1))\\&\quad +\alpha _2L_\chi (1+\varUpsilon ) \frac{1-\lambda ^{k-1-k_t}}{1-\lambda }L_1V(x(k_t))\bigg ]\\&\quad +\alpha _2L_\chi (1+\varUpsilon )\frac{1-\lambda ^{k-k_t}}{1-\lambda }L_1V(x(k_t)). \end{aligned} \end{aligned}$$
(28)

Hence, expanding (28) yields

$$\begin{aligned} \begin{aligned}&V(x(k))\le (1-\delta )^{k-k_t}V(x(k_t))\\&\quad +\frac{1-(1-\delta )^{k-k_t}}{1-(1-\delta )}\frac{1-\lambda ^{k-k_t}}{1-\lambda } \alpha _2\\ {}&\quad \times L_\chi (1+\varUpsilon )L_1V(x(k_t)). \end{aligned} \end{aligned}$$
(29)

Considering condition (23), we know

$$\begin{aligned} \frac{\alpha _2L_\chi (1+\varUpsilon )}{\delta (1-\lambda )}L_1\le \delta -\delta \xi (k-k_t). \end{aligned}$$
(30)

Since \(\delta \in (0,1)\), we obtain

$$\begin{aligned} \delta \le 1-(1-\delta )^{k-k_t} . \end{aligned}$$
(31)

By virtue of (30) and (31), it follows that

$$\begin{aligned} \frac{\alpha _2L_\chi (1+\varUpsilon )}{\delta (1-\lambda )}L_1\le 1-(1-\delta )^{k-k_t}-\delta \xi (k-k_t). \end{aligned}$$
(32)

Considering \(1-\delta <1\) and \(0<\lambda <1\), then

$$\begin{aligned} \begin{aligned}&1-(1-\delta )^{k-k_t}<1,\\&1-\lambda ^{k-k_t}<1. \end{aligned} \end{aligned}$$
(33)

With the aid of (32) and (33), it is clear that

$$\begin{aligned} \begin{aligned}&\frac{1-(1-\delta )^{k-k_t}}{\delta }\frac{1-\lambda ^{k-k_t}}{1-\lambda }\alpha _2L_\chi (1+\varUpsilon )L_1\\ {}&\quad +(1-\delta )^{k-k_t}\\&\le 1-(1-\delta )^{k-k_t}-\delta \xi (k-k_t)+(1-\delta )^{k-k_t}\\&= 1-\delta \xi (k-k_t). \end{aligned} \end{aligned}$$
(34)

Multiplying both sides of (34) by \(V(x(k_t))\), we obtain

$$\begin{aligned} \begin{aligned}&\frac{1-(1-\delta )^{k-k_t}}{\delta }\frac{1-\lambda ^{k-k_t}}{1-\lambda }\alpha _2L_\chi (1+\varUpsilon )L_1V(x(k_t))\\&\qquad +(1-\delta )^{k-k_t}V(x(k_t))\\&\quad \le V(x(k_t))-\delta \xi (k-k_t)V(x(k_t)) \end{aligned} \end{aligned}$$
(35)

Comparing (29) with (35), we have

$$\begin{aligned} V(x(k))\le V(x(k_t))-\delta \xi (k-k_t)V(x(k_t)), \end{aligned}$$
(36)

with \(k\in (k_t,k_{t+1})\).

Define a function \({\mathcal {H}}\) as

$$\begin{aligned} {\mathcal {H}}(x(k)) = V(x(k_t))-\delta \xi (k-k_t)V(x(k_t)). \end{aligned}$$
(37)

Then

$$\begin{aligned} 0<V(x(k))\le {\mathcal {H}}(x(k)). \end{aligned}$$
(38)

Taking the first-order difference of \({\mathcal {H}}(x(k))\) yields

$$\begin{aligned} \begin{aligned} \varDelta {\mathcal {H}}={\mathcal {H}}(x(k+1))-{\mathcal {H}}(x(k)) =-\delta \xi V(x(k_t)). \end{aligned} \end{aligned}$$
(39)

Thus, in light of (12) and (39), the following inequality holds

$$\begin{aligned} \varDelta {\mathcal {H}}\le -\delta \xi \underline{\alpha }(\Vert x(k_t)\Vert )<0. \end{aligned}$$
(40)

Since (39) and (40) hold, the switched system is asymptotically stable in this case.

Case II: The event triggering condition is violated at \(k=k_t\). At this moment \(e(k)=0\), and (16) is rewritten as

$$\begin{aligned} \begin{aligned}&V(f_v(x(k_t))+g_v(x(k_t))\left( \mu _v(x(k_t))+\varpi (k_t)\right) )\\&\quad -V(x(k_t))\le -\alpha _1V(x(k_t))+\alpha _3\varUpsilon \Vert x(k_t)\Vert . \end{aligned} \end{aligned}$$
(41)

According to (24) and (41), it follows that

$$\begin{aligned} \begin{aligned}&V(f_v(x(k_t))+g_v(x(k_t))\left( \mu _v(x(k_t))+\varpi (k_t)\right) )\\&\le (1-\alpha _1+\alpha _3\varUpsilon L_1)V(x(k_t))\\&=(1-\delta )V(x(k_t)). \end{aligned} \end{aligned}$$
(42)

Define a function \({\mathcal {G}}\) as

$$\begin{aligned} {\mathcal {G}}(x(k_t))=(1-\delta )V(x(k_t)). \end{aligned}$$
(43)

Then

$$\begin{aligned} {\mathcal {G}}(x(k_t+1))=(1-\delta )V(x(k_t+1)). \end{aligned}$$
(44)

With the help of (43) and (44), the first-order difference of \({\mathcal {G}}\) can be derived as

$$\begin{aligned} \begin{aligned} \varDelta {\mathcal {G}}&= {\mathcal {G}}(x(k_t+1))-{\mathcal {G}}(x(k_t))\\&=(1-\delta )(V(x(k_t+1))-V(x(k_t))). \end{aligned} \end{aligned}$$
(45)

On the basis of (42) and (45), it is not difficult to see that

$$\begin{aligned} \begin{aligned} \varDelta {\mathcal {G}}&=(1-\delta )(V(x(k_t+1))-V(x(k_t)))\\&\le -(1-\delta )\delta V(x(k_t)). \end{aligned} \end{aligned}$$
(46)

Hence, by substituting (12) into (46), we have

$$\begin{aligned} \varDelta {\mathcal {G}}&\le -(1-\delta )\delta V(x(k_t))\nonumber \\\quad&\le -(1-\delta )\delta \underline{\alpha }(\Vert x(k_t)\Vert )<0. \end{aligned}$$
(47)

Therefore, the switched system is asymptotically stable in this case.

In conclusion, from the proofs of the two cases, it is obvious that when the sufficient condition in Theorem 1 is satisfied, the switched system is asymptotically stable. The proof is thus completed. \(\square \)

Next, the non-trivial minimum triggering instant will be analyzed to rule out Zeno behavior. According to Theorem 1, it is not difficult to see that events are triggered when

$$\begin{aligned} V(x(k)) = V(x(k_t))-\delta \xi (k-k_t)V(x(k_t)), \end{aligned}$$
(48)

Let \(\varphi (k)=\alpha _2L_1L_\chi (1+\varUpsilon )\frac{1-\lambda ^{k-k_t}}{1-\lambda }\), and from (29), we have

$$\begin{aligned} V(x(k))\le \frac{\big (\delta -\varphi (k)\big )(1-\delta )^{k-k_t}+\varphi (k)}{\delta }V(x(k_t)). \end{aligned}$$

Similar to [43], the minimum \(k=k^*\) is defined as follows:

$$\begin{aligned} k^*=&\arg \underset{k\in {\mathbb {N}}}{\min }\{1-\delta \xi (k-k_t)\}\\ \ge&\frac{\big (\delta -\varphi (k)\big )(1-\delta )^{k-k_t}+\varphi (k)}{\delta }. \end{aligned}$$

As a result, Zeno behavior is avoided when \(k^*>1\), that is, the event triggering rule is non-trivial. Therefore, \(\delta , \xi \), and \(\alpha _2\) should be selected properly to guarantee \(k^*>1\).

Remark 4

It should be noticed that the term \(\max \{\gamma _2\Vert e(k)\Vert ,\gamma _3\Vert \varpi (k)\Vert \}\) in (13) is converted into the terms \(\alpha _2\) and \(\alpha _3\) in (16) to facilitate the proof of asymptotic stability for the closed-loop nonlinear switched system. In what follows, we analyze the rationality of this variable substitution in Theorem 1. First, it is assumed that condition (23) holds for \(\alpha _2=\gamma _2\) and \(\alpha _3=\gamma _3\). Then, observe that the right-hand side of (23) is a positive number and remains unchanged as long as the other parameters such as k and \(k_t\) remain unchanged. Next, we discuss whether condition (23) holds in the two cases \(\kappa =1\) and \(\kappa =0\). i) If \(\kappa =1\), then \(\alpha _2=\gamma _2\), \(\alpha _3=0\); the denominator \(\delta ^2\) on the left-hand side becomes larger while the numerator does not change, so the left-hand side of the inequality becomes smaller and condition (23) is satisfied. ii) If \(\kappa =0\), then \(\alpha _2=0\), \(\alpha _3=\gamma _3\); the denominator on the left-hand side stays unchanged and the numerator equals zero, so the left-hand side equals zero and condition (23) is satisfied. In conclusion, as long as condition (23) holds for \(\alpha _2=\gamma _2\) and \(\alpha _3=\gamma _3\), it holds in all cases discussed above, and the variable substitution is reasonable for closed-loop switched systems with uncertainty.

In this section, we have introduced the event triggering condition and the related ISS properties for closed-loop switched systems with uncertainty, where the variable substitution method is utilized to tackle the uncertain term in the ISS assumption. Then, the asymptotic stability of the closed-loop system has been established under a sufficient condition. It is worth mentioning that the stability analysis here focuses on the ETM applied to the closed-loop system, rather than on the stability of the ETM-based ADP approach itself. In the next section, we will introduce the ETM-based ADP approach and the convergence proof of the proposed algorithm.

4 ETM-based robust optimal hybrid feedback control policy design and convergence analysis

In this section, the ETM-based ADP approach is designed to achieve a robust optimal hybrid feedback control policy. Besides, the convergence is proved for closed-loop switched systems with input saturation constraints.

4.1 ETM-based robust optimal hybrid control policy design for switched systems

For closed-loop switched systems, an ETM-based iterative ADP approach is proposed to obtain both the constrained control law and the switching signal simultaneously. The value function is initialized as \({\mathcal {V}}_0(x(k))={\mathcal {V}}_0^1(x(k))=0\). For \(v\in \varOmega _v\), \(l\in {\mathcal {L}}_i\), \(\forall x(k)\), the constrained control law \(\mu _{i}^{(l,v)}(x(k))\), \(i=0,1,2,\cdots \), can be expressed as

$$\begin{aligned} \begin{aligned}&\mu _{i}^{(l,v)}(x(k))\\&\quad =\mathrm {arg}\min \{{\mathcal {M}}(x(k),v(k),u_v(k))\\&\qquad +{\mathcal {V}}^l_i(x(k+1))\}\\&\quad =-\bar{u}\phi \left( \frac{1}{2\bar{u}}R_v^{-1}g_v^T(x(k))\nabla {\mathcal {V}}_i^{l}(x(k+1))\right) . \end{aligned} \end{aligned}$$
(49)

The constrained control law \(\mu _{i}^{(l,v)}(x(k))\) under the ETM will not be updated during the triggered interval \((k_t,k_{t+1})\). Therefore, the event-triggered control law is defined as

$$\begin{aligned} \begin{aligned}&\mu _{i}^{(l,v)}(x(k_t))\\&\quad =-\bar{u}\phi \left( \frac{1}{2\bar{u}}R_v^{-1}g_v^T(x(k_t))\nabla {\mathcal {V}}_i^{l}(x(k_t+1))\right) . \end{aligned} \end{aligned}$$
(50)

Then the corresponding value function at \((i+1)\)th iteration is represented as

$$\begin{aligned} {\mathcal {V}}_{i+1}^{\hat{l}}(x(k))&={\mathcal {M}}(x(k),v(k),\mu _{i}^{(l,v)}(x(k_t)))\nonumber \\&\quad +{\mathcal {V}}_i^{\hat{l}}(x(k+1)) \end{aligned}$$
(51)

where \(\hat{l}=(l-1)M+v\). Then the iterative robust optimal value function is defined as

$$\begin{aligned} {\mathcal {V}}_{i+1}(x(k))=\underset{\hat{l}}{\min }\{{\mathcal {V}}_{i+1}^{\hat{l}}(x(k))\}. \end{aligned}$$
(52)

Define

$$\begin{aligned} \iota _i(x(k))=\mathrm {arg}\underset{\hat{l}\in {\mathcal {L}}}{\min }\{{\mathcal {V}}_{i+1}^{\hat{l}}(x(k))\} \end{aligned}$$
(53)

then the corresponding control law \(\mu _{i}(x(k))\) and switching signal \(v_i(x(k))\) can be obtained as

$$\begin{aligned} \mu _i(x(k_t))&=\mu _{i}^{(\lfloor \iota _i(x(k))/M\rfloor +1,v_i(x(k)))}(x(k_t)), \end{aligned}$$
(54)
$$\begin{aligned} v_i(x(k))&=\mathrm {mod}(\iota _i(x(k)),M). \end{aligned}$$
(55)
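The selection step (52)–(55) amounts to picking the minimizing candidate index \(\iota _i(x(k))\) and decoding it back into the pair (l, v) through the index map \(\hat{l}=(l-1)M+v\). A small sketch of this decoding is given below; the function name is ours, and the boundary case \(\mathrm {mod}(\hat{l},M)=0\) is mapped back to \(v=M\), consistently with \(\hat{l}=(l-1)M+v\):

```python
import numpy as np

def select_hybrid_policy(V_candidates, M):
    """Pick the minimizing candidate (52)-(53) and recover (l, v), cf. (54)-(55).

    V_candidates[j] holds V_{i+1}^{l_hat}(x(k)) for l_hat = j + 1 (1-based,
    as in the paper). The pair (l, v) is obtained by inverting l_hat = (l-1)*M + v.
    """
    iota = int(np.argmin(np.asarray(V_candidates, dtype=float))) + 1   # (53)
    v = (iota - 1) % M + 1        # switching signal, cf. (55)
    l = (iota - 1) // M + 1       # index of the previous-stage value function, cf. (54)
    return l, v

# toy usage with M = 2 subsystems and four candidate values
l, v = select_hybrid_policy([3.1, 2.4, 5.0, 2.9], M=2)   # l_hat = 2  ->  (l, v) = (1, 2)
```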

4.2 Convergence analysis

The robust optimal hybrid feedback control policy has been obtained via the designed adaptive event-triggered ADP approach. Next, the convergence analysis of the proposed approach will be given as follows.

Theorem 2

For closed-loop switched system (1) and each iteration index \(i\in \{0,1,2,\ldots \}\), assume that the sequence \(\{{\mathcal {V}}_i(x(k))\}\) is generated by (51)–(52) and that the associated hybrid control sequence \(\pi _i=\left<\mu _{i}(x(k_t)),v_i(x(k))\right>\) is obtained by (54)–(55). Then the following two properties hold:

(1) (boundedness) for any \(x(k)\in {\mathbb {R}}^n\), there always exists a state-dependent upper bound \(\breve{{\mathcal {V}}}_i(x(k))\) such that \({\mathcal {V}}_i(x(k))\le \breve{{\mathcal {V}}}_i(x(k))\) holds, \(\forall k\);

(2) (monotonicity) with the initial value function \({\mathcal {V}}_0(x(k))=0, \forall k\), the sequence \(\{{\mathcal {V}}_i(x(k))\}\) is nondecreasing.

Proof

Let \(\breve{\pi }_i=\left<\breve{\mu }_{i}(x(k_t)),\breve{v}_i(x(k))\right>\) be any admissible hybrid feedback control policy. According to the definition of the value function (9), the associated \(\breve{{\mathcal {V}}}_{i+1}(x(k))\) after the ith iteration is given by

$$\begin{aligned} \breve{{\mathcal {V}}}_{i+1}(x(k))&={\mathcal {M}}(x(k),\breve{v}_{i}(k),\breve{\mu }_{i}(x(k_t)))\nonumber \\ {}&\quad +\breve{{\mathcal {V}}}_i(x(k+1)). \end{aligned}$$
(56)

Then, with equations (51)–(52), the associated \({\mathcal {V}}_{i+1}(x(k))\) of the control policy \(\pi _i\) obtained by (54)–(55) after the ith iteration becomes

$$\begin{aligned} {\mathcal {V}}_{i+1}(x(k))={\mathcal {M}}(x(k),v_i(k),\mu _{i}(x(k_t)))+{\mathcal {V}}_{i}(x(k+1)) \end{aligned}$$
(57)

where the hybrid policy \(\pi _i\) minimizes the right-hand side of (56); therefore, \({\mathcal {V}}_{i}(x(k))\le \breve{{\mathcal {V}}}_{i}(x(k))\) for any k. This proves the boundedness of the iterative value function.

Letting \(\breve{\pi }_i=\pi _{i+1}=\left<\mu _{i+1}(x(k_t)),v_{i+1}(x(k))\right>\), equation (56) can be reformulated as

$$\begin{aligned} \breve{{\mathcal {V}}}_{i+1}(x(k))= & {} {\mathcal {M}}(x(k),v_{i+1}(k),\mu _{i+1}(x(k_t)))\nonumber \\&+\breve{{\mathcal {V}}}_i(x(k+1)) \end{aligned}$$
(58)

where \(\breve{{\mathcal {V}}}_{0}(x(k))={\mathcal {V}}_{0}(x(k))=0\).

In what follows, we prove the monotonicity of the sequence \(\{{\mathcal {V}}_i(x(k))\}\). First, we prove \(\breve{{\mathcal {V}}}_i(x(k))\le {\mathcal {V}}_{i+1}(x(k))\) by mathematical induction. When \(i=0\), it is obvious that

$$\begin{aligned}&{\mathcal {V}}_{1}(x(k))-\breve{{\mathcal {V}}}_{0}(x(k))\nonumber \\ {}&\quad ={\mathcal {M}}(x(k),v_{1}(k),\mu _{1}(x(k_t)))\ge 0. \end{aligned}$$
(59)

Suppose that \({\mathcal {V}}_{i}(x(k))-\breve{{\mathcal {V}}}_{i-1}(x(k))\ge 0\); then it is not difficult to see from (57) and (58) that

$$\begin{aligned}&{\mathcal {V}}_{i+1}(x(k))-\breve{{\mathcal {V}}}_{i}(x(k))\nonumber \\&\quad = {\mathcal {V}}_{i}(x(k+1))-\breve{{\mathcal {V}}}_{i-1}(x(k+1))\ge 0. \end{aligned}$$
(60)

Then, by mathematical induction, we conclude that \(\breve{{\mathcal {V}}}_i(x(k))\le {\mathcal {V}}_{i+1}(x(k))\). Therefore, recalling the boundedness of \({\mathcal {V}}_i(x(k))\) (i.e., \({\mathcal {V}}_i(x(k))\le \breve{{\mathcal {V}}}_i(x(k))\)), we derive that \({\mathcal {V}}_i(x(k))\le \breve{{\mathcal {V}}}_i(x(k))\le {\mathcal {V}}_{i+1}(x(k)),\forall k\), and thus the monotonicity is proved. \(\square \)

Theorem 3

The sequence \(\{{\mathcal {V}}_i(x(k))\}\) generated by (51)–(52) converges as \(i\rightarrow \infty \), i.e., \({\mathcal {V}}_{\infty }(x(k))={\mathcal {V}}^*(x(k)),\forall k\), where \({\mathcal {V}}^*(x(k))\) denotes the optimal value function under admissible hybrid control policies.

Proof

Let \(\breve{\pi }_i=\left<\breve{\mu }_{i}(x(k_t)),\breve{v}_i(x(k))\right>\) be any admissible feedback control policy with corresponding iterative value function \(\breve{{\mathcal {V}}}_i(x(k))\). Then we have

$$\begin{aligned} \begin{aligned} \breve{{\mathcal {V}}}_{i+1}(x(k)) =&{\mathcal {M}}(x(k),\breve{v}_{i}(k),\breve{\mu }_{i}(x(k_t)))+\breve{{\mathcal {V}}}_i(x(k+1))\\ =&\sum _{j=i-1}^{i}{\mathcal {M}}(x(k+i-j),\breve{v}_{j}(k+i-j),\\&\quad \breve{\mu }_{j}(x((k+i-j)_t)))+\breve{{\mathcal {V}}}_{i-1}(x(k+2))\\ =&\cdots \\ =&\sum _{j=0}^{i}{\mathcal {M}}(x(k+i-j),\breve{v}_{j}(k+i-j),\\&\quad \breve{\mu }_{j}(x((k+i-j)_t))) \end{aligned} \end{aligned}$$
(61)

When i goes to infinity, the following equation holds:

$$\begin{aligned} \begin{aligned} \underset{i\rightarrow \infty }{\lim }\breve{{\mathcal {V}}}_i(x(k))&= \underset{i\rightarrow \infty }{\lim }\sum _{j=0}^{i}{\mathcal {M}}(x(k+i-j),\\&\quad \breve{v}_{j}(k+i-j),\breve{\mu }_{j}(x((k+i-j)_t))). \end{aligned} \end{aligned}$$
(62)

For the optimal value function \({\mathcal {V}}^*(x(k))\), recalling definitions (9)–(10) yields

$$\begin{aligned} {\mathcal {V}}^*(x(k))=\underset{\breve{\pi }}{\min }\underset{i\rightarrow \infty }{\lim }\breve{{\mathcal {V}}}_i(x(k)). \end{aligned}$$
(63)

From Theorem 2, the boundedness of the iterative sequence \({\mathcal {V}}_i(x(k))\) has been established, i.e., \({\mathcal {V}}_i(x(k))\le \breve{{\mathcal {V}}}_i(x(k)), \forall k.\) Hence, for any state x(k), any admissible policy \(\breve{\pi }\), and \(i\in \{1,2,\cdots \}\), it follows that

$$\begin{aligned} {\mathcal {V}}_\infty (x(k))=\underset{i\rightarrow \infty }{\lim }{\mathcal {V}}_i(x(k))\le \underset{i\rightarrow \infty }{\lim }\breve{{\mathcal {V}}}_i(x(k)). \end{aligned}$$
(64)

Since (64) holds for every admissible policy \(\breve{\pi }\), combining it with (63) yields \({\mathcal {V}}_{\infty }(x(k))\le {\mathcal {V}}^*(x(k))\). On the other hand, in light of optimal control theory and the definition of the optimal value function, \({\mathcal {V}}^*(x(k))\le {\mathcal {V}}_{\infty }(x(k)).\) Combining the two inequalities, it holds that \({\mathcal {V}}^*(x(k))= {\mathcal {V}}_{\infty }(x(k))\).

The convergence analysis of the designed approach is thus completed. \(\square \)

5 Implementation of the event-triggered ADP approach for switched systems

Based on the universal approximation theorem [44] and the Lipschitz continuity of the nonlinear dynamics, the iterative value function (51) and the constrained control law (50) can be approximated via NNs with arbitrary precision. Hence, two NNs need to be established at each iteration: a critic NN used to approximate the value function \({\mathcal {V}}_{i+1}^{\hat{l}}(x(k))\), and an actor NN used to approximate the event-based constrained control law \(\mu _i(x(k_t))\). Finally, the ETM-based robust optimal control algorithm is constructed to implement the proposed ETM-based ADP approach for closed-loop switched systems.

5.1 Critic network

The critic network’s input is the sampled state \(x(k_t)\) and its output can be represented as

$$\begin{aligned} \tilde{{\mathcal {V}}}_{i+1}^{\hat{l}}(x(k_t))=(W_{c(i+1)}^{\hat{l}})^T\varphi (x(k_t)) \end{aligned}$$
(65)

where \(\hat{l}=(l-1)\times M+v\), \(W_{c(i+1)}^{\hat{l}}\) is the weight of the critic NN, \(\varphi \) is the critic’s smooth and differentiable polynomial activation function. The target value function is denoted as

$$\begin{aligned} \begin{aligned} {\mathcal {V}}_{i+1}^{\hat{l}}(x(k_t))=&{\mathcal {M}}(x(k_t),v_i(x(k_t)),\mu _{i}^{(l,v)}(x(k_t)))\\&+\tilde{{\mathcal {V}}}_{i+1}^{\hat{l}}(x(k_t+1)). \end{aligned} \end{aligned}$$
(66)

Define the critic NN’s approximating error function as

$$\begin{aligned} e_{c(i+1)}^{\hat{l}}(x(k))=\tilde{{\mathcal {V}}}_{i+1}^{\hat{l}}(x(k_t))-{\mathcal {V}}_{i+1}^{\hat{l}}(x(k_t)). \end{aligned}$$
(67)

The objective of weight updating in the critic NN is to minimize the following function

$$\begin{aligned} E_{c(i+1)}^{\hat{l}}(x(k))=\frac{1}{2}(e_{c(i+1)}^{\hat{l}}(x(k)))^2. \end{aligned}$$
(68)

By virtue of the gradient descent approach, the weight updating rule of the critic NN can be given by

$$\begin{aligned} \begin{aligned}&W_{c(i+1)}^{\hat{l}}(j+1)\\&\quad =\left\{ \begin{array}{ll} W_{c(i+1)}^{\hat{l}}(j)-\alpha _c\frac{\partial E_{c(i+1)}^{\hat{l}}(x(k))}{\partial W_{c(i+1)}^{\hat{l}}(j)},&{}\quad k=k_t,\\ W_{c(i+1)}^{\hat{l}}(j),&{}\quad k_t<k<k_{t+1}\\ \end{array} \right. \end{aligned} \end{aligned}$$
(69)

where \(\alpha _c\) is the learning rate of the critic network.
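A minimal sketch of the event-gated critic update (65)–(69) for the linear-in-weights critic is given below; the function and variable names are ours, and the target value is assumed to be computed beforehand from (66):

```python
import numpy as np

def critic_update(W_c, phi_xkt, target_V, alpha_c, is_trigger_step):
    """One gradient step on (68) following the event-gated rule (69).

    W_c, phi_xkt: weight and activation vectors of the critic (65);
    target_V: target value from (66); alpha_c: critic learning rate.
    Between triggering instants the weights are simply held.
    """
    if not is_trigger_step:             # k_t < k < k_{t+1}: keep W_c unchanged
        return W_c
    V_tilde = float(W_c @ phi_xkt)      # critic output (65)
    e_c = V_tilde - target_V            # approximation error (67)
    grad = e_c * phi_xkt                # gradient of 0.5 * e_c^2 w.r.t. W_c
    return W_c - alpha_c * grad
```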

5.2 Actor network

The actor NN is constructed to learn the constrained optimal control law. The event-triggered state \(x(k_t)\) is the actor NN’s input, and its output is given by

$$\begin{aligned} \tilde{\mu }_i^{(l,v)}(x(k_t))=(W_{ai}^{\hat{l}})^T\sigma (x(k_t)) \end{aligned}$$
(70)

where \(\hat{l}=(l-1)\times M+v\), \(W_{ai}^{\hat{l}}\) is the actor NN’s weight, \(\sigma \) is the actor’s smooth and differentiable polynomial activation function. The target constrained optimal control law is

$$\begin{aligned}&\mu _i^{(l,v)}(x(k_t))\\ {}&\quad =-\bar{u}\phi \left( \frac{1}{2\bar{u}}R_v^{-1}g_v^T(x(k_t)) \frac{\partial {\mathcal {V}}_i^{l}(x(k_t+1))}{\partial x(k_t+1)}\right) . \end{aligned}$$

The approximating error of the actor NN is defined as

$$\begin{aligned} e_{ai}^{(l,v)}=\tilde{\mu }_i^{(l,v)}(x(k_t))-\mu _i^{(l,v)}(x(k_t)). \end{aligned}$$
(71)

The objective of weight updating in the actor NN is to minimize the following function

$$\begin{aligned} E_{ai}^{(l,v)}=\frac{1}{2}{e_{ai}^{(l,v)}}^Te_{ai}^{(l,v)}. \end{aligned}$$
(72)

By virtue of the gradient descent approach, the weight updating rule of the actor NN is given by

$$\begin{aligned} \begin{aligned}&W_{ai}^{(l,v)}(j+1)\\&\quad =\left\{ \begin{array}{ll} W_{ai}^{(l,v)}(j)-\alpha _a\frac{\partial E_{ai}^{(l,v)}(x(k))}{\partial W_{ai}^{(l,v)}(j)},&{}\quad k=k_t,\\ W_{ai}^{(l,v)}(j),&{}\quad k_t<k<k_{t+1}\\ \end{array} \right. \end{aligned} \end{aligned}$$
(73)

where \(\alpha _a\) is the learning rate of the actor network.
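Analogously, a sketch of the actor target (with \(\phi =\tanh \)) and the event-gated actor update (70)–(73) is shown below; as before, the names are illustrative, and the gradient is written for the linear-in-weights actor (70):

```python
import numpy as np

def actor_target(grad_V_next, g_xkt, R_inv, u_bar):
    """Target constrained control law (50) with phi = tanh."""
    return -u_bar * np.tanh(R_inv @ g_xkt.T @ grad_V_next / (2.0 * u_bar))

def actor_update(W_a, sigma_xkt, u_target, alpha_a, is_trigger_step):
    """One gradient step on (72) following the event-gated rule (73)."""
    if not is_trigger_step:                           # weights held between events
        return W_a
    u_tilde = W_a.T @ sigma_xkt                       # actor output (70)
    e_a = u_tilde - u_target                          # approximation error (71)
    return W_a - alpha_a * np.outer(sigma_xkt, e_a)   # gradient of 0.5 * e_a^T e_a w.r.t. W_a
```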

Remark 5

In this paper, the neural networks adopt the form of a multilayer perceptron with three layers in the actor-critic structure (i.e., an input layer, one hidden layer, and an output layer). A literature survey shows that numerous ADP-related works have adopted this multilayer perceptron form with a polynomial activation function, see, for instance, [11, 45, 46]. Generally speaking, the activation function is chosen to be smooth and differentiable to facilitate computing the gradient with respect to the weight parameters. Therefore, in this paper, the neural network structures adopt the form of a multilayer perceptron with a smooth and differentiable polynomial activation function.

The detailed implementation procedure of the event-triggered robust ADP algorithm for uncertain switched systems is described in Algorithm 1.

Algorithm 1 The event-triggered robust ADP algorithm for uncertain switched systems
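Since Algorithm 1 is rendered as a figure, the following self-contained toy run sketches only its outer loop: the policy update (value iteration plus actor-critic training, abstracted here by a stand-in update_policy) is invoked solely at triggering instants, while the zero-order holder keeps the last control between events. All dynamics, gains, and parameter values below are illustrative and are not those of the examples in Sect. 6:

```python
import numpy as np

def f(x, v):                    # toy scalar subsystem dynamics, v in {1, 2}
    return 0.8 * x if v == 1 else 0.5 * x

def g(x, v):
    return 1.0

def update_policy(x_kt):        # stand-in for (49)-(55) and the NN training (69), (73)
    gain = 0.5
    return (lambda x: -gain * x), (1 if abs(x_kt) > 0.5 else 2)

def threshold(k, k_t, x_kt, L_chi=0.14, Upsilon=1.0):   # right-hand side of (20)
    lam = L_chi * (2 + Upsilon)
    return L_chi * (1 + Upsilon) * (1 - lam**(k - k_t)) / (1 - lam) * abs(x_kt)

x, x_kt, k_t, triggers = -1.5, -1.5, 0, 0
mu, v = update_policy(x_kt)
for k in range(30):
    if abs(x_kt - x) > threshold(k, k_t, x_kt):   # event: condition (20) violated
        x_kt, k_t = x, k
        mu, v = update_policy(x_kt)               # refresh control law and switching signal
        triggers += 1
    u = np.clip(mu(x_kt), -0.5, 0.5)              # ZOH plus the saturation bound |u| <= u_bar
    x = f(x, v) + g(x, v) * u                     # disturbance-free toy plant step
print("events triggered:", triggers)
```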

6 Simulation

In this section, two numerical examples, covering nonlinear and linear cases, are presented to verify that the suggested approach is effective. The first example illustrates the robust optimal control capability for nonlinear switched systems subject to input saturation. The second example demonstrates the procedure of our approach on a linear open-loop unstable switched system.

6.1 Case 1: Nonlinear switched systems

6.1.1 Controlled objective and simulation setup

The following nonlinear switched system, taken from [19, 33], is given by:

$$\begin{aligned} x(k+1)=f_{v}(x(k))+g_{v}(x(k))(u_{v}(k)+\varpi (k)) \end{aligned}$$

where two subsystems are considered, i.e., \(M=2\) and \(v(k) \in \{1,2\}\). The disturbance term \(\varpi (k)=\varepsilon x_2(k)\mathrm{sin}(x_1(k)^2)\mathrm{cos}(x_2(k))\), where \(\varepsilon \) is an unknown random variable with \(\varepsilon \in [-0.5,0.5]\).

In this case, the two subsystem dynamics are given as follows, respectively.

$$\begin{aligned} \begin{aligned} f_1(x(k))&=\begin{bmatrix} f_1^1(x(k))&f_1^2(x(k)) \end{bmatrix}^T,~\\ g_1(x(k))&=\begin{bmatrix} 0&-x_2(k) \end{bmatrix}^T;\\ f_2(x(k))&=\begin{bmatrix} f_2^1(x(k))&f_2^2(x(k)) \end{bmatrix}^T,~\\ g_2(x(k))&=\begin{bmatrix} 0&-x_2(k) \end{bmatrix}^T \end{aligned} \end{aligned}$$

where \(f_1^1(x(k))=-0.8x_2(k)\), \(f_1^2(x(k))=\mathrm{sin}(0.8x_1(k)-x_2(k))+1.8x_2(k)\), \(f_2^1(x(k))=0.5x_1^2(k)x_2(k)\), \(f_2^2(x(k))=x_1(k)+0.8x_2(k)\).

The state of the switched system is initialized as \(x(0)=[-1.5,0.5]^T\). The utility function parameters are selected as \(Q_1=Q_2=I_2\), \(R_1=R_2=0.1\), where \(I_2\) refers to the identity matrix of dimension 2. The control input is constrained by \(|u_v(k)|\le 0.5\). Setting \(\varUpsilon =1, L_\chi =0.14\), the event triggering condition can be computed with (20):

$$\begin{aligned} \Vert e(k)\Vert \le \frac{1-0.42^{k-k_t}}{0.58}0.28\Vert x(k_t)\Vert ,\quad k\in \left[ k_t,k_{t+1}\right) . \end{aligned}$$
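For completeness, the coefficients in this condition follow directly from (20) with the stated values \(\varUpsilon =1\) and \(L_\chi =0.14\); a quick check:

```python
L_chi, Upsilon = 0.14, 1.0
lam = L_chi * (2 + Upsilon)                     # lambda = 0.42
coeff = L_chi * (1 + Upsilon)                   # 0.28
print(round(lam, 2), round(coeff, 2), round(1 - lam, 2))   # 0.42 0.28 0.58
```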

Table 1 summarizes the parameters used in example 1. Next, Algorithm 1 is implemented to obtain the event-triggered robust optimal hybrid control policy. The simulation horizon is \(T=30\) time steps, and the critic and actor networks both have three layers with a 2-9-1 structure. The critic and actor networks share the same activation function: \(\varphi (x)=\sigma (x)=[x_1,x_2,x_1^2,x_2^2,x_1x_2,x_1^3,x_2^3,x_1^2x_2,x_1x_2^2]\). The initial weights of the critic and actor networks are randomly selected from [-1, 1]; the other parameters can be found in Table 1.

Table 1 The parameters used in example 1

6.1.2 Results and discussion section

To describe the effectiveness of the proposed control approach, Fig. 1 shows that the state trajectories converge gradually with time, and Fig. 2 depicts the event-triggered switching signal. Besides, we compare the proposed event-triggered ADP approach with the classical time-triggered ADP method in [18]. In the ETM-based method, the zero-order holder is utilized to maintain the control input of the last triggering instant. In addition, the classical ADP method does not consider the actuator constraint, and thus the controller design and cost function differ from those of our approach. From Fig. 3, it is obvious that the control input of the classical ADP method exceeds the lower limit during \(0-2\) s (so it is reset to the lower limit value), which reduces the control performance and shortens the life of the equipment. In contrast, the proposed approach takes the actuator saturation phenomenon into full consideration and makes the controller operate safely and stably within the upper and lower limits.

To further illustrate the control performance of the proposed approach, we compare the event-triggered optimal cost function with the cost functions induced by random hybrid control policies in Fig. 4. Since there are \(2^k\) possible switching sequences in total, we randomly selected 10 switching sequences for comparison. The results show that, compared with the other switching control strategies, the proposed robust optimal event-triggered hybrid control approach minimizes the cost function, which further verifies the effectiveness of the proposed method.

Figure 5 compares the cumulative numbers of triggers of the event triggering approach and the time triggering method, where we can observe that the proposed ETM-based ADP triggers 15 times in total, 15 fewer than the conventional time triggering method. Figure 6 shows that the event triggering threshold \(\Vert e_T\Vert \) equals zero at the beginning, then increases gradually, and finally approaches zero as the state converges to zero. From the overall numerical simulation results of Case 1, the proposed ETM-based robust ADP algorithm for switched systems can indeed reduce the communication burden and achieve robust optimal control performance.

Fig. 1 State trajectory of Case 1

Fig. 2 Event-triggered switching signal of Case 1

Fig. 3 Comparison of the event-triggered constrained control input and classical time-triggered ADP control input of Case 1

Fig. 4 Control performance under different switching sequences of Case 1

Fig. 5 Triggering number of Case 1

Fig. 6 The event triggering threshold of Case 1

6.2 Case 2: booster converter system

6.2.1 Controlled objective and simulation setup

Fig. 7 A booster converter

Further, a booster converter system as shown in Fig. 7 is adopted to verify the effectiveness of the proposed method. Consider the following system dynamics adapted from [47]:

$$\begin{aligned} \left\{ \begin{aligned} {\dot{i}}_l(t)=&~\frac{R}{L}i_l(t)+(1-v(t))\frac{1}{L}e_c(t)+\frac{1}{L}(e_s(t)+\varpi (t))\\ {\dot{e}}_c(t)=&~(v(t)-1)\frac{1}{C}i_l(t)-\frac{1}{R_0C}e_c(t)\\ \end{aligned} \right. \end{aligned}$$
(74)

where \(i_l\) is the current through the inductor L, \(e_c\) is the voltage of the capacitor C, \(e_s\) is the input voltage and \(\varpi \) is the matched disturbance. The switching signal v(t) is 1 or 2, which refers to the switch that is turned on at time t. When \(S_1\) is turned on, \(S_2\) is turned off, and vice versa.

Denote \(\tau (t)=[i_l(t), e_c(t)]^T\) and \(u(t)=e_s(t)\). With the Euler method and sampling time \(\mathrm {\varDelta T}\), the discrete-time version of the booster converter system is given by

$$\begin{aligned} \tau (k+1)=f_{v(k)}(\tau (k))+g_{v(k)}(u_{v(k)}(k)+\varpi (k)) \end{aligned}$$
(75)

where

$$\begin{aligned} f_1(\tau (k))&=\tau (k)-\mathrm {\varDelta T}\left[ \begin{array}{c} \frac{R}{L}\\ \frac{1}{R_0C} \end{array} \right] \tau (k), ~~\\ f_2(\tau (k))&=\tau (k)-\mathrm {\varDelta T}\left[ \begin{array}{c} \frac{R}{L}\\ \frac{1}{R_0C} \end{array} \right] \tau (k)\\ g_1(\tau (k))&=g_2(\tau (k))=\mathrm {\varDelta T}\left[ \frac{1}{L}\right] . \end{aligned}$$

The parameters in model (75) are selected as \(\mathrm {\varDelta T}= 0.1~\mathrm {s}, R_0=0.2~\mathrm {\varOmega }\), \(L=2~\mathrm {H}\), \(C=4~\mathrm {F}\), and \(R=5~\mathrm {\varOmega }\). The desired state of the booster converter system is given by \(\tau _d=[-2,1]^T\), and the tracking error is defined as \(x(k)=\tau (k)-\tau _{d}\). Therefore, the error dynamics of the booster converter system is

$$\begin{aligned} x(k+1)=F_{v(k)}(x(k))+G_{v(k)}(x_k)(u_{v(k)}(k)+\varpi (k)) \end{aligned}$$
(76)

where

$$\begin{aligned} F_1(x(k))&=x(k)-\mathrm {\varDelta T}\left[ \begin{array}{c} \frac{R}{L}\\ \frac{1}{R_0C} \end{array} \right] (x(k)+\tau _d)\\ F_2(x(k))&=x(k)-\mathrm {\varDelta T}\left[ \begin{array}{c} \frac{R}{L}\\ \frac{1}{R_0C} \end{array} \right] (x(k)+\tau _d)\\ G_1(x(k))&=G_2(x(k))=\mathrm {\varDelta T}\left[ \frac{1}{L}\right] . \end{aligned}$$

The initial state of the switched system is set as \(x(0)=[0,0]^T\). The parameters in the utility function are set as \(Q_1=\mathrm {diag}\{100,200\}, R_1=400;\) \(Q_2=\mathrm {diag}\{200,300\}, R_2=50.\) The control input is constrained by \(|u_v(k)|\le 2\), and \(\varpi (k)=\varepsilon x_1(k)\sin (x_1(k)^2)\) \(\cos (x_2(k))\), where \(\varepsilon \) is an unknown random value selected in [-0.5,0.5]. Selecting \(\varUpsilon =1, L_\chi =0.1\), the event triggering condition is calculated with (20):

$$\begin{aligned} \Vert e(k)\Vert \le \frac{1-0.3^{k-k_t}}{0.7}0.2\Vert x(k_t)\Vert ,\quad k\in \left[ k_t,k_{t+1}\right) . \end{aligned}$$
Table 2 The parameters used in example 2

Table 2 summarizes the parameters used in Case 2. Next, we implement Algorithm 1 to obtain an event-based robust optimal hybrid control policy with a total simulation horizon of \(T=20\) time steps. The critic and actor networks both have a three-layer 2-5-1 structure. The critic and actor NNs use the same activation function \(\varphi (x)=\sigma (x)=[x_1,x_2,x_1^2,x_2^2,x_1x_2]\). The initial weights are selected as random values in [-1,1]; the other parameters can be found in Table 2.

Fig. 8 State trajectory of Case 2

Fig. 9 Event-triggered switching signal of Case 2

6.2.2 Results and discussion section

To illustrate the effectiveness of the proposed control approach, the state curves of the uncertain switched system are described in Fig. 8. It is not difficult to see that the state trajectory goes to zero quickly and then essentially stays at zero. Figure 9 shows the event-based switching signal. To clearly describe the validity of the proposed control approach, we compare it with the classical time-triggered ADP method, using the same network structure and system state parameters for both methods. As can be seen from Fig. 10, the classical ADP method does not consider the actuator saturation phenomenon during 0–0.2 s, and its control input exceeds the upper limit. Due to the constraints of the actuator, we reset the values of the exceeding section to the upper limit. In practice, this is often one of the main reasons for instability and performance degradation of the controlled system. In contrast, our approach takes full account of the saturation constraints and keeps the control input within a safe domain.

Fig. 10 Comparison of the event-triggered constrained control input and classical time-triggered ADP control input of Case 2

Fig. 11 Triggering instants and triggering intervals of Case 2

The triggering instants and triggering intervals of the switched system under the adaptive ETM are depicted in Fig. 11. Moreover, Fig. 12 compares the cumulative numbers of triggers of the proposed event triggering approach and the traditional time triggering method, where we can find that the proposed ETM-based algorithm triggers 6 times in total, 14 fewer than the conventional time triggering method. Figure 13 displays the trajectory of the triggering threshold. From the overall results of Case 2, the proposed ETM-based robust ADP approach can effectively reduce the communication burden and improve computational efficiency.

Fig. 12 Triggering number of Case 2

Fig. 13 The event triggering threshold of Case 2

7 Conclusion

In this paper, an adaptive event-triggered robust optimal control strategy has been proposed for discrete-time nonlinear switched systems with input constraints and uncertainties. First, an event triggering condition has been introduced to determine the triggering instants for closed-loop uncertain switched systems. Then, by re-interpreting the ISS properties of the underlying system in a simpler manner, the adaptive ETM has been designed with the help of the Lyapunov technique. Subsequently, an event-triggered robust optimal control approach based on the ADP methodology has been proposed for constrained uncertain switched systems. In addition, critic and actor networks have been employed to obtain the value function and the constrained control law, respectively. Finally, the validity of the proposed approach has been demonstrated by two numerical simulation examples.