1 Introduction

In many practical systems, such as electric circuits [1], power systems [2], and spacecraft systems [3], slow and fast dynamics coexist and are coupled. These systems can be modeled as two-time-scale systems (TTSSs), and the optimal control objective for TTSSs is to minimize a predefined performance index concerning the control energy, state variables, or other indicators. Such problems have long been an active topic in the control of TTSSs. Addressing them usually requires solving Hamilton–Jacobi–Bellman (HJB) equations, which are hard to solve analytically. As a result, methods that seek an approximate solution have become popular for realizing near-optimal control in practice [4, 5]. Another concern is that applying a traditional optimal control method for single-time-scale systems directly to TTSSs may lead to ill-conditioned numerical problems and high dimensionality. Therefore, several effective methods have been proposed [6,7,8,9,10,11,12,13,14]. For instance, when the fast dynamics decay rapidly, the controller can be designed solely based on the slow dynamics [6]. When the fast dynamics are not globally asymptotically stable, the original problem can be decomposed into two reduced subproblems with separate time scales, followed by the design of a composite control strategy [7,8,9,10,11,12,13]. In [14], the authors proposed a feasible alternative based on a fixed-point iteration method.

The optimal control methods for TTSSs mentioned above [7, 8, 10,11,12, 14] assume that the system’s dynamics are completely known. However, precisely parameterized system models are often hard to obtain in real applications. Motivated by the practical background of industrial thickening and flotation processes, model-free optimal operational [15, 16] and optimal tracking [17] control strategies have been suggested. Using prior model identification, the unknown system dynamics can be estimated through a multi-time-scale dynamic neural network before the near-optimal controller is designed [18, 19]; this approach, however, still relies on accurate modeling. To obtain the optimal solution directly with unknown slow dynamics, an adaptive composite near-optimal control strategy has been designed in the framework of adaptive dynamic programming [20]. Nevertheless, the controllers in [15,16,17,18,19,20] are updated continuously in real time, which poses implementation challenges due to the limitations of physical devices and communication resources.

To overcome the difficulty of continuous updating, the event-triggered mechanism has been proposed so that the controller updates only when necessary [21], enhancing the efficient usage of restricted resources. Event-triggered optimal control (ETOC) offers a tradeoff between performance and resources: sub-optimal performance is accepted when a less resource-demanding controller, such as an event-triggered controller, is selected [22]. Furthermore, the upper bound of the performance index [23, 24], unmatched uncertainties [25], and double-channel communication [26] have also been studied within the scope of ETOC, and the case of nonlinear stochastic systems is considered in [27]. However, related works on ETOC mainly concern single-time-scale systems. For example, considering unknown dynamics, an event-triggered adaptive dynamic programming method has been proposed based on policy and value iteration algorithms [28]. Adaptive neural network (NN) control and adaptive fuzzy control are also valid tools for dealing with nonparametric uncertainties [29, 30]. Additionally, a model-free Q-learning-based algorithm has been designed in which the Q-function is parameterized in terms of the state and the input [31]. Finally, some related works concern the event-triggered mechanism (ETM) for TTSSs [6, 9, 32, 33], but optimality and unknown dynamics have not been considered simultaneously.

Based on the above discussions, ETOC for TTSSs with unknown dynamics is still an open issue to be investigated, as methods for single-time-scale systems [23,24,25, 31, 34] do not apply to TTSSs directly. The main challenge is that the errors of the slow and the fast states are always coupled, which is not the case in single-time-scale systems.

This paper considers unknown slow dynamics and proposes a Q-learning ETOC algorithm for TTSSs. The Q-learning method is an adaptive dynamic programming method using the actor-critic structure. An action-dependent function, which estimates the expected performance of a given action in a given state from the collected input/output data, is used to select the optimal action based on previous observations. A significant advantage of such a strategy is that it does not require an accurate model of the system. The main contribution of this paper is twofold: (1) An event-triggered sub-optimal control scheme is designed for TTSSs, whereas existing event-triggered controllers for TTSSs mainly focus on stabilization [6, 9, 32, 33] and neglect performance indicators during the control process, or employ a continuous-time optimal controller without an event-triggering mechanism [18,19,20]. (2) An online approach is proposed that does not require knowledge of the system’s slow dynamics, and hence needs neither prior offline training [20] nor identification of the unknown dynamics [18, 19]; the optimal controller is approximated directly from the state and input information together with the partially known dynamics.

The remainder of the paper is organized as follows. Section 2 presents the background concerning the decomposition of slow and fast modes and the optimal control theory for TTSSs. Section 3 designs the event-triggered sub-optimal control with unknown slow dynamics and analyzes the stability of the original system. Section 4 demonstrates the efficiency of the proposed strategy using numerical examples, and finally, Sect. 5 concludes this work.

Notation: \(I_{n}\) denotes the identity matrix of dimension n, \(\parallel \!\cdot \!\parallel \) represents the Euclidean norm for vectors or the spectral norm for matrices, \({{\mathcal {R}}}_{+}\) is the set of positive real numbers, and \({{\mathcal {N}}}_{+}\) denotes the set of positive integers. For a matrix \(A\in {{\mathcal {R}}}^{n \times m}\), \(vec(A) =[a^T_1, a^T_2, \dots , a^T_m ]^T\), where \(a_i \in {{\mathcal {R}}}^n\) is the ith column of \(A,\ i=1,2,\dots ,m\). \(tr(A)=\sum _{i = 1}^{n}a_{ii}\) indicates the trace, \( \otimes \) indicates the Kronecker product, \({\bar{\lambda }}(R),\, {\underline{\lambda }}(R)\) denote the maximum and minimum eigenvalues of matrix R, respectively, and \(O(\cdot )\) is the order of magnitude defined in [10]: a vector function \(f(t,\varepsilon )\in {{\mathcal {R}}}^n\) is said to be \(O(\varepsilon )\) over an interval \([t_1,t_2]\) if there exist positive constants k and \(\varepsilon ^*\) such that

$$\begin{aligned} \begin{aligned} \Vert f(t,\varepsilon )\Vert \le k\varepsilon \quad \forall \varepsilon \in [0,\varepsilon ^*],\ \forall t\in [t_1,t_2]. \end{aligned} \end{aligned}$$

2 Problem formulation

Consider the following linear continuous-time TTSSs described by

$$\begin{aligned} {\left\{ \begin{array}{ll} {\dot{x}}(t)=A_{11}x(t)+A_{12}z(t)+B_{1}u(t),\ x(t_{0})=x^{0} \\ \varepsilon {\dot{z}}(t)=A_{21}x(t)+A_{22}z(t)+B_{2}u(t),\ z(t_{0})=z^{0} \\ y(t)=C_1x(t)+C_2z(t) \end{array}\right. } \end{aligned}$$
(1)

where \(x(t) \in {{\mathcal {R}}}^{n_{1}}\) is the slow state vector, \(z(t)\in {{\mathcal {R}}}^{n_{2}}\) is the fast state vector, and \(u(t)\in {{\mathcal {R}}}^{m}\) and \(y(t)\in {{\mathcal {R}}}^{p}\) are the input and output, respectively. \(\varepsilon \) represents a small positive singular perturbation parameter. \(A_{11}\in {{\mathcal {R}}}^{n_{1}\times n_{1}},\ A_{12}\in {{\mathcal {R}}}^{n_{1}\times n_{2}},\ A_{21}\in {{\mathcal {R}}}^{n_{2}\times n_{1}},\ A_{22}\in {{\mathcal {R}}}^{n_{2}\times n_{2}},\ B_{1}\in {{\mathcal {R}}}^{n_{1}\times m},\ B_{2}\in {{\mathcal {R}}}^{n_{2}\times m},\ C_{1}\in {{\mathcal {R}}}^{p\times n_{1}},\ C_{2}\in {{\mathcal {R}}}^{p\times n_{2}}\).

The objective is to adjust u so as to minimize the performance index

$$\begin{aligned} \begin{aligned} J=\frac{1}{2}\int _{0}^{\infty }(y^{T}y+u^{T}Ru)dt, R>0. \end{aligned} \end{aligned}$$
(2)

Let \(A=\begin{bmatrix} A_{11} &{} A_{12}\\ \frac{1}{\varepsilon } A_{21} &{} \frac{1}{\varepsilon }A_{22} \end{bmatrix}, B=\left[ \begin{array}{c} {B_1} \\ \frac{1}{\varepsilon }B_2 \end{array}\right] , C=[C_1, C_2].\) \(X=\left[ \begin{array}{c} x \\ z \end{array}\right] \), and the initial state \(X(t_0)=X_0=\left[ \begin{array}{c} x^0 \\ z^0 \end{array}\right] \). From the optimal control theory, the exact optimal control for the original system (1) with a performance index (2) is

$$\begin{aligned} \begin{aligned} u_{opt}=-R^{-1}B^TPX, \end{aligned} \end{aligned}$$
(3)

where matrix P is the solution of the following Riccati equation

$$\begin{aligned} \begin{aligned} 0=-PA-A^TP+PBR^{-1}B^TP-C^TC, \end{aligned} \end{aligned}$$
(4)

and the corresponding performance index is \(J_{opt}=\frac{1}{2}X_0^TPX_0\).
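Up to a sign, (4) is a standard continuous-time algebraic Riccati equation, so for a moderate \(\varepsilon \) it can in principle be solved with off-the-shelf tools. A minimal sketch (the matrix arguments are placeholders; note that the \(1/\varepsilon \) blocks degrade the conditioning as \(\varepsilon \) decreases):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def full_order_lqr(A11, A12, A21, A22, B1, B2, C1, C2, R, eps):
    """Exact optimal control for (1)-(2): u_opt = K_opt @ X with P from (4).
    The 1/eps blocks make the problem increasingly ill-conditioned as eps -> 0."""
    A = np.block([[A11, A12], [A21 / eps, A22 / eps]])
    B = np.vstack([B1, B2 / eps])
    C = np.hstack([C1, C2])
    # (4) multiplied by -1 is the standard CARE: A^T P + P A - P B R^{-1} B^T P + C^T C = 0
    P = solve_continuous_are(A, B, C.T @ C, R)
    K_opt = -np.linalg.solve(R, B.T @ P)   # u_opt = -R^{-1} B^T P X
    return P, K_opt
```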

However, the traditional method cannot solve (4) reliably because of numerical ill-conditioning. An effective way is to divide the original system into two subsystems with separate time scales and design the controllers separately. Additionally, to avoid wasting communication and computation resources, the controller is only updated at the triggering instants, i.e., \(\{t_x^k\}\) and \(\{t_z^k\}\), \(k \in {{\mathcal {N}}}_{+}\), for the slow and fast subsystems, respectively. The input is written as

$$\begin{aligned} \begin{aligned} u=u_{d}=K_1 {\hat{x}}+K_2 {\hat{z}}, \end{aligned} \end{aligned}$$
(5)

where \(K_1\in {{\mathcal {R}}}^{m\times n_{1}}\), \(K_2\in {{\mathcal {R}}}^{m\times n_{2}}\), and the sampled states \( {\hat{x}},\ {\hat{z}}\) are defined as

$$\begin{aligned} \begin{aligned} {\hat{x}}(t)=x(t_x^{k}), \quad \forall t\in [t_x^{k},t_x^{k+1}) \\ {\hat{z}}(t)=z(t_z^{k}), \quad \forall t\in [t_z^{k},t_z^{k+1}) \end{aligned} \end{aligned}$$

The errors between the sampled states and the real states are defined as

$$\begin{aligned} \begin{aligned} e_1(t)&={\hat{x}}(t)-x(t), \\e_2(t)&={\hat{z}}(t)-z(t). \end{aligned} \end{aligned}$$

For convenience, we denote \(E=(e_1^T \ e_2^T)^T,\ K=(K_1 \ K_2)\).

Assumption 1

\(A_{22}\) is nonsingular.

Under Assumption 1, the following Chang transformation [8] will be used to decouple the slow and the fast modes.

$$\begin{aligned} \begin{aligned} \left[ \begin{array}{c} \xi \\ \eta \end{array}\right] =T \left[ \begin{array}{c} x\\ z \end{array}\right] , \end{aligned} \end{aligned}$$
(6)

where \(T=\begin{bmatrix} I_{n_1}-\varepsilon ML &{} -\varepsilon M\\ L &{} I_{n_2} \end{bmatrix}\), \(\xi \) and \(\eta \) denote the decoupled slow and fast dynamics of the original system, respectively. \(L\in {{\mathcal {R}}}^{n_{2}\times n_{1}}\) and \(M\in {{\mathcal {R}}}^{n_{1}\times n_{2}}\) are the solutions of the following equations,

$$\begin{aligned} \begin{aligned}&\varLambda _{21}+\varepsilon L\varLambda _{11}-\varLambda _{22}L-\varepsilon L\varLambda _{12}L=0,\\&\varepsilon \varLambda _{11}M-\varepsilon \varLambda _{12}LM+\varLambda _{12}-M\varLambda _{22}-\varepsilon ML\varLambda _{12}=0, \end{aligned} \end{aligned}$$

where \(\varLambda _{11}=A_{11}+B_{1}K_{1},\,\varLambda _{12}=A_{12}+B_{1}K_{2},\,\varLambda _{21}=A_{21}+B_{2}K_{1},\,\varLambda _{22}=A_{22}+B_{2}K_{2}\).
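For small \(\varepsilon \), these coupled algebraic equations are commonly solved by a fixed-point iteration seeded with their \(O(\varepsilon ^0)\) approximations. A minimal sketch, assuming Assumption 1 below (nonsingular \(\varLambda _{22}\)-block) and an \(\varepsilon \) small enough for the iteration to converge; the function name and iteration count are illustrative:

```python
import numpy as np

def chang_LM(Lam11, Lam12, Lam21, Lam22, eps, iters=50):
    """Fixed-point iteration for the Chang-transformation matrices L and M.
    Lam_ij are the closed-loop blocks Lambda_ij."""
    L = np.linalg.solve(Lam22, Lam21)            # zeroth-order seed: Lambda_22^{-1} Lambda_21
    M = np.linalg.solve(Lam22.T, Lam12.T).T      # zeroth-order seed: Lambda_12 Lambda_22^{-1}
    for _ in range(iters):
        L = np.linalg.solve(Lam22, Lam21 + eps * L @ Lam11 - eps * L @ Lam12 @ L)
        M = (eps * Lam11 @ M - eps * Lam12 @ L @ M + Lam12
             - eps * M @ L @ Lam12) @ np.linalg.inv(Lam22)
    return L, M
```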

Following the Chang transformation, the dynamics of the original system (1) with the input (5) can be transformed into

$$\begin{aligned} \begin{aligned} \left[ \begin{array}{c} {\dot{\xi }}\\ \varepsilon {\dot{\eta }} \end{array}\right] =&\begin{bmatrix} \varLambda _{11}-\varLambda _{12}L &{} 0\\ 0 &{}\varLambda _{22}+\varepsilon L\varLambda _{12} \end{bmatrix} \left[ \begin{array}{c} \xi \\ \eta \end{array}\right] \\&+ \left[ \begin{array}{c} B_{1}-M(B_2+\varepsilon LB_1)\\ B_{2}+\varepsilon LB_1\end{array}\right] KE. \end{aligned} \end{aligned}$$
(7)

Let

$$\begin{aligned} \begin{aligned} {\hat{\xi }} ={\hat{x}},\ {\hat{\eta }} =L{\hat{x}}+{\hat{z}},\ e_f={\hat{\eta }}-\eta =Le_1+e_2. \end{aligned} \end{aligned}$$
(8)

With transformation (6), the input \(u_d\) can be divided into two parts, \(u_{d}=u_{sd}+u_{fd}\), concerning \({\hat{\xi }}\) and \({\hat{\eta }}\), respectively

$$\begin{aligned} \begin{aligned} u_{sd}=&K_0{\hat{\xi }}=K_0\xi +K_0e_1+O(\varepsilon )\xi +O(\varepsilon )\eta , \\ u_{fd}=&K_2{\hat{\eta }}, \end{aligned} \end{aligned}$$
(9)

where \(K_0\) and \(K_2\) will be designed separately for the slow and fast subsystems, respectively. By substituting \(\xi \) and \(\eta \) with x and z via (6) and comparing the coefficients with (5), we obtain \(K_1=K_0+K_2L\). From [7], \(L=A_{22}^{-1}(A_{21}+B_2K_0)+O(\varepsilon )\). Let \(A_0=A_{11}-A_{12}A_{22}^{-1}A_{21},\ B_0=B_1-A_{12}A_{22}^{-1}B_2,\, \varLambda _{0}=A_0+B_0K_0\). In [9] and [10], it is pointed out that, up to \(O(\varepsilon )\) approximations, \(\varLambda _{11}-\varLambda _{12}L\) can be approximated by \(\varLambda _{0}\) and L can be approximated by \(\varLambda _{22}^{-1}\varLambda _{21}\). Along with (6), the transformed variables \(\xi , \eta \) can be approximated by

$$\begin{aligned} \begin{aligned} \xi (t)&=x(t)+O(\varepsilon )x(t)+O(\varepsilon )z(t), \\ \eta (t)&=z(t)+\varLambda _{22}^{-1}\varLambda _{21}x(t)+O(\varepsilon )x(t). \end{aligned} \end{aligned}$$
(10)

Remark 1

Note that although the off-diagonal blocks of the system matrix in (7) are zero, both errors \(e_1\) and \(e_2\) enter the dynamics of \(\xi \) and of \(\eta \) through the term KE, which poses difficulties in the design of the ETM.

When the inputs are updated continuously, \(E=0\), and if Assumption 1 holds, the transformed system (7) can be approximated by two subsystems. Then, the original optimal control problem of system (1) with performance index (2) can be converted to two subproblems as follows [20].

(1) For the approximated slow subsystem

$$\begin{aligned} \begin{aligned}&{\dot{x}}_s=A_{0}x_s+B_{0}u_s,\\&y_s=C_0x_s+D_0u_s, \end{aligned} \end{aligned}$$
(11)

where \(C_0=C_1-C_2A_{22}^{-1}A_{21}, D_0=-C_2A_{22}^{-1}B_2\). The objective is to adjust \(u_s\) so as to minimize the performance index

$$\begin{aligned} \begin{aligned} J_s&=\int _{t_0}^{\infty }r_s(t)dt\\&=\frac{1}{2}\int _{t_0}^{\infty }(x_s^{T}C_0^TC_0x_s+2x_s^TC_0^TD_0u_s\\&\quad +u_s^TR_0u_s)dt. \end{aligned} \end{aligned}$$

The reward \(r_s(t)\) can also be written in a standard quadratic form

$$\begin{aligned} \begin{aligned} r_s(t) =\frac{1}{2}(x_s^{T}G_sx_s +u_{sc}^{T}R_0u_{sc}), \end{aligned} \end{aligned}$$
(12)

where \(R_0=R+D_0^{T}D_0, \ G_s=C_0^T(I-D_0R_0^{-1}D_0^T) C_0, \ u_{sc}=u_s+R_0^{-1}D_0^TC_0x_s\). Note that \(I-D_0R_0^{-1}D_0^T =(I+D_0R^{-1}D_0^T)^{-1}>0\).

We define the value function as

$$\begin{aligned} \begin{aligned} V_s(x_s(t))=\frac{1}{2}\int _{t}^{\infty }(x_s^{T}G_sx_s +u_{sc}^{T}R_0u_{sc})dt. \end{aligned} \end{aligned}$$
(13)

Let \(A_s=A_0-B_{0}R_0^{-1}D_0^TC_0\). From the optimal control theory [11], if the Riccati equation

$$\begin{aligned} \begin{aligned} A_s^TP_s+P_sA_s-P_sB_0R_0^{-1}B_0^TP_s+G_s=0 \end{aligned} \end{aligned}$$
(14)

has a positive semidefinite stabilizing solution \(P_s\), then the optimal control can be constructed by

$$\begin{aligned} \begin{aligned} u_{sc}^*=-R_0^{-1}B_0^TP_sx_s. \end{aligned} \end{aligned}$$

Accordingly, the optimal control of the slow subsystem (11) is

$$\begin{aligned} \begin{aligned} u_{s}^*=-R_0^{-1}(D_0^TC_0+B_0^TP_s)x_s. \end{aligned} \end{aligned}$$
(15)

Since the system (11) is linear, under the optimal control \(u^*_s\), the optimal value function can be denoted in a quadratic form, that is

$$\begin{aligned} \begin{aligned} V_s^*(x_s(t))=\frac{1}{2}x_s(t)^TP_sx_s(t). \end{aligned} \end{aligned}$$

For the approximated slow subsystem, we define the Hamiltonian function as

$$\begin{aligned} \begin{aligned}&H_s\left( x_s(t),u_{s}(x_s),\frac{\partial V_s^*(x_s)}{\partial x_s(t)}\right) \\&\quad =\frac{\partial {V_s(x_s)}^T}{\partial x_s(t)}{\dot{x}}_s(t)+r_s(t). \end{aligned} \end{aligned}$$
(16)

Under the optimal control \(u^*_{s}(t)\), it holds that

$$\begin{aligned} \begin{aligned} H_s\left( x_s(t),u_{s}^*(x_s),\frac{\partial V_s^*(x_s)}{\partial x_s(t)}\right) =0. \end{aligned} \end{aligned}$$

From the form of \(u_s^*\) in (15), it can be deduced that

$$\begin{aligned} \begin{aligned}&H_s\left( \xi (t),u_{s}^*(\xi ),\frac{\partial V_s^*(\xi )}{\partial \xi (t)}\right) \\&\quad =\frac{1}{2}\xi ^T(P_sA_0+A_0^TP_s+O(\varepsilon ))\xi \\&\qquad +\xi ^TP_sB_0u_{s}^*+\frac{1}{2}{\xi ^T} C_0^TC_0 \xi +\xi ^TC_0^TD_0u_{s}^*\\&\qquad +\frac{1}{2}{u_{s}^*}^T R_0 u_{s}^* =0. \end{aligned} \end{aligned}$$
(17)
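The whole slow design above (model reduction, Riccati equation (14), and gain (15)) can be summarized in the following sketch. It is only an illustration of the computations described in this subsection, assuming Assumptions 1 and 2 below hold; the function name is ours:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def slow_subproblem(A11, A12, A21, A22, B1, B2, C1, C2, R):
    """Reduced slow design (11)-(15): returns the reduced matrices, P_s from (14),
    and the slow gain K_s such that u_s* = K_s @ x_s."""
    A22inv = np.linalg.inv(A22)
    A0 = A11 - A12 @ A22inv @ A21
    B0 = B1 - A12 @ A22inv @ B2
    C0 = C1 - C2 @ A22inv @ A21
    D0 = -C2 @ A22inv @ B2
    R0 = R + D0.T @ D0
    Gs = C0.T @ (np.eye(C0.shape[0]) - D0 @ np.linalg.solve(R0, D0.T)) @ C0
    As = A0 - B0 @ np.linalg.solve(R0, D0.T @ C0)
    Ps = solve_continuous_are(As, B0, Gs, R0)          # Riccati equation (14)
    Ks = -np.linalg.solve(R0, D0.T @ C0 + B0.T @ Ps)   # optimal slow gain, eq. (15)
    return A0, B0, C0, D0, R0, Ps, Ks
```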

(2) For the approximated fast subsystem

$$\begin{aligned} \begin{aligned} \varepsilon {\dot{z}}_f&=A_{22}z_f+B_{2}u_f, \\y_f&=C_{2}z_f. \end{aligned} \end{aligned}$$
(18)

The objective is to adjust \(u_f\) so as to minimize the performance index

$$\begin{aligned} \begin{aligned} J_f=\frac{1}{2}\int _{0}^{\infty }(z_f^{T}G_f z_f+u_f^{T}Ru_f)dt, \end{aligned} \end{aligned}$$

where \(G_f=C_2^TC_2\).

If the Riccati equation

$$\begin{aligned} \begin{aligned} A_{22}^TP_f+P_fA_{22}-P_fB_2R^{-1}B_2^TP_f+G_f=0 \end{aligned} \end{aligned}$$
(19)

has a positive semidefinite stabilizing solution \(P_f\), then the optimal control of the fast subsystem (18) is \(u_{f}^*=-R^{-1}B_2^TP_fz_f\).

Since the system (18) is linear, under the optimal control \(u^*_f\), the optimal value function can be denoted in a quadratic form, that is

$$\begin{aligned} \begin{aligned} V_f^*(z_f(t))=\frac{1}{2}\varepsilon z_f(t)^TP_fz_f(t). \end{aligned} \end{aligned}$$

Similarly, the Hamiltonian function of the fast subsystem is defined as

$$\begin{aligned} \begin{aligned}&H_f\left( z_f(t),u_{f}(z_f),\frac{\partial V_f^*(z_f)}{\partial z_f}\right) \\&\quad =\frac{1}{\varepsilon }\frac{\partial V_f^*}{\partial z_f}{\dot{z}}_f+\frac{1}{2}{u_{f}}^T R u_{f}\\&\qquad +\frac{1}{2}z_f^T G_f z_f. \end{aligned} \end{aligned}$$

Then under the optimal control \(u^*_{f}(\eta )\)

$$\begin{aligned} \begin{aligned}&H_f\left( \eta (t),u_{f}^*(\eta ),\frac{\partial V_f^*(\eta )}{\partial \eta (t)}\right) \\&\quad =\frac{1}{2} \eta ^T(P_fA_{22} +A_{22}^TP_f +G_f\\&\qquad +O(\varepsilon ))\eta +\eta ^T P_f B_2u_f^*+\frac{1}{2}{u_f^*}^TRu_f^* =0. \end{aligned} \end{aligned}$$
(20)

With \(P_s\) and \(P_f\) obtained by solving the above two subproblems, the optimal composite input of the original system (1) with the performance index (2), acting on the actual states x and z, is

$$\begin{aligned} \begin{aligned}&u_{c}^*=-[(I-R^{-1}B_2^TP_fA_{22}^{-1}B_2)R_0^{-1}(D_0^TC_0\\&\quad +B_0^TP_s) +R^{-1}B_2^TP_fA_{22}^{-1}A_{21}]x-R^{-1}B_2^TP_fz. \end{aligned} \end{aligned}$$
(21)
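A sketch of the fast design and of the assembly of the composite gain in (21), reusing the quantities returned by the slow-subproblem routine above (again only illustrative, under the same assumptions):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def composite_gains(A21, A22, B2, C2, R, B0, C0, D0, R0, Ps):
    """Fast Riccati (19) and composite control (21): u_c* = Kx @ x + Kz @ z."""
    Gf = C2.T @ C2
    Pf = solve_continuous_are(A22, B2, Gf, R)          # Riccati equation (19)
    Kz = -np.linalg.solve(R, B2.T @ Pf)                # fast gain K_2
    A22inv = np.linalg.inv(A22)
    Ks_mag = np.linalg.solve(R0, D0.T @ C0 + B0.T @ Ps)        # R0^{-1}(D0^T C0 + B0^T Ps)
    Kx = -((np.eye(R.shape[0]) - np.linalg.solve(R, B2.T @ Pf @ A22inv @ B2)) @ Ks_mag
           + np.linalg.solve(R, B2.T @ Pf @ A22inv @ A21))
    return Pf, Kx, Kz
```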

Assumption 2

The triples \((A_0,B_0,C_0)\) and \((A_{22},\, B_2, C_2)\) are stabilizable-detectable.

Lemma 1

[7] For the dynamics of the transformed system (7) with \(E=0\) and approximated by the reduced-order subsystems (11) and (18), the approximated states satisfy \(\xi (t)=x_s(t)+O(\varepsilon )x+O(\varepsilon )z, \ \eta (t)=z_f(t)+O(\varepsilon )x\). If Assumptions 1 and 2 hold, then (14) and (19) have unique positive semi-definite solutions \(P_s\) and \(P_f\), respectively. The composite feedback control \(u_c^*\) in (21) is \(O(\varepsilon )\)-close to \(u_{opt}\) defined in (3). Further, \(J_c\), the value of the performance index obtained from (2) with the controller \(u_c^*\), satisfies

$$\begin{aligned} \begin{aligned} J_c=J_{opt}+O(\varepsilon ^2). \end{aligned} \end{aligned}$$

\(\square \)

A brief proof of Lemma 1 is presented in the Appendix.

From Lemma 1 and taking the triggering mechanism into account, the event-triggered optimal control for the slow subsystem \(u_{sd}^*\) is, according to (15),

$$\begin{aligned} \begin{aligned} u_{sd}^*(\xi )=-R_0^{-1}(D_0^TC_0+B_0^TP_s){\hat{\xi }}, \end{aligned} \end{aligned}$$
(22)

and the optimal control for the fast subsystem \( u_{fd}^*\) is

$$\begin{aligned} \begin{aligned} u_{fd}^*(\eta )=-R^{-1}B_2^TP_f{\hat{\eta }}. \end{aligned} \end{aligned}$$
(23)

Hence, the event-triggered sub-optimal composite control is

$$\begin{aligned} \begin{aligned} u_{d}^*=u_{sd}^*(\xi )+u_{fd}^*(\eta ). \end{aligned} \end{aligned}$$
(24)

For consistency with the standard quadratic form, we define \(u_{sdc}\) for the slow subsystem, which is an adaptation of \(u_{sc}\) in (12), as follows

$$\begin{aligned} \begin{aligned} u_{sdc}(\xi )=u_{sd}+R_0^{-1}D_0^TC_0\xi . \end{aligned} \end{aligned}$$

3 Main results

In this part, we consider the situation where the slow dynamics are unknown, i.e., \(A_{11},A_{12}, B_1\) in (1) are unknown. Consequently, the solution of the Riccati equation (14) cannot be obtained directly. To deal with this problem, a Q-learning method based on available data is proposed.

For convenience, denote \(B_0'=B_{1}-M(B_2+\varepsilon LB_1)\). Referring to the time-triggered Hamiltonian function (16), (7) and their approximations, the event-triggered Hamiltonian functions are rewritten as follows.

$$\begin{aligned} \begin{aligned}&H_s\left( \xi (t),u_{sd}(\xi ),\frac{\partial V_s^*(\xi )}{\partial \xi (t)},KE\right) \\&=\frac{\partial {V_s^*}^T(\xi )}{\partial \xi }{\dot{\xi }} +\frac{1}{2}\xi ^{T}C_0^TC_0\xi \\&\qquad +\xi ^TC_0^TD_0u_{sd}+\frac{1}{2}u_{sd}^TR_0u_{sd}\\&\quad =\frac{1}{2}\xi ^T(P_sA_0+A_0^TP_s+O(\varepsilon ))\xi +\xi ^TP_s(B_0K_0\xi \\&\qquad +B_0'KE)+\frac{1}{2}\xi ^{T}C_0^TC_0\xi \\&\qquad +\xi ^TC_0^TD_0u_{sd}+\frac{1}{2}u_{sd}^TR_0u_{sd}. \end{aligned} \end{aligned}$$

Combining (9) leads to

$$\begin{aligned}&H_s\left( \xi (t),u_{sd}(\xi ),\frac{\partial V_s^*(\xi )}{\partial \xi (t)},KE\right) \nonumber \\= & {} \frac{1}{2}\xi ^T(P_sA_0+A_0^TP_s +C_0^TC_0+O(\varepsilon ))\xi \nonumber \\&+\xi ^T(P_sB_0+C_0^TD_0)u_{sd} +\xi ^TP_sB_0'KE\nonumber \\&-\xi ^TP_sB_0K_0e_1+\frac{1}{2}u_{sd}^T R_0 u_{sd}\nonumber \\&+O(\varepsilon )\xi ^TP_sB_0M\eta . \end{aligned}$$
(25)

Similarly,

$$\begin{aligned} \begin{aligned}&H_f\left( \eta (t),u_{fd}(\eta ),\frac{\partial V_f^*(\eta )}{\partial \eta (t)},KE\right) \\&\quad =\frac{\partial V_f^*(\eta )}{\partial \eta }((A_{22} +B_2K_2 +\varepsilon L \varLambda _{12})\eta +(B_2\\&\qquad +\varepsilon LB_1)KE) +\frac{1}{2}{u_{fd}}^T R u_{fd}+\frac{1}{2}\eta ^T G_f \eta \\&\quad =\frac{1}{2} \eta ^T(P_fA_{22} +A_{22}^TP_f+G_f+O(\varepsilon ))\eta \\&\qquad +\eta ^T P_f B_2u_{fd}+\eta ^T P_fB_2K_1e_1\\&\qquad -\eta ^TP_fB_2K_2Le_1 +\frac{1}{2}{u_{fd}}^TRu_{fd}. \end{aligned} \end{aligned}$$

With the derivation results in (25), define a Q function

$$\begin{aligned} \begin{aligned}&Q(\xi ,u_{sd},KE)\\&\quad =V_s^*(\xi )+H_s(\xi ,u_{sd},\frac{\partial V_s^*}{\partial \xi },KE)\\&\qquad -H_s(\xi ,u^*_{s},\frac{\partial V_s^*}{\partial \xi })\\&\quad =\frac{1}{2}\xi ^T(P_s+P_sA_0+A_0^TP_s+C_0^TC_0+O(\varepsilon ))\xi \\&\qquad +\xi ^T(P_sB_0+C_0^TD_0)u_{sd}+\xi ^TP_sB_0'KE \\&\qquad -\xi ^TP_sB_0K_0e_1+\frac{1}{2}{u_{sd}}^T R_0 u_{sd}. \end{aligned}\nonumber \\ \end{aligned}$$
(26)

Let

$$\begin{aligned} \begin{aligned} \varPhi _s=&[\frac{1}{2}\xi ^T\otimes \xi ^T,u_{sd}^T\otimes \xi ^T, KE^T\otimes \xi ^T, \\&-(K_0 e_1)^T\otimes \xi ^T],\\ W_c =&{\left[ \begin{array}{c} vec(P_s+P_sA_0+A_0^TP_s+C_0^TC_0+O(\varepsilon )) \\ vec(P_sB_0+C_0^TD_0)\\ vec(P_sB_0')\\ vec(P_sB_0) \end{array}\right] }. \end{aligned} \end{aligned}$$

Then (26) can be written as

$$\begin{aligned} \begin{aligned} Q(\xi ,u_{sd},KE)=\varPhi _s W_{c}+\frac{1}{2}u_{sd}^TR_0u_{sd}. \end{aligned} \end{aligned}$$
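In implementation, the regressor can be assembled directly with Kronecker products, using the identity \(x^TAy=vec(A)^T(y\otimes x)\). A minimal sketch, assuming all arguments are 1-D numpy arrays and with \(K_0e_1\) passed as a precomputed vector:

```python
import numpy as np

def phi_s(xi, u_sd, KE, K0e1):
    """Regressor Phi_s such that Q = Phi_s @ W_c + 0.5 * u_sd^T R_0 u_sd.
    Uses x^T A y = vec(A)^T (y kron x); K0e1 = K_0 @ e_1."""
    return np.concatenate([0.5 * np.kron(xi, xi),
                           np.kron(u_sd, xi),
                           np.kron(KE, xi),
                           -np.kron(K0e1, xi)])
```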

Since the vector \(W_c\) is an unknown constant vector, an approximate critic weight vector \({\hat{W}}_c\in {{\mathcal {R}}}^{n_{1}^2+3mn_1}\) is utilized. Accordingly, we define the following function \({\hat{Q}}\) to estimate Q.

$$\begin{aligned} \begin{aligned} {\hat{Q}}(x,u_{sd})=\varPhi _s{\hat{W}}_{c}+\frac{1}{2}u_{sd}^TR_0u_{sd}. \end{aligned} \end{aligned}$$

Similarly, the actor weight \({\hat{W}}_a \in {{\mathcal {R}}}^{n_{1}}\) is introduced to construct the controller

$$\begin{aligned} \begin{aligned} u_{sd}={\hat{W}}_{a}^T{\hat{\xi }}(t), \end{aligned} \end{aligned}$$
(27)

The tuning laws of \({\hat{W}}_{c}\) and \({\hat{W}}_{a}\) will be introduced later.

Remark 2

Note that all the unknown parameters are collected in \(W_c\), so \(Q(\xi ,u_{sd},KE)\) separates into two parts, i.e., unknown parameters and available information. Additionally, the second block of \(W_c\), \(vec(P_sB_0+C_0^TD_0)\), contains all the unknown information needed in the optimal controller (22).

The overall controller is

$$\begin{aligned} \begin{aligned} u_d={\hat{W}}_{a}^T{\hat{\xi }}(t)+K_2{\hat{\eta }}. \end{aligned} \end{aligned}$$
(28)

As for the fast subsystem, the optimal controller, i.e., \(K_2=-R^{-1}B_2^TP_f\), is chosen.

According to the performance index defined in (12) and the Hamiltonian function (17), under the optimal control (24) it can be obtained that

$$\begin{aligned} \begin{aligned}&Q(\xi (t),u_{sd}^*)\\&\quad =Q(\xi (t-T),u_{sd}^*)-\frac{1}{2}\int ^t_{t-T}(\xi ^{T}C_0^TC_0\xi \\&\qquad +2\xi ^TC_0^TD_0u_s+u_s^TR_0u_s)d\tau , \end{aligned} \end{aligned}$$

where T denotes a small fixed time interval.

Define the error related to the critic network as

$$\begin{aligned} \begin{aligned} e_{c}&={\hat{Q}}(x(t),{\hat{u}}_{sd}(t))-{\hat{Q}}(x(t-T),{\hat{u}}_{sd}(t-T))\\&\quad +\frac{1}{2}\int ^t_{t-T}(\xi ^TG_s\xi +{u}_{sdc}^TR_0{u}_{sdc})d\tau \\&={\hat{W}}_c^T(\varPhi _s(t)-\varPhi _s(t-T))\\&\quad +\frac{1}{2}\int ^t_{t-T}(\xi ^TG_s\xi +{u}_{sdc}^TR_0{u}_{sdc})d\tau . \end{aligned} \end{aligned}$$

Define the error related to the actor as

$$\begin{aligned} \begin{aligned} e_{a}={\hat{W}}_{a}^T \xi (t_x^k)+R_0^{-1}{\hat{W}}_{cp}^T\xi (t_x^k), \end{aligned} \end{aligned}$$
(29)

where \({\hat{W}}_{cp}={\hat{W}}_c(n_1^2+1:n_1^2+mn_1)\), i.e., the block of \({\hat{W}}_c\) that estimates \(vec(P_sB_0+C_0^TD_0)\). Let \(\sigma =\varPhi _s(t)-\varPhi _s(t-T)\), \(K_c=\frac{1}{2}\Vert e_c\Vert ^2\), \(K_a=\frac{1}{2}\Vert e_a\Vert ^2\).
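In implementation, \(e_c\) and \(e_a\) are computed from sampled data only. A sketch, assuming the running-cost integral over \([t-T,t]\) has been accumulated numerically along the trajectory and that the reshape of \({\hat{W}}_{cp}\) follows the column-stacking convention of \(vec(\cdot )\):

```python
import numpy as np

def critic_actor_errors(Wc_hat, Wa_hat, phi_t, phi_tmT, cost_int, xi_tk, R0, n1, m):
    """Critic error e_c and actor error e_a (eq. (29)).
    cost_int = 0.5*int_{t-T}^{t}(xi^T G_s xi + u_sdc^T R_0 u_sdc) dtau (precomputed);
    Wc_hat is the critic weight vector, Wa_hat the n1 x m actor weight matrix."""
    sigma = phi_t - phi_tmT
    e_c = Wc_hat @ sigma + cost_int
    # block of the critic weights estimating vec(P_s B_0 + C_0^T D_0), reshaped column-wise
    Wcp_hat = Wc_hat[n1**2:n1**2 + m * n1].reshape((n1, m), order='F')
    e_a = Wa_hat.T @ xi_tk + np.linalg.solve(R0, Wcp_hat.T @ xi_tk)
    return sigma, e_c, e_a
```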

The tuning law for the weight of the critic network is designed as

$$\begin{aligned} \begin{aligned} \dot{{\hat{W}}}_c&=-\alpha _{c} \frac{1}{(1+\sigma ^T\sigma )^2}\frac{\partial K_c}{\partial {\hat{W}}_c}\\&=-\alpha _{c} \frac{\sigma }{(1+\sigma ^T\sigma )^2}e_c^T, \end{aligned} \end{aligned}$$
(30)

where the convergence speed can be adjusted through \(\alpha _{c}\in {{\mathcal {R}}}_+\).

Due to the event-triggered mechanism, the controller only updates at triggering instants. The tuning law for the weight of the actor is designed as

$$\begin{aligned} \begin{aligned} \dot{{\hat{W}}}_a=&0, \\{\hat{W}}_a^{+}=&{\hat{W}}_a-\alpha _a \frac{1}{(1+\xi (t)^T\xi (t))}\frac{\partial K_a}{\partial {\hat{W}}_a}, \\=&{\hat{W}}_a-\alpha _a \frac{\xi }{(1+\xi (t)^T\xi (t))}e_a^T \quad t=t_x^k, \end{aligned} \end{aligned}$$
(31)

where \(\alpha _{a}\in {{\mathcal {R}}}_+\) determines the convergence speed.
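A sketch of the two update laws follows: the critic weights evolve along the gradient flow (30), integrated here with a simple Euler step of length dt, while the actor weights jump only at the slow triggering instants according to (31). The step size dt is an implementation choice, not a quantity from the text:

```python
import numpy as np

def critic_step(Wc_hat, sigma, e_c, alpha_c, dt):
    """One Euler step of the continuous critic law (30); e_c is a scalar."""
    return Wc_hat - dt * alpha_c * sigma * e_c / (1.0 + sigma @ sigma) ** 2

def actor_jump(Wa_hat, xi, e_a, alpha_a):
    """Actor update (31), applied only at the slow triggering instants t_x^k."""
    return Wa_hat - alpha_a * np.outer(xi, e_a) / (1.0 + xi @ xi)
```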

Let \(V_1(\xi )=\frac{1}{2}\xi ^{T} P_s \xi , \ V_2(\eta )=\frac{1}{2}\varepsilon \eta ^T P_f \eta \),

$$\begin{aligned} \begin{aligned} {\dot{V}}_1(\xi )=&\frac{1}{2}{\dot{\xi }}^{T}P_s\xi +\frac{1}{2}\xi ^{T} P_s {\dot{\xi }}\\=&\frac{1}{2}\xi ^T(P_sA_0+A_0^TP_s+O(\varepsilon ))\xi \\&+\xi ^TP_sB_0K_0\xi +\xi ^TP_sB_0'KE. \end{aligned} \end{aligned}$$

Applying (17) to replace \(P_sA_0+A_0^TP_s\) yields

$$\begin{aligned} \begin{aligned} {\dot{V}}_1(\xi )&=-\frac{1}{2}\xi ^{T}(C_0^TC_0+O(\varepsilon ))\xi -\xi ^T(P_sB_0\\&\quad +C_0^TD_0 +O(\varepsilon ))u_s^*-\frac{1}{2}{u_s^*}^{T}R_0u_s^*\\&\quad +\xi ^TP_sB_0K_0\xi +\xi ^TP_sB_0'KE. \end{aligned} \end{aligned}$$

From (9), (12) and the equation \(\xi ^TP_sB_0=-{u_{sc}^*}^T R_0\), it can be rewritten as

$$\begin{aligned} \begin{aligned} {\dot{V}}_1(\xi )&=-\frac{1}{2}\xi ^T(G_s+O(\varepsilon ))\xi -\frac{1}{2}{u_{sc}^*}^TR_0u_{sc}^*\\&\quad -\xi ^TP_sB_0u_s^*+\xi ^TP_sB_0(u_{sd}-K_0e_1\\&\quad +O(\varepsilon )M\eta )+\xi ^TP_sB_0'KE\\&=-\frac{1}{2}\xi ^T(G_s+O(\varepsilon ))\xi -\frac{1}{2}{u_{sc}^*}^TR_0u_{sc}^*\\&\quad +{u_{sc}^*}^T R_0 u_{s}^*-{u_{sc}^*}^T R_0 u_{sd}-\xi ^TP_sB_0K_0e_1\\&\quad +\xi ^TP_sB_0'KE+O(\varepsilon )\xi ^TP_sB_0M\eta \\&=-\frac{1}{2}\xi ^T(G_s+O(\varepsilon ))\xi +\frac{1}{2}({u_{sc}^*}-u_{sdc})^T R_0\\&\quad \times ({u_{sc}^*} -u_{sdc}) -\frac{1}{2}u_{sdc}^T R_0 u_{sdc}\\&\quad -\xi ^TP_sB_0K_0e_1+\xi ^TP_sB_0'KE\\&\quad +O(\varepsilon )\xi ^TP_sB_0M\eta . \end{aligned}\nonumber \\ \end{aligned}$$
(32)

The performance index of the fast subsystem satisfies

$$\begin{aligned} \begin{aligned} {\dot{V}}_2(\eta )&=\frac{1}{\varepsilon }\frac{\partial V_2}{\partial \eta } ((A_{22}+O(\varepsilon )) \eta +(B_2+O(\varepsilon )) K_2\eta \\&\quad +B_2KE)\\&=\frac{1}{2}\eta ^T(P_fA_{22}+A_{22}^TP_f+O(\varepsilon ))\eta \\&\quad +\eta ^TP_fB_2K_2\eta +\eta ^TP_fB_2KE. \end{aligned} \end{aligned}$$

Similarly, for \(V_2\), apply (20) to take the place of \(P_fA_{22}+A_{22}^TP_f\).

$$\begin{aligned} \begin{aligned} {\dot{V}}_2&=-\frac{1}{2}\eta ^T(G_f+O(\varepsilon ))\eta \\&\quad +\frac{1}{2}(u_{f}^*-u_{fd})^T R (u_{f}^*-u_{fd})\\&\quad -\frac{1}{2} u_{fd}^T Ru_{fd} +\eta ^TP_fB_2K_0e_1\\&\le -\frac{1}{2}{\underline{\lambda }}(G_f+O(\varepsilon ))\Vert \eta \Vert ^2\\&\quad +\frac{1}{2}{\bar{\lambda }}(R)\Vert R^{-1}B_2^TP_f\Vert \Vert e_f\Vert ^2\\&\quad - \frac{1}{2}{\underline{\lambda }}(R)\Vert u_{fd}\Vert ^2+\eta ^TP_fB_2K_0e_1. \end{aligned} \end{aligned}$$
(33)

Since \(e_f=Le_1+e_2\), then

$$\begin{aligned} \begin{aligned}&{\dot{V}}_1+{\dot{V}}_2 \\&\le -\frac{1}{2}\xi ^T(G_s+O(\varepsilon ))\xi -\frac{1}{2}{\underline{\lambda }}(R_0)\Vert u_{sdc}\Vert ^2\\&+\left( \frac{1}{2}{\bar{\lambda }}(R_0)\Vert R_0^{-1}B_0^TP_s\Vert ^2\right. \\&\left. +{\bar{\lambda }}(R)\Vert L\Vert ^2\Vert R^{-1}B_2^TP_f\Vert ^2\right) \Vert e_1\Vert ^2\\&+\xi ^T(P_sB_0'K_1-P_sB_0K_0)e_1+\eta ^TP_fB_2K_0e_1\\&-\frac{1}{2}\eta ^T(G_f+O(\varepsilon ))\eta -\frac{1}{2}{\underline{\lambda }}(R)\Vert u_{fd}\Vert ^2\\&+{\bar{\lambda }}(R)\Vert R^{-1}B_2^TP_f\Vert ^2\Vert e_2\Vert ^2+\xi ^TP_sB_0'K_2e_2. \end{aligned} \end{aligned}$$
(34)

When \(\varepsilon \) is small enough, there exist matrices \(G_s'>0\) and \(G_f'>0\) such that \(G_s-O(\varepsilon )>G_s'\) and \(G_f-O(\varepsilon )>G_f'\). Let

$$\begin{aligned} \begin{aligned} \left\Vert {\tilde{W}}_a \right\Vert \le l_1, \ \left\Vert -R_0^{-1}(B_0^TP_s+D_0^TC_0) \right\Vert&\le l_2, \\ \left\Vert R_0^{-1}B_0^TP_s \right\Vert \le l_3,\ \left\Vert R_0^{-1}{B_0'}^TP_s \right\Vert&\le l_4, \\ \left\Vert R^{-1}B_2^TP_f \right\Vert \le l_5, \left\Vert L\right\Vert&\le l. \end{aligned} \end{aligned}$$
(35)

By applying the following inequalities to (34),

$$\begin{aligned} \begin{aligned}&\Vert \xi \Vert \Vert e_1\Vert \le \frac{1}{2\alpha _1}\Vert \xi \Vert ^2+\frac{\alpha _1}{2}\Vert e_1\Vert ^2,\\&\Vert \xi \Vert \Vert e_2\Vert \le \frac{1}{2\alpha _2}\Vert \xi \Vert ^2+\frac{\alpha _2}{2}\Vert e_2\Vert ^2,\\&\Vert \eta \Vert \Vert e_1\Vert \le \frac{1}{2\alpha _3}\Vert \eta \Vert ^2+\frac{\alpha _3}{2}\Vert e_1\Vert ^2,\\&\Vert e_f\Vert ^2=\Vert Le_1+e_2\Vert ^2\le 2l^2\Vert e_1\Vert ^2+2\Vert e_2\Vert ^2. \end{aligned} \end{aligned}$$

where \(\alpha _1, \alpha _2\), and \(\alpha _3\) are positive parameters to be designed later.

Choose \(c_1,\ c_2,\ c_3\) and \(c_4\) which satisfy

$$\begin{aligned} \begin{aligned} c_1\le&\frac{1}{2}{\underline{\lambda }}(G_s')-l_1{\bar{\lambda }}(R_0)\\&-\frac{{\bar{\lambda }}(R_0)l_3\sqrt{l_1^2+l_2^2}}{2\alpha _1} -\frac{{\bar{\lambda }}(R_0)l_4l_5}{2\alpha _2}\\&-\frac{{\bar{\lambda }}(R_0)l_4\sqrt{l_1^2+l_2^2+l^2l_5^2}}{\alpha _1},\\ c_2\ge&(l_1^2+l_2^2){\bar{\lambda }}(R_0)+\alpha _3 {\bar{\lambda }}(R)l_5\sqrt{l_1^2+l_2^2}+{\bar{\lambda }}(R)l^2l_5\\&+\alpha _1{\bar{\lambda }}(R_0)l_4\sqrt{l_1^2+l_2^2+l^2l_5^2}\\&+\frac{\alpha _1{\bar{\lambda }}(R_0)l_3\sqrt{l_1^2+l_2^2}}{2}, \\c_3\le&\frac{1}{2}{\underline{\lambda }}(G_f')-\frac{{\bar{\lambda }}(R)l_5\sqrt{l_1^2+l_2^2}}{\alpha _3}, \\c_4\ge&l_5{\bar{\lambda }}(R)+\frac{\alpha _2{\bar{\lambda }}(R_0)l_4l_5}{2}. \end{aligned}\nonumber \\ \end{aligned}$$
(36)

The constants \(c_1,\ c_2,\ c_3,\ c_4\) are assumed positive; \(c_2,\ c_3\), and \(c_4\) can always be made positive by adjusting the values of \(\alpha _1, \alpha _2\), and \(\alpha _3\). The positivity of \(c_1\) is easy to obtain after proving that \(l_1\) tends to zero as \({\tilde{W}}_a\) tends to the origin.

\(\mathbf {Theorem\ 1}.\) Consider the two-time-scale system (1) and suppose that Assumptions 1 and 2 and the bounds (35) hold. Let the network weights be tuned according to (29)-(31) and the controller be designed as (28). If the corresponding controller updates whenever either of the following conditions is satisfied,

$$\begin{aligned} \left\Vert e_1 \right\Vert ^2\ge & {} \frac{2c_1\Vert \xi \Vert ^2+{\underline{\lambda }}(R_0)\Vert u_{sdc}\Vert ^2}{2c_2}, \end{aligned}$$
(37)
$$\begin{aligned} \left\Vert e_2 \right\Vert ^2\ge & {} \frac{2c_3\Vert \eta \Vert ^2+{\underline{\lambda }}(R)\Vert u_{fd}\Vert ^2}{2c_4}, \end{aligned}$$
(38)

then there exists \(\varepsilon ^{*}>0\) such that, for all \(\varepsilon \in (0,\varepsilon ^*]\) and any bounded initial conditions \(x^0,\, z^0\), the states of (1) converge asymptotically to the origin and the controller is sub-optimal. In addition, the resulting cost satisfies

$$\begin{aligned} \begin{aligned} J(X_0,u_{d}^*)&= J(X_0,u_{opt}) +\frac{1}{2}\int _0^\infty (u_{d}^*-u_{opt})^T\\&\quad \times R(u_{d}^*-u_{opt})dt. \end{aligned} \end{aligned}$$
(39)

Figure 1 illustrates a flowchart of the implementation process, where (37) and (38) are the triggering conditions for the slow and fast subsystems, respectively. Once either condition is triggered, the corresponding controller is updated.

Fig. 1 Flowchart of the proposed method ((37) and (38) are the triggering conditions for the slow and the fast dynamics, respectively)
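A sketch of the two triggering tests in the flowchart, assuming the constants \(c_1,\dots ,c_4\) have been chosen according to (36) (function names are ours; vectors are 1-D arrays, weighting matrices are symmetric):

```python
import numpy as np

def slow_trigger(e1, xi, u_sdc, c1, c2, R0):
    """Triggering condition (37) for the slow dynamics."""
    thr = (2.0 * c1 * (xi @ xi) + np.linalg.eigvalsh(R0).min() * (u_sdc @ u_sdc)) / (2.0 * c2)
    return e1 @ e1 >= thr

def fast_trigger(e2, eta, u_fd, c3, c4, R):
    """Triggering condition (38) for the fast dynamics."""
    thr = (2.0 * c3 * (eta @ eta) + np.linalg.eigvalsh(R).min() * (u_fd @ u_fd)) / (2.0 * c4)
    return e2 @ e2 >= thr
```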

Proof

First, we prove that the system is asymptotically stable. According to previous analysis,

$$\begin{aligned}&{\dot{V}}_1(\xi )\\= & {} -\frac{1}{2}\xi ^TG_s'\xi +\frac{1}{2}({u_{sc}^*}-u_{sdc})^T R_0 ({u_{sc}^*}-u_{sdc})\\&-\frac{1}{2}u_{sdc}^T R_0 u_{sdc}-\xi ^TP_sB_0{\hat{W}}_a^Te_1\\&+O(\varepsilon )\xi ^TP_sB_0M\eta +\xi ^TP_sB_0'KE\\\le & {} -\frac{1}{2}{\underline{\lambda }}(G_s')\left\Vert \xi \right\Vert ^2+ \frac{1}{2}{\overline{\lambda }}(R_0)\left\Vert {\tilde{W}}_a^T\xi -{\hat{W}}^T_a e_1 \right\Vert ^2\\&- \frac{1}{2}{\underline{\lambda }}(R_0)\left\Vert u_{sdc} \right\Vert ^2 -\xi ^TP_sB_0{\hat{W}}_a^Te_1\\&+O(\varepsilon )\xi ^TP_sB_0M\eta +\xi ^TP_sB_0'KE. \end{aligned}$$

The performance index of the fast subsystem satisfies

$$\begin{aligned} \begin{aligned} {\dot{V}}_2(\eta )\le&-\frac{1}{2}{\underline{\lambda }}(G_f')\Vert \eta \Vert ^2- \frac{1}{2}{\underline{\lambda }}(R)\Vert u_{fd}\Vert ^2\\&+\frac{1}{2}l_5{\bar{\lambda }}(R)\Vert e_f\Vert ^2+\eta ^TP_fB_2{\hat{W}}_a^Te_1. \end{aligned} \end{aligned}$$

Note that the sum of terms concerning errors in \({\dot{V}}_1,\ {\dot{V}}_2\) satisfies

$$\begin{aligned} \begin{aligned}&-\xi ^TP_sB_0K_0e_1+\xi ^TP_sB_0'KE+\eta ^TP_fB_2{\hat{W}}_a^Te_1 \\&\quad \le \left( {\bar{\lambda }}(R_0)l_3\sqrt{l_1^2+l_2^2}+2{\bar{\lambda }}(R_0)l_4\sqrt{l_1^2+l_2^2+l^2l_5^2}\right) \\&\qquad \Vert \xi \Vert \Vert e_1\Vert \\&\qquad +{\bar{\lambda }}(R_0)l_4l_5\Vert \xi \Vert \Vert e_2\Vert +2{\bar{\lambda }}(R)l_5\sqrt{l_1^2+l_2^2}\Vert \eta \Vert \Vert e_1\Vert . \end{aligned} \end{aligned}$$

Then (34) can be rewritten as

$$\begin{aligned} \begin{aligned} {\dot{V}}_1+{\dot{V}}_2 \le&-c_1\Vert \xi \Vert ^2-\frac{1}{2}{\underline{\lambda }}(R_0)\Vert u_{sdc}\Vert ^2+c_2\Vert e_1\Vert ^2\\&-c_3\Vert \eta \Vert ^2-\frac{1}{2}{\underline{\lambda }}(R)\Vert u_{fd}\Vert ^2+c_4\Vert e_2\Vert ^2. \end{aligned} \end{aligned}$$

If the corresponding controller is updated whenever (37) or (38) is satisfied, the errors are reset before the thresholds can be exceeded, so that \( {\dot{V}}_1+{\dot{V}}_2<0\) between the triggering instants, and thus the states of the original system asymptotically converge to the origin.

Next, we prove that the parameters are convergent.

We define the weight estimation errors as

$$\begin{aligned} \begin{aligned}&{\tilde{W}}_a=-(P_sB_0+C_0^TD_0)R_0^{-1}-{\hat{W}}_a,\\&{\tilde{W}}_c=W_c-{\hat{W}}_c,\\&{\tilde{W}}_{cp}=P_sB_0+C_0^TD_0-{\hat{W}}_{cp}. \end{aligned} \end{aligned}$$

Then

$$\begin{aligned} \dot{{\tilde{W}}}_c= & {} -\alpha _{c} \frac{\sigma \sigma ^T}{(1+\sigma ^T\sigma )^2}{\tilde{W}}_c, \nonumber \\ {\tilde{W}}_a^{+}= & {} {\tilde{W}}_a-\alpha _{a} \frac{\xi (t)\xi ^T(t)}{1+\xi ^T(t)\xi (t)}({\tilde{W}}_a+{\hat{W}}_{cp}R_0^{-1}). \end{aligned}$$
(40)

We set \(V_3({\tilde{W}}_a)=\frac{1}{2}tr\{{\tilde{W}}_a^T{\tilde{W}}_a\}\) and define \( \bigtriangleup V_3=V_3({\tilde{W}}_a^{+})-V_3({\tilde{W}}_a)\). Then, from the dynamics of \({\tilde{W}}_a\) in (40),

$$\begin{aligned} \begin{aligned}&\bigtriangleup V_3\\&=\alpha _a tr\left( -{\tilde{W}}_a^T\frac{\xi \xi ^T}{1+\xi ^T\xi }{\tilde{W}}_a-{\tilde{W}}_a^T\frac{\xi \xi ^T}{1+\xi ^T\xi }{\tilde{W}}_{cp}R_0^{-1} \right. \\&\quad +\frac{\alpha _a}{2}{\tilde{W}}_a^T\frac{(\xi \xi ^T)^2}{(1+\xi ^T\xi )^2}{\tilde{W}}_a\\&\quad +\alpha _a{\tilde{W}}_a^T\frac{(\xi \xi ^T)^2}{(1+\xi ^T\xi )^2}{\tilde{W}}_{cp}R_0^{-1} \\&\quad \left. +\frac{\alpha _a}{2}({\tilde{W}}_{cp}R_0^{-1})^T\frac{(\xi \xi ^T)^2}{(1+\xi ^T\xi )^2}{\tilde{W}}_{cp}R_0^{-1}\right) . \end{aligned} \end{aligned}$$

Since \(tr(AB)=tr(BA)\), we get

$$\begin{aligned} \begin{aligned} \bigtriangleup V_3&= \frac{\alpha _a\xi ^T}{1+\xi ^T\xi } (-{\tilde{W}}_a{\tilde{W}}_a^T-{\tilde{W}}_{cp}R_0^{-1}{\tilde{W}}_a^T\\&\quad +\frac{\alpha _a}{2}\frac{\Vert \xi \Vert ^2}{1+\Vert \xi \Vert ^2}({\tilde{W}}_a{\tilde{W}}_a^T +2{\tilde{W}}_{cp}R_0^{-1}{\tilde{W}}_a\\&\quad +{\tilde{W}}_{cp}R_0^{-1}({\tilde{W}}_{cp}R_0^{-1})^T))\xi . \end{aligned} \end{aligned}$$

By utilizing the following inequalities

$$\begin{aligned} \begin{aligned} \frac{\Vert \xi \Vert ^2}{1+\Vert \xi \Vert ^2}&\le 1, \\ -\xi ^T{\tilde{W}}_{cp}R_0^{-1}{\tilde{W}}_a^T\xi&\le \frac{\xi ^T}{2}\left( \frac{1}{\beta _b}{\tilde{W}}_{cp}{\tilde{W}}_{cp}^T\right. \\&\left. \quad +\beta _b{\tilde{W}}_a(R_0^{-1})^{T}R_0^{-1}{\tilde{W}}_a^T\right) \xi , \end{aligned} \end{aligned}$$

where \(\beta _b\) can be an arbitrary positive real number.

Then, it can be deduced that

$$\begin{aligned} \begin{aligned} \bigtriangleup V_3&\le \frac{\alpha _a}{1+\xi ^T\xi } \xi ^T\left( -\left( 1-\frac{\Vert 1-\alpha _a\Vert \beta _b{\bar{\lambda }}(R_0^{-1})^2}{2} \right. \right. \\&\left. \left. \quad -\frac{\alpha _a}{2}\right) {\tilde{W}}_a{\tilde{W}}_a^T+\left( \frac{\Vert 1-\alpha _a\Vert }{2\beta _b}+\frac{\alpha _a}{2}{\bar{\lambda }}(R_0^{-1})^2\right) {\tilde{W}}_{cp}{\tilde{W}}_{cp}^T \right) \xi . \end{aligned} \end{aligned}$$

Choosing proper \(\alpha _a, \beta _b\) to ensure \(1-\frac{\Vert 1-\alpha _a\Vert \beta _b{\bar{\lambda }}(R_0^{-1})^2}{2} -\frac{\alpha _a}{2}> 0\), we have \(\bigtriangleup V_3< 0\) whenever \(\Vert {\tilde{W}}_{a}\Vert \) is large relative to \(\Vert {\tilde{W}}_{cp}\Vert \). Since \({\tilde{W}}_{cp}\), as part of \({\tilde{W}}_{c}\), converges to the origin, the term \(\xi ^T\left( \frac{\Vert 1-\alpha _a\Vert }{2\beta _b}+\frac{\alpha _a}{2}{\bar{\lambda }}(R_0^{-1})^2\right) {\tilde{W}}_{cp}{\tilde{W}}_{cp}^T\xi \) also converges to zero. So \({\hat{W}}_a\) is near-optimal. In addition, this implies that \(l_1\) becomes smaller, so that \(c_1>0\) in (36) is easier to satisfy.

Third, we prove that Zeno behavior is excluded.

For the slow mode, from (37), the triggering condition satisfies \(\frac{\Vert e_1(t)\Vert }{\Vert \xi (t)\Vert }\ge \frac{c_1}{c_2}\). For convenience, we denote \(s(t)=\frac{\Vert e_1(t)\Vert }{\Vert \xi (t)\Vert }\). From the dynamics of the subsystems (7), since the parameters are bounded, there exists \({{\mathcal {F}}}_1\in {{\mathcal {R}}}_+\) satisfying \( \Vert {\dot{\xi }}(t)\Vert \le {{\mathcal {F}}}_1(\Vert \xi (t)\Vert +\Vert e_1(t)\Vert +\Vert e_2(t)\Vert )\) and \( \Vert {\dot{e}}_1(t)\Vert \le {{\mathcal {F}}}_1(\Vert \xi (t)\Vert +\Vert e_1(t)\Vert +\Vert e_2(t)\Vert )\). Then, it is further deduced that

$$\begin{aligned} \begin{aligned}&\frac{ds}{dt}\le {{\mathcal {F}}}_1(1+s)^2+ {{\mathcal {F}}}_1\left( 1+\frac{\Vert {e}_1(t)\Vert }{\Vert \xi \Vert }\right) \frac{\Vert {e}_2(t)\Vert }{\Vert \xi \Vert }, \\&\quad t \in [t_{x}^k,t_{x}^{k+1}). \end{aligned} \end{aligned}$$

As the states are convergent, there exists \({{\mathcal {C}}}_1\in {{\mathcal {R}}}_+\) satisfying \({{\mathcal {C}}}_1\ge {{\mathcal {F}}}_1(1+\frac{\Vert {e}_1(t)\Vert }{\Vert \xi \Vert })\frac{\Vert {e}_2(t)\Vert }{\Vert \xi \Vert }\), so that \(\frac{dt}{ds}\ge \frac{1}{{{\mathcal {F}}}_1(1+s)^2+{{\mathcal {C}}}_1}\). Since s(t) grows from 0 to a number larger than \(\frac{c_1}{c_2}\) between two consecutive triggering instants, the triggering interval satisfies \(\tau _s=t_{x}^{k+1}-t_{x}^k>\int _0^{\frac{c_1}{c_2}}\frac{1}{{{\mathcal {F}}}_1(1+s)^2+{{\mathcal {C}}}_1}ds>0\).

Similarly, for the fast mode, there exist \({{\mathcal {F}}}_2,{{\mathcal {C}}}_2\in {{\mathcal {R}}}_+\) satisfying \( \Vert {\dot{\eta }}(t)\Vert \le {{\mathcal {F}}}_2(\Vert \eta (t)\Vert +\Vert e_2(t)\Vert )\), \( \Vert {\dot{e}}_2(t)\Vert \le {{\mathcal {F}}}_2(\Vert \xi (t)\Vert +\Vert e_2(t)\Vert )\) and \({{\mathcal {C}}}_2\ge {{\mathcal {F}}}_2(1+\frac{\Vert {e}_2(t)\Vert }{\Vert \eta \Vert })\frac{\Vert {e}_1(t)\Vert }{\Vert \eta \Vert }\), so that the triggering interval satisfies \(\tau _f=t_{z}^{k+1}-t_{z}^k>\int _0^{\frac{c_3}{c_4}} \frac{1}{{{\mathcal {F}}}_2(1+s)^2+{{\mathcal {C}}}_2}ds>0\). Thus, Zeno behavior is excluded.
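The strict positivity of these lower bounds can also be checked numerically. A small sketch, where the constants F, C, and the integration limit are purely illustrative placeholders (not values from the paper):

```python
from scipy.integrate import quad

def min_interevent_time(F, C, ratio):
    """Numerical lower bound on the inter-event time,
    e.g. tau_s > integral_0^{c1/c2} ds / (F1*(1+s)^2 + C1)."""
    lb, _ = quad(lambda s: 1.0 / (F * (1.0 + s) ** 2 + C), 0.0, ratio)
    return lb

# e.g. min_interevent_time(F=5.0, C=1.0, ratio=0.3) returns a strictly positive bound
```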

Regarding the performance index under the proposed event-triggered mechanism, a similar proof is given in [24]. Under the exact optimal control \(u_{opt}\) in (3) and the sub-optimal event-triggered control \(u_{d}^*\), the performance indexes satisfy the relation described in (39).

This completes the proof. \(\square \)

4 Simulation

This section presents three simulation examples: a four-dimensional system, a practical DC-motor example, and a comparison study. The slow-mode dynamics are assumed unknown, which means that \(A_{11}, A_{12}, B_1\) are unavailable.

(1) The effectiveness of the proposed method

Consider a four-dimensional system with parameters \(A_{11}={\left[ \begin{array}{cc} 0 &{} 0.4 \\ 0 &{} 0 \end{array} \right] }, A_{12}= {\left[ \begin{array}{cc} 0 &{} 0 \\ 0.345 &{} 0 \end{array} \right] }, A_{21}={\left[ \begin{array}{cc} 0 &{} -0.524 \\ 0 &{} 0 \end{array} \right] }, A_{22}={\left[ \begin{array}{cc} -0.465 &{} 0.262 \\ 0 &{} -1 \end{array} \right] },\ B_{1}\!=\! {\left[ \begin{array}{cc}0&0\end{array}\right] }^{T}\!, \ B_{2}= {\left[ \begin{array}{cc}0&1\end{array}\right] }^{T}\). We set \(\varepsilon =0.1\), \(C_1=C_2=\left[ 1\ 1\right] \), and \(R=1\). The initial states are \(x(0)={\left[ \begin{array}{cc}6&5\end{array}\right] }^{T},\ z(0)={\left[ \begin{array}{cc}4&2\end{array}\right] }^{T}\), and \(\alpha _a=0.5, \alpha _c=2\). The trajectories of the states, the triggering times, and the inputs are depicted in Fig. 2. Furthermore, to highlight the approximate optimality of the proposed method, the optimal control is also presented in Fig. 2. By solving the Riccati equations (14) and (19) of the slow and fast subsystems, respectively, we obtain \(u_c^*=[-2.0000 -2.5632]x+[-0.5953 -0.5205]z\) from (21). The solid line depicts the case under the proposed event-triggered control with unknown slow dynamics, while the dotted line illustrates the case under optimal control. From Fig. 2, we observe that the states converge asymptotically to the origin. The minimum triggering intervals for the slow and fast dynamics are \(0.6730\ \mathrm {s}\) and \(0.0320\ \mathrm {s}\), respectively, and the average triggering intervals are \(1.0288\ \mathrm {s}\) and \(0.3182\ \mathrm {s}\).

Fig. 2 State trajectories and inputs

(2) A DC-motor example

In this simulation case, a practical DC-motor example is considered, as described in [20], where the electromagnetic transient is regarded as the fast mode, while the torque response is much slower. The dynamics of such systems can be modeled as

$$\begin{aligned} \begin{aligned} J_{m}\frac{d\omega }{dt}=&-b\omega +k_{m}i, \\ L_i\frac{di}{dt}=&-k_b\omega -R_ai+u, \end{aligned} \end{aligned}$$

where \(J_m\) is the equivalent moment of inertia, \(\omega \) is the angular speed, b is the equivalent viscous friction coefficient, and \(k_m\) and \(k_b\) are the torque constant and the back electromotive force constant, respectively. i, u, and \(R_a\) represent the armature current, voltage, and resistance, respectively, and \(L_i\) is a small inductance constant. The system parameters are as follows: \(J_m = 0.093\ \mathrm{{kg\,m^2}}, k_m = 0.7274\ \mathrm{{N\,m}}, k_b = 0.6\ \mathrm{{V\,s/rad}}, L_i = 0.006\ \mathrm {H}, R_a = 0.6\ {\Omega }, b = 0.008\). Rewriting the latter formulas in the form of (1), the parameters are \(\varepsilon =0.06\), \(A_{11}=-0.086, A_{12} =7.82, A_{21} =-0.6, A_{22}=-0.6, B_1=0, B_2=1\). Let \(C_1=2, C_2=1.\) We set \(\alpha _a=0.5, \alpha _c=20\). The calculated optimal control for the slow mode is \(u_{sc}^*(\xi )=-0.4741\xi \), and the initial values of the states are chosen as \(x_1(0)=6, x_2(0)=2\). At \(t=0.75\ \mathrm {s}\), \({\hat{W}}_a=-0.4769\). The error between the actor network weight obtained with the proposed method and the one corresponding to the optimal control evolves as shown in Fig. 3, indicating that \({\tilde{W}}_a\) is close to the origin after \(0.75\ \mathrm {s}\), i.e., the control gain is near-optimal. The triggering condition is also depicted in Fig. 3.

Fig. 3 The error of the actor network weight and the triggering condition

Table 1 Data of the two methods
Fig. 4 Comparison of the states, triggering time, and performance index

(3) Comparison between the ETM in [9] and the proposed method

This simulation compares our work with [9]. The system parameters are the same as in the first case. In [9], the controller also takes the form of (5), where the control gains are \(K_1= [-0.6861 -\!1.1784]\) and \(K_2=-R^{-1}B_2^{T}P_f=[-0.5953 -\!0.5205]\); accordingly, \(K_0=[-0.2\ -0.1]\). The triggering condition is designed as \(\Vert e_{i}(t)\Vert \ge c_0+c_{1}e^{-{\alpha _{0}} t} \), where \(i=1,2\), and \(e_{1}(t)\) and \(e_{2}(t)\) denote the errors between the sampled and the real states of the slow and fast modes, respectively. Here \({\alpha _0}=0.1, c_0=0.05\), and \(c_{1}=0.2\), as in [9].

Regarding the proposed method, to be consistent with the basic parameters in [9], the initial control gains \(K_0, K_1\), and \(K_2\) are chosen as above. Table 1 compares the triggering numbers, the minimum triggering intervals, and the average triggering intervals of the proposed method and of [9] over \(90\ \mathrm {s}\). The trajectories of the states and the triggering times are depicted in Fig. 4, along with the performance index. The latter figure highlights that, over \(90\ \mathrm {s}\), although the triggering numbers of the proposed method are much smaller than those of the ETM in [9], the proposed method achieves a lower cost. In [9], because of the triggering mechanism, the threshold of the triggering condition changes only slightly after some time, while the states still influence the error's rate of change and thus have a critical impact on the triggering frequency (Fig. 3); in the proposed method, by contrast, the thresholds change together with the states. The triggering frequency in [9] is relatively high at the beginning and decreases as time goes to infinity, while for the proposed method the change of the triggering frequency is rather gentle.

5 Conclusion

This paper presents an event-triggered sub-optimal control for TTSSs with unknown slow dynamics. Based on the singular perturbation theory, a composite event-triggered controller is designed. Additionally, we utilize a Q-function as a critic network in the form of the product of available information and unknown parameters. The controller is constructed using an actor network and is updated to be sub-optimal under the proposed event-triggered mechanism. Furthermore, we prove that the triggering time intervals are strictly positive, and thus, Zeno behavior is excluded. Moreover, the system’s global asymptotic stability is guaranteed. Future work will address TTSSs with actuator failure or fully unknown parameters for more general applications.