1 Introduction

Traditional periodic control uses a time-triggered mechanism in which the control law is executed at every fixed time interval. Such a sampling mechanism greatly increases the computing burden and causes computational waste. Compared with traditional time-triggered control, event-triggered control can utilize computing resources effectively and reduce this waste [1, 2]. Event-triggered control updates the control signal irregularly by setting a trigger condition, on the premise of guaranteeing the system performance [3,4,5]. Under this event-triggered mechanism, the control law is executed only when the trigger condition is satisfied.

Event-triggered control has been widely studied and applied to general linear systems, continuous-time systems, discrete-time systems and other fields [6,7,8,9,10,11]. For instance, Zhang et al. [7] proposed a state-feedback event-triggered control method for linear systems. Tabuada [8] introduced an event-triggered mechanism for nonlinear systems, which makes the Lyapunov function decrease strictly along the solution curve. Qiu et al. [10] designed a fuzzy event-triggered controller for pure-feedback nonlinear systems with unknown states. Recently, many scholars have applied event-triggered control to the adaptive dynamic programming (ADP) algorithm [12,13,14,15,16,17,18]. Zhong et al. [15] proposed an event-triggered mechanism for continuous-time nonlinear systems and studied the unknown system dynamics with a neural network (NN) observer. The work in [16] gave an event-triggered ADP control method and designed a dynamic NN structure to identify the internal states of continuous-time systems. In [17], Yang et al. used an NN structure to reconstruct the angular position and angular velocity signals of a robot arm, and introduced an event-triggered ADP control method to approximate the performance index of the robot.

The ADP algorithm is a useful method for iteratively solving the optimal control problem of various systems, and it satisfies Bellman's principle of optimality [19,20,21,22,23]. In 1977, Werbos [19] first proposed the framework of the ADP algorithm. The main idea of this method is to use a function approximation structure (such as a neural network, a fuzzy model, or a polynomial) to approximate the cost function and the control law. Subsequently, Murray et al. [20] presented a specific ADP iterative algorithm for continuous systems and gave strict proofs of stability and convergence. Prokhorov and Wunsch [24] summarized the main structures of the ADP algorithm as heuristic dynamic programming (HDP), dual heuristic dynamic programming (DHP), globalized DHP (GDHP), and their action-dependent variants. On the basis of previous studies, Jiang et al. [21] introduced a novel policy iteration algorithm that does not rely on the system dynamics. Al-Tamimi et al. [22] proposed a value-iteration-based ADP algorithm for discrete-time nonlinear systems with unknown internal dynamics. In [23], a greedy ADP algorithm was proposed to solve the tracking control problem for discrete-time nonlinear systems by converting the optimal tracking problem into an optimal regulation problem.

Control constraints widely exist in practical systems; they can easily degrade the overall performance of the system and may even lead to instability [25,26,27,28]. As one of the powerful methods for solving the optimal control problem of nonlinear systems, the ADP method also plays an important role for systems with control constraints. Na et al. [29] proposed a novel online control policy for constrained nonlinear systems based on an iterative ADP algorithm. Fan et al. [30] solved the output-constrained optimal control problem for continuous-time nonlinear systems. In addition, researchers have applied iterative ADP event-triggered control to a class of constrained-input systems. For continuous-time constrained nonlinear systems, Zhu et al. [31] introduced an event-triggered optimal control policy and gave a detailed Lyapunov analysis. The work in [32] considered the global stability of saturated systems and proposed a state-dependent non-quadratic event-triggered control method. In [33], an event-triggered state feedback control policy was provided for constrained linear systems; the positive lower bounds and the self-triggered method were also given.

Recently, scholars have proposed many event-triggered methods for linear systems, but few studies focus on discrete-time nonlinear systems. Moreover, for discrete-time nonlinear systems, researchers usually adopt the basic structures of the ADP algorithm (such as HDP and DHP) and do not consider the constrained-input problem. Motivated by this, we propose a novel ADP-based event-triggered approximate optimal control method for discrete-time nonlinear systems with control constraints. In this paper, the globalized dual heuristic dynamic programming (GDHP) structure is designed to learn the event-triggered optimal control, in which both the cost function and its partial derivative are learned by the critic network. Compared with the HDP and DHP structures, the GDHP structure learns more system information, which enables it to obtain better control performance. The contributions of this paper are summarized as follows: (1) the event-triggered design is developed on the GDHP technique for discrete-time nonlinear systems, where the control law generated by the action network is updated only at the triggering instants; (2) the control constraints are explicitly considered in the event-triggered design based on the GDHP structure, and the stability of the event-triggered constrained system is proved.

The rest of this paper is organized as follows. Section 2 introduces the event-triggered control problem for a class of discrete-time nonlinear systems with control constraints. The trigger condition and the corresponding stability analysis are given in Sect. 3. The approximate optimal learning method based on the GDHP structure and the detailed iterative process are discussed in Sect. 4. Section 5 presents three simulation examples to demonstrate the effectiveness of the proposed method. Finally, Sect. 6 gives the conclusion and discussion.

2 Problem formulation

Consider a class of discrete-time nonlinear systems described as

$$\begin{aligned} {x_{k + 1}} = f({x_k}) + g({x_k}){u_k}, \end{aligned}$$
(1)

where \({x_k} = {\left[ {{x_{1k}},{x_{2k}}, \ldots ,{x_{nk}}} \right] ^T}\in {\mathbb {R}^n}\) is the state vector and \({u_k} = {\left[ {{u_{1k}},{u_{2k}}, \ldots ,{u_{mk}}} \right] ^T} \in {\mathbb {R}^m}\) is the control input vector. For any \(x_k\), \(f(x_k):{\mathbb {R}^n} \rightarrow {\mathbb {R}^n}\) is differentiable with \(f(0) = 0\), and \(g(x_k):{\mathbb {R}^n} \rightarrow {\mathbb {R}^{n\times m}}\) is nonsingular. Assume that system (1) is Lipschitz continuous on a set \(\varOmega \subset {\mathbb {R}^n}\) containing the origin and that system (1) is controllable in the sense that there exists a continuous control on \(\varOmega \) that stabilizes the system. Define \({\varOmega _u} = \{ u_k \mid u_k = {[{u_{1k}},{u_{2k}}, \ldots ,{u_{mk}}]^T} \in {\mathbb {R}^m},\ \left| {{u_{ik}}} \right| \le {{\overline{u}} _i},\ i = 1,2, \ldots ,m\} \), where \({{\overline{u}} _i}\) is the saturation bound of the ith actuator. \({\overline{U}} \in {\mathbb {R}^{m \times m}}\) is the constant diagonal matrix given by \({\overline{U}} = \mathrm{diag}\left[ {{{{\overline{u}} }_1},{{{\overline{u}} }_2}, \ldots ,{{{\overline{u}} }_m}} \right] \).

In event-triggered control, we denote the sampling instants by \(\{ {k_i}\} _{i = 0}^\infty \), which means the controller only samples at the discrete time points \({k_0},{k_1},{k_2}, \ldots \). The state feedback control law \(u_k\) satisfies

$$\begin{aligned} u_k = \mu ({x_{k_{i}}}), \end{aligned}$$
(2)

where \({x_{k_{i}}}\) represents the state vector at the sampling instant, with \({k_i} \le k < {k_{i + 1}},\ i = 0,1,2, \ldots \). In addition, a zero-order-hold (ZOH) device is designed to maintain the control input during the trigger interval. Thus, a piecewise-constant control input sequence is obtained through the ZOH.

Define the event-triggered error as

$$\begin{aligned} {e_k} = {x_{k_{i}}} - {x_k}; ~~ k \in [{k_i},{k_{i + 1}}), \end{aligned}$$
(3)

where \({x_{k_{i}}}\) represents the state at the sampling instant and \(x_k\) represents the current state. Combining (2) and (3), we obtain

$$\begin{aligned} {u_k} = \mu ({e_k} + {x_k}). \end{aligned}$$
(4)

Then, substituting (4) into (1), we have

$$\begin{aligned} {x_{k + 1}} = f({x_k}) + g({x_k})\mu ({e_k} + {x_k}). \end{aligned}$$
(5)
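
As a minimal illustration of how (3)–(5) fit together (not taken from the paper; `f`, `g` and `mu` are placeholder callables), one closed-loop step can be sketched as:

```python
import numpy as np

def closed_loop_step(f, g, mu, x_k, x_sample):
    """One step of the event-triggered closed loop (5).
    x_sample is the state x_{k_i} held since the latest sampling instant,
    so the ZOH supplies mu(x_{k_i}) = mu(e_k + x_k) between triggers."""
    e_k = x_sample - x_k                            # event-triggered error (3)
    u_k = mu(e_k + x_k)                             # control law (4)
    return f(x_k) + g(x_k) @ np.atleast_1d(u_k)     # next state, eq. (5)
```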

The general discrete-time optimal control problem is to find a control law \(u_k\) that minimizes the following infinite-horizon cost function

$$\begin{aligned} V({x_k}) = \sum \limits _{j = k}^\infty {U({x_j},\mu ({e_j} + {x_j}))}, \end{aligned}$$
(6)

where \(\mu ({e_j} + {x_j})=\mu ({x_{k_{i}}})\), and \(U({x_j},\mu ({e_j} + {x_j}))\) is the utility function. The utility function usually takes the quadratic form

$$\begin{aligned} U({x_k},\mu ({x_{k_{i}}})) = x_k^TQ{x_k} + {\mu ^T}({x_{k_{i}}})R\mu ({x_{k_{i}}}), \end{aligned}$$
(7)

where Q and R are symmetric positive definite matrices with appropriate dimensions, and \(U(0,0) = 0\). However, such a quadratic utility function is not suitable for systems with control constraints. Thus, a non-quadratic form is adopted to handle the constrained-input problem, and the utility function becomes

$$\begin{aligned}&U({x_k},\mu ({x_{k_{i}}})) = x_k^TQ{x_k}\nonumber \\&\quad + 2{\int _0^{\mu ({x_{k_{i}}})}\varphi ^{ - T}}\left( {{{{\overline{U}}}^{-1}}\tau } \right) {\overline{U}} R\mathrm{d}\tau , \end{aligned}$$
(8)

where \(\tau \in {\mathbb {R}^m}\) and \(\varphi ( \cdot )\in {\mathbb {R}^m}\) is a bounded one-to-one function satisfying \(\left| {\varphi ( \cdot )} \right| \le 1\). In the following, \(U({x_k},\mu ({x_{k_{i}}}))\) is abbreviated as \(U_k\).
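
For concreteness, the non-quadratic utility (8) can be evaluated numerically as sketched below for the single-input case used in the simulations, assuming the common choice \(\varphi (\cdot )=\tanh (\cdot )\) (the paper only requires \(\varphi \) to be bounded, one-to-one, and \(|\varphi (\cdot )|\le 1\)); the trapezoidal quadrature is an implementation detail, not part of the method.

```python
import numpy as np

def utility(x, mu, Q, R, u_bar, n_quad=200):
    """Sketch of the non-quadratic utility (8) for a scalar input (m = 1),
    with phi = tanh so that phi^{-1} = arctanh.
    x: state x_k, mu: held control mu(x_{k_i}) with |mu| < u_bar, R: scalar weight."""
    x = np.asarray(x, dtype=float)
    taus = np.linspace(0.0, mu, n_quad)              # integration path 0 -> mu
    vals = np.arctanh(taus / u_bar) * u_bar * R      # phi^{-T}(U_bar^{-1} tau) * U_bar * R
    control_cost = 2.0 * np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(taus))
    return float(x @ Q @ x) + control_cost           # x^T Q x + non-quadratic control term
```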

Based on the Bellman optimality principle, the optimal cost function \({V^*}({x_k})\) satisfies

$$\begin{aligned} V^*(x_k)=\mathop {\min }\limits _{\mu (x_{k_{i}})}\left\{ U_k+V^*(x_{k+1})\right\} . \end{aligned}$$
(9)

In addition, the control law \({u_{k}}\) satisfies the first-order necessary condition of optimal control [34]. For \(k \in [{k_i},{k_{i + 1}})\), \(i = 0,1,2 \ldots \), the optimal control law \({{\mu }^{*}}(x_{k_i})\) can be obtained as

$$\begin{aligned}&{{\mu }^{*}}(x_{k_i}) =\underset{\mu (x_{k_{i}})}{\mathop {\arg \min }}\,\left\{ U_k+V^*(x_{k+1})\right\} \nonumber \\&\quad = \overline{U}\varphi \left( -\frac{1}{2}{{\left( \overline{U}R \right) }^{-1}}{{g}^{T}}\left( {{x}_{k}} \right) \frac{\partial {{V}^{*}}({{x}_{k+1}})}{\partial {{x}_{k+1}}} \right) . \end{aligned}$$
(10)
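
A minimal sketch of evaluating (10) is given below, again assuming \(\varphi =\tanh \); `dV_next` stands for the gradient \(\partial {V^*}(x_{k+1})/\partial x_{k+1}\), which in practice is supplied by the critic network described in Sect. 4 (names are illustrative).

```python
import numpy as np

def constrained_control(dV_next, g_x, R, u_bar):
    """Constrained control law (10) with phi = tanh:
        mu* = U_bar * tanh( -1/2 * (U_bar R)^{-1} * g(x_k)^T * dV*/dx_{k+1} ).
    dV_next: gradient of the optimal cost at x_{k+1}, shape (n,)
    g_x:     input matrix g(x_k), shape (n, m)
    R:       control weighting matrix, shape (m, m)
    u_bar:   saturation bounds, length m."""
    U_bar = np.diag(np.atleast_1d(u_bar))
    inner = -0.5 * np.linalg.solve(U_bar @ R, g_x.T @ dV_next)
    return U_bar @ np.tanh(inner)
```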

Next, we will give an event-triggered condition and prove the corresponding stability for system (5).

3 Stability proof under the event-triggered condition

Definition 1

(cf. [35]) For any \({x_k} \in \varOmega \), a control law \(u_k\) is admissible with respect to (6) on \(\varOmega \) if \(u_k\) is continuous and stabilizes (1) on \(\varOmega \), \(u_k = 0\) when \(x_k = 0\), and \(V(x_0)\) is finite for every \({x_0} \in \varOmega \). For the constant diagonal matrix \({\overline{U}} \in {\mathbb {R}^{m \times m}}\), in the single-input case \(m=1\) we write \({\overline{u}}= {\overline{U}}\), so that \(\left\| {\mu ({x_{k_i}})} \right\| \le {\overline{u}}\).

For system (5), we define the event-triggered condition \(\left\| {{e_k}} \right\| \le {e_T}\), where \(e_T\) is the trigger threshold. During the event-triggered control, the action network updates the corresponding control law only when this condition is violated. Besides, at each sampling instant \(k_i\), \(i = 0,1,2, \ldots \), the trigger error \(\left\| {{e_k}} \right\| \) is reset to zero.

Assumption 1

(cf. [36]) The function \(V:{\mathbb {R}^n} \rightarrow {\mathbb {R}_{\ge 0}}\) is continuously differentiable, and the state vector \(x_k\) and the trigger error \(e_k\) satisfy

$$\begin{aligned}&\left\| {f({x_k}-{e_k})} \right\| \le P_1\left\| {{e_k}} \right\| + P_1\left\| {{x_k}} \right\| , \end{aligned}$$
(11)
$$\begin{aligned}&\left\| {g({x_k}-{e_k})} \right\| \le P_2\left\| {{e_k}} \right\| + P_2\left\| {{x_k}} \right\| , \end{aligned}$$
(12)
$$\begin{aligned}&{\alpha _1}(\left\| x \right\| ) \le V({x_k}) \le {\alpha _2}(\left\| x \right\| ),\mathrm{{ }}~~~\forall x \in {\mathbb {R}^n} \end{aligned}$$
(13)
$$\begin{aligned}&V\left( x_{k+1} \right) - V({x_k})\le - \alpha V({x_k}) + \beta \left\| {{e_k}} \right\| , \end{aligned}$$
(14)
$$\begin{aligned}&\alpha _1^{ - 1}(\left\| x \right\| ) \le {L}\left\| x \right\| . \end{aligned}$$
(15)

where \(L\), \(P_1\), \(P_2\), \(\alpha \), and \(\beta \) are positive constants, and \(\alpha _1 \) and \(\alpha _2 \) are class \({\mathcal {K}_\infty }\) functions.

In particular, if (13) and (14) hold, the function V is called an input-to-state stability Lyapunov (ISS-Lyapunov) function [37].

According to (3), for each \(k \in [{k_i},{k_{i + 1}})\), we have

$$\begin{aligned} e_{k + 1} = x_{k_i} - x_{k + 1}, \end{aligned}$$
(16)

where \({k_i}\) is the latest sampling instant. In addition, according to [36], we have

$$\begin{aligned} \left\| {{e_{k + 1}}} \right\| \le \left\| {{x_{k + 1}}} \right\| . \end{aligned}$$
(17)

From Assumption 1, by applying (3) and (5) into (17), we have

$$\begin{aligned}&\left\| e_{k+1} \right\| \le \left\| f(x_k)+ g(x_k)\mu (x_{k_i})\right\| \nonumber \\&\quad \le \left\| f(x_k)\right\| +\left\| g(x_k) \right\| \overline{u}\nonumber \\&\quad =\left\| f(x_{k_i}-e_k)\right\| +\left\| g(x_{k_i}-e_k) \right\| \overline{u}\nonumber \\&\quad \le \,(P_1+P_2\overline{u})\left\| x_{k_i} \right\| +(P_1+P_2\overline{u})\left\| e_k \right\| \nonumber \\&\quad \le \,(P_1+P_2\overline{u})\left\| x_{k_i} \right\| +2(P_1+P_2\overline{u})\left\| e_k \right\| . \end{aligned}$$
(18)

Therefore, we can obtain

$$\begin{aligned}&\left\| {{e_k}} \right\| \le 2(P_1+P_2\overline{u})\left\| {{e_{k - 1}}} \right\| +(P_1+P_2\overline{u})\left\| {{x_{k_{i}}}} \right\| \nonumber \\&\quad \le 2(P_1+P_2\overline{u})(2(P_1+P_2\overline{u})\left\| {{e_{k - 2}}} \right\| \nonumber \\&\quad + (P_1+P_2\overline{u})\left\| {{x_{k_{i}}}} \right\| ) + (P_1+P_2\overline{u})\left\| {{x_{k_{i}}}} \right\| \cdots \nonumber \\&\quad \le {(2(P_1+P_2\overline{u}))^{k - {k_i}}}\left\| {{e_{k_{i}}}} \right\| \nonumber \\&\quad + {(2(P_1+P_2\overline{u}))^{k - {k_i} - 1}} (P_1+P_2\overline{u})\left\| {{x_{k_{i}}}} \right\| \nonumber \\&\quad + {(2(P_1+P_2\overline{u}))^{k - {k_i} - 2}} (P_1+P_2\overline{u})\left\| {{x_{k_{i}}}} \right\| \nonumber \\&\quad + \cdots + (P_1+P_2\overline{u})\left\| {{x_{k_{i}}}} \right\| . \end{aligned}$$
(19)

Setting the initial condition \({e_{k_{i}}} = 0\), Eq. (19) can be solved as

$$\begin{aligned} \left\| {{e_k}} \right\| \le \frac{{1 - {{(2(P_1+P_2\overline{u}))}^{k - k_{i}}}}}{{1 - 2(P_1+P_2\overline{u})}}(P_1+P_2\overline{u})\left\| {{x_{k_{i}}}} \right\| . \end{aligned}$$
(20)

Thus, we take (20) as the event-triggered condition, that is,

$$\begin{aligned} \left\| {{e_k}} \right\|&\le {e_T}\nonumber \\&\quad = \frac{{1 - {{(2(P_1+P_2\overline{u}))}^{k - k_{i}}}}}{{1 - 2(P_1+P_2\overline{u})}}(P_1+P_2\overline{u})\left\| {{x_{k_{i}}}} \right\| . \end{aligned}$$
(21)
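
The threshold in (21) depends only on the constant \(P_1+P_2\overline{u}\), the elapsed time \(k-k_i\), and the sampled state, so checking it online is inexpensive. A sketch with illustrative names is given below (in the simulations of Sect. 5, \(P_1+P_2\overline{u}\) is 0.2 or 0.1, so the ratio \(2(P_1+P_2\overline{u})\) stays below one):

```python
import numpy as np

def trigger_threshold(x_sample, k, k_i, p_sum):
    """Trigger threshold e_T in (21), with p_sum = P1 + P2 * u_bar."""
    r = 2.0 * p_sum                                   # geometric ratio from (19)-(21)
    return (1.0 - r ** (k - k_i)) / (1.0 - r) * p_sum * np.linalg.norm(x_sample)

def needs_resampling(x, x_sample, k, k_i, p_sum):
    """Return True when ||e_k|| exceeds e_T, i.e. condition (21) is violated."""
    return np.linalg.norm(x_sample - x) > trigger_threshold(x_sample, k, k_i, p_sum)
```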

In the following, we give the stability proof of system (5) with control constraints under condition (21).

Theorem 1

According to Assumption 1, if \(0\le P_1+P_2\overline{u}\le 1\) and the function \(V({x_k})\) satisfies

$$\begin{aligned} V({x_k})&\le V({x_{k_i + 1}})\nonumber \\&\quad = -\xi \alpha V({x_{k_{i}}})({k_{i + 1}} - {k_i}) + V({x_{k_{i}}}), \end{aligned}$$
(22)

for \(k \in [{k_i},{k_{i + 1}})\), \(i = 0,1,2 \ldots \), where \(\xi \in (0,1)\), the event-triggered control system (5) with control constraints is asymptotically stable.

Proof

By (13) and (15), one can get

$$\begin{aligned} \left\| {{x_{k_{i}}}} \right\| \le \alpha _1^{ - 1}(V({x_{k_{i}}})) \le LV({x_{k_{i}}}). \end{aligned}$$
(23)

Substituting (20) into (14), one has

$$\begin{aligned}&V(x_{k+1}) - V({x_k})\le - \alpha V({x_k}) \nonumber \\&\quad + \beta \frac{{1 - {{(2(P_1+P_2\overline{u}))}^{k - {k_i}}}}}{{1 - 2(P_1+P_2\overline{u})}}(P_1+P_2\overline{u})\left\| {{x_{k_{i}}}} \right\| . \end{aligned}$$
(24)

Then, considering (22) and (23), one can define

$$\begin{aligned} \psi _k = \beta \frac{{1 - {{(2(P_1+P_2\overline{u}))}^{k - {k_i}}}}}{{1 - 2(P_1+P_2\overline{u})}}L(P_1+P_2\overline{u}). \end{aligned}$$
(25)

Equation (24) can be written as

$$\begin{aligned} V({x_{k + 1}}) \le (1 - \alpha )V({x_k}) + {\psi _k}V({x_{k_{i}}}). \end{aligned}$$
(26)

Therefore, one obtains

$$\begin{aligned} V({x_k}) \le&\, (1 - \alpha )V({x_{k - 1}}) + {\psi _{k - 1}}V({x_{k_{i}}})\nonumber \\ \le&\, (1 - \alpha )(1 - \alpha )V({x_{k - 2}}) + {\psi _{k - 2}}V({x_{k_{i}}})\nonumber \\&\,+ {\psi _{k - 1}}V({x_{k_{i}}}) \cdots \nonumber \\ \le&\,{(1 - \alpha )^{k - {k_i}}}V({x_{k_{i}}}) \nonumber \\ {}&\,+{(1 - \alpha )^{k - {k_i} - 1}}{\psi _{k_{i}}}V({x_{k_{i}}}) + \cdots \nonumber \\&\, + (1 - \alpha ){\psi }_{k - 2}V({x_{k_{i}}}) + {\psi _k}V({x_{k_{i}}}). \end{aligned}$$
(27)

According to Theorem 1, the sequence \({\psi _k}\) is monotonically increasing with a positive common ratio. Then, (27) can be solved as

$$\begin{aligned} V({x_k}) \le&\,(1 - \alpha {)^{k - {k_i}}}V({x_{k_{i}}}) \nonumber \\&\quad + {\psi _k}\frac{{1 - {{(1 - \alpha )}^{k - {k_i}}}}}{\alpha }V({x_{k_{i}}}). \end{aligned}$$
(28)

Based on (22), one has

$$\begin{aligned} \begin{aligned} {V{({x_k})}} \le - \xi \alpha V({x_{k_{i}}})(k - {k_i})+ V({x_{k_{i}}}). \end{aligned} \end{aligned}$$
(29)

To simplify the calculation, one can define

$$\begin{aligned} \begin{aligned} {M {({x_k})}} = - \xi \alpha V({x_{k_{i}}})(k - {k_i})+ V({x_{k_{i}}}). \end{aligned} \end{aligned}$$
(30)

Therefore, one has

$$\begin{aligned} V({x_k}) \le M ({x_k}). \end{aligned}$$
(31)

From (30), the first difference of \(M({x_k})\) can be obtained as

$$\begin{aligned} \varDelta M&= M ({x_{k + 1}}) - M ({x_k})\nonumber \\&\quad = - \xi \alpha V({x_{k_{i}}}). \end{aligned}$$
(32)

Then, substituting (13) into (32), one obtains

$$\begin{aligned} \varDelta M \le - \xi \alpha \,{\alpha _1}(\left\| {x_{k_i}} \right\| ) < 0. \end{aligned}$$
(33)

This completes the proof. \(\square \)

From the above derivation, it can be concluded that the event-triggered control system (5) with control constraints is asymptotically stable.

4 Event-triggered controller design based on GDHP structure

In this section, we apply the event-triggered condition to the GDHP structure. Since the actual optimal control law cannot be obtained exactly, an iterative stopping criterion is designed to obtain an approximate optimal control law. An iterative process is completed only when the stopping condition is satisfied. Then, the trigger error between the sampled state and the current state is compared with the trigger threshold online. When the designed trigger condition is violated, the current control law is resampled. Otherwise, the control law is maintained by the ZOH.

This section is divided into three parts. First, the approximate optimal event-triggered controller is designed. Then, the NN implementation of the GDHP technique is described. Finally, the iterative stopping criterion is provided.

4.1 Event-triggered controller design

The event-triggered controller design based on the GDHP structure is displayed in Fig. 1. The proposed event-triggered GDHP method is implemented with three NN structures: the model network is designed to identify the system dynamics, the critic network approximates the cost function and its partial derivative, and the action network generates the event-triggered approximate optimal control law.

In addition, a sensor device is designed to judge whether the trigger condition is violated. When the trigger condition is violated, the current time is set to the sampling instant \({k_i}\), \(i = 1,2, \ldots \), and the control law \(\mu ({x_{k_i}})\) is maintained by the ZOH device during \({k_i} \le k < {k_{i + 1}}\).
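
To summarize the interplay shown in Fig. 1, the following sketch (illustrative names and simplified interfaces, reusing `trigger_threshold` from Sect. 3) outlines one possible online loop: the sensor compares \(\Vert e_k\Vert \) with \(e_T\), the action network is queried only at the trigger instants, and the ZOH holds the control in between.

```python
import numpy as np

def online_event_triggered_loop(plant, action_net, x0, steps, p_sum):
    """Illustrative online procedure for the event-triggered controller of Fig. 1.
    plant(x, u)   -> next true state (the model and critic networks are trained
                     as in Sect. 4 and are hidden behind action_net here)
    action_net(x) -> approximate constrained control mu_hat(x)."""
    x = np.asarray(x0, dtype=float)
    x_sample, k_i = x.copy(), 0                 # sampling instant k_0
    u_hold = action_net(x_sample)               # control held by the ZOH
    history = []
    for k in range(steps):
        e_norm = np.linalg.norm(x_sample - x)                     # trigger error (3)
        if e_norm > trigger_threshold(x_sample, k, k_i, p_sum):   # condition (21) violated
            x_sample, k_i = x.copy(), k                           # new sampling instant
            u_hold = action_net(x_sample)                         # action network fires
        history.append((x.copy(), u_hold))
        x = plant(x, u_hold)                                      # ZOH keeps u constant otherwise
    return history
```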

Fig. 1 Event-triggered control system based on the GDHP structure

4.2 NN implementation of GDHP technique

4.2.1 Model network

For an unknown system, before performing the iterative calculation, a model network is first constructed to identify the system dynamics. The number of hidden layer neurons is set as \({N_m}\). Let \({\upsilon _{m}}\) denote the input-to-hidden weight matrix and \({\omega _{m}}\) the hidden-to-output weight matrix. Based on the state vector \({x_k}\) and the control law \( \mu ({x_k})\), which together form the network input \(x_{mk}\), the state vector \({{\widehat{x}}_{k + 1}}\) at the next time step is obtained as

$$\begin{aligned} {{\widehat{x}}_{k + 1}} = \omega _{m}^T\phi (\upsilon _{m}^T{x_{mk}}), \end{aligned}$$
(34)

where \(\phi ( \cdot ) \in {\mathbb {R}^{{N_m}}}\) is the activation function, which satisfies

$$\begin{aligned} \phi (a) = \frac{{1 - \exp ( - a)}}{{1 + \exp ( - a)}}. \end{aligned}$$
(35)

Set the error function as

$$\begin{aligned} {e_{mk}} = {{\widehat{x}}_{k + 1}} - {x_{k + 1}} \end{aligned}$$
(36)

and the objective error function as

$$\begin{aligned} {E_{mk}} = \frac{1}{2}e_{mk}^T{e_{mk}}. \end{aligned}$$
(37)

In the iterative training of the model network, the weights are updated based on the gradient descent rule as follows:

$$\begin{aligned} \omega _{m{(k + 1)}} =&\,\omega _{mk} - {\vartheta _m}\left[ \frac{{\partial {E_{mk}}}}{{\partial \omega _{mk}}}\right] , \end{aligned}$$
(38)
$$\begin{aligned} \upsilon _{m{(k + 1)}} =&\, \upsilon _{mk} - {\vartheta _m}\left[ \frac{{\partial {E_{mk}}}}{{\partial \upsilon _{mk}}}\right] , \end{aligned}$$
(39)

where \({\vartheta _m}\) is the learning rate. Note that the weights are kept unchanged after sufficient training and are used in the subsequent training of the critic and action networks.
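
A minimal NumPy sketch of the model network (34)–(39) is given below (an illustration, not the authors' code); the network input is taken as the concatenation of \(x_k\) and \(\mu (x_k)\), which is consistent with the 3–8–2 structure used in case 1, and the weight gradients follow from the chain rule.

```python
import numpy as np

def phi(a):
    """Bipolar sigmoid activation (35)."""
    return (1.0 - np.exp(-a)) / (1.0 + np.exp(-a))

def dphi(a):
    """Derivative of (35), used for backpropagation: phi'(a) = (1 - phi(a)^2) / 2."""
    return 0.5 * (1.0 - phi(a) ** 2)

class ModelNetwork:
    """Three-layer model network: x_hat_{k+1} = omega_m^T phi(upsilon_m^T x_mk), eq. (34)."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.v = rng.uniform(-0.1, 0.1, (n_in, n_hidden))   # input-to-hidden weights
        self.w = rng.uniform(-0.1, 0.1, (n_hidden, n_out))  # hidden-to-output weights
        self.lr = lr                                        # learning rate theta_m

    def predict(self, x, u):
        x_m = np.concatenate([np.atleast_1d(x), np.atleast_1d(u)])  # x_mk = [x_k; mu(x_k)]
        h = phi(self.v.T @ x_m)
        return self.w.T @ h, (x_m, h)

    def train_step(self, x, u, x_next):
        """One gradient-descent update (38)-(39) on the error (36)-(37)."""
        x_hat, (x_m, h) = self.predict(x, u)
        e = x_hat - np.asarray(x_next, dtype=float)          # e_mk, eq. (36)
        grad_w = np.outer(h, e)                              # dE_mk / d omega_m
        grad_v = np.outer(x_m, (self.w @ e) * dphi(self.v.T @ x_m))  # dE_mk / d upsilon_m
        self.w -= self.lr * grad_w
        self.v -= self.lr * grad_v
        return 0.5 * float(e @ e)                            # E_mk, eq. (37)
```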

4.2.2 Critic network

It is well known that the critic network of the HDP structure learns the cost function \({V}({x_k})\), whereas the critic network of the DHP structure learns its partial derivatives \({{\partial {V}({x_k})}/ {\partial {x_k}}}\). In the GDHP structure, the critic network learns both the cost function \({V}({x_k})\) and its partial derivatives \({{\partial {V}({x_k})} / {\partial {x_k}}}\). Because the GDHP structure learns more information about the system, better control performance can be obtained by this method. For simplicity, the partial derivative is denoted as \({\lambda }({x_k}) = {{\partial {V}({x_k})} / {\partial {x_k}}}\).

In the critic network, the outputs can be obtained as

$$\begin{aligned} {{\widehat{V}}}({x_{k_{i}}}) =&\,\omega _{c}^{VT}\phi (\upsilon _{c}^T{x_{k_{i}}}), \end{aligned}$$
(40)
$$\begin{aligned} {{\widehat{\lambda }}}({x_{k_{i}}}) =&\, \omega _{c}^{\lambda T}\phi (\upsilon _{c}^T{x_{k_{i}}}), \end{aligned}$$
(41)

where \({\upsilon _c}\) denotes the input-to-hidden weight matrix and \({\omega _c}\) the hidden-to-output weight matrix. The target functions can be expressed as

$$\begin{aligned} {V}({x_{k_{i}}}) =&\,U_k + {{\widehat{V}}}({{\widehat{x}}_{k_{i} + 1}}), \end{aligned}$$
(42)
$$\begin{aligned} {\lambda }({x_{k_{i}}}) =&\,2Q{x_k} + 2\left( \frac{\partial {{\mu }}({{x}_{{{k}_{i}}}})}{\partial {{x}_{{{k}_{i}}}}} \right) \overline{U}R{{\varphi }^{-1}}({{\overline{U}}^{-1}}{{\widehat{\mu }}}({{x}_{{{k}_{i}}}}))\nonumber \\ {}&+ {(\frac{{\partial {{{\widehat{x}}}_{k_{i} + 1}}}}{{\partial {x_{k_{i}}}}} + \frac{{\partial {{{\widehat{x}}}_{k_{i} + 1}}}}{{\partial {{{\widehat{\mu }} }}({x_{k_{i}}})}}\frac{{\partial {{{\widehat{\mu }} }}({x_{k_{i}}})}}{{\partial {x_{k_{i}}}}})^T} \nonumber \\ {}&\times {{\widehat{\lambda }}}({{\widehat{x}}_{k_{i} + 1}}). \end{aligned}$$
(43)

Hence, the error function can be obtained as

$$\begin{aligned} e_{ck}^V =&{{\widehat{V}}}({x_{k_{i}}}) - {V}({x_{k_{i}}}), \end{aligned}$$
(44)
$$\begin{aligned} e_{ck}^\lambda =&\, {{\widehat{\lambda }} }({x_{k_{i}}}) - {\lambda }({x_{k_{i}}}). \end{aligned}$$
(45)

The minimized objective error function is

$$\begin{aligned} {E_{ck}} = (1 - \rho )\left( \frac{1}{2}e_{ck}^{VT}e_{ck}^V\right) + \rho \left( \frac{1}{2}e_{ck}^{\lambda T}e_{ck}^\lambda \right) . \end{aligned}$$
(46)

According to (46) and the gradient descent rule, the weights of the critic network are updated as

$$\begin{aligned} \omega _{c(k+1)} =&\,\omega _{ck} - {\vartheta _c}\left[ (1 - \rho )\frac{{\partial E_{ck}^V}}{{\partial \omega _{ck}}} + \rho \frac{{\partial E_{ck}^\lambda }}{{\partial \omega _{ck}}}\right] , \end{aligned}$$
(47)
$$\begin{aligned} \upsilon _{c(k+1)} =&\,\upsilon _{ck} - {\vartheta _c}\left[ (1 - \rho )\frac{{\partial E_{ck}^V}}{{\partial \upsilon _{ck}}} + \rho \frac{{\partial E_{ck}^\lambda }}{{\partial \upsilon _{ck}}}\right] , \end{aligned}$$
(48)

where \({\vartheta _c} > 0\) is the learning rate and \(0 \le \rho \le 1\) is a constant that balances the HDP and DHP components of the GDHP structure: when \(\rho = 0\), the structure reduces to pure HDP, and when \(\rho = 1\), it reduces to pure DHP.
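
The distinguishing feature of the GDHP critic is the blended objective (46). A sketch of the corresponding update (47)–(48) is given below, reusing `phi` and `dphi` from the model-network sketch and taking the targets \(V(x_{k_i})\) and \(\lambda (x_{k_i})\) from (42)–(43) as given inputs (the names and class interface are illustrative).

```python
import numpy as np

class CriticNetwork:
    """GDHP critic: shared hidden layer with a scalar V head (40) and a vector lambda head (41)."""

    def __init__(self, n_in, n_hidden, lr=0.01, rho=0.5, seed=1):
        rng = np.random.default_rng(seed)
        self.v = rng.uniform(-0.5, 0.5, (n_in, n_hidden))    # input-to-hidden weights
        self.wV = rng.uniform(-0.5, 0.5, n_hidden)           # output weights for V_hat
        self.wL = rng.uniform(-0.5, 0.5, (n_hidden, n_in))   # output weights for lambda_hat
        self.lr, self.rho = lr, rho

    def forward(self, x):
        x = np.asarray(x, dtype=float)
        h = phi(self.v.T @ x)
        return float(self.wV @ h), self.wL.T @ h, h          # V_hat (40), lambda_hat (41)

    def train_step(self, x, V_target, lam_target):
        """One update of (47)-(48) on the blended objective (46)."""
        x = np.asarray(x, dtype=float)
        V_hat, lam_hat, h = self.forward(x)
        eV = V_hat - V_target                                 # e_ck^V, eq. (44)
        eL = lam_hat - np.asarray(lam_target, dtype=float)    # e_ck^lambda, eq. (45)
        rho = self.rho
        grad_wV = (1.0 - rho) * eV * h
        grad_wL = rho * np.outer(h, eL)
        back = ((1.0 - rho) * eV * self.wV + rho * (self.wL @ eL)) * dphi(self.v.T @ x)
        self.wV -= self.lr * grad_wV
        self.wL -= self.lr * grad_wL
        self.v -= self.lr * np.outer(x, back)
        return (1.0 - rho) * 0.5 * eV ** 2 + rho * 0.5 * float(eL @ eL)  # E_ck, eq. (46)
```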

4.2.3 Action network

In the action network, the sampled state \(x_{k_i}\) is taken as the input, and the output \({{\widehat{\mu }} }(x_{k_i})\) is used to approximate \({{\mu }}({{x}_{{{k}_{i}}}})\). With the activation function \(\phi ( \cdot )\) defined in (35), \({{\widehat{\mu }} }(x_{k_i}) \) can be formulated as

$$\begin{aligned} {{\widehat{\mu }} }(x_{k_i}) = \omega _{a}^T\phi (\upsilon _{a}^T{x_{k_i}}), \end{aligned}$$
(49)

where \({\upsilon _a}\) represents the weight matrix of input-to-hidden layer, and \({\omega _a}\) represents the weight matrix of hidden-to-output layer. According to (10), the target control input at the sampling instant \(k_i\) can be obtained as

$$\begin{aligned} {{\mu }}({{x}_{{{k}_{i}}}})=\overline{U}\varphi \left( -\frac{1}{2}{{\left( \overline{U}R \right) }^{-1}}{{g}^{T}}\left( {{x}_{k_i}} \right) \frac{\partial {{V}^{*}}({{x}_{k_i+1}})}{\partial {{x}_{k_i+1}}} \right) . \end{aligned}$$
(50)

Define the error function \({e_{ak}}\) and the objective error function \({E_{ak}}\) as

$$\begin{aligned} {e_{ak}} =&\,{{\widehat{\mu }} }(x_{k_i})-{{\mu }}({{x}_{{{k}_{i}}}}), \end{aligned}$$
(51)
$$\begin{aligned} {E_{ak}} =&\,\frac{1}{2}e_{ak}^T{e_{ak}}. \end{aligned}$$
(52)

Similarly, the weights updating rule of the action network can be formulated as

$$\begin{aligned} \omega _{a(k + 1)} =&\,\omega _{ak} - {\vartheta _a}\left[ \frac{{\partial {E_{ak}}}}{{\partial \omega _{ak}}}\right] , \end{aligned}$$
(53)
$$\begin{aligned} \upsilon _{a(k + 1)} = \,&\upsilon _{ak} - {\vartheta _a}\left[ \frac{{\partial {E_{ak}}}}{{\partial \upsilon _{ak}}}\right] , \end{aligned}$$
(54)

where \({\vartheta _a} > 0\) is the learning rate.
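
An analogous sketch for the action network (49)–(54) follows; `mu_target` stands for the constrained target control (50), which can for instance be evaluated with the `constrained_control` sketch from Sect. 2 using the critic's \(\widehat{\lambda }\) output and the model network (again, names are illustrative).

```python
import numpy as np

class ActionNetwork:
    """Action network (49): mu_hat(x_{k_i}) = omega_a^T phi(upsilon_a^T x_{k_i})."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.1, seed=2):
        rng = np.random.default_rng(seed)
        self.v = rng.uniform(-0.5, 0.5, (n_in, n_hidden))    # input-to-hidden weights
        self.w = rng.uniform(-0.5, 0.5, (n_hidden, n_out))   # hidden-to-output weights
        self.lr = lr                                         # learning rate theta_a

    def forward(self, x):
        x = np.asarray(x, dtype=float)
        h = phi(self.v.T @ x)
        return self.w.T @ h, h

    def train_step(self, x, mu_target):
        """One gradient step of (53)-(54) on the error (51)-(52)."""
        x = np.asarray(x, dtype=float)
        mu_hat, h = self.forward(x)
        e = mu_hat - np.asarray(mu_target, dtype=float)       # e_ak, eq. (51)
        grad_w = np.outer(h, e)
        grad_v = np.outer(x, (self.w @ e) * dphi(self.v.T @ x))
        self.w -= self.lr * grad_w
        self.v -= self.lr * grad_v
        return 0.5 * float(e @ e)                             # E_ak, eq. (52)
```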

4.3 Approximate optimal algorithm design

Define \({V_\infty }({x_k}) = \mathop {\lim }\limits _{l \rightarrow \infty } {V_l}({x_k})\). If the system state \(x_k\) is controllable, the cost function \({V_\infty }({x_k})\) equals the optimal cost function \({V^*}({x_k})\), that is,

$$\begin{aligned} \mathop {\lim }\limits _{l \rightarrow \infty } {V_l}({x_k}) = {V^*}({x_k}), \end{aligned}$$
(55)

where l is the iteration index of the outer loop. As \(l \rightarrow \infty \), we have \({V_l}({x_k}) \rightarrow {V^*}({x_k})\) in theory. However, it is not possible to iterate indefinitely in the actual computation. Thus, we introduce an error bound \(\varepsilon \) so that the cost function \(V(x_k)\) is regarded as converged after a finite number of iterations [38]. That is, there exists a finite l such that the cost function \({V_l}({x_k})\) satisfies

$$\begin{aligned} \left| {{V^*}({x_k}) - {V_l}({x_k})} \right| \le \varepsilon . \end{aligned}$$
(56)

In the iterative ADP algorithm, this design achieves approximate optimal regulation. However, since the optimal cost function \({V^*}({x_k})\) is unknown in general, it is difficult to use the termination criterion (56) to verify whether the iterative algorithm meets the requirement. Therefore, we use the criterion

$$\begin{aligned} \left| {{V_{l + 1}}({x_k}) - {V_l}({x_k})} \right| \le \varepsilon \end{aligned}$$
(57)

to replace (56).
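
In implementation terms, the stopping rule (57) simply wraps the per-state iteration, as in the following sketch; `one_iteration` is an illustrative placeholder for one sweep of critic and action updates that returns the current estimate \(V_l(x_k)\).

```python
def iterate_until_converged(one_iteration, x_k, eps=1e-6, max_iter=4000):
    """Run the inner iteration for state x_k until |V_{l+1} - V_l| <= eps, eq. (57)."""
    V_prev = one_iteration(x_k)              # V_0(x_k)
    for l in range(1, max_iter + 1):
        V_curr = one_iteration(x_k)          # V_l(x_k) after one more sweep
        if abs(V_curr - V_prev) <= eps:      # termination criterion (57)
            return V_curr, l
        V_prev = V_curr
    return V_prev, max_iter                  # budget exhausted before meeting (57)
```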

5 Simulation results and analysis

In this section, the event-triggered GDHP method is applied to three discrete-time systems. Simulations show the advantages of the proposed method by comparison with the traditional GDHP method.

5.1 Case 1: two-dimensional system

Consider the following discrete-time system:

$$\begin{aligned} {{x}_{k+1}}=\left[ \begin{matrix} {{x}_{1k}+0.1{{x}_{2k}}} \\ -2{{x}_{1k}}+0.7{{x}_{2k}} \\ \end{matrix} \right] +\left[ \begin{matrix} 0 \\ x_{1k} \\ \end{matrix} \right] {{u}_{k}}, \end{aligned}$$
(58)

where \({x_k} = {\left[ {{x_{1k}},{x_{2k}}} \right] ^T} \in {\mathbb {R}^2}\) is the state vector and \( {{u}_{k}}\in \mathbb {R}\) is the control input. The initial state vector is set as \(x_0=[-1,1]^T\) and the bound of the saturated actuator is chosen as \(\left| u \right| \le 0.1\). Let the parameters be \(Q=I_2\) and \(R=I\), where the subscript denotes the dimension of the identity matrix. Based on (21), we set \(P_1+P_2\overline{u}=0.2\). The event-triggered threshold is then

$$\begin{aligned} {e_T} = \frac{{1 - {{0.4}^{k - {k_i}}}}}{{1 - 0.4}} \cdot 0.2\left\| {{x_{{k_i}}}} \right\| . \end{aligned}$$
(59)
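
As a quick numerical illustration of (59) (these values are computed here and are not reported in the paper), the threshold grows within a trigger interval toward the limit \(0.2/0.6\cdot \Vert x_{k_i}\Vert \approx 0.333\,\Vert x_{k_i}\Vert \):

```python
import numpy as np

p_sum = 0.2                                   # P1 + P2 * u_bar chosen for case 1
x_sample = np.array([-1.0, 1.0])              # sampled state, here the initial state x_0
for delta in range(1, 5):                     # elapsed steps k - k_i since the last trigger
    e_T = (1.0 - 0.4 ** delta) / (1.0 - 0.4) * p_sum * np.linalg.norm(x_sample)
    print(delta, e_T)                         # e_T approaches (0.2 / 0.6) * ||x_{k_i}|| ~ 0.471
```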

The model network needs to be pre-trained to identify the system dynamics before the proposed method is implemented. The structure of the model network is designed as 3–8–2 and the learning rate is set as \({\vartheta _m=0.1}\). It is well known that the parameter settings affect the convergence speed and control performance of the algorithm to a certain extent. In order to obtain good control performance, the initial weights of the three networks are randomly selected from [\(-\,0.1\), 0.1] after multiple experiments, which enables the algorithm to achieve high control accuracy. To capture the system dynamics sufficiently, 500 data samples are randomly selected from \([-\,1,1]\) for training. The model network is trained for 50 time steps on each sample, and the training performance is shown in Fig. 2. From Fig. 2, we can see that the training error is large at the beginning; as the number of samples increases, the training error becomes smaller and eventually converges to zero. After the training, the obtained weights are kept unchanged for the subsequent training.

Next, the critic network and the action network are designed with structures 2–8–3 and 2–8–1, respectively. The initial weights of the two networks are randomly generated within \([-\,0.5, 0.5]\) and the adjusting parameter is set as \(\rho = 0.5\). The learning rates are chosen as \({\vartheta _c}=0.01\) and \({\vartheta _a=0.1}\). During the iterative process, each network is trained for 200 inner-loop steps at each of the 4000 training steps. In addition, we set \(\varepsilon ={10^{ - 6}}\) as the termination condition for each state, which ensures that the control law u is an approximate optimal control. As shown in Fig. 3, the weight norms of the two networks converge after training. It should be noted that the weights in Fig. 3 are updated online: during the iterative process for each state, the weights are updated until the termination criterion is satisfied. Over the entire update process, the iteration is performed for 4000 training steps.

Fig. 2 The training result and the state error of the model network

Fig. 3 The norm of the weight matrix for the action and critic networks with the GDHP method

Fig. 4 State trajectories with the traditional GDHP method and the event-triggered GDHP method

In order to demonstrate the effectiveness of the proposed method, the traditional GDHP method is also applied to this example for comparison. The state trajectories and the control input curves over 200 time steps under the two methods are shown in Figs. 4 and 5. As can be seen in Fig. 5, compared with the traditional GDHP method, our method handles the constrained-input problem. The trigger error \(\left\| {{e_k}} \right\| \) and the trigger threshold \({e_T}\) are given in Fig. 6. The error \(\left\| {{e_k}} \right\| \) between the current state and the sampled state is calculated at each step; if \(\left\| {{e_k}} \right\| > {e_T}\), \(\left\| {{e_k}} \right\| \) is reset to zero. In this simulation, the action network of the traditional GDHP method needs 200 samples to update the control input, while the event-triggered GDHP method only needs 82 samples.

Fig. 5 Control input trajectories u with the traditional GDHP method and the event-triggered GDHP method

Fig. 6 Trigger error and trigger threshold

5.2 Case 2: three-dimensional system

Consider the following discrete-time affine nonlinear system presented in [39]:

$$\begin{aligned} {x_{k + 1}} = \left[ {\begin{array}{*{20}{c}} {{x_{1k}}{x_{2k}}}\\ {x_{1k}^2 - 0.5\sin ({x_{2k}})}\\ {{x_{3k}}} \end{array}} \right] + \left[ {\begin{array}{*{20}{c}} 0\\ 1\\ { - 1} \end{array}} \right] {u_k}, \end{aligned}$$
(60)

where the state vector is \({x_k} = {\left[ {{x_{1k}},{x_{2k}},{x_{3k}}} \right] ^T} \in {\mathbb {R}^3}\). Set the initial state vector as \({{x}_{0}}={{\left[ -0.5,0.5,1 \right] }^{T}}\) and the saturation bound as \(\left| u \right| \le 0.1\). The performance index function is chosen as in case 1 with \(Q=I_n\) and \(R=I_m\). The termination error is set as \(\varepsilon ={{10}^{-4}}\). In this case, we let \(P_1+P_2\overline{u}=0.1\), so the trigger threshold becomes

$$\begin{aligned} {e_T} = \frac{{1 - {{0.2}^{k - {k_i}}}}}{{1 - 0.2}} \cdot 0.1\left\| {{x_{{k_i}}}} \right\| . \end{aligned}$$
(61)
Fig. 7 State trajectories x with the traditional GDHP method and the event-triggered GDHP method

The structures of the model, critic, and action networks are designed as 4–8–3, 3–8–1, and 3–8–1, respectively. All the initial weights of the three networks are randomly set in [\(-\,0.1\), 0.1], and the other parameters are designed as in case 1. Similar to case 1, the model network is pre-trained first, and the resulting fixed weights are used to train the critic and action networks.

In this case, we also compare the traditional GDHP method with the event-triggered GDHP method. The state trajectories over 100 time steps under the two methods are shown in Fig. 7. According to the state trajectories, the two methods achieve similar performance. Figure 8 shows the control input trajectories under the two methods. As can be seen from Fig. 8, the proposed method handles the control constraints and reduces the computational burden. The event-triggered error and trigger threshold are shown in Fig. 9, from which we can see that the action network of the traditional GDHP method is updated 100 times, while in the proposed method it is only updated 34 times. Besides, the evolution of the hidden-to-output weights of the action network is shown in Fig. 10. Since the weights in Fig. 10 are obtained through the outer iteration and are only updated at the sampling instants, the number of iterative time steps of the action network equals the number of time steps of the state trajectory.

Fig. 8 Control input trajectories u with the traditional GDHP method and the event-triggered GDHP method

Fig. 9 Trigger error and trigger threshold

Fig. 10 The weights of the action network from the hidden-to-output layer

5.3 Case 3: torsional pendulum system

In this case, we apply the event-triggered GDHP method to the torsional pendulum system, whose mechanical model is shown in Fig. 11. The mathematical description of this system is as follows [40]:

$$\begin{aligned} \left\{ \begin{aligned}&\frac{\mathrm{d}\theta }{\mathrm{d}t}=\omega \\&G\frac{\mathrm{d}\omega }{\mathrm{d}t}=u-Mgl\sin \theta -{{f}_{d}}\frac{\mathrm{d}\theta }{\mathrm{d}t}, \end{aligned} \right. \end{aligned}$$
(62)

where \(M=1/3\) kg is the mass, and \(l=2/3\) m and \(G=4/3\,M{{l}^{2}}\) are the length of the pendulum bar and the rotary inertia, respectively. Let \({{f}_{d}}=0.2\) be the frictional factor and \(g=9.8~\mathrm {m/s^2}\) be the acceleration of gravity. The angle \(\theta \) and the angular velocity \(\omega \) are the state variables of the system.

Fig. 11 The mechanical model of the torsional pendulum

Using the sampling interval \(\varDelta t=0.1\) s, the dynamics of the torsional pendulum system can be discretized as

$$\begin{aligned}{{x}_{k+1}}=\left[ \begin{matrix} 0.1{{x}_{2k}}+{{x}_{1k}} \\ -0.49\sin ({{x}_{1k}})-0.1{{f}_{d}}\cdot {{x}_{2k}}+{{x}_{2k}} \\ \end{matrix} \right] +\left[ \begin{matrix} 0 \\ 0.1 \\ \end{matrix} \right] {{u}_{k}},\end{aligned}$$

where \({{x}_{1k}}=\theta \) and \({{x}_{2k}}=\omega \). The initial state and the control constraint are set as \({{x}_{0}}={{\left[ -1,1 \right] }^{T}}\) and \(\left| u \right| \le 0.3\), respectively. The structures of the model, critic, and action networks are designed as 3–8–2, 2–8–1, and 2–8–1, respectively. The termination error is set as \(\varepsilon ={{10}^{-3}}\). Besides, the trigger threshold is chosen as in case 2 and all other parameters are the same as in case 1.
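
For reference, a direct forward-Euler discretization of (62) with \(\varDelta t=0.1\) s under the listed parameters can be sketched as below; note that the coefficients of the discretized model quoted above follow [40], so they need not coincide exactly with this textbook Euler step.

```python
import numpy as np

# torsional pendulum parameters from the text
M, l, g_acc, f_d, dt = 1.0 / 3.0, 2.0 / 3.0, 9.8, 0.2, 0.1
G = 4.0 / 3.0 * M * l ** 2                   # rotary inertia

def pendulum_euler_step(x, u):
    """One forward-Euler step of (62); x = [theta, omega], u is the scalar torque input."""
    theta, omega = x
    d_theta = omega
    d_omega = (u - M * g_acc * l * np.sin(theta) - f_d * omega) / G
    return np.array([theta + dt * d_theta, omega + dt * d_omega])
```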

In this case, the traditional GDHP method and the proposed method are both applied to the torsional pendulum system. The state trajectories over 100 time steps under the two methods are shown in Fig. 12, which indicates that the control performance of the proposed method is similar to that of the traditional GDHP method. However, the event-triggered GDHP method constrains the control input to the prescribed range, as shown in Fig. 13. To further illustrate the effectiveness of the proposed algorithm, the trigger threshold and the event-triggered error are given in Fig. 14. Compared with the traditional time-triggered method, which requires 100 samples, the event-triggered method only needs 34 samples, a saving of 66%.

Fig. 12 State trajectories with the traditional GDHP method and the event-triggered GDHP method

Fig. 13 Control input trajectories u with the traditional GDHP method and the event-triggered GDHP method

Fig. 14 Trigger error and trigger threshold

6 Conclusion

In this paper, a novel event-triggered method based on the GDHP technique is proposed for a class of discrete-time nonlinear systems with control constraints. In order to solve the constrained-input problem, a non-quadratic performance index is introduced into the utility function. Additionally, we derive a trigger threshold and use the Lyapunov technique to prove the stability of the event-triggered system. Then, the NN implementation based on the GDHP technique is given, and an iterative termination criterion is designed to obtain the approximate optimal control. Finally, three cases are given to demonstrate the effectiveness of the event-triggered GDHP method by comparison with the traditional GDHP method. According to the simulation results of cases 1, 2 and 3, the event-triggered method saves 59%, 66% and 66% of the computing resources, respectively, compared with the traditional time-triggered method. Therefore, it can be concluded that the event-triggered GDHP method can solve the constrained-input problem and reduce the computational burden while guaranteeing the system performance.