Abstract
In this paper, a novel robust tracking control strategy for nonlinear systems with unmatched uncertainties is formulated using the event-based adaptive dynamic programming (ADP) approach. First, an augmented system is constructed from the nonlinear system and the reference trajectory. Then, by forming an auxiliary system and introducing a discounted cost function, the event-based robust tracking control problem is transformed into the event-based optimal control problem of the auxiliary system. The event-based Hamilton–Jacobi–Bellman (HJB) equation associated with this optimal control problem is solved using a single critic neural network (NN) under the ADP framework. A novel weight tuning rule for the critic network is formulated that avoids the need for an initial admissible control at the beginning of the weight tuning process. The obtained event-based controller is updated only at the triggering instants decided by the designed triggering condition, which significantly reduces the computational and communication resources used. Meanwhile, it is demonstrated that the obtained event-based controller guarantees the uniform ultimate boundedness of the tracking error. Furthermore, using the Lyapunov method, it is shown that the established novel event-triggering rule ensures uniform ultimate boundedness of all signals associated with the closed-loop auxiliary system. Finally, the applicability of the proposed control scheme is demonstrated through two simulation examples.
1 Introduction
Uncertainties are inevitable in practical nonlinear systems because of external disturbances and modeling errors. Hence, given the requirement that the designed feedback controller be robust to uncertainties, many robust control design schemes have been developed over the past several decades [1,2,3]. In particular, the method developed by Lin [3], in which the optimal control approach is utilized to obtain the robust controller, has received remarkable attention [4, 5]. In the case of linear systems, the optimal controller can be derived conveniently by solving the associated algebraic Riccati equation (ARE) [6]. However, for nonlinear systems, instead of the ARE, one needs to find the solution of the Hamilton–Jacobi–Bellman (HJB) equation [7]. Since the HJB equation is a nonlinear partial differential equation, solving it analytically is challenging. Although dynamic programming is generally used to solve the optimal control problem of nonlinear systems, it suffers heavily from the notorious “curse of dimensionality” [8]. The neural network (NN)-based function approximation technique called adaptive dynamic programming (ADP) has been employed to address this difficulty [9, 10]. The ADP approach was initially developed by Werbos to determine the solution of the optimal control problem effectively [11]. The ADP algorithm is closely related to the reinforcement learning (RL) technique [12]. In the literature, ADP is also known as approximate dynamic programming [13], adaptive critic designs (ACDs) [14], neural dynamic programming (NDP) [15], and Q-learning [16].
In most practical applications, the system states need to track a desired trajectory rather than merely converge to zero [17, 18]. In the past several years, significant work has been done on tracking control by combining the aforementioned robust control method and ADP algorithms [19,20,21,22]. In [19], the integral reinforcement learning technique is utilized to develop an optimal tracking controller for constrained-input systems. For nonlinear matched uncertain systems, a robust tracking controller is designed via a neural network approach in [20], and a guaranteed cost tracking controller is developed in [21]. In [22], a tracking controller for nonlinear systems with unmatched uncertainties is derived via ACDs. However, all the works mentioned above are developed under the classical time-triggered framework, which suffers from inefficient use of computational and communication resources.
Compared to the classical time-triggered approach, in the event-based or event-triggered strategy, the controller is updated only when a predefined triggering rule is not fulfilled, which enables effective use of computational and communication resources [23,24,25,26]. Thus, many works have combined the ADP-based robust control approach and the event-based framework. In [27], an actor-critic NN structure is utilized to derive an event-based optimal adaptive controller for nonlinear systems. In [28], an event-based guaranteed cost controller is derived for nonlinear systems utilizing a single critic NN. In [29], an adaptive critic is used to design an event-based near-optimal controller for heating, ventilation, and air conditioning (HVAC) systems. In [30], an event-triggered optimal controller is designed for the continuous stirred tank reactor (CSTR) system using the ADP approach. The event-based ADP framework has been further utilized in designing controllers for nonlinear systems with constrained input [31], with matched uncertainties [32], and with unmatched uncertainties [33].
Under the event-based ADP framework, tracking controllers for nonlinear systems have been designed in [34,35,36,37,38,39,40,41,42,43]. In [34,35,36, 38, 44], the ADP approach is used to formulate event-based optimal tracking controllers for nonlinear systems without considering any uncertainties. The event-based ADP approach is utilized to derive an optimal tracking controller for modular reconfigurable robots in [39], and in [40], the tracking controller is designed with application to wastewater treatment. In [41], event-based ADP is utilized to develop a tracking controller for constrained-input systems. In [42], Cui et al. established an event-based \(H_{\infty }\) tracking controller via the RL method. In our previous work [43], a nonlinear matched uncertain system was considered while designing the event-based robust tracking controller. In [44], the event-triggered ADP approach is used to design a tracking controller for partially unknown matched uncertain constrained systems. Nonetheless, the existing literature has not focused on developing a robust tracking controller for continuous-time nonlinear systems with unmatched uncertainty via the event-based ADP approach, particularly without using the \(H_{\infty }\) control approach [42]. Unlike matched uncertainty, unmatched uncertainty enters the system through a different channel than the control input. Unmatched uncertainties are a more general class of uncertainties and arise widely in practical systems. Therefore, it is vital to consider unmatched uncertainty while designing a controller for nonlinear systems. This motivates the research developed in this paper.
The following are the major contributions of this work.
1. Compared with [34,35,36,37,38, 41], in this work, uncertainty is considered while designing the ADP-based event-triggered robust tracking controller. As unmatched uncertainties are the most common form of uncertainty, they must be taken into account while developing a controller for nonlinear systems.
2. Unlike [42], in this work, the event-based robust tracking controller is derived without using the \(H_{\infty }\) control strategy. In the \(H_{\infty }\) optimal control approach, the existence of a saddle point must be verified, which is a challenging task.
3. Rather than considering matched uncertainty as in [43, 44], unmatched uncertainty is considered in this work. Moreover, unlike [43], the need for an initial stabilizing control at the beginning of the critic weight tuning process is relaxed by modifying the tuning rule.
The remaining part of this work is organized in the following manner. In Sect. 2, the original tracking control problem is transformed into the optimal control problem of an auxiliary system. The event-based HJB equation is formulated, and the event-triggering rule is derived in Sect. 3. In Sect. 4, the HJB equation is solved via the ADP approach. In Sect. 5, the Lyapunov approach is used to show that all the signals associated with the closed-loop auxiliary system are uniformly ultimately bounded. In Sect. 6, two simulation examples are presented. Finally, a concluding remark is given in Sect. 7. Moreover, limitations and future scope of the proposed work are also mentioned in Sect. 7.
Notation: In this work, the maximum and minimum eigenvalues of a matrix are denoted by \(\lambda _{M}(\cdot )\) and \(\lambda _{m}(\cdot )\), respectively. The transpose operation is represented by the superscript \(\top \). \(\nabla (\cdot )\) denotes the gradient operation. \(I_n\) is the identity matrix of dimension \(n\times n\) and \(0_{n\times m}\) is the zero matrix of dimension \(n\times m\). \(\Omega \) is a compact subset of \(\mathbb {R}^{2n}\).
2 Problem transformation
Consider the continuous-time nonlinear uncertain system given in the form
where \( x(t)\in \mathbb {R}^n\) and \( u(t)\in \mathbb {R}^b\) are the state vector and control input, respectively. Let \(x(0)=x_0\) be the initial state. \(f(\cdot )\) and \(g(\cdot )\) are smooth functions of their arguments with \(f(0)=0\), and \(f+gu\) satisfies Lipschitz continuity. The unmatched uncertainty is \(\Delta f(x)=l(x)d(x)\), where \(l(x)\in \mathbb {R}^{n\times p}\), \(d(x)\in \mathbb {R}^p\), and if \(b=p\) then \(l(x)\ne g(x)\). Let d(x) be bounded by a known function \(\lambda _d(x)\), i.e., \(\left\Vert d(x)\right\Vert \le \lambda _d(x)\). Furthermore, \(\lambda _d(0)=0\) and \(d(0)=0\). In addition, there exists a nonnegative function \(g_{M}(x)\) satisfying
where \(g^+(x)\) is the pseudoinverse of g(x). Let the desired trajectory \(x_d(t)\in \mathbb {R}^n\) be generated from
where \(\Theta (x_d)\) satisfies Lipschitz continuity and \(\Theta (0)=0\). Let \(x_d(0)=x_{d0}\) be the initial condition.
The objective of this work is to derive an event-based robust controller for system (1) so that the system state x(t) follows the desired trajectory \(x_d(t)\). Define the tracking error as \({e}_{t}(t)=x(t)-x_d(t)\). From (1) and (2), the tracking error dynamics can be presented as
Now, based on the tracking error and the desired trajectory, an augmented state vector \(\xi (t)=[{e}_{t}^{\top }(t),x_d^{\top }(t)]^{\top } \in \mathbb {R}^{2n}\) is formed. Then, using (2) and (3), the augmented system dynamics is formulated as
where \(F:\mathbb {R}^{2n}\rightarrow \mathbb {R}^{2n}\) and \(G:\mathbb {R}^{2n}\rightarrow \mathbb {R}^{2n\times b}\) are new system matrices while \(\Delta F(\xi (t))\in \mathbb {R}^{2n}\) is the new uncertain term. They can be expressed as
and
The terms \(d(\xi )\) and \({G^+(\xi )\Delta F(\xi )} \) are still upper bounded, and the bounds can be derived as
and
respectively.
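To make the construction above concrete, the following minimal sketch assembles the augmented dynamics numerically, assuming the standard forms \(F(\xi )=[(f(e_t+x_d)-\Theta (x_d))^{\top },\Theta ^{\top }(x_d)]^{\top }\), \(G(\xi )=[g^{\top }(e_t+x_d),0]^{\top }\), and \(\Delta F(\xi )=[\Delta f^{\top }(e_t+x_d),0]^{\top }\). The plant functions below are illustrative placeholders, not the systems considered in this paper.

```python
import numpy as np

# Illustrative plant pieces (placeholders, not the paper's examples).
def f(x):          # drift dynamics f(x), with f(0) = 0
    return np.array([-x[0] + x[1], -0.5 * x[1]])

def g(x):          # input matrix g(x), here n = 2, b = 1
    return np.array([[0.0], [1.0]])

def Theta(xd):     # reference dynamics, with Theta(0) = 0
    return np.array([xd[1], -xd[0]])

def Delta_f(x):    # unmatched uncertainty l(x) d(x)
    return np.array([0.1 * np.sin(x[1]) * x[0], 0.0])

def augmented_dynamics(xi, u):
    """xi = [e_t; x_d] in R^{2n}; returns F(xi) + G(xi) u + Delta_F(xi)."""
    n = xi.size // 2
    e, xd = xi[:n], xi[n:]
    x = e + xd                       # recover the plant state
    F = np.concatenate([f(x) - Theta(xd), Theta(xd)])
    G = np.vstack([g(x), np.zeros((n, g(x).shape[1]))])
    DF = np.concatenate([Delta_f(x), np.zeros(n)])
    return F + G @ u + DF

xi = np.array([0.3, -0.2, 0.3, -0.3])
print(augmented_dynamics(xi, np.array([0.0])))
```

Note that the augmented drift vanishes at the origin, consistent with \(f(0)=0\), \(\Theta (0)=0\), and \(d(0)=0\).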
Next, the uncertain term \(L(\xi )d(\xi )\) is projected onto the range of the matrix \(G(\xi )\) and decomposed into the sum of a matched and an unmatched component, that is
Then the following auxiliary system is formed
where \(v(\xi )\in \mathbb {R}^p\) is an auxiliary control that handles the unmatched component.
3 Event-based robust tracking control strategy
In this section, the event-based HJB equation is developed for the auxiliary system (7). Moreover, the event-triggering rule is also obtained using Lyapunov approach. The cost function associated with the auxiliary system (7) is defined as
where \(\gamma \) and \(\beta \) are positive constants, \(U(\xi , u(\xi ),v(\xi )) =\xi ^{\top }\bar{Q}\xi +u^{\top }(\xi )Ru(\xi )+\beta ^2v^\top (\xi )Mv(\xi )\) and \(\bar{Q}=diag\{Q,0_{n\times n}\}\). Q, M, and R are positive definite matrices of appropriate dimensions. Let r and m be lower triangular matrices of appropriate dimensions. Then, using the Cholesky decomposition, one can write \(R=rr^{\top }\) and \(M=mm^{\top }\).
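The Cholesky factors of the weighting matrices can be computed directly; a small numerical check with illustrative \(R\) and \(M\) (numpy's routine returns exactly the lower-triangular factor used here):

```python
import numpy as np

# Positive definite weighting matrices (illustrative values).
R = np.array([[4.0, 1.0],
              [1.0, 3.0]])
M = np.array([[2.0, 0.5],
              [0.5, 1.0]])

# np.linalg.cholesky returns the lower-triangular factor,
# matching the decomposition R = r r^T used in the cost function.
r = np.linalg.cholesky(R)
m = np.linalg.cholesky(M)

assert np.allclose(r @ r.T, R)
assert np.allclose(np.triu(r, k=1), 0.0)   # r is lower triangular
print(r)
```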
Remark 1
The discount term \(\mathrm {e}^{-\gamma (\tau -t)}\) in the cost function (8) is employed to ensure that (8) is bounded. Otherwise, the control policy pair \( [u^{\top }(e_{t}(t),x_d(t)), v^{\top }(e_t(t),x_d(t))]^{\top }\) may cause (8) to become unbounded, since it depends on the reference trajectory \(x_d(t)\). In many practical systems, we need to consider a reference trajectory that does not converge to zero. In that situation, \(x_d(t)\) makes (8) unbounded [45, 46].
Let \(\Psi (\Omega )\) be the set of admissible controls on \(\Omega \). We assume that the optimal control policy pair is admissible. If the cost function \({J}(\xi )\) is continuously differentiable then one can write
with \(J(0)=0\). Here (9) is called the infinitesimal version of (8). The Hamiltonian for the auxiliary system (7) is given as
The optimal cost function is given by
By Bellman's principle of optimality, \(J^{*}(\xi (t))\) satisfies the HJB equation
with \(J^{*}(0)=0\). Define \((I-G(\xi )G^+(\xi ))L(\xi )=K(\xi )\). The optimal control policies are obtained as
and
Substituting (13) and (14) into (12), we present the HJB equation as
3.1 The event-based HJB equation formulation
Here, we present the HJB equation (15) in event-based form. Before proceeding, the event-based strategy is explained.
Let us consider a monotonically increasing sequence \(\left\{ t_k\right\} _{k=0}^{\infty }\), where the kth triggering instant is represented as \(t_k\) and \(k \in \mathbb {N}\). Let the system state be sampled at every triggering instant, and let \(\left\{ {\xi }_k\right\} _{k=0}^{\infty }\) be the sequence of sampled states, where \(\xi _k=\xi (t_k)\) is the sampled state at \(t_k\). The triggering error is defined as the difference between the current state \(\xi (t)\) and the sampled state \(\xi _k\) and is represented as
Based on (16), the event-based mechanism can be explained as follows. If a predefined triggering rule is not satisfied, the state is resampled so that the triggering error is reset to zero, i.e., \(e_{k}(t)=0\), and the control law is updated. While the triggering rule is fulfilled, the control law is held constant between two consecutive triggering instants. This principle is similar to the familiar zero-order hold (ZOH) principle, and it can be expressed as
From (16), the event-based control policy is obtained as
Now, using the control law (17), we obtain the sampled version of auxiliary system (7) as
The optimal control (13), under event-triggered mechanism, can be expressed as
Now, using (19), we formulate the HJB equation under event-based framework as
that is,
where \(J^{*}(0)=0\).
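To illustrate the event-based mechanism underlying the sampled HJB formulation above, the following toy simulation applies a zero-order-hold state-feedback law to a scalar plant and updates it only when a simple surrogate triggering rule is violated. The plant, gain, and relative threshold are illustrative stand-ins for the state-dependent rule (21).

```python
import numpy as np

# Event-triggered control with zero-order hold on the scalar system
# xdot = a*x + b*u, with u = -k * x_k held between triggering instants.
# The relative-error threshold is a surrogate for the paper's
# state-dependent triggering rule, for illustration only.
a, b, k = 1.0, 1.0, 3.0
dt, T = 0.001, 5.0
x, x_k = 1.0, 1.0                    # state and last sampled state
updates, steps = 0, int(T / dt)
for i in range(steps):
    e_trig = x - x_k                 # triggering error e_k(t)
    if abs(e_trig) > 0.05 * abs(x):  # rule violated -> sample and update
        x_k = x                      # triggering error reset to zero
        updates += 1
    u = -k * x_k                     # ZOH: constant between events
    x += dt * (a * x + b * u)        # forward-Euler integration
print(f"controller updates: {updates} out of {steps} time steps")
```

The controller updates far less often than the integration step count while still driving the state toward zero, which is the resource saving the event-based strategy targets.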
3.2 Event-triggering condition
In this subsection, we obtain the event-triggering condition using the Lyapunov approach. Before continuing, the following assumption, which is required to derive the triggering rule, is made. This assumption is satisfied in many applications when the controller is affine with respect to the event-triggering error signal [27, 47].
Assumption 1
Let \(\mathcal {L}\) be a positive constant. We consider that the optimal control policy \(u^{*}(\xi )\) fulfills the Lipschitz continuity on \(\Omega \) such that
Theorem 1
Let Assumption 1 be true, \(J^{*}(\xi )\) satisfies the HJB equation (12), the control policies are described by (14) and (19), and the event-triggering law is formulated as
then for \(\eta _1\in (0,1)\) and \(\gamma =0\), the closed-loop augmented system (4) is asymptotically stable under \(\mu ^*(\xi _k)\) and for \(\gamma \ne 0\) the tracking error \(e_{t}\) is uniformly ultimately bounded.
Proof
Consider \(J^{*}(\xi )\) as the Lyapunov function candidate. Differentiating \(J^{*}(\xi )\) along the trajectory of \(\dot{\xi }(t)=F(\xi (t))+G(\xi (t))\mu ^{*}({\xi }_k)+\Delta F(\xi (t))\), one can write
From (12), we obtain
from (13), we can write
and from (14), we obtain
Using (23), (24), and (25), we derive
Now,
and
Since, \(\bar{Q}=diag\{Q,0_{n\times n}\}\), one can write \(\xi ^{\top }\bar{Q}\xi =e_{t}^\top Q e_{t}\). Now, using (27), (28), and Assumption 1 we derive
Hence, when the triggering rule stated in Theorem 1 is satisfied and \(\gamma =0\), using (29) we can write
Thus, the system is asymptotically stable for \(\gamma =0\). When \(\gamma \ne 0\), then
Since \(J^{*}(\xi )\) is positive definite and bounded on \(\Omega \), let \(J^{*}_{max}\) be the maximum value of \(J^{*}(\xi )\). So, from (31), \(\dot{J}^{*}(\xi )\le 0\) only if \(e_{t}\) lies outside the set
Thus we conclude that for \(\gamma \ne 0\), the tracking error \(e_t(t)\) is uniformly ultimately bounded and the ultimate bound is \(\frac{1}{\eta _1}\sqrt{\frac{\gamma J^{*}_{max}}{\lambda _{m}(Q)}}\). \(\square \)
Remark 2
In this work, the control policy \(\mu ^{*}({\xi }_k)\) is formulated under the event-triggered framework, but the augmented control policy \(v^{*}({\xi })\) is formulated under the time-triggered framework. There are two reasons behind this. First, the control policy to be used in the uncertain system is \(\mu ^{*}({\xi }_k)\) not the augmented control \(v^{*}({\xi })\). Second, if we also consider the augmented control in the event-triggering framework, then it becomes very difficult to obtain the event-triggering rule (21).
Remark 3
The lower bound of the minimum event interval \(\Delta t_{\min }\) can be expressed as
where
and \(e_k(t_{k+1})=\xi _k-\xi (t_{k+1})\); \(\mathcal {P}\) and \(\pi \) are positive constants satisfying \(F(\xi )+G(\xi )u(\xi )+\Delta F(\xi )\le \mathcal {P}\Vert \xi \Vert +\pi \). Note that the positive constants \(\mathcal {P}\) and \(\pi \) exist because \(F(\xi )+G(\xi )u\) is Lipschitz continuous and the terms \(d(\xi )\) and \({G^+(\xi )\Delta F(\xi )} \) are upper bounded. The theoretical proof is similar to [32] and is omitted here to avoid repetition. The simulation results show that the inter-sample time indeed has a lower limit larger than zero. As a result, the infamous Zeno behavior is avoided.
4 ACDs for solving event-based HJB equation
In this section, a single critic network is employed to approximate the optimal value of the cost function under the ADP framework. The optimal cost function can be reconstructed on \(\Omega \), utilizing the neural network’s universal approximation property and \(l\) hidden-layer neurons, as
where \(\omega _c\in \mathbb {R}^l\) is the actual weight vector of critic network, \(\sigma _c(\xi )\in \mathbb {R}^l\) is the activation function, and \(\epsilon _c(\xi )\) is the reconstruction error. Next, we obtain the gradient of (33) as
Due to the unavailability of the actual weight vector \(\omega _c\), the approximate weight vector \(\hat{\omega }_c\) is used to form a critic network to estimate the value of the optimal cost function \({J}^{*}(\xi )\) as follows
Then the gradient of (35) is
Considering (34) we present the augmented control law (14) and the event-based control law (19) as
and
respectively. Then by using (36), the approximate value of \({v}^{*}({\xi })\) and \({\mu }^{*}({\xi _k})\) can be obtained as
and
respectively. Substituting \(J^{*}(\xi )\) from (33) into (10), we obtain
where \(e_{cH}=-(\nabla \epsilon _c(\xi ))^\top (F(\xi )+G(\xi )\mu ^{*}({\xi }_k)+K(\xi )v^{*}(\xi ))+\gamma \epsilon _c(\xi )\) is the residual error because of the reconstruction error associated with the NN approximation. Now the Hamiltonian (10) is approximated as
From the HJB equation it is evident that \(H(\xi ,\omega _c,\mu ^{*} (\xi _k),v^{*}(\xi ))=0\). So, the approximation error of the Hamiltonian is given by
where \(\phi =\nabla \sigma _c(\xi )(F(\xi )+G(\xi )\hat{\mu }(\xi _k)+K(\xi )\hat{v}(\xi ))-\gamma \sigma _c(\xi )\).
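As a sketch of how the critic reconstruction above yields the control signals, the snippet below evaluates a quadratic critic basis and forms \(\hat{\mu }\) and \(\hat{v}\) from its gradient, assuming the standard ADP forms \(\hat{\mu }(\xi _k)=-\frac{1}{2}R^{-1}G^{\top }(\xi _k)(\nabla \sigma _c(\xi _k))^{\top }\hat{\omega }_c\) and \(\hat{v}(\xi )=-\frac{1}{2\beta ^2}M^{-1}K^{\top }(\xi )(\nabla \sigma _c(\xi ))^{\top }\hat{\omega }_c\); all numerical values (basis, channels, weights) are illustrative.

```python
import numpy as np

# Critic evaluation J_hat(xi) = w_hat^T sigma(xi) with a quadratic basis,
# and the controls reconstructed from the critic gradient (standard ADP
# forms assumed; G, K, R, M, beta, and w_hat are illustrative).
def sigma(xi):                       # quadratic monomials of xi in R^2
    x1, x2 = xi
    return np.array([x1**2, x1 * x2, x2**2])

def grad_sigma(xi):                  # Jacobian d sigma / d xi, shape (3, 2)
    x1, x2 = xi
    return np.array([[2 * x1, 0.0],
                     [x2,     x1],
                     [0.0,    2 * x2]])

R_inv = np.array([[1.0]])
M_inv = np.array([[1.0]])
beta = 0.85
G = np.array([[0.0], [1.0]])         # matched (input) channel
K = np.array([[1.0], [0.0]])         # unmatched channel
w_hat = np.array([0.5, -0.2, 0.8])   # critic weights (illustrative)

def mu_hat(xi_k):
    return -0.5 * R_inv @ G.T @ grad_sigma(xi_k).T @ w_hat

def v_hat(xi):
    return -(1.0 / (2 * beta**2)) * M_inv @ K.T @ grad_sigma(xi).T @ w_hat

xi = np.array([0.4, -0.1])
print(mu_hat(xi), v_hat(xi))
```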
Now, to ensure that \(e_c\) given in (43) is sufficiently small, we need to train the critic network to obtain appropriate weights. To this end, the objective function \(E=(1/2)e_c^\top e_c\) is minimized using the steepest descent technique. Based on this approach, the tuning rule is given as
where \(l_c>0\) is a design parameter, also known as the critic network’s learning rate, and \(1/{(1+\phi ^\top \phi )^2}\) is introduced to normalize \(\phi \). However, the tuning rule (44) has the following drawbacks:
1. An initial stabilizing control is needed at the beginning of the critic weight learning process when using the tuning rule provided in (44). However, in some practical applications, determining an initial admissible control can be difficult.
2. The term \(\phi /(1+\phi ^{\top }\phi )^2\) in (44) should remain persistently exciting to guarantee the convergence of the critic weights to valid optimal values. To meet the persistency of excitation (PE) condition, usually a probing noise is applied to the control input during the initial period of the critic weight tuning process. However, the probing noise can cause the system to become unstable.
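The second drawback is typically mitigated by using a probing signal whose amplitude decays over time, so that any destabilizing effect is transient. A minimal sketch of such a signal follows; the frequencies and decay rate are illustrative choices, not values prescribed by the paper.

```python
import numpy as np

# Exponentially decaying probing noise, as commonly added to the control
# input during early training to satisfy the PE condition.
def probing_noise(t, decay=0.2):
    envelope = np.exp(-decay * t)                  # decaying amplitude
    rich = (np.sin(2 * t) ** 2 * np.cos(1.7 * t)   # mix of frequencies
            + np.sin(-1.2 * t) * np.cos(0.5 * t))
    return envelope * rich

t = np.linspace(0.0, 30.0, 3001)
noise = probing_noise(t)
print(f"max |noise| over first 10 s: {np.abs(noise[t <= 10]).max():.3f}")
print(f"max |noise| after 20 s:      {np.abs(noise[t >= 20]).max():.3f}")
```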
To overcome the above drawbacks, we modify the tuning rule (44) via the Lyapunov approach. Before continuing, the following assumption, which is similar to [20], is presented.
Assumption 2
Let \(V(\xi )\) be a continuously differentiable Lyapunov function candidate for the system (7) under the action of the control policies given by (37) and (38), satisfying
Moreover, there exists a symmetric positive definite matrix \(\Lambda \in \mathbb {R}^{2n}\) defined on \(\Omega \) ensuring
Remark 4
\(F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi )\) is frequently considered to be bounded by a positive constant on a compact set \(\Omega \) [48]. In other words, there exists a constant \(z_1\) such that \(\Vert F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi )\Vert \le z_1\). Here we assume that \(F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi )\) is bounded by a function of \(\xi \), which is less stringent than the constant-upper-bound assumption. Without loss of generality, we consider that \(\Vert (F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi ))\Vert \le z_2 \Vert \nabla V(\xi )\Vert \), where \(z_2\) is a positive constant. Consequently, we can write \(\Vert (\nabla V(\xi ))^{\top }(F(\xi )+G(\xi ){\mu }^{*}({\xi _k})+K(\xi ){v}^{*}(\xi ))\Vert \le z_2 \Vert \nabla V(\xi )\Vert ^2\). Observing (45), one can see that (46) is reasonable. In the simulation, a polynomial in \(\xi \) is chosen as \(V(\xi )\).
When the approximated control policies (39) and (40) are applied to the auxiliary system (7), to avoid instability we need to exclude the possibility
To avoid (47), the training process is enhanced by introducing an additional term which is obtained using the steepest descent method as given below
where \(l_s>0\) is a design parameter. Now, the modified critic weights tuning rule is obtained by adding the stabilizing term (48) to the traditional tuning rule (44) as
Remark 5
The new tuning rule (49) eliminates the need for an initial admissible control. Hence, we can initialize the critic weight vector to zero when learning the appropriate critic weights. Moreover, the risk of instability due to the addition of probing noise to fulfill the PE condition is also eliminated.
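The effect described in this remark can be illustrated on synthetic data: with the normalized-gradient part of the update and zero-initialized weights, the estimate still converges. The regressors, targets, and the inactive stabilizing hook below are illustrative stand-ins for the quantities in (43) and (48), not the paper's exact expressions.

```python
import numpy as np

# Toy demo of the normalized gradient part of the critic tuning rule:
#   w <- w - l_c * phi / (1 + phi^T phi)^2 * e_c,
# with the weights initialized to zero (no initial admissible control).
# The stabilizing term of (48) is represented by a hook that is
# inactive in this synthetic setting.
rng = np.random.default_rng(0)
w_star = np.array([1.0, -2.0, 0.5])      # "ideal" weights (illustrative)
w_hat = np.zeros(3)                      # zero initialization
l_c = 1.5

def stabilizing_term(w):
    return np.zeros_like(w)              # placeholder for the term in (48)

for _ in range(5000):
    phi = rng.standard_normal(3)         # persistently exciting regressor
    e_c = phi @ (w_hat - w_star)         # Hamiltonian approximation error
    w_hat -= l_c * phi / (1 + phi @ phi) ** 2 * e_c
    w_hat += stabilizing_term(w_hat)

print(np.round(w_hat, 3))
```

The normalization by \((1+\phi ^{\top }\phi )^2\) keeps the step size bounded regardless of the regressor magnitude, which is what allows the zero initialization to be safe in this sketch.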
The critic weights approximation error \(\tilde{\omega }_c\) is defined as the difference between the ideal and the approximate weight vector, i.e., \( \tilde{\omega }_c=\omega _c-\hat{\omega }_c \). From (41) and (43) we obtain
Then, using (49) and (50), the critic weights approximation error dynamics is presented as
Under the event-based control law, the closed-loop system functions as an impulsive dynamical system consisting of flow dynamics and jump dynamics. Let us consider an augmented state vector \(\psi =[\xi ^\top ,{\xi }_k^\top ,\tilde{\omega }_c^\top ]^\top \). Then, the flow dynamics of the closed-loop system, which occur for all \(t\in [t_k,t_{k+1})\), can be presented as
and the jump dynamics of the closed-loop system, which occur at \(t=t_{k+1}\), can be presented as
where \(\psi (t^+)=\lim _{\varsigma \rightarrow 0^+}\psi (t+\varsigma )\) and \(\varsigma \in (0,t_{k+1}-t_k)\).
5 Stability analysis
In this section, the stability of the impulsive dynamical representation of the closed-loop system, given by (52) and (53), is studied. Prior to moving forward, some assumptions that are common in the literature are stated below [32].
Assumption 3
The augmented system dynamics \(G(\xi )\) and \(K(\xi )\) satisfy the following conditions, where A, \(G_{M}\), and \(K_M\) are positive constants.
1. The dynamics \(G(\xi )\) satisfies Lipschitz continuity such that \(\Vert G(\xi )-G({\xi }_k)\Vert \le A\Vert e_{k}(t)\Vert \).
2. The dynamics \(G(\xi )\) and \(K(\xi )\) are upper bounded by \(G_M\) and \(K_M\), respectively.
Assumption 4
The following conditions hold on \(\Omega \), where B, \(\nabla \sigma _{cM}\), \(\nabla \epsilon _{cM}\), \(\omega _{cM}\), and \(e_{cHM}\) are positive constants.
1. The gradient of the activation function satisfies Lipschitz continuity such that \(\Vert \nabla \sigma _c(\xi )-\nabla \sigma _c({\xi }_k)\Vert \le B\Vert e_{k}(t)\Vert \).
2. The gradient of the activation function \(\nabla \sigma _c(\xi )\) and the gradient of the neural approximation error \(\nabla \epsilon _c(\xi )\) are upper bounded by \( \nabla \sigma _{cM}\) and \(\nabla \epsilon _{cM}\), respectively.
3. The ideal weight vector \(\omega _c\) and the residual error \(e_{cH}\) are upper bounded by \(\omega _{cM}\) and \(e_{cHM}\), respectively.
Theorem 2
Let Assumptions 1 to 4 be true. Then, under the control policies (39) and (40), the closed-loop auxiliary system (7) is asymptotically stable and the critic weights approximation error is uniformly ultimately bounded if the inequalities (54) and (55) hold, where \(\eta _2\in (0,1)\) is a design parameter, and the values of \(\varGamma _1\) and \(\varGamma _2\) are given by (63) and (69), respectively.
Proof
In light of the flow dynamics (52) and the jump dynamics (53) we consider the Lyapunov function candidate as
where \(\Upsilon _1(t)=J^*(\xi )\), \(\Upsilon _2(t)=J^*({\xi }_k)\), \(\Upsilon _3(t)=\frac{1}{2}\tilde{\omega }_c^\top \tilde{\omega }_c\) and \(\Upsilon _4(t)=l_sV(\xi )\). Now, the analysis is separated into two cases.
Case 1. Events are not triggered, i.e., \(t\in [t_k,t_{k+1})\). Differentiating (56), one can write
It is evident that, for \(t\in [t_k,t_{k+1})\), \(\dot{\Upsilon }_2(t)=0\). Now, differentiating \(\Upsilon _1(t)\) along the trajectory of \(\dot{\xi }(t)=F(\xi )+G(\xi )\hat{\mu }(\xi _k)+K(\xi )\hat{v}(\xi )\), we obtain
Now, using Eqs. (23) and (24), we derive
We can write
and
Based on the above three inequalities, (58) can be expressed as
where \(\varGamma _1\) is a positive constant and it is expressed as
We have \(\omega _c^\top \phi =\phi ^\top \omega _c\). Let \(\varphi =\phi /({1+\phi ^\top \phi })\). Now, using (51), the time derivative of \(\Upsilon _3(t)\) is found as
Let \(\lambda _{M}(\varphi \varphi ^\top )=\lambda _{\varphi M}\) and \(\lambda _{m}(\varphi \varphi ^\top )=\lambda _{\varphi m}\). Then, considering Young’s inequality \(2c^\top d \le c^\top c+d^\top d\) and Assumption 4, (64) can be expressed as
Now, substituting \(\tilde{\omega }_c=\omega _c-\hat{\omega }_c\) into the last two terms of (65) and considering the control policies (39) and (40), one can write
The derivative of \({\Upsilon }_4(t)\) is
Now, combining (66) and (67) and using the control policies (37) and (38), one can write
Now, utilizing Assumptions 1 to 4, we can write
where \(\kappa =\frac{1}{2}\nabla \epsilon _{cM}(G_M^2\Vert R^{-1}\Vert +\frac{1}{\beta ^2}K_M^2\Vert M^{-1}\Vert )\) and the positive constant \(\varGamma _2\) is given by
Substituting (62) and (68) into (57), we obtain
Since \(\bar{Q}=diag\{Q,0_{n\times n}\}\), one can write \(\xi ^\top (t)\bar{Q} \xi (t)=e_{t}^\top (t)Qe_{t}(t)\). Now, introducing the design parameter \(\eta _2\), (70) can be presented as
If the inequalities (54) and (55) mentioned in Theorem 2 hold then (71) implies
i.e., the proposed Lyapunov function candidate has negative time derivative for all \(t\in [t_k,t_{k+1})\).
Case 2. Events are triggered, i.e., \(t=t_{k+1}\). We derive the difference of the Lyapunov function candidate as
where \(\xi (t^+_k)=\lim _{\varsigma \rightarrow 0^+}\xi (t_k+\varsigma )\) and \(\varsigma \in (0,t_{k+1}-t_k)\). In Case 1 we derived that \(\dot{\Upsilon }(t)<0\) for all \(t\in [t_k,t_{k+1})\), so
Thus, one can write
Hence, we can further express that
where \(\vartheta \) is a class \(\mathcal {K}\) function and \(e_{k+1}(t_k)={\xi }_{k+1}-{\xi }_{k}\). The inequalities (74) and (75) imply that the proposed Lyapunov function candidate is monotonically decreasing at every triggering instant \(t_{k+1}\).
Thus from the two cases presented above, we conclude that the closed-loop system is asymptotically stable and the critic weights approximation error is uniformly ultimately bounded. \(\square \)
A flowchart is given in Fig. 1 to explain the fundamental methodology of the proposed work, which comprises the learning and implementation phases. In the learning phase, the converged critic weights are obtained after sufficient iterations while using the event-triggering rule (54). Then the converged weights are passed to the implementation phase to obtain the approximate values of the optimal control policies \( \mu ^{*}(\xi _k)\) and \(v^{*}(\xi )\) as \( \hat{\mu }(\xi _k)\) and \(\hat{v}(\xi )\), respectively. The approximated event-based control policy \( \hat{\mu }(\xi _k)\) is applied to the uncertain nonlinear system while using the event triggering rule (21) to track the desired trajectory.
Remark 6
The values of the design parameters \(\eta _1\) and \(\eta _2\) are chosen such that the terms \(\Vert e_T\Vert ^2\) and \(\Vert \hat{e}_T\Vert ^2\) remain positive, respectively. Increasing \(\eta _1\) and \(\eta _2\) increases the sampling frequency and the number of event-triggering instants, and it improves the tracking performance. However, the values must be selected so that there is a trade-off between the number of triggering instants and the tracking performance. Similar to the relevant literature [38], the other parameters are chosen heuristically such that the convergence time of the critic weights and the number of triggering instants are minimized while maintaining acceptable tracking performance.
6 Simulation illustration
In this section, two simulation examples are presented to exhibit the efficacy of the proposed event-based robust trajectory tracking scheme. In Example 1, we have considered a linear system with unmatched uncertainty, and in Example 2, the spring-mass-damper system with nonlinear spring constant and unmatched uncertainty is considered.
6.1 Example 1
Consider the following linear unmatched uncertain system [21]
where \(x=[x_1,x_2]^\top \in \mathbb {R}^2\) is the state vector, \(u\in \mathbb {R}\) denotes the control input, \( \Delta f(x)=l(x)d(x)\) and \(l(x)=[1,0]^\top \). The perturbation is \(d(x)=0.5 \theta _1 x_1 \sin (x_2+\theta _2)\), where the parameters \(\theta _1\) and \(\theta _2\) are unknown. We consider \(\theta _1\in [-1,1]\), \(\theta _2\in [-5,5]\), and the upper bound of the perturbation d(x) is \(\lambda _d(x)=\vert x_1\vert \). Let \(x_0=[0.6,-0.5]^\top \) be the initial state. The desired trajectory \(x_d(t)\) is generated from
where \(x_d=[x_{d1},x_{d2}]^{\top }\in \mathbb {R}^2\) with the initial condition \(x_{d0}=[0.3,-0.3]^\top \). The tracking error \(e_t\) is defined as \(e_t=x-x_d\), where \(e_t=[e_{t1},e_{t2}]^{\top }\in \mathbb {R}^2\), with the initial condition \(e_{t0}=x_0-x_{d0}\). Then an augmented state vector \(\xi =[\xi _1,\xi _2,\xi _3,\xi _4]^{\top }\in \mathbb {R}^4\) is defined and the following augmented system is formed
where \(L(\xi )=[1,0,0,0]^\top \) and \(d(\xi )=0.5\theta _1 (\xi _1+\xi _3) \sin ((\xi _2+\xi _4)+\theta _2)\). The initial condition is \(\xi _0=[e_{t0}^{\top },x_{d0}^{\top }]^{\top }=[0.3,-0.2,0.3,-0.3]^\top \). The upper bound of \(d(\xi )\) is derived as \(\lambda _d(\xi )=\vert \xi _1+\xi _3\vert \). We obtain \(G^+(\xi )=[0,1,0,0]\), so \((I-G(\xi )G^+(\xi ))L(\xi )=[1, 0, 0, 0]^{\top }\). As in (7), the auxiliary system is formulated as
Since \(\Vert G^+(\xi )L(\xi )d(\xi )\Vert =0\), we take \(g_M(\xi )=0\). Let \(R=I_1\), \(M=I_1\), and \(Q=500I_2\). For the simulation, we set \(\gamma =0.5\) and \(\beta =0.85\).
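The matched/unmatched decomposition above can be checked numerically. The following sketch assumes \(G(\xi )=[0,1,0,0]^\top \), which is consistent with the reported \(G^+(\xi )=[0,1,0,0]\); it verifies the pseudoinverse and the unmatched component \((I-GG^+)L\).

```python
import numpy as np

# Hedged sketch: verifying the decomposition reported for Example 1.
# G below is an assumption consistent with the stated pseudoinverse
# G+ = [0, 1, 0, 0]; L is the uncertainty direction from the text.
G = np.array([[0.0], [1.0], [0.0], [0.0]])   # input vector G(xi), assumed
L = np.array([[1.0], [0.0], [0.0], [0.0]])   # uncertainty direction L(xi)

G_plus = np.linalg.pinv(G)                   # Moore-Penrose pseudoinverse, 1x4
unmatched = (np.eye(4) - G @ G_plus) @ L     # (I - G G+) L, the unmatched part

print(G_plus.ravel())      # -> [0. 1. 0. 0.]
print(unmatched.ravel())   # -> [1. 0. 0. 0.]
```

Since \(G^+L=0\), the uncertainty is entirely unmatched here, which is why the auxiliary-system construction is needed at all.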
Our aim is to develop an event-based robust controller for the system (76) to track the reference trajectory generated by (77). As described in the theoretical analysis, to achieve this design criteria, the augmented system (78) is formed and then the original control problem is transformed to designing an event-based optimal controller for auxiliary system (79). Based on (8), the cost function for (79) can be presented as
The critic network (35) is employed to find an approximate solution of the event-based optimal control problem. We consider \(l=10\) hidden-layer neurons, and the weight vector of the critic network is represented as \( \hat{\omega }_c=[\hat{\omega }_{c1},\ldots ,\hat{\omega }_{c10}]^\top \). The activation function for the critic network is selected as \(\sigma _c(\xi )=[\xi _1^2,\xi _2^2,\xi _3^2,\xi _4^2,\xi _1\xi _2,\xi _1\xi _3,\xi _1\xi _4,\xi _2\xi _3,\xi _2\xi _4, \xi _3\xi _4]^\top \). The weights are trained using the tuning rule (49), and the triggering condition (54) is used during the training process. The parameters used during the tuning process are \(l_c=3\), \(l_s=0.1\), \((A^2\nabla \sigma ^2_{cM}+B^2G^2_M)=8\) and \(\eta _2=0.7\).
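The quadratic activation vector and its Jacobian (needed to form the approximate value-function gradient \(\nabla \hat{V}=\nabla \sigma _c^\top \hat{\omega }_c\)) can be written out explicitly. This is an illustrative sketch of those two maps, not the authors' implementation:

```python
import numpy as np

def sigma_c(xi):
    """Activation vector sigma_c(xi): all quadratic monomials of the
    4-dimensional augmented state, in the order used for Example 1."""
    x1, x2, x3, x4 = xi
    return np.array([x1**2, x2**2, x3**2, x4**2,
                     x1*x2, x1*x3, x1*x4, x2*x3, x2*x4, x3*x4])

def grad_sigma_c(xi):
    """Jacobian d(sigma_c)/d(xi), a 10x4 matrix; row i is the gradient of
    the i-th monomial with respect to xi."""
    x1, x2, x3, x4 = xi
    return np.array([
        [2*x1, 0, 0, 0], [0, 2*x2, 0, 0], [0, 0, 2*x3, 0], [0, 0, 0, 2*x4],
        [x2, x1, 0, 0], [x3, 0, x1, 0], [x4, 0, 0, x1],
        [0, x3, x2, 0], [0, x4, 0, x2], [0, 0, x4, x3]])
```

With a weight vector `w_c` of length 10, the approximate value-function gradient at a state `xi` is then `grad_sigma_c(xi).T @ w_c`.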
To satisfy the PE condition, a small exponentially decreasing probing noise is applied to the control input for the first 10 s of the training process. All elements of the weight vector are initialized to zero. As shown in Fig. 2, the critic weight vector converges to \(\hat{\omega }_c=[0.46, 3.64, 0.01, -0.69, -0.85, -0.32, -0.02, 0.71, -3.89, -1.60]^{\top }\). During the training process, the event-based controller updates 5714 times, whereas, under the same design criteria, the time-based controller updates 18,000 times.
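A probing signal of this kind can be sketched as follows; the amplitude, decay rate, and frequency content here are illustrative assumptions, since the paper does not report them.

```python
import numpy as np

def probing_noise(t, amp=0.5, decay=0.3):
    """Exponentially decaying multi-sine probing signal for the PE condition,
    switched off after the first 10 s of training. All parameter values are
    illustrative, not those used by the authors."""
    if t > 10.0:
        return 0.0
    envelope = amp * np.exp(-decay * t)
    return envelope * (np.sin(2*t) + np.sin(5*t) + np.cos(9*t))
```

The mix of incommensurate frequencies keeps the regressor \(\sigma _c(\xi )\) persistently exciting early in training, while the decaying envelope lets the closed loop settle once the weights converge.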
Then, we use the converged weights to obtain the approximate control policies (39) and (40). Next, we select \(\theta _1=-0.3\) and \(\theta _2=5\) to demonstrate the trajectory tracking ability of the designed control policy \( \hat{\mu }(\xi _k)\) and the triggering rule described in (21). We consider \(\eta _1=0.65\) and \(\mathcal {L}=2.5\). The sampling period is taken as 0.01 s. The tracking performance of the designed controller is displayed in Figs. 3 and 4. The obtained event-based control policy \( \hat{\mu }(\xi _k)\) is shown in Fig. 5.
The sampling frequency \(\eta _1\) is chosen in the interval \((0,1)\). Table 1 illustrates the relationship between \(\eta _1\) and the number of triggering instants \(N_s\): as \(\eta _1\) increases, the number of event-triggering instants \(N_s\) increases.
The evolution of the triggering condition with \(\Vert e_T\Vert ^2\) and \(\Vert e_k\Vert ^2\) is displayed in Fig. 6, and the sampling period is shown in Fig. 7. The minimal intersample time is found to be 0.01 s, which means the infamous Zeno behavior is excluded. Furthermore, Fig. 7 also shows that only 435 state samples are used during the tracking process, so the controller is updated only 435 times. In contrast, the time-triggering method requires 1600 samples under the same conditions. Hence, the developed event-based tracking control strategy significantly reduces the resources used.
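The update-count comparison reflects the generic event-triggered loop below. This is a hedged sketch: `threshold` stands in for the state-dependent bound \(\Vert e_T\Vert ^2\) of rule (21), and the plant, controller, and threshold used in the usage example are placeholders, not the paper's.

```python
import numpy as np

def run_event_triggered(f, mu, threshold, xi0, dt=0.01, T=16.0):
    """Generic event-triggered closed loop (forward-Euler sketch).

    f(xi, u): plant dynamics; mu(xi): controller; threshold(xi): stand-in
    for the state-dependent triggering bound ||e_T||^2 of rule (21).
    Returns the number of control updates (event-triggering instants)."""
    xi, xi_k = xi0.copy(), xi0.copy()
    u, n_updates = mu(xi_k), 1
    for _ in range(int(T / dt)):
        gap = xi_k - xi                    # event error e_k
        if gap @ gap > threshold(xi):      # triggering condition violated
            xi_k = xi.copy()               # sample the state
            u = mu(xi_k)                   # recompute the control
            n_updates += 1
        xi = xi + dt * f(xi, u)            # integrate the plant one step
    return n_updates

# Placeholder closed loop: a stable linear plant with a constant threshold.
n = run_event_triggered(f=lambda xi, u: -xi + u,
                        mu=lambda xi: -0.5 * xi,
                        threshold=lambda xi: 1e-3,
                        xi0=np.array([0.6, -0.5]))
print(n)   # far fewer updates than the 1600 time-triggered steps
```

Between events the control input is held constant, which is exactly why the number of updates can fall well below the number of integration steps.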
Next, to show that the derived controller is robust, we take \(\theta _1=0.4\) and \(\theta _2=-1\). The tracking performance for the new values of \(\theta _1\) and \(\theta _2\) is shown in Figs. 8 and 9. In this scenario, the event-based controller updates only 468 times, whereas the conventional time-triggered controller updates 1600 times under the same design criteria.
6.2 Example 2
Consider the spring-mass-damper system [36]
where \(x=[x_1,x_2]^\top \), \(x_1\) is the position and \(x_2\) is the velocity, m represents the mass of the object, k denotes the spring constant, and c is the damping coefficient. Let \(m=1\,\text {kg}\), \(c=0.5\,\text {N}\cdot \text {s/m}\), and let the spring be nonlinear with \(k(x)=-5 x^3\,\text {N/m}\). After adding an unmatched uncertainty \( \Delta f(x)\), the system dynamics becomes
where \( \Delta f(x)=l(x)d(x)\) and \(l(x)=[1,0]^\top \). The perturbation is \(d(x)=0.5 \theta _1 x_1 x_2 \sin (x_1) \cos (x_2+\theta _2)\), where the parameters \(\theta _1\) and \(\theta _2\) are unknown. We consider \(\theta _1\in [-1,1]\), \(\theta _2\in [-5,5]\), and the upper bound of the perturbation d(x) is \(\lambda _d(x)=\vert x_2\vert \). Let \(x_0=[0.5,0.2]^\top \) be the initial state. The desired trajectory \(x_d(t)\) is generated from
where \(x_d=[x_{d1},x_{d2}]^{\top }\in \mathbb {R}^2\) with the initial condition \(x_{d0}=[0.2,-0.2]^\top \). Then an augmented state vector \(\xi =[\xi _1,\xi _2,\xi _3,\xi _4]^{\top }\in \mathbb {R}^4\) is defined and the following augmented system is formed
where \(L(\xi )=[1,0,0,0]^\top \) and \(d(\xi )=0.5\theta _1 (\xi _1+\xi _3) (\xi _2+\xi _4) \sin (\xi _1+\xi _3) \cos ((\xi _2+\xi _4)+\theta _2)\). The initial condition is \(\xi _0=[0.3,0.4,0.2,-0.2]^\top \). The upper bound is derived as \(\lambda _d(\xi )=\vert \xi _2+\xi _4\vert \). The auxiliary system is formulated as
Since \(\Vert G^+(\xi )L(\xi )d(\xi )\Vert =0\), we take \(g_M(\xi )=0\). Let \(R=I_1\), \(M=I_1\), and \(Q=300I_2\). For the simulation, we set \(\gamma =1.2\) and \(\beta =0.9\).
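As a concreteness check, the spring-mass-damper dynamics of Example 2 can be sketched in code. The sign convention for the nonlinear spring term below is an assumption of the common form \(\dot{x}_1=x_2\), \(m\dot{x}_2=-k(x_1)x_1-cx_2+u\); the exact arrangement follows the paper's display equation, which is not reproduced here.

```python
import numpy as np

# Hedged sketch of the Example 2 dynamics, assuming the common form
#   x1' = x2,   m*x2' = -k(x1)*x1 - c*x2 + u,   k(x1) = -5*x1**3,
# with m = 1 kg and c = 0.5 N s/m as reported. The sign convention of the
# nonlinear spring term is an assumption.
m, c = 1.0, 0.5

def smd(x, u):
    x1, x2 = x
    k = -5.0 * x1**3                          # nonlinear spring "constant"
    return np.array([x2, (-k * x1 - c * x2 + u) / m])

# One forward-Euler step of length 0.01 s from the reported x0 = [0.5, 0.2]
x0 = np.array([0.5, 0.2])
x_next = x0 + 0.01 * smd(x0, 0.0)
```

Because \(k(x_1)=-5x_1^3\) is negative for positive displacement under this convention, the spring term is destabilizing, which makes the tracking task nontrivial even before the unmatched uncertainty is added.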
Based on (8), the cost function for (85) can be presented as
We consider \(l=10\) hidden-layer neurons, and the activation function is chosen as \(\sigma _c(\xi )=[\xi _1^2,\xi _1\xi _2,\xi _1\xi _3,\xi _1\xi _4,\xi _2^2,\xi _2\xi _3,\xi _2\xi _4,\xi _3^2,\xi _3\xi _4, \xi _4^2]^\top \). The parameters used during the tuning process are \(l_c=4\), \(l_s=0.5\), \((A^2\nabla \sigma ^2_{cM}+B^2G^2_M)=8\) and \(\eta _2=0.7\).
To satisfy the PE condition, a small exponentially decreasing probing noise is applied to the control input for the first 10 s of the training process. All elements of the weight vector are initialized to zero. The critic weight vector \(\hat{\omega }_c\) converges to \([3.53,19,11.46,2.49,1.93,0.23,0.01,10.07,-0.73, 2.05]^{\top }\), as shown in Fig. 10. During the training process, the event-based controller updates 8947 times, whereas, under the same design criteria, the time-based controller updates 16,000 times.
Then, we use the converged weights to obtain the control policies (39) and (40). Next, we select \(\theta _1=-0.9\) and \(\theta _2=-0.3\) to check the trajectory tracking performance of the designed control policy \( \hat{\mu }(\xi _k)\) and the triggering rule described in (21). We consider \(\eta _1=0.7\) and \(\mathcal {L}=10\). The sampling period is taken as 0.01 s. The performance of the designed tracking controller is displayed in Figs. 11 and 12. The obtained event-based control policy \( \hat{\mu }(\xi _k)\) is displayed in Fig. 13.
The evolution of the triggering condition with \(\Vert e_T\Vert ^2\) and \(\Vert e_k\Vert ^2\) is displayed in Fig. 14, and the sampling period is shown in Fig. 15. The minimal intersample time is found to be 0.01 s, which means the infamous Zeno behavior is excluded. Furthermore, Fig. 15 also shows that only 1452 state samples are used during the tracking process, so the controller is updated only 1452 times. In contrast, the time-triggering method requires 8000 samples under the same conditions. Hence, the developed event-based tracking control strategy significantly reduces the resources used.
Next, to show that the derived controller is robust, we take \(\theta _1=0.8\) and \(\theta _2=4\). The tracking performance for the new values of \(\theta _1\) and \(\theta _2\) is shown in Figs. 16 and 17. Here, only 1438 state samples are used during the tracking process; in other words, the event-based controller updates only 1438 times, whereas the conventional time-triggered controller updates 8000 times under the same design criteria.
7 Conclusion
In this work, an event-based robust tracking strategy for unmatched uncertain systems is developed. By forming an auxiliary system and decomposing the unmatched uncertainty, the original control problem is transformed into obtaining an optimal controller for the auxiliary system. The associated event-based HJB equation is solved via the ADP approach. The critic weight tuning law is modified to avoid the need for an initial stabilizing control at the beginning of the tuning process. Meanwhile, a novel event-triggering law is developed, and the uniform ultimate boundedness of the tracking error is verified using the Lyapunov method. The closed-loop auxiliary system’s asymptotic stability and the uniform ultimate boundedness of the critic approximation error are also assured. Finally, two simulation examples are included to demonstrate the usefulness of the proposed methodology.
The main limitations of the proposed work and the future scope are as follows.

1. The work developed in this article needs complete knowledge of the system dynamics. However, we may not know the system dynamics completely in many applications.
2. The proposed method is not suitable for time-delay systems. In the future, a tracking controller will be designed for uncertain nonlinear systems with time delay using the event-based ADP approach.
Availability of data
Data sharing not applicable to this paper as no datasets were generated or analyzed during this work.
References
Kravaris, C., Palanki, S.: A Lyapunov approach for robust nonlinear state feedback synthesis. IEEE Trans. Autom. Control 33(12), 1188–1191 (1988)
Lewis, F., Jagannathan, S., Yesildirak, A.: Neural Network Control of Robot Manipulators and Non-linear Systems. CRC Press, London (2020)
Lin, F.: An optimal control approach to robust control design. Int. J. Control 73(3), 177–186 (2000)
Karimi-Ghartemani, M., Khajehoddin, S.A., Jain, P., Bakhshai, A.: Linear quadratic output tracking and disturbance rejection. Int. J. Control 84(8), 1442–1449 (2011)
Zribi, M., Almutairi, N., Abdel-Rohman, M., Terro, M.: Nonlinear and robust control schemes for offshore steel jacket platforms. Nonlinear Dyn. 35(1), 61–80 (2004)
Kwakernaak, H., Sivan, R.: Linear Optimal Control Systems, vol. 1. Wiley-Interscience, New York (1972)
Modares, H., Sistani, M.-B.N., Lewis, F.L.: A policy iteration approach to online optimal control of continuous-time constrained-input systems. ISA Trans. 52(5), 611–621 (2013)
Zhang, H.-G., Zhang, X., Yan-Hong, L., Jun, Y.: An overview of research on adaptive dynamic programming. Acta Autom. Sin. 39(4), 303–311 (2013)
Adhyaru, D., Kar, I., Gopal, M.: Fixed final time optimal control approach for bounded robust controller design using Hamilton–Jacobi–Bellman solution. IET Control Theory Appl. 3(9), 1183–1195 (2009)
Zhao, B., Jia, L., Xia, H., Li, Y.: Adaptive dynamic programming-based stabilization of nonlinear systems with unknown actuator saturation. Nonlinear Dyn. 93(4), 2089–2103 (2018)
Werbos, P.: Beyond regression: New tools for prediction and analysis in the behavioral sciences. Ph.D. dissertation, Harvard University (1974)
Lewis, F.L., Vrabie, D.: Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits Syst. Mag. 9(3), 32–50 (2009)
Si, J., Barto, A.G., Powell, W.B., Wunsch, D.: Handbook of Learning and Approximate Dynamic Programming, vol. 2. John Wiley & Sons, New York (2004)
Prokhorov, D.V., Wunsch, D.C.: Adaptive critic designs. IEEE Trans. Neural Netw. 8(5), 997–1007 (1997)
Si, J., Wang, Y.-T.: Online learning control by association and reinforcement. IEEE Trans. Neural Netw. 12(2), 264–276 (2001)
Dorigo, M., Birattari, M., Stutzle, T.: Ant colony optimization. IEEE Comput. Intell. Mag. 1(4), 28–39 (2006)
Cheein, F.A., Scaglia, G.: Trajectory tracking controller design for unmanned vehicles: a new methodology. J. Field Robot. 31(6), 861–887 (2014)
Al Issa, S., Kar, I.: Design and implementation of event-triggered adaptive controller for commercial mobile robots subject to input delays and limited communications. Control Eng. Pract. 114, 104865 (2021)
Modares, H., Lewis, F.L.: Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning. Automatica 50(7), 1780–1792 (2014)
Wang, D., Liu, D., Zhang, Y., Li, H.: Neural network robust tracking control with adaptive critic framework for uncertain nonlinear systems. Neural Netw. 97, 11–18 (2018)
Yang, X., Liu, D., Wei, Q., Wang, D.: Guaranteed cost neural tracking control for a class of uncertain nonlinear systems using adaptive dynamic programming. Neurocomputing 198, 80–90 (2016)
Mu, C., Zhang, Y., Gao, Z., Sun, C.: ADP-based robust tracking control for a class of nonlinear systems with unmatched uncertainties. IEEE Trans. Syst. Man Cybern. Syst. 50(11), 4056–4067 (2019)
Tabuada, P.: Event-triggered real-time scheduling of stabilizing control tasks. IEEE Trans. Autom. Control 52(9), 1680–1685 (2007)
Al Issa, S., Chakravarty, A., Kar, I.: Improved event-triggered adaptive control of non-linear uncertain networked systems. IET Control Theory Appl. 13(13), 2146–2152 (2019)
Al Issa, S., Kar, I.: Event-triggered adaptive control of uncertain non-linear systems under input delay and limited resources. Int. J. Dyn. Control (2021). https://doi.org/10.1007/s40435-021-00767-7
Mu, C., Wang, D., Sun, C., Zong, Q.: Robust adaptive critic control design with network-based event-triggered formulation. Nonlinear Dyn. 90(3), 2023–2035 (2017)
Vamvoudakis, K.G.: Event-triggered optimal adaptive control algorithm for continuous-time nonlinear systems. IEEE/CAA J. Autom. Sin. 1(3), 282–293 (2014)
Wang, D., Liu, D.: Learning and guaranteed cost control with event-based adaptive critic implementation. IEEE Trans. Neural Netw. Learn. Syst. 29(12), 6004–6014 (2018)
Dhar, N.K., Verma, N.K., Behera, L.: Adaptive critic-based event-triggered control for HVAC system. IEEE Trans. Ind. Inf. 14(1), 178–188 (2017)
Yang, X., Wei, Q.: Adaptive critic designs for optimal event-driven control of a CSTR system. IEEE Trans. Ind. Inf. 17(1), 484–493 (2020)
Dong, L., Zhong, X., Sun, C., He, H.: Event-triggered adaptive dynamic programming for continuous-time systems with control constraints. IEEE Trans. Neural Netw. Learn. Syst. 28(8), 1941–1952 (2016)
Yang, X., He, H.: Adaptive critic designs for event-triggered robust control of nonlinear systems with unknown dynamics. IEEE Trans. Cybern. 49(6), 2255–2267 (2018)
Wang, D., Mu, C., He, H., Liu, D.: Event-driven adaptive robust control of nonlinear systems with uncertainties through NDP strategy. IEEE Trans. Syst. Man Cybern. Syst. 47(7), 1358–1370 (2016)
Vamvoudakis, K.G., Mojoodi, A., Ferraz, H.: Event-triggered optimal tracking control of nonlinear systems. Int. J. Robust Nonlinear Control 27(4), 598–619 (2017)
Zhang, K., Zhang, H., Xiao, G., Su, H.: Tracking control optimization scheme of continuous-time nonlinear system via online single network adaptive critic design method. Neurocomputing 251, 127–135 (2017)
Zhang, K., Zhang, H., Jiang, H., Wang, Y.: Near-optimal output tracking controller design for nonlinear systems using an event-driven ADP approach. Neurocomputing 309, 168–178 (2018)
Xue, S., Luo, B., Liu, D., Gao, Y.: Adaptive dynamic programming-based event-triggered optimal tracking control. Int. J. Robust Nonlinear Control 31(15), 7480–7497 (2021)
Xu, N., Niu, B., Wang, H., Huo, X., Zhao, X.: Single-network ADP for solving optimal event-triggered tracking control problem of completely unknown nonlinear systems. Int. J. Intell. Syst. 36(9), 4795–4815 (2021)
Zhao, B., Liu, D.: Event-triggered decentralized tracking control of modular reconfigurable robots through adaptive dynamic programming. IEEE Trans. Industr. Electron. 67(4), 3054–3064 (2019)
Wang, D., Hu, L., Zhao, M., Qiao, J.: Adaptive critic for event-triggered unknown nonlinear optimal tracking design with wastewater treatment applications. IEEE Trans. Neural Netw. Learn. Syst. (2021). https://doi.org/10.1109/TNNLS.2021.3135405
Cui, L., Xie, X., Wang, X., Luo, Y., Liu, J.: Event-triggered single-network ADP method for constrained optimal tracking control of continuous-time non-linear systems. Appl. Math. Comput. 352, 220–234 (2019)
Cui, L., Qu, W., Wang, L., Luo, Y., Wang, Z.: Event-triggered \(H_{\infty }\) tracking control of nonlinear systems via reinforcement learning method. In: 2019 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2019). https://doi.org/10.1109/IJCNN.2019.8851956
Dahal, R., Kar, I.: Event-triggered robust tracking controller for uncertain nonlinear systems using adaptive critic. In: 2020 IEEE 17th India Council International Conference (INDICON), pp. 1–6 (2020). IEEE
Xue, S., Luo, B., Liu, D., Gao, Y.: Event-triggered ADP for tracking control of partially unknown constrained uncertain systems. IEEE Trans. Cybern. (2021). https://doi.org/10.1109/TCYB.2021.3054626
Modares, H., Lewis, F.L.: Linear quadratic tracking control of partially-unknown continuous-time systems using reinforcement learning. IEEE Trans. Autom. Control 59(11), 3051–3056 (2014)
Kiumarsi, B., Lewis, F.L.: Actor-critic-based optimal tracking for partially unknown nonlinear discrete-time systems. IEEE Trans. Neural Netw. Learn. Syst. 26(1), 140–151 (2014)
Zhang, Q., Zhao, D., Wang, D.: Event-based robust control for uncertain nonlinear systems using adaptive dynamic programming. IEEE Trans. Neural Netw. Learn. Syst. 29(1), 37–50 (2016)
Dierks, T., Jagannathan, S.: Optimal control of affine nonlinear continuous-time systems. In: Proceedings of the 2010 American Control Conference, pp. 1568–1573 (2010). IEEE
Funding
This work received no specific grant from any funding agency.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Dahal, R., Kar, I. Robust tracking control of nonlinear unmatched uncertain systems via event-based adaptive dynamic programming. Nonlinear Dyn 109, 2831–2850 (2022). https://doi.org/10.1007/s11071-022-07594-1