1 Introduction

Stabilization of the Furuta pendulum has long been an active research area for control engineers. The rotary inverted pendulum is a popular test bed for the class of underactuated mechanical systems. Early research on the rotary inverted pendulum was motivated by the need to design controllers that balance rockets during vertical take-off [1]. Nevertheless, a control algorithm developed for a rotary pendulum system can be readily extended to other two-degree-of-freedom unstable underactuated systems (e.g., the Acrobot, the Pendubot, the inertia wheel pendulum, and the cart-pole) [2].

The Furuta pendulum is a two-degree-of-freedom system with only one actuator. It is an inverted pendulum, classified as a nonlinear, nonminimum-phase, underactuated system. The structure is composed of an arm, attached to a motor, rotating in the horizontal plane. At the end of the arm, a pendulum is attached with free rotational movement in the vertical plane. The motion control of such systems is difficult because control of the overall system must be achieved through the actuated joints acting on the nonactuated joints [3]. Moreover, the presence of extraneous disturbances makes the control design more complicated.

The rotary inverted pendulum (Fig. 1), which was first introduced by Furuta et al. [4], contains well-known underactuated dynamics, and many reports about its stabilization can be found. Most controls of the rotary inverted pendulum fall into one of several categories. For example, some have considered the problem of stabilizing the pendulum around the unstable vertical position [5,6,7,8,9]. Some swung the pendulum from its hanging position to its upright vertical position [10,11,12,13]. Some tried to create oscillations around the unstable vertical position [14, 15], and some tried to track a trajectory with the arm while the pendulum remains in the upright vertical position [3, 16]. Therefore, three classical control objectives have been discussed in the literature: (1) swing-up; (2) stabilization; and (3) trajectory tracking.

Fig. 1 Furuta pendulum system

This study considers the control problem of swinging the pendulum up to its upright vertical position and stabilizing it around that point. An adaptive neural network is applied to achieve these control goals owing to its approximation property, while the sliding surface provides robustness. Eliminating the chattering phenomenon in steady state and optimizing the control input signal are the main purposes of this novel control scheme.

Regarding the stabilization problem, neural networks have been used in various pendulum-type systems [17, 18]. The robustness property of neural networks has been demonstrated using either real-time experiments or numerical simulations. In [19], adaptive neural network control for unknown nonlinear systems was proposed. The approach was applied to a cart-pendulum system and provided tracking of the pendulum without considering the cart's position. In [20], a dynamic Takagi–Sugeno–Kang-type, radial basis function-based neural–fuzzy system was proposed for online estimation of an ideal controller. Although the controller can solve a tracking problem, it was applied to the stabilization of the cart-pole system; however, similar to [19], the boundedness of the cart position was not shown. In [21], a method based on neural networks with output feedback control was applied to address the tracking problem for a spherical inverted pendulum.

Several robust controllers have been proposed for dealing with uncertainties and disturbances in the Furuta system. Yu et al. [22] proposed a robust controller to stabilize the Furuta pendulum under bounded perturbation. Khanesar et al. [23] used a fuzzy sliding controller to drive a rotary inverted pendulum to the vertical position subject to bounded uncertainties and disturbances. Park et al. [17] presented swing-up and stabilization control with coupled sliding mode control. In [24], an adaptive RBF network-based NN controller was proposed, combined with an SMC robust compensator, for the control of a two-link robot manipulator; in [5], an adaptive controller was proposed to balance a rotary inverted pendulum with time-varying uncertainties; and finally, in [3], the tracking control of pendulum-type systems was discussed using neural networks.

In these approaches, oscillations in the outputs or control input signals have not been eliminated completely, and strong fluctuations appear in the control input signals when the system is driven into an unstable situation, in addition to high energy consumption. Some exhibit steady-state errors or increased fluctuations after the external disturbance disappears.


In this paper, the control input is smoothed in steady state using a new approach, and system oscillation is eliminated by using a filtered tracking error and a new adaptive neural network scheme. Removing discontinuous terms from the control law and using a dual neural network yield fast adaptation while keeping the control input reasonable. Robustness is demonstrated by applying an external disturbance that drives the system away from the equilibrium position. The results show that the novel adaptive neural network controller performs well even though the dynamic model of the system is not needed to design the controller. The results are compared with previous works to highlight the improvement achieved with the presented scheme, all results are verified against an ADAMS model (software in the loop), and the last section gives the conclusions.

The main contributions are as follows:

  • Introducing a new filtered tracking error to stabilize the inverted pendulum in the vertical position while keeping the arm stable with zero velocity.

  • Introducing a new control scheme for an underactuated system using a dual adaptive neural network.

  • Proposing new weight adaptation laws based on the e-modification technique for the corrective neural network and introducing a pseudo-sigmoid activation function in the second neural network for oscillation compensation.

2 System dynamics

The dynamic model of the Furuta pendulum in Euler–Lagrange form [25, 26] can be written as:

$$M(q)\ddot{q} + C(q,\dot{q})\dot{q} + G_{m} (q) = U$$
(1)

where \(q = [q_{0} \,\,q_{1} ]^{\text{T}} \in IR^{2}\) is the vector of joint positions, \(M(q) \in IR^{2\times 2}\) is the symmetric positive definite inertia matrix, \(C(q,\dot{q})\dot{q} \in IR^{2}\) is the vector of centripetal and Coriolis torques, \(G_{m} (q) \in IR^{2}\) is the vector of gravitational torques, and \(U = [u\,\,0]^{\text{T}} \in IR^{2}\) is the vector of input torques, with \(u \in IR\) being the torque applied to the arm. In particular, the model of the Furuta pendulum has the following components:

$$\begin{aligned} q & = \left[ {\begin{array}{*{20}c} {q_{0} } \\ {q_{1} } \\ \end{array} } \right],\quad M(q) = \left[ {\begin{array}{*{20}c} {I_{0} + m_{1} (L_{0}^{2} + l_{1}^{2} \sin^{2} q_{1} )} & {m_{1} l_{1} L_{0} \cos q_{1} } \\ {m_{1} l_{1} L_{0} \cos q_{1} } & {J_{1} + m_{1} l_{1}^{2} } \\ \end{array} } \right] \\ C(q,\dot{q}) & = \left[ {\begin{array}{*{20}c} {\frac{1}{2}m_{1} l_{1}^{2} \sin (2q_{1} )\dot{q}_{1} } & { - m_{1} l_{1} L_{0} \sin q_{1} \dot{q}_{1} + \frac{1}{2}m_{1} l_{1}^{2} \sin (2q_{1} )\dot{q}_{0} } \\ { - \frac{1}{2}m_{1} l_{1}^{2} \sin (2q_{1} )\dot{q}_{0} } & 0 \\ \end{array} } \right] \\ G_{m} (q) & = \left[ {\begin{array}{*{20}c} 0 \\ { - m_{1} gl_{1} \sin q_{1} } \\ \end{array} } \right],\quad U = \left[ {\begin{array}{*{20}c} u \\ 0 \\ \end{array} } \right] \\ \end{aligned}$$

The coordinate system and notations are described in Fig. 1. We will assume that friction is negligible.
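To make the model concrete, the matrices above can be evaluated numerically. The sketch below is illustrative only: the parameter values are hypothetical placeholders, not the values listed in Table 1 of this paper.

```python
import numpy as np

# Hypothetical parameter values for illustration only (the paper's actual
# values are listed in its Table 1).
I0, L0 = 0.0043, 0.25               # arm inertia [kg m^2], arm length [m]
m1, l1, J1 = 0.125, 0.129, 0.0012   # pendulum mass, COG distance, inertia
g = 9.81                            # gravitational acceleration [m/s^2]

def furuta_matrices(q, dq):
    """Evaluate M(q), C(q, dq), G_m(q) of the Euler-Lagrange model (1)."""
    q0, q1 = q
    dq0, dq1 = dq
    s1, c1, s2 = np.sin(q1), np.cos(q1), np.sin(2.0 * q1)
    M = np.array([
        [I0 + m1 * (L0**2 + l1**2 * s1**2), m1 * l1 * L0 * c1],
        [m1 * l1 * L0 * c1,                 J1 + m1 * l1**2],
    ])
    C = np.array([
        [0.5 * m1 * l1**2 * s2 * dq1,
         -m1 * l1 * L0 * s1 * dq1 + 0.5 * m1 * l1**2 * s2 * dq0],
        [-0.5 * m1 * l1**2 * s2 * dq0, 0.0],
    ])
    Gm = np.array([0.0, -m1 * g * l1 * s1])
    return M, C, Gm
```

Note that at the upright equilibrium (\(q_1 = 0\)) the gravity vector vanishes, while \(M(q)\) stays symmetric and positive definite for any configuration.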

\(I_{0}\): inertia of the arm

\(L_{0}\): total length of the arm

\(m_{1}\): mass of the pendulum

\(l_{1}\): distance to the center of gravity of the pendulum

\(J_{1}\): inertia of the pendulum around its center of gravity

\(q_{0}\): rotational angle of the arm

\(q_{1}\): rotational angle of the pendulum

\(u\): input torque applied to the arm

g: gravitational acceleration

2.1 Problem formulation

The controller is required to serve a twofold control objective. The first objective is to stabilize the pendulum in its upright position at the origin from an initial condition in the upper half plane (i.e., \(q_{1} \in ( - \frac{\pi }{2},\frac{\pi }{2})\)). The second objective of the controller is to ensure the proper orientation control of the arm (\(q_{0}\)). In addition, the controller must possess adequate disturbance rejection ability to offer satisfactory control performance in an uncertain environment. The rotary inverted pendulum has dynamics from (1). Define the tracking error e(t) by

$$\begin{aligned} e_{0} & = q_{{0_{d} }} - q_{0} \\ e_{1} & = q_{{1_{d} }} - q_{1} \\ \end{aligned}$$
(2)

where the desired arm position \(q_{d} (t)\) is twice differentiable and bounded for all time t ≥ 0 in the sense

$$\left\| {q_{d} (t)} \right\|,\left\| {\dot{q}_{d} (t)} \right\|,\left\| {\ddot{q}_{d} (t)} \right\| \le \zeta$$
(3)

where ζ is a positive constant.

The Furuta pendulum model in (1) can be written in the following form [3]:

$$\begin{aligned} \frac{d}{dt}q_{0} & = \dot{q}_{0} \\ \frac{d}{dt}\dot{q}_{0} & = f_{0} + g_{0} u \\ \frac{d}{dt}q_{1} & = \dot{q}_{1} \\ \frac{d}{dt}\dot{q}_{1} & = f_{1} + g_{1} u \\ \end{aligned}$$
(4)

where

$$\begin{aligned} f_{0} & = \frac{1}{\det \,M(q)}[M_{22} Z_{1} - M_{12} Z_{2} ] \\ f_{1} & = \frac{1}{\det \,M(q)}[ - M_{21} Z_{1} + M_{11} Z_{2} ] \\ Z_{1} & = - C_{11} \dot{q}_{0} - C_{12} \dot{q}_{1} \\ Z_{2} & = - C_{21} \dot{q}_{0} - C_{22} \dot{q}_{1} + m_{1} gl_{1} \sin q_{1} \\ g_{0} & = \frac{{M_{22} }}{\det \,M(q)} \\ g_{1} & = \frac{{M_{21} }}{\det \,M(q)} \\ \end{aligned}$$
(5)

where \(M_{ij}\) and \(C_{ij}\) are the elements of the inertia matrix \(M(q)\) and the Coriolis matrix \(C(q,\dot{q})\), respectively. The system given by (4) can be written in terms of the tracking error (2) as follows:

$$\dot{e}_{0} = \dot{q}_{{0_{d} }} - \dot{q}_{0}$$
(6)
$$\ddot{e}_{0} = \ddot{q}_{{0_{d} }} - f_{0} - g_{0} u$$
(7)
$$\dot{e}_{1} = \dot{q}_{{1_{d} }} - \dot{q}_{1}$$
(8)
$$\ddot{e}_{1} = \ddot{q}_{{1_{d} }} - f_{1} - g_{1} u$$
(9)

which describe the open-loop system with \(\dot{q}_{{1_{d} }} = 0\,,\,\,\ddot{q}_{{1_{d} }} = 0\). Here, it is important to introduce a proper function of the error to achieve the control goal. We therefore propose an output function \(r(t) \in IR\) as a filtered tracking error given by

$$r = \dot{e}_{0} + \dot{e}_{1} + \lambda e_{1}$$
(10)

where \(\lambda > 0\) is a design parameter. Computing its time derivative, one obtains

$$\dot{r} = \ddot{e}_{0} + \ddot{e}_{1} + \lambda \dot{e}_{1}$$
(11)

Substituting (7) and (9) into (11) and simplifying gives

$$\begin{aligned} \dot{r} & = \ddot{q}_{{0_{d} }} - f_{0} - g_{0} u - f_{1} - g_{1} u - \lambda \dot{q}_{1} \\ & = F - Gu \\ \end{aligned}$$
(12)

where

$$\begin{aligned} F & = \ddot{q}_{{0_{d} }} - \lambda \dot{q}_{1} - f_{0} - f_{1} \\ G & = g_{0} + g_{1} \\ \end{aligned}$$
(13)

Equation (12) can also be rewritten as:

$$\,\frac{{\dot{r}}}{G}\,\, = \frac{F}{G} - u$$
(14)

where the function \(G(q_{1} )\) is strictly positive for all \(\left| {q_{1} } \right| < \cos^{ - 1} \left( {\frac{{J_{1} + m_{1} l_{1}^{2} }}{{m_{1} l_{1} L_{0} }}} \right)\) and is bounded and continuous such that

$$\frac{1}{2}\left| {\left. {\frac{{\dot{G}}}{{G^{2} }}} \right|} \right. \le \mu$$
(15)

where \(\mu\) is a positive constant value [3].
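The terms of the state-space form (4)–(5) and the gain \(G = g_0 + g_1\) can be implemented directly from the matrix entries. The following sketch follows the paper's expressions verbatim, with the matrices supplied by the caller:

```python
import numpy as np

def state_space_terms(M, C, dq, m1gl1, q1):
    """Compute f0, f1, g0, g1 of (4) from the expressions in (5).

    M, C  : 2x2 inertia and Coriolis matrices evaluated at (q, dq)
    dq    : (q0_dot, q1_dot)
    m1gl1 : the product m1 * g * l1 appearing in the gravity term
    """
    detM = M[0, 0] * M[1, 1] - M[0, 1] * M[1, 0]
    Z1 = -C[0, 0] * dq[0] - C[0, 1] * dq[1]
    Z2 = -C[1, 0] * dq[0] - C[1, 1] * dq[1] + m1gl1 * np.sin(q1)
    f0 = (M[1, 1] * Z1 - M[0, 1] * Z2) / detM
    f1 = (-M[1, 0] * Z1 + M[0, 0] * Z2) / detM
    g0 = M[1, 1] / detM
    g1 = M[1, 0] / detM
    G = g0 + g1       # the scalar gain of (13)
    return f0, f1, g0, g1, G
```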

A two-layer neural network can approximate any nonlinear, continuous, unknown function [27]. According to this universal approximation property of NNs, there is a two-layer NN such that:

$$\frac{F}{G} = {\textit{f}}\left({\textit{x}} \right) = W^{\text{T}} \sigma \left({V^{\text{T}} x} \right) + \epsilon$$
(16)

where \(V,\,\,W\) are the NN weights, \(\sigma\) is a sigmoid activation function, x is the input vector of the neural network, and the approximation error \(\epsilon\) is bounded on a compact set by

$$\left\| \epsilon \right\| < \epsilon_{N}$$

Now, let an NN estimate of f(x) be given by

$$\hat{f}\left( x \right) = \hat{W}^{\text{T}} \sigma \left( {\hat{V}^{\text{T}} x} \right)$$
(17)

where \(\hat{V},\,\,\hat{W}\) are the matrices of input and output NN weights, respectively, which are specified by the tuning algorithm. Note that \(\hat{V},\hat{W}\) are estimates of the ideal weight values.
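As a minimal illustration of (17), the estimate is a single matrix-vector pass through a sigmoid hidden layer. The shapes used here are an assumption for the sketch: \(\hat V\) maps the n-dimensional input to L hidden neurons, and \(\hat W\) maps those to a scalar output.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation, as in Eq. (46)."""
    return 1.0 / (1.0 + np.exp(-z))

def nn_estimate(W_hat, V_hat, x):
    """Two-layer NN estimate f_hat(x) = W_hat^T sigma(V_hat^T x), Eq. (17).

    V_hat : (n, L) input-layer weights
    W_hat : (L,)   output-layer weights
    x     : (n,)   network input vector
    """
    return float(W_hat @ sigmoid(V_hat.T @ x))
```

With zero output weights the estimate is zero, which matches the zero weight initialization used later in the simulations.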

The problem is to define a control law that can estimate the system dynamics and is robust enough to compensate for disturbances while reaching the control goal. Oscillation compensation is another goal of the control system.

3 Controller structure

In this section, SHL networks perform the approximation of the corresponding command. According to universal approximation theory, an SHL neural network can approximate any nonlinear, continuous, unknown function [27]. Owing to the iterative nature of the neural network's training mechanism and the high complexity of the dynamic model, the neural network may take a relatively long time to converge, which may lead to unstable dynamics or unsatisfactory performance. Hence, a robustifying term \(F_{r}\) is introduced that corresponds to a PD controller and injects damping into the system.

$$F_{r} = K_{v} r$$
(18)

with \(K_{v}\) being a positive constant gain. Therefore, a novel adaptive neural-network-based controller is proposed by the following expression:

$$U = U^{*} + U_{\text{cor}}$$
(19)
$$U^{*} = \hat{W}^{\text{T}} \sigma \left( {\hat{V}^{\text{T}} x} \right) + K_{v} r$$
(20)

where \(U^{*}\) is the equivalent controller, which contains the SHL neural network and \(F_{r}\). \(U_{\text{cor}}\) is the ideal corrective control, which will be presented in the next section.

The ideal weight matrices \(V,\,\,W\) are unknown, and it is necessary to estimate them by an adaptation mechanism so that the output feedback control law can be realized. The matrices \(\hat{W}\) and \(\hat{V}\) are the estimates of W and V, respectively. Assumption: On any compact subset of \(\Re^{n}\), the ideal NN weights are bounded so that \(\left\| W \right\|_{\text{F}} \le W_{m}\) and \(\left\| V \right\|_{\text{F}} \le V_{m}\), where \(W_{m}\) and \(V_{m}\) are unknown positive constants and \(\left\| {\,.\,} \right\|_{\text{F}}\) is the Frobenius norm. The weight deviations, or weight estimation errors, are defined as

$$\tilde{V} = V - \hat{V}\quad \tilde{W} = W - \hat{W}$$

Define the hidden layer output error for a given x as

$$\tilde{\sigma } = \sigma - \hat{\sigma } \equiv \sigma (V^{\text{T}} x) - \sigma (\hat{V}^{\text{T}} x)$$
(21)

The Taylor series expansion of \(\sigma \left( x \right)\) for a given x may be written as

$$\sigma (V^{\text{T}} x) = \sigma (\hat{V}^{\text{T}} x) + \sigma^{\prime}(\hat{V}^{\text{T}} x)\tilde{V}^{\text{T}} x + O(\tilde{V}^{\text{T}} x)^{2}$$
(22)

with

$$\sigma^{\prime}(\hat{z}) \equiv \left. {\frac{{{\text{d}}\sigma (z)}}{{{\text{d}}z}}} \right|_{{z = \hat{z}}}$$
(23)

the Jacobian matrix. The \(O\left( z \right)^{2}\) denotes terms of order two. Denoting \(\hat{\sigma^{\prime}} = \sigma^{\prime}(\hat{V}^{\text{T}} x)\), we have [28]

$$\tilde{\sigma } = \sigma^{\prime}(\hat{V}^{\text{T}} x)\tilde{V}^{\text{T}} x + O(\tilde{V}^{\text{T}} x)^{2} = \hat{\sigma^{\prime}}\tilde{V}^{\text{T}} x + O(\tilde{V}^{\text{T}} x)^{2}$$
(24)

The e-modification technique is commonly used in robust adaptive control to improve the robustness of the controller in the presence of the NN approximation error [29]. Hence, the weight adaptation laws based on the e-modification technique are given by [28]:

$$\begin{aligned} \dot{\hat{W}} & = F_{w} (\hat{\sigma }r^{T} - k\left\| r \right\|\hat{W}) \\ \dot{\hat{V}} & = F_{v} (xr^{T} \hat{W}^{T} \hat{\sigma }^{\prime} - k\left\| r \right\|\hat{V}) \\ \end{aligned}$$
(25)

with \(F_{w} > 0,\,\,F_{v} > 0\) the adaptive gains and k a positive constant. Then, the tracking error r(t) approaches zero as t increases, and the weight estimates \(\hat{V},\hat{W}\) remain bounded.
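A discrete-time sketch of the laws (25), integrated with a forward-Euler step, might look as follows. This is an illustration, not the authors' implementation: the step size dt is a placeholder, r is the scalar filtered error of (10), and the hidden-layer Jacobian uses the identity \(\sigma' = \sigma(1-\sigma)\), which matches Eq. (47).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def emod_update(W_hat, V_hat, x, r, Fw=100.0, Fv=50.0, k=1.0, dt=1e-3):
    """One forward-Euler step of the e-modification adaptation laws (25).

    W_hat : (L,)   output weights     V_hat : (n, L) input weights
    x     : (n,)   NN input           r     : scalar filtered error
    """
    sig = sigmoid(V_hat.T @ x)                 # hidden-layer output
    sig_prime = np.diag(sig * (1.0 - sig))     # Jacobian of Eq. (23)
    W_dot = Fw * (sig * r - k * abs(r) * W_hat)
    V_dot = Fv * (np.outer(x, W_hat @ sig_prime) * r - k * abs(r) * V_hat)
    return W_hat + dt * W_dot, V_hat + dt * V_dot
```

As expected from (25), the weights stop moving when the filtered error r is zero.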

3.1 Corrective control

Owing to the use of the filtered tracking error, some oscillations may appear in steady state. Although a sign function has been used in other works to correct the controller, its discontinuity introduces perturbations, specifically when an external disturbance is applied. The corrective control is designed to eliminate the chattering phenomenon caused by such discontinuous terms; for that reason, we propose a continuous function Z(·) given by:

$$Z\left( x \right) = \frac{{1 - e^{ - 2x} }}{{1 + e^{ - 2x} }}$$
(26)

Therefore, the ideal expression of the corrective term will be:

$$U_{\text{cor}}^{*} = BZ\left( {\alpha^{\text{T}} x^{e} } \right)$$
(27)

where the gain B and \(\alpha = [\lambda_{1} \,\,\,\lambda_{2} ]^{\text{T}}\) denote the ideal output layer and hidden layer weights, respectively, \(x^{e} = [e_{1} \,\,\,\,\dot{e}_{1} ]^{\text{T}}\) is the input vector, and Z(·) is the activation function defined in Eq. (26). A second neural network, shown in Fig. 2, is added to the control scheme to estimate the ideal corrective term.

Fig. 2 Second network to estimate the correction term

The output of this network is represented by:

$$U_{\text{cor}}^{*} = \hat{B}Z\left( {\hat{\alpha }^{\text{T}} x^{e} } \right) + \varepsilon_{c}$$
(28)

where \(\hat{B}\) and \(\hat{\alpha }\) are the estimates of B and \(\alpha\), respectively, and \(\varepsilon_{c}\) is the reconstruction error. The weights B and \(\alpha\) are unknown, so it is judicious to find a way of adapting them. Therefore, the real estimate of the corrective term by the ANN presented in Fig. 2 has the following structure:

$$U_{\text{cor}} = \hat{B}Z\left( {\hat{\alpha }^{\text{T}} x^{e} } \right)$$
(29)

The Taylor series expansion of the corrective term estimation error is given by:

$$\,\tilde{U}_{\text{cor}} = BZ - \hat{B}\hat{Z} = \tilde{B}\hat{Z} + B\tilde{Z} = \tilde{B}\hat{Z} + \hat{B}\hat{Z}^{\prime}\tilde{\alpha }^{T} x^{e} + w_{g}$$
(30)

where \(\tilde{B} = B - \hat{B}\) and \(\tilde{\alpha } = \alpha - \hat{\alpha }\) are the parameter errors. The term \(w_{g}\) denotes the approximation error given by:

$$w_{g} = \varepsilon_{c} + \tilde{B}\hat{Z}^{\prime}\tilde{\alpha }^{\text{T}} x^{e} + \tilde{B}O(\tilde{\alpha }^{\text{T}} x^{e} )^{2}$$
(31)

Assumption: \(\left\| {w_{g} } \right\| \le \bar{w}_{g}\), where \(\bar{w}_{g}\) is an unknown positive constant, and \(\left| B \right| \le B_{m} ,\left\| \alpha \right\|_{F} \le \alpha_{m}\), where \(\alpha_{m}\) and \(B_{m}\) are unknown positive constants. The proposed adaptation laws for adjusting the weights are given by the following equations, which will be proven via the Lyapunov function in (49).

$$\begin{aligned} \dot{\hat{B}} & = F_{B} \hat{Z}(\hat{\alpha }^{\text{T}} x^{e} )r_{e} \\ \dot{\hat{\alpha }} & = F_{\alpha } (x^{e} r_{e} \hat{B}\hat{Z}^{\prime} + k_{\alpha } \left| {r_{e} } \right|(\bar{\alpha } - \hat{\alpha })) \\ \end{aligned}$$
(32)

where \(r_{e} = \dot{e}_{1} + \delta e_{1} ,\,\,F_{B} = F_{B}^{\text{T}} > 0,\,\,F_{\alpha } = F_{\alpha }^{\text{T}} > 0,\,\,k_{\alpha } > 0.\) The vector \(\bar{\alpha }\) is selected as follows:

$$\bar{\alpha } = \left[ {\begin{array}{*{20}c} {\frac{p}{{\left| {e_{1} } \right| + \varepsilon }}} & 1 \\ \end{array} } \right]^{\text{T}} ,\quad p,\varepsilon > 0$$
(33)

where \(\bar{\alpha } \le \alpha_{n}\), with \(\alpha_{n}\) a positive constant. The structure of the controller is shown in Fig. 3.
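The corrective network is small enough to sketch in a few lines. Note that Z(·) in (26) is algebraically equal to tanh, so its derivative is \(1 - Z^2\). The gains below reuse the simulation values stated later in the paper (\(F_B = 5\), \(F_\alpha = 2\), \(k_\alpha = 1\), \(\delta = 2\)); p, eps, and the Euler step dt are hypothetical placeholders.

```python
import numpy as np

def Z(x):
    """Continuous activation of Eq. (26); algebraically equal to tanh(x)."""
    return (1.0 - np.exp(-2.0 * x)) / (1.0 + np.exp(-2.0 * x))

def corrective_step(B_hat, a_hat, e1, de1,
                    FB=5.0, Fa=2.0, ka=1.0, delta=2.0,
                    p=1.0, eps=0.1, dt=1e-3):
    """One forward-Euler step of the corrective term (29) and laws (32)-(33).

    B_hat   : scalar output weight     a_hat : (2,) hidden-layer weights
    e1, de1 : pendulum error and its derivative
    """
    xe = np.array([e1, de1])
    re = de1 + delta * e1
    z = Z(a_hat @ xe)
    z_prime = 1.0 - z**2                          # dZ/dx via the tanh identity
    a_bar = np.array([p / (abs(e1) + eps), 1.0])  # target vector of (33)
    B_dot = FB * z * re
    a_dot = Fa * (xe * re * B_hat * z_prime + ka * abs(re) * (a_bar - a_hat))
    U_cor = B_hat * z
    return U_cor, B_hat + dt * B_dot, a_hat + dt * a_dot
```

Because Z is continuous, the corrective term vanishes smoothly at the equilibrium (e₁ = ė₁ = 0), unlike a sign-function term.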

Fig. 3 Control system structure

3.2 Error system dynamics

The control input from (19), (20), and (29) is defined as

$$U = \hat{W}^{\text{T}} \sigma \left( {\hat{V}^{\text{T}} x} \right) + K_{v} r + \hat{B} Z\left( {\hat{\alpha }^{\text{T}} x^{e} } \right)$$
(34)
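Putting the pieces together, the total input of (34) is simply the sum of the SHL network output, the robustifying term, and the corrective network output. A minimal composition sketch (the scalar-output shapes are an assumption):

```python
import numpy as np

def control_input(W_hat, V_hat, x, Kv, r, B_hat, a_hat, xe):
    """Total control law (34): SHL NN term + robustifying term + corrective term."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    Z = lambda z: (1.0 - np.exp(-2.0 * z)) / (1.0 + np.exp(-2.0 * z))
    return float(W_hat @ sigmoid(V_hat.T @ x) + Kv * r + B_hat * Z(a_hat @ xe))
```

With all weights initialized to zero, the input reduces to the robustifying term \(K_v r\), which is what stabilizes the loop before the networks have adapted.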

Using the control law (34), the closed-loop filtered error dynamics becomes

$$\frac{{\dot{r}}}{G} = - K_{v } r + W^{\text{T}} \sigma \left( {V^{\text{T}} x} \right) - \hat{W}^{\text{T}} \sigma \left( {\hat{V}^{\text{T}} x} \right) - \hat{B} Z\left( {\hat{\alpha }^{\text{T}} x^{e} } \right) + \varepsilon$$
(35)

Adding and subtracting \(W^{\text{T}} \hat{\sigma }\) and \(B\hat{Z}\) yields

$$\frac{{\dot{r}}}{G} = - K_{v} r + \tilde{W}^{\text{T}} \hat{\sigma } + W^{\text{T}} \tilde{\sigma } + \tilde{B}\hat{Z} - B\hat{Z} + \varepsilon$$
(36)

Moreover, now adding and subtracting \(\hat{W}^{\text{T}} \tilde{\sigma } , BZ\) yields

$$\frac{{\dot{r}}}{G} = - K_{v} r + \tilde{W}^{\text{T}} \hat{\sigma } + \hat{W}^{\text{T}} \tilde{\sigma } + \tilde{W}^{\text{T}} \tilde{\sigma } + \tilde{B}\hat{Z} + B\tilde{Z} - BZ + \varepsilon$$
(37)

The key step here is the use of the Taylor series approximation for \(\tilde{\sigma },\tilde{Z}\), according to which the closed-loop error system is

$$\frac{{\dot{r}}}{G} = - K_{v} r + \tilde{W}^{\text{T}} \hat{\sigma } + \hat{W}^{\text{T}} \hat{\sigma }^{\prime} \tilde{V}^{\text{T}} x + \tilde{B}\hat{Z} + \hat{B}\hat{Z}^{\prime} \tilde{\alpha }^{\text{T}} x^{e} + w_{1}$$
(38)

where the disturbance term is

$$w_{1} = \tilde{W}^{\text{T}} \hat{\sigma }^{'} \tilde{V}^{\text{T}} x - BZ + w_{g} + \hat{W}^{\text{T}} O \left( {\tilde{V}^{\text{T}} x} \right)^{2} + \varepsilon$$
(39)

As seen, the convergence of r to zero implies convergence of the tracking error and its derivative to zero. Thus, the control objective reduces to synthesizing a control law that drives the filtered error to zero [28]. Let the desired trajectory \(q_{d} \left( t \right)\) be bounded by \(q_{b}\), and assume the disturbance term \(w_{1 }\) in (38) equals zero. The stability of the closed-loop system is proved by defining the Lyapunov function candidate L below; the proof is given in "Appendix."

$$\begin{aligned} L & = \frac{1}{2}\frac{{r^{2} }}{G} + \frac{1}{2}{\text{tr}} \left\{ {\tilde{W}^{T} F_{w}^{ - 1} \tilde{W}} \right\} \\ & \quad + \frac{1}{2}{\text{tr}}\left\{ {\tilde{V}^{T} F_{v}^{ - 1} \tilde{V}} \right\} + \frac{1}{2}{\text{tr}}\left\{ {\tilde{B}^{T} F_{B}^{ - 1} \tilde{B}} \right\} \\ & \quad + \frac{1}{2}{\text{tr}}\left\{ {\tilde{\alpha }^{T} F_{\alpha }^{ - 1} \tilde{\alpha }} \right\} \\ \end{aligned}$$
(40)

4 Simulation studies and performance comparison

Simulation results are presented to illustrate the operation of the proposed adaptive neuro-controller and its control input signals. The results of the swing-up and disturbance rejection tests are given, and a simulation comparison of three controllers against the new controller is performed. The model parameters are briefly described, and the controllers used in the comparison are introduced.

4.1 Controllers used in the comparison

A simulation study has been performed to assess the performance of the new controller (34). Specifically, a linear controller and the adaptive neural network schemes in [16, 30] were implemented for comparison.

The schemes in [3, 16, 30] have an adaptive neural network component and a linear PD or PID component. In addition, in [3] there is a nonlinear term that eliminates the adaptation error introduced by the neural network, but that discontinuous term introduces some fluctuations, specifically when a disturbance occurs. In the proposed controller, a second adaptive neural network is used to eliminate the oscillation and the adaptation error. Also, in the disturbance investigation, the external disturbance should be applied to the pendulum itself to observe the controller's ability to keep it stable, whereas in some previous works it was applied as a torque added to the control input of the first link. In [3], the initial angle of the pendulum was set to zero, while here we propose a nonzero initial angle to highlight the ability of the proposed scheme to swing up and to confirm its robustness against external disturbances. It is important to observe that the proposed adaptation laws (25) and (32) of the new scheme (34) are derived such that the time derivative of the Lyapunov function L in (49) is negative, whereas some of the other schemes use back-propagation adaptation without any motivation given by the closed-loop system analysis.

The adaptive neural network controller proposed by Moreno [3] is given by

$$\tau = - \hat{W}^{\text{T}} \sigma (\hat{V}^{\text{T}} x) - k_{p} y - \delta \,{\text{sign}}(y)$$
(41)

where the constants \(k_{p}\) and \(\delta\) are positive, \(\hat{V}\) is the matrix of estimated input weights, and \(\hat{W}\) is a vector of the estimated output weights. Output function y(t) is given by

$$y = \dot{e}_{1} + \Delta_{1} e_{1} + \dot{e}_{2} + \Delta_{2} e_{2}$$
(42)

where \(\Delta_{1} = 3,\,\Delta_{2} = 8\) are positive constants and \(k_{p} = 1.05,\,\delta = 0.035\). The number of neurons in the hidden layer was L = 10. The adaptation laws were implemented with α = 1, N = 1.05, and R = 8.53, and the weights of the neural network were initialized to zero. It is worth mentioning that Moreno's approach has been compared to a PID algorithm, and the results show its superiority to the PID algorithm; therefore, the PID is not discussed in this paper.

In the tests, we also consider the adaptive neural network controller proposed by Chaoui and Sicard [16].

$$\tau = \tau_{\text{NN}} + k_{D} s$$
(43)

where \(\tau_{\text{NN}}\) is the output of a two-layer neural network with six neurons in the hidden layer and one neuron in the output layer, whose input and output weights are obtained by the back-propagation algorithm, which minimizes the signal s given by

$$s = [1 - \lambda ][\dot{e}_{1} + \psi_{1} e_{1} ] + \lambda [\dot{e}_{2} + \psi_{2} e_{2} ]$$
(44)

with \(0 < \lambda < 1,\,\psi_{1} ,\psi_{2} > 0\), and \(k_{D} > 0\). Specifically, (43) was implemented with \(\lambda = 0.5,\,\,\psi_{1} = 20,\,\,\psi_{2} = 30\), and \(k_{D} = 1.757\).

See the original works [3, 16] for further details on the controllers (41) and (43), respectively.

4.2 Simulation

The simulation of the Furuta pendulum was conducted in MATLAB®/Simulink with the initial conditions \(q_{0} = 0,\; q_{1} = 30^\circ\). The actual values of the system parameters are presented in Table 1. For the proposed controller, the number of neurons in the hidden layer is 7, and the initial values of W and V were set to zero. Determining the number of hidden layer neurons required for the best approximation is an open problem for general fully connected two-layer NNs [28]. Therefore, the number of hidden neurons was chosen by trial and error; any value from 3 to 11 gives good performance, and changing this parameter within that range (Figs. 10, 11) causes only small changes in the results. The initial adaptive coefficients in (25) are \(F_{w} = 100\), \(F_{v} = 50\), k = 1, and in (32) they are \(F_{B} = 5,\,\,F_{\alpha } = 2,\,\,k_{\alpha } = 1\). We set \(\lambda = 10,\,K_{v} = 1,\,\,\delta = 2\), with zero initial values for \(B,\lambda_{1} ,\,\lambda_{2}\).

Table 1 System parameters

To stabilize the system, the arm speed must also be controlled; by setting zero as its desired state, a stable system is achieved. Therefore, the state variables are considered as follows:

$$\begin{aligned} X & = \left[ {e_{1} ,\; \dot{e}_{1} ,\; e_{{1\,{\text{delay}}}} ,\; \dot{e}_{0} ,\; \ddot{e}_{0} } \right] \\ e_{0} & = q_{{0\,{\text{desired}}}} - q_{0} \\ e_{1} & = q_{{1\,{\text{desired}}}} - q_{1} \\ \end{aligned}$$
(45)

The reason for introducing \(e_{{1\,{\text{delay}}}}\) is that the neural network controller itself is a memoryless device, so delayed signals must be introduced so that the output depends not only on the current input (the error, in our case) but also on past inputs. Under some circumstances, properly introducing time delays into the control channel can improve the control performance of practical systems [31]. As discussed in [32], using mixed current and delayed states can significantly reduce both the internal oscillations of an offshore platform and the required control force. In this paper, we consider only one simple delayed signal; in principle, however, multiple delayed signals can be introduced.

The activation function used in Eq. (25) and its derivative are calculated as follows.

$$\sigma (x) = \frac{1}{{1 + e^{ - x} }}\,$$
(46)

Then, \(\sigma^{\prime}(x)\) is calculated as follows [28, 33, 34]:

$$\sigma^{\prime}(x) = \sigma (x)\sigma ( - x)$$
(47)
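The identity (47) follows from differentiating (46) directly, since \(\sigma(-x) = 1 - \sigma(x)\). A quick numerical check via central differences:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Central-difference check of sigma'(x) = sigma(x) * sigma(-x), Eq. (47).
x = np.linspace(-5.0, 5.0, 101)
h = 1e-6
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2.0 * h)
analytic = sigmoid(x) * sigmoid(-x)
assert np.allclose(numeric, analytic, atol=1e-8)
```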

4.3 Results

The following shows the obtained results. For brevity, the proposed controller in (34) is labeled "new," the neural network controller proposed by [3] in (41) is labeled "Moreno," and the neural network controller introduced by [16] in (43) is labeled "Sicard." The results of the three implemented controllers bringing the inverted pendulum from 30 degrees to the vertical position (zero degrees) without external disturbances are shown in Figs. 4 and 5. The maximum absolute value of the pendulum position error and the root mean square (RMS) value of the error are compared in Table 2. The settling time is defined as the time at which the signal enters and remains within the 2%-tolerance region around the steady-state level. In the maximum-error quantification, the first time interval considered is after 5 s, when the pendulum is in a stable position, and the maximum error is highlighted. The second time interval (2 < t < 10) shows the ability of the proposed approach to reach a stable position in a shorter time.

Fig. 4 Angle of pendulum position in comparison with the other implemented controllers, with a magnified portion

Fig. 5 Comparison of the three implemented controllers: a the control input and b the angular velocity of the arm

Table 2 Quantification of error

Figure 4 shows that, within 0.2 s, the controller adapts its parameters, with the model completely unknown to the controller, and controls the underactuated system well. Also, the control input remains within an acceptable range (Fig. 5a). The velocity of the arm tends to zero without producing oscillation (Fig. 5b). The first second of the diagrams is magnified to highlight the performance of the proposed controller in comparison with the others. Fast adaptation and smaller fluctuations in the pendulum position and arm velocity are the advantages of the new controller. Table 2 quantifies the results obtained during the tests, comparing the maximum absolute value and the RMS value of the error \(e_{1} \left( t \right)\) for each algorithm. The best performance is obtained with the new controller because its RMS{e(t)} index, settling time, and maximum absolute error are the lowest.

To check the robustness of the controller, an external disturbance torque of \(F_{d} = 0.5\) N·m is applied to the pendulum for 0.1 s, and the controller performance is evaluated. The evolution of the variables with time is displayed in Figs. 6 and 7. Quantification of the error with disturbance is given in Table 3.

Fig. 6 Pendulum position with external disturbance

Fig. 7 Arm velocity with the external disturbance

Table 3 Quantification of error with external disturbance

As seen in Figs. 6 and 7, the disturbance applied over 0.1 s shifts the pendulum angle significantly from its steady state. However, the proposed controller is robust enough to keep the states in the neighborhood of the equilibrium point and to return them to the desired values as soon as the disturbance disappears. The applied torque is a positive step, which would produce a positive deviation of the pendulum angle in the absence of control. Here, because of the fast adaptation of the controller parameters, the motion of the arm counteracts that deviation: after a very small displacement in the positive direction, the control torque drives the pendulum slightly in the negative direction before it settles back to the vertical equilibrium. The results show that the controller does not produce as large an error as the other approaches, and the settling time after the disturbance highlights its adaptability. The RMS error values in Table 3 demonstrate the superior disturbance rejection of the proposed controller compared with the other approaches, and the absence of extreme fluctuations in the results indicates good performance for an underactuated system.

4.4 Corrective control analysis

In this section, the effect of adding \(U_{\text{cor}}\), the second part of the controller in (19), is investigated. One advantage of the presented approach is the use of a dual neural network to eliminate the oscillations that emerge from the discontinuity of the sign function used in other approaches such as [3, 35]. To show the effect of \(U_{\text{cor}}\) on oscillation compensation, the results with and without \(U_{\text{cor}}\) are shown in Figs. 8 and 9. Specifically, the controller \(U_{2}\) that is compared with (19) is defined by:

$$U_{2} = \hat{W}^{\text{T}} \sigma \left( {\hat{V}^{\text{T}} x} \right) + K_{v} r + {\text{sign}}\left( r \right)$$
(48)
Fig. 8

Pendulum angle: the ANN controller in comparison with the proposed ANN controller, in which the oscillation is compensated

Fig. 9

Control input: the ANN controller and the proposed ANN controller, in which the oscillation is compensated

Figures 8 and 9 show the performance of the proposed algorithm in comparison with the ANN controller without \(U_{\text{cor}}\). The proposed control scheme compensates the oscillation caused by uncertainties and by the discontinuity. Although multiplying the sign function by a coefficient can reduce the oscillation of the pendulum position (it is not eliminated entirely), the control input then increases to unreasonable values. Therefore, the figures show that the performance of the proposed algorithm is clearly better than that of the ANN controller formulated in (48).
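The chattering caused by a discontinuous sign term can be illustrated on a toy first-order plant. The sketch below is only an illustration of the underlying phenomenon: the smooth tanh term is a stand-in for a corrective action, not the neural \(U_{\text{cor}}\) itself, and the plant, gains, and step size are arbitrary assumptions.

```python
import math

def simulate(extra_term, x0=0.5, dt=0.01, steps=1000, k=2.0):
    """Forward-Euler simulation of a toy first-order plant
    dx/dt = -k*x + u, where u comes from extra_term(x)."""
    x, traj = x0, []
    for _ in range(steps):
        x = x + dt * (-k * x + extra_term(x))
        traj.append(x)
    return traj

# Discontinuous sign term: chatters around the origin in discrete time.
sign_traj = simulate(lambda x: -math.copysign(1.0, x) if x != 0 else 0.0)

# Smooth stand-in for a corrective action: settles without chattering.
smooth_traj = simulate(lambda x: -math.tanh(50.0 * x))

tail = 100  # inspect the last 100 samples of each run
chatter = max(sign_traj[-tail:]) - min(sign_traj[-tail:])
smooth = max(smooth_traj[-tail:]) - min(smooth_traj[-tail:])
print(chatter > smooth)  # the sign-based law keeps oscillating
```

Scaling the sign term down shrinks the residual oscillation but never removes it, which mirrors the trade-off discussed above.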

4.5 Number of hidden neuron analysis

The effect of the number of hidden neurons in the neural network is studied next. As noted before, the number of neurons in the hidden layer is determined by trial and error. Figures 10 and 11 analyze the results as the number of hidden neurons varies from 3 to 11; three settings are compared to show the performance of each in terms of the pendulum position and the control input.

Fig. 10

Pendulum angle in the analysis of the number of hidden neurons

Fig. 11

Control input in the analysis of the number of hidden neurons

As seen in Figs. 10 and 11, three hidden-layer sizes (3, 7, and 11 neurons) are analyzed. The results clearly show that 7 neurons in the hidden layer give the best performance in terms of both the position error and the control input.
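The structure being sized here is the two-layer network \(\hat{W}^{\text{T}} \sigma ( \hat{V}^{\text{T}} x )\). A minimal sketch of its forward pass, with the hidden width as a parameter, is given below; the random weights and the four-entry state vector are placeholders introduced for illustration, since the actual weights are adapted online by the update laws.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_network(n_input, n_hidden, seed=0):
    """Forward pass of the two-layer structure u = W^T sigma(V^T x).
    The weights are random placeholders; in the paper they are
    adapted online by the weight update laws."""
    rng = random.Random(seed)
    V = [[rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
         for _ in range(n_input)]
    W = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]

    def forward(x):
        hidden = [sigmoid(sum(x[i] * V[i][j] for i in range(n_input)))
                  for j in range(n_hidden)]
        return sum(W[j] * hidden[j] for j in range(n_hidden))

    return forward

# Assumed four-entry state vector (angles and rates); the hidden width
# is the quantity varied in Figs. 10 and 11.
for n_hidden in (3, 7, 11):
    net = make_network(n_input=4, n_hidden=n_hidden)
    print(n_hidden, net([0.5, 0.0, 0.1, 0.0]))
```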

The controller is able to drive the Furuta pendulum system toward its desired equilibrium point (i.e., X = 0) in an efficient manner. The simulation results clearly show the stabilizing ability of the controller. Thus, with slight design modifications, the algorithm can be used for other two-degree-of-freedom systems.

5 ADAMS simulation

To verify the numerical results, the dynamic model of the Furuta pendulum is simulated in the commercial ADAMS/View package from MSC Software. The robot, with all details matching its real working conditions, is modeled in ADAMS as shown in Fig. 12. Coulomb friction effects, both stiction and sliding, are considered in all of the joints; the coefficients of static and dynamic friction are set to 0.5 and 0.3, respectively. All robot dimensions and inertia parameters are the same as in Table 1. The absolute velocity threshold for the transition from dynamic to static friction is 0.1 mm/s. Viscous friction is modeled with a damper on each joint, and the damping coefficient is chosen as 0.2 so that the software model behaves the same as the experimental setup.
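The joint friction described above can be sketched as a simple velocity-dependent torque model. This is only an illustrative stand-in for the ADAMS friction model: the `normal_load` scaling is a placeholder for the joint load, and the threshold value simply mirrors the 0.1 mm/s figure quoted above.

```python
def friction_torque(velocity, normal_load=1.0, mu_static=0.5,
                    mu_dynamic=0.3, v_threshold=1e-4, damping=0.2):
    """Coulomb (stiction/sliding) plus viscous joint friction, mirroring
    the coefficients quoted for the ADAMS model. Below the velocity
    threshold (0.1 mm/s = 1e-4 m/s) the static coefficient applies;
    above it, the dynamic one. normal_load is a placeholder scaling
    for the Coulomb term."""
    mu = mu_static if abs(velocity) <= v_threshold else mu_dynamic
    direction = (velocity > 0) - (velocity < 0)  # sign of the velocity
    coulomb = -mu * normal_load * direction
    viscous = -damping * velocity
    return coulomb + viscous
```

The friction torque always opposes the motion, and the jump from the static to the dynamic coefficient at the threshold is what produces the stiction-to-sliding transition.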

Fig. 12

ANN control of the system in ADAMS and Simulink (software in the loop)

Using the “Controls” plugin in ADAMS/View, the dynamic model of the system is exported to the MATLAB/Simulink environment (Fig. 12), and the proposed controller is implemented on the robot model to evaluate the results. The controller parameters are the same as in Sect. 4.2. The implemented controller is first evaluated on the ADAMS model without external disturbance; to compare the ADAMS model with the numerical model, the results are shown in Fig. 13.

Fig. 13

Comparison of results for ADAMS and numerical model

As seen in Fig. 13, the numerical and ADAMS results are almost identical; the small differences that exist are due to the friction, which is modeled in full detail in ADAMS. The quantification of the error over the 10 s simulation is presented in Table 4.

Table 4 Quantification of error for numerical and ADAMS model

As in the numerical model of Fig. 6, an external disturbance is applied to the pendulum as a torque of \(F_{d} = 0.5\) N·m for 0.1 s (Fig. 14).

Fig. 14

Pendulum position with external disturbance for ADAMS and numerical model

As shown in Fig. 14, because of the friction present in the ADAMS model, the deviation of the angular position of the pendulum is smaller than in the numerical model. The results show that the proposed approach works on both the numerical model and the simulated physical model, and its performance is comparable with previous works.

6 Conclusion

This paper presented the development and application of an adaptive neural network control scheme that drives the Furuta pendulum system from an initial condition to its upright vertical position and stabilizes it there. The term \(U_{\text{cor}}\) was introduced to compensate the oscillation of the pendulum as it reaches the steady state. The controller was derived from the universal approximation property of neural networks, and weight adaptation laws were designed. The simulation results clearly indicate the effectiveness of the proposed control law for an uncertain nonlinear underactuated system, and the results were compared with previous works to highlight the performance of the proposed approach. The controller simultaneously stabilizes the arm’s orientation angle and the angular position of the pendulum, and it provides robust, fluctuation-free performance in the presence of parametric uncertainty and extraneous disturbances. In addition, the proposed algorithm can easily be extended to any other two-degree-of-freedom underactuated system. The numerical results were verified with a software-in-the-loop simulation of the robot in ADAMS. In future work, the authors plan to apply other activation functions and different structures of the adaptive neural network to improve the performance, and to apply the controller to other complex mechanisms.