1 Introduction

At present, marine transportation is indispensable for the development of the world. The ocean-going ships always have to endure large environmental impacts including waves, winds, and currents. The dramatic roll motions will affect the comfort of seafarers and passengers because of seasickness. Moreover, it may lead to the instability and unsafety of the ship and cargoes. From the perspective of safety, the roll damping facilities or strategies are needed to damp the roll motions as much as possible. Some effective methods and devices, e.g. moving weights (Treakle et al. 2000), stabilizing fins, anti-roll tanks, and bilge keels, have been successfully designed to reduce roll motions. However, the additional equipment affects the ship’s carrying capacity, seaworthiness, and structure strength and increases the shipbuilding and maintenance costs. Therefore, other applicable approaches are needed for vessels to keep stability while maintaining the orientation.

When altering the rudder angle, there exist additional force and moment on the roll motions. Consequently, the rudder is an alternative to control roll motions besides being used as the steering facility. Using the rudder for path following and roll reduction simultaneously is not a simple task due to the coupling between the motions of yaw and roll. Hence, qualified control strategies are required to handle the trade-offs. The conventional rudder roll stabilizer based on the proportional-integrative-derivative (PID) control method (Van and Van 1978) has been applied in the commercial autopilot system for its simplification and reliability. However, this kind of controllers, which was designed with fixed parameters and scheduling gains, does not work well in heavy seas (Sun et al. 2014). With the development of modern control theory, fuzzy logic control algorithm was adopted to design the rudder roll stabilization system (Nejim 2000). However, it is difficult to formulate the fuzzy control rules, which are generally obtained by trial-and-error-based human knowledge. The H-Infinite-based rudder roll stabilization system was developed by Zhang et al. (2006), but the control method has the risk of being unstable when the changing speed of the dynamics is beyond the adapting capability. The sliding mode control also has been adopted in this domain (Fang and Luo 2007). However, the high frequency of control actions may lead to unexpected dynamic distortions (Sun et al. 2014).

Impelled by the development of computing technology, the control algorithms based on neural networks became applicable. The advantages of the neural network control algorithm lie in the capability of approximating nonlinearity and robustness against system noises. Another feature of the neural network is the capability in ‘comprehending’ the multivariable characteristics of the system by adjusting the weights, which avoids the analytical analysis of the complicated nonlinear differential equations. In Alarçïn (2007), the rudder roll stabilization system for fishing vessels was designed by using a neural network approach, in which the sigmoid transfer function is used as the activation functions. By utilizing the hyperbolic tangent function, Fang et al. (2010) proposed a PID controller tuned by the neural network to control the ship’s roll motions in random waves. The wavelet neural network also has been investigated to design the roll damping controller (Li et al. 2010). Amongst the multilayer feed-forward neural networks, the radial basis function neural networks (RBFNN) has simple architecture and owns the adequate generalization capability in avoiding unnecessary and lengthy calculations (Liu 2013). Thus, the RBFNN is adopted in this study to design the control system.

The training algorithm is essential to propose a neural network controller. Up to the present, most of the neural network–based roll stabilizers are trained by the well-known backpropagation (BP) algorithm or its variants (Alarçïn 2007; Fang et al. 2012). Although the BP methods are applicable to train neural networks, the relevant flaws are still of concern. Since BP is a first-order steepest descent method, the training of networks with gradient descent tends to be slowly and poorly approaching satisfactory results (Ko and Lee 2013). Additionally, the propagation of dynamic derivatives regarding the networks’ output is computationally expensive (Choi et al. 2005). Actually, the training of proposed neural networks can be considered as the process of estimating parameters. As the optimal state estimator, the Kalman filter can be employed as the alternative. Amongst the Kalman filter variants, the extended Kalman filter (EKF) algorithms, which are capable of achieving second-order nonlinearity accuracy by using Jacobian matrix for approximation, can provide the online training mechanism (Medagam and Pourboghrat 2009). Although the unscented Kalman filter (UKF) algorithm has been investigated as a satisfied neural network training method to the control of the ship’s motions (Wang et al. 2017a), the additional computational burdens are excessive since it needs to handle the propagations of ‘sigma points’ to capture the true means and covariances of the Gaussian random variables through the system dynamics. Contrary to the higher order training methods, the EKF-based training algorithm for RBFNN does not require batch processing and can easily access the Jacobian matrix, making it more suitable for online usage (Zhao et al. 2013). From the running time standpoint, the EKF-based training algorithm is more competent with reasonable computational expenses.

Besides the standard EKF, the augmented EKF (Goh and Mandic 2007) and decoupled EKF (Nouri et al. 2008) have been developed to train neural networks. It is shown that, with the application of the training algorithm based on EKF variants, the converging speed and the capability in restraining noises were improved (Sanchez et al. 2008). However, the algorithms were proposed to train the weights but not proposed to train the dynamic parameters in the activation functions. For the ship’s motion control, which always requires the alternations of the desired attitude due to the passage planning, the capabilities of the RBFNN controller also depend on the online optimization of the centers in radial basis functions. Therefore, the dual EKF (DEKF) is utilized to train both weights and centers of the RBFNN controller in this study.

The main motivation of this study is to develop a qualified rudder roll damping autopilot based on DEKF RBFNN, which has low computational expenses and satisfactory tracking capabilities. The key objectives of this research are (1) to formulate the DEKF-based training method for the feedback RBFNN control scheme; (2) to analyze the merits of the proposed training method in comparison with the BP training method; (3) to develop a rudder roll stabilization system based on DEKF RBFNN for surface vessels advancing in waves; and (4) to validate the efficiency of the system by achieving the tasks of path following and roll damping simultaneously with environmental disturbances.

The organization of this study is as follows: section 2 introduced the preliminaries including the state space function of the ship’s motions in four degrees of freedom (DOF) as well as the control scheme of a direct method for robust adaptive control by RBFNN. In section 3, the DEKF training method for RBFNN-based controller and its theoretical advantages are presented. After that, in section 4, the rudder roll stabilization system was proposed by using DEKF RBFNN–based control algorithm to fulfil the tasks of roll damping and path following through the rudder actions. The simulation scenarios and results are presented in section 5 to investigate the efficiency of the system. Conclusions are given in section 6.

2 Problem Formulations and Preliminaries of the Direct RBFNN Control

2.1 Ship’s Motion Equations

The ship’s mathematical model, which is deduced from Newton-Euler’s law, is the representation of the ship’s motions. The kinematic and kinetic states of the surface vessel are clarified in the two correlative coordinates, namely earth-fixed coordinate and body-fixed coordinate (see Fig. 1).

Fig. 1
figure 1

Two correlative coordinates for the motions of the surface vessel

Considering the objectives of roll damping and path following, the four-DOF (i.e. surge, sway, yaw, and roll) model is used to describe the dynamic motions of surface vessels. The ship’s four-DOF nonlinear model including control forces and manoeuvring characteristics can be expressed as (Fossen 1994):

$$ \dot{\boldsymbol{\eta}}=\boldsymbol{Jv} $$
(1)
$$ \boldsymbol{M}\dot{\boldsymbol{v}}+\boldsymbol{C}\left(\boldsymbol{v}\right)\boldsymbol{v}+\boldsymbol{D}\left(\boldsymbol{v},\boldsymbol{\eta} \right)\boldsymbol{v}+\boldsymbol{G}\boldsymbol{\eta } ={\boldsymbol{\tau}}_{\mathrm{ex}}+\boldsymbol{\tau} $$
(2)

where the vectors η = [x, y, ϕ, ψ]T and v = [u, v, p, r]T are used to represent the states of the ship’s motions, where (x, y) are the position, ϕ and ψ are the roll angle and yaw angle in the earth-fixed coordinate, (u, v) are the linear velocity of surge and sway in the body-fixed coordinate, p and r are the roll rate and yaw rate of the ship, J is the transformation matrix of the kinetic and kinematic states, M is the system inertial matrix, C(v) is the Coriolis and centripetal matrix, D(v) is the damping matrix, denotes the vector of gravitational force and moments, and τ denotes the vector of control inputs, τex is the matrix of external disturbance.

Specifically, the above-mentioned matrices are given as:

$$ \boldsymbol{M}=\left[\begin{array}{cccc}m-{X}_{\dot{u}}& 0& 0& 0\\ {}0& m-{Y}_{\dot{v}}& -m{z}_g-{Y}_{\dot{p}}& m{x}_g-{Y}_{\dot{r}}\\ {}0& -m{z}_g-{Y}_{\dot{v}}& {I}_x-{Y}_{\dot{p}}& 0\\ {}0& {mx}_g-{N}_{\dot{v}}& 0& {I}_z-{N}_{\dot{r}}\end{array}\right] $$
(3)
$$ \boldsymbol{C}\left(\boldsymbol{v}\right)=\left[\begin{array}{cccc}0& 0& m{z}_gr& {Y}_{\dot{v}}v-m\left({rx}_g+v\right)\\ {}0& 0& 0& mu-{X}_{\dot{u}}u\\ {}-m{z}_gr& 0& 0& {Y}_{\dot{v}}v\\ {}-{Y}_{\dot{v}}v+m\left({rx}_g+v\right)& - mu+{X}_{\dot{u}}u& -{Y}_{\dot{v}}v& 0\end{array}\right] $$
(4)
$$ \boldsymbol{D}\left(\boldsymbol{v},\boldsymbol{\eta} \right)=\left[\begin{array}{cccc}-{X}_{uu}u-{X}_{uu u}{u}^2& -{X}_{vv}v-{X}_{rv}r& 0& -{X}_{rr}r\\ {}-{Y}_{uv\phi} v\phi -{Y}_{ur\phi} r\phi & -{Y}_{uu v}{u}^2-{Y}_{vv v}{v}^2-{Y}_{rr v}{r}^2& 0& -{Y}_{uu r}{u}^2-{Y}_{rr r}{r}^2-{Y}_{vv r}{v}^2\\ {}-{K}_{uv\phi} v\phi -{K}_{ur\phi} r\phi & -{K}_{uu v}{u}^2-{K}_{vv v}{v}^2-{K}_{rr v}{r}^2& {K}_p-{K}_{ppp}{p}^2& -{K}_{uu r}{u}^2-{K}_{rr r}{r}^2-{K}_{vv r}{v}^2\\ {}-{N}_{uv\phi} v\phi -{N}_{ur\phi} r\phi & -{N}_{uu v}{u}^2-{N}_{vv v}{v}^2-{N}_{rr v}{r}^2& 0& -{N}_{uu r}{u}^2-{N}_{rr r}{r}^2-{N}_{vv r}{v}^2\end{array}\right] $$
(5)
$$ \boldsymbol{G}=\left[\begin{array}{cccc}0& 0& 0& 0\\ {}0& 0& 0& 0\\ {}0& 0& -{K}_{\phi }& 0\\ {}0& 0& 0& 0\end{array}\right] $$
(6)
$$ {\boldsymbol{\tau}}_{\mathrm{ex}}={\left[{\tau}_{\mathrm{ex}\mid X},{\tau}_{\mathrm{ex}\mid Y},{\tau}_{\mathrm{ex}\mid K},{\tau}_{\mathrm{ex}\mid N}\right]}^{\mathrm{T}} $$
(7)
$$ \boldsymbol{\tau} ={\left[{\tau}_X,{\tau}_Y,{\tau}_K,{\tau}_N\right]}^{\mathrm{T}} $$
(8)

where m is the mass of the ship; xg and zg are the position of the ship’s center of mass in the body-fixed coordinate; Ix and Iz are the moment of inertia; \( {X}_{\dot{u}} \), \( {Y}_{\dot{v}} \), \( {Y}_{\dot{r}} \), \( {N}_{\dot{v}} \), \( {Y}_{\dot{p}} \), and \( {N}_{\dot{r}} \) are the hydrodynamic added mass and added moment inertia of the ship; X, YK, and N with subscripts are the hydrodynamic coefficients concerning the corresponding dimension; τXτY, τK, and τN are the forces and moments generated from actuators including propellers and rudder on corresponding DOF; and τex ∣ X, τex ∣ Y, τex ∣ K, and τex ∣ N are the environmental disturbances on corresponding DOF, including the disturbances of waves, currents, and wind.

2.2 RBFNN-Based Function Approximation for Feedback Control

In order to better represent the process of the ship’s motions, the four-DOF mathematical model of the surface vessel can be rewritten as a second-order nonlinear system. Considering the fact that the transformation matrix of the states (i.e. J) is a determined item in every iteration of calculating ordinary differential equation (ODE), when the relevant items are defined as \( A\left(\boldsymbol{S}\right)=\frac{-\boldsymbol{C}\left(\boldsymbol{v}\right)-\boldsymbol{D}\left(\boldsymbol{v},\boldsymbol{\eta} \right)}{\boldsymbol{M}}+\frac{-\boldsymbol{G}\boldsymbol{\eta }}{\boldsymbol{M}}\boldsymbol{v} \), \( B\left(\boldsymbol{S}\right)=\frac{1}{\boldsymbol{M}} \)C = J ∈ R, \( d=\frac{{\boldsymbol{\tau}}_{\mathrm{ex}}}{\boldsymbol{M}} \), u = τ, the Eq. (1) can be simplified as the following state space equations:

$$ {\displaystyle \begin{array}{c}{\dot{s}}_1=C{s}_2\\ {}{\dot{s}}_2=A\left(\boldsymbol{S}\right)+B\left(\boldsymbol{S}\right)u+d\\ {}Y={s}_1\end{array}} $$
(9)

where S = [s1 s2]T are the system states of the plant, which contains the relevant controllable variables in the control process, u is the input of the plant being controlled, and Y is the output of the plant.

In the closed-loop control of the plant, the target value Yd, the error e, and an augmented error term eA can be defined as:

$$ {\boldsymbol{Y}}_d={\left[{Y}_d\ {\dot{Y}}_d\right]}^{\mathrm{T}},\boldsymbol{e}={\boldsymbol{Y}}_d-\boldsymbol{Y}={\left[e\ \dot{e}\right]}^{\mathrm{T}}={\left[{Y}_d-Y,\dot{Y_d}-\dot{Y}\right]}^{\mathrm{T}},{e}_A=\left[\lambda\ 1\right]\boldsymbol{e}=\lambda e+\dot{e} $$
(10)

where λ is selected to ensure that the polynomial eA + λ is Hurwitz (Ge et al. 1999). The derivative of the augmented s can be expressed as:

$$ {\displaystyle \begin{array}{l}\dot{e_A}=\lambda \dot{e}+\ddot{e}\\ {}\kern1.25em =\lambda \dot{e}+{\ddot{Y}}_d-\ddot{Y}\\ {}\begin{array}{l}\kern1.25em =\lambda \dot{e}+{\ddot{Y}}_d- CA\left(\boldsymbol{S}\right)- CB\left(\boldsymbol{S}\right)u- Cd\\ {}\kern1.25em =D- CA\left(\boldsymbol{S}\right)- CB\left(\boldsymbol{S}\right)u- Cd\end{array}\end{array}} $$
(11)

where \( D=\lambda \dot{e}+{\ddot{Y}}_d \).

Lemma: For the system expressed by Eq. (9), considering that B(S) > 0, the ideal control law to make the plant converge to desired states can be chosen as:

$$ {u}^{\ast }=-\frac{1}{CB\left(\boldsymbol{S}\right)}\left( CA\left(\boldsymbol{S}\right)-D+ Cd\right)-\left(\frac{\dot{B}\left(\boldsymbol{S}\right)}{2 CB{\left(\boldsymbol{S}\right)}^2}-\frac{1}{\varepsilon CB\left(\boldsymbol{S}\right)}-\frac{1}{\varepsilon CB{\left(\boldsymbol{S}\right)}^2}\right){e}_A $$
(12)

where ε > 0 is the design parameter which is used to determine the converge rate of the tracking error, then \( \underset{n\to \infty }{\lim}\left\Vert e\right\Vert =0 \).

Proof: When the control law in Eq. (12) is substituted in the Eq. (11), the augmented term s can be rewritten as:

$$ {\displaystyle \begin{array}{l}\dot{e_A}=D- CA\left(\boldsymbol{S}\right)- CB\left(\boldsymbol{S}\right)u- Cd\\ {}\kern1.25em =-\left(\frac{\dot{B}\left(\boldsymbol{S}\right)}{2B\left(\boldsymbol{S}\right)}+\frac{1}{\varepsilon }+\frac{1}{\varepsilon B\left(\boldsymbol{S}\right)}\right){e}_A\end{array}} $$
(13)

Developing the Lyapunov function as \( V=\frac{1}{2B\left(\boldsymbol{S}\right)}{e_A}^2, \) its derivative can be presented as:

$$ \dot{V}=\frac{1}{B\left(\boldsymbol{S}\right)}{e}_A\dot{e_A}-\frac{\dot{B}\left(\boldsymbol{S}\right)}{2B{\left(\boldsymbol{S}\right)}^2}{e_A}^2=-\left(\frac{1}{\varepsilon B\left(\boldsymbol{S}\right)}+\frac{1}{\varepsilon B{\left(\boldsymbol{S}\right)}^2}\right){e_A}^2 $$
(14)

The above-mentioned function indicates that the augmented error eA would converge to zero. Correspondingly, \( \underset{n\to \infty }{\lim}\left\Vert e\right\Vert =0 \).

It is shown that the robust control law in Eq. (12) can be formulated as a function of the desired signals, the system states, and their relevant derivatives

$$ \boldsymbol{z}={\left[\boldsymbol{S}\ {e}_A\ \frac{e_A}{\varepsilon }\ D\right]}^{\mathrm{T}} $$
(15)

whereas, as can be seen in the above-mentioned equations, the corresponding matrices in the ship’s motion equations (i.e. A(S), B(S), and C) are complicated. In addition, the various unknown disturbances acting on the ship add difficulty to achieve the model-based closed-loop control. Hence, to achieve the control law in Eq. (12), it is reasonable to utilize the online optimization algorithm to approximate the control laws.

The RBFNN was developed to approximate the unknown functions by adopting the radial basis functions as the activation functions (Broomhead and Lowe 1988). The scheme of the neural networks contains three layers: input layer, output layer, and hidden layer. The essential feature of the RBFNN-based control approach is that no prior knowledge of the dynamics of the plant is required to design the controller, and no offline training is required for the neural network (Ge et al. 2010). Therefore, the RBFNN can be applied in this study to approximate the function of control law as:

$$ {\displaystyle \begin{array}{l}\hat{u}={\hat{\boldsymbol{\omega}}}^{\mathrm{T}}\boldsymbol{\varPhi} \left(\boldsymbol{z}\right)=\left[{\hat{\omega}}_1{\varPhi}_1\left(\boldsymbol{z}\right)+{\hat{\omega}}_2{\varPhi}_2\left(\boldsymbol{z}\right)+\cdots +{\hat{\omega}}_i{\varPhi}_i\left(\boldsymbol{z}\right)+\cdots +{\hat{\omega}}_m{\varPhi}_m\left(\boldsymbol{z}\right)\right]\\ {}\varPhi {\left(\boldsymbol{z}\right)}_i=\exp \left(-\frac{{\left\Vert \boldsymbol{z},-,{\hat{\boldsymbol{c}}}_i\right\Vert}^2}{2{\sigma_{\mathrm{RBF}}}^2}\right)\end{array}} $$
(16)

where \( \hat{u} \) is the approximated value of ideal control law, \( \hat{\boldsymbol{\omega}} \) is the estimation of the ideal weights, Φ(z)i is the radial basis function working as the activation function in the hidden layer, in which \( {\hat{\boldsymbol{c}}}_i \) is the center and σRBF is the width representing the covering scope for the network input, and Φ with i subscripts denote the activation functions with the input z.

Based on the introduction mentioned above, it is known that the unknown nonlinear plant can be controlled by the RBFNN-based controller with the input z. The input matrix can be constructed by using the desired value and actual value. When the algorithm to update the neural network weights and centers is applied, the direct control method based on the RBFNN approximation can be used to provide control and make the plant converge to the desired states; the details are illustrated in Fig. 2.

Fig. 2
figure 2

Architecture of the direct RBFNN control system

3 Control Algorithm Based on DEKF-Trained RBFNN

In this section, the DEKF training algorithm is introduced to train the RBFNN-based controller to improve the control performance with satisfactory design complications. The strengths of the DEKF training algorithm to the BP training method are analyzed mathematically as well.

3.1 RBFNN-Based Controller Trained by DEKF

The training algorithm for the RBFNN-based controller is presented as:

3.1.1 Estimation of Weights

  • Initialization

$$ {\displaystyle \begin{array}{l}{\hat{\boldsymbol{w}}}_0=E\left[{\boldsymbol{w}}_0\right]\\ {}{\boldsymbol{P}}_0=E\left[\left({\boldsymbol{w}}_0-{\hat{\boldsymbol{w}}}_0\right){\left({\boldsymbol{w}}_0-{\hat{\boldsymbol{w}}}_0\right)}^{\mathrm{T}}\right]\end{array}} $$
(17)

where \( \hat{\boldsymbol{w}} \) is the estimated weights initialized as small random value (e.g. magnitude of 0.1). P is the estimated error covariance of weights, and its initial value can be a diagonal matrix with diagonal components.

  • Recursively executing with time interval ∆t

Step W-1. Time updating

$$ {\displaystyle \begin{array}{l}{\overset{\sim }{\boldsymbol{w}}}_t={\hat{\boldsymbol{w}}}_{t-\Delta t}\\ {}{\overset{\sim }{\boldsymbol{P}}}_t={\boldsymbol{P}}_{t-\Delta t}\end{array}} $$
(18)

where \( \overset{\sim }{\boldsymbol{w}} \) is the predicted weights and \( \overset{\sim }{\boldsymbol{P}} \) is the predicted error covariance of weights.

  • Step W-2. Jacobian matrix calculation

$$ {\displaystyle \begin{array}{l}{\boldsymbol{H}}_t=\frac{\partial {y}_p}{\partial {\overset{\sim }{\boldsymbol{w}}}_t}=\frac{\partial {y}_p}{\partial {y}_m}\frac{\partial {y}_m}{\partial {\overset{\sim }{\boldsymbol{w}}}_t}\\ {}\kern1.5em =\left[\exp \left(\frac{-{\left\Vert {\boldsymbol{X}}_t-\overset{\sim }{\boldsymbol{c}}(1)\right\Vert}^2}{2{\sigma_{\mathrm{RBF}}}^2}\right),\exp \left(\frac{-{\left\Vert {\boldsymbol{X}}_t-\overset{\sim }{\boldsymbol{c}}(2)\right\Vert}^2}{2{\sigma_{\mathrm{RBF}}}^2}\right),\dots, \exp \left(\frac{-{\left\Vert {\boldsymbol{X}}_t-\overset{\sim }{\boldsymbol{c}}(m)\right\Vert}^2}{2{\sigma_{\mathrm{RBF}}}^2}\right)\right]\frac{\partial {y}_p}{\partial {y}_m}\end{array}} $$
(19)

where H is the Jacobian matrix regarding neural network weights for DEKF estimation and \( \frac{\partial {y}_p}{\partial {y}_m} \) is the Jacobian item representing the sensitivity between the plant input and output; more details about the implement are shown in Liu (2013).

  • Step W-3. Kalman gains matrix calculation

$$ {\boldsymbol{K}}_t={\overset{\sim }{\boldsymbol{P}}}_t{{\boldsymbol{H}}_t}^{\mathrm{T}}{\left[{\boldsymbol{H}}_t{\overset{\sim }{\boldsymbol{P}}}_t{{\boldsymbol{H}}_t}^{\mathrm{T}}+{\boldsymbol{R}}^w\right]}^{-1} $$
(20)

where K is the Kalman gain matrix for weights group, Rw is a diagonal matrix with components equal to or slightly less than 1.

  • Step W-4. Estimation of states

$$ {\hat{\boldsymbol{w}}}_t={\overset{\sim }{\boldsymbol{w}}}_t+{\boldsymbol{K}}_t{e}_{Aw} $$
(21)

where \( \hat{\boldsymbol{w}} \) is the estimated weights of the RBFNN-based controller and eAw is the augmented deviation, which obeys the format in Eq. (11).

  • Step W-5. Error covariance updating

$$ {\boldsymbol{P}}_t={\overset{\sim }{\boldsymbol{P}}}_t-{\boldsymbol{K}}_t{{\boldsymbol{H}}_t}^{\mathrm{T}}{\overset{\sim }{\boldsymbol{P}}}_t+{\boldsymbol{Q}}^w $$
(22)

where Qw is the matrix representing artificial process noise in the process of the weights’ training. It is used to avoid numerical divergence.

3.1.2 Estimation of Centers

  • Initialization

$$ {\displaystyle \begin{array}{l}{\hat{\boldsymbol{c}}}_0=E\left[{\boldsymbol{c}}_0\right]\\ {}{\boldsymbol{G}}_0=E\left[\left({\boldsymbol{c}}_0-{\hat{\boldsymbol{c}}}_0\right){\left({\boldsymbol{c}}_0-{\hat{\boldsymbol{c}}}_0\right)}^{\mathrm{T}}\right]\end{array}} $$
(23)

where \( \hat{\boldsymbol{c}} \) is the estimated centers initialized as small random value. G is the estimated error covariance of centers initialized as a diagonal matrix with diagonal components.

  • Recursively executing with time interval ∆t

  • Step C-1. Time updating

$$ {\displaystyle \begin{array}{l}{\overset{\sim }{\boldsymbol{c}}}_t={\hat{\boldsymbol{c}}}_{t-\Delta t}\\ {}{\overset{\sim }{\boldsymbol{G}}}_t={\boldsymbol{G}}_{t-\Delta t}\end{array}} $$
(24)

where \( \overset{\sim }{\boldsymbol{c}} \) is the predicted centers and \( \overset{\sim }{\boldsymbol{G}} \) is the predicted error covariance of centers.

  • Step C-2. Jacobian matrix calculation

$$ {\displaystyle \begin{array}{l}{\boldsymbol{J}}_t=\frac{\partial {y}_p}{\partial {\overset{\sim }{\boldsymbol{c}}}_t}=\frac{\partial {y}_p}{\partial {y}_m}\frac{\partial {y}_m}{\partial {\overset{\sim }{\boldsymbol{c}}}_t}\\ {}\kern1.25em =\left[{\overset{\sim }{\boldsymbol{w}}}_t(1)\boldsymbol{\varPhi} (1)\frac{\overset{\sim }{\boldsymbol{c}}(1)-{\boldsymbol{X}}_t}{{\sigma_{\mathrm{RBF}}}^2},{\overset{\sim }{\boldsymbol{w}}}_t(2)\boldsymbol{\varPhi} (2)\frac{\overset{\sim }{\boldsymbol{c}}(2)-{\boldsymbol{X}}_t}{{\sigma_{\mathrm{RBF}}}^2},\dots, {\overset{\sim }{\boldsymbol{w}}}_t(m)\boldsymbol{\varPhi} (m)\frac{\overset{\sim }{\boldsymbol{c}}(m)-{\boldsymbol{X}}_t}{{\sigma_{\mathrm{RBF}}}^2}\right]\frac{\partial {y}_p}{\partial {y}_m}\end{array}} $$
(25)

where J is the Jacobian matrix regarding neural network centers for DEKF estimation.

  • Step C-3. Kalman gains matrix calculation

$$ {\boldsymbol{T}}_t={\overset{\sim }{\boldsymbol{G}}}_t{{\boldsymbol{J}}_t}^{\mathrm{T}}{\left[{\boldsymbol{J}}_t{\overset{\sim }{\boldsymbol{G}}}_t{{\boldsymbol{J}}_t}^{\mathrm{T}}+{\boldsymbol{R}}^c\right]}^{-1} $$
(26)

where T is the Kalman gain matrix for centers group and Rc is a diagonal matrix with components equal to or slightly less than 1.

  • Step C-4. Estimation of states

$$ {\hat{\boldsymbol{c}}}_t={\overset{\sim }{\boldsymbol{c}}}_t+{\boldsymbol{T}}_t{e}_{Ac} $$
(27)

where \( \hat{\boldsymbol{c}} \) is the estimated centers of RBFNN-based controller and eAc is the augmented deviation, which obeys the format in Eq. (11).

  • Step C-5. Error covariance updating

$$ {\boldsymbol{G}}_t={\overset{\sim }{\boldsymbol{G}}}_t-{\boldsymbol{T}}_t{{\boldsymbol{J}}_t}^{\mathrm{T}}{\overset{\sim }{\boldsymbol{G}}}_t+{\boldsymbol{Q}}^c $$
(28)

where Qc is the matrix representing artificial process noise in the process of the centers’ training. It is used to avoid numerical divergence.

More details about the studies of the stability analysis and converging in the use of the EKF-based training algorithm can be referred in De and Yu (2007) and Wang and Huang (2011). Based on the above-mentioned formulas, the DEKF-based algorithm was used as the training algorithm by using the augmented system error to update the weights and centers of the RBFNN. As shown in Eqs. (21) and (27), the augmented error between the desired value and the actual value was adopted in the current iteration to approximate the weights and centers. After calculating the control law by using the proposed DEKF-trained RBFNN controller shown in Eq. (16), the actuator will make the plant converge to the desired value recursively.

3.2 Performance Analysis Between DEKF- and BP-Based Training Algorithm

It is well-known that the BP training algorithm is a classical method to set the weights and centers of RBFNN. Although the BP training method is effective in training neural networks, it only processes the present gradient descent procedure. That is to say, the gradient is calculated for the error surface merely based on the current state but not the whole set of the states, which are constantly changing under the control law. The BP training algorithm can be expressed as (Liu 2013):

$$ {\displaystyle \begin{array}{l}\begin{array}{l}{\Delta \boldsymbol{w}}_{\mathrm{BP}}=-{\eta}^w\frac{\partial E}{\partial {\boldsymbol{w}}_{\mathrm{BP}}}\\ {}\kern2.75em =-{\eta}^w\frac{\partial E}{\partial {y}_p}\frac{\partial {y}_p}{\partial {y}_m}\frac{\partial {y}_m}{\partial {\boldsymbol{w}}_{\mathrm{BP}}}\\ {}\kern2.75em ={\eta}^w{e}_{Aw}\left[\exp \left(\frac{-{\left\Vert {\boldsymbol{X}}_t-\overset{\sim }{\boldsymbol{c}}(1)\right\Vert}^2}{2{\sigma_{\mathrm{RBF}}}^2}\right),\exp \left(\frac{-{\left\Vert {\boldsymbol{X}}_t-\overset{\sim }{\boldsymbol{c}}(2)\right\Vert}^2}{2{\sigma_{\mathrm{RBF}}}^2}\right),\dots, \exp \left(\frac{-{\left\Vert {\boldsymbol{X}}_t-\overset{\sim }{\boldsymbol{c}}(m)\right\Vert}^2}{2{\sigma_{\mathrm{RBF}}}^2}\right)\right]\frac{\partial {y}_p}{\partial {y}_m}\end{array}\\ {}\begin{array}{l}{\Delta \boldsymbol{c}}_{\mathrm{BP}}=-{\eta}^c\frac{\partial E}{\partial {\boldsymbol{c}}_{\mathrm{BP}}}\\ {}\kern2.75em =-{\eta}^c\frac{\partial E}{\partial {y}_p}\frac{\partial {y}_p}{\partial {y}_m}\frac{\partial {y}_m}{\partial {\boldsymbol{c}}_{\mathrm{BP}}}\\ {}\kern2.75em ={\eta}^c{e}_{Ac}\left[{\overset{\sim }{\boldsymbol{w}}}_t(1)\boldsymbol{\varPhi} (1)\frac{\overset{\sim }{\boldsymbol{c}}(1)-{\boldsymbol{X}}_t}{{\sigma_{\mathrm{RBF}}}^2},{\overset{\sim }{\boldsymbol{w}}}_t(2)\boldsymbol{\varPhi} (2)\frac{\overset{\sim }{\boldsymbol{c}}(2)-{\boldsymbol{X}}_t}{{\sigma_{\mathrm{RBF}}}^2},\dots, {\overset{\sim }{\boldsymbol{w}}}_t(m)\boldsymbol{\varPhi} (m)\frac{\overset{\sim }{\boldsymbol{c}}(m)-{\boldsymbol{X}}_t}{{\sigma_{\mathrm{RBF}}}^2}\right]\frac{\partial {y}_p}{\partial {y}_m}\end{array}\end{array}} $$
(29)

where ηw and ηc are the learning rates of the neural networks and s is the augmented error of the system. When the following two assumptions are taken, (1) \( \overset{\sim }{\boldsymbol{P}}={p}^w\boldsymbol{I} \), \( \overset{\sim }{G}={p}^c\boldsymbol{I} \) and (2) \( {\left[\boldsymbol{H}\overset{\sim }{\boldsymbol{P}}{\boldsymbol{H}}^{\mathrm{T}}+{\boldsymbol{R}}^w\right]}^{-1}={a}^w\boldsymbol{I} \), \( {\left[\boldsymbol{J}\overset{\sim }{\boldsymbol{G}}{\boldsymbol{J}}^{\mathrm{T}}+{\boldsymbol{R}}^c\right]}^{-1}={a}^c\boldsymbol{I} \), the DEKF algorithm for neural network training can be simplified as the following form:

$$ {\displaystyle \begin{array}{l}\begin{array}{l}\begin{array}{l}{\Delta \boldsymbol{w}}_{\mathrm{EKF}}=\boldsymbol{K}{e}_{Aw}\\ {}\kern3.5em =\overset{\sim }{\boldsymbol{P}}{\boldsymbol{H}}^{\mathrm{T}}{\left[\boldsymbol{H}\overset{\sim }{\boldsymbol{P}}{\boldsymbol{H}}^{\mathrm{T}}+{\boldsymbol{R}}^w\right]}^{-1}{e}_{Aw}\\ {}\kern3.5em ={a}^w{p}^w{e}_{Aw}\boldsymbol{H}\end{array}\\ {}\kern3.5em ={a}^w{p}^w{e}_A\left[\exp \left(\frac{-{\left\Vert {\boldsymbol{X}}_t-\overset{\sim }{\boldsymbol{c}}(1)\right\Vert}^2}{2{\sigma_{\mathrm{RBF}}}^2}\right),\exp \left(\frac{-{\left\Vert {\boldsymbol{X}}_t-\overset{\sim }{\boldsymbol{c}}(2)\right\Vert}^2}{2{\sigma_{\mathrm{RBF}}}^2}\right),\dots, \exp \left(\frac{-{\left\Vert {\boldsymbol{X}}_t-\overset{\sim }{\boldsymbol{c}}(m)\right\Vert}^2}{2{\sigma_{\mathrm{RBF}}}^2}\right)\right]\frac{\partial {y}_p}{\partial {y}_m}\end{array}\\ {}\begin{array}{l}{\Delta \boldsymbol{c}}_{\mathrm{EKF}}=\boldsymbol{T}{e}_{Ac}\\ {}\kern3.25em =\overset{\sim }{\boldsymbol{G}}{\boldsymbol{J}}^{\mathrm{T}}{\left[\boldsymbol{J}\overset{\sim }{\boldsymbol{G}}{\boldsymbol{J}}^{\mathrm{T}}+{\boldsymbol{R}}^c\right]}^{-1}{e}_{Ac}\\ {}\begin{array}{l}\kern3.25em ={a}^c{p}^c{e}_{Ac}\boldsymbol{J}\\ {}\kern3.25em ={a}^c{p}^c{e}_{Ac}\left[{\overset{\sim }{\boldsymbol{w}}}_t(1)\boldsymbol{\varPhi} (1)\frac{\overset{\sim }{\boldsymbol{c}}(1)-{\boldsymbol{X}}_t}{{\sigma_{\mathrm{RBF}}}^2},{\overset{\sim }{\boldsymbol{w}}}_t(2)\boldsymbol{\varPhi} (2)\frac{\overset{\sim }{\boldsymbol{c}}(2)-{\boldsymbol{X}}_t}{{\sigma_{\mathrm{RBF}}}^2},\dots, {\overset{\sim }{\boldsymbol{w}}}_t(m)\boldsymbol{\varPhi} (m)\frac{\overset{\sim }{\boldsymbol{c}}(m)-{\boldsymbol{X}}_t}{{\sigma_{\mathrm{RBF}}}^2}\right]\frac{\partial {y}_p}{\partial {y}_m}\end{array}\end{array}\end{array}} $$
(30)

Note that from the mathematical standpoint, the DEKF training algorithm can be interpreted as the degeneration of the BP training algorithm with awpw = ηw and acpc = ηc.

The essentialities of the two above-mentioned assumptions can be implicated as follows. The first assumption means that the covariance of the weights and centers remains the diagonal form as its value is not changed during the propagation. Initially, this assumption is reasonable because the initial errors of the weights and centers do not have cross-correlations. However, during the training process, maintaining the diagonal forms of P and G means that the errors in the weights and centers are uncorrelated, which is unsound because the weights and centers in the previous steps affect the derivative value of the outputs which will contribute to the value of the following estimations. Hence, the off-diagonal forms are essential as they contain interrelated information for the training of weights and centers. Moreover, this assumption leads to the results that the weights and centers are updated with the same rate even though they need to be trained based on the changing rate of differences. Therefore, discarding the off-diagonal covariance has the significant impact on the training of weights and centers.

Based on the first assumption, the second assumption can be rewritten as:

$$ {\displaystyle \begin{array}{l}\boldsymbol{H}\overset{\sim }{\boldsymbol{P}}{\boldsymbol{H}}^{\mathrm{T}}+{\boldsymbol{R}}^w={p}^w\boldsymbol{H}{\boldsymbol{H}}^{\mathrm{T}}+{\tau}^w\boldsymbol{I}\\ {}\boldsymbol{J}\overset{\sim }{\boldsymbol{G}}{\boldsymbol{J}}^{\mathrm{T}}+{\boldsymbol{R}}^c={p}^c\boldsymbol{J}{\boldsymbol{J}}^{\mathrm{T}}+{\tau}^c\boldsymbol{I}\end{array}} $$
(31)

In order to achieve the assumptions which equal to awI and acI, it is required that

$$ {\displaystyle \begin{array}{l}\boldsymbol{H}{\boldsymbol{H}}^{\mathrm{T}}={\gamma}^w\boldsymbol{I}\\ {}\boldsymbol{J}{\boldsymbol{J}}^{\mathrm{T}}={\gamma}^c\boldsymbol{I}\end{array}} $$
(32)

When the rows of matrices are orthogonal and the entries are\( \sqrt{\gamma^w} \) and \( \sqrt{\gamma^c} \), respectively, the assumptions will be satisfied with

$$ {\displaystyle \begin{array}{l}{\left[\boldsymbol{H}\overset{\sim }{\boldsymbol{P}}{\boldsymbol{H}}^{\mathrm{T}}+{\boldsymbol{R}}^w\right]}^{-1}=\frac{1}{p^w{\gamma}^w+{\tau}^w}\boldsymbol{I}\\ {}{\left[\boldsymbol{J}\overset{\sim }{\boldsymbol{G}}{\boldsymbol{J}}^{\mathrm{T}}+{\boldsymbol{R}}^c\right]}^{-1}=\frac{1}{p^c{\gamma}^c+{\tau}^c}\boldsymbol{I}\end{array}} $$
(33)

However, these conditions cannot always be satisfied because the row entries of H and J are the partial derivatives of the outputs regarding the weights and centers in the proposed RBFNN.

Overall, when the DEKF algorithm is being simplified into the BP algorithm, the assumptions cannot always be guaranteed and would result in the discarding of some useful information in updating weights and centers. Therefore, the DEKF-based training algorithm is superior to the BP-based algorithm theoretically. The section of case studies will examine the comparative performance of BP- and DEKF-trained RBFNN control system.

4 DEKF RBFNN–Based Rudder Roll Damping Autopilot

The rudder roll stabilizer has been demonstrated to be a competent approach to reduce roll motions only using the rudder as the damping actuator (Perez and Blanke 2012). In comparison with the compact controller design, the separate controller design, which owns two controllers implemented in parallel, has fewer rudder actions and design complexity caused by the coupling items (Fang and Luo 2006). In this study, the rudder roll stabilization system is designed through developing and conducting the yaw motion controller and the roll damping controller separately. Based on the DEKF RBFNN control scheme, the autopilot control system concerning the objective of path following and roll damping is developed as shown in Fig. 3.

Fig. 3
figure 3

The scheme of the proposed rudder roll stabilization

The enclosure-based steering line of sight (EBS LOS)–based guidance method (see Fig. 1) is employed in the control system. The dynamic desired course angle ψd to track the trajectory and the tracking deviation E0 can be calculated as:

$$ {\psi}_d=\arctan \left(\frac{y_{\mathrm{los}}-{y}_c}{\ {x}_{\mathrm{los}}-{x}_c}\right) $$
(34)
$$ {E}_0=\left({y}_c-{y}_k\right)\cos \left({\alpha}_k\right)-\left({x}_c-{x}_k\right)\sin \left({\alpha}_k\right) $$
(35)

where (xc, yc) is the current position, (xlos,  ylos) is the position of EBS LOS point, (xk, yk) is the previous pre-set waypoint, and αk is the orientation of the pre-determined trajectory; more details can be seen in Fossen (2011).

In the proposed system, the roll damping controller will calculate the relevant control output based on the actual roll angle ϕ and roll rate p. Simultaneously, the actual position is utilized in the EBS LOS guidance block to calculate the instantaneous course angle. The actual yaw angle ψ and yaw rate r are compared with the desired yaw angle ψd to build up the input matrix for the yaw motion controller. The ship’s command rudder angle is equivalent to the summation of the control law from yaw motion controller and roll reduction controller with corresponding parameters:

$$ {\displaystyle \begin{array}{l}{u}_{\psi }={{\hat{\boldsymbol{w}}}_{\psi}}^{\mathrm{T}}\boldsymbol{\varPhi} \left({\boldsymbol{z}}_{\psi}\right)={\sum}_{j=1}^m{\hat{\boldsymbol{w}}}_{\psi }(j)\exp \left(\frac{-{\left\Vert {\boldsymbol{z}}_{\psi }-{\hat{\boldsymbol{c}}}_{\psi }(j)\right\Vert}^2}{2{\sigma_{\mathrm{RBF}}}^2}\right)\\ {}{u}_{\phi }={{\hat{\boldsymbol{w}}}_{\phi}}^{\mathrm{T}}\boldsymbol{\varPhi} \left({\boldsymbol{z}}_{\phi}\right)={\sum}_{j=1}^m{\hat{\boldsymbol{w}}}_{\phi }(j)\exp \left(\frac{-{\left\Vert {\boldsymbol{z}}_{\phi }-{\hat{\boldsymbol{c}}}_{\phi }(j)\right\Vert}^2}{2{\sigma_{\mathrm{RBF}}}^2}\right)\\ {}{\delta}_c={b}_{\psi }{u}_{\psi }+{b}_{\phi }{u}_{\phi}\end{array}} $$
(36)

where uψ and uϕ are the approximated control laws for path following and roll reduction, Φ are the neurons of the proposed RBFNN, \( {\boldsymbol{z}}_{\psi }={\left[\psi, \kern0.5em \lambda \left({\psi}_d-\psi \right)+\left({\dot{\psi}}_d-\dot{\psi}\right),\kern0.5em \frac{\lambda \left({\psi}_d-\psi \right)+\left({\dot{\psi}}_d-\dot{\psi}\right)}{\varepsilon },\kern0.5em \lambda \left({\dot{\psi}}_d-\dot{\psi}\right)+{\ddot{\psi}}_d\right]}^{\mathrm{T}} \) and \( {\boldsymbol{z}}_{\phi }={\left[\phi, \kern0.5em \lambda \left({\phi}_d-\phi \right)+\left({\dot{\phi}}_d-\dot{\phi}\right),\kern0.5em \frac{\ \lambda \left({\phi}_d-\phi \right)+\left({\dot{\phi}}_d-\dot{\phi}\right)}{\varepsilon },\kern0.5em \lambda \left({\dot{\phi}}_d-\dot{\phi}\right)+{\ddot{\phi}}_d\right]}^{\mathrm{T}} \) (where ϕd = 0 is selected in this study to show the roll stabilization) are the input of relevant controllers, \( {\hat{\boldsymbol{w}}}_{\psi } \) and \( {\hat{\boldsymbol{w}}}_{\phi } \) are the corresponding weights trained by the DEFK, \( {\hat{\boldsymbol{c}}}_{\psi } \) and \( {\hat{\boldsymbol{c}}}_{\phi } \) are the corresponding centers trained by the DEFK, j is the total number of the neurons in the neural network. δc is the rudder command angle and bψ and bϕ are the parameters reflecting the emphasis of control performance. Since the underactuation characteristic of the ship’s motion control, the rudder is commonly utilized as the only actuator. The selection of the parameters is determined by the environmental conditions and stabilization requirements. In this study, the importance of the yaw control is assumed to be equivalent to that of the roll damping control; thus, the two parameters are assumed to be equivalent to each other (Fossen 2011).

To analyze the control capability of the proposed DEKF RBFNN–based rudder roll stabilization system, the evaluation items including roll reduction percentage, the cost functions of roll motion CRoll, and rudder angle CRudder (McGookin et al. 2000) are adopted and expressed as follows:

$$ {\displaystyle \begin{array}{c}{d}_B=\sqrt{\frac{\sum_{k=0}^M{\left({p}_k^B\right)}^2}{M-1}};{d}_A=\sqrt{\frac{\sum_{k=0}^M{\left({p}_k^A\right)}^2}{M-1}}\\ {}{P}_{\mathrm{Reduction}}\ \left(\%\right)=\frac{d_B-{d}_A}{d_B}\times 100\%\\ {}{C}_{\mathrm{Roll}}=\sum \limits_{k=0}^M{\phi}_k^2;{C}_{\mathrm{Rudder}}={\sum}_{k=0}^M{\delta}_k^2\end{array}} $$
(37)

where dB is the standard deviation of roll rate pB without roll damping control and dA is the standard deviation of roll rate pA with roll damping control (Fossen 1994), M is the amount of the total iterations, and ϕk and δk are the roll angle and rudder angle in the kth iteration respectively.

5 Simulation Studies

In order to validate the efficiency of the rudder roll stabilization system based on DEKF-trained RBFNN, the four-DOF nonlinear mathematical model of a full-scale container ship is adopted in this study. The main characteristics of the relevant ship are outlined in Table 1. More details can be seen in Fossen (1994). The input and output of the plant (i.e. the surface vessel being controlled) are the rudder angle and angular motions, while the system states are the attitudes of the vessel.

Table 1 The main characteristics of the full-scale container ship

5.1 Simulation Setting and Prior Training of the DEKF RBFNN–Based Controller

The Bogacki-Shampine algorithm was employed to solve the ODE of the ship’s mathematical model with wave disturbances. The ship’s propeller rotation rate was set at 80 r/min. From the perspective of engineering practice, the slew rate of rudder angle (i.e. motions of control actuator) was constrained within ±5 (°) /s regarding the characteristics of normal servo motors as well as the requirement of IMO (Oda et al. 2008), while the angle was correspondingly constrained to δmax =  ± 20° to avoid sharp fluctuations of rudder actions.

Two scenarios were carried out to investigate the performance of the proposed stabilization system: in the first scenario, the ship is requested to sail from the beginning waypoint (0, 0) to the next waypoint (3600, 2160), and then heading to the following point at (5920, 6120) before advancing to the waypoint (5920, 9600) (defined as trajectory 1) with initial state at [η0, v0] = [0 m, 0 m, 0°, 30°, 8 m/s, 0 m/s, 0 (°) /s, 0 (°) /s]T; the second scenario (trajectory 2) is designed to make the ship sailing from the initial waypoint (0, 0) to (4000, 0), and then to (8000, 1200) before arriving at the waypoint (12 000, 1200) with initial state at [η0, v0] = [0 m, 0 m, 0°, 0°, 8 m/s, 0 m/s, 0 (°) /s, 0 (°) /s]T. The characteristic of the environmental disturbance was selected as sea state 5.

The parameters of the proposed DEKF training method for approximating the weights and centers of the RBFNN are tuned by the experience and trial-and-error method as ε = 0.5, λ = 4,\( {\boldsymbol{R}}_{\psi}^w={\boldsymbol{R}}_{\psi}^c={\boldsymbol{I}}_3 \), \( {\boldsymbol{R}}_{\phi}^w={\boldsymbol{R}}_{\phi}^c={\boldsymbol{I}}_3 \), \( {\boldsymbol{Q}}_{\psi}^w={\boldsymbol{Q}}_{\psi}^c=0.01\times {\boldsymbol{I}}_9 \), and \( {\boldsymbol{Q}}_{\phi}^w={\boldsymbol{Q}}_{\phi}^c=0.00001\times {\boldsymbol{I}}_9 \).

In order to save the learning time for the DEKF RBFNN controller, the prior training was conducted. Nine neuron nodes were involved in the neural network scheme. In the present study, the prior case for the ship advancing based on the desired path is carried out to train the ship to learn how to sail on the desired trajectory while reducing roll motions. The ‘convergent weight matrices’ obtained after 100 s can then be selected as the initial matrices for the RBFNN–based roll stabilizer to handle the motions of the ship \( {\hat{\boldsymbol{w}}}_{\psi 0}={\left[0.09,\kern0.5em -0.001,\kern0.5em -0.002,-0.003,\kern0.5em -0.004,\kern0.5em -0.003,\kern0.5em -0.0022,\kern0.5em -0.001,\kern0.5em -0.0003\ \right]}^{\mathrm{T}} \);\( {\hat{\boldsymbol{w}}}_{\phi 0}={\left[-0.17,\kern0.5em -0.44,\kern0.5em -0.77,\kern0.5em -0.94,\kern0.5em -0.75,\kern0.5em -0.34,\kern0.5em -0.03,\kern0.5em 0.07,\kern0.5em 0.05\right]}^{\mathrm{T}} \).

5.2 Results and Discussion

In order to highlight the capability of the designed DEKF RBFNN–based rudder roll stabilization system, the BP RBFNN–based and proportional derivative (PD)–based rudder roll stabilization system (Wang et al. 2015) were employed in this study for comparison. Two different pre-set trajectories are proposed to investigate the roll damping and path following performance of the proposed control system.

For the first scenario, as shown in Fig. 4, without the roll damping controller, the DEKF RBFNN–based yaw motion controller is capable of maintaining the ship sailing on the desired path steadily. However, the roll angle of the ship is huge due to environmental disturbances. When the roll damping control loop is activated, the tasks of path following and roll damping are achieved by using the rudder roll stabilization system synchronously. The figures also show that the DEKF RBFNN–based control system uses mild rudder actions, but it performs better in roll damping than that of the BP RBFNN- and PD-based control systems. It can be explained that the increase of yaw angle deviations is the price paid for roll reduction and added to the complicated coupling system. Figures 5 and 6 illustrate the trajectories and the corresponding deviations between desired trajectories and actual trajectories for four types of control systems (i.e. without roll damping control, with DEKF RBFNN roll damping control, with BP RBFNN roll damping control, and with PD roll damping control). It is shown that the ship is capable of sailing on the desired trajectory without significant tracking errors when using the control of the proposed rudder roll stabilization system.

Fig. 4
figure 4

The roll motions and rudder actions of the ship sailing on trajectory

Fig. 5
figure 5

The trajectories of the vessel when following trajectory 1

Fig. 6
figure 6

The path following error of the vessel advancing according to trajectory 1

The values of relevant evaluation items are summarized in Table 2. The roll damping percentages of the proposed rudder roll stabilization systems are calculated as 59.83%, 52.70%, and 41.08%, respectively. It is found that comparing with the BP RBFNN- and PD-based stabilizer, the DEKF RBFNN stabilizer is capable of providing effective rudder actions with better roll damping performance and path following accuracy.

Table 2 The values of cost function and relevant roll damping percentage (sailing on trajectory 1)

A similar conclusion can be drawn from the second scenario with different wave encounter angles. In this process, the dynamic performances of the ship turning to both the port side and starboard side are considered. Figures 7, 8, and 9 show that the DEKF RBFNN–based rudder roll stabilization system is promising in making the vessel to track the desired path and to reduce the roll motion even subject to the waves on the beam sea. The values of relevant evaluation items are shown in Table 3. It can be observed that the roll damping rate of the DEKF RBFNN–based system (i.e. 58.17%) is larger than that of the BP RBFNN system at 46.35% and PD system at 38.30%. Thus, the roll damping and path following, as well as the efficiency, of rudder actions are validated.

Fig. 7
figure 7

The path following error of the vessel advancing according to trajectory 2

Fig. 8
figure 8

The trajectories of the vessel when following trajectory 2

Fig. 9
figure 9

The path following error of the vessel sailing based on trajectory 2

Table 3 The values of cost function and relevant roll damping percentage (sailing on trajectory 2)

In practice, the functionality of the autopilot is achieved by the embedded computer. Regarding the computational expenses, the active control running time per period can be used to evaluate the computational complexity. That is the elapsed time when running the control program at every interval. Table 4 outlines the computational expenses of the DEKF RBFNN roll stabilizer and the BP RBFNN roll stabilizer (Wang et al. 2015). It is indicated that the computing load of the proposed DEKF RBFNN–based control system approximately equals to that of the BP-based control system, since DEKF only needs to perform the calculation of Jacobian matrix in one integration. Running time for each iteration (s) is shown in Table 4.

Table 4 The comparison of computational burdens amongst DEKF-, UKF-, and PB-trained RBFNN controllers

Therefore, the competent control performance, the low computational overhead, and the efficiency of using actuators make the EKF RBFNN a good choice to design autopilot and rudder roll stabilizer. From the investigations concerning different trajectories and encounter angles of waves, the EKF RBFNN–based rudder roll stabilization system was demonstrated to be effective in maintaining the ship advancing on the desired path and compensating the huge roll motions at the same time.

6 Conclusion

In this paper, the DEKF RBFNN algorithm has been proposed to develop the rudder roll stabilization system, which contains the yaw motion controller and the roll reduction controller implemented in parallel. The rudder roll stabilization system incorporated with the nonlinear mathematical model had been used to verify the control performance of the ship by only using the rudder as the steering actuator. It is found that the designed control system is feasible to maintain the surface vessel advancing on the pre-set path while reducing the roll motion at the same time. Comparing with the BP RBFNN–based control system, the DEKF RBFNN–based system has faster converge speed and better robustness against the external disturbances. It is worth noting that the designed DEKF RBFNN–based control algorithm has promising control performance but equivalent computational cost as it avoids some complicated integration calculations. Therefore, the DEKF RBFNN control algorithm is competent to design the rudder roll stabilization system with qualified capability in reducing roll motions and following path.

Further investigations will focus on the validation of the DEKF RBFNN–based rudder roll stabilization autopilot through experimental approaches by using the free running model scaled vessel ‘Hoorn’ (Wang et al. 2017b). To further improve the accuracy of the estimations and the robustness in coping with the environmental variables, other competent training algorithms are going to be investigated to design the neural network–based control system for vessels.