1 Introduction

In recent years, mobile robots have been a research hotspot in the field of robotics due to their flexibility and controllability [1,2,3,4,5]. Visual servoing control, which integrates feedback control with visual information, is widely used to control the orientation and position of mobile robots. By exploiting rich visual information, the intelligence of mobile robots can be improved and their range of applications broadened [6,7,8,9,10]. Therefore, visual servoing control for mobile robots has received widespread attention from many scholars.

Image-based visual servoing (IBVS) controls the robot directly in the two-dimensional image plane; its concise structure does not require 3D pose estimation in Cartesian coordinates and is insensitive to camera calibration errors. Owing to these advantages, many control approaches have been proposed, such as sliding mode control (SMC) [11, 12], PID control [13] and adaptive control methods [14,15,16]. However, practical visual servoing robot systems are often subject to constraints on velocity, position and field of view (FOV). Due to its ability to handle visibility constraints, model predictive control (MPC) has received widespread attention [17,18,19,20]. In [17] and [18], robust tube-based predictive control methods are proposed that compensate for the effects of bounded uncertainties while constraining state variables such as the servoing error, velocity and acceleration to stabilize the nominal visual servoing system. In [19], an MPC controller with constraints on roll and pitch angles is proposed for IBVS of a quadrotor to guarantee the visibility of feature points. In [21], a model predictive path integral (MPPI) control framework is proposed based on path integral (PI) control theory [22], which does not require the calculation of gradients or second-order approximations. In [23, 24], a real-time inversion-free control method based on MPPI is proposed for IBVS, 3D point visual servoing (3DVS) and position-based visual servoing (PBVS), and it has been validated on a 6-DoF Cartesian (Gantry) robot with an eye-in-hand camera. However, because these methods consider only the kinematic model and ignore the dynamic characteristics of the actuators, their performance may degrade. For example, it is assumed that the actual velocity can immediately track the desired velocity output by the servo controller, which is impractical. Thus, it is necessary to establish a velocity deviation model and incorporate it into the design of the servo controller to accurately reflect the difference between the desired velocity and the actual velocity.

In general, there are two main approaches to the modeling problem. One is the mechanism (first-principles) modeling method. For instance, a linear model is established in [25] for a visual servoing task of tendon-driven continuum robots by using depth camera data. In [26], based on the MPC and IBVS framework, a full model of the continuum robot is derived to improve robustness against system uncertainties, perception noise and modeling error. However, it is difficult to write accurate mathematical expressions for objects with complex mechanisms. The other approach is black-box modeling, which uses input–output data to construct a sufficiently accurate model for prediction, such as a neural network model [27, 28] or a fuzzy model [29]. However, with these methods it is difficult to evaluate the quality of the model on-line. The Gaussian process (GP) [30] is a nonparametric modeling approach based on a Bayesian framework [31]. Because it requires limited prior knowledge [32] and directly provides model uncertainty [33], it has been used to model various systems, such as race cars [34], manipulators [35] and quadcopters [36]. GP models are usually incorporated into the MPC framework, which is termed GP-based model predictive control (GPMPC). Since the velocity model of a vehicle can be built by a GP, a GPMPC method is proposed in [37] for the path tracking problem of vehicles, and its performance is better than that of MPC. In [38], a GPMPC method is employed in a time-varying system to deal with prediction uncertainty, where the uncertainty is converted into constraints for safe operation. However, the performance indices used in [37] and [38] are deterministic even though the GPR-based model is stochastic, which does not fully utilize the model information. In [32], the performance index of GPMPC is set as the expected value of the accumulated quadratic stage cost. This kind of performance index leads to cautious control, where state regions with less prediction uncertainty are preferred. In [39], GPMPC incorporating a risk-sensitive cost is presented. Different from the cautious control proposed in [32], the controller is encouraged to explore unknown state regions at a reasonable level to learn a better model, which improves the control performance. The stability of GPMPC is proved in [40].

Motivated by the above discussion, a GPMPC method is proposed for a wheeled mobile robot (WMR) to deal with the IBVS task under unknown actuator dynamics. Moreover, orientation angle control is also considered to track the target pose of the WMR. The main contributions are as follows:

  1) A GP-enhanced model instead of a pure GP model is learned on-line by combining a nominal model with a GP model. The nominal model maintains control performance in state regions far from the training data set, while the GP model captures the actuators' dynamic properties that cause differences between the nominal model and the real model.

  2) To guarantee the visibility of the feature point, chance constraints on the feature point's image coordinates are proposed and incorporated into the stochastic GPMPC formulation.

  3) To solve the stochastic GPMPC problem, an augmented deterministic model (ADM) that represents the uncertainty propagation of the state is proposed to transform the stochastic MPC (SMPC) formulation into a deterministic model predictive control (DMPC) formulation, which is solved by the iterative linear quadratic regulator (iLQR).

  4) Since a static servo error exists when using iLQR with the common quadratic terminal cost, a Lorentzian \(\rho \)-function is introduced into the terminal cost to replace it.

2 Model description

2.1 GP-enhanced model

The considered visual servoing system for a WMR is shown in Fig. 1, where \({O_w}{X_w}{Y_w}{Z_w}\), \({O_c}{X_c}{Y_c}{Z_c}\) and \({O_r}{X_r}{Y_r}{Z_r}\) represent the world, camera and robot coordinate frames, respectively. \({\left[ {\begin{array}{*{20}{c}}{{v_r}}&{{w_r}}\end{array}} \right] ^T}\) denotes the linear and angular velocities of the WMR, and \({\left[ {\begin{array}{*{20}{c}}{{v_c}}&{{w_c}}\end{array}} \right] ^T}\) denotes the linear and angular velocities of the camera. Since the rotation matrix between the robot frame and the camera frame is set to the identity matrix, we obtain

$$\begin{aligned} {\left[ {\begin{array}{*{20}{c}} {{v_r}}&{{w_r}} \end{array}} \right] ^T} = {\left[ {\begin{array}{*{20}{c}} {{v_c}}&{{w_c}} \end{array}} \right] ^T} \end{aligned}$$
(1)
Fig. 1
figure 1

Visual servoing system for a WMR

The camera coordinates and image coordinates are shown in Fig. 2 where \(\left( {{x_i},{y_i}} \right) \) and \(\left( {{x_c},{y_c},{z_c}} \right) \) are the positions of the feature point in the image coordinates and camera coordinates, respectively. The relationship between these two coordinates can be described by the camera projection model as follows

$$\begin{aligned} \begin{array}{l} {x_i} = {{{x_c}} / {{z_c}}}\\ {y_i} = {{{y_c}} / {{z_c}}} \end{array} \end{aligned}$$
(2)
$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {{{\dot{x}}_i}}\\ {{{\dot{y}}_i}} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {{{{x_i}} / {{z_c}}}}&{}{ - \left( {1 + x_i^2} \right) }\\ {{{{y_i}} / {{z_c}}}}&{}{ - {x_i}{y_i}} \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {{v_c}}\\ {{w_c}} \end{array}} \right] \end{aligned}$$
(3)

In this paper, \({y_c}\) is a constant since the heights of the camera and of the feature point are fixed. Thus, (3) can be rewritten as (4) to eliminate the depth information

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {{{\dot{x}}_i}}\\ {{{\dot{y}}_i}} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {{{{x_i}{y_i}} / {{y_c}}}}&{}{ - \left( {1 + x_i^2} \right) }\\ {{{y_i^2} / {{y_c}}}}&{}{ - {x_i}{y_i}} \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {{v_c}}\\ {{w_c}} \end{array}} \right] \end{aligned}$$
(4)
Fig. 2
figure 2

Projection relations of the image frame and the camera frame

By substituting (1) into (4), we obtain

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {{{\dot{x}}_i}}\\ {{{\dot{y}}_i}} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {{{{x_i}{y_i}} / {{y_c}}}}&{}{ - \left( {1 + x_i^2} \right) }\\ {{{y_i^2} / {{y_c}}}}&{}{ - {x_i}{y_i}} \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {{v_r}}\\ {{w_r}} \end{array}} \right] \end{aligned}$$
(5)

Moreover, the orientation angle dynamics \(\dot{\theta }= {w_r}\) are also appended to (5) as follows

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {{{\dot{x}}_i}}\\ {{{\dot{y}}_i}}\\ {\dot{\theta }} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {{{{x_i}{y_i}} / {{y_c}}}}&{}{ - \left( {1 + x_i^2} \right) }\\ {{{y_i^2} / {{y_c}}}}&{}{ - {x_i}{y_i}}\\ 0&{}1 \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {{v_r}}\\ {{w_r}} \end{array}} \right] \end{aligned}$$
(6)

To apply the MPC method, (6) is discretized as follows

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {{x_{i,k + 1}}}\\ {{y_{i,k + 1}}}\\ {{\theta _{k + 1}}} \end{array}} \right]= & {} \left[ {\begin{array}{*{20}{c}} {{x_{i,k}}}\\ {{y_{i,k}}}\\ {{\theta _k}} \end{array}} \right] + T\left[ {\begin{array}{*{20}{c}} {{{{x_{i,k}}{y_{i,k}}} / {{y_c}}}}&{}{ - \left( {1 + x_{i,k}^2} \right) }\\ {{{y_{i,k}^2} / {{y_c}}}}&{}{ - {x_{i,k}}{y_{i,k}}}\\ 0&{}1 \end{array}} \right] \nonumber \\{} & {} \quad \times \left[ {\begin{array}{*{20}{c}} {{v_{r,k}}}\\ {{w_{r,k}}} \end{array}} \right] \end{aligned}$$
(7)

where T and k are the sampling time and the time index, respectively.

However, in practice the desired velocity \({\left[ {\begin{array}{*{20}{c}}{{v_{r,d}}}&{{w_{r,d}}}\end{array}} \right] ^T}\) is the output of the visual servo controller, while the real velocity \({\left[ {\begin{array}{*{20}{c}}{{v_r}}&{{w_r}}\end{array}} \right] ^T}\) is not directly controllable. Thus, the velocity model that represents the relationship between \({\left[ {\begin{array}{*{20}{c}}{{v_{r,d}}}&{{w_{r,d}}}\end{array}} \right] ^T}\) and \({\left[ {\begin{array}{*{20}{c}}{{v_r}}&{{w_r}}\end{array}} \right] ^T}\) should also be considered in order to design an efficient visual servo controller. Generally, the velocity model is

$$\begin{aligned} {\left[ {\begin{array}{*{20}{c}} {{v_{r,k}}}&{{w_{r,k}}} \end{array}} \right] ^T} = f\left( {{v_{r,k - 1}},{w_{r,k - 1}},{v_{r,d,k}},{w_{r,d,k}}} \right) \end{aligned}$$
(8)

In this paper, (8) is written as follows

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {{v_{r,k}}}\\ {{w_{r,k}}} \end{array}} \right] \!\! =\!\! \left[ {\begin{array}{*{20}{c}} {{v_{r,d,k}}}\\ {{w_{r,d,k}}} \end{array}} \right] \!\! +\!\! g\left( {{v_{r,k - 1}},{w_{r,k - 1}},{v_{r,d,k}},{w_{r,d,k}}} \right) \nonumber \\ \end{aligned}$$
(9)

By combining (9) and (7), we have the following augmented model

$$\begin{aligned}{} & {} {\left[ {\begin{array}{*{20}{l}}{{x_{i,k + 1}}}&{{y_{i,k + 1}}}&{{\theta _{k + 1}}}&{{v_{r,k}}}&{{w_{r,k}}} \end{array}} \right] ^T} \nonumber \\{} & {} \quad = m\left( {{x_{i,k}},{y_{i,k}},{\theta _k},{v_{r,d,k}},{w_{r,d,k}}} \right) \nonumber \\{} & {} \qquad + h\left( {{x_{i,k}},{y_{i,k}},{\theta _k},{v_{r,k - 1}},{w_{r,k - 1}},{v_{r,d,k}},{w_{r,d,k}}} \right) + {\varepsilon _k} \nonumber \\ \end{aligned}$$
(10)

where

$$\begin{aligned} \begin{array}{c} m\left( {{x_{i,k}},{y_{i,k}},{\theta _k},{v_{r,d,k}},{w_{r,d,k}}} \right) = \left[ {\begin{array}{*{20}{l}} {{x_{i,k}}}\\ {{y_{i,k}}}\\ {{\theta _k}}\\ 0\\ 0 \end{array}} \right] \\ + T\left[ {\begin{array}{*{20}{c}} {{{{x_{i,k}}{y_{i,k}}} / {{y_c}}}}&{}{ - \left( {1 + x_{i,k}^2} \right) }\\ {{{y_{i,k}^2} / {{y_c}}}}&{}{ - {x_{i,k}}{y_{i,k}}}\\ 0&{}1\\ {{1 / T}}&{}0\\ 0&{}{{1 / T}} \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {{v_{r,d,k}}}\\ {{w_{r,d,k}}} \end{array}} \right] \end{array} \end{aligned}$$
(11)

is the nominal model, which is directly used to design the servo controller in most existing works, and

$$\begin{aligned} \begin{array}{c} h\left( {{x_{i,k}},{y_{i,k}},{\theta _k},{v_{r,k}},{w_{r,k}},{v_{r,d,k}},{w_{r,d,k}}} \right) \\ = T\left[ {\begin{array}{*{20}{c}} {{{{x_{i,k}}{y_{i,k}}} / {{y_c}}}}&{}{ - \left( {1 + x_{i,k}^2} \right) }\\ {{{y_{i,k}^2} / {{y_c}}}}&{}{ - {x_{i,k}}{y_{i,k}}}\\ 0&{}1\\ {{1 / T}}&{}0\\ 0&{}{{1 / T}} \end{array}} \right] \\ g\left( {{v_{r,k - 1}},{w_{r,k - 1}},{v_{r,d,k}},{w_{r,d,k}}} \right) \end{array} \end{aligned}$$
(12)

is an additive model which contains the information about the dynamic properties of the WMR's actuators. Moreover, we also consider the i.i.d. process noise \({\varepsilon _k} \sim N\left( {0,{\varSigma _\varepsilon }} \right) \) with \({\varSigma _\varepsilon } = diag\left( {{{\left( {\sigma _\varepsilon ^1} \right) }^2}, \ldots ,{{\left( {\sigma _\varepsilon ^5} \right) }^2}} \right) \). The GP method is employed in this paper to approximate the function h, and thus (10) is referred to as the GP-enhanced model. For convenience, we define the augmented state at time index \(k+1\) as \({o_{k+1}} = {\left[ {\begin{array}{*{20}{l}} {{x_{i,k + 1}}}&{{y_{i,k + 1}}}&{{\theta _{k + 1}}}&{{v_{r,k}}}&{{w_{r,k}}} \end{array}} \right] ^T}\), i.e., the left-hand side of (10), and the input at time index k as \({u_k} = {\left[ {\begin{array}{*{20}{c}} {{v_{r,d,k}}}&{{w_{r,d,k}}} \end{array}} \right] ^T}\).
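
For illustration, a minimal Python sketch of the nominal augmented model m(\(\cdot \)) in (11) is given below; the function and parameter names (and the numerical values of \(y_c\) and T) are illustrative placeholders rather than the authors' implementation, and the learned additive term h(\(\cdot \)) would be supplied by the GP described in the next subsection.

```python
# A minimal sketch of the nominal augmented model m(.) in (11), assuming NumPy only.
import numpy as np

def nominal_model(o, u_d, y_c=0.5, T=0.05):
    """One-step nominal prediction m(o_k, u_k) of the augmented state.

    o   : current state [x_i, y_i, theta, v_r, w_r]
    u_d : desired velocity input [v_rd, w_rd]
    y_c : (assumed) constant camera-frame height used in place of the depth
    T   : sampling time (assumed value)
    """
    x_i, y_i, theta = o[0], o[1], o[2]
    # Matrix of (11): image-feature rows, orientation row, and two rows that
    # pass the desired velocities through as the nominal v_r, w_r.
    G = np.array([
        [x_i * y_i / y_c, -(1.0 + x_i**2)],
        [y_i**2 / y_c,    -x_i * y_i],
        [0.0,              1.0],
        [1.0 / T,          0.0],
        [0.0,              1.0 / T],
    ])
    base = np.array([x_i, y_i, theta, 0.0, 0.0])
    return base + T * G @ u_d
```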

2.2 GP modeling

The Gaussian process regression (GPR) method uses the previously collected data set to describe the additive model h. More concretely, five independent GP models are built, where each model corresponds to one output dimension of h.

At time index k, the state \({o_k}\) has been observed and the value \(h\left( {{o_k},{u_k}} \right) \) is to be inferred. The labels of the data set for the \({i^{th}}\) model are set as follows

$$\begin{aligned} \overrightarrow{y} _{k - 1}^i \!\!= \!\!{\left[ {\begin{array}{*{20}{l}} {{{\left( {{o_2} - {m_1}} \right) }^i}}&\ldots&{{{\left( {{o_{j + 1}} - {m_j}} \right) }^i}}&\ldots&{{{\left( {{o_k} - {m_{k - 1}}} \right) }^i}} \end{array}} \right] ^T}\nonumber \\ \end{aligned}$$
(13)

where \({o_j}\) and \({m_j}\) denote the value of the state and output of the nominal model at time index j, respectively. \({\left( {{o_{j + 1}} - {m_j}} \right) ^i}\) represents the \({i^{th}}\) dimension of the \(\left( {{o_{j + 1}} - {m_j}} \right) \). Features for each GP model are the same and defined as

$$\begin{aligned} {Z_{k - 1}} = {\left[ {\begin{array}{*{20}{l}} {{z_1}}&\ldots&{{z_j}}&\ldots&{{z_{k - 1}}} \end{array}} \right] ^T} \end{aligned}$$
(14)

where \({z_j} = {\left[ {\begin{array}{*{20}{c}} {o_j^T}&{u_j^T} \end{array}} \right] ^T}\). The relationship between labels and features is as follows

$$\begin{aligned} y_j^i = {\left( {{o_{j + 1}} - {m_j}} \right) ^i} = {h^i}\left( {{z_j}} \right) + \varepsilon _j^i \end{aligned}$$
(15)

where \(\varepsilon _j^i\) represents \({i^{th}}\) dimension of the \({\varepsilon _j}\), and \(\varepsilon _j^i \sim N\left( {0,{{\left( {\sigma _\varepsilon ^i} \right) }^2}} \right) \). By using the GPR method, \({{\buildrel {\rightharpoonup }\over {h}}^i}= {\left[ {\begin{array}{*{20}{c}} {{h^i}\left( {{z_1}} \right) }&\ldots&{{h^i}\left( {{z_{k - 1}}} \right) } \end{array}} \right] ^T}\) is assumed to satisfy a multivariate Gaussian distribution as

$$\begin{aligned} {{\buildrel {\rightharpoonup }\over {h}}^i}\sim N\left( {\left[ {\begin{array}{*{20}{c}} {{\beta ^i}\left( {{z_1}} \right) }\\ \vdots \\ {{\beta ^i}\left( {{z_{k - 1}}} \right) } \end{array}} \right] , \left[ {\begin{array}{*{20}{c}} {{\varphi ^i}\left( {{z_1},{z_1}} \right) }&{} \ldots &{}{{\varphi ^i}\left( {{z_1},{z_{k - 1}}} \right) }\\ \vdots &{} \ddots &{} \vdots \\ {{\varphi ^i}\left( {{z_{k - 1}},{z_1}} \right) }&{} \ldots &{}{{\varphi ^i}\left( {{z_{k - 1}},{z_{k - 1}}} \right) } \end{array}} \right] } \right) \nonumber \\ \end{aligned}$$
(16)

where \({\varphi ^i}\left( { \cdot , \cdot } \right) \) is the kernel function and \({\beta ^i}\left( \cdot \right) \) is the mean function of the GPR method. The mean function can be arbitrarily set; for convenience, we set \({\beta ^i}\left( \cdot \right) = 0\). The kernel function should be designed such that \(\varPhi _{k - 1}^i = \left[ {\begin{array}{*{20}{c}} {{\varphi ^i}\left( {{z_1},{z_1}} \right) }&{} \ldots &{}{{\varphi ^i}\left( {{z_1},{z_{k - 1}}} \right) }\\ \vdots &{} \ddots &{} \vdots \\ {{\varphi ^i}\left( {{z_{k - 1}},{z_1}} \right) }&{} \ldots &{}{{\varphi ^i}\left( {{z_{k - 1}},{z_{k - 1}}} \right) } \end{array}} \right] \) is positive semidefinite or positive definite. Here, the kernel function is set as

$$\begin{aligned} {\varphi ^i}\left( {{z_a},{z_b}} \right) = {\left( {\sigma _s^i} \right) ^2}{e^{ - \frac{1}{2}{{\left( {{z_a} - {z_b}} \right) }^T}{\varLambda ^i}\left( {{z_a} - {z_b}} \right) }} \end{aligned}$$
(17)

where \({\varLambda ^i}\) is a diagonal matrix and \(\left( {{{\left( {\sigma _s^i} \right) }^2},{\varLambda ^i}} \right) \) are the hyperparameters of the GPR [30], which can be determined by maximizing the log-likelihood function given by

$$\begin{aligned}{} & {} \log p\left( {\overrightarrow{y} _{k - 1}^i|{Z_{k - 1}},{{\left( {\sigma _s^i} \right) }^2},{\varLambda ^i}} \right) \nonumber \\{} & {} \quad = - \frac{1}{2}{\left( {\overrightarrow{y} _{k - 1}^i} \right) ^T}{\left( {\varPhi _{k - 1}^i + {{\left( {\sigma _\varepsilon ^i} \right) }^2}I} \right) ^{ - 1}}\overrightarrow{y} _{k - 1}^i\nonumber \\{} & {} \qquad - \frac{1}{2}\log \left| {\varPhi _{k - 1}^i + {{\left( {\sigma _\varepsilon ^i} \right) }^2}I} \right| - \frac{k-1}{2}\log \left( {2\pi } \right) \nonumber \\ \end{aligned}$$
(18)
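
As an illustration of (17) and (18), the following Python sketch computes the ARD squared-exponential kernel and the negative log marginal likelihood used to fit the hyperparameters; all names are illustrative placeholders, and in practice a GP library can perform this step.

```python
# A minimal sketch of the kernel (17) and the (negative) log marginal
# likelihood (18) for one output dimension; hyperparameters could then be
# fitted with any numerical optimizer.
import numpy as np

def kernel(Za, Zb, sigma_s2, lam_diag):
    """phi^i(z_a, z_b) for all pairs of rows in Za (n_a x d) and Zb (n_b x d)."""
    diff = Za[:, None, :] - Zb[None, :, :]             # (n_a, n_b, d)
    quad = np.einsum('abd,d,abd->ab', diff, lam_diag, diff)
    return sigma_s2 * np.exp(-0.5 * quad)

def neg_log_marginal_likelihood(Z, y, sigma_s2, lam_diag, sigma_eps2):
    """Negative of (18); minimizing this fits (sigma_s^2, Lambda)."""
    n = Z.shape[0]
    K = kernel(Z, Z, sigma_s2, lam_diag) + sigma_eps2 * np.eye(n)
    L = np.linalg.cholesky(K)                          # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (0.5 * y @ alpha
            + np.sum(np.log(np.diag(L)))               # = 0.5 * log|K|
            + 0.5 * n * np.log(2 * np.pi))
```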

After determining the hyperparameters, the joint distribution of \({h^i}\left( {{z_k}} \right) \) and \(\overrightarrow{y} _{k - 1}^i\) is shown as follows

$$\begin{aligned} \left[ {\begin{array}{*{20}{c}} {\overrightarrow{y} _{k - 1}^i}\\ {{h^i}\left( {{z_k}} \right) } \end{array}} \right] \left| {{Z_{k - 1}},{{\left( {\sigma _s^i} \right) }^2},{\varLambda ^i}} \right. \sim \nonumber \\ N\left( {\overrightarrow{0},\left[ {\begin{array}{*{20}{c}} {\varPhi _{k - 1}^i + {{\left( {\sigma _\varepsilon ^i} \right) }^2}I}&{}{{{\overrightarrow{\varphi }}^i}}\\ {{{\left( {{{\overrightarrow{\varphi }}^i}} \right) }^T}}&{}{{\varphi ^i}\left( {{z_k},{z_k}} \right) } \end{array}} \right] } \right) \end{aligned}$$
(19)

where \({\overrightarrow{\varphi }^i} = {\left[ {\begin{array}{*{20}{c}} {{\varphi ^i}\left( {{z_1},{z_k}} \right) }&\ldots&{{\varphi ^i}\left( {{z_{k - 1}},{z_k}} \right) } \end{array}} \right] ^T}\). Applying the conditional Gaussian rules [30], the posterior distribution of \({h^i}\left( {{z_k}} \right) \) can be obtained as

$$\begin{aligned}{} & {} {h^i}\left( {{z_k}} \right) \left| {{Z_{k - 1}},{{\left( {\sigma _s^i} \right) }^2},{\varLambda ^i}} \right. ,\overrightarrow{y} _{k - 1}^i\sim \nonumber \\{} & {} \quad N\left( {\mu _{h,k}^i,{{\left( {\sigma _{h,k}^i} \right) }^2}} \right) \end{aligned}$$
(20)

where

$$\begin{aligned} \mu _{h,k}^i= & {} {\left( {{{\overrightarrow{\varphi }}^i}} \right) ^T}{\left( {\varPhi _{k - 1}^i + {{\left( {\sigma _\varepsilon ^i} \right) }^2}I} \right) ^{ - 1}}\overrightarrow{y} _{k - 1}^i\end{aligned}$$
(21)
$$\begin{aligned} {\left( {\sigma _{h,k}^i} \right) ^2}= & {} {\varphi ^i}\left( {{z_k},{z_k}} \right) \nonumber \\{} & {} - {\left( {{{\overrightarrow{\varphi }}^i}} \right) ^T}{\left( {\varPhi _{k - 1}^i + {{\left( {\sigma _\varepsilon ^i} \right) }^2}I} \right) ^{ - 1}}{\overrightarrow{\varphi }^i} \end{aligned}$$
(22)
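
A minimal Python sketch of the posterior prediction (21) and (22) for one output dimension is given below; it assumes a `kernel` callable with the hyperparameters already fixed (e.g., a partial application of the kernel in the previous sketch), and all names are illustrative.

```python
# A minimal sketch of the GP posterior mean (21) and variance (22).
import numpy as np

def gp_posterior(z_query, Z_train, y_train, kernel, sigma_eps2):
    """Return (mu, var) of h^i(z_query) given training features/labels."""
    K = kernel(Z_train, Z_train) + sigma_eps2 * np.eye(len(Z_train))
    k_vec = kernel(Z_train, z_query[None, :]).ravel()   # phi^i(z_j, z_k), j = 1..k-1
    K_inv_y = np.linalg.solve(K, y_train)
    K_inv_k = np.linalg.solve(K, k_vec)
    mu = k_vec @ K_inv_y                                # (21)
    var = kernel(z_query[None, :], z_query[None, :])[0, 0] - k_vec @ K_inv_k  # (22)
    return mu, var
```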

Then, the posterior distribution of \(h\left( {{z_k}} \right) \) which is expected to be inferred can be computed as follows [30]

$$\begin{aligned}{} & {} h\left( {{z_k}} \right) \left| {{Z_{k - 1}},{{\left( {\sigma _s^i} \right) }^2},{\varLambda ^i}} \right. ,\overrightarrow{y} _{k - 1}^i{,_{i = 1, \ldots ,5}}\sim \nonumber \\{} & {} N\left( {{\mu _{h,k}},{\varSigma _{h,k}}} \right) \end{aligned}$$
(23)

where

$$\begin{aligned}{} & {} {\mu _{h,k}} = {\left[ {\begin{array}{*{20}{l}} {\mu _{h,k}^1}&{\mu _{h,k}^2}&{\mu _{h,k}^3}&{\mu _{h,k}^4}&{\mu _{h,k}^5} \end{array}} \right] ^T}, {\varSigma _{h,k}} \\{} & {} = diag\left( {{{\left( {\sigma _{h,k}^1} \right) }^2},{{\left( {\sigma _{h,k}^2} \right) }^2},{{\left( {\sigma _{h,k}^3} \right) }^2},{{\left( {\sigma _{h,k}^4} \right) }^2},{{\left( {\sigma _{h,k}^5} \right) }^2}} \right) . \end{aligned}$$

Based on the posterior distribution of \(h\left( {{z_k}} \right) \), the distribution of \({o_{k + 1}}\) can be easily derived as

$$\begin{aligned} {o_{k + 1}}\sim N\left( {{\mu _{o,k + 1}},{\varSigma _{o,k + 1}}} \right) \end{aligned}$$
(24)

where

$$\begin{aligned}{} & {} {\mu _{o,k + 1}} = {m_k} + {\mu _{h,k}} \end{aligned}$$
(25)
$$\begin{aligned}{} & {} {\varSigma _{o,k + 1}} = {\varSigma _{h,k}} + {\varSigma _\varepsilon } \end{aligned}$$
(26)

3 MPC controller design

With the GP-enhanced model, the resulting MPC problem is nonlinear and stochastic, and is given as follows

$$\begin{aligned} \mathop {\min }\limits _{{U_k}} \mathrm{{ }}J\mathrm{{ = }}E\left[ {\sum \limits _{t = k}^{k + M - 1} {l\left( {{o_t},{u_t}} \right) + {l_f}\left( {{o_{k + M}}} \right) } } \right] \end{aligned}$$
(27a)
$$\begin{aligned} s.t.\quad \forall i = 1, \ldots ,5,\;\forall j = 1,2\;\mathrm{and}\;\forall t = k, \ldots ,k + M - 1: \end{aligned}$$
$$\begin{aligned} p\left( {{o_{t + 1}}|{o_k},{u_k}} \right) \sim N\left( {{\mu _{o,t + 1}},{\varSigma _{o,t + 1}}} \right) \end{aligned}$$
(27b)
$$\begin{aligned} u_t^j \ge u_{\min }^j \end{aligned}$$
(27c)
$$\begin{aligned} u_t^j \le u_{\max }^j \end{aligned}$$
(27d)
$$\begin{aligned} p\left( {o_{t + 1}^i \ge o_{\min }^i|{o_k},{u_k}} \right) \ge c \end{aligned}$$
(27e)
$$\begin{aligned} p\left( {o_{t + 1}^i \le o_{\max }^i|{o_k},{u_k}} \right) \ge c \end{aligned}$$
(27f)

where \({U_k} = {\left[ {\begin{array}{*{20}{c}} {{u_k}}&\ldots&{{u_{k + M - 1}}} \end{array}} \right] ^T}\) is the control sequence to be determined, the stage cost function \(l\left( {{o_t},{u_t}} \right) \) is defined as the quadratic function \(l\left( {{o_t},{u_t}} \right) = {\left\| {{o_t} - {o^ * }} \right\| _Q} + {\left\| {{u_t}} \right\| _R}\), and the terminal cost function is defined as follows

$$\begin{aligned} {l_f}\left( {{o_{k + M}}} \right) = \eta {d^2} + \lambda \log \left( {{d^2} + \alpha } \right) \end{aligned}$$
(28)

where \({d^2} = {\left\| {{o_{k + M}} - {o^ * }} \right\| _Q}\), \({o^ * }\) is the target state, and the Lorentzian \(\rho \)-function \(\log \left( {{d^2} + \alpha } \right) \) is introduced into the terminal cost function to encourage accurate placement at \({o^ * }\); its shape is shown in Fig. 3 (assuming \(\eta = \lambda = 1\)).
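
To illustrate (28), the following Python sketch evaluates the Lorentzian terminal cost and, for comparison, the common quadratic terminal cost; the parameter values \(\eta \), \(\lambda \) and \(\alpha \) are placeholders. Near \(d = 0\) the logarithmic term keeps the gradient from vanishing, which is what discourages a residual static servo error.

```python
# A minimal sketch of the terminal cost (28) versus a purely quadratic cost.
import numpy as np

def terminal_cost(o, o_star, Q, eta=1.0, lam=1.0, alpha=1e-2):
    e = o - o_star
    d2 = e @ Q @ e                        # ||o_{k+M} - o*||_Q
    return eta * d2 + lam * np.log(d2 + alpha)

def quadratic_terminal_cost(o, o_star, Q):
    e = o - o_star
    return e @ Q @ e
```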

Fig. 3
figure 3

Shape of the terminal cost function

It should be noted that the performance index (27a) is an expected value, which differs from the traditional MPC problem. Thus, the expected values of \(l\left( {{o_t},{u_t}} \right) \) and \({l_f}\left( {{o_{k + M}}} \right) \) need to be derived. However, exact closed forms of these expected values cannot be obtained. Thus, the second-order Taylor expansion technique is used to approximate the expected values as

$$\begin{aligned}{} & {} \begin{array}{l} E\left[ {l\left( {{o_t},{u_t}} \right) } \right] = \int {p\left( {{o_t}} \right) } l\left( {{o_t},{u_t}} \right) d{o_t}\\ \approx \int {p\left( {{o_t}} \right) } \left( {{{\left\| {{\mu _{o,t}} - {o^ * }} \right\| }_Q} + {{\left( {{o_t} - {\mu _{o,t}}} \right) }^T}\frac{{\partial l\left( {{o_t},{u_t}} \right) }}{{\partial {o_t}}}\left| {_{{o_t} = {\mu _{o,t}}}} \right. } \right. \\ \quad + \frac{1}{2}{{\left( {{o_t} - {\mu _{o,t}}} \right) }^T}\frac{{{\partial ^2}l\left( {{o_t},{u_t}} \right) }}{{\partial {o_t}^2}}\left| {{_{{o_t} = {\mu _{o,t}}}}} \right. \\ \quad \times \left. {\left( {{o_t} - {\mu _{o,t}}} \right) } \right) d{o_t}\\ \quad + {\left\| {{u_t}} \right\| _R} = {\left\| {{\mu _{o,t}} - {o^ * }} \right\| _Q} + 0 + tr\left( {\frac{1}{2}}\int {p\left( {{o_t}} \right) }\right. \\ \quad \times \left( {{o_t} - {\mu _{o,t}}} \right) {{{\left( {{o_t} - {\mu _{o,t}}} \right) }^T}d{o_t}} \frac{{{\partial ^2}l\left( {{o_t},{u_t}} \right) }}{{\partial {o_t}^2}}\left| {{_{{o_t} = {\mu _{o,t}}}}} \right) + {\left\| {{u_t}} \right\| _R}\\ = {\left\| {{\mu _{o,t}} - {o^ * }} \right\| _Q} + tr\left( {{\varSigma _{o,t}}Q} \right) + {\left\| {{u_t}} \right\| _R} \end{array}\end{aligned}$$
(29)
$$\begin{aligned}{} & {} \begin{array}{l} E\left[ {{l_f}\left( {{o_{k + M}}} \right) } \right] = \int {p\left( {{o_{k + M}}} \right) } {l_f}\left( {{o_{k + M}}} \right) d{o_{k + M}}\\ \approx \int {p\left( {{o_{k + M}}} \right) } \left( {{{\left\| {{\mu _{o,k + M}} - {o^ * }} \right\| }_Q}} + {{\left( {{o_{k + M}} - {\mu _{o,k + M}}} \right) }^T}\right. \\ \quad \times \frac{{\partial {l_f}\left( {{o_{k + M}}} \right) }}{{\partial {o_{k + M}}}}\left| {_{{o_{k + M}} = {\mu _{o,k + M}}}}\right. \\ \quad { + \frac{1}{2}}{{\left( {{o_{k + M}} - {\mu _{o,k + M}}} \right) }^T}\frac{{{\partial ^2}l\left( {{o_{k + M}}} \right) }}{{\partial {o_{k + M}}^2}}\left| {_{{o_{k + M}} = {\mu _{o,k + M}}}} \right. \\ \quad \times \left. \left( {{o_{k + M}} - {\mu _{o,k + M}}} \right) \right) d{o_{k + M}}\\ = \eta {\mu _{{d^2},k + M}} + \lambda \log \left( {{\mu _{{d^2},k + M}} + \alpha } \right) + tr\left( {{\varSigma _{o,t}}B} \right) \end{array} \end{aligned}$$
(30)

where

$$\begin{aligned}{} & {} {\mu _{{d^2},k + M}} = {\left\| {{\mu _{o,k + M}} - {o^ * }} \right\| _Q} \end{aligned}$$
(31)
$$\begin{aligned}{} & {} B = \eta Q + \lambda \left( \frac{1}{{{\mu _{{d^2},k + M}} + \alpha }}Q - \frac{2}{{{\left( {{\mu _{{d^2},k + M}} + \alpha } \right) }^2}}\right. \nonumber \\{} & {} \quad \left. Q\left( {{\mu _{o,k + M}} - {o^ * }} \right) {{\left( {{\mu _{o,k + M}} - {o^ * }} \right) }^T}Q \right) \end{aligned}$$
(32)

where Q is a weighting matrix with \(Q \ge 0\). The second term of (29) can be written as \(tr\left( {{\varSigma _{o,t}}Q} \right) = \varSigma _{o,t}^{\left( {1,1} \right) }{Q^{\left( {1,1} \right) }} + \ldots + \varSigma _{o,t}^{\left( {5,5} \right) }{Q^{\left( {5,5} \right) }}\), which penalizes predicted states with large variances. Thus, the GPMPC controller with such a stage cost prefers predicted state regions with less uncertainty and behaves cautiously.
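
A minimal Python sketch of the approximated expected stage cost in (29) is given below; variable names are illustrative placeholders.

```python
# Expected stage cost (29): quadratic cost on the predicted mean plus a trace
# penalty on the predicted covariance, which makes the controller cautious.
import numpy as np

def expected_stage_cost(mu_o, Sigma_o, u, o_star, Q, R):
    e = mu_o - o_star
    return e @ Q @ e + np.trace(Sigma_o @ Q) + u @ R @ u
```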

However, since the model (27b) and constraints (27e) and (27f) are stochastic, the optimization problem (27) is computationally intractable. Thus, several techniques are presented in the remainder of this section to transform (27) into a tractable optimization problem.

3.1 Uncertainty propagation

In this subsection, the specific forms of \({\mu _{o,t + 1}}\) and \({\varSigma _{o,t + 1}}\) in the stochastic model (27b) are derived. For long-term prediction, \(\left( {{\mu _{o,k + 2}},{\varSigma _{o,k + 2}}} \right) , \ldots ,\left( {{\mu _{o,k + M}},{\varSigma _{o,k + M}}} \right) \) must also be computed, which is more complex than the computation of \(\left( {{\mu _{o,k + 1}},{\varSigma _{o,k + 1}}} \right) \) since the features \({z_{k + 1}}, \ldots ,{z_{k + M - 1}}\) used for prediction are stochastic variables.

Only the derivation of \(\left( {{\mu _{o,k + 2}},{\varSigma _{o,k + 2}}} \right) \) is presented here, since the other mean–variance pairs are derived similarly. The feature \({z_{k + 1}}\) is a random variable that satisfies a Gaussian distribution as

$$\begin{aligned}{} & {} {z_{k + 1}}\sim N\left( {{\mu _{z,k + 1}},{\varSigma _{z,k + 1}}} \right) \nonumber \\{} & {} = N\left( {\left[ {\begin{array}{*{20}{c}} {{\mu _{o,k + 1}}}\\ {{u_{k + 1}}} \end{array}} \right] , \left[ {\begin{array}{*{20}{c}} {{\varSigma _{o,k + 1}}}&{}{{0_{5 \times 2}}}\\ {{0_{2 \times 5}}}&{}{{0_{2 \times 2}}} \end{array}} \right] } \right) \end{aligned}$$
(33)

where \(\left( {{\mu _{o,k + 1}},{\varSigma _{o,k + 1}}} \right) \) can be computed based on (25) and (26). With this uncertain input, \({\mu _{o,k + 2}}\) can be computed as

$$\begin{aligned} \begin{array}{l} {\mu _{o,k + 2}}\begin{array}{*{20}{l}} { = {E_{{z_{k + 1}},{h_{k + 1}}}}\left[ {{o_{k + 2}}} \right] } \end{array} \\ = \iint {p\left( {{z_{k + 1}}} \right) p\left( {{h_{k + 1}}|{z_{k + 1}}} \right) }\left( {{m_{k + 1}} + {h_{k + 1}} + {\varepsilon _{k + 1}}} \right) \\ d{z_{k + 1}}d{h_{k + 1}} \\ {=} \int {p\left( {{z_{k {+} 1}}} \right) } {m_{k {+} 1}}d{z_{k + 1}} {+} \iint {p\left( {{z_{k {+} 1}}} \right) p\left( {{h_{k + 1}}|{z_{k + 1}}} \right) }\\ {h_{k + 1}}d{z_{k + 1}}d{h_{k + 1}} \\ = \int {p\left( {{z_{k + 1}}} \right) } {m_{k + 1}}d{z_{k + 1}} + \int {p\left( {{z_{k + 1}}} \right) } {\mu _{h,k + 1}}d{z_{k + 1}} \\ \approx {m_{k + 1}}\left( {{\mu _{z,k + 1}}} \right) + {\mu _{h,k + 1}}\left( {{\mu _{z,k + 1}}} \right) \\ \end{array} \nonumber \\ \end{aligned}$$
(34)

where the final approximation is obtained by using the first-order Taylor expansion of \({m_{k + 1}}\) and \({\mu _{h,k + 1}}\) around \({\mu _{z,k + 1}}\). Although an exact solution of \(\int {p\left( {{z_{k + 1}}} \right) } {\mu _{h,k + 1}}d{z_{k + 1}}\) can be computed as in [41], the Taylor expansion approximation is used to simplify the calculation. The covariance matrix can be obtained based on the above approximated mean value as

$$\begin{aligned}&{\varSigma _{o,k {+} 2}}\nonumber \\&{=} {E_{{z_{k {+} 1}},{h_{k {+} 1}}}}\left[ {\left( {{o_{k + 2}} {-} {\mu _{o,k {+} 2}}} \right) {{\left( {{o_{k + 2}} {-} {\mu _{o,k {+} 2}}} \right) }^T}} \right] \nonumber \\&= {E_{{z_{k + 1}},{h_{k + 1}}}}\left[ {\left( {\underbrace{\left( {\frac{{\partial {m_{k + 1}}}}{{\partial {z_{k + 1}}}}\left| {_{{z_{k + 1}} = {\mu _{z,k + 1}}}} \right. } \right) \left( {{z_{k + 1}} - {\mu _{z,k + 1}}} \right) }_{\varDelta {m_{k + 1}}}} \right. } \right. \nonumber \\&\quad \left. { {+} \underbrace{{h_{k + 1}} {-} {\mu _{h,k + 1}}\left( {{\mu _{z,k + 1}}} \right) }_{\varDelta {h_{k + 1}}} {+} {\varepsilon _{k + 1}}} \right) \left. \right. \nonumber \\&\quad \left. {{{\left( {\varDelta {m_{k + 1}} + \varDelta {h_{k + 1}} + {\varepsilon _{k + 1}}} \right) }^T}} \right] \nonumber \\&= \int {\varDelta {m_{k + 1}}} \varDelta m_{k + 1}^Tp\left( {{z_{k + 1}}} \right) d{z_{k + 1}}\nonumber \\&\quad {+} \iint {\varDelta {m_{k {+} 1}}}\varDelta {h_{k {+} 1}}^Tp\left( {{z_{k + 1}}} \right) p\left( {{h_{k + 1}}|{z_{k {+} 1}}} \right) \nonumber \\&\quad \times d{z_{k + 1}}d{h_{k + 1}} \nonumber \\&\quad + \iint {\varDelta {h_{k + 1}}}\varDelta m_{k + 1}^Tp\left( {{z_{k + 1}}} \right) p\left( {{h_{k + 1}}|{z_{k + 1}}} \right) \nonumber \\&\quad \times d{z_{k + 1}}d{h_{k + 1}}\nonumber \\&\quad + \iint {\varDelta {h_{k + 1}}}\varDelta {h_{k + 1}}^Tp\left( {{z_{k + 1}}} \right) p\left( {{h_{k + 1}}|{z_{k + 1}}} \right) \nonumber \\&d{z_{k +1}}d{h_{k + 1}} + {\varSigma _\varepsilon } \end{aligned}$$
(35)

The remaining work is to compute integrals. The first integral can be computed as

$$\begin{aligned} \begin{array}{l} \int {\varDelta {m_{k + 1}}} \varDelta m_{k + 1}^Tp\left( {{z_{k + 1}}} \right) d{z_{k + 1}}\\ = \left( {\frac{{\partial {m_{k + 1}}}}{{\partial {z_{k + 1}}}}\left| {_{{z_{k + 1}} = {\mu _{z,k + 1}}}} \right. } \right) \int p\left( {{z_{k + 1}}} \right) \\ \left( {{z_{k + 1}} - {\mu _{z,k + 1}}} \right) {\left( {{z_{k + 1}} - {\mu _{z,k + 1}}} \right) ^T}\\ d{z_{k + 1}}{\left( {\frac{{\partial {m_{k + 1}}}}{{\partial {z_{k + 1}}}}\left| {_{{z_{k + 1}} = {\mu _{z,k + 1}}}} \right. } \right) ^T}\\ = \left( {\frac{{\partial {m_{k + 1}}}}{{\partial {z_{k + 1}}}}\left| {_{{z_{k + 1}} = {\mu _{z,k + 1}}}} \right. } \right) {\varSigma _{z,k + 1}}{\left( {\frac{{\partial {m_{k + 1}}}}{{\partial {z_{k + 1}}}}\left| {_{{z_{k + 1}} = {\mu _{z,k + 1}}}} \right. } \right) ^T} \end{array} \end{aligned}$$
(36)

The value of the second integral can be obtained as

$$\begin{aligned} \begin{array}{l} \iint {\varDelta {m_{k {+} 1}}}\varDelta {h_{k {+} 1}}^Tp\left( {{z_{k {+} 1}}} \right) p\left( {{h_{k {+} 1}}|{z_{k {+} 1}}} \right) d{z_{k {+} 1}}d{h_{k + 1}} \\ = \int {p\left( {{z_{k + 1}}} \right) \varDelta {m_{k + 1}}} \int {p\left( {{h_{k + 1}}|{z_{k + 1}}} \right) }\\ {\left( {{h_{k + 1}} - {\mu _{h,k + 1}}\left( {{\mu _{z,k + 1}}} \right) } \right) ^T} d{h_{k + 1}}d{z_{k + 1}} \\ {=} \int {p\left( {{z_{k {+} 1}}} \right) \varDelta {m_{k + 1}}} \left( {{\mu _{h,k + 1}} {-} {\mu _{h,k {+} 1}}\left( {{\mu _{z,k {+} 1}}} \right) } \right) d{z_{k {+} 1}} \\ \approx 0 \\ \end{array} \nonumber \\ \end{aligned}$$
(37)

where the final approximation is obtained by using the first-order Taylor expansion of \(\varDelta {m_{k + 1}}\left( {{\mu _{h,k + 1}} - {\mu _{h,k + 1}}\left( {{\mu _{z,k + 1}}} \right) } \right) \) around \({\mu _{z,k + 1}}\). The third integral, which can be derived in the same way as the second one, is also zero. The fourth integral can be computed as

$$\begin{aligned} \begin{array}{l} \iint {\varDelta {h_{k + 1}}}\varDelta {h_{k + 1}}^Tp\left( {{z_{k + 1}}} \right) p\left( {{h_{k + 1}}|{z_{k + 1}}} \right) d{z_{k + 1}}d{h_{k + 1}} \\ {=} \int {p\left( {{z_{k {+} 1}}} \right) } \int {p\left( {{h_{k {+} 1}}|{z_{k {+} 1}}} \right) \varDelta {h_{k {+} 1}}} \varDelta {h_{k {+} 1}}^Td{h_{k {+} 1}}d{z_{k {+} 1}} \\ \approx \int {p\left( {{z_{k + 1}}} \right) } \int {p\left( {{h_{k + 1}}|{z_{k + 1}}} \right) } {h_{k + 1}}h_{k + 1}^Td{h_{k + 1}}d{z_{k + 1}} \\ \quad - {\mu _{h,k + 1}}\left( {{\mu _{z,k + 1}}} \right) \mu _{_{h,k + 1}}^T\left( {{\mu _{z,k + 1}}} \right) \\ = \int {p\left( {{z_{k + 1}}} \right) } \left( {{\varSigma _{h,k + 1}} + {\mu _{h,k + 1}}\mu _{_{h,k + 1}}^T} \right) d{z_{k + 1}}\\ \quad - {\mu _{h,k + 1}}\left( {{\mu _{z,k + 1}}} \right) \mu _{_{h,k + 1}}^T\left( {{\mu _{z,k + 1}}} \right) \\ \approx {\varSigma _{h,k + 1}}\left( {{\mu _{z,k + 1}}} \right) \\ \end{array} \nonumber \\ \end{aligned}$$
(38)

where the second approximate equation and the final approximate equation are obtained by using the first-order Taylor expansion of \({\mu _{h,k + 1}}\) and \({\varSigma _{h,k + 1}} + {\mu _{h,k + 1}}\mu _{_{h,k + 1}}^T\) around \({\mu _{z,k + 1}}\), respectively. Thus, based on (35)-(38), the approximated covariance matrix is as follows

$$\begin{aligned} \begin{array}{l} {\varSigma _{o,k + 2}}\approx \left( {\frac{{\partial {m_{k + 1}}}}{{\partial {z_{k + 1}}}}\left| {_{{z_{k + 1}} = {\mu _{z,k + 1}}}} \right. } \right) \\ {\varSigma _{z,k + 1}}{\left( {\frac{{\partial {m_{k + 1}}}}{{\partial {z_{k + 1}}}}\left| {_{{z_{k + 1}} = {\mu _{z,k + 1}}}} \right. } \right) ^T} + {\varSigma _{h,k + 1}}\left( {{\mu _{z,k + 1}}} \right) \\ + {\varSigma _\varepsilon } \end{array} \end{aligned}$$
(39)

where

$$\begin{aligned} \begin{array}{l} {\mu _{z,k + 1}} = {\left[ {\begin{array}{*{20}{c}} {{\mu _{o,k + 1}}}&{}{{u_{k + 1}}} \end{array}} \right] ^T}\\ {\varSigma _{z,k + 1}} = \left[ {\begin{array}{*{20}{c}} {{\varSigma _{o,k + 1}}}&{}{{0_{5 \times 2}}}\\ {{0_{2 \times 5}}}&{}{{0_{2 \times 2}}} \end{array}} \right] \end{array} \end{aligned}$$
(40)

It can be seen that (25) and (26) are special cases of (34) and (39), respectively, obtained by setting \({z_k} = {\mu _{z,k}}\) and \({\varSigma _{z,k}} = {0_{5 \times 5}}\).
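
The one-step propagation in (34) and (39) can be sketched in Python as follows, assuming placeholder callables for the nominal model, its Jacobian with respect to \(z\) and the GP posterior; this is an illustrative sketch under those assumptions, not the authors' implementation.

```python
# First-order (Taylor-based) uncertainty propagation, cf. (34) and (39).
import numpy as np

def propagate(mu_z, Sigma_z, nominal_model, jac_m, gp_mean_var, Sigma_eps):
    mu_h, Sigma_h = gp_mean_var(mu_z)                       # GP evaluated at the mean feature
    mu_o_next = nominal_model(mu_z) + mu_h                  # (34)
    A = jac_m(mu_z)                                         # dm/dz at mu_z
    Sigma_o_next = A @ Sigma_z @ A.T + Sigma_h + Sigma_eps  # (39)
    return mu_o_next, Sigma_o_next
```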

Although the distribution of \({o_{k + 2}}\) is not exactly Gaussian, it is approximated as a Gaussian distribution with the derived \(\left( {{\mu _{o,k + 2}},{\varSigma _{o,k + 2}}} \right) \). Then, the chance constraints (27e) and (27f) can be transformed into a more tractable formulation by using the quantile function of the Gaussian distribution. In this paper, c is set to 0.95, which leads to the following constraints

$$\begin{aligned} \begin{array}{l} o_{t + 1}^i - 2\varSigma _{o,t + 1}^{\left( {i,i} \right) } \ge o_{\min }^i\\ o_{t + 1}^i + 2\varSigma _{o,t + 1}^{\left( {i,i} \right) } \le o_{\max }^i \end{array} \end{aligned}$$
(41)

for all \(t = k, \ldots ,k + M - 1\) and \(i = 1,2,3,4,5\).
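
A minimal Python sketch of the tightened constraints (41) is given below; following the paper, the tightening term is twice the corresponding diagonal entry of the predicted covariance, and the names are illustrative.

```python
# Check of the tightened bounds (41) obtained from the chance constraints
# (27e)-(27f) with c = 0.95.
import numpy as np

def tightened_bounds_satisfied(mu_o, Sigma_o, o_min, o_max):
    margin = 2.0 * np.diag(Sigma_o)
    return np.all(mu_o - margin >= o_min) and np.all(mu_o + margin <= o_max)
```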

3.2 Tractable MPC design

In this subsection, the intractable SMPC formulation is transformed into a tractable DMPC formulation by designing an ADM based on the derivations in Sect. 3.1, and a modified iLQR approach is proposed to solve the DMPC problem.

The state of the ADM at time index t is defined as

$$\begin{aligned} {s_t} = {\left[ {\begin{array}{*{20}{c}} {\mu _{o,t}^T}&{vec{{\left( {{L_{o,t}}} \right) }^T}} \end{array}} \right] ^T} \end{aligned}$$
(42)

where \({L_{o,t}} = \left[ {\begin{array}{*{20}{c}} {{l_1}}&\ldots&{{l_5}} \end{array}} \right] \) is a lower triangular matrix which is obtained by using Cholesky decomposition of \({\varSigma _{o,t}}\), and \(vec\left( {{L_{o,t}}} \right) = {\left[ {\begin{array}{*{20}{c}} {l_1^T}&\ldots&{l_5^T} \end{array}} \right] ^T}\).

Remark 1

We set \({L_{o,k}} = \overrightarrow{0} \) since \({\varSigma _{o,k}} = {0_{5 \times 5}}\); for \(t = k + 1, \ldots, k + M\), \({L_{o,t}}\) can be obtained by the Cholesky decomposition of \({\varSigma _{o,t}}\) since \({\varSigma _{o,t}}\) is positive definite.
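
A minimal Python sketch of packing and unpacking the ADM state (42) via the Cholesky factor is given below; the names are illustrative.

```python
# Pack the ADM state s_t = [mu_o; vec(L_o)] of (42) and recover (mu_o, Sigma_o).
import numpy as np

def pack_adm_state(mu_o, Sigma_o):
    L = np.linalg.cholesky(Sigma_o) if np.any(Sigma_o) else np.zeros_like(Sigma_o)
    return np.concatenate([mu_o, L.flatten(order='F')])   # columns l_1..l_5 stacked

def unpack_adm_state(s, n=5):
    mu_o = s[:n]
    L = s[n:].reshape(n, n, order='F')
    return mu_o, L @ L.T                                   # recover Sigma_o = L L^T
```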

The ADM is represented by \({s_{t + 1}} = F\left( {{s_t},{u_t}} \right) \), which can be easily derived by using (34) and (39). Under the DMPC formulation, the expected values of \(l\left( {{o_t},{u_t}} \right) \) and \({l_f}\left( {{o_{k + M}}} \right) \) can be rewritten as

$$\begin{aligned} l\left( {{s_t},{u_t}} \right) = E\left[ {l\left( {{o_t},{u_t}} \right) } \right] = {\left\| {{s_t} - {s^*}} \right\| _{{Q_1}}} + {\left\| {{u_t}} \right\| _R} \end{aligned}$$
(43)

and

$$\begin{aligned} {l_f}\left( {{s_{k + M}}} \right){} & {} = E\left[ {{l_f}\left( {{o_{k + M}}} \right) } \right] = {\left\| {{s_{k + M}} - {s^*}} \right\| _{{Q_2}}} \nonumber \\{} & {} \quad + \lambda \log \left( {{{\left\| {{s_{k + M}} - {s^*}} \right\| }_{{Q_3}}} + \alpha } \right) \end{aligned}$$
(44)


where \({s^*} = {\left[ {{{\left( {{o^ * }} \right) }^T},{{\overrightarrow{0} }^T}} \right] ^T} \in {R^{30}}\), \({Q_1} = diag\left( {\underbrace{Q, \ldots ,Q}_6} \right) \in {R^{30 \times 30}}\), \({Q_2} = diag\left( {Q,\underbrace{B, \ldots B}_5} \right) \) and \({Q_3} = diag\left( {Q,\underbrace{{0_{5 \times 5}}, \ldots {0_{5 \times 5}}}_5} \right) \). The constraints in (41) are equivalent to

$$\begin{aligned} \begin{array}{l} s_{t + 1}^T{e^i} - 2{\left( {{L_{o,t}}L_{o,t}^T} \right) ^{\left( {i,i} \right) }} \ge s_{\min }^i\\ s_{t + 1}^T{e^i} + 2{\left( {{L_{o,t}}L_{o,t}^T} \right) ^{\left( {i,i} \right) }} \le s_{\max }^i \end{array} \end{aligned}$$
(45)

for \(i = 1, \ldots ,5\), where \({e^i}\) is the \({i^{th}}\) column vector of an identity matrix. Then, the DMPC problem is given as follows

$$\begin{aligned} \mathop {\min }\limits _{{U_k}} \mathrm{{ }}{J_k} = \sum \limits _{t = k}^{k + M - 1} {l\left( {{s_t},{u_t}} \right) + {l_f}\left( {{s_{k + M}}} \right) } \end{aligned}$$
(46a)
$$\begin{aligned} s.t.\quad \forall i = 1, \ldots ,5,\;\forall j = 1,2\;\mathrm{and}\;\forall t = k, \ldots ,k + M - 1 \end{aligned}$$
$$\begin{aligned} {s_{t + 1}} = F\left( {{s_t},{u_t}} \right) \end{aligned}$$
(46b)
$$\begin{aligned} u_t^j \ge u_{\min }^j \end{aligned}$$
(46c)
$$\begin{aligned} u_t^j \le u_{\max }^j \end{aligned}$$
(46d)
$$\begin{aligned} s_{t + 1}^T{e^i} - 2{\left( {{L_{o,t}}L_{o,t}^T} \right) ^{\left( {i,i} \right) }} \ge s_{\min }^i \end{aligned}$$
(46e)
$$\begin{aligned} s_{t + 1}^T{e^i} + 2{\left( {{L_{o,t}}L_{o,t}^T} \right) ^{\left( {i,i} \right) }} \le s_{\max }^i \end{aligned}$$
(46f)

To simplify the DMPC problem, the constraints (46c)–(46f) are softened into the performance index by introducing several penalty (barrier) functions as

$$\begin{aligned} {J'_k} = {J_k} + \sum \limits _{t = k}^{k + M - 1} {{l_{bs}}\left( {{s_{t + 1}}} \right) + {l_{bu}}\left( {{u_t}} \right) } \end{aligned}$$
(47)

where

$$\begin{aligned} \begin{array}{l} {l_{bs}}\left( {{s_{t + 1}}} \right) = \sum \limits _{i = 1}^5 \left( {b_1}r\left( {s_{t + 1}^T{e^i} + 2{{\left( {{L_{o,t}}L_{o,t}^T} \right) }^{\left( {i,i} \right) }} - s_{\max }^i} \right) \right. \\ \quad \left. + \,{b_2}r\left( {s_{\min }^i - s_{t + 1}^T{e^i} + 2{{\left( {{L_{o,t}}L_{o,t}^T} \right) }^{\left( {i,i} \right) }}} \right) \right) \end{array} \end{aligned}$$
(48)
$$\begin{aligned} {l_{bu}}\left( {{u_t}} \right) = \sum \limits _{j = 1}^2 {{b_3}r\left( {u_t^j - u_{\max }^j} \right) + {b_4}r\left( {u_{\min }^j - u_t^j} \right) } \end{aligned}$$
(49)
$$\begin{aligned} r\left( * \right) = \left\{ \begin{array}{l} 0,\mathrm{{ }} * \le 0\\ *,\mathrm{{ }} * \ge 0 \end{array} \right. \end{aligned}$$
(50)
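
The penalty terms (48)–(50) can be sketched in Python as follows; the weights \(b_1, \ldots, b_4\) are placeholders.

```python
# Soft penalties for the state constraints (46e)-(46f) and input constraints
# (46c)-(46d), following (48)-(50).
import numpy as np

def ramp(x):
    """r(*) in (50): zero for non-positive violation, linear otherwise."""
    return np.maximum(x, 0.0)

def state_penalty(mu_o, Sigma_o, o_min, o_max, b1=10.0, b2=10.0):
    margin = 2.0 * np.diag(Sigma_o)
    upper = ramp(mu_o + margin - o_max)     # violation of the upper bound
    lower = ramp(o_min - mu_o + margin)     # violation of the lower bound
    return np.sum(b1 * upper + b2 * lower)

def input_penalty(u, u_min, u_max, b3=10.0, b4=10.0):
    return np.sum(b3 * ramp(u - u_max) + b4 * ramp(u_min - u))
```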

To facilitate the subsequent introduction of the iLQR method, \({J'_k}\) is rewritten as

$$\begin{aligned} {J'_k} = \sum \limits _{t = k}^{k + M - 1} {l'\left( {{s_t},{u_t}} \right) } + {l'_f}\left( {{s_{k + M}}} \right) \end{aligned}$$
(51)

where

$$\begin{aligned} l'\left( {{s_t},{u_t}} \right) = l\left( {{s_t},{u_t}} \right) + {l_{bs}}\left( {{s_t}} \right) + {l_{bu}}\left( {{u_t}} \right) \end{aligned}$$
(52)
$$\begin{aligned} {l'_f}\left( {{s_{k + M}}} \right) = {l_f}\left( {{s_{k + M}}} \right) + {l_{bs}}\left( {{s_{k + M}}} \right) \end{aligned}$$
(53)

With the above settings, the considered DMPC problem is simplified as

$$\begin{aligned} \mathop {\min }\limits _{{U_k}} \mathrm{{ }}{J'_k} = \sum \limits _{t = k}^{k + M - 1} {l'\left( {{s_t},{u_t}} \right) + {{l'}_f}\left( {{s_{k + M}}} \right) } \end{aligned}$$
(54a)
$$\begin{aligned} s.t. \quad {s_{t + 1}} = F\left( {{s_t},{u_t}} \right) \end{aligned}$$
(54b)

The iLQR [42] approach is employed to solve the above DMPC problem through trajectory optimization. The value function of \({s_t}\), defined as the cost-to-go, is written as

$$\begin{aligned} V\left( {{s_t}} \right) = \mathop {\min }\limits _{{U_t}} \sum \limits _{\tau = t}^{k + M - 1} {l'\left( {{s_\tau },{u_\tau }} \right) } + {l'_f}\left( {{s_{k + M}}} \right) \end{aligned}$$
(55)

where \({U_t} = \left[ {\begin{array}{*{20}{c}} {{u_t}}&\ldots&{{u_{k + M - 1}}} \end{array}} \right] \). The value function at the terminal time is set as \(V\left( {{s_{k + M}}} \right) = {l'_f}\left( {{s_{k + M}}} \right) \). Then, iLQR can minimize sequentially over a single control input rather than over the entire control sequence by proceeding backward in time as

$$\begin{aligned} V\left( {{s_t}} \right){} & {} = \mathop {\min }\limits _{{u_t}} \mathrm{{ }}l'\left( {{s_t},{u_t}} \right) + V\left( {{s_{t + 1}}} \right) = \mathop {\min }\limits _{{u_t}} \mathrm{{ }}l'\left( {{s_t},{u_t}} \right) \nonumber \\{} & {} \quad + V\left( {F\left( {{s_t},{u_t}} \right) } \right) \end{aligned}$$
(56)

To solve the above minimization problem, the Q function is defined as a perturbation function around the current state–input pair as

$$\begin{aligned} \begin{array}{l} Q\left( {\delta {s_t},\delta {u_t}} \right) = l'\left( {{s_t} + \delta {s_t},{u_t} + \delta {u_t}} \right) - l'\left( {{s_t},{u_t}} \right) \\ + V\left( {F\left( {{s_t} + \delta {s_t},{u_t} + \delta {u_t}} \right) } \right) - V\left( {F\left( {{s_t},{u_t}} \right) } \right) \end{array} \end{aligned}$$
(57)

The minimization problem (56) is then transformed into finding an optimal \(\delta u_t^ *\) that minimizes \(Q\left( {\delta {s_t},\delta {u_t}} \right) \). The function \(Q\left( {\delta {s_t},\delta {u_t}} \right) \) can be expanded to second order as

$$\begin{aligned} Q\left( {\delta {s_t},\delta {u_t}} \right) \approx \frac{1}{2}{\left[ {\begin{array}{*{20}{c}} 1\\ {\delta {s_t}}\\ {\delta {u_t}} \end{array}} \right] ^T}\left[ {\begin{array}{*{20}{c}} 0&{}{Q_{{s_t}}^T}&{}{Q_{{u_t}}^T}\\ {{Q_{{s_t}}}}&{}{{Q_{{s_t}{s_t}}}}&{}{{Q_{{s_t}{u_t}}}}\\ {{Q_{{u_t}}}}&{}{{Q_{{u_t}{s_t}}}}&{}{{Q_{{u_t}{u_t}}}} \end{array}} \right] \left[ {\begin{array}{*{20}{c}} 1\\ {\delta {s_t}}\\ {\delta {u_t}} \end{array}} \right] \nonumber \\ \end{aligned}$$
(58)

where

$$\begin{aligned} {Q_{{s_t}}} = {l'_{{s_t}}} + F_{{s_t}}^T{V_{{s_{t + 1}}}} \end{aligned}$$
(59a)
$$\begin{aligned} {Q_{{u_t}}} = {l'_{{u_t}}} + F_{{u_t}}^T{V_{{s_{t + 1}}}} \end{aligned}$$
(59b)
$$\begin{aligned} {Q_{{s_t}{s_t}}} = {l'_{{s_t}{s_t}}} + F_{{s_t}}^T{V_{{s_{t + 1}}{s_{t + 1}}}}{F_{{s_t}}} + {V_{{s_{t + 1}}}}{F_{{s_t}{s_t}}} \end{aligned}$$
(59c)
$$\begin{aligned} {Q_{{u_t}{u_t}}} = {l'_{{u_t}{u_t}}} + F_{{u_t}}^T{V_{{s_{t + 1}}{s_{t + 1}}}}{F_{{u_t}}} + {V_{{s_{t + 1}}}}{F_{{u_t}{u_t}}} \end{aligned}$$
(59d)
$$\begin{aligned} {Q_{{u_t}{s_t}}} = {l'_{{u_t}{s_t}}} + F_{{u_t}}^T{V_{{s_{t + 1}}{s_{t + 1}}}}{F_{{s_t}}} + {V_{{s_{t + 1}}}}{F_{{u_t}{s_t}}} \end{aligned}$$
(59e)

The last terms in (59c), (59d) and (59e) are ignored in iLQR to reduce the computational burden. Minimizing (58) with respect to \(\delta {u_t}\), we obtain

$$\begin{aligned} \delta u_t^ * = - Q_{{u_t}{u_t}}^{ - 1}\left( {{Q_{{u_t}}} + {Q_{{u_t}{s_t}}}\delta {s_t}} \right) = {k_t} + {K_t}\delta {s_t} \end{aligned}$$
(60)

where \({k_t} = - Q_{{u_t}{u_t}}^{ - 1}Q_{u_t}\) and \({K_t} = - Q_{{u_t}{u_t}}^{ - 1}Q_{{u_t}{s_t}}\). Substituting (60) into (58), we have

$$\begin{aligned} {V_{{s_t}}} = {Q_{{s_t}}} - {Q_{{u_t}}}Q_{{u_t}{u_t}}^{ - 1}{Q_{{u_t}{s_t}}} \end{aligned}$$
(61)
$$\begin{aligned} {V_{{s_t}{s_t}}} = {Q_{{s_t}{s_t}}} - {Q_{{s_t}{u_t}}}Q_{{u_t}{u_t}}^{ - 1}{Q_{{u_t}{s_t}}} \end{aligned}$$
(62)

which can be used to compute \({k_{t - 1}}\) and \({K_{t - 1}}\). Recursively computing \(\left( {\left( {{k_{k + M - 1}}, {K_{k + M - 1}}} \right) , \ldots ,\left( {{k_k},{K_k}} \right) } \right) \) constitutes a backward pass. Once the backward pass is completed, a forward pass is used to obtain a new trajectory as

$$\begin{aligned} \begin{array}{l} {{{\hat{s}}}_k} = {s_k}\\ {{{\hat{u}}}_t} = {u_t} + {k_t} + {K_t}\left( {{{{\hat{s}}}_t} - {s_t}} \right) \\ {{{\hat{s}}}_{t + 1}} = F\left( {{{{\hat{s}}}_t},{{{\hat{u}}}_t}} \right) \end{array} \end{aligned}$$
(63)
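
For illustration, a minimal Python sketch of one iLQR iteration (the backward pass (58)–(62) and the forward pass (63)) is given below; the dynamics \(F\), its Jacobians and the cost derivatives are assumed to be supplied numerically, and all names are placeholders rather than the authors' implementation.

```python
# One iLQR iteration for the softened problem (54), under the assumptions above.
import numpy as np

def backward_pass(F_s, F_u, l_s, l_u, l_ss, l_uu, l_us, Vf_s, Vf_ss):
    """Backward pass (58)-(62): returns feedforward terms k_t and gains K_t."""
    M = len(F_s)
    V_s, V_ss = Vf_s, Vf_ss
    ks, Ks = [None] * M, [None] * M
    for t in reversed(range(M)):
        Q_s  = l_s[t]  + F_s[t].T @ V_s                     # (59a)
        Q_u  = l_u[t]  + F_u[t].T @ V_s                     # (59b)
        Q_ss = l_ss[t] + F_s[t].T @ V_ss @ F_s[t]           # (59c), curvature of F dropped
        Q_uu = l_uu[t] + F_u[t].T @ V_ss @ F_u[t]           # (59d)
        Q_us = l_us[t] + F_u[t].T @ V_ss @ F_s[t]           # (59e)
        Q_uu_inv = np.linalg.inv(Q_uu)
        ks[t] = -Q_uu_inv @ Q_u                             # (60)
        Ks[t] = -Q_uu_inv @ Q_us
        V_s  = Q_s  - Q_us.T @ Q_uu_inv @ Q_u               # cf. (61)
        V_ss = Q_ss - Q_us.T @ Q_uu_inv @ Q_us              # (62)
    return ks, Ks

def forward_pass(F, ss, us, ks, Ks):
    """Forward pass (63): roll out the updated state-input trajectory."""
    ss_new, us_new = [ss[0]], []
    for t in range(len(us)):
        u_new = us[t] + ks[t] + Ks[t] @ (ss_new[t] - ss[t])
        us_new.append(u_new)
        ss_new.append(F(ss_new[t], u_new))
    return ss_new, us_new
```
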

By iteratively performing the backward pass and the forward pass, an optimal \(U_k^ * = {\left[ {\begin{array}{*{20}{c}} {u_k^ * }&\ldots&{u_{k + M - 1}^ * } \end{array}} \right] ^T}\) can be obtained, and the first control input \(u_k^ * \) is applied to the WMR. The complete procedure for designing the GPMPC controller is summarized in Algorithm 1.

Algorithm 1
figure a

GPMPC controller design

4 Simulation

In this section, the validity of the proposed GPMPC approach is verified by numerical simulations. The simulation examples run in a Python environment on a computer with an AMD Ryzen 7 4800H CPU (2.90 GHz) and 16.0 GB of RAM. Specifically, two comparison simulations are performed to verify the efficacy of the GP-enhanced model and of the Lorentzian \(\rho \)-function, respectively. The velocity model in the simulations is set as

$$\begin{aligned} \begin{array}{l} {v_{r,k}} = {v_{r,k - 1}} + 0.5 \times \left( {{v_{r,d,k}} - {v_{r,k - 1}}} \right) \\ {w_{r,k}} = {w_{r,k - 1}} + 0.5 \times \left( {{w_{r,d,k}} - {w_{r,k - 1}}} \right) \end{array} \end{aligned}$$
(64)

This means that at each step the real velocity of the robot only moves part of the way toward the desired velocity rather than reaching it immediately. Other parameters are shown in Table 1.
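
A minimal Python sketch of the velocity model (64) used in the simulation is given below; the gain of 0.5 matches (64).

```python
# First-order velocity lag of (64): the real velocity only moves halfway toward
# the commanded one each step, so the nominal model over-predicts the response.
def velocity_step(v_prev, w_prev, v_cmd, w_cmd, gain=0.5):
    v = v_prev + gain * (v_cmd - v_prev)
    w = w_prev + gain * (w_cmd - w_prev)
    return v, w
```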

Table 1 Parameters used in the simulation
Fig. 4
figure 4

Comparison results of the GPMPC and T-MPC. a Moving trajectories of the feature point in the image frame. b Moving trajectory of the camera in the world frame

Fig. 5
figure 5

Computing details of the T-MPC controller and the GPMPC controller. a Computing details of the T-MPC controller. b Computing details of the GPMPC controller

Fig. 6
figure 6

Comparison results of the GPMPC controller and Q-GPMPC

Fig. 7
figure 7

Detailed control information of the Q-GPMPC and GPMPC

Fig. 8
figure 8

Environment of the experiment

Fig. 9
figure 9

Motion trajectories of the feature point in the image frame. a motion trajectories of GPMPC. b Motion trajectories of O-GPMPC. c Motion trajectories of T-MPC. d Motion trajectories of Q-GPMPC

Only the first two dimensions of the state are constrained, since only visibility constraints are considered. The input constraints are handled by applying a saturation function to the \(U_k^ * \) derived by the iLQR method, and thus \({b_3}\) and \({b_4}\) are set to 0.

To highlight the importance of model learning, we first compare the simulation results of the proposed GPMPC and the traditional MPC (T-MPC) that only uses the nominal model (11), as shown in Fig. 4. Solid circles in Fig. 4a and b represent the end positions of the feature point and the camera. The symbol ‘\(+\)’ represents the desired position and the dashed box denotes the visibility constraints. With both GPMPC and T-MPC, the feature point eventually arrives at the desired position. However, under T-MPC the feature point leaves the FOV, as reflected in Fig. 4a by the blue trajectory sometimes lying outside the dashed box. In real applications, this issue may cause the visual servoing task to fail.

Fig. 10
figure 10

Motion trajectories in the image frame. a Motion trajectory of GPMPC. b Motion trajectory of O-GPMPC. c Motion trajectory of T-MPC. d Motion trajectory of Q-GPMPC

Fig. 11
figure 11

Comparison results of GPMPC, O-GPMPC, T-MPC and Q-GPMPC

Fig. 12
figure 12

Command and real velocities in experiments

To further analyze the reasons for this issue, we record the computing details of the GPMPC controller and the T-MPC controller, as shown in Fig. 5. The prediction horizon lines (blue dashed lines) represent the horizon state trajectories (HSJs) predicted by the optimal control sequence and the current model (the nominal model for T-MPC and the GP-enhanced model for GPMPC), and the real horizon lines (red dashed lines) represent the real horizon state trajectories (RHSJs) obtained from the optimal control sequence and the real model. Figure 5a shows that the visibility constraints are sometimes violated under T-MPC, especially along the Y-axis of the image frame. This is caused by the error between the nominal model and the real model. At step 102, although the HSJs do not violate the visibility constraints, the RHSJs do, because the optimal control sequences are derived from the nominal model. Therefore, with this model error, satisfaction of the visibility constraints cannot be guaranteed. For GPMPC, the additive GP model effectively captures the differences between the nominal model and the real model, so the GP-enhanced model is accurate enough to derive proper optimal control sequences. The HSJs for GPMPC are stochastic, as shown in Fig. 5b, where the shaded areas represent the \(95\%\) predictive confidence region. It can be seen that the RHSJs are close to the mean trajectories, which reflects the fact that the GP-enhanced model resembles the real model. Moreover, the prediction variance, which represents the prediction uncertainty, gradually decreases as the on-line model learning proceeds. Thus, the visibility constraints are satisfied by the GPMPC method.

In what follows, the effectiveness of the Lorentzian \(\rho \)-function is shown by comparing the control results of GPMPC and the quadratic GPMPC (Q-GPMPC), which uses the quadratic terminal cost function \({l_f}\left( {{o_{k + M}}} \right) = {d^2}\). Figure 6 illustrates the comparison results. It can be seen that Q-GPMPC has relatively large static errors after the system settles in the \(\theta \), \(x_i\) and \(y_i\) directions, namely 0.038 [rad], \(0.022\,[pixels]\) and \(0.01\,[pixels]\), respectively. In contrast, the static errors of GPMPC are much smaller: 0.002 [rad], 0.002 [pixels] and 0.004 [pixels], respectively. This issue is caused by the shape of the quadratic cost function, i.e., its gradient is nearly zero when d is near 0 (when o is near \({o^ * }\)). Thus, the optimizer outputs a near-zero control sequence when the error is small, leading to the static error. In contrast, the Lorentzian \(\rho \)-function keeps a noticeable slope as d approaches 0, which encourages precise positioning.

Figure 7 shows the detailed control information. From Fig. 7a, the control inputs of Q-GPMPC vanish at step 60 even though a servo error still exists. Figure 7b shows that the WMR finally stops near, but not at, the desired position. Different from Q-GPMPC, the control inputs of GPMPC do not vanish when the WMR is near the desired position. However, the control inputs become more complex, gradually driving the WMR to the desired position by alternating forward and backward movements. The basic condition for producing these alternating motions is that the value function has obvious differences in the region near the desired position, which is consistent with the shape of the Lorentzian \(\rho \)-function.

The average control periods of T-MPC and GPMPC are 0.046 [sec] and 0.080 [sec], respectively. Thus, according to the above analyses, the proposed GPMPC method effectively addresses the visual servoing task for a WMR with an unknown velocity model and visibility constraints.

5 Experiments

In this section, a WMR visual servoing point stabilization experiment is designed to confirm the effectiveness of the GPMPC method in practical applications. The experimental equipment includes a TurtleBot2 mobile robot, a RealSense D435 USB camera, a high-precision gyroscope and a computer with an i7-12700H CPU and an RTX-3060 GPU. The camera frame rate is set to 30 FPS and the resolution to 640\(\times \)480 pixels, with intrinsic parameters \((fx,fy,{c_u},{c_v}) = \mathrm{{( - 607}}\mathrm{{.5, - 606}}\mathrm{{.2, 325}}\mathrm{{.5, 243}}\mathrm{{.8)}}\). The gyroscope angle accuracy is 0.1 degrees. The experiment is conducted in the environment shown in Fig. 8. In the experiment, the four corner points of an AprilTag marker are selected as the visual servoing image features, extracted with the Visual Servoing Platform (ViSP) [43]. A tolerance region is set as \((\left| {{e_{{x_i}}}} \right| < {\mathrm{{0}}.\mathrm{{01\,[pixels]}}},\left| {{e_{{y_i}}}} \right|<\)\({\mathrm{{0}}.\mathrm{{01\,[pixels]}}},\left| {{e_\theta }} \right| < {0.035\,[rad]})\), which means that the servoing task is considered completed only when all three conditions are met simultaneously. It should be noted that some experimental parameters differ from the simulation parameters and are set as \({o_0}={\left[ {\begin{array}{*{20}{l}} {0.5\,[pixels]}&{0.07\,[pixels]}&{-\pi / 6\,[rad]}&0&0 \end{array}} \right] ^T}\), \({o^ * }={\left[ {\begin{array}{*{20}{l}} {- 0.15\,[pixels]}&{0.2\,[pixels]}&0&0&0 \end{array}} \right] ^T}\), \(Q=diag\left( {0.1,0.4,0.05,0,0} \right) \), \(R=diag\left( {0.003,0.003} \right) \), \({o_{\max }} = {\left[ {\begin{array}{*{20}{c}} {{0.58\,[pixels]}}&{0.26\,[pixels]}&\sim&\sim&\sim \end{array}} \right] ^T}\), \({o_{\min }} = {\left[ {\begin{array}{*{20}{c}} {{ - 0.38\,[pixels]}}&{0\,[pixels]}&\sim&\sim&\sim \end{array}} \right] ^T}\) (where \(\sim \) denotes an unconstrained entry), \({u_{\max }} = {\left[ {\begin{array}{*{20}{c}} {0.2\,[m/\sec ]}&{0.2\,[rad/\sec ]} \end{array}} \right] ^T}\), \({u_{\min }} = {\left[ {\begin{array}{*{20}{c}} { - 0.2\,[m/\sec ]}&{ - 0.2\,[rad/\sec ]} \end{array}} \right] ^T}\).

To demonstrate the effect of model uncertainty, we additionally conduct experiments on a road with random bumps to verify the effectiveness of the proposed method under environmental uncertainty. For clarity, we use O-GPMPC to denote the GPMPC experiments conducted in the randomly bumpy environment. The experimental results of GPMPC, O-GPMPC, T-MPC and Q-GPMPC are shown in Figs. 9, 10, 11 and 12. In Fig. 9, the blue symbol ‘+’ represents the initial position, the red symbol ‘+’ the desired position, and the black symbol ‘+’ the final position. In Fig. 10, the solid circles represent the desired positions of the corresponding trajectories, and the dashed box denotes the visibility constraints. For brevity, only the root mean square error of the camera features is displayed in Fig. 11 instead of all feature errors [26]. When the system stabilizes and reaches the desired position, the motion trajectory of T-MPC in subgraph (c) of Figs. 9 and 10 is clearly closer to the constraint boundary than that of GPMPC in subgraph (a) of Figs. 9 and 10, which shows that the constraint-handling performance of GPMPC is better than that of T-MPC. The GPMPC results are similar to the simulations. Figure 11 shows that Q-GPMPC has a relatively large static error, whereas the GPMPC method with the Lorentzian \(\rho \)-function has a smaller one. Specifically, the root mean square feature error and \(\theta \) error of Q-GPMPC are \(0.025\,[pixels]\) and \(-0.086\,[rad]\), while GPMPC achieves \(0.004\,[pixels]\) and \(-0.016\,[rad]\). This indicates that Q-GPMPC performs worse than GPMPC. In the bumpy environment, the feature points undergo significant fluctuations in the FOV. Subgraph (b) of Figs. 9 and 10, together with Fig. 11, demonstrates that the proposed GPMPC controller can still move the WMR to the desired position, although some oscillation is present during the movement, indicating that the proposed method is robust to environmental uncertainty. Figure 12 shows the commanded and actual velocities for these control methods. The Lorentzian \(\rho \)-function effectively reduces static errors, but it can also cause the WMR to oscillate when approaching the desired pose, which is consistent with the simulation results. Subgraphs (g) and (h) in Fig. 12 show persistent small oscillations of the commanded and actual velocities around the desired point; these oscillations are caused by the large static error of Q-GPMPC. Therefore, only the first 800 steps of data are analyzed for the Q-GPMPC method.

The experiments show that the proposed GPMPC method effectively handles model uncertainty, reduces static errors and satisfies the visibility constraints in WMR visual servoing tasks. Furthermore, it demonstrates robustness to environmental uncertainties.

6 Conclusion and future works

This paper proposes a GPMPC approach to deal with the IBVS problem for WMRs with unknown actuator dynamics. First, a stochastic GP-enhanced model is learned on-line to approximate the real model using the GP method. Then, an SMPC formulation is presented in which chance constraints on the state are considered to ensure visibility of the feature point. Moreover, the Taylor expansion technique is utilized to approximate the uncertainty propagation in the multi-step forward state prediction. Based on the approximation results, the SMPC problem is transformed into a DMPC problem, which is solved by iLQR. Finally, two comparison simulations and experiments are given to verify the validity of the proposed GPMPC method. Note that iLQR-based MPC requires derivative information and Taylor expansion approximations. In contrast, the MPPI control framework, which does not require the calculation of gradients or second-order approximations, can be applied to the real system in real time more easily. In future work, we will use the MPPI strategy to improve the model predictive control algorithm and verify its effectiveness on the actual robot.