
1 Introduction

Model predictive control (MPC) is widely used in industry due to its ability to handle constraints and uncertainties. However, traditional nominal MPC may occasionally yield poor control quality when serious disturbances occur, because it does not account for the uncertainty [1, 2]. Robust MPC, which presumes the uncertainty is bounded, can guarantee constraint satisfaction at all times by considering only the worst-case uncertainty. However, it does not exploit the statistical properties of the uncertainty, even though this information is available in many cases. As a consequence, robust approaches may be overly conservative [3, 4]. In many real-world applications, a certain probability of constraint violation is acceptable. Stochastic model predictive control (SMPC), which takes a priori knowledge of the uncertainty into account through chance constraints, is therefore less conservative in enforcing constraints [5,6,7,8].

The majority of existing SMPC algorithms ensure closed-loop constraint satisfaction by relying on worst-case bounds corresponding to the prescribed chance constraints. Although computing the constraint tightening offline reduces the computational complexity, it introduces intrinsic conservatism because past constraint violations and the current uncertainty are ignored. Several approaches have been proposed to cope with this issue. In [9], the authors develop a recursively feasible MPC scheme that explicitly takes past constraint violations into account to adaptively scale the tightening parameters. The work [10] exploits the observed constraint violations to adaptively scale the tightening parameters and rigorously analyzes the convergence of the amount of constraint violations using stochastic approximation. For linear systems under multiplicative and possibly unbounded model uncertainty, the work [11] presents a stochastic model predictive control algorithm in which the probabilistic constraints are reformulated in deterministic terms by means of the Cantelli inequality. In [12], a recursively feasible SMPC scheme is designed by explicitly taking the past time-averaged amount of constraint violations into account when determining the current control input.

Another way to reduce the conservatism of MPC schemes is to model the uncertainties with statistical machine learning methods based on prior knowledge [13,14,15]. Gaussian process (GP) regression is particularly attractive because it provides the variance of the uncertainty in addition to its mean, both of which can be incorporated into MPC to improve performance [16,17,18,19].

In this paper, we propose a Gaussian process based SMPC (GP-SMPC) scheme for linear time-invariant (LTI) systems subject to bounded, state-dependent additive uncertainties. The GP models of the uncertainties are trained offline on previously collected data. Conditioned on the current state and input, the learned GP model predicts the mean and variance of the future uncertainty. The key contribution of this work is that the predicted uncertainty information is used to adaptively scale the tightening parameters of the system constraints, achieving less conservatism.

The remainder of this paper is organized as follows. The time-varying tube-based SMPC is introduced in Sect. 2. Section 3 proposes the GP-SMPC scheme, which mainly consists of the uncertainty modeling and the constraint tightening. In Sect. 4, numerical simulations are given. Section 5 concludes the paper.

Notations

\({x}_{k|t}\) represents the \(k\)-step-ahead prediction of \(x\) at time \(t\).

\({\mathbb{R}}\) denotes the set of reals, \({\mathbb{N}}_{i}\) denotes the set of integers greater than or equal to \(i\), and \({\mathbb{N}}_{i}^{j}\) denotes the set of consecutive integers \(\left\{i,\cdots ,j\right\}\).

\(\mathrm{Pr}(X)\) stands for the probability of an event \(X\).

The Minkowski sum is denoted by \(A\oplus B=\left\{a+b|a\in A, b\in B\right\}\).

The Pontryagin set difference is represented by \(A\ominus B=\left\{a\in A|a+b\in A, \forall b\in B\right\}\).
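For axis-aligned boxes, the two set operations above reduce to interval arithmetic. The sketch below illustrates this special case; the box representation and function names are illustrative, not from the paper.

```python
import numpy as np

def minkowski_sum(lo_a, hi_a, lo_b, hi_b):
    """A ⊕ B for axis-aligned boxes: intervals add elementwise."""
    return lo_a + lo_b, hi_a + hi_b

def pontryagin_diff(lo_a, hi_a, lo_b, hi_b):
    """A ⊖ B for axis-aligned boxes: a + b must stay in A for every b in B,
    so A is shrunk by B's extent on each side."""
    lo, hi = lo_a - lo_b, hi_a - hi_b
    assert np.all(lo <= hi), "difference is empty"
    return lo, hi
```

For general polytopes, these operations are provided by toolboxes such as MPT3, which is used in Sect. 4.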

2 Time-Varying Tube-Based SMPC

Consider a discrete LTI system subject to additive uncertainties

$${x}_{k+1}=A{x}_{k}+B{u}_{k}+{w}_{k},$$
(1)

where \({x}_{k}\in {\mathbb{R}}^{n}\) and \({u}_{k}\in {\mathbb{R}}^{m}\) are the state and input at time \(k\), respectively. The uncertainties \({w}_{k}\in {\mathbb{W}}\subset {\mathbb{R}}^{n}\), which may represent unmodeled nonlinearities and/or external disturbances, are bounded and state-dependent. Moreover, system (1) is subject to the following constraints on states and inputs

$$\mathrm{Pr}\left({x}_{k+1}\in {\mathbb{X}}\right)\ge 1-\epsilon , k\in {\mathbb{N}}_{0},$$
(2a)
$${u}_{k}\in {\mathbb{U}}, k\in {\mathbb{N}}_{0}.$$
(2b)

The nominal system neglecting the uncertainty part is defined as

$${s}_{k+1}=A{s}_{k}+B{v}_{k},$$
(3)

where the nominal state \({s}_{k}\in {\mathbb{R}}^{\mathrm{n}}\) and the nominal open loop input \({v}_{k}\in {\mathbb{R}}^{\mathrm{m}}\).

The error between observed state \({x}_{k}\) and nominal state \({s}_{k}\) is defined as

$${e}_{k}={x}_{k}-{s}_{k},$$
(4)

One of the commonly used control policies in robust tube MPC is

$${u}_{k}={Ke}_{k}+{v}_{k},$$
(5)

where the feedback gain \(K\) is obtained by LQR optimization for the nominal dynamics (3), such that \({A}_{cl}=A+BK\) is Schur stable.

Then the system dynamics in (1) can be decoupled into a nominal dynamics and an error dynamics as

$${s}_{k+1}=A{s}_{k}+B{v}_{k},$$
(6a)
$${e}_{k+1}={A}_{cl}{e}_{k}+{w}_{k},$$
(6b)

The error dynamics (6b) will be used for constraint tightening.
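The gain computation and the decomposition (6) can be checked numerically. The sketch below uses the example system from Sect. 4 and SciPy's discrete-time Riccati solver; the tooling is an assumption, not the authors' implementation.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[1.6, 1.1], [-0.7, 1.2]])  # example system from Sect. 4
B = np.array([[1.0], [1.0]])
Q, R = np.eye(2), np.eye(1)

# LQR gain for the nominal dynamics (3), with u = K e + v as in (5)
P = solve_discrete_are(A, B, Q, R)
K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
A_cl = A + B @ K
assert np.max(np.abs(np.linalg.eigvals(A_cl))) < 1.0  # Schur stable

# verify the decoupling (6): x = s + e propagates consistently
rng = np.random.default_rng(0)
x = s = np.array([1.0, -1.0])
e = x - s
for _ in range(5):
    v, w = rng.normal(size=1), 0.01 * rng.normal(size=2)
    u = K @ e + v
    x = A @ x + B @ u + w          # true dynamics (1)
    s = A @ s + B @ v              # nominal dynamics (6a)
    e = A_cl @ e + w               # error dynamics (6b)
    assert np.allclose(x, s + e)
```

The final assertion confirms that the nominal and error subsystems jointly reproduce the true closed-loop state exactly.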

Suppose that a polytope \(\mathcal{E}\subset {\mathbb{W}}\) is a confidence region of probability level \(1-\upepsilon\) for the uncertainty, that is,

$$\mathrm{Pr}\left({w}_{k}\in \mathcal{E}\right)\ge 1-\upepsilon,$$
(7)

where \(\upepsilon \in \left(0, 1\right)\).

Since the error dynamics (6b) is linear and the uncertainty \(w\in {\mathbb{W}}\), the propagation set of the uncertainty, \({e}_{k}\in {\mathcal{W}}_{k}\), evolves as

$${\mathcal{W}}_{k+1}={A}_{cl}{\mathcal{W}}_{k}\oplus {\mathbb{W}}, k\in {\mathbb{N}}_{0},$$
(8)

where \({\mathcal{W}}_{0}={\mathbb{W}}\). It follows that \({\mathcal{W}}_{k}=\sum_{i=0}^{k}\oplus {A}_{cl}^{i}{\mathbb{W}}\), \(k\in {\mathbb{N}}_{0}\).
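The recursion (8) can be evaluated exactly with polytope software. A lightweight alternative, sketched below, over-approximates each \({\mathcal{W}}_{k}\) by a symmetric box, using the standard interval bound \(|A_{cl}|r\) for the image of a box of half-widths \(r\); this is an outer approximation, not the exact set.

```python
import numpy as np

def propagate_box(A_cl, r_w, k_max):
    """Outer box approximation of W_k = A_cl W_{k-1} ⊕ W from Eq. (8).
    r_w holds the half-widths of the symmetric box W; |A_cl| @ r bounds
    the image of a box of half-widths r under the map A_cl."""
    radii = [np.asarray(r_w, dtype=float)]   # W_0 = W
    for _ in range(k_max):
        radii.append(np.abs(A_cl) @ radii[-1] + r_w)
    return radii

# with a Schur-stable A_cl (here a simple contraction) the boxes converge,
# approximating the set Z used later in Eq. (12)
radii = propagate_box(0.5 * np.eye(2), np.array([0.1, 0.1]), 30)
```

For the diagonal contraction above the half-widths converge to \(0.1/(1-0.5)=0.2\), mirroring the geometric series \(\sum_i A_{cl}^i \mathbb{W}\).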

Construct the tightened propagation set of uncertainty as

$${\mathcal{D}}_{k}={A}_{cl}{\mathcal{W}}_{k-1}\oplus \mathcal{E}, k\in {\mathbb{N}}_{1},$$
(9)

Then

$$\mathrm{Pr}\left({e}_{k}\in {\mathcal{D}}_{k}\right)\ge 1-\epsilon , k\in {\mathbb{N}}_{0},$$
(10)

follows from (7), where \({\mathcal{D}}_{0}=\mathcal{E}\).

Define the time-varying tightened state constraint set as

$${\mathcal{C}}_{k}={\mathbb{X}}\ominus {\mathcal{D}}_{k}, k\in {\mathbb{N}}_{0}.$$
(11)

If \({s}_{k}\in {\mathcal{C}}_{k}\), then \(\mathrm{Pr}\left({x}_{k}={s}_{k}+{e}_{k}\in {\mathbb{X}}\right)\ge 1-\upepsilon\) holds, that is, the satisfaction of the chance constraint (2a) is guaranteed by (10) and (11).

Define the tightened input constraint set

$$\mathcal{V}={\mathbb{U}}\ominus \mathrm{K}\mathcal{Z},$$
(12)

where \(\mathcal{Z}=\sum_{i=0}^{\infty }\oplus {A}_{cl}^{i}{\mathbb{W}}\) and \({e}_{k}\in \mathcal{Z}\) for \(k\in {\mathbb{N}}_{0}\). If \({v}_{k}\in \mathcal{V}\), then the hard constraint (2b), \({u}_{k}={v}_{k}+K{e}_{k}\in {\mathbb{U}}\), is guaranteed by (12).

Define terminal constraint set

$${\mathcal{X}}_{f}=\left\{s\in {\mathbb{R}}^{n}: {A}_{cl}^{k}s\in {\mathbb{X}}\ominus \mathcal{Z}, K{A}_{cl}^{k}s\in \mathcal{V}, k\in {\mathbb{N}}_{0}\right\}.$$
(13)

The finite horizon optimal control problem to be solved at each time instant \(t\) is as follows:

$$\underset{{s}_{0|t},{v}_{0|t},\cdots ,{v}_{N-1|t}}{\mathrm{min}}{\sum}_{k=0}^{N-1} \left({\Vert {s}_{k|t}\Vert }_{Q}^{2}+{\Vert {v}_{k|t}\Vert }_{R}^{2}\right)+{\Vert {s}_{N|t}\Vert }_{P}^{2}$$
$${\rm{s.t.}}\,\,s_{k+1|t}=A{s}_{k|t}+B{v}_{k|t},$$
$${s}_{k|t}\in {\mathcal{C}}_{k+t}, k\in {\mathbb{N}}_{1}^{N-1},$$
$${v}_{k|t}\in \mathcal{V}, k\in {\mathbb{N}}_{0}^{N-1},$$
(14)
$${x}_{t}- {s}_{0|t}\in {\mathcal{W}}_{t},$$
$${s}_{N|t}\in {\mathcal{X}}_{f}.$$
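A minimal sketch of problem (14) follows, assuming all constraint sets are axis-aligned boxes and using a stable double-integrator placeholder model (not the paper's example system). The nominal states are eliminated by forward simulation of (3), and the resulting program is handed to SciPy's SLSQP solver; a practical implementation would use a dedicated QP solver instead.

```python
import numpy as np
from scipy.optimize import minimize

A = np.array([[1.0, 0.1], [0.0, 1.0]])   # placeholder model, not the paper's
B = np.array([[0.005], [0.1]])
n, m, N = 2, 1, 6
Q, R, P = np.eye(n), np.eye(m), np.eye(n)
x_t = np.array([1.0, 0.0])               # current measured state
box_c, v_max, w_rad, xf_rad = 5.0, 1.0, 0.1, 5.0  # placeholder C_k, V, W_t, X_f

def rollout(z):
    """Decision vector z = (s_{0|t}, v_{0|t},...,v_{N-1|t}); states follow (3)."""
    s0, v = z[:n], z[n:].reshape(N, m)
    s = [s0]
    for k in range(N):
        s.append(A @ s[-1] + B @ v[k])
    return np.array(s), v

def cost(z):
    s, v = rollout(z)
    stage = sum(s[k] @ Q @ s[k] + v[k] @ R @ v[k] for k in range(N))
    return stage + s[N] @ P @ s[N]

cons = [
    {"type": "ineq", "fun": lambda z: box_c - np.abs(rollout(z)[0][1:N]).ravel()},  # s_k in C_k
    {"type": "ineq", "fun": lambda z: v_max - np.abs(z[n:])},                       # v_k in V
    {"type": "ineq", "fun": lambda z: w_rad - np.abs(x_t - z[:n])},                 # x_t - s_0 in W_t
    {"type": "ineq", "fun": lambda z: xf_rad - np.abs(rollout(z)[0][N])},           # s_N in X_f
]
res = minimize(cost, np.concatenate([x_t, np.zeros(N * m)]), constraints=cons)
```

The initial guess \(s_{0|t}=x_t\), \(v\equiv 0\) is feasible for this placeholder setup, so the solver only has to improve the cost.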

3 SMPC Using Gaussian Process Regression

Since the uncertainty is state-dependent, it is conservative to formulate the confidence region \(\mathcal{E}\) based on the maximum amplitude of the uncertainty. In this section, a Gaussian process regression method is proposed to address this issue.

3.1 Gaussian Process Regression

Consider a training set \(\left\{\left({x}_{i},{y}_{i}\right), i=1,2,\cdots ,M\right\}\), where \({x}_{i}\in {\mathbb{R}}^{d}\) and \({y}_{i}\in {\mathbb{R}}\). The Gaussian process regression (GPR) model learns a function \(f\left(x\right)\) mapping the input vector \(x\) to the observed output value \(y\), given by \(y=f\left(x\right)+w\), where \(w\sim \mathcal{N}\left(0, {\sigma }_{n}^{2}\right)\). The observed output values are normally distributed, \({\varvec{y}}\sim \mathcal{N}\left(\mu \left(X\right), K(X, X)\right)\), where the mean vector is \(\mu \left(X\right)={\left[\mu \left({x}_{1}\right),\cdots ,\mu \left({x}_{M}\right)\right]}^{T}\) and the covariance matrix is

$$K\left(X, X\right)=\left[\begin{array}{cc}\begin{array}{cc}c\left({x}_{1},{x}_{1}\right)& c\left({x}_{1},{x}_{2}\right)\\ c\left({x}_{2},{x}_{1}\right)& c\left({x}_{2},{x}_{2}\right)\end{array}& \begin{array}{cc}\cdots & c\left({x}_{1},{x}_{M}\right)\\ \cdots & c\left({x}_{2},{x}_{M}\right)\end{array}\\ \begin{array}{cc}\vdots & \vdots \\ c\left({x}_{M},{x}_{1}\right)& c\left({x}_{M},{x}_{2}\right)\end{array}& \begin{array}{cc}\ddots & \vdots \\ \cdots & c\left({x}_{M},{x}_{M}\right)\end{array}\end{array}\right],$$
(15)

and \(c\left({x}_{i},{x}_{j}\right)\) is the covariance of \({x}_{i}\) and \({x}_{j}\), which can be any positive definite function. A frequently used covariance function, the squared exponential (SE) kernel, is defined as

$$c\left({x}_{i},{x}_{j}\right)={\sigma }_{f}^{2}\mathrm{exp}\left(-\frac{1}{2}{\left({x}_{i}-{x}_{j}\right)}^{T}L\left({x}_{i}-{x}_{j}\right)\right)+{\sigma }_{n}^{2},$$
(16)

where \(L=\mathrm{diag}\left(\left[{l}_{1},\cdots ,{l}_{d}\right]\right)\); \(L\), \({\sigma }_{f}\), and \({\sigma }_{n}\) are the hyperparameters of the covariance function.

The training outputs \({\varvec{y}}\) and a predicted output \({y}^{*}\) corresponding to a test input \({x}^{*}\) are jointly Gaussian distributed:

$$\left[\begin{array}{c}{\varvec{y}}\\ {y}^{*}\end{array}\right]\sim \mathcal{N}\left(0, \left[\begin{array}{cc}K\left(X, X\right)& k{\left({x}^{*}, X\right)}^{T}\\ k\left({x}^{*}, X\right)& k\left({x}^{*}, {x}^{*}\right)\end{array}\right]\right),$$
(17)

where \(k\left({x}^{*}, X\right)=\left[c\left({x}^{*},{x}_{1}\right), c\left({x}^{*},{x}_{2}\right),\cdots ,c\left({x}^{*},{x}_{M}\right)\right]\), \(k\left({x}^{*}, {x}^{*}\right)=c\left({x}^{*}, {x}^{*}\right)\).

Following the Bayesian modeling framework, the posterior distribution of \({y}^{*}\) conditioned on the observations can be obtained; the result is still Gaussian, with \({y}^{*}|{\varvec{y}}\sim \mathcal{N}\left(\mu \left({x}^{*}\right), {\sigma }^{2} \left({x}^{*}\right)\right)\) and

$$\mu \left({x}^{*}\right)= k\left({x}^{*}, X\right){K\left(X, X\right)}^{-1}y,$$
(18a)
$${\sigma }^{2} \left({x}^{*}\right)= k\left({x}^{*}, {x}^{*}\right)-k\left({x}^{*}, X\right){K\left(X, X\right)}^{-1}{k\left({x}^{*}, X\right)}^{T}.$$
(18b)
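Equations (18a)–(18b) translate directly into a few lines of linear algebra. The sketch below assumes a zero prior mean and the SE kernel (16) with a single shared lengthscale, and adds the noise variance on the diagonal of the training covariance.

```python
import numpy as np

def se_kernel(Xa, Xb, sigma_f=1.0, lengthscale=1.0):
    """Squared-exponential covariance, Eq. (16) without the noise term."""
    d2 = np.sum((Xa[:, None, :] - Xb[None, :, :]) ** 2, axis=-1)
    return sigma_f ** 2 * np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior(X, y, x_star, sigma_n=0.1):
    """Posterior mean and variance at x_star, Eqs. (18a)-(18b)."""
    K = se_kernel(X, X) + sigma_n ** 2 * np.eye(len(X))  # noisy training covariance
    k_s = se_kernel(x_star[None, :], X)                  # k(x*, X), 1 x M
    mean = k_s @ np.linalg.solve(K, y)                   # Eq. (18a)
    var = se_kernel(x_star[None, :], x_star[None, :]) \
          - k_s @ np.linalg.solve(K, k_s.T)              # Eq. (18b)
    return float(mean), float(var)
```

As expected of a GP, the posterior variance is small near the training inputs and reverts to the prior variance \({\sigma}_{f}^{2}\) far from the data.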

3.2 GP Model of Uncertainty

The learned GPR model depends on measurement data collected from previous experience. The model input and output are the state-control tuple \({z}_{k}=\left({x}_{k}; {u}_{k}\right)\) and the corresponding uncertainty \({w}_{k}\), respectively. The uncertainty at time \(k\) is

$${w}_{k}={x}_{k+1}-\left(A{x}_{k}+B{u}_{k}\right).$$
(19)

The data pair \(\left({z}_{k},{w}_{k}\right)\) represents an individual experience. Given a collected dataset \(\mathfrak{D}=\left\{{\varvec{z}},{\varvec{w}}\right\}\) and a test data pair \(\left({z}^{*},{w}^{*}\right)\), the joint Gaussian distribution is

$$\left[\begin{array}{c}w\\ {w}^{*}\end{array}\right]\sim \mathcal{N}\left(0, \left[\begin{array}{cc}K\left(z, z\right)& k{\left({z}^{*}, z\right)}^{T}\\ k\left({z}^{*}, z\right)& k\left({z}^{*}, {z}^{*}\right)\end{array}\right]\right).$$
(20)

The posterior distribution of \({w}^{*}\) is still Gaussian

$${w}^{*}|\mathfrak{D}\sim \mathcal{N}\left(\mu \left({z}^{*}\right), {\sigma }^{2} \left({z}^{*}\right)\right)$$
(21)

with mean and variance as follows

$$\mu \left({z}^{*}\right)= k\left({z}^{*}, z\right){K\left(z, z\right)}^{-1}w,$$
(22a)
$${\sigma }^{2} \left({z}^{*}\right)= k\left({z}^{*}, {z}^{*}\right)-k\left({z}^{*}, z\right)K{\left(z, z\right)}^{-1}{k\left({z}^{*}, z\right)}^{T}.$$
(22b)

Since \(w\in {\mathbb{R}}^{n}\), \(n\) separate GP models are trained, one for each dimension of \(w\). The optimal hyperparameters of each GP model are obtained offline by maximizing the log marginal likelihood of the collected data sets [20].
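Maximizing the log marginal likelihood can be sketched as follows, optimizing \((\sigma_f, l, \sigma_n)\) in log space with SciPy. This is a generic GP training recipe in the spirit of [20], not the authors' specific implementation; a small jitter is added for numerical stability.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_marginal_likelihood(log_params, X, y):
    """Negative log marginal likelihood of a zero-mean GP with SE kernel;
    hyperparameters (sigma_f, lengthscale, sigma_n) live in log space."""
    sf, ls, sn = np.exp(log_params)
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = sf ** 2 * np.exp(-0.5 * d2 / ls ** 2) + (sn ** 2 + 1e-8) * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(L))) + 0.5 * len(y) * np.log(2 * np.pi)

def fit_gp_hyperparameters(X, y, init=(1.0, 1.0, 0.1)):
    """One independent GP per output dimension of w is fitted this way."""
    res = minimize(neg_log_marginal_likelihood, np.log(init), args=(X, y))
    return np.exp(res.x)
```

In the GP-SMPC setting, `X` holds the state-control tuples \(z_k\) and `y` one component of the uncertainty \(w_k\) from (19).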

3.3 Adaptive Constraints

Define the prediction model as

$${\tilde{x }}_{k+1}=A{\tilde{x }}_{k}+B{\tilde{u }}_{k}+{\tilde{w }}_{k},$$
(23)

where \({\tilde{x }}_{k}\) denotes the predicted state, \({\tilde{u }}_{k}\) the predicted input, and \({\tilde{w }}_{k}\) the predicted uncertainty. Given the trained GP models, the distribution of \({\tilde{w }}_{k}\) corresponding to \(\left({\tilde{x }}_{k}; {\tilde{u }}_{k}\right)\) can be obtained as

$${\tilde{w }}_{k}\sim \mathcal{N}({\stackrel{\sim }{\mu }}_{k},{\stackrel{\sim }{\sigma }}_{k}^{2}),$$
(24)

where \({\stackrel{\sim }{\mu }}_{k}\) and \({\stackrel{\sim }{\sigma }}_{k}^{2}\) are computed by (22a) and (22b).

Define the confidence region of the predicted uncertainty with the probability level \(1-\upepsilon\) as

$${\stackrel{\sim }{\mathcal{E}}}_{k}=\left\{{\stackrel{\sim }{\mu }}_{k}-\alpha {\stackrel{\sim }{\sigma }}_{k}\le {\tilde{w }}_{k}\le {\stackrel{\sim }{\mu }}_{k}+\alpha {\stackrel{\sim }{\sigma }}_{k}\right\}, k\in {\mathbb{N}}_{0},$$
(25)

where \(\alpha\) is the quantile value corresponding to the probability level \(1-\upepsilon\).
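For a Gaussian predictive distribution, the two-sided quantile can be computed with SciPy. Treating each dimension separately, as below, is an assumption; a joint confidence region over all dimensions would require a stricter per-dimension level.

```python
from scipy.stats import norm

eps = 0.2  # allowed violation probability, matching Sect. 4
# alpha such that Pr(mu - alpha*sigma <= w <= mu + alpha*sigma) = 1 - eps
alpha = norm.ppf(1.0 - eps / 2.0)
```

With \(\epsilon = 0.2\), this gives \(\alpha \approx 1.28\), a much tighter region than one built from the worst-case amplitude of the uncertainty.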

According to (9), the more stringent propagation set of uncertainty is

$${\stackrel{\sim }{\mathcal{D}}}_{k}={A}_{cl}{\mathcal{W}}_{k-1}\oplus {\stackrel{\sim }{\mathcal{E}}}_{k-1}, k\in {\mathbb{N}}_{1},$$
(26)

Then

$$\mathrm{Pr}\left({\tilde{e }}_{k}\in {\stackrel{\sim }{\mathcal{D}}}_{k}\right)\ge 1-\epsilon , k\in {\mathbb{N}}_{1},$$
(27)

follows.

Construct the adaptively time-varying state constraint set as

$${\stackrel{\sim }{\mathcal{C}}}_{k}={\mathbb{X}}\ominus {\stackrel{\sim }{\mathcal{D}}}_{k}, k\in {\mathbb{N}}_{1}.$$
(28)

If \({s}_{k}\in {\stackrel{\sim }{\mathcal{C}}}_{k}\), then the chance constraint \(\mathrm{Pr}\left({x}_{k}={s}_{k}+{e}_{k}\in {\mathbb{X}}\right)\ge 1-\upepsilon\) is satisfied.

Define the tightened input constraint set \(\mathcal{V}={\mathbb{U}}\ominus \mathrm{K}\mathcal{Z}\) as in (12). If \({v}_{k}\in \mathcal{V}\), then the satisfaction of the hard constraint \({u}_{k}={v}_{k}+K{e}_{k}\in {\mathbb{U}}\) is guaranteed.

3.4 Gaussian Process Based SMPC

Building on the time-varying tube-based SMPC and the GP-based uncertainty prediction, the Gaussian process based stochastic optimal control problem to be solved at each time instant \(t\) is as follows:

$$\underset{{s}_{0|t},{v}_{0|t},\cdots ,{v}_{N-1|t}}{\mathrm{min}}\;{\sum}_{k=0}^{N-1} \left({\Vert {s}_{k|t}\Vert }_{Q}^{2}+{\Vert {v}_{k|t}\Vert }_{R}^{2}\right)+{\Vert {s}_{N|t}\Vert }_{P}^{2}$$
$$\mathrm{s}.\mathrm{t}.\;{\tilde{x }}_{0|t}={x}_{t},$$
$${\tilde{x }}_{k+1|t}=A{\tilde{x }}_{k|t}+B{\tilde{u }}_{k|t}+{\tilde{w }}_{k|t},$$
$${\tilde{u }}_{k|t}=K\left({\tilde{x }}_{k|t}-{s}_{k|t}\right)+{v}_{k|t},$$
$${\tilde{w }}_{k|t}({\tilde{x }}_{k|t}, {\tilde{u }}_{k|t})\sim \mathcal{N}({\stackrel{\sim }{\mu }}_{k|t},{\stackrel{\sim }{\sigma }}_{k|t}^{2}),$$
$${\stackrel{\sim }{\mathcal{C}}}_{k+t|t}\;\mathrm{generated\;by}\;(24)-(28),$$
(29)
$${s}_{k+1|t}=A{s}_{k|t}+B{v}_{k|t},$$
$${s}_{k|t}\in {\stackrel{\sim }{\mathcal{C}}}_{k+t|t}, k\in {\mathbb{N}}_{1}^{N-1},$$
$${v}_{k|t}\in \mathcal{V}, k\in {\mathbb{N}}_{0}^{N-1},$$
$${x}_{t}- {s}_{0|t}\in {\mathcal{W}}_{t},$$
$${s}_{N|t}\in {\mathcal{X}}_{f}.$$

The solution of the optimal control problem yields the optimal initial nominal state \({s}_{0|t}^{*}\) and the optimal input sequence

$${v}^{*}\left({x}_{t}\right)=\left[{v}_{0|t}^{*},\cdots ,{v}_{N-1|t}^{*}\right].$$
(30)

The associated optimal state sequence for the nominal system is

$${s}^{*}\left({x}_{t}\right)=\left[{s}_{0|t}^{*},\cdots ,{s}_{N|t}^{*}\right].$$
(31)

Using the first entry of the optimal input sequence and the optimal initial state, the optimal control law is designed as

$${u}^{*}\left({x}_{t}\right)=K\left({x}_{t}-{s}_{0|t}^{*}\right)+{v}_{0|t}^{*}.$$
(32)

Applying \({u}^{*}\left({x}_{t}\right)\) to system (1) yields the new state

$${x}_{t+1}=A{x}_{t}+B{u}^{*}\left({x}_{t}\right)+{w}_{t}.$$
(33)

Based on the new state \({x}_{t+1}\), the entire process of GP based SMPC is repeated at time \(t+1\), yielding a receding horizon control strategy.
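The receding-horizon procedure can be summarized schematically. Here `gp_predict`, `tighten_constraints`, `solve_ocp`, and `plant` are placeholders for the components described above, not functions from the paper.

```python
import numpy as np

def gp_smpc_loop(x0, T, gp_predict, tighten_constraints, solve_ocp, plant):
    """Schematic receding-horizon loop of GP-SMPC."""
    x = np.asarray(x0, dtype=float)
    traj = [x]
    for _ in range(T):
        mu, sigma = gp_predict(x)                  # Eq. (24): predicted uncertainty
        C_tilde = tighten_constraints(mu, sigma)   # Eqs. (25)-(28): adaptive sets
        s0, v0, K = solve_ocp(x, C_tilde)          # problem (29)
        u = K @ (x - s0) + v0                      # control law (32)
        x = plant(x, u)                            # system update (33)
        traj.append(x)
    return traj
```

Plugging in trivial placeholders (e.g. a stable plant and a solver that returns the current state and zero input) already exercises the loop structure end to end.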

4 Numerical Simulation

In this section, the chance constraint satisfaction of the GP-SMPC scheme is compared with that of nominal MPC, robust MPC, and time-varying tube-based SMPC. In the simulations, the polytopes \(\mathcal{C}\), \(\mathcal{V}\), \(\mathcal{D}\), and \(\mathcal{Z}\) are computed using the MPT3 toolbox.

To examine the constraint violation of the GP-SMPC scheme, a discrete LTI system subject to a state-dependent additive uncertainty perturbed by truncated normally distributed noise is designed as

$${x}_{k+1}=\left[\begin{array}{cc}1.6& 1.1\\ -0.7& 1.2\end{array}\right]{x}_{k}+\left[\begin{array}{c}1\\ 1\end{array}\right]{u}_{k}+{w}_{k},$$

The state and input constraints are \(\mathrm{Pr}\left({x}_{k}\in {\mathbb{X}}\right)\ge 0.8\) and \({u}_{k}\in {\mathbb{U}}\), respectively.

$${\mathbb{X}}\triangleq \left\{x\in {\mathbb{R}}^{2}: \left[\begin{array}{c}-10\\ -2\end{array}\right]\le x\le \left[\begin{array}{c}2\\ 10\end{array}\right]\right\}, {\mathbb{U}}\triangleq \left\{u\in {\mathbb{R}}^{1}:|u|\le 10\right\}.$$

The uncertainty \({w}_{k}\in {\mathbb{W}}\) and

$${\mathbb{W}}\triangleq \left\{w\in {\mathbb{R}}^{2}:{\Vert w\Vert }_{\infty }\le 0.1, w=0.1*\left(\frac{1}{1+|u|{e}^{-x}}-0.5\right)+\mathcal{N}\left(0, {0.015}^{2}{I}_{2}\right)\right\}.$$

The weights of the cost function are \(Q={I}_{2}\) and \(R=1\). \(K\) is computed as the LQR feedback gain of the unconstrained optimal control problem \(\left(A,B,Q,R\right)\). The prediction horizon is \(N=6\), the number of simulation steps is \({N}_{\mathrm{sim}}=11\), and the initial state is \({x}_{0}={\left[-6.5, 10.5\right]}^{\mathrm{T}}\).
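The simulated disturbance can be sampled as below. The elementwise interpretation of \(e^{-x}\) and the clipping used to enforce \({\Vert w\Vert }_{\infty }\le 0.1\) are assumptions about the authors' setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_uncertainty(x, u):
    """State-dependent mean (sigmoid shape) plus Gaussian noise, clipped
    so that ||w||_inf <= 0.1 as required by the set W."""
    mean = 0.1 * (1.0 / (1.0 + np.abs(u) * np.exp(-x)) - 0.5)  # elementwise in x
    w = mean + rng.normal(0.0, 0.015, size=np.shape(x))
    return np.clip(w, -0.1, 0.1)
```

At \(x=0\), \(u=0\), the sigmoid term equals \(0.1\cdot(1-0.5)=0.05\), so the disturbance is biased rather than zero-mean, which is exactly the structure the GP model is meant to learn.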

The state constraint violations of the nominal MPC, the robust MPC, the time-varying tube-based SMPC, and the proposed GP-SMPC are illustrated in Figs. 1, 2, 3 and 4. The upper part of each figure shows the closed-loop actual state trajectories of 100 realizations. Because the constraint violations occur around the border during the first several steps, the first 3 steps of the trajectories are enlarged in the lower part of each figure. Table 1 presents the constraint violation ratios of the first three steps and the average ratios over 1000 realizations. From the figures and the table, it can be seen that the average constraint violation over the first 3 steps of the nominal MPC is 100%, i.e., the constraint is violated at every step; that of the robust MPC is 0%, i.e., the constraint is satisfied but with heavy conservatism; that of the time-varying tube-based SMPC is 65.0%, so the conservatism is relieved somewhat; and that of the proposed GP-SMPC is 16.2%, which is close to the 20% specified in advance, yielding less conservatism while satisfying the chance constraint.

Fig. 1.

Closed-loop trajectories of nominal MPC with 100 realizations.

Fig. 2.

Closed-loop trajectories of robust MPC with 100 realizations.

Fig. 3.

Closed-loop trajectories of time-varying SMPC with 100 realizations.

Fig. 4.

Closed-loop trajectories of GP-SMPC with 100 realizations.

Table 1. Constraint violation of MPC algorithms.

5 Conclusion

The proposed GP-SMPC scheme reduces conservatism by adaptively tightening the constraints. Specifically, the stringent propagation set of the uncertainty is obtained using a time-varying confidence region formulated on the basis of Gaussian process prediction. Numerical simulations validate that the chance constraint satisfaction of GP-SMPC is better than that of nominal MPC, robust MPC, and time-varying tube-based SMPC.