Abstract
This paper presents the main principles of a novel stochastic nonlinear model predictive control (NMPC) approach for imperfectly observed discrete-time systems described by time-varying nonlinear non-Gaussian state space models with unknown parameters. A convergent particle estimator of the conditional expectation of a chosen cost function on a given receding control horizon is built, leading to an almost surely (a.s.) epi-convergent estimator of the NMPC cost-to-go criterion. The estimator of the expected cost function relies on simulations and on a recently developed nonparametric convergent particle estimator of a multi-step ahead conditional probability density function (pdf) of the state variables. The theory of stochastic epi-convergence is applied to the estimated cost-to-go criterion to prove the almost sure convergence of the optimal solutions of the approximated NMPC problem to their true counterparts as both the simulation and particle numbers grow to infinity.
1 Introduction
Model predictive control (MPC), an open-loop approach in which a receding control horizon is considered to determine an optimal constrained control input \(u_t\) at each time step t, is now a well-established control synthesis method for deterministic dynamical systems described by linear models with quadratic cost functions, in particular in several industrial fields [8]. Extensions to more realistic situations, such as nonlinear systems and general constraint settings, have given rise to a considerable amount of work in the last twenty years (an almost exhaustive survey of recent developments and future promise in MPC has recently been proposed [36]). For example, to better account for the significant nonlinearities of many real-life systems and to improve control quality, several extensions have progressively led to a set of approaches known as nonlinear model predictive control (NMPC), which try to overcome, more or less, the loss of convexity of the related constrained optimization problems, often by successive linearizations [34]. Even if they allowed some progress in control efficiency, these approaches still face other control difficulties, such as those linked to random effects and/or uncertainties arising from system noise and imperfect state information (indirect measurements). In the last few years these difficulties have induced the development of robust approaches (see [31, 41, 42]) or stochastic approaches (see [30] and the recent survey [37]) and, still scarcely for the unobserved-state case, the introduction of state estimation steps into the control procedure (see particularly [10, 58] and [26]).
However, this last situation had already been considered in the context of operations research and stochastic optimal control, in particular in the case of finite state, observation and control spaces, for partially observed Markov decision processes (countable POMDPs [3]), for which finite- and infinite-horizon problems can be solved by so-called value iteration algorithms [33]. Continuous-state POMDPs are most often approximated as finite POMDPs by discretization [23], which may, however, result in discrete-state POMDPs of huge dimension that are hard to solve numerically. Other approximating approaches have also been proposed to address continuous-state POMDP problems (see for example [50] and [60] for variants of particle-filtering-based approximations).
As a matter of fact, stochastic uncertainty, added to imperfect state information for discrete-time systems in which the state variables \(X_t\) to be controlled are not directly accessible, only output variables \(Y_t\) being observed instead, sets up a critical challenge for NMPC: at each time step t the control to be applied with respect to the given receding horizon has to be determined by the constrained minimization of an expected cost-to-go function that depends on the possible induced future values of the state variables over the receding horizon. These anticipated possible values are ideally summarized by their anticipated probability distributions conditional on past observations and past control values, but also conditional on possible future control values over the receding horizon. This has led to ideal open-loop feedback controllers, which perform at least as well as optimal open-loop policies (see [4, 57]). However, apart from the linear-quadratic-Gaussian (LQG) special case, this theoretical approach raises two critical issues: the determination of the anticipated conditional state distributions, and the successive expected cost-to-go estimations and minimizations. This partitioning of the imperfect-state-information stochastic NMPC problem is the starting point of the development of some tentative suboptimal NMPC controllers, but the relevant literature is still extremely limited and more practically than theoretically oriented: several works rely on approximation of the conditional state distribution through Gaussian mixtures and approximate filtering (for example [56], in the case of a finite set of control inputs and time-varying state and observation system models).
Under perfect knowledge of the system noise distributions, some particle-based approaches use particle collections (see [16]) to replace the conditional probability distributions of the state variables by particle filters (see [15, 22]), in order to estimate the cost-to-go expectations when needed during the successive minimizations (see [46], and more recently [48], which combines the so-called MPC scenario approach and a particle approach). Other works use particle filters for state estimation followed by a subsequent MPC optimization (see [1, 6]), or by another particle-based procedure such as sequential Monte Carlo (SMC) in the prediction/optimization step of the control determination (see [29, 49]). However, the issue of the consistency of all these particle-based estimations remains open. Moreover, when parameter estimation has to be done in parallel with the control of the system, particle depletion phenomena may occur, due to the introduction of the unknown parameters as a state extension, which impairs the control quality [21]. Last but not least, the crucial issue of the stochastic stability of the resulting closed-loop system, as desirable as it may be for ensuring that feasible states are reached, has not yet received a comprehensive response even in its weakest forms (existence of an invariant measure, positive recurrence, etc.), even for stochastic NMPC under complete state information (see the acute recent analysis of [11]), and a fortiori neither in the case of imperfect state information just mentioned, still less when the noise distribution is also unknown.
This paper is devoted to the presentation of the main principles of a novel stochastic NMPC approach for systems described by time-varying nonlinear state space and observation models, with unknown parameters and unknown but simulable system random effects. This approach is based on the use of convergent particle estimators of the multi-step ahead conditional probability density functions (pdf) of the state variables to be controlled [52]. These convergent pdf predictors themselves rely on a new generation of so-called convolution or nonparametric particle filters, free of any particle depletion risk (see [9, 44, 45]). This particle approach, combined with a simulation-based estimation of the expected cost-to-go function, allows the construction of epi-convergent estimators of the successive cost-to-go expectations as the numbers of simulations and of particles grow to infinity. According to epi-convergence theory (see [2, 54, 55, 59]), the epi-convergence of these cost expectation estimators itself ensures the almost sure convergence of the corresponding optimal control estimators to their true counterparts under regularity conditions.
The paper is organised as follows. The nonlinear modelling context to be considered and its assumption set are described in Sect. 2. The setting of the relevant NMPC problem is done in Sect. 3. The almost sure convergence of the particle estimators of the expected costs and of their minimizers to their true respective counterparts is established in Sect. 4. In Sect. 5 a simulated case study of NMPC control in predictive microbiology is presented, which shows the efficiency of the proposed approach. Then, the construction principles of the nonparametric estimator of the multi-step ahead conditional pdf of the state, which is used in the particle generations during the predictive control process, are presented in Appendix 1. They are preceded by a brief, contrasting recall of Monte Carlo particle filters. Finally, Appendix 2 is devoted to the proof of a technical lemma.
2 The Modelling Context
The dynamic systems of interest are supposed to obey general state space models of the form:
in which \(X_t \in I\!\!R^d\) is the vector of the unobserved state variables, \(u_t \in \mathcal {U} \subset I\!\!R^q\) the vector of control variables, \(Y_t \in I\!\!R^s\) the vector of the observed output variables. \(\theta \in \Theta \subset I\!\!R^p\) is a vector of p known or unknown fixed parameters. \(\varepsilon _t\) is a vector of random variables (possibly noises). For all \(t \in I\!\!N^+\), \(f_t\) is a known Borel measurable function, \(G_t\) is an absolutely continuous probability distribution function with respect to the Lebesgue measure with pdf \(p_t^Y(y|x_t,\theta )\). The probability distribution function of the state \(X_t\) at \(t=0\) and the transition distribution functions \(P_{t+1}^X(x|x_t,u_t)\) for \(t \ge 0\) are also supposed to be absolutely continuous w.r.t. the Lebesgue measure. The probability distribution function \(G_t\) and that of \(\varepsilon _t\) are not necessarily known but are supposed to be at least simulable. As a particular case of model (1), the output variable model can be given by a regression equation \(Y_{t+1} = h_{t+1}(X_{t+1}, \theta , \eta _{t+1})\), in which \(h_{t+1}\) is a known Borel measurable function and \(\eta _t\) is a vector of random variables (possibly noises) supposed to be at least simulable.
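To fix ideas, the generic controlled state-space structure above can be sketched in code. This is a purely illustrative toy instance, not the paper's case study: the scalar transition f, the observation equation h, and the Gaussian noises (standing in for the merely simulable distributions of \(\varepsilon _t\) and \(\eta _t\)) are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(t, x, theta, u, eps):
    # Hypothetical nonlinear transition X_{t+1} = f_t(X_t, theta, u_t, eps_t)
    return theta * x / (1.0 + x**2) + u + eps

def h(t, x, eta):
    # Hypothetical observation equation Y_{t+1} = h_{t+1}(X_{t+1}, theta, eta_{t+1})
    return x + eta

def simulate(T, theta, controls, x0=0.0):
    """Roll the model forward: states are hidden, only noisy outputs are recorded."""
    x, xs, ys = x0, [], []
    for t in range(T):
        x = f(t, x, theta, controls[t], rng.normal(0.0, 0.1))
        xs.append(x)
        ys.append(h(t, x, rng.normal(0.0, 0.2)))
    return np.array(xs), np.array(ys)

xs, ys = simulate(50, theta=0.8, controls=np.zeros(50))
```

The controller only ever sees `ys`; `xs` is kept here solely to emphasize the imperfect-state-information setting.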
2.1 Primary Notations
Let
-
\(p_0^X(.)\) : the probability density of the state variable vector X at time \(t=0\), supposed to be known or simulable.
-
\(p_0^{\theta }(.)\) : a given prior density for \(\theta \in \Theta \) when unknown, nonzero at \(\theta ^*\), the true value of the parameters.
-
\({\mathcal L}_{\varepsilon _t}\) : the probability distribution function of \(\varepsilon _t\), at least simulable whatever t.
-
\(x_{1:t} \! := \! (x_1,\ldots , x_t)\), some realizations of the successive state vectors \( X_1,\ldots ,X_t\).
-
\(y_{1:t} := (y_1,\ldots , y_t), \) observed values of the output variables up to time t and \(u_{0:t} := (u_0,\ldots , u_t)\), controls applied until time t.
-
\(c_t(x,u) : \ I\!\!R^d \times I\!\!R^q \ \longrightarrow \ I\!\!R^+,\) the cost function at time t (\(t \in I\!\!N^+\)) for the predictive control problem to be considered, supposed to be continuous in both x and u.
-
\(p_{t+k}^X(x|y_{1:j}, u_{0:t+k-1}), \, 1 \le j \le t, \, k \ge 0 :\) the pdf of the state variables X at time \(t+k\), conditional on the past values \(y_{1:j}\) and \(u_{0:t+k-1}\), supposed to be continuous with respect to \(u_{0:t+k-1}\), and with corresponding probability distribution function denoted \(P_{t+k}^X(x|y_{1:j}, u_{0:t+k-1})\).
The previous probabilistic assumptions ensure the existence of the conditional pdf \(p_{t+k}^X(x|y_{1:j}, u_{0:t+k-1})\). Indeed, with obvious notations:
\( p_{t+k}^X(x|y_{1:j}, u_{0:t+k-1}) = \int p^X(x_{1:t+k}|y_{1:j}, u_{0:t+k-1})dx_1\ldots dx_{t+k-1}.\)
By the Bayes’ rule, \(\displaystyle p^X(x_{1:t+k}|y_{1:j}, u_{0:t+k-1}) \,{=}\, \frac{p^Y(y_{1:j}|x_{1:j})p^X(x_{1:t+k}|u_{0:t+k-1})}{p^Y(y_{1:j}|u_{0:j\,{-}\,1})}\)
with \(p^Y(y_{1:j}|u_{0:j-1}) = \int p^Y(y_{1:j}|x_{1:j})p^X(x_{1:j}|u_{0:j-1})dx_1 \ldots dx_j\),
\(\displaystyle p^Y(y_{1:j}|x_{1:j}) = \Pi _{1 \le \ell \le j} \, p^Y(y_{\ell }|x_{\ell })\) and
\(p^X(x_{1:t+k}|u_{0:t+k-1}) = \int p_0^X(x_0) \, \Pi _{1 \le \ell \le t+k} \, p^X(x_{\ell }|x_{\ell -1},u_{\ell -1})dx_0\), or simply \(\Pi _{1 \le \ell \le t+k} \, p^X(x_{\ell }|x_{\ell -1},u_{\ell -1})\) if \(x_0\) is known.
The computation of \(p_{t+k}^X(x|y_{1:j}, u_{0:t+k-1})\) is intractable in the general case. This pdf is at the core of the predictive control process and will be consistently estimated following a nonparametric particle approach described in Appendix 1.
2.2 Assumptions
-
A1: \(\mathcal {U} \subset I\!\!R^q, \) the set of admissible controls, is supposed to be compact.
-
A2: \(\forall x, \ \forall j: 0 < j \le t\), \(\hbox {E}\Big [c_{t+1}(X_{t+1},u_t)|y_{1:j},u_{0:t}\Big ] \ = \ \displaystyle \int c_{t+1}(x,u_t)p_{t+1}^X\big (x|y_{1:j}, u_{0:t}\big )dx \ < \ \infty \).
Remark 1
As mentioned in the introduction, the closed-loop stability issue in a general stochastic NMPC procedure has not yet received a definitive treatment (if any) and must be examined on a case-by-case basis (i.e. according to each system model setting). This is all the more true as imperfect state information and unknown random effects (noise distributions) add new complexities, as in the situation considered in the present paper. Therefore the nonparametric particle NMPC approach to be developed in the following, and the results therefrom, do not deal with this issue. However, for a given system model, the stability of the closed-loop system this approach leads to can in any case be investigated through more or less severe sufficient conditions on the system model and the control settings, such as the following one (from [18]):
-
A3: \(\displaystyle \sup _{t \in I\!\!N^{+}} \sup _{u \in \mathcal {U}} \hbox {E}_{{\mathcal L}_{\varepsilon _t} }\Big [\big \Vert f_t(x,\theta ,u,\varepsilon _t) \big \Vert ^a \Big ] \ \le \ \alpha \big \Vert x \big \Vert ^a + \beta \), with \(a>1, \ 0 \le \alpha < 1, \ 0 \le \beta < \infty .\)
which implies that the system (1) is stabilized by any admissible control strategy (and in particular an optimal one): there exists a constant \(\kappa \) such that, whatever the initial state probability distribution and whatever the admissible strategy considered, it holds:
$$\begin{aligned} \limsup _{T \rightarrow \infty } \frac{1}{T+1} \sum _{t=0}^T \Vert X_t \Vert ^2 \le \kappa \ \ \ \ \ a.s. \end{aligned}$$Moreover, for all \( \xi > 0\), there exists a compact \(\Phi \) such that:
$$\begin{aligned} \liminf _{T \rightarrow \infty } \frac{1}{T+1}\sum _{t=0}^T 1\!\!1_{[X_t \in \Phi ]} \ge 1-\xi \ \ \ \ \hbox {a.s.} \ \ (\hbox {from }[18]). \end{aligned}$$This sufficient condition will be considered in the proposed case study (Sect. 5).
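As a purely illustrative check (not part of the paper's results), the time-averaged boundedness implied by a drift condition of this kind can be observed empirically on a toy contractive scalar system with a bounded admissible control set; the system, the controller and all constants below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy scalar system x_{t+1} = 0.5*x_t + u_t + eps_t with |u_t| <= 1 (compact U).
# Since (0.5|x|+1)^2 + Var(eps) <= 0.5*x^2 + beta, an A3-type drift holds with
# a = 2, alpha = 0.5 < 1, so the time-averaged squared state should stay bounded.
T = 20000
x = 10.0          # deliberately large initial state
acc = 0.0
for t in range(T):
    u = float(np.clip(-0.5 * x, -1.0, 1.0))   # admissible stabilizing control
    x = 0.5 * x + u + rng.normal(0.0, 0.3)
    acc += x * x
avg = acc / T     # empirical counterpart of (1/(T+1)) * sum ||X_t||^2
```

Here `avg` settles near the noise variance, consistent with the Cesàro bound, although of course a single trajectory is no substitute for the almost-sure statement above.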
3 The Predictive Control Problem
Let us consider the system at time \(j \ge 1\), time instant until which the controls \(\{u_0,u_1,\ldots ,u_{j-1}\}\) have been applied and the observations \(\{y_1,y_2,\ldots ,y_j\}\) have been recorded.
Let us denote
-
\(v_{j:t} := v_j,v_{j+1},\ldots ,v_t\), unknown future controls to be applied until a future time t, \(t~\ge ~j\).
-
\(Y_{j+1:t+1} := Y_{j+1},Y_{j+2},\ldots ,Y_t, Y_{t+1}\), the corresponding future observations.
For a given receding horizon length H, let
-
\( {\mathbf {v}} = v_{j:j+H-1}. \)
-
Define the expected cost-to-go
$$\begin{aligned}&J_H(\mathbf {v}) := \nonumber \\&\quad \hbox {E}\left\{ \sum _{t=j}^{j+H-1} \hbox {E}\Big [c_{t+1}(X_{t+1}, v_t) \Big | y_{1:j},Y_{j+1:t+1},u_{0:j-1},v_{j:t}\Big ]\right\} \nonumber \\&\quad = \ \sum _{t=j}^{j+H-1} \hbox {E}\Big [c_ {t+1}(X_{t+1},v_t) \Big | y_{1:j},u_{0:j-1},v_{j:t}\Big ]. \end{aligned}$$(2)
Suppose that all expectations in (2) can be evaluated. As time goes on, a classic sliding-horizon control procedure (see [3, 8]) would proceed as follows:
(1) Find \(\displaystyle \mathbf {v}^*= v_j^*,\ldots ,v_{j+H-1}^*= \hbox {arginf}_{v_j,\ldots ,v_{j+H-1}} J_H(\mathbf {v})\), with \(v_k \in \mathcal {U}, \ k=j,\ldots ,j+H-1\).
(2) Apply control \(v_j^*\) to the system.
(3) Get the new observation \(y_{j+1}\).
(4) Let \(j=j+1\).
(5) Go back to step (1).
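The five steps above can be sketched as follows. This is a schematic toy loop, not the paper's algorithm: the scalar plant, the crude Monte Carlo surrogate `J_hat` standing in for the estimated expected cost-to-go, the naive grid-based coordinate search replacing the true constrained minimization, and the raw observation used as state estimate (instead of a particle filter) are all hypothetical simplifications.

```python
import numpy as np

rng = np.random.default_rng(2)

H = 3                                  # receding horizon length
GRID = np.linspace(-1.0, 1.0, 21)      # compact admissible control set U, discretized

def plant(x, u):
    # Placeholder true system, known here only for simulation purposes
    return 0.7 * x + u + rng.normal(0.0, 0.05)

def J_hat(x_est, v):
    # Crude Monte Carlo surrogate for the expected cost-to-go J_H(v):
    # average a quadratic cost over simulated futures started at x_est
    total = 0.0
    for _ in range(64):
        x = x_est
        for u in v:
            x = 0.7 * x + u + rng.normal(0.0, 0.05)
            total += x * x
    return total / 64

def receding_horizon_step(x_est):
    # Step (1): minimize over v_{j:j+H-1}; naive coordinate search on the grid
    v = np.zeros(H)
    for k in range(H):
        costs = [J_hat(x_est, np.concatenate([v[:k], [g], v[k+1:]])) for g in GRID]
        v[k] = GRID[int(np.argmin(costs))]
    return v

x, x_est = 5.0, 5.0
for j in range(10):
    v = receding_horizon_step(x_est)   # step (1)
    x = plant(x, v[0])                 # step (2): apply only v_j^*
    y = x + rng.normal(0.0, 0.1)       # step (3): record the new observation
    x_est = y                          # crude state estimate; the paper filters instead
    # steps (4)-(5): j increments and the loop repeats
```

Only the first component \(v_j^*\) of each optimized horizon is applied, which is the defining feature of the receding-horizon scheme.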
Given \( u_0,\ldots ,u_{j-1}\) and the observations \(y_1,\ldots , y_j\), the exact computation of \(J_H(\mathbf {v})\) is generally not feasible, nor is its exact minimization with respect to \(\mathbf {v}\). However, based on particle simulation approaches and the theory of epi-convergence, convergent estimators of \(J_H(.)\) and of its minimizers \(\mathbf {v}^*\) can be obtained, as shown in the next section.
Remark 2
As regards the unknown system model parameters \(\theta \): their convergent filtering estimation can be performed simultaneously with the filtering of the state variables by the convolution filter procedure to be used in the control operation, after a classic state extension of model (1) (see Appendix 1).
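This classic state extension can be sketched as follows, for a hypothetical scalar transition f: the parameter is appended to the state with a constant dynamic, so that any filter tracking the extended state also estimates \(\theta \).

```python
import numpy as np

def f(t, x, theta, u, eps):
    # Hypothetical transition of the original model (1)
    return theta * np.tanh(x) + u + eps

def f_ext(t, z, u, eps):
    # State extension Z_t = (X_t, theta_t) with theta_{t+1} = theta_t:
    # the unknown parameter becomes an extra, constant state component
    x, theta = z
    return np.array([f(t, x, theta, u, eps), theta])

z = np.array([0.5, 0.9])     # (initial state, initial parameter draw from the prior)
z = f_ext(0, z, u=0.1, eps=0.0)
```

In a filtering run, the parameter components of the particles are drawn from the prior \(p_0^{\theta }\) at \(t=0\) and are then updated only through the weighting/regularization steps of the filter.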
Remark 3
Particular case: tracking control
Let \(\{x_t^*\}\) be a given reference trajectory for the system dynamics. Let us suppose that the system obeys the following particular form of state equation in model (1):
with \(\hbox {E}(\varepsilon _{t+1}) = 0_{(d \times 1)}\) and \(\hbox {Var}(\varepsilon _{t+1}) = \Lambda _{{t+1}_{(d \times d)}}\).
An appropriate simple cost function is then given by the quadratic discrepancy
Let \( \ \psi _{t+1} = f_{t+1}(X_{t}, \theta , u_{t}) - x_{t+1}^*\).
Then, \( \ \Vert X_{t+1} - x_{t+1}^*\Vert ^2 \ = \ \Vert \varepsilon _{t+1} \Vert ^2 \ + \ \Vert \psi _{t+1}\Vert ^2 \ + \ 2\psi _{t+1}^T \varepsilon _{t+1},\) and
Applying Jensen’s inequality, it follows
(5) gives the decomposition of \(J_H\) corresponding to the quadratic cost (4) and shows the pure expected quadratic error reduction performed by the minimization of this cost expectation. Moreover (6) shows the link between the criterion \(J_H\) with quadratic cost and another relevant criterion of the predictive least squares type, which is also reduced by the minimization of \(J_H\).
In all the sequel the cost function \(c_t(x,u)\) will be considered in its general form.
4 Convergent Estimators of the Expected Cost-to-Go Function and Its Minimizers, to Their True Counterparts
In this central section, an estimator of the expected cost-to-go \(J_H(\mathbf {v})\) is proposed (Sect. 4.2), built from a particle estimator of the conditional predictive pdf of the state variables (Sect. 4.1). This expected cost-to-go estimator, \(J_H^m(\mathbf {v})\), is then shown to converge pointwise almost surely, whatever \(\mathbf {v}\), to its true counterpart \(J_H(\mathbf {v})\) as m grows to infinity (Sect. 4.3). From that, the function \(J_H^m(.)\) is shown to converge in the so-called epi-convergent mode to \(J_H(.)\) with m (Sect. 4.4), which in the present case ensures the almost sure convergence of the \(J_H^m(.)\) minimizers into the set of minimizers of \(J_H(.)\), and that of the corresponding minima to their true counterparts.
The first step of this construction is then the introduction of a convergent estimator with sufficiently good properties, of any multistep ahead conditional predictive pdf \(p_{t+k}^X(x|y_{1:j},u_{0:t+k-1})\) of the state vector \(X_t\), \(\forall j > 0, \ \forall t \ge j, \ \forall k \ge 0.\)
4.1 A Convergent Particle Estimator of the Conditional Predictive pdf of the State Variables
A convergent n-particle estimator of the multi-step ahead conditional pdf of the state vector has recently been proposed (see [52, 53]), such that under reasonable conditions: \(\forall j > 0, \ \forall t \ge j, \ \forall k \ge 0, \ \)
-
$$\begin{aligned} \lim _{n \rightarrow \infty } \Big \Vert p^{n,X}_{t+k}(x|y_{1:j},u_{0:t+k-1}) - p_{t+k}^X(x|y_{1:j},u_{0:t+k-1})\Big \Vert _{L_1} \ = \ 0 \ \ \ \ a.s. \end{aligned}$$(7)
-
$$\begin{aligned} \forall x, \ \lim _{n \rightarrow \infty } p^{n,X}_{t+k}(x|y_{1:j},u_{0:t+k-1}) \ = \ p_{t+k}^X(x|y_{1:j},u_{0:t+k-1}) \ \ \ \ a.s. \end{aligned}$$(8)
in which \(p_{t+k}^{n,X}(x|y_{1:j},u_{0:t+k-1})\) is an n-particle estimator of the true \((t+k-j)\)-step ahead conditional pdf \(p_{t+k}^X(x|y_{1:j},u_{0:t+k-1})\) of the state (see Appendix 1), with corresponding probability distribution function denoted \(P_{t+k}^{n,X}(x|y_{1:j},u_{0:t+k-1})\).
As said previously, the possible unknown parameters \(\theta \) can be treated as are the state variables X, with equivalent results for their conditional pdf estimation, by classic state extension (Appendix 1).
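The regularization idea behind such convolution (kernel-smoothed) particle estimators can be illustrated by a minimal one-dimensional Gaussian kernel density estimate built from a weighted particle cloud. This sketch is not the paper's estimator, which also kernel-smooths in the observation space (see Appendix 1); it only shows how a density estimate, rather than a discrete empirical measure, is recovered from particles.

```python
import numpy as np

def kde_pdf(particles, weights, x, h):
    """Gaussian-kernel estimate of a pdf from an n-particle weighted cloud:
    p_hat(x) = sum_i w_i * K_h(x - particle_i), with K_h a N(0, h^2) kernel."""
    u = (x[:, None] - particles[None, :]) / h
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return (k * weights[None, :]).sum(axis=1) / h

rng = np.random.default_rng(3)
n = 5000
particles = rng.normal(0.0, 1.0, size=n)   # stand-in for a predictive particle cloud
weights = np.full(n, 1.0 / n)              # uniform weights for the illustration
grid = np.linspace(-3.0, 3.0, 7)
est = kde_pdf(particles, weights, grid, h=0.3)
```

With a bandwidth h shrinking at a suitable rate in n, such estimates are exactly the kind of object for which \(L_1\) and pointwise convergences like (7) and (8) can be obtained.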
4.2 A Particle Estimator of the Expected Cost-to-Go Function \(J_H(.)\)
For \(t \ge j\), let:
-
Q(x): an absolutely continuous probability distribution function with density q(x), which dominates the distribution \( P_{t+1}^X(x|y_{1:j},u_{0:j-1}, v_{j:t})\) and also the distribution \( P_{t+1}^{n,X}(x|y_{1:j},u_{0:j-1}, v_{j:t})\) for n sufficiently large. This last assumption is all the more plausible as n tends to infinity, because of (7) and (8). The pdf q(x) will be used to perform changes of probability measure of theoretical interest, to ease the subsequent convergence proofs in Sect. 4.4 (see also [20]).
-
\( \mathbf {X} := \ X_{j+1 : j+H}. \)
-
\(\mathcal {X} := x_{j+1:j+H},\) a realization of \(\mathbf {X}\).
-
\(\displaystyle \sigma _{t+1}(x,v_{j:t}) := c_{t+1}(x,v_t)\frac{p_{t+1}^X\big (x|y_{1:j}, u_{0:j-1}, v_{j:t}\big )}{q(x)}.\)
-
\(S(\mathbf {X}, \mathbf {v}) \ := \sum _{t=j}^{j+H-1} \sigma _{t+1}(X_{t+1},v_{j:t}).\)
-
\(\hbox {E}_q[.]\), the expectation operator with respect to q(x).
-
$$\begin{aligned} \displaystyle J_H(\mathbf {v}):= & {} \sum _{t=j}^{j+H-1} \hbox {E}_{p_{t+1}^X}\Big [ c_{t+1}(X_{t+1},v_t) \Big | y_{1:j}, u_{0:j-1},v_{j:t}\Big ] \\ \displaystyle= & {} \sum _{t=j}^{j+H-1} {\hbox {E}}_q \Big [\sigma _{t+1}(X_{t+1},v_{j:t})\Big ] \ = \ {\hbox {E}}_{q^H}\Big [S(\mathbf {X},\mathbf {v})\Big ]. \end{aligned}$$
The number n of particles used in the estimation of the state conditional pdf’s of interest will be taken as some chosen growing function of m, the number of draws in a simulation procedure defined just below: \(n = n(m).\) With this single growth constraint, the choice of the function n(.) is immaterial to all the convergence results to follow.
Then let :
-
$$\begin{aligned} \displaystyle \sigma ^m_{t+1}(x,v_{j:t}) := c_{t+1}(x,v_t)\frac{p_{t+1}^{n(m),X}\big (x|y_{1:j}, u_{0:j-1}, v_{j:t}\big )}{q(x)}. \end{aligned}$$
-
$$\begin{aligned} S^m(\mathbf {X},\mathbf {v}) \ := \ \sum _{t=j}^{j+H-1} \sigma ^{m}_{t+1}(X_{t+1},v_{j:t}). \end{aligned}$$
-
$$\begin{aligned} \bar{\sigma }_{t+1}^{m} \ := \ \frac{1}{m} \sum _{i=1}^m\sigma ^m_{t+1}(X^i_{t+1},v_{j:t}), \quad \hbox {with } X_{t+1}^i \sim q(x), \ i=1,\ldots ,m. \end{aligned}$$
-
$$\begin{aligned} \mathbf {X}^i := X_{j+1}^i,\ldots ,X_{j+H}^i \ (\hbox {with} \ \mathcal {X}^i := x_{j+1}^i,...,x_{j+H}^i \, \hbox {a realization of}\,\,\, \mathbf {X}^i). \end{aligned}$$
-
$$\begin{aligned} J_H^{m}(\mathbf {v}):= & {} \mathop \sum \nolimits _{t=j}^{j+H-1}\bar{\sigma }_{t+1}^{m} \ = \ \frac{1}{m} \mathop \sum \nolimits _{i=1}^m \mathop \sum \nolimits _{t=j}^{j+H-1} \sigma ^m_{t+1}(X^i_{t+1},v_{j:t}) \\= & {} \frac{1}{m} \mathop \sum \nolimits _{i=1}^m S^m(\mathbf {X}^i,\mathbf {v}). \end{aligned}$$
Let us note that, given \(\mathbf {v}\), the approximated cost-to-go expectation \(J_H^{m}(\mathbf {v}) \) is a random variable which depends on the set of mH draws \({X}_{t+1}^i, \, i=1,\ldots , m, \ t=j,\ldots ,j+H-1\), from the pdf q(x), and on the \(j+H\) sets of n(m) particles generated from \(t=0\) to obtain the predictive conditional pdf estimates \(p_{t+1}^{n(m),X}(x|y_{1:j},u_{0:j-1},v_{j:t}), \ t=j,\ldots ,j+H-1\) (see Appendix 1).
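The importance-sampling construction of \(J_H^m\) can be illustrated on a one-step toy case where, for checking purposes, the predictive pdf is available in closed form (the paper instead plugs in its particle estimate \(p^{n(m),X}_{t+1}\)); the densities, cost and dominating proposal below are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(4)

def norm_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def J_H_m(v, m=20000):
    """Importance-sampling estimate of a 1-step expected cost E[c(X, v)]:
    draw X^i ~ q, weight by the density ratio p(.|v)/q, average the weighted
    costs. Here p(.|v) is taken as N(v, 1) in closed form and q as N(0, 3)."""
    x = rng.normal(0.0, 3.0, size=m)                   # X^i ~ q (dominating pdf)
    w = norm_pdf(x, v, 1.0) / norm_pdf(x, 0.0, 3.0)    # ratio p/q, cf. sigma_{t+1}
    c = x**2                                           # illustrative cost c(x, v) = x^2
    return float(np.mean(c * w))

est = J_H_m(0.5)   # true value here: E[X^2] with X ~ N(0.5, 1), i.e. 1 + 0.25
```

Replacing the closed-form `norm_pdf(x, v, 1.0)` by the particle estimate is exactly the step that makes \(J_H^m\) doubly approximate (m draws from q, n(m) particles for the pdf), hence the joint limit studied below.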
4.3 Almost Sure Convergence of the Expected Cost-to-Go Estimator
Theorem 4.1
Under the assumptions of Sect. 2,
Proof
For \( j \le t \le j+H-1,\) for fixed \(\mathbf {v}\),
Let us denote for brevity’s sake without any ambiguity: \(\sigma _{x,t+1} := \sigma _{t+1}(x, v_{j:t}), \sigma ^m_{x,t+1} := \sigma ^m_{t+1}(x, v_{j:t})\), \(\sigma _{i,t+1} := \sigma _{t+1}(X^i_{t+1}, v_{j:t})\) and \(\sigma ^m_{i,t+1} := \sigma ^m_{t+1}(X^i_{t+1}, v_{j:t})\).
From (8), \(\{\sigma ^m_{x,t+1}\}\) is a sequence of measurable functions which converges pointwise a.s. to \(\sigma _{x,t+1}\) with m for all x, then also Q-almost-everywhere a.s. As Q is a finite measure, one has by Egoroff’s theorem:
\(\forall \delta > 0, \exists E_\delta \subset I\!\!R^d\) with \(Q(E_\delta ) < \delta \), such that \(\sigma ^m_{x,t+1}\) converges to \(\sigma _{x,t+1}\) uniformly a.s. with m on \(I\!\!R^d \! \setminus \! E_\delta \) (the complement of \(E_\delta \) in \( I\!\!R^d\)). Note that, given \(\delta \), there exist infinitely many such subsets \( E_\delta \).
Then \(\forall \delta > 0, \displaystyle \sup _{x \in I\!\!R^d \setminus E_\delta } \mid \sigma ^m_{x,t+1} - \sigma _{x,t+1} \mid \ {\mathop {\longrightarrow }\limits ^{m \rightarrow \infty }} \ 0,\)
and \( g_m(\delta ) = \inf _{E_\delta } \sup _{x \in I\!\!R^d \setminus E_\delta } \mid \sigma ^m_{x,t+1} - \sigma _{x,t+1} \mid \ {\mathop {\longrightarrow }\limits ^{m \rightarrow \infty }} \ 0 .\)
Now let \( 0< L_1< L_2 < \infty \) and \(\Delta = ]0, L_2]\). We have
- (i):
\(g_m(\delta )\) converges uniformly a.s. to 0 as m grows to \(\infty \) on \([L_1, L_2]\), whatever \(L_1 > 0\) with \(L_1 < L_2\).
- (ii):
\(\{0\}\) is adherent to \(\Delta \).
- (iii):
$$\begin{aligned} g_m(\delta ) = \inf _{E_\delta }\sup _{x \in I\!\!R^d \setminus E_\delta } \mid \sigma ^m_{x,t+1} - \sigma _{x,t+1} \mid \ {\mathop {\longrightarrow }\limits ^{\delta \rightarrow 0}} \ \ell _m = \lim _{\delta \rightarrow 0} \inf _{E_\delta } \sup _{x \in I\!\!R^d \setminus E_\delta } \mid \sigma ^m_{x,t+1} - \sigma _{x,t+1} \mid \ \le \ \sup _{x \in I\!\!R^d} \mid \sigma ^m_{x,t+1} - \sigma _{x,t+1} \mid \ < \ \infty . \end{aligned}$$
Then, by the Moore–Osgood theorem on exchanging limits:
$$\begin{aligned} \displaystyle \lim _{m \rightarrow \infty } \lim _{\delta \rightarrow 0} g_m(\delta ) \ = \ \lim _{\delta \rightarrow 0} \lim _{m \rightarrow \infty } g_m(\delta ) \ = \ 0 \quad a.s. \end{aligned}$$(10)and
$$\begin{aligned}&\displaystyle \lim _{m \rightarrow \infty } \left| \frac{1}{m} \sum _{i=1}^m(\sigma ^m_{i,t+1} - \sigma _{i,t+1}) \right| \ \le \ \lim _{m \rightarrow \infty } \max _{i=1,\ldots ,m} \vert \sigma ^m_{i,t+1} - \sigma _{i,t+1} \vert \ \le \ \lim _{m \rightarrow \infty } \lim _{\delta \rightarrow 0} g_m(\delta ) = 0 \\&\Longrightarrow \ \frac{1}{m} \sum _{i=1}^m \sigma _{i,t+1}^m \ {\mathop {\simeq }\limits ^{m \rightarrow \infty }} \ \frac{1}{m} \sum _{i=1}^m \sigma _{i,t+1} \ {\mathop {\longrightarrow }\limits ^{m \rightarrow \infty }} \ \hbox {E}_q[\sigma _{t+1}(X_{t+1},v_{j:t})] \quad \hbox {a.s. (by the strong law of large numbers)} \\&= \ \hbox {E}_{p_{t+1}^X}\Big [ c_{t+1}(X_{t+1},v_t) \Big | y_{1:j}, u_{0:j-1},v_{j:t}\Big ]. \end{aligned}$$(11)Finally,
$$\begin{aligned}&\displaystyle J_H^m(\mathbf {v}) = \sum _{t=j}^{j+H-1} \frac{1}{m} \sum _{i=1}^m \sigma ^m_{t+1}(X_{t+1}^i, v_{j:t}) \ {\mathop {\longrightarrow }\limits ^{m \rightarrow \infty }} \ \sum _{t=j}^{j+H-1} \hbox {E}_{p_{t+1}^X}\Big [ c_{t+1}(X_{t+1},v_t) \Big | y_{1:j}, u_{0:j-1},v_{j:t}\Big ] \ = \ J_H(\mathbf {v}) \quad \hbox {a.s.} \end{aligned}$$
\(\square \)
4.4 Almost Sure epi-Convergence Results
Simple pointwise convergence of a sequence of deterministic or random functions \(\{F^m(.)\}\) to a limit function F(.) does not necessarily entail the convergence of the corresponding sequence of minima to a minimum of the limit function F(.). The so-called epi-convergence (see [2, 54, 55]), which is essentially a one-sided locally uniform convergence combined with a weak pointwise convergence on the other side (see [19]), is then useful to establish this result. The sequence of functions \(\{F^m(.)\}\) is said to epi-converge to the function F(.) if and only if the sequence of corresponding epi-graphs converges to the epi-graph of F(.). Actually, if \(\{F^m(.)\}\) epi-converges to F(.) and \(\mathbf {v}^m\) minimizes \(F^m(.)\), then any cluster point of the sequence \(\{\mathbf {v}^m\}\) is a minimizer of F(.) (see [19]). Moreover the corresponding optimal values also converge (see [12]).
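A classical counterexample makes this concrete: on \(x \ge 0\), the functions \(F_m(x) = -mx\,e^{-mx}\) converge pointwise to the zero function, yet each \(F_m\) attains its minimum \(-1/e\) at \(x = 1/m\), so the minima do not converge to the minimum of the pointwise limit (the epi-limit, which dips to \(-1/e\) at \(0\), is the right limit object here). A quick numerical check:

```python
import numpy as np

def F(m, x):
    # F_m(x) = -m*x*exp(-m*x): pointwise limit 0 on x >= 0,
    # but min F_m = -1/e, attained at x = 1/m, for every m
    return -m * x * np.exp(-m * x)

x = np.linspace(0.0, 5.0, 200001)
mins = [F(m, x).min() for m in (1, 10, 100)]         # all close to -1/e
pointwise = [F(m, 2.0) for m in (1, 10, 100)]        # tends to 0 at fixed x = 2
```

The stable value of `mins` against the vanishing `pointwise` values illustrates why the epi-convergence of \(J_H^m\), and not just its pointwise convergence, must be established.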
Now, \(\mathbf {v} := v_{j:j+H-1} \in \mathcal {U}^H \subset I\!\!R^{H \times q}\) which is a separable metric space. Then, let
-
\(\mathcal {B}_c := \{B_1, B_2, \ldots \}\), a countable basis of open sets of \(\mathcal {U}^H\) for the topology of \(\mathcal {U}^H\) induced by the usual topology of \(I\!\!R^{H \times q}\).
-
\(\mathcal {N}(\mathbf {v})\): the set of open neighborhoods of the point \(\mathbf {v}\).
-
\(\mathcal {N}_c(\mathbf {v}) := \mathcal {B}_c \bigcap \mathcal {N}(\mathbf {v})\), the set of neighborhoods in the countable basis.
-
\(\mathbf {v}^k \in B_k \in \mathcal {B}_c\), such that
$$\begin{aligned} \displaystyle J_H(\mathbf {v}^k) \le \inf _{\mathbf {w} \in B_k} J_H(\mathbf {w}) + \frac{1}{k}, \ \ \forall k \in I\!\!N^+. \end{aligned}$$(12) -
\( \mathcal {U}^H_c := \{\mathbf {v}^1, \mathbf {v}^2, \ldots , \mathbf {v}^k,\ldots \}\).
According to the usual standard approach (see [12]) the epi-convergence of \(J_H^{m}\) to \(J_H\) as m grows to infinity will be established if it can be shown that the epi-limits superior and inferior of \(\{J_H^m\}\) are both equal to \(J_H\) a.s., or equivalently: \(\forall \mathbf {v} \in \mathcal {U}^H\),
and
Theorem 4.2
\(J_H^m(.)\) epi-converges to \(J_H(.)\) almost surely as m grows to \(\infty \).
The following lemmas will be needed.
Lemma 4.3
\(\displaystyle J_H(.)\) is lower semi-continuous at \(\mathbf {v}\), \(\forall \mathbf {v} \in \mathcal {U}^H\).
Proof
If \(\mathbf {v}^{\ell } {\mathop {\longrightarrow }\limits ^{{\ell } \rightarrow \infty }} \mathbf {v}\), then
\(\square \)
Lemma 4.4
Proof
From (8), \(p^{n(m),X}_{t+1}(x|y_{1:j},u_{0:j-1},v_{j:t}) \ {\mathop {\longrightarrow }\limits ^{m \rightarrow \infty }} \ p_{t+1}^{X}(x|y_{1:j},u_{0:j-1},v_{j:t})\) a.s. (see Appendix 1).
Lemma 4.5
\(S^m(\mathcal {X}, . )\) epi-converges almost surely to \(S(\mathcal {X}, . )\) for all \(\mathcal {X}\) as m grows to infinity.
Proof
See Appendix 2.
Lemma 4.6
\(\forall B \in \mathcal {B}_c\),
or more compactly \( \displaystyle \frac{1}{m}\sum _{i=1}^m \inf _{\mathbf {v} \in B} S^m(\mathbf {X}^i,\mathbf {v}) \ {\mathop {\longrightarrow }\limits ^{m \rightarrow \infty }} {\hbox {E}}_{q^H}\Big [ \inf _{\mathbf {v} \in B} S(\mathbf {X},\mathbf {v}) \Big ] \ \ \ a.s.\)
Proof
Let \(\displaystyle Z(\mathcal {X}) = \inf _{\mathbf {v} \in B}S(\mathcal {X},\mathbf {v})\) and \(\displaystyle Z^m(\mathcal {X}) = \inf _{\mathbf {v} \in B}S^m(\mathcal {X},\mathbf {v})\). As a consequence of Lemma 4.5, \(\{Z^m(\mathcal {X})\}\) is a sequence of functions which converges pointwise (a.s.) and then also \(Q^H\)-almost-everywhere (a.s.) to \(Z(\mathcal {X})\) with m, \(\forall \mathcal {X} \in I\!\!R^{H \times d}\). Moreover the functions \(\{Z^m(\mathcal {X})\}\) are measurable (as are the functions \(\{S^m(\mathcal {X},\mathbf {v})\}\) for all \(\mathbf {v}\)) due to the property of the inf operation. The rationale of the proof of Theorem 4.1 can then be reused with \(Z(\mathcal {X}),\, Z^m(\mathcal {X}), \, \mathcal {X}\) and \(\mathbf {X}^i\) in place of \(\sigma _{x,t+1}, \, \sigma ^m_{x,t+1}, \, x\) and \(X^i_{t+1}\) respectively.
With the same notations and applications of Egoroff’s and Moore–Osgood’s theorems as previously, we have:
\(\square \)
The rest of this subsection is inspired by Theorem 1 of Chen et al. [12].
Let us show first that (13) is satisfied:
\(\forall B \in \mathcal {B}_c\) and \(\forall \mathbf {v} \in B,\) we have by (9)
then
and \(\forall \mathbf {v} \in B\)
Given \(B \in \mathcal {B}_c\), any ball S centered at \(\mathbf {v} \in B \) with sufficiently small radius is such that \(S \subset B\), since B is open. Moreover there exists \(B^\prime \) in the basis \(\mathcal {B}_c\) such that \(\mathbf {v} \in B^\prime \subset S\), and then \(B \supset B^\prime \). Starting from \(\mathbf {v} \in B^\prime \), a new ball \(S^\prime \) centered at \(\mathbf {v}\), with radius sufficiently small to be contained in \(B^\prime \), can be found, and there is a \(B^{\prime \prime } \in \mathcal {B}_c\) such that \(\mathbf {v} \in B^{\prime \prime } \subset S^\prime \), and then \(B^\prime \supset B^{\prime \prime }\). This process can be iterated. It is then possible to choose a subsequence \(\{k_l\}\) such that \(B_{k_l} \supset B_{k_{l+1}}\), with \(\bigcap _l B_{k_l} = \{\mathbf {v}\}\).
Then by the lower semicontinuity of \(J_H(.)\)
and by (12)
The inequality (13) is implied by (20), (21) and (19). \(\square \)
Let us show now that (14) is also satisfied:
\(\forall B \in \mathcal {B}_c\), let \(\mathbf {w} \in B\), with \(\mathbf {w} := w_{j:j+H-1}\).
Let us note first that the continuity of \(\sigma ^m_{t+1}(x,w_{j:t})\) for all t, \(j \le t \le j \! + \! H \! - \! 1\), ensures that \(\displaystyle \inf _{\mathbf {w} \in B}\sum _{t=j}^{j+H-1}\sigma ^m_{t+1}(x,w_{j:t})\) is measurable.
We have
in which (22) is true by subadditivity of the infimum, (23) is the immediate consequence of (16), (24) is due to the monotone decrease of the nested sequence \((B_{k_l})\), (25) is a direct application of the dominated convergence theorem and (26) is due to the lower semicontinuity of the operand in the expectation. \(\square \)
The epi-convergence of \(J_H^{m}(.)\) to \(J_H(.)\) as m grows to infinity is then established.
Now let \(\{\mathbf {v}^m\}\) be a sequence of \(\epsilon \)-minimizers of \(\{J_H^m(.)\}\), i.e.
According to Theorem 1.10 of Attouch [2], every converging subsequence \(\{\mathbf {v}^{m_k}\}\) of \(\{\mathbf {v}^m\}\) must converge to one of the minimizers \(\{\mathbf {v}^{*,i}, i=1,\ldots ,r\}\) of \(J_H(.)\), i.e. \(\{\mathbf {v}^{m_k}\} {\mathop {\longrightarrow }\limits ^{k \rightarrow \infty }} \mathbf {v}^*\) a.s. implies that \(\displaystyle J_H(\mathbf {v}^*) = \min _{\mathbf {v} \in \mathcal {U}^H}J_H(\mathbf {v})\). Moreover, according to [2], the optimal values also converge: \(J_H^{m_k}(\mathbf {v}^{m_k}) {\mathop {\longrightarrow }\limits ^{k \rightarrow \infty }} J_H(\mathbf {v}^*)\) a.s.
Theorem 4.7
The sequence \(\{\mathbf {v}^m\}\) of \(\epsilon \)-minimizers of \(\{J_H^m(.)\}\) defined by (27) converges with probability one into the set of the minimizers of \(J_H(.)\). Moreover the sequence \(\{J_H^m(\mathbf {v}^m)\}\) converges itself with probability one into the set of the corresponding minima of \(J_H(.)\).
Proof
- (i):
Since the random sequence \(\{\mathbf {v}^m\}\) is bounded by definition, it is tight. Hence by Prokhorov's theorem every subsequence \(\{\mathbf {v}^{m_k}\}\) has a sub-subsequence \(\{\mathbf {v}^{m_{k_l}}\}\) whose probability distribution functions converge weakly to the probability distribution function of a random variable V. Moreover, according to Skorokhod's representation theorem, there exist random variables \(\{{W}_l\}\) and W, respectively distributed as \(\{\mathbf {v}^{m_{k_l}}\}\) and V, such that \(\{W_l\}\) converges to W almost surely. Then \(\{\mathbf {v}^{m_{k_l}}\}\) converges a.s. into \(\hbox {Supp}(W) \equiv \hbox {Supp}(V)\), as does \(\{W_l\}\) itself, and according to Th. 1.10 of Attouch [2] the corresponding limit points are minimizers of \(J_H(.)\). Prokhorov's theorem then ensures that \(\hbox {Supp}(V) \equiv \hbox {Supp}(W)\) can only be a subset of the set of minimizers \(\{\mathbf {v}^{*,i}, i=1,\ldots ,r\}\) of \(J_H(.)\).
- (ii):
Now suppose that the random sequence \(\{\mathbf {v}^m\}\) does not converge with probability one into the set \(\{\mathbf {v}^{*,i}, i=1,\ldots ,r\}\). Then there exist r open sets \(\{ \mathcal {O}_i, i=1,\ldots ,r\} \subset \mathcal {U}^H \), each containing one of the \(\{\mathbf {v}^{*,i}\}\), and a subsequence \(\{\mathbf {v}^{m_k}\}\) of \(\{\mathbf {v}^m\}\) such that for all \(k \in I\!\!N\), \(P(\mathbf {v}^{m_k} \notin \mathcal {O}_i, i=1,\ldots ,r) > 0 \), and such that for any of its embedded sub-subsequences \(\{\mathbf {v}^{m_{k_l}}\}\), \(P(\mathbf {v}^{m_{k_l}} \notin \mathcal {O}_i, i=1,\ldots ,r) >0\). By Prokhorov's theorem every such subsequence \(\{\mathbf {v}^{m_k}\}\) still has a sub-subsequence \(\{\mathbf {v}^{m_{k_l}}\}\) converging in distribution to a random variable V. But the Skorokhod sequence \(\{W_l\}\) distributed as \(\{\mathbf {v}^{m_{k_l}}\}\) is then also such that \(\forall l \in I\!\!N\), \(P(W_l \notin \mathcal {O}_i, i=1,\ldots ,r) > 0 \), so \(\{W_l\}\) cannot converge a.s. into the set \(\{ \mathbf {v}^{*,i}, i=1,\ldots ,r \}\), which contradicts (i). Hence the result for the sequence \(\{\mathbf {v}^m\}\). Moreover, since the corresponding optimal values themselves converge according to Attouch's theorem, the proof is complete. \(\square \)
5 Application: A Simulated Case Study in Predictive Microbiology
5.1 The State-Space Model Considered
5.1.1 The State Equation
One of the most efficient tools in the field of food safety is the stochastic modelling of a (pathogenic) bacterial population decreasing in a given culture medium. Indeed, under particular conditions of environmental factors such as temperature, pH and water activity, and after a lactic acid shock, a decrease of the bacteria number can be observed (the so-called growth inactivation). This decrease can be very slow if the temperature is kept constant (see curve number 1 in Fig. 1), but proceeds faster if the temperature is increased, which enhances the lethal effect of the lactic acid on the bacteria. A goal of the microbiologists is then to control the temperature evolution so as to obtain a particular (a priori chosen) growth inactivation profile, called hereafter the target decreasing trajectory.
For some bacterial species (Listeria, Salmonella,\(\ldots \)) efficient mathematical models are available for describing growth inactivation. For the bacterial dynamics simulation and its subsequent predictive control in our case study, the model proposed by Coroller et al. [14] will be considered under its approximate discrete time autoregressive form, with a multiplicative lognormal noise (as usually assumed by microbiologists for count variability):
with
-
t : time variable, with \(t\in \left[ 0,t_{\max }\right] \), \(t_{\max }\) being a priori selected by the microbiologist (here \(t_{\max }=600\) hours).
-
\(X_{t}:\) the bacteria number per ml of culture broth at time t, which cannot be observed directly (whence the associated filtering problem).
The initial number of bacteria \(x_0\) is chosen as \(x_0 = 10^{7}\) for the simulation and is considered as known in this simulated control processing. However it could also be considered as an additional unknown parameter to be estimated by filtering from a given initial prior density \(p_0^X(.)\) at time \(t=0\).
-
\(\lambda :\) an unknown shape parameter that must be estimated by filtering, simultaneously with the control procedure. Depending on \(\lambda \), the graph of the bacterial dynamics without noise is convex (\(\lambda < 1\)), straight (\(\lambda =1\)) or concave (\(\lambda >1\)).
-
\(D_t:\) a function of the temperature, \(T_t( {{}^\circ } C)\). \(D_t\) is the so-called decimal reduction, defined in [14], by
$$\begin{aligned} \log _{10}D_t=D^{s}-\left[ \left( \frac{2\left( T^{c}-T^{opt}\right) }{(Z)^{2}}\right) \times \left( T_t-T^{s}\right) \right] \ \ \ \hbox { if } \ \ \ T_t \le T^{c} \end{aligned}$$(29) and
$$\begin{aligned} \log _{10}D_t=D^{s}+\left( \frac{\left( T^{c}-T^{opt}\right) \left( 2T^{s}-T^{c}-T^{opt}\right) }{(Z)^{2}}\right) \ -\left( \frac{T_t-T^{opt}}{Z}\right) ^{2} \end{aligned}$$(30) if \(T_t>T^{c}\),
where \(D^{s}\), \(T^{c}\), \(T^{opt}\), \(T^{s}\), Z are in general poorly known parameters, the values of which depend on the bacterial species. For the data simulation in the present case (Listeria), these parameters and the shape parameter \(\lambda \) are fixed to (see [14]):
$$\begin{aligned} \lambda = 2, \ D^{s}=2.5, \ T^{c}=20, \ T^{opt}=10, \ T^{s}=12, \ Z = 22. \end{aligned}$$(31) All these six parameters will have to be estimated by the filtering process, in parallel with the system control processing: in our case study the unknown \(\theta \) of model (1) then corresponds to the vector \(\{ \lambda , D^s, T^c, T^{opt}, T^s, Z\}\).
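As a concrete illustration, the piecewise decimal reduction formulas (29) and (30), with the Listeria parameter values (31), translate directly into code; this is a minimal sketch, and the function and dictionary names are ours, not the paper's:

```python
import math

# Listeria parameter values of (31); in the control experiment these are
# unknown and have to be estimated by filtering.
PARAMS = dict(lam=2.0, Ds=2.5, Tc=20.0, Topt=10.0, Ts=12.0, Z=22.0)

def log10_D(T, Ds, Tc, Topt, Ts, Z, **_):
    """Decimal reduction log10(D_t) as a function of the temperature T (deg C),
    following eq. (29) for T <= Tc and eq. (30) for T > Tc."""
    if T <= Tc:
        return Ds - (2.0 * (Tc - Topt) / Z**2) * (T - Ts)        # eq. (29)
    return (Ds + (Tc - Topt) * (2.0 * Ts - Tc - Topt) / Z**2     # eq. (30)
            - ((T - Topt) / Z) ** 2)
```

One can check with a little algebra that the two branches agree at \(T_t = T^{c}\), so \(\log _{10}D_t\) is continuous in the temperature.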
Remark 4
-
The temperature \(T_t\) is the control variable \(u_{t}\) to be used: \( 0 \le \ T_t \ \le 40\,^{\circ }\)C.
-
\(\delta \) is the model time step (here always fixed at the value of 1 hour).
-
\(\varepsilon _{t+1}\) is a noise, taken as a lognormal random variable such that \(e_{t+1} = \ln \varepsilon _{t+1}\) is a Gaussian variable \(N(0,\rho _t^2)\), with \(\rho _t = \frac{1}{2}CV\log _{10}x_t\), CV being an approximate surrogate coefficient of variation assumed constant according to microbiological considerations and known during the control process (the realistic values 0.01, 0.025, 0.05, 0.10 and 0.20 were considered for the different simulations performed). Note that this quantity could also be considered as a parameter to be estimated by filtering during the control process.
-
The noninteger values provided by model (28) for the state variable \(X_t\), as approximations of integer bacteria counts, are quite acceptable given the very high population sizes involved.
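The lognormal noise specification above can likewise be sketched in code; `sample_noise` is an illustrative name and the function merely encodes \(\rho _t = \frac{1}{2}CV\log _{10}x_t\) and \(\varepsilon _{t+1} = e^{e_{t+1}}\):

```python
import math
import random

def sample_noise(x_t, CV, rng=random):
    """Draw the multiplicative lognormal noise eps_{t+1}:
    e_{t+1} = ln(eps_{t+1}) ~ N(0, rho_t^2) with rho_t = 0.5*CV*log10(x_t)."""
    rho = 0.5 * CV * math.log10(x_t)
    return math.exp(rng.gauss(0.0, rho))
```

With \(x_t = 10^7\) and \(CV = 0.025\), for instance, \(\rho _t = 0.0875\), so the multiplicative noise stays close to 1.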
Remark 5
With this setting it can be checked that the sufficient stabilisability condition A3 is satisfied by model (28) if, for every t, the computed applied temperature \(T_t\) (the control) is such that:
with \(C=\exp {\Big ((\ln 10)\sqrt{\beta }\Big )}\), for a given \(\beta > 0\) and \(a=2\) in A3.
5.1.2 The Observation Equation
In the simulated case study considered here, the observed variable \(Y_{t}\) at time t is the number of cells (bacteria) supposed to have been detected by flow cytometry counting [32] in the last of a series of diluted samples in successive test tubes, drawn from the original culture broth at time t. The few minutes required by this counting process can be considered negligible with respect to the slow dynamics of the growth inactivation. The probability distribution function \(G_{t}(.|X_t=x_{t},\theta )\) of \(Y_{t}\) in model (1) is here the result of the interaction of several independent random phenomena: the spatial sampling in the primary test tube at time t, a given number of successive samplings in several tubes of increasing dilution (with Poisson or aggregative assumptions for the bacteria spatial probability distributions), the successive volume sampling errors and dilution errors (assumed to be Gaussian) and finally, the lognormal counting errors attributed to the flow cytometer device itself. See [40] for full details about this sophisticated sampling-dilution-numbering procedure. The probability distribution function \(G_{t}\) cannot be analytically characterised but can easily be simulated, which is the only requirement for the proposed particle predictive control procedure, according to the particle generation algorithm considered (Appendix 1). Here, the dimension s of the \(Y_{t}\) variable corresponds to the number of repetitions of the previous bacteria sampling-dilution-counting procedure achieved at every time t.
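Since \(G_t\) only needs to be simulable, a toy generator in the spirit of this sampling-dilution-counting hierarchy can be sketched; the dilution factor, the error magnitude and the normal approximation of the Poisson draw (reasonable for the large counts involved) are our assumptions, not the calibrated procedure of [40]:

```python
import math
import random

def simulate_count(x_t, dilution, cv_cyto, rng):
    """Illustrative draw of one observation Y_t: Poisson spatial sampling of
    the diluted broth (approximated here by a rounded normal, valid for large
    expected counts), then a multiplicative lognormal cytometer error."""
    expected = x_t * dilution                     # expected cells in final tube
    cells = max(0, round(rng.gauss(expected, math.sqrt(expected))))
    return cells * math.exp(rng.gauss(0.0, cv_cyto))
```

Repeating this draw s times at each t yields the s-dimensional observation vector \(Y_t\).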
5.2 Settings and Results of the Predictive Control
In the present case study the goal of the successive minimizations of the approximated cost function expectation \(J_{H}^{m}(.)\) defined in Sect. 4.2 is to obtain a controlled state trajectory as close as possible to a given deterministic target trajectory \(\{x_t^*\}\) over the whole selected time range \(\left[ 0,t_{\max }\right] \). The cost function considered is then taken as the quadratic discrepancy (4). The computation of \(J_H^m\), when requested by the minimization algorithm, is done by using a N(0, 1) probability density as instrumental pdf q(x), and Gaussian kernels for the particle pdf estimator \(p_{t+1}^{n(m),X}(x|y_{1:j}, u_{0:j-1}, v_{j:t})\) of the pdf \(p_{t+1}^X(x|y_{1:j}, u_{0:j-1}, v_{j:t})\) (see Appendix 1).
Let \(\delta _{v^{*}}\) be the time during which the same computed optimal control \(v^{*}\) is applied (the tested values for \(\delta _{v^{*}}\) were 1, 2, 5, 7, 10, 15 and 50 hours). Note that the minimizations must be performed every \(\delta _{v^{*}}\); this means, for example with \(\delta _{v^{*}}=\delta =1\) and a time range \(\left[ 0,t_{\max }=600h \right] \), that the minimization procedure must be performed \(600 \times \gamma \) times (with \(\gamma \) a given number of independent runs from different initial values, to limit the risk of being trapped in a local minimum), i.e. 6000 times if \(\gamma =10\). Moreover, the dimension of the optimization space is H, with realistic values from 1 to 10.
Two minimization algorithms were compared: a global stochastic procedure based on [51] and a deterministic procedure based on the well-known Nelder and Mead simplex algorithm from the SAS/IML library [47]. The stochastic procedure, which presently led to prohibitive computing times (more than two weeks when (m, n) \(=\) (100, 1000) and \(\delta _{v^{*}}\) \(=\) 1 to 5), was abandoned. Only the simplex procedure was carried out, with still long but affordable computing times for these exploratory trials, typically several days on a Pentium IV computer depending on the simulation conditions (much less, however, than the 600 hour duration of the virtual microbiological experiment): as said previously, our objective in these tests was not to find the fastest minimization procedure but rather to illustrate the relevance of the approximated cost function expectation \(J_{H}^{m}(.)\) under different simulation conditions. Note however that this whole simulation/minimization procedure could be parallelized, with significant time savings.
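The receding-horizon loop just described (minimize \(J_H^m\), apply the first control for \(\delta _{v^{*}}\) time steps, then re-minimize) can be sketched as follows; the random-restart search below is only a stand-in for the \(\gamma \) Nelder-Mead runs of the paper, and all names are illustrative:

```python
import random

def receding_horizon_control(J, H, t_max, dv, n_restarts,
                             u_lo=0.0, u_hi=40.0, rng=random.Random(0)):
    """Skeleton of the receding-horizon procedure of Sect. 5.2: every dv time
    steps, minimize the estimated cost J(t, v) over v in [u_lo, u_hi]^H from
    n_restarts starting points, then apply the first component of the
    minimizer for dv steps (here with a crude random search as placeholder)."""
    applied, t = [], 0
    while t < t_max:
        best_v, best_cost = None, float("inf")
        for _ in range(n_restarts):            # gamma independent runs
            v = [rng.uniform(u_lo, u_hi) for _ in range(H)]
            c = J(t, v)
            if c < best_cost:
                best_v, best_cost = v, c
        applied.extend([best_v[0]] * min(dv, t_max - t))
        t += dv
    return applied
```

In the paper each inner minimization is a Nelder-Mead simplex run on \(J_H^m\); only the outer loop structure is what this sketch illustrates.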
Several successive optional settings of the predictive control / minimization procedure were tested:
-
For the horizon, H: values from 1 to 10, by step of 1.
-
For (m, n) (computation of \(\bar{\sigma }_{t+1}^{m}\)): (20, 100), (50, 500), (100, 1000).
-
For \(\delta _{v^{*}}\): 1, 2, 5, 7, 10, 15, 50.
-
For s (number of bacteria counting repetitions carried out at each time t): 1, 2, 3.
-
For the filtering process itself, the initial prior distribution of the six unknown parameters, \(p_0^{\theta }(.)\), was taken as a uniform distribution over the following intervals (chosen according to microbiological considerations):
$$\begin{aligned}&1.62 \le \lambda \le 2.42, \ \ 2.03 \le D^s \le 3.03, \ \ 16.20 \le T^c \le 24.20, \ \ \nonumber \\ \nonumber \\&8.10 \le T^{opt} \le 12.10, \ \ 9.72 \le T^s \le 14.52, \ \ 17.82 \le Z \le 26.62. \end{aligned}$$(34)
By combining these settings, several simulated predictive control runs were performed for a given decreasing deterministic bi-lobed target trajectory \(x^*(t)\) (curve 3 in Fig. 1). The results of the most illustrative nine of them are reported in the following three tables (all with \(\delta _{v^{*}} = 2\), \(s=2\) and the noise surrogate coefficient of variation CV fixed at 0.025). Unsurprisingly, the best predictive control was obtained for \(m = 100, n=1000, H=10\) and is displayed in Fig. 1.
Results:
-
Table 1 presents the evolution, with respect to different combinations of m, n, H, of the discrepancy sum of squares between the target trajectory and the estimate of the expected controlled trajectory, after a logarithmic transformation, \(\displaystyle SSQ = \sum _{t=0}^{600} \left[ \log _{10}x^*_t - \overline{\log _{10}x_t}\right] ^2\) where \(\overline{\log _{10}x_t} = \frac{1}{n}\sum _{i=1}^{n}\log _{10}\bar{x}_t^i\) (see Appendix 1 for the generation of the particles \(\{\bar{x}_t^i\}\)). The most significant result is the major effect of the chosen horizon H on the decrease of the SSQ, compared with that of m and n once these are sufficiently large (from 50 and 500 upwards, respectively). This behaviour reveals the good predictive ability of this nonparametric predictive control approach.
-
Tables 2 and 3 display the final filtering estimates of each of the six parameters and the lower and upper bounds of their respective particle-estimated \(95\%\) confidence intervals (see [45] for technical details). These estimates are to be compared with the true parameter values (31) and with the initial prior parameter intervals (34), respectively. Besides the good quality of these estimates in spite of the relatively moderate m and n values used, another noticeable result is again the effect of the horizon H, whose increase seems to affect the widths of the parameter confidence intervals more than the parameter estimates themselves, with again a global improvement as H increases, for sufficiently large m and n.
-
Figure 1 displays four curves related to the control processing with the setting \((m=100, n=1000, H=10)\), in \(\log _{10}\) units for the state variable \(X_t\) (bacteria number, left vertical axis) and in degree Celsius units for the control variable \(T_t\) (applied temperature, right vertical axis):
-
Curve 1 is that of a simulation without control of the noisy state variable \(X_t\) (bacteria number per ml of culture broth (28)) under a fixed temperature (\(T_t = 2\,^{\circ }\)C).
-
Curve 2 is that of the computed optimal control \(v^*_t\) (temperature \(T_t^*\)) with \(\delta _{v^*} = 2\). With some algebra one can easily check that all \(v^*_t\) satisfy (32) and (33) for \(\beta > (\log _{10}x_0)^2\), and hence that the sufficient stabilisability condition A3 is satisfied.
-
Curve 3 is that of the bi-lobed target trajectory \(x^*_t\).
-
Curve 4 is that of the evolution of the expected optimally controlled state trajectory \(X_t\).
One can notice how well the predictive control anticipates the change of curvature of the bi-lobed reference trajectory under this horizon setting, leading to a satisfactory computed discrepancy (\(SSQ = 1.15\)).
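For completeness, the discrepancy measure SSQ reported in Table 1 amounts to the following computation (a minimal sketch; the argument names are ours):

```python
import math

def ssq(target, particles_by_time):
    """Discrepancy of Table 1: sum over t of
    (log10 x*_t - mean_i log10 xbar^i_t)^2, with particles_by_time[t]
    the particle swarm {xbar^i_t} at time t."""
    total = 0.0
    for x_star, parts in zip(target, particles_by_time):
        mean_log = sum(math.log10(x) for x in parts) / len(parts)
        total += (math.log10(x_star) - mean_log) ** 2
    return total
```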
The performance of this simulated predictive control run, and that of other unreported trials, could be improved by increasing the number m of simulations in the approximation of the cost-to-go function expectation, and accordingly the number n of particles used for the filtering step, at the price of still heavier computing times. But as said previously this drawback could be drastically reduced by parallelization of the computer code, as often done for particle procedures like this one. Moreover, with reasonable values of m and n as in the present settings, the proposed predictive control procedure already seems able to correctly anticipate the dynamic variations of the target trajectory and provides rather good control of the state space system considered, as shown by Fig. 1.
6 Conclusion
Solving stochastic NMPC problems on continuous state spaces, for imperfectly observed systems described by nonlinear non-Gaussian discrete-time state space models, remains a theoretical and practical challenge. This paper addresses both aspects of this ambitious objective: first, the estimation of the multi-step ahead conditional pdf of the system state variables and the estimation of the subsequent cost-to-go expectation; secondly, the minimization of this expected cost estimate, providing optimal controls to be applied at each time step. Based on a recently developed nonparametric particle estimator of the multi-step-ahead conditional pdf of the state variables and on the theory of epi-convergence, a simulation-based epi-convergent estimator of the expected cost-to-go over the receding horizon is proposed. Therefrom, every sequence of approximated minimizers of the corresponding expected cost-to-go estimates converges with probability one into the set of the minimizers of the true expected cost-to-go at each time step, and the sequence of the corresponding minima converges likewise to its true counterpart.
References
Andrieu, C., Doucet, A., Singh, S.S., Tadić, V.B.: Particle methods for change detection, system identification and control. Proc. IEEE 92, 423–438 (2004)
Attouch, H.: Variational Convergence for Functions and Operators. Pitman Advanced Publishing Program, Boston (1984)
Bertsekas, D.: Dynamic Programming and Optimal Control. Athena Scientific, Belmont (2005)
Bertsekas, D.: Dynamic programming and suboptimal control: a survey from ADP to MPC. Eur. J. Control 11(4–5) (2005)
Bidot, C., Gauchi, J.P., Vila, J.P.: Programmation Matlab du filtrage non linéaire par convolution de particules pour l’identification et l’estimation d’un système dynamique microbiologique. Rapport Technique 2009-3, INRA-MIA, Centre de Recherches de Jouy-en-Josas (2009)
Botchu, S.K., Ungarala, S.: Nonlinear predictive control based on sequential Monte Carlo state estimation. In: 8th International IFAC Symposium on Dynamics and Control of Process Systems, Vol. 3, pp. 31–36 (2007)
Calvet, L.E., Czellar, V.: Accurate methods for approximate Bayesian computation filtering. J. Financ. Econom. 13(4), 798–838 (2015)
Camacho, E.F., Bordons, C.: Model Predictive Control. Springer, Berlin (2013)
Campillo, F., Rossi, V.: Convolution particle filter for parameter estimation in general state-space models. IEEE Trans. Aerosp. Electron. Syst. 45, 1063–1072 (2009)
Cannon, M., Cheng, Q., Kouvaritakis, B., Raković, S.V.: Stochastic tube MPC with state estimation. Automatica 48, 536–541 (2012)
Chatterjee, D., Lygeros, J.: Stability and performance of stochastic predictive control (2013). arXiv:1304.2581v2
Chen, L.S., Geisser, S., Geyer, C.J.: Monte Carlo minimization for sequential control. Technical Report 591. School of Statistics, University of Minnesota (1993)
Choquet, R., Rossi, V.: Documentation à propos des routines pour le filtrage particulaire. CEFE-CNRS (2005)
Coroller, L., Kan-King-Yu, D., Leguerinel, I., Mafart, P., Membré, J.M.: Modelling of growth, growth/no-growth interface and nonthermal inactivation areas of Listeria in foods. Int. J. Food Microbiol. 152, 139–152 (2012)
Del Moral, P.: Nonlinear filtering using random particles. Theory Probab. Appl. 40(4), 690–701 (1996)
Del Moral, P.: Feynman-Kac Formulae—Genealogical and Interacting Particle Systems with Applications, p. 556. Springer, New York (2004)
Del Moral, P., Jacod, J., Protter, P.: The Monte Carlo method for filtering with discrete-time observations. Probab Theory Relat. Fields 120(3), 346–368 (2001)
Duflo, M.: Random Iterative Models. Springer, New York (1997)
Geyer, C.J.: On the convergence of Monte Carlo maximum likelihood calculations. J. R. Stat. Soc. B 56(1), 261–274 (1994)
Geyer, C.J.: Estimation and optimization of functions. In: Gilks, W.R., Richardson, S., Spiegelhalter, D.J. (eds.) Markov Chain Monte Carlo in Practice, pp. 241–258. Chapman & Hall/CRC, Boca Raton (1996)
Greenfield, A., Brockwell, A.: Adaptive control of nonlinear stochastic systems by particle filtering. In: Proceedings of the 4th International Conference on Control and Automation (2003)
Gordon, N.J., Salmond, D.J., Smith, A.F.M.: Novel approach to nonlinear non-Gaussian Bayesian state estimation. IEE Proc. Part F 140(2), 107–113 (1993)
Hernandez-Lerma, O.: Adaptive Markov Control Processes. Springer, New York (1989)
Hilgert, N., Senoussi, R., Vila, J.P.: Nonparametric identification of controlled nonlinear time varying processes. SIAM J. Control Optim., 950–960 (2000)
Hilgert, N., Rossi, V., Vila, J.P., Wagner, V.: Identification, estimation and control of uncertain dynamic systems: a nonparametric approach. Commun. Stat. Theory Methods 36, 2509–2525 (2007)
Hokayem, P., Cinquemani, E., Chatterjee, D., Ramponi, F., Lygeros, J.: Stochastic receding horizon control with output feedback and bounded controls. Automatica 48, 77–88 (2012)
Hürzeler, M., Künsch, H.R.: Monte Carlo approximations for general state space models. J. Computat. Gr. Stat. 7(2), 175–193 (1998)
Jasra, A., Singh, S., Martin, J., McCoy, E.: Filtering via approximate Bayesian computation. Stat. Comput. 22, 1223–1237 (2012)
Kantas, N., Maciejowski, J.M., Lecchini-Visinti, A.: Sequential Monte Carlo for model predictive control. Nonlinear Model Predictive Control, LNCIS 384, pp. 263–273. Springer, Heidelberg (2009)
Kouvaritakis, B., Cannon, M.: Stochastic model predictive control. In: Encyclopedia of systems and control. Springer, London (2015)
Langsom, W., Chryssochoos, I., Raković, S.V., Mayne, D.Q.: Robust model predictive control using tubes. Automatica 40, 125–133 (2004)
Laplace-Builh, C., Hahne, K., Hunger, W., Tirilly, Y., Drocourt, J.L.: Application of flow cytometry to rapid microbial analysis in food and drinks industries. Biol. Cell. 78, 123–128 (1993)
Lovejoy, W.S.: A survey of algorithmic methods for partially observed Markov decision processes. Ann. Oper. Res. 28, 47–66 (1991)
Maciejowski, J.M.: Predictive Control with Constraints. Prentice Hall, London (2002)
Marin, J.M., Pudlo, P., Robert, C.P., Ryder, R.: Approximate Bayesian computation methods. Stat. Comput. 22, 1167–1180 (2012)
Mayne, D.Q.: Model predictive control: Recent developments and future promise. Automatica 50, 2967–2986 (2014)
Mesbah, A.: Stochastic model predictive control: an overview and perspective for future research. IEEE Control Syst. 36, 30–44 (2016)
Musso, C., Oudjane, N., Le Gland, F.: Improving regularized particle filters. In: Doucet, A., Freitas, J.F.D., Gordon, N.J. (eds.) Sequential Monte Carlo Methods in Practice. Springer, New York (2000)
Parzen, E.: On estimation of a probability density function and mode. Ann. Math. Stat. 33, 1065–1076 (1962)
Pérez-Rodriguez, F., Valero, A.: Predictive Microbiology in Foods. Springer, New York (2013)
Raković, S.V., Kouvaritakis, B., Findeisen, R., Cannon, M.: Homothetic tube model predictive control. Automatica 48, 1631–1638 (2012)
Raković, S.V., Kouvaritakis, B., Cannon, M., Panos, C., Findeisen, R.: Parametrized tube model predictive control. IEEE Trans. Autom. Control 57, 2746–2761 (2012)
Rossi, V.: Filtrage non linéaire par noyaux de convolution. Application à un procédé de dépollution biologique. Ph.D. thesis, Ecole Nationale Supérieure Agronomique de Montpellier, France (2004)
Rossi, V., Vila, J.P.: Approche non paramétrique du filtrage de système non linéaire à temps discret et à paramètres inconnus. Comptes Rendus de l’Académie des Sciences, I 340, 759–764 (2005)
Rossi, V., Vila, J.P.: Nonlinear filtering in discrete time: a particle convolution approach. Annales de l’I.S.U.P. Publication de l’Institut de Statistique de l’Université de Paris, L, 71–102 (2006)
Salmond, D., Gordon, N.: Particles and mixtures for tracking and guidance. In: Doucet, A., de Freitas, N., Gordon, N. (eds.) Sequential Monte carlo Methods in Practice, pp. 517–532. Springer, New York (2001)
SAS Software. IML module, NLPNMS procedure, Version 9.2, SAS Institute, North Carolina, USA (2011)
Sehr, M.A., Bitmead, R.R.: Particle model predictive control: tractable stochastic nonlinear output-feedback MPC. (2016). arXiv:1612.00505v1
Stahl, D., Hauth, J.: PF-MPC: particle filter model predictive control. Syst. Control Lett. 60(8), 632–643 (2011)
Thrun, S.: Monte Carlo POMDPs. Adv. Neural Inf. Process. Syst. 12, 1064–1070 (2000)
Venot, A., Pronzato, L., Walter, E., Lebruchec, J.F.: A distribution-free criterion for robust identification, with applications in system modelling and image processing. Automatica 22, 105–109 (1986)
Vila, J.P.: Nonparametric multi-step prediction in nonlinear state space dynamic systems. Stat. Probab. Lett. 81, 71–76 (2011)
Vila, J.P.: Enhanced consistency of the resampled convolution particle filter. Stat. Probab. Lett. 82, 786–797 (2012)
Vogel, S., Lachout, P.: On continuous convergence and epi-convergence of random functions. Part I: Theory and relations. Kybernetika 39, 75–98 (2003)
Vogel, S., Lachout, P.: On continuous convergence and epi-convergence of random functions. Part II: sufficient conditions and applications. Kybernetika 39, 99–118 (2003)
Weissel, F., Schreiter, T., Huber, M.F., Hanebeck, U.D.: Stochastic model predictive control of time-variant nonlinear systems with imperfect state information. In: Proceedings of the IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems, pp. 40–46 (2008)
White, C.C., Arrington, D.P.: Application of Jensen’s inequality to adaptive suboptimal design. J. Optim. Theory Appl. 32, 89–99 (1980)
Yan, J., Bitmead, R.R.: Incorporating state estimation into model predictive control and its application to network traffic control. Automatica 41, 595–604 (2005)
Zervos, M.: On the epi-convergence of stochastic optimization problems. Math. Oper. Res. 24(2), 495–508 (1999)
Zhou, E., Fu, M.C., Marcus, S.I.: A density projection approach to dimension reduction for continuous-state POMDPs. In: Proceedings of the 47th IEEE Conference on Decision and Control, pp. 5576–5581 (2008)
Acknowledgements
The authors would like to thank a referee for their constructive comments and the associate editor for suggestions of improved connections with other filtering approaches.
Appendices
Appendix 1
Given \((y_{1:j},u_{0:j-1},v_{j:t})\), the construction and the convergence of the cost expectation estimator \(J_H^{m}(\mathbf {v}) = \frac{1}{m} \sum _{i=1}^m \sum _{t=j}^{j+H-1} \sigma ^{m}_{t+1}(X_{t+1}^i,v_{j:t})\) to its true counterpart \(J_H(\mathbf {v})\), as seen in Sects. 4.3 and 4.4, rely on \(p_{t+1}^{n(m),X}(x|y_{1:j},u_{0:j-1},v_{j:t})\), a convergent estimator of the true \((t+1-j)\)-step ahead state conditional pdf \(p_{t+1}^X(x|y_{1:j},u_{0:j-1},v_{j:t})\). This estimator will now be presented. To alleviate notations, the control variables will be suppressed from model (1), without any loss of generality for this estimator rationale, and the general problem of building a convergent estimator \(p_{t+k}^{n,X}(x|y_{1:t})\) of the k-step ahead pdf \(p_{t+k}^X(x|y_{1:t})\) will now be considered. This estimator has been recently developed (see [52]), inspired by [44, 45], which proposed nonparametric convergent estimators of the pdf \(p_{t}^X(x|y_{1:t},u_{0:t-1})\) by convolution particle filtering.
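The outer Monte Carlo average defining \(J_H^{m}\) above is straightforward to sketch; `sigma` and `particles` below are placeholders for the estimated one-step costs \(\sigma ^m_{t+1}\) and the simulated state particles \(X^i_{t+1}\), and the names are ours:

```python
def cost_estimate(sigma, particles, v, j):
    """Monte Carlo cost estimator
    J_H^m(v) = (1/m) * sum_i sum_{t=j}^{j+H-1} sigma_{t+1}(X^i_{t+1}, v_{j:t}),
    with particles[i][k] standing for the simulated state X^i_{j+k+1} and
    sigma(t, x, v_prefix) for the estimated one-step cost sigma^m_{t+1}."""
    m = len(particles)
    total = 0.0
    for traj in particles:
        for k in range(len(v)):          # k = 0, ..., H-1, i.e. t = j, ..., j+H-1
            total += sigma(j + k, traj[k], v[: k + 1])
    return total / m
```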
1.1 A Brief Review of Particle Filters and Related Methods
Let us first recall briefly that, to estimate the state conditional pdf of a discrete-time state space dynamical system, filtering procedures rely on iterated pairs of a prediction step based on the state dynamics and a correction step taking the new observation into account. Based on this scheme, the so-called particle filters generate swarms of n state variable realizations (particles) approximately distributed according to this posterior pdf. Among the earliest and most famous ones are the sampling importance resampling (SIR) filter [22] and the interacting particle filter (IPF) [16], with good asymptotic behaviour with n (weak convergence to the optimal filter), but with occasional particle concentration in case of small dynamical noise. To remedy this particle degeneracy and ensure particle diversity over time, regularizations of the particle system distribution have been proposed by introducing a regularization step before or after the correction step [27, 38]: a kernel method is used to change the particle discrete empirical distribution into an absolutely continuous distribution in the resampling step. Both such pre- and post-regularization particle filters (RPF filters) have been proved to converge to the optimal filter in the weak sense. However, all such particle filters require the analytical availability of the pdf of the observation variables (up to a normalizing constant), a condition rarely met in practical situations. Other particle filtering methods have thus been developed to get rid of this constraint: approximate Bayesian computation (ABC) approaches [35], free of this requirement, have been applied to filtering. Based also on propagation and correction particle steps, these ABC approaches are characterized by the introduction of a so-called tolerance level \(\epsilon \) in the particle weighting of the correction phase, the selection of which conditions the asymptotic behaviour of the filter with n [28].
By considering this tolerance level no longer as a fixed value but as a time-varying bandwidth parameter depending on n, a kernel-based particle weighting has recently been introduced [7], allowing \(L^2\)-convergence of these modified ABC filters.
However, it is the almost sure convergence property of the conditional pdf estimator of the state that is needed in the present work (see Sects. 4.3, 4.4). The generalized convolution particle filter/predictor presented in the next paragraph offers this valuable convergence property, while still not requiring analytical availability of the observation variable pdf. Moreover, in this nonparametric approach, convolution kernels are used not as mere regularization tools as in the RPF filters, but to build nonparametric almost surely convergent estimators of the state variable pdf, conditional on the observation variables.
Let us close this short review of particle filtering by mentioning that the filtering of continuous-time state space dynamical systems has also benefited from Monte Carlo approaches. For example, for systems whose state evolution is governed by stochastic differential equations with a process transition semi-group that cannot be made explicit, original approaches have been proposed, adapted to different schemes of discrete-time observations: approximate filters of state function expectations have been developed via simulations of Euler approximations of the stochastic equations, with control of the resulting approximation errors [17].
1.2 Convolution Particle Filtering
In the following, we summarize the construction principles of the nonparametric particle pdf filter/predictor advocated for the estimation of the conditional pdf of interest in the present work (see Sect. 4.1).
With respect to model (1), assumed free of any control variable as stated previously, the state vector at time \(t+k\) can be written

$$\begin{aligned} X_{t+k} \ = \ F_{t+k}\big (X_{t-1}, \theta , \varepsilon _t, \varepsilon _{t+1}, \ldots , \varepsilon _{t+k}\big ), \end{aligned}$$

where \(F_{t+k}\) denotes the composition of the successive state transition functions \(f_t,\ldots ,f_{t+k}\). Let

$$\begin{aligned} Z_t \ := \ X_{t+k}. \end{aligned}$$

Estimating the conditional pdf \(p_{t}^Z(z |y_{1:t})\) of \(Z_t\) is then equivalent to estimating the pdf of interest \(p_{t+k}^X(x|y_{1:t})\) of \(X_{t+k}\).
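As a concrete illustration, the k-step-ahead variable \(Z_t\) can be simulated by composing the state transition over \(k+1\) steps (a minimal sketch with a time-invariant transition `f`, a hypothetical placeholder for the \(f_t\) of model (1)):

```python
def F(x_prev, theta, noises, f):
    """Compose the state transition over successive steps:
    Z_t = X_{t+k} = F_{t+k}(X_{t-1}, theta, eps_t, ..., eps_{t+k}).

    `noises` holds the k+1 noise draws (eps_t, ..., eps_{t+k});
    `f` stands in for the (here time-invariant) transition f_t.
    """
    x = x_prev
    for eps in noises:
        x = f(x, theta, eps)
    return x

# Toy linear transition: f(x, theta, eps) = theta * x + eps
z = F(0.0, 1.0, [1.0, 2.0, 3.0], lambda x, th, e: th * x + e)
```

Drawing the noises at random and repeating this composition over many particles is exactly how the algorithm below produces the \(z_t^i\).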
Let us introduce \(Z_t\) as a new state variable into the state equations of model (1), and let us also add as an extended state equation the parameter invariance equality \(\theta _{t}= \theta _{t-1}\) (with initial prior \(p_0^\theta \)), to account for the unknown parameters. Model (1) is unchanged by these additions but now reads, without control:
The estimation of the joint conditional pdf \(p_{t}^{X,Z,\theta }(x, z, \theta |y_{1:t})\) of \((X_t, Z_t, \theta _t)\) and its marginals \(p_t^X(x|y_{1:t})\), \(p_t^Z(z|y_{1:t})\), \(p_t^\theta (\theta |y_{1:t})\), is a filtering problem. A convergent nonparametric particle filtering approach, free of any particle depletion drawback, has been recently proposed to solve filtering problems of this kind under the mild assumptions of Sect. 2 (see [25, 43,44,45]). This approach can provide convergent estimators of all the conditional pdf of interest and in particular of the pdf \(p_t^Z(z|y_{1:t})\) (see [52, 53]). It relies on the recursive generation of n particles \( \, ({x}^i_t, {z}^{i}_t, {\theta }^i_t, {y}^i_t), \, i=1,\ldots ,n\), at each time step t.
In the following we shall focus only on the pdf \(p_t^Z(z|y_{1:t})\). The extension to \(p_t^X(x|y_{1:t})\) and to \(p_t^\theta (\theta |y_{1:t})\) is straightforward.
Algorithm:

- Step \(t=0\): For \(i=1,\ldots ,n\), generate \(\bar{x}_0^i \sim p_0^x, \ \bar{\theta }_0^i \sim p_0^\theta \); set \(t = t + 1\).
- Step \(t > 0\): For \(i=1,\ldots ,n\):
  - if \(t=1\): generate \( {\varepsilon }^i_1 \sim {\mathcal L}_{\varepsilon _1}, \ {\nu }^{1,i}_1 \sim {\mathcal L}_{\varepsilon _2}, \ \ldots ,\ {\nu }^{k,i}_1 \sim {\mathcal L}_{\varepsilon _{1+k}},\)
    \({x}_1^i = f_1(\bar{x}_0^i, \bar{\theta }_0^{i}, {\varepsilon }_1^i)\), \({z}_{1}^{i} = F_{1+k}(\bar{x}_0^i, \bar{\theta }_0^{i}, {\varepsilon }_1^i, {\nu }^{1,i}_1,\ldots ,{\nu }^{k,i}_1)\), \({\theta }_1^i = \bar{\theta }_0^i\),
    and generate \({y}_1^i \sim G_1(. | {x}_1^i, {\theta }_1^{i})\).
  - if \(t > 1\): generate \((\bar{x}^i_{t-1}, \bar{z}^{i}_{t-1}, \bar{\theta }^i_{t-1}) \sim p^{n,X,Z,\theta }_{t-1}(x,z,\theta |y_{1:t-1}),\)
    \({\varepsilon }^i_t \sim {\mathcal L}_{\varepsilon _t}, \ {\nu }^{1,i}_t \sim {\mathcal L}_{\varepsilon _{t+1}}, \ {\nu }^{2,i}_t \sim {\mathcal L}_{\varepsilon _{t+2}}, \ \ldots , \ {\nu }^{k,i}_t \sim {\mathcal L}_{\varepsilon _{t+k}},\)
    \({x}^i_t = f_t(\bar{x}^i_{t-1}, \bar{\theta }^{i}_{t-1}, {\varepsilon }^i_t)\), \({z}^{i}_t = F_{t+k}(\bar{x}^i_{t-1}, \bar{\theta }^{i}_{t-1}, {\varepsilon }^i_t,{\nu }^{1,i}_t,\ldots ,{\nu }^{k,i}_t)\),
    \({\theta }^i_t = \bar{\theta }^i_{t-1}\), and generate \({y}^i_t \sim G_t(.|{x}^i_t, {\theta }^{i}_t)\),

  with

  $$\begin{aligned} p_t^{n,X,Z,\theta }(x, z, \theta | y_{1:t}) \ := \ \displaystyle \frac{\sum _{i=1}^n K_{\delta _n}^{Y}({y}_t^i-y_t)\times K_{\delta _n}^{X}({x}^i_t-x)\times K_{\delta _n}^{Z}({z}^{i}_t - z)\times K_{\delta _n}^{\theta }({\theta }_t^i-\theta )}{\sum _{i=1}^n K_{\delta _n}^{Y}({y}_t^i-y_t)}, \end{aligned}$$ (38)

  $$\begin{aligned} p^{n,Z}_t(z| y_{1:t}) \ := \ \displaystyle \frac{\sum _{i=1}^n K_{\delta _n}^{Y}({y}_t^i-y_t)\times K_{\delta _n}^{Z}({z}^{i}_t - z)}{\sum _{i=1}^n K_{\delta _n}^{Y}({y}_t^i-y_t)}, \end{aligned}$$ (39)

  $$\begin{aligned} \hat{z}_t^n \ := \ \frac{1}{n} \sum _{i=1}^n \bar{z}_t^i, \end{aligned}$$ (40)

  then set \(t = t + 1\) and go back to Step \(t\),

in which

- \( K_{\delta _n}^Y(v) \ := \ \frac{1}{\delta _n^s}K^Y\big (\frac{v}{\delta _n}\big )\), \( v \in I\!\!R^s\), where \(K^Y(.)\) is a basic Parzen-Rosenblatt kernel function (see [24, 39]) of dimension s: \(K^Y(.)\) is bounded, positive, symmetric, with \(\lim _{\parallel v \parallel \rightarrow \infty } {\parallel \! v \! \parallel }^sK^Y(v) = 0\), \(\displaystyle { \int \! K^Y(v)d\lambda (v) = 1}\) where \(\lambda \) is the Lebesgue measure, and \(\delta _{n}\) is the kernel window width parameter to be adequately chosen (see [52]). Example: the simple Gaussian kernel \(K^Y(v) = (1/\sqrt{2\pi })^s\exp (- \! \parallel \! v \! \parallel ^2 \! /2)\).
- \( K_{\delta _n}^Z(.)\), \(K_{\delta _n}^X(.)\) and \(K_{\delta _n}^{\theta }(.)\): kernels defined in a similar way for the variables Z, X and the parameter \(\theta \), of dimension d, d and p respectively, with the same kernel window width parameter \(\delta _{n}\) as that of \(K_{\delta _n}^Y(.)\) (this assumption could be relaxed).
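The weighted-kernel estimator (39) can be sketched numerically with Gaussian kernels in one dimension (a minimal illustration on hypothetical particle arrays; the paper allows any Parzen-Rosenblatt kernel and general dimensions):

```python
import numpy as np

def k_gauss(u, delta):
    """1-D Gaussian Parzen-Rosenblatt kernel K_delta(u) = K(u/delta)/delta."""
    return np.exp(-0.5 * (u / delta) ** 2) / (np.sqrt(2.0 * np.pi) * delta)

def pdf_z_estimate(z_grid, z_part, y_part, y_obs, delta):
    """Estimator (39): weight each particle by K^Y(y_t^i - y_t),
    then smooth the z-particles with K^Z, here both Gaussian."""
    w = k_gauss(y_part - y_obs, delta)
    w = w / w.sum()
    return np.array([np.sum(w * k_gauss(z_part - z, delta)) for z in z_grid])

# Hypothetical particle cloud: z-particles from a standard normal,
# simulated observations all equal to the actual one (uniform weights).
rng = np.random.default_rng(1)
grid = np.arange(-6.0, 6.0, 0.01)
dens = pdf_z_estimate(grid, rng.standard_normal(2000), np.zeros(2000), 0.0, 0.3)
```

Being a convex combination of kernels, the resulting estimate is a genuine pdf: nonnegative and integrating to one.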
Remark 6
In practice the generation of the \(\{\bar{x}_t^i, \bar{z}_t^i, \bar{\theta }_t^i, i=1,\ldots ,n\}\) according to (38) is done very easily by a multinomial resampling step of the particles \(\{x_t^i, z_t^i, \theta _t^i, i=1,\ldots ,n\}\), followed by a regularization step according to the simulable distributions of the respective state noises (see [5, 13]).
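This resampling-plus-regularization step can be sketched as follows (a minimal 1-D illustration assuming a Gaussian regularization kernel; `particles` and `weights` are hypothetical arrays, the weights being proportional to \(K_{\delta _n}^{Y}({y}_t^i-y_t)\)):

```python
import numpy as np

def resample_regularize(particles, weights, delta, rng):
    """Multinomial resampling of the weighted particles, followed by a
    kernel regularization (jitter) step restoring an absolutely
    continuous particle distribution, as in Remark 6."""
    n = len(particles)
    p = weights / weights.sum()
    idx = rng.choice(n, size=n, p=p)                        # multinomial resampling
    return particles[idx] + delta * rng.standard_normal(n)  # regularization

rng = np.random.default_rng(2)
parts = np.array([0.0, 10.0, 20.0])
w = np.array([0.0, 1.0, 0.0])          # all weight on the middle particle
new = resample_regularize(parts, w, 0.1, rng)
```

With all the weight on one particle, every resampled point is a small jitter around it, which is precisely how regularization prevents particle depletion.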
Theorem 7.1
For any \(t>1\), if the pdf \(p_{t}^Y(y|y_{1:t-1})\) is continuous and strictly positive at \(y_{t}\), if there exists \(M>0\) such that \(p_{t}^Y(y|x_{t},\theta _t)\le M\), and if \(\hbox {Var}[X_{t},Z_t,\theta _{t}|y_{1:t}]\) exists and is bounded, then

$$\begin{aligned} \lim _{n \rightarrow \infty } \big \Vert p^{n,Z}_t(\cdot | y_{1:t}) - p_t^Z(\cdot | y_{1:t})\big \Vert _{L_{1}} \ = \ 0 \quad a.s., \end{aligned}$$
with \(\displaystyle \big \Vert \Phi (z)\big \Vert _{L_{1}} = \int \big |\Phi (z)\big |dz,\) for an integrable function \(\Phi (z)\).
\( p^{n,Z}_t(z| y_{1:t})\) is then an a.s. \(L_1\)-convergent estimator of the pdf \( p_t^Z(z| y_{1:t})\), and hence also of the k-step ahead pdf of interest \(p_{t+k}^X(x| y_{1:t})\).
Proof
Direct application to the state variable \(Z_t\) of the convergence results of the Resampled Convolution Particle filter (see [45], which also gives the rate of convergence). \(\square \)
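This \(L_1\)-convergence in n can be illustrated numerically on a toy case where the target pdf is known (a hedged sketch: plain Gaussian kernel density estimation of a standard normal, with an illustrative bandwidth choice \(\delta _n = n^{-1/5}\), not the rule of [52]):

```python
import numpy as np

def l1_error(n, rng, grid, true_pdf):
    """L_1 distance between a Gaussian-kernel density estimate
    built from n samples and the true pdf, on a fixed grid."""
    x = rng.standard_normal(n)
    delta = n ** (-0.2)                      # illustrative bandwidth choice
    est = np.array([np.mean(np.exp(-0.5 * ((x - z) / delta) ** 2))
                    / (np.sqrt(2.0 * np.pi) * delta) for z in grid])
    return np.sum(np.abs(est - true_pdf)) * (grid[1] - grid[0])

rng = np.random.default_rng(3)
grid = np.arange(-6.0, 6.0, 0.02)
truth = np.exp(-0.5 * grid ** 2) / np.sqrt(2.0 * np.pi)
err_small_n = l1_error(100, rng, grid, truth)
err_large_n = l1_error(10000, rng, grid, truth)
```

Comparing the two errors gives an empirical feel for the theorem: as n grows the kernel estimate tightens around the true density in \(L_1\) norm.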
Theorem 7.2
For any \(t > 1\), if the pdf \(\,p_{t}^Y(y|y_{1:t-1})\,\) is continuous and strictly positive at \(y_{t}\,\) and if there exist \(M_{1}>0\) such that \(p_{t}^{X,Z}(x,z|x_{t-1},z_{t-1},\theta _{t-1})\le M_{1}\), \(M_{2}>0\) such that \(p_{t}^Y(y|x_{t},\theta _t)\le M_{2}\), and \(M_{3}>0\) such that \(p_{t}^{\theta }(\theta |y_{1:t})\le M_{3}\), then

$$\begin{aligned} \lim _{n \rightarrow \infty } \big \Vert p^{n,X,Z,\theta }_t(\cdot | y_{1:t}) - p_t^{X,Z,\theta }(\cdot | y_{1:t})\big \Vert _{L_{1}} \ = \ 0 \quad a.s. \end{aligned}$$
Proof
Idem (see [53]).
\(p_{t+k}^{n,X}(x| y_{1:t})\), the n-particle estimator of the k-step ahead conditional state pdf \(p_{t+k}^X(x| y_{1:t})\), is then given by the pdf \(p^{n,Z}_t(z| y_{1:t})\) defined in (39) and inherits its convergence properties. \(\square \)
Remark 7
Similar estimators \(p_t^{n,X}(x|y_{1:t})\) and \(p_t^{n,\theta }(\theta |y_{1:t})\), of \(p_t^X(x|y_{1:t})\) and \(p_t^{\theta }(\theta |y_{1:t})\) respectively, can be defined with similar convergence properties.
Appendix 2: Proof of Lemma 4.5
The almost sure epi-convergence of \(S^m(\mathcal {X},\mathbf {v})=\sum _{t=j}^{j+H-1}\sigma ^m_{t+1}(x_{t+1},v_{j:t})\) to \(S(\mathcal {X},\mathbf {v}) = \ \sum _{t=j}^{j+H-1}\sigma _{t+1}(x_{t+1},v_{j:t})\), for all \(\mathcal {X}\), is established by the same standard approach as that already used to show the epi-convergence of \(J_H^{m}\) to \(J_H\) in Sect. 4.4, with the same notations, by checking the two inequalities
and
For brevity's sake, only the proof of (43), which is more specific, will be given.
By (39), \(\forall t \ge j\)
with \({z}^i_j = F_{t+1}\Big (\bar{x}^i_{j-1}, \bar{\theta }^i_{j-1},{\varepsilon }^i_j,{\nu }^{1,i}_j,\ldots ,{\nu }^{(t+1-j),i}_j,u_{0:j-1},v_{j:t}\Big ), \ \ \ i=1,\ldots ,n(m)\).
and according to (8)
Now with the same notations as in Sect. 4.4, given \(\mathcal {X}\), let \(\mathbf {v}^k \in B_k \subset \mathcal {B}_c\), such that
By the same approach as that proposed in Sect. 4.4, it is possible to choose a subsequence \(\{k_l\}\) such that \(B_{k_l} \supset B_{k_{l+1}}\), with \(\cap _l B_{k_l} = \{\mathbf {v}\}\).
Then
\(\square \)
Vila, JP., Gauchi, JP. Predictive Control of Discrete Time Stochastic Nonlinear State Space Dynamical Systems: A Particle Nonparametric Approach. Appl Math Optim 80, 165–194 (2019). https://doi.org/10.1007/s00245-017-9462-9
Keywords
- Stochastic state space dynamical systems
- Nonlinear model predictive control
- Particle convolution filter
- Kernel density estimator
- Epi-convergence