1 Introduction

Modern stochastic optimal control theory has been developed along the lines of the Pontryagin maximum principle and Bellman dynamic programming [1, 2]. The stochastic maximum principle was first considered by Kushner [3]. A general theory of the stochastic maximum principle based on random convex analysis was given by Bismut [4]. Modern presentations of the stochastic maximum principle in terms of backward stochastic differential equations can be found in [5].

The optimal control of stochastic systems is a difficult problem, particularly when the system is strongly nonlinear and there are state and control constraints [6]. The solution of higher-dimensional problems demands a different approach. Rather than trying to solve the problem globally, one can look for a locally optimal solution [7]. This is usually achieved by first solving the problem in the noise-free case, yielding an optimal trajectory, and then modelling the influence of the noise around this trajectory. For example, hybrid Monte Carlo sampling was used to infer the control in a continuous-time, continuous state-action stochastic optimal control problem with state constraints [8], and a revised Hooke–Jeeves algorithm based on the same idea was constructed to solve the stochastic optimal control problem of designing the trajectory of horizontal wells [9]. A particularly common situation is that the state space is constrained, and taking such constraints into account is an important issue in the computation of the optimal control.

Over the past several years, 1,3-PD has received worldwide attention in microbial production because of its low cost, high productivity and absence of pollution [10]. In order to gain a better understanding of the processes involved, to extend results and to make predictions, mathematical models are indispensable. For bacterial growth, the models usually take the form of differential equations or systems thereof. The bioprocess, including its mathematical modelling, optimal control, robust \(H_{\infty }\) control, stochastic perturbation and uncertainty, has been widely studied in the literature [11,12,13,14]. Compared with continuous and fed-batch cultures, the motivation to study batch culture is twofold [15]: (1) batch culture is a simple and easy operation mode compared with fed-batch and continuous cultures; (2) batch culture is the basis for understanding and controlling fed-batch and continuous cultures. Therefore, nonlinear dynamical systems in batch culture have been extensively considered in recent years, including sensitivity analysis [16], joint estimation [17], hybrid systems [18], multi-objective optimization [19], robust multi-objective optimal control [20] and strong stability [21].

However, a major drawback of all the papers mentioned above is that they do not take the presence of stochasticity into account. The dynamics of the system are not deterministic but intrinsically stochastic, and consideration of the inherent stochasticity of the microorganism is necessary to uncover the precise nature of the real process. Differential equations in which some or all coefficients are random variables, or which incorporate stochastic effects (usually in the form of white noise), have been increasingly used to deal with errors and uncertainty and have become a growing field of great scientific interest. In this paper, stochasticity is introduced into the model by parameter perturbation of the specific growth rate of cells [22, 23]. The process is modelled by a stochastic differential system driven by a five-dimensional Brownian motion, which is suitable for the actual fermentation. Stochastic control is the subfield of control theory that deals with the presence of uncertainty and aims to design a controller that performs the desired control task optimally. In this paper, we study the stochastic optimal control of 1,3-PD production, where the volumetric productivity of 1,3-PD is the optimization target and the initial concentrations of biomass and glycerol together with the terminal time are the manipulated variables. The main difference from our previous work [23] is that the dynamical system here is time dependent. Our main concern is to compute the solution of this stochastic optimal control problem.

The remainder of the paper is organized as follows. In Sect. 2, we present a nonlinear stochastic dynamical system for the batch fermentation process. In Sect. 3, an optimal control model is established. In Sects. 4 and 5, we develop a computational approach to solve this optimal control model. Section 6 describes a practical optimization algorithm, and Sect. 7 illustrates the numerical results. Finally, conclusions are provided in Sect. 8.

2 Nonlinear stochastic dynamical system

2.1 Deterministic model

Mass balances of biomass, substrate and products in batch microbial culture are written as follows (see [24]).

$$\begin{aligned}\left\{ \begin{array}{ll} \dot{x}_{1}(t)=\mu x_{1}(t)\\ \dot{x}_{2}(t)=-q_{2}x_{1}(t)\\ \dot{x}_{i}(t)=q_{i}x_{1}(t), \ \ i=3,4,5. \end{array}\right. \ \ \ \ \ \ t\in (0,T) \end{aligned}$$

where the specific growth rate of cells \(\mu (t)\), the specific consumption rate of substrate \(q_{2}(t)\) and the specific formation rates of products \(q_{i}(t)\) are given by the following expressions.

$$\begin{aligned} \mu (t)&=\mu _{m}\exp \left( \frac{-(t-t_m)^2}{2t_l^2}\right) \prod _{i=2}^{5}\left( 1-\frac{x_{i}(t)}{x_{i}^{*}}\right) \\ q_{2}(t)&=m_{2}+\frac{\mu (t)}{Y_{2}}\\ q_{i}(t)&=m_{i}+\mu (t) Y_{i},\quad i=3,4,5, \end{aligned}$$

where \(x_{1}(t),x_{2}(t),x_{3}(t),x_{4}(t),x_{5}(t)\) are the concentrations of biomass, glycerol, 1,3-PD, acetic acid and ethanol, respectively, at time t in the reactor. \(x(0):=x_{0}\in {\mathbb {R}}_+^5\) denotes the initial state. Under anaerobic conditions at 37 \(^{\circ }\hbox {C}\) and pH \(=\) 7.0, \(\mu _{m}\) is the maximum specific growth rate of cells, \(t_m\) and \(t_l\) are the parameters of its time-dependent profile, and \(m_i\), \(Y_{i}\), \(i=2,\ldots ,5\), are system parameters [24]. \(x_{i}^{*}\), \(i=2,\ldots ,5\), are the critical concentrations above which cells cease to grow. \(T\in (0,+\infty )\) is the terminal time of the batch fermentation.
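
For readers who wish to reproduce the simulations, the right-hand side of the deterministic balance equations follows directly from the kinetic expressions above. The following Python sketch is illustrative only; all numerical values assigned to \(\mu_m\), \(t_m\), \(t_l\), \(m_i\), \(Y_i\) and the critical concentrations \(x_i^*\) are dummy placeholders to be replaced by the values in Table 1.

```python
import numpy as np

# Dummy placeholder parameters: replace with the values from Table 1.
mu_m, t_m, t_l = 1.0, 1.0, 1.0                      # max specific growth rate and its time profile
m = {2: 1.0, 3: 1.0, 4: 1.0, 5: 1.0}                # maintenance terms m_i
Y = {2: 1.0, 3: 1.0, 4: 1.0, 5: 1.0}                # yield coefficients Y_i
x_star = np.array([1.0, 1.0, 1.0, 1.0, 1.0])        # critical concentrations x_i^*

def specific_rates(t, x):
    """Specific growth rate mu(t) and specific rates q_2(t), q_i(t) of the kinetic model."""
    inhibition = np.prod(1.0 - x[1:] / x_star[1:])                      # prod_{i=2}^{5} (1 - x_i/x_i^*)
    mu = mu_m * np.exp(-(t - t_m) ** 2 / (2.0 * t_l ** 2)) * inhibition
    q2 = m[2] + mu / Y[2]
    q = {i: m[i] + mu * Y[i] for i in (3, 4, 5)}
    return mu, q2, q

def deterministic_rhs(t, x):
    """Right-hand side of the deterministic batch-culture balance equations."""
    mu, q2, q = specific_rates(t, x)
    return np.array([mu * x[0], -q2 * x[0], q[3] * x[0], q[4] * x[0], q[5] * x[0]])
```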

The growth of microorganisms such as bacteria and algae is of great interest in biology and medicine. Field observations and laboratory experiments are expensive and very difficult to perform. In the field it is almost impossible to keep the external factors constant and uniform, whereas in the laboratory there is much more control; even then, the conditions may vary in time and differ from the actual environments of interest. Moreover, errors in measuring the population sizes occur frequently: even when measurements are done with the utmost care, the measured values will differ between experimental batches. In most cases, this variation is quite dramatic due to the large sizes and/or variability of the populations, inaccuracies in the methods used to assess them, error (human or otherwise), as well as other unknown factors. Thus, it is important to consider stochastic disturbances in the nonlinear multistage dynamical system, especially for the stochastic optimal control problem considered here.

2.2 White noise stochastic disturbances on the model parameter

Based on our previous work [22], the mass balances of biomass, substrate and products in batch culture can be formulated as the following nonlinear multistage stochastic dynamical system:

$$\begin{aligned}&\displaystyle dx(t)=F(x(t))\text {d}t+G(x(t))\text {d}w,\, t\in I=[0, T], \end{aligned}$$
(1)
$$\begin{aligned}&\displaystyle x(0)=x_{0}, \end{aligned}$$
(2)

where

$$\begin{aligned}&F(x(t))=(\mu x_1(t),-\,q_{2}x_1(t),q_{3}x_1(t),q_{4}x_1(t), q_{5}x_1(t))^{\top },\end{aligned}$$
(3)
$$\begin{aligned}&G(x(t))=\sigma _{\mu }(x_1(t),x_1(t)/Y_2,Y_3x_1(t),Y_4x_1(t), Y_5x_1(t))^{\top }, \end{aligned}$$
(4)
$$\begin{aligned}&E(\dot{w}(t))=0,\\&D(\dot{w}(t))=1. \end{aligned}$$

\(\sigma _{\mu }\) is the intensity of the inherent stochastic disturbance. In Eq. (1), \(x(t)=(x_1(t),\ldots ,x_5(t))^{\top }\) is a stochastic process that reflects the fluctuation of the concentrations under the inherent stochastic disturbance.
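
Continuing the sketch from Sect. 2.1, the drift F and diffusion G of Eqs. (3) and (4) can be coded as below; `deterministic_rhs` and the dictionary `Y` are those of the previous sketch, and the value inserted for \(\sigma _{\mu }\) is the one used later in Sect. 7.

```python
sigma_mu = 0.01638   # noise intensity sigma_mu (value used in Sect. 7)

def drift(t, x):
    """F(x(t)) of Eq. (3): identical to the deterministic right-hand side."""
    return deterministic_rhs(t, x)        # from the sketch in Sect. 2.1

def diffusion(x):
    """G(x(t)) of Eq. (4): perturbation entering through the specific growth rate."""
    return sigma_mu * np.array([x[0], x[0] / Y[2], Y[3] * x[0], Y[4] * x[0], Y[5] * x[0]])
```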

The values of the system parameters are listed in Table 1 (see [24]).

Table 1 Parameter values of each reactant in the system

3 Stochastic optimal control problem

The solution of system (1)–(2) corresponding to a given control vector is denoted by \(x(\cdot ,u)\). The batch culture is terminated when the expected concentration of glycerol declines to 150 mmol/L, that is,

$$\begin{aligned} {\mathbb {E}}c_2^{\top }x(\tau ,u)=150, \end{aligned}$$

where \(c_2:=(0,1,0,0,0)^{\top }\), \(\tau =\inf \{t: {\mathbb {E}}c_{2}^{\top }x(t,u)=150\}\) and \({\mathbb {E}}\) denotes the mathematical expectation. As in the deterministic case, in batch culture the initial concentrations of biomass and glycerol and the terminal time can be chosen as control variables. Let \(u:=(u_1,u_2,u_3)^{\top }=(x_{01},x_{02},\tau )^{\top } \in {\mathbb {R}}^{3}_{+}\) be the control vector. In the actual fermentation there exist critical concentrations of biomass, glycerol, 1,3-PD, acetate and ethanol, outside which cells cease to grow. Hence, it is biologically meaningful to restrict the concentrations of biomass, glycerol and products to a set W and the control vector to an admissible control set U, defined respectively as follows:

$$\begin{aligned}&x(t,u) \in W\triangleq \prod _{i=1}^5[x_{i*},x^{*}_i] \subset {\mathbb {R}}_{+}^5, \ \ \ \ \forall t\in I,\\&u \in U\triangleq \prod _{i=1}^3[u_{i*},u^{*}_i] \subset {\mathbb {R}}_{+}^3, \end{aligned}$$

where \(u_{i*}\) and \(u^{*}_i\) denote the lower and upper bounds of the control variables, respectively.

The aim of the microbial fermentation in batch culture is to maximize the volumetric productivity of 1,3-PD, so we formulate the stochastic optimal control problem of the batch culture as follows.

$$\begin{aligned} (\text {SOCP}):&\inf \,\,\,J(u)\triangleq -{\mathbb {E}}\frac{x_3(u_3, u)}{u_3} \\&\mathrm{s.t.}\,\,\,x(t,u)\in W, \ \ \ \ \forall t\in [0, u_3],\\&\ \ \ \ \ \ u\in U. \end{aligned}$$

By the theory of continuous dependence of solutions on parameters, the existence of the optimal control and some important properties have already been established in our previous work [23].

4 Time-scaling transformation

Problem \(\text{ SOCP }\) can be treated as a constrained optimization problem in which the control variable \(u_3\) is a decision variable to be optimized. However, it is cumbersome to integrate the state and variational (or costate) systems numerically when the terminal time \(u_3\) is itself a decision variable. To address this difficulty, the time-scaling transformation [25], which maps the variable time horizon to a fixed horizon, is now one of the most popular tools.

Define

$$\begin{aligned} t(s)=u_3\cdot s. \end{aligned}$$
(5)

Let \({\tilde{x}}(s)=x(t(s)).\) For \(s\in [0, 1]\) we have \(t(s)\in [0, u_3]\), and the system (1)–(2), with \(F(x), G(x), E(\dot{w}(t))\) and \(D(\dot{w}(t))\) as above, becomes

$$\begin{aligned} \dot{{\tilde{x}}}(s):= & {} {\tilde{f}}({\tilde{x}}(s), u,{\widetilde{w}}(s))\nonumber \\= & {} u_3{\widetilde{F}}({\tilde{x}}(s)) +u_3{\widetilde{G}}({\tilde{x}}(s))\dot{{\widetilde{w}}}(s) =u_3[F(x(t(s)))+G(x(t(s)))\dot{w}(t(s))], \nonumber \\ \end{aligned}$$
(6)
$$\begin{aligned} {\widetilde{x}}(0)= & {} x_{0}, \end{aligned}$$
(7)

where

$$\begin{aligned} {\widetilde{F}}({\tilde{x}}(s)):= & {} (\mu (t(s))x_1(t(s)), -q_{2}(t(s))x_1(t(s)), q_{3}(t(s))x_1(t(s)), q_{4}(t(s)) x_1(t(s)),\\&q_{5}(t(s))x_1(t(s)))^{\top }, \\ {\widetilde{G}}({\tilde{x}}(s)):= & {} \sigma _{\mu }(x_1(t(s)), x_1(t(s))/Y_2,Y_3x_1(t(s)),Y_4x_1(t(s)),Y_5x_1(t(s)))^{\top }, \nonumber \\ E(\dot{{\widetilde{w}}}(s))= & {} 0, \\ D(\dot{{\widetilde{w}}}(s))= & {} 1. \end{aligned}$$
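
In code, the time-scaling in Eq. (6) amounts, literally, to multiplying the original drift and diffusion by \(u_3\) and evaluating them at \(t(s)=u_3 s\). A two-line sketch (reusing `drift` and `diffusion` from Sect. 2.2) is given below.

```python
def scaled_drift(s, x, u3):
    """u_3 * F(x(t(s))) of Eq. (6), with t(s) = u_3 * s as in Eq. (5)."""
    return u3 * drift(u3 * s, x)

def scaled_diffusion(x, u3):
    """u_3 * G(x(t(s))) of Eq. (6)."""
    return u3 * diffusion(x)
```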

The solution of system (6) with (7) is denoted by \({\tilde{x}}(\cdot ,u)\). Problem \(\text{ SOCP }\) then becomes

$$\begin{aligned} \widetilde{\text {SOCP}}:&\inf \,\,\,\tilde{J}(u)\triangleq -{\mathbb {E}}\frac{{\tilde{x}}_3(1, u)}{u_3} \\&\mathrm{s.t.}\,\,\,{\tilde{x}}(s,u)\in W, \ \ \ \ \forall s\in [0, 1],\\&\ \ \ \ \ \ u\in U. \end{aligned}$$
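
In practice the expectation in \(\tilde{J}(u)\) has to be estimated by a sample average over simulated trajectories of the time-scaled system (6)–(7). A minimal sketch of such an estimator is given below; `simulate_scaled_path` is a hypothetical routine (for instance, the Euler–Maruyama scheme of Sect. 7 applied to the scaled dynamics with step \(\Delta s=1/L\)) that returns one sample path \({\tilde{x}}(s)\), \(s\in [0,1]\), for a given control vector u.

```python
import numpy as np

def objective_estimate(u, simulate_scaled_path, n_paths=5000, n_steps=1000, seed=0):
    """Monte Carlo estimate of J~(u) = -E[x~_3(1, u)] / u_3 for the time-scaled problem."""
    rng = np.random.default_rng(seed)
    terminal_pd = np.empty(n_paths)
    for k in range(n_paths):
        path = simulate_scaled_path(u, n_steps, rng)   # array of shape (n_steps + 1, 5)
        terminal_pd[k] = path[-1, 2]                   # x~_3(1, u): terminal 1,3-PD concentration
    return -terminal_pd.mean() / u[2]                  # u_3 = u[2] is the terminal time
```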

5 Approximate problem

Problem \(\widetilde{\text {SOCP}}\) can be treated as a constrained optimization problem by using the control parametrization method; see the monograph [25]. In addition, based on the above description, we can easily see that the corresponding parameter selection problem of Problem \(\widetilde{\text {SOCP}}\) is a semi-infinite programming problem. An efficient algorithm, in which the so-called constraint transcription and local smoothing techniques are involved, can address this issue [25].

For Problem \(\widetilde{\text {SOCP}}\), it is difficult to judge directly whether the constraint condition holds. In practice, the essential difficulty lies in the verification of the constraint:

$$\begin{aligned} {\tilde{x}}(s,u)\in W, \ \ \ ~\forall ~ s\in [0,1]. \end{aligned}$$
(8)

To surmount these difficulties, let

$$\begin{aligned} {\tilde{g}}_j({\tilde{x}}(s,u)):={\tilde{x}}_j(s,u)-x_j^*,\quad {\tilde{g}}_{5+j}({\tilde{x}}(s,u)):=x_{j*}-{\tilde{x}}_j(s, u),~j\in I_5. \end{aligned}$$

The constraint (8) is equivalently transcribed into

$$\begin{aligned} \mathcal {G}(u)=0, \end{aligned}$$
(9)

where \(\mathcal {G}(u):=\displaystyle \sum \nolimits _{j=1}^{10}\int _0^{1} \max \{0,{\tilde{g}}_j({\tilde{x}}(s,u))\}\text {d}s\). However, \(\mathcal {G}(u)\) is non-smooth in \(u\in U\). Consequently, standard optimization routines would have difficulties with this type of equality constraint. We therefore replace (9) with

$$\begin{aligned} \tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u) :=\tilde{\mathcal {G}}_{\varepsilon }(u)-\gamma =\sum _{j=1}^{10} \int _0^{1}\varphi _{\varepsilon ,j}(s,u)ds-\gamma \le 0, \end{aligned}$$
(10)

where \(\varepsilon>0,\gamma >0\) and

$$\begin{aligned} \varphi _{\varepsilon ,j}(s,u)\,\,{:}{=} \left\{ \begin{array}{ll} 0,&{}\quad \text {if}~~{\tilde{g}}_j({\tilde{x}}(s,u)) <-\varepsilon ,\\ \displaystyle \frac{({\tilde{g}}_j({\tilde{x}}(s,u))+\varepsilon )^2}{4\varepsilon }, &{}\quad \text {if}~~ -\varepsilon \le {\tilde{g}}_j({\tilde{x}}(s,u)) \le \varepsilon ,\\ {\tilde{g}}_j({\tilde{x}}(s,u)),&{}\quad \text {if}~~ {\tilde{g}}_j({\tilde{x}}(s,u))>\varepsilon . \end{array} \right. \end{aligned}$$
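
Numerically, \(\tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u)\) can be evaluated by combining the local smoothing function above with a quadrature rule on [0, 1]. The following sketch, which assumes a discretized state path \({\tilde{x}}(s_k)\) (e.g. a sample or mean path) and the bounds \(x_{j*}\), \(x_j^*\) are available as arrays, uses the trapezoidal rule; it is meant only to illustrate the construction in (10).

```python
import numpy as np

def phi_eps(g, eps):
    """Local smoothing of max{0, g}: quadratic blend on [-eps, eps]."""
    g = np.asarray(g, dtype=float)
    out = np.where(g > eps, g, 0.0)
    mid = np.abs(g) <= eps
    return np.where(mid, (g + eps) ** 2 / (4.0 * eps), out)

def G_eps_gamma(x_path, s_grid, x_lower, x_upper, eps, gamma):
    """Smoothed constraint functional G~_{eps,gamma}(u) of (10) via trapezoidal quadrature.

    x_path  : array (len(s_grid), 5) of state values along the scaled horizon [0, 1]
    x_lower : lower bounds x_{j*};  x_upper : upper bounds x_j^*
    """
    g_upper = x_path - x_upper          # g_j = x~_j - x_j^*,       j = 1..5
    g_lower = x_lower - x_path          # g_{5+j} = x_{j*} - x~_j,  j = 1..5
    g_all = np.concatenate([g_upper, g_lower], axis=1)   # the 10 constraint functions
    integrand = phi_eps(g_all, eps).sum(axis=1)          # sum over j = 1..10
    return np.trapz(integrand, s_grid) - gamma
```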

Therefore, Problem \(\widetilde{\text {SOCP}}\) can be approximated by a sequence of nonlinear programming problems \(\widetilde{\text {SOCP}}_{\varepsilon ,\gamma }\), obtained by replacing constraint (9) with (10). As shown in [25], the following theorem establishes that, for appropriate choices of \(\varepsilon \) and \(\gamma \), any feasible solution of Problem \(\widetilde{\text {SOCP}}_{\varepsilon ,\gamma }\) also satisfies the original state constraint (8). Let

$$\begin{aligned} U_{\varepsilon }:= & {} \Bigg \{u \in U\Big | \tilde{\mathcal {G}}_{\varepsilon }(u)\le 0\Bigg \}, \\ \nonumber U_{\varepsilon ,\gamma }:= & {} \Bigg \{u \in U\Big | \tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u)\le 0\Bigg \}. \end{aligned}$$

Theorem 1

For each \(\varepsilon >0\), there exists a \(\gamma (\varepsilon )>0\) such that, for all \(\gamma \) with \(0<\gamma <\gamma (\varepsilon ),\) any feasible solution of Problem \(\widetilde{\text {SOCP}}_{\varepsilon ,\gamma }\) is also a feasible solution of Problem \(\widetilde{\text {SOCP}}\).

Proof

For any \(u\in U,\) we have

$$\begin{aligned} \frac{\text {d}{\tilde{g}}_j({\tilde{x}}(s,u))}{\text {d}s}=\sum _{i=1}^5 \frac{\partial {\tilde{g}}_j({\tilde{x}}(s,u))}{\partial {\tilde{x}}_i}{\tilde{f}}_i({\tilde{x}}(s),u,{\widetilde{w}}(s)),\quad j\in I_{10}. \end{aligned}$$

Clearly, \({\tilde{g}}_j({\tilde{x}}(s,u)), j\in I_{10}, \) are continuously differentiable. Then, there exists a positive constant \(m_j, j\in I_{10},\) such that, for all \(u\in U,\)

$$\begin{aligned} \Big |\frac{\text {d}{\tilde{g}}_j({\tilde{x}}(s,u))}{\text {d}s}\Big |\le m_j, \quad \forall s\in [0,1]. \end{aligned}$$
(11)

Furthermore, for \(\varepsilon >0,\) define

$$\begin{aligned} k_{\varepsilon }=\sum _{j=1}^{10}k_{j,\varepsilon } :=\sum _{j=1}^{10}\frac{\varepsilon }{16}\min \Big \{1, \frac{\varepsilon }{2m_j}\Big \}. \end{aligned}$$
(12)

It suffices to show that every \(u\in U_{\varepsilon ,\gamma }\) satisfies the constraint (9), for any \(\gamma \) satisfying

$$\begin{aligned} 0<\gamma <k_{\varepsilon }. \end{aligned}$$
(13)

Suppose that this is not the case. Then there exists a \(u\in U\) such that

$$\begin{aligned} \sum _{j=1}^{10}\int _{0}^{1}\varphi _{\varepsilon ,j}(s, u)\text {d}s-\gamma\le & {} 0, \end{aligned}$$
(14)

but

$$\begin{aligned} \displaystyle \sum _{j=1}^{10}\int _{0}^{1}\max \Big \{0, {\tilde{g}}_j({\tilde{x}}(s,u))\Big \}\text {d}s> & {} 0. \end{aligned}$$
(15)

Since \({\tilde{g}}_j({\tilde{x}}(s,u)), j\in I_{10}, \) are continuous functions of s in [0, 1], (15) implies that there exists a \(\bar{s}\in [0,1]\) such that

$$\begin{aligned} {\tilde{g}}_j({\tilde{x}}(\bar{s},u))>0,\quad j\in I_{10}. \end{aligned}$$
(16)

Again by continuity, for each \(j\in I_{10},\) there exists an interval \(I_j\subset [0,1]\) containing \(\bar{s}\) such that

$$\begin{aligned} {\tilde{g}}_j({\tilde{x}}(s,u))>-\frac{\varepsilon }{2},\quad \forall s\in I_j. \end{aligned}$$
(17)

Using (11) it is clear from (17) that the length \(|I_j|\) of the interval \(I_j\) must satisfy

$$\begin{aligned} |I_j|\ge \min \Big \{1,\frac{\varepsilon }{2m_j} \Big \}. \end{aligned}$$
(18)

From the fact that \(\varphi _{\varepsilon ,j}(s,u)\) is nonnegative, it follows from (14) that

$$\begin{aligned}&0\ge \sum _{j=1}^{10}\int _{0}^{1}\varphi _{\varepsilon ,j}(s,u) \text {d}s-\gamma \ge \sum _{j=1}^{10}\int _{I_j}\varphi _{\varepsilon ,j}(s,u )\text {d}s-\gamma \nonumber \\&\quad \ge \sum _{j=1}^{10}\Big \{\min _{s\in I_j}\varphi _{\varepsilon ,j}(s,u)\Big \}|I_j|-\gamma . \end{aligned}$$
(19)

Now, in view of (17) and the monotonicity of \(\varphi _{\varepsilon ,j}\) in \({\tilde{g}}_j\) (note that \({\tilde{g}}_j({\tilde{x}}(s,u))>-\varepsilon /2\) on \(I_j\) gives \(\varphi _{\varepsilon ,j}(s,u)>(\varepsilon /2)^2/(4\varepsilon )=\varepsilon /16\)), we have

$$\begin{aligned} \min _{s\in I_j}\Big \{\varphi _{\varepsilon ,j}(s,u)\Big \}>\frac{\varepsilon }{16},\quad j\in I_{10}. \end{aligned}$$
(20)

Combining (12), (13), (18), (19) and (20), we have

$$\begin{aligned} 0\ge \sum _{j=1}^{10}\int _{0}^{1}\varphi _{\varepsilon ,j}(s,u)\text {d}s-\gamma> \sum _{j=1}^{10}\frac{\varepsilon }{16}\min \Big \{1,\frac{\varepsilon }{2m_j} \Big \}-\gamma =k_\varepsilon -\gamma >0. \end{aligned}$$

This is a contradiction. Thus the proof is complete. \(\square \)

Remark 1

Theorem 1 ensures that, for each \(\varepsilon \) in this sequence of approximate problems, the corresponding \(\gamma (\varepsilon )\) is positive and finite.

To construct the optimization algorithm, when the parameter u is not feasible we can move it towards the feasible region along the negative gradient of the constraint functional \(\tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u)\) with respect to u. In what follows, we develop a scheme for computing the gradient of \(\tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u)\).

Theorem 2

For each \(\varepsilon>0\) and \(\gamma >0,\) the derivative of the constraint functional \(\tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u)\) with respect to the ith component of the parameter vector is

$$\begin{aligned} \frac{\partial \tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u)}{\partial u_{i}}:=\int _0^{1}\frac{\partial H(s;u,\chi ,{\tilde{x}})}{\partial u_i}ds,\quad i=1,2,3, \end{aligned}$$

where

$$\begin{aligned} H(s;u,\chi ,{\tilde{x}})&{:=}\displaystyle \sum _{j=1}^{10}\varphi _{\varepsilon ,j}(s, u)+\chi ^{\top }(s)\left[ u_3{\widetilde{F}}({\tilde{x}}(s, u))+u_3{\widetilde{G}}({\tilde{x}}(s, u))\dot{\tilde{w}}\right] , \end{aligned}$$
(21)

and

$$\begin{aligned} \chi (s):=(\chi _1(s),\chi _2(s),\ldots ,\chi _5(s))^{\top }, \end{aligned}$$
(22)

is the solution of the costate system

$$\begin{aligned} \dot{\chi }(s)=-\frac{\partial H(s;u,\chi ,{\tilde{x}})}{\partial {\tilde{x}}(s,u)}, \end{aligned}$$
(23)

with the boundary condition

$$\begin{aligned} \chi (1)=(0,0,0,0,0)^{\top }. \end{aligned}$$
(24)

Proof

Let \(u{\in }U\) be an arbitrary but fixed vector and let \(\epsilon _i,~i\in I_{3}\), be arbitrary real numbers. Define

$$\begin{aligned} u^{1,\sigma }:=(u^{1}+\sigma \epsilon _1,u^{2},u^{3}),\\ u^{2,\sigma }:=(u^{1},u^{2}+\sigma \epsilon _2,u^{3}),\\ u^{3,\sigma }:=(u^{1},u^{2},u^{3}+\sigma \epsilon _3), \end{aligned}$$

where \(\sigma >0\) is an arbitrarily small real number such that \(u_{i*}<u^{i}+\sigma \epsilon _i<u_{i}^*,~i\in I_{3}.\) Then \(\tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u^{i,\sigma })\) can be expressed as

$$\begin{aligned} \tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u^{i,\sigma }):=\sum _{j=1}^{10} \int _0^{1}\varphi _{\varepsilon ,j}(s,u^{i,\sigma })\text {d}s-\gamma +\int _0^{1}\chi ^{\top }(s)[{\tilde{f}}({\tilde{x}}(s),u^{i,\sigma }, {\widetilde{w}}(s))-\dot{{\tilde{x}}}(s,u^{i,\sigma })]\text {d}s,\nonumber \\ \end{aligned}$$
(25)

where \(\chi \) is, as yet, arbitrary. Thus, it follows that

$$\begin{aligned}&\Delta \tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u^{i,\sigma }) {:}{=}\frac{d\tilde{\mathcal {G}}_{\varepsilon ,\gamma } (u^{i,\sigma })}{d\sigma }|_{\sigma =0}=\frac{\partial \tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u)}{\partial u^{i}}\epsilon _i \nonumber \\&\quad = \int _0^{1}\Big \{\frac{\partial H(s;u,\chi ,{\tilde{x}})}{\partial {\tilde{x}}(s,u)}\Delta {\tilde{x}}(s,u)+\frac{\partial H(s;u,\chi ,{\tilde{x}})}{\partial u^{i}}\epsilon _i-\chi ^{\top }(s)\Delta \dot{{\tilde{x}}}(s,u)\Big \}\text {d}s,\nonumber \\ \end{aligned}$$
(26)

where \(H(s;u,\chi ,{\tilde{x}})\) is defined as in (21). Integrating (26) by parts, choosing \(\chi \) as the solution of the costate system (23) with the boundary condition (24) so that the terms involving \(\Delta {\tilde{x}}(s,u)\) vanish, and combining (21)–(25), we have

$$\begin{aligned}&\frac{\partial \tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u)}{\partial u^{i}}\epsilon _i= \int _0^{{1}}\frac{\partial H(s;u,\chi ,{\tilde{x}})}{\partial u^{i}}\epsilon _i \text {d}s.~~~ \end{aligned}$$
(27)

Since \(\epsilon _i\) is arbitrary, the gradient formula of the theorem follows readily from (27), and the proof is complete. \(\square \)

6 Practical optimization algorithm of the stochastic optimal control problem

Various optimization methods such as gradient-based techniques [26] can be applied to solve Problems \(\big \{\widetilde{\text {SOCP}}_{\varepsilon ,\gamma }\big \}\). Nonetheless, all the techniques mentioned above aim only at finding local (rather than global) optimal solutions. The particle swarm optimization (PSO) algorithm introduced by Kennedy and Eberhart [27] is more likely to find the global optimum, or at least a good suboptimal solution, than gradient-based techniques. The PSO algorithm [28] shares many similarities with evolutionary computation techniques such as genetic algorithms (GA): the system is initialized with a population of random solutions and searches for optima by updating generations. However, unlike GA, the PSO algorithm has no evolution operators such as crossover and mutation; instead, the potential solutions, called particles, fly through the problem space by following the current optimum particles. The problem we need to solve, however, has continuous state constraints, to which the original PSO algorithm cannot be applied directly. Although many constraint-handling techniques exist in evolutionary computation [29], the treatment of continuous state constraints is seldom considered, and we therefore propose a handling technique for this type of constraint. The resulting optimization algorithm for Problem \(\widetilde{\text {SOCP}}_{\varepsilon ,\gamma }\), based on swarm intelligence, is stated as follows.

Algorithm 1

Step 1. Set constants \(w_{start}, w_{end}\in (0,1)\), positive acceleration constants \(c_{1}\) and \(c_{2}\), and the iteration counter \(I_{ter}=0\); \(R_1\) and \(R_2\) are random numbers uniformly distributed in [0, 1].

Step 2. Randomly generate N initial particles with a uniform distribution on U. Denote the position and velocity of particles by \(u^{I_{ter},j}\in U\) and \(v^{I_{ter},j}\in V:=[v_{\min }, v_{\max }]\), respectively. Set \(J_{gbest}=+\infty .\)

Step 3. If \(I_{ter}<M_{Iter}\), set \(j=0\); otherwise output \(J_{gbest}\) and \(p_{gbest}\), and stop.

Step 4. If \(j<N\), compute the solution of system (6)–(7) with \(u^{I_{ter},j}\) and go to Step 5; otherwise go to Step 6.

Step 5. Check the value of \(\tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u^{I_{ter},j}).\) If \(\tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u^{I_{ter},j})\le 0\), compute \(J(u^{I_{ter},j})\), set \(j=j+1\) and go to Step 4. Otherwise, that is, if \(\tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u^{I_{ter},j})>0\), move the parameter towards the feasible region along the negative direction of \(\frac{\partial \tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u^{I_{ter},j})}{\partial u^{I_{ter},j}_i}, i \in I_3,\) computed by Theorem 2, using Armijo line searches until \(\tilde{\mathcal {G}}_{\varepsilon ,\gamma }(u^{I_{ter},j})\le 0\); then compute \(J(u^{I_{ter},j})\), set \(j=j+1\) and go to Step 4.

Step 6. Compute

$$\begin{aligned} J_{best}(u^{I_{ter}})= & {} \min _{j=1,2,\ldots ,N}J(u^{I_{ter},j}),\\ p_{best}(u^{I_{ter}})= & {} \arg \min _{j=1,2,\ldots ,N}J(u^{I_{ter},j}). \end{aligned}$$

Step 7. If \(J_{best}(u^{I_{ter}})\le J_{gbest}\), let \(J_{gbest}=J_{best}(u^{I_{ter}})\), \(p_{gbest}=p_{best}(u^{I_{ter}})\).

Step 8. Update particles:

$$\begin{aligned} u^{I_{ter}+1,j}= & {} u^{I_{ter},j}+v^{I_{ter}+1,j},\\ v^{I_{ter}+1,j}= & {} w*v^{I_{ter},j}\!+\!c_1*R_1*(p_{best}(u^{I_{ter}})\!-\!u^{I_{ter},j})\!+\!c_2*R_2*(p_{gbest}-u^{I_{ter},j}),\\ w= & {} (w_{start}-w_{end})*\frac{M_{Iter}-I_{ter}}{M_{Iter}}+w_{end}; \end{aligned}$$

set \(I_{ter}=I_{ter}+1\) and go to Step 3.

Remark

The set of particle velocities V is bounded and closed, and the admissible control set U can be called a "box" because of its rectangular shape. Thus, when a particle hits the boundary, for \(i=1,2,3\) and \(j=1,2,\ldots ,N,\) we use the classical clamping and reflection rules to cope with it as follows:

$$\begin{aligned}&v^j_i=\left\{ \begin{array}{lll} v_{\max }, &{}&{} \text{ if } \ v^j_i\ge v_{\max },\\ v_{\min },&{}&{} \text{ if } \ v^j_i\le v_{\min }, \end{array} \right. \\&u^j_i=\left\{ \begin{array}{lll} 2u^{*}_{i}-u^j_i, &{}&{} \text{ if } \ u^j_i\ge u^{*}_{i},\\ 2u_{*i}-u^j_i,&{}&{} \text{ if } \ u^j_i\le u_{*i}. \end{array} \right. \end{aligned}$$
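
A compact sketch of Algorithm 1 with this boundary handling is given below. The functions `objective` (the estimate of J(u)) and `restore_feasible` (the gradient-based restoration of Step 5, returning a point with \(\tilde{\mathcal {G}}_{\varepsilon ,\gamma }\le 0\)) are problem specific and assumed to be supplied; all numerical settings are illustrative.

```python
import numpy as np

def pso(objective, restore_feasible, u_lower, u_upper, n_particles=100, max_iter=200,
        w_start=0.9, w_end=0.4, c1=2.0, c2=2.0, v_max=None, seed=0):
    """Sketch of Algorithm 1: PSO over the box U with velocity clamping and boundary reflection."""
    rng = np.random.default_rng(seed)
    u_lower, u_upper = np.asarray(u_lower, float), np.asarray(u_upper, float)
    dim = u_lower.size
    if v_max is None:
        v_max = 0.2 * (u_upper - u_lower)                    # heuristic velocity bound (assumption)
    u = rng.uniform(u_lower, u_upper, (n_particles, dim))    # Step 2: random initial particles
    v = rng.uniform(-v_max, v_max, (n_particles, dim))
    j_gbest, p_gbest = np.inf, u[0].copy()                   # Step 2: J_gbest = +infinity
    for it in range(max_iter):                               # Step 3
        costs = np.empty(n_particles)
        for j in range(n_particles):                         # Steps 4-5
            u[j] = restore_feasible(u[j])                    # gradient-based restoration if infeasible
            costs[j] = objective(u[j])
        k = int(np.argmin(costs))                            # Step 6: best particle of this iteration
        p_best = u[k].copy()
        if costs[k] <= j_gbest:                              # Step 7
            j_gbest, p_gbest = costs[k], p_best.copy()
        w = (w_start - w_end) * (max_iter - it) / max_iter + w_end   # Step 8: inertia schedule
        r1, r2 = rng.random((n_particles, dim)), rng.random((n_particles, dim))
        v = np.clip(w * v + c1 * r1 * (p_best - u) + c2 * r2 * (p_gbest - u), -v_max, v_max)
        u = u + v
        u = np.where(u > u_upper, 2.0 * u_upper - u, u)      # position reflection (Remark)
        u = np.where(u < u_lower, 2.0 * u_lower - u, u)
    return p_gbest, j_gbest
```

For Problem \(\widetilde{\text {SOCP}}_{\varepsilon ,\gamma }\), the search space is three-dimensional and the box bounds are those listed in Sect. 7.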

Combining Algorithm 1 with Theorem 1, we can develop the following algorithm to solve Problem \(\widetilde{SOCP}\).

Algorithm 2

Step 1. Choose initial values of \(\varepsilon>0,\gamma >0.\)

Step 2. Solve Problem \(\widetilde{\text {SOCP}}_{\varepsilon ,\gamma }\) using Algorithm 1 to obtain \(u^*_{\varepsilon ,\gamma }\).

Step 3. Check the feasibility of \({\tilde{g}}_j({\tilde{x}}(s,u^*_{\varepsilon ,\gamma }))\le 0\) for \(s\in [0,1]\), \(j\in I_{10}.\)

Step 4. If \(u^*_{\varepsilon ,\gamma }\) is feasible, go to Step 5; otherwise, set \(\gamma :=\alpha \gamma \). If \(\gamma >\bar{\gamma }\), where \(\bar{\gamma }\) is a prespecified positive constant, go to Step 2; otherwise go to Step 6.

Step 5. Set \(\varepsilon :=\beta \varepsilon \). If \(\varepsilon >\bar{\varepsilon }\), where \(\bar{\varepsilon }\) is a prespecified positive constant, go to Step 2; otherwise, go to Step 6.

Step 6. Output \(u^*_{\varepsilon ,\gamma }\) and stop.

Remark 2

In Algorithm 2, \(\varepsilon \) is a parameter controlling the accuracy of the smoothing approximation and \(\gamma \) is a parameter controlling the feasibility of constraint (10). \(\bar{\varepsilon }\) and \(\bar{\gamma }\) are two predefined, sufficiently small parameters that ensure the termination of the algorithm. The parameters \(\alpha \) and \(\beta \) must be chosen to be less than 1.
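
The interplay of \(\varepsilon \), \(\gamma \), \(\alpha \) and \(\beta \) can be summarized by the following sketch of the outer loop, where `solve_socp(eps, gamma)` stands for one run of Algorithm 1 on Problem \(\widetilde{\text {SOCP}}_{\varepsilon ,\gamma }\) and `state_feasible(u)` checks \({\tilde{g}}_j({\tilde{x}}(s,u))\le 0\) on [0, 1]; both routines and the initial values are assumptions for illustration.

```python
def algorithm_2(solve_socp, state_feasible, eps=0.1, gamma=0.01,
                alpha=0.5, beta=0.5, eps_bar=1e-4, gamma_bar=1e-6):
    """Sketch of Algorithm 2: shrink gamma until feasibility, then refine eps (alpha, beta < 1)."""
    while True:
        u_star = solve_socp(eps, gamma)        # Step 2: solve SOCP_{eps,gamma} by Algorithm 1
        if state_feasible(u_star):             # Steps 3-4
            eps *= beta                        # Step 5: tighten the smoothing parameter
            if eps <= eps_bar:
                return u_star                  # Step 6
        else:
            gamma *= alpha                     # Step 4: tighten the feasibility parameter
            if gamma <= gamma_bar:
                return u_star                  # Step 6
```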

7 Numerical results and computer simulations

Based on the model and algorithms described above, we implemented the corresponding software and applied it to the optimal control problem of microbial fermentation in batch culture.

The basic data are listed, respectively, as follows:

boundary value of control vector:

\(u_{*1}=0.01\) mmol/L, \(u^{*}_{1}=1\) mmol/L, \(u_{*2}=200\) mmol/L, \(u^{*}_{2}=939.5\) mmol/L, \(u_{*3}=2\) h, \(u^{*}_{3}=10\) h.

boundary value of state vector:

\(x_{*1}=0.001\) mmol/L, \(x^{*}_{1}=2039\) mmol/L, \(x_{*2}=0.001\) mmol/L, \(x^{*}_{2}=939.5\) mmol/L, \(x_{*3}=0.01\) mmol/L, \(x^{*}_{3}=10\) mmol/L, \(x_{*4}=0.01\) mmol/L, \(x^{*}_{4}=1026\) mmol/L, \(x_{*5}=200\) mmol/L, \(x^{*}_{5}=360.9\) mmol/L.

We adopt \(w_{start}=0.9\), \(w_{end}=0.4\), \(c_{1}=c_2=2\) and \(N=1000\) in the procedure. Then, by Algorithms 1 and 2, the optimal control vector \(\bar{u}\) and the corresponding objective value \(J(\bar{u})\) are \((0.11201,723.423,5.2783)^{\top }\) and 58.256, respectively. In [30], the optimal control vector \(\bar{u}\) and objective value \(J(\bar{u})\) are \((0.973186,547.04,5.17509)^{\top }\) and 54.5911, respectively. These numerical results show that, by employing the optimal control, the concentration of 1,3-PD at the terminal time can be increased compared with the previous results.

Fig. 1 The future concentration of 1,3-PD (5 sample trajectories)

Fig. 2 The average future concentration of 1,3-PD (5000 sample trajectories)

To illustrate the stochastic nature of the batch fermentation process, a numerical example is given. In this example, we let \(\sigma _{\mu }=0.01638\) [31] and use the Monte Carlo method to generate five thousand sequences of random inputs, each consisting of increments of a standard Brownian motion dW(s). We then solve the proposed stochastic model using the following Euler–Maruyama scheme and obtain five thousand solution paths of the model. Our numerical approximation to \(X(\tau _j)\) is denoted by \(X_j\).

Stochastic Euler–Maruyama method [32]

$$\begin{aligned}&X^k_j=X^k_{j-1}+F_k(X_{j-1})\Delta s + G_k(X_{j-1}) [W(\tau _j)-W(\tau _{j-1})], \\&j=1,2,\ldots ,L,\quad k=1,2,\ldots ,5, \end{aligned}$$

where \(\Delta s=1/L\) for some positive integer L; \(X^k_j\) denotes the kth component of \(X_j\), and \(F_k\), \(G_k\) denote the kth components of F and G; \(\tau _j=j\Delta s\); \(W(\tau _j)-W(\tau _{j-1})\) is normally distributed with mean zero and variance \(\tau _j-\tau _{j-1}=\Delta s\); equivalently, \(W(\tau _j)-W(\tau _{j-1})\sim \sqrt{\Delta s}\, N(0,1),\) where N(0, 1) denotes a normally distributed random variable with zero mean and unit variance.
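
A minimal NumPy sketch of this recursion is given below; `drift` and `diffusion` stand for F and G (for instance as sketched in Sect. 2.2), and repeated calls with independent Brownian increments yield the five thousand sample paths used for Figs. 1 and 2.

```python
import numpy as np

def euler_maruyama_path(drift, diffusion, x0, T, L, rng):
    """One Euler-Maruyama sample path of dx = F(x)dt + G(x)dw on [0, T] with L steps.

    For the time-scaled system (6), take T = 1 and use the u_3-scaled drift/diffusion.
    """
    dt = T / L
    x = np.empty((L + 1, len(x0)))
    x[0] = x0
    for j in range(1, L + 1):
        dW = np.sqrt(dt) * rng.standard_normal()      # W(tau_j) - W(tau_{j-1}) ~ N(0, dt)
        t = (j - 1) * dt
        x[j] = x[j - 1] + drift(t, x[j - 1]) * dt + diffusion(x[j - 1]) * dW
    return x

# Example usage (placeholder initial state and horizon):
# rng = np.random.default_rng(0)
# paths = [euler_maruyama_path(drift, diffusion, x0=np.array([0.1, 500.0, 0.0, 0.0, 0.0]),
#                              T=5.0, L=1000, rng=rng) for _ in range(5000)]
# mean_pd = np.mean([p[:, 2] for p in paths], axis=0)   # average 1,3-PD trajectory (cf. Fig. 2)
```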

Figure 1 shows different sample paths of the 1,3-PD concentration under different realizations of the perturbation, and Fig. 2 shows the average path of 1,3-PD over the 5000 sample paths.

8 Conclusions

In this paper, unlike the previous approach in [9], we proposed a modified particle swarm algorithm, based on swarm intelligence, to solve the stochastic optimal control problem. Numerical results show that, by employing the optimal control, the concentration of 1,3-PD at the terminal time increases significantly compared with the previous results. Our present work provides a practical numerical method for the stochastic optimal control problem in the fermentation process. However, the convergence analysis and the optimality function of the algorithm still need to be investigated further.