
1 Introduction

Food processes transform raw ingredients into food or transform food products into other forms. Bio-processes are those that make use of living organisms to obtain useful products; production may be carried out using yeasts, bacteria or enzymes extracted from organisms. Bio-systems are defined as living organisms, or systems of living organisms, that can interact with each other.

The drivers of innovation in the food processing sector can be grouped into six major axes corresponding to general consumer expectations: safety, pleasure, health, physical, convenience and ethics. These consumer demands pose serious challenges to an industry that must adapt, in due time, to a continuously changing market in order to maintain competitiveness. In the context of biotechnology, the challenge is to define robust bio-processes that produce large quantities of high-quality bio-based products in a sustainable and economical way.

Computer-aided simulation and model-based optimization offer a powerful, rational and systematic way to achieve those goals, enabling the possibility to (i) test “what-if” scenarios in a quick and inexpensive way; (ii) improve the understanding of the process or the system at hand; (iii) compute optimal designs or operation conditions given certain objectives and constraints and (iv) control the process operation so as to respond to possible uncertainties and disturbances.

Mathematical models can be roughly classified into three types: white-box models, based on conservation principles; black-box models, based on data (for example, response surfaces or artificial neural networks); and grey-box models, which combine first principles with empirical descriptions. In addition, models can be classified according to their mathematical characteristics as linear or non-linear; static or dynamic; lumped or distributed; continuous or discrete; deterministic or stochastic; structured or unstructured.

In recent decades there has been a growing interest in the development of rigorous, mostly hybrid, models to describe food and bio-processes as well as biological systems. Each type of process or system has its own peculiarities. In this work we have selected a set of examples representative of bio-systems (biofilm formation), bio-processes (gluconic acid production) and processes of the food industry (deep-fat frying of potato chips and thermal processing of packaged foods). The underlying physical, chemical and biological mechanisms are different. However, the corresponding mathematical formulations share several properties: they are dynamic, non-linear, continuous, deterministic and unstructured models, and typically distributed.

Despite all efforts devoted to developing rigorous models and the necessary numerical simulation techniques, model validation remains a challenge and is considered critical to building confidence in the use of models in the food and biotechnological industries. In this scenario it is necessary to develop protocols and to standardise data acquisition so as to obtain transport properties for different food materials, new products and packages, as well as kinetic constants related to microbial and biochemical processes.

In this respect we will describe in some detail how models can be reconciled with experimental data by means of parameter estimation, identifiability analyses and optimal experimental design.

The parameter estimation problem consists of finding the model parameter values that minimise the distance between model predictions and the experimental data. The identifiability analysis is aimed at evaluating the quality of the model fit and the confidence in the parameter values, whereas the optimal experimental design problem is devoted to improving model predictive capabilities.

Once a suitable model with accurate values for the model parameters becomes available, it is possible to formulate optimisation problems to find those operating conditions that achieve a given objective (maximise product quality, minimise energy consumption, etc.) subject to constraints (maximum and minimum processing temperatures, food safety, etc.). We will describe how those problems are mathematically formulated and numerically solved, with special emphasis on the control vector parametrisation approach and the use of reduced-order modelling techniques so as to improve computational efficiency.

From the numerical point of view, many of the problems of interest (parameter estimation, optimal experimental design, operation design and real-time optimisation) are formulated as constrained non-linear programming problems including dynamic constraints (the model). Therefore we will describe the types of numerical methods that may be used for solving the models, including discretisation-based techniques and model reduction techniques, and the most suitable non-linear programming methods, with special emphasis on global optimisers.

2 Modelling

Mathematical modelling is the art of quantitatively describing from observations certain aspects of the structure and function of a particular process. Model building is an iterative process which starts from the definition of the purpose of the model, that is, the questions to be addressed with the model. In the next step, using the a priori available knowledge and preliminary experimental data, a modelling framework is chosen and a first mathematical model structure is proposed. This first model usually contains unknown non-measurable parameters that may be estimated by means of experimental data fitting. In this regard, we need to know whether it is possible to uniquely determine their values (identifiability analysis) and if so, to estimate them with maximum precision and accuracy (parameter estimation step). This leads to a first working model that must be (in)validated with new experiments, revealing in most cases a number of deficiencies. In this case, a new model structure and/or a new (optimal) experimental design must be planned, and the process is repeated iteratively until the validation step is considered satisfactory.

Most of the models related to food and bio-processes and bio-systems are non-linear dynamic models, typically stated as (ordinary and partial) differential equations (ODEs and PDEs), as follows:

$$\displaystyle\begin{array}{rcl} & & \varPsi (\mathbf{x},\mathbf{x}_{\boldsymbol{\xi }},\mathbf{x}_{\boldsymbol{\xi }\boldsymbol{\xi }},\mathbf{x}_{t},\mathbf{v}_{t},\mathbf{v},\mathbf{u},\boldsymbol{\theta },t) = 0{}\end{array}$$
(1)
$$\displaystyle\begin{array}{rcl} \mathbf{x}(\boldsymbol{\xi },t_{0}) =\varPsi _{0}(\mathbf{x}(\boldsymbol{\xi },t_{0}),\mathbf{u}(t_{0}),\boldsymbol{\theta },t_{0});& & \quad \mathbf{v}(t_{0}) =\varPhi _{0}(\boldsymbol{\theta },t_{0});{}\end{array}$$
(2)
$$\displaystyle\begin{array}{rcl} \mathbf{\mathcal{B}}(\mathbf{x},\mathbf{x}_{\boldsymbol{\xi }},\mathbf{v},\mathbf{u},\boldsymbol{\theta },\boldsymbol{\xi },t) = 0& &{}\end{array}$$
(3)

where \(\boldsymbol{\xi }\in \varOmega \subset \mathbb{R}^{3}\) are the spatial variables, \(\mathbf{x}(\boldsymbol{\xi },t) \in \mathbb{Z} \subset \mathbb{R}^{\nu }\) are the distributed state variables (temperature, water content, microorganism concentration, etc.), \(\mathbf{x}_{\boldsymbol{\xi }} = \partial \mathbf{x}/\partial \boldsymbol{\xi }\), \(\mathbf{x}_{\boldsymbol{\xi }\boldsymbol{\xi }} = \partial ^{2}\mathbf{x}/\partial \boldsymbol{\xi }^{2}\), \(\mathbf{x}_{t} = \partial \mathbf{x}/\partial t\), \(\mathbf{v}(t) \in \mathbb{R}^{\mu }\) are the lumped state variables, \(\mathbf{v}_{t} = d\mathbf{v}/dt\), \(\mathbf{u} \in U \subset \mathbb{R}^{\sigma }\) are the control variables (processing temperature, substrate feeding, valve openings, etc.) and \(\boldsymbol{\theta }\in \varTheta \subset \mathbb{R}^{\eta }\) are the time-independent parameters (thermo-physical properties, kinetic constants, etc.). Equations (2) and (3) represent the initial and boundary conditions, respectively.
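As a simple illustrative instance of this general formulation (of the kind encountered later in the thermal processing examples), consider one-dimensional heat conduction in a solid product of half-thickness R heated by a medium at temperature u(t):

$$\displaystyle{ x_{t} =\alpha \,x_{\boldsymbol{\xi }\boldsymbol{\xi }},\qquad x(\boldsymbol{\xi },t_{0}) = T_{0},\qquad -k\,x_{\boldsymbol{\xi }}\big\vert _{\boldsymbol{\xi }=R} = h\left (x\vert _{\boldsymbol{\xi }=R} - u(t)\right ),\qquad x_{\boldsymbol{\xi }}\big\vert _{\boldsymbol{\xi }=0} = 0, }$$

where the single distributed state x is the product temperature and \(\boldsymbol{\theta }= [\alpha,k,h]\) collects the thermal diffusivity, the thermal conductivity and the surface heat transfer coefficient.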

2.1 Parameter Estimation

Given a general set of differential equations describing the dynamics of a system, Eqs. (1), (2), and (3), the values assigned to the parameters \(\boldsymbol{\theta }\) give rise to different system behaviours. The parameter estimation problem may be formulated as follows: find the unknown model parameters (e.g. thermo-physical properties, kinetic coefficients, initial conditions, etc.) so as to minimise a measure of the distance between the model predictions and the available experimental data, obtained under a particular experimental scheme (illustrated in Fig. 1) [36].

Fig. 1

Illustrative representation of the experimental scheme

Let us consider the most general experimental scheme, where several experiments \(\mathcal{E} = 1,\ldots,n_{\mathcal{E}}\) and types of outputs \(k = 1,\ldots,n_{y}^{\mathcal{E}}\) are used for the estimation (for instance, two experiments with different inputs where some concentration and the temperature are measured). Due to the discrete nature of these outputs, they are available at a given number (\(n_{t}^{\mathcal{E},k}\)) of sampling times \(t_{s}\) and at a number (\(n_{\mathcal{S}}^{\mathcal{E},k,s}\)) of sensor positions \(\boldsymbol{\xi }_{p}\) for each experiment. The associated model predictions are obtained by simulating the above experiments and evaluating the results at the same sampling times and sensor positions. Considering the general non-linear model described in (1), (2), and (3), these predictions are calculated from:

$$\displaystyle{ \mathbf{v}^{\mathcal{E}}(\boldsymbol{\xi }_{ p},t_{s},\boldsymbol{\theta }) =\mathbf{ f}_{y}^{\,\mathcal{E}}\left (\mathbf{x}^{l}(t_{ s}),\mathbf{x}^{d}(t_{ s},\boldsymbol{\xi }_{p}),\boldsymbol{\theta }\right ) }$$
(4)

where the dependence of \(\mathbf{f}_{y}^{\,\mathcal{E}}\in \mathbb{R}^{n_{y}}\) on the inputs is left implicit to simplify the notation.

For the sake of clarity the measurements and model predictions will be encoded in the following vectors:

$$\displaystyle{ \mbox{ $\boldsymbol{\mathcal{Y}_{m}}$} = [\,y_{m1},\ldots,y_{m\boldsymbol{\ell}},\ldots,y_{m\boldsymbol{n_{\ell}}}]^{\mathrm{T}} \in \mathbb{R}^{\boldsymbol{n_{\ell}}} }$$
(5)

and

$$\displaystyle{ \mbox{ $\boldsymbol{\mathcal{Y}}$}(\boldsymbol{\theta }) = [\,y_{1}(\boldsymbol{\theta }),\ldots,y_{\boldsymbol{\ell}}(\boldsymbol{\theta }),\ldots,y_{\boldsymbol{n_{\ell}}}(\boldsymbol{\theta })]^{\mathrm{T}} \in \mathbb{R}^{\boldsymbol{n_{\ell}}}, }$$
(6)

where \(\boldsymbol{\ell}\) identifies a given datum through the sub-indexes \(p,s,k,\mathcal{E}\) and \(\boldsymbol{n_{\ell}}\) is the total number of such data.

When defining a measure of the distance between the experimental and predicted data, several possibilities exist. Here the maximum likelihood approach is considered: the idea is to find the vector of parameters that gives the highest likelihood to the measured data. Under the assumption of independent measurements with Gaussian noise, the log-likelihood function becomes:

$$\displaystyle{ J_{ml} =\sum _{ \boldsymbol{\ell}}^{\boldsymbol{n_{\ell}}}\left (-\frac{1} {2}\right )\left [\log (2\pi ) +\log (\sigma _{\boldsymbol{\ell}}^{2}) + \frac{\left (\mbox{ $y_{m}$}_{\boldsymbol{\ell}} - y_{\boldsymbol{\ell}}(\boldsymbol{\theta })\right )^{2}} {\sigma _{\boldsymbol{\ell}}^{2}} \right ] }$$
(7)

where

$$\displaystyle{\sum _{\boldsymbol{\ell}=1}^{\boldsymbol{n_{\ell}}}\left (\cdot \right ) =\sum _{ \mathcal{E}=1}^{n_{\mathcal{E}} }\left (\sum _{k=1}^{n_{y}^{\mathcal{E}} }\left (\sum _{s=1}^{n_{t}^{\mathcal{E},k} }\left (\sum _{p=1}^{n_{\mathcal{S}}^{\mathcal{E},k,s} }\left (\cdot \right )\right )\right )\right ).}$$

The parameter estimation problem is thus formulated as a nonlinear optimization problem subject to the system dynamics (Eqs. (1), (2), and (3)) and possibly to bounds on the parameter values. Therefore, its numerical solution involves an outer iterative procedure that generates values for the unknown parameters and initial conditions, the nonlinear programming (NLP) method, and an inner iterative procedure that solves the differential equations, the boundary value problem (BVP) solver, as shown in Fig. 2.

Fig. 2

Numerical solution of: (a) parameter estimation and (b) optimal experimental design
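As an illustration of this nested scheme, the following Python sketch estimates the parameters of a deliberately simple lumped model (logistic microbial growth). The model, the "true" parameter values, the noise level and the sampling times are all illustrative assumptions, not taken from the case studies below; the inner loop is SciPy's ODE solver and the outer loop a local weighted least-squares solver (equivalent to maximum likelihood for independent Gaussian noise of known, constant variance), whereas in practice a global or hybrid NLP method (Sect. 4.2) and a tool such as AMIGO would typically be used.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

# Illustrative lumped model: logistic microbial growth dx/dt = mu*x*(1 - x/xmax)
def rhs(t, x, mu, xmax):
    return mu * x * (1.0 - x / xmax)

def simulate(theta, t_samples, x0=0.05):
    mu, xmax = theta
    sol = solve_ivp(rhs, (0.0, t_samples[-1]), [x0], args=(mu, xmax),
                    t_eval=t_samples, rtol=1e-8, atol=1e-10)
    return sol.y[0]

# Pseudo-experimental data: simulate with "true" parameters and add Gaussian noise
rng = np.random.default_rng(0)
t_samples = np.linspace(0.0, 24.0, 15)          # sampling times [h] (assumed)
theta_true = np.array([0.4, 2.0])               # assumed "true" mu [1/h] and xmax [g/l]
sigma = 0.05
y_meas = simulate(theta_true, t_samples) + sigma * rng.standard_normal(t_samples.size)

# Weighted residuals: maximum likelihood with known, constant sigma
def residuals(theta):
    return (simulate(theta, t_samples) - y_meas) / sigma

# Outer NLP iteration: local least-squares solver started from an initial guess
fit = least_squares(residuals, x0=[0.1, 1.0], bounds=([1e-3, 0.1], [2.0, 10.0]))
print("estimated parameters:", fit.x)
```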

2.2 Identifiability Analysis

Identifiability has to do with the possibility of finding a unique solution for the model parameters. At this point it is important to note that, in the presence of experimental error, there are in fact several equivalent solutions defining the parameter uncertainty region. The shape and the size of such a region determine whether practical identifiability is guaranteed or not. Assuming that the uncertainty region corresponds to a hyper-ellipsoid (the typical case), highly elongated hyper-ellipsoids tend to be associated with poor identifiability, or lack of identifiability, of some parameters.

In order to assess the uncertainty regions, several possibilities exist. Monte Carlo based approaches allow robust uncertainty regions to be computed [9]. However, the associated computational cost makes it difficult to use these methods for large-scale models. Alternatively, the confidence interval of \(\boldsymbol{\theta }_{i}^{{\ast}}\) may be obtained through the covariance matrix

$$\displaystyle{ \pm t_{\alpha /2}^{\gamma }\sqrt{\mathbf{C } _{ ii}} }$$
(8)

where \(t_{\alpha /2}^{\gamma }\) is given by Student's t-distribution with \(\gamma = N_{d}-\eta\) degrees of freedom, and \((1-\alpha )\,100\,\%\) is the selected confidence level, typically 95 %.

For non-linear models there is no exact way to obtain the covariance matrix C. Therefore, the use of approximations has been suggested. Possibly the most widely used is based on the Cramér-Rao inequality, which establishes, under certain assumptions on the number of data and the non-linear character of the model, that the covariance matrix may be approximated by the inverse of the Fisher information matrix (FIM), formulated as follows [24, 36]:

$$\displaystyle{ \mathcal{F} =\mathop{ \mathrm{E}}\nolimits _{\mathbf{v}_{m}\vert \boldsymbol{\theta }^{{\ast}}}\Bigg\{\left [\frac{\partial J_{ml}(\boldsymbol{\theta })} {\partial \boldsymbol{\theta }} \right ]\left [\frac{\partial J_{ml}(\boldsymbol{\theta })} {\partial \boldsymbol{\theta }} \right ]^{T}\Bigg\} }$$
(9)

where \(\mathop{\mathrm{E}}\nolimits\) denotes the expected value.
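A minimal sketch of how the covariance-based confidence intervals of Eq. (8) can be approximated in practice is given below: parametric sensitivities are computed (here by central finite differences, for an illustrative analytic output rather than a dynamic model), the FIM of Eq. (9) is assembled for independent Gaussian measurements of constant variance, and its inverse is used as the covariance approximation. All numerical values are assumptions for illustration only.

```python
import numpy as np
from scipy.stats import t as t_dist

# Illustrative (analytic) output: first-order decay y(t) = theta1 * exp(-theta2 * t)
def y_model(theta, t):
    return theta[0] * np.exp(-theta[1] * t)

t_s = np.linspace(0.0, 10.0, 12)                 # sampling times (assumed)
sigma = 0.02                                     # measurement standard deviation (assumed)
theta_hat = np.array([1.0, 0.3])                 # estimated parameter values (assumed)

# Parametric sensitivities dy/dtheta by central finite differences
def sensitivities(theta, t, h=1e-6):
    S = np.zeros((t.size, theta.size))
    for j in range(theta.size):
        dp = np.zeros_like(theta); dp[j] = h * max(1.0, abs(theta[j]))
        S[:, j] = (y_model(theta + dp, t) - y_model(theta - dp, t)) / (2.0 * dp[j])
    return S

S = sensitivities(theta_hat, t_s)
FIM = S.T @ S / sigma**2                         # Gaussian-noise form of Eq. (9)
C = np.linalg.inv(FIM)                           # covariance approximation (Cramer-Rao)

# 95 % confidence intervals, Eq. (8): theta_i +/- t_{alpha/2}^{gamma} * sqrt(C_ii)
gamma = t_s.size - theta_hat.size                # degrees of freedom
t_val = t_dist.ppf(0.975, gamma)
ci = t_val * np.sqrt(np.diag(C))
for th, c in zip(theta_hat, ci):
    print(f"{th:.3f} +/- {c:.3f}")
```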

2.3 Optimal Experimental Design

In order to improve the quality of the parameter estimates it is possible to use the model to define new experiments. The idea is to formulate an optimisation problem where the objective is to find the experimental scheme (number of experiments, input conditions, number and location of sampling times and sensors, duration of the experiments) which results in maximum information content as measured by, for example, the FIM, subject to the system dynamics, Eqs. (1), (2), and (3), plus experimental constraints. The problem can be solved by a combination of the control vector parametrisation (CVP) method and a suitable optimiser, enabling the simultaneous design of several dynamic experiments with optimal sampling times [8] and optimal sensor locations [20].

The optimal experimental design problem is thus formulated as a nonlinear optimisation problem whose numerical solution involves an outer iterative procedure to generate values for the experimental conditions, the nonlinear programming method, and the boundary value problem solver to handle model simulation and the computation of the parametric sensitivities needed to evaluate the FIM, as shown in Fig. 2.

It should be remarked that the recently developed software tool AMIGO (Advanced Model Identification using Global Optimisation) [4] covers model simulation, parameter estimation, identifiability analysis and optimal experimental design, thus facilitating the implementation of the model identification loop for general non-linear dynamic models.
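The following sketch conveys the idea of FIM-based design on the same illustrative first-order decay output used above, now with analytic sensitivities: the single design variable is the experiment duration and the objective is the D-criterion (log-determinant of the FIM). This is only a toy illustration of the concept, not the multi-experiment design machinery implemented in AMIGO.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative output y(t) = theta1*exp(-theta2*t) with analytic sensitivities
theta = np.array([1.0, 0.3])                     # nominal parameter values (assumed)
sigma = 0.02                                     # measurement noise level (assumed)
n_samples = 12                                   # fixed number of equidistant samples

def neg_log_det_fim(t_end):
    t_s = np.linspace(0.0, t_end, n_samples)
    S = np.column_stack((np.exp(-theta[1] * t_s),
                         -theta[0] * t_s * np.exp(-theta[1] * t_s)))
    FIM = S.T @ S / sigma**2
    sign, logdet = np.linalg.slogdet(FIM)
    return -logdet                               # maximise det(FIM): D-optimal design

# Design variable: experiment duration, bounded by (assumed) practical constraints
res = minimize_scalar(neg_log_det_fim, bounds=(1.0, 50.0), method="bounded")
print("D-optimal experiment duration:", res.x)
```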

3 Optimization of the Operation

3.1 Problem Formulation

The optimization of the operation is formulated as a general dynamic optimization (DO) problem as follows: Find the controls \(\mathbf{u}(t)\) that minimise (or maximise) the objective functional

$$\displaystyle{ J =\phi \left (\mathbf{x}(\boldsymbol{\xi },t_{f}),\mathbf{v}(t_{f}),\boldsymbol{\theta },t_{f}\right ) +\int _{ t_{0}}^{t_{f} }L\left (\mathbf{x}(\boldsymbol{\xi },t),\mathbf{v}(t),\mathbf{u}(t),\boldsymbol{\theta },\boldsymbol{\xi },t\right )\,\mathrm{d}t, }$$
(10)

where the scalar functions ϕ (Mayer term) and L (Lagrangian term) are continuously differentiable with respect to all of their arguments and the final time \(t_{f}\) can be either fixed or free. The minimisation is subject to the following constraints:

  • The system dynamics Eqs. (1), (2), and (3).

  • Algebraic constraints on the state and control variables which force the fulfilment of particular operational or biological conditions (for example, microbiological lethality, maximum and minimum processing temperatures, etc.) at particular time points or throughout the process:

    $$\displaystyle{ \mathbf{r}_{k}^{eq}(\mathbf{x}(\boldsymbol{\xi },t_{ k}),\mathbf{v}(t_{k}),\mathbf{u}(t_{k}),\boldsymbol{\theta },t_{k}) = 0;\qquad \mathbf{r}_{k}^{in}(\mathbf{x}(\boldsymbol{\xi },t_{ k}),\mathbf{v}(t_{k}),\mathbf{u}(t_{k}),\boldsymbol{\theta },t_{k}) \leq 0; }$$
    (11)
    $$\displaystyle{ \mathbf{c}^{eq}(\mathbf{x}(\boldsymbol{\xi },t),\mathbf{v}(t),\mathbf{u}(t),\boldsymbol{\theta },t) = 0;\qquad \mathbf{c}^{in}(\mathbf{x}(\boldsymbol{\xi },t),\mathbf{v}(t),\mathbf{u}(t),\boldsymbol{\theta },t) \leq 0. }$$
    (12)
  • Bounds on the control variables:

    $$\displaystyle{ \mathbf{u}^{L} \leq \mathbf{ u}(t) \leq \mathbf{ u}^{U}. }$$
    (13)

3.2 Control Vector Parametrisation

There are several alternatives for the solution of dynamic optimization problems, of which the direct methods are the most widely used. These methods transform the original problem into a non-linear programming problem by means of complete parametrisation [12], multiple shooting [13] or control vector parametrisation (CVP) [35]. All of them rely on some type of discretisation and approximation of either the control variables alone or both the control and state variables. The three alternatives differ basically in the resulting number of decision variables, the presence or absence of parametrisation-related constraints and the need for an initial value problem solver.

While the complete parametrisation or the multiple shooting approaches may become prohibitively expensive in computational terms, the CVP approach allows handling large-scale DO problems, such as those related to PDE systems, without solving very large NLPs and without dealing with extra junction constraints.

The CVP method proceeds by dividing the duration of the process into a number ρ of control intervals; the control function is then approximated by a low-order polynomial over each interval. Each control variable approximation may be expressed using Lagrange polynomials as follows:

$$\displaystyle{ u_{j}(t) =\sum _{ i=1}^{M_{j} }u_{ij}\varPhi _{i}^{(M_{j})}(\tau ), }$$
(14)

where \(j = 1,\ldots,\rho\), \(t \in [t_{0},t_{f}]\), and τ is the normalised time given by

$$\displaystyle{ \tau = \frac{t - t_{0}} {t_{f} - t_{0}} }$$
(15)

and the Lagrange polynomials of order \(M_{j}\), \(\varPhi _{i}^{(M_{j})}\), are defined in the standard form:

  • If M = 1,

    $$\displaystyle{ \varPhi _{i}^{(M)}(\tau ) \equiv 1. }$$
    (16)
  • If M ≥ 2,

    $$\displaystyle{ \varPhi _{i}^{(M)}(\tau ) \equiv \prod _{ i'=1,\,i'\neq i}^{M} \frac{\tau -\tau _{i'}} {\tau _{i} -\tau _{i'}}. }$$
    (17)

The parameters of these polynomials, \(u_{ij}\), are used as decision variables in the optimization, together with any time-independent parameters. Again, the numerical solution of the associated NLP requires an inner iteration to solve the system dynamics, similarly to what is shown in Fig. 2.
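A minimal sketch of the CVP idea for piecewise-constant controls (Lagrange polynomials with \(M_{j} = 1\) on each of ρ intervals) is given below, applied to an illustrative toy problem (first-order dynamics, a terminal target and a control-effort penalty) that is not one of the case studies; the control level on each interval is an NLP decision variable and the bounds of Eq. (13) are passed directly to the solver.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

# Toy dynamics (illustrative): dx/dt = -x + u, x(0) = 0
t0, tf, rho = 0.0, 10.0, 5                       # horizon and number of control intervals
t_grid = np.linspace(t0, tf, rho + 1)

def simulate(u_levels):
    x = 0.0
    for k in range(rho):                         # piecewise-constant control (CVP, M = 1)
        sol = solve_ivp(lambda t, x, u=u_levels[k]: -x + u,
                        (t_grid[k], t_grid[k + 1]), [x], rtol=1e-8)
        x = sol.y[0, -1]
    return x

def objective(u_levels):
    # Mayer term: deviation from target x(tf) = 0.8; Lagrange term: control effort
    dt = (tf - t0) / rho
    return (simulate(u_levels) - 0.8) ** 2 + 1e-2 * dt * np.sum(u_levels ** 2)

u0 = 0.5 * np.ones(rho)                          # initial guess for the decision variables
bnds = [(0.0, 1.0)] * rho                        # bounds on the control, Eq. (13)
res = minimize(objective, u0, bounds=bnds, method="L-BFGS-B")
print("optimal control level per interval:", res.x)
```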

4 Numerical Methods

4.1 Model Simulation

As explained before, most food, bio-process and bio-system models exhibit a nonlinear dynamic behaviour, which makes the analytical solution of such models rather complicated, if not impossible, in most realistic situations. In addition to non-linearity, these processes may present a spatially distributed nature. As a consequence, they must be described using PDEs, which in turn makes the analytical approach even more difficult. Numerical techniques must therefore be employed to solve the model equations.

Most of the numerical methods employed for solving PDEs, and in particular those employed in this work, belong to the family of methods of weighted residuals, in which the solution of the distributed variables in the system (1), (2), and (3) is approximated by a truncated Fourier series of the form [23]:

$$\displaystyle{ x(\boldsymbol{\xi },t) \approx \sum _{i=1}^{N}m_{ i}(t)\,\phi _{i}(\boldsymbol{\xi }). }$$
(18)

Depending on the selection of the basis functions \(\phi _{i}(\boldsymbol{\xi })\), different methodologies arise. Here two groups will be considered: those using locally defined basis functions, as is the case in classical techniques like the numerical method of lines (NMOL) or the finite element method (FEM), and those using globally defined basis functions.

4.1.1 Methods Using Local Basis Functions

The underlying idea is to discretise the domain of interest into a (usually large) number N of smaller sub-domains. In these sub-domains local basis functions, for instance low order polynomials, are defined and the original PDE is approximated by N ordinary differential equations (ODEs). The shape of the elements and the type of local functions allow distinguishing among different alternatives.

Probably the most widely used approaches for this transformation are the NMOL and the FEM. The reader interested in an extensive description of these techniques is referred to the literature [23, 29, 34].
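As a minimal illustration of the local-basis idea, the following sketch applies the numerical method of lines to a one-dimensional diffusion equation with Dirichlet boundary conditions; the grid size, diffusivity and boundary values are illustrative assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Numerical method of lines for x_t = alpha * x_xixi on [0, 1]
alpha, N = 1e-2, 51
xi = np.linspace(0.0, 1.0, N)
dxi = xi[1] - xi[0]

def rhs(t, x):
    dxdt = np.zeros_like(x)
    # second-order central differences at the interior nodes
    dxdt[1:-1] = alpha * (x[2:] - 2.0 * x[1:-1] + x[:-2]) / dxi**2
    # Dirichlet boundary conditions: x(0, t) = x(1, t) = 1 (held fixed)
    dxdt[0] = dxdt[-1] = 0.0
    return dxdt

x0 = np.zeros(N); x0[0] = x0[-1] = 1.0           # initially cold interior, hot boundaries
sol = solve_ivp(rhs, (0.0, 20.0), x0, t_eval=np.linspace(0.0, 20.0, 5), method="BDF")
print(sol.y[:, -1].round(3))                     # approximate profile at the final time
```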

However, it must be highlighted that in many food, bio-process and biological models, especially those in 2D and 3D, the number of discretisation points (N) required to obtain a good solution may be too large for their use in parameter estimation, experimental design or process optimization.

Methods using global basis functions, which reduce the computational effort, constitute an efficient alternative [6].

4.1.2 Methods Using Global Basis Functions

The use of eigenfunctions obtained from the Laplacian operator, Chebyshev or Legendre polynomials, etc. has been considered over the last decades (see [22]) as a means to obtain reduced-order descriptions of PDE systems. Probably the most efficient order reduction technique is the one based on the proper orthogonal decomposition (POD) approach [31]. In this approach, each element \(\phi _{i}(\boldsymbol{\xi })\) of the set of basis functions in (18) is computed off-line as the solution of the following integral eigenvalue problem [31]:

$$\displaystyle{ \int _{\mathbb{V}}R(\boldsymbol{\xi },\boldsymbol{\xi }')\,\phi _{i}(\boldsymbol{\xi }')\,\mathrm{d}\boldsymbol{\xi }' =\lambda _{i}\,\phi _{i}(\boldsymbol{\xi }), }$$
(19)

where \(\lambda _{i}\) is the eigenvalue associated with each global eigenfunction \(\phi _{i}\). The kernel \(R(\boldsymbol{\xi },\boldsymbol{\xi }')\) in Eq. (19) corresponds with the two-point spatial correlation function, defined as follows:

$$\displaystyle{ R(\boldsymbol{\xi },\boldsymbol{\xi }') = \frac{1} {\ell} \sum _{j=1}^{\ell}x(\boldsymbol{\xi },t_{ j})x(\boldsymbol{\xi }',t_{j}), }$$
(20)

with \(x(\boldsymbol{\xi },t_{j})\) denoting the value of the field at each instant \(t_{j}\) (snapshot), and the summation extends over a sufficiently rich collection of uncorrelated snapshots \(j = 1,\ldots,\ell\) [31]. The basis functions obtained by means of the POD technique are also known as empirical basis functions or the POD basis. The basis functions are orthogonal and can be normalised so that:

$$\displaystyle{\int _{\mathbb{V}}\phi _{i}\,\phi _{j}\,\mathrm{d}\boldsymbol{\xi } = \left \{\begin{array}{@{}l@{\quad }l@{}} 1,\quad &\text{if }\,i = j,\\ 0,\quad &\text{if } \,i\neq j. \end{array} \right.}$$

The dissipative nature of food and bio-processes means that the eigenvalues obtained from Eq. (19) can be ordered so that \(\lambda _{i} \geq \lambda _{j}\) for i < j, with \(\lambda _{n} \rightarrow 0\) as \(n \rightarrow \infty\). This property allows the definition of a finite (usually low-dimensional) subset \(\boldsymbol{\phi }_{A} = [\phi _{1},\phi _{2},\ldots,\phi _{N}]\) which captures the relevant features of the system [1, 5, 6].

In order to compute the time-dependent coefficients in Eq. (18), the original PDE system (1), (2), and (3) is projected onto each element of the POD basis set. Such a projection is carried out by multiplying the original PDE by each \(\phi _{i}\) and integrating the result over the spatial domain. As a result, the following set of ODEs is obtained:

$$\displaystyle{ \mathbf{m}_{A\,t} = F(\mathbf{m}_{A},\mathbf{x},\mathbf{v},\mathbf{u},\boldsymbol{\theta },t). }$$
(21)

At this point the basis functions \(\boldsymbol{\phi }_{A}\) and the time-dependent coefficients

$$\displaystyle{m_{A} = [m_{1},m_{2},\ldots,m_{N}]}$$

are known; therefore, the original field x can be recovered by applying Eq. (18), that is, \(x =\boldsymbol{\phi }_{A}\,\mathbf{m}_{A}\). The number of elements N in the basis subset \(\boldsymbol{\phi }_{A}\) can be increased to approximate the original state x to an arbitrary degree of accuracy.
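In practice the POD basis is usually computed from the singular value decomposition of the snapshot matrix, which is equivalent to solving Eqs. (19) and (20) in discrete form. The following sketch uses synthetic snapshots generated from an analytic two-mode field purely for illustration:

```python
import numpy as np

# Synthetic snapshot matrix X (rows: spatial points, columns: time instants)
xi = np.linspace(0.0, 1.0, 200)
t = np.linspace(0.0, 2.0, 60)
X = (np.outer(np.sin(np.pi * xi), np.exp(-t)) +
     0.1 * np.outer(np.sin(3 * np.pi * xi), np.exp(-9 * t)))

# POD basis from the SVD of the snapshot matrix: the columns of U are the phi_i and
# the squared singular values are proportional to the eigenvalues lambda_i of Eq. (19)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
energy = np.cumsum(s**2) / np.sum(s**2)
N = int(np.searchsorted(energy, 0.999) + 1)      # modes capturing 99.9 % of the "energy"
phi_A = U[:, :N]

# Time-dependent coefficients by projection and field reconstruction, Eq. (18)
m_A = phi_A.T @ X
X_rom = phi_A @ m_A
print("number of POD modes:", N)
print("max reconstruction error:", np.abs(X - X_rom).max())
```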

4.2 Non-linear Programming Methods

Most of the problems of interest are formulated as non-linear optimisation problems which can be handled by adequate non-linear programming (NLP) methods. These may be broadly classified into two main groups: local and global. Local methods are designed to generate a sequence of solutions, using some type of pattern search or gradient and Hessian information, that converges to a local optimum, usually the one closest to the provided initial guess. However, NLPs with non-linear dynamic constraints (such as those arising in parameter estimation or from the application of the CVP approach) are frequently multimodal (i.e. they present multiple local optima). Therefore, local methods may converge to local solutions, especially if they are started far away from the global optimum. In order to surmount these difficulties, global methods must be used.

Global methods have emerged as the alternative for locating the global optimum [27]. Successful methodologies combine effective mechanisms for exploring the search space and for exploiting the knowledge obtained during the search. Depending on how the search is performed and how the information is exploited, the alternatives may be classified into three major groups: deterministic, stochastic and hybrid.

Global deterministic methods [18, 28] in general take advantage of the problem’s structure and guarantee global convergence for some particular problems that verify specific smoothness and differentiability conditions. Although they are very promising and powerful, there are still limitations to their application, particularly for non-linear dynamic systems, since the computational cost increases rapidly with the size of the considered dynamic system and the number of decision variables.

Global stochastic methods do not require any assumptions about the problem's structure. They make use of pseudo-random sequences to determine search directions toward the global optimum. This leads to an increasing probability of finding the global optimum as the run time of the algorithm increases, although convergence cannot be guaranteed. The main advantage of these methods is that, in practice, they rapidly arrive at the vicinity of the solution.

The most successful approaches belong to one (or more) of the following groups: pure random search and adaptive sequential methods, clustering methods, or metaheuristics. Metaheuristics are a special class of stochastic methods which have proved to be very efficient in recent years. They include both population-based (e.g., genetic algorithms) and trajectory-based (e.g., simulated annealing) methods. They can be defined as guided heuristics, and many of them try to imitate the behaviour of natural or social processes that seek some form of optimality [33].

Despite the fact that many stochastic methods can locate the vicinity of global solutions very rapidly, the computational cost associated with the refinement of the solution is usually very large. In order to surmount this difficulty, hybrid methods and metaheuristics have recently been presented for the solution of dynamic optimisation problems [7, 16] and parameter estimation problems [30]. They speed up these methodologies while retaining their robustness and, provided a gradient-based local method is used, they guarantee convergence to a zero-gradient (stationary) solution.

The recently developed scatter search based methods [15, 17] have proved successful in the solution of parameter estimation and dynamic optimisation problems, making it possible to overcome typical difficulties of non-linear dynamic systems optimisation such as noise, flat areas, non-smoothness and/or discontinuities.
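A minimal sketch of the hybrid strategy (a stochastic global phase followed by a gradient-based local refinement) on a standard multimodal test function is given below. For illustration it uses SciPy's differential evolution followed by a quasi-Newton local search, not the scatter-search methods cited above:

```python
import numpy as np
from scipy.optimize import differential_evolution, minimize

# Multimodal test function (Rastrigin): many local minima, global minimum at the origin
def rastrigin(x):
    x = np.asarray(x)
    return 10.0 * x.size + np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x))

bounds = [(-5.12, 5.12)] * 5

# Global (stochastic) phase: locate the vicinity of the global optimum
glob = differential_evolution(rastrigin, bounds, seed=1, maxiter=200, tol=1e-8)

# Local (gradient-based) phase: refine the solution to a stationary point
loc = minimize(rastrigin, glob.x, method="L-BFGS-B", bounds=bounds)
print("global phase:", glob.fun, "after local refinement:", loc.fun)
```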

5 Illustrative Examples

5.1 Modelling and Simulation: Growth of Bacterial Biofilms

Bacteria, in both natural and pathogenic ecosystems, are found mainly within surface-associated cell assemblages, the so-called biofilms. It is now recognised that biofilms constitute a source of food-related infections. Since they can render their inhabitants more resistant to disinfectants, biofilms have become problematic in a wide range of food industries, including brewing, seafood processing, dairy processing, poultry processing and meat processing [32].

Since resistance is associated with biofilm structure, there is a growing interest in the characterisation of pathogenic biofilm structures. In this respect, much research has been performed to gain a deeper understanding of biofilm formation, adherence and growth. Basically two approaches have been considered: the use of imaging techniques and modelling.

While quantitative image analysis allows direct quantification from images obtained by microscopic techniques [11, 25, 39], mathematical models have been developed to provide mechanistic insight into structure evolution. Proposed models can be divided into three general classes according to the way the biomass is represented: discrete, continuous and hybrid discrete-continuous models (see Wanner et al. [37] for an extensive review).

Here we will consider the continuous model proposed by Eberl et al. [14]. The model represents bacteria and nutrients with two density fields denoted by \(m(t,\boldsymbol{\xi })\) and \(c(t,\boldsymbol{\xi })\), respectively. Their spatial distributions are represented by the following set of coupled diffusion-reaction mass balance equations:

$$\displaystyle\begin{array}{rcl} \dfrac{\partial C} {\partial t} & =& d_{1}\nabla ^{2}C - F(C,M),{}\end{array}$$
(22)
$$\displaystyle\begin{array}{rcl} \dfrac{\partial M} {\partial t} & =& \nabla \cdot (d_{2}(M)\nabla M) + G(C,M),{}\end{array}$$
(23)

with

$$\displaystyle{F(C,M) = K_{1} \dfrac{MC} {K_{2} + C},\,\,G(C,M) = K_{3} \dfrac{CM} {K_{2} + C}-K_{4}M,\,\,d_{2}(M) = m_{max}^{b-a}\left ( \dfrac{\epsilon } {1 - M}\right )^{a}M^{b},}$$

where

$$\displaystyle{\begin{array}{llllllll} k_{1} & = m_{max}\left ( \dfrac{\mu _{m}} {Y _{XS}} + m_{s}\right ),&\quad k_{2} & = K_{s},&k_{3} & = Y _{XS}/m_{max},&\quad k_{4} & = m_{s}m_{max}, \\ K_{1} & = m_{max}\dfrac{k_{1}} {c_{0}}, &\quad K_{2} & = \dfrac{k_{2}} {c_{0}}, &K_{3} & = k_{3}k_{1}, &\quad K_{4} & = k_{3}k_{4}, \end{array} }$$

where M and C are dimensionless variables (\(M:= m/m_{max}\); \(C:= c/c_{0}\)). Model parameters are summarised in Table 1.

Table 1 Biofilm growth model parameters

In the equation describing the biomass, Eq. (23), the first term on the right-hand side accounts for the diffusion of the biomass and the second term for its production. Expansion of the bacteria depends on the local density of bacteria and takes place only if the biomass density approaches a prescribed maximum value established by \(m_{max}\). Eberl et al. proposed a density-dependent expression for the diffusion factor \(d_{2}\) that satisfies this condition. The physical interpretation is that the biomass diffusivity vanishes as m becomes small but increases as m grows due to the biochemical reaction.

In our work we developed a numerical approach based on the combination of finite difference schemes in space (centred differences for the nutrients and a backward-forward scheme for the biomass) and the Crank-Nicolson approach in time [10]. The resulting set of non-linear equations is solved using a Newton-Raphson algorithm. Here we illustrate the results obtained for one-dimensional growth with merging colonies under symmetric initial and boundary conditions. For this purpose, two equally sized colonies are located in the interval [0, L]:

$$\displaystyle\begin{array}{rcl} C(0,x)& =& 1,\qquad \forall x \in [0,1],{}\end{array}$$
(24)
$$\displaystyle\begin{array}{rcl} M(0,x)& =& \left \{\begin{array}{@{}l@{\quad }l@{}} M_{0},\quad &\text{for}\ x \in [x_{1},x_{2}] \cup [x_{3},x_{4}], \\ 0, \quad &\text{elsewhere}, \end{array} \right.{}\end{array}$$
(25)

with \(L = 10^{-4}\), \(x_{1} = L - x_{4}\), \(x_{2} = L - x_{3}\), \(x_{3} = L/2 + 3 \times 1.6 \times 10^{-6}\) and \(x_{4} = L/2 + 4 \times 1.6 \times 10^{-6}\). Symmetric boundary conditions are imposed:

$$\displaystyle{ \begin{array}{llll} C(t,0)& = 1,&\qquad C(t,L)& = 1\qquad \forall t \in [0,1]\end{array} }$$
(26)
$$\displaystyle{ \begin{array}{llll} M_{x}(t,0)& = 0,&\qquad M_{x}(t,L)& = 0\qquad \forall t \in [0,1].\end{array} }$$
(27)
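The following sketch reproduces the structure of this one-dimensional simulation using the method of lines (central differences for the nutrient, flux differencing for the degenerate biomass diffusion and a stiff ODE integrator) instead of the Crank-Nicolson/Newton-Raphson scheme of [10]; the dimensionless parameter values and colony positions are placeholders chosen only so that the script runs, not the values of Table 1 or of Eqs. (24) and (25).

```python
import numpy as np
from scipy.integrate import solve_ivp

# Placeholder dimensionless parameters (illustrative only, not those of Table 1)
d1, K1, K2, K3, K4 = 0.1, 1.0, 0.4, 1.0, 0.1
eps, a, b = 1e-2, 4.0, 4.0
N = 201
x = np.linspace(0.0, 1.0, N)
dx = x[1] - x[0]

def d2(M):
    return (eps / (1.0 - M)) ** a * M ** b       # degenerate biomass diffusivity

def rhs(t, y):
    C, M = y[:N], y[N:]
    # nutrient, Eq. (22): central differences, Dirichlet boundaries C = 1 held fixed
    dC = np.zeros(N)
    dC[1:-1] = d1 * (C[2:] - 2 * C[1:-1] + C[:-2]) / dx**2 \
               - K1 * M[1:-1] * C[1:-1] / (K2 + C[1:-1])
    # biomass, Eq. (23): div(d2(M) grad M) by flux differencing, zero flux at both ends
    flux = 0.5 * (d2(M[1:]) + d2(M[:-1])) * (M[1:] - M[:-1]) / dx
    flux = np.concatenate(([0.0], flux, [0.0]))
    dM = (flux[1:] - flux[:-1]) / dx + K3 * C * M / (K2 + C) - K4 * M
    return np.concatenate((dC, dM))

# Two small colonies (placeholder positions) on an initially nutrient-rich domain
C0 = np.ones(N)
M0 = np.zeros(N)
M0[(x > 0.45) & (x < 0.48)] = 0.05
M0[(x > 0.52) & (x < 0.55)] = 0.05
sol = solve_ivp(rhs, (0.0, 2.0), np.concatenate((C0, M0)), method="BDF", rtol=1e-6)
print("max biomass density at final time:", sol.y[N:, -1].max())
```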

As can be seen in Fig. 3, both colonies spread in both directions until they collide. It should be noted that a modification of the model parameters, for example an increased nutrient availability or a decreased maximum biomass concentration, would accelerate the spatial spreading of biomass; as a consequence, the colonies would merge earlier and form a more compact spatial structure. On the contrary, a decrease in nutrient availability or a larger maximum biomass concentration would slow down the spatial spreading of biomass, and the colonies would merge later or might not merge at all. This reveals the capacity of the model to describe the clusters and tunnels typically observed in the laboratory.

Fig. 3

Numerical solution of the biofilm growth example

5.2 Reduced Order Models: Food Pasteurisation in Tunnels

Food thermal processing persists as one of the most widely used methods for food preservation. The product is treated at a given temperature for a given period of time to minimise public health hazards due to the presence of pathogenic microorganisms and to extend product shelf-life. Different time-temperature combinations could be used to achieve safety. However, the related time-temperature histories would affect the quality of the product in different ways.

Therefore, the design of thermal processes requires a deep understanding of the heating process of the given product and of its impact on the target microorganism and on quality factors. The thermal treatment will depend on the thermo-physical characteristics, shape and size of the food product and container; the type and thermal resistance of the microorganisms of interest; and the kinetics of quality degradation.

In this section we consider the pasteurisation in tunnels of highly viscous liquid foods, such as tomato or carrot puree, in cylindrical food jars. The containers are loaded at one end of the pasteuriser and passed under sprays of water as they move along the conveyor belt. The temperature of the water changes in the different zones so as to achieve pasteurisation. Heat transfer occurs between the hot water film and the package surface and from the package to the food product.

The evolution of the temperature and velocity fields within the food product during pasteurisation is described by means of conservation laws. The package is assumed to be heated homogeneously, so axial symmetry allows a 2D geometry to be considered. The process can be mathematically described as follows [26]:

5.2.1 Continuity Equation

$$\displaystyle{ \frac{\partial u} {\partial z} + \frac{v} {r} + \frac{\partial v} {\partial r} = 0, }$$
(28)

where r and z are the spatial coordinates (radius and height of the package), while v and u are the corresponding velocity field components, i.e., \(\mathbf{w} = [u,v]^{\mathrm{T}}\).

5.2.2 Momentum Conservation

$$\displaystyle\begin{array}{rcl} \rho _{prod}\left (\frac{\partial v} {\partial t} + u\frac{\partial v} {\partial z} + v\frac{\partial v} {\partial r}\right ) = -\frac{\partial p} {\partial r} +\mu _{prod}\left ( \frac{\partial } {\partial r}\left (\frac{1} {r} \frac{\partial rv} {\partial r} \right ) + \frac{\partial ^{2}v} {\partial z^{2}}\right ),& &{}\end{array}$$
(29)
$$\displaystyle\begin{array}{rcl} \rho _{prod}\left (\frac{\partial u} {\partial t} + u\frac{\partial u} {\partial z} + v\frac{\partial u} {\partial r}\right ) = -\frac{\partial p} {\partial z} +\mu _{prod}\left (\frac{1} {r} \frac{\partial } {\partial r}\left (r\frac{\partial u} {\partial r}\right ) + \frac{\partial ^{2}u} {\partial z^{2}} \right ) +\hat{\rho }\, g,& &{}\end{array}$$
(30)

where p is the pressure, \(\rho _{prod}\) is the foodstuff density, g is the gravitational constant, T represents the temperature distribution inside the food and \(\mu _{prod}\) stands for the viscosity, expressed as a function of the temperature [26]:

$$\displaystyle{ \mu _{prod} = a_{\mu }T^{2} - b_{\mu }T + c_{\mu }, }$$
(31)

and the density \(\hat{\rho }\) is usually expressed in terms of the fluid temperature as follows:

$$\displaystyle{ \hat{\rho }=\rho _{ref}\left (1 -\beta \left (T - T_{ref}\right )\right ), }$$
(32)

where β is the thermal expansion coefficient and \(\rho _{ref}\) and \(T_{ref}\) are given reference values.

5.2.3 Energy Conservation

$$\displaystyle{ \frac{\partial T} {\partial t} + v\frac{\partial T} {\partial r} + u\frac{\partial T} {\partial z} =\alpha _{prod}\left (\frac{1} {r} \frac{\partial } {\partial r}\left (r\frac{\partial T} {\partial r} \right ) + \frac{\partial ^{2}T} {\partial z^{2}} \right ). }$$
(33)

The system in Eqs. (28)-(33) is subject to the following initial and boundary conditions:

  • Initially the foodstuff is at rest (\(\mathbf{w} = 0\)) and at a uniform temperature \(T(r,z,t = 0) = T_{0}\).

  • The velocity field components (u, v) are zero at the package walls, i.e.:

    $$\displaystyle{ u\vert _{z=0} = u\vert _{z=Z} = u_{\vert _{r=R}} = 0, }$$
    (34)
    $$\displaystyle{ v\vert _{z=0} = v\vert _{z=Z} = v\vert _{r=R} = 0. }$$
    (35)
  • Symmetry conditions are imposed at the symmetry axis (r = 0):

    $$\displaystyle{ \frac{\partial T} {\partial r} \bigg\vert _{r=0} = \frac{\partial u} {\partial r}\bigg\vert _{r=0} = \frac{\partial v} {\partial r}\bigg\vert _{r=0} = 0. }$$
    (36)
  • The package bottom is in contact with the conveyor belt, which is assumed to be an insulating material:

    $$\displaystyle{ \frac{\partial T} {\partial z} \bigg\vert _{z=0} = 0. }$$
    (37)
  • At the right and upper sides, the package is in direct contact with the falling film of heating fluid:

    $$\displaystyle{ k_{prod}\frac{\partial T} {\partial r} \bigg\vert _{r=R} = h_{jar}{\Bigl (T_{ff} - T\vert _{r=R}\Bigr )}, }$$
    (38)
    $$\displaystyle{ k_{prod}\frac{\partial T} {\partial z} \bigg\vert _{z=Z} = h_{jar}{\Bigl (T_{ff} - T\vert _{z=Z}\Bigr )}, }$$
    (39)

    with \(T_{ff}\) being the temperature of the falling film, \(h_{jar}\) the jar heat transfer coefficient and \(k_{prod}\) the product thermal conductivity.

Here we compare the solution of the model, Eqs. (28)-(39), using the finite element method (FEM) with that of a reduced-order model (ROM) based on the proper orthogonal decomposition approach. The main steps to derive the ROM are the following:

  1.

    Obtain a set of snapshots that characterises the spatio-temporal distribution of the variable of interest (temperature, velocity, etc.). In our case all the snapshots are obtained from FEM-based simulations of the system of Eqs. (28)-(39) under different possible experimental conditions (\(T_{ff}\), \(T_{0}\)). Since the product and package properties are unknown, several values of the parameters within physically meaningful bounds have to be considered to obtain the snapshots. The finite element method, with a mesh of 725 discretisation points, was used to solve the system of Eqs. (28)-(39) and generate the snapshots (see Fig. 4). Each simulation implies solving 2900 ODEs, which takes around 25 s on a standard PC.

    Fig. 4

    Illustrative example of the package, operating conditions and FEM mesh

  2.

    Computation of the POD basis. The snapshots from the previous step are used to compute the so-called POD basis as described above [19].

  3.

    Projection of the model equations, Eqs. (28)-(39), onto the selected POD basis. The projection is carried out by multiplying the original PDE system by the POD basis and integrating the result over the spatial domain. Note that the FEM structure may be exploited to numerically perform the projection [19]:

    $$\displaystyle\begin{array}{rcl} & \int _{V }\phi _{T,i}\,\frac{\partial T} {\partial t} \,\mathrm{d}\xi =\int _{V }\phi _{T,i}\,(\alpha \,\varDelta T - w\,\nabla T)\,\mathrm{d}\xi,& {}\end{array}$$
    (40)
    $$\displaystyle\begin{array}{rcl} & \rho \int _{V }\phi _{w,i}\,\frac{\partial w} {\partial t} \,\mathrm{d}\xi =\int _{V }\phi _{w,i}\,\left (\mu \varDelta w -\rho w\nabla w -\nabla P +\rho g\left (1 -\beta (T - T_{0})\right )\mathbf{z}\right )\,\mathrm{d}\xi,& {}\end{array}$$
    (41)

    with \(i = 1,\ldots,N_{x}\) and \(\mathbf{z}\) being a unit vector in the direction of the spatial coordinate z. Equation (41), with \(\mathbf{w} = [u,v]^{\mathrm{T}}\), is equivalent to the result of the projections of Eqs. (29) and (30).

Taking into account:

$$\displaystyle\begin{array}{rcl} & T(\xi,t) \approx \sum _{i=1}^{N_{T}}m_{T_{ i}}(t)\,\phi _{T_{i}}(\xi ),&{}\end{array}$$
(42)
$$\displaystyle\begin{array}{rcl} & w(\xi,t) \approx \sum _{i=1}^{N_{w}}m_{w_{ i}}(t)\,\phi _{w_{i}}(\xi ),&{}\end{array}$$
(43)

and after some algebraic manipulations, Eqs. (40) and (41) can be rewritten as:

$$\displaystyle{ \frac{\mathrm{d}\mathbf{m}_{T}} {\mathrm{d}t} = \left (\alpha _{prod}A_{T} + B_{T} +\alpha _{prod}D_{T}\right )\mathbf{m}_{T}, }$$
(44)
$$\displaystyle{ \rho \frac{\mathrm{d}\mathbf{m}_{w}} {\mathrm{d}t} = \left (\mu A_{w} +\rho B_{w} +\mu D_{w}\right )\mathbf{m}_{w} -\rho g\beta C_{T,w}\mathbf{m}_{T} +\rho g(1 +\beta T_{0}), }$$
(45)

where the components of the matrices \(A_{x}\), \(B_{x}\), \(C_{T,w}\) and \(D_{x}\) are of the form:

$$\displaystyle\begin{array}{rcl} \begin{array}{llll} A_{x}(i;j) & =\int _{V }\nabla \phi _{x,i}\,\nabla \phi _{x,j}\,\mathrm{d}\xi,\qquad &\qquad B_{x}(i;j) & =\int _{V }\phi _{x,i}(w\nabla \phi _{x,j})\,\mathrm{d}\xi, \\ C_{T,w}(i;j)& =\int _{V }\phi _{w,i}\,\phi _{T,j}\,\mathrm{d}\xi, &\qquad D_{x}(i;j)& =\int _{\partial V }\phi _{x,i}\,\nabla \phi _{x,j}\,\mathrm{d}\xi,\end{array} & & {}\\ \end{array}$$

with ∂V denoting the boundary of V. The vector of time-dependent functions \(\mathbf{m}_{x}\) is of the form \(\mathbf{m}_{x} = [m_{x,1},m_{x,2},\ldots,m_{x,N}]^{\mathrm{T}}\).

The larger the number of basis functions used, the better the accuracy of the reduced model, although at the expense of a higher computational cost. In order to arrive at a compromise between accuracy and efficiency, several validation experiments were performed for various experimental conditions and parameter values. Table 2 shows the differences arising from the addition of basis functions. Results are compared in terms of the mean error with respect to the FEM solution for the worst validation example: \(E = 100\,\frac{x_{ROM} - x_{FEM}} {x_{FEM}}\), where x represents each of the state variables T, \(\mathbf{w} = [u,v]^{\mathrm{T}}\).

Table 2 FEM vs ROM in the simulation of thermal pasteurisation in tunnels

The best compromise between quality and computational cost is offered by the ROM with 40 ODEs. It should be noted that the mean error is below 2 % as compared to the FEM-based simulation. The dynamic evolution of the temperature and velocity fields at five spatial locations distributed along the diagonal of the spatial domain (\(p_{1} = (0,0)\), \(p_{2} = (0.011,0.022)\), \(p_{3} = (0.019,0.045)\), \(p_{4} = (0.029,0.067)\), \(p_{5} = (0.04,0.09)\)) is presented in Fig. 5 for one validation example. Continuous lines correspond to the FEM simulation while marks represent the solution of the ROM with 40 ODEs. As shown in the figure, the ROM is able to reproduce the system behaviour.

Fig. 5

Evolution of the state variables, temperature and fluid velocity, using the POD approach (marks) and the FEM (continuous lines) at different locations within the spatial domain

5.3 Model Identification: Production of Gluconic Acid in a Fed-Batch Reactor

Industrial fermentation is based on the conversion of glucose into other substances by the action of microorganisms under highly oxygenated, aerobic growth conditions. These processes are widely employed to obtain, for instance, bread, wine and cheese in the food industry, and biomass, metabolites (ethanol, citric acid, gluconic acid, vitamins, antibiotics) or recombinant products (insulin) in the biotechnology industry, or even bio-fuels to replace conventional petrol.

Most of the fermentation processes to obtain gluconic acid (GA) are carried out by Aspergillus niger. The objective here is to build a model with good predictive capabilities to describe the dynamics of glucose (G), oxygen (\(O_{2}\)), gluconic acid (GA) and biomass (X) during the growth phase of Aspergillus niger. We consider a fed-batch fermenter with two valves, one regulating the incoming flux of the glucose and water mixture (\(u_{1}\)) and one regulating the oxygen transfer rate described by Henry's law (\(u_{2}\)). The controls take values between 0 (closed) and 1 (fully open). Mathematically the process may be described as follows [21]:

$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}X} {\mathrm{d}t} & =& \mu _{max} \frac{G} {K_{G} + G} \frac{O_{2}} {K_{O_{2}} + O_{2}},{}\end{array}$$
(46)
$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}GA} {\mathrm{d}t} & =& Y _{GA}\mu _{max} \frac{G} {K_{G} + G} \frac{O_{2}} {K_{O_{2}} + O_{2}},{}\end{array}$$
(47)
$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}G} {\mathrm{d}t} & =& Y _{G}\mu _{max} \frac{G} {K_{G} + G} \frac{O_{2}} {K_{O_{2}} + O_{2}} + \frac{u_{1}F_{in}} {V } (G^{in} - G),{}\end{array}$$
(48)
$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}O_{2}} {\mathrm{d}t} & =& Y _{O_{2}}\mu _{max} \frac{G} {K_{G} + G} \frac{O_{2}} {K_{O_{2}} + O_{2}} + u_{2}K_{La}(O_{2}^{{\ast}}- O_{ 2}),{}\end{array}$$
(49)
$$\displaystyle\begin{array}{rcl} \frac{\mathrm{d}V } {\mathrm{d}t} & =& u_{1}F_{in},{}\end{array}$$
(50)

where \(F_{in}\) and \(K_{La}\) represent the maximum incoming flux and the maximum oxygen transfer rate, respectively; \(O_{2}^{{\ast}}\) is the dissolved oxygen saturation concentration; and \(G^{in}\) corresponds to the concentration of glucose in the inlet.

The next step in the model building loop is to compute the model unknowns, in this case \(\boldsymbol{\theta }= [\mu _{max},Y _{G},Y _{O_{2}},Y _{GA},K_{La}]\), by measuring \(\mathbf{y} = [X,GA,G,O_{2}]\). To estimate their values we first consider a qualitative experimental design. Two completely different experiments are designed: (i) one where the incoming flux valve is almost closed (\(u_{1} = 0.01\)) and the oxygen transfer valve is completely open (\(u_{2} = 1\)), and (ii) one where the incoming flux valve is completely open (\(u_{1} = 1\)) and the oxygen transfer valve is almost closed (\(u_{2} = 0.01\)). Pseudo-experimental data are obtained by direct numerical simulation of the model assuming the following nominal values for the parameters: \(K_{La} = 600\,\mathrm{h}^{-1}\), \(O_{2}^{{\ast}} = 0.0084\,\mathrm{g}\,\mathrm{l}^{-1}\), \(G^{in} = 250\,\mathrm{g}\,\mathrm{l}^{-1}\), \(F_{in} = 0.5\,\mathrm{min}^{-1}\), \(\mu _{max} = 0.2242\,\mathrm{h}^{-1}\), \(K_{G} = 9.9222\,\mathrm{g}\,\mathrm{l}^{-1}\), \(K_{O_{2}} = 0.0137\,\mathrm{g}\,\mathrm{l}^{-1}\), \(Y _{GA} = 44.8887\), \(Y _{O_{2}} = -2.5598\), \(Y _{G} = -51.0365\). Gaussian experimental error is added to the model predictions and 40 equidistant sampling times are used per experiment.
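A sketch of how pseudo-experimental data for the first qualitative experiment (\(u_{1} = 0.01\), \(u_{2} = 1\)) can be generated by direct simulation of Eqs. (46)-(50) with the nominal parameter values listed above is given next. The initial conditions, the experiment duration, the noise level and the use of a single consistent time unit (hours) for \(F_{in}\) are illustrative assumptions:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Nominal parameter values from the text (F_in assumed in l/h for unit consistency)
mu_max, K_G, K_O2 = 0.2242, 9.9222, 0.0137
Y_GA, Y_O2, Y_G = 44.8887, -2.5598, -51.0365
K_La, O2_sat, G_in, F_in = 600.0, 0.0084, 250.0, 0.5

def rhs(t, y, u1, u2):
    X, GA, G, O2, V = y
    r = mu_max * G / (K_G + G) * O2 / (K_O2 + O2)   # common rate term in Eqs. (46)-(49)
    dX = r
    dGA = Y_GA * r
    dG = Y_G * r + u1 * F_in / V * (G_in - G)
    dO2 = Y_O2 * r + u2 * K_La * (O2_sat - O2)
    dV = u1 * F_in
    return [dX, dGA, dG, dO2, dV]

# Qualitative experiment (i): feed valve almost closed, aeration valve fully open
u1, u2 = 0.01, 1.0
y0 = [0.5, 0.0, 40.0, 0.0084, 1.0]               # assumed initial conditions
t_s = np.linspace(0.0, 10.0, 40)                 # 40 equidistant samples (assumed 10 h)
sol = solve_ivp(rhs, (0.0, t_s[-1]), y0, args=(u1, u2), t_eval=t_s, method="BDF")

# Pseudo-experimental data: add Gaussian noise to the observables y = [X, GA, G, O2]
rng = np.random.default_rng(0)
y_meas = sol.y[:4] * (1.0 + 0.05 * rng.standard_normal(sol.y[:4].shape))
```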

The parameter estimation problem was solved using eSS as incorporated in AMIGO obtaining the following optimal solution:

$$\displaystyle{\boldsymbol{\theta }^{{\ast}} = [0.2241,-51.049,-2.1341,44.923,500.2].}$$

The corresponding optimal fit is shown in Fig. 6.

Fig. 6

Best fit obtained for the qualitative experimental design

It should be noted that even though the values obtained for \(\mu _{max}\), \(Y _{GA}\) and \(Y _{G}\) are within 1 % of the known global solution, this is not the case for \(Y _{O_{2}}\) and \(K_{La}\), where the differences are around 17 %. In addition, the Monte Carlo based identifiability analysis reveals uncertainties above 20 %.

In view of these results, a parallel-sequential optimal experimental design was pursued in order to improve the parameter estimates. The two qualitative designs are incorporated in the FIM and two new experiments are designed, allowing for constant control profiles that are optimised together with the final time and the initial conditions of glucose and biomass. The OED problem was solved so as to minimise the ratio between the maximum and the minimum eigenvalue of the FIM, and the Monte Carlo based practical identifiability analysis was performed for the resulting experimental scheme so as to compare the expected uncertainty in the parameter estimates.

The parameter estimation problem was then solved using the four experiments in the optimal experimental scheme. Figure 7 shows the two optimally designed experiments together with the optimal fits obtained with SSm, which correspond to the following parameter set: \(\boldsymbol{\theta }^{{\ast}} = [0.2241,44.908,-51.04,-2.5606,600.04]\), within 0.04 % of the nominal values; i.e. with OED it is possible to converge to the real parameter values.

Fig. 7

Best fit obtained for the optimally designed experiments

5.4 Identification and Dynamic Optimisation: Frying of Potato Chips

In deep-fat frying, the foodstuff is immersed in oil at a high (constant) temperature. This induces water evaporation and the formation of a thin crust. As the temperature increases and moisture is lost, the typical deep-frying sensory characteristics (colour, flavour, texture) develop. However, the use of high temperatures results in the production of acrylamide, a carcinogenic compound. Thus model-based optimisation may assist in the design of operating conditions that provide the best compromise between quality and safety.

A multiphase porous-media-based model was formulated to describe heat, mass and momentum transfer and acrylamide kinetics within a potato chip, as described in Warning et al. [38]. The model consists of a set of coupled nonlinear PDEs describing the evolution of the water, oil and vapour saturations (\(S_{w}\), \(S_{o}\), \(S_{g}\)), product temperature (T), moisture content (M), pressure (P), water vapour mass fraction (\(\omega _{v}\)) and acrylamide content (\(c_{AA}\)). The potato chip is assumed to be cylindrical and heated from the outside, so axial symmetry can be assumed. The selected geometry is shown in Fig. 8.

Fig. 8

Geometry of the potato chip

The model was solved in COMSOL©. The Convection and Diffusion module was used to solve for water, oil and acrylamide mass conservation, Maxwell-Stefan Diffusion and Convection was used for the gas mass fraction, and Darcy's Law and Convection and Conduction were used to solve for pressure and temperature, respectively. The selected mesh consists of 20 × 10 rectangular elements. The simulation of 1.5 min of frying takes around 40 s on a standard PC (3.25 GB RAM, 2.83 GHz).

The unknown parameters, namely the heat transfer coefficient (h) and the surface oil saturation \(S_{o,surf}\), were identified from experimental data using AMIGO; details can be found in Arias-Méndez et al. [3].

The final model exhibits good predictive capabilities (see Fig. 9), enabling the analysis of alternative operating conditions. The objective was to compute the oil temperature profile (\(T_{oil_{min}} \leq T_{oil} \leq T_{oil_{max}}\)) that guarantees the desired quality attributes (colour and crispness) while minimising the final acrylamide content, subject to the process dynamics. The problem was solved by means of a combination of the CVP approach and eSS [17].

Fig. 9

Best fit: experimental data (dots) vs model predictions (lines) of acrylamide (\(c_{AA}\)), oil and moisture content (M∕M(0)) at different process temperatures

As a first approximation to the problem, the typical industrial process at constant oil temperature was designed. As expected, the lower the oil temperature, the lower the acrylamide content and the longer the process. Results reveal that a reduction in the oil temperature from 180 °C to 150 °C translates into a reduction of around 4 % in acrylamide content and an increase of 25 % in the process duration. Since the process duration is critical for the production rate, and no recommendations or constraints are yet available on the maximum admissible acrylamide content, a good compromise would be to use intermediate temperatures of 165-170 °C for 80-85 s.

The general dynamic optimisation problem was then solved for different maximum process durations (80, 85, 90 and 95 s) and different maximum numbers of heating zones. Results show that using two heating zones significantly reduces the final acrylamide content with respect to typical constant operating profiles. The optimal profile corresponds to the use of a higher temperature at the beginning of the process, which helps to satisfy the constraint on the moisture content, followed by a lower temperature to minimise the final acrylamide content (Fig. 10).

Fig. 10

Optimal operation profiles (oil temperatures) using different numbers of heating zones (\(t_{final} = 95\) s)

5.5 Real Time Optimisation: Thermal Sterilisation of Packaged Foods

In this example we consider the thermal sterilisation of packaged solid foods in steam retorts. The product is introduced into a steam retort where it is subjected to a given heating-cooling cycle so as to attain a pre-specified degree of microbial inactivation, indicated by the microbiological lethality. However, some organoleptic properties or nutrients can be negatively affected by the action of heat. The objective is, therefore, to optimise the operating conditions to maximise quality while guaranteeing safety. In this example we go a step further and propose a real-time optimisation (RTO) architecture to handle the optimisation during processing in the presence of uncertainty or sudden disturbances. The performance of the proposed RTO architecture was experimentally validated for tuna pâté at the pilot plant of the IIM-CSIC.

The dynamic representation of the plant couples the description of the temperature inside the retort, temperature distribution inside the food product and the corresponding distribution of nutrients and microorganisms:

5.5.1 Retort Dynamics

$$\displaystyle{ \frac{\mathrm{d}\mathbf{z}} {\mathrm{d}t} =\mathbf{ f}(\mathbf{z};\boldsymbol{\theta }) +\mathbf{ g}(\mathbf{z},\mathbf{u};\boldsymbol{\theta }), }$$
(51)

where \(\mathbf{f}\) and \(\mathbf{g}\) are nonlinear vector fields of appropriate dimensions; \(\mathbf{z}\) denotes the temperature and pressure in the retort, \([T_{R},P_{R}]\); \(\mathbf{u}\) stands for the control variables, namely the valve positions for the input and output streams; and \(\boldsymbol{\theta }\) denotes the vector of unknown parameters. For a detailed description the reader is referred to [2].

5.5.2 Temperature Distribution Inside the Food Product

$$\displaystyle{ \frac{\partial T_{\mathit{prod}}} {\partial t} =\alpha \nabla ^{2}T_{\mathit{ prod}},\quad \mathbf{n}\cdot (k\nabla T_{\mathit{prod}}) = h(T_{R} - T_{\mathit{prod}}), }$$
(52)

where \(T_{prod}\) is the temperature of the foodstuff, and h, k and α stand for the heat transfer coefficient of the package and the food thermal conductivity and diffusivity, respectively.

5.5.3 Quality and Safety Models

$$\displaystyle{ \frac{\mathrm{d}C_{i}(t)} {\mathrm{d}t} = -\left ( \frac{\ln 10} {D_{i,\mathit{ref }}}\right )C_{i}(t)\exp \left (\frac{T_{\mathit{prod}}(\boldsymbol{\xi },t) - T_{\boldsymbol{\xi },\mathit{ref }}} {z_{i,\mathit{ref }}} \right ), }$$
(53)

where the subscript i in \(C_{i}\) distinguishes the concentrations of the different microorganisms or nutrients considered.
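For a given cold-spot temperature history, Eq. (53) can be integrated directly to obtain the lethality (expressed as an equivalent time at the reference temperature, and as log-reductions) and the corresponding retention of a thermolabile nutrient. The temperature profile and the kinetic constants (\(D_{ref}\), \(z_{ref}\)) in the sketch below are illustrative assumptions, not the values identified for the tuna pâté process:

```python
import numpy as np
from scipy.integrate import trapezoid

# Illustrative kinetic constants (assumed; not the identified values for tuna pate)
T_ref = 121.1                                    # reference temperature [degC]
D_micro, z_micro = 0.21, 10.0                    # target microorganism [min], [degC]
D_nutr, z_nutr = 150.0, 25.0                     # thermolabile nutrient [min], [degC]

# Assumed cold-spot temperature history: heating ramp, holding phase, cooling ramp
t = np.linspace(0.0, 60.0, 601)                  # time [min]
T = np.interp(t, [0.0, 20.0, 45.0, 60.0], [40.0, 117.0, 117.0, 45.0])

def equivalent_time(T, t, z):
    # Equivalent time at T_ref implied by Eq. (53): F = int exp((T - T_ref)/z) dt
    return trapezoid(np.exp((T - T_ref) / z), t)

F_micro = equivalent_time(T, t, z_micro)
print("lethality at the cold spot [min]:", round(F_micro, 2))
print("microbial log-reductions        :", round(F_micro / D_micro, 1))
retention = 10.0 ** (-equivalent_time(T, t, z_nutr) / D_nutr)
print("nutrient retention              :", round(100.0 * retention, 1), "%")
```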

The unknown parameters of the model, the functional dependencies of the fluxes on the valve openings and the valve-related constants were identified by means of parameter estimation, identifiability analysis and multi-experiment optimal design, using the AMIGO toolbox.

Regarding the evolution of the temperature inside the retort, the resulting model presents excellent predictive capabilities, with a maximum error of around 3 % observed in fast transitions.

The product was packed in glass containers with a metal top. The corresponding geometry and the FEM mesh used for simulation purposes are depicted in Fig. 11. The selected mesh consists of 184 nodes, which translates into 553 ODEs. Three model parameters were estimated from the temperature measurements, namely the product thermal conductivity and the glass/steam and metal/steam heat transfer coefficients. After model identification, the differences between model predictions and experimental data are lower than 1 %.

Fig. 11

Geometry of the food package

Once a satisfactory model became available, a POD-based ROM was developed to be used within the RTO scheme; each simulation of the ROM takes less than 1 s. In addition, the optimal operating conditions were computed off-line using the CVP and scatter search methods.

The real-time implementation of the optimal control needs to consider the effect of unmeasured disturbances that are not part of the prediction model. For that purpose, feedback was implemented by regularly measuring the current retort variables and observing (estimating) the relevant variables of the packaged product, so that efficient on-line optimisation can be performed. Optimal operating conditions are then re-computed whenever a difference between the predicted behaviour and the off-line optimal solution is detected. A combination of a local optimiser and SSm was designed so as to guarantee feasibility and optimality of the solution even in the presence of significant perturbations or plant/model mismatch (see details in [2]).

Figures 12 and 13 illustrate the performance of the RTO architecture in an experimental case where large perturbations occur. The implementation of the off-line optimal heating profile leads to a product that does not fulfil the lethality requirement (\(F_{c} = 8\) min). The RTO architecture proposed in this work was able to drive the system to feasibility and optimality by re-computing the optimal profiles on-line and slightly extending the duration of the heating phase.

Fig. 12

Comparison of off-line and on-line optimal profiles under large perturbations in the retort at the pilot plant (IIM-CSIC)

Fig. 13

Comparison of off-line and on-line optimal profiles: surface nutrient retention and lethality value at the coldest point

6 Conclusions

Computer-aided simulation and model-based optimisation offer a powerful, rational and systematic way to improve the understanding and performance of food processes, bio-processes and biological systems. In recent decades there has been a growing interest in the development of rigorous models, based on first principles, that make it possible not only to perform experiments in silico, but also to design and optimise operation policies.

However, several problems have to be faced, mostly related to (i) insufficient a priori knowledge to deduce the right model structure or model parameter values; (ii) the complexity of the processes, which combine physical, chemical and biological phenomena over a wide range of time and space scales; (iii) the complexity of the associated models, which calls for sophisticated numerical simulation techniques; and (iv) the complexity of the associated optimisation problems, due mainly to multi-modality.

In this work we have used a number of examples taken from the food and biotechnology industries to illustrate how those problems emerge and to present some alternatives to tackle them. Special emphasis was placed on describing the model identification loop, which involves parameter estimation, identifiability analysis and model-based experimental design, as well as the dynamic optimisation problem. Most of the problems can be formulated as non-linear optimisation problems whose solution requires adequate model simulation techniques, including accurate and efficient reduced-order modelling approaches, and the use of global optimisation methods. Finally, all elements were combined to design and implement a real-time optimisation architecture which is able to ensure high operational stability, process reproducibility and optimal operation.