Introduction

The development of mammalian cell culture processes for the production of stable and active drugs is still challenging due to the high number of relevant variables [1]. In contrast to one-factor-at-a-time (OFAT) approaches, Design of Experiments (DoE) tools offer a systematic method for evaluating multiple variables simultaneously and describing them with empirical models. DoE methods are recommended within the Quality by Design (QbD) methodology to describe how the process variables and their interdependencies affect the final biopharmaceutical [2, 3]. Typically, a screening design is chosen first to statistically identify relevant process variables, e.g., medium compositions [4,5,6] or process parameters [7, 8]. Second, the experimental space, consisting of significant variables (factors) and user-defined boundaries, is experimentally evaluated with respect to the impact of the factors on the targeted outcomes (responses, e.g., product yield) [9,10,11]. The experimental results are then used to estimate the parameters of an empirical response-surface model (RSM) that describes the effects and interactions of the process variables on the responses. A stable setpoint (e.g., medium concentrations) of the process variables is sought to ensure the stability of the process and to define the space of stable operation, referred to as the design space. The main advantages of DoE methods are the systematic planning of experiments and the description of the interactions between the variables [12, 13]. Disadvantages are the high number of time-consuming and cost-intensive experiments typically needed in bioprocess development [2, 14, 15]. Furthermore, DoE methods depend on user-defined choices of the experimental design, the variables, and the boundary values and levels of the factors [2, 16].
Mostly, expert knowledge (e.g., literature, heuristics, and experience) is required to define suitable boundary values for the development and optimization of cell culture processes using DoE [17,18,19]. This is viewed as controversial, since the development of processes for the production of therapeutics should rely on sound process knowledge, including a well-founded decision about the boundary values [3, 16]. Moreover, the heuristic restriction of the experimental space can lead to several iterative rounds, each consisting of a re-estimated experimental space and new experimental settings [16, 20]. Even if single-use high-throughput systems can perform the experiments in parallel, considerable time and analytical support are required, which can delay the market entry of a pharmaceutical [21].

Mathematical process modeling has gained importance in recent decades, since it can be applied to design [22,23,24], control [25,26,27], and optimize [28, 29] biopharmaceutical production processes. Furthermore, a mathematical process model is seen as a sustainable part of QbD, because it contributes to a scientific understanding of the process variables and their impact on the final product [14, 30,31,32,33]. Although the application of mathematical process models to the development of sophisticated processes has many advantages, it is still not common in bioprocess development. Reasons for this are the variety and complexity of mathematical models, which differ in their assumed mechanisms and in the quality of their predictions (recently reviewed in [34]).

In this study, a novel concept for the a priori evaluation of DoE designs using mathematical process models (model-assisted DoE—mDoE) is discussed. The general structure including mathematical process models and DoE is schematically shown in Fig. 1.

Fig. 1 Workflow of mDoE, combining mathematical process models with classical DoE

First, initial process development data are generated in cultivation experiments that are based on literature or prior knowledge (e.g., from medium suppliers or former studies). If no prior knowledge is available or the influencing factors are unknown, a traditional screening design is recommended [4, 14]. These first data are evaluated and used to model the growth, metabolic rates, and productivity of the bioprocess. Prior and expert knowledge are incorporated in the mathematical process model (for the modeling workflow, see [33]), since it consists of mechanistic links describing the interactions of the culture dynamics [32]. However, mathematical process modeling can only be used if the general relationships are known [34]. Otherwise, data-based modeling approaches need to be applied; for example, a hybrid modeling concept was introduced in [20]. In mDoE, typically only few data are available (i.e., in early process development). The model structures used are therefore simple and describe known mechanistic effects, so that the model parameters can be estimated from few data. After modeling, a classical DoE is planned, including the definition of factors and responses and the choice of an appropriate experimental design. The factor combinations are exported and the dynamics of the cell culture process are simulated for each combination with the process model. The simulated responses are treated like experimental data and the DoE is evaluated: an RSM is estimated and a constraint-based optimization of the experimental space is conducted. This loop can be repeated several times to narrow the boundary values for an experimental DoE and to reduce the number of experiments during process development.

The method is demonstrated for (part A) the optimization of the glucose and glutamine concentrations of an antibody-producing Chinese hamster ovary (CHO) batch process and (part B) the development of a bolus fed-batch feeding strategy. Furthermore, an experimental DoE was performed within the reduced experimental space for each example and compared to the purely simulated responses.

Materials and methods

Mathematical process model

In this study, an unstructured, non-segregated saturation-type model (see Table 1) was adapted and modified from literature to describe the dynamics of cell growth and metabolism of antibody-producing CHO DP-12 cells in batch and, with a few extensions, in fed-batch mode [25]. This model was chosen due to its simple model structure and the opportunity to estimate all the model parameters from just a few shaking flask cultivations. Furthermore, it has already been used successfully for the online control of feeding strategies for hybridoma cells and the optimization of seed trains for \(\hbox {AGE1}_{\mathrm{AAT}}\) and CHO-K1 cells [25, 28, 35].

Table 1 Mathematical process model in batch mode.

Batch-process model

The mathematical process model describes the growth (\(X_{\mathrm{t}}\)—total cell density, \(X_{\mathrm{d}}\)—dead cell density, and \(X_{\mathrm{v}}\)—viable cell density) based on the main substrates glucose (\(c_{\mathrm{Glc}}\)) and glutamine (\(c_{\mathrm{Gln}}\)). Glucose is taken up and subsequently high lactate (\(c_{\mathrm{Lac}}\)) production rates are typically seen in CHO cell cultures due to the Warburg effect [36]. The uptake of glutamine leads to the formation of ammonium (\(c_{\mathrm{Amm}}\)) during the glutaminolysis [37, 38]. No inhibitory components were considered, because cell growth was not observed to be inhibited in batch mode in the experimental setting [28]. The growth is modeled with Monod-like structures including the kinetic parameters \(K_{\mathrm{S, i}}\) (\(i = \hbox {Glc, Gln}\)), a cell lysis constant (\(K_{\mathrm{Lys}}\)) of dead cells and a minimal (\(\mu _{\mathrm{d, min}}\)), and a maximal death rate (\(\mu _{\mathrm{d, max}}\)). The cell-specific uptake rates of glucose and glutamine depend on the current glucose and glutamine concentration (Eqs.  5, 6, 12, 13). The used differential equation for the glucose uptake describes the following phenomena: First, glucose uptake is based on Monod-type equations, for which \(q_{\mathrm{Glc}}\) decreases with lower glucose concentration. Second, a growth-associated term describing the decrease of the glucose uptake at low growth rates (e.g. typically during inhibition) was added to Eq. 12. This term is 1 if \(\mu \,=\,\mu _{\mathrm{max}}\) and decreases if \(\mu \) decreases. The concentrations of lactate and ammonium are proportional to the uptake rates of glucose (lactate) or glutamine (ammonium) (Eqs. 7, 8, 14, 15, 16) and are linked with the yield coefficients (\(Y_\text{Amm/Gln}\, \text{and}\, Y_\text{Lac/Glc}\)). The shift of lactate production to lactate uptake under low glucose concentrations below \(0.5\,\hbox {mmol} \, \hbox {l}^{-1}\) was implemented (Eqs. 14, 15). 
The antibody production (Eq. 17) was simulated according to Frahm [25], where the production is described as proportional to the viable cell density. The modeling of antibody production by mammalian cells has been discussed controversially in the literature (reviewed in [34]). A constant cell-specific productivity was chosen because it is the simplest model structure and needs no additional terms, which reduces the risk of over-fitting. However, CHO DP-12 cells stop production when they enter the stationary growth phase. Therefore, lower boundary values for the antibody production were defined below \(1\,\hbox {mmol} \, \hbox {l}^{-1}\) glucose (Eq. 18).

Model extension: fed-batch

The growth of CHO cells in fed-batch mode stops even if glucose and glutamine are in excess. This effect, in contrast to a batch process, is described based on the accumulation of lactate and ammonium. High lactate concentrations (25–110 \(\hbox {mmol} \, \hbox {l}^{-1}\)) [25, 39,40,41] can lead to an increase in osmolality with growth inhibition and reduced productivity [42,43,44]. Furthermore, [45] identified cell-cycle dependent and putative autocrine factor-dependent metabolic regulations of the lactate production. The lactate concentration in fed-batch mode did not exceed approx. \(30\,\hbox {mmol} \, \hbox {l}^{-1}\) during 240 h cultivation and was comparable to batch cultivations. Therefore, lactate inhibition was not considered and not modeled in this study. The accumulation of ammonium (2–20 \(\hbox {mmol} \, \hbox {l}^{-1}\)) [25, 39,40,41] results in growth diminution, decreasing productivity and changes in the product quality, including the glycosylation pattern [39, 42,43,44]. Various factors, such as pH or temperature, affect the chemical decomposition of glutamine and the inhibitory effect of ammonium [39, 46, 47]. The concentration of ammonium increases during the fed-batch cultivations. The diminution of the growth could be linked to the ammonium concentration and the model structure of \(\mu \) and \(\mu _{\mathrm{d, max}}\) was extended with an ammonium inhibition based on the current ammonium concentration and an inhibitory constant \(K_{\mathrm{i,Amm}}\):

$$\begin{aligned} \mu&= \mu _{\mathrm{max}} \times \frac{c_{\mathrm{Glc}}}{c_{\mathrm{Glc}} + K_{\mathrm{s, Glc}}} \times \frac{c_{\mathrm{Gln}}}{c_{\mathrm{Gln}} + K_{\mathrm{s, Gln}}} \times \frac{K_{\mathrm{i, Amm}}}{c_{\mathrm{Amm}} + K_{\mathrm{i, Amm}}} \end{aligned}$$
(19)
$$\begin{aligned} \mu _{\mathrm{d}}&= \mu _{\mathrm{d, min}} + \mu _{\mathrm{d, max}} \times \frac{K_{\mathrm{s, Glc}}}{c_{\mathrm{Glc}} + K_{\mathrm{s, Glc}}} \times \frac{K_{\mathrm{s, Gln}}}{c_{\mathrm{Gln}} + K_{\mathrm{s, Gln}}} \nonumber \\&\quad \times \frac{c_{\mathrm{Amm}}}{c_{\mathrm{Amm}} + K_{\mathrm{s, Amm}}} \end{aligned}$$
(20)
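As an illustration, Eqs. 19 and 20 can be written as two small functions. This is a minimal sketch and not the implementation used in this study (which was done in MATLAB). The function names, the argument `k_amm` (the ammonium constant in the death term of Eq. 20), and all numerical values are assumptions chosen only for readability.

```python
def specific_growth_rate(c_glc, c_gln, c_amm, mu_max, k_s_glc, k_s_gln, k_i_amm):
    """Eq. 19: Monod terms for glucose and glutamine, damped by ammonium."""
    monod_glc = c_glc / (c_glc + k_s_glc)
    monod_gln = c_gln / (c_gln + k_s_gln)
    inhibition = k_i_amm / (c_amm + k_i_amm)
    return mu_max * monod_glc * monod_gln * inhibition


def specific_death_rate(c_glc, c_gln, c_amm, mu_d_min, mu_d_max,
                        k_s_glc, k_s_gln, k_amm):
    """Eq. 20: basal death rate plus a term that rises with substrate
    depletion and with ammonium accumulation."""
    starvation_glc = k_s_glc / (c_glc + k_s_glc)
    starvation_gln = k_s_gln / (c_gln + k_s_gln)
    ammonium_term = c_amm / (c_amm + k_amm)
    return mu_d_min + mu_d_max * starvation_glc * starvation_gln * ammonium_term


# Hypothetical parameter values (not the estimates of this study).
params = dict(mu_max=0.04, k_s_glc=1.0, k_s_gln=0.5, k_i_amm=5.0)
mu_without_ammonium = specific_growth_rate(40.0, 8.0, 0.0, **params)
mu_at_k_i = specific_growth_rate(40.0, 8.0, 5.0, **params)
print(mu_at_k_i / mu_without_ammonium)
```

With these assumed values, the growth rate at \(c_{\mathrm{Amm}} = K_{\mathrm{i, Amm}}\) is half of the rate without ammonium, which is the defining property of the inhibition term in Eq. 19.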

In addition, CHO DP-12 cells stop the production of antibodies if the growth is inhibited. Therefore, the antibody production was modeled to stop if \(c_{\mathrm{Amm}}\) reaches or exceeds \(K_{\mathrm{i,Amm}}\) or if the current glutamine concentration decreases below \(0.05\,\hbox {mmol} \, \hbox {l}^{-1}\):

$$\begin{aligned} c_{\mathrm{Amm}} \ge K_{\mathrm{i, Amm}}:\frac{\hbox {d}c_{\mathrm{mAb}}}{\hbox {d}t}=0 \end{aligned}$$
(21)
$$\begin{aligned} c_{\mathrm{Gln}} < 0.05 \, \hbox {mmol} \, \hbox {l}^{-1}: \frac{\hbox {d}c_{\mathrm{mAb}}}{\hbox {d}t}=0. \end{aligned}$$
(22)
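The two stop conditions of Eqs. 21 and 22 can be sketched as a single rate function. The sketch assumes a constant cell-specific productivity `q_mab` multiplied by the viable cell density, as described for Eq. 17; the function name, the argument names, and the numbers in the usage line are hypothetical. The glucose-dependent lower boundary of Eq. 18 is not included here.

```python
def antibody_production_rate(x_v, c_amm, c_gln, q_mab, k_i_amm, gln_threshold=0.05):
    """Return dc_mAb/dt for a constant cell-specific productivity q_mab.

    Production is switched off if either stop condition holds:
    ammonium at or above K_i,Amm (Eq. 21) or glutamine below the
    threshold of 0.05 mmol/l (Eq. 22).
    """
    if c_amm >= k_i_amm or c_gln < gln_threshold:
        return 0.0
    return q_mab * x_v


# Hypothetical values: 10 (in 10^6 cells/ml), q_mab = 0.5 in matching units.
print(antibody_production_rate(x_v=10.0, c_amm=1.0, c_gln=2.0, q_mab=0.5, k_i_amm=5.0))
```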

The mathematical model was implemented as a system of ordinary differential equations. The fed-batch mode was modeled by adding the feed rate (\(F_\mathrm{rate}\)), the current volume V, and the concentration of a component i in the feed (\(F_{i}\), written as \(c_{i, F}\) in Eq. 23) to the balance equations in batch mode [35]:

$$\begin{aligned} \frac{\hbox {d}c_{\mathrm{i, fed-batch}}}{\hbox {d}t}=\frac{F_{\mathrm{rate}}}{V} \times (c_{{ i, F}} - c_{{i}}) + \frac{\hbox {d}c_{i,\mathrm{batch}}}{\hbox {d}t}. \end{aligned}$$
(23)

The sampling and the bolus feed were modeled as a constant feeding rate.
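The structure of Eq. 23 can be checked with a short numerical example. The following sketch only evaluates the right-hand side of Eq. 23 for one component; the function name and the numbers (apart from the glucose feed concentration of 222 mmol/l and the 40 ml starting volume mentioned in this study) are assumptions for illustration.

```python
def fed_batch_derivative(dc_dt_batch, f_rate, volume, c_feed, c):
    """Eq. 23: dilution/feeding term plus the batch reaction term.

    f_rate and volume must share volume units (e.g. ml/d and ml), so that
    f_rate / volume is a dilution rate in 1/d.
    """
    return f_rate / volume * (c_feed - c) + dc_dt_batch


# Assumed example: 2 ml/d feed into 40 ml, glucose at 22 mmol/l in the flask,
# 222 mmol/l in the feed, and a batch consumption of 3 mmol/(l d).
rate = fed_batch_derivative(dc_dt_batch=-3.0, f_rate=2.0, volume=40.0,
                            c_feed=222.0, c=22.0)
print(rate)  # net change of the glucose concentration in mmol/(l d)
```

In this assumed case the feeding term contributes \(2/40 \times (222 - 22) = 10\) and the consumption \(-3\), so the concentration rises even though the cells take up glucose.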

Estimation of model parameters

The model parameters were estimated (MATLAB R2018a, USA) from only four modeling experiments for each example by minimizing the weighted root-mean-squared deviation (RMSD) using the Downhill Simplex algorithm introduced by Nelder and Mead [48]. All parameters were adapted simultaneously. The viable cell density and glucose concentration were weighted with a factor of 100 and the glutamine concentration with a factor of 1000. Cell concentrations were divided by \(10^{6}\) to bring all values to the same order of magnitude. The simulations were evaluated against the experimental data using the coefficient of determination (\(R^{2}\)) [49, 50].
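The objective function and the evaluation criterion can be sketched as follows. The exact form of the weighted RMSD is not specified in this study; the sketch assumes that one RMSD is computed per state variable, multiplied by its weight, and summed, with a weight of 1 for variables without a stated weight. The variable keys (`"Xv"`, `"Glc"`, `"Gln"`) and function names are hypothetical.

```python
import math


def rmsd(simulated, measured):
    """Root-mean-squared deviation between two equally long sequences."""
    n = len(measured)
    return math.sqrt(sum((s - m) ** 2 for s, m in zip(simulated, measured)) / n)


def weighted_objective(simulated, measured, weights):
    """Assumed objective: sum of per-variable RMSD values times their weights."""
    return sum(weights.get(name, 1.0) * rmsd(simulated[name], measured[name])
               for name in measured)


def r_squared(simulated, measured):
    """Coefficient of determination of the simulation against the data."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - s) ** 2 for s, m in zip(simulated, measured))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Weights as stated in the text; data values are made up for illustration.
weights = {"Xv": 100.0, "Glc": 100.0, "Gln": 1000.0}
measured = {"Xv": [0.3, 1.2, 4.0], "Glc": [42.0, 38.0, 30.0]}
simulated = {"Xv": [0.3, 1.0, 4.2], "Glc": [42.0, 39.0, 29.0]}
print(weighted_objective(simulated, measured, weights))
```

An objective of this kind could be passed to any Nelder-Mead implementation, for example `fminsearch` in MATLAB or `scipy.optimize.minimize` with `method="Nelder-Mead"` in Python.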

Model-assisted reduction of experimental space

The mathematical process model was used instead of laboratory experiments to reduce the boundary values in DoE. Experiments were designed with suitable DoE software (in this study: Design-Expert 9, Statcon, USA). Each factor combination of the experimental design was simulated (MATLAB), and the responses were calculated and exported to generate response surface plots (Design-Expert 9). The simulated data were treated in the same way as data from experiments. No data transformation was applied, and after analysis of variance (ANOVA, all hierarchical design mode, quadratic process order), an internal RSM was set up with a significance threshold of 0.05. After defining the RSM for each response (\(y_{i}\)), user-defined constraints were chosen and the desirability function was calculated for each response individually [51, 52]. The main advantage of the desirability function is that it reduces the multidimensional optimization problem to a single value \(d_{{i}}(y_{{i}})\) between 0 and 1 per response: \(d_{{i}}(y_{{i}})\) is 0 if the optimization criterion is not fulfilled and tends towards 1 as the response becomes more desirable. It is calculated from the user-defined lower acceptable response \(L_{{i}}\) and the upper acceptable response \( {U}_{{i}}\), which define the optimization range \(( {U}_{{i}} - L_{{i}})\):

$$\begin{aligned} d_{{i}}(y_{{i}})= \left\{ \begin{array}{ll} 0 & \quad \text {if } y_{{i}}< L_{{i}} \\ \frac{y_{{i}} - L_{{i}}}{{U}_{{i}} - L_{{i}}} & \quad \text {if } L_{{i}} \le y_{{i}} \le {U}_{{i}} \\ 1 & \quad \text {if } y_{{i}}> {U}_{{i}} \\ \end{array}\right. \end{aligned}$$
(24)

The multidimensional optimization problem is reduced with the multiplication of the different desirability function values \(d_{i}({y}_{i})\) to one overall desirability D:

$$\begin{aligned} D = \left( \prod _{\mathrm{i=1}}^{\mathrm{n}} d_{{i}}(y_{{i}}) \right) = d_{1}(y_{1})\times d_{2}(y_{2})\times d_{3}(y_{3}) [\ldots ] \end{aligned}$$
(25)
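Eqs. 24 and 25 translate directly into a few lines of code. The sketch below is not the Design-Expert implementation; the function names are hypothetical. Eq. 24 describes the case in which a response is maximized. The mirrored variant for responses that are minimized (such as lactate or ammonium) is an assumption added here for completeness.

```python
def desirability(y, lower, upper):
    """Eq. 24 for a response that is to be maximized."""
    if y <= lower:
        return 0.0
    if y >= upper:
        return 1.0
    return (y - lower) / (upper - lower)


def desirability_minimize(y, lower, upper):
    """Assumed mirror image of Eq. 24 for a response that is to be minimized."""
    return 1.0 - desirability(y, lower, upper)


def overall_desirability(values):
    """Eq. 25: product of the individual desirability values."""
    result = 1.0
    for value in values:
        result *= value
    return result


# Made-up responses: one response violating its constraint sets D to 0.
print(overall_desirability([desirability(15.0, 10.0, 20.0), 0.5, 1.0]))
print(overall_desirability([desirability(5.0, 10.0, 20.0), 0.9, 1.0]))
```

Because D is a product, a single response outside its acceptable range gives \(D = 0\) regardless of how favorable the other responses are.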

Cell line and media

The suspension-adapted cell line CHO DP-12 (clone #1934, ATCC CRL-12445), which produces an anti-interleukin-8 antibody (IgG-1), was used in this study (kindly provided by Prof. Dr. T. Noll, Bielefeld University, Germany). CHO DP-12 cells were cultivated in TC-42 medium (chemically defined, animal component-free, Xell AG, Germany), which was supplemented with \(0.1\,\hbox {mg} \, \hbox {l}^{-1}\) LONG R3 IGF-1, varying concentrations of glutamine and glucose, and \(200\,\hbox {nM}\) methotrexate (all Sigma-Aldrich).

Preculture

Cells were treated in a similar fashion before each experiment and no maintenance culture was used, because a loss in the productivity of CHO DP-12 cells has been described in the past [53]. Cryo-cultures (\(1\times 10^7\hbox { cells} \, \hbox {ml}^{-1}\)) were thawed and inoculated into single-use baffled Erlenmeyer flasks (\(40\,\hbox {ml}\) working volume, Corning, USA). The incubator atmosphere (LT-XC, Kuhner, Switzerland) was controlled at \(37\,^{\circ } \hbox {C}\), \(5 \%\,\hbox {CO}_2\), and \(85 \%\) humidity. The shaking speed was set to \(250\,\hbox {rpm}\) with a shaking diameter of \(12.5\,\hbox {mm}\). Cell expansion was performed using multiple baffled Erlenmeyer flasks with \(40\,\hbox {ml}\) and \(80\,\hbox {ml}\) working volume (Corning, Netherlands) until the required number of cells was reached. Expansion was designed such that no substrate limitation and no putative inhibition by metabolites occurred. All cultivations were performed without antibiotics and serum.

A: Optimization of batch medium

The concept of mDoE was evaluated for the optimization of the initial concentrations of glucose and glutamine in a batch process.

Experiments for modeling: Four parallel \(80\,\hbox {ml}\) shaking flask cultures (starting concentrations: \(42\hbox { mmol} \, \hbox {l}^{-1}\) glucose, \(8\hbox { mmol} \, \hbox {l}^{-1}\) glutamine) were inoculated with \(0.3 \times 10^{6}\,\text {cells}\,\hbox {ml}^{-1}\) and samples were taken every 12 h.

Experimental DoE: The suggested experimental design (based on mDoE) was performed in 16 parallel shaking flasks with different concentrations of glucose and glutamine using the same medium. The experimental set-up was as described in "Cell line and media". All shaking flasks were inoculated from the same preculture (\(0.3 \times 10^{6}\,\text {cells}\,\hbox {ml}^{-1}\)) and samples were taken every 24 h.

B: Optimization of fed-batch strategy

As a second example, the feed concentrations of glucose (\(F_{\mathrm{Glc}}\)) and glutamine (\(F_{\mathrm{Gln}}\)), the feeding rate (\(F_{\mathrm{rate}}\)), and the start of feeding (Feed-start) were optimized in a bolus fed-batch process.

Experiments for modeling: Fed-batch experiments were performed in single-use baffled Erlenmeyer flasks (recommended working volume \(40\,\hbox {ml}\)) with a starting volume of \(40\,\hbox {ml}\). The incubator and the starting concentrations in batch mode were the same as above. The bolus feed (Chomacs basic feed, Xell AG) was supplemented with varying concentrations of glutamine. The glucose concentration in the feed was not changed for the modeling experiments, since it was fixed in the feed supplement to \(222\,\hbox {mmol} \, \hbox {l}^{-1}\). Feeding was started at different time points and with different (bolus) feeding rates (see Table 2). The feed was freshly prepared each day, prewarmed for at least 45 min, and added daily as a bolus to the shaking flasks. The experiments were stopped if the viability decreased below \(70 \%\).

Table 2 Performed experiments for fed-batch model parameter estimation

Experimental DoE: The cultivation protocol was slightly changed for the recommended experimental design (mDoE) as follows: the minimal working volume was 30 ml and feeding was performed until the maximal working volume of 50 ml was reached. After that, no further bolus feeding was applied.

Growth, substrate, and metabolite analysis

Total cell concentration and cell-size distribution were measured with the Z2 particle counter (Z2, Beckman Coulter, USA) as explained in [54]. The cell suspension was centrifuged (\(300\,\hbox {g}\), \(3\,\hbox {min}\)), the supernatant was frozen (\(-20^\circ \hbox {C}\)) for metabolite analysis and the cells were suspended in \(4^\circ \hbox {C}\) PBS. The viability was determined by DNA staining with DAPI. For this purpose, cells were suspended in \(4^\circ \hbox {C}\) PBS with \(1\,\upmu \hbox {g}\, \hbox {ml}^{-1}\) 4′,6-diamidino-2-phenylindole (DAPI) and immediately measured by flow cytometry (CytoFlex, Beckman Coulter, Brea, CA, USA). The \(405\,\hbox {nm}\) laser and the \(450/50\,\hbox {nm}\) (FITC-A) filter signal were used. Debris and doublets were excluded with SSC-A/FSC-A and FSC-H/FSC-A gating and non-stained cells were gated as viable (30000 recorded events, CytExpert Software, Beckman Coulter).

Glucose, glutamine, and lactate concentrations were measured with the YSI 2900D (Yellow Springs Instruments, USA) biochemistry analyzer. Ammonium concentration was determined with an enzymatic test kit (AK00091, Nzytech, Portugal).

Antibody quantification

The antibody was measured in part A with an IgG quantification assay (PAIA biotech, Germany). In brief, \(54\,\upmu \hbox {l}\) of the assay buffer and \(6\,\upmu \hbox {l}\) of 1:10 diluted (PBS) crude supernatant were added to the wells of a 384 well plate (PAIA). Rituximab (Mabthera, Roche, Switzerland) was used to calibrate the assay. All standards were prepared in cell culture medium diluted 1:10 with PBS. Assays were measured in technical triplicates (Tecan Safire, Austria; bottom reading, Exc. 640 nm, Em 670 nm).

As an alternative, a high-performance liquid chromatographic system (HPLC, Knauer Smartline, Germany) equipped with a Poros-A column (Thermo Fisher Scientific, USA; 0.1 ml) was used in Part B in accordance with the manufacturer’s protocol. Purified water containing \(150\,\hbox {mmol} \, \hbox {l}^{-1}\) NaCl and 50 \(\hbox {mmol} \, \hbox {l}^{-1}\,\hbox {Na}_{2}\hbox {HPO}_{4}\) (pH 7) was used as the mobile phase. The samples were filtered (cellulose filters, pore size: \(0.45 \upmu \hbox {m}\), Restek, Germany) before injection of \(50 \upmu \hbox {l}\). \(100\,\hbox {mmol} \, \hbox {l}^{-1}\) glycine (pH 2.5, in purified water) was applied to elute the antibody and the UV signal (280 nm) was measured. The system was calibrated with a standard curve of diluted rituximab (Roche) and samples were measured in duplicates. Both measurements show comparable results (data not shown).

Results and discussion

The aim of this study was to introduce the concept of mDoE for the reduction of boundary values in the experimental DoE for process development. This is discussed for (part A) the optimization of the initial glucose and glutamine concentration of a CHO cell culture medium and (part B) the optimization of the feeding rate, start of feeding and glucose and glutamine concentration in the feed of a bolus fed-batch process. At first, model parameters of a process model were estimated and then used to simulate the responses in DoE plans. The boundary values in the experimental space were reduced using the simulations instead of experiments. Then, experimental DoEs were performed in the reduced experimental space and compared to the predicted DoEs based on the process model. All experiments performed in this study are summarized in Table 3.

Table 3 Performed experiments in this study

Estimation of model parameters

The data obtained in the model-building experiments were used to estimate the model parameters of the batch and fed-batch process models (see section "Estimation of model parameters"). The initial values, estimated parameters, and statistical evaluation criteria are summarized in Supplementary Table 1.

Part A: medium optimization

The experiments for modeling in example A (medium optimization) were based on the previous publications [53] and [55], which used the same medium and cell line. Four biological replicates were performed, the data were averaged, and the model parameters were estimated. As shown in Fig. 2, exponential cell growth, the transition to the stationary phase, and the corresponding cell death (summarized in \(X_{\mathrm{v}}\), Fig. 2a) could be simulated with high accuracy.

Fig. 2 Comparison of experimental data (symbols) and simulated data (solid line). Mean and one standard deviation of four parallel batch cultivations; samples were measured as three technical replicates for each shaking flask, and \(R^{2}\) was calculated against the mean data points

The concentrations of glucose (Fig. 2b) and glutamine (Fig. 2e) were reflected by the model with an \(R^2> 0.9\). The formation of the metabolic waste products lactate (Fig. 2c) and ammonium (Fig. 2f) was predicted well (\(R^2 = 0.78\) and \(R^2 = 0.89\), respectively). The lactate concentration increased faster in the first 48 h than predicted by the model. The secondary accumulation of lactate and ammonium after 120 h corresponds to cell death, which includes apoptosis, necrosis, and cell lysis [56]. Modeling such effects was not the aim of the proposed simple model with the available data. The antibody concentration (Fig. 2d) increased until \(X_{\mathrm v}\) decreased after approx. 144 h and was estimated with a high accuracy of \(R^2 = 0.98\).

Part B: optimization of fed-batch

The modeling of sophisticated fed-batch strategies is still challenging due to the complex regulation of the cell metabolism under changing cultivation conditions such as changing pH, osmolality, or varying waste product concentrations [50]. A D-optimal DoE was used at the beginning to plan experiments suitable for modeling the fed-batch process. Although novel approaches for the design of experiments aimed at the targeted estimation of model parameters (model-based DoE) are emerging, their application in early process development for rather complex cell culture processes is still under investigation [57] and so far not feasible with low experimental effort. Therefore, the boundary values of this first DoE were based on the available feeding solutions and vendor data (see section "B: Optimization of fed-batch strategy"). Concentrated feed with \(222\,\hbox {mmol} \, \hbox {l}^{-1}\) glucose and 9–38 \(\hbox {mmol} \, \hbox {l}^{-1}\) glutamine was added as a daily bolus. The boundary values for the feeding rate were 1–4 \(\hbox {ml} \, \hbox {d}^{-1}\), based on bolus feeding protocols described in the literature [58, 59]. The D-optimal design was evaluated based on the standard error of the suggested experiments using a quadratic RSM fit. Four distributed experiments with a standard error of the design \(\ge 0.6\) were randomly selected. The four performed experiments do not reflect the entire range of feeding strategies, but act as a starting point for modeling and for the evaluation of the experimental settings. The simulated and experimental data of the four fed-batch shaking flask experiments are compared in Fig. 3.

Fig. 3 Comparison of experimental (Exp:) and simulated data (Sim:) summarized for the four performed fed-batch experiments (see section "B: Optimization of fed-batch strategy"). Error bars show one standard deviation of three technical measurements; \(R^{2}\) reflects the goodness of fit against the optimal simulation (\(x\,=\, y\))

The time courses of each individual estimation are shown in Supplementary Figs. 1–4. The viable (Fig. 3a) and total cell density (Fig. 3c) were experimentally measured and simulated from \(0.3 \times 10^{6}\,\hbox {cells} \, \hbox { ml}^{-1}\) up to \(28 \times 10^{6}\,\hbox {cells} \, \hbox { ml}^{-1}\) with an \(R^2 = 0.85\). The simulated dead cell density (Fig. 3b) was higher than the measured one and partially shifted. Modeling the complexity of cell starvation and cell death was not the aim of this study, and the achieved simulation was acceptable. Furthermore, the bolus feed and the sampling in shaking flasks were modeled successfully (Fig. 3d). The simulated concentrations of glucose (Fig. 3f) and glutamine (Fig. 3g) follow the experimental data, and the corresponding concentrations of the metabolic waste products lactate (Fig. 3h) and ammonium (Fig. 3i) were simulated satisfactorily. The high lactate accumulation during the first 48 h could not be simulated, and the experimental and simulated data differ below \(20\,\hbox {mmol} \, \hbox { l}^{-1}\). The dynamics of the lactate metabolism in CHO cultures are still discussed controversially in the literature (summarized in [60, 61]) and the intracellular mechanisms are not well understood. Modeling such effects with the simple model structures applied here was not targeted, and the simulation was, therefore, considered sufficient for process optimization. The antibody concentration was estimated with an \(R^2\) of 0.74 (Fig. 3e).

The aim was to model the data of the first experiments that are typically performed to evaluate feed media and feeding strategies. Therefore, modeling the 90-fold increase in the viable cell density under changing cultivation conditions in shaking flasks over 240 h of cultivation with the proposed simple unstructured and non-segregated model structure was considered sufficient.

Model-assisted reduction of boundary values

The concept of mDoE combines model predictions with classical DoE. Simulations were used instead of laboratory experiments to predict the responses in a DoE plan. At first, widely distributed factor combinations are examined and a DoE is planned. Optimal designs are commonly applied in bioprocess development [1, 2, 17]; therefore, a D-optimal design was applied in Part A and an I-optimal design in Part B. The responses of each DoE were predicted with the mathematical process model and evaluated with RSM estimation as in a classical DoE. User-defined constraints were chosen and the desirability function was calculated for the investigated responses. The experimental space was subsequently evaluated, and a fully experimental DoE was planned with boundary values within the reduced experimental space. The suggested list of experiments, the simulated responses, the statistical evaluation of the responses, and the response surfaces are summarized in the Supplementary Tables 2–6 and 10.
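One simple way to derive reduced boundary values from simulated responses is to keep only the factor combinations with an overall desirability above zero and to take their minimum and maximum per factor. The sketch below illustrates this idea under the stated assumption; it is not the procedure used in this study, where the reduced space was selected within the desirability contour plot. The function name, the factor keys, and all numbers are hypothetical.

```python
def reduced_bounds(factor_settings, desirabilities):
    """Bounding box of all simulated factor combinations with D > 0.

    factor_settings: list of dicts, one per simulated design point.
    desirabilities:  overall desirability D of each design point.
    """
    feasible = [point for point, d in zip(factor_settings, desirabilities) if d > 0]
    if not feasible:
        raise ValueError("no simulated design point has D > 0")
    return {name: (min(p[name] for p in feasible), max(p[name] for p in feasible))
            for name in feasible[0]}


# Made-up design points (mmol/l) and desirabilities for illustration only.
design = [
    {"glc": 20.0, "gln": 2.0},
    {"glc": 35.0, "gln": 6.0},
    {"glc": 50.0, "gln": 9.0},
    {"glc": 60.0, "gln": 12.0},
]
d_values = [0.0, 0.4, 0.7, 0.0]
print(reduced_bounds(design, d_values))
```

A bounding box is a coarse approximation of the region with \(D > 0\). With few design points, evaluating the fitted RSM on a finer grid before taking the bounds would give a more reliable estimate.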

Part A: medium optimization

Initial boundary values: The initial glucose concentration was varied in the simulated design between 20 and \(60\,\hbox {mmol} \, \hbox { l}^{-1}\). This corresponds to a \(50\%\) increase/decrease of the glucose concentrations as reported in [53] and [55]. These studies did not focus on the optimization of the batch-medium composition as aimed in this study. Glutamine concentrations typically applied in batch media range from \(2\,\hbox {mmol} \, \hbox { l}^{-1}\) up to \(8\,\hbox {mmol} \, \hbox { l}^{-1}\). The factor range of the initial glutamine concentration was, therefore, widely defined between 2 and \(12\,\hbox {mmol} \, \hbox { l}^{-1}\).

A D-optimal design with 20 suggested experiments was planned within the defined boundary values, and the maximum concentrations of the viable cell density, antibody, lactate, and ammonium were simulated as responses.

Constraints for desirability function: The constraints were chosen to maximize the viable cell density above a minimal viable cell density of \(10^{7}\,\hbox {cells} \, \hbox {ml}^{-1}\). Furthermore, the antibody concentration should be maximized. The constraints for the metabolic waste products were defined based on literature data with respect to cell growth and product quality. High lactate concentrations were shown to correlate with a reduced integral of viable cell density and a reduced product titer at day 14 in pH-controlled shaking flask cultivations with added sodium lactate [62]. Lactate concentrations below \(20\,\hbox {mmol} \, \hbox {l}^{-1}\) are considered not to harm cell growth and productivity, whereas more than \(40\,\hbox {mmol} \, \hbox { l}^{-1}\) lactate was shown to harm CHO cell growth [63]. Therefore, a maximal lactate concentration of \(30\,\hbox {mmol} \, \hbox { l}^{-1}\) was defined as the upper constraint, and the lactate concentration was minimized below this value. In [64], the sialylation of a granulocyte colony-stimulating factor was found to be significantly reduced by ammonium concentrations above \(2\,\hbox {mmol} \, \hbox { l}^{-1}\). In [65], the mRNA expression levels of 52 N-glycosylation-related genes were investigated in recombinant CHO cells producing an Fc-fusion protein, and a decrease of the protein production and the viable cell density was observed after the addition of \(10\,\hbox {mmol} \, \hbox { l}^{-1}\) ammonium chloride. At the same time, the sialic acid content and the acidic isoforms were reduced after 5 days of cultivation. The ammonium concentration was, therefore, chosen to be minimized to take product quality into account, even though it was not measured.

Desirability function: The desirability function was calculated (see "Model-assisted reduction of experimental space" in section) and is shown in Fig. 4.

Fig. 4

Contour plot of desirability function for the optimization of the initial glucose and glutamine concentration of a batch process, symbols are the factor combinations in a D-optimal design, responses were simulated ("Model-assisted reduction of experimental space" in section), and dashed line represents the reduced boundary values (see "Performed experimental designs" in section)

Glutamine concentrations higher than approx. \(10.5\,\hbox {mmol} \, \hbox { l}^{-1}\) result in a too high ammonium concentration (\(D_{\mathrm{Med-opt}} = 0\)), and glucose concentrations above \(52\,\hbox {mmol} \, \hbox { l}^{-1}\) result in too high lactate concentrations (\(D_{\mathrm{Med-opt}}=0\)). The minimal criterion of \(10\,\times 10^{6}\,\hbox {cells} \, \hbox { ml}^{-1}\) was not reached below \(4\,\hbox {mmol} \, \hbox {l}^{-1}\) glutamine and \(21\,\hbox {mmol} \, \hbox {l}^{-1}\) glucose. In this way, multiple constraints were considered, and only a small area with \(D_{\mathrm{Med-opt}}>0\) remains as the suggested experimental space. The desirability function tends towards \(D_{\mathrm{Med-opt}}=0\) outside of this area, where no experiments can fulfill the optimization criteria.
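How a single violated constraint forces the overall desirability to zero can be sketched with a generic Derringer–Suich-type formulation. The exact function used here is the one defined in "Model-assisted reduction of experimental space"; the linear ramps, the geometric mean, and every bound except the minimal viable cell density (\(10\,\times 10^{6}\,\hbox {cells} \, \hbox {ml}^{-1}\)) and the maximal lactate concentration (\(30\,\hbox {mmol} \, \hbox {l}^{-1}\)) are placeholder assumptions.

```python
import math


def d_maximize(y, low, high):
    """Individual desirability: 0 at or below `low`, 1 at or above `high`."""
    return min(1.0, max(0.0, (y - low) / (high - low)))


def d_minimize(y, low, high):
    """Individual desirability: 1 at or below `low`, 0 at or above `high`."""
    return min(1.0, max(0.0, (high - y) / (high - low)))


def overall_desirability(ds):
    """Geometric mean of the individual values; any d_i = 0 gives D = 0."""
    if any(d == 0.0 for d in ds):
        return 0.0
    return math.exp(sum(math.log(d) for d in ds) / len(ds))


def medium_desirability(vcd, mab, lac, amm):
    """Combine four simulated responses (all bounds below are illustrative)."""
    return overall_desirability([
        d_maximize(vcd, low=10.0, high=20.0),   # 1e6 cells/ml; low from the text, high assumed
        d_maximize(mab, low=100.0, high=500.0), # arbitrary units; both bounds assumed
        d_minimize(lac, low=0.0, high=30.0),    # mmol/l; high from the text, low assumed
        d_minimize(amm, low=0.0, high=8.0),     # mmol/l; both bounds assumed
    ])


# Hypothetical simulated responses for two factor combinations.
d_inside = medium_desirability(vcd=14.0, mab=300.0, lac=22.0, amm=4.0)
d_too_much_lactate = medium_desirability(vcd=14.0, mab=300.0, lac=35.0, amm=4.0)
print(f"D = {d_inside:.2f} (all constraints met)")
print(f"D = {d_too_much_lactate:.2f} (lactate above 30 mmol/l)")
```

The geometric mean is what makes the constraints act jointly: a factor combination is only desirable if every response stays inside its acceptable range.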

Reduced experimental design: Only 5 of the 20 evaluated factor combinations lay in an area with \(D_{\mathrm{Med-opt}}>0\) and would increase the process understanding. The remaining 15 experiments lay on the boundaries of the experimental space with \(D_{\mathrm{Med-opt}}=0\). Performing these experiments would be time- and cost-intensive without providing sufficient knowledge. mDoE allows the a priori evaluation and reduction of the boundary values, provided that the mechanistic links can be formulated beforehand. The reduced experimental space was selected within the estimated desirability function (Fig. 4, dashed line) between \(52.5\,\hbox {mmol} \, \hbox {l}^{-1}\,\ge \,c_{\mathrm{Glc}}\,\ge 32.5\,\hbox {mmol} \, \hbox {l}^{-1}\). A new D-optimal design with 16 suggested experiments was planned within the reduced experimental space (see Supplementary Table 6).
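The reduction step can be illustrated by taking the smallest box around all simulated factor combinations with a desirability above zero. In this study, the reduced space was selected from the contour plot (Fig. 4); the automated bounding box, the function name, and all numbers in the example are hypothetical and only demonstrate the idea.

```python
def reduced_bounds(points, desirabilities, threshold=0.0):
    """Smallest axis-aligned box containing all points with D > threshold."""
    kept = [p for p, d in zip(points, desirabilities) if d > threshold]
    if not kept:
        raise ValueError("no factor combination exceeds the threshold")
    return [(min(axis), max(axis)) for axis in zip(*kept)]


# Toy data (not from the study): (c_Glc, c_Gln) in mmol/l and simulated D.
points = [(20, 2), (20, 12), (60, 2), (60, 12),
          (40, 7), (35, 5), (50, 9), (45, 6)]
desirabilities = [0.0, 0.0, 0.0, 0.0, 0.5, 0.3, 0.4, 0.6]

glc_range, gln_range = reduced_bounds(points, desirabilities)
print("reduced c_Glc range:", glc_range)
print("reduced c_Gln range:", gln_range)
```

In practice, a denser grid of simulated factor combinations than the design points alone would give a more reliable estimate of the region with \(D>0\).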

Part B: optimization of fed-batch

The optimization of feeding strategies has been widely discussed in the literature [25, 66, 67]. In this study, a bolus feeding strategy was considered as a starting point for the development of advanced control strategies [68, 69]. Broad boundary values were first defined for the optimization of the fed-batch strategy. The (bolus) feeding rate was initially evaluated between 1 and 10 \(\hbox {ml} \, \hbox {d}^{-1}\), corresponding to 3.3–33.3% of the initial working volume. The time points for the start of the feed were defined at 48 h, 72 h, and 96 h, as in the modeling part ("B: Optimization of fed-batch strategy" in section).
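The stated percentages can be checked with a short calculation. The initial working volume of 30 ml is not given in this part; it is inferred here from the statement that 1 \(\hbox {ml} \, \hbox {d}^{-1}\) corresponds to 3.3% and should be treated as an assumption.

```python
# Assumption: 30 ml initial working volume, inferred from 1 ml/d = 3.3 %.
INITIAL_WORKING_VOLUME_ML = 30.0


def feed_fraction_percent(rate_ml_per_day, volume_ml=INITIAL_WORKING_VOLUME_ML):
    """Daily bolus volume as a percentage of the initial working volume."""
    return 100.0 * rate_ml_per_day / volume_ml


for rate in (1, 3, 6, 10):
    print(f"{rate:2d} ml/d -> {feed_fraction_percent(rate):4.1f} % per day")
```

Under this assumption, a feeding rate of 6 \(\hbox {ml} \, \hbox {d}^{-1}\) corresponds to 20% of the initial working volume per day.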

Initial boundary values: The glutamine concentration in the feed was varied between the initial batch concentration (\(6\,\hbox {mmol} \, \hbox {l}^{-1}\)) and the recommended maximal concentration in the available feed (\(38\,\hbox {mmol} \, \hbox {l}^{-1}\)). The glucose concentration in the feed was varied between the batch concentration of \(46\,\hbox {mmol} \, \hbox {l}^{-1}\) and the maximal recommended feed concentration of \(222\,\hbox {mmol} \, \hbox {l}^{-1}\) (\(40\,\hbox {g} \, \hbox {l}^{-1}\)). These values correspond to typically described ranges of the glucose and glutamine concentrations in fed-batch processes of CHO cells using different media. For example, [70] formulated a feed based on 10× DMEM/F12 medium (\(F_{\mathrm{Glc}} = 55\,\hbox {mmol} \, \hbox {l}^{-1}\)) to evaluate different CHO fed-batch strategies. [71] developed a dynamic high-throughput system for the optimization of fed-batch strategies and used a platform feed with \(F_{\mathrm{Glc}} = 278\text{–}444\,\hbox {mmol} \, \hbox {l}^{-1}\). Glutamine was not considered in that study.
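The conversion between mass and molar glucose concentrations used above (\(40\,\hbox {g} \, \hbox {l}^{-1}\) and \(222\,\hbox {mmol} \, \hbox {l}^{-1}\)) can be reproduced as follows. The molar mass of glucose (\(180.16\,\hbox {g} \, \hbox {mol}^{-1}\)) is a standard value that is not stated in the text, and the function names are chosen for this example only.

```python
GLUCOSE_MOLAR_MASS_G_PER_MOL = 180.16  # C6H12O6, standard value (not from the text)


def g_per_l_to_mmol_per_l(conc_g_per_l, molar_mass=GLUCOSE_MOLAR_MASS_G_PER_MOL):
    """Convert a mass concentration (g/l) to a molar concentration (mmol/l)."""
    return conc_g_per_l / molar_mass * 1000.0


def mmol_per_l_to_g_per_l(conc_mmol_per_l, molar_mass=GLUCOSE_MOLAR_MASS_G_PER_MOL):
    """Convert a molar concentration (mmol/l) to a mass concentration (g/l)."""
    return conc_mmol_per_l * molar_mass / 1000.0


print(f"40 g/l glucose = {g_per_l_to_mmol_per_l(40.0):.0f} mmol/l")
for feed in (46.0, 55.0, 222.0, 278.0, 444.0):
    print(f"{feed:5.0f} mmol/l glucose = {mmol_per_l_to_g_per_l(feed):5.1f} g/l")
```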

Constraints for desirability function: Twenty-nine experiments in an I-optimal design were planned and the factor combinations were exported. The maximal viable cell density and the maximal antibody and ammonium concentrations were simulated, and the desirability function was calculated based on the following constraints for the optimization of the feeding strategy: the maximal viable cell density had to reach at least \(20\,\times 10^{6}\,\hbox {cells} \, \hbox {ml}^{-1}\), and the antibody titer was to be maximized. The ammonium concentration was minimized but not limited, due to the longer process time and the higher ammonium concentrations measured during fed-batch cultures ("Part B: optimization of fed-batch" in section). Furthermore, the lactate concentration was not considered during the fed-batch, because the lactate formation during the experiments for the estimation of the model parameters ("Part B: optimization of fed-batch" in section) did not exceed critical concentrations.

Desirability function: The desirability function was calculated and the corresponding plots are shown in Fig. 5.

Fig. 5

Contour plot of desirability function for the optimization of the glucose and glutamine concentration in the feed, the feeding rate, and the start of feeding in a (bolus) fed-batch process (I-optimal design), responses were simulated ("Model-assisted reduction of experimental space" in section), and dashed line represents the reduced boundary values (see "Performed experimental designs" in section)

Multiple 3D plots were generated due to the four-dimensional optimization problem with three numerical factors (\(F_{\mathrm{Glc}}, F_{\mathrm{Gln}}, F_\mathrm{rate}\)) and one categorical factor (feed start). For an average \(F_{\mathrm{Glc}}\) and variable \(F_{\mathrm{Gln}}\) and \(F_\mathrm{rate}\), the desirability function is \(D_{\mathrm{Fed-batch opt.}}>0.5\) for a start of feeding after 48 h (Fig. 5a), with an optimal area within the evaluated factor ranges. This optimal area, with \(0.6 \ge D_{\mathrm{Fed-batch opt.}}\ge 0\), becomes smaller for a start of feeding after 72 h (Fig. 5d) and shrinks to a small band for a start of feeding after 96 h (Fig. 5g), with \(D_{\mathrm{Fed-batch opt.}}=0\) in the upper and lower parts. Experiments within these regions are undesired. A comparable shape is seen for an average \(F_{\mathrm{Gln}}\) and variable \(F_{\mathrm{Glc}}\) and \(F_{\mathrm{rate}}\) for the evaluated feeding start times (Fig. 5b, e, h). No regions with \(D_{\mathrm{Fed-batch opt.}}=0\) were identified for an average \(F_{\mathrm{rate}}\) and variable \(F_{\mathrm{Glc}}\) and \(F_{\mathrm{Gln}}\), where \(0.8 \ge D_{\mathrm{Fed-batch opt.}} \ge 0.3\) (Fig. 5c, f, i). The interactions in a four-dimensional DoE can hardly be known beforehand, and the evaluation of the boundary values based only on experiments can be time-consuming and cost-intensive.

Reduced experimental boundaries: The reduced experimental design is shown in Fig. 5 (dashed line) and was selected based on the desirability function. \(F_{\mathrm{rate}}\) was restricted to \(3\,\hbox {ml} \, \hbox {d}^{-1}\,\le \,F_{\mathrm{rate}}\,\le 6\,\hbox {ml} \, \hbox {d}^{-1}\) based on the band-shaped areas of Fig. 5g, h and the limitation that the feed rate should not exceed \(20\%\) of the initial working volume. In general, no areas with \(D_{\mathrm{Fed-batch opt.}}=0\) were identified for \(F_{\mathrm{Glc}}\) and \(F_{\mathrm{Gln}}\), so that experiments with \(D_{\mathrm{Fed-batch opt.}}> 0\) could be evaluated over the full initially evaluated concentration ranges. In this study, the boundary values of \(F_{\mathrm{Glc}}\) were chosen based on the available feed supplements within \(222\,\hbox {mmol} \, \hbox {l}^{-1}\,\ge \,F_{\mathrm{Glc}}\,\ge 111\,\hbox {mmol} \, \hbox {l}^{-1}\). \(F_{\mathrm{Gln}}\) was varied between the maximal \(F_{\mathrm{Gln}}\) in the evaluated feeds of \(38\,\hbox {mmol} \, \hbox {l}^{-1}\), which reflects the standard feed concentration, and a minimal \(F_{\mathrm{Gln}}\) of \(9\,\hbox {mmol} \, \hbox {l}^{-1}\).

Performed experimental designs

The mDoE concept was used to reduce the boundary values of an experimentally performed DoE using the model predictions. This incorporates prior knowledge based on the mathematical process model. In this part, experiments within the reduced experimental boundary values were performed and compared to the simulated DoE. The list of performed experiments, the simulated and measured responses, the statistical evaluation of the responses, and the response surfaces are summarized in Supplementary Tables 6–13.

Part A: medium optimization

Sixteen parallel shake flask cultivations in a D-optimal design were planned and experimentally performed. The maximal concentrations of antibody and ammonium were implemented as responses. The lactate concentration and the viable cell density were not considered as responses, because the modeled reduction of the boundary values ("Part A: medium optimization" in section) ensured a lactate concentration below \(30\,\hbox {mmol} \, \hbox {l}^{-1}\) and a viable cell density above \(10\,\times 10^{6}\,\hbox {cells} \, \hbox {ml}^{-1}\). In addition, the responses were simulated, both designs (experimentally performed and simulated) were statistically evaluated, and the response surfaces were estimated. Both desirability functions were calculated based on the maximization of the antibody concentration and the minimization of the ammonium concentration, and are shown in Fig. 6.

Fig. 6

Reduced DoEs, simulated (a) and experimentally performed (b), for part A: optimization of medium composition; points are the considered factor combinations

The desirability function of the simulated design (Fig. 6a) recommends optimal starting concentrations in the upper right corner, with high glucose and low glutamine concentrations and \(D_{\mathrm{Med-opt}} = 0.87\). The experimentally performed design (Fig. 6b) recommends the same optimal initial concentrations with a lower \(D_{\mathrm{Med-opt}} = 0.70\). Such differences are typical when comparing simulated results with experimental results that are subject to uncertainty. No further experimental design needs to be performed outside of this area, since the outer experimental space was evaluated beforehand using the process model and the experiments were performed with reduced boundaries. This decreases the experimental effort and simultaneously increases the process understanding.

Part B: optimization of fed-batch

Twenty-nine experiments in two blocks with 15 and 14 parallel fed-batch cultivations were performed for the reduced experimental space for the optimization of the fed-batch strategy. As can be seen in Fig. 7, the responses (maximal \(c_{\mathrm{mAb}}\) and maximal \(c_{\mathrm{Amm}}\)) were either simulated (\(D_{\mathrm{Fed-batch opt.,a}}:\) Fig. 7a) or experimentally determined (\(D_{\mathrm{Fed-batch opt.,b}}:\) Fig. 7b). Both desirability functions were calculated based on the maximization of the antibody concentration and the minimization of the ammonium concentration.

Fig. 7

Reduced DoEs, simulated (a) and experimentally performed (b), for part B: optimization of fed-batch strategy; points are the considered factor combinations

The gradient of the desirability was higher in the simulated than in the experimentally performed design. Both designs recommended the same optimal feeding strategy, with \(D_{\mathrm{Fed-batch opt.,a}} = 0.88\) and \(D_{\mathrm{Fed-batch opt.,b}} = 0.63\), for a start of feeding after \(96\,\hbox {h}\), \(F_{\mathrm{Glc}} = 222\,\hbox {mmol} \, \hbox {l}^{-1}\), \(F_{\mathrm{Gln}} = 9\,\hbox {mmol} \, \hbox {l}^{-1}\) and a feeding rate of \(3\,\hbox {ml} \, \hbox {d}^{-1}\). However, the range of the simulated desirability (0.0–0.9) was wider than that of the experimentally derived desirability (0.2–0.6). These differences are attributed to experimental variability in the 29 experiments, which ran for up to 10 days. Furthermore, the fed-batch model parameters (see "Model extension: fed-batch" in section) are based on only four experiments, and model parameters are, in general, subject to uncertainty, which was not considered in the simulated desirability.

The development and optimization of fed-batch strategies based on DoE requires many experiments (e.g., 29 for the shown example) and involves multidimensional optimization challenges. With mDoE, only four experiments are needed for the model-assisted reduction of the experimental space, with a comparable prediction quality. No iterative re-estimations of the experimental space are necessary, which reduces the time needed for knowledge-based bioprocess development.

Conclusions

In this contribution, a novel concept for the combination of mathematical process models with DoE was applied to a medium optimization and the optimization of a bolus fed-batch strategy. A mathematical process model was estimated based on four experiments for each example, and widely distributed boundary values for a DoE were evaluated using model predictions instead of laboratory experiments. The DoEs within the reduced experimental spaces were experimentally performed and compared to the simulated DoEs. The same optimal conditions were found for both examples (simulated and experimentally performed). No heuristic restrictions with several iterative rounds were necessary, because the mathematical process model incorporates the known factors and interactions and their dynamics in DoE. Furthermore, DoEs are typically based only on endpoints, whereas different responses and endpoints can be tested using the kinetic model [7, 8]. mDoE can be applied if the mechanistic relationships are understood, and it supports meaningful decision-making for process development and optimization using DoE in QbD. Further studies will focus on the model-assisted comparison of experimental designs and the combination of simulated and experimental data to further decrease the experimental effort in bioprocess development.