1 Introduction

Biosurfactants are microbial compounds that exhibit pronounced surface and emulsifying activities. There is a great deal of interest in biosurfactants since they are considered as “green” alternatives to synthetic surfactants. Biosurfactants comprise a wide range of chemical structures, such as glycolipids, lipopeptides, polysaccharide–protein complexes and phospholipids: see Desai & Banat [1] and Abdel-Mawgoud et al [2]. Among the various biosurfactants, lipopeptides are particularly interesting because of their high surface activities and antibiotic potential. The bioactive peptides such as surfactin, fengycin and Turing A, B and C, mycosubtilins and bacillomycins fall under the category of lipopeptide biosurfactants.

The lipopeptide surfactin produced by Bacillus subtilis is the most powerful biosurfactant [3] with potential biotechnological and biomedical applications. Surfactin lipopeptide belongs to a group of cyclic lipoheptapeptides containing beta-hydroxyl fatty acids and D2/L-amino acid residues: Peypoux et al [4], Haddad et al [5] and Tang et al [6]. It possesses various biological activities; anti-microbial, anti-viral, anti-tumor, blood anticoagulant and fibrinolytic activities [7]. The cell growth and the accumulation of metabolic products of a lipopeptide biosurfactant process are strongly influenced by medium compositions such as carbon sources, nitrogen sources, phosphorous sources, growth factors, and inorganic salt concentrations. The potential applications considered for this biosurfactant depend on whether it can be produced economically. In this regard, experimental design provides greater insight in studying the impact of potential variables affecting the process.

Experimental design and response surface methodology (RSM), in general, have been studied for different biosurfactant processes in Abalos et al [8], Al-Araji et al [9], Rispoli et al [10] and Rikalovic et al [11]. The influence of media components such as carbon-, nitrogen-, and potassium- sources, and environmental factors for the growth and production of lipopeptides by Bacillus subtilis has been evaluated experimentally by various researchers (see, for instance, Suwansukho et al 12). Gu et al [13] have employed central composite design (CCD) based RSM to optimize the levels of sucrose substrate, ammonium chloride, ferrous sulphate, and zinc sulphate for the production of a lipopeptide by Bacillus subtilisin shaker flask fermentation. Media optimization of biosurfactant production by Bacillus subtilis has also been studied by Abdel Mawgoud et al [14]. Liu et al [15] have applied central composite rotatable design (CCRD) and RSM to optimize the medium composition for enhanced productivity of C15-surfactin by a Bacillus genus. Mutalik et al [16] employed RSM to find the optimum medium composition for the biosurfactant production by Rhodococcus spp. MTCC 2574. Seghal Kiran et al [17] employed a quadratic model – fitted to experimental data – to optimize the critical control factors involved in the production of lipopeptide biosurfactant by marine Brevibacterium aureum MSA13. In most of the above studies, media optimization is performed either by the analysis of experimental results or RSM results. Very few studies have considered the use of efficient optimization algorithms to explore the design search space for optimal solution of a biosurfactant process. De Lima et al [18] developed an empirical model using bioreactor data, and combined it with an optimization method developed in Maple VIII (release 4) software to optimize the conditions for biosurfactant production by Pseudomonas aeruginosa. Pal et al [19] applied artificial neural network (ANN) coupled with genetic algorithm (GA) for media optimization of biosurfactant production by Rhodococcus erythropolis MTCC 2794. Satya Eswari et al [20] employed artificial neural network-based response surface model (ANN RSM) coupled with non-dominated sorting differential evolution (DE) to optimize the medium composition for Rhamnolipid production by Pseudomonas aeruginosa AT10.

The use of optimization algorithms in conjunction with the response surface models that are formulated based on the data of designed experiments can better explore the design search space for optimal solution of a process. To overcome the limitations of classical optimization techniques, a host of new optimization algorithms that operate in a different way have been developed. In fact, evolutionary optimization techniques such as the GA [2123], simulated annealing [24; 25], ant colony optimization (ACO) [2628], particle swarm optimization [29], and DE [30; 31] are capable of yielding more flexible solutions than the classical optimization techniques. To the best of our knowledge, the RSM coupled with ACO has not been reported so far for biosurfactant processes. Hence, in this work, it is proposed to optimize the lipopeptide biosurfactant process by coupling the CCRD based response surface models with ACO. ACO could be a very powerful and flexible tool well suited for modeling the fermentation process due to an implicit corrective action arising from the training methodology and the associated estimation procedure.

RSM is the most preferred method for fermentation media optimization. The CCRD is the most popular method to obtain data for an RSM. ACO is the most efficient metaheuristic search algorithm used to solve combinatorial optimization problems. ACO is based on the observation that ants can find the optimal path between a food source and their nest exploiting a mix of probabilistic behavior and pheromone depositing. In ACO, a set of artificial ants simulate the behavior of real ants; the artificial ants move on the graph representation of a combinatorial optimization problem and build solutions probabilistically. The probabilities are biased by artificial pheromo nes that ants deposit while building solution. In this work, CCRD was used to design the experiments, and data was generated for lipopeptide production by Bacillus subtilis. The data was used to develop response surface models for lipopeptide and biomass productivities. These models were coupled with ACO algorithm to optimize the media composition for enhancing the lipopeptide activity. The optimized activity was validated experimentally. The results of ACO were compared with those of Nelder–Mead optimization method, and used to generate conclusions.

2 Materials and methods

2.1 Microorganism and culture conditions

Bacillus subtilis, a bacterial strain as lyophilized culture (2423) from IMTECH-MTCC, was activated in 5 ml of nutrient broth under laminar air flow. The nutrient broth used was beef extract (1 g/l), yeast extract (2 g/l), peptone (5 g/l) and sodium chloride (5 g/l). The broth was incubated in a rotary shaker run at 190 rpm at 37C.

The inocula were prepared as follows: The pure microbial culture, i.e. Bacillus subtilis MTCC 2423 from the broth was grown on nutrient agar slants for 24–48 h. The agar slants were sub-cultured for every two weeks. From the sub-culture, one isolated colony was dispensed in nutrient broth at room temperature (30 ±2C) and kept in rotary shaker at 200 rpm for 16 h. This was used as an inoculum at the concentration of 10% v/v. For biosurfactant production, Bacillus subtilisfrom the inocula was grown in 250 ml flask with 100 ml of Minimal medium at the same conditions for 16 h. The concentration of the Minimal medium was glucose (2.5 g/l), monosodium glutamate (1 g/l), yeast extract (0.3 g/l), MgSO4⋅7H2O (0.1 g/l), K2HPO4 (0.1 g/l), and KCl (0.05 g/l); the medium was sterilized at 121C for 20 min. Further 50 ml of culture from the inoculum was grown in a 1,000 ml conical flask with 300 ml of Minimal medium for 96 h on a rotary shaker at 160 rpm and 30C. This experiment is performed to observe the productivity of biosurfactant byMinimal medium and further to design the experiments for response surface analysis.

2.2 Analytical methods

Biomass

The biomass was determined from the cells after centrifugation of the culture broth at 6,700g (10,000 rpm), 4C for 10 min. The dry cell weight (DCW) was obtained from the cell pellets by washing twice with distilled water and drying in hot air oven at 105C for 24 h.

Surface activity measurement

Culture samples were centrifuged at 6,700g (10,000 rpm) for 20 min for cell removal, and the supernatant was subjected to surface activity measurements. Surface tension (ST) and interfacial tension (IT) were determined with a Kruss Tensiometer: this was performed at room temperature using the ring method. The instrument was calibrated by first measuring the surface tension of pure water. The measurement was repeated at least three times, and the average was reported as the surface tension of the sample.

Extraction of crude biosurfactant

The crude biosurfactant was isolated from the cell-free broth of culture kept for 96 h. The bacterial cells were removed from surfactant-containing culture broth by centrifugation at 6,700g (10,000 rpm), 4C for 20 min. The supernatant was kept overnight at 4C and precipitated by adding concentrated HCl to achieve a final pH of 2.0. Gray white pellets (of lipids and proteins) formed by precipitation were collected by centrifugation at 6,700g (10,000 rpm), 4C for 20 min. The pellets were lyophilized and weighed for quantification. For the extraction of biosurfactant compounds, 50 ml of chloroform–methanol mixture (2:1 v/v) was added to 500 mg of the dry product and incubated in a rotatory shaker at 250 rpm, 30C (±0.5C) for 15 min. The extract was evaporated to dryness and weighed for quantification. Assays were carried out in triplicates.

3 Design of experiments and data generation

In this study, CCRD is used to design the experiments for lipopeptide biosurfactant production. The CCRD is the most popular class of designs used for fitting second-order models. The total number of tests required for CCRD is 2k−1+2k + n 0, which includes 2k factorial points with its origin at the center, 2k points fixed axially at a distance β (β = 2k/4) from the center to generate the quadratic terms, and replicate tests at the center (n 0); where k is the number of independent variables. A design should include enough replications (often at the center point) to provide an independent estimate of the experimental error allowing it to be tested for lack of fit of the model. For five variables, the recommended number of tests at the center is six. Hence the total number of tests required for five independent variables is 24 + (2 × 5) + 6 = 32. For statistical calculation, the experimental variables x i has been coded as X i as per the transformation equation:

$$ X_{i} =\frac{x_{i} -x_{0} }{\Delta x_{i} }. $$
(1)

Here X i is the dimensionless coded value of the i th independent variable, x i is the uncoded value of the i th independent variable, x 0 is the value of x i at the center point, and Δx i is the step change value of the real variable x i .

In this study, the five independent process variables chosen were Glucose (x 1), Monosodium glutamate (x 2), Yeast extract (x 3), MgSO4⋅7H2O (x 4), and K 2 HPO 4(x 5). Five levels (±β, ± 1, 0: where β=24/4= 2) and six replicates at the central points were used to design the experiments. The levels of lowest, low, center, high and highest for the design variables in g/l were specified as x 1: (0.5, 1, 1.5, 2, 2.5), x 2:(1, 2, 3, 4, 5), x 3:(0.1, 0.2, 0.3, 0.4, 0.5), x 4:(0.1, 0.2, 0.3, 0.4, 0.5), and x 5:(0.1, 0.2, 0.3, 0.4, 0.5), respectively.

The response variables measured were the biomass (Y 1) and lipopeptide concentrations (Y2): note that Yi refers to (dimensional) concentration in g/l of i th response variable. Six replicates at the center of the design were used to estimate the sum of squares error. Experiments were randomized in order to maximize the effects of unexplained variability in the observed responses due to extraneous factors. The experiments were conducted according to the CCRD design given in table 1.

Table 1 Experimental design matrix.

4 Modeling and optimization

4.1 Response surface methodology (RSM)

The first step in RSM is to find a suitable approximation for the true functional relationship between the response (Y in g/l) and the set of independent variables. An important assumption is that the independent variables are continuous and controllable by experiments with negligible errors. Curvature is present in our system, and hence the RSM we use is a polynomial of high degree (the second-order model):

$$ Y=\beta_{0} +\sum\limits_{i=1}^{k} {\beta_{i} } X_{i} +\sum\limits_{i=1}^{k} {\beta_{ii} } {X_{i}^{2}} +\beta_{ij} X_{i} X_{j} +\varepsilon , $$
(2)

where β i i represents the quadratic effect of the i th factor and β i j represents the cross product effect, or interaction effect, between the i th and j th factors.

4.2 Ant colony optimization (ACO)

ACO introduced in Dorigo et al [32] is one of the most recent techniques to solve optimization problems. The ACO mimics the way real ants find the shortest route between a food source and their nest exploiting a mix of probabilistic behavior and pheromone depositing. The ants communicate with one another by means of pheromone trails and exchange information about which path should be followed. This autocatalytic and collective behavior of ants results in the establishment of the shortest route from the nest to the food source and back. This pheromone-mediated intelligent foraging behavior of ants is exploited by means of ACO algorithm to solve a number of optimization problems [33].

In this work, the ACO algorithm with its global search features is used to optimize the media compositions (g/l): glucose (x 1), monosodium glutamate (x 2), yeast extract (x 3), MgSO 4⋅7H 2O (x 4), K 2 HPO 4(x 5) involved in the production of lipopeptide surfactin by Bacillus subtilis. An objective function is defined based on the actual process measurements and the model predictions as given by

$$ J=f(\theta )=\sum\limits_{i=1}^{l} {\left( {Y_{i} -\;\hat{{Y}}_{i} } \right)^{2}} . $$
(3)

where J is the cost function, 𝜃 is the vector of parameters, l is the number of observations, Y i is the measured value of the i th variable, and \(\hat {{Y}}_{i}\) is the corresponding predicted value. Iterative convergence of this equation leads to minimization of the value of J thus providing the optimum parameter values.

The ACO-IM problem can be stated as

$$ f(\theta ):D\to R, $$
(4)

where 𝜃 is p-dimensional parameter vector, R is the space of real numbers and D is the search space of 𝜃. The objective is to find the parameter vector 𝜃 that minimizes f(𝜃). D is considered to be a hyper parallelepiped

$$ D=\left\{ {\theta_{i} \left|\theta_{i}^{-} \le \theta_{i} \le \theta_{i}^{+}\right.} \right\};\quad i=1,......,p, $$
(5)

where \(\theta _{i}^{-}\) and \(\theta _{i}^{+} \) denote the lower and upper levels of parameter 𝜃 i . The search space D is first divided in the interval \(\left [ {\theta _{i}^{-},\theta _{i}^{+} } \right ]\) for each parameter 𝜃 i into a number of strata, m i . If each stratum be represented by the value at the middle of the stratum, then there will be M = m 1,..., m p permutations or possible pathways through the search space of input parameters. Let 𝜃 i j (i=1...p and j=1... m i ) be the stratum j of parameter 𝜃 i in the search space D. The parameters can be initially specified such that they uniformly cover the whole parameter space. Use of more strata can speed up the convergence of the optimization problem but the increase of strata would increase the computational burden.

Let there be M ants representing M strata. The possible number of pathways, N, that the ants can travel through the parameter space p is given by N = M p. For instance, each ant follows one of the Npathways from the pathway structure list shown in the left side of figure 1. Accordingly, each ant will follow any one of the N pathways, e.g., pathway 29 (1,2,1,1,2), or pathway 99 (2,1,2,3,3), as shown in figure 1. The ants perform the tasks such as selection of pathways that they pass, remembering the parameter strata along the pathways, passing these parameter values to the model of the process, evaluating the value of objective function for each path way, and updating the pheromone based on the objective function values.

Figure 1
figure 1

Graphical representation of discretized parameter space and the pathway structure.

The computations involved in ACO implementation are as follows. The trail intensity also called the pheromone deposit, τ u on each pathway u (u=1...N) is computed based on the cost function values of J. The cost function J of each pathway is represented as J u and the minimum of the cost functions, J m i n is found. The mean, μ J and the standard deviation, σ J of the objective functions are evaluated. The ratio of standard deviation to mean represents the variability in the function values and the critical cost Cwhich is evaluated as

$$ C=J_{\min } +C_{c} \frac{\sigma_{J} }{\mu_{J} }, $$
(6)

where C c is a constant.

The critical cost, C is used to evaluate the trail intensity, τ u on each path according to the formula

$$ \tau_{u} =\left\{ {\begin{array}{ll} \exp \left[ {4.6\left( {\frac{J_{u} -C}{J_{\min } -C}} \right)} \right],&J_{u} \le C \\ 0,&J_{u} >C \end{array}} \right. $$
(7)

According to this equation, pathways with the cost function larger than C receive no trail intensity, while those below C receive larger trail intensity values. It can be observed that any single stratum ϕ i j in Dmay be the crossroad of many ant pathways. The trail share Γ u calculated of each stratum 𝜃 i j from each pathway can be summed to yield ϕ i j as \(\phi _{ij} =\sum \limits _{u=1}^{N} {\tau _{u} }\; u\in \) crossing pathway.

Here, the crossing pathways are those that cross the stratum ϕ i j . The mean, μ i j and the standard deviation, σ i j of the cost functions are used to compute the scores, S i j according to the formula:

$$ S_{ij} =\frac{(\phi_{ij} )^{A}(\sigma_{ij} )^{B}}{\sum\nolimits_{i} {\sum\nolimits_{j} } (\phi_{ij} )^{A}(\sigma_{ij} )^{B}}, $$
(8)

where

$$ B=C_{S} \frac{\sigma_{ij} }{\mu_{ij} }, $$
(9)

where A and C S are constants. Eqs. (8) and (9) in ACO structure represent trail intensity and transition probability [34]. The scores, S i j are used to eliminate the strata with lowest scores while retaining the highest score. The interval of highest retained score results in narrowing the ranges for the parameters. This strategy is subsequently reinitialized with updated parameter ranges and the procedure is repeated until the convergence in objective defined by Eq. (3) is achieved. The constant C c in Eq. (6) signifies the variability of the trail received by each stratum.

5 Results and discussion

5.1 Experimental observations

The lower and upper ranges considered for the media compositions in CCRD design were: glucose (x1): 0.5–2.5 g/l, monosodium glutamate(x2): 1–5 g/l, yeast extract (x3): 0.1–0.5 g/l, MgSO 4⋅7H 2O (x4): 0.1–0.5 g/l, and K 2 HPO 4 (x5): 0.1–0.5 g/l. Response data was generated for biomass and lipopeptide productivites. The design matrix of 32 experiments with the acutal and coded compositions, and the response data are shown in table 1. The carbon source and nitrogen source were observed to influence the biosurfactant production. The experimental results have shown that the media composition with glucose (x1): 2 g/l, monosodium glutamate (x2): 4 g/l, yeast extract (x3): 0.2 g/l, MgSO 4⋅7H 2O (x4): 0.4 g/l and K 2 HPO 4 (x5): 0.4 g/l yielded maximum lipopetide productivity. Increase of glucose or monosodium glutamate is observed to increase the biosurfactant productivity. Studies by various researchers have shown that the presence of glucose in the production medium increased the biosurfactant production [35; 36]. The presence of nitrogen can play an important role in the regulation of biosurfactant synthesis, and it was found that nitrogen plays an important role in the production of surface-active compounds by microbes [1; 37]. Decreasing glucose or monosodium glutamate concentration did not show any improvement in biosurfactant production. Increase of yeast extract has no influence on the production of biosurfactant. Decrease of MgSO 4⋅7H 2O and K 2 HPO 4 also did not show any improvement in biosurfactant production.

5.2 Response surface model

The aim of developing a response surface model based on the experimental design data is to employ these models to optimize the compositions of McKeen medium so as to maximize lipopeptide production. The data generated from the designed experiments for lipopeptide surfactin production by Bacillus subtilis was used to develop response surface models (RSM). A second order full regression model with factor interactions was considered to build relations between the factors and responses. The form of the model is

$$\begin{array}{@{}rcl@{}} Y\!\!&=&\!\!\beta_{0} +\beta_{1} X_{1} +\beta_{2} X_{2} +\beta_{3} X_{3} \,+\,\beta_{4} X_{4} \,+\,\beta_{5} X_{5} +\beta_{12} X_{12} \\&&\!\!+~\beta_{13} X_{13}+\beta_{14} X_{14} +\beta_{24} X_{24} +\beta_{34} X_{34} +\beta_{15} X_{15} \\ &&\!\!+~\beta_{25} X_{25} +\beta_{35} X_{35}+\beta_{45} X_{45} +\beta_{11} {X_{1}^{2}} +\beta_{22} {X_{2}^{2}}\\ &&\!\!+~\beta_{33} {X_{3}^{2}} +\beta_{44} {X_{4}^{2}} +\beta_{55} {X_{5}^{2}} \end{array} $$
(10)

The experimental data in table 1 was used to determine the regression coefficients of the second-order multiple regression models. The coefficients of the models representing the biomass and lipopeptide concentrations were determined by using the method of least squares. The empirical models identified for biomas and lipopeptide productivities are given as follows:

$$\begin{array}{@{}rcl@{}} Y_{bio} \!\!&=&\!\!12.997\,+\,0.260X_{1} \!+0.158X_{2} \!+0.975X_{3} \!-0.025X_{4} \\&&\!\!-~0.192X_{5} -0.063X_{12}+0.163X_{13} -0.138X_{23} \\&&\!\!-~0.113X_{14} +0.038X_{24} +0.113X_{34} -0.688X_{15} \\&&\!\!-~0.138X_{25} -0.013X_{35} -0.038X_{45} -3.051{X_{1}^{2}} \\ &&\!\!-~ 2.92{X_{2}^{2}} -1.771{X_{3}^{2}}-1.396{X_{4}^{2}} -0.321{X_{5}^{2}} \end{array} $$
(11)
$$\begin{array}{@{}rcl@{}} Y_{lipo} \!\!&=&\!\!8.802\,-\,0.023X_{1} \!-0.717X_{2} \,+\,1.217X_{3} \,+\,1.033X_{4} \\&&\!\!-~0.667X_{5} -0.4X_{12}-0.25X_{13} +0.625X_{23} \\&&\!\!-~ 0.65X_{24} -1.25X_{34} -1.625X_{15} -0.475X_{25} \\ &&\!\!+~0.00001X_{35} +0.001X_{45} -0.640{X_{1}^{2}} -0.877{X_{2}^{2}}\\ &&\!\!-~0.277{X_{3}^{2}}+0.973{X_{4}^{2}} +1.198{X_{5}^{2}} \end{array} $$
(12)

The validity of the second order regression models was studied by F test and the significance of the regressed model coefficients was evaluated by student t test as described below.

5.3 Analysis of variance and test of significance

The validity of the full regression models was studied using analysis of variance (ANOVA). Table 2(a) and 2(b) showed the ANOVA results of quadratic response-surface model fitting for both response Y b i o (Biomass concentration) and Y l i p o (Lipopeptide concentration), respectively. The higher values of calculated Fthan the table values and corresponding lower P values indicate less significance of the full order models. The R 2 values of these models were found to be 0.901 and 0.883. The significance of each of the coefficients of the full order regression model was determined by using student t-test. The student tvalues and the corresponding P values are given in columns 4 and 5 of table 3 and table 4, respectively. A large magnitude for t-value, and small P-value, indicates the higher significance of the corresponding coefficient. The elimination of insignificant coefficients led to the following reduced models:

$$\begin{array}{@{}rcl@{}} Y_{bio} \!\!&=&\!\!12.997+0.260X_{1} \!+0.158X_{2} \!+0.975X_{3} \,-\,0.192X_{5} \\&&\!\!-~0.063X_{12}+0.163X_{13} -0.138X_{23} - 0.113X_{14} \\&&\!\!+~0.038X_{24} +0.113X_{34}-0.688X_{15} -0.138X_{25} \\ &&\!\!-~0.038X_{45} -3.051{X_{1}^{2}} -2.92{X_{2}^{2}}-1.771{X_{3}^{2}}\\ &&\!\!-~ 1.396{X_{4}^{2}} -0.321{X_{5}^{2}} \end{array} $$
(13)
$$\begin{array}{@{}rcl@{}} Y_{lipo} \!\!&=&\!\!8.802-0.717X_{2} +1.217X_{3} +1.033X_{4} \,-\,0.667X_{5} \\&&\!\!-~0.4X_{12} -0.25X_{13}+0.625X_{23} -1.25X_{34} \\&&\!\!-~ 1.625X_{15} -0.475X_{25} +0.00001X_{35}+0.640{X_{1}^{2}}\\ &&\!\!-~0.877{X_{2}^{2}} -0.277{X_{3}^{2}} +0.973{X_{4}^{2}} +1.198{X_{5}^{2}}\\ \end{array} $$
(14)
Table 2 ANOVA results of full order regression models: (a) Biomass; (b) Lipopeptide.
Table 3 Test of significance for the coefficients of Biomass regression model.
Table 4 Test of significance for the coefficients of lipopeptide regression model.

The resulting reduced models were statistically validated and tested for their predictive ability by comparing with the experimental results. The ANOVA results of the reduced models are shown in table 5(a) and 5(b), respectively. The lower values of calculated F compared to the table values and the corresponding higher P values indicate the adequacy of the reduced models. The R 2 values of reduced models representing the biomass and lipopeptide were calculated as 0.912 and 0.904. The experimental and model prediction results of biomass and lipopeptide concentrations were shown in figures 2 and 3, respectively. These results showed the predictive ability of the fitted regression models for both the biomass and lipopeptide concentrations.

Figure 2
figure 2

Predicted and experimental values of biomass (Y bio) concentration.

Figure 3
figure 3

Predicted and experimental values of lipopeptide (Y lipo) concentration.

Table 5 ANOVA results of reduced order regression models: (a) Biomass; (b) Lipopeptide.

Sensitivity analysis was performed to find the media components that significantly influence the lipopeptide productivity. The variables that most influence the lipopeptide productivity were found by using the normalized sensitivity studies. The normalized sensitivities are evaluated by using the relation: \({\left ({\frac {\partial Y}{\partial X_{i} }} \right )} \left /{\vphantom {\left ({\frac {\partial Y}{\partial X_{i} }} \right )}}\right . {\left ({\frac {Y}{X_{i} }} \right )}\). Here Y refers to the lipopeptide concentration and X i refers to the respective independent variable, i.e., the (non-dimensional) media composition. The absolute normalized sensitivity values of lipopeptide productivity with respect to glucose (X 1), monosodium glutamate (X 2), yeast extract (X 3), MgSO 4⋅7H 2O (X 4), and K 2 HPO 4(X 5) were computed as 2.026, 2.283, 0.659, 3.110, and 3.795, respectively. This sensitivity analysis shows that the variables X 4 and X 5 exhibit greater influnce on the productivity of lipopetide. The contour plots represent the effect of the significant variables and their interaction in the response variable. The response surface counter plot drawn for X5 vs. X4 on lipopetide productivity is given in figure 4.

Figure 4
figure 4

Response surface counter plot of X5 vs. X4 on lipopetide concentration.

Figure 5
figure 5

Convergence in objective function of ACO with ants=20, strata=3, and estimated parameters=5.

5.4 Optimization of lipopeptide production

The reduced order response surface model detailed in the previous section was coupled with ACO to optimize the compositions of fermentation process medium components (Glucose, Monosodium glutamate, Yeast extract, MgSO4⋅7H2O, and K2HPO4) so as to maximize the lipopeptide productivity. The optimization problem is stated as

$$\text{Maximize}:Y_{lipo} (\mathrm{x}_{1} ,\;\mathrm{x}_{2} ,\mathrm{x}_{3} ,\mathrm{x}_{4} ,\mathrm{x}_{5} ) $$
(15)

within the ranges of medium composition (in g/l):

$$ {\begin{array}{l} {0.5\le \mathrm{x}_{1} \le 2.5} \\ {{\kern7pt}1\le \mathrm{x}_{2} \le 5} \\ {0.1 \le \mathrm{x}_{3} \le 0.5} \\ {0.1\le \mathrm{x}_{4} \le 0.5} \\ {0.1 \le \mathrm{x}_{5}\le 0.5} \end{array} } $$
(16)

The ACO was designed and coupled with the reduced order empirical models to solve the optimization problem. The cost function Jin Eq. (3) was defined based on the experimental and model predicted lipopeptide concentrations. The dimension of the vector (∝) representing the medium compositions was set as 5. The number of strata (M) was assigned as 3. The number of ant sets chosen is 20. The constants involved in Eqs. (6), (8) and (9) were appropriately tuned as C c = 0.5, A = 1.0 and C s = 0.3. The ACO algorithm was implemented iteratively to provide the updated values of compositions until convergence in the cost function was achieved. The number of pathways generated for ants’ travel was 243. The convergence in optimal solution defining the lipopeptide productivity evolved for each ant set by ACO was shown in figure 5. The ACO was executed by writing the program in C language. The optimized medium composition by ACO was found as x1 = 1.098 g/l, x2 = 4.01 g/l, x3= 0.426 g/l, x4 = 0.431 g/l, and x5= 0.219 g/l. The maximum lipopeptide concentration obtained due to the optimized medium composition was 1.501 g/l (y lipo) and the corresponding biomass concentration was obtained as 4.291 g/l (y bio).

A classical Nelder–Mead optimization (NMO) method [38] was also employed to compare with ACO. The NMO has been successfully applied for modeling and optimization of many chemical and biological problems [39; 40]. The tuning parameters involved in NMO were the reflection, contraction and expansion coefficients, which were set as 1.0, 2.0 and 0.5, respectively. The optimum medium components found by NMO are x1 = 2 g/l, x2 = 3.8 g/l, x3 = 0.2 g/L, x4 = 0.4 g/l and x5 = 0.4 g/l with the maximum lipopeptide concentration of 1.387 g/L and the corresponding biomass concentration was obtained as 3.99 g/l. The NMO predicted optimum lipopeptide concentration 1.387 g/l was found to be lower than the 1.498 g/l that was predicted by ACO. The results show that optimizing the medium composition by ACO can identify the maximum lipopeptide productivity more accurately. The experimentally validated lipopeptide concentration based on ACO optimized media composition was found to be 1.498 g/L. These results thus exhibit the effectiveness of ACO in optimizing the media composition for lipopeptide production.