Introduction

Ethanol produced from lignocellulose is now recognized as an alternative liquid fuel, and large-scale industrial production is gradually becoming a reality [1, 2]. Before glucose can be fermented into ethanol, the cellulose contained in the lignocellulosic matrix must first be hydrolysed to release it. Alongside chemical hydrolysis, enzymatic hydrolysis is a promising method for decomposing cellulose into small molecules and glucose. In addition to the recently discovered cellulose-cleaving polysaccharide monooxygenases [3], enzymatic hydrolysis of cellulose involves a minimum of three groups of enzymes: endo-1,4-β-D-glucanase (EG) (EC 3.2.1.4), exo-1,4-β-D-glucanase (or cellobiohydrolase, CBH) (EC 3.2.1.91) and β-glucosidase (BG) (EC 3.2.1.21). EG catalyses hydrolysis of cellulose via random endo-attack on the β-1,4 bonds in the cellulose backbone. CBH-type enzymes catalyse hydrolysis of cellulose from the chain ends and release mainly cellobiose. Finally, BG hydrolyses cellobiose into glucose [4].

The mechanism, and notably the kinetics, of converting an insoluble polymeric cellulosic substrate into soluble sugars by the action of cellulase enzymes are not yet completely understood, owing to the complexity of the phenomena involved (adsorption, desorption, enzyme deactivation, accessible area, crystallinity, degree of polymerization, lignin content, enzyme synergism, etc.), all of which affect the reaction kinetics. Nevertheless, a number of mathematical models for enzymatic hydrolysis have been proposed in the literature [5–9].

However, none of the models proposed in the past were rigorously validated. This lack of rigorous validation partly calls into question the credibility of the models for engineering applications. Among the factors that can explain the variability are different enzyme sources, substrate type, conceptual framework of the model (model structure), and experimental data quality and quantity [10]. The present study was undertaken to validate the conceptual framework (model structure) of the most recent model, proposed by the U.S. National Renewable Energy Laboratory and published by Kadam et al. [7], through focused, dedicated and rigorous experimental testing (improved data quality and quantity) in a systematic and iterative manner. The present study was also designed to assess whether the model structure needs to be expanded to improve the applicability range of the model. Notably, transglycosylation reactions, which convert glucose back to oligosaccharides under high glucose and cellulose concentrations, were investigated and introduced into the modelling.

The mathematical model proposed by Kadam et al. [7] describes the conversion of cellulose to cellobiose (r 1), the conversion of cellulose to glucose (r 2) and the conversion of cellobiose to glucose (r 3) (illustrated in the bold dashed-dot square in Fig. 1). The mathematical model was based on a number of assumptions:

Fig. 1

Reaction scheme for modelling cellulose hydrolysis. Bold dashed-dot square illustrates the original Kadam model. Solid arrows represent the reaction routes and the dashed arrows show the inhibition of sugars on the reactions. Modified from Kadam et al. [7]

  • Enzyme adsorption follows a Langmuir-type isotherm with first-order reactions occurring on the cellulose surface.

  • Amorphous and crystalline cellulose are lumped and considered uniform in terms of susceptibility to enzymatic attack.

  • Enzyme activity remains constant throughout the reaction.

  • Conversion of cellobiose to glucose occurs in solution and follows the classical Michaelis-Menten kinetics.

  • Cellulase (mainly a mixture of EG and CBH) and BG activities on cellulose breakdown are considered separately, with competitive inhibition by simple sugars.

The hydrolysis pathways and the inhibition effects by intermediate and final products (cellobiose, glucose and xylose) can be classified into r 1, r 2 and r 3. These can be expressed as Eqs. (1) to (8):

  • Enzyme Adsorption

    $$ \mathrm{Langmuir}\ \mathrm{isotherm}\kern0.5em {E}_{i\mathrm{B}}=\frac{E_{i \max }{K}_{iad}{E}_{i\mathrm{F}}S}{1+{K}_{iad}{E}_{i\mathrm{F}}} $$
    (1)
  • Cellulose-to-Cellobiose Reaction with Competitive Glucose, Cellobiose and Xylose Inhibition

    $$ {r}_1=\frac{k_{1\mathrm{r}}{E}_{1\mathrm{B}}{R}_{\mathrm{S}}S}{1+\frac{G_2}{K_{1\mathrm{IG}2}}+\frac{G}{K_{1\mathrm{IG}}}+\frac{X}{K_{1\mathrm{IX}}}} $$
    (2)
  • Cellulose-to-Glucose Reaction with Competitive Glucose, Cellobiose and Xylose Inhibition

    $$ {r}_2=\frac{k_{2\mathrm{r}}\left({E}_{1\mathrm{B}}+{E}_{2\mathrm{B}}\right){R}_{\mathrm{S}}S}{1+\frac{G_2}{K_{2\mathrm{IG}2}}+\frac{G}{K_{2\mathrm{IG}}}+\frac{X}{K_{2\mathrm{IX}}}} $$
    (3)
  • Cellobiose-to-Glucose Reaction with Competitive Glucose and Xylose Inhibition

    $$ {r}_3=\frac{k_{3r}{E}_{2F}{G}_2}{K_{3M}\left(1+\frac{G}{K_{3 IG}}+\frac{X}{K_{3 IX}}\right)+{G}_2} $$
    (4)
  • Mass Balances

    $$ \begin{array}{cc}\hfill \mathrm{Cellulose}:\hfill & \hfill \frac{\mathrm{d}S}{\mathrm{d}t}=-{r}_1-{r}_2\hfill \end{array} $$
    (5)
    $$ \begin{array}{cc}\hfill \mathrm{Cellobiose}:\hfill & \hfill \frac{\mathrm{d}{G}_2}{\mathrm{d}t}=1.056{r}_1-{r}_3\hfill \end{array} $$
    (6)
    $$ \begin{array}{cc}\hfill \mathrm{Glucose}:\hfill & \hfill \frac{\mathrm{d}G}{\mathrm{d}t}=1.111{r}_2+1.053{r}_3\hfill \end{array} $$
    (7)
    $$ \begin{array}{cc}\hfill \mathrm{Enzyme}:\hfill & \hfill {E}_{i\mathrm{T}}={E}_{i\mathrm{F}}+{E}_{i\mathrm{B}}\hfill \end{array} $$
    (8)

This model [7] is based on the assumption that enzyme activity remains constant. Factors decreasing the reaction rate are as follows: (1) product inhibition, (2) inactivation of enzymes (BG) after the adsorption to the substrate and (3) change of substrate reactivity (R S). Substrate reactivity is derived from the secondary hydrolysis rate of the residual substrate at any given time [7, 11, 12] and is expressed as:

$$ {R}_{\mathrm{S}}=\alpha \frac{S}{S_0} $$
(9)

S 0 is the initial substrate concentration and S is the substrate concentration at a given time (grams per liter). α is a dimensionless constant derived from experimental data, e.g. the relation between the secondary initial hydrolysis rate and S/S 0. The introduction of R S in the model is based on the observation/assumption that the substrate may become less susceptible to enzymatic hydrolysis over time. Since the reasons for decreasing substrate reactivity are complicated (change of the substrate structure like crystal structure, degree of polymerization, pore size distribution, etc.), in practice, these properties are difficult to evaluate. Therefore, R S represents an empirical factor for the correction and consideration of the phenomena mentioned above in the reaction rate equations.
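As an illustration only, Eqs. (1)–(9) can be assembled into a small simulation. The sketch below is written in Python rather than the MATLAB used in this study, and every parameter value in it is a placeholder chosen for numerical convenience, not a value fitted here or reported in [7]. It further assumes concentrations in grams per liter, time in hours, and a fixed-step fourth-order Runge-Kutta integrator; the function names are hypothetical.

```python
import math

# Illustrative placeholder parameters; NOT the values fitted in this study or in [7].
P = {
    "k1r": 2.0, "k2r": 0.5, "k3r": 50.0,          # rate constants
    "K1IG2": 2.0, "K1IG": 5.0, "K1IX": 10.0,      # inhibition constants for r1 (g/L)
    "K2IG2": 20.0, "K2IG": 5.0, "K2IX": 10.0,     # inhibition constants for r2 (g/L)
    "K3M": 5.0, "K3IG": 2.0, "K3IX": 50.0,        # constants for r3 (g/L)
    "E1max": 0.06, "K1ad": 0.4,                   # Langmuir constants, cellulase
    "E2max": 0.01, "K2ad": 0.1,                   # Langmuir constants, BG
    "alpha": 1.0,                                 # substrate reactivity constant
}

def bound_enzyme(E_T, E_max, K_ad, S):
    """Solve Eqs. (1) and (8) together for the bound enzyme E_B."""
    # Substituting Eq. (1) into Eq. (8) gives a quadratic in the free enzyme E_F:
    # K_ad*E_F^2 + (1 + K_ad*E_max*S - K_ad*E_T)*E_F - E_T = 0
    b = 1.0 + K_ad * E_max * S - K_ad * E_T
    E_F = (-b + math.sqrt(b * b + 4.0 * K_ad * E_T)) / (2.0 * K_ad)
    return E_T - E_F

def derivatives(state, E1T, E2T, S0, X, p=P):
    """Right-hand sides of the mass balances, Eqs. (5)-(7)."""
    S, G2, G = state
    E1B = bound_enzyme(E1T, p["E1max"], p["K1ad"], S)
    E2B = bound_enzyme(E2T, p["E2max"], p["K2ad"], S)
    E2F = E2T - E2B
    Rs = p["alpha"] * S / S0                                        # Eq. (9)
    r1 = p["k1r"] * E1B * Rs * S / (
        1.0 + G2 / p["K1IG2"] + G / p["K1IG"] + X / p["K1IX"])      # Eq. (2)
    r2 = p["k2r"] * (E1B + E2B) * Rs * S / (
        1.0 + G2 / p["K2IG2"] + G / p["K2IG"] + X / p["K2IX"])      # Eq. (3)
    r3 = p["k3r"] * E2F * G2 / (
        p["K3M"] * (1.0 + G / p["K3IG"] + X / p["K3IX"]) + G2)      # Eq. (4)
    return (-r1 - r2, 1.056 * r1 - r3, 1.111 * r2 + 1.053 * r3)

def simulate(S0, E1T, E2T, X=0.0, hours=72.0, dt=0.01):
    """Integrate cellulose, cellobiose and glucose with fixed-step RK4."""
    state = (S0, 0.0, 0.0)

    def f(y):
        return derivatives(y, E1T, E2T, S0, X)

    for _ in range(int(round(hours / dt))):
        k1 = f(state)
        k2 = f(tuple(y + 0.5 * dt * k for y, k in zip(state, k1)))
        k3 = f(tuple(y + 0.5 * dt * k for y, k in zip(state, k2)))
        k4 = f(tuple(y + dt * k for y, k in zip(state, k3)))
        state = tuple(
            y + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for y, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state

# 100 g/L cellulose with hypothetical enzyme doses of 1.0 and 0.2 g/L
S_end, G2_end, G_end = simulate(S0=100.0, E1T=1.0, E2T=0.2)
```

A useful sanity check on any such implementation is that the glucose-equivalent total, 1.111 S + 1.053 G 2 + G, stays nearly constant over the simulation, since Eqs. (5)–(7) only redistribute mass between the three pools.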

The model proposed by Kadam et al. [7] does not include any consideration of reverse reactions, i.e. transglycosylation reactions. To the best of our knowledge, transglycosylation was not considered in previous studies. However, this reaction, in which β-glucosidase transfers glucose back to cello-oligosaccharides at high glucose or cellobiose concentrations, has been reported or described as a possible event in different studies [13–19]. It might be a relevant mechanism to consider in kinetic models because, in the bioethanol industry, the hydrolysis process is typically operated above 15 % (w/w) dry matter of biomass to obtain a higher glucose concentration and thereby a higher distillation efficiency [20]. Already 30 years ago, Gusakov et al. [16] proposed a very detailed, relatively complex reaction scheme for transglycosylation. In the present research, two simplified reactions were instead proposed and tested. The first reaction is for trisaccharide production, 3G ↔ G 3 + 2H2O. The reaction rate can be expressed as:

$$ {r}_{\mathrm{tri}}={r}_{\mathrm{tri}+}-{r}_{\mathrm{tri}-} $$
(10)

Where

$$ {r}_{\mathrm{tri}+}={k}_{\mathrm{G}3+}\left[\frac{1}{1+{e}^{\left({G}_{\mathrm{cr},\mathrm{tri}}-G\right)}}\right]G $$
(11)
$$ {r}_{\mathrm{tri}-}={k}_{\mathrm{G}3-}\left[\frac{1}{1+{e}^{\left({G}_{\mathrm{cr},\mathrm{tri}}-G\right)}}\right]{G}_3 $$
(12)

G cr,tri is the critical glucose concentration, which means that when the glucose level is above this concentration, the transglycosylation for trisaccharide production is significant. The second reaction is for tetrasaccharide production G + G 3 ↔ G 4 + H2O (r tetra) and the reaction rate can be expressed as:

$$ {r}_{\mathrm{tetra}}={r}_{\mathrm{tetra}+}-{r}_{\mathrm{tetra}-} $$
(13)

Where

$$ {r}_{\mathrm{tetra}+}={k}_{\mathrm{G}4+}\left[\frac{1}{1+{e}^{\left({G}_{\mathrm{cr},\mathrm{tetra}}-G\right)}}\right]G $$
(14)
$$ {r}_{\mathrm{tetra}-}={k}_{\mathrm{G}4-}\left[\frac{1}{1+{e}^{\left({G}_{\mathrm{cr},\mathrm{tetra}}-G\right)}}\right]{G}_4 $$
(15)

G cr,tetra is again the critical glucose concentration, which means that when the glucose level is above this concentration, the transglycosylation reaction for tetrasaccharide production is significant. Figure 1 illustrates all the reaction pathways considered for glucose production from cellulose.
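The role of the bracketed term in Eqs. (11), (12), (14) and (15) is easiest to see numerically: it is a logistic function that is close to 0 below the critical glucose concentration and close to 1 above it. The Python sketch below illustrates this behaviour. The rate constants are hypothetical, and the critical concentration of 40 g/L is the value adopted for G cr,tri in the Results section.

```python
import math

def switch(G, G_cr):
    """Logistic switch used in Eqs. (11)-(15): ~0 well below G_cr, ~1 well above."""
    return 1.0 / (1.0 + math.exp(G_cr - G))

def transglycosylation_rate(G, product, k_plus, k_minus, G_cr):
    """Net rate, forward minus reverse, as in Eqs. (10) and (13).

    `product` is G3 for the trisaccharide reaction and G4 for the
    tetrasaccharide reaction.
    """
    s = switch(G, G_cr)
    return k_plus * s * G - k_minus * s * product

# Hypothetical rate constants, used only to show the on/off behaviour.
below = transglycosylation_rate(G=30.0, product=0.0, k_plus=0.01, k_minus=0.05, G_cr=40.0)
above = transglycosylation_rate(G=50.0, product=0.0, k_plus=0.01, k_minus=0.05, G_cr=40.0)
```

Because the exponent is the plain difference G cr − G in grams per liter, the transition from "off" to "on" is completed within roughly ±5 g/L of the critical concentration.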

Materials and Methods

Substrate and Enzymes

The substrates employed in this research were Avicel® PH-101 (Sigma-Aldrich, St. Louis, MO, USA) and pretreated barley straw. The barley straw was grown and harvested in Denmark and then pretreated at DONG Energy (Danish Oil and Natural Gas Energy, Denmark) as described previously [21]; in brief, the pretreatment consisted of a three-stage heating process in which the straw was heated at increasing temperatures (15 min at 60 °C; liquids removal; 10 min at 180 °C; 3 min at 195 °C) [21]. After pretreatment, the liquids were removed. Acid hydrolysis and compositional analysis of the dry solid were carried out according to the protocol of the U.S. National Renewable Energy Laboratory [22]. The cellulose, xylose and acid-insoluble lignin contents of the pretreated barley straw were 66.3, 3.5 and 26.6 % by weight of dry matter, respectively.

The enzymes, Celluclast 1.5 L, Novozyme 188 and a special product of BG (not commercialized, named Xbg in this research), were obtained from Novozymes A/S (Bagsværd, Denmark): Celluclast 1.5 L (Cel, mainly EG + CBH) derived from Trichoderma reesei, having an activity of 65 FPU/mL (FPU = filter paper units), 10 CBU/mL (CBU = cellobiose units) and protein concentration of 79 mg/mL. Novozyme 188 (N188, mainly BG) derived from Aspergillus niger had an activity of 870 CBU/mL and a protein concentration of 88 mg/mL. Xbg had a protein concentration of 60 mg/mL. The CBU activity was determined by measuring glucose production from hydrolysis of cellobiose at 50 °C, pH 4.8 [23, 24].

Analyses of Protein and Sugars

Protein concentrations of the enzymes were measured by the Quick Start Bradford protein assay (Bio-Rad, Hercules, CA) with γ-globulin as standard. Hexokinase (420 U/mL) + glucose-6-phosphate-dehydrogenase (210 U/mL), purchased from Megazyme (Wicklow, Ireland), were used for glucose analysis. Following previous research [25], the concentration of cellobiose was calculated from the increase in glucose after treatment with an excess of N188 for 24 h at 50 °C. The xylose content of the pretreated barley straw was only 3.5 %; even if completely released during hydrolysis, the xylose concentration would be only 3.15 g/L. Xylose release was therefore ignored and no further measurements were performed.

Experimental Design and Hydrolysis Reaction

Hydrolysis reactions under different enzyme concentrations/combinations, substrate and inhibitor concentrations are shown in Table 1. These data were used for parameter estimation or validation, depending on the modelling strategies. All reactions were conducted in 2 mL Eppendorf tubes, incubated in thermomixers at 50 °C and mixed at 1,000 rpm. Reaction buffer was 50 mM sodium acetate, pH 4.8, with 0.04 % sodium azide.

Table 1 Experimental conditions of hydrolysis for the development and validation of the kinetic model

Determination of Langmuir Adsorption Constants

Ten grams per liter of Avicel or 5 g/L of pretreated barley straw was mixed with different concentrations of enzyme and incubated for 1 h at 50 °C with mixing at 1,000 rpm. Free enzymes were measured as the protein concentrations in the supernatant. Adsorbed enzymes were calculated by subtracting free enzyme concentrations from the initial enzyme concentrations. K ad and E max were determined by Eq. (16), which was rearranged from Eq. (1).

$$ \frac{E_{\mathrm{F}}}{\left({E}_{\mathrm{B}}/S\right)}=\frac{1}{E_{\max }{K}_{\mathrm{ad}}}+\frac{E{}_{\mathrm{F}}}{E_{\max }} $$
(16)

The obtained Langmuir adsorption constants are shown in Table 2.
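The linearisation in Eq. (16) means that plotting E F /(E B /S) against E F gives a straight line with slope 1/E max and intercept 1/(E max K ad). A minimal Python sketch of this regression is given below. The data are synthetic and generated from made-up constants, not taken from Table 2, and the function names are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def langmuir_constants(E_free, E_bound, S):
    """Estimate E_max and K_ad from the linear form in Eq. (16)."""
    y = [ef / (eb / S) for ef, eb in zip(E_free, E_bound)]
    slope, intercept = linear_fit(E_free, y)
    E_max = 1.0 / slope          # slope = 1 / E_max
    K_ad = slope / intercept     # intercept = 1 / (E_max * K_ad)
    return E_max, K_ad

# Synthetic, noise-free data from made-up constants (not the values in Table 2)
S = 10.0                                   # g/L substrate
true_E_max, true_K_ad = 0.05, 2.0
E_free = [0.05, 0.1, 0.2, 0.4, 0.8]        # free protein in the supernatant
E_bound = [true_E_max * true_K_ad * ef * S / (1.0 + true_K_ad * ef) for ef in E_free]

E_max_hat, K_ad_hat = langmuir_constants(E_free, E_bound, S)
```

With real measurements, E_bound would be the initial enzyme concentration minus the measured free enzyme, as described above, and the scatter around the line gives a first impression of how well the Langmuir assumption holds.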

Table 2 Parameters of the Langmuir adsorption and substrate reactivity

Transglycosylation Reaction

N188 and Xbg (0.585 and 0.293 mg/mL) and Celluclast 1.5 L (1.58 and 0.585 mg/mL) were mixed with different glucose concentrations. Reactions were incubated at 50 °C for 48 h. Final glucose concentrations were analyzed and compared with the initial concentrations. The decreased glucose was regarded as being converted to oligosaccharides by transglycosylation.

Estimation of Kinetic Parameters and Model Validation Methodology

To achieve the goal of the study, a systematic framework was introduced and used. The framework consisted of the following steps: model construction, experimental data collection, estimation of the parameters and validation (see Fig. 2). Four kinetic parameters (k 3r, K 3M, K 3IG and K 3IX) of the cellobiose-to-glucose conversion (r 3) were estimated using data set E1–E5 (see Table 1), in which glucose and xylose were used as initial inhibitors. The parameters were then validated against data sets with different enzyme concentrations (data set D1–D5, see Table 1). If the fitting and validation were not satisfactory, the mathematical model was analysed further; as a result, the transglycosylation reaction for cellotriose production (r tri) was introduced into the original Kadam model. Two parameters (k G3+ and k G3−) involved in transglycosylation were estimated. The next step of the procedure was the estimation of eight unknown parameters (k 1r, k 2r, K 1IG2, K 2IG2, K 1IG, K 2IG, K 1IX and K 2IX) in the kinetic expressions for cellulose conversion to cellobiose (r 1) and glucose (r 2), using the experimental data set A1–A4 (Celluclast + N188) or B1–B7 (Celluclast only), which involved analysis of the hydrolysis and product inhibition mechanism (see Table 3). Different initial concentrations of the inhibitory agents glucose, cellobiose and xylose were used to quantify their inhibition effects on cellulose hydrolysis. R S of Avicel in the cellulose-to-cellobiose reaction (r 1) and the cellulose-to-glucose reaction (r 2) (Eqs. (2) and (3)) was regarded as constant, i.e. R S was set to 1, following [11, 26] and assuming that the reactivity of Avicel did not change over time. The models were validated using data sets with different enzyme combinations and concentrations, substrate concentrations and pretreated barley straw (data set C–Q, see Table 1). The R S of pretreated barley straw was not constant, and α was set to 1, according to the values derived from other types of lignocellulose used in previous studies (e.g. [7, 12]). If the model predictions did not describe the experimental data well, the modelling step was revisited, and finally another transglycosylation reaction was introduced (production of cellotetraose, r tetra). The expressions of r 1 and r 2 could also be revised by introducing K 1M and K 2M (see Table 3, model 3). Finally, the framework for parameter estimation and validation could be applied in further investigations.

Fig. 2

Framework for construction of the enzymatic cellulose hydrolysis mathematical model and parameter estimation

Table 3 Summary of the models proposed in this research

To this end, three different variations of the original mathematical model proposed by Kadam et al. [7] were proposed in this research (Table 3) and evaluated. Model 1 only considered the transglycosylation for cellotriose production. Model 2 and model 3 involved both trisaccharide and tetrasaccharide production. Therefore, Eq. (7) for the mass balance of glucose was modified for model 1 as follows:

$$ \frac{\mathrm{d}G}{\mathrm{d}t}=1.111{r}_2+1.053{r}_3-1.071{r}_{\mathrm{tri}+}+1.071{r}_{\mathrm{tri}-} $$
(17)

and Eqs. (7) and (10) were modified for models 2 and 3 as shown below, respectively

$$ \frac{\mathrm{d}G}{\mathrm{d}t}=1.111{r}_2+1.053{r}_3-1.071{r}_{\mathrm{tri}+}+1.071{r}_{\mathrm{tri}-}-0.2702{r}_{\mathrm{tetra}+}+0.2702{r}_{\mathrm{tetra}-} $$
(18)
$$ \frac{\mathrm{d}{G}_3}{\mathrm{d}t}={r}_{\mathrm{tri}+}-{r}_{\mathrm{tri}-}-0.7567{r}_{\mathrm{tetra}+}+0.7567{r}_{\mathrm{tetra}-} $$
(19)

in order to fulfill the mass balances of the compounds following the reaction pathway proposed in Fig. 1.
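The numerical coefficients in the mass balances (Eqs. (5)–(7) and (17)–(19)) are consistent with ratios of molar masses, which can be checked with a few lines of Python. The molar masses below are standard values. The reading that r tri is expressed per mass of trisaccharide and r tetra per mass of tetrasaccharide is an inference from the coefficients rather than something stated explicitly in the text.

```python
M_WATER = 18.015     # g/mol
M_GLUCOSE = 180.156  # g/mol

def oligomer_mass(n):
    """Molar mass of a glucose oligomer with n units (n - 1 waters eliminated)."""
    return n * M_GLUCOSE - (n - 1) * M_WATER

M_AGU = M_GLUCOSE - M_WATER   # anhydroglucose unit in cellulose
M_G2 = oligomer_mass(2)       # cellobiose
M_G3 = oligomer_mass(3)       # trisaccharide
M_G4 = oligomer_mass(4)       # tetrasaccharide

# Each entry: (coefficient used in the mass balances, recomputed molar-mass ratio)
factors = {
    "cellobiose per cellulose": (1.056, M_G2 / (2 * M_AGU)),
    "glucose per cellulose": (1.111, M_GLUCOSE / M_AGU),
    "glucose per cellobiose": (1.053, 2 * M_GLUCOSE / M_G2),
    "glucose per trisaccharide": (1.071, 3 * M_GLUCOSE / M_G3),
    "glucose per tetrasaccharide": (0.2702, M_GLUCOSE / M_G4),
    "trisaccharide per tetrasaccharide": (0.7567, M_G3 / M_G4),
}

for name, (used, computed) in factors.items():
    print(f"{name}: used {used}, computed {computed:.4f}")
```

The last two recomputed ratios differ from the coefficients in Eqs. (18) and (19) only in the fourth decimal place, which is a rounding difference.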

The modelling and the estimation of the parameters were done using MATLAB (The MathWorks, Natick, MA). A nonlinear least-squares method was used for the parameter estimation. The initial values for the parameter estimation were taken from [7]. The fminsearch function of MATLAB and the Levenberg-Marquardt search were used to minimise the objective function, J(θ):

$$ \arg \min \kern1em J\left(\boldsymbol{\uptheta} \right)={\displaystyle \sum_{j=1}^M{\displaystyle \sum_{i=1}^N{\left(y{m}_{i,j}-{f}_{i,j}\left(\boldsymbol{\uptheta} \right)\right)}^2}} $$
(20)

where J(θ) is the sum of squared errors, ym i,j is the ith measurement in the jth experiment, f i,j (θ) is the corresponding model prediction and θ is the parameter subset used for parameter estimation. In other words, several experiments are typically used together for parameter estimation so that the parameter subset in question can be identified better. This is explained below.

Experimental data sets under different hydrolysis conditions used for parameter estimation and validation are shown in Table 1. Each data set was given a specific code from A to Q where different initial substrate concentrations, substrate types, enzyme loadings and combinations and inhibitor loadings were related with the different data sets. To make the article concise, these codes will be used repeatedly in this paper.

The comparison and selection for the proper mathematical structure among the proposed mathematical models was performed relying on the weighted sum of square error (WSSE) as follows:

$$ WSSE\kern0.5em ={\displaystyle {\sum}_j}{\displaystyle {\sum}_i{\left(\frac{y_{i,j}-f\left({x}_{i,j},k\right)}{\sigma_{i,j}}\right)}^2} $$
(21)

where y i,j is the experimental value i of data set j, f(x i,j , k) is the value of model k evaluated at the same experimental conditions i of data set j and σ i,j is the standard deviation of the measurement error (in this study, the same value, calculated from triplicate measurements, was assumed for all measurement points).
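For concreteness, the sum of squared errors used as the estimation objective and the WSSE used for model comparison can be written as two short Python functions. This is a sketch with hypothetical names and toy numbers; it follows the assumption stated above that a single σ applies to all measurement points.

```python
def sse(measured, predicted):
    """Sum of squared errors over all experiments j and measurements i."""
    return sum(
        (ym - f) ** 2
        for ym_j, f_j in zip(measured, predicted)
        for ym, f in zip(ym_j, f_j)
    )

def wsse(measured, predicted, sigma):
    """Weighted sum of squared errors, Eq. (21), with one common sigma."""
    return sum(
        ((y - f) / sigma) ** 2
        for y_j, f_j in zip(measured, predicted)
        for y, f in zip(y_j, f_j)
    )

# Toy data: two experiments with two and one measurement(s), respectively
measured = [[1.0, 2.0], [3.0]]
predicted = [[1.5, 2.0], [2.0]]

J = sse(measured, predicted)          # 0.25 + 0.0 + 1.0
W = wsse(measured, predicted, 0.5)    # J divided by sigma squared
```

With a common σ, WSSE is simply the sum of squared errors divided by σ², so ranking models by WSSE is equivalent to ranking them by the unweighted sum over the same data sets.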

Kinetic Parameters Derived from a Standard Enzymatic Procedure

In enzymology, the standard procedure for deriving the kinetic properties of enzymes is to measure the initial reaction rate. To compare the parameters estimated by this standard procedure with those from the numerical method used in this research, kinetic parameters were also obtained in this way. An appropriate concentration of BG was mixed with different concentrations of cellobiose and inhibitors (glucose or xylose) and incubated at 50 °C with gentle shaking for 8 min. All reactions were stopped by heating at 100 °C for 5 min. The parameters k 3r, K 3M, K 3IG and K 3IX were derived from Hanes-Woolf plots; values are shown in Table 4.
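In a Hanes-Woolf plot, S/v is plotted against S. For Michaelis-Menten kinetics the result is a straight line with slope 1/V max and intercept K M /V max, and a competitive inhibitor at concentration I raises the apparent K M by the factor (1 + I/K I). The Python sketch below shows the calculation on synthetic initial-rate data. All numbers and function names are invented for illustration, and V max here stands for the product of the rate constant and the enzyme concentration.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def hanes_woolf(S, v):
    """Return (V_max, K_M) from initial rates v measured at substrate levels S."""
    slope, intercept = linear_fit(S, [s / vi for s, vi in zip(S, v)])
    V_max = 1.0 / slope
    return V_max, intercept * V_max

def competitive_K_I(K_M, K_M_apparent, inhibitor_conc):
    """K_I from the apparent K_M measured with a competitive inhibitor present."""
    return inhibitor_conc / (K_M_apparent / K_M - 1.0)

def mm_rate(S, V_max, K_M, I=0.0, K_I=float("inf")):
    """Michaelis-Menten rate with optional competitive inhibition (form of Eq. (4))."""
    return V_max * S / (K_M * (1.0 + I / K_I) + S)

# Synthetic initial-rate data from invented constants
S = [0.5, 1.0, 2.0, 4.0, 8.0]
v_plain = [mm_rate(s, 10.0, 2.0) for s in S]
v_inhib = [mm_rate(s, 10.0, 2.0, I=5.0, K_I=4.0) for s in S]

V_max_hat, K_M_hat = hanes_woolf(S, v_plain)
_, K_M_app = hanes_woolf(S, v_inhib)
K_I_hat = competitive_K_I(K_M_hat, K_M_app, 5.0)
```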

Table 4 Hydrolysis kinetic parameters derived from model simulation and standard assay

Results and Discussion

Parameter Estimation and Validation of Cellobiose-to-Glucose Conversion (r 3)

The parameters describing the rate of cellobiose hydrolysis, r 3, were obtained using the experimental data from hydrolysis of cellobiose by N188 (data set E1–E5, see Table 1). However, the preliminary evaluation showed that the deviations between the fitting/prediction and the experimental data were large when the glucose concentration was higher than 70 g/L (Fig. 3a). Since product inhibition was already taken into account, this deviation was thought to be a result of transglycosylation. This hypothesis was tested and confirmed by incubating the enzymes in glucose solutions of different concentrations to examine the effect of transglycosylation. The data showed that the higher the initial glucose concentration, the more glucose disappeared after incubation for 2 days (Fig. 3b). The lost glucose was assumed to be converted to trisaccharide. The level of transglycosylation was approximately proportional to the glucose concentration when the glucose concentration was over 40 g/L. Therefore, the original model of Kadam was revised to account for this reaction step by introducing transglycosylation reactions for trisaccharide production (r tri, Eqs. (10)–(12)). The value of the critical glucose concentration (G cr,tri) was set to 40 g/L, meaning that when the glucose concentration is higher than 40 g/L, the effect of transglycosylation is taken into account. To describe the rate expression mathematically, two terms were used: the first term is a sigmoid function which switches the transglycosylation reaction on or off depending on the critical glucose concentration, and the other term is a first-order conversion rate of glucose to trisaccharide (see Eqs. (11)–(12)). With this modification, a better description of the experimental data was obtained, as shown in Fig. 4. It should be noted that under the same enzyme loading (protein/substrate ratio), Xbg was less inhibited by glucose; in addition, the hydrolysis efficiency of Xbg was much higher than that of N188, as shown in Fig. 3a. This is consistent with the observation that the values derived from the standard assay were different (Table 4). Therefore, the parameters of N188 and Xbg estimated by mathematical modelling can be expected to differ.

Fig. 3

The effect of transglycosylation. a Hydrolysis of 37.5 g/L cellobiose by N188 and Xbg (3.9 mg protein/g cellobiose) with initial background of glucose 50 g/L. Solid line indicates the fitting by model without introducing transglycosylation into the model. b Transglycosylation effect observed by incubating Celluclast 1.5 L, N188 and Xbg with different glucose concentrations

Fig. 4

Parameter estimations of the enzymatic cellobiose hydrolysis reaction; 37.5 g/L cellobiose hydrolysed by N188 (3.9 mg protein/g substrate). Different inhibitor background with 40 g/L xylose (E4) is not shown. Fitting curves without the incorporation of transglycosylation reaction are also shown

Parameter Estimation and Validation of Cellulose to Cellobiose (r 1) and Cellulose to Glucose (r 2)

In this study, several model structures were proposed based on an iterative analysis of data and model predictions. The models are referred to as models 1, 2 and 3, representing modifications of the kinetic expressions of the original Kadam model (see Table 3). Model 1 includes transglycosylation for trisaccharide production (r tri). Model 2 has the same structure as model 1 with the addition of transglycosylation for tetrasaccharide production (r tetra). Model 3 was based on model 2, but K 1M and K 2M were introduced into r 1 and r 2 (Eqs. (22) and (23)), i.e. a Michaelis-Menten-type term that was not included in the model published by Kadam was added to r 1 and r 2. For the parameter estimation, data set I (150 g/L of Avicel) was also included in addition to data set A1–A4. Each model candidate was evaluated step by step from model 1 to model 3 to optimise the performance of r 1 and r 2. Therefore, several factors were analysed: the experimental data selected for parameter estimation, transglycosylation for tetrasaccharide production (r tetra, Eqs. (13)–(15)), the value of the critical glucose concentration (G cr,tetra) and the mathematical expressions of r 1 and r 2. Owing to the limited space of this article, only selected results of fitting and validation are shown and discussed. Additional results and figures are shown in the Appendix (Supplementary Material). The comparison and evaluation of the models were performed by WSSE analysis (Table 5). It should be noted that in the present work, “fitting” means that the values or curves were derived by the nonlinear least-squares method from the experimental data sets used for parameter estimation, whereas “prediction” means that the values or curves were calculated using the parameters estimated from the fitting and the initial conditions used in the validation experiments.

Table 5 WSSE analysis of the performance of the models

In model 1, transglycosylation for tetrasaccharide production (r tetra) was not included. Two strategies for parameter estimation were pursued. The first strategy used data set B1–B7 for parameter estimation. The fitting of data set B was good (the WSSE value was small). The predicted values of data sets A2 and A3 were higher than the experimental values when the glucose concentration was higher than around 70–80 g/L (Fig. 5b). The second strategy used data set A1–A4 for parameter estimation instead of data set B1–B7. The fitted values of data sets A2 and A3 were still higher than the experimental values obtained under high glucose concentration; in addition, the prediction of data set C1–C7 (Fig. 6b) became worse than in strategy 1 (Fig. 5a). The WSSE values for prediction of data set C by strategy 1 and strategy 2 were 1,835 and 5,254, respectively (Table 5).

Fig. 5

Evaluation of strategy 1 of model 1 by enzymatic hydrolysis of 100 g/L Avicel. a Parameter estimation by data sets hydrolysed by Cel (10.5 mg protein/g substrate) with different initial background inhibitors: B1, no inhibitor; B2, 25 g/L glucose; B3, 50 g/L glucose; B4, 15 g/L cellobiose; B5, 30 g/L cellobiose; B6, 40 g/L xylose; B7, 80 g/L xylose. b Validation of the model by data sets hydrolysed by Cel (15.8 mg protein/g substrate) and N188 (5.9 mg protein/g substrate) with different background inhibitors: A1, no background; A2, 50 g/L glucose; A3, 30 g/L cellobiose; A4, 80 g/L xylose

Fig. 6

Evaluation of strategy 2 of model 1 by enzymatic hydrolysis of 100 g/L Avicel. a Parameter estimation by data sets hydrolysed by Cel (15.8 mg protein/g substrate) and N188 (5.9 mg protein/g substrate) with different background inhibitors: A1, no background; A2, 50 g/L glucose; A3, 30 g/L cellobiose; A4, 80 g/L xylose. b Validation by data sets hydrolysed by Cel (10.5 mg protein/g substrate) with different initial background inhibitors: B1, no inhibitor; B2, 25 g/L glucose; B3, 50 g/L glucose; B4, 15 g/L cellobiose; B5, 30 g/L cellobiose; B6, 40 g/L xylose; B7, 80 g/L xylose

According to the curves in data set A2 and A3 of model 1, the “over-prediction” problem (for strategy 1) and “over-fitting” problem (for strategy 2) were observed when the glucose concentration was higher than around 70 ~ 80 g/L (see Figs. 5b and 6a). Hence, the equations r 1, r 2, r 3 and r tri, did not suffice to describe the hydrolysis kinetic behaviour under such high glucose concentrations. In order to address this issue, an additional transglycosylation reaction for tetrasaccharide production, r tetra, was proposed in model 2 (see Fig. 1), of which the critical value for transglycosylation (G cr,tetra) was set as 75 and 80 g/L, and both values were evaluated. Data set A1–A4 was used for parameter estimation. In each case, the “over-fitting” problem under high glucose concentration in data set A2 and A3 (Fig. 7a) and the “over-prediction” problem in data set O and P were reduced (Fig. 7b).

Fig. 7

Evaluation of model 2 (G cr,tetra = 75 g/L). a Enzymatic hydrolysis of 100 g/L Avicel. Parameter estimation by data sets hydrolysed by Cel (15.8 mg protein/g substrate) and N188 (5.9 mg protein/g substrate) with different background inhibitors: A1, no background; A2, 50 g/L glucose; A3, 30 g/L cellobiose; A4, 80 g/L xylose. b Validation of the model by 90 g/L pretreated barley straw hydrolysed by Cel (15.8 mg protein/g substrate) and N188 (5.9 mg protein/g substrate) with different background inhibitors: N, no background; O, 50 g/L glucose; P, 30 g/L cellobiose; Q, 80 g/L xylose

However, when the Avicel concentration was up to 150 g/L (data set I), the “over-prediction” problem was not solved by model 2. Therefore, model 2 was modified further by changing the expression of cellulose-to-cellobiose (r 1) (Eq. (22)) and cellulose-to-glucose (r 2) (Eq. (23)). The modified model was called model 3:

$$ {r}_1=\frac{k_{1\mathrm{r}}{E}_{1\mathrm{B}}{R}_{\mathrm{S}}S}{K_{1\mathrm{M}}\left(1+\frac{G_2}{K_{1\mathrm{IG}2}}+\frac{G}{K_{1\mathrm{IG}}}+\frac{X}{K_{1\mathrm{IX}}}\right)+S} $$
(22)
$$ {r}_2=\frac{k_{2\mathrm{r}}\left({E}_{1\mathrm{B}}+{E}_{2\mathrm{B}}\right){R}_{\mathrm{S}}S}{K_{2\mathrm{M}}\left(1+\frac{G_2}{K_{2\mathrm{IG}2}}+\frac{G}{K_{2\mathrm{IG}}}+\frac{X}{K_{2\mathrm{IX}}}\right)+S} $$
(23)

The denominators of both equations were expressed according to the Michaelis-Menten equation with competitive inhibition. In addition to data set A1–A4, data set I (150 g/L Avicel) was also used for parameter estimation. The critical value of transglycosylation for tetrasaccharide production (G cr,tetra) was set to 75 and 80 g/L, and both values were evaluated. The fitting curve of data set I came closer to the experimental data (Fig. 8). The WSSE value of model 3 for data set I was also smaller than those of the other models (Table 5). For hydrolysis of barley straw (data sets N, O, P and Q), the predicted values of all models were smaller than the experimental values during the initial phase (<24 h) but then overshot them (Fig. 7b). This can be ascribed to the differences in physical and chemical properties between Avicel and pretreated barley straw.
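The structural difference between the original expression for r 1 (Eq. (2)) and the model 3 expression (Eq. (22)) can be made explicit with a short Python sketch. The numbers are arbitrary and the function names hypothetical. Note also that k 1r does not carry the same units in the two forms, so fitted values are not directly comparable between models.

```python
def inhibition_term(G2, G, X, K_IG2, K_IG, K_IX):
    """Competitive inhibition factor shared by Eqs. (2), (3), (22) and (23)."""
    return 1.0 + G2 / K_IG2 + G / K_IG + X / K_IX

def r1_original(k1r, E1B, Rs, S, inhib):
    """Eq. (2): first order in S, no saturation with substrate."""
    return k1r * E1B * Rs * S / inhib

def r1_model3(k1r, E1B, Rs, S, inhib, K1M):
    """Eq. (22): Michaelis-Menten form, saturating at high S."""
    return k1r * E1B * Rs * S / (K1M * inhib + S)

# Arbitrary illustrative values, no inhibitors present
inhib = inhibition_term(0.0, 0.0, 0.0, K_IG2=1.0, K_IG=1.0, K_IX=1.0)
args = dict(k1r=2.0, E1B=0.5, Rs=1.0, inhib=inhib)

low = r1_original(S=10.0, **args)
high = r1_original(S=20.0, **args)             # doubles with S
half = r1_model3(S=10.0, K1M=10.0, **args)     # half-maximal at S = K1M
plateau = r1_model3(S=1e9, K1M=10.0, **args)   # approaches k1r * E1B * Rs
```

For a fixed amount of bound enzyme and fixed R S, the original form grows without bound as S increases, whereas the model 3 form levels off. This is one plausible reason why the extra parameter helps at 150 g/L Avicel, although the text itself leaves the cause open.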

Fig. 8

Evaluation of model 3 by enzymatic hydrolysis of 100 and 150 g/L Avicel. Parameter estimation by data sets hydrolysed by Cel (15.8 mg protein/g substrate) and N188 (5.9 mg protein/g substrate) with different background inhibitors: A1, no background; A2, 50 g/L glucose; A3, 30 g/L cellobiose; A4, 80 g/L xylose; I, no background

A comparison of the two strategies of model 1 (strategy 1: data set B1–B7 for parameter estimation; strategy 2: data set A1–A4 for parameter estimation) indicated that in strategy 2 the values were over-predicted on the xylose background (data sets B6 and B7) (Fig. 6b). In contrast, in strategy 1, the fitting curves of data set B1–B7 were quite good and the prediction curves of data set A1–A4 were similar to the fitting curves of data set A1–A4 in strategy 2. The reason for this is not completely clear because a synergistic effect cannot explain this observation. In strategy 1, the parameters of r 1 and r 2 were derived from the effect of Celluclast addition alone (data set B1–B7); therefore, the synergistic effects contributed by the cooperation of BG were not included in the parameters. This means that when the parameters derived from strategy 1 are used to describe the kinetic behaviour under the conditions of data set A1–A4 (Celluclast + BG), the predicted values should be lower than the experimental data: the experimental values may include the effect of synergism, but the predicted values do not account for it. However, this hypothesis is not consistent with the observed results. The predicted curves of data sets A1 and A4 fitted the experimental data very well with strategy 1 (not lower, as expected). This discrepancy needs further investigation.

The reason why model 2 failed to predict the kinetics at an Avicel concentration of 150 g/L is not known. However, Fig. 3b reveals that the degree of transglycosylation depends not only on glucose concentration but also on enzyme concentration, i.e. the higher the enzyme concentration, the more transglycosylation. In this research, enzyme concentration was not considered in the equations of transglycosylation (either r tri or r tetra). Therefore, although the ratios of enzyme to cellulose in data set A1–A4 and data set I were the same (Table 1), the cellulose concentrations of 100 g/L (data set A1–A4) and 150 g/L (data set I) mean that the total enzyme concentration in the latter reaction was 1.5 times higher than in the former. It is possible that the level of transglycosylation under the conditions of data set I was higher than model 2 predicted. The same reasoning is supported by the fact that the parameters of transglycosylation for trisaccharide production (r 3) were estimated from data set E1–E5, in which the BG loading was only 3.9 mg protein/g cellobiose. The BG concentration in those reactions was therefore 3.9 × 37.5 = 146 mg/L (enzyme to substrate ratio × substrate concentration = enzyme concentration), compared with 5.9 × 150 = 885 mg/L in data set I, i.e. six times that of data set E. In turn, the parameters of r 3 estimated from data set E1–E5 underestimated the transglycosylation reaction of data set I. The detailed relation between enzyme concentration and transglycosylation needs further investigation.
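
The enzyme concentrations compared above follow directly from the loadings and substrate concentrations quoted in the text. The short sketch below reproduces the arithmetic; the function and variable names are ours and purely illustrative.

```python
def enzyme_concentration_mg_per_l(loading_mg_per_g, substrate_g_per_l):
    """Enzyme concentration = enzyme-to-substrate ratio x substrate concentration."""
    return loading_mg_per_g * substrate_g_per_l


# BG (N188) in data set E: 3.9 mg protein/g cellobiose at 37.5 g/L cellobiose.
bg_data_set_e = enzyme_concentration_mg_per_l(3.9, 37.5)
# BG (N188) in data set I: 5.9 mg protein/g substrate at 150 g/L Avicel.
bg_data_set_i = enzyme_concentration_mg_per_l(5.9, 150.0)
bg_ratio = bg_data_set_i / bg_data_set_e

# Celluclast at the same loading (15.8 mg protein/g substrate) in data sets A and I.
cel_ratio = (enzyme_concentration_mg_per_l(15.8, 150.0)
             / enzyme_concentration_mg_per_l(15.8, 100.0))

print(f"BG, data set E: {bg_data_set_e:.0f} mg/L")
print(f"BG, data set I: {bg_data_set_i:.0f} mg/L")
print(f"BG ratio I/E: {bg_ratio:.1f}; Celluclast ratio I/A: {cel_ratio:.1f}")
```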

Another explanation could be that the parameters were derived from data fitting by numerical analysis based on kinetic principles (enzyme adsorption, product inhibition and transglycosylation). Thus, when data set I was included in the parameter estimation, certain unknown factors not considered in the semimechanistic model were lumped into the other parameters. That would explain why, e.g. data set I fitted by model 3 was better than the predictions obtained by model 1 and model 2.

In all models, the validation of hydrolysis of Avicel under different ratios of Celluclast and N188 with a 40 g/L xylose background (data sets F, G and H, Fig. 9) was good, meaning that, when the enzyme combination is varied within a typically used range (Celluclast, 15.8–31.6 and N188, 1–5.9 mg protein/g cellulose), the models can describe the reaction kinetics well. The predictions of the hydrolysis kinetics at lower cellulose concentration (50 g/L Avicel, data set J) were also good (Fig. 10). The sums of WSSE values in Table 5 show that the performance of the models was influenced not only by the reaction equations selected in the models but also by the critical glucose concentration of transglycosylation (G cr,tetra). If the reaction is conducted at a cellulose concentration up to 150 g/L, model 3 with G cr,tetra = 80 g/L is the best choice (sum of WSSE = 7,827). However, when the cellulose concentration is below 100 g/L, model 1 suffices to predict reactions without considering an initial inhibitor background.
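
The model comparison above rests on summing WSSE values over data sets. A minimal sketch of such a comparison is given below, assuming that WSSE denotes a weighted sum of squared errors between measured and predicted concentrations. The weighting scheme, function names and example numbers are our assumptions and are not taken from Table 5.

```python
def wsse(measured, predicted, weights=None):
    """Weighted sum of squared errors between measured and predicted values."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    if weights is None:
        # Assumption: unit weights when no weighting scheme is supplied.
        weights = [1.0] * len(measured)
    return sum(w * (m - p) ** 2 for w, m, p in zip(weights, measured, predicted))


def rank_models(wsse_by_model):
    """Sort models by the sum of their WSSE values over all data sets (best first)."""
    totals = {name: sum(values) for name, values in wsse_by_model.items()}
    return sorted(totals.items(), key=lambda item: item[1])


# Hypothetical WSSE values per data set, for illustration only.
example = {"model A": [120.0, 300.0], "model B": [150.0, 90.0]}
print(rank_models(example))
```

A ranking of this kind is only meaningful when all models are scored against the same data sets with the same weights.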

Fig. 9

Validation of model 2 (G cr,tetra = 75 g/L) by enzymatic hydrolysis of 100 g/L Avicel under different enzyme loadings and combinations. F, Cel/N188 = 15.8/3 mg protein/g substrate; G, Cel/N188 = 15.8/1 mg protein/g substrate; H, Cel/N188 = 31.6/5.9 mg protein/g substrate. Background inhibitor is 40 g/L xylose

Fig. 10

Validation of model 2 (G cr,tetra = 75 g/L) by enzymatic hydrolysis of different Avicel concentrations by Cel (15.8 mg protein/g substrate) and N188 (5.9 mg protein/g substrate). I, 150 g/L; J, 50 g/L

Limitations of the Models Proposed in This Research

The models proposed in this study can predict enzymatic cellulose hydrolysis kinetics well within certain ranges of conditions, including various enzyme-substrate ratios and substrate dry matter contents. However, there are two limitations, namely conditions under which the models do not predict the glucose release very well:

  1. Substrate with different properties. The composition and physical structures of Avicel and pretreated barley straw are different. Most pretreated biomass contains more than 15 % lignin, which adsorbs the enzymes non-specifically, after which the adsorbed enzymes lose their activity. However, in the present work, all bound enzymes were regarded as active; thus, non-productive adsorption was not considered. The lack of this consideration may explain why the model over-estimated the released glucose concentration after ~48 h of reaction (Fig. 7b). The other explanation is that the crystallinity of Avicel is higher than that of the cellulose in pretreated barley straw. This may be why the hydrolysis curve of pretreated barley straw bends earlier than that of Avicel. In addition, the lower crystallinity of pretreated barley straw may allow adsorption of more water. Hence, at the same dry matter, more free water may exist in the Avicel system. If so, this results in different physical environments in the two systems.

  2. Enzyme from different sources. The intrinsic properties of N188 and Xbg derived from the Hanes-Woolf plot (Table 4) were different, especially k 3r, the value for Xbg being more than 8 times higher than that of N188. Moreover, Xbg triggered less transglycosylation than N188 (Fig. 3b). The influence of these differences on the hydrolysis kinetics is shown in Fig. 3a.
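
For readers unfamiliar with the Hanes-Woolf plot mentioned in the second limitation, the sketch below shows how the Michaelis-Menten parameters follow from a linear regression of S/v against S, using the relation S/v = S/V_max + K_m/V_max. The data are synthetic and the function name is hypothetical; product inhibition is ignored here, and converting V_max into a rate constant such as k 3r would additionally require the enzyme concentration.

```python
def hanes_woolf_fit(substrate, rate):
    """Estimate (v_max, k_m) from a Hanes-Woolf plot of S/v against S."""
    y = [s / v for s, v in zip(substrate, rate)]
    n = len(substrate)
    mean_x = sum(substrate) / n
    mean_y = sum(y) / n
    # Ordinary least-squares line through the points (S, S/v).
    sxx = sum((x - mean_x) ** 2 for x in substrate)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(substrate, y))
    slope = sxy / sxx                      # equals 1 / v_max
    intercept = mean_y - slope * mean_x    # equals k_m / v_max
    v_max = 1.0 / slope
    k_m = intercept * v_max
    return v_max, k_m


# Synthetic initial rates generated with v_max = 2.0 and k_m = 5.0 (arbitrary units).
s_values = [1.0, 2.0, 5.0, 10.0, 20.0]
v_values = [2.0 * s / (5.0 + s) for s in s_values]
v_max_est, k_m_est = hanes_woolf_fit(s_values, v_values)
print(v_max_est, k_m_est)
```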

The Significance of the Parameters Derived from This Research and Real Reactions

The semimechanistic kinetic model proposed by Kadam et al. [7] and the modified models presented in the present work provide a practical way to predict the kinetics of enzymatic cellulose hydrolysis because all the variables are easily obtained by regular laboratory measurements. However, the reaction pathways of r 1 and r 2 are oversimplified. Celluclast 1.5 L is a mixture of different enzymes, mainly EG I, EG II, CBH I and CBH II, and the hydrolysis mechanisms of each type of enzyme differ. The main product of CBH I is cellobiose, which may be accompanied by some glucose and cellotriose [27, 28], whereas CBH II produces cellobiose and cellotriose and only trace amounts of glucose [28]. The products of EG II are generally cellobiose, glucose and cellotriose. CBH I and EG II may interact negatively with respect to substrate adsorption but can exert positive synergistic effects on hydrolysis [27]. Thus, the parameters of r 1 and r 2 derived here are “lumped properties” of EG I, EG II, CBH I and CBH II.

The enzyme-cellulose interaction here is described by a Langmuir isotherm. Based on this theory, only enzymes adsorbing onto insoluble cellulose are defined as bound enzymes and regarded as active. However, EG II, CBH I and CBH II were reported to be able to hydrolyse soluble oligosaccharides with DP < 8 [27, 28]. Enzymes acting on soluble substrates are classified as free enzymes rather than bound enzymes. In the equations presented in this work, free Celluclast (EG + CBH) was regarded as not contributing to the reactions. The role of BG for r 2 is also ambiguous. BG works on small soluble oligosaccharides [19] rather than on cellulose. Thus, bound BG may not contribute to r 2. This hypothesis is in line with the approach of Zheng et al. [12], who assumed that BG only adsorbs onto lignin, not cellulose. Therefore, in their model, BG did not exist in the cellulose-to-glucose reaction (r 2). The values of the parameters were very close to those reported by Kadam et al. [7].
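
To make the bound/free distinction concrete, the sketch below partitions a total enzyme concentration into free and bound fractions using a Langmuir isotherm of the form E_B = E_max K_ad E_F S / (1 + K_ad E_F) together with the mass balance E_T = E_F + E_B. This form is assumed here as the one commonly used in models of this kind; the function name and the parameter values are placeholders, not the estimates reported in this work.

```python
import math


def langmuir_partition(e_total, substrate, e_max, k_ad):
    """Split total enzyme (g/L) into free and bound enzyme via a Langmuir isotherm.

    e_max: maximum adsorbed enzyme per unit substrate (g protein/g substrate)
    k_ad:  adsorption (association) constant (L/g protein)
    """
    # Substituting E_B into E_T = E_F + E_B gives a quadratic in E_F:
    #   k_ad * E_F**2 + (1 + k_ad * (e_max * S - E_T)) * E_F - E_T = 0
    b = 1.0 + k_ad * (e_max * substrate - e_total)
    e_free = (-b + math.sqrt(b * b + 4.0 * k_ad * e_total)) / (2.0 * k_ad)
    e_bound = e_max * k_ad * e_free * substrate / (1.0 + k_ad * e_free)
    return e_free, e_bound


# Placeholder parameters: 1.58 g/L enzyme on 100 g/L cellulose.
e_free, e_bound = langmuir_partition(e_total=1.58, substrate=100.0, e_max=0.06, k_ad=0.4)
print(e_free, e_bound)
```

In this formulation only the bound fraction enters r 1 and r 2, which is why any activity of free enzymes on soluble oligosaccharides is not captured.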

The parameters here were derived by numerical estimation from the time course reaction curves rather than from initial reaction rates as reported by others (e.g. [25]). However, the kinetic parameters of N188 derived from the models and from a standard enzymatic assay are quite different (Table 4). Thus, the parameters estimated from the models may not represent the real properties of the enzymes. Furthermore, the parameters of r 3 proposed by Kadam et al. [7] and those found in the work presented here differ. To our understanding, in the study of Kadam et al. [7], all values were derived from the hydrolysis of lignocellulose with a mixture of cellulase and BG. In this research, parameters of r 3 were obtained from hydrolysis of cellobiose by BG without the participation of Celluclast. It was also found that the two strategies in model 1, which used different experimental data for parameter estimation, resulted in different parameter values of r 1 and r 2. Thus, it can be concluded that the parameters derived from these models cannot be regarded as “universal values”. However, for engineering applications, the semimechanistic models can be used to describe and predict the observed/measured enzyme-substrate conversion kinetics and product inhibition mechanisms. Therefore, as long as these types of models are used within the range of their validity (substrate type, enzyme source and substrate concentration), they can support process design and technology improvement efforts at pilot and full-scale studies [29] and optimisation studies [20].

Conclusions

The semimechanistic model proposed by Kadam et al. [7] was modified and validated by experimental data with different enzyme loadings/combinations and substrate loading/type. Factors such as experimental data used for parameter estimation and transglycosylation were evaluated step by step during the modification of the models. Model 3 can describe the hydrolysis behaviour under high cellulose concentration up to 150 g/L, except when (1) the substrate properties differ and (2) different enzyme sources are employed. The parameters are not universal, and the variations depend on the experimental conditions used for the parameter estimation. However, the model provides a useful description of the dynamics of cellulose hydrolysis. As long as these types of models are used, bearing in mind their limitations and staying within the range of their validity (e.g. use a proper data set to estimate parameter values), they can provide useful tools for simulations and support process design, optimisation and scale-up efforts.