Introduction

Xylan is the most abundant hemicellulose compound in the plant cell wall and is composed of a β-1,4-glycosidic bond-linked xylose unit backbone chain that is usually decorated by glucuronic acid, arabinose, xylose, galactose and other groups (methyl, acetyl and feruloyl) (Glass et al. 2013; Scheller and Ulvskov 2010). Endo-β-1,4 xylanase (EC 3.2.1.8) is the key enzyme in the process of degrading xylan into short fragments via the random cleavage of xylan backbone chains (Collins et al. 2005). Xylanase has been widely applied in many industries, including the biofuel, fibre, food, and pulp and paper industries (Polizeli et al. 2005; Viikari et al. 1994). These industrial treatments require commercial enzymes with high activity and stability, and the protein engineering of enzymes is therefore receiving increased attention.

Directed evolution is a powerful method that can be performed without sufficient knowledge of structure-function relationships. Through constructing random mutant libraries and screening for variants with improved properties, many proteins have been successfully engineered, but the success of this method is highly dependent on the size and quality of the variant library used. Unfortunately, complete coverage of sequence space (Mark et al. 2005; Povolotskaya and Kondrashov 2010) is often unfeasible, even for small proteins. By contrast, engineering proteins by rational design requires a good understanding of the determinants of protein function. Rational design has been proven successful for improving the catalytic activity (Cheng et al. 2015; Huang et al. 2014) and thermostability (Badieyan et al. 2012) of enzymes. However, although some impressive results have been achieved through rational design, identifying appropriate targets based on the enzyme structure and deciding on the design direction remain challenging.

Advances in genomics and structural biology have dramatically expanded our knowledge of protein sequence and structure (Dai et al. 2012; Warnecke et al. 2007). Structural bioinformatics is useful for utilizing this wealth of knowledge to explore structure-function relationships and provide new insight into protein rational design (Tripathi and Varadarajan 2014). Catalysis by glycoside hydrolases (GHs) is closely connected with the active site architecture, where the amino acid residues at different subsites perform enzymatic functions (Halabi et al. 2009; Himmel 2008). In a previous study, determinants of substrate specificity were identified by analysing the sequence profile of the active site architecture of different GH families (Tian et al. 2016). However, despite some progress, learning how to construct a “small but smart” library by utilizing the natural diversity of protein sequences and structures remains a major goal.

GH11 xylanases typically adopt a conserved β-jelly-roll fold and degrade the xylan backbone chain randomly via a retaining mechanism (Rye and Withers 2000). The wide range of pH and temperatures tolerated by GH11 xylanases makes them applicable in industrial production (Paës et al. 2012). Thermomyces lanuginosus is the dominant fungus in plant biomass composts and has the ability to degrade hemicellulose during the composting process (Zhang et al. 2015a). TlXynA isolated from T. lanuginosus is a thermostable GH11 xylanase, as reported previously (Singh et al. 2003). Aspergillus niger is an important industrial species with a growth temperature of 30 °C. AnXynB, a mesophilic GH11 xylanase produced by A. niger, is widely used in industry (Andersen et al. 2011). In many previous studies, directed evolution approaches were applied to improve the biomass-degrading ability (Song et al. 2012), thermostability (Dumon et al. 2008; Miyazaki et al. 2006) and alkaliphilicity (Inami et al. 2003) of GH11 xylanases. By contrast, rational design strategies need further practice and improvement. In this study, we investigated structure-function relationships in GH11 family enzymes and proposed a strategy for their engineering by coupling computational and experimental approaches. The feasibility of the strategy was validated using AnXynB as a case study, and both enzyme activity and thermostability were improved significantly by selective site-directed mutagenesis at the − 3 subsite of its active site architecture. Our approach may be generally applicable for future engineering of glycoside hydrolases.

Materials and methods

Bioinformatics analysis

All protein sequences used in the present study were obtained from the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov/). In total, 74 sequences of GH11 xylanases were selected to construct the dataset. Multiple sequence alignments were performed using CLUSTAL (Larkin et al. 2007). Sequence alignment images were produced using ESPript (Robert and Gouet 2014). The phylogenetic tree was constructed using MEGA (Kumar et al. 2016) and optimized by iTOL (Letunic and Bork 2016).

The TlXynA-substrate complex was constructed by docking the xylohexaose from TrXyn11A (PDB: 4HK8) into the active site cleft of TlXynA (PDB: 1YNA). The root-mean-square deviation (RMSD) between these two structures was 0.59 Å. Based on a cut-off of 5 Å, amino acid residues surrounding the substrate were selected to generate the reference using PyMOL (http://www.pymol.org). The structure of AnXynB was modelled by SWISS-MODEL (Arnold et al. 2006), and all other structures of GH11 xylanases were downloaded from the Protein Data Bank (PDB, http://www.rcsb.org). In total, 25 structures (Supplemental Table S1) were used to perform multiple structural alignments using PyMOL, and the sequence profile of the active site architecture of the whole GH11 family was created by WebLogo (Crooks et al. 2004). The accuracy of the sequence profile was tested using the ConSurf Server (Ashkenazy et al. 2016).
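As an illustration of what a sequence profile encodes, the following minimal sketch computes the per-column information content (in bits) of an alignment, which is the quantity a sequence logo uses to scale letter height. It is not the WebLogo implementation: WebLogo additionally applies a small-sample correction, which is omitted here, and the three-column toy alignment is hypothetical rather than taken from the GH11 dataset.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column, ignoring gaps."""
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # Shannon entropy of the observed residue frequencies.
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Maximum possible information minus the observed uncertainty.
    return math.log2(alphabet_size) - entropy


def profile(aligned_sequences):
    """Information content for every column of equally long aligned sequences."""
    return [column_information(col) for col in zip(*aligned_sequences)]


# Hypothetical toy alignment: column 2 is invariant, column 3 is the most variable.
toy_alignment = ["NWE", "NWT", "SWE", "NWQ"]
scores = profile(toy_alignment)
```

Columns with high scores correspond to strictly conserved positions, and low scores flag variable positions such as those discussed for the distal subsites.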

Gene cloning, site-directed mutagenesis and protein expression

TlXynA (NCBI accession number: AAB94633) and AnXynB (NCBI accession number: ACA24724) were used for the research. A. niger (ATCC1015, American Type Culture Collection) was cultured at 30 °C in the minimal medium (5.95 g/L NaNO3, 0.522 g/L KCl, 1.497 g/L KH2PO4, 0.493 g/L MgSO4·7H2O, 5 g/L yeast extract, 2 g/L casamino acids) (Gong et al. 2016) with 1% xylose. Genomic DNA of A. niger ATCC1015 was used as the template for AnXynB gene cloning. The TlXynA gene was obtained by chemical synthesis (Genewiz, Suzhou, China). The sequences corresponding to the signal peptides of TlXynA and AnXynB were removed, and the remaining sequences were cloned into the pET28a plasmid (TransGen Biotech, Beijing, China) using NdeI and EcoRI restriction enzymes and T4 DNA ligase. AnXynB was used as a WT enzyme, and site-directed mutagenesis was performed using a PCR-based method (Weiner et al. 1994). The sequences of primers used for mutagenesis are listed in Supplemental Table S2. Recombinant plasmids were confirmed by DNA sequencing (Genewiz, Suzhou, China) and transformed into Escherichia coli BL21 (DE3).

E. coli BL21 (DE3) was cultured in LB medium supplemented with 50 μg/mL of kanamycin at 37 °C until the optical density at 600 nm reached 0.6–0.8. Isopropyl-β-D-thiogalactopyranoside (IPTG; Solarbio, Beijing, China) was added to the medium at 0.5 mM for induction, and the culture was incubated for another 20 h in a shaker at 20 °C. Cells were harvested by centrifugation and resuspended in lysis buffer (50 mM NaH2PO4, 300 mM NaCl, pH 8.0). After ultrasonic fragmentation, nickel column chromatography was used for protein purification. The eluent was exchanged into PC buffer (20 mM sodium phosphate, 10 mM citrate, pH 6.0) by ultrafiltration (3-kDa cutoff membrane, Millipore, Billerica, MA) at 4 °C. The purified protein was analyzed by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE; Supplemental Fig. S1). Protein concentrations were determined using the Bradford method (Bradford 1976).

Enzymatic activity assays and thermostability analysis

Xylanase activity was measured by incubating the purified enzymes (0.5 μg/mL) with 1% xylan at 50 °C for 10 min with occasional shaking. For analysis of the optimal temperature, 0.5 μg/mL enzyme was mixed with substrate and incubated for 10 min at different working temperatures. For thermostability analysis, protein samples were incubated at 60 °C for different time intervals of up to 120 min, and then cooled on ice for 5 min before activity measurement at 50 °C. The dinitrosalicylic acid (DNSA) method (Hu et al. 2009) was used to calculate enzyme activity by measuring the production of reducing sugars. Absorbance was read at a wavelength of 550 nm. One unit of xylanase activity corresponds to the amount of enzyme that releases 1 μmol of reducing sugar (xylose equivalent) per minute.
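The unit definition above can be sketched as a short calculation. This is only an illustration of the arithmetic: the function names and the numerical inputs are hypothetical, and the conversion from absorbance at 550 nm to micromoles of reducing sugar (which requires a xylose standard curve) is assumed to have been done beforehand.

```python
def activity_units(reducing_sugar_umol, minutes):
    """Units (U): micromoles of reducing sugar released per minute."""
    if minutes <= 0:
        raise ValueError("reaction time must be positive")
    return reducing_sugar_umol / minutes


def specific_activity(units, enzyme_mg):
    """Specific activity in U per mg of enzyme."""
    if enzyme_mg <= 0:
        raise ValueError("enzyme mass must be positive")
    return units / enzyme_mg


# Hypothetical example values, not measurements from this study:
# 0.5 umol reducing sugar released in a 10 min assay by 0.05 ug (5e-5 mg) enzyme.
example_units = activity_units(reducing_sugar_umol=0.5, minutes=10)
example_specific = specific_activity(example_units, enzyme_mg=5e-5)
```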

Enzyme kinetics

To determine kinetic parameters for the different mutants, enzyme (100 μL, 0.5 μg/mL) was mixed with various concentrations of xylan solution (500 μL) for 3 min, and DNSA solution (400 μL) was added to terminate the reaction. Based on the one-site binding equation, kinetic parameters including the turnover rate (kcat), Michaelis constant (Km) and catalytic efficiency (kcat/Km) were obtained by nonlinear fitting using GraphPad Prism 5 (Motulsky 2007).
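The fitting step can be illustrated with a minimal least-squares sketch of the hyperbolic model v = Vmax·[S]/(Km + [S]), which has the same form as the one-site binding equation. This is not the GraphPad Prism procedure: it is an assumed, simplified alternative that solves Vmax in closed form for each trial Km and searches Km by golden-section search on a logarithmic scale. All function names, the search bounds and the synthetic data are hypothetical.

```python
import math


def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the hyperbolic (one-site) model."""
    return vmax * s / (km + s)


def _best_vmax(km, substrate, rates):
    # For a fixed Km the model is linear in Vmax, so least squares is closed-form.
    x = [s / (km + s) for s in substrate]
    return sum(v * xi for v, xi in zip(rates, x)) / sum(xi * xi for xi in x)


def _sse(km, substrate, rates):
    # Sum of squared residuals at the best Vmax for this Km.
    vmax = _best_vmax(km, substrate, rates)
    return sum((v - michaelis_menten(s, vmax, km)) ** 2
               for s, v in zip(substrate, rates))


def fit_michaelis_menten(substrate, rates, km_low=1e-3, km_high=1e3, iterations=100):
    """Return (Vmax, Km) minimising the squared error; searches ln(Km)."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = math.log(km_low), math.log(km_high)
    for _ in range(iterations):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if _sse(math.exp(c), substrate, rates) < _sse(math.exp(d), substrate, rates):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    km = math.exp((a + b) / 2)
    return _best_vmax(km, substrate, rates), km


def kcat(vmax, enzyme_concentration):
    """Turnover rate: Vmax divided by total enzyme concentration (consistent units)."""
    return vmax / enzyme_concentration


# Synthetic, noise-free data generated with Vmax = 10 and Km = 2 (arbitrary units).
substrate_conc = [0.5, 1, 2, 4, 8, 16]
observed = [michaelis_menten(s, 10.0, 2.0) for s in substrate_conc]
fitted_vmax, fitted_km = fit_michaelis_menten(substrate_conc, observed)
```

With real, noisy data a dedicated fitting package would also report confidence intervals for Km and kcat, which this sketch does not attempt.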

Differential scanning calorimetry

The melting temperature (Tm) of enzymes was monitored using a VP-DSC MicroCalorimeter (MicroCal, Northampton, MA, USA) at a protein concentration of 0.02 mM and a scan rate of 1 °C/min. Protein samples were degassed and added into the 300 μL DSC sample cell. The reference cell was filled with buffer without enzyme. The initial temperature was 20 °C and the final temperature was 100 °C. The discrimination power was measured during the temperature increase.

Time-course analysis of reaction products

Fluorophore-assisted carbohydrate electrophoresis (FACE) was used to assess hydrolysis products of AnXynB and its mutants using xylotetraose (X4) and xylotriose (X3) as substrates. Substrate solutions (500 μg/mL) were mixed with an equal amount of protein (10 μg/mL) and incubated at 65 °C for 0, 4, 8, 15, 30 and 60 min as described previously (Zhang et al. 2015b). TIF images were obtained, and band intensity was quantified from peaks using Quantity One software (Bio-Rad Laboratories, Hercules, CA). The grey value of each band corresponded to the peak area calculated from the peak volume (Gong et al. 2016).

Molecular dynamics simulations

Molecular dynamics simulations of WT AnXynB and its mutants (S41N, T43E and S41N/T43E) were carried out using Gromacs 4.5 as described previously (Jiang et al. 2016; Pronk et al. 2013). The number of contacts between enzyme and ligand was calculated from the enzyme-ligand distances: a contact was counted when the distance was less than 5 Å. In order to explore the influence of mutations on the dynamics of 54Trp, the collective motions of 54Trp in the WT and mutated enzymes were investigated using principal component analysis (PCA) (Jiang et al. 2017; Liu et al. 2014). Trajectory projections of 54Trp on PC1 and PC2 were used to describe its conformational ensembles in different systems. The van der Waals and electrostatic interaction energies between the enzyme and substrate were calculated according to the Amber force field equation (Cornell 1995).
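The contact criterion can be sketched as follows. This is a minimal illustration, not the Gromacs analysis used in the study: it assumes that contacts are counted as enzyme-ligand atom pairs closer than the 5 Å cut-off (the text does not state whether atoms or residues were counted), and the coordinates and function names are hypothetical.

```python
import math


def count_contacts(enzyme_atoms, ligand_atoms, cutoff=5.0):
    """Number of enzyme-ligand atom pairs closer than `cutoff` (in Å)."""
    return sum(
        1
        for enzyme_xyz in enzyme_atoms
        for ligand_xyz in ligand_atoms
        if math.dist(enzyme_xyz, ligand_xyz) < cutoff
    )


def mean_contacts(frames, cutoff=5.0):
    """Average contact number over trajectory frames.

    Each frame is a pair (enzyme_atoms, ligand_atoms) of coordinate lists.
    """
    counts = [count_contacts(enzyme, ligand, cutoff) for enzyme, ligand in frames]
    return sum(counts) / len(counts)


# Hypothetical two-frame toy trajectory (coordinates in Å).
frame_1 = ([(0, 0, 0), (10, 0, 0)], [(3, 0, 0), (4, 0, 0)])
frame_2 = ([(0, 0, 0), (10, 0, 0)], [(3, 0, 0), (20, 0, 0)])
average = mean_contacts([frame_1, frame_2])
```

The all-pairs loop is adequate for a small active-site selection; for whole-protein trajectories a neighbour-search structure would be the usual choice.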

Results

Determinants associated with the catalysis of GH11 xylanases

Protein sequence and structure determine protein function (Clark and Radivojac 2011; Lee et al. 2007). Using phylogenetic tree and active site architecture sequence profile analysis, the determinants associated with the catalysis of GH11 xylanases were investigated. As shown in Fig. 1, based on protein sequence similarity, GH11 xylanases were divided into two separate groups: Bacteria and Eukaryota. GH11 xylanases cannot be clustered in terms of their optimal temperatures. For example, the optimal temperatures of TlXynA and AnXynB were significantly different (Gong et al. 2016; Paës et al. 2012; Singh et al. 2003) despite their close evolutionary relationship. These results indicate that minor differences in protein sequence may mediate variation of the enzyme catalysis.

Fig. 1 Phylogenetic tree of the GH11 family in which Bacteria are shown in blue and Eukaryota are shown in pink. The thermophilic TlXynA from T. lanuginosus and the mesophilic AnXynB from A. niger are labeled with black solid circles. Each label in the figure gives, in order, the genus (b, e), EC number, UniProt ID, optimal pH and optimal temperature, separated by ‘_’

As shown in Supplemental Fig. S2, xylanases of the GH11 family adopt a highly conserved β-sandwich fold in which the active site is the location for xylan hydrolysis. The amino acid residues in the active site architecture were more conserved compared with those in other regions, consistent with the functional importance of the active site architecture. However, the conservation of amino acid residues was not equally distributed over the whole active site architecture (Fig. 2c). Specifically, amino acid residues forming the − 2, − 1 and + 1 subsites were highly conserved, while amino acid residues at distal subsites, and especially at the − 3 subsite, tended to be more variable, indicating that these various amino acid residues might contribute to differences in catalysis between different GH11 xylanases.

Fig. 2 Sequence and structure alignment of TlXynA and AnXynB. a Sequence alignment of TlXynA and AnXynB. Strictly conserved residues are highlighted by a red background, and conservatively substituted residues are boxed. The secondary structure of TlXynA is shown above the aligned sequences. The green coloured number represents the location of the disulfide bond. b Structure alignment of TlXynA (salmon) and AnXynB (cyan). c Sequence profile of the active site architecture of GH11 family enzymes. The size of the letter and the ConSurf score indicate the degree of conservation at a given site. The location of ligand atoms interacting with the amino acid residues at each subsite is marked at the top, where CS stands for cleavage site

Differences in sequence, structure and function between TlXynA and AnXynB

To further investigate how variability at the − 3 subsite could modulate protein function, TlXynA and AnXynB were compared, since they are closely related but display different catalytic properties (Gong et al. 2016; Paës et al. 2012; Singh et al. 2003). The sequence similarity of TlXynA and AnXynB is 52.36%, and sequence differences are mainly concentrated in the loop regions (Fig. 2a), especially in the loop between two β-strands (residues 92–103 in TlXynA vs residues 127–138 in AnXynB) and in another loop between an α-helix and a β-strand (residues 161–171 in TlXynA vs residues 196–205 in AnXynB). The value of RMSD (0.715 Å) between these two structures suggests that TlXynA and AnXynB share similar structures (Chothia and Lesk 1986; Webb and Sali 2014). The catalytic cleft is approximately 30 Å long and contains a thumb structure (Fig. 2b). Among 23 amino acids defining the active site architecture, there are 3 amino acids which are different between TlXynA and AnXynB (Fig. 2c). The surface potential of the active site architecture is intimately related to substrate binding (Julián-Sánchez et al. 2013). As shown in Fig. 3, the surface potential at the − 3 subsite clearly differs between TlXynA and AnXynB. Notably, only two amino acid residues vary at this subsite; according to their predicted protein structures, 5Asn and 7Glu in TlXynA correspond with 41Ser and 43Thr in AnXynB (Fig. 2a). These amino acid residues would therefore be expected to give rise to differences in the surface potential at the − 3 subsite.

Fig. 3 Comparison between TlXynA and AnXynB. Electrostatic potential distribution on the surface of TlXynA (a) and AnXynB (b). Electrostatic potential is coloured as a gradient from red (negative) to blue (positive). The local structure at the − 3 subsite is shown to highlight differences between TlXynA (c) and AnXynB (d)

In order to resist the detrimental effects of high temperatures, thermostable enzymes generally have a higher binding affinity for their substrates than mesophilic enzymes (Fields 2001). As shown in Fig. 4a, b, the melting temperature and optimal temperature for TlXynA were 74 and 65 °C, respectively, and the corresponding values for AnXynB were 51 and 47 °C. To test whether the thermostable TlXynA also binds its substrate more tightly than the mesophilic AnXynB, kinetic parameters were measured at different temperatures for these two enzymes.

Fig. 4 Enzymatic characteristics of TlXynA and AnXynB. a Measurement of melting temperature (Tm). b Measurement of optimal reaction temperature. c, d Changes in Michaelis constant Km and turnover rate kcat with increasing temperature

The binding of enzyme and substrate can be a rate-limiting step for enzyme catalysis. As shown in Fig. 4c, the Km values of TlXynA and AnXynB changed only minimally until the temperature reached the optimal temperature, and the binding affinity of the thermophilic TlXynA for xylan was twice that of the mesophilic AnXynB. By contrast, the kcat values increased continuously with increasing temperature up to the optimal temperature (Fig. 4d). These results indicate that temperature modulates enzyme activity mainly by affecting the turnover rate (kcat) before enzyme unfolding occurs. Based on the above analysis, the higher binding affinity of TlXynA for xylan may be attributed to its highly polar residues (5Asn and 7Glu) at the − 3 subsite.

Computational analysis of the effects of mutations at the − 3 subsite

To explore the role of amino acid residues at the − 3 subsite in substrate binding, virtual mutants S41N and T43E of AnXynB were constructed. As shown in Fig. 5a, b, according to the result of virtual mutant T43E, one new hydrogen bond should be formed between 43Glu and xylose at the − 2 subsite, and the residue should also form another two new hydrogen bonds with 50Tyr and 52Ser, respectively. When 41Ser was mutated into 41Asn, a steric clash should occur between the side chain of 41Asn and the adjacent residue 54Trp. These results indicate that the mutation at the − 3 subsite altered the hydrogen bonding network within the active site architecture and affected the conformation of functional amino acid residues.

Fig. 5 Virtual mutation analysis and molecular dynamics simulations. a, b Effect of S41N and T43E mutations on the local structure of AnXynB. Amino acids are shown in stick or sphere representation. Black dashed lines indicate potential hydrogen bonds. c Number of contacts between enzyme and substrate based on a cut-off distance of 5 Å. d Principal components analysis (PCA) showing conformational ensembles of 54Trp in different systems. PC1 and PC2 are the first two eigenvectors in the motion fluctuation of 54Trp

In order to verify the aforementioned prediction, molecular dynamics simulations were performed for WT AnXynB and its mutants (S41N, T43E and S41N/T43E). The number of contacts between enzyme and ligand was calculated and ranked, from most to fewest, S41N/T43E > T43E > WT > S41N (Fig. 5c). Notably, the average number of contacts with the ligand increased from 500 in the WT to 670 in S41N/T43E. This result indicates differences in binding affinity for the substrate. We used PCA to investigate the conformation of the active site architecture of WT and enzyme variants S41N, T43E and S41N/T43E. One striking feature of enzymatic catalysis is that the catalytic elements in the active site architecture are precisely positioned for their function (Benkovic and Hammes-Schiffer 2003). The conserved aromatic residue 54Trp at the − 2 subsite forms a stacking interaction with the xylose moiety to assist substrate binding (Cheng et al. 2014; Paës et al. 2012). As shown in Fig. 5d, trajectory projections of 54Trp on PC1 and PC2 differed between these enzymes. The single mutations S41N and T43E only slightly changed the original conformation of 54Trp seen in the WT enzyme. The S41N/T43E double mutation altered the conformation of 54Trp significantly, which influenced the binding affinity of AnXynB. Many previous studies have pointed out that the catalytic activity of enzymes can be affected by changes in substrate binding affinity (Bernardi et al. 2014; Zhang et al. 2015c).

Experimental analysis of the effects of mutations at the − 3 subsite

The mutant genes were synthesized and cloned. The encoded enzyme variants S41N, T43E and S41N/T43E were successfully expressed heterologously and purified. As shown in Fig. 6a, their enzyme activity differed considerably. Compared with WT AnXynB, the enzyme activity of the S41N mutant was slightly decreased, while that of the T43E variant was increased by 20%. Interestingly, the enzyme activity of S41N/T43E was increased by 72%. The specific activities of WT AnXynB and its mutants are shown in Supplemental Table S3. Compared with the industrial xylanase TlXynA, the enzyme activity of S41N/T43E was 40% higher (Supplemental Fig. S3). Km values for T43E and S41N/T43E were significantly lower (Table 1), indicating that the binding affinity was higher in these mutants, while S41N displayed a decrease in binding affinity. The turnover rate (kcat) of the three mutants was lower than that of the WT enzyme, and the T43E variant had the lowest kcat value. Finally, the catalytic efficiency (kcat/Km) of the T43E and S41N/T43E mutants was increased, whereas that of S41N was decreased compared to the WT enzyme, consistent with the enzyme activity results described above (Fig. 6a). Additionally, the interaction energy between enzyme and substrate was calculated by molecular dynamics simulation. The results showed that the T43E and S41N/T43E mutants had higher interaction energies than WT AnXynB (Supplemental Fig. S4).

Fig. 6 Biochemical properties of WT AnXynB and its mutants. a Catalytic activity of WT and enzyme variants S41N, T43E and S41N/T43E. b Kinetics of thermal inactivation of WT and enzyme variants S41N, T43E and S41N/T43E at 60 °C. The residual activity was measured at different time points

Table 1 Kinetic parameters of WT AnXynB and mutant enzymes

Thermostability of AnXynB and its mutants

As shown in Fig. 6b, the residual enzyme activity of WT AnXynB and its S41N mutant was approximately 10% after incubation for 120 min at 60 °C. By contrast, the enzyme activity of T43E and S41N/T43E mutants decreased more slowly over time, and the residual enzyme activity of the S41N/T43E variant was 35%. In addition, the results of the DSC showed that the melting temperature of S41N/T43E is 3 °C higher than that of the WT (Supplemental Fig. S5). Combined with the virtual mutation results described above (Fig. 5), these results suggest that the increased thermostability might be related to the newly introduced hydrogen bonds within the active site architecture region. A previous study showed the importance of the N-terminus for the stability of proteins with a β-sandwich structure (Jiang et al. 2016), and the introduced hydrogen bond in the S41N/T43E variant is located in the N-terminal region.
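For readers who wish to compare inactivation curves on a common scale, residual activities can be converted to apparent half-lives. The sketch below assumes simple first-order (single-exponential) inactivation, which the present study does not state or test, so the resulting values are illustrative only; the function names are hypothetical. The inputs are the residual activities reported above after 120 min at 60 °C.

```python
import math


def first_order_rate_constant(residual_fraction, minutes):
    """Apparent inactivation rate constant k (per min), assuming A(t) = A0 * exp(-k t)."""
    if not 0 < residual_fraction < 1:
        raise ValueError("residual fraction must lie strictly between 0 and 1")
    if minutes <= 0:
        raise ValueError("incubation time must be positive")
    return -math.log(residual_fraction) / minutes


def half_life(rate_constant):
    """Time (min) for activity to fall to 50% under first-order decay."""
    return math.log(2) / rate_constant


# Residual activities after 120 min at 60 °C, as reported in the text.
k_wt = first_order_rate_constant(0.10, 120)      # WT AnXynB (and S41N), about 10%
k_double = first_order_rate_constant(0.35, 120)  # S41N/T43E, 35%

half_life_wt = half_life(k_wt)
half_life_double = half_life(k_double)
```

If the measured time courses in Fig. 6b deviate from a single exponential, a full fit to all time points would be needed instead of this two-point estimate.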

Product profiling of AnXynB and its S41N/T43E variant

The enzymatic kinetics can be characterized by detecting the changes in concentration of hydrolysates over a time course using FACE (Gong et al. 2016). This method has been proved to be a simple, sensitive and relatively high-throughput technique (Kosik et al. 2012). Here, the product profiles of the WT AnXynB and its S41N/T43E variant were determined to provide insight into their mechanism of xylotetraose (X4) and xylotriose (X3) hydrolysis. Both enzymes degraded X3 only very weakly (Supplemental Fig. S6). As shown in Fig. 7, compared with WT AnXynB, the S41N/T43E mutant possessed a stronger ability to hydrolyze X4, which may contribute to the increased activity of this variant. More X2 and X3 accumulated in the product profile of the S41N/T43E enzyme on xylotetraose. Very little xylose (X1) was detected, although its amount should be equivalent to that of X3; it is possible that X1 is unstable in the system. These observations suggest that X3 accumulated through the degradation of X4 in the ‘X3 + X1’ mode, whereas xylobiose (X2) accumulated through the degradation of X4 in the ‘X2 + X2’ mode. The above experiments demonstrated that the double mutation increased the ability of the enzyme molecule to bind oligosaccharides.

Fig. 7 Time courses of the product profiles of WT (a) and S41N/T43E (c) enzymes on xylotetraose. Time intervals are marked on the top of each lane, and abbreviations of sugars in the markers (M) are as follows: X1 = xylose, X2 = xylobiose, X3 = xylotriose, X4 = xylotetraose. b, d Changes in relative abundance of xylobiose, xylotriose and xylotetraose in hydrolysis products of WT and S41N/T43E, respectively, over time

Discussion

Catalytic activity is an essential feature of industrial enzymes. The active site architecture, which makes direct contact with the ligand, is an ideal region for the design of catalytic functions (Himmel 2008). The conservation of residues is often not equally distributed over the whole active site architecture. Highly conserved residues in the active site architecture are usually those that directly mediate substrate binding and catalysis, and these are generally similar in members of an enzyme family sharing identical structure and enzymatic mechanism (Paës et al. 2012; Tian et al. 2016). By contrast, the more variable amino acid residues mediate the functional individuality of enzymes in the same family, and enzymatic properties can often be manipulated by engineering these relatively variable residues. For example, the mutation of the relatively variable amino acid 216Tyr improved the specific activity of mannanase (Huang et al. 2014). The enzyme activity of TrCel12A was also improved by engineering the relatively variable amino acid 7Trp at the − 4 subsite (Zhang et al. 2015c). In the GH11 family, there are at least two amino acids that differ at the − 3 subsite, but screening a saturation mutagenesis library at these positions (19 × 19 = 361 candidates) is laborious. The key steps are therefore the identification of suitable targets and the choice of an appropriate design direction.
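The library size quoted above can be reproduced by enumerating every double substitution at two positions. The sketch below uses positions 41 (Ser) and 43 (Thr) of AnXynB as the example; the function name is hypothetical, and the enumeration counts only variants in which both positions differ from the wild type, which is what the 19 × 19 figure implies.

```python
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def double_mutants(wt_a, pos_a, wt_b, pos_b):
    """All variants carrying a non-wild-type residue at both positions."""
    subs_a = [aa for aa in AMINO_ACIDS if aa != wt_a]  # 19 substitutions
    subs_b = [aa for aa in AMINO_ACIDS if aa != wt_b]  # 19 substitutions
    return [
        f"{wt_a}{pos_a}{a}/{wt_b}{pos_b}{b}"
        for a, b in product(subs_a, subs_b)
    ]


library = double_mutants("S", 41, "T", 43)
library_size = len(library)
```

The comparison-guided design described in this study tests only three of these candidates (S41N, T43E and S41N/T43E, the first two being single mutants outside this double-mutant set).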

Diverse natural enzymes can adjust their sequences, local structures and even protein dynamics to fit their specific function (Henzler-Wildman and Kern 2007). On the basis of molecular evolution, it becomes feasible to identify the variable amino acid residues that may enable an enzyme to catalyze reactions under particular conditions such as high temperature and alkaline solution (Lee et al. 2007). Targeting and mutating these variable amino acid residues in homologous enzymes may be an effective way to create enzyme variants with novel catalytic characteristics. Here, the GH11 xylanase AnXynB was selected as our experimental model. By probing the phylogenetic relationships of the GH11 family, we chose the homologous TlXynA as the reference model because AnXynB and TlXynA share highly similar sequences and structures but have different catalytic properties (Gong et al. 2016; Paës et al. 2012; Singh et al. 2003). The variable amino acid residues in the active site that may affect enzyme catalysis were found by performing sequence and structural alignments for these two xylanases. The influence of mutations at these variable sites on enzyme catalysis can be explored by molecular dynamics simulations, which facilitates the rational design of enzymes. Finally, the mutations at the − 3 subsite in AnXynB significantly increased its enzyme activity, which demonstrates the effectiveness of the protocol used in our experiment. Notably, this protocol may also be applicable to the protein engineering of other glycoside hydrolases, although more experiments are required to further examine the strategy.

Rational design is already widely used in protein engineering, by which the desirable features can be introduced to target sites based on protein structural analyses. Although successful approaches for improving catalytic activity were reported (Cheng et al. 2015; Huang et al. 2014), the protein engineering strategy aiming at improving catalytic activity is still challenging. The protocols used for designing and screening targets are diverse due to differences in enzyme properties and the demands of different industrial applications. For example, when attempting to improve enzyme thermostability, protein unfolding is often taken into consideration (Zhang et al. 2014). Similarly, understanding the dissociation mechanism of functional amino acid residues in the active site architecture is necessary to design novel enzyme variants active over a wider pH range (Tishkov et al. 2013). Herein, we identified a specific set of sites likely to affect protein function by comparing two closely related GH11 xylanases that possess different enzymatic properties.

The S41N mutation decreased the enzyme activity of AnXynB, whereas the T43E mutant displayed a 20% increase in enzyme activity. Interestingly, the double mutation (S41N/T43E) increased the enzyme activity by 72%. Taken together with the results of the molecular dynamics simulations, this suggests that evolutionary covariation (Hopf et al. 2017; Marks et al. 2012) of these two amino acid residues may exist in GH11 xylanases. The coevolved amino acid residues in specific sectors play an important structural and functional role in various protein families (Chakrabarti and Panchenko 2010). In the WT enzyme, 41Ser and 43Thr are located at the distal subsite of the active site and have no direct interaction with the substrate. The side chain of 43Glu in the T43E mutant may provide more hydrogen bonds because it contains a carboxyl group. 43Glu may make a direct hydrogen bond with O3 of the xylose unit at the − 2 subsite (Fig. 5b). The 41Asn and 43Glu residues in the double mutant S41N/T43E altered the conformation of 54Trp significantly, which may influence the stacking interaction between the tryptophan and the − 2 sugar, resulting in a change in the binding affinity of AnXynB. The increased binding affinity in the active site helps the enzyme bind oligosaccharides longer than xylotriose, thereby increasing its catalytic efficiency. These results suggest that the interaction network of amino acids in the active site deserves attention when designing enzyme molecules. Compared with T43E, the S41N/T43E variant has a similar binding affinity (Km) but higher product release efficiency (kcat), which partly demonstrates the importance of both substrate binding and product release in enzyme catalysis. Many previous studies improved enzyme activity by creating new hydrogen bonds or hydrophobic interactions between the enzyme and substrate, or by adding a non-catalytic carbohydrate-binding module, in order to increase the binding affinity between enzyme and substrate (Thongekkaew et al. 2013; Zhang et al. 2013). However, overly strong binding in the catalytic pocket can inhibit enzymatic activity, as seen for the higher binding energy between cellohexaose and Man5B (Bernardi et al. 2014). Payne et al. (2013) discovered that the hydrolytic activity of GH7 cellulases could be enhanced by reducing the binding energy between enzyme and substrate. Therefore, higher catalytic efficiency requires a subtle balance between substrate binding and product release (Tian et al. 2016; Xie et al. 2014). Mutation at the − 3 subsite in AnXynB changed the distribution of binding energy over the whole active site architecture, resulting in an increased ability to bind large xylosaccharides. These results provide insight that will be useful for future glycoside hydrolase engineering.

In conclusion, we rationally engineered variants of the GH11 xylanase AnXynB with higher catalytic activity and stability by a combination of computational and experimental approaches. The results of virtual mutations and molecular dynamics simulations demonstrated that amino acid residues at the − 3 subsite play an important role in substrate binding. Site-directed mutagenesis at the − 3 subsite demonstrated that the binding energy of the active site was improved, which accounted for the 72% increase in the catalytic activity of the double mutant S41N/T43E. In addition, the thermostability of the enzyme variant S41N/T43E was also improved due to the introduction of a polar interaction in the N-terminus. Our approach provides an effective strategy for establishing a “small but smart” library of sequences that can guide rational design experiments.