Introduction

Enhancing protein thermostability so that naturally occurring proteins can be applied in industrial processes has long been an important goal and remains a challenge in biotechnology. Over the past two decades, numerous thermostable enzymes have been produced by directed evolution based on random mutagenesis [6], including in our previous study [32]. As a distinct alternative, rational or computational design strategies can reduce the time and effort required to engineer protein thermostability. Many important structural features involved in protein thermostability, such as hydrophobic interactions, hydrogen bonds, salt bridges, packing, cavity filling, conformational strain, secondary structure propensity, flexibility and rigidity, have been revealed by experimental and theoretical studies [2, 5, 15, 19–21, 26, 28, 29, 31]. So far, most rational design strategies are based on a single feature, and only rarely on two [12]. Considering more features should reduce false-positive predictions and give better results. Therefore, a simple but effective strategy that combines multiple structural features rather than relying on a single factor needs to be developed.

The relationship between thermostability and the protein core is currently better understood than that between thermostability and the protein surface, which has not been adequately investigated. Many rational designs are based on protein-core modifications [1, 2, 16, 30]. However, core design often reduces enzymatic activity [1]. Recently, more attention has been paid to the protein surface, which has been found to be important for protein thermostability and is beginning to be recognized as a promising target for thermostabilization, usually without loss of enzymatic activity, especially when combined with flexibility analysis [5, 12, 14, 23]. Molecular dynamics (MD) simulation is commonly used to search for sites of higher flexibility, but most studies are limited to computational design, simulation and analysis and lack experimental validation [13, 8]. Moreover, one of the most frequently reported structural differences between mesophilic and thermophilic proteins is the number of salt bridges. Several studies have pointed to a prominent role of salt bridges in the high-temperature adaptation of proteins, and a tendency toward more salt bridges in thermophilic, and especially hyperthermophilic, proteins has frequently been reported [4, 18, 25, 27]. Thus, salt bridge interactions appear to play a considerable role in enhancing the overall stability of proteins from thermophiles.

In this work, we used Escherichia coli AppA phytase as a model. Phytases are a class of enzymes that hydrolyze phytic acid and phytate into inositol and phosphoric acid (or phosphate). They can efficiently improve the utilization of phytate phosphorus in feed and reduce the environmental pollution caused by phytate phosphorus excreted by swine and poultry [9, 24]. Here, we made a first attempt to use a three-factor rational design strategy, combining three common structural features (protein flexibility, protein surface, and salt bridges), to enhance the thermostability of AppA. This strategy is useful not only for designing thermostable proteins but also for gaining a more comprehensive understanding of the role of surface residues and salt bridges in protein thermostability. As a result, more thermostable mutants of AppA were constructed successfully without hampering catalytic activity, and their structural and enzymatic features were analyzed thoroughly. In addition, combining rational design with directed evolution is of considerable benefit for protein engineering.

Materials and methods

Materials

The crystal structure of AppA was downloaded from the Research Collaboratory for Structural Bioinformatics (RCSB; entry 1DKQ). Residues are numbered throughout this paper according to a previous study [7]. Plasmid vector pET32a, E. coli strain Origami (DE3), and E. coli Origami (DE3)–pET32a-appA were stored in the Sichuan Centre of Typical Cultures Collection (SCTCC). Restriction enzymes (EcoRI and XhoI) and PrimeSTAR HS DNA Polymerase were purchased from TaKaRa (Japan). The substrate, phytic acid sodium salt hydrate (P-0109), was purchased from Beijing Biodee Biotechnology Co., Ltd. All chemicals used in this study were of analytical grade.

Molecular dynamics (MD) simulation

The MD simulation was performed as described previously [8], except that the heating stage was run in the NPT ensemble.

Computational design of target residues

The RMSD of the whole protein and the RMSF of each residue were calculated with VMD. Mutants were designed with DeepView (Swiss-PdbViewer), and salt bridge analysis was performed with VMD using a cutoff distance of 4 Å. Hydrogen bond and van der Waals (VDW) analyses were carried out with DeepView (Swiss-PdbViewer).
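As a minimal illustration of the quantity computed here, the RMSF of an atom is the square root of its mean squared deviation from its time-averaged position over the trajectory. The sketch below is not the VMD routine used in this study; the function name rmsf and the toy coordinates are hypothetical, and the frames are assumed to have been superimposed on a reference structure beforehand.

```python
import math


def rmsf(positions):
    """Root-mean-square fluctuation (Å) of one atom over a trajectory.

    positions: list of (x, y, z) tuples, one per frame, assumed to be
    already superimposed on a reference structure.
    """
    n = len(positions)
    # Time-averaged position of the atom.
    mean = [sum(p[i] for p in positions) / n for i in range(3)]
    # Mean squared deviation from that average position.
    msd = sum(
        sum((p[i] - mean[i]) ** 2 for i in range(3)) for p in positions
    ) / n
    return math.sqrt(msd)


# Hypothetical toy trajectory: an atom oscillating by ±1 Å along x.
toy = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
print(rmsf(toy))  # 1.0
```

A per-residue value would be obtained by averaging or otherwise combining the values of the backbone atoms of that residue; the exact convention used by the analysis software is not restated here.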

Site-directed mutagenesis

Plasmid pET32a-appA was extracted from E. coli Origami (DE3)/pET32a-appA by alkaline lysis and used as the template for overlapping PCR. All primers used in this study are listed in Table 1. The first PCR was carried out at 94 °C for 45 s, 55 °C for 15 s and 72 °C for 1 min 30 s for 30 cycles. After DNA extraction of the first two fragments, the second PCR was carried out at 94 °C for 45 s, 55 °C for 15 s and 72 °C for 2 min for an initial 5 cycles to amplify the whole gene as template, and then at 94 °C for 45 s, 60 °C for 15 s and 72 °C for 2 min for 30 cycles. The final amplified fragments were digested with EcoRI and XhoI after DNA extraction and then cloned into the multiple cloning site of the pET32a vector. The mutant plasmids were chemically transformed into competent E. coli Origami (DE3) for protein expression. All mutations were verified by DNA sequencing (Invitrogen, Shanghai).

Protein expression, purification and enzyme assay

The strains expressing wild-type AppA and the mutants were cultured in 50 ml LB medium (1 % peptone, 0.5 % yeast extract, 1 % NaCl and 50 μg/ml ampicillin) at 37 °C until the OD600 reached 0.6–0.8, and then 0.3 mmol/l IPTG was added to induce expression at 28 °C overnight. The cells were harvested by centrifugation at 12,000 rpm for 10 min, suspended in 8 ml of 0.2 M NaAc–HAc buffer, pH 4.5, and disrupted by sonication. The lysate was centrifuged at 12,000 rpm for 10 min, and the supernatant was collected for purification. Purification was carried out on a BioLogic DuoFlow system (Bio-Rad, USA). Enzyme purity was verified by 10 % SDS-PAGE, and enzyme concentration was determined by the Bradford assay. The enzyme was diluted in 0.2 M NaAc–HAc buffer, pH 4.5, to an appropriate concentration, and 100 μl of the diluted enzyme solution was mixed with 900 μl of substrate solution containing 4 mM sodium phytate dissolved in 0.2 M NaAc–HAc buffer, pH 4.5. After incubation for 20 min at 37 °C, the reaction was stopped by adding 1 ml of 15 % trichloroacetic acid. Free inorganic phosphorus was measured at 750 nm after the sample was mixed with 2 ml of a solution containing 3.2 % H2SO4, 7.352 % ferrous sulfate, and 1 % ammonium molybdate and left for 10 min at room temperature. Samples were centrifuged before measurement if necessary. One phytase unit was defined as the amount of enzyme that releases 1 μmol of inorganic phosphorus from sodium phytate per minute at 37 °C.
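The unit definition above translates into a simple calculation, sketched here under stated assumptions. The function names and the example reading are hypothetical; the amount of released phosphorus is assumed to have been read from a phosphate standard curve, which is not part of this sketch, and the default arguments mirror the 20 min incubation and 100 μl enzyme volume of the assay.

```python
def phytase_units_per_ml(phosphate_umol, minutes=20.0, enzyme_ml=0.1, dilution=1.0):
    """Phytase activity in U/ml of the undiluted enzyme preparation.

    One unit releases 1 μmol of inorganic phosphorus per minute at 37 °C.
    phosphate_umol: μmol of phosphorus released in the assay tube (assumed
    to come from a phosphate standard curve, not shown here).
    dilution: factor by which the enzyme was diluted before the assay.
    """
    if minutes <= 0 or enzyme_ml <= 0:
        raise ValueError("minutes and enzyme_ml must be positive")
    return phosphate_umol / (minutes * enzyme_ml) * dilution


def specific_activity(units_per_ml, protein_mg_per_ml):
    """Specific activity in U/mg, given a Bradford protein concentration."""
    return units_per_ml / protein_mg_per_ml


# Hypothetical reading: 0.8 μmol released by 100 μl of undiluted enzyme in 20 min.
activity = phytase_units_per_ml(0.8)
print(round(activity, 3))  # 0.4
```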

Measurement of thermal stability

The purified enzymes were diluted in 0.2 M NaAc–HAc buffer, pH 4.5, to give an activity of 0.4 U/ml. The diluted enzymes were incubated at 80 °C for 5, 6.5, 7.5, 8.5 and 10 min. Immediately after the heat treatment, the enzymes were placed on ice for 30 min [10, 11]. The remaining phytase activity was measured at 37 °C and pH 4.5, as described above.

pH profile, temperature optimum and kinetic parameters

The pH profile of phytase was determined at 37 °C using three different buffers to adjust the substrate: 0.2 M glycine–HCl buffer for pH 1.5–3.5, 0.2 M NaAc–HAc buffer for pH 4.0–5.5, and 0.2 M Tris–HCl buffer for pH 6.5 and pH 7.5. The purified enzymes were diluted with each buffer of different pH to give an activity of 0.4 U/ml. The optimal temperature was determined in 0.2 M NaAc–HAc buffer, pH 4.5 at 37, 50, 55, 60, 65, 70, and 80 °C.

The kinetic parameters (Km and Vmax) were determined at 37 °C using the Michaelis–Menten equation. Initial velocity was measured with substrate concentrations in the range of 0.05–1 mmol/l and the kinetic parameters were calculated by nonlinear fitting using Origin 7.5. All kinetic parameters presented in this study are the mean values derived from triplicate measurements.
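For readers without access to Origin, the nonlinear fit of the Michaelis–Menten equation, v = Vmax·[S]/(Km + [S]), can be illustrated with a small pure-Python sketch. This is not the fitting routine used in this study: it is a simple least-squares search that exploits the fact that, for any fixed Km, the best Vmax has a closed form. The function name, the search bounds and the synthetic data are all assumptions made for illustration.

```python
def fit_michaelis_menten(substrate, velocity, km_lo=1e-3, km_hi=10.0, rounds=12):
    """Least-squares fit of v = Vmax * S / (Km + S); returns (Km, Vmax).

    For a fixed Km the model is linear in Vmax, so the best Vmax is
    computed directly and only Km is searched, on a grid that is refined
    around the best point in every round.
    """
    def vmax_and_sse(km):
        f = [s / (km + s) for s in substrate]
        vmax = sum(v * fi for v, fi in zip(velocity, f)) / sum(fi * fi for fi in f)
        sse = sum((v - vmax * fi) ** 2 for v, fi in zip(velocity, f))
        return vmax, sse

    lo, hi = km_lo, km_hi
    km = lo
    for _ in range(rounds):
        step = (hi - lo) / 20
        grid = [lo + i * step for i in range(21)]
        # Keep the grid point with the smallest sum of squared residuals.
        km = min(grid, key=lambda k: vmax_and_sse(k)[1])
        lo, hi = max(km_lo, km - step), min(km_hi, km + step)
    return km, vmax_and_sse(km)[0]


# Synthetic, noise-free data generated with Km = 0.2 mmol/l and Vmax = 1.0
# (arbitrary units) over the 0.05 to 1 mmol/l range used in the assay.
S = [0.05, 0.1, 0.2, 0.5, 1.0]
v = [1.0 * s / (0.2 + s) for s in S]
km, vmax = fit_michaelis_menten(S, v)
print(round(km, 4), round(vmax, 4))
```

With real, noisy triplicate data a dedicated curve-fitting package would normally be preferred, since it also reports parameter uncertainties.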

Results

Selection and computational design of target residues

Protein flexibility and protein surface were adopted as the first two structural features of the rational design. We used the RMSF values of the residues, rather than the B-factors of the crystal structure, as the indicator of protein flexibility for identifying thermolabile residues. The RMSF of the backbone atoms of each residue was calculated from an MD simulation at 310 K (Fig. 1). The thermally unstable residues are usually located in turns or coils that form loops. Lys usually has a higher RMSF value because of its long side chain; for example, the large peak around position 100 is composed of Lys96 and Lys97, and two other Lys residues with high RMSF values contribute to another large peak around position 205. Because the protein surface has fewer specific interactions, surface residues are safer candidates for engineering thermostability without hampering catalytic activity or other enzyme features. We therefore searched for thermally unstable surface residues whose RMSF values are above 2 Å and which are more than 4 Å away from the AppA functional sites, and took them as candidate residues for further screening.
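The screening criteria described above (surface exposure, RMSF above 2 Å, more than 4 Å from the functional sites) amount to a simple filter, sketched below. The Residue record, its field names and the example entries are hypothetical placeholders and are not data from this study; how surface exposure and the distance to the functional sites are obtained is left outside the sketch.

```python
from dataclasses import dataclass


@dataclass
class Residue:
    name: str                   # placeholder label, e.g. "resA"
    rmsf: float                 # Å, from the MD trajectory
    on_surface: bool            # surface exposure, determined elsewhere
    dist_to_active_site: float  # Å, shortest distance to the functional sites


def select_candidates(residues, rmsf_cutoff=2.0, min_distance=4.0):
    """Return the names of flexible surface residues far from the functional sites."""
    return [
        r.name
        for r in residues
        if r.on_surface and r.rmsf > rmsf_cutoff and r.dist_to_active_site > min_distance
    ]


# Hypothetical records, for illustration only.
example = [
    Residue("resA", 2.4, True, 12.0),   # passes all three criteria
    Residue("resB", 2.6, False, 15.0),  # buried, so rejected
    Residue("resC", 1.2, True, 20.0),   # too rigid
    Residue("resD", 3.0, True, 3.5),    # too close to the functional sites
]
print(select_candidates(example))  # ['resA']
```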

Fig. 1

The RMSF value of each residue of AppA from an MD simulation at 310 K, calculated with VMD

Salt bridges were adopted as the third structural feature for designing mutations. Because interactions between charged residues contribute to overall stability, engineering electrostatic interactions by mutating uncharged surface residues can minimize unwanted effects, such as disruption of structure or function, that often accompany protein mutation. Therefore, uncharged residues in thermally unstable regions were mutated to form salt bridges. To minimize side effects of the mutations, we also used polarity-conserving substitutions that keep the native hydrophilicity or hydrophobicity. N161R, Q182E, Q206E, Q307D, Y311K and Q380R were designed through the above rational strategy, and their locations in AppA are shown in Fig. 2. The RMSF values of the target residues are 2.36, 2.27, 2.19, 1.7, 1.89 and 2.02 Å, respectively (Table 1). The RMSF values of Gln307 and Tyr311 (1.7 and 1.89 Å, respectively) are close to the 2 Å cutoff, and after mutagenesis these residues form salt bridges with His304 and Asp144, which have higher RMSF values (3.22 and 2.21 Å, respectively). Furthermore, both are in a thermally unstable region formed by residues 304–312. The native salt bridges and the new salt bridges formed by mutation are listed in Table 2. AppA has 12 native salt bridges. Asp384-Arg380 has the shortest distance (2.87 Å) among the newly formed salt bridges, shorter than that of any native salt bridge. The longest newly formed salt bridge, Asp307-His304 (4.1 Å), is still shorter than the longest native salt bridge, Glu31-Arg359 (5.18 Å).
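A distance-based salt bridge check of the kind used here can be sketched as follows. This is only an illustration of the general idea (the shortest distance between the side-chain oxygens of an acidic residue and the side-chain nitrogens of a basic residue compared with a cutoff), not the VMD analysis itself; the function names and the coordinates are hypothetical, and math.dist requires Python 3.8 or later.

```python
import math


def min_distance(atoms_a, atoms_b):
    """Shortest distance (Å) between any atom in atoms_a and any atom in atoms_b."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def is_salt_bridge(acid_oxygens, base_nitrogens, cutoff=4.0):
    """True if any acidic oxygen lies within `cutoff` Å of any basic nitrogen."""
    return min_distance(acid_oxygens, base_nitrogens) <= cutoff


# Hypothetical coordinates (Å), for illustration only.
carboxylate_oxygens = [(0.0, 0.0, 0.0), (2.2, 0.0, 0.0)]
near_nitrogens = [(0.0, 3.0, 0.0)]  # 3.0 Å from the first oxygen
far_nitrogens = [(0.0, 6.0, 0.0)]   # 6.0 Å from the first oxygen

print(is_salt_bridge(carboxylate_oxygens, near_nitrogens))  # True
print(is_salt_bridge(carboxylate_oxygens, far_nitrogens))   # False
```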

Fig. 2

The location of the target residues in AppA. The target residues are labeled in yellow. Purple represents α-helix; blue represents 3₁₀-helix; yellow represents β-sheet; cyan represents turn and white represents coil. Drawn with VMD

Table 1 Mutation primers and the RMSF of the target residues
Table 2 Salt bridge analysis for AppA and the mutants, performed with VMD using a cutoff distance of 4 Å

Thermal stability of single mutants

After heating at 80 °C for 7.5 min, Q206E and Y311K retained 59.08 and 57.52 % of their activity, respectively, which is 9.38 and 7.82 percentage points more than the wild-type (49.7 %), whereas the other mutants showed thermostability similar to that of the wild-type. I427L, a thermostable mutant obtained by directed evolution in our previous study, retained 62.86 % of its activity, 13.16 percentage points more than the wild-type (Fig. 3). Our strategy thus gave a relatively high prediction success rate of 40 %, with two of five mutants showing improved thermostability. Finally, Q206E, Y311K and I427L were chosen for combination into multiple mutants.

Fig. 3

The thermostability of wild-type AppA and the single mutants after being heated at 80 °C for 7.5 min. I427L is a mutant we obtained by directed evolution. Phytase activity without being heated was defined as 100 %. All data presented here are the mean values derived from triplicate measurements

Thermostability after combining thermostable single mutations

The wild-type AppA phytase, three double mutants (Q206E/Y311K, Q206E/I427L and Y311K/I427L) and a triple mutant (Q206E/Y311K/I427L) were heated at 80 °C for 5, 6.5, 7.5, 8.5 and 10 min. As the heating time was extended, phytase activity decreased gradually. Contrary to our expectation, the triple mutant did not show the highest thermostability. The thermostability of the multiple mutants decreased in the following order: Y311K/I427L > Q206E/I427L > Q206E/Y311K/I427L > Q206E/Y311K ≈ wild-type; their residual activities after heating at 80 °C for 10 min were 61.7, 49.48, 41.05, 32.48 and 30.97 %, respectively. The half-lives of the wild-type AppA phytase, I427L, Q206E/Y311K, Q206E/I427L, Y311K/I427L and Q206E/Y311K/I427L at 80 °C were 7.74, 8.95, 7.77, 9.93, 11.86 and 8.94 min, respectively. These results show that thermostability can be improved further by combining other mutations with I427L. All the multiple mutants showed higher thermostability than the single mutants except Q206E/Y311K/I427L and Q206E/Y311K (equivalent to the single mutants and the wild-type, respectively). The residual activity of Y311K/I427L, the most thermostable mutant, was about 30 percentage points higher than that of the wild-type (Fig. 4).
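The text does not state how the half-lives were derived from the residual activities. One common approach, shown here purely as an assumption, is to model thermal inactivation as a first-order process, A(t) = A0·exp(−kt), fit ln(A) against time by linear least squares, and report t1/2 = ln 2 / k. The function name and the synthetic data below are hypothetical, and this sketch should not be expected to reproduce the values reported above.

```python
import math


def first_order_half_life(times, residual_percent):
    """Fit ln(A) = ln(A0) - k*t by least squares; return (k, ln(2)/k)."""
    y = [math.log(a) for a in residual_percent]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (yi - y_mean) for t, yi in zip(times, y))
    k = -sxy / sxx  # the inactivation rate constant is minus the slope
    return k, math.log(2) / k


# Synthetic data: the heating times of the assay with an assumed k of 0.1 per min.
times = [5, 6.5, 7.5, 8.5, 10]
synthetic = [100 * math.exp(-0.1 * t) for t in times]
k, t_half = first_order_half_life(times, synthetic)
print(round(k, 3), round(t_half, 2))  # 0.1 6.93
```

If inactivation shows a lag phase or is not first-order, a different model, or interpolation of the time at which 50 % activity remains, would be more appropriate.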

Fig. 4

a The thermostability of wild-type AppA and the multiple mutants after being heated at 80 °C for different times. Phytase activity without being heated was defined as 100 %. All data presented here are the mean values derived from triplicate measurements. b The pH optimum of wild-type AppA, Q206E/I427L and Y311K/I427L. Phytase activity at the optimum pH was defined as 100 %. All data presented here are the mean values derived from triplicate measurements. c The temperature optimum of wild-type AppA, Q206E/I427L and Y311K/I427L. Phytase activity at the optimum temperature was defined as 100 %. All data presented here are the mean values derived from triplicate measurements

Other features of AppA phytase

The pH and temperature optima were determined for the wild-type AppA phytase, Q206E/I427L and Y311K/I427L (Fig. 4). The pH and temperature optima of the three AppA phytases were unchanged: the optimum pH was 4.5 and the optimum temperature was 60 °C. AppA had two pH peaks, at pH 2.5 and 4.5, and exhibited the highest activity at pH 4.5, in agreement with a previous study [32]. AppA activity decreased sharply on both sides of the optimum pH of 4.5; it retained about 30 % of its optimum activity at pH 1.5 but lost almost all activity at pH 7.5. The optimum temperature was also the same as previously reported [32]. AppA activity also decreased on both sides of the optimum temperature of 60 °C, more sharply at higher temperatures; it retained about 35 % of its optimum activity at 37 °C but lost almost all activity at 80 °C.

The kinetic parameters and specific activities of the thermostable mutants are shown in Table 3. They also reveal the synergy between Y311K and I427L: when combined, the two mutations further reduced Vmax and Km and further increased the catalytic efficiency. Y311K/I427L, the most thermostable mutant, had the lowest Km, indicating the highest affinity for the substrate, and the highest catalytic efficiency. There was no obvious difference in specific activity among the three AppA phytases, except that Y311K/I427L showed a slight decrease relative to the wild-type.

Table 3 Kinetic parameters and specific activities of thermostable mutants

Discussion

Here, to overcome the limitations of rational design based on a single structural feature, we employed more prudent rational criteria by combining three common structural features.

Protein flexibility and protein surface were adopted as the first two structural features of the rational design. Protein flexibility can be used to enhance protein thermostability by increasing the rigidity of flexible residues. Highly flexible residues could trigger protein unfolding because of their large thermal fluctuations; thus, B-factors can be used to search for thermolabile residues [22]. Molecular dynamics simulation predicts the flexible motion of proteins and can therefore provide more information about protein flexibility than the B-factors of a static structure, so we used the RMSF values of the residues, rather than the B-factors of the crystal structure, as the indicator of protein flexibility for identifying thermolabile residues. The RMSF measures the mobility of an atom during the MD trajectory; higher RMSF values indicate higher mobility and lower RMSF values indicate restricted mobility. Salt bridges were adopted as the third structural feature for designing mutations. Most thermophilic proteins tend to have more salt bridges, which enhance protein thermostability by reducing the heat capacity change of unfolding [3]. Because the dielectric constant of water decreases with increasing temperature, electrostatic interactions become particularly important for thermostability.

Contrary to our expectation, the triple mutant did not show the highest thermostability. The thermostability of the multiple mutants decreased in the following order: Y311K/I427L > Q206E/I427L > Q206E/Y311K/I427L > Q206E/Y311K ≈ wild-type, and the thermostability of Q206E/Y311K/I427L was equivalent to that of the single mutants. Interestingly, Q206E and Y311K appear to neutralize each other’s effect, as shown by both Q206E/Y311K/I427L and Q206E/Y311K. This may be because the two mutation sites together affect the overall structure so that no additional stabilizing local structure is formed, but the mechanism still needs to be studied. The two mutation sites Y311K and I427L both belong to the α/β-domain of AppA and lie closer to the C-terminus. Their synergy may stabilize the local structure more efficiently. Given the key role of this region in the thermostability of AppA, we conjecture that AppA begins to collapse from the C-terminus at higher temperatures.

The synergy between Y311K and I427L was also evident in the kinetic parameters. When combined, the two mutations further reduced Vmax and Km and further increased the catalytic efficiency. Notably, the α/β-domain, where both the Y311K and I427L mutations are located, is involved in substrate binding [17]. This may explain the observation above to some extent.

In summary, thermostable variants of AppA were successfully designed without compromising catalytic activity by a computational design combining three common structural features: protein flexibility, protein surface, and salt bridges. Among the thermostable multiple mutants constructed, Y311K/I427L showed about twofold higher thermostability than the wild-type and had the highest catalytic efficiency. However, further consideration of enzyme-specific interaction networks will be necessary to reduce false-positive predictions. Moreover, additional studies, such as on the role of the α/β-domain and the C-terminus of AppA in protein thermostability, the mechanism of the positive synergy between Y311K and I427L, the mechanism of the negative synergy between Q206E and Y311K, the relationship between the number of salt bridges and thermostability, or further computational design of the multiple mutants, will provide an opportunity to gain further knowledge about protein thermostability and to search for more thermostable sequences that were not discovered by our strategy.

This study elucidated the importance of protein flexibility, protein surface, and salt bridges for thermostability, and showed that our multi-factor rational design strategy can be applied in practice as a thermostabilization strategy in place of the conventional single-factor approach. In addition, combining rational design with directed evolution is also worth considering for protein engineering.