Introduction

Various enzymes have been commercially explored and are currently widely used in many industries such as textiles, leather, paper, food, feed, detergents, and pharmaceuticals. Important industrial enzymes include alkaline polygalacturonate lyase (used in the textile industry), trypsin (used in the leather industry), xylanase (used in paper-making), lipoxygenase (a food enzyme), phytase (a feed enzyme), alkaline α-amylase (a detergent enzyme), and hyaluronidase (a pharmaceutical enzyme), among others (Bhavsar et al. 2013; Juturu and Wu 2012; Ling et al. 2013; Lu et al. 2013; Plagemann et al. 2013; Sahoo et al. 2008; Wang et al. 2010; Yang et al. 2011).

Microorganisms have become prominent sources of enzymes owing to advantages such as easy culture, wide sourcing, and diversity. Microorganisms that are used to produce industrial enzymes are mainly obtained through screening from natural environments. However, because harsh conditions such as high temperature, strong acid, strong alkali, high salinity, and solvent toxicity occur during industrial processes, enzymes that retain high catalytic efficiency and high stability under such conditions are required. Currently, the main method for obtaining stable proteins is to screen microorganisms from extreme environments, although this process is difficult. Meanwhile, since some microorganisms, particularly those from extreme environments, cannot be cultivated in the laboratory, new enzymes for industrial processes are often found by a metagenomic approach (Martinez-Martinez et al. 2013; Zheng et al. 2013). Some methods such as immobilization and embedding can improve the application performance of enzymes to a certain extent, but these techniques provide only limited improvement in the catalytic performance of natural enzymes. Therefore, molecular engineering methods such as directed evolution and site-directed mutagenesis have been developed to optimize the characteristics of natural enzymes (Table 1).

Table 1 The effect of molecular engineering on the properties of different industrial enzymes

Molecular engineering can improve catalytic performance by changing the structure of certain proteins (Hida et al. 2007) and is an important technology for deepening the basic understanding of the relationships between enzyme structure and function. Commonly used engineering strategies include directed evolution, site-directed mutagenesis, terminal fusion, and truncation (Fig. 1). As listed in Table 2, each engineering approach has advantages and disadvantages. This review summarizes recent advances in the molecular engineering of industrial enzymes and discusses developing trends in this field.

Fig. 1

Molecular engineering methods and procedures to improve the catalytic performance of industrial enzymes. This figure shows a flow diagram of the molecular engineering of industrial enzymes whose structure is either known or unknown. “Known structure” means that structural information is available from reported experimental data (e.g., X-ray or NMR) or from homology modeling. “Unknown structure” means that structural information cannot be obtained from reported experimental data (e.g., X-ray or NMR) or from homology modeling.

Table 2 The advantages and disadvantages of different molecular engineering methods

Molecular engineering strategies

Directed evolution

Directed evolution is a powerful technique that has been widely adopted as the most practical and efficient means of modifying enzymes to improve catalytic performance (Hida et al. 2007). By combining random mutagenesis via error-prone polymerase chain reaction (ep-PCR), DNA shuffling, or the staggered extension process (StEP) with appropriate high-throughput screening or selection methods to mimic natural evolution, this strategy has yielded mutants exhibiting desirable properties such as enhanced enzymatic activity, improved environmental durability, and even novel catalytic activities distinct from those of the parent enzymes (Cobb et al. 2013a, 2013b; Zhao and Zha 2006). Random mutagenesis was performed to enhance the activity of a monooxygenase from Geobacillus thermodenitrificans NG80-2, and the hydroxylation activity of the mutants was 2- to 3.4-fold higher than that of the wild-type enzyme (Dong et al. 2012). The activity of a transglutaminase from Streptomyces mobaraensis was improved using random mutagenesis, and mutant S199A, with an additional N-terminal tetrapeptide, showed the highest specific activity (1.7 times higher than that of the wild type) in a library of 24,000 mutants (Yokoyama et al. 2010). The StEP method has been used to raise the optimum temperature of Bacillus subtilis subtilisin E by 17 °C relative to that of the wild type (Zhao and Zha 2006; Zhao and Arnold 1999). The combination of directed evolution and rational design can further improve the properties of enzymes. For example, after directed evolution and rational design, glutaminyl-transfer RNA synthetase mutant (M110) acylates glutaminyl 4-fold more efficiently than it does glutamate and hydrolyzes adenosine triphosphate 2.5-fold faster in the presence of glutamate compared with glutaminyl (Guo et al. 2012).
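The random mutagenesis step of directed evolution can be pictured with a small simulation. The following is only a minimal sketch, assuming a uniform per-base substitution probability (real ep-PCR has biased mutation spectra); the parent sequence, the function names, and the rate are hypothetical and chosen for illustration.

```python
import random

BASES = "ACGT"

def error_prone_copy(gene, rate, rng):
    """Copy a DNA string; each base is swapped for a different base with probability `rate`."""
    copied = []
    for base in gene:
        if rng.random() < rate:
            # Assumption: the three alternative bases are equally likely.
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)

def make_library(gene, size, rate, seed=0):
    """Build a mutant library of `size` variants from one parent gene."""
    rng = random.Random(seed)
    return [error_prone_copy(gene, rate, rng) for _ in range(size)]

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

parent = "ATGGCTAGCAAAGGTGAAGAA"  # made-up sequence, not a real gene
library = make_library(parent, size=5, rate=0.05)
for variant in library:
    print(variant, hamming(parent, variant))
```

In an actual campaign, each variant would then be expressed and passed through the high-throughput screen; the sketch stops at library construction.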

The architecture of protein domains has been proposed to have evolved via the combinatorial assembly or exchange of pre-existing polypeptide segments or both. The recombined modular units may be simple secondary structural elements or larger subdomain fragments. This process can result from exon shuffling, nonhomologous recombination, or alternative splicing and can be simulated by selecting folded proteins from combinatorial libraries of shuffled secondary structure elements (Urvoas et al. 2012). To improve the enzymatic activity of Bacillus pumilus lipases, DNA shuffling was applied to two lipase genes from local B. pumilus isolates. Using a high-throughput activity assay after DNA shuffling, a chimeric mutant (L3-3) was selected that carried two crossover positions and three point mutations and had a specific activity 6.4 and 8.2 times higher than that of the two parent enzymes (Akbulut et al. 2013). However, although directed evolution is a practical and efficient method of improving the properties of enzymes, a trade-off between the targeted property and other essential properties often exists, which limits the efficiency of this method. After random mutagenesis of the tyrosinase gene from Ralstonia solanacearum, mutant RV145 exhibited a 3-fold higher monophenolase/diphenolase activity ratio for D-tyrosine, but the kcat value for L-tyrosine decreased compared with that of the wild type (Molloy et al. 2013).

Site-directed mutagenesis

Site-directed mutagenesis is an important method for the modification of enzyme genes and is an invaluable tool to study the structural and functional properties of a protein. Site-directed mutagenesis is based on analyses of the structure, function, catalytic mechanism, and catalytic residues of enzymes. Structural analysis using bioinformatics methods is important for site-directed mutagenesis, which includes single and combinatorial mutations. To expedite and simplify methods for mutagenesis, single site-directed mutagenesis and multiple mutations have been recommended (Hsieh and Vaisvila 2013). For example, after site-directed mutagenesis to replace cysteine residue 22 with alanine in Phi-class glutathione S-transferase F3 from Oryza sativa, the Km value of the mutant C22A was approximately 2.2-fold that of the wild type (Jo et al. 2012). When the basic histidine residues His275, His293, and His310 of α-amylase from B. subtilis were all replaced with aspartic acid via site-directed mutagenesis, the kcat/Km value of mutant H275/293/310D increased by 16.7-fold compared with that of the wild type (Yang et al. 2013a). The combination of site-directed mutagenesis and other methods can significantly improve the properties of enzymes. For example, a thermostabilization strategy combining site-directed mutagenesis and calcium ion addition markedly improved the yield (30.6 % increase) of maltose-binding protein-fused HepI from recombinant Escherichia coli (Chen et al. 2013).
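Mutant names such as C22A encode the wild-type residue, the 1-based position, and the replacement residue. As an illustration only, the sketch below applies such names to a protein sequence in silico and checks that the stated wild-type residue is really present; the toy sequence and the function name are hypothetical, not taken from the cited studies.

```python
import re

def apply_mutation(protein, mutation):
    """Apply a point mutation written as <wild type><position><new>, e.g. 'C22A'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation name: {mutation}")
    wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    if protein[position - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at position {position}")
    return protein[:position - 1] + new + protein[position:]

toy_protein = "MKTCHLH"  # made-up 7-residue sequence
single = apply_mutation(toy_protein, "C4A")

# A combined mutant is built by applying the single substitutions in turn.
combined = toy_protein
for name in ["H5D", "H7D"]:
    combined = apply_mutation(combined, name)

print(single, combined)
```

The wild-type check guards against off-by-one numbering errors, which are easy to make when a signal peptide or tag shifts residue numbers.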

Saturation mutagenesis

Site-directed saturation mutagenesis is a unique method for rapid laboratory evolution of proteins whereby the amino acid at a targeted position is replaced with each of the other 19 naturally occurring amino acids. Saturation mutagenesis is performed at “hotspots” of enzymes, and variants with single amino acid changes show improved thermostability or catalytic efficiency (Chen et al. 2012). After the site saturation mutagenesis of tyrosine 195, tyrosine 260, and glutamine 265 in cyclodextrin glycosyltransferase from Paenibacillus macerans, the mutants Y195S, Y260R, and Q265K produced higher 2-O-D-glucopyranosyl-L-ascorbic acid yields than those of the wild type (Han et al. 2013). Saturation mutagenesis of cutinase from Fusarium solani pisi identified 24 sites at which amino acid replacement resulted in an approximate 2- to 11-fold increase in stability compared with that of the wild-type enzyme (Brissos et al. 2008). Wang et al. (2012) have suggested a new method called combinatorial coevolving-site saturation mutagenesis, in which the functionally correlated variation sites of proteins are chosen as the hotspot sites at which to construct focused mutant libraries. This approach identified novel beneficial mutation sites and enhanced the thermostability of wild-type α-amylase from B. subtilis CN7 by 8 °C. Buettner et al. (2012) improved the thermostability of microbial transglutaminase of S. mobaraensis via saturation mutagenesis and DNA shuffling, and the mutant S23V-Y24N-K294L exhibited a 12-fold higher half-life at 60 °C and a 10-fold higher half-life at 50 °C compared to the unmodified recombinant wild-type enzyme.
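At the protein level, saturating one site yields 19 substitution variants, and randomizing several sites together multiplies the possibilities. The sketch below enumerates both cases for a made-up sequence; the names are hypothetical, and codon degeneracy, which determines how many clones must actually be screened, is deliberately ignored.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

def saturate_site(protein, position):
    """Return {mutation name: sequence} for the 19 substitutions at a 1-based position."""
    wild_type = protein[position - 1]
    return {
        f"{wild_type}{position}{aa}": protein[:position - 1] + aa + protein[position:]
        for aa in AMINO_ACIDS
        if aa != wild_type
    }

def combined_library_size(n_sites):
    """Amino acid combinations when n_sites are randomized simultaneously."""
    return len(AMINO_ACIDS) ** n_sites

toy_protein = "MKYQL"  # made-up sequence with a tyrosine at position 3
variants = saturate_site(toy_protein, 3)
print(len(variants), variants["Y3S"])
print(combined_library_size(3))
```

The steep growth of the combined library size is one reason focused libraries at a few well-chosen hotspots are attractive.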

Methodology development in the quest to make laboratory evolution more efficient and therefore faster is currently an important focus of research (Johannes and Zhao 2006; Lutz and Patrick 2004; Reetz et al. 2006). The challenge is to maximize the quality of mutant libraries—defined in terms of the frequency of superior mutants (hits) in a given library—and the degree of catalyst improvement (Reetz et al. 2008). High-quality libraries require less screening effort (Reetz et al. 2008), which is the bottleneck of laboratory evolution (Boersma et al. 2007; Bradley et al. 2011; Johannes and Zhao 2006).

Born from the credo “quality, not quantity” (Lutz and Patrick 2004), iterative saturation mutagenesis (ISM; see Table 2) first randomizes appropriate sites in the protein comprising one or more amino acid positions through formation of focused libraries (Bougioukou et al. 2009; Reetz and Carballeira 2007; Reetz et al. 2008). ISM involves (1) randomized mutation of appropriate sites of one or more residues; (2) screening of the initial mutant libraries for properties such as catalytic efficiency, stereoselectivity, and thermal robustness; (3) use of the best hit in a given library as a template for saturation mutagenesis at other sites; and (4) continuation of the process until the desired degree of enzyme improvement has been reached (Gumulya et al. 2012). Stereoselectivity, substrate acceptance (rate), and thermostability can be investigated, and the criteria for choosing the proper randomization sites differ according to the catalytic property under study (Bougioukou et al. 2009; Reetz and Carballeira 2007). In addressing stereoselectivity, substrate scope, or both, sites lining the complete binding pocket are considered with a method known as the combinatorial active-site saturation test (Reetz and Carballeira 2007; Reetz et al. 2005). Because only small mutant libraries in the range of 100–3,000 transformants are generally required, the screening effort is minimized. For example, the efficacy of ISM has been rigorously tested by applying it to the previously most systematically studied enzyme in directed evolution: the lipase from Pseudomonas aeruginosa as a catalyst in the stereoselective hydrolytic kinetic resolution of a chiral ester. After screening only 10,000 transformants, enantioselectivity increased from 1.1 S in the wild type to 594 S in the mutant Met16Ala/Leu17Phe/Leu162Asn (1B2). ISM has proven considerably more efficient than all previous systematic efforts using ep-PCR with various mutation rates, saturation mutagenesis at hot spots, or DNA shuffling, with pronounced positive epistatic effects being the underlying reason (Reetz et al. 2010).
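The four ISM steps listed above can be sketched as a simple loop. This is a toy illustration under strong assumptions: each site is a single residue, the sites are visited in one fixed order (a single ISM pathway), and the "screen" is a hypothetical scoring function that counts matches to an invented target sequence rather than a real assay.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def saturate(template, site):
    """Focused library: every residue at one 0-based site of the template."""
    return [template[:site] + aa + template[site + 1:] for aa in AMINO_ACIDS]

def iterative_saturation(parent, sites, score):
    """Visit the sites in order, carrying the best hit forward as the next template."""
    best = parent
    pathway = []
    for site in sites:
        library = saturate(best, site)       # step 1: randomize the site
        hit = max(library, key=score)        # step 2: screen the library
        if score(hit) > score(best):
            best = hit                       # step 3: best hit becomes the template
        pathway.append(best)
    return best, pathway                     # step 4 would repeat until satisfied

TARGET = "MAFLN"  # hypothetical optimum, used only to define the toy score

def toy_score(sequence):
    """Stand-in for an assay: number of positions matching TARGET."""
    return sum(a == b for a, b in zip(sequence, TARGET))

best, pathway = iterative_saturation("MMLLL", sites=[1, 2, 4], score=toy_score)
print(best, pathway)
```

Because the toy score is purely additive, the visiting order does not matter here; with the epistatic effects reported for real enzymes, different pathways can give different outcomes.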

Truncation

Some domains of enzyme proteins are unnecessary for enzyme activity, and therefore, random or directed truncation has been used to improve the expression/yield or change the properties of enzymes. Truncation includes site-directed truncation, through which truncated enzymes can be directly obtained, and random truncation, through which a truncation library is obtained and the mutants with optimum properties are screened. After truncation, the endo-dextranase mutant TM-NCGΔ from Streptococcus mutans ATCC 25175 exhibited hydrolytic activity on 0.4 % dextran T2000 that was similar to that of SmDex90 and displayed 1.4- or 2.0-fold increased activity on 0.05 % dextran T2000 or T10, respectively, and 1.6- to 2.4-fold increased activity with the small substrates CI-18, pNP-IG3, and pNP-IG4 (Kim et al. 2011). After C-terminus truncation, the half-life of endo-beta-glucanase mutant Eg1330 at 65 °C was 3-fold that of the wild-type enzyme from B. subtilis JA18 (Wang et al. 2009). The combination of truncation with other molecular engineering strategies can further improve the properties of enzymes. For example, seven N-terminal residues of Streptomyces hygroscopicus transglutaminase were deleted, and the fifth residue (E5) in the N-terminus was substituted with 19 other amino acids via saturation mutagenesis. Through this combination of truncation with site-directed mutagenesis, a transglutaminase mutant E5A exhibited a 1.85-fold higher specific activity and a 2.7-fold longer half-life at 50 °C compared to the wild-type enzyme (Chen et al. 2012).
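A truncation library can be enumerated in silico before any cloning is planned. The following sketch lists every combination of N- and C-terminal deletions up to chosen limits; the sequence and function name are hypothetical, and practical details such as keeping a start codon are not handled.

```python
def truncation_library(protein, max_n, max_c):
    """Return {(n_removed, c_removed): sequence} for all terminal truncations within the limits."""
    variants = {}
    for n in range(max_n + 1):
        for c in range(max_c + 1):
            if n == 0 and c == 0:
                continue  # skip the untruncated parent
            if n + c >= len(protein):
                continue  # nothing would be left
            variants[(n, c)] = protein[n:len(protein) - c]
    return variants

toy_protein = "MDSEAKTLQ"  # made-up 9-residue sequence
library = truncation_library(toy_protein, max_n=2, max_c=1)
for (n, c), sequence in sorted(library.items()):
    print(n, c, sequence)
```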

Fusion

Recently, the construction of new “chimeric enzymes” with improved catalytic quality (e.g., catalytic activity, thermostability, substrate specificity, or product selectivity) has become a novel and effective method for engineering enzymes. Most chimeric enzymes are constructed by fusing the catalytic domain and substrate binding domain from different enzymes. Carbohydrate-active enzymes, for example, often contain two separate modules, a catalytic module and a carbohydrate-binding module (CBM), that are discrete structural and functional units usually connected by a flexible linker. A CBM is defined as a contiguous amino acid sequence from a carbohydrate-active enzyme that folds as a separate domain and shows carbohydrate-binding capability (Christiansen et al. 2009). There are 54 defined families of CBMs based on amino acid similarities (http://www.cazy.org/fam/accCBM.html) (Han et al. 2013; Linke et al. 2012), and these CBMs display substantial variation in ligand specificity. Different CBMs therefore usually recognize different carbohydrates, such as crystalline cellulose, noncrystalline cellulose, L-rhamnose, chitin, β-1,3-glucans, β-1,3–1,4 mixed-linkage glucans, xylan, mannan, galactan, and starch (Fujimoto et al. 2013). CBMs are usually fused with other enzymes to create new chimeric enzymes with improved catalytic quality. For instance, the CBM from Thermotoga neapolitana has been fused with a family-10 xylanase from Bacillus halodurans S7 to enhance hydrolytic efficiency on insoluble xylan (Mamo et al. 2007). Kittur et al. (2003) created a chimeric xylanase by fusing a family-2b CBM from Streptomyces thermoviolaceus STX-II to the C-terminus of XynB, a thermostable and single-domain family-10 xylanase from Thermotoga maritima, to increase its catalytic activity. CBM1 from Trichoderma reesei CBHII and CBM6 from Clostridium stercorarium xylanase A have been fused with Cel5A from T. maritima. Both CBM-engineered Cel5A chimeras showed approximately 14- to 18-fold higher hydrolytic activity toward Avicel (Mahadevan et al. 2008). Zhang et al. (2010) fused the CBMs from Thermobifida fusca cellulase Cel6A and Cellulomonas fimi cellulase CenA to the C-terminus of T. fusca cutinase and improved scouring efficiency. To enhance the soluble starch transformation efficiency for 2-O-D-glucopyranosyl-L-ascorbic acid production, Han et al. (2013) created a chimeric CGTase by fusing the CBM from Alkalimonas amylolytica α-amylase to the C-terminus of CGTase, improving the titer of 2-O-D-glucopyranosyl-L-ascorbic acid from soluble starch 5-fold compared with that of the wild-type CGTase.

In addition, other genes or oligopeptides with effects on functional and structural characteristics can be used to construct new chimeric enzymes with multiple activities and high stability. For example, to improve the thermostability and catalytic activity of Aspergillus niger xylanase A (AnxA), Sun et al. (2005) substituted the AnxA N-terminus with the corresponding region of Thermomonospora fusca xylanase A, enhancing the thermostability and catalytic activity of AnxA. The fusion alkaline α-amylase containing peptide AEAEAKAKAEAEAKAK exhibited improved catalytic efficiency, alkaline stability, thermal stability, and oxidative stability compared to the wild-type enzyme from A. amylolytica (Yang et al. 2013c). Moreover, six self-assembling amphipathic peptides were individually fused to the N-terminus of lipoxygenase from P. aeruginosa, yielding self-assembling amphipathic peptide–lipoxygenase fusions with approximately 2.3- to 4.5-fold increases of thermostability at 50 °C (Lu et al. 2013).
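At the sequence level, a terminal fusion is a concatenation of the partner, an optional linker, and the enzyme. The sketch below assembles N- and C-terminal fusions for a made-up enzyme sequence; the function name, the toy enzyme, and the "PT" linker are assumptions for illustration, while the peptide AEAEAKAKAEAEAKAK is the one mentioned above.

```python
def fuse(enzyme, partner, terminus, linker=""):
    """Attach `partner` to the N- or C-terminus of `enzyme`, with an optional linker."""
    if terminus == "N":
        return partner + linker + enzyme
    if terminus == "C":
        return enzyme + linker + partner
    raise ValueError("terminus must be 'N' or 'C'")

PEPTIDE = "AEAEAKAKAEAEAKAK"   # amphipathic peptide named in the text
toy_enzyme = "MKTAYIAKQR"      # made-up sequence, not a real enzyme

n_fusion = fuse(toy_enzyme, PEPTIDE, "N", linker="PT")  # "PT" linker is hypothetical
c_fusion = fuse(toy_enzyme, PEPTIDE, "C")
print(n_fusion)
print(c_fusion)
```

Whether a given fusion folds and retains activity cannot be read from the sequence alone and still has to be tested experimentally.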

Concluding remarks

The reported protein engineering strategies mainly include directed evolution, site-directed mutagenesis, saturation mutagenesis, terminal fusion, and truncation. When the structure of a given protein is unknown, directed evolution (e.g., ep-PCR or DNA shuffling), terminal fusion, or truncation can be applied on the basis of gene sequence analysis; once structural information has been obtained from experimental data (e.g., X-ray), site-directed mutagenesis and related methods can be used (see Fig. 1). Comparative analysis of the gene sequences of parental proteins and positive mutants reveals which sites contributed to improved catalytic performance. Site-directed saturation mutagenesis of these important sites can then further improve that performance. If structural information is known from experimental data or can be obtained from homology modeling, site-directed mutagenesis can be applied directly to improve catalytic properties based on the analysis of three-dimensional structure and catalytic mechanism. Owing to rapid developments in techniques, molecular design based on bioinformatics is being widely applied for optimizing the catalytic performance of enzymes. With the future development of rational synthetic sequence modification methods, more artificial and synthetic enzymes will be produced that will share three characteristics: (1) optimum properties; (2) a great distance in sequence space from the amino acid sequence of the most highly homologous enzymes; and (3) combinations of rational design with directed evolution.
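The selection logic summarized here, together with the comparison of parent and mutant sequences, can be written down compactly. This is a simplified sketch of the decision described in the text, not a reproduction of Fig. 1; the function names and toy sequences are hypothetical, and the comparison assumes equal-length sequences without insertions or deletions.

```python
def suggest_strategies(structure_known):
    """Strategy shortlist depending on whether structural information is available."""
    if structure_known:
        return ["site-directed mutagenesis"]
    return [
        "directed evolution (ep-PCR or DNA shuffling)",
        "terminal fusion",
        "truncation",
    ]

def mutated_sites(parent, mutant):
    """List substitutions (e.g. 'C4A') that distinguish a positive mutant from its parent."""
    if len(parent) != len(mutant):
        raise ValueError("sequences must have equal length")
    return [
        f"{p}{position}{m}"
        for position, (p, m) in enumerate(zip(parent, mutant), start=1)
        if p != m
    ]

print(suggest_strategies(structure_known=False))
# Sites found this way are candidates for follow-up saturation mutagenesis.
print(mutated_sites("MKTCHLH", "MKTAHLD"))
```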