12.1 Introduction

Microbes have traditionally been used in fermentation processes to obtain desired products, and scientific study of these processes has led to great advances in this area. Many significant products that were previously synthesized by chemical processes are now produced by fermentation using microbial cell factories and biotransformation. Microbes offer ease of mass cultivation, fast growth, use of cheap substrates, and a wide diversity of potential products, which makes them suitable for the production of value-added proteins. In recent years, the application of microbial proteins has increased enormously. Industries that use microbial proteins include detergents, starch processing, animal feed, paper and pulp, fruit/vegetable processing, oil and gas, dairy, brewery, textile, baking, and tanning. The production and supply of microbial products could increase further with a better understanding of recombination in microbes, metagenome mining, fermentation processes, and recovery methods.

Metabolic engineering is an emerging field with a large socioeconomic impact, for example through its applications in the production of fuels, materials, chemicals, medicines, and pharmaceuticals, where modifications are made at the genetic level. It can be defined as the enhancement of the formation of a desired product and/or of cellular properties by editing the specific biochemical reaction(s) involved in its formation, or by designing and introducing entirely new reactions, using r-DNA technology. The targeted biochemical reaction needs to be very specific to obtain the maximum yield of the desired product. The procedure involves identifying the target reactions and then using r-DNA techniques to augment, knock down or delete, transfer, or deregulate the corresponding genes or their protein products. This chapter covers the criteria used for strain selection, the approaches routinely used to achieve higher gene expression, and various metabolic engineering strategies.

12.2 Strain Selection Criteria

There is a vast diversity of microbes available in nature. Many microbes, especially those isolated from extreme environments, are being explored as sources of industrially important enzymes, since their optimum growth conditions help avoid contamination and they produce more stable enzymes. The microbial strain used for industrial production of a desired protein varies greatly and is usually selected on the basis of the following:

  1. The engineering effort required, together with the toolsets and resources available.

  2. The nature of the product and its compatibility with the selected host.

  3. The metabolic requirements of the selected microbe for the production of the desired protein, viz. synthesis pathways, substrates/precursors, enzymes, and cofactors.

Several factors must be considered when selecting a microbe as a host for metabolic engineering, such as metabolic resources, secretion of products, and proper folding of proteins. These factors are discussed briefly in the following sections.

12.2.1 Metabolic Resources

To produce a protein of interest, a microbe requires precursors and various organic and inorganic cofactors. Cofactors may act as redox carriers for biosynthetic reactions or mediate energy transfer for the cell. Cofactor manipulation includes changes in culture conditions, modification of pathways to increase cofactor availability, and the introduction of novel cofactors to enhance biochemical processes and overproduce the protein of interest (Wang et al. 2013). Advances in bioinformatics tools and the availability of whole genome sequences of microbes have greatly helped in exploring microbial biodiversity and evaluating the potential of different microbial hosts, making it possible to identify the host best able to accommodate the pathway for production of the desired protein. Databases and computerized model-building methods such as ModelSEED (Henry et al. 2010) and Path2models (Buchel et al. 2013) aid in developing models of potential hosts for production of the desired protein. Other tools, such as MetaNetX, facilitate the direct integration of new synthetic pathways available in other genome-scale models, so that their interactions with, and effects on, the host metabolic network can be understood.

12.2.2 Minimum Metabolic Adjustment

Metabolic engineering often forces a bacterium through evolutionary pressure to move away from its wild-type form and function and reach optimality for producing the protein of interest (Fong and Palsson 2004). It is therefore generally advantageous to use a microbial host that requires only marginal metabolic adaptation and minimal adjustment through evolution (Fisher et al. 2014). Advances in synthetic biology make use of different bioinformatic tools to select microbial hosts for de novo biosynthetic pathways for the production of the protein of interest. Toolsets used for modeling and substrate-based analysis apply global sensitivity analysis and agent-based modeling to screen for significant components. In addition, genome-scale metabolic flux modeling has been used to identify metabolic networks and medium formulations so that the expression of a biosynthetic pathway can be maximized (Apte et al. 2014).

12.2.3 Secretion of Proteins

The kinetics of protein production and purification vary greatly from lab scale to industrial scale. For industrial production, it is an added advantage if the protein of interest is secreted extracellularly: secretion aids downstream processing, and the yield can be higher than with intracellular expression. Bacillus spp. are favored over E. coli as hosts for protein secretion. However, research is still ongoing to identify other protein secretion hosts (Ferrer-Miralles and Villaverde 2013a), such as Streptomyces (Okesli et al. 2011), halophiles that enhance solubility (Tokunaga et al. 2010), and yeast (Mattanovich et al. 2012). Yeast has other advantages as well, such as posttranslational modifications. The final protein of interest can also influence the choice of microbial host.

12.2.4 Genomic Toolsets

Although E. coli is a common and preferred host for the expression and production of recombinant proteins because it has been so widely explored both genetically and physiologically, other microbes are also being investigated for industrial-scale protein production by metabolic engineering. Toolsets such as exonuclease-based Gibson assembly (Zhang et al. 2012) are of great help because they increase the reliability of the assembled constructs used for protein synthesis. This assembly method is useful for the construction of natural and synthetic genes, pathways, and entire genomes (Fisher et al. 2014).

12.2.5 Proper Folding and Functionality of Protein of Interest

A critical step in recombinant protein production is obtaining properly folded, functional protein in the expression host. In metabolic engineering, especially when eukaryotic proteins are to be expressed, factors such as the availability of the necessary chaperones and chaperonins, transcript reading for synthesis and folding, and the posttranslational modifications needed to yield a functional protein must be taken into consideration (Hartl et al. 2011; Bernal et al. 2014; Osterlehner et al. 2011). Eukaryotic protein expression that requires posttranslational modifications usually makes use of yeasts such as Pichia pastoris as hosts, besides human and baculovirus-based insect cell lines. There is also significant interest in transferring the posttranslational capabilities of eukaryotic cells to other microbes (Fisher et al. 2014). Heterologous expression, proper folding, and N-linked glycosylation of the proteins AcrA and IgG have been achieved in E. coli using combinatorial libraries, codon optimization, and shotgun proteomics (Pandhal et al. 2013). Earlier metabolic engineering experiments often failed at the translation stage because there were no tools to check whether the heterologous protein would be expressed properly. A translational coupling cassette is now available to determine quickly whether the heterologous mRNA will be translated in the chosen host, and at what level, even when it encodes large multidomain enzymes. It also helps to isolate translation problems to the C-terminal domains and to optimize conditions for expressing a codon-optimized sequence variant (Mendez-Perez et al. 2012).

12.3 Approaches to Attain Higher Expression of Industrial Enzymes

12.3.1 E. coli as a Host

E. coli is still one of the favorite organisms for the expression of heterologous proteins. Its use as a microbial cell chassis is widely acknowledged and well explored, and the many studies of its expression and regulation have made it the most popular expression platform. Various molecular biology tools and protocols are available for the E. coli system that give higher expression and production of heterologous proteins. A large number of expression plasmids, many engineered strains, and diverse cultivation strategies are also available (Rosano and Ceccarelli 2014). The different levels at which the E. coli expression system can be regulated are described in the following.

12.3.1.1 Transcriptional Regulation

A strong, controllable promoter and tight regulation of protein expression are prerequisites for efficient, high-level recombinant protein expression in the E. coli chassis. Large-scale protein production in E. coli makes use of chemical or thermal inducers of expression (Chao et al. 2004). Tightly regulated promoters help in designing novel expression systems that are highly repressible or inducible, and studies of promoter regulation provide vital tools and information for controlling gene expression.

Besides promoters, transcription terminator elements are also crucial in controlling the expression and stability of heterologous genes: they enhance mRNA stability, which in turn increases protein expression (Newbury et al. 1987).

12.3.1.2 Translational Regulation

Translational regulation involves controlling and optimizing the factors involved in translation. Regulation can act at the level of the initiation codon, secondary structures at the initiation site, or even the stability of the translated mRNA (Sprengart et al. 1996). The initiation codon AUG has been shown to be much more efficient than GUG or UUG. The residues on the 3′ side of the initiation codon, especially the second one, have been shown to influence the translation rate. The Shine-Dalgarno sequence upstream of the initiation codon has also been shown to be more efficient than other translation initiation sequences, and its efficiency depends on its distance from the initiation codon (Ringquist et al. 1992). Bases 458–466 of the 16S rRNA of E. coli, when placed upstream of the ribosomal binding site (RBS), have also been shown to increase translation efficiency by 110-fold (Olins and Rangwala 1989). All these factors can be regulated and can be taken into consideration during strain development.
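The sequence features listed above can be checked on a candidate construct with a few lines of code. The following is a minimal sketch, not a published tool: the function name, the example sequences, and the choice of AGGAGG as the Shine-Dalgarno core are assumptions made for illustration, and the sequences are written as DNA (ATG, GTG, TTG for AUG, GUG, UUG).

```python
# Illustrative sketch only. The motif, names, and sequences are assumptions.
START_CODONS = ("ATG", "GTG", "TTG")  # DNA equivalents of AUG, GUG, UUG
SD_CORE = "AGGAGG"                    # assumed Shine-Dalgarno core motif


def initiation_features(upstream: str, cds: str) -> dict:
    """Report the start codon, second codon, and SD-to-start spacing."""
    upstream = upstream.upper()
    cds = cds.upper()
    start = cds[:3]
    if start not in START_CODONS:
        raise ValueError(f"unexpected start codon: {start}")
    # Use the SD-like motif closest to the start codon, if there is one.
    sd_pos = upstream.rfind(SD_CORE)
    # Spacing = nucleotides between the end of the SD core and the start codon.
    spacing = None if sd_pos == -1 else len(upstream) - (sd_pos + len(SD_CORE))
    return {
        "start_codon": start,
        "second_codon": cds[3:6],
        "sd_found": sd_pos != -1,
        "spacing_nt": spacing,
    }


# Hypothetical 5' region and the first codons of a coding sequence.
features = initiation_features("TTTAAGGAGGTATACAT", "ATGGCTAAA")
print(features)
```

A report like this only flags features for inspection; it does not predict expression levels, which would need experimental data or a dedicated model.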

12.3.1.3 Enhancement by Formation of Additional Proteins

Attempts to overexpress proteins often lead to the accumulation of inclusion bodies within the cell. Purifying and refolding active proteins from inclusion bodies is often tedious and leads to losses in downstream processing. Methods such as the use of molecular chaperones and low temperatures have been shown to be effective in reducing inclusion body formation and promoting proper refolding of recombinant proteins. Proper folding is influenced by the oxidative environment of the periplasm and by correct signal peptide cleavage during translocation. Various strategies have been used for efficient transport and protein folding. These include overproduction of signal peptidase I (Zhang et al. 1997), simultaneous expression of the secE and prlA4 genes (Pérez-Pérez et al. 1994), addition of Golgi retention or endoplasmic reticulum sequences (Zhan et al. 1998), and mutations in secY (Brinkworth et al. 2011). Uthandi et al. (2012) deleted the twin-arginine translocation motif, whereas Brinkworth et al. (2011) described the use of a type III secretion chaperone to facilitate translocation.

Extracellular secretion of proteins is generally favored over intracellular expression because of advantages such as high expression levels, simplified downstream purification, and better protein folding. Since E. coli secretes very few proteins, signal peptide mutations compatible with the E. coli membrane (Ismail et al. 2011) and limited leakage of the outer membrane through the synergistic use of EDTA and lysozyme (Liu et al. 2012) have also been employed for extracellular protein secretion.

12.3.1.4 Use of Fusion Proteins or Molecular Chaperones

A fusion protein is the product of two or more genes that are translated together with no stop codon between them. In protein overexpression systems, fusion proteins increase protein yield through various modes of action, such as increased solubility of the expressed protein, improved folding, and more efficient mRNA translation (Rosano and Ceccarelli 2014). Translational fusion of trpE gene fragments has a positive effect on the expression of heterologous genes in E. coli (Makoff et al. 1989). Molecular chaperones are well known for assisting protein folding under normal conditions and under stress conditions such as heat stress (Hartl 1996), and they can also help to provide correctly folded, biologically active proteins that are otherwise difficult to produce in E. coli. The target protein and the chaperones need not come from the same organism. In a two-vector system in E. coli, soluble gp37 protein was reported to be produced effectively by co-expression of two bacteriophage T4 chaperones (Bartual et al. 2010).

12.3.1.5 Codon Optimization

The presence of biased codons, or of codons that require rare tRNAs, alters the efficiency with which a heterologous protein is expressed in E. coli. Expression of such genes without codon optimization involves nonrandom usage of synonymous codons, which can affect the expression of host genes or trigger an adverse stress response (Burgess-Brown et al. 2008). For example, the arginine codons AGA and AGG are rare in E. coli, and they lead to lower protein expression and mistranslation errors (Calderone et al. 1996). In such cases, codon optimization strategies are needed to improve the fidelity of translation as well as the expression of enzymes (Hutterer et al. 2012). The issue can be overcome by converting rare codons to commonly used codons by site-directed mutagenesis. Another rescue approach is to co-express the genes encoding the rare tRNAs along with the protein of interest (Kleber-Janke and Becker 2000). Kim and Lee (2006) and Gustafsson et al. (2004) have reported the use of synthetic DNA with optimized, commonly used codons for successful expression of the desired enzymes.
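The conversion of rare codons to common synonymous codons can be illustrated with a short script. This is a minimal sketch under stated assumptions: it handles only the two rare arginine codons named in the text, the replacement codon CGT (a frequently used arginine codon in E. coli) is our choice for illustration, and the input sequence is hypothetical. Practical codon optimization considers the full codon usage table, mRNA structure, and other constraints.

```python
# Illustrative sketch only. The mapping covers just the two rare arginine
# codons named in the text; CGT as the replacement is an assumption.
RARE_TO_COMMON = {"AGA": "CGT", "AGG": "CGT"}  # all three encode arginine


def split_codons(cds: str) -> list[str]:
    """Split an in-frame coding sequence into codons."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def replace_rare_codons(cds: str) -> tuple[str, int]:
    """Return the recoded sequence and the number of codons changed."""
    codons = split_codons(cds.upper())
    recoded = [RARE_TO_COMMON.get(codon, codon) for codon in codons]
    n_changed = sum(1 for old, new in zip(codons, recoded) if old != new)
    return "".join(recoded), n_changed


# Hypothetical toy gene: Met-Arg-Ala-Arg-stop with two rare Arg codons.
optimized, n_changed = replace_rare_codons("ATGAGAGCTAGGTAA")
print(optimized, n_changed)
```

Because every replacement is synonymous, the encoded protein is unchanged; only the codons seen by the host translation machinery differ.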

12.3.2 Other Bacteria as a Host

Growing knowledge of microbial physiology and metabolism is drawing attention to other bacterial hosts as microbial chassis. Their diversity and biosynthetic potential, which result from adaptation to varied environments, make other bacteria useful hosts (Ferrer-Miralles and Villaverde 2013b). Besides E. coli, Bacillus species are regularly used as hosts for recombinant protein production because of their high secretion capacity and their ability to export proteins directly into the extracellular medium. Bacillus systems that have been explored for protein production include B. licheniformis, B. subtilis, B. megaterium, B. brevis, and B. amyloliquefaciens. Lactic acid bacteria are another promising group of microbes used for recombinant protein expression. Like Bacillus spp., they do not produce endotoxins, which makes them safer expression hosts. Proteobacteria and actinobacteria have also been used as hosts for protein expression. Table 12.1 highlights significant bacterial groups used as cell factories for recombinant protein production (Ferrer-Miralles and Villaverde 2013b).

Table 12.1 Significant bacterial groups used as cell factories for recombinant protein production

12.3.3 Yeasts as a Host

Yeasts are also exceptional microbial hosts for recombinant protein expression because they combine two sets of advantages: like prokaryotes, they are unicellular, fast growing, and easy to manipulate genetically, and like other eukaryotes, they have a secretory pathway, protein processing, and the ability to carry out posttranslational modifications. Hence they are of significant interest as microbial chassis, especially when the protein to be produced is of eukaryotic origin. The yeast hosts mainly used as expression systems include Pichia pastoris, Arxula adeninivorans, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Hansenula polymorpha, and Yarrowia lipolytica.

Details of the different yeasts that have been used as microbial hosts for recombinant protein production, with their advantages and shortcomings, are given in Table 12.2. Besides the use of yeast vectors, the gene encoding the protein of interest has also been integrated into the yeast genome for higher expression and added stability of the insert.

Table 12.2 Yeast strains used as microbial host for recombinant proteins production, their advantages, and shortcomings

12.3.4 Filamentous Fungi as a Host

Besides bacteria and yeasts, research on the production of industrially important proteins has also focused on filamentous fungi as hosts. For production in these fungi, their best-characterized regulatory, expression, and secretory machinery is employed in conjunction with the heterologous gene encoding the recombinant protein. Table 12.3 highlights the fungal strains employed for recombinant protein production. The use of fungi as industrial hosts for protein production is still in an exploratory phase, since fungi encompass a very large and diverse group of microbes, many of which are yet to be well characterized for their biosynthetic potential. Filamentous fungi also have disadvantages, such as their ability to produce and secrete homologous proteases, which might degrade the recombinant product. In addition, protein glycosylation differs between mammalian and fungal cells, which can affect the final product synthesized using fungal machinery (Ward 2012).

Table 12.3 Fungal strains used as microbial host for recombinant proteins production, their advantages, and shortcomings

12.4 Metabolic Engineering Strategies

12.4.1 Directed Evolution

Industrial use of enzymes often requires properties, such as thermostability and resistance to osmotic pressure under actual process conditions, that are not found in naturally occurring enzymes. Directed evolution can be defined as the tuning of a natural enzyme in the laboratory by a process similar to natural evolution: random mutagenesis and recombination, followed by efficient screening and selection of the resulting mutants for the desired activity.

Random mutagenesis for evolution makes use of DNA libraries generated by techniques such as error-prone PCR, combinatorial oligonucleotide mutagenesis, DNA shuffling, and the staggered extension process (StEP recombination), followed by screening to obtain a better host with enhanced efficacy. Recombination-based modifications make use of nonhomologous recombination, exon shuffling, or alternative splicing to develop and select folded proteins from the secondary structure elements obtained (Urvoas et al. 2012). Although the method is feasible and effective, a limitation is that it often involves a compromise between the targeted gene and other essential properties the host needs for proper growth and survival.
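The mutate-then-screen cycle described above can be mimicked in silico. The sketch below is a toy model, not a simulation of any specific protocol: the per-base mutation rate, the parent sequence, and the function names are assumptions, and the GC-content score is only a stand-in for a real activity assay, which cannot be computed from sequence alone.

```python
import random

# Illustrative sketch only. Rates, names, and the scoring function are
# assumptions; real screening relies on a laboratory assay.
BASES = "ACGT"


def mutate(seq: str, rate: float, rng: random.Random) -> str:
    """Substitute each base with probability `rate` (toy error-prone PCR)."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def make_library(parent: str, size: int, rate: float, seed: int = 0) -> list[str]:
    """Generate a library of random variants from one parent sequence."""
    rng = random.Random(seed)  # seeded so the toy library is reproducible
    return [mutate(parent, rate, rng) for _ in range(size)]


def screen(library: list[str], score, top_n: int) -> list[str]:
    """Keep the `top_n` variants with the highest score."""
    return sorted(library, key=score, reverse=True)[:top_n]


def gc_fraction(seq: str) -> float:
    """Stand-in score; a real screen would measure enzyme activity."""
    return sum(base in "GC" for base in seq) / len(seq)


parent = "ATGGCTAAAGGTCTG"  # hypothetical parent gene fragment
library = make_library(parent, size=50, rate=0.05)
best = screen(library, gc_fraction, top_n=5)
print(best)
```

In an iterative scheme, the selected variants would serve as parents for the next round of mutagenesis and screening.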

12.4.2 Site-Directed Mutagenesis (SDM)

SDM is a highly versatile and precise molecular biology technique for introducing tailored mutations into double-stranded DNA, and it is widely used for protein engineering. The approach can be used in traditional cloning, in mapping and control of regulatory elements, and in the functional analysis of proteins. It also facilitates genome editing in a defined manner through homology-directed repair. Both single-site-directed mutagenesis (point mutation, insertion, deletion, multiple nearby substitutions) and multiple-site mutations (100 bp apart) have been reported for modification and regulation of the desired gene(s) for protein expression (Hsieh and Vaisvila 2013). Irfan et al. (2018) reported improved thermostability of a xylanase from Geobacillus thermodenitrificans obtained by SDM. The thermostability of a bacterial chitinase was also shown to improve by 15% when site-directed mutagenesis was used to alter the enzyme structure (Emruzi et al. 2018).
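The three basic kinds of single-site change mentioned above (substitution, insertion, deletion) can be expressed as simple sequence edits. This is a minimal sketch for planning a design on paper: the helper names and the example sequences are hypothetical, positions are 0-based, and the code says nothing about primer design or the laboratory protocol.

```python
# Illustrative sketch only. Helper names and sequences are hypothetical;
# positions are 0-based indices into the DNA string.


def substitute(seq: str, pos: int, new: str) -> str:
    """Replace len(new) bases starting at `pos` (e.g. swap one codon)."""
    if pos < 0 or pos + len(new) > len(seq):
        raise ValueError("substitution falls outside the sequence")
    return seq[:pos] + new + seq[pos + len(new):]


def insert(seq: str, pos: int, fragment: str) -> str:
    """Insert `fragment` before the base at `pos`."""
    return seq[:pos] + fragment + seq[pos:]


def delete(seq: str, pos: int, length: int) -> str:
    """Remove `length` bases starting at `pos`."""
    return seq[:pos] + seq[pos + length:]


gene = "ATGGCTAAA"                           # Met-Ala-Lys (toy example)
point_mutant = substitute(gene, 3, "GAT")    # Ala codon GCT -> Asp codon GAT
insertion_mutant = insert(gene, 3, "GGT")    # add a Gly codon after Met
deletion_mutant = delete(gene, 3, 3)         # remove the Ala codon
print(point_mutant, insertion_mutant, deletion_mutant)
```

Keeping insertions and deletions in multiples of three, as in this example, preserves the reading frame downstream of the edit.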

12.4.3 Site Saturation Mutagenesis (SSM)

SSM is used to substitute a targeted residue of a protein with other naturally occurring amino acids in order to improve the expression and efficacy of the desired protein. The protocol achieves single amino acid substitutions by using sets of primers that carry a degenerate mixture of the four nucleotide bases at the three positions of the codon at a site linked to the functionality of the protein. The amplified, mutated PCR products are transformed into competent cells, and the activities of all substitutions are checked to determine which substitution gives the maximum efficiency (Steffens and Williams 2007). When multiple sites are subjected to saturation mutagenesis iteratively for enzyme optimization, the procedure is called iterative saturation mutagenesis (ISM). SSM has led to the evolution of enzymes with improved stability, enhanced activity, and altered stereoselectivity, as well as to the manipulation of the binding properties of antibodies and transcription factors (TFs). Besides modifying the gene encoding a functional protein, SSM has applications in the engineering of promoters, transcriptional enhancers, RBSs, trans-acting factors, and cis-regulatory elements (Guazzaroni et al. 2015). ISM has proved to be the most efficient technique for evolution experiments, even when compared with error-prone PCR (Yang et al. 2017).
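What a degenerate codon actually encodes can be worked out by enumeration. The sketch below expands a degenerate codon against the standard genetic code and counts the codons, amino acids, and stop codons it covers. The fully degenerate codon NNN corresponds to the four-base mixture at all three positions described above; NNK (K = G or T) is included as a commonly used alternative that is not discussed in the text. The function names are ours.

```python
from itertools import product

# Illustrative sketch only. Function names are assumptions; the table is the
# standard genetic code with bases ordered T, C, A, G.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}

# Subset of IUPAC nucleotide codes needed for this example.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}


def expand(degenerate: str) -> list[str]:
    """List every concrete codon matched by a degenerate codon."""
    return ["".join(p) for p in product(*(IUPAC[s] for s in degenerate))]


def summarize(degenerate: str) -> dict:
    """Count codons, distinct amino acids, and stop codons covered."""
    translated = [CODON_TABLE[codon] for codon in expand(degenerate)]
    return {
        "codons": len(translated),
        "amino_acids": len(set(translated) - {"*"}),
        "stop_codons": translated.count("*"),
    }


nnn = summarize("NNN")
nnk = summarize("NNK")
print("NNN:", nnn)
print("NNK:", nnk)
```

Both schemes reach all 20 amino acids, but NNK does so with half as many codons and a single stop codon, which reduces the number of clones that must be screened per site.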

12.4.4 Protein Truncation

Truncation is the random or directed deletion of protein domains that are not necessary for activity. It has been reported to enhance the desirable properties of enzymes in some cases, but it may reduce specific activity in others. Endo-dextranases from Streptococcus mutans were rendered resistant to protease degradation during long-term storage by truncation to remove domains not involved in catalytic activity (Kim et al. 2011). Amylases that are active at high pH are in demand in the textile and detergent industries. The stability and specific activity of the alkaline α-amylase of Bacillus pseudofirmus were found to improve almost 35-fold in an N-terminally truncated mutant (Lu et al. 2016).

12.4.5 Fusion to Generate Chimeric Enzymes

Chimeric proteins are prepared by fusing the structural genes of two different proteins/polypeptides that have different functions or physicochemical properties, to produce a single protein with higher efficiency. A fusion protein may be the product of two subunits fused end to end through a linker, or the product of a gene in which the amino acids from both sources are interspersed. The product of an end-to-end fusion usually shows the activity of both parent genes, whereas the product of the latter type often shows a novel activity (Irfan et al. 2018). The fused molecules range from short synthetic oligonucleotides to full-length structural genes. The growing number of publicly available gene sequence databases provides a practically endless supply of fusion partners, making this technique a valuable and versatile tool for the expression of desired proteins.

12.5 Conclusions

Proteins are present in, and synthesized by, all living forms. They are part of the cytoskeleton, and they help body functions proceed smoothly by serving as biocatalysts. The use of microbes as cell factories for the production of proteins and enzymes is increasing because enzymes are more efficient than chemical catalysts and are ecofriendly and renewable. Microbes usually have simple nutritional requirements and are easy to handle and manipulate. Bacteria, fungi, and yeasts are all used as microbial chassis for production. Advanced metabolic engineering strategies have brought together a range of microorganisms that can be used as microbial cell factories. Metabolic orthogonality is a main objective of metabolic engineering for the production of different products by microbes. It is preferred because engineered microbes otherwise have to compromise on their natural pathways. The resources becoming available through genetic exploration of microbes will help in achieving this goal and will reduce the effort researchers must invest in bioproduction. Among bacteria, E. coli still occupies a major position. Bacillus spp. are employed as microbial chassis because their secretory properties help increase the expression level of proteins. Lactic acid bacteria are still being explored as hosts. Yeasts and fungi are preferred hosts because of their posttranslational modifications and high secretion capacity. The choice of the best microbial chassis for industrial protein production remains a diverse area that is still under exploration, as information on genetic and metabolic resources is continuously being added. Further knowledge in the form of databases will also help in selecting a metabolic engineering strategy from the methods discussed above for the production of desired proteins.