Introduction

The methylotrophic yeast Pichia pastoris (syn. Komagataella sp.) is among the most favoured microbial eukaryotic expression systems for production of heterologous proteins of biopharmaceutical or industrial interest, most of which are produced in a secreted form (for reviews, see Cereghino and Cregg 2000; Daly and Hearn 2005; Gasser et al. 2013). An overview of yeast-derived pharmaceutical products on the market or in clinical trials can be found in Meehl and Stadheim (2014). While many proteins can be produced at high levels with this expression system, the production of complex, often multimeric human proteins in particular has turned out to be a challenge, as their overproduction often leads to misfolded product and triggers stress in the host cells, which consequently results in low product yields. Production of secretory proteins can be hampered at the level of protein synthesis as well as at the level of protein folding and secretion. Modelling of intracellular recombinant protein fluxes based on measured degradation and secretion rates revealed that the rate of protein synthesis correlates linearly with productivity and can thus be seen as the first rate-limiting step, until the secretory pathway capacity becomes saturated and leads to a plateau of productivity independent of the complexity of the analysed protein (Love et al. 2012; Pfeffer et al. 2011). Regarding protein synthesis, the choice of the promoter and the gene copy number are especially crucial. For high-level expression in P. pastoris, mainly the methanol-inducible promoter of alcohol oxidase (PAOX1) or the constitutive promoter of the glycolytic glyceraldehyde-3-phosphate dehydrogenase (PGAP) are used, although alternative constitutive and inducible promoters have recently been reported. For a detailed overview of available promoters for P. pastoris, the reader is referred to the comprehensive review by Vogl and Glieder (2013).
It has been shown for many different heterologous proteins that increasing the gene dosage (copy number of the expression cassette) leads to enhanced product formation (reviewed, e.g. by Daly and Hearn 2005); however, it has to be kept in mind that this correlation is not always linear for secreted products (Hohenblum et al. 2004; Inan et al. 2006; Marx et al. 2009). Ways to achieve high gene copy numbers are summarized, e.g. in Gasser et al. (2013). Additionally, codon optimization, which adapts the codon usage of the gene of interest to the preferred codon usage of the host, has been described to give beneficial results, but there is some debate about which kind of optimization is best (Chung and Lee 2012). Furthermore, the folding and secretion capacity has a strong impact on the productivity of secretory proteins. A schematic overview of the architecture of the secretory pathway of P. pastoris is shown in Fig. 1. In the following, the route of recombinant proteins through the secretory pathway will be discussed, with a special emphasis on endoplasmic reticulum (ER) stress responses due to secretory protein overproduction, on folding and proteolytic pathways including solutions to overcome potential obstacles, and on P. pastoris glycosylation characteristics and glycoengineering.

Fig. 1

Organelles of the secretory pathway in P. pastoris. In the diagram, all compartments involved in protein secretion and proteolysis are depicted. Additionally, fluorescence microscopy images of the organelles are shown below. 1 Nucleus (purple) stained with DAPI. 2 ER (red) visualized by preKar2-DsRed-HDEL. 3 COP-II vesicles at tER sites (orange) visualized by the marker protein Sec13-DsRed. 4 Golgi apparatus (green) visualized by the marker protein Sec7-3xGFP. 5 Secretory vesicles visualized by the marker protein Snc2-SGFP. 6 Vacuole stained with the vital dye FM4-64 (red); the green signal in this image is due to a protein localized in the cytosol. preKar2-DsRed-HDEL, Sec13-DsRed and Sec7-3xGFP constructs were kindly provided by B. Glick

Protein synthesis and translocation

In eukaryotes including yeasts such as P. pastoris, nascent proteins intended for secretion are translocated into the ER lumen where folding and formation of disulfide bonds take place, and post-translational modifications such as glycosylation are initiated (Brodsky and Skach 2011; Delic et al. 2014; Delic et al. 2013; Margittai and Sitia 2011).

Although some other secretion leaders have been investigated (Ahmad et al. 2014; Gasser et al. 2013; Massahi and Calik 2015), entry into the secretory pathway is mediated in most cases by the Saccharomyces cerevisiae alpha-mating factor pre-pro leader (MFα). Despite having some drawbacks regarding the correct N-terminus of the product (Cabral et al. 2003), the MFα leader is still the most commonly used as it is very efficient in driving secretion. In particular, the presence of EAEA (glutamate-alanine) overhangs on the N-terminus of the secreted product due to inefficient cleavage by diaminopeptidase Ste13 is often observed in a highly product-dependent manner, while inefficient processing by the Golgi-resident furin-like protease Kex2 is rather rare. Recently, some approaches to modify the MFα pre-pro leader to further enhance secretion levels in P. pastoris have been reported. Mutations that prove to be beneficial include fusion of the pro-region to an alternative pre-region or deletion of the amino acids forming the third alpha-helix of the MFα pro-region, whereas exchange of these residues to alanine or complete deletion of the pro-region significantly decreases secretion (Fitzgerald and Glick 2014; Lin-Cereghino et al. 2013). Interestingly, exchange of some of these amino acids to more polar residues was previously shown to enhance secretion of recombinant proteins in S. cerevisiae, indicating that the native residues are involved in slowing down the secretory function of MFα (Rakestraw et al. 2009). Indeed, using superfolder GFP as a model protein, Fitzgerald and Glick (2014) suggest that the pro-region of MFα is necessary to stimulate both efficient entry into and exit from the ER. However, the exact regions responsible for this behaviour have not yet been determined. Other possibilities to improve secretion are Kex2 overexpression or modification of the amino acids surrounding the Kex2 cleavage site (Kjeldsen et al. 1999; Yang et al. 2013).
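The two leader-processing steps described above can be illustrated with a toy sketch. It assumes that Kex2 cleaves directly after a Lys-Arg (KR) dipeptide and that Ste13 then removes Glu-Ala (EA) dipeptides from the new N-terminus. The function names, the `max_dipeptides` parameter and the short sequences are hypothetical and serve only as illustration; they are not taken from any of the cited studies, and the sketch ignores all sequence context that affects real processing efficiency.

```python
def kex2_cleave(precursor: str) -> str:
    """Return the part of the precursor downstream of the first KR site.

    Toy assumption: the first Lys-Arg dipeptide is the Kex2 site at the
    end of the pro-region. If no KR is present, nothing is cleaved.
    """
    site = precursor.find("KR")
    if site == -1:
        return precursor
    return precursor[site + 2:]


def ste13_trim(peptide: str, max_dipeptides: int = 2) -> str:
    """Remove up to max_dipeptides N-terminal EA dipeptides.

    max_dipeptides=0 mimics completely inefficient Ste13 processing,
    which leaves the EAEA overhang on the product.
    """
    removed = 0
    while removed < max_dipeptides and peptide.startswith("EA"):
        peptide = peptide[2:]
        removed += 1
    return peptide


# Hypothetical fusion: end of pro-region + EAEA spacer + invented product
toy_precursor = "SLEKR" + "EAEA" + "QVQLK"

after_kex2 = kex2_cleave(toy_precursor)
print(after_kex2)                 # EAEAQVQLK
print(ste13_trim(after_kex2))     # QVQLK (correct N-terminus)
print(ste13_trim(after_kex2, 0))  # EAEAQVQLK (overhang retained)
```

In this simplified picture, a product-dependent Ste13 efficiency would correspond to a mixture of the last two outputs in the secreted protein population.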

Once in the ER, the nascent protein needs to be correctly folded and post-translationally processed before it is allowed to exit the ER towards its final destination. The endowment of P. pastoris with chaperones, protein disulfide isomerases and enzymes required for glycosylation has been reported in Delic et al. (2013). It can be assumed that these chaperones together with their co-chaperones and foldases help to prevent misfolding and aggregation, assist in proper folding and are involved in ER quality control. It should be noted, however, that most of the knowledge on the P. pastoris secretion machinery is derived from similarity to other yeasts (mainly S. cerevisiae) and that detailed functional characterization is largely missing.

Limitations in protein folding and secretion lead to the induction of the unfolded protein response

Overproduction of recombinant proteins may overload the ER folding and secretion capacity, resulting in the accumulation of misfolded or unfolded proteins, and ER stress as a consequence. This triggers the activation of the unfolded protein response (UPR) pathway, which aims at reducing ER stress conditions by induction of genes involved in protein folding and the ER-associated degradation (ERAD) pathway (Hoseki et al. 2010; Kohno 2010; Malhotra and Kaufman 2007; Vembar and Brodsky 2008). We have shown that these effects also occur in P. pastoris upon heterologous protein production (Hohenblum et al. 2004). During the development of P. pastoris strains secreting different heterologous proteins (e.g. human antibody Fab fragments, human and porcine trypsinogen), severe limitations were identified. Significant amounts of the product were retained in the ER fraction of the cells, and an induction of the UPR was observed (Gasser et al. 2006; Hohenblum et al. 2004). Saturation of the folding and secretion capacity of the host cells and activation of the UPR upon production of secreted heterologous proteins were also observed in S. cerevisiae previously (Kauffman et al. 2002; Parekh et al. 1995), and by other groups in P. pastoris later on (Inan et al. 2006; Resina et al. 2007; Whyteside et al. 2011a).

The UPR in P. pastoris

Despite its importance during recombinant protein production, or coping with environmental stresses, there is no comprehensive knowledge about UPR regulation in industrially relevant production organisms. To close this gap for P. pastoris, we summarize here all important findings and compare them to results obtained from S. cerevisiae.

The mechanisms by which yeast cells sense and signal the presence of unfolded proteins in the ER to activate a transcriptional response in the nucleus were elucidated in the 1990s by the groups of Peter Walter and Kazutoshi Mori (reviewed by Worby and Dixon 2014). In yeast, the main players of the UPR signaling pathway are the ER/nuclear transmembrane serine/threonine kinase Ire1 and the transcription factor Hac1. Hac1 activity is regulated by an unconventional splicing event of HAC1 messenger RNA (mRNA), removing an intron that prevents its translation in the absence of ER stress. Unfolded or misfolded proteins in the ER are sensed by Ire1, which most probably depends on the presence or absence of Ire1-bound Kar2. If unfolded proteins accumulate in the ER, Kar2 dissociates from Ire1 to perform its chaperone function, and the endonuclease function of Ire1 is activated. After removal of the intron from precursor HAC1u mRNA, the exons are joined by tRNA ligase Rlg1 to form translation-competent HAC1i mRNA. The encoded Hac1 protein then locates to the nucleus and activates target genes with UPR elements (UPREs) in their promoters (Cox and Walter 1996; Mori et al. 1996; Sidrauski et al. 1996; Sidrauski and Walter 1997). The induced transcriptional response aims at restoring ER homeostasis by increasing both the ER lumen and surface area as well as the ER folding machinery. The P. pastoris Hac1 homolog was identified and characterized by Guerfal et al. (2010). P. pastoris HAC1 contains a 322 bp intron, flanked by splicing sites within characteristic stem-loop structures, and a stem-loop structure in the 3′UTR similar to homologs from other species. As in S. cerevisiae, HAC1 splicing is dependent on Ire1 in P. pastoris, and the last five amino acids of the newly generated C-terminus of PpHac1 are required for UPR activation (Whyteside et al. 2011b), while Δhac1 cannot grow on inositol-deficient media. Constitutively spliced HAC1i mRNA was observed in P. pastoris cells grown at 20 and 30 °C (Guerfal et al. 2010) as well as in sorbitol and/or methanol grown cells in chemostat cultivations (Hesketh et al. 2013), independent of externally applied ER stress conditions or recombinant protein production. On the other hand, unspliced PpHAC1 mRNA was found in conditions without ER stress by Whyteside et al. (2011b), indicating that growth conditions might have an impact on UPR activation. In this respect, a UPR-like response was also induced upon changes in environmental conditions such as high osmolarity, heat, and methanol induction (Dragosits et al. 2010; Hohenblum et al. 2004; Khatri et al. 2011; Resina et al. 2007; Zhong et al. 2014); however, none of these studies measured HAC1 splicing.

As mentioned above, the UPR is activated by protein overproduction, but the response is much less intense than that caused by chemical induction or HAC1 overexpression (Gasser et al. 2007a; Lin et al. 2013). There seems to be a clear fine-tuning between intracellular product retention, secretion efficiency and the magnitude of the UPR. Interestingly, UPR induction upon recombinant protein secretion can sometimes only be clearly seen during small-scale cultivation, while during production-like conditions in the bioreactor, it seems to be masked or overlaid by other stress responses (Hesketh et al. 2013; Vanz et al. 2014). However, this seems to be dependent on the process conditions and/or the heterologous product, as other studies clearly report UPR activation (Khatri et al. 2011; Resina et al. 2007). Upon externally applied ER stress, HAC1 is transcriptionally upregulated (Guerfal et al. 2010), indicating that Hac1 is transcriptionally and post-transcriptionally regulated in P. pastoris. In S. cerevisiae, the Hac1 intron is necessary and sufficient to prevent translation (Chapman and Walter 1997); however, the detailed mechanisms of the translational block are not yet known for P. pastoris.

Effects of HAC1i overexpression

Comparison of the transcriptional regulation in P. pastoris during DTT treatment or upon HAC1i overexpression with literature data for S. cerevisiae (Travers et al. 2000) revealed similarities only regarding a core UPR response (induction of ER-resident chaperones and protein glycosylation). Functions more distal to the ER were regulated differentially between the two yeast species, thus underlining the importance of transcriptional studies in industrially relevant production organisms (Graf et al. 2008). Out of approximately 1500 differentially regulated genes (with an equal balance of up and downregulated transcripts), only one third of the genes were regulated similarly upon DTT treatment and endogenous UPR activation (Graf et al. 2008). This common regulatory set involved upregulation of classical UPR target functions such as ‘protein folding’, ‘ER-to-Golgi transport by COPII vesicles’, ‘glycosylation’, ‘GPI-anchor biosynthesis’ and ‘vesicular transport’, as well as a downregulation of major core metabolic processes.

Furthermore, P. pastoris reacts to endogenous UPR activation by regulating the expression of genes involved in ribosome biogenesis and translation; however, the impact on these processes remains rather unclear: While many genes involved in RNA metabolism and organelle biosynthesis, but not the genes encoding ribosomal subunits, were upregulated upon ScHAC1i overexpression in the study by Graf et al. (2008) on the transcriptional level, Lin et al. (2013) reported significantly higher levels of ribosomal subunit proteins in an UPR-induced producing strain but lower abundance of ribosomal proteins on the proteome level upon PpHAC1i co-overexpression. Similarly, lower transcript levels of ribosomal genes were observed by Vogl et al. (2014) upon co-overexpression of PpHAC1i in a membrane protein producing strain. It remains to be elucidated if these alternating responses are due to the source of the HAC1 gene or due to the fact that Graf et al. (2008) analysed HAC1-overexpression in a wild-type background, while the two latter studies examined production strains with already pre-induced UPR. Upon HAC1-overexpression, proliferation of the ER was observed, characterized by the expansion of intracellular membranes and appearance of cubic membrane structures (Guerfal et al. 2010) or formation of karmellae (Vogl et al. 2014) corresponding to transcriptional changes in fatty acid biosynthesis genes (Graf et al. 2008; Vogl et al. 2014). Recently Zhong et al. (2014) provided the first report of ER-phagy in P. pastoris, a degradative process positively corresponding to survival upon ER stress (Schuck et al. 2014), for a strain expressing rhIL-10 at high temperature (30 °C), and showed that it anticorrelates with UPR induction.

Furthermore, HAC1 co-overexpression was shown to enhance secretion of soluble secretory proteins, cell surface displayed proteins and membrane proteins (summarized in Table 1). One might deduce that inducible HAC1 co-overexpression is especially effective for production under control of the PAOX1 promoter system while constitutive HAC1 co-overexpression is better suited in combination with a constitutive promoter; however, such an interpretation would need a more detailed comparative study. Contrary to S. cerevisiae, constitutive or inducible overexpression of either PpHAC1i or ScHAC1i does not lead to a growth defect in P. pastoris (Graf et al. 2008; Guerfal et al. 2010) but leads to secretion of HDEL-containing ER-resident proteins such as Kar2 and Pdi1 into the supernatant (Guerfal et al. 2010).

Table 1 Effects of Hac1 co-overexpression on production of secretory, surface displayed (YSD) and membrane proteins (MP) in P. pastoris in small scale cultivations (screening in plates or shake flasks)

By applying redox sensitive GFP variants targeted to the ER or the cytosol of P. pastoris, respectively (Delic et al. 2010), we could show that cells with activated UPR exhibit a more reducing redox milieu of both compartments (Delic et al. 2012). Reduction of the ER redox state can be explained by the need for the isomerisation or reduction of disulfide bonds in misfolded proteins, which is catalysed by the protein disulfide isomerase (PDI) family. Obviously, reduced glutathione provides the reductive power to the ER by directly reducing PDI family members, thereby promoting their isomerase/reductase activity. Reduced PDIs can in turn then reduce incorrect protein disulfides enabling degradation of misfolded proteins by the ER-associated degradation (ERAD) pathway, or isomerisation to obtain the correct disulfide bonds (Hatahet and Ruddock 2009). On the other hand, the occurrence of even more reducing conditions in the cytosol was rather unexpected, as induction of the UPR resulted in the formation of reactive oxygen species (ROS) such as superoxide anion and hydrogen peroxide (Delic et al. 2012), which are thought to contribute to the oxidation of cellular components.

In summary, overproduction of secretory proteins in P. pastoris usually leads to induction of UPR regulation, but the extent of UPR activation is highly dependent on the nature of the produced protein, the gene dosage of the product as well as on the cultivation conditions. Production of secretory proteins, in particular membrane proteins, can be further enhanced by targeted UPR activation through HAC1 overexpression. It is speculated that constitutive HAC1 splicing in P. pastoris activates UPR constitutively and may be responsible for the better secretory capacity compared to S. cerevisiae. We suggest, however, that such a hypothesis—though attractive—requires further experimental verification.

Overcoming limitations in protein folding

The effects that heterologous overproduction of secretory proteins elicits on P. pastoris cellular physiology have been investigated at various levels such as transcriptomics (Baumann et al. 2010; Gasser et al. 2007a; Hesketh et al. 2013; Liang et al. 2012; Resina et al. 2007), proteomics (Dragosits et al. 2009; Lin et al. 2013; Vanz et al. 2012, 2014), metabolomics (Carnicer et al. 2012) and flux analysis (Jorda et al. 2014). Despite this, most cell engineering approaches were based on educated guesses and mainly involved co-overexpression of known folding-related genes (Table 2) or disruption of protease genes (see chapter ‘Impact of proteolysis on recombinant protein production in P. pastoris’). Some exceptions are the transcriptomics-based identification of secretion enhancing factors (Gasser et al. 2007b), as well as genome-wide screenings for mutants with enhanced secretion based on cDNA-overexpression libraries (Stadlmayr et al. 2010) or REMI-based insertion mutants (Larsen et al. 2013).

Table 2 Examples for co-overexpression and downregulation of different chaperones

ER-Golgi trafficking in P. pastoris

After having passed ER quality control, properly folded proteins destined for the secretory pathway are packed into COP-II vesicles and transported to the Golgi apparatus, where glycosylation continues. In P. pastoris, the Golgi apparatus differs in structure from that of S. cerevisiae. Its morphology is closer to that of mammalian cells; it is arranged in ordered stacks, close to transitional ER (tER) sites where COP-II vesicles bud off, contrary to S. cerevisiae where the Golgi is distributed all over the cell (Rossanese et al. 1999) (see Fig. 1). In P. pastoris, the Golgi consists of three to four Golgi cisternae per stack (cis, medial and trans as well as trans-Golgi network) (Mogelsvang et al. 2003). That these Golgi cisternae mature as a means of anterograde intra-Golgi transport has been shown for S. cerevisiae (Losev et al. 2006).

It is known from S. cerevisiae that COP-II vesicles form an inner coat with the proteins Sar1, Sec23 and Sec24 which binds the cargo. The outer coat is formed with the participation of Sec13 and Sec31 and forms a cage around the vesicle. The COPII-mediated vesicle formation in S. cerevisiae is reviewed in detail by Jensen and Schekman (2011). However, it was shown for P. pastoris that COPII vesicles contain besides the essential Sec23 and Sec24 also non-essential Sec23 and Sec24 homologs (Shl23, Lst1) that are absent in S. cerevisiae (Esaki et al. 2006). A further phenomenon conserved in P. pastoris and mammalian cells is the interaction of the two proteins, Sec12 and Sec16, which are also involved in COPII assembly at transitional ER sites (Montegna et al. 2012), where Sec16 acts as a negative regulator of Sar1 GTPase activity (Bharucha et al. 2013). To better understand how COP-II vesicles can form in different sizes and shapes, Zanetti and co-workers studied coat formation in more detail. They investigated subunit interaction and regulation of vesicle packaging by mixing purified coat proteins with artificial membranes in vitro and imaging them with electron microscopy. With this method, a detailed 3D projection of the assembled coat was obtained. The structures were found to be very flexible, which could help in assembling vesicles of different shapes and sizes to carry different cargos (Zanetti et al. 2013).

Retrograde transport of ER-resident proteins is believed to be fulfilled by COP-I vesicles. ER-resident proteins such as Kar2 and PDI family members have an HDEL-retrieval signal which is recognized in the Golgi apparatus by the respective receptor. This leads to packaging of the ER-protein into COPI vesicles and the transportation back to the ER (Papanikou and Glick 2014; Pelham et al. 1988).

Taken together, these features suggest that P. pastoris can serve as a model for mammalian cells. The similarities to mammalian cells, which have a higher secretory capacity than yeasts, could furthermore explain why secretion in P. pastoris is more efficient than in S. cerevisiae.

Glycosylation

Glycosylation is a common post-translational modification of proteins. While mostly observed in eukaryotic cells, it is established today that also prokaryotes feature a homologous process of protein glycosylation. N-linked glycosylation is initiated by the transfer of a Glc3Man9GlcNAc2 oligosaccharide to an asparagine residue in a consensus sequence motif N-X-S/T (Aebi 2013), while typical O-linked glycosylation in yeasts begins with the transfer of a mannose residue to serine or threonine. Both processes take place in the ER in close proximity of the translocation pore. Further processing of the glycans starts in the ER and is followed by consecutive steps in the Golgi compartment. The processing steps in ER and Golgi, their responsible enzymes and the respective genes of eight yeast species including P. pastoris are described in Delic et al. (2013).
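Potential N-glycosylation sites can be located in a protein sequence by searching for the N-X-S/T consensus motif mentioned above. The following minimal sketch does this with a regular expression. It additionally assumes the widely used convention that X may be any residue except proline, which is not stated in the text above; the function name and the example sequence are hypothetical. A motif match only marks a potential site and says nothing about whether the site is actually occupied.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead keeps the match one residue long so that
# overlapping motifs (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues in N-X-S/T motifs."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Invented toy sequence: N3 and N13 are in a motif, N8 is followed by Pro
example = "MKNASLLNPTQWNGTA"
print(find_sequons(example))  # [3, 13]
```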

Role of N- and O-glycosylation in yeasts

It is well established that N-glycosylation has a major initial role in signaling of protein folding steps involving calnexin as a folding chaperone. After removal of two terminal glucose residues, calnexin binds specifically to Glc1Man9GlcNAc2, exhibiting its chaperone function. Removal of the last glucose residue releases calnexin from the glycoprotein, thus enabling further folding or targeting towards degradation. While S. cerevisiae lacks the gene for UGGT (the mammalian enzyme re-adding glucose to the glycan and thus recycling proteins to calnexin binding), a homolog was identified in P. pastoris and other yeast species (Babour et al. 2004; Delic et al. 2013; Fernandez et al. 1994). The function of initial O-glycosylation in the ER is much less clear; however, some role in quality control has been discussed for S. cerevisiae (Delic et al. 2013). Both N- and O-glycans are further modified in the Golgi and function in protein solubilization, cell wall stabilization, osmotolerance, and budding.

Structures of N- and O-glycans in P. pastoris

Yeast N-glycans are typically referred to as high-mannose type glycans, consisting of 2 GlcNAc residues and a branched oligomannosyl structure which can extend to 200 residues in S. cerevisiae (Dean 1999) while in P. pastoris, it encompasses 9–16 mannose residues (most frequently 9, 10 or 11) (Hamilton et al. 2003; Maccani et al. 2014; Vervecken et al. 2004). The presence of respective mannosyltransferase genes in the genome allows to draft the possible glycan structures (Delic et al. 2013): Och1 is the α-1,6 mannosyl transferase initiating the addition of further mannoses in the Golgi. Further α-1,6 mannose addition is enabled by mannan polymerase complexes I and II (M-Pol I and II). Additionally, α-1,2 residues are added by the Mnn2/5 family (with three homologs in P. pastoris) while the Mnn1 family of α-1,3 mannosyl transferases is absent in P. pastoris, reflected by the absence of this linkage in N-glycans. Mannosylphosphate may be added by glycosyl transferases of the KTR family of which P. pastoris has five homologs. In heterologous glycoproteins produced in P. pastoris, about 10 % of the glycans are phosphorylated (Maccani et al. 2014). Four β-1,2 mannosyl transferases (Bmt1-Bmt4) enable a linkage in P. pastoris that has been observed in a number of other yeasts and fungal species, but not in S. cerevisiae. A consensus P. pastoris N-glycan is illustrated in Fig. 2a with indication of the respective enzymes responsible for the glycosyl transfers.

Fig. 2

Native glycan structures of recombinant glycoproteins produced in P. pastoris. a Typical N-glycan. b Typical O-glycan. Core structures are drawn with filled black symbols with full lines. Additional frequently occurring mannosyl residues are depicted with grey filling, while white symbols, grey borders and dotted lines indicate decreasing frequency of the respective residues on glycans. Black circle GlcNAc; Black diamond mannose. (P) and (β) indicate the possibility of a phosphomannose or a β-1,2 linkage, respectively. The involved enzymes are 1 Ost complex; 2 Och1; 3 M-Pol I and II; 4 Mnn2/5 family, Ktr family, or Bmt family; 5 Pmt complex

Yeast O-glycans begin with a mannose residue bound to serine or threonine. Further elongation is less well characterized as compared to N-glycosylation. Nett et al. (2013) report a chain length of 1–4 mannosyl residues in O-glycans of an IgG1 expressed in wild-type P. pastoris, with the majority having two or three residues. Recombinant human erythropoietin (hEPO) produced in P. pastoris is O-glycosylated at the same residue (Ser126) as the CHO cell derived and native hEPO (Dube et al. 1988; Gong et al. 2013). Different to more complex mammalian O-glycans, P. pastoris produced EPO carried mainly mannobiose structures and more rarely single mannose residues on Ser126 (Gong et al. 2013). The authors speculate ‘that the protein itself plays a more fundamental role than the expression host does in determining the O-glycosylation sites’. The same group also observed mannosylation of one threonine residue of recombinant hGCSF (again, the same residue that is O-glycosylated in the native human protein). In this case, only a single mannose was added to the protein (Gong et al. 2014). Elongation of O-glycans is performed by partly the same Golgi-resident mannosyl transferases also active on N-glycans, so that α-1,2 mannose, α-1,2 mannosylphosphate and β-1,2 mannose can be added. A consensus structure of P. pastoris O-glycans is illustrated in Fig. 2b. In summary, a single mannosyl residue may occur, or one or few additional α-1,2 mannoses with or without phosphorylation, and/or a β-1,2 mannose. Probably, the phosphomannose or β-1,2 mannose are terminal residues when they occur.

Protein glycosylation varies not only in glycan structure but also in the frequency of site occupancy. In recombinant IgG produced in P. pastoris, 75–85 % of potential N-glycosylation sites were actually occupied (Choi et al. 2012) while only 32 % of the O-glycosylation sites carried a glycan (Nett et al. 2013). N-Glycan occupancy of IgG was improved to more than 99 % by overexpression of the Leishmania major homolog of STT3, a subunit of the oligosaccharyl transferase (OST) complex (Choi et al. 2012). Also, naturally non-glycosylated proteins may be O-glycosylated in yeast, as e.g. recombinant insulin precursor, carrying short mono- or dimannosyl glycans on 5 % of the produced protein (Govindappa et al. 2013).

Engineering of N-glycosylation

Metabolic engineering of recombinant protein production hosts to enable the production of human-like N-glycan structures has been achieved in yeasts and fungi, plants and insect cells (Loos and Steinkellner 2012). In P. pastoris, this work encompassed prevention of a specific fungal α-1,6 mannosylation by deletion of OCH1, the removal of terminal mannose residues and the ordered addition of N-acetyl glucosamine (GlcNAc), galactose and sialic acid by recombinant expression of the respective glycosyl transferases and—where needed—the pathways towards synthesis of the respective activated sugars (Hamilton and Gerngross 2007). This modification of N-glycan structures will of course also modify all native P. pastoris N-glycoproteins which may lead to a decrease of fitness (and maybe also productivity). While the aspect of fitness has not been discussed in literature, it was shown that glycoengineered P. pastoris is able to produce recombinant human IgG in the g/L range in the lab and pilot scale (Ye et al. 2011).

Depending on the application of the protein, it may be desired to achieve human-like complex N-glycans as described above, or rather to prevent the yeast type high-mannose structure. In the latter case, trimming of glycans to a minimal GlcNAc-Man structure may be sufficient to achieve the goal of a small glycan. As an example, the so-called SuperMan5 strain was developed, which produces Man5GlcNAc2 structures (Jacobs et al. 2010). An even more radical strategy, termed GlycoDelete, has recently been described for mammalian cells and may be applied to P. pastoris as well. To trim native glycans back to the first GlcNAc Meuris et al. (2014) targeted an endo-β-N-acetylglucosaminidase (EndoT) of Hypocrea jecorina (anamorph Trichoderma reesei) to the trans-Golgi of human embryonic kidney (HEK) cells. Thus, native N-glycans are maintained in the ER for proper protein folding and then trimmed to the first GlcNAc in the late Golgi, where minimal human-like sugar structures (galactose and sialic acid) may be added.

Engineering of O-glycosylation

Two aspects have limited O-glycosylation engineering in yeast, as compared to N-glycosylation. Firstly, the pharmaceutically more interesting O-glycans in human proteins are initiated in the Golgi rather than the ER, where fungal (and also mammalian) mannosyl-type O-glycans are added. This means that while mammalian cells and yeasts share a pathway for O-mannosylation, yeasts lack the second pathway for complex O-glycosylation on which a humanization of the yeast structures could be built. Secondly, the full prevention of O-mannosylation in yeast could not be achieved, as it is apparently lethal to the producing cell. However, at least a reduction of O-mannosylation in P. pastoris was demonstrated recently by Nett et al. (2013). O-glycosylation is initiated by protein mannosyl transferases (Pmt), which are dimeric enzymes consisting either of one member of the Pmt1 family and one of the Pmt2 family, or of a homodimer of two Pmt4 subunits. In P. pastoris, the Pmt1 family members Pmt1 and Pmt5 and the Pmt2 family members Pmt2 and Pmt6 have been annotated (Delic et al. 2013; Nett et al. 2013). Knock-out of both members of one family is lethal, while one of the two can be deleted. PMT1 and PMT2 have the highest expression levels of these genes, and their deletion also has the highest impact on O-glycosylation (Nett et al. 2013). Deletion of PMT1 or PMT2 decreases site occupancy from 20–30 % down to about 4 %, and also reduces the chain length of the remaining O-glycans from a weighted average of 2.3 to 1.3 residues. Additional inhibition with sub-lethal concentrations of an O-glycosylation inhibitor decreased site occupancy further to below 2 % and the chain length to a single mannose (Nett et al. 2013). O-glycosylation of insulin precursor could be reduced by 60 % by deletion of PMT1, while single deletion of PMT4, PMT5 or PMT6 had only a minor impact (Govindappa et al. 2013).
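The weighted average chain length quoted above is the mean number of mannose residues per O-glycan, weighted by how often each chain length occurs. The sketch below shows the arithmetic. The two distributions are invented for illustration only and were chosen so that they reproduce the reported averages of 2.3 and 1.3; they are not the measured data of Nett et al. (2013), and the function name is hypothetical.

```python
def mean_chain_length(distribution: dict[int, float]) -> float:
    """Weighted average chain length.

    distribution maps chain length (number of mannose residues) to its
    relative abundance (counts or percentages; normalized internally).
    """
    total = sum(distribution.values())
    if total <= 0:
        raise ValueError("distribution must contain positive abundances")
    weighted_sum = sum(length * weight for length, weight in distribution.items())
    return weighted_sum / total


# Invented abundances (percent of O-glycans), NOT measured data
wild_type = {1: 20, 2: 40, 3: 30, 4: 10}
pmt_knockout = {1: 70, 2: 30}

print(round(mean_chain_length(wild_type), 1))     # 2.3
print(round(mean_chain_length(pmt_knockout), 1))  # 1.3
```

Many different distributions give the same average, so the weighted average alone does not reveal which chain lengths were lost.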

By deleting the genes responsible for phospho- and β-mannose addition and overexpressing an α-1,2 mannosidase, single-mannose O-glycans could be achieved in P. pastoris. Additional overexpression of GlcNAc, galactosyl and sialyl transferases, in the same ordered way as for N-glycoengineering, enabled the creation of a human-like α-dystroglycan type O-glycan structure on recombinant proteins (Hamilton et al. 2013).

To date, glycoengineering has mainly focused on mimicking human-like glycosylation patterns, which could be achieved efficiently for N-glycans. New developments address either the removal of unwanted N- and O-glycans or the modification of yeast glycan patterns with artificial oligosaccharide chains that share structural similarity with human glycans without rebuilding the entire complex structure.

Impact of proteolysis on recombinant protein production in P. pastoris

In eukaryotic cells, different types of proteolysis occur in different compartments. In yeast, ERAD, mistargeting to the vacuole and proteases located in the cell wall are potential causes of recombinant protein degradation. In the following section, the influence of these processes on recombinant protein production in P. pastoris will be discussed.

Proteolysis due to ERAD

In the ER, a strict quality control mechanism ensures that only properly folded proteins proceed along the secretory pathway. The nascent polypeptide is bound by the chaperone Kar2, which assists in protein folding. This process follows a trial-and-error principle until the correct conformation is found. If the protein remains bound to Kar2 for too long, the UPR and ER-associated degradation (ERAD) become involved. In ERAD, the misfolded protein is retro-translocated to the cytosol (through the translocon pore (Sec61 complex), also with the help of Kar2), where it is ubiquitinated and then transported to the proteasome for degradation (Delic et al. 2013). We could recently show that ERAD plays a crucial role in the production of recombinant proteins, as about 60 % of the newly synthesized antibody fragment Fab3H6 in P. pastoris was degraded intracellularly (Pfeffer et al. 2011). A major role was attributed to ERAD, as inhibition of the proteasome increased secretion threefold, which led to the speculation that the ER quality control mechanism overshoots in P. pastoris when secreting recombinant proteins, especially Fab fragments (Pfeffer et al. 2012). Binding partners associated with ERAD, but also with the vacuole, were found in co-immunoprecipitation with the Fab. ERAD seems to be induced only upon strong or prolonged ER stress and was also observed for strains producing membrane proteins (Vogl et al. 2014), ER-resident hepatitis B surface antigen (Vanz et al. 2012), or aggregation-prone variants of lysozyme (Whyteside et al. 2011a), but not for strains producing secretory proteins in general. In the latter example, mutational variants of human lysozyme revealed that less stable variants were retained in the cells, leading to the induction of UPR and ERAD. This was tested by qRT-PCR of common marker genes such as KAR2, HAC1 and PDI1 for UPR and HRD3, DER1 and SEC61 for ERAD. The less stable the lysozyme variant, the higher the expression of UPR and ERAD marker genes that was detected. Similar results were observed for scFv antibodies as well (Whyteside et al. 2011a).
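Marker-gene induction measured by qRT-PCR is commonly expressed as a fold change relative to a control strain. The sketch below uses the widely applied 2^-ΔΔCt calculation, which assumes close to 100 % amplification efficiency; whether the cited studies used exactly this approach is an assumption, and all Ct values shown are hypothetical numbers chosen only to illustrate the arithmetic.

```python
# Relative expression by the 2^-ddCt method (an assumed analysis; the exact
# quantification used in the cited studies is not stated in this review).
# All Ct values below are hypothetical and only illustrate the calculation.

def fold_change(ct_target_sample: float, ct_ref_sample: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene in a sample versus a control strain,
    each normalised to a reference (housekeeping) gene."""
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    return 2.0 ** -(d_ct_sample - d_ct_control)

# Tuple order: target Ct in producing strain, reference Ct in producing strain,
#              target Ct in control strain,   reference Ct in control strain.
hypothetical_cts = {
    "KAR2": (20.0, 18.0, 22.0, 18.0),   # UPR marker
    "HRD3": (24.0, 18.0, 25.0, 18.0),   # ERAD marker
}

for gene, cts in hypothetical_cts.items():
    print(f"{gene}: {fold_change(*cts):.1f}-fold relative to control")
```

With these made-up values, a Ct that is two cycles lower after normalisation corresponds to a fourfold induction, and one cycle lower to a twofold induction.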

Interestingly, a very recent paper on the production of insulin precursor (a quite stable protein, with only about 10 % retained in the cells) in P. pastoris reported the opposite effect on KAR2 and PDI1 mRNA levels 48 h after methanol induction. The mRNA levels decreased over time; however, compared to the non-producing host strain, they were slightly increased in insulin precursor producing cells. Also, proteins related to ERAD (e.g. marker gene CDC48) and ubiquitination (e.g. marker gene UBA1) were strongly decreased in abundance in the methanol fed-batch phase. The authors suggest that the observed effect of the methanol fed-batch on UPR and ERAD related proteins is due to effects other than recombinant protein production (Vanz et al. 2014).

A further influence on ERAD could be the hydrophobicity of the product. In 2002, it was observed for S. cerevisiae that hydrophobic cutinase, where hydrophobic patches were introduced to increase activity, was retained in the ER and finally degraded by ERAD (Sagt et al. 2002). During the folding process, proteins expose their hydrophobic parts until the correct conformation is found. If too many hydrophobic parts are present, it is very likely that the protein is recognized as partially unfolded and therefore targeted for degradation.

Proteolysis in the vacuole

There are two main pathways to the vacuole. The first is the CPY pathway, named after its most prominent cargo, carboxypeptidase Y, which is transported via the endosomes to the vacuole. The second (ALP pathway), named after its cargo alkaline phosphatase, leads directly from the late Golgi to the vacuole (Bowers and Stevens 2005; Conibear and Stevens 1998). Degradation of recombinant product in the vacuole is a known bottleneck in recombinant protein production in yeasts such as S. cerevisiae and S. pombe (Idiris et al. 2006). The importance of vacuolar proteases for protein production in P. pastoris is reflected by the availability of the Invitrogen PEP4 (proteinase A) and PRB1 (proteinase B) knock-out strains; these proteases self-activate and also activate further proteases in the vacuole (Invitrogen Manual: PichiaPink Expression System 2014).

Proteolysis due to yapsins

Yapsins are a further source of proteolysis acting on recombinant secretory proteins. They are GPI-anchored aspartyl proteases which are involved in cell wall assembly and remodeling and are also important for cell wall integrity. Yapsins were first identified in S. cerevisiae (Gagnon-Arsenault et al. 2006; Krysan et al. 2005). In P. pastoris, degradation of secreted collagen-inspired proteins could be prevented by yapsin 1 disruption (Silva et al. 2011). Similarly, human parathyroid hormone (fused to HSA) could be stabilized in a PEP4/YPS1 double knock-out strain. In this work, seven putative yapsin genes (YPS1, YPS2, YPS3, YPS7, MKC7, YPS’, YPS”) were disrupted, but only yapsin 1 showed an impact on protein degradation. In combination with the PEP4 (proteinase A) knock-out, the level of non-degraded product increased from ∼40 % to 80 % (Wu et al. 2013).

Proteolytic processes have a clear impact on recombinant protein production in P. pastoris; however, their impact on individual products, as well as the relative influence of the different processes, is difficult to predict.

Limitations in exocytosis in respect of recombinant protein secretion in P. pastoris

The final step in the secretory pathway is the transport of the cargo in light and dense vesicles that bud off at the trans-Golgi network (constitutive secretion) and fuse with the plasma membrane (Fig. 1). Exocytosis is mediated by a complex called the exocyst, in which many different proteins function together (Harsay and Schekman 2002; TerBush et al. 1996). Besides the exocyst, v-SNAREs and t-SNAREs are involved in exocytosis. The Golgi-derived vesicles, which bear the v-SNAREs Snc1 and Snc2 (Gerst et al. 1992; Protopopov et al. 1993), fuse with the plasma membrane, where the t-SNAREs Sso1 and Sso2 are located (Aalto et al. 1993).

For S. cerevisiae, it was reported that overexpression of the yeast syntaxin Sso (either Sso1 or Sso2) enhanced production of secreted proteins (Bacillus α-amylase and invertase) several fold (Ruohonen et al. 1997; Toikkanen et al. 2004). However, overexpressing S. cerevisiae Sso2 in P. pastoris resulted in only a 20 % increase of secreted protein (Gasser et al. 2007b). This could be due to a gene copy effect, as only one copy was integrated in P. pastoris, in contrast to the multicopy strain used in S. cerevisiae. Furthermore, overexpression in P. pastoris of S. cerevisiae Kin2, a protein kinase regulating the final step of exocytosis, led to a 2-fold increase in Fab production (Gasser et al. 2007b).

A further barrier in recombinant protein production is the cell wall. Gas1 is a GPI-anchored protein in the outer layer of the plasma membrane that plays an essential role in cell wall integrity. In S. cerevisiae GAS1 null mutants, the cell wall is highly resistant to zymolyase, more sensitive to weakening by cell wall-perturbing agents (Lesage et al. 2005) and less protected against osmotic destabilizing agents (Vai et al. 1996), while the cells exhibit enhanced secretion of recombinant insulin-like growth factor 1 (Vai et al. 2000). Therefore, a similar effect was expected upon deletion of GAS1 in P. pastoris. However, the secretion efficiency could only be improved for Rhizopus oryzae lipase (increased 2-fold), whereas trypsinogen and albumin did not show any enhancement (Marx et al. 2006). A leakier cell wall phenotype was also reported for some of the P. pastoris mutants with an enhanced beta-galactosidase secretion phenotype; however, in this study, too, the effects were protein-specific (Larsen et al. 2013).

Conclusions

Products made with P. pastoris have successfully been introduced to the market. However, many potential products in development require high production volumes at low cost, which puts pressure on the development of new, more productive strains. Systems biology has provided a wealth of novel mechanistic understanding that has been employed for strain improvement. However, many of these solutions are still isolated, addressing single steps of the cellular protein production process. To enable a next level of productivity, future developments need to take a more systemic view of understanding and engineering the entire process chain of protein folding and transport.