Introduction

Recent advances in genetic manipulation methods have expanded the range of microorganisms that can synthesize refined chemicals from glucose and xylose at higher rates and efficiencies. Through recent research, yeasts have been engineered to metabolize not only these sugars but also other substrates, generating diverse chemicals of industrial significance. Yeasts are prominent eukaryotic hosts for producing functional recombinant proteins, for several reasons: they perform suitable post-translational modifications, are simple to manipulate genetically, generate substantial biomass, grow rapidly, are free of pyrogens, and can express complex heterologous eukaryotic proteins (Baghban et al. 2019).

The yeast Pichia pastoris, recently reclassified as Komagataella pastoris, has emerged as a significant asset in biotechnology, particularly for producing heterologous proteins (Besleaga et al. 2023). Phillips Petroleum introduced it over four decades ago for the commercial production of single-cell protein (SCP), which is used as an additive in animal feed. The process relied on high-cell-density fermentation with methanol as the carbon source. The price of methanol surged during the 1973 oil crisis, rendering SCP production economically unfeasible. During the 1980s, P. pastoris emerged as an alternative host for expressing heterologous proteins, made possible by the powerful and tightly controlled AOX1 promoter (Cregg et al. 1985). The AOX1 promoter proved remarkably effective at producing heterologous proteins when combined with the fermentation method already developed for SCP. In the 1990s, production of a plant-derived hydroxynitrile lyase at more than 20 g/l of recombinant protein was established as one of the first large-scale industrial processes (Hasslacher et al. 1997). P. pastoris is a well-known methylotrophic yeast, that is, a yeast able to metabolize methanol. It has numerous advantageous characteristics, including minimal nutrient requirements, the capacity to reach high cell densities even in acidic culture media, and extensive use in biotechnological applications for producing commercially significant heterologous proteins (Ata et al. 2021).
Hence, the yeast's capacity to use methanol as both a carbon and energy source, its non-fermentative use of glucose in the presence of oxygen, and its efficient growth on glycerol are among the primary reasons this yeast is favored for bioprocess development (Bernauer et al. 2021). These features make P. pastoris a suitable host for numerous industrial applications. Recently, it has increasingly been referred to as the "biotech yeast" because of its extensive use as a cell factory in metabolic engineering and synthetic biology, with the ultimate goal of producing valuable chemicals (Fig. 1) (Chiang et al. 2020).

Fig. 1

Schematic image of the various sources of carbon in Pichia pastoris

P. pastoris has already been used with considerable success as an efficient host for expressing heterologous proteins (Barone et al. 2023). Nevertheless, its efficiency for chemical production still needs to be improved, particularly through optimization of methanol metabolism (Anggiani et al. 2018) and fermentation parameters (Sun et al. 2020) and through the development of novel genetic tools, including promoters and plasmids (Gao et al. 2021). To meet these requirements, novel approaches, platforms, and strategies have been devised to facilitate the engineering of this yeast (Yang and Zhang 2018). The efficiency of targeted gene integration into the P. pastoris genome has been improved by deleting genes of the non-homologous end-joining pathway (ku70 (Näätsaari et al. 2012) and Dnl4p (Ito et al. 2018)) and by genome editing with CRISPR/Cas9 (Weninger et al. 2016). Although the capabilities of P. pastoris are extensive, it has certain limitations as a protein expression system. Chiefly, it cannot reliably express every desired gene: while some proteins are produced without difficulty, others encounter problems related to RNA stability, glycosylation, protein folding, or secretion. Despite these limitations, P. pastoris has stood the test of time as an expression system, and new advances continue to be made. For example, the AOX1 promoter, formerly induced exclusively by methanol, has been modified by VALIDOGEN (Grambach, Austria) to give robust expression in methanol-free conditions (Carneiro et al. 2021).

This review focuses on current achievements and strategies in the use of P. pastoris as an expression host. It also presents the advantages and drawbacks of using this yeast to produce recombinant proteins. Furthermore, it identifies a number of common obstacles that arise when producing recombinant proteins in this host, together with ways to overcome them. These challenges include identifying strains with high expression levels, improving secretion efficiency, and reducing hyperglycosylation.

Background of P. pastoris as an expression host system

Proteins are biomolecules that serve a multitude of functions in all organisms, and they are used in research, therapy, and disease prevention. However, the quantity of protein available from natural sources is inadequate to meet these demands, particularly in scientific research, where specific proteins of sufficient purity and quantity must be obtained quickly and at minimal expense (Safdar et al. 2018). A recombinant protein is a protein produced through genetic engineering: recombinant DNA encoding the protein is introduced into an expression system that transcribes and translates it into a protein of potential commercial value. The recombinant DNA is designed so that the target gene is placed under the control of a suitable promoter, giving enhanced expression of the target protein in the chosen expression system (Chumnanpuen et al. 2016). Selecting the proper expression system is a major factor in the successful expression of recombinant proteins. Several factors are taken into consideration, including the properties of the target protein, its intended application, the protein yield, the cost involved, and the capacity for post-translational modification. The cellular location and molecular weight of the protein are also important (Madhavan et al. 2021). Proteins are susceptible to denaturation caused by salt concentration, temperature fluctuations, pH, organic solvents, and interactions at surfaces and interfaces. These interactions can lead to the formation of protein aggregates during lyophilization.
Hence, it is imperative to conduct the genetic engineering procedure with great caution in order to generate a fully intact and operational protein (Ma et al. 2020).

The primary approach for generating a recombinant protein is to introduce a vector carrying the desired DNA fragment into a host. E. coli, a prokaryote, has historically been the bacterial host of choice because of its early adoption and advanced development (Yuan et al. 2021). E. coli offers rapid reproduction, simple and inexpensive isolation with high expression levels, a genetic background that is easy to modify, and a stable and diverse range of uses (Huleani et al. 2021). However, proteins are frequently misfolded in the cytoplasm of prokaryotic systems, leading to the formation of insoluble protein aggregates (Mital et al. 2021). These aggregates, known as inclusion bodies, complicate purification. Moreover, the limited post-translational modification capacity of prokaryotes calls for a more advanced system, one able to express proteins that were previously out of reach, including glycosylated proteins (Kumar and Kumar 2019; Kis et al. 2019; Gomes et al. 2018). Yeasts are eukaryotic cells that can be used as a heterologous system for expressing recombinant proteins. In general, yeasts can be divided into two groups, non-methylotrophic and methylotrophic. Saccharomyces cerevisiae, a non-methylotrophic yeast, and Pichia pastoris and Hansenula polymorpha, which are methylotrophic yeasts, are frequently used as hosts for producing recombinant proteins (Baghban et al. 2019). Recombinant proteins are used, for example, in the production of vaccines for the prevention of infectious diseases.
Methylotrophic yeast is currently used on a large scale because, like certain prokaryotes, it can obtain energy from C1 compounds (methanol) and use methylamine as a nitrogen source (Zaver et al. 2021). The fundamental principle of recombinant protein production is shown in Fig. 2.

Fig. 2

The process of producing the recombinant protein in Pichia pastoris

The process begins with the insertion of the genes of interest into recombinant plasmids, followed by protein expression; together these make up the upstream stages. Protein purification and characterization constitute the downstream stages (Ma et al. 2020). The production of recombinant proteins in yeast follows the central dogma. The target gene to be expressed is cloned into the chosen vector. The construct is then introduced into suitable yeast cells by electroporation. The transformants are inoculated onto growth medium and cultivated on a small scale. Clones showing high levels of protein expression can then be scaled up for further production. Cultivation depends on the energy source supplied in the culture medium (Matthews 2019). Non-methylotrophic yeast obtains its energy by fermentation. Therefore, the glucose source, pH, dissolved oxygen, temperature, and osmolarity of the culture medium must be considered when such yeast is used as a system for producing recombinant proteins.

Historically, P. pastoris was first isolated from the exudate of a chestnut tree in France and was designated Zygosaccharomyces pastoris (Zahrl et al. 2017). Yamada et al. (1995) later reassigned the organism to a novel genus, Komagataella, although the name Pichia remains in common use (Naumov et al. 2018). As a methylotrophic yeast, P. pastoris can use methanol as its sole source of energy and carbon (Baghban et al. 2019). The wild-type strain, Y-11430, is not employed for protein expression. By contrast, GS115 is a highly favored strain used extensively as an expression system, particularly in industry and medicine (Julien 2006). GS115 carries two genes, AOX1 and AOX2, that encode the alcohol oxidase (AOX) enzyme. In the presence of methanol, transcription of AOX1 and AOX2 is activated, leading to the production of large quantities of AOX (Vanz et al. 2012). Both genes contribute to AOX synthesis, but AOX1 accounts for the greater share of the enzyme. Consequently, deletion of AOX1 markedly slows growth on methanol, a phenotype referred to as methanol utilization slow (MutS). Deletion of AOX2 does not reduce the growth rate on methanol, which remains comparable to that of the methanol utilization plus (Mut+) phenotype, also known as the wild type. When both genes are eliminated, however, the strains cannot grow on methanol (methanol utilization minus [Mut−]) (Camara et al. 2017).
In the KM71 strain, which is related to GS115, the AOX1 gene has been eliminated, making it a MutS strain (Charoenrat et al. 2013). The older strains KM7121, MC100-3, and MC101-1 cannot use methanol as a carbon source because they lack the AOX genes. Consequently, these strains are unable to grow when methanol is the carbon source (Ergun 2018).
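
The relationship between the two AOX genes and the methanol utilization phenotypes described above can be summarized as a small lookup. The following Python sketch is illustrative only: the function name `mut_phenotype` and the `STRAINS` table are hypothetical, and the genotypes are simplified from the descriptions in this section (GS115 with both genes intact, KM71 lacking AOX1, MC100-3 lacking both).

```python
def mut_phenotype(aox1_functional: bool, aox2_functional: bool) -> str:
    """Return the methanol utilization phenotype implied by the AOX genotype.

    Simplified rule taken from the text: AOX1 supplies most of the alcohol
    oxidase, so losing AOX2 alone leaves growth comparable to wild type.
    """
    if aox1_functional:
        return "Mut+"   # wild-type-like growth on methanol
    if aox2_functional:
        return "MutS"   # slow growth: only AOX2 remains
    return "Mut-"       # no alcohol oxidase, no growth on methanol


# Hypothetical, simplified genotype table: (AOX1 functional, AOX2 functional)
STRAINS = {
    "GS115": (True, True),
    "KM71": (False, True),
    "MC100-3": (False, False),
}

if __name__ == "__main__":
    for strain, genotype in STRAINS.items():
        print(strain, mut_phenotype(*genotype))
```

A table of this kind can help when choosing a host strain, since the phenotype determines how methanol feeding is handled during induction.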

The cloning and expression system in P. pastoris

When first setting up cloning and expression in P. pastoris, several factors must be considered at the outset. These include the choice of promoter-terminator combination, suitable selection markers, and vector systems for either intracellular or secreted expression, the latter requiring careful selection of an appropriate secretion signal. Choosing an appropriate expression vector and a compatible host strain is essential for the successful expression of recombinant proteins.

Promoters

The commonly used promoters are AOX1, YPT1, GAP, NPS, FLD1 and ICL1. Promoters can be divided into two groups: natural and synthetic (Juturu and Wu 2018). Because natural promoters offer limited control over expression levels and regulatory features, numerous efforts have aimed to modify promoters in P. pastoris, with particular emphasis on PAOX1 (Vogl 2022). In addition to PAOX1, several other methanol-inducible promoters have been characterized in P. pastoris.

Tightly regulated promoters, such as the alcohol oxidase promoter (AOX1), offer advantages for protein overexpression. Because growth is decoupled from the production phase, biomass accumulates before expression of the protein is initiated. During the growth phase the cells are not stressed by the accumulation of recombinant protein, and it even becomes feasible to produce proteins that are harmful to P. pastoris. It may also be advantageous to co-express helper proteins such as chaperones at defined times, for instance before synthesis of the target protein begins. Constitutive promoters, by contrast, can simplify process handling. They are commonly used to express selection markers. In metabolic pathway engineering, well-tuned constitutive promoters could additionally allow metabolite levels to be controlled (Vogl and Glieder 2013). Table 1 summarizes widely used and extensively studied promoters, as well as those investigated more recently.

Table 1 The most commonly used promoters for recombinant protein expression in P. pastoris

Inducible promoters

The AOX1 promoter (PAOX1), which was the first to be used for the expression of foreign genes (Tschopp et al. 1987), remains the most frequently employed promoter (Dharmorathna et al. 2023; Lünsdorf et al. 2011; Yu et al. 2013) because it is tightly regulated. PAOX1 is strongly repressed when P. pastoris uses glucose, ethanol, or glycerol as its carbon source (Singh and Nerang 2023). The promoter is de-repressed when these carbon sources are exhausted, but it is activated only when methanol is added. Methanol is highly flammable and hazardous, which makes it undesirable for large-scale fermentations. Alternative inducible promoters, or PAOX1 variants that achieve high expression levels without methanol induction, are therefore desirable. A recently published patent application describes a technique in which expression is driven by methanol-inducible promoters, exemplified by AOX1, formate dehydrogenase (FMDH), and methanol oxidase (MOX), without methanol supplementation (Vogl et al. 2018). This was accomplished by constitutively co-expressing the positive transcriptional regulator Prm1p from the GAP, TEF, or PGK promoter. In the absence of methanol, the phytase reporter protein showed a threefold increase in relative activity compared with a control strain expressing PRM1 from its native promoter. Using EGFP as a reporter, certain promoter alterations were found to give altered expression levels, ranging from 6 to 160% of that of the original PAOX1 promoter. Several additional controllable promoters are currently being examined for their capacity to drive high levels of expression (Vogl et al. 2018) (Table 1).

A recent investigation illustrated the use of three newly discovered inducible promoters from P. pastoris. These promoters, from the glycerol kinase (GUT1), enolase (ENO1), and alcohol dehydrogenase (ADH1) genes, show attractive regulatory characteristics (Cregg and Tolstorukov 2012). Constitutive expression, in contrast, simplifies process handling, eliminates the need for potentially hazardous inducers, and ensures constant transcription of the gene of interest. The glyceraldehyde-3-phosphate dehydrogenase promoter (PGAP) is frequently used for this purpose, as it reaches nearly the same expression levels on glucose as the methanol-induced PAOX1 (Waterham et al. 1997). Expression levels from PGAP decrease by approximately 50% when cells are grown on glycerol and by about 33% when they are grown on methanol (Juturu and Wu 2018). A recent DNA microarray study identified new promoters that are repressed in the presence of glycerol but induced on transition to glucose-limited media (Prielhofer et al. 2013). The strongest promoters identified by this approach are reported to control the expression of a high-affinity glucose transporter, HGT1, and a putative aldehyde dehydrogenase. In some cases, the ability to fine-tune expression levels is desired, in order to (1) co-express additional proteins that aid the production and secretion of recombinant proteins, (2) carry out post-translational protein modification, and (3) construct complete metabolic pathways comprising a series of distinct enzymatic reactions (Pan et al. 2022; Ahmad et al. 2014).

Engineering of AOX1 promoter

PAOX1 has consistently been the preferred promoter for constructing recombinant protein (RP) expression vectors in P. pastoris (Vogl and Glieder 2013). It is strongly repressed by glucose or glycerol but activated by methanol, which allows cells to use methanol as their sole carbon source for growth. In effect, it separates the stages of cell growth and protein production. These advantages make it highly beneficial for achieving increased protein expression levels, and in specific instances it can substitute for the constitutive PGAP promoter (Pan et al. 2022). Progress in PAOX1 engineering was stimulated by the elucidation of its regulatory mechanism. Mxr1, the methanol expression regulator, plays a crucial role in governing the methanol utilization pathway and can activate numerous genes in response to methanol (Wang et al. 2016). According to Zhan et al. (2017), overexpression of Mxr1 enhances the expression of AOX1 by suppressing the expression of glycerol transporter 1 (GT1). Camara et al. (2019) found that when a deregulated version of Mxr1 is overproduced in strains carrying multiple PAOX1 expression cassettes, a basal output can still be maintained despite partial rewiring of the PAOX1 transcriptional circuits. CRISPR/Cas9, together with plasmids carrying an sgRNA and homology arms for Mxr1, has proved effective for precisely editing Mxr1 (Hou et al. 2020). In particular, frame-shift mutations in Mxr1 could diminish AOX1 protein levels and reduce the productivity of the enzyme (Hou et al. 2020). In another study, PAOX1 was modified by mutating the core promoter region, replacing the original triplet sequences with cytosine or adenine triplets.
These modifications were entirely artificial, which indicates that core promoters tolerate minor mutations well. The finding supports regulatory models based on flexible motifs or redundant designs (Portela et al. 2018). In a more recent investigation, PAOX1 was engineered using synthetic Aca2, Adr1, and Cat8 DNA elements, which replaced particular regulatory DNA elements responsible for Mxr1, Cat8, and Aca1 binding. A composite promoter containing three Cat8 motifs, three palindromic Adr1 motifs, and synthetic Aca2 binding motifs gave a methanol-induced strength 1.97 times that of PAOX1 (Ergün et al. 2020).

Advanced synthetic promoters

Although protein expression in P. pastoris has largely been limited to PAOX1 or PGAP (Yang and Zhang 2018), scientists have been striving to develop synthetic promoters to replace them. Engineered promoter variants (EPVs) are significantly more effective than natural promoters and enable environmentally friendly production methods in which non-hazardous carbon sources are preferred. Ergün et al. (2019) conducted a study to enhance recombinant protein expression through the ethanol utilization pathway. To create new promoter variants, they manipulated the transcription factor binding sites within the alcohol dehydrogenase 2 promoter (PADH2). Hybrid-promoter architectures are the preferred approach for creating EPVs. One effective way to build such an architecture is to use synthetic DNA sequences in place of native cis-acting DNA elements (Ergün et al. 2020). A novel approach in promoter design combines monodirectional double-promoter expression systems (DPESs) with a hybrid architecture. This design incorporates engineered promoter variants such as PADH2-Cat8-L2 and PmAOX1, along with the natural promoter PGAP, to enhance and upregulate the expression of deregulated genes in P. pastoris, particularly in methanol-free media (Demir and Calik 2020). Compared with twin DPESs, bifunctional DPESs showed amplification of both transcription and expression, demonstrating their superior upregulation capability. The study by Ergun and Calik (2021) highlights advances in cis-acting DNA elements for P. pastoris promoter engineering and offers a deep understanding of the functional elements within the DNA sequence.
This knowledge enables the creation of nontraditional promoter libraries with enhanced potency and unique regulatory mechanisms. Evidently, there is a growing inclination towards using modified promoter sequences to enhance the production of proteins in the P. pastoris system (Machens et al. 2017).

Vectors

P. pastoris vectors can be classified into two categories according to where the expressed protein is located. The first type comprises intracellular expression vectors (e.g. pPIC3, pPHIL-D2, pPICZ), and the second comprises secretory expression vectors (e.g. pPIC9, pPIC9K, pPICZα). In secretory expression vectors, a signal peptide sequence is typically placed downstream of the promoter (Vijayakumar and Venkataraman 2023). P. pastoris generally relies on integrative plasmids to express foreign genes, and these are commonly designed as shuttle vectors that can be used in both E. coli and P. pastoris. For propagation in E. coli, the vectors carry a replication origin and a selection marker, the latter usually an antibiotic resistance gene. For expression of foreign genes in P. pastoris, they carry a promoter/terminator pair, a site for inserting new genes, and a suitable selection marker (Nakamura et al. 2018; Gu et al. 2019). Recent advances in cloning and genome engineering techniques, including Golden Gate, Gateway®, Gibson Assembly, TALEN, and CRISPR/Cas9, have significantly improved the efficiency and specificity of cell engineering and have transformed the field (Casini et al. 2015). Golden Gate cloning relies on type IIS restriction enzymes, which cleave DNA outside their recognition sequence. The method offers significant advantages: it does not require extensive flanking DNA, uses efficient one-pot reactions, enables scar-free cloning, and is a cost-effective alternative to other sophisticated methods (Prielhofer et al. 2017). Golden Gate assembly uses two different type IIS restriction endonucleases, namely BsaI and BpiI.
These enzymes cut outside their recognition sequence and produce four-nucleotide overhangs. Obst et al. (2017) and Schreiber et al. (2017) have documented the application of Golden Gate cloning in P. pastoris to create collections of expression cassettes. Standardized components such as promoters, ribosome binding sites, secretion signals, and terminators were assembled quickly and efficiently, and the resulting cassettes were evaluated for their ability to produce reporter proteins. The aim of these studies was to optimize a single transcription unit for the production of a particular foreign target protein.
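
Because type IIS enzymes cut at a fixed distance outside their recognition sequence, the overhang left at each site can be read directly from the DNA sequence. The Python sketch below illustrates this under stated assumptions: it uses the commonly cited cut positions for BsaI (recognition site GGTCTC, top-strand cut 1 nt downstream) and BpiI (GAAGAC, top-strand cut 2 nt downstream), scans only the top strand in the forward orientation, and uses made-up example sequences. The names `ENZYMES` and `overhangs` are hypothetical and not taken from any of the cited studies.

```python
# Recognition site and number of nucleotides between the site and the
# top-strand cut (assumed values for BsaI and BpiI; forward orientation only).
ENZYMES = {
    "BsaI": ("GGTCTC", 1),
    "BpiI": ("GAAGAC", 2),
}

OVERHANG_LENGTH = 4  # both enzymes leave a four-nucleotide 5' overhang


def overhangs(sequence: str, enzyme: str) -> list:
    """Return (site_position, overhang) pairs for forward-oriented sites."""
    site, offset = ENZYMES[enzyme]
    seq = sequence.upper()
    results = []
    start = seq.find(site)
    while start != -1:
        cut = start + len(site) + offset
        overhang = seq[cut:cut + OVERHANG_LENGTH]
        # Skip sites too close to the end to yield a full overhang.
        if len(overhang) == OVERHANG_LENGTH:
            results.append((start, overhang))
        start = seq.find(site, start + 1)
    return results


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    print(overhangs("AAGGTCTCAAATGCCCC", "BsaI"))
    print(overhangs("TTGAAGACTTGGAGCC", "BpiI"))
```

In an assembly design, two parts join in the intended order only when the overhang at the end of one part matches the overhang at the start of the next, which is why the overhangs are usually listed and checked before ordering primers.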

The vectors are built in a standard configuration as a dual-function system that allows replication in E. coli and maintenance in P. pastoris. Selection relies on markers that are either auxotrophy markers (e.g., ADE1, ARG4, URA3, URA5, GUT1, HIS4, MET2) or genes conferring resistance to blasticidin S, Zeocin™, or geneticin (G418). Although episomal plasmids have occasionally been used to express heterologous proteins or to screen mutant libraries in P. pastoris (Chen et al. 2017; Uchima and Arioka 2012), integration of the plasmid into the host genome is the preferred approach. In contrast to Saccharomyces cerevisiae, where homologous recombination (HR) predominates, non-homologous end-joining (NHEJ) occurs frequently in P. pastoris. Life Technologies (Carlsbad, CA, USA) offers standard vector systems for intracellular and secretory expression. These systems include a constitutive promoter (PGAP) and promoters inducible by methanol or methylamine (PAOX1, PFLD). Several methods exist to disrupt the OCH1 gene and to express various glycosidases or glycosyltransferases in P. pastoris, ultimately yielding N-glycan structures resembling those found in mammals. A series of plasmids has been designed for either secreted or intracellular protein expression in P. pastoris. These plasmids contain the strong AOX1 promoter. The vectors include restriction sites for linearization within the marker genes, which allows the expression cassettes to be targeted precisely to the desired locus. These sites also facilitate the integration of multiple copies of the vector (Lin-Cereghino et al. 2001).
The pPp plasmids described by Näätsaari et al. (2012) are vectors carrying either the GAP or the AOX1 promoter. For secretory expression, they also include the S. cerevisiae α-mating factor (α-MF) secretion signal. In the pPpB1 and pPpT4 vectors, the antibiotic selection marker cassettes are placed under the control of the ADH1 or ILV5 promoter, respectively. Vectors based on pPpT4 are reported to yield lower gene copy numbers in the cell than vectors based on pPpB1. For intracellular expression, the target genes are cloned using EcoRI and NotI, and the Kozak consensus sequence must be restored for efficient translation initiation. A distinguishing feature of these vectors is that the EcoRI site was introduced into the AOX1 promoter sequence by a single point mutation, without affecting promoter function. The gene of interest can therefore be fused to the promoter without any extra nucleotides between the promoter and the start codon. Using the ARG4 promoter to drive the selection marker offers an additional benefit. This weaker promoter facilitates selection by allowing a reduced Zeocin™ concentration, 25 μg/ml instead of the standard 100 μg/ml. The lower Zeocin™ concentration effectively prevents false-positive clones (Ahmad et al. 2014).

In P. pastoris, expression of a recombinant gene can be divided into three phases: (a) cloning of the new gene into an appropriate expression vector, (b) insertion of the cloned vector into the genome of the yeast host, and (c) evaluation of various strains for their ability to express the integrated recombinant gene (García-Suárez et al. 2021). To improve the integration of foreign DNA into the Pichia genome, the vector should be linearized with a restriction enzyme such as SacI, PmeI, or BstXI, as recommended by the EasySelect™ Pichia Expression Kit (Invitrogen). The linear DNA is then introduced into competent cells by electroporation. The inserted gene is incorporated into the genome by crossover recombination, resulting in the formation of recombinant cells (Fig. 3).
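
Linearization only works as intended if the chosen enzyme cuts the final construct exactly once, so it is common to check the cloned insert for additional sites before digestion. The following Python sketch shows one way to do this for a circular plasmid. It is a simplified illustration: the recognition sequences (SacI GAGCTC, PmeI GTTTAAAC, BstXI CCAN6TGG) are the standard ones, but the function names and the toy sequence are hypothetical, and the check does not consider where in the vector the single cut falls.

```python
import re

# Recognition sequences as regular expressions; all three sites are
# palindromic, so scanning one strand is sufficient.
SITES = {
    "SacI": "GAGCTC",
    "PmeI": "GTTTAAAC",
    "BstXI": "CCA[ACGT]{6}TGG",
}
SITE_LENGTH = {"SacI": 6, "PmeI": 8, "BstXI": 12}


def count_sites(plasmid: str, enzyme: str) -> int:
    """Count recognition sites in a circular plasmid sequence."""
    seq = plasmid.upper()
    # Append the start of the sequence so sites spanning the origin are found.
    circular = seq + seq[:SITE_LENGTH[enzyme] - 1]
    # A lookahead finds overlapping matches as well.
    return len(re.findall(f"(?=({SITES[enzyme]}))", circular))


def linearizing_enzymes(plasmid: str) -> list:
    """Return the enzymes that cut the plasmid exactly once."""
    return [name for name in SITES if count_sites(plasmid, name) == 1]


if __name__ == "__main__":
    # Toy sequence for illustration: one SacI site, two PmeI sites, no BstXI site.
    toy_plasmid = "ATGAGCTCAAGTTTAAACGGGTTTAAACTT"
    print(linearizing_enzymes(toy_plasmid))
```

If the gene of interest introduces a second site for the preferred enzyme, one of the other listed enzymes can be used instead, provided it still cuts only once.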

Fig. 3

Crossover recombination in the genome of P. pastoris following introduction of the linearized vector by electroporation

Typically, only one crossover event takes place within the genome, although multiple insertions happen in approximately 1–10% of cases using the kit. Expression vectors are one of the key components of the P. pastoris expression system. The vectors consist of three distinct sequences. The first, located in the 5' region, is the promoter sequence, with AOX1 being the most frequently used. The second, found in the 3' region, is responsible for transcriptional termination and plays a crucial role in the processing and polyadenylation of messenger RNAs. The third contains single or multiple cloning sites, which are indispensable for insertion of the gene of interest. Episomal vectors are capable of replicating either independently or in conjunction with a chromosome. The vectors used in P. pastoris, however, are not stably maintained as episomes; consequently, they must first be linearized with restriction enzymes before being integrated into the yeast chromosome (Tripathi and Shrivastava 2019). Expression vectors for P. pastoris are E. coli/P. pastoris shuttle vectors, meaning they can be propagated in two distinct host species. The vectors additionally carry selectable marker genes, including Kan, Shble, Bsd, Amp, and FLD1, which confer resistance to geneticin, Zeocin™, blasticidin, ampicillin, and formaldehyde, respectively (Ilgen et al. 2005).

Aspects of secretory expression

One of the primary benefits of using P. pastoris as a host for protein production lies in its capacity to secrete substantial quantities of correctly folded, post-translationally modified, and biologically functional recombinant proteins into the culture medium. As a general principle, proteins that are secreted by their native organisms will also be secreted by P. pastoris. Nevertheless, the successful secretion of normally intracellular proteins, including GFP and human catalase, has also been documented (Pan et al. 2022; Eiden-Plach et al. 2004). The secretion signals most frequently used in P. pastoris originate from S. cerevisiae α-MF, S. cerevisiae invertase (SUC2), and the endogenous acid phosphatase of P. pastoris (PHO1) (Barone et al. 2023). The α-MF signal sequence comprises a pre- and a pro-region and has proved highly effective in guiding proteins along the secretory pathway in P. pastoris. The pre-region is responsible for directing the nascent protein post-translationally into the endoplasmic reticulum (ER) and is subsequently removed by signal peptidase activity (Prielhofer et al. 2013). The pro-region is believed to facilitate the transfer of the protein from the ER to the Golgi compartment and is ultimately cleaved by the endoprotease Kex2p at the dibasic KR site. One of the prevalent challenges encountered with the α-MF secretion signal is heterogeneity at the N-terminus of the recombinant protein caused by incomplete processing by Ste13. Eliminating the sequences processed by Ste13, nevertheless, has the potential to affect protein yield.
The simultaneous overexpression of HAC1, a transcription factor involved in the unfolded protein response (UPR) pathway, together with the adenosine A2 receptor, a membrane protein, yielded a favorable outcome in terms of appropriate processing of the α-MF signal sequence (Guerfal et al. 2010). An increase in the production of secretory proteins has also been observed after optimization of the amino acids at the Kex2 P1' site. Various approaches have been employed to augment the secretory capacity of the α-MF signal sequence. These encompass directed evolution (Rakestraw et al. 2009), the introduction of spacers and deletion mutagenesis (Lin-Cereghino et al. 2013), and codon optimization (Aza et al. 2021). Based on a predicted model of the α-MF signal peptide, deletion mutagenesis led to an increase of 50% in the secretion levels of Candida antarctica lipase B (CALB) and horseradish peroxidase in P. pastoris (Lin-Cereghino et al. 2013). Decreasing the hydrophobicity of the leader sequence by removing hydrophobic residues or replacing them with charged or polar amino acids increased the structural flexibility of the α-MF signal sequence. This, in turn, enhanced the overall secretion capacity of the pro-region. Table 2 summarizes the alternative signal sequences employed for protein secretion, along with their distinctive characteristics and applications (Prielhofer et al. 2013).

Table 2 Signal sequences employed in protein secretion, with their characteristics and applications

Besides the choice of secretion signal, numerous additional factors govern the effective secretion of proteins. Newly synthesized proteins are translocated into the ER lumen through the Sec61p translocon, either co-translationally or post-translationally. There, proteins can undergo one or more post-translational modifications, such as folding into their native conformation, formation of disulfide bonds, glycosylation, and membrane anchoring. When the recombinant protein is unable to adopt its native conformation, or when protein expression surpasses the folding capacity of the ER (Sha et al. 2013), unfolded proteins may aggregate, thus triggering the UPR pathway. Inadequate mRNA structure, variations in gene copy number, limits in transcription, translation, and protein translocation into the ER, insufficient protein folding, and suboptimal protein transport to the extracellular space are significant bottlenecks in the secretory expression of heterologous proteins. Commonly employed approaches to overcoming these secretory bottlenecks involve the up-regulation of chaperones and folding helpers such as Ero1p, BiP/Kar2p, PPIs, DnaJ, and PDI. Another strategy entails the up-regulation of HAC1, a transcription factor that controls the expression of genes involved in the unfolded protein response pathway. Unlike the situation in S. cerevisiae, this alternative method has proven effective in alleviating these bottlenecks. It has been documented that the expression and splicing of HAC1 in P. pastoris remain constant and uninterrupted under regular growth conditions. This phenomenon might account for the increased quantities of secreted proteins that can be achieved using this organism (Guerfal et al. 2010).

Advantages and disadvantages of P. pastoris as an expression host

The Pichia expression system offers several advantages for expressing various types of recombinant proteins. P. pastoris, a methylotrophic yeast, is commonly acknowledged as a host system for recombinant expression. One of the key benefits of the Pichia system lies in its resemblance to higher eukaryotic expression systems, including CHO cell lines. The yeast system is cost-effective and offers relatively short expression times. Furthermore, it possesses both co-translational and post-translational processing capabilities. With industrial bioreactors, it is possible to achieve substantial production of target proteins from relatively limited culture volumes. Recent investigations have revealed that the Pichia expression system is well suited to the synthesis of membrane proteins, including the histamine H1 receptor, phosphate and nitrate transporters, and potassium and calcium channels (Byrne 2015). Moreover, P. pastoris is a suitable microorganism for the secretion-based manufacturing of genetically engineered proteins, allowing their direct release into the culture supernatant. When P. pastoris is used as an expression host, purification of the recombinant protein is simplified by its limited production of endogenous secretory proteins (Tachioka et al. 2016). Pichia pastoris offers an additional benefit as a host for protein production through its capacity to carry out post-translational modifications, including disulfide bond formation and O‐ and N‐linked glycosylation. A substantial number of therapeutic proteins are glycoproteins.
Glycoproteins carry carbohydrate structures attached to the protein backbone through a process known as glycosylation. Glycosylation is crucial because it supports proper folding, stability, solubility, and ultimately the desired biological activity (Baghban et al. 2019). There are two primary forms of glycosylation in yeast cells, N-linked and O-linked, which occur within the ER and the Golgi apparatus. In yeast, N-linked glycans are of the high-mannose type, whereas in humans, hybrid and complex types are predominantly observed. In N-linked glycosylation, oligosaccharides are attached to the amide nitrogen of an asparagine (Asn) residue via an N-glycosyl linkage within the consensus sequence Asn-X-Ser/Thr (X denotes any amino acid except proline). In O-linked glycosylation, glycosidic bonds connect oligosaccharides to serine or threonine residues. O-linked saccharides are usually significantly smaller (fewer than 5 residues) than N-linked saccharides. In S. cerevisiae, N-glycosylation is distinguished by hypermannosylation involving α-1,2-, α-1,6-, and α-1,3-mannosyltransferases (Fig. 4A) (Prielhofer et al. 2013).
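
The Asn-X-Ser/Thr consensus described above lends itself to a simple sequence scan. The sketch below is illustrative only: the function name and the example peptide are hypothetical, and a matching sequon marks a potential N-glycosylation site, since the presence of the motif is necessary but not sufficient for glycosylation.

```python
import re

# N-X-S/T where X is any residue except proline (one-letter codes).
# The lookahead lets overlapping sequons (e.g. N-N-T-S) be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein):
    """Return the 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical peptide: N2 (N-G-S) and N9/N10 (N-N-T, N-T-S) match,
# whereas N6 (N-P-T) is excluded because X is proline.
print(find_sequons("MNGSANPTNNTS"))  # [2, 9, 10]
```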

Fig. 4

N-linked glycan structures in Saccharomyces cerevisiae, Pichia pastoris, and mammalian cells. A Saccharomyces cerevisiae with hypermannosylated structures, B Pichia pastoris with hypomannosylated structures, and C mammalian cells with complex, terminally sialylated structures

In contrast with S. cerevisiae, P. pastoris may offer an advantage in the glycosylation of secreted proteins because it tends not to hyperglycosylate them. In P. pastoris, the N‐glycans, which consist of Man8‐14GlcNAc2, are often shorter than the lengthy oligosaccharide chains of S. cerevisiae, which have a Man>50GlcNAc2 composition. In addition, the core oligosaccharides of S. cerevisiae carry terminal α‐1,3 glycan linkages, a feature not present in P. pastoris (Fig. 4B).

The hyper-antigenic nature of glycosylated proteins derived from S. cerevisiae, which renders them unsuitable for therapeutic applications, is thought to be predominantly attributable to the α-1,3 glycan linkages. In addition, minimal O-linked glycosylation has been detected in P. pastoris (Radoman et al. 2021). Its low occurrence of hyperglycosylation makes P. pastoris an organism of significant interest for the synthesis of therapeutic glycoproteins. In mammalian cells, the N-glycosylation process involves the incorporation of one or more N-acetylglucosamine (GlcNAc) residues, followed by the sequential addition of galactose (Gal) and sialic acid. This results in the formation of complex-type N-glycans (Fig. 4C).

It is important to note, however, that hypermannosylation of recombinant proteins has also been observed in the P. pastoris expression system, where it may give rise to an immunologic response and a reduced serum half‐life (Laukens et al. 2015). Recently, novel methodologies have been developed to manipulate the N-glycosylation pathway of P. pastoris. In glycoengineering, one strategy is to disrupt an endogenous glycosyltransferase gene (OCH1) and introduce heterologous enzyme activities (Radoman et al. 2021; Jacobs et al. 2009).

Like other gene expression systems, this eukaryotic system has certain drawbacks. During transformation, unlike in the bacterial system, competent cells require substantial quantities (at the microgram level) of plasmid. The number of E. coli transformants (10⁸–10⁹) surpasses that of P. pastoris transformants (10³–10⁴) per microgram of DNA (Wu and Letchworth 2004). Recombinant protein production in this system is regulated mainly by two promoters: the promoter of the glyceraldehyde-3-phosphate dehydrogenase gene (PGAP) and the promoter of the alcohol oxidase 1 gene (PAOX1). Despite having multiple advantages, neither promoter offers any form of tunability (Rajamanickam et al. 2017). Protein production in this yeast system relies on methanol utilization. According to the EasySelect™ Pichia Expression Kit (Catalog no. K1750‐01, Invitrogen, Carlsbad, CA), the expression of recombinant proteins requires a minimum concentration of 0.5% methanol. Maximal production may be reached at methanol concentrations of up to 2–2.5% (wt/vol) (Wang et al. 2010). Typically, the organism tolerates methanol concentrations of up to 5%; higher concentrations (>5%) are markedly toxic to the cells and can effectively inhibit production (Santoso et al. 2012). One additional restriction of the Pichia system is the small number of selectable markers available for the transformation of P. pastoris. These selectable markers include genes such as Shble, arg4, and his4 (Tulio 2022). A frequently observed problem in this system is the growth of saprophytic bacteria and fungi, leading to contamination of the expression culture broth.
The proteolytic enzymes released by these microorganisms can hydrolyze the proteins secreted into the supernatant (Stewart 2015). Another challenge in the P. pastoris expression system is the degradation of the produced proteins by proteolytic enzymes. Novel strains of this yeast, including SMD1168 (his4 pep4), BG21, Pichia pink SMD1163 (his4 pep4 prb1), and SMD1165 (his4 pep4), are protease-deficient. As a consequence, degradation of the secreted protein is effectively prevented. To attain high product yields and to ensure the quality of the recombinant proteins, the genes encoding proteinase A (pep4) and proteinase B (prb1) have been modified in these strains (Safdar et al. 2018).

Strategies in the expression system of P. pastoris

Following the selection of appropriate expression vectors and suitable host strains, and the subsequent transformation of the expression cassettes, it becomes crucial to identify transformants that express the desired protein at high levels. Single-copy transformants can be readily produced by targeting the linear expression cassettes to the AOX1 locus, which leads to gene replacement events. Ectopic integrations can occur at the same time. Transformants arising from gene replacement at the AOX1 locus exhibit a methanol utilization slow (MutS) phenotype. These transformants can be readily distinguished on minimal methanol plates by replica plating (Aw and Polizzi 2013; Prielhofer et al. 2013). The most commonly employed approach to screening for high-yielding P. pastoris transformants is to identify clones that carry multi-copy integrations of the expression cassette. Because P. pastoris is remarkably recombinogenic, expression cassettes may be excised again through loop-out recombination. This effect appears to become more pronounced as the number of integrated copies increases (Aw and Polizzi 2013; Prielhofer et al. 2013). Numerous studies have demonstrated a direct correlation between copy number and expression level, particularly for intracellular expression (Vassileva et al. 2001; Marx et al. 2009). This direct relationship may not always hold, however, when the protein is targeted to the secretory pathway. The primary approach to producing multi-copy expression strains in P. pastoris is to plate the transformation mixture directly onto selection plates containing increasing concentrations of antibiotic. The vast majority of transformants will carry a single copy of the expression vector integrated into the genome. Consequently, a large number of clones must be screened to identify transformants with high copy numbers (Lin-Cereghino and Lin-Cereghino 2007; Pan et al. 2022).

Strategies for employing high-throughput methods to evaluate clone quality

Multiple high-throughput methodologies based on small-scale cultivation have been developed to examine large numbers of clones (Mellitzer et al. 2012; Weinhandl et al. 2012; Weis et al. 2004). The chosen clones, nevertheless, may not perform equally well in fermenter cultivations because the cultivation conditions differ. A notable issue with screening methods that rely on resistance markers is the high incidence of false-positive colonies (Mellitzer et al. 2012). This transformation-induced background is believed to result from the stress and rupture of cells. Depending on the mechanism by which the resistance marker confers antibiotic resistance, untransformed cells may be able to persist in the immediate vicinity of ruptured transformed cells (Weis et al. 2004). The issue was resolved by creating expression vectors in which the marker genes are expressed under the control of the relatively weak ARG4 promoter. This ensures basal levels of marker gene expression, enabling researchers to distinguish between single-copy and multi-copy transformants by plating the transformed cells directly onto media containing low concentrations of Zeocin™. To decrease the likelihood of obtaining single-copy transformants, it is advisable to minimize the regeneration time of the cells and to plate the transformants directly onto plates containing high antibiotic concentrations (Weinhandl et al. 2012; Weis et al. 2004). With this technique, only a limited number of transformants can withstand the elevated antibiotic levels.
However, it is highly probable that these transformants will carry multiple copies of the expression cassette, which can subsequently be verified by Southern blot analysis or quantitative polymerase chain reaction (qPCR). Performance can then be evaluated directly under actual production conditions in bioreactor cultivations, rather than in small-scale cultivations in shake flasks or deep-well plates. Pichia pastoris has demonstrated the capacity to produce over 15 g/l of recombinant protein, either intracellularly or via secretion (Werten et al. 1999). The ability of P. pastoris to achieve high titers is attributed to its capacity to grow to extremely high cell densities, reaching up to 150 g of cell dry weight per liter of fermentation broth in fed-batch bioreactor cultivations (Jahic et al. 2006). At such high cell densities, even proteins that are present in limited quantities within each individual cell can be produced in acceptable volumetric amounts in this yeast. Typical examples of proteins that are not abundant but are of great scientific and commercial significance are integral membrane proteins. These proteins constitute the targets of more than 50% of drugs administered to humans (Arinaminpathy et al. 2009).
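
The point about low-abundance proteins is essentially arithmetic: the volumetric titer is roughly the specific yield per unit of biomass multiplied by the biomass concentration. The sketch below illustrates this under stated assumptions. The biomass value of 150 g/l is the upper figure cited above, whereas the specific yield of 0.5 mg per g of cell dry weight and the shake-flask biomass of 10 g/l are hypothetical numbers chosen only for illustration.

```python
def volumetric_titer_mg_per_l(specific_yield_mg_per_g_dcw, biomass_g_dcw_per_l):
    """Volumetric titer (mg/l) = specific yield (mg/g DCW) x biomass (g DCW/l)."""
    return specific_yield_mg_per_g_dcw * biomass_g_dcw_per_l


# Hypothetical low-abundance protein: 0.5 mg per g of cell dry weight (DCW).
specific_yield = 0.5

shake_flask = volumetric_titer_mg_per_l(specific_yield, 10.0)   # assumed flask biomass
bioreactor = volumetric_titer_mg_per_l(specific_yield, 150.0)   # fed-batch biomass from the text

print(shake_flask, bioreactor)  # 5.0 75.0 (mg/l)
```

With the same per-cell productivity, the higher cell density alone accounts for the fifteen-fold difference in this example; it assumes that specific yield is unchanged at high density, which need not hold in practice.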

Strategies for membrane protein expression in the P. pastoris host system

Only a small number of membrane proteins have been examined in detail at the molecular level with respect to structure-function relationships. The primary reason is the difficulty of obtaining large amounts of purified membrane protein suitable for comprehensive structural and biochemical investigations. However, this obstacle can be overcome by producing affinity-tagged membrane proteins in satisfactory yields. Pichia pastoris has been regularly used to produce affinity-tagged membrane proteins for purification and subsequent biochemical investigations (Cohen et al. 2005; Haviv et al. 2007; Lifshitz et al. 2007). Moreover, this yeast has been selected as the preferred expression host for elucidating the crystal structures of membrane proteins from various sources, including higher eukaryotes (Hino et al. 2012). The chances of successful recombinant expression are improved by evolutionary closeness between the heterologous expression host and the source of the membrane protein (Grisshammer and Tateu 2009). With regard to intramolecular forces and bonds, soluble proteins are frequently stabilized by cofactors, ions, and interacting proteins. Membrane proteins, in addition, typically interact with and are partially stabilized by the lipids of the surrounding bilayer (Adamian et al. 2011). The cell membranes of Pichia pastoris and other yeasts used as expression hosts differ substantially in composition from those of plant, animal, and bacterial cells (Wriessnegger et al. 2009). Membrane proteins may therefore encounter stability problems when they are expressed in hosts that are evolutionarily distant from their source.
Hence, various methods have been implemented to improve the host strains of P. pastoris and to optimize the conditions for the expression and production of membrane proteins. Methods comparable to those used for enhancing the production of soluble proteins, such as adjusting the expression conditions, adding chemical chaperones, co-expressing chaperones or proteins that activate the unfolded protein response, and using protease-deficient strains, have met with varying degrees of success in the expression of membrane proteins, although the outcome is frequently specific to the target protein (Cohen et al. 2005; Haviv et al. 2007; Lifshitz et al. 2007).

Up-scaling of bioprocess in P. pastoris and the challenges

The engineering of cellular membranes in P. pastoris represents a new and innovative strategy aimed at enhancing the capacity of the organism to host the heterologous proteins. An effective bioprocess in P. pastoris and other microorganisms necessitates a host strain that is both optimized and stable, exhibiting enhanced metabolic rates. Furthermore, a fermentation process that is both efficient and consistent, operating at high levels of titer, yield, and productivity, is essential. However, in order to generate proteins for scientific purposes, it is crucial to achieve a successful upscaling of a bioprocess that is specifically designed to be easily expandable, ensuring that both efficiency and cost remain unaffected by any consequential development (Gasser and Mattanovich 2018; Wehrs et al. 2019). In spite of recent progress using P. pastoris for producing the recombinant protein the process of scaling-up also encounters certain limitations associated with strains. Unlike S. cerevisiae, the P. pastoris strains predominantly serve as prototypes, indicating that they possess minimal genetic markers, alter their mating type through aggregation, and, regrettably, there is a shortage of strains employed in both industrial and research settings, thus posing constraints on the diversity of its genetic pools (Gasser and Mattanovich 2018). Another significant metabolic constraint arises from the uptake of glucose. The uptake of glucose experiences a decrease and fails to exceed the respiratory capacities of the yeast. In contrast to S. cerevisiae, there is a considerable decrease, close to a tenfold reduction, in the maximum intake of glucose in P. pastoris. This diminishment has the potential to reduce the overall efficacy of the process. 
Evidently, the tightly restricted uptake of glucose and the elevated flux through the pentose phosphate pathway (PPP) may result in the generation of undesired complex secondary metabolites, including terpenoids and carotenoids, as byproducts of fermentation (Gasser and Mattanovich 2018). The primary challenges encountered in the application of recombinant P. pastoris are limitations in carbon assimilation, specifically in consumption capacity and rates, as well as the formation of byproducts. These limitations result in low production yields and productivity, thereby preventing the widespread use of this organism (Gasser and Mattanovich 2018; Wehrs et al. 2019).

Strategies for optimization and development in the fermentative process of P. pastoris

The ability to produce proteins and chemicals in P. pastoris has the potential to meet worldwide demands effectively. However, in spite of the progressive advancement of this microbial platform, its fermentation processes have yet to attain the same level of sophistication as conventional biochemical processes (García-Ortega et al. 2019; Hill et al. 2020; Pena et al. 2018; Looser et al. 2015). There is a gap between encouraging small-scale experiments and their implementation on a larger, industrial scale. Although numerous studies have demonstrated proof-of-concept approaches for novel compounds, considerably fewer have concentrated on bioprocess development, optimization strategies, or scale-up (Duman-Özdamar and Binay 2021; Brooks and Alper 2021). Integrating strain engineering with fermentation strategies is of utmost importance for enhancing the effectiveness of bioprocesses. The transfer of bioprocess technologies to a host system typically results from iterative cycles of process optimization and biological redesign (Crater and Lievense 2018). Although certain conditions are commonly employed in P. pastoris fermentation, the optimal conditions vary depending on the desired product, the genetic construct employed, and whether the product is expressed under a constitutive or an inducible promoter (Karbalaei et al. 2020). The major factors for successful improvement, optimization, and simulation of P. pastoris processes are described below.

Parameters in fermentation

The operational parameters of fermentation processes, including pH, temperature, medium composition, dissolved oxygen (DO), and osmolality, have a significant impact on their efficiency. For the cultivation of P. pastoris, many scientists employ a comparable range of set points and control strategies for these variables, relying predominantly on established protocols (Garrigós-Martínez et al. 2021). The optimal temperature for the growth of P. pastoris is between 28 and 30 °C (Nieto-Taype et al. 2020a, b). Temperatures exceeding 32 °C may lead to cell death and decreased protein expression.

In certain instances, decreasing the cultivation temperature may improve the production of heterologous proteins. This is achieved through increased yeast viability, despite the simultaneous decrease in growth rate. Lower cultivation temperatures also mitigate folding stress and minimize proteolytic activity against the target protein, thus leading to enhanced protein production (Nieto-Taype et al. 2020a, b). Decreasing the cultivation temperature also increases the solubility of oxygen and thereby enhances the oxygen transfer rate. Research on the impact of temperature has shown that the improvement depends on the target product and must be assessed case by case (Berrios et al. 2017). The pH range suitable for the cultivation of P. pastoris is between 5 and 6.5. When the pH exceeds 8, cell viability decreases significantly (Matthews et al. 2018). The optimal value depends strongly on the recombinant protein or metabolite being produced. A pH of 5.5 helps to reduce protease-induced damage (Berrios et al. 2017). The dissolved oxygen concentration in P. pastoris fermentation is typically maintained at 20–30% saturation (Nieto-Taype et al. 2020a, b). Maintaining this concentration becomes increasingly difficult as the cell density in the fermentation increases (Garcia-Ortega et al. 2017). Limited oxygen transfer capacity is therefore a crucial aspect of P. pastoris process development. Because of the limited solubility of oxygen in the growth medium, the maximum biomass that can be achieved in high-cell-density P. pastoris cultivation is determined by the maximum oxygen transfer capacity of the equipment (Garcia-Ortega et al. 2017).
In addition, investigations into the production of recombinant proteins have found that oxygen-limited conditions can increase protein production (García-Ortega et al. 2019). Large-scale production with P. pastoris relies on low-cost culture media. Various new medium formulations have been assessed for particular investigations, because a generic composition does not prove effective in all cases (Safdar et al. 2018). Media optimization has the potential to serve as a viable way of enhancing growth rate and product yield. Nevertheless, this approach has not been extensively investigated for P. pastoris (Matthews et al. 2018). The basal salt medium (BSM) is the most frequently referenced medium for P. pastoris fermentation. Nevertheless, a number of studies document higher product titers at comparable cell densities, or enhanced per-cell productivity, when this fermentation medium is supplemented with nutrients (Garcia-Ortega et al. 2017).
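
The statement that the achievable biomass is capped by the oxygen transfer capacity of the equipment can be made concrete with a simple oxygen balance. The sketch below assumes the usual film model for oxygen transfer, OTR = kLa · (C* − C_L), and a constant specific oxygen uptake rate; all numerical values (kLa, oxygen solubility, uptake rate) are hypothetical and serve only to illustrate the relationship, while the 25% DO set point lies within the 20–30% range given above.

```python
def max_oxygen_transfer_rate(kla_per_h, c_star_mmol_per_l, do_fraction):
    """Oxygen transfer rate in mmol O2/(l*h) from OTR = kLa * (C* - C_L).

    C_L is the dissolved oxygen held at the set point, expressed as a
    fraction of the saturation concentration C*.
    """
    return kla_per_h * c_star_mmol_per_l * (1.0 - do_fraction)


def max_biomass_g_per_l(otr_mmol_per_l_h, q_o2_mmol_per_g_h):
    """Biomass at which oxygen uptake (q_O2 * X) equals the available OTR."""
    return otr_mmol_per_l_h / q_o2_mmol_per_g_h


# Hypothetical equipment and strain parameters (assumptions, not measured values).
otr = max_oxygen_transfer_rate(kla_per_h=500.0, c_star_mmol_per_l=0.24, do_fraction=0.25)
x_max = max_biomass_g_per_l(otr, q_o2_mmol_per_g_h=1.0)
print(round(otr, 2), round(x_max, 2))  # about 90 mmol/(l*h) and about 90 g/l

# A lower temperature raises oxygen solubility (larger C*), which raises the OTR.
otr_cooler = max_oxygen_transfer_rate(kla_per_h=500.0, c_star_mmol_per_l=0.27, do_fraction=0.25)
print(otr_cooler > otr)  # True
```

In this simplified balance, anything that raises kLa or C*, or lowers the specific oxygen demand of the cells, raises the biomass ceiling proportionally.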

Mode of operation and mathematical model

Regarding the mode of operation, fed-batch culture, in which substrate is fed gradually, is the predominant approach for producing recombinant proteins in P. pastoris. This strategy achieves high cell concentrations and product levels while preventing substrate inhibition and catabolite repression (Liu et al. 2019). In fed-batch culture, exponential feeding profiles are regarded as the best way to reach a pseudo-stationary state, which keeps the specific rates of growth, consumption, and production constant and controlled (Garrigós-Martínez et al. 2021). Other common feeding strategies, including ramp, pulse, step-based additions, and constant rates, are considered obsolete because they do not match the physiological requirements of the cells and hinder optimal performance (Nieto-Taype et al. 2020a, b). Combining feeding profiles with carbon-starving periods has been recommended for overexpressing recombinant protein in fed-batch cultures without modifying growth parameters (García-Ortega et al. 2019). Although this strategy offers certain advantages, it lowers productivity and does not consistently outperform exponential feeding. The strategy most commonly used in recent studies to reach high levels of enzyme or chemical production is high-cell-density fermentation (HCDF) in fed-batch mode (Looser et al. 2015; Duman-Özdamar and Binay 2021). Nevertheless, establishing a robust HCDF process remains challenging, particularly when methanol serves as both the inducer and the carbon source (Liu et al. 2019). 
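
The exponential feeding profile mentioned above can be illustrated with a short calculation. This is a minimal sketch, not a method taken from the cited studies: it assumes the common open-loop formulation F(t) = µ_set · X0 · V0 / (Y_XS · S_feed) · exp(µ_set · t), a constant biomass yield, negligible residual substrate, and no maintenance term. The function name, parameter names, and numerical values are hypothetical.

```python
import math


def exponential_feed_rate(t_h, mu_set, x0_g_per_l, v0_l, y_xs, s_feed_g_per_l):
    """Return the feed rate (L/h) at time t_h (h) for open-loop exponential feeding.

    mu_set          target specific growth rate (1/h)
    x0_g_per_l      biomass concentration at feed start (g/L)
    v0_l            culture volume at feed start (L)
    y_xs            biomass yield on substrate (g biomass / g substrate), assumed constant
    s_feed_g_per_l  substrate concentration in the feed (g/L)
    """
    if mu_set <= 0 or y_xs <= 0 or s_feed_g_per_l <= 0:
        raise ValueError("mu_set, y_xs and s_feed_g_per_l must be positive")
    # Initial feed rate that supplies exactly the substrate needed to grow at mu_set.
    f0 = mu_set * x0_g_per_l * v0_l / (y_xs * s_feed_g_per_l)
    # Total biomass grows exponentially at mu_set, so the feed must do the same.
    return f0 * math.exp(mu_set * t_h)


if __name__ == "__main__":
    # Hypothetical values chosen only to show the shape of the profile.
    for hour in (0, 5, 10, 20):
        rate = exponential_feed_rate(hour, 0.1, 50.0, 2.0, 0.5, 500.0)
        print(f"t = {hour:2d} h  feed = {rate:.4f} L/h")
```

In practice the set point µ_set is chosen below µmax, and with methanol feeds it is additionally constrained so that methanol does not accumulate to toxic levels.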
Because high methanol concentrations are toxic, a constant and appropriate methanol supply must be ensured throughout the fermentation. Cell growth, substrate utilization, and product formation all depend on the methanol feeding strategy chosen (Liu et al. 2019). While HCDF enhances productivity, it can also lead to the accumulation of proteases, which is particularly detrimental in the production of heterologous proteins. Several strategies can address this issue, such as controlling the pH of the fermentation medium within a range of 3.0–7.0, adding supplements rich in amino acids, or using a protease-deficient P. pastoris host strain (Duman-Özdamar and Binay 2021). Although fed-batch operation is the most commonly used technique, continuous production has emerged as a growing trend. In continuous culture, fresh medium is fed into the bioreactor at a constant rate while culture broth and cells are removed at the same flow rate. At steady state, the specific growth rate (µ) is set by the dilution rate and must remain below µmax to avoid washout. This mode offers several benefits over fed-batch operation, including increased productivity, lower utility costs, and a simpler process through steady-state operation (Nieto-Taype et al. 2020a, b). Continuous culture is widely regarded as the best option for obtaining precise physiological information, which supports reliable strain characterization and subsequent process evaluation (Nieto-Taype et al. 2020a, b). 
However, implementing continuous operation in industry also poses difficulties, including contamination and genetic instability, the need for a high degree of automation, and additional product-specific complications (García-Ortega et al. 2019). To the best of our knowledge, productive continuous fermentations of P. pastoris have been documented at laboratory scale (Garrigós-Martínez et al. 2021; Nieto-Taype et al. 2020a, b; Berrios et al. 2017); however, continuous biomanufacturing processes are still lacking. In developing continuous fermentation, the most important parameter is the dilution rate (D). By mass balance, D equals µ at steady state, which is why it is considered critical [126]. Models based on the Monod equation (Nieto-Taype et al. 2020a, b) are the most frequently reported descriptions of cell growth kinetics. However, comparing equation parameters, or the behavior of the product-to-biomass yield (YP/X) as a function of µ, is difficult because the studies evaluate different strains using different substrates and metabolic pathways. Irrespective of the approach used to optimize the process, certain variables must be treated as design parameters. Depending on the strain engineering and the growth conditions, an optimal specific growth rate (µopt) can be determined to achieve enhanced performance. Conventional parameters such as the final titer and the specific rate of product formation (qP) should be maximized, primarily for high-value products, whereas the yield of product on substrate (YP/S) is typically considered for relatively low-cost products (Nieto-Taype et al. 2020a, b). 
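
The relationship between the dilution rate and µ in continuous culture can be made concrete with the textbook chemostat balance. This is a minimal sketch that assumes simple Monod kinetics, µ = µmax · S / (Ks + S), a constant biomass yield, and no maintenance or product term. It is not a model from the cited studies, and all names and values are hypothetical.

```python
def chemostat_steady_state(d, mu_max, k_s, y_xs, s_in):
    """Return (residual substrate, biomass) at steady state for an ideal chemostat.

    d       dilution rate (1/h); at steady state mu equals d
    mu_max  maximum specific growth rate (1/h)
    k_s     Monod half-saturation constant (g/L)
    y_xs    biomass yield on substrate (g/g), assumed constant
    s_in    substrate concentration in the inflowing medium (g/L)
    """
    if not 0 < d < mu_max:
        raise ValueError("dilution rate must satisfy 0 < d < mu_max")
    # Setting mu(S) = d in the Monod equation and solving for S.
    s_star = k_s * d / (mu_max - d)
    if s_star >= s_in:
        raise ValueError("washout: required residual substrate exceeds the feed")
    # Biomass follows from the substrate balance: consumed substrate times yield.
    x_star = y_xs * (s_in - s_star)
    return s_star, x_star


if __name__ == "__main__":
    # Hypothetical parameter values for illustration only.
    for d in (0.02, 0.05, 0.08):
        s_star, x_star = chemostat_steady_state(d, 0.1, 0.2, 0.5, 20.0)
        print(f"D = {d:.2f} 1/h  S* = {s_star:.3f} g/L  X* = {x_star:.3f} g/L")
```

The sketch shows why D is the key operating variable: choosing D fixes µ, and the residual substrate rises sharply as D approaches µmax, until the culture washes out.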
A higher product titer is typically achieved with a high initial cell concentration and a low specific growth rate. This is because biomass, a growth-associated by-product, must be controlled so that it does not exceed the maximum allowable concentration (Nieto-Taype et al. 2020a, b; García-Ortega et al. 2019). Comparing efficiency across conditions relies mainly on understanding the relationship between qP and µ. This approach was used to couple enhanced succinate production to cell growth through gene deletion in E. coli (Wehrs et al. 2019), and to enhance the expression of an extracellular human granulocyte–macrophage colony-stimulating factor in P. pastoris (Looser et al. 2015). Genome-scale models have made it possible to couple production to growth for the majority of metabolites (Wehrs et al. 2019). When no biochemical link between target metabolism and growth can be established, or when growth coupling is confined to specific cultivation conditions, it becomes necessary to use an engineered strain with enhanced resistance to stress factors and inhibitory agents (Wehrs et al. 2019). In this case, to maintain strain performance, it is essential to avoid selecting for variants that grow faster but produce less over successive generations (García-Ortega et al. 2019). One option for ensuring consistent strain performance, and for decoupling growth from production, is to activate product pathways only once the culture reaches the cell density of the stationary phase (Wehrs et al. 2019). The maximum specific growth rate (µmax) is an important parameter in bioprocess development. 
This parameter determines the highest rate of substrate consumption and therefore sets the upper limit for substrate assimilation in fed-batch mode (Nieto-Taype et al. 2020a, b). Strains engineered to produce heterologous proteins typically have a lower µmax than unmodified strains (Looser et al. 2015). The relationship between yield and productivity is another crucial factor. Neither variable can be optimized in isolation, since doing so would negatively affect the other (Ponte et al. 2018; Campbell et al. 2017). Mathematical methods that identify a global optimum on the basis of objective criteria can be advantageous, particularly when the production kinetics are nonlinear.

Despite this progress, the development and improvement of biological systems still relies largely on empirical methods. Mathematical models can reduce the experimental work and costs involved. They can also suggest favorable operating conditions that are counterintuitive, and they aid in developing control strategies aimed at optimizing the system. These models can be evaluated and applied to guide the process towards optimal performance at either the macroscopic or the microscopic scale. At the microscopic scale, two categories of mathematical models can be used to understand cellular metabolism: genome-scale metabolic models and kinetic models (Campbell et al. 2017; Guirimand et al. 2021; Patra et al. 2021). Kinetic models are limited in this respect because enzyme kinetic parameters are not available for all reactions (Campbell et al. 2017; Guirimand et al. 2021; Patra et al. 2021). Genome-scale modeling (GEM), which relies on genome sequence data, is highly advantageous for identifying engineering targets and formulating effective bioprocess strategies. Many GEMs are available for P. pastoris and are continually being improved (De et al. 2021). For macroscopic analysis, the well-established methods of genetic algorithms (GA), design of experiments (DoE), and artificial neural networks (ANN) have proven effective and reliable for optimizing the fermentation process (Abt et al. 2018). These techniques are commonly associated with the development of fermentation media or the initial optimization of the process (Abt et al. 2018). The effect of genetic and culture conditions on the production of thaumatin, a sweetener, in P. pastoris can be determined by combining GEM and DoE (Patra et al. 2021). Model predictive control (MPC) has become increasingly important in bioprocess optimization (Rathore et al. 2021). This approach takes into account the dynamic and static relationships between the input, output, and disturbance variables. Furthermore, it uses a predictive model of the process to align the control action with the calculated optimal set points (Hong et al. 2021). MPC has accurately described experimental results for the P. pastoris fermentation process, effectively simulating this highly nonlinear process (Barrero et al. 2018). Despite this progress, integrating cells, metabolites, and processes while accurately predicting all resulting responses remains a formidable challenge (Barrero et al. 2018).

Challenges and overcoming obstacles in the cloning and expression system using P. pastoris

Ineffective signal secretion

In P. pastoris, several factors can lead to ineffective protein secretion. One way to determine whether reduced secretion is caused by impaired translocation of proteins from the cytoplasm to the ER is the GFP-HDEL test, which yields results within 2 to 3 days (Barrero et al. 2018). GFP-HDEL is a green fluorescent protein carrying an HDEL tetrapeptide at its C-terminus, which retains the protein in the ER. The assay allows the spatial distribution of the fluorescent fusion protein to be observed, revealing whether the translocons in the ER are blocked, in which case GFP-HDEL appears in the cytoplasm. The GFP-HDEL assay is therefore a low-cost yet effective way to determine whether translocation from the cytoplasm to the ER is a bottleneck in protein secretion. If the problem lies in the translocation step, a novel pre-signal sequence called Ost1 can be used to address it, and its use may significantly enhance protein secretion. The Ost1 sequence enabled the secretion of a monomeric superfolder GFP whose secretion was blocked in both S. cerevisiae and P. pastoris when the α-factor secretion signal was used (Fitzgerald and Glick 2014). The α-factor secretion signal, which comprises a pre-signal of 19 amino acids and a pro domain of 66 amino acids, directs post-translational translocation across the ER membrane. This signal may be effective only for certain proteins (Ingram et al. 2021; Brake et al. 1984). The characteristics of the signal sequence determine whether a protein is transported by a post-translational or a co-translational mechanism (Ng et al. 1996). 
Based on this understanding, the pre-signal of the conventional α-factor secretion signal has been replaced with the Ost1 pre-signal sequence, which promotes co-translational translocation across the ER membrane. The result is a fusion secretion signal composed of the Ost1 pre-signal and the α-factor pro domain (Fitzgerald and Glick 2014). To evaluate the effectiveness of the Ost1 pre-signal sequence, researchers used a controllable secretory fluorescent protein, E2-Crimson, to visualize the ER and the secretory pathway (Barrero et al. 2018). After finding that E2-Crimson can become trapped within the secretory pathway, they made two modifications to address the aggregation problem: first, use of the Ost1 pre-signal sequence, and second, the introduction of a single amino acid (Ser42) in the pro region, which corresponds to an allelic variant. The researchers also investigated the effects of the Ost1 signal sequence beyond the fluorescent protein model and confirmed that the improved secretion signal increased the secretion of BTL2 lipase as well. When the improved Ost1 pre-signal sequence is used, the stability of the secreted protein is unaffected because the secretion signal is cleaved before the protein is released. Potential limitations of the Ost1 pre-signal sequence still need to be determined. If the protein selected for secretion contains a large number of disulfide bonds, its folding may be delayed by its extensive need for post-translational modification, which reduces the likelihood of observing any improvement with the Ost1 pre-signal sequence. 
Including the Ost1 pre-signal sequence reduced the stress placed on the cells during protein secretion and thereby improved overall cell growth (Ingram et al. 2021).

Chaperone overexpression for secretion enhancement

Pichia pastoris has a robust secretory capacity even though it secretes only small quantities of native proteins (Cereghino and Cregg 2000). Like other protein expression systems, however, P. pastoris has certain drawbacks. High-level expression of foreign proteins can saturate the native secretory pathway and increase the abundance of misfolded protein species (Zahrl et al. 2017). Because a proofreading mechanism prevents these proteins from leaving the ER, the cell's ability to fold secreted proteins properly becomes a crucial determinant of the rate at which proteins travel from the ER to the Golgi and ultimately to the extracellular space (Damasceno et al. 2007). Protein disulfide isomerase (PDI) and immunoglobulin binding protein (BiP) are two highly abundant ER proteins. Their primary function is to assist protein folding, thereby promoting the efficient delivery of a wide range of secreted proteins (Brodsky and Skach 2011; Raschmanova et al. 2021). BiP is a heat shock protein (HSP) of the Hsp70 class that binds to the hydrophobic regions of immature proteins and thereby stabilizes them. PDI is an essential ER protein involved in disulfide bond reduction, isomerization, and oxidation. PDI also suppresses the aggregation of misfolded proteins, including those without disulfide bonds, and thus shows chaperone-like properties. Previous research has shown encouraging results in enhancing protein secretion through the overexpression of BiP and/or PDI in S. cerevisiae (Miura et al. 2022). 
Specifically, overexpressing either BiP or PDI produced a twofold increase in the levels of a single-chain antibody fragment (ScFv) in S. cerevisiae, and overexpressing both together produced an eightfold increase in protein secretion (Shusta et al. 1998). In P. pastoris, overexpressing BiP resulted in a threefold increase in ScFv secretion compared with the control strain, which agrees with the results obtained in S. cerevisiae. However, in contrast to S. cerevisiae, overexpressing PDI did not improve secretion (Damasceno et al. 2007). Moreover, also in contrast to S. cerevisiae, co-expressing BiP and PDI did not have a synergistic effect on protein secretion. Overexpressing PDI alone did, however, cause a threefold increase in the level of BiP. This implies that excessive production of a chaperone can impose a substantial load on the secretory system and trigger the unfolded protein response (UPR) (Liu et al. 2023), which in turn may have reduced the amount of protein secreted. Today, the use of P. pastoris strains that overexpress BiP and/or PDI to secrete recombinant proteins has become nearly ubiquitous. However, this approach has several limitations that may apply to many recombinant proteins in this yeast. 
First, the synergistic effect expected from overexpressing BiP and PDI together was not observed. Second, despite their many similarities, P. pastoris and S. cerevisiae are distinct yeast species. Finally, the detrimental effect of PDI overexpression on ScFv expression may depend on the inherent structural properties of the protein itself rather than on any ineffectiveness of the foldase. This is highly probable because PDI overexpression has been shown to improve the secretion of other proteins, including the human secretory leukocyte protease inhibitor (SLPI) (Li et al. 2010). Looking forward, several modifications to this approach can be proposed. First, reducing the activity of the promoter that drives the expression of BiP and/or PDI may make it possible to reach an appropriate quantity of BiP/PDI within the ER, giving the desired improvement in secretion without activating the UPR. Second, constitutive expression of the gene encoding HAC1, a transcription factor that regulates the UPR, has been shown to reduce the load of misfolded proteins by lowering the rate of protein synthesis and improving the efficiency of protein folding in the ER. Taken together, overexpression of HAC1 combined with co-overexpression of reduced quantities of BiP/PDI may be an effective way to increase secretion in P. pastoris (Tsygankov and Padkina 2018; Lin et al. 2013). Increased BiP and/or PDI synthesis is therefore a viable approach to increasing protein secretion that merits careful consideration. It is important to acknowledge, however, that overexpressing either or both of these proteins does not guarantee the desired outcome for all recombinant proteins.

Effective transformant screening

Screening for the best P. pastoris strain for secreting a protein of interest is an essential first step towards successful recombinant protein expression (Olsen et al. 2000). The 96-deep-well plate screening assay has been shown to be a cost-effective and efficient method that allows investigators to identify a small number of suitable P. pastoris strains, as an alternative to the Pichia Expression Kit, the classic technique for Pichia. The main components of this assay are a basic 96-well microplate with flat, square-bottomed wells, which reduces the cell pelleting and sedimentation to which the yeast is prone; a reliable laboratory shaker with high speed and humidity control; and a plate reader to quantify the secreted protein in the medium (Weis 2019). This technique gave the greatest uniformity and reproducibility across a large number of plates, P. pastoris variants and strains, target proteins, and users when biomass was accumulated in medium containing 1% glucose. With 1% glucose, recombinant protein expression was also the most consistent, and dissolved oxygen remained at a stable level. Although it is commonly assumed that raising the glucose concentration in the medium increases P. pastoris biomass and consequently the productivity of recombinant proteins (Looser et al. 2015), this was not the outcome in the 96-deep-well plate screening assay. As anticipated, elevated glucose levels led to a rise in yeast biomass. 
Nevertheless, protein output and, most importantly, uniformity across the whole 96-well plate decreased at glucose concentrations above 1%. Productivity was inversely correlated with the glucose concentration in the medium, suggesting that 1% glucose can be considered the "preferred state" in terms of oxygen availability. In contrast, 2% or even 1.5% glucose in the medium led to significant necrosis and apoptosis in the wells, indicating unfavorable conditions. In recent years, droplet-based microfluidics has become increasingly popular as a high-throughput screening (HTS) technique. This is mainly because it can handle larger numbers of samples and uses less reagent than traditional microtiter plates (MTP). In addition, its ability to compartmentalize assays within emulsion droplets offers advantages over fluorescence-activated cell sorting (FACS) (Yang et al. 2021). With this method, large numbers of uniform water-in-oil droplets are generated and manipulated at high speed. The manipulations include splitting, merging, injecting, detecting, and sorting the droplets. These droplets can serve as miniature reactors that compartmentalize proteins, cells, and chemical reactions, which allows accurate and reliable quantitative analysis of individual samples. Over the last few years, droplet microfluidics has proven effective in the directed evolution of various microorganisms, including filamentous fungi, yeasts, and bacteria, for the production of enzymes such as aldolase, β-galactosidase, esterase, cellulase, and α-amylase (Huang et al. 2015; Qiao et al. 2017; Ma et al. 2018; He et al. 2019; Tu et al. 2021). 
Consequently, this approach is effective for identifying transformants and improving production levels in P. pastoris. Furthermore, flow cytometry is a powerful and efficient technology for analyzing individual cells in suspension through rapid, multi-parametric analysis (McKinnon 2018). It has mainly been used to measure the fluorescence intensity of fluorescently labeled antibodies that detect proteins, or of ligands that bind to specific cellular molecules. In particular, it has been used to measure the physiological state of P. pastoris during heterologous protein production in high-cell-density cultures (Zepeda et al. 2018).

Flow cytometry has also been used to reduce sedimentation and biased cell agglomeration and to minimize the false detection of loosely clustered cells. This was achieved by applying an in-flow velocity to the cell suspension, which exerts force on the cells (Pekarsky et al. 2018). Consequently, flow cytometry can be used to assess cell viability along with the unfolded protein response (UPR) (Raschmanova et al. 2019).

Overcoming clonal variation

Clonal variation refers to clear differences in the expression level of a heterologous protein among strains that have identical copy numbers and appropriate insertion sites for the recombinant gene. In P. pastoris, clonal variation is a significant obstacle to achieving optimal protein secretion, whereas this issue does not arise in Escherichia coli or Saccharomyces cerevisiae. Although clonal variability in P. pastoris was first described three decades ago (Cregg et al. 1989), only two further publications have thoroughly investigated the variability among clones of this yeast (Schwarzhans et al. 2016). One way to address clonal variation would be to identify a distinct biomarker that characterizes high-secreting clones. If such a biomarker were identified, the task of examining numerous colonies to find a high secretor would become an efficient and inexpensive process. Moreover, if a distinct attribute of a highly active secretor were identified, genetic manipulation could be used to enhance secretion substantially in other transformants. To examine clonal diversity, three strains producing human serum albumin (HSA) at high, medium, and low levels were obtained (Aw et al. 2017). The copy number and the insertion sites of the expression plasmids, in which expression was driven by the AOX1 promoter, were uniform across all strains. The HSA-producing cultures were induced with methanol and evaluated for secretion of fully folded, intact protein rather than total protein, which could have included incorrectly folded or degraded variants. 
The results were examined using several methods, including qPCR, titer analysis, flow cytometry, transcriptomic analysis, and gene expression analysis. Several factors were excluded as the underlying cause of clonal diversity among the HSA-secreting P. pastoris strains. First, the unfolded protein response (UPR) was found not to play a role in clonal variation. More specifically, protein degradation through the ER-associated degradation (ERAD) pathway did not contribute to the reduced levels of protein expression. Second, transcriptome analysis showed that the strains with superior secretion did not have the highest mRNA levels, suggesting that mRNA abundance is not associated with higher secretion capacity. Third, cell viability did not correlate with protein export. Unexpectedly, most of the high-secreting cells showed a higher proportion of dead cells than the remaining samples when cultivated in the presence of methanol. A positive association was observed between increasing levels of oxidative phosphorylation and elevated secretion of fully folded, intact HSA (Aw et al. 2017). The cells may raise their oxidative phosphorylation levels to generate more ATP in response to the increased demand of recombinant protein synthesis. In addition, the SKP1 gene, which participates in oxidative phosphorylation, showed increased expression in the strains with high secretion capacity. This finding suggests that SKP1 may serve as a potential biomarker for clones with a high-secretion phenotype. 
Researchers have postulated that clonal variation could be influenced by non-homologous recombination. Purifying the linearized plasmid before transforming P. pastoris is therefore a crucial step. This precaution is necessary because contaminating E. coli DNA, which may originate from a plasmid miniprep, can result in fragments of the bacterial genome integrating at various locations within the yeast genome, which may lead to clonal variation (Sunga et al. 2008).

Glycosylation enhancement

Glycosylation is an essential type of post-translational modification for eukaryotic proteins that are secreted or membrane-associated. For specific glycoproteins to fold effectively, cells typically need to carry out the appropriate N-glycosylation on these peptides as they pass through the secretory organelles (Aebi 2013). Because most prokaryotic hosts, including E. coli, lack N-glycosylation, researchers frequently use P. pastoris to express recombinant proteins that require this post-translational modification (Wang et al. 2022). Although P. pastoris can perform N-glycosylation, the yeast must be genetically modified because wild-type strains commonly cause heterogeneous hyperglycosylation of certain recombinant proteins (Gong et al. 2009). This issue is problematic in two respects. First, P. pastoris modifies its proteins with mannose, whereas humans produce more complex N-glycan structures that involve sialic acid and galactose. Second, P. pastoris tends to generate a heterogeneous population of the secreted protein that varies in the number of saccharide units attached. The consequences of these differences may include a shorter half-life of the protein in the body, increased immunogenicity, and possible negative effects on the function and folding of the protein (Jacobs et al. 2010). To address this problem, researchers have worked to convert the high-mannose N-glycosylation of P. pastoris into mammalian-type N-glycosylation (Choi et al. 2003). 
This pathway engineering comprises two components: 1) inactivation of the primary glycosyltransferases of P. pastoris, which leads to the accumulation of Man8GlcNAc2, a precursor glycan structure shared with the human pathway; and 2) introduction of exogenous glycosidases and glycosyltransferases that convert this common precursor into the targeted structures. Researchers frequently use the Pichia GlycoSwitch® system to alter the N-glycosylation machinery of the yeast (Laukens et al. 2015). Seven distinct P. pastoris strains were initially generated for this purpose, and each strain differs in its secretion capacity and robustness (Jacobs et al. 2009). In the strain known as M8, the endogenous OCH1 gene was eliminated by homologous recombination, so that secreted recombinant proteins predominantly carry the Man8GlcNAc2 structure. A gene encoding an α-1,2-mannosidase was subsequently introduced to remove all α-1,2-linked mannose residues and generate Man5GlcNAc2 on the target proteins, creating the M5 strain. The five remaining strains, which synthesize proteins with hybrid or complex glycosylation patterns, were engineered by introducing additional glycosylation-modifying genes. Using M5 as a host is a sound first step towards producing recombinant human proteins whose glycosylation patterns closely resemble those observed in humans (Jacobs et al. 2009).
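
The M8 to M5 conversion described above can be followed as simple bookkeeping on glycan composition. The Python sketch below is illustrative only and is not taken from the cited studies: it assumes that Man8GlcNAc2 carries three α-1,2-linked mannose residues (standard glycobiology, not stated in the text), uses standard monoisotopic residue masses, and introduces hypothetical helper names (trim_alpha12_mannose, free_glycan_mass, glycan_name).

```python
# Monoisotopic residue masses in daltons (standard mass spectrometry values).
RESIDUE_MASS = {"Man": 162.0528, "GlcNAc": 203.0794}
WATER = 18.0106  # one water is added for the free (released) glycan

# Assumption: Man8GlcNAc2 carries three alpha-1,2-linked mannose residues.
ALPHA12_MANNOSES_IN_MAN8 = 3


def trim_alpha12_mannose(glycan, n_removed):
    """Return the composition left after a mannosidase removes n_removed mannoses."""
    if n_removed > glycan["Man"]:
        raise ValueError("cannot remove more mannose residues than are present")
    trimmed = dict(glycan)
    trimmed["Man"] -= n_removed
    return trimmed


def free_glycan_mass(glycan):
    """Monoisotopic mass of the free glycan: sum of residue masses plus water."""
    return sum(RESIDUE_MASS[res] * count for res, count in glycan.items()) + WATER


def glycan_name(glycan):
    """Format a composition in the ManXGlcNAcY shorthand used in the text."""
    return f"Man{glycan['Man']}GlcNAc{glycan['GlcNAc']}"


# Glycan predominantly found on proteins secreted by the M8 strain.
man8 = {"Man": 8, "GlcNAc": 2}
# Product after alpha-1,2-mannosidase trimming, as in the M5 strain.
man5 = trim_alpha12_mannose(man8, ALPHA12_MANNOSES_IN_MAN8)

for glycan in (man8, man5):
    print(glycan_name(glycan), round(free_glycan_mass(glycan), 2))
```

Under these assumptions each removed mannose lowers the mass by one hexose residue, so a mass spectrum of released N-glycans would be expected to shift by three hexose units between M8-type and M5-type material.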

Proteins produced in M5 require thorough analysis to determine the precise nature and uniformity of the glycosylation on the recombinant product. Despite some successes, further progress is needed to obtain the desired glycosylation in P. pastoris. Proteins produced by M5, for example, have shown degradation, and the population of secreted proteins lacks uniformity (Laukens et al. 2020). One possible explanation is that the newly formed N-glycan structures unintentionally serve as substrates for one or more endogenous glycosyltransferases, leading to unforeseen and undesirable glycan structures. The current Pichia GlycoSwitch® strains are not flawless, but several approaches can be tried to address these obstacles. One strategy for countering interference from endogenous glycosyltransferases is to alter the amino acid residues that undergo glycosylation, in the expectation that the substitution will not affect function. Another is to alter the fermentation conditions to obtain more uniform N-glycan profiles (Tripathi and Shrivastava 2019). A further option is to outcompete the endogenous enzymes by overexpressing glycosyltransferases from a different source that have a comparable acceptor-substrate specificity. There is a limit to the extent of genetic alteration that the secretory pathway of P. pastoris can tolerate, because excessive disruption of the host's glycosylation enzyme genes yields strains with reduced growth rates and diminished productivity.
Altered glycosylation affects the function of the cell wall and of the membrane proteins that are essential for the survival of the organism (Tripathi and Shrivastava 2019; Macedo 2019).
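
As an illustration of the first strategy above (altering the residues that undergo glycosylation), candidate N-glycosylation sites can be located by scanning a sequence for the consensus sequon N-X-S/T, where X is any residue except proline. The sketch below is a minimal example under stated assumptions and not a method from the cited works: the function names and the toy sequence are hypothetical, the Asn-to-Gln substitution is one commonly used conservative choice rather than the only option, and a sequon match indicates only a potential site, not confirmed occupancy.

```python
import re

# N-X-S/T sequon with X != Pro. The lookahead lets overlapping sequons be found.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(seq):
    """Return (1-based Asn position, sequon) for each potential N-glycosylation site."""
    seq = seq.upper()
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(seq)]


def propose_n_to_q(seq):
    """Return (mutation label, mutant sequence) with each sequon Asn replaced by Gln.

    Assumption: Asn -> Gln is used here only as an example substitution.
    """
    seq = seq.upper()
    proposals = []
    for pos, _ in find_sequons(seq):
        mutant = seq[:pos - 1] + "Q" + seq[pos:]
        proposals.append((f"N{pos}Q", mutant))
    return proposals


# Hypothetical toy sequence, not a real protein.
toy = "MKNGTAANPSLLNASQ"
for position, sequon in find_sequons(toy):
    print(position, sequon)
for label, mutant in propose_n_to_q(toy):
    print(label, mutant)
```

Each proposed variant would still need to be tested experimentally, since the text notes that such substitutions are made only in the expectation that function is preserved.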

Conclusion

Komagataella species are versatile yeasts that produce a diverse array of biocompounds of great interest and successfully express heterologous proteins. As the number of strain variants and new molecular tools and methods continues to grow, these yeasts are gaining a competitive advantage over other yeasts. In particular, P. pastoris (K. phaffii) has the benefit of being a eukaryote (in contrast to bacteria) and has untapped potential in fermentative cultivation (in comparison with S. cerevisiae). In addition, P. pastoris is recognized as a prominent microbial platform used extensively across the feed, pharmaceutical, detergent, food, and other industries to manufacture recombinant proteins. Although P. pastoris expression systems have impressive characteristics and are user-friendly thanks to clearly defined process protocols, some process optimization is needed to attain the highest possible yield of the desired protein. The conditions needed for efficient recombinant protein production in P. pastoris ultimately depend on the specific target protein. Because the performance of P. pastoris depends on the structural attributes of the specific protein, users of the system should recognize that certain approaches may enhance recombinant protein production, whereas others may have no discernible effect or may even reduce expression levels. If researchers remain committed and continue to investigate the possibilities, however, it is highly probable that they will achieve increased production of their biologically active proteins.