Introduction

Pichia pastoris (P. pastoris) is the second most widely used recombinant expression system after Escherichia coli (Theron et al. 2018). In this system, the transcription level of a heterologous gene largely determines its expression efficiency (Norden et al. 2011). Therefore, placing the heterologous gene under the control of a strong promoter and integrating multiple copies of the expression cassette is the preferred strategy (Aw and Polizzi 2013; Shen et al. 2012; Shen et al. 2020). The methanol-inducible alcohol oxidase 1 promoter (PAOX1) is most commonly used because it is remarkably strong and tightly regulated by methanol (Vogl and Glieder 2013). With this promoter, cells can grow to high densities before the methanol induction phase, which eliminates the possibility of toxicity of the heterologous protein to P. pastoris during the growth phase. In numerous examples, increasing the number of integrated expression cassettes has been shown to boost yields by raising the transcription level of the heterologous gene (Juturu and Wu 2018). The in vitro strategy for generating multicopy inserts is to transform cells with plasmids containing tandem repeats of the expression cassette. However, preparing these plasmids is laborious, and rearrangements may occur owing to their large size. The in vivo strategy uses antibiotic-resistant genes to screen for multicopy inserts, since the level of antibiotic resistance roughly correlates with the number of expression vectors integrated into the transformants. The promoter widely used to initiate transcription of antibiotic-resistant genes in commercial vectors, e.g., pPICZαA, pPIC6αA, and pGAPαA, is a region of DNA located upstream of the translational elongation factor 1a (TEF1) gene of Saccharomyces cerevisiae, which has strong promoter activity under all commonly used carbon sources (Kitamoto et al. 1998).
PTEF1 may have two drawbacks when used to drive antibiotic-resistant genes. Firstly, expression of the antibiotic-resistant gene is not only unnecessary but may also increase the metabolic burden during heterologous protein expression (Karim et al. 2013). Secondly, large amounts of expensive antibiotic are required for screening positive transformants, especially those with multicopy inserts. We therefore hypothesized that a weaker promoter that specifically initiates transcription during the selection process would be ideal for driving antibiotic-resistant genes.

While the discovery of strong promoters is attractive for driving recombinant protein production, the identification of weak promoters is required for fine-tuned regulation of gene expression in P. pastoris (Qin et al. 2011; Stadlmayr et al. 2010). Developments in genomics and transcriptomics have allowed large-scale screening of novel promoters in P. pastoris (Zahrl et al. 2017). After re-analyzing RNA-seq data for P. pastoris strain GS115 grown on different carbon sources (Xu et al. 2018), we chose putative promoters that may be suitable for driving transcription of the antibiotic-resistant gene based on the criteria described above. To conveniently monitor their strengths and performance under different carbon sources, a green fluorescent protein variant optimized for excitation by UV light (GFPuv) was expressed under the control of the promoter candidates. After screening, putative promoters of the 2-deoxyglucose-6-phosphate phosphatase gene (Dog2p) exhibited significantly higher activities under glucose (the carbon source normally used for the selection process) than under the carbon sources (glycerol and methanol) used for PAOX1-based production of recombinant protein.

We next investigated the feasibility of driving the Streptoalloteichus hindustanus bleomycin resistance (Sh ble) gene with PDog2p1000 and its truncations. We show that a 300-bp fragment upstream of the start site of the Dog2p gene is sufficient to initiate Sh ble gene transcription. Driving Sh ble gene expression with PDog2p300 in expression vectors proved to be efficient for the selection of high-yield strains expressing exogenous genes such as GFPuv and insulin precursor (IP).

Materials and methods

Strains, plasmids, enzymes, and reagents

P. pastoris strain GS115 (his4; Thermo Fisher Scientific, Carlsbad, USA) was used as an expression host. Vector pGFPuv was purchased from Takara Bio (Dalian, China). Vectors pGAPZαB, pPICZαB, pPIC3.5K, and pPIC9 were purchased from Thermo Fisher Scientific (Carlsbad, USA). The pmCherry-1 vector was purchased from Takara Biomedical Technology (Beijing, China). ClonExpress II One Step Cloning Kit, Universal SYBR qPCR Master Mix kit, HiScript II Q RT SuperMix for qPCR, and Phanta Max Super-Fidelity DNA polymerase were obtained from Vazyme Biotech (Nanjing, China).

Construction of GFPuv reporter plasmids

Firstly, the vector pPGAP-GFPuv was constructed (Fig. 2a). Briefly, a linearized pPIC9 vector was amplified from pPIC9 by reverse PCR using primers FP-pPIC9 and RP-pPIC9 (all primers used for the construction of vectors are listed in Table S1). The promoter sequence of glyceraldehyde-3-phosphate dehydrogenase (PGAP) was amplified from pGAPZαB using primers FP-PGAP and RP-PGAP, and the GFPuv gene was amplified from pGFPuv using primers FP1-GFPuv and RP1-GFPuv. The amplified PGAP sequence and GFPuv gene were inserted sequentially into the linearized pPIC9 vector by recombination using the ClonExpress II One Step Cloning Kit according to the manufacturer's instructions.

To construct GFPuv expression plasmids driven by different putative promoters (Fig. 2a), the sequences of PPFK2, P768, P181, P642, P090, and PDog2p1000 were amplified from the P. pastoris genome by corresponding primer pairs, and inserted into a linearized pPGAP-GFPuv expression vector that was amplified from pPGAP-GFPuv using primers FP-pPGAP-GFPuv and RP-pPIC9. The pPTEF1-GFPuv was constructed similarly except that the sequences of PTEF1 were amplified from pGAPZαB using primers FP-PTEF1 and RP-PTEF1. All GFPuv reporter plasmids, except the pP181-GFPuv that was transfected directly, were linearized by Stu I before transfection of P. pastoris.

Construction of pPDog2p1000-ble and pPPFK2-ble

The PTEF1 in pPICZαB was replaced with PDog2p1000 and PPFK2 to give pPDog2p1000-ble and pPPFK2-ble, respectively (Fig. 3a). Briefly, the PDog2p1000 product was amplified from the P. pastoris genome using FP2-PDog2p1000 and RP2-PDog2p1000, and the PCR product was cloned by recombination into the linearized vector that was amplified from pPICZαB using primers FP1-pPICZαB and RP1-pPICZαB. pPPFK2-ble was constructed similarly, except that the PPFK2 product was amplified from the P. pastoris genome using FP2-PPFK2 and RP2-PPFK2. Plasmids were linearized by Sac I before transfection of P. pastoris.

Construction of pPDog2p1000 truncations for driving GFPuv or Sh ble expression

The PDog2p1000 truncations PDog2p700, PDog2p500, PDog2p400, and PDog2p300 were amplified from the P. pastoris genome by corresponding upstream primers with downstream primer RP1-PDog2p1000 and inserted into the linearized pPGAP-GFPuv expression vector to give pPDog2p700-GFPuv, pPDog2p500-GFPuv, pPDog2p400-GFPuv, and pPDog2p300-GFPuv, respectively (Fig. 4a). Plasmids were linearized by Stu I before transfection of P. pastoris.

To replace PTEF1 with PDog2p700 in pGAPZαB, PDog2p700 was amplified from the P. pastoris genome using FP2-PDog2p700 and RP2-PDog2p1000, and the PCR product was inserted into the linearized pPDog2p1000-ble vector to give pPDog2p700-ble (Fig. 4e). pPDog2p300-ble (Fig. 4f) was constructed similarly, except that PDog2p300 was amplified from the P. pastoris genome using FP2-PDog2p300 and RP2-PDog2p1000. Plasmids were linearized by Sac I before transfection of P. pastoris.

Construction of pPICZαB-ble-His and pPDog2p300-ble-His

To detect the protein expression level of Sh ble in P. pastoris, a DNA fragment coding for a six-histidine tag was fused to the C terminus of the Sh ble gene in pPICZαB and pPDog2p300-ble (Fig. 5 a and b). Briefly, Sh ble-His was amplified from pPICZαB using primers FP-ble-His and RP-ble-His. Note that primer RP-ble-His contains the reverse complement of the DNA sequence coding for the six-histidine tag. The Sh ble-His gene was cloned into the linearized vector that was amplified from pPICZαB using primers FP2-pPICZαB and RP2-pPICZαB, by recombination using the ClonExpress II One Step Cloning Kit according to the manufacturer's instructions. Using the same strategy, the Sh ble-His gene was cloned into the linearized vector pPDog2p300-ble to give pPDog2p300-ble-His. Plasmids were linearized by Sac I before transfection of P. pastoris.

Construction of pPICZ-GFPuv, pPDog2p300-ble-GFPuv, and pPDog2p300-ble-IP

To construct pPICZ-GFPuv (Fig. 6a), the linearized pPICZB vector was amplified from pPICZB by reverse PCR using FP3-pPICZαB and RP3-pPICZαB, and the GFPuv gene was amplified from pGFPuv using primers FP2-GFPuv and RP2-GFPuv. The amplified GFPuv gene was inserted into the linearized pPICZB vector using the ClonExpress II One Step Cloning Kit. The TEF1 promoter in pPICZ-GFPuv was replaced by PDog2p300 to give pPDog2p300-ble-GFPuv (Fig. 6b). As in the construction of pPDog2p300-ble, PDog2p300 was amplified from the P. pastoris genome using primer pair FP2-PDog2p300 and RP2-PDog2p1000, and the PCR product was then cloned into the linearized vector that was amplified from pPICZ-GFPuv using primer pair FP1-pPICZαB and RP1-pPICZαB. Plasmids were linearized by BstX I before transfection of P. pastoris.

The vector pPICZα-IP was constructed as described (Fig. 7a) (Gurramkonda et al. 2010). The TEF1 promoter in pPICZα-IP was replaced by PDog2p300 to give pPDog2p300-ble-IP (Fig. 7b). PDog2p300 was amplified from the P. pastoris genome using primer pair FP2-PDog2p300 and RP2-PDog2p1000, and the PCR product was then cloned, using the ClonExpress II One Step Cloning Kit, into the linearized vector that was amplified from pPICZα-IP using primer pair FP1-pPICZαB and RP1-pPICZαB. Plasmids were linearized by Sac I before transfection of P. pastoris.

Construction of pPDog2p300-Kan-mCherry

To replace the GFPuv gene with mCherry gene in pPDog2p300-ble-GFPuv, the linearized vector was amplified from pPDog2p300-ble-GFPuv by reverse PCR using FP3-pPDog2p300 and RP3-pPDog2p300, and the mCherry gene was amplified from pmCherry-1 using primers FP1-mCherry and RP1-mCherry. The amplified mCherry gene was inserted into the linearized vector to give pPDog2p300-ble-mCherry.

A linearized vector was amplified from pPDog2p300-ble-mCherry by reverse PCR using FP4-pPDog2p300 and RP4-pPDog2p300, and the Kanamycin resistance gene was amplified from pPIC3.5K using primers FP1-pPIC3.5K and RP1-pPIC3.5K. The amplified Kanamycin resistance gene was inserted into the linearized vector to give pPDog2p300-Kan-mCherry (Fig. 6g). The pPDog2p300-Kan-mCherry was linearized by Sac I before transfection of P. pastoris.

Transformation of P. pastoris

High-efficiency transformation of P. pastoris by electroporation was achieved using the method described by Wu and Letchworth (2004). Unless otherwise stated, cells pretreated with LiAc and DTT were incubated with 500 ng of linearized plasmid for 5 min, and the mixture was then electroporated at 1.5 kV, 25 μF, and 186 Ω. After electroporation, 1 ml of ice-cold 1 M sorbitol was immediately added to the cuvette, and the cuvette contents were transferred to a tube and incubated at 30 °C without shaking for 1 h. For plasmids containing the HIS4 gene, 100 μl of transformed cells was spread on plates containing RDB medium. For plasmids harboring the Sh ble gene, 1 ml of YPD medium was then added to the tubes containing the transformed cells, which were incubated at 30 °C for another 3 h. All cells were then spread on plates containing, unless specifically stated otherwise, 100 μg/ml of Zeocin. For vector pPDog2p300-Kan-mCherry, transformed cells were spread on plates containing 1.0 mg/ml of Geneticin after incubation in YPD medium. Three days after transformation, colonies were counted using Bio-Rad Quantity One analysis software (Bio-Rad, Mississauga, Canada).

Determination of the strength of promoter candidates under different carbon sources

Strains containing the GFPuv gene driven by putative promoters were seeded into flasks containing 25 ml of YPD, BMGY, or BMMY medium and grown at 30 °C with shaking. For the BMMY medium, 250 μl of methanol was added to the flasks at 48 h and 72 h. At the indicated time points, cells from 1 ml of culture broth were washed three times with 1 ml of phosphate-buffered saline (PBS) and then transferred into 96-well plates for measurement of the OD600 and fluorescence values. GFPuv fluorescence was detected using an excitation wavelength of 395 nm and an emission wavelength of 506 nm. mCherry fluorescence was detected using an excitation wavelength of 587 nm and an emission wavelength of 610 nm. The relative fluorescence unit was calculated as fluorescence value/OD600. The background fluorescence determined from parental GS115 was subtracted.
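The normalization described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' analysis script: the function names and the plate-reader readings are hypothetical, and it assumes that the parental GS115 background is subtracted after each well has been normalized to its OD600 (the text does not state the order of the two steps).

```python
from statistics import mean, stdev


def relative_fluorescence(fluorescence, od600):
    """Fluorescence normalized to cell density (fluorescence value / OD600)."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return fluorescence / od600


def background_corrected_rfu(sample, parental):
    """Return one background-corrected RFU per replicate well.

    sample and parental are lists of (fluorescence, OD600) pairs.
    Assumption: the background is the mean normalized signal of the
    parental strain and is subtracted after OD600 normalization.
    """
    background = mean(relative_fluorescence(f, od) for f, od in parental)
    return [relative_fluorescence(f, od) - background for f, od in sample]


# Hypothetical readings in arbitrary units, not data from the study.
gs115 = [(100.0, 1.0), (110.0, 1.0), (90.0, 1.0)]
reporter = [(1200.0, 2.0), (1300.0, 2.0), (1100.0, 2.0)]

rfu = background_corrected_rfu(reporter, gs115)
print(f"RFU = {mean(rfu):.1f} ± {stdev(rfu):.1f}")
```

Reporting the replicate mean together with the sample standard deviation matches the mean ± SD convention used in the statistical analysis.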

Expression of IP or GFPuv driven by AOX1 promoter

To produce recombinant protein driven by the AOX1 promoter, strains were cultured in flasks containing 25 ml of BMGY at 30 °C with shaking for 24 h. The cells were harvested by centrifugation and resuspended in 25 ml of BMMY to induce expression. A total of 250 μl of methanol was added to the flasks every 24 h to maintain induction. After 72 h of induction, culture supernatants or cells were collected for analysis.

SDS-PAGE and immunoblotting

Cells were harvested and lysed using glass beads according to the Pichia Expression Kit Instruction Manual. Twenty micrograms of cytosolic protein, as determined by the BCA assay (Beyotime, Shanghai, China), was separated on a 4–12% SurePAGE gel (GenScript, Nanjing, China). After electrophoresis, proteins were transferred to a PVDF membrane using a Trans-Blot Turbo transfer system (Bio-Rad, Hercules, USA). His-tag mouse antibody (Abmart, Shanghai, China) and goat anti-mouse antibody conjugated to HRP (Shenggong biocompany, Shanghai, China) were used as the primary and secondary antibodies, respectively. Immunoreactive bands were visualized using the BeyoECL Plus kit (Beyotime, Suzhou, China).

Tricine-SDS-PAGE analysis of IP

Twenty microliters of culture supernatant was mixed with 5 μl of 5X SDS-PAGE gel loading buffer and boiled for 10 min, and the mixture was then loaded onto a 16.5% tricine-SDS-PAGE gel (Shanghai Shenggong biocompany, Shanghai, China). After electrophoresis, the separated bands were visualized by Coomassie blue staining.

Determination of gene copy number and mRNA level by real-time PCR

Genomic DNA of P. pastoris was extracted using the AxyPrep Multisource Genomic DNA Miniprep Kit (Suzhou, China) following the manufacturer's instructions. The gene copy number in strains was determined by real-time polymerase chain reaction (RT-PCR). The AOX2 promoter sequence was used as the endogenous reference, and the Sh ble gene and kanamycin resistance gene were the target genes. For determination of the copy number of the kanamycin resistance gene, a single colony selected from an RDB plate after transfection of Sac I-linearized pPIC3.5K into GS115 was arbitrarily designated as the single-copy strain. All primers used for RT-PCR are listed in Table S2. The reaction was performed on a Q-tower system (Analytik-Jena, Jena, Germany) with the Universal SYBR qPCR Master Mix kit.

Total RNA from P. pastoris was prepared using a yeast total RNA isolation kit (Shanghai Shenggong biocompany, Shanghai, China) and was reverse-transcribed into complementary DNA (cDNA) with random hexamer primers. RT-PCR was performed on a Q-tower system (Analytik-Jena, Jena, Germany) with the Universal SYBR qPCR Master Mix kit, and the comparative CT method was used to compare the expression level of Sh ble among strains. The actin gene was used as the endogenous reference (Table S2), and the Sh ble gene was the target gene.
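As an illustration of the comparative CT method named above, the following Python sketch computes the usual 2^-ΔΔCT fold change. It is not the authors' calculation: the function name and CT values are hypothetical, and the formula assumes that the target and reference amplicons amplify with similar, near-100% efficiency.

```python
def fold_change(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative quantity of a target by the comparative CT (2^-ddCT) method.

    ct_target / ct_reference: CT values of the target gene and the
    endogenous reference in the sample of interest.
    ct_target_cal / ct_reference_cal: the same two CT values in the
    calibrator sample, whose relative quantity is defined as 1.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    return 2.0 ** -(delta_ct_sample - delta_ct_calibrator)


# Hypothetical CT values, not data from the study: Sh ble as the target,
# actin as the endogenous reference, the YPD culture as the calibrator.
ypd = {"ble": 22.0, "actin": 18.0}
bmgy = {"ble": 23.5, "actin": 18.0}

ratio = fold_change(bmgy["ble"], bmgy["actin"], ypd["ble"], ypd["actin"])
print(f"Sh ble mRNA in BMGY relative to YPD: {ratio:.2f}")
```

The same arithmetic is commonly applied to genomic DNA to estimate copy number relative to a single-copy calibrator strain; whether the copy numbers reported here were computed this way is an assumption, since the text does not give the formula.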

Statistical analysis

Statistical analysis was performed by one-way ANOVA using GraphPad Prism software v5.01. Results are presented as mean ± SD. Statistical significance was defined as *p < 0.05 and **p < 0.01.
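For readers who want to see what the one-way ANOVA computes, the sketch below derives the F statistic from replicate groups using only the Python standard library. It is an illustration, not a reimplementation of GraphPad Prism: the group values are hypothetical, and the p-value is not computed here because it requires the F distribution (a library routine such as scipy.stats.f_oneway returns both the statistic and the p-value).

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of replicate groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the replicates around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical relative fluorescence units for three media, 3 replicates each.
ypd = [5.0, 6.0, 7.0]
bmgy = [2.0, 3.0, 4.0]
bmmy = [1.0, 2.0, 3.0]

f_stat, df_b, df_w = one_way_anova_f([ypd, bmgy, bmmy])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```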

Results

PDog2p1000 and PPFK2 exhibited significantly higher activities under YPD compared with BMGY or BMMY for driving GFPuv

The RNA-seq analysis of P. pastoris under glucose, glycerol, and methanol allowed us to identify novel genes responsive to specific carbon sources (Xu et al. 2018). For a subset of these genes, the expression profiles could be mainly determined by their promoters. Since we focused on genes with high levels of transcription under glucose, the ratios of individual gene expression under glucose to that under non-glucose carbon sources were calculated using the reads per kilobase per million mapped reads (RPKM) values generated by RNA-seq (Fig. 1 a and b). Of the 5040 genes analyzed, 1.6% and 4.2% exhibited a more than 4-fold increase in transcription level under glucose compared with glycerol and methanol, respectively. Among these genes with lower transcription levels under glycerol or methanol, 25 overlapped. Given that appropriate promoter strength is required for driving the Sh ble gene, a transcription level that is too high or too low suggests that the promoter strength is not suitable. Therefore, genes with RPKM values ranging from 10^3 to 10^4 under glucose were chosen, which left only 10 genes that fell in this range and had low transcription levels under both glycerol and methanol (Fig. 1c). The putative promoters (1000-bp upstream fragments) of 6 of these 10 genes were fused individually with the GFPuv gene as an initial screen to identify their regulatory profiles (Fig. 2a and Table 1).
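The selection criteria in this paragraph can be expressed as a simple filter. The Python sketch below is illustrative only: the gene names and RPKM values are hypothetical, the function is not the authors' pipeline, and it assumes an RPKM window of 10^3 to 10^4 under glucose together with a more than 4-fold higher RPKM under glucose than under both glycerol and methanol.

```python
def select_candidates(rpkm, min_ratio=4.0, window=(1e3, 1e4)):
    """Return genes whose promoters match the selection criteria.

    rpkm maps a gene name to its RPKM under "glucose", "glycerol",
    and "methanol". The fold criteria are written as products so that
    genes with an RPKM of 0 under glycerol or methanol do not cause a
    division by zero.
    """
    low, high = window
    selected = []
    for gene, values in rpkm.items():
        glucose = values["glucose"]
        in_window = low <= glucose <= high
        over_glycerol = glucose > min_ratio * values["glycerol"]
        over_methanol = glucose > min_ratio * values["methanol"]
        if in_window and over_glycerol and over_methanol:
            selected.append(gene)
    return selected


# Hypothetical RPKM table, not data from the study.
rpkm_table = {
    "gene_a": {"glucose": 2000.0, "glycerol": 400.0, "methanol": 300.0},
    "gene_b": {"glucose": 2000.0, "glycerol": 1500.0, "methanol": 100.0},
    "gene_c": {"glucose": 50000.0, "glycerol": 100.0, "methanol": 100.0},
    "gene_d": {"glucose": 500.0, "glycerol": 10.0, "methanol": 10.0},
}

candidates = select_candidates(rpkm_table)
print(candidates)
```

In this toy table only gene_a passes: gene_b is not repressed enough on glycerol, gene_c is transcribed too strongly under glucose, and gene_d too weakly.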

Fig. 1

Re-analysis of the P. pastoris RNA-seq data under different carbon sources. a The ratio of RPKM values under glucose to those under glycerol. Blue points indicate genes whose ratio is greater than 4. Red points indicate the 6 genes whose putative promoters were further investigated. b The ratio of RPKM values under glucose to those under methanol. Blue points indicate genes whose ratio is greater than 4. Red points indicate the 6 genes whose putative promoters were further investigated. c A Venn diagram indicating the intersections of genes with the indicated RPKM values

Fig. 2

Initial screening of putative promoters with low activity on glycerol and methanol. a Construction of GFPuv reporter plasmids for initial screening of putative promoters. PGAP, PTEF1, or a putative promoter was fused with the GFPuv gene. b–d Relative fluorescence units of cells containing different GFPuv reporter plasmids at 24, 48, and 72 h. Cells were incubated in YPD, BMGY, or BMMY medium at 30 °C with shaking. At the indicated times, cells from 1 ml of culture broth were washed three times with 1 ml of phosphate-buffered saline (PBS), and the relative fluorescence units were then determined. There were 3 samples in each group

Table 1 Putative promoters selected for driving the antibiotic-resistant gene

The GAP promoter exhibited the highest activity from day 2 under all carbon sources (Fig. 2b–d). After 48 h, its strength on glucose was around 2-fold and 4–5-fold higher than on glycerol and methanol, respectively. In the case of the TEF1 promoter, the relative fluorescence units were quite similar under all carbon sources, indicating that the TEF1 promoter is not regulated by the carbon source in P. pastoris. Neither P768 nor P181 showed apparent activity under any carbon source at any time point tested. Although P642 exhibited low promoter activity under methanol, its activity under glycerol was comparable with that under glucose after 48 h. Furthermore, its activity under glucose was stronger than that of the TEF1 promoter. PDog2p1000 and PPFK2 (the putative promoter of the 6-phosphofructo-2-kinase gene) showed the preferred profiles under different carbon sources, as their activities were weaker than that of the TEF1 promoter under glucose and were further downregulated under glycerol and methanol. Thus, we next investigated the feasibility of driving the antibiotic-resistant gene Sh ble with these two promoters.

PDog2p1000 is a better promoter for driving the antibiotic-resistant gene Sh ble compared with PTEF1

To investigate the activities of PDog2p1000 and PPFK2 for driving the Sh ble gene, pPICZαB, pPDog2p1000-ble, or pPPFK2-ble was transformed into P. pastoris GS115 (Fig. 3a). After 3 days of incubation on plates containing 100 μg/ml of Zeocin, approximately 1500, 20, and 200 colonies were formed by cells transformed with pPICZαB, pPDog2p1000-ble, and pPPFK2-ble, respectively (Fig. 3 b and c). Six single colonies transformed with each plasmid were picked and analyzed for the Sh ble gene copy number. The mean gene copy numbers of colonies transformed with pPDog2p1000-ble and pPPFK2-ble were 3.6 and 1.5, respectively (Fig. 3d). Interestingly, for cells transformed with pPDog2p1000-ble, approximately 1400 and 330 colonies were formed on plates containing 25 and 50 μg/ml of Zeocin, respectively (Fig. 3e). The copy numbers of the integrated Sh ble gene in colonies transformed with pPDog2p1000-ble correlated with their Zeocin resistance (Fig. 3 d and f).

Fig. 3

PDog2p1000 is a potential promoter for driving the antibiotic-resistant gene Sh ble. a Construction of expression vectors containing Sh ble driven by PDog2p1000 or PPFK2. The PTEF1 in pPICZαB was replaced by PDog2p1000 or PPFK2. b, c Colony formation on agar plates after transformation with pPICZαB, pPDog2p1000-ble, or pPPFK2-ble. A total of 500 ng of linearized plasmid was used for transformation. Three days after transformation, the number of colonies on each plate was counted using Bio-Rad Quantity One analysis software. There were 3 samples in each group. d Gene copy number analysis of cells transfected with pPICZαB, pPDog2p1000-ble, or pPPFK2-ble. Six single colonies transformed with each plasmid were picked and analyzed for the Sh ble gene copy number. e Colony formation on agar plates after transformation with pPDog2p1000-ble. A total of 500 ng of linearized plasmid was used for transformation, and transformed cells were spread on plates containing 25 or 50 μg/ml of Zeocin. f Gene copy number analysis of cells selected from plates containing 25 or 50 μg/ml of Zeocin after transformation with pPDog2p1000-ble

A 300-bp fragment upstream of the start site of the Dog2p gene is sufficient to initiate transcription

Considering that a smaller vector may facilitate gene manipulation, we investigated the length of the Dog2p promoter that is necessary for initiating transcription. The relative fluorescence units were very close under all carbon sources when the GFPuv gene was fused with a 1000-bp, 700-bp, 500-bp, 400-bp, or 300-bp fragment upstream of the transcribed Dog2p gene (Fig. 4 b–d), suggesting that the functional promoter of the Dog2p gene is shorter than 300 base pairs. We next constructed plasmids containing the Sh ble gene driven by PDog2p700 or PDog2p300 (Fig. 4 e and f). Similar numbers of colonies were obtained from cells transformed with either of these plasmids (Fig. 4g). Importantly, strains transfected with pPDog2p300-ble still exhibited an approximately 3-fold higher level of Sh ble gene transcription in YPD medium compared with BMGY and BMMY media (Fig. 4h). Given its short length, appropriate strength, and lower activity in BMGY and BMMY media, PDog2p300 appeared to be a suitable promoter for driving the Sh ble gene in expression vectors of P. pastoris.

Fig. 4

PDog2p300 can drive the exogenous gene. a Construction of reporter vectors containing GFPuv driven by PDog2p1000 truncations. The PDog2p1000 in pPICZαB was replaced by PDog2p700, PDog2p500, PDog2p400, or PDog2p300. b–d Relative fluorescence units of cells containing GFPuv driven by PDog2p700, PDog2p500, PDog2p400, or PDog2p300 at 24, 48, and 72 h. Cells were incubated in YPD, BMGY, or BMMY medium at 30 °C with shaking. At the indicated times, cells from 1 ml of culture broth were washed three times with 1 ml of phosphate-buffered saline (PBS), and the relative fluorescence units were then determined. There were 3 samples in each group. e, f Construction of expression vectors containing Sh ble driven by PDog2p700 or PDog2p300. The PTEF1 in pPICZαB was replaced by PDog2p700 or PDog2p300. g Colony number of cells transformed with PDog2p700 or PDog2p300. Three days after transformation, the number of colonies on each plate was counted using Bio-Rad Quantity One analysis software. There were 3 samples in each group. h Relative mRNA levels of the Sh ble gene in cells transformed with pPDog2p300-ble on different carbon sources. Representative single colonies were seeded in YPD, BMGY, or BMMY medium and grown at 30 °C with shaking for 48 h. Cells were harvested for analysis of Sh ble mRNA levels. There were 3 samples in each group

The Sh ble protein levels were lower in high-copy strains transformed with pPDog2p300-ble compared with pPICZαB

Although a larger amount of Zeocin may be required, high-copy strains can also be screened by transformation with pPICZαB. We were therefore interested in the difference in antibiotic-resistant gene expression between high-copy strains obtained by transformation with pPICZαB and with pPDog2p300-ble. To this end, pPICZαB-ble-His and pPDog2p300-ble-His were constructed and transformed into P. pastoris (Fig. 5 a and b). For pPICZαB-ble-His, all transformed cells had to be spread on a single plate containing 800 μg/ml Zeocin, and only 9 colonies were obtained. The average Sh ble gene copy number in 4 of them was 3.4, which was comparable with that of the high-copy strains obtained by transformation with pPDog2p300-ble-His (Fig. 5c). In contrast, the plate used for screening cells transformed with pPDog2p300-ble-His contained only 100 μg/ml Zeocin. The Sh ble protein levels were apparently lower in high-copy colonies transformed with pPDog2p300-ble-His (Fig. 5d). Collectively, these data indicate that replacing PTEF1 with PDog2p300 in the expression vector not only decreased the amount of Zeocin needed for screening high-copy strains but also reduced the metabolic burden of these strains by downregulating Sh ble gene expression.

Fig. 5

High gene copy number strains selected from pPDog2p300-ble transfection express low levels of Sh ble protein. a, b Construction of expression vectors pPICZαB-ble-His and pPDog2p300-ble-His. A DNA fragment coding for a six-histidine tag was fused to the C terminus of the Sh ble gene in pPICZαB and pPDog2p300-ble. c Gene copy number analysis of strains transformed with pPICZαB-ble-His or pPDog2p300-ble-His. Cells transfected with pPICZαB-ble-His were spread on a plate containing 800 μg/ml Zeocin after transfection. In contrast, cells transfected with pPDog2p300-ble-His were spread on a plate containing 100 μg/ml Zeocin. There were 4 samples in each group. d Analysis of Sh ble gene expression by immunoblotting. High gene copy number strains selected from pPICZαB-ble-His or pPDog2p300-ble-His transfection were incubated in YPD medium at 30 °C with shaking for 48 h. Cells were then harvested and lysed. A total of 20 μg of cytosolic protein was used for immunoblotting. There were 4 samples in each group

Multiple copy strains are achieved using a low concentration of Zeocin by transfection with pPDog2p300-ble-GFPuv

To investigate the applicability of pPDog2p300-ble as an expression vector in P. pastoris, the GFPuv gene was inserted downstream of the AOX1 promoter to give pPDog2p300-ble-GFPuv (Fig. 6b). For comparison, the GFPuv gene was inserted into the corresponding site in pPICZαB to give pPICZ-GFPuv (Fig. 6a). The relative fluorescence units and GFPuv gene copy numbers were 6.5-fold and 3.2-fold higher, respectively, in cells transfected with pPDog2p300-ble-GFPuv compared with pPICZ-GFPuv (Fig. 6c–e). The strain harboring 4.3 copies of pPDog2p300-ble-GFPuv was named GFP. Furthermore, mRNA levels of the Sh ble gene in cells transfected with pPDog2p300-ble-GFPuv were lower in BMGY and BMMY media (Fig. 6f).

Fig. 6

Expression of GFPuv and mCherry by the PDog2p300-derived vector. a, b Construction of expression vectors pPICZ-GFPuv and pPDog2p300-ble-GFPuv. The DNA fragment coding for GFPuv was inserted into pPICZαB (a) and pPDog2p300-ble (b). c Gene copy number analysis of strains transformed with pPICZ-GFPuv or pPDog2p300-ble-GFPuv. Cells transfected with pPICZ-GFPuv or pPDog2p300-ble-GFPuv were spread on a plate containing 100 μg/ml Zeocin after transfection. There were 4 samples in each group. d Relative fluorescence units of cells containing GFPuv driven by the AOX1 promoter. Strains transfected with pPICZ-GFPuv or pPDog2p300-ble-GFPuv were incubated in BMGY medium at 30 °C with shaking for 24 h. The cells were then resuspended in BMMY to induce expression for another 72 h. There were 4 samples in each group. e Photograph visualizing green fluorescence under UV light in strains transformed with pPICZ-GFPuv or pPDog2p300-ble-GFPuv. Cells were collected after 72 h of methanol induction. The parental GS115 was used as a negative control. There were 3 samples in each group. f Relative mRNA levels of the Sh ble gene in cells transfected with pPDog2p300-ble-GFPuv on different carbon sources. Representative single colonies were seeded in YPD, BMGY, or BMMY medium and grown at 30 °C with shaking for 48 h. Cells were harvested for analysis of Sh ble mRNA levels. There were 3 samples in each group. g Construction of expression vector pPDog2p300-Kan-mCherry. h Gene copy number analysis of strains transformed with pPDog2p300-Kan-mCherry. Cells transfected with pPDog2p300-Kan-mCherry were spread on a plate containing 1.0 mg/ml of Geneticin. i Relative fluorescence units of cells containing GFPuv and/or mCherry. There were 4 samples in each group

To test whether PDog2p300 could be used for driving two antibiotic-resistant genes in one strain, we replaced the GFPuv and Sh ble genes in pPDog2p300-ble-GFPuv with the mCherry and kanamycin resistance genes, respectively, to give pPDog2p300-Kan-mCherry. pPDog2p300-Kan-mCherry was transformed into GS115 and strain GFP (Table 2). Strains harboring multiple copies of the kanamycin resistance gene were successfully obtained after transformation of strain GFP (Fig. 6h), and both GFPuv and mCherry signals were detected in these strains (Fig. 6i).

Table 2 Construction of strains co-expressing GFPuv and mCherry

Higher expression levels of IP in strains transfected with pPDog2p300-ble-IP

To investigate the advantage of using pPDog2p300-ble as a vector for expression of recombinant proteins other than model proteins, we next transfected cells with pPDog2p300-ble-IP or pPICZα-IP (Fig. 7 a and b). Consistent with the GFPuv expression results, the IP gene copy numbers were 3.3-fold higher in strains harboring pPDog2p300-ble-IP picked from YPDS plates containing 100 μg/ml Zeocin (Fig. 7c). Importantly, the expression levels were significantly higher in cells transfected with pPDog2p300-ble-IP (Fig. 7d). RT-PCR showed that the transcription levels of the Sh ble gene in strains transformed with pPDog2p300-ble-IP were lower in BMGY and BMMY compared with YPD (Fig. 7e).

Fig. 7

Expression of IP by the PDog2p300-derived vector. a, b Construction of expression vectors pPICZα-IP and pPDog2p300-ble-IP. The vector pPICZα-IP was constructed as described (a). The DNA fragment coding for the α-factor signal peptide and IP was inserted into pPDog2p300-ble (b). c Gene copy number analysis of strains transfected with pPICZα-IP or pPDog2p300-ble-IP. Cells transfected with pPICZα-IP or pPDog2p300-ble-IP were spread on a plate containing 100 μg/ml Zeocin after transfection. There were 4 samples in each group. d Expression levels of IP in strains transfected with pPICZα-IP or pPDog2p300-ble-IP. Strains were incubated in BMGY medium at 30 °C with shaking for 24 h. The cells were then resuspended in BMMY to induce expression for another 72 h. A total of 20 μl of culture supernatant was used for tricine-SDS-PAGE analysis. There were 4 samples in each group. e Relative mRNA levels of the Sh ble gene in cells transformed with pPDog2p300-ble-IP on different carbon sources. Representative single colonies were seeded in YPD, BMGY, or BMMY medium and grown at 30 °C with shaking for 48 h. Cells were harvested for analysis of Sh ble mRNA levels. There were 3 samples in each group

Discussion

The most widely used promoter for driving antibiotic-resistant genes is the strong constitutive promoter PTEF1. In the present study, we sought to replace this promoter with a novel one in order to decrease the cost of selecting high-copy-number strains and to reduce the metabolic burden caused by the high production of antibiotic-resistant proteins. Considerable efforts have been made to identify and engineer novel promoters for the P. pastoris expression system. Most of these studies focused either on the discovery of strong promoters for driving recombinant protein production or on engineering promoter libraries for fine-tuning gene expression (Arruda et al. 2016; Liang et al. 2013; Menendez et al. 2003). To the best of our knowledge, none of these promoters has been shown to have a regulatory profile suitable for driving antibiotic-resistant genes.

After transcriptomic data mining and initial screening, we found that the activities of the putative promoters of Dog2p and PFK2 were largely dependent on the carbon source (Fig. 2b–d). Interestingly, both of these genes are related to glucose metabolism in yeast (Kretschmer and Fraenkel 1991; Tsujimoto et al. 2000). This result is expected, since we screened for putative promoters whose activities differ between glucose and other carbon sources. Similarly, many upstream fragments of genes that participate in methanol metabolism have been shown to be methanol-inducible promoters, such as PAOX1, PAOX2, and PDAS (Inan and Meagher 2001; Ohi et al. 1994; Tschopp et al. 1987). The methanol-inducible promoters seem to be regulated by the carbon source more strictly than the glucose-inducible promoters identified here. No activity can be detected for PAOX1, PAOX2, and PDAS in the presence of glucose, whereas PDog2p300 under glycerol retained approximately 30% of its activity under glucose (Fig. 6e and Fig. 7e). Synthetic biology approaches, such as modification of its regulatory elements, may be applied to engineer PDog2p300 and further decrease its activity under non-glucose carbon sources (Portela et al. 2017), but this is beyond the scope of this study.

PPFK2 exhibited much stronger activity than PDog2p1000 in driving the Sh ble gene (Fig. 3 b and c), whereas their abilities to initiate the transcription of GFPuv were at the same level (Fig. 2b–d). It is well known that the relative activity of a promoter can vary depending on the gene it drives (Stadlmayr et al. 2010), although the underlying mechanism is not completely clear. Thus, trial and error is always necessary when engineering P. pastoris. For instance, it is difficult to predict the effect of chaperone co-expression on the production of a recombinant protein (Sallada et al. 2019; Shen et al. 2020; Wu et al. 2014).

Conditional and inducible gene expression is widely used in establishing transgenic animal models of diseases (Lucas et al. 2001), and it also has significant potential for engineering expression hosts. For example, the expression of heterogenous proteins by the AOX1 promoter is induced by methanol. In our study, the conditional expression of antibiotic-resistant genes benefited the cell factory. PDog2p300 may also be suitable for driving other genes such as CRISPR/Cas9, which enables specific and precise genome engineering in P. pastoris (Weninger et al. 2016), given that CRISPR/Cas9 is required only during genome engineering. Besides the carbon source–inducible promoters, promoters regulated by the unfolded protein response and by phosphate have also been described (Ahn et al. 2009; Prattipati et al. 2020). However, inducible promoters with tight regulation, different strengths, and various regulatory profiles are still lacking, which hinders the engineering of P. pastoris.