Introduction

The methylotrophic yeast Pichia pastoris is a promising host for heterologous protein production. Most P. pastoris promoters used for efficient expression of heterologous proteins are derived from genes encoding enzymes of the methanol metabolism pathway. The alcohol oxidase I promoter (P AOX1) is widely used to drive efficient expression of heterologous proteins, and a library of P AOX1 variants with different strengths of methanol regulation has been reported (Hartner et al. 2008). The formaldehyde dehydrogenase promoter (P FLD1) is strongly and independently induced by either methanol as the sole carbon source or methylamine as the sole nitrogen source (Shen et al. 1998). However, fermentation with methanol-induced promoters is slow, and sophisticated induction strategies are often needed for high-level production. Other inducible promoters include the sodium-coupled phosphate symporter promoter (P PHO89) (Ahn et al. 2009) and the thiamine biosynthesis gene promoter (P THI11) (Stadlmayr et al. 2010).

Expression systems based on strong constitutive promoters have been developed using glucose, glycerol or sorbitol as the carbon source. Because constitutive promoters require no induction, peak production of heterologous proteins can be achieved in a relatively short fermentation period. The glyceraldehyde-3-phosphate dehydrogenase (GAPDH) promoter (P GAP or P GAPDH) was isolated (Waterham et al. 1997) and has been used for constitutive expression of many heterologous proteins (Li et al. 2010; Zhang et al. 2009). A library of P GAP variants with various expression strengths was developed (Qin et al. 2011). The translation elongation factor 1-alpha promoter (P TEF1) (Ahn et al. 2007) and a set of 24 novel P. pastoris promoters have also been reported (Stadlmayr et al. 2010). However, the number of strong promoters available for heterologous protein production in P. pastoris is still limited, and more constitutive promoters are needed.

The gene Chr1-4_0586 (named GCW14), which encodes a potential glycosylphosphatidylinositol (GPI)-anchored protein (termed GCW14p), exhibited the highest expression level when P. pastoris was cultivated with glycerol in our previous study (Liang et al. 2012). When the carbon source was shifted from glycerol to methanol, the transcriptional level of GCW14 was marginally lower than that of AOX1 but significantly higher than that of other genes. These results suggest that GCW14 is driven by a strong promoter. In this study, the potential promoter P GCW14 was isolated from the region upstream of Chr1-4_0586, and its activity was compared with that of P GAP and P TEF1 using enhanced green fluorescent protein (EGFP) as the reporter.

Materials and methods

Strains and media

Escherichia coli Top10F’ and P. pastoris X33 were cultured as previously described (Liang et al. 2013). E. coli transformants were selected on LBLZ agar plates (10 g tryptone/l, 5 g yeast extract/l, 5 g NaCl/l, 20 g agar/l, and 25 mg zeocin/l), and P. pastoris transformants on YPDZ plates (20 g peptone/l, 10 g yeast extract/l, 20 g dextrose/l, 20 g agar/l, and 100 mg zeocin/l). Buffered minimal media contained 10 g yeast extract/l, 20 g peptone/l, 100 mM potassium phosphate buffer at pH 6.0, and 13.4 g yeast nitrogen base (YNB)/l without amino acids. The carbon source was 20 g dextrose/l for BMDY, 10 g glycerol/l for BMGY, 1 % (v/v) ethanol for BMEY, or 1 % (v/v) methanol for BMMY. Strains and primers used are listed in Supplementary Table 1.

Quantitative reverse transcription PCR

Total RNA was extracted from cells of a 24 h batch culture of P. pastoris using an acid/phenol method and purified with a NucleoSpin Extract II kit (Macherey-Nagel, Germany). Quantitative reverse transcription PCR (qRT-PCR) was performed using a PrimeScript RT reagent Kit with gDNA Eraser (Takara, Japan) in an ABI PRISM 7500 Real Time PCR System (Applied Biosystems, USA). The primers used were GAPDH1/GAPDH2 for GAPDH, GCW1/GCW2 for GCW14, and AOX1/AOX2 for AOX1.

Construction of EGFP expression vectors

The vector pAOX1-EGFP was constructed by inserting the EGFP gene, amplified from the vector pEGFPN1 (Clontech) using the primers EGFP1/EGFP2, into the pPICZαA vector (Invitrogen) at the EcoRI and XbaI sites. Subsequently, the 3.3 kb PstI-KpnI fragment of pAOX1-EGFP lacking P AOX1 was amplified from pAOX1-EGFP using the primers Fragment1/Fragment2. The promoter sequences of P GCW14, P GAP and P TEF1 were amplified from P. pastoris genomic DNA using the primers GCW3/GCW4, GAP1/GAP2, and TEF1/TEF2, respectively, and ligated with the PstI-KpnI fragment, generating the plasmids pGCW14-EGFP, pGAP-EGFP and pTEF1-EGFP.

P. pastoris transformation and gene copy number determination

The EGFP expression vectors were linearized using Sac I for pAOX1-EGFP, BstP I for pGCW14-EGFP, Avr II for pGAP-EGFP, and SnaB I for pTEF1-EGFP. Each linearized vector was transformed into P. pastoris X33 using a LiCl transformation method (Invitrogen). The positive transformants were subcultured in microplates containing 150 μl YPD medium at 30 °C. The copy number of the expression cassette was determined by real-time quantitative PCR (qPCR) using the 2^(−ΔΔCt) method with the P. pastoris GAPDH gene as the reference. Primers EGFP3/EGFP4 and GAPDH1/GAPDH2 were used for EGFP and GAPDH, respectively. Transformants with a single copy of the EGFP expression cassette were selected for promoter activity experiments.
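The copy number estimate from the comparative Ct method can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function name, the Ct values, and the use of a calibrator strain carrying a known number of copies are assumptions, and the formula presumes close to 100 % amplification efficiency for both primer pairs.

```python
def copy_number_ddct(ct_target, ct_ref, ct_target_cal, ct_ref_cal,
                     calibrator_copies=1.0):
    """Estimate target gene copy number with the 2^(-ddCt) method.

    ct_target / ct_ref: Ct of the target (e.g. EGFP) and the reference
    gene (e.g. GAPDH) in the sample.
    ct_target_cal / ct_ref_cal: the same Ct values in a calibrator with
    a known copy number (hypothetical; not specified in the text).
    """
    delta_ct_sample = ct_target - ct_ref          # normalize sample to reference
    delta_ct_cal = ct_target_cal - ct_ref_cal     # normalize calibrator to reference
    ddct = delta_ct_sample - delta_ct_cal         # difference between the two
    return calibrator_copies * 2.0 ** (-ddct)


# Hypothetical Ct values, for illustration only.
estimate = copy_number_ddct(ct_target=21.0, ct_ref=20.0,
                            ct_target_cal=22.0, ct_ref_cal=20.0)
print(estimate)  # 2.0: the target crosses threshold one cycle earlier than in the calibrator
```

An estimate close to 1 relative to a single-copy calibrator would be read as a single integrated cassette.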

Shake-flask cultivation

Transformants with a single copy of the EGFP expression cassette were inoculated into 10 ml YPD medium and incubated overnight at 30 °C. The culture was transferred to 250 ml shake-flasks containing 50 ml BMGY, BMMY, BMEY, or BMDY, and incubated at 30 °C with shaking at 250 rpm. All cultivation experiments were performed independently 3–5 times, each in triplicate. Growth was monitored by measuring OD600.

EGFP fluorescence measurement

Culture broth (0.5 ml) was centrifuged and the supernatant was diluted with an equal volume of refolding buffer (0.05 M NaH2PO4, 0.1 M NaCl, and 0.5 M imidazole). EGFP expression was then analyzed by its fluorescence using an Infinite M200 microplate reader (Tecan, USA) with excitation at 470 nm and emission at 510 nm. EGFP fluorescence was calculated by subtracting the blank value (P. pastoris X33 grown and measured under the same conditions) and multiplying by the dilution factor.
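The blank subtraction and dilution correction described above amount to a one-line calculation, sketched here. The function name and the readings are hypothetical; the default dilution factor of 2 follows from mixing the supernatant with an equal volume of buffer.

```python
from statistics import mean


def corrected_rfu(sample_reading, blank_reading, dilution_factor=2.0):
    """Blank-subtract a fluorescence reading and scale it back to the
    undiluted supernatant. The blank is the untransformed host strain
    measured under the same conditions."""
    return (sample_reading - blank_reading) * dilution_factor


# Hypothetical plate-reader values for three replicate wells and one blank.
replicates = [1500.0, 1600.0, 1550.0]
blank = 300.0

corrected = [corrected_rfu(r, blank) for r in replicates]
print(corrected)        # [2400.0, 2600.0, 2500.0]
print(mean(corrected))  # 2500.0
```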

Transcription start site (TSS) identification and sequence analysis of the GCW14 promoter

The TSS of P GCW14 was identified by 5′-rapid amplification of cDNA ends (5′-RACE) using the SMARTer RACE cDNA Amplification Kit (Clontech, USA). cDNA synthesized from purified mRNA free of DNA contamination was used as the template for 5′-RACE PCR with the Universal Primer A Mix supplied by the manufacturer and the gene-specific primer GSP1. The 5′-RACE PCR product was sequenced (Majorbio, China), and sequence alignment was performed using ClustalX version 2.0.10 (Thompson et al. 1997). The 5′-terminal nucleotide of the region identical between the RACE PCR product and the genomic sequence was designated as the TSS. The online program TESS (http://www.cbil.upenn.edu/cgi-bin/tess/tess) was used to search for transcription factor binding sites (TFBS) in the cloned 822-bp upstream region of GCW14.

Results and discussion

Expression characteristics of P. pastoris GCW14

Our previous transcriptional study revealed that GCW14 exhibited the highest expression level when P. pastoris was cultivated with glycerol, and that its transcriptional level remained significantly higher than that of other genes, except AOX1, when the carbon source was changed to methanol. This suggested that the expression of GCW14 was constitutive. To investigate the expression characteristics of GCW14, P. pastoris X33 was grown in medium with glucose, glycerol, methanol or ethanol for 24 h and qRT-PCR was performed. Expression of constitutive GAPDH and inducible AOX1 was examined for comparison. GAPDH also served as the internal control for relative quantification. As shown in Fig. 1, after normalization to GAPDH mRNA levels, the expression of AOX1 was consistent with previous reports describing AOX1 repression by ethanol, glycerol, or glucose, and marked induction by methanol (Cregg et al. 1993). The expression level of GCW14 was 4- to 5-fold higher than that of GAPDH, which is itself often constitutively expressed at high levels. Of note, when P. pastoris was grown on methanol, the relative mRNA level of GCW14 was as high as two-thirds the level of AOX1. GCW14 was thus constitutively expressed at high levels on glucose, glycerol, ethanol, and methanol.

Fig. 1

Characterization of GCW14 and AOX1 expression under different carbon sources. P. pastoris X33 was cultivated with glucose, glycerol, methanol, or ethanol for 24 h and the expression levels of GCW14 and AOX1 were investigated using qRT-PCR. Error bars depict the SE of the mean. White and black columns represent AOX1 and GCW14, respectively.

Comparison of P GCW14 to commonly used promoters in P. pastoris

Promoter length can vary from 100 to 1,000 bp, and the promoter regions for organisms with a compact genome are often defined as 1,000, 800, or 600 bp upstream of the transcriptional start site (Kristiansson et al. 2009). To obtain a region with high promoter activity while avoiding the restriction enzyme sites located between 1,000 bp and 800 bp upstream of the start codon, an 822-bp segment was selected and cloned as the promoter of GCW14. The transcriptional activity of the GCW14 promoter was compared with that of other commonly used promoters (P TEF1 and P GAP), with EGFP as the reporter protein. Increasing the gene copy number generally increases the expression level of heterologous proteins but can sometimes decrease protein yield (Cos et al. 2005; Hohenblum et al. 2004). To exclude gene dosage effects from the expression level analysis, only strains harboring a single copy of the EGFP gene were selected for further study. We initially screened P. pastoris transformants using 96-well plate culturing. Transformants were grouped by fluorescence level after 24 h of cultivation; those with high fluorescence levels were selected for a second round of cultivation and genomic DNA isolation. EGFP copy number was determined using qPCR. The copy numbers of the EGFP gene in recombinant P. pastoris X33/P AOX1-EGFP, X33/P GAP-EGFP, X33/P GCW14-EGFP, and X33/P TEF1-EGFP were 1.14 ± 0.02, 0.94 ± 0.03, 1.1 ± 0.01, and 1.17 ± 0.02, respectively. These results indicated that each recombinant P. pastoris strain contained a single copy of the EGFP gene. These single-copy transformants were used for the promoter activity study.

To compare the expression of EGFP from P GCW14 and other constitutive promoters (P TEF1 and P GAP), transformants were cultivated in shake-flasks using glucose, glycerol or methanol (Fig. 2). EGFP expression from P GCW14 was also compared with expression from P AOX1 with methanol as the sole carbon source. As shown in Fig. 2c, EGFP expression from a P AOX1 transformant cultivated in BMMY medium was about 50 % higher than expression from P GCW14. Throughout cultivation, the OD600 values of the transformants were similar on each carbon source. Regardless of carbon source, however, the relative fluorescence units (RFU) of secreted EGFP from the P GCW14 transformants were higher than those from the P TEF1 and P GAP transformants. The secreted fluorescence from the P GCW14 transformants was about ten times higher than that from the P GAP transformants on glycerol and methanol, and about five times higher on glucose. The secreted fluorescence from the P GCW14 transformants was 30–60 % higher than that from the P TEF1 transformants (Fig. 2). These results suggest that P GCW14 is a strong constitutive promoter under the conditions tested in this study and might offer an additional constitutive promoter for expression of recombinant proteins in P. pastoris.

Fig. 2

Comparison of promoter activity using EGFP as a reporter. Host strain X33 and recombinant P. pastoris strains expressing EGFP from different promoters were cultivated with glucose (a), glycerol (b), or methanol (c) as the carbon source, and the relative fluorescence units (RFU) of secreted EGFP were measured every 24 h.

Sequence analysis of the GCW14 promoter

The TSS of GCW14, which is important for promoter activity, was identified by 5′-RACE analysis. A major RACE product band was detected (Fig. 3a), suggesting that the GCW14 promoter had at least one TSS. The purified RACE product was sequenced and aligned with the P. pastoris GS115 genomic sequence from 822 bp upstream to the stop codon of GCW14 (Fig. 3b). The 3′-terminal 360 nucleotides of the RACE product (416 bp) exactly matched the genomic sequence. The unmatched nucleotides (56 bp) were derived from the RACE oligonucleotides. These results indicated that the TSS was 45 bp upstream of the translation start codon of the GCW14 ORF, so the 5′-untranslated region (5′-UTR) of the GCW14 mRNA was 45 bp long (Fig. 3b). The presence of upstream AUGs (uAUGs) and upstream ORFs (uORFs) in the 5′-UTR is reported to affect translation efficiency (Hood et al. 2009; Wang and Rothnagel 2004). No uAUGs or uORFs were found in the 5′-UTR of the GCW14 mRNA.
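A scan for uAUGs and uORFs in a 5′-UTR can be sketched in a few lines. This is an illustrative sketch only: the function names and the test sequences are hypothetical, the actual 45-bp GCW14 5′-UTR sequence is not reproduced here, and a uORF is defined narrowly as an upstream ATG followed by an in-frame stop codon within the UTR.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uaugs(utr):
    """Return 0-based positions of every ATG in the 5'-UTR (DNA or RNA)."""
    utr = utr.upper().replace("U", "T")
    return [i for i in range(len(utr) - 2) if utr[i:i + 3] == "ATG"]


def find_uorfs(utr):
    """Return (start, end) pairs for each upstream ATG that reaches an
    in-frame stop codon inside the UTR; end is exclusive."""
    utr = utr.upper().replace("U", "T")
    uorfs = []
    for start in find_uaugs(utr):
        # Walk codon by codon from the codon after the ATG.
        for pos in range(start + 3, len(utr) - 2, 3):
            if utr[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs


# Hypothetical sequences, not the GCW14 5'-UTR.
print(find_uaugs("CCATGAAATAGCC"))   # [2]
print(find_uorfs("CCATGAAATAGCC"))   # [(2, 11)]
print(find_uaugs("TTACTTACCAACC"))   # []
```

A UTR for which both functions return empty lists corresponds to the situation reported here for GCW14.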

Fig. 3

Sequence analysis of the GCW14 promoter. a 5′-RACE product amplification for identifying the transcription start site of GCW14. b Alignment of the RACE product sequence with P. pastoris genomic DNA. The box represents the 5′-UTR of the GCW14 mRNA. c Predicted transcription factor binding sites.

The 822-bp promoter region upstream of GCW14 was searched for TFBS (Fig. 3c). Two putative TATA boxes were found at −48 to −43 and −107 to −102. A putative CCAAT box was found at −519 to −513. The sequence context surrounding the TSS (underlined in TTACTTA) agreed well with the initiator consensus sequence (YYANWYY). Other core promoter elements, such as the TFIIB recognition element (BRE) and the downstream promoter element (DPE), were not found in the promoter region.
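The comparison with the initiator consensus can be illustrated by checking each base against the IUPAC degenerate code at the same position. The sketch below assumes an ungapped, position-by-position alignment of the seven bases quoted in the text with YYANWYY; this alignment and the function name are assumptions for illustration and may differ from the alignment used in the original analysis.

```python
# IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def match_positions(seq, consensus):
    """Return one boolean per position: does the base satisfy the
    degenerate consensus symbol at that position?"""
    if len(seq) != len(consensus):
        raise ValueError("sequence and consensus must have equal length")
    return [base in IUPAC[code]
            for base, code in zip(seq.upper(), consensus.upper())]


flags = match_positions("TTACTTA", "YYANWYY")
print(flags)       # [True, True, True, True, True, True, False]
print(sum(flags))  # 6
```

Under this assumed alignment, six of the seven positions satisfy the consensus; only the final A falls outside the pyrimidine (Y) class.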

In summary, we isolated and characterized the GCW14 promoter, which mediates constitutive, high-level expression of GCW14 on various carbon sources. The activity of P GCW14 was stronger than that of the classic strong constitutive promoters P TEF1 and P GAP, and it might offer an additional constitutive promoter for expression of recombinant proteins in P. pastoris. In addition, the transcription start site and TATA boxes of P GCW14 were analyzed. Future research will focus on identifying the core promoter region and on using the novel promoter to express more pharmaceutical and/or industrial proteins.