Introduction

The methylotrophic yeast Pichia pastoris has established itself as the eukaryotic expression system of choice for large-scale production of recombinant proteins that do not fold well in prokaryotes. The system has a major advantage over Saccharomyces spp. in that it contains an efficient secretion system, which can direct large amounts of recombinant protein into the culture medium, making purification of the desired product straightforward. Although its limited capacity for protein glycosylation restricts its use for production of mammalian glycoproteins, it is an ideal host for producing small proteins with a high content of disulphide bridges, which require the conditions of a eukaryotic endoplasmic reticulum compartment to fold to biologically active forms [3]. Typical of this use is the production of snowdrop lectin (Galanthus nivalis agglutinin, GNA), which is secreted as a fully active folded protein [1, 16], whereas expression in E. coli results in an insoluble inclusion body, from which GNA can only be recovered by a process of denaturation and renaturation [20].

GNA has been used as a “carrier” domain in the production of recombinant insecticidal fusion proteins, where it confers oral toxicity on peptide toxins from scorpions and spiders [48, 22]. Oral toxicity is achieved via delivery of attached toxins, by GNA, to the central nervous system of target pests. These toxins themselves contain multiple disulphide bonds, and expression in E. coli gives products that require careful refolding to give any activity; in contrast, fusion proteins where the toxin has full biological activity can be produced directly in P. pastoris. As a result, production of fusion proteins on a scale to allow them to be used as pesticides has become possible, but is limited by the economic cost of the fermentation system. To decrease costs, and generate an economically viable product, yields of the recombinant fusion protein must be increased as far as possible. The recombinant insecticidal fusion protein Hv1a/GNA, containing ω-hexatoxin-Hv1a (Hv1a), from the venom of the funnel web spider Hadronyche versuta, linked to GNA has previously been shown to be orally active against lepidopteran larvae, and has been selected as a “best candidate” in terms of potential for development as a biopesticide [8]. The sequence of the Hv1a/GNA expression construct has been given the accession number JQ898015 by GenBank.

To maximise expression of recombinant proteins in P. pastoris, investigators have often used constructs driven by the alcohol oxidase (AOX1) promoter, where expression can be induced by addition of methanol to a growing culture. Experiments with fusion protein production have shown that better expression was achieved with the constitutively expressed GAP promoter, which is also advantageous for industrial production in not requiring a methanol feed.

Previous studies have shown that insecticidal fusion proteins are prone to degradation by yeast extracellular proteases during production by bench-top fermentation [6, 8, 22]. Proteolysis occurs predominantly at or near the linker region between the insecticidal peptide and the carrier protein, resulting in a reduction in yields of intact protein. Proteolysis is particularly evident when fusion proteins are expressed using the wild-type X33 P. pastoris strain. The use of the P. pastoris strain SMD1168H, which is deficient in vacuolar peptidase A (pep4), the enzyme responsible for activating carboxypeptidase Y and protease B1, has been found to reduce proteolysis [11], allowing an increase in yields of intact fusion protein to be achieved. However, X33 is a preferred strain for large-scale production, as the protease-deficient strains tend to be less robust than the wild-type strain, leading to lower growth and poorer survival on storage [11]. In this study, the removal of a potential Kex2 cleavage site present in the Hv1a toxin sequence has been shown to significantly reduce proteolysis in the wild-type strain. A comparison of levels of intact fusion protein obtained by bench-top fermentation using X33 and protease-deficient strains has shown that yields from X33 cells can be almost double those obtained using the protease-deficient strain.

Previous work using P. pastoris expression vectors where recombinant protein expression is driven by the inducible alcohol oxidase AOX1 promoter has shown that engineering multiple copies of an expression construct into the Pichia genome can result in increases in expression levels [15, 23], with recombinant protein production increasing with increasing copy number to an optimum, after which further increasing the copy number results in decreased expression [24]. The experiments described in this paper show that the multi-copy strategy for maximising recombinant protein expression is also applicable to constructs based on the constitutive GAPDH promoter, and identify an optimum transgene copy number for engineered P. pastoris strains producing the insecticidal Hv1a/GNA fusion protein.

Materials and methods

Materials and recombinant techniques

General molecular biology protocols were carried out as described in Sambrook and Russell [17] except where otherwise noted. Subcloning was carried out using the TOPO cloning kit (pCR2.1 TOPO vector; Life Technologies). P. pastoris wild-type X33 and SMD1168H (protease A deficient) strains, the expression vector pGAPZαB, and Easycomp Pichia transformation kit were from Invitrogen. Gel extraction was carried out using Qiagen gel extraction kits. Oligonucleotide primers were synthesised by Sigma-Genosys Ltd. and restriction endonucleases were purchased from Fermentas. Plasmid DNA was prepared using Promega Wizard miniprep kits. T4 polynucleotide kinase and T4 DNA ligase were supplied by Promega. Phusion polymerase was from New England BioLabs. GNA was produced as a recombinant protein in yeast using a clone generated as previously described [1, 16]. Polyclonal anti-GNA antibodies (raised in rabbits) were prepared by Genosys Biotechnologies, Cambridge, UK. Chemiluminescence detection reagents (coumaric acid and luminol) were supplied by Sigma.

All DNA sequencing was carried out using dideoxynucleotide chain termination protocols on Applied Biosystems automated DNA sequencers by the DNA Sequencing Service, School of Biological and Biomedical Sciences, University of Durham, UK. Sequences were checked and assembled using Sequencher (Gene Codes Corp.) software running on Mac OS computers.

Construct preparation

The generation of a construct encoding the mature omega peptide (Hv1a) linked to the N-terminus of GNA has been previously reported [8]; the sequence was codon optimised for expression in yeast. A potential Kex2 cleavage site present in the Hv1a peptide sequence was removed by site-directed mutagenesis, replacing residue 34 of the toxin (lysine; K) with glutamine (Q) (Fig. 1a). The Hv1a sequence was modified by PCR using a 5′ primer encoding a Pst I site and a 3′ primer encoding the modified C-terminus (as above) and a Not I site. The PCR product was restricted and ligated into similarly digested Hv1a/GNApGAPZαB to create MODHv1a/GNApGAPZαB. A histidine tag was subsequently incorporated at the C-terminus of the fusion protein gene cassette by PCR amplification of MODHv1a/GNA using a 3′ primer coding for six histidine residues, a stop codon and an Xba I restriction site. The PCR product (~500 bp) was digested with Pst I/Xba I and ligated into pGAPZαB vector digested with the same enzymes, to create MODHv1a/GNA/His in pGAPZαB. All constructs were verified by sequencing prior to yeast transformation.

Fig. 1

a Diagrammatic representation of Hv1a/GNA and modified (MOD) Hv1a/GNA constructs showing linker region sequence and predicted molecular masses of Hv1a and GNA components. The location and identity of the modified amino acid in the Hv1a peptide is depicted in white text. b Composite of western analysis (anti-GNA antibodies) of culture supernatants derived from small-scale YPG and bench-top fermented (MM) samples of Hv1a/GNA and MODHv1a/GNA/His expressing SMD1168H or X33 clones. Lane ‘S’ represents GNA as standard. Arrow depicts GNA standard (25 ng). c Table showing estimated percentage of intact fusion protein present in Hv1a/GNA and MODHv1a/GNA/His supernatants, derived from the western analysis in Fig. 1b

To enable the insertion of multiple fusion protein cassettes into the yeast genome, the pGAPZαB vector was modified to contain a Hind III site in the GAP promoter region by site-directed mutagenesis (as Bln I could not be used for linearisation of vector prior to yeast transformation). Primers were designed to introduce a Hind III site 35 bp from the Bln I site in the GAP promoter region: forward (5′ CATTACGTTGCGGGTAAAACGG) and reverse (5′ CTGGGAAGAAGCTTGCTGCAAG), where the AATGCT sequence of the original GAP region was changed to AAGCTT. PCR was performed using MODHv1a/GNA/His pGAPZαB vector template DNA and Phusion polymerase. The PCR product (~3.6 kb) was phosphorylated using a T4 polynucleotide kinase kit as per the manufacturer’s instructions, gel purified, and then re-ligated using T4 DNA ligase. Re-ligated vector was transformed into E. coli (TOP10) cells and plasmid DNA was sequenced (using primers forward 5′ GTAGAAATGTCTTGGTGTCC and reverse 5′ AGTCTTTGGGTCAGGAGAAA) to verify the presence of a Hind III site in the GAP region.
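The primer design can be sanity-checked computationally. The following Python sketch is an illustration only, not part of the original methods: it confirms that the reverse mutagenic primer quoted above carries the Hind III recognition sequence AAGCTT, and that this sequence is palindromic (identical on both strands). The function names are hypothetical helpers introduced for this example.

```python
# Illustrative check of the mutagenic primer quoted in the text.
# Function names are hypothetical; sequences are copied from the methods.
HINDIII_SITE = "AAGCTT"                      # Hind III recognition sequence
ORIGINAL_GAP_SEQ = "AATGCT"                  # sequence replaced in the GAP promoter
REVERSE_PRIMER = "CTGGGAAGAAGCTTGCTGCAAG"    # reverse primer, written 5'->3'

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_site(seq: str, site: str) -> list[int]:
    """Return every 0-based start position of `site` within `seq`."""
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


if __name__ == "__main__":
    print("Hind III site positions in primer:", find_site(REVERSE_PRIMER, HINDIII_SITE))
    print("Site is palindromic:", reverse_complement(HINDIII_SITE) == HINDIII_SITE)
    print("Original sequence still present:", ORIGINAL_GAP_SEQ in REVERSE_PRIMER)
```

A check of this kind only verifies the primer text; confirmation of the site in the plasmid still relies on the sequencing step described above.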

Transformation of single and multi-copy expression cassettes

Single-copy, 3-copy, 5-copy, 7-copy, 9-copy and 11-copy expression construct plasmids were assembled using standard molecular biology methods, as described below.

Each expression cassette consisted of the pGAP region, alpha factor secretory signal, MODHv1a/GNA, and a six-residue His tag, followed by an AOX1 transcription terminator (Fig. 2a). This cassette was excised from the backbone vector using BamHI and BglII. The vector into which this cassette was to be introduced, already containing an expression cassette, was linearised with BamHI. After ligation of the expression cassette, the correct resulting circular plasmid contains the unchanged BglII site, an intact reformed BamHI site, and a hybrid BamHI/BglII site (BamHI and BglII produce compatible overhangs) between the original and introduced expression cassettes, which cannot be digested by either enzyme. The plasmid can then be linearised with BamHI to introduce further expression cassettes. Each time further cassettes were inserted, the recombinant plasmid was checked either by releasing the complete set of cassettes (using BamHI and BglII) or by linearising the whole plasmid (using BamHI). An overview of the preparation of multiple cassettes is shown in Fig. 2b.
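The behaviour of the hybrid junction follows directly from the two recognition sequences (BamHI, G^GATCC; BglII, A^GATCT), which share the same GATC overhang. The short Python sketch below illustrates this under a deliberately simplified model in which each enzyme is represented only by the bases to the left and right of its cut position on the top strand; the data structures and function names are assumptions made for this example.

```python
# Simplified model: each enzyme is (bases left of cut, bases right of cut)
# on the top strand. BamHI cuts G^GATCC; BglII cuts A^GATCT.
ENZYMES = {
    "BamHI": ("G", "GATCC"),
    "BglII": ("A", "GATCT"),
}


def overhang(enzyme: str) -> str:
    """Return the 4-base 5' overhang left by the enzyme."""
    return ENZYMES[enzyme][1][:4]


def ligate(left_cut_by: str, right_cut_by: str) -> str:
    """Junction formed when an end cut by one enzyme is joined to an end cut by another."""
    if overhang(left_cut_by) != overhang(right_cut_by):
        raise ValueError("overhangs are not compatible")
    return ENZYMES[left_cut_by][0] + ENZYMES[right_cut_by][1]


def enzymes_that_cut(junction: str) -> list[str]:
    """List the enzymes whose full recognition sequence matches the junction."""
    return [name for name, (left, right) in ENZYMES.items() if left + right == junction]


if __name__ == "__main__":
    for left, right in [("BamHI", "BamHI"), ("BamHI", "BglII"), ("BglII", "BamHI")]:
        junction = ligate(left, right)
        print(f"{left} end + {right} end -> {junction}, cut by {enzymes_that_cut(junction)}")
```

The BamHI/BamHI junction regenerates GGATCC, whereas the mixed junctions (GGATCT or AGATCC) match neither recognition sequence, which is why only one BamHI site remains available for the next round of insertion.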

Fig. 2

Diagrammatic representation of the cloning strategy adopted to enable insertion of multiple copies of the fusion protein cassette into the Pichia yeast genome. a Location of Hv1a/GNA cassette and position of inserted Hind III site in the promoter region of pGAPZαB expression vector. b Procedure for vector restriction and ligation of fusion protein cassettes to create multi-copy expression cassettes in E. coli prior to transformation into yeast

Plasmid DNA (5 μg) was linearised by treatment with Hind III, and transformed into P. pastoris strains SMD1168H and X-33 using standard Invitrogen kit protocols. Transformants were selected on antibiotic containing plates (100 μg/ml zeocin).

Transgene copy number determination by quantitative PCR

Selected yeast clones were grown in 10 ml YPG medium (1 % w/v yeast extract; 2 % w/v peptone; 4 % v/v glycerol; 100 μg/ml zeocin) in baffled flasks with shaking at 30 °C for 72 h. Genomic DNA (gDNA) was extracted as described by Lõoke et al. [14] and quantified by absorbance at 260 nm using a Nanodrop spectrophotometer. Primers were designed to amplify partial sequences of MODHv1a/GNA, and the P. pastoris actin gene (PAS_chr3_1169, encoding Uniprot protein Q9P4D1) was used as an endogenous control for gene copy number, as follows:

  • For MODHv1a/GNA amplification: Fwd 5′ TGGTCTCTCCCGTAGCTGCTT; Rev 5′ ATCGAACAAACCGATTTGGG.

  • For Actin amplification: Fwd 5′ CGGTATGTGTAAGGCCGGATA; Rev 5′ ACGACCGATGGGAACACTGT.

50 ng of gDNA was used as a template for individual quantitative PCRs, with each reaction run in triplicate. Amplification was measured using fluorescence of SYBR Green on an Applied Biosystems StepOne instrument; StepOne software was used to compare samples using the ∆Cq (∆Ct) method, with an ANOVA method used to estimate error bars for samples. All qPCR experiments were carried out with independent biological replicates.
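As an illustration of how relative copy number can be derived from Cq values, the Python sketch below applies the widely used comparative (∆∆Cq) calculation, normalising the transgene to actin and then to a single-copy calibrator clone. It is not the StepOne software's implementation: the function name is hypothetical, the Cq values are invented for the example, and a perfect amplification efficiency (a factor of 2 per cycle) is assumed for both primer pairs.

```python
from statistics import mean


def relative_copy_number(target_cq, reference_cq,
                         calibrator_target_cq, calibrator_reference_cq,
                         efficiency=2.0):
    """Estimate copy number relative to a calibrator clone.

    Each argument is a list of replicate Cq values. `efficiency` is the
    assumed fold amplification per cycle (2.0 means 100 % efficiency).
    """
    delta_sample = mean(target_cq) - mean(reference_cq)
    delta_calibrator = mean(calibrator_target_cq) - mean(calibrator_reference_cq)
    delta_delta = delta_sample - delta_calibrator
    return efficiency ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical triplicate Cq values, for illustration only.
    calibrator_target = [22.0, 22.5, 21.5]   # transgene, single-copy clone
    calibrator_actin = [22.0, 22.5, 21.5]    # actin, single-copy clone
    sample_target = [20.0, 20.5, 19.5]       # transgene, test clone
    sample_actin = [22.0, 22.5, 21.5]        # actin, test clone

    rq = relative_copy_number(sample_target, sample_actin,
                              calibrator_target, calibrator_actin)
    print(f"Estimated copies relative to calibrator: {rq:.1f}")
```

With these invented values the transgene signal appears two cycles earlier than in the calibrator, corresponding to an estimate of four copies. In practice, primer efficiencies would need to be measured rather than assumed.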

Small-scale screening for fusion protein expression

Selected clones were grown in small-scale (10 ml) cultures in YPG medium with shaking at 30 °C for 48 h. After centrifugation (3,000g, 10 min), 10 μl supernatant samples were analysed by SDS-PAGE followed by western blotting using anti-GNA antibodies, as described previously [7], except that chemiluminescence detection was carried out using coumaric acid (0.2 mM) and luminol (1.25 mM) in 1 M Tris (pH 8.5) with the addition of 0.009 % (v/v) hydrogen peroxide. Recombinant GNA purified to 100 % homogeneity was used as a standard to allow quantitative estimation of fusion protein expression.

Bench-top fermentation experiments

All runs were carried out in a 5 l capacity BioFlo 110 (New Brunswick) fermentation vessel with 2.5 l of basal salt minimal media (MM) supplemented with PTM1 salts [2]. Fermenters were seeded with inoculum culture (150–180 ml), grown in YPG medium (shaking at 30 °C) for 48 h, and run at 30 °C. The inoculum volume was standardised such that at time 0, the OD600 of the fermentation media was 0.7–0.8. A sterile glycerol (50 % v/v) feed of 1.25 l was introduced over a fermentation period of 72 h, maintaining dissolved oxygen at 30 % and pH at 4.7–4.9. Growth was estimated by wet weight of pelleted cells at the end of the fermentation.

Yield estimates following fermentation were obtained by SDS-PAGE analysis and western blotting. For western analysis, samples were diluted in distilled water prior to loading. Gels (17.5 % acrylamide) were stained for total proteins with Coomassie Blue and blots were probed with anti-GNA antibodies, as described previously. Different loadings of purified GNA were used as a standard to give semi-quantitative estimates of fusion protein content.

Purification of Hv1a/GNA and MODHv1a/GNA/His fusion proteins

Purification of the non-modified recombinant Hv1a/GNA fusion protein was carried out by hydrophobic interaction chromatography followed by a gel filtration clean-up step as described previously [8]. Purification of MODHv1a/GNA/His was carried out in a single step using nickel affinity chromatography on 5 ml HisTrap crude nickel columns (GE Healthcare). Briefly, one-fourth volume of 4× binding buffer (BB; 0.4 M sodium chloride, 0.04 M sodium acetate; pH 7.4) was added to culture supernatants, which were then loaded onto columns equilibrated with 1× BB at a flow rate of 2–4 ml/min. Bound protein was eluted by the addition of imidazole (200 mM) in 1× BB. The eluted protein peak was diluted 50:50 with distilled water, dialysed against distilled water and freeze-dried. For analysis of fusion protein content, a known quantity was weighed and re-suspended in distilled water at a concentration of 10 mg/ml. Following centrifugation at 12,000g for 2 min, aliquots of supernatant were run on SDS-PAGE, using GNA standards run on the same gel to allow semi-quantitative estimation of fusion protein content after staining.

Analysis of biological activity: injection bioassays

Biological activity was assessed by injection of purified Hv1a/GNA and MODHv1a/GNA/His into newly moulted fifth instar cabbage moth (Mamestra brassicae) larvae. Proteins were re-suspended in distilled water at concentrations of 1–4 μg/μl. Larvae were anaesthetised with carbon dioxide for 10–20 s prior to injecting 5 μl volumes using a Hamilton syringe with a 24 gauge needle. Controls were injected with distilled water. For each sample tested, 20 larvae were injected per dose. Survival was monitored daily for 5 days. Samples were compared using survival analysis (Prism v. 5.0 software, www.graphpad.com), with a log-rank comparison test.

Results and discussion

The expression vector used as the basis for constructs was pGAPZαB, a shuttle vector propagated in E. coli. The basic construct for heterologous expression of the Hv1a/GNA fusion protein encodes a hybrid protein composed of the yeast (S. cerevisiae) α-factor prepro-sequence, the ω-ACTX-Hv1A atracotoxin peptide (Hv1a), joined by a 3-alanine linker region to the coding sequence for GNA polypeptide (Fig. 1a). Expression of this construct results in secretion of the insecticidal Hv1a/GNA fusion protein into culture supernatant of transformed P. pastoris. To date, the production of gram quantities of intact fusion protein required for field testing has been hindered by problems associated with proteolytic cleavage during expression by P. pastoris, and by the relatively low levels of expression observed for protease deficient strains carrying single-copy Hv1a/GNA expression cassettes. Here, we report work carried out to improve levels of expression of intact Hv1a/GNA through modification of the Hv1a/GNA construct and subsequently the generation of a modified expression vector backbone that enabled the integration of multiple Hv1a/GNA expression cassettes.

Construct modification and expression analyses

The C-terminus of the Hv1a peptide (residues 33–36) includes the sequence –VKRC–, which is similar to the sequence –EKRE– that is present in the α-factor signal sequence of the pGAP expression vector. –EKRE– is cleaved between R and E by the P. pastoris Kex2 gene product. The 34th lysine residue in Hv1a was replaced with a glutamine residue by site-directed mutagenesis and, following sequence verification, reassembled into an expression vector that was transformed into P. pastoris wild-type X33 and protease deficient SMD1168H strains. Western blot analysis using anti-GNA antibodies enabled comparative analysis of the stability of the original Hv1a/GNA fusion protein with the modified MODHv1a/GNA/His in both nutrient rich (YPG) and minimal media (MM). As reported previously [8], the presence of two GNA immunoreactive bands on western blots corresponds to intact fusion protein and GNA from which the Hv1a toxin has been cleaved. Due to the presence of a histidine tag, intact MODHv1a/GNA/His and cleaved GNA/His migrate at higher molecular masses than the non-modified Hv1a/GNA and cleaved GNA proteins. A composite of these experiments is presented in Fig. 1b, c.
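The rationale for the K34Q substitution can be illustrated with a minimal Python sketch that flags Lys-Arg (KR) pairs in a peptide fragment and applies the substitution. Only the four-residue fragment VKRC (residues 33–36) and the EKRE motif are taken from the text; the function names are hypothetical, and matching a bare KR dipeptide is a simplification of Kex2 substrate recognition rather than a predictor of cleavage.

```python
# Residues 33-36 of Hv1a as quoted in the text; the rest of the toxin
# sequence is not reproduced here.
HV1A_C_TERMINUS = "VKRC"
FIRST_RESIDUE_NUMBER = 33


def dibasic_kr_sites(fragment: str, first_residue: int = 1) -> list[int]:
    """Return the residue number of the Lys in every Lys-Arg pair."""
    return [first_residue + i
            for i in range(len(fragment) - 1)
            if fragment[i:i + 2] == "KR"]


def substitute(fragment: str, residue_number: int, new_residue: str,
               first_residue: int = 1) -> str:
    """Replace the residue at `residue_number` with `new_residue`."""
    index = residue_number - first_residue
    if not 0 <= index < len(fragment):
        raise IndexError("residue number lies outside the fragment")
    return fragment[:index] + new_residue + fragment[index + 1:]


if __name__ == "__main__":
    print("alpha-factor motif EKRE, KR at:", dibasic_kr_sites("EKRE"))
    print("Hv1a 33-36, KR at:", dibasic_kr_sites(HV1A_C_TERMINUS, FIRST_RESIDUE_NUMBER))
    modified = substitute(HV1A_C_TERMINUS, 34, "Q", FIRST_RESIDUE_NUMBER)
    print("After K34Q:", modified, "KR at:", dibasic_kr_sites(modified, FIRST_RESIDUE_NUMBER))
```

The substitution removes the only KR pair in the fragment while leaving Arg35, one of the residues reported as functionally important, unchanged.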

Levels of intact Hv1a/GNA were found to be higher in both nutrient-rich (YPG) and minimal media (MM) when the protein was expressed in the protease-deficient strain as compared to the wild-type strain (Fig. 1b). This has been reported previously for a fusion protein encoding a venom peptide, SFI1, from the spider Segestria florentina, linked to GNA [6]. In the case of SFI1/GNA, expression by X33 cells was found to produce a 1:1 ratio of intact to cleaved protein, whereas predominantly intact SFI1/GNA was obtained using the protease-deficient strain SMD1168H. Similarly for Hv1a/GNA, expression using SMD1168H cells resulted in 75–100 % intact fusion protein (0–25 % cleaved GNA) as compared to 50–60 % in X33 (50–40 % cleaved GNA), depending on the composition of the culture media.

Modification of the C-terminus of the toxin sequence resulted in a small increase in the levels of intact fusion protein in SMD1168H cells as compared to the original construct (Fig. 1b, c). This indicated that proteolysis during processing of the protein in the Golgi apparatus by membrane-bound Kex2 protease may be slightly reduced. More significantly, toxin sequence modification resulted in a marked improvement (approx. 30 %, Fig. 1b, c) in the levels of intact fusion protein being expressed in the wild-type X33 strain as compared to Hv1a/GNA. This is indicative of reduced susceptibility to proteolytic cleavage by extracellular yeast proteases, such as carboxypeptidase Y and proteinase B, that are activated by vacuolar peptidase A. Reduced proteolysis of the modified fusion protein was most evident for cells grown in bench-top fermentation in minimal media (MM) as compared to small-scale shake flask cultures in nutrient-rich media. Previous studies have shown that intracellular vacuolar proteases are released from yeast cells during high-density cell growth and can be a significant factor in protein degradation [13, 18]. Thus, it is likely that the modified fusion protein has improved resistance to both extracellular and intracellular yeast proteases as compared to Hv1a/GNA.

The release of intracellular proteases during high-cell density fermentation may also account for the increased proteolysis of both Hv1a/GNA and MODHv1a/GNA observed in fermented cultures grown in minimal basal salt media as compared to small-scale shake flask cultures grown in nutrient-rich YPG media (Fig. 1b, c). It is also possible that the additional proteins in YPG media may act as substrates for released proteases, decreasing the availability of the fusion proteins as substrates, and thus contributing to a reduction in susceptibility of the recombinant fusion proteins to proteolysis.

Analysis of biological activity: injection bioassays

Hv1a/GNA and MODHv1a/GNA/His were produced in P. pastoris using minimal media in a 5 l bench-top fermenter, and were purified by hydrophobic interaction chromatography and metal affinity chromatography, respectively, as described earlier. Amounts of protein estimated by quantitative SDS-PAGE were injected into lepidopteran larvae to investigate whether modification of the C-terminus of the toxin peptide had any impact upon the insecticidal activity of the fusion protein. As shown in Table 1, levels of mortality observed for M. brassicae larvae injected with different doses of the original or modified fusion protein were similar (dose–response curves not significantly different at p < 0.05), verifying that mutagenesis of the Hv1a peptide had not disrupted or altered biological activity. Hv1a is a member of the ω-ACTX-1 family of insecticidal peptides of 36–37 residues, isolated from the Australian funnel web spider, that block insect, but not vertebrate, voltage-gated calcium channels [9]. Previous alanine scanning mutagenesis studies by Tedford et al. [21] have identified 3 key functional residues (Pro10, Asn27, and Arg35) that determine specific binding to insect calcium channels. Here, replacement of the 34th lysine residue in Hv1a with a glutamine residue by site-directed mutagenesis was selected because glutamine is known to be present at this position in other members of the ω-ACTX-1 family [21] and was thus unlikely to disrupt biological function of the recombinant toxin.

Table 1 Insecticidal activity of recombinant fusion proteins

Assembly of expression vector constructs containing multiple gene copies

The pGAPZαB expression vector integrates into the P. pastoris genome by homologous recombination at the chr2-1_0437 locus, the gene encoding glyceraldehyde-3-phosphate dehydrogenase (GAP). The plasmid contains the entire promoter region of the GAP gene, which both drives expression of the incorporated coding sequence in P. pastoris, and directs recombination at a site in the 5′ UTR determined by the restriction enzyme used to linearise the plasmid prior to yeast transformation. As a result of the recombination, the entire plasmid is incorporated into the P. pastoris genome, and the 5′ UTRs of both the endogenous GAP gene and the introduced gene construct are reconstructed.

To ensure multiple copies of recombinant gene carried by the vector are incorporated into the genome of P. pastoris, the approach followed was to produce vectors carrying multiple copies of the expression construct, using a strategy based on that described by Zhu et al. [24]. This approach is much more reliable than the earlier method of screening large numbers of transformants for “jackpot” high copy strains resulting from multiple integrations of a transforming plasmid [12].

The strategy to obtain multi-copy expression vectors is summarised in Fig. 2. The vector backbone for the multi-copy plasmid was modified by insertion of a Hind III site near the Bln I site in the GAP gene 5′ UTR; this modification changed two adjacent bases, and was designed not to affect promoter function. The resulting expression vector (pGAPZαBH-MODHv1a/GNA(FP)) was transformed into P. pastoris and checked for expression; no difference to the original pGAPZαB expression construct was observed. This construct gave the 1-copy baseline for subsequent manipulation. A restriction fragment containing the GAP promoter, the recombinant protein coding sequence, and the AOX gene 3′ UTR was isolated by restriction of the original expression construct with BamHI and BglII, and was ligated into the original construct restricted with BamHI; selection of the correct orientation of the inserted fragment gave a construct (pGAPZα-2FP) from which a fragment containing two copies of the GAP promoter—coding sequence—AOX 3′ UTR in the same orientation (“2-copy cassette”) could be isolated by restriction with BamHI and BglII, since the site where the two fragments join was not restrictable by either enzyme after the ligation. The multi-copy expression vectors were then built up by combining the 2-copy cassette with the modified expression vector pGAPZαBH-FP. For example, to produce the 3-copy expression vector, pGAPZαBH-3FP, the 2-copy cassette was ligated with pGAPZαBH-FP which had been linearised by restriction with BamHI. Restriction analysis of the resulting clones was used to select a recombinant where the three copies of the cassette were in the same orientation; the resulting plasmid has a single BamHI site, which can be used to insert further cassettes.

In practice, it was more efficient to use both 2-copy and 4-copy cassettes to produce multi-copy expression vectors; the 4-copy cassette was produced by ligation of a 2-copy cassette into the pGAPZα-2FP plasmid in the correct orientation to allow a 4-copy cassette to be excised by restriction with BamHI and BglII (Fig. 2). Addition of the 4-copy cassette to pGAPZαH-FP gave pGAPZαH-5FP; addition of the 2-copy and 4-copy cassettes to pGAPZαH-5FP gave pGAPZαH-7FP and pGAPZαH-9FP; addition of the 4-copy cassette to pGAPZαH-7FP gave pGAPZαH-11FP. The larger plasmids became successively more difficult to transform into E. coli for cloning and propagation, and production of multi-copy expression vectors with more than 11 copies of the expression construct was not found possible.
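The assembly route described above can be summarised as simple bookkeeping on copy numbers. The Python sketch below is an illustrative tally only: it records, for each multi-copy vector, the copy number of the recipient vector and of the inserted cassette as stated in the text, and counts the insertion rounds needed starting from the 1-copy vector. The dictionary and function names are hypothetical, and the steps needed to build the 2-copy and 4-copy cassettes themselves are not counted.

```python
# product copy number -> (copies in recipient vector, copies in inserted cassette),
# as described in the text.
ASSEMBLY_STEPS = {
    3: (1, 2),
    5: (1, 4),
    7: (5, 2),
    9: (5, 4),
    11: (7, 4),
}


def insertion_rounds(target_copies: int, steps=ASSEMBLY_STEPS) -> int:
    """Count cassette insertions needed to reach `target_copies` from the 1-copy vector."""
    rounds = 0
    copies = target_copies
    while copies != 1:
        recipient, cassette = steps[copies]
        if recipient + cassette != copies:
            raise ValueError(f"inconsistent step for {copies}-copy vector")
        copies = recipient
        rounds += 1
    return rounds


if __name__ == "__main__":
    for target in sorted(ASSEMBLY_STEPS):
        print(f"{target}-copy vector: {insertion_rounds(target)} insertion round(s)")
```

Under this tally the 11-copy vector is reached in three insertions, whereas adding one 2-copy cassette at a time would take five, which is consistent with the authors' remark that combining 2-copy and 4-copy cassettes was more efficient.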

The multi-copy expression constructs pGAPZαH-3FP, pGAPZαH-5FP, pGAPZαH-7FP, pGAPZαH-9FP and pGAPZαH-11FP were verified by restriction analysis; DNA sequencing was not feasible as the multiple cassettes all contained sequences corresponding to the pGAP sequencing primers. Linearisation of the plasmids at a single Hind III site in the pGAPZαB backbone showed that plasmid size increased in line with predicted copy number (results not shown).

Yeast transformation and gene copy analysis of transformants

Transformation of P. pastoris with the multi-copy expression vectors was carried out using normal procedures; selection on high levels of zeocin was not necessary. In contrast to transformation of E. coli with large plasmids, no significant fall-off in numbers of transformants obtained from the larger multi-copy expression vectors was observed. Expression constructs were transformed into both the “wild-type” P. pastoris strain X-33 and the protease deficient strain SMD1168H.

P. pastoris clones that were selected on the basis of zeocin resistance were screened for recombinant protein expression by western blot analysis of culture supernatant from small-scale shake flask cultures. These assays showed that >90 % of the selected (n = 20) transformants produced detectable levels of the desired recombinant protein, MODHv1a/GNA/His, irrespective of the copy number of the multi-copy expression vector used for transformation. Clones were selected for further study on the basis of positive fusion protein expression in the small-scale screen.

A selection of clones produced using single-copy and multi-copy expression vectors was analysed by quantitative PCR to estimate the actual copy number of the fusion protein gene(s) incorporated into the yeast genome, by comparison to an endogenous single-copy gene sequence. Analysis of 8 different clones for single-copy, 3-copy, 5-copy, 7-copy, 9-copy and 11-copy transformants in P. pastoris SMD1168H, and 11-copy transformants in P. pastoris X-33 is presented in Fig. 3. These results are typical of other analyses carried out on multi-copy transformants. 7/8 of the single-copy transformant clones contain a single integrated gene for MODHv1a/GNA, with the remaining clone containing 2 copies. Of the nominal 3-copy clones, 4/8 do contain 3 copies, with the remaining clones containing 1–2 copies; the “5-copy” clones have 2/8 with 5 copies, 6/8 with 1–3 copies; the “7-copy” clones have 2/8 with 7 copies, 6/8 with 1–6 copies; the “9-copy” clones have 2/8 with 9 copies, and 6/8 with 2–7 copies; and the “11-copy” clones have 4/16 with 11 copies, and 12/16 with 4–8 copies. These results show that clones produced using the multi-copy expression vectors are liable to lose copies of the expression cassette during the transformation process but that intact multi-copy integration also occurs at a viable frequency. The results in the two yeast strains were similar.
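The retention frequencies quoted above can be tabulated directly. The Python sketch below is a simple illustration that uses only the counts stated in the text (the 11-copy figure pools the clones from both strains); the variable and function names are hypothetical.

```python
# nominal copy number -> (clones with the full copy number, clones analysed),
# taken from the counts quoted in the text.
FULL_COPY_CLONES = {
    1: (7, 8),
    3: (4, 8),
    5: (2, 8),
    7: (2, 8),
    9: (2, 8),
    11: (4, 16),   # pooled SMD1168H and X-33 clones
}


def retention_percent(counts=FULL_COPY_CLONES) -> dict[int, float]:
    """Percentage of analysed clones that carry the nominal copy number."""
    return {copies: 100.0 * full / total for copies, (full, total) in counts.items()}


if __name__ == "__main__":
    for copies, percent in retention_percent().items():
        print(f"{copies:>2}-copy vector: {percent:.1f} % of clones carry all copies")
```

On these counts, roughly one clone in four retains the full set of cassettes for vectors with five or more copies, so screening a modest number of transformants by qPCR is sufficient to recover an intact multi-copy integrant.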

Fig. 3

Quantitative PCR analyses of yeast clones derived from transformation of single and multi-copy fusion protein cassettes. Actin was used as an endogenous control. RQ values (1–12) show relative quantitation of FP in terms of copy numbers compared to housekeeping gene. All samples were compared with clone expressing one copy of FP

Recombinant protein expression from P. pastoris clones

The small-scale screen of clones for expression of MODHv1a/GNA/His showed an approximate correspondence between levels of expression and the number of gene copies present in the yeast clones, but expression from clone to clone was variable (results not presented). This variability is likely to be due to differences in the number of cells used to inoculate the media as well as to variation in parameters such as oxygen level and pH in small-scale cultures. The influence of copy number on protein expression is not predictable [19]; to allow a more accurate comparison of recombinant protein production, a series of clones was selected for pilot-scale bench-top fermentation under controlled conditions. The clones were selected on the basis of qPCR results, which showed 1, 2, 5, 9 and 11 fusion protein gene cassettes to be present in strain SMD1168H clones, and 9 and 11 copies in strain X-33 clones. Results are summarised in Table 2.

Table 2 Growth and recombinant protein expression in multi-copy clones of P. pastoris

Growth parameters for the different multi-copy expression clones in the fermenter suggested that the multi-copy clones in strain X-33 grew better than any of the clones in the protease deficient strain, resulting in higher pellet weights at the end of the fermentation (Table 2), as would be expected [11]. When culture supernatants from the fermenter runs were analysed for protein content, it was clear that the multi-copy expression clones were able to produce enhanced levels of recombinant MODHv1a/GNA. Single-copy clones had previously been selected for optimum protein production. Based on previous experiments, and controls run as comparisons to multi-copy clones, the level of expression from a single-copy clone in strain SMD1168H for the original Hv1a/GNA construct was approx. 50 mg/l, whereas for the modified construct (MODHv1a/GNA), the fusion protein level from a single-copy clone was approx. 100 mg/l culture supernatant. The multi-copy clones in strain SMD1168H gave yields of 200–600 mg/l of MODHv1a/GNA, with yield increase apparent between 2-copy and 5-copy clones (Table 2). Analysis of growth curves for these clones showed only small differences in growth between 1, 2, and 5-copy clones in strain SMD1168H, with no correlation between copy number and growth, suggesting that any increased production of recombinant protein in clones containing up to five gene copies was not limiting growth. However, although similar yields of fusion protein (600 mg/l) were observed for the 5- and 9-copy SMD1168H clones, a reduction in expression level (200 mg/l) was recorded for the 11-copy SMD1168H clone. Similarly, fusion protein expression levels in wild-type cells increased with copy number to a maximum (1,000 mg/l) for a 9-copy clone, and then decreased with increasing copy number.
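To make the scale of these changes explicit, the Python sketch below expresses the approximate yields quoted in this paragraph as fold changes relative to the single-copy MODHv1a/GNA clone in SMD1168H. Only values stated in the text are used; yields for clones whose figures appear only in Table 2 are omitted, and the names are hypothetical.

```python
# (strain, construct, copies) -> approximate yield in mg/l, as quoted in the text.
YIELDS_MG_PER_L = {
    ("SMD1168H", "Hv1a/GNA", 1): 50,
    ("SMD1168H", "MODHv1a/GNA", 1): 100,
    ("SMD1168H", "MODHv1a/GNA", 5): 600,
    ("SMD1168H", "MODHv1a/GNA", 9): 600,
    ("SMD1168H", "MODHv1a/GNA", 11): 200,
    ("X33", "MODHv1a/GNA", 9): 1000,
}

BASELINE = ("SMD1168H", "MODHv1a/GNA", 1)


def fold_change(yields=YIELDS_MG_PER_L, baseline=BASELINE) -> dict:
    """Yield of each clone divided by the yield of the baseline clone."""
    reference = yields[baseline]
    return {clone: value / reference for clone, value in yields.items()}


if __name__ == "__main__":
    for (strain, construct, copies), fold in fold_change().items():
        print(f"{strain:<9} {construct:<12} {copies:>2}-copy: {fold:.1f}x baseline")
```

Relative to this baseline, the 5- and 9-copy SMD1168H clones give about a sixfold increase and the 9-copy X33 clone about a tenfold increase, while the 11-copy SMD1168H clone falls back to about twofold.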

These results are comparable with those reported by Zhu et al. [24], where production of a recombinant protein using a methanol-induced vector in P. pastoris increased approximately tenfold when increasing gene copy number from 1 to 12, but decreased when vectors with higher copy numbers were introduced. In addition, Zhu et al. [24] found a reduction in cell growth rate with strains carrying more than the optimal 12 copies of the transgene. Expression of the Hv1a/GNA fusion protein using a methanol-induced vector gave low yields of product in our hands (unpublished results), necessitating the use of vectors containing the constitutive GAPDH promoter. In this study, a reduction in cell growth rate was noted for SMD1168H cells carrying more than five copies of the transgene, but this was not observed for X-33 cells. The higher optimum recombinant protein yield observed in wild-type cells than in protease-deficient cells may thus reflect lower limits to growth in the latter [11], even when using a constitutive promoter to drive transgene expression, and greater sensitivity to the effects of multiple transgene copies.

As shown in Fig. 4, the purified recombinant fusion protein product from multi-copy strains of P. pastoris gave a band pattern on SDS-PAGE similar to that of the product from a single-copy strain. Furthermore, SDS-PAGE analysis of samples derived from fermentation of the highest expressing clone demonstrated that the fusion protein is the most abundant protein, representing more than 50 % of the total protein in the culture supernatant.

Fig. 4
Analysis of purified proteins and culture supernatants by SDS-PAGE (17.5 % acrylamide gels) under reducing conditions; gels stained with Coomassie blue. a Effects of K34Q mutation; purified Hv1a/GNA and MODHv1a/GNA/His derived from fermentation of single-copy SMD1168H clones. b High expression of recombinant protein in multi-copy P. pastoris clone; culture supernatant and purified MODHv1a/GNA/His samples derived from fermentation of 9-copy X-33 clone. M denotes molecular scale protein marker mix. Loading in μg of intact fusion protein or GNA standards (a and b) or μl of culture supernatant (b) is denoted

Conclusion

In the present study, we provide evidence to show that a single amino acid change in a recombinant hybrid protein sequence can have a significant impact on resistance to yeast proteases without loss of biological function. Furthermore, the data presented show that producing strains of P. pastoris containing multiple copies of a transgene expression construct give increased expression levels of recombinant proteins when a constitutive, rather than an induced, promoter is used, and identify likely limits for recombinant protein production using this system. Further improvements may be possible by exploiting other recent developments in available genetic elements and strains for recombinant protein production in P. pastoris [10]. The increase in expression level of Hv1a/GNA is a necessary prerequisite to production of this fusion protein on an industrial scale at a viable cost, so that it can be taken forward for evaluation as a novel biopesticide.