Introduction

Genetic engineering in the chloroplast of the green alga Chlamydomonas reinhardtii has been used to express foreign genes to produce a wide range of recombinant therapeutic proteins (Almaraz-Delgado et al. 2014; Dyo and Purton 2018), including antibodies (Mayfield et al. 2003; Tran et al. 2009), immunotoxins (Tran et al. 2013a, b), antigens (Dreesen et al. 2010; Michelet et al. 2011; Jones et al. 2013), toxins (Kang et al. 2017) and growth factors (Rasala et al. 2010; Wannathong et al. 2016). However, reports applying metabolic engineering in the chloroplast of C. reinhardtii are still scarce (Fukusaki et al. 2003; Gangl et al. 2015; Zedler et al. 2015). Nevertheless, the potential is enormous: novel or endogenous biosynthetic pathways can be introduced or modified (Gimpel et al. 2015) to obtain high-value molecules such as carotenoids (Harada et al. 2014), to produce molecules that could serve as a renewable source for biofuel production, such as lipids (Scranton et al. 2015), and to improve biohydrogen production (Wu et al. 2011). Despite the biotechnological potential of genetic engineering in the chloroplast of C. reinhardtii, tools that allow for a more systematic manipulation of the metabolism, by introducing and expressing entire pathways for instance, have not been developed (Scaife et al. 2015). As a result, the technology lags behind the advances that have been made in plant chloroplast biotechnology (Fuentes et al. 2018).

For many years it has been clear that some genes in the chloroplast of C. reinhardtii are expressed as bicistrons or polycistrons (Rochaix 1996; Harris 2009). While some of these polycistronic transcripts (psbB-psbT-psbH) are post-transcriptionally processed to yield monocistronic units, others (psbF-psbL, ycf4-ycf3 and atpA-psbI-cemA-atpH) seem to remain as translatable bicistronic or polycistronic units (Mor et al. 1995; Boudreau et al. 1997; Drapier et al. 1998). The attention paid to the characterization of these operons has been somewhat uneven: the transcripts for psbB-psbT-psbH and petA-petD are the most widely studied, while the transcripts for rps7-atpE and psbZ-psbM are perhaps the least studied. More recently, two research groups have found that polycistronic transcription of genes in the chloroplast of C. reinhardtii seems to be more common than previously thought. Using high-throughput sequencing of small and long RNAs, Cavaiuolo et al. (2017) have reported that 84 of the 109 genes from the C. reinhardtii chloroplast are arranged in 22 polycistronic units, while Gallaher et al. (2017) have identified 16 clusters of transcribed genes. Polycistronic transcription of genes in the chloroplast is a feature that could be exploited for the design and construction of synthetic polycistronic operons; genes from a particular multi-gene metabolic pathway could be stacked, making the introduction and expression of entire foreign pathways in the chloroplast of C. reinhardtii somewhat easier.

Transcript stability and translation initiation depend largely on the nature of the untranslated regions (UTRs) at the 5′- and 3′-ends (Stern et al. 2010; Rasala et al. 2011) as well as on nucleus-encoded trans-acting protein factors (Johnson et al. 2010). In polycistronic mRNAs, intergenic regions act as the 3′UTR of the first cistron and as the 5′UTR of the second cistron, containing elements for mRNA processing and mRNA stability (Vaistij et al. 2000a, b; Stoppel and Meurer 2013). It has been proposed that processing of polycistronic transcripts to monocistronic mRNAs is initiated through non-specific cleavages by endonucleases RNase E and RNase J (Pfalz et al. 2009; Luro et al. 2013). Maturation of the 5′-end is then achieved through the activity of exonucleases (for example, RNase J), which degrade the mRNA until encountering a sequence-specific RNA-binding protein. In plants and in C. reinhardtii, these proteins have been demonstrated to be members of the helical-hairpin-repeat protein families: PPR (pentatricopeptide repeat), TPR (tetratricopeptide repeat) and OPR (octotricopeptide repeat) proteins. Proteins of these families have also been implicated in the maturation of the transcript 3′-end in plants but not in Chlamydomonas, where the presence of a stem-loop seems to be the structure that predominantly prevents exonuclease activity. Helical-hairpin-repeat proteins bind in a sequence-specific manner to mRNAs and act as trans-acting factors that control splicing, editing, stabilization, turnover, processing and translation activation (Auchincloss et al. 2002; Loiselay et al. 2008; Stern et al. 2010; Eberhard et al. 2011; Rahire et al. 2012; Wang et al. 2015; Douchi et al. 2016). The absence of these regulatory elements, cis- and trans-acting factors, leads to rapid degradation or allows only low accumulation of primary transcripts (Salvador et al. 2004) and, in some cases, a reduction in translation efficiency is seen (Trösch et al. 2018). There is also evidence that coding regions might contribute significantly to transcript stability. When the 5′UTR of the chloroplast rbcL was used to drive the expression of a foreign gene, the mRNA generated was susceptible to rapid degradation upon exposure to light. This susceptibility was eliminated with the addition of a segment of the rbcL coding region (Salvador et al. 1993). Transcript processing and stability are, not surprisingly, complex and multifactorial events.

In tobacco chloroplasts, a small element termed the intercistronic expression element (IEE), originating from the psbT-psbH intercistronic region of the psbB operon, was used for bicistronic expression of nptII and yfp (Zhou et al. 2007). While the presence of the IEE was not required for processing and maturation of the first cistron (nptII), it was required for stability and translation of the second cistron (yfp). A few years later, the IEE was used to express three genes involved in the tocochromanol pathway for vitamin E production in tobacco and tomato (Lu et al. 2013). The plants expressing the three genes, stacked one after the other with the IEE between them, accumulated more tocochromanol than the lines where the genes were expressed using a bacterial-type operon or independently. More recently, Fuentes et al. (2016) used IEEs to express the core biosynthetic pathway of artemisinic acid production in tobacco chloroplasts, making evident again that this element can be used for metabolic engineering in this plant. The functionality of the tobacco IEE has been attributed to the fact that it contains the target sequence for the half-a-tetratricopeptide (HAT) repeat RNA-binding protein HCF107 (Hammani et al. 2012), which stabilizes the transcript and makes translation more efficient (Zhou et al. 2007; Lu et al. 2013; Bock 2013). In plants, the HAT protein HCF107 is required for the stability of the psbH transcripts (Felder et al. 2001; Hammani et al. 2012).

Even though the tobacco IEE has been demonstrated to be sufficiently robust, Legen et al. (2018) have recently shown that when it is used to stabilize the transcripts of foreign genes, a negative effect is seen on the stabilization of the endogenous psbH transcript. Stabilization of the foreign transcripts, achieved by the binding of the HCF107 HAT protein to the target site contained in the IEE, yields a reduction in the accumulation of the psbH transcript. There does not seem to be enough HCF107 to serve target sites in excess. To overcome this problem, Legen et al. (2018) proposed the use of binding sites for PPR proteins as IEE-like elements. They have shown that by placing target sites for PPR10, HCF152, CCR2 and the binding site of a not yet identified protein, located upstream of the rpl12 reading frame, as IEEs in a synthetic neo-egfp bicistron in tobacco chloroplasts, GFP accumulates to various levels. These results indicate that alternative sequences to the already popular IEE can be developed and used as modulators and enhancers of gene expression in polycistronic constructs.

In this regard, identification of IEEs for C. reinhardtii chloroplasts is strongly needed to speed up and improve metabolic engineering. IEEs would facilitate the stacking of genes and consequently make the expression of entire biosynthetic pathways in the chloroplast of a biotechnologically relevant organism a reality. Here, we have examined different intercistronic regions from the chloroplast genome of C. reinhardtii to determine their usefulness as IEEs in the construction of synthetic operons. We show that the intercistronic regions from the psbN-psbH and tscA-chlN bicistrons can be placed between the coding sequences of two foreign genes and yield monocistronic translatable units that produce the aminoglycoside phosphotransferase enzyme (APH(3′)-VI) and the green fluorescent protein (GFP). The IEEs that we have identified could be used for the stacking and co-expression of genes in synthetic operons in C. reinhardtii.

Materials and methods

Algal strain and culture conditions

Chlamydomonas reinhardtii wild-type CC-125 (mt+) was obtained from the Chlamydomonas Resource Center (University of Minnesota, USA). Wild-type and transformed cells were grown in Tris-acetate-phosphate (TAP) medium (Gorman and Levine 1965) at 25 °C under a photoperiod (16 h/8 h, light/dark) with white LED lighting (10,000–12,500 lx). For transformed lines, TAP solid medium was supplemented with spectinomycin (150 µg/mL) and kanamycin (100 µg/mL), while for liquid cultures the antibiotics were used at concentrations of 50 µg/mL for spectinomycin and 20 µg/mL for kanamycin.

Plasmid construction

Expression cassettes were first assembled in the pJ248/aphA-6/gfp vector. This vector carries the aphA-6 gene (Bateman and Purton 2000) and a codon-optimized gfp gene (DNA 2.0, now ATUM, USA; Supplementary Data 1). The genes are under the control of the rbcL promoter (PrbcL) and terminator (TrbcL). The aphA-6 gene contains a HindIII site at the 3′-end while the gfp gene contains a NcoI site at the 5′-end for convenient insertion of the IEEs between them (Fig. 1b). All endogenous IEEs (IEE-1 to IEE-5) were amplified by PCR using C. reinhardtii total DNA as template; IEE-6 (T7 g10) was amplified from the previously reported tobacco chloroplast transformation vector pZSJH1-PrrnIp24 (Zhou et al. 2008). Primers were designed to introduce HindIII and NcoI sites at the 5′- and 3′-ends, respectively (Supplementary Table 1). The expression cassettes PrbcL/aphA-6/IEE/gfp/TrbcL were transferred to the BamHI site of the C. reinhardtii chloroplast transformation vector p320 (Guzmán-Zapata et al. 2016) to generate the p320-IEE vectors (p320-IEE1, p320-IEE2, p320-IEE3, p320-IEE4, p320-IEE5, p320-IEE6). Chloroplast transformation vector p320 targets the insertion of foreign genes to the intergenic region between exon 5 of psbA and 5S rRNA by homologous recombination in the C. reinhardtii chloroplast genome (Fig. 1a).

Fig. 1

Generation of transplastomic C. reinhardtii lines with a synthetic operon using different intercistronic regions. a Chloroplast transformation vector p320. This vector targets the insertion of foreign genes to the intergenic region between 23S-5S rRNA and exon 5-intron 4 of psbA. b and c Physical map of the synthetic bicistronic operon and origin of the intercistronic regions used, respectively. In the synthetic operon aphA-6/IEE/gfp genes are under the control of the rbcL promoter (PrbcL) and terminator (TrbcL). The intercistronic regions indicated were used to generate vectors p320-IEE1 to p320-IEE6. d Physical map of the transformed genome of lines obtained with vectors of the series p320-IEE. Introns and exons of the chloroplast genome are represented in filled black and white boxes, respectively. Foreign genes are represented with light grey boxes while the intercistronic region is represented by a line filled box. A DNA fragment generated with SacII and PstI restriction enzymes was used for restriction fragment length polymorphism analysis and is represented as a black line with borders. Probes used for Southern and northern blotting are indicated with bold black lines. Primers are indicated with arrows and their respective names

Chloroplast transformation, selection of transformed lines and PCR analysis

Chloroplast transformation was carried out by particle bombardment using the PDS-1000/He Particle Delivery System (Bio-Rad, USA) as described by Guzmán-Zapata et al. (2016). Briefly, a 250 µL aliquot of C. reinhardtii cell culture (1 × 10⁸ cells/mL) was placed in a Petri dish with solid TAP medium supplemented with antibiotics, and then bombarded with M-10 (0.7 µm) tungsten particles (Bio-Rad) coated with one microgram of each plasmid. After transformation, Petri dishes were incubated overnight in the dark, and then placed under a photoperiod for 4 weeks until antibiotic-resistant colonies were visible. Chloroplast transformation vector p228 (Chlamydomonas Resource Center, University of Minnesota), which confers resistance to spectinomycin by introducing a point mutation in the 16S rRNA gene, was used for co-transformation with the p320-IEE vectors mentioned above. Selection of primary transformed cells was carried out in spectinomycin (150 µg/mL). Selection for kanamycin-resistant transformed lines was carried out in TAP medium supplemented with 100 µg/mL kanamycin for solid medium and 20 µg/mL for liquid medium. Kanamycin-resistant lines were verified by colony PCR using the resin Chelex-100 (Bio-Rad) as described previously (Cao et al. 2009; Guzmán-Zapata et al. 2016). To identify transformed lines carrying the transformation cassette, both gfp and aphA-6 genes were amplified by PCR using primer pairs qFJAB7/qRJAB7 and qFJAB8/qRJAB8, respectively (Supplementary Table 1). For detection of the mRNA transcripts by RT-PCR, we first synthesized the cDNA using a random hexamer and M-MuLV Reverse Transcriptase (New England Biolabs, USA) following the manufacturer’s instructions. Samples were treated with DNase I (New England Biolabs).
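As a quick check of the plating step, the number of cells placed on each Petri dish follows from the aliquot volume and the culture density. The sketch below reads the density as 1 × 10⁸ cells/mL; the function and constant names are illustrative only and are not part of the published protocol.

```python
# Cells plated per bombardment, from the values stated in the text.
ALIQUOT_UL = 250                # aliquot spread on each plate (µL)
DENSITY_CELLS_PER_ML = 10**8    # culture density, read as 1 × 10^8 cells/mL
UL_PER_ML = 1000                # unit conversion


def cells_plated(aliquot_ul: int, density_per_ml: int) -> int:
    """Return the number of cells contained in the aliquot."""
    return aliquot_ul * density_per_ml // UL_PER_ML


cells = cells_plated(ALIQUOT_UL, DENSITY_CELLS_PER_ML)
print(f"{cells:.1e} cells per plate")
```

Under these values each plate receives 2.5 × 10⁷ cells.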

DNA isolation and Southern blot

Total DNA was extracted from 15 mL (1 × 10⁵ cells/mL) of liquid culture using the Wizard Genomic DNA Purification Kit (Promega, USA) following the manufacturer’s instructions for plant DNA extraction. Briefly, cells were recovered by centrifugation at 3500×g and then resuspended in Nuclei Lysis Solution and incubated for 15 min at 65 °C. RNase Solution was added and the sample incubated for 15 min at 37 °C. Proteins were removed with Protein Precipitation Solution, and DNA was purified by ethanol precipitation and then resuspended in nuclease-free water.

For Southern blotting, 10 µg of total DNA were digested with restriction enzymes SacII and PstI and then separated by electrophoresis in a 0.6% agarose gel. DNA was transferred to a Zeta-Probe nylon membrane (Bio-Rad) by capillary blotting following a standard protocol (Reed and Mann 1985). The psbA probe (1 343 bp), comprising part of the psbA gene, was generated by PCR using primers NVDF354 and NVDF355 (Supplementary Table 1). The probe was labeled with [α-³²P]-dATP by random priming (Sambrook and Russell 2001). Hybridization was carried out at 65 °C following a standard protocol (Sambrook and Russell 2001). Membranes were exposed to a storage phosphor imaging screen, followed by scanning with a Personal Molecular Imager System (Bio-Rad) and analyzed with Quantity One 1-D analysis software (Bio-Rad).

RNA isolation and northern blot

Total RNA was isolated from a 50-mL (1 × 10⁵ cells/mL) culture with Tri-Reagent (Sigma-Aldrich, USA). Fifteen micrograms of RNA were denatured, separated by electrophoresis in a denaturing, formaldehyde-containing, 2% agarose gel and blotted onto a Zeta-Probe nylon membrane (Bio-Rad) by capillary blotting. Membranes were stained with methylene blue solution to verify RNA transfer. Probes for aphA-6 and gfp were generated by PCR using primers NVDF233/234 and NVDF235/236 (Supplementary Table 1), respectively, using plasmid pJ248/aphA-6/gfp as a template (Fig. 1b). After purification by agarose gel electrophoresis, probes were labeled with [α-³²P]-dATP by random priming and hybridization was carried out at 65 °C following a standard protocol (Sambrook and Russell 2001). Northern blots were exposed to a storage phosphor imaging screen followed by scanning with a Personal Molecular Imager System (Bio-Rad) and analyzed with Quantity One 1-D analysis software (Bio-Rad).

Protein extraction and immunoblot analyses

Total soluble protein was extracted as described previously (Bertalan et al. 2015). A 15-mL aliquot of cell culture was harvested by centrifugation (3000×g, 5 min). The pellet was washed once with 1 mL of solution A (0.1% Na₂CO₃) and then resuspended in 300 µL of solution A, 200 µL of solution B (5% SDS, 30% sucrose) and 25 µL β-mercaptoethanol (14.3 M) and agitated with a vortex for 25 min at room temperature. Cell debris was removed by centrifugation for 10 min at 16,000×g at 4 °C. The supernatant was recovered, and total cellular protein was quantified by Bradford assay with Quick Start Bradford 1× Dye Reagent (Bio-Rad). Proteins (30 µg) were separated by electrophoresis in 15% SDS-polyacrylamide gels, and subsequently transferred to Immun-Blot PVDF Membranes (Bio-Rad). Membranes were probed with a chicken anti-GFP primary antibody (1:5000, Aves Lab Inc., USA) followed by probing with a goat anti-chicken antibody conjugated to horseradish peroxidase (1:5000, Abcam, UK). Immunobiochemical detection was carried out using the Western Lightning Plus-ECL reagent (Perkin Elmer Inc., USA) following the manufacturer’s instructions and recorded using a ChemiDoc MP Imaging System (Bio-Rad). Purified recombinant GFP (rGFP) was obtained by overexpressing the gfp gene using the pET28-b vector in E. coli BL21 Star (Invitrogen, USA) and subsequently purified using Ni–NTA agarose following standard protocols (Invitrogen).

Microscopy

Subcellular localization of GFP was determined using a Zeiss Confocal Laser Scanning Multiphoton LSM-710NLO microscope system (Carl Zeiss, Germany), equipped with a plan-apochromat 63×/1.4 oil objective. GFP fluorescence was visualized using a 468-nm excitation filter and a 510-nm emission filter. Chlorophyll autofluorescence was visualized using a 450-nm excitation filter and a 705-nm emission filter. Images were processed and analyzed using the ZEN software (Carl Zeiss).

Circularized RT-PCR

RNA circularization was performed as described previously by Zandueta-Criado and Bock (2004). Briefly, 5 µg of total RNA were self-ligated at 37 °C for 2 h with 10 units of T4 RNA ligase (New England Biolabs) in a final reaction volume of 50 µL. For cDNA synthesis, 1 µg of the circularized RNA, 2 µL (50 µM) of oligonucleotide and 1 µL of dNTPs (10 mM), in a 10-µL total reaction volume, were denatured at 70 °C for 5 min and used for cDNA synthesis with 200 U of M-MuLV Reverse Transcriptase (New England Biolabs) following the manufacturer’s instructions. The qRJAB7 primer was used as a specific oligonucleotide for gfp transcripts. The resulting cDNA of each reaction was used directly as a template for PCR and amplified following standard protocols with KOD Hot Start polymerase (Merck Millipore, USA) and the primer pair qRJAB7/NVDF267 (Supplementary Table 1). PCR products were cloned in pBlueScript SK II (+) and sequenced with universal primers M13F and M13R.

Results

Selection of intercistronic regions and construction of the synthetic operons

To characterize IEEs that can be used for the expression of multiple foreign genes in the chloroplast of C. reinhardtii, we followed this approach. First, we looked for genes that are expressed in a polycistronic transcript, for which there is northern blot evidence, and identified the following that meet these criteria: petA-petD (Sakamoto et al. 1994; Sturm et al. 1994; Loiselay et al. 2008), psbB-psbT-psbH (Monod et al. 1992; Johnson and Schmidt 1993; Vaistij et al. 2000b; Loizeau et al. 2014), psbD-exon 2 psaA (Herrin and Schmidt 1988; Choquet et al. 1998), rps7-atpE (Robertson et al. 1990), psaC-ORF58-petL (Takahashi et al. 1991, 1996) and tscA-chlN (Rochaix 1996; Glanz et al. 2012). From these, we selected the intercistronic regions shorter than 1 kb: psbB-psbT-psbH (451 bp and 569 bp, respectively), psaC-petL (607 bp), petL-trnN (334 bp) and tscA-chlN (650 bp), identified hereafter as IEE-1, IEE-2, IEE-3, IEE-4 and IEE-5, respectively. The intercistronic regions of petA-petD and rps7-atpE have sizes of 2 531 bp and 1 229 bp, respectively. Ideally, IEEs should be short, as this facilitates manipulation and has little impact on the final size of the polycistronic transcript. With this in mind, we left out the intercistronic regions from petA-petD and rps7-atpE.
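The length criterion can be written out as a simple filter. The sketch below uses the sizes quoted in the text; splitting psbB-psbT-psbH into two regions named psbB-psbT (451 bp) and psbT-psbH (569 bp) is our reading of "respectively", and the variable names are illustrative.

```python
# Intercistronic region sizes (bp) as quoted in the text.
# Assumption: the two psbB-psbT-psbH sizes map to psbB-psbT and psbT-psbH.
regions_bp = {
    "psbB-psbT": 451,
    "psbT-psbH": 569,
    "psaC-petL": 607,
    "petL-trnN": 334,
    "tscA-chlN": 650,
    "petA-petD": 2531,
    "rps7-atpE": 1229,
}

MAX_LENGTH_BP = 1000  # selection threshold: shorter than 1 kb

# Keep only the regions below the threshold, preserving the listed order.
selected = {name: size for name, size in regions_bp.items() if size < MAX_LENGTH_BP}

# Number the retained regions in order, mirroring the IEE-1..IEE-5 labels.
labels = {f"IEE-{i}": name for i, name in enumerate(selected, start=1)}

for label, name in labels.items():
    print(f"{label}: {name} ({selected[name]} bp)")
```

Five regions pass the filter, and petA-petD and rps7-atpE are excluded, in line with the selection described above.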

All but one of the five chloroplast intercistronic regions, the petL-trnN region, come from bicistrons that contain only protein-coding genes. The petL-trnN bicistron contains a protein-coding gene, petL, and a tRNA-coding gene, trnN. As such, this intercistronic region would not be expected to support translation of the second cistron, and we have included it both as a control and to preliminarily explore the potential of this region for the expression of a foreign protein and an RNA. Additionally, we used the 5′UTR from gene 10 of bacteriophage T7 (Olins et al. 1988; Olins and Rangwala 1989). This 5′UTR, also referred to as 5′UTR T7g10, lacks a bicistronic processing site but contains a ribosome-binding site (RBS) and has been used in plant chloroplasts for the expression of genes and accumulation of proteins at a high level (Oey et al. 2009). We included the 5′UTR T7g10, hereafter identified with the code IEE-6, because, to the best of our knowledge, this element has not been tested in the chloroplast of C. reinhardtii.

With the selected intercistronic regions, IEE1–IEE6 (Fig. 1c, Supplementary Data 1), we constructed a series of synthetic bicistronic operons and inserted them in the previously reported chloroplast transformation vector p320 (Guzmán-Zapata et al. 2016). This vector targets the insertion of foreign genes to the intergenic region between 23S rRNA-5S rRNA and exon 5-intron 4 of psbA in the chloroplast genome of C. reinhardtii (Fig. 1a). The synthetic operons comprised the selectable marker aphA-6 (Bateman and Purton 2000) and the codon-optimized reporter gene gfp (Supplementary Data 1), placed under the control of the rbcL promoter and terminator (PrbcL, TrbcL). Each intercistronic region was placed between aphA-6 and gfp (Fig. 1b) to generate vectors p320-IEE1, p320-IEE2, p320-IEE3, p320-IEE4, p320-IEE5, and p320-IEE6 (Fig. 1c). We reasoned that both aphA-6 and gfp would be transcribed from the rbcL promoter as a single mRNA unit and that translation of the first cistron (aphA-6) would occur from cis translation elements present in the 5′UTR of PrbcL (Wang et al. 2008). In turn, translation of the second cistron (gfp) would only occur if the intercistronic region yields a transcript that can either be processed to generate translatable monocistronic transcripts or serve directly as a bicistronic transcript for translation (Stern et al. 2010). In any case, the resultant mRNA would need to be sufficiently stable and to contain the necessary cis-elements and the regions for interaction with trans-acting factors, if any. Although the aphA-6 gene confers resistance to kanamycin and selection of transformed lines could have been carried out directly on this antibiotic, we did not know if translation of the aphA-6 mRNA would be somehow affected by the presence of the IEE placed downstream of the coding sequence. For this reason, we decided to use vector p228 for co-transformation and primary selection of transformed lines (Newman et al. 1990; Chlamydomonas Resource Center, University of Minnesota). Vector p228 carries a point mutation in the 16S rRNA gene, so when the endogenous 16S rRNA gene is replaced, strains carrying the point mutation become resistant to spectinomycin.

Chloroplast transformation and primary analysis of transplastomic lines

To introduce the synthetic operons in the chloroplast of C. reinhardtii, vectors p320-IEE1, p320-IEE2, p320-IEE3, p320-IEE4, p320-IEE5 and p320-IEE6, along with vector p228, were bombarded using a particle bombardment device. After four events of transformation we recovered a total of 47, 33, 50, 8, 18 and 8 spectinomycin-resistant colonies for vector combinations p320-IEE1/p228, p320-IEE2/p228, p320-IEE3/p228, p320-IEE4/p228, p320-IEE5/p228 and p320-IEE6/p228, respectively (Fig. 2a). Because not all the recovered spectinomycin-resistant lines are the result of transformation with both vectors (they could be the result of transformation with p228 only), we tested each colony to determine if they were resistant to both spectinomycin and kanamycin by growing them in TAP media with these antibiotics (Fig. 2b, Supplementary Fig. 1). With this test we determined that 16, 11, 33, 8, 2 and 4 lines were the successful events of transformation with vectors p320-IEE1, p320-IEE2, p320-IEE3, p320-IEE4, p320-IEE5 and p320-IEE6, respectively (Fig. 2a, Supplementary Fig. 1). The resistance to kanamycin in these lines strongly suggests that the first cistron (aphA-6), independently of the intercistronic region placed between the two cistrons, is translated, producing in turn the detoxifying enzyme APH(3′)-VI.
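The colony counts above can be turned into a co-transformation fraction per vector, that is, the share of spectinomycin-resistant primary lines that were also kanamycin-resistant. This is a minimal sketch using only the counts reported in the text; the dictionary and function names are illustrative.

```python
# Spectinomycin-resistant colonies recovered after four transformation events.
spec_resistant = {
    "p320-IEE1": 47, "p320-IEE2": 33, "p320-IEE3": 50,
    "p320-IEE4": 8, "p320-IEE5": 18, "p320-IEE6": 8,
}

# Subset of those colonies that were also kanamycin-resistant.
also_km_resistant = {
    "p320-IEE1": 16, "p320-IEE2": 11, "p320-IEE3": 33,
    "p320-IEE4": 8, "p320-IEE5": 2, "p320-IEE6": 4,
}


def cotransformation_fraction(n_spec: int, n_km: int) -> float:
    """Fraction of Spec-resistant lines that are also Km-resistant."""
    return n_km / n_spec


fractions = {
    vector: cotransformation_fraction(spec_resistant[vector], also_km_resistant[vector])
    for vector in spec_resistant
}

for vector, fraction in fractions.items():
    print(f"{vector}: {also_km_resistant[vector]}/{spec_resistant[vector]} = {fraction:.0%}")
```

The fractions range from 2 of 18 lines for p320-IEE5 to 8 of 8 lines for p320-IEE4. With colony numbers this small, the differences between vectors should be read with caution.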

Fig. 2

Transplastomic lines obtained after co-transformation with p320-IEE1-6 and p228. C. reinhardtii lines recovered after transformation with vectors of the series p320-IEE (p320-IEE1–6, Fig. 1c), and vector p228, which contains a fragment of the chloroplast 16S rRNA with a point mutation that confers resistance to spectinomycin. a Total number of primary lines recovered and resistant to spectinomycin (Spec); total number from the primary lines that were also resistant to kanamycin (Km). b Transplastomic lines for each IEE were grown in TAP media (TAP), TAP media supplemented with spectinomycin (TAP/Spec; 150 µg/mL spectinomycin) and TAP media supplemented with kanamycin (TAP/Km; 100 µg/mL kanamycin)

We selected 2–3 independent lines from each of the strains transformed with p320-IEE1, p320-IEE2, p320-IEE3, p320-IEE4, p320-IEE5 and p320-IEE6, and carried out a more detailed molecular characterization. Here we present the results of these lines, namely CrC-IEE1, CrC-IEE2, CrC-IEE3, CrC-IEE4, CrC-IEE5 and CrC-IEE6. First, we checked by PCR for the presence of both aphA-6 and gfp and detected amplification products of the expected sizes, 248 bp for aphA-6 and 178 bp for gfp, in all transplastomic lines but not in the wild-type strain (Fig. 3a). Then, we performed a restriction fragment length polymorphism (RFLP) analysis using SacII and PstI restriction enzymes (Fig. 1a, d) to assess homoplasmy and to confirm that the synthetic operon had integrated in the chloroplast genome at the targeted site. We used a probe that aligns in the psbA gene (Fig. 1d) and detected restriction fragments of 3.7 kb for the wild-type and of 6.6, 6.7, 6.7, 6.5, 6.8 and 6.2 kb for transplastomic lines CrC-IEE1, CrC-IEE2, CrC-IEE3, CrC-IEE4, CrC-IEE5 and CrC-IEE6, respectively (Fig. 3b). This result indicated that the synthetic operon had integrated in the expected site and that no copies of the wild-type genome could be detected (with this method). When copies of the wild-type genome cannot be detected, it is acceptable to say that homoplasmy has been reached.
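The RFLP result can be read as a size shift relative to the wild-type fragment. The sketch below computes that shift for each line from the fragment sizes reported above. It assumes that the shift reflects only the inserted DNA, that is, that the insertion introduces no additional SacII or PstI site between the two flanking sites; that assumption is ours and is not stated in the text.

```python
# Fragment sizes (kb) detected with the psbA probe after SacII/PstI digestion.
WILD_TYPE_KB = 3.7

transplastomic_kb = {
    "CrC-IEE1": 6.6, "CrC-IEE2": 6.7, "CrC-IEE3": 6.7,
    "CrC-IEE4": 6.5, "CrC-IEE5": 6.8, "CrC-IEE6": 6.2,
}

# Size shift relative to wild type, rounded to the precision of the gel estimates.
# Assumption: no extra SacII/PstI site is introduced by the inserted DNA.
size_shift_kb = {
    line: round(size - WILD_TYPE_KB, 1) for line, size in transplastomic_kb.items()
}

for line, shift in size_shift_kb.items():
    print(f"{line}: +{shift} kb relative to wild type")
```

Under that assumption the shifts fall between 2.5 kb (CrC-IEE6) and 3.1 kb (CrC-IEE5).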

Fig. 3

Confirmation of transformation in transplastomic lines by PCR and RFLP. a Amplification of a fragment of aphA-6 and gfp by PCR in the transplastomic lines. b RFLP analysis to determine homoplasmy. DNA samples from wild type and transplastomic lines were extracted, digested with SacII and PstI, separated by electrophoresis in a 0.6% agarose gel, blotted and hybridized to the radiolabeled psbA probe shown in Fig. 1d. The probe detects a 3.7 kb fragment in the wild-type strain and fragments of 6.6 kb, 6.7 kb, 6.7 kb, 6.5 kb, 6.8 kb and 6.2 kb in transplastomic lines CrC-IEE1, CrC-IEE2, CrC-IEE3, CrC-IEE4, CrC-IEE5 and CrC-IEE6 obtained with vectors p320-IEE1, p320-IEE2, p320-IEE3, p320-IEE4, p320-IEE5 and p320-IEE6, respectively

Analysis of transcript processing in transplastomic lines

To determine the presence of aphA-6 and gfp transcripts we carried out qualitative RT-PCR and northern blot analyses. For RT-PCR, we first synthesized the cDNA as described in the section "Materials and methods". Using specific primers for gfp and aphA-6 we determined, by RT-PCR, that in all lines the aphA-6 and gfp mRNA transcripts were present (Fig. 4a). For all lines, except for the wild-type strain, we detected an amplification product of 248 bp for aphA-6 and a product of 178 bp for gfp. RNA samples, not taken through the cDNA step, were also used as template for the RT-PCR to rule out amplification from DNA contamination (negative control, not shown in Fig. 4a). We used the constitutively expressed rbcL transcript as a control and detected it in all lines including the wild-type strain. Detection of the transcripts by RT-PCR reliably indicates that the transcripts are present but it is not useful to determine the monocistronic or bicistronic nature of the transcripts. Because we were ultimately interested in determining the cleavage pattern of the synthetic operon, as this could help us to better understand the molecular behavior of the IEEs used, we performed a northern blot analysis as is described next.

Fig. 4

Analysis by RT-PCR and northern blotting of aphA-6 and gfp transcripts in transplastomic lines. a RT-PCR assays were performed to detect aphA-6 and gfp and rbcL transcripts in wild-type and transplastomic lines after transformation with vectors p320-IEE1, p320-IEE2, p320-IEE3, p320-IEE4, p320-IEE5 and p320-IEE6. b Northern blot analysis of transplastomic lines using a radiolabeled aphA-6 probe (See Fig. 1d). The upper band in all lanes corresponds to the bicistronic form of the transcript composed of aphA-6-IEE-gfp, while the bands indicated with the stars correspond to the monocistronic form of aphA-6. c Northern blot analysis of transplastomic lines using a radiolabeled gfp probe (See Fig. 1d). The upper band in all lanes corresponds to the bicistronic form of the transcript composed of aphA-6-IEE-gfp, while the bands indicated with the stars correspond to the monocistronic form of gfp. Membranes stained with methylene blue are shown

Total RNA was extracted, separated by electrophoresis and blotted onto a nylon membrane. It was then hybridized with radiolabeled probes for both genes in the synthetic operon. First, a probe specific for aphA-6 revealed that larger transcripts with a size of around 2.0 kb for CrC-IEE1, CrC-IEE2, CrC-IEE3 and CrC-IEE5, 1.8 kb for CrC-IEE4 and 1.6 kb for CrC-IEE6 were present in the transformed lines (Fig. 4b). The sizes of these transcripts corresponded to non-cleaved bicistronic units originating from the PrbcL. In wild-type cells, the IEEs that we have used in this study also yield a stable bicistron of the genes from which they come (Takahashi et al. 1991; Johnson and Schmidt 1993; Rochaix 1996; Vaistij et al. 2000b; Glanz et al. 2012). Our finding therefore indicates that the IEEs, even in a foreign gene context, are sufficient to yield stable bicistronic transcripts. In addition to these bicistronic transcripts, we detected smaller-sized transcripts for all lines except for CrC-IEE6. For lines CrC-IEE1, CrC-IEE3 and CrC-IEE4 we detected a discrete, clearly defined transcript of approximately 800 bp, which presumably corresponds to the cleaved and processed monocistronic form of aphA-6 (Fig. 4b, indicated with stars). For lines CrC-IEE2 and CrC-IEE5 we also detected discrete, clearly defined bands; however, we detected two in each case: transcripts of 1 kb and 1.1 kb for CrC-IEE2 and transcripts of 1.1 kb and 1.3 kb for CrC-IEE5 (Fig. 4b, also indicated with stars). Faint bands were also detected for CrC-IEE5, most likely the result of complex processing after cleavage of the stable and highly abundant bicistronic transcript. As mentioned above, we could not detect a monocistronic unit for aphA-6 in line CrC-IEE6 and, at least in this case, we can say that the aphA-6 transcript can be translated from a bicistronic uncleaved aphA-6/gfp mRNA. The presence of the aphA-6 cistron in all lines, either in the bicistronic or monocistronic form, is consistent with the kanamycin-resistant phenotype observed.

We then performed a northern blot assay using a probe for gfp, the second cistron of the synthetic operon. Consistent with what we observed with the aphA-6 probe, the gfp probe also detected the unprocessed bicistronic mRNA transcript in all lines (Fig. 4c). The sizes of these bicistronic transcripts are, as mentioned above, 2.0 kb for Crc-IEE1, Crc-IEE2, Crc-IEE3 and Crc-IEE5, 1.8 kb for Crc-IEE4 and 1.6 kb for Crc-IEE6. Monocistronic forms of gfp, with sizes of 0.7 kb, 1.1 kb, 0.7 kb and 0.9 kb, were detected in lines Crc-IEE2, Crc-IEE3, Crc-IEE4 and Crc-IEE5, respectively (Fig. 4c, indicated with stars). Note that these monocistronic units seem to be less abundant in Crc-IEE3 and Crc-IEE4 than in Crc-IEE2 and Crc-IEE5. Interestingly, for line Crc-IEE5 additional bands were detected, again suggesting that this IEE yields mRNA transcripts with a broad range of sizes (Fig. 4c). Transcripts of various sizes are frequently reported as the result of polycistronic processing (Barkan 1988; Hahn et al. 1998; Monde et al. 2000). This has been attributed to initial random cleavage by RNases in intercistronic regions (Luro et al. 2013; Pfalz et al. 2009). However, the possibility that transcripts of various sizes result from mRNA instability and rapid degradation cannot be ruled out. For lines Crc-IEE1 and Crc-IEE6 we could not detect a monocistronic unit for gfp. In the case of Crc-IEE1, the bicistron is in fact cleaved, as the monocistronic aphA-6 transcript was detected but not the gfp transcript, indicating that the resulting cistron containing the gfp unit is highly unstable and rapidly degraded after processing of the bicistronic aphA-6-IEE1-gfp transcript. In wild-type cells, expression of the psbB-psbT-psbH polycistron yields detectable psbB, psbH and psbB-psbT units, but not the psbT monocistronic unit (Monod et al. 1992; Vaistij et al. 2000a, b).
Our IEE1 comes from the psbB-psbT region and behaves similarly: we can detect the bicistronic form of the transcript plus the monocistronic form of the first cistron, but not the second one. In the case of Crc-IEE6, the bicistron does not seem to be processed at all, as neither aphA-6 nor gfp was detected in the monocistronic form. This is not surprising, as the T7g10 5′-UTR does not seem to contain a recognizable sequence for chloroplast RNases to initiate cleavage and processing.

Immunodetection and confocal microscopy detection of GFP in transplastomic lines

To determine if the second cistron, corresponding to gfp, was efficiently translated, we checked for the presence of GFP by western blotting and by confocal laser microscopy. For immunodetection, total soluble protein was extracted from lines Crc-IEE1, Crc-IEE2, Crc-IEE3, Crc-IEE4, Crc-IEE5 and Crc-IEE6 and used to determine the presence of GFP using a specific polyclonal antibody. It is worth reminding the reader that all the lines were resistant to kanamycin, indirectly indicating that the first cistron of the synthetic operon was translatable. However, we could not detect GFP in Crc-IEE1, Crc-IEE3, Crc-IEE4 and Crc-IEE6, whereas we obtained a positive result in lines Crc-IEE2 and Crc-IEE5 (Fig. 5a). Drawing conclusions on why GFP was not detected in lines Crc-IEE1, Crc-IEE4 and Crc-IEE6 is somewhat straightforward. In Crc-IEE1 and Crc-IEE6, GFP could only have been generated from the bicistronic form of the transcript, as in both cases the gfp monocistronic mRNA was absent (Fig. 4c). We can conclude then that the gfp cistron cannot be translated when located as the second cistron downstream of the IEEs derived from the psbB-psbT intercistronic region and the T7g10 5′-UTR. Interestingly, in both cases, the presence of these regions in the uncleaved bicistronic mRNA does not seem to affect translation of aphA-6 to produce APH(3′)-VI. In the case of Crc-IEE4 we did not expect translation of the gfp transcripts from either the bicistronic or monocistronic units; IEE4 comes from petL-trnN, a protein-tRNA intercistronic region, and as such lacks the necessary cis elements for translation. However, the usefulness of this IEE could lie in its potential as an element to co-express a protein-coding gene along with an RNA-coding gene (e.g. a Cas9 protein with a gRNA), which would eventually be needed to implement CRISPR/Cas9 technology in the C. reinhardtii chloroplast.

Fig. 5

Detection and quantification of GFP protein in transplastomic lines. a Total soluble protein was extracted from wild-type and transplastomic lines and 30 µg was separated by SDS-PAGE, blotted and incubated with a polyclonal antibody against GFP. The expected size of GFP is approximately 27 kDa. Recombinant GFP (rGFP) from E. coli was used as a positive control. b Quantification of GFP protein in positive lines Crc-IEE2 and Crc-IEE5. Quantification was performed by comparison with the density of a dilution series of rGFP in the range 5–30 ng. Pictures of the gels stained with Coomassie brilliant blue, showing equal loading of protein, are also presented

GFP could only be detected in lines Crc-IEE2 and Crc-IEE5, two of the lines where we also detected processing of the bicistronic mRNA to yield monocistronic aphA-6 and gfp. This could tempt us to think that the monocistronic form of gfp is required for translation. However, in line Crc-IEE3 the monocistronic form of gfp is also detected, albeit to a lesser extent, and yet no GFP was detected in this line, which could contradict our presumption. An alternative explanation could be that the accumulation of GFP in these two lines correlates with the accumulation of bicistronic mRNA. Crc-IEE2 and Crc-IEE5 accumulate bicistronic mRNA to a greater degree than Crc-IEE3, which could account for the difference in the level of protein accumulation. Monocistronic transcripts are not always required for translation, and there is evidence showing that individual cistrons can be translated from bicistronic mRNA (Barkan 1988; Zoschke and Barkan 2015). We cannot say at present whether translation is occurring from the monocistronic or bicistronic form of the transcript in lines Crc-IEE2 and Crc-IEE5. However, we can conclusively say that the intercistronic regions from psbN-psbH and tscA-chlN, when used in a synthetic bicistronic construct, can efficiently serve to generate APH(3′)-VI and GFP from translatable transcripts.

Having detected the presence of GFP in lines Crc-IEE2 and Crc-IEE5, we then determined the level of accumulation of this protein by densitometry of the western blot membrane and found it to be 0.05–0.1% of total soluble protein (Fig. 5b). We then used confocal laser-scanning microscopy to visualize GFP fluorescence. We screened all lines but as expected, and in accordance with the results of immunodetection, we could only visualize GFP in lines Crc-IEE2 and Crc-IEE5 (Fig. 6). GFP fluorescence in these lines is solid evidence that the gfp transcripts are being translated and that the resulting protein is folded correctly to generate biologically active GFP. We did not observe fluorescence in the wild-type line (Fig. 6) nor in the rest of the lines (pictures not shown).
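The densitometric estimate can be illustrated with a minimal sketch, assuming a linear relationship between band density and the amount of rGFP over the 5–30 ng range. The density values, the sample names and the function names fit_line and gfp_percent_tsp are hypothetical and chosen only for illustration; they are not measurements or software from this study. Only the 30 µg protein load and the 5–30 ng standard range are taken from the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def gfp_percent_tsp(band_density, slope, intercept, loaded_tsp_ug=30.0):
    """Convert a band density into GFP as a percentage of total soluble protein."""
    gfp_ng = (band_density - intercept) / slope  # interpolate on the standard curve
    loaded_tsp_ng = loaded_tsp_ug * 1000.0       # 1 ug = 1000 ng
    return 100.0 * gfp_ng / loaded_tsp_ng


# Hypothetical rGFP dilution series (ng) and band densities (arbitrary units).
std_ng = [5.0, 10.0, 20.0, 30.0]
std_density = [1000.0, 2000.0, 4000.0, 6000.0]
slope, intercept = fit_line(std_ng, std_density)

# Hypothetical band densities for two samples, each with 30 ug of protein loaded.
for sample, density in [("sample A", 3000.0), ("sample B", 6000.0)]:
    percent = gfp_percent_tsp(density, slope, intercept)
    print(f"{sample}: {percent:.3f}% of total soluble protein")
```

With 30 µg of total soluble protein loaded per lane, a band equivalent to 15 ng of rGFP corresponds to 0.05% and a band equivalent to 30 ng corresponds to 0.1% of total soluble protein.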

Fig. 6

Detection of GFP fluorescence in transplastomic lines Crc-IEE2 and Crc-IEE5 by confocal laser-scanning microscopy. Wild-type and transplastomic lines were visualized by CLSM using a 468-nm excitation filter and a 510-nm emission filter. Chlorophyll fluorescence was determined using a 450-nm excitation filter and a 705-nm emission filter. Bright fields and the overlays of the GFP/chlorophyll fluorescence are shown

Mapping of the 5′-end termini in the gfp cistrons

Having demonstrated that the intercistronic regions from psbN-psbH and tscA-chlN (IEE2 and IEE5) can be used as IEEs in synthetic polycistronic operons, as both yielded biologically active APH(3′)-VI and GFP, we then wished to map the 5′-end of the monocistronic gfp transcripts generated after processing of the bicistronic aphA-6-gfp transcript when these regions are used (Fig. 4c). Total RNA was extracted from lines Crc-IEE2 and Crc-IEE5, circularized using T4 RNA ligase and used for the synthesis of cDNA using primer qRJAB7. The cDNA was then used as template for PCR amplification of the head–tail junction with primers qRJAB7 and NVDF267 (Supplementary Table 1). PCR products were gel purified, cloned in vector pBlueScript SKII+ and sequenced using universal primers M13F and M13R. For line Crc-IEE2, we observed that the monocistronic transcripts of gfp had three different 5′-ends. These 5′-ends corresponded to positions -286, -59 and -56 with respect to the start codon of the gfp coding region (Fig. 7a). From the sequencing results, transcripts with ends at -59 and -56 were more abundant than the transcript with the end at position -286, in a ratio of 3:1. The 5′-end located at position -59 has been previously reported by Loizeau et al. (2014), whereas the 5′-ends at positions -56 and -286 had not been reported before. This should be taken with caution though, as we did not detect an additional, clearly defined band for the transcript with the 5′-terminus at position -286 in the northern blot, and this could be an intermediate species in the processing and maturation of the mRNA. Interestingly, in the three transcript ends we mapped we could observe the intact target sequence of Mbb1 (Loizeau et al. 2014), a nuclear-encoded protein that is a member of the TPR family (Fig. 7a, underlined). The target sequence of Mbb1 contains an S-box sequence (Fig. 7a, in bold) which has been found to be determinant in the processing and stabilization of the monocistronic forms of psbB and psbH in wild-type C. reinhardtii (Vaistij et al. 2000a; Loizeau et al. 2014). As the gfp transcripts identified in line Crc-IEE2 include the target binding sequence of Mbb1, it is reasonable to believe that this TPR protein could bind and stabilize the gfp transcripts, protecting them from exonucleases (Prikryl et al. 2011; Hammani et al. 2012). For line Crc-IEE5, we determined the 5′-end of one monocistronic form of gfp (Fig. 7b), most likely the most abundant form within the ~0.9 kb band detected in the northern blot (Fig. 4c). This 5′-end was located at position -133 with respect to the start codon of the gfp coding region. In wild-type cells, processing of the bicistronic tscA-chlN transcript yields at least three different RNA species (Hahn et al. 1998). It is interesting to note that in the 5′-end of this transcript, the closely related sequences AAGUAAg/AAGUAAu, similar to the S-box sequence AAGUAAA at the core of the Mbb1 footprint, are present (Fig. 7b, in bold). However, their potential role, if any, as the target site of RNA-binding proteins remains to be determined. It has been suggested that PPR7 might be involved in the processing and stabilization of the tscA-chlN transcript (Jalal et al. 2015); however, not much is known about the stabilization of the 5′-end of the monocistronic form of chlN.
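As an illustration of how S-box-like sequences could be located in a 5′ leader and expressed relative to the start codon, the following is a minimal sketch. The leader sequence is invented and is not the IEE2 or IEE5 sequence, and the function names find_motif and relative_to_start_codon are hypothetical. Only the motifs AAGUAAA and AAGUAA are taken from the text. Positions are numbered so that the base immediately upstream of the A of the start codon is -1, which is an assumed convention.

```python
import re

S_BOX = "AAGUAAA"      # S-box sequence named in the text
S_BOX_CORE = "AAGUAA"  # core shared by the related sequences AAGUAAg / AAGUAAu


def find_motif(rna, motif):
    """Return 0-based start indices of all matches, including overlapping ones."""
    rna = rna.upper().replace("T", "U")  # accept DNA or RNA input
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", rna)]


def relative_to_start_codon(index, start_codon_index):
    """Convert a 0-based index to a position relative to the start codon.

    Assumed convention: the A of AUG is +1 and the base just upstream is -1.
    """
    offset = index - start_codon_index
    return offset + 1 if offset >= 0 else offset


# Invented 5' leader followed by a start codon; for illustration only.
leader = "GGCAAGUAAACCUUAAGUAAUCC"
rna = leader + "AUGGCU"
start_codon_index = len(leader)

for hit in find_motif(rna, S_BOX_CORE):
    print("S-box-like core at position", relative_to_start_codon(hit, start_codon_index))
```

The lookahead pattern is used so that overlapping occurrences are not skipped, which a plain search with `re.finditer` would do.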

Fig. 7

Mapping of the 5′-termini of gfp monocistrons in transplastomic lines Crc-IEE2 and Crc-IEE5. The 5′-termini were determined by sequencing of the circularized gfp cistrons generated after cleavage and processing of the intercistronic regions IEE2 and IEE5, derived from the chloroplast operons psbN-psbH and tscA-chlN, respectively. a The three 5′-termini mapped for IEE2. The target sequence of Mbb1, a nuclear-encoded member of the TPR (tetratricopeptide repeat) family, is underlined, while the S-box is in bold. b The only 5′-terminus mapped for IEE5. Sequences closely related to the S-box are shown in bold. In both cases, the start codon of GFP is highlighted in bold uppercase

Discussion

In this study, we have used the intercistronic regions psbB-psbT, psbN-psbH, psaC-petL, petL-trnN and tscA-chlN from the C. reinhardtii chloroplast genome and the T7g10 5′-UTR (corresponding to the 5′-UTR sequence from gene 10 of bacteriophage T7) to construct synthetic bicistronic operons with the foreign genes aphA-6 and gfp. We used these constructs to determine if such intercistronic regions could serve as IEEs for the coexpression of foreign genes in the chloroplast of C. reinhardtii. The transplastomic lines obtained all showed a kanamycin-resistant phenotype, independently of the IEE used, an indirect indication that APH(3′)-VI was accumulating. However, only the lines that harbored the synthetic operon with IEE2 or IEE5 were able to generate a translatable transcript of gfp and accumulate a detectable amount of GFP. These elements, IEE2 and IEE5, come from psbN-psbH and tscA-chlN, respectively. One of these regions, psbN-psbH, was in fact the source of the first IEE proposed and developed for tobacco chloroplast (Zhou et al. 2007) over a decade ago. The tobacco chloroplast IEE has since been used for metabolic engineering (Lu et al. 2013; Bock 2014; Fuentes et al. 2016), but this is the first time that IEEs have been characterized for C. reinhardtii.

In all lines obtained we observed the accumulation of a bicistronic transcript containing the aphA-6 and gfp genes (Fig. 4b, c). All chloroplast intercistronic regions were cleaved to generate monocistronic mRNA units of aphA-6 and gfp; however, in line Crc-IEE1 we could not detect the gfp monocistronic transcript by northern blotting, strongly suggesting that the transcript is highly unstable and rapidly degraded. In fact, in wild-type cells, the psbB-psbT bicistron yields a monocistronic form of psbB but not of psbT. Because the ultimate aim of using IEEs is to obtain expression of two (or more) genes, cleavage and processing of the bicistronic transcript are not enough for an IEE to qualify as a useful element for the construction of synthetic operons. In this regard IEE3, IEE4 and IEE6 are not useful for such a purpose. As mentioned above, it has recently been reported that transcription in the chloroplast of C. reinhardtii generates polycistronic units more abundantly than was previously thought (Cavaiuolo et al. 2017; Gallaher et al. 2017). We find this interesting, as a few more intercistronic regions could still be characterized in future studies and new IEEs added to the two we have identified here. Alternatively, data obtained from sRNA-Seq could be used to identify sRNA footprints, the result of RNA-binding proteins (e.g., PPR, TPR and OPR proteins), that could serve in turn to develop synthetic constructs with cis-elements and trans-acting factors that can be incorporated at the 5′-end of transcripts to stabilize them and enhance their translation. A similar approach has been successfully used in tobacco chloroplast, where the target sequences of various PPR proteins have been incorporated in synthetic bicistronic constructs, facilitating the expression of the second cistron (Legen et al. 2018).

The level of GFP accumulation we determined (0.05–0.1% of total soluble protein) is below what is estimated to be a good level of protein accumulation (> 1%) for a biotechnology application (Rasala et al. 2010). However, a high level of protein accumulation is not always required. For applications that involve the introduction of 2–3 enzymes, all involved in a metabolic pathway, transcript stability and translatability seem to be more important (Fuentes et al. 2016). In one of the IEEs we report here, psbN-psbH, we found that a binding sequence for the nuclear-encoded Mbb1 TPR protein is present in the mature translatable form of the gfp transcript, suggesting that this transcript is also being stabilized, as occurs with the psbH transcript in wild-type cells. An exciting area of research will be the characterization of new factors that generate and stabilize transcripts (monocistronic or polycistronic) and how this knowledge could then help to fine-tune the expression of foreign genes for metabolic engineering and synthetic biology in C. reinhardtii.

Before this report, there were no studies on the identification and use of IEEs for the stacking and expression of foreign genes in the chloroplast of C. reinhardtii. However, some progress has been made in this regard in the chloroplast of tobacco. In a pioneering work, Zhou et al. (2007) studied the psbB operon from tobacco chloroplast and identified a small intercistronic element, called the IEE, that, when included in a synthetic construct to drive the expression of nptII and yfp, proved to be sufficient to mediate processing of polycistronic transcripts into stable and translatable monocistronic mRNAs. Since then, this feature has been used to express polycistronic mRNA of transgenes from a single promoter in a single transformation event (Lu et al. 2013; Bock 2014; Fuentes et al. 2016). For example, the tobacco chloroplast IEE was used for the construction of a synthetic operon for the expression of three key genes (coding for the key enzymes homogentisate phytyltransferase, tocopherol cyclase and γ-tocopherol methyltransferase) involved in the biosynthesis of vitamin E (tocochromanol). Expression of the genes containing the IEE resulted in a tenfold increase in the accumulation of tocochromanol in transplastomic tobacco and tomato plants, compared to the levels obtained when the genes were expressed without the IEE (Lu et al. 2013). These works have shown that IEEs can facilitate and improve the efficient translation of transcripts from synthetic operons in the plastid genome and thereby contribute to the development of metabolic engineering at a more accelerated pace.

The IEEs we have identified could be used to construct bi-, tri- or polycistronic operons for the C. reinhardtii chloroplast. These could contain genes that comprise an entire metabolic pathway not present in the C. reinhardtii chloroplast, or genes that complement a pathway that is already present but lacks the enzymes to divert a certain intermediate toward the production of high-value molecules such as terpenoids, vitamins or fatty acids.