Introduction

Chloroplasts of plant cells contain more than 3000 different types of proteins involved in their functions (Jarvis and Soll 2001). More than 90% of chloroplast proteins are encoded in the nucleus, translated on free cytosolic ribosomes as precursors, and post-translationally acquired through protein-import mechanisms (Xu et al. 2021). Delivery of precursor proteins by the import machinery is critical in chloroplast proteome biogenesis (Li and Chiu 2010; Gao et al. 2023). The import machinery at the chloroplast surface consists of the translocase of the outer envelope (TOC) and the translocase of the inner envelope (TIC) (Rochaix 2022). The TOC-TIC multiprotein complex forms channels through the envelope and, aided by stromal chaperones, facilitates the delivery of chloroplast preproteins, with the exception of most outer envelope proteins (Li et al. 2020; Nakai 2018).

The TOC-TIC complex recognizes chloroplast preproteins through an N-terminal cleavable signal sequence termed the chloroplast transit peptide (cTP) (Inaba and Schnell 2008). cTPs are functionally conserved but highly divergent in primary sequence and length (Bruce 2001). Certain characteristics are commonly present in cTPs, such as a high proportion of hydroxylated amino acid residues, a low number of acidic residues, a high number of proline residues that contribute to the unstructured nature of cTPs, and an α-helical secondary structure that is induced on binding to chloroplast lipids (Bruce 2000). Although cTPs lack a consensus sequence, studies have identified multiple sequence motifs in cTPs that function in an independent, additive, or combinatorial manner to engage the translocation machinery (Lee et al. 2006, 2015, 2018). The functioning of cTP motifs depends strongly on their position and sequence context (Lee et al. 2006).

The addition of a cTP to the N-terminus is sufficient to direct heterologous proteins to the chloroplast (Broeck et al. 1985; Kavanagh et al. 1988; Kim et al. 2009, 2010; Zhong et al. 2003). However, for the efficient import of membrane proteins and specific heterologous soluble proteins, the sequence stretch beyond the cTP cleavage site is critical (Bionda et al. 2010; Caspari et al. 2022; Shen et al. 2017; Uehara et al. 2016, 2020). Caspari et al. (2022) found that eight out of ten tested Chlamydomonas cTPs require unstructured regions downstream of the cleavage site to import a common heterologous cargo. The unstructured region was proposed to help in the passive set-up phase of import by allowing short cTPs to reach across the translocase channels to stromal chaperones without steric hindrance from the structured cargo protein. This hypothesis was supported by the finding that N-terminal unstructured regions could be functionally replaced with unstructured regions from the C-terminus of the Chlamydomonas chloroplast proteins EPYC1 and VIPP1, indicating that N-terminal unstructured regions act as spacer sequences with no active role in chloroplast protein import (Caspari et al. 2022).

Much less is known about the sequence requirements for the chloroplast targeting of heterologous membrane proteins, especially multi-transmembrane domain (TMD) proteins. Arabidopsis chloroplast inner envelope membrane (IEM) proteins commonly contain an N-terminal domain (hereafter referred to as the N-region) between the cTP and the first TMD (Rolland et al. 2016; Singhal et al. 2017). The N-regions are highly diverse in sequence and length but, unlike cTPs, have a net negative or neutral charge (Rolland et al. 2016). Deleting the N-region of the Arabidopsis chloroplast IEM proteins plastidic sugar transporter (AtpSuT) and plastidic glycolate-glycerate transporter (AtPLGG1) resulted in import failure of the mutant proteins, highlighting the crucial role of the N-region in the import process (Rolland et al. 2016). However, it is yet to be determined whether the N-region acts as a spacer, as proposed by Caspari et al. (2022) for Chlamydomonas proteins, or whether it contains motifs necessary for import.

Chloroplast IEM proteins require an additional signal for sub-organellar localization (Froehlich and Keegstra 2011; Viana et al. 2010). The IEM signal is proposed to be present in the TMD region of the proteins (Anderson et al. 2019; Okawa et al. 2014; Singhal et al. 2017; Viana et al. 2010). In multi-TMD proteins, a single TMD or a specific pair of TMDs contains the IEM signal (Okawa et al. 2014; Singhal et al. 2017). However, when taken out of the context of the native protein, the TMDs fail to act as IEM signals, suggesting the need for a specific sequence context (Okawa et al. 2014). Singhal et al. (2017) found that replacing the N-region of the multi-TMD IEM protein AtSCY2 with that of its thylakoid homolog AtSCY1 displaced the chimeric protein to the stroma. The N-region’s sequence context therefore potentially helps determine the protein’s sub-organellar localization.

The N-region is reported to be necessary for the chloroplast import and membrane integration of multi-TMD proteins, but its role in both processes requires further study. To understand the N-region’s role, we tested the potential of cTP and N-region combinations from different Arabidopsis IEM proteins to direct the cyanobacterial multi-TMD protein SbtA to the chloroplast. The SbtA protein was chosen because it lacks an organellar signal sequence and IEM signals in its TMDs and has been the object of plant engineering efforts. The sub-cellular localization of the generated fluorescent fusion proteins was determined through western blotting and laser confocal microscopy.

Materials and methods

Subcellular localization, TP prediction, transmembrane domains, and protein alignments

A list of Arabidopsis thaliana chloroplast IEM proteins with multiple TMDs was obtained from the TAIR (https://www.arabidopsis.org/index.jsp) database. Putative cTPs were determined using TargetP-2.0. The TMDs were identified through topology prediction in UniProt (https://www.uniprot.org/) or consensus TMD prediction programs available in the Aramemnon database (http://aramemnon.uni-koeln.de/). The sequence between the cTP (as annotated by UniProt, TargetP-2.0) and the first TMD (as given in UniProt, TOPCONS) was delineated as the N-region. Protein sequence alignments were carried out using ClustalW.
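
The N-region delineation described above amounts to slicing the precursor sequence between two predicted coordinates. The following is a minimal Python sketch, assuming 1-based residue numbering, a cTP length taken from a transit peptide prediction, and a first-TMD start taken from a topology prediction. The function name and the toy sequence are hypothetical and are not part of the study.

```python
def extract_n_region(sequence: str, ctp_length: int, first_tmd_start: int) -> str:
    """Return the residues between the cTP and the first TMD.

    ctp_length      -- number of residues in the predicted transit peptide
    first_tmd_start -- 1-based position of the first residue of the first TMD
    """
    if not 0 <= ctp_length < first_tmd_start <= len(sequence):
        raise ValueError("coordinates are inconsistent with the sequence")
    # Residues 1..ctp_length form the cTP; the N-region starts at
    # ctp_length + 1 and ends one residue before the first TMD.
    return sequence[ctp_length:first_tmd_start - 1]


# Toy example (made-up sequence): 5-residue "cTP", first TMD starts at residue 9.
toy = "MASSRDEQLLLLVVVV"
print(extract_n_region(toy, ctp_length=5, first_tmd_start=9))  # DEQ
```

Because the cTP and TMD boundaries come from different predictors, the extracted N-region length should be treated as an estimate that shifts with either prediction.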

Cloning

The full or partial coding sequences of the following proteins were used in this study: A. thaliana Plastidic sugar transporter (pSuT) (AT5G59250), A. thaliana Plastidic glycolate-glycerate transporter (PLGG1) (AT1G32080), A. thaliana Inorganic phosphate transporter (PHT2;1) (AT3G26570), A. thaliana Dicarboxylic acid transporter (DIT2;2) (AT5G64280), A. thaliana Sodium-bile acid transporter (BASS4) (AT3G56160), A. thaliana small multidrug transporter (MDT) (AT2G02590), Synechococcus elongatus PCC7942 SbtA (SYNPCC7942_1475), and Synechococcus sp. PCC7002 BicA (SYNPCC7002_A2371). The inserts for substitution constructs were obtained through custom gene fragment synthesis (Twist Bioscience, San Francisco). All inserts were cloned into the pCOLDI vector and sub-cloned into pCAMBIA1302, upstream of the Green Fluorescent Protein (GFP) coding sequence. The inserts were confirmed through Sanger sequencing (AgriGenome Labs, India). For detailed information on the cloning procedure for each construct, see supplementary Fig. S1 and Tables S2, S3, and S4.

Sub-cellular marker plasmids

The plasmid pUC19_ER-mCherry (Addgene Cat. no. 183163) was used to label the endoplasmic reticulum. The IEM marker plasmid was generated by cloning the coding sequence of A. thaliana TIC20-II (AT2G47840) upstream of mScarlet-I in pAN991_mScarlet-I (gift from Dr. Andreas Nebenfuehr).

Plant material and growth conditions

Nicotiana benthamiana plants were grown in potting soil for 21 days in a greenhouse maintained at 26 °C temperature and 60% humidity, with a 12 h/12 h day/night cycle.

Protoplast isolation and transfection

Protoplast isolation and PEG-mediated transfection for Nicotiana benthamiana were carried out as described by Yoo et al. (2007) for Arabidopsis, with modifications. Briefly, 30–40 leaves from 21-day-old N. benthamiana plants were sliced and vacuum infiltrated with the enzyme solution (1% cellulase ‘Onozuka’ R10, 0.5% macerozyme, 0.6 M mannitol, 10 mM CaCl2, 20 mM KCl, 0.1% BSA and 20 mM MES pH 5.7). The leaves were incubated in the enzyme solution in a shaking incubator at 37 °C and 40 rpm for 4 h in the dark. The protoplast solution was filtered through a 70 µm filter and transferred to a test tube placed on ice. An equal volume of W5 solution (154 mM NaCl, 125 mM CaCl2, 5 mM KCl, 2 mM MES pH 5.7) was added, and the protoplasts were allowed to settle for 30 min. The protoplast pellet was resuspended in 2 mL fresh W5 solution and allowed to settle on ice. The protoplasts were counted using a hemocytometer under a light microscope and resuspended in W5 solution to a final concentration of 1 × 10⁵ cells mL⁻¹. Approximately 100,000 protoplasts in 100 μL were mixed with 5–10 μg of plasmid DNA in a test tube for transfection. In the case of co-transfection with marker plasmids, a 2:1 concentration ratio of construct to marker plasmid was used. An equal volume of 40% PEG solution (40% w/v PEG MW 4000, 0.1 M CaCl2, and 0.2 M mannitol) was added, and the mixture was incubated at room temperature for 10 min. After incubation, 2 mL W5 solution was added slowly down the side of the test tube, and the protoplasts were allowed to settle at room temperature. The pellet was resuspended in 0.5 mL of fresh W5 solution and incubated in the dark for 16–18 h at room temperature. Prior to visualization, mitochondria were stained by incubating protoplasts for 20 min in W5 solution containing 100 nM MitoTracker Red CMXRos (ThermoFisher Scientific) dissolved in dimethyl sulfoxide (DMSO).
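
The counting and resuspension step can be illustrated with a short calculation. This is a sketch only, assuming a standard counting chamber in which each large square measures 1 mm × 1 mm × 0.1 mm (0.1 µL); the chamber type, the function names, and the example counts are assumptions and are not taken from the protocol above.

```python
def cells_per_ml(square_counts, dilution_factor=1.0):
    """Estimate cell density from counts in large (0.1 uL) chamber squares."""
    mean_count = sum(square_counts) / len(square_counts)
    # Each large square holds 0.1 uL, so the mean count is scaled by 1e4
    # to give cells per mL.
    return mean_count * 1e4 * dilution_factor


def final_volume_ml(total_cells, target_density):
    """Volume needed to bring a known number of cells to the target density."""
    return total_cells / target_density


density = cells_per_ml([22, 18, 20, 20])   # hypothetical counts per square
total = density * 2.0                      # cells in a 2 mL resuspension
print(density, final_volume_ml(total, 1e5))  # 200000.0 4.0
```

Averaging over several squares reduces counting noise before the suspension is adjusted to the target density.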

Microscopy

The protoplasts were imaged using the Olympus FV3000 confocal microscope under the 60× oil immersion objective. GFP and chlorophyll were excited at 488 nm, and the emission was recorded at 499–535 nm and 630–735 nm, respectively. mCherry and mScarlet-I were excited at 561 nm, and the emission was recorded at 579–633 nm in a separate track. The laser power, PMT gain, and offset settings of the channels were kept constant for all recorded images. The protoplast images are representative of at least three independent experiments. In each experiment, 10–15 protoplasts showing GFP fluorescence were recorded and manually sorted as ER, mitochondria, or chloroplast targeted based on the observed overlap with the marker protein, probe, or autofluorescence, respectively. Colocalization was quantified through the spatial cross-correlation method using the ImageJ software plugin RGB profiler. Appropriate linear regions across the length of a chloroplast were selected for analysis. A minimum of three fluorescence intensity plots from different protoplasts were generated and compared for each construct. The protoplasts and plots shown in the figures are representative images.
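
To illustrate the idea behind comparing intensity profiles along a line drawn across a chloroplast, the following Python sketch computes a Pearson correlation and the shift at which two profiles align best. It is not the ImageJ plugin used in the study; the function names are hypothetical, NumPy is assumed to be available, and the profiles are made-up numbers.

```python
import numpy as np


def profile_correlation(gfp, marker):
    """Pearson correlation between two intensity profiles along one line."""
    gfp = np.asarray(gfp, dtype=float)
    marker = np.asarray(marker, dtype=float)
    if gfp.shape != marker.shape:
        raise ValueError("profiles must have the same length")
    return float(np.corrcoef(gfp, marker)[0, 1])


def best_lag(gfp, marker):
    """Pixel shift of the GFP profile relative to the marker at peak cross-correlation."""
    g = np.asarray(gfp, dtype=float) - np.mean(gfp)
    m = np.asarray(marker, dtype=float) - np.mean(marker)
    xcorr = np.correlate(g, m, mode="full")
    # Index 0 of the full cross-correlation corresponds to lag -(len(m) - 1).
    return int(np.argmax(xcorr)) - (len(m) - 1)


# Made-up profiles: two envelope peaks, GFP displaced by one pixel.
marker = [0, 1, 8, 1, 0, 0, 1, 8, 1, 0]
gfp = [0, 0, 1, 8, 1, 0, 0, 1, 8, 1]
print(profile_correlation(gfp, marker), best_lag(gfp, marker))
```

A correlation close to 1 with a lag of 0 would correspond to coinciding peaks, whereas a non-zero lag or a low correlation would correspond to spatially separate peaks.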

Protein extraction and western blotting analysis

Total protein was extracted from 4–5-week-old N. benthamiana leaves 2 days post-agroinfiltration, as described by Rolland et al. (2016). Total protein extracts were separated by SDS-PAGE and transferred to polyvinylidene difluoride (PVDF) membranes. The membranes were probed with primary antibodies raised in rabbits, anti-GFP (ABclonal cat. no. AE011) and anti-TIC110 (Agrisera AS08 293), followed by a horseradish peroxidase (HRP)-conjugated anti-rabbit IgG secondary antibody (GeNei 114038001A).

Results

The N-region is necessary for multi-TMD protein import

The cyanobacterial protein SbtA contains 10 TMDs and lacks a cTP. We first attempted to direct SbtA to chloroplasts by fusing the cTPs of two Arabidopsis IEM proteins, the sodium bile acid symporter (AtBASS4) (AT3G56160) and the small multidrug transporter (AtMDT) (AT2G02590), to the N-terminus of SbtA. The predicted cTPs of AtBASS4 and AtMDT are 48 and 78 residues long, respectively, with diverse primary sequences. In the construct names, the number appended to each source protein gives the length in residues of the segment taken from it; AtBASS448 thus denotes the 48-residue AtBASS4 cTP and AtMDT78 the 78-residue AtMDT cTP. Control vectors coding for cTP-GFP fusion proteins without SbtA were used to test the cTP’s potential to target a soluble heterologous protein. N. benthamiana protoplasts were transfected with the cloned pCAMBIA1302 constructs and visualized through confocal microscopy 16–18 h post-transfection. A minimum of 30 transfected protoplasts were observed for each construct and manually sorted as localized to the chloroplast, endoplasmic reticulum (ER), or otherwise. Figure 1A depicts the schematics of the chimeric proteins and an overview of their intracellular localization.

Fig. 1

The N-region is necessary to direct SbtA to the chloroplast. A Schematic representation of chimeric proteins and a summary of their sub-cellular localization. The cellular compartment where most protein is localized (primary targeting) is indicated with ‘+’. Compartments with trace amounts of the protein and no protein are indicated as ‘±’ and ‘−’, respectively. OT Other, ER Endoplasmic Reticulum, CHL Chloroplast. Each protoplast image is representative of at least three independent experiments. The protoplasts were imaged under 60× oil immersion objective and manually sorted as OT/ER/CHL localized. B, F, J, N, R Single-plane image of protoplast showing GFP fluorescence of the chimeric protein indicated in the left panel; C, G, K, O, S Single-plane image of protoplast showing mCherry fluorescence of ER marker in magenta; D, H, L, P, T Composite image of protoplast combining the GFP and ER marker fluorescence images; E, I, M, Q, U Composite image of protoplast combining the GFP and chlorophyll autofluorescence images; V, X, Y, Z Un-transfected protoplast imaged in different channels and under same settings taken as an autofluorescence control. (i) Percentage of protoplasts showing GFP fluorescence in endoplasmic reticulum/other (indicated in blue) and chloroplasts (indicated in green). The total number of protoplasts counted is indicated above the bars. (ii) Western blot patterns of chimeric proteins. Total protein extracts from transiently transformed plants were analysed by western blotting using anti-GFP antibody. To confirm that proteins were loaded and transferred equally, blots were stained with Ponceau dye post-transfer (data not shown) and incubated with anti-Tic110 antibody. The results shown are representative of at least three independent western blot analyses. P Precursor form, M Mature processed form

The results show that the chimeric protein AtMDT78-SbtA-GFP was misdirected to the endoplasmic reticulum (ER) in all observed protoplasts (Fig. 1 B–E). AtBASS448-SbtA-GFP accumulated in cytosolic foci (Fig. 1 F–I). The control proteins AtBASS448-GFP and AtMDT78-GFP were successfully directed to the chloroplast in all observed protoplasts (Fig. 1 J–Q). We tested the N-region’s necessity in chloroplast import by fusing the AtMDT cTP + N-region coding sequence to the N-terminus of sbtA in pCAMBIA1302. Confocal imaging of transfected protoplasts showed the AtMDT97-SbtA-GFP signal localized to the chloroplast in all observed protoplasts (Fig. 1 R–U). Non-transfected protoplasts imaged under the same settings confirmed that the low chloroplast signals in the GFP channels of AtMDT78-SbtA-GFP and AtBASS448-SbtA-GFP were due to autofluorescence and not the chimeric proteins (Fig. 1 V–Z). Figure 1i represents the percentage of observed protoplasts showing chloroplast (green) or ER/cytosolic localization (blue) of chimeric proteins. Protoplasts showing GFP signals in both the ER/cytosol and chloroplasts were counted in both categories.
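
Because a protoplast with signal in two compartments is counted in both categories, the reported percentages need not sum to 100. The following Python sketch shows one way such a tally could be computed; the function name, category labels, and the example observations are hypothetical and are not data from the study.

```python
from collections import Counter


def localization_percentages(observations):
    """Percentage of protoplasts per compartment.

    observations -- one collection of compartment labels per GFP-positive
                    protoplast; a protoplast may carry more than one label.
    """
    if not observations:
        raise ValueError("no protoplasts were scored")
    counts = Counter()
    for compartments in observations:
        # A protoplast with signal in two compartments contributes to both.
        counts.update(set(compartments))
    n = len(observations)
    return {name: 100.0 * c / n for name, c in counts.items()}


# Hypothetical tally of four protoplasts.
cells = [{"CHL"}, {"CHL", "ER/other"}, {"CHL"}, {"ER/other"}]
print(localization_percentages(cells))
```

In this example the chloroplast category is 75% and the ER/other category is 50%, which together exceed 100% because one protoplast is counted twice.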

The cloned pCAMBIA1302 constructs were further agroinfiltrated into N. benthamiana leaves, and the total protein extracted at two days post-infiltration (DPI) was used for western blotting analysis. The SbtA-GFP fusion protein is ~68 kDa in size (Fig. 1ii). Efficient chloroplast import of a cTP-fused SbtA chimera should result in cTP cleavage and a ~68 kDa SbtA-GFP mature protein band. However, western blotting of the AtMDT78-SbtA-GFP and AtBASS448-SbtA-GFP chimeric proteins showed precursor protein bands of ~75 kDa, suggesting import failure (Fig. 1ii). The GFP protein is ~28 kDa in size. The AtBASS448-GFP and AtMDT78-GFP control proteins showed bands of ~28 kDa, suggesting cTP cleavage and successful chloroplast import of GFP (Fig. 1ii). The AtMDT97-SbtA-GFP chimera showed a mature protein band of ~70 kDa, indicating successful chloroplast import of the chimeric protein (Fig. 1ii). These results confirm the essential role of the N-region in the chloroplast import of multi-TMD proteins.
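
The reasoning that distinguishes precursor from mature bands can be sketched as simple arithmetic. The Python example below assumes a rule-of-thumb average mass of about 110 Da per residue, cleavage exactly at the predicted cTP boundary, and a 19-residue AtMDT N-region inferred as 97 − 78; all three are assumptions made for illustration, and the function name is hypothetical.

```python
AVG_RESIDUE_KDA = 0.110  # assumed rule-of-thumb average mass per residue


def expected_bands(cargo_kda, ctp_len, n_region_len=0):
    """Return (precursor_kda, mature_kda) for a cTP[-N-region]-cargo fusion."""
    # The N-region stays attached to the cargo after cTP cleavage.
    mature = cargo_kda + n_region_len * AVG_RESIDUE_KDA
    # The precursor additionally carries the uncleaved cTP.
    precursor = mature + ctp_len * AVG_RESIDUE_KDA
    return precursor, mature


# SbtA-GFP is ~68 kDa; the AtMDT cTP is 78 residues long.
print(expected_bands(68.0, ctp_len=78))
# Same cTP plus an assumed 19-residue N-region (97 - 78).
print(expected_bands(68.0, ctp_len=78, n_region_len=19))
```

Under these assumptions the mature form with the N-region is estimated at roughly 70 kDa and the cTP-only precursor at roughly 77 kDa. Such estimates are approximate, since membrane proteins often migrate anomalously on SDS-PAGE.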

The N-region sequence context influences the import efficiency of multi-TMD protein

To determine if the N-region’s sequence context influences the SbtA protein’s import efficiency, we replaced the N-region in AtMDT97-SbtA-GFP with the N-regions of the chloroplast IEM proteins AtPLGG1, AtPHT2;1 (hereafter referred to as AtPHT), AtDIT2;2 (hereafter referred to as AtDIT), and AtpSuT. The selected proteins are transporters containing more than 10 TMDs, a defined cTP, and N-regions of varying lengths and low sequence similarity (Fig. S5). The generated pCAMBIA1302 constructs were transfected into N. benthamiana protoplasts, and the sub-cellular localization of the chimeric proteins was observed 16–18 h post-transfection. A minimum of 30 transfected protoplasts were observed for each construct and manually sorted as localized to the chloroplast, endoplasmic reticulum (ER), or otherwise. Figure 2A depicts the schematics of the chimeric proteins and an overview of their intracellular localization.

Fig. 2

N-region sequence context influences SbtA localization. A Schematic representation of chimeric proteins and a summary of their sub-cellular localization. The cellular compartment where most protein is localized (primary targeting) is indicated with ‘+’. Compartments with trace amounts of the protein and no protein are indicated as ‘±’ and ‘−’, respectively. OT Other, ER Endoplasmic Reticulum, CHL Chloroplast. Each protoplast image is representative of at least three independent experiments. The protoplasts were imaged under 60× oil immersion objective and manually sorted as OT/ER/CHL localized. B, F, J, N Single-plane image of protoplast showing GFP fluorescence of the chimeric protein indicated in the left panel; C, G, K, O Single-plane image of protoplast showing mCherry fluorescence of ER marker in magenta; D, H, L, P Composite image of protoplast combining the GFP and ER marker fluorescence images; E, I, M, Q Composite image of protoplast combining the GFP and chlorophyll autofluorescence images; R, S, T, U Un-transfected protoplast imaged in different channels and under same settings taken as autofluorescence control. (i) Percentage of protoplasts showing GFP fluorescence in endoplasmic reticulum/other (indicated in blue) and chloroplasts (indicated in green). The total number of protoplasts counted is indicated above the bars. (ii) Western blot patterns of chimeric proteins. Total protein extracts from transiently transformed plants were analysed by western blotting using anti-GFP antibody. To confirm that proteins were loaded and transferred equally, blots were stained with Ponceau dye post-transfer (data not shown) and incubated with anti-Tic110 antibody. The results shown are representative of at least three independent western blot analyses. P Precursor form, M Mature processed form

The GFP signals of the chimeric proteins AtMDT78-AtPLGG116-SbtA-GFP, AtMDT78-AtDIT24-SbtA-GFP, and AtMDT78-AtPHT55-SbtA-GFP overlapped with chlorophyll autofluorescence in all observed protoplasts, with a few protoplasts showing traces of GFP signal in the ER (Fig. 2 B–M). All the observed protoplasts showed the AtMDT78-AtpSuT66-SbtA-GFP signal distributed in cytosolic foci, ER, and chloroplasts (Fig. 2 N–Q). The cytosolic foci overlapped with MitoTracker Red, indicating the mitochondria as the targeted organelle (Fig. S6). The chloroplast signals in the GFP channel of transfected protoplasts were brighter than the chlorophyll autofluorescence of non-transfected protoplasts imaged under the same settings (Fig. 2 R–U), indicating the presence of the chimeric protein. Figure 2i represents the percentage of observed protoplasts showing chloroplast (green) or ER/cytosolic localization (blue) of chimeric proteins. Protoplasts showing GFP signals in both the ER/cytosol and chloroplasts were counted in both categories.

To validate the confocal findings, the cloned constructs were agroinfiltrated into N. benthamiana leaves, and the total protein extracted at 2 DPI was used for western blotting analysis. The AtMDT78-AtPLGG116-SbtA-GFP, AtMDT78-AtDIT24-SbtA-GFP, and AtMDT78-AtPHT55-SbtA-GFP chimeric proteins showed prominent mature protein bands of ~70 kDa, suggesting efficient chloroplast import (Fig. 2ii). However, the AtMDT78-AtpSuT66-SbtA-GFP chimeric protein showed an additional, prominent precursor band of ~80 kDa (Fig. 2ii). This agrees with the confocal imaging analysis, which shows a significant amount of the protein misdirected to the mitochondria.

To confirm the importance of the N-region context in cTP functioning, we introduced the N-regions of AtPLGG1, AtPHT, AtDIT, and AtpSuT into AtBASS448-SbtA-GFP (Fig. 3A). The GFP signals of the chimeric proteins AtBASS448-AtPLGG116-SbtA-GFP, AtBASS448-AtDIT24-SbtA-GFP, and AtBASS448-AtPHT55-SbtA-GFP overlapped with chlorophyll autofluorescence in all observed protoplasts, with traces in the ER (Fig. 3 B–M). Similar to AtMDT78-AtpSuT66-SbtA-GFP, the AtBASS448-AtpSuT66-SbtA-GFP signal was distributed between cytosolic foci that overlapped with mitochondria (Fig. S6) and the chloroplast (Fig. 3 N–Q). The chloroplast signals in the GFP channel of transfected protoplasts were brighter than the chlorophyll autofluorescence of non-transfected protoplasts imaged under the same settings (Fig. 3 R–U), indicating the presence of the chimeric protein. Figure 3i represents the percentage of observed protoplasts showing chloroplast (green) or ER/cytosolic localization (blue) of chimeric proteins. Protoplasts showing GFP signals in both the ER/cytosol and chloroplasts were counted in both categories.

Fig. 3

N-region sequence context influences SbtA localization. A Schematic representation of chimeric proteins and a summary of their sub-cellular localization. The cellular compartment where most protein is localized (primary targeting) is indicated with ‘+’. Compartments with trace amounts of the protein and no protein are indicated as ‘±’ and ‘−’, respectively. OT Other, ER Endoplasmic Reticulum, CHL Chloroplast. Each protoplast image is representative of at least three independent experiments. The protoplasts were imaged under 60× oil immersion objective and manually sorted as OT/ER/CHL localized. B, F, J, N Single-plane image of protoplast showing GFP fluorescence of the chimeric protein indicated in the left panel; C, G, K, O Single-plane image of protoplast showing mCherry fluorescence of ER marker in magenta; D, H, L, P Composite image of protoplast combining the GFP and ER marker fluorescence images; E, I, M, Q Composite image of protoplast combining the GFP and chlorophyll autofluorescence images; R, S, T, U Un-transfected protoplast imaged in different channels and under same settings taken as autofluorescence control. (i) Percentage of protoplasts showing GFP fluorescence in endoplasmic reticulum/other (indicated in blue) and chloroplasts (indicated in green). The total number of protoplasts counted is indicated above the bars. (ii) Western blot patterns of chimeric proteins. Total protein extracts from transiently transformed plants were analysed by western blotting using anti-GFP antibody. To confirm that proteins were loaded and transferred equally, blots were stained with Ponceau dye post-transfer (data not shown) and incubated with anti-Tic110 antibody. The results shown are representative of at least three independent western blot analyses. P Precursor form, M Mature processed form

Western blotting analysis of N. benthamiana total protein extracted at 2 DPI confirmed the confocal findings. The chimeric proteins AtBASS448-AtPLGG116-SbtA-GFP, AtBASS448-AtDIT24-SbtA-GFP, and AtBASS448-AtPHT55-SbtA-GFP showed prominent mature protein bands of ~70 kDa, suggesting efficient chloroplast import (Fig. 3ii). AtBASS448-AtpSuT66-SbtA-GFP showed an additional precursor protein band of ~80 kDa, suggesting significant chloroplast import failure (Fig. 3ii).

The N-regions of AtPLGG1, AtPHT, and AtDIT, combined with AtMDT or AtBASS4 cTP, can direct SbtA to the chloroplast. However, the confocal imaging and western blotting analysis of chimeric proteins with the AtpSuT N-region show significant mis-targeting to the mitochondria, indicating that the N-region sequence context influences the import of multi-TMD proteins.

The cargo protein influences the import potential of cTP + N-region

To understand if a cTP with a compatible N-region sequence context is sufficient to target a cargo multi-TMD protein to the chloroplast, we tested the cTP + N-regions of AtpSuT, AtPLGG1, AtPHT, and AtDIT for their SbtA targeting potential. N. benthamiana protoplasts were transfected with the generated constructs, and the sub-cellular localization of chimeric proteins was observed through confocal microscopy. Figure 4A depicts the schematics of the chimeric proteins and an overview of their intracellular localization.

Fig. 4

The cTP-N region functioning depends on the cargo transmembrane domains. A Schematic representation of chimeric proteins and a summary of their sub-cellular localization. The cellular compartment where most protein is localized (primary targeting) is indicated with ‘+’. Compartments with trace amounts of the protein and no protein are indicated as ‘±’ and ‘−’, respectively. OT Other, ER Endoplasmic Reticulum, CHL Chloroplast. Each protoplast image is representative of at least three independent experiments. The protoplasts were imaged under 60× oil immersion objective and manually sorted as OT/ER/CHL localized. B, F, J, N, R Single-plane image of protoplast showing GFP fluorescence of the chimeric protein indicated in the left panel; C, G, K, O, S Single-plane image of protoplast showing mCherry fluorescence of ER marker in magenta; D, H, L, P, T Composite image of protoplast combining the GFP and ER marker fluorescence images; E, I, M, Q, U Composite image of protoplast combining the GFP and chlorophyll autofluorescence images; V, X, Y, Z Un-transfected protoplast imaged in different channels and under same settings taken as autofluorescence control. (i) Percentage of protoplasts showing GFP fluorescence in endoplasmic reticulum/other (indicated in blue) and chloroplasts (indicated in green). The total number of protoplasts counted is indicated above the bars. (ii) Western blot patterns of chimeric proteins. Total protein extracts from transiently transformed plants were analysed by western blotting using anti-GFP antibody. To confirm that proteins were loaded and transferred equally, blots were stained with Ponceau dye post-transfer (data not shown) and incubated with anti-Tic110 antibody. The results shown are representative of at least three independent western blot analyses. P Precursor form, M Mature processed form

In all observed protoplasts, the chimeric protein AtPLGG192-SbtA-GFP localized in the chloroplasts (Fig. 4 B–E). Confocal imaging of AtDIT78-SbtA-GFP showed signals in chloroplasts as well as cytosolic foci (Fig. 4 F–I). The foci overlapped with MitoTracker Red, suggesting mitochondrial import (Fig. S6). Confocal imaging of protoplasts expressing AtPHT126-SbtA-GFP showed GFP signals distributed in the ER and the chloroplast (Fig. 4 J–M). The AtpSuT97-SbtA-GFP signals, similar to those of all chimeric proteins containing the pSuT N-region, were distributed in chloroplasts and mitochondria (Fig. 4 N–Q, Fig. S6). To determine if the cargo protein size impacts the import efficiency of a cTP + N-region, we fused the AtDIT cTP-N region to the N-terminus of the cyanobacterial bicarbonate transporter BicA. BicA is an ~60 kDa protein containing 14 TMDs (Shelden et al. 2010). Confocal imaging of transfected protoplasts showed that the cTP-N region of AtDIT, which significantly misdirected SbtA, efficiently targeted the larger BicA to chloroplasts (Fig. 4 R–U). The chloroplast signals in the GFP channel of transfected protoplasts were brighter than the chlorophyll autofluorescence of non-transfected protoplasts imaged under the same settings (Fig. 4 V–Y). Figure 4i represents the percentage of observed protoplasts showing chloroplast (green) or ER/cytosolic localization (blue) of chimeric proteins. Protoplasts showing GFP signals in both the ER/cytosol and chloroplasts were counted in both categories.

Western blotting analysis of total protein from agroinfiltrated leaves showed a prominent mature protein band of AtPLGG192-SbtA-GFP at ~70 kDa, confirming efficient chloroplast import of the chimera (Fig. 4ii). Western blotting analysis of the AtDIT78-SbtA-GFP, AtpSuT97-SbtA-GFP, and AtPHT126-SbtA-GFP chimeric proteins showed additional prominent precursor protein bands at ~80 kDa, suggesting significant import failure (Fig. 4ii). The results show that the tested cTP + N-regions from characterized chloroplast IEM proteins are not equally efficient in directing SbtA to the chloroplast. Potentially, the import efficiency of a cTP + N-region is also influenced by the cargo TMDs. Supporting this hypothesis, western blotting of AtDIT78-BicA-GFP showed a single mature protein band of ~88 kDa (Fig. 4ii). The efficient import of a larger protein suggests that cargo size is not the determining factor of a cTP + N-region’s targeting efficiency.

cTP-N region in IEM integration of membrane proteins

Several of the tested cTP-N regions efficiently directed SbtA to the chloroplast. Chloroplasts contain three membranes into which the chimeric proteins could be integrated: the outer envelope, the inner envelope, and the thylakoid membrane. Most chloroplast outer envelope proteins lack cTPs and are inserted spontaneously into the outer membrane (Day et al. 2019). With cTP-N regions, the SbtA chimeras are likely recognized as TOC-TIC complex substrates and directed to the chloroplast interior. We conducted co-localization experiments with the IEM marker protein AtTIC20II to determine the targeted membrane. N. benthamiana protoplasts were co-transfected with plasmids encoding SbtA chimeras and the IEM marker protein TIC20II-mScarlet_I. Figure 5A depicts the schematics of the chimeric proteins and an overview of their intraorganellar localization.

Fig. 5

Sub-organellar localization of SbtA chimeras. A Schematic representation of chimeric proteins and a summary of their sub-organellar localization. Prominent IEM localization is indicated with ‘++’. Partial IEM localization is indicated with ‘+’. Each protoplast image is representative of at least three independent experiments. The protoplasts were imaged under 60× oil immersion objective and manually sorted. B, F, J, N Single-plane image of protoplast showing GFP fluorescence of the chimeric protein indicated in the left panel; C, G, K, O Single-plane image of protoplast under 561 nm laser showing mScarlet-I fluorescence of IEM marker in magenta; D, H, L, P Composite image of protoplast combining the GFP and IEM marker fluorescence images; E, I, M, Q Composite image of protoplast combining the GFP and IEM marker fluorescence images; R, S, T, U Un-transfected protoplast imaged in different channels and under same settings taken as autofluorescence control. Relative pixel intensity plots correspond with the white lines on the merged images. (i) Percentage of protoplasts showing a ring of GFP around the chloroplasts overlapping with inner envelope membrane (IEM) marker (indicated in green) or partial overlap with IEM marker (indicated in blue). The number of protoplasts is indicated inside the bars

In all observed protoplasts, the chimeric protein AtMDT97-SbtA-GFP partially overlapped with TIC20II-mScarlet_I (Fig. 5 B–E). The profile of TIC20II-mScarlet_I showed two peaks in linear scans, indicating envelope localization. The AtMDT97-SbtA-GFP profile included GFP peaks coinciding with TIC20II-mScarlet_I as well as spatially separate peaks. The fluorescence signals of AtPLGG192-SbtA-GFP, AtMDT78-AtPLGG116-SbtA-GFP, and AtBASS448-AtPLGG116-SbtA-GFP formed clear rings around the chloroplasts in a majority of the observed protoplasts, with GFP peaks coinciding with those of TIC20II-mScarlet_I, indicating IEM targeting of the chimeric proteins (Fig. 5 F–Q). The signals in the GFP and TIC20II channels of the transfected protoplasts were brighter than the chlorophyll autofluorescence of non-transfected protoplasts imaged under the same settings (Fig. 5 R–U), indicating chimeric protein localization. Figure 5i represents the percentage of protoplasts showing complete (green) or partial (blue) overlap of GFP and TIC20II-mScarlet_I. A complete overlap showed a clear ring of GFP around the chloroplasts, while a partial overlap showed GFP signal peaks in regions not coinciding with the IEM marker.
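The distinction between complete and partial overlap in line-scan profiles can be illustrated with a short Python sketch. This is not the procedure used in the study, where protoplasts were sorted manually; the function names, the `min_height` threshold, the `tolerance` in pixels, and the toy intensity profiles are all assumptions made for illustration.

```python
def find_peaks(profile, min_height):
    """Return indices of local maxima that reach at least min_height."""
    peaks = []
    for i in range(1, len(profile) - 1):
        is_local_max = profile[i] > profile[i - 1] and profile[i] >= profile[i + 1]
        if is_local_max and profile[i] >= min_height:
            peaks.append(i)
    return peaks


def classify_overlap(gfp, marker, min_height=0.5, tolerance=1):
    """Classify a GFP line scan against an IEM marker line scan.

    'complete': every GFP peak lies within `tolerance` pixels of a marker peak.
    'partial' : at least one GFP peak is spatially separate from all marker peaks.
    'none'    : one of the two channels has no peak above the threshold.
    """
    gfp_peaks = find_peaks(gfp, min_height)
    marker_peaks = find_peaks(marker, min_height)
    if not gfp_peaks or not marker_peaks:
        return "none"
    separate = [
        p for p in gfp_peaks
        if not any(abs(p - m) <= tolerance for m in marker_peaks)
    ]
    return "partial" if separate else "complete"


# Toy, normalized intensity profiles (hypothetical values, not measured data).
marker_scan = [0.0, 0.1, 1.0, 0.2, 0.1, 0.1, 0.2, 1.0, 0.1, 0.0]
gfp_ring = [0.0, 0.2, 0.9, 0.1, 0.1, 0.1, 0.3, 0.8, 0.1, 0.0]
gfp_extra = [0.0, 0.2, 0.9, 0.1, 0.7, 0.1, 0.3, 0.8, 0.1, 0.0]

print(classify_overlap(gfp_ring, marker_scan))   # complete
print(classify_overlap(gfp_extra, marker_scan))  # partial
```

In this toy case the marker shows two envelope peaks; a GFP profile whose peaks all coincide with them is scored as complete overlap, whereas an additional GFP peak between the envelope peaks yields a partial score.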

Figure 6A depicts the schematics of the chimeric proteins and an overview of their intraorganellar localization. Most of the observed protoplasts expressing AtMDT78-AtDIT24-SbtA-GFP and AtMDT78-AtPHT55-SbtA-GFP did not show clear GFP rings around the chloroplast overlapping with TIC20II-mScarlet_I. The plot profiles of these proteins also showed spatially separate peaks (Fig. 6 B–I). The fluorescence signals of AtBASS448-AtPHT55-SbtA-GFP and AtBASS448-AtDIT24-SbtA-GFP showed clear envelope localization, with peaks spatially coinciding with TIC20II-mScarlet_I in most of the observed protoplasts, indicating IEM localization (Fig. 6 J–Q). The signals in the GFP and TIC20II channels of the transfected protoplasts were high in comparison to the autofluorescence control cell (Fig. 6 R–U). Figure 6i represents the percentage of protoplasts showing complete (green) or partial (blue) overlap of GFP and TIC20II-mScarlet_I.

Fig. 6

Sub-organellar localization of SbtA chimeras. A A schematic representation of chimeric proteins and a summary of their sub-organellar localization. Prominent IEM localization is indicated with ‘++’; partial IEM localization is indicated with ‘+’. Each protoplast image is representative of at least three independent experiments. The protoplasts were imaged under a 60× oil immersion objective and manually sorted. B, F, J, N Single-plane image of protoplast showing GFP fluorescence of the chimeric protein indicated in the left panel; C, G, K, O Single-plane image of protoplast under 560 nm laser showing mScarlet fluorescence of IEM marker in magenta; D, H, L, P Composite image of protoplast combining the GFP and IEM marker fluorescence images; E, I, M, Q Composite image of protoplast combining the GFP and IEM marker fluorescence images; R, S, T, U Un-transfected protoplast imaged in different channels and under the same settings, taken as autofluorescence control. Relative pixel intensity plots correspond with the white lines on the merged images. (i) Percentage of protoplasts showing a ring of GFP around the chloroplasts overlapping with inner envelope membrane (IEM) marker (indicated in green) or partial overlap with IEM marker (indicated in blue). The total number of protoplasts counted is indicated above the bars

These results show that the cTP + N-regions of characterized chloroplast IEM proteins are capable of directing foreign multi-TMD proteins to the IEM. A few chimeric proteins show GFP signals confined inside the chloroplasts, suggesting that the proteins are largely directed to the chloroplast interior. However, further experiments are necessary to confirm their localization.

N-region motifs in import

So far, we have shown that the N-region is necessary for the import of multi-TMD proteins, with its sequence context influencing the protein’s chloroplast import efficiency. We further aimed to understand whether this influence reflects the presence of motifs in the N-region that function in import. Our previous results show that the chimeric protein AtMDT97-SbtA-GFP is efficiently imported into the chloroplast. We generated mutants of AtMDT97-SbtA-GFP by substituting amino acids in the AtMDT N-region. In each mutant, one motif of four consecutive amino acids in the N-region was replaced with alanine residues. The sub-cellular localization of the mutants was studied through confocal microscopy and western blotting. Figure 7A depicts the schematics of the chimeric proteins and an overview of their intracellular localization.

Fig. 7

Sub-cellular localization of mutant AtMDT97-SbtA proteins. A A schematic representation of chimeric proteins and a summary of their sub-cellular localization. The cellular compartment where most protein is localized (primary targeting) is indicated with ‘+’. Compartments with trace amounts of the protein and no protein are indicated as ‘±’ and ‘−’, respectively. OT Other, ER Endoplasmic Reticulum, CHL Chloroplast. Each protoplast image is representative of at least three independent experiments. The protoplasts were imaged under a 60× oil immersion objective and manually sorted as OT/ER/CHL localized. B, F, J, N Single-plane image of protoplast showing GFP fluorescence of the chimeric protein indicated in the left panel; C, G, K, O Single-plane image of protoplast showing mCherry fluorescence of ER marker in magenta; D, H, L, P Composite image of protoplast combining the GFP and ER marker fluorescence images; E, I, M, Q Composite image of protoplast combining the GFP and chlorophyll autofluorescence images; R, S, T, U Un-transfected protoplast imaged in different channels and under the same settings, taken as autofluorescence control. (i) Percentage of protoplasts showing GFP fluorescence in endoplasmic reticulum/other (indicated in blue) and chloroplasts (indicated in green). The total number of protoplasts counted is indicated above the bars. (ii) Western blot patterns of chimeric proteins. Total protein extracts from transiently transformed plants were analysed by western blotting using anti-GFP antibody. To confirm that proteins were loaded and transferred equally, blots were stained with Ponceau dye post-transfer (data not shown) and incubated with anti-Tic110 antibody. The results shown are representative of at least three independent western blot analyses. P, Precursor form; M, Mature processed form

We find that all the tested AtMDT97-SbtA-GFP mutants successfully localized to the chloroplast (Fig. 7 B–Q). The transfected protoplasts were compared with a non-transfected protoplast imaged under the same settings (Fig. 7 R–U) to confirm chimeric protein localization. Figure 7i represents the percentage of observed protoplasts showing chloroplast (green) or ER/cytosolic localization (blue) of chimeric proteins. Protoplasts showing GFP signals in both the ER/cytosol and the chloroplasts were counted in both categories.
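Because a protoplast with signal in two compartments contributes to both categories, the plotted percentages need not sum to 100. The following Python sketch illustrates this counting rule under stated assumptions: the function name, the compartment labels, and the four toy observations are hypothetical and are not data from this study.

```python
def localization_percentages(observations):
    """Percentage of protoplasts showing GFP in each compartment.

    `observations` holds one set of compartment labels per protoplast.
    A protoplast with signal in several compartments is counted once in
    each of them, so the percentages can add up to more than 100.
    """
    total = len(observations)
    if total == 0:
        raise ValueError("no protoplasts were scored")
    counts = {}
    for compartments in observations:
        for label in set(compartments):
            counts[label] = counts.get(label, 0) + 1
    return {label: 100.0 * n / total for label, n in counts.items()}


# Toy scoring of four protoplasts (hypothetical, for illustration only).
scored = [{"CHL"}, {"CHL"}, {"CHL", "ER/OT"}, {"ER/OT"}]
print(localization_percentages(scored))
```

Here three of four protoplasts show chloroplast signal and two of four show ER/other signal, giving 75% and 50%, respectively.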

The mature protein band sizes of ~70 kDa in western blot analysis confirmed the successful import of the mutant proteins (Fig. 7ii). The alanine screen of the N-region was conducted by replacing one four-amino-acid motif at a time. We speculate that the N-region motifs, much like the cTP motifs, are functionally redundant. Double mutants of N-region motifs could therefore affect the protein’s import efficiency and reveal motifs relevant to import.
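The design of such an alanine screen, including the double mutants proposed above, can be enumerated programmatically. The Python sketch below is illustrative only: the four motifs are those named in this study, but `TOY_N_REGION` is a made-up placeholder sequence, not the actual AtMDT N-region, and the function names are assumptions.

```python
from itertools import combinations

# Motifs substituted in this study; the sequence below is a made-up
# placeholder that merely contains them, NOT the real AtMDT N-region.
MOTIFS = ["DDEE", "GNEI", "IQLP", "SIGV"]
TOY_N_REGION = "MSDDEEKTGNEIRSIQLPTNSIGVLK"


def alanine_substitute(seq, motif):
    """Replace the first occurrence of `motif` in `seq` with alanines."""
    start = seq.find(motif)
    if start < 0:
        raise ValueError(f"motif {motif} not found in sequence")
    return seq[:start] + "A" * len(motif) + seq[start + len(motif):]


def single_mutants(seq, motifs):
    """One mutant per motif, as in a one-motif-at-a-time alanine screen."""
    return {m: alanine_substitute(seq, m) for m in motifs}


def double_mutants(seq, motifs):
    """All pairwise combinations of motif substitutions."""
    return {
        (a, b): alanine_substitute(alanine_substitute(seq, a), b)
        for a, b in combinations(motifs, 2)
    }


for motif, mutant in single_mutants(TOY_N_REGION, MOTIFS).items():
    print(f"[{motif}/AAAA] {mutant}")
print(len(double_mutants(TOY_N_REGION, MOTIFS)), "double mutants")
```

With four motifs, the pairwise design yields six double mutants, which keeps a follow-up screen small enough to test each combination individually.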

We further examined the sub-organellar localization of the mutant proteins by co-localizing them with the IEM marker TIC20II-mScarlet_I. Figure 8A depicts the schematics of the chimeric proteins and an overview of their intraorganellar localization. Confocal imaging revealed that the fluorescence signals of AtMDT97 [DDEE/AAAA]-SbtA-GFP and AtMDT97 [GNEI/AAAA]-SbtA-GFP partially overlapped with TIC20II-mScarlet_I in a majority of the observed protoplasts (Fig. 8 B–I). The plot profiles of the chimeric proteins showed multiple GFP peaks spatially separate from those of the IEM marker. Interestingly, a majority of protoplasts expressing the AtMDT97 [IQLP/AAAA]-SbtA-GFP chimera showed a clear GFP ring around the chloroplasts (Fig. 8 J–M). The plot profile showed AtMDT97 [IQLP/AAAA]-SbtA-GFP peaks spatially coinciding with the peaks of TIC20II-mScarlet_I, suggesting IEM localization. The AtMDT97 [SIGV/AAAA]-SbtA-GFP protein only partially overlapped with TIC20II-mScarlet_I in most of the observed protoplasts (Fig. 8 N–Q). Its plot profile also showed several GFP peaks spatially separate from the IEM marker peaks. Figure 8i represents the percentage of protoplasts showing complete (green) or partial (blue) overlap of GFP and TIC20II-mScarlet_I. The results suggest that mutations in the N-region potentially affect the sub-organellar localization of multi-TMD proteins. However, additional experiments are necessary to validate these observations.

Fig. 8

Sub-organellar localization of mutant AtMDT97-SbtA proteins. A A schematic representation of chimeric proteins and a summary of their sub-organellar localization. Prominent IEM localization is indicated with ‘++’; partial IEM localization is indicated with ‘+’. Each protoplast image is representative of at least three independent experiments. The protoplasts were imaged under a 60× oil immersion objective and manually sorted. B, F, J, N Single-plane image of protoplast showing GFP fluorescence of the chimeric protein indicated in the left panel; C, G, K, O Single-plane image of protoplast under 560 nm laser showing mScarlet fluorescence of IEM marker in magenta; D, H, L, P Composite image of protoplast combining the GFP and IEM marker fluorescence images; E, I, M, Q Composite image of protoplast combining the GFP and IEM marker fluorescence images; R, S, T, U Un-transfected protoplast imaged in different channels and under the same settings, taken as autofluorescence control. Relative pixel intensity plots correspond with the white lines on the merged images. (i) Percentage of protoplasts showing a ring of GFP around the chloroplasts overlapping with inner envelope membrane (IEM) marker (indicated in green) or partial overlap with IEM marker (indicated in blue). The total number of protoplasts counted is indicated above the bars

Discussion

Efforts to introduce a carbon-concentrating mechanism in C3 plants to increase photosynthetic efficiency are ongoing. Mistargeting foreign membrane proteins due to an inadequate understanding of chloroplast signal sequence requirements is a significant obstacle to achieving these goals (Rottet et al. 2021). Multi-TMD chloroplast proteins require sequences in addition to the canonical cTP for efficient import and envelope targeting. The N-region, a sequence stretch C-terminal to the cTP, is reported to be crucial for the import of native and heterologous multi-TMD proteins to the chloroplast (Rolland et al. 2016). Our results support these claims, as adding the N-region redirected an ER-targeted SbtA chimera to the chloroplast. The N-region’s necessity raises multiple questions regarding its role in the import of multi-TMD proteins. Does the N-region act as a spacer sequence that passively aids import setup? Or does the N-region sequence affect targeting efficiency?

In this study, we find that the sequence context of the N-region influences the chloroplast import of the heterologous multi-TMD protein SbtA. Adding the N-region enabled chloroplast import, but specific N-region sequences misdirected SbtA to the mitochondria. These results suggest that the effectiveness of targeting relies on the compatibility between the cTP and the N-region. Similarly, Rolland et al. (2016) observed that inverting the N-region led to failure in chloroplast targeting of AtpSuT. They concluded that the N-region’s charge distribution might be critical in the import of multi-TMD proteins (Rolland et al. 2016). Our results support this hypothesis, as the charge distribution pattern of the AtMDT and AtBASS4 N-regions is at least partially maintained in the N-regions of AtPLGG1, AtPHT, and AtDIT, all of which efficiently target SbtA to the chloroplast in combination with the AtMDT and AtBASS4 cTPs (Fig. 9). However, the AtpSuT N-region, which has the reverse charge distribution pattern, substantially misdirects the protein to the mitochondria. The importance of the N-region’s sequence context suggests that the region may contain motifs that engage chaperones to facilitate import. Similarly, Shen et al. (2017) found that the EGIKK motif in the region following the rbcS cTP increases import efficiency. However, in our study, substituting individual motifs in the AtMDT N-region did not significantly alter the import efficiency of AtMDT97-SbtA-GFP. Like cTPs, N-regions of chloroplast IEM proteins show poor conservation and lack a consensus sequence. The motifs in the N-region probably exhibit a similar level of functional redundancy. Double mutant studies of the N-region will help clarify the roles of N-region motifs.

Fig. 9

Charge distribution in the N-region is a component that decides compatibility with the cTP. A. Charge distribution in AtMDT cTP-N regions (AtMDT, AtPLGG1, AtPHT, AtDIT, AtpSuT). The positive charges are highlighted in red, and the negative charges are highlighted in blue. The N-regions of AtPLGG1 and AtPHT have a similar charge distribution to the N-region of AtMDT, with negative charges at the N-terminus and neutral/positive charges at the C-terminus. The charge distribution in AtDIT is not polarized but contains negative charges towards the N-terminal end of the N-region. The charge distribution in the AtpSuT N-region is the opposite, with positive charges at the N-terminus and negative charges at the C-terminus. B. Charge distribution in AtBASS4 cTP-N regions (AtBASS4, AtPLGG1, AtPHT, AtDIT, AtpSuT). The positive charges are highlighted in red, and the negative charges are highlighted in blue. The N-region of AtBASS4 is less polarized but contains a more negatively charged N-terminus and positively charged C-terminus, similar to the N-regions of AtPLGG1, AtPHT, and AtDIT. The charge distribution of the AtpSuT N-region is the opposite, with positive charges at the N-terminus and negative charges at the C-terminus
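The charge polarity described for the N-regions can be approximated with a simple per-residue tally. The Python sketch below is a rough illustration under stated assumptions, not the analysis behind Fig. 9: Asp and Glu are scored as −1, Lys and Arg as +1, histidine is treated as neutral, the region is split into two halves, and the category labels and toy sequences are hypothetical.

```python
# Simplified side-chain charges at near-neutral pH (assumption: His neutral).
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def charge_profile(seq):
    """Per-residue charge (-1, 0, or +1) along the sequence."""
    return [CHARGE.get(aa, 0) for aa in seq.upper()]


def half_charges(seq):
    """Net charge of the N-terminal and C-terminal halves of the sequence."""
    profile = charge_profile(seq)
    mid = len(profile) // 2
    return sum(profile[:mid]), sum(profile[mid:])


def polarity(seq):
    """Coarse label for the charge distribution of an N-region."""
    n_half, c_half = half_charges(seq)
    if n_half < 0 and c_half >= 0:
        return "negative-N/positive-C"
    if n_half > 0 and c_half < 0:
        return "positive-N/negative-C"
    return "not polarized"


# Toy sequences (hypothetical, not the N-regions analysed in this study).
for name, seq in [("toy_1", "DDEEGSKRKT"), ("toy_2", "KRKSGTDEED"), ("toy_3", "GSTGSA")]:
    print(name, half_charges(seq), polarity(seq))
```

A half-by-half tally is a deliberately coarse choice; a sliding-window net charge would resolve intermediate cases, such as a region that is not polarized overall but carries negative charges near its N-terminal end.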

TMD characteristics affect the intracellular sorting of membrane proteins (Borgese & Fasana 2011; Rayner & Pelham 1997). We found that the cTP + N-regions of native chloroplast proteins did not efficiently target SbtA to the chloroplast, resulting in a significant amount of protein being misdirected to the ER or mitochondria. Furthermore, the chloroplast targeting efficiency of the AtDIT cTP + N-region differed for BicA and SbtA proteins. These results indicate an influence of the cargo TMDs on protein sorting. A recent study implicates hydrophobic amino acid motifs in the TMD, TMD charge, and codon usage bias in heterologous protein sorting (Engel et al. 2023). However, introducing mutations in the TMD regions of heterologous proteins for efficient sorting might compromise the protein’s functioning. Our findings demonstrate that specific combinations of cTPs and N-regions from different IEM proteins (AtMDT78-AtPHT55, AtMDT78-AtDIT24, AtBASS448-AtPHT55, AtBASS448-AtDIT24) are more effective in directing SbtA to the chloroplast than native cTP + N-region pairs. Testing combinations from various proteins may improve the import efficiency of heterologous proteins that fail to be imported efficiently through native cTP + N-regions.

Multiple studies on chloroplast IEM proteins have concluded that the TMDs contain a secondary signal determining their IEM localization (Froehlich & Keegstra 2011; Viana et al. 2010). The nature of the IEM signal is unclear but potentially related to the TMD’s biochemical properties and behaviour in transit (Singhal et al. 2017). In multi-TMD IEM proteins studied thus far, a single or a pair of TMDs are crucial for IEM integration (Okawa et al. 2014; Singhal et al. 2017). However, the TMDs do not function as the IEM signal when taken out of the protein context and fail to target heterologous proteins to the IEM (Okawa et al. 2014; Singhal et al. 2017). Our study shows that cTP + N-regions of chloroplast IEM multi-TMD proteins can effectively target bacterial TMDs to the IEM, suggesting that the N-terminal sequences contain IEM targeting elements. Similarly, Singhal et al. (2017) found the N-region of a thylakoid multi-TMD protein to contain thylakoid targeting signals. Our preliminary confocal study on IEM targeting suggests that N-region motif substitutions can alter the sub-organellar localization of the SbtA chimera. Chloroplast fractionation studies of N-region mutants will help confirm these speculations and identify motifs determining IEM localization.

Conclusion

Targeting heterologous multi-TMD proteins to the chloroplast requires a bipartite signaling sequence composed of the cleavable cTP and the non-cleavable N-region. Our study shows that the cTP and the N-region can function in new sequence contexts but require compatibility to prevent mistargeting the multi-TMD proteins to other cellular compartments. Specific cTP + N-region sequences are found to target different cargo TMDs with varying efficiencies. The influence of the cargo TMDs on targeting efficiency makes it difficult to design a single sequence that can effectively target diverse foreign multi-TMD proteins to the chloroplast. Our study found that specific N-regions targeted SbtA more efficiently in combination with cTPs from different proteins. Testing chimeric cTP and N-region pairs might help optimize the chloroplast targeting signal for a multi-TMD protein. A significant problem of utilizing non-cleavable targeting elements is the possibility of forming scars that impede protein functioning. Our study shows that the N-region can be mutated without compromising its targeting efficiency. This property can provide flexibility in designing non-cleavable targeting elements.