Introduction

Plastids are semiautonomous subcellular organelles whose functions largely depend on coordinated nuclear gene expression. Plastid-generated signals mediate this coordination via a process called retrograde signaling, which modulates the expression of many nuclear genes, most of them associated with developmental pathways and stress responses to environmental fluctuations (Crawford et al. 2018; Dietz et al. 2019; Mielecki et al. 2020). Key players controlling the coordinated transcriptional response to chloroplast redox status in the cell nucleus are transcription factors (TFs), which regulate the expression of target genes by recognizing and binding to short cis-regulatory DNA sequences (Franco-Zorrilla and Solano 2017).

Several chloroplast-derived signals involved in retrograde signaling have been identified, including tetrapyrrole intermediates (Mochizuki et al. 2001), the isoprenoid precursor methylerythritol cyclodiphosphate (MEcPP), 3’-phosphoadenosine 5’-phosphate (PAP), sugars and other metabolites (Estavillo et al. 2011; Xiao et al. 2012; Martínez-Noël and Tognetti 2018), calcium (Guo et al. 2016) and chloroplast-to-nucleus shuttling proteins (Sun et al. 2011; Isemer et al. 2012). A major component of retrograde signaling is mediated via sensing of chloroplast redox state and the levels of plastid-related reactive oxygen species (ROS; Gadjev et al. 2006; Willems et al. 2016). For instance, the expression of about 750 nuclear genes is regulated by the oxidation state of the plastoquinone pool (Jung et al. 2013). The mechanisms by which plastoquinone redox changes are perceived and transmitted out of the chloroplast remain unknown, although the serine/threonine protein kinase State Transition 7 (STN7) has been identified as an intermediate in this pathway (Bellafiore et al. 2005; Chan et al. 2016).

Among chloroplast ROS, both H2O2 and singlet oxygen (1O2) participate in retrograde signaling. As the most stable oxygen derivative, H2O2 can exit the chloroplast through envelope aquaporins (Mubarakshina Borisova et al. 2012) or reach the nucleus directly via stromules (Caplan et al. 2015), bypassing the cytosol and affecting the expression of many stress-associated genes (Mullineaux et al. 2020). Products encoded by these genes include heat-shock proteins, mitogen-activated protein kinase (MAPK) kinases and transcriptional regulators of the dehydration-responsive element-binding (DREB), zinc finger basic helix–loop–helix (bHLH) and WRKY families (Crawford et al. 2018). Unlike H2O2, 1O2 has too short a lifespan to migrate from plastids, but can trigger different retrograde signaling pathways by reacting with proteins or metabolites in this organelle (Shumbe et al. 2017; Dogra et al. 2019). Chloroplast ROS may also promote retrograde responses by modulating the synthesis of PAP, MEcPP and Mg-protoporphyrin IX (Crawford et al. 2018; Mielecki et al. 2020).

In plants, all reducing power derived from the photosynthetic electron transport chain (PETC) is mobilized through ferredoxin (Fd), an iron-sulfur carrier protein which acts as an electron distribution hub able to provide feedback on the redox state of the chloroplast (Scheibe and Dietz 2012; Pierella Karlusich et al. 2014; Supplementary Figure S1a). Most environmental hardships cause down-regulation of Fd levels, which in turn leads to acceptor side limitation, over-reduction of the PETC and misrouting of the excess of reducing power to oxygen, with concomitant ROS generation (Pierella Karlusich et al. 2014, and references therein). In many cyanobacteria and algae there is an isofunctional protein named flavodoxin (Fld; Supplementary Figure S1a), which is induced under multiple stress conditions to replace declining Fd (Pierella Karlusich et al. 2014). The Fld-coding gene disappeared from the plant genome (Pierella Karlusich et al. 2015) but, notably, the recombinant expression of a plastid-targeted cyanobacterial Fld in various plant species (including tobacco and potato, two members of the economically important Solanaceae family) improved delivery of reducing equivalents to productive pathways of the chloroplast, which in turn restricted plastid production of all ROS (Supplementary Figure S1b) and increased tolerance to abiotic (Tognetti et al. 2006, 2007; Zurbriggen et al. 2008; Coba de la Peña et al. 2010; Li et al. 2017; Pierella Karlusich et al. 2020) and biotic stresses (Zurbriggen et al. 2009; Rossi et al. 2017).

When Fld-expressing plants were grown under normal conditions, namely, in the absence of stress, Fd and Fld accumulated to similar levels in the transgenic lines (Tognetti et al. 2006; Ceccoli et al. 2012; Mayta et al. 2019), resulting in the introduction of a new player in the network of chloroplast electron distribution. Indeed, combination of the two electron carriers did have phenotypic consequences, as reflected by a more oxidized plastoquinone pool (Gómez et al. 2020), higher pigment contents and photosynthetic activity per leaf cross-section (Tognetti et al. 2006; Ceccoli et al. 2012), smaller antenna size, and delayed leaf senescence (Mayta et al. 2018). These Fld-dependent phenotypic features under normal growth conditions were accompanied by significant changes in the transcriptional profiles of the leaves: about 1000 and 1500 transcripts changed their expression patterns in response to chloroplast-targeted Fld in tobacco and potato, respectively (Pierella Karlusich et al. 2017, 2020). A large fraction of these Fld-responsive genes might be potentially involved in stress acclimation as they respond in the same direction in wild-type (WT) plants exposed to abiotic and biotic stresses (Pierella Karlusich et al. 2017, 2020; Supplementary Figure S1c).

In the current research, we used the transcriptomic profiles of the Fld-expressing solanaceous plants as a tool to study the role of chloroplast ROS and/or redox status in the expression of nuclear genes involved in stress acclimation. We first searched for transcriptomic imprints in these plants associated with the chemical identity of ROS and their subcellular site of production by using vector-based algorithms trained with a wide range of ROS-responsive transcriptomes from the literature. We then identified TFs potentially involved in Fld-associated signaling by introducing our datasets in a large-scale interaction network that integrates TF binding- and expression-based regulatory interactions reported in the literature. Finally, we analysed the architecture of the promoters of the genes modulated by Fld and/or stress by searching for conserved cis regulatory motifs. Our analysis provides a genome-wide overview of the cis- and trans-acting regulatory architecture of nuclear genes modulated by chloroplast redox status and/or stress conditions in leaves of Solanaceae, providing relevant information to improve our understanding of the basic mechanisms of plant gene regulation and to expand the toolbox of available promoters for use in plant biotechnology.

Methods

Transcriptomic datasets

The microarray datasets for leaves of tobacco (Nicotiana tabacum cv. Petit Havana) and potato (Solanum tuberosum cv. Solara) plants expressing Fld in plastids (pfld lines) and their WT siblings were downloaded from NCBI (GEO Series accession numbers GSE92596 and GSE149503; Pierella Karlusich et al. 2017, 2020). In addition to the leaf microarray profiles generated from plants grown under normal conditions, we included those generated from tobacco plants inoculated with the hemibiotrophic bacterium Xanthomonas campestris pv. vesicatoria (Xcv) and potato plants subjected to water withdrawal for 3 days (Pierella Karlusich et al. 2017, 2020; Supplementary Figure S1c).

Data processing and statistical analysis were carried out with the Bioconductor library limma (Ritchie et al. 2015). Background correction and normalization were performed using the 'normexp' and quantile methods, respectively. We only considered probes whose intensity was more than 10% above background on at least one genotype/treatment combination. An empirical Bayes method with moderated t-statistic was employed for determination of the genes with statistically significant changes, whereas the Benjamini and Hochberg’s method was used to control false discovery rate (FDR). Differentially expressed genes were identified from pairwise comparisons when FDR < 0.05 and fold-change was > 2 (induction) or < 0.5 (repression).
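The selection rule described above (Benjamini and Hochberg adjustment followed by the FDR and fold-change thresholds) can be illustrated with a short sketch. The analysis itself was carried out in R with limma; the Python code below is only an illustration, it assumes that raw p-values from the moderated t-test are already available, and the function names (benjamini_hochberg, classify, call_degs) are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(fold_change, fdr, fdr_cutoff=0.05, up=2.0, down=0.5):
    """Apply the thresholds stated in the text: FDR < 0.05 and FC > 2 or < 0.5."""
    if fdr < fdr_cutoff:
        if fold_change > up:
            return "induced"
        if fold_change < down:
            return "repressed"
    return "unchanged"


def call_degs(records):
    """records: list of (gene_id, fold_change, raw_p_value) tuples."""
    fdrs = benjamini_hochberg([p for _, _, p in records])
    return {gene: classify(fc, q) for (gene, fc, _), q in zip(records, fdrs)}
```

Note that fold change is expressed here on a linear scale, as in the text; on a log2 scale the same thresholds would be +1 and -1.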

ROSMETER analysis

ROSMETER (Rosenwasser et al. 2013; https://irosmeter.g-incpm.com/noa/cal.php) is a bioinformatic tool for assessing transcriptomic features associated with ROS-related responses. The input transcriptomes of interest are compared by vector-based correlations against those from Arabidopsis plants accumulating ROS in different subcellular compartments due to chemical applications or mutagenesis of antioxidant enzymes (Rosenwasser et al. 2013; Supplementary Table S1). Accordingly, we first retrieved the Arabidopsis orthologues of potato and tobacco genes by using the annotation tables at the GoMapMan website resource (Ramšak et al. 2014; http://www.gomapman.org/export/current/generic). Input data for each gene consisted of the fold change and the respective FDR of its significance for the two samples to be compared. The output file represents correlation values between known oxidative stresses or treatments, and the transcriptomes of the two cultivars.
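As a rough illustration of what a vector-based correlation between a query transcriptome and a reference ROS signature involves, the sketch below maps solanaceous genes to their Arabidopsis orthologues and correlates the two fold-change vectors over the shared genes. It assumes a plain Pearson correlation on log fold changes, which is not necessarily the exact metric or weighting implemented in ROSMETER; all function and variable names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


def signature_correlation(query, reference, orthologs):
    """Correlate a query profile with a reference signature.

    query:     {solanaceous gene id: log fold change}
    reference: {Arabidopsis gene id: log fold change}
    orthologs: {solanaceous gene id: Arabidopsis gene id}
    Only genes with an orthologue present in the reference are used.
    """
    pairs = [
        (query[g], reference[orthologs[g]])
        for g in query
        if g in orthologs and orthologs[g] in reference
    ]
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    return pearson(xs, ys)
```

A value close to -1 would correspond to the negative correlations described in the Results (for instance, with the methyl viologen signature), and a value close to +1 to the positive ones.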

Prediction of transcription factors regulating a set of differentially expressed genes

We mined a recently published regulatory network (De Clercq et al. 2021) to identify TFs controlling genes modulated by chloroplast redox status. The integrated gene regulatory network (iGRN) covers 1.7 million regulatory interactions in Arabidopsis (1,491 TFs and 31,393 target genes), and was constructed by combining a wide range of different regulatory input networks capturing empirical complementary information about DNA motifs, open chromatin, and TF binding- and expression-based regulatory interactions with a supervised learning approach (De Clercq et al. 2021).

The Arabidopsis orthologues of potato and tobacco genes were obtained as described above, and the corresponding genes modulated by Fld under control conditions were used as input to mine the full iGRN using overlap and TF enrichment analysis through the http://bioinformatics.psb.ugent.be/webtools/iGRN/ website.

The enrichment of TFs associated with (i.e., regulating) our set of input genes was determined. Enrichment statistics were computed using the hypergeometric distribution combined with Benjamini–Hochberg correction for multiple hypotheses testing (a q-value cutoff of 1e−3 is applied).
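The hypergeometric enrichment statistic mentioned above can be sketched as follows. This is a minimal illustration and not the iGRN web tool's implementation: it assumes that each TF is represented by its set of predicted target genes and that the background is the total number of target genes in the network; function names are hypothetical. The resulting p-values would then be corrected with the Benjamini–Hochberg procedure and filtered at q < 1e−3.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    total = comb(population, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(max(k, 0), upper + 1)
    )
    return hits / total


def tf_enrichment(tf_targets, input_genes, universe_size):
    """Return {tf: p-value} for over-representation of each TF's targets.

    tf_targets:    {tf id: set of target gene ids}
    input_genes:   set of differentially expressed gene ids
    universe_size: number of genes in the background (assumed here to be
                   all target genes covered by the network)
    """
    pvalues = {}
    for tf, targets in tf_targets.items():
        overlap = len(targets & input_genes)
        pvalues[tf] = hypergeom_sf(
            overlap, universe_size, len(targets), len(input_genes)
        )
    return pvalues
```

For example, with a background of 10 genes, a TF with 4 targets and an input list of 3 genes of which 2 are targets, the probability of observing at least 2 targets by chance is 40/120, i.e. one third.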

De novo motif discovery

For the reference genome of S. tuberosum and N. tabacum, a FASTA file containing the 1000-bp flanking regions upstream of the transcriptional start sites (TSS) for all available genes was generated using the ‘blastdbcmd’ tool of the Blast software (Altschul et al. 1997). The transcription start site and strand information for every gene were obtained from the ITAG1.0 and PGSC annotation files for potato and the Nitab-v4.5 annotation file for tobacco (Xu et al. 2011; Sharma et al. 2013; Fernandez-Pozo et al. 2015; Edwards et al. 2017). Two subsets were created from this general file, each containing only the upstream flanking regions of the genes that were found to be significantly over-expressed or under-expressed in response to chloroplast Fld expression and/or stress. Motif discovery was performed separately for each set of differentially expressed genes.
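Extracting upstream flanking regions requires taking gene orientation into account: for genes on the minus strand, the promoter lies at higher genomic coordinates and must be reverse-complemented. The sketch below illustrates this logic for an in-memory chromosome sequence. It is not the blastdbcmd-based procedure used here; it assumes 1-based TSS coordinates and a "+"/"-" strand label, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(chrom_seq, tss, strand, length=1000):
    """Return up to `length` bp immediately upstream of the TSS.

    tss is a 1-based genomic coordinate. The result is oriented 5'->3'
    relative to the gene, so its last base is position -1 from the TSS.
    Regions running off the end of the sequence are truncated.
    """
    if strand == "+":
        start = max(0, tss - 1 - length)
        return chrom_seq[start:tss - 1]
    if strand == "-":
        end = min(len(chrom_seq), tss + length)
        return reverse_complement(chrom_seq[tss:end])
    raise ValueError("strand must be '+' or '-'")
```

Because both strands are returned in the gene's orientation, positions within the extracted sequences can be compared directly across genes when plotting motif distributions relative to the TSS.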

Two different programs were used for de novo motif discovery: Seeder v. 0.01 and Weeder v. 2.0 (Fauteux et al. 2008; Zambelli et al. 2014). Each list of differentially expressed promoters was used as input for the motif discovery programs.

The FASTA file containing all 1000-bp upstream regions in the genomes was used to generate the background files in Seeder and Weeder. All motif discovery programs were run using default parameters to find motifs in both the forward and reverse complementary strands.

Motif annotation

All motifs discovered were uploaded to the STAMP web server for visualization and search in the PLACE database for potential matches (Higo et al. 1999; Mahony and Benos 2007). The alignment method used the ungapped Smith-Waterman algorithm, which compares segments of all possible lengths and optimizes the similarity measure (Mahony and Benos 2007). Diagrams for visualization of nucleotide frequencies in motifs were created using Weblogo v.2.860 (Crooks et al. 2004). Significant matches of motifs were chosen based on the E-value of alignment, considering 10e−5 as a maximum for this parameter.

Functional enrichment analysis among genes with shared cis motifs in their promoter regions

Pathway over-representation analyses among genes with shared cis motifs in their promoters were performed with PageMan (Usadel et al. 2006) using Fisher’s exact test with Bonferroni correction (FDR < 0.05). Mapman ontology was used for functional annotation (Thimm et al. 2004) employing the corresponding mapping files from the GoMapMan website resource (Ramšak et al. 2014; www.gomapman.org).

Graphical analysis

Graphs were generated in the R language (http://www.r-project.org/) with the ggplot2 library (Wickham 2009), using the geom_density function for plotting the positional distribution of the discovered cis motifs, and geom_bar for plotting the frequency bar graphs. The upset function from the UpSetR package was employed for the intersection graphs of the different cis motifs (Conway et al. 2017).

Results

Potato and tobacco leaves expressing a plastid-targeted Fld shared ROS-related transcriptomic imprints

Since the expression of a plastid-targeted Fld in tobacco and potato leaves was shown to mitigate ROS production during stress episodes by changes in the chloroplast redox state (Tognetti et al. 2006; Zurbriggen et al. 2009; Pierella Karlusich et al. 2020), we analysed if these transgenic plants exhibit transcriptomic imprints that could be related to the chemical identity of ROS and their subcellular origin. We therefore employed the ROSMETER bioinformatic tool to determine vector-based correlations between different Arabidopsis transcriptomes, chosen for their ROS signatures, and our data of interest. The stored Arabidopsis results correspond to a wide range of transcriptomes obtained in experiments involving knock-out (KO), knocked-down (KD) or antisense (AS) lines deficient in antioxidant enzymes, or WT plants subjected to chemical applications that lead to increases in ROS production in different cellular compartments (Rosenwasser et al. 2013; Supplementary Table S1). After retrieving the Arabidopsis orthologous genes from tobacco and potato using the annotation tables at the GoMapMan website resource (Ramšak et al. 2014; see Methods), we carried out the analysis of our leaf transcriptomes (Fig. 1; Supplementary Table S2).

Fig. 1

ROS-related transcriptomic imprints in potato and tobacco leaves from Fld-expressing plants grown under normal conditions. The ROSMETER bioinformatic tool (Rosenwasser et al. 2013) was employed to determine vector-based correlations between the transcriptomes of Arabidopsis with different ROS signatures due to treatments or mutations (left) and those of the solanaceous transgenic plants expressing plastid-targeted flavodoxin (right). The colour intensity is proportional to the correlation values (red and green for positive and negative correlations, respectively). Abbreviations: AOX1a, mitochondrial alternative oxidase 1a; APX1, ascorbate peroxidase 1; SOD, superoxide dismutase

Under control conditions, the tobacco and potato expression patterns associated with plastid-targeted Fld showed common correlations with various ROSMETER signatures (Fig. 1). Negative correlations were observed in both species with the transcriptomic responses caused by a 6-h treatment with methyl viologen (MV), which generates superoxide in chloroplasts (Fig. 1). This ROS-related transcriptomic signature is likely connected to the potential effects of Fld presence as a general antioxidant in plastids. Negative correlations were also detected with a cytoplasmic ascorbate peroxidase (KO-APX1) knock-out mutant exposed to high light, suggesting that Fld activity in chloroplasts contributes to lower oxidative levels in the cytosol. Positive correlations were instead found with a mitochondrial alternative oxidase mutation (TDNA-AOX1a) and with exposure to a spray of the catalase inhibitor 3-aminotriazole (AT) (Fig. 1). The first is related to mitochondrial oxidative stress, whereas the second leads to increased H2O2 in peroxisomes (Rosenwasser et al. 2013). These results connect with the transcriptomic data of both plant species through the down-regulation of genes in the category ‘redox.dismutases and catalases’, which is indicative of the direction of change in the redox status of peroxisomes (see Table S2 in Pierella Karlusich et al. 2017, and Table S1 in Pierella Karlusich et al. 2020).

Association of transcriptomic signatures affected by Fld presence and stress exposure with ROS-related profiles

As described in both transcriptomic studies, drought and biotic stress led to a major reconfiguration of expression patterns in leaf samples taken at early stages of the treatments, prior to the manifestation of stress symptoms (Pierella Karlusich et al. 2017, 2020). The protective effects of Fld were accompanied by amelioration of stress-dependent transcriptional changes, most conspicuously, of those involving repression of photosynthetic genes (Pierella Karlusich et al. 2017, 2020). We thus extended the ROSMETER analysis to identify common correlations regarding oxidative status under stress conditions in transgenic plants and their WT siblings.

In the case of potato leaves exposed to drought, the previously described positive correlation with Arabidopsis TDNA-AOX1a mutants under control conditions was maintained after application of the stress (Supplementary Figure S2a). Negative correlations were predominant with KO-APX1 plants exposed to high light, with the 6-h MV treatment and with TDNA-AOX1a mutants exposed to mild light and drought stress (Supplementary Figure S2a). Giraud et al. (2008) have shown that the relative abundance of genes encoding antioxidant defence components located in chloroplasts was the same or higher in the TDNA-AOX1a lines compared to their WT siblings. In addition, AOX deficiency has been shown to enhance the expression of ROS scavenging enzymes in both Arabidopsis and tobacco (Amirsadeghi et al. 2006; Giraud et al. 2008). The collected evidence suggests that AOX function in metabolic and signaling homeostasis is particularly important during stress (Vanlerberghe 2013). The negative correlation with a 24-h MV treatment, observed in both WT and pfld potato leaves, was specific to water limitation (Supplementary Figure S2a). This observation suggests that production of superoxide in chloroplasts and mitochondria was not among the early (pre-symptomatic) responses to water deficit.

A positive correlation was found in WT potato leaves exposed to drought with the application of rotenone, an inhibitor of mitochondrial complex I which also promotes superoxide formation but specifically in mitochondria (Supplementary Figure S2a). The application of rotenone in Arabidopsis has been related to the accumulation of glycolytic end products and associated amino acid pools, together with decreases in early tricarboxylic acid cycle-derived products (Garmier et al. 2008). It is worth noting that a similar increase in amino acid levels was observed in WT potato leaves under water limitation, which was largely prevented by Fld expression (Pierella Karlusich et al. 2020). On the other hand, drought-exposed pfld plants showed positive correlations with catalase2 mutants, which accumulate H2O2 in peroxisomes, and with AS-AOX1 (Supplementary Figure S2a). Both responses were described to be significant at the initial stage of drought in Arabidopsis (Rosenwasser et al. 2013). Negative correlations were instead displayed with flu mutants, which are associated with production of singlet oxygen in chloroplasts due to the accumulation of photodynamic chlorophyll precursors, and with knocked-down chloroplast superoxide dismutase (KD-SOD) and KO-APX1 lines exposed to high light for 3 h (Supplementary Figure S2a). This suggests that the more oxidized state of the PETC due to Fld presence limits all ROS propagation, including singlet oxygen.

Common ROS regulations due to the effect of Fld were also found under biotic stress in tobacco. Indeed, there were significant correlations with more than half of the ROSMETER signatures in both WT and pfld plants (Supplementary Figure S2b). Positive correlations with KD-SOD, AS-AOX1, and 30-min exposure to MV (Supplementary Figure S2b) are related to superoxide formation in chloroplasts and mitochondria. They resulted from Fld-dependent down-regulation of transcripts encoding chloroplast SOD in tobacco plants under control and stress conditions (Pierella Karlusich et al. 2017). On the other hand, negative correlations were obtained with KO-APX1 exposed to 3 h of high light, 6 h of MV treatment, and 3 h of rotenone application (Supplementary Figure S2b). Although APX1 was not detected among tobacco genes affected by Fld under control conditions, it was induced upon Xcv infiltration in both tobacco lines (Pierella Karlusich et al. 2017).

In conclusion, the ROS transcriptomic features under stress were very different between the two species, as anticipated from the different nature of the applied adverse conditions. A much more drastic reconfiguration of transcriptomic ROS-associated patterns was observed under biotic stress (74 indexes with correlations > 0.2 or < -0.2) than under drought (37 indexes).

Network-based identification of transcription factors potentially involved in the modulation of Fld-responsive genes

We also investigated whether individual TF functions can be involved in the signaling network modulated by Fld expression in the chloroplast. We used a supervised learning approach that integrates large-scale functional data about TF binding- and expression-based regulatory interactions in Arabidopsis, referred to as iGRN (see Methods). This approach was shown to predict known regulatory interactions obtained by state-of-the-art experimental methods, and can therefore be used for predicting the functions of TFs in a given environmental condition (De Clercq et al. 2021). Thus, we created a list of Arabidopsis orthologues to the genes shared between tobacco and potato that are regulated by chloroplast Fld under control conditions, and used it to mine the full iGRN using overlap and TF enrichment analysis. We obtained a total of 750 TFs potentially regulating our set of input genes (Supplementary Table S3) and we focused on the 50 with the lowest p-values. More than half of them (28 TFs) were related to ROS or stress responses, either by experimental evidence (named as “ROS” or “STRESS” type TFs in Table 1) or exclusively by their position in the iGRN (defined as “NOVEL” in Table 1 as they are unknown TFs with potential involvement in these regulations). Three of them were associated with biotic and (mostly) abiotic stresses, one belonging to the bHLH and the other two to the Cys2/His2-type (C2H2) TF families (Sun et al. 2018; Han et al. 2020; Table 1). In addition, the APETALA2/ethylene-responsive binding protein (AP2-EREBP) TF family was overrepresented among TFs related to stress responses (Table 1).

Table 1 TFs predicted to regulate Fld-responsive nuclear genes

Nuclear genes modulated by the expression of a plastid-targeted Fld contain overrepresented cis motifs in their promoters

To discover conserved cis motifs in the promoters of the nuclear genes regulated by the presence of plastid-targeted Fld, we searched for overrepresented DNA patterns in the upstream flanking regions of these differentially expressed genes using Seeder and Weeder tools with default parameters (Fauteux et al. 2008; Zambelli et al. 2014). We obtained 50 motifs for tobacco and potato using Weeder and 18 motifs using Seeder (Supplementary Table S4), and we focused on the top motifs with the highest score for STAMP alignment (E-value < 10e−5) for further analysis (Tables 2 and 3). Four of the retrieved consensus sequences were similar (or even identical) in tobacco and potato (Table 2).

Table 2 Regulatory motifs discovered in the upstream flanking region of Fld-induced genes in potato and tobacco leaves
Table 3 Regulatory motifs discovered in the upstream flanking region of Fld-repressed genes in potato and tobacco leaves

We also searched for enriched functional categories in the list of genes containing each of the discovered motifs in their promoters (last column in Tables 2 and 3). Among the genes up-regulated by Fld in potato leaves, most cis motifs were found in genes encoding pathogenesis-related (PR) proteins, proteasome components and enzymes involved in cell wall metabolism, while one cis motif was also detected in peroxidase-coding genes. Regarding potato genes down-regulated by Fld, the motifs were mainly found in genes related to ethylene metabolism and metal handling. Among genes up-regulated by Fld in tobacco, cis motifs were detected in those associated with cell wall metabolism, PR proteins and the proteasome (as in potato), and also in those encoding stress-related ribosomal proteins and TFs of the auxin/indole-3-acetic acid (Aux/IAA) family. Among the genes down-regulated by Fld in tobacco, cis motifs were exclusively detected in TFs of the Aux/IAA family.

Regarding the organization of the cis elements within the promoters, most genes had 1–2 copies of the same cis motif, although in some cases there were as many as 7–9 copies (Fig. 2a). In addition, most genes contained different types of cis motifs in their promoter regions, and indeed a significant fraction of the genes presented all cis motifs discovered for the corresponding type of regulation, namely, either induced or repressed by Fld (Fig. 2b), suggesting an additive effect of these elements in the regulation of the corresponding genes.
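Counting the copies of a consensus motif within a promoter, on both the forward and reverse complementary strands, can be sketched as below. This is an illustrative exact-match scan with IUPAC degeneracy, not the scoring used by Seeder or Weeder; sites are identified by their start position on the forward strand so that palindromic motifs are not counted twice. All names are hypothetical.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def motif_sites(promoter, consensus):
    """Return the set of forward-strand start positions of motif sites.

    Matches on the reverse strand are mapped back to forward coordinates,
    so a site matching on both strands at the same position counts once.
    """
    promoter = promoter.upper()
    k = len(consensus)
    # A lookahead lets overlapping occurrences be found.
    pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in consensus.upper()) + "))")
    sites = {m.start() for m in pattern.finditer(promoter)}
    rc = reverse_complement(promoter)
    sites |= {len(promoter) - m.start() - k for m in pattern.finditer(rc)}
    return sites


def count_motif(promoter, consensus):
    """Number of distinct motif sites in the promoter (both strands)."""
    return len(motif_sites(promoter, consensus))
```

Applying count_motif to every promoter in a gene set and tabulating the results would give copy-number distributions of the kind summarized in Fig. 2a.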

Fig. 2

Frequency of the identified cis motifs in the promoters of nuclear genes modulated by the expression of a chloroplast-targeted Fld in leaves of potato and tobacco plants grown under normal conditions. A) Repetitions of the cis motifs in the same promoter region. B) Counts for the cis motif combinations along the same promoter region

Screening of promoter databases for cis motifs present in Fld-induced genes

We screened the predicted motifs against the PLACE database (Higo et al. 1999), and found that all of them had been previously described in the literature (first column in Tables 2 and 3), but to our knowledge their links to chloroplast redox status had not been recognized before.

Among the overrepresented cis elements in genes up-regulated by Fld in both species, the consensus sequence CTAATA was distributed in two main positions within the promoters, one distal and the other proximal to the TSS (Table 2). This motif was first described in the promoter of the cucumber gene for NADPH-protochlorophyllide reductase, an enzyme involved in chlorophyll biosynthesis (Fusada et al. 2005). The potential induction of this gene by Fld concurs with previous reports indicating that pfld plants contain higher chlorophyll a per leaf fresh weight (Gómez et al. 2020). In addition, this cis motif was shown to be critical for cytokinin-enhanced in vitro binding of nuclear proteins to the promoter of NADPH-protochlorophyllide reductase (Fusada et al. 2005). High cytokinin levels are associated with retarded senescence and protective responses against oxidative damage (Hönig et al. 2018). This is also consistent with previous work showing that Fld-expressing tobacco plants displayed elevated contents of three cytokinins in mature and senescent leaves, and exhibited delayed senescence with mitigation of chloroplastic ROS production (Mayta et al. 2018).

Two further regulatory motifs found in genes positively regulated by plastid-targeted Fld were cis elements first characterized in the promoter of the extA gene from Brassica napus, encoding the cell wall protein extensin (Elliott and Shirsat 1998). They are the 'quantitative activator region' ACACGTT and the 'root specific region' GTATA, and both have positive effects on extA expression. The first of these motifs was detected in potato and the second in tobacco (Table 2), both distributed between −1000 and −500 bp from the TSS. The extA gene has been shown to be expressed in leaves under wounding and tensile stress (Elliott and Shirsat 1998).

Another cis motif overrepresented in the promoters of Fld-induced genes in both potato and tobacco leaves was a positively acting element called PE1, which was initially described in the phytochrome A3 promoter of Avena sativa (Bruce and Quail 1990; Table 2). It had a bimodal distribution along the promoters, with the two peaks separated by ~500 bp. It has been demonstrated that a positive factor 1 belonging to the high-mobility group I-Y (HMG-I/Y) protein family binds preferentially to PE1 (Nieto-Sotelo and Quail 1994). HMG-I/Y proteins are reported to play an ‘architectural’ role by modelling chromatin structure in the vicinity of genes. A different HMG-I/Y binding site sequence (TTTTAG) named ATRICHPSPETE was also found in promoters of potato genes up-regulated by Fld (Table 2), displayed a similar bimodal density distribution, and was reported to enhance activity of the pea plastocyanin gene promoter (Sandhu et al. 1998).

A motif only found in promoters of potato genes up-regulated by Fld was the consensus TTCTAT sequence called Box II, which tends to be located in the distal part of the promoter region (Table 2). Box II was initially reported in the promoter of the nuclear gene coding for the photosystem II subunit PsbR of the drought-tolerant species Prosopis juliflora (Suja and Parida 2008). Interestingly, this sequence is also found in a plastid-encoded gene of tobacco which codes for the β-subunit of the chloroplast ATP synthase (AtpB; Kapoor and Sugiura 1999).

A number of cis regulatory elements present in the promoters of genes up-regulated by chloroplast Fld were only found in tobacco. Among them, both motif discovery tools identified the octopine synthase (ocs) element binding factor 5 (OBF5) recognition site first described in the promoter of an Arabidopsis GST gene encoding a glutathione S-transferase (Chen et al. 1996; Table 2). The distribution of this motif along the promoters showed a small peak at about -700 bp from the TSS. The GST gene is known to be induced by auxin, salicylic acid (SA), H2O2 and oxidative stress (Chen et al. 1996), whereas the bZIP-type DNA binding protein OBF5 was up-regulated by cadmium treatment in Arabidopsis (Suzuki et al. 2001).

Another motif overrepresented in genes induced by Fld only in tobacco was called JERECRSTR, a jasmonate- and elicitor-responsive element which was found in the strictosidine synthase gene promoter (Menke et al. 1999; Table 2). This motif was homogeneously distributed throughout the distal region, from −1000 to −500 bp from the TSS. By analysing synthetic promoters, this cis element was found necessary for the induction of the corresponding genes during pathogen infection and wounding stress (Rushton et al. 2002). The last identified tobacco-specific motif was another ocs element responsive to auxin, SA, and methyl jasmonate. Similar to OBF5, it was initially described in the promoter of a soybean GH2/4 gene that encodes a GST (Ulmasov et al. 1994; Table 2). Its position distribution showed a pronounced distal density peak between −1000 and −750 bp.

Screening of promoter databases for cis motifs present in Fld-repressed genes

Cis elements recognized in the promoters of genes repressed by plastid-targeted Fld were specific for either potato or tobacco, with the single exception of an E2Fc-binding motif found in both species. This consensus sequence (GGGAAT) is present in the promoters of genes encoding the proliferating cell nuclear antigen (PCNA) and of other E2F target genes (Table 3). It occupied a distal regulatory position in potato, whereas in tobacco it was found in inverted orientation on the opposite DNA strand and dispersed throughout the promoter region (Table 3). The E2F proteins play an important role in the regulation of genes that are induced at the G1(G0)/S transition, as reported for rice and tobacco PCNA (Kosugi and Ohashi 2002).
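Because the same element can occur in either orientation, a motif search has to consider both DNA strands. The following sketch shows one simple way to do this for an exact consensus such as GGGAAT; the function names and the example sequence are hypothetical, and the motif discovery tools used in this work may handle strandedness differently.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.upper().translate(complement)[::-1]

def find_motif_both_strands(promoter, motif):
    """Return sorted (offset, strand) tuples for exact motif matches.

    Offsets are 0-based positions on the given (plus) strand. A hit on the
    minus strand is detected as a match to the motif's reverse complement.
    Note: a palindromic motif would be reported on both strands.
    """
    promoter = promoter.upper()
    hits = []
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(pattern, start + 1)
    return sorted(hits)

# Hypothetical sequence carrying the E2F consensus in both orientations
print(find_motif_both_strands("AAGGGAATCCATTCCCTT", "GGGAAT"))
```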

Regarding those cis elements only found in promoters of potato genes down-regulated by Fld, the consensus CTTTA sequence was mainly localized in distal regions of the promoters (Table 3). It was reported to be a target site for the trans-acting StDof1 protein and extensively characterized in the promoter of the KST1 gene, which encodes a K+ influx channel of guard cells in potato. The block mutagenesis of the consensus sequences dramatically reduced guard cell promoter activity (Plesch et al. 2001).

The consensus TTTGAAWT, characterized as an ethylene-responsive element, was also overrepresented among potato promoters, distributed uniformly through the entire promoter region (Table 3). This cis motif was first identified in carnation GST and cysteine proteinase gene promoters, both implicated in senescence of different tissues (Itzhaki et al. 1994; Rawat et al. 2005). Another ethylene-responsive motif called the ‘evening element’ was predominantly located in proximal regions of the potato promoters (Table 3). This motif is present upstream of a Solanum melongena gene coding for a cysteine proteinase (SmCP; Rawat et al. 2005). The regulation of SmCP is under circadian control, with peak expression in late light. It is also induced during leaf senescence and fruit ripening, when endogenous ethylene is abundant (Xu et al. 2003).
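Consensus sequences such as TTTGAAWT use IUPAC ambiguity codes, in which W stands for A or T. As an illustration only, the sketch below translates a consensus into a regular expression and reports all match positions, including overlapping ones; the helper names and the example sequence are hypothetical and do not reproduce the search settings used with the PLACE database.

```python
import re

# Standard IUPAC nucleotide codes expressed as regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def consensus_to_regex(consensus):
    """Compile a consensus into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    return re.compile("(?=(" + body + "))")

def motif_positions(promoter, consensus):
    """Return 0-based start positions of all matches on the given strand."""
    pattern = consensus_to_regex(consensus)
    return [match.start() for match in pattern.finditer(promoter.upper())]

# Hypothetical sequence with one TTTGAAAT and one TTTGAATT occurrence
print(motif_positions("CCTTTGAAATGGTTTGAATTCC", "TTTGAAWT"))
```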

The last of the potato-specific conserved regulatory elements found in Fld-repressed genes is Box C, which had a pronounced distribution peak in distal parts of the promoters (Table 3). This cis element was initially described in the upstream region of a pea gene coding for asparagine synthetase, and was postulated to be a DNA binding element involved in light-dependent transcriptional repression (Ngai et al. 1997). It was also found in several dehydrin promoters (Zolotarov and Strömvik 2015).

Four conserved motifs were retrieved from the promoters of genes negatively regulated by Fld in tobacco but not in potato. A regulatory cis element named jasmonic acid/senescence-responsive element 1 (JASE1) was preferentially located in the middle of the promoter regions (Table 3). It was originally reported in the 12-oxo-phytodienoic acid-10,11-reductase gene promoter, and is required for the up-regulation of this gene during leaf senescence and in response to jasmonic acid (He and Gan 2001). The presence of this motif among Fld-repressed genes suggests a potential signaling pathway involved in delayed leaf senescence (Mayta et al. 2018).

A second overrepresented motif was an upstream activating sequence named '23 bp UAS', which was localized in two preferential locations, one distal and the other proximal to the TSS, the latter being the more populated (Table 3). It has been identified in the promoter of the Nicotiana sylvestris cyclin B1 gene, which belongs to the group of B-type cyclins that are expressed at a maximum level in the G2-M phase (Bulankova et al. 2013). This consensus sequence contains the MYB (v-myb avian myeloblastosis viral oncogene homolog) binding element (Tréhin et al. 1999), which interacts with AmMYB308 and AmMYB330 TFs to repress phenolic acid metabolism and lignin biosynthesis (Tamagnone et al. 1998). The latter category was enriched among genes down-regulated by Fld in tobacco (Pierella Karlusich et al. 2017).

Another overrepresented motif was DE1, which was uniformly distributed along the promoters (Table 3). It was first identified in the upstream region of the light-responsive gene pra2, which encodes a small GTPase belonging to the YPT/rab family (Inaba et al. 2000). It has been demonstrated that DE1 receives signals from phytochromes A, B and blue-light photoreceptors, leading to down-regulation of pra2 levels under red, far-red and blue light conditions (Inaba et al. 2000). These observations suggest possible regulation of the repressed genes by photoreceptors.

The last tobacco-specific motif identified in Fld-repressed genes is the low-temperature-responsive element CCGAAA (Table 3), which has been implicated in responses to low temperature, abscisic acid and other environmental factors (Dunn et al. 1998). It was distributed predominantly in distal parts of down-regulated promoters.

Cis motifs in the promoters of stress-responsive nuclear genes primed by the expression of a plastid-targeted Fld

As mentioned before, a significant fraction of the genes whose expression is modulated by Fld under control conditions were also modulated in the same direction by the assayed stresses in WT counterparts: drought in potato and biotic stress in tobacco (column 4 in Tables 2 and 3). Among up-regulated genes, there were several characterized cis motifs: the PE1 element found in the phytochrome A3 promoter, the ATRICHPSPETE element, the Box II described in the plastid gene promoter for AtpB, the jasmonate- and elicitor-responsive element characterized in the strictosidine synthase promoter, the ocs element responsive to auxin, SA and methyl jasmonate, and the QARBNEXTA and WARBNEXTA elements found in the promoter of the extA gene (Table 2; Supplementary Table S5). Among down-regulated genes, we identified the ethylene-responsive 'evening element', Box C, JASE1 and the low-temperature-responsive element CCGAAA (Table 3; Supplementary Table S6).
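The notion of genes modulated "in the same direction" by Fld and by stress can be made concrete with a small sketch. It assumes two hypothetical dictionaries of log2 fold changes restricted to differentially expressed genes, one for Fld-expressing versus WT plants under control conditions and one for stressed versus control WT plants; the gene identifiers and values are invented for illustration.

```python
def primed_genes(fld_log2fc, stress_log2fc):
    """Split shared genes into those up- or down-regulated in both contrasts.

    Genes present in only one contrast, or changing in opposite directions,
    are ignored.
    """
    up, down = set(), set()
    for gene in fld_log2fc.keys() & stress_log2fc.keys():
        fld_change = fld_log2fc[gene]
        stress_change = stress_log2fc[gene]
        if fld_change > 0 and stress_change > 0:
            up.add(gene)
        elif fld_change < 0 and stress_change < 0:
            down.add(gene)
    return up, down

# Hypothetical fold changes (log2) for illustration only
fld = {"g1": 1.5, "g2": -2.0, "g3": 0.8, "g4": -1.1}
stress = {"g1": 2.3, "g2": -0.9, "g3": -1.2, "g5": 3.0}
up, down = primed_genes(fld, stress)
print(sorted(up), sorted(down))
```

The promoters of the two resulting gene sets would then be the input for the motif searches summarized in Tables 2 and 3.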

Cis motifs in the promoters of nuclear genes modulated by biotic and/or abiotic stresses

We then evaluated the presence of conserved cis elements in the promoters of genes affected by the stress treatments. Common elements in the promoters of genes up-regulated by drought in both WT and pfld potato plants are summarized in Supplementary Table S5. They are the ABRE2 element originally found in the promoter of the barley ABA-responsive gene HVA22 (Shen and Ho 1995), the ATRICHPSPETE element reported in an enhancer of Pisum sativum (Sandhu et al. 1998), the 'quantitative activator region' ACACGTT and the 'wounding activating region' WARBNEXTA, both first reported in the promoter of the extA gene from B. napus (Elliott and Shirsat 1998). A common element found in the promoters of genes down-regulated by drought in both potato genotypes was the Box C element (Ngai et al. 1997). As indicated previously, the 'quantitative activator region', the ATRICHPSPETE and the Box C elements were also overrepresented motifs in the promoters of Fld-responsive genes under control conditions (Tables 2 and 3).

Among those genes exclusively induced by drought in WT potato plants, an overrepresented cis motif in their promoters was the Box II element (Supplementary Table S5), already mentioned as overrepresented in the promoters of Fld-responsive genes under control conditions (Table 2). Overrepresented cis motifs present in the promoters of genes repressed by drought exclusively in WT plants were DE1 of the pra2 gene promoter (Inaba et al. 2000) and the auxin-responsive element DR5, identified by site-directed mutagenesis of the soybean GH3 gene promoter (Ulmasov et al. 1997; Supplementary Table S6).

Genes of WT and pfld tobacco lines induced by biotic stress exhibited 3 enriched cis motifs in their promoters also present in Fld-induced genes under control conditions ('root specific region', PE1, and ocs), as well as an element only present in promoters of genes induced by Xcv inoculation: the elicitor-responsive core element of the gene coding for pathogenesis-related protein 1 (PR1) initially described in parsley (Rushton et al. 1996; Supplementary Table S7). The DNA-binding proteins putatively recognizing this sequence belong to the WRKY family (Rushton et al. 1996). Among the promoters of tobacco genes induced by Xcv inoculation only in the WT genotype, there was an enriched cis motif recognized by the TDBA12 protein that increased markedly during the tobacco mosaic virus-induced hypersensitive response (Yang et al. 1999; Supplementary Table S7). TDBA12 is related to a novel class of DNA-binding factors containing WRKY domains and regulates the inducible expression of genes in response to pathogens (Yang et al. 1999).

Cis motifs enriched in the promoters of tobacco genes repressed by biotic stress in both genotypes are shown in Supplementary Table S8. The list includes the Box C regulatory element (Ngai et al. 1997), which to our knowledge has not been associated with biotic-responsive genes so far; the 52/56 box, first described in the promoters of the tomato anther-specific LAT52 and LAT56 genes (Twell et al. 1991) and indispensable for the developmental regulation of the LAT52 gene in pollen; and the I-Box element, which differs by only two nucleotides from the I-Box core motif found in the promoters of potato genes induced by drought exclusively in transgenic lines (Martínez-Hernández et al. 2002; Supplementary Table S5). Finally, among the promoters of tobacco genes repressed by Xcv inoculation only in WT plants (Supplementary Table S8), we found the B-box element first reported in the promoter of the napA gene from B. napus and with similarities to ABA-responsive elements (Ezcurra et al. 1999).

Conclusions

Chloroplasts play a critical role as sensors of environmental fluctuations, emitting signals that regulate nuclear gene expression associated with developmental and stress response pathways. Changes in the redox state of the PETC and the accumulation of ROS trigger plastid signaling (Gadjev et al. 2006; Willems et al. 2016). Key factors controlling the coordinated transcriptional response to chloroplast redox status in the nucleus of the plant cell are TFs, which regulate their target genes by recognizing and binding short cis-regulatory DNA sequences (Franco-Zorrilla and Solano 2017). Introduction of the alternative electron shuttle Fld in plastids shifts the redox poise of the PETC to a more oxidized state (Gómez et al. 2020) and limits ROS propagation under various stress conditions (Tognetti et al. 2006; Zurbriggen et al. 2008, 2009; Pierella Karlusich et al. 2014, 2020; Rossi et al. 2017), thus providing a tool to modulate the chloroplast oxido-reductive status in a predictable way. About 5% of the leaf-expressed genes in these transgenic plants were affected by the presence of Fld even under normal conditions, and a large proportion of these Fld-responsive genes are potentially involved in stress acclimation, as they responded in the same direction as in WT siblings under stress (Pierella Karlusich et al. 2017, 2020). Therefore, the transcriptomes of tobacco and potato plants expressing a plastid-located Fld represent interesting datasets for disentangling the chloroplast-associated ROS and redox networks that modulate the expression of nuclear genes involved in stress acclimation.

We first searched these plants for transcriptomic imprints associated with the chemical identity of ROS and their subcellular site of production by using the ROSMETER tool. We found considerable differences between the Fld-expressing plants and their WT counterparts in both normal and stress conditions, and the patterns obtained suggest the involvement of chloroplasts in retrograde signaling, as inferred from the positive and negative correlations with ROS located in different organelles.

We then identified TF functions potentially involved in the observed transcriptomic features modulated by Fld under control conditions by introducing our datasets in a large-scale network that integrates TF binding- and expression-based regulatory interactions covering a wide range of growth conditions and treatments. The top 50 TFs in terms of statistical significance include members of the bHLH, C2H2 and AP2-EREBP families, and about 50% of them are indeed associated with ROS and stress responses in the network.

Finally, we analysed the architecture of the promoters by predicting 68 conserved cis regulatory motifs in the genes modulated by Fld and 213 in those genes modulated by the stresses. When focusing on those 25 cis elements with the highest statistical significance among Fld-responsive genes, we found close matches to experimentally validated regulatory motif sequences in the PLACE database. In addition, we found identical cis elements in the promoters of many genes that responded to Fld in the same direction as in WT plants under drought and/or biotic stress, pointing to Fld-dependent activation of stress-like expression patterns before the occurrence of the stress.

In conclusion, these analyses provide a genome-wide picture illustrating the coordinated transcriptional modulation of nuclear plant genes by the chloroplast redox status, and the relevance of this process to the acclimation and response to biotic and abiotic stresses. In addition, we suggest new cis and trans targets to generate stress tolerance in solanaceous crops. It is worth noting, within this context, that this plant family includes many cultivated species with high economic impact. Potato, in particular, is the third most important food crop (FAO 2021), and yet relatively few studies have been carried out to identify regulatory motifs in the upstream regions of its genes that could be used in breeding programs (Aminedi et al. 2014; Gálvez et al. 2016). This situation highlights the need to further study and understand potential gene regulation mechanisms in potato, especially in response to critical abiotic factors such as drought.