Introduction

Zinc finger proteins play key roles in a variety of cellular functions, such as transcriptional regulation, apoptosis control, RNA binding, and protein–protein interactions. The Lesion Simulating Disease (LSD) family comprises well-characterized zinc finger proteins that contain the conserved zinc finger LSD domain CxxCRxxLMYxxGASxVxCxxC (Dietrich et al. 1997). The LSD genes were first characterized in Arabidopsis thaliana mutants that spontaneously form necrotic lesions in the absence of pathogenic infection. These genes were shown to negatively regulate programmed cell death (PCD) (Dietrich et al. 1994; Dietrich et al. 1997). Lesion formation in lsd mutants is associated with the expression of histochemical and molecular markers of the plant disease response, and in some cases, the mutants showed a significant increase in their resistance to pathogen attack (Dietrich et al. 1994). The well-characterized A. thaliana LSD1 gene (AtLSD1) has been described as a key regulator of abiotic and biotic stress responses in plants. AtLSD1 participates in the signaling pathway that induces the expression of copper–zinc superoxide dismutase in response to salicylic acid (SA), and it also negatively regulates reactive oxygen species (ROS) generated under stress conditions (Kliebenstein et al. 1999), thereby limiting the accumulation of superoxide in the cell (Jabs et al. 1996). The Arabidopsis lsd1 mutant exhibited impaired spreading of the hypersensitive response (HR) throughout the course of a pathogen attack, which consequently expanded beyond the infection site engulfing the entire leaf (Dietrich et al. 1997). During photooxidative stress, Arabidopsis lsd1 mutants showed high levels of ROS, reduced stomatal conductance and low peroxisomal catalase activity, which suggests that LSD1 is required for acclimation to conditions that promote excess excitation energy (Mateo et al. 2004). 
Together with the Enhanced Disease Susceptibility 1 and Phytoalexin-deficient 4 genes, AtLSD1 acts as an ROS/ethylene homeostatic switch during light acclimation and pathogen defense (Muhlenbock et al. 2008). Under hypoxia conditions, AtLSD1 regulates lysigenous aerenchyma formation (Muhlenbock et al. 2007), while under low temperature conditions, it regulates the response to cell death (Huang et al. 2010). Thus, AtLSD1 has been proposed to act as a cellular hub, making a central contribution to the oxidative stress response in plants (Kaminaka et al. 2006).

Soybean (Glycine max) is one of the most economically important crops in the world. Biotic and abiotic stresses severely restrict soybean yield. With regard to biotic stress, Asian Soybean Rust (ASR), caused by the fungus Phakopsora pachyrhizi, is one of the main diseases affecting soybean production. The pathogen develops in leaves, stems and pods and is able to defoliate soybean plants in a few days, leading to drastic crop losses (Goellner et al. 2010). Soybean genes determining resistance to P. pachyrhizi (Rpp genes) have been identified (McLean and Byth 1980; Bromfield and Hartwig 1980; Bromfield and Melching 1982; Hartwig and Bromfield 1983; Hartwig 1986; Garcia et al. 2008; Kim et al. 2012; Li et al. 2012). Because no available commercial soybean cultivar is resistant to all pathotypes of P. pachyrhizi, the application of fungicides is the only method of controlling the spread of infection. Regarding abiotic stress, drought is the most important condition that affects soybean production, especially during flower establishment (Meckel et al. 1984). Moreover, drought stress affects biological nitrogen (N) fixation, which leads to a reduced supply of nitrogen for protein production (Ledrera et al. 2007; Manavalan et al. 2009).

Although some LSD genes have been reported to play important roles in the defense against fungal isolates (Dietrich et al. 1994; Weymann et al. 1995; Hunt et al. 1997; Aviv et al. 2002; Wang et al. 2005; Yeh et al. 2011), bacterial isolates (Epple et al. 2003; Bhatti et al. 2008) and abiotic stress (Mateo et al. 2004; Muhlenbock et al. 2008; Huang et al. 2010), the potential of G. max LSD genes (GmLSD) to protect plants against environmental stresses remains to be determined. Furthermore, there is no complete identification of the LSD gene family, which hampers the reconstitution of its evolutionary history. Thus, the present study reports the identification of the LSD gene family in Viridiplantae and presents evidence of the involvement of GmLSD genes in the responses to P. pachyrhizi infection and dehydration conditions.

Material and methods

Viridiplantae LSD gene annotation

To identify the LSD genes present in Viridiplantae, including representatives of the monocot, eudicot, moss, lycophyte and algae species, the complete nucleotide and protein sequences of the well-characterized AtLSD1 gene (Dietrich et al. 1994) were used as the query in tBLASTx and BLASTn searches conducted against the Phytozome (http://www.phytozome.org/) and PLAZA (http://bioinformatics.psb.ugent.be/plaza/) databases. Homologous sequences exhibiting greater than 50 % homology with the query sequence in this first round of BLAST searches were then used as query sequences in a second round of BLAST searches against the National Center for Biotechnology Information (NCBI) database (http://www.ncbi.nlm.nih.gov/). The putative LSD protein sequences identified were examined for the presence of the previously reported zinc finger LSD domain (Dietrich et al. 1997). These analyses were performed using the SMART (http://smart.embl-heidelberg.de/) and InterProScan Signature (http://smart.embl-heidelberg.de/) databases. To analyze the degree of conservation of the three zinc finger LSD domains, the amino acid sequences were aligned using the Molecular Evolutionary Genetics Analysis (MEGA) 5.05 program (Tamura et al. 2011) and manually analyzed.
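The domain check described above can be illustrated with a simple pattern scan. The following is a minimal sketch and not the SMART or InterProScan procedure used in this study: it assumes that the consensus CxxCRxxLMYxxGASxVxCxxC (Dietrich et al. 1997) can be written as a regular expression, and it adds a relaxed pattern (our assumption) that keeps only the spacing of the four cysteines. The function name and the toy sequence are hypothetical.

```python
import re

# Strict consensus from Dietrich et al. (1997): CxxCRxxLMYxxGASxVxCxxC
STRICT_LSD = re.compile(r"C..CR..LMY..GAS.V.C..C")
# Relaxed pattern (an assumption for illustration): only the spacing of the
# four cysteines (positions 1, 4, 19 and 22) is required
RELAXED_LSD = re.compile(r"C..C.{14}C..C")


def find_lsd_domains(protein, pattern=RELAXED_LSD):
    """Return 1-based start positions of non-overlapping pattern matches."""
    return [m.start() + 1 for m in pattern.finditer(protein.upper())]


# Toy sequence (hypothetical, not a real LSD protein): one domain matching
# the strict consensus and one matching only the cysteine spacing
toy = "MA" + "CSGCRNLLMYPYGASSVRCAVC" + "GGSG" + "CGGCHTLLMYIRGATSVQCSCC" + "KK"

relaxed_hits = find_lsd_domains(toy)
strict_hits = find_lsd_domains(toy, STRICT_LSD)
print(len(relaxed_hits), "candidate domain(s);", len(strict_hits), "strict match(es)")
```

A relaxed scan of this kind tends to over-predict, so candidate hits would still need to be confirmed against a curated domain database, as was done here.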

To identify putative pseudogenes, the coding sequences of the LSD genes identified were used as query sequences in a BLAST search against the NCBI expressed sequence tag (EST) database (http://www.ncbi.nlm.nih.gov/nucest/). The identities of the resulting ESTs were confirmed using the Phytozome (http://www.phytozome.org/) and PLAZA (http://bioinformatics.psb.ugent.be/plaza/) databases.

Soybean exon–intron structure and promoter sequence analysis

To assess the structural conservation of GmLSD genes, their exon–intron structure was analyzed. The putative promoter region, defined as the 2,000 base pairs (bp) upstream of the transcription start site (TSS) of each GmLSD gene, was used to search for putative cis-elements. The analysis was performed using the PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/), and the cis-elements identified were classified based on their putative biological functions.
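As an illustration of the promoter scan, the sketch below extracts an upstream window and counts occurrences of a short motif on both strands. It is a simplified stand-in for the PlantCARE search, which relies on curated motif collections. The function names, the 0-based forward-strand TSS coordinate, and the toy sequences are assumptions made for the example; the literal sequence CGTCA is taken from the name of the MeJA-responsive CGTCA element.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def upstream_region(chrom_seq, tss, length=2000):
    """Return up to `length` bp upstream of a 0-based TSS on the forward strand."""
    return chrom_seq[max(0, tss - length):tss]


def count_motif(promoter, motif):
    """Count motif occurrences on both strands, allowing overlapping matches."""
    promoter = promoter.upper()
    motif = motif.upper()
    targets = {motif, reverse_complement(motif)}
    k = len(motif)
    return sum(promoter[i:i + k] in targets for i in range(len(promoter) - k + 1))


# Toy promoter (hypothetical): CGTCA occurs once on the forward strand and
# once on the reverse strand (as TGACG)
toy_promoter = "AACGTCATTTGACGGG"
print(count_motif(toy_promoter, "CGTCA"))
```

For a gene on the reverse strand, the window would instead be taken downstream of the TSS coordinate and reverse-complemented before scanning.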

Expression analysis of GmLSD genes

To assess GmLSD gene expression profiles, several experiments were performed and gene expression was analyzed using real-time quantitative polymerase chain reaction (RT-qPCR). To confirm GmLSD gene expression, different plant tissues were analyzed. Leaves, roots, and stems of plants in the vegetative (V) phase and seeds, pods and flowers before and after fertilization were harvested from the MGBR-46 Conquista Brazilian soybean cultivar grown under greenhouse conditions. Tissues were harvested and immediately frozen in liquid nitrogen before being stored at −80 °C until RNA extraction. Three biological replicates, with three plants per replicate, were sampled for each plant organ. The relative expression levels of the transcripts were statistically analyzed using variance analysis, with data transformation when necessary. Means were compared using the Tukey multiple comparison test. The Statistical Analysis System (SAS) 9.2 and the Statistical Package for the Social Sciences (SPSS/PASWSTAT) 18 software packages were used to perform this analysis.

To analyze GmLSD gene expression profiles in response to P. pachyrhizi infection, an experiment was carried out at Embrapa Soja, Londrina, Paraná, Brazil. In this experiment, the susceptible cultivar EMBRAPA48 (van de Mortel et al. 2007) and the resistant genotype PI561356 (Almeida et al. 2012) were used. The PI561356 genotype carries the Rpp1 gene (responsible for an immune response against some isolates), which maps to chromosome 18 (Kim et al. 2012). Plants in the V2 stage, which consist of a fully developed trifoliolate leaf at the node above the unifoliolate nodes (Fehr and Caviness 1977), were grown in a greenhouse and sprayed with a fungal spore suspension according to the method of Wiebke-Strohm et al. (2012). Leaves sprayed with a solution lacking fungal spores were used as the mock-treated control. One trifoliolate leaf from each plant was collected at 12, 24, 48, 96, and 192 h post-inoculation (hpi) and was frozen in liquid nitrogen before being stored at −80 °C. Three biological replicates, with three plants per replicate, were analyzed from each genotype per treatment. The relative expression levels of GmLSD genes were statistically compared by variance analysis with factorial treatments based on three factors: genotype, time, and pathogen presence. When necessary, data were transformed using the weighted least squares method. Means were compared using the Bonferroni multiple comparison test. The SAS 9.2 and the SPSS/PASWSTAT 18 software packages were used to perform this analysis.

An additional experiment was carried out to analyze changes in the expression profile of GmLSD genes in response to dehydration stress. A highly sensitive BR16 cultivar and a slightly sensitive EMBRAPA48 cultivar (Oya et al. 2004) were grown in a greenhouse according to the method described by Kulcheski et al. (2011) and were subjected to dehydration stress as described by Martins et al. (2008). Briefly, seedlings in the V2 stage were removed from a hydroponic solution and kept in a tray in the dark without nutrients. Leaves and roots were collected at 0 (control), 25, 50, 75, 100, 125, and 150 min after the initiation of dehydration stress and were frozen in liquid nitrogen and stored at −80 °C until RNA extraction. Three biological replicates (three plants/replicate) were sampled for each organ/genotype/treatment point. The relative expression level of GmLSD genes in each organ was statistically compared by variance analysis considering dehydration stress time and cultivar. Data were transformed using the weighted least squares method. Means were compared using the Bonferroni multiple comparison test. The SAS 9.2 and the SPSS/PASWSTAT 18 programs were used.

RNA extraction, reverse transcription, and quantitative PCR

Total RNA was extracted from each sample using TRIzol reagent (Invitrogen, Carlsbad, CA), and 2 μg of RNA of each sample was subsequently treated with DNAse I (Promega, Madison, WI) and used for cDNA synthesis. The first-strand cDNA synthesis reaction was performed using the M-MLV Reverse Transcriptase System (Invitrogen) and a 24-mer oligo dT anchored primer.

RT-qPCR was performed in a StepOne Real-Time Cycler (Applied Biosystems). For all experiments, the RT-qPCR thermocycling began with a 5-min initial denaturation step at 94 °C, followed by 40 cycles consisting of a 10-s denaturation step at 94 °C, a 15-s annealing step at 60 °C, and a 15-s extension step at 72 °C, and then by a final extension step of 2 min at 40 °C. To identify different products, a melting curve analysis over a range of 55–99 °C with a stepwise temperature increase of 0.1 °C/s was performed at the end of each PCR run. The final volume of each reaction was 25 μl and comprised 12.5 μl of diluted cDNA, 1× PCR buffer (Invitrogen), 50 mM MgCl2, 10 mM of each dNTP, 10 μM of each primer, 2.5 μl SYBR-Green solution (1:100,000, Molecular Probes Inc., Eugene, OR) and 0.06 U Platinum Taq DNA Polymerase (Invitrogen). The first-strand cDNA reaction product was diluted 1:100 for the relative expression analyses. In all reactions, technical quadruplicates were performed for each sample, and samples with water (instead of cDNA) were used as the negative control.

Primer3 software (http://frodo.wi.mit.edu/) was used to design specific oligonucleotides corresponding to each GmLSD gene (Table S1). The annealing temperature used for all primers was 60 °C. The specificity of the amplicons obtained in the RT-qPCR was confirmed by sequencing on an ABI PRISM 3100 Genetic Analyzer automatic sequencer (Applied Biosystems) in the ACTGene Laboratory (Centro de Biotecnologia-UFRGS, Porto Alegre, Rio Grande do Sul, Brazil). Several housekeeping genes were selected for the normalization of mRNA levels in RT-qPCR. The ACTII, CYP2 (Jian et al. 2008), and metalloprotease (Libault et al. 2008) genes were used for normalization in the plant organ experiment, while the metalloprotease and f-box genes (Libault et al. 2008) were used in the P. pachyrhizi assay, and the f-box, ACTII, and ELF genes (Jian et al. 2008) were used in the dehydration assay. The stability of the housekeeping genes was examined using geNorm software (http://medgen.ugent.be/~jvdesomp/genorm/). The relative gene expression was analyzed using the 2^−ΔΔCt method (Livak and Schmittgen 2001).
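For readers unfamiliar with the 2^−ΔΔCt calculation, the following minimal sketch shows the arithmetic for one target gene. It rests on assumptions that are not stated in the text: technical replicates are averaged first, and multiple reference genes are combined by averaging their mean Ct values (equivalent to a geometric mean of their quantities). The function names and all Ct values are hypothetical.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Delta Ct: mean target Ct minus the mean of the reference-gene mean Cts."""
    reference_mean = mean(mean(cts) for cts in reference_cts)
    return mean(target_cts) - reference_mean


def relative_expression(sample_target, sample_refs, calib_target, calib_refs):
    """Fold change of a sample relative to a calibrator by the 2^-ddCt method."""
    ddct = delta_ct(sample_target, sample_refs) - delta_ct(calib_target, calib_refs)
    return 2 ** (-ddct)


# Hypothetical technical quadruplicates (Ct values) for one target gene and
# two reference genes, in a treated sample and in its calibrator
treated_target = [24.0, 24.2, 23.8, 24.0]
treated_refs = [[18.0, 18.0, 18.0, 18.0], [20.0, 20.0, 20.0, 20.0]]
control_target = [26.0, 26.0, 26.0, 26.0]
control_refs = [[18.0, 18.0, 18.0, 18.0], [20.0, 20.0, 20.0, 20.0]]

# Delta Ct is 5 in the treated sample and 7 in the calibrator, so ddCt is -2
# and the target is about fourfold more abundant in the treated sample
fold = relative_expression(treated_target, treated_refs, control_target, control_refs)
print(round(fold, 2))
```

The method assumes amplification efficiencies close to 100 % for the target and reference genes; where this does not hold, an efficiency-corrected model would be needed.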

Results

LSD genes are widely distributed in Viridiplantae

We identified a total of 117 putative LSD genes that are widely distributed in Viridiplantae species (Fig. 1). Genes were assigned acronyms according to the BLAST output order, and these are shown in Table S2. The number of genes in each species ranged from one to eight, and the Embryophyte node was the earliest point in the phylogeny at which increases in the copy numbers of the genes were detected. In the most basal genomes, those of Volvox carteri and Chlamydomonas reinhardtii, only one LSD gene was identified. In the moss Physcomitrella patens and in the lycophyte Selaginella moellendorffii, two and one LSD genes were found, respectively. Within the monocot clade, Setaria italica and Zea mays were the species with the greatest number of LSD family genes, with six members identified, while soybean and Linum usitatissimum had the most LSD genes among the eudicot species, with eight genes identified in each. In contrast, genes belonging to the LSD family were not identified in fungi or animals, including humans.

Fig. 1

The total number of LSD genes annotated per species and the number of proteins with one, two, or three zinc finger LSD domains. Adapted from Phytozome (http://www.phytozome.org/)

Analysis of the number of LSD domains in the encoded proteins showed that the LSD proteins possess one, two or three zinc finger LSD domains, although proteins with three such domains constitute the majority of the LSD sequences identified (Fig. 1). A decrease in the number of encoded LSD domains per protein was identified only in the Embryophyte clade. In addition, proteins with only one LSD domain were highly represented in the grass clade. The conservation analysis showed that the LSD domains are broadly conserved across all LSD genes and that the cysteine residues at positions 1, 4, 19, and 22 are particularly well conserved (Fig. 2). The neutral amino acids leucine (positions 7 and 8), tyrosine (position 10), glycine (position 13), alanine (position 14), serine (position 16), valine (position 17), and the basic amino acid arginine (position 5) were more conserved than were the other amino acids within the three LSD domains.

Fig. 2

Conservation analysis of the consensus sequence of the zinc finger LSD domains. The LSD domains from the 117 LSD genes were aligned using the MEGA 5.05 program and were manually analyzed. Different colors indicate the amino acids: tryptophan (W), aspartic acid (D), glutamic acid (E), tyrosine (Y), proline (P), glycine (G), histidine (H), arginine (R), lysine (K), methionine (M), valine (V), leucine (L), isoleucine (I), phenylalanine (F), cysteine (C), alanine (A), threonine (T), serine (S), glutamine (Q), asparagine (N). a Eighty-four sequences of the first domain (N-terminal); b 111 sequences of the second domain (core); and c 117 sequences of the third domain (C-terminal)

The presence of at least one of the following features indicates that a gene is potentially a pseudogene: a lack of transcript support, a short coding region, a long untranslated region, and either a lack of a paralog or the presence of a significantly shorter paralog for genes residing within a segmental duplication (Thibaud-Nissen et al. 2009). In our search for transcript support, we identified ESTs for 87 of the 117 genes analyzed, and these included ESTs corresponding to genes with only one or two LSD domains (Table S2). In soybean, ESTs were identified for all GmLSD genes.

GmLSD genes have different exon–intron organization and their promoter regions are enriched for cis-elements related to stress responses

Investigation of the properties of exons and introns can provide a better understanding of the mechanisms that determine exon–intron architecture (Zhu et al. 2009). Structural analysis of the GmLSD genes revealed three distinct groups: the GmLSD1, GmLSD2, and GmLSD4 genes all contain six exons, while the GmLSD3, GmLSD5, and GmLSD7 genes have five exons, and the GmLSD6 and GmLSD8 genes have four exons (Fig. 3). The sizes of the exons are very similar, while the introns are somewhat more variable, even among genes that have the same number of introns.

Fig. 3

Exon–intron structure of the GmLSD genes. The boxes represent the exons, and the lines connecting them represent the introns. The sequences are drawn to scale

Analysis of the putative promoter regions (defined as the 2,000 bp region upstream of the TSS) of the GmLSD genes identified an enrichment for putative cis-elements that are related to the response of the plant to abiotic and biotic stresses. Several regulatory cis-elements that are known to be responsive to hormones and to a variety of stress factors and signaling pathways were found in GmLSD gene promoters (Table 1). Cis-elements related to the response to light, to circadian control, and to the regulation of expression in the meristem, the seed and the endosperm were also identified (data not shown). Two putative cis-elements involved in the abscisic acid (ABA) response were also identified. The abscisic acid-responsive element (ABRE), and the SA-responsive element (TCA) were both found in the putative promoters of the GmLSD2, GmLSD3, and GmLSD7 genes, whereas the corresponding regions of GmLSD5 and GmLSD6 contained two and four copies of the TCA element, respectively. The defense and stress responsive element (TC-rich repeat) was found in the promoters of all GmLSD genes except GmLSD8, and between one and five copies of this element were present in these promoters. Moreover, two sequences involved in the response to signaling by the methyl jasmonate hormone (MeJA), namely, the CGTCA and TGAC elements, were identified in the putative promoters of the GmLSD2, GmLSD3, GmLSD4, GmLSD7, and GmLSD8 genes. The gibberellin hormone response element (GARE) was found in GmLSD1, GmLSD2, GmLSD3, and GmLSD7, while the ethylene-responsive element (ERE) was found in the upstream regions of the GmLSD3, GmLSD4, GmLSD5, GmLSD7, and GmLSD8 genes. In addition, cis-elements related to the specific response to pathogen attack were also identified. The Box-W1 cis-element was found in the putative promoters of both GmLSD3 and GmLSD5, whereas the Elicitor-box 3 (ELI-box3) element was found exclusively in the GmLSD8 gene. 
The wound-responsive element (WUN) was observed only upstream of the GmLSD4 gene. The MBSII element, a MYB transcription factor binding site found in flavonoid biosynthesis genes, was found only in the putative promoters of the GmLSD3 and GmLSD6 genes. With regard to abiotic stress response cis-elements, the MBS element, a MYB transcription factor binding site associated with drought response, was identified in the GmLSD1, GmLSD3, GmLSD4, GmLSD5 and GmLSD6 genes, but no further elements of this class were found. Overall, the promoter region of the GmLSD3 gene exhibited the greatest diversity in its putative cis-elements related to stress responses because only the ELI-box 3 and WUN elements were not found in its sequence, while the GmLSD7 promoter contained the greatest number of cis-elements.

Table 1 Number of copies and biological function of the putative stress-related cis-elements identified in the GmLSD gene promoters

Modulation of the expression of GmLSD genes varies in different organs and under different stress conditions

To examine the expression profile of GmLSD genes in different organs, for each gene, the lowest transcript level detected was used to normalize the transcript levels in other organs and thereby to quantify transcript accumulation (Fig. 4). Statistical analysis indicated a significant effect of organ on transcript levels (p < 0.001) for all genes except GmLSD6. The GmLSD1 gene had higher transcript levels in roots, stems, leaves and flowers after fertilization, while GmLSD2 showed higher expression levels in roots, stems and flowers after fertilization. The transcript levels of the GmLSD3 and GmLSD5 genes were increased in leaves, while those of GmLSD4 were increased in roots, stems and pods. Transcripts of GmLSD6 were detected only in roots and flowers after fertilization. The expression level of GmLSD7 was higher in stems, pods and leaves. Lastly, the level of GmLSD8 transcription was elevated in stems. Interestingly, under our experimental conditions, the transcripts of most genes were not detected in flowers before fertilization. Exceptions to this were the GmLSD7 and GmLSD8 genes, in which the relative expression levels were statistically similar in this organ.

Fig. 4

Relative expression levels of the GmLSD genes in different organs of the Conquista Brazilian soybean cultivar. The values are the means of three biological replicates with four technical replicates each. The means that are labeled identically (with a letter) do not differ significantly (Tukey comparison test, p < 0.05). The ACTII, CYP2, and metalloprotease reference genes were used as internal controls to normalize for the amount of mRNA present in each sample. For each gene, the lowest transcript level was used to normalize the transcript levels in other organs. To allow a better comparison of groups of genes with large differences in their expression levels, some graphs of fold-change are shown in log2 scale, and this is indicated at the y-axis. ND not detected

We also performed a comparative analysis of all GmLSD genes in each organ (Fig. S1). For each organ, the gene exhibiting the lowest expression level was used to normalize the transcript levels of the other genes. The statistical analysis indicated significant differences among the transcript levels of the different GmLSD genes in each organ (p < 0.001), with the exception of flowers before fertilization. The GmLSD8 gene exhibited peak transcript levels in all organs analyzed, but other GmLSD genes had the same high level of expression in leaves, stems and flowers before fertilization.

To detect changes in GmLSD gene expression patterns occurring in response to P. pachyrhizi infection, the transcript levels in mock-inoculated plants were used to normalize the transcript levels detected in inoculated plants (Fig. 5). Plant tissues were sampled at a series of time-points that comprised the plant basal response (in which the first peak of plant gene expression occurs), fungal appressorium formation and epidermal cell penetration (until 12 hpi), the quiescent period between 24 and 48 hpi (when the fungus proceeds with the early infection processes but does not cause strong differential gene expression in the plant) and the time-point (usually after 72 hpi) at which a second round of gene expression can occur in the plant due to fungal colonization or the formation of lesions or uredinia (van de Mortel et al. 2007; Schneider et al. 2011). Statistical analysis indicated a significant interaction among genotype, time-point and treatment for most GmLSD genes (p < 0.05). GmLSD3, GmLSD5, and GmLSD7 were notable exceptions to this: a significant interaction (p < 0.05) was found only between the treatment condition and the time-point for these genes. GmLSD1 expression was induced only in PI561356 at 12, 24, and 48 hpi. Increased expression of the GmLSD2 gene was detected only at 12 hpi in the EMBRAPA48 cultivar and at 24 hpi in PI561356 plants. GmLSD3 was induced at 48 hpi in EMBRAPA48, while in PI561356 it was induced at 48 hpi, repressed at 96 hpi, and induced again at 192 hpi. Expression of the GmLSD4 gene was only induced in PI561356 plants and only at 48 hpi. GmLSD5 expression was induced at 48 hpi in EMBRAPA48 and at 24 and 48 hpi in PI561356. The GmLSD6 gene exhibited an induction response at 192 hpi in EMBRAPA48 plants, while it was found to be induced at 12, 24, 48, and 96 hpi in the PI561356 genotype. No differential expression was observed for GmLSD7 in either genotype.
Finally, the GmLSD8 gene was induced at 24 and 48 hpi but only in PI561356 plants.

Fig. 5

GmLSD gene expression profiles in response to P. pachyrhizi infection in the leaves of EMBRAPA48 (susceptible) and PI561356 (resistant) soybean genotypes. The relative expression levels of GmLSD genes were measured by RT-qPCR at 12, 24, 48, 96, and 192 hpi (hours post-inoculation). The black bars represent the mock plants (non-infected), and the grey bars represent the infected plants. The values are the means of three biological replicates with four technical replicates each. The means that are labeled identically (with a letter) in the same cultivar do not differ significantly (Bonferroni multiple comparison test, p < 0.05). The f-box and metalloprotease reference genes were used as internal controls to normalize for the amount of mRNA present in each sample. The transcript levels from the mock-inoculated plants were used to normalize the transcript levels from the inoculated plants. ND not detected

Figures 6 and 7 provide details of the changes in GmLSD gene expression observed in the leaves and roots under dehydration conditions. Expression data for GmLSD6 in roots were not submitted to statistical analysis because the transcripts were not detected at some of the time-points analyzed. A significant interaction between the genotype and the time-point was observed for all other GmLSD genes in the leaves (p < 0.01) and the roots (p < 0.05). The transcript levels in plants at T 0 were used to normalize the transcript levels for the subsequent time-points. Expression of GmLSD1 was induced at T 125 and T 150 in the leaves of both cultivars. In the roots, GmLSD1 was not induced in BR16, but was induced at T 100 and T 150 in EMBRAPA48. GmLSD2 expression was induced in the leaves at T 50, T 100, and T 125 in BR16 and at T 150 in EMBRAPA48. In contrast, GmLSD2 was not induced in the roots of BR16, while in EMBRAPA48 it was induced at T 75, T 100, and T 150 in this organ. GmLSD3 transcription was induced at T 25, T 50, T 100, T 125, and T 150 in the leaves of BR16 but only at T 150 in those of the EMBRAPA48 cultivar, while in the roots GmLSD3 was induced at T 100 and T 150 in BR16 and at T 75, T 100, and T 150 in EMBRAPA48. GmLSD4 expression was induced in leaves at T 50 and T 150 in BR16 plants and at T 75, T 125, and T 150 in plants of the EMBRAPA48 cultivar. In the roots, expression of the GmLSD4 gene was detected only in the BR16 cultivar, in which it was induced at T 150. GmLSD5 was not differentially expressed in the leaves and roots of BR16. However, we found that GmLSD5 transcription was repressed in the leaves of EMBRAPA48 plants, except at T 150. On the other hand, GmLSD5 expression was induced only at T 150 in the EMBRAPA48 roots. Transcripts from the GmLSD6 gene were not detected in the leaves of the EMBRAPA48 cultivar, and the expression of this gene was found to be repressed at T 25, T 50, and T 100 in BR16 leaves. 
In the roots, GmLSD6 was expressed at T 0, T 25, T 50, and T 125 in BR16 and at T 0, T 75, and T 150 in EMBRAPA48. The GmLSD7 gene was repressed at T 25 and T 150 in the leaves of BR16 plants and at T 100, T 125, and T 150 in those of EMBRAPA48 plants, while in the roots GmLSD7 was induced only at T 150 in BR16 plants and was not differentially expressed in EMBRAPA48 plants. GmLSD8 was not differentially expressed in BR16 leaves from T 0 to T 125 but was found to be repressed at T 150, while in EMBRAPA48 leaves it was repressed at T 25, T 100 and T 125 and was induced at T 150. In the roots, no differential expression of GmLSD8 was observed for the EMBRAPA48 cultivar, but an induction of GmLSD8 expression in this organ was observed at T 150 in BR16 plants.

Fig. 6

GmLSD gene expression in the leaves of BR16 (sensitive) and EMBRAPA48 (tolerant) cultivars in response to dehydration stress. The relative expression levels of the GmLSD genes were measured by RT-qPCR at T 0 (control), T 25, T 50, T 75, T 100, T 125, and T 150 minutes of dehydration stress. The values are the means of three biological replicates with four technical replicates each. The means that are labeled identically (with a letter) in the same cultivar do not differ significantly (Bonferroni multiple comparison test, p < 0.05). The f-box, ACTII, and ELF reference genes were used as internal controls to normalize for the amount of mRNA present in each sample. The transcript levels from the plants at T 0 were used to normalize the transcript levels from the plants subjected to dehydration stress. ND not detected

Fig. 7

GmLSD gene expression in the roots of BR16 (sensitive) and EMBRAPA48 (tolerant) cultivars in response to dehydration stress. The relative expression levels of the GmLSD genes were measured by RT-qPCR at T 0 (control), T 25, T 50, T 75, T 100, T 125, and T 150 minutes of dehydration stress. The values are the means of three biological replicates with four technical replicates each. The means that are labeled identically (with a letter) in the same cultivar do not differ significantly (Bonferroni multiple comparison test, p < 0.05). The f-box, ACTII, and ELF reference genes were used as internal controls to normalize for the amount of mRNA present in each sample. The transcript levels from plants at T 0 were used to normalize the transcript levels from plants subjected to dehydration stress. To allow for a better comparison of groups of genes with large differences in their expression levels, some graphs of fold-change are shown in log2 scale, and this is indicated at the y-axis. ND not detected

Discussion

Here, we report the identification of the LSD gene family in the available eukaryotic genomes. Our data show that the LSD genes are exclusive to the Viridiplantae, suggesting that LSD proteins might have emerged and evolved as a strategy to subvert specific cell death modalities in plants. The diversification of the LSD gene family seems to have occurred in the Embryophyte clade, whose species possess greater numbers of LSD genes than are found in basal organisms. The fact that soybean and L. usitatissimum have the greatest numbers of LSD genes is not surprising because soybean underwent two genome duplication events approximately 59 million and 13 million years ago (Schmutz et al. 2010) and L. usitatissimum underwent a whole genome duplication approximately 5–9 million years ago (Wang et al. 2012).

The loss and gain of LSD domains support the theory that plant genomes have undergone dynamic and progressive evolution (Zhang et al. 2012). In our analysis of LSD genes in algae and plants, we observed that a gene structure consisting of three LSD domains represents the ancestral condition because it is present in algae and other basal organisms (Fig. 1). Although the majority of proteins have three LSD domains, the identification of genes with one or two LSD domains indicates that these structures have also been conserved during evolution. However, the effect of the number of LSD domains on the function of the LSD genes remains unknown. Xu and He (2007) reported an investigation of LSD1-like 2 of Oryza sativa (OsLOL2), whose sequence contains two LSD domains. Transgenic rice expressing the antisense sequence of OsLOL2 showed a dwarf phenotype and lower resistance to pathogen attack. However, the dwarf phenotype was rescued by exogenous application of gibberellin (GA), suggesting that OsLOL2 supports the regulation of GA biosynthesis, playing a role in the growth of rice plants. Moreover, the overexpression of OsLOL2 conferred resistance against bacteria and induced an HR in tobacco (Bhatti et al. 2008). Thus, the presence of three LSD domains is not a requirement for a role in pathogen and stress responses in rice. The AtLSD1 protein, which contains three LSD domains, interacts with Arabidopsis Metacaspase 1 (Coll et al. 2010) and with Arabidopsis basic region leucine zipper 10 (Kaminaka et al. 2006) through its second and third LSD domains only. However, although OsLOL2 and AtLSD1 are able to function with just two LSD domains, the presence of three LSD domains appears to be essential for certain functions of other LSD proteins. An example is the LSD1 protein of Pisum sativum, in which the three domains are necessary for nuclear export of the protein (He et al. 2011a).
Interestingly, although these three LSD proteins can function with two or three LSD domains, the third (C-terminal) domain is maintained in all three cases. We observed that the third domain was present in all species and proteins analyzed (data not shown), which provides further evidence of the maintenance of this domain during LSD family evolution and suggests that it could be essential for LSD protein function.

Protein domains are structural, functional, and evolutionary units and tend to be more stable than the surrounding regions of a protein during evolution (Zhang et al. 2012). Our analysis of consensus sequences showed that the structure and composition of the three LSD domains are highly conserved in all species analyzed (Fig. 2). The cysteine residues are conserved in all domains and species analyzed, forming the typical C2C2 structure, which confirms the classification of the LSD domain as a C2C2-type zinc finger. Cysteine is a zinc-chelating residue, and the binding of zinc ions is extremely important for the tertiary structure of the LSD domain (He et al. 2011b). Cysteine residues also play important roles in protecting the cell against oxidative damage through the thiol functional group in the active sites of proteins (Requejo et al. 2010). Thus, the function of AtLSD1 in oxidative stress (Jabs et al. 1996; Kliebenstein et al. 1999; Mateo et al. 2004) might be mediated by the conserved cysteine residues in the protein, which could explain the high degree of maintenance of the LSD domains encoded by the LSD gene family. Overall, the high degree of sequence conservation in LSD domains throughout plant evolution suggests that the maintenance of these sequences is important to ensure the functionality of LSD genes and their involvement in biological processes.
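As an illustration of how the number of LSD domains in a protein could be tallied, the following minimal Python sketch scans a sequence for strict matches to the consensus CxxCRxxLMYxxGASxVxCxxC given in the Introduction. It is not the procedure used in this study: the function names and the toy sequence are hypothetical, and because real LSD domains may deviate from the strict consensus, an exact-match scan like this one would likely undercount domains compared with profile-based searches.

```python
import re

# Consensus quoted in the Introduction; "x" stands for any residue.
LSD_CONSENSUS = "CxxCRxxLMYxxGASxVxCxxC"


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Turn an 'x'-wildcard consensus into a compiled regular expression."""
    return re.compile("".join("." if ch == "x" else ch for ch in consensus))


def count_lsd_domains(protein: str) -> int:
    """Count non-overlapping strict consensus matches in a protein sequence."""
    return len(consensus_to_regex(LSD_CONSENSUS).findall(protein.upper()))


# Hypothetical toy sequence (not a real LSD protein): two exact consensus
# matches separated by arbitrary linker residues.
domain = "CAGCRNLLMYPRGASSVRCSCC"
toy_protein = "MPVP" + domain + "GGSAEQ" + domain + "TVL"
print(count_lsd_domains(toy_protein))  # prints 2
```

Relaxing individual positions of the pattern (for example, allowing conservative substitutions) would be one way to adapt this sketch to more divergent sequences.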

There is a strong tendency towards gene duplication and retention in stress-responsive genes (Zou et al. 2009). After duplication, pseudogenization is the most common fate of duplicated genes, although they can alternatively undergo neofunctionalization (when one copy acquires a novel function) or subfunctionalization (in which both copies are mutated and adopt complementary functions) (Cagliari et al. 2011). Both neofunctionalization and subfunctionalization are required to explain the retention of some duplicated genes (Zou et al. 2009). In the course of our EST database searches, we found ESTs for the majority of the putative LSD genes, which suggests that all the LSD structures that we found (consisting of one, two or three LSD domains) are likely to be important in plant development. Based on the apparent process of evolution from an ancestral protein structure consisting of three LSD domains to structures consisting of one or two such domains, we suggest that even though the loss of LSD domains has occurred during LSD gene family evolution, the basic function of these proteins has been maintained. The three structural classes observed might represent an evolutionary innovation that was required for the further development of diverse plant species, as well as an evolutionary adaptation to diverse environmental conditions and to biotic and abiotic stresses in particular.

In recent years, extensive promoter analyses have identified a large number of cis-elements that are components of the transcriptional regulatory networks that regulate biological processes such as development and the responses to stresses and hormones (Mochida et al. 2009). In soybean, in silico analysis of promoter regions has previously been reported for transcription factors (Mochida et al. 2009), the chalcone synthase genes CHS7 and CHS8 (Yi et al. 2010) and the Bowman-Birk protease inhibitor (Almeida et al. 2011). The analysis of cis-elements can be useful in predicting stress-responsive genes, which may then be used in the genetic engineering of plants better suited to conditions of stress (Tran and Mochida 2010). Genes that are differentially expressed in response to various environmental stimuli tend to possess a greater number of distinct regulatory elements than genes that respond to fewer environmental factors (Walther et al. 2007). In our in silico analysis, several cis-elements involved in different stress responses were identified (Table 1), which indicates that the LSD genes might respond to a variety of signaling pathways when plants are under stress conditions. The ABRE is a major cis-element involved in the response to osmotic stress (Yamaguchi-Shinozaki and Shinozaki 2006). This element interacts with the DRE cis-element in the Rd29A promoter region and induces a response to drought, high salt and cold stresses in addition to ABA treatment (Narusaka et al. 2003). The presence of the ABRE cis-element in the putative promoters of GmLSD2, GmLSD3, and GmLSD7, together with the modulation seen in their expression in response to dehydration, suggests a possible stress response function for these genes that may be regulated by ABA.

At least one copy of the TC-rich repeat element, which is related to pathogen defense and stress response, was identified in the majority of GmLSD gene promoters. Moreover, the number of copies of the TC-rich repeat ranged from one to five, which could explain the differences in modulation observed for each GmLSD gene in response to dehydration and P. pachyrhizi infection because the number of copies of a cis-element in the promoter region affects the function of these regulatory sequences (Mehrotra et al. 2005). According to Mehrotra et al. (2005), the ACGT cis-element functioned as a positive activator of transcription when tobacco leaves were transiently transformed with one or two copies, while in the case of the GT cis-element, a second copy interfered negatively with the positive effect of the first copy.
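To make the notion of cis-element copy number concrete, the following minimal Python sketch counts occurrences of a short motif in promoter sequences. It is offered only as an illustration under stated assumptions and is not the pipeline used in this study: the promoter fragments and gene labels are hypothetical placeholders, the ACGT core is used simply because it is the element named above, and only the given strand is scanned.

```python
import re


def count_motif(promoter: str, motif: str) -> int:
    """Count occurrences of a motif on the given strand, allowing overlaps."""
    # The lookahead makes each match zero-width, so overlapping copies count.
    pattern = f"(?={re.escape(motif.upper())})"
    return len(re.findall(pattern, promoter.upper()))


# Hypothetical promoter fragments; the labels are placeholders and the
# sequences are not GmLSD promoters.
promoters = {
    "geneA": "TTACGTGGCATACGTAAACGTCC",
    "geneB": "GGATCCTTAGACGTTTAAGC",
}

copies = {name: count_motif(seq, "ACGT") for name, seq in promoters.items()}
print(copies)  # prints {'geneA': 3, 'geneB': 1}
```

A per-gene table of such counts is the kind of summary that can then be compared against expression differences, although copy number alone does not establish in vivo function.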

The presence of Box-W1, ERE, TCA, and MBS elements in the promoter regions of the GmLSD3 and GmLSD5 genes indicates that these genes might respond to fungal attack, ethylene, ABA, and drought stress. In our experiments, the expression levels of GmLSD3 and GmLSD5 were modulated in response to dehydration and fungal infection, which was consistent with the cis-elements identified. Similar results have been reported for the cotton mitogen-activated protein kinase 16 gene, wherein these cis-elements were identified and the expression of the gene was modulated in response to pathogens, drought stress and additional molecules such as SA, MeJA, and ABA (Shi et al. 2011).

A highly significant positive correlation between the density of cis-elements upstream of the TSS and the number of conditions in which a gene is differentially regulated has already been reported (Walther et al. 2007). Our in silico analysis indicated that the promoter region of GmLSD3 exhibited greater diversity in its putative cis-elements than was found in other GmLSD genes, which suggests that this gene could respond to a wide range of environmental conditions. In fact, we observed modulation of GmLSD3 expression in all experiments performed. However, additional experiments are necessary to demonstrate the in vivo function of the cis-elements we have identified in GmLSD genes.

To our knowledge, ours is the first report describing a comprehensive expression analysis of all the LSD gene family members in a species. The previously reported expression analyses of AtLSD1, AtLOL1 (A. thaliana LSD1-like 1), and OsLSD1 showed that these genes are constitutively expressed in plant organs (Epple et al. 2003; Wang et al. 2005), while our data showed that the expression of the GmLSD genes is organ-dependent (Fig. 4). A similarly variable expression pattern was also reported for the LOL1 gene in the developed shoots of Bambusa oldhamii (Yeh et al. 2011), which suggests that LSD genes can present a distinct modulation pattern depending on the species and organs analyzed.

In terms of organ-specific expression, a notable observation is that the majority of the GmLSD genes were not expressed in the flowers before fertilization, whereas all GmLSD members were expressed in the flowers after fertilization. To our knowledge, LSD genes have not previously been shown to be involved in the flowering process, and thus, further analysis of this correlation is important because flowering has direct consequences for the yield of crop plants such as soybean (Jung et al. 2012).

The GmLSD8 gene was highly expressed relative to other GmLSD genes (except in leaves, stems and flowers before fertilization). Interestingly, GmLSD8 is the only GmLSD gene with two zinc finger LSD domains. Beyond its role in responding to P. pachyrhizi infection and dehydration stress, GmLSD8 may have evolved novel biological functions. Its expression in the majority of soybean organs points to putative roles in different regulatory pathways during plant development. Further experiments are necessary to better understand the role of GmLSD8 in soybean.

Studies have suggested a distinct biphasic response of genes against P. pachyrhizi infection. The first peak in this response, which occurs within the first 12 hpi, is related to basal defense and is a non-specific modulation that is transient or is suppressed when the early steps in colonization are completed in both the susceptible and the resistant genotypes (van de Mortel et al. 2007). A second peak in gene expression occurs early in resistant genotypes, at approximately 72 hpi, while in susceptible genotypes, the response occurs later, continues unaltered until 96 hpi, and allows rapid fungal growth (van de Mortel et al. 2007; Schneider et al. 2011). Our data suggest that GmLSD1, GmLSD2, GmLSD4, GmLSD5, and GmLSD8 may be involved in the basal response because their expression was modulated at early time points and returned to levels similar to those of mock-treated controls after 48 hpi (Fig. 5). In contrast, the modulation of GmLSD3 and GmLSD6 expression was maintained after 48 hpi, which suggests their involvement in the networks of basal and specific defense against fungal attack. Panthee et al. (2007) performed a microarray analysis of soybean plants in the V2 stage at 72 hpi and reported that the majority of the up-regulated genes were related to defense and stress signaling, which supports our hypothesis that GmLSD3 and GmLSD6 have specific functions against the development and expansion of ASR infection. However, additional experiments are necessary to elucidate the function of these genes in response to ASR.

Recently, RNA-Seq analysis of drought-treated and well-watered fertilized ovary and basal leaf meristem tissues of maize identified significant changes in the expression patterns of several known PCD genes under these conditions (Kakumanu et al. 2012). Among these PCD genes, the maize homolog of AtLSD1 was reported to exhibit a decrease in its transcript abundance under drought stress. To our knowledge, this was the first study associating LSD gene expression with drought stress.

In this study, we characterized the changes in the expression patterns of GmLSD genes in both the leaves and the roots of plants subjected to dehydration conditions. These organs play an important role in signaling dehydration. The imbalance between the rate of transpiration in the leaves and the uptake of water by the roots is an important signal that triggers the response to this stress (Aroca et al. 2011). The plasticity of roots in dry conditions is critical for the response of the plant to this stress because they are the primary site of dehydration signaling and play a fundamental role in recovery after stress (Porcel and Ruiz-Lozano 2004). Moreover, the perception of drought by roots leads to signal transduction throughout the plant via the hormones ABA or ethylene, which induces the closing of the stomata (Dubos and Plomion 2003). Although all GmLSD genes showed a response to dehydration stress, some of these responses are especially interesting. The detection of transcripts of GmLSD6 in leaves and of GmLSD4 in roots only in the sensitive BR16 plants suggests that these genes are involved in a damage response that is specific to this cultivar. Similarly, the differential expression of GmLSD5 in the leaves and of GmLSD1 and GmLSD2 in the roots of the tolerant EMBRAPA48 plants following exposure to dehydration stress indicates an important function for these genes in dehydration tolerance in this soybean cultivar. Future analysis will be useful to clarify the specific role of these genes in the dehydration stress response.

In summary, we identified 117 genes belonging to the LSD gene family in Viridiplantae. We showed that the LSD domain is widely conserved in Viridiplantae and that the protein structure comprising three LSD domains represents the ancestral condition. Expression analyses of GmLSD genes indicated that they are modulated during the response to P. pachyrhizi infection and dehydration stress, which suggests that they play a role under these stress conditions. These findings indicate that these genes could be useful in efforts to improve the stress tolerance and disease resistance of the soybean plant.