Introduction

Eucalyptus wood is a competitive source for paper manufacture and bioenergy production (Grattapaglia and Kirst 2008). It is composed of lignin and hemicelluloses in similar amounts, about 25 % each (dry weight), while cellulose makes up the remaining 50 % (Ramírez et al. 2009). Only a few Eucalyptus species meet the wood characteristics desired by the biomass industry. In this context, adjusting cell wall composition to industrial demands is crucial to increase process efficiency. A major obstacle to achieving this is that the genetic control mechanisms behind secondary cell wall formation are still unclear (Zhou et al. 2009).

In recent years, some transcription factors (TFs) related to this process have been discovered and characterized in Arabidopsis thaliana (Zhong et al. 2006; Demura and Fukuda 2007; Mitsuda et al. 2007) and in Eucalyptus (Goicoechea et al. 2005; Legay et al. 2007). More recently, a new group of TFs was assigned the role of master regulator of plant cell wall biosynthesis. Ambavaram et al. (2010) demonstrated that the SHINE/WAX INDUCER (SHN/WIN) clade of TFs coordinates the deposition of cell wall components by directly regulating a wide range of TFs (including MYB and NAC TFs). In this way, SHINE TFs regulate the accumulation of cellulose, lignin and cutin (the top three plant biomass polymers) in plant protective layers, such as those formed during tissue strengthening, abscission, dehiscence and wounding (Aharoni et al. 2004; Ambavaram et al. 2010; Shi et al. 2011).

There are three SHINE genes in A. thaliana, named AtSHN1, AtSHN2 and AtSHN3. They belong to the ERF-B6 (Ethylene Responsive Factor-B6) clade, a subgroup of the AP2/EREBP (APETALA 2/ethylene response element binding protein) TF family (Dietz et al. 2010). All AP2/EREBP genes contain at least one conserved AP2 DNA-binding domain. In addition, SHINE genes contain two exclusive motifs: “mm” (middle domain, approximately 61 amino acids) and “cm” (C-terminal domain, approximately 10 amino acids). Another SHINE-specific characteristic is the presence of a single intron positioned about 80 bp from the start codon (Aharoni et al. 2004).

In A. thaliana, there is another gene that contains the SHINE-specific domains (At5g25190). However, overexpression of this gene did not result in the typical SHINE morphological phenotype, most likely because its “mm” motif is incomplete (Aharoni et al. 2004).

Expression analysis using fusions of SHINE gene promoters with β-glucuronidase provided evidence for their functional roles. AtSHN1 is strongly expressed in floral organs that undergo abscission and weakly expressed in leaves, stems and siliques (Broun et al. 2004). AtSHN2 expression is highly specific and temporally coordinated in anther and silique dehiscence zones. In contrast, AtSHN3 is constitutively expressed in all plant tissues (Aharoni et al. 2004).

In a recent study, Ambavaram et al. (2010) demonstrated that heterologous expression of AtSHN2 in rice (Oryza sativa) caused a 34 % increase in cellulose and a 45 % decrease in lignin content without compromising plant strength or development. Moreover, lignin composition was also altered in the SHINE transgenic plants, leading to improved digestibility.

In this context, given the economic potential of SHINE TFs for increasing wood productivity, this work searched for homologous genes in Eucalyptus through similarity searches against E. grandis EST and genome data. The results were complemented by expression analysis of the candidate genes in order to investigate their putative role in cell wall biosynthesis.

Materials and methods

Bioinformatics analysis

Putative transcription factors of the SHINE family were searched in the Eucalyptus spp. transcriptome database “Eucspresso” (Mizrachi et al. 2010) and in the E. grandis EST database generated by the Genolyptus Project (http://www.lge.ibi.unicamp.br/eucalyptus/). The databases were queried with the three publicly available A. thaliana SHINE protein sequences (AtSHN1, gi: 28950720; AtSHN2, gi: 48479321; AtSHN3, gi: 28973112) using the tBlastn algorithm (Altschul et al. 1997). Putative Eucalyptus SHINE EST sequences were anchored in the Eucalyptus grandis genome to fill gaps and to define the entire coding sequences, based on the gene predictions produced during the Genome Project (Eucalyptus grandis Genome Project 2010, unpublished; http://www.phytozome.net/eucalyptus). The complete sequences can be accessed at http://www.phytozome.net/cgi-bin/gbrowse/eucalyptus/ using the queries Eucgr.C04221.1, Eucgr.C01178.1, Eucgr.C02719.1 or Eucgr.F03947.1 for EgrSHN1, EgrSHN2, Egr33m or Egr40m, respectively.

Phylogenetic analysis

Multiple sequence alignments were carried out using ClustalW2.0 with default settings (Larkin et al. 2007). SHINE homologous gene sequences from A. thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa were obtained from NCBI databases using tBlastn with the AtSHN sequences as queries. The amino acid sequences from all species were aligned to the Eucalyptus sequences using the Muscle tool (Edgar 2004) in MEGA5 software (Tamura et al. 2011). Phylogenetic trees were constructed using the PhyML 3.0 program, available online at http://www.atgc-montpellier.fr/phyml/ (Guindon et al. 2010), with the Maximum Likelihood method, the Jones-Taylor-Thornton (JTT) model (Jones et al. 1992) and a gamma distribution with 4 discrete categories. This model was chosen according to the Akaike information criterion available in MEGA5. Bootstrap values were calculated with 1,000 replicates.
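As an illustration of the model-selection step, the Akaike information criterion can be sketched in a few lines of Python. This is not the MEGA5 implementation; the model names, log-likelihoods and parameter counts below are invented placeholders used only to show the calculation (AIC = 2k - 2 ln L, lowest value preferred).

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(models):
    """Return the name of the model with the lowest AIC.

    `models` maps a model name to (log_likelihood, n_free_parameters).
    """
    return min(models, key=lambda name: aic(*models[name]))


# Hypothetical values for illustration only; they are NOT results of this study.
candidates = {
    "JTT+G": (-5100.0, 40),
    "WAG+G": (-5110.0, 40),
    "JTT": (-5150.0, 39),
}

best = select_by_aic(candidates)
print(best, aic(*candidates[best]))
```

With real data, the log-likelihood and the number of free parameters of each candidate substitution model would come from the model-fitting output of the phylogenetics software.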

Gene expression analysis

Gene expression was investigated by mining the RNAseq database generated from xylem tissue of three Eucalyptus species: E. grandis, E. globulus and E. urophylla, as described in Salazar et al. (submitted).

Gene expression was also investigated by quantitative real-time PCR (RT-qPCR) using mRNA obtained from immature xylem, flowers and leaves of 5-year-old Eucalyptus urograndis trees (n = 3). Total RNA was extracted as described by Zeng and Yang (2002). After cDNA synthesis (SuperScript™ II, Invitrogen), RT-qPCR was performed using the following primers: EgrSHN1 (F, 5′-GCAATCAAAGAAGTTCAGAGGAG-3′, R, 5′-AAGGTGCCAAGCCAGACTCG-3′); EgrSHN2 (F, 5′-CCGAAATTCGCCATCCTCTG-3′, R, 5′-ATCAAGATGGCGGCCTGGTC-3′); housekeeping gene Histone H2B (F, 5′-GAGCGTGGAGACGTACAAGA-3′, R, 5′-GGCGAGTTTCTCGAAGATGT-3′). The data were analyzed according to Pfaffl (2001).
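For readers unfamiliar with the Pfaffl (2001) model, the relative expression ratio it defines can be sketched as follows. The function and argument names are illustrative, and the efficiencies and Ct values in the example are hypothetical, not measurements from this study. The sketch assumes that the "control" is the reference condition (for example immature xylem) and that the reference gene is the housekeeping gene (for example Histone H2B).

```python
def pfaffl_ratio(e_target, e_ref,
                 ct_target_control, ct_target_sample,
                 ct_ref_control, ct_ref_sample):
    """Relative expression ratio of a target gene (Pfaffl 2001).

    ratio = E_target ** (Ct_target_control - Ct_target_sample)
            / E_ref ** (Ct_ref_control - Ct_ref_sample)

    E is the amplification efficiency per cycle (2.0 means perfect doubling).
    """
    delta_ct_target = ct_target_control - ct_target_sample
    delta_ct_ref = ct_ref_control - ct_ref_sample
    return (e_target ** delta_ct_target) / (e_ref ** delta_ct_ref)


# Hypothetical example: the target crosses threshold 3 cycles earlier in the
# sample than in the control, while the reference gene is unchanged.
ratio = pfaffl_ratio(2.0, 2.0, 30.0, 27.0, 20.0, 20.0)
print(ratio)  # 8.0
```

A ratio above 1 indicates higher target expression in the sample than in the reference condition, after normalization to the reference gene.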

Results and discussion

Gene discovery and bioinformatic analysis

The search for homologous genes in the Eucalyptus genome using A. thaliana SHINE proteins indicated the existence of four putative Eucalyptus SHINE ESTs. These sequences were anchored in the E. grandis genome to obtain the complete genes, using the database generated by Zander Myburg and colleagues (Eucalyptus grandis Genome Project 2010, http://www.phytozome.net/eucalyptus).

Of the four genes, only two presented all the features required to classify them as true SHINE genes: a single intron located 80 bp (EgrSHN1) or 86 bp (EgrSHN2) from the start codon, an AP2 DNA-binding domain, and the “mm” (middle) and “cm” (C-terminal) domains. They were therefore named EgrSHN1 and EgrSHN2, according to their respective similarity to the AtSHN genes (Fig. 1).

Fig. 1

Protein sequence alignment between E. grandis and A. thaliana SHINE transcription factors. All characteristic SHINE motifs are entirely present in EgrSHN1 and EgrSHN2. However, the “mm” domain is only partially present in Egr33m and Egr40m, the other two Eucalyptus sequences found in the E. grandis genome by tBlastn using the AtSHNs as queries. At5g25190 is the corresponding A. thaliana gene that, like Egr33m and Egr40m, lacks part of the “mm” domain despite having the “AP2” and “cm” motifs

The amino acid sequence similarity across the “mm” domain between EgrSHNs and AtSHNs was high enough to support their identification as SHINE genes. While the minimum “mm” domain identity between SHINE members is approximately 60 % (Aharoni et al. 2009), the similarity between EgrSHNs and AtSHNs is above 72 %. The similarity between EgrSHNs and AtSHNs is also high (56 %) over the full-length protein sequence.
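To make this kind of comparison concrete, the sketch below computes percent identity between two sequences taken from an existing alignment. It is not the procedure used in this work: the toy sequences are made up, and counting only columns without gaps in the denominator is an assumption (other conventions exist). It also measures strict identity, not similarity based on a substitution matrix.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of a pairwise or multiple alignment.

    Columns where either sequence has a gap are ignored (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Made-up toy alignment, for illustration only.
print(percent_identity("MKV-LA", "MRVQLA"))  # 80.0
```

Restricting the input to the alignment columns that span a given motif (for example the “mm” domain) would give a domain-level value, while the full rows give the full-length value.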

The other two Eucalyptus genes identified also present the typical SHINE characteristics, with one exception: the “mm” domain is not complete (Fig. 1). Phylogenetic analysis of these transcripts (Fig. 2a) reveals that both sequences are positioned outside the SHINE clade; instead, they are probable orthologs of At5g25190, which also presents all SHINE-specific domains but lacks part of the “mm” motif (Dietz et al. 2010).

Fig. 2

a Phylogenetic relationship between the SHN clade and the closely related out-group genes: an AtERF-B6 gene member (At5g25190) and its putative Eucalyptus orthologs (Egra40m: Egrandis_v1_0.027840m and Egra33m: Egrandis_v1_0.028133m). An AtDREB-A2 gene member (At2g40350) was also included to show that the PtDREB sequences are more closely related to AtSHNs than to AtDREBs. b Detail of the SHINE clade shown in a. Genes from five plant species were included: E. grandis (EgrSHN1 and EgrSHN2), A. thaliana (AtSHN1, AtSHN2 and AtSHN3), Populus trichocarpa (PtDREB27, PtDREB28, PtDREB29, PtDREB30 and PtDREB31), Vitis vinifera (Vv_LOC100250826, Vv_LOC100259484, Vv_LOC100266250) and Oryza sativa (OsSHN1 and OsSHN2). The scale bar of 0.2 corresponds to 20 % sequence divergence. Bootstrap values are given at the nodes and indicate the support for each branch (Tamura et al. 2011)

Phylogenetic analysis indicated that EgrSHN1 and EgrSHN2 group in the same clade as the sequences identified as SHINE genes in other species (A. thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa) (Fig. 2a, branch identified as SHINE clade and detailed in Fig. 2b). Interestingly, because the phylogeny was built from sequences retrieved for each species by tBlastn searches using the well-annotated AtSHN genes as queries, rather than by keyword searches in public databases, in at least one case the results pointed to an incorrect gene annotation.

In P. trichocarpa, the sequences most similar to SHINE proteins were those annotated as DREB proteins (PtDREB27, PtDREB28, PtDREB29, PtDREB30 and PtDREB31), another subgroup of the AP2/EREBP family (Tuskan et al. 2006; Zhuang et al. 2008). According to the phylogeny, however, these sequences group in the SHINE clade, indicating a possible error in gene annotation (Fig. 2b). Further analysis of these sequences shows that the P. trichocarpa genes are evolutionarily closer to AtERF-B6 genes than to AtDREB-A2 (Supplementary Fig. 1) and contain all typical SHINE-specific motifs (Supplementary Fig. 2). Based on this evidence, we suggest that the PtDREB27, PtDREB28, PtDREB29, PtDREB30 and PtDREB31 sequences can be reclassified as PtSHNs.

Another result of the phylogenetic analysis provides strong evidence that the Eucalyptus EgrSHNs evolved from the same ancestral sequence through a duplication event (Fig. 2b). According to our analysis, this kind of gene evolution seems to be common in the SHINE family, as seen for AtSHN2 and AtSHN3, PtSHN30 and PtSHN31, and OsSHN1 and OsSHN2. It is also possible to infer that some gene loss events probably occurred during SHINE evolution. For example, V. vinifera has a gene orthologous to PtDREB27 and PtDREB28, which indicates that this gene was already present before E. grandis speciation. Since E. grandis has no corresponding ortholog, the gene was probably lost in this species. This tendency towards gene loss is not surprising, since SHINE paralogs perform redundant functions (Shi et al. 2011).

Gene expression results

Expression analysis was performed by evaluating the RNAseq database generated from xylem of E. grandis, E. globulus and E. urophylla (Salazar et al. manuscript in submission). The results reveal the absence of EgrSHN1 and EgrSHN2 transcripts in the xylem of these three species, which is similar to the expression observed for the AtSHN1 and AtSHN2 genes (Broun et al. 2004). Consistently, qRT-PCR results indicated that EgrSHN1 expression resembles AtSHN1 transcription in A. thaliana, i.e. higher expression in flowers than in leaves and immature xylem (Fig. 3a) (Broun et al. 2004). For EgrSHN2, the qRT-PCR results were inconclusive (data not shown).

Fig. 3

a EgrSHN1 expression ratio in flower (rich in abscission and dehiscence zones), leaf and immature xylem assessed by qRT-PCR. Data are expressed as fold change, with “immature xylem” as the reference condition. The experiment was carried out with biological triplicates; error bars represent SE (n = 3). b RNAseq data for the two genes phylogenetically closest to the EgrSHNs. In this experiment, young E. urograndis plants were supplemented with flavonoids (chalcone and naringenin) in order to reduce wood lignification (Lepikson et al. submitted). Both sequences were repressed in samples supplemented with flavonoids, indicating a possible involvement in wood formation

The expression of the Egr33m and Egr40m genes was also investigated in the RNAseq database. In contrast to the EgrSHNs, both transcripts were detected in the xylem of the three Eucalyptus species (Salazar et al. manuscript in submission). In addition, in an RNAseq database generated in another study by our group (Lepikson-Neto et al. manuscript in preparation), Egr33m and Egr40m were repressed in the presence of the flavonoids naringenin-chalcone and naringenin (Fig. 3b). Since flavonoid supplementation has been shown to be an efficient way to decrease wood lignification (Besseau et al. 2007), we hypothesize that both genes might be involved in lignin deposition.

We did not perform qRT-PCR for the Egr33m and Egr40m genes, since we focused on characterizing the genes considered true Eucalyptus SHN TFs.

In conclusion, two Eucalyptus SHINE genes were identified (EgrSHN1 and EgrSHN2), as well as two closely related sequences (Egr33m and Egr40m). Phylogenetic analysis indicated that the two EgrSHNs evolved from the same ancestral sequence through a duplication event. The expression similarities between AtSHNs and EgrSHNs suggest that the Eucalyptus SHINEs may indeed perform SHINE functions. Additionally, Egr33m and Egr40m might also participate in cell wall biosynthesis, since their expression is altered under a lignification-inhibiting treatment, i.e. flavonoid supplementation of Eucalyptus seedlings. The identification of SHINE transcription factors in Eucalyptus can provide information about wood formation processes, which may help increase the productivity of forest plantations.