Introduction

Long non-coding RNAs (lncRNAs) are non-coding RNA transcripts of more than 200 nucleotides in length (Kapranov et al. 2007). Although most lncRNAs are transcribed by RNA polymerase II, some are transcribed by RNA polymerase III (Gibb et al. 2011). Most lncRNAs reside in the nucleus, with only a few in the cytoplasm, unlike mRNAs, which are mostly cytoplasmic (Kapranov et al. 2007). In general, lncRNAs occur at very low abundance and show specific expression patterns across tissue types and subcellular compartments (Wang et al. 2011b). LncRNAs are classified into i, o, u and x types according to their genomic locations relative to their neighboring genes (Mercer et al. 2009; Ponting et al. 2009). For example, class i (intronic) lncRNAs lie entirely within a gene, and the majority of intronic lncRNAs have expression patterns similar to those of their host genes (Sigova et al. 2013). Following the advent of high-throughput sequencing, especially RNA sequencing (RNA-seq), emerging work has revealed that lncRNAs regulate gene expression in many biological processes through transcriptional, posttranscriptional and epigenetic mechanisms, acting either in cis or in trans (Mercer et al. 2009; Yoon et al. 2013).

Since the first report of lncRNAs in humans by Lukiw et al. (1992), recent work has made significant advances in the identification and functional analysis of lncRNAs. Recently, many studies have reported that lncRNAs are associated with human diseases such as cancer, Alzheimer’s disease, and heart diseases (Taft et al. 2010). The systematic identification of lncRNAs in zebrafish embryos revealed that lncRNAs are relatively short, with low, tissue-specific expression, characteristics shared with their mammalian counterparts (Pauli et al. 2012). In plants, lncRNAs have been identified in Arabidopsis thaliana, Zea mays, Triticum aestivum and Oryza sativa, indicating that lncRNA-mediated gene expression occurs widely in plants (Lu et al. 2012; Boerner and McGinnis 2012). Many studies have indicated that plant lncRNAs may be involved in the regulation of flowering, male sterility, metabolism, and biotic and abiotic stress responses (Zhang et al. 2013a). For instance, the lncRNA of LDMAR regulates photoperiod-sensitive male sterility in rice, and its low expression in photoperiod-sensitive male sterile rice Nongken 58S leads to male sterility under long-day conditions (Ding et al. 2012).

In addition, some studies reported that a subset of lncRNAs can be processed to small RNAs, indicating a potential regulatory function of lncRNAs (Jalali et al. 2012). The large intergenic lncRNA H19 gene includes a miRNA, miR-675, which is involved in the developmental regulation of genes (Cai and Cullen 2007). LncRNAs could also potentially interact with other classes of non-coding RNAs, including miRNAs (Jalali et al. 2013). Like mRNAs, lncRNAs could also be targeted and regulated by miRNAs (Zhang et al. 2013b). For example, analysis of lncRNAs by reverse transcription-polymerase chain reaction array revealed that miR-21 could suppress the lncRNA of growth arrest-specific 5 (GAS5) (Zhang et al. 2013b). LncRNAs have been described as a part of the “dark matter” of the genome and are annotated only rarely; particularly, identification and characterization of lncRNAs in trees remain to be addressed.

As perennial plants and complex multicellular organisms, trees experience many different biological conditions and undergo complex developmental processes during their lives. For example, wood formation (xylogenesis) is precisely regulated and distinguishes trees from herbaceous plants. Cellulose, lignin, and other biopolymers make up the majority of woody tissues, providing a vital raw material for pulp and paper, biofuels, and solid wood products (Mellerowicz and Sundberg 2008). In addition to wood formation during normal growth, trees develop a special kind of wood, tension wood (TW), in branches that are bent or displaced by environmental conditions. In angiosperms, TW forms on the upper side of a bent stem or branch, and has decreased lignin and increased cellulose contents (Timell 1969; Joseleau et al. 2004). In studies of wood development, TW has been reported as an ideal model for identifying the functions of genes involved in wood formation (Kwon 2007). To discover the molecular mechanisms that determine the different lignin and cellulose contents in opposite wood (OW), TW and normal wood (NW), TW models have been used to identify key genes, especially carbohydrate-active genes (Du et al. 2004; Nugroho et al. 2012). A microarray and metabolite analysis of TW-forming tissues in Populus tremula suggested that an increased C flux to cellulose was mainly caused by the up-regulation of sucrose synthase transcripts (Andersson-Gunnerås et al. 2006). Also, in pine species, 4-coumarate-CoA ligase (4CL) expression is stimulated during the formation of compression wood (CW) (Zhang et al. 2013b). Additionally, transgenic suppression of 4CL in Pinus radiata decreased lignin contents (Wagner et al. 2009). The genes involved in the synthesis of plant hormones such as auxin, ethylene and gibberellin have important roles in the formation of TW (Du et al. 2004; Nugroho et al. 2012). 
For example, gibberellin is required for the formation of TW and for stem gravitropism in seedlings of Acacia mangium (Nugroho et al. 2012). Recently, regulation by ncRNAs such as miRNAs has been implicated in the control of normal and TW formation in trees (Lu et al. 2005, 2013). Transgenic studies of miRNAs discovered that overexpression of Ptr-miR397a could reduce lignin content, revealing that the Ptr-miR397a is a negative regulator of laccase genes affecting lignin content in Populus trichocarpa (Torr. & Gray) (Lu et al. 2013). However, whether lncRNAs have functions in wood formation, and how they work in wood formation, remains unknown.

To explore whether lncRNAs have a role in wood formation, we used RNA-seq to identify lncRNAs differentially expressed in the xylem tissues of NW, TW and OW in Populus tomentosa (Carr.). We also found lncRNAs that potentially regulate target genes in trans, and lncRNAs examined by quantitative RT-PCR showed tissue-specific expression. Gene ontology (GO) analysis showed that these potential targets include important genes involved in response to stimulus, and in cellulose, lignin and gibberellin biosynthesis. In addition, we identified a network of interactions between ncRNAs and mRNAs. Thus, our RNA-seq and computational analysis identified novel lncRNAs that have potentially important roles in the regulation of wood formation; moreover, the identification of the lncRNA–miRNA regulatory network places these ncRNAs in a functional context for gene regulation.

Materials and methods

Plant materials

Three 30-year-old individuals of P. tomentosa Carr. with well-developed branches in the national nursery of Guan Xian County (Shandong Province, China; N36°28′28″, E115°26′17″) were used as biological replicates in our study. The largest branch from each tree was selected for study under natural conditions without any treatment. The mature xylem tissues of normal, opposite and tension wood (NW, OW and TW) were isolated from the same tree with a sharp chisel after removing the bark of the sampling area, and the tissues were immediately frozen in liquid nitrogen, then stored at −80 °C for RNA extraction. The TW and OW were taken from the upper and lower sides of the same branch, respectively.

Total RNA isolation, RNA library construction, and Solexa sequencing

Total RNA from TW, OW and NW was isolated by a modified CTAB method (Chang et al. 1993) that used isopropanol instead of lithium chloride for RNA precipitation. DNase was applied to eliminate DNA from the total RNA. Total RNA quality was monitored with a NanoDrop ND-1000 (A260/A280: TW 2.11, OW 2.17 and NW 2.09) and an Agilent Bioanalyzer 2100 (28S/18S: TW 1.6, OW 1.7 and NW 1.5; RIN: TW 8.3, OW 8.6 and NW 8.2) and was of sufficient quality (2100 RIN ≥7.0 and 28S/18S ≥0.7) for RNA-seq. Extracted RNA was used for RNA library construction. Qualified total RNA was purified with the RNeasy micro kit (Cat#74004, Qiagen), with which only RNAs longer than 200 bp were retained, and then processed according to the TruSeq RNA Sample Preparation Guide to build a cDNA library. The cDNA library was sequenced by the Shanghai Biotechnology Corporation (Shanghai, China) on an Illumina HiSeq 2000, following the cBot User Guide and HiSeq 2000 User Guide with the paired-end program.

Prediction of lncRNAs

Pre-processing and assembly of raw sequence data were performed using the CLC Genomics Workbench (version 4.9). Pre-processing included removal of the adapter sequences, low-quality (Q20 <20) sequences, and sequences shorter than 20 nucleotides. Clean reads were then aligned to the Populus trichocarpa genome (http://www.phytozome.net/poplar) (Tuskan et al. 2006) using TopHat (Trapnell et al. 2012) with two base mismatches allowed. The transcripts were assembled using Cufflinks (Trapnell et al. 2012). The Cuffcompare program (Trapnell et al. 2012) was used to merge the RefSeq-, Ensembl- and UCSC Populus-known genes into one set of gene annotations for comparison with the assembled transcripts. All the sequencing data were uploaded to the SRA database under accession number SRP040531. The low-quality transcripts were then filtered according to the optimum FPKM (fragments per kilobase of exon per million fragments mapped) (2.12). The prediction of lncRNAs was performed according to Sun et al. (2012) based on RNA-Seq data, as described in Fig. 1. The lncRNAs were classified according to their genomic location relative to neighboring genes into the ‘i’, ‘u’, ‘o’ and ‘x’ categories as defined by Cufflinks (Roberts et al. 2011). Specifically, the ‘i’ category contained lncRNAs that fall entirely within an intron of a known gene. The ‘u’ category contained intergenic lncRNAs (lincRNAs). The ‘o’ category contained the lncRNAs that have generic exonic overlap with a known transcript, and the ‘x’ category contained lncRNAs that also have exonic overlap with known transcripts, but on the opposite strand.
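
The filtering and classification steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the pipeline used in the study: the `Transcript` record, the `is_coding` flag and the function name are hypothetical, and it assumes that transcripts with FPKM at or above 2.12 and longer than 200 nt are retained.

```python
from dataclasses import dataclass

# Cuffcompare class codes used for the four lncRNA categories in the text.
CLASS_LABELS = {
    "i": "falls entirely within an intron of a known gene",
    "u": "intergenic (lincRNA)",
    "o": "generic exonic overlap with a known transcript",
    "x": "exonic overlap with a known transcript on the opposite strand",
}

MIN_LENGTH_NT = 200  # lncRNAs are defined as longer than 200 nt
MIN_FPKM = 2.12      # FPKM threshold quoted in the text (assumed inclusive)


@dataclass
class Transcript:
    """Hypothetical record for one assembled transcript."""
    transcript_id: str
    class_code: str
    length_nt: int
    fpkm: float
    is_coding: bool  # result of a coding-potential test that is not shown here


def candidate_lncrnas(transcripts):
    """Group transcript IDs that pass all filters by their class code."""
    kept = {code: [] for code in CLASS_LABELS}
    for t in transcripts:
        if t.class_code not in CLASS_LABELS:
            continue  # not one of the i/u/o/x categories
        if t.length_nt <= MIN_LENGTH_NT or t.fpkm < MIN_FPKM or t.is_coding:
            continue  # too short, too weakly expressed, or protein-coding
        kept[t.class_code].append(t.transcript_id)
    return kept
```

Keeping the thresholds as named constants makes it easy to see how sensitive the candidate set is to the FPKM cutoff.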

Fig. 1

Pipeline for predicting novel lncRNAs. Raw reads are first mapped onto the reference Populus trichocarpa genome. The initial assemblies are categorized by cuffcompare, compared with the combined gene annotations. The low-quality transcripts are then filtered according to the optimum FPKM (2.12). The transcripts that encode protein were deleted from the remaining high-quality assemblies

Cufflinks (version 2.0.2) was used to calculate the expression of transcripts (Trapnell et al. 2010). The fold change (FC) in lncRNA expression was calculated as the ratio of the FPKM values of two tissues. Here, FPKM(OW)/FPKM(NW) was defined as OW vs. NW, FPKM(TW)/FPKM(NW) as TW vs. NW and FPKM(OW)/FPKM(TW) as OW vs. TW. The differentially expressed lncRNAs were selected with a log2FC (OW vs. NW, TW vs. NW, or OW vs. TW) ≥1 or ≤−1, and with statistical significance (P < 0.05 and false discovery rate (FDR) <0.01).
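
As a minimal illustration of the selection rule above, the following Python sketch computes log2FC from two FPKM values and applies the stated thresholds. The function names are hypothetical, and the handling of zero FPKM values (rejected here) is an assumption, since the text does not say how they were treated.

```python
import math


def log2_fold_change(fpkm_numerator, fpkm_denominator):
    """Return log2(FPKM_numerator / FPKM_denominator), e.g. TW vs. NW."""
    if fpkm_numerator <= 0 or fpkm_denominator <= 0:
        # Assumption: zero or negative values are rejected rather than padded.
        raise ValueError("both FPKM values must be positive")
    return math.log2(fpkm_numerator / fpkm_denominator)


def is_differentially_expressed(log2fc, p_value, fdr):
    """Apply the cutoffs from the text: |log2FC| >= 1, P < 0.05, FDR < 0.01."""
    return abs(log2fc) >= 1.0 and p_value < 0.05 and fdr < 0.01
```

For example, a transcript with an FPKM of 8 in TW and 2 in NW has a TW vs. NW log2FC of 2 and passes the fold-change cutoff, provided its P value and FDR are also below the thresholds.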

Genomic characterization of lncRNAs

The transcription factors (TFs) binding sites of differentially expressed lncRNAs were identified by BLAST against the Plant Transcription Factor Database v3.0 (http://planttfdb.cbi.pku.edu.cn/) (Jin et al. 2013) using the 2,000-bp (base pair) sequence upstream of the lncRNAs with E value <1e−20. To explore whether some lncRNAs may act as precursors of miRNAs, the published miRNAs in miRBase (Release 20, http://www.mirbase.org/) were aligned to the sequences of the novel lncRNAs, to find known miRNA precursors in the lncRNAs. We used the prediction software Mireap (https://sourceforge.net/projects/mireap/) to predict novel miRNAs by exploring the secondary structures, DL1 cleavage sites, and minimum free energies of the unannotated sRNA tags that could be mapped to the P. trichocarpa reference genome and selected by basic criteria according to Meyers et al. (2008), with low free energy characteristic of stable hairpin structures (<−25 kcal/mol). The secondary structure was predicted with the Vienna RNA package RNAfold web (http://rna.tbi.univie.ac.at/). The target genes of miRNAs were predicted by psRNATarget (http://plantgrn.noble.org/psRNATarget/) (Dai and Zhao 2011) with expectation ≤3.
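
The 2,000-bp upstream window depends on the strand of each lncRNA. The sketch below shows one way to compute it, assuming 1-based inclusive coordinates; the function name and the optional `chrom_length` parameter are hypothetical and are not taken from the study.

```python
def upstream_region(start, end, strand, length=2000, chrom_length=None):
    """Return (region_start, region_end) for the window upstream of a transcript.

    Coordinates are 1-based and inclusive. Returns None when no upstream
    sequence is available (the transcript begins at the end of the sequence).
    """
    if strand == "+":
        # Upstream lies at smaller coordinates, directly before `start`.
        region_end = start - 1
        region_start = max(1, region_end - length + 1)
    elif strand == "-":
        # Upstream lies at larger coordinates, directly after `end`.
        region_start = end + 1
        region_end = end + length
        if chrom_length is not None:
            region_end = min(region_end, chrom_length)
    else:
        raise ValueError("strand must be '+' or '-'")
    if region_end < region_start:
        return None
    return region_start, region_end
```

For a minus-strand transcript, the extracted sequence would also need to be reverse-complemented before it is searched for binding sites.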

Prediction of the potential target genes of lncRNAs

The target genes of differentially expressed lncRNAs acting via trans-regulatory effects were identified by sequence complementarity, as previously described (Han et al. 2012). In the first step, BLAST was used to select target sequences complementary to the lncRNA, setting E value = 1e−5 and identity = 95 %. Then RNAplex was used to select targets with E value = −30 (Tafer and Hofacker 2008). Target genes were annotated with PopGenie (http://www.popgenie.org/) (Sjödin et al. 2009) and gene ontology terms were analyzed with agriGO (http://bioinfo.cau.edu.cn/agriGO/index.php) (Du et al. 2010). Significantly enriched GO categories were selected using an FDR-adjusted value of ≤0.05 as the cutoff. The interactions among mRNAs, lncRNAs and miRNAs were modeled with Cytoscape 3.2 (Saito et al. 2012).
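
The first filtering step can be illustrated with a short Python sketch that parses BLAST tabular output (the standard 12-column format) and keeps hits meeting both thresholds. It assumes the stated values act as cutoffs (E value at most 1e−5, identity at least 95 %); the function name is hypothetical and the RNAplex step is not shown.

```python
def filter_blast_hits(lines, max_evalue=1e-5, min_identity=95.0):
    """Keep (query, subject, identity, evalue) tuples that pass both cutoffs.

    `lines` is an iterable of tab-separated BLAST tabular records with the
    default column order: qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query_id, subject_id = fields[0], fields[1]
        identity = float(fields[2])
        evalue = float(fields[10])
        if evalue <= max_evalue and identity >= min_identity:
            hits.append((query_id, subject_id, identity, evalue))
    return hits
```

The function accepts any iterable of lines, so an open file handle can be passed directly.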

5′-RACE

RNA ligase-mediated 5′-RACE (RLM-RACE) was performed with the First Choice RLM-RACE Kit (Ambion) according to the manufacturer’s instructions with slight modifications. PCR was performed with 5′ adaptor primers and 3′ gene-specific primers (Supplementary material S1) using cDNA as the template. Four putative lncRNAs, one from each of the four categories, were chosen for verification: i: TCONS_00025385 and host gene POPTR_0016s00250; o: TCONS_00016255 and host gene POPTR_0012s02210; x: TCONS_00058236 and reference gene POPTR_0008s02100; u: TCONS_00012272. The RACE products were gel-purified, cloned, and sequenced.

qRT-PCR

qRT-PCR was performed as described, using the TaKaRa ExTaq R PCR Kit and SYBR green dye (TaKaRa, Dalian, China) on a DNA Engine Opticon 2 machine (MJ Research, Waltham, MA, USA) (Zhang et al. 2011). All reactions were carried out in triplicate technical and triplicate biological repetitions. Fourteen differentially expressed lncRNAs from the four classes (Supplementary material S1) were chosen for qRT-PCR verification. Among them, the expression of three lncRNAs (R4, R5 and R8) was detected in five tissues including leaf, phloem, cambium, root, and developing xylem along with TW, NW and OW. To discover the relationship between expression of the lncRNA and expression of its host gene, expression levels of lncRNA and its host gene (Supplementary material S1) were measured by qRT-PCR. Similarly, to find the relationship between lncRNA expression and expression of its potential target gene, expression levels of four pairs (Supplementary material S1) were also measured by qRT-PCR. Among the four genes, lnTqP1 encodes a protein similar to cellulose synthase-like D4 (PtCSLD4), lnTqP3 encodes a protein similar to gibberellin 2-oxidase (GA2ox) and lnTqP4 encodes a protein similar to a putative glucosyltransferase. At the same time, the expression levels of miRNAs and their potential targets (genes and lncRNAs) were also measured, as well as their host lncRNAs (Supplementary material S1). Small RNAs (<200 nt) were isolated using the mirVana miRNA Isolation Kit (Ambion, Foster City, CA, USA) following the manufacturer’s instructions. qRT-PCR of miRNAs precursors was carried out as previously described (Schmittgen et al. 2008). Primers are shown in Supplementary material S1. The efficiency of the primers was calculated by performing real-time PCR on several dilutions of first-strand cDNAs. Efficiencies of the different primer sets were similar. The specificity of each primer set was checked by sequencing PCR products (Zhang et al. 2010). 
The reactions were carried out in a 20-µl volume containing 2 µl of diluted cDNA, 200 nM of each primer, and 1× PCR Master Mix under the following conditions: 95 °C for 30 s, and 45 cycles of 95 °C for 5 s, 58 °C for 15 s, and 72 °C for 20 s. Then, a thermal denaturing cycle of 95 °C for 15 s and 60 °C for 1 min was applied to determine the dissociation curves, which were used to verify the specificity of PCR amplifications. Relative expression levels of candidate genes, miRNAs and lncRNAs were calculated by the \(2^{ - \Delta C_{\rm t} }\) method. The results obtained for the different tissues were analyzed and standardized to the mRNA levels of poplar ACTINII-like (accession number: EF145577), which shows stable expression.
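
For readers unfamiliar with the \(2^{ - \Delta C_{\rm t} }\) calculation, a minimal Python sketch follows. It assumes that \(\Delta C_{\rm t}\) is the Ct of the transcript of interest minus the Ct of the reference gene (here the actin control) and that replicate values are averaged after transformation; the function names are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Return 2^-(Ct_target - Ct_reference) for one reaction pair."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** -delta_ct


def mean_relative_expression(ct_pairs):
    """Average the relative expression over (ct_target, ct_reference) replicates."""
    values = [relative_expression(target, ref) for target, ref in ct_pairs]
    return sum(values) / len(values)
```

A transcript whose Ct is three cycles higher than that of the reference gene therefore has a relative expression of 2^−3 = 0.125.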

Results

Characterization of novel lncRNAs in Populus

In total, our sequencing generated 128,972,228, 117,672,362 and 140,978,316 raw reads in OW, NW and TW, respectively. After trimming, 108,187,750, 101,399,718 and 119,716,602 reads remained in OW, NW and TW, respectively (Supplementary material S2). Our computational analysis identified 1,377 putative lncRNAs from wood xylem tissues that were isolated from three different parts of P. tomentosa (Supplementary material S3). Structural analysis showed that these Populus lncRNAs range from 201 to 4,524 nt, with a mean of 694 nt. The lower bound is expected because the RNeasy micro kit does not recover RNAs shorter than 200 nt, consistent with the definition of lncRNAs as transcripts longer than 200 nt. On average, the lncRNAs were 2.2-fold shorter than protein-coding transcripts, whose average length is 1,560.5 nt. In terms of expression, the average counts for lncRNAs were lower than those for coding transcripts (about 0.13-fold for OW, 0.14-fold for NW, and 0.18-fold for TW). As determined by their genomic locations with respect to nearby genes, the 1,377 lncRNAs included members of all four classes of lncRNAs, with most in the o class, which overlap exonic regions (15 members of the i class, 1,030 in o, 202 in u and 130 in x). To validate the novel lncRNAs, the sequences of four lncRNAs, one from each category, were cloned by 5′-RACE (Fig. S1; Supplementary material S4). The cloning of RACE-PCR products showed that the entire contig can be amplified, indicating that the contigs were not assembly artifacts. As shown in Fig. S1a, the cloned sequence of POPTR_0016s00250, which includes the class i lncRNA TCONS_00025385, shows that the lncRNA was transcribed from the first intron. Also, the class o lncRNA TCONS_00016255 overlaps with the last exon and 3′-UTR of POPTR_0012s02210 (Fig. S1b). Moreover, the reverse-complement sequence of the class x lncRNA TCONS_00058236 covers three parts of POPTR_0008s02100 (Fig. S1c). 
Lastly, the cloning of a class u lncRNA showed that it could be cloned and no gene was located near it (Supplementary material S4).

Differentially expressed lncRNAs

As shown in Fig. 2a, among the 1,377 putative lncRNAs, 54 lncRNAs were only identified in one library, not in the others, with 17 uniquely expressed in TW, 11 in OW and 26 in NW. Also, 766 lncRNAs were differentially expressed in TW, OW and NW with log2FC ≥1 or ≤−1, and FDR ≤0.01. Of these, 32 TFs from 16 families identified by TFs discovery analysis could bind to the 2-kb upstream sequences of 32 lncRNAs (Table 1). In comparison to NW, 589 lncRNAs were differentially expressed in OW, including 380 up-regulated and 209 down-regulated (Fig. 2b; Supplementary material S5). Similarly, 578 lncRNAs were differentially expressed in TW compared to NW, with 370 up-regulated and 208 down-regulated (Fig. 2b; Supplementary material S5). Lastly, 298 lncRNAs were differentially expressed in OW compared to TW, with 135 up-regulated and 163 down-regulated (Fig. 2b; Supplementary material S5). The expression patterns of the common differentially expressed lncRNAs in three wood tissues (109) could be categorized into five major types, A to E (Supplementary material S6). The expression of type A and B lncRNAs such as TCONS_00026694 and TCONS_00028236 was NW > TW > OW and NW > OW > TW, and these lncRNAs were similarly suppressed by tension and compression stresses (Supplementary material S6). The lncRNAs showing lowest expression in NW belonged to type C (TW > OW > NW) and D (OW > TW > NW), and included TCONS_00044376 and TCONS_00055243 (Supplementary material S6). The expression of type E lncRNAs, such as TCONS_00028236, was TW > NW > OW (Supplementary material S6).

Fig. 2

Venn diagram of lncRNAs in TW, NW and OW. a Of the total 1,377 lncRNAs, 17, 11 and 26 were uniquely expressed in TW, OW and NW, respectively. Also, 1,189 lncRNAs were expressed in all three wood tissues. There are 11, 11 and 33 lncRNAs expressed in TW and OW, OW and NW, and TW and NW, respectively. b The summary of differently expressed lncRNAs in OW vs. NW, TW vs. NW and OW vs. TW

Table 1 Putative transcription factors binding sites in the upstream 2,000-bp sequence of significantly differentially expressed lncRNAs

Trans-acting target genes of differently expressed lncRNAs

Computational prediction of potential trans-regulated target genes of the differentially expressed lncRNAs identified 1,151 potential target genes for 389 lncRNAs (Supplementary material S7). To functionally annotate and predict their biological roles, the predicted targets of the differentially expressed lncRNAs were sorted into GO-term categories for molecular function, biological processes, and cellular component using agriGO based on the Blast2GO program (Du et al. 2010) (Supplementary material S8). The significant GO categories were selected with an FDR-adjusted value of ≤0.05 as the cutoff (Table 2). For biological process, nine common significant terms were identified among potential target genes of up-regulated lncRNAs in OW vs. NW, OW vs. TW and OW vs. TW (Table 2). The “response to stimulus” category was identified in OW vs. NW (Table 2), suggesting that target genes predicted from the up-regulated lncRNAs among OW vs. NW were involved in response to stimulus. Within the GO category, “molecular function”, the “transcription regulator activity” and “binding” terms were identified in OW vs. NW, OW vs. TW and OW vs. TW (Table 2). The “molecular transducer activity” was identified among OW vs. NW and OW vs. TW (Table 2). Regarding the “biological process” GO category, of the target genes of down-regulated lncRNAs, eight terms were enriched in OW vs. TW and one in OW vs. NW, including “metabolic process” (Table 2). For “molecular function”, there were two terms, “binding” and “electron carrier activity” identified in TW vs. NW (Table 2).

Table 2 The significant GO categories of significantly expressed lncRNAs target genes

After the general functional exploration of target genes, we next selected genes related to cellulose synthase (UDP-forming) activity, lignin biosynthesis, sugar carrier, UDP-glucuronosyl and UTP-glucose glucosyltransferase, as potentially important target genes in wood formation (Table 3). The lncRNA TCONS_00032782 was predicted to regulate POPTR_0013s07900.1 (Pt-CSLD4), which encodes a protein similar to cellulose synthase 3. As shown in Fig. 3a, the match sites were in the first exon of Pt-CSLD4 and aligned well. Genes encoding glucosyltransferases were predicted as targets of TCONS_00056386 (Table 3; Fig. 3b). Besides the genes related to cellulose, lignin and glucosyltransferase, POPTR_0010s15950.1 (Pt-GA2.3) encodes GA2ox and was predicted as a target of TCONS_00022311 (Fig. 3c). Another gene, predicted to be targeted by TCONS_00053930, encodes a protein similar to 4CL, the central enzyme of the plant-specific phenylpropanoid pathway, which has an important role in lignin biosynthesis (Table 3).

Table 3 The important target genes related to wood formation
Fig. 3

The sites where lncRNAs match their potential target genes. a The match sites of TCONS_00032782 and POPTR_0013s07900.1 (Pt-CESA3.1). b The match sites of TCONS_00056386 and POPTR_0017s06100.1 (Pt-ZOG1.12). c The match sites of TCONS_00022311 and POPTR_0010s15950.1 (Pt-GA2.3). The results show the lncRNAs and their potential target genes match very well and probably interact to cause trans-regulatory effects

LncRNA transcripts as targets of miRNAs and as miRNA precursors

To examine the regulatory functions of the lncRNAs, we next examined their sequences to determine if they could be targets of known miRNAs. The psRNATarget program, with expectation ≤3, predicted 25 lncRNAs as targets of 44 miRNAs from 15 families (Supplementary material S9). Based on this, we also modeled the network of miRNAs regulating lncRNAs, showing that one lncRNA could potentially be targeted by 1–12 miRNAs and one miRNA could target 1–5 lncRNAs (Fig. 4). As shown in Fig. 5, there are perfect seed-match sequences between miRNAs and lncRNAs, indicating that miRNAs may target the lncRNAs. Five of the lncRNAs that were predicted to be targeted by miRNAs may also interact with 11 target genes via trans-regulatory effects, as indicated by the computational analyses described in the previous section. Also, 204 transcripts were predicted to be target genes of the 44 miRNAs that probably regulate lncRNAs. The interaction network of miRNAs, lncRNAs and mRNAs, in which miRNAs regulated lncRNAs and mRNAs, and lncRNAs targeted mRNAs, was modeled using Cytoscape 3.2 (Fig. 6). Since we focused on the network of mRNAs, lncRNAs and miRNAs, only the mRNAs involved in interactions with miRNAs and lncRNAs are shown in Fig. 6. The interaction network involved 44 miRNAs, 25 lncRNAs and 210 mRNAs (5 were commonly regulated by both lncRNAs and miRNAs), showing that mRNAs could be regulated by both lncRNAs and miRNAs (Fig. 6). Thus, the network of lncRNAs, miRNAs and mRNAs shows that lncRNAs could potentially interact with miRNAs and may be a component of regulatory interactions with miRNAs.
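
The mRNA count in the network follows from combining the two target sets: 204 miRNA targets plus 11 lncRNA targets, minus the 5 shared genes, gives 210. The Python sketch below shows how such node counts could be derived from edge lists; the function name and input layout are hypothetical, and this is not the Cytoscape workflow used in the study.

```python
def network_summary(mirna_lncrna_edges, mirna_mrna_edges, lncrna_mrna_edges):
    """Count the distinct nodes in a miRNA-lncRNA-mRNA network.

    Each argument is an iterable of (regulator, target) pairs.
    """
    mirnas = {m for m, _ in mirna_lncrna_edges} | {m for m, _ in mirna_mrna_edges}
    lncrnas = {l for _, l in mirna_lncrna_edges} | {l for l, _ in lncrna_mrna_edges}
    mrnas_via_mirna = {g for _, g in mirna_mrna_edges}
    mrnas_via_lncrna = {g for _, g in lncrna_mrna_edges}
    return {
        "mirnas": len(mirnas),
        "lncrnas": len(lncrnas),
        "mrnas": len(mrnas_via_mirna | mrnas_via_lncrna),
        "shared_mrnas": len(mrnas_via_mirna & mrnas_via_lncrna),
    }


# Inclusion-exclusion check using the counts reported in the text.
assert 204 + 11 - 5 == 210
```

Using sets ensures that an mRNA regulated by both a miRNA and a lncRNA is counted once among the nodes and once among the shared targets.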

Fig. 4

The networks of miRNAs and lncRNAs. 25 lncRNAs were predicted as potential target genes of 44 miRNAs, in psRNATarget with expectation ≤3

Fig. 5

The binding alignments of miRNAs targeting predicted lncRNAs in Populus. The figure shows predicted binding alignment of Ptc-miR1445, Ptc-miR168b-5p, Ptc-miR172b-3p, Ptc-miR172b-5p and Ptc-miR408-5p with predicted Populus lncRNAs

Fig. 6

The interaction networks of miRNAs, lncRNAs and mRNA. Only the mRNAs involved in interactions with miRNAs and lncRNAs are shown

We also checked the sequences for lncRNAs that correspond to miRNA precursors, and found that only three of the 1,377 lncRNAs (0.2 %), i.e., TCONS_00061773, TCONS_00023843 and TCONS_00048079, harbor complete precursors for four miRNAs (Ptc-miR1448, Ptc-miR482a.2, Ptu-172 and Ptu-37) (Table 4; Fig. 7). Among the four miRNAs, two have been reported in miRBase and the other two were novel miRNAs with a characteristic hairpin structure. Prediction of the secondary structure for the four transcripts using the Vienna RNA package RNAfold program showed that these four miRNA precursors had stable hairpin structures (Fig. 7). The precursors of two known miRNAs, Ptc-miR1448 and Ptc-miR482a.2, were located in one lncRNA, TCONS_00061773. No target genes were predicted for these three lncRNAs, but psRNATarget predicted 39 target genes for the four miRNAs (Supplementary material S10).

Table 4 LncRNAs transcripts corresponding to miRNA precursors
Fig. 7

The predicted secondary structures of lncRNAs transcripts and miRNAs sequences. a TCONS_00061773 consists of Ptc-miR1448 and Ptc-miR482a.2 precursors. b TCONS_00023843 contains Ptu-172 precursors. c TCONS_00048079 contains Ptu-37 precursors. The secondary structure was created by Vienna RNA package RNAfold web (http://rna.tbi.univie.ac.at/)

Verification of RNA-seq data and examination of tissue-specific expression of lncRNAs by qRT-PCR

The expression patterns identified by RNA-seq were further validated by qRT-PCR with sequence-specific primers. Figures 8 and 9 illustrate the relative transcript levels of lncRNAs, miRNAs and mRNAs normalized to the expression level of Pt-ACTIN. For this verification, we focused primarily on the transcripts of significantly expressed lncRNAs (R1-R14). As shown in Fig. 8a, similar expression patterns were detected by RNA-seq and qRT-PCR; for instance, R5 (TCONS_00056386) showed its highest expression in OW, then TW, and its lowest in NW (Supplementary material S5) by RNA-seq, and a similar expression pattern was detected by qRT-PCR. The expression of five lncRNAs representing the five expression patterns (R11:A, R12:B, R13:C, R3:D and R14:E) was measured by qRT-PCR and agreed with the results of RNA-seq. Just as RNA-seq revealed that the lncRNAs have specific expression in the three xylem tissues, qRT-PCR showed that the lncRNAs also have tissue-specific expression patterns in five different tissues (Fig. 8b). As indicated in Fig. 8b, the lncRNAs displayed distinct tissue-expression patterns with abundance in root, stem and leaf in poplar. The highest expression of R5 lncRNA was observed in flowers and the lowest in NW (Fig. 8b). The expression of R5 lncRNA in developing xylem is higher than in mature xylem (NW, TW and OW) or in phloem and cambium, suggesting that lncRNAs are differentially expressed in vascular tissues. Of the 14 lncRNAs, seven were located in one known gene (detail in Supplementary material S11); thus, the correlation of expression levels between the host gene and lncRNAs was further analyzed. As shown in Fig. 8c, the lncRNAs and host genes have almost the same expression patterns in TW, NW and OW. For example, both R9 and G9 were most highly expressed in TW (TW > OW > NW). However, the relative expression of lncRNAs and their host genes within one tissue was not consistent; for example, R8 was expressed at a higher level than G8, while R9 was expressed at a lower level than G9. 
To explore the expression relationship between lncRNAs and their potential target genes, four potential target genes were measured by qRT-PCR. The expression of lncRNAs and their potential target genes identified by qRT-PCR showed the two pairs R1–T1 and R4–T4 had the same expression pattern in three xylem tissues. The lncRNAs and their potential target genes of the remaining two pairs were all expressed highest in OW, while their expression in NW and TW was opposite. Expression levels measured by qRT-PCR thus suggested that lncRNAs and their target genes might have similar tissue-specific expression (Fig. 8d).

Fig. 8

The expression levels as measured by qRT-PCR. a The expression level of lncRNAs in TW, NW and OW. b The expression level of lncRNAs in different tissues. c The expression level of lncRNAs and host genes in TW, NW and OW. d The expression level of lncRNAs and their target genes in TW, NW and OW. The relative expression levels were standardized to the levels of Pt-ACTIN mRNA

Fig. 9

The expression levels of miRNAs, lncRNAs and potential target genes as measured by qRT-PCR. a The expression of miRNAs and their potential target lncRNAs. b The expression of miRNAs and their potential target genes. c The expression of miRNAs and their host lncRNAs. The relative expression levels were standardized to the levels of poplar ACTIN mRNA

To examine the relationship between expression of miRNAs and their potential targets, including genes and lncRNAs, we used qRT-PCR to measure the expression of miRNAs and their potential targets (Fig. 9). Figure 9a shows that a negative relationship between miRNAs and their potential target lncRNAs was detected; for example, miR2 (Ptc-miR168b-5p) is most highly expressed in OW (OW > TW > NW), while its potential target lncRNA lmR2 (TCONS_00039944) showed its lowest expression in OW (NW > TW > OW). Also, the miRNA miR3 (Ptc-miR1448) showed the highest expression in TW (TW > NW > OW) and its potential target gene miT3 showed the lowest expression in TW (OW > NW > TW), indicating a negative expression relationship between them (Fig. 9b). Two miRNAs (miR7:Ptu-172 and miR8:Ptu-37) that we predicted in our study were expressed only in TW, and their potential target genes (miT5 and miT6) showed their lowest expression in TW (Fig. 9b). Since Ptc-miR1448 (miR3) and Ptc-miR482a.2 (miR4) were located in TCONS_00061773 (lmR3), we also measured their expression (Fig. 9c), finding that Ptc-miR1448 is much more abundant than Ptc-miR482a.2. However, no significant correlation between miRNAs and their host lncRNAs was found. In summary, the qRT-PCR analysis verified the data from the RNA-seq analysis, and revealed that the lncRNAs had tissue-specific patterns of expression in Populus.

Discussion

The P. tomentosa lncRNAs are shorter and expressed at lower levels than protein-coding transcripts

Our study identified 1,377 putative lncRNAs in three xylem tissues by RNA-seq, and used computational analysis to classify these into four categories to explore their probable functions. Since the whole genome sequence of P. tomentosa has not been generated and hybrids between P. trichocarpa and P. tomentosa are fertile, the genome of P. trichocarpa was used as a reference in our study. Because intron sequences are rapidly degraded by nucleases, the number of intronic lncRNAs detected by RNA-seq was low, lower than the number of class o lncRNAs. However, the transcriptional regulation of lncRNAs is unclear and explaining this phenomenon will require further analysis. Previous studies indicated that lncRNAs are shorter and expressed at significantly lower levels than protein-coding transcripts (Pauli et al. 2012). In our study, the lncRNAs were 2.2-fold shorter, on average, than coding transcripts, and the expression level of these lncRNAs was lower than that of the coding transcripts, in agreement with previous studies (Pauli et al. 2012). LncRNAs also have specific expression patterns in tissue types and subcellular compartments (Wang et al. 2011b) and are only expressed in some conditions and biological processes (Zhang et al. 2013a). In our study, 54 lncRNAs displayed unique expression patterns in different parts of the xylem tissue, suggesting that some lncRNAs are involved in the formation of special wood types (Fig. 2). Among the significantly differentially expressed lncRNAs, we found five expression patterns in OW, TW and NW, suggesting that lncRNAs may be associated with the formation of specialized wood types. According to our definitions, lncRNAs of types A and C can be considered as responding to mechanical stress, but such responses generally do not discriminate between tension and compression stresses, suggesting potentially important roles for type A and C lncRNAs in counteracting overall mechanical stimuli. 
As indicated earlier, lncRNAs of types B, D and E, which showed differences between TW and OW, may be associated with more specialized regulation that leads to preferential development of either TW or OW. High-throughput genomic technologies such as lncRNA microarrays and RNA-seq often generate a set of lncRNAs of interest, but little is known about the transcriptional regulation of lncRNA genes. Here, we identified 32 TFs that may bind upstream of putative Populus lncRNAs, including members of the NAC and MYB families, providing clues as to which TFs could be common transcriptional regulators of lncRNA genes. Together, our data represent the first characterization of lncRNAs in woody plants and provide many clues for the further study of regulation of wood formation in Populus by lncRNAs.

LncRNAs and miRNAs are involved in an interaction network

Our study discovered 25 lncRNAs that may be targeted by 44 miRNAs, and our qRT-PCR studies indicated a probable negative relationship between the expression of miRNAs and their target lncRNAs, in agreement with earlier studies showing that miRNAs can target lncRNAs (Zhang et al. 2013b) (Fig. 5). In our study, ptc-miR1445, which has been predicted to function in cleaving the transcripts of development-related or stress-responsive genes in Populus (Lu et al. 2008), was predicted to target TCONS_00049512. Besides ptc-miR1445, miRNAs that were specifically expressed in xylem and reportedly participate in formation of tension wood (Lu et al. 2005, 2008; Puzey et al. 2012) were predicted to regulate some of the lncRNAs identified in our study. For example, ptr-miR172, which was only suppressed in compression tissue (Lu et al. 2005), may target TCONS_00012272 and TCONS_00012273, which were detected in our study and showed a type C expression pattern (TW > OW > NW). In addition, our study used a computational approach to construct a comprehensive network of RNA-mediated interactions, combining miRNA–mRNA, miRNA–lncRNA and lncRNA–mRNA interactions (Fig. 6). Our results agreed with former studies (Jalali et al. 2013), and suggested that there are widespread regulatory interactions between non-coding RNAs and mRNAs in Populus.
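As an illustration of how the three interaction types can be combined into one network, the following Python sketch stores a few of the predicted interactions named in this article as typed edges. The data layout, the edge-type labels and the helper functions are assumptions made for illustration only; the sketch does not reproduce the software or the full edge set behind Fig. 6.

```python
from collections import defaultdict

# A few predicted interactions named in this article, stored as
# (regulator, target, interaction type). The layout and type labels are
# illustrative assumptions, not the authors' data format.
EDGES = [
    ("ptc-miR1445", "TCONS_00049512", "miRNA-lncRNA"),
    ("ptr-miR172", "TCONS_00012272", "miRNA-lncRNA"),
    ("ptr-miR172", "TCONS_00012273", "miRNA-lncRNA"),
    ("Ptc-miR1448", "miT3", "miRNA-mRNA"),
    ("TCONS_00032782", "Pt-CSLD4", "lncRNA-mRNA"),
]


def build_network(edges):
    """Build a regulator -> [(target, interaction type), ...] adjacency map."""
    network = defaultdict(list)
    for regulator, target, kind in edges:
        network[regulator].append((target, kind))
    return dict(network)


def targets_of(network, regulator, kind=None):
    """Return the targets of a regulator, optionally filtered by interaction type."""
    return [t for t, k in network.get(regulator, []) if kind is None or k == kind]


network = build_network(EDGES)
print(targets_of(network, "ptr-miR172"))
print(targets_of(network, "TCONS_00032782", kind="lncRNA-mRNA"))
```

Keeping the interaction type on each edge makes it possible to query one layer of the network (for example, only miRNA–lncRNA edges) without maintaining three separate structures.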

Our study identified three lncRNAs that contain miRNA precursors, indicating that some lncRNAs are processed into miRNAs in Populus (Fig. 7), in agreement with earlier studies (Cai and Cullen 2007). TCONS_00061773 harbors precursors for two known miRNAs, Ptc-miR1448 and Ptc-miR482a.2 (Fig. 7). Previous studies supported the idea that Ptc-miR1448 may be involved in the resistance of plants to biotic and abiotic stresses (Lu et al. 2008) and Ptc-miR482a.2 has been reported as a tree-specific miRNA in tension wood (Lu et al. 2005, 2008). The expression measured by qRT-PCR revealed that Ptc-miR1448 is much more abundant than Ptc-miR482a.2, although both miRNAs occur in one cluster in the genome (Lu et al. 2005, 2008). These findings supported the idea that TCONS_00061773 may be associated with abiotic stress tolerance in Populus. Although we identified only two novel miRNAs by examination of our lncRNAs, lncRNAs might be considered as a resource for identifying novel miRNAs. Whether lncRNAs function by themselves or as precursors of miRNAs requires more evidence. All these findings supported a functional network of lncRNAs and miRNAs, but this network requires further study, and identifying well-established miRNAs that bind lncRNAs may help to infer the function of lncRNAs.

LncRNAs may function via trans-regulatory effects on genes involved in many processes

Studies have implicated lncRNAs in transcriptional regulation via trans effects in many different ways, such as by activation of TFs, which is the predicted mechanism of action of the lncRNA Evf2 (Bond et al. 2009). The prediction of trans-regulatory target genes of lncRNAs provides a useful way to identify which processes lncRNAs participate in and to infer their potential functions. Our functional prediction according to GO categories (Table 2) showed that the target genes of lncRNAs are involved in many biological processes such as “multicellular organismal process”, and “signaling and biological regulation”. NcRNAs can target transcriptional activators or repressors, different components of the transcription reaction, including RNA polymerase II, and even the DNA duplex to regulate transcription (Goodrich and Kugel 2006). Here, significant GO terms for “transcription regulator activity” and “binding” were identified among the potential target genes of lncRNAs, which agreed with previous studies (Yoon et al. 2013). The “response to stimulus” term was only detected in the target genes of lncRNAs up-regulated in OW vs. NW, indicating a role for lncRNAs on the lower side of a branch, which receives more stimulus from gravity. Since this process is important for plant adaptation to the environment, and the lncRNAs potentially regulate these genes, lncRNAs might play important roles in plant environmental adaptation. Lastly, all the functional predictions implicated the potential target genes of lncRNAs in many biological processes, including development and responses to biotic and/or abiotic stresses, as well as TW formation induced by gravity.
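Significance of a GO term among a set of target genes is commonly assessed with a one-sided hypergeometric test. The Python sketch below shows that general approach; it is an assumption about a typical enrichment calculation, not a description of the test used for Table 2, and the counts are hypothetical toy numbers rather than data from this study.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    population: number of background genes
    annotated:  background genes carrying the GO term
    sample:     number of predicted target genes
    hits:       target genes carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical toy counts, chosen only to keep the arithmetic easy to check.
p_value = hypergeom_enrichment_p(population=10, annotated=3, sample=4, hits=2)
print(p_value)
```

In practice, a p-value of this kind would be computed for every GO term and then corrected for multiple testing before a term is reported as significant.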

LncRNAs may regulate genes involved in wood formation

Twelve lncRNAs were identified that potentially regulate 16 important genes involved in wood formation (Table 3). Cellulose produced by the cellulose synthase (CesA) complex is well known as a basic component of plant cell walls, but the roles of a number of CesAs and cellulose synthase-like proteins in cellulose biosynthesis require further study. CSLD family members were proposed to have glucosyltransferase activity in synthesizing a polysaccharide that has a specialized structural role in the cell walls of tip-growing cells in Arabidopsis (Bernal et al. 2008). Arabidopsis CSLD1 and CSLD4 are important for normal growth of pollen tubes, likely through cellulose deposition (Wang et al. 2011a). Ptc-miRX41 is predicted to target a CSLD4 in P. trichocarpa (Puzey et al. 2012), a finding that opened the door to understanding the role of ncRNAs in regulation of genes involved in cellulose biosynthesis in trees. In our study, TCONS_00032782 was predicted to target Pt-CSLD4, suggesting that lncRNAs might function in wood formation through regulation of CSLD in Populus.

A prominent product of the phenylpropanoid pathway is lignin, the second most abundant naturally occurring polymer, after cellulose; lignin confers the rigidity and mechanical strength needed for plant growth. 4CL is the central enzyme of the plant-specific phenylpropanoid pathway and has an important role in lignin biosynthesis (Hu et al. 2010). TCONS_00053930 was predicted to target 4CL (Table 3), indicating that there might be a regulatory relationship between this lncRNA and 4CL. Another important enzyme involved in the phenylpropanoid pathway is caffeoyl-CoA 3-O-methyltransferase (CCOMT). The methylation steps in the biosynthesis of lignin monomers catalyzed by CCOMT (Maury et al. 1999) are essential in lignin biosynthesis in poplar (Zhong et al. 2000). Strong down-regulation of CCOMT led to reduced lignin levels and altered the composition of guaiacyl and syringyl subunits in lignin (Guo et al. 2001). Although CCOMTs play important roles in determining lignin content, little is known about the regulators of CCOMT. TCONS_00060049 was predicted to target CCOMT, suggesting that it may have a role in lignin formation through regulation of CCOMT (Table 3). Besides 4CL and CCOMT, other important genes, such as the genes encoding UDP-glucuronosyl/UDP-glucosyl transferase family members, were predicted as target genes of some lncRNAs identified in our study (Table 3). More recently, the discovery that Ptr-miR397a functions as a negative regulator of laccase genes, and in turn affects lignin content in transgenic P. trichocarpa (Lu et al. 2013), revealed that ncRNAs can be important regulators of genes involved in lignin biosynthesis. All these results indicated that lncRNAs might take part in regulation of lignin biosynthesis via 4CL and CCOMT, uncovering an area of regulation of lignin biosynthesis genes by ncRNAs.

Apart from genes involved in cellulose and lignin biosynthesis, genes related to plant hormones also play important roles in controlling cambial activity and the differentiation of cambial derivatives in woody plants, and also affect the formation of TW induced by gravity (Du et al. 2004; Israelsson et al. 2005; Dayan et al. 2012). Gibberellin can promote the formation of TW on the upper sides of branches (Yoshida et al. 1999). Recently, Nugroho et al. (2012) reported that gibberellin is required for the formation of tension wood and stem growth in Acacia mangium seedlings. Together, these results suggest that wood formation is likely to be strongly affected by gibberellin. GA2ox is a key enzyme in gibberellin metabolism. In our study, TCONS_00022311 was predicted to target a GA2ox, suggesting that lncRNAs may take part in regulating wood formation through gibberellin metabolism.

Conclusions

Using TW in poplar as a model, we identified differentially expressed lncRNAs in NW, TW and OW by RNA-seq, and predicted their potential roles in wood formation. This study identified and characterized 1,377 putative lncRNAs in Populus. Of these, 589, 578 and 298 were significantly differentially expressed between OW vs. NW, TW vs. NW, and OW vs. TW, respectively. We found that the predicted target genes of lncRNAs are involved in many biological processes, such as cellulose, lignin and gibberellin biosynthesis, suggesting potential roles for lncRNAs in wood formation. In addition, an interaction network between lncRNAs, miRNAs and mRNAs was constructed, indicating that lncRNAs may function through miRNA pathways. This study provides new insights into the role of lncRNAs in the molecular mechanisms underlying reaction wood formation. This lncRNA resource can be mined to explore the functions of lncRNAs and for further genetic studies in Populus.