Introduction

Over recent years, numerous studies have found that non-coding RNAs (ncRNAs) can act as regulatory molecules in various developmental processes and in the responses of plants to biotic or abiotic stress (Liu et al. 2015; Zhang and Chen 2013). According to their length, ncRNAs are arbitrarily classified into small ncRNAs (sRNAs; <40 nt), such as microRNAs (miRNAs), small interfering RNAs (siRNAs), natural antisense siRNAs (nat-siRNAs), and trans-acting siRNAs (ta-siRNAs), and long ncRNAs (lncRNAs). LncRNAs are ncRNAs longer than 200 nt that are transcribed by RNA polymerase II and lack an open reading frame (ORF) (Wang and Chang 2011). Based on their genomic locations, lncRNAs are generally categorized into i (intronic lncRNAs), o (overlapping lncRNAs), u (intergenic lncRNAs), and x (antisense lncRNAs) types (Roberts et al. 2011). Some lncRNAs have been shown to be involved in plant flowering, sexual reproduction, male sterility, and various biotic or abiotic stress responses by regulating the expression of related genes at various levels, including chromatin remodeling, transcription control, post-transcriptional processing, protein function or localization, and inter-cellular signaling (Zhang and Chen 2013; Heo et al. 2013; Ponting et al. 2009). Nitrogen (N) is an indispensable macronutrient for the normal growth and development of plants, which need to take up sufficient nitrate (NO3−) and/or ammonium (NH4+) from the soil. Beyond functioning as nutrients, various nitrogenous compounds, such as amino acids, proteins, and nucleic acids, can act as signaling molecules that modulate plant gene expression, metabolism, and growth (Crawford and Forde 2002; Vidal and Gutierrez 2008). N deficiency is one of the main abiotic stresses limiting the productivity of trees that receive no regular fertilizer addition. Thus, it is important to study the gene expression mechanisms underlying tree adaptation to N stress in forest soil so as to improve N use efficiency (NUE) in trees and thereby enhance tree yields.

Populus has emerged as a common model tree for studies of metabolism, physiology, growth, and various environmental stimuli of woody plants. Recently, mechanisms of responses of trees to low N have been studied with multiple phenotype changes and physiological characterizations, whereas the signaling pathways and the molecular and genetic mechanisms underlying N responses are less clear (Rennenberg et al. 2010). Nitrate transporter 1 (NRT1), calcineurin B-like protein (CBL), chloride channel (CLC), and hormones (auxins and cytokinins) have been found to be involved in nitrate uptake, transport, allocation, assimilation, and N-signaling pathways in higher plants (Vidal et al. 2010; Wang et al. 2012). N-limitation also leads to global changes in RNA levels, and genome-wide transcriptome analysis has been used broadly to identify a variety of stress-responsive ncRNAs (Chen et al. 2015a; Ren et al. 2015; Liang et al. 2012; Wang et al. 2013). In plants, lncRNAs have recently been identified globally in Arabidopsis thaliana (Liu et al. 2012), rice (Oryza sativa) (Zhang et al. 2014a), maize (Zea mays) (Li et al. 2014), Digitalis purpurea (Wu et al. 2012), Panax ginseng (Wang et al. 2015), Medicago truncatula (Wen et al. 2007), Salvia miltiorrhiza (Li et al. 2015) and wheat (Triticum aestivum) (Zhang et al. 2013a). In A. thaliana, 6480 intergenic lncRNAs were predicted and 2708 lncRNAs were identified by RNA sequencing (Liu et al. 2012), and 626 concordant and 766 discordant light-responsive antisense lncRNAs (lncNATs) were detected (Wang et al. 2014). In total, 125 putative stripe rust pathogen-responsive lncRNAs have been identified in wheat (Zhang et al. 2013a), and 2542 lncRNA candidates were detected in drought-treated Populus trichocarpa (Shuai et al. 2014). In addition, a previous study identified 1377 lncRNAs and revealed 776 differentiated lncRNAs in tension wood, opposite wood, and normal wood xylem in Populus tomentosa (Chen et al. 2015b).

lncRNAs can regulate gene expression in various ways. Similar to siRNAs and miRNAs, lncRNAs can pair with mRNA transcripts, mediating cleavage, translation inhibition, and gene silencing. For example, an antisense lncRNA, COOLAIR, bound to and blocked the transcription of the sense transcript of FLC (flowering locus C) for vernalization in A. thaliana (Heo and Sung 2011). Some lncRNAs can act as precursors to the biogenesis of small RNAs, including miRNAs and siRNAs (Zhang et al. 2014b), whereas certain lncRNAs, such as IPS1 and AT4 (Franco-Zorrilla et al. 2007), can act as target mimics of miRNAs, interfering with the interaction between miRNAs and their target mRNAs to regulate gene expression. A lncRNA was also found to be involved in a NSR–ASCO–lncRNA regulatory module to regulate the alternative splicing patterns of several target mRNAs during root development in A. thaliana (Bardou et al. 2014). In addition, lncRNAs can affect RNA-protein complexes, regulate the activity of proteins, and play important roles in specific biological processes (Shafiq et al. 2015).

To date, numerous N-responsive miRNAs and the corresponding targets have been identified in a large range of plant species, including A. thaliana (Liang et al. 2012), rice (Nischal et al. 2012), soybean (Glycine max) (Wang et al. 2013), Chrysanthemum nankingense (Song et al. 2015), and P. tomentosa (Chen et al. 2015a). For example, upregulated miR156, with target genes encoding SBP (squamosa promoter binding protein) and SPL (SBP-like) in A. thaliana (Liang et al. 2012), rice (Nischal et al. 2012), and maize (Zhao et al. 2013), can improve NUE by increasing anthocyanin production. In addition, in A. thaliana (Liang et al. 2012), maize (Zhao et al. 2013), common bean (Valdes-Lopez et al. 2010), and P. tomentosa (Ren et al. 2015), downregulated miR169 and miR399 induce the accumulation of NFYA transcription factor and UBC24 (PHO2), respectively, thus regulating the N-signaling pathway in response to low-N stress in plants (Fischer et al. 2013).

In contrast to miRNAs, the molecular mechanisms of lncRNA-mediated gene regulation in specific biological processes remain largely unknown. So far, there have been no reports of lncRNAs responding to N stress in woody plants. P. tomentosa Carr. (the Chinese white poplar) is an important indigenous tree species that is widely distributed in northern China; it is fast-growing and of high economic and environmental value, especially with regard to timber production and forestry enhancement (Du et al. 2013; Jansson and Douglas 2007). Consequently, studying lncRNAs involved in N-responsive regulation in P. tomentosa should identify low-N-responsive lncRNAs, improve our understanding of the lncRNA-guided N signaling pathway, and provide more regulatory elements for increasing tree NUE.

A genome-wide analysis was used in this investigation to detect and identify lncRNAs in P. tomentosa under low-N stress. In total, 388 lncRNA candidates were identified. To explore the lncRNAs involved in the response to N deficiency, the fragments per kilobase of transcript per million mapped fragments (FPKM) and the fold change of lncRNA expression between the two libraries were calculated. In total, 126 low-N-responsive lncRNAs were detected: 8 repressed and 118 induced lncRNAs. Moreover, some lncRNAs were identified as precursors of miRNAs, and certain lncRNAs were predicted to be cleaved by miRNAs, thus acting as their targets. The interaction network among lncRNAs, miRNAs, and mRNAs was also investigated. Finally, 111 protein-coding genes with regions complementary to 38 lncRNAs were predicted and their potential functions were analyzed further. Overall, our investigation revealed that lncRNAs can play a significant role in the response of P. tomentosa to low-N stress, and provides new insights for further research on the molecular mechanisms of lncRNAs and related miRNA pathways in nutrient deficiency in trees.

Materials and methods

Plant materials and transcriptome sequencing

Populus tomentosa plantlets were grown on ½ Murashige–Skoog (MS) medium (20 g L−1 sucrose and 0.4 mg L−1 indole-3-butyric acid; IBA) at 25 °C under a 16/8-h light/dark photoperiod. After 60 days, the plantlets were transferred to conditions of sufficient N for 5 days of hydroponics, with modified ½MS liquid medium changed every 2 days. Then, the hydroponic plantlets were either treated or not treated under low-N conditions for 3 days, following previous studies (Ren et al. 2015; Wei et al. 2013). The modified ½MS liquid medium for low-N conditions contained 0.01 mM NH4NO3 and 1 mM KCl (DN), and untreated medium with sufficient N, containing 2 mM NH4NO3 and 1 mM KNO3, was used as a control (KK). After these treatments, three plantlets each of DN and KK were harvested, snap-frozen in liquid nitrogen, and then stored at −80 °C until RNA extraction. Total RNAs were isolated from P. tomentosa plantlets using the TRIzol reagent (Invitrogen, Carlsbad, CA, USA), and cDNA libraries of DN and KK were constructed as described previously (Zhai et al. 2013). The quality of total RNAs was examined with an Agilent 2100 Bioanalyzer, and the qualified libraries were sequenced using an Illumina HiSeq 2000 at the Beijing Genomics Institute (BGI), Shenzhen, China.

Prediction and differential expression analysis of lncRNAs in P. tomentosa under low-N stress

Raw data obtained from the HiSeq 2000 were preprocessed to filter out 5′ adapters, 3′ adapters, reads shorter than 18 nt, and low-quality reads. Then, the clean reads from the two samples were aligned to the genome and gene sets of the model tree species P. trichocarpa (http://www.phytozome.net/poplar.php) using TopHat (ver. 2.0.11), allowing no more than two mismatches (Trapnell et al. 2012). Subsequently, transcripts were assembled from the clean RNA-seq reads using Cufflinks (ver. 2.2.1) (Trapnell et al. 2012), with the transcript directions determined by the conserved 5′GT and 3′AG splice sites. Next, Cuffmerge (ver. 2.2.1) was used to merge the reference transcripts with the reference annotation-based transcript (RABT) assembly, producing a single annotation file for the subsequent analysis. Then, the Cuffcompare program (ver. 2.2.1) was applied to compare the assembled transcripts with reference annotations (known Populus mRNAs and lncRNAs), and transcripts longer than 200 bp were selected as candidates. According to their genomic locations relative to reference genes and lncRNAs, as reported by Cufflinks, the assembled transcripts were classified into several types, including j, x, u, and o (Roberts et al. 2011). The resulting transcripts were then subjected to protein-coding capacity prediction. The coding potential of transcripts was assessed using the Coding Potential Calculator (CPC) (Kong et al. 2007) and Pfam (Pfam, ver. 27.0) with Populus protein-coding genes as a reference (Fan and Zhang 2015). The non-coding transcripts were then defined as lncRNAs (Fig. 1). Specifically, category j contained lncRNAs that shared at least one identical splice junction with a reference gene. In category o, lncRNAs had generic overlap with the exon of a reference transcript, whereas the x category contained lncRNAs that overlapped a known protein-coding gene on the opposite strand. Finally, lncRNAs falling in the intergenic space between two protein-coding loci were classified into category u.
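The class-code and length selection described above can be illustrated with a minimal sketch. It assumes that the assembled transcripts have already been parsed into simple records; the `Transcript` record, the `select_candidates` function, and the example identifiers are hypothetical and are not part of the Cufflinks suite. The text gives the length cutoff both as "longer than 200 bp" and as "≥200 bp"; the sketch assumes the inclusive form.

```python
from dataclasses import dataclass

# Class codes retained in this study (Cuffcompare notation).
KEPT_CLASS_CODES = {"j", "o", "u", "x"}
# Assumed inclusive length cutoff in bp (see the note above).
MIN_LENGTH_BP = 200


@dataclass(frozen=True)
class Transcript:
    """Hypothetical record for one assembled transcript."""
    transcript_id: str
    class_code: str
    length_bp: int


def select_candidates(transcripts, min_length=MIN_LENGTH_BP):
    """Keep transcripts with a retained class code and sufficient length."""
    return [
        t for t in transcripts
        if t.class_code in KEPT_CLASS_CODES and t.length_bp >= min_length
    ]


if __name__ == "__main__":
    example = [
        Transcript("t1", "u", 350),   # intergenic, long enough: kept
        Transcript("t2", "=", 900),   # exact match to a reference: dropped
        Transcript("t3", "x", 150),   # antisense but too short: dropped
    ]
    for t in select_candidates(example):
        print(t.transcript_id, t.class_code, t.length_bp)
```

In the actual pipeline this selection precedes the coding-potential step, so the retained records would still have to pass CPC and Pfam screening before being called lncRNAs.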

Fig. 1 Computational pipeline for the prediction of novel lncRNAs in P. tomentosa. First, the RNA-Seq data were mapped to the reference genome of Populus with TopHat (ver. 2.0.11). Transcripts were assembled and merged using Cufflinks and Cuffmerge with a reference annotation-based transcript (RABT), and the initial assembled transcripts were compared to known transcripts of Populus using Cuffcompare. High-quality assemblies with length ≥200 bp were retrieved. Finally, the protein-coding potential of these transcripts was predicted with CPC, and the protein-coding transcripts were removed to obtain novel lncRNAs

Cuffdiff (ver. 2.2.1) (Trapnell et al. 2012) was used to calculate the lncRNA expression levels in DN and KK, to test the statistical significance of differentially expressed transcripts between two samples, and to screen significantly altered lncRNAs. The lncRNA expression levels were normalized to FPKM. The fold change of lncRNA expression under N-deficient conditions was equal to the log2 ratio: log2 (FPKM [DN]/FPKM [KK]). Based on these calculations, lncRNAs with the absolute value of a log2 ratio ≥1 and P value ≤0.05 were regarded as being significantly changed.
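The fold-change criterion above can be written out as a short worked sketch. The function names and the example FPKM values are illustrative assumptions; in the study itself these statistics were produced by Cuffdiff. How transcripts with zero FPKM in one library are handled is not stated in the text, so the sketch simply returns `None` for them.

```python
import math


def log2_fold_change(fpkm_dn, fpkm_kk):
    """Return log2(FPKM[DN] / FPKM[KK]), or None if either value is not positive."""
    if fpkm_dn <= 0 or fpkm_kk <= 0:
        return None  # assumption: undefined ratios are left for separate handling
    return math.log2(fpkm_dn / fpkm_kk)


def is_significant(fpkm_dn, fpkm_kk, p_value, min_abs_log2=1.0, max_p=0.05):
    """Apply the thresholds |log2 ratio| >= 1 and P value <= 0.05."""
    lfc = log2_fold_change(fpkm_dn, fpkm_kk)
    return lfc is not None and abs(lfc) >= min_abs_log2 and p_value <= max_p


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    print(log2_fold_change(8.0, 2.0))
    print(is_significant(8.0, 2.0, 0.01))
    print(is_significant(3.0, 2.0, 0.01))
```

A log2 ratio of 1 corresponds to a twofold increase in DN relative to KK, and a ratio of −1 to a twofold decrease.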

lncRNAs identified as miRNA precursors

Small RNAs, such as miRNAs and siRNAs, are generated from precursors, some of which have also been identified as a kind of lncRNA (Zhang et al. 2014b). To identify lncRNAs acting as precursors of known or novel miRNAs in low-N-treated P. tomentosa, the lncRNAs were aligned with precursors of known miRNAs in miRBase 21.0 (http://www.mirbase.org/) using BLAST with default parameters. The lncRNAs homologous to miRNA precursors with >90 % coverage were defined as miRNA precursors (Meyers et al. 2008). The program ‘miRPare’ was used to predict novel miRNA precursors from the lncRNAs. Mfold software (ver. 3.6) (http://mfold.rna.albany.edu/?q=mfold/RNA-Folding-Form) was used with default parameters to predict the minimum free energy secondary structures of the lncRNAs.

lncRNAs predicted to be miRNA targets

To explore whether lncRNAs functioned as targets of miRNAs, the lncRNAs and Populus miRNAs were submitted to ‘psRNATarget’ (http://plantgrn.noble.org/psRNATarget/) (Dai and Zhao 2011) with an expectation ≤3. Then, lncRNAs with a total of no more than four mismatches and G/U pairs within the lncRNA–miRNA complementary region were considered to be miRNA targets.
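The mismatch and G/U criterion can be illustrated with a simplified sketch. It assumes an ungapped duplex in which the miRNA and the target site have equal length and are both written 5′ to 3′; psRNATarget itself uses its own scoring scheme, which is not reproduced here. The function names and the short example sequences are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def count_imperfect_pairs(mirna, target_site):
    """Count (mismatches, G/U pairs) in an ungapped miRNA:target duplex.

    Both sequences are RNA written 5'->3' and must have equal length.
    The target is reversed so that position i of the miRNA faces its
    pairing partner on the antiparallel strand.
    """
    if len(mirna) != len(target_site):
        raise ValueError("sequences must have equal length (ungapped model)")
    mismatches = 0
    wobbles = 0
    for m, t in zip(mirna.upper(), reversed(target_site.upper())):
        if (m, t) in WATSON_CRICK:
            continue
        if (m, t) in WOBBLE:
            wobbles += 1
        else:
            mismatches += 1
    return mismatches, wobbles


def passes_filter(mirna, target_site, max_total=4):
    """Accept a site with no more than max_total mismatches plus G/U pairs."""
    mismatches, wobbles = count_imperfect_pairs(mirna, target_site)
    return mismatches + wobbles <= max_total


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(count_imperfect_pairs("AUGC", "GCAU"))
    print(passes_filter("GGGG", "UUCC"))
```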

Analysis of antisense lncRNAs and sense genes

We analyzed interactions between lncRNAs and mRNAs using ‘RNAplex’ (Tafer and Hofacker 2008), which searches for short, highly stable interactions between two long RNAs based on base pairing, thus predicting the potential interaction between antisense lncRNAs and the corresponding sense protein-coding genes. Specifically, optimum base pair probabilities based on minimum free energy were predicted using the Vienna RNA package (ver. 1.6).

Analysis of lncRNAs near coding genes and prediction of lncRNA families

To annotate the potential interaction between lncRNAs and their upstream or downstream bracketing genes, a 2-kb distance around a gene was used as a criterion to detect globally any nearby lncRNAs using a method modified from a previous report (Yan et al. 2015). The genomic positions of these genes were calculated. Moreover, to better annotate and understand the structures, properties, and functions of the predicted lncRNAs, it was necessary to classify the lncRNAs into different ncRNA families in evolutionary terms. Based on multiple sequence alignments, secondary structures, and covariance models, all predicted lncRNAs were classified into families that shared a common evolutionary ancestor using Rfam and INFERNAL (Nawrocki 2014) with default parameters.
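The 2-kb proximity criterion can be sketched as follows. The text does not state whether upstream and downstream were defined relative to gene strand, so the strand-aware definition used here is an assumption, as are the `Interval` record, the `relative_position` function, and the example coordinates.

```python
from typing import NamedTuple, Optional


class Interval(NamedTuple):
    """Hypothetical genomic interval with 1-based inclusive coordinates."""
    chrom: str
    start: int
    end: int
    strand: str = "+"


def relative_position(gene: Interval, lnc: Interval, window: int = 2000) -> Optional[str]:
    """Return 'upstream' or 'downstream' if lnc lies within `window` bp of gene.

    Overlapping features, features on different chromosomes, and features
    farther away than `window` return None. Direction follows gene strand.
    """
    if gene.chrom != lnc.chrom:
        return None
    if lnc.end < gene.start:
        gap = gene.start - lnc.end
        side = "upstream" if gene.strand == "+" else "downstream"
    elif lnc.start > gene.end:
        gap = lnc.start - gene.end
        side = "downstream" if gene.strand == "+" else "upstream"
    else:
        return None  # the lncRNA overlaps the gene body
    return side if gap <= window else None


if __name__ == "__main__":
    # Illustrative coordinates only.
    gene = Interval("Chr01", 10000, 12000, "+")
    print(relative_position(gene, Interval("Chr01", 8500, 9500)))
    print(relative_position(gene, Interval("Chr01", 12500, 13000)))
```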

Prediction of protein-coding genes with regions complementary to lncRNAs

Global identification of protein-coding genes with regions complementary to lncRNAs was performed as described previously (Han et al. 2012). Briefly, we first performed BLAST to screen a target mRNA based on sequence complementarity between target mRNAs and lncRNAs with an E value = 1e−5 and identity = 95 %. Then, RNAplex was applied to select the complementary protein-coding genes with an E value = 1e−30 as a parameter. Next, protein-coding gene annotations were analyzed with ‘phytozome’ (http://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Ptrichocarpa). The mRNA–lncRNA–miRNA regulatory network was further modeled with the Cytoscape 3.2 software (Saito et al. 2012).
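The BLAST screening step can be illustrated by filtering tabular BLAST output. The sketch assumes the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E value, bit score) and interprets the stated values as cutoffs (E value ≤ 1e−5, identity ≥ 95 %), which is an assumption. The function name and the example hits are hypothetical.

```python
def filter_blast_hits(lines, max_evalue=1e-5, min_identity=95.0):
    """Return (query, subject, identity, evalue) tuples passing both cutoffs.

    `lines` is an iterable of tab-separated rows in the standard
    12-column tabular BLAST format; blank and comment lines are skipped.
    """
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity = float(fields[2])   # column 3: percent identity
        evalue = float(fields[10])    # column 11: E value
        if evalue <= max_evalue and identity >= min_identity:
            hits.append((query, subject, identity, evalue))
    return hits


if __name__ == "__main__":
    # Made-up rows for illustration only.
    rows = [
        "lnc1\tgeneA\t97.50\t200\t5\t0\t1\t200\t300\t101\t1e-30\t350",
        "lnc1\tgeneB\t90.00\t200\t20\t0\t1\t200\t300\t101\t1e-40\t300",
        "lnc2\tgeneC\t99.00\t40\t0\t0\t1\t40\t10\t49\t0.002\t60",
    ]
    for hit in filter_blast_hits(rows):
        print(hit)
```

Pairs retained at this stage would then be passed to RNAplex for the second screening step described above.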

RT-PCR validation of the transcription direction of lncRNAs

RT-PCR was conducted on total RNA isolated from P. tomentosa plantlets treated with low-N stress for 3 days, as described above, using the GeneRacer Oligo dT primer 5′-GCTGTCAACGATACGCTACGTAACGGCATGACAGTG(T)24-3′ and SuperScript III RT from the GeneRacer kit (Invitrogen, Carlsbad, CA, USA). PCR amplifications were performed using lncRNA-specific primers (Supplementary Table 1) designed with Primer 5.0. PCR products were gel-purified and sequenced at the Beijing Genomics Institute (BGI), Beijing, China.

Table 1 P. tomentosa lncRNA candidates with length over 600 bp

Results

Transcriptome-wide prediction of novel lncRNA candidates in Populus tomentosa

To identify the lncRNAs in P. tomentosa under N deficiency, we carried out a deep analysis of the Illumina transcriptome sequencing data of DN and KK by aligning them with the genome and gene sets of P. trichocarpa, which has been sequenced and is generally used as a model tree, as in previous studies of P. tomentosa (Chen et al. 2015b; Song et al. 2013, 2014). Based on these two transcriptome data sets, 756 and 141 novel non-coding transcripts were identified in DN and KK, respectively. All data were merged using Cuffmerge (ver. 2.2.1) to identify 514 putative lncRNAs (388 unique lncRNAs belonging to 380 unique gene loci) in P. tomentosa (Supplementary Tables 2, 3). To further validate the transcription of lncRNAs, we performed RT-PCR on nine lncRNAs (TCONS_00008674, TCONS_00008686, TCONS_00016745, TCONS_00017786, TCONS_00015374, TCONS_00021140, TCONS_00026533, TCONS_00026542, and TCONS_00047011) using the GeneRacer Oligo dT primer and lncRNA-specific primers, followed by sequencing. These lncRNAs were successfully amplified (Supplementary Fig. 1), and the results showed that they have the same transcriptional direction as that determined by the transcriptome-based computational identification pipeline. This supports the reliability of the prediction, although we cannot rule out the possibility that some lncRNAs were assembled in a direction different from that of the authentic transcripts. Our investigation also showed that lncRNA candidates ranging between 200 and 500 bp accounted for a large proportion (Supplementary Table 2). There were 12 lncRNA candidates with lengths from 600 to 699 bp and 21 over 700 bp (Table 1). The possibility also cannot be excluded that some of the predicted lncRNAs, especially the short ones, were fragments of protein-coding genes of P. tomentosa; however, the longer lncRNA candidates, for example, those with lengths over 700 bp, are very likely to be bona fide (Wu et al. 2012; Li et al. 2015). Taken together, these results indicate that lncRNAs are non-coding regulators that are widely present in Populus. Based on their genomic locations relative to nearby genes, the 388 lncRNAs were further categorized into four classes. Most fell into class u, accounting for 339 (87.37 %), whereas 11 (2.83 %), 23 (5.93 %), and 15 (3.87 %) lncRNAs belonged to classes j, o, and x, respectively.

Two or more lncRNAs could be generated from some gene loci, and six such loci (XLOC_004520, XLOC_020600, XLOC_023347, XLOC_030453, XLOC_036454, and XLOC_037793) were found in our investigation. In particular, XLOC_004520 could generate two lncRNAs, one overlapping lncRNA, TCONS_00008303, and one intergenic lncRNA, TCONS_00008308, whereas four nearby intergenic lncRNAs (TCONS_00068134, TCONS_00068135, TCONS_00068136, and TCONS_00068137) were located at XLOC_037793 (Supplementary Table 2).

Characterization of the genomic location of lncRNAs

In our investigation, 57 and 141 lncRNAs were located downstream of 53 bracketing genes and upstream of 137 genes, respectively, with 2-kb distance between upstream or downstream of bracketing genes and lncRNAs as a criterion (Supplementary Table 4). Among them, TCONS_00008676 and TCONS_00008677 were found to be located downstream of one gene locus (Potri.001G180000.1), and a similar distribution was found in another two groups of lncRNAs (TCONS_00031857 and TCONS_00031858; TCONS_00039009, TCONS_00037686 and TCONS_00037687). Also, four pairs of lncRNAs (TCONS_00021846 and TCONS_00021847, TCONS_00008688 and TCONS_00008689, TCONS_00058782 and TCONS_00058783, TCONS_00065840 and TCONS_00065841) were distributed upstream of a bracketing gene. In particular, TCONS_00057907 was found downstream of Potri.014G105100.1, whereas TCONS_00058793 was on the upstream side. Furthermore, TCONS_00013609, the target of seven miRNAs (pto-miR6457b, pto-miR6462a/b/c-3p/d/e/f), and TCONS_00069233, cleaved by seven miRNAs (pto-miR396a/b/c/d/e-5p/f/g-5p), were downstream of Potri.002G021200.1 and Potri.018G126700.1, respectively (Supplementary Tables 4, 5). In addition, TCONS_00021861, containing a precursor of pto-miR168b, was upstream of Potri.004G234200.1 (Supplementary Table 4, Table 2). These results showed a significant connection in terms of regulatory potential based on the genomic location of lncRNAs in low-N-treated P. tomentosa. A lncRNA may show a co-expression pattern with neighboring genes, thus promoting or repressing the expression of nearby protein-coding genes as a result of various interactions, such as binding to DNA or protein, or forming a ribonucleoprotein complex (Ponting et al. 2009). The specific interactions between these lncRNAs and their neighboring genes in P. tomentosa need further exploration.

Table 2 LncRNAs identified as precursors of known miRNAs in P. tomentosa

Characterization of lncRNA families in P. tomentosa

To better understand and annotate novel lncRNAs at an evolutionary level, we classified all of the predicted novel lncRNAs into different non-coding RNA families using INFERNAL. Global detection of sequences and secondary structures of the novel lncRNAs showed that seven P. tomentosa lncRNAs belong to seven conserved non-coding RNA families: NAMA_1, bxd_5, DAOA, NPPA, ST7, ZFAT, and Six3os1_2 (Supplementary Table 6). However, most (381) lncRNAs in our investigation were novel and had no homolog in Rfam non-coding RNA families, indicating that most P. tomentosa lncRNAs are species-specific, which is consistent with previous reports of low conservation of lncRNAs (Li et al. 2015; Wang et al. 2015; Wen et al. 2007; Wu et al. 2012). Also, previous reports have revealed that the majority of lncRNAs are expressed in specific tissues (Wang et al. 2011) or under certain environmental conditions (Wu et al. 2012) and developmental processes (Zhang and Chen 2013). Specifically, Six3OS is a lncRNA located at the distal promoter region of the transcription factor Six3, and is involved in cell specification by acting as a modular scaffold for recruiting histone modification complexes to Six3 target genes (Rapicavoli et al. 2011). Populus tomentosa TCONS_00039019 (XLOC_021277), a homolog of Six3os1_2, was located at Chr08: 18820880−18821201+, 2 kb upstream of Potri.008G222300.1, encoding an isoleucyl/leucyl/tyrosyl/valyl/methionyl-tRNA synthetase. Notably, TCONS_00039019 accumulated (2.12 fold) in a low-N environment, indicating that TCONS_00039019 and its bracketing gene Potri.008G222300.1 may be involved in the response of P. tomentosa to N deficiency (Supplementary Table 7).

LncRNA transcripts as precursors for known and candidate novel miRNAs

In this investigation, 9 intergenic lncRNAs were detected as precursors of 11 known miRNAs (pto-MIR156c, pto-MIR156e, pto-MIR156g, pto-MIR156h, pto-MIR156j, pto-MIR159a, pto-MIR159b, pto-MIR159c, pto-MIR168b, and pto-MIR169b), and 5 lncRNAs were identified as precursors of 14 novel miRNAs in P. tomentosa based on the alignment of miRNA precursors with 388 lncRNAs. For example, TCONS_00066551 could be aligned with three miRNA precursors (pto-MIR156g, pto-MIR156h, pto-MIR156j) in P. tomentosa, at 114–216 nts, indicating the complexity and variety of lncRNAs (Fig. 2). Also, TCONS_00052315 could be cleaved at 181–363 nts, generating pto-MIR159a and pto-MIR159c (Table 2). In total, secondary structures of 9 lncRNAs that acted as precursors of 11 known miRNAs in P. tomentosa are shown in Supplementary Fig. 1. In addition, by homology alignment of precursor sequences of plants, three lncRNAs (TCONS_00031860, TCONS_00061456, and TCONS_00066551) matched well with precursors of gma-miR156k, aqc-miR156b, and tcc-miR156e, indicating that three new members of the miR156 family in P. tomentosa were further identified from lncRNAs in our investigation. Thus, deep analysis of lncRNAs may provide a method to identify new miRNA members in plants.

Fig. 2 The secondary structure of TCONS_00066551 predicted by Mfold, which contains precursors of pto-miR156g, h, and j at 114–216 nts, indicated in red letters

In addition, TCONS_00039013:138-217 was predicted to generate five candidate novel miRNA sequences from its transcript: four miRNAs in the 5′ arm and one in the 3′ arm. Also, two miRNAs were generated from TCONS_00069374:109-199:16_23 (UACCAUAGGUAUGUGGUGGUGGU) and TCONS_00069374:109-199:16_24 (UACCAUAGGUAUGUGGUGGUGGUG) (Supplementary Table 8; Supplementary Fig. 3). The precursor of these two miRNAs was processed at 109–199 nts of TCONS_00069374, located on the antisense strand of Potri.019G017700.1 (encoding a Caulimovirus viroplasm protein), suggesting that UACCAUAGGUAUGUGGUGGUGGU and UACCAUAGGUAUGUGGUGGUGGUG are P. tomentosa natural antisense miRNAs (nat-miRNAs), a class of miRNAs that can form perfect matches with their target genes. In a previous study, global discovery of nat-miRNAs showed the universality of natural sense–antisense transcript loci as another source of miRNA gene loci (Lu et al. 2008). To validate these novel miRNA candidates, the predicted sequences were searched against clean reads of P. tomentosa sRNA libraries from our previous investigations. One miRNA generated from TCONS_00026533:2556-2639:51_21 (UUUCAAUUAAAGUUCUAAGGA) was found in the sRNA library of P. tomentosa treated with salt stress (Ren et al. 2013). The other sequences were not found in the sRNA libraries of P. tomentosa, possibly because these novel miRNAs are expressed at low abundance or in a specific spatiotemporal pattern; further work is needed to validate and characterize them as bona fide miRNAs. These results showed that novel miRNAs could also be generated from lncRNAs. Our investigation suggests that further mining of lncRNAs may be a new method to detect new miRNA members and novel miRNAs.

lncRNA transcripts as targets of miRNAs in P. tomentosa

To further examine the potential regulatory function of lncRNAs, the ‘psRNATarget’ program was used to predict whether lncRNA transcripts could be targets of miRNAs, with an expectation of ≤3. In our investigation, four lncRNAs were targeted by 29 miRNAs belonging to 5 families, comprising 22 conserved and 7 non-conserved miRNAs (Supplementary Table 5). Each miRNA paired with only one target lncRNA, whereas one lncRNA could be targeted by 3–12 miRNAs. For example, TCONS_00052316 was targeted by 12 miRNAs belonging to the pto-miR156 family (Fig. 3). Specifically, seven members of pto-miR396 (pto-miR396a/b/c/d/e-5p/f-5p/g-5p) were predicted to target TCONS_00069233 (upstream of Potri.018G126700.1), displaying cleavage activity and possibly negatively regulating the expression of TCONS_00069233 (Supplementary Table 5; Fig. 4). Furthermore, TCONS_00052316, the target of pto-miR156a-l, was repressed (−2.17 fold) by low-N stress in P. tomentosa (Supplementary Table 7). Thus, these results link P. tomentosa adaptation to low-N conditions to the interaction of lncRNAs and miRNAs, indicating the complexity of the RNA regulatory system.

Fig. 3 Interactions between lncRNAs and miRNAs. Four lncRNAs were predicted to be target genes (mRNAs) of 29 miRNAs by psRNATarget. Specifically, the green Vs indicate downregulated lncRNAs or miRNAs, whereas the red triangles show the upregulated miRNAs under nitrogen deficiency in P. tomentosa

Fig. 4 Complementary pairing of miRNAs targeting lncRNAs in P. tomentosa. The binding alignments of pto-miR396a/b/f and pto-miR396c/d/e-5p/g-5p with their predicted lncRNA targets are shown

Identification of N-responsive lncRNAs in P. tomentosa

To globally identify lncRNAs under N-deficiency stress and to detect N-responsive lncRNAs, the expression levels of 388 lncRNAs belonging to 380 gene loci under low-N stress were analyzed. Among the 388 unique lncRNAs, 126 were significantly altered under N-deficiency conditions (|log2FC| ≥ 1 and P ≤ 0.05), and most of them were significantly induced (Supplementary Table 7). Among them, 8 lncRNAs were downregulated, and the remaining 118 were upregulated under N deficiency. The fold changes of lncRNA expression in P. tomentosa under low-N stress varied from −2.20 to −1.23 and from 1.00 to 4.53. TCONS_00008679 (4.53), TCONS_00066553 (4.23), and TCONS_00052313 (3.56) were the most upregulated lncRNAs, whereas TCONS_00064021 (−2.20), TCONS_00052316 (−2.17), and TCONS_00021140 (−2.11) were the most repressed lncRNAs in response to low-N stress. In addition, TCONS_00008675 and TCONS_00043791 were highly induced in P. tomentosa treated with low N, but were absent from control plants (Supplementary Table 7).

Identification of antisense lncRNAs in P. tomentosa under N deficiency

To investigate interactions between protein-coding genes and their antisense lncRNAs, we used RNAplex to detect coupling antisense lncRNAs and sense mRNAs. In our investigation, 15 lncRNAs were generated from the antisense strands of 14 corresponding protein-coding genes, and TCONS_00069373 and TCONS_00069374 were located on opposite strands of Potri.019G017700.1 (Table 3). Specifically, Potri.005G174300.1 (TCONS_00023334), Potri.006G058300.1 (TCONS_00029674), and Potri.014G018700.6 (TCONS_00057148) were identified to encode functional proteins: zinc knuckle/‘cold-shock’ DNA-binding domain, cytochrome P450, and acyl CoA binding protein, respectively (Fig. 5). Furthermore, five N-induced antisense lncRNAs (TCONS_00023745, TCONS_00049938, TCONS_00040024, TCONS_00009752, and TCONS_00001301) and their sense genes were detected (Table 3). Consequently, the plastic molecular mechanism of the sense mRNA–antisense lncRNA pairs needs further experimental investigation to reveal the specific functions involved in regulating the adaptation of P. tomentosa to N deficiency.

Table 3 Identification of antisense lncRNA and sense gene pairs in P. tomentosa

Fig. 5 Interactions between sense genes and antisense lncRNA. a Potri.005G174300.1 (TCONS_00023334), b Potri.006G058300.1 (TCONS_00029674), c Potri.014G018700.6 (TCONS_00057148)

Identification and function annotation of 111 protein-coding genes with regions complementary to 38 lncRNAs of P. tomentosa

Computational analysis combining BLAST (E value = 1e−5, identity = 95 %) and RNAplex (E value = 1e−30) was used to globally reveal the relationship between protein-coding genes and lncRNAs in P. tomentosa under N deficiency. A total of 111 protein-coding genes were found to be complementary to 38 lncRNAs in P. tomentosa (Supplementary Table 9). Among them, 10 differentially expressed lncRNAs, comprising 8 upregulated and 2 downregulated lncRNAs, were complementary to 20 and 15 genes, respectively (Table 4). For example, TCONS_00008687 (1.94 fold) was complementary to five members of Potri.017G050900 encoding the UbiA prenyltransferase family protein, and TCONS_00017288 (2.72 fold) targeted three genes encoding two domains: an SRF-type transcription factor (DNA-binding and dimerization domain) and a K-box region. Moreover, TCONS_00064021 (−2.20 fold) was complementary to 12 genes, all encoding three domains: a protein kinase domain, a leucine-rich repeat, and a di-glucose binding region within the endoplasmic reticulum (Table 4). Similarly, Potri.019G017500.1, Potri.018G135400.1, and Potri.003G192300.1 were each complementary to a pair of lncRNAs: TCONS_00069373 and TCONS_00069374, TCONS_00052311 and TCONS_00055120, and TCONS_00008303 and TCONS_00008308, respectively (Supplementary Table 9). These results demonstrate that a lncRNA can be complementary to multiple protein-coding genes and vice versa, which is consistent with a previous report and further shows the complex regulatory relationship between lncRNAs and protein-coding genes in plants (Wu et al. 2012). Besides forming complementary pairs with protein-coding genes and regulating genes in trans, certain N-responsive lncRNAs were located upstream or downstream of neighboring genes, suggesting that cis regulation between lncRNAs and neighboring genes may also have played a role in the response of P. tomentosa to N deficiency, through an unknown mechanism or through the same cis/trans factors (Wu et al. 2012). Specifically, five N-responsive lncRNAs were located near certain bracketing genes. For example, TCONS_00021860 (1.14 fold) and TCONS_00064021 (−2.20 fold) were located downstream of Potri.004G232200.1 and Potri.016G011500.1, respectively, whereas another three lncRNAs (TCONS_00049805, TCONS_00055120, and TCONS_00071416) were upstream of their corresponding genes (Table 4, Supplementary Table 4). Taken together, these results indicate the presence of sense:sense or sense:antisense interactions between lncRNAs and protein-coding genes.
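The BLAST filtering step described above can be illustrated with a minimal Python sketch. It assumes the hits are available in BLAST tabular format (outfmt 6) and reads the stated parameters as cutoffs (E value at most 1e−5, identity at least 95 %); both are assumptions, since the text does not give the output format or the direction of the thresholds. The RNAplex step is not reproduced. The function names and the toy hit table are hypothetical, and all numbers in the table are invented for illustration.

```python
from collections import defaultdict

# Assumed cutoffs: the text's "E value = 1e-5" and "identity = 95 %" are read
# here as "E value <= 1e-5" and "identity >= 95 %".
MAX_EVALUE = 1e-5
MIN_IDENTITY = 95.0


def parse_hits(tabular_text):
    """Yield (lncRNA, gene, identity, evalue) from BLAST tabular (outfmt 6) text."""
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # outfmt 6 columns: qseqid sseqid pident length mismatch gapopen
        #                   qstart qend sstart send evalue bitscore
        yield fields[0], fields[1], float(fields[2]), float(fields[10])


def complementary_pairs(tabular_text):
    """Return lncRNA->genes and gene->lncRNAs maps for hits passing both cutoffs."""
    by_lncrna = defaultdict(set)
    by_gene = defaultdict(set)
    for lncrna, gene, identity, evalue in parse_hits(tabular_text):
        if evalue <= MAX_EVALUE and identity >= MIN_IDENTITY:
            by_lncrna[lncrna].add(gene)
            by_gene[gene].add(lncrna)
    return dict(by_lncrna), dict(by_gene)


# Toy hit table. The first two pairs are named in the text; every number and
# the HYPOTHETICAL_* identifiers are invented placeholders.
EXAMPLE_HITS = "\n".join([
    "TCONS_00069373\tPotri.019G017500.1\t98.0\t120\t2\t0\t1\t120\t300\t181\t1e-40\t200",
    "TCONS_00069374\tPotri.019G017500.1\t97.5\t110\t3\t0\t1\t110\t290\t181\t1e-35\t180",
    "TCONS_00069373\tHYPOTHETICAL_GENE_A\t90.0\t100\t10\t0\t1\t100\t200\t101\t1e-20\t100",
    "TCONS_00069374\tHYPOTHETICAL_GENE_B\t99.0\t30\t0\t0\t1\t30\t50\t21\t1e-3\t40",
])

BY_LNCRNA, BY_GENE = complementary_pairs(EXAMPLE_HITS)
```

Keeping both maps makes the many-to-many relationship explicit: one lncRNA may map to several genes, and one gene may be complementary to several lncRNAs.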

Table 4 Protein-coding genes with regions complementary to lncRNAs with differential expression in P. tomentosa under nitrogen deficiency

Discussion

Antisense lncRNAs in P. tomentosa under N deficiency

Antisense RNAs, a class of natural antisense transcripts (NATs), also named cis-NATs, are RNAs that are transcribed in cis from the reverse-complementary strands of known genes, and partially or entirely overlap the corresponding mRNA transcripts (Wang et al. 2005). Previous studies have shown a wide distribution of antisense RNAs in a large range of plant species, such as A. thaliana (Wang et al. 2005), rice (Gan et al. 2012), soybean (Britto-Kido Sde et al. 2013), and maize (Ma et al. 2006). Antisense RNAs are known to function in modulating gene expression at various levels (genomic imprinting, chromatin modification, transcription interference, alternative splicing) (Wight and Werner 2013; Zhang et al. 2013b). For example, antisense transcripts can be cleaved to produce short and long siRNAs in plant tissues such as the root and stem and to regulate target genes by RNA interference at transcriptional and post-transcriptional levels in apples (Celton et al. 2014).

Antisense lncRNAs and their regulatory functions in plants have also been identified in previous investigations. Recently, 10 antisense lncRNAs and TalncRNA73 were identified in drought-stressed maize (Zhang et al. 2014b) and in wheat infected with the stripe rust pathogen (Zhang et al. 2013a), respectively. In A. thaliana, ssRNA-seq has been applied to verify 626 concordant and 766 discordant light-responsive lncNATs in specific tissues and developmental stages, and most of the lncNATs were correlated with H3K9ac- and/or H3K27ac-associated histone modification and acetylation (Wang et al. 2014). Two other lncNATs were also confirmed in A. thaliana: COOLAIR, which is induced in early vernalization (Csorba et al. 2014), and asHSFB2a, which controls HSFB2a in gametophyte development (Wunderlich et al. 2014).

Based on the relative expression levels of antisense lncRNAs and sense genes in our investigation, six concordant and five discordant regulatory pairs were identified. Among them, five N-induced antisense lncRNAs and their corresponding sense genes were detected in P. tomentosa (Table 3). For example, Potri.009G112800.1, which encodes a 48-kDa subunit of the oligosaccharyltransferase complex that functions in dolichyl-diphosphooligosaccharide-protein glycotransferase activity in yeast and mammalian cells (Kelleher and Gilmore 1994), was repressed (−0.56 fold) under N deficiency, discordant with its antisense lncRNA TCONS_00040024 (1.41 fold). The antisense lncRNAs may function by forming double-stranded RNAs with the sense mRNAs, causing gene silencing and thus regulating related gene expression at a post-transcriptional level (Wang et al. 2014).

The N-induced antisense lncRNA TCONS_00001301 (1.75 fold) and its sense gene (Potri.001G147200.1, 0.04 fold) exhibited concordant regulation (Table 3); a similar expression trend was found in a previous report (Wang et al. 2014). Potri.001G147200.1 encodes a double-psi beta-barrel (DPBB) fold, which is found in bacterial rare lipoprotein A (RlpA) and pollen allergens (Jorgenson et al. 2014). The natural antisense transcript pair TCONS_00023745 (1.6 fold) and Potri.005G215200.2 (0.49 fold) also showed a congruous expression pattern. Potri.005G215200.2 encodes cupin superfamily proteins, mainly comprising germin and germin-like proteins (GLPs), which act as enzymes such as dioxygenase, decarboxylase, and phosphodiesterase in various development and defense processes in higher plants (Uberto and Moomaw 2013). A previous study revealed that GLP1 was targeted by pto-miR6439a under low-N conditions in P. tomentosa (Chen et al. 2015a). Taking these results together, we consider that the interaction between the accumulated antisense TCONS_00023745 and the sense GLP gene is involved in plant low-N responses. However, unlike the universal reverse regulation relationship between miRNAs and their target genes, the detailed interaction mechanism between these concordant antisense lncRNAs and sense genes, and their activation or repression regulatory patterns, are extensive and complex and require further experimental exploration. In conclusion, identification of antisense lncRNAs and the possible associated regulation of opposite-strand genes may enrich our understanding of the complex molecular mechanisms of lncRNAs and provide new insights into ncRNA responses to environmental stresses in plants.
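The concordant/discordant labelling used in this subsection can be sketched as a comparison of the direction of change in each antisense lncRNA and its sense gene. This is a minimal illustration, assuming the reported fold values are signed (log-scale) changes so that a positive value means induction and a negative value means repression; the function name is hypothetical, and no expression-level or significance threshold is applied because the text does not state one.

```python
def classify_pair(lncrna_change, gene_change):
    """Label an antisense lncRNA / sense gene pair by the direction of change.

    Assumption: both values are signed fold changes (positive = induced,
    negative = repressed). A value of exactly zero gives no direction.
    """
    if lncrna_change == 0 or gene_change == 0:
        return "undetermined"
    same_direction = (lncrna_change > 0) == (gene_change > 0)
    return "concordant" if same_direction else "discordant"


# (antisense lncRNA, sense gene): (lncRNA change, gene change), values as
# reported in the text of this subsection.
PAIRS = {
    ("TCONS_00040024", "Potri.009G112800.1"): (1.41, -0.56),
    ("TCONS_00001301", "Potri.001G147200.1"): (1.75, 0.04),
    ("TCONS_00023745", "Potri.005G215200.2"): (1.6, 0.49),
}

LABELS = {pair: classify_pair(*changes) for pair, changes in PAIRS.items()}
```

Under this sign-based rule, the three example pairs receive the same labels as in the text: one discordant pair and two concordant pairs.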

Networks of lncRNAs, miRNAs, and mRNAs involved in the response of P. tomentosa to N deficiency

Previous investigations have revealed similarities and interactions between miRNAs and lncRNAs, indicating that miRNAs can bind to lncRNAs and direct their cleavage and degradation (Juan et al. 2013). The reverse regulation, in which a lncRNA acts on a miRNA, was first discovered in A. thaliana and was called target mimicry (TM). The non-coding transcript IPS1 can act as a target mimic to bind and sequester miR399, inhibiting miR399-directed cleavage of its target, PHO2, and thereby modulating Pi homeostasis in A. thaliana (Franco-Zorrilla et al. 2007). Wu et al. (2013) systematically identified 36 endogenous target mimics for 11 A. thaliana miRNAs and 189 target mimics for 19 rice miRNAs, and confirmed that TM-miR160 and TM-miR166 are involved in plant development. In P. trichocarpa, 51 lncRNAs were identified as putative targets of 20 miRNAs, including ptc-miR396e-3p, ptc-miR476b, and ptc-miR6476a. Specifically, ptc-miR6462a/b/c-3p/d/e/f could target lncRNA396, whereas ptc-miR6462c-5p/e cleaved lncRNA225 (Shuai et al. 2014). In our investigation, pto-miR6462a/b/c-3p/d/e/f was found to target TCONS_00013609 (Supplementary Table 5), which was highly homologous to lncRNA225 (94.23 % homology) and lncRNA396 (93.44 %) in P. trichocarpa. These results further confirm the widespread phenomenon of miRNAs targeting lncRNAs in Populus.

A previous investigation revealed that H19, the first imprinted ncRNA, which produces a 23-nt miRNA, could work as a miRNA precursor to circumstantially regulate target mRNAs at a post-transcriptional level during vertebrate development in humans and mice (Hung and Chang 2010). In maize, lncRNA transcripts could correspond to numerous precursors of miRNAs (e.g., miR169d, miR169h, miR172c) and siRNAs under drought stress (Li et al. 2014). In addition, a study of P. tomentosa xylem showed that ptc-miR1448 and ptc-miR482a.2 were derived from TCONS_00061773, which may play a regulatory role in the defense response of Populus to abiotic stresses (Chen et al. 2015b), and pto-miR1448 could be produced from TCONS_00035853 in P. tomentosa in a low-N environment (Table 2). BLAST analysis against the P. trichocarpa genome sequence confirmed that TCONS_00061773 (Chen et al. 2015a) and TCONS_00035853 (this investigation) aligned perfectly with each other, with no mismatch or gap. Also, TCONS_00021861 was found to contain the precursor of pto-miR168b in P. tomentosa under low-N stress in this investigation. Consequently, our results further support the authenticity of miRNA precursors derived from lncRNAs. Moreover, these results suggest that lncRNAs may act as a new category of miRNA precursors and regulate miRNA biogenesis, linking lncRNAs to miRNAs in various biological processes.

Furthermore, our investigation demonstrated a comprehensive RNA-mediated network, including miRNA–lncRNA and miRNA–mRNA interactions (Fig. 6). For example, pto-miR156a-j, TCONS_00052316, and Potri.014G158200.1/Potri.014G158200.4 were involved in a complex interaction network. Specifically, pto-miR396a/b (suppressed under N deficiency), pto-miR396c (induced) (Ren et al. 2015), and pto-miR396g-5p were identified to target TCONS_00069233 (XLOC_038422; 0.81 fold), while pto-miR396g-5p also targeted Potri.017G062800.2 (shikimate kinase). Another RNA-directed network possibly involved in N signaling was also identified in our investigation. The suppressed pto-miR159a/b targeted TCONS_00052315 (XLOC_028745; 0.09 fold) and Potri.006G0129900.1 (encoding peroxidase 21 precursor) under N deficiency (Chen et al. 2015a) (Supplementary Tables 3, 5). Regulation by miR159 of its conserved target, MYB, plays extensive and significant roles in plant growth, development, and resistance to various biotic and abiotic stresses. In A. thaliana rosette leaves, N depletion could induce R3-MYB (CPC), thereby having a negative feedback effect on anthocyanin accumulation (Nemie-Feyissa et al. 2014). Also, CmMYB1 could act as a central N regulator to initiate the expression of N assimilation genes (such as CmNRT, CmNAR, and CmNIR) under N deficiency in Cyanidioschyzon merolae (Imamura et al. 2009). In addition, peroxidase 21 precursor mainly functions to resist oxidative stress under N deficiency. However, the regulatory mechanism of the two slightly accumulated lncRNAs in response to N starvation in the miRNA–lncRNA–mRNA network needs further research. Consistent with previous studies (Jalali et al. 2013), our results further confirm that there are universal interactions between various regulatory ncRNAs (miRNAs and lncRNAs) and mRNAs in Populus. However, their specific biological processes require further experimental investigation and exploration.
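The miRNA-centered network described above can be represented as a small edge list, from which the miRNAs that target both a lncRNA and an mRNA are recovered. The sketch below is only an illustration of that data structure, using the interactions named in this paragraph; it is not the method used to build Fig. 6, and the function names are hypothetical.

```python
from collections import defaultdict

# (miRNA, target, target type) edges as named in the paragraph above.
EDGES = [
    ("pto-miR396a/b", "TCONS_00069233", "lncRNA"),
    ("pto-miR396c", "TCONS_00069233", "lncRNA"),
    ("pto-miR396g-5p", "TCONS_00069233", "lncRNA"),
    ("pto-miR396g-5p", "Potri.017G062800.2", "mRNA"),
    ("pto-miR159a/b", "TCONS_00052315", "lncRNA"),
    ("pto-miR159a/b", "Potri.006G0129900.1", "mRNA"),
]


def build_network(edges):
    """Group targets by miRNA, keeping lncRNA and mRNA targets separate."""
    network = defaultdict(lambda: {"lncRNA": set(), "mRNA": set()})
    for mirna, target, kind in edges:
        network[mirna][kind].add(target)
    return network


def shared_mirna_links(network):
    """Return sorted (lncRNA, miRNA, mRNA) triples where one miRNA targets both."""
    triples = []
    for mirna, targets in network.items():
        for lncrna in targets["lncRNA"]:
            for mrna in targets["mRNA"]:
                triples.append((lncrna, mirna, mrna))
    return sorted(triples)


TRIPLES = shared_mirna_links(build_network(EDGES))
```

Each triple marks a lncRNA and an mRNA that share a miRNA, which is the pattern highlighted for the pto-miR396g-5p and pto-miR159a/b modules.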

Fig. 6 Networks of miRNAs, lncRNAs, and mRNAs (genes). Specifically, the green Vs indicate downregulated lncRNAs or miRNAs, whereas the red triangles show the upregulated miRNAs under nitrogen deficiency in P. tomentosa

N-responsive lncRNAs in P. tomentosa

TCONS_00049805 was downregulated (−1.88 fold) under low-N stress and was found to target three genes (Potri.011G062600.1, Potri.011G062800.1, and Potri.011G062900.1) encoding homologs of insect proteins, a ligand-gated ion channel (LGIC), and a bacterial extracellular solute-binding protein family 3 receptor (Table 4). Generally, LGICs include nicotinic acetylcholine receptors (nAChRs), aminobutyric acid-gated anion channels (GABAs), glutamate-gated chloride channels (GluCls), and histamine-gated chloride channels (HisCls), which are involved in multiple insect behaviors, such as escape, learning, and memory (Swope et al. 1999). In plants, LGICs can act as molecular messengers to transduce intra- or extracellular biochemical signals into electrical events and ion fluxes by activating cyclic nucleotide-gated (CNG) channels (Dietrich et al. 2010). In addition, TCONS_00049805 was 2 kb upstream of Potri.011G063000.1 (Supplementary Table 4), which also encodes a ligand-gated ion channel and a bacterial extracellular solute-binding protein family 3 domain, suggesting a relatively stable regulatory relationship between TCONS_00049805 and genes encoding ligand-gated ion channel/bacterial extracellular solute-binding protein family 3. However, its specific biological and physiological function in the response of P. tomentosa to N deficiency needs further study. Another downregulated lncRNA, TCONS_00064021, was found to target genes encoding a protein kinase domain, a leucine-rich repeat, and a di-glucose binding region within the endoplasmic reticulum (Table 4).
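The upstream/downstream relationships reported for lncRNAs such as TCONS_00049805 can be expressed as a simple coordinate comparison. The following sketch assumes that "upstream" is defined relative to the transcriptional direction of the neighboring gene, which the text does not state explicitly; the function name is hypothetical, and the coordinates in the usage note are invented placeholders rather than the real genomic positions.

```python
def relative_position(gene_start, gene_end, gene_strand, lnc_start, lnc_end):
    """Return (label, distance_bp) for a lncRNA relative to a neighboring gene.

    Coordinates are on the same chromosome with start <= end. The label is
    'overlapping', 'upstream', or 'downstream', judged in the direction of
    transcription of the gene (an assumption of this sketch).
    """
    if gene_strand not in ("+", "-"):
        raise ValueError("gene_strand must be '+' or '-'")
    if lnc_end >= gene_start and lnc_start <= gene_end:
        return "overlapping", 0
    if lnc_end < gene_start:
        distance = gene_start - lnc_end
        at_lower_coordinates = True
    else:
        distance = lnc_start - gene_end
        at_lower_coordinates = False
    # On the '+' strand lower coordinates lie upstream; on '-' the reverse holds.
    is_upstream = at_lower_coordinates == (gene_strand == "+")
    return ("upstream" if is_upstream else "downstream"), distance
```

For example, with placeholder coordinates, `relative_position(5000, 8000, "+", 1000, 3000)` gives `("upstream", 2000)`, whereas the same lncRNA would be labelled downstream of a gene on the minus strand.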

The upregulated TCONS_00017288 (2.72 fold) targeted genes encoding an SRF-type transcription factor (Table 4), which generally regulates various biological processes such as cell growth, morphology, spreading, migration, and adhesion in animals through binding to a DNA sequence with a CArG motif (Ragu et al. 2010). TCONS_00021860, located downstream of Potri.004G232200.1 (methyltransferase domain), was upregulated under N deficiency and targeted genes encoding a leucine-rich repeat (LRR) and a protein tyrosine kinase (Table 4, Supplementary Table 4). LRRs and a nucleotide-binding site (NBS) are domains of plant resistance proteins (encoded by R-genes), which play key roles in the defense of plants against pathogens (Zhang and Thomma 2013) and are widespread in various plant species, including A. thaliana (Guo et al. 2011) and P. trichocarpa (Kohler et al. 2008). The NBS–LRR genes are classified into three groups: the Toll/interleukin-1 receptor (TIR) group, the coiled-coil (CC) or leucine zipper (LZ) group, and a non-motif group. Among them, LRRs function mainly in protein–protein interactions with a variety of binding specificities. In rice, leucine-rich repeat domain protein (OsLRR1) can interact with hypersensitive-induced response protein 1 (OsHIR1) to resist bacterial infection by participating in cellular signal transduction, lowering pathogen titers, activating higher expression of defense marker genes, and thus alleviating disease symptoms (Zhou et al. 2009). The upregulated TCONS_00021860 (1.14 fold) and the slightly repressed LRR gene (−0.06 fold) may be involved in biological processes during the adaptation of P. tomentosa to low-N stress, but further experimental investigations are required to reveal their specific functions.

In conclusion, in this investigation, we first uncovered 126 lncRNAs differentially expressed under N deficiency in P. tomentosa, and then explored the characteristics and detailed functions of these lncRNAs and their complementary protein-coding genes involved in responses to N deficiency. Additionally, comprehensive miRNA–lncRNA, lncRNA–mRNA, and miRNA–mRNA networks were determined in Populus under low-N stress, revealing widespread regulatory interactions between various non-coding RNAs and mRNAs. Specifically, 15 pairs of antisense lncRNAs and sense protein-coding genes were identified, and five antisense lncRNAs were low-N induced. Our investigation provides more information on the global identification of novel Populus lncRNAs and new insights into exploring molecular mechanisms of RNA interactions, and it lays a foundation for further research on the responses of Populus to N deficiency and on means of improving NUE in woody plants.