1 Introduction

RNA-mediated processes, together with DNA methylation and histone modifications, are considered the main molecular mechanisms of the epigenetic regulatory network, which plays an essential role in plant developmental programs, stress response and adaptation, transposon silencing, signaling pathways, and various non-Mendelian patterns of inheritance (Grant-Downton and Dickinson 2005; Avramova 2011).

2 Micro RNAs in the Plant Small RNA World

Plant small RNAs (sRNAs), which range in size from approximately 20 to 30 nucleotides (nt), can be distinguished both by their biogenesis and by their mode of action. They include microRNAs (miRNAs) and small interfering RNAs (siRNAs) such as repeat-associated siRNAs (ra-siRNAs), trans-acting siRNAs (ta-siRNAs), and natural antisense transcript siRNAs (nat-siRNAs) (Vaucheret 2006). Each type of sRNA is distinct with respect to its molecular size, the plant DICER-like (DCL) RNase III enzyme involved in its biogenesis, and the ARGONAUTE (AGO) protein that it guides to silence target gene expression. Most sRNAs target the same locus they are derived from; the exceptions are miRNAs and ta-siRNAs, which target mRNAs produced from different loci. Plant sRNA molecules function as negative regulators of gene expression at the transcriptional or posttranscriptional level (Ruiz-Ferrer and Voinnet 2009).

In general, ra-siRNAs (or heterochromatic siRNAs, hc-siRNAs) are 21–24 nt in length, and their biogenesis involves DCL2, DCL3, and DCL4 (Kasschau et al. 2007; Xie et al. 2004). These sRNAs are mainly loaded into AGO4 and are known to be involved in DNA and histone methylation (Xie et al. 2004; Zheng et al. 2007; Zilberman et al. 2003). ta-siRNAs are phased 21-nt RNA molecules whose production involves only DCL4 and is triggered by miRNA-directed cleavage of the TAS transcripts. The ta-siRNAs are subsequently loaded into AGO1 and, like miRNAs, promote sequence-specific cleavage of their target gene transcripts (Allen et al. 2005, 2006). The biogenesis of 21- and 24-nt nat-siRNAs involves one of DCL1, DCL2, or DCL3, and a subgroup of nat-siRNAs is dependent on RDR2 and PolIV (Borsani et al. 2005; Katiyar-Agarwal et al. 2006; Zhang et al. 2012). A category of sRNAs ranging from 30 to 40 nt in size, referred to as long siRNAs (lsiRNAs), was subsequently identified (Katiyar-Agarwal et al. 2007). siRNAs act as transcriptional repressors of a subset of transposons and genes by triggering de novo methylation of homologous DNA in the process of RNA-directed DNA methylation (RdDM) (Wassenegger et al. 1994; Law and Jacobsen 2010).

The first plant sRNAs with miRNA characteristics were recognized in Arabidopsis in 2002 (Llave et al. 2002; Mette et al. 2002; Park et al. 2002; Reinhart et al. 2002). Canonical miRNAs are typically 21 nt in length, and their precursors (pri-miRNAs) are transcribed by RNA polymerase II (PolII) (occasionally by RNA polymerase III) from miRNA (MIR) genes (Jones-Rhoades et al. 2006). Some precursors were found to be produced from the spliced introns of gene transcripts in Arabidopsis and rice (Meng and Shao 2012). The single-stranded precursor forms a hairpin structure, which is excised by DCL1 to release a double-stranded miRNA with an approximately 2-nt 3′-end overhang. Since other DCL proteins (DCL2, DCL3, or DCL4) may recognize and process miRNA precursors, miRNAs of diverse sizes ranging from 20 to 24 nt can be generated (Margis et al. 2006). The double-stranded miRNA is subsequently 2′-O-methylated by the methyltransferase Hua enhancer1 (HEN1) and thereby protected from degradation (Yu et al. 2005; Molnár et al. 2007; Abe et al. 2010). Only one strand of the duplex is selected to become the functional miRNA (guide strand, mature miRNA), while the other strand (passenger strand, miRNA*) is degraded. In some cases, the passenger strand is differentially expressed in different tissues and developmental stages and can be functionally active (Okamura et al. 2008). In Arabidopsis, miR396a-5p is the mature miRNA, which is preferentially expressed in root and targets the growth-regulating factor (GRF) family (Jones-Rhoades and Bartel 2004), while miR396a-3p is preferentially expressed in flower (Jeong et al. 2013). There are also examples of multiple mature miRNAs produced from a single precursor, as in the case of the closely related miR161.1, miR161.2, and miR161.3, which target genes encoding pentatricopeptide repeat proteins (PPRs) in Arabidopsis (Allen et al. 2004; Jeong et al. 2013).

The mature miRNA strand, or occasionally the miRNA* strand, is loaded into an AGO protein to form an RNA-induced silencing complex (RISC). Most plant miRNAs possess a 5′ uridine, which serves as a signal for association with AGO1 (Mi et al. 2008; Montgomery et al. 2008; Takeda et al. 2008). The miRNA directs RISC to target mRNAs in a sequence-dependent manner. Plant miRNAs share perfect or nearly perfect complementarity with their targets (Rhoades et al. 2002) and bind to complementary sequences in the 3′UTRs of target mRNAs. While miRNAs with near-perfect complementarity predominantly repress translation, those with perfect complementarity induce transcript cleavage (Hutvágner and Zamore 2002; Llave et al. 2002; Bartel 2004).

Although miRNA and miRNA* are the predominant species derived from a precursor, high-throughput sequencing (HTS) data show that miRNA hairpin structures often yield low-frequency positional and length variants (Meyers et al. 2008; Morin et al. 2008). These length and sequence variants of canonical miRNAs are widely referred to as isoforms or isomiRs. Based on homology assessment, they have been primarily classified into templated isomiRs (miRNA length variants with homology to the parent genes) and non-templated isomiRs (nucleotide additions and/or posttranscriptional RNA edits resulting in no homology to the parent genes) (Neilsen et al. 2012; Rogans and Rey 2016). Although the precise mechanism by which isomiRs originate has not yet been established, it has been proposed that isomiR biogenesis is due to the imprecise cleavage activity of DCL1 (Bartel 2004), 3′ uridylation by nucleotidyl transferases such as UTP:RNA uridylyltransferase (URT1) and HEN1 suppressor1 (HESO1) (Tu et al. 2015; Wang et al. 2015), or the 3′-5′ exoribonuclease activity of Small RNA Degrading Nucleases (SDNs) (Ramachandran and Chen 2008), producing isomiRs with 5′ or 3′ nucleotide additions or deletions. Regulatory roles of isomiRs have been reported in tissue specificity, development, leaf senescence, and stress response (Colaiacovo et al. 2012; Hackenberg et al. 2013; Xu et al. 2014). In Arabidopsis, a comparative assessment of genome-wide sRNA profiles induced by temperature stress revealed differential expression of a specific subset of mature miRNAs and a variety of isomiRs (Baev et al. 2014). For example, the miR160c precursor gave rise to a number of isomiRs that were differentially expressed upon high- and low-temperature treatment (Fig. 1). The isomiR with the highest copy number was the most strongly upregulated during stress exposure, implying its likely functional significance for the plant stress response (Baev et al. 2014).

Fig. 1 miR160c and its isomiRs in Arabidopsis. The profile was generated from NGS datasets derived from temperature conditional responses (NT normal temperature, LT low temperature, HT high temperature). The mature miRNA indexed in miRBase is shown in bold; the isomiR having the highest copy number is underlined
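The distinction between templated and non-templated isomiRs can be illustrated with a minimal Python sketch. It is not the method of any tool cited in this chapter: the function name `classify_isomir`, the window `max_shift`, the threshold `min_match`, and the precursor fragment in the demonstration are all hypothetical. The sketch assumes an ungapped comparison of a read against the precursor and does not handle internal RNA edits.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IsomiRCall:
    templated: bool          # True if the whole read matches the precursor
    five_prime_shift: int    # read start minus canonical start (negative = 5' extension)
    three_prime_shift: int   # templated end minus canonical end (negative = 3' trimming)
    nontemplated_tail: str   # 3' nucleotides that do not match the precursor


def classify_isomir(read: str, precursor: str, mir_start: int, mir_end: int,
                    max_shift: int = 3, min_match: int = 16) -> Optional[IsomiRCall]:
    """Classify a read against the canonical miRNA at precursor[mir_start:mir_end].

    Coordinates are 0-based and half-open. max_shift and min_match are
    illustrative assumptions, not published thresholds.
    """
    best = None  # (matched length, closeness to canonical start, start index)
    first = max(0, mir_start - max_shift)
    last = min(len(precursor) - 1, mir_start + max_shift)
    for start in range(first, last + 1):
        # Length of the read prefix that matches the precursor from this start.
        matched = 0
        while (matched < len(read) and start + matched < len(precursor)
               and read[matched] == precursor[start + matched]):
            matched += 1
        candidate = (matched, -abs(start - mir_start), start)
        if best is None or candidate > best:
            best = candidate
    if best is None or best[0] < min_match:
        return None  # read is not assigned to this miRNA under these settings
    matched, _, start = best
    tail = read[matched:]  # whatever is left over is treated as non-templated
    return IsomiRCall(
        templated=(tail == ""),
        five_prime_shift=start - mir_start,
        three_prime_shift=start + matched - mir_end,
        nontemplated_tail=tail,
    )


if __name__ == "__main__":
    # Hypothetical precursor fragment; the canonical miRNA occupies positions 4..24.
    precursor = "AAGGTGCCTGGCTCCCTGTATGCCATTTGCAGG"
    canonical = precursor[4:25]
    for read in (canonical, "G" + canonical, canonical[:-1], canonical + "A"):
        print(read, classify_isomir(read, precursor, 4, 25))
```

One consequence of this purely sequence-based definition is that a 3′ addition that happens to match the next precursor nucleotide cannot be told apart from a templated 3′ extension.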

Many miRNAs belong to families in which the MIR loci are often closely related and occasionally produce identical miRNAs. Some annotated miRNA families are conserved across vast phylogenetic scales, while others are family- or species-specific (Cuperus et al. 2011). Like miRNAs, numerous isomiRs have been found to be conserved across species and are likely to participate in the regulation of important biological processes (Ameres and Zamore 2013).

3 MiRNA-Mediated DNA Methylation

The increasing evidence of isomiR complexity in plants has prompted a search for biologically significant variants among the large number of MIR-derived sRNAs (Sablok et al. 2015). The well-recognized mechanism of action of miRNAs is sequence-specific repression of gene expression at the posttranscriptional level. Recent findings have revealed that newly identified miRNAs and MIR-derived sRNAs can also act in sequence-specific transcriptional silencing, thus influencing genome function through DNA methylation (Bao et al. 2004; Chellappan et al. 2010; Wu et al. 2010; Khraiwesh et al. 2010).

3.1 First Evidence for an Indirect Link Between MiRNAs and DNA Methylation

The first evidence for the involvement of miRNAs in DNA methylation came from the study of Bao et al. (2004) in Arabidopsis, which asked whether the miRNA complementary site in the target mRNA affects the chromatin state of the corresponding gene. To this end, they compared DNA methylation between wild-type plants and two mutant lines, phb-1d and phv-1d, in which dominant mutations in the PHABULOSA (PHB) and PHAVOLUTA (PHV) genes disrupt the complementary site of miR165 and miR166 in the PHB and PHV mRNAs (McConnell et al. 2001; Emery et al. 2003). In both genes, the miR165/166 complementary site is split by an intron, and the downstream exons were found to be heavily methylated in most wild-type cells. Decreased methylation at the PHB and PHV loci in phb-1d and phv-1d, respectively, revealed an indirect interaction between miR165/166 and the PHB and PHV templates. It was hypothesized that the miRNA binds to the complementary site of the processed, nascent mRNA and recruits a chromatin-modifying complex in trans to the closely located template locus (Bao et al. 2004). Intriguingly, PHB and PHV methylation was not affected in dcl1 and ago1 mutants (Bao et al. 2004), which raised the question of whether other players, different from but still related to miRNAs, could mediate DNA methylation.

The hypothesis proposed by Bao and collaborators (2004) has been confirmed in the moss Physcomitrella patens by analyzing the interaction between miR166 and its target mRNAs, PpC3HDZIP1 and PpHB10 (Khraiwesh et al. 2010). As in the PHB and PHV genes in Arabidopsis, the miRNA binding site contains an intron and is reconstituted upon splicing of the PpC3HDZIP1 and PpHB10 primary transcripts. It was shown in moss mutants lacking DCL1b that, although the miRNA level was unchanged, the target transcripts were significantly downregulated without being cleaved. Evidence was obtained for the accumulation of stable miRNA:target mRNA complexes and for hypermethylation of the target loci in these mutants, and it was proposed that the ratio between miRNAs and their targets determines the recruitment in trans of DNA-methylation effector molecules to the template loci (Khraiwesh et al. 2010).

3.2 MIR-Derived sRNAs: The Real Players in MiRNA-Mediated DNA Methylation

MIR-Derived sRNA Diversity and Biogenesis

In Arabidopsis, a novel class of 23- to 27-nt sRNAs was identified that is produced together with the canonical 20- to 22-nt miRNAs from a number of MIR genes (Vazquez et al. 2008; Chellappan et al. 2010). Unlike canonical miRNAs, which originate from ancient, highly conserved MIR genes, the long MIR-derived sRNAs originate from recently evolved MIR genes (Vazquez et al. 2008). These studies demonstrate that canonical miRNAs and MIR-derived sRNAs can be generated independently from the same hairpins by DCL1 and DCL3, respectively. Mutational analysis revealed that the accumulation of MIR-derived sRNAs in Arabidopsis depends not only on DCL3 but also on RDR2 and PolIV (Chellappan et al. 2010). Because PolIV, RDR2, and DCL3, the main components of small interfering RNA (siRNA) biogenesis, are involved in their processing, these sRNAs are referred to as MIR-derived siRNAs (Chellappan et al. 2010). The precise roles of PolIV and RDR2 in the biogenesis of MIR-derived siRNAs in Arabidopsis remain an open question.

In contrast to Arabidopsis, a number of MIR genes in rice were found to produce both canonical miRNAs and 24-nt sRNAs, or only 24-nt sRNAs (Zhu et al. 2008; Wu et al. 2009). The two types of sRNA can be processed from the same hairpins by the cooperative action of DCL1 and DCL3 or, in some cases, a hairpin precursor can be processed by DCL3, giving rise only to 24-nt sRNA species called long miRNAs or lmiRNAs (Wu et al. 2010). Of the 54 lmiRNAs identified in rice by Wu and coworkers (2010), 31 were later observed to be located in the intronic regions of protein-coding genes (Tong et al. 2013). In contrast to the Arabidopsis MIR-derived siRNAs, there is no evidence to suggest that the formation of lmiRNAs requires RDR2 and PolIV. In tomato, 10 loci encoding putative 24-nt lmiRNAs were predicted, and the expression of four of them was shown to be DCL3-dependent, confirming their identity as lmiRNAs (Kravchik et al. 2014). In addition, the expression profiles of two of the tomato lmiRNAs in different organs suggested their involvement in tomato reproductive development.

MIR-Derived sRNA Effector Complexes and DNA Methylation

Canonical miRNAs and MIR-derived sRNAs are loaded into functionally different Argonaute complexes: canonical miRNAs associate specifically with AGO1, while MIR-derived siRNAs are sorted into AGO4 clade proteins. The reduced levels of MIR-derived siRNAs in the ago4-1 mutant and an AGO4 coimmunoprecipitation assay revealed that these sRNAs can associate with AGO4 in Arabidopsis (Chellappan et al. 2010). The observations of Wu et al. (2010) in rice suggest that lmiRNAs beginning with adenine sort into AGO4a, AGO4b, and AGO16, and those beginning with uracil are loaded into AGO4b. The association of MIR-derived siRNAs with AGO4 clade proteins suggested their involvement in DNA methylation.

There is increasing evidence that some MIR-derived siRNAs are indispensable for DNA methylation of their target loci in trans and/or of their own MIR loci in cis. In the Arabidopsis mutant nrpd1-3, which lacks the largest subunit of PolIV and in which 23- to 26-nt sRNAs were absent, reduced DNA methylation of At4g16580 and At5g08490, the putative targets of the recently evolved miR2328 and miR2831-5P, respectively, correlated with upregulated mRNA expression (Chellappan et al. 2010). In the same mutant, the DNA methylation of SPL2 (squamosa-promoter binding protein-like), a target of the canonical miR156, was reduced up- and downstream of the miRNA target site compared to the wild type. To examine the DNA methylation of lmiRNA-producing and target loci in rice, two mutant lines, dcl3a-17 and rdr2-2, were subjected to bisulfite sequencing (Wu et al. 2010). Since lmiRNA biosynthesis requires DCL3 but not RDR2, comparing the methylation profiles of these mutants allows siRNA-dependent and lmiRNA-dependent DNA methylation to be distinguished. In this way, it was demonstrated that miR1873 directed methylation at its own locus and that miR1863, miR820.2, miR1873.1, and miR1876 mediated the methylation of their target genes (Wu et al. 2010). Unlike siRNA-dependent DNA methylation, which spreads in the 3′ direction only, lmiRNAs induce DNA methylation bidirectionally from the miRNA-binding site in target genes, as exemplified by bisulfite sequencing of the target genes of miR1862c, miR1863b, miR1867, miR2121b, miR5150, and miR5831 in rice (Hu et al. 2014) and of miR160 and miR166 in moss (Kravchik et al. 2014).
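The reasoning behind the dcl3a/rdr2 comparison can be written down as a small decision rule. The following Python sketch is only an illustration of that logic: the function name `classify_locus`, the labels, and the `min_drop` threshold are assumptions and do not reproduce the analysis of Wu et al. (2010). Methylation levels are taken to be fractions between 0 and 1 obtained from bisulfite sequencing.

```python
def classify_locus(wt: float, dcl3: float, rdr2: float, min_drop: float = 0.5) -> str:
    """Label a locus from its methylation level in wild type and two mutants.

    A locus counts as 'lost' in a mutant when its methylation falls to at most
    (1 - min_drop) of the wild-type level. min_drop is a hypothetical threshold.
    """
    if wt <= 0:
        return "other"  # nothing to lose if the wild type is unmethylated
    threshold = wt * (1 - min_drop)
    lost_in_dcl3 = dcl3 <= threshold
    lost_in_rdr2 = rdr2 <= threshold
    if lost_in_dcl3 and not lost_in_rdr2:
        # Requires DCL3 but not RDR2, as described for lmiRNAs.
        return "lmiRNA-dependent"
    if lost_in_dcl3 and lost_in_rdr2:
        # Requires both DCL3 and RDR2, as expected for siRNA-directed methylation.
        return "siRNA-dependent"
    return "other"


if __name__ == "__main__":
    # Hypothetical methylation fractions: (wild type, dcl3a mutant, rdr2 mutant).
    for levels in ((0.8, 0.1, 0.75), (0.8, 0.1, 0.1), (0.8, 0.7, 0.75)):
        print(levels, classify_locus(*levels))
```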

The described variety of plant canonical and non-canonical miRNAs and MIR-derived siRNAs, with the particularities of their biogenesis, AGO incorporation, and mode of action, is schematically presented in Fig. 2.

Fig. 2 Diversity of plant MIR-derived sRNAs. The hairpin precursors, transcribed from most plant MIR genes, are cut out by DCL1 to produce canonical mature ~21 nt miRNAs which associate with AGO1 and mediate target mRNA cleavage or translational repression. In addition to this classical pathway, some plant MIR genes can generate sRNA species that differ from the canonical miRNAs. In Arabidopsis, two sRNA species, canonical miRNAs and MIR-derived siRNAs (23–27 nt), can be generated independently from different molecules of the same hairpin population by DCL1 and DCL3, respectively (Chellappan et al. 2010). In rice and tomato, some MIR genes produce only 24-nt lmiRNAs using DCL3, while other MIR genes can produce canonical miRNA and lmiRNA species simultaneously by coordinate activities of DCL1 and DCL3 on the same molecule (Wu et al. 2010; Kravchik et al. 2014). MIR-derived siRNAs and lmiRNAs associate predominantly with AGO4 and mediate DNA methylation of target genes or their own MIR genes. The DCL1/AGO1 pathway is depicted in red color, while the DCL3/AGO4 pathway is depicted in blue color

4 Epigenetic Control of MIR Genes

Plant miRNA expression might be regulated at the transcriptional level by chromatin remodeling due to histone modification or DNA methylation of the corresponding MIR loci.

4.1 Impact of Histone Modifications of MIR Loci on MiRNA Expression

The studies of Kim et al. (2009), performed with the Arabidopsis mutants gcn5-1 and gcn5-2, revealed that acetylation of histone H3 lysine 14 at a number of miRNA loci interfered with miRNA production. The analysis of pri-miRNA and mature miRNA accumulation levels suggested a role of the histone acetyltransferase GCN5 in both pri-miRNA maturation and miRNA expression.

Briefly, in the wild type, the reduced expression of the miRNA processing genes DCL1, SE, HYL, and AGO1 correlated with high accumulation of pri-miRNAs and low levels of mature miRNAs. In gcn5 mutants, the expression of these components of the miRNA processing pathway was upregulated and correlated with increased levels of miRNAs and decreased levels of the corresponding pre-miRNAs. No direct interaction between GCN5 and the DCL1, SE, HYL, and AGO1 genes was identified. These data inspired the hypothesis that a common repressor of these genes exists whose activity is influenced by GCN5.

Furthermore, a direct interaction was described between GCN5 and four MIR genes (miR165a, miR172a, miR395e, and miR399d) in wild-type Arabidopsis. Based on studies of the gcn5-2 mutant, this interaction was mapped to the bromodomain of GCN5. Moreover, some histone deacetylases (HDA9 and HDA19) were also identified as factors regulating miRNA accumulation.

Nosaka et al. (2012) found that miR820 cleaves OsDRM2 mRNA and induces DNA methylation at the miR820 target site in the OsDRM2 gene. In turn, miR820 itself is encoded by CACTA transposable elements (TEs; five copies located on different chromosomes) (Nosaka et al. 2013). The heterochromatic mark H3K9 dimethylation is detected in the CACTA copies carrying miR820 genes, suggesting repression of miR820. However, in one of the CACTA copies, which resides on chromosome 7 and produces transcripts, low levels of active histone marks (H3K4 di/trimethylation and H3K9 acetylation) correlate with a high level of asymmetric cytosine methylation (CHH) in the same region, yet transcription of miR820 from this locus still occurs.

In Brachypodium distachyon, a representative of the Pooideae, a newly evolved, species-specific miRNA, miR5200, was identified that targets the mRNAs of two florigen genes, FTL1 and FTL2 (Wu et al. 2013a, b). The expression of miR5200 changed with day length, being upregulated under short-day (SD) and downregulated under long-day (LD) conditions. The authors proposed that the differential expression of miR5200 might be due to changes in chromatin modifications at the MIR genes MIR5200a and MIR5200b. The repressive histone mark H3K27 trimethylation was enriched at the MIR5200 genes under LD conditions. Thus, epigenetic control of miR5200 expression was found to participate in the photoperiodic regulation of the transition from the vegetative to the reproductive stage in B. distachyon (Wu et al. 2013a, b).

4.2 DNA Methylation of MIR Genes Affects MiRNA Expression

DNA methylation status in the CG, CHG, and CHH contexts has been explored in both the promoter regions and gene bodies of MIR genes in rice and compared between conserved and species-specific miRNAs (Hu et al. 2014). The highest rate of methylation was found in the CG context in both promoters and gene bodies of non-conserved MIR genes. The genes of the majority of highly expressed, conserved miRNAs were constitutively hypomethylated, while the genes of recently evolved miRNAs were hypermethylated at both promoters and gene bodies. The authors suggested that strong control by DNA methylation had been established during plant evolution as a means of repressing newly evolved, species-specific miRNAs.
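The three sequence contexts are defined by the two nucleotides downstream of a cytosine, where H stands for A, C, or T. As a minimal sketch of how cytosines in a MIR promoter or gene body could be assigned to these contexts, the Python functions below (`cytosine_context` and `context_counts`, both hypothetical names) classify each cytosine on either strand of a DNA sequence. Measuring methylation levels themselves would additionally require per-cytosine read counts from bisulfite sequencing, which are not modeled here.

```python
from collections import Counter
from typing import Optional

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def cytosine_context(seq: str, pos: int) -> Optional[str]:
    """Return 'CG', 'CHG', or 'CHH' for the cytosine at seq[pos] (0-based).

    Returns None when the context cannot be determined, for example at the
    end of the sequence or next to an ambiguous base.
    """
    seq = seq.upper()
    if seq[pos] != "C":
        raise ValueError("position does not hold a cytosine")
    downstream = seq[pos + 1:pos + 3]
    if downstream[:1] == "G":
        return "CG"
    if len(downstream) < 2 or downstream[0] not in "ACT":
        return None
    if downstream[1] == "G":
        return "CHG"
    if downstream[1] in "ACT":
        return "CHH"
    return None


def context_counts(seq: str) -> Counter:
    """Count cytosines per context on both strands of a sequence."""
    counts = Counter()
    for strand in (seq.upper(), reverse_complement(seq)):
        for i, base in enumerate(strand):
            if base == "C":
                context = cytosine_context(strand, i)
                if context is not None:
                    counts[context] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sequence, used only to exercise the functions.
    print(context_counts("ACGTTCAGGCATTC"))
```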

The DNA methylation landscape was monitored in five different rice tissues: embryo, endosperm, root, shoot, and mature leaves (Hu et al. 2014). The promoters and gene bodies of non-conserved MIR genes showed the same trend in their CG methylation profile among the studied tissues. Hypomethylation was detected in all contexts in endosperm. A higher rate of methylation was detected only in the CHH context in embryo and mature leaves. DNA methylation of the promoters and gene bodies of conserved miRNAs did not differ significantly among tissues, except for the endosperm, where reduced methylation in the CHG and CHH contexts was observed.

In poplar, a relationship between DNA methylation and gene expression was described for both long noncoding RNAs (lncRNAs; >200 bp) and miRNAs (Song et al. 2016). The fourth exon of the lncRNA00268512 gene was found on the complementary strand of the first exon of the protein-coding gene Potri.018G127000, in which miR396e is located. Because of this substantial overlap, it was proposed that lncRNA00268512 and miR396e may interact (Song et al. 2016). Moreover, a stress-specific differentially methylated region (SDMR 162) was identified in the first exon of Potri.018G127000. The decreased DNA methylation of SDMR 162 correlated with increased levels of the lncRNA upon cold and osmotic stress. It was suggested that the excess lncRNA molecules could capture miRNA396e-3p and cause its lower abundance in response to stress.

4.3 Link Between MIR Gene DNA Methylation and Plant Stress Response

Abiotic stress-responsive DNA methylation was observed in five conserved MIR genes (miR167-3p, miR6445a, miRNA319c, miR156f, and miR472a) and eleven non-conserved MIR genes in Populus simonii (Song et al. 2016). A long-term impact of DNA methylation on miRNA gene expression was seen in response to short-term abiotic stress at ~15% of de novo methylated sites. Stress-induced DNA demethylation was reported at two MIR genes encoding the miR156f and miR472b transcripts. Approximately 11% of the demethylated sites that responded to abiotic stress were found to be preserved 6 months later (Song et al. 2016).

Ci et al. (2015) identified 1066 stress-specific DNA methylated sites in poplar in response to heat and cold stress. Analysis of DNA methylation levels showed 150 stress-specific DNA methylated sites for each type of stress and 100 DNA methylated sites common to both stress types. Seven MIR genes (miR156i, miR156j, miR167h, miR390c, miR393a, miR396e, and miR396g) were found to be differentially methylated in response to temperature stress. Moreover, their expression was influenced by the cytosine methylation pattern. Most temperature-responsive MIR genes carrying a CNG methylation pattern displayed higher expression than those with CG methylation.

5 Computational Tools for Plant MiRNA Analysis from NGS Datasets

Next-generation sequencing (NGS) techniques have made it possible to identify the entire repertoire of sRNAs and miRNAs in an efficient and cost-effective fashion. Because of the large data output of these methods, computational approaches are a necessary step for identifying miRNAs in ever-growing datasets.

A typical bioinformatics algorithm for miRNA identification from NGS datasets involves several steps, including but not limited to (1) quality filtering and adapter trimming; (2) mapping of sRNA reads to identify the corresponding genomic loci; (3) estimating miRNA expression based on copy number; and (4) exploring the secondary (2D) RNA structure of the loci to identify pre-miRNAs. sRNA sequencing libraries usually also contain non-miRNA molecules, such as other sRNAs, degradation products of protein-coding transcripts, and rRNA reads. Computational miRNA identification therefore requires filtering out as many of these non-miRNA reads as possible to discard this “background” and make the analysis more accurate. Each step in the process may change the output of the analysis, but perhaps the most crucial part is how the mapping stage is done. The software pipelines available so far vary in terms of user interface, user control, parameters, input data format, reference databases, and how they implement each of the above analysis stages.
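Steps (1) and (3) of such a pipeline can be sketched in a few lines of Python. The example below is a simplified illustration and not the implementation of any published pipeline: the adapter sequence, the 18–26 nt size window, the exact-match trimming on an 8-nt seed, and the function names are all assumptions. Mapping (step 2) and secondary structure prediction (step 4) are left to dedicated tools.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

# Assumed 3' adapter; replace it with the adapter used in the library preparation.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"


def trim_adapter(read: str, adapter: str = ADAPTER, seed: int = 8) -> Optional[str]:
    """Return the insert preceding the first exact match of the adapter seed.

    Reads without a recognizable adapter return None. Exact matching is a
    simplification; dedicated trimmers also tolerate sequencing errors.
    """
    index = read.find(adapter[:seed])
    return read[:index] if index != -1 else None


def collapse_reads(reads: Iterable[str], adapter: str = ADAPTER,
                   min_len: int = 18, max_len: int = 26) -> Counter:
    """Trim, size-filter, and collapse reads into unique sequences with copy numbers."""
    counts = Counter()
    for read in reads:
        insert = trim_adapter(read.strip().upper(), adapter)
        if insert is None or "N" in insert:
            continue  # no adapter found, or ambiguous base call
        if min_len <= len(insert) <= max_len:
            counts[insert] += 1
    return counts


def reads_per_million(counts: Counter) -> Dict[str, float]:
    """Normalize copy numbers to reads per million retained reads."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {seq: n * 1_000_000 / total for seq, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical reads: three copies of one insert plus one too-short insert.
    insert = "TGCCTGGCTCCCTGTATGCCA"
    reads = [insert + ADAPTER] * 3 + ["TTT" + ADAPTER]
    counts = collapse_reads(reads)
    print(counts)
    print(reads_per_million(counts))
```

Collapsing identical reads before mapping keeps the copy number attached to each unique sequence, so that expression can be estimated after alignment without re-reading the raw data.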

The key step in NGS data analysis is mapping the huge number of short reads to a given genome. Several algorithms and software modules have been specifically designed for aligning millions of reads. Some of the most widely used tools for alignment to a reference genome are Bowtie (http://bowtie.cbcb.umd.edu/) (Langmead et al. 2009), BWA (http://bio-bwa.sourceforge.net/) (Li and Durbin 2009), MAQ (http://maq.sourceforge.net/) (Li et al. 2008), and SOAP (http://soap.genomics.org.cn/) (Li et al. 2008).

The standard approach for identifying plant miRNAs is cross-species discovery of conserved miRNAs in the sample, since a large proportion of miRNAs have orthologues across the plant kingdom. However, this approach is limited to organisms for which a reference genome and annotated miRNA genes are available.

In such comparative analyses, the NGS short reads are aligned to a known reference database. These miRNA databases are the source of known miRNA sequences (mature and hairpin sequences) and annotation information. They are also essential for expression profiling of miRNAs. The most popular miRNA databases include, but are not limited to, miRBase (Griffiths-Jones 2004), deepBase (Yang and Qu 2012), microRNA.org (Betel et al. 2008), miRGen (Megraw et al. 2007), miRNAMap (Hsu et al. 2006), miRNEST (Szcześniak et al. 2012), and PMRD (Zhang et al. 2010); among them, the most comprehensive ones for plant miRNAs are miRBase and PMRD.

A more challenging analysis is to identify novel mature and precursor miRNA sequences directly from NGS read data, without depending on homology to known conserved plant miRNAs. Such methods usually involve in-depth investigation of the mapping results and clustering of the sRNAs produced from the miRNA/miRNA* regions. To discover the mature miRNA among such clusters of sRNAs expressed from the precursor region, only those reads that fit specific criteria (e.g., ability to form a duplex with at most four mismatches, presence of 3′ overhangs, a duplex length within the range of 18–24 bp, and a high copy number) are considered miRNA candidates (Meyers et al. 2008). However, such rules have limitations and necessitate experimental validation over large numbers of datasets. Based on PCR experiments aimed at validating de novo predicted miRNA candidates, some authors reported that 40% of them were false positives (Wei et al. 2009). These findings have prompted researchers in recent years to optimize existing methods and develop new ones to decrease the false-positive rate of the output.
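The duplex criteria listed above lend themselves to a simple rule-based filter. The Python sketch below is one possible reading of those rules and not the procedure of Meyers et al. (2008): it assumes an ungapped duplex with 2-nt 3′ overhangs on both strands, counts G:U wobble pairs as paired, interprets the 18–24 bp range as the length of the paired region, and uses a hypothetical copy-number cutoff (`min_copies`). Bulges and asymmetric loops, which occur in real precursors, are not modeled.

```python
from typing import Tuple

# Watson-Crick pairs plus G:U wobble pairs (counting wobble as paired is an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def _rna(seq: str) -> str:
    """Uppercase a sequence and convert DNA letters to RNA."""
    return seq.upper().replace("T", "U")


def duplex_stats(mirna: str, star: str, overhang: int = 2) -> Tuple[int, int]:
    """Return (paired length, mismatches) for an ungapped miRNA/miRNA* duplex.

    Both strands are written 5'->3'. The last `overhang` nucleotides of each
    strand are left unpaired, so miRNA position i pairs with the star strand
    position len(star) - overhang - 1 - i.
    """
    mirna, star = _rna(mirna), _rna(star)
    paired_len = min(len(mirna), len(star)) - overhang
    mismatches = 0
    for i in range(paired_len):
        j = len(star) - overhang - 1 - i
        if (mirna[i], star[j]) not in PAIRS:
            mismatches += 1
    return paired_len, mismatches


def is_mirna_candidate(mirna: str, star: str, copies: int,
                       max_mismatches: int = 4, min_duplex: int = 18,
                       max_duplex: int = 24, min_copies: int = 10) -> bool:
    """Apply the duplex length, mismatch, and copy-number rules to one candidate."""
    paired_len, mismatches = duplex_stats(mirna, star)
    return (min_duplex <= paired_len <= max_duplex
            and mismatches <= max_mismatches
            and copies >= min_copies)


if __name__ == "__main__":
    # Hypothetical pair: the star strand is built to pair perfectly with the miRNA.
    complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
    mirna = "UGCCUGGCUCCCUGUAUGCCA"
    star = "".join(complement[b] for b in reversed(mirna[:19])) + "CU"
    print(duplex_stats(mirna, star), is_mirna_candidate(mirna, star, copies=50))
```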

A variety of web-based and stand-alone software tools have been developed for the analysis of plant miRNA data. Some of the available tools that are designed specifically for plant miRNAs or can be used with plant miRNA datasets are listed in Table 1.

Table 1 Bioinformatics tools for plant miRNA analysis from NGS datasets

The discovery that miRNAs can both regulate and be regulated by target interactions is key to understanding their roles in gene regulation (Salmena et al. 2011). A remaining challenge in the field is to identify miRNA targets with high confidence. Finding the truly functional miRNA targets remains difficult even though the biological rules of miRNA targeting have been demonstrated experimentally and computationally.

The classical way of identifying a plant miRNA target relies on the complementarity between the miRNA and its mRNA target site, defined by the stability of the duplex, which has been widely used as a main feature by computational tools. Tools that use such algorithms include psRNATarget (Dai and Zhao 2011) and imiRTP (Ding et al. 2012), which predict the functional type of a miRNA based on the complementarity in the central region of the miRNA:target pair.
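Complementarity-based scoring can be illustrated with a short Python sketch. The penalties (1 for a mismatch, 0.5 for a G:U wobble pair) and the definition of the central region as miRNA positions 9–11 are illustrative assumptions; they are not the parameters of psRNATarget or imiRTP, and the function name `score_site` is hypothetical. The alignment is assumed to be ungapped, with the miRNA and the target site of equal length and both written 5′→3′.

```python
from typing import Tuple

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def _rna(seq: str) -> str:
    """Uppercase a sequence and convert DNA letters to RNA."""
    return seq.upper().replace("T", "U")


def score_site(mirna: str, site: str) -> Tuple[float, bool]:
    """Score an ungapped miRNA:target-site alignment.

    Returns (penalty, central_perfect). A lower penalty means higher
    complementarity; central_perfect is True when miRNA positions 9-11
    (1-based, counted from the miRNA 5' end) are all Watson-Crick paired.
    """
    mirna, site = _rna(mirna), _rna(site)
    if len(mirna) != len(site):
        raise ValueError("this sketch requires an ungapped, equal-length alignment")
    penalty = 0.0
    central_perfect = True
    # miRNA position 1 pairs with the last nucleotide of the target site.
    for position, (m, t) in enumerate(zip(mirna, reversed(site)), start=1):
        if (m, t) in WATSON_CRICK:
            cost = 0.0
        elif (m, t) in WOBBLE:
            cost = 0.5
        else:
            cost = 1.0
        penalty += cost
        if 9 <= position <= 11 and cost > 0:
            central_perfect = False
    return penalty, central_perfect


if __name__ == "__main__":
    # Hypothetical example: a perfectly complementary site for a 21-nt miRNA.
    complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
    mirna = "UGCCUGGCUCCCUGUAUGCCA"
    site = "".join(complement[b] for b in reversed(mirna))
    print(score_site(mirna, site))
```

A candidate site with a low penalty but an imperfect central region would, under the rule described above, be a candidate for translational repression rather than cleavage.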

More recently, in the era of NGS, degradome datasets have been used to find evidence of cleaved miRNA targets without relying on computational RNA folding predictions. Experimental methods have shown that AGO-mediated, miRNA-guided cleavage of the mRNA occurs precisely between the 10th and 11th nucleotides of the miRNA–mRNA duplex. The upstream fragment of the cleaved target is subsequently degraded, whereas the downstream fragment has been shown to be stable (Llave et al. 2002). Therefore, NGS techniques that capture these downstream fragments (Addo-Quaye et al. 2008; German et al. 2008; Gregory et al. 2008), followed by bioinformatics analysis of such datasets, can be used to identify miRNA targets. Several tools have been developed that use degradome data to predict miRNA targets. Among the most popular are CleaveLand (Addo-Quaye et al. 2009), SeqTar (Zheng et al. 2012), PAREsnip (Folkes et al. 2012), PatMaN (Prüfer et al. 2008), SoMART (Li et al. 2012), and StarScan (Liu et al. 2015).
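The positional rule can be turned into a coordinate calculation. Counting from the 5′ end of the miRNA, position 1 pairs with the last nucleotide of the target site, so the stable downstream fragment begins at the target nucleotide paired with miRNA position 10. The Python sketch below assumes an ungapped alignment and 0-based, half-open transcript coordinates; the function names and the input format (a mapping from transcript position to the number of degradome reads whose 5′ end maps there) are hypothetical and do not correspond to the interface of any of the tools listed above.

```python
from typing import Dict, Tuple


def expected_cleavage_index(site_end: int) -> int:
    """Transcript index of the first nucleotide of the downstream fragment.

    The target site occupies transcript[site_start:site_end] (0-based,
    half-open). miRNA position i (1-based, from the miRNA 5' end) pairs with
    transcript index site_end - i, so cleavage between miRNA positions 10 and
    11 leaves a downstream fragment that starts at site_end - 10.
    """
    return site_end - 10


def cleavage_support(tag_starts: Dict[int, int], site_end: int) -> Tuple[int, int, float]:
    """Summarize degradome evidence for cleavage at the expected position.

    Returns the expected position, the number of read 5' ends at that
    position, and their fraction among all read 5' ends on the transcript.
    """
    position = expected_cleavage_index(site_end)
    at_site = tag_starts.get(position, 0)
    total = sum(tag_starts.values())
    fraction = at_site / total if total else 0.0
    return position, at_site, fraction


if __name__ == "__main__":
    # Hypothetical 21-nt target site at transcript positions 100..120.
    site_start, site_length = 100, 21
    tag_starts = {111: 40, 50: 5, 200: 5}  # hypothetical degradome 5' end counts
    print(cleavage_support(tag_starts, site_start + site_length))
```

A high fraction of degradome 5′ ends at the expected position, relative to the rest of the transcript, is the kind of signal that distinguishes miRNA-guided cleavage from random degradation.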

In the past few years, a number of stand-alone and web-based tools have been developed to identify isomiRs (Table 2). Among the tools used to profile isomiRs in plants are SeqCluster (Pantano et al. 2011), miRSeqNovel (Qian et al. 2012), isomiRID (de Oliveira et al. 2013), sRNAtoolbox (Rueda et al. 2015), isomiRex (Sablok et al. 2013), and isomiRage (Muller et al. 2014). Although these tools allow isomiRs to be identified, they lack some features, such as isomiR profiling that takes into account sequencing artifacts, expression-based read support, visualization of isomiRs with respect to read depth and mapping, target prediction, and functional enrichment. Some tools, such as isomiRex (Sablok et al. 2013), are web-based and, although they do not allow the identification of novel miRNAs, support isomiR visualization based on read depth, whereas stand-alone tools lack isomiR visualization but support PARE-Seq-based target prediction. Although the recently published isomiR detection tool DeAnnoIso (Zhang et al. 2016) allows the detection of isomiRs, it is accessible only through a web interface and cannot detect isomiRs across a wide range of plant species (only four plant species are supported). IsomiRage can group functionally relevant isomiRs according to adenylation, uridylation, and other events in response to the biological context (Muller et al. 2014). Other tools, such as miR-isomiRExp (Guo et al. 2016) and isomiR-SEA (Urgese et al. 2016), are yet to be assessed for their efficiency in plant isomiR profiling. In terms of pre-analyzed isomiRs, only one database, isomiRBank (Zhang et al. 2016), exists, providing a pre-compiled set of isomiRs across four plant species.

Table 2 Recently developed classification tools for identifying isomiRs

Conclusions

The use of genome-wide technologies has enabled the identification of novel miRNAs and a large number of isomiRs, making the plant miRNAome increasingly complex. Although some of the emerging MIR-derived siRNAs resemble canonical miRNAs, they deviate from the conventional view of miRNA biogenesis, AGO incorporation, mode of action, and regulatory effect on gene expression. LmiRNAs are an excellent example of the biological significance of such MIR-derived siRNAs: as mediators in a noncanonical RdDM pathway, they contribute to the fine-tuning of epigenetic control in plants. Furthermore, the variability in DNA methylation patterns of genes encoding novel, species-specific miRNAs, induced by different types of stress, broadens our view of the regulatory networks associated with plant response, adaptability, and tolerance.