Introduction

The micro-RNAs (miRNAs) are small (19–24 nt), endogenous, non-coding RNAs abundant in eukaryotic organisms. These tiny molecules are key post-transcriptional and translational regulators of gene expression that interact with their respective target mRNA transcripts in both plants and animals (Devi et al. 2018; Kumar et al. 2018). Despite the similarities, plant and animal miRNAs differ in their biogenesis, interaction sites, and target recognition (Axtell et al. 2011). For instance, the miRNA-producing loci have distinctive genomic arrangements in the animal and plant kingdoms. Furthermore, excision of miRNAs from their precursor transcripts differs between plants and animals and takes place in different sub-cellular compartments. In animals, miRNA-mediated repression of gene expression is attained via translational attenuation through (multiple) miRNA-binding sites within the 3′ untranslated region (UTR) of the target gene (Miller and Waterhouse 2005). On the other hand, almost all plant miRNAs direct mRNA cleavage to regulate the target gene(s) because of near-perfect base complementarity at single sites in the coding regions. These differences, therefore, emphasize the need for plant-specific tools and databases.

In addition to the vital roles plant miRNAs play in cellular growth and development, there is growing evidence confirming them as master modulators of plant responses to stress factors (Shivaprasad et al. 2012; Ouyang et al. 2014; Shriram et al. 2016; de Vries et al. 2018, 2019; Kumar et al. 2018). Plant miRNAs also contribute to maintaining plant genome stability (Cimini et al. 2019), besides their potential applicability in genome editing (Xu et al. 2019). Owing to this, recent years have seen a sharp increase in investigations related to miRNAs. Plant miRNAs comprise endogenously conserved entities spread throughout the plant kingdom and are produced from RNAs with imperfectly base-paired hairpin structures. The synthesis of mature plant miRNA involves RNA Pol II-driven transcription of the miRNA gene, which produces the primary miRNA; this is then processed to mature miRNA in two phases (Zhang et al. 2006a): the first phase generates the precursor miRNA from the primary transcript, and the second forms the mature miRNA from the precursor miRNA. In plants, both cleavage steps take place in the nucleus. In this process, the primary miRNAs are converted into the precursor miRNAs in a reaction catalyzed by the RNase III-like enzyme DICER-LIKE (DCL) along with the proteins HYL1 and SERRATE (SE) (Chapman and Carrington 2007; Meyers et al. 2008). The DCL-1 further processes these hairpin-looped structures, giving rise to miRNA:miRNA* duplexes (Hannon 2002) that are exported from their site of synthesis by the protein HASTY. These duplexes are then loaded into the RNA-induced silencing complex (RISC), after which they are unwound by the ARGONAUTE (AGO) proteins (Zhu 2008). Finally, the mature miRNAs lead the RISC to target the complementary mRNAs (reviewed by Shriram et al. 2016; Djami-Tchatchou et al. 2017; Kumar et al. 2018; Treiber et al. 2019).

There have been several recent changes in the methods used for identifying plant miRNAs and their respective targets involved in life processes and responses to environmental stimuli. These methods include next-generation sequencing (NGS) (Simon et al. 2009), microarray (Yin et al. 2008), RNA ligase-mediated rapid amplification of cDNA ends (RLM-RACE) (Jones-Rhoades et al. 2006), and homology search of public databases (Xu et al. 2019). Small-RNA deep sequencing coupled with degradome analysis has revolutionized this field. In silico tools and techniques are proving fast, efficient, and dependable for assisting the identification and characterization of plant miRNAs (Hannoufa et al. 2018). Wide application of small RNA-sequencing technology has led to an explosion of miRNA discovery in the last few years. Therefore, miRNA entries in miRBase from different plant species are on a constant rise (Kozomara et al. 2018; Yu et al. 2019). We present herein a review of plant miRNAs, their roles in plant life processes and plant responses to challenging environments, available tools and technologies for their identification and characterization, target prediction, confirmation and validation, and network analysis, with an emphasis on the current status and future challenges.

MiRNAs: emerging regulators of gene expression during plant growth, development, and stress responses

MiRNAs are known to regulate almost all aspects of plant life as they have sufficient diversity to match any RNA entity encoded in the genome, and this regulation is often via mediating the transcription factors (Chen and Rajewsky 2007; Carlsbecker et al. 2010; Huang et al. 2011; Djami-Tchatchou et al. 2017). MiRNAs are known to regulate a broad spectrum of plant developmental processes (reviewed by Singh et al. 2018). Several studies have confirmed the developmental roles of plant miRNAs (Chen 2009; Li and Zhang 2015). High-throughput or deep sequencing (NGS platforms like Illumina) and degradome analysis assisted with advanced in silico or bioinformatics tools and data sets have led to identification of an increasing number of miRNAs involved in plant growth, development, morphogenesis, and responses to external environmental stimuli (Brant and Budak 2018). Eventually, plant miRNAs are emerging tools for crop improvement (Shriram et al. 2016; Budak and Zhang 2017; Khare et al. 2018).

Plant miRNAs have proved their involvement throughout the life cycle of the plant, i.e., from seed germination to seedling establishment, vegetative growth, flowering, and reproduction. Plant seeds exemplify miniature new plants in the form of dormant embryos, and miRNAs are reported to be involved in the regulation of seed germination (miR156 and miR172) (Li and Zhang 2015; Das et al. 2018). Similarly, miRs-156, 159, 165, 166, and 319 are involved in the regulation of leaf formation (reviewed by Xu et al. 2019). miRNAs are also involved in shoot development, including the control of shoot meristem functions (Liu et al. 2018). Plant miRNAs also play a decisive role in root development (Khan et al. 2011). Almost all aspects of flowering are regulated by miRNAs (reviewed by Teotia and Tang 2015). All these findings establish that miRNAs are critical for establishing plant cell identity during development and for accomplishing the appropriate morpho-/physiological state (D’Ario et al. 2017).

Similarly, plant miRNAs regulate how plants respond and/or adapt to changing and challenging environments, including countering biotic or abiotic stresses (Shivaprasad et al. 2012; de Vries et al. 2015; Khare et al. 2018; Jiang et al. 2018; Xu et al. 2019). Recent reports have confirmed that plant miRNAs re-program gene expression to modulate the deleterious impacts of stress on plants (Brant and Budak 2018).

In spite of their inherently conserved nature, miRNAs (and families thereof) may be specific to plant species, developmental or physiological stage, tissue type, and the type of and exposure to stress conditions (Sunkar et al. 2017). Many miRNA candidates are described as stress-responsive in diverse plant species. Such studies were conducted in the context of specific abiotic factors such as salinity (Deng et al. 2015; Eren et al. 2015), drought (Wang et al. 2011; Awasthi et al. 2019), nutrient deficiency (Valdés‐López et al. 2010; Zhao et al. 2012), UV radiation (Jia et al. 2009), heat (Yu et al. 2011; Chen et al. 2012), cold (Zhang et al. 2014; Sun et al. 2015), and heavy metals (Ghosh et al. 2017; Zhou et al. 2019). Likewise, several miRNA candidates responsive to biotic stresses caused by bacteria (Xie et al. 2017), fungi (Inal et al. 2014), insects (Wu et al. 2017), nematodes (Li et al. 2012a), and viruses (Sun et al. 2017) have been reported. Along with in silico analysis, it is helpful to analyze the real-time expression patterns of miRNAs, as this provides actual expression levels. Therefore, miRNAs are analyzed from control as well as stressed plants, and differentially expressed miRNAs are identified after normalization and comparison of miRNA expression patterns from plants under stress against their control counterparts.

All such investigations have resulted in deciphered miRNomes of many plants, responsive to specific biotic or abiotic stress factors. However, there is a recent emphasis on identifying miRNAs responsive to multiple stresses, to help develop climate-resilient crops (Xu et al. 2019). Deeper insights are nevertheless needed into the regulation of expression levels of miRNAs as well as their targets, where in silico tools might be of great support.

Approaches for identification of miRNAs from small RNA transcriptome

Owing to the importance of miRNA in plant growth and adaptability to stressful conditions, several new methods and databases are being developed for identifying miRNAs and further necessary analyses (Backofen et al. 2017). Initially, the miRNA identification process used to involve direct cloning and sequencing; however, the small sizes of miRNAs, their multiple occurrences in genome, and methylation status were the factors making the process tedious (Prabu and Mandal 2010). Progressions in cloning and computational techniques boosted the miRNA identification procedure (Tripathi et al. 2015). The current pipelines for identification of miRNAs are a blend of interdisciplinary paths involving biological and bioinformatics approaches (Akhtar et al. 2016).

Current approaches to miRNA analysis revolve around getting an exact sequence of the miRNA transcriptome, and NGS acts as the backbone for this. The founding methods in nucleotide sequencing were Sanger’s dideoxy synthesis method (Sanger and Coulson 1975; Sanger et al. 1977) and Maxam–Gilbert’s chemical cleavage method (Maxam and Gilbert 1980). Despite being considered relatively slow, the modified, automated, and commercialized Sanger’s approach is still widely used for many applications. These modifications include incorporation of fluorescent dyes for detection, use of in silico tools for investigating high-throughput data, and efficient use of thermal-cycle sequencing, which reduces the quantity of input genomic material to be sequenced besides increasing the accuracy and efficiency (Slatko et al. 2018). After automation of Sanger’s method, the requirement of cost-effectiveness for large genome sequencing triggered the evolution of creative methodologies, which are referred to as second-generation sequencing techniques. These second-generation methods can be classified into two major groups: sequencing by hybridization and sequencing by synthesis.

The Ion Torrent™ Technology directly translates chemically encoded information into digital information on a semiconductor chip (Rothberg et al. 2011). The process uses the pH fluctuation caused by the release of hydrogen ions at each DNA extension event, which is recorded in the form of a voltage spike. Ion Torrent™ supports a wide array of sequencing applications, including de novo/targeted DNA/RNA sequencing, transcriptome sequencing, and small RNA and miRNA sequencing (Furey 2012).

However, by far, the most prominent player in sequencing is Illumina. This approach is a bridge amplification strategy in which the nucleic acid template undergoes repeated amplification on a solid support to create millions of sequence clusters. Currently, a wide range of models is available for the Illumina platform, including MiniSeq, MiSeq, NextSeq, NovaSeq, and HiSeq, which provide a differential range of outcomes. If we compare the wide array of sequencing strategies, the Illumina platform currently appears to be the best option for its high resolution (Slatko et al. 2018). Illumina HiSeq can provide 600 Gb of sequence yield per run, which is significantly higher than that of Ion Torrent, PacBio, and other sequencing platforms. Comparisons between next-generation strategies also point to the cost-effectiveness and the higher accuracy and efficiency of Illumina HiSeq compared with the other next-generation sequencing platforms (Liu et al. 2011; Quail et al. 2012). Sequencing only the small RNA transcriptome is proving more suitable because of its specificity towards small RNA transcripts, including miRNAs. Therefore, the small RNA-sequencing strategy combined with the depth offered by Illumina NGS has emerged as a method of choice to identify novel miRNA candidates.

Advanced NGS platforms have made the objectives including identification of miRNA fractions, miRNA expression levels, prediction of miRNA precursors, variants, miRNA targets, and other characters achievable with good precision (Motameny et al. 2010; Conesa et al. 2016). The efficacy levels in miRNA studies are increased because of the integrative applications of NGS (Ma et al. 2015). Increased sensitivity and cost-effectiveness of NGS has led to large miRNA data from plants. Strand-specific non-coding RNA transcripts even with low abundance can be analyzed using NGS and different bioinformatics approaches (Turner et al. 2012; Li et al. 2013a; Wu et al. 2014).

Channeling for exact recognition of conserved/novel miRNAs is intricate but a necessary task. The in silico techniques used to identify miRNAs can be grouped into two categories (Kleftogiannis et al. 2013), the comparative approach and the machine learning approaches. Comparative approach incorporates mainly the flow-connected programing, while machine learning approach focuses on predictive modeling constructed from a set of known information. For both the approaches, two common processes are applicable for miRNA identification (a) similarity-based search against existing databases and (b) secondary structure folding (Numnark and Suwannik 2019).

Approaches based on deep sequencing and available in silico tools are being explored to decipher plant miRNAs involved in biological processes and phenomena. If we consider the currently available data on miRNAs from various plants, several miRNAs seem conserved (Zhang et al. 2006a, b). Based on this, many computational approaches are being developed for miRNA identification (Patanun et al. 2013). In silico tools have been exploited for predicting and/or identifying miRNAs from plants including Arabidopsis, Oryza sativa, Zea mays, Lycopersicon esculentum, Setaria italica, Glycine max, Brassica napus, Vitis vinifera, and Gossypium hirsutum (reviewed by Banerjee et al. 2016).

Pre-processing of raw sequence data

A typical workflow for miRNA work comprises the generation of raw sequences from small RNA libraries, alignment of the generated raw data with available data sets, miRNA identification, prediction of secondary structures of recognized miRNA candidates, miRNA-target prediction, and putative functional assignment (Liao et al. 2018). Specific in silico tools have been developed for all these steps (Chen et al. 2018). The primary step for miRNA analysis is mapping of the raw sequencing data on the reference genome. However, the raw data also contain adapter sequences (used during library preparation), which can hinder the correct processing and further downstream analysis of actual small RNA data. Hence, the adapter sequences need to be eliminated from the raw data to get the pure transcriptome sequence devoid of any artificial adapter sequence (Lindgreen 2012). Trimming of adapters was first implemented for small RNA sequencing, where target fragments typically ranged between 18 and 30 nucleotides, with a read length requirement of 36 nt (He et al. 2014). The trimming process differs from removal of contaminant sequences. Trimming off the adapters is usually associated with next-generation sequencing protocols, where the specific adapter sequences are synthesized by reagent vendors. As the pattern of adapters is known, semi-global or end-space free alignment is modeled for adapter removal. These alignments are usually performed via Smith–Waterman or Needleman–Wunsch algorithms (Jiang et al. 2014). Based on such algorithms, some of the tools developed initially include cutadapt (Martin 2011) and Btrim (Kong 2011) (Table 1a). Trimmomatic (v0.38) (https://www.usadellab.org/cms/?page=trimmomatic, Bolger et al. 2014) is a trimming tool that analyzes paired- or single-ended NGS data. Depending on the Illumina pipeline used, this tool works with FASTQ (uncompressed or zipped) format. 
cutadapt (v1.18) (https://cutadapt.readthedocs.io/en/stable/, Martin 2011) is another tool that can find and eliminate adapter sequences, primers, and polyA tails from high-throughput sequence reads. This tool supports paired-end reads as well as color-space data. To process the short-read sequences generated through NGS, FASTX-Toolkit (https://hannonlab.cshl.edu/fastx_toolkit/, Gordon 2011) is also an option, developed by Hannon Lab Public Galaxy; it is a collection of tools useful in the processing of FASTA/FASTQ files. One of the tools from this collection is FASTQ/A Trimmer, which shortens the reads by removing barcodes/noise. For actual removal of adapter/linker sequences, the FASTQ/A Clipper tool is available. Newer tools serving the same purpose, with improved accuracy and efficiency, have since been developed. Some of these are SeqPurge (https://github.com/imgag/ngs-bits, Sturm et al. 2016), a very sensitive tool for adapter trimming with a capacity to remove very short adapter sequences (even one base), and AdapterRemoval v2 (https://github.com/MikkelSchubert/adapterremoval/, Schubert et al. 2016). All these in silico tools act as filters for complete removal of adapter sequences, providing clean and true sequence reads without any partial or complete contamination with adapter sequences.
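The idea behind 3′ adapter trimming can be sketched in a few lines of Python. This is a simplified illustration and not the algorithm of cutadapt, Trimmomatic, or any other tool named above: it allows substitutions only (no insertions or deletions), whereas the semi-global alignments used by real trimmers also handle indels. The function names, the parameters `min_overlap` and `max_error_rate`, and the two sequences are assumptions chosen for illustration; the 18–30 nt length window follows the fragment range mentioned in the text.

```python
def trim_3p_adapter(read, adapter, min_overlap=3, max_error_rate=0.1):
    """Remove a 3' adapter (and everything after it) from a read.

    The adapter is slid along the read from left to right. At each start
    position the overlapping part of the adapter is compared with the read,
    so both a full adapter inside the read and a partial adapter at the
    read end can be found. Only substitutions are tolerated.
    """
    for start in range(len(read)):
        overlap = min(len(adapter), len(read) - start)
        if overlap < min_overlap:
            break  # remaining suffix is too short to call an adapter
        mismatches = sum(
            1 for r, a in zip(read[start:start + overlap], adapter) if r != a
        )
        if mismatches <= int(max_error_rate * overlap):
            return read[:start]
    return read


def passes_length_filter(insert, min_len=18, max_len=30):
    """Keep inserts in the size window expected for small RNA fragments."""
    return min_len <= len(insert) <= max_len


# Illustrative sequences (assumptions, not taken from the review).
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"
INSERT = "TGACAGAAGAGAGTGAGCAC"

# Simulate a 36-nt read: the insert followed by as much adapter as fits.
read = (INSERT + ADAPTER)[:36]
trimmed = trim_3p_adapter(read, ADAPTER)
print(trimmed, passes_length_filter(trimmed))
```

Scanning from the left and returning at the first acceptable match keeps the sketch short; production tools instead score every candidate alignment and pick the best one.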

Table 1 A summarized list of major tools/databases available for identification and analysis of plant miRNAs

After getting high-quality reads, the next step is mapping the reads to known databases for identification of known miRNAs and of other non-coding/repetitive small RNAs. This mapping is achieved by exploring the databases Rfam and RepBase (Table 1b). The Rfam database (https://rfam.xfam.org/, Kalvari et al. 2018) is a collection of RNA families with multiple sequence alignments, covariance models, and consensus secondary structures. These Rfam families are frequently assembled via external sources and comprise three broad functional classes: the nc-RNA genes, self-splicing RNAs, and cis-regulatory elements. The first Rfam database was published in 2003 (Griffiths-Jones et al. 2003) as a collection of multiple sequence alignments and covariance models representing 25 families of non-coding RNAs; the current version 14.0 (2019) incorporates information from 3016 families. RepBase (https://www.girinst.org/about/repbase.html, Bao et al. 2015) is also available for repetitive sequences from eukaryotic species and is considered a reference standard for annotation of repetitive genomic data, having more than 44,000 sequences from over a hundred model organisms. This database is necessary for repetitive sequence masking tools such as RepeatMasker (Smit et al. 2015), REPET (Flutre et al. 2011), and CENSOR (Kohany et al. 2006). Application of such a repetitive sequence masking process omits the repeats present in the raw sequence. Therefore, after clearing the adapter sequences and repetitive sequences, the raw sequence can now be considered as high-throughput, non-repetitive, small RNA reads, ready to be compared with the known miRNA databases. The processed reads can then be subjected to sequence alignment using general-purpose aligners such as SOAP2 (Li et al. 2009) or Bowtie2 (Langmead and Salzberg 2012) (Table 1c).

Following alignment, the obtained miRNA counts need to be normalized to eliminate technical variation in the data. The normalization procedure compares raw sequence counts against a known common denominator (McCormick et al. 2011). An effective normalization minimizes technical and experimental bias without introducing noise, so that the authentic biological variation across the samples is retained. Different approaches have been proposed for miRNA sequence normalization, including linear/non-linear scaling and quantile normalization (Garmire and Subramaniam 2012; Dillies et al. 2013). The methods developed include counts per million (cpm) normalization, total count scaling, upper quartile scaling (UQ), trimmed mean of M-values (TMM), linear regression, quantile-based normalization, and cyclic loess normalization (Tam et al. 2015). One of the widely used methods in NGS for length normalization is reads per kilobase per million mapped reads (RPKM). ERPKM is an upgraded form of RPKM that uses an effective read length (Li et al. 2015). Expressing raw abundance values as transcripts per million (TPM) is another way to obtain normalized expression abundance. It represents the number of specific transcripts found per million transcripts and was developed to produce results that are more comparable across samples than RPKM (Wagner et al. 2012). Although the accuracy of miRNA-abundance quantification depends on the normalization technique, the downstream impacts of using different aligners for mapping the reads to a reference are unknown. Moreover, normalization cannot remove inconsistencies introduced by the original alignment (Tam et al. 2015). The housekeeping genes, the conventional normalizers, display considerable variation under different biological conditions (Lee et al. 2002). 
Therefore, the global normalization approaches discussed above are preferred, owing to the greater stability achieved by exploiting the abundance measures of a large number of entities. The availability of a large number of profiled and detected miRNAs has, in turn, provided a stable global normalization approach for miRNA sequences (Tiberio et al. 2015). For comprehensive miRNA profiling, the application of a suitable aligner followed by the use of an appropriate miRNA count normalizer is highly recommended.
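Three of the scaling measures named above (cpm, RPKM, and TPM) can be written out directly from their definitions. The following Python sketch is illustrative only: the read counts and feature lengths are invented, and the function names are our own rather than those of any package.

```python
def cpm(counts):
    """Counts per million: scale each count by the library size."""
    total = sum(counts.values())
    return {name: c * 1e6 / total for name, c in counts.items()}


def rpkm(counts, lengths):
    """Reads per kilobase per million mapped reads.

    RPKM = count * 1e9 / (library size * feature length in nt)
    """
    total = sum(counts.values())
    return {name: c * 1e9 / (total * lengths[name]) for name, c in counts.items()}


def tpm(counts, lengths):
    """Transcripts per million: length-normalize first, then rescale
    so that the values of one sample always sum to one million."""
    rates = {name: c / lengths[name] for name, c in counts.items()}
    denom = sum(rates.values())
    return {name: r * 1e6 / denom for name, r in rates.items()}


# Invented example data: mapped reads and feature lengths (nt).
counts = {"miR156": 500, "miR166": 300, "miR319": 200}
lengths = {"miR156": 100, "miR166": 150, "miR319": 200}

for label, values in (("cpm", cpm(counts)),
                      ("rpkm", rpkm(counts, lengths)),
                      ("tpm", tpm(counts, lengths))):
    print(label, {k: round(v, 1) for k, v in values.items()})
```

Because TPM values sum to the same total in every sample, they are easier to compare across libraries than RPKM values, which is the property the text attributes to TPM.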

Once the data are normalized, numerous analyses can be performed on them, most of which fall under the differential expression analysis category. Methods for identification of statistically significant, differentially expressed genes may vary with the number of biological replicates under consideration (McCormick et al. 2011). Different methods have been implemented for differential expression analysis. For instance, Bioconductor (https://bioconductor.org/) offers the tool edgeR, which identifies variable gene expression patterns in biological replicates via an overdispersed Poisson model (Robinson et al. 2009). This tool requires raw sequence counts as well as the total library counts as input, and hence, pre-normalized data are not required. DESeq2 is another tool provided by Bioconductor, which uses a negative binomial generalized linear model for testing differential gene expression. This tool also requires no pre-normalized data, as the DESeq2 model internally corrects for library size (Love et al. 2014). For differential expression analysis, another approach is provided by Fahlgren et al. (2009), which is an adaptation of significance analysis of microarrays (SAM) to sequencing data, called SAM-Seq. In this approach, the differential expression score is calculated amongst the samples by considering the average abundance in every replicate for the sRNA under consideration, along with the standard deviations. It also dampens the data variation by integrating a small, positive constant.
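The role of the small positive constant can be illustrated with the SAM-type statistic as it is commonly written, d = (mean of treated − mean of control) / (s + s0), where s is the pooled standard error and s0 is the constant. The Python sketch below is a generic illustration under that assumption; it is not claimed to reproduce the exact formula of the adaptation described above, and the function name, the default value of `s0`, and the replicate values are invented.

```python
from math import sqrt
from statistics import mean


def sam_score(control, treated, s0=1.0):
    """SAM-style relative difference between two groups of replicates.

    d = (mean(treated) - mean(control)) / (s + s0)

    s is the pooled standard error of the difference; s0 is a small
    positive constant that keeps the score from exploding when s is
    close to zero (typically for low-abundance sequences).
    """
    n1, n2 = len(control), len(treated)
    m1, m2 = mean(control), mean(treated)
    sum_sq = sum((x - m1) ** 2 for x in control) + sum((x - m2) ** 2 for x in treated)
    s = sqrt((1 / n1 + 1 / n2) * sum_sq / (n1 + n2 - 2))
    return (m2 - m1) / (s + s0)


# Invented normalized abundances for one sRNA in three replicates each.
control = [10, 12, 11]
treated = [30, 28, 32]
score = sam_score(control, treated)
print(round(score, 3))
```

In a full analysis the significance of such scores is usually assessed by permuting sample labels; that step is omitted here.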

Only the correct quantification of differential miRNA expression, along with its statistical significance, will provide a clear picture of miRNA-driven biological differences between the samples (compared as control vs test). The information regarding miRNA:mRNA interaction will be meaningless if miRNA-abundance changes between control and test are not evaluated correctly. Therefore, the application of tools such as DESeq2 or edgeR is a must for better results.

Searching hits in miRNA databases

After pre-processing, the conserved miRNA candidates from the raw sequences are compared with already established miRNA databases (Table 1d). Amongst these, miRBase (v22, https://www.mirbase.org/index.shtml, Kozomara et al. 2018) is the preferred miRNA database, which delivers integrated interfaces to comprehensive miRNA sequence data, annotation, and predicted target genes. The database was initially published as ‘The microRNA Registry’ (Griffiths-Jones 2004) with 506 miRNA entries from six organisms; at present (v22), it has miRNAs from 271 organisms, representing 38,589 hairpin precursors along with 48,860 mature miRNAs. These data are collected from 1493 small RNA sequence data sets and have mapped 5.5 billion reads to miRNA sequences. Each entry in this database denotes a predicted hairpin segment of a miRNA transcript (referred to as ‘mir’) along with the location and the mature miRNA sequence (referred to as ‘miR’). This information can be retrieved by providing the name, keywords, annotations, or references for a particular miRNA. This database is extensively applied in most miRNA-related investigations and is considered a central repository for miRNA sequences, nomenclature, annotation, and target prediction.

Apart from the miRBase, there are some plant-specific miRNA databases (Table 1e). For instance, at Université du Québec à Montréal, Laboratoire Bioinformatique, a new wheat miRNA database as Wheat MicroRNA Portal (WMP) has been developed (https://wheat.bioinfo.uqam.ca/index.php, Remita et al. 2016). This encompasses ten small RNA libraries obtained from the plant samples subjected to different abiotic stresses with a compilation of published wheat miRNAs (total 5036). The miRNAs are available from aerial and root tissues of wheat under specific conditions including developmental phases (vegetative/reproductive), salinity, cold, and aluminum stresses. Another web resource, Soybean knowledge base (SoyKB) (https://soykb.org/, Joshi et al. 2017) is based on translational genomics of soybean. It is a comprehensive web resource specific to soybean, which also provides data regarding miRNAs and sRNAs. This database allows soybean-specific miRNA search against query in form of miRNA ID, target gene or sequence. Similarly, Fei et al. (2010) developed a tomato functional genomics database (TFGD) (https://ted.bti.cornell.edu/cgi-bin/TFGD/sRNA/miRNA.cgi) which incorporates tomato expression database along with metabolite profiles and large-scale data sets of tomato small RNAs. The TFGD harbors a list of putative miRNAs from tomato and the miRNAs can be searched on the basis of miRNA ID or name from this database along with the target prediction tool. The sorghum-specific multidimensional network analysis database, referred to as Sorghum Functional Genomics Database (SorghumFDB) (https://structuralbiology.cau.edu.cn/sorghum/miRNA.php), is developed at Zhen Su’s lab, China Agriculture University (Tian et al. 2016). One feature of this database includes integration of miRNAs from other miRNA databases such as miRBase, PMRD, and PNRD. The list provides miRNAs, respective chromosome number with base-position and nature of strand (positive or negative). 
The Cereal Small RNAs Database (CSRDB) (https://sundarlab.ucdavis.edu/smrnas/, Johnson et al. 2006) is a miRNA knowledge base consisting of an exhaustive list of miRNAs present in rice and maize. This database encompasses small RNAs from mitochondria as well as chloroplasts. Li et al. (2012a, b) have designed the SoMART (Solanaceae miRNA/tasiRNA Analysis Resource and Tools) webserver (https://bakerlab.berkeley.edu/somart-webserver-mirna-sirna-analysis), a suite providing databases and computational tools for identification and analysis of small RNAs from Solanaceae family members. This web server can be used for identification of miRNAs targeting the gene of interest, detection of miRNA/siRNA cleavage transcripts, identification of miRNA progenitors, and target confirmation. Along with all these specific web-based resources, the whole-genome databases now available for many plants can provide search tools for identification of specific sequences, including miRNAs.

Besides, there are some other miRNA identification pipelines incorporating miRNA-related data from more than one plant (Table 1d). Plant miRNA database (PMRD) (https://bioinformatics.cau.edu.cn/PMRD/, Zhang et al. 2010) is a database integrating available evidence regarding miRNA sequences, their target genes, secondary structures, and expression profiles along with genome browsers. An updated version was released in 2014 as the plant non-coding RNA database (PNRD) (https://structuralbiology.cau.edu.cn/PNRD/, Yi et al. 2015). The current version of PNRD consists of 28,214 non-coding RNA entries from 166 species. This database also contains functional search and analytical tools such as non-coding RNA keyword search, functional search based on the literature, target gene search, a miRNA prediction toolkit, a coding potential calculator toolkit, and a BLAST tool. The miRNEST2.0 (https://rhesus.amu.edu.pl/mirnest/copy/, Szcześniak and Makałowska 2014) is an integrative miRNA resource covering several plant species for browsing different miRNA candidates, besides offering in silico tools for deep-sequence predictions and miRNA genes. For identification and quantification of known/novel miRNAs from raw small RNA sequences, An et al. (2012) developed an integrated miRNA identification tool known as miRDeep* (https://www.australianprostatecentre.org/research/software/mirdeep-star). It has a user-friendly graphical interface and accepts FASTQ or SAM/BAM formats for miRNA analysis. It is principally a Java-coded in silico application. Variants of this tool have been developed, one of the latest being miRDeep-P2 (https://sourceforge.net/projects/mirdp2/, Kuang et al. 2018). Even though there is continuous progress in the development of in silico analytical tools, minimizing false-positive predictions as well as analytical time always remains a challenge. 
The current version of miRDeep-P2 provides superiority in speed and precision because of the updated annotation criteria and new scoring system, and, therefore, may serve as a method of choice.

Assigning appropriate name to the newly identified miRNA member is essential. The current miRNA nomenclature system implies primary forms (as pri-miRNA), miRNA precursors (as pre-miRNA) along with name of gene or corresponding number assigned. The miRNA annotation ending with -3p and -5p signifies the 3′ and 5′ arm mature sequence of that miRNA (Desvignes et al. 2015). The perfect nomenclature is vital in miRNA-related investigations as the exact annotation of identified miRNAs ensures comparability of the results obtained.
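The naming pattern described above is regular enough to be parsed mechanically. The Python sketch below splits a miRBase-style identifier into its parts; it is a minimal illustration, and the regular expression, the function name, and the example identifiers are our own assumptions. In particular, the three- or four-letter species prefix (e.g., ath for Arabidopsis thaliana) and the use of ‘MIR’ or ‘mir’ for precursors versus ‘miR’ for mature sequences follow miRBase conventions as we understand them; names that deviate from this pattern are simply reported as unparsable.

```python
import re

# Hypothetical pattern for identifiers such as "ath-miR156a-5p".
NAME_RE = re.compile(
    r"^(?:(?P<species>[a-z]{3,4})-)?"   # optional species prefix
    r"(?P<kind>miR|MIR|mir)-?"          # mature (miR) or precursor (MIR/mir)
    r"(?P<family>\d+)"                  # family number
    r"(?P<member>[a-z]*)"               # optional family member letter(s)
    r"(?:-(?P<arm>[35]p))?$"            # optional arm of the hairpin
)


def parse_mirna_name(name):
    """Return the components of a miRNA name, or None if it does not fit."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    return {
        "species": match.group("species"),
        "form": "mature" if match.group("kind") == "miR" else "precursor",
        "family": int(match.group("family")),
        "member": match.group("member") or None,
        "arm": match.group("arm"),
    }


for example in ("ath-miR156a-5p", "osa-MIR166b", "not-a-name"):
    print(example, "->", parse_mirna_name(example))
```

A check of this kind can be run over a list of newly annotated candidates before submission, so that inconsistent names are caught early.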

Secondary structure prediction for miRNAs

As miRNAs are established as key modulators of plant responses to environmental stress and pathogen invasions/disease environment, prediction of new potential miRNA species involved during such an interaction is of high interest. The ab initio predictions are usually helpful in this, as they use intrinsic properties of an actual sequence for prediction (Tav et al. 2016). These ab initio tools predict the miRNA precursors having characteristic hairpin structure, which plays a crucial role during biogenesis of mature miRNAs. Hence, prediction of secondary structure by computational tools is always an area of interest for researchers (Li et al. 2010). These secondary structures may suggest the presence of lower loop/bulge or unstructured region, actual miRNA sequence, upper junction region, and upper/terminal loop region, giving rise to the hairpin or stem-loop with single- or double-stranded regions (Meyers et al. 2010). Presence of bulge at particular position (upper, middle, or lower regions of hairpin) is a characteristic feature of the predicted secondary structure.

The lower stem region is crucial in pri-/pre-miRNA processing, as the initial cleavage takes place at a position ~ 15 nucleotides from the single-stranded (unpaired) region at the base of the lower stem to form a mature miRNA. The presence or absence of a large bulge in this region determines the outcome of the initial cleavage (Bologna et al. 2013). Thus, a bulge above the unpaired region in the lower part of the hairpin represents an optimal arrangement that ensures precise and effective processing of plant miRNA precursors (Meyers et al. 2010). On the other hand, a bulge or terminal loop in the uppermost region of the hairpin, along with the upper junction region, is important in miRNA processing, as these are the sites for the second Dicer-mediated cleavage (Mateos et al. 2010; Werner et al. 2010). Owing to this importance, input miRNA sequences are searched and analyzed for bulges/internal loops. In most cases, the compact hairpin precursor displaying the minimum free energy is selected for each input sequence (Miskiewicz et al. 2017). Hence, the presence or absence of loops at upper/middle/lower positions and minimum free energy requirements are critical for optimum secondary structure predictions.
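The structural features discussed above (stem pairs, terminal loop, bulges/internal loops) can be read directly from a predicted structure. The following is a minimal sketch, assuming the structure is given in dot-bracket notation and forms a single hairpin without branching; the function name hairpin_features and the returned keys are hypothetical.

```python
def hairpin_features(dot_bracket):
    """Summarize a single, unbranched hairpin given in dot-bracket notation.

    '(' and ')' mark paired bases, '.' marks unpaired bases.
    Assumption: exactly one stem-loop (no multi-branch structures).
    """
    stack, pairs = [], []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected symbol {ch!r}")
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    if not pairs:
        return {"pairs": 0, "terminal_loop": 0, "unpaired_in_stem": 0}

    # Innermost pair closes the terminal loop; outermost pair closes the stem.
    inner_open, inner_close = min(pairs, key=lambda p: p[1] - p[0])
    outer_open, outer_close = max(pairs, key=lambda p: p[1] - p[0])
    terminal_loop = inner_close - inner_open - 1
    enclosed = outer_close - outer_open + 1
    # Unpaired bases inside the stem correspond to bulges/internal loops.
    unpaired_in_stem = enclosed - 2 * len(pairs) - terminal_loop
    return {
        "pairs": len(pairs),
        "terminal_loop": terminal_loop,
        "unpaired_in_stem": unpaired_in_stem,
    }

if __name__ == "__main__":
    print(hairpin_features("((.((....)).))"))
```

The sketch only counts features; deciding whether a bulge sits in the lower, middle, or upper region would additionally require the pair positions relative to the mature miRNA.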

The process of miRNA secondary structure prediction has been boosted by the development of several in silico tools, presented in Table 1f. The miRFinder (https://www.bioinformatics.org/mirfinder/, Huang et al. 2007) is a pre-miRNA prediction tool which estimates the secondary structure by an initial search for hairpin-candidate sequences, discarding the non-robust structures. This tool represents the output using symbols to show paired/unpaired states, insertions/deletions, and bulges in the hairpin. Along similar lines, an ab initio approach for miRNA prediction in the genome is also implemented in another in silico tool, miRNAFold (https://evryrna.ibisc.univ-evry.fr/miRNAFold, Tav et al. 2016). It is dedicated to the fast and highly sensitive identification of miRNA precursors and begins the search by approximating the miRNA hairpin assembly. It then reconstitutes the pre-miRNA structures, which allows a considerable reduction in theoretical complexity and search time. This tool gives output in the form of a well-presented structure with sequence positions and hairpin regions marked in different colors (stem: green, loops/bulges: brown, terminal loop: blue). Another web server frequently applied for secondary structure predictions is mfold (https://unafold.rna.albany.edu/?q=mfold/RNA-Folding-Form, Zuker 2003). This server provides easy access to nucleic acid folding and hybridization through universally available web graphical user interfaces (GUIs). The folding of sequences is reported with detailed outputs such as energy dot plots, single-stranded frequency plots, and structure plots with reliability information. This and one more web server have been amalgamated to provide a new service, UNAFold (https://unafold.rna.albany.edu/, Markham and Zuker 2008), an expansion of existing algorithms and software for enhanced analysis. Lorenz et al. (2011) developed the ViennaRNA Package (https://www.tbi.univie.ac.at/RNA/). This software comprises a C-code library and many programs for predicting and comparing RNA secondary structures, where energy minimization is the method of choice for secondary structure prediction. Three dynamic programming algorithms are provided in this package: first, the minimum free energy algorithm to obtain a single optimal structure; second, the partition function algorithm for calculating base pair probabilities (thermodynamic ensemble); and third, the sub-optimal folding algorithm to generate sub-optimal structures within a given energy range. This package also provides an algorithm for inverse folding (designing sequences with a pre-determined secondary structure). Similarly, RNAstructure (v6.1) (https://rna.urmc.rochester.edu/RNAstructure.html, Mathews 2014) is also a complete package for secondary structure prediction and analysis. Along with algorithms for structural predictions, this package provides base-pairing probabilities and can predict bimolecular structures and the equilibrium binding affinity of an oligonucleotide to a structured RNA target (Mathews 2014).
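The dynamic programming idea behind such folding algorithms can be illustrated with a much simpler stand-in. The sketch below maximizes the number of base pairs (a Nussinov-style recursion) rather than minimizing free energy, so it is not the algorithm of any package named above; the function name max_base_pairs, the minimum loop length of 3, and the inclusion of G:U wobble pairs are assumptions made for illustration.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in an RNA sequence.

    Simplified illustration: counts pairs instead of computing free energy.
    min_loop is the assumed minimum number of unpaired bases in a hairpin loop.
    """
    seq = seq.upper().replace("T", "U")
    can_pair = {("A", "U"), ("U", "A"), ("G", "C"),
                ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j]
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]                    # case 1: base j stays unpaired
            for k in range(i, j - min_loop):       # case 2: base j pairs with base k
                if (seq[k], seq[j]) in can_pair:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]

if __name__ == "__main__":
    print(max_base_pairs("GGGAAAACCC"))
```

Energy-based methods replace the "+ 1 per pair" term with thermodynamic parameters for stacks and loops, but they fill a table over subsequences in the same way.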

As discussed earlier, huge volumes of raw small RNA sequence data are generated during miRNA analysis, and real miRNA precursors need to be separated from non-miRNA hairpins. To achieve this sorting, machine learning approaches so far seem the most promising. However, machine learning requires training data, which is tricky because of the imbalance between positive (number of miRNAs) and negative (number of non-miRNAs) examples, leading to lower efficiency.
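One possible way to soften the class imbalance mentioned above is to weight each class inversely to its frequency during training. This mitigation is not taken from the source; it is a common heuristic shown here as a minimal sketch, and the function name balanced_class_weights and the example labels are hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

if __name__ == "__main__":
    # Hypothetical training set: few real pre-miRNAs, many non-miRNA hairpins.
    labels = ["miRNA"] * 10 + ["non-miRNA"] * 90
    print(balanced_class_weights(labels))
```

With such weights, an error on a rare positive example costs more than an error on an abundant negative one, so a classifier is less tempted to label everything as non-miRNA.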

The secondary structure of a miRNA is a crucial intermediate level of organization that provides vital information about the miRNA and accounts for the dominant part of the folding energy. These structures are evolutionarily well conserved and frequently help explain the respective experimental findings. As discussed above, various in silico tools are available for secondary structure predictions of miRNAs. Besides, predicting the probable position of the mature miRNA within its precursor is also important and can be achieved using programs like MaturePred (https://39.106.10.28/maturepred/, Xuan et al. 2011), which is based on machine learning methods. This web server predicts the start position of the miRNA within a pre-miRNA candidate. For further progress in miRNA analysis, there is a persistent need for new methods for positional prediction of miRNAs to aid the discovery of new pre-miRNA species (Szwacka et al. 2018).

mRNA–miRNA interaction analysis

The modulation of gene expression by miRNAs depends on altering the translation efficiency and/or the stability of target mRNAs. It is assumed that a large share of all protein-encoding RNAs is under miRNA-mediated control, but identification of such miRNA–mRNA interactions necessitates intricate procedures or prior knowledge of the target mRNAs (Afonso-Grunz and Müller 2015). However, experimental evidence for these miRNA–mRNA interactions is scarce, and in silico analysis based on sequence complementarity suffers from a high rate of false-positive outcomes. Hence, a comprehensive analysis of interactions between small non-coding RNAs and mRNAs is needed for better knowledge and prediction of the multifaceted dynamics of miRNA-mediated gene regulation (Afonso-Grunz and Müller 2015). Recent approaches for target isolation apply cross-linking and ARGONAUTE immuno-precipitation coupled with high-throughput sequencing (CLIP-Seq), and degradome sequencing, also known as parallel analysis of RNA ends (PARE). These methods have curtailed the frequency of false-positive results, besides reducing the search time. The increasing application of CLIP-Seq and PARE, however, calls for integrated databases and tools to assist the analysis and annotation of these interacting miRNA/mRNA entities; the available in silico tools are presented in Table 1g. starBase (https://starbase.sysu.edu.cn/), a database to explore miRNA–mRNA interaction maps generated from CLIP-Seq and/or PARE, was developed by Yang et al. (2010). This database supports comprehensive annotation and searching of miRNAs and their respective targets. It also contains information regarding Ago-binding and cleavage sites. As an extension of this database, starBase v2.0 was launched by Li et al. (2013b) for understanding the regulatory interactions between multiple classes of RNAs. This updated version provides comprehensive, experimentally supported information about miRNA–mRNA interactions.

Considering the significance of information on miRNA–mRNA interactions, several such interactions have been experimentally discovered and confirmed. However, much of this knowledge is scattered in fragmented form across thousands of manuscripts and raw sequence data sets. Hence, to index the published (and experimentally authenticated) miRNA–gene interaction events, DIANA-TarBase (v7.0) (https://diana.imis.athena-innovation.gr/DianaTools/index.php?r=tarbase/index) was generated by Vlachos et al. (2014). This database enables researchers to identify negative/positive experimental outcomes along with the methods implemented and the experimental conditions. This version also delivers advanced information such as binding site locations (experimental and in silico) and primer sequences for cloning.

Some resources are plant specific as well; for instance, a rice miRNA:mRNA interaction resource (ARMOUR) (https://armour.icgeb.trieste.it/login) was launched recently (Sanan-Mishra et al. 2018). This database comprises experimentally validated expression patterns from seven different indica rice cultivars under variable environmental conditions and developmental phases. It is flexible in terms of querying, as it accepts different inputs such as recognized gene, gene ontology, or KEGG identifiers, besides permitting fold-change analysis.

Target prediction for miRNAs

Experimental evidence has suggested that many transcription factors (TFs) and development-related genes are target mRNAs for miRNAs. Interestingly, most of the algorithms developed initially for target prediction were animal miRNA-centric (Lim et al. 2003; Li et al. 2010), but plant and animal miRNAs differ. For instance, a miRNA–mRNA interaction in an animal system requires only loose complementarity with the target over approximately the first 8 nucleotides of the miRNA, while a plant system requires the complete mature miRNA sequence to be in near-perfect alignment with its respective target mRNA. Also, animal miRNAs inhibit expression of the target gene(s) at the translational level, while plant miRNAs achieve the same via direct cleavage of the target transcript (Lim et al. 2003; Li et al. 2010), along with some reports of translational repression (Ouyang et al. 2014). Animal miRNAs recognize the 3′-UTR of the corresponding target mRNA, but plant miRNAs show no such preference (Lim et al. 2003; Li et al. 2010; Dai and Zhao 2011). These dissimilarities created the need to develop a separate recognition process for miRNA targets in plants (Dai and Zhao 2011). The stability of the miRNA–mRNA duplex, which depends on the complementarity between target sites and respective miRNAs, is a decisive factor during target gene analysis. Another important criterion for filtering miRNA targets is the conserved nature of the target sequence. An orthologous search for such conserved regions may facilitate the target prediction process. Multiple occurrences of the target sequence are also considered during target prediction (Dai et al. 2010).
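The near-perfect, full-length complementarity required in plants can be sketched as a simple penalty score between a miRNA and a candidate site. This is an illustration only: the penalty values (1.0 per mismatch, 0.5 per G:U wobble), the function names, and the toy sequences are assumptions and do not reproduce the scoring scheme of any published tool.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}

def to_rna(seq):
    """Normalize a sequence to upper-case RNA letters."""
    return seq.upper().replace("T", "U")

def reverse_complement(seq):
    """Perfectly complementary site (5'->3') for a given miRNA (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(to_rna(seq)))

def complementarity_penalty(mirna, site):
    """Sum of assumed penalties over an ungapped miRNA:site duplex.

    Both sequences are written 5'->3' and must have equal length;
    miRNA position 1 pairs with the last base of the site.
    """
    mirna, site = to_rna(mirna), to_rna(site)
    if len(mirna) != len(site):
        raise ValueError("miRNA and site must have the same length")
    penalty = 0.0
    for m_base, t_base in zip(mirna, reversed(site)):
        if (m_base, t_base) in WATSON_CRICK:
            continue
        penalty += 0.5 if (m_base, t_base) in WOBBLE else 1.0
    return penalty

if __name__ == "__main__":
    mirna = "UGACAGAAGA"                      # toy sequence
    perfect_site = reverse_complement(mirna)
    print(perfect_site, complementarity_penalty(mirna, perfect_site))
```

A candidate site would then be kept only if its penalty stays below a user-chosen cutoff, which is where tools differ in how strict they are.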

A region of the first ~ 6–8 nucleotides of a mature miRNA, counting from the 5′ end towards the 3′ end, is known as the seed region, a highly conserved and crucial region which binds to the complementary target in the 3′ UTR. A slight change in the nucleotide sequence of the seed region may result in miRNA dysfunction and subsequent variation in the expression pattern of the respective miRNA target (Borel and Antonarakis 2008). Overall, a higher degree of conservation is found in the seed region than in other regions (Lewis et al. 2003). The conserved nature of these seed sites in the 3′ UTR indicates that a single miRNA might regulate several genes (Lewis et al. 2003; Brennecke et al. 2005; Krek et al. 2005).
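Seed-based site searching, as described above, can be sketched as a plain string search for the reverse complement of the seed within a 3′ UTR. The defaults below (seed taken as nucleotides 2 to 8 of the miRNA) follow a widely used convention but are an assumption here, as are the function name seed_sites and the toy sequences.

```python
def seed_sites(mirna, utr, start=1, length=7):
    """Locate seed-complementary motifs in a UTR sequence.

    start is the 0-based index of the first seed nucleotide and length the
    seed size; the defaults (positions 2-8) are an assumed convention.
    Returns the seed, the motif searched for, and 0-based hit positions.
    """
    complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    seed = mirna[start:start + length]
    # The target motif is the reverse complement of the seed.
    motif = "".join(complement[base] for base in reversed(seed))
    hits = []
    pos = utr.find(motif)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(motif, pos + 1)   # allow overlapping matches
    return seed, motif, hits

if __name__ == "__main__":
    toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
    toy_utr = "AAACUACCUCAGGGCUACCUCA"
    print(seed_sites(toy_mirna, toy_utr))
```

Several hits for one seed within a single UTR, or hits across many UTRs, correspond to the observation that a single miRNA may regulate several genes.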

The stability of the miRNA:mRNA interaction is measured in terms of free energy (also referred to as hybridization energy), which is calculated by considering the high and low free energies of the terminal loop region and stem region, respectively (Yue et al. 2009). Following transcription, the mRNA usually adopts a secondary structure which may hinder the miRNA’s access to its binding site on the target mRNA (Mahen et al. 2010). Site accessibility is gained after initial binding of the miRNA to a short, accessible region on the mRNA, followed by unfolding of the mRNA secondary structure and complete miRNA:mRNA hybridization (Long et al. 2007). Therefore, to assess the probability of an mRNA being the target of a particular miRNA, the energy required to make the site available to the miRNA is considered. Collectively, the free energy of the system indicates its stability and the correctness of the target prediction. As direct quantification of the free energy is difficult, the change in free energy during the interaction is considered instead. A negative change in free energy indicates a system with increased stability (Peterson et al. 2014).

Continuous developments in in silico tools for miRNA-target prediction have highlighted the major biological features, namely, seed region complementarity and conservation, free energy, and site accessibility, as factors to be considered for prediction of miRNA targets (Du and Zamore 2005). Along with these, the physicochemical properties of the interaction between a miRNA and its respective targets should be given due consideration. The accessibility of the miRNA-target site is an effective structural feature for target predictions (Dai et al. 2010). Keeping these benchmarks in consideration, various in silico tools have been developed for target predictions (Table 1h).

Dai and Zhao (2011) developed an analytical server, psRNATarget (https://plantgrn.noble.org/psRNATarget/), for predicting plant miRNA targets. This server provides two important analytical utilities, namely, reverse complementary matching of small RNAs with their target transcripts via a proven scoring system and evaluation of target-site accessibility via calculation of unpaired energy. The current updated version of psRNATarget (2017 release, Dai et al. 2018) provides enhanced analytical efficiency through a new set of scoring systems. Both canonical and non-canonical target searches are possible via a customizable process in this server. It has a clear, user-friendly interface which presents the data concisely. Another tool for plant miRNA target prediction, Target-align (https://www.leonxie.com/targetAlign.php, Xie and Zhang 2010), was developed using a Smith–Waterman-like algorithm. This dynamic programming tool uses a complementarity-based score matrix to trace the optimal local alignments. In this tool, the maximum numbers of mismatches/consecutive mismatches are used to filter the optimal local alignments. This tool can also identify multi-targets and non-cleaved targets. Another web-based integrated in silico system, miRU (https://bioinfo3.noble.org/miRU.htm, Zhang 2005), is used for plant miRNA-target prediction when several sequences need to be analyzed. After a mature miRNA sequence is provided, possible complementary target sites with acceptable target mismatches are searched within the system. Evolutionary conservation is also one of the criteria considered for target predictions. The TAPIR (https://bioinformatics.psb.ugent.be/webtools/tapir, Bonnet et al. 2010) web server for plant miRNA-target prediction uses two different search engines, FASTA and RNAhybrid, providing more precision and a rich output along with free energy calculations and ratios. This server also offers the possibility of searching for probable target mimics. This feature is helpful, as target mimics represent miRNA-target duplexes bearing large loops, which are undetectable by traditional methods. TarHunter (https://github.com/XMaBio, Ma et al. 2017) is one more tool for miRNA-target prediction which takes target mimics into account. Since endogenous target mimics along with non-canonical targets play a crucial role for particular miRNAs, this tool has a mandatory cross-species conservation filter to find probable mimics, which ultimately improves the precision of the target prediction.

Another tool for plant miRNA and target identification, C-mii (https://www.biotec.or.th/isl/c-mii), was developed by Numnark et al. (2012). It is a ready-made software package which provides some distinct features. It is a user-friendly, all-in-one tool with an easy graphical user interface. It also provides a comprehensive view of identified miRNAs/targets with info-graphics and grouping information and supports multi-project as well as multi-thread management. Kakrana et al. (2014) developed the small RNA-PARE target analyzer (sPARTA) (https://github.com/atulkakrana/sPARTA/releases), which utilizes high-throughput sequence data to profile genome-wide cleavage products. A built-in parallelized genome-wide target prediction module, called miRferno, is also integrated into this analyzer. This multi-core tool achieves two-dimensional parallelization while maintaining a low memory footprint for complete genome analysis. Another bioinformatics pipeline, miRTour (https://bio2server.bioinfo.uni-plovdiv.bg/miRTour/, Milev et al. 2011), provides automated steps for miRNA similarity searches, pre-miRNA selection, target prediction, and annotation. The experimentally authenticated interactions between miRNAs and their respective targets have been compiled in miRTarBase (https://mirtarbase.mbc.nctu.edu.tw/php/index.php, Chou et al. 2017). The 2018 release of this database contains 422,517 curated interactions between miRNAs and their targets from 4076 miRNAs and 23,054 target genes, compiled from 8500 reports. This version also provides downloadable information about target sites validated by reporter assays.

Interestingly, Srivastava et al. (2014) compared the performance of different plant miRNA-target prediction tools. The authors assessed the performance of eleven computational tools used for genome-wide identification of targets in Arabidopsis and other plants using optimized score cutoffs. Targetfinder was found most proficient with 89% precision (exactness in prediction) and 97% ‘recall’ (sensitivity) in prediction of true-positive targets in Arabidopsis. Overall, authors concluded that Targetfinder alone or with psRNATarget/Tapirhybrid proved best for the prediction of miRNA targets (Srivastava et al. 2014).
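The two performance measures quoted above can be computed from a set of predicted targets and a set of experimentally validated targets. The following is a minimal sketch; the function name precision_recall and the gene identifiers in the example are hypothetical and are not data from the cited comparison.

```python
def precision_recall(predicted, validated):
    """Precision and recall of predicted targets against validated targets.

    precision = true positives / all predictions
    recall    = true positives / all validated targets
    """
    predicted, validated = set(predicted), set(validated)
    true_positives = len(predicted & validated)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(validated) if validated else 0.0
    return precision, recall

if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    predicted = ["geneA", "geneB", "geneC", "geneD"]
    validated = ["geneA", "geneB", "geneE"]
    print(precision_recall(predicted, validated))
```

Raising a tool's score cutoff typically trades recall for precision, which is why comparisons of this kind are made at optimized cutoffs.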

Gene ontology and pathway analysis of identified miRNA targets

Gene ontology analysis of the identified targets is preferred to get a good understanding of the functions of identified miRNAs and their targets, followed by analysis of the respective pathways in which the target genes are involved. In silico tools for such analysis are summarized in Table 1i. The Gene Ontology resource (https://geneontology.org/, Gene Ontology Consortium 2014) is a comprehensive and widely used knowledge portal. This knowledge base is constructed in ontology form by stratifying classes of gene functions having specific relations with each other. The gene ontology provides systematized and accurate vocabularies and classification of domains covering biological processes, cellular mechanisms, and gene functions. Another tool designed along similar lines (enabling gene ontology analysis) is Blast2GO (https://www.blast2go.com/, Conesa et al. 2005). This tool enables ontology-based data mining from raw sequences without requiring previous annotations. It combines similarity-based searches for annotation with statistical analysis, with highlighted visualization on direct graphical representations. Apart from these two tools, BiNGO (A Biological Network Gene Ontology tool, https://www.psb.ugent.be/cbd/papers/BiNGO/Home.html, Maere et al. 2005) (v3.0.3), also a Java-based tool, determines which gene ontology categories are statistically over-represented in the biological network being analyzed. This app maps the predominant functions of the considered gene(s) onto the ontological hierarchy and outputs these maps on interaction networks. WEGO (Web Gene Ontology Annotation Plot, https://wego.genomics.org.cn/, Ye et al. 2018) (v2.0) is a simple and useful tool for visualization, comparison, and graphical representation of gene ontology annotation results. This tool analyzes multiple files in one operation and harbors data for nine different organisms. The agriGO (https://systemsbiology.cau.edu.cn/agriGOv2/, Tian et al. 2017) platform specifically focuses on gene ontology enrichment analysis of crop species. The current version of agriGO accumulates information from 394 species and 865 datatypes. agriGO offers the user high computational efficiency because of batch analysis, p-value distribution, and greater user-friendliness. Additional visualization features of this platform include a cross comparison of singular enrichment analysis (SEACOMPARE), directed acyclic graphs (DAG), and scatter plots. ClueGO (https://www.ici.upmc.fr/cluego/cluegoDownload.shtml, Bindea et al. 2009) is yet another tool for interpreting gene ontology as well as pathway annotation networks. The main feature of this tool is the integration of gene ontology terms and KEGG/BioCarta pathways, which results in the creation of a functionally organized GO/pathway term network. Gene set enrichment analysis (GSEA) is one of the preferred ways of interpreting genes by calculating their overlaps with already defined gene sets. Using this method, Yi et al. (2013) developed a web server named PlantGSEA (https://structuralbiology.cau.edu.cn/PlantGSEA) which focuses on plants. This server provides 20,290 defined gene sets from various resources. A rapid genome-wide functional annotation of large sets of sequences via fast orthology assignment can be accomplished with eggNOG-mapper (https://eggnog-mapper.embl.de/, Huerta-Cepas et al. 2017). The ontology predictions of this tool have been benchmarked against BLAST and InterProScan, two widely used homology-based approaches. Functional annotation in eggNOG-mapper is done using pre-computed clusters and phylogenies from the eggNOG database. This tool is rapid, with higher proteome coverage and precision in ontology prediction than other tools like BLAST and InterProScan (Huerta-Cepas et al. 2017).
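Statistical over-representation of a gene ontology category among miRNA targets is commonly assessed with a hypergeometric test. The sketch below assumes that test as one common choice; it is not stated in the source which test each tool applies, and the function name enrichment_p_value and the example numbers are hypothetical.

```python
from math import comb

def enrichment_p_value(population, annotated, sample, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes carrying the GO term
    sample:     number of genes in the target list
    hits:       target genes carrying the GO term
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie within the population")
    upper = min(annotated, sample)
    total = comb(population, sample)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total

if __name__ == "__main__":
    # Hypothetical numbers: 20,000 genes, 200 with the term, 50 targets, 5 hits.
    print(enrichment_p_value(20000, 200, 50, 5))
```

In practice the resulting p-values are corrected for multiple testing, since many ontology terms are examined at once.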

In modern plant science studies, along with GO, MapMan (https://mapman.gabipd.org/, Thimm et al. 2004) is also a widely used ontology designed specifically for plants. MapMan was initially tailored only for Arabidopsis, but was later extended to other plants such as maize, tomato, potato, and Medicago (Klie and Nikoloski 2012). It is a user-driven tool which presents large data sets as diagrammatic representations of metabolic pathways. In this tool, different SCAVENGER modules assign the measured parameters (such as transcripts or metabolites) to hierarchical classes, whereas the IMAGEANNOTATOR module uses this classified information to organize a diagrammatic display of experimental data as per the user’s need. As MapMan and GO offer alternative ways for ontology-driven analysis of raw sequence data, the choice of tool depends entirely on the requirements of the researcher. An excellent comparative account of the choice between MapMan and GO for plant gene function prediction is provided by Klie and Nikoloski (2012), and readers are advised to refer to it.

Cytoscape (https://cytoscape.org/cy3.html, Su et al. 2014) is one of the most prevalent open-source in silico tools for visual investigation of biological networks incorporating genes, proteins, and different interactions. It provides researchers with a versatile and interactive view of complex metabolic inter-connections, along with experimental details and annotations. It offers the core functionality of loading, visualizing, searching, filtering, and saving network data. This functionality is further extended by the availability of hundreds of apps addressing particular needs (e.g., GeneMANIA (Montojo et al. 2014) to construct a network of composite gene–gene functional interactions or KEGGscape (Nishida et al. 2014) for replicating hand-drawn pathway diagrams with details in Cytoscape). The Kyoto Encyclopedia of Genes and Genomes (KEGG, https://www.genome.jp/kegg/, Kanehisa and Goto 2000) is considered the main knowledge base which provides a systematic analysis of gene function and helps in linking genomic interactions with a high degree of functional information.

Functional characterization and validation of plant miRNAs

One of the easiest approaches to validate the assigned target of a particular miRNA is profiling the expression patterns of the miRNA and its probable target(s) by quantitative PCR (qPCR). Typically, when miRNA expression increases, the abundance of the respective target decreases, and vice versa. However, this approach is not useful for studying translational repression. Additionally, expression patterns alone cannot explain the regulatory mechanisms when a transcript is regulated by many miRNAs or a single miRNA regulates many transcripts (Devi et al. 2018). Cloning-based approaches such as RLM-RACE and PolyA polymerase-mediated (PPM)-RACE can be implemented for recognition of direct cleavage products of miRNA targets (Wang and Fang 2015). These approaches can distinguish cleavage events caused by different miRNAs. The cleaved transcripts are tagged either with an oligonucleotide (for RLM-RACE) or with a poly-A tail (for PPM-RACE), followed by reverse transcription, cloning, and sequencing of the products. Cleavage products of miRNA targets can also be detected via PARE, as described in previous sections. This high-throughput analytical approach is also referred to as degradome sequencing, in which the initial processing is equivalent to that of RLM-RACE (German et al. 2009). The exact target-miRNA interaction can be methodically validated using the HITS-CLIP approach (Chi et al. 2009) and photoactivatable ribonucleoside-enhanced cross-linking and immuno-precipitation (PAR-CLIP) (Hafner et al. 2010). Both these approaches principally involve cross-linking of the miRNA target to the AGO protein, followed by precipitation of epitope-tagged AGO and sequencing (Thomson et al. 2011; Hafner et al. 2012). For validation of miRNA targets in transgenics, reporter gene assays utilizing luciferase or green fluorescent proteins can be used. Luciferases from different species are also utilized, where the luciferase from one species carries a miRNA binding site which is subsequently cleaved, while its counterpart is devoid of the cleavage site and hence acts as an internal control (Moyle et al. 2017). These methods functionally validate the miRNA-target pairs, which can then be placed appropriately in gene-interaction networks to attain comprehensive knowledge of the miRNA-mediated regulatory event. A workflow for the identification and characterization of miRNAs from plants, their target prediction and confirmation, and functional validation is presented in Fig. 1.
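The inverse relationship expected between a miRNA and its target in qPCR profiling can be checked with a simple correlation coefficient. This is a minimal sketch under stated assumptions: Pearson correlation is only one possible measure, and the function name pearson_r and the relative expression values are hypothetical, not measured data.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)

if __name__ == "__main__":
    # Hypothetical relative expression across four samples.
    mirna_levels = [1.0, 2.1, 3.9, 8.2]
    target_levels = [1.0, 0.6, 0.3, 0.1]
    print(pearson_r(mirna_levels, target_levels))
```

A clearly negative coefficient is consistent with miRNA-mediated down-regulation of the transcript, but, as noted above, it cannot detect translational repression or disentangle many-to-many regulation.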

Fig. 1 A workflow for plant miRNA identification, characterization, target prediction and confirmation, and their functional validation

Conclusion and future perspectives

Plant miRNA research has witnessed an upward swing in the past few years, owing to the key roles played by these non-coding RNA species in plant life processes and their adjustments to the challenging environmental conditions. However, the research community should expand its base and should identify miRNAs from the non-model and lesser-explored plants. Similarly, genome-wide expression analysis of plant miRNA families is needed.

Technological advancements, particularly in high-throughput small RNA sequencing and degradome analysis, along with the advent of modern computational or in silico tools and databases useful for identifying miRNAs and their respective targets, have revolutionized this field. NGS approaches, available at reduced costs and with high precision, are enabling fast and reliable miRNA discovery from plants even if their genome sequences are unavailable. Computational tools and their underlying algorithms are playing significant roles in increasing the precision of identification of miRNAs and their targets. The interplay between miRNAs and their targets is yet to be fully explored. However, experimental approaches coupled with the use of bioinformatics tools and databases have the potential to unfold the mysteries of these tiny but powerful molecules and their roles in plant cellular events.

Author contributions statement

XZ and VK conceived the idea. XZ, VK and TK authored the manuscript. All three authors contributed substantially.