Introduction

MicroRNAs (miRNAs) are small, endogenous, noncoding, single-stranded RNAs 19–25 nucleotides in length. Their target sites are generally located in the 3′ untranslated region (UTR) of the target gene (Kaeuferle et al. 2014), although earlier reports indicate that miRNA target sites in the coding region of a gene can also modulate its functional activity (Hendrickson et al. 2008; Tay et al. 2008). miRNAs play an important role in developmental biology by participating in post-transcriptional gene regulation. The biogenesis of miRNAs has been described in detail in vertebrates, including teleosts (Bizuayehu and Babiak 2014; Treiber et al. 2012). miRNAs are produced from a primary (pri-miRNA) transcript, transcribed by RNA polymerase II, that contains a strong stem-loop secondary structure. Dicer endoribonuclease subsequently cleaves the stem-loop hairpin RNA (Kim et al. 2009). The miRNA then associates with the RNA-induced silencing complex (RISC) and guides this effector complex to the target mRNA for gene regulation (Macfarlane and Murphy 2010). The site through which the miRNA binds the target mRNA is essential; it is called the ‘seed’ region and is conserved across species and within miRNA families (Brennecke et al. 2005). An earlier study suggested that a single miRNA can regulate more than 200 mRNAs and vice versa (Dweep et al. 2011), and the seed region of the mature miRNA spans nucleotides 2–7 (Grimson 2010). Conservation of the ‘seed’ region is the basis for many target prediction algorithms. Several miRNAs and their targets have been identified in plants and animals, including fish, using approaches such as gene cloning and sequencing, next generation sequencing (NGS), and computational prediction (Fu et al. 2013; Yang et al. 2013; Zhang et al. 2013). Recent studies indicate that the epigenetic control of the expression of certain genes is maintained by small RNAs associated with DNA methylation (Huang et al. 2014; Lister et al. 2008).
The miRNA binds to its target mRNA on the basis of sequence complementarity and regulates the gene expression pattern (Zhang et al. 2009). Interestingly, recent work revealed that miRNAs alter the levels of many target mRNAs and also decrease protein production through repression or destabilization (Guo and Lu 2010). So far, evidence in plants and animals suggests that miRNAs have a pivotal role in developmental biology and gene regulation (Chen et al. 2005; Ebert and Sharp 2012; Rajagopalan et al. 2006; Treiber et al. 2012).

Investigations into the mechanisms of adaptation in fishes, undertaken to understand evolution, revealed that miRNAs also have crucial functions in adaptive evolution (Chaturvedi et al. 2014; Kitano et al. 2013). Previously, cloning and sequencing, miRNA array screening and northern blotting were used extensively to identify many individual miRNAs in fish. miRNA information for nine fish species is available in the recently released miRBase database (http://www.mirbase.org/). Zebrafish and medaka are considered ‘model’ fishes because their whole genome/transcriptome sequences are available and are used in studying developmental biology, while aquaculture species are referred to as non-model fishes. Mature miRNAs have been documented in fish species such as Zebrafish, Danio rerio (total number of miRNAs: 346); Atlantic salmon, Salmo salar (371); Fugu, Takifugu rubripes (175); Medaka, Oryzias latipes (168); Common carp, Cyprinus carpio (134); Tetraodon, Tetraodon nigroviridis (132); Olive flounder, Paralichthys olivaceus (20); Channel catfish, Ictalurus punctatus (281) and Atlantic halibut, Hippoglossus hippoglossus (40) (Hsu et al. 2014). It has been demonstrated that miRNAs are evolutionarily conserved from species to species in almost all organisms (Daido et al. 2014; Maher et al. 2006; Takane et al. 2010). The small size and low abundance of miRNAs are a hindrance to their identification through traditional molecular biology tools, namely directional cloning. To address this challenge, high-throughput sequencing methods and computational approaches were developed (Andreassen et al. 2013; Baev et al. 2009; Bekaert et al. 2013; Wang et al. 2014).

Recent advances in sequencing technologies and progress in computational tools have enabled a stronger understanding of gene function and regulation at a genome-wide scale (Margulies et al. 2005; Mehinto et al. 2012; Qian et al. 2014; Valouev et al. 2008). NGS technologies such as 454, Illumina, Ion Torrent and ABI SOLiD have facilitated high-throughput sequencing of the genomes, transcriptomes and exomes of non-model fishes. These sequencing technologies support genome-wide small RNA studies, thus providing a global view of small RNAs in different species. Such data support downstream analysis in which computational tools are used to identify and quantify miRNAs and to reveal their role in gene regulation (Burnside et al. 2008; Kaeuferle et al. 2014).

We have comprehensively reviewed traditional approaches as well as different NGS platforms and computational tools for the identification of mature miRNAs and their targets in fish. We have also schematized the data processing steps for miRNA investigation and provided an outline of the available computational tools for this purpose. Robust data analysis, and with it an understanding of the role of miRNAs in gene modulation, requires a solid knowledge of computational approaches. Genome or transcriptome sequences of non-model fishes are significant data for further mining of important miRNAs and their target sites in genes associated with important traits. Thus, the miRNAs identified through sequencing, and their association with specific genes related to performance and production traits, would allow geneticists to understand the role of epigenetics in developmental biology.

miRNA identification with evidence from traditional and high-throughput techniques

In fish, several studies have used transcriptome analysis to reveal miRNAs in the genome, principally those involved with the biological processes that affect development, metabolism and disease. We have compiled the available information on the miRNAs identified in fishes (Table 1). Owing to advances in sequencing technologies, our ability to dissect transcriptomes, even for lowly expressed RNAs, has improved markedly, and several such cases have been reported in fish. High-throughput sequencing has been utilized for miRNA discovery in several organisms (Salem et al. 2010; Zhu et al. 2015). Changes in miRNA expression have been observed during larval and juvenile growth (Campos et al. 2014), in eggs (Ma et al. 2012), during larval ontogeny (Bizuayehu et al. 2012; Mennigen et al. 2014a), and in skin pigmentation (Yan et al. 2013a, b). We have consolidated the information on miRNA discovery using different sequencing technologies (Table 2).

Table 1 List of identified miRNA with their function in fishes
Table 2 The overview of investigated miRNA via next generation sequencing

Recently, 43 miRNAs belonging to 38 miRNA families in eleven different fish species, together with their target genes, were predicted using computational methods from the expressed sequence tag (EST) and genome sequence survey (GSS) databases (Huang et al. 2015b). The identified miRNAs were highly conserved, and the predicted target genes were found to be involved in various biological processes such as cell development and stress biology. Further, 21 novel miRNAs involved in biological processes such as signal transduction, metabolism and developmental biology were predicted (Huang et al. 2015a). The authors also validated five randomly selected miRNAs (ccr-miR430b, cau-miR3198, man-miR142-3p, mam-miR10a-5p and cal-miR4483) using real-time polymerase chain reaction (RT-PCR). These findings suggest that computational tools are well suited to screening miRNAs and their targets in non-model fishes (Table 3).

Table 3 List of microRNA and their target prediction tools

miRNA associated with cell proliferation, differentiation and embryonic development

In earlier studies, the role of miRNAs during cell specification and differentiation, and their expression levels, were studied largely in zebrafish using direct cloning and microarray techniques (Fjose and Zhao 2010; Thatcher et al. 2007; Wienholds et al. 2005). One hundred and fifty-four mature miRNAs were detected by sequencing small RNA (sRNA) libraries prepared from different developmental stages of zebrafish and two adult cell lines (Chen et al. 2005). The role of miR-430 in maternal RNA clearance during the maternal-to-zygotic transition in zebrafish is well documented (Schier and Giraldez 2006). Fourteen conserved miRNAs involved in the regulation of maternal mRNA degradation during early embryogenesis were identified in rainbow trout (Oncorhynchus mykiss) (Ramachandra et al. 2008). Using 454 sequencing, 25 novel miRNAs belonging to different developmental stages were detected in zebrafish (Soares et al. 2009). In addition, 8 novel miRNAs and piRNAs were discovered during early embryonic development of zebrafish using a sequencing approach (Wei et al. 2012). Accumulated evidence suggests that miRNAs are involved in several biological events, such as cell growth, differentiation and apoptosis, via post-transcriptional miRNA-mediated gene regulation (Bizuayehu and Babiak 2014).

Using transcriptome sequencing in rainbow trout, 210 miRNAs from tissues and 496 miRNAs from eggs were identified (Ma et al. 2012; Salem et al. 2010). Also in rainbow trout, 13 differentially expressed miRNAs were identified by microarray and real-time PCR techniques, and the genes identified through target prediction analysis pointed to roles in steroidogenesis, ovulation and oocyte development (Juanchich et al. 2013). Recent work has shown that miR-20a is essential for normal embryogenesis in goldfish and zebrafish and for post-transcriptional regulation of the protein-coding gene Vsx1 (Visual System Homeobox 1), which plays different roles in diverse developmental events (Sun et al. 2015).

miRNA associated with metabolism

Liver-specific miRNAs were identified in rainbow trout, and miR-33 and miR-122 were predicted to be linked with the regulation of cholesterol and lipid metabolism as well as glucose homeostasis (Mennigen et al. 2014a, b). MicroRNAs serve as an important signature of the response to nutrient restriction and refeeding in the fast skeletal muscle of grass carp (Ctenopharyngodon idella); by recording changes in their expression levels, eight miRNAs were shown to be related to muscle growth (Zhu et al. 2014). To understand the regulatory roles of miRNAs in Asian seabass (Lates calcarifer) living under different environmental conditions, a challenge with lipopolysaccharide (LPS) was conducted, and 63 novel miRNAs belonging to 29 conserved miRNA families were identified (Xia et al. 2011).

miRNA associated with ontogeny and immune system

Kloosterman and co-workers identified 139 known and 66 novel miRNAs from 5-day-old zebrafish larvae and adult zebrafish brain using in situ hybridization and northern blotting techniques (Kloosterman et al. 2007). They identified developmental-stage-specific and tissue-specific expression patterns for some of these miRNAs. Their study suggested that miR-153 regulates snap25 during synaptic transmission and motor neuron development, and that miR-27 targets ptk2.2 to regulate pharyngeal arch morphogenesis in zebrafish.

In the annual fish Nothobranchius furzeri, miR-15a, miR-20a, and 17–92 microRNA clusters were identified, and a further 165 conserved miRNAs associated with neurogenesis were detected in the brain using in situ hybridization techniques (Terzibasi Tozzini et al. 2014). Recently, 194 conserved and 12 novel miRNAs, belonging to 30 miRNA gene families associated with immunity, were identified from the spleen of common carp using Solexa sequencing combined with computational methods (Li et al. 2014).

Stress associated miRNA

The expression of miRNAs in mammals and plants is modulated by various environmental stressors, and related studies have also been carried out in fish. NGS has shown that 223 distinct miRNAs are associated with hypoxia stress in the brain, liver and gonads of medaka, Oryzias melastigma (Lau et al. 2014). The results suggested that 55 miRNAs from 34 families were common to all tested tissues, while some miRNAs were evident only in reproductive tissue. Recently, 389 putative miRNA precursor loci, 120 novel precursor miRNAs, and 281 mature miRNAs were found using an NGS approach in Atlantic cod (Gadus morhua) exposed to several degrees of temperature elevation during embryonic and larval developmental stages (Bizuayehu et al. 2013, 2015).

Zhao and co-workers hypothesized that vascular endothelial growth factor (VEGF), which is responsible for physiological blood vessel formation and pathological angiogenesis under hypoxic conditions, might be influenced by miRNAs (Zhao et al. 2014a, b). They found that VEGF expression was directly regulated by miR-204, as there was a substantial increase in VEGF level when miR-204 in the 3′ UTR was inhibited in vivo. Deep sequencing of hepatic small RNA libraries from blunt snout bream (Megalobrama amblycephala) fed normal- and high-fat diets identified six putative lipid-metabolism-related target genes (fetuin-B, Cyp7a1, NADH-dehydrogenase (ubiquinone) 1-beta sub-complex subunit 2, 3-oxoacid CoA transferase 1b, stearoyl-CoA desaturase, and fatty-acid synthase), which were found to have a significant role in the development of diet-induced hepatic steatosis (Zhang et al. 2014).

Strategies for confirmation of miRNA

Several strategies are used to validate computationally identified miRNAs, such as profiling the miRNA transcriptome using real-time PCR or microarray platforms. Determining the expression pattern of a miRNA by RT-PCR is one of the most suitable ways to confirm it. Enzymatic modifications of miRNAs, such as RNA editing and 3′ nucleotide additions, have also been used previously. Microarray techniques, quantitative real-time PCR, and RNA-seq are all widely used for the elucidation of miRNAs. Northern blotting, PCR, and 5′ rapid amplification of cDNA ends (5′ RACE) are also used for miRNA validation. Recent advances in gene editing technology, such as TALEN and Cas9 systems, in combination with homologous-recombination-mediated transgene integration at precise locations within the genome, have made it possible to identify the probable functions of a gene of interest. A miRNA vector was shown to be effective in knocking down enhanced Green Fluorescent Protein (eGFP) in a transient in vivo eGFP assay in zebrafish (Leong et al. 2012).

Computational tools/algorithms used for miRNA investigation

Advanced computational approaches are used to identify miRNAs and their targets, which play an important role in gene regulation. In recent years, a large number of additional putative miRNAs have been identified in diverse organisms, including fishes. Various algorithms such as miRDeep, TargetScan, DIANA-MicroT, RNAhybrid, MIReNA, miRExplorer, miRanalyzer, and miRTools are being utilized for the identification of miRNAs and their targets. The identification of miRNA targets is mainly based on Watson–Crick complementarity in the seed region between the miRNA and the target region of the gene. miRNA identification algorithms are based on the formation of a hairpin-loop secondary structure with a minimum folding free energy, on the presence of the mature miRNA in the stem of the secondary structure, and on evolutionary conservation (Lim et al. 2003). Databases such as miRBase (http://microrna.sanger.ac.uk/), which comprise miRNA sequence data, annotation and predicted miRNA gene targets, have been used for miRNA identification. The selection of mature miRNAs has been based on various criteria, such as the mature miRNA having 0–4 mismatches in its sequence, a stem-loop hairpin secondary structure, or a lower minimal free energy (MFE) (Huang et al. 2015b). Further, the secondary structures of all the selected miRNA sequences need to be generated using the RNAfold webserver (http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi). It reports the minimum free energy of the miRNA, the free energy of the thermodynamic ensemble, the frequency of the MFE structure and the ensemble diversity. Thus, most of the algorithms are based on base-pairing patterns, on evolutionary conservation of the secondary structure of the target transcript, and on the nucleotide composition of target sequences (Grimson 2010). The structural investigation of the desired miRNAs can be carried out using web servers as well as offline or standalone tools.
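The 0–4 mismatch criterion mentioned above can be illustrated with a minimal Python sketch. It is not the procedure of Huang et al. (2015b): a position-by-position comparison of equal-length sequences stands in for a proper alignment search, and the function names, candidate names and toy sequences are hypothetical.

```python
def count_mismatches(seq_a, seq_b):
    """Count positions at which two equal-length RNA sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def conserved_candidates(candidates, known, max_mismatches=4):
    """Return (candidate, reference, mismatches) for candidates that differ
    from a known mature miRNA by at most max_mismatches positions.

    Assumption: only equal-length pairs are compared; a real pipeline would
    use an alignment tool rather than this simplification.
    """
    hits = []
    for cand_name, cand_seq in candidates.items():
        for ref_name, ref_seq in known.items():
            if len(cand_seq) != len(ref_seq):
                continue
            distance = count_mismatches(cand_seq, ref_seq)
            if distance <= max_mismatches:
                hits.append((cand_name, ref_name, distance))
    return hits


# Toy data, for illustration only.
known_mirnas = {"ref1": "UGAGGUAGUAGGUUGUAUAGUU"}
candidate_reads = {
    "cand_a": "UGAGGUAGUAGGUUGUAUGGUU",  # one substitution relative to ref1
    "cand_b": "ACGUACGUACGUACGUACGUAC",  # unrelated sequence
}
hits = conserved_candidates(candidate_reads, known_mirnas)
```

Candidates that pass such a filter would still need to satisfy the structural criteria (stem-loop hairpin, low MFE) described in the text.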

Several computational tools are being used for the miRNA discovery and their target site identification. We have depicted here the general outline (Fig. 1) for miRNA identification from transcriptome/ESTs/genome sequence of fish as described earlier (Huang et al. 2015a). The computational methods are useful for accurate depiction of miRNA and to provide evidence for further studies. Below, we describe the tools that have been used for miRNA identification and their target prediction based on different algorithms.

Fig. 1 General outline for miRNA discovery and target site identification

miRDeep

miRDeep was the first computational tool to be developed for miRNA investigation using miRBase data (An et al. 2013). This software was developed to extract putative precursor structures and predict their secondary structures using RNAfold after genome alignment of the sequences retrieved by NGS. It scores the compatibility of the position and frequency of sequenced RNAs with the secondary structure of the miRNA precursor. Further, it identifies novel, conserved and nonconserved miRNAs with a high confidence score, based on their alignment with stem-loop sequences. The sequence with the highest expression is recognized as the mature miRNA sequence. This software is standalone and can be run on a local machine using any operating system. Preinstallation of other programs to support this software is not required.
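The rule that the most highly expressed sequence is taken as the mature miRNA can be sketched in a few lines of Python. This is only an illustration of that single idea, not miRDeep's scoring model; the function name, precursor and reads below are hypothetical toy data.

```python
from collections import Counter


def most_abundant_read(reads, precursor):
    """Return (sequence, count, start) for the most frequent read that maps
    exactly to the precursor, or None if no read maps.

    Assumption: exact substring matching replaces real read alignment.
    """
    counts = Counter(read for read in reads if read in precursor)
    if not counts:
        return None
    sequence, count = counts.most_common(1)[0]
    return sequence, count, precursor.find(sequence)


# Toy precursor built from two arms joined by a loop (illustrative only).
arm_5p = "UGAGGUAGUAGGUUGUAUAGUU"
arm_3p = "CUAUACAAUCUACUGUCUUUC"
toy_precursor = "GGC" + arm_5p + "UUAGGGUCAC" + arm_3p + "GCC"

toy_reads = [arm_5p] * 5 + [arm_3p] * 2 + ["ACGUACGUACGUACGUACGU"]
mature = most_abundant_read(toy_reads, toy_precursor)
```

In this toy case the 5′ arm is supported by more reads than the 3′ arm, so it would be reported as the mature sequence.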

TargetScan

TargetScan is an algorithm developed to identify the targets of vertebrate miRNAs. TargetScan predicts miRNA targets by searching for the presence of conserved 8-mer and 7-mer sites that match the seed region of each miRNA. The program integrates thermodynamics-based modeling of miRNA–mRNA interactions with comparative sequence analysis to predict miRNA targets conserved across the genomes of multiple species. The software is reliable because of its low rate of false positives. However, it may have limited applicability, as its predictions are restricted to sites with less than one substitution between species.
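A seed-match search of this kind can be sketched as follows. The sketch does not reproduce TargetScan itself: it assumes the commonly used site definitions (8mer: pairing to miRNA positions 2–8 plus an A opposite position 1; 7mer-m8: pairing to positions 2–8; 7mer-A1: pairing to positions 2–7 plus an A opposite position 1), ignores conservation and thermodynamics, and uses hypothetical function names and toy sequences.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]


def classify_seed_sites(mirna, utr):
    """Find sites in a 3' UTR (5'->3') that pair with miRNA positions 2-7
    and label each by the assumed site type."""
    core = reverse_complement(mirna[1:7])    # pairs with positions 2-7
    pos8 = reverse_complement(mirna[7])      # pairs with position 8
    sites = []
    start = utr.find(core)
    while start != -1:
        match8 = start >= 1 and utr[start - 1] == pos8
        a1 = utr[start + 6:start + 7] == "A"  # A opposite miRNA position 1
        if match8 and a1:
            kind = "8mer"
        elif match8:
            kind = "7mer-m8"
        elif a1:
            kind = "7mer-A1"
        else:
            kind = "6mer"
        sites.append((start, kind))
        start = utr.find(core, start + 1)
    return sites


# Toy miRNA and UTR containing one site of each type (illustrative only).
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAG" + "CUACCUCA" + "GGG" + "UACCUCA" + "UU" + "CUACCUC" + "GAA"
sites = classify_seed_sites(toy_mirna, toy_utr)
```

A conservation filter would then retain only those sites that are also found at the aligned position in the UTRs of other species.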

miRanda

miRanda predicts miRNA targets based on three properties: sequence complementarity, conservation of target sites, and the free energies of RNA–RNA duplexes in related gene sequences. Its disadvantages include the occurrence of false negatives and the limited reliability of the Smith–Waterman algorithm, which works better for comparing sequences that are evolutionarily related.
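The complementarity step can be illustrated with a small Smith–Waterman-style local alignment in which base pairs, rather than identities, are rewarded. This is a sketch under stated assumptions, not miRanda's implementation: the scores (+5 for Watson–Crick pairs, +1 for G:U wobble pairs, −3 otherwise, linear gap penalty of −4) and the function names are illustrative choices.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def pair_score(a, b):
    """Score a potential base pair (illustrative values, see lead-in)."""
    if (a, b) in WATSON_CRICK:
        return 5
    if (a, b) in WOBBLE:
        return 1
    return -3


def local_complementarity_score(mirna, target, gap=-4):
    """Best local alignment score between the miRNA (5'->3') and the
    target read 3'->5', using Smith-Waterman dynamic programming."""
    reversed_target = target[::-1]
    previous = [0] * (len(reversed_target) + 1)
    best = 0
    for a in mirna:
        current = [0]
        for j, b in enumerate(reversed_target, start=1):
            score = max(
                0,                                   # start a new local alignment
                previous[j - 1] + pair_score(a, b),  # pair a with b
                previous[j] + gap,                   # gap in the target
                current[j - 1] + gap,                # gap in the miRNA
            )
            current.append(score)
            best = max(best, score)
        previous = current
    return best


toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
# Perfectly complementary target site, written 5'->3'.
perfect_site = toy_mirna.translate(str.maketrans("ACGU", "UGCA"))[::-1]
# The same site with one substituted base in the middle.
mutated_site = perfect_site[:10] + "G" + perfect_site[11:]

perfect_score = local_complementarity_score(toy_mirna, perfect_site)
mutated_score = local_complementarity_score(toy_mirna, mutated_site)
```

In a full predictor, candidate sites passing a score threshold would then be filtered by duplex free energy and by conservation, the other two properties named above.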

MirEval

MirEval is used to predict miRNA precursors. These precursor sequences are then subjected to BLASTx (http://www.ncbi.nlm.nih.gov) analysis to remove the protein-coding sequences and retain only the non-protein-coding sequences.

RNAhybrid

RNAhybrid is an extension of classical RNA secondary structure prediction tools such as RNAfold and Mfold. The secondary structures of putative pre-miRNAs can be predicted using RNAfold in the ViennaRNA package.

DIANA-microT

DIANA-microT uses a window of 38 nucleotides that moves progressively along the 3′ UTR of the target. The structure of the mRNA UTR is incorporated to predict miRNA targets, and the conservation of miRNA targets across species is taken into account. DIANA-microT provides wide-ranging online connectivity to biological resources by means of a web service. The server is connected to other online resources such as UniProt, iHOP, KEGG and miRBase. In addition, DIANA-microT can sort out pre-miRNAs from pseudo hairpins.
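The sliding-window scan can be sketched as a simple generator. Only the 38-nucleotide window size comes from the text; the one-nucleotide step, the function name and the toy UTR are assumptions, and the scoring applied to each window by DIANA-microT is not reproduced.

```python
def sliding_windows(utr, size=38, step=1):
    """Yield (start, window) pairs covering a 3' UTR from 5' to 3'.

    A UTR shorter than the window yields nothing.
    """
    for start in range(0, len(utr) - size + 1, step):
        yield start, utr[start:start + size]


# Toy 60-nucleotide UTR (illustrative only).
toy_utr = "ACGU" * 15
windows = list(sliding_windows(toy_utr))
```

Each window would then be evaluated against the miRNA, for example with a complementarity or binding-energy score.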

miRanalyzer

miRanalyzer is a machine learning approach for novel miRNA discovery based on the random forest method. To build the final prediction, it can be applied to miRNA discovery in different model organisms, including fish.

PicTar (Probabilistic identification of combinations of target sites)

PicTar checks alignments of 3′ UTRs for those displaying seed site matches to miRNAs, filters the retained alignments based on their thermodynamic stability, and estimates a Hidden Markov Model (HMM) maximum likelihood score (PicTar score) for each predicted target. An advantage of this approach is that it reflects the exponential increase in translational repression with an increasing number of miRNA binding sites in the 3′ UTR.

Studies on association of SNPs with miRNA using computational tools

A few studies on SNP–miRNA association have described the effect of SNPs on the transcription rate of genes and transcription factors. Peñaloza and colleagues suggested that SNPs in the flanking region of the myostatin gene of Atlantic salmon affected the regulation of muscle development and growth by interfering with a highly conserved miRNA target site (Penaloza et al. 2013). Zhu et al. (2012) identified microRNAs and microRNA-related SNPs in common carp using a combined strategy, i.e. homology-based prediction combined with small-RNA sequencing. Two SNPs in the 3′ UTRs of target genes were predicted to disturb or create miRNA–target interactions.
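How a SNP in a 3′ UTR may disturb or create a miRNA–target interaction can be shown with a minimal Python sketch. It is not the method used in the studies cited above: a site is assumed to exist whenever the UTR contains the reverse complement of miRNA positions 2–8, and the function names, sequences and SNP positions are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def has_seed_site(mirna, utr):
    """True if the UTR contains a match to miRNA positions 2-8."""
    seed_site = mirna[1:8].translate(COMPLEMENT)[::-1]
    return seed_site in utr


def snp_effect(mirna, utr, position, alt_allele):
    """Classify a single-nucleotide change at a 0-based UTR position as
    'disrupts', 'creates' or 'no change' with respect to the seed site."""
    mutated = utr[:position] + alt_allele + utr[position + 1:]
    before = has_seed_site(mirna, utr)
    after = has_seed_site(mirna, mutated)
    if before and not after:
        return "disrupts"
    if after and not before:
        return "creates"
    return "no change"


# Toy miRNA and reference UTR carrying one seed site (illustrative only).
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
reference_utr = "AAGCUACCUCGAA"

loss = snp_effect(toy_mirna, reference_utr, 6, "G")      # inside the site
gain = snp_effect(toy_mirna, "AAGCUAGCUCGAA", 6, "C")    # restores the site
neutral = snp_effect(toy_mirna, reference_utr, 0, "G")   # outside the site
```

Predictions of this kind only flag candidate SNPs; their effect on regulation would still require experimental confirmation.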

Recently, computational approaches have been used to examine the impact of harmful missense variants or SNPs in various important genes based on evolutionary information (George Priya Doss and Rajith 2012). These missense mutations affect gene expression and thereby alter protein stability (Thomas et al. 1999). Computational tools successfully predicted the consequences of mutations in important genes such as GAPDH (Rasal et al. 2015) and TGB-III (Rasal et al. 2016). However, analysis of the impact of a mutation on gene function remains fundamentally necessary, particularly for those mutations that form the molecular basis of the observed results. In silico prediction of SNPs in miRNAs has enabled the screening and analysis of large miRNA datasets in relatively less time and with less labour. Song et al. (2014) predicted 47 SNPs within 95 miRNAs of inflammatory genes associated with gastric cancer using computational methods. They used several target prediction databases, including MirSNP, TargetScan Human 6.2, PolymiRTS 3.0, miRNASNP 2.0, and Patrocles, to predict miRNA target sites.

Future perspectives

The era of genomics is expanding, and the simultaneous, dynamic documentation of many genes is becoming possible. Genomics answers many questions associated with the evolution of genes, their biological pathways, and the gene functions that influence the physiology of the organism. Despite much progress during the last decade in uncovering the regulatory mechanisms that control miRNA biogenesis and function, many questions remain to be answered. Recently, epigenetic factors have been recognized to strongly affect gene expression and the corresponding protein products. We now understand that miRNAs are involved in post-transcriptional gene regulation, but their mechanism of action during disease conditions is still largely unclear. Each miRNA has a few hundred predicted target mRNAs, but only a small set of these interactions has been experimentally confirmed.

In the case of fish, only limited genome data with well-characterized genes are available. The future challenge lies in understanding the large number of non-model fish genomes, along with overall gene functions and their evolutionary patterns, using NGS and computational approaches. Advanced computational tools are becoming available for sequencing data analysis to gain insights into miRNAs via transcriptomics. Growing awareness of the vast number of miRNAs, their varied expression and their wide range of target sites has triggered a major interest in understanding their possible regulatory functions. Thus, there is a growing need for research focused on regulatory molecules that are major modulators of acclimation and adaptation under adverse conditions or climate change. MicroRNAs are substantial contributors to the regulatory networks of development and adaptive plasticity in fishes. Future studies on these aspects will allow geneticists to mine genes and genetic variations or SNPs, and to identify miRNAs in the genes associated with important traits, making the best use of NGS, genomics and bioinformatics in aquaculture.

Conclusions

MicroRNAs have been shown to regulate biological processes in plants, nematodes, insects, mammals and other organisms including fish. The advent of high-throughput sequencing methodologies has provided unique opportunities to generate comprehensive sequencing data for the identification and quantification of known and novel miRNAs. These technologies have created new challenges for the biological interpretation of large sequencing data sets. Consequently, computational tools integrating small noncoding RNA data with gene expression data and target predictions are essential to understand the biological processes regulated by miRNAs and other small noncoding RNA classes. Further investigation of the molecular mechanisms through which miRNAs regulate gene expression will provide important parameters for target identification and thereby predicting biological outcomes of miRNA expression.