Introduction

MicroRNAs (miRNAs), a class of prevalent small regulatory RNAs (~22 nt), mediate gene silencing and play key roles in various biological processes. Because miRNAs regulate gene expression at the post-transcriptional level, deregulation of specific miRNAs plays an important role in human diseases, including diverse cancers, and miRNA profiling is therefore proving invaluable in the study of human cancers. In the canonical pathway, a miRNA is transcribed as a primary miRNA (pri-miRNA). This long transcript is then processed by Drosha and Dicer to produce the mature single-stranded miRNA, which is loaded into the RNA-induced silencing complex (RISC) [1]. Because cleavage by Drosha and Dicer is imprecise and alternative, multiple miRNA variants with various 5′ and/or 3′ ends, termed isomiRs, are generated from a given miRNA locus [2–5]. Interestingly, miRNAs may carry additional non-template nucleotides at their 3′ ends after Dicer processing [6–12]. The widespread phenomenon of 3′ addition is not random and is conserved across animal species; adenosines and uridines are added preferentially, which may contribute to miRNA stability and play a role in miRNA:target interactions [7, 9]. However, isomiR profiles and 3′ addition events across different miRNAs in miRNA gene clusters and families remain largely unknown.

Some miRNA genes are linked in a cluster and transcribed as a single polycistronic transcript, coordinately regulating multiple biological processes. Clustered miRNA genes were defined using a pairwise chromosomal distance threshold of 10 kb, and these miRNA gene clusters were obtained from the annotation in the miRBase database. Some miRNA members of a gene cluster share sequence similarity [13], and these members might have experienced a complex duplication history [14–16]. miRNAs with high sequence similarity, or even the same “seed sequence” (nucleotides 2–8), are termed homologous miRNA genes and constitute a gene family. Although a miRNA cluster is often located in a polycistron and co-transcribed with neighboring miRNAs and host genes [17], its members show both consistent and inconsistent expression patterns [18, 19]. Here, by analyzing isomiR profiles and isomiR expression patterns of clustered miRNAs and homologous miRNAs, we address whether these special miRNAs share a consistent maturation mechanism, as judged by isomiR profiles and 3′ addition events.
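The distance-based definition of clustered miRNA genes can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure actually used: the function name `cluster_mirna_genes`, the chaining of neighbouring genes on the same chromosome and strand, the use of start coordinates, and the toy gene names and coordinates are all hypothetical. Only the 10 kb threshold comes from the text.

```python
from collections import defaultdict


def cluster_mirna_genes(genes, max_distance=10_000):
    """Group miRNA genes into clusters of two or more members.

    genes: iterable of (name, chromosome, strand, start) tuples.
    Assumption: neighbouring genes on the same chromosome and strand are
    chained into one cluster when their start positions lie within
    `max_distance` bases of each other.
    """
    by_location = defaultdict(list)
    for name, chrom, strand, start in genes:
        by_location[(chrom, strand)].append((start, name))

    clusters = []
    for members in by_location.values():
        members.sort()
        current = [members[0]]
        for start, name in members[1:]:
            if start - current[-1][0] <= max_distance:
                current.append((start, name))
            else:
                if len(current) > 1:
                    clusters.append([n for _, n in current])
                current = [(start, name)]
        if len(current) > 1:
            clusters.append([n for _, n in current])
    return clusters


# Toy input: names and coordinates are invented for illustration only.
toy_genes = [
    ("mir-a", "chr1", "+", 1_000),
    ("mir-b", "chr1", "+", 1_500),
    ("mir-e", "chr1", "+", 9_000),
    ("mir-c", "chr1", "+", 50_000),
    ("mir-d", "chr1", "-", 1_200),
]
toy_clusters = cluster_mirna_genes(toy_genes)
print(toy_clusters)
```

In this toy input, `mir-c` lies too far downstream and `mir-d` is on the opposite strand, so neither joins the cluster formed by the other three genes.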

Materials and methods

A human placental sample from pregnant women was obtained from Zhongda Hospital, Nanjing, China. Total RNA was extracted with TRIzol (Invitrogen), and small RNAs were isolated using the mirVana® miRNA Isolation Kit (Ambion). Purified small RNAs were then subjected to miRNA library construction according to the protocol of the SOLiD® Small RNA Expression Kit (ABI, Life Technologies). Sequencing was performed using the SOLiD® system (ABI, Life Technologies) at the State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, China. After adaptor sequences were discarded, short RNAs were aligned to known human miRNA precursor (pre-miRNA) sequences from the miRBase database (Release 16.0, http://www.mirbase.org/) [20] using Bowtie 0.12.7 [21].

3′ non-template additional nucleotides were surveyed and compared among miRNA members in gene clusters and families [5]. A miRNA was analyzed only if its sequence count was at least 10. Relative expression levels of miRNAs were estimated in two ways: from the sequence count of the most abundant isomiR, and from the sum of all isomiR sequence counts. Because miRNAs are short (~22 nt), short reads from deep sequencing can be mapped to multiple loci, a phenomenon termed cross-mapping, especially between multicopy miRNA precursors and homologous miRNA genes [16, 22] (Fig. S1). To estimate relative expression levels comprehensively for miRNAs in gene clusters and families, the counts of these shared sequencing reads were assigned repeatedly to each of the miRNA genes concerned. For example, hsa-miR-24 can be generated by hsa-mir-24-1 and hsa-mir-24-2, and the two multicopy miRNA precursors are found in different miRNA gene clusters (mir-23b gene cluster: mir-23b and mir-24-1; mir-24-2 gene cluster: mir-23a, mir-27a and mir-24-2). We assumed that both pre-miRNAs could yield mature miRNAs, so that expression patterns of miRNAs in both gene clusters could be estimated. In a miRNA gene family, a miRNA that can be yielded by multicopy pre-miRNAs was counted only once. Because several dominant isomiRs can be generated from a given miRNA locus [5], we used Fisher’s exact test to assess whether miRNA expression patterns in gene clusters and families were consistent at the isomiR level. Specifically, the frequencies of highly and lowly expressed isomiRs were estimated from the relative expression levels at a given miRNA locus. Rare isomiRs (relative expression level below 1%) were excluded from Fisher’s exact test.
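The isomiR-level comparison described above can be illustrated with a short Python sketch. It is one possible reading, not the authors' code: the 2×2 layout (two miRNA loci by counts of highly and lowly expressed isomiR types), the 10% cutoff separating "high" from "low", the function names, and the toy counts are all assumptions. Only the 1% cutoff for rare isomiRs and the use of Fisher's exact test come from the text. The test is written with the standard library so that the block is self-contained.

```python
from math import comb


def relative_levels(isomir_counts):
    """Relative expression level of each isomiR within one miRNA locus."""
    total = sum(isomir_counts.values())
    return {seq: count / total for seq, count in isomir_counts.items()}


def high_low_frequencies(isomir_counts, rare_cutoff=0.01, high_cutoff=0.10):
    """Count high and low expressed isomiR types, skipping rare isomiRs.

    rare_cutoff follows the 1% rule in the text; high_cutoff is a
    hypothetical threshold chosen only for this illustration.
    """
    high = low = 0
    for level in relative_levels(isomir_counts).values():
        if level < rare_cutoff:
            continue
        if level >= high_cutoff:
            high += 1
        else:
            low += 1
    return high, low


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = {x: comb(row1, x) * comb(row2, col1 - x) / denom
             for x in range(lo, hi + 1)}
    p_obs = probs[a]
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return min(1.0, p_value)


def compare_loci(counts_a, counts_b):
    """p-value for the difference in high/low isomiR frequencies of two loci."""
    high_a, low_a = high_low_frequencies(counts_a)
    high_b, low_b = high_low_frequencies(counts_b)
    return fisher_exact_two_sided(high_a, low_a, high_b, low_b)


# Toy isomiR counts, invented for illustration only.
locus_a = {"a1": 700, "a2": 250, "a3": 45, "a4": 5}
locus_b = {"b1": 500, "b2": 300, "b3": 150, "b4": 50}
p_toy = compare_loci(locus_a, locus_b)
print(high_low_frequencies(locus_a), high_low_frequencies(locus_b), p_toy)
```

Under this reading, a large p-value means the two loci cannot be distinguished by their proportions of highly and lowly expressed isomiRs, which corresponds to a consistent expression pattern.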

Results

Among abundantly expressed miRNA clusters, we found that most members showed consistent expression patterns at the isomiR level based on Fisher’s exact test: either one dominant, abundant isomiR accompanied by rare isomiRs, or several abundant isomiRs accompanied by rare isomiRs (Fig. 1; Table S1). The fold changes between the most abundant and second most abundant isomiRs were similar among clustered miRNAs (Table S2). The abundant isomiRs can be used to estimate the dominant cleavage sites of Drosha and Dicer during pre-miRNA processing. No 3′ non-template additional nucleotides were detected in these abundantly expressed isomiRs. When the dominant modified isomiRs were compared, many clustered miRNAs carried the same additional nucleotides (Table 1). Some clustered miRNAs had different dominant additional nucleotides, but in those cases the isomiR with the consistent additional nucleotide was always the second most abundant modified isomiR. For example, hsa-miR-23b and hsa-miR-24-1 showed inconsistent dominant isomiRs with 3′ additions. However, the expression level of hsa-miR-23b with an added adenosine, which is consistent with the dominant modified isomiR of hsa-miR-24-1, was close to that of its dominant modified isomiR with an added cytosine. Generally, isomiRs with 3′ additions had lower relative expression levels, and they showed larger expression divergence than the dominant isomiRs from a given miRNA locus. Interestingly, all clusters except the hsa-mir-143 cluster generated their mature miRNAs from the same arms of the miRNA precursors. With both estimation methods (the most abundant isomiR and the sum of all isomiRs), consistent as well as inconsistent expression levels were found among miRNA members of gene clusters (Table 1). In particular, some miRNAs had low sequence counts although the co-transcribed miRNAs in the same gene cluster were abundantly expressed (for example, in the hsa-mir-425 and hsa-mir-23b clusters; Table 1).
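The two expression estimates and the ratio between the two most abundant isomiRs can be written out as a small Python sketch. The function name `expression_estimates`, the dictionary keys, and the toy counts are hypothetical, and treating "fold change" as the count ratio of the most abundant to the second most abundant isomiR is an assumption about how the values in Table S2 were derived.

```python
def expression_estimates(isomir_counts):
    """Summarise one miRNA locus from its isomiR sequence counts.

    Returns the count of the most abundant isomiR, the sum over all
    isomiRs, and the ratio of the most abundant to the second most
    abundant isomiR (None when there is no second isomiR).
    """
    ranked = sorted(isomir_counts.values(), reverse=True)
    most_abundant = ranked[0]
    second = ranked[1] if len(ranked) > 1 else None
    return {
        "most_abundant": most_abundant,
        "sum_all": sum(ranked),
        "top_to_second_ratio": most_abundant / second if second else None,
    }


# Toy counts, invented for illustration only.
toy_locus = {"iso1": 600, "iso2": 300, "iso3": 100}
toy_estimates = expression_estimates(toy_locus)
print(toy_estimates)
```

Comparing `top_to_second_ratio` across the members of one cluster is one simple way to check whether their isomiR profiles have a similar shape even when their absolute counts differ.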

Fig. 1

Consistent isomiR expression patterns between different miRNAs in gene clusters based on sequence counts. The horizontal axis shows each isomiR from the given miRNA locus. The isomiRs are ranked according to their relative expression levels. Apart from a few abundantly expressed isomiRs without non-template additional nucleotides, the isomiRs always have low sequence counts. Although inconsistent expression levels of miRNAs within a gene cluster can be detected, the miRNAs show consistent expression patterns across their variants. Some members with lower expression levels (less than 100) are not shown here

Table 1 Relative expression levels of miRNAs in gene clusters based on different estimation methods

Similarly, in miRNA gene families, homologous miRNAs also showed consistent isomiR profiles and isomiR expression patterns, with the exception of the hsa-mir-103 family (Fig. 2; Tables S1, S3). These homologous miRNAs were generated from the same arms of their miRNA precursors. Generally, at a given miRNA locus, isomiRs with 3′ additional nucleotides were not among the abundant isomiRs. Many homologous miRNAs also had dominant modified isomiRs with the same 3′ additional nucleotides (Table 2). Although homologous miRNAs always showed high sequence similarity, both consistent and inconsistent expression levels could be found. For example, hsa-miR-103 and hsa-miR-107 showed larger expression divergence (Table 2). We also estimated the dominant additional nucleotide from the sequence counts of all modified isomiRs at a single miRNA locus. As expected, clustered and homologous miRNAs always showed consistent dominant isomiRs with the same additional nucleotides (Tables S2, S3).
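Estimating the dominant additional nucleotide from all modified isomiRs of a locus can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: the function names, the rule that a 3′ tail is whatever remains after the read stops matching the precursor, the `max_tail` limit of three nucleotides, and the toy precursor and counts are all hypothetical. An added nucleotide that happens to match the template cannot be recognised by this approach.

```python
from collections import Counter


def three_prime_tail(read, precursor, start):
    """Return the part of `read` that stops matching the precursor.

    `start` is the 0-based position of the read's 5' end on the precursor.
    An empty string means the read is fully templated.
    """
    template = precursor[start:start + len(read)]
    matched = 0
    for read_nt, template_nt in zip(read, template):
        if read_nt != template_nt:
            break
        matched += 1
    return read[matched:]


def tally_additions(isomirs, precursor, max_tail=3):
    """Sum sequence counts per 3' non-template tail.

    isomirs: iterable of (read, start, count). Tails longer than
    `max_tail` are ignored, since they more likely reflect internal
    mismatches than 3' additions (an assumption of this sketch).
    """
    tally = Counter()
    for read, start, count in isomirs:
        tail = three_prime_tail(read, precursor, start)
        if 0 < len(tail) <= max_tail:
            tally[tail] += count
    return tally


# Toy precursor and isomiRs, invented for illustration only.
toy_precursor = "GGCUAGCUAGGAUCCGAUUCGAACGGAUCCAUGC"
core = toy_precursor[5:27]
toy_isomirs = [
    (core, 5, 900),                       # templated, most abundant
    (core + "U", 5, 40),                  # templated 3' extension
    (core + "A", 5, 60),                  # non-template A
    (core + "C", 5, 25),                  # non-template C
    (toy_precursor[5:26] + "C", 5, 10),   # shorter isomiR with non-template C
]
additions = tally_additions(toy_isomirs, toy_precursor)
dominant_nt, dominant_count = additions.most_common(1)[0]
print(dict(additions), dominant_nt)
```

Pooling counts over all modified isomiRs, as here, can give a different answer from looking only at the single most abundant modified isomiR, which is why the text reports both views.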

Fig. 2

Consistent isomiR expression patterns between different miRNAs in gene families based on sequence counts. The horizontal axis shows each isomiR from the given miRNA locus. Apart from a few abundantly expressed isomiRs without non-template additional nucleotides, the isomiRs always have low sequence counts. Although inconsistent expression levels of miRNAs within a gene family can be detected, the miRNAs always show consistent expression patterns across their variants (except for hsa-miR-103 and hsa-miR-107 in the hsa-mir-103 family). Some members with lower expression levels (less than 100) are not shown here

Table 2 Relative expression levels of miRNAs in gene families based on different estimation methods

Discussion

Although both consistent and inconsistent expression levels of miRNA members were detected in miRNA gene clusters and families (Tables 1, 2), the members always showed consistent expression patterns at the isomiR level (Figs. 1, 2; Tables S1, S2, S3). Generally, 1–3 abundant isomiRs were detected from a given miRNA locus, but none of them was a modified isomiR with a 3′ non-template additional nucleotide [5]. Most clustered miRNAs and homologous miRNAs showed consistent types of abundant and rare isomiRs (Figs. 1, 2; Table S1). Abundantly expressed isomiRs can be used to estimate the dominant cleavage sites of Drosha and Dicer during pre-miRNA processing [23]. Consistent expression patterns therefore imply similar dominant cleavage mechanisms among these clustered miRNAs and homologous miRNAs. In addition, these special miRNAs were almost always generated from the same arms of their pre-miRNAs, the hsa-mir-143 cluster being the exception. Because modified isomiRs were not abundant isomiRs [5], 3′ non-template additional nucleotides were surveyed and compared based on the dominant modified isomiRs. We found that most clustered miRNAs and homologous miRNAs showed dominant modified isomiRs with the same additional nucleotides, especially the miRNAs in gene clusters (Tables 1, 2). Adenosines and cytosines were the dominant additional nucleotides in this study, suggesting that 3′ addition is not a random event (Tables 1, 2). Some miRNA members showed inconsistent dominant modified isomiRs, for example, miRNAs in the hsa-mir-143 cluster and the hsa-mir-23 family. Interestingly, in such cases the dominant additional nucleotide of one member could be detected as the additional nucleotide of the second most abundant modified isomiR of another member. Indeed, the relative expression levels of the most and second most abundant modified isomiRs were sometimes very similar. Therefore, miRNA members in gene clusters and families had largely consistent additional nucleotides.

Although these modified isomiRs accounted for only a small percentage of the total expression level from a given miRNA locus, some of them unexpectedly showed high sequence counts and relative expression levels. We also estimated the most dominant additional non-template nucleotide from the sequence counts of all modified isomiRs. The same nucleotide was detected as the dominant additional nucleotide among miRNA members in many miRNA gene clusters and families (Tables S2, S3). 3′ addition events in miRNAs may play a role in miRNA stability and in miRNA:target interactions [7, 9]. The same dominant additional nucleotide in clustered miRNAs and homologous miRNAs may contribute to a complex co-regulation network. Taken together, the consistent isomiR expression patterns and dominant additional nucleotides might imply strict regulation during miRNA biogenesis in these special miRNA groups (miRNA gene clusters and families).

In addition, we showed that cross-mapping also contributes partly to the consistent isomiR profiles and expression patterns, especially between homologous miRNAs. Some clustered miRNAs were also homologous miRNAs, for example, those in the hsa-mir-30b cluster (including the hsa-mir-30b and hsa-mir-30d genes). Because of high sequence similarity and the variety of isomiRs with different 5′ and/or 3′ ends, many isomiRs could be mapped to different miRNA genes (Fig. S1). Here, to perform a comprehensive analysis of miRNA members in gene clusters and families, these shared isomiRs were counted repeatedly for each gene. Furthermore, the SOLiD sequencing platform employs two-base encoding, and only one mismatch was tolerated during mapping with Bowtie. To some extent, this strict mismatch tolerance can reduce cross-mapping events, but cross-mapping still occurred between homologous miRNA genes when they shared isomiR sequences, or when the differing nucleotides were located at the 3′ ends or involved additional nucleotides (Fig. S1).
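The repeated counting of cross-mapped reads can be made concrete with a short Python sketch. It is a simplified stand-in for the actual mapping step: exact substring matching replaces the Bowtie alignment with one tolerated mismatch, and the function name `distribute_counts`, the gene names, the sequences, and the counts are all invented for illustration.

```python
from collections import defaultdict


def distribute_counts(read_counts, precursors):
    """Assign each read's full count to every precursor it maps to.

    Assumption: a read "maps" to a precursor when it occurs in it as an
    exact substring. A read shared by several precursors is counted once
    for each of them, mirroring the repeated distribution in the text.
    """
    per_gene = defaultdict(dict)
    for read, count in read_counts.items():
        for gene, sequence in precursors.items():
            if read in sequence:
                per_gene[gene][read] = count
    return dict(per_gene)


# Toy data: two hypothetical precursors that share the same mature region.
shared = "ACGUUGCAAGGCUUAGCCAUGA"
toy_precursors = {
    "mir-x-1": "AAAC" + shared + "GGGU",
    "mir-x-2": "CCCU" + shared + "UUUA",
}
unique_read = "AAAC" + shared[:18]   # 5' isomiR found only in mir-x-1
toy_reads = {shared: 100, unique_read: 20}

toy_assignment = distribute_counts(toy_reads, toy_precursors)
print(toy_assignment)
```

Because the shared read contributes its full count to both precursors, the isomiR profiles of the two genes become more alike than they would be under unique assignment, which is the effect of cross-mapping discussed here.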

Despite the contribution of cross-mapping, the consistent isomiR expression patterns among miRNA members in gene clusters and families, especially the consistent additional nucleotides of dominant modified isomiRs, indicate similar processing and maturation and imply specific evolutionary and functional relationships. Given the relative stability of 3′ addition events and isomiR profiles [5, 7], the consistent isomiR expression patterns and 3′ addition events in miRNA gene clusters and families were also supported by the analysis of sequencing datasets from other placental samples. Therefore, the phenomenon is unlikely to be a random event. Clustered miRNA genes and homologous miRNA genes have experienced a complex duplication history [14–16], and the consistency of pre-miRNA processing and 3′ modification events might derive from their ancestral miRNA genes. Conservation of the miRNA maturation mechanism would lead to similar isomiR expression patterns, including similar dominant modified isomiRs, which may contribute to co-regulation. Further studies should reveal whether isomiR profiles and their expression patterns contribute to target binding, and whether consistent isomiR profiles, including widespread 3′ addition events, also contribute to the co-regulation of biological processes.