Abstract
MicroRNAs (miRNAs) are small noncoding RNAs that are processed into ~20–24 nt molecules and regulate gene expression post-transcriptionally. MiRNA gene clusters have been identified in a range of species, wherein miRNAs are often processed from polycistronic transcripts. In this study, a computational approach is used to investigate the extent of evolutionary conservation of the miR-71/2 cluster in animals, and to identify novel miRNAs in this cluster. The miR-71/2 cluster, consisting of copies of the miR-71 and miR-2 (including miR-13) families, was found to be Protostome-specific. Although this cluster is highly conserved across the Protostomia, the miR-2 family is completely absent from the Deuterostomia, while miR-71 is absent from the Vertebrata and Urochordata. The evolutionary conservation and clustering propensity of the miR-71/2 family across the Protostomes could indicate common functional roles across the member species of the Protostomia.
Introduction
Gene clustering, which occurs for both protein-coding and noncoding genes, can reflect evolutionary pressures to define a functional genomic landscape for suites of genes involved in similar biological functions, i.e., a co-regulatory gene complex (Chopra 2011). Amongst the noncoding RNA genes, miRNAs have emerged as key regulators of a broad spectrum of cellular functions (Berezikov et al. 2006; Bushati and Cohen 2007; Huang et al. 2010; Kozomara and Griffiths-Jones 2011; Li et al. 2010; Xiong et al. 2009). Many such miRNAs are located in miRNA gene clusters, where they are often expressed as polycistronic transcripts (Lee et al. 2002; Saini et al. 2008). It is estimated that ~37 % of the known miRNAs in humans form clusters, suggesting evolutionary and functional significance for the clustering phenomenon (Altuvia et al. 2005).
The miR-71/2 cluster (including miR-13 as a member of the miR-2 family) is conserved across insects and other invertebrates (Marco et al. 2010; Marco et al. 2012); miR-2 is predicted to play an important role in neural development (Marco et al. 2012). We previously demonstrated the duplication of the miR-71/2 cluster in the Lophotrochozoan parasite Schistosoma mansoni, with a high level of conservation observed across the clustered sma-miR-71 and sma-miR-71b genes (de Souza et al. 2011). miRBase (version 19.0) contains 171 pre-miRNAs that belong to the miR-2 family and 33 pre-miRNAs that belong to the miR-71 family.
To elucidate the evolutionary history of the miR-71 and miR-2 families across the animal kingdom, we screened animal genomes for new members of these families. In this study, we identify 19 novel miRNAs that are not reported in the microRNA database miRBase, version 19.0 (http://www.mirbase.org) (Table 1). These novel miRNAs were identified by screening the candidate regions that surround the known precursors from the same species, or by using the known ortholog precursors. In total, we have identified 12 novel miR-2, four novel miR-13, and three novel miR-71 genes (Table 1) (these newly predicted miRNAs are named based on their respective ortholog).
Our pipeline also identified an additional nine miRNAs, which have been shown in previous studies to be miR-71 or miR-2 family members (Chopra 2011; de Souza et al. 2011) (Table 1). We analyzed the structural characteristics and thermodynamic features by comparing the novel pre-miRNAs with the known miR-71, miR-2, and miR-13 genes deposited in miRBase (Supplementary Figure 1 and Supplementary Table 1). All the novel predicted miRNAs were highly conserved in terms of both primary and secondary structures, although this partly reflects the conservative prediction criteria used to identify them (Supplementary Figures 1, 2a/b/c, Supplementary Tables 1, 2). Furthermore, all the novel predicted miRNAs form stable hairpin structures (minimum free energy <−25 kcal/mol), which is essential for the processing of pre-miRNA transcripts into mature miRNAs (Supplementary Table 1). Our analyses of fifteen different pre-miRNA parameters yielded values similar to those of the set of known miRNAs tested (Supplementary Table 1).
To determine whether the miR-71/2 cluster is evolutionarily conserved in Protostome and Deuterostome species, we retrieved the genomic regions at the boundaries of the miR-71, miR-2, and/or miR-13 miRNAs for these species. We considered miRNAs to be clustered if members of both miRNA families were within 10 kb of one another. The miR-71/2 cluster is only conserved in Protostome species (including Ecdysozoan and Lophotrochozoan species) (Fig. 1), except for the genus Drosophila, which lacks miR-71 as previously described (Marco et al. 2010). The majority of miR-2 copies within a miR-71/2 cluster are found within 5 kb of the nearest miR-71 copy. However, in some species, such as those in the Drosophila genus, miR-2/miR-13 genes were found clustered together in the absence of miR-71. In Caenorhabditis briggsae, miR-2 was found ~11 kb distal from the miR-71 member in the genome, 1 kb beyond our bioinformatic cut-off. In our study, we nevertheless considered cbr-miR-2 as clustered with cbr-miR-71 because of its high conservation relative to cel-miR-2. The mature sequences, cel-miR-2 and cbr-miR-2, displayed 100 % identity at the nucleotide level (Supplementary Figure 3).
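The 10 kb clustering criterion described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the `MirnaLocus` record, the function names, and the chaining of consecutive loci on the same scaffold are assumptions made for illustration only.

```python
from dataclasses import dataclass

CLUSTER_CUTOFF_NT = 10_000  # 10 kb cut-off stated in the text


@dataclass(frozen=True)
class MirnaLocus:
    """Hypothetical record for one miRNA precursor on a scaffold."""
    name: str
    family: str    # e.g. "miR-71" or "miR-2" (miR-13 counted as miR-2)
    scaffold: str
    start: int     # 1-based, inclusive
    end: int       # 1-based, inclusive


def gap_between(a, b):
    """Distance in nt between two loci (0 if they overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def group_into_clusters(loci, cutoff=CLUSTER_CUTOFF_NT):
    """Chain consecutive loci on a scaffold whose gap is <= cutoff."""
    clusters = []
    for locus in sorted(loci, key=lambda l: (l.scaffold, l.start)):
        last = clusters[-1][-1] if clusters else None
        if (last is not None
                and last.scaffold == locus.scaffold
                and gap_between(last, locus) <= cutoff):
            clusters[-1].append(locus)
        else:
            clusters.append([locus])
    return clusters


def is_mir71_2_cluster(cluster):
    """True if the cluster holds members of both miRNA families."""
    families = {locus.family for locus in cluster}
    return {"miR-71", "miR-2"} <= families
```

Under this sketch, a miR-2 copy lying ~11 kb from the nearest miR-71 copy, as in C. briggsae, would fall outside the default cut-off, so an exception of the kind made for cbr-miR-2 would have to be applied separately, for example by passing a larger `cutoff` for that species.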
Whereas the miR-2/miR-13 family was found to be restricted to Protostomes, miR-71 was found in Deuterostome species from the clades Echinodermata, Hemichordata, and Cephalochordata, but not in Vertebrata and Urochordata (Fig. 1; Marco et al. 2012). Indeed, Urochordata and Vertebrata genomes lack both miR-71 and miR-2 family members (Fig. 1). In the Deuterostome species Strongylocentrotus purpuratus and Saccoglossus kowalevskii, miR-71 was found in isolation; this is consistent with the presence of nonclustered copies of the miR-71 family in Protostome species. In contrast, in Branchiostoma floridae, miR-71 is clustered with miR-4890, which is unrelated to the miR-2/miR-13 family (Fig. 1). The presence of different miRNAs within a miR-71/2 cluster is also observed in Protostome species, such as Schmidtea mediterranea (Fig. 1, miRBase; Palakodeti et al. 2006).
The duplication of the miR-71/2 cluster has previously been reported in S. mansoni (de Souza et al. 2011) and Schistosoma japonicum (Huang et al. 2009). Here, we identify an extra miR-2 copy in S. japonicum (sja-miR-2f) and S. mansoni (sma-miR-2f) (Fig. 1). This additional miRNA illustrates that both sets of miR-71/2 clusters are conserved between Schistosoma species. In S. mansoni and S. japonicum, the miR-71/2 cluster exists as two separate clusters, each containing four miRNAs (one miR-71 and three miR-2 genes) found in the same order in each cluster (Fig. 1). The miR-71/2 cluster is also duplicated in Schmidtea mediterranea, although the characteristics of the clusters in S. mediterranea differ (Fig. 1). In S. mediterranea, the duplicated miR-71/2 clusters occur as five separate clusters (miRBase 19.0). Three of these clusters consist of one miR-71 and one miR-2 gene. The other two clusters contain either three or four miRNA genes, including miRNAs from other families. An isolated (nonclustered) copy of miR-2 is also present in S. mediterranea (miRBase 19.0). Although the functional significance of the miR-71/2 clusters is currently unclear, the comparison of S. mediterranea to S. mansoni and S. japonicum highlights that the duplication of this cluster is conserved across these Platyhelminthes species, and may indicate that miR-71/2 cluster(s) have played significant roles in the evolution of the Schistosoma lineage.
The emergence of miR-2 in Protostomes, coupled with its absence in Deuterostomes, suggests that this miRNA gene could have a functionally significant role in Protostome species, possibly in neural development, as the predicted target genes of the miR-2 cluster are enriched for neural development genes in Protostomes (Marco et al. 2012). The copy-number amplification of miR-2 genes in the Arthropoda, Annelida, and Platyhelminthes lineages could reflect a divergent neofunctionalization of miR-2 activity between miR-2 homologs and/or a gene dosage effect on miR-2 activity in these lineages. Furthermore, the conserved clustering pattern of miR-2 with miR-71 to form the miR-71/2 cluster could reflect a functional co-evolutionary relationship of these two miRNAs within Protostome species. Finally, the segmental duplication of the miRNA cluster in specific lineages may also have functional significance; for example, in S. mansoni, one of the segmentally duplicated miR-71/2 clusters is located on the female chromosome W, where it has the potential to play a role in sexual differentiation (de Souza et al. 2011).
Methods
miRNAs and Genomic Regions
The mature and precursor miRNA sequences were retrieved from miRBase (Wellcome Trust Sanger Institute, http://microrna.sanger.ac.uk, release 19.0; Kozomara and Griffiths-Jones 2011). The cluster region in each genome was identified using BLASTN (default parameters) against the available animal genomes. For the Acyrthosiphon pisum and Rhodnius prolixus genomes, BLASTN was run using known clustered ortholog miRNAs from Ecdysozoan species as queries. The new miRNAs were identified in the following species: Ixodes scapularis (VectorBase, http://www.vectorbase.org, Ixodes scapularis annotation IscalW1; Lawson et al. 2009), Daphnia pulex (Colbourne et al. 2011), Acyrthosiphon pisum (Human Genome Sequencing Center at the Baylor College of Medicine, http://www.hgsc.bcm.tmc.edu/) (Consortium 2010), Anopheles gambiae (VectorBase, http://www.vectorbase.org, Anopheles gambiae annotation AgamP3.5) (Lawson et al. 2009), Rhodnius prolixus (VectorBase, http://www.vectorbase.org, Rhodnius prolixus annotation RproC1, SuperContig.feb11) (Lawson et al. 2009), Pediculus humanus (VectorBase, http://www.vectorbase.org, Pediculus humanus annotation PhumU1) (Lawson et al. 2009), Culex quinquefasciatus (VectorBase, http://www.vectorbase.org, Culex quinquefasciatus annotation CpipJ1) (Lawson et al. 2009), C. briggsae (WormBase database, http://www.wormbase.org/) (Yook et al. 2012), S. mansoni (GeneDB, www.genedb.org) (Berriman et al. 2009; Protasio et al. 2012), and S. japonicum (GeneDB, www.genedb.org) (Consortium 2009). The genomic DNA fragments containing the miRNA precursors flanked by 10,000 nt on each side were retrieved from the genome data. A distance of ~10 kb or less between consecutive miRNA genes on the genome has been used to consider miRNA genes clustered. For instance, miRBase (version 19.0) treats miRNAs within 10,000 nt of one another as clustered. Hence, we applied the same cut-off distance to define clustered miRNAs in our analysis.
The sequences were stored in multi-fasta format for further analysis.
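As a hedged illustration of the retrieval step, the sketch below extracts a precursor with 10,000 nt of flanking sequence on each side and writes the result in multi-FASTA format. The function names, the 1-based inclusive coordinates, and the clipping at scaffold ends are assumptions for this example, not details reported for the original analysis.

```python
FLANK_NT = 10_000  # flank length on each side, as stated in the text


def extract_with_flanks(scaffold_seq, start, end, flank=FLANK_NT):
    """Return (subsequence, region_start, region_end).

    start and end are 1-based inclusive precursor coordinates.
    The region is clipped at the scaffold boundaries (an assumption).
    """
    region_start = max(1, start - flank)
    region_end = min(len(scaffold_seq), end + flank)
    return scaffold_seq[region_start - 1:region_end], region_start, region_end


def to_multifasta(records, width=60):
    """Format (header, sequence) pairs as multi-FASTA text."""
    lines = []
    for header, seq in records:
        lines.append(f">{header}")
        # Wrap the sequence at a fixed line width.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"
```

A header such as `scaffold:region_start-region_end` keeps each fragment traceable to its genomic position when the hairpin search is run on the stored sequences.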
Computational Prediction of Clustered microRNA Genes
Hairpin-like structures were identified from the DNA fragments using the EMBOSS einverted program and BLASTN. The parameters used for einverted were minimum score threshold 20, gap penalty 4, match score 2, mismatch score 2, and maximum extent of repeats 110, collecting sequences with a length between 50 and 110 nt. BLASTN was used to find matches to known pre-miRNA structures. These hairpin-like sequences were filtered by MFE (minimum free energy), GC content, homology to mature miRNA sequences, and similarity to other classes of noncoding RNAs. MFE calculations for the RNA secondary structures were performed using RNAfold from the Vienna RNA Package with the following parameters: RNA secondary folding energy threshold −20 kcal/mol and the options “-p -d2 -noLP” (Hofacker 2009). The structures were then filtered for a GC content ranging from 30 to 65 %. In addition, the animal mature miRNAs were aligned against these sequences; no more than six mismatches were accepted across the whole mature miRNA and no more than one mismatch in the seed region (nt 2–8). To remove other classes of noncoding RNAs (i.e., rRNA, snRNA, SL RNA, SRP, tRNAs, and RNase P), the putative hairpin-like sequences near the putative cluster members were compared against the Rfam microRNA Registry (version 10.0) (Gardner et al. 2009).
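The filtering criteria listed above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the MFE value is supplied by the caller (for example, taken from RNAfold output) rather than computed here, the mature miRNA is compared against the hairpin without gaps, and all function names are hypothetical. Whether the thresholds were applied inclusively is also an assumption.

```python
MFE_MAX = -20.0           # kcal/mol, folding energy threshold from the text
GC_RANGE = (30.0, 65.0)   # percent GC content
LEN_RANGE = (50, 110)     # hairpin length in nt
MAX_MISMATCHES = 6        # across the whole mature miRNA
MAX_SEED_MISMATCHES = 1   # within the seed region
SEED = slice(1, 8)        # positions 2-8 (1-based, inclusive)


def normalize(seq):
    """Upper-case and convert DNA to the RNA alphabet."""
    return seq.upper().replace("T", "U")


def gc_percent(seq):
    seq = normalize(seq)
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def mismatch_profile(window, mature):
    """Return (total mismatches, seed mismatches) for equal-length strings."""
    total = sum(a != b for a, b in zip(window, mature))
    seed = sum(a != b for a, b in zip(window[SEED], mature[SEED]))
    return total, seed


def has_mature_homolog(hairpin, mature):
    """True if any ungapped window of the hairpin meets both mismatch limits."""
    hairpin, mature = normalize(hairpin), normalize(mature)
    for i in range(len(hairpin) - len(mature) + 1):
        total, seed = mismatch_profile(hairpin[i:i + len(mature)], mature)
        if total <= MAX_MISMATCHES and seed <= MAX_SEED_MISMATCHES:
            return True
    return False


def passes_filters(hairpin, mfe, mature):
    """Apply the length, MFE, GC content, and mature-homology filters in turn."""
    if not LEN_RANGE[0] <= len(hairpin) <= LEN_RANGE[1]:
        return False
    if mfe > MFE_MAX:
        return False
    if not GC_RANGE[0] <= gc_percent(hairpin) <= GC_RANGE[1]:
        return False
    return has_mature_homolog(hairpin, mature)
```

Only the forward strand of the hairpin is scanned in this sketch, and the final comparison against other noncoding RNA classes is not modelled.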
Abbreviations
- miRNA: microRNA
References
Altuvia Y, Landgraf P, Lithwick G, Elefant N, Pfeffer S, Aravin A, Brownstein MJ, Tuschl T, Margalit H (2005) Clustering and conservation patterns of human microRNAs. Nucleic Acids Res 33:2697
Ayala FJ, Rzhetsky A (1998) Origin of the metazoan phyla: molecular clocks confirm paleontological estimates. Proc Natl Acad Sci USA 95:606
Berezikov E, Cuppen E, Plasterk RH (2006) Approaches to microRNA discovery. Nat Genet 38(Suppl):S2
Berriman M, Haas BJ, LoVerde PT, Wilson RA, Dillon GP, Cerqueira GC, Mashiyama ST, Al-Lazikani B, Andrade LF, Ashton PD, Aslett MA, Bartholomeu DC, Blandin G, Caffrey CR, Coghlan A, Coulson R, Day TA, Delcher A, DeMarco R, Djikeng A, Eyre T, Gamble JA, Ghedin E, Gu Y, Hertz-Fowler C, Hirai H, Hirai Y, Houston R, Ivens A, Johnston DA, Lacerda D, Macedo CD, McVeigh P, Ning Z, Oliveira G, Overington JP, Parkhill J, Pertea M, Pierce RJ, Protasio AV, Quail MA, Rajandream MA, Rogers J, Sajid M, Salzberg SL, Stanke M, Tivey AR, White O, Williams DL, Wortman J, Wu W, Zamanian M, Zerlotini A, Fraser-Liggett CM, Barrell BG, El-Sayed NM (2009) The genome of the blood fluke Schistosoma mansoni. Nature 460:352
Bushati N, Cohen SM (2007) microRNA functions. Annu Rev Cell Dev Biol 23:175
Chopra VS (2011) Chromosomal organization at the level of gene complexes. Cell Mol Life Sci 68:977
Colbourne JK, Pfrender ME, Gilbert D, Thomas WK, Tucker A, Oakley TH, Tokishita S, Aerts A, Arnold GJ, Basu MK, Bauer DJ, Caceres CE, Carmel L, Casola C, Choi JH, Detter JC, Dong Q, Dusheyko S, Eads BD, Frohlich T, Geiler-Samerotte KA, Gerlach D, Hatcher P, Jogdeo S, Krijgsveld J, Kriventseva EV, Kultz D, Laforsch C, Lindquist E, Lopez J, Manak JR, Muller J, Pangilinan J, Patwardhan RP, Pitluck S, Pritham EJ, Rechtsteiner A, Rho M, Rogozin IB, Sakarya O, Salamov A, Schaack S, Shapiro H, Shiga Y, Skalitzky C, Smith Z, Souvorov A, Sung W, Tang Z, Tsuchiya D, Tu H, Vos H, Wang M, Wolf YI, Yamagata H, Yamada T, Ye Y, Shaw JR, Andrews J, Crease TJ, Tang H, Lucas SM, Robertson HM, Bork P, Koonin EV, Zdobnov EM, Grigoriev IV, Lynch M, Boore JL (2011) The ecoresponsive genome of Daphnia pulex. Science 331:555
Consortium IAG (2010) Genome sequence of the pea aphid Acyrthosiphon pisum. PLoS Biol 8:e1000313
Consortium TSjGSaFA (2009) The Schistosoma japonicum genome reveals features of host-parasite interplay. Nature 460:345
de Souza Gomes M, Muniyappa MK, Carvalho SG, Guerra-Sa R, Spillane C (2011) Genome-wide identification of novel microRNAs and their target genes in the human parasite Schistosoma mansoni. Genomics 98:96
Douzery EJ, Snell EA, Bapteste E, Delsuc F, Philippe H (2004) The timing of eukaryotic evolution: does a relaxed molecular clock reconcile proteins and fossils? Proc Natl Acad Sci USA 101:15386
Gardner PP, Daub J, Tate JG, Nawrocki EP, Kolbe DL, Lindgreen S, Wilkinson AC, Finn RD, Griffiths-Jones S, Eddy SR, Bateman A (2009) Rfam: updates to the RNA families database. Nucleic Acids Res 37:D136
Hofacker IL (2009) RNA secondary structure analysis using the Vienna RNA package. Curr Protoc Bioinformatics Chapter 12:Unit12 2
Huang J, Hao P, Chen H, Hu W, Yan Q, Liu F, Han ZG (2009) Genome-wide identification of Schistosoma japonicum microRNAs using a deep-sequencing approach. PLoS ONE 4:e8206
Huang Y, Zou Q, Wang SP, Tang SM, Zhang GZ, Shen XJ (2010) The discovery approaches and detection methods of microRNAs. Mol Biol Rep 38(6):4125–4135
Kozomara A, Griffiths-Jones S (2011) miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res 39:D152
Lawson D, Arensburger P, Atkinson P, Besansky NJ, Bruggner RV, Butler R, Campbell KS, Christophides GK, Christley S, Dialynas E, Hammond M, Hill CA, Konopinski N, Lobo NF, MacCallum RM, Madey G, Megy K, Meyer J, Redmond S, Severson DW, Stinson EO, Topalis P, Birney E, Gelbart WM, Kafatos FC, Louis C, Collins FH (2009) VectorBase: a data resource for invertebrate vector genomics. Nucleic Acids Res 37:D583
Lee Y, Jeon K, Lee JT, Kim S, Kim VN (2002) MicroRNA maturation: stepwise processing and subcellular localization. EMBO J 21:4663
Li L, Xu J, Yang D, Tan X, Wang H (2010) Computational approaches for microRNA studies: a review. Mamm Genome 21:1
Marco A, Hui JH, Ronshaugen M, Griffiths-Jones S (2010) Functional shifts in insect microRNA evolution. Genome Biol Evol 2:686
Marco A, Hooks K, Griffiths-Jones S (2012) Evolution and function of the extended miR-2 microRNA family. RNA Biol 9:242
Palakodeti D, Smielewska M, Graveley BR (2006) MicroRNAs from the Planarian Schmidtea mediterranea: a model system for stem cell biology. RNA 12:1640
Protasio AV, Tsai IJ, Babbage A, Nichol S, Hunt M, Aslett MA, De Silva N, Velarde GS, Anderson TJ, Clark RC, Davidson C, Dillon GP, Holroyd NE, LoVerde PT, Lloyd C, McQuillan J, Oliveira G, Otto TD, Parker-Manuel SJ, Quail MA, Wilson RA, Zerlotini A, Dunne DW, Berriman M (2012) A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl Trop Dis 6:e1455
Saini HK, Enright AJ, Griffiths-Jones S (2008) Annotation of mammalian primary microRNAs. BMC Genomics 9:564
Xiong N, Huang J, Zhang Z, Xiong J, Liu X, Jia M, Wang F, Chen C, Cao X, Liang Z, Sun S, Lin Z, Wang T (2009) Stereotaxical infusion of rotenone: a reliable rodent model for Parkinson’s disease. PLoS ONE 4:e7878
Yook K, Harris TW, Bieri T, Cabunoc A, Chan J, Chen WJ, Davis P, de la Cruz N, Duong A, Fang R, Ganesan U, Grove C, Howe K, Kadam S, Kishore R, Lee R, Li Y, Muller HM, Nakamura C, Nash B, Ozersky P, Paulini M, Raciti D, Rangarajan A, Schindelman G, Shi X, Schwarz EM, Ann Tuli M, Van Auken K, Wang D, Wang X, Williams G, Hodgkin J, Berriman M, Durbin R, Kersey P, Spieth J, Stein L, Sternberg PW (2012) WormBase 2012: more genomes, more data, new website. Nucleic Acids Res 40:D735
Acknowledgments
CS and MTAD thank the support of Science Foundation Ireland grants 02/IN.1/B49 and 08/IN.1/B1931. GMS was supported by The Brazilian Federal Agency for Support and Evaluation of Graduate Education (Capes Foundation—Scholarship Proc. no. 1495-10-0) and FAPEMIG (CBB 2935/09).
Additional information
Matheus de Souza Gomes and Mark T. A. Donoghue contributed equally to this study.
Cite this article
de Souza Gomes, M., Donoghue, M.T.A., Muniyappa, M. et al. Computational Identification and Evolutionary Relationships of the MicroRNA Gene Cluster miR-71/2 in Protostomes. J Mol Evol 76, 353–358 (2013). https://doi.org/10.1007/s00239-013-9563-2