Abstract
MicroRNAs (miRNAs) are short, noncoding RNAs that have the capacity to bind, capture, and silence hundreds of genes within and across diverse signaling pathways (Bartel, Cell 136:215–33, 2009). Specific sets of miRNAs characterize specific cell lineages of normal organisms, and an increasing number of diseases have been shown to be associated with the dysregulation of specific miRNAs. Deep sequencing platforms have revealed unexpected complexity in relation to miRNAs, including 5′- and 3′-end-length heterogeneity and RNA editing. These insights, which were not uncovered by previous microarray-based studies, underscore the importance of data analysis tools that enable users to rapidly and easily analyze the unprecedented amounts of small RNA sequencing data that are emerging from next-generation sequencing platforms such as Illumina/Solexa, SOLiD, and 454. In this chapter, we summarize the increasing number of analysis platforms that are available for miRNA discovery and profiling and for the identification of functional miRNA–mRNA pairs in the context of biology and disease. We also discuss in greater detail our contributions to this effort.
References
Bartel, D. P. (2009) MicroRNAs: Target recognition and regulatory functions. Cell 136, 215–33.
Thomson, J. M., Parker, J., Perou, C. M., and Hammond, S. M. (2004) A custom microarray platform for analysis of microRNA gene expression. Nat Methods 1, 47–53.
Miska, E. A., Alvarez-Saavedra, E., Townsend, M., Yoshii, A., Sestan, N., Rakic, P., et al. (2004) Microarray analysis of microRNA expression in the developing mammalian brain. Genome Biol 5, R68.
Mardis, E. R. (2008) The impact of next-generation sequencing technology on genetics. Trends Genet 24, 133–41.
Morozova, O. and Marra, M. A. (2008) Applications of next-generation sequencing technologies in functional genomics. Genomics 92, 255–64.
Creighton, C. J., Reid, J. G., and Gunaratne, P. H. (2009) Expression profiling of microRNAs by deep sequencing. Brief Bioinformatics 10, 490–7.
Creighton, C. J., Nagaraja, A. K., Hanash, S. M., Matzuk, M. M., and Gunaratne, P. H. (2008) A bioinformatics tool for linking gene expression profiling results with public databases of microRNA target predictions. RNA 14, 2290–6.
Griffiths-Jones, S., Moxon, S., Marshall, M., Khanna, A., Eddy, S. R., and Bateman, A. (2005) Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Res 33, D121–4.
Sethupathy, P., Megraw, M., and Hatzigeorgiou, A. (2006) A guide through present computational approaches for the identification of mammalian microRNA targets. Nat Methods 3, 881–6.
Lewis, B. P., Shih, I. H., Jones-Rhoades, M. W., Bartel, D. P., and Burge, C. B. (2003) Prediction of mammalian microRNA targets. Cell 115, 787–98.
Krek, A., Grün, D., Poy, M., Wolf, R., Rosenberg, L., Epstein, E., et al. (2005) Combinatorial microRNA target predictions. Nat Genet 37, 495–500.
Betel, D., Wilson, M., Gabow, A., Marks, D., and Sander, C. (2008) The microRNA.org resource: Targets and expression. Nucleic Acids Res 36, D149–53.
Jiang, Q., Wang, Y., Hao, Y., Juan, L., Teng, M., Zhang, X., et al. (2009) miR2Disease: a manually curated database for microRNA deregulation in human disease. Nucleic Acids Res 37, D98–104.
Nam, S., Kim, B., Shin, S., and Lee, S. (2008) miRGator: an integrated system for functional annotation of microRNAs. Nucleic Acids Res 36, D159–64.
Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S., Ebert, B. L., Gillette, M. A., et al. (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA 102, 15545–50.
Kaya, K. D., Karakülah, G., Yakicier, C. M., Acar, A. C., and Konu, O. (2011) mESAdb: microRNA expression and sequence analysis database. Nucleic Acids Res 39, D170–80.
Pasaniuc, B., Zaitlen, N., and Halperin, E. (2010) Accurate estimation of expression levels of homologous genes in RNA-seq experiments. Proceedings of the Fourteenth International Conference on Research in Computational Biology 397–409.
Allen, E., Xie, Z., Gustafson, A. M., and Carrington, J. C. (2005) microRNA-directed phasing during trans-acting siRNA biogenesis in plants. Cell 121, 207–21.
Chen, H. M., Li, Y. H., and Wu, S. H. (2007) Bioinformatic prediction and experimental validation of a microRNA-directed tandem trans-acting siRNA cascade in Arabidopsis. Proc Natl Acad Sci USA 104, 3318–23.
Hackenberg, M. and Matthiesen, R. (2008) Annotation-Modules: a tool for finding significant combinations of multisource annotations for gene lists. Bioinformatics 24, 1386–93.
Witten, I. H. and Frank, E. (2005) Data Mining: practical machine learning tools and techniques. Morgan Kaufmann Publishers, San Francisco.
Breiman, L. (2001) Random forests. Machine Learning 45, 5–32.
Dohm, J. C., Lottaz, C., Borodina, T., and Himmelbauer, H. (2008) Substantial biases in ultra-short read data sets from high-throughput DNA sequencing. Nucleic Acids Res 36, e105.
Reinartz, J., Bruyns, E., Lin, J. Z., Burcham, T., Brenner, S., Bowen, B., et al. (2002) Massively parallel signature sequencing (MPSS) as a tool for in-depth quantitative gene expression profiling in all organisms. Brief Funct Genomic Proteomic 1, 95–104.
Benjamini, Y. and Hochberg, Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Roy Stat Soc Ser B 57, 289–300.
Mullan, L. J. and Bleasby, A. J. (2002) Short EMBOSS user guide. European molecular biology open software suite. Brief Bioinformatics 3, 92–4.
Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389–402.
Li, R., Yu, C., Li, Y., Lam, T. W., Yiu, S. M., Kristiansen, K., et al. (2009) SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics 25, 1966–7.
Griffiths-Jones, S., Saini, H. K., van Dongen, S., and Enright, A. J. (2008) miRBase: tools for microRNA genomics. Nucleic Acids Res 36, D154–8.
Chen, N. (2004) Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinform Chapter 4, Unit 4.10.
Audic, S., and Claverie, J. M. (1997) The significance of digital gene expression profiles. Genome Res 7, 986–95.
Hofacker, I. L. (2003) Vienna RNA secondary structure server. Nucleic Acids Res 31, 3429–31.
Coarfa, C., Yu, F., Miller, C. A., Chen, Z., Harris, R. A., and Milosavljevic, A. (2010) Pash 3.0: A versatile software package for read mapping and integrative analysis of genomic and epigenomic variation using massively parallel DNA sequencing. BMC Bioinformatics 11, 572.
Li, H. and Durbin, R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–60.
Caporaso, J. G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F. D., Costello, E. K., et al. (2010) QIIME allows analysis of high-throughput community sequencing data. Nat Methods 7, 335–6.
Kursa, M. B. and Rudnicki, W. R. (2010) Feature selection with the Boruta package. J Stat Softw 36, 1–13.
Copyright information
© 2012 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Gunaratne, P.H., Coarfa, C., Soibam, B., Tandon, A. (2012). miRNA Data Analysis: Next-Gen Sequencing. In: Fan, JB. (eds) Next-Generation MicroRNA Expression Profiling Technology. Methods in Molecular Biology, vol 822. Humana Press. https://doi.org/10.1007/978-1-61779-427-8_19
DOI: https://doi.org/10.1007/978-1-61779-427-8_19
Publisher Name: Humana Press
Print ISBN: 978-1-61779-426-1
Online ISBN: 978-1-61779-427-8
eBook Packages: Springer Protocols