Abstract
GeMoMa is a homology-based gene prediction program that predicts gene models in target species based on gene models in evolutionarily related reference species. GeMoMa utilizes amino acid sequence conservation, intron position conservation, and RNA-seq data to accurately predict protein-coding transcripts. Furthermore, GeMoMa supports combining predictions based on several reference species, which allows high-quality annotations from different reference species to be transferred to a target species. Here, we present a detailed description of the GeMoMa modules and the GeMoMa pipeline, and show how they can be used on the command line to address particular biological problems.
Copyright information
© 2019 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Keilwagen, J., Hartung, F., Grau, J. (2019). GeMoMa: Homology-Based Gene Prediction Utilizing Intron Position Conservation and RNA-seq Data. In: Kollmar, M. (ed.) Gene Prediction. Methods in Molecular Biology, vol 1962. Humana, New York, NY. https://doi.org/10.1007/978-1-4939-9173-0_9
DOI: https://doi.org/10.1007/978-1-4939-9173-0_9
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-4939-9172-3
Online ISBN: 978-1-4939-9173-0
eBook Packages: Springer Protocols