Abstract
Inferring well-supported phylogenetic trees that precisely reflect the evolutionary process is a challenging task that depends entirely on how the underlying core genes have been found. In previous computational biology studies, many similarity-based algorithms, mainly relying on sequence alignment matrices, have been proposed to find them. In these approaches, a significantly high similarity score between two coding sequences extracted with a given annotation tool is taken to mean that they are the same gene. In a previous article, we presented a quality test approach (QTA) that improves the quality of the core genes by combining two annotation tools (namely NCBI, a partially human-curated database, and DOGMA, an efficient annotation algorithm for chloroplasts). This method takes advantage of both sequence similarity and gene features to guarantee that the core genome contains correct and well-clustered coding sequences (i.e., genes). In this article, we show how useful such well-defined core genes are for biomolecular phylogenetic reconstructions by investigating various subsets of core genes at various family or genus levels, leading to subtrees with strong bootstrap support that are finally merged into a well-supported supertree.
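As an illustration only, the idea of a core genome built from two annotation sources can be sketched in Python. This is not the authors' QTA: it is a simplified sketch that assumes genes are matched by name alone, whereas the abstract states that the actual method also uses sequence similarity and gene features. The genome labels, the gene lists, and the helper names `agreed_annotations` and `core_genes` are hypothetical.

```python
def agreed_annotations(ncbi, dogma):
    """For each genome annotated by both tools, keep only the gene names
    that both tools report (a crude stand-in for a quality test)."""
    shared_genomes = ncbi.keys() & dogma.keys()
    return {g: set(ncbi[g]) & set(dogma[g]) for g in shared_genomes}


def core_genes(genomes):
    """Return the gene names present in every genome (set intersection)."""
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Hypothetical toy annotations: genome label -> gene names.
ncbi = {
    "genome_A": {"rbcL", "psbA", "atpB", "ycf1"},
    "genome_B": {"rbcL", "psbA", "atpB"},
    "genome_C": {"rbcL", "psbA", "matK"},
}
dogma = {
    "genome_A": {"rbcL", "psbA", "atpB"},
    "genome_B": {"rbcL", "psbA", "ndhF"},
    "genome_C": {"rbcL", "psbA", "matK"},
}

agreed = agreed_annotations(ncbi, dogma)
core = core_genes(agreed)
print(sorted(core))
```

In this toy data, only genes confirmed by both sources in every genome survive, so a gene that one tool misses in a single genome drops out of the core. That shows why the quality of the annotations directly shapes the core genome used for tree inference.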
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
AlKindy, B., Al-Nayyef, H., Guyeux, C., Couchot, JF., Salomon, M., Bahi, J.M. (2015). Improved Core Genes Prediction for Constructing Well-Supported Phylogenetic Trees in Large Sets of Plant Species. In: Ortuño, F., Rojas, I. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2015. Lecture Notes in Computer Science, vol 9043. Springer, Cham. https://doi.org/10.1007/978-3-319-16483-0_38
DOI: https://doi.org/10.1007/978-3-319-16483-0_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16482-3
Online ISBN: 978-3-319-16483-0
eBook Packages: Computer Science, Computer Science (R0)