Abstract
Genome-based phylogeny is expected to play a central role in the future taxonomy and phylogenetics of Bacteria and Archaea by replacing 16S rRNA gene phylogeny. Concatenated core gene alignments are frequently used for this purpose. Bacterial core genes are defined as single-copy, homologous genes that are present in most known bacterial species. Several studies have described such gene sets, but the number of species they considered was rather small. Here we present an up-to-date bacterial core gene set, named UBCG, together with software that covers the steps needed to generate and evaluate phylogenetic trees. The method was successfully used to infer the phylogenomic relationships of Escherichia and related taxa and can be applied to a set of genomes at any taxonomic rank of Bacteria. The UBCG pipeline and file viewer are freely available at https://www.ezbiocloud.net/tools/ubcg and https://www.ezbiocloud.net/tools/ubcg_viewer, respectively.
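To make the definition of a core gene and the idea of a concatenated alignment concrete, the following is a minimal Python sketch. It is not the UBCG implementation: the function names, the input dictionaries, and the 0.95 presence threshold are hypothetical choices made for illustration, since the abstract only says that core genes are single-copy and present in "most" species.

```python
from typing import Dict, List


def select_core_genes(copies: Dict[str, Dict[str, int]],
                      min_fraction: float = 0.95) -> List[str]:
    """Return genes found as exactly one copy in at least `min_fraction` of genomes.

    `copies[genome][gene]` is the copy number of `gene` in `genome`
    (hypothetical input layout; the threshold is an assumed value).
    """
    genomes = list(copies)
    if not genomes:
        return []
    genes = sorted({gene for counts in copies.values() for gene in counts})
    core = []
    for gene in genes:
        single_copy = sum(1 for g in genomes if copies[g].get(gene, 0) == 1)
        if single_copy / len(genomes) >= min_fraction:
            core.append(gene)
    return core


def concatenate_alignments(alignments: Dict[str, Dict[str, str]],
                           genomes: List[str],
                           genes: List[str]) -> Dict[str, str]:
    """Join per-gene alignments into one sequence per genome.

    `alignments[gene][genome]` is the aligned sequence of `gene` in `genome`.
    A genome lacking a gene is padded with gap characters so that all
    concatenated sequences keep the same length.
    """
    parts: Dict[str, List[str]] = {genome: [] for genome in genomes}
    for gene in genes:
        per_genome = alignments[gene]
        length = len(next(iter(per_genome.values())))
        for genome in genomes:
            parts[genome].append(per_genome.get(genome, "-" * length))
    return {genome: "".join(chunks) for genome, chunks in parts.items()}
```

The concatenated sequences produced this way would then be passed to a tree-building program. Padding missing genes with gaps is one common convention, chosen here only to keep the sketch short.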
Additional information
Supplemental material for this article may be found at http://www.springerlink.com/content/120956
About this article
Cite this article
Na, SI., Kim, Y.O., Yoon, SH. et al. UBCG: Up-to-date bacterial core gene set and pipeline for phylogenomic tree reconstruction. J Microbiol. 56, 280–285 (2018). https://doi.org/10.1007/s12275-018-8014-6
DOI: https://doi.org/10.1007/s12275-018-8014-6