Ortholog Identification and Comparative Analysis of Microbial Genomes Using MBGD and RECOG

  • Protocol
Protein Function Prediction

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1611))

Abstract

Comparative genomics is becoming an essential approach for identifying genes associated with a specific function or phenotype. Here, we introduce the microbial genome database for comparative analysis (MBGD), a comprehensive ortholog database covering the microbial genomes available so far. MBGD contains several precomputed ortholog tables, including the standard ortholog table covering the entire taxonomic range and taxon-specific ortholog tables for various major taxa. In addition, MBGD allows users to create an ortholog table for any specified set of genomes through dynamic calculation. In particular, MBGD has a “My MBGD” mode in which users can upload their own genome sequences and incorporate them into the orthology analysis. The resulting ortholog table can serve as the basis for various comparative analyses. We describe the use of MBGD and briefly explain how to use the orthology information in comparative genome analysis in combination with the stand-alone comparative genomics software RECOG, focusing on the comparison of closely related microbial genomes.
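To illustrate how an ortholog table can serve as the basis for comparative analysis, the following minimal Python sketch derives presence/absence profiles, the set of universally conserved groups, and the groups restricted to a chosen subset of genomes. It is not MBGD or RECOG code: the in-memory table layout, the group and genome identifiers, and the function names are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

# Hypothetical toy ortholog table: group ID -> {genome name: [gene IDs]}.
# The layout and all identifiers are assumptions, not an MBGD export format.
OrthologTable = Dict[str, Dict[str, List[str]]]

table: OrthologTable = {
    "OG1": {"genomeA": ["a1"], "genomeB": ["b1"], "genomeC": ["c1"]},
    "OG2": {"genomeA": ["a2", "a3"], "genomeB": ["b2"]},
    "OG3": {"genomeC": ["c2"]},
}
genomes: List[str] = ["genomeA", "genomeB", "genomeC"]


def presence_profile(table: OrthologTable, genomes: List[str]) -> Dict[str, List[int]]:
    """Return a 1/0 presence vector per ortholog group, ordered like `genomes`."""
    return {
        og: [1 if members.get(g) else 0 for g in genomes]
        for og, members in table.items()
    }


def core_groups(table: OrthologTable, genomes: List[str]) -> Set[str]:
    """Return groups that have at least one gene in every genome."""
    return {
        og for og, members in table.items()
        if all(members.get(g) for g in genomes)
    }


def groups_specific_to(
    table: OrthologTable, genomes: List[str], targets: Set[str]
) -> Set[str]:
    """Return groups present in every target genome and absent from all others."""
    others = [g for g in genomes if g not in targets]
    return {
        og for og, members in table.items()
        if all(members.get(g) for g in targets)
        and not any(members.get(g) for g in others)
    }


profiles = presence_profile(table, genomes)
core = core_groups(table, genomes)
specific = groups_specific_to(table, genomes, {"genomeA", "genomeB"})
```

Presence vectors of this kind are the input to phylogenetic-profile comparisons, and the group-specific set is one simple way to shortlist candidate genes associated with a phenotype shared by the target genomes.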

Acknowledgment

The author thanks Hirokazu Chiba, Hiroyo Nishide, and Motohiro Mihara for the development and maintenance of the MBGD Web service. The development of MBGD is supported by the National Bioscience Database Center, Japan Science and Technology Agency. Computational resources were provided by the Data Integration and Analysis Facility, National Institute for Basic Biology.

Author information

Correspondence to Ikuo Uchiyama.

Copyright information

© 2017 Springer Science+Business Media LLC

About this protocol

Cite this protocol

Uchiyama, I. (2017). Ortholog Identification and Comparative Analysis of Microbial Genomes Using MBGD and RECOG. In: Kihara, D. (ed) Protein Function Prediction. Methods in Molecular Biology, vol 1611. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7015-5_12

  • DOI: https://doi.org/10.1007/978-1-4939-7015-5_12

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-7013-1

  • Online ISBN: 978-1-4939-7015-5

  • eBook Packages: Springer Protocols
