MSACompro: Improving Multiple Protein Sequence Alignment by Predicted Structural Features

  • Protocol
Multiple Sequence Alignment Methods

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1079))

Abstract

Multiple sequence alignment (MSA) is an essential tool in protein structure modeling, gene and protein function prediction, DNA motif recognition, phylogenetic analysis, and many other bioinformatics tasks. Improving the accuracy of multiple sequence alignment is therefore an important long-term objective in bioinformatics. We designed and developed a new method, MSACompro, which incorporates predicted secondary structure, relative solvent accessibility, and residue–residue contact information into the currently most accurate posterior probability-based MSA methods to improve alignment accuracy. Unlike multiple sequence alignment methods that use the tertiary structure information of some sequences, our method uses structural information predicted purely from sequences. In this chapter, we first introduce some background and related techniques in the field of multiple sequence alignment. We then describe the algorithm of MSACompro in detail. Finally, we show that integrating predicted protein structural information improves multiple sequence alignment accuracy.

Acknowledgment

This work was supported by an NIH grant (1R01GM093123) to JC.

Copyright information

© 2014 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Deng, X., Cheng, J. (2014). MSACompro: Improving Multiple Protein Sequence Alignment by Predicted Structural Features. In: Russell, D. (ed.) Multiple Sequence Alignment Methods. Methods in Molecular Biology, vol 1079. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-646-7_18

  • DOI: https://doi.org/10.1007/978-1-62703-646-7_18

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-645-0

  • Online ISBN: 978-1-62703-646-7

  • eBook Packages: Springer Protocols
