MSACompro: Improving Multiple Protein Sequence Alignment by Predicted Structural Features

Deng, Xin; Cheng, Jianlin

doi:10.1007/978-1-62703-646-7_18

Part of the book series: Methods in Molecular Biology (MIMB, volume 1079)

Abstract

Multiple sequence alignment (MSA) is an essential tool in protein structure modeling, gene and protein function prediction, DNA motif recognition, phylogenetic analysis, and many other bioinformatics tasks. Improving the accuracy of multiple sequence alignment is therefore an important long-term objective in bioinformatics. We designed and developed a new method, MSACompro, that incorporates predicted secondary structure, relative solvent accessibility, and residue–residue contact information into the currently most accurate posterior probability-based MSA methods to improve alignment accuracy. Unlike multiple sequence alignment methods that use the tertiary structure information of some sequences, our method uses structural information predicted purely from sequences. In this chapter, we first introduce background and related techniques in the field of multiple sequence alignment. We then describe the MSACompro algorithm in detail. Finally, we show that integrating predicted protein structural information improves multiple sequence alignment accuracy.
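The general idea of letting predicted structural features influence an alignment can be illustrated with a small sketch. It is not the MSACompro algorithm, which is posterior probability-based; it only shows, for a plain global pairwise alignment, how agreement in predicted secondary structure and solvent accessibility might be added to a residue match score. The function names, the identity-based sequence score, the feature encodings (H/E/C for secondary structure, e/b for exposed/buried), the weights, and the gap penalty are all illustrative assumptions rather than values from the chapter.

```python
def match_score(a, b, ss_a, ss_b, sa_a, sa_b, w_ss=0.5, w_sa=0.5):
    """Score one residue pair (hypothetical scoring scheme).

    a, b       : amino acid letters
    ss_a, ss_b : predicted secondary structure labels (assumed H/E/C)
    sa_a, sa_b : predicted solvent accessibility labels (assumed e/b)
    w_ss, w_sa : assumed bonus weights for agreeing structural labels
    """
    seq_term = 1.0 if a == b else -1.0       # toy identity score, not a real matrix
    ss_term = w_ss if ss_a == ss_b else 0.0  # bonus when secondary structure agrees
    sa_term = w_sa if sa_a == sa_b else 0.0  # bonus when accessibility agrees
    return seq_term + ss_term + sa_term


def align_score(seq1, seq2, ss1, ss2, sa1, sa2, gap=-1.0, w_ss=0.5, w_sa=0.5):
    """Best global alignment score (Needleman-Wunsch with a linear gap penalty)."""
    n, m = len(seq1), len(seq2)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap                   # seq1 prefix aligned to gaps
    for j in range(1, m + 1):
        dp[0][j] = j * gap                   # seq2 prefix aligned to gaps
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match_score(seq1[i - 1], seq2[j - 1],
                            ss1[i - 1], ss2[j - 1],
                            sa1[i - 1], sa2[j - 1],
                            w_ss, w_sa)
            dp[i][j] = max(dp[i - 1][j - 1] + s,  # align the two residues
                           dp[i - 1][j] + gap,    # gap in seq2
                           dp[i][j - 1] + gap)    # gap in seq1
    return dp[n][m]


if __name__ == "__main__":
    # Toy inputs; the structural strings stand in for predictor output.
    print(align_score("ACD", "ACD", "HHE", "HHE", "eeb", "eeb"))
```

Setting `w_ss` and `w_sa` to zero reduces the sketch to a sequence-only alignment, which makes it easy to compare scores with and without the structural terms.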

Acknowledgment

This work was supported by an NIH grant (1R01GM093123) to JC.

Author information

Authors and Affiliations

Computer Science Department, University of Missouri, Columbia, MO, USA
Xin Deng
Computer Science Department, Life Science Center, Informatics Institute, University of Missouri, Columbia, MO, USA
Jianlin Cheng

Editor information

Editors and Affiliations

Dept. of Electrical Engineering, University of Nebraska-Lincoln, Lincoln, Nebraska, USA
David J Russell

About this protocol

Cite this protocol

Deng, X., Cheng, J. (2014). MSACompro: Improving Multiple Protein Sequence Alignment by Predicted Structural Features. In: Russell, D. (eds) Multiple Sequence Alignment Methods. Methods in Molecular Biology, vol 1079. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-646-7_18

DOI: https://doi.org/10.1007/978-1-62703-646-7_18
Published: 23 August 2013
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-645-0
Online ISBN: 978-1-62703-646-7
eBook Packages: Springer Protocols
