Multiple Vector Seeds for Protein Alignment

Brown, Daniel G.

doi:10.1007/978-3-540-30219-3_15

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3240)

Included in the following conference series:

International Workshop on Algorithms in Bioinformatics

Abstract

We present a framework for improving local protein alignment algorithms. Specifically, we discuss how to extend local protein aligners to use a collection of vector seeds [3] to reduce noise hits. We model the choice of a set of vector seeds as an integer programming problem and give algorithms for choosing such a set. A good set of vector seeds that we have chosen yields four times fewer false positive hits while preserving essentially the same sensitivity as BLASTP.
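
As background for the abstract above, the following minimal Python sketch illustrates what it means for an alignment to be hit by a vector seed, and by a collection of such seeds. It follows the definition in [3], where a vector seed is a pair (v, T) of a weight vector and a threshold, and a window of pairwise alignment scores is a hit when its weighted sum reaches T. The function names, the example scores, and the example seeds are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch does not show the paper's seed-selection algorithms.

```python
from typing import List, Sequence, Tuple

# A vector seed is modelled here as (weights, threshold); this layout is an
# assumption made for illustration, not the paper's data structure.
VectorSeed = Tuple[Sequence[int], int]


def vector_seed_hits(scores: Sequence[int], weights: Sequence[int],
                     threshold: int) -> List[int]:
    """Return window start positions whose weighted score sum is >= threshold.

    `scores` holds the per-column substitution scores of an ungapped
    alignment (for example, taken from a scoring matrix).
    """
    k = len(weights)
    hits = []
    for start in range(len(scores) - k + 1):
        window = scores[start:start + k]
        total = sum(w * s for w, s in zip(weights, window))
        if total >= threshold:
            hits.append(start)
    return hits


def any_seed_hits(scores: Sequence[int], seeds: Sequence[VectorSeed]) -> bool:
    """With multiple seeds, an alignment is detected if any single seed hits."""
    return any(vector_seed_hits(scores, w, t) for w, t in seeds)


# Hypothetical column scores and two hypothetical seeds.
example_scores = [5, -1, 4, 6, -2, 3]
spaced_seed = ((1, 0, 1), 9)      # ignores the middle column
contiguous_seed = ((1, 1, 1), 9)  # sums three consecutive columns
```

Under these assumed inputs, `vector_seed_hits(example_scores, *spaced_seed)` reports windows starting at positions 0 and 3, while the contiguous seed hits only at position 1. Raising a seed's threshold trades sensitivity for fewer noise hits, which is the tension the seed-selection problem in the abstract addresses.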

References

[1] Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J.: Basic local alignment search tool. Journal of Molecular Biology 215(3), 403–410 (1990)
[2] Bairoch, A., Apweiler, R.: The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Research 28(1), 45–48 (2000)
[3] Brejova, B., Brown, D., Vinar, T.: Vector seeds: an extension to spaced seeds allows substantial improvements in sensitivity and specificity. In: Benson, G., Page, R.D.M. (eds.) WABI 2003. LNCS (LNBI), vol. 2812, pp. 39–54. Springer, Heidelberg (2003)
[4] Brejova, B., Brown, D., Vinar, T.: Optimal spaced seeds for homologous coding regions. J. Bioinf. and Comp. Biol. 1, 595–610 (2004)
[5] Buhler, J., Keich, U., Sun, Y.: Designing seeds for similarity search in genomic DNA. In: Proceedings of the 7th Annual International Conference on Computational Biology (RECOMB), pp. 67–75 (2003)
[6] Choi, K.P., Zhang, L.: Sensitivity analysis and efficient method for identifying optimal spaced seeds. J. Comp. and Sys. Sci. 68, 22–40 (2004)
[7] Hochbaum, D.: Approximating covering and packing problems. In: Hochbaum, D. (ed.) Approximation algorithms for NP-hard problems, pp. 94–143. PWS (1997)
[8] Keich, U., Li, M., Ma, B., Tromp, J.: On spaced seeds for similarity search. Discrete Appl. Math. 138, 253–263 (2004)
[9] Li, M., Ma, B., Kisman, D., Tromp, J.: PatternHunter II: Highly sensitive and fast homology search. Journal of Bioinformatics and Computational Biology (2004)
[10] Ma, B., Tromp, J., Li, M.: PatternHunter: faster and more sensitive homology search. Bioinformatics 18(3), 440–445 (2002)
[11] Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197 (1981)
[12] Sun, Y., Buhler, J.: Designing multiple simultaneous seeds for DNA similarity search. In: Proceedings of the 8th Annual International Conference on Computational Biology (RECOMB), pp. 76–84 (2004)
[13] Xu, J., Brown, D., Li, M., Ma, B.: Optimizing multiple spaced seeds for homology search. In: Sahinalp, S.C., Muthukrishnan, S.M., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109, pp. 47–58. Springer, Heidelberg (2004)

Author information

Authors and Affiliations

School of Computer Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
Daniel G. Brown

Editor information

Editors and Affiliations

Department of Informatics and Computational Biology Unit, HIB, University of Bergen, 5020, Bergen, Norway
Inge Jonassen
Department of Biology, Penn Center for Bioinformatics, Penn Genomics Institute, 415 S. University Ave., Philadelphia, PA 19104, USA
Junhyong Kim

About this paper

Cite this paper

Brown, D.G. (2004). Multiple Vector Seeds for Protein Alignment. In: Jonassen, I., Kim, J. (eds) Algorithms in Bioinformatics. WABI 2004. Lecture Notes in Computer Science, vol 3240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30219-3_15

DOI: https://doi.org/10.1007/978-3-540-30219-3_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23018-2
Online ISBN: 978-3-540-30219-3
eBook Packages: Springer Book Archive
