Abstract
Polymers are compounds formed by joining smaller, often repeating, units through covalent bonds. Determining their sequence is a fundamental problem in many areas of chemistry, medicine and biology. Today, the prevalent approach relies on mass spectrometry, which provides the molecular weights of the polymer and of its fragments; the sequence must then be reconstructed from this information. This reconstruction is, however, a difficult mathematical problem, and several approaches have been proposed for it. A promising one models the problem in propositional logic. This paper presents conceptual improvements to that approach, principally the off-line computation of a database that substantially speeds up the sequencing operations. The database is obtained by establishing a correspondence between sequences and natural numbers, so that all sequences up to a certain molecular weight are represented implicitly in the database and computed explicitly only when needed. Results on real-world problems show the effectiveness of this approach.
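The abstract does not spell out the correspondence between sequences and natural numbers, so the following is only an illustrative sketch of one standard way to build such a correspondence (bijective base-k numeration), not the paper's actual construction. The component names, their integer masses, and all function names are hypothetical and chosen purely for the example.

```python
from typing import Dict, List, Sequence

# Hypothetical components with illustrative integer masses (not from the paper).
ALPHABET: List[str] = ["A", "B", "C"]
MASSES: Dict[str, int] = {"A": 57, "B": 71, "C": 87}


def index_to_sequence(n: int, alphabet: Sequence[str]) -> List[str]:
    """Decode a natural number n >= 0 into a sequence (0 is the empty sequence)."""
    if n < 0:
        raise ValueError("index must be non-negative")
    k = len(alphabet)
    out: List[str] = []
    while n > 0:
        n -= 1                      # shift to 0-based digit (bijective base-k)
        out.append(alphabet[n % k])
        n //= k
    out.reverse()
    return out


def sequence_to_index(seq: Sequence[str], alphabet: Sequence[str]) -> int:
    """Encode a sequence as the natural number that index_to_sequence decodes."""
    k = len(alphabet)
    position = {symbol: i for i, symbol in enumerate(alphabet)}
    n = 0
    for symbol in seq:
        n = n * k + position[symbol] + 1
    return n


def weight(seq: Sequence[str], masses: Dict[str, int]) -> int:
    """Total mass of a sequence as the sum of its component masses."""
    return sum(masses[symbol] for symbol in seq)


def sequences_with_weight(target: int, max_index: int) -> List[int]:
    """Indices below max_index whose decoded sequence has the target weight.

    Only integers are stored or scanned; a sequence is materialized on demand.
    """
    return [
        n for n in range(max_index)
        if weight(index_to_sequence(n, ALPHABET), MASSES) == target
    ]
```

Because every sequence has exactly one index and every index decodes to exactly one sequence, a table keyed by integers can stand in for the sequences themselves, and a sequence is written out only when its index is actually looked up. A brute-force scan such as `sequences_with_weight` is of course far simpler than a precomputed database; it is included only to show the decode-on-demand idea.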
Additional information
Italian Patent number: MI2002A 000396. International Patent Application number: PCT/IB03/00714.
About this article
Cite this article
Bruni, R. A Logic-Based Approach to Polymer Sequence Analysis. J Math Model Algor 9, 213–232 (2010). https://doi.org/10.1007/s10852-010-9136-y
DOI: https://doi.org/10.1007/s10852-010-9136-y