Abstract
Alignment of samples from liquid chromatography-mass spectrometry (LC-MS) measurements plays a significant role in the detection of biomarkers and in metabolomic studies. Machine drift causes differences between LC-MS measurements, so the retention time shifts introduced to the same peptide or metabolite must be aligned accurately. In this paper, we propose the use of genetic programming (GP) for multiple alignment of LC-MS data. The proposed approach consists of two main phases. The first phase is peak matching, in which the peaks from different LC-MS maps (peak lists) are matched to allow the calculation of the retention time deviation. The second phase uses GP for multiple alignment of the peak lists with respect to a reference. GP is designed to perform multiple-output regression by using a special node in the tree that divides the output of the tree into multiple outputs. Finally, the peaks that show the maximum correlation after dewarping the retention times are selected to form a consensus aligned map. The proposed approach is tested on one proteomics and two metabolomics LC-MS datasets with different numbers of samples. The method is compared to several benchmark methods, and the results show that the proposed approach outperforms these methods in three fractions of the proteomics dataset and in the metabolomics dataset with the larger number of maps. Moreover, the results on the rest of the datasets are highly competitive with the other methods.
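The first phase described above (peak matching followed by calculation of the retention time deviation) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the matching criteria, so the function name `match_peaks`, the tolerances `mz_tol` and `rt_tol`, the nearest-retention-time rule, and the toy peak lists are all assumptions made for illustration.

```python
from bisect import bisect_left


def match_peaks(reference, sample, mz_tol=0.01, rt_tol=60.0):
    """Pair each sample peak with the closest reference peak.

    Peaks are (mz, rt) tuples. A reference peak is a candidate when its m/z
    lies within mz_tol of the sample peak and its retention time within
    rt_tol (both tolerances are hypothetical defaults). Among candidates,
    the one with the nearest retention time wins. Several sample peaks may
    map to the same reference peak; this sketch does not enforce a
    one-to-one matching.

    Returns a list of (ref_rt, sample_rt, deviation) tuples, where
    deviation = sample_rt - ref_rt.
    """
    ref_sorted = sorted(reference)          # sort by m/z for binary search
    ref_mz = [mz for mz, _ in ref_sorted]
    matches = []
    for mz, rt in sample:
        i = bisect_left(ref_mz, mz - mz_tol)  # first reference peak in the m/z window
        best_rt = None
        while i < len(ref_sorted) and ref_sorted[i][0] <= mz + mz_tol:
            cand_rt = ref_sorted[i][1]
            diff = abs(cand_rt - rt)
            if diff <= rt_tol and (best_rt is None or diff < abs(best_rt - rt)):
                best_rt = cand_rt
            i += 1
        if best_rt is not None:
            matches.append((best_rt, rt, rt - best_rt))
    return matches


# Toy peak lists (made-up values): (m/z, retention time in seconds).
reference = [(400.20, 100.0), (512.30, 250.0), (630.10, 400.0)]
sample = [(400.201, 104.0), (512.299, 257.0), (700.00, 300.0)]

matches = match_peaks(reference, sample)
for ref_rt, sample_rt, deviation in matches:
    print(f"reference RT {ref_rt:.1f} s, sample RT {sample_rt:.1f} s, deviation {deviation:+.1f} s")
```

The resulting (reference RT, deviation) pairs are the kind of data a regression model, such as the GP described in the abstract, could be fitted to in order to dewarp the retention times of each map toward the reference.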
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ahmed, S., Zhang, M., Peng, L. (2014). GPMS: A Genetic Programming Based Approach to Multiple Alignment of Liquid Chromatography-Mass Spectrometry Data. In: Esparcia-Alcázar, A., Mora, A. (eds) Applications of Evolutionary Computation. EvoApplications 2014. Lecture Notes in Computer Science, vol 8602. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45523-4_74
DOI: https://doi.org/10.1007/978-3-662-45523-4_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45522-7
Online ISBN: 978-3-662-45523-4
eBook Packages: Computer Science (R0)