Abstract
Ultra-performance liquid chromatography (UPLC) is the established technology for accurate analysis of IgG Fc N-glycosylation owing to its superior sensitivity, resolution, and speed, and its ability to provide branch-specific information on glycan species. Correct and cost-efficient preprocessing of chromatographic data is the major prerequisite for subsequent analyses, ranging from inference of structural isomers to biomarker discovery and prediction of humoral immune response from characterized changes in glycosylation. The complexity of glycomic chromatograms poses a number of challenges for developing automated data annotation and quantitation algorithms, and has frequently necessitated manual or semi-manual approaches to preprocessing, most notably to peak detection and integration. Such procedures are painstaking and time-consuming, and may be a source of confounding because they depend on human labelers. Although liquid chromatography is a mature field and a number of methods have been developed for automatic peak detection outside the area of glycomics analysis, we found that hardly any of them is suitable for automatic integration of UPLC glycomic profiles without substantial modification. In this chapter, we illustrate the practical challenges of automatic peak detection in UPLC glycomics chromatograms. We outline ACE (Automatic Chromatogram Extraction), a robust, semi-supervised method for automated alignment and detection of glycan peaks in chromatograms, developed by Pharmatics Limited (UK) in collaboration with Genos Limited (Croatia). Applying the tool requires minimal human intervention, which results in a significant reduction in the time and cost of IgG glycomics signal integration on a Waters Acquity UPLC instrument (Milford, MA, USA) in several human cohorts with blind technical replicates.
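To make the peak detection and integration task concrete, the following is a minimal sketch of a naive baseline on a synthetic chromatogram. It is not the ACE method, whose details are not given in this abstract. It assumes Gaussian peak shapes, a flat zero baseline, local-maximum apex detection, and valley-to-valley trapezoidal integration. All function names and parameter values are hypothetical and chosen for illustration only.

```python
import math

def gaussian(t, center, width, height):
    """Idealized peak shape (an assumption; real glycan peaks are often asymmetric)."""
    return height * math.exp(-0.5 * ((t - center) / width) ** 2)

def synthetic_chromatogram(times, peaks):
    """Sum of Gaussian peaks given as (center, width, height) on a zero baseline."""
    return [sum(gaussian(t, c, w, h) for c, w, h in peaks) for t in times]

def find_apexes(signal, min_height):
    """Indices of local maxima whose intensity exceeds min_height."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i] > min_height and signal[i - 1] < signal[i] >= signal[i + 1]]

def valley_bounds(signal, apexes):
    """Split the trace at the lowest point between each pair of neighbouring apexes."""
    cuts = [0]
    for a, b in zip(apexes, apexes[1:]):
        cuts.append(min(range(a, b + 1), key=signal.__getitem__))
    cuts.append(len(signal) - 1)
    return list(zip(cuts, cuts[1:]))

def trapezoid_area(times, signal, lo, hi):
    """Trapezoidal integral of the signal between sample indices lo and hi."""
    return sum(0.5 * (signal[i] + signal[i + 1]) * (times[i + 1] - times[i])
               for i in range(lo, hi))

def relative_areas(times, signal, min_height=0.05):
    """Return apex indices and each peak's area as a percentage of the total area."""
    apexes = find_apexes(signal, min_height)
    if not apexes:
        return [], []
    areas = [trapezoid_area(times, signal, lo, hi)
             for lo, hi in valley_bounds(signal, apexes)]
    total = sum(areas)
    return apexes, [100.0 * a / total for a in areas]

# Synthetic example: one isolated peak and two partially overlapping peaks.
times = [i * 0.01 for i in range(1001)]  # 0 to 10 minutes
peaks = [(3.0, 0.15, 1.0), (5.0, 0.15, 0.5), (5.5, 0.15, 0.8)]
signal = synthetic_chromatogram(times, peaks)
apexes, percent = relative_areas(times, signal)
```

A baseline like this depends on a hand-picked height threshold and assigns overlapping peaks by a hard cut at the valley, so it would be expected to degrade on noisy traces, drifting baselines, shoulders, and retention-time shifts between samples. These are the kinds of difficulties that motivate alignment and a more robust detection procedure.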
Notes
1. Data patterns from one cohort will typically be representative of other cohorts. In this case, the previously labeled data can still be used to label a new batch of chromatographic data without any noticeable drop in performance. However, it is recommended to provide a manually labeled set for each new cohort by following the procedure of step 1 of the algorithm.
Acknowledgements
Pharmatics and Genos acknowledge partial support of this work by EU FP7 MIMOmics. F.A. thanks Yurii Aulchenko and Lennart Karssen for useful discussions.
Copyright information
© 2017 Springer Science+Business Media New York
About this protocol
Cite this protocol
Agakova, A., Vučković, F., Klarić, L., Lauc, G., Agakov, F. (2017). Automated Integration of a UPLC Glycomic Profile. In: Lauc, G., Wuhrer, M. (eds) High-Throughput Glycomics and Glycoproteomics. Methods in Molecular Biology, vol 1503. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6493-2_17
DOI: https://doi.org/10.1007/978-1-4939-6493-2_17
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-6491-8
Online ISBN: 978-1-4939-6493-2
eBook Packages: Springer Protocols