Retention Time Prediction and Protein Identification

  • Protocol
Mass Spectrometry Data Analysis in Proteomics

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2051))

Abstract

In bottom-up proteomics, proteins are typically identified by enzymatic digestion into peptides, tandem mass spectrometry, and comparison of the measured tandem mass spectra with those predicted from a sequence database for peptides whose mass lies within measurement uncertainty of the experimentally determined mass. Although increasingly uncommon, isolated proteins or simple protein mixtures can also be identified by measuring only the masses of the peptides produced by the enzymatic digest, without any fragmentation. Separation methods such as liquid chromatography and electrophoresis are often used to fractionate complex protein or peptide mixtures prior to analysis by mass spectrometry. Although the primary reason for this is to avoid ion suppression and improve data quality, these separations are based on physical and chemical properties of the peptides or proteins and therefore also provide information about them. Depending on the separation method, this could be protein molecular weight (SDS-PAGE), isoelectric point (IEF), charge at a known pH (ion exchange chromatography), or hydrophobicity (reversed-phase chromatography). These separations produce approximate measurements of properties that can, to some extent, be predicted from amino acid sequences. For the molecular weight of proteins without posttranslational modifications, this is straightforward: simply add the molecular weights of the amino acid residues in the protein. For isoelectric point, charge, and hydrophobicity, the order of the amino acids and the folding state of the peptide or protein also matter, but it is nevertheless possible to predict the behavior of peptides and proteins in these separation methods well enough for the predictions to be useful.
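As a minimal sketch of the molecular weight calculation described above, the following Python snippet sums average residue masses for an unmodified sequence. The function name `molecular_weight` is hypothetical and is not taken from the chapter. The table holds standard average residue masses, and one water is added to account for the free N- and C-termini.

```python
# Standard average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153  # added once for the terminal H and OH


def molecular_weight(sequence: str) -> float:
    """Average molecular weight of an unmodified peptide or protein.

    Illustrative helper (hypothetical name): sums the residue masses and
    adds one water. Posttranslational modifications are not handled.
    """
    total = WATER_MASS
    for residue in sequence.upper():
        if residue not in AVERAGE_RESIDUE_MASS:
            raise ValueError(f"unsupported residue: {residue!r}")
        total += AVERAGE_RESIDUE_MASS[residue]
    return total


if __name__ == "__main__":
    # Glycylglycine: two glycine residues plus one water.
    print(f"{molecular_weight('GG'):.2f}")  # prints 132.12
```

For matching measured peptide masses against a database, monoisotopic residue masses would normally replace the average masses used here.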
This chapter reviews the topic of using data from separation methods for identification and validation in proteomics, with special emphasis on predicting retention times of tryptic peptides in reversed-phase chromatography under acidic conditions, as this is one of the most commonly used separation methods in bottom-up proteomics.
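The simplest way to predict hydrophobicity-driven behavior from sequence is an additive, composition-based index that sums one coefficient per residue. The sketch below illustrates only that idea. As an assumption made for this example, it uses the Kyte-Doolittle hydropathy scale as a stand-in for retention coefficients; this scale was not derived from reversed-phase chromatography data, and a practical predictor would use coefficients fitted to measured retention times. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values, used here ONLY as stand-in coefficients.
# They are not retention coefficients fitted to chromatographic data.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_sum(peptide: str) -> float:
    """Additive index: sum of per-residue coefficients (order is ignored)."""
    return sum(KYTE_DOOLITTLE[residue] for residue in peptide.upper())


def rank_by_index(peptides: list[str]) -> list[str]:
    """Sort peptides from lowest to highest additive index.

    Under the additive assumption, a higher index suggests later elution
    in reversed-phase chromatography.
    """
    return sorted(peptides, key=hydropathy_sum)


if __name__ == "__main__":
    for peptide in rank_by_index(["LLLL", "KKKK", "GGGG"]):
        print(peptide, round(hydropathy_sum(peptide), 1))
```

Because the index depends only on composition, it cannot capture the effects of residue order or folding state noted in the abstract; it is a baseline, not a substitute for sequence-specific models.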

Author information

Corresponding author

Correspondence to Magnus Palmblad.

Copyright information

© 2020 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Henneman, A., Palmblad, M. (2020). Retention Time Prediction and Protein Identification. In: Matthiesen, R. (eds) Mass Spectrometry Data Analysis in Proteomics. Methods in Molecular Biology, vol 2051. Humana, New York, NY. https://doi.org/10.1007/978-1-4939-9744-2_4

  • DOI: https://doi.org/10.1007/978-1-4939-9744-2_4

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-4939-9743-5

  • Online ISBN: 978-1-4939-9744-2

  • eBook Packages: Springer Protocols
