Abstract
Cancer-specific High-throughput Annotation of Somatic Mutations (CHASM) is a computational method that uses supervised machine learning to prioritize somatic missense mutations detected in tumor sequencing studies. Missense mutations are a key mechanism by which important cellular behaviors, such as cell growth, proliferation, and survival, are disrupted in cancer. However, only a fraction of the missense mutations observed in tumor genomes are expected to be cancer causing. Distinguishing tumorigenic “driver” mutations from their neutral “passenger” counterparts is currently a pressing problem in cancer research. CHASM trains a Random Forest classifier on driver mutations from the COSMIC database and uses background nucleotide substitution rates observed in tumor sequencing data to model tumor type-specific passenger mutations. Each missense mutation is represented by quantitative features that fall into five major categories: physicochemical properties of amino acid residues; scores derived from multiple sequence alignments of protein or DNA; region-based amino acid sequence composition; predicted properties of local protein structure; and annotations from the UniProt feature tables. Both a software package and a Web server implementation of CHASM are available to facilitate high-throughput prioritization of somatic missense mutations from large, multi-tumor exome sequencing studies. After ranking candidate driver mutations with CHASM, the vector of features describing each mutation can be used to suggest possible mechanisms by which mutations alter protein activity in tumorigenesis. This chapter details the application of both implementations of CHASM to tumor sequencing data.
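The idea of modeling passenger mutations from background nucleotide substitution rates can be illustrated with a small sketch. This is not the CHASM implementation: the rate table, its values, and the function names below are hypothetical, and the abstract does not specify how CHASM defines sequence context or places substitutions within genes. The sketch only shows weighted sampling of substitution types in proportion to an assumed tumor type-specific background rate.

```python
import random
from collections import Counter

# Hypothetical relative substitution rates for one tumor type.
# Illustrative numbers only; they are not taken from any study.
BACKGROUND_RATES = {
    ("C", "T"): 0.45,
    ("C", "A"): 0.15,
    ("C", "G"): 0.10,
    ("T", "C"): 0.15,
    ("T", "A"): 0.08,
    ("T", "G"): 0.07,
}


def sample_passenger_substitutions(rates, n, seed=0):
    """Draw n (ref, alt) substitutions with probability proportional to rate."""
    rng = random.Random(seed)  # local generator keeps the draw reproducible
    substitutions = list(rates)
    weights = [rates[sub] for sub in substitutions]
    return rng.choices(substitutions, weights=weights, k=n)


def empirical_frequencies(samples):
    """Return the observed fraction of each substitution type."""
    counts = Counter(samples)
    total = len(samples)
    return {sub: count / total for sub, count in counts.items()}


# Generate a synthetic passenger set and compare it with the assumed rates.
synthetic = sample_passenger_substitutions(BACKGROUND_RATES, n=10000)
frequencies = empirical_frequencies(synthetic)

for sub, rate in BACKGROUND_RATES.items():
    observed = frequencies.get(sub, 0.0)
    print(f"{sub[0]}>{sub[1]}: assumed {rate:.2f}, sampled {observed:.3f}")
```

In a setup like the one the abstract describes, a synthetic set of this kind would serve as the passenger class when training the classifier, with curated driver mutations as the other class.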
Acknowledgements
Funding: This work was funded by National Institutes of Health CA 152432 and National Science Foundation DBI 0845275 to R.K., and by a National Defense Science and Engineering Graduate (NDSEG) Fellowship, 32 CFR 168a, to H.C. We would like to extend special thanks to Andy Wong and Mark Diekhans for implementing the CHASM software package and to Mike Ryan and Rick Kim for implementing the CRAVAT Web server.
Copyright information
© 2014 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Carter, H., Karchin, R. (2014). Predicting the Functional Consequences of Somatic Missense Mutations Found in Tumors. In: Ochs, M. (eds) Gene Function Analysis. Methods in Molecular Biology, vol 1101. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-721-1_8
DOI: https://doi.org/10.1007/978-1-62703-721-1_8
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-720-4
Online ISBN: 978-1-62703-721-1
eBook Packages: Springer Protocols