Bioinformatic Tools for Identifying Disease Gene and SNP Candidates

Mooney, Sean D.; Krishnan, Vidhya G.; Evani, Uday S.

doi:10.1007/978-1-60327-367-1_17

Part of the book series: Methods in Molecular Biology (MIMB, volume 628)

Abstract

As databases of genome data continue to grow, so does our understanding of the functional elements of the genome. Many genetic changes have now been discovered and characterized, including both disease-causing mutations and neutral polymorphisms. Over the past decade, alongside experimental approaches to characterizing specific variants, there has been intense bioinformatic research into the molecular effects of these genetic changes. These bioinformatic efforts have focused on two general areas. First, researchers have annotated genetic variation data with molecular features that are likely to affect function. Second, statistical methods have been developed to predict which mutations are likely to have a molecular effect. This protocol chapter reviews and describes methods for understanding the molecular functions of single nucleotide polymorphisms (SNPs) and mutations. The intent is to provide an introduction to online tools that are both easy to use and useful.

Acknowledgments

We gratefully acknowledge support from K22LM009135 (PI: Mooney), R01LM009722 (PI: Mooney), P01AG018397 (PI: Econs), U01GM061373 (PI: Flockhart), and the Indiana Genomics Initiative. The Indiana Genomics Initiative (INGEN) is supported in part by the Lilly Endowment.

Author information

Authors and Affiliations

Department of Medical and Molecular Genetics, Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, USA
Sean D. Mooney, Vidhya G. Krishnan & Uday S. Evani

Corresponding author

Correspondence to Sean D. Mooney.

Editor information

Editors and Affiliations

Medicines Research Centre, GlaxoSmithKline R&D Limited, Gunnels Wood Road, Stevenage, Hertfordshire, SG1 2NY, United Kingdom
Michael R. Barnes
Institute of Psychiatry, Social, Genetic & Developmental, King's College, Denmark Hill 111, London, SE5 8AF, United Kingdom
Gerome Breen

About this protocol

Cite this protocol

Mooney, S.D., Krishnan, V.G., Evani, U.S. (2010). Bioinformatic Tools for Identifying Disease Gene and SNP Candidates. In: Barnes, M., Breen, G. (eds) Genetic Variation. Methods in Molecular Biology, vol 628. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60327-367-1_17

DOI: https://doi.org/10.1007/978-1-60327-367-1_17
Published: 08 February 2010
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-60327-366-4
Online ISBN: 978-1-60327-367-1
eBook Packages: Springer Protocols
