Summary
Complex diseases can involve the interaction of multiple genes and environmental factors. Discovering these genes is difficult, and in silico strategies can significantly improve their detection. Data mining and automated tracking of new knowledge facilitate locus mapping. At the gene search stage, in silico prioritization of candidate genes plays an indispensable role in dealing with linked or associated loci. In silico analysis can also differentiate subtle consequences of coding DNA variants, and it remains the major method for predicting the functionality of non-coding DNA variants, particularly those in promoter regions.
Abbreviations
- cM: centimorgan
- EST: expressed sequence tag
- LD: linkage disequilibrium
- OMIM: Online Mendelian Inheritance in Man
- SNP: single nucleotide polymorphism
Copyright information
© 2008 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Yu, B. (2008). In Silico Gene Discovery. In: Trent, R.J. (eds) Clinical Bioinformatics. Methods in Molecular Medicine™, vol 141. Humana Press. https://doi.org/10.1007/978-1-60327-148-6_1
DOI: https://doi.org/10.1007/978-1-60327-148-6_1
Publisher Name: Humana Press
Print ISBN: 978-1-58829-791-4
Online ISBN: 978-1-60327-148-6
eBook Packages: Springer Protocols