Abstract
Current technology makes it possible to measure millions of sequence variations across the genome quickly. As a result, a difficult challenge arises in bioinformatics: the identification of combinations of interacting DNA sequence variations that are predictive of common disease [1]. The Multifactor Dimensionality Reduction (MDR) method is capable of analysing such interactions, but an exhaustive MDR search would require exponential time. We therefore use Ant Colony Optimization (ACO) as a stochastic wrapper. Greene et al. have shown that this approach, if expert knowledge is incorporated, is effective for analysing large amounts of genetic variation [2]. In the ACO method integrated in the MDR package, either a linear or an exponential probability distribution function can be used to weigh the expert knowledge. We generate our biological expert knowledge from a network of gene-gene interactions produced by a literature mining platform, Pathway Studio. We show that the linear distribution function is the more appropriate of the two for weighing our scores when the expert knowledge comes from literature mining. We find that the ACO parameters significantly affect the power of the method, and we suggest values for these parameters that can be used to optimize MDR in Genome Wide Association Studies that use biological expert knowledge.
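The two ideas in the abstract that lend themselves to a small illustration are the growth of the exhaustive search space and the difference between linear and exponential weighting of expert knowledge. The sketch below is only an illustration under stated assumptions, not the implementation used in the MDR package: the function names, the score-proportional form of the linear weighting, the `exp(beta * score)` form of the exponential weighting, and the `beta` parameter are all hypothetical choices made here for clarity.

```python
import math


def n_combinations(n_snps, order):
    """Count the SNP combinations an exhaustive search of this order would evaluate."""
    return math.comb(n_snps, order)


def _normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def linear_probabilities(scores):
    """Assumed linear form: selection probability proportional to the score itself."""
    if any(s < 0 for s in scores):
        raise ValueError("scores are assumed to be non-negative")
    return _normalize(list(scores))


def exponential_probabilities(scores, beta=1.0):
    """Assumed exponential form: selection probability proportional to exp(beta * score).

    Subtracting the maximum score before exponentiating avoids overflow and
    leaves the normalized probabilities unchanged.
    """
    top = max(scores)
    return _normalize([math.exp(beta * (s - top)) for s in scores])


if __name__ == "__main__":
    # Hypothetical expert-knowledge scores for four SNPs (not data from the paper).
    scores = [1.0, 2.0, 3.0, 4.0]
    print("pairs among 1000 SNPs:", n_combinations(1000, 2))
    print("linear:     ", linear_probabilities(scores))
    print("exponential:", exponential_probabilities(scores))
```

Under these assumptions, the exponential form concentrates far more of the selection probability on the highest-scoring SNPs than the linear form does. That is the kind of trade-off the choice between the two distribution functions controls.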
Keywords
- Expert Knowledge
- Multifactor Dimensionality Reduction
- Literature Mining
- Pathway Studio
- Analyse Epistasis
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Greene, C.S., Gilmore, J.M., Kiralis, J., Andrews, P.C., Moore, J.H.: Optimal Use of Expert Knowledge in Ant Colony Optimization for the Analysis of Epistasis in Human Disease. In: Pizzuti, C., Ritchie, M.D., Giacobini, M. (eds.) EvoBIO 2009. LNCS, vol. 5483, pp. 92–103. Springer, Heidelberg (2009)
Moore, J.H.: The ubiquitous nature of epistasis in determining susceptibility to common human disease. Human Heredity 56, 73–82 (2003)
The International HapMap Consortium: A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851–861 (2007)
Nikitin, A., Egorov, S., Mazo, I.: Pathway Studio - the analysis and navigation of molecular networks. Bioinformatics 19(16), 2155–2157 (2003)
Moore, J.H., Gilbert, J.C., Tsai, C.T., Chiang, F.T., Holden, T., Barney, N., White, B.C.: A flexible computational framework for detecting, characterizing, and interpreting statistical patterns of epistasis in genetic studies of human disease susceptibility. Journal of Theoretical Biology 241(2), 252–261 (2006)
Cordell, H.J.: Detecting gene-gene interactions that underlie human diseases. Nature Reviews Genetics 10, 392–404 (2009)
Ritchie, M.D., et al.: Multifactor-dimensionality reduction reveals high-order interactions among estrogen-metabolism genes in sporadic breast cancer. American Journal of Human Genetics 69, 138–147 (2001)
Julia, A., et al.: Identification of a two-loci epistatic interaction associated with susceptibility to rheumatoid arthritis through reverse engineering and multifactor dimensionality reduction. Genomics 90, 6–13 (2007)
Cho, Y.M., et al.: Multifactor-dimensionality reduction shows a two-locus interaction associated with type 2 diabetes mellitus. Diabetologia 47, 549–554 (2004)
Tsai, C.T., et al.: Renin-angiotensin system gene polymorphisms and coronary artery disease in a large angiographic cohort: detection of high order gene-gene interaction. Atherosclerosis 195, 172–180 (2007)
Andrew, A.S., et al.: Bladder Cancer SNP panel predicts susceptibility and survival. Human Genetics 125(5-6), 527–539 (2009)
Urbanowicz, R.J., Kiralis, J., Sinnott-Armstrong, N.A., Heberling, T., Fisher, J.M., Moore, J.H.: GAMETES: a fast, direct algorithm for generating pure, strict, epistatic models with random architectures. BioData Mining 5(16) (2012)
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/glm.html
http://stat.ethz.ch/R-manual/R-patched/library/graphics/html/stars.html
Sokal, R.R., Rohlf, F.J.: Biometry: the principles and practice of statistics in biological research, 3rd edn. W.H. Freeman and Co., New York (1995)
Dorigo, M., Maniezzo, V., Colorni, A.: Positive Feedback as a search strategy. Dipartimento di Elettronica e Informatica, Politecnico di Milano, Technical Reports, 91–116 (1991)
Dorigo, M., Stützle, T.: Ant Colony Optimization (2004)
Martens, D., et al.: Editorial Survey: Swarm Intelligence for Data Mining. Machine Learning 82(1), 1–42 (2011)
Moore, J.H., White, B.C.: Genome-wide genetic analysis using genetic programming: The critical need for expert knowledge. In: Genetic Programming Theory and Practice IV. Springer (2007)
Pattin, K., Moore, J.H.: Exploiting the proteome to improve the genome-wide genetic analysis of epistasis in common human diseases. Human Genetics 124(1), 19–29 (2008)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sulovari, A., Kiralis, J., Moore, J.H. (2013). Optimal Use of Biological Expert Knowledge from Literature Mining in Ant Colony Optimization for Analysis of Epistasis in Human Disease. In: Vanneschi, L., Bush, W.S., Giacobini, M. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2013. Lecture Notes in Computer Science, vol 7833. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37189-9_12
DOI: https://doi.org/10.1007/978-3-642-37189-9_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37188-2
Online ISBN: 978-3-642-37189-9
eBook Packages: Computer Science, Computer Science (R0)