Abstract
Genome-wide association studies (GWAS) are typically designed as case-control studies: they collect thousands of affected and healthy individuals, genotype hundreds of thousands of SNPs, and document the SNPs that are more abundant in one group than in the other. Direct-to-consumer (DTC) genetic testing has made it possible for an ordinary person to receive data about his or her genotype, but the validity of the risk assessment procedures and of the final genetic risk estimate has been questioned. Many authors have discussed the advantages of using the asymptotic Bayes factor (ABF), rather than p-values, to measure the strength of SNP/trait associations. We propose an ABF-based heuristic to filter and select SNP/trait associations for use in multigenic risk assessment.
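As an illustration of the kind of quantity involved, the following minimal Python sketch computes a Bayes factor in the closed form given by Wakefield for a normal approximation to the log odds ratio. It is not taken from the paper: the function name, the recovery of the standard error from a reported 95% confidence interval, and the default prior standard deviation (chosen so that 95% of prior mass puts the odds ratio between 2/3 and 1.5) are all assumptions made for this example. In this convention the factor compares the null hypothesis to the alternative, so values below 1 favour an association.

```python
import math

# Assumed default: prior SD on log(OR) such that 95% of the prior mass
# lies between OR = 2/3 and OR = 1.5. This is an illustrative choice.
DEFAULT_PRIOR_SD = math.log(1.5) / 1.96


def wakefield_abf(odds_ratio, ci_lower, ci_upper, prior_sd=DEFAULT_PRIOR_SD):
    """Bayes factor for H0 (no association) versus H1 (association).

    Values below 1 favour an association; smaller means stronger evidence.
    """
    theta_hat = math.log(odds_ratio)
    # Standard error of log(OR), recovered from the 95% confidence interval.
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * 1.96)
    v = se ** 2            # sampling variance of the estimate
    w = prior_sd ** 2      # prior variance of the true log(OR)
    z = theta_hat / se     # Wald statistic
    shrinkage = w / (v + w)
    return math.sqrt((v + w) / v) * math.exp(-0.5 * z * z * shrinkage)


# Hypothetical inputs, not values from the paper.
strong = wakefield_abf(1.30, 1.20, 1.41)   # precise estimate away from OR = 1
null = wakefield_abf(1.00, 0.80, 1.25)     # estimate centred on OR = 1
```

With these hypothetical inputs, `strong` is far below 1 while `null` is above 1, which is the behaviour a ranking criterion of this kind relies on.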
A raw genotype result from the 23andMe web service was merged with the GWAS catalog, and SNP/trait associations were filtered and selected using the R programming language together with free and publicly available databases.
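The merge step can be sketched as follows. The authors worked in R; this Python version is only an illustrative sketch under stated assumptions. It relies on the 23andMe raw-data layout (tab-separated rsid, chromosome, position and genotype, with `#` comment lines), while the catalog records, their field names (`snp`, `trait`, `risk_allele`) and the sample values are hypothetical. Counting risk alleles by simple character matching ignores strand orientation, which a real pipeline would have to resolve.

```python
import io


def read_23andme(handle):
    """Return {rsid: genotype} from a 23andMe raw-data file handle."""
    genotypes = {}
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        rsid, _chrom, _pos, genotype = line.strip().split("\t")
        if "-" in genotype:
            continue  # skip no-calls such as "--"
        genotypes[rsid] = genotype
    return genotypes


def merge_with_catalog(genotypes, catalog_rows):
    """Keep catalog rows whose SNP was genotyped and attach the genotype."""
    merged = []
    for row in catalog_rows:
        genotype = genotypes.get(row["snp"])
        if genotype is None:
            continue
        merged.append({
            **row,
            "genotype": genotype,
            # Naive count; assumes both sources report the same strand.
            "risk_allele_count": genotype.count(row["risk_allele"]),
        })
    return merged


# Hypothetical example data.
raw = "# rsid\tchromosome\tposition\tgenotype\nrs1\t1\t100\tAG\nrs2\t2\t200\t--\nrs3\t3\t300\tCC\n"
catalog = [
    {"snp": "rs1", "trait": "trait A", "risk_allele": "G"},
    {"snp": "rs2", "trait": "trait B", "risk_allele": "T"},
    {"snp": "rs3", "trait": "trait C", "risk_allele": "C"},
    {"snp": "rs9", "trait": "trait D", "risk_allele": "A"},
]
merged = merge_with_catalog(read_23andme(io.StringIO(raw)), catalog)
```

Here `merged` holds only the rows for `rs1` and `rs3`: `rs2` is a no-call and `rs9` is absent from the genotype file.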
From the initial 3195 SNP/trait associations, only 425 remained after the initial filters on descent, replicated findings, qualitative trait, and availability of the number of cases and controls in the study. Selecting only one SNP/trait association from repeated studies and from studies done with proxy SNPs left 377 SNP/trait associations available for multigenic risk assessment. After excluding the associations with an unsatisfactory ABF, 300 SNP/trait associations remained for the multigenic risk assessment.
Whatever the link between SNP/trait associations and the final DTC multigenic risk assessment for a given trait, the final value of a risk score is heavily influenced by the number of individual SNP/trait pairs used in the calculation, as well as by the strength of evidence for each of them. The ABF provides an unambiguous and simple criterion for ranking SNP/trait associations and including them in multigenic risk assessment.
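The three selection stages described above can be sketched in Python as a small pipeline. This is a hedged illustration rather than the authors' procedure: every field name (`descent`, `replicated`, `qualitative`, `n_cases`, `n_controls`, `locus`, `abf`) is hypothetical, the rule of keeping the association with the smallest ABF among repeated or proxy-SNP studies is an assumption, and the cut-off of 1.0 is a placeholder, since the abstract does not state which ABF values count as unsatisfactory. The ABF is taken here as null versus alternative, so smaller values mean stronger evidence.

```python
def select_associations(rows, descent, abf_threshold=1.0):
    """Filter, de-duplicate and threshold SNP/trait associations."""
    # Stage 1: basic eligibility filters.
    eligible = [
        r for r in rows
        if r["descent"] == descent
        and r["replicated"]
        and r["qualitative"]
        and r["n_cases"] is not None
        and r["n_controls"] is not None
    ]
    # Stage 2: one association per (locus, trait); "locus" is assumed to
    # group repeated studies and proxy SNPs. Keep the smallest ABF.
    best = {}
    for r in eligible:
        key = (r["locus"], r["trait"])
        if key not in best or r["abf"] < best[key]["abf"]:
            best[key] = r
    # Stage 3: drop associations whose ABF does not pass the cut-off.
    return [r for r in best.values() if r["abf"] < abf_threshold]


def make_row(snp, locus, abf, **overrides):
    """Build a hypothetical record with permissive defaults."""
    row = {"snp": snp, "locus": locus, "trait": "trait A", "abf": abf,
           "descent": "European", "replicated": True, "qualitative": True,
           "n_cases": 1000, "n_controls": 1000}
    row.update(overrides)
    return row


rows = [
    make_row("rs1", "L1", 1e-6),
    make_row("rs2", "L1", 1e-3),                   # proxy of rs1, weaker
    make_row("rs3", "L2", 5.0),                    # fails the ABF cut-off
    make_row("rs4", "L3", 1e-4, replicated=False), # fails eligibility
    make_row("rs5", "L4", 1e-2, n_cases=None),     # counts unavailable
    make_row("rs6", "L5", 1e-5),
]
selected = select_associations(rows, descent="European")
```

On this toy input, `selected` contains `rs1` and `rs6` only.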
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Bojić, S., Mandić-Rajčević, S. (2015). Risk Quantification of Multigenic Conditions for SNP Array Based Direct-to-Consumer Genomic Services. In: Ortuño, F., Rojas, I. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2015. Lecture Notes in Computer Science, vol 9044. Springer, Cham. https://doi.org/10.1007/978-3-319-16480-9_27
DOI: https://doi.org/10.1007/978-3-319-16480-9_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16479-3
Online ISBN: 978-3-319-16480-9
eBook Packages: Computer Science, Computer Science (R0)