Abstract
Recent technological and scientific advances have propelled the field of personalized medicine (PM), which promises to open a new era in the clinical treatment of disease. However, the success of PM depends strongly on establishing a vast resource library that links many common complex diseases to specific genetic signatures. These links can be discovered by performing whole-genome association studies, which attempt to trace diseases to their genetic origins. Such large-scale surveys, combined with modern statistical methods, have already identified many disease-related genetic variants. In this review, we describe in detail novel statistical methods based on Bayesian data analysis (Bayesian modeling, Bayesian variable partitioning, and Bayesian graphs and networks), which promise to shed light on the complex biological processes involved in disease formation and development. In particular, we outline how to use Bayesian approaches in clinical applications to perform epistasis analysis while accounting for the block-type structure of the genome.
Promise and Complexity of Personalized Medicine
Simple and inexpensive genetic tests capable of revealing a person’s risk of developing certain diseases would help to target clinical treatments to each individual patient in order to achieve the best possible results [1, 2]. Consequently, efficient technologies and software for uncovering treatment-related mutations in illness-inducing viruses, as well as disease-related variants in a patient’s DNA, will play important roles in the future of medicine. Improved disease prevention and diagnosis, together with novel routes to therapies, are the main motivations for extensive studies aimed at finding disease-related genetic signatures.
Presently, disease risk estimates based on known genetic risk factors provide only limited help in clinical applications [1, 3]. Even though a large amount of resources has recently been devoted to this effort, the genetic basis of common human diseases remains for the most part unidentified [4, 5]. The recent emergence of successful experimental and statistical strategies for genome-wide association studies was expected to provide the necessary tools for deciphering the genetic causes of complex human illnesses such as type 1 and type 2 diabetes [6], rheumatoid arthritis, and bipolar disorder [4, 7]. However, the presence of multi-locus interactions greatly complicates the task of discovering disease-related variants in a patient’s genome [8, 9]. Thus, a biochemical and statistical understanding of genetic interactions will play a crucial role in future clinical applications.
Whole-Genome Association Studies
The examination of a large number of genetic markers across the whole genome in many individuals, with the goal of identifying variant-disease associations, is known as a genome-wide association study (GWAS). Advances in high-throughput biotechnologies such as microarrays and next-generation sequencing [10,11,12] have made GWAS a powerful tool for unlocking the genetic basis of complex diseases. In particular, the development of the International HapMap resource [13], which simplified the design and analysis of association studies, the emergence of dense genotyping chips [10, 14], and the assembly of large, well-characterized clinical samples [4] should be singled out as important factors in the recent progress of GWAS. Although many disease loci have been identified in such surveys [4, 15], the discovered variants explain only a small proportion of the observed familial aggregation [2, 16], which poses the well-known problem of missing heritability [17]. A few solutions to this challenge have been proposed [5], but the architecture of complex human traits remains an urgent open question. Because the “common variant” hypothesis has recently come under considerable criticism [1, 17], experimental and computational methods are now needed to determine which of the proposed disease architectures describes reality, so that future clinical applications of bioinformatics technologies can be developed [1, 3, 17].
The most common type of DNA variation is the single-nucleotide polymorphism (SNP), which arises when a single base (A, T, C, or G) is replaced by another at a specific DNA position. Some SNPs lead directly to disease formation; others increase the probability of disease [18]. Analysis of SNP data is complicated by the large number of possible interaction combinations and by correlation with nearby SNPs.
Beyond Single-Locus Analysis
Despite the striking success of the twentieth century in pinpointing genes responsible for Mendelian diseases, the genetic origins of common complex diseases are non-Mendelian in nature [9, 19]. In particular, gene–gene interactions are involved in many complex biological processes such as metabolism, signal transduction, and gene regulation; thus, genetic variants at multiple loci may jointly contribute to disease formation [20, 21]. For example, breast cancer and type 2 diabetes have been linked to multi-SNP interactions [21,22,23]. While most current bioinformatics approaches focus on detecting single-SNP associations, advanced statistical methods are necessary for multi-SNP association mapping, because single-variant methods not only lose power when interactions exist but are also unable to detect rare mutations [24]. Moreover, the number of possible interactions is so vast that it is computationally unrealistic to search through all of them in a large-scale case-control study [8, 25].
An additional challenge for discovering disease origins comes from the statistical correlation between nearby variants, known as linkage disequilibrium (LD) [25, 26]. LD patterns have many important applications in genetics and biology [27] and arise from the shared ancestry of contemporary chromosomes [13]. Because of LD, dense studies are likely to produce many redundant positive signals [24]. Below, we describe in detail how Bayesian strategies can handle both epistasis and linkage disequilibrium.
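To make the notion of LD concrete, the following minimal sketch computes two standard pairwise measures, the disequilibrium coefficient D and the squared correlation r², from haplotype counts at two biallelic SNPs. The function name `ld_stats` and the toy counts are illustrative assumptions and are not taken from any of the cited studies.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r2) for two biallelic SNPs given the four haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total                 # frequency of haplotype A-B
    p_A = (n_AB + n_Ab) / total         # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / total         # frequency of allele B at SNP 2
    D = p_AB - p_A * p_B                # deviation from independence
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = D * D / denom if denom > 0 else 0.0
    return D, r2

# Hypothetical counts: two SNPs that always travel together,
# and two SNPs whose alleles combine at random.
perfect = ld_stats(50, 0, 0, 50)
independent = ld_stats(25, 25, 25, 25)
```

An r² close to 1 means that genotyping one SNP essentially reveals the other, which is why an association signal at one marker tends to reappear at its neighbours.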
Modern Bioinformatics Approaches
Currently, most approaches to disease association mapping employ the standard “frequentist” approach to the evaluation of significance [2]. In particular, such algorithms use hypothesis testing procedures to deal with one variant at a time [24]. However, the failure of such “frequentist” methods to account for the power of a study and the number of likely true positives [2], combined with their tendency to report a multitude of redundant associations [24], has sparked wide interest in Bayesian procedures. In this review, we survey the challenges that statistical geneticists face when analyzing GWAS data and outline how recently developed Bayesian methods can help. In addition to outlining the main differences between the proposed approaches, we highlight the limitations and advantages of each method, describe future prospects in the field, and discuss how Bayesian approaches can help answer outstanding questions in biomedicine.
Bayesian Data Analysis Methods
Figure 1 shows the many complicated interactions that must be considered when developing statistical models of the multi-locus interactions that lead to disease development. The ultimate goal is to characterize all of these connections accurately in large-scale case-control studies while also understanding the biological processes that lead to disease. Thus, while statistical understanding is important, the methods developed should also point toward the underlying biological processes.
Overview of Bayesian Data Analysis
Statistical conclusions about an unknown parameter θ (or unobserved data \( x_{\text{unobs}} \)) in the Bayesian approach to parameter estimation are expressed as probability statements that are conditional on the observed data x: \( p(\theta |x) \) and \( p(x_{\text{unobs}} |x) \). Additionally, implicit conditioning is performed on the values of any covariates [28]. Conditioning on the observed data is what separates Bayesian statistics from other inference approaches, which evaluate an estimate of the unknown parameter over the distribution of possible data values while conditioning on the true, yet unknown, parameter value [28, 29].
At the heart of all Bayesian approaches for the detection of gene–gene interactions lies the concept of Bayesian inference and model selection. The goal is to determine the posterior distribution of all parameters in the problem (disease association, epistatic interactions, gene–environment interactions, and others), given the common-variant data for the case-control study, while incorporating prior beliefs about the parameter values. The conditional probability of all parameters \( ({\text{Params}}) \) given the observed data \( ({\text{Data}}) \) is the product of the likelihood function of the data and the prior distribution on the parameters, divided by the normalization constant \( P({\text{Data}}) \):
\[ P({\text{Params}}|{\text{Data}}) = \frac{P({\text{Data}}|{\text{Params}})\,P({\text{Params}})}{P({\text{Data}})} \tag{1} \]
For most high-dimensional data sets encountered in large-scale studies, \( P({\text{Data}}) \) cannot be explicitly calculated [9] and, therefore, \( P({\text{Params}}|{\text{Data}}) \) can be evaluated analytically only up to a proportionality constant. However, advanced computational techniques (iterative sampling methods) can be used to determine the posterior distribution of the parameters [29, 30]. The main task is to choose appropriate statistical models for the likelihood and appropriate prior distributions on the parameter values, \( P({\text{Params}}) \).
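The idea of sampling from a posterior that is known only up to its normalization constant can be illustrated with a minimal random-walk Metropolis sampler. The example below is a toy sketch, not one of the reviewed algorithms: it assumes a single allele-frequency parameter, a binomial likelihood (30 minor alleles out of 100 chromosomes, invented numbers), and a uniform prior, chosen because the exact posterior mean is then available for comparison. All function names are hypothetical.

```python
import math
import random

def log_unnormalized_posterior(theta, k, n):
    """log[P(Data|theta) * P(theta)] for a binomial likelihood and uniform prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf                      # zero prior mass outside (0, 1)
    return k * math.log(theta) + (n - k) * math.log(1.0 - theta)

def metropolis(k, n, steps=20000, burn_in=2000, step_sd=0.05, seed=0):
    """Random-walk Metropolis sampler; P(Data) is never computed."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_unnormalized_posterior(theta, k, n)
    samples = []
    for i in range(steps):
        proposal = theta + rng.gauss(0.0, step_sd)
        candidate = log_unnormalized_posterior(proposal, k, n)
        # Accept with probability min(1, posterior ratio); the unknown
        # normalization constant cancels in this ratio.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        if i >= burn_in:
            samples.append(theta)
    return samples

samples = metropolis(k=30, n=100)
posterior_mean = sum(samples) / len(samples)
exact_mean = (30 + 1) / (100 + 2)             # mean of the exact Beta(31, 71) posterior
```

In this one-parameter case the sampled mean can be checked against the closed-form value; in genome-wide models no such closed form exists, which is why sampling is needed.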
Overview of Bayesian Variable Partition
Instead of testing each SNP set in a stepwise manner [31, 32], Bayesian approaches fit a single statistical model to all of the data simultaneously [9, 25, 33], which increases robustness compared with hypothesis-testing methods [2, 24]. Another advantage of the Bayesian approach is the ability to quantify all uncertainties and to incorporate previous knowledge about each specific SNP marker into the statistical model through priors [9, 29].
In the Bayesian model selection framework, we are interested in determining which model in the set \( \left\{ {M_{i} } \right\}_{i = 1}^{N} \) is the most likely one given the observed \( {\text{Data}} \). The posterior probability of a particular model \( M_{i} \) given \( {\text{Data}} \) is:
\[ P(M_{i}|{\text{Data}}) = \frac{P({\text{Data}}|M_{i})\,P(M_{i})}{P({\text{Data}})} \tag{2} \]
Thus, by comparing \( P(M_{i} |{\text{Data}}) \) and \( P(M_{j} |{\text{Data}}) \) through their ratio (the posterior odds), it can be determined whether model \( M_{i} \) or \( M_{j} \) is more likely [29]. It is important to note that the normalization constant in Eq. 2 involves summation over all possible models: \( P\left( {\text{Data}} \right) = \mathop \sum \nolimits_{i = 1}^{N} P\left( {{\text{Data|}}M_{i} } \right)P(M_{i} ) \). For example, consider a genome-wide study containing 1500 SNPs, each of which can take one of three possible states; the total number of feasible models to sum over is then \( N = 3^{1500} \approx 5 \times 10^{715} \). In such instances, it is necessary to use stochastic methods to sample from the posterior distribution. We now consider how this conceptual framework is applied in practice to the determination of multi-locus interactions in case-control studies.
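The size of the model space and the mechanics of posterior odds can both be checked with a few lines of code. The sketch below first counts the digits of 3^1500 and then normalizes a small, fully enumerable set of models; the three log marginal likelihoods are invented numbers used only for illustration, and `posterior_model_probs` is a hypothetical helper.

```python
import math

# Size of the model space for 1500 SNPs with three states each.
n_models = 3 ** 1500
digits = len(str(n_models))        # 716 digits, i.e. on the order of 10^715

def posterior_model_probs(log_marginals, priors):
    """Posterior model probabilities from log P(Data|M_i) and priors P(M_i).

    Works in log space and subtracts the maximum before exponentiating
    to avoid numerical underflow.
    """
    log_joint = [lm + math.log(p) for lm, p in zip(log_marginals, priors)]
    peak = max(log_joint)
    weights = [math.exp(v - peak) for v in log_joint]
    total = sum(weights)           # proportional to P(Data)
    return [w / total for w in weights]

# Three hypothetical models with equal prior probability.
probs = posterior_model_probs([-104.2, -101.9, -110.5], [1 / 3, 1 / 3, 1 / 3])
posterior_odds_2_vs_1 = probs[1] / probs[0]
```

With equal priors the posterior odds reduce to the ratio of marginal likelihoods. Explicit normalization as done here is feasible only when the models can be enumerated, which is not the case for the genome-wide example above.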
Epistasis Analysis Methods
While statistical methods like BGTA [34], MARS [35], and CPM [36] are capable of detecting epistatic associations, the Bayesian epistasis association mapping (BEAM) algorithm [9] was the first practical approach capable of handling genome-wide case-control data sets. Given the case-control SNP genotype data, BEAM reports for each SNP marker the posterior probabilities of disease association and of epistatic interaction with other markers. Figure 2 shows the input file format required by the algorithm. The core of the Bayesian marker partition model can be summarized as follows.
BEAM can detect both interacting and noninteracting disease loci among a large number of variants. It is an application of the Bayesian model selection procedure. In particular, all markers are split into three groups: (1) markers not associated with the disease, (2) marginally disease-associated variants, and (3) variants whose effect on the disease arises through interactions. Using priors on the marker memberships and Markov chain Monte Carlo (MCMC) methods, the posterior probabilities of group membership are determined. Specifically, the algorithm produces posterior probabilities by interrogating each SNP marker conditionally on the current status of the others via MCMC [9]. The genotype counts are modeled by the multinomial distribution with frequency parameters \( \theta = \left\{ {\theta_{1} ,\theta_{2} ,\theta_{3} } \right\} \), \( \mathop \sum \nolimits_{i = 1}^{3} \theta_{i} = 1 \), described by the Dirichlet prior with hyperparameters \( \alpha = \left\{ {\alpha_{1} ,\alpha_{2} ,\alpha_{3} } \right\} \):
\[ P(\theta |\alpha ) = \frac{\Gamma (\alpha_{1} + \alpha_{2} + \alpha_{3} )}{\Gamma (\alpha_{1} )\,\Gamma (\alpha_{2} )\,\Gamma (\alpha_{3} )}\prod\limits_{i = 1}^{3} \theta_{i}^{\alpha_{i} - 1} \]
In order to determine the posterior probability of each marker’s group membership (represented by I), the Metropolis–Hastings (MH) algorithm [30] is used to sample from \( P(I|D,H) \) as given in Eq. 3:
where D is the patient data set (with disease), H is the control data set (healthy), and \( D_{0} \), \( D_{1} \), and \( D_{2} \) are the partitions of the patient data set into the three categories described above. The assumption is that case genotypes at the disease-associated markers have different distributions from control genotypes. Furthermore, the likelihood model assumes independence among markers in the control group.
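As a reading aid, the sampling target can be written in factorized form. The expression below is a sketch that is consistent with the definitions of I, D, H and the three partitions of the case data given here; its exact form is an assumption and should be checked against [9].

```latex
\[
P(I \mid D, H) \;\propto\; P(D_{1} \mid I)\, P(D_{2} \mid I)\, P(D_{0}, H \mid I)\, P(I)
\]
```

In this form the first two factors are the marginal likelihoods of the case genotypes at marginally associated and at interacting markers, the third factor pools cases and controls at unassociated markers, and P(I) is the prior on marker memberships. The normalization constant is never required, because the MH acceptance ratio depends only on ratios of this expression.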
Although BEAM was among the first algorithms able to handle GWAS data, it relies on the assumption that the dependence structure among SNPs can be described by a Markov chain [9, 25]. In fact, SNP markers are highly correlated within haplotype blocks, which are separated by recombination events [13, 37]. Therefore, despite its success, the BEAM model is unable to capture the block-like structure of the human genome.
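The role of the multinomial likelihood with a Dirichlet prior can be illustrated for a single marker. The sketch below is a deliberately simplified stand-in for the full partition model, not the BEAM implementation: it compares the marginal likelihood of a model in which cases and controls have separate genotype frequencies against one in which they share them. The symmetric prior with all hyperparameters equal to 1, the function names, and the genotype counts are assumptions made for illustration.

```python
from math import lgamma

def log_marginal(counts, alpha=1.0):
    """Log marginal likelihood of an ordered genotype sample under a
    multinomial model with a symmetric Dirichlet(alpha) prior."""
    k = len(counts)
    n = sum(counts)
    out = lgamma(k * alpha) - lgamma(k * alpha + n)
    for c in counts:
        out += lgamma(alpha + c) - lgamma(alpha)
    return out

def log_bayes_factor(case_counts, control_counts, alpha=1.0):
    """log[P(data | separate frequencies) / P(data | shared frequencies)]."""
    pooled = [c + h for c, h in zip(case_counts, control_counts)]
    log_assoc = log_marginal(case_counts, alpha) + log_marginal(control_counts, alpha)
    log_null = log_marginal(pooled, alpha)
    return log_assoc - log_null

# Hypothetical genotype counts (AA, Aa, aa) in cases and controls.
shifted = log_bayes_factor([60, 30, 10], [30, 40, 30])   # distributions differ
matched = log_bayes_factor([40, 40, 20], [40, 40, 20])   # distributions agree
```

A positive log Bayes factor favours association and a negative one favours the shared-frequency model. Because the extra parameters of the association model are integrated out rather than optimized, the comparison penalizes unnecessary complexity without a separate correction term.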
Incorporating Block-Type Genome Structure
Given that nearby SNPs are strongly correlated due to linkage disequilibrium, a newer Bayesian model [25] that infers diplotype blocks and selects the disease-associated SNP markers within blocks is much more powerful than other similar approaches. Here, we review the statistical Bayesian model for the determination of the LD-block structure [25, 26]. The main assumptions are that the diplotypes of individuals come from a multinomial distribution whose frequency parameters have a Dirichlet prior, and that the genotype combinations of SNPs in different blocks are mutually independent. Integrating out the frequency parameters, the marginal probability of the data \( D_{[s,b)} \) for a specific block has the compact form:
\[ P(D_{[s,b)} ) = \frac{\Gamma \left( {\sum\nolimits_{i} a_{i} } \right)}{\Gamma \left( {\sum\nolimits_{i} (a_{i} + n_{i} )} \right)}\prod\limits_{i} \frac{\Gamma (a_{i} + n_{i} )}{\Gamma (a_{i} )} \]
where the block under consideration consists of the SNPs (s, …, b − 1), Γ is the gamma function, \( \vec{a} \) is the vector of Dirichlet parameters, and \( n_{i} \) is the number of times a specific diplotype i is observed. For joint inference of diplotype blocks and disease association status, we use the joint statistical model for the observed genotype data in cases and controls, the marker membership, and the block partition variable:
Finally, in order to determine the posteriors \( P(B|D,H) \) and \( P(I|D,H) \), the model uses a combination of the MH algorithm and a Gibbs sampler [25].
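How a block partition can be scored is easiest to see for two SNPs. The sketch below compares the Dirichlet-multinomial evidence for "both SNPs in one block" (nine joint genotype combinations) with the evidence for "two independent blocks" (three genotypes each). It is a toy illustration under stated assumptions: diplotypes are approximated by unphased genotype pairs coded 0/1/2, the prior is symmetric with every parameter equal to 1, and all names and data are hypothetical.

```python
from collections import Counter
from math import lgamma

def log_marginal(counts, n_categories, alpha=1.0):
    """Log Dirichlet-multinomial marginal; categories that were never
    observed contribute zero to the sum and can be omitted from counts."""
    n = sum(counts)
    out = lgamma(n_categories * alpha) - lgamma(n_categories * alpha + n)
    for c in counts:
        out += lgamma(alpha + c) - lgamma(alpha)
    return out

def log_block_evidence(genotype_pairs, alpha=1.0):
    """Return (log P(data | one block), log P(data | two independent blocks))."""
    joint = Counter(genotype_pairs)
    first = Counter(g1 for g1, _ in genotype_pairs)
    second = Counter(g2 for _, g2 in genotype_pairs)
    one_block = log_marginal(joint.values(), 9, alpha)
    two_blocks = (log_marginal(first.values(), 3, alpha)
                  + log_marginal(second.values(), 3, alpha))
    return one_block, two_blocks

# Hypothetical samples: perfectly correlated SNPs vs. freely combining SNPs.
linked = [(g, g) for g in (0, 1, 2) for _ in range(20)]
unlinked = [(g1, g2) for g1 in range(3) for g2 in range(3) for _ in range(7)]

linked_one, linked_two = log_block_evidence(linked)
unlinked_one, unlinked_two = log_block_evidence(unlinked)
```

For the correlated sample the single-block model has the higher evidence, while for the freely combining sample the two-block model wins. A sampler over block boundaries repeats this kind of comparison many times as it proposes to merge or split neighbouring blocks.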
Detailed Interaction Partition Structure Determination
While successful in inferring epistatic interactions in large-scale case-control studies, both BEAM and its newer version BEAM2 rely on saturated models, which limits their ability to determine the structure of epistatic interactions accurately. Recent studies have shown [4, 33, 38] that such interaction details, which encode complicated regulatory mechanisms, might play an important role in disease formation. To explore the etiopathogenesis and genetic mechanisms of diseases more carefully, a novel algorithm named Recursive Bayesian Partition (RBP) was proposed [33]. The RBP approach employs a Bayesian model to discover independence groups among interacting markers: first, it recursively infers all the marginally independent interaction groups, and then it determines the conditional independence within each group using a chain dependence model. In this way, RBP recovers the dependence structure among interacting variants in GWAS. Figure 3 shows an example of the possible outcomes of the RBP algorithm applied to GWAS data when determining the independence structure of epistatic interactions.
Bayesian Graph Models and Networks
To improve the sensitivity and specificity of disease mapping, the BEAM3 algorithm [24] uses a graph model that allows flexible interaction structures for multi-SNP associations. Through the use of Bayesian networks, BEAM3 detects flexible interaction structures instead of using saturated models (as in BEAM and BEAM2), thereby greatly reducing the complexity of the interaction model. Moreover, because only the disease association graphs are constructed, BEAM3 provides higher computational efficiency in whole-genome association settings [24].
In detail, BEAM3 allows for higher order couplings via saturated interactions within cliques (nonoverlapping partition of SNPs) and pairwise interactions between them. It can be shown [24] that the joint probability of all SNPs X, parameters, including disease graph and association status (G, I), and disease status indicator (Y) is given by:
where \( G = (C,\Delta ) \) is an undirected disease graph constructed on the disease-associated SNPs \( (X_{1} ) \), comprising a partition of the SNPs into cliques (C) and the interactions between cliques (∆), and \( P_{A} \) is the probability function of the set \( X_{1} \) under the phenotype association hypothesis. As can be seen from Eq. 7, only the few disease-associated SNPs (the set \( X_{1} \)) are modeled, so a significant portion of computational time is saved by avoiding explicit modeling of the complicated dependence structure of all SNPs, which may number in the millions [19, 24, 39]. Additionally, through the choice of a proper baseline probability function \( P_{0} (X_{1} ) \), the model automatically accounts for the complex LD effects among dense SNPs by means of graphs. Thus, a significant number of redundant false interactions are avoided, which reduces the computational burden [24]. Specifically, summing over all graphs \( G^{\prime} \), the expression for the baseline model becomes:
An alternative approach to learning disease-inducing gene–gene interactions uses binary classification trees. Bayesian methodology has recently been applied [21] to the identification of multi-locus interactions in large-scale data sets using a Bayesian classification tree model. Specifically, this kind of machine learning approach produces tree-structured models in which each nonterminal node defines a splitting rule based on predictor variables such as SNPs, and the edges leaving a node correspond to the different possible values of the variable in that parent node. A path along such a tree down to a terminal node represents a specific combination of the predictor variables along the path, and therefore automatically accommodates epistasis [8, 21].
There are various ways of searching through the feasible tree space in such recursive partitioning approaches, including greedy algorithms [40], the random forests approach [8, 41], and MCMC [42, 43]. Bayesian variable partition and Bayesian classification trees are conceptually similar in that a prior is assigned to all tree models in order to control the tree size [21]. One main advantage of this approach is a potentially higher probability of finding multi-locus interactions with weak marginal effects, because variable splitting is ensured through the prior specification. Moreover, owing to the adaptivity of the MCMC algorithm, such Bayesian tree models detect higher-order interactions by performing thorough searches near trees that contain the interacting variables found in previous iterations [21]. It is important to point out that classification tree approaches do not test for epistatic interactions directly [8].
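The way a root-to-leaf path encodes a multi-locus genotype combination can be shown with a hand-built tree. The sketch below is purely illustrative: the tree is written by hand rather than learned, the SNP names `snp_a` and `snp_b`, the 0/1/2 genotype coding, and the risk labels are all hypothetical, and no Bayesian search over trees is performed.

```python
def predict(tree, genotypes):
    """Follow one root-to-leaf path; genotypes maps SNP name -> 0, 1 or 2."""
    node = tree
    while isinstance(node, dict):            # nonterminal node: apply its split
        node = node["children"][genotypes[node["snp"]]]
    return node                              # terminal node: predicted class

# Hypothetical interaction: risk is high only when exactly one of the two
# SNPs carries a minor allele, so the effect of snp_b depends on snp_a.
tree = {
    "snp": "snp_a",
    "children": {
        0: {"snp": "snp_b", "children": {0: "low", 1: "high", 2: "high"}},
        1: {"snp": "snp_b", "children": {0: "high", 1: "low", 2: "low"}},
        2: {"snp": "snp_b", "children": {0: "high", 1: "low", 2: "low"}},
    },
}

carrier_b_only = predict(tree, {"snp_a": 0, "snp_b": 1})
carrier_both = predict(tree, {"snp_a": 1, "snp_b": 1})
```

Each leaf is reached through a specific pair of genotypes, so the tree represents the interaction without a dedicated interaction term. In this constructed example the same genotype at `snp_b` leads to opposite predictions depending on `snp_a`.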
Clinical Applications of Bayesian Methodology
Even though practical Bayesian approaches for whole-genome analysis of multi-locus interactions have emerged relatively recently, such methods have already helped to make important advances in determining disease etiology. Table 1 summarizes and compares all the statistical methods described above, together with their success in recovering previously known disease loci and, more importantly, in discovering new multi-locus interactions responsible for complex diseases. We specifically note which interaction model each method uses. For example, a Bayesian analysis strategy combining the BEAM and BEAM2 software [44] allowed the discovery of 319 high-order interactions across the genome that can potentially explain the missing genetic component of rheumatoid arthritis susceptibility. Moreover, the findings of that study indicate that the nervous system, in addition to the immune system, potentially plays a crucial role in the development of the disease. Figure 4 shows a schematic diagram of the combined Bayesian strategy used for the analysis. This is an example of a statistical study in which the biological processes underlying a disease can be extracted from the statistical associations found. Many more studies that apply Bayesian methods, either to existing GWAS data or to new large-scale studies, are likely to follow in the near future.
Conclusions and Future Prospects
Certain issues need to be considered when using the Bayesian approaches described above. For example, the combination of genotyping errors, disease heterogeneity, and population substructure could adversely affect the statistical results of the methods [9]. Currently, the major problem in the field is that the disease-associated genetic variants identified so far explain only a small part of disease heritability [3, 4]. However, it is conceivable that the software tools outlined above will help to provide a detailed understanding of the interactions involved. Additionally, the recent development of Bayesian models should allow the detailed etiopathogenesis of disease formation and the underlying causal biology to be elucidated.
Improvements to the Bayesian approaches mentioned in this article can include incorporation of environmental factors and population structures as covariates in the statistical model [33, 45]. Another possible improvement is to impute untyped SNPs and missing genotypes [46]. Efficient incorporation of prior biological knowledge into the Bayesian model can increase the probability of making discoveries in association studies [47]. Finally, recent computational proposals attempt to apply Bayesian methodology specifically toward efficient identification of causal rare variants in GWAS [48, 49].
It is important to keep in mind that clinical applications of these statistical methods will arise from understanding the relationship between the mathematical couplings that are found and their biochemical underpinnings. The biological interpretation of single- and multi-variant effects is currently a crucial area of research in genetics [8]. Modern statistical approaches to the analysis of SNP data from whole-genome association studies have the potential to play an important role in the future of bioinformatics and genomics research. Specifically, such methods will contribute to a new understanding of disease pathogenesis and provide crucial information for drug discovery [50], thus leading to important clinical applications.
References
1. S.S. Hall, Revolution postponed. Sci. Am. 303, 60–67 (2010)
2. M.I. McCarthy et al., Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat. Rev. Genet. 9, 356–369 (2008)
3. P. Donnelly, Progress and challenges in genome-wide association studies in humans. Nature 456, 728–731 (2008)
4. WTCCC, Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 (2007)
5. E.E. Eichler et al., Missing heritability and strategies for finding the underlying causes of complex disease. Nat. Rev. Genet. 11, 446–450 (2010)
6. J.A. Todd et al., Robust associations of four new chromosome regions from genome-wide analyses of type 1 diabetes. Nat. Genet. 39, 857–864 (2007)
7. J.N. Hirschhorn, M.J. Daly, Genome-wide association studies for common diseases and complex traits. Nat. Rev. Genet. 6, 95–108 (2005)
8. H.J. Cordell, Detecting gene-gene interactions that underlie human diseases. Nat. Rev. Genet. 10, 392–404 (2009)
9. Y. Zhang, J.S. Liu, Bayesian inference of epistatic interactions in case-control studies. Nat. Genet. 39, 1167–1173 (2007)
10. M.L. Metzker, Sequencing technologies – the next generation. Nat. Rev. Genet. 11, 31–46 (2010)
11. D. Branton et al., The potential and challenges of nanopore sequencing. Nat. Biotechnol. 26, 1146–1153 (2008)
12. A. Schaffer, Nanopore sequencing. Technol. Rev. (2012)
13. The International HapMap Consortium, A haplotype map of the human genome. Nature 437, 1299–1320 (2005)
14. E. Svoboda, The DNA transistor. Sci. Am. 303, 46 (2010)
15. A.D. Johnson, C.J. O’Donnell, An open access database of genome-wide association results. BMC Med. Genet. 10, 6 (2009)
16. D. Altshuler, M. Daly, Guilt beyond a reasonable doubt. Nat. Genet. 39, 813–815 (2007)
17. G. Gibson, Rare and common variants: twenty arguments. Nat. Rev. Genet. 13, 135–145 (2012)
18. M. Carmichael, One hundred tests. Sci. Am. 303, 50 (2010)
19. X. Jiang et al., Learning genetic epistasis using Bayesian network scoring criteria. BMC Bioinform. 12, 89 (2011)
20. J.H. Moore, The ubiquitous nature of epistasis in determining susceptibility to common human diseases. Hum. Hered. 56, 73–82 (2003)
21. M. Chen et al., Detecting epistatic SNPs associated with complex diseases via a Bayesian classification tree search method. Ann. Hum. Genet. 75, 112–121 (2011)
22. M.D. Ritchie et al., Multifactor-dimensionality reduction reveals high-order interactions among estrogen-metabolism genes in sporadic breast cancer. Am. J. Hum. Genet. 69, 138–147 (2001)
23. S. Wiltshire et al., Epistasis between type 2 diabetes susceptibility loci on chromosomes 1q21-25 and 10q23-26 in Northern Europeans. Ann. Hum. Genet. 70, 726–737 (2006)
24. Y. Zhang, A novel graphical model for genome-wide multi-SNP association mapping. Genet. Epidemiol. 36, 36–47 (2012)
25. Y. Zhang et al., Block-based Bayesian epistasis association mapping with application to WTCCC type 1 diabetes data. Ann. Appl. Stat. 5, 2052–2077 (2011)
26. I. Kozyryev, J. Zhang, Bayesian determination of disease associated differences in haplotype blocks. Am. J. Bioinform. 1, 20–29 (2012)
27. J.D. Wall, J.K. Pritchard, Haplotype blocks and linkage disequilibrium in the human genome. Nat. Rev. Genet. 4, 587–597 (2003)
28. A. Gelman et al., Bayesian Data Analysis, 2nd edn. (2003)
29. J.A. Rice, Mathematical Statistics and Data Analysis, 3rd edn. (2006)
30. J.S. Liu, Monte Carlo Strategies in Scientific Computing, 1st edn. (2001)
31. J. Marchini et al., Genome-wide strategies for detecting multiple loci that influence complex diseases. Nat. Genet. 37, 413–417 (2005)
32. Y. Liu et al., Genome-wide interaction-based association analysis identified multiple new susceptibility loci for common diseases. PLoS Genet. 7, 3 (2011)
33. J. Zhang et al., A Bayesian method for disentangling dependent structure of epistatic interaction. Am. J. Biostat. 2, 1–10 (2011)
34. T. Zheng et al., Backward genotype-trait association (BGTA)-based dissection of complex traits in case-control design. Hum. Hered. 62, 196–212 (2006)
35. N.R. Cook et al., Tree and spline based association analysis of gene-gene interaction models for ischemic stroke. Stat. Med. 23, 1439–1453 (2004)
36. M.R. Nelson et al., A combinatorial partitioning method to identify multilocus genotypic partitions that predict quantitative trait variation. Genome Res. 11, 458–470 (2001)
37. D.E. Reich et al., Linkage disequilibrium in the human genome. Nature 411, 199–204 (2001)
38. Y. Yang et al., Testing association with interactions by partitioning chi-squares. Ann. Hum. Genet. 73, 109–117 (2009)
39. Y. Zhang, J.S. Liu, Fast and accurate approximation to significance tests in genome-wide association studies. J. Am. Stat. Assoc. 106, 846–857 (2011)
40. T. Hastie et al., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 5th edn. (2011)
41. L. Breiman, Random forests. Mach. Learn. 45, 5–32 (2001)
42. H.A. Chipman et al., Bayesian CART model search. J. Am. Stat. Assoc. 93, 935–948 (1998)
43. D.G.T. Denison et al., A Bayesian CART algorithm. Biometrika 85, 363–377 (1998)
44. J. Zhang et al., High-order interactions in rheumatoid arthritis detected by Bayesian method using genome-wide association studies data. Am. Med. J. 3, 56–66 (2012)
45. I. Lobach et al., Genotype-based association mapping of complex diseases: gene-environment interactions with multiple genetic markers and measurement errors in environmental exposures. Genet. Epidemiol. 34, 792–802 (2010)
46. Y. Zhang, Bayesian epistasis association mapping via SNP imputation. Biostatistics 12, 211–222 (2011)
47. M. Chen et al., Incorporating biological pathways via a Markov random field model in genome-wide association studies. PLoS Genet. 7(4), e1001353 (2011)
48. F. Liang, M. Xiong, Bayesian detection of causal rare variants under posterior consistency. PLoS ONE 8(7), e69633 (2013)
49. M.A. Quintana et al., Incorporating model uncertainty in detecting rare variants: the Bayesian Risk Index. Genet. Epidemiol. 35, 638–649 (2011)
50. Y. Okada et al., Genetics of rheumatoid arthritis contributes to biology and drug discovery. Nature 506, 376–381 (2013)
Acknowledgements
Zhang was supported by the start-up funding and Sesseel Award from Yale University.
© 2017 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Kozyryev, I., Zhang, J. (2017). Clinical Assessment of Disease Risk Factors Using SNP Data and Bayesian Methods. In: Xu, D., Wang, M., Zhou, F., Cai, Y. (eds) Health Informatics Data Analysis. Health Information Science. Springer, Cham. https://doi.org/10.1007/978-3-319-44981-4_6
DOI: https://doi.org/10.1007/978-3-319-44981-4_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44979-1
Online ISBN: 978-3-319-44981-4
eBook Packages: Computer Science, Computer Science (R0)