Introduction

Type 2 diabetes [T2D (MIM 222100)] is a common disease and a major medical burden on society (Narayan et al. 2006). As a means of better understanding the disease and developing novel therapies and prophylactic treatments, studying the genetic basis of T2D has long been a top scientific priority. Until recently, in spite of the significant heritability of the disease, only a small fraction of the genes affecting susceptibility had been identified. Last year, however, a novel technology was applied at its current full capacity to decipher the genetic makeup of T2D. Several whole-genome association (WGA) studies were conducted on thousands of samples (Salonen et al. 2007; Saxena et al. 2007; Scott et al. 2007; Steinthorsdottir et al. 2007; Zeggini et al. 2007; Sladek et al. 2007; The Wellcome Trust Case Control Consortium 2007). Subsequently, a combined analysis of several WGA studies resulted in the identification of 10 loci, all of them robustly replicated as T2D risk factors (Scott et al. 2007). T2D is the first disease to be studied extensively using the WGA approach. Consequently, it is important to examine how these findings replicate in other populations.

In the current study, we looked at these 10 loci and assessed their effect on T2D in the Ashkenazi Jewish (AJ) population. The AJ population is a unique population in terms of its demographic history and genetic architecture. Historical studies suggest that the AJ population descended from a small number of founders and then rapidly expanded 700–1,000 years ago. The AJ population founder bottleneck eliminated many rare haplotypes, causing a reduction in the population’s genetic heterogeneity and a modest increase in linkage disequilibrium (LD) (Shifman and Darvasi 2001; Olshen et al. 2008). Therefore, a replication study in the AJ population appears both important and efficient.

Materials and methods

Subjects

The 1,131 T2D subjects included in the study were recruited in Israel by the patients’ physicians in specialized clinics. The 1,147 healthy controls were collected from the Israeli blood bank. All subjects analyzed are of AJ ancestry, self-declared for all four grandparents. All samples are part of the Hebrew University Genetic Resource (HUGR), www.hugr.org.

SNP genotyping

Genotyping was carried out using the KASPar technology by KBioscience (http://www.kbioscience.co.uk).

Analysis

A standard chi-square allele test was performed to calculate one-tailed P values using the R environment. The statistical power for testing association was estimated based on the allele frequencies in the AJ control sample and the odds ratios given by Scott et al. (2007), using formula (12.2) on page 285 of Siegmund and Yakir (2007). To assess the predictive value of the seven SNPs exhibiting a T2D susceptibility effect (P < 0.1, Table 1), we fitted a logistic regression model to our data (assuming additive non-interacting effects), similar to the simulation analysis conducted by Scott et al. (2007). In this model, AJ healthy individuals were given nine times as much weight as AJ T2D patients, in order to reflect the underlying assumption of a 0.1 population prevalence for the disease. Likewise, for the frequency estimation in the population of subjects in each risk interval, healthy subjects were counted nine times and cases only once. The standard deviation of the frequency estimation in each risk interval was calculated using 1,000 bootstrap samples.
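
The allele test described above can be sketched as follows. This is a minimal illustration in Python rather than the R code actually used in the study. It assumes a Pearson chi-square on the 2 x 2 table of allele counts without continuity correction, and it halves the two-tailed P value when the risk allele is more frequent in cases. The function name and arguments are hypothetical.

```python
import math

def allele_test_one_tailed(case_risk, case_other, ctrl_risk, ctrl_other):
    """Allelic chi-square test (1 df) with a one-tailed P value.

    Arguments are allele counts (two alleles per genotyped subject).
    The one-tailed alternative is that the previously reported risk
    allele is more frequent in cases than in controls.
    Returns (chi-square statistic, one-tailed P, allelic odds ratio).
    """
    n_case = case_risk + case_other
    n_ctrl = ctrl_risk + ctrl_other
    n = n_case + n_ctrl
    # Cross-product difference; its sign gives the direction of the effect.
    det = case_risk * ctrl_other - case_other * ctrl_risk
    # Pearson chi-square for a 2 x 2 table (assumption: no continuity correction).
    chi2 = n * det ** 2 / (
        n_case * n_ctrl * (case_risk + ctrl_risk) * (case_other + ctrl_other)
    )
    # Two-tailed P for 1 df: P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_two = math.erfc(math.sqrt(chi2 / 2.0))
    # Halve the P value if the effect is in the expected direction.
    p_one = p_two / 2.0 if det > 0 else 1.0 - p_two / 2.0
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    return chi2, p_one, odds_ratio
```

An effect in the direction opposite to the one reported previously yields a one-tailed P value above 0.5 under this convention.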

Table 1 The association results for the 10 SNPs analyzed based on 1,131 cases versus 1,147 controls

Results

Average call rate was above 98% for all SNPs and for all cases and controls, except for rs5219, which had a 93% call rate in the controls. Five of the SNPs genotyped here were also independently genotyped for approximately 200 of the cases used in the current study, as part of our T2D WGA study using Sentrix HumanHap300 BeadChips and Infinium II genotyping assays by Illumina (Salonen et al. 2007). Concordant genotypes were observed for 99.8% of the genotypes called by both technologies. Except for rs5219, none of the SNPs exhibited a significant deviation (P < 0.05) from Hardy–Weinberg equilibrium (in either cases or controls). As presented in Table 1, allele frequencies in the controls of the AJ population were generally only slightly different from those of the general Caucasian population. Table 1 presents the association results for the 10 SNPs analyzed based on 1,131 cases versus 1,147 controls. We tested for allele association of the previously established risk allele and therefore present the one-tailed P value. The expected power to detect these SNPs at P = 0.05 (presented for each SNP in Table 1) was on average 0.65. Therefore, nominal statistical significance is not necessarily expected for every SNP in our study. Nevertheless, for seven of the 10 SNPs the P value was below 0.1, seven times as many as expected by chance. The rs5219 SNP was the only SNP to exhibit an effect in the opposite direction to that reported in the previous studies. This can possibly be interpreted as a consequence of a biased allele calling rate between the cases and controls for this particular SNP. For example, if the difference in allele calling between the cases and the controls was caused by a higher proportion of genotyping failures for the CC genotype in the controls, the effect of this SNP could disappear. Therefore, we will not refer to this SNP hereafter.
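
The comparison with chance can be made explicit with a short calculation. Under the null hypothesis each one-tailed P value falls below 0.1 with probability 0.1, so about one of the 10 SNPs would be expected to do so. The sketch below also computes a binomial tail probability for seven or more such SNPs. It assumes the 10 tests are independent, which is an assumption made here for illustration and not an analysis reported in the study; the function names are hypothetical.

```python
from math import comb

def expected_hits(n_tests, alpha):
    """Expected number of tests with P < alpha when no SNP has an effect."""
    return n_tests * alpha

def prob_at_least(k, n_tests, alpha):
    """Binomial probability of k or more tests with P < alpha under the null,
    assuming the tests are independent."""
    return sum(
        comb(n_tests, i) * alpha ** i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )

# Ten SNPs, threshold 0.1, seven SNPs observed below the threshold.
print(expected_hits(10, 0.1))
print(prob_at_least(7, 10, 0.1))
```
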
One of the polymorphisms tested here, a SNP (rs7754840) in the CDKAL1 gene, exhibited a significantly increased OR as compared to the average OR calculated in the meta-analysis of the general Caucasian population (Scott et al. 2007). This may occur as a consequence of the homogeneity of the AJ population, differences in LD patterns or simply by chance. Figure 1 presents the LD patterns around this SNP in the AJ population and in the Caucasian population. Data for the LD patterns were obtained for the AJ population from a previous study on this population (Salonen et al. 2007) and for the Caucasian population from the HapMap database (www.hapmap.org). As shown in Fig. 1, LD patterns for the two populations are similar and are thus unlikely to be the source of the difference in ORs. Interestingly, in a totally independent study of 513 T2D cases versus 475 controls, again all collected in the AJ population, the estimated OR was 1.3, with a 95% confidence interval of 1.08–1.56 (B. Glaser, personal communication). This result is strikingly similar to the OR observed here (1.29), strengthening the possibility that the higher OR observed in the AJ population is a true effect. None of the other SNPs in the current study (Table 1) differed significantly with regard to OR when compared to the meta-analysis values.

Fig. 1

LD patterns at the CDKAL1 locus in the Caucasian population (upper figure) and the AJ population (bottom figure). The location of the SNP tested in this study is indicated by an arrow

Figure 2 presents the distribution of the various risk levels, given genetic information for the seven SNPs that exhibit an effect (P < 0.1, Table 1). Each of the seven loci may increase or decrease the risk of developing T2D, and the particular combination of genotypes at those seven loci may jointly alter the individual’s risk. Based on the observed OR and assuming a population prevalence of 10%, one can estimate the risk of any individual developing T2D given the particular genotypes at all the risk loci. With allele and genotype frequencies in hand for all the SNPs, one can also estimate the population frequency of individuals with a particular seven-SNP genotype combination. Grouping together individuals with similar risks and presenting their combined frequency in the population offers insight into the value of the genetic information for identifying individuals at high or low risk of developing T2D. Without any genetic information, the risk of an individual developing T2D is 0.1 in this model. As shown in Fig. 2, even with genetic information available, most individuals will carry a risk close to that baseline (i.e. between 0.08 and 0.12). Only a small percentage (about 2%) will carry a significantly lower risk (about 0.05) and another small percentage (about 1%) will carry a significantly higher risk (about 0.2).
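
The reasoning in this paragraph can be sketched in code. The example below is a simplified illustration, not the weighted logistic regression fitted in the study. It assumes per-allele ORs that act additively on the log-odds scale, Hardy–Weinberg genotype frequencies, independence between loci, and an intercept chosen by bisection so that the mean risk in the population equals the assumed prevalence. The function name is hypothetical, and the ORs and allele frequencies in the usage lines are placeholders, not values from Table 1.

```python
import itertools
import math

def risk_distribution(odds_ratios, risk_allele_freqs, prevalence=0.1, bin_width=0.02):
    """Return {interval centre: population frequency} of predicted T2D risk."""
    log_ors = [math.log(o) for o in odds_ratios]
    # Frequencies of 0, 1 and 2 risk alleles under Hardy-Weinberg equilibrium.
    geno_freqs = [((1 - p) ** 2, 2 * p * (1 - p), p ** 2) for p in risk_allele_freqs]

    # Enumerate every multi-locus genotype with its frequency and log-odds score.
    combos = []
    for genotype in itertools.product(range(3), repeat=len(odds_ratios)):
        freq = math.prod(gf[g] for gf, g in zip(geno_freqs, genotype))
        score = sum(b * g for b, g in zip(log_ors, genotype))
        combos.append((freq, score))

    def mean_risk(intercept):
        return sum(f / (1 + math.exp(-(intercept + s))) for f, s in combos)

    # Bisection: mean risk increases with the intercept, so narrow the bracket
    # until the population mean risk matches the assumed prevalence.
    lo, hi = -20.0, 20.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if mean_risk(mid) < prevalence:
            lo = mid
        else:
            hi = mid
    intercept = (lo + hi) / 2

    # Group genotype combinations into intervals centred on multiples of bin_width.
    dist = {}
    for freq, score in combos:
        risk = 1 / (1 + math.exp(-(intercept + score)))
        centre = round(round(risk / bin_width) * bin_width, 10)
        dist[centre] = dist.get(centre, 0.0) + freq
    return dist

# Placeholder inputs for seven loci (hypothetical, for illustration only).
example = risk_distribution([1.1, 1.15, 1.2, 1.1, 1.3, 1.15, 1.2],
                            [0.3, 0.5, 0.6, 0.4, 0.3, 0.7, 0.2])
for centre in sorted(example):
    print(centre, round(example[centre], 4))
```

With seven loci the enumeration covers 3^7 = 2,187 genotype combinations, so an exhaustive calculation is inexpensive.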

Fig. 2

Distribution of the risk of developing T2D in the AJ population. The y-axis presents the percentage of the population carrying a given risk to develop T2D (x-axis) as estimated from the genetic information at seven loci. Each value on the x-axis refers to the central point of a 0.02-wide interval. Standard error bars are also shown
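
The weighted frequency estimate and its bootstrap standard deviation, as described in the Analysis section, can be sketched as follows. This is an illustrative reconstruction: the study does not state how the bootstrap samples were drawn, so resampling subjects with replacement from the pooled case and control sample is an assumption, as are the function names and the fixed random seed.

```python
import random
import statistics

def weighted_interval_frequency(risks, is_case, lo, hi, control_weight=9.0):
    """Weighted fraction of subjects whose predicted risk lies in [lo, hi).

    Controls are counted control_weight times and cases once, reflecting
    an assumed population prevalence of 0.1.
    """
    total = 0.0
    inside = 0.0
    for risk, case in zip(risks, is_case):
        weight = 1.0 if case else control_weight
        total += weight
        if lo <= risk < hi:
            inside += weight
    return inside / total

def bootstrap_sd(risks, is_case, lo, hi, n_boot=1000, seed=0):
    """Standard deviation of the weighted frequency over bootstrap samples."""
    rng = random.Random(seed)
    n = len(risks)
    estimates = []
    for _ in range(n_boot):
        # Assumption: resample subjects with replacement from the pooled sample.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            weighted_interval_frequency(
                [risks[i] for i in idx], [is_case[i] for i in idx], lo, hi
            )
        )
    return statistics.stdev(estimates)
```

Here `risks` would hold the fitted risk of each subject and `is_case` the corresponding case or control status; both are hypothetical inputs.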

Discussion

Recent advances provide the opportunity to uncover the genetic basis of common diseases. In the current study, we examined well-established associations recently discovered using WGA in T2D. The associations were tested in a large cohort of AJ cases and controls. Our cases and controls are unlikely to suffer from population stratification or other systematic differences between them (which would cause false-positive results), as all samples were collected from the AJ population in the same small region in central Israel. Also, WGA studies that have been conducted on such sample collections (including a subset of the samples used in the current study) did not exhibit false-positive results (Salonen et al. 2007; Shifman et al. 2008). The genetic effects tested were typically replicated with similar ORs. Furthermore, our data cannot confidently rule out an effect of any of the previously reported risk loci on susceptibility to T2D in the AJ population. One locus, rs7754840 in the CDKAL1 gene, exhibits a stronger effect in the AJ population than in the general Caucasian population. This may be a result of the specific homogeneous genetic background of the AJ population (Shifman and Darvasi 2001). It is worth noting that ORs can also be affected by covariates such as sex or body mass index (BMI). We did not include covariates in our analysis, both for comparative purposes and because phenotypic records of the controls were limited and did not include parameters such as BMI. Until recently, the identification of genes affecting common diseases suffered from inconsistencies and lack of replication. It is very reassuring to see that well-designed studies, such as the meta-analysis reported by Scott et al. (2007), are indeed reproducible.

Consistent with previous findings showing that such a set of SNPs may explain only a small fraction (2–3%) of the trait variation (Saxena et al. 2007), we also found that the predictive value of these findings is relatively limited. For most individuals, the genotypic information will only modestly alter their baseline risk of developing T2D. Only for a small fraction of the population (about 3%) may the genetic information indicate a roughly twofold decrease or increase relative to the baseline risk. Consequently, at present, the use of genetic information by clinicians for the identification of patients at risk of developing T2D does not seem very relevant. Nevertheless, it is important to emphasize that the value of gene discovery lies primarily in the understanding of the disease as a means to develop efficient therapies.