Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction

Type 2 diabetes (T2D) is a common, chronic disorder whose prevalence is increasing rapidly across the globe. Like other complex diseases, T2D represents a challenge for genetic studies aiming to uncover the underlying pathophysiological mechanisms. It is predicted that T2D will affect 592 million individuals by 2035 (Federation 2013) in developed and low- and middle-income countries. While the recent increase in T2D prevalence has been attributed to a sedentary “westernized” lifestyle and changes in diet, a significant proportion of heritable factors also contribute to individual susceptibility (Hu 2011).

The strong family clustering and heritability of T2D and related glycemic traits have motivated a large number of studies to identify genetic factors that cause this disease (Permutt et al. 2005; Stumvoll et al. 2005); despite much effort, by late 2006 only three genetic loci had been reproducibly shown to increase T2D risk [reviewed in Majithia and Florez (2009), McCarthy (2008), Stolerman and Florez (2009)]. The earliest attempts to discover T2D-associated genes used either position- or function-based strategies. In a position-based search, genes are identified within families by studying the co-inheritance of the disease with a set of polymorphic markers whose genomic positions are known. Such “linkage studies” usually identify a genomic region (~10 Mbp) that confers genetic risk; the disease-causative mutations are identified by sequencing transcribed and functional elements of all genes in the target region. In function-based approaches, risk association is tested for common genetic variants in candidate genes involved in T2D pathophysiology. In these studies, variants identified in a small number of patients and control subjects are genotyped in larger case-control samples. Both these approaches are characterized by a number of limitations. Linkage analysis is underpowered to detect low penetrance variants, expected to contribute to T2D susceptibility, given its high population prevalence. Candidate gene studies usually were conducted in samples of insufficient size and their findings had low reproducibility as well as difficulty to select good biological candidates.

Positional strategies have identified putative T2D loci in several large chromosome regions (McCarthy 2003) and in a number of specific genes (Horikawa et al. 2000; Meyre et al. 2005; Silander et al. 2004; Hara et al. 2002); however, none of these associations have been convincingly replicated. The candidate-gene approach generated a large number of positive reports, two of which have been confirmed in independent studies (Table 2.2) (Gloyn and McCarthy 2001). The Pro12Ala variant in the peroxisome proliferator-activated receptor gamma (PPARG) gene (Deeb et al. 1998; Altshuler et al. 2000; Lohmueller et al. 2003) and the Glu23Lys variant in the potassium inwardly rectifying channel, subfamily J, member 11 (KCNJ11) gene were shown to contribute to T2D risk in multiple studies (Gloyn et al. 2003; Laukkanen et al. 2004). Each of these two common variants contributes only modestly (increasing T2D risk by 15–20 % for each susceptibility allele) to the risk of developing common form of diabetes, while rare variants in both these genes cause monogenic diseases such as familial partial lipodystrophy and neonatal diabetes. Interestingly, these variants occur within pharmacological targets for the thiazolidinedione (PPARG) and sulfonylurea compounds (KCNJ11) used to treat T2D.

2 Common Variants: The First Steps Toward Large-Scale Association Mapping

Given the inefficient progress of early T2D gene discovery, the application of genome-wide association (GWA) studies to identify risk loci for T2D and glycemic traits represented a major advance in complex trait genetics. GWA studies are observational epidemiological studies in which genetic risk exposure is measured using hundreds of thousands of genotyping assays. The critical difference between GWAs and other observational epidemiological studies lies in the large number of genetic tests performed to assess exposure in each individual patient. On the one hand, the success of GWA studies relies on the development of technologies capable of screening a large number of polymorphisms (predominantly represented by single-nucleotide polymorphisms, SNPs), as the prior probability that any individual polymorphism will be associated with disease is small. On the other hand, the polymorphisms studied using commercial microarray platforms are not genetically independent and display complex linkage structures that may extend over tens or hundreds of thousands of base pairs. As a result, the success of a GWA study relies on achieving an adequate marker density to model local linkage structures across the genome.

The potential benefits of using GWA studies to discover complex disease risk loci were first demonstrated in a seminal paper by Risch and Merikangas that showed that the analysis of one million variants in the sample of unrelated individuals had greater statistical power than a linkage analysis with a few hundred markers (Risch and Merikangas 1996). In this context, Reich and Lander suggested a theoretical population-genetics model for a relatively simple distribution of susceptibility variants at a disease locus and rephrased the common disease common variant hypothesis (CDCV) to propose that high-frequency variants with low penetrance at disease loci contribute to the largest proportion of disease risk in a population (Reich and Lander 2001). Their theoretical demonstration of the CDCV hypothesis did not provide any expectation about the number of disease loci or their effect sizes in establishing complex disease risk.

The majority of GWA studies performed to identify T2D risk loci have used a case-control study design (Table 2.1), with retrospective longitudinal studies being primarily reserved for validation of previously identified loci. Alternate study designs to detect T2D risk associations are far less common and have included populations with early-onset diabetes (taken as a proxy for more severe illness), longitudinal studies in at-risk populations, and studies in isolated populations. Affected individuals in the genetic discovery cohorts are typically selected carefully using diagnostic criteria established by the American Diabetes Association or World Health Organization that are based solely on blood glucose levels. In contrast, selection of control subjects has been more problematic, with most discovery cohorts including patients based on a single normal blood glucose measurement and absent medical history of glucose intolerance. Many discovery cohorts have excluded patients with monogenic diabetes based on a suggestive family history or specific genetic tests and patients with autoimmune diabetes based on specific serum markers. In many studies, the case and control samples differ significantly in age, in order to avoid selecting individuals who will develop diabetes later in life as control subjects. Comparable environmental exposures are used as basis for selection of both, T2D cases and controls; and in addition to matching affected and control individuals for general ethnic background, ethnic outliers are excluded from analyses. Diabetic individuals usually are of comparable body mass index (BMI) in respect to controls in large genetic studies.

Table 2.1 Major published T2D GWAS and meta-analyses

The first successful “hypothesis-free” demonstration of T2D association came from the discovery of an intronic SNP in the transcription factor 7-like-2 (TCF7L2) gene, which confers the largest effect on T2D risk reported to date among common variants (Fig. 2.1; Tables 2.1 and 2.2) (Weedon 2007; Grant et al. 2006). These association studies were motivated by the group’s earlier demonstration of microsatellite associations in a linkage region on chromosome 10 (Reynisdottir et al. 2003) rather than by functional criteria. In fact, TCF7L2 encodes a transcription factor within the Wnt signaling pathway whose involvement in T2D pathogenesis remained elusive for many years following the initial genetic studies. Despite this, detailed physiological studies have now demonstrated the importance of the TCF7L2 locus in β-cell function and insulin secretion in human cohorts (da Silva Xavier et al. 2009, 2012; Dupuis et al. 2010; Dimas et al. 2014) and as a critical regulator of β-cell mass and function (Takamoto et al. 2014) and hepatic carbohydrate metabolism (Boj et al. 2012) in mouse models (see Chap. 15 for details).

Fig. 2.1
figure 1

Effect sizes of established T2D-susceptibility loci. Established T2D-susceptibility variants have only modest individual effects. The x-axis gives the per-allele odds ratio estimated for European-descent samples or for the ethnic group of discovery, if association was reported after mid-2012 (Table 2.1) for each locus listed on the y-axis. Loci are sorted by descending order of per-allele effect size within each year. Colors highlight the discovery study approach: red, candidate gene; yellow, large-scale association; blue, genome-wide association; dark blue, genome-wide association meta-analysis; sky blue, genome-wide meta-analysis with Metabochip follow-up; green, genome-wide meta-analysis of glycemic traits; pink, genome-wide sex-differentiated meta-analysis with larger effects in women; brown, genome-wide sex-differentiated meta-analysis with larger effects in men; hacky, genome-wide meta-analysis in lean/obese; gray, whole-exome sequencing. For loci with sex differentiation, the effect size for the sex with larger effect is presented. X-axis lists loci names, labeled by the gene names within region. Y-axis shows odds ratio for T2D observed at a given locus. Loci are split by the year of discovery and are ordered from top to bottom by the decreasing OR on T2D risk within each year. Shadow is used for loci from studies with discovery including non-European individuals

Table 2.2 T2D-susceptibility loci with genome-wide significant evidence for association

3 Loci Established Through T2D GWA Studies

The capacity to undertake efficient, large-scale association analyses using hypothesis-free approach through genome-wide studies opened a new wave of discoveries in T2D genetics. Four GWA studies published in 2007 (Diabetes Genetics Initiative of Broad Institute of H et al. 2007; Scott et al. 2007b; Sladek et al. 2007; Zeggini et al. 2007) (Table 2.1) confirmed the strongest association at TCF7L2, two previously established signals at PPARG and KCNJ11 and identified six novel loci, at HHEX/IDE, CDKAL1, IGF2BP2, CDKN2A/2B, SLC30A8, and FTO (Frayling et al. 2007; Freathy et al. 2008; Fall et al. 2013). Although it is conventionally used to name the loci by the most credible regional candidate (e.g., SLC30A8) rather than the tag SNP showing the strongest association (e.g., rs13266634), these assignments are used as a matter of convenience and do not imply that a mechanistic link has been proven. The association signals found in GWA studies require further investigation through extensive fine mapping and functional characterization to establish causal variants and determine their impact on T2D pathogenesis at a molecular level (Prokopenko et al. 2008).

The first round of published T2D GWA studies has provided both the identification of novel associated loci and the landscape of T2D susceptibility across the whole genome, the latter providing a point of reference for the previous equivocal findings accumulated through candidate-gene and linkage studies (Parikh and Groop 2004). Thus, controversial evidence for variants in Calpain-10 (CAPN10) and the insulin (INS) gene in T2D has not been confirmed by the GWA approach. Similarly, while the 1q chromosome region (30-Mb region near to centromere) contained a number of highly plausible candidates supported by genome-wide linkage analyses in multiple ethnicities, none have been confirmed through association studies and large-scale GWA meta-analyses to date (Prokopenko et al. 2009b; Morris et al. 2012; Replication and Meta-analysis 2014).

Simultaneously with the advent of GWA studies, large-scale replication efforts confirmed two loci highlighted by the candidate-pathway pre-GWA studies. Variants within Wolfram syndrome 1 (WFS1) gene and common variants in hepatocyte nuclear factor 1-b (HNF1B, also known as TCF2) were confirmed as associated with T2D (Franks et al. 2008; Sandhu et al. 2007; Winckler et al. 2007). These variants, along with KCNJ11 and PPARG, provide interesting examples of the convergence between Mendelian and polygenic causes of diabetes, as coding variants in these genes had previously been isolated in families with autosomal dominant inheritance of diabetes (Maturity Onset Diabetes of the Young, MODY) and as part of the multisystem Wolfram syndrome (McCarthy and Hattersley 2008).

As anticipated, GWA studies, by testing hundreds of thousands of genetic variants in parallel, have identified loci with modest effects (Manolio et al. 2009). To contend with the stringent significance thresholds that account for the number of independent tests performed across the genome, identification of additional T2D susceptibility loci required larger population samples, which was achieved by combining existing GWA studies in meta-analyses. The Diabetes Genetics Replication And Meta-analysis (DIAGRAM, http://www.diagram-consortium.org/) consortium carried out the first meta-analysis for T2D (Zeggini et al. 2008) of three GWA studies of European-descent individuals, including ~4500 cases and 5500 controls. Differences in the genotyping platforms used for individual GWA studies were overcome by imputation using a common variant set based on haplotype structure of densely characterized reference samples in HapMap (Consortium IH 2005) and extended the analysis to ~2.2 million SNPs across the genome (Scott et al. 2007b; Zeggini et al. 2007; Diabetes Genetics Initiative of Broad Institute of H et al. 2007). Following a 2-stage replication with genotyping of selected SNPs in ~75,500 individuals, the DIAGRAM study identified six novel loci (Tables 2.1 and 2.2), including only one reasonable biological candidate gene (NOTCH2, Notch homologue 2, Drosophila), which is involved in pancreatic development.

The DIAGRAM consortium published two further meta-analyses, each based on increasingly larger case-control samples from European populations. The first combined discovery data from 21 GWA studies in up to 8130 individuals with T2D and 38,987 controls all imputed to a HapMap 2 reference panel, followed by large-scale replication in 34,412 cases and 59,925 controls where 13 (11 novel) out of 23 autosomal signals were confirmed (Tables 2.1 and 2.2) (Voight et al. 2010). This meta-analysis was the first to examine T2D associations on chromosome X (taking X-inactivation into account) and identified an association at DUSP9 with a large effect on T2D risk (OR = 1.27, Table 2.2; Fig. 2.1) (Voight et al. 2010). The second meta-analysis, in addition to dramatically increasing the sample size (34,840 cases and 114,981 controls), implemented a novel cost-effective strategy for large-scale replication based on the CardioMetabochip (Metabochip), an Illumina iSelect genotyping array. Metabochip, which was designed through collaboration between six GWA consortia studying metabolic and atherosclerotic/cardiovascular diseases and traits (Voight et al. 2012), permitted follow-up of ~66,000 putative signals for cardiometabolic phenotypes (~5000 of which were selected for T2D) (Morris et al. 2012). The Metabochip array also contained approximately 120,000 SNP probes to fine map 257 established loci in an attempt to identify causal T2D susceptibility variants. The DIAGRAM meta-analysis with Metabochip follow-up established T2D associations at 10 loci (Tables 2.1 and 2.2), including two at CCND2 and GIPR with larger effects on T2D risk in males and females, respectively (Morris et al. 2012). Among previously established T2D loci, sex differentiation in effect size has been shown for KCNQ1, DGKB, and BCL11A (larger effects in males) and GRB14 (larger effects in females).

A separate DIAGRAM GWA meta-analysis of the effects of obesity on T2D risk, performed in Europeans through GWA meta-analysis of lean (BMI < 25 kg/m2) and obese (BMI ≥ 30 kg/m2) T2D diabetics with ~54,000 controls, identified associations with lean diabetic participants at LAMA1 and with obese subjects at HMG20A (Perry et al. 2012). A GWA meta-analysis in >8000 T2D cases and >10,870 controls in Europeans with large replication, including several additional datasets with de novo genotyping and the DIAGRAM discovery meta-analysis data in silico, reported association at RBMS1 (Tables 2.1 and 2.2) (Qi et al. 2010).

In parallel to studies in European populations, T2D GWA studies in Asian ethnic groups (representing Japanese, Chinese, Punjabi Sikhs, Indians, South Asian, and East Asian subjects) have established T2D associations at 27 loci (Table 2.1). These studies have generally followed a design based on a GWA study with large-scale replication in an individual ethnic group, frequently undertaken in multistage fashion. In addition, several groups have combined efforts to complete a recent East Asian GWA meta-analysis in up to 6952 T2D cases and 11,865 controls (with imputation based on the East Asian HapMap 2 reference panel) and identified eight novel loci, including GLIS3, PEPD, FITM2-R3HDML-HNF4A, KCNK16, MAEA, GCC1-PAX4, PSMD6, and ZFAND3 (Cho et al. 2012). A second meta-analysis of Chinese samples (with imputation based on the 1000 Genomes Project JPT (Japanese in Tokyo) and CHB (Han Chinese in Beijing) reference panels) has described T2D association with a common variant in the PAX4 gene , which is expressed in early pancreatic endocrine cells. The association, which was confirmed in a multiethnic analysis including European and five East Asian populations (Ma et al. 2013), adds another example of common variant associations with T2D at a MODY locus as heterozygous mutations in PAX4 have been identified as a cause of MODY9 (omim.org/entry/606391). Therefore, while rare coding mutations severely impair islet function and cause rare monogenic forms of diabetes, common variants can act through the same genes, but with smaller effects, to increase an individual’s risk of developing a more common form of diabetes.

A small number of GWA studies have been reported for other ethnic groups. Studies in Mexican individuals reported associations at several established loci (Parra et al. 2011) and a novel association at SLC16A11/SLC16A13 where the haplotype carriers had amino acid substitutions in SLC16A11 (Consortium et al. 2014b). The locus is thought to affect triacylglycerol metabolism and shows stronger association in leaner and younger people. While common in Native Americans and Asians, risk variants at this locus are rare in European and African individuals and have introgressed into modern humans through admixture with Neanderthals. A second study in American Pima Indians confirmed associations for a set of previously established loci while reaching study-wise significance (P-value = 6.6 × 10−8) at the DNER gene (Hanson et al. 2014). Finally, an African American GWA study has provided evidence for association at RND3/RBM43 (Palmer et al. 2012).

Methodological development in to combine data from multiple ancestry groups by accounting for heterogeneous allelic effects (Morris 2011) has enabled performing meta-analysis across different ethnicities. For example, combining European, East Asian, South Asian, and Mexican and Mexican-American GWA meta-analyses in up to 26,488 T2D cases and 83,964 controls has identified seven novel T2D susceptibility loci TMEM154, SSR1-RREB1, FAF1, POU5F1-TCF19, LPP, ARL15, and MPHOSPH9 (Replication et al. 2014). Importantly, the study demonstrated an overwhelming concordance of allelic effects across ethnicities, even at loci with only weak evidence of association, supporting the hypothesis that T2D risk variants predate migration of humans out of Africa and arguing against the “synthetic association ” hypothesis, which predicts that associations at common variants are driven by unobserved lower frequency causal alleles with large effects (Dickson et al. 2010).

It has long been suggested that the high prevalence of metabolic disorders related to impaired glucose homeostasis may be a result of selective evolutionary advantage of T2D and obesity-risk variants during periods of scarce food resources, which resulted in an increase in their frequency at the population level (thrifty gene hypothesis) (Neel 1962, 1999). Given that food intake is known to act as a trigger for insulin release, it has also been hypothesized that a positive selection may have operated in particular on those loci associated with T2D through an influence on β-cell function (Ayub et al. 2014). Some evidence of directional population differentiation and nominal positive selection at individual T2D risk loci, including TCF7L2, THADA, and NOTCH2, has been reported (Chen et al. 2012; Corona et al. 2013; Klimentidis et al. 2011). The collective analysis of all T2D-associated variants along with stratified by their impact on β-cell function or insulin resistance has to date found no support for global or differential positive selection at T2D loci, thus offering little support for the thrifty gene hypothesis (Ayub et al. 2014; Southam et al. 2009).

4 Common Variants with Modest Effect Sizes

Most GWA study designs are based on common variant genotyping arrays, which have determined the allele spectrum of the resulting T2D-associated variants (Table 2.2). The 88 known T2D risk loci (Table 2.2) show only modest effects (OR = 1.1–1.2), with TCF7L2 being the only locus showing larger effects in European populations (OR ~1.40, Fig. 2.1) (Morris et al. 2012). While this has led to an intense search for additional rare and common variants (particularly for causal variants which are expected to have larger effects), the early search for rare coding variants has had limited success (Table 2.2) (Steinthorsdottir et al. 2014; Albrechtsen et al. 2013). Additionally, studies in non-Europeans have recently provided support for a number of novel T2D susceptibility loci that show low allele frequencies in European populations (Unoki et al. 2008; Hanson et al. 2014; Consortium et al. 2014b). While this provides a challenge to validating these loci in European populations, the high concordance of the direction of effects across ethnicities for T2D risk variants (Replication et al. 2014) suggests that additional common T2D risk variants with consistent and modest effects across ethnic groups remain to be described. Their identification will require larger sample sizes and combined efforts of many studies and research centers (Morris et al. 2012).

The discriminatory capacity of genetic variants for T2D risk prediction and patient stratification has been assessed in longitudinal studies by examining whether inclusion of genetic risk scores (GRS) in predictive models increases the area under the receiver-operating-characteristic curve compared to predictive models including only clinical parameters. Early studies suggested that inclusion of GRS provided little improvement in T2D risk prediction compared to clinical risk factors and family history alone (Lyssenko et al. 2008; Meigs et al. 2008; Balkau et al. 2008; Talmud et al. 2010; de Miguel-Yanes et al. 2011). More recent studies, incorporating increasing numbers of T2D risk variants into the GRS, have also had mixed results (Hivert et al. 2011; Muhlenbruch et al. 2013; Vaxillaire et al. 2014). For example, while a recent study incorporating 43 T2D associated variants showed little improvement in T2D prediction, inclusion of the GRS in predictive models improved the receiver-operating-characteristic curve for subgroups of subjects at increased risk of T2D, including obese subjects, older participants, and those with a family history of diabetes (Muhlenbruch et al. 2013). Similarly, Hivert et al. have shown that a GRS with 34 variants was significantly associated with increased risk of progression to T2D in high-risk individuals, as well as a reduced effect of lifestyle interventions on genetic risk (Hivert et al. 2011).

A recent study comparing the discriminative capacity of GRSs including 65 T2D-associated loci and 36 FG-associated loci FG showed modest but significant improvement in T2D reclassification rates in models including a GRS incorporating T2D risk loci and modestly improved reclassification rates of incident and non-incident T2D and impaired fasting glucose (IFG) using the GRS incorporating both T2D risk and FG loci, suggesting that inclusion of risk loci associated with glycemic traits may be beneficial for intermediate phenotypes such as IFG (Vaxillaire et al. 2014). Further studies using GRS based on new loci and causative variants will help to improve insight into the longitudinal impact of genetic variants associated with glycemic traits on T2D risk of and disease trajectories.

5 Understanding Relationship with Other Phenotypes

Two critical processes leading to T2D development are β-cell dysfunction and insulin resistance in peripheral tissues including fat, muscle, liver, and elsewhere (Prokopenko et al. 2008). Beginning long before the clinical diagnosis of T2D, these processes are hallmarks of prediabetes; following which, progressive deterioration of β-cell function reaches a point when they are no longer able to meet the increased insulin demands from peripheral tissues, leading to the development of diabetes. In parallel to T2D GWA meta-analyses, a number of large-scale association studies have been successful in identifying genetic loci that influence quantitative glycemic traits, including fasting and postprandial glucose and serum insulin levels. These studies take advantage of the increased power that can be obtained when similarly sized cohorts studied for continuous traits compared to dichotomous outcomes; their success relies on the hypothesis that genes influencing blood levels in normal subjects will also increase diabetes risk. Significantly, while the genetic risk loci identified for T2D overlap to some degree with quantitative trait loci for blood glucose and insulin, several genes have shown association only with glycemic traits or only with increased T2D risk (Fig. 2.1). While it’s possible that this discordance may reflect the statistical power of the studies completed to date, the milder phenotypes observed in patients with glucokinase mutations compared to patients with other forms of MODY (McDonald and Ellard 2013) suggest that it is important to distinguish two overlapping but distinct groups of GWAs loci that are associated with altered glucose homeostasis on the one hand and the progressive of metabolic decompensation that leads to T2D on the other.

The only association with FG established before the GWA study era was at the glucokinase (GCK) locus (Weedon et al. 2005, 2006), a gene in which rare mutations cause MODY2 (Froguel et al. 1992). GCK catalyzes the first step in glycolysis and is one of the principal regulators of FG concentration and of β-cell secretory activity. An indicative association at the glucokinase regulator (GCKR) locus (rs780094) with FG, as well as an association at the same variant with serum triglyceride levels, was described by the DGI T2D GWA study, which however was not powered enough to detect an effect on T2D (Diabetes Genetics Initiative of Broad Institute of H et al. 2007). The product of GCKR regulates GCK activity and is a highly plausible candidate involved in T2D pathogenesis (see Chap. 16). The GCK and GCKR loci have since been associated with FG/HOMA-B (homeostasis model assessment of β-cell function) and FG/FI/HOMA-IR (homeostasis model assessment of insulin resistance), respectively, and with T2D (Dupuis et al. 2010; Manning et al. 2012). These findings prompted further interest in well-powered GWA studies for glycemic traits to detect reliable genetic associations which may be relevant to T2D pathogenesis (see Chap. 3).

In 2009, the collaborative Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC, http://www.magicinvestigators.org/) was established to consolidate the efforts of many groups working on glycemic trait genetics, in order to understand the variation of these traits within the physiological range and investigate their impact on T2D risk and other cardiometabolic traits (Prokopenko et al. 2009a). The first effort of MAGIC confirmed the association at GCK and G6PC2 loci and identified a novel signal at the melatonin receptor 1B (MTNR1B) locus for higher FG and lower insulin secretion. The inverse correlation between the levels of the neurohormone melatonin, secreted by the pineal gland, and insulin has long been known. However, few studies had investigated the relationship between melatonin signaling in pancreatic islets and metabolic disease (Peschke et al. 2007), prior to publication of large-scale association studies (Prokopenko et al. 2009a; Bouatia-Naji et al. 2009; Chambers et al. 2009; Go et al. 2013; Lyssenko et al. 2009). Association with T2D at MTNR1B locus was subsequently confirmed at genome-wide significance (Prokopenko et al. 2009a; Dupuis et al. 2010; Voight et al. 2010; Lyssenko et al. 2009).

To extend the first MAGIC study, a new, larger, whole GWA meta-analysis (21 studies, up to 46,186 nondiabetic individuals) was performed (Dupuis et al. 2010). It increased the number of glycemic trait loci to 16 and reported novel effects on T2D from a large-scale analysis at five of the FG/FI-associated loci (ADCY5, GCK, GCKR, DGKB, PROX1), thus highlighting that the overlap between the genetic variation influencing glucose homeostasis and risk of T2D is only partial (Fig. 2.1). Four of these loci contributed to impaired β-cell function as measured by HOMA-B and one (GCKR) was associated with insulin resistance.

Three MAGIC GWA meta-analyses for additional glycemic traits provided further insights into pathophysiology of T2D. A study focusing on 2-hour postprandial glucose (2hGlu) levels (15,234 nondiabetic individuals in discovery and up to 30,620 in replication) identified five associated loci (GIPR, VPS13C, ADCY5, GCKR, TCF7L2), including the novel locus GIPR (rs10423928) containing the gene encoding the GIP receptor for the insulin-response stimulating hormone GIP (glucose-dependent insulinotropic polypeptide) in pancreatic islet β-cells (Saxena et al. 2010) and in linkage disequilibrium (LD, HapMap CEU r 2 = 0.78) with BMI-associated rs2287019 (Speliotes et al. 2010). Genome-wide meta-analysis of HbA1c, including study in 46,368 in nondiabetic individuals by the MAGIC investigators, identified 10 genetic loci, of which MTNR1B and GCK also increase T2D risk, suggesting that their effect on hyperglycemia (as measured by FG) extends an effect on average glycemia over a 2- to 3-month period (as detected through HbA1c) and is related to T2D pathogenesis, while ANK1 maps close to T2D risk variant (Soranzo et al. 2010). Variants at VPS13C/C2CD4A/B and GIPR were subsequently associated with T2D, the latter showing larger effects in women, but both were in weak LD with glycemic trait variants (Yamauchi et al. 2010; Morris et al. 2012). Similarly, the ANK1 HbA1c locus variant rs4737009 identified by Soranzo et al. is not in LD with the T2D risk variant (rs516946, HapMap CEU r 2 < 0.01) (Morris et al. 2012). The genetic architecture at these three loci is complex and requires further investigation to dissect the relationships between genetic effects on the associated glycemic phenotypes.

Large-scale studies of glycemic traits using Metabochip have discovered additional common variant loci with small effects on FG/FI/2hGlu loci trait variability and further increased the overlap with T2D risk loci. In this study, 39 FG-raising alleles were related to increased T2D risk, although only 20 (>60 %) of them showed at least nominal significance (P < 0.05) for T2D. Similarly, 13 of the 19 FI loci were nominally associated with T2D and all but TCF7L2. Similarly, 13 of the 19 FI loci were nominal association with T2D and all, but TCF7L2, FI/insulin resistance-increasing alleles were associated with higher T2D risk and showed an impaired lipid profile (Fig. 2.2) (Scott et al. 2012).

Fig. 2.2
figure 2

Effects of established T2D loci: (a) on cardiometabolic phenotypes and (b) on glycemic traits. (a) Among a total 88 T2D loci, only 49 do not overlap with other cardiometabolic traits. The independent loci were defined by physical distance more than 500 kb from each other and by CEU LD r 2 > 0.01. (b) Among a total 88 T2D loci, only 27 overlap with fasting glucose or fasting insulin levels. The independent loci were defined by physical distance more than 500 kb from each other and by CEU LD r 2 > 0.01

FG-associated loci from GWAS studies have also helped define the relationship between T2D and abnormal insulin processing and secretion in β-cells. Among other glycemic trait analyses by the MAGIC, nine genome-wide significant loci were described for corrected insulin response (CIR), seven of which were previously associated with both T2D and other glycemic traits (MTNR1B, GCK, HHEX/IDE, CDKAL1, CDKN2A/2B, ANK1, C2CD4A/B) (Prokopenko et al. 2014). Two other loci included G6PC2 associated with glycemic trait variability in nondiabetic individuals and the novel GRB10 association, which showed potential tissue-specific methylation and parental imprinting that might mask its association with T2D). Meta-analysis of GWA studies by MAGIC for fasting proinsulin levels adjusted for FI identified eight loci, of which four demonstrated that both proinsulin-raising (for TCF7L2, SLC30A8, and VPS13C/C2CD4A/B) and proinsulin-lowering alleles (for ARAP1) influenced T2D risk through a decrease in insulin secretion caused by distal or proximal impairment of proinsulin conversion, respectively (Strawbridge et al. 2011). Similarly, Dimas and colleagues described associations at the HHEX/IDE and MTNR1B loci with defects in early insulin secretion through reduced insulinogenic index for the T2D risk allele and showed that the T2D risk allele at ARAP1 was related to defects in the first steps of insulin production, through association with 32,33 split proinsulin (Dimas et al. 2014).

These large-scale discovery efforts in nondiabetic individuals have provided genetic markers that may provide mechanistic insights into the pathogenesis of T2D and possibly to classify disease mechanisms that are active in individual patients. For example, physiological characterization of the effects of glycemic and T2D loci on quantitative glycemic traits has revealed a clear separation of hyperglycemic loci (MTNR1B and GCK) which are associated with reduced basal and stimulated β-cell secretion and consequent fasting hyperglycemia without large effects on T2D risk from β-cell loci that show an effect on insulin processing and secretion that only modestly change FG but exert much stronger effects on T2D risk (TCF7L2, SLC30A8, HHEX/IDE, CDKAL1, CDKN2A/2B, THADA, DGKB, PROX1, ADCY5) (Dimas et al. 2014).

Loci with effects on insulin sensitivity represent a much smaller proportion of T2D variants. Physiological characterization of T2D loci grouped variants with primary effects on insulin sensitivity in basal and stimulated state (IRS1, GCKR, PPARG, KLF14); in addition, weak effects on insulin sensitivity have also been suggested for HMGA2 (Voight et al. 2010). Insulin sensitivity indices showed consistently decreased effects for T2D risk alleles only for loci with known effects on insulin resistance at basal measures (HOMA-IR) (Dimas et al. 2014). In many cases, these loci may exert widespread biochemical changes affecting cardiometabolic risk (Fig. 2.2): some FI-associated loci can alter BMI and body fat distribution, while most loci associated with higher insulin levels are also associated with lower HDL cholesterol and higher triglyceride levels (Manning et al. 2012). For example, variants within the fat mass and obesity-associated (FTO) gene and at melanocortin-4 receptor (MC4R) exert their T2D effect through a primary impact on BMI (Frayling et al. 2007; Loos et al. 2008; Morris et al. 2012). In contrast, effects of IRS1 and PPARG on insulin resistance and T2D are independent from obesity (Scott et al. 2012; Kilpelainen et al. 2011; Rung et al. 2009). For a number of loci, the association with lipids and T2D (HNF4A, CILP2, KLF14, HNF1A/TCF1, MC4R) and additionally with FI (GRB14, GCKR, FTO, PEPD, ANKRD55, IRS1, ARL15) has been reported independently for each phenotype, underlying the close relationship between increased lipids/adiposity and increased insulin (Fig. 2.2) (Scott et al. 2012). This picture is consistent with the first stages of diabetes, where high adiposity in peripheral tissues causes insulin resistance, which is complemented by an increase in β-cell insulin production.

Several T2D loci appear to have an effect on complex diseases whose pathogenesis is not commonly associated with changes in metabolic fitness : pleiotropy could be a probable mechanism for these effects, since the correlation between the associated disease outcomes is low for them to be considered as comorbidities. Thus, variants at ~20 T2D loci, including CDKN2A/2B, JAZF1, HNF1B, THADA, CCND2, ZMIZ1, and IGF2, have a role in cancer susceptibility (Gudmundsson et al. 2007; Thomas et al. 2008; Finkel et al. 2007). Interestingly, T2D risk alleles at THADA, TSPAN8, and HNF1B are protective against prostate cancer, an inverse relationship that supports epidemiological observations. The genetic links between diabetes and cancer point to a set of shared biological pathways, including opposing roles in regulation of cell cycle and common signaling pathways.

6 What Is Next in T2D GWA Studies?

Despite the success of GWA studies in identification of common variant associations, the largest heritable component of T2D susceptibility remains unexplained. Rapid development and reduced costs of exome sequencing approaches has opened wide opportunities in both sequencing of large numbers of individuals and generation of large reference panels for imputation of rare variants from resequencing [e.g., those from the 1000 Genomes Project (Genomes Project et al. 2010)]. Population-based studies have also benefited from sequencing through implementation of population-specific next-generation sequencing-based reference panels, including deCODE Icelandic and Genome of the Netherlands (GoNL) reference panels (Boomsma et al. 2014; Steinthorsdottir et al. 2014). To date, these sequencing studies have not succeeded in identifying a large number of novel risk loci. For example, whole-exome sequencing at 8× depth in a Danish sample of 1000 T2D cases and 1000 controls hasn’t produced evidence of association with T2D at rare exomic variants, but has confirmed associations with T2D at common variants in COBLL1 and MACF1 (Tables 2.1 and 2.2) (Albrechtsen et al. 2013). While the sample size used in the study was small and the variant calling accuracy was not optimal for detecting small indels or changes in copy number, the results are consistent with previous regional resequencing studies which suggest that most causative variants linked to the GWA risk loci will not alter protein coding sequences. A recent whole-exome sequencing study in 3756 Latinos with an average depth 67.17× has identified a rare missense variant in HNF1A (c.1522G>A [p.E508K], odds ratio [OR] = 5.48) (Consortium et al. 2014a). As a result, there is considerable interest in pursuing whole-genome and whole-exome sequencing studies, particularly in cohorts that have sufficient statistical power to detect epistatic interactions that may confer additional T2D risk. Several international T2D collaborations have recently focused their efforts on large-scale sequencing projects, including the GoT2D (genomics of T2D) consortium that has undertook whole-genome (low-pass 4–6×) and deep whole-exome sequencing for ~2800 T2D case and control individuals from Northern Europe and the T2D Genetic Exploration by Next-generation sequencing in multi-Ethnic Samples (T2D-GENES ) consortium that has undertook trans-ethnic deep whole-exome sequencing in ~10,000 individuals distributed equally between five ethnic groups (McCarthy 2010).

GWA studies have provided an excellent springboard for large-scale T2D studies through international collaborative efforts focused on Europeans and being widely extended to other ethnic groups. Improved sequencing technologies and variant calling algorithms will extend the variant set to other types of genetic variability, including copy number variation, which may have significant impact on the dissection of T2D susceptibility. These collaborations will enable well-powered fine-mapping studies and identification and functional characterization of disease-causing variants. Overall, identifying causative genetic variants and discovering the molecular mechanisms linking them to the development of prediabetic changes will be essential in understanding the pathophysiology of T2D. This in turn may lead to rational drug development and suggest therapies that can be applied appropriately and early to those most at risk of developing T2D (Tuomilehto and Lindstrom 2003). This outcome is potentially feasible as genes that have already been associated with diabetes have also acted as targets for its treatment : while this is best demonstrated by the use of sulfonylureas to treat neonatal diabetes associated with inactivating mutations of the Sur1 protein (Gloyn et al. 2004), the same family of drugs have also provided a mainstay for treating adults with polygenic T2D for many years.