Introduction

Approximately 537 million people worldwide have diabetes mellitus, and projections indicate an increase to 643 million by 2030 and to 783 million by 2045 [1]. Type 2 diabetes (T2D) constitutes approximately 90–95% of diabetes cases globally and is characterized by insulin resistance in peripheral tissues and dysregulated insulin secretion from pancreatic beta cells. While the surge in T2D prevalence is predominantly attributed to lifestyle changes, a significant genetic component is acknowledged to contribute to susceptibility [2,3,4,5]. Genome-wide association studies (GWAS) are powerful, biologically agnostic methods for identifying genetic variations linked to disease predisposition. These studies involve screening the entire genomes of individuals with and without the target disease (i.e., cases and controls) and examining numerous single-nucleotide polymorphisms (SNPs). GWAS have played a pivotal role in rapidly identifying a substantial number of confirmed genetic susceptibility variants for complex diseases, including diabetes. This review summarizes recent advances in the genetics of T2D and microvascular complications of diabetes brought by GWAS.
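The case-control comparison at a single SNP can be illustrated with a minimal sketch. It assumes a simple allelic chi-square test on a 2 × 2 table of allele counts; published GWAS typically use regression models with covariates, so this is a simplification and not the method of any study cited here. The function name and the allele counts are hypothetical.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # conventional genome-wide significance threshold


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (odds_ratio, chi2, p_value) for a 2x2 table of allele counts."""
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Pearson chi-square for a 2x2 table (1 degree of freedom, no continuity correction)
    num = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    den = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
           * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    chi2 = num / den
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Hypothetical allele counts for one SNP: 2,000 cases and 2,000 controls
# (4,000 alleles each), with risk-allele frequencies of 0.35 and 0.30.
odds_ratio, chi2, p_value = allelic_association(1400, 2600, 1200, 2800)
is_significant = p_value < GENOME_WIDE_ALPHA
print(f"OR = {odds_ratio:.2f}, chi2 = {chi2:.1f}, p = {p_value:.1e}, "
      f"genome-wide significant: {is_significant}")
```

With these hypothetical counts, the odds ratio is about 1.26, yet the p-value does not reach the genome-wide threshold, which illustrates why sample size matters when the same test is repeated across very many SNPs.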

T2D GWAS in European populations

In 2007, a GWAS for T2D was conducted by a French study group, encompassing 661 cases and 614 controls, covering 392,935 SNP loci [6]. This groundbreaking study identified novel T2D loci, namely SLC30A8, HHEX, LOC387761, and EXT2 and validated the association of TCF7L2 previously identified through linkage analysis [6]. Shortly thereafter, CDKAL1 was identified by an Icelandic study group [7] and CDKAL1, IGF2BP2, and CDKN2A/B were identified by three European collaborating groups [8,9,10]. The first round of European GWAS confirmed eight T2D susceptibility loci—TCF7L2, SLC30A8, HHEX, CDKAL1, IGF2BP2, CDKN2A/B, PPARG, and KCNJ11—across various ethnic groups [6,7,8,9,10].

Following this initial wave of GWAS, meta-analyses were conducted by combining individual GWAS data to efficiently increase sample sizes, facilitated by the Diabetes Genetics Replication and Meta-Analysis (DIAGRAM) consortium [11,12,13,14] and the Meta-Analyses of Glucose and Insulin-related Traits Consortium (MAGIC) [15]. With the expansion of GWAS meta-analysis sample sizes from tens of thousands to hundreds of thousands, the number of identified T2D loci increased dramatically. In 2018, Mahajan et al. reported the results of a GWAS meta-analysis involving 898,130 individuals of European descent (9% T2D cases) [16], the largest T2D GWAS in a single ancestral population as of 2023. This extensive dataset led to the identification of 245 loci, including 135 newly implicated in T2D predisposition (p < 5 × 10⁻⁸) [16].
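How combining studies increases power can be sketched with an inverse-variance weighted fixed-effect meta-analysis of per-study log odds ratios. This is one common approach and is shown only as an illustration; it is not claimed to be the exact procedure used by DIAGRAM or MAGIC. The function name and the per-study estimates are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis.

    betas: per-study effect estimates (log odds ratios)
    ses:   per-study standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return pooled_beta, pooled_se, z, p


# Three hypothetical studies reporting the same log odds ratio and standard error.
single = fixed_effect_meta([0.10], [0.03])
pooled = fixed_effect_meta([0.10, 0.10, 0.10], [0.03, 0.03, 0.03])
print(f"single study: z = {single[2]:.2f}, p = {single[3]:.1e}")
print(f"three pooled: z = {pooled[2]:.2f}, p = {pooled[3]:.1e}")
```

In this hypothetical setting, no single study reaches p < 5 × 10⁻⁸, whereas the pooled estimate does, because the pooled standard error shrinks as the total weight grows.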

T2D GWAS in the Japanese and East Asian populations

Cumulative evidence suggests that East Asians are more susceptible to T2D than Europeans with an equivalent body mass index (BMI) or waist circumference, pointing to a greater predisposition to insulin resistance and diabetes among East Asians [17, 18]. As susceptibility loci have predominantly been identified in European GWAS, conducting GWAS in non-European populations, such as Asian populations, is crucial to unveil population-specific loci not captured in European studies.

In 2008, two independent Japanese GWAS concurrently identified KCNQ1 as a T2D susceptibility locus in Japanese individuals [19, 20]. This association was replicated in other populations, including East Asians and Europeans [19, 20]. Despite limited sample sizes at the initial stage of the genome-wide scan (187–194 T2D cases versus 752–1,558 controls), the Japanese GWAS successfully identified a novel T2D locus. The KCNQ1 locus was not captured in the first round of European studies, which underscores the importance of conducting GWAS in diverse ethnic groups. In 2010, a Japanese GWAS with an expanded sample size (4,470 T2D cases and 3,071 controls) identified two additional T2D susceptibility loci, UBE2E2 and C2CD4A-C2CD4B [21]. The associations of these loci with T2D were validated in East Asian replication studies [21] and in a large-scale European GWAS [13], emphasizing the utility of non-European GWAS for identifying both ethnicity-specific and common susceptibility loci.

In 2012, the Asian Genetic Epidemiology Network (AGEN) consortium bolstered East Asian population sample sizes and identified eight additional novel loci through a genome-wide scan with a substantial sample size (6,952 cases and 11,865 controls), followed by replication testing [22]. In the same year, a subsequent Japanese GWAS identified the ANK1 locus [23], utilizing the same discovery set (4,470 cases and 3,071 controls) as a prior Japanese GWAS [21]. Notably, this study incorporated an augmented number of variants (~2 million) through genotype imputation [23]. Subsequently, in 2014, another Japanese GWAS, with an enlarged sample size (5,976 cases and 20,829 controls) and an increased number of variants (~6.2 million), identified MIR129-LEP, GPSM1, and SLC16A11-SLC16A13 [24]. A Japanese GWAS meta-analysis with an increased sample size for the genome-wide scan (15,463 cases and 26,183 controls), followed by replication testing, identified seven additional T2D susceptibility loci (CCDC85A, FAM60A, DMRTA1, ASB3, ATP8B2, MIR4686, and INAFM2) [25]. In 2019, a GWAS meta-analysis consisting of 36,614 cases and 155,150 controls of Japanese ancestry was performed; it is the largest T2D GWAS in the Japanese population as of 2023 [26]. In this study, 88 loci were identified as T2D-associated (p < 5 × 10⁻⁸), and 28 of these were novel. In 2020, a GWAS meta-analysis of T2D in East Asian populations was conducted using 77,418 individuals with T2D and 356,122 controls [27]. This study, the largest meta-analysis of T2D in East Asian individuals, identified 183 T2D loci (p < 5 × 10⁻⁸), including 61 novel loci. The T2D loci identified in Japanese and East Asian patients with T2D are summarized in Table 1.

Table 1 Summary of GWAS for type 2 diabetes in the Japanese and East Asian populations (p < 5 × 10⁻⁸)

Transethnic comparison between East Asian and European populations

A transethnic comparison of large-scale GWAS data demonstrated both shared and distinct T2D susceptibility loci between European and East Asian populations [26, 27]. In the large-scale Japanese T2D GWAS, the majority (77%) of the Japanese lead variants were common [minor allele frequency (MAF) > 0.05] in both Japanese and European populations, and their effect sizes were strongly correlated (Pearson’s r = 0.83, p = 8.7 × 10⁻⁵¹) and directionally consistent (94%) between these two populations [16, 26]. A parallel finding was noted in a large-scale East Asian T2D GWAS [27], indicating that most T2D genetic susceptibilities are common across populations. Nevertheless, 8.4% of the T2D susceptibility variants identified in the East Asian GWAS exhibited significant heterogeneity of effect between East Asian and European populations [27], underscoring considerable distinctions in T2D susceptibility between the populations. In addition, variations in allele frequencies among populations can contribute to differences in genetic susceptibility. For instance, rs3765467 in GLP-1R (p.Arg131Gln), where the minor allele Gln is a protective allele for T2D, was identified as a T2D susceptibility locus in a Japanese GWAS [26]; this variant is prevalent in Japanese (MAF = 0.18) and East Asian (MAF = 0.23) populations but rare in Europeans (MAF = 0.001). Considering that GLP-1R encodes a receptor for glucagon-like peptide 1, a target of widely used therapeutic drugs for T2D, p.Arg131Gln serves not only as an indicator of T2D risk but might also be a marker for the clinical response to GLP-1R agonists in Japanese and East Asian patients.

Multi-ancestry T2D GWAS meta-analysis

Although GWAS were initially conducted in single-ancestry groups, multi-ancestry GWAS meta-analyses have been conducted by combining GWAS data from multiple ethnic groups [28], motivated by the consistency of common variant associations observed across different populations [29, 30]. Since 2020, two large-scale multi-ethnic T2D GWAS, including European, African, Hispanic, South Asian, and East Asian populations, have been reported, in which the sample size was further expanded to more than 1 million [31, 32]. These studies identified 568 and 277 significant associations (p < 5 × 10⁻⁸), including 318 and 11 novel loci, respectively [31, 32]. In addition, a multi-ancestry T2D GWAS of 2.5 million individuals, including 428,452 T2D cases, was recently performed [33]. Taken together, approximately 800 genetic loci have been identified as predisposing individuals to T2D through GWAS as of 2023 (Fig. 1). Novel T2D risk loci identified in recent large-scale GWAS have smaller effects [odds ratio (OR) < 1.05 per allele] on disease risk than those of the first round of T2D GWAS (OR = 1.2–1.4 per allele), indicating that increasing the sample size by including participants across a variety of ancestries effectively enhances the statistical power to detect association signals with smaller effects. Among all T2D risks, including genetic and environmental factors, the proportion explained by the association data of all SNPs in a large-scale GWAS meta-analysis [31] was estimated to be 19%. Given that the heritability of T2D is estimated to be 30–70% [5], approximately half of T2D heritability remains unexplained.
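The arithmetic behind the unexplained share of heritability can be checked with a short sketch. It assumes the 19% explained by SNPs and the heritability estimate are expressed on the same scale, so that the unexplained fraction is 1 − (explained / heritability); the function name and the choice of a 50% midpoint are illustrative assumptions.

```python
def unexplained_fraction(explained, heritability):
    """Fraction of heritability not accounted for by GWAS SNP associations."""
    return 1.0 - explained / heritability


SNP_EXPLAINED = 0.19  # proportion of T2D risk explained by all SNPs [31]

# Lower bound, midpoint, and upper bound of the 30-70% heritability range [5].
results = {h2: unexplained_fraction(SNP_EXPLAINED, h2) for h2 in (0.30, 0.50, 0.70)}
for h2, missing in results.items():
    print(f"heritability {h2:.0%}: {missing:.0%} unexplained")
```

Under this assumption, the unexplained fraction ranges from roughly one-third to roughly three-quarters depending on which heritability estimate is used, so "approximately half" corresponds to the middle of the reported range.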

Fig. 1

Timeline of discoveries in type 2 diabetes genetics by GWAS. X-axis indicates the cumulative number of genetic susceptibility loci identified by GWAS (p < 5 × 10⁻⁸). Y-axis indicates the year. Bars are colored according to the ethnic composition of the GWAS sample set: European (blue), Japanese (red), East Asian (green), Others (yellow), Multi-ethnic (gray). a–h indicate the key milestones of GWAS for type 2 diabetes (T2D). a In 2007, the first round of T2D GWAS in European populations identified eight novel susceptibility loci for T2D and confirmed three T2D loci previously identified by candidate gene analyses (Refs. [6,7,8,9,10, 74,75,76]). b In 2008, a GWAS meta-analysis combining three European T2D GWAS datasets identified six additional novel T2D susceptibility loci (Ref. [11]). c In 2008, T2D GWAS in Japanese populations identified KCNQ1 as a novel T2D susceptibility locus (Refs. [19, 20]). d In 2011, an East Asian GWAS and a South Asian GWAS identified eight and six novel T2D susceptibility loci, respectively (Refs. [43, 77]). e In 2018, a large-scale European T2D GWAS (sample size: n = 0.89 million) identified 135 novel T2D susceptibility loci (Ref. [16]). f In 2019, a large-scale Japanese T2D GWAS (sample size: n = 0.19 million) identified 28 novel T2D susceptibility loci (Ref. [26]). g In 2020, a large-scale East Asian T2D GWAS (sample size: n = 0.43 million) identified 61 novel T2D susceptibility loci (Ref. [27]). h Three large-scale multi-ethnic GWAS meta-analyses were conducted in 2020 (sample size: n = 1.4 million), 2022 (sample size: n = 1.3 million), and 2023 (sample size: n = 2.5 million), by which nearly 500 novel T2D susceptibility loci in total were identified (Refs. [31,32,33])

T2D susceptibility loci identified by GWAS are classified into several categories depending on their pathophysiological mechanisms. Suzuki et al. categorized 1,289 independent association signals located in 611 genome-wide significant loci (p < 5 × 10⁻⁸) into eight clusters (number of signals out of 1,289 in parentheses): a) beta-cell_increased proinsulin (91), b) beta-cell_decreased proinsulin (89), c) residual glycemic (389), d) body fat (273), e) metabolic syndrome (166), f) obesity (233), g) lipodystrophy (45), and h) liver and lipid metabolism (3) [33]. Among these eight clusters, a) and b) were associated with beta-cell dysfunction (180 signals in total, 14%), d)–g) were associated with insulin resistance (717 signals, 56%), and c) and h) were associated with both insulin resistance and reduced insulin secretion (392 signals, 30%) [33].
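The grouping of the eight clusters into three mechanistic categories can be verified with a short tally. The signal counts are taken from the paragraph above; the dictionary layout and variable names are only an illustrative choice.

```python
# Number of association signals per cluster, as reported for Suzuki et al. [33].
clusters = {
    "a) beta-cell_increased proinsulin": 91,
    "b) beta-cell_decreased proinsulin": 89,
    "c) residual glycemic": 389,
    "d) body fat": 273,
    "e) metabolic syndrome": 166,
    "f) obesity": 233,
    "g) lipodystrophy": 45,
    "h) liver and lipid metabolism": 3,
}

# Mechanistic categories, keyed by the cluster letters they contain.
categories = {
    "beta-cell dysfunction": "ab",
    "insulin resistance": "defg",
    "insulin resistance and reduced insulin secretion": "ch",
}

total = sum(clusters.values())
summary = {}
for name, letters in categories.items():
    # Each cluster label starts with its letter, e.g. "a) ...".
    count = sum(n for label, n in clusters.items() if label[0] in letters)
    summary[name] = (count, round(100 * count / total))
    print(f"{name}: {count} signals ({summary[name][1]}%)")
```

The tally reproduces the totals stated in the text: 180, 717, and 392 signals, which sum to 1,289.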

GWAS for microvascular complications of diabetes

Microvascular complications of diabetes, such as diabetic retinopathy (DR), diabetic kidney disease (DKD), and diabetic neuropathy, are major causes of morbidity and mortality in individuals with type 1 diabetes (T1D) and T2D. There is strong evidence of a genetic influence on the development of these complications [34,35,36,37,38,39,40,41,42], although chronic exposure to high glucose is the most relevant risk factor. In the past decade, GWAS for diabetes complications have been conducted and have identified several genome-wide significant genetic susceptibility loci (p < 5 × 10⁻⁸), as shown in Table 2.

Table 2 Summary of GWAS for microvascular complications of diabetes (p < 5 × 10⁻⁸)

1) GWAS for diabetic kidney disease

It has been shown that only ~30% of all patients with diabetes develop overt albuminuria [43]. Moreover, familial clustering of diabetic nephropathy has been observed in both T1D and T2D [36,37,38], implying that genetic factors are involved in the development and progression of diabetic kidney disease. We performed an initial GWAS for diabetic nephropathy in Japanese patients with T2D and identified four candidate genes: SLC12A3 [44, 45], ELMO1 [46], NCALD [47], and ACACB [48], but robust replication in independent studies has not been observed.

In 2012, a GWAS of European T1D patients identified two loci (AFF3, RGMA-MCTP2) associated with end-stage renal disease (ESRD) [49]. The same European study group additionally identified SP3-CDCA7, whose association with diabetic kidney disease (DKD) was observed in women (p < 5 × 10⁻⁸) but not in men (p = 0.77) [50]. From 2015 to 2018, SCAF8-CNKSR3 [51], GABRR1 [52], and FTO [53] were identified through two multi-ethnic GWAS and a Japanese GWAS, respectively. Interestingly, the FTO locus is a well-established obesity-related locus [54], and the risk allele for diabetic nephropathy is identical to that for obesity.

In 2019, a GWAS for ESRD with T2D in African Americans identified five loci, namely RND3-RBM43, SLITRK3, ENPP7, GNG7, and APOL1 [55], and a GWAS meta-analysis for up to 19,406 individuals of European descent with T1D identified 16 genome-wide significant risk loci, including a common missense variant in the collagen type IV alpha 3 chain (COL4A3) gene [56].

2) GWAS for diabetic retinopathy

Persistent hyperglycemia and other clinical factors, such as a long duration of diabetes, hypertension, and dyslipidemia, are responsible for the onset or progression of DR [57]. Nonetheless, familial aggregation of DR or advanced DR has been observed in patients with T1D and T2D, implying the involvement of genetic predisposition in DR development [39]. The heritability of DR is estimated to range from 25 to 52% [40, 41].

In 2011, a Taiwanese GWAS identified four DR-related loci (HS6ST3, ARHGAP22, PLXDC2, KIAA0825), despite a modest sample size at the discovery stage (N = 749) [58]. In 2018, an Australian GWAS, a Scottish GWAS, and a GWAS meta-analysis of eight European studies identified GRB2 [59], NOX4 [60], and NVL [61], respectively. In 2021, we conducted a GWAS for DR in Japanese T2D patients and identified STT3B and PALM2 [62]. A gene-based analysis using the Japanese GWAS discovery stage data also identified EHD3 [62]. Recently, a multi-ancestry GWAS of diabetic macular edema (DME) identified a missense variant in APOL1 (K150E) and an intergenic variant located between PLVAP and ANKLE1 as susceptibility loci for DME [63].

3) GWAS for diabetic neuropathy

Similar to diabetic nephropathy or retinopathy, diabetic neuropathy is a multifactorial condition associated with several risk factors, such as glycemic control, hypertension, smoking status, and BMI [64]. The heritability of painful neuropathy was estimated to be 11% [42], and familial clustering analysis revealed a 2.2-fold increased risk of developing diabetic neuropathy in the families of probands with diabetic neuropathy [39].

A GWAS conducted in European patients with T2D, comparing diabetic peripheral neuropathy cases defined by a Michigan Neuropathy Screening Instrument (MNSI) clinical examination score (N = 4,384) with controls (N = 784), identified a variant associated with peripheral neuropathy at chr2q24 (OR = 0.57, p = 1.9 × 10⁻⁹). This observation was supported by an independent replication study (p < 0.05, N = 949). The protective allele is associated with increased expression, in the tibial nerve, of an adjacent gene (SCN2A) that encodes the human voltage-gated sodium channel NaV1.2 [65].

Translation of T2D genetics into clinical practice: application to disease prediction

Previous investigations have indicated that lifestyle interventions can mitigate the genetic risk defined by carrying variants linked to T2D [66, 67]. This highlights the clinical utility of presymptomatic genetic testing for detecting high-risk individuals and facilitating precision healthcare interventions, such as lifestyle modifications or health checkups. In polygenic disorders involving numerous common variants with modest effect sizes, a single variant is not informative for assessing disease risk. Instead, multiple risk variants must be considered to predict an individual’s genetic risk. Currently, genetic risk is most often assessed using the polygenic risk score (PRS), which is typically calculated as the sum of the risk alleles in an individual’s genotype data across numerous variants, each weighted by its effect size taken from the summary statistics of a single large-scale GWAS.
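The weighted-sum calculation can be sketched as follows. This is a minimal illustration, assuming per-variant weights equal to the log odds ratios from GWAS summary statistics and genotypes coded as risk-allele counts (0, 1, or 2); the variant identifiers, odds ratios, and the handling of missing genotypes are hypothetical, and practical PRS methods additionally account for linkage disequilibrium and variant selection.

```python
import math


def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele counts.

    dosages: mapping variant ID -> number of risk alleles carried (0, 1, or 2)
    weights: mapping variant ID -> per-allele effect size (log odds ratio)
    Variants missing from `dosages` contribute 0 (a simplifying assumption).
    """
    return sum(beta * dosages.get(variant, 0) for variant, beta in weights.items())


# Hypothetical per-allele odds ratios converted to log odds ratio weights.
weights = {
    "variant_1": math.log(1.40),
    "variant_2": math.log(1.10),
    "variant_3": math.log(1.05),
}

# Hypothetical genotype for one individual.
individual = {"variant_1": 2, "variant_2": 1, "variant_3": 0}

score = polygenic_risk_score(individual, weights)
print(f"PRS = {score:.3f}")
```

Scores computed this way are usually interpreted relative to their distribution in a reference population, for example by asking whether an individual falls in the top few percent.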

In 2018, Khera et al. developed a PRS constructed with 7 million variants based on the results of a large-scale European T2D GWAS (N ~ 160,000) [14, 68]. Individuals within the top 3.5% of the PRS exhibited a threefold higher likelihood of developing T2D compared to the remaining 96.5% population in the UK Biobank [68].

The predictive accuracy of a PRS largely depends on the sample size and genetic background of the study population in the base GWAS data, for which summary statistics are utilized as “a training set.” A sample size of at least 100,000 was required for the base GWAS set to ensure the prediction accuracy of the T2D PRS [69]. Symmetrical comparisons between PRSs using European (UK Biobank) or Japanese (BBJ) GWAS as training sets have revealed that a PRS based on a European GWAS makes less accurate predictions of T2D risk in the Japanese population than a PRS based on a Japanese GWAS, and vice versa [70]. This observation suggests that the currently available PRSs based on large-scale European GWAS or multi-ethnic GWAS, which consist mainly of European participants, may be less useful for non-European populations. Most genetic studies to date, including T2D GWAS, have mainly been undertaken in European populations, and the sample size of the largest Japanese T2D GWAS is approximately one-fifth that of the largest European sample [16, 26]. Therefore, increasing the diversity of participants included and analyzed in genetic studies is required to improve the utility of the PRS for all ethnic groups. In addition, ongoing methodological developments in cross-population polygenic prediction that jointly model GWAS summary statistics from multiple populations may help considerably [71,72,73].

Conclusion

GWAS have produced significant breakthroughs in the field of common disease genetics, including diabetes and its complications. To date, approximately 800 T2D susceptibility loci have been identified. GWAS focusing on microvascular complications of diabetes have revealed several genetic determinants; however, the number of susceptibility loci identified is limited compared with the number associated with T2D. This limitation is likely attributable to the modest sample sizes of these studies. PRSs for T2D using extensive GWAS data can serve as a tool to screen populations and identify high-risk groups. To improve the utility of the PRS for all ethnic groups, it is necessary to increase the diversity of the participants included and analyzed in future genetic studies.