Introduction

Despite the recent advances resulting from genome-wide association studies (GWAS), most of the genetic factors contributing to type 2 diabetes remain undetermined [1]. IRS-1 is an important member of a protein family phosphorylated by the insulin receptor upon its binding with insulin [2]. Tissue-specific knockout mice have shown that IRS-1 is necessary for in vivo insulin action and secretion [2]. A relatively infrequent glycine to arginine substitution at position 972 of IRS1 (G972R or rs1801278, minor allele frequency [MAF] ranging from 0.02 to 0.10 in the four different population samples available from HapMap) has been extensively investigated as a determinant of type 2 diabetes susceptibility. In vitro studies have shown that the R972 allele results in a loss of IRS-1 function, which impairs insulin signalling in several target tissues, including skeletal muscle, fat and pancreatic beta cells [2–4]. In vivo studies have reported an association between the IRS1 R972 variant and both insulin resistance [2, 5] and reduced insulin secretion [2, 6]. The deleterious role of the R972 variant on in vivo insulin action and glucose homeostasis has been recently confirmed by studies in transgenic mice [7]. In spite of such strong evidence for a functional role, the data concerning the association of this variant with type 2 diabetes have been, thus far, conflicting. An initial meta-analysis of 27 studies indicated that R972 carriers had a 25% increase in type 2 diabetes risk [8], but subsequent large case–control studies have failed to replicate this association (in Table 1 of the Electronic supplementary material [ESM] see Zeggini et al. [9], Florez et al. [10] and van Dam et al. [11]). Unfortunately, neither the G972R variant nor good proxies in linkage disequilibrium with it (i.e. r² > 0.5) were included in the publicly available GWAS meta-analysis DIAGRAM [12].

To obtain further insight into the role of R972 in type 2 diabetes, we performed an updated meta-analysis of all case–control studies available to date (ESM Table 1). BMI and age at diabetes onset were analysed as covariates in meta-regression.

Methods

Study design

All case–control studies reported in previous meta-analyses [8] and all papers found in the PubMed database as of January 2009 by using ‘insulin receptor substrate-1’, ‘IRS-1’, ‘Gly972Arg’, ‘G972R’, ‘diabetes’, ‘variant’, ‘polymorphism’ and ‘genotype’ as keywords, were analysed. In addition, we included five unpublished case–control studies in which all study participants were self-reported whites: four sets from the Genetics of Type 2 Diabetes in Italy and the United States (GENIUS T2D) Consortium [13] (N. Abate, A. Doria, G. Sesti and V. Trischitta) and one set recruited in Chieti, Italy (ESM Table 1; Cama A. sample) (S. Mammarella and A. Cama). Three of the published studies were excluded because they were subsets of these unpublished sets: in ESM Table 1 see Sigal et al. [14] of the GENIUS Boston sample, Mammarella et al. [15] and Esposito et al. [16] of the Cama A. sample.

Study individuals in unpublished samples

Controls in all unpublished samples were non-diabetic individuals with fasting plasma glucose <6.9 mmol/l and absence of drug treatment known to affect glucose metabolism. Cases were patients with type 2 diabetes defined according to the 2003 American Diabetes Association criteria.

DNA extraction and genotyping

DNA from the unpublished sets was extracted from whole blood by standard methods. Genotyping details are described in the methods section of the ESM.

Statistical methods

Cases and controls of all studies were tested for Hardy–Weinberg equilibrium (HWE) by means of an exact χ² test. Between-study heterogeneity and the possible presence of publication bias were assessed by Cochran’s Q test and Macaskill’s inverse pooled variance weighting method [17], respectively. Random-effects meta-analysis and meta-regression were used to estimate OR and to explore heterogeneity [18]. Where appropriate, permutation-resampling p values were calculated to address the risk of spurious significant results [19]. All the analyses were performed using SAS Statistical Package Release 9.1 (SAS Institute, Cary, NC, USA).
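As a rough illustration of the pooling step described above, the following Python sketch computes Cochran’s Q and a random-effects summary OR from study-level log ORs and standard errors. It assumes the DerSimonian–Laird estimator of between-study variance, which is one common choice; the text does not state which estimator was used in SAS. The function name `random_effects_pool` and the example inputs are hypothetical and are not taken from ESM Table 1.

```python
import math

def random_effects_pool(log_ors, ses, z=1.959964):
    """Pool study-level log ORs; the DerSimonian-Laird tau^2 is an assumption here."""
    w = [1.0 / s ** 2 for s in ses]                      # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    df = len(log_ors) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0      # between-study variance
    w_re = [1.0 / (s ** 2 + tau2) for s in ses]          # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_ors)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - z * se), math.exp(pooled + z * se)),
        "q": q,
        "df": df,
        "tau2": tau2,
    }

# Hypothetical inputs for illustration only: three studies' log ORs and SEs.
example = random_effects_pool([0.40, 0.10, -0.15], [0.20, 0.15, 0.25])
print(example)
```

The heterogeneity p value would then be obtained by referring Q to a χ² distribution with k − 1 degrees of freedom, where k is the number of studies; that step is omitted here to keep the sketch free of external dependencies.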

Results

Of the 35 available studies, only those 32 that did not show significant deviations (exact p < 0.05) from HWE in cases or controls were considered in the meta-analysis (ESM Table 1). Given the small number of RR individuals (i.e. homozygous for the R972 variant) in 12 studies and their absence in the other 20 (a finding that could seriously bias the results of both additive and recessive models), we investigated only the dominant model, by comparing GR+RR (these latter when available) with GG individuals. Figure 1 shows the individual results from the 32 case–control studies, along with those of the meta-analysis, which included 12,076 cases and 11,285 controls. As for any meta-analysis performed on published genetic data, we cannot exclude the possibility that some sample overlap has occurred; however, by carefully reading the description of samples analysed in each study, this seems to be an unlikely event. No evidence of publication bias was observed (p = 0.27). The ORs for association between R972 and type 2 diabetes ranged from 0.55 to 4.75. In the meta-analysis, the R972 variant did not show a significant association with type 2 diabetes (OR 1.09, 95% CI 0.96–1.23, p = 0.184). Some evidence of heterogeneity was observed across studies (Cochran’s Q test p = 0.1). In a meta-regression analysis, neither the mean BMI of cases nor that of controls (available in 23 studies corresponding to 20,114 individuals) significantly explained such heterogeneity (p = 0.58 and p = 0.84, respectively). Similar data were obtained when analyses were carried out after stratifying for BMI status (i.e. <30 kg/m² or ≥30 kg/m²) (p = 0.77). Also, no effect of ethnicity (i.e. either white [19,075 individuals from 20 studies], Asian [2,699 individuals from eight studies] or other [1,587 individuals from four studies]) was observed (p = 0.91).
Also, when only studies whose sample size was >500 individuals were analysed, a similar OR to that obtained in the whole meta-analysis was observed (OR 1.08, 95% CI 0.93–1.24). By contrast, the mean age at type 2 diabetes diagnosis (available in 14 studies corresponding to 9,713 individuals) was significantly correlated with the magnitude of the genetic effect, explaining 52% of the heterogeneity (p = 0.03) (Fig. 2a). When these studies were subdivided into tertiles of mean age at diagnosis, the summary OR of type 2 diabetes was 1.48 (95% CI 1.17–1.87) for studies in the youngest tertile (39–44.9 years), 1.22 (95% CI 0.97–1.53) for studies in the intermediate tertile (45–50.9 years), and 0.88 (95% CI 0.68–1.13) for studies in the oldest tertile (51–58 years) (Fig. 2b). The standard p value for the decreasing trend of ORs with increasing mean age at diagnosis was 0.0022 and the permutation p value was 0.014.
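The direction of the age effect can be illustrated with a crude inverse-variance weighted regression of log OR on age at diagnosis. The sketch below is not the meta-regression reported here, which used the 14 study-level mean ages; as a stand-in it uses only the three tertile summaries quoted above, places each at the midpoint of its age range (an assumption), and ignores between-study variance. The helper names `se_from_ci` and `weighted_slope` are hypothetical.

```python
import math

Z95 = 1.959964  # standard normal quantile for a 95% CI

def se_from_ci(lower, upper):
    """Standard error of a log OR recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)

def weighted_slope(x, y, se):
    """Inverse-variance weighted least-squares slope of y on x, with its SE."""
    w = [1.0 / s ** 2 for s in se]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    return sxy / sxx, math.sqrt(1.0 / sxx)

# Tertile summaries quoted in the text: (age range in years, OR, 95% CI).
tertiles = [
    ((39.0, 44.9), 1.48, (1.17, 1.87)),
    ((45.0, 50.9), 1.22, (0.97, 1.53)),
    ((51.0, 58.0), 0.88, (0.68, 1.13)),
]
ages = [(lo + hi) / 2 for (lo, hi), _, _ in tertiles]  # range midpoints: an assumption
log_ors = [math.log(odds_ratio) for _, odds_ratio, _ in tertiles]
ses = [se_from_ci(*ci) for _, _, ci in tertiles]

slope, slope_se = weighted_slope(ages, log_ors, ses)
print(f"change in log OR per year of age at diagnosis: {slope:.4f} (SE {slope_se:.4f})")
```

A negative slope corresponds to the decreasing trend of ORs with increasing age at diagnosis; with only three aggregated points, the resulting standard error should not be read as a substitute for the reported p values.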

Fig. 1

Meta-analysis of 32 case–control studies. The cumulative effect of 32 published (ordered by publication date) and unpublished studies on the association between IRS1 G972R polymorphism and type 2 diabetes was tested by a random-effects model. Borderline significant heterogeneity was observed across studies (Cochran’s Q test p = 0.1). ORs and 95% CIs for the dominant genetic model are shown. Sizes of OR symbols are proportional to the study sample size. 95% CIs have arrowheads when they exceed the figure limits. SGR, San Giovanni Rotondo, Italy

Fig. 2

Relationship between OR of type 2 diabetes and age at type 2 diabetes diagnosis in the 14 studies for which this information was available (n = 9,713 individuals). a Meta-regression of mean age at diagnosis of type 2 diabetes and log OR for type 2 diabetes of the R972 variant according to a dominant genetic model. Sizes of OR symbols are proportional to the study sample size. There was a significant correlation (p = 0.03) explaining 52% of between-study heterogeneity. b Summary ORs of type 2 diabetes according to tertiles of age at type 2 diabetes diagnosis. The ranges of age at type 2 diabetes diagnosis were 39–44.9 years (five studies, n = 3,234 individuals), 45–50.9 years (five studies, n = 4,228 individuals) and 51–58 years (four studies, n = 2,251 individuals) in tertiles 1, 2 and 3, respectively

Discussion

Our findings illustrate the difficulties of ascertaining contributions to type 2 diabetes susceptibility by ‘low-frequency–low-risk’ variants. Despite the fact that this study included more than 23,000 individuals, the power to identify a 9% increase in type 2 diabetes risk associated with a variant having 0.06 frequency was only 58% at nominal significance levels (α = 0.05) and virtually zero at genome-wide significance levels (α = 5 × 10⁻⁸). One can estimate that a total of ~40,000 and ~200,000 individuals would have been required to have 80% power at α = 0.05 and α = 5 × 10⁻⁸, respectively. Under these circumstances, improving the outcome definition and decreasing its heterogeneity may have critical effects on our ability to identify genetic effects.
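The internal consistency of these power figures can be checked with a simple normal approximation. The sketch below does not reproduce the original power calculation, whose method is not described here. It assumes a two-sided z test whose noncentrality grows with the square root of the total sample size at a fixed case:control ratio and allele frequency, and it takes the quoted ~40,000 individuals for 80% power at α = 0.05 as its reference point. All function and variable names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def ncp_for_power(power, alpha):
    """Noncentrality needed for a two-sided z test to reach the given power."""
    return STD_NORMAL.inv_cdf(1 - alpha / 2) + STD_NORMAL.inv_cdf(power)

def power_at(ncp, alpha):
    """Approximate power of a two-sided z test (the opposite tail is ignored)."""
    return STD_NORMAL.cdf(ncp - STD_NORMAL.inv_cdf(1 - alpha / 2))

N_STUDY = 12_076 + 11_285   # cases + controls in the meta-analysis
N_REF = 40_000              # quoted sample size for 80% power at alpha = 0.05
GW_ALPHA = 5e-8             # genome-wide significance level

# Assumption: noncentrality scales with sqrt(N) for a fixed design.
ncp_study = ncp_for_power(0.80, 0.05) * (N_STUDY / N_REF) ** 0.5

power_nominal = power_at(ncp_study, 0.05)
power_gw = power_at(ncp_study, GW_ALPHA)

# Sample size for 80% power at genome-wide significance, by the same scaling.
n_gw = N_REF * (ncp_for_power(0.80, GW_ALPHA) / ncp_for_power(0.80, 0.05)) ** 2

print(f"power at alpha = 0.05: {power_nominal:.2f}")
print(f"power at alpha = 5e-8: {power_gw:.4f}")
print(f"N for 80% power at alpha = 5e-8: {n_gw:,.0f}")
```

Under these assumptions the sketch gives a power of roughly 0.57 at α = 0.05, a negligible power at genome-wide significance, and a required sample size near 200,000, in line with the values quoted above. Small differences are expected because the reference value of ~40,000 is itself rounded.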

In our meta-analysis, studies in which the mean age at type 2 diabetes diagnosis was <45 years showed an OR for type 2 diabetes of 1.48, an effect size that a sample of ‘only’ ~8,500 individuals would have 80% power to detect with genome-wide significance. Similar data, indicating a stronger effect on early abnormality of glucose homeostasis, were recently reported for TCF7L2 [20] and for TRIB3 [10]. Unfortunately, no data on the combined effect of several single-nucleotide polymorphisms (SNPs) that are singly associated with early glucose abnormalities are so far available. Overall, focusing on forms of diabetes diagnosed relatively early in life, which are known to have a stronger genetic component [21, 22], may be a useful strategy to facilitate the identification of SNPs associated with type 2 diabetes that are otherwise difficult to find, either because of their moderate effect or because of their low allelic frequency, or because of both factors, as in the case of IRS1 G972R. The usefulness of this approach may also extend to truly rare variants (MAF < 0.01), such as those that are believed to underlie the linkage peaks that are not explained by the common variants identified through GWAS. Indeed, in the linkage screen of the Diabetes UK Warren 2 sib pair collection, all seven linkage signals that were identified were stronger in families with an average age at diagnosis <55 years than in the families diagnosed at an older age [23].

In conclusion, the study of early-onset forms is emerging as a critical tool to reach the ‘high-hanging’ fruits of type 2 diabetes genetics and mirrors the approach taken with other complex disorders such as coronary artery disease [24]. Thus, both adequately powered new studies specifically targeted to early-onset cases and further analyses of available GWAS data after stratification by age at onset are needed.