Introduction

Diabetes is increasingly prevalent globally, particularly in developing countries and regions. According to the International Diabetes Federation (IDF), 79% of people with diabetes live in low- and middle-income countries (https://www.diabetesatlas.org/en/resources/). Type 2 diabetes (T2D), the most common type of diabetes, is highly heterogeneous in clinical characteristics, progression and risk of complications [1]. Targeted prevention and treatment could prevent diabetes complications and reduce the risk of premature death. Given this heterogeneity, attempts have been made to define T2D subgroups through genetic [2, 3] or clinical features [4,5,6,7,8,9]. Although genetic data are stable over a patient’s lifetime and not influenced by disease progression, they are often not available. Among studies based on clinical features, most used five clinical variables (age, BMI, hemoglobin A1c, homeostasis model assessment of β-cell function, and homeostasis model assessment of insulin resistance) to stratify T2D patients into severe insulin-deficient diabetes (SIDD), severe insulin-resistant diabetes (SIRD), mild obesity-related diabetes (MOD), and mild age-related diabetes (MARD) [5, 10]. Instead of the k-means clustering method, Li and Chen used thresholds for HOMA2-IR, HOMA2-β and BMI to separate the T2D subgroups, which is more convenient in clinical applications [11].

However, the measures of islet function and insulin resistance required for these classification methods are expensive, hard to standardize, and not routinely available in most medical institutions of developing countries and regions, which makes such classification methods unfeasible for two-thirds of the world’s population with diabetes [12]. Therefore, affordable and readily available alternative variables are needed to assess insulin resistance and islet function. The triglyceride-glucose (TyG) index, measured in the fasting state, is affordable and easily available, and it has been described as a biochemical marker of insulin resistance [13,14,15]. In addition, previous studies found a significant relationship between β-cell mass and fasting plasma glucose (FPG) concentration [16, 17]. Gordon C. Weir and Susan Bonner-Weir proposed five stages of diabetes progression defined by FPG levels, each characterized by different changes in the mass, phenotype and function of β cells. At stages 4 and 5, patients’ islet function declines severely [18].

Against this background, we proposed to generate subgroups based on established thresholds for low-cost, widely available diabetes characteristics, and to test whether participants in these subgroups have different risks for several outcomes, including cardiovascular health (CVH), chronic kidney disease (CKD), nonalcoholic fatty liver disease (NAFLD), advanced liver fibrosis, retinopathy, and all-cause, cardiovascular disease (CVD)-related and cancer-related mortality. The results of this study may support targeted therapeutic interventions for people in developing countries and regions.

Methods

Subjects

We used data from the 1988–1994 and 1999–2014 cycles of the National Health and Nutrition Examination Survey (NHANES). Diabetes was defined as a self-reported diagnosis, use of insulin or oral hypoglycemic medication, FPG ≥ 7.0 mmol/L or glycated haemoglobin (HbA1c) level ≥ 6.5%, according to the American Diabetes Association (ADA) diagnostic criteria [19]. To ensure that the sample consisted of patients with type 2 diabetes, we excluded patients who were diagnosed with diabetes before the age of 30 years [20]. We also excluded patients who self-reported being pregnant or having cancer at baseline. Patients were further excluded according to the following criteria: (1) missing triglycerides (TG), FPG or BMI values at baseline; (2) extreme TG or FPG values (> 3SD) [21]; (3) BMI outliers (< 15 kg/m2 or > 60 kg/m2) [22]. The details are shown in Fig. 1, and the final sample size was 4,060.
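The selection criteria above can be sketched as a simple filter. This is an illustrative sketch, not the authors' code: the `Participant` fields and function names are hypothetical, "> 3SD" is assumed to mean more than three standard deviations from the sample mean in either direction, a missing value in any of TG, FPG or BMI is assumed to trigger exclusion, and participants without a recorded age at diagnosis are assumed to be retained.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    # All field names are hypothetical, chosen for illustration only.
    self_reported_diabetes: bool
    uses_glucose_lowering_drugs: bool  # insulin or oral hypoglycemic medication
    fpg_mmol_l: Optional[float]
    hba1c_percent: Optional[float]
    age_at_diagnosis: Optional[float]  # None if not previously diagnosed
    pregnant: bool
    cancer: bool
    tg_mmol_l: Optional[float]
    bmi: Optional[float]


def has_diabetes(p: Participant) -> bool:
    """Diabetes definition from the text: any one criterion is sufficient."""
    return (
        p.self_reported_diabetes
        or p.uses_glucose_lowering_drugs
        or (p.fpg_mmol_l is not None and p.fpg_mmol_l >= 7.0)
        or (p.hba1c_percent is not None and p.hba1c_percent >= 6.5)
    )


def is_extreme(value: float, mean: float, sd: float) -> bool:
    """Assumption: 'extreme' means more than 3 SD away from the mean."""
    return abs(value - mean) > 3 * sd


def is_eligible(p: Participant, tg_mean: float, tg_sd: float,
                fpg_mean: float, fpg_sd: float) -> bool:
    """Apply the inclusion and exclusion rules in the order listed in the text."""
    if not has_diabetes(p):
        return False
    if p.age_at_diagnosis is not None and p.age_at_diagnosis < 30:
        return False
    if p.pregnant or p.cancer:
        return False
    if p.tg_mmol_l is None or p.fpg_mmol_l is None or p.bmi is None:
        return False
    if is_extreme(p.tg_mmol_l, tg_mean, tg_sd):
        return False
    if is_extreme(p.fpg_mmol_l, fpg_mean, fpg_sd):
        return False
    if p.bmi < 15 or p.bmi > 60:
        return False
    return True
```

The means and standard deviations are passed in as arguments because the text does not state the population from which they were computed.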

Fig. 1 Algorithm for Type 2 diabetes selection in the NHANES (1988–2014)

Definitions of different subgroups of T2D

According to Gordon C. Weir and Susan Bonner-Weir’s five stages of evolving β-cell dysfunction in the progression of diabetes, when a patient’s fasting plasma glucose level exceeds 16 mmol/L, the capacity for insulin secretion is considerably lower than that seen with a 50% reduction in β-cell mass, which results in less efficient insulin secretion [18]. Hence, we defined severe insulin deficiency as fasting glucose greater than 16 mmol/L. In previous studies, the 75th percentile of the HOMA-IR level in the population was often used as the threshold of IR [23,24,25]. As the TyG index is a marker of insulin resistance, a higher TyG value indicates a higher level of insulin resistance. Thus, we defined severe insulin resistance as a TyG index above the 75th percentile. The TyG index was calculated as TyG index = Ln [fasting TG (mg/dL) × fasting glucose (mg/dL)/2]. According to the World Health Organization (WHO), obesity is defined as BMI ≥ 30 kg/m2. The grouping rules are shown in Fig. 1. Acknowledging the differences between the approach of this study and those of Ahlqvist et al. [5] and Peng-Fei Li et al. [11], we chose letter-based grouping labels, which correspond to the replicated Ahlqvist et al. labels as follows: subgroup A, obesity-related diabetes; subgroup B, age-related diabetes; subgroup C, insulin resistant diabetes; and subgroup D, severe insulin deficient diabetes.
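The thresholds above can be sketched as follows. The TyG formula and the cut-offs (FPG > 16 mmol/L, TyG above the 75th percentile, BMI ≥ 30 kg/m2) come from the text; everything else is an assumption made for illustration. In particular, the order in which the rules are applied (D, then C, then A, with B as the remainder) is a guess, since the actual rules are given only in Fig. 1, and the function names and the unit conversion helper are hypothetical.

```python
import math

# 1 mmol/L of glucose is about 18.016 mg/dL (molar mass of glucose 180.16 g/mol).
GLUCOSE_MGDL_PER_MMOL = 18.016


def tyg_index(tg_mg_dl: float, fpg_mg_dl: float) -> float:
    """TyG index = ln[fasting TG (mg/dL) x fasting glucose (mg/dL) / 2]."""
    return math.log(tg_mg_dl * fpg_mg_dl / 2)


def assign_subgroup(fpg_mmol_l: float, tyg: float, bmi: float,
                    tyg_p75: float) -> str:
    """Assign a letter label; the rule order is an assumption, see Fig. 1."""
    if fpg_mmol_l > 16:
        return "D"  # severe insulin deficient diabetes
    if tyg > tyg_p75:
        return "C"  # insulin resistant diabetes
    if bmi >= 30:
        return "A"  # obesity-related diabetes
    return "B"      # age-related diabetes (none of the above)


def classify(tg_mg_dl: float, fpg_mg_dl: float, bmi: float,
             tyg_p75: float) -> str:
    """Convenience wrapper taking both laboratory values in mg/dL."""
    fpg_mmol_l = fpg_mg_dl / GLUCOSE_MGDL_PER_MMOL
    return assign_subgroup(fpg_mmol_l, tyg_index(tg_mg_dl, fpg_mg_dl),
                           bmi, tyg_p75)
```

The 75th percentile `tyg_p75` is passed in rather than computed, because the text does not say whether it was derived with or without survey weights.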

Outcome assessment

According to the American Heart Association's Life's Simple 7 (LS7) [26], CVH was evaluated based on smoking, weight, physical activity, diet, blood cholesterol, blood glucose, and blood pressure. The definitions of each metric are shown in Additional file 1: Table S1. Each LS7 component was given a score of 0, 1, or 2 to reflect poor, intermediate, and ideal health, respectively. A total LS7 score between 0 and 14 was calculated as the sum of the LS7 component scores, and poor CVH was defined as a score between 0 and 4 [27]. CKD was defined as kidney damage and an estimated glomerular filtration rate (eGFR) of less than 60 mL/min per 1.73 m2, and eGFR was calculated using the CKD-EPI study equation with serum creatinine [28]. Fundus photograph data were only available for 1988–1994 and 2005–2008, giving a final sample of 2276 for this outcome. Retinopathy was defined as the presence of the following findings on fundus photographs: non-proliferative retinopathy, intraretinal microvascular abnormalities without microaneurysms, retinal microaneurysms, hemorrhages, soft exudates, hard exudates, and proliferative retinopathy [29]. The criteria for nonalcoholic fatty liver disease (NAFLD) were a United States Fatty Liver Index (USFLI) score of ≥ 30, no excessive alcohol consumption (on average ≤ 1 alcoholic drink per day for women and ≤ 2 alcoholic drinks per day for men), negative hepatitis C antibody, and negative hepatitis B surface antigen [30]. The formula for the USFLI score is shown in Additional file 1: Table S2 [31]. Because data on drinking were lacking for 1988–1994, only 1999–2014 was considered for this outcome, giving a final sample of 2343. Advanced liver fibrosis was determined using two noninvasive markers of liver fibrosis, the fibrosis-4 (FIB-4) score and the NAFLD fibrosis score (NFS), and was defined as FIB-4 ≥ 2.67 or NFS > 0.676 [32]. The FIB-4 [33] and NFS [34] indexes were calculated as shown in Additional file 1: Table S2.
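Two of the outcome definitions above lend themselves to a short sketch: the LS7 total with its poor-CVH cut-off, and the advanced-fibrosis rule. The cut-offs are taken from the text. The FIB-4 expression below is the commonly published formula and is assumed, not confirmed, to match the one in Additional file 1: Table S2; function names are hypothetical.

```python
import math
from typing import Sequence


def ls7_total(component_scores: Sequence[int]) -> int:
    """Sum seven LS7 component scores, each 0 (poor), 1 (intermediate) or 2 (ideal)."""
    if len(component_scores) != 7:
        raise ValueError("LS7 requires exactly seven component scores")
    if any(score not in (0, 1, 2) for score in component_scores):
        raise ValueError("each LS7 component score must be 0, 1 or 2")
    return sum(component_scores)


def has_poor_cvh(component_scores: Sequence[int]) -> bool:
    """Poor CVH is a total LS7 score between 0 and 4."""
    return ls7_total(component_scores) <= 4


def fib4(age_years: float, ast_u_l: float, alt_u_l: float,
         platelets_10e9_l: float) -> float:
    """Commonly published FIB-4 formula (assumed to match Table S2)."""
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def has_advanced_fibrosis(fib4_score: float, nfs_score: float) -> bool:
    """Advanced liver fibrosis: FIB-4 >= 2.67 or NFS > 0.676."""
    return fib4_score >= 2.67 or nfs_score > 0.676
```

Either marker alone is enough to flag advanced fibrosis, which follows the "or" in the stated cut-off rule.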

Mortality data for the NHANES (1988–2014) participants were provided by the National Center for Health Statistics (NCHS), using probabilistic record matching with death certificate data in the National Death Index (NCHS Linked Mortality File) through December 31, 2015. Mortality outcomes of interest included all-cause, CVD-related and cancer-related death. The follow-up period was calculated as the time between the NHANES examination date and the date of death or the last date on which the participant was known to be alive [35].

Covariates assessment

Information on age, sex, race/ethnicity, education level, family income, smoking status, physical activity, disease status, and medication use was collected from NHANES household interviews. Measurements of body mass index (BMI), systolic blood pressure (SBP) and diastolic blood pressure (DBP) were obtained in the NHANES mobile examination center (MEC). Clinical indicators including fasting glucose, HbA1c, triglycerides (TG), total cholesterol (TC), low-density lipoprotein cholesterol (LDL-C) and high-density lipoprotein cholesterol (HDL-C) were measured in the NHANES laboratory. According to the 2019 ESC/EAS Guidelines, hypertriglyceridemia was defined as triglycerides ≥ 1.7 mmol/L (150 mg/dL) [36]. Education level was categorized as < 9th grade, 9–11th grade, 12th grade and > 12th grade. Family income-to-poverty ratio was classified as 0–1.0, 1.0–3.0, or > 3.0. Smoking status was classified as never smoker, former smoker, or current smoker. Ideal physical activity was defined as ≥ 150 min of moderate-intensity activities per week, ≥ 75 min of vigorous-intensity activities per week, or an equivalent combination of both [37].
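A few of these covariate definitions can be written as small helper functions. This is a sketch under stated assumptions: the "equivalent combination" is taken to mean that one vigorous minute counts as two moderate minutes, the shared boundaries of the income categories (1.0 and 3.0) are assigned to the lower category, and all function names are hypothetical.

```python
def is_ideal_physical_activity(moderate_min_per_week: float,
                               vigorous_min_per_week: float) -> bool:
    """Ideal activity: >= 150 moderate min, >= 75 vigorous min, or an equivalent mix.

    Assumption: one vigorous minute is counted as two moderate minutes,
    which covers all three conditions with a single comparison.
    """
    return moderate_min_per_week + 2 * vigorous_min_per_week >= 150


def income_to_poverty_category(ratio: float) -> str:
    """Classify the family income-to-poverty ratio into the three stated bands.

    Assumption: the boundary values 1.0 and 3.0 fall in the lower band.
    """
    if ratio <= 1.0:
        return "0-1.0"
    if ratio <= 3.0:
        return "1.0-3.0"
    return ">3.0"


def has_hypertriglyceridemia(tg_mmol_l: float) -> bool:
    """Hypertriglyceridemia: triglycerides >= 1.7 mmol/L (150 mg/dL)."""
    return tg_mmol_l >= 1.7
```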

Statistical analysis

Analyses were conducted according to the guidelines recommended by the NHANES, and we computed new sample weights according to NCHS guidelines for combining data from multiple cycles. To compare the subgroups at baseline, weighted chi-square tests and linear regression models were used for categorical and continuous variables, respectively. Multiple imputation was used for covariates with missing values. We used logistic regression to estimate odds ratios (ORs) for poor CVH, CKD, retinopathy, NAFLD and advanced liver fibrosis across the subgroups. Survival in the different subgroups was plotted using Kaplan–Meier curves, and multivariable Cox proportional hazards models were used to obtain the hazard ratios (HRs) for all-cause, CVD-related and cancer-related mortality. In model 1, we adjusted for age, gender and race/ethnicity. In model 2, we further adjusted for education level, family income-to-poverty ratio, smoking status and ideal physical activity. In model 3, we further adjusted for duration of diabetes, diabetes medication use, self-reported hypertension, hypercholesterolemia and CVD, hypertriglyceridemia, self-reported use of hypertension and hypercholesterolemia medication, systolic blood pressure, diastolic blood pressure, total cholesterol, high-density lipoprotein cholesterol, and low-density lipoprotein cholesterol.
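To make the survival estimate behind the Kaplan–Meier curves concrete, the product-limit estimator can be sketched in a few lines. This is a simplified, unweighted illustration only: the study's analyses were run in R with survey weights, and the function name and input layout here are hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Unweighted Kaplan-Meier estimate.

    times  : follow-up time for each participant
    events : 1 if the participant died at that time, 0 if censored
    Returns (time, survival probability) at each distinct death time.
    """
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        current_time = pairs[i][0]
        deaths = 0
        leaving = 0
        # Group all participants sharing the same follow-up time.
        while i < len(pairs) and pairs[i][0] == current_time:
            deaths += pairs[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving this time point.
            survival *= 1 - deaths / at_risk
            curve.append((current_time, survival))
        # Both deaths and censored participants leave the risk set.
        at_risk -= leaving
    return curve
```

Censored participants contribute to the risk set until their last follow-up time but never produce a step in the curve, which is why only death times appear in the output.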

Stratified analyses were also conducted by gender (male or female), race/ethnicity (White or non-White), and stage of diabetes (newly diagnosed or already diagnosed). The P values for the product terms between subgroups and stratification variables were used to estimate the significance of interactions. As a sensitivity analysis, we repeated the analyses after excluding participants who died within 2 years of follow-up, to reduce potential reverse causation bias. All analyses were performed in R software (4.1.0).
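The sensitivity-analysis exclusion can be expressed as a one-line filter. This is a minimal sketch with hypothetical record keys (`died`, `followup_years`); it assumes "within 2 years" means a follow-up time strictly shorter than two years, and that censored participants with short follow-up are kept.

```python
from typing import Dict, List


def exclude_early_deaths(records: List[Dict],
                         min_years: float = 2.0) -> List[Dict]:
    """Drop participants who died before `min_years` of follow-up.

    Participants who were censored early (still alive at last contact)
    are kept; only early deaths are removed.
    """
    return [
        r for r in records
        if not (r["died"] and r["followup_years"] < min_years)
    ]
```

For example, a participant who died after 1.5 years would be removed, while one who was censored alive at 1.5 years would remain in the analysis set.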

Results

Characteristics of the subgroups

The study included 4060 adults from the NHANES database. Table 1 shows the demographic data of the four subgroups: subgroup A (n = 1497, 36.87%), subgroup B (n = 1544, 38.03%), subgroup C (n = 811, 19.98%), and subgroup D (n = 208, 5.12%). As shown in Table 1, subgroup A had the highest BMI and the most hypertension and hypertension medication use. Subgroup B was the oldest and had the highest level of HDL. Subgroup C had the highest TG, and relatively high TyG and fasting glucose levels. Subgroup D had the highest fasting glucose, TyG and TC levels, and the most use of diabetes pills and insulin. Furthermore, subgroup A had the highest prevalence of advanced liver fibrosis, subgroup B had the lowest prevalence of poor CVH, and subgroup D had the highest prevalence of retinopathy. Compared with the other subgroups, subgroup C had the highest all-cause and CVD-related mortality.

Table 1 Baseline Characteristics of subjects among different T2D subgroups

ORs of diabetes complications among different subgroups

Table 2 lists the ORs of poor cardiovascular health (CVH), CKD, retinopathy, NAFLD and advanced liver fibrosis among the different T2D subgroups. The prevalence of poor CVH in subgroup B was significantly lower than that in subgroup A (adj. OR: 0.08, 95% CI: 0.05–0.12), while there were no significant differences among the other three subgroups (all P > 0.05). Subgroup B also had a lower prevalence of CKD than subgroup A (adj. OR: 0.69, 95% CI: 0.48–1.00). Compared with subgroup A, subgroup D had a significantly higher prevalence of retinopathy (adj. OR: 2.94, 95% CI: 1.16–7.48). Among the four subgroups, subgroup B had the lowest prevalence of NAFLD and advanced liver fibrosis (adj. OR: 0.64, 95% CI: 0.43–0.95, and adj. OR: 0.21, 95% CI: 0.15–0.29, respectively).

Table 2 ORs of poor CVH, CKD, retinopathy, NAFLD and advanced liver fibrosis among different T2D subgroups

HRs of all-cause, CVD and cancer-related mortalities of different subgroups

During 41,447 person-years of follow-up, 1,714 deaths were documented, including 524 CVD-related deaths and 268 cancer-related deaths. Figure 2 shows the Kaplan–Meier survival curves for the four subgroups; the cumulative incidence of all-cause death differed significantly among them (log-rank test, P < 0.001). HRs of all-cause, CVD-related and cancer-related mortality across the T2D subgroups are summarized in Table 3. For all-cause mortality, the adjusted HRs for subgroup B and subgroup C were 1.30 (95% CI, 1.02–1.67) and 1.48 (95% CI, 1.06–2.06), respectively, indicating significantly higher risk than in subgroup A. Subgroup C had the highest all-cause mortality, while subgroup A had the lowest. For CVD-related and cancer-related mortality, there were no significant differences among the four subgroups (all P > 0.05).

Fig. 2 Kaplan–Meier curves for all-cause, CVD and cancer-related mortality categorized by different subgroups of T2D

Table 3 HRs of all-cause, CVD and cancer-related mortality among different T2D subgroups

Subgroup analysis

Additional file 1: Tables S3 and S4 show the stratified analyses of the associations of the subgroups with complications and mortality, respectively. For the prevalence of poor CVH, CKD and retinopathy, the results were consistent when the analyses were stratified by gender, race and diabetes stage (all P interaction > 0.05). For NAFLD and advanced liver fibrosis, the P values for interaction with gender, race and diabetes stage were 0.679, 0.031, 0.411 and 0.029, 0.249, 0.791, respectively. The prevalence of NAFLD in subgroups B and C was significantly lower than in subgroup A among non-White participants. Among male participants, the prevalence of advanced liver fibrosis in subgroup D was significantly lower than in subgroup A. Regarding mortality risks, no significant interaction was found (all P interaction > 0.05) except between diabetes stage and all-cause mortality (P interaction = 0.024). After excluding participants who died within the first two years of follow-up, the results of the sensitivity analysis were generally consistent with Table 3, as shown in Additional file 1: Table S5.

Discussion

Our study proposed a novel yet simple approach to categorize the U.S. T2D population according to thresholds of FPG, TyG index and BMI. We divided patients into subgroup A (obesity-related diabetes), subgroup B (age-related diabetes), subgroup C (insulin resistant diabetes), and subgroup D (severe insulin deficient diabetes). Patients in subgroup B were older and had no severely impaired insulin secretion, severe insulin resistance or obesity. Compared with the other subgroups, subgroup B was a low-risk subgroup with the lowest prevalence of poor CVH, NAFLD, and advanced liver fibrosis. NAFLD is closely linked to obesity and insulin resistance [38]; because subgroup B had the lowest BMI, fasting glucose and insulin resistance, its risk of NAFLD, and likewise of advanced liver fibrosis, was low. We also found that patients in subgroup B had a higher CVH level than the other subgroups. CVH is related to the risk of CVD [39, 40]. Therefore, although physical function declines with aging, the risk of CVD-related mortality in this subgroup was not very high. Patients in subgroup D had the highest glucose level and severely impaired insulin secretion. The risk of retinopathy in subgroup D was significantly higher than in the other subgroups, as retinopathy is associated with reduced β-cell function, fasting and postprandial hyperglycemia and hypoinsulinemia rather than with insulin resistance [41]. For subgroup C, the all-cause mortality risk was significantly higher than in subgroup A, because this subgroup had relatively high age, glucose, BMI and insulin resistance.

From the results presented herein, we make the following suggestions: (1) For subgroups A, C and D, which had a low CVH level, patients should improve diet quality, change dietary behaviors [42] and increase aerobic exercise [43]; (2) For subgroup D, more attention should be paid to screening for retinopathy; (3) More attention should be paid to all-cause mortality in patients with diabetes, especially in subgroup C; (4) More attention should be paid to screening for NAFLD and advanced liver fibrosis in subgroups A, C and D.

The strengths of this study lie in several aspects. First, we used data from nine NHANES cycles, which provided a large, nationally representative sample of people with diabetes. Second, this study conducted a relatively comprehensive risk comparison, including the prevalence of poor CVH, chronic kidney disease, retinopathy, nonalcoholic fatty liver disease and advanced liver fibrosis, as well as the risks of mortality many years later. Third, the logistic and Cox models were adjusted for several potential confounding factors, including demographic, socioeconomic and lifestyle information, disease history and medication history. However, there are still some limitations. First, FPG is highly variable and cannot represent islet function stably. It was less stable than HbA1c, but had a stronger correlation with HOMA2-B (β-cell function), as shown by the weighted coefficients: −0.50 for FPG and −0.36 for HbA1c. This was consistent with the research of Cuiliu Li et al., in which FPG showed stronger correlations with indices of β-cell function than HbA1c [44]. In the future, an index similar to the TyG index may be developed to represent islet function stably. Second, although patients who were diagnosed with diabetes before the age of 30 were excluded, type 1 diabetes may still confound our results. In addition, our approach has only been validated in the U.S. population, and further studies in additional populations are warranted.

Conclusion

Instead of k-means grouping, we used thresholds of FPG, TyG and BMI to stratify T2D patients into different subgroups. This approach is more practical and easily adopted by clinicians, since it can identify subgroup-specific risks and help clinicians make specific treatment recommendations for people with diabetes. Moreover, fasting glucose, triglycerides and BMI are convenient to measure in the primary care setting, which has implications for appropriate management and could go a long way toward reducing complications.