Introduction

The prevalence of diabetes is increasing dramatically worldwide. In China, the overall prevalence of diabetes is estimated to be higher than 11.6%, affecting over 100 million adults [1]. Evidence indicates that lifestyle modification or pharmacological intervention could prevent up to two-thirds of high-risk individuals from developing diabetes [2, 3]. Therefore, early screening and intervention may reduce the harm of long-term hyperglycemia and prevent or delay chronic complications of diabetes.

Numerous diabetes risk scores have been developed to screen high-risk individuals for early intervention [4,5,6,7,8,9,10]. However, the predictive power of a diabetes risk score developed in one population may not apply directly to other populations [11]. In addition, only a small proportion of scores were constructed from longitudinal studies, particularly in China [12,13,14,15]. It therefore remains unclear whether ethnicity- or country-specific screening methods are required for the early diagnosis of diabetes and for early intervention.

In the present study, we aimed to identify risk factors associated with incident diabetes and develop a simple points-based score to predict diabetes risk after a 5-year follow-up among a cohort of middle-aged and older Chinese adults.

Materials and methods

Study population

The design, methods, and detailed information of the Dongfeng–Tongji cohort have been described elsewhere [16]. Briefly, a total of 27,009 retired employees were recruited into the cohort; they completed baseline questionnaires and medical examinations and provided baseline blood samples between September 2008 and June 2010. Among the 25,978 individuals (96.2%) who completed the follow-up until October 2013, we excluded individuals with diabetes at baseline (n = 4970), as well as those with missing anthropometric data, clinical data, or other covariates (n = 3326), resulting in a final study sample of 17,690 subjects (7926 males and 9764 females with a mean age of 63.3 years). The study was approved by the Ethics and Human Subject Committee of the School of Public Health, Tongji Medical College, and Dongfeng General Hospital, the Dongfeng Motor Corporation (DMC). All study participants provided written informed consent.

Data collection

Ascertainment of baseline and incident diabetes

Diabetes was diagnosed according to the American Diabetes Association (ADA) criteria [17], defined as meeting any of the following in follow-up interviews or laboratory examinations: (1) self-reported physician diagnosis of diabetes, (2) fasting blood glucose level ≥7.0 mmol/L, (3) HbA1c level ≥6.5%, (4) 2-h 75-g oral glucose tolerance test (OGTT) value ≥11.1 mmol/L, or (5) use of diabetes medication (insulin or oral hypoglycemic agents). Incident cases were those that occurred after the baseline survey but before the end of October 2013. Because the OGTT was not conducted in this study and HbA1c levels were assayed only during the follow-up in 2013, baseline and incident diabetes cases were ascertained according to self-reported physician diagnosis of diabetes, fasting plasma glucose levels, and use of diabetes medication. A total of 1390 incident diabetes cases were diagnosed during the follow-up period.
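The ascertainment rule actually applied in this study (three of the five ADA criteria) can be illustrated with a minimal sketch. The record layout and field names below are hypothetical and are not taken from the cohort database; only the 7.0 mmol/L threshold comes from the text.

```python
from dataclasses import dataclass

FPG_THRESHOLD_MMOL_L = 7.0  # fasting glucose cut-off stated in the text


@dataclass
class Visit:
    """Hypothetical record layout; field names are illustrative only."""
    self_reported_diagnosis: bool
    fasting_glucose_mmol_l: float
    on_diabetes_medication: bool


def has_diabetes(visit: Visit) -> bool:
    """True if any of the three criteria applied in this study is met."""
    return (
        visit.self_reported_diagnosis
        or visit.fasting_glucose_mmol_l >= FPG_THRESHOLD_MMOL_L
        or visit.on_diabetes_medication
    )


def is_incident_case(baseline: Visit, follow_up: Visit) -> bool:
    """Incident case: free of diabetes at baseline, diabetic at follow-up."""
    return not has_diabetes(baseline) and has_diabetes(follow_up)
```

Under this sketch, a participant with a baseline fasting glucose of 5.4 mmol/L and a follow-up value of 7.2 mmol/L would count as an incident case, whereas anyone meeting a criterion at baseline would be excluded from the analysis.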

Assessment of covariates

Baseline data were collected by trained interviewers using semi-structured questionnaires during face-to-face interviews. The questionnaires covered socio-demographic factors (age, sex, education, and marital status), medications, health status, and lifestyle factors including smoking status, alcohol consumption status, and physical activity. Participants were asked about their medical history, including diabetes, coronary heart disease (CHD), stroke, hypertension, hyperlipidemia, and cancer. Hypertension was defined as a self-reported physician diagnosis of hypertension, blood pressure ≥140/90 mmHg, or current use of antihypertensive medication. Hyperlipidemia was defined as total cholesterol >5.72 mmol/L or triglycerides >1.70 mmol/L at medical examination, current use of lipid-lowering medication, or a previous physician diagnosis of hyperlipidemia. According to their self-reported smoking status, participants were classified as ex-smokers, current smokers, or non-smokers. Based on self-reported alcohol consumption status, participants were grouped as ex-, current, or non-alcohol consumers. Because the proportions of ex-smokers (12%) and ex-alcohol consumers (6%) were small, we combined them with non-smokers and non-alcohol consumers, respectively.
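The covariate definitions above translate into simple rules, sketched below. Function and parameter names are hypothetical. The sketch assumes that "≥140/90 mmHg" means systolic pressure ≥140 mmHg or diastolic pressure ≥90 mmHg, which is the conventional reading but is not spelled out in the text.

```python
def has_hypertension(systolic_mmhg: float, diastolic_mmhg: float,
                     self_reported_diagnosis: bool,
                     on_antihypertensives: bool) -> bool:
    """Hypertension per the stated definition.

    Assumption: '>=140/90 mmHg' is read as systolic >= 140 OR diastolic >= 90.
    """
    elevated_bp = systolic_mmhg >= 140 or diastolic_mmhg >= 90
    return self_reported_diagnosis or on_antihypertensives or elevated_bp


def has_hyperlipidemia(total_cholesterol_mmol_l: float,
                       triglycerides_mmol_l: float,
                       on_lipid_lowering_medication: bool,
                       previous_diagnosis: bool) -> bool:
    """Hyperlipidemia per the stated thresholds (strict inequalities)."""
    abnormal_lipids = (total_cholesterol_mmol_l > 5.72
                       or triglycerides_mmol_l > 1.70)
    return abnormal_lipids or on_lipid_lowering_medication or previous_diagnosis


def is_current_smoker(status: str) -> bool:
    """Ex-smokers are grouped with non-smokers, as described in the text."""
    return status == "current"
```

Note that the lipid thresholds are strict (a triglyceride value of exactly 1.70 mmol/L is not classified as abnormal), whereas the blood pressure thresholds are inclusive.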

The general health examination was performed at the same time. Standing height, body weight, and waist circumference were measured with participants in light indoor clothing and without shoes. Body mass index (BMI) was calculated as weight in kilograms divided by height in meters squared.

All subjects were examined in the morning after an overnight fast, and 15 mL of fasting blood was drawn into three vacuum ethylenediaminetetraacetic acid (EDTA) anticoagulation tubes for plasma and a coagulation tube for serum. Fasting blood glucose was determined by the glucose oxidase method on an Abbott Aeroset analyzer. Triglyceride, total cholesterol, LDL cholesterol, and HDL cholesterol levels were measured in the hospital’s laboratory with an ARCHITECT Ci8200 automatic analyzer (Abbott Laboratories, Abbott Park, Illinois, USA) using Abbott Diagnostics reagents according to the manufacturer's instructions.

Statistical analysis

All statistical analyses were performed using SPSS 13.0 software. Categorical variables were presented as percentages and compared by the Chi-square test. Continuous variables were expressed as means (SD) and compared by Student’s t test or analysis of variance (ANOVA) unless otherwise specified.

A Cox proportional hazards regression model was fitted with potential risk factors for diabetes, including age, sex, BMI, waist circumference, fasting glucose, hypertension, hyperlipidemia, current smoking status, current alcohol drinking status, and family history of diabetes. All candidate risk factors were categorized. Factors significantly associated with diabetes risk were retained in the final model. Points for the diabetes risk score were obtained by multiplying the β-coefficient of each significant variable by 10 and rounding to the nearest integer [7, 18]; an individual's score was the sum of these points. The 5-year risk of diabetes was estimated based on the score.
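The conversion from β-coefficients to points can be sketched as follows. The risk factor names and the coefficient values below are placeholders for illustration and are not the fitted values from Table 2. The text does not state how exact halves are rounded; this sketch assumes they round up.

```python
import math


def beta_to_points(beta: float) -> int:
    """Multiply a Cox beta-coefficient by 10 and round to the nearest integer.

    Assumption: exact halves round up (Python's built-in round() would
    instead round halves to the nearest even integer).
    """
    return math.floor(beta * 10 + 0.5)


# Placeholder coefficients, NOT the values reported in Table 2.
EXAMPLE_BETAS = {
    "risk_factor_a": 0.139,
    "risk_factor_b": 0.62,
    "risk_factor_c": 1.914,
}

POINTS = {name: beta_to_points(beta) for name, beta in EXAMPLE_BETAS.items()}


def risk_score(present_factors) -> int:
    """Sum the points of every risk factor category that applies."""
    return sum(POINTS[name] for name in present_factors)
```

With these placeholder values, a participant with risk factors a and c would receive 1 + 19 = 20 points.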

The receiver operating characteristic (ROC) curve was obtained by plotting sensitivity against 1 − specificity. The optimal cutoff point was identified as the point that maximized the Youden index (sensitivity + specificity − 1) [18]. The area under the curve (AUC) was also calculated for several previously reported diabetes risk models, including the San Antonio Heart Study [9], the Framingham Offspring Study [10], the Atherosclerosis Risk in Communities (ARIC) study [8], the India model [7], and the Thailand model [4]. A two-sided p value of <0.05 was considered statistically significant.
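A minimal sketch of the cutoff selection and AUC calculation is given below. It is not the SPSS procedure used in the study. It assumes that a participant screens positive when the score is greater than or equal to the cutoff, and it computes the AUC through its rank interpretation (the probability that a randomly chosen case scores higher than a randomly chosen non-case, with ties counted as one half). All function names are illustrative.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' means positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1."""
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, cutoff)
        j = sens + spec - 1
        if j > best_j:  # on ties, the lowest cutoff is kept
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


def auc(scores, labels):
    """AUC as the fraction of case/non-case pairs ranked correctly."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in non_cases
    )
    return wins / (len(cases) * len(non_cases))
```

The pairwise AUC is quadratic in the sample size, which is acceptable for a sketch; a cohort-scale analysis would normally use a rank-based implementation from a statistics package.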

To test the stability of the present model, we validated it as follows. The model was developed in a randomly selected 90% of the overall sample using the aforementioned analysis method and validated in the remaining 10%. This process was repeated 10 times. The significant risk factors were identical to those identified in the overall sample, and the AUCs ranged from 0.746 to 0.770.
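The repeated random split described above can be sketched as follows. Only the splitting step is shown; refitting the Cox model on each development subset and computing the AUC on each validation subset are left out. The function name, the seed, and the use of independent reshuffling for each repetition are assumptions, since the text does not describe how the random selection was implemented.

```python
import random


def repeated_holdout_splits(n_subjects, n_repeats=10, train_fraction=0.9,
                            seed=0):
    """Yield (development, validation) index lists for each repetition.

    Assumption: each repetition is an independent random 90%/10% split,
    so validation subsets from different repetitions may overlap.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    n_train = round(n_subjects * train_fraction)
    for _ in range(n_repeats):
        indices = list(range(n_subjects))
        rng.shuffle(indices)
        yield indices[:n_train], indices[n_train:]
```

This differs from 10-fold cross-validation, in which the ten validation subsets are disjoint; the text is compatible with either reading.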

Results

The baseline characteristics of participants with and without incident diabetes are summarized in Table 1. Compared with non-diabetic subjects, incident diabetes cases had higher levels of BMI, waist circumference, blood pressure, total cholesterol, triglycerides, LDL-C, and fasting glucose, and lower levels of HDL-C. Moreover, incident diabetes cases were more likely to have a family history of diabetes, hypertension, and hyperlipidemia at baseline.

Table 1 Baseline characteristics of newly incident diabetic and non-diabetic participants

As shown in Table 2, BMI, fasting glucose, hyperlipidemia, hypertension, current smoking status, and family history of diabetes were significantly associated with incident diabetes in the Cox proportional hazards regression models, whereas sex, age, waist circumference, and current alcohol drinking status were not significantly related to diabetes risk. The β-coefficients of the significant variables ranged from 0.139 to 1.914, and the optimal cutoff value was 1.5. A simple scoring system was developed from the Cox regression coefficients to estimate the risk of developing diabetes within 5 years.

Table 2 β-Coefficients and relative risk (95% CI) of incident diabetes using Cox proportional hazard regression analysis in the Dongfeng–Tongji cohort

We further estimated the performance of the developed diabetes risk score (Table 3). The estimated probability of developing diabetes 5 years later gradually escalated in association with higher risk scores. The total points of the risk score ranged from 0 to 36. The optimal cutoff point for incident diabetes was 15. In the current study, 25.0% of the participants had a risk below 25%, 39.5% had a risk between 25 and 35%, and 35.5% had a risk of 35% or higher using this scoring system (data not shown). The AUC was 0.751 (95% CI 0.737–0.764) (Fig. 1).

Table 3 Screening performance of the developed diabetes risk scores for predicting future diabetes
Fig. 1 Receiver operating characteristic curve for the diabetes risk score applied to the study population in the 5-year follow-up of the Dongfeng–Tongji cohort. The area under the curve (AUC) was 0.751 (95% CI 0.737–0.764; p < 0.0001). At a diabetes risk score cutoff of 15, sensitivity was 0.656, specificity was 0.729, and positive predictive value (PPV) was 0.363

We further validated five previously reported prediction models derived from prospective cohort studies in the Dongfeng–Tongji cohort and compared their discrimination with that of the newly established diabetes risk model based on the six parameters identified above. All variables in each foreign model were applied directly in the Dongfeng–Tongji cohort, and the corresponding AUCs were calculated. For the ARIC and San Antonio models, all predictive variables were included except ethnicity. The performance of the present predictive model (AUC 0.764 [95% CI 0.750–0.777]) was similar to that of the San Antonio model (AUC 0.761 [95% CI 0.747–0.775]) and the ARIC model (AUC 0.760 [95% CI 0.746–0.774]) and was superior to that of the other three predictive models, namely the Framingham Offspring Study, India Study, and Thailand Study models (all p < 0.05), in terms of AUC (Fig. 2).

Fig. 2 Receiver operating characteristic curves of different models for the prediction of incident diabetes. The areas under the curves (AUC) were as follows: Dongfeng–Tongji: AUC = 0.764 (95% CI 0.750–0.777). San Antonio Heart Study: AUC = 0.761 (95% CI 0.747–0.775). ARIC Study: AUC = 0.760 (95% CI 0.746–0.774). Framingham Offspring Study: AUC = 0.582 (95% CI 0.566–0.598). India Study: AUC = 0.646 (95% CI 0.631–0.662). Thailand Study: AUC = 0.650 (95% CI 0.635–0.666)

Discussion

In the present cohort study, a new diabetes risk prediction model including BMI, fasting glucose, hypertension, hyperlipidemia, current smoking status, and family history of diabetes was established among a middle-aged and older Chinese population.

Several risk scores have been developed to predict incident diabetes or detect undiagnosed diabetes. Most were derived from Caucasian populations, and the common risk factors included age, family history of diabetes, and anthropometric indicators of obesity [5, 6, 8,9,10, 19]. However, most risk prediction models perform better in their original population, and their predictive power may not be satisfactory in other populations because of ethnic heterogeneity. Moreover, the present population was middle-aged and older, with an average age of 63.3 years, so a new risk prediction model was needed to estimate the 5-year diabetes risk in this group.

The AUC of the present model was similar to those of the San Antonio and ARIC models, probably because the three models share several predictors; it should also be noted that the Dongfeng–Tongji score was tested on the same data used for its development, which may favor it in this comparison. The variables in the San Antonio model consisted of age, sex, Mexican–American ethnicity, fasting glucose, systolic blood pressure, HDL cholesterol, BMI, and family history of diabetes; the predictive factors in the ARIC model included height, waist circumference, black race/ethnicity, systolic blood pressure, fasting glucose, HDL cholesterol, triglycerides, and parental history of diabetes. In common with these two models, the present model includes fasting blood glucose, measures of blood pressure and blood lipids (hypertension and hyperlipidemia), and family history of diabetes, which together account for a large proportion of the diabetes risk score for the middle-aged and older population. Most of the included factors are known to be associated with diabetes from previous etiological research. Obesity is a well-established risk factor for numerous chronic diseases [20]; adipose tissue can release a large number of cytokines and bioactive mediators that disturb the regulation of insulin and play important roles in the pathogenesis of diabetes [21]. Hypertension and dyslipidemia are well-known risk factors for cardiovascular diseases as well as diabetes. The risk factors included in these models are listed in Supplementary Table 1.

In contrast to the San Antonio and ARIC models, the present predictive model did not include age and sex but additionally included current smoking status. A meta-analysis showed that active smoking was associated with a 1.44-fold higher risk of developing diabetes compared with non-smoking [22]. A Finnish study indicated that adding smoking to the original Finnish model could significantly improve its predictive ability [19]. In the present study, age and sex were not included in the predictive model, probably because the mean age of the participants was 63.3 years at baseline, when most women were postmenopausal and diabetes incidence might not differ between men and women. Participants aged 70 years or older had a decreased risk of diabetes, which might be partially due to survival bias. It should be considered that BMI in Asian individuals is lower than that in other populations and that the performance of fasting glucose might differ by race; therefore, whether the thresholds shown in this study are universally useful remains to be validated in other ethnic groups.

Notably, the present study constructed a 5-year diabetes risk prediction model for middle-aged and older Chinese individuals on the basis of a large sample. Moreover, the β-coefficient of fasting glucose was the highest among all the included risk factors, which suggests that fasting glucose might be the strongest predictor of the development of diabetes. Some risk score models have been developed in Chinese populations [12,13,14,15, 23]. However, most of them were based on cross-sectional studies, small sample sizes, or shorter follow-up periods. Further studies focusing on the middle-aged and older population are warranted to validate our findings.

Several strengths of this study should be highlighted. First, the prospective design, the relatively large sample size, and the 5-year follow-up period provided reasonable statistical power. Second, baseline and incident diabetes cases were diagnosed according to rigorous standards, which should have largely reduced false positives. Third, the present predictive model was based on common and easily measured factors that are easy to translate into clinical care. Fourth, few studies have focused on diabetes risk prediction in the middle-aged and older Chinese population, and these findings might provide new insight into diabetes prediction and prevention in different populations.

Nevertheless, some limitations should also be taken into consideration. First, participants in the present study were middle-aged and older Chinese adults (mean age 63.3 years); this simple score may not be generalizable to other age groups. Second, HbA1c levels and 2-h OGTT values were not available at baseline, so some participants with undiagnosed diabetes might have been misclassified as non-diabetic. Third, although we attempted to adjust for all available potential confounders, we could not eliminate residual confounding. Finally, the comparison of discrimination between the present score and the previously published scores is not completely fair because the Dongfeng–Tongji score was tested on the same data used for its development; external validation is warranted in the future.

In summary, in this population-based prospective study, a simple diabetes risk score was established on the basis of BMI, fasting glucose, hypertension, hyperlipidemia, current smoking status, and family history of diabetes. This score can be used as a simple tool to screen for and estimate the 5-year risk of diabetes in a middle-aged and older population.