Introduction

Chronic kidney disease (CKD) is an irreversible decrease in kidney function, shown by a glomerular filtration rate (GFR) of less than 60 mL/min/1.73 m², or markers of kidney damage, or both, of at least 3 months' duration, regardless of the underlying cause [1]. The estimated global prevalence of CKD is 13.4% (11.7–15.1%) [2]. Prevalence varies across countries and has been reported to be around 11% in high-income countries, including the USA and Australia [1]. A study conducted in China found a CKD prevalence of 10.8% (10.2–11.3%) among Chinese adults [3]. CKD has raised widespread concern in recent years and has become a major public health problem across the globe, leading to irreversible nephron loss, end-stage renal disease (ESRD), cardiovascular events, and/or premature death [4]. The global prevalence of CKD is still increasing owing to rising rates of hypertension, diabetes, and obesity, and to population aging [5].

Because the early stages of CKD lack obvious clinical symptoms, people are usually not diagnosed until severe damage has emerged [6]. Patients with CKD are at high risk of poor prognosis and death, since a considerable number of them have hypertension, diabetes, or severe electrolyte or structural abnormalities [7]. To date, the appropriate treatment of older patients with CKD is not clear [4].

Primary risk factors for CKD include aging, blood pressure levels, diabetes status, serum lipid status, obesity, smoking, and alcohol consumption [8,9,10,11]. Several prediction models for CKD already exist, but they were developed mainly in US or other Western populations [12, 13] and are not suitable for the Chinese population for three reasons: allelic variation between populations of European ancestry and those of East Asian ancestry; differences in the prevalence of the most common risk factors for CKD, such as older age, hypertension, and diabetes; and the different prevalence of CKD in the US and China [14,15,16,17,18,19]. Within China, some models have been developed for people with hypertension or type 2 diabetes mellitus [20, 21]. Chien’s model for incident CKD was limited by its homogeneous participants, who came from one health center and were followed for a median of 2.2 years between 2003 and 2007 [22]. Given the profound economic changes, the marked shifts in lifestyle, and the emerging environmental problems of the past decades [23, 24], a new prediction model for CKD in Chinese adults is needed to identify individuals at high risk so that timely interventions can be carried out. In recent years, the nomogram has been accepted as a reliable graphical prediction tool [25,26,27]; it consists of a set of scales, each representing one characteristic of the study population, and is convenient for clinicians to apply. Yet nomograms have rarely been used to predict the risk of CKD. In the present study, based on the China Health and Retirement Longitudinal Study (CHARLS), we aimed to develop a nomogram to predict the 4-year risk of CKD among elderly Chinese adults.

Materials and methods

The data were obtained from the China Health and Retirement Longitudinal Study (CHARLS), a nationwide study of Chinese adults aged 45 years or older and their spouses. The national baseline survey was conducted between June 2011 and March 2012 (CHARLS 2011), and respondents across 150 counties/districts were recruited using a multistage sampling strategy [28]. Detailed information on demographic background and biomedical findings was collected at baseline and at each follow-up (every 2 years) using a structured questionnaire. The present study included participants who were recruited in CHARLS 2011 and re-examined in CHARLS 2015, the two waves in which blood samples were collected.

Variables associated with CKD in previous studies were extracted as follows: age, gender, body mass index (BMI), hypertension, diabetes, stroke, asthma, triglyceride (TG), total cholesterol (TC), low-density lipoprotein cholesterol (LDL), high-density lipoprotein cholesterol (HDL), serum creatinine (SCr), smoking status, drinking status, Cystatin C, hemoglobin, uric acid at baseline, and SCr at follow-up.

Because awareness of CKD is low [29], kidney function was assessed with the estimated glomerular filtration rate (eGFR), calculated using the CKD Epidemiology Collaboration (CKD-EPI) equation with the coefficient modification derived in a Japanese population [30]:

$$\mathrm{eGFR}\ (\mathrm{mL/min/1.73\ m^2}) = 0.813 \times 141 \times \min(\mathrm{SCr}/\kappa,\ 1)^{\alpha} \times \max(\mathrm{SCr}/\kappa,\ 1)^{-1.209} \times 0.993^{\mathrm{Age}} \times 1.021\ [\text{if female}] \times 1.159\ [\text{if black}]$$

where κ is 0.7 for females and 0.9 for males, α is −0.329 for females and −0.411 for males, min indicates the minimum of SCr/κ and 1, and max indicates the maximum of SCr/κ and 1. In this study, individuals with eGFR < 60 mL/min/1.73 m² were classified as having CKD, according to the clinical practice guideline for the evaluation and management of chronic kidney disease [31].
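The equation above can be sketched in a few lines of Python. This is only an illustration of the formula as quoted in the text: the function names are hypothetical, and serum creatinine is assumed to be in mg/dL, which the text does not state.

```python
def egfr_ckd_epi_modified(scr_mg_dl: float, age_years: float,
                          female: bool, black: bool = False) -> float:
    """eGFR in mL/min/1.73 m^2 from the coefficient-modified CKD-EPI
    equation quoted in the text (SCr assumed to be in mg/dL)."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = (
        0.813 * 141
        * min(ratio, 1.0) ** alpha      # applies when SCr is below kappa
        * max(ratio, 1.0) ** -1.209     # applies when SCr is above kappa
        * 0.993 ** age_years
    )
    if female:
        egfr *= 1.021
    if black:
        egfr *= 1.159
    return egfr


def has_ckd(egfr: float) -> bool:
    """CKD definition used in the study: eGFR below 60 mL/min/1.73 m^2."""
    return egfr < 60.0


# Example: a 60-year-old man with serum creatinine of 0.9 mg/dL.
example_egfr = egfr_ckd_epi_modified(0.9, 60, female=False)
print(round(example_egfr, 1), has_ckd(example_egfr))
```

Because only one of the min and max terms differs from 1 for any given creatinine value, the function reduces to a single power of SCr/κ on each side of κ.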

A total of 14,574 participants who attended both CHARLS 2011 and CHARLS 2015 were initially considered for the study. Participants were excluded for the following reasons: (1) missing information on the variables stated above; (2) extreme BMI values (< 15 kg/m² or > 55 kg/m²); (3) a diagnosis of CKD at baseline. Finally, 3562 participants were included in the study.

All participants were randomly divided into the training cohort and the validation cohort by a ratio of 7:3. Descriptive statistics (median, the first quartile, and the third quartile for skewed continuous data, and frequencies and percentages for categorical data) were used to report the baseline demographics and clinical characteristics. Differences between groups were analyzed using Chi-square tests for categorical variables and Wilcoxon rank-sum tests for skewed continuous variables.
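A random 7:3 split of this kind can be sketched as follows. The function name, the seed, and the use of integer participant IDs are assumptions for illustration; the paper does not describe how the randomization was implemented.

```python
import random


def split_cohort(participant_ids, train_fraction=0.7, seed=2011):
    """Shuffle the IDs reproducibly and cut them at the requested fraction."""
    ids = list(participant_ids)
    rng = random.Random(seed)   # fixed seed (hypothetical) for reproducibility
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 3562 participants were included in the study; the IDs here are placeholders.
train_ids, valid_ids = split_cohort(range(3562))
print(len(train_ids), len(valid_ids))
```

The split is made once, before any modelling, so that the validation cohort plays no part in variable selection or model fitting.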

Univariate and multivariate logistic regression analyses were used to establish a model for predicting the risk of CKD. Variables with a p value less than 0.1 in the univariate analysis were entered into the stepwise multivariate logistic regression to determine the final risk factors for CKD. The predictive nomogram was constructed from the final logistic model using the data of the training cohort and internally validated using the data of the validation cohort.
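For reference, the logistic model underlying this procedure has the standard form below. The symbols are notation introduced here rather than taken from the paper: p is the 4-year probability of CKD, x_j are the selected predictors, β_j their coefficients, and OR_j the corresponding odds ratios.

```latex
\operatorname{logit}(p) = \ln\frac{p}{1-p} = \beta_0 + \sum_{j=1}^{k} \beta_j x_j,
\qquad
\mathrm{OR}_j = e^{\beta_j},
\qquad
p = \frac{1}{1 + e^{-\left(\beta_0 + \sum_{j=1}^{k} \beta_j x_j\right)}}
```

An odds ratio above 1 therefore corresponds to a positive coefficient and a higher predicted risk as the predictor increases, and an odds ratio below 1 to the reverse.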

The performance of the nomogram was assessed in terms of discriminative ability, predictive accuracy, and clinical application value, using a receiver-operating characteristic (ROC) curve, a calibration plot, and decision curve analysis, respectively. Discriminative ability was measured by the area under the receiver-operating characteristic curve (AUC), which ranges from 0.5 (no discrimination) to 1 (perfect discrimination). The calibration plot describes the degree of fit between the actual risk of CKD and the nomogram-predicted risk of CKD. Decision curve analysis (DCA) was used to assess the clinical utility of the nomogram.
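Two of these measures are simple enough to sketch directly. The code below computes the AUC as the probability that a randomly chosen case is ranked above a randomly chosen non-case, and the net benefit used in decision curve analysis at a single risk threshold. The toy outcomes and predicted risks are made up for illustration, and the function names are hypothetical; calibration is not covered.

```python
def auc(y_true, y_score):
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    controls = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


def net_benefit(y_true, y_score, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the treat-all strategy at the same threshold."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data (not from the study): 1 = developed CKD, 0 = did not.
outcomes = [0, 0, 1, 0, 1, 1]
risks = [0.05, 0.10, 0.20, 0.30, 0.60, 0.80]

print(auc(outcomes, risks))
print(net_benefit(outcomes, risks, 0.25), net_benefit_treat_all(outcomes, 0.25))
```

A decision curve is obtained by evaluating these net-benefit functions over a grid of thresholds; the treat-none strategy has a net benefit of zero throughout.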

All the statistical analyses were performed with SAS 9.4 statistical software (SAS Institute Inc., Cary, North Carolina) and R version 4.1.0 software (http://www.R-project.org/). The tests were two‐tailed, and p < 0.05 was taken as statistically significant.

Results

A total of 3562 participants (28.97% men and 71.03% women) were included in the study and randomly divided into a training cohort and a validation cohort at a ratio of 7:3. A total of 413 participants developed CKD over the following 4 years, giving an overall cumulative incidence of 11.59%. The basic demographics and clinical characteristics of the training cohort and the validation cohort are shown in Table 1. Except for smoking status, there were no statistically significant differences between the cohorts in any of the variables. The training cohort had a higher proportion of smokers.

Table 1 Demographics and clinical characteristics of the training cohort and the validation cohort

Table 2 displays the results of the univariate and multivariate logistic regression analyses for predictors of incident CKD in the training cohort. In the univariate analysis, age (OR 1.042), hypertension (OR 1.889), total cholesterol (OR 1.004), uric acid (OR 1.365), and Cystatin C (OR 22.828) were positively associated with CKD (p < 0.01), while male gender (OR 0.315), smoking (OR 0.503), drinking (OR 0.448), eGFR (OR 0.958), and hemoglobin (OR 0.827) were negatively associated with CKD (p < 0.01). These factors were entered into the multivariate stepwise logistic regression analysis. In the multivariate analysis, male gender (OR 0.051), hypertension (OR 1.406), eGFR (OR 0.880), hemoglobin (OR 0.840), and Cystatin C (OR 2.478) were associated with CKD (p < 0.001) and were retained as predictors in the final model. The variance inflation factor (VIF) values were all < 3, indicating no substantial collinearity among the selected variables.

Table 2 Risk predictors for incident CKD in the univariate and multivariate analysis

According to the results in Table 2, the final model containing gender, eGFR, hypertension, hemoglobin, and Cystatin C was used to construct a predictive nomogram for the 4-year risk of CKD among Chinese adults. Figure 1 shows the nomogram for predicting the probability of incident CKD. The total points map to the predicted risk of CKD.
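How a nomogram turns regression coefficients into points can be sketched as follows. The odds ratios are those reported for the multivariate model; everything else is an assumption made for illustration: the axis ranges are hypothetical, the units behind each odds ratio are not stated in the text, and the scaling rule (the predictor with the widest effect spans 0 to 100 points) is a common nomogram convention rather than a confirmed description of how Fig. 1 was built.

```python
import math

# Odds ratios from the multivariate model reported in the text (Table 2).
ODDS_RATIOS = {
    "male": 0.051,
    "hypertension": 1.406,
    "egfr": 0.880,
    "hemoglobin": 0.840,
    "cystatin_c": 2.478,
}

# HYPOTHETICAL axis ranges (low, high) for each scale; not taken from the paper.
AXIS_RANGES = {
    "male": (0, 1),
    "hypertension": (0, 1),
    "egfr": (60, 120),
    "hemoglobin": (8, 18),
    "cystatin_c": (0.5, 2.0),
}

# Logistic coefficients are the natural logs of the odds ratios.
BETAS = {name: math.log(odds_ratio) for name, odds_ratio in ODDS_RATIOS.items()}


def effect_span(name):
    """Largest change in the linear predictor this variable can cause on its axis."""
    low, high = AXIS_RANGES[name]
    return abs(BETAS[name]) * (high - low)


MAX_SPAN = max(effect_span(name) for name in BETAS)


def points(name, value):
    """Points for one predictor: 0 at its lowest-risk end, scaled so that the
    widest-ranging predictor reaches 100."""
    low, high = AXIS_RANGES[name]
    beta = BETAS[name]
    lowest_risk_value = low if beta > 0 else high
    return beta * (value - lowest_risk_value) / MAX_SPAN * 100


def total_points(profile):
    """Sum of points over all predictors in a participant profile."""
    return sum(points(name, value) for name, value in profile.items())


# Hypothetical participant: woman with hypertension.
profile = {"male": 0, "hypertension": 1, "egfr": 75, "hemoglobin": 12, "cystatin_c": 1.1}
print(round(total_points(profile), 1))
```

Converting total points into a predicted probability additionally requires the model intercept, which is not reported in the text, so that step is left out.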

Fig. 1

Nomogram constructed to predict the risk of CKD for participants. Density plots show the distributions of total points, Cystatin C, Hb, and eGFR. For gender and HTN, the distribution is reflected by the size of the box (the smaller box represents male or Yes and the bigger box represents female or No). Each factor is assigned points based on the nomogram. The total points are obtained by adding the individual scores of the five risk factors and are then converted to the estimated probability. HTN, hypertension; Hb, hemoglobin; eGFR, estimated glomerular filtration rate; CKD, chronic kidney disease.

Finally, we evaluated the performance of the nomogram. The AUC was 0.809 in the training cohort (Fig. 2a) and 0.837 in the validation cohort (Fig. 2d). At the best threshold, the specificity and sensitivity in the training cohort were 83.1% and 67.9%, respectively. The calibration plots, which examine the agreement between the nomogram-predicted probability and the observed probability, showed good predictive accuracy in the training cohort (Fig. 2b) and the validation cohort (Fig. 2e). Additionally, DCA, a method for assessing whether a prediction model is useful in supporting clinical decisions, showed that the model had potential clinical application value. When the risk threshold ranged from 0.04 to 0.89 in the training cohort (Fig. 2c) and from 0.03 to 0.68 in the validation cohort (Fig. 2f), the nomogram had a greater net benefit than either the treat-all-patients strategy or the treat-none strategy.
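Sensitivity and specificity at a chosen threshold can be computed as in the sketch below. The text does not say how the "best threshold" was selected; the code assumes the common choice of maximizing Youden's index (sensitivity + specificity − 1), and both the function names and the toy data are hypothetical.

```python
def sensitivity_specificity(y_true, y_score, threshold):
    """Sensitivity and specificity when risk >= threshold is called positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, y_score):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(y_true, y_score):
    """Observed score that maximizes sensitivity + specificity (Youden's index);
    the lowest such score is returned if several tie."""
    candidates = sorted(set(y_score))
    return max(candidates,
               key=lambda t: sum(sensitivity_specificity(y_true, y_score, t)))


# Toy data (not from the study): 1 = developed CKD, 0 = did not.
outcomes = [0, 0, 0, 1, 1, 0, 1, 1]
risks = [0.1, 0.2, 0.3, 0.35, 0.4, 0.5, 0.7, 0.9]

best = youden_threshold(outcomes, risks)
print(best, sensitivity_specificity(outcomes, risks, best))
```

Other selection rules, such as fixing a minimum sensitivity, would give a different threshold and a different sensitivity/specificity pair from the same ROC curve.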

Fig. 2

Evaluation of the nomogram model. (1) Receiver-operating characteristic curve for the nomogram in the training cohort (a) and the validation cohort (d). (2) Nomogram calibration plot in the training cohort (b) and the validation cohort (e). The closer the solid line (nomogram performance) is to the dotted line (ideal model), the better the prediction accuracy of the nomogram. (3) Decision curve analysis for the prediction model in the training cohort (c) and the validation cohort (f). The red solid line represents the prediction model, the gray line represents the assumption that all participants have CKD, and the solid horizontal line represents the assumption that no participants have CKD. The graph depicts the expected net benefit per patient based on the nomogram prediction of CKD risk.

Discussion

We developed and validated a nomogram for predicting the 4-year risk of CKD among Chinese adults based on data from CHARLS 2011 and CHARLS 2015. The model included five variables: gender, hypertension, eGFR, hemoglobin, and Cystatin C. The evaluation showed that the model performed well, and it may help in the prevention of CKD. With the aid of this nomogram, any physical examination center could identify participants at higher risk for CKD and then offer professional advice, such as managing blood pressure and anemia, to slow the decline of renal function.

CKD is known to be associated with various complications, including cancer, cardiovascular disease, osteoporosis, kidney failure, mortality, and poor quality of life for survivors in general [32,33,34,35,36]. Most recent studies of prediction models for CKD have focused on the progression of CKD [37,38,39,40], and relatively few models for the risk of CKD in the general population have appeared in the last decade. In 2011, Nynke developed a prediction model to identify individuals at increased risk of developing progressive CKD and found that age, urinary albumin excretion, systolic BP, C-reactive protein, and known hypertension were predictors [13]. In 2019, Robert developed a prediction model based on age, sex, race, eGFR, history of cardiovascular disease, ever smoking, hypertension, BMI, and albuminuria concentration [41]. That model showed high discrimination and variable calibration in diverse populations, but no decision curve analysis was conducted to evaluate its clinical utility. Lin [20] and Wan [21] developed prediction models for renal disease in Chinese patients with type 2 diabetes mellitus and hypertension, respectively, which are not suitable for the general population. In 2017, Chien developed a point system to estimate the 4-year risk of chronic kidney disease [22], with age topping the list of predictors. That study was limited by the homogeneity of its participants, who were patients of a single hospital followed from 2003 to 2007. Over the past two decades, economic and social development has brought dramatic changes in lifestyle, which could affect the prevalence of CKD. In addition, growing environmental pollution deserves attention, as it poses a major threat to public health and results in a substantial disease burden in terms of premature deaths, disability-adjusted life-years lost, and kidney disease [42,43,44]. Under these circumstances, an up-to-date predictive model of CKD is needed.

Gender was a predictor for CKD. Previous studies indicated that CKD was more common among women than men in most areas [45, 46]. This discrepancy may be due in part to gender-related inequities, with more women lacking food, education, or economic power, which directly or indirectly affects the risk of kidney disease [47]. A review found that women were at higher risk of developing CKD in Asia [48], and a cross-sectional study conducted in China found that female gender was significantly associated with low eGFR as well [3]. Our results are consistent with a higher risk of CKD in women. Since the current medical management of CKD patients is gender-blind [49], this gap needs to be bridged in order to slow the progression of CKD in women.

Laboratory parameters are often used in prediction models [50, 51]. Our nomogram included hemoglobin, Cystatin C, and eGFR. CKD progression is known to contribute to declining hemoglobin [52]. A cohort study concluded that mildly increased hemoglobin was associated with subtle declines in GFR among a population with GFR ≥ 60 mL/min/1.73 m² [53]. Another study indicated that as GFR decreased, the hemoglobin level climbed and peaked at an eGFR of 60–89 mL/min/1.73 m², followed by a decrease at an eGFR of < 60 mL/min/1.73 m² [54]. In the univariate analysis, hemoglobin concentration was positively associated with CKD. Yet, the association turned negative in the multivariate analysis. Since the VIF values in the multivariate analysis were < 3, collinearity is unlikely to explain this, and the reasons for the opposite results remain unclear. Cystatin C is regarded as an alternative marker of kidney function to creatinine [55, 56]. A study conducted in the USA demonstrated that Cystatin C was higher in CKD groups [57]. In the present study, baseline Cystatin C and creatinine-based eGFR were predictors of CKD risk, consistent with previous research [13, 58].

Hypertension and diabetes mellitus are among the well-known causes of CKD [3, 5]. The greater the prevalence of hypertension and diabetes, the larger the number of CKD patients [59]. Poorly controlled hypertension can lead to renal damage [60]. Two studies conducted in China both found that hypertension and diabetes were independently associated with CKD [61, 62]. However, a community-based survey in Taiwan showed that hypertension was not significantly associated with CKD [63]. In the present study, there was a significant association between hypertension and CKD, but we did not find an association between diabetes mellitus and CKD. The associations of diabetes mellitus and hypertension with CKD remain controversial in the Chinese population, and further studies are required to clarify them.

Age was a conventional risk factor for CKD in previous studies. Aging is known to be the main risk factor for many diseases, including cancer, impaired cardiovascular health, and declining cognitive function [64,65,66,67]. Chronic kidney disease is no exception: significant molecular, structural, and functional changes of the kidney have been observed in aging populations even in the absence of other chronic diseases [68, 69]. A cohort study found that older age was associated with prevalent CKD [70]. Another study included age in its prediction model and identified it as the biggest risk factor for CKD [22]. However, age was not a predictor for CKD in our final predictive model. In the univariate analysis, age was significantly associated with the risk of CKD, but it was excluded from the final model after stepwise multivariate logistic regression.

The nomogram has considerable potential value in clinical use. First, it performed well in terms of the ROC curve, the calibration plots, and DCA. Second, the participants in the present study came from a nationally representative longitudinal survey, making the model applicable to Chinese populations. Additionally, all variables in the model are readily available once participants have had routine blood tests and liver and renal function tests, which are common in regular physical examinations. Many people undergo a physical examination every year, and they could use the nomogram to predict their risk of CKD. Those at higher risk of CKD should take the first step in health self-management, such as managing blood pressure and anemia, to prevent CKD or slow the decline of renal function.

The present study had some limitations. Although CHARLS is a nationwide multi-center study, the present analysis included only 3562 of 14,574 participants, mainly because key information was missing. The exclusion of such a large share of participants may have biased our findings. In addition, the nomogram was only internally validated, and further studies are needed to validate it externally in other countries.

Conclusions

In conclusion, we constructed and internally validated a nomogram, composed of gender, hypertension, eGFR, hemoglobin, and Cystatin C, to predict the 4-year risk of CKD among the elderly Chinese population. Given its good performance in the evaluation, the nomogram may help to identify individuals at increased risk for CKD.