Introduction

Nonalcoholic fatty liver disease (NAFLD) has long been deemed a Western disease and is now increasingly prevalent in Asian countries. The pooled NAFLD prevalence in Asian countries is approximately 25.0%, while in China, the prevalence during the last decade has been estimated to be 15–40% [1]. NAFLD has emerged as the most common chronic liver disease, placing a significant burden on healthcare systems worldwide [2].

Early detection of NAFLD is essential to identify those with potentially silent progressive fatty liver disease. Diagnostic practice varies and draws on clinical, biochemical and radiographic results [3]. Among all diagnostic tests, liver biopsy remains the gold standard for NAFLD diagnosis, particularly for nonalcoholic steatohepatitis (NASH) [4]. However, liver biopsy is impractical for screening because it is invasive and costly. Therefore, there is a need to develop an efficient clinical prediction model to screen for NAFLD with high sensitivity and specificity. Such a screening tool could be widely adopted in primary, secondary, and tertiary care centers for early NAFLD detection.

Several anthropometric and metabolic parameter-based models for diagnosing NAFLD have been described in previous literature [5,6,7,8]. Of these, the fatty liver index (FLI) has shown good performance in the detection of NAFLD in different populations [9, 10]. Other models, including the NAFLD liver fat score (NAFLD-LFS) [7], the hepatic steatosis index (HSI) [8], the triglycerides and glucose (TyG) index [11] and the visceral adiposity index (VAI) [12], were also commonly used for NAFLD screening.

A nomogram is a graphical presentation format of a disease-specific prediction model using different clinical data. These models are useful in the early detection of high-incidence diseases and can be easily installed on a computer in an office setting or potentially in inpatient units for clinical use [13]. However, the application of nomograms for NAFLD is rare [14]. Our study aimed to develop a novel clinical and laboratory-based nomogram (CLN) to accurately detect NAFLD in the Chinese population.

Methods

Subjects

A cross-sectional study was conducted among adults (18–75 years old) who presented for their annual health examinations at the First Affiliated Hospital of Zhejiang University School of Medicine in 2014 and 2016. The data were extracted retrospectively from the health examination database. Enrollment was limited to participants who had full records of anthropometric and metabolic data, as well as results of hepatic ultrasonography examination. Participants were excluded if they (1) were taking antihypertensive agents, antidiabetic agents, lipid-lowering agents or uric acid-lowering agents; (2) consumed > 140 g/week of alcohol for men or > 70 g/week for women; (3) had a history of other known causes of chronic liver disease such as viral hepatitis or autoimmune hepatitis; or (4) were using hepatotoxic medications (e.g., sulfonamides and azithromycin). A total of 16,468 participants from 2014 were included in the training dataset, while 5000 participants from 2016 were assigned to the validation dataset. The personal information of each participant was anonymized at collection, prior to analysis. The study was approved by the Ethics Committee of the First Affiliated Hospital of Zhejiang University School of Medicine.

Clinical information and questionnaire

The study data included five parts: medical history, questionnaire, anthropometric measurements, blood biochemical measurements, and imaging studies. All medical histories, including current/previous diseases and drug prescriptions, were assessed by the examining physicians. Questions about alcohol intake included the frequency of weekly alcohol consumption and the usual amount of daily intake. Body mass index (BMI) was calculated as weight (kg) divided by the square of height (m), i.e., in kg/m². Blood pressure, including systolic blood pressure (SBP) and diastolic blood pressure (DBP), was measured on the right arm with participants in a seated position after a 5-min rest. Waist circumference (WC) was measured with the measuring tape positioned midway between the lowest rib and the superior border of the iliac crest as the participant exhaled normally. Blood biochemical measurements were performed according to procedures described previously [15].

Diagnosis of NAFLD by image study

Hepatic ultrasound examination was carried out by trained ultrasonographists who were unaware of the aim of the study and blinded to the laboratory values using a Toshiba Nemio 20 sonography machine (Toshiba, Tokyo, Japan) with a 3.5-MHz probe. Images were captured in a standard fashion, with the patient in the supine position, with the right arm raised above the head. NAFLD was diagnosed, and its degree was assessed according to the criteria described by the Chinese Liver Disease Association [16].

Four existing NAFLD prediction models

$$ {\text{FLI}}:\;\frac{e^{\,0.953 \times \log_{e}({\text{triglycerides}}) + 0.139 \times {\text{BMI}} + 0.718 \times \log_{e}({\text{GGT}}) + 0.053 \times {\text{WC}} - 15.745}}{1 + e^{\,0.953 \times \log_{e}({\text{triglycerides}}) + 0.139 \times {\text{BMI}} + 0.718 \times \log_{e}({\text{GGT}}) + 0.053 \times {\text{WC}} - 15.745}} \times 100. $$
$$ {\text{HSI:}}\;8 \times \frac{{{\text{ALT}}}}{{{\text{AST}}}} + {\text{BMI}}\left( { + 2,\;{\text{if}}\;{\text{type}}\,{ }2\;{\text{diabetes}}; + 2,\;{\text{if}}\;{\text{females}}} \right). $$
$$ \begin{aligned} {\text{VAI}} & :\,\;\left[ {\frac{{{\text{WC}}}}{{39.68 + 1.88 \times {\text{BMI}}}}} \right] \times \left( {\frac{{{\text{triglycerides}}}}{1.03}} \right) \times \left( {\frac{1.31}{{{\text{HDL}}}}} \right),\;\;{\text{for}}\;{\text{males}}; \\ & \;\;\,\left[ {\frac{{{\text{WC}}}}{{36.58 + 1.89 \times {\text{BMI}}}}} \right] \times \left( {\frac{{{\text{triglycerides}}}}{0.81}} \right) \times \left( {\frac{1.52}{{{\text{HDL}}}}} \right),\;\;{\text{for}}\;{\text{females}}. \\ \end{aligned} $$
$$ {\text{TyG index}}:\;\log_{e}\left[ {\text{triglycerides}}\;({\text{mg}}/{\text{dL}}) \times {\text{glucose}}\;({\text{mg}}/{\text{dL}}) / 2 \right]. $$
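
As a worked illustration, the four indices above can be sketched in Python as follows. The function and argument names are ours and are not part of the original analysis. The units for the FLI (triglycerides in mg/dL, GGT in U/L, WC in cm) and for the VAI (triglycerides and HDL in mmol/L) are assumptions based on how these indices are usually reported; the text above states units only for the TyG index.

```python
import math

def fli(tg_mg_dl, bmi, ggt_u_l, wc_cm):
    """Fatty liver index on a 0-100 scale (units are assumptions, see lead-in)."""
    z = (0.953 * math.log(tg_mg_dl) + 0.139 * bmi
         + 0.718 * math.log(ggt_u_l) + 0.053 * wc_cm - 15.745)
    # e^z / (1 + e^z) * 100, written in the equivalent logistic form
    return 100.0 / (1.0 + math.exp(-z))

def hsi(alt, ast, bmi, diabetes=False, female=False):
    """Hepatic steatosis index: 8 * ALT/AST + BMI (+2 if diabetes, +2 if female)."""
    return 8.0 * alt / ast + bmi + (2.0 if diabetes else 0.0) + (2.0 if female else 0.0)

def vai(wc_cm, bmi, tg_mmol_l, hdl_mmol_l, female=False):
    """Visceral adiposity index with sex-specific constants."""
    if female:
        return wc_cm / (36.58 + 1.89 * bmi) * (tg_mmol_l / 0.81) * (1.52 / hdl_mmol_l)
    return wc_cm / (39.68 + 1.88 * bmi) * (tg_mmol_l / 1.03) * (1.31 / hdl_mmol_l)

def tyg(tg_mg_dl, glucose_mg_dl):
    """Triglycerides and glucose index: ln(TG * glucose / 2), both in mg/dL."""
    return math.log(tg_mg_dl * glucose_mg_dl / 2.0)
```

Each function takes one participant's measurements and returns a single score, so the four indices can be computed side by side from the same record.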

Statistical analysis

Continuous variables were expressed as the mean ± standard deviation or as the median with interquartile range, while categorical variables were expressed as frequencies and proportions. Student’s t test or the Mann–Whitney U test was used for comparisons of continuous data with or without normal distribution, respectively, while the chi-square test was used for comparisons of categorical variables. Univariate and multivariate analyses were performed to identify independent factors strongly associated with NAFLD. Logistic regression models were fitted to estimate the odds ratio (OR) and 95% confidence interval (CI).

Based on the logistic regression results, six parameters were selected to construct a nomogram, which allowed us to predict a patient’s probability of having NAFLD. Calibration curves were created to assess the prediction accuracy of the nomogram. The area under the receiver operating characteristic curve (AUROC) was used as a measure of diagnostic accuracy. Decision curve analysis (DCA) was performed to evaluate the net benefit, namely whether the application of the new model does more good than harm. For all analyses, a p value of less than 0.05 was considered statistically significant. All statistical analyses were performed using R software (version 3.4.1, https://www.Rproject.org) and MedCalc Software (version 12.7, Ostend, Belgium).
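
To make the AUROC concrete, the following minimal Python sketch computes it as the probability that a randomly chosen NAFLD case receives a higher score than a randomly chosen non-case, with ties counted as one half. This is the standard rank-based definition and is not the R or MedCalc routine used in the study; the function name is illustrative.

```python
def auroc(scores, labels):
    """Rank-based AUROC: P(score of a case > score of a non-case), ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("both classes must be present")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

The double loop is quadratic in the sample size, which is adequate for illustration; a sort-based implementation would be preferable for datasets of the size analyzed here.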

Results

Participant characteristics

In this study, 16,468 participants were enrolled in the training dataset, with a mean age of 45.64 years. Of the 16,468 subjects, 6261 (38.02%) were diagnosed with NAFLD by ultrasound examination. Among the 6261 subjects with NAFLD, 4614 were male and 1647 were female. Participants with NAFLD tended to be older and male and had a higher BMI. In addition, NAFLD patients had a higher prevalence of central obesity, hypertriglyceridemia, low HDL-C, elevated blood pressure, and elevated FBG, which are the five components of metabolic syndrome (MetS). Other metabolic parameters that differed significantly between NAFLD and non-NAFLD participants are shown in Table 1.

Table 1 Clinical features of participants

In the validation dataset, the mean age was 45.81 years, and 50% were male. Of the 5000 subjects, 1759 (35.18%) were diagnosed with NAFLD. The clinical features of the validation dataset were similar to those of the training dataset (Supplement Table 1).

CLN development

The results of logistic regression analysis among variables related to NAFLD are given in Table 2. Six independent predictors with the best performance, namely BMI, DBP, uric acid (UA), fasting blood glucose (FBG), triglycerides (TG), and alanine aminotransferase (ALT), were incorporated into our model and presented as the nomogram (Fig. 1).

Table 2 Univariate and multivariate analyses for the prediction of NAFLD
Fig. 1 Nomogram for predicting NAFLD

Each variable was assigned a score ranging from 0 to 100 on a point scale. By calculating the total score of various covariates and placing the total score on a total point scale, the probability of NAFLD could be efficiently estimated.
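
The scoring procedure can be sketched as follows, assuming the usual construction in which the predictor with the largest possible contribution to the logistic linear predictor spans the full 0 to 100 point axis and the other predictors are scaled relative to it. The coefficients, intercept and axis ranges are inputs that would have to come from the fitted model (Table 2 and Fig. 1); none are reproduced here, and all names below are illustrative.

```python
import math

def _reference(beta, lo, hi):
    # Points are measured from the end of the axis with the lower predicted risk.
    return lo if beta >= 0 else hi

def _span(coefs, ranges):
    # Largest single-predictor contribution to the linear predictor = 100 points.
    return max(abs(b) * (ranges[k][1] - ranges[k][0]) for k, b in coefs.items())

def nomogram_points(coefs, ranges, values):
    """Points per predictor. coefs: name -> beta; ranges: name -> (lo, hi)."""
    span = _span(coefs, ranges)
    return {k: 100.0 * b * (values[k] - _reference(b, *ranges[k])) / span
            for k, b in coefs.items()}

def probability_from_points(total_points, intercept, coefs, ranges):
    """Map a total score back to the predicted probability of the outcome."""
    span = _span(coefs, ranges)
    offset = sum(b * _reference(b, *ranges[k]) for k, b in coefs.items())
    linear_predictor = intercept + offset + total_points * span / 100.0
    return 1.0 / (1.0 + math.exp(-linear_predictor))
```

Summing the values returned by `nomogram_points` and passing the total to `probability_from_points` gives the same probability as evaluating the logistic model directly, which is the property that makes the graphical lookup valid.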

Accuracy of NAFLD diagnosis

Sensitivity and specificity were calculated to compare the prediction ability of the five models, including the newly built CLN, the FLI, the VAI, the HSI and the TyG index. Figure 2a shows the ROC curves of all five models. Details of the performance are shown in Table 3. The nomogram had the highest AUROC (0.857, 95% CI 0.852–0.863, p < 0.001) for predicting NAFLD compared with AUROCs of the FLI (0.849, 95% CI 0.843–0.855, p < 0.001), the VAI (0.752, 95% CI 0.745–0.760, p < 0.001), the HSI (0.828, 95% CI 0.822–0.834, p < 0.001) and the TyG index (0.774, 95% CI 0.767–0.781, p < 0.001). The sensitivity and specificity of the nomogram were 79.60% and 76.90%, respectively, while the cut-off was 0.37.
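
For reference, sensitivity and specificity at a given cut-off can be computed from predicted probabilities as in the sketch below. How the reported cut-offs were chosen is not stated in the text; the second function assumes the commonly used Youden index (sensitivity + specificity - 1) purely for illustration, and all names are ours.

```python
def sensitivity_specificity(probs, labels, cutoff):
    """Classify as positive when prob >= cutoff; return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        positive = p >= cutoff
        if y == 1:
            if positive:
                tp += 1
            else:
                fn += 1
        else:
            if positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)

def youden_cutoff(probs, labels):
    """Cut-off among the observed probabilities that maximizes the Youden index."""
    best_j, best_c = None, None
    for c in sorted(set(probs)):
        se, sp = sensitivity_specificity(probs, labels, c)
        j = se + sp - 1.0
        if best_j is None or j > best_j:
            best_j, best_c = j, c
    return best_c
```

When several cut-offs tie on the Youden index, this sketch returns the lowest one, which favors sensitivity and is a reasonable default for a screening tool.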

Fig. 2 Receiver operating characteristic (ROC) curves for predicting NAFLD. a Training dataset and b validation dataset

Table 3 Performance assessment of the developed nomogram and other scoring systems (FLI, VAI, HSI and TyG index) for the prediction of NAFLD

In the validation dataset, the nomogram also had the highest AUROC (0.861, 95% CI 0.850–0.871, p < 0.001) for predicting NAFLD compared with the AUROCs of the FLI (0.854, 95% CI 0.844–0.865, p < 0.001), the VAI (0.766, 95% CI 0.753–0.780, p < 0.001), the HSI (0.839, 95% CI 0.828–0.850, p < 0.001), and the TyG index (0.783, 95% CI 0.769–0.796, p < 0.001). The sensitivity and specificity of the nomogram were 75.50% and 80.70%, respectively, while the cut-off was 0.40 (Fig. 2b and Supplement Table 2).

The calibration curves of the nomogram for predicting NAFLD demonstrated good agreement between predicted and observed probabilities in both the training dataset and the validation dataset (Fig. 3a, b). The closer the calibration curve is to the diagonal line, the higher the prediction accuracy of the model.
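
The idea behind a calibration curve can be illustrated with a simple grouped version: participants are sorted by predicted probability, split into equally sized groups, and the mean predicted probability in each group is compared with the observed NAFLD proportion. This is a simplified stand-in and not the bootstrap-based procedure used for Fig. 3; the function name and the default of ten groups are assumptions.

```python
def calibration_bins(probs, labels, n_bins=10):
    """Return (mean predicted probability, observed event rate) per quantile group."""
    pairs = sorted(zip(probs, labels))
    n = len(pairs)
    points = []
    for b in range(n_bins):
        group = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not group:
            continue  # fewer participants than groups
        mean_pred = sum(p for p, _ in group) / len(group)
        observed = sum(y for _, y in group) / len(group)
        points.append((mean_pred, observed))
    return points
```

For a well-calibrated model the returned pairs lie close to the line where the two values are equal, which corresponds to the diagonal in the calibration plot.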

Fig. 3 The performance of the nomogram was assessed by calibration curves (1000 bootstrap resamples) in the training dataset (a) and the validation dataset (b). Nomogram-predicted probability of NAFLD is plotted on the x-axis; actual probability is plotted on the y-axis. The clinical utility of the nomogram was evaluated by decision curves in the training dataset (c) and the validation dataset (d). The y-axis represents net benefits, calculated by subtracting the relative harms (false positives) from the benefits (true positives). The x-axis measures the threshold probability

DCA of the clinical utility of the CLN

DCA was conducted to evaluate the clinical utility of the CLN by quantifying the net benefit at threshold probabilities from 0.0 to 1.0. The farther the decision curve is from the two extreme curves (treating all and treating none), the higher the clinical decision net benefit of the model. The decision curves (Fig. 3c, d) demonstrated a higher net benefit of the nomogram than of the other four models (FLI, VAI, HSI and TyG index) in both the training dataset and the validation dataset. This result suggests that the CLN may perform comparably to abdominal ultrasound in identifying NAFLD. Since the CLN does not require abdominal ultrasound, it can be used as a screening tool to decide whether a particular participant needs further abdominal ultrasound to confirm the diagnosis of NAFLD.
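
The net benefit plotted in a decision curve is conventionally defined, at a threshold probability pt, as TP/n - (FP/n) × pt/(1 - pt), where TP and FP are the numbers of true and false positives when everyone with a predicted probability of at least pt is classified as positive. The sketch below implements this standard definition together with the "treat all" reference curve; it is an illustration with names of our choosing, not the code used to produce Fig. 3.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of the model at a threshold probability in [0, 1)."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1.0 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Reference curve: every participant is classified as positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)
```

The "treat none" reference is zero at every threshold, so a model is useful over the range of thresholds where its net benefit exceeds both zero and the "treat all" value.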

Discussion

As a result of overnutrition and reduced physical activity, the prevalence of NAFLD continues to increase in China [17]. NAFLD encompasses the entire spectrum of fatty liver disease, from nonalcoholic fatty liver (NAFL) to NASH, which can ultimately lead to cirrhosis and hepatocellular carcinoma [18]. NAFLD has long been and will continue to be a major challenge for healthcare systems. Early diagnosis and prevention are critical in managing NAFLD patients.

Clinically, most NAFLD patients are diagnosed incidentally [19] when having imaging studies for other medical illnesses or more often during annual physical examinations. In the latter case, clinicians’ suspicion of NAFLD is critical since abdominal ultrasound is not routinely ordered. In contrast to imaging studies, BMI, arterial blood pressure, UA, FBG, TG, ALT, etc. are monitored routinely during the annual physical examination.

Developing a simple and practical diagnostic tool is therefore particularly important. By analyzing clinical and laboratory variables, we were able to construct a practical nomogram, a prediction model that can be easily used by physicians in the office setting, i.e., during physical examination. The CLN includes six parameters: two physical examination measures (BMI, DBP) and four blood tests (UA, FBG, TG, ALT). Subjects with a high probability of NAFLD can then be recommended for further examination, i.e., an imaging study. This approach provides a practical tool for screening subjects with a potential risk of NAFLD.

This study developed a simple CLN for predicting NAFLD in a large Chinese population. The AUROC of the CLN was 0.857 (95% CI 0.851–0.863), which was higher than those of the FLI, the VAI, the HSI and the TyG index (Fig. 2). The FLI is a biochemical assessment that was proposed based on the Italian population [6], while the HSI was derived from a Korean study involving more than 10,000 subjects. Currently, the FLI is widely used for predicting NAFLD in different populations [20,21,22,23,24]. In our study, the FLI maintained good performance but was slightly inferior to the CLN in terms of AUROC. The HSI had a performance comparable to that of the FLI. However, the HSI, a Korean population-derived model, has limited external validity and is not as widely applied as the FLI. The comparable performance of the HSI in our study might be a result of similarities in ethnicity and diet. The VAI is an indicator of visceral adipose dysfunction, which is associated with cardiovascular events and inversely correlated with insulin sensitivity [12]. The VAI was demonstrated to be associated with histologically defined steatosis [25]. Finally, the TyG index is derived from the product of fasting triglycerides and glucose [11] and, similarly, is an independent predictor of moderate-to-severe steatosis [26]. Both the VAI and the TyG index were inferior to the other three models, including our CLN, which was consistent with the results of previous studies.

The main strength of this study is that we developed a novel individualized NAFLD risk prediction model from a large Chinese population in the annual physical examination setting. The CLN provides a visual approach for clinicians to calculate the potential risk of NAFLD. In addition, this was a head-to-head study comparing the CLN with four other prediction models in the same population, which avoids the bias introduced by comparing models across differently sampled populations. We acknowledge several limitations. First, in this retrospective single-center study, the enrolled subjects were mostly office workers, who are known to have sedentary lifestyles and to lack exercise. This selection bias could partially explain the relatively high NAFLD prevalence in our study. Therefore, the results from this study may not be representative of the general Chinese population. Second, abdominal ultrasonography served as the reference standard in building this diagnostic CLN. It is well known that ultrasonography tends to underestimate hepatic steatosis. Thus, the CLN may underdetect or underdiagnose NAFLD. Third, the data were extracted retrospectively from a large database. The accuracy and intraobserver/interobserver concordance of the ultrasound readings were not assessed, which is another limitation of this study. Last, the CLN cannot be applied to patients on medications for metabolic disorders, as these patients were excluded from this study. Although the CLN had good performance in this study, its sensitivity and specificity remain to be improved. A multicenter study including participants from different regions and with various occupations is needed for further external validation.

In conclusion, our study developed a novel clinical and laboratory-based nomogram with relatively good predictive ability for screening NAFLD in a large Chinese population. Clinicians can provide individualized plans to the subjects according to the risk assessment. Individuals with high risk should be referred for other diagnostic tests to confirm NAFLD, which will result in early lifestyle and medical interventions and prevent disease progression.