Introduction

Non-alcoholic fatty liver disease (NAFLD) is the most common liver disease in many developed countries, and it is strongly associated with metabolic syndrome [1]. NAFLD encompasses a wide range of conditions, including non-alcoholic fatty liver (NAFL), non-alcoholic steatohepatitis (NASH), liver cirrhosis, and hepatocellular carcinoma [2, 3]. Progression of liver fibrosis rather than liver steatosis was reported to contribute to mortality from cardiovascular disease and liver-related disease in NAFLD [4, 5]; thus, early detection and estimation of liver fibrosis is important.

Percutaneous liver biopsy is the gold standard in the diagnosis of NAFLD, and it can be used to estimate the degree of fatty change, inflammation, and fibrotic changes in the liver. However, its disadvantages include the risk of complications because of the invasiveness of the procedure, sampling errors, and high cost [6]. A possible solution for these drawbacks is a non-invasive scoring system, and several scoring systems based on clinical data and laboratory findings have been proposed to estimate the presence of liver fibrosis. Of these scoring systems, the FIB4 index and NAFLD fibrosis score (NFS) are the most useful for predicting advanced stage liver fibrosis (Brunt stage; F = 3,4) in NAFLD patients. The area under the receiver operating characteristic curve (AUROC) for predicting advanced fibrosis in NAFLD using the FIB4 index and NFS has been reported to be 0.86 and 0.88, respectively [7, 8]. The American Association for the Study of Liver Diseases (AASLD) issued practice guidance for NAFLD in 2017, which states that the NFS and FIB4 index are clinically more useful tools for identifying NAFLD patients with a higher likelihood of having bridging fibrosis (stage 3) or cirrhosis (stage 4) than other scores such as BARD, APRI, and the AST/ALT ratio [9]. In Japan, the FIB4 index was reported to reflect liver fibrosis more accurately than the NFS [10]. The FIB4 index was originally proposed for the estimation of significant liver fibrosis in HCV and HIV patients by Sterling et al. [11]. Shah et al. subsequently set cutoff points of the FIB4 index to predict advanced liver fibrosis in NAFLD patients; the low cutoff point (LCO) was 1.30 and the high cutoff point (HCO) was 2.67 [12]. Currently, these cutoff points are used to detect or exclude advanced fibrosis. In daily clinical practice, the FIB4 index may be more convenient because of its high accuracy and the small number of variables in its formula.

These scoring systems are useful in detecting advanced fibrosis, but there may be some underlying problems as a result of including an age factor in their formulae. For the FIB4 index, the scores are expected to increase with age, although the cutoff values are fixed at 1.30 (LCO) and 2.67 (HCO) in the FIB4 index to estimate advanced fibrosis. Additionally, although these cutoff values were proposed by Shah et al. based on their study population, the mean ± standard deviation of age was 47 ± 12 years in non-NASH patients and 48 ± 12 years in NASH patients, and these mean values were lower than in the general NAFLD patient population. Thus, the reduced accuracy of the FIB4 index when estimating liver fibrosis in older NAFLD patients is concerning. Currently, we estimate advanced fibrosis using these fixed cutoff values, but there are some potential risks that may result in a misdiagnosis in older NASH patients with advanced fibrosis.

In the present study, therefore, we evaluated whether the conventional FIB4 index cutoff values reflected liver fibrosis in different age groups; in addition, we defined appropriate cutoff values of the FIB4 index for each age group and investigated whether the new cutoff values were effective in detecting liver fibrosis in NAFLD patients.

Methods

Patients

This study was a cross-sectional study to estimate the accuracy of the FIB4 index in the diagnosis of liver fibrosis in NAFLD patients when the FIB4 index was categorized by age. In this study, 1050 patients were enrolled in the Japan Study Group of NAFLD (JSG-NAFLD) and the patients all underwent a liver biopsy from 2002 to 2015. The JSG-NAFLD included 11 Japanese hepatology centers as follows: Kyoto Prefectural University of Medicine; Nara City Hospital; Yokohama City University Graduate School of Medicine; Hiroshima University; Kochi Medical School; Saga University; Osaka City University; Asahikawa Medical College; Saiseikai Suita Hospital.

All patients in this study were divided into four groups according to their age: 49 years or younger (≤ 49 years), between 50 and 59 years (50–59 years), between 60 and 69 years (60–69 years), and 70 years or older (≥ 70 years).

The diagnosis of NAFLD was based on the reported criteria [13] and appropriately excluded liver disease with clear etiologies, such as viral and autoimmune hepatitis and alcoholic liver disease. Written informed consent was obtained from all patients at the time of their liver biopsy. This study was approved by the Institutional Review Board of all participating institutions and it conformed to the ethical guidelines of the 1975 Declaration of Helsinki.

Covariates

Physical characteristics and clinical laboratory data were examined for each patient. The physical factors included age, sex, and body mass index (BMI), and the clinical characteristics included a confirmed diagnosis of hypertension, diabetes mellitus (DM), and dyslipidemia. Blood samples were taken in the morning after a 12-h overnight fast, and the clinical laboratory parameters were measured using standard techniques in clinical laboratories. BMI was calculated as weight in kilograms divided by the square of height in meters. Obesity was defined as BMI > 25.0 kg/m², according to the criteria of the Japan Society for the Study of Obesity [14]. Patients were assigned a diagnosis of DM when there was documented use of oral hypoglycemic medication, when their random glucose level exceeded 200 mg/dL, or when they had a fasting plasma glucose (FPG) level > 126 mg/dL and HbA1c ≥ 6.6% [15]. Dyslipidemia was diagnosed if the cholesterol level was > 220 mg/dL and/or the LDL cholesterol level was > 140 mg/dL and/or the HDL cholesterol level was < 40 mg/dL and/or the triglyceride level was > 160 mg/dL. Hypertension was diagnosed if the patient was on antihypertensive medication and/or had a resting recumbent blood pressure ≥ 140/90 mmHg on at least two occasions.
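The BMI, obesity, and dyslipidemia definitions above can be written out as a short sketch. This is an illustration only, not the procedure used in the study: the function and parameter names are hypothetical, and "cholesterol level" is assumed here to mean total cholesterol.

```python
def body_mass_index(weight_kg, height_m):
    """BMI = weight (kg) divided by the square of height (m)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def is_obese(bmi):
    """Obesity threshold quoted in the text: BMI > 25.0."""
    return bmi > 25.0


def has_dyslipidemia(total_chol, ldl_chol, hdl_chol, triglycerides):
    """Any one lipid criterion (all in mg/dL) is sufficient.

    Assumption: 'cholesterol level' in the text is total cholesterol.
    """
    return (
        total_chol > 220
        or ldl_chol > 140
        or hdl_chol < 40
        or triglycerides > 160
    )


# Hypothetical patient: 70 kg, 1.60 m, HDL cholesterol slightly low.
bmi = body_mass_index(70, 1.60)
print(round(bmi, 1), is_obese(bmi), has_dyslipidemia(200, 120, 38, 150))
```

The diabetes and hypertension criteria are left out of the sketch because they depend on medication records and repeated measurements rather than on a single value.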

Liver histology and the diagnosis of NAFLD and NASH

All patients enrolled in this study underwent a percutaneous liver biopsy under ultrasonic guidance. The liver specimens were embedded in paraffin and stained with hematoxylin and eosin, Masson trichrome, and reticulin silver stain. Three pathologists (S.I., K.S., and Y.S.), who were blinded to all clinical and identifying data, reviewed the liver biopsy specimens. An adequate liver biopsy sample was defined as a biopsy specimen with length > 1.5 cm and/or having > 6 portal tracts. NASH was defined as steatosis with lobular inflammation and ballooning degeneration, with or without Mallory–Denk bodies or fibrosis. Patients whose liver biopsy specimens showed steatosis, or steatosis with non-specific inflammation, were identified as the NAFL cohort [2, 3]. The severity of hepatic fibrosis (stage) was defined as Stage 1, zone 3 perisinusoidal fibrosis; Stage 2, zone 3 perisinusoidal fibrosis with portal fibrosis; Stage 3, zone 3 perisinusoidal fibrosis and portal fibrosis with bridging fibrosis; or Stage 4, cirrhosis [16]. Advanced fibrosis was defined as Stage 3 or higher. The scoring of steatosis included both microvesicular and macrovesicular steatosis and was based on the percentage area of the parenchyma that was fatty: < 33% was considered mild, 33–65% moderate, and > 66% advanced [13].

Non-invasive fibrosis marker

The FIB4 index was assessed in each patient. It was calculated using the following formula: $\mathrm{FIB4} = \dfrac{\text{age (years)} \times \text{AST}}{\text{platelet count } (\times 10^{9}/\text{L}) \times \text{ALT}^{1/2}}$. In Japanese NAFLD patients, the LCO and HCO of 1.30 and 2.67 were applied to estimate advanced fibrosis in NAFLD patients [10, 17].
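As a minimal sketch of this calculation, the formula and the conventional cutoffs can be expressed as follows. The function names, the category labels, and the handling of values exactly equal to a cutoff are assumptions made for illustration; AST and ALT are assumed to be in U/L.

```python
import math

CONVENTIONAL_LCO = 1.30  # low cutoff point quoted in the text
CONVENTIONAL_HCO = 2.67  # high cutoff point quoted in the text


def fib4_index(age_years, ast, alt, platelets):
    """FIB4 = (age x AST) / (platelet count x sqrt(ALT)).

    platelets is the platelet count in units of 10^9/L;
    ast and alt are assumed to be in U/L.
    """
    if alt <= 0 or platelets <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast) / (platelets * math.sqrt(alt))


def classify_conventional(fib4):
    """Three-way classification with the fixed cutoffs.

    Assumption: values exactly equal to a cutoff are treated as indeterminate.
    """
    if fib4 < CONVENTIONAL_LCO:
        return "advanced fibrosis unlikely"
    if fib4 > CONVENTIONAL_HCO:
        return "advanced fibrosis likely"
    return "indeterminate"


# Hypothetical patient: 60 years, AST 40, ALT 36, platelets 200 x 10^9/L.
score = fib4_index(60, 40, 36, 200)  # (60 * 40) / (200 * 6) = 2.0
print(score, classify_conventional(score))
```

Because age multiplies the numerator directly, the same laboratory values give a proportionally higher score in an older patient, which is the issue the age-stratified analysis addresses.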

Statistical analysis

Data are expressed as medians and ranges for quantitative data or as numbers of patients with percentages in parentheses for qualitative data. Statistical differences between two groups or among three or more groups were analyzed using the Mann–Whitney U test and the Kruskal–Wallis test, respectively, for quantitative data, and using Fisher's exact test or the chi-square test for qualitative data. Normality was assessed using the Shapiro–Wilk test. We defined the new cutoffs for the FIB4 index using ROC analysis in each age group. We defined the high cutoff point as the value giving a specificity of 0.90 and the low cutoff point as the value giving a sensitivity of 0.90. Accuracy was calculated as the proportion of true positives and true negatives. Validation was performed by calculating kappa statistics with tenfold cross-validation in each age group. All analyses were performed using JMP® 12 (SAS Institute Inc., Cary, NC, USA). Nominal, two-sided p values were used, and values < 0.05 were considered statistically significant a priori.
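The cutoff definitions above can be sketched in a few lines. The study itself used JMP, so the code below is not the authors' procedure; it is one plausible reading in which a score at or above the threshold counts as test-positive, the low cutoff is the largest observed value that still reaches the target sensitivity, and the high cutoff is the smallest observed value that reaches the target specificity. All names are hypothetical, and both outcome classes are assumed to be present.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is test-positive.

    labels: 1 (or True) for advanced fibrosis, 0 (or False) otherwise.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def select_cutoffs(scores, labels, target=0.90):
    """Return (low_cutoff, high_cutoff) from the observed score values.

    Sensitivity can only fall and specificity can only rise as the
    threshold increases, so one ascending pass is enough.
    high_cutoff is None if no observed value reaches the target specificity.
    """
    low_cutoff, high_cutoff = None, None
    for t in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, t)
        if sens >= target:
            low_cutoff = t  # keep the largest threshold meeting the target
        if spec >= target and high_cutoff is None:
            high_cutoff = t  # first (smallest) threshold meeting the target
    return low_cutoff, high_cutoff


# Made-up scores for ten hypothetical patients (not study data).
scores = [0.5, 0.8, 1.0, 1.2, 1.5, 1.8, 2.0, 2.5, 3.0, 3.5]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
print(select_cutoffs(scores, labels))
```

Running the same selection separately within each age group would yield age-specific low and high cutoffs.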

Results

Characteristics of NAFLD patients categorized by age

The patients were categorized into four age groups: ≤ 49 years (n = 395), 50–59 years (n = 217), 60–69 years (n = 270), and ≥ 70 years (n = 168). As age increased, the prevalence of NASH and the proportion of women increased, whereas the BMI, ALT level, and platelet count declined. In addition, as age increased, the proportion of patients with advanced fibrosis also increased (Table 1). The FIB4 index increased with age (correlation coefficient: 0.771, p < 0.001). Box plots of the FIB4 index in each age group, according to the presence or absence of advanced fibrosis, are shown in Fig. 1. The median (upper quartile, lower quartile) FIB4 index in patients without and with advanced fibrosis was 0.74 (0.98, 0.53) and 1.51 (2.34, 1.17) in ≤ 49 years, 1.16 (1.59, 0.90) and 2.38 (3.70, 1.62) in 50–59 years, 1.80 (2.41, 1.36) and 3.27 (4.04, 2.72) in 60–69 years, and 2.56 (3.29, 1.99) and 4.49 (6.51, 2.90) in ≥ 70 years, respectively. The purpose of the LCO is to exclude advanced fibrosis and that of the HCO is to identify it; however, with the conventional cutoff points, the ability to exclude advanced fibrosis declined with increasing age and the ability to identify advanced fibrosis declined with decreasing age.

Table 1 Characteristics of NAFLD patients categorized by age groups
Fig. 1 Box plot of the FIB4 index in each age group according to the presence or absence of advanced fibrosis. Absence: F = 0–2; presence: F = 3–4

ROC curve analysis

ROC curves were developed for each age group (Fig. 2). The ROC curves were calculated to estimate the utility of the FIB4 index in the prediction of advanced fibrosis (stages 3, 4 vs. lower stages) in the different age groups, which was the clinical question of interest in this study. The area under the ROC curve (AUROC) was 0.917 for ≤ 49 years, 0.849 for 50–59 years, 0.855 for 60–69 years, and 0.779 for ≥ 70 years. As age increased, the AUROC decreased. The AUROC for ≥ 70 years was significantly lower than that for ≤ 49 years (p = 0.013). This showed that the diagnostic performance of the FIB4 index decreased as age increased.

Fig. 2 Comparison of the receiver operating characteristic curves for the FIB4 index, categorized according to age group. Age groups: ≤ 49, 50–59, 60–69, and ≥ 70 years

Clinical utility of the FIB4 index categorized by age for the prediction of advanced fibrosis

The sensitivity and specificity of the conventional low and high cutoff points were assessed (Table 2). With the conventional low cutoff point, which was fixed at 1.30, the sensitivity increased as age increased [0.636 (7/11) in ≤ 49 years, 0.897 (26/29) in 50–59 years, 1.000 (43/43) in 60–69 years, and 1.000 (40/40) in ≥ 70 years], and the efficacy of exclusion of advanced fibrosis seemed to improve as age increased. The false-positive rate, however, increased remarkably as age increased. Similarly, with the conventional high cutoff point, which was fixed at 2.67, the specificity decreased as age increased. In ≤ 49 and 50–59 years, the specificity was high [0.989 (375/379) and 0.972 (180/185), respectively], but the false-negative rates were also remarkably high [0.818 (9/11) and 0.621 (18/29), respectively].
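The relationship between the counts and rates quoted above can be checked with simple arithmetic. The sketch below uses only counts that appear in the text; the variable names are hypothetical, and the complementary quantities (sensitivity at the high cutoff, false-positive rate) are derived here rather than reported.

```python
def proportion(numerator, denominator):
    """Plain ratio, e.g. true positives / all patients with advanced fibrosis."""
    return numerator / denominator


# Conventional low cutoff (1.30): sensitivity = detected / all advanced fibrosis.
sens_lco_le49 = proportion(7, 11)
sens_lco_50_59 = proportion(26, 29)

# Conventional high cutoff (2.67) in <= 49 years.
fn_rate_hco_le49 = proportion(9, 11)      # missed / all advanced fibrosis
sens_hco_le49 = 1 - fn_rate_hco_le49      # sensitivity = 1 - false-negative rate
spec_hco_le49 = proportion(375, 379)      # correctly negative / all non-advanced
fp_rate_hco_le49 = 1 - spec_hco_le49      # false-positive rate = 1 - specificity

print(round(sens_lco_le49, 3), round(sens_lco_50_59, 3))
print(round(fn_rate_hco_le49, 3), round(sens_hco_le49, 3))
print(round(spec_hco_le49, 3), round(fp_rate_hco_le49, 3))
```

In the youngest group the high cutoff is therefore very specific but identifies only 2 of the 11 patients with advanced fibrosis.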

Table 2 Predictive values of the FIB4 index scores for advanced fibrosis (stage 3–4)

ROC analysis was used to determine the modified cutoff points of the FIB4 index that excluded (LCO) and identified (HCO) advanced fibrosis with a sensitivity of 0.90 and a specificity of 0.90, respectively, in each age group. These modified cutoff points (mLCO and mHCO) were as follows: 1.05 and 1.21 in ≤ 49 years; 1.24 and 1.96 in 50–59 years; 1.88 and 3.24 in 60–69 years; and 1.95 and 4.56 in ≥ 70 years. These modified cutoff points increased as age increased. For the modified low cutoff points in ≤ 49 and 50–59 years, the risk of a false-positive result was increased, but there were fewer or a similar number of false-negative results compared with the conventional low cutoff point. In 60–69 and ≥ 70 years, the false-negative rate was fixed at 0.100, and the false-positive rate decreased markedly. The accuracy of detecting the absence of advanced liver fibrosis was improved compared with the conventional cutoff point. For the modified high cutoff points in ≤ 49 and 50–59 years, the number of false-negative results decreased, but in 60–69 and ≥ 70 years, the number of false-positive results decreased while the number of false-negative results increased. Thus, using the modified high cutoff points, the accuracy of detecting the presence of advanced fibrosis was improved in ≤ 49 and 50–59 years, but not in 60–69 and ≥ 70 years.

Clinical utility of the new cutoff points proposed to predict advanced fibrosis

Based on our results, we proposed new cutoff points for the FIB4 index categorized by age group: a low cutoff point and a high cutoff point, which are 1.05 and 1.21 for ≤ 49 years, 1.24 and 1.96 for 50–59 years, 1.88 and 2.67 for 60–69 years, and 1.95 and 2.67 for ≥ 70 years, respectively. With these new cutoff points, the rate of indeterminate results was reduced in 50–59, 60–69, and ≥ 70 years, although the rate was almost equal in ≤ 49 years compared with the conventional cutoff points. To validate these new cutoff points, Kappa statistics were calculated using 10-fold cross-validation. The mean Kappa statistics were improved in ≤ 49, 60–69, and ≥ 70 years, although the mean Kappa statistic decreased slightly in 50–59 years.
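The proposed age-specific cutoffs lend themselves to a simple lookup. The sketch below encodes the four pairs of values listed above; the function names, the category labels, the assumption that age is given in whole years, and the treatment of values exactly equal to a cutoff as indeterminate are illustrative choices, not part of the study.

```python
import math

# (upper age bound in years, low cutoff, high cutoff), taken from the text.
PROPOSED_CUTOFFS = [
    (49, 1.05, 1.21),        # <= 49 years
    (59, 1.24, 1.96),        # 50-59 years
    (69, 1.88, 2.67),        # 60-69 years
    (math.inf, 1.95, 2.67),  # >= 70 years
]


def proposed_cutoffs(age_years):
    """Return (low_cutoff, high_cutoff) for an age in whole years."""
    for upper_age, low, high in PROPOSED_CUTOFFS:
        if age_years <= upper_age:
            return low, high
    raise ValueError("age could not be matched to a group")


def classify_by_age(age_years, fib4):
    """Three-way classification using the age-specific cutoffs.

    Assumption: a value equal to a cutoff is treated as indeterminate.
    """
    low, high = proposed_cutoffs(age_years)
    if fib4 < low:
        return "advanced fibrosis unlikely"
    if fib4 > high:
        return "advanced fibrosis likely"
    return "indeterminate"


# Hypothetical examples: the same FIB4 value reads differently by age.
print(classify_by_age(45, 1.5))  # above the high cutoff for <= 49 years
print(classify_by_age(65, 1.5))  # below the low cutoff for 60-69 years
```

A FIB4 index of 1.5 falls between the conventional cutoffs of 1.30 and 2.67 at any age, whereas the age-specific values assign it to a definite category in both of these hypothetical patients.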

Discussion

It was recently reported that mortality in NAFLD patients is related to advanced liver fibrosis rather than steatosis [4, 5]; thus, it is important to diagnose advanced liver fibrosis in NAFLD patients. To avoid the potential complications of liver biopsy, several non-invasive markers of liver fibrosis, such as the FIB4 index and NFS, have been described. Non-invasive fibrosis markers, such as the NFS, have also been reported to be useful for the prediction of mortality [18], and the European Association for the Study of the Liver (EASL) in 2016 and the American Association for the Study of Liver Diseases (AASLD) in 2017 recommended that the FIB4 index and NFS might be confidently used for first-line risk stratification for the exclusion of severe disease [9, 19].

One potential limitation of the use of the FIB4 index or NFS to estimate the degree of liver fibrosis is the inclusion of age in the calculation of the FIB4 index and its effect on the NFS. As shown in Fig. 1, the FIB4 index increases with age. Cutoffs for intima-media thickness (IMT) are adjusted for age when carotid ultrasonography is performed, such that patients are evaluated using compensated values to determine if they will develop atherosclerosis [20]. Similarly, when liver fibrosis is estimated using non-invasive fibrosis markers, we should adjust the appropriate cutoff points according to age to more accurately identify advanced fibrosis. Moreover, because the cutoff points proposed by Shah et al. were defined in a relatively young population, the accuracy of the conventional cutoffs should be determined.

The conventional cutoffs for the FIB4 index used a negative predictive value (NPV) of 90% for the LCO and a positive predictive value (PPV) of 80% for the HCO. The purpose of these cutoff points is to exclude and identify advanced fibrosis for the LCO and HCO, respectively. In this study, Fig. 1 and Table 2 show that the AUROCs were lower and the diagnostic accuracy varied using the fixed cutoffs among the age groups. In this study, the modified cutoff points were set using a sensitivity of 90% and a specificity of 90% for the LCO and HCO, respectively. The conventional cutoffs were calculated using NPV and PPV in the original paper [12], but the modified cutoffs were calculated using sensitivity and specificity because NPV and PPV are influenced by prevalence, and the efficacy of the index should be evaluated using sensitivity and specificity. With this modification to the LCO, the false-negative rate and NPV were improved in ≤ 49 years and were similar in 50–59 years. In ≥ 60 years, the false-positive rate was very high with the conventional cutoff point but, using the modified cutoffs, the false-positive rate was improved. By contrast, using the modified HCO, the false-negative rate was improved in ≤ 59 years but worsened in ≥ 60 years. Therefore, in ≥ 60 years, the conventional HCO was more appropriate.
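The dependence of NPV and PPV on prevalence follows directly from Bayes' rule and can be illustrated numerically. In the sketch below, the sensitivity and specificity are both set to 0.90 for illustration; the prevalence of 0.028 is the value stated for the youngest group, and 0.238 is an approximation derived here from the counts given in the Results for the oldest group (40 of 168), not a figure reported by the authors.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Same test characteristics, two different prevalences of advanced fibrosis.
for label, prevalence in [("<= 49 years", 0.028), (">= 70 years", 0.238)]:
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"{label}: PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With identical sensitivity and specificity, the PPV is roughly 0.21 at the lower prevalence and roughly 0.74 at the higher one, which is why cutoffs tuned to predictive values in one population do not transfer cleanly to groups with a different prevalence.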

Based on these results, we proposed new cutoff points that combine the conventional and modified values (Table 3, Fig. 3). Specifically, the LCO was the modified value at 90% sensitivity, and the HCO was the modified value at 90% specificity for people of ≤ 59 years and the conventional value for people ≥ 60 years. Kappa statistics, which quantify the agreement between the exclusion or identification of advanced fibrosis by these new cutoffs and the definitive diagnosis by liver biopsy, were estimated using 10-fold cross-validation. The accuracy of the diagnosis was better in the ≤ 49, 60–69, and ≥ 70 years age groups, but poorer in the 50–59 years age group. Additionally, the indeterminate rate, representing the proportion of individuals for whom the test is unable to identify or exclude advanced fibrosis because the result is between the two cutoffs, was lower in ≥ 50 years and was similar for both sets of cutoffs in individuals who were ≤ 49 years.
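For readers unfamiliar with the Kappa statistic, a minimal sketch of Cohen's kappa for two sets of labels is given below. It shows only the agreement calculation; the 10-fold cross-validation and the exact way indeterminate results were handled in the study are not reproduced, and the function name and example labels are hypothetical.

```python
def cohen_kappa(predicted, observed):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predicted)
    observed_agreement = sum(p == o for p, o in zip(predicted, observed)) / n
    categories = set(predicted) | set(observed)
    chance_agreement = sum(
        (predicted.count(c) / n) * (observed.count(c) / n) for c in categories
    )
    if chance_agreement == 1.0:
        raise ValueError("kappa is undefined when only one category occurs")
    return (observed_agreement - chance_agreement) / (1 - chance_agreement)


# Made-up labels for ten hypothetical patients: 1 = advanced fibrosis.
by_cutoff = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]  # classification by FIB4 cutoff
by_biopsy = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0]  # histological reference
print(round(cohen_kappa(by_cutoff, by_biopsy), 3))
```

Unlike raw accuracy, kappa discounts the agreement expected by chance, which matters when advanced fibrosis is rare in a group.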

Table 3 Predictive values for advanced fibrosis with the conventional and our proposed cutoff points
Fig. 3 Proposed FIB4 index cutoffs, modified according to age group. When the FIB4 index is below the lower cutoff, patients can be considered not to have advanced fibrosis; when the index is between the low and high cutoff, the patient must be followed up; and when the index is greater than the high cutoff, the patient can be considered to have advanced fibrosis

However, the diagnostic accuracy, which reflects the true positive and true negative rates for all subjects, excluding those in the indeterminate group, was higher when using the new cutoff points, although it was difficult to assess in the ≥ 70 years group in our clinic. The explanation for these findings is that the levels of ALT and the platelet count declined with age in NAFLD patients. Because age is in the numerator of the FIB4 index and both the platelet count and ALT are in the denominator, the index rises with age even in patients without advanced fibrosis; thus, with fixed cutoffs, the proportion of false positives increases with age. Therefore, when diagnosing advanced fibrosis in older NAFLD patients using the FIB4 index, these factors must be taken into consideration. The new cutoffs that we propose herein were effective in identifying advanced fibrosis in elderly NAFLD patients.

The specificity of the FIB4 index for advanced fibrosis was recently reported to decline with age, and was especially low in patients aged ≥ 65 years (35% for the FIB4 index) [21]. The authors of that study used a new cutoff point for this age group, which improved the efficacy with which advanced fibrosis was excluded (when the lower cutoff was 2.0, the specificity was improved from 35 to 70%). Comparing that study with our own, the AUROC in our study indicated better performance in the younger age groups: the AUROCs in ≤ 55 years in the previous study [0.60 in ≤ 35 years, 0.79 in 36–45 years, and 0.77 in 46–55 years] were lower than that in ≤ 49 years in our study. The discrepancy between these studies might be explained by differences in age distribution or patient background; the prevalence of advanced fibrosis was 23.5% and the mean FIB4 index was 1.32 in ≤ 55 years in the previous study, whereas the prevalence of advanced fibrosis was 2.8% and the FIB4 index was 0.74 in ≤ 49 years in our study. These discrepancies may reflect a difference in the prevalence of NASH between the West and Japan.

With respect to the NFS, the AUROC was 0.903 in ≤ 49, 0.870 in 50–59, in 60–69, and 0.813 in ≥ 70 years. The AUROCs decreased with age, with the exception of the ≥ 70 years group (Fig. 4). Using the conventional cutoff points (LCO − 1.455, HCO 0.675), the false-positive rate tended to increase around the LCO and the false-negative rate tended to increase around the HCO in older individuals (Fig. 4, Table S1). The NFS was influenced by age, as was the FIB4 index, but the influence of age on the NFS was smaller because its calculation includes other factors, such as the presence of DM and BMI. As shown in Fig. 4, age adjustment may also be necessary for the NFS to identify advanced fibrosis.

Fig. 4 a Comparison of the ROC curves for the NFS, categorized by age group: ≤ 49, 50–59, 60–69, and ≥ 70 years. b Box plot of the NFS in each age group, according to the absence (F = 0–2) or presence (F = 3,4) of advanced fibrosis

The diagnostic accuracy and efficacy of the FIB4 index for the diagnosis of advanced fibrosis in NAFLD have been assessed in many studies. As revealed in our current study, however, the diagnostic accuracy of the FIB4 index changes with age. Therefore, it is important to be aware of the limitations of the non-invasive fibrosis scores for NAFLD, such as the FIB4 index or NFS, because they are likely to become widely used for the diagnosis of advanced fibrosis, in place of liver biopsy.

In summary, the current study has demonstrated that the FIB4 index is not effective for the diagnosis of advanced fibrosis in NAFLD patients using the conventional cutoff points, and the newly defined cutoff points, which have been adapted for each age group, improve its diagnostic performance. Although the FIB4 index is a simple and inexpensive method for the diagnosis of advanced fibrosis, its limitations must be considered; in particular, its diagnostic accuracy decreases with age in NAFLD patients. The conventional and modified cutoffs were calculated using the same ROC analysis, and the selected relationship between sensitivity and specificity represents a compromise: altering the cutoff points sacrifices either sensitivity or specificity, which represents the main limitation of this study. However, we have described more suitable cutoff points than the conventional values for the diagnosis of advanced fibrosis in NAFLD.