Introduction

Obesity has reached epidemic proportions in developed countries, in parallel with its comorbidities, which mainly include type 2 diabetes (T2D) and insulin resistance (IR) [1]. At all ages, the risk of T2D increases with increasing body weight. In particular, recent data showed that obesity increases the risk of developing T2D at least sixfold, regardless of genetic predisposition to the disease [2]. Obesity and IR are recognized determinants of non-alcoholic fatty liver disease (NAFLD), whose progression can lead to hepatic fibrosis and cirrhosis, and for which a new definition, i.e., metabolic (dysfunction)-associated fatty liver disease (MAFLD), has recently been proposed [3]. The current estimated prevalence of NAFLD is 25% in the general population and rises to 55% in patients with T2D [4]. It is estimated that the prevalence of non-alcoholic steatohepatitis (NASH) will increase by 15–56%, while liver mortality and advanced liver disease will more than double as a result of an aging and growing population [5], making early detection imperative [6, 7].

Liver biopsy is considered the gold standard for assessing NAFLD because it provides direct visualization of fibrosis [8]. However, the high prevalence of the disease and the invasiveness of liver biopsy make non-invasive scores (based on imaging, clinical and/or laboratory variables) attractive for use in clinical practice [9].

Non-invasive biomarkers of liver fibrosis/steatosis have been widely used in studies involving the general population or primary care [10, 11]. These blood-based and parametric indices are easy to implement, allow monitoring over time, are inexpensive, and are well accepted by patients.

The aim of the study was to evaluate the concordance of recently proposed non-invasive fibrosis and steatosis indices in a large population of adult subjects with overweight/obesity. Moreover, although it is recognized that the progression of NAFLD mainly depends on the presence of obesity, IR and T2D, it is also evident that not all subjects with obesity and/or metabolic syndrome (MS) develop steatosis, and that not all subjects with hepatic steatosis progress toward fibrosis or cirrhosis. Thus, as the definition of risk factors probably remains the most unresolved point in prognostic terms, we also aimed to assess how obesity and its pathophysiological features interact with steatosis/fibrosis indices and with related markers of cardio-metabolic risk.

Subjects and methods

A total of 1145 outpatients with overweight or obesity (mean age 54 ± 14 years; 856 females), sent by general practitioners or occupational medicine doctors for cardiometabolic prevention and overweight/obesity treatment, were enrolled during their visit at one of the two recruiting hospitals (Obesity and Work Center, Clinica del Lavoro “L. Devoto” of Fondazione Ca’ Granda Ospedale Maggiore, Policlinico of Milan, and Fondazione CNR-Regione Toscana G. Monasterio, Pisa, Italy) [12]. Upon entering the study, each participant signed an informed consent form and underwent medical examination. The study was approved by the Ethics Committee of Milan Policlinico Hospital (Study registration number: 1370).

We used people-first language (in accordance with the recommendations of The Obesity Society, the Canadian Obesity Network, the Italian Dietetic Association-ADI, and the European Association for Study of Obesity-EASO) to reduce the bias associated with the term “obesity”, avoiding referring to a subject with obesity as “obese” and labeling patients by their condition [13, 14].

Arterial hypertension was defined as systolic blood pressure > 140 mmHg and/or diastolic blood pressure > 90 mmHg, or use of antihypertensive medication. Dyslipidemia was defined as total cholesterol ≥ 200 mg/dL, or triglycerides ≥ 150 mg/dL, or current use of lipid-lowering drugs. Metabolic syndrome was defined as the presence of three or more of the following conditions: increased blood pressure, elevated glycemia, excess body fat around the waist, and abnormal cholesterol or triglyceride levels (NCEP/ATP III).
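The definitions above can be expressed as simple rules. The following is a minimal sketch, not the code used in the study; the function and argument names are hypothetical, and the metabolic syndrome helper only counts components already judged present, because the individual NCEP/ATP III thresholds are not listed here.

```python
def has_hypertension(sbp_mmhg, dbp_mmhg, on_antihypertensives=False):
    """Systolic > 140 mmHg and/or diastolic > 90 mmHg, or current treatment."""
    return sbp_mmhg > 140 or dbp_mmhg > 90 or on_antihypertensives


def has_dyslipidemia(total_chol_mg_dl, tg_mg_dl, on_lipid_lowering=False):
    """Total cholesterol >= 200 mg/dL, or triglycerides >= 150 mg/dL, or current treatment."""
    return total_chol_mg_dl >= 200 or tg_mg_dl >= 150 or on_lipid_lowering


def has_metabolic_syndrome(components_present):
    """components_present: iterable of booleans, one per component.

    Three or more positive components qualify. The thresholds that make a
    component positive are decided by the caller (assumption of this sketch).
    """
    return sum(bool(c) for c in components_present) >= 3
```

Note that the blood pressure thresholds are strict inequalities, so a reading of exactly 140/90 mmHg without treatment is not classified as hypertension under this definition.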

Subjects were excluded if they tested positive for chronic liver disease (e.g., hepatitis C, hepatitis B), or presented systemic diseases (e.g., inflammatory diseases, cancer) or liver complications (e.g., liver failure), or reported significant alcohol consumption or had missing data on alcohol intake.

Peripheral blood samples were drawn after an overnight fast. A complete blood count was determined for each subject, together with routine biochemical parameters (uric acid, glucose, insulin, glycated hemoglobin, fibrinogen, liver enzymes, creatinine, C-reactive protein, homocysteine, and a lipid panel), using standard automated laboratory analyzers. Serum samples from all subjects were frozen and stored at −80 °C for batch analysis, performed periodically at the Central Laboratory of Policlinico of Milan or the FTGM Laboratory, Pisa.

Data are available from the authors for research purposes upon reasonable request.

Steatosis and fibrosis scores

Scores for steatosis

Fatty liver index (FLI) [15]

The parameters used are triglycerides (TG), BMI, gamma-glutamyl transferase (GGT) and waist circumference (WC). It is a simple, efficient screening tool validated in a large cohort. A score ≥ 60 indicates NAFLD, whereas a score < 30 rules it out.

This index is calculated according to the following formula:

$$\mathrm{FLI} = \frac{e^{y}}{1 + e^{y}} \times 100, \qquad y = 0.953 \times \ln(\mathrm{TG}) + 0.139 \times \mathrm{BMI} + 0.718 \times \ln(\mathrm{GGT}) + 0.053 \times \mathrm{WC} - 15.745$$
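As an illustration, the FLI formula and its cut-offs can be sketched in Python. The function names are hypothetical, the units (TG in mg/dL, GGT in U/L, WC in cm) are an assumption because they are not stated in this section, and the label "indeterminate" for values between the two cut-offs is ours.

```python
import math


def fatty_liver_index(tg_mg_dl, bmi, ggt_u_l, wc_cm):
    """Fatty liver index on a 0-100 scale (logistic transform of a linear predictor)."""
    y = (0.953 * math.log(tg_mg_dl)
         + 0.139 * bmi
         + 0.718 * math.log(ggt_u_l)
         + 0.053 * wc_cm
         - 15.745)
    return 100.0 * math.exp(y) / (1.0 + math.exp(y))


def classify_fli(fli):
    """Apply the cut-offs given in the text: >= 60 NAFLD, < 30 no NAFLD."""
    if fli >= 60:
        return "NAFLD"
    if fli < 30:
        return "no NAFLD"
    return "indeterminate"  # label chosen for this sketch


# Hypothetical subject: TG 150 mg/dL, BMI 30 kg/m2, GGT 40 U/L, WC 100 cm
print(classify_fli(fatty_liver_index(150, 30, 40, 100)))
```

Because the score is a logistic transform, it is bounded between 0 and 100 whatever the input values.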

NAFLD liver fat score (NLFS) [16]

The parameters used are fasting insulin, the liver enzymes AST and ALT, and the presence of metabolic syndrome and/or type 2 diabetes. A score > −0.640 has high sensitivity.

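This index is calculated according to the following formula:

$$\mathrm{NLFS} = -2.89 + 1.18 \times \mathrm{MS}\ (\text{yes} = 1/\text{no} = 0) + 0.45 \times \mathrm{T2D}\ (\text{yes} = 2/\text{no} = 0) + 0.15 \times \text{fS-insulin (mU/L)} + 0.04 \times \text{fS-AST (U/L)} - 0.94 \times \mathrm{AST/ALT}$$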
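A minimal Python sketch of the NLFS calculation follows. The function and argument names are hypothetical. The metabolic syndrome term is taken with a positive sign, which is our assumption based on the score as originally published [16]; note also that T2D is coded 2/0 rather than 1/0.

```python
def nafld_liver_fat_score(metabolic_syndrome, type2_diabetes,
                          insulin_mu_l, ast_u_l, alt_u_l):
    """NAFLD liver fat score from fasting insulin, AST, ALT, MS and T2D status."""
    ms = 1 if metabolic_syndrome else 0   # yes = 1 / no = 0
    t2d = 2 if type2_diabetes else 0      # yes = 2 / no = 0
    return (-2.89
            + 1.18 * ms                   # sign assumed positive, see lead-in
            + 0.45 * t2d
            + 0.15 * insulin_mu_l
            + 0.04 * ast_u_l
            - 0.94 * (ast_u_l / alt_u_l))


def nlfs_positive(score):
    """Cut-off used in the text: steatosis when the score is > -0.640."""
    return score > -0.640
```

For a hypothetical subject without MS or T2D, with fasting insulin 5 mU/L and AST = ALT = 20 U/L, the score is −2.89 + 0.75 + 0.80 − 0.94 = −2.28, below the cut-off.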

Scores for fibrosis

Fibrosis 4 (Fib-4) score [17]

Parameters used are age, platelet counts and AST and ALT levels.

Cut-off: > 3.25 indicates fibrosis, < 1.45 no fibrosis.

This index is calculated according to the following formula:

$$\mathrm{FIB\text{-}4} = \frac{\text{age (years)} \times \text{AST (U/L)}}{\text{PLT}\ (10^{9}/\mathrm{L}) \times \sqrt{\text{ALT (U/L)}}}$$
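The FIB-4 calculation and the two cut-offs reported above can be sketched as follows. Function names are hypothetical, and the label "indeterminate" for values between 1.45 and 3.25 is ours.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, plt_10e9_l):
    """FIB-4 = (age x AST) / (PLT x sqrt(ALT)), with PLT in 10^9/L."""
    return (age_years * ast_u_l) / (plt_10e9_l * math.sqrt(alt_u_l))


def classify_fib4(score):
    """Cut-offs from the text: > 3.25 fibrosis, < 1.45 no fibrosis."""
    if score > 3.25:
        return "fibrosis"
    if score < 1.45:
        return "no fibrosis"
    return "indeterminate"  # label chosen for this sketch


# Hypothetical subject: 50 years, AST 40 U/L, ALT 36 U/L, PLT 200 x 10^9/L
print(round(fib4(50, 40, 36, 200), 2), classify_fib4(fib4(50, 40, 36, 200)))
```

The hypothetical subject has a score of 2000 / (200 × 6) ≈ 1.67, which falls between the two cut-offs.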

Body mass index, AST/ALT ratio, diabetes (BARD) Score [18]

The parameters used are BMI, the AST/ALT ratio, and the presence or absence of diabetes.

Severe fibrosis is diagnosed for score ≥ 2.

Three variables were combined: BMI > 28 kg/m² (1 point), AST/ALT ratio > 0.8 (2 points) and presence of diabetes (1 point).
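The point system can be written as a short function. This is a sketch with hypothetical names that follows the thresholds exactly as stated in this section.

```python
def bard_score(bmi, ast_u_l, alt_u_l, diabetes):
    """BARD score (0-4) from BMI, AST/ALT ratio and diabetes status."""
    score = 0
    if bmi > 28:                      # 1 point
        score += 1
    if ast_u_l / alt_u_l > 0.8:       # 2 points
        score += 2
    if diabetes:                      # 1 point
        score += 1
    return score


def bard_positive(score):
    """Cut-off from the text: severe fibrosis for a score >= 2."""
    return score >= 2
```

Since the AST/ALT ratio alone contributes 2 points, any subject with a ratio above 0.8 reaches the cut-off regardless of BMI and diabetes status.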

FORN [19]

An indirect serological marker for the evaluation of liver fibrosis, derived from a multivariate analysis that identified several routine clinical variables as independent predictors of fibrosis. It is calculated according to the following formula:

$$\mathrm{FORN} = 7.811 - 3.131 \times \ln(\mathrm{PLT}) + 0.781 \times \ln(\mathrm{GGT}) + 3.467 \times \ln(\text{age}) - 0.014 \times \text{cholesterol}$$

Cut-off: < 4.2 no fibrosis, ≥ 6.9 fibrosis.
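A minimal sketch of the FORN calculation is given below. The names are hypothetical, and the units (PLT in 10^9/L, GGT in U/L, cholesterol in mg/dL) are an assumption because they are not specified here; the label "indeterminate" for intermediate values is ours.

```python
import math


def forns_index(plt_10e9_l, ggt_u_l, age_years, chol_mg_dl):
    """FORN index from platelet count, GGT, age and total cholesterol."""
    return (7.811
            - 3.131 * math.log(plt_10e9_l)
            + 0.781 * math.log(ggt_u_l)
            + 3.467 * math.log(age_years)
            - 0.014 * chol_mg_dl)


def classify_forns(score):
    """Cut-offs from the text: < 4.2 no fibrosis, >= 6.9 fibrosis."""
    if score < 4.2:
        return "no fibrosis"
    if score >= 6.9:
        return "fibrosis"
    return "indeterminate"  # label chosen for this sketch
```

With hypothetical values of PLT 250 × 10^9/L, GGT 30 U/L, age 50 years and cholesterol 200 mg/dL, the index is about 3.94, below the lower cut-off.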

Body mass index, Age at liver biopsy, Alanine aminotransferase, and serum Triglycerides (BAAT) Score [20]

The parameters used are ALT level, circulating triglyceride concentration (TG), body mass index (BMI) and age. A specific score was calculated for each variable, so that the BAAT score ranged between 0 and 4: age (≥ 50 years = 1; < 50 years = 0), BMI (≥ 28 kg/m² = 1; < 28 kg/m² = 0), TG (≥ 1.7 mmol/L = 1; < 1.7 mmol/L = 0), ALT (≥ 2 times normal = 1; < 2 times normal = 0).

Cut-off ≥ 2.
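The BAAT scoring rules translate directly into code. In this sketch the names are hypothetical, and the upper limit of normal for ALT is passed as a parameter because it is laboratory-specific and not given in this section.

```python
def baat_score(age_years, bmi, tg_mmol_l, alt_u_l, alt_upper_normal_u_l):
    """BAAT score (0-4): one point per criterion met."""
    return (int(age_years >= 50)
            + int(bmi >= 28)
            + int(tg_mmol_l >= 1.7)
            + int(alt_u_l >= 2 * alt_upper_normal_u_l))


def baat_positive(score):
    """Cut-off from the text: positive for a score >= 2."""
    return score >= 2
```

A subject aged 50 years with a BMI of 28 kg/m² already reaches the cut-off of 2 with normal TG and ALT, which illustrates how strongly age and BMI weigh on this score in a population with overweight/obesity.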

Statistical analysis

Continuous variables are reported as mean ± SD, or as median and interquartile range for non-normally distributed parameters. Categorical data are presented as number (%). Concordance between two scores was evaluated using Cohen’s kappa, a statistical coefficient that represents the degree of accuracy and reliability of a classification by measuring the agreement between classifications of items into mutually exclusive categories, according to the following scale:

  • 0.01 – 0.20 slight agreement

  • 0.21 – 0.40 fair agreement

  • 0.41 – 0.60 moderate agreement

  • 0.61 – 0.80 substantial agreement

  • 0.81 – 1.00 very good agreement
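For illustration, Cohen's kappa and the interpretation scale above can be sketched in plain Python. The function names and the toy labels are hypothetical; kappa is computed as (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance.

```python
def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length lists of category labels.

    Undefined (division by zero) when both raters use a single category.
    """
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    # Observed agreement: fraction of items given the same label by both scores
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal frequencies, summed over categories
    p_e = sum((labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories)
    return (p_o - p_e) / (1 - p_e)


def interpret_kappa(kappa):
    """Map kappa to the agreement scale listed above."""
    if kappa < 0.01:
        return "below scale"  # label chosen for this sketch
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.00, "very good")]:
        if kappa <= upper:
            return label
    return "very good"


# Toy example: two scores classifying four subjects (1 = positive, 0 = negative)
print(cohen_kappa([1, 1, 0, 0], [1, 0, 1, 0]))
```

In the toy example the two scores agree on half of the subjects, which is exactly what chance would produce with these marginals, so kappa is 0. This is why a raw percentage of agreement and kappa can give different impressions of concordance.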

The χ² test was used to evaluate whether there was a statistically significant difference between the expected and the observed frequencies in two or more categories of a contingency table. Normally distributed continuous variables were compared using Student’s t test for unpaired data. Non-normally distributed continuous parameters were compared using the Mann–Whitney non-parametric test.
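As a worked example of the χ² comparison of observed and expected frequencies, the sketch below handles a 2 × 2 table. The function name and the counts are hypothetical, and the statistic is computed without continuity correction, which is an assumption since the text does not state whether a correction was applied. For one degree of freedom the p value equals erfc(√(χ²/2)).

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square and p value (1 degree of freedom) for [[a, b], [c, d]]."""
    observed = ((a, b), (c, d))
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: score-positive / score-negative by sex
chi2, p = chi2_2x2(10, 20, 20, 10)
print(round(chi2, 2), p < 0.05)
```

In this hypothetical table every expected count is 15, so χ² = 4 × 25/15 ≈ 6.67, which is significant at the 0.05 level.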

A logistic regression analysis was performed to identify independent correlates of each index. Continuous variables in Table 1 with a univariate association of p < 0.05 were entered into a multivariate regression to estimate independent predictors for each index after adjusting for confounding and risk factors. The parameters included in the formula of a specific index were excluded from its model. A p value < 0.05 was considered statistically significant.
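Odds ratios and 95% confidence intervals from a logistic model are conventionally obtained by exponentiating the regression coefficient and the limits of its interval. The sketch below assumes a Wald interval (coefficient ± 1.96 × standard error), which is not stated in the text; the function name and the input values are hypothetical.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) from a logistic coefficient and its standard error."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


# Hypothetical coefficient for a continuous predictor (per 1-unit increase)
or_, lower, upper = odds_ratio_ci(0.05, 0.01)
print(round(or_, 2), round(lower, 2), round(upper, 2))
```

For a continuous predictor the odds ratio refers to a one-unit increase, so values close to 1 (for example per mmHg of blood pressure or per mg/dL of cholesterol) can still correspond to a relevant effect over the clinical range.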

Results

Anthropometric parameters, clinical characteristics and laboratory markers in the overall population are reported in Table 1. There were 740 (65%) subjects with obesity, and the overall prevalence of metabolic syndrome was 44%.

Table 1 Clinical, anthropometric characteristics of study subjects

The following cut-offs were considered for each score: absence of steatosis for values < 30 for FLI and ≤ −0.64 for NLFS; absence of fibrosis for values ≤ 4.2 for FORN, ≤ 3.25 for FIB-4, and < 2 for BAAT and BARD. In the overall population, 234 tests (20%) were positive for steatosis according to the FLI score, whereas only 56 (5%) were positive according to NLFS criteria (Table 2). As regards the fibrosis scores, 495 (40%), 173 (15%), 881 (77%) and 731 (34%) were positive for FORN, FIB-4, BARD and BAAT, respectively (Table 2).

Table 2 Number and percentage of subjects with fibrosis or steatosis according to the non-invasive score criteria

Concordance between the two steatosis scores (FLI and NLFS) was 81% in the overall population, whereas it was lower among the fibrosis scores (FIB-4 vs FORN 72%, FIB-4 vs BARD 36%, BARD vs FORN 46%, BARD vs BAAT 58%, FIB-4 vs BAAT 46%, BAAT vs FORN 62%) (Table 3).

Table 3 Concordance between noninvasive steatosis and fibrosis scores in the overall population

Tables 4 and 5 report the anthropometric, clinical and laboratory parameters according to the positivity criteria for each score. For steatosis, unlike NLFS, the FLI score was significantly correlated with age. Both indices correlated with inflammatory/oxidative stress-related parameters, although FLI, but not NLFS, correlated with fibrinogen levels (Table 4).

Table 4 Association between steatosis scores and clinical and biochemical characteristics in subjects with overweight and obesity

For fibrosis, the scores were differently associated with anthropometric, clinical and laboratory parameters (Table 5). Unlike the other scores, a higher BARD was more frequent in females, whereas FIB-4 was not affected by dyslipidemia or T2D (Table 5). BARD and BAAT were not associated with PLT count (Table 5). A high BAAT was associated with higher inflammation-related parameters, and higher FIB-4, FORN and BAAT with increased homocysteine and uric acid levels (Table 5).

Table 5 Association between fibrosis scores and clinical and biochemical characteristics in subjects with overweight and obesity

The following parameters remained significant predictors at multivariate analysis (OR, 95% CI, p value):

BAAT: male sex (OR 0.4, 95% CI 0.24–0.64, p < 0.001), WC (1.1, 1.05–1.09, < 0.001), systolic blood pressure (1.02, 1.02–1.5, < 0.001), diastolic blood pressure (0.97, 0.94–0.99, < 0.05), 120 min-glucose (1.01, 1–1.02, < 0.01), glycated hemoglobin (1.1, 1.05–1.1, < 0.001), total cholesterol (1.01, 1–1.01, < 0.001), platelets (0.99, 0.99–1, < 0.001), fibrinogen (1.01, 1–1.01, < 0.05), uric acid (1.2, 1.05–1.3, < 0.01), homocysteine (1.05, 1.01–1.1, < 0.5).

FORN: WC (1.02, 1.01–1.02, < 0.05), systolic blood pressure (1.02, 1.01–1.04, < 0.001), creatinine (5.2, 2–13.6, < 0.001), glucose (1.01, 1–1.03, < 0.05), neutrophils (0.75, 0.7–0.8, < 0.001), uric acid (1.2, 1.07–1.4, < 0.01), homocysteine (1.04, 1–1.08, < 0.05), monocytes (0.65, 0.5–0.82, < 0.01).

BARD: male sex (0.5, 0.4–0.7, < 0.01), age (1.04, 1.02–1.05, < 0.001), diastolic pressure (0.97, 0.95–0.99, < 0.05), low-density lipoproteins (0.99, 0.99–1, < 0.01).

FIB-4: male sex (2.6, 1.6–4.3, < 0.001), neutrophils (0.7, 0.6–0.9, < 0.01), systolic pressure (1.02, 1.01–1.04, < 0.001), monocytes (0.52, 0.4–7, < 0.01).

FLI: male sex (8, 3.4–18.9, < 0.001), uric acid (1.1, 1.1–1.6, < 0.05), creatinine (0.19, 0.05–0.6, < 0.05), C-reactive protein (2.5, 1.3–4.8, < 0.05), homocysteine (1.1, 1.07–1.2, < 0.001), age (1.02, 1.01–1.03, < 0.05), diastolic blood pressure (1.03, 1.01–1.06, < 0.05), insulin (1.2, 1.1–1.2, < 0.001), high-density cholesterol (0.98, 0.96–0.98, < 0.001).

NLFS: male sex (4.7, 1.6–13.9, < 0.01), neutrophils (1.4, 1.03–1.8, < 0.05), monocytes (2.3, 1.3–4.2, < 0.01), C-reactive protein (2.9, 1.03–8.4, < 0.05).

Discussion

The main result of the present study is the low concordance between steatosis/fibrosis indices in a population with overweight/obesity. As the indices include different parameters in their formulas, other than the hepatic enzymes that are always included, and show different relationships with cardio-metabolic risk factors, they likely mirror different pathophysiological aspects of liver damage.

FLI score was proposed and evaluated in an Italian population of about 600 subjects, with and without suspected liver disease [15].

FLI was also used by our group in an observational cross-sectional study enrolling 385 (290 F/94 M, age range 18–69 years) consecutive participants of a nutritional education program; logistic regression between MetS and FLI showed FLI to be a risk factor (odds ratio 1.051; 95% confidence interval, 1.039–1.064; p < 0.001) [21,22,23].

NLFS was proposed in 470 subjects with and without T2D [16], and subsequently validated in a population of more than 700 consecutive biopsy-proven NAFLD patients, where its use was estimated to avoid biopsy in 75% of patients [21]. Moreover, a meta-analysis including 13 studies showed a pooled AUROC of 0.85 for the diagnosis of advanced fibrosis using the NLFS [24]. We observed good concordance between FLI and NLFS (81%), and both scores correlated with inflammatory and anthropometric parameters. Moreover, the FLI score correlated with variables included in the NLFS. However, FLI has the advantage of being simpler to calculate: it requires only four easily obtainable variables (BMI, WC, TG and GGT), whereas NLFS requires the evaluation of metabolic syndrome and the measurement of insulin, glucose, AST and ALT, as well as the AST/ALT ratio.

FIB-4 combines age with liver enzymes (ALT and AST) and PLT count. It was first proposed and validated in patients co-infected with human immunodeficiency virus and hepatitis C virus (n = 832), where its use was estimated to avoid more than 70% of biopsies [17]. It was then applied to a population with chronic hepatitis C virus infection (n = 592), where a value < 1.45 had a sensitivity of 74% for excluding severe fibrosis [25]. In addition to its use in hepatitis C virus populations, FIB-4 may be useful as a NAFLD biomarker, with a high negative predictive value, and may avoid biopsy in about 60% of patients [26]. In the present study, among the fibrosis indices, FIB-4 identified the highest percentage of subjects without fibrosis (85%) and showed the highest concordance with FORN, especially in the reciprocal recognition of patients without fibrosis (n = 673, Table 3). Both indices include PLT count and age in their calculation. In particular, FORN is calculated from a combination of four variables identified by comparative analysis of patients with and without significant fibrosis: age, GGT, platelet count, and total cholesterol level; the latter was in turn inversely correlated with FIB-4 (Table 5). Both indices were also related to increased creatinine, glucose and homocysteine levels and to significant changes in uric acid concentration.

BARD is the only index whose increase is associated with a higher percentage of women than men. This may be related to the inclusion of the AST/ALT ratio in its calculation, together with BMI and T2D, which are recognized determinants of fibrosis. In particular, even though both serum AST and ALT values are lower in females than in males at all ages, the AST/ALT ratio is higher in females [27]. The peculiarity of BARD is also shown by its low concordance with the other indices (FIB-4, BAAT and FORN), always below 60%.

The BAAT calculation includes age, BMI, TG, and ALT. In particular, ALT values are often considered a surrogate biomarker of liver disease severity. In patients with advanced fibrosis, serum ALT levels were significantly lower than in patients with no/mild fibrosis, but AST levels were similar, resulting in higher AST/ALT ratios in patients with advanced fibrosis and suggesting that ALT might not be the ideal biomarker to assess the presence and extent of fibrosis [28]. As previous data showed that aging is associated with increased fibrosis, it is notable that FIB-4, FORN, and BAAT do not need to be adjusted for age, which is included in their calculation.

Conclusion

The non-invasive assessment of steatosis/fibrosis aims to overcome some of the peri-procedural risks of liver biopsy. As these indices are easy to calculate, they are potentially clinically useful and may avoid liver biopsy in a proportion of patients. However, they show low concordance and correlate differently with anthropometric, clinical and laboratory variables in a population with overweight/obesity. Thus, these differences between steatosis/fibrosis scores must be considered when making a differential diagnosis of liver disease or assessing the progression of a known liver disease, as their use still needs to be validated in this clinical context.

What is already known on this subject?

Other recently published results showed low/moderate concordance between different fibrosis scores applied in subjects with diabetes or obesity, which may reinforce the existence of a different pathophysiological background reflected by each score [10, 29,30,31].

What this study adds?

Although non-invasive biomarkers of liver fibrosis/steatosis have been widely used in studies involving the general population or primary care, this study showed that the indices evaluated have low concordance. For this reason, clinicians should be aware of these differences between steatosis/fibrosis scores when making a differential diagnosis of liver disease or assessing the progression of a known liver disease.

Strengths and limitations

The main strengths of the present study include the large number of subjects enrolled, which allows an adequate analysis according to gender, and the number of laboratory parameters (traditional as well as more recently proposed) related to cardio-metabolic risk. Conversely, the present study also has limitations that need to be considered. We evaluated non-invasive scores to assess liver fibrosis and steatosis without evaluating the real prevalence using a gold standard technique for steatosis (magnetic resonance spectroscopy or liver biopsy) or fibrosis (liver biopsy). However, it must be considered that our aim was to evaluate the concordance of these indices in subjects with overweight/obesity and their correlations with other cardio-metabolic risk biomarkers.

Another limitation is that many subjects may fall in a gray zone where the diagnosis remains undetermined, which may account, at least in part, for the different concordance between scores. For example, the cut-off of ≤ 3.25 adopted for FIB-4 is intended to exclude advanced liver fibrosis, but values above this cut-off do not rule in advanced fibrosis and cannot be clearly labeled as "positive" for fibrosis, since there is a large gray area. In this context, recent data obtained in patients with type 2 diabetes suggest that the use of age-adjusted FIB-4 cut-offs reduces gray zone results, making the selection of patients for referral to specialized hepatologists more efficient, as liver biopsy is not applicable to a large proportion of patients [31]. Moreover, one possibility to overcome the limitation of the significant percentage of subjects who remain difficult to categorize may be the sequential or parallel use of different scores, although this procedure needs to be further verified. Vibration-controlled transient elastography (VCTE) may offer support in “borderline” situations. However, using VCTE with the obesity-specific probe (XL), failed or unreliable scans may occur in 3%–14% and 1%–9% of patients, respectively, more likely in patients with obesity and with inexperienced operators. Moreover, approximately 30% of patients with obesity had unreliable/invalid scans in a prospectively evaluated cohort of patients with NAFLD (n = 291), irrespective of the use of VCTE, p-SWE or 2D-SWE [32].

In the present study, we evaluated the concordance between indirect indices of steatosis/fibrosis. We did not have data obtained with a gold standard technique (e.g., liver biopsy, considered the gold standard for the diagnosis of NAFLD, although it is sometimes a source of complications and is affected by potential sample-to-sample variability and interpretation). Consequently, it is difficult to establish which index is superior to another. The scores evaluated showed low concordance and correlated differently with anthropometric, clinical and laboratory variables, likely reflecting their relationship with disparate and even unrelated underlying pathophysiological pathways and risk factors. It seems likely that the indices are not equivalent and that no single index is reliable in all clinical conditions; clinicians should be aware of these differences because they are important in the differential diagnosis of liver disease or when evaluating the progression of a known liver disease. Thus, our results are a warning for clinicians. In the near future, we expect that studies comparing indirect indices with liver biopsy (or other widely used tools, such as ultrasonography, computed tomography, magnetic resonance imaging, and transient elastography) will be performed to better assess the reliability of each indirect index in specific clinical contexts.

Clearly, we focused on very commonly used scores, although other indices of steatosis/fibrosis (of particular interest, the Hepatic Steatosis Index, HSI, and the APRI) could be evaluated in future studies [33, 34].