Introduction

In the last 10 years, the construct of food addiction (FA) has become increasingly popular [1,2,3,4] and has received growing interest in both clinical and research practice [2, 5]. One reason for its popularity may be its dual nature [2,3,4], which has led to two hypotheses: some individuals could be addicted to food and/or to their eating behavior [6]. Indeed, the construct of FA refers to the idea that high-calorie (and/or highly processed) foods may activate an addiction-like response in individuals [7] that may lead to excessive food consumption [8]. Therefore, FA seems to share clinical characteristics with some eating disorders (EDs; e.g., bulimia nervosa, binge eating disorder) [7, 9], but also with substance-related and addictive disorders (SRAD) [10,11,12] and behavioral addictions [3, 13].

Individuals with problematic EDs, such as binge eating disorder (BED), and/or with overweight/obesity are more vulnerable to FA [14,15,16,17] and often show addiction-like eating behaviors [16], reporting symptoms that are also seen in addiction-related disorders, both SRAD and behavioral addictions [18,19,20,21]. These individuals often report spending a great deal of time thinking about food and/or obtaining highly palatable foods [20,21,22]. Moreover, they often report being unable to stop overeating despite several well-known adverse consequences [19, 21]. Furthermore, they report a loss of control once they begin eating [23] as well as craving symptoms [24]. As a consequence, these symptoms lead the individual to experience significant physical, social, and psychological life impairment and distress [9, 18,19,20,21, 25].

From a biological point of view, neuroscience research has revealed that these individuals show alterations in brain circuitry similar to those of individuals with an addiction-related diagnosis, pointing to parallels between factors implicated in some EDs and in SRAD [24]. Indeed, akin to SRAD, elevated activation in reward circuitry, through the release of dopamine, was observed in response to highly palatable food cues [26], as well as reduced stimulation of inhibitory regions in response to food consumption [27], which often leads to impulsive and/or compulsive food-intake behaviors [28]. Moreover, it was shown that ‘food craving’ can activate the hippocampus, caudate nucleus, and insula, which is analogous to ‘drug craving’ [29]. Furthermore, to complete the picture, certain foods have been reported to have a powerfully addictive and rewarding effect similar to that of drugs [30, 31].

These results suggest that the neural activation in response to high-caloric palatable foods (e.g., sweetened foods, foods with high levels of refined carbohydrates, and food with added fat) for individuals with addiction-like eating is comparable to that found in individuals with SRAD [32,33,34,35] and reinforces the hypothesis that—in some individuals with overweight/obesity—addictive processes may lead to a disproportionate food intake [36,37,38].

Against this background, the second version of the Yale Food Addiction Scale (YFAS 2.0 [8, 39]) was developed in light of the DSM-5 SRAD criteria [10]. Whereas the first version of the YFAS was built on the DSM-IV-TR substance dependence criteria [40] applied to palatable foods (e.g., chocolate) [41], the YFAS 2.0 reflects the DSM-5 by assessing both abuse (e.g., failure in role obligations) and dependence (e.g., tolerance, withdrawal) criteria, the new criterion of craving, and clinically significant distress/impairment. Finally, by introducing a severity continuum (mild/moderate/severe), the YFAS 2.0 allows a diagnostic interpretation of the obtained score [8]. The YFAS 2.0 has been translated and/or validated in several languages: German [42], French [43], Spanish [44], Italian [45], Arabic [46], Turkish [47], Korean [48], and Japanese [49]. Moreover, the structural validity and other psychometric properties of this scale have been tested in both general [8, 43, 45, 46, 49] and clinical populations [17, 44], with psychometrically sound results.

Considering structural validity findings from the general population, the original development study by Gearhardt et al. [8] was conducted on a first sample of 550 participants (average BMI = 26.67; SD = 6.76) and subsequently on a second sample of 224 participants (average BMI = 28.03; SD = 7.31). The two samples had an equal sex distribution. The authors specified a one-factor model by positing a single latent factor, Food Addiction, loaded by the eleven observed criteria measured by the YFAS 2.0. Items evaluating the twelfth criterion (significant impairment and distress) were not included in the confirmatory factor analysis (CFA) because they assess the clinical significance of FA rather than serving as indicators of individual criteria [8]. The CFA revealed an almost adequate model fit to the data, with factor loadings higher than 0.70. Moreover, internal consistency was good, as were convergent, discriminant, and incremental validity [8]. The prevalence rate of FA diagnosis was 14.6% in participants of the first study and 15.8% in the second one. The German version [42] was validated first on a sample of 455 university students (study one), with a large majority of females (89%) and an average BMI of 22.32 (SD = 3.65), and further analyses were made on a second sample of 133 bariatric surgery candidates (study two) with an average BMI of 48.80 (SD = 7.08). The CFA results for the sample of students revealed a good model fit to the data for the hypothesized single-factor structure. The prevalence rate of FA diagnosis was 9.7% in the first study and 47.4% in the second one. The Japanese version [49] was validated on a sample of 731 undergraduate students, with a large majority of females (78.5%). In this case too, the CFA revealed a good model fit to the data for the single-factor structure and showed factor loadings ranging from 0.31 to 0.62. Internal consistency was found to be good. The prevalence rate of FA diagnosis was 3.3%.
The first Italian study [45] was performed on a sample of 574 undergraduate medical students, balanced by sex, with an average BMI of 22.5 (SD = 2.3). The CFA results showed a good model fit to the data, with factor loadings ranging from 0.79 to 0.93. Internal consistency was good, and the prevalence rate of FA diagnosis was 3.4%. The French validation study [43] was conducted on a sample of 330 participants, with an average BMI of 23.3 (SD = 4.9). The CFA revealed an inadequate fit for the single-factor structure (two out of three fit indices did not reach the threshold for good model fit to the data), with factor loadings ranging from 0.31 to 0.74. However, internal consistency was satisfactory. The prevalence rate of FA diagnosis was 8.2%. Finally, the Arabic validation study [46] was conducted on a sample of 236 Egyptian medical students, balanced by sex, with an average BMI of 22.3 (SD = 4.1). The authors did not run a CFA testing the factorial structure of the YFAS 2.0. Internal consistency, measured with Cronbach’s alpha, was found to be good, and the prevalence rate of FA diagnosis was 11%.

To date, only two validation studies have assessed the factorial validity of the YFAS 2.0 in clinical populations. The Spanish version of the YFAS 2.0 [44] was validated on three different samples: patients with EDs (N = 135; average BMI = 26.89, SD = 10.17), gamblers (N = 166; average BMI = 26.48, SD = 4.83), and healthy controls (N = 152; average BMI = 22.12, SD = 4.08). The CFA revealed a good model fit to the data for the single-factor structure. Internal consistency was found to be good. The prevalence rate of FA diagnosis was 77.8% for patients with EDs and 3.3% for healthy controls. The Australian study [17] assessed the psychometric properties of the YFAS 2.0 in a sample of patients with BED (average BMI = 25.95, SD = 5.97) with a large majority of females (94%). The CFA revealed an almost adequate model fit for the single-factor solution, with factor loadings ranging from 0.61 to 0.93. Internal consistency was good, as were convergent, discriminant, and incremental validity. The prevalence rate of FA diagnosis was 42.3%.

As underlined by Meule and Gearhardt in a recent review [39], none of the aforementioned studies assessed whether the YFAS 2.0 items loaded onto their intended symptoms/criteria. Furthermore, among the YFAS 2.0 validation studies, only Meule [42] and Granero [44] provided a comparison between patients with EDs and healthy controls. However, neither study reported a thorough comparison of the structure of the YFAS 2.0 across these two populations; both assumed that the model structure was equal between groups. It is crucial that the factorial structure of a scale be equivalent across different populations, because the YFAS 2.0 is used as a screening instrument for FA in clinical samples as well as in the general population.

However, to date, there is no study showing how the factorial structure of the YFAS 2.0 performs in different populations, such as people with severe obesity (BMI ≥ 35) and/or the general population.

The present study therefore aimed to assess, for the first time, whether the YFAS 2.0 items loaded onto their intended symptoms/criteria, as well as the test–retest reliability and measurement invariance of the Italian version of the scale (I-YFAS 2.0) in a sample of inpatients with severe obesity compared with a sample of individuals from the general population, in addition to its structural validity and internal consistency.

Materials and methods

Participants

The sample comprised 704 participants: 400 inpatients with severe obesity [176 males (44.0%) and 224 females (56.0%) aged from 18 to 79 years (mean = 55.54, SD = 12.72)] and 304 individuals [91 males (29.9%) and 213 females (70.1%) aged from 18 to 87 years (mean = 34.26, SD = 16.29)] from the general population. Inpatients with severe obesity (BMI ≥ 35) were recruited at the San Giuseppe Hospital, IRCCS, Istituto Auxologico Italiano, Verbania (Italy) during the first week of a one-month inpatient program for weight reduction and rehabilitation. Individuals from the general population were enrolled in Padua (Italy). Inclusion criteria were: (A) being over 18 years old; (B) being a native Italian speaker; and, for inpatients with severe obesity only, (C) having a BMI of 35 or higher (BMI ≥ 35). Exclusion criteria were: (D) illiteracy; (E) inability to complete the assessment due to vision and/or cognitive impairments; and (F) denial of informed consent. All participants provided written informed consent.

Translation and cultural adaptation

According to international guidelines [50, 51], the Italian version of YFAS 2.0 was independently translated from the original English version into Italian by two bilingual clinicians who are experts in the field. To ensure equivalence between translations, a blind back-translation was conducted by an independent bilingual translator. The final version of the I-YFAS 2.0 was trialed with a random sample of 30 individuals (15 inpatients with severe obesity and 15 non-clinical participants) in order to assess items’ comprehensibility. No further adjustments were required.

Measures

Bio-demographics

Information including age, gender, weight (in kg), height (in m), desired weight, minimum and maximum weight reached was collected with a self-report form preceding the I-YFAS 2.0. Weight and height data were used to compute BMI [52].

The Yale Food Addiction Scale 2.0 (YFAS 2.0)

The YFAS 2.0 [8, 39] is a 35-item self-report questionnaire assessing FA symptoms in both general [43, 45, 46, 49] and clinical populations [8, 17, 44]. The YFAS 2.0 assesses the 11 DSM-5 diagnostic criteria for SRAD [10] and the significant impairment and/or distress related to food. Items concern the key behavioral features of addiction-like eating in which the person engaged over the previous 12 months. Specifically, the criteria assessed by the YFAS 2.0 are: (A) “Substance taken in larger amount and for longer period than intended” (consumed more than intended); (B) “Persistent desire or repeated unsuccessful attempts to quit” (unable to cut down or stop); (C) “Much time/activity to obtain, use, recover” (great deal of time spent); (D) “Important social, occupational, or recreational activities given up or reduced” (important activities given up); (E) “Use continues despite knowledge of adverse consequences (e.g., emotional problems, physical problems)” (use despite physical/emotional consequences); (F) “Tolerance (marked increase in amount; marked decrease in effect)” (tolerance); (G) “Characteristic withdrawal symptoms; substance taken to relieve withdrawal” (withdrawal); (H) “Continued use despite social or interpersonal problems” (use despite interpersonal/social problems); (I) “Failure to fulfill major role obligation (e.g., work, school, home)” (failure in role obligation); (J) “Use in physically hazardous situations” (use in physically hazardous situations); (K) “Craving, or a strong desire or urge to use” (craving); (L) “Significant distress/impairment” (distress). The questionnaire is scored on an 8-point Likert-type scale (ranging from 0 = “never” to 7 = “every day”). To compute the final score, each of the 35 items has to be dichotomized (0 = “non-endorsed” vs. 1 = “endorsed”) according to an item-specific cutoff [8]. Subsequently, according to Gearhardt et al. [8], the presence (vs. absence) of each criterion can be determined via these thresholds.

Finally, in line with the first version of the YFAS (YFAS 1.0; e.g., [16, 41]), this version also provides two scoring options. The first is the symptom count score: the number of FA symptoms experienced in the previous 12 months, ranging from 0 to 11; criterion L (impairment/distress) is not counted. The second is the diagnostic score: FA is diagnosed when the participant reports 2 or more symptoms plus clinically significant impairment/distress. Moreover, FA can be classified as mild if there are 2 or 3 symptoms and clinically significant impairment/distress, moderate if there are 4 or 5 symptoms and significant impairment/distress, or severe if there are 6 or more symptoms and significant impairment/distress [8].
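The two scoring options described above can be sketched in a few lines of Python. This is a minimal illustration, not the official scoring syntax: the function names and the dictionary input format are hypothetical, and the item-specific cutoffs used to decide whether each criterion is met are assumed to have been applied beforehand.

```python
# Hypothetical helper names; input is a dict mapping criterion letters
# 'A'..'L' to True/False (criterion met or not), i.e. AFTER the
# item-specific cutoffs have already been applied.

SYMPTOM_CRITERIA = "ABCDEFGHIJK"  # the 11 symptom criteria; L is excluded


def symptom_count(criteria_met):
    """Number of symptom criteria met (0-11); criterion L is not counted."""
    return sum(bool(criteria_met[c]) for c in SYMPTOM_CRITERIA)


def diagnosis(criteria_met):
    """Diagnostic score: requires impairment/distress (L) plus >= 2 symptoms."""
    count = symptom_count(criteria_met)
    impaired = bool(criteria_met["L"])
    if not impaired or count < 2:
        return "no FA"
    if count <= 3:
        return "mild FA"
    if count <= 5:
        return "moderate FA"
    return "severe FA"
```

Under this sketch, a respondent meeting four symptom criteria without clinically significant impairment/distress would have a symptom count of 4 but would be classified as "no FA".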

The Binge Eating Scale (BES)

The BES [53, 54] is a 16-item self-report measure of binge eating severity in both general [16] and clinical populations [55]. The BES investigates the frequency of feelings, thoughts, and behaviors associated with BED; its items consist of groups of three or four statements increasing in severity, which constitute two subscales (FC: Feelings/Cognitions; and B: Behaviors) and a total score [55]. The BES usually has satisfactory internal consistency, and several studies highlight its ability to discriminate between clinical and non-clinical individuals [53], also showing high concordance with the interview-based diagnosis of binge eating disorder [56]. The BES has generally received support as an adequately reliable and valid measure of eating-related pathology. In this study, Cronbach’s alphas were 0.895, 0.825, and 0.806 for the BES Total scale, the FC scale, and the B scale, respectively.

The Dutch Eating Behavior Questionnaire (DEBQ)

The DEBQ [57, 58] is a 33-item self-report measure of behaviors and attitudes related to EDs in both general [16, 59] and clinical populations [60]. Items concern the occurrence of key psychological as well as behavioral features of EDs in which the person engages. The questionnaire is scored on a 5-point Likert scale (1–5), and its items compose three subscales (Restrained Eating, RE; Emotional Eating, EE; and External Eating, ExE) and a total score. The DEBQ has also been shown to be an adequately reliable and valid measure of eating-related pathology, with a strong three-factor structure, high internal consistency, and high test–retest reliability after a 4-week period [57,58,59,60]. In this study, Cronbach’s alphas were 0.920, 0.893, 0.965, and 0.829 for the total scale, the RE scale, the EE scale, and the ExE scale, respectively.

The Eating Disorder Examination Questionnaire 6.0 (EDEQ)

The EDEQ 6.0 [61, 62] is a 28-item self-report measure of ED behaviors, attitudes, and tendencies to psychopathology in both general [63, 64] and clinical populations [65]. Items concern the frequency of key behavioral features of EDs in which the person engaged over the preceding 28 days. The questionnaire is scored on a 7-point Likert scale (0–6), and its items compose four subscales (Restraint, R; Eating Concern, EC; Shape Concern, SC; and Weight Concern, WC) and a global score. The EDEQ has generally received support as an adequately reliable and valid measure of eating-related pathology [63, 64]. In this study, Cronbach’s alphas were 0.928, 0.802, 0.748, 0.889, and 0.729 for the EDEQ Total scale, the Restraint scale, the Eating Concern scale, the Shape Concern scale, and the Weight Concern scale, respectively.

Statistical analyses

Statistical analyses were performed with R software (v. 3.5.3; R Core project [66, 67]) and the following packages: psych (v. 1.8.12; [68]), lme4 (v. 1.1-21; [69]), lavaan (v. 0.6-5; [70, 71]), pROC (v. 1.13.0; [72]), and semTools (v. 0.5-2; [73]). Graphical representations, reported in Supplementary Material ‘S1’, were produced with the R package ggplot2 (v. 3.1.0; [74]).

Two different structural models were sequentially tested by means of CFA: a hierarchical model (Fig. 1) and a first-order model (Fig. 2). In the hierarchical model, each of the 35 items of the I-YFAS 2.0 loaded onto its specific criterion (from Criterion A to Criterion L), and the criteria in turn loaded onto an overarching general (second-order) latent dimension. In the first-order model, in line with previous validations of the YFAS 2.0 [8, 17, 43,44,45,46, 49], each of the eleven symptoms (from Criterion A to Criterion K) loaded onto a single latent dimension.

Fig. 1 The hierarchical model of the YFAS 2.0. Each item loaded onto its first-order latent symptom/criterion, which in turn loaded onto a second-order (general) latent variable

Fig. 2 First-order model of the YFAS 2.0. Each of the eleven symptoms/criteria of food addiction loaded onto a first-order latent variable

Considering the binary response scale (non-endorsed vs. endorsed), both for items and for criteria/symptoms (Table 5), the diagonally weighted least squares (DWLS) estimator was used to assess the factorial structure of the I-YFAS 2.0 [75,76,77]. Model fit was assessed by means of several indices: the chi-square statistic (χ2) [78], the Root-Mean-Square Error of Approximation (RMSEA) [79,80,81], the Comparative Fit Index (CFI) [82], and the ratio of χ2 to the degrees of freedom (df) [77, 83, 84]. The following criteria were used as cutoffs [85, 86]. A non-significant χ2 test is desirable [81, 87]. RMSEA values should be lower than 0.08 for an ‘acceptable’ model fit [85, 88] and below 0.05 for a ‘good’ fit [89]. CFI values should be between 0.90 and 0.95 for an ‘acceptable’ fit [75, 85, 89] and higher than 0.95 for a ‘good’ fit [88, 90]. Also, χ2/df should be lower than or equal to 3 to indicate a ‘good’ fit [16, 84, 91,92,93,94].
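The cutoffs listed above can be made concrete with a short Python sketch. The function names are hypothetical and the labels simply mirror the thresholds stated in the text; the "poor" label for values outside the stated ranges is an assumption added for completeness.

```python
def chi2_df_ratio(chi2, df):
    """Ratio of the chi-square statistic to its degrees of freedom."""
    return chi2 / df


def rmsea_label(rmsea):
    """'good' below 0.05, 'acceptable' below 0.08 (cutoffs from the text)."""
    if rmsea < 0.05:
        return "good"
    if rmsea < 0.08:
        return "acceptable"
    return "poor"  # assumed label for values outside the stated ranges


def cfi_label(cfi):
    """'good' above 0.95, 'acceptable' from 0.90 to 0.95 (cutoffs from the text)."""
    if cfi > 0.95:
        return "good"
    if cfi >= 0.90:
        return "acceptable"
    return "poor"  # assumed label for values outside the stated ranges


def chi2_df_label(ratio):
    """'good' when the ratio is lower than or equal to 3."""
    return "good" if ratio <= 3 else "poor"
```

For instance, with the values reported in the Results for the hierarchical model in the combined sample (χ2 = 938.403, df = 548), `chi2_df_ratio(938.403, 548)` rounds to 1.712, which falls within the 'good' range.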

Measurement invariance (MI) analyses were specified to evaluate whether the two aforementioned structures of the I-YFAS 2.0 (the hierarchical model and the first-order model) were invariant between the sample of inpatients with severe obesity and the individuals from the general population [95]. The following procedure was performed for the hierarchical model, merging the technique suggested by Dolan [96, 97] with the current guidelines for the treatment of dichotomous data [75, 78, 98,99,100,101]. Four nested models were sequentially specified, each constraining additional parameters to equality between groups: the hierarchical structure (Model 1: Configural Invariance); first-order factor loadings and thresholds (Model 2); first- and second-order factor loadings and thresholds of measured variables (Model 3: Strong Invariance); and first- and second-order factor loadings, thresholds of measured variables, and factor means (Model 4: Means Invariance). By contrast, the ‘usual’ procedure for models with categorical indicators was followed for the first-order model [95, 98, 102,103,104,105]. In this case, the simple first-order model (Configural Invariance), the factor loadings and items’ thresholds (Strong Invariance), and the latent means (Means Invariance) were sequentially constrained to equality between groups.

Measurement invariance was assessed by testing differences in three fit indices, using the following cutoffs for model equivalence: DIFFTEST (equivalent to Δχ2; p value > 0.050), ΔCFI (< 0.010), and ΔRMSEA (< 0.015) [75, 78, 98, 106]. Exceeding the cutoff in two out of these three indices, combined with worse fit indices, was considered evidence of model non-invariance.
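The two-out-of-three decision rule can be sketched as follows. This is a simplified illustration: the function name is hypothetical, absolute values are used for the index differences (an assumption, since the sign convention is not stated), and the additional requirement of "worse fit indices" is not modeled.

```python
def non_invariance_flag(difftest_p, delta_cfi, delta_rmsea):
    """Return True when at least two of the three cutoffs are exceeded.

    Cutoffs for model equivalence (from the text):
    DIFFTEST p > 0.050, |dCFI| < 0.010, |dRMSEA| < 0.015.
    """
    exceeded = [
        difftest_p <= 0.050,        # significant chi-square difference
        abs(delta_cfi) >= 0.010,    # CFI change at or beyond the cutoff
        abs(delta_rmsea) >= 0.015,  # RMSEA change at or beyond the cutoff
    ]
    return sum(exceeded) >= 2
```

Applied to the first-order strong invariance comparison reported in the Results (p = 0.019, ΔCFI = −0.001, ΔRMSEA = 0.010), only the DIFFTEST cutoff is exceeded, so the function returns `False`; for the means invariance comparison (p < 0.001, ΔCFI = −0.011, ΔRMSEA = 0.035) all three are exceeded and it returns `True`.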

Categorical McDonald’s omega was used as a measure of internal consistency for the I-YFAS 2.0 in each CFA. In more detail, omega hierarchical (ωh; the proportion of the second-order factor explaining the total score, or the coefficient omega), omega total (ωt; the proportion of the second-order factor explaining the variance at first-order factor level), and omega partial (ωp; the proportion of observed variance explained by the second-order factor after partialling out the uniqueness of the first-order factors) were computed for the hierarchical model [107,108,109,110,111]. For the first-order model, ωt was computed [107,108,109,110,111] and complemented by the Kuder–Richardson coefficient (KR-20) [112].
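As an illustration of the KR-20 coefficient mentioned above, the following Python sketch applies the standard formula, KR-20 = k/(k − 1) × (1 − Σ p_i q_i / σ²_total), to a matrix of dichotomous responses. The function name and the toy data are hypothetical, and the population variance of total scores is assumed (some software uses the sample variance instead). The omega coefficients require a fitted factor model and are not reproduced here.

```python
import statistics


def kr20(responses):
    """KR-20 for dichotomous data.

    responses: list of rows (one per respondent), each a list of 0/1 values.
    Uses the population variance of the total scores (an assumption).
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    total_var = statistics.pvariance(totals)
    # Sum of p * q over items, where p is the proportion endorsing the item
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / total_var)


# Toy example: 4 respondents, 3 dichotomous criteria
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
toy_kr20 = kr20(toy)
```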

Convergent validity was assessed with the Pearson correlation coefficient [113]. The strength of correlations was interpreted using Cohen’s benchmarks: r < 0.10, trivial; r from 0.10 to 0.30, small; r from 0.30 to 0.50, moderate; r > 0.50, large [114]. In addition, the χ2 test was performed to assess the associations between the I-YFAS 2.0 diagnostic score and the other measures’ clinical thresholds [113]. A more detailed analysis of the relation between BMI and the I-YFAS 2.0 symptom count is reported in Supplementary Material ‘S1.’

According to guidelines [115, 116], the test–retest reliability of the first-order model was estimated on a subsample of 20 inpatients with severe obesity by using the two-way mixed intraclass correlation coefficient (ICCconsistency) [92, 117]. This statistic was also used to evaluate the stability of the FA diagnosis.
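A consistency-type ICC can be computed from the mean squares of a two-way ANOVA without replication. The sketch below uses the single-measure form, ICC = (MS_rows − MS_error) / (MS_rows + (k − 1) × MS_error); whether the study used the single-measure or the average-measure form is not stated, so this choice, the function name, and the toy data are assumptions.

```python
def icc_consistency(scores):
    """Single-measure, two-way, consistency ICC.

    scores: list of rows, one per subject, each holding k repeated
    measurements (e.g., test and retest).
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Sums of squares for subjects (rows), occasions (columns), and total
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Toy example: every retest score is the test score plus a constant,
# so rank order and spacing are perfectly preserved.
shifted = [[1, 3], [2, 4], [5, 7], [3, 5]]
icc_shifted = icc_consistency(shifted)
```

Because the consistency definition ignores systematic differences between occasions, a constant shift between test and retest, as in the toy data, does not lower the coefficient.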

Considering that the YFAS 2.0 was conceptualized as a screening tool for FA [8, 41], Receiver Operating Characteristic (ROC) curve methodology was used to assess its accuracy in differentiating between inpatients with severe obesity and individuals from the general population [118, 119]. In line with previous studies [44], the sample (inpatients with severe obesity versus the general population) was used as an external criterion variable and the I-YFAS 2.0 symptom count score was used as the dependent variable. Moreover, following Gearhardt et al. [8], a latent multivariate variable (MLV) was created ad hoc by including the measures of ED symptoms (the BES FC scale, the BES B scale, the DEBQ ExE, the DEBQ EE, and the EDEQ EC), and its factor score (FS) was extracted. Values higher than the 75th percentile of the MLV FS distribution were considered indicators of ED and labeled as “case” (conversely, values lower than the 75th percentile were labeled as “control”) in the ROC curve analyses. The global accuracy–validity of the I-YFAS 2.0 was estimated with the area under the ROC curve (AUC; 5000 stratified bootstrap resamples), interpreted using Swets’ benchmarks: AUC = 0.50, null; AUC from 0.51 to 0.70, small; AUC from 0.71 to 0.90, moderate; AUC from 0.91 to 0.99, large; and AUC = 1.00, perfect accuracy [120, 121]. Moreover, sensitivity (Se) and specificity (Sp) were computed for the cutoff point [118, 119].
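The quantities involved in the ROC analysis can be illustrated with a small Python sketch. It computes the AUC as the probability that a randomly chosen "case" scores higher than a randomly chosen "control" (ties count one half), plus sensitivity and specificity at a given cutoff. The function names, the toy scores, and the convention that a score at or above the cutoff counts as positive are assumptions; the bootstrap confidence intervals used in the study are not reproduced.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_specificity(case_scores, control_scores, cutoff):
    """Se and Sp when scores >= cutoff are classified as positive (assumed)."""
    true_pos = sum(score >= cutoff for score in case_scores)
    true_neg = sum(score < cutoff for score in control_scores)
    return true_pos / len(case_scores), true_neg / len(control_scores)


# Toy symptom counts (hypothetical data)
cases = [4, 6, 8]
controls = [0, 1, 4]
toy_auc = auc(cases, controls)
toy_se, toy_sp = sensitivity_specificity(cases, controls, cutoff=4)
```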

Incremental validity was examined by performing two stepwise hierarchical multiple regressions. The first analysis was conducted to assess whether the YFAS 2.0 symptom score predicts a statistically significant incremental variance (ΔR2) in BMI. The second was carried out to test whether the YFAS 2.0 symptom score predicts a statistically significant incremental variance (ΔR2) in BED attitudes. For each regression analysis, potential confounding variables (e.g., convergent ED measures, sociodemographic variables) were first entered in the regression equation and tested with a stepwise procedure. The YFAS 2.0 symptom score was then added, and the ΔR2 was checked to evaluate the contribution of the YFAS 2.0 symptom score.

Prevalence analyses were performed concerning symptoms endorsement as well as the diagnosis (no FA vs. mild FA vs. moderate FA vs. severe FA).

Finally, the independent-samples t test and one-way analysis of variance (ANOVA) were performed to examine differences in the ED measures between FA diagnoses (No FA vs. FA) and FA severity levels (mild FA vs. moderate FA vs. severe FA), respectively. The strength of differences was interpreted using Cohen’s f and Cohen’s d [122] and their benchmarks: null (f < 0.10; d < 0.20); small (f from 0.10 to 0.25; d from 0.20 to 0.49); moderate (f from 0.25 to 0.40; d from 0.50 to 0.79); and large (f ≥ 0.40; d ≥ 0.80). The Games–Howell test was chosen for post-hoc analysis [113, 123,124,125].
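For the two-group comparison, Cohen's d and its benchmark label can be sketched as below. The pooled-standard-deviation form of d is assumed (the text does not specify which variant was used), and the function names and toy data are hypothetical.

```python
import math
import statistics


def cohens_d(group1, group2):
    """Cohen's d using the pooled standard deviation (assumed variant)."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)  # sample variance (n - 1)
    var2 = statistics.variance(group2)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd


def d_label(d):
    """Benchmarks from the text: null < 0.20, small < 0.50, moderate < 0.80."""
    size = abs(d)
    if size < 0.20:
        return "null"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "moderate"
    return "large"


# Toy example (hypothetical scores for two groups)
toy_d = cohens_d([2, 4, 6], [1, 3, 5])
```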

Results

Sample characteristics

Inpatients with severe obesity

Inpatients’ BMI ranged from 35.06 to 81.04 [mean = 42.99; SD = 6.47; skewness = 1.85; kurtosis = 6.28]. According to the WHO BMI classification, 152 individuals (38%) had an obesity class II BMI (from 35 to 39.99) and 248 (62%) an obesity class III BMI (BMI ≥ 40). A one-sample t test revealed that the mean BMI of inpatients with severe obesity was significantly different from the WHO obesity class I cutoff criterion (BMI = 34.99; t = 25.47; p < 0.001). The self-reported minimum weight reached by inpatients during adolescence ranged from 25 to 110 kg (mean = 56.40; SD = 16.09), and the self-reported maximum weight reached ranged from 30 to 160 kg (mean = 70.50; SD = 21.81). The self-reported minimum weight reached during adulthood ranged from 40 to 210 kg (mean = 70.73; SD = 22.87), and the self-reported maximum weight reached ranged from 45 to 247.2 kg (mean = 121.22; SD = 34.57). Inpatients’ desired weight ranged from 50 to 130 kg (mean = 81.97; SD = 15.58).

Individuals from the general population

BMI ranged from 14.84 to 39.26 [mean = 22.65; SD = 3.67; skewness = 0.85; kurtosis = 0.99]. Two hundred and seven individuals (68.1%) had a BMI in the normal weight range (from 18.5 to 24.99). Twenty-five participants (8.2%) were underweight (BMI from 16 to 18.5), and 5 (1.6%) were severely underweight (BMI < 16). Fifty-six participants (18.4%) had a BMI in the overweight class (from 25 to 29.99), 10 participants (3.3%) had a BMI in obesity class I (from 30 to 34.99), and only one individual had a BMI in obesity class II (from 35 to 39.99). One-sample t tests revealed that, on average, the study sample was neither underweight (BMI = 18.49; t = 19.76; p < 0.001) nor overweight (BMI = 25; t = −11.13; p < 0.001). The self-reported minimum weight reached during adolescence ranged from 30 to 99 kg (mean = 51.34; SD = 9.70), and the self-reported maximum weight reached ranged from 42 to 110 kg (mean = 60.29; SD = 10.80). The self-reported minimum weight reached during adulthood ranged from 34 to 90 kg (mean = 55.35; SD = 9.58), and the self-reported maximum weight reached ranged from 45 to 110 kg (mean = 66.60; SD = 11.99). Participants’ desired weight ranged from 42 to 98 kg (mean = 59.32; SD = 10.07).
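The BMI bands used in the two sample descriptions can be expressed as a short lookup. The sketch below computes BMI as weight in kilograms divided by squared height in meters and assigns the class labels quoted in the text; the function names are hypothetical, and treating each band as a half-open interval (e.g., 18.5 ≤ BMI < 25 for normal weight) is an assumption about how the boundary values are handled.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def who_bmi_class(value):
    """Class labels following the bands quoted in the text."""
    if value < 16:
        return "severely underweight"
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal weight"
    if value < 30:
        return "overweight"
    if value < 35:
        return "obesity class I"
    if value < 40:
        return "obesity class II"
    return "obesity class III"
```

With this sketch, the minimum and mean inpatient BMI values reported above (35.06 and 42.99) fall into obesity class II and obesity class III, respectively.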

Structural validity

Hierarchical model

The second-order model (Fig. 1) showed an acceptable fit to the data (Table 1) for the two samples combined. Although the chi-square statistic was statistically significant [χ2 (548) = 938.403; p < 0.001], the CFI (CFI = 0.990), the RMSEA [RMSEA = 0.032; 90% CI 0.028–0.035; p(RMSEA < 0.05) = 1], and the χ2/df (χ2/df = 1.712) were indicative of an adequate model fit. As shown in Table 1, all the first-order item loadings were statistically significant (mean = 0.853; SD = 0.08), as were the second-order loadings (mean = 0.888; SD = 0.06).

Table 1 Structural validity of the YFAS2.0 second-order model presented in Fig. 1: Item factor loadings (λ) and explained variance (R2)

Regarding the sample of inpatients with severe obesity, the chi-square statistic was statistically significant [χ2 (548) = 719.408; p < 0.001]. However, the CFI (CFI = 0.994), the RMSEA [RMSEA = 0.028; 90% CI 0.022–0.033; p(RMSEA < 0.05) = 1], and the χ2/df (χ2/df = 1.313) were indicative of a good model fit. As shown in Table 1, all the first-order item loadings were statistically significant (mean = 0.843; SD = 0.09), as were the second-order loadings (mean = 0.886; SD = 0.08).

Regarding the sample of individuals from the general population, the chi-square statistic was statistically significant [χ2 (548) = 1000.908; p < 0.001]. However, the CFI (CFI = 0.964), the RMSEA [RMSEA = 0.052; 90% CI 0.047–0.057; p(RMSEA < 0.05) = 0.233], and the χ2/df (χ2/df = 1.827) were indicative of a good model fit. As shown in Table 1, all first-order item loadings were statistically significant (mean = 0.858; SD = 0.129), as were the second-order loadings (mean = 0.846; SD = 0.09).

First-order model

The single-factor model (Fig. 2) showed a good fit to the data for the two samples combined. Although the chi-square statistic was statistically significant [χ2 (44) = 60.645; p = 0.049], all the other fit indices revealed a good fit to the data: CFI = 0.998, RMSEA = 0.023 [90% CI 0.002–0.036; p(RMSEA < 0.05) = 1], and χ2/df = 1.378. As reported in Table 2, all the item loadings were statistically significant and ranged from 0.728 (Criterion A) to 0.871 (Criterion E) (mean = 0.818; SD = 0.04).

Table 2 Structural validity of the eleven YFAS2.0 symptoms/criteria: item factor loadings (λ) and explained variance (R2)

Regarding the sample of inpatients with severe obesity, all of the fit indices revealed a good fit to the data: χ2 (44) = 40.186, p = 0.636 (ns); CFI = 1.000; RMSEA = 0.000 [90% CI 0.000–0.029; p(RMSEA < 0.05) = 1]; and χ2/df = 0.913. As shown in Table 2, all of the item loadings were statistically significant (p < 0.001) and ranged from 0.706 (Criterion H) to 0.899 (Criterion K) (mean = 0.809; SD = 0.06).

Regarding the sample from the general population, all fit indices likewise revealed a good fit to the data: χ2 (44) = 50.827, p = 0.223 (ns); CFI = 0.996; RMSEA = 0.023 [90% CI 0.000–0.046; p(RMSEA < 0.05) = 0.974]; and χ2/df = 1.155. As reported in Table 2, all the item loadings were statistically significant and ranged from 0.661 (Criterion F) to 0.936 (Criterion I) (mean = 0.795; SD = 0.08).

Measurement invariance

Hierarchical model

Configural invariance

A second-order configural invariance model was specified between inpatients with severe obesity and the general population, and the same hierarchical model was estimated simultaneously within each group. Good model fit indices were found: χ2 (1096) = 1720.316, p < 0.001; CFI = 0.984, RMSEA = 0.040; 90% CI 0.037–0.044; p(RMSEA < 0.05) = 1, and the χ2/df = 1.569. The achieved configural invariance suggests that the hierarchical structure was similar between groups.

Strong invariance

The second-order strong invariance model reported negative observed variances—and as a result, comparison of fit indices was not performed. Consequently, means invariance was not tested.

First-order model

Configural invariance

A first-order configural invariance model was specified between groups. Good model fit indices were found (χ2 (88) = 91.013, p = 0.392 ns; CFI = 1.00; RMSEA = 0.010, 90% CI 0.000–0.031, p(RMSEA < 0.05) = 1; and χ2/df = 1.034), suggesting that, even in this case, the factor structure was similar between inpatients with severe obesity and the general population.

Strong invariance

The first-order strong invariance model also fitted the data well: χ2 (97) = 110.903, p = 0.158 ns; CFI = 0.998; RMSEA = 0.020, 90% CI 0.000–0.036, p(RMSEA < 0.05) = 1; and χ2/df = 1.143. In this case, non-significant decreases in fit were found in two out of three indices (DIFTEST = 19.890, p = 0.019; ΔRMSEA = 0.010; ΔCFI = − 0.001), indicating that items were equivalently related to the latent factor between groups and had the same expected item response at the same absolute level of the trait.

Means invariance

Finally, the first-order means invariance model also revealed adequate fit indices: χ2 (98) = 201.428, p < 0.001; CFI = 0.987; RMSEA = 0.055, 90% CI 0.044–0.066, p(RMSEA < 0.05) = 0.221; and χ2/df = 2.055. However, decreases in fit indices compared to the strong invariance model were found (DIFTEST = 90.525, p < 0.001; ΔRMSEA = 0.035; ΔCFI = − 0.011), suggesting that the groups did not have the same expected latent mean of the trait.
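The nested-model comparisons in this subsection can be sketched as simple differences in CFI and RMSEA between a more restricted model and its less restricted predecessor. The cutoffs used below (ΔCFI no lower than −0.010 and ΔRMSEA no higher than 0.015) are commonly cited conventions and are assumptions here, not necessarily the criteria applied in this study; the function name and dictionary layout are likewise hypothetical. Because the deltas are recomputed from rounded indices, they can differ in the third decimal from the values reported in the text.

```python
def compare_nested(less_restricted, more_restricted, cfi_cut=-0.010, rmsea_cut=0.015):
    """Return the change in CFI/RMSEA and whether both stay within the assumed cutoffs."""
    d_cfi = round(more_restricted["cfi"] - less_restricted["cfi"], 3)
    d_rmsea = round(more_restricted["rmsea"] - less_restricted["rmsea"], 3)
    return {
        "delta_cfi": d_cfi,
        "delta_rmsea": d_rmsea,
        # Invariance is retained only if neither index worsens beyond its cutoff.
        "invariance_holds": d_cfi >= cfi_cut and d_rmsea <= rmsea_cut,
    }


# Fit indices of the first-order models as reported in the text.
configural = {"cfi": 1.000, "rmsea": 0.010}
strong = {"cfi": 0.998, "rmsea": 0.020}
means = {"cfi": 0.987, "rmsea": 0.055}

print("strong vs configural:", compare_nested(configural, strong))
print("means vs strong:", compare_nested(strong, means))
```

Under these assumed cutoffs the strong model is retained and the means model is not, which agrees with the conclusion drawn in the text.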

Reliability

Reliability analysis revealed satisfactory results. Indeed, for the hierarchical model, ωh was equal to 0.959, ωt was equal to 0.979, and ωp was equal to 0.987. In line with these results, ωt was equal to 0.900 and KR20 was 0.874 for the first-order model.
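For readers less familiar with KR20, the coefficient for dichotomous items is k/(k − 1) × (1 − Σ p·q / σ²), where p is the proportion endorsing each item, q = 1 − p, and σ² is the variance of the total scores. The following minimal sketch uses a tiny made-up response matrix (not study data) and the population variance, which is one common convention; it is meant only to illustrate the formula, not to reproduce the value reported above.

```python
def kr20(responses):
    """Kuder-Richardson 20 for a list of respondents, each a list of 0/1 item scores."""
    n = len(responses)      # number of respondents
    k = len(responses[0])   # number of items
    # Proportion of respondents endorsing each item.
    p = [sum(row[j] for row in responses) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    # Population variance of the total (symptom count) scores.
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical toy data: 4 respondents, 3 dichotomous criteria.
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(toy))
```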

Convergent validity

As shown in Table 3, moderate-to-large correlations were found between the I-YFAS 2.0 symptom count and the BES scales (BES total score: r = 0.664, p < 0.001; FC scale: r = 0.639, p < 0.001; B scale: r = 0.610; p < 0.001), the DEBQ scales (DEBQ total score: r = 0.448, p < 0.001; EE scale: r = 0.487, p < 0.001; ExE scale: r = 0.426; p < 0.001), and the EDEQ 6 scales (EDEQ total score: r = 0.511, p < 0.001; EC scale: r = 0.523, p < 0.001; SC scale: r = 0.480, p < 0.001; WC scale: r = 0.456, p < 0.001). In addition, a moderate association was found between I-YFAS 2.0 symptom count and the BMI: r = 0.311, p < 0.001 (see Supplementary Material).

Table 3 Correlations between scales and subscales

Furthermore, the Chi-square test showed statistically significant associations between the diagnostic scores and the BES [χ2 = 149.211; p < 0.001], the EDEQ [χ2 = 76.127; p < 0.001], and the BMI [χ2 = 65.053; p < 0.001] clinical thresholds.

Test–retest reliability

Test–retest reliability statistics revealed good results. Indeed, the two-way mixed ICC was equal to 0.853 (95% CI 0.580–0.949) for the first-order model and equal to 0.96 (95% CI 0.886–0.896) for the FA diagnosis.

Accuracy of the YFAS 2.0 as a screening/diagnostic tool

The symptom count of the I-YFAS 2.0 obtained good accuracy in discriminating between inpatients with severe obesity and the general population: AUC = 0.706; 95% CI = 0.670–0.741; s.e. = 0.018; p < 0.001; with a Se equal to 0.875 (95% CI 0.836–0.911) and a Sp equal to 0.412 (95% CI 0.363–0.463).

Considering the MLV, a preliminary CFA assessed its adequacy. Results of this preliminary analysis revealed an adequate fit to a one-factor solution (χ2 (5) = 9.379, p = 0.095 ns; CFI = 0.995; RMSEA = 0.043 (95% CI 0.000–0.085); SRMR = 0.043) with loadings that ranged from 0.532 to 0.880; these results supported the presence of a single MLV. The symptom count of the I-YFAS 2.0 obtained good accuracy in discriminating between groups (MLV FS ≤ 75° vs. MLV FS ≥ 76°): AUC = 0.849; 95% CI = 0.806–0.891; s.e. = 0.022; p < 0.001; with a Se equal to 0.807 (95% CI 0.737–0.877) and a Sp equal to 0.768 (95% CI 0.724–0.810).
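To make the ROC quantities above concrete, the sketch below computes sensitivity and specificity at a given cutoff, and the AUC as the probability that a randomly chosen case scores higher than a randomly chosen non-case (ties counted as one half). The scores and the cutoff are made-up illustrations, not study data, and the function names are hypothetical.

```python
def sensitivity_specificity(pos_scores, neg_scores, cutoff):
    """Se = share of cases scoring at or above the cutoff; Sp = share of non-cases below it."""
    true_pos = sum(s >= cutoff for s in pos_scores)
    true_neg = sum(s < cutoff for s in neg_scores)
    return true_pos / len(pos_scores), true_neg / len(neg_scores)


def auc(pos_scores, neg_scores):
    """AUC via pairwise comparison of every case with every non-case."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical symptom counts for cases and non-cases.
cases = [3, 5, 7]
non_cases = [1, 3, 4]

se, sp = sensitivity_specificity(cases, non_cases, cutoff=4)
print(f"Se = {se:.3f}, Sp = {sp:.3f}, AUC = {auc(cases, non_cases):.3f}")
```

A pattern of high Se with modest Sp, as in the first analysis above, means that most cases fall above the cutoff but so do many non-cases.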

Incremental validity

The first stepwise hierarchical multiple regression was performed to examine the contribution of the YFAS 2.0 symptom count in predicting incremental variance in BMI (Table 4). In Step 1, two EDEQ subscale scores (SC and R) and two DEBQ subscale scores (ExE and RE) were entered into the regression and accounted for 24.9% of BMI variance. In Step 2, the YFAS 2.0 symptom count was entered into the regression equation and was found to significantly increase the proportion of BMI variance accounted for by the model: 30.3%; ΔR2 = 0.054.

Table 4 Incremental validity analyses

The second stepwise hierarchical multiple regression was carried out to explore the contribution of the YFAS 2.0 symptom count in predicting incremental variance in binge eating attitudes (BES) (Table 4). In Step 1, two EDEQ subscale scores (EC and WC) and three DEBQ subscale scores (ExE, EE and RE) were entered into the regression and accounted for 61.1% of the variance in binge eating attitudes. In Step 2, the entry of the YFAS 2.0 symptom count significantly increased the proportion of variance in binge eating attitudes accounted for by the model: 65.4%; ΔR2 = 0.043.
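The incremental-validity logic of the two regressions can be sketched as an R2-change computation: ΔR2 is the difference between the Step 2 and Step 1 R2 values, and its significance is usually assessed with an F test for the added predictor. In the sketch below, the R2 values and the number of predictors come from the text, whereas the sample size used for the F statistic (704) is an assumption that ignores possible missing data, so the resulting F is illustrative and is not a figure reported by the study.

```python
def r2_change(r2_reduced, r2_full, n, k_full, m_added=1):
    """Return (delta R2, F for the change) when m_added predictors enter the model.

    k_full is the number of predictors in the full (Step 2) model.
    """
    delta = r2_full - r2_reduced
    df_resid = n - k_full - 1
    f_change = (delta / m_added) / ((1 - r2_full) / df_resid)
    return delta, f_change


N_ASSUMED = 704  # ASSUMPTION: not stated in this section

# BMI model: 4 predictors at Step 1, 5 at Step 2.
delta_bmi, f_bmi = r2_change(0.249, 0.303, N_ASSUMED, k_full=5)
# BES model: 5 predictors at Step 1, 6 at Step 2.
delta_bes, f_bes = r2_change(0.611, 0.654, N_ASSUMED, k_full=6)

print(f"BMI: delta R2 = {delta_bmi:.3f}, F change = {f_bmi:.2f}")
print(f"BES: delta R2 = {delta_bes:.3f}, F change = {f_bes:.2f}")
```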

Prevalence of FA symptoms and FA diagnosis

Regarding the two samples combined, the number of FA symptoms that were met by participants ranged from 0 to 11 (mean = 1.942; SD = 2.739; median = 1). The lowest endorsement rate was for “Substance taken in larger amount and for a longer period than intended” (Criterion C, 10.23%), while the highest was for “Persistent desire or repeated unsuccessful attempts to quit” (Criterion H, 36.8%). The diagnostic threshold for FA was met by 15.20% of participants (n = 107). In more detail, two or three symptoms (mild FA) were endorsed by 18 participants (2.56%), four or five symptoms (moderate FA) were endorsed by 25 participants (3.55%), and six or more symptoms (severe FA) were endorsed by 64 participants (9.09%). Finally, 10 subjects (1.42%) met only the clinical impairment criterion but, according to Gearhardt et al. [8], did not receive an FA diagnosis. Descriptive statistics are displayed in Table 5.

Table 5 Prevalence of FA symptoms

Regarding the sample of inpatients with severe obesity, the number of FA symptoms that were met ranged from 0 to 11 (mean = 2.745; SD = 3.055; median = 2). The lowest endorsement rate was for “Substance taken in larger amount and for a longer period than intended” (Criterion C, 15%), while the highest was for “Continued use despite social or interpersonal problems” (Criterion H, 54.5%). The diagnostic threshold for FA was met by 24% (n = 96). In more detail, two or three symptoms (mild FA) were endorsed by 13 (3.25%), four or five symptoms (moderate FA) were endorsed by 22 (5.5%), and six or more symptoms (severe FA) were endorsed by 61 (15.25%). Finally, 4 inpatients (1%) met only the clinical impairment criterion (and at most one criterion) but, according to Gearhardt et al. [8], did not receive an FA diagnosis.

Regarding the sample of the general population, the number of FA symptoms that were met ranged from 0 to 11 (mean = 0.885; SD = 1.775; median = 0). The lowest endorsement rate was for “Substance taken in larger amount and for a longer period than intended” (Criterion C, 3.95%), while the highest was for “Persistent desire or repeated unsuccessful attempts to quit” (Criterion G, 12.5%). The diagnostic threshold for FA was met by 3.62% (n = 11). In more detail, two or three symptoms (mild FA) were endorsed by 5 (1.65%), four or five symptoms (moderate FA) were endorsed by 3 (0.99%), and six or more symptoms (severe FA) were endorsed by 3 (0.99%). Finally, 6 subjects (1.97%) met only the clinical impairment criterion (and at most one criterion) but, according to Gearhardt et al. [8], did not receive an FA diagnosis.
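The severity bands used in the three paragraphs above can be sketched as a small classification rule. The thresholds (2–3, 4–5, and 6 or more symptoms) come from the text; requiring the clinical impairment criterion in addition to at least two symptoms follows the YFAS 2.0 scoring as understood here and should be treated as an assumption of this sketch, as should the function name.

```python
def classify_fa(symptom_count: int, impairment: bool) -> str:
    """Map a symptom count (0-11) and the impairment criterion to a diagnostic band."""
    # ASSUMPTION: a diagnosis needs the impairment/distress criterion
    # plus at least two symptoms; otherwise no diagnosis is assigned.
    if not impairment or symptom_count < 2:
        return "no FA"
    if symptom_count <= 3:
        return "mild FA"
    if symptom_count <= 5:
        return "moderate FA"
    return "severe FA"


def prevalence(count: int, n: int) -> float:
    """Percentage of the sample, rounded to two decimals."""
    return round(100 * count / n, 2)


print(classify_fa(1, True))    # impairment only, at most one symptom
print(classify_fa(3, True))
print(classify_fa(5, True))
print(classify_fa(8, True))
# 61 severe cases in an assumed inpatient sample of 400 (n inferred from 96 = 24%).
print(prevalence(61, 400))
```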

FA versus no FA comparisons on ED measures

The group of study participants who received an FA diagnosis showed significantly higher values in binge eating tendencies (BES total: t = − 10.42, d = 1.26; FC: t = − 10.60, d = 1.27; B: t = − 8.86, d = 1.08), in disordered eating attitudes (DEBQ total: t = − 8.40, d = 0.91; EE: t = − 8.26, d = 0.95; ExE: t = − 6.23, d = 0.72), in EDs tendencies (EDEQ total: t = − 10.62, d = 1.41; R: t = − 6.61, d = 0.80; EC: t = − 7.53, d = 1.03; SC: t = − 12.68, d = 1.38; WC: t = − 11.19, d = 1.31), and in BMI (t = − 7.43, d = 0.74) (Table 6).

Table 6 Comparison of the YFAS 2.0 diagnoses (No FA vs. FA) and diagnostic categories (mild FA, moderate FA, and severe FA) with EDs measures and BMI

FA severity levels comparisons in EDs measures

Statistically significant differences across the FA severity levels were found in the BES total score (F = 18.537, p < 0.001, and f = 0.612) as well as in the FC subscale (F = 16.620, p < 0.001; and f = 0.579) and the B subscale (F = 14.317, p < 0.001; and f = 0.538). Regarding the DEBQ EDs attitudes, statistically significant differences between the FA severity levels were found in the DEBQ total score (F = 7.788, p = 0.001; and f = 0.397) as well as in the EE subscale (F = 8.699, p < 0.001; and f = 0.419) and the ExE subscale (F = 7.738, p = 0.001; and f = 0.395). No statistically significant difference was found in the RE subscale (F = 2.887, p = 0.061; and f = 0.241). Regarding the EDEQ EDs tendencies, no statistically significant difference was found between the FA severity levels in any scale [EDEQ total score (F = 1.733, p = 0.184; and f = 0.215), the R subscale (F = 2.486, p = 0.090; and f = 0.257), the EC subscale (F = 2.192, p = 0.119; and f = 0.242), the SC subscale (F = 1.006, p = 0.371; and f = 0.164), and the WC subscale (F = 0.965, p = 0.386; and f = 0.160)]. Finally, no statistically significant difference was found in BMI (F = 0.870, p = 0.422; and f = 0.127) (Table 6).

Discussion

In the last few years, the popularity of FA has increased markedly [1,2,3,4], and the construct has received growing interest in both clinical and research fields [2, 5]. According to Gearhardt et al. [8], FA relies on the evidence that some foods may be potentially addictive [20,21,22, 32,33,34,35]. Indeed, several studies highlighted that the neural reward response triggered by highly palatable food (e.g., chocolate, pizza) is also observed in SRAD (e.g., drug and/or alcohol) [4, 13, 27, 126,127,128,129,130] as well as in certain EDs (e.g., bulimia nervosa, BED) [12, 39, 131, 132]. Based on this parallelism and in light of the DSM-5 SRAD criteria, the YFAS 2.0 [8] rapidly became the instrument of choice for the assessment of FA [6, 18, 39].

The aim of this study was to examine extensively, for the first time, several psychometric properties of the I-YFAS 2.0 that had not yet been tested, such as the factorial structure at the item level and the measurement invariance between inpatients with severe obesity and the general population, thus filling the gap recently underlined in the scientific literature [39].

Two different models were specified: at the item level (the hierarchical model; Fig. 1) and at the symptom/criteria level (the first-order model; Fig. 2). These models were assessed on the two samples combined as well as in the sample of inpatients with severe obesity and in the sample of the general population, separately. For each of the aforementioned samples, CFA successfully confirmed that all of the 35 items of the I-YFAS 2.0 loaded onto the supposed criteria, which in turn loaded onto the general dimension of FA (Fig. 1). CFA also revealed that the I-YFAS 2.0 has good structural validity with good fit indices. Furthermore, the first-order factor loadings revealed a strong relationship between the items and the corresponding latent symptoms/criteria [75, 77, 78, 85].

However, some differences emerged between the sample of inpatients with severe obesity and the general population. These differences suggest a different conceptualization of the latent construct of FA across these two samples. Item#3 (Criterion A: “I ate to the point where I felt physically ill”) and item#32 (Criterion B: “I tried and failed to cut down on or stop eating certain foods”) were the most representative items for the sample of inpatients with severe obesity, and these results could partially explain the strong association between FA, BED, and obesity [13, 15, 133]. On the other hand, item#5 (Criterion C: “I spent a lot of time feeling sluggish or tired from overeating”), item#19 (Criterion I: “My overeating got in the way of me taking care of my family or doing household chores.”), and item#21 (Criterion H: “I avoided social situations because people wouldn’t approve of how much I ate.”) were the most representative for the sample from the general population. These results could be partially explained by the possible presence of people with a co-diagnosis of ED such as anorexia nervosa and/or bulimia nervosa [39, 133]. Furthermore, individuals from the general population are probably more tied to their social context than individuals with severe obesity, something that could work as a protective factor (buffer) against the development and/or onset of disordered eating attitudes and other psychological issues [134,135,136,137,138,139]. Furthermore, Criterion L (“Use causes clinically significant impairment or distress”) was the most representative for the sample of inpatients with severe obesity at the level of symptoms/criteria, while Criterion E (“Use continues despite knowledge of adverse consequences (e.g., emotional problems, physical problems)”) was the most representative for the sample from the general population.
Taken together, these results suggest that individuals with severe obesity were more focused on the internal negative consequences of FA (both physical and psychological), probably due to a stronger food-dependency [13, 26, 39, 131]. Conversely, individuals from the general population were more focused on the external negative consequences of FA, for example, avoiding social situations and fulfilling social roles.

MI was thus tested to explore at which level (structural vs. loadings and thresholds) the differences between the two samples lay. As advocated by Meule and Gearhardt [39], it was essential to verify that items correctly loaded onto the respective criteria and that this factor structure was at least equivalent across different populations, such as inpatients with severe obesity and the general population. Configural invariance was reached for the hierarchical model, and achieving structural invariance should be considered a good result. Indeed, the YFAS 2.0 does not work at the level of the items but at the level of criteria [8, 39]; thus, the achievement of configural invariance demonstrated that the I-YFAS 2.0 symptoms/criteria were effectively loaded by the supposed items. These results provide the first scientific evidence that the YFAS 2.0 scoring procedure is correct.

Considering the first-order model (Fig. 2), CFA successfully confirmed that all the 11 symptoms/criteria of the I-YFAS 2.0 loaded onto a latent dimension of FA. In line with the literature, CFA revealed that the I-YFAS 2.0 had excellent fit indices and an excellent structural validity [80, 82, 87, 92] in each sample (the two samples combined, the inpatients with severe obesity, and the general population). As reported in Table 2, considering the overall sample, all of the items showed high factor loadings, revealing a strong relationship between the items and the latent dimension of FA. Moreover, both fit indices and factor loadings were in line with previous validation studies [42,43,44,45,46,47,48,49]; the slight differences observed could be due to differences in sample size, sample composition, and/or cultural differences [39]. Also in this case, some differences emerged between the sample of inpatients with severe obesity and the general population, which reinforce the idea of a different conceptualization of the latent construct of FA across these two samples. Criterion K (“Craving, or a strong desire or urge to use”) and Criterion F (“Tolerance (marked increase in amount; marked decrease in effect)”) were the most representative criteria for people with severe obesity, suggesting that the ‘craving’ symptom/criterion may be responsible for the overlap between FA, BED, and obesity [13, 15, 133]. These results also reinforce the idea that these individuals are more focused on the physical and psychological negative consequences of FA, probably due to a stronger food-dependency than people in the general population [13, 26, 39, 131]. However, it should be highlighted that differences in results between the higher-order CFA and this first-order model are probably due to the absence of the distress symptom/criterion in the latter structural model (according to Gearhardt and colleagues [8]).
On the other hand, Criterion C (“Much time/activity to obtain, use, recover”) and Criterion I (“Failure to fulfill major role obligation (e.g., work, school, home)”) were the most representative criteria for the general population. Even in this case, these results suggest that these individuals are more focused on the (interpersonal) negative consequences of FA that are not strictly related to psychological and physical dependency on food, such as the fulfillment of social roles.

MI analysis was performed to explore these differences between the two samples, and strong invariance was achieved. This kind of invariance suggests that the eleven items were equivalently related to the latent FA factor in each sample and that samples had the same expected item response at the same absolute level of the trait. The MI of the first-order factor structure [8, 39] was tested for the first time between two samples with clearly different characteristics (inpatients with severe obesity and the general population), which can be translated, very roughly, into a huge difference in BMI. These results suggest that participants in the two samples interpreted the YFAS 2.0 questions in the same way (the factorial structure was equal across samples), with the same strength (items were related to the latent construct equally between the two samples), and with the same starting point (item thresholds were equal between the two samples). However, the latent trait was not equally distributed between inpatients with severe obesity and the general population (latent means were different between the two samples). These results suggest that comparisons between these two samples should be interpreted with caution (different latent means), although the two groups were comparable at the item level (equal item thresholds) [80, 82, 87, 92].

Reliability analyses were also performed, providing satisfactory results for both the structural models that were tested. Considering the dichotomous nature of the YFAS 2.0 response scale, categorical McDonald ω and KR20 were computed [107,108,109,110,111,112] and both showed a high internal consistency of the I-YFAS 2.0, which is in line with the literature [8, 39, 42,43,44,45,46, 49]. A novel result, instead, was the good one-month test–retest reliability of both the I-YFAS 2.0 symptom count and diagnosis, as shown by the two-way mixed ICC [140].

Convergent validity analyses were also performed. In line with the previous validation studies, significant correlations were found between the I-YFAS 2.0 symptom count and EDs measures [39, 141]. The strongest correlations were found with the BES total score (r = 0.664), its subscales (BES feelings, r = 0.639; BES behaviors, r = 0.610), and the EDEQ eating concern scale (r = 0.523). These correlations suggest a strong association between FA and EDs, possibly due to the presence of people with a co-diagnosis such as FA and BED or FA and bulimia nervosa [39, 133]. A significant but small correlation was also found with BMI (r = 0.311). Despite the common idea that BMI is positively and linearly associated with FA, these results showed a nonlinear relation between these two variables (Table 1, Supplementary material), as underlined by the correlations reported in Table 3 and as also argued by Meule [142] and Meule and Gearhardt [39].

In addition, results from the ROC analyses showed for the first time that the I-YFAS 2.0 is a good screening/diagnostic tool for the detection of FA in different populations. The I-YFAS 2.0 showed a good strength of accuracy (AUC = 0.706), modest Sp (0.412), and high Se (0.875) in discriminating between inpatients with severe obesity and individuals from the general population. Moreover, considering the ED’s ad hoc created MLV, the I-YFAS 2.0 showed an excellent strength of accuracy (AUC = 0.849), high Sp (0.768), and high Se (0.807) in discriminating between participants with EDs tendencies and participants with no EDs tendencies. These results show the good ability of the I-YFAS 2.0 to correctly detect people with FA and support both the (small) association between YFAS 2.0 symptom count and BMI as well as the association between FA and EDs.

Furthermore, the I-YFAS 2.0 symptom score accounted for unique variance in BMI as well as in BED tendencies. Considering BMI, the I-YFAS 2.0 symptom score significantly increased the proportion of BMI variance accounted for by the model (total R2 = 30.3%), in line with previous studies [43]. This result suggests a (small) positive relationship between BMI and FA that in some cases seems to be nonlinear: indeed, according to Meule, although FA relates positively to BMI, the slope levels off in higher body weight ranges [39, 142]. Considering BED tendencies, the I-YFAS 2.0 symptom score increased the proportion of BES variance accounted for by the model (total R2 = 65.4%).

Prevalence analysis revealed the classical ‘J-shape curve’ found in the previous studies [8, 44, 45, 49] for both the sample of inpatients with severe obesity and the general population. Considering the sample of inpatients with severe obesity, the diagnostic threshold for FA was met by 24% of participants (n = 96). In more detail, ‘Mild FA’ was endorsed by 13 participants (3.25%), ‘Moderate FA’ was endorsed by 22 participants (5.5%), and ‘Severe FA’ was endorsed by 61 participants (15.25%). Moreover, the lowest endorsement rate (15%) was for Criterion C (“Substance taken in larger amount and for a longer period than intended”), while the highest (54.5%) was for Criterion H (“Continued use despite social or interpersonal problems”). Considering the sample of the general population, the diagnostic threshold for FA was met by 3.62% of participants (n = 11). In more detail, ‘Mild FA’ was endorsed by 5 participants (1.65%), ‘Moderate FA’ was endorsed by 3 participants (0.99%), and ‘Severe FA’ was endorsed by 3 participants (0.99%). Moreover, the lowest endorsement rate (3.95%) was for Criterion C (“Substance taken in larger amount and for a longer period than intended”), while the highest (12.5%) was for Criterion G (“Characteristic withdrawal symptoms; substance taken to relieve withdrawal”), probably due to the presence of comorbid ED diagnoses. Differences in endorsement rates, across the two samples of this study as well as between previous studies, may be due to cross-cultural differences and/or differences in sample size and/or in sample composition. Indeed, some studies may have enrolled individuals with a co-diagnosis of ED, thus modifying the endorsement of some symptoms/criteria (e.g., a high prevalence of individuals with BED and/or bulimia nervosa).

Statistically significant differences were found between individuals with FA and individuals without FA in almost all the EDs measures and also in BMI. The largest differences were found in the EDEQ total score (d = 1.41), the EDEQ shape concern subscale (d = 1.38), the EDEQ weight concern (d = 1.31), and the BES feelings subscale (d = 1.27). These results are in line with previous studies that showed a good discriminant ability of the YFAS 2.0 between subjects with EDs and subjects without EDs [44] and are supported by the results of the ROC curve analysis. These results corroborate the aforementioned hypothesis that people with FA should be more prone to have an ED in comorbidity. Statistically significant differences in EDs measures were also found across the YFAS 2.0 severity levels—in particular, between ‘Mild FA’ and ‘Severe FA’—with the exception of the DEBQ RE subscale, the EDEQ subscales, and the BMI. The lack of statistically significant differences in those variables is probably due to the characteristics of participants composing the three categories, which consisted of both inpatients with severe obesity (the majority, 89.7%) and people from the general population (the minority, 10.3%). Moreover, as argued before, the absence of differences in the aforementioned EDs measures between the YFAS 2.0 severity levels should be due to the conceivable presence of individuals with a co-diagnosis of an ED such as BED or bulimia nervosa across each YFAS 2.0 severity class [39, 44, 133].

Despite these interesting findings, several limitations have to be underlined. First of all, although the sample size was adequate to perform a CFA, the small number of individuals with FA did not allow an MI analysis of the I-YFAS 2.0 between participants with an FA diagnosis and those without one. Also, this study lacks a specific ED screening, which did not allow a comparison between the general population, inpatients with severe obesity, and subjects with ED; this represents an interesting perspective for future studies. Moreover, the number of participants (n = 20) who were re-administered the I-YFAS 2.0 was enough for the assessment of the test–retest reliability but was far from being adequate for a longitudinal MI analysis. Finally, no other measure of FA was used. However, it has to be highlighted that, to date, no other questionnaire assesses FA from the general perspective of the DSM-5 SRAD. Indeed, for example, the ‘Food Craving Questionnaire (state and trait)’ refers solely to the specific ‘craving’ criterion. Moreover, the ‘Rep(-EAT)-Q’ as well as the ‘Grazing Inventory’ assess the tendency to graze, but they do not consider the SRAD criteria. Finally, the ‘Addictive-Like Eating Behavior Scale’ measures eating addiction rather than FA.

Despite these limitations, this study indicates that the I-YFAS 2.0 is a good instrument for the assessment of FA in both the clinical and the general populations, and it can also be considered as a valid and reliable tool to be used in the research of FA.

Finally, the I-YFAS 2.0 should be considered as a starting point for the assessment of FA and in the planning of psychological treatments [143,144,145,146,147,148,149,150]. By means of both the I-YFAS 2.0 symptom count and diagnosis, healthcare professionals could get support in making clinical decisions about patients with severe obesity as well as individuals from the general population who request a visit for problems related to weight, food, and EDs.

What is already known on this subject?

Parallels in the biological, psychological, and behavioral factors implicated in addiction and problematic eating have led to the hypothesis that an addictive process may contribute to excessive food consumption.

To date, the Yale Food Addiction Scale 2.0 (YFAS 2.0) is the most important instrument to investigate addiction-like eating behavior according to the DSM-5 Substance-Related and Addictive Disorders (SRAD) criteria.

The YFAS 2.0 was successfully translated and/or validated in several languages: German, French, Spanish, Italian, Arabic, Turkish, Korean, and Japanese.

What does this study add?

The present study aimed to fill a gap underlined by Meule and Gearhardt in a recent review [39]: indeed, none of the aforementioned validation studies assessed whether the YFAS 2.0 items loaded onto the designed symptoms/criteria.

The present study aimed—for the first time—to assess measurement invariance of the Italian version of the YFAS 2.0 in a sample of inpatients with severe obesity compared to a sample of subjects enrolled from the general population.

The present study aimed to provide a first reliable estimation of prevalence of food addiction in a sample of inpatients with severe obesity and the general population.