Introduction

Several validated self-report instruments are currently available to assess anti-fat attitudes, fat stereotypes, and weight-related teasing [1, 2]. Recently, the importance of assessing internalized weight stigma (IWS), defined as “[…] the degree to which an obese person endorses weight-based negative stereotypes and attributes these to themselves (Durso et al., 2012)” [3, p. 240], was also highlighted. This led to the development of the Weight Self-Stigma Questionnaire (WSSQ) [4].

The psychometric properties of the WSSQ were examined in an American sample of 169 overweight and obese adults [4]. A principal component analysis with varimax rotation suggested a two-factor model encompassing self-devaluation (self-depreciation and weight-related shame) and fear of enacted stigma (fear of enacted stigma and weight discrimination). Subsequent analyses revealed acceptable scale score reliability (α = 0.81 and 0.87) and test–retest reliability (r = 0.62 and 0.80) for both subscales, as well as significant relations between the WSSQ subscales and measures of disordered eating, general experiential avoidance, quality of life, psychological distress, and attitudes toward obese people.

Two recent studies have examined the psychometric properties of the WSSQ in other languages and cultures. Hain, Langer, Hünnemeyer, Rudofsky, Zech, and Wild [5] examined the psychometric properties of the German version of the WSSQ among a sample of 94 severely obese adults. Their results revealed acceptable scale score reliability for both WSSQ subscales (α = 0.81 and 0.83) and demonstrated relations between the WSSQ and measures of depression, quality of life, psychological distress, dissociative symptoms, and weight- and body-related shame and guilt.

Farhangi, Emam-Alizadeh, Hamedi, and Jahangiry [6] examined the psychometric properties of an Arabic version of the WSSQ among 170 overweight and obese Iranian women. Again, the scale score reliability of both WSSQ subscales was acceptable (α = 0.75 and 0.78), and both subscales were associated with quality of life and psychological distress.

Although informative, the studies reviewed present some limitations that need to be addressed. First, although studies of the German and Arabic versions of the WSSQ [5, 6] suggest that it provides a valid measure of IWS, the factor validity of the WSSQ was not assessed. Therefore, the generalizability of the WSSQ factor structure within other linguistic or cultural groups remains unknown.

Second, the three studies reviewed focused on adults and did not consider samples of overweight or obese adolescents. This lack of research on IWS in younger populations [7] is concerning because overweight/obese adolescents are known to be particularly vulnerable to weight-related stigmatization [8–10]. The WSSQ thus needs to be validated in a sample of overweight and obese adolescents.

Third, although Hain et al. [5] and Lillis et al. [4] relied on mixed-sex samples, they did not examine the presence of differential item functioning (DIF) and possible latent mean differences on the WSSQ subscales as a function of the participants’ sex. This lack of verification could lead to erroneous interpretations of observed sex differences, which might reflect DIF rather than true sex-based differences. Consequently, it is unknown whether the WSSQ could be used for sex-based mean comparisons, and whether previous results showing that women present higher levels of IWS than men [11–13] could be replicated with the WSSQ. In fact, examination of the presence of DIF is a prerequisite for sex-based mean comparisons with the WSSQ.

Finally, there is currently no self-report questionnaire available to assess IWS among French-speaking overweight/obese individuals. Therefore, the development of a French version of the WSSQ would likely contribute to the crosslinguistic generalizability of this questionnaire and greatly facilitate the assessment of IWS in cross-cultural studies. Indeed, French is the official or co-official language of numerous countries and territories worldwide (e.g., Canada, Belgium, Switzerland, and several former French colonies).

The main objective of this study was to examine the validity of the French WSSQ among overweight/obese French Canadian adolescents. First, the WSSQ was translated into French, and its factor validity and reliability were examined. Second, the DIF and latent means of the WSSQ subscales as a function of the respondents’ sex were examined. Finally, the convergent validity of the WSSQ subscales with measures of body mass index, self-esteem, physical appearance, anxiety, depression, fear of negative appearance evaluation, and eating-related pathology was also investigated. These measures were chosen because they are known to be significantly related to IWS [4, 5, 7, 14].

Method

Procedures and participants

This study was approved by the research ethics committee of the Université du Québec en Outaouais and the relevant school boards from various regions of the Canadian Province of Quebec. The participants were recruited by the Association pour la santé publique du Québec, between November 2013 and March 2014, in 78 secondary schools. Participants’ parents/legal representatives received a letter specifying the purpose of the study, which was entitled “Weight discrimination and its consequences among adolescents.” They also received a passive consent form for their adolescent’s participation. The adolescents were informed about the study through posters and information booths. Those interested in participating in the study were invited to fill out the consent form and the questionnaires online. Every respondent who completed the questionnaires was eligible for a draw (an iPad and five gift certificates of $100 each). A total of 156 overweight (76.3%) and obese (23.7%) French-speaking adolescents (81 boys and 75 girls; 14 to 19 years old, M = 16.31, SD = 0.85) completed the WSSQ and were therefore selected for the study.

Measures

Demographic and anthropometric characteristics

The adolescents were asked to report their sex, age, height, and weight. To control for self-report biases, weight and height were corrected [15] and then used to estimate participants’ body mass index [BMI (in kg/m²) = weight/(height × height)]. More specifically, participants’ self-reported values were adjusted using sex-based correction formulas that consider their age (regarding height, see equations 7 for boys and 9 for girls, and regarding weight, see equations 13 for boys and 14 for girls in Brettschneider et al. [15]). Overweight and obesity statuses were determined based on the revised BMI cutoff scores provided by the International Obesity Task Force [16]. These scores are sex and age specific and correspond to a BMI ≥ 25 kg/m² (for overweight) and ≥ 30 kg/m² (for obesity) at age 18.
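The BMI computation described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the self-report correction formulas of Brettschneider et al. [15] are not reproduced, and the default cutoffs are only the age-18 values quoted in the text. For younger participants, the sex- and age-specific International Obesity Task Force cutoffs [16] would have to be supplied by the user.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI in kg/m^2, i.e. weight / (height * height)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / (height_m * height_m)


def weight_status(bmi: float,
                  overweight_cutoff: float = 25.0,
                  obesity_cutoff: float = 30.0) -> str:
    """Classify a BMI value against two cutoffs.

    ASSUMPTION: the defaults are the age-18 equivalents mentioned in the
    text; age- and sex-specific cutoffs must be passed in explicitly.
    """
    if overweight_cutoff >= obesity_cutoff:
        raise ValueError("overweight cutoff must be below obesity cutoff")
    if bmi >= obesity_cutoff:
        return "obese"
    if bmi >= overweight_cutoff:
        return "overweight"
    return "not overweight"


if __name__ == "__main__":
    # Hypothetical (already corrected) values for one 18-year-old respondent.
    bmi = body_mass_index(weight_kg=85.0, height_m=1.75)
    print(round(bmi, 1), weight_status(bmi))
```

Keeping the cutoffs as parameters rather than constants makes it explicit that the classification depends on the respondent's age and sex.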

Weight self-stigma

The English version of the WSSQ was translated into French following standardized back-translation techniques [17]. The original WSSQ items were first translated into French by a professional translator and discussed in committee until a consensus was obtained. Then, the French items were back-translated into English by a second professional translator and compared with the original English version. Any inconsistencies between versions were discussed in committee, and this process was repeated until both versions were considered equivalent. The final 12 French items of the WSSQ (see Table 3) were rated by the adolescents using a five-point scale ranging from completely disagree (1) to completely agree (5).

Self-esteem/physical appearance

Participants’ self-esteem (10 items; e.g., “Overall, I have a lot to be proud of”) and physical appearance (eight items; e.g., “I am good looking”) were measured using two subscales (global self-esteem and physical appearance) from the French [18] Self-Description Questionnaire II (SDQ-II) [19]. Participants rated the items using a six-point scale ranging from false (1) to true (6).

Anxiety/depression

Participants’ anxiety (seven items; e.g., “Worrying thoughts go through my mind”) and depression (seven items; e.g., “I have lost interest in my appearance”) were measured using a French version [20] of the Hospital Anxiety and Depression Scale (HADS) [21]. All items were rated on four-point severity scales.

Fear of negative appearance evaluation

Respondents’ fear of having their physical appearance negatively evaluated by others (five items; e.g., “I worry that people will find fault with the way I look”) was measured using the French version [22] of the Fear of Negative Appearance Evaluation Scale (FNAES) [23]. These items were rated using a five-point scale ranging from not at all (1) to extremely (5).

Eating-related pathology

Adolescents’ eating-related pathology was assessed using the 18-item French version for adolescents [24] of the Eating Attitudes Test (EAT-26) [25, 26]. This version assesses six dimensions: fear of getting fat (FGF; e.g., “I am terrified about being overweight”), social pressure to gain weight (SPGW; e.g., “I feel that others would prefer if I ate more”), vomiting-purging behaviors (VPB; e.g., “I vomit after I have eaten”), eating-related control (ERC; e.g., “I avoid foods with sugar in them”), eating-related guilt (ERG; e.g., “I feel uncomfortable after eating sweets”), and food preoccupation (FP; e.g., “I give too much time and thought to food”). Participants answered each item using a six-point scale ranging from always (6) to never (1).

Data analysis

The a priori two-factor structure of the WSSQ was examined with a confirmatory factor analysis (CFA). Model fit was assessed based on multiple fit indices [27–30]: the chi-square test of exact fit (χ²), the comparative fit index (CFI < 0.90 indicates “bad” fit; ≥ 0.90 indicates “acceptable” fit; > 0.95 indicates “good” fit), the Tucker-Lewis index (TLI; same thresholds as for the CFI), the root-mean-square error of approximation (RMSEA > 0.10 indicates “bad” fit; > 0.05 and ≤ 0.10 indicates “acceptable” fit; ≤ 0.05 indicates “good” fit), and the 90% confidence interval of the RMSEA. The composite reliability of the WSSQ subscales was estimated using McDonald’s [31] omega (ω), which is interpreted in the same way as other reliability coefficients (e.g., Cronbach’s alpha), but provides a more accurate representation of the measurement model.
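The interpretation guidelines above, and the omega coefficient, can be illustrated with a short sketch. The function names are hypothetical. The omega formula shown is the common composite-reliability form for standardized loadings, ω = (Σ|λ|)² / [(Σ|λ|)² + Σ(1 − λ²)], which assumes that each item loads on a single factor and that uniquenesses are uncorrelated. The text cites McDonald [31] without giving the formula, so this form is an assumption, and the loadings in the usage example are invented for illustration.

```python
from typing import Sequence


def classify_cfi_tli(value: float) -> str:
    """Label a CFI or TLI value with the cutoffs stated in the text."""
    if value > 0.95:
        return "good"
    if value >= 0.90:
        return "acceptable"
    return "bad"


def classify_rmsea(value: float) -> str:
    """Label an RMSEA value with the cutoffs stated in the text."""
    if value <= 0.05:
        return "good"
    if value <= 0.10:
        return "acceptable"
    return "bad"


def composite_omega(loadings: Sequence[float]) -> float:
    """Composite reliability from standardized loadings.

    ASSUMPTION: one factor per item and uncorrelated uniquenesses, so each
    uniqueness equals 1 - loading**2.
    """
    if not loadings:
        raise ValueError("at least one loading is required")
    if any(abs(l) > 1 for l in loadings):
        raise ValueError("standardized loadings must lie in [-1, 1]")
    common = sum(abs(l) for l in loadings) ** 2
    unique = sum(1.0 - l * l for l in loadings)
    return common / (common + unique)


if __name__ == "__main__":
    # Hypothetical loadings, NOT the values from Table 1.
    print(composite_omega([0.70, 0.80, 0.90]))
    print(classify_cfi_tli(0.93), classify_rmsea(0.07))
```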

The sample was too small to conduct complete tests of measurement invariance as a function of sex. Therefore, DIF and latent mean differences as a function of sex were investigated using a multiple indicators multiple causes (MIMIC) approach [32]. These analyses were performed in the following sequence: (a) a null effects model (the paths from sex to the latent factors and item responses were constrained to be zero); (b) a saturated model (the paths from sex to the latent factors were constrained to be zero, but those to the item responses were freely estimated); and (c) an invariant model (the paths from sex to the latent factors were freely estimated, but those to the item responses were constrained to be zero). The goodness of fit of these models was then compared (null vs. saturated or invariant to test the effect of sex on the WSSQ responses; saturated vs. invariant to test for possible DIF). All model comparisons were based on typical interpretation guidelines [33, 34], with ∆CFI/TLI < 0.01 and ∆RMSEA < 0.015 reflecting an equivalent fit to the data for alternative models.

Given the relatively small sample available in this study, we elected to rely on latent factor scores rather than fully latent variables to maximize the statistical power of the convergent validity analyses by reducing the number of freely estimated parameters. In addition, as opposed to manifest mean scores, latent factor scores have the advantage of providing a partial level of control for measurement error by attributing higher weight to more reliable items (with higher factor loadings, and thus lower uniquenesses) [35, 36]. Therefore, a CFA was first conducted separately for each instrument used to assess the convergent measures (SDQ-II: self-esteem and physical appearance; HADS: anxiety and depression; FNAES: fear of negative appearance evaluation; EAT: FGF, SPGW, VPB, ERC, ERG, and FP), and the parameter estimates from these measurement models were used to estimate the composite reliability (ω) for each subscale. Second, the latent factor scores from each CFA were saved from these preliminary measurement models using the regression method implemented in Mplus 7.4 [37] as part of the FSCORES command [38] and imported into the main data set. Third, the convergent validity of the WSSQ latent factors with the observed BMI scores, and the latent factor scores for the measures of self-esteem, physical appearance, anxiety, depression, fear of negative appearance evaluation, and eating-related pathology were examined using a structural equation model (SEM).

All analyses were conducted using the robust weighted least squares (WLSMV) estimator implemented in the statistical software Mplus 7.4 [37] to better consider the ordered categorical nature of the five-point response scale and asymmetric response thresholds of the WSSQ [39]. To account for the few missing responses (M = 7.6%) at the item level, models were estimated based on the full available information using algorithms implemented in Mplus for WLSMV [40].

Results

Factor validity and reliability

The a priori two-factor CFA resulted in an acceptable (for RMSEA) to good (for CFI/TLI) level of fit to the data [χ²(53) = 106.73, p < .001, CFI = 0.977, TLI = 0.971, RMSEA = 0.081, RMSEA 90% CI = 0.058–0.103]. The estimates from this model, which are presented in Table 1, show that all loadings are substantial (self-devaluation: Mλ = 0.778; fear of enacted stigma: Mλ = 0.828). Additionally, the WSSQ subscales were significantly correlated (r = 0.881) and provided excellent composite reliability (self-devaluation: ω = 0.905; fear of enacted stigma: ω = 0.929).

Table 1 Standardized parameter estimates from the confirmatory factor analysis of the WSSQ

Differential item functioning and latent mean differences

The null effects model provided an acceptable level of fit for all indicators [χ²(65) = 183.61, p < .001, CFI = 0.947, TLI = 0.936] except for the RMSEA (0.108; 90% CI = 0.090–0.127). The saturated [χ²(53) = 103.94, p < .001, CFI = 0.977, TLI = 0.966, RMSEA = 0.078, 90% CI = 0.056–0.101] and invariant models [χ²(63) = 142.50, p < .001, CFI = 0.964, TLI = 0.956, RMSEA = 0.090, 90% CI = 0.070–0.110] both resulted in satisfactory levels of fit to the data. Furthermore, comparison of the null effects model with the saturated (∆CFI/TLI: +0.030/+0.030; ∆RMSEA: −0.030) and invariant models (∆CFI/TLI: +0.017/+0.020; ∆RMSEA: −0.018) resulted in a substantial improvement in model fit, revealing that the participants’ sex had some effect on the WSSQ responses. However, the saturated model resulted in substantially better model fit than the invariant model (∆CFI/TLI: +0.013/+0.010; ∆RMSEA: −0.012), revealing DIF as a function of sex. Examination of the modification indices of the model suggested that DIF might be limited to a single item (#7: “I feel insecure about others’ opinions of me”) from the fear of enacted stigma subscale. The results revealed that boys tended to present significantly lower means than girls on this item (β = −0.483, p < .001). A fourth model of partial invariance was thus estimated by freeing the direct effect of sex on responses for this item. This model provided a satisfactory level of fit to the data [χ²(62) = 125.31, p < .001, CFI = 0.972, TLI = 0.964, RMSEA = 0.081, 90% CI = 0.060–0.101] that was almost identical to the fit of the saturated model (∆CFI/TLI: −0.005/−0.002; ∆RMSEA: +0.003). Results from this partial invariance model showed that boys tended to present significantly lower latent means of self-devaluation (β = −0.286, p = .001) and fear of enacted stigma (β = −0.228, p = .012) than girls.
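The model comparisons reported in this section can be retraced with a short sketch that recomputes the changes in fit from the reported indices and applies the equivalence guideline (∆CFI/TLI < 0.01 and ∆RMSEA < 0.015). The class and function names are hypothetical. Comparing absolute changes to the thresholds, and treating a change of exactly 0.010 as not equivalent, are assumptions about how the guideline is applied rather than details stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fit:
    cfi: float
    tli: float
    rmsea: float


def fit_change(reference: Fit, alternative: Fit) -> tuple:
    """Return (dCFI, dTLI, dRMSEA) as alternative minus reference.

    Values are rounded to three decimals, the precision of the reported
    indices, to avoid floating-point noise in the differences.
    """
    return (
        round(alternative.cfi - reference.cfi, 3),
        round(alternative.tli - reference.tli, 3),
        round(alternative.rmsea - reference.rmsea, 3),
    )


def equivalent_fit(reference: Fit, alternative: Fit,
                   index_tol: float = 0.01, rmsea_tol: float = 0.015) -> bool:
    """True when all three changes fall below the guideline thresholds.

    ASSUMPTION: the thresholds are applied to absolute changes.
    """
    d_cfi, d_tli, d_rmsea = fit_change(reference, alternative)
    return (abs(d_cfi) < index_tol
            and abs(d_tli) < index_tol
            and abs(d_rmsea) < rmsea_tol)


# Fit indices as reported in the text for the four MIMIC models.
NULL = Fit(cfi=0.947, tli=0.936, rmsea=0.108)
SATURATED = Fit(cfi=0.977, tli=0.966, rmsea=0.078)
INVARIANT = Fit(cfi=0.964, tli=0.956, rmsea=0.090)
PARTIAL = Fit(cfi=0.972, tli=0.964, rmsea=0.081)

if __name__ == "__main__":
    for label, ref, alt in [
        ("null vs. saturated", NULL, SATURATED),
        ("null vs. invariant", NULL, INVARIANT),
        ("invariant vs. saturated", INVARIANT, SATURATED),
        ("saturated vs. partial", SATURATED, PARTIAL),
    ]:
        print(label, fit_change(ref, alt), equivalent_fit(ref, alt))
```

Under these assumptions, only the comparison between the saturated and partial invariance models is classified as equivalent, which matches the interpretation given in the text.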

Convergent validity

The SEM that includes the WSSQ latent factors and convergent measures (i.e., observed BMI and latent factor scores) resulted in an acceptable level of fit to the data [χ²(173) = 278.38, p < .001, CFI = 0.946, TLI = 0.914, RMSEA = 0.062, 90% CI = 0.049–0.076]. As noted in Table 2, the composite reliability of the convergent measures was satisfactory (Mω = 0.820). The correlations estimated between the WSSQ, the observed BMI scores, and the latent factor scores for the other convergent measures are reported in Table 2. The results from these analyses revealed that the WSSQ subscales were significantly and negatively related to self-esteem and physical appearance. Additionally, they were significantly and positively related to anxiety, depression, fear of negative appearance evaluation, and to the FGF, ERC, FP, VPB, ERG, and SPGW components of the EAT, although the association with SPGW was limited to the fear of enacted stigma subscale. No significant relation was found between the WSSQ subscales and BMI.

Table 2 Factor correlations from the analyses of convergent validity

Discussion

This study proposed a French adaptation of the WSSQ and examined its factor validity and reliability among a sample of overweight and obese French Canadian adolescents. The CFA model resulted in an acceptable (for RMSEA) to good (for CFI/TLI) level of fit to the data. Despite this slight divergence between the RMSEA and the CFI/TLI, the adequacy of the two-factor structure of the French WSSQ is supported. It is common in the structural equation modeling literature to observe small divergences between the RMSEA and the CFI/TLI [30]. This is particularly true in the context of categorical data analysis, for which the relative performance of these indices is not as well documented as it is for more typical analyses of continuous data [29, 41, 42]. Indeed, recent studies suggest that the use of traditional interpretation guidelines for the RMSEA in categorical data analysis may be inappropriate, or too stringent, under certain conditions [29, 41, 42].

Additional results revealed that the WSSQ subscales present satisfactory levels of composite reliability and are strongly correlated with one another. These results are consistent with those obtained with the original English version [4], as well as with the German and Arabic versions [5, 6]. Moreover, this study confirms the generalizability of the WSSQ factor structure to overweight/obese adolescents.

To our knowledge, this study is the first to examine possible DIF in WSSQ responses and latent mean differences as a function of the respondents’ sex. The results generally revealed no item bias in relation to the youth’s sex, except for one item (i.e., “I feel insecure about others’ opinions of me”) included in the fear of enacted stigma subscale. This last result shows that this item tends to function differently in boys and girls, and that boys may feel more secure than girls about other people’s opinions. Therefore, this item should be excluded when computing the fear of enacted stigma subscale for the purpose of sex-based comparisons. Additional results revealed that boys have significantly lower latent means of self-devaluation and fear of enacted stigma than girls. These results are consistent with those from previous research conducted with adults [11–13]. However, because this is the first study to examine sex-based differences in IWS in an adolescent sample, the generalizability of these results remains an open question that should be examined more thoroughly in future research.

This study also tested the convergent validity of the WSSQ subscales in relation to several criterion measures. The results revealed that the subscales are moderately and negatively related to self-esteem and physical appearance. In addition, they showed the WSSQ subscales to be moderately and positively related to anxiety, depression, fear of negative appearance evaluation, and eating-related pathology. These results are all in the expected directions and consistent with previous IWS research conducted among overweight/obese youth or adults [4, 5, 7, 14]. However, in contrast to the original validation study [4], no significant association was found between the WSSQ subscales and BMI. This lack of significant association is consistent with the study on the German version of the WSSQ [5] conducted with a sample of severely obese adults. Such discrepancies could be attributed to differences in sample composition between the studies. Indeed, 76% of the present sample was overweight (resulting in a range restriction of BMI measurements), and 100% of the German participants were severely obese. Conversely, the original validation study [4] included a far wider range of BMI levels (from overweight to severely obese). Future studies should thus examine this association more thoroughly over a wide range of BMI levels.

This study has limitations that must be considered. One important limitation is that the psychometric properties of the French WSSQ were examined in a single, small sample of overweight and obese adolescents from 14 to 19 years old. Consequently, there is a need to cross-validate the present results using additional, and larger, samples of overweight or obese youth from different age groups (children, early and late adolescence). Studies using larger samples will also be necessary to assess the measurement invariance and latent mean differences of the WSSQ more systematically across age (children vs. adolescents, early vs. late adolescence), sex (boys vs. girls), and weight (overweight vs. obese) categories, as well as possible interactions among these characteristics (age × sex).

Furthermore, the crosslinguistic and cultural invariance of the French WSSQ was not examined in this study. It would be interesting to examine the measurement invariance of the French WSSQ among samples from various countries and cultures (Belgium, France, Morocco, Switzerland, Tunisia), and linguistic groups, as well as to compare it with the original English version among samples of bilingual overweight and obese adolescents.

Additionally, the test–retest reliability and the longitudinal invariance of the French WSSQ were not examined and should thus be tested in future research. These analyses represent a prerequisite for longitudinal studies aiming to assess the evolution of IWS levels over time.

Finally, the WSSQ is an explicit measure of IWS that “reflect[s] attitudes endorsed as personal beliefs” [43, p. 180]. However, this type of measure can hide the real magnitude of these attitudes because responses on explicit measures can be influenced by social desirability and the “participant’s ability and willingness to accurately report his or her evaluations and judgments” [44, p. 96]. Therefore, it might be interesting for future studies to also use an implicit measure of IWS (for a review, see Morrison et al. [44]). Implicit measures are unconscious and “automatically activated evaluations acquired from repeated messages in the environment” [43, p. 180]. As mentioned by Ruggs, King, Hebl, and Fitzsimmons [2], implicit measures of IWS “assess the attitudes people privately hold but may not explicitly express due to social norms” (p. 65). Interestingly, recent findings [43] suggest that combining the two methods helps to achieve a more accurate depiction of IWS.

In conclusion, this study has confirmed that the French WSSQ has satisfactory psychometric properties and can be used in research to examine IWS among samples of overweight and obese adolescents. However, given the limitations of this study, the use of this instrument with children or young adolescents, or in longitudinal studies, appears premature. The French WSSQ can be used with Francophone adolescents with similar characteristics and in the context of sex-group-based comparisons.