Introduction

Food security is the state of having access to a sufficient quality and quantity of food needed for healthy living [1]. In 2018, 11% of the US population lacked such access, with the prevalence of food insecurity increasing dramatically during the 2020 economic downturn [2, 3]. Despite being partly defined by limited access, food insecurity in the United States and other developed countries is a risk factor for weight gain and obesity, likely due to reliance on relatively inexpensive, energy–dense, highly processed food [4,5,6]. Food insecurity is also associated with psychological morbidity, including—strikingly—as much as a six-fold greater prevalence of eating disorder (ED) pathology compared to non-food-insecure individuals [7, 8].

This strong association highlights a critical need for research on the impact of food insecurity on ED pathology, yet existing ED measures were developed in samples of individuals who were likely predominantly food secure. Indeed, most commonly used ED measures were developed in undergraduate and other convenience samples that may have over-represented female, White, and/or middle-class participants [e.g., 9,10,11]. Several authors have begun to address issues of diversity and ED stereotypes in measurement by examining the measurement of ED symptoms in Black or Hispanic respondents and in men, but these studies still relied on undergraduate samples and were not designed or recruited to be socioeconomically diverse [12,13,14,15]. Notably, there may be differences in the presentation of ED pathology in food-insecure populations. For instance, commonly used ED pathology measures typically inquire about dietary restraint due to weight/shape concerns, whereas “other-reason” dietary restraint is also linked to ED pathology [16].

People with food insecurity and ED pathology may be under-diagnosed and under-treated in medical settings. Clinicians and patients are both influenced by stereotypes that the prototypical patient with ED pathology is young, thin, white, and affluent [17]. This leads to the under-diagnosis and under-representation in treatment settings of people of color, men, people with overweight and obesity, and people who do not restrict for reasons related to weight and shape [18,19,20,21,22]. The use of validated screening measures in healthcare settings may help reduce disparities in identifying and treating ED pathology in food-insecure populations. However, given that existing measures (1) limit their assessment of dietary restraint to focus on weight and shape concerns and (2) were developed in mostly food-secure samples, there is a need to understand whether existing ED symptom measures function appropriately in samples with food insecurity.

The underlying assumption of self-report trait and symptom measures is that they measure the same construct in the same way across groups and populations [23]. This assumption is the basis of research that compares levels of ED pathology between groups of interest: given the same level of actual ED pathology, members of different subgroups should endorse items at equivalent rates. If the probability of endorsing a specific item differs by group (e.g., food security status), this is evidence of differential item functioning (DIF). In psychological self-report instruments, DIF is commonly interpreted as group membership influencing the way participants understand or interpret an item [e.g., 14]. However, it may also be the case that DIF in ED measures suggests that ED symptoms themselves are differentially likely to appear in relation to underlying measured ED pathology across food security status. The aim of the present study was to assess for DIF in two commonly used measures of ED symptomatology in a sample of women with high/marginal food security and low/very low food security.

Methods

Participants

Participants included cisgender women who were recruited to participate in an online study about the “impact of food availability on eating and feeding behaviors” using Amazon’s Mechanical Turk (MTurk) [24]. Participants were compensated $5 for participation in the study, which took an average of 28.49 min (SD = 22.31) to complete. The present study is a secondary analysis of data collected between July 2019 and January 2020 for a study of intergenerational effects of food insecurity on home and parenting factors. Thus, approximately half of the participants were recruited to have experienced food insecurity during childhood, and the other half denied experiencing food insecurity during childhood. An additional inclusion criterion was having at least one child between the ages of six and eleven. Participants were women given the higher prevalence of ED pathology in women [25] and the increased likelihood that mothers would be more directly involved in the feeding of their child. To ensure quality responses, MTurk worker qualifications/requirements to participate included a HIT approval rate of greater than 95% (# of approved HITs that a worker has completed) and number of HITs approved greater than 100 (# of HITs that a worker has successfully completed since registering with MTurk). These requirements are consistent with other studies using MTurk samples (i.e., ≥ 90% approval rate, ≥ 100 approved HITs; [26]). Participants also needed to reside within the United States. Prior to completing the online questionnaire, participants provided online informed consent. All study procedures were approved by the University of Chicago’s Institutional Review Board and participants were compensated via MTurk.

Data were collected from 805 women. Six quality checks required the participant to select the one sentence that does not make semantic sense (e.g., “Planes yell on the dream”) from a set of four syntactically correct sentences (e.g., “Boats are sailing on the lake”). The average completion time of the survey did not differ based on number of quality checks missed, with the exception of one comparison [missing 0 quality checks (M = 26.47 min, SD = 11.92) and 4 quality checks (M = 17.84 min, SD = 10.29)]. The internal consistency of well-validated, reliable measures that included reverse-scored items was explored with differing numbers of quality checks missed. Cronbach’s alphas decreased when including individuals who missed 4 or more quality checks. Thus, participants who answered at least four of the six quality check questions incorrectly were excluded (n = 171), resulting in an analytic sample of 634 women. The mean age of participants was 34.75 years (SD = 7.40, range 22–72) and the mean body mass index (BMI; calculated using self-reported height and weight) was 25.16 (SD = 6.34). Participants were economically diverse with 18.0% reporting a household income < $30,000, 40.8% between $30,000 and $59,999, 23% between $60,000 and $89,999, and 18.3% ≥ $90,000. Participants predominantly identified as non-Hispanic (82.0%) and White (77.4%), followed by participants who identified as African American (18.3%), Asian (3.8%), American Indian/Alaskan Native (3.2%), Native Hawaiian/Pacific Islander (1.1%), and those who preferred not to provide their race or ethnicity (1.3% and 2.4%, respectively).

Measures

Food security

Food security was assessed using a self-report version of the 18-item United States Department of Agriculture (USDA) Food Security Survey Module [28]. Items assess eating patterns and anxiety about accessing food over the past 12 months (e.g., “In the last 12 months, were you ever hungry but didn't eat because there wasn't enough money for food?”). A total score was created from the sum of affirmative responses. Total scores can range from 0 to 18, with 0 indicating high food security (n = 203), 1–2 indicating marginal food security (n = 37), 3–7 indicating low food security (n = 117), and 8–18 indicating very low food security (n = 277) [28]. For analyses, high and marginal food security were combined to represent food security (n = 240), and low and very low food security were combined to represent food insecurity (n = 394). Very good internal consistency was observed in the current sample (Cronbach’s alpha = 0.84).
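The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' scoring code: the function names `food_security_category` and `is_food_insecure` are hypothetical, and the cut points are simply those stated in the text.

```python
def food_security_category(total_score: int) -> str:
    """Map a total of affirmative responses (0-18) to a food security category.

    Cut points follow the text: 0 = high, 1-2 = marginal,
    3-7 = low, 8-18 = very low.
    """
    if not 0 <= total_score <= 18:
        raise ValueError("total score must be between 0 and 18")
    if total_score == 0:
        return "high"
    if total_score <= 2:
        return "marginal"
    if total_score <= 7:
        return "low"
    return "very low"


def is_food_insecure(total_score: int) -> bool:
    """Binary grouping used in the analyses: low or very low = food insecure."""
    return food_security_category(total_score) in ("low", "very low")
```

Under this sketch, a total score of 2 falls in the food-secure group and a total score of 3 falls in the food-insecure group.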

Eating disorder pathology

ED pathology was assessed with the Short-Eating Disorder Examination Questionnaire (S-EDE-Q) [9, 29] and the Eating Disorder Diagnostic Scale for DSM-5 (EDDS-5) [11]. The S-EDE-Q is a 7-item, self-report questionnaire (adapted from the original 28-item EDE-Q; [30]) aimed at assessing ED attitudes and behaviors over the past 28 days. Items are rated on a 7-point ordered rating scale. The S-EDE-Q yields a global score (calculated as the mean of all seven items) reflecting participants’ overall level of dietary restraint, weight/shape overvaluation, and body dissatisfaction [9, 10, 29]. In the current sample, Cronbach’s alpha for the global S-EDE-Q was 0.93.

The revised version of the Eating Disorder Diagnostic Scale for DSM-5 (EDDS-5) [11] is a 23-item self-report measure that assesses diagnostic criteria for anorexia nervosa, bulimia nervosa, and binge eating disorder by asking the respondent about body image, eating habits, and compensatory behaviors over the last 3–6 months. It includes Likert, dichotomous (Yes/No), and frequency response options. Items 1–17, which can be summed to yield a symptom composite score, were used for the present analyses. Research with the EDDS-4 (the DSM-IV version of the same assessment) suggests good internal consistency and criterion validity with interview-based measures of disordered eating [11]. In the current sample, Cronbach’s alpha for the EDDS-5 symptom composite was 0.86.

Data analysis

The lordif package in R [31] was used to conduct differential item functioning (DIF) analyses examining whether endorsement of items on the S-EDE-Q and EDDS-5 differed across food-secure and food-insecure groups after conditioning on overall ED pathology as measured by each respective scale. There were no missing data on the S-EDE-Q or EDDS-5 among participants with food security data. The lordif package uses a hybrid ordinal logistic regression/item response theory (IRT) approach to test for DIF. Instead of conditioning on an observed composite score, this approach uses IRT-based estimates of the latent variable (theta), in this case overall ED pathology, as the matching criterion. Predictors of item responses were entered into ordinal logistic regression models in a three-step approach: the IRT-based estimate of the latent variable representing overall ED pathology was entered into the model first (Model 1), the group variable representing food security status was added to the model next (Model 2), and, finally, an interaction term between the two variables (theta for overall ED pathology × food security status) was added to the model (Model 3).
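The three nested models can be written out explicitly. The block below is one standard way to express them in cumulative-logit (proportional odds) form; it is an illustrative sketch rather than the exact parameterization used by the software, and the symbols are notation introduced here: Y is the response to a given item, k indexes response categories, θ is the IRT-based estimate of overall ED pathology, and G is an indicator for food security status.

```latex
\begin{align*}
\text{Model 1:}\quad & \operatorname{logit} P(Y \ge k) = \alpha_k + \beta_1 \theta \\
\text{Model 2:}\quad & \operatorname{logit} P(Y \ge k) = \alpha_k + \beta_1 \theta + \beta_2 G \\
\text{Model 3:}\quad & \operatorname{logit} P(Y \ge k) = \alpha_k + \beta_1 \theta + \beta_2 G + \beta_3 (\theta \times G)
\end{align*}
```

In this notation, a nonzero β₂ corresponds to a group difference that is constant across θ, and a nonzero β₃ corresponds to a group difference that changes with θ.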

Uniform and non-uniform DIF were examined. Uniform DIF suggests the possibility of consistent item-level bias by group membership in the same direction across all levels of the underlying trait (i.e., analogous to confounding), whereas non-uniform DIF suggests the possibility of item-level bias by group membership that differs by level of the underlying trait (i.e., analogous to moderation [32]). Statistically significant uniform DIF was indicated by a significant likelihood ratio χ² difference test comparing model fit between Models 1 and 2, and statistically significant non-uniform DIF was indicated by a significant likelihood ratio χ² difference test comparing model fit between Models 2 and 3. To account for the high number of statistical tests performed, the criterion for a statistically significant likelihood ratio χ² difference test was set at α = 0.01, following standard practice for DIF analyses as established by Zumbo [33]. To determine the clinical significance of the DIF results, Jodoin and Gierl’s [34] effect size guidelines were used, such that changes in pseudo R² between each step were characterized as follows: negligible (pseudo ΔR² < 0.035), moderate (pseudo ΔR² ≥ 0.035 and < 0.070), or large (pseudo ΔR² ≥ 0.070). McFadden’s pseudo R² was used, as it is considered one of the most generally applicable versions of pseudo R² for logistic regression due to its relative independence from the base rate of the outcome [35]. Both statistical (p < 0.01) and clinical significance (pseudo ΔR² ≥ 0.035) must be demonstrated for evidence of uniform or non-uniform DIF [34, 36]. Item characteristic curves, which depict the likelihood of endorsing a particular item score at various levels of the underlying latent variable representing overall ED pathology, were plotted for items demonstrating both statistically and clinically significant DIF.
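The two-part decision rule described above can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, the log-likelihoods are assumed to come from already-fitted nested models, and the likelihood ratio test is assumed to have 1 degree of freedom (one added term per step with a binary group variable).

```python
import math


def mcfadden_r2(loglik_model: float, loglik_null: float) -> float:
    """McFadden's pseudo R^2: 1 - LL(model) / LL(intercept-only model)."""
    return 1.0 - loglik_model / loglik_null


def lr_test_p_value(loglik_reduced: float, loglik_full: float) -> float:
    """P value of a likelihood ratio chi-square difference test, assuming 1 df.

    For 1 df, P(chi-square > x) equals erfc(sqrt(x / 2)).
    """
    statistic = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    return math.erfc(math.sqrt(statistic / 2.0))


def effect_size_label(delta_r2: float) -> str:
    """Label a change in pseudo R^2 using the thresholds stated in the text."""
    if delta_r2 < 0.035:
        return "negligible"
    if delta_r2 < 0.070:
        return "moderate"
    return "large"


def shows_dif(loglik_reduced: float, loglik_full: float,
              loglik_null: float, alpha: float = 0.01) -> bool:
    """Require both statistical (p < alpha) and clinical (delta R^2 >= 0.035) significance."""
    p_value = lr_test_p_value(loglik_reduced, loglik_full)
    delta_r2 = (mcfadden_r2(loglik_full, loglik_null)
                - mcfadden_r2(loglik_reduced, loglik_null))
    return p_value < alpha and effect_size_label(delta_r2) != "negligible"
```

For uniform DIF the reduced and full models would be Models 1 and 2; for non-uniform DIF they would be Models 2 and 3. An item can pass the likelihood ratio test yet still return `False` here if its change in pseudo R² is negligible.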

Results

Short-eating disorder examination-questionnaire

S-EDE-Q global scores ranged from 0 to 6, with M = 2.75 (SD = 1.74) in the full sample. DIF model comparison results for the S-EDE-Q are presented in Table S1 (available online). Item 1 (“Have you been consciously trying to restrict the amount of food you eat to influence your shape or weight?”) demonstrated statistically significant, but not clinically significant, non-uniform DIF by food security status. No evidence of DIF by food security status was detected for any other items on the S-EDE-Q.

Eating disorder diagnostic scale for DSM-5

EDDS-5 raw symptom composite scores ranged from 0 to 117, with M = 30.62 (SD = 27.61) in the full sample. DIF model comparison results for the EDDS-5 are presented in Table S2 (available online). A total of fourteen items (items 1, 2, 3, 4, 8, 9, 10, 11, 12, 13, 14, 15, 16, and 17) of the EDDS-5 demonstrated statistically significant DIF by food security status, but only two of these items also demonstrated clinically significant DIF. More specifically, results indicated statistically and clinically significant non-uniform DIF for items 9 (“During episodes of overeating with a loss of control, did you eat large amounts of food when you didn’t feel physically hungry?”) and 11 (“During episodes of overeating with a loss of control, did you feel disgusted with yourself, depressed, or very guilty after overeating?”). Compared to food-secure participants, food-insecure participants were more likely to endorse items 9 and 11 at lower levels of overall ED pathology but less likely to endorse these items at higher levels of overall ED pathology (Fig. 1). Effect sizes were moderate for both items demonstrating DIF, with McFadden pseudo ΔR² values ranging from 0.045 to 0.046.

Fig. 1

Item characteristic curves for items from the Eating Disorder Diagnostic Scale for DSM-5 (EDDS-5) demonstrating both statistically and clinically significant differential item functioning for the food insecure group compared to the food secure group
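The crossing pattern that characterizes non-uniform DIF can be illustrated with a small sketch. Everything in the block below is hypothetical: the coefficients are invented for illustration and are not estimates from this study, the function name `endorsement_probability` is an assumption, and the item is simplified to a binary endorse/not-endorse response rather than modeled with the ordinal approach used in the analyses.

```python
import math


def endorsement_probability(theta: float, food_insecure: bool,
                            intercept: float = 0.0,
                            b_theta: float = 1.5,
                            b_group: float = 0.0,
                            b_interaction: float = -0.8) -> float:
    """Probability of endorsing a binary item under a logistic model.

    All default coefficients are hypothetical. A negative interaction term
    gives the food-insecure group a flatter curve, so the two group curves
    cross instead of running parallel.
    """
    group = 1.0 if food_insecure else 0.0
    logit = (intercept + b_theta * theta + b_group * group
             + b_interaction * theta * group)
    return 1.0 / (1.0 + math.exp(-logit))


# Compare the two groups at low, middle, and high levels of the latent trait.
for theta in (-2.0, 0.0, 2.0):
    secure = endorsement_probability(theta, food_insecure=False)
    insecure = endorsement_probability(theta, food_insecure=True)
    print(f"theta={theta:+.1f}  secure={secure:.3f}  insecure={insecure:.3f}")
```

With these illustrative values, the food-insecure curve lies above the food-secure curve at low theta and below it at high theta, which is the qualitative shape described for items 9 and 11.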

Discussion

The present study examined the possibility of DIF across food-secure and food-insecure women for items on the S-EDE-Q and EDDS-5, two commonly used measures of ED pathology. No evidence of clinically significant DIF was observed within the S-EDE-Q, but two items on the EDDS-5 exhibited statistically and clinically significant DIF with moderate effect sizes. These items (i.e., eating large amounts of food when not physically hungry and feeling disgusted/depressed/guilty about overeating) demonstrated poorer ability to differentiate between different levels of ED pathology among individuals experiencing food insecurity. In other words, both items were endorsed at relatively more consistent rates regardless of levels of ED pathology for individuals with food insecurity relative to those who were food secure.

Eating large amounts of food when not physically hungry was endorsed less at higher levels of ED pathology in the food-insecure group. This finding may suggest that eating in the absence of physical hunger may not be sensitive to disordered overeating in people with food insecurity, perhaps due to the more frequent experience of physiological hunger in this population. Alternatively, this item’s poorer ability to differentiate across ED pathology severity may suggest that eating large amounts of food may be considered less “pathological” (i.e., less connected to other types of ED pathology) in food-insecure populations relative to food-secure populations. The DIF exhibited by the disgusted/depressed/guilty about overeating item may also support this explanation. Indeed, this item assessing negative affect related to overeating was also endorsed less at higher levels of ED pathology, perhaps reflecting that eating large amounts when food is available may be a normative, and potentially adaptive behavior in the context of food insecurity [37].

Future research is needed to understand how endorsement of ED pathology items may differ with severity of food insecurity, as there may be differences in ability to engage in specific symptoms of ED pathology across the spectrum of food insecurity severity. For example, marginal/low food security is more often characterized by concern about maintaining access to food, or lacking access to a variety of healthy, nutritious foods, but not by lacking food and going hungry. Many households with marginal/low food security have an overabundance of low-cost, energy-dense, palatable food. Conversely, very low food security is characterized by reduced food quantity. These differences may play a role in the ability to engage in particular ED behaviors, such as objective overeating and binge eating, and put marginal/low food-secure individuals in an environment with more triggers for binge eating (e.g., relative over-availability of highly palatable, highly processed foods in their home and neighborhood food environments; [4,5,6]) than individuals experiencing very low food security. This idea is also consistent with several findings that the relationship between food insecurity and obesity is curvilinear, with the strongest association and risk for obesity at moderate levels of food insecurity [38]. More research is needed to better understand how different experiences of food insecurity might influence ED pathology.

The moderate effect sizes observed for the two EDDS-5 items demonstrating DIF suggest some caution in using the EDDS-5 symptom composite in food-insecure samples, but they do not invalidate past results, as the items demonstrating DIF in the present study have not been the focus of past research. Our findings instead emphasize the potential need to adapt items from commonly used ED measures so that ED pathology is captured appropriately, given the possibility that some eating behaviors serve adaptive purposes within food-insecure populations. Of note, although we expected that items assessing dietary restraint might exhibit DIF, this was not observed in either ED questionnaire. Further research, however, should explore the extent to which dietary restraint motivated by reasons other than weight/shape may be under-reported by these measures.

Strengths and limitations

The current study design has numerous strengths, including the recruitment of a large sample of women who varied in their food security status and who reported a range of severity in ED pathology. To our knowledge, this is the first study designed to test the validity and performance of validated ED measures for people with food insecurity. The S-EDE-Q and EDDS-5 are two of the most commonly used measures of ED pathology. The S-EDE-Q is brief enough to be used as a screening instrument in healthcare settings, and the current findings suggest that this measure assesses ED pathology similarly for women across food security status. However, this study was not without limitations, including the use of a non-clinical population, limiting the generalizability of the findings to clinical samples. Additionally, while items on the EDDS-5 and original EDDS are quite similar and, thus, findings would likely generalize, it is less clear whether the findings from the S-EDE-Q would generalize to the full range of items on the EDE-Q. Further, the use of MTurk to collect a large sample of respondents may limit the generalizability of the data within food-insecure populations; the current study only included participants who read and speak English fluently, and the content of items was not adjusted to make measures more accessible to individuals with lower reading comprehension skills [7]. Given that the full food-insecure population may have relatively lower rates of English fluency and/or literacy than the general population, the current findings may not be generalizable to all food-insecure individuals. However, it is important to note that past studies have demonstrated a substantial proportion of MTurk workers report being underemployed (e.g., [39]), suggesting that exploration of food insecurity within this population would likely be fruitful. Finally, given that the present study was conducted in a U.S. sample, it is possible that findings may not generalize to less developed countries, where food insecurity is often associated with underweight [40].

Conclusion

Findings from the present study highlight a need for continued and more comprehensive exploration of ED pathology in food-insecure populations to evaluate the utility of commonly used ED measures in food-insecure samples. For example, future research is needed to better understand how differences in severity of food insecurity may influence the endorsement of various ED symptoms. Nevertheless, results of this study suggest that the S-EDE-Q may be useful to screen individuals in medical settings. Finally, research designs that allow stakeholders to participate in study design, including qualitative approaches, may be helpful in refining measurement to represent the experiences of food-insecure individuals with ED pathology.

What is already known on this subject?

Food insecurity is associated with elevated rates of ED pathology in adults [e.g., 7–8]. However, measures of ED pathology were largely developed on samples that represent the prototypical ED patient (e.g., White, middle-class, females; e.g., [9,10,11]), which may lead to under-diagnosis of ED pathology in populations that deviate from this stereotype.

What this study adds?

This study explores whether items on two commonly used ED pathology measures (EDDS-5 and S-EDE-Q) operate similarly in a food-insecure population compared to a food-secure population. Findings indicate no DIF on the S-EDE-Q, but two items on the EDDS-5 demonstrated DIF, showing lower ability to differentiate across levels of ED pathology severity in food-insecure relative to food-secure populations. While DIF exhibited within the EDDS-5 was moderate in effect size and only affected two items, findings highlight a need to consider how the wording of items assessing ED pathology may contribute to differential endorsement by food security status.