Introduction

The prevalence of eating disorders (EDs), which occur all over the world [1, 2], is increasing in the Middle East [3,4,5]. The Eating Disorder Examination (EDE) interview is the most widely used tool to assess EDs and ED symptoms and is generally considered reliable and valid [6]. Because administration of the EDE is time-consuming, a self-report questionnaire, the Eating Disorder Examination Questionnaire (EDE-Q), was created to screen for EDs and assess their severity [7,8,9]. The EDE-Q is a reliable assessment tool with high test–retest reliability [10, 11], internal consistency [8, 10], discriminative validity [7, 12] and sufficient sensitivity to changes in eating pathology [10]. If it proves to be a sufficiently sensitive and specific screener, it can be used prior to the EDE interview as part of a two-stage assessment procedure.

There is a lack of valid ED assessment tools in the Middle East [2, 13]. Normative data of the EDE-Q are only available for Western populations [7, 8, 10], and due to cultural differences, norms for Western and Arabic populations may differ [14]. An Arabic version of the EDE-Q is needed to facilitate detection of Saudis at high risk of an ED and subsequently treatment of EDs in Saudi Arabia [15, 16].

The aim of this study was to translate the EDE-Q into Arabic, to assess its psychometric properties among Saudi nationals, and to assess its utility as a screener to identify Saudis at high risk for EDs. An additional aim was to establish EDE-Q norms for Saudis.

Methods

Procedure

This study recruited a convenience sample of students from Princess Noura bint Abdulrahman University (PNU), King Saud University, and the Sixth High School for Quran Memorization, all located in Riyadh. Additional participants were recruited through the principal investigator’s social network and through a project explanation shared on social media, targeting Saudi nationals all over the country. One influencer (@Eyaad) and several sports facilities, namely NuYu gym (Riyadh, Dammam and Jeddah) and Sukoun Yoga studio in Riyadh, also distributed a link to the questionnaire. The Belgian embassy in Riyadh sent a link to Saudi nationals who resided or had previously resided in Belgium, and the Saudi embassies in Germany, France and Switzerland did the same in those countries. Recruitment was conducted between April 2017 and May 2018. Participants were asked to complete an online survey, available in Arabic (n = 2599 participants, 96.6%) and English (n = 91 participants, 3.4%). Prior to assessment, all participants were asked to provide informed consent. Minors were asked to have their guardian (father, brother or uncle) sign the informed consent form. Participants could send questions by email to the principal investigator. Participants were recruited in two phases. In the first phase, the EDE-Q and a demographics questionnaire were administered by self-report. During the second phase, the EDE interview was administered to a subset of participants (N = 98). Participants who provided their contact details in the online questionnaire were contacted for the EDE interview, which took place between November 2017 and February 2018, in English (52.5%) and Arabic (46.5%). The majority of the participants were Saudi nationals (97.4% of the EDE-Q sample and 99.0% of the EDE sample). Demographic details are presented in Table 1.

Table 1 Demographics of the EDE and EDE-Q sample

Measures

Self-report questionnaire: EDE-Q 6.0

The EDE-Q is based on the DSM [17, 18]. It is a 28-item self-report questionnaire rated on a 7-point Likert scale ranging from 0 (feature was absent) to 6 (feature was markedly present or present every day), and it also measures purging and binging behaviors during the previous 28 days [9]. It consists of four subscales (dietary restraint, weight concern, shape concern, and eating concern) [18] and a global score for general severity [7, 8, 10, 18]. Dietary restraint, weight concern and eating concern are each measured by five items and shape concern by eight items; six additional items measure the frequency of binge episodes, overeating, purging and laxative abuse. Subscale scores are the mean of the items that compose them, with a range of 0 to 6. The global score is the mean of the subscale scores, with higher scores indicating higher severity/frequency of EDs. Questions about weight, height and menstrual functioning are also included in the EDE-Q [8].
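The scoring rule described above (subscale score as the mean of its items, global score as the mean of the four subscale scores) can be illustrated with a minimal sketch. The item-to-subscale key below is an assumption based on the commonly published EDE-Q 6.0 scoring key and is not given in this article; it should be checked against the instrument itself. The function name `score_ede_q` is hypothetical.

```python
from statistics import mean

# ASSUMPTION: item numbers per subscale follow the commonly published
# EDE-Q 6.0 key (item 8 counts towards both shape and weight concern).
# The article itself does not list the item numbers.
SUBSCALES = {
    "restraint": [1, 2, 3, 4, 5],
    "eating_concern": [7, 9, 19, 20, 21],
    "shape_concern": [6, 8, 10, 11, 23, 26, 27, 28],
    "weight_concern": [8, 12, 22, 24, 25],
}


def score_ede_q(responses, key=SUBSCALES):
    """Return (subscale_scores, global_score).

    responses: dict mapping item number -> rating on the 0..6 scale.
    """
    subscale_scores = {}
    for name, items in key.items():
        ratings = [responses[i] for i in items]
        if not all(0 <= r <= 6 for r in ratings):
            raise ValueError(f"ratings for {name} must lie between 0 and 6")
        # Subscale score: mean of the items that compose the subscale.
        subscale_scores[name] = mean(ratings)
    # Global score: mean of the four subscale scores.
    global_score = mean(list(subscale_scores.values()))
    return subscale_scores, global_score


# Illustrative respondent: maximal restraint, all other items rated 0.
example = {item: 0 for item in range(1, 29)}
for item in SUBSCALES["restraint"]:
    example[item] = 6
subscales, global_score = score_ede_q(example)
```

The behavioral frequency items are deliberately left out of the key, since they do not contribute to the subscale or global scores as described above.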

The EDE-Q was translated into Arabic by a native speaker, a clinical psychology student at Princess Noura bint Abdulrahman University, with a parallel translation by a professional translator. Minor differences in word choice and phrasing between the two versions were discussed and resolved, the result was translated back, and the following cultural adaptations were made by the principal investigator. In item 2, on restrained eating behavior, it was stipulated that not eating for a long period while awake should be motivated by shape and weight concerns and not by religious motives arising from the holy month of Ramadan. Item 28, involving shape concern (avoidance of exposure), was also adapted, because communal changing rooms and public swimming pools are rare in Saudi Arabia and the few that exist are strictly gender separated. Thus, communal changing rooms and swimming pools were replaced by weddings and the gym, which are locations where women do not cover themselves because they are strictly gender separated. When this study was conducted, female gyms were about to be legalized and were a popular place to work out and meet friends. In addition, since women consider weddings a good place to find future wives for their male family members, some women in the principal investigator’s personal network admitted feeling exposed while attending them.

The Arabic version of the EDE-Q 6.0 is available from the principal investigator upon request. A pilot study among 50 students of the Princess Noura bint Abdulrahman Health faculty, conducted in January 2017, offered the choice of completing the English or the Arabic version of the EDE-Q. Although bilingual, most students preferred the Arabic version. Participant feedback on the pilot indicated that the quality of the translation was good. Substantial non-response (14 persons, 28%) was noted on the questions regarding menstrual functioning, probably because Saudi Arabia is a closed society with a taboo on discussing fertility and menstrual issues. In addition, Islam considers women impure during menstruation, and they are not allowed to fast during menstruation in the holy month of Ramadan, or to pray and do their ablutions.

Interview: EDE 16.0

The EDE [9], 16th edition, is a semi-structured interview that is widely used to assess ED pathology [19] in the previous 28 days to 6 months [20]. The EDE has good internal consistency [21] and test–retest reliability [22], and its discriminative and concurrent validity are supported [23]. ED pathology severity is measured on a 7-point Likert scale (0: feature was absent, to 6: feature was markedly present or present every day) [9], and the global score is calculated as the mean of all individual items. Subscale scores are calculated as the average score of the relevant items. For the Arabic version, some items were first culturally adapted by the principal investigator and two of her students, then translated into Arabic by the students. As with the EDE-Q, for questions regarding dietary restriction, it was made clear that this must not involve religious motivation during Ramadan. In the question regarding discomfort about exposure, swimming and communal changing rooms were replaced by the gym and weddings, and wearing a wider or dark-colored abaya (mandatory coat for women) was added. Because of the high non-response to the question about menstruation, interview questions regarding menstruation were introduced after asking for permission to discuss a potentially taboo topic. All female participants agreed to discuss taboo topics.

Administration of measures

The EDE-Q was administered online through Survey Monkey. Saudi law does not allow being in public places with a non-relative of the opposite sex, which left some participants feeling uncomfortable about being interviewed in a public place or at home. For this cultural reason, participants who agreed to participate in the EDE interview were asked to propose locations for their interviews. Most interviews with female participants were held at the principal investigator’s office, which was only accessible to women. Most interviews with male participants were held at the principal investigator’s home office. Other locations were restaurants and participants’ homes.

Participants

The EDE-Q was completed by 2769 participants. A substantial proportion of female respondents declined to respond to the question about their menstrual cycle in the EDE-Q (143 participants, 5.3%). Those respondents were included, as were participants with missing data on the behavioral frequency items. A further 79 (2.9%) participants had missing data in their EDE-Q and were excluded. In total, 2690 respondents completed the EDE-Q without any missing items.

The EDE interview was conducted in a subset (N = 102). Six participants did not attend their scheduled EDE assessment, and four participants (3.9%) who completed the EDE and EDE-Q had ≥ 5% missing data in their EDE-Q and were also excluded. In total, there were complete data for both the EDE and the EDE-Q for 98 participants. All participants were Saudi passport holders, ≥ 14 years old and literate.

EDE-Q sample

Participants were 2690 Saudi nationals (Table 1), and the EDE-Q was completed between April 2017 and May 2018. There were differences between the EDE-Q sample and the Saudi population. Females were over-represented: the EDE-Q sample included 78% females, vs. 42.3% in the Saudi population [24]. There was a greater percentage of single participants (70.4% vs. 33.0%) and a smaller percentage of married participants (27.5% vs. 58.8%) [25]. The EDE-Q sample was also more highly educated: 27.7% of participants attended high school, compared to 5.4% of the wider population [26], and around a quarter of the sample attended university in KSA, compared to 4.4% generally [27]. In addition, 20.4% of the sample was employed, compared to 30.2% generally [28]. Participants resided all over Saudi Arabia (details available from the principal investigator), though most resided in the larger cities, and some regions were over-represented in the EDE-Q sample [24]. Some participants lived abroad: 56 of them lived in another Gulf country and one female participant resided in Spain. Nationality statistics of Saudi Arabia were unavailable.

EDE sample

Participants with high EDE-Q global scores who provided email addresses were sent a request, in Arabic and English, to schedule an appointment for an EDE interview. If they did not respond but had left their phone number, they were called within one week. Text message reminders about the appointment were sent on the day the interview was scheduled. The EDE was generally conducted within two weeks of completion of the EDE-Q. The EDE (N = 98) was conducted in Arabic or English depending on the participant’s preference: 53 interviews (54%) were conducted in English by the principal investigator and 45 (46%) in Arabic by one of her female students, who was trained by the principal investigator. The participants interviewed with the EDE differed from the general Saudi population as follows. Females were over-represented (70.4% in the sample vs. 42.7% in the general population [24]). There were minor differences in BMI, with a smaller proportion of the sample (20.6%) suffering from obesity compared to the general population (29.0%) [29]. In addition, the majority of the sample was attending university in Saudi Arabia (56.1%), compared to 4.4% of the general population [27], and most participants resided in the largest cities, Riyadh (44.0%) and Jeddah (19.4%), compared to national percentages of 12.2% and 8.3%, respectively [24].

Statistical analysis

SPSS version 25 and AMOS version 26 were used for statistical analysis. The frequency distribution of the scores was inspected using skewness and kurtosis. A two-way mixed intraclass correlation (ICC) of the consistency type was calculated to test for agreement between the EDE and EDE-Q scores. A Bland–Altman plot was used to investigate the relationship between severity level and differences between EDE and EDE-Q total scores, and to identify bias and outliers. Kappa (κ) between the EDE and EDE-Q was calculated to test concordance between the two instruments. Binge eating frequencies and compensatory behavior were also assessed and compared between instruments.
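To make the ICC mentioned above concrete, the following is a minimal sketch of a two-way, consistency-type ICC computed from ANOVA mean squares. It assumes the single-measure form, often written ICC(3,1); the text does not state whether single or average measures were reported, and the analysis itself was run in SPSS, not with this code. The function name `icc_consistency` is hypothetical.

```python
def icc_consistency(ratings):
    """Single-measure, consistency-type ICC for a two-way model.

    ratings: list of subjects, each a list with one score per instrument
             (for example [EDE score, EDE-Q score]).
    """
    n = len(ratings)          # number of subjects
    k = len(ratings[0])       # number of instruments per subject
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two measurements")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, instruments, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Consistency definition: a constant offset between instruments
    # (the column effect) does not lower the coefficient.
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
```

Because the consistency definition ignores systematic offsets between instruments, a high ICC can coexist with a sizeable mean difference, which is why the Bland–Altman analysis is reported alongside it.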

Cronbach’s alpha was calculated to measure internal consistency of the EDE-Q subscales and total score. A score of ≥ 0.70 was considered acceptable, ≥ 0.80 good and ≥ 0.90 excellent. Convergent validity with the EDE of ≥ 0.70 was considered acceptable, ≥ 0.80 good and ≥ 0.90 excellent. Effects of age, gender, and level of education/profession on the EDE-Q global score were investigated with ANOVA. To investigate the factor structure of the translated EDE-Q in both samples, the fit of the original four factor model [30] was evaluated using a confirmatory factor analysis (CFA). In addition, the fit of a three factor model (1: weight/shape concern, 2: eating concern, 3: dietary restraint) [31] and a two factor model (1: eating/weight/shape concern, 2: dietary restraint) [32] was evaluated using CFA.
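As an illustration of the internal consistency measure, a minimal sketch of Cronbach’s alpha from raw item scores is given below, together with the interpretation thresholds stated above. The function names are hypothetical and the toy data are not study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents.

    rows: list of respondents, each a list of k item scores.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the respondents' summed scores.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def interpret_alpha(alpha):
    """Apply the thresholds used in the text (0.70 / 0.80 / 0.90)."""
    if alpha >= 0.90:
        return "excellent"
    if alpha >= 0.80:
        return "good"
    if alpha >= 0.70:
        return "acceptable"
    return "below acceptable"


# Toy data: three respondents, two perfectly consistent items.
toy_alpha = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
```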

Based on a subsample scoring at least 2 SD above average (n = 44, 45%) on the EDE (EDE+ sample), a receiver operating characteristic (ROC) analysis was performed to test the discriminative validity of the EDE-Q for ED pathology severity according to the outcome of the EDE interview. The area under the curve (AUC) was calculated to test how well the EDE-Q discriminated between the groups at high and low risk for an ED. An AUC ≥ 0.90 meant high accuracy, 0.70–0.90 moderate accuracy, and 0.50–0.70 low accuracy. To determine optimal cut-off values on the EDE-Q as a screener for EDs, sensitivity and specificity were calculated for various scores. To account for selective non-response, percentile scores were weighted using inverse response probability weighting to obtain estimates for the prevalence of Saudis at high risk for EDs. Finally, the association between BMI and ED pathology was estimated [30]. T-tests were used to assess differences in severity level between groups.
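The ROC quantities described above can be sketched as follows. The AUC is computed here through its rank interpretation (the probability that a randomly chosen EDE+ participant has a higher EDE-Q score than a randomly chosen EDE− participant, with ties counted as one half); sensitivity and specificity are computed for a given cut-off, treating scores at or above the cut-off as screen-positive. The function names and toy values are hypothetical; the study's own analysis was run in SPSS.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores: screener scores (e.g. EDE-Q global scores).
    labels: truthy for reference-positive (EDE+), falsy otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity when score >= cutoff is screen-positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= cutoff)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (not study data): scan a few candidate cut-offs.
toy_scores = [1.0, 3.0, 2.0, 4.0]
toy_labels = [0, 0, 1, 1]
table = {c: sensitivity_specificity(toy_scores, toy_labels, c)
         for c in (1.5, 2.5, 3.5)}
```

Scanning candidate cut-offs in this way yields the kind of sensitivity/specificity trade-off table from which a screening threshold is chosen.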

Ethical considerations

The study design was approved on May 7, 2017 (17-0097) by the ethical review boards of PNU and the King Abdulaziz City for Science and Technology, both in Riyadh, Saudi Arabia.

Results

Internal consistencies of the EDE-Q were calculated based on the complete sample (N = 2690). Cronbach’s α was high for the global scale score (α = 0.93) and for the dietary restraint (α = 0.81), shape concern (α = 0.84) and weight concern (α = 0.83) scales. Internal consistency of the eating concern scale was lower (α = 0.69), just under the ≥ 0.70 criterion, but was still considered acceptable. Convergent validity was excellent on the global, weight concern and shape concern scales, high on the eating concern scale and unacceptable on the dietary restraint scale. An ANOVA revealed no significant differences in EDE-Q score between age groups (age ≥ 18 versus age < 18) (F(1, 2689) = 0.90, p = 0.343), genders (F(1, 2665) = 2.38, p = 0.123), or levels of education/profession (F(6, 2659) = 2.04, p = 0.058). Furthermore, all scales were normally distributed. BMI was strongly and positively correlated with the EDE-Q global score (r = 0.96, p < 0.001) (Table 2).

Table 2 Average scores, skewness, kurtosis, reliability, convergent validity and association with BMI of the EDE-Q and its subscales (N = 2690) and AUC of the EDE-Q global score based on the EDE and concordance rates (N = 98)

Confirmatory factor analysis

There was a strong relationship between the weight concern and shape concern subscales (r = 0.86, p < 0.001). A moderate to strong relationship was found between the eating concern and weight concern scales (r = 0.68, p < 0.001) and the eating concern and shape concern (r = 0.67, p < 0.001) scales. These correlations indicate that these four subscales primarily assess the same underlying construct. None of the models tested showed acceptable fit (Table 3). All items loaded substantially on their respective factors in all models.

Table 3 Fit statistics for alternate models of EDE-Q data in sample with elevated scores on the EDE-Q (n = 538) and community sample (N = 2690)

In the three factor model [31], correlations were high between the weight/shape concern and eating concern scales. The best fit was found for a three factor model (with shape and weight concerns combined in a single factor), with RMSEA < 0.10 and TLI and CFI of 0.77 and 0.74. As the lack of fit may be due to the low intercorrelations among items typically found in a community-based sample, the CFAs were repeated for a subsample with elevated scores on the EDE-Q. We selected the 20% of respondents with the highest scores (n = 538). All items loaded positively and significantly on their respective factors in all models. Even when non-normal items were removed and items with estimated correlations greater than 1 were merged, the four factor and bifactor models did not yield a solution; therefore, the non-positive method was used, which only provides estimates. The best fit for the sample with elevated scores was the three factor model, with the smallest χ2 to degrees of freedom ratio, GFI = 0.86, AGFI = 0.83 and RMSEA < 0.10. The CFAs were repeated for a female subsample with elevated scores on the EDE-Q, and the results were comparable (results not shown in Table 3). The subsample of males with elevated scores was too small to conduct a CFA.

Exploratory factor analysis

As the CFA revealed limited fit, an Exploratory Factor Analysis (EFA) with promax rotation was conducted on the entire sample and on the sample with elevated scores. The scree plot suggested four factors explaining 69.9% of the variance. However, item allocation to these four factors did not correspond at all with the purported factor structure. A three factor model was easier to interpret and resulted in 15 items loading between 0.21 and 0.76 in the sample with elevated scores (supplementary Table 1). The first factor (dissatisfaction and discomfort) comprised three items of the shape concern scale and two items of the weight concern scale (two items describing dissatisfaction with shape or weight and two items about seeing or exposing the body); the second factor (dietary restraint) comprised all the items of the dietary restraint subscale; and the third factor, of four items, included two items of the weight concern scale. Preoccupation with food, guilt about eating, preoccupation with shape and weight, and flat stomach did not load substantially on any factor. An EFA conducted for males and females separately provided comparable results, although the item loadings on the first factor were somewhat higher in the female than in the male sample. The fit of this three factor model in the sample with elevated scores was compared to the fit indices of the CFA. According to CFA, the re-specified three factor model had the best fit for the current sample in comparison to the other models, with χ2 being relatively small in comparison to the degrees of freedom, GFI = 0.916, AGFI = 0.884 and RMSEA < 0.10 for the subclinical sample, although these indices still do not indicate acceptable fit. Cross loadings on other factors were low (0.00–0.29) (Table 3).

Other aspects of validity

Since the CFA and EFA did not support the purported factor structure, in further analyses the EDE-Q global score was based on equally weighted item scores rather than the average score on the four subscales. Several analyses were conducted to investigate whether the EDE-Q actually measured a high risk for EDs and ED symptoms. An ICC indicated an acceptable relationship between the EDE and EDE-Q global (r = 0.78, 95% CI [2.03, 4.51]), eating concern (r = 0.75, 95% CI [1.01, 3.78]), shape concern (r = 0.79, 95% CI [2.35, 5.03]), weight concern (r = 0.73, 95% CI [2.21, 4.78]) and dietary restraint (r = 0.73, 95% CI [1.03, 3.51]) scale scores. The EDE and EDE-Q global scores were compared: the mean difference in score was −2.42 (SD = 2.01), with higher scores for the EDE-Q. The EDE and EDE-Q scores thus differ, and the difference increases for higher scores (Fig. 1).

Fig. 1 Bland–Altman plot revealing a weak association between severity and the difference in score between the EDE interview and the EDE-Q
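The quantities behind a Bland–Altman plot can be sketched as follows: for each participant, the difference between the two instruments is set against their mean, and the bias (mean difference) and 95% limits of agreement (bias ± 1.96 SD of the differences) are derived. This is a generic sketch with a hypothetical function name and toy values, not the SPSS procedure used in the study; the sign convention (interview minus questionnaire) is an assumption chosen to match the negative mean difference reported above.

```python
from statistics import mean, stdev


def bland_altman(interview, questionnaire):
    """Bias, 95% limits of agreement and plot coordinates.

    interview, questionnaire: paired scores per participant.
    Differences are taken as interview minus questionnaire (assumed).
    """
    if len(interview) != len(questionnaire):
        raise ValueError("score lists must have the same length")
    diffs = [a - b for a, b in zip(interview, questionnaire)]
    means = [(a + b) / 2 for a, b in zip(interview, questionnaire)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return {
        "bias": bias,
        "lower": bias - 1.96 * sd,   # lower limit of agreement
        "upper": bias + 1.96 * sd,   # upper limit of agreement
        # (x, y) pairs for the plot: pair mean vs. pair difference.
        "points": list(zip(means, diffs)),
    }


# Toy values only, not study data.
result = bland_altman([1, 2, 3, 4], [2, 3, 5, 6])
```

A trend in the plotted points, with differences growing as the pair mean increases, corresponds to the severity-dependent disagreement described in the text.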

Concordance rates between the EDE and the EDE-Q on the global (κ = 0.009, p = 0.054) and shape concern (κ = 0.022, p = 0.137) scales were low and not significant. Concordance rates between the EDE and EDE-Q on the eating concern (κ = 0.10, p < 0.001), weight concern (κ = 0.07, p < 0.001) and dietary restraint (κ = 0.07, p < 0.001) scales were low but significant.
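For readers less familiar with κ, a minimal sketch of Cohen’s kappa for two paired categorical ratings follows. How the EDE and EDE-Q scores were categorised before κ was computed is not specified in the text, so the sketch simply assumes that each instrument yields one category per participant (for example above or below a cut-off). The function name is hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa: agreement between two ratings, corrected for chance."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of participants classified identically.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from the marginal category frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Toy example: 1 = above cut-off, 0 = below cut-off (not study data).
toy_kappa = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

Values near 0, as reported above, mean that the two instruments agree on classification hardly more often than would be expected by chance.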

The ability of the EDE-Q to discriminate between Saudis at high and low risk for an ED (according to the EDE) was assessed with a ROC analysis in the subsample of interviewed participants. EDE+ was defined as a global score of ≥ 2.5 [33]; 44 respondents (45%) met this criterion. The ROC analysis of the EDE-Q data showed an AUC of 0.84 (95% CI [0.69–0.90]), indicating that the EDE-Q is an excellent classifier [34]. The EDE-Q global score thus discriminates well between Saudi nationals at high and low risk for an eating disorder. With a cut-off of 4.87 (based on 2 SD above average), the sensitivity of the EDE-Q was 89% and the specificity was 69%, corresponding to 31% false positives and 11% false negatives. An EDE-Q cut-off score of 2.68 had a good sensitivity of 84% with a specificity of 75%, and a cut-off value of 3.40 had a sensitivity of 64% and a specificity of 85%. A cut-off score of 2.93 yielded a good compromise, with a specificity of 80% and a sensitivity of 82%.

Using this last cut-off value (EDE-Q score ≥ 2.93), it is estimated that 28.8% (n = 775) of the sample was at high risk for an ED. This was 28.5% of the females and 29.7% of the males included in the sample, with no difference between genders (p = 0.205). In addition, 2.1% (n = 57) of the total sample scored 2 SD above average and 20.0% (n = 538) of the participants scored 1 SD above average. Of the participants with obesity (n = 648), 5.4% (n = 35) scored 2 SD above average and 52.2% (n = 338) scored ≥ 2.93. In addition, 25.9% of the females and 28.4% of the males had an EDE-Q score of 2.05 or lower, and 58.1% of the females and 45.9% of the males had an EDE-Q score at or below the international cut-off of 2.77. These results are presented in Table 4, along with the prevalence of specific ED behaviors in both samples, which clearly differs between the two measures. There were no differences in EDE-Q scores (p = 0.124) between the subgroup that participated in the EDE interview (N = 98; M = 2.21, SD = 1.27) and the subgroup that did not participate (N = 2690; M = 2.25, SD = 1.32).

Table 4 Prevalence of eating disorder behaviors measured by the EDE-Q and EDE

Percentile scores were weighted for education/occupation, and presented separately for males and females (Table 5). The average EDE-Q scores were 2.24 (SD = 1.53) and 2.30 (SD = 1.28) for females and males respectively. There were no differences in EDE-Q scores (p = 0.205) between females and males.
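The weighting step can be illustrated with a simplified sketch. The exact weighting procedure is not described in detail, so the code below assumes one common form of inverse response probability weighting: each respondent in an education/occupation stratum receives a weight equal to the stratum's population share divided by its sample share, and percentiles are then read from the weighted cumulative distribution. The function names, strata and population shares are hypothetical.

```python
from collections import Counter


def stratum_weights(strata, population_share):
    """Weight per respondent: population share / sample share of the stratum.

    strata: stratum label for each respondent.
    population_share: dict mapping stratum label -> share in the population
                      (hypothetical values in the example below).
    """
    n = len(strata)
    counts = Counter(strata)
    return [population_share[s] / (counts[s] / n) for s in strata]


def weighted_percentile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (0 < q <= 1)."""
    if not 0 < q <= 1:
        raise ValueError("q must lie in (0, 1]")
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative / total >= q:
            return value
    return pairs[-1][0]


# Toy example: university students are over-represented in the sample.
strata = ["university", "university", "university", "other"]
shares = {"university": 0.25, "other": 0.75}   # hypothetical population
scores = [1.0, 2.0, 3.0, 4.0]
weights = stratum_weights(strata, shares)
weighted_median = weighted_percentile(scores, weights, 0.5)
```

In the toy example the under-represented stratum is weighted up, which shifts the weighted median towards the score of that stratum compared to the unweighted median.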

Table 5 Mean, standard deviation and percentiles for females and males of the EDE-Q, weighted by occupation/education, and the EDE, non-weighted

Discussion

The aim of this study was to assess the psychometric properties of a Saudi version of the EDE-Q, to assess its utility as a screener for Saudis at high risk for EDs, and to establish EDE-Q norms for the Saudi population. This is the first study to assess the psychometric properties, including norms and discriminative validity, of a culturally adapted EDE-Q in a large Saudi community sample. As in other studies, the results did not support the four factor structure of the English original [7, 9, 31, 32, 35], or the alternative three factor model [36, 37], which calls into question the validity of EDE-Q subscale scores. Moreover, attempts to establish an alternative factor structure with sufficient fit to the data were not successful.

Females and highly educated Saudis were oversampled; therefore, percentile scores were presented separately for males and females and weighted for education/occupation. Although the living conditions for males and females diverge considerably in Saudi Arabia, no gender differences were found in scores on the EDE-Q, nor in the factor structure of the measure. Compared to Western community samples [7,8,9, 38,39,40,41], the global and subscale scores were higher in the Saudi sample. However, Luce et al. [10] reported dietary restraint scores for USA students comparable to those found among Saudi students in this sample. Whether these high scores are specific to Saudis or rather reflect a more general aspect of Arabic culture is unclear, as no data are available from other countries in the Gulf area. Interestingly, in somewhat comparable societies, such as Turkey [42] and Iran [43], the EDE-Q global score is also higher compared to Western societies. Severity of ED pathology was associated with BMI, a finding consistent with other international [44, 45] and Saudi studies, in which associations were found between BMI and body dissatisfaction [46], binge eating behavior and irregular eating patterns [47]. Furthermore, EDE-Q scores of Saudis suffering from obesity were comparable with those of Mexicans suffering from obesity [48], slightly higher than those of Dutch people with obesity [7], and higher than those of Iranians with obesity [43], Norwegians [49] and Australians [32]. High EDE-Q scores in the population might reflect the high rates of obesity and maladaptive strategies to lose weight among the Saudi population. This is especially important because Saudi Arabia has the highest rates of obesity worldwide [50], increasing the risk of developing an ED two to three times [15]. It is nevertheless remarkable that the items restraint, food avoidance and dietary rules were negatively associated with importance of weight and shape, guilt about eating, social eating and reaction to prescribed weighing.

Reliability (internal consistency and convergent validity) was high, and the EDE-Q discriminates accurately between individuals at high and low risk for an ED according to the EDE interview. EDE-Q global scores can be used to determine the severity level of ED pathology and to screen for Saudis at high risk for EDs in community samples. Participants were consistent in the presentation of the severity of their ED pathology, but not in symptom presentation between the EDE and EDE-Q. These findings are similar to what has been found in other studies. Even in a clinical sample, classification concordance between the EDE and EDE-Q was moderate at best [51], and concordance between the subscales varied from poor to excellent [52]. It was remarkable that the concordance rate was higher for items regarding eating and eating behavior than for the other items. EDE and EDE-Q results thus tend to diverge. The low concordance rate might result from participants’ lack of knowledge and understanding about EDs and their symptoms, invalidating their self-reports on the EDE-Q. In general, there is a lack of knowledge and awareness of EDs and their risk factors, and EDs are rarely recognized and properly treated in Arab clinics [53, 54]. Participants who are better informed about EDs indeed appear to have more valid scores on the EDE-Q [52].

The EDE-Q should not be used for classification, and might therefore be better suited as a treatment outcome measure [55] or as part of a two-stage procedure, in which the EDE-Q is used as a screener to identify Saudis at high risk for an ED and the EDE or another clinical interview is conducted to formally establish an ED diagnosis in case of elevated EDE-Q scores. The explanation and expression of EDs in Arab countries in somatic rather than psychiatric symptoms may also contribute to the low concordance rate [56, 57]. In addition, Saudis with an ED generally seek psychiatric help only after suffering from somatic complaints such as diabetes, kidney failure and infertility.

The study has several strengths. First, Saudis are an understudied population, and this is the first study to explore the psychometric properties of the EDE-Q in Saudi Arabia, including tests of alternative factorial models. Psychometric analysis was based on a large sample, which allowed the CFA to be fully powered. Second, it is the first study to evaluate psychometric properties of the EDE-Q in the Gulf. Moreover, EDE-Q scores were verified by the semi-structured diagnostic EDE interview in a subsample to understand the culture involved. Furthermore, this is the first study based on interview data, as such data were not previously available in Saudi Arabia. Since we were unable to identify a clinical population by classification, all psychometric tests were done in subsamples at high risk for EDs. Although classification concordance was low, there was a statistically significant association between both global scores, so severity of ED pathology appeared to be measured accurately. Therefore, the discriminant validity of the Arabic EDE-Q was confirmed and a cut-off point for the identification of Saudis at risk for EDs was obtained. Due to the short time span between the EDE-Q and EDE, differences in measurement outcome are likely to be due to differences in response patterns and not to change in ED pathology. This study provides data on a widely used assessment tool for EDs, which will allow future comparisons of Saudis with samples from other socio-cultural contexts. As data were collected in a closed society with a taboo on mental health care [16], this is a first step towards expanding knowledge about EDs in Saudi Arabia.

Notwithstanding these strengths, several limitations must be considered. First, although this study contains a large sample, reflecting the whole Saudi population well regarding age, geographical location and weight status, the sample was biased regarding gender and education level. Too large a proportion of the sample was female, single and highly educated. To counteract potential effects of selection bias, the prevalence estimates and percentile scores were presented separately for males and females and corrected for educational level by propensity weighting [58]. Still, unmeasured factors may have caused selection bias: respondents to the EDE-Q, and EDE interviewees in particular, can be expected to be more interested in health care, mental health care or EDs, or to have more concerns regarding their body image or eating behavior, compared to the general population. Since the sexes are strictly separated in Saudi Arabia and all EDE interviews were conducted by females, it is also likely that a more progressive section of the population, especially among the males, participated in the EDE interview. Thus, although the presented norms were based on a large Saudi community sample, it should be noted that it was a web-based convenience sample and results should be interpreted with care.

The EDE was used as a reference to determine the discriminative validity of the EDE-Q. The EDE was somewhat adapted to the Saudi culture. However, formal assessment of reliability and validity of a Saudi version of the EDE has not yet taken place in a clinical Saudi sample. The EDE could unfortunately not be validated in a clinical sample due to the unavailability of such a sample in Saudi Arabia. Still, as it is an assessor-based detailed interview that assesses ED pathology over the same time period, with the same rating scale and with phrases and wording similar to those of the EDE-Q [9], it was deemed suitable as a criterion, but it should not yet be considered the gold standard. An alternative that has been evaluated for its validity is the shortened Eating Attitude Test (EAT26). Its use was considered but dismissed for the following reasons: high rates of false positives were found, and psychometric properties of the EAT26 were only assessed among teenage schoolgirls [59], while our study aimed to target a representative community sample. In addition, the EAT26 was validated more than 20 years ago, while the country has undergone rapid socio-cultural changes [54, 60] and EDs were perceived as diseases of globalization [61]. Therefore, the established norms might no longer be accurate.

Last, use of a clinical sample would be of great value, but this proved to be impossible since there were no specialized clinics and therapists [62] and EDs were rarely recognized in Saudi Arabia [53]. A few of the respondents received ED treatment in Germany and the United Kingdom. Furthermore, within our sample we tried to identify participants with AN, BN and BED. Participants were consistent in severity presentation, but their symptom presentation between the EDE and EDE-Q was inconsistent. Therefore, identification by classification proved to be impossible.

The results point to several areas of future research regarding the EDE and EDE-Q. Future validation of the EDE among Saudis is important, as differences may exist in the presentation and manifestation of ED pathology between populations [56, 57, 63]. Availability of a validated diagnostic interview will improve classification of these conditions, which may be quite prevalent given the high EDE-Q scores in our population sample. In addition, since none of the evaluated models of the EDE-Q provided acceptable fit, it is imperative to further evaluate the factor structure of the EDE-Q in different Saudi populations, including patients suffering from obesity classified with an ED.

In summary, the results indicate poor fit for the four factor model of the EDE-Q, which is in line with previous research. However, the total score on the Saudi version of the EDE-Q adequately measures ED pathology and identifies Saudis at high risk for EDs.

What is already known on this subject?

The EDE-Q is a widely used screener to assess EDs and their severity. It has been translated and used in many countries, but not in Arab countries. In most studies the results do not support the purported four factor structure of the English original. In Saudi Arabia, only the psychometric properties of the EAT26 were assessed. However, the established norms might no longer be accurate.

What does this study add?

A Saudi version of the EDE-Q discriminates well between Saudis at high and low risk for EDs according to the EDE interview. There were no gender differences. Global scores were high compared to Western community samples and were associated with BMI. This is the first study regarding EDs based on interview data in Saudi Arabia. Data were collected in a closed society with a taboo on mental health care. This study is a first step towards expanding knowledge about EDs in Saudi Arabia.