Introduction

Binge eating disorder (BED) and bulimia nervosa (BN) are the most common eating disorders (ED) [1]. Their lifetime prevalences are 1.9% and 1.0%, respectively [2]. Both are characterized by recurrent binge eating episodes, in which individuals eat an unusually large amount of food while experiencing a sense of loss of control over eating. However, in BED the episodes are not followed by the inappropriate compensatory behaviors seen in BN, such as self-induced vomiting, misuse of laxatives or other medications, excessive exercise and fasting [1]. Both BED and BN are persistent ED that impair physical health and psychosocial functioning [1, 2].

Undergraduate students are at risk of developing ED symptoms [3]. In addition, these symptoms seem to be more prevalent among undergraduate students from the health sciences, such as those in dietitian courses. Some studies have investigated differences in eating behaviors between students from the health and human sciences [4, 5]. For instance, Vitolo et al. [5] compared the prevalence of binge eating in both areas and did not find significant differences. In contrast, other studies show that dietitian students tend to have higher levels of dietary restraint, binge eating and body image concern than students from other courses [4, 6,7,8]. Although it is not clear which area of study is more prone to ED, the diagnosis and treatment of undergraduate students with these conditions is essential to improve their physical and psychological health.

Self-report instruments are an alternative for the assessment of ED in large samples. One of the most widely used instruments is the Questionnaire on Eating and Weight Patterns (QEWP) [9]. It was developed as an assessment instrument for the initial multisite field trials that described BED prevalence in clinical and community samples and supported its clinical utility [9, 10]. The QEWP screens for BED categorically, using questions based on the diagnostic criteria then proposed for BED [10, 11]. It was later revised to be in line with DSM-IV [11]. The resulting Questionnaire on Eating and Weight Patterns – Revised (QEWP-R) [12] was widely used in clinical and community settings [13,14,15] and was translated into Portuguese and validated [16]. Following the changes made in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) [1], the QEWP-R was updated to be in line with the current diagnostic criteria [17]. The Questionnaire on Eating and Weight Patterns-5 (QEWP-5) [17] is therefore the only instrument that screens individuals for BED and BN by converting responses to a categorical scale according to DSM-5 criteria [1]. This questionnaire was cross-culturally adapted to the Brazilian context [18].

Self-report instruments should be valid and reliable to be used in specific settings. An important characteristic to consider when choosing a questionnaire is its test–retest reliability, that is, the stability of a measure between two assessments separated by a time interval [19, 20]. A systematic review of the psychometric properties of 29 self-report measures of binge eating indicated that none of them met all the criteria for good psychometric quality. The review also highlighted the scarcity of conclusive data regarding the psychometric properties of those measures [21].

Although QEWP-5 represents an alternative for the assessment of BED and BN, the psychometric properties of its Brazilian version have not yet been assessed. Thus, the aim of the present study was to assess the test–retest reliability of the Brazilian version of QEWP-5 in a sample of undergraduate students from Dietitian and Psychology courses.

Materials and methods

Design and participants

Undergraduate students (n = 403) were recruited from dietitian (n = 197; 48.9%) and psychology (n = 206; 51.1%) courses at a Brazilian public university. Students were approached in class and informed about the study. Of those invited, 345 (179 from dietitian and 166 from psychology courses) completed the Brazilian version of QEWP-5 [18] twice, corresponding to 85.6% of the students approached. The questionnaire was applied twice, with a 2-week interval. This interval is considered appropriate to avoid temporal changes in the answers [22]. The two applications (test and retest) were independent, and the students did not have access to the results of the first assessment. This research was approved by the Ethics Committee of the Institute of Psychiatry of the Federal University of Rio de Janeiro. Written informed consent was obtained from all study participants before any study procedures were performed.

Measures

QEWP-5 is an updated version of the QEWP. The original QEWP was developed in 1992 for the multisite field trials that supported the clinical utility and described the prevalence of BED in different settings [9]. Afterward, QEWP was revised to be in line with DSM-IV criteria [11]. The QEWP-R [12] was widely used in the literature [13, 15, 23]. It was also translated and validated for the Brazilian context [16]. Considering the changes made in the diagnostic criteria for BED and BN in the DSM-5 [1], QEWP-R was updated to QEWP-5 in 2015 [17].

QEWP-5 [17] is a 26-item self-report measure developed for the screening of BED and BN. The instrument includes questions about demographic characteristics (such as age, sex and race), weight history, binge eating episodes (both objective and subjective), duration and frequency of the episodes, distress regarding binge eating, inappropriate compensatory behaviors, overvaluation of weight and shape, and parents’ silhouettes. In accordance with DSM-5 criteria, the diagnostic items refer to the past 3 months [17].

QEWP-5 provides a possible diagnosis of BED and BN as a dichotomous measure (presence/absence), based on DSM-5 [1] diagnostic criteria. BED is considered present when there is: a) at least 1 binge eating episode per week for three months (binge eating is defined as eating a large amount of food in a short period with a feeling of loss of control); b) absence of inappropriate compensatory behaviors (such as vomiting, abuse of diuretics, laxatives or other medications, and excessive exercise); c) at least 3 of the following associated symptoms during the episodes (eating much faster than usual; eating until feeling uncomfortably full; eating large amounts of food when not physically hungry; eating alone because of feeling embarrassed by how much one is eating; feeling disgusted with oneself, depressed or very guilty after the episode); and d) marked distress regarding binge eating. BN is considered present when there is: a) at least 1 binge eating episode per week for three months; b) any inappropriate compensatory behavior at least once per week for three months; and c) overvaluation of weight/shape [17].
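The decision rules described above can be sketched in code. The following is a minimal illustration only, not the official QEWP-5 scoring algorithm [17]: the `Responses` fields and the `screen` function are hypothetical names, and the mapping from individual questionnaire items to these summary fields is assumed to have been done beforehand.

```python
from dataclasses import dataclass


@dataclass
class Responses:
    """Hypothetical summary of one respondent's diagnostic items (past 3 months)."""
    binge_weekly: bool         # at least 1 binge eating episode per week for three months
    any_compensatory: bool     # any inappropriate compensatory behavior reported
    compensatory_weekly: bool  # compensatory behavior at least once per week for three months
    associated_symptoms: int   # number of the five associated symptoms present (0-5)
    marked_distress: bool      # marked distress regarding binge eating
    overvaluation: bool        # overvaluation of weight/shape


def screen(r: Responses) -> str:
    """Return 'BN', 'BED' or 'ND' (no diagnosis) following the rules in the text."""
    if not r.binge_weekly:
        return "ND"
    # BN: weekly binge eating, weekly compensatory behavior and overvaluation.
    if r.compensatory_weekly and r.overvaluation:
        return "BN"
    # BED: weekly binge eating, no compensatory behavior, >= 3 symptoms, distress.
    if not r.any_compensatory and r.associated_symptoms >= 3 and r.marked_distress:
        return "BED"
    return "ND"
```

For example, `screen(Responses(True, False, False, 3, True, False))` returns `"BED"`, whereas the same respondent with only two associated symptoms is classified as `"ND"`.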

QEWP-5 was translated and adapted into Brazilian Portuguese following international guidelines. The process of cross-cultural adaptation comprised the following stages: forward translation, comparison of translations and synthesis version, preliminary version/experts’ panel, blind back-translations, comparison between back-translations, and comprehensibility test. The Brazilian version of QEWP-5 was pre-tested and well understood by 10 patients with BED/BN and 10 ED experts [18].

The Body Mass Index (BMI = weight/height², in kg/m²) was calculated from the weight and height self-reported in the questionnaire. BMI was classified into four categories: underweight (BMI < 18.5 kg/m²), normal weight (18.5–24.9 kg/m²), overweight (25–29.9 kg/m²) and obesity (≥ 30 kg/m²).
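The BMI calculation and classification can be illustrated with a short sketch. The function names are hypothetical, and the handling of values that fall between the printed category limits (for example, 24.95 kg/m²) is an assumption: here each category extends up to, but does not include, the lower limit of the next one.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index in kg/m^2 from self-reported weight and height."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def classify_bmi(value: float) -> str:
    """Map a BMI value to the four categories used in the text."""
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal weight"
    if value < 30:
        return "overweight"
    return "obesity"


# Example with made-up values: 70 kg and 1.75 m.
example = classify_bmi(bmi(70, 1.75))
```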

Data control and analysis

The data were entered twice, by two independent people, and compared (double data entry). If the two entries disagreed, the data were corrected after checking the respective questionnaire. The sample was characterized by sex, BMI classification and semester of the course. Participants’ age and BMI were also analyzed as continuous measures. These variables were tested for normality using the Kolmogorov–Smirnov test. The students who completed QEWP-5 twice (test and retest) were compared with those who were present only at the first application (missing data). Also, the students from the Dietitian course were compared with those from the Psychology course. Chi-square tests were performed to compare the categorical variables (sex, BMI classification, presence of BED or BN and the semester of the course). Because the continuous variables (age and BMI) did not fit the normal distribution, the Mann–Whitney test was used to compare them.

The test–retest reliability for each possible diagnosis was based on an analysis of 2 × 2 contingency tables with the following categories (at times 1 and 2): (1) BED vs. no diagnosis (ND); and (2) BN vs. ND. Because QEWP-5 converts the diagnostic items to a dichotomous scale, the kappa coefficient [24] was used to determine the test–retest reliability of the questionnaire. It is the preferred method for assessing the temporal stability of instruments with dichotomous scores [22]. The kappa coefficient has been used to assess the test–retest reliability of health measurement scales in different settings [25,26,27,28]. The Landis and Koch [30] criteria were used to rate the agreement between the applications, as follows: < 0.00 poor; 0.00–0.20 slight; 0.21–0.40 fair; 0.41–0.60 moderate; 0.61–0.80 substantial; 0.81–1.00 almost perfect. The temporal stability of the QEWP-5 was compared between the students from both courses using the Landis and Koch criteria [30]. The analysis was performed using SPSS (Statistical Package for the Social Sciences), version 22. Statistical significance was set at p < 0.05.
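As an illustration of this analysis, the sketch below computes Cohen's kappa from a 2 × 2 test–retest table and rates it with the Landis and Koch categories. It is not the SPSS procedure used in the study; the function names and the example counts are hypothetical.

```python
def cohen_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for a 2 x 2 table.

    a = positive at test and retest, b = positive at test only,
    c = positive at retest only,     d = negative at both.
    """
    n = a + b + c + d
    p_observed = (a + d) / n
    # Chance agreement from the marginal totals of the two applications.
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    if p_expected == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


def landis_koch(kappa: float) -> str:
    """Rate a kappa value using the Landis and Koch categories."""
    if kappa < 0.0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Made-up counts, not study data: observed agreement 0.70, chance agreement 0.50.
k = cohen_kappa(20, 5, 10, 15)   # approximately 0.40
rating = landis_koch(k)          # "fair"
```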

Results

Table 1 shows the comparison between participants who completed the first application of QEWP-5 (n = 403) and those who were present at both the test and retest (n = 345) regarding sex, semester of the course, BMI classification, positive screening for BED or BN, and course. In the first group, the median age was 21 years (min: 17; max: 57) and the median BMI was 22.5 kg/m² (min: 13.8; max: 46.6). Among the participants who completed QEWP-5 twice, the median age and BMI were 21 years (min: 17; max: 57) and 22.4 kg/m² (min: 15.1; max: 46.6), respectively. No significant differences were found between the groups for these characteristics (age: p = 0.96; BMI: p = 0.86). The groups also did not differ statistically regarding sex (p = 0.77), semester of the course (p = 0.82), BMI classification (p = 0.82), presence of BN (p = 0.87) and BED (p = 1.00), and type of course (p = 0.42).

Table 1 Comparison between participants who completed QEWP-5 in the test and retest with those who completed only the test

Table 2 shows the comparison between students who completed QEWP-5 twice, by course. The median age of the participants from the dietitian course (21 years; min: 18; max: 37) was not statistically different from that of the psychology students (21 years; min: 17; max: 57; p = 0.99). Regarding BMI, the medians for the dietitian (22.6 kg/m²; min: 0.50; max: 40.5) and psychology (22.2 kg/m²; min: 15.2; max: 46.6) courses were not different either (p = 0.47). Concerning other sample characteristics, in both courses the highest proportion of students was between the first and fifth semesters (p = 0.44) and had normal weight (p = 0.35), and the prevalence of BN (p = 0.52) and BED (p = 1.00) was similar. The only statistically significant difference observed was related to sex: the proportion of females was higher in the dietitian course (86.6% vs. 77.7%; p = 0.03).

Table 2 Comparison of participants who completed QEWP-5 twice by course

Test–retest reliability

Table 3 shows the concordance between test and retest in identifying undergraduate students with positive screening for BED and BN within the total sample (n = 345) and by course. Considering the entire sample, the two applications of QEWP-5 identified a similar number of students with BED (test = 2.6%; retest = 3.2%) and BN (test = 5.8%; retest = 6.1%). However, the concordance between the measures was moderate (k = 0.48) for BED and substantial (k = 0.71) for BN. Analyzing by course, the frequency of students with BED in the dietitian course was 2.8% (test) and 4.5% (retest), and that of BN was 6.1% in both applications. The agreement between the two measures was moderate (k = 0.60) and substantial (k = 0.80), respectively. In the psychology course, the agreement for the screening of BED (test = 2.4%; retest = 1.8%) and BN (test = 5.4%; retest = 6.0%) was fair (k = 0.27) and moderate (k = 0.60), respectively.

Table 3 Frequency of BED and BN obtained in the 1st application (test) and 2nd application (retest) of the QEWP-5 and kappa values in the entire sample and by course

Discussion

This study investigated the test–retest reliability of the Brazilian version of QEWP-5 in undergraduate students from the Dietitian and Psychology courses. To our knowledge, our research group was the first to translate and adapt the QEWP-5 to the Brazilian context and is now the first to evaluate its psychometric properties. This is also the first evaluation of the temporal stability of a version of QEWP for the screening of BN. The test–retest reliability was assessed by applying the questionnaire to the students on two different occasions, 2 weeks apart. Overall, the questionnaire was considered moderately and substantially stable for the screening of BED and BN, respectively. In the dietitian course, QEWP-5 was considered moderately stable for assessing BED and substantially stable for the screening of BN. In the psychology course, the stability over time for the assessment of BED and BN was fair and moderate, respectively.

Although there are no studies about the test–retest reliability of QEWP-5, the temporal stability of the previous versions has been assessed. The QEWP was administered twice, within an interval of three weeks, to 52 women and 2 men who identified binge eating episodes as very problematic for them (self-reported binge eaters) and 52 women who did not report binge eating episodes as a problem (comparison sample). The questionnaire was considered moderately stable. The kappa coefficient was 0.57 for the self-referred binge eating sample, and 0.58 for the total sample (self-referred and comparison groups) [25]. Johnson et al. [31] evaluated the stability of the QEWP-A for the screening of BED in adolescents (both males and females). The time interval between the two assessments was also three weeks. The test–retest reliability, assessed by the phi coefficient, was 0.42 [31]. Therefore, the reliability of previous versions of QEWP for the screening of BED is quite similar to that found in the present study.

The evaluation of the agreement between test and retest demonstrated that the kappa coefficient for the assessment of BN was higher than for BED in the entire sample and in both courses. Additionally, the stability of QEWP-5 in assessing BED was higher in the dietitian course than in the psychology course. One possible explanation for these findings is that the kappa coefficient is influenced by the prevalence of the condition being measured [32]. That is, when comparing the stability of two measures, kappa tends to be higher if the condition assessed is more prevalent. In the present study, the prevalence of BN was higher than that of BED, both in the total sample and in each course. Also, the prevalence of BED was higher in the Dietitian course than in the Psychology course. A similar situation was described by Johnson et al. [29], who used the QEWP-A to evaluate the agreement between two assessments of BED in 367 adolescents. The researchers found a kappa coefficient of 0.19 and a low prevalence of BED in the sample (1.07%) [29]. Another characteristic that could have influenced the test–retest reliability of QEWP-5 is the different course of the EDs assessed. BED is more unstable, with a tendency toward symptom remission and few relapses over time. In contrast, BN is a more persistent diagnosis with a stable course, higher relapse and lower remission [2, 33]. Johnson et al. [31] reported the influence of the unstable course of BED on the stability of QEWP-A: one third of the females classified with subthreshold binge eating at the first assessment were classified as having no diagnosis three weeks later [31].
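The dependence of kappa on prevalence can be shown with a small numerical sketch. The two 2 × 2 tables below are hypothetical (they are not data from this study or from the cited ones): both have 100 respondents and the same observed agreement of 96%, but the condition is rare in the first table and common in the second.

```python
def cohen_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa; a/d are agreeing cells, b/c are disagreeing cells."""
    n = a + b + c + d
    p_observed = (a + d) / n
    p_expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical tables: (positive both, test only, retest only, negative both).
low_prevalence = (2, 2, 2, 94)     # 4% screen positive at each application
high_prevalence = (23, 2, 2, 73)   # 25% screen positive at each application

kappa_low = cohen_kappa(*low_prevalence)     # about 0.48
kappa_high = cohen_kappa(*high_prevalence)   # about 0.89
```

With identical observed agreement, the expected chance agreement is much larger when almost everyone screens negative, which leaves less room above chance and lowers kappa.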

To date, there is only one study about the psychometric properties of QEWP-5. Calugi et al. [34] validated the Italian version of the questionnaire in 604 adults seeking treatment for obesity. Using the Eating Disorders Examination (EDE) [35, 36] as the gold standard, the authors assessed the concordance between the QEWP-5 and the clinical interview in identifying the presence of BED. The sensitivity, specificity, positive predictive value and negative predictive value of QEWP-5 were 0.49, 0.93, 0.34 and 0.96, respectively. In addition, the agreement between the two instruments was poor (k = 0.34). These results indicated that QEWP-5 can be useful as a screening tool for BED. However, the diagnosis should be confirmed by a clinical interview. As the authors did not perform a test–retest evaluation, their findings cannot be compared with our results.
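For readers less familiar with these indices, the sketch below shows how they are obtained from a 2 × 2 table that crosses a screening result with a gold-standard interview. The function name and the counts are hypothetical and do not reproduce the data of Calugi et al. [34].

```python
def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy indices of a screening test against a gold standard.

    tp = screen positive, interview positive; fp = screen positive, interview negative;
    fn = screen negative, interview positive; tn = screen negative, interview negative.
    """
    return {
        "sensitivity": tp / (tp + fn),  # proportion of true cases detected
        "specificity": tn / (tn + fp),  # proportion of non-cases correctly excluded
        "ppv": tp / (tp + fp),          # probability of being a case given a positive screen
        "npv": tn / (tn + fn),          # probability of not being a case given a negative screen
    }


# Made-up counts for illustration only.
indices = diagnostic_accuracy(tp=8, fp=12, fn=2, tn=78)
```

A pattern of high negative predictive value with low positive predictive value, as in the cited study, is what makes an instrument more suitable for screening than for confirming a diagnosis.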

Undergraduate students are at risk of developing ED [37]. Also, students from health science courses, such as dietitian courses, seem to be at higher risk [38, 39]. Kolar et al. [40] performed a meta-analysis of studies on ED in Latin America. Considering only studies that diagnosed ED in college samples with clinical interviews, the prevalence of BN and BED ranged from 0% to 2.8% and from 0% to 4.21%, respectively [41,42,43,44]. In the present study, the prevalence of BN ranged from 5.8% to 6.1%, and the prevalence of BED ranged from 2.6% to 3.2%, between the applications of the QEWP-5 in the entire sample. However, these prevalences should be confirmed with a clinical interview, because self-report instruments tend to overestimate the prevalence of ED [45].

The present study has some limitations. First, data were missing for the 14.4% of the students who did not complete the retest. However, they did not differ from the group of students who completed QEWP-5 twice. Therefore, the missing data did not undermine the study’s results. Second, the low prevalence of BED both in the entire sample and in the psychology course could have negatively influenced the kappa coefficient [32]. Third, the kappa coefficient was the only measure of stability used. Although it is considered the preferred statistical method to assess the reliability of scales with dichotomous scores, it provides no information about the structure of agreement and disagreement [22, 46, 47].

Conclusion

In general, the stability of the Brazilian version of QEWP-5 was considered moderate for assessing BED and substantial for the screening of BN in undergraduate students. Stratifying by course, the questionnaire was more stable in assessing both BED and BN in dietitian students than in psychology students. More research is required to evaluate the psychometric properties of this version of QEWP-5 in samples with different backgrounds. Finally, a clinical interview should be performed to confirm the high prevalence of ED in this sample and to validate the questionnaire as a screening instrument.

What is already known in this area?

QEWP-5 is the only instrument translated into Brazilian Portuguese that screens individuals for BED and BN by converting responses to a categorical scale. However, its psychometric properties have not yet been assessed.

What this study adds?

The Brazilian version of QEWP-5 was moderately stable for assessing BED and substantially stable for the screening of BN in undergraduate students.