Background

Anorexia nervosa (AN) comprises a multitude of symptoms, including self-imposed starvation, fat phobia, and a distorted body image. It most commonly develops in women between the ages of 15 and 19 years, although a trend toward earlier debut has been detected [1]. The prognosis is poor, with less than half of patients attaining complete recovery [2]. Approximately 20% follow a chronic path, and the mortality rate (5%) is the highest of any psychiatric disorder [2, 3]. AN has both physical and psychological features that affect patients’ everyday life, and studies have reported significantly decreased health-related quality of life (HRQoL) [4] even in remitted patients [5].

Generic HRQoL questionnaires have been developed for use in a wide range of populations to compare different diseases, whereas disease-specific HRQoL questionnaires have been developed to take into account specific nuances of a certain disease. In AN, HRQoL is mostly assessed by the Short-Form 36 (SF-36) or its abbreviated version (SF-12). The increasing interest in disease-specific HRQoL questionnaires has resulted in studies using generic and disease-specific questionnaires alongside each other, showing impaired HRQoL in patients with AN compared to the general population [6].

One of the major obstacles in assessing HRQoL in AN is the influence of the egosyntonic nature of the disorder [7, 8], i.e., disease behaviors are perceived by the patient as beneficial, and changes to these behaviors are consequently perceived as non-beneficial or even threatening. This complicates the person’s motivation to change, to accept treatment, or to recover from their pathologic behaviors [9]. The effect of a treatment in AN is often assessed using a physical parameter measured by the clinician as the main outcome, e.g., weight gain. However, the egosyntonic nature of AN means that weight gain is perceived as non-beneficial or threatening to the patient’s self. Imposing such a treatment will lead to a conflict with inner desires and goals, resulting in absent or low motivation to accept or adhere to the weight-gaining treatment. A recent study supported this notion of a conflict between the clinician’s and patient’s perceptions of ‘a good outcome’ and reported that an increase in body mass index (BMI) was significantly correlated with an increase in eating disorder (ED) psychopathology [10]. The alignment of treatment expectations could thus be an important aspect in improving the chances of successful treatment in AN [10].

A reliable and valid assessment tool is necessary to gain an understanding of the patient’s perspective. Several HRQoL questionnaires have been developed specifically for eating disorders (EDs), e.g., the Eating Disorder Quality of Life instrument (EDQoL) by Engel et al. [11], the Health-Related Quality of Life in Eating Disorders (HeRQoLED) by Las Hayas et al. [12], the Quality of Life for Eating Disorders (QOL ED) by Abraham et al. [13], and the Eating Disorder Quality of Life Scale (EDQLS) by Adair et al. [14]. So far, none of these has become the standard for assessing HRQoL in EDs, and validation studies are limited in number.

The EDQLS was developed in 2005 by a Canadian research group [15] and aimed to minimize response bias attributable to egosyntonicity. The questionnaire was developed to be applicable for both standard and individualized HRQoL assessment, to detect changes due to treatment response, and to be appropriate for both adolescents and adults. As part of the development, it was validated in a multi-site setting in its original language [15]. The EDQLS has also been validated in a non-clinical sample, but it has not yet been validated in other languages. We recently translated the EDQLS into Danish according to WHO translation guidelines [16, 17] and performed a pilot validation study in a small and broad sample of patients with eating disorders (BMI ranged from slightly undernourished to obesity). However, the study warranted a replication in a larger independent sample and including a more extensive description of symptomatology.

The primary aim of the current study was to investigate the factor structure of the Danish translation of the EDQLS and subsequently to evaluate the internal reliability and convergent validity of the EDQLS in a cohort of women with AN. We also tested known-groups validity.

Method

Appropriateness of the Danish EDQLS for patients < 18 years

In the Danish pilot study, the translated EDQLS was administered to patients ≥ 18 years, while the original English language version was tested from the age of 14 years [15]. Before commencing data collection for the current study, we wanted to ensure that the translated version was appropriate for individuals below 18 years of age. Over 3 days, we conducted individual interviews with six patients aged 13–17 years who were admitted for treatment of AN to the Child and Adolescent Mental Health Services in the Region of Southern Denmark. The interview began by asking their general impression of the questionnaire, and then, each questionnaire item was investigated in turn. If the language or a specific word was unclear, a suggestion for better wording was encouraged. We also inquired whether the participants understood the question, what the question meant to them, what they were thinking about when answering, and whether they would suggest any alterations to the question. Finally, we asked about areas not sufficiently covered by the questionnaire, and whether the questions “made sense” and “should be included”. All participants found the questionnaire to be highly relevant and found it easy to answer the questions. There was a general consensus that the questionnaire had an appropriate length, included clear instructions, and was easy to understand. The participants were enthusiastic regarding the content of the questionnaire and found that it covered relevant areas associated with quality of life and made no suggestions for changes or additions.

Study participants

Participants for the current study were recruited between June 14th 2017 and March 10th 2019 from the specialized centers for eating disorders in the five regions of Denmark. Female inpatients and outpatients diagnosed with AN were encouraged by the health professionals to use an online survey link inviting them to participate in the study. Participants could be in different stages of the recovery process, but could only complete the survey once. Eligibility criteria were BMI < 18.5 and age of 13–40 years. BMI for participants < 18 years of age was subsequently converted into BMI-for-age percentiles according to WHO’s growth reference [17]. A percentile < 10 indicated underweight, and a percentile of 10–85 represented a normal weight range [18].
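The screening rule stated above can be sketched in a few lines. This is a minimal illustration and not the study's REDCap branching logic: the function names and the metric inputs (weight in kg, height in cm) are assumptions, and the BMI-for-age percentile conversion for participants under 18 years is not reproduced because it depends on the WHO reference tables.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2 from weight (kg) and height (cm)."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def meets_screening_criteria(age_years: int, weight_kg: float, height_cm: float) -> bool:
    """Apply the stated eligibility criteria: BMI < 18.5 and age 13-40 years.

    Hypothetical helper for illustration; it ignores the sex and diagnosis
    requirements, which were handled by the recruiting clinicians.
    """
    return 13 <= age_years <= 40 and bmi(weight_kg, height_cm) < 18.5
```

For example, a 20-year-old who is 170 cm tall and weighs 50 kg has a BMI of about 17.3 and would pass this screen.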

The data provided by the study participants were automatically uploaded to Research Electronic Data Capture (REDCap), a secure web application to manage online surveys and databases, via OPEN at Odense University Hospital.

Questionnaires

The survey comprised 253 questions including medical history. Participants completing all questionnaires were compensated with 20 euros. To receive this remuneration, they needed to provide their name, personal security (CPR) number, and address to the financial department. We used the CPR number to check that no participants were included twice. To determine eligibility, the survey began with screening questions on age, height, weight, and medication. If inclusion criteria were met, a branching logic incorporated in the database allowed participants to move forward to the medical history questions on AN (duration of disease, and highest and lowest weight), alcohol consumption, and drug use, and a question on level of education. Subsequently, the survey moved on to the six questionnaires chosen for this study. Questionnaires were chosen based on prior research in the field and after discussing different possibilities with the authors of the original EDQLS instrument. Information on in- or out-patient status was not recorded.

Eating Disorder Quality of Life Scale (EDQLS)

This self-report questionnaire was developed for adolescents and adults with EDs and has been extensively tested in patients aged 14–60 years, with preliminary testing done in ages 9–13 years [15]. The EDQLS consists of 40 items across 12 subscales and takes 2–11 min (mean 5 min) to complete. Each item is rated on a five-point Likert scale from ‘strongly disagree’ (scored as 1) to ‘strongly agree’ (scored as 5), with a higher score indicating better HRQoL (maximum score 200). The 12 subscales are cognitive, education/vocation, family and close relationships, relationships with others, future outlook, appearance, leisure, psychological, emotional, values and beliefs, physical, and eating. Each subscale contains three items, except for the ‘eating’ subscale, which has seven items. The EDQLS software includes an automatic scoring program that converts all item responses to a total score (some items require reverse scoring prior to summing).
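The scoring principle described above (40 items scored 1–5, reverse scoring where needed, then summing to a maximum of 200) can be sketched as follows. This is not the EDQLS scoring software: the set of reverse-scored item numbers below is purely hypothetical and would have to be taken from the instrument's manual.

```python
# Hypothetical item numbers, for illustration only; the real reverse-scored
# items are defined by the EDQLS manual, not by this sketch.
REVERSE_SCORED = frozenset({2, 5, 9})


def edqls_total(responses: dict, reverse_scored=REVERSE_SCORED) -> int:
    """Sum 40 Likert responses (1-5) into a total score of 40-200.

    `responses` maps item number (1-40) to the raw response. Reverse-scored
    items are flipped with 6 - value so that higher always means better HRQoL.
    """
    if sorted(responses) != list(range(1, 41)):
        raise ValueError("expected responses for items 1-40")
    total = 0
    for item, value in responses.items():
        if not 1 <= value <= 5:
            raise ValueError(f"item {item}: response must be 1-5")
        total += (6 - value) if item in reverse_scored else value
    return total
```

Flipping with `6 - value` maps 1 to 5 and 5 to 1 on a five-point scale, so a respondent who answers every item in the most favorable direction reaches the maximum of 200.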

The Eating Disorder Inventory-3 (EDI-3)

The self-report EDI was developed to assess psychopathology associated with EDs. The latest version, published in 2004, consists of 91 questions across 12 subscales and has been validated in a Danish sample [19, 20]. The subscales comprise three areas specific to eating disorders (drive for thinness, bulimia, and body dissatisfaction) and nine general areas related to eating disorders (low self-esteem, personal alienation, interpersonal insecurity, interpersonal alienation, interoceptive deficits, emotional dysregulation, perfectionism, asceticism, and maturity fear). The EDI-3 is rated on a 0–4 point scoring system and can be used from age 13 years. When purchasing a license to the EDI, a software program is included to summarize and convert the scores. Higher scores represent a higher level of ED symptomatology.

Short-Form 36 (SF-36)

The SF-36 is a generic, self-report HRQoL questionnaire developed in 1992 by Ware and Sherbourne [21]. It consists of eight subscales: physical functioning (PF), role limitations due to physical health problems (RP), bodily pain (BP), general health perception (GH), vitality (VT), social functioning (SF), role limitations due to emotional problems (RE), and general mental health (MH). Each subscale is transformed into a 0–100 scale, with higher scores indicating better HRQoL. The SF-36 has been widely used and is validated in a Danish sample [22].

Beck Depression Inventory (BDI)

The self-report Beck Depression Inventory has 21 questions rating symptoms of depression. The first version, developed in 1961 by Beck et al., has been used in studies and clinical settings to measure severity of depression. A revised version of the BDI was published in 1996 in response to the new diagnostic criteria for major depression [23] and has been validated in a Danish sample [24]. Each item is rated with a value between 0 and 3, with higher scores indicating more severe depressive symptoms. Standardized cutoffs have been determined where a total score of 0–13 reflects minimal depression, 14–19 mild depression, 20–28 moderate depression, and 29–63 severe depression.
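The cutoffs listed above translate directly into a small lookup. The function name is an assumption made for illustration; only the score bands come from the text.

```python
def bdi_severity(total_score: int) -> str:
    """Map a BDI total score (0-63) to the severity band given in the text."""
    if not 0 <= total_score <= 63:
        raise ValueError("BDI total score must be between 0 and 63")
    if total_score <= 13:
        return "minimal"
    if total_score <= 19:
        return "mild"
    if total_score <= 28:
        return "moderate"
    return "severe"
```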

Work and Social Adjustment Scale (WSAS)

The self-report WSAS was developed to rate functional impairment and contains five items rated on a nine-point Likert scale (‘not at all’ to ‘very severely’). Respondents rate the extent to which their current problem influences their work, household chores, social/private activities, families, and relationships. The maximum score of 40 represents a high impairment of normal functioning. The developers proposed that a total score < 10 represents no impairment, 10–20 represent significant impairment, and scores > 20 represent severe impairment [25]. The questionnaire has been validated in a Norwegian sample [26] and used in several Danish studies.

The WHO-5 well-being index

The WHO-5 assesses subjective psychological well-being within the last 2 weeks and has been widely used since its development in 1998 [27]. It was developed in Denmark and validated in a Danish sample [27]. It is freely available on the Internet and is available in over 30 languages. It consists of five items rated on a six-point Likert scale (‘all the time’ to ‘none of the time’). The maximum total score is 100, with higher scores representing greater well-being. A score of 0–35 represents significantly lower well-being compared to the general population, with a risk of psychopathology being present. A score of 36–50 represents lower well-being compared to the general population, where 50 is the general population norm value.
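A sketch of how a 0–100 total arises from five six-point items: under the standard WHO-5 convention (not spelled out in the text above), each item is coded 0 (‘none of the time’) to 5 (‘all the time’) and the raw sum of 0–25 is multiplied by 4. The function names are assumptions for illustration.

```python
def who5_score(item_responses) -> int:
    """Convert five WHO-5 item responses (each coded 0-5) to a 0-100 score."""
    responses = list(item_responses)
    if len(responses) != 5:
        raise ValueError("WHO-5 has exactly five items")
    if any(not 0 <= r <= 5 for r in responses):
        raise ValueError("each item must be coded 0-5")
    return sum(responses) * 4  # raw sum 0-25 scaled to 0-100


def markedly_reduced_wellbeing(score: int) -> bool:
    """True for the 0-35 band described in the text."""
    return score <= 35
```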

Ethics

The project was approved by the Danish Data Protection Agency, File no. 17/3218. The ethical committee was contacted to enquire about approval prior to the initiation of the project, and the project was approved without further application. The study is registered in ClinicalTrials.gov, registration number NCT03230435. Patients were informed about the aim of the study, and for underage patients, parents gave oral consent.

Statistical analysis

Statistical analyses were conducted in the statistical program STATA (StataCorp. 2017. Stata Statistical Software: Release 15. College Station, TX: StataCorp LLC). The sample size (n = 211) was determined by sample size calculations based on the Canadian validation study of the original English version. A high response rate was expected due to the high participatory interest in the subject and the participation payment.

Normality of data was assessed visually by histograms and by performing Shapiro–Wilk tests. Data are presented as medians and interquartile ranges (IQR) due to their non-normal distribution. Mann–Whitney U tests were used to compare characteristics and questionnaire scores between stratified age groups (13–17 years and 18–40 years). Cronbach’s α was computed to evaluate the internal consistency of the 12 subscales of the EDQLS and of the total scale.
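For readers unfamiliar with Cronbach’s α, the following is a minimal sketch of the textbook formula, α = k/(k − 1) × (1 − Σ item variances / variance of the summed score). The analyses in this study were run in STATA; this Python function and its name are only an illustration and assume complete data with non-constant total scores.

```python
from statistics import variance


def cronbach_alpha(item_scores) -> float:
    """Cronbach's alpha for a scale.

    `item_scores` is a list with one inner list per item, each holding the
    scores of the same respondents in the same order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(item_scores[0])
    if any(len(col) != n for col in item_scores):
        raise ValueError("all items must have the same number of respondents")
    totals = [sum(per_respondent) for per_respondent in zip(*item_scores)]
    sum_item_variances = sum(variance(col) for col in item_scores)
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))
```

With only three items per subscale, α is expected to be lower than for the 40-item total scale even when inter-item correlations are similar, because α increases with the number of items.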

Convergent validity was determined in a similar manner to the original developmental and validation study and according to established validation methods [15, 28]. In the absence of a gold standard for measuring HRQoL in AN, we used Spearman’s rank correlation to compare the EDQLS total or subscale scores with the responses to similar items from measures of psychopathology, social function, and general well-being.
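Spearman’s rank correlation is the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below shows this in plain Python; the function names are assumptions and the scores in the usage note are invented, not study data.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither variable is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because higher EDQLS scores indicate better HRQoL while higher BDI scores indicate more severe depression, a convergent relationship would appear as a negative coefficient, e.g. `spearman_rho([150, 120, 100, 90, 60], [5, 14, 20, 31, 45])` for five invented respondents.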

Known-groups validity was tested by stratifying participants according to ED severity (measured by EDI-3), age, BMI, and duration of disease and then using regression analysis to assess the association between these variables and the EDQLS total score.

Kendall’s Tau correlations were performed to determine the internal correlations between EDQLS items within the same subscale. These correlations were used to help determine the most appropriate clustering of EDQLS items.
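Five-point Likert items produce many tied pairs, so a tie-corrected coefficient such as Kendall’s tau-b is a natural choice for item-by-item correlations. The text does not state which variant was used, so the tau-b sketch below is an assumption, as is the function name.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y) -> float:
    """Kendall's tau-b for two equally long sequences of item responses."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from both terms
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied_in_x = concordant + discordant + tied_y_only
    untied_in_y = concordant + discordant + tied_x_only
    return (concordant - discordant) / sqrt(untied_in_x * untied_in_y)
```

Computing this for every pair of the 40 items gives a correlation matrix in which one can check whether each item correlates most strongly with the other items of its own subscale.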

Confirmatory factor analyses (CFA) were performed to determine the goodness-of-fit of the current 12-factor structure of the EDQLS using the following indices: the comparative fit index (CFI), the Tucker–Lewis index (TLI), the standardized root-mean-square residual (SRMR), and the root-mean-square error of approximation (RMSEA) with its 90% confidence interval. Subsequent principal component analyses (PCA) and scree plots were computed to propose an alternative model.

Results

A total of 211 females with AN completed all the survey questions and were included in the study. Of these, one-third were aged 13–17 years, and two-thirds were aged 18–40 years (Table 1). As only nine respondents were male, they were excluded from the current analysis and will be reported separately in a future study.

Table 1 Descriptive and comparative statistics

Participants in the two age groups (adolescents 13–17 years and adults 18–40 years) were similar with respect to BMI, and both groups had current BMI and nadir BMI below the cut-off for underweight (18.5) as defined by WHO. For participants aged 13–17 years, BMI was converted to BMI-for-age percentiles, and approximately two-thirds of these participants reported a BMI < 10th percentile, indicating underweight for their age. The remaining one-third had a BMI percentile in the normal range but were still included in the following analyses, as they were in treatment for AN at the time of data collection. The older age group had significantly higher medication use (primarily antidepressants/anxiety medication) and a longer duration of AN, but alcohol intake was infrequent in both groups. The two age groups had similar total scores on the ED-specific questionnaires, EDQLS and EDI-3 (Table 1). Median EDQLS subscale scores are shown in Fig. 1. The EDQLS total scores and subscale scores did not differ significantly between the two age groups.

Fig. 1 Eating Disorder Quality of Life Scale median scores

Both age groups showed significant health and functional impairment. Their SF-36 domain scores for Mental Health and Vitality were well below the norm score of 50 [21], while Physical Functioning and Bodily Pain were above the norm score of 50. The median BDI total score indicated severe depression in both age groups, the median WSAS total score indicated severe impairment of work/study and social activities (especially in participants aged 18–40 years), and the median WHO-5 total score indicated lower well-being than in the general population (Table 1).

Internal consistency

Cronbach’s α coefficient (0.94) showed excellent internal consistency for the EDQLS total score (Table 2). Internal consistency was acceptable or good for the subscales ‘future outlook’ (alpha = 0.79) and ‘eating’ (alpha = 0.82).

Table 2 Summary of goodness-of-fit statistics, n = 211

Convergent validity was evident from the significant correlations between EDQLS subscale/total scores and the pre-determined items/subscale scores from the BDI, EDI-3, SF-36, WHO-5, and WSAS (Table 3).

Table 3 Internal consistency of the EDQLS 12 subscales, the total scale, and the 5 subscales proposed by Akoury et al. [30], n = 211

Regression analyses for known-group validity (Table 4) revealed significantly worse HRQoL with increasing ED severity measured by EDI-3 but not with other ED variables or age group.

Table 4 Spearman correlations between EDQLS subscale scores and validation instruments, n = 211

Most EDQLS items did not demonstrate the strongest correlation with items in their own pre-determined subscale, thus not supporting the clustering of items as in the original 12-factor structure. Many items correlated as strongly or stronger with items from other subscales than items in the same pre-determined subscale (Supplementary table).

CFA of the original 12 subscales showed a poor fit in the current data (Chi-squared = 1538.4; p value ≤ 0.001; RMSEA = 0.078; CFI = 0.773; TLI = 0.737; SRMR = 0.087). The Comparative Fit Index (CFI) and the Tucker–Lewis Index (TLI) fit statistics were low, also indicating poor fit to a 12-factor structure for the EDQLS questionnaire.
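As a rough plausibility check, the reported RMSEA can be related to the reported chi-squared value with the common formula RMSEA = sqrt(max(χ² − df, 0) / (df × (N − 1))). The degrees of freedom are not reported in the text; the value used below is an assumption derived from a standard parameter count for 40 items loading on 12 correlated factors without cross-loadings or correlated residuals. Some software uses N rather than N − 1 in the denominator, which changes the result only marginally here.

```python
from math import sqrt

N_ITEMS = 40
N_FACTORS = 12

# Assumed parameter count (not reported in the text): 40 loadings,
# 40 residual variances, and 12 * 11 / 2 = 66 factor covariances,
# with factor variances fixed to 1 for identification.
FREE_PARAMETERS = N_ITEMS + N_ITEMS + N_FACTORS * (N_FACTORS - 1) // 2
ASSUMED_DF = N_ITEMS * (N_ITEMS + 1) // 2 - FREE_PARAMETERS


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of the root-mean-square error of approximation."""
    return sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))
```

Under these assumptions, `rmsea(1538.4, ASSUMED_DF, 211)` rounds to the reported value of 0.078, which suggests that the assumed model specification is at least compatible with the reported fit statistics.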

The results of the PCA suggested an eight-factor model accounting for 61.6% of the variance, using the eigenvalue > 1.00 criterion. The scree plot leveled off after approximately five factors (Fig. 2). Table 5 shows the rotated factor pattern matrix, indicating the clustering of items in the eight factors proposed by the PCA by displaying the highest loading. The eight factors included mixed items but could be summarized as: (1) eating disorders, (2) relationships, (3) attention to body/weight, (4) positive self-image, (5) negative emotionality, (6) energy/vitality, (7) social activities, and (8) mixed (Table 6). The eighth factor comprised only two items that appeared clinically unrelated.
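The eigenvalue > 1.00 rule can be illustrated with a short sketch: components whose eigenvalue exceeds 1 are retained, and the proportion of variance they explain is their summed eigenvalue divided by the sum of all eigenvalues (which equals the number of items when the PCA is run on a correlation matrix). The function name is an assumption, and the eigenvalues in the usage note are invented for illustration; they are not the study's values.

```python
def retained_components(eigenvalues, threshold: float = 1.0):
    """Return (number of retained components, proportion of variance explained)."""
    eigenvalues = list(eigenvalues)
    if not eigenvalues:
        raise ValueError("need at least one eigenvalue")
    retained = [ev for ev in eigenvalues if ev > threshold]
    explained = sum(retained) / sum(eigenvalues)
    return len(retained), explained
```

For a hypothetical four-item scale with eigenvalues 2.0, 1.2, 0.5, and 0.3, two components would be retained, explaining 80% of the variance. The rule and the scree plot do not always agree, which is one reason the number of factors is also a matter of clinical judgment.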

Fig. 2 Scree plot of eigenvalues

Table 5 Association between EDQLS and EDI, age, BMI, and duration of disease, n = 211
Table 6 Factor loadings on 7-factor model from exploratory factor analysis, oblique quartimin rotation, n = 211

Discussion

The current study assessed the psychometric properties of the Danish translation of the Eating Disorder Quality of Life Scale in terms of factor structure, convergent validity, and internal consistency. Our data revealed excellent internal consistency for the EDQLS total score and acceptable convergent validity with individual items or total scores of other questionnaires assessing ED psychopathology, physical and social functioning, and well-being. Known-groups validity also showed an expected association between EDQLS total score and ED severity measured by EDI-3. These findings support the use of the EDQLS total score in patients with AN. However, our data did not support the current division of EDQLS into 12 subscales [15].

The EDQLS has been validated in few clinical samples [15, 29] and one non-clinical sample [30]. The non-clinical study tested the factor structure of the EDQLS and involved individuals who were not diagnosed with an ED but had BMI ranging from underweight to extreme overweight. Data from a heterogeneous group are difficult to analyze, as HRQoL has been documented to be severely impaired in obese patients, similar to cancer patients [32]. Furthermore, items on preoccupation with shape, body, or weight concerns have a different meaning to obese patients than to underweight patients [33]. Akoury et al. [30] proposed a revision of the item clustering in the EDQLS from the current 12 subscales to five subscales, which improved subscale internal consistency to 0.72–0.82. In their original study, Adair et al. proposed an eight-factor model by performing PCA [15] but deferred CFA for future larger samples and maintained the original 12-factor model. The CFA performed on our data did not support the current 12-factor model, as the goodness-of-fit was poor, but our data do support the use of the total score of the EDQLS. Relying on the total score instead of subscale scores simplifies the questionnaire and its scoring and enhances overall understanding and comparability. By quantifying the burden of symptoms, we can measure treatment effect and compare different patient groups and settings. Some subscale information may be lost, however, when only using a total score. HRQoL is highly subjective, and patients’ individual preferences for different domains and how they are affected may be useful information when agreeing on treatment plans/interventions.

The five-factor model proposed by Akoury et al. [30] included subscales of positive emotionality, body/weight dissatisfaction, disordered eating behaviors, negative emotionality, and social engagement. PCA performed on our data suggested an eight-factor model, as proposed by the original authors [15]. Our correlation analyses also supported a division other than the current 12-factor model, as the items correlated more strongly with items in other subscales than with items in their own. Furthermore, many items correlated with more than one factor, indicating an imperfect factor model. We consider the clustering suggested by the eight-factor model to be clinically relevant. However, the eighth factor included only two items that were not obviously related and that correlated more strongly with items from other factors. Thus, clinically, it might be relevant to include these two items in another factor.

In line with previous studies performed by the authors of the EDQLS, we found excellent internal consistency (Cronbach’s α = 0.94) for the total score [15, 29]. Similar to these previous publications, subscale internal consistency was less convincing, with coefficients ranging from 0.38 to 0.82 [15, 29]. The poorest consistency was for the domains regarding leisure and family/close relationships, and the best was for the future outlook and eating domains. In addition to the previous clinical validation studies [15, 29], a non-clinical study by Akoury et al. [30] also found excellent internal consistency for the total score but low internal consistency at subscale level, supporting the use of the total EDQLS score but proposing scale revision of the EDQLS [30].

The EDQLS demonstrated acceptable convergent validity, where significant correlations on all subscales (and total scores) indicated that increasingly impaired HRQoL was associated with more symptoms of depression or eating disorder and greater impairment of well-being or daily function. This is in line with the results from a non-clinical study demonstrating worse HRQoL with increasing ED symptoms [30] and indicates an all-encompassing disease, as stated in previous publications [31].

The main strength of the current study is the large sample of patients with AN who were recruited from several geographical regions of Denmark, thereby increasing representativeness of the AN population. In addition, the sample included both in- and outpatients and both adolescents and adults. The 53% survey completion rate was lower than expected, even with the high number of questions, but non-completers were similar to completers regarding age and BMI.

The measures used to test the convergent validity of the EDQLS were well established and showed relevant associations. The reliance on self-report data assumes that the items have been interpreted as intended. We did not test for divergent validity as none of the chosen measures were optimal for this. A next step would be to test the responsiveness of the EDQLS to investigate whether it can identify HRQoL changes in relation to treatment.

The results of the current study were based on data from the Danish translation of the EDQLS and from patients diagnosed with AN and are thus not conclusive for other eating disorders or other languages. HRQoL is a complex and subtle construct and is difficult to quantify and assess by analytical processes. Factor analysis is a statistical approach based on numeric correlations. Future studies should include clinimetrics to further investigate whether factor revision would be appropriate.

We highly recommend using the EDQLS to assess disease-specific HRQoL in patients with AN. We found the EDQLS total score to be a valid and useful reflection of the patient’s health status, and it appeared to be easily completed and well accepted by the study participants. Future studies should focus on exploring the factor structure and testing the use of the EDQLS in different settings.

What is already known on the subject:

Only one clinical and one non-clinical validation study has been performed on the EDQLS prior to this study. These validation studies proposed an alternative factor model.

What does this study add?

This study adds validation regarding the Danish version of the EDQLS and proposes to use the total score of the instrument.