Introduction

Eating disorders (EDs) are severe psychopathological conditions that have a significant impact on professional, academic, and social life and are associated with clinically significant psychosocial impairment [1]. Patients with EDs show extreme concerns about weight and shape and a tendency to judge self-worth almost exclusively in terms of their ability to control their eating, weight, and body shape [2]. These core features of EDs compromise functioning in other areas of life, such as interpersonal relationships, work, and family [3].

Although several measures exist to assess the quality of life of patients with EDs (for a review, see [4,5,6]), few questionnaires have been developed to evaluate the clinical impairment associated with ED-specific psychopathology [3]. The Clinical Impairment Assessment (CIA) was developed by Bohn and Fairburn to address this particular aspect. The CIA is a 16-item self-report instrument that assesses areas of life usually affected by specific ED psychopathological features, namely, mood and self-perception, cognitive functioning, interpersonal functioning, and job performance. The CIA assesses impairment during the previous 28 days in three domains (personal, social, and cognitive) and was developed to be used in conjunction with the Eating Disorders Examination Questionnaire (EDE-Q), providing information about impairment secondary to ED symptoms and attitudes [3].

Several studies have addressed the psychometric properties of the CIA. Four studies have tested whether the 3-factor structure corresponding to the 3 domains of impairment assessed by the scale could be replicated, using exploratory factor analysis [7] and confirmatory factor analysis [8,9,10], and found support for the original factor structure of the CIA with 3 domains of impairment (personal, cognitive, and social). The original report [3] and more recent studies [8,9,10,11], using both clinical and high-risk ED samples, have demonstrated adequate psychometric properties of the original version of the scale, with acceptable reliability and validity. Several studies have demonstrated a high correlation of the CIA global score and subscales with measures of eating disorders and associated psychopathology (e.g., [9]). Calugi and colleagues [10] assessed test–retest reliability in a clinical sample of ED patients and found support for the temporal stability of the measure. A study by Martin [9], in a clinical sample of Spanish ED patients, examined sensitivity to change (responsiveness), supporting the results of the original measure. Several studies have also provided support for the known-groups validity of the CIA, namely, by testing for differences in CIA scores between groups defined by socio-demographic variables [9] and by the presence of different ED symptoms and groups of patients [8, 10]. A cutoff of 16 in the global score was proposed by Bohn for discriminating patients with EDs from recovered ED participants and was replicated in two studies using clinical samples [10, 12]. To date, the original English version of the CIA has been translated, adapted, and validated in several languages, including Persian [13], Fijian [14], Norwegian [7], Spanish [9], Swedish [15], and Italian [10], and all versions have demonstrated good psychometric properties. No previous studies exist for the validation of the CIA in Portuguese.
Accordingly, the current study aimed to validate the Portuguese version of the CIA by assessing several psychometric properties of this questionnaire (reliability and convergent and discriminant validity) in a sample of eating disorder patients and by providing clinically significant change cut-off values and reliable change indexes that can be used for evaluating individual patient change. This study aims to contribute to the cross-cultural validation of this measure, given the clinical utility of assessing this construct to plan treatment and evaluate outcome in EDs.

The following hypotheses were tested: the CIA will present good construct validity.

In addition, the CIA will present good convergent validity, with high correlations between the CIA and EDE-Q scores. The CIA global score will also be highly correlated with symptom distress as measured by the Outcome Questionnaire-45 (OQ-45) and with depressive symptoms assessed by the Beck Depression Inventory (BDI). For discriminant validity, ED patients will score significantly higher than the comparison group of college students, and CIA scores will accurately distinguish ED participants from the comparison group. For known-groups validity, participants above the CIA cutoff will score significantly higher in all CIA subscales. CIA scores will significantly distinguish participants according to the presence of different ED behaviors (objective bulimic episodes (OBE) and compensatory behaviors) and symptoms (e.g., BMI under 17.5).

Method

Participants

Participants were 237 women diagnosed with EDs, recruited from two specialized outpatient units for the treatment of EDs in the North and Center of Portugal. Patients in need of specialized ED care are referred to these units by general practitioners and other health care professionals or through self-referral. Participants in the clinical sample were diagnosed with EDs according to the DSM-5 criteria [16] at intake. No structured clinical interview was conducted for diagnostic purposes, but patients were diagnosed by highly experienced psychiatrists specialized in the assessment and treatment of EDs.

A non-clinical sample was recruited at a University Campus in the North of Portugal. Participants were 196 students.

Measures

Clinical Impairment Assessment questionnaire (CIA, [3]): a 16-item self-report questionnaire assessing impairment secondary to EDs in three domains: personal, cognitive, and social impairment. A global impairment score is computed to measure the overall severity of impairment secondary to ED features. Participants rate each item on a 4-point Likert scale (Not at all = 0; A little = 1; Quite a bit = 2; A lot = 3) with reference to the previous 28 days (e.g., “Over the past 28 days, to what extent have your eating habits, exercise or feelings about eating, shape or weight interfered with meals with family or friends”). The global score ranges from 0 to 48, and a higher score reflects more impairment.
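
The scoring rule described above can be illustrated with a minimal sketch. It assumes that the global score is the simple sum of the 16 item ratings, which is consistent with the stated 0 to 48 range; the function name and the decision to reject incomplete questionnaires are illustrative choices, not part of the published scoring instructions.

```python
from typing import Sequence

N_ITEMS = 16                     # number of CIA items, as described in the text
MIN_RATING, MAX_RATING = 0, 3    # "Not at all" = 0 ... "A lot" = 3


def cia_global_score(ratings: Sequence[int]) -> int:
    """Sum the 16 item ratings into a global impairment score (0 to 48).

    Assumption: plain sum of complete questionnaires; incomplete
    questionnaires are rejected rather than prorated.
    """
    if len(ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} ratings, got {len(ratings)}")
    for rating in ratings:
        if not MIN_RATING <= rating <= MAX_RATING:
            raise ValueError(f"rating {rating} is outside {MIN_RATING}-{MAX_RATING}")
    return sum(ratings)


# Hypothetical respondent answering "Quite a bit" (2) to every item.
example_ratings = [2] * N_ITEMS
print(cia_global_score(example_ratings))  # 32
```

Subscale scores are not computed here because the assignment of items to the personal, social, and cognitive domains is not listed in this section.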

Eating Disorders Examination Questionnaire (EDE-Q, [16]): is a self-report measure derived from the Eating Disorders Examination interview (EDE; [17]) used to assess eating disorder symptoms and associated psychopathology. It contains 28 items rated on a Likert rating scale ranging from 0 to 6 and indicating the number of days out of the previous 28 in which particular behaviors, attitudes, or feelings occurred. It comprises four subscales (dietary restriction, eating concern, shape concern and weight concern) and a total score. The Portuguese version [18] used in this study showed adequate psychometric properties. The Cronbach’s alpha for the current study was 0.95, considered an excellent value of internal consistency.

Beck Depression Inventory (BDI) [19]: a 21-item self-report inventory used to assess depression symptoms (cognitive, affective, and somatic symptoms). Respondents rate each item from 0 to 3 according to the severity of depressive symptoms. The Portuguese version of the BDI has excellent psychometric properties [20]. In the present study, Cronbach’s alpha was 0.88.

Outcome Questionnaire-45 (OQ-45 [21]): a 45-item self-report instrument designed specifically for tracking and assessing patient outcomes in a therapeutic setting. It measures areas of mental health and life functioning (symptoms, interpersonal problems, social role functioning, and quality of life). The OQ-45 is scored using a five-point Likert scale from 0 = never to 4 = always. High scores on the OQ-45 indicate more distress. The OQ-45 generates three subscale scores: symptom distress (SD), interpersonal relationships (IR), and social role functioning (SR). The Portuguese version shows good psychometric properties [22]. Cronbach’s alpha for the total score in the current study was 0.87, which is considered a good value of internal consistency.

A socio-demographic questionnaire was developed for assessing age, gender, course (in the case of the college sample), weight and height, school years and residential area.

Weight and height were self-reported by all participants when filling in the questionnaires (EDE-Q).

Procedure

All participants referred by their psychiatrist as having an ED diagnosis were included in the study and completed the self-report measures as part of the intake procedure. No exclusion criteria were defined except for the inability to understand the questions or to speak and write in Portuguese.

A non-clinical sample of college students was collected. Data were collected both in paper-and-pencil format (n = 56), in college course classes, and electronically via Google Forms. Course directors were contacted by email and asked for permission to visit classes so that students could participate by filling in the questionnaire. Online data collection was carried out through institutional email by sending a link to the CIA questionnaire and some brief socio-demographic questions. For the college sample, only the CIA questionnaire was used, to reduce administration time and increase student participation. In addition, students were informed about the purpose of the study and assured of anonymity. No exclusion criteria were defined other than the inability to understand the questions.

The ethics committees of all participating institutions approved the study, and informed consent was obtained from all participants. For participants in the clinical sample who were younger than 18 years, consent from their parents was obtained. Participants were assured that the data would be anonymized for research purposes.

The CIA was translated and adapted from the original English version. Approval from the authors of the original version was obtained. An experienced bilingual (English and Portuguese) psychologist then translated the Portuguese version back into English. The original English questionnaire was compared with the back-translated one. Identified discrepancies were analyzed, and adjustments to the Portuguese version were made when necessary. A preliminary version was administered to a small group of graduate students, who showed a good understanding of item meaning and whose feedback was incorporated in the final Portuguese version.

Statistical analysis

Data analyzed included only participants with no missing values in the CIA questionnaire. Twelve participants (2.77% of the sample) with missing data in the CIA questionnaire were excluded from the analysis, and the total sample analyzed was 433 participants. Normality of the distributions was tested with the Kolmogorov–Smirnov and Shapiro–Wilk tests. The internal consistency of the CIA was measured using Cronbach’s alpha and McDonald’s omega coefficients. These indices were calculated with the reliability function of the semTools package for R (semTools Contributors 2016). Indices are considered acceptable if greater than 0.70. Measurement error was calculated with the formula \(\text{SEM}=\text{SD}\sqrt{1-r}\), where SD is the standard deviation of the scores and \(r\) is the reliability coefficient.
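
As an illustration of the two quantities above, the following sketch computes Cronbach’s alpha from item-level data and the standard error of measurement from a standard deviation and a reliability coefficient. It is a plain Python approximation of the calculations, not the semTools routine used in the study; the function names and the toy item data are hypothetical.

```python
import math
from statistics import variance
from typing import Sequence


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; `items` holds one list of scores per item."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score of each respondent across all items.
    totals = [sum(person) for person in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - r), as in the formula given in the text."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Toy data (hypothetical): three items answered by four respondents.
toy_items = [[1, 2, 3, 4], [2, 2, 4, 4], [1, 3, 3, 5]]
print(round(cronbach_alpha(toy_items), 2))  # 0.93

# Global-score values reported in the Results of this paper:
# SD = 12.65 and Cronbach's alpha = 0.96.
print(round(standard_error_of_measurement(12.65, 0.96), 2))  # 2.53
```

With the global-score standard deviation and alpha reported in the Results, the formula reproduces the measurement error of 2.53 given there.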

To test the fit of the data to the hypothesized model, confirmatory factor analysis (CFA) was performed with the clinical sample participants, using maximum likelihood estimation. The following criteria were considered to indicate a good fit: CMIN (χ2) non-significant (p > 0.05) or 2 ≤ CMIN/DF ≤ 5 for large samples such as ours; standardized root mean square residual (SRMR) < 0.08; root mean square error of approximation (RMSEA) < 0.07 and PCLOSE > 0.05; comparative fit index (CFI) > 0.95; normed fit index (NFI) > 0.95; NNFI (TLI) > 0.991 [23]. To improve model fit, covariances between errors were added based on inspection of the modification indexes. CFA was conducted using IBM® SPSS® Amos™ 20.0.

Spearman correlation coefficients were used to assess convergent validity by examining the relation between CIA scores and EDE-Q, OQ-45, and BDI scores.

Receiver operating characteristic (ROC) analysis was used to test the ability of the CIA to predict case status. The cutoff was calculated for the groups by maximizing AUC, attributing equal weight to sensitivity and specificity. Known-groups validity was tested using the proposed CIA cutoff and comparing groups that scored above and below the cut-off score regarding EDE-Q, OQ-45, and BDI scores using the Mann–Whitney test. In addition, CIA scores were compared between the clinical and college samples, and between participants with and without different ED behaviors and symptoms, using the Mann–Whitney test. Bonferroni correction was used when multiple comparisons applied. Effect sizes for Mann–Whitney tests were calculated according to \(r=\frac{z}{\sqrt{N}}\) and interpreted as 0.1 = small, 0.3 = medium, and 0.5 = large effect, as proposed by Cohen [24].
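
One common way to give equal weight to sensitivity and specificity when choosing a cutoff is to maximize Youden’s index (J = sensitivity + specificity − 1). The sketch below illustrates that approach under two assumptions that are not stated in the text: that Youden’s index matches the criterion actually applied in SPSS, and that scores at or above the cutoff are flagged as cases. The scores are hypothetical.

```python
from typing import Sequence, Tuple


def sens_spec(cases: Sequence[float], controls: Sequence[float],
              cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= cutoff are flagged as cases."""
    true_positives = sum(score >= cutoff for score in cases)
    true_negatives = sum(score < cutoff for score in controls)
    return true_positives / len(cases), true_negatives / len(controls)


def best_cutoff(cases: Sequence[float],
                controls: Sequence[float]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    Every observed score is tried as a candidate cutoff; ties in J keep
    the lowest candidate.
    """
    best = None
    for candidate in sorted(set(cases) | set(controls)):
        sens, spec = sens_spec(cases, controls, candidate)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, candidate, sens, spec)
    _, cutoff, sens, spec = best
    return cutoff, sens, spec


# Hypothetical global scores for a clinical and a comparison group.
clinical_scores = [10, 18, 22, 30, 35]
comparison_scores = [2, 5, 8, 12, 16]
print(best_cutoff(clinical_scores, comparison_scores))  # (18, 0.8, 1.0)
```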

Finally, the reliable change index (RCI) and the cut-off scores for clinically significant change (CS) were calculated using the formula proposed by Jacobson and Truax [25]. According to these authors, for a patient to be classified as having made a clinically significant change, the patient’s score has to move out of the dysfunctional population range, and the degree of change has to be statistically reliable. To determine the likelihood of a patient’s score being in the functional range, we calculated a cutoff score using criterion C and the formula proposed by Jacobson and colleagues [26]. For the reliable change index calculation, we used the internal consistency of the non-clinical sample, following the suggestion of Tingey, Lambert, Burlingame, and Hansen [27], for whom the use of internal consistency is more appropriate when measuring change over time. Patients are “recovered” regarding the CIA if they show reliable positive change and pass the cutoff according to the RCI and CS.
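
The Jacobson and Truax quantities can be sketched as follows. The smallest reliable change is taken as 1.96 times the standard error of the difference, and criterion C is the point between the dysfunctional and functional means weighted by the two standard deviations. The function names and all input values are hypothetical, since the non-clinical means, standard deviations, and reliabilities needed to reproduce the study’s figures are not reported in this section.

```python
import math


def reliable_change_threshold(sd: float, reliability: float,
                              z: float = 1.96) -> float:
    """Smallest score change that counts as reliable.

    SE = SD * sqrt(1 - r); S_diff = sqrt(2 * SE^2); threshold = z * S_diff.
    """
    standard_error = sd * math.sqrt(1.0 - reliability)
    s_diff = math.sqrt(2.0 * standard_error ** 2)
    return z * s_diff


def criterion_c(mean_clinical: float, sd_clinical: float,
                mean_functional: float, sd_functional: float) -> float:
    """Cutoff c = (SD_func * M_clin + SD_clin * M_func) / (SD_func + SD_clin)."""
    return ((sd_functional * mean_clinical + sd_clinical * mean_functional)
            / (sd_functional + sd_clinical))


# Hypothetical inputs, chosen only to show the arithmetic.
print(round(reliable_change_threshold(sd=10.0, reliability=0.75), 2))  # 13.86
print(criterion_c(mean_clinical=24.0, sd_clinical=12.0,
                  mean_functional=6.0, sd_functional=6.0))  # 12.0
```

Because the cutoff is weighted by the standard deviations, it lies closer to the mean of the group with the smaller spread.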

Significance levels were set at p < 0.05. All analyses, except for reliability and CFA, were made using the Statistical Package for the Social Sciences (IBM® SPSS®) (Version 24.0).

Results

Sample characteristics are described in Table 1.

Table 1 Sample characteristics

The mean global score for ED patients was 27.79 (SD = 12.65). Means, standard deviations, skewness, and kurtosis for each CIA item are presented in Table 2.

Table 2 Mean, standard deviation, skewness, and kurtosis for each CIA item in EDs’ participants’ sample (n = 237)

Confirmatory factor analysis

A CFA was performed to test the fit of the data to the hypothesized model. CFA showed good fit indexes for the model tested (Fig. 1): CMIN = 162.671, DF = 95, p < 0.001; CMIN/DF = 1.712; SRMR = 0.038; RMSEA = 0.055; PCLOSE = 0.275; CFI = 0.972; NFI = 0.937; TLI = 0.965. These data confirmed the three-factor structure originally proposed [5] and showed an appropriate fit for the clinical sample.
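
Two of the reported indexes can be recomputed from the chi-square statistic alone, which serves as a quick consistency check. The sketch assumes the common RMSEA point estimate sqrt(max(χ2 − DF, 0) / (DF × (N − 1))) and N = 237 (the clinical sample used for the CFA); some programs divide by N rather than N − 1, which gives the same value to three decimals here.

```python
import math


def cmin_df_ratio(chi2: float, df: int) -> float:
    """Relative chi-square (CMIN/DF)."""
    return chi2 / df


def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA point estimate, using the N - 1 convention (an assumption)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


CHI2, DF, N = 162.671, 95, 237   # values reported in the text
print(round(cmin_df_ratio(CHI2, DF), 3))  # 1.712
print(round(rmsea(CHI2, DF, N), 3))       # 0.055
```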

Fig. 1 Confirmatory factor analysis of the CIA

Reliability

Internal consistency measured by Cronbach’s alpha coefficient was 0.96 for the total scale and 0.93, 0.92, and 0.91 for the personal, social, and cognitive subscales, respectively. McDonald’s omega values were 0.97 for the global CIA score, 0.93 for the personal subscale, 0.90 for the social subscale, and 0.90 for the cognitive subscale. Both Cronbach’s alpha and McDonald’s omega indicated excellent reliability of the CIA.

Item-total score correlations ranged from 0.66 to 0.81, and all items correlated positively with the scale’s global score.

The measurement error was 2.53 for the global score and 1.29, 1.25, and 1.46 for the personal, social, and cognitive scales, respectively.

Validity

Convergent validity

Table 3 shows the correlations between CIA global score and subscales, and the EDE-Q, OQ-45 and BDI for EDs’ patients.

Table 3 Correlations between CIA global score and subscales and the EDE-Q, OQ-45, and BDI in EDs’ patients (n = 237)

All correlations between the CIA global score and its subscales and the restraint, weight concern, shape concern, and eating concern subscales and the EDE-Q global score were significant. The CIA was also significantly correlated with measures of depression (BDI) and psychological distress (OQ-45). Correlations above 0.50 were considered strong. The CIA global score displayed especially high correlations with the EDE-Q total score (0.71) and the OQ-45 total score (0.70).

Discriminant validity

The ROC analysis showed an area under the curve (AUC) of 0.91 (SE = 0.02; p < 0.001; 95% CI 0.88–0.94), suggesting that the CIA has excellent discriminant power: a randomly selected patient has a 91% probability of scoring higher on the global score than a randomly selected participant from the comparison group. The results yielded a sensitivity of 0.86 and a specificity of 0.88 for a cut-off value of 15 for the global score.
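
The AUC can be read as the probability that a randomly chosen patient obtains a higher score than a randomly chosen comparison participant, with ties counted as one half. The following sketch computes the AUC directly from that definition; the scores are hypothetical, and the pairwise loop is meant for illustration rather than for large samples.

```python
from typing import Sequence


def auc_by_pairs(cases: Sequence[float], controls: Sequence[float]) -> float:
    """AUC as the share of case/control pairs in which the case scores higher."""
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5   # ties count as half a win
    return wins / (len(cases) * len(controls))


# Hypothetical global scores for a clinical and a comparison group.
patients = [10, 18, 22, 30, 35]
students = [2, 5, 8, 12, 16]
print(auc_by_pairs(patients, students))  # 0.92
```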

Table 4 shows the comparison of the CIA global score and subscales between the clinical and college samples.

Table 4 Comparison between eating disorders patients sample and college sample for CIA scores

ED patients scored significantly higher than the college sample on all CIA scales. All effect sizes were above 0.60, which can be considered a large effect size according to Cohen [24].
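
The effect sizes reported here follow the formula r = z / √N given in the statistical analysis section. A minimal sketch of that conversion and of Cohen’s benchmarks is shown below; taking the absolute value of z and labeling values under 0.1 as "negligible" are illustrative choices, and the z value is hypothetical.

```python
import math


def effect_size_r(z: float, n: int) -> float:
    """Effect size for a Mann-Whitney test: r = |z| / sqrt(N)."""
    return abs(z) / math.sqrt(n)


def cohen_label(r: float) -> str:
    """Interpret r with the benchmarks 0.1 = small, 0.3 = medium, 0.5 = large."""
    if r >= 0.5:
        return "large"
    if r >= 0.3:
        return "medium"
    if r >= 0.1:
        return "small"
    return "negligible"   # label added for illustration only


# Hypothetical z statistic for a comparison across all 433 participants.
r = effect_size_r(z=-13.0, n=433)
print(round(r, 2), cohen_label(r))  # 0.62 large
```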

In addition, the CIA cutoff calculated in the present study was used to compare patients regarding the EDE-Q total scores, OQ-45 total scores, and BDI scores to test for known-groups validity. Results showed that patients scoring above the CIA global score cutoff scored significantly higher (U = 300.00; p < 0.001) in the EDE-Q total score (M = 3.89; SD = 1.21) than those scoring below the proposed cut-off score (M = 1.10; SD = 0.91). The OQ-45 total scores were also significantly higher for participants scoring above the CIA cutoff (U = 420.00; p < 0.001; M = 87.5; SD = 21.82) than for participants below the cut-off value (M = 43.07; SD = 22.63). BDI mean scores were also significantly higher for participants scoring above the cutoff (M = 24.23; SD = 10.12) than for participants below the cut-off score (M = 9.19; SD = 6.00; U = 415.00; p < 0.001). Underweight patients scored significantly lower than patients with BMI above 17.5 on the personal scale of the CIA (U = 5076; p = 0.003), but did not differ significantly in the cognitive, social, and global scores of the CIA.

To evaluate criterion validity, participants with and without specific ED symptoms were compared on CIA scores. Results are presented in Table 5.

Table 5 Comparison of CIA scores for the presence of different ED dysfunctional behaviors in EDs’ patients (N = 237)

Results show that in the presence of different dysfunctional ED behaviors (OBE, self-induced vomiting, laxative use, exercise, and multiple compensatory methods), CIA scores are significantly higher in the global score and in the personal, social, and cognitive impairment scales. The only exception was self-induced vomiting, which did not seem to affect the social impairment score (U = 5346; p = 0.82). Effect sizes ranged from 0.12 to 0.36. A medium effect was found for personal impairment in OBE (0.36) and for the presence of multiple compensatory behaviors regarding the CIA cognitive impairment scale (0.32). The presence of both ED behaviors (OBE and multiple compensatory behaviors) showed medium effect sizes for the global CIA score.

Clinically significant change

The estimated cut-off score for clinically significant change was 14 for the total CIA, 8 for personal impairment, 3 for cognitive impairment, and 4 for social impairment. When a patient’s score falls below the cut-off, it is concluded that his or her functioning at that time is more similar to that of non-patients than to that of patients. The RCI was 5 points for the global score. Individuals who change in a positive or negative direction by at least 5 points in the CIA global score are considered as having made a reliable change. For personal, cognitive, and social impairment, the RCI is 3, 2, and 2, respectively.
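
To show how these two values could be applied to an individual patient, the sketch below classifies a pair of pre- and post-treatment global scores using the cutoff of 14 and the RCI of 5 reported above. Only the "recovered" category is defined in this paper; the other labels ("improved", "unchanged", "deteriorated") follow common usage of the Jacobson and Truax framework and are assumptions, as are the function name and the example scores.

```python
CIA_GLOBAL_CUTOFF = 14   # clinically significant change cutoff (global score)
CIA_GLOBAL_RCI = 5       # reliable change index (global score)


def classify_change(pre: int, post: int,
                    cutoff: int = CIA_GLOBAL_CUTOFF,
                    rci: int = CIA_GLOBAL_RCI) -> str:
    """Classify change in the CIA global score between two assessments."""
    improvement = pre - post          # positive values mean less impairment
    if improvement >= rci:
        # Reliable positive change; "recovered" also requires a final
        # score below the cutoff.
        return "recovered" if post < cutoff else "improved"
    if improvement <= -rci:
        return "deteriorated"
    return "unchanged"


# Hypothetical patients.
print(classify_change(pre=30, post=10))  # recovered
print(classify_change(pre=30, post=20))  # improved
print(classify_change(pre=20, post=18))  # unchanged
print(classify_change(pre=10, post=20))  # deteriorated
```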

Discussion

The main aims of the current study were to evaluate the psychometric properties of the Portuguese version of the CIA and to test the original factor structure.

In the present study, the 3-factor structure proposed by Bohn and colleagues [6] was replicated, with results supporting the utility of the global score as well as the cognitive, social, and personal dimensions, and the model tested showed a good fit. These results replicate the factor structure found with Spanish [9] and Italian [10] clinical samples.

The questionnaire showed excellent internal consistency values, in line with prior studies with both clinical and community samples [7,8,9,10,11, 13,14,15].

Good convergent validity was found, with CIA scores being highly correlated with the EDE-Q total score and EDE-Q subscales: participants with higher ED psychopathology experienced greater impairment secondary to the ED, as previously reported by other studies using clinical samples [8,9,10].

The CIA further showed good discriminant validity: participants with an ED diagnosis scored significantly higher than the non-clinical sample on the CIA global score and its dimensions. In addition, ROC analysis showed excellent accuracy of the CIA global score for determining ED case status. The cutoff of 15 obtained for the CIA in the present study provided the best trade-off between sensitivity and specificity. Nevertheless, the value obtained for the Portuguese sample was lower than the value of 16 proposed by the original study [3] and replicated in subsequent studies [10, 12]. However, these studies included both outpatients and inpatients, which may account for a more severe presentation of ED features affecting CIA scores and results. In contrast, the present study sample was composed of outpatients with different ED diagnoses, including other specified feeding or eating disorder (OSFED), which may have influenced the mean global scores and the cutoff of the CIA. In addition, a significant number of participants were anorexia nervosa (AN) outpatients, and lower CIA scores on the personal scale were found for underweight patients. This may have accounted for the lower mean scores in the clinical sample and can be explained by the tendency of these patients to deny the gravity of their condition, consistently underreporting ED behaviors and dysfunctional attitudes in self-report measures.

In addition, this study provided support for known-groups validity, with higher impairment in the presence of different ED behaviors, except for OBE impacting social impairment, which is in line with the study of Calugi and colleagues [10] and can be explained by the secrecy of this behavior, with OBE mostly occurring when patients are alone. Medium effect sizes were found for the presence of OBE and multiple purging methods in global CIA scores. OBE may contribute to elevated impairment scores through its relation with fear of weight gain and, consequently, with increased dysfunctional concerns about weight, shape, and food. On the other hand, multiple purging methods have been associated with greater illness severity [28, 29] and contribute to increased impairment [30]. Future studies should examine the contribution of different ED behaviors and attitudes to impairment secondary to EDs.

The present study allowed the calculation of the RCI and a clinically significant cut-off score. This calculation is useful to inform clinical decisions regarding treatment, namely, to decide when a patient has returned to the functional group and made a reliable change in impairment. This may be very important in judging effective recovery and improvement from an ED and in deciding whether additional intervention directed at impairment secondary to the ED is needed.

This study has some limitations. The present study only assessed women, and findings should be generalized to men with caution. Test–retest data were not collected; such data would allow the calculation of the minimal clinical difference (MCD), contributing to a more robust study of the reliability of the questionnaire. In addition, repeated measures across treatment should be obtained to inform about sensitivity to changes in ED impairment and to address responsiveness. Axis I and II comorbidities were not evaluated, although they are recognized as possibly contributing to more severe presentations of EDs [31] and may consequently have affected impairment. Although the CIA is designed to evaluate impairment secondary to EDs, future studies should consider screening for axis I and II diagnoses and their influence on impairment in EDs.

The literature is scarce regarding the use of the CIA in routine clinical practice and how impairment is reduced after treatment and at follow-up. Reducing clinical impairment should be an essential therapeutic goal, since impairment is often a motivation to seek treatment for an ED [32] and can be used as a motivation to change. Routinely assessing impairment in EDs can be a fundamental aspect in highlighting dissonance regarding the denial of the severity of eating symptoms and the need for therapeutic help and adherence to treatment goals. Future studies should focus on predictors of clinical impairment to develop treatment strategies that can effectively reduce impairment.

In summary, the current study provides evidence for the validity and reliability of the Portuguese version of the CIA in a clinical sample of ED patients. In addition, it supports the use of the CIA as a clinically relevant and useful tool to assess and inform ED treatment.