Introduction

Orthorexia nervosa (ON), which is described as a pathological obsession with eating healthy food [1], has received increasing attention over the past 10 years [2, 3]. Although ON is not included in the most recent Diagnostic and Statistical Manual of Mental Disorders (DSM–5), it has been proposed as a potential or emerging eating disorder [4]. Dunn and Bratman [2] have proposed two main diagnostic criteria for ON: the obsession towards healthy foods, and the clinical symptoms resulting from the disorder.

In previous studies, the reported prevalence of ON has ranged from less than 1% to over 80% [5]. It has been suggested that this large inconsistency is mainly due to differences in the instruments used rather than to cultural differences [5]. Individuals with ON tend to have psychological and social problems. For example, individuals with ON may experience intense frustration when they transgress or fail to comply with their nutrition rules [6], and they are also at risk of social isolation due to their tendency to fully control the entire process of food preparation [7, 8]. Thus, the fixation on healthy and proper nutrition is believed to damage the well-being of individuals with ON [9].

Currently, the majority of empirical research on ON has been conducted in European countries, such as Hungary, Italy, Germany, Spain, and Turkey [3, 10,11,12,13]. In North America and Australia, the number of studies on ON is much smaller, but it has been increasing [2, 5, 14]. Our literature search revealed no studies on ON conducted in Asia, where research on eating disorders has traditionally been less active than in Western countries. However, recent evidence has revealed that eating disorders (EDs) are a global issue, and the prevalence of EDs in Asia has been increasing [15]. In China, the prevalence of EDs has become increasingly similar to that of Western countries [16]. To the best of our knowledge, no studies have examined ON in China. Given that culture has been suggested to play an important role in the development of EDs [17], and that more research in Asia will contribute to a better understanding of the development and expression of EDs globally [15], there is a clear need to study ON in China to gain a deeper understanding of this possible emerging eating disorder.

To date, several measures have been available for assessing and screening ON, such as Bratman’s Orthorexia Test (BOT) [18], the ORTO-15 [19], the Orthorexia Screen [20], the Eating Habits Questionnaire (EHQ) [21], and the newly developed Düsseldorf Orthorexia Scale (DOS) [22]. Among these measures, the BOT has not been recommended for assessing ON due to some methodological problems [23]. The ORTO-15 is the most widely used measure for screening ON [24], but it might be problematic in distinguishing between healthy eating and pathologically healthful eating [5, 25, 26]. For the EHQ, Koven and Abry [6] argued that the instrument might not measure the compulsive character of ON. Thus, the DOS may be the best tool currently available for assessing ON [3]. The German and English versions of the DOS have been shown to have good psychometric qualities (e.g., validity, reliability) for assessing ON [22, 27].

Overall, given that there is no instrument available in Chinese for evaluating ON, and that the DOS appears to be a psychometrically sound measure for assessing and screening ON, the DOS was chosen for this study. The main aim of the present study was to translate the DOS into Chinese and to examine its psychometric properties in the Chinese cultural context. Furthermore, based on the translated Chinese version of the DOS, the prevalence of ON and its demographic correlates were also explored.

Methods

Participants and procedures

Using a convenience sampling method, participants in the current study were recruited from two universities in mainland China, one in Liaoning province (in northern China) and the other in Zhejiang province (in eastern China). The research protocol was reviewed and approved by the research ethics office of Hunan University. G*Power version 3.1 [28] was used to determine the minimum sample size required for the current study. The result indicated that 782 participants were needed for correlation analyses among the study variables to detect a small effect (r = 0.10) with a power of 0.80 at a 0.05 level of significance in a two-tailed test. Initially, around 1400 undergraduates were invited to participate in the paper–pencil survey, and 1075 agreed and completed it. All participants read and agreed to the informed consent form, which was on the first page of the survey, and then completed the questionnaires used in the current study.
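As a rough cross-check of the sample size calculation described above, the following sketch uses the Fisher z approximation for a two-tailed test of a correlation. This is not the exact routine used by G*Power, so the result may differ from the reported 782 by a participant or two; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate sample size to detect a correlation r (two-tailed test).

    Uses the Fisher z approximation, which is an assumption of this sketch
    and not necessarily the exact method implemented in G*Power.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    return math.ceil(((z_alpha + z_beta) / math.atanh(r)) ** 2 + 3)


print(n_for_correlation(0.10))
```

The approximation lands very close to the figure reported in the text, which suggests the stated inputs (r = 0.10, power = 0.80, two-tailed alpha = 0.05) are the ones that drive the requirement.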

Of the final sample of 1075 undergraduates, 567 were female (52.7%) and 508 were male (47.3%). Their ages ranged from 17 to 24 years, with a mean of 20.11 (SD 1.01) and a median of 20. Moreover, 201 (18.7%) were freshmen, 830 (77.2%) were sophomores, 36 (3.5%) were juniors or seniors, and 7 did not report their study year. Their body mass index (BMI) ranged from 14.30 to 36.89 kg/m2, with a mean of 21.11 (SD 3.10) kg/m2. Based on the BMI cutoff values for Chinese adults [29] (i.e., < 18.5 kg/m2, 18.5–23.9 kg/m2, 24–27.9 kg/m2, and ≥ 28 kg/m2 for underweight, normal weight, overweight, and obese, respectively), 210 (19.8%) were underweight, 681 (63.3%) were normal weight, 136 (12.7%) were overweight, and 36 (3.3%) were obese. Twelve respondents did not report the height and/or weight needed to calculate BMI.
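The BMI categorization described above can be sketched as follows. The function names are hypothetical, and two choices are assumptions of this sketch: the categories are treated as half-open intervals (so that a value such as 23.95 is not left unclassified), and a BMI of exactly 28 is treated as obese.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def classify_bmi_china(value):
    """Classify a BMI value using the cutoffs for Chinese adults cited in the text.

    Boundaries are handled as half-open intervals, which is an assumption
    made here for illustration.
    """
    if value < 18.5:
        return "underweight"
    if value < 24.0:
        return "normal weight"
    if value < 28.0:
        return "overweight"
    return "obese"


print(classify_bmi_china(21.11))  # the sample mean falls in "normal weight"
```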

Measures

Düsseldorf Orthorexia Scale (DOS)

The ten-item Düsseldorf Orthorexia Scale is the Orthorexic Eating Behavior subscale of the longer 21-item DOS, which has three subscales in total, the other two being the Avoidance of Additives subscale and the Supply of Minerals subscale. This ten-item DOS is used as a unidimensional measure for assessing and screening ON [22], and the ten items are answered on a four-point Likert scale from “definitely does not apply to me” to “definitely applies to me”. There are no inverted items, and the maximum total score of the DOS is 40. According to the study on the development of the DOS [22], a total score greater than 30 (95th percentile) indicates the presence of ON, a total score between 25 and 29 indicates a risk of ON, and a total score less than 25 indicates the absence of ON.
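The scoring rule described above can be sketched as a small function. Coding the four response options as 1 to 4 is an assumption that is consistent with the stated maximum of 40. Because the text leaves a total of exactly 30 unassigned ("greater than 30" versus "between 25 and 29"), this sketch assigns 30 to the ON category; that choice is an assumption, and the function name is hypothetical.

```python
def score_dos(responses):
    """Return (total score, category) for ten DOS item responses coded 1-4."""
    if len(responses) != 10:
        raise ValueError("the ten-item DOS requires exactly ten responses")
    if any(r not in (1, 2, 3, 4) for r in responses):
        raise ValueError("each response must be coded 1, 2, 3, or 4")
    total = sum(responses)
    if total >= 30:      # assumption: a score of exactly 30 counts as ON
        category = "ON"
    elif total >= 25:    # 25-29: at risk of ON
        category = "at risk of ON"
    else:                # below 25: no ON
        category = "no ON"
    return total, category


print(score_dos([3, 3, 2, 3, 3, 2, 3, 2, 3, 2]))  # total of 26
```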

In the current study, the Chinese version of the ten-item DOS (C-DOS) was used; it was obtained by following standard translation procedures [30]. Specifically, two doctoral students majoring in psychology with a high level of English proficiency translated the English version of the ten-item DOS [27] into Chinese, and another doctoral student majoring in English education then back-translated the preliminary Chinese version into English. The back-translated version and the original English version were sent to two experts in the area of healthy eating behaviors for review. Based on their comments, we made minor revisions to the wording of the translation. The revised translation was administered to a sample of ten undergraduate students to evaluate the clarity of each item. Based on their responses, no further revisions were made, and the C-DOS was finalized and used in the subsequent validation procedures.

Inflexible Eating Questionnaire (IEQ)

The Inflexible Eating Questionnaire (IEQ) is a unidimensional measure for evaluating psychological inflexibility in eating (e.g., inflexible adherence to eating rules) [31]. The IEQ contains 11 items rated on a five-point Likert scale from “fully disagree” to “fully agree”, and there are no inverted items. The sum of the 11 items is the total score, and higher total scores indicate higher levels of inflexible adherence to eating rules [31]. The IEQ showed good reliability (Cronbach’s alpha = 0.90 and test–retest = 0.84) in its development study [31]. As no Chinese version of the IEQ was available, with the permission of the developer of the IEQ, our research team followed the same procedures used for the translation of the DOS to obtain a Chinese version of the IEQ. The ordinal alpha of the Chinese version of the IEQ was 0.87 in the current sample. The IEQ was used to examine the convergent validity of the C-DOS. We expected that the total score of the C-DOS would be highly and positively correlated with inflexible eating as measured by the IEQ.

Three-Factor Eating Questionnaire Revised 18

The Three-Factor Eating Questionnaire-R18 (TFEQ-R18) is a multidimensional measure for assessing disordered eating behaviors [32]. It contains 18 items scored on a four-point Likert scale from “totally false” to “totally true”, and there are no inverted items. The TFEQ-R18 comprises three subscales, namely, cognitive restraint, uncontrolled eating, and emotional eating. In the current study, the Chinese version of the TFEQ-R18 was used; it has demonstrated good psychometric properties in a Chinese sample [33]. In the current sample, the ordinal alpha was 0.84, 0.86, and 0.90 for the cognitive restraint, uncontrolled eating, and emotional eating subscales, respectively. Similar to the use of the Eating Disorder Inventory (EDI) [34] by Chard et al. [27] to examine the discriminant validity of the English version of the DOS, the TFEQ-R18 was used in this study as a measure of disordered eating behaviors (e.g., uncontrolled eating) to examine the discriminant validity of the C-DOS. We expected that the total score of the C-DOS would have weak correlations with the three subscales of the TFEQ-R18.

Data analysis

In the current study, the data analyses were carried out in R version 3.5.0 [35] with the psych [36], lavaan [37], and semTools [38] packages.

First, the total sample (n = 1075) was randomly split into two halves, with one half used for exploratory factor analysis (EFA; n = 537) and the other for confirmatory factor analysis (CFA; n = 538). EFA was conducted with the maximum likelihood (ML) estimation method and an Oblimin rotation. The scree plot based on parallel analysis [39] was used to determine the number of factors to be retained. Moreover, following the rule of thumb of Tabachnick and Fidell [40], 0.32 was used as the minimum loading of an item in EFA, and a cross-loading item was defined as one with loadings of 0.32 or higher on two or more factors.
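The random split described above can be sketched as follows. The analyses in the study were run in R; this Python version is only illustrative, and the seed value, the use of sequential participant IDs, and the function name are all assumptions.

```python
import random


def split_half(ids, seed=12345):
    """Randomly split participant IDs into two halves (EFA and CFA samples)."""
    rng = random.Random(seed)   # fixed seed (hypothetical) for reproducibility
    shuffled = list(ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2   # with an odd total, the second half is larger by one
    return shuffled[:half], shuffled[half:]


efa_ids, cfa_ids = split_half(range(1, 1076))  # 1075 hypothetical sequential IDs
print(len(efa_ids), len(cfa_ids))
```

With 1075 participants, integer division yields subsamples of 537 and 538, matching the sizes reported for the EFA and CFA.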

After the factor structure was derived from EFA, CFA was then conducted in the other half of the data to confirm the structure derived from EFA. In CFA, the robust estimator MLR was used [41]. For evaluating the model fit, the following fit indicators with respective recommended cutoff values [42] were reported: the root mean square error of approximation (RMSEA; less than 0.08 indicating good fit) with its 90% confidence interval, comparative fit index (CFI; greater than 0.90/0.95 indicating acceptable/good fit), Tucker–Lewis index (TLI; greater than 0.90/0.95 indicating acceptable/good fit), and standardized root mean square residual (SRMR; less than 0.05/0.08 indicating good/acceptable fit).

Reliability of the C-DOS was evaluated by calculating the ordinal alpha coefficient. Test–retest reliability was assessed via the intra-class correlation coefficient (ICC), which was calculated based on the data from a sub-sample of 101 participants who took the survey again after 4 weeks. For both the alpha coefficient and ICC, a value greater than 0.6 indicates acceptable reliability, while a value greater than 0.7 indicates good reliability [43].
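Ordinal alpha is commonly obtained by applying the usual alpha formula to a polychoric correlation matrix of the items rather than to the raw covariances. The sketch below shows only the second step, computing alpha from a given correlation matrix; estimating polychoric correlations is not shown, and the 3-item matrix and function name are hypothetical.

```python
def alpha_from_correlations(corr):
    """Alpha coefficient computed from a k x k inter-item correlation matrix.

    Supplying a polychoric correlation matrix gives ordinal alpha;
    supplying Pearson correlations gives standardized Cronbach's alpha.
    """
    k = len(corr)
    total = sum(sum(row) for row in corr)           # sum of all matrix entries
    diagonal = sum(corr[i][i] for i in range(k))    # sum of the item variances (k for correlations)
    return (k / (k - 1)) * (1 - diagonal / total)


# Hypothetical matrix: three items, all inter-item correlations equal to 0.5
example = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
print(alpha_from_correlations(example))  # 0.75
```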

Furthermore, measurement invariance across gender groups was examined in accordance with the guidelines provided by Van de Schoot et al. [44]. According to previous literature, ΔCFI < 0.010 and ΔRMSEA < 0.015 indicate measurement invariance across groups [45, 46]. In addition, if scalar invariance across gender groups was obtained, tests of latent mean differences across gender groups were further conducted.
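The decision rule above can be expressed as a simple comparison between a less constrained model and a more constrained one. In this sketch, ΔCFI is taken as the drop in CFI and ΔRMSEA as the rise in RMSEA when constraints are added; that sign convention is an assumption, and the fit values in the example are hypothetical rather than taken from Table 3.

```python
def invariance_supported(cfi_free, cfi_constrained, rmsea_free, rmsea_constrained):
    """Apply the delta-CFI < 0.010 and delta-RMSEA < 0.015 criteria."""
    delta_cfi = cfi_free - cfi_constrained        # deterioration in CFI
    delta_rmsea = rmsea_constrained - rmsea_free  # deterioration in RMSEA
    return delta_cfi < 0.010 and delta_rmsea < 0.015


# Hypothetical fit indices for a metric model versus a scalar model
print(invariance_supported(0.950, 0.946, 0.060, 0.062))  # True
```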

Results

Preliminary analysis

In the current sample, a small number of data values were missing on the ten items of the C-DOS, with missing rates ranging from 0 to 1.3% across items. Little’s MCAR test indicated that the data were missing completely at random, χ2 (80) = 85.22 (p = 0.32). Thus, we used the expectation maximization procedure to impute the missing values [47]. Using Mahalanobis distance, seven multivariate outliers (0.6%) were detected. Since these cases were determined to be reasonable and made up a very small proportion (< 1%) of the total sample, they were retained in the analyses [48].

A CFA with 1075 participants was first conducted for the one-factor model proposed in the study about the development of the DOS [22]. Results revealed that the one-factor model did not fit the current data well, with χ2 = 473.73 (df = 35, p < 0.01), RMSEA = 0.11 (90% CI 0.10–0.12), CFI = 0.80, TLI = 0.75, and SRMR = 0.07. Thus, in line with the validation study for the English version of the DOS [27], EFA was further conducted.
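The RMSEA reported above can be approximately recovered from the chi-square statistic, its degrees of freedom, and the sample size. The sketch below uses the standard point-estimate formula; models fitted with robust estimators apply scaling corrections, so this simple version is not guaranteed to reproduce every value reported in the study. The function name is hypothetical.

```python
import math


def rmsea(chi2, df, n):
    """Point estimate of RMSEA: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# One-factor model on the full sample: chi-square = 473.73, df = 35, n = 1075
print(round(rmsea(473.73, 35, 1075), 2))  # 0.11
```

The value agrees with the reported RMSEA of 0.11, which exceeds the 0.08 cutoff and is consistent with the conclusion that the one-factor model fits poorly.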

Exploratory factor analysis

The results of the parallel analysis (Appendix 1) clearly indicated a three-factor structure. The three retained factors explained 46% of the total variance. Based on the meanings of the items clustering on each factor, the three emergent factors of the C-DOS were labeled “Obsession in healthy food” (five items), “Adherence to strict nutrition rules” (three items), and “Emotional symptoms” (two items). The factor loadings and inter-factor correlations are presented in Table 1.

Table 1 Factor loadings of the exploratory factor analysis for the Chinese version of the Düsseldorf Orthorexia Scale (C-DOS) (n = 537)

Confirmatory factor analysis

Using the remaining sample (n = 538), we tested the fit of the three-factor model derived in the current study. Results revealed that the three-factor model had generally acceptable fit, with χ2 = 105.16 (df = 32, p < 0.01), RMSEA = 0.06 (90% CI 0.05–0.08), CFI = 0.93, TLI = 0.89, and SRMR = 0.05. As the TLI was slightly lower than the recommended cutoff value of 0.90, we checked the modification indices (MI) and found that the error covariance between Item 1 and Item 2 had the largest MI value (25.26). Given that Item 1 and Item 2 have similar meanings and loaded on the same factor (i.e., obsession in healthy food), we re-ran the model with the error covariance between Item 1 and Item 2 freely estimated. Results for the modified three-factor model revealed a good model fit, with χ2 = 82.89 (df = 31, p < 0.01), RMSEA = 0.06 (90% CI 0.04–0.07), CFI = 0.95, TLI = 0.92, and SRMR = 0.04. Moreover, to assess whether there is a global construct of “orthorexia nervosa”, we further tested a second-order model. Results showed that the second-order model also fitted the data well, with χ2 = 82.89 (df = 31, p < 0.01), RMSEA = 0.06 (90% CI 0.04–0.07), CFI = 0.95, TLI = 0.92, and SRMR = 0.04 (Table 2).
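The degrees of freedom reported for these models can be checked by counting observed moments and free parameters. The sketch below assumes a simple-structure CFA in which each item loads on exactly one factor and the factors are scaled by fixing their variances (a marker-item scaling gives the same count); the function name is hypothetical.

```python
def cfa_df(n_items, n_factors, n_error_covariances=0):
    """Degrees of freedom for a simple-structure CFA with correlated factors."""
    observed = n_items * (n_items + 1) // 2               # unique variances and covariances
    loadings = n_items                                    # one loading per item
    residual_variances = n_items                          # one residual variance per item
    factor_covariances = n_factors * (n_factors - 1) // 2
    free = loadings + residual_variances + factor_covariances + n_error_covariances
    return observed - free


print(cfa_df(10, 1))     # one-factor model: 35
print(cfa_df(10, 3))     # three-factor model: 32
print(cfa_df(10, 3, 1))  # three-factor model with one error covariance: 31
```

A second-order model with exactly three first-order factors replaces the three factor covariances with three second-order loadings, so it uses the same number of parameters. This is consistent with the identical fit statistics reported for the modified three-factor model and the second-order model.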

Table 2 Factor loadings and inter-factor correlations of the confirmatory factor analysis for the Chinese version of the Düsseldorf Orthorexia Scale (C-DOS) (n = 538)

Tests of gender invariance and latent mean differences

Based on the second-order three-factor model, gender invariance tests were conducted. As presented in Table 3, the C-DOS achieved scalar/strong measurement invariance across gender groups. Given the support for scalar invariance across gender groups, latent factor means were compared. For this purpose, latent mean values were set to zero for the female group and freely estimated for the male group. Results showed significant latent mean differences on the second-order factor (or total scale) (Z = 5.90, p < 0.01, d = 0.37) and on “Emotional symptoms” (Z = 2.94, p < 0.01, d = 0.24), with males endorsing higher scores. However, latent mean differences on “Obsession in healthy food” (Z = 1.64, p = 0.102, d = 0.17) and “Adherence to strict nutrition rules” (Z = 1.90, p = 0.06, d = 0.16) were not statistically significant.

Table 3 Fit indices of gender invariance tests for the Chinese version of the Düsseldorf Orthorexia Scale (C-DOS) (n = 1075)

Reliability

The ordinal alpha was 0.84 for the total scale, 0.77 for “Obsession in healthy food”, 0.75 for “Adherence to strict nutrition rules”, and 0.71 for “Emotional symptoms”. Moreover, test–retest reliability (ICC) was examined in a sub-sample of 101 participants who took the C-DOS again after 4 weeks. Results showed that the ICC was 0.77 for the total scale, 0.71 for “Obsession in healthy food”, 0.46 for “Adherence to strict nutrition rules”, and 0.50 for “Emotional symptoms”.

Construct validity

Table 4 presents the correlations between the C-DOS and other constructs. The C-DOS total score had a strong and statistically significant positive correlation with eating inflexibility (r = 0.59, p < 0.01). The C-DOS total score had a weak but statistically significant positive correlation with cognitive restraint (r = 0.06, p < 0.05), a weak but statistically significant negative correlation with uncontrolled eating (r = − 0.10, p < 0.01), and statistically non-significant correlations with emotional eating (r = 0.02, p > 0.05) and BMI (r = 0.04, p > 0.05).

Table 4 Correlations between the C-DOS and other variables

Prevalence of ON

Of the 1075 participants, 84 (7.8%) had total scores greater than 30, and 196 (18.2%) had total scores in the range of 25–29. Table 5 presents the associations between the demographic variables and the prevalence of ON. As shown in Table 5, the prevalence of ON differed significantly between gender groups but not across the other demographic variables. Specifically, males (ON = 10.6%, ON at risk = 22.4%) had a significantly higher prevalence of ON than females (ON = 5.3%, ON at risk = 14.5%), with χ2 = 25.56, df = 2, p < 0.01.
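The gender comparison above is a Pearson chi-square test on a 2 × 3 contingency table. The sketch below reconstructs the cell counts from the reported percentages and group sizes (508 males, 567 females); these counts are an assumption, since only Table 5 would give the exact figures. They do sum to the reported totals of 84 ON cases and 196 at-risk cases. The closed-form p-value used here is valid only for 2 degrees of freedom, and the function name is hypothetical.

```python
import math


def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Counts reconstructed from percentages (assumption): [ON, at risk, no ON]
males = [54, 114, 340]
females = [30, 82, 455]
chi2, df = pearson_chi2([males, females])
p_value = math.exp(-chi2 / 2)  # survival function of chi-square with df = 2 only
print(round(chi2, 2), df, p_value < 0.01)
```

Under these reconstructed counts, the statistic closely matches the reported χ2 = 25.56 with df = 2.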

Table 5 The occurrence of ON by demographic variables (n = 1075)

Discussion

The aim of the current study was to translate the DOS into Chinese and examine its psychometric properties for the assessment of ON in the Chinese cultural setting. Furthermore, the prevalence and demographic correlates of ON, as assessed by the C-DOS, among Chinese university students were also reported. The results indicated that the C-DOS is a psychometrically sound instrument for screening ON, and that ON was prevalent among Chinese university students.

In the current study, a three-factor structure was revealed for the C-DOS. The meanings of the three factors were in line with the diagnostic criteria recently proposed for ON [2]. More specifically, the first two factors (i.e., obsession in healthy food, and adherence to strict nutrition rules) were related to the first diagnostic criterion of ON (i.e., obsession towards healthy foods), while the third factor (i.e., emotional symptoms) was related to the second diagnostic criterion of ON (i.e., clinical symptoms resulting from the disorder). Furthermore, the CFA of a model composed of one second-order factor and three first-order factors also showed good fit, indicating that the use of the total score of the C-DOS is appropriate.

The high correlation between the total score of the C-DOS and inflexible eating measured by the IEQ suggests good convergent validity of the C-DOS. Inflexible eating involves inflexible adherence to eating rules; it gives a person a sense of control when such rules are met, and a sense of distress when they are not [31]. Such characteristics are core features of ON, as indicated in the proposed diagnostic criteria of ON [2]. Furthermore, the total score of the C-DOS had low correlations with the eating behaviors measured by the TFEQ-R18 [33], which can lead to eating disturbances. These findings indicate good discriminant validity of the C-DOS, similar to the finding of Chard et al. [27] that the construct of ON measured by the DOS is different from the risk of eating disturbances as measured by the EDI [34].

The reliability results revealed that the total scale and the three subscales of the C-DOS had good internal consistency. The total scale and the subscale of “Obsession in healthy food” also had good test–retest reliability, while the other two subscales had lower test–retest reliability (ICC = 0.46 and 0.50), below the 0.6 threshold adopted for acceptable reliability. Thus, given that the DOS has been recommended for use as a whole scale [27], we make the same recommendation for the use of the C-DOS in Chinese populations, especially in longitudinal studies.

The present findings also revealed that the association between the C-DOS scores and BMI was very low and statistically non-significant. This very weak relationship is generally in line with what has been reported in most previous research [12, 49,50,51,52]. One possible explanation is that healthy eating (e.g., ON behaviors) does not depend on an individual’s weight, but mainly on the individual’s knowledge of healthy eating [51]; as a result, it could be that the more health knowledge an individual has, the more likely the individual is to show ON behaviors [53].

Using the C-DOS, our results revealed that 7.8% of the Chinese university students were classified as having ON and 18.2% were at risk of developing ON. This is higher than the prevalence found in Germany with the DOS, where 3.3% of German university students had ON and 9.0% were at risk of developing ON [3]. On the other hand, our findings on the prevalence of ON among Chinese students are more comparable with those found among US university students with the English version of the DOS: 8.0% of US university students had ON and 12.4% were at risk of developing ON [27]. Note that the prevalence of at-risk ON in our sample was higher than in the US sample (18.2% vs. 12.4%; Pearson χ2 = 6.68, p < 0.01). Thus, it appears that ON might be more prevalent in China than in Germany, and that at-risk ON might be more prevalent in China than in both Germany and the US. As indicated in previous research, cultural differences could play a role in these differences [27]. From the perspective of culture, China has 1000 years of history of health preservation practices, called yangsheng in Chinese, in which healthy eating is one of the most important components [54, 55]. It is possible that people in China are more likely to pay attention to healthy eating due to such cultural influence, and are therefore more likely to develop ON behaviors. However, this should be further explored in future studies.

Previous research on gender differences in ON has been inconsistent, with some studies finding no gender differences [56,57,58,59], some reporting greater ON behaviors in females than in males [60, 61], and others showing greater ON behaviors in males than in females [8, 62]. Such inconsistency is also reflected in the findings of the current work, with gender differences on some, but not all, dimensions of ON. This issue warrants further attention in future studies.

Strengths and limitations

The current study has several strengths. First, the C-DOS is the first measure with good psychometric quality (e.g., reliability, validity) for screening ON behaviors in China, and it may be applied in future studies concerning ON in China and possible cultural differences in ON. Such cross-cultural research on ON could contribute to a better understanding of this potential or emerging eating disorder. Second, the current study involved a large sample of university students, on the basis of which the prevalence of ON was reported for the first time in China, a non-Western society.

However, one main limitation of the current study concerns its generalizability. As the psychometric properties of the C-DOS and the prevalence of ON were based on Chinese university students, caution is warranted in generalizing the findings to other Chinese populations (e.g., general population, clinical population, children and adolescents); future studies involving more heterogeneous Chinese populations are highly recommended. Furthermore, although the current study revealed a three-factor model for the C-DOS, this structure should be further tested in other samples and contexts.

Conclusion

In conclusion, the findings of the current study suggest that the Chinese version of the DOS (C-DOS) is a psychometrically adequate instrument for assessing ON in a sample of Chinese students. Furthermore, using the C-DOS, we found that 7.8% of the investigated university students were classified as having ON, and a significant gender difference was also found, with males displaying higher rates of ON than females (10.6% vs. 5.3%). Given the high prevalence of ON found in the current study, more attention to ON, as well as further research and potential interventions, is warranted in China.