Introduction

Dysmorphic concerns are defined as excessive preoccupations with a slight or perceived defect in physical appearance [1]. They are a symptom of body dysmorphic disorder (BDD) and may indicate BDD proneness or even a subclinical or clinical diagnosis [2, 3]. Individuals with dysmorphic concerns often seek help through cosmetic treatment or aesthetic surgery [4]. In these settings, prevalence rates of dysmorphic concerns are relatively high, ranging from 6.9 to 17.2% [5,6,7,8,9,10]. However, dysmorphic concerns are associated with greater dissatisfaction after aesthetic surgery [11, 12]. Thus, these patients have to be identified before medical treatment.

As a screening instrument for dysmorphic concerns, the Dysmorphic Concern Questionnaire (DCQ) by Oosthuizen et al. [1] is frequently used. It consists of seven items that include statements on appearance-related concerns and behaviors. The response format is based on the General Health Questionnaire [13] estimating one’s own concern in comparison with most people [1]. Studies confirmed a good reliability with an internal consistency in Cronbach’s α varying from 0.80 to 0.88 [1, 2, 5, 14,15,16]. The one-factor structure of the questionnaire was supported in several studies [1, 5, 14], and results on convergent and divergent validity were good [5, 14, 17]. The DCQ can be interpreted according to its sum score. Higher DCQ values correspond to higher appearance concerns, but for case identification there are also cutoff values available [5, 17].

In previous studies the DCQ has been applied in non-clinical samples of the general population [3, 15, 18, 19] and in psychiatric inpatients [1, 14]. Regarding non-psychiatric medical settings, the DCQ has been frequently used in plastic surgery patients [8,9,10, 20, 21], in dermatological outpatients [5, 7] and in people seeking cosmetic enhancement [22].

The DCQ is a concise and economical instrument that is considered for clinical use as well as for epidemiological studies in large samples [16, 17]. However, the validation of the DCQ was performed only in clinical settings [1, 5, 14] and non-representative samples of the general population, including mainly students [2, 17]. The DCQ has not been validated in a large representative sample of the general population yet, and a normative data set is still lacking.

Thus, the primary aim of the study was to validate the DCQ in a general population sample. Secondly, we aimed to obtain a normative data set based on the representative sample and to explore the prevalence rate of cases with significant dysmorphic concerns using different previously defined cutoff values.

Methods

Design and Participants

The present study had a cross-sectional design investigating psychometric properties of the DCQ in a random general population sample of Germany. It was part of a large survey under the topic “Healthiness in Germany” initiated by the University of Erlangen-Nuernberg. An independent demographic consulting agency (USUMA, Berlin, Germany) conducted data collection in 2011 to recruit a representative sample of the German general population. A total of N = 4212 individuals between 18 and 65 years were contacted, of whom n = 2286 (56%) completed the questionnaires online or in written form. Prior surveys of the general population revealed similar participation rates [23, 24]. The selection process was based on a multistep procedure choosing random samples of households as well as identifying randomly a target subject of each household with the Kish selection grid. The selected sample represents the general German population in relation to age, gender and living area (www.destatis.de). Further details on the procedure were published previously by Schieber et al. [25]. All participants gave written informed consent. The survey was conducted in accordance with the Helsinki Declaration and met the ethical guidelines of the International Code of Marketing and Social Research Practice, by the International Chamber of Commerce and the European Society for Opinion and Marketing Research.

According to the primary aim of the present analysis, we excluded cases with missing values in the DCQ resulting in a final sample of n = 2053 individuals. The mean age of the sample was 45.5 years; 54% of the participants were female. For further sociodemographic characteristics see Table 1.

Table 1 Characteristics of the general population samples comparing female and male participants

Instruments

The participants completed the survey assessing sociodemographic data as well as the following questionnaires:

DCQ

As described above, the DCQ [1] aims to assess over-concern about physical appearance. Its seven items are rated on a four-point scale from 0 to 3 (see Table 2). The sum score ranges from 0 to 21, with higher values indicating higher dysmorphic concerns. Previous exploratory factor analyses revealed a one-factor model as optimal, accounting for 43–58% of the variance [1, 5, 14]. Stangier et al. [5] established a cutoff value of ≥ 11 to identify individuals with significant concerns about bodily appearance in a sample of female dermatological outpatients. To discriminate between individuals with BDD, individuals with disfiguring disorders and individuals with non-disfiguring conditions, the authors suggest a cutoff of ≥ 14. In a sample of undergraduates, a cutoff of 9 showed the best balance of sensitivity and specificity [17]. Monzani et al. [15] proposed a cutoff of 17 to distinguish between individuals with BDD and individuals with eating disorders, referring to unpublished data of their group. Furthermore, in a large general population sample of twins, Monzani et al. [15] suggested a classification of the DCQ sum score that differentiates the severity of dysmorphic concerns into four categories. The four classes represent a three-threshold solution: no symptoms (DCQ sum score = 0), minimal concerns (DCQ sum scores 1–5), moderate concerns (DCQ sum scores 6–10) and clinically significant concerns (DCQ sum scores ≥ 11).
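The scoring rule and the four severity categories described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the input validation is an assumption that is not part of the original instrument.

```python
from typing import Sequence


def dcq_sum_score(items: Sequence[int]) -> int:
    """Sum the seven DCQ item ratings (each rated 0 to 3) into a 0-21 score."""
    if len(items) != 7:
        raise ValueError("the DCQ has exactly seven items")
    if any(not 0 <= rating <= 3 for rating in items):
        raise ValueError("each item must be rated from 0 to 3")
    return sum(items)


def dcq_severity(score: int) -> str:
    """Map a DCQ sum score to the four categories of Monzani et al. [15]."""
    if not 0 <= score <= 21:
        raise ValueError("the DCQ sum score ranges from 0 to 21")
    if score == 0:
        return "no symptoms"
    if score <= 5:
        return "minimal concerns"
    if score <= 10:
        return "moderate concerns"
    return "clinically significant concerns"


# Hypothetical response pattern, for illustration only.
example_score = dcq_sum_score([1, 0, 2, 0, 0, 1, 1])
print(example_score, dcq_severity(example_score))
```

The same sum score can additionally be compared with the alternative cutoffs of 9, 14 and 17 mentioned above, depending on the purpose of the screening.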

Table 2 Distribution characteristics and item-intercorrelation matrix of the DCQ items

Appearance Concerns

Appearance concerns were assessed by using a defects and flaws rating list containing 13 body parts [25]. We computed the total number of subjective flaws marked by the participant. Furthermore, participants had to rate their subjective impairment due to their perceived appearance concerns on a scale ranging from 0 to 10. A higher score indicated a higher subjective impairment.

Depression

The Patient Health Questionnaire PHQ-9 [26, 27] is a widely used instrument to assess symptoms of major depressive disorder during the last 2 weeks. The sum score based on nine items indicates the severity of depression (range 0–27). Cutoff values represent the diagnostic status of depression: ≥ 5 mild, ≥ 10 moderate, ≥ 15 moderately severe, ≥ 20 severe.
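As a brief illustration, the PHQ-9 cutoffs listed above can be expressed as a lookup. The function name is hypothetical, and the label for scores below 5 is an assumption, since the text names only the categories from mild upwards.

```python
def phq9_severity(score: int) -> str:
    """Classify a PHQ-9 sum score (0-27) using the cutoffs 5, 10, 15 and 20."""
    if not 0 <= score <= 27:
        raise ValueError("the PHQ-9 sum score ranges from 0 to 27")
    # Check thresholds from the highest to the lowest.
    thresholds = [
        (20, "severe"),
        (15, "moderately severe"),
        (10, "moderate"),
        (5, "mild"),
    ]
    for cutoff, label in thresholds:
        if score >= cutoff:
            return label
    # Label chosen for this sketch; the text does not name this range.
    return "below mild threshold"


print(phq9_severity(12))
```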

Statistical Analysis

For data analysis, the software IBM SPSS statistics 21 was used. Differences in demographic features were assessed by using t tests and χ2 tests. The analysis of psychometric properties included descriptive characteristics of the items, construct validity, divergent/convergent validity and internal consistency. Cronbach’s α was used to assess the reliability of the DCQ. The examination of construct validity included an exploratory factor analysis, which was run on the seven DCQ items with orthogonal rotation. Sampling adequacy was assessed by the Kaiser–Meyer–Olkin measure. The eigenvalue of the factor was also computed. Furthermore, a confirmatory factor analysis was performed using the program LISREL 8.71. In addition to the respective χ2 statistics, a range of fit indices was utilized. Model fit was examined by the Comparative Fit Index (CFI), Non-normed Fit Index (NNFI), root mean square error of approximation (RMSEA) and standardized root mean square residual (SRMR). According to Kline [28], the model was regarded as acceptable if CFI and NNFI were ≥ 0.90, RMSEA was ≤ 0.08 and SRMR was < 0.10. Convergent validity of the DCQ was examined by using Pearson’s correlations. For divergent validity, four DCQ severity groups were compared by using Welch’s F with Games–Howell post hoc tests. The Welch test is similar to the ANOVA, but is robust to heterogeneity of variances and unequal sample sizes. Effect sizes were computed with Cohen’s d and interpreted according to general guidelines: small (0.2), medium (0.5) and large (0.8). For the normative data set, raw scores, percentile ranks and T values are reported. Percentile ranks were computed because the data were not normally distributed. T values were computed by converting raw scores into z-scores according to the mean and SD of the respective age class. Then, z-scores were converted into T values with a range of 0–100 (M = 50, SD = 10).
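The conversion of raw scores into T values and percentile ranks can be sketched as follows. The function names are hypothetical. The percentile rank is defined here as the percentage of scores at or below the raw score, which is an assumption because the text does not state which definition was used. The whole-sample mean and SD (M = 3.5, SD = 3.1, as reported in the Results) serve only as an illustration; the normative tables use the mean and SD of the respective subgroup.

```python
from typing import Sequence


def t_value(raw: float, group_mean: float, group_sd: float) -> float:
    """Convert a raw score to a T value (M = 50, SD = 10) via the z-score."""
    if group_sd <= 0:
        raise ValueError("the standard deviation must be positive")
    z = (raw - group_mean) / group_sd
    return 50 + 10 * z


def percentile_rank(raw: float, scores: Sequence[float]) -> float:
    """Percentage of reference scores at or below the raw score (assumed definition)."""
    if len(scores) == 0:
        raise ValueError("the reference sample must not be empty")
    at_or_below = sum(1 for s in scores if s <= raw)
    return 100 * at_or_below / len(scores)


# Illustration with the whole-sample mean and SD of the DCQ sum score.
print(round(t_value(11, 3.5, 3.1), 1))

# Hypothetical reference scores, for illustration only.
print(percentile_rank(2, [0, 1, 2, 3]))
```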

Results

DCQ Item Characteristics

Table 2 presents means, standard deviations, skewness, kurtosis and intercorrelations of the DCQ items. Every item showed the full range of scores (0–3). The highest mean item score (M = 1.0, SD = 0.6) was found for the first item, whereas the fourth and the fifth item displayed the lowest item scores with a mean value of 0.2. Accordingly, both items displayed higher skewness and higher positive kurtosis than the remaining items, indicating that most responses were concentrated at low values, with a longer right tail than in the normal distribution. Items 1–3, 6 and 7 showed positive kurtosis indices ranging from 0.5 to 2.4, indicating a flatter distribution of values than that of items 4 and 5. Their indices of skewness ranged from 0.8 to 1.6, indicating a right-skewed distribution. The DCQ sum score ranged from 0 to 21; the present sample showed a mean sum score of M = 3.5 (SD = 3.1).

Reliability

All DCQ items were significantly associated with each other (p < 0.001). Correlations ranged from r = 0.3 to r = 0.7, which suggests that the items relate to the same construct (see Table 2). Furthermore, internal consistency was examined using Cronbach’s alpha. With a value of α = 0.81 in the whole sample, the internal consistency was good.
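For readers unfamiliar with the coefficient, a minimal sketch of Cronbach’s alpha follows. It uses the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the sum score). The function name and the small data set are hypothetical; the data are not taken from the study.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix of ratings."""
    if len(rows) < 2:
        raise ValueError("at least two respondents are required")
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the sum score across respondents.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("the sum score has zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings of four respondents on two items.
toy_data = [[0, 1], [1, 1], [2, 3], [3, 2]]
print(round(cronbach_alpha(toy_data), 2))
```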

Factorial Construct Validity

We first conducted an exploratory factor analysis on the seven DCQ items. Sampling adequacy was confirmed by the Kaiser–Meyer–Olkin measure (KMO = 0.84). The results suggested a one-factor solution with an eigenvalue of 3.32, which accounted for 47.4% of the variance. Thus, the one-factor model as suggested by Jorgensen et al. [14] was confirmed in our sample. Item loadings on the factor are shown in Table 2. Items six and seven had the highest loadings (0.82 and 0.78), and the third item had the lowest loading (0.57). The one-factor structure of the DCQ was verified by a confirmatory factor analysis. The χ2 test was significant (χ2 = 357.0, df = 14, p < 0.001). Three of the four fit indices supported a good to acceptable model fit (NNFI = 0.92, CFI = 0.95 and SRMR = 0.05), whereas RMSEA = 0.11 exceeded the threshold of 0.08. Furthermore, all standardized path coefficients (ranging from 0.46 to 0.82) were significant (p < 0.001).
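Two of the figures above can be checked with simple arithmetic. The sketch below assumes that the analysis was based on the correlation matrix of the seven items, so that the total variance equals the number of items; under this assumption the eigenvalue of 3.32 corresponds to the reported 47.4%. It also compares the reported fit indices with the acceptance thresholds given in the Statistical Analysis section. The function names are hypothetical.

```python
def variance_explained(eigenvalue: float, n_items: int) -> float:
    """Proportion of variance explained by a factor of standardized items."""
    if n_items <= 0:
        raise ValueError("the number of items must be positive")
    return eigenvalue / n_items


def check_fit(cfi: float, nnfi: float, rmsea: float, srmr: float) -> dict:
    """Compare fit indices with the thresholds attributed to Kline [28]."""
    return {
        "CFI": cfi >= 0.90,
        "NNFI": nnfi >= 0.90,
        "RMSEA": rmsea <= 0.08,
        "SRMR": srmr < 0.10,
    }


# Eigenvalue and fit indices as reported in the text.
print(round(100 * variance_explained(3.32, 7), 1))
print(check_fit(cfi=0.95, nnfi=0.92, rmsea=0.11, srmr=0.05))
```

With the reported values, only the RMSEA criterion is not met, which matches the statement that three of the four indices supported the model.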

Convergent Validity

Convergent validity was supported by positive correlations between DCQ score and subjective impairment as well as DCQ score and number of flaws. The higher the DCQ score the higher was the self-reported subjective impairment due to appearance concerns (r = 0.54; p < 0.001). A high DCQ score was also associated with a higher number of reported body parts of concern in the flaw list (r = 0.45; p < 0.001). Furthermore, the DCQ score was negatively correlated with age (r = − 0.19, p < 0.001), whereas BMI showed no significant association with the DCQ score (r = 0.01, p = 0.779).

We created four groups according to the severity categories of dysmorphic concerns [15]: individuals with no symptoms (DCQ sum score = 0), minimal concerns (DCQ sum scores 1–5), moderate concerns (DCQ sum scores 6–10) and clinically significant concerns (DCQ sum scores ≥ 11). Mean numbers of reported flaws, subjective impairment due to appearance concerns and the degree of depression in the four groups are presented in Table 3. The group main effects were highly significant for each variable. Post hoc tests revealed that individuals with minimal dysmorphic concerns reported significantly more flaws than individuals with no dysmorphic concerns (p < 0.001; Cohen’s d = 0.6). Individuals with moderate concerns and individuals with clinically significant concerns did not differ in the number of reported flaws (p = 0.150; Cohen’s d = 0.2), but both groups reported significantly more flaws than individuals with no symptoms or minimal concerns (p’s < 0.001; Cohen’s d > 0.6). Regarding subjective impairment, post hoc tests revealed significant differences between all groups (p < 0.001). Effect sizes for all between-group differences were Cohen’s d > 0.3. Levels of depression also differed significantly between the DCQ severity groups in post hoc tests (p’s < 0.001; Cohen’s d > 0.3). Thus, psychological impairment increased with higher levels of severity of dysmorphic concerns.
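The effect sizes reported above can be illustrated with a short sketch of Cohen’s d. The pooled standard deviation is used here as an assumption, because the text does not state which variant of d was computed. The function names, the group values in the example and the label for effects below 0.2 are hypothetical.

```python
from math import sqrt


def cohens_d(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> float:
    """Cohen's d for two independent groups, using the pooled SD (assumed variant)."""
    if n1 + n2 <= 2:
        raise ValueError("the groups are too small to pool the variance")
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd


def interpret_d(d: float) -> str:
    """Label an effect size by the guidelines small (0.2), medium (0.5), large (0.8)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    # Label chosen for this sketch; the guidelines do not name this range.
    return "negligible"


# Hypothetical group summaries, not taken from Table 3.
d = cohens_d(2.0, 1.5, 100, 1.1, 1.5, 100)
print(round(d, 2), interpret_d(d))
```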

Table 3 Mean scores and standard deviation [M (SD)] of numbers of reported flaws, subjective impairment related to appearance concerns and degree of depression by groups classified according to the categories of DCQ severity

Normative Data

Normative data were obtained from the whole sample (n = 2053), which is representative of the general German population. The DCQ sum score ranged from 0 to 21; in the present sample, the mean was M = 3.5 (SD = 3.1). Women reported a significantly higher DCQ score than men (t(2050.7) = − 7.1; p < 0.001). When DCQ scores were compared between age groups, the Welch test yielded a significant group effect (18–29 years: M (SD) = 4.7 (3.6); 30–39 years: M (SD) = 4.0 (3.4); 40–49 years: M (SD) = 3.3 (2.9); 50–59 years: M (SD) = 3.2 (2.8); 60–65 years: M (SD) = 2.8 (2.6); Welch’s F(4;907.8) = 17.1, p < 0.001). Pairwise post hoc tests showed significant differences in the DCQ sum score between younger and older age groups. The age groups 18–29 and 30–39 did not differ from each other (p = 0.066), but both showed a higher DCQ score than all age groups aged 40 years and older (p’s < 0.027). Post hoc tests revealed no significant differences among the age groups aged 40 years and older (p’s > 0.077).

Therefore, normative data were stratified according to gender and age groups. Subgroups consisted of an adequately large number of individuals ranging from n = 126 to n = 291. In Table 4, percentile ranks and T values of the respective DCQ raw sum score are reported for women, men and the whole sample. In particular, the subgroup of young women (age range 18–29 years) reported high DCQ scores. According to the classification of severity by Monzani et al. [15], a DCQ score of 6 and higher represents at least moderate dysmorphic concerns. For young women up to 29 years, a score of 6 corresponded only to the 69th percentile, whereas for men of the same age group it corresponded to the 88th percentile. The DCQ cutoff value of 11 proposed by Stangier et al. [5], indicating excessive dysmorphic concerns, corresponded to a percentile rank of 97 in the whole sample.

Table 4 Normative data

Prevalence of Screening Positive Cases for Excessive Dysmorphic Concerns

Table 5 shows point prevalence rates of DCQ screening positive cases with dysmorphic concerns according to different cutoff proposals in previous studies, as well as 95% confidence intervals [5, 15, 17]. Applying the most commonly used cutoff values of 11 and 14 as proposed by Stangier et al. [5] resulted in prevalence rates of 4.0 and 1.4% in the present sample.
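A confidence interval for a prevalence rate can be approximated as sketched below. The normal-approximation (Wald) interval is used here as an assumption; the text does not state which method produced the intervals in Table 5, so the result need not match the table exactly. The function name is hypothetical.

```python
from math import sqrt
from typing import Tuple


def wald_ci(proportion: float, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% CI for a proportion (normal approximation, assumed method)."""
    if not 0 <= proportion <= 1:
        raise ValueError("the proportion must lie between 0 and 1")
    if n <= 0:
        raise ValueError("the sample size must be positive")
    standard_error = sqrt(proportion * (1 - proportion) / n)
    lower = max(0.0, proportion - z * standard_error)
    upper = min(1.0, proportion + z * standard_error)
    return lower, upper


# Prevalence of 4.0% for the cutoff of 11 in the sample of N = 2053.
lower, upper = wald_ci(0.040, 2053)
print(f"{100 * lower:.1f}% to {100 * upper:.1f}%")
```

For very small rates, such as those obtained with the higher cutoffs, the normal approximation becomes less reliable, and an interval designed for small proportions, for example the Wilson interval, may be preferable.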

Table 5 Prevalence rates of screening positive cases for excessive dysmorphic concerns according to different DCQ cutoff proposals in the present sample of the general population (N = 2053)

Discussion

To our knowledge, this is the first study to validate the DCQ in a large sample of the general population. Previous studies included mostly samples of help-seeking individuals [1, 5, 14] or non-representative samples [2, 17]. The results of the present analyses confirmed the good reliability of the DCQ, as well as its one-factor structure [2, 14]. Associations between DCQ sum score and subjective impairment due to appearance concerns, number of reported flaws and depression indicate that the DCQ measures appropriately the intended construct.

Furthermore, the present study provides a normative data set of the DCQ scores, which is important for a valid interpretation of the scores in clinical practice. This may be of special interest in the setting of aesthetic surgery, as patients with clinical dysmorphic concerns have to be identified before undergoing an operation. The study provides a large-scale data set which contains percentile ranks and T values for the DCQ sum scores classified according to gender and age class. The differentiation between gender and age groups appeared relevant for the interpretation of the DCQ scores, as prior research showed gender differences regarding body dissatisfaction [29]. In the present study as well, women reported more dysmorphic concerns than men, and among women the DCQ scores also differed between age classes. This suggests that a higher threshold for clinically significant dysmorphic concerns may be appropriate for women than for men, and that different thresholds may likewise be appropriate for different age classes. Thus, the normative data set provides important information for clinical practice and may lead to a more precise evaluation of patients.

In the present sample, we applied the various cutoff values ranging from 9 to 17 used in prior studies [5, 15, 17]. Accordingly, the resulting prevalence rates of excessive dysmorphic concerns show a broad range from 0.2 to 7.1%. The cutoff of ≥ 11 by Stangier et al. [5] is often used in previous studies, but their sample included only female dermatological outpatients, whereas Mancuso et al. [17] suggested a lower cutoff value of ≥ 9 conducting analyses in a sample of undergraduates. However, there might be a bias as the samples were very specific. Furthermore, normative data of the present study showed differences in the DCQ score according to gender and age. Therefore, future research is needed conducting cutoff analysis with a clinical structured interview as external criterion in a representative sample of the general population.

According to the current state of research, however, for clinical practice in the setting of aesthetic surgery we would suggest a cutoff value of ≥ 11 [5]. Although a lower value leads to a higher number of individuals who are falsely identified with BDD, it reduces the number of individuals who are not detected. As the purpose of a screening instrument is to identify people at risk and to avoid missing them, it seems reasonable to apply a lower cutoff. However, for an adequate interpretation of the results, it is necessary to interpret individual values relative to the normative data according to gender and age class.

Some aspects of the DCQ, however, have to be considered with caution. As it is a short instrument, every single item can have a strong influence on the final score. The examination of item characteristics revealed item 3 (related to body malfunction) to be the weakest item representing dysmorphic concerns. An adaptation of this item should be investigated in future research. Furthermore, one needs to keep in mind that the DCQ alone is not sufficient to establish a final BDD diagnosis. BDD is mainly characterized by excessive appearance concerns, while the perceived flaws or defects are not observable or appear only slight to others. Patients with BDD furthermore show repetitive behaviors or mental acts that include thinking excessively about the perceived defect, trying to camouflage or alter it, or avoiding social situations. These behaviors or mental acts result in distress or lead to impairment in important areas of functioning [30]. However, the DCQ is primarily intended for measuring the severity of dysmorphic concerns, which are relevant in various disorders and health problems. As a screening instrument, the DCQ helps to identify individuals at risk, but a structured diagnostic interview is necessary to verify the presence of BDD.

This also points to a limitation of the present study. Appearance concerns and symptoms of depression were based on self-report measurements only. A large population-based survey like the present one has the advantage of generalizability, but is subject to some procedural restrictions. Face-to-face diagnostic interviews would be preferable and should be considered in future studies.

Conclusions

Overall, the DCQ has proven to be a valid and reliable questionnaire. It has been developed as a screening measure to identify individuals with dysmorphic concerns. A normative data set is now available to interpret the scores according to gender and age. After the DCQ screening, it is important to add a thorough diagnostic assessment to confirm the diagnosis of BDD. The best method for this purpose would be a face-to-face interview, as symptoms can be evaluated in detail to distinguish between clinical and subclinical cases.