Introduction

Given the importance of measuring health-related quality of life when evaluating patients with chronic inflammatory skin diseases such as atopic dermatitis, there is increasing interest in capturing this outcome in routine clinical care and research settings. The 10-item Dermatology Life Quality Index (DLQI) is one of the most popular instruments used to assess health-related quality of life in dermatology and is commonly used in clinical trials [1,2,3]. In addition, in some clinical settings, patient-reported outcome measures such as the DLQI are being used for coverage determination for access to systemic treatments and to assess response to treatment for a variety of skin conditions [4].

Although the DLQI is a popular health-related quality of life instrument, there have been recent concerns about the influence of “not relevant” responses (NRRs) on the DLQI. For 8 of the 10 items included in the DLQI, there is an option to respond “not relevant” which is scored the same as “not at all.” These NRRs are common among patients with chronic inflammatory and autoimmune skin disease such as psoriasis, hidradenitis suppurativa, vitiligo, pemphigus, and morphea [5,6,7,8,9]. Among patients with psoriasis, NRRs are also associated with worse dermatologist- and patient-reported disease activity measures, suggesting that the DLQI may underestimate health-related quality of life impact for those with NRRs [10]. There are also meaningful differences in the frequency of NRRs among adults with psoriasis by age, sex, race/ethnicity, marital status, income, and employment status [11, 12]. Given these and other limitations, there have been calls to revise the DLQI scoring to account for the potential influence of NRRs (e.g. DLQI-R) or to discontinue use of the DLQI [8, 13, 14].

Although NRRs have been well described among adults with several chronic inflammatory and autoimmune skin diseases, little is known about NRRs among adults with atopic dermatitis. The objective of this study was to examine the frequency of NRRs on the DLQI among adults with atopic dermatitis, whether NRRs are associated with underestimation of disease burden, and whether the NRR frequencies differ by sociodemographic characteristics. In addition, as a secondary objective, we sought to evaluate the construct validity of the recently proposed DLQI-R scoring modification in this population.

Materials and methods

Study population

A cross-sectional analysis was conducted using data from the Atopic Dermatitis in America survey, which has previously been described in detail [15]. Briefly, the survey population was drawn from the GfK Knowledge Panel, and online surveys were fielded in November and December 2016. The GfK Knowledge Panel is a large, probability-based web panel in the United States that includes 40,000–50,000 adult members at any given time. This panel is constructed from a national address-based sample of households who are recruited to join and who receive small incentives to participate in web-based surveys. A cross-sectional sample of participants in the GfK Knowledge Panel was recruited to participate in the Atopic Dermatitis in America survey, which sought to identify participants with atopic dermatitis. This survey also captured data on patient disease and treatment history.

Adults who participated in the Atopic Dermatitis in America survey and who met the UK Working Party diagnostic criteria for atopic dermatitis were included in this study (with modified age of onset criteria of < 18 years, since recall of childhood atopic dermatitis is low) [16, 17]. The UK Working Party definition of atopic dermatitis that was used was the presence of itchy skin plus three or more of the following: skin fold involvement, history of asthma or hay fever, history of dry skin in the past year, or age of onset under the age of 18.
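The inclusion rule described above can be sketched as a small predicate. This is a minimal illustration only, not the survey's screening code: the `Respondent` class and its field names are hypothetical, and the sketch assumes that an unknown age of onset does not count toward the three required features.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Respondent:
    # Hypothetical field names; the survey's variable names are not given in the text.
    itchy_skin: bool
    skin_fold_involvement: bool
    asthma_or_hay_fever: bool
    dry_skin_past_year: bool
    age_of_onset: Optional[int]  # in years; None if unknown


def meets_modified_ukwp(r: Respondent) -> bool:
    """Itchy skin plus three or more of the four listed features."""
    if not r.itchy_skin:
        return False
    # Modified age-of-onset criterion: onset before 18 years of age.
    onset_under_18 = r.age_of_onset is not None and r.age_of_onset < 18
    features = [
        r.skin_fold_involvement,
        r.asthma_or_hay_fever,
        r.dry_skin_past_year,
        onset_under_18,
    ]
    return sum(features) >= 3
```

For example, a respondent with itchy skin, skin fold involvement, a history of hay fever, and onset at age 10 would satisfy the rule even without dry skin in the past year.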

Patient-reported outcomes

Dermatology Life Quality Index (DLQI)

The DLQI is a 10-item questionnaire that measures dermatology-specific health-related quality of life based on patient report with a 1-week recall period. Scores range from 0 to 30, with higher scores indicating greater health-related quality of life impact. For items 3–10 on the DLQI, the patient has the additional response option of “not relevant”, which is scored the same as “not at all” [1]. Score bands for the DLQI have been proposed as follows: 0–1: no effect on health-related quality of life; 2–5: small effect; 6–10: moderate effect; 11–20: very large effect; 21–30: extremely large effect [18]. The DLQI-R scoring modification is calculated by multiplying the traditional DLQI score by a conversion factor that increases with the number of NRRs [13].
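The scoring rules in this subsection can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the instrument's official scoring code. It assumes each item has already been coded to one of five response labels scored 0 to 3 (the real questionnaire's item 7 has a slightly different layout, which is ignored here). The DLQI-R conversion factors are not reproduced in the text, so `dlqi_r` takes them as a caller-supplied function; any factor used with it must come from the published DLQI-R scoring [13].

```python
from typing import Callable, Sequence

# Assumed response coding for this sketch; "not relevant" scores the same as "not at all".
ITEM_SCORES = {
    "very much": 3,
    "a lot": 2,
    "a little": 1,
    "not at all": 0,
    "not relevant": 0,
}
NRR_ITEMS = range(3, 11)  # only items 3-10 offer "not relevant"


def dlqi_total(responses: Sequence[str]) -> int:
    """Traditional DLQI total (0-30) from ten coded item responses."""
    if len(responses) != 10:
        raise ValueError("the DLQI has exactly 10 items")
    total = 0
    for item_no, answer in enumerate(responses, start=1):
        if answer not in ITEM_SCORES:
            raise ValueError(f"unknown response for item {item_no}: {answer!r}")
        if answer == "not relevant" and item_no not in NRR_ITEMS:
            raise ValueError(f"item {item_no} has no 'not relevant' option")
        total += ITEM_SCORES[answer]
    return total


def count_nrr(responses: Sequence[str]) -> int:
    """Number of 'not relevant' responses (NRRs)."""
    return sum(answer == "not relevant" for answer in responses)


def dlqi_band(score: int) -> str:
    """Map a DLQI total onto the proposed score bands."""
    if not 0 <= score <= 30:
        raise ValueError("DLQI totals range from 0 to 30")
    if score <= 1:
        return "no effect"
    if score <= 5:
        return "small effect"
    if score <= 10:
        return "moderate effect"
    if score <= 20:
        return "very large effect"
    return "extremely large effect"


def dlqi_r(score: int, n_nrr: int, conversion_factor: Callable[[int], float]) -> float:
    """DLQI-R as described in the text: DLQI total times a factor that depends on the NRR count.

    conversion_factor is supplied by the caller; no published values are hard-coded here.
    """
    return score * conversion_factor(n_nrr)
```

Because "not relevant" contributes 0 points, two respondents with very different numbers of NRRs can receive the same traditional DLQI total, which is the issue the DLQI-R adjustment is meant to address.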

Patient-Oriented Eczema Measure (POEM)

The POEM is a 7-item symptom inventory for eczema with a 1-week recall period. Scores range from 0 to 28, with higher scores indicating worse severity of disease [19]. POEM is recommended by the Harmonizing Outcome Measures for Eczema (HOME) initiative as the core outcome instrument for measuring patient-reported symptoms in eczema trials [20]. Specific severity strata for POEM for use in this population have been proposed: 0–7: mild; 8–19: moderate; 20–28: severe [21].

Patient-Oriented SCORing Atopic Dermatitis (PO-SCORAD)

The PO-SCORAD is a self-assessment score, which uses subjective and objective criteria from the SCORAD physician clinical assessment tool to allow patients to comprehensively evaluate their atopic dermatitis. It is a static assessment and scores range from 0 to 103, with higher scores indicating greater burden of disease [22]. Severity strata for the PO-SCORAD have been proposed: 1–27: mild; 28–56: moderate; 57–103: severe [21].

Short-Form (SF)-12

The SF-12 is a 12-item generic health-related quality of life patient-reported outcome measure, which was derived from the SF-36. It uses a 4-week recall period. Scores range from 0 to 100, with lower scores indicating greater health-related quality of life impact. The SF-12 also includes two aggregate summary measures: the mental health score and physical health score [23,24,25,26].

Study outcomes and statistical analysis

To evaluate whether NRRs may be associated with underestimation of disease burden, for items 3–10 on the DLQI, severity of disease and health-related quality of life measures were compared between those who responded “not relevant” and those who responded “not at all”. In addition, to examine for sociodemographic differences with respect to NRRs, we evaluated the NRR frequency for items 3–10 on the DLQI, stratified by sex, race/ethnicity, educational attainment, income, employment status, and marital status.

Differences in median scores between those who responded “not relevant” and those who responded “not at all” were evaluated using quantile regression. Pearson chi-squared tests were used to evaluate for differences in the frequency of NRRs for each DLQI item by sociodemographic characteristics. Multivariable logistic regression was used to evaluate for associations between these sociodemographic characteristics and having at least one NRR, adjusting for DLQI score and PO-SCORAD score, since disease severity has been shown to be associated with NRRs in psoriasis [10]. Model fit was assessed using the Hosmer-Lemeshow goodness-of-fit chi-square test. Since the NRR data were overdispersed, negative binomial regression was used to evaluate for associations between sociodemographic characteristics and the total number of NRRs.
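The overdispersion that motivates the negative binomial model can be illustrated with a simple descriptive check. The sketch below compares the sample variance of the NRR counts with their mean; a ratio well above 1 suggests more variability than a Poisson model would allow. It is an unweighted, unadjusted illustration only, and it is not the diagnostic the authors report using; the function name `dispersion_ratio` is hypothetical.

```python
from statistics import mean, variance
from typing import Sequence


def dispersion_ratio(counts: Sequence[int]) -> float:
    """Sample variance divided by the mean of a set of counts.

    Under a Poisson model this ratio is expected to be close to 1;
    values well above 1 point to overdispersion.
    """
    if len(counts) < 2:
        raise ValueError("need at least two observations")
    m = mean(counts)
    if m == 0:
        raise ValueError("ratio is undefined when every count is zero")
    return variance(counts) / m
```

A formal assessment would compare fitted Poisson and negative binomial models rather than rely on this raw ratio, since covariates can explain part of the extra variance.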

To examine the construct validity of the DLQI-R, both the DLQI and DLQI-R were calculated and their correlation with POEM, PO-SCORAD, and SF-12 scores was assessed.

Spearman’s correlations were used to evaluate for correlation between the DLQI-R and DLQI with POEM, PO-SCORAD, and SF-12 scores. Correlation coefficients were interpreted using the following categorization schema: 0–0.29: negligible correlation; 0.3–0.49: low correlation; 0.5–0.69: moderate correlation; 0.7–0.89: high correlation; 0.9–1.0: very high correlation [27]. Steiger’s Z was used to evaluate for significant differences between correlation coefficients calculated for the DLQI and DLQI-R.
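A minimal sketch of the correlation step, using only the Python standard library: Spearman's coefficient is computed as the Pearson correlation of average ranks (so ties are handled), and the result is labelled with the categorization schema above. The sketch assumes the schema is applied to the absolute value of the coefficient, which the text does not state explicitly, and it ignores the survey weights. Steiger's Z is not implemented here.

```python
from math import sqrt
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))


def interpret(rho: float) -> str:
    """Label a coefficient using the schema in the text (absolute value assumed)."""
    size = abs(rho)
    if size < 0.3:
        return "negligible"
    if size < 0.5:
        return "low"
    if size < 0.7:
        return "moderate"
    if size < 0.9:
        return "high"
    return "very high"
```

Under this schema a coefficient of −0.44 would be labelled "low" and one of −0.09 "negligible".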

Known-groups validity of the DLQI and DLQI-R was assessed by comparing DLQI and DLQI-R scores across the severity categories for the POEM and PO-SCORAD, which have previously been proposed for use in this population (POEM: mild = 0–7, moderate = 8–19, and severe = 20–28; PO-SCORAD: mild = 1–27, moderate = 28–56, severe = 57–103) [21].
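The known-groups comparison can be sketched as follows: each respondent is assigned to a severity stratum, and the quality-of-life scores are then summarized within each stratum. This is an unweighted illustration with hypothetical function names. It assumes integer severity scores, takes the PO-SCORAD upper bound from the 0 to 103 score range given in the instrument description, and treats any score outside the proposed strata (such as a PO-SCORAD of 0) as unclassified.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List, Optional, Sequence, Tuple

Band = Tuple[int, int, str]  # (lowest score, highest score, label), inclusive

POEM_BANDS: List[Band] = [(0, 7, "mild"), (8, 19, "moderate"), (20, 28, "severe")]
PO_SCORAD_BANDS: List[Band] = [(1, 27, "mild"), (28, 56, "moderate"), (57, 103, "severe")]


def stratum(score: int, bands: Sequence[Band]) -> Optional[str]:
    """Return the severity label for a score, or None if no band covers it."""
    for low, high, label in bands:
        if low <= score <= high:
            return label
    return None


def median_by_stratum(
    outcome: Sequence[float], severity: Sequence[int], bands: Sequence[Band]
) -> Dict[str, float]:
    """Median outcome score (e.g. DLQI or DLQI-R) within each severity stratum."""
    if len(outcome) != len(severity):
        raise ValueError("outcome and severity must have the same length")
    groups: Dict[str, List[float]] = defaultdict(list)
    for y, s in zip(outcome, severity):
        label = stratum(s, bands)
        if label is not None:  # unclassified respondents are skipped
            groups[label].append(y)
    return {label: median(values) for label, values in groups.items()}
```

If an instrument has known-groups validity, the medians returned by `median_by_stratum` should increase from mild to severe.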

Respondents with missing DLQI, POEM, SF-12, PO-SCORAD, or covariate data were excluded (one respondent was excluded due to missing SF-12 data; no other respondents had missing data). Statistical analyses were performed using Stata 15 (StataCorp, College Station, TX). Analyses were performed using post-stratification sample weights to account for the survey design. These weights were developed to ensure all samples follow the equal probability of selection method and are designed to adjust for any differential non-response during the survey data acquisition. They are developed using several geodemographic benchmarks including gender, age, race/ethnicity, region, income, home ownership status, and metropolitan area status [28]. Standard errors were calculated using Taylor-linearized variance estimation. This study was deemed exempt by the Institutional Review Board of the University of Pennsylvania with a waiver of informed consent. This study is reported according to the STROBE guidelines [29].
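To illustrate how sample weights and linearized standard errors enter an estimate, the sketch below computes a weighted proportion (for example, the share of respondents with at least one NRR) together with a first-order Taylor-linearized standard error. It is a simplified illustration, not the Stata routine used in the study: it assumes a single-stage design in which each respondent is a sampling unit, with no strata and no finite population correction. The function name `weighted_proportion` is hypothetical.

```python
from math import sqrt
from typing import Sequence, Tuple


def weighted_proportion(y: Sequence[int], w: Sequence[float]) -> Tuple[float, float]:
    """Weighted proportion of a 0/1 indicator and its linearized standard error.

    The estimator is the ratio sum(w*y) / sum(w). Its linearized variance uses
    z_i = w_i * (y_i - p) / sum(w), which sums to zero, giving
    var = n / (n - 1) * sum(z_i ** 2).
    """
    n = len(y)
    if n != len(w) or n < 2:
        raise ValueError("need equal-length inputs with at least two respondents")
    total_w = sum(w)
    if total_w <= 0:
        raise ValueError("weights must sum to a positive value")
    p = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    z = [wi * (yi - p) / total_w for wi, yi in zip(w, y)]
    var = n / (n - 1) * sum(zi ** 2 for zi in z)
    return p, sqrt(var)
```

With equal weights the estimate reduces to the ordinary sample proportion, and the standard error reduces to sqrt(p(1 − p)/(n − 1)).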

Results

Among 764 adults with atopic dermatitis, 58.1% were female and the median age was 41 years (IQR 30–56). History of systemic medication use (e.g. cyclosporine, mycophenolate mofetil, azathioprine, methotrexate) and oral steroid use were reported by 8.7% and 17.8% of participants, respectively. The median DLQI score was 2 (IQR 1–6), corresponding to small effect on quality of life. Median POEM score was 5 (IQR 2–10), and median PO-SCORAD score was 24 (IQR 14–34) (Table 1).

Table 1 Subject characteristics (n = 764)

The median number of NRRs was 1 (IQR 0–3) and 55.2% of participants had at least one NRR, with 17.9% having 4 or more NRRs. NRRs were most common for item 6 (“sport”, 32.4%), item 3 (“daily routines”, 30.5%), and item 9 (“sexual relationships”, 27.9%) (Supplemental Table 1). For items 5–10 of the DLQI, those who responded “not relevant” had significantly lower (worse) SF-12 mental health scores than those who responded “not at all.” For items 6–9 of the DLQI, those who responded “not relevant” had significantly lower (worse) SF-12 physical health scores than those who responded “not at all,” although these differences were small in magnitude. While there were some statistically significant differences in POEM and PO-SCORAD scores between those who responded “not relevant” and “not at all”, the differences were generally small and in different directions depending on the item (Table 2).

Table 2 Comparison of DLQI scores, POEM scores, PO-SCORAD, and SF-12 scores between those who responded ‘not at all’ and ‘not relevant’ for each DLQI item

For Items 7–9, NRRs were more common among those with lower income. Compared to those who were married, NRRs were more common among those who were never married or widowed/divorced for Items 8 and 9. Compared to those who were working, NRRs were more common among those who were disabled or retired for Items 6, 7 and 9 (Table 3).

Table 3 Frequency of ‘not relevant’ responses by patient sociodemographic characteristics

In multivariable analyses, compared to white individuals, Hispanic individuals had fewer NRRs (IRR 0.71; 95% CI 0.53–0.96). Black individuals also had fewer NRRs (IRR 0.70; 95% CI 0.45–1.11), although this did not reach statistical significance. Compared to those with annual income < $25,000, those with income > $100,000 (IRR 0.50; 95% CI 0.35–0.73) had fewer NRRs. Compared to those who were married, those who were never married had more NRRs (IRR 1.38; 95% CI 1.02–1.87) (Table 4).

Table 4 Association of patient sociodemographic characteristics with 'not relevant' responses

The median DLQI-R score was 2.2 (IQR 1–7). The DLQI-R scoring modification had stronger correlation with the SF-12 Physical Health Score (−0.09 vs −0.07, Steiger’s Z p = 0.02) and SF-12 Mental Health Score (−0.44 vs −0.41, Steiger’s Z p < 0.001) than the traditional DLQI score. The DLQI-R scoring modification performed similarly to the traditional DLQI score with respect to correlation with POEM and PO-SCORAD scores (Table 5). Consistent with prior studies of the DLQI, more severe disease as assessed by POEM and PO-SCORAD was associated with higher DLQI scores indicating larger impact on health-related quality of life (Supplemental Fig. 1) [7, 30].

Table 5 Correlation between DLQI, DLQI-R and POEM, PO-SCORAD, and SF-12 scores

Discussion

While studies among patients with psoriasis and hidradenitis suppurativa have found that 20–48% of patients have at least one NRR [6, 9], in our cohort over 55% had at least one NRR. In addition, nearly a fifth of patients with atopic dermatitis in this cohort had NRRs for at least half of the items on the DLQI compared to 2–10% among patients with psoriasis [6]. The high frequency of NRRs suggests that there may be content validity problems with the DLQI when administered to adults with atopic dermatitis. Similar issues have been noted with the DLQI among patients with vitiligo, with one study finding 76.6% had at least one NRR [9].

Consistent with prior studies, NRRs were most common among items 3 (“daily routines”), 6 (“sport”), 7 (“work/study”), and 9 (“sexual relationships”) [12, 31, 32]. These items may be particularly problematic as they may not apply broadly to diverse sociodemographic groups, which is supported by differences in the frequencies of NRRs for these items by sex, race/ethnicity, income, employment status, and marital status. In addition, the 1-week recall period on the DLQI could influence the frequencies of NRRs as some individuals may not be engaged in these activities on a weekly basis (e.g. “sport”, “sexual relationships”) [12].

While several studies have highlighted that NRRs are associated with underestimation of disease severity among patients with psoriasis [10, 13], our data do not demonstrate a clear pattern of NRRs being associated with greater disease burden. Although those who responded “not relevant” had worse SF-12 scores than those who responded “not at all,” which could suggest that NRRs are associated with underestimation of health-related quality of life impact, similar patterns were not consistently observed for DLQI, POEM, and PO-SCORAD scores. In addition, the magnitude of these differences was small, and the clinical significance of these differences is unclear.

We found that several sociodemographic factors were associated with having fewer NRRs, including Hispanic race/ethnicity, increasing income, and being married. Similarly, studies in psoriasis have also found that sociodemographic factors such as increasing income and being married are associated with decreased NRRs [11, 12]. These differences suggest there may be issues when the DLQI is used among diverse populations of patients with atopic dermatitis.

Given the potential bias introduced from NRRs, the DLQI-R scoring modification has been proposed as a simple approach to adjust the DLQI score to account for the potential influence of NRRs [13]. Although some studies among patients with psoriasis have found that the DLQI-R has improved measurement properties compared to the traditional DLQI scoring method, others have not [5, 8, 13]. In this study among a cohort of patients with mostly mild atopic dermatitis, the DLQI-R did demonstrate stronger correlation with SF-12 scores than the traditional DLQI, although the correlations were weak and differences observed between the DLQI and DLQI-R as assessed by Spearman’s rank correlation were small, each differing by less than 0.03. Furthermore, while the DLQI-R scoring modification may help account for bias introduced by NRRs, it does not address the fundamental issue of content validity with the DLQI, which is considered the most important measurement property of a patient-reported outcome measure by the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) initiative [33].

This study should be interpreted in the context of its design. Although the broad population included in the Atopic Dermatitis in America cohort is a strength of this cohort, patients have relatively mild skin disease which may have contributed to the lack of clear association between NRRs and burden of disease in our study. This limitation is particularly relevant as studies in psoriasis have found that NRRs are more common among those with more severe disease [10, 12, 34]. In addition, these data were collected from an online platform. The relatively mild skin disease in this cohort may also have limited our ability to detect differences between the DLQI and DLQI-R scoring modification. Future studies are needed to examine whether these findings are similar among patients with more moderate-to-severe disease. Given the nature of the survey design, we are unable to assess how NRRs may influence clinical decisions and treatment recommendations at the point of care. In addition, we are unable to evaluate the underlying factors contributing to NRRs in this population.

Conclusion

NRRs on the DLQI are common among a cohort of adults with atopic dermatitis and differ across several sociodemographic characteristics, suggesting important issues with respect to content validity. Unlike what has been observed for psoriasis, there is not a clear association between NRRs and underestimation of disease severity among a cohort of adults with mostly mild atopic dermatitis. Further study is needed to understand the factors contributing to NRRs, the impact of NRRs on patient outcomes when the DLQI is used in routine clinical care, and optimal strategies to assess health-related quality of life among patients with atopic dermatitis and other inflammatory skin diseases.