Introduction

Outcomes and endpoints for dermatology clinical trials have historically been decided by clinical research professionals in collaboration with pharmaceutical executives and regulatory authorities. These outcome measures employ “validated tools,” which are agreed objective criteria for calculating efficacy and safety. A statistically significant change in the measure is required to determine that an investigational product (drug) is effective. Outcome measures are both global and detailed. For example, for acne, current investigator global assessment outcomes are defined as clear, almost clear, mild, moderate, severe, and very severe. This assessment is static, meaning that it is made without consideration of prior assessments. These categories are an “at arm’s length” assessment, made without counting, of the amounts of inflammatory and non-inflammatory acne lesions and the percent of the face involved. The detailed assessment is lesion counting. Some clinical trials require recording, independently for each facial region, the number of open comedones, closed comedones, papules, pustules, nodules, and cysts. What is not routinely assessed is the degree and type of scarring, the dyspigmentation, and the effect on work, school, and social activities.

Outcome measures used in clinical trials are required to be used in clinical practice in some countries as a standard by which choice of treatment is determined. Many find these measures to be too time consuming and not necessarily valid when used in clinical practice.

Thus, the outcomes used to establish safety and efficacy and to support drug approval may not include all aspects of the disease that are important to patients [1]. A new movement of outspoken, educated patients is collaborating with researchers to generate outcome measures that matter to them. This movement has been lobbying Congress, professional medical groups, and patient advocacy groups to get its voice heard. In March 2016, the FDA held a public meeting on patient-focused drug development for psoriasis to discuss disease symptoms and daily impacts that matter to patients. This information is meant to guide the FDA’s assessment in determining what to measure to provide evidence of treatment benefit, how to measure it, and what is most meaningful to patients. Medical literature refers to these people as “patient research partners,” “patient experts,” and “patient representatives” or “patient stakeholders.” Stakeholders are anybody who in any way has a relationship to the disease or its treatment, from patients to clinicians to laboratory bench scientists. Patient experts are defined as “persons with a relevant disease who operate as active research team members on an equal basis with professional researchers, adding the benefit of their experiential knowledge to a research project” [2]. Because disease presentation varies, patient experts are asked to represent not only their own experiences but also the general views of all with the disease. This movement is happening across all fields, and the voice given to patients in one prostate cancer trial led to four major and five minor changes in the trial design [1]. International efforts are underway to develop a consensus on which outcome measures should be required, optional, or avoided in clinical trials and on which simplified versions can be employed in clinical practice. An international consensus on outcome measures in clinical trials can simplify the process of gaining regulatory approval across the globe.
The Cochrane Skin Group has established the Core Outcome Set Initiative (CSG-COUSIN). Started in 2014, the group is committed to developing and implementing a core outcome set (COS) in dermatology to standardize the outcome measures for clinical trials so that they lead to more useful clinical decision making. The group is focusing on diseases such as atopic dermatitis, vitiligo, acne vulgaris, incontinence-dermatitis, vulvar skin conditions, leishmaniasis, hidradenitis suppurativa, and psoriasis, following the roadmap set out by the Harmonizing Outcome Measures for Eczema (HOME) initiative. The opening meeting was held in March 2015 in Germany and focused on the CSG-COUSIN initiative and efforts in COS, along with the challenges associated with outcome assessment. Current project groups from the 2016 meeting in London include facial aging, melanoma, urticaria, nail psoriasis, vascular malformation, and wound healing, in addition to the ones previously mentioned [3].

This review article will focus on the efforts currently underway to make patient-centric outcome measures for clinical trials in relationship to psoriasis (PSO), acne vulgaris, atopic dermatitis (AD), and hidradenitis suppurativa (HS).

Psoriasis

Psoriasis affects 1–3% of the population and is consistently thought of as a disease that burdens only the skin [4,5,6,7, 8•, 9•, 10•]. Psoriasis severity is determined by the intensity and extent of the psoriatic lesions [7]. Due to their high visibility, lesions on the face and scalp have a more negative impact [8•]. However, psoriasis may have a pronounced effect not only on physical well-being but also on the psychosocial and financial aspects of daily living [5,6,7, 10•, 11]. Patients with psoriasis also face cardiovascular risks and depression, to name a few associated conditions [4]. Socially, patients with psoriasis are confronted with misconceptions among the general public that their condition is contagious, or it is mistaken for another disease, which limits their access to public facilities [11]. In a review by Naldi et al. covering 1997 to 2000, 44 different scoring systems were used in 171 randomized clinical trials in psoriasis, with half of them using the Psoriasis Area and Severity Index (PASI) [12,13,14]. Psoriasis is a fluctuating chronic condition, and current treatments sometimes produce only incremental improvements. Techniques to assess this response rely on subjective assessments by the physician and patient [15].

PASI is a common primary outcome that indicates severity by measuring three signs of psoriatic plaques (erythema, scaling, and thickness) in conjunction with the extent of body coverage by these plaques. However, the PASI fails to account for the disproportionate burden of more visible lesions. Multiple studies have shown a correlation between improvements in the Dermatology Life Quality Index (DLQI) and PASI. Armstrong et al. showed that a relationship existed between PASI and DLQI improvements, but that these reductions in score were only achieved with at least a PASI 75 response (a reduction of at least 75% from the baseline PASI score) and that the degree of improvement in quality of life (QoL) depended on which area was affected. However, another study suggests that a PASI 90 response is a more relevant measure with regard to QoL [8•].
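To make the structure of the index concrete, the following is a minimal sketch of how a PASI score and a PASI 75 or PASI 90 response could be computed. The regional weights (head 0.1, upper limbs 0.2, trunk 0.3, lower limbs 0.4), the 0 to 4 sign scale, and the 0 to 6 area scale are taken from the commonly published PASI definition and are not stated in this review; the function and variable names are hypothetical. The small weight given to the head illustrates why the index may understate the burden of highly visible lesions.

```python
# Commonly published PASI region weights (an assumption; not given in this review).
REGION_WEIGHTS = {
    "head": 0.1,
    "upper_limbs": 0.2,
    "trunk": 0.3,
    "lower_limbs": 0.4,
}


def pasi_score(regions):
    """Compute a PASI total from per-region scores.

    `regions` maps each region name to a tuple
    (erythema, thickness, scaling, area_score), where each sign is
    scored 0-4 and the area score is 0-6.
    """
    total = 0.0
    for name, weight in REGION_WEIGHTS.items():
        erythema, thickness, scaling, area = regions[name]
        for sign in (erythema, thickness, scaling):
            if not 0 <= sign <= 4:
                raise ValueError(f"sign score out of range in {name}: {sign}")
        if not 0 <= area <= 6:
            raise ValueError(f"area score out of range in {name}: {area}")
        total += weight * (erythema + thickness + scaling) * area
    return total


def percent_improvement(baseline, current):
    """Percentage reduction in PASI relative to a positive baseline score."""
    if baseline <= 0:
        raise ValueError("baseline PASI must be positive")
    return 100.0 * (baseline - current) / baseline


def achieved_response(baseline, current, threshold=75):
    """True if the reduction from baseline meets the threshold (75 for PASI 75)."""
    return percent_improvement(baseline, current) >= threshold
```

For example, calling `achieved_response(baseline, current, threshold=90)` checks for a PASI 90 response with the same helper, so the two thresholds discussed above differ only in the cut-off applied to the percentage reduction.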

The importance of assessing the impact of psoriasis symptoms on a patient’s overall well-being is recognized by the Medical Advisory Board of the National Psoriasis Foundation. In accordance with recommendations set forth by the US Food and Drug Administration (FDA) Guidance for Patient-Reported Outcome Measures, the 16-question Psoriasis Symptom Diary (PSD) was developed for use in global clinical studies to examine the efficacy of treatments in moderate to severe chronic plaque psoriasis. A phase II trial of the PSD showed sound reliability, validity, and sensitivity to change. Although many clinician-reported measures exist, such as PASI and the Investigator Global Assessment (IGA), the PSD provides daily assessments rated by the individual patient. The newly published phase III trial confirmed the phase II results. Evidence of convergent and divergent validity was shown between the PSD and two clinician-reported measures (PASI and IGA), as well as two generic dermatology and health-related quality of life (HRQoL) measures (DLQI and EQ-5D) [9•].

Treatment goals assessed by the patients themselves differ from the physicians’ assessments. The Patient Benefit Index (PBI-S), used for chronic skin diseases, measures the importance of these goals. The first part of the index rates the importance of 25 different treatment goals through the Patient Needs Questionnaire (PNQ). The second part, the Patient Benefit Questionnaire (PBQ), measures the benefits provided by the patient’s current treatment by assessing the achievement of the goals. The items assessed were based on a survey of 100 people with chronic skin conditions, including psoriasis. The index was assessed in the German psoriasis registry PsoBest by Blome et al., who concluded that treatment goals went beyond skin clearance and included concerns about itching, burning, pain, and normalization of everyday life. Although a majority of the participants had a change in treatment goals after 1 year of systemic treatment, the importance of these goals did not change depending on treatment success, indicating that treatment goals should be reassessed on a regular basis [10•].
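The two-part structure of the index can be sketched as an importance-weighted average: each goal's achieved benefit (PBQ) is weighted by how important the patient rated that goal (PNQ). The weighting formula, the 0 to 4 rating scale, and the handling of “does not apply” answers shown here are assumptions based on how the index is usually described elsewhere, not details given in this review, and the function and goal names are hypothetical.

```python
def patient_benefit_index(importance, benefit):
    """Importance-weighted mean benefit across treatment goals.

    `importance` (PNQ) and `benefit` (PBQ) map goal names to ratings
    on an assumed 0-4 scale, or to None when the patient answered
    "does not apply". Goals missing a rating in either part are skipped.
    Returns None when no rated goal has non-zero importance.
    """
    rated = [
        goal
        for goal in importance
        if importance[goal] is not None and benefit.get(goal) is not None
    ]
    total_importance = sum(importance[goal] for goal in rated)
    if total_importance == 0:
        return None
    weighted = sum(importance[goal] * benefit[goal] for goal in rated)
    return weighted / total_importance


# Illustrative ratings only; these are not data from any study.
needs = {"be free of itching": 4, "be healed of all skin defects": 2,
         "sleep better": 0, "lead a normal working life": None}
gains = {"be free of itching": 4, "be healed of all skin defects": 2,
         "sleep better": 3, "lead a normal working life": 1}
example_pbi = patient_benefit_index(needs, gains)
```

Under this weighting, a goal the patient rates as unimportant contributes nothing to the result even if the treatment achieved it, which is what distinguishes the index from a plain average of benefit ratings.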

The International Dermatology Outcome Measures (IDEOM) group was formed to address the inconsistencies between the “validated” psoriasis outcome measures and those aspects of the disease that most affect patients. IDEOM’s inaugural meeting was held in 2013 by Alice Gottlieb et al. in Boston [16]. IDEOM was joined in 2014 by the Psoriasis Outcome Assessment Review (pSOAR), a similar effort from the American DermatoEpidemiology Network (ADEN)/DermatoEpidemiology Expert Resource Group [17•]. IDEOM has adopted the methods used by Outcome Measures in Rheumatoid Arthritis Clinical Trials (OMERACT) to organize an international committee of PSO stakeholders to reach a consensus on PSO patient-centric outcome measures. In addition to recruiting patient stakeholders from clinics, clinical trials, and patient advocacy groups, IDEOM recruited and interacted with patients globally through social media such as Facebook, Twitter, and Instagram. Social media including LinkedIn, Twitter, and Facebook were employed to interact with and recruit professional stakeholders. IDEOM has met in the USA, Canada, and Italy. Recently, the group agreed to meet in Washington, DC, so that the dermatology director of the FDA and other US government officials can meet with the group [17•, 18•, 19,20,21].

Acne Vulgaris

Acne vulgaris is the world’s most widespread dermatological disease, and its prevalence is increasing globally [22]. It is the most common reason for a dermatology visit. It involves the formation of comedones, papules, pustules, and, in particularly severe cases, nodules. These lesions often and notoriously appear on cosmetically sensitive areas such as the face, much to the detriment of the afflicted individual. Due to its facial location, acne has long been known to have an impact on the quality of life [23•].

Subjective grading of the lesions based on what appeared to be the most common, dominant lesion type and counting of the lesions have historically been used in the classification of acne. The determination of the lesion type and the counting of lesions could be performed by either the patient or the dermatologist. In general, however, because acne vulgaris can be widespread or limited to a particular anatomical region, and due to it being comprised of several different categories of lesions, acne has been historically very difficult to score. Adding to that, a lack of knowledge regarding the precise pathophysiology of acne and the spontaneous resolution in some patients of particular lesions complicates the scoring of this condition even more. A scale for acne must also be user-friendly, as clinicians outside of the specialty of dermatology have the ability to treat acne [24•]. Finally, as mentioned earlier, acne has an indisputable impact on the psychosocial well-being of the patient, and thus, any scale for acne would ideally take into account this aspect of the condition [23•].

Several attempts at a standardized measure for clinicians who treat acne have been made. For example, the Cardiff Acne Disability Index (CADI) (an offshoot of the 48-item Acne Disability Index) is a five-question survey asking patients to report on how their acne lesions have affected their social behavior and how they would grade the severity of their own acne lesions. This grading assessment has been translated into several languages and is a favorite among dermatologists. It is often combined with the Global Acne Grading System (GAGS) which further assesses the type and location of acne lesions [25•].

Despite the availability of several dozen acne grading systems, there is not yet a single, standardized patient-reported outcome measure for acne that simultaneously grades the acne lesions as well. One may ask: if there are so many scales available, with some being widely used among dermatologists, what makes it so difficult for one of the scales to emerge as a standardized grading system? The problem lies in a lack of reliability and validity testing for several of the developed scales, especially in the setting of a clinical or community trial. For example, the previously mentioned Cardiff Acne Disability Index has been translated into multiple languages and has had its validity confirmed in those languages (albeit with varying results across languages); however, there is a substantial shortage of data regarding how well this index would fare in a clinical trial [25•].

A lack of a standardized scale makes acne-related research incredibly difficult. The Acne Core Research Outcome Network, otherwise known as ACORN, is an international consortium that was developed to design validated outcome measures for acne. It is divided into several groups of projects. One of the groups, the Core Outcome Domains Identification (CODI) group, has the job of determining precisely which facets of acne (i.e., symptoms, psychosocial impact, etc.) should be included in such a validated measure. Currently, ACORN hopes to assemble several validated measures for use in a clinician’s “toolbox” as opposed to a single, standardized measure. ACORN is currently working with IDEOM to help develop an international consensus for patient-centric clinical trial acne outcome measures [26].

In 2014, the Acne Symptom and Impact Scale (ASIS) was suggested. This outcome measure was based upon the physical symptoms of acne as well as the psychosocial impact it had on the patient [27•]. This scale was found to be reliable and valid in adolescents and adults with acne, across Caucasian and non-Caucasian subgroups. The current model of ASIS has 17 items [28•]. While the ASIS may seem promising, it is still a relatively new scoring system for acne and has yet to become mainstream in dermatology practices.

In summary, acne vulgaris has traditionally been a very difficult disease for clinicians and researchers to grade due to its various lesion types, impact on quality of life, and diverse anatomical distribution. While several scoring systems have been developed, they lack the inter-researcher reliability and validity testing needed to create a universal, standardized scoring system. Newer scoring systems such as ASIS have the proper reliability and validity testing and may be where the future of a universal scoring system for acne is headed; however, these are relatively new systems and need to be further tested.

Atopic Dermatitis

Atopic dermatitis (AD) is a chronically relapsing and severely pruritic inflammatory skin disorder. Patients suffering from this disorder often display impaired skin barrier function, exacerbated by itch-induced scratching [29]. There are various known triggers of AD patient reactions that are often examined to determine the interactions that occur.

Avoidance of exposure to airborne formaldehyde appears to be fundamental to prevent further exacerbation of AD [30]. The resulting skin barrier damage can be seen shortly after patient exposure. It is important to note the air quality of homes and other places these patients spend much of their time, as well as identify other potential sources of formaldehyde that they may be exposed to [31]. Various other environmental chemicals and exposures have triggering effects in AD patients. While the incidence of AD has increased over the recent years, so has the prevalence of chemical allergen release into our environment. Car exhaust, fertilizers, insecticides, and cigarette smoke trigger a heightened response from the immune system in these patients. Preservatives, fragrances, and taste enhancers found in processed foods and cosmetics serve as additional chemical allergens of concern. Patients with allergies to known irritants should avoid exposure to the products and foods that may contain them. Preservative contact allergy is becoming more prevalent and is a result of the perceived necessity of its usage in chemical and cosmetic products [32].

The Scoring Atopic Dermatitis (SCORAD) index is an assessment used to determine the severity of AD developed in Europe for pediatric AD patients. Similarly, the EASI score was developed in the USA to parallel the PASI score used in psoriasis. The SCORAD alone contains patient-reported effects on itching and sleep [33].

Similar to the efforts of IDEOM and ACORN, the Harmonising Outcome Measures for Eczema (HOME) group, http://www.homeforeczema.org/, an international group based in the UK, has engaged multiple classes of stakeholders to develop an outline of four domains to be included in all relevant clinical trials for AD. These domains are clinical signs, symptoms, long-term control, and quality of life. The framework provides a basis for research that is multi-disciplinary, demonstrates effectiveness, involves patients, and is relevant to caregivers. HOME includes core outcome measures that assess the four domains. These include the Eczema Area and Severity Index (EASI), a validated scoring system measuring clinician-reported signs, and the Patient-Oriented Eczema Measure (POEM), a validated scoring system completed by the patient or caregiver regarding illness experience. The HOME IV meeting in Sweden in 2015 voted POEM an adequate measure of itch, sleep loss, redness/inflamed skin, and irritated skin. However, there was no consensus on a QoL instrument or on whether long-term control should be a separate entity or involve repeated measurements of one of the three other domains [34].

These measures, along with global scores, remain static and do not fully represent either the body surface area (BSA) involved or the effect on the patient’s activities of daily living (ADL). The current focus of HOME is to choose, among currently validated outcome measures, those that best represent the severity of disease and its effects on patients. Whether or not HOME will follow the lead of IDEOM to fully revise the AD outcome measures likely depends on the success of IDEOM in drafting and validating new patient-centric PSO outcome measures that are accepted internationally by regulatory authorities, investigators, the pharmaceutical industry, patients, and other stakeholders.

Hidradenitis Suppurativa

Hidradenitis suppurativa (HS) is a chronic inflammatory skin condition that affects the apocrine glands and is characterized by recurrent outbreaks of painful abscesses, fistulas, and skin infections in the axillae, genitals, groin, breast, and perianal regions [35]. There have been advancements in detecting subclinical HS using a variety of imaging techniques (mainly ultrasound). In addition to these advancements in detection, core outcome measures for HS are also present in the dermatological literature and play a large role in management of the disease [36]. The most commonly used clinical measurements for clinical progression and staging of the disease are based on the Hurley staging system. In addition to a variety of Modified Hurley Staging (MHS) systems, other HS outcome measures used in clinical trials include the Modified Sartorius Score (MSS) and the HS Physician’s Global Assessment (HS-PGA) [37]. In fact, there are 30 outcome measures used for HS in the current scientific literature. However, 90% of them lack any validation data to support their use (including the Hurley staging system, MSS, and HS-PGA) [38•].

However, two validated outcome measures in the current scientific literature are the Hidradenitis Suppurativa Clinical Response (HiSCR) and the Acne Inversa Severity Index (AISI). Despite being validated, these two scales have yet to be used in a randomized clinical trial. The HiSCR instrument is a binary outcome measure that was derived retrospectively from a randomized clinical trial that used other outcome measures. HiSCR is defined in terms of three lesion types: abscesses (which may or may not be painful, fluctuant, or draining), nodules (tender, erythematous, or pyogenic), and fistulas (tracts or communications, which may or may not contain purulent fluid) [39]. The Acne Inversa Severity Index is an HS-tailored composite score that was designed to include a physician-scored assessment of the evolving body lesions and the relative body sites as well as a physician-scored assessment of a patient’s pain, discomfort, and disability due to HS. This scale has been validated in a study of 46 patients with HS and was even noted to be significantly faster to complete than the Sartorius score (mean of 46.66 vs 83.2 s) [40].
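Because HiSCR is a binary responder definition built on lesion counts, it can be expressed as a short rule. The sketch below assumes the commonly published definition (at least a 50% reduction from baseline in the combined abscess and inflammatory nodule count, with no increase in abscess count and no increase in draining fistula count); these thresholds are not stated in this review. The minimum baseline lesion count of three reflects the eligibility requirement discussed later in this section. All class, function, and parameter names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LesionCounts:
    """Lesion counts recorded at a single visit."""
    abscesses: int
    inflammatory_nodules: int
    draining_fistulas: int

    @property
    def an_count(self):
        # Combined abscess and inflammatory nodule (AN) count.
        return self.abscesses + self.inflammatory_nodules


def hiscr_achieved(baseline, followup, min_baseline_an=3):
    """Return True if the follow-up visit meets the assumed HiSCR criteria.

    Raises ValueError when the baseline AN count is below the minimum,
    since a response could then hinge on the loss of a single lesion.
    """
    if baseline.an_count < min_baseline_an:
        raise ValueError("baseline AN count too low for a HiSCR assessment")
    an_halved = 2 * followup.an_count <= baseline.an_count
    no_new_abscesses = followup.abscesses <= baseline.abscesses
    no_new_fistulas = followup.draining_fistulas <= baseline.draining_fistulas
    return an_halved and no_new_abscesses and no_new_fistulas
```

Because the result is a single yes or no per patient, a trial using such a rule would report the proportion of responders in each arm rather than a mean change in score.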

Regarding the validity of the HiSCR instrument, it had a correlation coefficient of 0.61 when compared to another scale, the HS-PGA. However, when compared to the Hurley Stage and MSS, the correlation coefficient was lower at 0.49 and 0.51, respectively. Furthermore, the test re-test reliability for inflammatory lesions and sinuses was 0.91 and 0.95, respectively. The AISI instrument demonstrated similarity to Hurley stage and Sartorius 2003 score with correlation coefficients of 0.71 and 0.97, respectively. Correlation with DLQI scores was 0.83 [38•].

Ultimately, an analysis of the 30 HS outcome measures found a problem of heterogeneity among the instruments, which were used in a total of 12 different randomized controlled trials. The HiSCR instrument is promising regarding convergent validity and test-retest reliability, but further studies on internal consistency, inter-rater reliability, and the minimal clinically important difference (MCID) are still needed. Furthermore, a notable problem with this instrument is that it requires participants to have an inflammatory lesion count of at least three, so as to avoid an endpoint based on the reduction of just one such lesion. In addition, the high correlation coefficient of AISI with the DLQI indicates overlapping inclusion criteria. Currently, no single outcome measure can be recommended, although ten potential efficacy outcome measure domains were identified (including patient global self-assessment, recurrence, satisfaction with treatment, functional impairment, appearance, and recovery length) [38•]. Despite the lack of a recommended outcome measure, there have been recent efforts to establish a better core outcome set in order to reduce the heterogeneity in the literature. A core outcome set is useful because it defines a list of outcomes that should be included, measured, and reported in all clinical trials. The International Dermatology Outcome Measures (IDEOM) group has discussed outcome measures for HS [18•]. Furthermore, a protocol has been established for such a core outcome set and will hopefully be used in future research to develop a unified outcome measure for hidradenitis suppurativa. This protocol is the Hidradenitis SuppuraTiva cORe outcomes set International Collaboration (HISTORIC), derived from the joint efforts of IDEOM and CSG-COUSIN, with the aim of developing core outcome sets (COS) for HS clinical trials [37].

New Approaches

Despite the adequate validity, reliability, and sensitivity of many patient-reported outcomes, there is no single instrument that fully captures all of the clinical symptoms of the disease that concern the patients nor the emotional well-being and coping strategies of patients with chronic skin diseases. Efforts are underway to organize the full set of stakeholders, including patients, clinical investigators, pharmaceutical companies, and regulatory agencies into working groups to develop an international consensus for dermatology clinical trial outcome measures. Table 1 outlines which groups are associated with which disease and measures. These groups recognize the need to develop simplified versions of these measures that can be used in clinical practice.

Table 1 Summary of efforts to create patient-centric outcome measures

In support of these efforts, a multi-specialty, multi-institutional research consortium is developing two approaches to use social media and the internet to globally learn from patients which outcomes they find to be essential.

One approach is to mine the more than 2.5 million public posts per month on social media that relate to skin disease. This effort is underway through a collaboration between supercomputing health sciences experts, who use big data technology, and social media listening software experts, who are adapting software designed to mine global internet chatter for commercial use so that it focuses on dermatological disease.

The second approach is to use the internet and social media to reach out to and follow tens of thousands of dermatology patients. The new continuous quality improvement assessment platform will interact with patients by capturing, in relation to disease outcomes, the minutiae of activities of daily living (ADL); diet, lifestyle, and exposures (DLE); a complete list of known countries of origin/heritage; and a modified, fully detailed Fitzpatrick skin typing. Current validated patient-reported outcome measures (PROs) are being compared with newly drafted PROs. The association of ADL, DLE, heritage, and skin type with disease outcomes will be assessed using complex adaptive systems methodology, which identifies arrays of relationships that are non-linear, multifactorial, and non-time dependent.

With these two long-term, internet-based global approaches, we hope to provide well-powered, big data studies that measure patient-centric outcome goals and quality of life assessments for patients with dermatological diseases.

Conclusion

With the persistence of multiple global groups (IDEOM, CSG-COUSIN, ACORN, HOME, and HISTORIC), core sets of outcome measures are being developed that combine previously clinician-oriented outcome measures with patient-oriented assessments for use in clinical trials. The most recent literature studies have involved psoriasis, acne vulgaris, atopic dermatitis, and hidradenitis suppurativa. New approaches are also underway that use social media and the large-scale data-collecting capabilities of software experts to identify the internet chatter being produced by patients with dermatological conditions.