Introduction

Vascular manifestations, such as Raynaud’s phenomenon (RP) and digital ulcer (DU) disease, are major causes of disease-related morbidity in patients with systemic sclerosis (SSc) [1, 2]. There are a limited number of licensed drugs for SSc-RP and SSc-DU disease, in part due to the difficulty establishing treatment efficacy in the clinical trial setting [3]. This has contributed to geographic variation in regulatory approval, marketing authorisation, reimbursement policies and clinical practice, with respect to phosphodiesterase V inhibitors (PDEVi) for SSc-RP and endothelin receptor antagonists (ERA) for SSc-DU disease [3]. In the absence of robust treatment strategy trials, there is some variation in consensus amongst SSc experts regarding the positioning of PDEVi and ERA (bosentan) in the management of SSc-RP and SSc-DU respectively (only 64–76% agreement) [4•]. One factor that could be influencing expert opinion and reimbursement policies for these interventions is the contrasting findings of randomised controlled trials (RCTs) within these fields, with pooled meta-analyses suggesting modest treatment benefits at best [5••, 6••]. The objective of the present review is to critically appraise potential reasons behind the observed contrasting findings of clinical trials undertaken in SSc-RP and SSc-DU disease. For this, we have chosen trials of tadalafil for SSc-RP and ERA (bosentan and macitentan) for SSc-DU disease. Elucidating potential factors resulting in positive and negative clinical trials for within-class interventions could have important implications for future trial design, helping to ensure that the therapeutic benefits of new treatments are not overlooked.

Phosphodiesterase inhibitors for Raynaud’s phenomenon in systemic sclerosis

Meta-analysis of PDEVi trials in SSc-RP suggests a very modest benefit for these potent vasodilators that have gained acceptance as an important second line treatment for SSc-RP amongst SSc experts [4•, 6••, 7]. Pooled analysis of six clinical trials identified a mean decrease in the Raynaud’s Condition Score (RCS) of only − 0.46 (95% CI − 0.74 to − 0.17), with similarly modest benefits of active treatment over placebo for the average daily frequency (− 0.49 attacks/day) and duration (− 14.62 min/day) of RP attacks [6••]. To put these findings in perspective, the estimated minimally important difference for the RCS (1.4) is 3 times greater than the pooled treatment benefit with PDEVi for SSc-RP [8]. Moreover, the modest treatment benefit reported following meta-analysis was largely dependent on the findings of two RCTs of alternate-day tadalafil, one of which was only reported as a conference proceeding (ClinicalTrials.gov Identifier: NCT01117298) [9, 10]. In contrast, a separate RCT of daily tadalafil reported no significant benefit on SSc-RP symptoms [11].
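The comparison between the pooled treatment benefit and the minimally important difference can be checked with simple arithmetic. The sketch below uses only the figures quoted above (the pooled change in RCS, its 95% CI and the estimated minimally important difference of 1.4); the variable names are illustrative and are not taken from the cited analyses.

```python
# Figures quoted in the text; variable names are illustrative only.
pooled_rcs_change = -0.46        # pooled mean change in RCS, PDEVi vs. placebo
ci_low, ci_high = -0.74, -0.17   # 95% confidence interval of the pooled change
mid_rcs = 1.4                    # estimated minimally important difference for the RCS

# How many times larger is the minimally important difference than the pooled benefit?
mid_to_effect_ratio = mid_rcs / abs(pooled_rcs_change)

# Largest benefit compatible with the reported confidence interval.
best_case_benefit = max(abs(ci_low), abs(ci_high))
best_case_reaches_mid = best_case_benefit >= mid_rcs

print(f"MID / pooled benefit: {mid_to_effect_ratio:.1f}")
print(f"Best-case benefit within the 95% CI reaches the MID: {best_case_reaches_mid}")
```

On these figures, even the most favourable end of the confidence interval (0.74) remains well short of the minimally important difference.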

A summary of the two contrasting clinical trials of tadalafil for SSc-RP is presented in Table 1 [9, 11]. Conveniently, for the purpose of this review, both trials adopted a similar prospective, randomised, double-blind, placebo-controlled, cross-over, single-centre trial design and incorporated the RCS diary parameters as the primary endpoints. The RCS diary was developed for a negative clinical trial of oral iloprost and collects daily data on the duration of each RP attack and a single-item global assessment of the severity/impact of RP (the RCS) in the form of either an 11-point numeric rating scale (NRS) or 100-mm Visual Analogue Scale (VAS) [15]. Using these data, it is possible to establish the mean daily frequency and aggregate duration of RP attacks, in addition to a mean daily RCS. The RCS diary has been partially validated and is currently the preferred endpoint for SSc-RP clinical trials [16, 17]. There were, however, differences between the studies that were of interest.

Table 1 Summary of trials examining the role of tadalafil therapy for Raynaud’s phenomenon in systemic sclerosis

Clinical trial design, participant eligibility and setting

The major difference in study design related to tadalafil 20-mg dose administration, which was alternate day in the trial by Shenoy et al. (hereafter named study A) and daily in the trial of Schiopu et al. (hereafter named study B) [9, 11]. The study schedules differed slightly in terms of treatment period duration (6 vs. 4 weeks for studies A and B respectively) and washout period (1 vs. 2 weeks for studies A and B respectively). Study B had an a priori co-primary endpoint examining female sexual function in SSc [12, 13]. As such, study B recruited only females, although 83% of the participants in study A were also female (as expected). It is possible that an important motivating factor for entry to study B may have been the perceived potential beneficial effects on sexual dysfunction. Similarly, the expectation that participants would engage in sexual activity weekly in study B may have introduced important selection bias with respect to overall well-being and psychosocial considerations. Study B mandated a higher average frequency of RP attacks/week during the run-in phase (≥ 6 vs. ≥ 4 attacks/week), which resulted in the withdrawal of three patients after gaining consent.

There was a higher proportion of patients with dcSSc in study A (75% vs. 26%), although there is no known difference in the character or severity of RP symptoms between the major disease subsets. Participants in study A were younger (mean 36.9 vs. 52.9 years) and had a shorter disease duration (6.8 vs. 11.8 years). Recent work suggests potential evolution of RP symptoms during the course of SSc, possibly alongside the evolving obliterative microangiopathy of SSc [18]. Adaptation with enhanced self-efficacy at avoiding or ameliorating SSc-RP attacks also lessens the burden of Raynaud’s symptoms over time [19].

Both studies were single-centre (Lucknow, India and Ann Arbor, Michigan, US), which allows a helpful comparison of the likely weather conditions experienced by trial participants. Both studies enrolled participants during winter. Study A had a narrower enrolment window and all study visits were completed over a 15-week period between December and March. The authors reported local mean daily temperatures of 14–18 °C during this period. In contrast, participants in study B had study visits between October and June. The authors did not report local weather data during this period but a more pronounced seasonal effect would be expected in study B than study A given the mean daily temperatures in Ann Arbor range from − 4.1 °C in January to 20.5 °C in June [20]. Study B examined ‘subjects studies in October through November versus February through March’ without finding differences in absolute RCS or change from baseline, although this analysis did not capture the winter months of December or January [11]. A recent multicentre study at UK and USA sites has highlighted the impact of weather on RCS diary returns, with significantly higher mean daily frequency (1.8 vs. 0.9 attacks/day), duration (33.6 vs. 15.7 min/day) and RCS scores (2.5 vs. 0.9) in patients completing the diary during winter compared to summer [21]. The other major difference in eligibility criteria related to co-administration of vasodilator medications for SSc-RP. Study A allowed participants to remain on a steady dose of calcium channel blockers (CCB), angiotensin-converting-enzyme inhibitors (ACEi) and angiotensin II receptor blockers (ARBs) administered for SSc-RP, whereas in study B, these treatments were stopped (unless being administered for other cardiovascular diseases). This resulted in significantly greater use of CCB, ACEi and ARB in study A. Indeed, all of the patients in study A were receiving CCB therapy and the majority were receiving ACEi or ARBs (71%).
Anti-platelet therapy use, meanwhile, was higher in study B (possibly indicating higher conventional cardiovascular risk factors at the US site). Despite the expected beneficial effects of permitted co-administration of vasodilator therapy and milder environmental temperatures, it is interesting to note the higher reported mean daily RCS in study A compared with study B (5.28 vs. 3.76). It is unlikely that this can be entirely explained by differences in baseline demographics (e.g. disease duration) and other biopsychosocial factors may have contributed to the overall higher burden of RP within study A compared to study B (92.3% Caucasian). The higher rate of DU disease at baseline (7/24 [29%] vs. 2/39 [5%]) suggests more pronounced digital vascular disease within the participants of study A. Recent work has confirmed a more severe disease burden (including vascular features) of SSc in African Americans compared to those of European ancestry [22] and there may be important, hitherto unexplored, ethnic differences between Caucasian and Indian patient experiences of SSc-RP.

Study findings

As summarised in Table 1, study A reported significant improvements favouring alternate-day tadalafil over placebo for all primary and secondary endpoints (with the exception of circulating biomarkers). This included objective assessment of macrovascular function using flow-mediated dilatation (FMD) and remarkable reported outcomes for DU healing and recurrence. In contrast, there was no improvement in RCS diary parameters in study B. Only two patients in study B had active DU at study entry and these ischaemic lesions were still evident at the end of the study. The most notable difference in study findings concerned the placebo effect, or absence thereof in study A. In study B, there were trends to improvement for all parameters of the RCS diary within both the tadalafil and placebo arms (with no significant difference between arms). In contrast, there was no placebo response in study A, with the RCS and average daily frequency of attacks remaining static, whereas the average aggregate duration of RP attacks actually rose from 46.3 to 54.9 min in the placebo arm. There was no period effect in study B to suggest that the placebo effect was magnified by an increase in environmental temperature in those patients who received active therapy in the initial treatment period [11]. The magnitude of the treatment effect was similar in both studies across all the RCS diary parameters, but the absence of a placebo effect (common to most clinical trials of SSc-RP) in study A appears to have been the major factor leading to an observed treatment effect in this positive study, and the reasons for this are unexplained. The authors of study A did not report any potential blinding failure but this was not formally tested [23].
It is also possible that the active treatment and placebo effects are influenced by geographic location and environmental temperature, with placebo responses less evident in studies undertaken in warmer climates (the mean daily maximum temperature in Lucknow is 26.2 °C during the month of February) [24]. Alongside the aforementioned notable differences in baseline demographics (such as disease subsets, age at enrolment, disease duration and concomitant vasodilator therapy), there may have been other factors that contributed to the contrasting findings of these studies, including participant motivation for study entry, cultural differences and issues around translation of PRO instruments for Hindi-speaking subjects in study A.

Implications for management of SSc-RP and future SSc-RP clinical trials

The use of PDEVi in the management of SSc-RP is supported by recent clinical guidelines and expert recommendations [4•, 7, 25]. These recommendations are generally based on the aforementioned pooled analysis which identified a modest treatment benefit over placebo [6••]. It is unlikely this modest treatment benefit would have been identified had the meta-analysis not included the two positive trials of alternate-day tadalafil, neither of which identified a meaningful placebo effect, a finding that has yet to be fully explained [9, 10]. Nonetheless, the extensive anecdotal evidence and personal experience of using PDEVi for RP justifies its position in existing and future SSc management recommendations. Additional work is required to better understand the placebo effect, and future clinical trials should undertake formal assessment of blinding effectiveness. Concerns have been raised about the RCS diary amongst SSc experts [26] and amongst patients [27]. Efforts are underway to devise a novel patient-reported outcome (PRO) instrument for SSc-RP that is not reliant on diary monitoring [28]. This could facilitate shorter clinical trials in SSc-RP, which may also abrogate the impact of seasonal variation in weather. It would be helpful for future studies of SSc-RP to report the impact of interventions during summer as well as winter, although there are obvious advantages to testing new treatments during colder months when SSc-RP symptoms are more pronounced [21]. A comparison of these two studies also suggests we should further explore the burden of SSc-RP across different ethnic groups. To better reflect real-life clinical practice, future clinical trials should opt to allow background stable vasodilator therapy and should consider undertaking post-hoc analyses to examine whether differences in response are influenced by disease subset, age, concomitant cardiovascular risk factors and disease duration.

Endothelin receptor antagonist therapy for systemic sclerosis-related digital ulcers

Endothelin (ET) is a potent vasoconstrictor with pro-inflammatory and proliferative effects that could exacerbate the vasculopathy of SSc. Increased circulating endothelin and overexpression of the endothelin receptors (ETA and ETB) have been demonstrated in the skin, lung and blood vessels in SSc. Bosentan and macitentan are dual ETA and ETB receptor antagonists (ERA) that have each been approved for the management of pulmonary arterial hypertension. Bosentan has also established a role in the management of SSc-DU disease following the encouraging findings of the RAPIDS-1 and RAPIDS-2 clinical trials [4•, 7, 25, 29, 30]. Macitentan was developed by optimising the structure of bosentan, with the aim of delivering improved efficacy and safety compared to other ERA therapies [31]. There are case reports of macitentan demonstrating treatment benefit in patients who have experienced DU despite bosentan therapy [32] and it was anticipated that macitentan could provide an alternative treatment option for SSc-DU disease. The DUAL-1 and DUAL-2 clinical trials, however, did not support a role for macitentan in the management of SSc-DU disease [33]. A summary of the contrasting clinical trial programmes of ERA therapy for SSc-DU is presented in Table 2. We have focussed on the larger RAPIDS-2 clinical trial of bosentan therapy and both the DUAL-1 and DUAL-2 studies of macitentan [30, 33]. Conveniently, these clinical trial programmes also adopted a similar study design, facilitating easy comparison between studies.

Table 2 Summary of DUAL and RAPIDS-2 trials examining the role of endothelin receptor antagonist therapy for digital ulcer prevention in systemic sclerosis

Clinical trial design, participant eligibility and setting

The DUAL and RAPIDS-2 clinical trials were both phase 3, randomised, double-blind, placebo-controlled, multicentre, parallel-group trials. The RAPIDS-2 study was solely conducted at European and North American sites, whereas the DUAL study enrolled patients more broadly from sites in Asia, South America and Australasia. There could have been larger geographic variation in the overall burden and pathophysiology of DU disease in the DUAL trials in comparison with RAPIDS. Both studies required the presence of a recent DU at study entry, although the DU definition differed slightly with the RAPIDS-2 trial specifying a DU size of > 2 mm and requiring DUs to be present on the volar aspect of the digit distal to the proximal interphalangeal (PIP) joint. It is likely that the DU definitions applied in the different studies resulted in differences in the types of digital lesions permitting study entry. For example, the absence of an agreed DU size in DUAL may have facilitated the inclusion of lesions that some SSc experts might consider to represent healed DU or digital pitting. There has been surprisingly poor inter-rater (amongst SSc experts) agreement as to what constitutes an active, healed or non-DU [34,35,36]. The more accommodating definition of DU in the DUAL trials may have also permitted study entry of patients with DU on the extensor aspect of the PIP that would have been ineligible for RAPIDS-2. Hachulla et al. proposed the existence of 3 distinct types of DU, with ulcers occurring over the extensor aspects of the PIPs representing microtraumatic events and traction of sclerotic skin across the fixed-flexion deformities of the digits [37••]. It is possible that such DU are less amenable to healing (or less liable to develop new DU) during the course of clinical trials of vasodilator therapy. 
Another major difference between the two clinical trial populations was prior therapy, with the DUAL study (undertaken almost 10 years after the RAPIDS-2 trial) specifying the exclusion of patients receiving (or who had recently received) bosentan therapy. With clinical practice guidelines increasingly advocating the use of bosentan in the prevention of DU, the DUAL trial is likely to have enrolled both patients with more refractory disease (some of whom may have failed earlier trials of ERA therapy) and patients with milder disease (with potential subjects choosing standard care in preference to facing the prospect of possible randomisation to a ‘no-treatment’ arm). It would be expected that many more patients, treated and responding favourably to bosentan therapy, would no longer have been considered for the DUAL trial programme as a direct result of the RAPIDS-2 trial findings and subsequent marketing authorisation. The exclusion of current smokers in DUAL will also have led to differences in the baseline demographics and likely progression in the two studies. Current smoking has been associated with more persistent DU disease [38].

Study findings

The most notable difference in the baseline characteristics of the study groups related to smoking exposure. Of note, the placebo arm of RAPIDS-2 had a higher proportion of current smokers than the active treatment arm (22% vs. 13%), which may have amplified the treatment effect. There was a higher proportion of dcSSc in the DUAL studies (55% vs. 40%). This may relate to the broader geographic participation in the DUAL trial but might again be indicative of differences in DU aetiopathogenesis in patients enrolled in the DUAL and RAPIDS-2 trials. Studies have identified a higher burden of DUs related to joint contractures in dcSSc [38, 39]. The studies were otherwise fairly well matched in terms of age, sex, disease duration and, most importantly, the burden of DU disease at baseline, with both trials reporting a mean of 3.4 DU per person at baseline (~ 65% of whom had ≤ 3 DU at baseline). Enrolment to both studies took place throughout the year with no suggestion of a seasonal effect, although South American and Asian sites included in the DUAL trial do not experience the winter weather conditions typical of North America and Europe. The major study findings are summarised in Table 2. In brief, the RAPIDS-2 trial reported a 30% reduction in the mean number of new DUs following bosentan compared with placebo in the study population (1.9 (95% CI 1.4 to 2.3) vs. 2.7 (2.0 to 3.4) new DUs, p = 0.035). Fewer new DUs were observed with bosentan than placebo in all subgroups except amongst current smokers, of whom there were proportionally more in the placebo arm (and who would have been excluded from entry to DUAL). The remaining co-primary and secondary endpoints (e.g. DU healing, proportion not experiencing new DU, mean number of new DUs per patient, hand pain, function) in RAPIDS-2 did not differ between bosentan and placebo arms.
The clear absence of healing benefit was an important reason why the FDA were not approached for approval, as they had previously emphasised that they would consider healing a key endpoint for SSc-DU trials. The DUAL trials were negative across all studied endpoints (with early termination of DUAL-2 owing to the low expected likelihood of demonstrating treatment efficacy). There is no difference in mechanism of action that might explain the contrasting findings. Indeed, macitentan is considered a more potent dual receptor antagonist than bosentan [31]. The observation period was longer in the RAPIDS-2 study (24 vs. 16 weeks), although the RAPIDS-2 trial findings largely replicated the earlier RAPIDS-1 study findings, suggesting this was not a significant contributory factor [29]. It is more likely that important differences in study populations, and specifically in the likelihood of future DU, contributed to the negative DUAL trials. In the DUAL trials, the mean number of new DU per patient in the placebo-treated arm was only 1.21, in contrast to 2.7 in the RAPIDS-2 trial. Indeed, the mean number of new DU per patient in the bosentan-treated arm of RAPIDS-2 was higher than the placebo-treated arm of DUAL (1.9 vs. 1.21). In the RAPIDS-2 trial, the difference between treatment groups was largest in patients with at least four new DU. The proportion of placebo-treated patients with no new DU in DUAL-1 was 67% compared with only 29.2% in RAPIDS-2 (although the RAPIDS trials failed to identify statistically significant differences in first new DU occurrence between treatment arms, suggesting this may not have greatly influenced the broader trial findings).
The differences in the rate of new DU in the placebo-treated arms of the two trial programmes will undoubtedly have contributed to the lack of treatment effect in the DUAL trials but also indicate significant differences in the study populations, including aforementioned factors such as smoking exposure, likely DU aetiopathogenesis, geographic variation in weather and prior treatment.
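The influence of the placebo event rate on the observable treatment difference can be illustrated with a short calculation. The sketch below uses only the mean numbers of new DU quoted above; the step that applies the RAPIDS-2 relative reduction to the DUAL placebo rate is a purely hypothetical assumption made for illustration, and is not a result reported by either trial programme.

```python
# Mean number of new DU per patient, as quoted in the text.
rapids2_placebo = 2.7
rapids2_bosentan = 1.9
dual_placebo = 1.21

# Absolute and relative reduction observed in RAPIDS-2.
rapids2_absolute = rapids2_placebo - rapids2_bosentan
rapids2_relative = rapids2_absolute / rapids2_placebo

# ASSUMPTION (illustration only): the same relative reduction is applied
# to the lower placebo event rate seen in the DUAL trials.
hypothetical_dual_active = dual_placebo * (1 - rapids2_relative)
hypothetical_dual_absolute = dual_placebo - hypothetical_dual_active

print(f"RAPIDS-2 relative reduction: {rapids2_relative:.1%}")
print(f"RAPIDS-2 absolute difference: {rapids2_absolute:.2f} new DU per patient")
print(f"Hypothetical absolute difference at the DUAL placebo rate: "
      f"{hypothetical_dual_absolute:.2f} new DU per patient")
```

Under this assumption, an identical relative benefit would translate into an absolute difference of less than half that seen in RAPIDS-2, which would be considerably harder to detect.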

Implications for management of SSc-DU and future SSc-DU clinical trials

Of the putative factors contributing to contrasting findings of the RAPIDS-2 and DUAL trials, the likely differences in DU aetiopathogenesis owing to differences in DU definition and site (possibly leading to a greater number of patients with extensor PIP DU) may have been the most important contributory factor. Ongoing issues around DU classification and poor inter-rater agreement as to what constitutes a DU are also likely to present challenges in clinical trial programmes of SSc-DU undertaken across larger numbers of units [34,35,36]. There have been recent calls for all DU to be included in clinical trials of SSc-DU [40], although there may be value in future investigators carefully considering the location and likely aetiopathogenesis of DU when designing clinical trials of vasodilator therapy. The RAPIDS-2 trial may have benefited from specifying the presence of ischaemic ‘vascular’ DU, and future clinical trials should consider adopting similar eligibility criteria. There is an urgent need for practice-based evidence examining outcomes of DU of varying aetiology to vasodilator therapies such as ERA. There are case reports of DU disease refractory to bosentan therapy responding well to alternative interventions due to the contribution of other factors such as calcinosis cutis [41]. Future clinical trials also need to consider the implications of the advances made in the management of SSc-DU. The scope for future traditional placebo-controlled trials is limited, with clinical practice guidelines consistently advocating the use of treatments such as PDEVi and ERA therapies. If placebo-controlled, future trials should consider add-on combination therapy. This will avoid the enrichment within studies of milder disease (a possible factor in the DUAL trials) as both potential subjects and investigators become increasingly dissuaded from participating in trials that risk randomisation to a ‘no treatment’ arm.
It will also better reflect likely future clinical practice. Combination therapy was permitted in the SEDUCE trial, and post hoc intention-to-treat analysis in this study revealed a faster rate of SSc-DU healing in subjects receiving combination therapy with background bosentan plus sildenafil, compared to background bosentan therapy plus placebo [42]. Similarly, efforts to design trials that will reflect routine clinical practice should allow the inclusion of smokers. Future trials must also consider recent trial findings within placebo-controlled arms when estimating sample sizes to ensure adequately powered studies given the significant variation in DU occurrence and DU healing across trials [29, 42]. There also remains the unresolved issue of what constitutes a DU and further work to improve the classification and inter-rater reliability of DU is needed [34,35,36, 43, 44]. Finally, it is worth noting that SSc-DU studies have not fully captured the patient experience of DU disease and PRO instruments could be valuable tools for demonstrating DU healing and burden in the future and facilitate a shift in emphasis away from the number of SSc-DU. The development of the Hand-Disability in Systemic Sclerosis-Digital Ulcers tool, the clinician-reported Digital Ulcer Clinical Assessment Score in Systemic Sclerosis (DUCAS) and efforts underway to devise other disease specific PRO instruments for SSc-DU disease will be valuable in this regard [28, 45, 46].
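The dependence of sample size on the expected difference between arms can be sketched with a standard normal-approximation formula for comparing two means. This is a generic illustration and not the method used by any of the trials discussed: the standard deviation (`ASSUMED_SD`) is a hypothetical placeholder, the function name is our own, and counts of new DU are unlikely to be normally distributed. The smaller difference is the hypothetical value obtained by applying the RAPIDS-2 relative reduction to the DUAL placebo rate.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(delta, sd, alpha=0.05, power=0.80):
    """Approximate participants per arm for a two-sided comparison of two
    means (normal approximation, equal variances and equal allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


ASSUMED_SD = 2.0  # HYPOTHETICAL standard deviation; not taken from any trial

# Difference in mean new DU per patient reported in RAPIDS-2 (2.7 vs. 1.9).
delta_rapids2 = 2.7 - 1.9

# HYPOTHETICAL difference if the same relative reduction applied to the
# DUAL placebo rate of 1.21 new DU per patient.
delta_low_event = 1.21 * (delta_rapids2 / 2.7)

n_rapids_like = n_per_arm(delta_rapids2, ASSUMED_SD)
n_low_event = n_per_arm(delta_low_event, ASSUMED_SD)

print(f"Per-arm sample size, RAPIDS-2-like difference: {n_rapids_like}")
print(f"Per-arm sample size, lower placebo event rate: {n_low_event}")
```

Because the required sample size scales with the inverse square of the expected difference, halving the anticipated difference roughly quadruples the number of participants needed, whatever standard deviation is assumed.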

Conclusions

Digital vascular manifestations of SSc are a major cause of disease-related morbidity, and demonstrating treatment efficacy in the management of cutaneous vascular manifestations of SSc has been a major challenge [3]. The contrasting findings of clinical trials of SSc-RP and SSc-DU for within-class medications have led to uncertainty regarding their likely efficacy, with geographic variation in reimbursement policies and clinical practice. These contrasting trials do, however, provide investigators with valuable insight into the pre-requisites of effective clinical trial design and can help to identify potential barriers for future successful clinical trials. Future clinical trials of both SSc-RP and SSc-DU should adopt eligibility criteria that reflect clinical practice (e.g. allowing co-administration of other therapies and enrolment of active smokers). Enrolment of subjects during the winter remains desirable, considering the expected influence of environmental temperature on both outcomes. Trust in the reported outcomes is essential within both fields, and the involvement of fewer centres may be advantageous for minimising geographic variation in weather and inter-rater assessment variability across sites. SSc-RP trials should test and report their blinding procedures, whereas SSc-DU trials may benefit from the use of an adjudication committee to determine DU presence and healing to overcome issues around inter-rater variability. For both SSc-RP and SSc-DU clinical trials, the emergence of novel outcome measures should facilitate the evaluation of new interventions, in the context of the advances already made in the respective fields over the last 20 years, ensuring further progress is made in the management of these debilitating manifestations of SSc.