Importance of Therapeutic Trials

While therapeutic trials are essential when seeking guidance in treating diseases, relying solely on open-label studies or case reports may be misleading because of selection bias, reporting bias, and the lack of a control group. Thus, several SSc treatments were thought to be effective until they were investigated in a randomized controlled manner [1, 2]. Progress in the development and validation of outcome measures, together with improved insights into SSc pathogenesis, has opened the door to establishing therapies in SSc through well-designed controlled trials.

Epidemiological Considerations

Status of Scleroderma as a Rare Disease

The Orphan Drug and Rare Disease Act of 1983 encourages pharmaceutical companies to develop drugs for “rare diseases”, that is, diseases that have a very low prevalence and for which drug development would otherwise lack a profit motive. In the US regulatory environment, a “rare disease” is defined as one affecting fewer than 200,000 Americans. Given that SSc falls in the category of “rare disease”, affecting about 1 in 5,000 [3], pharmaceutical companies have tax and patent incentives under the Orphan Drug and Rare Disease Act to develop drugs for this condition. In part because of the support from this legislation, there have been 23 randomized clinical trials in SSc in the last 5 years compared to seven such trials between 1980 and 1986 [4, 8, 10].
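
The arithmetic behind the "rare disease" classification can be checked with a short sketch. The prevalence (1 in 5,000) and the threshold (200,000) come from the text; the US population figure is an assumed round number used only for illustration.

```python
# Rough check of the US "rare disease" threshold for SSc.
# ASSUMPTION: the population figure is an approximate round number,
# not a value taken from the chapter.
US_POPULATION = 330_000_000
RARE_DISEASE_THRESHOLD = 200_000  # "fewer than 200,000 Americans" (from the text)
SSC_ONE_IN = 5_000                # "about 1 in 5,000" (from the text)


def estimated_cases(population: int, one_in: int) -> int:
    """Estimate the number of affected people from a '1 in N' prevalence."""
    return population // one_in


cases = estimated_cases(US_POPULATION, SSC_ONE_IN)
is_rare = cases < RARE_DISEASE_THRESHOLD
print(f"Estimated US cases: {cases:,}; below threshold: {is_rare}")
```

Under this assumed population, the estimate falls well below the regulatory threshold, which is consistent with the classification described above.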

Trial Design

Phase I–III

The principal focus of phase I trials is the safety of the tested treatment, assessed through adverse events (AEs), serious adverse events (SAEs), and/or death. Even during this phase, placebo controls are necessary because only placebo controls make it possible to determine whether a sign or symptom is a treatment-related adverse event or an SSc-related complication. Stopping rules are particularly important during this phase (although they should be included in all trials of diseases with severe consequences, such as SSc). This is because, in some circumstances, it is unacceptable to continue the tested drug in nonresponders or in patients who develop organ complications when an effective treatment for such organ involvement is available. On the other hand, it may be possible to continue a tested drug for aspects of organ involvement for which there is no known effective treatment.

Phase II trials are mainly focused on evaluating initial efficacy and establishing an appropriate dose for later trials, although safety must continue to be carefully monitored. This is also an opportunity to explore and validate clinical, laboratory, and biomarker end points. End points used for clinical trials should be practical and fully validated; in SSc, the use of surrogate outcome measurements may be more feasible in selected cases. As in phase I, there should be controls, usually placebo, to establish the true early efficacy and further safety of the drug.

Phase III trials involve more patients to establish efficacy at the chosen dose(s) and to establish the safety profile of the therapy for more common adverse events. This phase should be controlled, whether with placebo and/or positive (active) controls.

Phase IV: Although drugs are carefully tested in the above three phases before being marketed, postmarketing studies establish the profile of the drug in a more general population, further establish the therapy’s safety profile, and attempt early discovery of less common adverse events during long-term use.

Risk evaluation and mitigation strategies (REMS) are risk management strategies introduced by the Food and Drug Administration Amendments Act of 2007 (“FDAAA”), which gave the FDA the authority to request a REMS from drug companies to make sure that the benefits of a drug or biological product continue to outweigh its risks. REMS were specifically tailored to make sure that there is a favorable risk:benefit ratio in larger populations and in general use. The FDA website provides a list of currently approved REMS for drugs, including biologics [5].

Characteristics of Outcome Measurements in SSc

OMERACT (Outcome Measures in Rheumatology) is an initiative established by a group of rheumatologists, statisticians, and epidemiologists whose main objective is to improve outcome measures in rheumatology. Clinical trials in SSc should seek to evaluate outcomes in a thorough, valid manner; the OMERACT principles of truth, discrimination, and feasibility are one approach and are frequently used [6, 7]. These principles encompass feasibility; face, content, criterion, and construct validity; reproducibility/reliability; sensitivity to change; and the ability to discriminate therapy from control. They also include patient involvement and a consideration of the context of the measure (e.g., comorbidities, other medications used, cultural factors). Certain aspects of measurement validation are particularly important because they are critical to trial design and to the ability to discern treatment effects. This applies to discrimination and responsiveness to change.

Discrimination

Discriminant validity has been shown for some outcome measures in SSc clinical trials. FVC percent predicted could discriminate between cyclophosphamide-treated and placebo control groups as a measure of improvement in SSc-ILD (interstitial lung disease) [8]. Johnson et al. used Bayesian model analysis of uncommon diseases to identify MRSS as an outcome measure of skin tightness in SSc. Better mean MRSS outcomes in the MTX-treated group than in the placebo group (94 % probability) demonstrated the discriminant validity of MRSS in SSc [9]. Most recently, event-free survival was used as the outcome measure in a study of the long-term effects of treatment with hematopoietic stem cell transplantation (HSCT) vs. cyclophosphamide in SSc. Event-free survival (time from randomization until the occurrence of death or persistent major organ failure) could discriminate the significantly better survival in the HSCT group than in the control group after 4-year follow-up [10].
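
To make the event-free survival definition concrete, the following minimal sketch derives each patient's time-to-event observation from dates and then summarizes a group with the Kaplan-Meier product-limit estimator. The Kaplan-Meier method is a common choice for such outcomes, but the text does not state which method was used in [10]; the function names and the example dates are hypothetical.

```python
from datetime import date


def efs_observation(randomized, event_date, last_follow_up):
    """Return (days, event_observed) for one patient.

    event_date is the date of death or persistent major organ failure,
    or None if neither occurred before last_follow_up (censored).
    """
    if event_date is not None:
        return (event_date - randomized).days, True
    return (last_follow_up - randomized).days, False


def kaplan_meier(observations):
    """Kaplan-Meier estimate from a list of (time, event_observed) pairs.

    Returns [(time, survival_probability)] at each distinct event time.
    """
    at_risk = len(observations)
    survival = 1.0
    curve = []
    for t in sorted({time for time, _ in observations}):
        leaving = [event for time, event in observations if time == t]
        events = sum(leaving)  # True counts as 1, censored (False) as 0
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= len(leaving)  # events and censored patients leave the risk set
    return curve


# Hypothetical patient: randomized 1 Jan 2020, event on 31 Jan 2020.
print(efs_observation(date(2020, 1, 1), date(2020, 1, 31), date(2021, 1, 1)))
```

Comparing the resulting curves between treatment arms (for example, with a log-rank test) is what allows the outcome to discriminate between therapies.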

Responsiveness to Change

In SSc, several outcome measures may not show any change in RCTs. The reason for this lack of change is that most SSc disease-modification trials have been negative, although some trials showed positive change; for example, cyclophosphamide improved FVC and skin score [11]. Outcomes such as GIT 2.0, FVC, HAQ-DI, SF-36, 6MWD, MRSS, and RCS are responsive to change and were able to show some improvement in clinical trials [12–16], while others, such as oral aperture opening, handspan, and some biomarkers, did not show any change in response to treatment [17].

Overall Measures of Scleroderma

A group of SSc experts within OMERACT initiated the combined response index for SSc (CRISS) as an instrument to be used in clinical trials. In an effort to develop a single measure composed of a set of domains that reflect organ involvement, the CRISS group conducted a Delphi exercise with expert review to identify 11 core set items to be considered in SSc clinical trials: soluble biomarkers, cardiac, digital ulcers, gastrointestinal, global health, health-related quality of life and function, musculoskeletal, pulmonary, RP, renal, and skin. A prospective study to test the validity of CRISS against the OMERACT criteria is ongoing. Further revision and definition of the final set of domains will be based on the results obtained [18].

Another overall outcome measure in SSc is the European Scleroderma Study Group (EScSG) activity index, which evaluates both clinical domains and specific laboratory values, including the MRSS, DLCO, ESR, hypocomplementemia, the presence of scleredema, digital ulcers, and arthritis, and patient-reported worsening of the skin and of vascular and cardiopulmonary symptoms [19–22]. Valentini et al. evaluated the validity of the EScSG activity index; face, content, and construct validity were demonstrated [20]. Further assessment of the content and construct validity was conducted by Minier et al. [22] in a larger cohort of SSc patients. Responsiveness to change, however, has not yet been evaluated for the EScSG activity index, and further validation steps are still warranted.

Khanna et al. developed a consensus of 22 points to consider for evidence-based clinical trial design in SSc. They entail establishing standards for more uniform clinical trial design and improved selection of outcome measures; they also outlined areas where further research is warranted [23]. Outcome measures used in SSc clinical trials are listed in Table 46.1.

Table 46.1 Outcome measures used in SSc clinical trials

The Role of Surrogate Measurements

A surrogate end point is defined as a measure of a treatment effect that correlates with or reflects a change in a clinical end point. Additionally, a surrogate end point is expected to predict clinical benefit based on epidemiologic, therapeutic, or pathophysiologic evidence [152]. Scleroderma is a complex disease with high rates of morbidity and case-specific mortality [153]. However, the use of mortality as a primary outcome is often not feasible because it requires long study durations (years).

Surrogate end points are adopted as potential markers for clinically relevant outcomes and their response to therapy. Improved insights into the pathophysiologic pathways of SSc, in addition to the identification of key cellular and molecular targets, pave the way for potential organ (pathway)-specific markers. Clinically addressed outcomes usually reflect organ function or organ-related complications. Dyspnea scales and 6-min walk distance are used as surrogates for PAH [154, 155]. FVC and HRCT are surrogates for ILD progression [8, 156]. Time to clinical worsening was considered a surrogate marker of PAH worsening in a recent randomized controlled trial by Pulido et al., who assessed the effect of macitentan (a dual endothelin receptor antagonist). They reported that macitentan significantly reduced morbidity and mortality in PAH patients [157].

Gene expression signatures in the skin and peripheral blood play a major role in understanding SSc pathogenesis and in identifying potential biomarkers and therapeutic targets [158]. Gene expression signatures were tested by Milano et al.; inflammatory, proliferative, limited, and normal skin patterns were identified in a cluster analysis of intrinsic genes [159]. Changes in those intrinsic genes in response to treatment were assessed by Hinchcliff et al., and differential expression was shown in MMF-responsive patients, as judged by MRSS, in comparison to nonresponders [160]. Chung et al. showed differential gene expression in the skin of two SSc patients examined before and after imatinib treatment; they also identified an imatinib-responsive signature that was differentially expressed in dcSSc (early and late) in comparison to lcSSc and normal skin [161]. Genetic studies reveal the potential value of gene signatures as surrogate markers of fibrosis and of response to treatment in SSc patients, in addition to their contribution to the growing field of personalized translational medicine.

Measurement Error in SSc Outcomes

Demonstration of a measurable treatment effect in a clinical trial is of great importance. The application of treatments and diagnostic tests relies on the scores obtained for the measured variable. As noted above, validated measures should adhere to the OMERACT principles or a similar approach. In a study of ten rheumatologists and ten SSc patients, Pope et al. [85] found that intraobserver reliability was better than interobserver reliability for most variables examined. Czirják et al. [162] demonstrated that, with repeated teaching of rheumatologists, the coefficient of variation of the measure decreased from 54 % to 32 %, while the intraclass correlation coefficient (ICC) increased from 0.496 to the expert level of 0.722. Clinical trials in SSc thus need carefully validated and reliable measurement instruments to ensure accurate and clinically meaningful results. Further, training to reduce inter-investigator variability seems to improve the usefulness of some clinical surrogates.
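
The two reliability statistics mentioned above can be illustrated with a small sketch. It assumes a one-way random-effects ICC, often written ICC(1,1); the cited studies do not specify here which ICC form they used, and the scores below are hypothetical, not data from [85] or [162].

```python
from statistics import mean, stdev


def coefficient_of_variation(scores):
    """Coefficient of variation (%) of several raters' scores for one patient."""
    return 100 * stdev(scores) / mean(scores)


def icc_oneway(ratings):
    """One-way random-effects ICC(1,1).

    ratings: list of rows, one row per patient, one column per rater.
    ASSUMPTION: this ICC form is chosen for illustration only.
    """
    n = len(ratings)      # number of patients
    k = len(ratings[0])   # number of raters per patient
    grand_mean = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    ss_between = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_within = sum((x - m) ** 2 for row, m in zip(ratings, row_means) for x in row)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical skin scores: three patients, each scored by two raters.
hypothetical_scores = [[10, 12], [20, 19], [30, 33]]
print(round(icc_oneway(hypothetical_scores), 3))
print(round(coefficient_of_variation(hypothetical_scores[0]), 1))
```

An ICC close to 1 means that most of the variability in scores reflects true differences among patients rather than disagreement among raters, which is the direction of change reported after rater training.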

Patient Selection

Sample Size

A limitation in clinical study design in SSc is sample size: because SSc is an uncommon/rare disease, it is hard to enroll enough patients to achieve adequate statistical power. In addition, the sample size calculation depends on the expected change in a validated, clinically relevant measure chosen as the primary outcome, and the trial must enroll enough patients to detect that change. For example, an adequately powered clinical trial of cyclophosphamide versus placebo, using FVC as the primary outcome, required about 150 patients. To recruit an adequate number of SSc patients in a timely manner, multisite trial designs are often adopted. This, in turn, requires consideration of the negative aspects of multicenter design: heterogeneity among patients, increased variability in outcome measures, reduced reliability among participating sites, and high cost.
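
The dependence of sample size on the expected change and on outcome variability can be sketched with the standard normal-approximation formula for comparing two means with equal allocation, n per arm = 2(z₁₋α/₂ + z₁₋β)²σ²/δ². The inputs below are hypothetical and are not intended to reproduce the cyclophosphamide trial calculation; the helper names are illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(delta, sd, alpha=0.05, power=0.80):
    """Patients per arm to detect a mean difference `delta` (two-sided test).

    Normal approximation, equal allocation, common standard deviation `sd`.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


def inflate_for_dropout(n, dropout_rate):
    """Enrollment target so that about n patients remain after dropout."""
    return ceil(n / (1 - dropout_rate))


# Hypothetical inputs: a difference of half a standard deviation.
n = n_per_arm(delta=0.5, sd=1.0)
print(n, inflate_for_dropout(n, dropout_rate=0.20))
```

Halving the detectable difference roughly quadruples the required sample size, which is one reason why more variable or less responsive outcomes push SSc trials toward multisite designs.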

Sampling Frame

SSc is a multisystem disease with various possible phenotypes; the phenotypic variability starts with the skin, which yields two distinct SSc subtypes: limited cutaneous (lcSSc) and diffuse cutaneous (dcSSc). Pope et al. studied patients with both SSc subtypes to calculate the baseline characteristics of commonly used outcome measures and to provide parameters for sample size calculations for SSc clinical trials. Multiple baseline characteristics were significantly different in patients with diffuse SSc in comparison to patients with limited SSc, including health assessment questionnaire (HAQ) disability score, functional index, grip strength, skin score, and physician global assessment [163]. SSc trials to date have often chosen to enroll patients with diffuse cutaneous disease because the primary outcomes often chosen (e.g., skin or lung changes) change more quickly in this subtype, despite the fact that the limited subtype is more common, often 60–70 % of the SSc population [164]. This approach may change as serological subtyping becomes more clearly defined and differentiating [165] or as genetic signatures become validated as a more reliable method for subtyping on a pathogenetic basis [161]. The predominance of fibrotic and inflammatory pathways in dcSSc versus vasculopathy in lcSSc supports the dcSSc vs. lcSSc grouping. However, genotypes may differ within the same subtype, pointing to the potential for a different subgrouping [159]. The potential here, not yet proven, is that patient populations in clinical trials will have more uniform pathogenetic backgrounds and, thus, more uniform responses to appropriately targeted therapies.

Thus, patient selection at baseline has a substantial effect on the outcome measured. In SSc patients with mild to moderate ILD, dyspnea and decreased quality of life (QOL) may be minimal, so improvement with treatment may be difficult to demonstrate, which is not the case in patients with severe ILD. Similarly, a lower baseline renal function in a clinical trial may allow small changes to be discerned in order to define progression of renal dysfunction. Consequently, variability in baseline severity could influence the outcomes measured. Accordingly, careful consideration of predictable baseline differences when defining inclusion criteria for the study (e.g., disease duration, disease activity, medications) is appropriate, as is a plan to account for baseline differences during analysis.

Disease Duration

The preliminary ACR criteria for SSc, developed in 1980 [166], overlook the early stages of disease, with consequent delay in treatment. Matucci-Cerinic et al. developed a consensus for the very early diagnosis of systemic sclerosis (VEDOSS) in 2009 to detect early symptoms/signs of SSc before the evolution of full-blown SSc. They identified the presence of Raynaud’s phenomenon (RP), an abnormal capillaroscopic pattern, and abnormal laboratory values (antinuclear, anticentromere, and antitopoisomerase-I antibodies) as major criteria for VEDOSS diagnosis [167]. A Delphi exercise in 2011 also documented four symptoms/signs necessary for VEDOSS: Raynaud’s phenomenon, puffy fingers turning to sclerodactyly, specific SSc autoantibodies, and abnormal capillaroscopy with SSc pattern [168]. The importance of identifying such abnormalities early is to detect and treat the disease as early as possible, with the potential to delay progression to fully defined SSc and, perhaps, to alter the long-term course of the disease. The development of the 2013 ACR/EULAR SSc criteria [169] improved the ability to diagnose SSc patients early, yet only 44 % of the VEDOSS population fulfilled the new ACR/EULAR criteria [170]. Recently, a study by Bruni et al. [171] showed that digital lesions (ulcers and scars) were present in 26 % of 110 VEDOSS patients and correlated significantly with gastrointestinal involvement. This implies that these VEDOSS patients may have had vasculopathic aspects of SSc well before being seen and diagnosed as VEDOSS patients. It is far too early to consider using VEDOSS as a criterion for trial design, but it is possible that it will be an important consideration in the future.

Trial Design

In 1995, the ACR published guidelines for designing clinical trials in patients with scleroderma [172]. Since then, there have been significant advances in diagnostic testing, pathophysiological understanding, and treatment of the disease. Clinical trials should be designed using validated outcome measures, and the OMERACT principles can guide the use of those measures [6]. EULAR has recently put forward some points to consider when designing clinical trials in scleroderma [23]; see Table 46.2.

Table 46.2 Issues in clinical trial design

Data Analysis

Data analysis of studies is a complex and individualized process, and a complete discussion cannot be undertaken in this section. A few points to consider are:

  • Consider consulting with an expert for help with designing the trial.

  • Design of the trial and outcomes will determine how the analysis is conducted and vice versa.

  • The analysis should be prespecified before the trial starts, although exploratory analyses and work on validation of outcomes in early trials are encouraged.

  • Critical to all trials is minimizing bias by using control groups and, if at all possible, by blinding the trial and randomizing allocation.

  • Sample size and power calculations for all phase III trials will depend upon the primary outcome measure(s), treatment duration, expected responses in the groups, and desired alpha and beta levels, among other factors. However, not all studies need a power analysis (e.g., power analysis is less important for safety analyses, pharmacokinetic studies, some early phase II studies, and dose-response trials).

  • Statistical analysis for between-group comparisons should consider the probability distributions of the results (i.e., parametric vs. nonparametric methods).

  • Outcome variables should be defined, using validated measures whenever possible. The characteristics of the outcomes should be considered, as they may determine the robustness of the data when not normally distributed and the power of the statistics to discriminate among therapies. In general, for example, dichotomous measures do not have as much discriminatory power as continuous measures, which are better able to discriminate among therapies. If the continuous measures are particularly variable, however, nominal, categorical, or dichotomous measures may be preferable. The specific analyses available are myriad, ranging from simple tests of proportions, through ANOVA and generalized linear regressions with many variations, to survival analyses. This is a very important reason to consult early with your statistical colleagues.

  • Missing data, from single variables through patient dropout, are an inevitable aspect of clinical trials, and there are multiple methods of handling or imputing missing data, from simple completer analysis, through nonresponder imputation, through averaging, and through general linear equation modeling. The method should be chosen in advance.

  • Adverse event reporting is as important as reporting of benefit and should be considered before the trial begins, although the methodology of such reporting remains unsophisticated. Data safety monitoring should be considered for larger or multicenter trials.

  • Criteria for early termination of the trial and interim analysis should be prespecified, if needed.
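
As a minimal illustration of why the method for handling missing data should be prespecified, the sketch below compares two of the approaches named in the list above, completer analysis and nonresponder imputation, for a dichotomous response. The function name and the data are hypothetical; `None` marks a patient whose outcome is missing (for example, because of dropout).

```python
def response_rates(outcomes):
    """Response rate under two ways of handling missing outcomes.

    outcomes: list of True (responder), False (nonresponder), or None (missing).
    """
    randomized = len(outcomes)
    observed = [o for o in outcomes if o is not None]
    responders = sum(observed)  # True counts as 1
    return {
        # Completer analysis: patients with missing outcomes are ignored.
        "completer": responders / len(observed),
        # Nonresponder imputation: missing outcomes count as nonresponse.
        "nonresponder_imputation": responders / randomized,
    }


# Hypothetical arm of eight randomized patients, two with missing outcomes.
hypothetical_arm = [True, True, False, None, True, None, False, True]
print(response_rates(hypothetical_arm))
```

The same data give different response rates under the two approaches, so choosing the method after unblinding would open the door to bias.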

Conclusion

Clinical trials in scleroderma are inherently difficult because the disease is uncommon/rare, making recruitment problematic and requiring multisite trials; longer trials are also often needed. Partly in response to these difficulties, clinical trial methodology in SSc is evolving and has been improving. This chapter reviewed updated issues in trial design including factors such as epidemiology, phases of trial design, outcome measures, surrogate measures, patient selection, analysis, and updated guidelines for trial design.