Background

Randomized controlled clinical trials (RCTs) are typically the evidentiary basis for the approval by regulatory agencies of new pharmaceutical and device products for use in treating appropriate patients. However, in certain situations, RCTs may not be a feasible study design.

For example, RCTs may not be an option for rare diseases or other conditions where an adequate sample size is hard to obtain globally, let alone regionally [1,2,3]. RCTs may also not be a feasible study design for debilitating or life-threatening diseases with limited alternative treatment options, as it may be unethical to include a placebo or a significantly less effective comparator. Additionally, RCTs may be impractical in other disease areas with limited alternative treatment options or when early-phase clinical trials for an investigational drug have shown promise, since recruiting and retaining patients for the placebo arm could be challenging. [4]

In such cases, single-arm trials (SATs) are often used to support regulatory submissions for approval of new indications for drugs and biologics [5, 6]. In SATs, a group of individuals with the condition of interest who receive the investigational new drug or biological is followed over time to observe their response to treatment. [7]

There is established precedent for the use of SATs in regulatory submissions in the United States and European Union. These include preliminary and early phase studies of product safety, and open-label extensions of randomized Phase 2 and 3 studies [8, 9]. These studies may be submitted as supportive evidence alongside traditionally well-controlled trials such as RCTs.

Under certain limited circumstances, SATs may be submitted as pivotal evidence for determination of efficacy and safety for approval. When serving as the basis for approval, SATs may use an external control arm (ECA) to mitigate methodologic and statistical concerns arising from the lack or inadequacy of an enrolled comparator group [10]. An SAT with an ECA has a “control group that consists of patients who are not enrolled as part of the single-arm trial, i.e., there is no concurrently randomized control group” [11]. External control arm data may come from numerous sources, including past clinical trial data or real-world data (RWD) sources such as registries or natural history studies, electronic health records (EHRs) or administrative claims. [12, 13]

Recent studies have reported widespread use of Real World Evidence (RWE) in FDA submissions and EMA applications for marketing authorization [14, 15]. Regulatory acceptance of submissions using single-arm designs and external control arms has increased, concordant with an increase in submissions for rare disease and gene therapy products [14, 16]. Applications, along with Health Technology Assessments (HTAs) and regulatory agency assessments of SATs, have been examined in oncology [17, 18]. However, there are limited resources to guide the design and analysis of non-cancer programs to improve the likelihood of regulatory acceptance. Applications of SATs in non-oncology contexts may have a different likelihood of regulatory acceptance. Yet, studies that examine submissions in rare diseases and other non-oncology indications fail to identify specific methodologic and other features of the intervention and study design that led to regulatory success. [11]

As such, we developed a framework for considerations in SAT strategies and ECAs that may affect the likelihood of regulatory success. Our framework helped identify key types of submissions that may face greater regulatory challenges: novel approvals for first indications. We then reviewed all FDA and EMA approvals from 2019 to 2022 that used SATs as pivotal evidence for first indications of new molecules and biologicals to identify and understand the common factors associated with regulatory acceptance. Since SAT and ECA approaches are documented within oncology, our review focused on non-oncology approvals for first indications. [18, 19]

Methods

Development of Framework

We developed a framework to understand the regulatory acceptability of an SAT strategy in multiple phases, drawing from a narrative literature review, interviews across disciplines of drug development, and the experience of our core team, who have over 40 years of experience in drug development and regulation. Systematic phased focus groups were conducted in a large pharmaceutical company with senior leaders in epidemiology, statistics, regulatory affairs, clinical science, and clinical pharmacology. The first focus groups probed where single-arm studies could be used, supplemented with a narrative review of the literature and regulatory guidance documents. This was followed by focus groups targeting different medical and regulatory considerations that could impact potential acceptance of SATs.

Our framework differentiates between the diverse types of SATs, including supportive SATs such as pediatric extrapolations and SATs submitted as pivotal evidence. The framework was used to identify the types of submissions that we expected to face the most regulatory challenge, which informed the scope of our study. As a first test of the framework, we applied it to a subset of approvals from 2019 to 2022 to understand regulatory responses to novel submissions outside of oncology. Reviewing responses to other aspects of the framework was not in scope for this study. We then used the regulatory and medical considerations listed in Fig. 1 of the framework and the data and methodological considerations listed in Fig. 2 of the framework to guide the key outcomes for abstraction.

Figure 1

A framework for determining the likelihood of regulatory acceptance of a Single Arm Study. Footnote: The top of the figure shows regulatory decisions for which acceptance of a single-arm trial (SAT) is believed to be less likely (on left), with increasing likelihood for decisions moving to the right of the graph. The bottom left part of the figure reflects considerations that may increase the likelihood of regulatory acceptance of SATs, while the right side shows those that may decrease the likelihood of SAT acceptance, depending on the regulatory decision. The scope of this study is novel approvals, indicated on the far left end of the spectrum.

Figure 2

A framework for determining the likelihood of regulatory acceptance of an External Control Arm. Footnote: Data considerations and methodological considerations that are less likely (left of graph) or more likely (right of graph) to lead to regulatory acceptance of an external control arm.

Selection of FDA Approvals and EMA Authorisations

We identified all FDA approvals and EMA authorisations from 2019 to 2022 for which at least one Phase 2 or 3 SAT was submitted as pivotal evidence. FDA CDER approvals were identified from the Compilation of CDER NME and New Biologic Approvals 1985–2022 from the Drugs@FDA website, and CBER approvals were identified from the Biological Approvals by Year page on the CBER website. EMA authorisations were identified from the table of European public assessment reports (EPARs) for all human and veterinary medicines, automatically excluding all veterinary products and products for which authorization status was not “Authorised”.

Using the information on indication available in the EMA and FDA databases, products were screened to exclude oncologic and similar indications, such as radiologic, non-malignant tumor, and pre-cancer indications. To keep the review focused predominantly on therapeutics, we excluded blood products (such as blood typing reagents), molecules designed for imaging, and diagnostic assays. Finally, we excluded vaccines as a special case.

Four products for which reviews described a single-arm trial as pivotal (Oxlumo, Xenpozyme, Oxbryta, and Voxzogo) were excluded, because the purpose of these SATs was to extrapolate efficacy and safety data to pediatric populations at the same time as the original submission for approval. Although pediatric extrapolation studies are included in the framework, they did not meet the criteria for this review, which examined only the original approval for a first indication.
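The screening steps described above can be read as a sequence of filters. The following is a minimal sketch only, not the procedure actually used (screening in this review was performed by reviewers reading the indication text). The `Product` record, its field names, and the category labels are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical category labels; the review screened indications manually.
EXCLUDED_CATEGORIES = {
    "oncology", "radiologic", "non_malignant_tumor", "pre_cancer",
    "blood_product", "imaging_agent", "diagnostic_assay", "vaccine",
}


@dataclass
class Product:
    name: str
    agency: str                     # "FDA" or "EMA"
    approval_year: int
    category: str                   # hypothetical reviewer-assigned label
    veterinary: bool = False
    status: str = "Authorised"      # EMA status field; assumed "Authorised" for FDA rows
    pediatric_extrapolation_only: bool = False


def in_scope(p: Product) -> bool:
    """Return True if the product passes every screening step."""
    if not 2019 <= p.approval_year <= 2022:
        return False
    if p.veterinary or p.status != "Authorised":
        return False
    if p.category in EXCLUDED_CATEGORIES:
        return False
    # SATs used only to extrapolate to pediatric populations were excluded.
    return not p.pediatric_extrapolation_only


def screen(products):
    """Keep only the products that remain in scope, preserving input order."""
    return [p for p in products if in_scope(p)]
```

Encoding each exclusion as a separate condition keeps the rules auditable one at a time, which mirrors how a flowchart of inclusion and exclusion criteria is typically reported.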

Identification of Submissions with Single-Arm Trials

Individual review documents for each product with a relevant indication were examined to determine whether at least one phase 2/3 or phase 3 SAT was included in the submission. For FDA approvals, clinical and statistical review documents were reviewed where available. Otherwise, integrated or multi-discipline review documents were used. For EMA authorisations, the “Clinical aspects”, “Clinical efficacy”, and “Clinical safety” sections of the product’s EPAR were reviewed. In documents for both agencies, where available, any comprehensive table of clinical studies submitted was examined to determine if the body of evidence submitted for review included any Phase 2 or 3 SAT. In cases of ambiguous study design or phase, any study cited in the approval documentation was cross-referenced with results from a search of ClinicalTrials.gov, confirming that a study was an SAT if the listed intervention model was “Single-group assignment”.

Exclusion of Submissions with Solely Supportive Single-Arm Trials

Approvals identified to have included SATs were then assessed to determine whether the SAT was used as pivotal or supportive evidence. In documents for both agencies, any comprehensive table of clinical studies submitted was examined. While all FDA and EMA approvals presented table(s) of clinical studies used for the approval, they differed in how the review presented pivotal vs. supportive evidence. In some cases, review documents included tables that distinguished “pivotal” or “primary” evidence from supportive evidence. In these cases, any study not listed as supportive was considered pivotal. In other cases, the review strategy described in prose which trials were purely supportive and which were pivotal. If there was only one Phase 2 or 3 trial listed at all for efficacy, it was considered pivotal. Where possible, explicit text was used from approval documentation that described each SAT as either pivotal or supportive. Otherwise, if the study was described in the approval documentation or ClinicalTrials.gov as an open-label extension (OLE) or long-term extension (LTE) of a controlled trial, it was classified as supportive or non-pivotal evidence. If the study was ongoing at the time of submission, it was classified as non-pivotal evidence unless the review text explicitly stated that an ongoing study with a pre-specified data cutoff point was used as pivotal evidence. Finally, submissions for which a pediatric extrapolation study was submitted at the same time as the submission for the first indication in an adult population were excluded.
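The classification rules above amount to a short decision sequence. The sketch below is illustrative only: the `Study` fields are hypothetical, the order in which the rules are checked is our assumption (the text does not state a strict precedence), and the "unclassified" outcome is an added placeholder for studies that would need manual review.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Study:
    # "pivotal" or "supportive" when the review states it explicitly, else None
    explicit_label: Optional[str] = None
    only_phase2_3_efficacy_trial: bool = False
    is_extension: bool = False            # OLE or LTE of a controlled trial
    ongoing_at_submission: bool = False
    pediatric_extrapolation: bool = False


def classify(s: Study) -> str:
    """Apply the rules in an assumed order of precedence."""
    if s.pediatric_extrapolation:
        return "excluded"
    # Explicit review text takes priority, including for ongoing studies
    # that the review names as pivotal with a pre-specified data cutoff.
    if s.explicit_label in ("pivotal", "supportive"):
        return s.explicit_label
    if s.only_phase2_3_efficacy_trial:
        return "pivotal"
    if s.is_extension or s.ongoing_at_submission:
        return "supportive"
    return "unclassified"  # hypothetical fallback: refer to manual review
```

For example, `classify(Study(explicit_label="pivotal", ongoing_at_submission=True))` returns `"pivotal"`, reflecting the stated exception for ongoing studies that the review text explicitly treats as pivotal.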

A primary reviewer identified SATs in FDA and EMA submissions. An additional reviewer cross-checked between the FDA and EMA approvals of the same products to assess any discrepancies.

Document Search and Data Abstraction

Regulatory documents were evaluated using a pre-specified data abstraction template developed from our framework. Data were abstracted from the same regulatory documents used to identify single-arm trials (clinical and statistical review documents for FDA approvals and EPAR documentation for EMA authorisations). Again, for FDA approvals, if clinical and statistical review documents were not available for data abstraction, clinical sections of integrated or multi-discipline reviews were used. Key information abstracted included:

  1. General submission and approval information, including details on product and indication, agency and center (if applicable), date of approval or authorization, and any orphan and/or priority designation
  2. Information on totality of pivotal evidence submitted (i.e. whether relevant SAT/SATs were sole pivotal evidence or submitted alongside other traditionally well-controlled studies)
  3. Agency reviewer responses (critiques and positive assessments) to submission of pivotal SAT(s), including methodological or statistical issues and any information on how therapeutic context influenced acceptability of study design; corresponding to Step 1 of the framework (Fig. 1)
  4. Information on external control arms, including data source for control arm and details on RWD used, if applicable
  5. Agency reviewer responses (critiques and positive assessments) to submission of external control arm, if applicable, including methodological or statistical issues and any information on how therapeutic context influenced acceptability of study design; corresponding to Step 1 of the framework (Fig. 2)
  6. Labeling information, including whether SAT and/or external control arm (where applicable) was mentioned in product labeling (FDA) or package leaflet (EMA)

A full list of abstracted fields and some variable definitions are included in Supplementary Appendix Table A2.
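To make the shape of such a template concrete, the six categories of abstracted information could be represented as a structured record. This is a hedged sketch, not the template used in this review: every field name below is hypothetical, and the sketch covers only a handful of illustrative fields rather than the full list in the supplementary appendix.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class AbstractionRecord:
    # 1. General submission and approval information
    product: str
    indication: str
    agency: str                              # "FDA" or "EMA"
    center: Optional[str] = None             # e.g. "CDER" or "CBER"; None for EMA
    approval_date: Optional[str] = None      # ISO date string
    orphan_designation: bool = False
    priority_designation: bool = False
    # 2. Totality of pivotal evidence
    sat_sole_pivotal_evidence: Optional[bool] = None
    # 3. Reviewer responses to the pivotal SAT(s)
    sat_critiques: List[str] = field(default_factory=list)
    sat_positive_assessments: List[str] = field(default_factory=list)
    # 4. External control arm
    eca_data_source: Optional[str] = None    # None when no ECA was used
    # 5. Reviewer responses to the ECA
    eca_critiques: List[str] = field(default_factory=list)
    eca_positive_assessments: List[str] = field(default_factory=list)
    # 6. Labeling
    sat_in_labeling: Optional[bool] = None
    eca_in_labeling: Optional[bool] = None


# Placeholder values only; not data from the review.
record = AbstractionRecord(
    product="Example product",
    indication="example indication",
    agency="EMA",
)
row = asdict(record)  # flat dict, convenient for writing to a spreadsheet
```

Using `None` for fields that are not applicable (for example, ECA details when no external control was submitted) keeps "not applicable" distinct from "no", which matters when tabulating proportions later.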

Results

Of 482 FDA and EMA product approvals from 2019 to 2022, 37 approvals (20 FDA and 17 EMA) were identified as non-oncology approvals that included a pivotal SAT in the submission (Table 1; Fig. 3). For these 37 approvals, we abstracted data for 101 fields (Supplementary Appendix Table A2). In both FDA and EMA approvals, the majority of applicants utilized SATs as the sole pivotal efficacy evidence (Table 2). Characteristics of the SATs were largely similar between FDA and EMA approvals, except for the inclusion of SATs in patient-facing product labeling. Among FDA approvals, 18 of 20 (90%) mentioned the use of or findings from the pivotal SATs in product labels for clinicians and patients. These approvals were for the drugs Amvuttra, Pyrukynd, Enjaymo, Nextstellis, Nulibry, Imcivree, Zokinvy, Pretomanid, Vyondys 53, Egaten, Skysona, Zynteglo, Rethymic, Ryplazim, Xembify, Zolgensma, Asceniv, and Esperoct. Meanwhile, only 1 of the 17 EMA-approved applications (6%), Esperoct, mentioned an SAT as evidence in the package leaflet.

Table 1 Summary of Included Products and Key Characteristics
Figure 3

Inclusion and exclusion criteria flowchart for selection of 37 FDA and EMA approvals into final analysis.

Table 2 Summary of Characteristics of Pivotal Evidence in Included Submissions

Two products, Amvuttra (EMA and FDA) and Egaten (FDA), were atypical in their uses of single-arm trial data. Amvuttra was approved by the FDA and EMA in 2022 to treat polyneuropathy of hereditary transthyretin-mediated amyloidosis in adults. While this application included a randomized controlled trial, pivotal efficacy evidence functionally came from a single-arm trial, since only the investigational arm of the RCT was compared to a historical control arm from a previous trial for the primary endpoint analysis [21, 22]. Egaten was approved by the FDA in 2019 to treat fascioliasis in patients 6 years of age and older. One randomized controlled trial was submitted for this product; however, a single-arm trial compared to a historical control arm was also used as pivotal evidence. One arm of a study evaluating two randomized arms of different doses of the experimental medication was compared to an active control arm of a different past trial. Determination of efficacy was made by the totality of pivotal evidence, which included single-arm data with historical control [38]. In each situation, one investigational arm of an RCT was compared to a historical control arm to generate pivotal efficacy evidence, so both Amvuttra and Egaten were included.

ECA and comparator arms were characterized for each application (Table 3). Real-world data (RWD) ECAs were utilized by a sizable proportion of both FDA-approved (45%) and EMA-approved (47%) applicants. Strikingly, no applications used exclusively claims data or EHRs as a form of RWD in an ECA. Instead, all of the RWD ECAs were based on registries or natural history (NH) controls. Some of these NH studies could have utilized EHRs to populate case report forms. A similar proportion of FDA and EMA-approved proposals (35%) compared SAT results to a non-patient level aggregate benchmark. The use of baseline-controlled participants, in which participants are compared to their own values prior to intervention, was common in applications submitted to both agencies. The use of a historical control group from a prior controlled trial was much more common in FDA approvals (20%) than EMA approvals (6%).

Table 3 Summary of Characteristics of External Controls/Comparators

Factors in the aforementioned framework may have contributed to the approvals of applications using SATs as pivotal evidence (Table 4). The most common justifications for approval in this context were medical conditions with an established natural history and no spontaneous improvement (condition progressively deteriorates and does not improve without treatment, seen in over 80% of FDA and EMA approvals), and conditions with either no effective therapies or limited standard of care options (seen in over 80% of FDA and EMA approvals).

Table 4 Agency and Regulator Responses to Submission of Pivotal Single-Arm Trial

This pattern of approvals in rare conditions with no expected spontaneous improvement and limited standard of care options is exemplified in the EMA approval for the drug Upstaza, indicated for the treatment of patients aged 18 months and older with a clinical, molecular, and genetically confirmed diagnosis of aromatic L-amino acid decarboxylase (AADC) deficiency with a severe phenotype (Box 1). [51]

Discussion

We reviewed all approvals for first indications for non-oncology applications based on SAT strategies submitted to FDA and EMA to summarize common factors. Briefly, we found that regulatory approvals primarily occurred in contexts involving rare diseases characterized by limited or insufficient standard of care options and a notable unmet medical need in debilitating or life-threatening conditions. When implemented, external control arms (ECAs) most frequently were derived from natural history studies, both retrospective and prospective. Criticisms of ECAs commonly revolved around issues such as an imbalance between the ECA and trial arm, leading to confounding. Additionally, concerns were raised about outcome ascertainment bias resulting from measurement errors or subjective endpoints, along with data quality issues attributed to missing data, potentially introducing selection bias.

Patient-facing labels occasionally referenced single-arm trials, suggesting their relevance in the context of communication to patients. FDA approvals more frequently included information on pivotal single-arm trials and ECAs in product labeling than did EMA approvals. This may be due to differences between the agencies in the provision of product information to patients and providers. FDA’s package insert serves as a label for both healthcare professionals and patients, while the EMA provides a separate summary of product characteristics (SmPC) for providers that differs from the patient-facing package leaflet or label [59, 60]. While reviewing EMA SmPC documents was not in scope for this review, previous studies have found these documents to contain detailed information on clinical trials. [61, 62]

External control arms (ECAs), frequently drawn from registries or natural history studies, played a key role in providing context to single-arm trials. Notably, our analysis found no ECAs that explicitly described using EHR or claims data. In the absence of these data sources, submissions often relied on natural history studies to provide the necessary context. While many NH studies did use patient-level healthcare data through retrospective chart reviews, none mentioned EHR data explicitly. Reviews did not explicitly state whether such chart reviews used EHRs or whether EHRs were used to populate case report forms. Verification was not possible due to differing timelines of transition from paper to electronic records across health systems. Nevertheless, our finding is consistent with the documented gap in the availability of research-grade real-world data (RWD) for use as external control arms in rare diseases [63]. This scarcity poses a challenge in utilizing external control arms from electronic health records or claims data for non-oncology trials, particularly in rare diseases, where they can contribute essential contextual information to the evaluation of single-arm trials. These observations deepen our understanding of the regulatory landscape surrounding single-arm trials and highlight the challenges associated with the choice of external control arms, as well as the reliance on natural history studies for context. As clinical trials increasingly utilize EHR data for various purposes and methodological approaches continue to evolve for implementing EHRs as external control arms, EHRs may see increased use as ECAs in non-oncology indications. [64,65,66]

The findings of the current study align with several past findings from other reviews of regulatory submissions using RWD and/or external control arms [11, 17, 18]. Like Jahanshahi et al., we found that single-arm trials met greater regulatory acceptance in the context of rare diseases. Similarly, Jaksa et al.'s examination of the influence of external control arms (ECAs) on regulatory decisions and the importance of data quality corresponds closely with our identification of criticisms related to imbalances between ECAs and trial arms, outcome ascertainment bias, and data quality issues. Izem et al.'s focus on the contextualization of single-arm trials using real-world data (RWD) aligns with our study's emphasis on the rare disease context and the utilization of natural history studies as common external controls.

Our results are also consistent with studies noting an upward trend in the use of RWE in regulatory submissions in both the United States and European Union. In a review of NDA submissions to the FDA from 2019 to 2022, Purpura et al. found a substantial increase in single-arm trials, reported that regulators often flagged issues with endpoint objectivity, and emphasized the need for increased guidance for assessing single-arm trials as fit for regulatory submission and approval [14]. In our study, we found that the majority of products approved with pivotal SATs had objective endpoints and/or large expected effect sizes. This is in alignment with Vaghela et al.’s recent systematic review of FDA-approved non-oncology orphan drug therapies that used RWD, which found increased regulatory acceptance of RWD studies demonstrating a large effect size [67]. Our finding that natural history studies constituted all RWD-based external control arms aligns with an earlier study of EMA authorizations and FDA approvals; Flynn et al. found that registries were the most commonly used data source in 2018 and 2019. [15]

Our findings align with the recent FDA guidance, which supports the use of externally controlled trials in rare diseases with well-defined natural histories and limited treatment options. Both our findings and the FDA guidance highlight the importance of high-quality patient-level data. Similar to the concerns raised in the guidance, our study noted significant critiques regarding the comparability of ECAs. The high proportion of FDA approvals mentioning SAT data in product labeling mirrors the agency’s emphasis on transparency in presenting efficacy evidence. While the EMA does not currently have dedicated guidance on externally controlled trials, the ICH E10 guideline on control groups discusses external controls, emphasizing the necessity for appropriate methodological approaches to ensure the validity and reliability of the efficacy data [68, 69]. The 2001 EMA guideline takes a more cautious stance than the FDA despite similar numbers of approvals between agencies in this review. This may suggest a need for updated guidance on externally controlled trials that reflects current European regulatory perspectives. In the absence of updated EMA guidance, and given the relative recency of FDA guidance on SATs and ECAs, our framework provides a useful and succinct summary of key considerations that is consistent with the present regulatory landscape.

Despite the many strengths of this review, there were some limitations. First, while the development of our framework was a phased process conducted with expert input and focus groups, we did not conduct systematic reviews or structured interviews to guide its creation. Instead, we chose to test the framework with a systematic approach for novel approvals in non-oncology, as they pose potentially the greatest regulatory challenges to SAT submissions. Further testing of other aspects of this framework is needed to better understand regulatory considerations in other types of submissions.

We were unable to compare results from approvals to applications that were ultimately rejected by the FDA, because these are not publicly available. While the EMA does publish reports on authorization applications that were refused or withdrawn, this was not in scope for this study. Within our study scope, we are unable to pinpoint why certain applications using SATs as pivotal evidence were approved, while others may not have been. In addition, we based our review on unstructured medical and statistical reviewer comments, and some factors may not have been mentioned despite being relevant. Additionally, we did not collect data on the history of communications between the applicant and agency. Future studies would benefit from more detailed monitoring of communications between parties to determine whether aspects in the communication between agency and applicant influence the likelihood of a new drug or biologic application being approved.

We were also limited by differences between how the EMA and FDA review drug applications. The EMA appeared to publish approvals more consistently than the FDA, and EMA approval documents maintained a uniform format, making the analysis of European approvals more systematic. In some instances, the agencies also classified evidence differently. For example, FDA reviews described single-arm pediatric extrapolation studies submitted at the time of the first indication as pivotal evidence, while the EMA considered these studies supportive. Due to these differences, we elected to remove studies from our analysis that used single-arm trials exclusively to extrapolate to pediatric patients. Discrepancies both within and across agencies in how trials were presented in review documentation may also have led to bias in the identification of pivotal vs. supportive trials.

Lastly, our analysis had a somewhat limited scope. Our exclusion of oncologic indications may limit the generalizability of these results. Most applications covered conditions that were rare (including orphan drugs), had significant unmet medical need, and lacked effective standard of care (SoC) options. Thus, it is difficult to assess if and how single-arm trials and ECAs could be employed for conditions that are more common and have acceptable, if not ideal, SoC therapies. While traditional RCTs could be used to study new therapies for common conditions, it can be difficult to recruit for RCTs if the control arm is not as effective as treatments that are already available. A smaller control arm in the RCT along with a well-constructed ECA could be beneficial, improving the efficiency and shortening the duration of trials to provide patients faster access to effective medicines. We restricted our review to initial indications, excluding supplemental applications and label expansions. Future studies should consider including other time windows, therapeutic areas, and submission types to determine if the findings are consistent.

Despite the limitations, this review is the first to directly assess regulatory responses to specific features of single-arm trials submitted as pivotal evidence for product approvals and authorizations. Our study offers the first comprehensive examination of how regulators respond to submissions employing these designs beyond the realm of oncology. This departure from the oncology-focused analyses is particularly robust for two reasons: a) single-arm trials tend to encounter greater regulatory acceptance in oncology submissions, necessitating a distinct evaluation for other therapeutic areas, and b) the landscape of available data for comparison and context differs significantly outside of oncology. Our pre-specified systematic methodology involved scrutinizing all approvals within a specified timeframe, to identify single-arm trials submitted to support approval in filings without RCTs. This methodological approach allowed us to meticulously sift through an extensive volume of regulatory data, to provide a comprehensive understanding of the regulatory landscape surrounding single-arm trials across various therapeutic domains.

Our results are consistent with the medical, regulatory, methodological, and data quality factors identified to affect regulatory acceptance of SATs in our framework. In a fast-evolving regulatory landscape in the United States and Europe, our framework provides a summary that is useful in early stages of drug development, allowing stakeholders to understand potential regulatory critiques that they may face in using a single-arm study as pivotal evidence in non-oncology approvals.

Conclusion

Based on recent FDA and EMA approvals, the likelihood of regulatory success for SATs with ECAs appears to depend on many design, analytic, and data quality considerations. Our framework is useful in early drug development to guide discussion when considering single-arm trial strategies for evidence generation.