Introduction

The modern landscape of oncology therapeutics has changed rapidly, shifting from traditional cytotoxic chemotherapy to molecularly targeted agents and immunotherapy drugs. The range of targetable genetic alterations has expanded from the 2001 landmark discovery of targeting BCR-ABL in CML [1] to practice-changing drugs that target genetic aberrations such as EGFR [2], ALK [3], BRAF [4, 5], HER2 [6], and KIT [7]. Immunotherapy has also brought profound changes to treatment patterns in an ever-broadening group of tumor types, starting with melanoma [8] and non-small cell lung cancer (NSCLC) [9], and expanding to recent approvals in head and neck, bladder, gastric, hepatocellular, and mismatch repair-deficient tumors. Alongside the development of these drugs, rapid uptake of next-generation sequencing promises to bring tumor profiling to the majority of patients and to enlarge the pool of genomically stratified tumors. However, the new era of oncology drugs brings a new set of challenges for early clinical trial development. These include how to determine the optimal dose for a targeted or immunotherapy agent, how to design and implement genomically defined trials, and how to expedite the approval of effective drugs for patients. In this review, we will discuss the evolution of early phase clinical trials in oncology.

Dose-finding strategies

Rule-based designs

In phase 1 clinical trials, the primary goal is to establish safety and tolerability, and to define the maximum tolerated dose (MTD) to use as the dose in a phase 2 trial (recommended phase II dose, RP2D). Once the experimental dose levels have been chosen, dose escalation strategies, principally rule-based and model-based designs, use observed toxicity to guide dosing. Rule-based designs are by far the most common and employ a simple up-down sequential dose escalation scheme. The standard up-down design is the 3 + 3 method as described by Storer in 1989 [10] (Fig. 1a). A review of 1235 cancer phase 1 trials conducted between 1991 and 2006 showed that over 98% used simple up-down methods while only 1.6% used adaptive Bayesian designs [11]. The main advantage of rule-based designs is their simplicity. They do not require special statistical support and are easy to understand and implement. However, there are considerable drawbacks to this approach. First, this is a conservative strategy that may treat many patients at low or sub-therapeutic doses, a significant problem in trials treating patients with advanced cancers. Second, statistical simulations have repeatedly shown that there is a low probability of actually finding the true MTD, especially if there are many dose levels [12]. Many modifications to the 3 + 3 design have been proposed to ameliorate some of its shortcomings. Accelerated titration designs (ATD) counter the slow escalation by allowing single-patient cohorts along with rapid dose escalation at the early dose levels (Fig. 1b). When moderate toxicity or dose-limiting toxicities (DLTs) occur, the design switches to the more conservative 3 + 3. Simulations have shown that ATD significantly reduces the number of patients who are under-dosed and enables faster completion of phase 1 trials [13].
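The 3 + 3 rules are simple enough to simulate directly. The Python sketch below assumes one commonly described version of the rules: escalate after 0 of 3 or 1 of 6 DLTs, stop after 2 or more DLTs in a cohort of 3 or 6, and declare the next lower dose the MTD. The function names and the true DLT probabilities are hypothetical and are not taken from the cited studies; real protocols often add further rules, such as requiring six patients at the declared MTD.

```python
import random
from collections import Counter

def three_plus_three(true_dlt_probs, rng):
    """Simulate one 3 + 3 trial (simplified variant, no de-escalation).

    Returns (mtd_index, patients_per_dose). mtd_index is -1 when even the
    lowest dose level is declared too toxic.
    """
    n_treated = [0] * len(true_dlt_probs)
    level = 0
    while True:
        # Treat a cohort of three patients and count DLTs.
        dlts = sum(rng.random() < true_dlt_probs[level] for _ in range(3))
        n_treated[level] += 3
        if dlts == 1:
            # Exactly one DLT: expand to six patients at the same dose.
            dlts += sum(rng.random() < true_dlt_probs[level] for _ in range(3))
            n_treated[level] += 3
        if dlts >= 2:
            # Dose too toxic: the MTD is the next lower level.
            return level - 1, n_treated
        if level == len(true_dlt_probs) - 1:
            # Highest planned dose was tolerated.
            return level, n_treated
        level += 1

def mtd_selection_frequencies(true_dlt_probs, n_trials=5000, seed=0):
    """Estimate how often each dose level is declared the MTD."""
    rng = random.Random(seed)
    counts = Counter(
        three_plus_three(true_dlt_probs, rng)[0] for _ in range(n_trials)
    )
    return {level: counts[level] / n_trials for level in sorted(counts)}

if __name__ == "__main__":
    # Hypothetical true DLT probabilities for five dose levels.
    print(mtd_selection_frequencies([0.05, 0.10, 0.20, 0.35, 0.50]))
```

Running the simulation over different assumed toxicity curves gives a feel for how the selected dose is spread across dose levels and how many patients are treated at the lowest ones.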

Fig. 1

Description of three different dose escalation designs. Green box represents a patient that does not experience any toxicity. Yellow box represents mild toxicity. Red box represents moderate to severe toxicity counting as a DLT. a 3 + 3 design. b Accelerated titration design (ATD). c Continual reassessment method

Model-based designs

In contrast to rule-based designs, model-based (or Bayesian) designs allow for rapid dose escalation and use data from all patients to determine the MTD. One of the most widely used Bayesian designs in practice is the continual reassessment method (CRM) described by O’Quigley in 1990 [14] (Fig. 1c). In this method, a pre-specified dose-toxicity curve has a slope that is continuously updated as patient toxicity (or non-toxicity) data accrue. Once the total allotted number of patients has been accrued, the final shape of the dose-toxicity curve is used to determine the MTD. The main advantage of this method is that it is more efficient than the 3 + 3 design, with more patients being treated at, or near, the MTD level. In addition, toxicity data from all patients treated on study are used to guide dose escalation decisions, as opposed to the 3 + 3 design, where only the most recent patient cohort counts. Although model-based designs have many advantages, rule-based methods still dominate the landscape of phase 1 oncology clinical trials. This has been attributed to the difficulty in obtaining adequate statistical and computational support for Bayesian methods as well as general inertia in the medical community [15].
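The updating step of the CRM can be illustrated with a short Python sketch. It assumes a one-parameter power model, p_i(a) = skeleton_i ** exp(a), with a normal prior on a, evaluated on a grid; this is one common formulation rather than necessarily the one used in any cited trial. The skeleton values, the prior standard deviation, the 25% target, and the function name are all illustrative assumptions.

```python
import math

def crm_recommend(skeleton, doses_given, dlt_observed,
                  target=0.25, prior_sd=1.34):
    """One CRM update on a grid (illustrative power model).

    skeleton     : prior guesses of DLT probability per dose, each in (0, 1)
    doses_given  : dose index received by each patient so far
    dlt_observed : 1 if that patient had a DLT, else 0
    Returns (recommended_dose_index, posterior_mean_dlt_probs).
    """
    grid = [-4 + 8 * i / 400 for i in range(401)]  # values of parameter a
    weights = []
    for a in grid:
        log_w = -0.5 * (a / prior_sd) ** 2  # unnormalised normal prior
        for d, y in zip(doses_given, dlt_observed):
            p = skeleton[d] ** math.exp(a)
            log_w += math.log(p) if y else math.log(1 - p)
        weights.append(math.exp(log_w))
    total = sum(weights)
    # Posterior mean DLT probability at every dose, using all patients' data.
    post_mean = [
        sum(w * s ** math.exp(a) for a, w in zip(grid, weights)) / total
        for s in skeleton
    ]
    # Recommend the dose whose estimated toxicity is closest to the target.
    best = min(range(len(skeleton)), key=lambda i: abs(post_mean[i] - target))
    return best, post_mean

if __name__ == "__main__":
    skeleton = [0.05, 0.12, 0.25, 0.40, 0.55]  # hypothetical prior curve
    # Three patients at dose index 2, one of whom had a DLT.
    dose, estimates = crm_recommend(skeleton, [2, 2, 2], [0, 1, 0])
    print(dose, [round(p, 3) for p in estimates])
```

Every observed outcome reshapes the whole curve, which is why the method can use all patients' data rather than only the latest cohort. Practical implementations usually add safety constraints, such as forbidding skipped dose levels, that are omitted here.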

Beyond using toxicity as dose determination

Traditional phase 1 designs utilize toxicity-based dosing with the underlying assumption that the mechanism resulting in toxicity is similar to the mechanism resulting in antitumor efficacy, a valid approach with conventional cytotoxic chemotherapy. However, in the current era of targeted and immunotherapies, this assumption has been brought into question and concepts such as establishing the “optimal biologic dose” have been proposed. A retrospective study of 683 patients (97.7% of whom received targeted agents) enrolled in phase 1 trials at MD Anderson showed no difference in efficacy outcomes when comparing patients who were assigned to low dose (≤ 25% MTD), medium dose (25–75% MTD), or high dose (≥ 75% MTD) therapy [16]. Furthermore, in the pembrolizumab phase 1 dose escalation study, the maximum specified dose of 10 mg/kg was reached without difficulty, no DLTs were observed, and responses were seen at all doses beyond 1 mg/kg [17]. However, multiple translational models analyzing ex vivo patient PBMCs, target receptor occupancy, and pharmacokinetic/pharmacodynamic (PK/PD) data converged on 2 mg/kg as the lowest dose with the highest likelihood of clinical efficacy [18, 19]. Therefore, despite 10 mg/kg being the maximum administered dose (MAD) in the toxicity framework, 2 mg/kg every 3 weeks became the dose that was explored further in expansion studies and eventually became the Food and Drug Administration (FDA)-approved dose for melanoma in 2014 [20].

Using pharmacodynamic endpoints for first-in-human trials formed the basis for “phase 0” trials conducted under the US Food and Drug Administration’s Exploratory Investigational New Drug guidance [21, 22, 23]. In the phase 0 trial of veliparib (ABT-888), an inhibitor of poly ADP-ribose polymerase (PARP), 13 patients received a single dose of veliparib at 10, 25, or 50 mg and subsequently underwent blood sampling and tumor biopsies to evaluate on-target inhibition of PARP activity. Adequate drug exposures were achieved and statistically significant reductions in PAR (poly ADP-ribose) levels were observed at 3–6 h post-drug administration at 25 and 50 mg. These results were obtained within 5 months of study activation, establishing proof of principle for obtaining human pharmacokinetic (PK) and pharmacodynamic (PD) data earlier in the drug development process in order to guide dosing and scheduling for subsequent trials. Establishing the optimal biological dose (OBD) in early phase trials requires the development of validated assays and the standardization of sample handling and processing procedures prior to the initiation of first-in-human trials [24].

Another potential issue with novel agents is the time frame over which adverse events are cataloged. Traditionally, toxicities encountered during the first cycle have formed the basis for dose escalation decisions in phase 1 trials. However, delayed and cumulative toxicities may be observed, and certain toxicities, such as immune-related adverse events, may be slow to resolve; this could lead to unexpected de-escalations later on in the trial [25]. In addition, for immunotherapy agents, there may not be a linear relationship between dose and development of immune-related adverse events [26]. Other non-dosing-related factors such as a history of autoimmune disease, HLA type, or other genetic risk factors may play an important role [27]. The challenge for future dose-finding trial designs is to take these factors into consideration and to select an optimal dose for patient care safely and efficiently.

Seamless expansion strategy

Conventional phases of clinical drug development are divided into: safety/tolerability and dose finding (Phase I), preliminary efficacy in a specific disease type (Phase II), and eventually, the randomized comparison to an existing standard of care (Phase III) (Fig. 2a). Only once a drug meets its primary endpoint in a phase III trial is it FDA approved, with the clinical development timeline taking 7–10 years. In 2014, the FDA approval of pembrolizumab for melanoma marked a significant departure from this traditional sequential approach. Not only did it usher in the era of checkpoint inhibition and immunotherapy across tumor types, but it also exemplified the process for expedited drug development. It took 3 years from the time the first-in-human phase 1 trial started in 2011 until accelerated FDA approval in 2014. The KEYNOTE-001 phase 1 study initially enrolled ten patients with advanced solid tumors using the traditional 3 + 3 design. When antitumor activity was observed in the melanoma (PR/CR in four of seven) and NSCLC (stable disease in four of seven) cohorts, expansion cohorts were rapidly added for these disease types [17]. In the melanoma arm, multiple variables were tested including dose levels, varying cycle lengths, various prior therapy cohorts (ipilimumab-naïve, treated, and refractory patients), and randomization cohorts. In all, a total of 655 melanoma patients were enrolled and treated, with 173 serving as the basis for initial accelerated approval of pembrolizumab in ipilimumab-refractory melanoma. In the NSCLC arm, a companion diagnostic for PD-L1 expression was trained and validated throughout the multiple cohorts, ultimately identifying a tumor proportion score (TPS) of ≥ 50% as necessary for optimal efficacy. Data from the NSCLC cohorts eventually led to the 2015 accelerated approval by the FDA of pembrolizumab in PD-L1-positive NSCLC that has progressed on other treatments. In total, one trial/protocol enrolled and treated over 1200 patients, established efficacy in two different tumor types, and validated a biomarker assay in the span of under 4 years. This seamless expansion strategy is illustrated in Fig. 2b.

Fig. 2

Traditional sequential phase versus seamless expansion trial design. a In the traditional sequential phase trial, the phase I trial uses a dose escalation design in many different tumor types to find the maximum tolerated dose for the subsequent phase II trial (RP2D). The phase II trial tests single-arm efficacy in a single tumor type. The pivotal phase III trial is a randomized controlled trial versus standard of care. b In the seamless expansion design, one trial establishes the RP2D, enrolls additional expansion cohorts testing multiple hypotheses and patient subgroups, and uses the data from these cohorts for submission to the FDA

Despite the quick turnaround from bench to bedside, this design has inherent issues that prevent it from being generalizable to all new drugs. First, there was a high degree of protocol complexity with multiple protocol amendments (nine in total) and multiple expansion cohorts testing different hypotheses in various tumor types and drug doses simultaneously. These intrinsic complexities can increase the potential for protocol violations at different trial sites and create adherence issues for patients. Second, incorporating an entire drug development program into a single, continuously updated trial risks missing critical milestones that are normally built into a sequential design. For instance, safety concerns and efficacy data that arise from a phase 1 trial are subsequently incorporated into the informed consent for the phase 2 trial, allowing for adequate patient education on the risks and benefits of the experimental drug. Also, key guidance meetings and detailed reviews by the regulatory agency that occur at standard timepoints may be missed. Third, clear rationale and explicit statistical plans are not defined for the multiple expansion cohorts, making it possible for patients to continue to enroll in subgroups that are not showing significant efficacy. Most drugs evaluated in phase I trials do not ultimately prove efficacious, and safeguards must be in place to inform and protect patients. An in-depth review of 60 phase 1 trials at Dana-Farber Cancer Center submitted in 2011 showed that only 13.3% had formal power calculations justifying the sample size, while 60% had no statistical justification of sample size or endpoint assessment [28]. In 2016, members of the FDA published an editorial on their concerns about the rapid uptake of the seamless expansion design (over 40 active investigational new drug applications) in phase I oncology trials [29]. They proposed limiting the seamless expansion strategy to drugs that have been designated as “breakthrough therapies”. This would allow the rapid development of clinically promising drugs while still ensuring a high degree of regulatory collaboration and oversight.

Next-generation genomic trial designs

The advent of next-generation sequencing has brought an unprecedented wealth of information and promising treatment options for patients. The holy grail of precision oncology is to sequence a patient's tumor, find the corresponding driver mutations, and treat the patient with a specific inhibitor of the altered gene, leading to clinical benefit. With success thus far in various tumor-type-specific gene-drug combinations (BCR-ABL, EGFR, ALK, etc.), the thought is that broader and deeper sequencing will reveal a mutational landscape that allows targeted treatment irrespective of tumor type. The two main trial designs that have evolved out of this approach are the basket and the umbrella trials.

Basket trials

In the basket design, a single drug is used to target a specific genetic alteration in a variety of different tumor types (Fig. 3a). One of the first publications of a basket trial was the vemurafenib trial in BRAF V600-mutated non-melanoma patients [4]. Results showed marked heterogeneity in response that depended upon histology, with good responses seen in NSCLC (ORR 42%) and Erdheim-Chester disease (43%), but minimal responses in colorectal cancer (CRC, 0%) and cholangiocarcinoma (12%). Thus, the conclusion of the study was that histologic context still needs to be considered for certain prevalent mutations. BRAF inhibition in NSCLC was explored further in a subsequent study testing dual BRAF/MEK inhibition in metastatic BRAF V600E-mutated NSCLC, which found an impressive ORR of 63.2% with median progression-free survival of 9.7 months [5]. These data led to the FDA approval of this combination in BRAF V600E-mutated NSCLC in June 2017 [30].

Fig. 3

Basket versus umbrella trial design. a In a basket trial, a targeted therapy (purple pill) is tested on patients with a specific genetic mutation (purple diamond) across a variety of tumor types. b In an umbrella trial, different tumor types (organ icon linked with patient color) are tested for specific genetic mutations (purple diamond, brown triangle, yellow star). These mutations are then sorted into independent groups and treated with a matched inhibitor (purple, brown, yellow pill)

There have been emerging success stories in targeting histology-agnostic genetic alterations. Recently, clinical benefit has been reported with larotrectinib in patients with a variety of solid tumors carrying neurotrophic receptor tyrosine kinase (NTRK) fusions. In a phase I/II study of 55 adult and pediatric patients across 17 unique tumor types, larotrectinib produced an impressive ORR of 76% with median PFS not yet reached after 7.7 months median follow-up [31]. For immunotherapies, high tumor mutational burden has been shown to correlate with clinical response [32]. Given the extraordinarily high mutational burden seen in mismatch repair-deficient (MMR-D) tumors, multiple studies were undertaken to investigate the potential effectiveness of checkpoint blockade, both in CRC and in other tumors with MMR-D. One study of 41 patients (11 MMR-D CRC, 21 MMR-proficient CRC, 9 MMR-D non-CRC) found marked responses in the MMR-D cohort with an ORR of 40% in MMR-D CRC and 71% in MMR-D non-CRC, compared with 0% in MMR-proficient CRC [33]. In addition to the high objective response rate, observed responses were durable, with a 20-week immune-related PFS of 67–78% in the MMR-D patients. This and four other single-arm clinical trials led to the May 2017 FDA approval of pembrolizumab in mismatch repair-deficient cancers, the first tissue/site-agnostic approval [34].

As more effective drugs targeting specific genetic alterations are developed, patients will need comprehensive sequencing in order to find potentially rare variants that are actionable. The potential issues include the resources required, ensuring that sequencing panels incorporate all known actionable genes, and ensuring that providers are aware that therapies exist for these genetic alterations. Fusions can be particularly problematic as breakpoints are frequently in intronic regions and all potential breakpoint regions must have adequate sequencing coverage in order to capture the actual alteration. Another potential issue is that rare variants are, by nature, difficult to find in patients and assembling a reasonable sample size for a clinical trial can be challenging. One solution is to aggregate different mutation-drug combinations into larger pathway-defined arms, as has been done in umbrella trials.

Umbrella trials

The umbrella design takes patients from a single tumor type or a variety of tumor types, pairs them with a pre-specified drug for their actionable genetic alteration, and runs these drug-gene combinations as independent, parallel cohorts within one large trial (Fig. 3b). Prominent ongoing examples of umbrella trials are the NCI-MATCH, with over 30 gene-drug pairs enrolling in over 1000 study locations, and the ASCO TAPUR, with 16 gene-drug pairs enrolling in 101 study locations. One large, multi-histology umbrella trial has been published thus far, with negative results [35]. The SHIVA trial was a randomized phase II trial run at eight French academic centers in which patients were genomically matched to ten regimens aggregated into three molecular pathways (hormone receptor, PI3K/AKT/mTOR, and RAF/MEK) and randomized to matched therapy versus standard of care. Seven hundred and forty-one patients were screened, with 293 (40%) having at least one molecular alteration in the pre-specified target group. There was no difference in median PFS between the matched and control groups (2.3 vs 2.0 months). When broken down by individual pathway, there were still no differences between the matched and control groups, although small sample sizes potentially masked significant effects.
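The assignment logic of an umbrella trial can be sketched in a few lines of Python. The alteration-to-arm table, arm names, and patient records below are entirely hypothetical and do not correspond to the arms of NCI-MATCH, TAPUR, or SHIVA; real protocols also specify priority rules for patients with more than one actionable alteration, which this sketch reduces to "first match in the patient's list".

```python
from collections import defaultdict

# Hypothetical protocol table: actionable alteration -> treatment arm.
ARM_FOR_ALTERATION = {
    "BRAF V600E": "arm_A",
    "NTRK fusion": "arm_B",
    "PIK3CA mutation": "arm_C",
}

def assign_arms(patients):
    """Group patient IDs into parallel cohorts by matched arm.

    patients maps a patient ID to the list of alterations found on
    sequencing. Patients with no actionable alteration go to "unmatched".
    """
    cohorts = defaultdict(list)
    for patient_id, alterations in patients.items():
        arm = next(
            (ARM_FOR_ALTERATION[a] for a in alterations
             if a in ARM_FOR_ALTERATION),
            "unmatched",
        )
        cohorts[arm].append(patient_id)
    return dict(cohorts)

if __name__ == "__main__":
    example = {
        "P1": ["BRAF V600E"],
        "P2": ["TP53 mutation"],
        "P3": ["KRAS G12C", "NTRK fusion"],
    }
    print(assign_arms(example))
```

Each resulting cohort is then analyzed as its own parallel sub-study, so the size of the "unmatched" group directly limits how many screened patients contribute to any arm.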

This trial has prompted considerable debate as to the future of precision umbrella trials as well as concerns over the SHIVA trial’s particular design. First is the issue of the balance of genomic alterations that are represented and obtaining adequate enrollment of matched patients. In the SHIVA trial, 716 patients underwent tumor sampling, 496 (70%) had satisfactory tissue for genomic profiling, 293 (59%) of these patients had a targetable alteration, and 195 (39%) were able to be randomized. However, when analyzing the distribution of targeted mutations, it becomes clear that there is a significant skew, with 42% of the alterations occurring in the hormone receptor pathway (AR/ER/PR), 19% being PIK3CA activating mutations, and 24% being PTEN inactivations. All other mutations occurred in fewer than 5% of patients. These uneven distributions can make it difficult to identify less common but effective drug-gene matches, because pairs with small sample sizes are diluted out. The NCI-MATCH interim analysis reported issues with efficient patient matching despite the rapid accrual of patients [36]. Although 739 patients were enrolled and tested in just 7 months, only 56 (9%) were able to be matched to therapy, and only 16 actually started treatment (2.5%). This was well below the target estimated mutation matching rate of 30%. The investigators responded by expanding to 24 arms in late May 2016, and then up to 30 arms.
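The attrition described for SHIVA can be recomputed from the quoted counts. The Python sketch below is only arithmetic on the numbers given in the text; the function name and step labels are illustrative. Note that the first step works out to roughly 69%, which the text rounds to 70%, and that the quoted 39% for randomization uses the 496 profiled patients as its denominator.

```python
def funnel(steps):
    """Return (label, count, pct_of_previous_step, pct_of_first_step) rows."""
    rows = []
    first = steps[0][1]
    prev = first
    for label, count in steps:
        rows.append((label, count, 100 * count / prev, 100 * count / first))
        prev = count
    return rows

# Counts quoted in the text for the SHIVA trial.
shiva = [
    ("tumor sampled", 716),
    ("profiled", 496),
    ("targetable alteration", 293),
    ("randomized", 195),
]

if __name__ == "__main__":
    for label, count, step_pct, overall_pct in funnel(shiva):
        print(f"{label:<22}{count:>5}{step_pct:>8.1f}%{overall_pct:>8.1f}%")
    # Randomized patients as a share of those successfully profiled.
    print(f"randomized / profiled: {100 * 195 / 496:.1f}%")
```

Viewed cumulatively, only about a quarter of the patients who underwent tumor sampling reached randomization, which illustrates why enrollment of matched patients is a central design concern.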

Second is the question of the effectiveness of single-agent targeted therapy in heavily pre-treated patients. Under the selective pressure of multiple therapies, tumors evolve and acquire new mutations and sub-clonal populations [37]. In addition, there can be marked genetic differences between a primary tumor and its corresponding metastases, making a single biopsy of one site inadequate for determining cancer driver mutations [38]. With patients enrolling in umbrella trials after progressing through multiple prior lines of therapy, the clonal heterogeneity of tumors may be too complex to be treated with single-agent targeted therapy alone. On the other hand, bringing matched therapy trials to earlier line settings may be unethical if there are still standard of care options with proven benefit. Further research is needed to elucidate the genomic state of a tumor and how best to exploit its alterations for clinical benefit.

Conclusion

In the era of genomic sequencing and immunotherapy, there is great optimism in the field of oncology about finding effective treatments for patients faster than at any previous time. The crucial questions that the next generation of early phase trials will need to address are the adequacy of determining dose by toxicity, the appropriate incorporation of seamless expansion designs, and the advances in genomic and other biomarker-driven strategies. Also important will be ensuring that a solid regulatory framework exists to maintain high standards of patient safety while ensuring expedited development of promising anticancer agents. The new frontier of early phase oncology trials represents an important and exciting time for researchers, providers, and patients alike.