4.1 Introduction

Development of a scientifically rigorous study design that addresses the study questions is a key component of clinical trials in oncology. The development of the study design is a collaborative effort between clinical investigators and biostatisticians. An appropriate study design is critical to the success of a clinical trial: it facilitates the ability of the trial to answer the study questions; it defines the number of patients required for enrollment to answer these questions and the duration of the study; and, in early phase studies, it enables the identification of the desired dose to move into subsequent studies (e.g., maximum tolerated dose (MTD); recommended Phase II dose (RP2D)). In early stage pediatric oncology studies, the identification of a safe dose is critically important, as this dose may vary from adult doses. In this chapter, we introduce Phase I clinical trial designs from three categories based on statistical assumptions (Table 4.1): rule-based/algorithm-based designs; model-based designs; and model-assisted designs (Wei et al. 2019). This chapter will also discuss the advantages and disadvantages of these study designs in pediatric oncology studies.

Table 4.1 Summary of study designs

The starting dose in adult oncology studies is calculated based on nonclinical safety data in animal models (Shen et al. 2019). However, most pediatric clinical trials are conducted after the completion of adult Phase I clinical trials, which can provide historical data for the determination of the starting dose (Smith et al. 1998). Typically, pediatric oncology studies start at approximately 80% of the adult MTD/RP2D (Marsoni et al. 1985; Lee et al. 2005). Starting at this level avoids doses that are too low and potentially ineffective, which would expose children to toxicity without a likelihood of benefit, and it shortens the dose escalation period. However, for drugs with no MTD in adults, consideration can be given to starting at the adult RP2D, as determined in adult, single-agent studies. Pediatric oncology studies starting at the adult RP2D would be designed with plans for dose de-escalation, if necessary.

4.2 Rule-Based Designs

4.2.1 Traditional 3 + 3 Design

The most traditional and straightforward design in clinical trials is the 3 + 3 study design (Storer 1989). The detailed procedure is described in the FDA Guidance for Industry Clinical Considerations for Therapeutic Cancer Vaccines (FDA 2011) and is summarized as follows:

  1. Three patients are enrolled into a pre-defined starting dose level:

    (a) If no DLT is observed in these three patients, then escalate the dose level, and enroll three additional patients into the new dose level. Move to Step 2.

    (b) If only one DLT is observed in these three patients, then stay on the same dose level, and enroll three additional patients in the current dose level:

      • If no DLT is observed in these three additional patients (i.e., one DLT in six patients), then escalate the dose level, and enroll three additional patients into the new dose level. Move to Step 2.

      • If one or more DLTs are observed in these three additional patients (i.e., ≥2 DLTs in six patients), then stop the trial and choose the previous dose level as MTD.

    (c) If more than one DLT is observed in these three patients, then stop the trial and choose the previous dose level as MTD.

  2. Repeat Step 1 for each new dose level.
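The decision logic in the steps above can be sketched as a small lookup function. This is only an illustrative sketch: the function name and the action labels ("escalate", "expand", "stop") are hypothetical and are not part of the FDA guidance or of the original design paper.

```python
def three_plus_three_decision(n_treated: int, n_dlt: int) -> str:
    """Return the 3 + 3 action for the current dose level.

    n_treated is 3 (initial cohort) or 6 (expanded cohort);
    n_dlt is the number of patients with a DLT at this dose level.
    The returned labels are hypothetical names used for illustration.
    """
    if n_dlt < 0 or n_dlt > n_treated:
        raise ValueError("n_dlt must lie between 0 and n_treated")
    if n_treated == 3:
        if n_dlt == 0:
            return "escalate"  # Step 1(a): no DLT in three patients
        if n_dlt == 1:
            return "expand"    # Step 1(b): enroll three more at this dose
        return "stop"          # Step 1(c): MTD is the previous dose level
    if n_treated == 6:
        # At most one DLT in six patients allows escalation;
        # two or more DLTs stop the trial.
        return "escalate" if n_dlt <= 1 else "stop"
    raise ValueError("the 3 + 3 rules are defined only for 3 or 6 patients")


# Walk through the outcomes discussed in the text.
for n, d in [(3, 0), (3, 1), (6, 1), (6, 2), (3, 2)]:
    print(f"{d}/{n} DLTs -> {three_plus_three_decision(n, d)}")
```

Note that the function needs only the counts at the current dose level, which is exactly the "memory-less" property of the 3 + 3 design.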

This study design has all the advantages of rule-based designs. The design structure is simple to understand, and it is convenient for clinical operational teams to implement since the parameters for dose escalation are clearly defined. However, this commonly used study design has significant disadvantages and constraints. Statistically, it can only estimate the MTD when the target probability of DLT is between 20% and 33% (Le Tourneau et al. 2009). The 3 + 3 design is “memory-less” since each decision is based only on the most recently enrolled three (or six) patients. This design also requires many escalation steps with doses that may be too low to be effective, leading to suboptimal treatment (if the drug is, in fact, effective) for a large number of patients. It is also difficult to predict the final sample size since all that is known is the cohort size (three or six patients) at each dose level. Another concern with the 3 + 3 study design in pediatric oncology studies is that it has significant operational limitations, since most of these studies are multicenter clinical trials (Doussau et al. 2016).

However, in both adult and pediatric oncology studies, the conservative 3 + 3 design is still one of the most popular and commonly used study designs despite criticism of its inefficiency and underestimation of MTD.

4.2.2 Accelerated Titration Design

The accelerated titration design was proposed by Simon et al. (1997). This study design extends the traditional 3 + 3 design to reduce the number of patients enrolled at lower dose levels by adding an initial accelerated phase before the start of a 3 + 3 study design phase.

During the initial accelerated phase, there is only one patient per dose level cohort enrolled at lower dose levels until the pre-defined stopping rules for DLTs or toxicities are met. After the initial accelerated phase, a traditional 3 + 3 design is implemented with cohorts of three to six patients at the higher dose levels.

Simon et al. performed simulations based on four Phase I designs (Simon et al. 1997): Design 1 was a traditional 3 + 3 design, while Designs 2, 3, and 4 were accelerated titration designs with different assumptions (Table 4.2). The simulation results showed that the average number of required patients in a Phase I trial was reduced from 39.9 for Design 1 to 24.4, 20.7, and 21.2 for Designs 2, 3, and 4, respectively.

Table 4.2 Summary of phase I simulations by Simon et al. (1997)

Therefore, the accelerated titration design has the advantage of requiring fewer patients, especially when there are many dose levels planned. Some accelerated titration designs also allow for intra-patient dose escalation during the initial accelerated phase to further reduce the number of patients and provide some patients the opportunity to receive the investigational agent at higher dose levels (Ivy et al. 2010). This has the potential to make the accelerated titration design more attractive to adult patients enrolled in first-in-human (FIH) Phase I trials. However, in pediatric oncology clinical trials, the accelerated titration design is not compelling since most starting doses are based on the MTD in adult trials and can be too low when compared to the true MTD.

4.2.3 Rolling Six Design

To shorten the timeline of pediatric oncology Phase I trials, Skolnik et al. (2008) proposed the rolling six design, another rule-based design that extends the 3 + 3 study design.

In the rolling six design, two to six patients can be enrolled continuously on the same dose level. The escalation and de-escalation rules can be summarized as follows:

  1. If 0/3, 0/4, 0/5, 0/6, or 1/6 patients are observed with DLTs, then escalate to the next dose level.

  2. If 0/2, 1/2, 1/3, 1/4, or 1/5 patients are observed with DLTs, then stay on the same level, and enroll more patients up to a total of six.

  3. If two or more DLTs are observed, then de-escalate the dose level.

  4. If six patients have been enrolled on the current dose level, no escalation/de-escalation decision will be made until at least five of those six patients have completed the DLT period.

The dose level assigned to a new patient is based on the following three components:

  1. The number of patients currently enrolled and evaluable

  2. The number of patients experiencing a DLT

  3. The number of patients at risk of experiencing a DLT
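The escalation and de-escalation rules summarized above can be sketched as a function of the DLT count and the number of evaluable patients at the current dose level. This is a simplified sketch under stated assumptions, not the full decision table of Skolnik et al. (2008): the function name is hypothetical, two or more DLTs are treated as the trigger for de-escalation, outcomes not listed in the rules (0/1 and 1/1) are assumed to mean "stay", and the waiting rule for a fully enrolled cohort of six is not modeled.

```python
def rolling_six_decision(n_evaluable: int, n_dlt: int) -> str:
    """Return the rolling six action given DLT data at the current dose.

    n_evaluable: patients at this dose who have completed DLT evaluation
                 or experienced a DLT (at most six).
    n_dlt:       patients at this dose with a DLT.
    Assumption: two or more DLTs trigger de-escalation.
    """
    if not 0 <= n_dlt <= n_evaluable <= 6:
        raise ValueError("need 0 <= n_dlt <= n_evaluable <= 6")
    if n_dlt >= 2:
        return "de-escalate"
    if n_dlt == 0 and n_evaluable >= 3:
        return "escalate"      # 0/3, 0/4, 0/5, 0/6
    if n_dlt == 1 and n_evaluable == 6:
        return "escalate"      # 1/6
    # 0/2, 1/2, 1/3, 1/4, 1/5 (and, by assumption, 0/1 and 1/1)
    return "stay"


for n, d in [(3, 0), (6, 1), (2, 0), (4, 1), (4, 2)]:
    print(f"{d}/{n} DLTs -> {rolling_six_decision(n, d)}")
```

Because a decision is available with as few as two or three evaluable patients, enrollment does not have to pause while a full cohort completes the DLT period, which is the source of the shorter study duration.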

In 1000 study simulations performed by Skolnik et al. (2008), the average (±standard deviation) study duration was 294 (±75) days for the rolling six design versus 350 (±84) days for the traditional 3 + 3 design. This design successfully shortens the study duration for pediatric oncology studies in situations where there is prior information about the adult starting dose (Le Tourneau et al. 2009). Since this study design was specifically developed for pediatric oncology clinical trials, it has been increasingly implemented over the last decade (Doussau et al. 2016).

4.3 Model-Based Designs

4.3.1 Continual Reassessment Method (CRM)

One of the earliest model-based designs using Bayesian statistics, the continual reassessment method (CRM), was proposed by O’Quigley et al. (1990). This predated the accelerated titration and rolling six designs. It was later evaluated in a simulation study of pediatric Phase I oncology clinical trials by Onar-Thomas and Xiong (2010).

The CRM is an example of an adaptive design: a Bayesian statistical model is used to fit a dose-toxicity curve and to find the dose (e.g., MTD) with the toxicity rate closest to the target rate. The target DLT rate is fixed at the beginning of the study, and only one patient is required for each dose level or cohort. The prior distributions for the parameters of the dose-toxicity curve are chosen based on historical data. Dose escalation decisions can then be made by investigators and biostatisticians based on the whole posterior distribution of toxicity at each dose, which is updated as DLT information accumulates. The dose level recommended for the next patient is the one that minimizes the difference between its probability of toxicity and the target toxicity rate.

There are three essential steps in a CRM study (Zhou et al. 2018):

  1. Assume a parametric model for the dose-toxicity curve, such as a power model:

    $$ p_j = \alpha_j^{\exp(\alpha)}, $$

    where $p_j$ denotes the true DLT probability of dose level $j$, $\alpha$ is the unknown parameter, and $0 < \alpha_1 < \cdots < \alpha_J < 1$ are prior guesses for the DLT probability at each dose.

  2. Update the estimate of the dose-toxicity curve based on the accumulating DLT data across all dose levels, and assign the next cohort of patients to the “optimal” dose, defined as the dose whose posterior mean estimate of the DLT probability is closest to the target DLT probability.

  3. Apply rules that forbid skipping doses, together with safety stopping rules.
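The three steps above can be sketched numerically for the power model. The sketch below is illustrative only and rests on assumptions not stated in this chapter: a normal prior on the unknown parameter with mean 0 and variance 1.34, posterior integration on a simple grid, a no-skipping rule that allows escalation by at most one level, and a hypothetical four-dose skeleton. The function names are also hypothetical.

```python
import math


def crm_posterior_means(skeleton, n_treated, n_dlt,
                        prior_sd=math.sqrt(1.34), n_grid=2001):
    """Posterior mean DLT probability per dose under p_j = skeleton_j ** exp(alpha).

    Assumption: alpha ~ Normal(0, prior_sd ** 2); the posterior is
    approximated on an equally spaced grid over +/- 5 prior SDs.
    """
    lo, hi = -5.0 * prior_sd, 5.0 * prior_sd
    step = (hi - lo) / (n_grid - 1)
    eps = 1e-12  # keeps log() finite when a probability underflows
    weights, curves = [], []
    for i in range(n_grid):
        alpha = lo + i * step
        probs = [min(max(s ** math.exp(alpha), eps), 1.0 - eps) for s in skeleton]
        log_w = -0.5 * (alpha / prior_sd) ** 2            # log prior (unnormalized)
        for p, n, y in zip(probs, n_treated, n_dlt):      # binomial log likelihood
            log_w += y * math.log(p) + (n - y) * math.log(1.0 - p)
        weights.append(math.exp(log_w))
        curves.append(probs)
    total = sum(weights)
    return [sum(w * c[j] for w, c in zip(weights, curves)) / total
            for j in range(len(skeleton))]


def crm_next_dose(post_means, target, current):
    """Dose index closest to the target, escalating by at most one level."""
    highest_allowed = min(current + 1, len(post_means) - 1)
    return min(range(highest_allowed + 1),
               key=lambda j: abs(post_means[j] - target))


skeleton = [0.05, 0.10, 0.20, 0.30]          # hypothetical prior guesses
post = crm_posterior_means(skeleton, [3, 0, 0, 0], [0, 0, 0, 0])
print([round(p, 3) for p in post], "next dose index:", crm_next_dose(post, 0.25, 0))
```

The estimate at every dose changes after each cohort, even at doses that have not yet been tested, because all doses share the single parameter of the curve.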

One advantage of the Bayesian CRM is that the target DLT rate can be chosen flexibly, unlike in the traditional 3 + 3 design. The design also allows for a more precise estimate of the MTD, so more patients can be treated at a potentially therapeutic dose level. In a comparison of simulations among study designs for pediatric oncology Phase I clinical trials, the CRM has also been found to be more efficient than two algorithm-based methods (3 + 3 and rolling six) and to reduce the number of skipped children (Doussau et al. 2012).

However, as with all Bayesian models, justification of the prior distributions used in the CRM design is critical. Incorrect assumptions will expose patients to a risk of overtreatment. With respect to operational considerations, it is more complicated to constantly update the posterior distribution based on accrued DLT information after each cohort. This requires timely collaboration between statisticians, investigators, and the clinical operational team throughout the dose escalation period.

The CRM design has still not been widely utilized in clinical trials, but various modifications have been made to improve its performance, including the escalation with overdose control (EWOC) design in 1998 (Babb et al. 1998), the time-to-event continual reassessment method (TITE-CRM) in 2000 (Cheung and Chappell 2000), an adaptive CRM design called TriCRM in 2006 (Zhang et al. 2006), a Bayesian-based extension of TriCRM in 2007 (Mandrekar et al. 2007), the Bayesian logistic regression model (BLRM) in 2008 (Neuenschwander et al. 2008), and time-to-event escalation with overdose control (TITE-EWOC) in 2011 (Mauguen et al. 2011). In the following sections, we will introduce the EWOC, BLRM, TITE-CRM, and TITE-EWOC study designs.

4.3.2 EWOC and BLRM

EWOC and BLRM are similar when compared to the CRM. They both assume a Bayesian two-parameter logistic regression model for dose-toxicity curves that actively control for the risk of overdosing. But these two designs use two different definitions to estimate the optimal dose.

The EWOC selects the optimal dose by selecting the highest dose whose posterior probability of being higher than the MTD is equal to or less than a pre-specified threshold, such as 25% or 30%. The EWOC was applied in a Phase I dose escalation study of oral gefitinib and irinotecan in children with refractory solid tumors that was published in 2014. In this study, the pre-specified threshold to control the overdosing risk was set at 30% (Brennan et al. 2014).

The BLRM defines the optimal dose as the dose that has the highest posterior probability of being within a pre-specified dosing interval (δ1, δ2). Another feature of the BLRM is that dose skipping is not allowed.

In conclusion, in order to find the optimal dose with lower risk of overdose, the EWOC puts a constraint on the probability, while the BLRM puts a constraint on the dose directly.
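The contrast between the two selection rules can be sketched with posterior draws of the DLT probability at each dose. This is a sketch under assumptions: the draws are a small hypothetical set rather than output from a fitted logistic model, the function names are hypothetical, the dosing interval is read as an interval on the DLT probability, and "being higher than the MTD" is evaluated as the DLT probability exceeding the target, which presumes a monotone dose-toxicity curve.

```python
def ewoc_dose(draws_by_dose, target, threshold=0.25):
    """Highest dose index whose posterior overdose probability is <= threshold.

    draws_by_dose[j] holds posterior draws of the DLT probability at dose j.
    Returns None if no dose satisfies the overdose constraint.
    """
    best = None
    for j, draws in enumerate(draws_by_dose):
        p_over = sum(d > target for d in draws) / len(draws)
        if p_over <= threshold:
            best = j
    return best


def blrm_dose(draws_by_dose, delta1, delta2):
    """Dose index with the highest posterior probability of lying in (delta1, delta2)."""
    in_interval = [sum(delta1 < d < delta2 for d in draws) / len(draws)
                   for draws in draws_by_dose]
    return max(range(len(in_interval)), key=in_interval.__getitem__)


# Hypothetical posterior draws for three dose levels (not real trial data).
draws = [
    [0.05, 0.08, 0.10, 0.12],
    [0.15, 0.20, 0.22, 0.35],
    [0.25, 0.30, 0.40, 0.50],
]
print("EWOC choice:", ewoc_dose(draws, target=0.30, threshold=0.25))
print("BLRM choice:", blrm_dose(draws, delta1=0.20, delta2=0.35))
```

With these hypothetical draws the two rules choose different doses, which illustrates that a constraint on the overdose probability and a criterion based on the target interval need not agree.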

4.3.3 TITE-CRM and TITE-EWOC

In the model-based designs discussed above, the dose-toxicity curve has to be updated by statisticians after all previous patients have completed their safety and toxicity evaluations. Enrollment of additional patients is therefore delayed in studies with long DLT assessment periods (e.g., DLT assessment periods greater than 28 days). Time-to-event approaches extend existing model-based designs so that the next dose level is estimated from all previous patients, including partial data from patients who are still in the DLT assessment period or whose follow-up is pending (Doussau et al. 2016).

The TITE-CRM was proposed in 2000 to estimate the cumulative probability of late-onset toxicity over several cycles when some patients have not yet completed the DLT assessment period (Cheung and Chappell 2000). In 2006, Normolle and Lawrence (2006) used Monte Carlo simulations of 60,000 Phase I studies to demonstrate that TITE-CRM trials are considerably shorter than trials using traditional 3 + 3 and CRM study designs when toxicity observation times are long. However, Le Tourneau et al. (2009) pointed out that in two pancreatic cancer trials (Muler et al. 2004; Desai et al. 2007), the TITE-CRM design accrued more patients to dose levels below the RP2D than trials using a traditional 3 + 3 design.

Similar to the TITE-CRM, when statisticians update the estimates based on the Bayesian model in the TITE-EWOC design, the observations of patients who have not completed the follow-up period are likely to be down-weighted. Mauguen et al. (2011) showed that compared with the EWOC design, trial duration can be significantly decreased with the TITE-EWOC, without a major impact on the probability of overdose risk or the number of DLTs. This design also avoids waiting time in pediatric cancer chemoradiation trials (Doussau et al. 2016).
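The down-weighting of incomplete observations can be sketched as follows. The chapter does not specify the weight function, so the sketch assumes a linear weight equal to the fraction of the DLT assessment window that a DLT-free patient has completed, a common choice for time-to-event designs; the function names and the 28-day window are hypothetical.

```python
import math


def tite_weight(follow_up_days, window_days, had_dlt):
    """Weight of one patient's observation in the weighted likelihood.

    Assumption: patients with a DLT count fully; DLT-free patients are
    weighted by the fraction of the assessment window completed.
    """
    if had_dlt:
        return 1.0
    return min(max(follow_up_days / window_days, 0.0), 1.0)


def weighted_log_likelihood(p, patients, window_days):
    """Weighted log likelihood of DLT probability p at a single dose.

    patients is a list of (follow_up_days, had_dlt) pairs.
    """
    total = 0.0
    for follow_up_days, had_dlt in patients:
        w = tite_weight(follow_up_days, window_days, had_dlt)
        # A DLT contributes log(p); a DLT-free patient contributes log(1 - w * p),
        # which shrinks toward zero when little follow-up has accrued.
        total += math.log(p) if had_dlt else math.log(1.0 - w * p)
    return total


patients = [(28, False), (14, False), (5, True)]
print(weighted_log_likelihood(0.2, patients, window_days=28))
```

A patient who is halfway through the window therefore counts as half of a DLT-free observation, so new patients can be enrolled without waiting for complete follow-up.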

4.4 Model-Assisted Designs

4.4.1 Modified Toxicity Probability Interval (mTPI) Design

In 2007, Ji et al. (2007) proposed a Phase I dose-finding approach with simple escalation and de-escalation rules based on toxicity probability intervals (TPI). In 2010, Ji et al. (2010) presented a modified TPI (mTPI) design to improve efficiency while maintaining the simplicity of the original TPI design.

In the mTPI design, three intervals are specified to denote the proper dosing interval (δ1, δ2), the underdosing interval (0, δ1), and the overdosing interval (δ2, 1). The mTPI makes the decision about dose escalation and de-escalation based on the unit probability mass (UPM) of the three intervals (Fig. 4.1). Let $p_{\mathrm{cur}}$ denote the DLT probability of the current dose. The UPM of an interval is defined as the posterior probability that $p_{\mathrm{cur}}$ is within the interval, divided by the length of the interval.

Fig. 4.1 mTPI calculates and compares the UPMs of the underdosing, proper dosing, and overdosing intervals (the figure shows the posterior distribution of the DLT rate, with the regions UPM 1, UPM 2, and UPM 3 marked)
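The UPM comparison can be sketched in a few lines. The sketch assumes a uniform Beta(1, 1) prior on the DLT probability of the current dose, so that the posterior after y DLTs in n patients is Beta(y + 1, n − y + 1), and it assumes that the action attached to the interval with the largest UPM is taken (escalate for underdosing, stay for proper dosing, de-escalate for overdosing). The function names and the interval (0.25, 0.35) are hypothetical.

```python
import math


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for positive integers a and b.

    Uses the identity P(Beta(a, b) <= x) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(math.comb(n, k) * x ** k * (1.0 - x) ** (n - k)
               for k in range(a, n + 1))


def mtpi_decision(n_treated, n_dlt, delta1, delta2):
    """Return (action, UPMs) for the current dose under a Beta(1, 1) prior."""
    a, b = n_dlt + 1, n_treated - n_dlt + 1
    cdf1 = beta_cdf_int(delta1, a, b)
    cdf2 = beta_cdf_int(delta2, a, b)
    upm = {
        "escalate": cdf1 / delta1,                     # underdosing interval (0, delta1)
        "stay": (cdf2 - cdf1) / (delta2 - delta1),     # proper dosing interval
        "de-escalate": (1.0 - cdf2) / (1.0 - delta2),  # overdosing interval (delta2, 1)
    }
    return max(upm, key=upm.get), upm


for n, y in [(3, 0), (6, 2), (3, 3)]:
    action, upm = mtpi_decision(n, y, delta1=0.25, delta2=0.35)
    print(f"{y}/{n} DLTs -> {action}", {k: round(v, 3) for k, v in upm.items()})
```

Because the decision depends only on the number of patients and DLTs at the current dose, all decisions can be tabulated in advance for every possible outcome.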

By assuming the target toxicity rate, dose levels, and potential toxicity rate at each dose level, a Monte Carlo experiment can be performed to identify the operating characteristics, including the estimated number of patients and the observed number of toxicities. The dose escalation and de-escalation decisions can also be determined before the onset of the trial, which makes studies with the mTPI design easy for investigators to implement.

Two further modifications of the mTPI were proposed in 2017: the mTPI-2 design (Guo et al. 2017) and the keyboard design (Yan et al. 2017). These two study designs are very similar and have almost the same operating characteristics in simulations, but the keyboard design is conceptually easier to understand.

4.4.2 Keyboard Design

The keyboard design was proposed to improve the performance of the mTPI design (Yan et al. 2017), since the original mTPI has a higher risk of overdosing patients due to the use of the UPM to guide dose escalation. The keyboard design constructs a series of equal-width dosing intervals, referred to as “keys,” to guide dose escalation and de-escalation (Fig. 4.2).

Fig. 4.2 The keyboard design forms a series of equal-width keys and bases the decision on the position of the strongest key with respect to the target key (the figure shows the posterior distribution of the DLT rate, with the target key and the strongest key marked)

The keyboard design starts by eliciting the proper dosing interval (referred to as the target key) from clinicians, and then forms a series of equal-width keys on both sides of the target key. The keyboard design makes the decision of dose escalation and de-escalation based on the location of the “strongest” key, defined as the key that has the largest area under the posterior distribution curve of $p_{\mathrm{cur}}$. The resulting rule is intuitive: if the strongest key lies to the left of the target key, the dose is escalated; if it lies to the right, the dose is de-escalated; and if the strongest key is the target key, the current dose is retained.
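The keyboard rule can be sketched as follows. As with any such sketch, several details are assumptions rather than statements from this chapter: a uniform Beta(1, 1) prior on the DLT probability of the current dose, keys laid out outward from the target key with any leftover segment at either end (too short to form a full key) ignored, and a hypothetical target key of (0.25, 0.35). The function names are hypothetical.

```python
import math


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for positive integers a and b (binomial tail identity)."""
    n = a + b - 1
    return sum(math.comb(n, k) * x ** k * (1.0 - x) ** (n - k)
               for k in range(a, n + 1))


def keyboard_decision(n_treated, n_dlt, target_lo, target_hi):
    """Compare the strongest key with the target key under a Beta(1, 1) prior."""
    width = target_hi - target_lo
    tol = 1e-9  # guards against floating-point drift at the ends of [0, 1]
    keys = [(target_lo, target_hi)]
    lo = target_lo
    while lo - width >= -tol:            # add full-width keys to the left
        keys.insert(0, (max(lo - width, 0.0), lo))
        lo -= width
    target_index = len(keys) - 1         # position of the target key
    hi = target_hi
    while hi + width <= 1.0 + tol:       # add full-width keys to the right
        keys.append((hi, min(hi + width, 1.0)))
        hi += width

    a, b = n_dlt + 1, n_treated - n_dlt + 1
    mass = [beta_cdf_int(k_hi, a, b) - beta_cdf_int(k_lo, a, b)
            for k_lo, k_hi in keys]
    strongest = max(range(len(keys)), key=mass.__getitem__)
    if strongest < target_index:
        return "escalate"
    if strongest > target_index:
        return "de-escalate"
    return "stay"


for n, y in [(3, 0), (6, 2), (3, 3)]:
    print(f"{y}/{n} DLTs -> {keyboard_decision(n, y, 0.25, 0.35)}")
```

Because all keys have the same width, comparing posterior areas is equivalent to comparing posterior mass per unit length, without the unequal interval lengths used in the UPM calculation.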

4.4.3 Bayesian Optimal Interval (BOIN) Design

The BOIN design is another model-assisted design that has overdose toxicity controls (Liu and Yuan 2015; Yuan et al. 2016). Unlike the mTPI and keyboard designs, the BOIN design makes the decision of dose escalation and de-escalation simply by comparing the observed DLT rate with a pair of fixed, predetermined dose escalation and de-escalation boundaries (Fig. 4.3).

Fig. 4.3 BOIN compares the observed DLT rate at the current dose with the pre-specified dose escalation and de-escalation boundaries (the figure shows a horizontal bar divided into escalation from 0 to λe, stay between λe and λd, and de-escalation from λd to 1)

The respective dose escalation and de-escalation boundaries are derived from a pair of pre-specified toxicity probability thresholds: the highest DLT probability that is predicted to be underdosing such that dose escalation is needed and the lowest DLT probability that is predicted to be overdosing such that dose de-escalation is needed.
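The derivation of the boundaries can be sketched numerically. The closed-form expressions and the default thresholds of 0.6 and 1.4 times the target rate are not stated in this chapter; they are included here as an assumption, in the form commonly given for the BOIN design, and the function names are hypothetical.

```python
import math


def boin_boundaries(target, phi1=None, phi2=None):
    """Escalation and de-escalation boundaries (lambda_e, lambda_d).

    target: target DLT rate.
    phi1:   highest DLT probability regarded as underdosing
            (assumed default 0.6 * target).
    phi2:   lowest DLT probability regarded as overdosing
            (assumed default 1.4 * target).
    """
    phi1 = 0.6 * target if phi1 is None else phi1
    phi2 = 1.4 * target if phi2 is None else phi2
    lam_e = (math.log((1 - phi1) / (1 - target))
             / math.log(target * (1 - phi1) / (phi1 * (1 - target))))
    lam_d = (math.log((1 - target) / (1 - phi2))
             / math.log(phi2 * (1 - target) / (target * (1 - phi2))))
    return lam_e, lam_d


def boin_decision(n_treated, n_dlt, lam_e, lam_d):
    """Compare the observed DLT rate at the current dose with the boundaries."""
    rate = n_dlt / n_treated
    if rate <= lam_e:
        return "escalate"
    if rate >= lam_d:
        return "de-escalate"
    return "stay"


lam_e, lam_d = boin_boundaries(0.30)
print(f"lambda_e = {lam_e:.3f}, lambda_d = {lam_d:.3f}")
for n, y in [(3, 0), (3, 1), (3, 2), (6, 1)]:
    print(f"{y}/{n} DLTs -> {boin_decision(n, y, lam_e, lam_d)}")
```

Once the two boundaries are fixed, every decision during the trial reduces to comparing a simple proportion with two numbers.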

The BOIN design can target any pre-specified DLT rate without limitations. During the escalation phase, the process is very transparent and accessible to non-statisticians.

4.5 New Designs

Because of the practical demands created by recent advances in oncology treatments, many new study designs are under development or being tested prospectively in upcoming clinical trials (George et al. 2016). Some of these new designs have been incorporated into clinical trials, but have not yet been published and validated statistically.

4.5.1 Modified 4 + 4 Design

As the name suggests, the 4 + 4 design is a modification of the traditional 3 + 3 design. In addition to the three patients treated with the study drug in each cohort, it adds one more patient on placebo. The 4 + 4 design is blinded and requires the involvement of the safety review committee (SRC) in the evaluation of each cohort.

The following guidelines are provided for each dose level:

  1. If 0/4 patients are observed with DLTs, then escalate the dose level.

  2. If 1/4 patients are observed with DLTs, then stay on the same level and enroll 4 more patients:

    (a) If 1/8 patients are observed with DLTs, then escalate the dose level.

    (b) If 2/8 patients are observed with DLTs, then the SRC is unblinded to treatment:

      • If there are ≥1 DLTs in the placebo group, then escalate the dose level.

      • If both DLTs are in the treatment group, then stop the trial and choose the previous dose level as MTD.

  3. If 2/4 patients are observed with DLTs, then the SRC is unblinded to treatment:

    (a) If there is 1 DLT in the treatment group, then stay on the same level and enroll 4 more patients:

      • If 1/8 patients are observed with DLTs, then escalate the dose level.

      • If 2/8 patients are observed with DLTs, then the SRC is unblinded to treatment:

        • If there is 1 DLT in the placebo group, then escalate the dose level.

        • If both DLTs are in the treatment group, then stop the trial and choose the previous dose level as MTD.

    (b) If there are two DLTs in the treatment group, then stop the trial and choose the previous dose level as MTD.

  4. If 3/4 patients are observed with DLTs, then the trial stops and the previous dose level is defined as the MTD.

This design may be applicable to studies where it is difficult to distinguish adverse events related to the investigational agent from adverse events that are expected due to the underlying disease. In this scenario, the placebo group can help increase the probability of escalating the dose level. This idea can also be adapted to oncology studies by replacing the placebo group with a control group receiving other lines of therapy, if efficacy is a secondary objective in Phase I.

4.6 Conclusions

While there is great interest and enthusiasm about model-based study designs, over the past decade, the rule-based designs, like the traditional 3 + 3 design and newer rolling six study design, are still the most commonly used in pediatric oncology trials. As noted above, these study designs are easy to execute since the rules about dose escalation and de-escalation are a priori defined in the protocol based on observed DLTs.

Model-based designs have significant advantages in reducing the number of study patients and shortening study durations. However, they require significant involvement of statisticians in the development of the study design, in monitoring of the study, and in dose escalation/de-escalation decisions. Model-based designs are also operationally complicated, due to the requirement for repeated model fitting, their conceptual and computational complexity, and their nontransparent approach to decision-making.

The model-assisted designs combine the superior performance of model-based designs with the simplicity of algorithm-based designs. They offer more flexible approaches to patient enrollment while retaining clear escalation and de-escalation rules. Because of their good performance and simplicity, model-assisted designs have been increasingly used in practice. In addition, many software packages and online tools are now available to support the simulation of operating characteristics for both model-based and model-assisted designs. However, since model-assisted designs are relatively new, they are not yet commonly utilized in pediatric oncology trials. In the future, we believe more model-assisted designs will be developed by investigators and biostatisticians and applied to pediatric oncology studies.