Introduction

A long-term survivor mixture model, also known as the standard cure rate model, assumes that the studied population is a mixture of susceptible individuals, who experience the event of interest, and non-susceptible individuals, who will never experience it. The latter are not at risk with respect to the event of interest and are considered immune, non-susceptible or cured [9]. Following Maller and Zhou [9], the standard cure rate model assumes that a fraction p of the population is cured, that is, never fails with respect to the specific cause of death or failure, while the remaining fraction (1−p) is not cured. The survival function for the entire population is then written as:

$$\begin{aligned} S(t)=p+(1-p)S_{0}(t), \end{aligned}$$
(24.1)

where p∈(0,1) is the mixing parameter and \(S_{0}(t)\) denotes a proper survival function for the non-cured group in the population. Considering a random sample of lifetimes \((t_{i},\delta_{i})\), i=1,…,n, under the assumption of right-censored lifetimes, the contribution of the ith individual to the likelihood function is:

$$\begin{aligned} L_{i}= \bigl[ f ( t_{i} ) \bigr] ^{\delta _{i}} \bigl[ S ( t_{i} ) \bigr] ^{1-\delta _{i}}, \end{aligned}$$
(24.2)

where \(\delta_{i}\) is a censoring indicator variable, that is, \(\delta_{i}=1\) for an observed lifetime and \(\delta_{i}=0\) for a censored lifetime.

From the mixture survival function (24.1), the probability density function is obtained from \(f ( t ) =-\frac{d}{dt}S(t)\), evaluated at \(t=t_{i}\), and is given by:

$$\begin{aligned} f ( t_{i} ) = ( 1-p ) f_{0} ( t_{i} ) \mbox{,} \end{aligned}$$
(24.3)

where \(f_{0}(t_{i})\) is the probability density function for the susceptible individuals.
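As an illustration of how (24.1)–(24.3) combine, the following minimal sketch evaluates the log-likelihood of the mixture model for generic susceptible-group functions. The function name `mixture_loglik`, the unit-exponential baseline and the three data points are assumptions made only for this example; they are not taken from the paper.

```python
import numpy as np

def mixture_loglik(t, delta, p, f0, S0):
    """Log-likelihood of the mixture cure model.

    f0 and S0 are callables giving the density and survival function
    of the susceptible group (hypothetical interface for this sketch).
    """
    t = np.asarray(t, dtype=float)
    delta = np.asarray(delta, dtype=float)
    f = (1.0 - p) * f0(t)          # Eq. (24.3)
    S = p + (1.0 - p) * S0(t)      # Eq. (24.1)
    # Sum over individuals of the log of Eq. (24.2)
    return float(np.sum(delta * np.log(f) + (1.0 - delta) * np.log(S)))

# Illustration only: unit-exponential baseline and made-up data
f0_exp = lambda t: np.exp(-t)
S0_exp = lambda t: np.exp(-t)
t_obs = np.array([0.5, 1.2, 3.0])
delta_obs = np.array([1, 1, 0])    # third lifetime is right censored
ll = mixture_loglik(t_obs, delta_obs, 0.3, f0_exp, S0_exp)
```

Censored observations contribute only through S(t), which is bounded below by p, so long censored follow-up times are what carry information about the cure fraction.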

An alternative to the long-term survivor mixture model is the long-term survivor non-mixture model suggested in [7, 12, 13], which defines an asymptote for the cumulative hazard and hence for the cure fraction. The survival function for a non-mixture cure rate model is defined as:

$$\begin{aligned} S ( t ) =p^{1-S_{0} ( t ) }\mbox{,} \end{aligned}$$
(24.4)

where, as in (24.1), p∈(0,1) is the probability of cure and \(S_{0}(t)\) denotes a proper survival function for the non-cured group. Observe that S(t) is large when the probability of cure p is large, and also when \(S_{0}(t)\) is large, that is, when \(F_{0}(t)=1-S_{0}(t)\) is small. Conversely, larger values of \(F_{0}(t)\) at a fixed time t imply lower values of S(t). This model was derived under the threshold model for tumor resistance (cancer research), where \(F_{0}(t)\) refers to the distribution of division time for each cell in a homogeneous clone of cells. The non-mixture model (24.4), also called the promotion time cure model, has been used by Lambert et al. [7, 8] to estimate the cure fraction in cancer lifetime data.

From (24.4), the survival and hazard functions for the non-mixture cure rate model can be written, respectively, as:

$$\begin{aligned} S ( t_{i} ) =\exp \bigl[ \log ( p ) F_{0} ( t_{i} ) \bigr] \end{aligned}$$
(24.5)

and

$$\begin{aligned} h ( t_{i} ) =-\log ( p ) f_{0} ( t_{i} ) \mbox{.} \end{aligned}$$
(24.6)
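For readers who want to check the step from (24.5) to (24.6), a short derivation using the standard identity \(h(t)=-\frac{d}{dt}\log S(t)\) can be sketched as:

$$\begin{aligned} h ( t ) =-\frac{d}{dt}\log S ( t ) =-\frac{d}{dt} \bigl[ \log ( p ) F_{0} ( t ) \bigr] =-\log ( p ) f_{0} ( t ) \mbox{,} \end{aligned}$$

which is non-negative because log(p)<0 for p∈(0,1). The cumulative hazard \(-\log(p)F_{0}(t)\) tends to −log(p) as t→∞, so that S(t)→p, which is the asymptote that defines the cure fraction in this formulation.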

Since f(t)=h(t)S(t), the contribution of the ith individual to the likelihood function is given by:

$$\begin{aligned} L_{i}=h ( t_{i} ) ^{\delta _{i}}S ( t_{i} ) \end{aligned}$$
(24.7)

that is:

$$\begin{aligned} L_{i}= \bigl[ -\log ( p ) f_{0} ( t_{i} ) \bigr] ^{\delta _{i}}\exp \bigl[ \log ( p ) F_{0} ( t_{i} ) \bigr] \mbox{.} \end{aligned}$$
(24.8)
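The contribution (24.8) can be evaluated on the log scale as in the sketch below. The function name `nonmixture_loglik`, the unit-exponential baseline and the data are illustrative assumptions, not part of the original analysis.

```python
import numpy as np

def nonmixture_loglik(t, delta, p, f0, F0):
    """Log-likelihood of the non-mixture cure model, i.e. the sum of
    the logs of Eq. (24.8). f0 and F0 are callables for the baseline
    density and distribution function (hypothetical interface)."""
    t = np.asarray(t, dtype=float)
    delta = np.asarray(delta, dtype=float)
    log_h = np.log(-np.log(p) * f0(t))   # log of Eq. (24.6)
    log_S = np.log(p) * F0(t)            # log of Eq. (24.5)
    return float(np.sum(delta * log_h + log_S))

# Illustration only: unit-exponential baseline and made-up data
f0_exp = lambda t: np.exp(-t)
F0_exp = lambda t: 1.0 - np.exp(-t)
t_obs = np.array([0.5, 1.2, 3.0])
delta_obs = np.array([1, 1, 0])          # third lifetime is right censored
ll_nm = nonmixture_loglik(t_obs, delta_obs, 0.3, f0_exp, F0_exp)
```

Unlike the mixture case, every individual contributes the full log S(t) term, and the hazard term is added only for observed failures.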

A Bayesian formulation of the non-mixture cure rate model is given in Chen et al. [2]. A model which includes a standard mixture model for cure rate was considered in Yin and Ibrahim [14]. Rodrigues et al. [10] extended the long-term survival model proposed by Chen et al. [2].

In this paper, considering the Burr XII distribution, we compare the performance of the mixture and non-mixture cure fraction formulations when the scale and shape parameters depend on covariates. The Burr XII distribution is more flexible than the Weibull distribution, which arises as a limiting case of the Burr XII distribution. It is also important to point out that the Burr XII distribution is mathematically tractable, with a closed form for its cumulative distribution function.

The Burr XII Distribution Cure Model

Burr [1] suggested a number of cumulative distributions, of which the most popular is the so-called Burr XII distribution, whose three-parameter probability density function is given by:

$$\begin{aligned} f_{0} (t\mid\mu,\alpha,\lambda )=\frac{\alpha}{\mu^{\alpha}}t^{\alpha-1} \biggl[1+\lambda \biggl(\frac{t}{\mu} \biggr)^{\alpha} \biggr]^{- (1+\frac{1}{\lambda} )}, \end{aligned}$$
(24.9)

where μ>0 is the scale parameter and α>0 and λ>0 are shape parameters. In the limit \(\lambda\rightarrow0^{+}\) we obtain the Weibull distribution as a particular case. The hazard function of the Burr XII distribution is decreasing if α≤1 and is unimodal with mode at \(t=\mu [ (\alpha-1 )/\lambda ]^{1/\alpha}\) when α>1. The three-parameter Burr XII distribution is much more flexible than the standard two-parameter Weibull distribution.

From (24.9), the survival function is given by:

$$\begin{aligned} S_{0} (t\mid\mu,\alpha,\lambda )= \biggl[1+\lambda \biggl( \frac{t}{\mu} \biggr)^{\alpha} \biggr]^{-\frac{1}{\lambda}}. \end{aligned}$$
(24.10)
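The density (24.9), the survival function (24.10), the Weibull limit and the location of the hazard mode can be checked numerically. The sketch below is only an illustration; the function names and the parameter values are assumptions chosen for the example.

```python
import numpy as np

def burr12_pdf(t, mu, alpha, lam):
    """Burr XII density, Eq. (24.9)."""
    z = (t / mu) ** alpha
    return alpha / mu**alpha * t ** (alpha - 1) * (1 + lam * z) ** (-(1 + 1 / lam))

def burr12_sf(t, mu, alpha, lam):
    """Burr XII survival function, Eq. (24.10)."""
    return (1 + lam * (t / mu) ** alpha) ** (-1 / lam)

def burr12_hazard(t, mu, alpha, lam):
    """Hazard function h0(t) = f0(t) / S0(t)."""
    return burr12_pdf(t, mu, alpha, lam) / burr12_sf(t, mu, alpha, lam)

def weibull_sf(t, mu, alpha):
    """Weibull survival function, the limit of Eq. (24.10) as lam -> 0+."""
    return np.exp(-((t / mu) ** alpha))

def hazard_mode(mu, alpha, lam):
    """Mode of the hazard for alpha > 1: mu * ((alpha - 1) / lam)**(1 / alpha)."""
    return mu * ((alpha - 1) / lam) ** (1 / alpha)

# Illustrative parameter values (not estimates from the paper)
t_grid = np.array([0.5, 1.0, 3.0])
gap = np.max(np.abs(burr12_sf(t_grid, 2.0, 1.5, 1e-6) - weibull_sf(t_grid, 2.0, 1.5)))
t_star = hazard_mode(2.0, 1.5, 0.5)
```

Here `gap` measures how close the Burr XII survival function with a very small λ is to the Weibull one, and `t_star` is the time at which the hazard peaks for the chosen values.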

From (24.10), the Burr XII model in the presence of long-term survivors or immunes has a probability density function and a survival function given, respectively, as follows:

$$\begin{aligned} f (t\mid\mathbf{\theta} ) =& (1-p )\frac{\alpha}{\mu^{\alpha}}t^{\alpha-1} \biggl[1+ \lambda \biggl(\frac{t}{\mu} \biggr)^{\alpha} \biggr]^{- (1+\frac{1}{\lambda} )}, \end{aligned}$$
(24.11)
$$\begin{aligned} S ( t\mid \mathbf{\theta } ) =&p+ ( 1-p ) \biggl[1+\lambda \biggl( \frac{t}{\mu} \biggr)^{\alpha} \biggr]^{-\frac{1}{\lambda}}, \end{aligned}$$
(24.12)

where \(\mathbf{\theta}= (\mu,\alpha,\lambda,p )\), μ is the scale parameter, α and λ are shape parameters and p is the proportion of immune or non-susceptible individuals.

Under the non-mixture formulation and using (24.10), the probability density function and the survival function are given respectively by:

$$\begin{aligned} f (t\mid\mathbf{\theta} ) =&-\log (p )\frac{\alpha}{\mu^{\alpha}}t^{\alpha-1} \biggl[1+ \lambda \biggl(\frac{t}{\mu} \biggr)^{\alpha} \biggr]^{- (1+\frac{1}{\lambda} )} p^{ \bigl\{ 1- \bigl[1+\lambda \bigl(\frac{t}{\mu} \bigr)^{\alpha} \bigr]^{-\frac{1}{\lambda}} \bigr\} } \end{aligned}$$
(24.13)
$$\begin{aligned} S (t\mid\mathbf{\theta} ) =& p^{ \bigl\{ 1- \bigl[1+\lambda \bigl(\frac{t}{\mu} \bigr)^{\alpha} \bigr]^{-\frac{1}{\lambda}} \bigr\} }. \end{aligned}$$
(24.14)
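A minimal numerical sketch of (24.11)–(24.14) is given below. It checks that both survival functions start at 1 and level off at p, which is the defining feature of a cure model. The function names and parameter values are assumptions made for illustration only.

```python
import numpy as np

def burr12_pdf(t, mu, alpha, lam):
    """Burr XII density, Eq. (24.9)."""
    z = (t / mu) ** alpha
    return alpha / mu**alpha * t ** (alpha - 1) * (1 + lam * z) ** (-(1 + 1 / lam))

def burr12_sf(t, mu, alpha, lam):
    """Burr XII survival function, Eq. (24.10)."""
    return (1 + lam * (t / mu) ** alpha) ** (-1 / lam)

def mixture_pdf(t, mu, alpha, lam, p):
    return (1 - p) * burr12_pdf(t, mu, alpha, lam)            # Eq. (24.11)

def mixture_sf(t, mu, alpha, lam, p):
    return p + (1 - p) * burr12_sf(t, mu, alpha, lam)         # Eq. (24.12)

def nonmixture_sf(t, mu, alpha, lam, p):
    return p ** (1 - burr12_sf(t, mu, alpha, lam))            # Eq. (24.14)

def nonmixture_pdf(t, mu, alpha, lam, p):
    # Eq. (24.13): hazard -log(p) f0(t) times the survival function (24.14)
    return -np.log(p) * burr12_pdf(t, mu, alpha, lam) * nonmixture_sf(t, mu, alpha, lam, p)

# Illustrative values only (not estimates from the paper)
theta = dict(mu=100.0, alpha=1.5, lam=0.5, p=0.3)
t = np.array([0.0, 50.0, 1e8])
S_mix = mixture_sf(t, **theta)       # starts at 1, levels off near p
S_non = nonmixture_sf(t, **theta)    # starts at 1, levels off near p
```

For the same parameter values the two formulations share the same limits but differ at intermediate times, so the parameters μ, α and λ are not directly comparable across the two models.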

In the presence of one covariate \(x_{i}\), i=1,…,n, we can assume a link function for each of μ, α, λ and p, that is, \(\log(\mu_{i})=\beta_{0}+\beta_{1}x_{i}\), \(\log(\alpha_{i})=\alpha_{0}+\alpha_{1}x_{i}\), \(\log(\lambda_{i})=\gamma_{0}+\gamma_{1}x_{i}\) and \(\log ( \frac{p_{i}}{1-p_{i}} )=\eta_{0}+\eta_{1}x_{i}\), where \(x_{i}\) takes, for example, the value 0 if individual i is in treatment group 1 and the value 1 if individual i is in treatment group 2. We may then be interested in testing the following hypotheses: \(H_{0}: \beta_{1}=0\) (no treatment effect on the susceptible patients), \(H_{0}: \alpha_{1}=0\) (no treatment effect on the shape of the lifetime distribution), \(H_{0}: \gamma_{1}=0\) (no treatment effect on the shape of the lifetime distribution) or \(H_{0}: \eta_{1}=0\) (no treatment effect on the proportion of cured individuals).
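The link functions above map unrestricted regression coefficients to parameters in their valid ranges. A small sketch, with a hypothetical function name and made-up coefficient values, is:

```python
import numpy as np

def linked_parameters(x, beta, alpha_coef, gamma, eta):
    """Map a covariate x to (mu, alpha, lam, p) through the log and
    logit links. Each coefficient argument is an (intercept, slope) pair."""
    x = np.asarray(x, dtype=float)
    mu = np.exp(beta[0] + beta[1] * x)                      # log link, mu > 0
    alpha = np.exp(alpha_coef[0] + alpha_coef[1] * x)       # log link, alpha > 0
    lam = np.exp(gamma[0] + gamma[1] * x)                   # log link, lam > 0
    p = 1.0 / (1.0 + np.exp(-(eta[0] + eta[1] * x)))        # inverse logit, 0 < p < 1
    return mu, alpha, lam, p

# Made-up coefficients; x = 0 for treatment group 1, x = 1 for group 2
x = np.array([0, 1])
mu, alpha, lam, p = linked_parameters(
    x, beta=(5.0, 0.4), alpha_coef=(0.2, 0.0), gamma=(-0.5, 0.0), eta=(0.0, -1.0)
)
```

With a binary covariate, a zero slope means the two groups share the same parameter, which is what each null hypothesis above states.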

A Bayesian Analysis

For a Bayesian analysis of the mixture and non-mixture models introduced in Sect. 24.1, we assume a uniform prior distribution on the interval (0,1), U(0,1), for the probability of cure p and Gamma(0.001,0.001) prior distributions for the scale parameter μ and the shape parameters α and λ, where Gamma(a,b) denotes a gamma distribution with mean a/b and variance \(a/b^{2}\). We further assume prior independence among p, μ, α and λ. Observe that these are approximately non-informative priors for the parameters of the models.

Assuming the mixture and non-mixture models introduced in Sect. 24.1, let us consider a gamma prior distribution Gamma(0.001,0.001) for the regression parameters \(\beta_{0}\) and \(\alpha_{0}\) and a normal prior distribution N(0,100) for the regression parameters \(\beta_{l}\) and \(\alpha_{l}\), l=1,…,k, where \(N(\mu,\sigma^{2})\) denotes a normal distribution with mean μ and variance \(\sigma^{2}\). We also assume prior independence among the parameters.

Posterior summaries of interest are obtained from simulated samples of the joint posterior distribution using standard Markov Chain Monte Carlo (MCMC) methods such as the Gibbs sampling algorithm [4] or the Metropolis–Hastings algorithm [3].
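To make the sampling step concrete, the sketch below runs a random-walk Metropolis sampler (a Metropolis–Hastings algorithm with a symmetric proposal) for the mixture model without covariates, using the U(0,1) and Gamma(0.001,0.001) priors stated above. It is not the SAS/MCMC implementation used for the results in this chapter; the data, starting values, step sizes and function names are all assumptions made for illustration.

```python
import numpy as np

def log_prior(theta):
    mu, alpha, lam, p = theta
    if min(mu, alpha, lam) <= 0 or not (0 < p < 1):
        return -np.inf                       # outside the parameter space
    a = b = 0.001                            # Gamma(a, b) priors; U(0,1) on p adds 0
    return sum((a - 1) * np.log(v) - b * v for v in (mu, alpha, lam))

def log_lik(theta, t, delta):
    mu, alpha, lam, p = theta
    z = (t / mu) ** alpha
    log_f0 = (np.log(alpha) - alpha * np.log(mu) + (alpha - 1) * np.log(t)
              - (1 + 1 / lam) * np.log1p(lam * z))           # log of Eq. (24.9)
    S0 = np.exp(-np.log1p(lam * z) / lam)                    # Eq. (24.10)
    return np.sum(delta * (np.log1p(-p) + log_f0)
                  + (1 - delta) * np.log(p + (1 - p) * S0))

def log_post(theta, t, delta):
    lp = log_prior(theta)
    return lp + log_lik(theta, t, delta) if np.isfinite(lp) else -np.inf

def metropolis(t, delta, theta0, step, n_iter, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    cur = log_post(theta, t, delta)
    draws = np.empty((n_iter, theta.size))
    for k in range(n_iter):
        prop = theta + step * rng.standard_normal(theta.size)  # symmetric proposal
        new = log_post(prop, t, delta)
        if np.log(rng.uniform()) < new - cur:                  # accept or reject
            theta, cur = prop, new
        draws[k] = theta
    return draws

# Made-up toy data (days to event, 1 = observed, 0 = censored)
t_toy = np.array([12., 30., 45., 60., 90., 120., 200., 350., 500., 700.])
d_toy = np.array([1., 1., 1., 1., 1., 0., 1., 0., 0., 0.])
draws = metropolis(t_toy, d_toy, theta0=[100.0, 1.0, 1.0, 0.3],
                   step=np.array([10.0, 0.1, 0.1, 0.05]), n_iter=2000)
```

Proposals that fall outside the parameter space receive a log-posterior of minus infinity and are always rejected, which keeps the chain inside the support without any reparameterization. In practice the step sizes would need tuning and convergence would have to be assessed.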

An Application

In this section we analyze a leukaemia data set consisting of 90 observations introduced by Kersey et al. [6] and reproduced by Maller and Zhou [9]. In this data set, 46 patients were treated by allogeneic transplant (Group I) and the other 44 by autologous transplant (Group II). The survival time refers to the number of days to recurrence of leukaemia for patients after one of the two treatments. The medical problems of interest include: the existence of “cured” patients (who will never suffer a recurrence of leukaemia) and the estimation of their proportion; the failure distributions of susceptible patients; and the comparison between the effects of the two treatments.

In Tables 24.1 and 24.2, we have the inference results considering the Bayesian approach for mixture and non-mixture models, respectively. We also have the Monte Carlo estimates of DIC (Deviance Information Criterion) used as a discrimination criterion for different models. Smaller values of DIC indicate better models.

Table 24.1 Posterior means (standard deviation) for μ, α, λ and p in each group—mixture model
Table 24.2 Posterior means (standard deviation) for μ, α, λ and p in each group — non-mixture model
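The chapter does not spell out how DIC is computed. Under the usual definition, DIC equals the posterior mean deviance plus the effective number of parameters, where the latter is the posterior mean deviance minus the deviance at the posterior mean. A Monte Carlo estimate from MCMC output could be sketched as follows; the function name and the numbers are assumptions for illustration, and the exact value reported by SAS may be computed differently in detail.

```python
import numpy as np

def dic(loglik_draws, loglik_at_posterior_mean):
    """Monte Carlo estimate of DIC from log-likelihood values.

    loglik_draws: log-likelihood evaluated at each posterior draw.
    loglik_at_posterior_mean: log-likelihood at the posterior mean.
    Returns (DIC, effective number of parameters p_D).
    """
    deviance = -2.0 * np.asarray(loglik_draws, dtype=float)
    d_bar = deviance.mean()                      # posterior mean deviance
    d_hat = -2.0 * loglik_at_posterior_mean      # deviance at posterior mean
    p_d = d_bar - d_hat
    return d_bar + p_d, p_d

# Made-up log-likelihood values, for illustration only
dic_value, p_d = dic([-101.0, -100.0, -102.0, -101.0], -100.0)
```

Because the penalty term grows with model complexity, adding regression parameters lowers DIC only when the improvement in fit outweighs the added complexity.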

To obtain the Bayesian estimates we have used MCMC (Markov Chain Monte Carlo) methods available in SAS software 9.2, SAS/MCMC [11]. A single chain has been used in the simulation of samples for each parameter of both models, considering a burn-in sample of size 15,000 to eliminate the possible effect of the initial values. After this burn-in period, we simulated another 200,000 Gibbs samples and kept every 100th sample to obtain approximately uncorrelated values, which results in a final chain of size 2,000. The usual convergence diagnostics for a single chain available in the SAS/MCMC procedure indicated convergence for all parameters.
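The burn-in and thinning arithmetic can be reproduced with a simple slice. The array below is only a stand-in for a real chain, and whether the first or the last draw of each block of 100 is kept is an assumption that does not change the final size.

```python
import numpy as np

n_burn, n_sample, thin = 15_000, 200_000, 100

# Stand-in for a chain of draws: the value at each position is its iteration index
chain = np.arange(n_burn + n_sample)

# Discard the burn-in, then keep every 100th draw
kept = chain[n_burn::thin]
final_size = kept.size   # 200,000 / 100 = 2,000 retained draws
```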

In Fig. 24.1, we have the plots of the estimated survival functions considering mixture and non-mixture models in the presence of a cure fraction, together with the plot of the non-parametric Kaplan–Meier estimate of the survival function [5]. Fig. 24.1 also shows the estimated survival functions based on the Weibull and Burr XII distributions without cure fraction modeling.

Fig. 24.1 Fitted models for the data

From the fitted survival models (see Fig. 24.1), we conclude that the survival times are very well fitted by the mixture and non-mixture cure fraction models. From the results of Tables 24.1 and 24.2, the DIC values obtained for the two models are also similar.

We can also consider a binary variable related to the different groups, where \(x_{1i}=1\) for Group II and \(x_{1i}=0\) for Group I. We then consider three cases: a model without covariates (Model 1), a regression model for μ (Model 2) and a regression model for μ and α (Model 3).

In Tables 24.3 and 24.4, we have the inference results considering the Bayesian approach for regression models considering mixture and non-mixture models, respectively.

Table 24.3 Posterior Means (PM) and Standard Deviation (SD) for regression models — mixture model
Table 24.4 Posterior Means (PM) and Standard Deviation (SD) for regression models — non-mixture model

In the Bayesian context using MCMC methods, we have used the DIC given automatically by the SAS software (see Table 24.5).

Table 24.5 Deviance Information Criterion (DIC)

From the results of Table 24.5, we conclude that Model 3 (regression model for μ and α) gives the best fit to the data. Since the DIC is slightly smaller for the non-mixture Model 3 than for the other models, we use this model to obtain our final inferences of interest. From Table 24.4 and using the non-mixture Model 3, we conclude that the parameters \(\beta_{1}\) and \(\alpha_{1}\) indicate a significant treatment effect on the lifetime distribution of the susceptible patients.