
4.1 Introduction

The idea of an assumed copula was suggested by Zheng and Klein (1995) in their analysis of survival data subject to dependent censoring. They considered a bivariate distribution function of survival time and censoring time, where the form of the copula function is completely specified, including its parameter value. This strong assumption of the copula is imposed to make the model identifiable. Assuming the independence copula is equivalent to the assumption of independent censoring between survival time and censoring time.

Zheng and Klein (1995) view censoring as a competing risk of death and death as a competing risk of censoring. This is the setting of bivariate competing risks, where one observes the first-occurring event time and the type of the observed event (death or censoring, whichever comes first). With this view, survival data with dependent censoring are equivalent to bivariate competing risks data. In the context of competing risks, independence among event times is rarely assumed since many medical and engineering applications yield event times that are positively associated. Hence, statistical methods for analyzing bivariate competing risks data are applicable to survival data with dependent censoring.

Under an assumed copula, Zheng and Klein (1995) estimated the marginal survival function by the copula-graphic (CG) estimator. The survival function estimated by the CG estimator is analogous to the one estimated by the Kaplan–Meier estimator. The CG estimator reduces to the Kaplan–Meier estimator under the independence copula. In real applications, the CG estimator is calculated by assuming an Archimedean copula. Rivest and Wells (2001) obtained a simple expression of the CG estimator when the assumed copula belongs to the Archimedean family. Nowadays, the CG estimator is an indispensable tool for analyzing survival data with dependent censoring (Braekers and Veraverbeke 2005; Staplin 2012; de Uña-Álvarez and Veraverbeke 2013, 2017; Emura and Chen 2016; Emura and Michimae 2017; Moradian et al. 2017). Note, however, that the CG estimator cannot handle covariates. Likelihood-based approaches can naturally deal with covariates under an assumed copula.

Throughout this chapter, we review the copula-graphic estimator, parametric likelihood methods, and semi-parametric likelihood methods developed under an assumed copula.

4.2 The Copula-Graphic (CG) Estimator

Analysis of survival data often begins by drawing the Kaplan–Meier survival curve which graphically summarizes survival experience of patients in the data. However, under dependent censoring, the Kaplan–Meier estimator may give biased information about survival. A survival curve calculated from the CG estimator provides unbiased information about survival if the copula function between death time and censoring time is correctly specified. Below, we shall introduce the CG estimator under an Archimedean copula as derived in Rivest and Wells (2001).

Consider random variables, defined as

  • T: survival time

  • U: censoring time

Consider an Archimedean copula model

$$ \Pr (T > t,U > u) = \phi_{\theta }^{ - 1} [\phi_{\theta } \{ S_{T} (t)\} + \phi_{\theta } \{ S_{U} (u)\} ], $$
(4.1)

where \( \phi_{\theta } :[0,1] \mapsto [0,\infty ] \) is a generator function, which is continuous and strictly decreasing from \( \phi_{\theta } (0) = \infty \) to \( \phi_{\theta } (1) = 0 \) (Chap. 3); \( S_{T} (t) = \Pr (T > t) \) and \( S_{U} (u) = \Pr (U > u) \) are the marginal survival functions.

Let \( (t_{i} ,\delta_{i} ) \), \( i = 1, \ldots ,n \), be survival data without covariates, where \( t_{i} = \hbox{min} \{ T_{i} ,U_{i} \} \), \( \delta_{i} = {\mathbf{I}}(T_{i} \le U_{i} ) \), and \( {\mathbf{I}}( \cdot ) \) is the indicator function. Assume that all the observed times are distinct (\( t_{i} \ne t_{j} \) whenever \( i \ne j \)). Based on these data, one can estimate the marginal survival function \( S_{T} ( \cdot ) \) by the CG estimator.


The CG estimator is defined as

$$ \hat{S}_{T} (t) = \phi_{\theta }^{ - 1} \left[ {\sum\limits_{{t_{i} \le t,\delta_{i} = 1}} {\phi_{\theta } \left( {\frac{{n_{i} - 1}}{n}} \right) - \phi_{\theta } \left( {\frac{{n_{i} }}{n}} \right)} } \right],\quad 0 \le t \le \mathop {\hbox{max} }\limits_{i} (t_{i} ) $$

where \( n_{i} = \sum\nolimits_{\ell = 1}^{n} {{\mathbf{I}}(t_{\ell } \ge t_{i} )} \) is the number at risk at time \( t_{i} \); \( \hat{S}_{T} (t) = 1 \) if no death occurs up to time t; \( \hat{S}_{T} (t) \) is undefined for \( t > \mathop {\hbox{max} }\limits_{i} (t_{i} ) \).
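To make the computation concrete, the following minimal R sketch implements the displayed formula directly. The function name cg.estimate and its arguments are illustrative only; the generator \( \phi_{\theta } \) and its inverse are supplied by the user, and the observed times are assumed to be distinct, as above.

```r
# Copula-graphic estimator under an assumed Archimedean copula (Rivest and Wells 2001).
# t.obs: observed times t_i;  delta: censoring indicators (1 = death, 0 = censored);
# phi, phi.inv: the generator and its inverse;  t.eval: evaluation time points.
cg.estimate <- function(t.eval, t.obs, delta, phi, phi.inv) {
  n <- length(t.obs)
  n.risk <- sapply(t.obs, function(ti) sum(t.obs >= ti))   # n_i = number at risk at t_i
  sapply(t.eval, function(t) {
    in.sum <- (t.obs <= t) & (delta == 1)                  # deaths observed up to time t
    if (!any(in.sum)) return(1)                            # no death yet: S_T(t) = 1
    phi.inv(sum(phi((n.risk[in.sum] - 1) / n) - phi(n.risk[in.sum] / n)))
  })
}
```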

The derivation of the CG estimator: Assume that \( S_{T} (t) \) is a decreasing step function with jumps at death times. Thus, \( \delta_{i} = 1 \) implies \( S_{T} (t_{i} ) \ne S_{T} (t_{i} - dt) \) and \( S_{U} (t_{i} ) = S_{U} (t_{i} - dt) \). Setting \( t = u = t_{i} \) in Eq. (4.1), we have

$$ \phi_{\theta } \{ \Pr (T > t_{i} ,U > t_{i} )\} = \phi_{\theta } \{ S_{T} (t_{i} )\} + \phi_{\theta } \{ S_{U} (t_{i} )\} . $$

On the left-hand side of the preceding equation, we estimate \( \Pr (T > t_{i} ,U > t_{i} ) \) by \( (n_{i} - 1)/n \), where \( n_{i} - 1 = \sum\nolimits_{\ell = 1}^{n} {{\mathbf{I}}(t_{\ell } > t_{i} )} \) is the number of survivors at time \( t_{i} \). Accordingly,

$$ \phi_{\theta } \left( {\frac{{n_{i} - 1}}{n}} \right) = \phi_{\theta } \{ S_{T} (t_{i} )\} + \phi_{\theta } \{ S_{U} (t_{i} )\} . $$
(4.2)

Meanwhile, we set \( t = u = t_{i} - dt \) in Eq. (4.1) and then estimate \( \Pr (T > t_{i} - dt,U > t_{i} - dt) \) by \( n_{i} /n \). Then,

$$ \phi_{\theta } \left( {\frac{{n_{i} }}{n}} \right) = \phi_{\theta } \{ S_{T} (t_{i} - dt)\} + \phi_{\theta } \{ S_{U} (t_{i} )\} ,\quad \delta_{i} = 1. $$
(4.3)

Equations (4.2) and (4.3) result in the system of difference equations

$$ \phi_{\theta } \left( {\frac{{n_{i} - 1}}{n}} \right) - \phi_{\theta } \left( {\frac{{n_{i} }}{n}} \right) = \phi_{\theta } \{ S_{T} (t_{i} )\} - \phi_{\theta } \{ S_{T} (t_{i} - dt)\} ,\quad \delta_{i} = 1. $$

We impose the usual constraint that \( S_{T} (t_{i} - dt) = 1 \) when \( t_{i} \) is the smallest death time. Then, the solution to the difference equations is

$$ \begin{aligned} \phi_{\theta } \{ S_{T} (t)\} & = \sum\limits_{{t_{i} \le t,\delta_{i} = 1}} {[\phi_{\theta } \{ S_{T} (t_{i} )\} - \phi_{\theta } \{ S_{T} (t_{i} - dt)\} ]} \\ & = \sum\limits_{{t_{i} \le t,\delta_{i} = 1}} {\phi_{\theta } \left( {\frac{{n_{i} - 1}}{n}} \right) - \phi_{\theta } \left( {\frac{{n_{i} }}{n}} \right)} , \\ \end{aligned} $$

which is equivalent to the CG estimator. ■

Under the independence copula, given by \( \phi_{\theta } (t) = - \log (t) \), the CG estimator is equivalent to the Kaplan–Meier estimator. Under the Clayton copula, given by \( \phi_{\theta } (t) = (t^{ - \theta } - 1)/\theta \) for θ > 0, the CG estimator is written as

$$ \hat{S}_{T} (t) = \left[ {1 + \sum\limits_{{t_{i} \le t,\delta_{i} = 1}} {\left\{ {\left( {\frac{{n_{i} - 1}}{n}} \right)^{ - \theta } - \left( {\frac{{n_{i} }}{n}} \right)^{ - \theta } } \right\}} } \right]^{ - 1/\theta } . $$

This CG estimator can be computed by the compound.Cox R package (Emura et al. 2018).
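For illustration, the Clayton generator above can be plugged into the sketch given earlier in this section; the data vectors below are hypothetical toy values, and the compound.Cox package provides a ready-made implementation.

```r
theta   <- 2                                            # assumed copula parameter
phi     <- function(s) (s^(-theta) - 1) / theta         # Clayton generator
phi.inv <- function(x) (1 + theta * x)^(-1 / theta)     # inverse generator
t.obs   <- c(1.2, 2.5, 3.1, 4.0, 5.6)                   # toy observed times
delta   <- c(1, 0, 1, 1, 0)                             # toy censoring indicators
cg.estimate(sort(t.obs), t.obs, delta, phi, phi.inv)    # CG survival curve at the t_i
```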

The CG estimator provides a graphical summary of survival experience for patients in the same manner as the Kaplan–Meier estimator.


The survival curve is defined as the plot of \( \hat{S}_{T} (t) \) against t, starting with t = 0 and ending with \( t_{\hbox{max} } = \mathop {\hbox{max} }\limits_{i} (t_{i} ) \). The curve is a step function that jumps only at points where a death occurs. On the curve, censoring times are often indicated by the mark "+".

If \( t_{\hbox{max} } = \mathop {\hbox{max} }\limits_{i} (t_{i} ) \) corresponds to the time-to-death of a patient, then \( \hat{S}_{T} (t_{\hbox{max} } ) = \phi_{\theta }^{ - 1} (\infty ) = 0 \). This is because \( \phi_{\theta } \left( {\frac{{n_{i} - 1}}{n}} \right) = \phi_{\theta } (0) = \infty \) for some i in the definition of the CG estimator. If \( t_{\hbox{max} } = \mathop {\hbox{max} }\limits_{i} (t_{i} ) \) corresponds to the censoring time of a patient, then \( \hat{S}_{T} (t_{\hbox{max} } ) > 0 \).

Additional remarks: The CG estimator can be modified to accommodate a variety of censoring and truncation mechanisms. de Uña-Álvarez and Veraverbeke (2013) derived the CG estimator when survival time is subject to both dependent censoring and independent censoring. This estimator is convenient if the data provide the causes of censoring for all patients. For instance, censoring caused by dropout may be dependent while censoring caused by study termination is independent (see Chap. 14 of Collett (2015)). de Uña-Álvarez and Veraverbeke (2017) derived the CG estimator when survival time is subject to both dependent censoring and independent truncation. Chaieb et al. (2006) and Emura and Murotani (2015) derived the CG estimator when survival time is subject to independent censoring and dependent truncation.

4.3 Model and Likelihood

Throughout this chapter, we consider a bivariate survival function

$$ \Pr (T > t,U > u|{\mathbf{x}}) = C_{\theta } \{ S_{T} (t|{\mathbf{x}}),S_{U} (u|{\mathbf{x}})\} , $$

where \( C_{\theta } \) is a copula (Nelsen 2006) with parameter θ; \( S_{T} (t|{\mathbf{x}}) = \Pr (T > t|{\mathbf{x}}) \) and \( S_{U} (u|{\mathbf{x}}) = \Pr (U > u|{\mathbf{x}}) \) are the marginal survival functions. The covariates are defined as \( {\mathbf{x}} = ({\mathbf{x}}_{1} ,{\mathbf{x}}_{2} ) \) such that \( S_{T} (t|{\mathbf{x}}) = S_{T} (t|{\mathbf{x}}_{1} ) \) and \( S_{U} (u|{\mathbf{x}}) = S_{U} (u|{\mathbf{x}}_{2} ) \). For instance, if \( {\mathbf{x}}_{1} = ({\text{age}},{\text{gender}}) \) and \( {\mathbf{x}}_{2} = ({\text{gender}}) \), the model does not consider the effect of age on the censoring time.

Survival data consist of \( (t_{i} ,\delta_{i} ,{\mathbf{x}}_{i} ) \), \( i = 1, \ldots ,n \), where \( {\mathbf{x}}_{i} = (x_{i1} , \ldots ,x_{ip} )^{{\prime }} \) is a vector of covariates. The likelihood for the ith patient is expressed as

$$ L_{i} = \Pr (T = t_{i} ,U > t_{i} |{\mathbf{x}}_{i} )^{{\delta_{i} }} \Pr (T > t_{i} ,U = t_{i} |{\mathbf{x}}_{i} )^{{1 - \delta_{i} }} = f_{T}^{\# } (t_{i} |{\mathbf{x}}_{i} )^{{\delta_{i} }} f_{U}^{\# } (t_{i} |{\mathbf{x}}_{i} )^{{1 - \delta_{i} }} , $$

where

$$ \left. {f_{T}^{\# } (t_{i} |{\mathbf{x}}_{i} ) = - \frac{\partial }{\partial x}\Pr (T > x,U > t_{i} |{\mathbf{x}}_{i} )} \right|_{{x = t_{i} }} ,\quad \left. {f_{U}^{\# } (t_{i} |{\mathbf{x}}_{i} ) = - \frac{\partial }{\partial y}\Pr (T > t_{i} ,U > y|{\mathbf{x}}_{i} )} \right|_{{y = t_{i} }} , $$

are called the sub-density functions. Therefore, the log-likelihood is defined as

$$ \ell = \sum\limits_{i = 1}^{n} {[\delta_{i} \log f_{T}^{\# } (t_{i} |{\mathbf{x}}_{i} ) + (1 - \delta_{i} )\log f_{U}^{\# } (t_{i} |{\mathbf{x}}_{i} )]} . $$
(4.4)

An equivalent expression is

$$ \ell = \sum\limits_{i = 1}^{n} {[\delta_{i} \log h_{T}^{\# } (t_{i} |{\mathbf{x}}_{i} ) + (1 - \delta_{i} )\log h_{U}^{\# } (t_{i} |{\mathbf{x}}_{i} ) -\Phi (t_{i} ,t_{i} |{\mathbf{x}}_{i} )]} , $$
(4.5)

where

$$ h_{T}^{\# } (t_{i} |{\mathbf{x}}_{i} ) = \frac{{f_{T}^{\# } (t_{i} |{\mathbf{x}}_{i} )}}{{\Pr (T > t_{i} ,U > t_{i} |{\mathbf{x}}_{i} )}},\quad h_{U}^{\# } (t_{i} |{\mathbf{x}}_{i} ) = \frac{{f_{U}^{\# } (t_{i} |{\mathbf{x}}_{i} )}}{{\Pr (T > t_{i} ,U > t_{i} |{\mathbf{x}}_{i} )}}, $$

are the cause-specific hazard functions, and

$$ \Phi (t_{i} ,t_{i} |{\mathbf{x}}_{i} ) = - \log \,\Pr (T > t_{i} ,U > t_{i} |{\mathbf{x}}_{i} ) = - \log \,\Pr (\hbox{min} \{ T,U\} > t_{i} |{\mathbf{x}}_{i} ) $$

is the cumulative hazard function for \( \hbox{min} \{ \;T,U\;\} \).

With appropriate models for \( C_{\theta } \), \( S_{T} ( \cdot |{\mathbf{x}}) \), and \( S_{U} ( \cdot |{\mathbf{x}}) \), one can obtain the maximum likelihood estimator (MLE) by maximizing Eq. (4.4) or (4.5).
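As a concrete illustration of Eq. (4.4), the R sketch below evaluates the log-likelihood for a user-supplied copula \( C_{\theta } \) and marginal survival/density functions. Covariates are suppressed for brevity, the copula partial derivatives are approximated by finite differences (analytic derivatives should be preferred when available), and all names are illustrative rather than part of any package.

```r
# Log-likelihood of Eq. (4.4) for a generic copula C(u, v) and given margins.
# The sub-densities are f_T^#(t) = f_T(t) * dC/du and f_U^#(t) = f_U(t) * dC/dv,
# each evaluated at (u, v) = (S_T(t), S_U(t)).
loglik.copula <- function(t.obs, delta, C, S.T, f.T, S.U, f.U, eps = 1e-6) {
  u <- S.T(t.obs); v <- S.U(t.obs)
  C.u <- (C(u + eps, v) - C(u - eps, v)) / (2 * eps)   # finite-difference dC/du
  C.v <- (C(u, v + eps) - C(u, v - eps)) / (2 * eps)   # finite-difference dC/dv
  sum(delta * log(f.T(t.obs) * C.u) + (1 - delta) * log(f.U(t.obs) * C.v))
}
```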

4.4 Parametric Models

4.4.1 The Burr Model

Escarela and Carrière (2003) considered a copula model with the Burr distribution defined as

$$ S_{T} (t|{\mathbf{x}}_{1i} ) = \{ 1 + \gamma_{1} (\lambda_{1i} t)^{{\nu_{1} }} \}^{{ - 1/\gamma_{1} }} ,\quad t \ge 0;\quad S_{U} (\;u\;|{\mathbf{x}}_{2i} \;) = \{ \;1 + \gamma_{2} (\lambda_{2i} u)^{{\nu_{2} }} \;\}^{{ - 1/\gamma_{2} }} ,\quad u \ge 0, $$

where \( \nu_{j} > 0 \), \( \gamma_{j} > 0 \), and \( \lambda_{ji} = \exp (\beta_{j0} + {\varvec{\upbeta}}_{j}^{{\prime }} {\mathbf{x}}_{ji} ) \) for \( j = 1 \) and 2. The Burr distribution includes many distributions as special cases: \( \nu_{j} = 1 \) gives the Pareto distribution, \( \gamma_{j} = 1 \) gives the log-logistic distribution, and \( \gamma_{j} \to 0 \) gives the Weibull distribution. For the copula, Escarela and Carrière (2003) considered the Frank copula

$$ C_{\theta } (u,v) = - \frac{1}{\theta }\log \left\{ {1 + \frac{{(e^{ - \theta u} - 1)(e^{ - \theta v} - 1)}}{{e^{ - \theta } - 1}}} \right\},\quad \theta \ne 0. $$

Their motivation for using the Frank copula is that it accommodates both positive dependence \( (\theta > 0) \) and negative dependence \( (\theta < 0) \) between the two variables.

4.4.2 The Weibull Model

Likelihood-based analyses of Escarela and Carrière (2003) focused on the Weibull model

$$ S_{T} (t|{\mathbf{x}}_{1i} ) = \exp \{ - (\lambda_{1i} t)^{{\nu_{1} }} \} ,\quad t \ge 0;\quad S_{U} (u|{\mathbf{x}}_{2i} ) = \exp \{ - (\lambda_{2i} u)^{{\nu_{2} }} \} ,\quad u \ge 0. $$

With the Frank copula model, they maximize the log-likelihood of Eq. (4.4) with respect to \( (\beta_{10} ,{\varvec{\upbeta}}_{1} ,\nu_{1} ,\beta_{20} ,{\varvec{\upbeta}}_{2} ,\nu_{2} ) \) given the value θ. This leads to the profile likelihood

$$ \ell^{*} (\theta ) = \mathop {\hbox{max} }\limits_{{(\beta_{10} ,{\varvec{\upbeta}}_{1} ,\nu_{1} ,\beta_{20} ,{\varvec{\upbeta}}_{2} ,\nu_{2} )}} \ell (\beta_{10} ,{\varvec{\upbeta}}_{1} ,\nu_{1} ,\beta_{20} ,{\varvec{\upbeta}}_{2} ,\nu_{2} |\theta ). $$

The MLE of \( (\beta_{10} ,{\varvec{\upbeta}}_{1} ,\nu_{1} ,\beta_{20} ,{\varvec{\upbeta}}_{2} ,\nu_{2} ) \) is then obtained at the selected value \( \hat{\theta } = \arg \max_{\theta } \ell^{*} (\theta ) \).
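A minimal R sketch of this profile likelihood step is given below, reusing loglik.copula from the sketch in Sect. 4.3. Covariates are omitted to keep the sketch short, the marginal parameters are optimized on the log scale, the grid of θ values is arbitrary, and the data vectors t.obs and delta are the toy values from the earlier snippet.

```r
# Frank copula C_theta(u, v), theta != 0
frank <- function(u, v, theta)
  -log(1 + (exp(-theta * u) - 1) * (exp(-theta * v) - 1) / (exp(-theta) - 1)) / theta

# Negative log-likelihood of the Weibull margins for a fixed theta
neg.loglik <- function(par, theta, t.obs, delta) {
  l1 <- exp(par[1]); n1 <- exp(par[2]); l2 <- exp(par[3]); n2 <- exp(par[4])
  S.T <- function(t) exp(-(l1 * t)^n1)
  f.T <- function(t) n1 * l1 * (l1 * t)^(n1 - 1) * exp(-(l1 * t)^n1)
  S.U <- function(t) exp(-(l2 * t)^n2)
  f.U <- function(t) n2 * l2 * (l2 * t)^(n2 - 1) * exp(-(l2 * t)^n2)
  -loglik.copula(t.obs, delta, function(u, v) frank(u, v, theta), S.T, f.T, S.U, f.U)
}

# Profile log-likelihood l*(theta) and its maximizer over a grid (toy data as before)
profile.loglik <- function(theta, t.obs, delta)
  -optim(rep(0, 4), neg.loglik, theta = theta, t.obs = t.obs, delta = delta)$value
theta.grid <- c(-5, -2, 2, 5, 10)
prof <- sapply(theta.grid, profile.loglik, t.obs = t.obs, delta = delta)
theta.hat <- theta.grid[which.max(prof)]
```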

The data analysis of Escarela and Carrière (2003) revealed that the estimator \( \hat{\theta } \) had a wide confidence interval (CI) when no covariate entered the model. This phenomenon is related to the non-identifiability of the model. The CI of \( \hat{\theta } \) narrowed when many covariates entered the model. Heckman and Honoré (1989) showed that the non-identifiability is resolved by adding covariates to the marginal models. Unfortunately, no papers give the conditions (e.g., how many covariates or how large a sample) required for \( \hat{\theta } \) to estimate the true value of θ with reasonable precision.

In this context, we suggest regarding the approach of Escarela and Carrière (2003) as a two-stage procedure. The first stage selects (rather than estimates) θ via the profile likelihood. With the selected value \( \hat{\theta } \), the second stage estimates the remaining parameters \( (\beta_{10} ,{\varvec{\upbeta}}_{1} ,\nu_{1} ,\beta_{20} ,{\varvec{\upbeta}}_{2} ,\nu_{2} ) \) by the MLE. In keeping with the assumed-copula approach, the SEs of \( (\beta_{10} ,{\varvec{\upbeta}}_{1} ,\nu_{1} ,\beta_{20} ,{\varvec{\upbeta}}_{2} ,\nu_{2} ) \) need not account for the variation of \( \hat{\theta } \).

4.4.3 The Pareto Model

In the absence of covariates, Shih et al. (2018) considered the Pareto marginal models

$$ S_{T} (t) = (1 + \alpha_{1} t)^{{ - \gamma_{1} }} ,\quad t \ge 0;\quad S_{U} (u) = (1 + \alpha_{2} u)^{{ - \gamma_{2} }} ,\quad u \ge 0, $$

where \( \alpha_{j} > 0 \) and \( \gamma_{j} > 0 \) are re-parameterized from the Burr models. The marginal hazard functions are \( h_{T} (t) = \alpha_{1} \gamma_{1} /(1 + \alpha_{1} t) \) and \( h_{U} (u) = \alpha_{2} \gamma_{2} /(1 + \alpha_{2} u) \) and the marginal density functions are \( f_{T} (t) = h_{T} (t)S_{T} (t) \) and \( f_{U} (u) = h_{U} (u)S_{U} (u) \). Applying the Frank copula to Eq. (4.4), the log-likelihood can be written as

$$ \begin{aligned} \ell (\alpha_{1} ,\alpha_{2} ,\gamma_{1} ,\gamma_{2} |\theta ) & = \sum\limits_{i = 1}^{n} {\delta_{i} \{ \log f_{T} (t_{i} ) - \theta S_{T} (t_{i} ) + \log (e^{{ - \theta S_{U} (t_{i} )}} - 1) - \log (e^{ - \theta } - 1) + \theta S(t_{i} )\} } \\ & \quad + \sum\limits_{i = 1}^{n} {(1 - \delta_{i} )\{ \log f_{U} (t_{i} ) - \theta S_{U} (t_{i} ) + \log (e^{{ - \theta S_{T} (t_{i} )}} - 1) - \log (e^{ - \theta } - 1) + \theta S(t_{i} )\} } , \\ \end{aligned} $$

where \( S(t) = C_{\theta } \{ S_{T} (t),S_{U} (t)\} \). The MLE is obtained by maximizing the preceding equation.

They developed a Newton–Raphson algorithm to obtain the MLE of \( (\alpha_{1} ,\alpha_{2} ,\gamma_{1} ,\gamma_{2} ) \) for a given value of θ; hence, the model uses an assumed copula. The Bivariate.Pareto R package (Shih and Lee 2018) can be used to compute the MLE and the SEs of the parameters. Their Newton–Raphson algorithm employs a randomization scheme to reduce the sensitivity of the convergence results to the initial values, and is termed the randomized Newton–Raphson algorithm (Hu and Emura 2015). When θ is unknown, the profile likelihood estimate was suggested, namely \( \hat{\theta } = \arg \max_{\theta } \ell^{*} (\theta ) \), where \( \ell^{*} (\theta ) = \mathop {\hbox{max} }\limits_{{(\alpha_{1} ,\alpha_{2} ,\gamma_{1} ,\gamma_{2} )}} \ell (\alpha_{1} ,\alpha_{2} ,\gamma_{1} ,\gamma_{2} |\theta ) \). However, they reported that the profile likelihood occasionally has no peak and that \( \hat{\theta } \) has a large sampling variation. These problems are related to the non-identifiability of competing risks data (Tsiatis 1975).

Due to the difficulty of estimating θ, Shih et al. (2018) considered a restricted model \( S_{T} (t) = S_{U} (t) = (1 + \alpha t)^{ - \gamma } \). The model makes the strong assumption that the two marginal distributions are the same. Under the Frank copula, they developed the randomized Newton–Raphson algorithm to obtain the MLE of \( (\alpha ,\gamma ,\theta ) \). While the likelihood always has a peak under this restricted model, the sampling variation of \( \hat{\theta } \) remains large. Including covariates in the marginal Pareto models may improve the precision of \( \hat{\theta } \). Alternatively, a sensitivity analysis may be considered under a few selected values of θ.
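Under this restricted model, the joint maximization over \( (\alpha ,\gamma ,\theta ) \) can be sketched in a few lines of R, reusing loglik.copula and frank from the earlier sketches. This is a plain optim-based illustration, not the randomized Newton–Raphson algorithm of Shih et al. (2018); the data vectors t.obs and delta are again the toy values used earlier.

```r
# Restricted model S_T(t) = S_U(t) = (1 + a*t)^(-g) under the Frank copula;
# positivity of (a, g) is enforced via exp(), and theta is unrestricted (theta != 0).
neg.loglik.res <- function(par, t.obs, delta) {
  a <- exp(par[1]); g <- exp(par[2]); theta <- par[3]
  S <- function(t) (1 + a * t)^(-g)                     # common Pareto survival function
  f <- function(t) a * g * (1 + a * t)^(-g - 1)         # common Pareto density
  -loglik.copula(t.obs, delta, function(u, v) frank(u, v, theta), S, f, S, f)
}
fit <- optim(c(0, 0, 2), neg.loglik.res, t.obs = t.obs, delta = delta, hessian = TRUE)
# SEs (on the transformed scale) via sqrt(diag(solve(fit$hessian)))
```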

4.4.4 The Burr III Model

In the absence of covariates, Shih and Emura (2018) considered the Burr III marginal distributions

$$ S_{T} (t) = 1 - (1 + t^{ - \gamma } )^{ - \alpha } ,\quad t > 0;\quad S_{U} (u) = 1 - (1 + u^{ - \gamma } )^{ - \beta } ,\,\,u > 0, $$

where \( (\alpha ,\beta ,\gamma ) \) are positive parameters. They considered the generalized FGM copula with a copula parameter \( \theta \). In their model, the copula is imposed on the bivariate distribution function rather than the bivariate survival function. More details about this copula, such as the admissible range of \( \theta \) and expressions for Kendall's tau, can be found in Amini et al. (2011), Domma and Giordano (2013), and Shih and Emura (2016, 2018).

Shih and Emura (2018) used the randomized Newton–Raphson algorithm to obtain the MLE of \( (\alpha ,\beta ,\gamma ) \) given the value of θ. When the value of θ is unknown, they suggested first selecting θ by the profile likelihood estimate \( \hat{\theta } = \arg \max_{\theta } \ell^{*} (\theta ) \), where \( \ell^{*} (\theta ) = \mathop {\hbox{max} }\limits_{(\alpha ,\beta ,\gamma )} \ell (\alpha ,\beta ,\gamma |\theta ) \), and then making inference for \( (\alpha ,\beta ,\gamma ) \) given \( \hat{\theta } \). They also proposed a goodness-of-fit method to test the validity of the generalized FGM copula and the Burr III marginal models. The estimation and goodness-of-fit algorithms are implemented in the GFGM.copula R package (Shih 2018). Their method is developed for bivariate competing risks data, where dependent censoring is a competing risk of death, and death is a competing risk of dependent censoring.

4.4.5 The Piecewise Exponential Model

The piecewise exponential model has been considered to fit survival data with dependent censoring (Staplin et al. 2015; Emura and Michimae 2017). Let \( 0 = a_{0} < a_{1} < \cdots < a_{m} \) be a knot sequence, where m is the number of knots. Assume that the hazard function for T is a constant \( e^{{\theta_{j} }} \) on the interval \( (a_{j - 1} ,a_{j} ] \) for \( j = 1, \ldots ,m \), where \( {\varvec{\uptheta}} = (\theta_{1} , \ldots ,\theta_{m}) \) are parameters without restriction on their ranges. The survival function is

$$ S_{T} (t;{\varvec{\uptheta}}) = \exp \left\{ { - e^{{\theta_{j} }} (t - a_{j - 1} ) - \sum\limits_{k = 1}^{j - 1} {e^{{\theta_{k} }} (a_{k} - a_{k - 1} )} } \right\},\quad \quad t \in (a_{j - 1} ,a_{j} ], $$

where \( \sum\nolimits_{k = 1}^{0} {( \cdot ) \equiv 0} \). In a similar fashion, define the survival function \( S_{U} (u;{\varvec{\upgamma}}) \) for the censoring time U, where \( {\varvec{\upgamma}} = (\gamma_{1} , \ldots ,\gamma_{m} ) \).
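The displayed survival function is straightforward to code; the following R sketch follows the formula directly. The function name and arguments are illustrative, and t is assumed to lie in \( (0,a_{m} ] \).

```r
# Piecewise exponential survival function S_T(t; theta) with knots 0 = a_0 < a_1 < ... < a_m
S.pwe <- function(t, theta, knots) {            # knots = c(a_1, ..., a_m); theta = (theta_1, ..., theta_m)
  a <- c(0, knots)                              # prepend a_0 = 0
  j <- findInterval(t, a, left.open = TRUE)     # index j such that t lies in (a_{j-1}, a_j]
  rate <- exp(theta)                            # constant hazard e^{theta_j} on (a_{j-1}, a_j]
  cumhaz <- rate[j] * (t - a[j]) +
    sapply(j, function(jj) sum(rate[seq_len(jj - 1)] * diff(a)[seq_len(jj - 1)]))
  exp(-cumhaz)
}
S.pwe(c(0.5, 1.5, 2.5), theta = c(-1, 0, 0.5), knots = c(1, 2, 3))
```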

Emura and Michimae (2017) considered a copula model

$$ \Pr (T > t,U > u) = C_{\theta } \{ S_{T} (t;{\varvec{\uptheta}}),S_{U} (u;{\varvec{\upgamma}})\} ,\,\,{\varvec{\uptheta}} = (\theta_{1} , \ldots ,\theta_{m} ),{\varvec{\upgamma}} = (\gamma_{1} , \ldots ,\gamma_{m} ), $$

where \( S_{T} (t;{\varvec{\uptheta}}) \) and \( S_{U} (u;{\varvec{\upgamma}}) \) follow the piecewise exponential models. The Clayton copula and the Joe copula were chosen for their numerical studies. They developed inference procedures based on the likelihood in Eq. (4.4) for a given value of θ; hence, they applied an assumed copula. They did not use the profile likelihood for selecting θ since it may not work well when the marginal distributions contain many parameters. Instead, they suggested a sensitivity analysis to examine the results under a few different values of θ.

Staplin et al. (2015) originally proposed the piecewise exponential models for dependent censoring, but did not use copulas. Consequently, the sub-density functions in their likelihood function require some numerical integrations of the joint density of T and U.

4.5 Semi-parametric Models

4.5.1 The Transformation Model

Chen (2010) considered a semi-parametric transformation model defined as

$$ S_{T} (t|{\mathbf{x}}_{1i} ) = \exp [ - G_{1} \{\Lambda _{0} (t)e^{{{\varvec{\upbeta}}_{1}^{{\prime }} {\mathbf{x}}_{1i} }} \} ],\quad S_{U} (u|{\mathbf{x}}_{2i} ) = \exp [ - G_{2} \{\Gamma _{0} (u)e^{{{\varvec{\upbeta}}_{2}^{{\prime }} {\mathbf{x}}_{2i} }} \} ], $$

where \( {\varvec{\upbeta}}_{j} \) are regression coefficients, and \( G_{j} ( \cdot ) \) is a known and nonnegative increasing function such that \( G_{j} (0) = 0 \), \( G_{j} (\infty ) = \infty \), and \( g_{j} (t) \equiv dG_{j} (t)/dt > 0 \) for \( j = 1 \) and 2; \( \Lambda _{0} \) and \( \Gamma _{0} \) are unknown increasing functions. No distributional assumptions are imposed on \( \Lambda _{0} \) and \( \Gamma _{0} \). The linear transformation \( G_{j} (t) = t \) corresponds to the Cox model.

Under the semi-parametric transformation model, the cause-specific hazard functions are

$$ h_{T}^{\# } (t|{\mathbf{x}}_{i} ) = \lambda_{0} (t)e^{{{\varvec{\upbeta}}_{1}^{{\prime }} {\mathbf{x}}_{1i} }} \eta_{1i} (t;{\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,\Lambda _{0} ,\Gamma _{0} |\theta ),\quad h_{U}^{\# } (t|{\mathbf{x}}_{i} ) = \gamma_{0} (t)e^{{{\varvec{\upbeta}}_{2}^{{\prime }} {\mathbf{x}}_{2i} }} \eta_{2i} (t;{\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,\Lambda _{0} ,\Gamma _{0} |\theta ), $$

where \( \lambda_{0} (t) = d\Lambda _{0} (t)/dt \), \( \gamma_{0} (t) = d\Gamma _{0} (t)/dt \),

$$ \begin{aligned} & \eta_{1i} (t;{\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,\Lambda _{0} ,\Gamma _{0} |\theta ) = g_{1} \{\Lambda _{0} (t)e^{{{\varvec{\upbeta}}_{1}^{{\prime }} {\mathbf{x}}_{1i} }} \} S_{T} (t|{\mathbf{x}}_{1i} )D_{\theta ,1} [S_{T} (t|{\mathbf{x}}_{1i} ),S_{U} (t|{\mathbf{x}}_{2i} )], \\ & \eta_{2i} (t;{\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,\Lambda _{0} ,\Gamma _{0} |\theta ) = g_{2} \{\Gamma _{0} (t)e^{{{\varvec{\upbeta}}_{2}^{{\prime }} {\mathbf{x}}_{2i} }} \} S_{U} (t|{\mathbf{x}}_{2i} )D_{\theta ,2} [S_{T} (t|{\mathbf{x}}_{1i} ),S_{U} (t|{\mathbf{x}}_{2i} )], \\ \end{aligned} $$
$$ D_{\theta ,1} (u,v) = \frac{{\partial C_{\theta } (u,v)/\partial u}}{{C_{\theta } (u,v)}},\quad D_{\theta ,2} (u,v) = \frac{{\partial C_{\theta } (u,v)/\partial v}}{{C_{\theta } (u,v)}}. $$
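For example, under the Clayton copula \( C_{\theta } (u,v) = (u^{ - \theta } + v^{ - \theta } - 1)^{ - 1/\theta } \) with θ > 0 (obtained from the generator in Sect. 4.2), differentiating the copula gives the closed forms coded below; this is a short sketch and the function names are illustrative.

```r
# D_{theta,1}(u, v) and D_{theta,2}(u, v) under the Clayton copula
D.clayton.1 <- function(u, v, theta) u^(-theta - 1) / (u^(-theta) + v^(-theta) - 1)
D.clayton.2 <- function(u, v, theta) v^(-theta - 1) / (u^(-theta) + v^(-theta) - 1)
```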

Under the independence copula \( C_{\theta } (u,v) = uv \), the cause-specific hazard functions are equal to the marginal hazards:

$$ h_{T}^{\# } (t|{\mathbf{x}}_{i} ) = \lambda_{0} (t)e^{{{\varvec{\upbeta}}_{1}^{{\prime }} {\mathbf{x}}_{1i} }} g_{1} \{\Lambda _{0}(t)e^{{{\varvec{\upbeta}}_{1}^{{\prime }} {\mathbf{x}}_{1i} }}\}, \quad h_{U}^{\# } (t|{\mathbf{x}}_{i} ) = \gamma_{0} (t)e^{{{\varvec{\upbeta}}_{2}^{{\prime }} {\mathbf{x}}_{2i} }} g_{2} \{\Gamma _{0}(t)e^{{{\varvec{\upbeta}}_{2}^{{\prime }} {\mathbf{x}}_{2i} }}\}.$$

To obtain the MLE of \( ({\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,\Lambda _{0} ,\Gamma _{0} ) \), we treat \( \Lambda _{0} \) and \( \Gamma _{0} \) as increasing step functions with jump sizes \( d\Lambda _{0} (t_{i} ) =\Lambda _{0} (t_{i} ) -\Lambda _{0} (t_{i} - ) \) for \( \delta_{i} = 1 \) and \( d\Gamma _{0} (t_{i} ) =\Gamma _{0} (t_{i} ) -\Gamma _{0} (t_{i} - ) \) for \( \delta_{i} = 0 \). Putting the cause-specific hazard functions into Eq. (4.5) and replacing \( \lambda_{0} (t_{i} ) \) by \( d\Lambda _{0} (t_{i} ) \) and \( \gamma_{0} (t_{i} ) \) by \( d\Gamma _{0} (t_{i} ) \), we obtain the log-likelihood

$$ \begin{aligned} \ell ({\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,\Lambda _{0} ,\Gamma _{0} |\theta ) & = \sum\limits_{i} {\delta_{i} [{\varvec{\upbeta}}_{1}^{{\prime }} {\mathbf{x}}_{1i} + \log \eta_{1i} (t_{i} ;{\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,\Lambda _{0} ,\Gamma _{0} |\theta ) + \log d\Lambda _{0} (t_{i} )]} \\ & \quad + \sum\limits_{i} {(1 - \delta_{i} )[{\varvec{\upbeta}}_{2}^{{\prime }} {\mathbf{x}}_{2i} + \log \eta_{2i} (t_{i} ;{\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,\Lambda _{0} ,\Gamma _{0} |\theta ) + \log d\Gamma _{0} (t_{i} )]} \\ & \quad - \sum\limits_{i} {\Phi _{\theta } [S_{T} (t_{i} |{\mathbf{x}}_{1i} ),S_{U} (t_{i} |{\mathbf{x}}_{2i} )]} , \\ \end{aligned} $$

where \( \Phi _{\theta } (u,v) = - \log C_{\theta } (u,v) \). Since the marginal distributions have a number of parameters to be estimated, the profile likelihood may not properly identify a suitable value of \( \theta \). Chen (2010) suggested a sensitivity analysis to examine the result under a few different values of \( \theta \), possibly selected by prior knowledge and expert opinion.

The approach of Chen (2010) reduces to Cox’s partial likelihood approach (Cox 1972) under the independence copula and the linear transformation. Under these assumptions, the MLE \( ({\hat{\varvec{\upbeta}}}_{1} ,{\hat{\varvec{\upbeta}}}_{2} ,\hat{\Lambda }_{0} ,\hat{\Gamma }_{0} ) \) is obtained by maximizing two functions

$$ \begin{aligned} \ell_{1} ({\varvec{\upbeta}}_{1} ,\Lambda _{0} ) & = \sum\limits_{i} {\delta_{i} [{\varvec{\upbeta}}_{1}^{{\prime }} {\mathbf{x}}_{1i} + \log \,d\Lambda _{0} (t_{i} )]} + \sum\limits_{i} {\log S_{T} (t_{i} |{\mathbf{x}}_{1i} )} , \\ \ell_{2} ({\varvec{\upbeta}}_{2} ,\Gamma _{0} ) & = \sum\limits_{i} {(1 - \delta_{i} )[{\varvec{\upbeta}}_{2}^{{\prime }} {\mathbf{x}}_{2i} + \log d\Gamma _{0} (t_{i} )]} + \sum\limits_{i} {\log S_{U} (t_{i} |{\mathbf{x}}_{2i} )} , \\ \end{aligned} $$

since \( \ell ({\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,\Lambda _{0} ,\Gamma _{0} ) = \ell_{1} ({\varvec{\upbeta}}_{1} ,\Lambda _{0} ) + \ell_{2} ({\varvec{\upbeta}}_{2} ,\Gamma _{0} ) \). Then, the MLE \( ({\hat{\varvec{\upbeta}}}_{1} ,\hat{\Lambda }_{0} ) \) of \( ({\varvec{\upbeta}}_{1} ,\Lambda _{0} ) \) consists of the partial likelihood estimator \( {\hat{\varvec{\upbeta}}}_{1} \) and the Breslow estimator \( \hat{\Lambda }_{0} \) (Chap. 2).
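In practice, these two maximizations correspond to two separate Cox regressions, one treating death as the event and one treating censoring as the event. A hedged sketch with the survival package is given below; the data frame d and the covariates x1 and x2 are hypothetical.

```r
library(survival)
# Under the independence copula and the linear transformation, beta_1 and the Breslow
# estimator of Lambda_0 come from a Cox fit for death, while beta_2 and Gamma_0 come
# from a Cox fit that treats censoring as the event of interest.
fit.T <- coxph(Surv(time, status == 1) ~ x1 + x2, data = d)  # partial likelihood estimate of beta_1
fit.U <- coxph(Surv(time, status == 0) ~ x2, data = d)       # model for the censoring time
Lambda0.hat <- basehaz(fit.T, centered = FALSE)              # Breslow-type baseline cumulative hazard
```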

4.5.2 The Spline Model

Emura et al. (2017) considered a spline-based model defined as

$$ S_{T} (t|{\mathbf{x}}_{1i} ) = \exp \{ -\Lambda _{0} (t)e^{{{\varvec{\upbeta}}_{1}^{{\prime }} {\mathbf{x}}_{1i} }} \} ,\quad S_{U} (u|{\mathbf{x}}_{2i} ) = \exp \{ -\Gamma _{0} (u)e^{{{\varvec{\upbeta}}_{2}^{{\prime }} {\mathbf{x}}_{2i} }} \} , $$

where \( {\varvec{\upbeta}}_{j} \) are regression coefficients, and the baseline hazard functions are modeled by

$$ \frac{d}{dt}\Lambda _{0} (t) = \lambda_{0} (t) = \sum\nolimits_{\ell = 1}^{5} {g_{\ell } M_{\ell } (t)} = {\mathbf{g}}^{{\prime }} {\mathbf{M}}(t),\quad \frac{d}{dt}\Gamma _{0} (t) = \gamma_{0} (t) = \sum\nolimits_{\ell = 1}^{5} {h_{\ell } M_{\ell } (t)} = {\mathbf{h}}^{{\prime }} {\mathbf{M}}(t), $$

where \( {\mathbf{M}}(t) = (M_{1} (t), \ldots ,M_{5} (t))^{{\prime }} \) are the cubic M-spline basis functions (Ramsay 1988). Here, \( {\mathbf{g}}^{{\prime }} = (g_{1} , \ldots ,g_{5} ) \) and \( {\mathbf{h}}^{{\prime }} = (h_{1} , \ldots ,h_{5} ) \) are unknown positive parameters. These five-parameter approximations provide good flexibility for real applications (Ramsay 1988) and are one reasonable choice (Commenges and Jacqmin-Gadda 2015). Since the spline bases are easy to integrate, the baseline cumulative hazard functions are computed as \( \Lambda _{0} (t) = \sum\nolimits_{\ell = 1}^{5} {g_{\ell } I_{\ell } (t)} \) and \( \Gamma _{0} (t) = \sum\nolimits_{\ell = 1}^{5} {h_{\ell } I_{\ell } (t)} \), where \( I_{\ell} (t) \) is the integral of \( M_{\ell} (t) \), called the I-spline basis (Ramsay 1988).

The joint.Cox package (Emura 2018) offers the functions M.spline() for computing \( M_{\ell} (t) \) and I.spline() for \( I_{\ell} (t) \). To compute these spline bases, one needs to specify the range of t. The package uses the range \( t \in [\xi_{1} ,\xi_{3} ] \) for the equally spaced knots \( \xi_{1} < \xi_{2} < \xi_{3} \), where \( \xi_{2} = (\xi_{1} + \xi_{3} )/2 \). A possible choice is \( \xi_{1} = \min_{i} (t_{i} ) \) and \( \xi_{3} = \max_{i} (t_{i} ) \). The expressions of \( M_{\ell} (t) \) and \( I_{\ell} (t) \) are given in Appendix A. Figure 4.1 displays the M- and I-spline basis functions with the knots \( \xi_{1} = 1 \), \( \xi_{2} = 2 \), and \( \xi_{3} = 3 \).

Fig. 4.1 M-spline basis functions (left panel) and I-spline basis functions (right panel) with knots \( \xi_{1} = 1 \), \( \xi_{2} = 2 \), and \( \xi_{3} = 3 \)

Under the spline model, the cause-specific hazard functions are

$$ h_{T}^{\# } (t|{\mathbf{x}}_{i} ) = \lambda_{0} (t)e^{{{\varvec{\upbeta}}_{1}^{{\prime }} {\mathbf{x}}_{1i} }} \eta_{1i} (t;{\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,\Lambda _{0} ,\Gamma _{0} |\theta ),\quad h_{U}^{\# } (t|{\mathbf{x}}_{i} ) = \gamma_{0} (t)e^{{{\varvec{\upbeta}}_{2}^{{\prime }} {\mathbf{x}}_{2i} }} \eta_{2i} (t;{\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,\Lambda _{0} ,\Gamma _{0} |\theta ), $$

where

$$ \begin{aligned} & \eta_{1i} (t;{\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,\Lambda _{0} ,\Gamma _{0} |\theta ) = S_{T} (t|{\mathbf{x}}_{1i} )D_{\theta ,1} [S_{T} (t|{\mathbf{x}}_{1i} ),S_{U} (t|{\mathbf{x}}_{2i} )], \\ & \eta_{2i} (t;{\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,\Lambda _{0} ,\Gamma _{0} |\theta ) = S_{U} (t|{\mathbf{x}}_{2i} )D_{\theta ,2} [S_{T} (t|{\mathbf{x}}_{1i} ),S_{U} (t|{\mathbf{x}}_{2i} )]. \\ \end{aligned} $$

Putting these formulas into Eq. (4.5), we obtain the log-likelihood

$$ \begin{aligned} \ell ({\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,{\mathbf{g}},{\mathbf{h}}|\theta ) & = \sum\limits_{i} {\delta_{i} [{\varvec{\upbeta}}_{1}^{{\prime }} {\mathbf{x}}_{1i} + \log \eta_{1i} (t_{i} ;{\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,\Lambda _{0} ,\Gamma _{0} |\theta ) + \log \lambda_{0} (t_{i} )]} \\ & \quad + \sum\limits_{i} {(1 - \delta_{i} )[{\varvec{\upbeta}}_{2}^{{\prime }} {\mathbf{x}}_{2i} + \log \eta_{2i} (t_{i} ;{\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,\Lambda _{0} ,\Gamma _{0} |\theta ) + \log \gamma_{0} (t_{i} )]} \\ & \quad - \sum\limits_{i} {\Phi _{\theta } [S_{T} (t_{i} |{\mathbf{x}}_{1i} ),S_{U} (t_{i} |{\mathbf{x}}_{2i} )]} . \\ \end{aligned} $$

The estimator of \( ({\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,{\mathbf{g}},{\mathbf{h}}) \) is obtained by maximizing the penalized log-likelihood

$$ \ell ({\varvec{\upbeta}}_{1} ,{\varvec{\upbeta}}_{2} ,{\mathbf{g}},{\mathbf{h}}|\theta ) - \kappa_{1} \int {\ddot{\lambda }_{0} (t)^{2} dt} - \kappa_{2} \int {\ddot{\gamma }_{0} (t)^{2} dt} , $$

where \( \ddot{f}(t) = d^{2} f(t)/dt^{2} \), and \( (\kappa_{1} ,\kappa_{2} ) \) are given nonnegative values. The parameters \( (\kappa_{1} ,\kappa_{2} ) \) are called smoothing parameters; they control the degree of penalization of the roughness of the two baseline hazard functions. It is shown in Appendix A that

$$ \int\limits_{{\xi_{1} }}^{{\xi_{3} }} {\ddot{\lambda }_{0} (t)^{2} dt} = {\mathbf{g}}^{{\prime }}\Omega {\mathbf{g}},\quad \int\limits_{{\xi_{1} }}^{{\xi_{3} }} {\ddot{\gamma }_{0} (t)^{2} dt} = {\mathbf{h}}^{{\prime }}\Omega {\mathbf{h}},\quad\Omega = \frac{1}{{\Delta^{5} }}\left[ {\begin{array}{*{20}c} {192} & { - 132} & {24} & {12} & 0 \\ { - 132} & {96} & { - 24} & { - 12} & {12} \\ {24} & { - 24} & {24} & { - 24} & {24} \\ {12} & { - 12} & { - 24} & {96} & { - 132} \\ 0 & {12} & {24} & { - 132} & {192} \\ \end{array} } \right], $$

where \( \Delta = \xi_{2} - \xi_{1} = \xi_{3} - \xi_{2} \). A naïve approach is to set \( \kappa_{1} = \kappa_{2} = 0 \) as in Shih and Emura (2018).
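The penalty matrix is easy to transcribe; the short R sketch below constructs \( \Omega \) from the display above and evaluates the roughness penalty \( {\mathbf{g}}^{{\prime }}\Omega {\mathbf{g}} \) for a hypothetical coefficient vector.

```r
# Penalty matrix Omega for equally spaced knots with spacing Delta = xi2 - xi1 = xi3 - xi2
Delta <- 1                                     # e.g., knots (1, 2, 3)
Omega <- (1 / Delta^5) * matrix(c(
   192, -132,  24,  12,    0,
  -132,   96, -24, -12,   12,
    24,  -24,  24, -24,   24,
    12,  -12, -24,  96, -132,
     0,   12,  24, -132, 192), nrow = 5, byrow = TRUE)
g <- c(0.5, 1, 2, 1, 0.5)                      # hypothetical spline coefficients
penalty <- drop(t(g) %*% Omega %*% g)          # integral of the squared second derivative of lambda_0(t)
```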

A more sophisticated approach is to choose \( (\kappa_{1} ,\kappa_{2} ) \) by optimizing a likelihood cross-validation (LCV) criterion (O'Sullivan 1988). Under the independence copula, the penalized log-likelihood is written as the sum of two marginal penalized log-likelihoods,

$$ \left[ {\ell_{1} ({\varvec{\upbeta}}_{1} ,\Lambda _{0} ) - \kappa_{1} \int {\ddot{\lambda }_{0} (t)^{2} dt} } \right] + \left[ {\ell_{2} ({\varvec{\upbeta}}_{2} ,\Gamma _{0} ) - \kappa_{2} \int {\ddot{\gamma }_{0} (t)^{2} dt} } \right], $$

where

$$ \begin{aligned} \ell_{1} ({\varvec{\upbeta}}_{1} ,\Lambda _{0} ) & = \sum\limits_{i} {\delta_{i} [{\varvec{\upbeta}}_{1}^{{\prime }} {\mathbf{x}}_{1i} + \log \lambda_{0} (t_{i} )]} - \sum\limits_{i} {\Lambda _{0} (t_{i} )\exp ({\varvec{\upbeta}}_{1}^{{\prime }} {\mathbf{x}}_{1i} )} , \\ \ell_{2} ({\varvec{\upbeta}}_{2} ,\Gamma _{0} ) & = \sum\limits_{i} {(1 - \delta_{i} )[{\varvec{\upbeta}}_{2}^{{\prime }} {\mathbf{x}}_{2i} + \log \gamma_{0} (t_{i} )]} - \sum\limits_{i} {\Gamma _{0} (t_{i} )\exp ({\varvec{\upbeta}}_{2}^{{\prime }} {\mathbf{x}}_{2i} )} . \\ \end{aligned} $$

We suggest choosing \( \kappa_{1} \) and \( \kappa_{2} \) based on the two marginal LCVs defined as

$$ LCV_{1} = \hat{\ell }_{1} - {\text{tr}}\{ \hat{H}_{PL1}^{ - 1} \hat{H}_{1} \} ,\quad LCV_{2} = \hat{\ell }_{2} - {\text{tr}}\{ \hat{H}_{PL2}^{ - 1} \hat{H}_{2} \} , $$

where \( \hat{\ell }_{1} \) and \( \hat{\ell }_{2} \) are the log-likelihood values evaluated at their marginal penalized likelihood estimates, \( \hat{H}_{PL1} \) and \( \hat{H}_{PL2} \) are the converged Hessian matrices of the marginal penalized log-likelihoods, and \( \hat{H}_{1} \) and \( \hat{H}_{2} \) are the converged Hessian matrices of the marginal log-likelihoods such that

$$ \hat{H}_{1} = \hat{H}_{PL1} + 2\kappa_{1} \left[ {\begin{array}{*{20}c} {O_{{p_{1} \times p_{1} }} } & {O_{{p_{1} \times 5}} } \\ {O_{{5 \times p_{1} }} } &\Omega \\ \end{array} } \right],\quad \hat{H}_{2} = \hat{H}_{PL2} + 2\kappa_{2} \left[ {\begin{array}{*{20}c} {O_{{p_{2} \times p_{2} }} } & {O_{{p_{2} \times 5}} } \\ {O_{{5 \times p_{2} }} } &\Omega \\ \end{array} } \right], $$

where \( O \) is a zero matrix and \( p_{j} \) is the dimension of \( {\varvec{\upbeta}}_{j} \) for \( j = 1 \) and 2. The values of \( (\kappa_{1} ,\kappa_{2} ) \) are obtained by maximizing \( LCV_{1} \) over \( \kappa_{1} \) and \( LCV_{2} \) over \( \kappa_{2} \) separately. One may apply the R function splineCox.reg in the joint.Cox R package to find the optimal value of \( \kappa_{1} \) (or \( \kappa_{2} \)).
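The LCV computation itself is a short calculation once the marginal fits are available. The sketch below illustrates a grid search for \( \kappa_{1} \); here pen.fit() is a hypothetical helper (not part of any package) that returns, for a given \( \kappa_{1} \), the marginal penalized likelihood fit with the converged Hessian \( \hat{H}_{PL1} \), the log-likelihood value \( \hat{\ell }_{1} \), and the dimension \( p_{1} \) of \( {\varvec{\upbeta}}_{1} \); Omega is as constructed above.

```r
# LCV_1 as a function of kappa_1, using the relation H_1 = H_PL1 + 2 * kappa_1 * blockdiag(O, Omega)
lcv1 <- function(kappa1) {
  fit <- pen.fit(kappa1)                               # hypothetical penalized-likelihood fit
  pad <- matrix(0, fit$p1 + 5, fit$p1 + 5)
  pad[(fit$p1 + 1):(fit$p1 + 5), (fit$p1 + 1):(fit$p1 + 5)] <- Omega
  H1 <- fit$H.PL1 + 2 * kappa1 * pad                   # Hessian of the unpenalized marginal log-likelihood
  fit$loglik1 - sum(diag(solve(fit$H.PL1, H1)))        # LCV_1 = l1.hat - tr(H_PL1^{-1} H_1)
}
kappa.grid <- 10^(-2:4)                                # candidate smoothing parameters
kappa1.hat <- kappa.grid[which.max(sapply(kappa.grid, lcv1))]
```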