1 Introduction

Longitudinal data occur in many fields such as the medical follow-up studies that involve repeated measurements. In these situations, study subjects are generally observed only at discrete times. Therefore, for the analysis of longitudinal data, two processes need to be considered: one is the response process, which is usually of the primary interest but not continuously observable; the other one is the observation process, which is nuisance but gives rise to the discrete times when the responses are observed.

An extensive literature exists for the analysis of longitudinal data. Sun and Kalbfleisch (1995) and Wellner and Zhang (2000) investigated nonparametric estimation of the mean function when the response process is a counting process. Cheng and Wei (2000), Sun and Wei (2000), Zhang (2002) and Wellner and Zhang (2007) developed semiparametric approaches for regression analysis under the proportional means models. However, with respect to the observation process, most existing approaches assume that the observation times are independent of the underlying response process either completely or given some covariates. For the analysis with a correlated observation process, the existing work is limited, and most of it assumes independent censoring or requires restrictive conditions such as the Poisson assumption or a specified correlation structure for the dependence (Huang et al. 2006; Sun et al. 2007; He et al. 2009; Zhao and Tong 2011; Kim et al. 2012; Li et al. 2013; Zhao et al. 2013; Zhou et al. 2013).

In many situations, however, the response process and the observation and censoring times may be mutually correlated. In addition, such correlations may be time-dependent. For instance, both the observation times and longitudinal responses may depend on the stage of disease progression. Their correlation may change over time, and so may their correlations with the follow-up times. He et al. (2009) considered such correlations in shared frailty models. However, their method requires the assumptions that the underlying random effect is normally distributed and the observation process is a nonhomogeneous Poisson process. Also, all correlations between the three processes are assumed to be fixed over time. Zhao et al. (2013) proposed a robust estimation procedure and relaxed the Poisson assumption required in He et al. (2009). However, the follow-up times are assumed to be independent of covariates, responses and observation times, and the possible correlations between responses and observation times are time-independent. More recently, Sun et al. (2012) presented a joint model with time-dependent correlations between the response process, the observation times and a terminal event, where the random effect associated with the terminal event is fixed over time and follows a specified distribution. In practice, however, such conditions may not hold or may be difficult to check when informative censoring is involved.

We consider regression analysis of longitudinal data when the underlying response process, the observation and censoring times are mutually correlated and none of the correlations is restricted by specified forms or distributions. A general estimation approach is proposed. The remainder of this chapter is organized as follows: In Sect. 2, we introduce the notation and present the model. Section 3 presents the estimation procedure and establishes the asymptotic properties of the resulting estimators. In Sect. 4, a simulation study is performed to evaluate the finite sample properties of the proposed estimators. Some concluding remarks are given in Sect. 5.

2 Notation and Models

Consider a longitudinal study in which the response process of interest is observed only at some discrete sampling time points. For each subject i, i = 1, ⋯ , n, let \(N_{i}(t)\) be the observation process, which gives the cumulative number of observation times up to time t. In practice, one observes \(\widetilde{N}_{i}(t) = N_{i}(t \wedge C_{i})\), where \(a \wedge b =\min (a,b)\) and \(C_{i}\) denotes the censoring or follow-up time. Let \(Y_{i}(t)\) denote the response process, which gives the response of interest at time t but is observed only at the \(m_{i}\) discrete observation times \(\{T_{i,1},\cdots \,,T_{i,m_{i}}\}\) at which \(\widetilde{N}_{i}(t)\) jumps. Suppose that there exists a p-dimensional vector of covariates denoted by \(\mathbf{Z}_{i}\), which will be assumed to be time-independent.

In the following, we model the correlation between \(Y_{i}(t)\), \(N_{i}(t)\) and \(C_{i}\) through an unobserved random vector \(\mathbf{b}_{i}(t) = (b_{1i}(t),b_{2i}(t),b_{3i}(t))'\), which could be time-dependent. Define \(\mathcal{B}_{it} =\{ \mathbf{b}_{i}(s),s \leq t\}\). It will be assumed that the \(\mathbf{b}_{i}(t)\)'s are independent and identically distributed, \(\mathcal{B}_{it}\) is independent of \(\mathbf{Z}_{i}\), and given \(\mathbf{Z}_{i}\) and \(\mathcal{B}_{it}\), \(C_{i}\), \(N_{i}(t)\) and \(Y_{i}(t)\) are mutually independent. To be specific, the mean function of \(Y_{i}(t)\) is assumed to follow the proportional means model

$$\displaystyle{ E\{Y _{i}(t)\vert \mathbf{Z}_{i},\mathbf{b}_{i}(t)\} =\varLambda _{0}(t)\exp \{\beta '\mathbf{Z}_{i} + b_{1i}(t)\}, }$$
(1)

where \(\varLambda _{0}(t)\) is an unknown baseline mean function and β denotes a p-dimensional vector of regression coefficients. When \(b_{1i}(t) = 0\), meaning that \(Y_{i}(t)\) is independent of both \(N_{i}(t)\) and \(C_{i}\), model (1) has been considered extensively by Cheng and Wei (2000), Sun and Wei (2000), Zhang (2002) and Hu et al. (2003) among others. When \(b_{1i}(t)\) is time-independent, model (1) is equivalent to model (3) considered in Zhao et al. (2013). In general, \(b_{1i}(t)\) is unknown and may follow an arbitrary distribution.

The observation process N i (t) follows the proportional rates model

$$\displaystyle{ E\{dN_{i}(t)\vert \mathbf{Z}_{i},\mathbf{b}_{i}(t)\} =\exp \{\gamma '\mathbf{Z}_{i} + b_{2i}(t)\}d\mu _{0}(t)\,, }$$
(2)

where γ is a vector of unknown parameters and \(\mu _{0}(t)\) is an unknown baseline rate function. For the \(C_{i}\)'s, motivated by the additive hazards models that have been commonly used in survival analysis (Lin and Ying 2001; Kalbfleisch and Prentice 2002; Zhang et al. 2005), we consider the additive hazards model. That is, the hazard \(\lambda _{i}(t\vert \mathbf{Z}_{i},\mathbf{b}_{i}(t))\) of \(C_{i}\), defined as the instantaneous rate of observing \(C_{i}\) at time t given that \(C_{i}\) is no smaller than t, is given by

$$\displaystyle{ \lambda _{i}(t\vert \mathbf{Z}_{i},\mathbf{b}_{i}(t)) =\lambda _{0}(t) +\xi '\mathbf{Z}_{i} + b_{3i}(t)\,. }$$
(3)

Here λ 0(t) is an unknown baseline hazard function and ξ denotes the effect of covariates on the hazard function of C i s. Note that instead of model (3), one may consider the proportional hazards model. As pointed out by Lin et al. (1998) and others, the additive model (3) can be more plausible than the proportional hazards model in many applications. Related applications and model-checking techniques of model (3) can be found in Yuen and Burke (1997), Kim and Lee (1998), Ghosh (2003) and Gandy and Jensen (2005) among others.

In the above, models (1)–(3) can be viewed as natural generalizations of some existing and commonly used models. In fact, when any of the \(b_{ki}(t)\)'s (k = 1, 2, 3) is zero or independent of the other \(b_{ji}(t)\)'s (j = 1, 2, 3 and j ≠ k), the corresponding process is independent of the others. Therefore, the proposed joint model also applies to special cases when either the observation or censoring times are noninformative. In general, since the form or distribution of \(\mathbf{b}_{i}(t)\) is arbitrary and completely unspecified, the joint model described above is quite flexible compared to many existing procedures.

Note that in models (1)–(3), for simplicity, we have assumed that the set of covariates that may affect Y i (t), N i (t) and C i is the same. In practice, it is apparent that this may not be the case and actually the estimation procedure proposed below still applies as long as one replaces Z i by appropriate covariates. As an alternative, one can define a single and big covariate vector by combining all different covariates together. In the following, we will focus on estimation of regression parameters β along with γ and ξ. For this, it is easy to see that the use of the existing procedures that assume independence could give biased or even misleading results.

3 Estimation Procedure

In this section, we will present an inference procedure for estimation of β which is usually of the primary interest. For this, first note that the counting process \(\widetilde{N}_{i}(t) = N_{i}(t \wedge C_{i})\) jumps by one at time t if and only if C i  ≥ t and dN i (t) = 1. Also we have

$$\displaystyle\begin{array}{rcl} & & E\{d\widetilde{N}_{i}(t)\vert \mathbf{Z}_{i}\} = E\{I(t \leq C_{i})dN_{i}(t)\vert \mathbf{Z}_{i}\} \\ & & \quad = E\bigg[E\{I(t \leq C_{i})dN_{i}(t)\vert \mathbf{Z}_{i},\mathcal{B}_{it}\}\bigg\vert \mathbf{Z}_{i}\bigg] \\ & & \quad = E\bigg[E\{I(t \leq C_{i})\vert \mathbf{Z}_{i},\mathcal{B}_{it}\}E\{dN_{i}(t)\vert \mathbf{Z}_{i},\mathcal{B}_{it}\}\bigg\vert \mathbf{Z}_{i}\bigg] \\ & & \quad = E\bigg[exp\{ -\varLambda _{0}^{{\ast}}(t) - B_{ i}(t) -\xi '\mathbf{Z}_{i}^{{\ast}}(t)\}\exp \{\gamma '\mathbf{Z}_{ i} + b_{2i}(t)\}d\mu _{0}(t)\bigg\vert \mathbf{Z}_{i}\bigg] \\ & & \quad =\exp \{\gamma '\mathbf{Z}_{i} -\xi '\mathbf{Z}_{i}^{{\ast}}(t)\}d\varLambda _{ 1}^{{\ast}}(t), {}\end{array}$$
(4)

where

$$\displaystyle{\varLambda _{0}^{{\ast}}(t) =\int _{ 0}^{t}\lambda _{ 0}(s)ds,\;\;B_{i}(t) =\int _{ 0}^{t}b_{ 3i}(s)ds,\;\;\mathbf{Z}_{i}^{{\ast}}(t) =\int _{ 0}^{t}\mathbf{Z}_{ i}ds\;\;}$$

and

$$\displaystyle{d\varLambda _{1}^{{\ast}}(t) =\exp \{ -\varLambda _{ 0}^{{\ast}}(t)\}E[exp\{b_{ 2i}(t) - B_{i}(t)\}]d\mu _{0}(t).}$$
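
The fourth equality in (4) relies on the survival function of \(C_{i}\) implied by the additive hazards model (3). As a brief sketch of that step, assuming the standard relation between a hazard and its survival function and noting that \(\mathbf{Z}_{i}^{{\ast}}(t) = \mathbf{Z}_{i}t\) because the covariates are time-independent:

$$\displaystyle\begin{array}{rcl} E\{I(t \leq C_{i})\vert \mathbf{Z}_{i},\mathcal{B}_{it}\}& =& \exp \bigg\{-\int _{0}^{t}\lambda _{i}(s\vert \mathbf{Z}_{i},\mathbf{b}_{i}(s))ds\bigg\} \\ & =& \exp \bigg\{-\int _{0}^{t}\big[\lambda _{0}(s) +\xi '\mathbf{Z}_{i} + b_{3i}(s)\big]ds\bigg\} \\ & =& \exp \{ -\varLambda _{0}^{{\ast}}(t) -\xi '\mathbf{Z}_{i}^{{\ast}}(t) - B_{i}(t)\}.\end{array}$$

Since \(\mathcal{B}_{it}\) is independent of \(\mathbf{Z}_{i}\), the factor \(\exp \{b_{2i}(t) - B_{i}(t)\}\) can be averaged out separately from the covariate terms, which gives the last line of (4).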

Define

$$\displaystyle{dM_{i}^{{\ast}}(t;\eta ) = d\widetilde{N}_{ i}(t) - e^{\eta '\mathbf{X}_{i}(t)}d\varLambda _{ 1}^{{\ast}}(t)}$$

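and \(dM_{i}^{{\ast}}(t) = dM_{i}^{{\ast}}(t;\eta _{0})\), where \(\eta = (\gamma ',\;\xi ')'\), \(\mathbf{X}_{i}(t) = (\mathbf{Z}_{i}',\;-\mathbf{Z}_{i}^{{\ast}}(t)')'\) and \(\eta _{0}\) denotes the true value of η. It can be shown that \(M_{i}^{{\ast}}(t)\) is a mean-zero stochastic process. It follows that the estimators of η and \(d\varLambda _{1}^{{\ast}}(t)\) can be obtained by solving the following two estimating equations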

$$\displaystyle{ U_{\eta }(\eta ) =\sum _{ i=1}^{n}\int _{ 0}^{\tau }\bigg\{\mathbf{X}_{ i}(t) -\bar{ X}(t;\eta )\bigg\}d\widetilde{N}_{i}(t) = 0 }$$
(5)

and

$$\displaystyle{ \sum _{i=1}^{n}\bigg[d\widetilde{N}_{ i}(t) - e^{\eta '\mathbf{X}_{i}(t)}d\varLambda _{ 1}^{{\ast}}(t)\bigg] = 0. }$$
(6)

In the above, τ is the longest follow-up time, \(\bar{X}(t;\eta ) = S^{(1)}(t;\eta )/S^{(0)}(t;\eta )\) and \(S^{(k)}(t;\eta ) = n^{-1}\sum _{i=1}^{n}e^{\eta '\mathbf{X}_{i}(t)}\mathbf{X}_{i}(t)^{\otimes k}\) with \(a^{\otimes 0} = 1\), \(a^{\otimes 1} = a\), \(\bar{x}(t) =\lim _{n\rightarrow \infty }\bar{X}(t;\eta _{0})\) and \(s^{(k)}(t) =\lim _{n\rightarrow \infty }S^{(k)}(t;\eta _{0})\), k = 0, 1.
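
Equations (5) and (6) can be solved numerically in a few lines. The following is a minimal sketch, not the authors' implementation: it assumes time-independent covariates so that \(\mathbf{X}_{i}(t) = (\mathbf{Z}_{i},-\mathbf{Z}_{i}t)\), uses a plain Newton-Raphson iteration (our choice of solver), and all function names (`design`, `score_and_info`, `solve_eta`, `baseline_increments`) are hypothetical. The input `times` is a list holding, for each subject, the array of observed jump times of \(\widetilde{N}_{i}(t)\).

```python
import numpy as np

def design(Z, t):
    """Stack X_i(t) = (Z_i, -Z_i * t) for all subjects; Z has shape (n, p)."""
    return np.hstack([Z, -Z * t])

def score_and_info(eta, Z, times):
    """Return U_eta(eta) from (5) and minus its Jacobian, summed over all jumps."""
    q = 2 * Z.shape[1]
    U, V = np.zeros(q), np.zeros((q, q))
    for i, t_i in enumerate(times):
        for t in t_i:
            X = design(Z, t)                 # (n, q) covariates at time t
            w = np.exp(X @ eta)              # exp{eta' X_k(t)} for every subject k
            xbar = (w @ X) / w.sum()         # X-bar(t; eta) = S1 / S0
            U += X[i] - xbar
            V += (X.T * w) @ X / w.sum() - np.outer(xbar, xbar)
    return U, V

def solve_eta(Z, times, n_iter=25, tol=1e-8):
    """Newton-Raphson iteration for U_eta(eta) = 0, started at eta = 0."""
    eta = np.zeros(2 * Z.shape[1])
    for _ in range(n_iter):
        U, V = score_and_info(eta, Z, times)
        step = np.linalg.solve(V, U)
        eta = eta + step
        if np.max(np.abs(step)) < tol:
            break
    return eta

def baseline_increments(eta, Z, times):
    """Jump sizes of Lambda_1^* from (6): one jump divided by sum_k exp{eta' X_k(t)}."""
    all_t = np.sort(np.concatenate(times))
    jumps = np.array([1.0 / np.exp(design(Z, t) @ eta).sum() for t in all_t])
    return all_t, jumps
```

Note that, following the definition of \(S^{(k)}(t;\eta )\), every subject contributes to the sums at every jump time; no at-risk indicator is used because the exact \(C_{i}\) need not be observed.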

To estimate β, consider

$$\displaystyle\begin{array}{rcl} & & E\{Y _{i}(t)d\widetilde{N}_{i}(t)\vert \mathbf{Z}_{i},\mathcal{B}_{it}\} {}\\ & & \quad = E\{I(t \leq C_{i})Y _{i}(t)dN_{i}(t)\vert \mathbf{Z}_{i},\mathcal{B}_{it}\} {}\\ & & \quad = E\{I(t \leq C_{i})\vert \mathbf{Z}_{i},\mathcal{B}_{it}\}E\{Y _{i}(t)\vert \mathbf{Z}_{i},\mathcal{B}_{it}\}E\{dN_{i}(t)\vert \mathbf{Z}_{i},\mathcal{B}_{it}\} {}\\ & & \quad = exp\{ -\varLambda _{0}^{{\ast}}(t) - B_{ i}(t) -\xi '\mathbf{Z}_{i}^{{\ast}}(t)\} {}\\ & & \varLambda _{0}(t)\exp \{\beta '\mathbf{Z}_{i} + b_{1i}(t)\}\exp \{\gamma '\mathbf{Z}_{i} + b_{2i}(t)\}d\mu _{0}(t) {}\\ & & \quad =\exp \{ (\beta +\gamma )'\mathbf{Z}_{i} -\xi '\mathbf{Z}_{i}^{{\ast}}(t)\} {}\\ & & \qquad \exp \{ -\varLambda _{0}^{{\ast}}(t) + b_{ 1i}(t) + b_{2i}(t) - B_{i}(t)\}\varLambda _{0}(t)d\mu _{0}(t), {}\\ \end{array}$$

and therefore

$$\displaystyle{ E\{Y _{i}(t)d\widetilde{N}_{i}(t)\vert \mathbf{Z}_{i}\} =\exp \{\beta '\mathbf{Z}_{i} +\eta '\mathbf{X}_{i}(t)\}d\varLambda _{2}^{{\ast}}(t), }$$
(7)

where

$$\displaystyle{d\varLambda _{2}^{{\ast}}(t) =\exp \{ -\varLambda _{ 0}^{{\ast}}(t)\}\varLambda _{ 0}(t)E[exp\{b_{1i}(t) + b_{2i}(t) - B_{i}(t)\}]d\mu _{0}(t).}$$

Define

$$\displaystyle{dM_{i}(t;\beta,\eta ) = Y _{i}(t)d\widetilde{N}_{i}(t) -\exp \{\beta '\mathbf{Z}_{i} +\eta '\mathbf{X}_{i}(t)\}d\varLambda _{2}^{{\ast}}(t)}$$

and \(dM_{i}(t) = dM_{i}(t;\beta _{0},\eta _{0})\), where \(\beta _{0}\) denotes the true value of β. Then \(M_{i}(t)\) is a mean-zero stochastic process. This naturally suggests the following estimating equations to estimate β and \(d\varLambda _{2}^{{\ast}}(t)\):

$$\displaystyle{ U_{\beta }(\beta;\hat{\eta }) =\sum _{ i=1}^{n}\int _{ 0}^{\tau }W(t)\mathbf{Z}_{ i}\bigg[Y _{i}(t)d\widetilde{N}_{i}(t) - e^{\beta '\mathbf{Z}_{i}+\hat{\eta }'\mathbf{X}_{i}(t)}d\varLambda _{ 2}^{{\ast}}(t)\bigg] = 0, }$$
(8)

and

$$\displaystyle{ \sum _{i=1}^{n}\bigg[Y _{ i}(t)d\widetilde{N}_{i}(t) - e^{\beta '\mathbf{Z}_{i}+\hat{\eta }'\mathbf{X}_{i}(t)}d\varLambda _{ 2}^{{\ast}}(t)\bigg] = 0,\;\,0 \leq t \leq \tau, }$$
(9)

where \(\hat{\eta }= (\hat{\gamma }',\;\hat{\xi }')'\) and \(d\widehat{\varLambda }_{1}^{{\ast}}(t)\) are the estimators of η and \(d\varLambda _{1}^{{\ast}}(t)\), respectively, solved from (5) and (6), and W(t) is a possibly data-dependent weight function. We denote the estimates of β and \(d\varLambda _{2}^{{\ast}}(t)\) by \(\hat{\beta }\) and \(d\widehat{\varLambda }_{2}^{{\ast}}(t)\), respectively, solved from (8) and (9).
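
One way to see how (8) and (9) are solved together is to profile out \(d\varLambda _{2}^{{\ast}}(t)\). The following is a sketch of that algebra, using the shorthand \(\widehat{E}_{Z}(t;\beta,\eta ) =\sum _{i}\mathbf{Z}_{i}e^{\beta '\mathbf{Z}_{i}+\eta '\mathbf{X}_{i}(t)}/\sum _{i}e^{\beta '\mathbf{Z}_{i}+\eta '\mathbf{X}_{i}(t)}\). For fixed β, equation (9) gives

$$\displaystyle{d\widehat{\varLambda }_{2}^{{\ast}}(t;\beta,\hat{\eta }) = \frac{\sum _{i=1}^{n}Y _{i}(t)d\widetilde{N}_{i}(t)} {\sum _{i=1}^{n}e^{\beta '\mathbf{Z}_{i}+\hat{\eta }'\mathbf{X}_{i}(t)}},}$$

and substituting this into (8) leaves an equation in β alone:

$$\displaystyle{U_{\beta }(\beta;\hat{\eta }) =\sum _{i=1}^{n}\int _{0}^{\tau }W(t)\Big\{\mathbf{Z}_{i} -\widehat{ E}_{Z}(t;\beta,\hat{\eta })\Big\}Y _{i}(t)d\widetilde{N}_{i}(t) = 0.}$$

This has the same structure as (5), with the jumps of \(\widetilde{N}_{i}(t)\) weighted by the observed responses, so the same kind of iterative root-finding can be applied.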

To establish the asymptotic properties of \(\hat{\beta }\) and \(\hat{\eta }\), define

$$\displaystyle\begin{array}{rcl} & \widehat{M}_{i}^{{\ast}}(t) =\widetilde{ N}_{i}(t) -\int _{0}^{t}e^{\hat{\eta }'\mathbf{X}_{i}(s)}d\widehat{\varLambda }_{1}^{{\ast}}(s;\hat{\eta }), & {}\\ & \widehat{M}_{i}(t) =\int _{ 0}^{t}Y _{i}(s)d\widetilde{N}_{i}(s) -\int _{0}^{t}e^{\hat{\beta }'\mathbf{Z}_{i}+\hat{\eta }'\mathbf{X}_{i}(s)}d\widehat{\varLambda }_{2}^{{\ast}}(s;\hat{\beta },\hat{\eta }), & {}\\ & \widehat{E}_{Z}(t;\beta,\eta ) = \frac{\sum _{i=1}^{n}\mathbf{Z}_{ i}e^{\beta '\mathbf{Z}_{i}+\eta '\mathbf{X}_{i}(t)}} {\sum _{i=1}^{n}e^{\beta '\mathbf{Z}_{i}+\eta '\mathbf{X}_{i}(t)}} \mbox{ and }e_{z}(t) = lim_{n\rightarrow \infty }\widehat{E}_{Z}(t;\beta _{0},\eta _{0}).& {}\\ \end{array}$$

The following theorem gives the consistency and asymptotic normality of \(\hat{\beta }\) and \(\hat{\eta }\).

Theorem 1.

Assume that the conditions (C1)–(C5) given in the Appendix hold. Then \(\hat{\eta }\) and \(\hat{\beta }\) are consistent estimators of η 0 and β 0 , respectively. The distributions of \(n^{1/2}(\hat{\eta }-\eta _{0})\) and \(n^{1/2}(\hat{\beta }-\beta _{0})\) can be asymptotically approximated by the normal distributions with mean zero and covariance matrices \(\widehat{\varSigma }_{\eta } =\widehat{\varOmega }_{ \eta }^{-1}\widehat{\varPsi }\widehat{\varOmega }_{\eta }^{-1}\) and \(\widehat{\varSigma }_{\beta } =\widehat{ A}_{\beta }^{-1}\widehat{\varSigma }\widehat{A}_{\beta }^{-1}\) , respectively, where a ⊗2 = aa′, \(\widehat{\varPsi }= n^{-1}\sum _{i=1}^{n}\hat{u}_{i}^{\otimes 2}\), \(\widehat{\varSigma }= n^{-1}\sum _{i=1}^{n}(\hat{v}_{1i} -\hat{ v}_{2i})^{\otimes 2}\),

$$\displaystyle\begin{array}{rcl} \hat{u}_{i}& =& \int _{0}^{\tau }\Big(\mathbf{X}_{ i}(t) -\bar{ X}(t;\hat{\eta })\Big)d\widehat{M}_{i}^{{\ast}}(t)\,, {}\\ \hat{v}_{1i}& =& \int _{0}^{\tau }W(t)\Big(\mathbf{Z}_{ i} -\widehat{ E}_{Z}(t;\hat{\beta },\hat{\eta })\Big)d\widehat{M}_{i}(t)\,, {}\\ \hat{v}_{2i}& =& \int _{0}^{\tau }\widehat{A}_{\eta }\widehat{\varOmega }_{\eta }^{-1}\Big(\mathbf{X}_{ i}(t) -\bar{ X}(t;\hat{\eta })\Big)d\widehat{M}_{i}^{{\ast}}(t)\,, {}\\ \widehat{A}_{\beta }& =& n^{-1}\sum _{ i=1}^{n}\int _{ 0}^{\tau }W(t)e^{\hat{\beta }'\mathbf{Z}_{i}+\hat{\eta }'\mathbf{X}_{i}(t)}\Big(\mathbf{Z}_{ i} -\widehat{ E}_{Z}(t;\hat{\beta },\hat{\eta })\Big)^{\otimes 2}d\widehat{\varLambda }_{ 2}^{{\ast}}(t;\hat{\beta },\hat{\eta }), {}\\ \widehat{A}_{\eta }& =& n^{-1}\sum _{ i=1}^{n}\int _{ 0}^{\tau }W(t)e^{\hat{\beta }'\mathbf{Z}_{i}+\hat{\eta }'\mathbf{X}_{i}(t)}\Big(\mathbf{Z}_{ i} -\widehat{ E}_{Z}(t;\hat{\beta },\hat{\eta })\Big)X'_{i}(t)d\widehat{\varLambda }_{2}^{{\ast}}(t;\hat{\beta },\hat{\eta }) {}\\ \end{array}$$

and

$$\displaystyle{\widehat{\varOmega }_{\eta } = n^{-1}\sum _{ i=1}^{n}\int _{ 0}^{\tau }\{\mathbf{X}_{ i}(t) -\bar{ X}(t;\hat{\eta })\}^{\otimes 2}e^{\hat{\eta }'\mathbf{X}_{i}(t)}d\widehat{\varLambda }_{ 1}^{{\ast}}(t;\hat{\eta }).}$$

4 A Simulation Study

In this section, we report some results obtained from a simulation study conducted to assess the finite sample behavior of the estimation procedure proposed in the previous sections. For each subject i, the covariate \(Z_{i}\) was assumed to be a Bernoulli random variable with the probability of success being 0.5. Given \(Z_{i}\) and some unobserved random effects \(\mathbf{b}_{i}(t) = (b_{1i}(t),b_{2i}(t),b_{3i}(t))'\), the hazard function of the censoring time \(C_{i}\) was assumed to have the form

$$\displaystyle{ \lambda _{i}(t\vert \mathbf{Z}_{i},\mathcal{B}_{it}) =\lambda _{0} +\xi \mathbf{Z}_{i} + b_{3i}(t), }$$
(10)

with the largest follow-up time τ = 1. The number of observations \(\widetilde{N}_{i}(t)\) was assumed to follow a Poisson process on (0, C i ) with the mean function

$$\displaystyle{ E\{N_{i}(t)\vert \mathbf{Z}_{i},\mathcal{B}_{it}\} =\int _{ 0}^{t}\exp \{\gamma \mathbf{Z}_{ i} + b_{2i}(s)\}d\mu _{0}(s)\,. }$$
(11)

In practice, the exact time of C i may not be observable and \(d\widetilde{N}_{i}(t)\) is observed instead of dN i (t), thus we considered \(E\{\widetilde{N}_{i}(t)\vert \mathcal{B}_{it}\}\) for the observation process. From (10) and (11),

$$\displaystyle{E\{d\widetilde{N}_{i}(t)\vert \mathbf{Z}_{i},\mathcal{B}_{it}\} =\exp \{\gamma \mathbf{Z}_{i} -\xi \mathbf{Z}_{i}t\}d\varLambda _{1}^{{\ast}}(t),}$$

where \(d\varLambda _{1}^{{\ast}}(t) =\exp \{ -\lambda _{0}t + b_{2i}(t) - B_{i}(t)\}d\mu _{0}(t)\) and \(B_{i}(t) =\int _{0}^{t}b_{3i}(s)ds\). Given \(Z_{i}\) and \(\mathcal{B}_{it}\), \(\widetilde{N}_{i}(t)\) was assumed to follow a nonhomogeneous Poisson process and the total number of observation times \(m_{i}\) was generated with mean \(E\{m_{i}\} = E\{\widetilde{N}_{i}(\tau )\vert Z_{i},\mathcal{B}_{i\tau }\}\). Then the observation times \(\{T_{i,1},\ldots,T_{i,m_{i}}\}\) were taken as \(m_{i}\) order statistics from the density function

$$\displaystyle{f_{\widetilde{N}}(t) = \frac{\exp \{\gamma \mathbf{Z}_{i} -\xi \mathbf{Z}_{i}t\}d\varLambda _{1}^{{\ast}}(t)} {\int _{0}^{\tau }\exp \{\gamma \mathbf{Z}_{i} -\xi \mathbf{Z}_{i}t\}d\varLambda _{1}^{{\ast}}(t)}.}$$

The longitudinal response Y i (t) was generated from a mixed Poisson process with the mean function

$$\displaystyle{ E\{Y _{i}(t)\vert \mathbf{Z}_{i},\mathcal{B}_{it}\} = Q_{i}\varLambda _{0}(t)\exp \{\beta \mathbf{Z}_{i} + b_{1i}(t)\}, }$$
(12)

where Q i was generated independently from a gamma distribution with mean 1 and variance 0.5. The results given below are based on the sample size of 100 or 200 with 1000 replications and W(t) = W i  = 1.
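
To make the data-generating steps concrete, here is a minimal sketch for one subject in the time-independent case \(b_{1i} = b_{2i} = b_{3i} = b_{i}\) with \(\lambda _{0} = 2\), \(\mu _{0}(t) = 20t\) and \(\varLambda _{0}(t) = 5t\). It is a simplified variant rather than the exact scheme described above: the censoring time is drawn explicitly and the observation times are then generated as a Poisson process on \((0,C_{i}\wedge \tau )\), which yields the same conditional rate for \(\widetilde{N}_{i}(t)\). The response uses the sign convention of model (1). The function name `simulate_subject` and the default values of `beta`, `gamma` and `xi` are hypothetical.

```python
import numpy as np

def simulate_subject(rng, beta=0.5, gamma=0.5, xi=0.5, lam0=2.0, tau=1.0,
                     mu_rate=20.0, Lam_rate=5.0):
    """Draw (Z, observation times T, responses Y at T) for one subject."""
    Z = rng.binomial(1, 0.5)
    b = rng.uniform(-0.5, 0.5)      # shared, time-independent random effect

    # Additive hazard lam0 + xi*Z + b is constant in t (and positive here),
    # so the censoring time is exponential with that rate.
    C = rng.exponential(1.0 / (lam0 + xi * Z + b))
    end = min(C, tau)

    # Observation process: homogeneous Poisson with rate exp(gamma*Z + b)*mu_rate,
    # kept only up to min(C, tau).
    m = rng.poisson(np.exp(gamma * Z + b) * mu_rate * end)
    T = np.sort(rng.uniform(0.0, end, size=m))

    # Mixed Poisson response: given Q, independent Poisson increments with
    # cumulative mean Q * Lam_rate * t * exp(beta*Z + b).
    Q = rng.gamma(shape=2.0, scale=0.5)   # mean 1, variance 0.5
    cum_mean = Q * Lam_rate * T * np.exp(beta * Z + b)
    Y = np.cumsum(rng.poisson(np.diff(cum_mean, prepend=0.0)))
    return Z, T, Y
```

Calling `simulate_subject(np.random.default_rng(1))` repeatedly with the same generator produces a sample of independent subjects; time-dependent random effects such as those used for Table 2 would require replacing the constant rates with their integrated counterparts.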

Table 1 shows the estimation results on β for the situation when \(b_{1i}\), \(b_{2i}\) and \(b_{3i}\) are time-independent. Note that here \(\xi _{0} = 0\) or \(\gamma _{0} = 0\) represents the cases when either censoring or the observation times are independent of covariates, respectively. For the random effects, we took \(b_{1i} = b_{2i} = b_{3i} = b_{i}\), where the \(b_{i}\)'s were generated from the uniform distribution over (−0.5, 0.5). It can be seen that the proposed estimates seem unbiased and the estimated standard errors (SEE) are close to the sample standard errors (SSE). Also the empirical 95% coverage probabilities (CP) are quite accurate. The same conclusions are also obtained for the situation when \(b_{1i}\), \(b_{2i}\) and \(b_{3i}\) are time-dependent, for which the results are presented in Table 2. Here we took \(b_{1i}(t) = b_{i}t^{1/3}\), \(b_{2i}(t) = b_{i}t^{1/2}\) and \(b_{3i}(t) = b_{i}\) with the same \(b_{i}\) generated as for Table 1. We also considered other set-ups such as using different baselines and with \(Z_{i}\) being a continuous variable and obtained similar results.

Table 1 Estimation results with \(\lambda _{0} = 2\), \(\mu _{0}(t) = 20t\), \(\varLambda _{0}(t) = 5t\), \(b_{1i} = b_{2i} = b_{3i}\)
Table 2 Estimation results with \(\lambda _{0} = 2\), \(\mu _{0}(t) = 20t\), \(\varLambda _{0}(t) = 5t\), \(b_{1i}(t) = b_{i}t^{1/3}\), \(b_{2i}(t) = b_{i}\sqrt{t}\) and \(b_{3i}(t) = b_{i}\)

To further investigate the performance of the proposed estimators of β in comparison with those proposed by He et al. (2009) and Sun et al. (2012), we carried out a simulation study and estimated β using all three methods. Note that unlike the proposed estimation procedure, the latter two methods require observing the exact time of a censoring or terminal event \(C_{i}\). For this, we used the subjects' last observation times, as is commonly done in practice. With respect to the method given by Sun et al. (2012), we applied it by using \(C_{i}\) as its original terminal event time \(D_{i}\) and τ as its \(C_{i}\). Note that, as mentioned earlier, both He et al. (2009) and Sun et al. (2012) considered distribution-based random effects for possible correlations. For the comparison, we focus on the performance of their procedures when the random effects follow various distributions besides those assumed. However, since both of them involve covariate effects in forms different from those considered by our proposed models, we fix \(\beta _{0} = 0\) and \(\xi _{0} = 0\) in order to avoid unfair comparisons caused by the misspecification of covariate effects. The estimation results are given in Table 3 with three set-ups. In the first set-up, referred to as \(M_{1}\), we considered the situation used for Table 1 except that \(\mu _{0}(t) = 10t\) and \(b_{1i} = -b_{2i} = b_{3i}\). In the second and third set-ups, called \(M_{2}\) and \(M_{3}\), we generated \(b_{1i}(t)\), \(b_{2i}(t)\) and \(b_{3i}(t)\) from various distributions such that the assumptions required by either Sun et al. (2012) or He et al. (2009) are satisfied. For example, we took \(\lambda _{0}(t) = 0\) and generated \(b_{3i}(t)\) from an extreme-value distribution as assumed by Sun et al. (2012). We also generated \(b_{1i}(t)\), \(b_{2i}(t)\) and \(b_{3i}(t)\) from the assumed distributions required by He et al. (2009).

Table 3 Estimation results on β based on the proposed procedure and the procedures given in Sun et al. (2012) and He et al. (2009) with β 0 = ξ 0 = γ 0 = 0

Note that in all set-ups considered above, our proposed models are correctly specified because there are no assumed distributions on \(b_{1i}(t)\), \(b_{2i}(t)\) or \(b_{3i}(t)\). In contrast, the models of He et al. (2009) and Sun et al. (2012) are each correctly specified in only one of the set-ups. On the other hand, since there are no covariate effects in any of the set-ups, we do not expect the point estimates of β given by He et al. (2009) or Sun et al. (2012) to be much biased even if the imposed distributions are misspecified in the estimation. For their variance estimates, we expect that SEE and SSE agree for both, because the former applied bootstrap resampling and the latter did not involve any assumed distribution of random effects in their variance estimation. Therefore, we only compare bias and SSE. It can be seen that all estimation procedures gave comparably small bias, as expected. However, the proposed estimators appear to be more efficient in general. In comparison, the method given by He et al. (2009) is comparably efficient to the proposed estimators only under \(M_{3}\), when all its distributional assumptions are satisfied. For the method given by Sun et al. (2012), it is worth noting that when \(D_{i}\) is substituted by the last observation time \(C_{i}\) from subject i, it gives a relatively large SSE, especially when the \(C_{i}\)'s vary considerably, regardless of whether the assumption about \(b_{3i}(t)\) is satisfied (for \(M_{2}\)) or not (for \(M_{3}\)).

5 Concluding Remarks

We proposed a joint model for analyzing longitudinal data with informative censoring and observation times. The mutual correlations are characterized via a shared vector of time-dependent random effects. As mentioned earlier, several procedures have been developed in the literature for longitudinal data when either the censoring or the observation process is informative. However, when both of them are informative, there is limited applicable work apart from that given in He et al. (2009) and Sun et al. (2012). In addition, all the existing procedures assumed time-independent or specifically distributed correlation structures. The proposed joint model is flexible in that the shared vector of random effects can be time-dependent and neither its structure nor its distribution is specified. For the parameter estimation, the proposed procedure is simple and easy to implement.

There exist several directions for future research. One is that as mentioned above, one may want to consider other models rather than models (1)–(3) and develop similar estimation procedures. Of course, a related problem is model selection and one may want to develop some model selection techniques to choose the optimal model among several candidate models (Tong et al. 2009; Wang et al. 2014). Note that in the proposed method, we have employed a weight function W(t) and it would be desirable to develop some procedures for the selection of an optimal W(t). As in most similar situations, this is clearly a difficult problem as it requires the specification of the covariance function of Y i (t) and \(\widetilde{N}_{i}(t)\) (Sun et al. 2012). Finally in the above, we have focused on regression analysis of Y i (t) with time-independent covariates. Sometimes one may face time-dependent covariates and thus it would be helpful to generalize the proposed method to this latter situation. Also sometimes nonparametric estimation of Y i (t) or the baseline functions may be of interest. For those purposes, some constraints should be imposed on b i (t) for identifiability, for example, E{b i (t)} = 0. When panel count data arise (Sun and Zhao, 2013), the generalization of existing nonparametric estimation procedures to cases with informative observation or censoring times is a challenging direction for future work too.