Abstract
The censored linear regression model, also referred to as the accelerated failure time model, is a useful alternative to the popular Cox model in the analysis of censored survival data. In this paper, we combine quantile information with censored least-squares normal equations to obtain estimators of the regression parameters with smaller estimated standard errors. An inverse probability-weighted method is proposed to construct unbiased estimating equations with censored data, and the lack of smoothness of the objective equations is overcome by replacing them with smooth approximations. The proposed estimators are based on the empirical likelihood method and the generalized method of moments, respectively, and their asymptotic properties are studied under some regularity conditions. We also conduct simulation experiments to investigate the finite-sample properties of the proposed estimators. The Stanford Heart Transplant data are used to illustrate the proposed estimating method.
1 Introduction
The least-squares approach for fitting a linear regression model provides a statistical technique for investigating the relationship between variables. Its simple structure and ease of interpretation have made it an attractive method for practitioners. Another important approach for fitting a linear model is the quantile regression method originated by Koenker and Bassett (1978). A collection of conditional quantiles can characterize the entire conditional distribution and capture the rich underlying relationship between the quantiles of response variable and covariates. But a major difficulty of quantile regression is that to obtain the asymptotic covariance matrix of estimators, we need an estimation of the regressor density, which is often cumbersome to obtain. This motivated Zhou et al. (2011) to bring together these two well-known techniques, and develop a coherent estimation framework that can be applied to a myriad of situations. Their simulation studies and real data analysis have shown that the least-squares estimator combined with auxiliary quantile information not only leads to a more efficient estimator, but also results in a relatively simple calculation of estimator’s standard error that does not require any density estimation. Based on the same idea, Liu and Ishfaq (2011) and Liu et al. (2011) considered the estimation of distribution function when auxiliary quantile information is available with complete data and missing data, respectively. The aim of this paper is to extend this idea to censored linear regression model.
The censored linear regression model, also referred to as the accelerated failure time (AFT) model, specifies that the logarithm of the failure time \(T_i\) is related to a \(p\times 1\) vector of covariates \(Z_i\) in the following way:
$$\begin{aligned} \log T_i=Z_i^{\tau }\beta _0+\epsilon _i,\quad i=1,\ldots ,n, \qquad (1) \end{aligned}$$
where \(\beta _0\) is a \(p\times 1\) vector of unknown regression parameters and \(\epsilon _i\), \(i=1,\ldots ,n\), are independent and identically distributed with an unspecified common distribution function \(F_{\epsilon }\), but zero mean and finite variance. This model, as an alternative to the popular Cox model, has been studied extensively in literature, see, for instance, Buckley and James (1979), Koul et al. (1981), Lai and Ying (1991), Ritov (1990) and Wei (1992), among others. When the data are completely observed, Zhou et al. (2011) suggested to estimate \(\beta _0\) in (1) based on the following estimating function:
The first part of (2) is based on the normal equation of least squares, and the second part is based on the auxiliary quantile information, using the assumption that the errors are symmetric or that the median of the errors is zero. In general, if information corresponding to the \(\zeta \)-th quantile is known, then 1/2 in (2) may be replaced by \(\zeta \). In survival analysis, the survival time T is usually right censored by another variable C. The observed data are \((X_i,\varDelta _i,Z_i)\), \(i=1,2,\ldots ,n\), where \(X_i=\min (T_i,C_i)\), \(\varDelta _i=I(T_i\le C_i)\), and \(I(\cdot )\) is the indicator function. Assume that \(T_i\) and \(C_i\) have absolutely continuous survival functions S(t) and K(t), respectively, and that \(C_i\) is independent of \(T_i\) and \(Z_i\). In this paper, we extend the results of Zhou et al. (2011) to the right-censored data case.
The rest of this paper is arranged as follows. A smoothing technique is introduced in Sect. 2. In Sect. 3, we propose an inverse probability-weighted (IPW) estimating equation method to construct asymptotically unbiased estimating equations with right-censored data. In Sect. 4, we propose two estimators based on the empirical likelihood (EL) method and the generalized method of moments (GMM), respectively, and show that both estimators are asymptotically normally distributed. Section 5 reports some simulation results and a real data example. A discussion is given in Sect. 6, and the proofs of the theorems are contained in the Appendix.
2 A smoothing technique for non-smooth EEs
Note that based on (2), we can construct 2p estimating equations (EEs), but we have only p unknown parameters. This is the so-called over-determined case, in which ordinary estimating methods are infeasible. The common procedures for handling the over-determined case are the generalized method of moments (GMM) of Hansen (1982) and the empirical likelihood (EL) method developed by Qin and Lawless (1994). However, the functions obtained from quantile information are non-differentiable in \(\beta \), since they are indicators.
The lack of smoothness of the objective equations can be handled by replacing them with smooth approximations, since smoothness of the objective function is required for Taylor expansions; see, for example, Chen and Hall (1993), Heller (2007) and Song et al. (2007). Similar to the smoothing technique developed in Zhou et al. (2011), the smoothed estimating equations proposed in this paper are kernel based. Without loss of generality, we illustrate the idea by assuming \(p=1\). As in the existing literature, we use a high-order kernel to smooth the estimating equations associated with quantiles; for more details, see Zhou and Jing (2003). Consider a smooth kernel function \(l(\xi )\), so that \(L(t)=\int ^{t}_{-\infty }l(u)\mathrm{d}u\) is also a smooth function, and a bandwidth \(b_n\) satisfying \(b_n\rightarrow 0\), \(nb_n\rightarrow \infty \) as \(n\rightarrow \infty \). In practice, \(l(\xi )\) can be any smooth function, for example, \(l(\xi )\) may be an r-th order kernel function such that
for some integer \(r\ge 2\). The smoothed version of the second part of (2) comes out to be
From Zhou et al. (2011), we know that
3 Inverse probability-weighted EEs with censored data
With complete data, we can obtain the estimators of the regression parameters along the lines of Zhou et al. (2011), based on the smoothed estimating equations
In the presence of censoring, however, (5) no longer gives asymptotically unbiased estimating equations. Hence, we consider the modified estimating equations, which are called inverse probability-weighted (IPW) estimating equations,
The above idea of weighting the complete observations by their inverse probabilities was originated by Horvitz and Thompson (1952) in the context of sample surveys. The adaptation of this idea to the setting of censored survival data was initially considered by Koul et al. (1981), and later on by Robins and Rotnitzky (1992) and Lin and Ying (1993). Zhao and Tsiatis (1997) applied this idea to the problem of quality adjusted survival time. Recently, Bang and Tsiatis (2000), Lin (2000) and Bang and Tsiatis (2002) used this method to estimate medical costs. We find that (6) is an asymptotically unbiased estimating equation which is a consequence of the following equality:
In practice, the survival function \(K(\cdot )\) is unknown. Here, we propose to estimate \(K(\cdot )\) by the Kaplan–Meier estimator (Kaplan and Meier 1958) with the roles of censoring time \(C_i\) and survival time \(T_i\) reversed. That is,
where \(N^c(u)=\sum _{i=1}^n I(X_i\le u, \varDelta _i=0)\), \(Y(u)=\sum _{i=1}^n I(X_i\ge u)\). The simple weighted complete-case EEs come out to be:
In the next section, we give the estimators of regression parameters based on the IPW estimating equations (7) by empirical likelihood and GMM methods.
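The weighting step described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names `censoring_km` and `ipw_weights` are hypothetical, ties among the observed times are assumed absent, and the Kaplan–Meier product is taken over the censored observations (\(\varDelta _i=0\)), following the definitions of \(N^c(u)\) and \(Y(u)\) above.

```python
import bisect


def censoring_km(x, delta):
    """Kaplan-Meier estimate of K(t) = P(C > t) with the roles of T and C reversed.

    x     : observed times X_i = min(T_i, C_i)
    delta : indicators Delta_i = I(T_i <= C_i); censored points (0) act as "events"
    Assumes no ties among the x values (an assumption of this sketch).
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    times, surv = [], []
    s = 1.0
    for rank, i in enumerate(order):
        at_risk = n - rank            # Y(u) = #{j : X_j >= u} at u = X_i
        if delta[i] == 0:             # N^c jumps by one at a censored time
            s *= 1.0 - 1.0 / at_risk
        times.append(x[i])
        surv.append(s)

    def k_hat(t):
        # Right-continuous step function: product over censored times <= t.
        idx = bisect.bisect_right(times, t)
        return 1.0 if idx == 0 else surv[idx - 1]

    return k_hat


def ipw_weights(x, delta):
    """Weights V_ni = Delta_i / K_hat(X_i); censored observations get weight 0."""
    k_hat = censoring_km(x, delta)
    return [d / k_hat(xi) if d == 1 else 0.0 for xi, d in zip(x, delta)]
```

For the toy input `x = [1, 2, 3, 4]`, `delta = [1, 0, 1, 1]`, the single censored observation at time 2 shifts its mass to the later uncensored points, so the weights are approximately 1, 0, 1.5 and 1.5.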
4 Inference based on IPW EEs
4.1 Empirical likelihood
In this section, we construct an estimated empirical likelihood to make statistical inference on \(\beta \). For convenience, denote \(\varDelta _i/\widehat{K}(X_i)=V_{ni}\). Let \(p=(p_1,\ldots ,p_n)\), \(p_i\ge 0\) for all \(1\le i\le n\) with \(\sum _{i=1}^n p_i=1\). Define \(F_p\) to be the distribution function which assigns probability \(p_i\) to the point \(V_{ni}\phi (X_i,Z_i,\beta )\). The empirical likelihood is
We maximize (8) subject to restrictions
For any given \(\beta \), let the set \(\varOmega _{\beta }=\{\lambda :1+\lambda ^\tau V_{ni}\phi (X_i,Z_i,\beta )\ge 1/n\}\) be convex, closed and also bounded if the convex hull of \(V_{ni}\phi (X_i,Z_i,\beta )\) contains 0. By the Lagrange multiplier method, we have
where \(\lambda \) is the solution to
Note that \(\varPi _{i=1}^n p_i\) subject to \(\sum _{i=1}^n p_i=1\) and \(p_i\ge 0\), \(1\le i\le n\), attains its maximum value \(n^{-n}\) at \(p_i=n^{-1}\). Hence, we define the profile empirical likelihood ratio by
The log empirical likelihood ratio multiplied by \(-2\) is then given by
Let \(\widehat{\beta }_e\) be the MELE that results from minimizing \(\mathcal{R}(\beta )\), then we have:
Theorem 1
Let Assumptions 1–6 in the Appendix be satisfied. Then,
where
and \(\varSigma \) is given in Lemma 1.
To use Theorem 1 to construct confidence interval for parameter \(\beta \), we have to estimate \(A(\beta )\) and \(B(\beta )\). Based on the results of Lemma 4, \(A(\beta )\) and \(B(\beta )\) can be estimated consistently by
respectively. Under some mild regularity conditions, it can be shown that
where \(W_{ni}(\beta )=V_{ni}\phi (X_i,Z_i,\beta )\). It can be readily shown that \(\mathcal{R}(\beta )\) converges in distribution to a weighted sum of Chi-square distributions, as stated in the following theorem.
Theorem 2
Let Assumptions 1–6 in the Appendix be satisfied. Then,
where the weights \(\omega _j,1\le j\le q,\) are the eigenvalues of \(B(\beta _0)^{-1}\varSigma (\beta _0),\) and \(\chi _{1,j}^2\) for \(1\le j\le q\) are independently distributed Chi-square variables with 1 degree of freedom.
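Because the limit in Theorem 2 is a weighted sum of independent \(\chi ^2_1\) variables, its quantiles are not tabulated and are usually approximated numerically. The following Python sketch does this by Monte Carlo. It is an illustration only: the function name `weighted_chisq_quantile` is hypothetical, and the weights are taken as given inputs, whereas in practice they would have to be estimated, for instance from the eigenvalues of an estimate of \(B(\beta _0)^{-1}\varSigma (\beta _0)\).

```python
import random


def weighted_chisq_quantile(weights, level=0.95, draws=50000, seed=0):
    """Monte Carlo quantile of sum_j weights[j] * chi2_1 (independent terms).

    weights : the omega_j, supplied by the user (assumed known in this sketch)
    level   : probability level of the requested quantile
    """
    rng = random.Random(seed)
    sims = sorted(
        sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        for _ in range(draws)
    )
    idx = min(draws - 1, int(level * draws))
    return sims[idx]
```

When all weights equal one, the result should be close to the corresponding quantile of a standard Chi-square distribution with q degrees of freedom, which gives a simple sanity check of the simulation.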
Remark 1
We can give a modification of \(\mathcal{R}(\beta _0)\), let
and let
where \(\widehat{\xi }(\beta _0)= \eta _1(\beta _0)/\eta _2(\beta _0)\) and \(\widehat{B}(\beta _0)=\frac{1}{n}\sum _{i=1}^{n}W_{ni}(\beta _0)W_{ni}(\beta _0)^{\tau }\). It can be shown that \(\widehat{\mathcal{R}}(\beta _0)\) has a limiting Chi-square distribution with q degrees of freedom.
Corollary 1
Let \(\beta ^{\tau } = (\beta _1^{\tau }, \beta _2^{\tau }),\) where \(\beta _1\) is a \(q_1 \times 1\) vector and \(\beta _2\) is a \( q_2 \times 1\) vector. For \(H_0: \beta _1 = \beta _{1,0},\) the profile empirical likelihood test statistic is
where \(\tilde{\beta }_{2,0}\) minimizes \(\mathcal{R}(\beta _{1,0}, \beta _2)\) with respect to \(\beta _2\), \(\beta _{1,0}\) is the true value of \(\beta _1,\) and \(\widehat{\beta }_{e} = (\widehat{\beta }_1, \widehat{\beta }_2)\).
Under \(H_0,\)
where the weights \(\rho _j,1 \le j \le q_1,\) are the eigenvalues of \(B_{2}(\beta _{1,0})\varSigma (\beta _{1,0}),\) \(\chi ^{2}_{1,j}\) for \(1 \le j \le q_1\) are independently distributed Chi-square variables with 1 degree of freedom, and \(B_2\) is a positive definite matrix given in the Appendix.
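The Lagrange multiplier computation used throughout this subsection can be illustrated for a scalar estimating function. The Python sketch below assumes the usual Qin and Lawless (1994) form, in which \(\lambda \) solves \(\sum _i W_{ni}/(1+\lambda W_{ni})=0\) and the statistic is \(2\sum _i\log (1+\lambda W_{ni})\); the function name `el_log_ratio` is hypothetical, and the inputs `w` stand for the values \(W_{ni}(\beta )\) at a fixed \(\beta \).

```python
import math


def el_log_ratio(w, max_iter=200):
    """Return (lam, stat) for scalar moment values w_i = W_ni(beta).

    lam  solves sum_i w_i / (1 + lam * w_i) = 0 by bisection;
    stat is 2 * sum_i log(1 + lam * w_i), i.e. -2 times the log EL ratio.
    Requires 0 to lie strictly inside the convex hull of the w_i.
    """
    w_min, w_max = min(w), max(w)
    if not (w_min < 0.0 < w_max):
        raise ValueError("0 must lie strictly between min(w) and max(w)")

    def score(lam):
        return sum(wi / (1.0 + lam * wi) for wi in w)

    # Feasible region: 1 + lam * w_i > 0 for all i, shrunk slightly inward.
    a = (-1.0 / w_max) * (1.0 - 1e-9)
    b = (-1.0 / w_min) * (1.0 - 1e-9)
    for _ in range(max_iter):
        mid = 0.5 * (a + b)
        if score(mid) > 0.0:      # score is strictly decreasing in lam
            a = mid
        else:
            b = mid
    lam = 0.5 * (a + b)
    stat = 2.0 * sum(math.log(1.0 + lam * wi) for wi in w)
    return lam, stat
```

Bisection is used here because the score is strictly decreasing in \(\lambda \) on the feasible interval, so the root is unique; for vector-valued estimating functions a Newton-type iteration would be the more common choice.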
4.2 Generalized method of moments
The GMM approach chooses parameter values such that
is minimized for some positive semi-definite symmetric weight matrix W. In practice, the unknown W is typically replaced by a consistent estimator \(\widehat{W}\). The resultant GMM estimator is then
Theorem 3
Let Assumptions 1–6 in the Appendix be satisfied. Then,
where \(\varSigma _g=\varSigma _2A^{\tau }W\varSigma WA\varSigma _2\) and \(\varSigma _2=\{A(\beta _0)^{\tau } WA(\beta _0)\}^{-1}.\)
The choice of W that leads to the asymptotically most efficient GMM estimator is \(W=\varSigma ^{-1}\), the inverse of the asymptotic covariance \(\varSigma \) given in Lemma 1, in which case the “sandwich” covariance \(\varSigma _g\) reduces to \(\varSigma _2\). If we set \(W=B^{-1}\), with B defined in Sect. 4.1, it can be shown that the asymptotic covariance of the resultant GMM estimator coincides with the asymptotic covariance of the EL estimator.
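A minimal two-step GMM sketch for a scalar \(\beta \) with two moment conditions (one least-squares, one smoothed median) is given below. Everything here is an assumption made for illustration rather than the authors' implementation: the function names, the sign convention of the moment functions, the Gaussian smoothing of the indicator, the golden-section search over a user-supplied interval, and the second-step weight \(\widehat{W}=\widehat{B}^{-1}\) computed at the first-step estimate. The weights `v` play the role of \(V_{ni}\) and are passed in as given.

```python
import math


def norm_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def moment_rows(beta, x, z, v, b):
    """Per-observation moments: weighted LS and smoothed median conditions."""
    rows = []
    for xi, zi, vi in zip(x, z, v):
        r = xi - zi * beta
        # norm_cdf(-r / b) is a smooth stand-in for the indicator I(r <= 0).
        rows.append((vi * zi * r, vi * zi * (0.5 - norm_cdf(-r / b))))
    return rows


def gmm_objective(beta, x, z, v, b, W):
    rows = moment_rows(beta, x, z, v, b)
    n = len(rows)
    g0 = sum(r[0] for r in rows) / n
    g1 = sum(r[1] for r in rows) / n
    return W[0][0] * g0 * g0 + 2.0 * W[0][1] * g0 * g1 + W[1][1] * g1 * g1


def golden_min(f, lo, hi, iters=80):
    """Golden-section search; assumes f is unimodal on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - ratio * (hi - lo)
    d = lo + ratio * (hi - lo)
    fc, fd = f(c), f(d)
    for _ in range(iters):
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - ratio * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + ratio * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)


def two_step_gmm(x, z, v, b, lo, hi):
    identity = ((1.0, 0.0), (0.0, 1.0))
    beta1 = golden_min(lambda t: gmm_objective(t, x, z, v, b, identity), lo, hi)
    rows = moment_rows(beta1, x, z, v, b)
    n = len(rows)
    b00 = sum(r[0] * r[0] for r in rows) / n
    b01 = sum(r[0] * r[1] for r in rows) / n
    b11 = sum(r[1] * r[1] for r in rows) / n
    det = b00 * b11 - b01 * b01
    W = ((b11 / det, -b01 / det), (-b01 / det, b00 / det))   # inverse of B_hat
    return golden_min(lambda t: gmm_objective(t, x, z, v, b, W), lo, hi)
```

With complete data one can set every entry of `v` to 1, which reduces the sketch to combining least-squares and median information without censoring.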
5 Numerical studies
5.1 Simulations
In this section, we carry out simulation studies to evaluate the finite-sample performance of the GMM and EL procedures developed in this paper. The data are generated from the following censored linear regression model, which is similar to the model given in Zhou et al. (2011).
where \(\beta _1\) = 1, \(\beta _2\) = 1, \(Z_1\sim \) Bernoulli distribution with success probability 0.5, \(Z_2\sim U[ 1, 3 ]\), \(\varepsilon \) is generated from the symmetric distribution \(\sqrt{2}/4 N(0, 1)+t_{3}/2\) and the censored variable \(C\sim U[ 0, 8.3 ]\) (for heavy censoring) and U[ 0, 25 ] (for light censoring), where C is independent of \(Z_1\), \(Z_2\) and \(\varepsilon \). \(X=\min ( T, C )\), \(\varDelta =I( T\le C )\), the corresponding unbiased estimating functions are
with \(E\psi _1(\cdot )=0\) and \(E\psi _2(\cdot )=0\) representing conditions from least squares, and \(E\psi _3(\cdot )=0\) and \(E\psi _4(\cdot )=0\) arising from the median regression. We use the second-order kernel, Gaussian kernel, \(l(u)=\exp (-u^{2}/2)/(2\pi )^{1/2}\) to smooth \(\psi _3(\cdot )\) and \(\psi _4(\cdot )\). Four estimators are examined, specifically, Koul et al. estimator (1981), least-squares (LS), GMM and EL estimators.
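To make the smoothing step concrete, the sketch below shows how the Gaussian kernel \(l(u)\) named above yields a differentiable replacement for an indicator. It is an illustration under assumptions: the function names are hypothetical, and the indicator is written generically as \(I(t\ge 0)\), approximated by \(L(t/b)\) with \(L\) the standard normal distribution function.

```python
import math


def gaussian_kernel(u):
    """Second-order Gaussian kernel l(u) = exp(-u^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_cdf(t):
    """L(t) = integral of l(u) from -infinity to t (standard normal CDF)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def smoothed_indicator(t, b):
    """Smooth approximation of I(t >= 0) with bandwidth b > 0."""
    return kernel_cdf(t / b)


def smoothed_indicator_grad(t, b):
    """Derivative of smoothed_indicator with respect to t."""
    return gaussian_kernel(t / b) / b
```

As the bandwidth shrinks, `smoothed_indicator(t, b)` approaches 1 for positive `t` and 0 for negative `t`, while remaining differentiable everywhere, which is what the Taylor expansions in the asymptotic arguments require.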
The estimator proposed by Koul et al. (1981) is
where \(X_{i\widehat{K}}=\frac{\varDelta _iX_i}{1-\widehat{K}(X_i)}, i=1,\ldots ,n\), and \(1-\widehat{K}(\cdot )\) is the Kaplan–Meier estimator of the censoring distribution. It generalizes the ordinary least-squares estimator to the censored linear regression model. The LS estimator uses information only from \(\psi _1(\cdot )=0\) and \(\psi _2(\cdot )=0,\) while the GMM and EL estimators use information from all four EEs. Since we use the second-order kernel l(u), Assumption 1 implies \(b=o(n^{-1/4})\). Such a bandwidth is of smaller order of magnitude than \(n^{-1/5}\), the order that is usually appropriate for minimizing the error of a curve estimator. Chen and Hall (1993) suggested choices of b in the range between \(n^{-1/2}\) and \(n^{-3/4}\), which generally provide quite good coverage accuracy. In our simulation study, we selected b through a rule of thumb proposed by Cui et al. (2002) (see also Fan and Yao 2003; Sepanski et al. 1994; Zhou et al. 2008; Zhou and Liang 2009), setting \(b=c\times \sigma \{\frac{\varDelta }{\widehat{K}(X)}(X-Z^{\tau }\beta )\}n^{-1/3}\), where \(\sigma \{X\}\) is the standard deviation of X, and c is a suitable constant. We replace \(\beta \) with its LS estimator and set c to 1.5, 2, 2.5, 3, 3.5. Tables 1 and 2 report the simulation results with light censoring (about 10 %) and heavy censoring (about 30 %), respectively. Each experiment is based on 1000 replicated samples with sample size \(n=200\). The comparisons are in terms of the magnitude of bias in the estimators (BIAS), standard error of the estimators (SE), standard deviation (SD), coverage probability (COV) at the nominal confidence level 95 % and the length of the confidence interval (LEN) at the same confidence level. The confidence intervals for the Koul et al., LS and GMM estimators were constructed using the asymptotic normal distribution, while those for the EL estimator were constructed by the empirical likelihood method.
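The simulation design can be sketched as follows. Several points are assumptions of this sketch rather than statements from the paper: the model is taken to be \(T=\beta _1Z_1+\beta _2Z_2+\varepsilon \) with T on the scale on which censoring acts, the error is read as the sum of an independent scaled normal and a scaled \(t_3\) variable (not a mixture), and the function names `draw_error` and `simulate` are hypothetical.

```python
import math
import random


def draw_error(rng):
    """One draw from sqrt(2)/4 * N(0,1) + t_3 / 2, read as an independent sum."""
    normal_part = math.sqrt(2.0) / 4.0 * rng.gauss(0.0, 1.0)
    chi2_3 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3))
    t3 = rng.gauss(0.0, 1.0) / math.sqrt(chi2_3 / 3.0)   # Student t, 3 d.f.
    return normal_part + 0.5 * t3


def simulate(n, c_max, beta=(1.0, 1.0), seed=0):
    """Return a list of (X, Delta, Z1, Z2) with C ~ U[0, c_max]."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z1 = 1.0 if rng.random() < 0.5 else 0.0      # Bernoulli(0.5)
        z2 = rng.uniform(1.0, 3.0)                   # U[1, 3]
        t = beta[0] * z1 + beta[1] * z2 + draw_error(rng)
        c = rng.uniform(0.0, c_max)                  # independent censoring
        data.append((min(t, c), 1 if t <= c else 0, z1, z2))
    return data


def censoring_rate(data):
    return sum(1 - d for _, d, _, _ in data) / len(data)
```

Under these assumptions, `simulate(n, 8.3)` and `simulate(n, 25.0)` give censoring proportions near the heavy (about 30 %) and light (about 10 %) rates quoted above, which offers a rough check that the reading of the design is reasonable.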
From Tables 1 and 2, it can be seen that all four estimators have very small biases, which is consistent with their being asymptotically unbiased and consistent. Meanwhile, the SD (standard deviation) approximates the SE (standard error) of each estimator well, and the coverage probability is close to the nominal confidence level of 95 %. The choice of bandwidth has little influence on the results, and the proposed GMM and EL estimators perform better than the Koul et al. and LS estimators, with smaller SE and SD and shorter LEN, since the former make use of more information.
In addition, comparing the GMM and EL methods, the GMM estimator generally has smaller SE and SD, while the EL estimator has shorter LEN, especially as the censoring rate increases. Moreover, the results in Table 2 under heavy censoring are very similar to those in Table 1, which suggests that the EL and GMM estimators still perform well when the censoring is heavier. The main aim of this paper is to reduce SD and SE through the proposed censored GMM and EL methods, and Tables 1 and 2 illustrate that this is indeed achieved.
Finally, we compare the performance of the proposed GMM and EL estimators with the Gehan and Logrank type rank regression estimators and the Buckley–James (B–J) type estimator. No smoothing technique is needed for the Gehan, Logrank and B–J estimators. Results are shown in Table 3 below. Zhou (2005) mainly derived a test and a confidence interval based on the rank estimator (Gehan and Logrank type estimators) of the regression coefficient in the accelerated failure time model. Comparing Tables 1, 2 and 3, we notice that the bias of the Gehan and Logrank estimators is clearly larger than that of the proposed GMM and EL estimators, especially as the censoring rate increases. Again, the GMM and EL methods generally perform well, with smaller SD and SE and shorter confidence intervals than the Gehan, Logrank and Buckley–James type estimators. As expected, the efficiency of the parameter estimators in the AFT model can be improved by taking auxiliary quantile information into account, which explains why the proposed estimators outperform the existing methods.
5.2 A real data example
We illustrate the proposed estimating method with the Stanford Heart Transplant data. These data contain the survival times of 184 heart-transplanted patients with their ages at the time of first transplant and their T5 mismatch scores, and details can be seen in Miller and Halpern (1982). Out of these 184 patients, 27 patients did not have T5 scores. And of the remaining 157 patients, the survival times of 55 patients were censored. The cutoff date for the data was in February 1980. It is reasonable to believe that the censoring is dictated by administrative decisions. So, we can estimate the survival function of C by Kaplan–Meier estimator.
In Miller and Halpern’s paper, however, the T5 mismatch score was nonsignificant, so only age was considered in their further analysis. Moreover, the survival times (T) of 5 of the 157 patients were less than 10 days, so these patients were deleted to make \(\log _{10}{T}>0\). Similarly, in this paper, we use the same dataset of 152 patients, consider only the age covariate, and adopt the same model as in Miller and Halpern (1982) in order to compare our proposed estimators with theirs. The model is
Three different c are chosen for the bandwidth parameter, which result in three different smoothing bandwidths. The analysis results can be seen in Table 4.
The Buckley–James estimator was given in Miller and Halpern (1982), and the Gehan and Logrank type rank regression estimators were obtained with the R code given in Zhou (2005). As shown in Table 4, the proposed GMM and EL estimators indeed have smaller SD than the other estimators. Compared with the Koul et al. and LS estimators, the GMM and EL estimators look more stable, and their estimates are similar to those of the Buckley–James, Gehan and Logrank estimators. Moreover, different choices of smoothing bandwidth have little effect on the results, especially for the \(\hbox {Age}^{2}\) covariate.
6 Discussion
In this article, we proposed a method to estimate the parameters of interest in the AFT model by combining quantile information with censored least-squares normal equations in the estimating equations. The proposed method is based on the EL and GMM methods, and the resulting estimators both have smaller standard error and standard deviation than other estimators such as the Koul et al. and LS estimators, as illustrated in the simulation studies. Their asymptotic properties were studied under some regularity conditions. However, some problems need further study. For example, both referees asked whether some practical guideline can be used to choose the bandwidth in the kernel smoothing procedure. To our knowledge, there is no standard method so far, especially for right-censored data. In this paper, the bandwidth is chosen according to a rule of thumb, namely \(b=c\times \sigma \{\frac{\varDelta }{\widehat{K}(X)}(X-Z^{\tau }\beta )\}n^{-1/3}\), where \(\sigma \{X\}\) is the standard deviation of X. In fact, varying c over a wide range of possible choices has little effect on the results, as can be seen from Tables 1 and 2. Besides, the optimal bandwidth may vary across datasets; choosing it is a difficult but interesting question that deserves further study.
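The rule-of-thumb bandwidth restated above is straightforward to compute once the IPW weights and residuals are available. The sketch below is illustrative only: the function name `rule_of_thumb_bandwidth` is hypothetical, the weights \(\varDelta _i/\widehat{K}(X_i)\) and the residuals \(X_i-Z_i^{\tau }\widehat{\beta }\) are assumed to be computed beforehand, and the population standard deviation is used for \(\sigma \{\cdot \}\), which is a choice of this sketch since the paper does not say which version is meant.

```python
import statistics


def rule_of_thumb_bandwidth(weights, residuals, c=2.0):
    """b = c * sd{ weight_i * residual_i } * n^(-1/3).

    weights   : IPW weights Delta_i / K_hat(X_i), assumed precomputed
    residuals : X_i - Z_i' beta_hat, e.g. with beta_hat the LS estimate
    c         : tuning constant (values such as 1.5 to 3.5 are used in Sect. 5.1)
    """
    weighted = [w * r for w, r in zip(weights, residuals)]
    n = len(weighted)
    return c * statistics.pstdev(weighted) * n ** (-1.0 / 3.0)
```

Since \(n^{-1/3}=o(n^{-1/4})\), a bandwidth of this form is compatible with the rate implied by Assumption 1 for a second-order kernel.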
7 Appendix
In this section, we present the proofs of Theorems 1–3. First, we need some assumptions and notation. Let \(\Vert \cdot \Vert \) denote the Euclidean norm, \(a^{\otimes 2}=aa^{\tau }\), and \(O_p(\cdot )\) denote boundedness in probability. Assume that \(\beta \in \varTheta \), where \(\varTheta \) is a compact set.
Assumptions:
1. The selected bandwidth b satisfies the conditions: \(b\rightarrow 0\), \(nb\rightarrow \infty \) and \(nb^{2r}\rightarrow 0\).
2. L(x) is the r-th kernel distribution function such that \(\int |x|^r\mathrm{d}L(x)<\infty \).
3. \(\tau _s\le \tau _k\), where \(\tau _{s}=\sup \{x:S(x)>0\}\), \(\tau _{k}=\sup \{x:K(x)>0\}\), and
$$\begin{aligned} \int _0^{\tau _{s}}\frac{\psi (u,z,\beta )^{\otimes 2}}{K(u)}\mathrm{d}F_z(u)<\infty . \end{aligned}$$
4. \(Q(\beta )=E_z\psi (T,Z,\beta )\) is r-th continuously differentiable in the neighborhood of \(\beta _0\), the rank of \(\partial {Q(\beta )}/{\partial {\beta }}\) is identical to the dimension of parameter \(\beta \), \(\parallel \partial {Q(\beta )}/{\partial {\beta }}\parallel \) and \(\parallel \psi (u,z,\beta )\parallel ^3/K^2(u)\) is bounded by some integrable function G(u) in some neighborhood of \(\beta _0\).
5. Matrix \(B(\beta _0)\) is positive definite.
6. \(Q(\beta )\) satisfies the Lipschitz condition in some neighborhood of \(\beta _0\), that is \(\parallel E_z\{\psi (T,Z,\beta )-\psi (T,Z,\beta _0)\}^{\otimes 2}\parallel =O(\parallel \beta -\beta _0\parallel )\) in some neighborhood of \(\beta _0\).
Assumptions 1 and 2 are common assumptions used in nonparametric studies, while Assumption 3 is often seen in studies of censored survival data and Assumptions 4–5 are used in empirical likelihood (Qin and Lawless 1994). Assumption 6 can be easily satisfied in many occasions. Note that we only need to smooth the second part of (2). So, in the proof of Lemmas 3, 4, \(\phi =\phi _{(2)}\), \(\psi =\psi _{(2)}\).
Lemma 1
Suppose that Assumption 3 is satisfied. Then,
where \(\varSigma (\beta _0)=(\sigma _{lk}(\beta _0))_{l,k=1,\ldots ,q}\) is the covariance matrix with
where
\(\psi _k(T,Z,\beta )\) is the kth element of \(\psi (T,Z,\beta )\).
Proof
The proof may be constructed along the lines of Bang and Tsiatis (2000). \(\square \)
Lemma 2
Suppose that Assumptions 1–3 and 6 are satisfied. Then,
where \(\varSigma (\beta _0)\) is given in Lemma 1.
Proof
We only need to show that
In fact,
Next, we will prove \(J_{1}=o_{p}(1)\). Similar to the argument of Bang and Tsiatis (2000),
where \(\tilde{\phi }(T_i,Z_i,\beta _0)=\phi (X_i,Z_i,\beta _0)-\psi (X_i,Z_i,\beta _0)\), \(\tilde{G}(\beta _0,u)\) is defined similarly to \(G_k(\beta _0,u)\) in Theorem 1, and \(\mathcal {M}_i^c(u)\) is a martingale (for more details, see Bang and Tsiatis 2000, p 332). It can be shown that
In addition, note that for any constant vector \(\alpha \),
From assumption of \(F_z(\cdot )\) and assumption of \(L(\cdot )\), we have
By (20) and (21) we can get \(I_{1}=o_{p}(1)\). Now, we consider \(I_{2}\).
By the property of martingale, we have \(EI_{2}=0\) and the kth diagonal element of variance of \(I_{2}\) is given by
By the results established for \(I_1\), we know that
Using (22), we have
By (22) and (23) we have that \(I_{2}=o_{p}(1)\). Combining this with \(I_{1}=o_{p}(1)\), we complete the proof of Lemma 2. \(\square \)
Lemma 3
Suppose that Assumptions 1–3 and 6 are satisfied. Then,
Proof
\(\square \)
Lemma 4
Suppose that Assumptions 1–3 and 6 are satisfied. Then,
Proof
First, we will prove that
Note that
By the law of large numbers, we have
In addition,
Using the fact of Zhou (1991)
we have
By (26), (27), (28) and Lemma 3, we get (25).
Similarly to (27), we can get
By the law of large numbers and Lemma 3, we have
As for \(I_2\), notice that
Using the fact of Gill (1980, p 37)
and (25) we get \(I_{2}=o_{p}(1)\).
Now, we consider the third part \(I_{3}\)
and
Using (21), similarly to the proof of \(J_2\), we have that \(I_3=o_p(1)\). This completes the proof of the first result of Lemma 4.
Now, we will consider the second result of Lemma 4, note that
we have
where \(\rho (\beta )\equiv E\psi (T,Z^{\tau }\beta )\). So, by the law of large numbers,
Using the fact of Gill (1980, p 37) again
Combining (32) and (33), we complete the proof of the second result of Lemma 4. \(\square \)
Lemma 5
Suppose that Assumptions 1–3 and 6 are satisfied. Then, for any \(\beta \) in \(\{\beta : ||\beta -\beta _0||\le cn^{-\varrho }\}\), where \(1/3<\varrho <1/2\) and c is some constant, we have
Proof
We can get the result only by a Taylor expansion. \(\square \)
Lemma 6
Suppose that Assumptions 1–6 are satisfied. Then \(\lambda (\beta )=O_p(n^{-\varrho })\) uniformly on \(\{\beta : ||\beta -\beta _0||\le cn^{-\varrho }\}\), where \(1/3<\varrho <1/2\) and c is some constant, and
uniformly on \(\{\beta : ||\beta -\beta _0||\le cn^{-\varrho }\},\) where \(\lambda (\beta )\) satisfies (10).
Proof
By assumptions and the proof of Lemma 3 in Owen (1990), \(\max _{1\le i\le n}|V^{(0)}_{ni}\phi (T_i,Z_i,\beta )|=o_p(n^{1/3})\), where \(V^{(0)}_{ni}=\varDelta _i/K(X_i)\), so we have:
where \(X_{(n)}\) is the largest order statistic. Using the following equality (Zhou 1991),
we have
Using Lemma 5, similar to the proof of Lemma 3 in Owen (1990), we have \( \lambda (\beta )=O_p(n^{-\varrho })\). Using Eq. (10),
where \(Y_i=\lambda (\beta )^{\tau }W_{ni}(\beta )\), and
So we have
where
Thus, we complete the proof of Lemma 6. \(\square \)
Lemma 7
Suppose that Assumptions 1–6 hold. Then, as \(n\rightarrow \infty ,\) with probability 1 \(\mathcal{R}(\beta )\) attains its minimum value at some point \(\widehat{\beta }_{e}\) in the interior of the ball \(\Vert \beta -\beta _0\Vert \le cn^{-\varrho },\) with \(\widehat{\beta }_{e}\) and \(\widehat{\lambda }=\lambda (\widehat{\beta }_e)\) satisfying
where
Proof
Similar to the proof of Lemma 1 in Qin and Lawless (1994). \(\square \)
Proof of Theorem 1
Given Lemmas 1–7, the proof of Theorem 1 can be constructed along the lines of Theorem 1 of Qin and Lawless (1994). Here, we only give a sketch of the proof. It is easy to show that
where \(\delta _n=\Vert \widehat{\beta }_e-\beta _0\Vert +\Vert \widehat{\lambda }\Vert \) and
From Lemma 2, we have \(Q_{1n}(\beta _0,0)=\frac{1}{n}\sum _{i=1}^nV_{ni}\phi (T_i,Z_i,\beta _0) =O_p(n^{-1/2})\). So, we know that \(\delta _n=O_p(n^{-1/2})\). Easily we have
\(\square \)
Proof of Theorem 2
The log empirical likelihood ratio multiplied by \(-2\) is given by
where
From Lemma 2, we have
In addition, by the virtue of Lemma 4,
So using (39), (40), (41), we complete the proof of Theorem 2. \(\square \)
Proof of Corollary 1
Denote
As in Corollary 5 in Qin and Lawless (1994), \(B_2\) is a non-negative definite matrix, and Corollary 1 can then be proved easily by Lemma 3 in Qin and Jing (2001). \(\square \)
Proof of Theorem 3
It is easy to show that \(\widehat{\beta }_g\) is a consistent estimator of \(\beta _0\) (see, for example, Newey and McFadden 1994, chapter 36, p 2132). By the assumptions, the first-order condition
is satisfied with probability approaching one. Expanding \(W_{ni}(\widehat{\beta }_g)\) around \(\beta _0\) and multiplying through by \(\sqrt{n}\), we have
where \(\bar{\beta }\) lies between \(\widehat{\beta }_g\) and \(\beta _0\). By Assumption 4 and Lemma 4,
Thus, we have, in probability, \(\left[ A_n(\widehat{\beta }_g)^{\tau }\widehat{W}A_n(\bar{\beta })\right] ^{-1}\! A_n(\widehat{\beta }_g)^{\tau }\widehat{W}\!\rightarrow ^{p}(A^{\tau }WA)^{-1}A^{\tau }W\). The conclusion then follows by the Slutsky theorem. \(\square \)
References
Bang, H., Tsiatis, A. A. (2000). Estimating medical costs with censored data. Biometrika, 87, 329–343.
Bang, H., Tsiatis, A. A. (2002). Median regression with censored cost data. Biometrics, 58, 643–649.
Buckley, J., James, I. (1979). Linear regression with censored data. Biometrika, 66, 429–436.
Chen, S. X., Hall, P. (1993). Smoothed empirical likelihood confidence intervals for quantiles. The Annals of Statistics, 21, 1166–1181.
Cui, H. J., He, X. M., Zhu, L. X. (2002). On regression estimators with de-noised variables. Statistica Sinica, 12, 1191–1205.
Fan, J., Yao, Q. (2003). Nonlinear time series: nonparametric and parametric methods. New York: Springer.
Gill, R. D. (1980). Censoring and stochastic integrals. Statistica Neerlandica, 34, 124.
Hansen, L. P. (1982). Large sample properties of generalized method of moments estimators. Econometrica, 50, 1029–1054.
Heller, G. (2007). Smoothed rank regression with censored data. Journal of the American Statistical Association, 102, 552–559.
Horvitz, D. G., Thompson, D. J. (1952). A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47, 663–685.
Kaplan, E. L., Meier, P. (1958). Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53, 457–481.
Koenker, R., Bassett, G. (1978). Regression quantiles. Econometrica, 46, 33–50.
Koul, H., Susarla, V., van Ryzin, J. (1981). Regression analysis with randomly right-censored data. The Annals of Statistics, 9, 1276–1288.
Lai, T. L., Ying, Z. (1991). Large sample theory of a modified Buckley–James estimator for regression analysis with censored data. The Annals of Statistics, 19, 1370–1402.
Lin, D. Y. (2000). Linear regression analysis of censored medical costs. Biostatistics, 1, 35–47.
Lin, D. Y., Ying, Z. (1993). A simple nonparametric estimator of the bivariate survival function under univariate censoring. Biometrika, 80, 573–581.
Liu, X., Ishfaq, A. (2011). Distribution estimation with smoothed auxiliary information. Acta Mathematicae Applicatae Sinica, English Series, 27, 167–176.
Liu, X., Liu, P., Zhou, Y. (2011). Distribution estimation with auxiliary information for missing data. Journal of Statistical Planning and Inference, 141, 711–724.
Miller, R., Halpern, J. (1982). Regression with censored data. Biometrika, 69, 521–531.
Newey, W. K., McFadden, D. (1994). Large sample estimation and hypothesis testing. Handbook of Econometrics, 4, 2111–2245.
Owen, A. B. (1990). Empirical likelihood ratio confidence regions. The Annals of Statistics, 18, 90–120.
Qin, G., Jing, B.-Y. (2001). Empirical likelihood for censored linear regression. Scandinavian Journal of Statistics, 28, 661–673.
Qin, J., Lawless, J. (1994). Empirical likelihood and general estimating equations. The Annals of Statistics, 22, 300–325.
Ritov, Y. (1990). Estimation in a linear regression model with censored data. The Annals of Statistics, 18, 303–328.
Robins, J. M., Rotnitzky, A. (1992). Recovery of information and adjustment for dependent censoring using surrogate markers. AIDS epidemiology-methodological issues (pp. 297–331). Boston: Birkhäuser.
Sepanski, J. H., Knickerbocker, R., Carroll, R. J. (1994). A semiparametric correction for attenuation. Journal of the American Statistical Association, 89, 1366–1373.
Song, X., Ma, S., Huang, J., Zhou, X. (2007). A semiparametric approach for the nonparametric transformation survival model with multiple covariates. Biostatistics, 8, 197–211.
Wei, L. J. (1992). The accelerated failure time model: a useful alternative to the Cox regression model in survival analysis. Statistics in Medicine, 11, 1871–1879.
Zhao, H., Tsiatis, A. A. (1997). A consistent estimator for the distribution of quality adjusted survival time. Biometrika, 84, 339–348.
Zhou, M. (1991). Some properties of the Kaplan-Meier estimator for independent nonidentically distributed random variables. The Annals of Statistics, 19, 2266–2274.
Zhou, M. (2005). Empirical likelihood analysis of the rank estimator for the censored accelerated failure time model. Biometrika, 92, 492–498.
Zhou, W., Jing, B. Y. (2003). Smoothed empirical likelihood confidence intervals for the difference of quantiles. Statistica Sinica, 13, 83–95.
Zhou, Y., Liang, H. (2009). Statistical inference for semiparametric varying-coefficient partially linear models with error-prone linear covariates. The Annals of Statistics, 37, 427–458.
Zhou, Y., Wan, A. T. K., Wang, X. (2008). Estimating equations inference with missing data. Journal of the American Statistical Association, 103, 1187–1199.
Zhou, Y., Wan, A. T. K., Yuan, Y. (2011). Combining least-squares and quantile regressions. Journal of Statistical Planning and Inference, 141, 3814–3828.
Acknowledgments
The authors thank the editor, the associate editor and two anonymous referees for their helpful comments and suggestions which have substantially improved this paper.
Additional information
Y. Zhou’s work was supported by National Natural Science Foundation of China (NSFC) (71271128), the State Key Program of National Natural Science Foundation of China (71331006), NCMIS, Key Laboratory of RCSDS, AMSS, CAS (2008DP173182), IRTSHUFE and PCSIRT (IRT13077). Y. Wang’s work was supported by Outstanding Ph.D. Dissertation Cultivation Funds of Shanghai University of Finance and Economics, Graduate Education Innovation Funds of Shanghai University of Finance and Economics (No. CXJJ-2011-438).
About this article
Cite this article
Zhao, M., Wang, Y. & Zhou, Y. Accelerated failure time model with quantile information. Ann Inst Stat Math 68, 1001–1024 (2016). https://doi.org/10.1007/s10463-015-0522-0