1 Introduction

Consider the following partially linear varying-coefficient errors-in-variables (EV) model

$$\begin{aligned} \left\{ \begin{array}{l} Y_i=X_i^\tau \beta +W_i^\tau a(T_i)+\epsilon _i, \\ \xi _i=X_i+e_i, \end{array} \right. i=1,2,\ldots ,n, \end{aligned}$$
(1.1)

where \(Y_i\) are scalar response variables, \((X_i^\tau , W_i^\tau , T_i)\) are covariates, \(a(\cdot )=(a_1(\cdot ),\cdots ,a_q(\cdot ))^\tau \) is a q-dimensional vector of unknown coefficient functions, \(\beta =(\beta _1,\cdots ,\beta _p)^\tau \) is a p-dimensional vector of unknown parameters, and \(\epsilon _i\) are random errors. To avoid the curse of dimensionality, we assume that \(T_i\) is univariate. The measurement errors \(e_i\) are independent and identically distributed (i.i.d.) with mean zero and covariance matrix \(\Sigma _e\), and are independent of \((Y_i, X_i, W_i, T_i)\). In order to identify the model, \(\Sigma _e\) is assumed to be known. When \(\Sigma _e\) is unknown, one can estimate it by the approach proposed by Liang et al. (1999).

When \(X_i\) are observed exactly, the model (1.1) boils down to the partially linear varying-coefficient model, which has been studied by many authors. For example, Fan and Huang (2005) proposed a profile least squares method to estimate the unknown parameters and established the asymptotic normality of the estimator. Based on this estimator, they introduced a profile empirical likelihood ratio test and showed that the test statistic is asymptotically \(\chi ^2\) distributed under the null hypothesis. In addition, Ahmad et al. (2005), You and Zhou (2006), Huang and Zhang (2009), Wang et al. (2011) and Bravo (2014) extensively explored partially linear varying-coefficient models; see Zhou et al. (2010), Wei et al. (2012) and Singh et al. (2014) for related research on EV models.

For model (1.1), You and Chen (2006) studied the case where the covariates are observed with measurement errors and proposed estimators for the parametric and nonparametric components, respectively. When the covariates in the nonparametric part are measured with errors, Feng and Xue (2014) investigated profile least squares estimators and conducted a linear hypothesis test for the parametric part.

It is worth pointing out that the works mentioned above all assume that the variables or errors are independent. However, the independence assumption is inadequate in some applications, especially in economics and financial analysis, where the data often exhibit some degree of dependence. Therefore, dependent data have drawn considerable interest from statisticians. One important case is serially correlated errors, such as AR(1) errors, MA(\(\infty \)) errors, negatively associated errors and martingale difference errors. See, for example, You et al. (2005), Liang et al. (2006), Liang and Jing (2009), You and Chen (2007), Fan et al. (2013), Fan et al. (2013) and Miao et al. (2013).

As is well known, the empirical likelihood (EL) introduced by Owen (1988, 1990) is an effective method for constructing confidence regions and enjoys numerous nice properties over normal approximation-based methods and the bootstrap [see Hall (1992), Hall and La Scala (1990), Zi et al. (2012)]. The EL for model (1.1) or the partially linear varying-coefficient model has been studied by several authors, for example, You and Zhou (2006), Huang and Zhang (2009), Wang et al. (2011), and Fan et al. (2012) for the partially time-varying coefficient (in this case \(T_i=i/n\)) errors-in-variables model. The EL in these papers is based on linear functionals of the parametric or nonparametric parts of the models. However, when nonlinear functionals are involved, such as U-statistics or the variance of a random sample, an application of the EL method becomes computationally difficult and the Wilks theorem does not hold in general, i.e., the asymptotic distribution of the EL ratio is not a chi-squared distribution. Fortunately, in the study of the EL for one- and two-sample U-statistics, Jing et al. (2009) proposed a new approach called jackknife empirical likelihood (JEL), which can handle situations where nonlinear statistics are involved. Another attractive feature of the JEL is that it is simple to use. Thanks to these advantages, the JEL method has been widely applied in recent years; see, for example, Gong et al. (2010), Peng (2012), Peng et al. (2012) and Feng and Peng (2012).

In the sequel, we assume that \(\{(X_i, W_i, T_i, \epsilon _i),i\ge 1\}\) in model (1.1) is a sequence of stationary \(\alpha \)-mixing random variables with \(E(\epsilon _i|X_i,W_i,T_i)=0 ~a.s.\) and \(E(\epsilon _i^2|X_i,W_i,T_i)=\sigma ^2 ~a.s.\) Recall that a sequence \(\{\zeta _k, k\ge 1\}\) is said to be \(\alpha \)-mixing if the \(\alpha \)-mixing coefficient

$$\begin{aligned} \alpha (n):\mathop {=}\limits ^\mathrm{def}\sup _{k\ge 1}\sup \{|P(AB)-P(A)P(B)|: A\in \mathcal{F}^\infty _{n+k}, B\in \mathcal{F}^k_1\} \end{aligned}$$

converges to zero as \(n\rightarrow \infty \), where \(\mathcal{F}^m_l=\sigma \{\zeta _l, \zeta _{l+1},\cdots ,\zeta _m\}\) denotes the \(\sigma \)-algebra generated by \(\zeta _l, \zeta _{l+1},\ldots ,\zeta _m\) with \(l\le m\). Among the most frequently used mixing conditions, \(\alpha \)-mixing is the weakest, and many time series exhibit the \(\alpha \)-mixing property. For a more detailed and general review, we refer to Doukhan (1994) and Lin and Lu (1996).

In this paper, we focus on estimating the error variance \(\sigma ^2\) and investigate the asymptotic normality of its estimator. It is well known that the error of a regression model affects its performance, so studying the error variance can help researchers improve the accuracy of the model, and it is therefore necessary to investigate large sample properties of estimators of the error variance. Up to now, only a few researchers have discussed the asymptotic normality of estimators of the error variance; among them, we refer to You and Chen (2006), Liang and Jing (2009), Zhang and Liang (2012) and Fan et al. (2013), Fan et al. (2013). At the same time, we construct a jackknife estimator of \(\sigma ^2\) as well as a JEL statistic, and prove that they are asymptotically normal and asymptotically \(\chi ^2\) distributed, respectively. Based on the JEL statistic of \(\sigma ^2\), we can construct a confidence interval for it, which plays a crucial role in quantifying estimation uncertainty. The study of the error variance thus provides a more comprehensive understanding of the model and improves the resulting statistical inference. These results are new, even for independent data.

We organize the paper as follows. In Sect. 2, we describe the methodologies and show how to build the estimators. The main results are listed in Sect. 3. Section 4 presents a simulation study to verify the idea and demonstrate the advantages of the jackknife method. Proofs of the main results are given in Sect. 5. Some preliminary lemmas, which are used in the proofs of the main results, are collected in the Appendix.

2 Estimators

2.1 Profile least squares estimation

The local linear regression technique is applied to estimate the coefficient functions \(\{a_j(\cdot ),j=1,2,\cdots ,q\}\) in (1.1). For t in a small neighborhood of \(t_0\), one can approximate \(a_j(t)\) locally by a linear function \(a_j(t)\approx a_j(t_0)+a'_j(t_0)(t-t_0)\equiv a^*_j+b^*_j(t-t_0)\), \(j=1,2,\cdots ,q,\) where \(a'_j(t)=\partial a_j(t)/\partial t\). If \(\beta \) is known, this leads to the following weighted local least-squares problem: find \((a^*,b^*)\) so as to minimize

$$\begin{aligned} \sum _{i=1}^n\Big [Y_i-X_i^\tau \beta -\Big (W_i^\tau ,~~\frac{T_i-t}{h}W_i^\tau \Big )\Big ( \begin{array}{c} a^*\\ hb^* \end{array} \Big )\Big ]^2 K_h(T_i-t), \end{aligned}$$
(2.1)

where \(a^*=(a^*_1,a^*_2,\cdots ,a^*_q)^\tau \), \(b^*=(b^*_1,b^*_2,\cdots ,b^*_q)^\tau \), \(K_h(\cdot )=K(\cdot /h)/h\), \(K(\cdot )\) is a kernel function and \(0<h:=h_n\rightarrow 0\) is a bandwidth.

For the sake of descriptive convenience, we denote \(\mathbf Y =(Y_1,Y_2,\cdots ,Y_n)^\tau , \mathbf X =(X_1,X_2,\cdots ,X_n)^\tau , \mathbf W =(W_1,W_2,\cdots ,W_n)^\tau , \omega _t=diag(K_h(T_1-t),K_h(T_2-t),\cdots ,K_h(T_n-t))\), and

$$\begin{aligned} M=\left( \begin{array}{c} W_1^\tau a(T_1)\\ \vdots \\ W_n^\tau a(T_n) \end{array}\right) ,~~ D_t=\left( \begin{array}{cc} W_1^\tau &{}\quad \frac{T_1-t}{h}W_1^\tau \\ \vdots &{}\quad \vdots \\ W_n^\tau &{}\quad \frac{T_n-t}{h}W_n^\tau \end{array}\right) . \end{aligned}$$

Then the minimizer of (2.1) is found to be \( \Big (\begin{array}{c} \hat{a}^* \\ h\hat{b}^* \end{array}\Big ) =\{D_t^\tau \omega _tD_t\}^{-1}D_t^\tau \omega _t(\mathbf Y -\mathbf X \beta ). \) Therefore, when \(\beta \) is known, we obtain the estimator of \(a(t)\) by

$$\begin{aligned} \tilde{a}(t,\beta )=\Big (I_q,~~0_q\Big )\{D_t^\tau \omega _tD_t\}^{-1}D_t^\tau \omega _t(\mathbf Y -\mathbf X \beta ). \end{aligned}$$
(2.2)
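
To make this step concrete, the following minimal sketch (Python with numpy, assumed available) computes \(\tilde{a}(t,\beta )\) in (2.2) for a given \(\beta \); the function name, variable names and the default Epanechnikov kernel are illustrative choices, not taken from the paper.

```python
import numpy as np

def local_linear_a(t, beta, Y, X, W, T, h,
                   kernel=lambda u: 0.75 * (1 - u**2) * (np.abs(u) <= 1)):
    # Minimal sketch of (2.2), assuming beta is known; illustrative code only.
    n, q = W.shape
    u = (T - t) / h
    w = kernel(u) / h                      # K_h(T_i - t)
    D = np.hstack([W, u[:, None] * W])     # rows of D_t: (W_i^T, (T_i - t)/h * W_i^T)
    r = Y - X @ beta                       # partial residuals Y - X beta
    A = D.T @ (w[:, None] * D)             # D_t^T w_t D_t
    b = D.T @ (w * r)                      # D_t^T w_t (Y - X beta)
    sol = np.linalg.solve(A, b)
    return sol[:q]                         # first q entries give a~(t, beta)
```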

Let \(S_i=\Big (W_i^\tau ~~ 0\Big )\{D_{T_i}^\tau \omega _{T_i}D_{T_i}\}^{-1}D_{T_i}^\tau \omega _{T_i}\), \(\tilde{Y}_i=Y_i-S_i\mathbf Y \) and \(\tilde{X}_i^\tau =X_i^\tau -S_i\mathbf X \). Substituting (2.2) into the original varying-coefficient model and applying the least squares method, one can obtain the estimator of the parametric component \(\beta \), \( \tilde{\beta }=(\sum _{i=1}^n\tilde{X}_i\tilde{X}_i^\tau )^{-1}\sum _{i=1}^n\tilde{X}_i\tilde{Y}_i. \) However, since \(X_i\) cannot be observed directly and we observe \(\xi _i=X_i+e_i\) instead, we modify (2.1) as

$$\begin{aligned} \sum _{i=1}^n\Big [Y_i-\xi _i^\tau \beta -\Big (W_i^\tau ,~~\frac{T_i-t}{h}W_i^\tau \Big )\Big ( \begin{array}{c} a^*\\ hb^* \end{array} \Big )\Big ]^2 K_h(T_i-t)-n\beta ^\tau \Sigma _e\beta . \end{aligned}$$

Similarly, setting \(\tilde{\xi }_i^\tau =\xi _i^\tau -S_i\mathbf \xi \) with \(\mathbf \xi =(\xi _1,\xi _2,\cdots ,\xi _n)^\tau \), one can obtain the following modified profile least squares estimator of \(\beta \)

$$\begin{aligned} \hat{\beta }_n=\Big (\sum _{i=1}^n\tilde{\xi }_i\tilde{\xi }_i^\tau -n\Sigma _e\Big )^{-1}\sum _{i=1}^n\tilde{\xi }_i\tilde{Y}_i, \end{aligned}$$

and the estimators of \(a(\cdot )\) and \(\sigma ^2\), respectively

$$\begin{aligned} \hat{a}_n(t)=\Big (I_q,~~0_q\Big )\{D_t^\tau \omega _tD_t\}^{-1}D_t^\tau \omega _t(\mathbf Y -\mathbf \xi \hat{\beta }_n), \\ \hat{\sigma }_n^2=\frac{1}{n}\sum _{i=1}^n[Y_i-\xi _i^\tau \hat{\beta }_n-W_i^\tau \hat{a}_n(T_i)]^2-\hat{\beta }_n^\tau \Sigma _e\hat{\beta }_n. \end{aligned}$$
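
As a computational companion to the formulas above, here is a minimal sketch (Python with numpy) of the smoother rows \(S_i\), the pseudo observations \(\tilde{Y}_i\), \(\tilde{\xi }_i\), and the bias-corrected estimators \(\hat{\beta }_n\) and \(\hat{\sigma }_n^2\). It uses a naive \(O(n^2)\) loop and illustrative names; it is a sketch under the stated model, not the authors' implementation.

```python
import numpy as np

def profile_ev_estimates(Y, xi, W, T, Sigma_e, h,
                         kernel=lambda u: 0.75 * (1 - u**2) * (np.abs(u) <= 1)):
    # Minimal sketch of the modified profile least squares estimators; illustrative only.
    n, q = W.shape
    S = np.zeros((n, n))                   # rows: S_i = (W_i^T, 0){D^T w D}^{-1} D^T w
    for i in range(n):
        u = (T - T[i]) / h
        w = kernel(u) / h
        D = np.hstack([W, u[:, None] * W])
        left = np.concatenate([W[i], np.zeros(q)])
        S[i] = left @ np.linalg.solve(D.T @ (w[:, None] * D), (w[:, None] * D).T)
    Y_t = Y - S @ Y                        # pseudo responses Y~_i
    xi_t = xi - S @ xi                     # pseudo covariates xi~_i (rowwise)
    # attenuation-corrected normal equations for beta_hat_n
    beta_hat = np.linalg.solve(xi_t.T @ xi_t - n * Sigma_e, xi_t.T @ Y_t)
    resid = Y_t - xi_t @ beta_hat
    sigma2_hat = np.mean(resid**2) - beta_hat @ Sigma_e @ beta_hat
    return beta_hat, sigma2_hat, Y_t, xi_t
```

Here \(\hat{\sigma }_n^2\) is computed from the pseudo observations, using the identity \(W_i^\tau \hat{a}_n(T_i)=S_i(\mathbf Y -\mathbf \xi \hat{\beta }_n)\).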

2.2 Jackknife method

Since the estimators constructed above are based on the samples \((\tilde{\xi }_i, \tilde{Y}_i)_{i=1}^n\), we regard \((\tilde{\xi }_i, \tilde{Y}_i)\) as pseudo observations. Let \(\hat{\beta }_{n,-i}\) be the estimator of \(\beta \) when the ith pseudo observation is deleted,

$$\begin{aligned} \hat{\beta }_{n,-i}=\Big [\sum _{j\ne i}^n\tilde{\xi }_j\tilde{\xi }_j^\tau -(n-1)\Sigma _e\Big ]^{-1}\sum _{j\ne i}^n\tilde{\xi }_j\tilde{Y}_j. \end{aligned}$$

Therefore the ith Jackknife pseudo sample is \( J_i=n\hat{\beta }_n-(n-1)\hat{\beta }_{n,-i}. \) Hence, we have the Jackknife estimator of \(\beta \)

$$\begin{aligned} \hat{\beta }_J=\frac{1}{n}\sum _{i=1}^nJ_i=n\hat{\beta }_n-\frac{(n-1)}{n}\sum _{i=1}^n\hat{\beta }_{n,-i}. \end{aligned}$$

Note that \(\hat{\sigma }_n^2\) can be rewritten as \( \hat{\sigma }_n^2=\frac{1}{n}\sum _{i=1}^n(\tilde{Y}_i-\tilde{\xi }_i^\tau \hat{\beta }_n)^2-\hat{\beta }_n^\tau \Sigma _e\hat{\beta }_n. \) Similarly, let \(\hat{\sigma }^2_{n,-i}\) be the estimator of \(\sigma ^2\) when the ith pseudo observation is deleted, \( \hat{\sigma }_{n,-i}^2=\frac{1}{n-1}\sum _{j\ne i}^n(\tilde{Y}_j-\tilde{\xi }_j^\tau \hat{\beta }_{n,-i})^2 -\hat{\beta }_{n,-i}^\tau \Sigma _e\hat{\beta }_{n,-i}. \) Then the ith Jackknife pseudo sample is \( \sigma _{J_i}^2=n\hat{\sigma }_n^2-(n-1)\hat{\sigma }_{n,-i}^2, \) and the Jackknife estimator of \(\sigma ^2\) is

$$\begin{aligned} \hat{\sigma }_J^2=\frac{1}{n}\sum _{i=1}^n\sigma _{J_i}^2=n\hat{\sigma }_n^2-\frac{n-1}{n}\sum _{i=1}^n\hat{\sigma }_{n,-i}^2. \end{aligned}$$
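
Since the delete-one estimators above are defined directly on the pseudo observations, the pseudo samples \(\sigma _{J_i}^2\) can be computed with rank-one downdates of the normal equations. The following is a minimal sketch (Python with numpy); the names are illustrative, and Y_t, xi_t stand for \(\tilde{Y}_i\) and \(\tilde{\xi }_i\), e.g. as returned by the sketch in Sect. 2.1.

```python
import numpy as np

def jackknife_pseudo_sigma2(Y_t, xi_t, Sigma_e):
    # Minimal sketch of sigma^2_{J,i} and sigma_hat_J^2 from Sect. 2.2; illustrative only.
    n = len(Y_t)
    G = xi_t.T @ xi_t
    g = xi_t.T @ Y_t
    beta_full = np.linalg.solve(G - n * Sigma_e, g)
    sigma2_full = np.mean((Y_t - xi_t @ beta_full)**2) - beta_full @ Sigma_e @ beta_full
    pseudo = np.empty(n)
    for i in range(n):
        # delete the ith pseudo observation: rank-one downdate of G and g
        beta_i = np.linalg.solve(G - np.outer(xi_t[i], xi_t[i]) - (n - 1) * Sigma_e,
                                 g - xi_t[i] * Y_t[i])
        keep = np.arange(n) != i
        resid_i = Y_t[keep] - xi_t[keep] @ beta_i
        sigma2_i = np.mean(resid_i**2) - beta_i @ Sigma_e @ beta_i
        pseudo[i] = n * sigma2_full - (n - 1) * sigma2_i      # sigma^2_{J,i}
    return pseudo, pseudo.mean()                              # pseudo samples, sigma_hat_J^2
```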

Based on the Jackknife pseudo samples, one can construct the Jackknife empirical likelihood of \(\sigma ^2\)

$$\begin{aligned} L(\sigma ^2)\!:=\!\sup \Big \{\prod _{i=1}^n np_i: p_1>0, p_2>0,\ldots ,p_n>0,\sum _{i=1}^np_i=1,\sum _{i=1}^np_i\sigma _{J_i}^2=\sigma ^2\Big \}. \end{aligned}$$

The solution to the above maximization is \( \hat{p}_i=\frac{1}{n[1+\lambda (\sigma _{J_i}^2-\sigma ^2)]},~~i=1,2,\ldots ,n, \) where \(\lambda \) satisfies \( \frac{1}{n}\sum _{i=1}^n\frac{\sigma _{J_i}^2-\sigma ^2}{1+\lambda (\sigma _{J_i}^2-\sigma ^2)}=0. \) Therefore, we have the log empirical likelihood ratio function of \(\sigma ^2\)

$$\begin{aligned} l(\sigma ^2)=2\sum _{i=1}^n\log [1+\lambda (\sigma _{J_i}^2-\sigma ^2)]. \end{aligned}$$
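
Numerically, \(\lambda \) is the root of the monotone function \(g(\lambda )=\frac{1}{n}\sum _{i=1}^n\frac{\sigma _{J_i}^2-\sigma ^2}{1+\lambda (\sigma _{J_i}^2-\sigma ^2)}\) on the interval where all weights \(\hat{p}_i\) remain positive, so a bracketing root finder suffices. A minimal sketch (Python with numpy and scipy, assumed available; names are illustrative) is:

```python
import numpy as np
from scipy.optimize import brentq

def jel_log_ratio(pseudo, sigma2):
    # Minimal sketch of l(sigma^2); 'pseudo' holds the pseudo samples sigma^2_{J,i}.
    z = pseudo - sigma2
    if np.all(z > 0) or np.all(z < 0):
        return np.inf                      # sigma^2 lies outside the convex hull
    def g(lam):                            # (1/n) sum z_i / (1 + lam z_i)
        return np.mean(z / (1.0 + lam * z))
    lo = (-1.0 + 1e-10) / z.max()          # keep 1 + lam z_i > 0 for all i
    hi = (-1.0 + 1e-10) / z.min()
    lam = brentq(g, lo, hi)                # g is strictly decreasing on (lo, hi)
    return 2.0 * np.sum(np.log1p(lam * z))
```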

3 Main results

In order to formulate the main results, we need to impose the following basic assumptions.

  1. (A1)

    The random variable T has bounded support \(\Omega \), and its density function \(f(\cdot )\) is Lipschitz continuous and bounded away from 0 on its support.

  2. (A2)

    The \(q\times q\) matrix \(E(\mathbf W \mathbf W ^\tau |T)\) is nonsingular for each \(T \in \Omega \). \(E(\mathbf X \mathbf X ^\tau |T)\), \(E(\mathbf W \mathbf W ^\tau |T)\) and \(E(\mathbf X \mathbf W ^\tau |T)\) are all Lipschitz continuous. Set \(\Gamma (T_i)=E(W_iW_i^\tau |T_i)\) and \(\Phi (T_i)=E(X_iW_i^\tau |T_i)\), \(i=1,2,\cdots ,n\); the second-order derivatives of \(\Gamma (\cdot )\) and \(\Phi (\cdot )\) are bounded on \(\Omega \). The \(p\times p\) matrix \(E(X_1X_1^\tau )-E[\Phi (T_1)\Gamma ^{-1}(T_1)\Phi ^\tau (T_1)]\) is positive definite.

  3. (A3)

    There is a \(\delta >4\) such that \(E(\Vert X_1\Vert ^{2\delta }|T_1)<\infty ~ a.s.\), \(E(\Vert W_1\Vert ^{2\delta }|T_1)<\infty ~ a.s.\), \(E\Vert \xi _1\Vert ^{2\delta }<\infty \) and \(E[|\epsilon _1|^{2\delta }|X_1,W_1]<\infty ~ a.s.\)

  4. (A4)

    \(\{a_j(\cdot ),j=1,2,\cdots ,q\}\) have continuous second derivatives in \(T\in \Omega \).

  5. (A5)

    The function \(K(\cdot )\) is a symmetric probability density function with compact support and is Lipschitz continuous. The bandwidth h satisfies \(nh^8\rightarrow 0\) and \(nh^2/(\log n)^2\rightarrow \infty \).

  6. (A6)

    The \(\alpha \)-mixing coefficient \(\alpha (n)\) satisfies \(\alpha (n)=O(n^{-\lambda })\) for some \(\lambda >\max \{\frac{7\delta +4}{\delta -4},\frac{9\delta +4}{\delta +4}\}\), with the same \(\delta \) as in (A3).

Remark 3.1

  1. (a)

    Assumptions (A1)–(A6) are quite mild and commonly used in the literature. In particular, (A1)–(A2) and (A4)–(A5) are employed in Fan and Huang (2005) and Feng and Xue (2014).

  2. (b)

    Assumption (A3) implies \(E\Vert X_1\Vert ^{2\delta }<\infty \) and \(E\Vert W_1\Vert ^{2\delta }<\infty \).

  3. (c)

    Assumption (A6) allows a relatively slow mixing rate. In fact, when the \(\alpha \)-mixing coefficient decays exponentially, i.e., \(\alpha (n)=O(\rho ^n)\) with \(0<\rho <1\), one can easily verify that (A6) is satisfied.

Theorem 3.1

  1. (i)

    Suppose assumptions (A1)–(A6) are satisfied, then \( \sqrt{n}(\hat{\sigma }_n^2-\sigma ^2)\mathop {\rightarrow }\limits ^\mathcal{{D}} N(0,\Pi ), \) where \(\Pi =\lim _{n\rightarrow \infty }Var\{\frac{1}{\sqrt{n}}\sum _{i=1}^n(\epsilon _i-e_i^\tau \beta )^2\}\). Further, \(\hat{\Pi }\) is a plug-in estimator of \(\Pi \), where \( \hat{\Pi }=\frac{1}{n}\{\sum _{i=1}^n [(\tilde{Y}_i-\tilde{\xi }_i^\tau \hat{\beta }_n)^2-\hat{\beta }_n^\tau \Sigma _e\hat{\beta }_n-\hat{\sigma }_n^2]\}^2. \)

  2. (ii)

    Suppose assumptions (A1)–(A6) are satisfied, then \( \sqrt{n}(\hat{\sigma }^2_J-\sigma ^2)=\sqrt{n}(\hat{\sigma }^2_n-\sigma ^2)+o_p(1). \) Furthermore, with (i) we have \(\sqrt{n}(\hat{\sigma }_J^2-\sigma ^2)\mathop {\rightarrow }\limits ^\mathcal{{D}} N(0,\Pi ).\)

Theorem 3.2

Suppose assumptions (A1)–(A6) are satisfied, then \( \frac{\Sigma _4}{\Pi }l(\sigma ^2)\mathop {\rightarrow }\limits ^\mathcal{{D}}\chi _1^2, \) where \(\Sigma _4=E(\epsilon _1-e_1^\tau \beta )^4-(\sigma ^2+\beta ^\tau \Sigma _e\beta )^2>0\). Moreover, \(\hat{\Sigma }_4\) is a plug-in estimator of \(\Sigma _4\), where \( \hat{\Sigma }_4=\frac{1}{n}\sum _{i=1}^n\{ (\tilde{Y}_i-\tilde{\xi }_i^\tau \hat{\beta }_n)^4-(\hat{\beta }_n^\tau \Sigma _e\hat{\beta }_n+\hat{\sigma }_n^2)^2\}. \)
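
For reference, a minimal sketch of the plug-in estimator \(\hat{\Sigma }_4\) (Python with numpy; Y_t and xi_t denote the pseudo observations, and the other names are illustrative) is:

```python
import numpy as np

def sigma4_plugin(Y_t, xi_t, beta_hat, sigma2_hat, Sigma_e):
    # Minimal sketch of Sigma_4_hat in Theorem 3.2; illustrative names only.
    resid = Y_t - xi_t @ beta_hat                      # Y~_i - xi~_i^T beta_hat_n
    c = beta_hat @ Sigma_e @ beta_hat + sigma2_hat     # beta^T Sigma_e beta + sigma^2 (plug-in)
    return np.mean(resid**4 - c**2)
```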

Remark 3.2

  1. (a)

    Under the conditions of Theorem 3.2, if \(\{\epsilon _i\}\) is a sequence of independent random variables, then one can verify that \(\Pi =\Sigma _4\) and \(l(\sigma ^2)\mathop {\rightarrow }\limits ^\mathcal{{D}}\chi _1^2\). In this case, the jackknife empirical likelihood method does not require estimating the asymptotic variance \(\Sigma _4\) of the jackknife pseudo samples. However, when \(\{\epsilon _i\}\) is a sequence of dependent random variables, we cannot ignore the covariance between \((\epsilon _i-e_i^\tau \beta )^2\) and \((\epsilon _j-e_j^\tau \beta )^2\) for \(i\ne j\), which leads to \(\Pi \ne \Sigma _4\). Thus, to construct an approximate confidence interval for \(\sigma ^2\), we need to estimate both \(\Pi \) and \(\Sigma _4\).

  2. (b)

    From Theorem 3.2, it is easy to construct an approximate confidence region of level \(1-\tau \) for \(\sigma ^2\) as \(I(\tau )=\{\sigma ^2: \frac{\hat{\Sigma }_4}{\hat{\Pi }}l(\sigma ^2)\le c_\tau \}\), where \(c_\tau \) is chosen to satisfy \(P(\chi _1^2\le c_\tau )=1-\tau \); a computational sketch is given below.
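
A minimal sketch of this inversion (Python with numpy and scipy; it takes the plug-in values \(\hat{\Sigma }_4\) and \(\hat{\Pi }\) as inputs, reuses the hypothetical jel_log_ratio function sketched at the end of Sect. 2.2, and the grid choice is our own illustrative device) is:

```python
import numpy as np
from scipy.stats import chi2

def jel_confidence_interval(pseudo, Sigma4_hat, Pi_hat, tau=0.05, grid_size=400):
    # Minimal sketch of I(tau) in Remark 3.2(b): keep grid points sigma^2 whose
    # calibrated JEL ratio stays below the chi^2_1 quantile c_tau.
    c_tau = chi2.ppf(1 - tau, df=1)
    grid = np.linspace(pseudo.min(), pseudo.max(), grid_size)
    accepted = [s for s in grid
                if (Sigma4_hat / Pi_hat) * jel_log_ratio(pseudo, s) <= c_tau]
    return (min(accepted), max(accepted)) if accepted else None
```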

4 Simulation

In this section, we conduct a numerical simulation to investigate the finite sample behavior of the profile least squares estimator \(\hat{\sigma }_n^2\) and the jackknife estimator \(\hat{\sigma }_J^2\) in terms of sample means, bias and mean square error (MSE). Besides, we study the performance of the proposed jackknife empirical likelihood method for constructing confidence intervals for \(\sigma ^2\) and compare it with the normal approximation method in terms of coverage probability and average interval length.

Consider the following partially linear varying-coefficient EV model:

$$\begin{aligned} \left\{ \begin{aligned} Y_i&=X_{1i}\beta _1+X_{2i}\beta _2+W_{1i}a_1(T_i)+W_{2i}a_2(T_i)+\epsilon _i,\\ \xi _i&=X_i+e_i,\\ \end{aligned} \right. ~~i=1,2,\ldots ,n, \end{aligned}$$

where \(\beta _1=1\), \(\beta _2=2\), \(a_1(T)=\sin (6\pi T)\) and \(a_2(T)=\sin (2\pi T)\). The measurement error \(e_i \sim N(0,\Sigma _e)\), where \(\Sigma _e=0.3^2I_2\) and \(I_2\) is the \(2\times 2\) identity matrix. The covariates and errors \(X_i, W_i, T_i, \epsilon _i\) are generated from AR(1) models as follows:

  • \(X_{i,j}=\rho X_{i,j-1}+u_{i,j}\), \(i=1,2\), where the \(u_{i,j}\) are i.i.d. \(N(0, 1)\);

  • \(W_{i,j}=\rho ^2 W_{i,j-1}+w_{i,j}\), \(i=1,2\), where the \(w_{i,j}\) are i.i.d. \(N(0, 1)\);

  • \(T_j=\sqrt{\rho } T_{j-1}+t_j\), where the \(t_j\) are i.i.d. \(N(0,0.1^2)\);

  • \(\epsilon _j=\rho \epsilon _{j-1}+\eta _j\), where the \(\eta _j\) are i.i.d. \(N(0, 0.5)\).

It is easy to verify that \(\{X_i, W_i, T_i, \epsilon _i\}\) is a sequence of stationary \(\alpha \)-mixing random variables for \(0<\rho <1\) (see Doukhan 1994), while for \(\rho =0\), \(\{(X_i, W_i, T_i, \epsilon _i),~~i=1,2,\ldots ,n\}\) are i.i.d. random variables. In order to investigate the influence of dependence on the estimators, we take \(\rho =0\), 0.2, 0.5, 0.8, respectively. In fact, since the errors are generated from an AR(1) model, the true value of \(\sigma ^2\) is \(0.5/(1-\rho ^2)\), so \(\sigma ^2\) changes as the coefficient \(\rho \) changes.
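
A minimal sketch of this data-generating scheme (Python with numpy; the burn-in length, seeding and function name are our own illustrative choices, not taken from the paper) is:

```python
import numpy as np

def generate_data(n, rho, burn=200, seed=None):
    # Minimal sketch of the AR(1) designs in Sect. 4; illustrative code only.
    rng = np.random.default_rng(seed)
    m = n + burn
    X, W = np.zeros((m, 2)), np.zeros((m, 2))
    T, eps = np.zeros(m), np.zeros(m)
    for j in range(1, m):
        X[j] = rho * X[j - 1] + rng.standard_normal(2)            # u_{i,j} i.i.d. N(0,1)
        W[j] = rho**2 * W[j - 1] + rng.standard_normal(2)         # w_{i,j} i.i.d. N(0,1)
        T[j] = np.sqrt(rho) * T[j - 1] + rng.normal(0, 0.1)       # t_j i.i.d. N(0, 0.1^2)
        eps[j] = rho * eps[j - 1] + rng.normal(0, np.sqrt(0.5))   # eta_j i.i.d. N(0, 0.5)
    X, W, T, eps = X[burn:], W[burn:], T[burn:], eps[burn:]
    beta = np.array([1.0, 2.0])
    Y = X @ beta + W[:, 0] * np.sin(6 * np.pi * T) + W[:, 1] * np.sin(2 * np.pi * T) + eps
    xi = X + rng.normal(0, 0.3, size=(n, 2))                      # Sigma_e = 0.3^2 I_2
    return Y, xi, W, T
```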

The following simulation is based on 1000 replications. For the proposed estimators, we employ the quartic kernel function \(K(u)=15/16(1-u^2)^2I(|u|\le 1)\), and the bandwidth h is selected by minimizing the MSE over a grid search.

Table 1 Sample means, biases and mean square errors of the estimators \(\hat{\sigma }_n^2\) and \(\hat{\sigma }_J^2\)

Taking sample sizes \(n=50\), 100, 200, 500, we calculate the bias and MSE of \(\hat{\sigma }_n^2\) and \(\hat{\sigma }_J^2\), respectively, to evaluate the two estimators' performance. According to Table 1, the jackknife estimator generally performs better than the profile least squares estimator: both Bias(\(\hat{\sigma }_J^2\)) and MSE(\(\hat{\sigma }_J^2\)) are smaller than those of \(\hat{\sigma }_n^2\). Besides, both estimators become more accurate as n increases, and the gap between \(MSE(\hat{\sigma }_n^2)\) and \(MSE(\hat{\sigma }_J^2)\) narrows as n increases. In other words, the jackknife estimator can significantly improve the estimation accuracy when the sample size is small. In addition, as the dependence among observations increases (i.e., \(\rho \) increases), \(\sigma ^2\) becomes larger and the estimation accuracy decreases slightly under relatively strong dependence; specifically, the MSEs of both estimators become larger as \(\sigma ^2\) rises.

Table 2 Coverage probabilities of the jackknife empirical likelihood method (\(CP_J\)) and the normal approximation method based on \(\hat{\sigma }_n^2\) (\(CP_N\)) at confidence levels 0.90 and 0.95, respectively, together with the corresponding average interval lengths \(AIL_J\) and \(AIL_N\)

Coverage probabilities and average interval lengths are reported in Table 2, showing that the jackknife empirical likelihood method is much more accurate than the normal approximation method in all scenarios in terms of coverage probabilities, since the coverage probabilities of the JEL are closer to the nominal level than those of the normal approximation method (NAM). In most cases, the average interval lengths based on the JEL are shorter than those based on the NAM. More precisely, as n increases, the coverage probabilities of both the JEL method and the NAM move closer to the nominal level, and the confidence intervals of both methods become narrower. When \(\rho =0\), i.e., in the independent case, the JEL performs much better than the NAM, with higher coverage probabilities and shorter confidence intervals. When the dependence increases, the coverage probabilities drop slightly, since stronger dependence leads to a larger variance \(\sigma ^2\).

5 Proofs of main results

Throughout this paper, let C, \(C_1\), \(C_2\) denote finite positive constants whose values may change from line to line. Let \(\mu _i=\int u^iK(u)du\) and \(c_n=\{\log (n)/(nh)\}^{1/2}+h^2\). From (A5), one can easily verify that \(c_n=o(n^{-1/4})\). Set \(\epsilon =(\epsilon _1,\epsilon _2,\cdots ,\epsilon _n)^\tau \) and \(\mathbf 1 _n=(1,1,\cdots ,1)^\tau \).

Proof of Theorem 3.1

(i) From Lemma 6.3, it follows that \( \frac{1}{\sqrt{n}}\sum _{i=1}^n[(\epsilon _i-e_i^\tau \beta )^2-(\sigma ^2+\beta ^\tau \Sigma _e\beta )] \mathop {\rightarrow }\limits ^\mathcal{{D}} N(0,\Pi ), \) where \(\Pi =\lim _{n\rightarrow \infty }Var\{\frac{1}{\sqrt{n}}\sum _{i=1}^n(\epsilon _i-e_i^\tau \beta )^2\}\). Therefore, to prove Theorem 3.1 (i), it is sufficient to show that

$$\begin{aligned} \hat{\sigma }_n^2-\sigma ^2&=\frac{1}{n}\sum _{i=1}^n[(\epsilon _i-e_i^\tau \beta )^2-(\sigma ^2+\beta ^\tau \Sigma _e\beta )]+o_p(n^{-1/2}). \end{aligned}$$

From \( \hat{\sigma }_n^2=\frac{1}{n}\sum _{i=1}^n(\tilde{Y}_i-\tilde{\xi }_i^\tau \hat{\beta }_n)^2 -\hat{\beta }_n^\tau \Sigma _e\hat{\beta }_n, \) one can write

$$\begin{aligned} \hat{\sigma }_n^2-\sigma ^2&=\Big [\frac{1}{n}\sum _{i=1}^n(\tilde{Y}_i-\tilde{X}_i^\tau \hat{\beta }_n)^2-\sigma ^2\Big ]+\Big [\frac{1}{n}\sum _{i=1}^n\hat{\beta }_n^\tau \tilde{e}_i\tilde{e}_i^\tau \hat{\beta }_n-\hat{\beta }_n^\tau \Sigma _e\hat{\beta }_n\Big ] \nonumber \\&\qquad -\,\Big [\frac{2}{n}\sum _{i=1}^n(\tilde{Y}_i-\tilde{X}_i^\tau \hat{\beta }_n)\hat{\beta }_n^\tau \tilde{e}_i\Big ] \nonumber \\&:=A_1+A_2-A_3. \end{aligned}$$
(5.1)

First, we prove that

$$\begin{aligned} A_1=\frac{1}{n}\sum _{i=1}^n(\epsilon _i^2-\sigma ^2)+o_p\Big (\frac{1}{\sqrt{n}}\Big ), \end{aligned}$$
(5.2)
$$\begin{aligned} A_2=\frac{1}{n}\sum _{i=1}^n\beta ^\tau (e_ie_i^\tau -\Sigma _e)\beta +o_p\Big (\frac{1}{\sqrt{n}}\Big ), \end{aligned}$$
(5.3)
$$\begin{aligned} A_3=\frac{2}{n}\sum _{i=1}^n\epsilon _ie_i^\tau \beta +o_p\Big (\frac{1}{\sqrt{n}}\Big ). \end{aligned}$$
(5.4)

From the definition of \(\tilde{Y}_i\) and (1.1), one can write

$$\begin{aligned} A_1&=\frac{1}{n}\sum _{i=1}^n(\epsilon _i^2-\sigma ^2)+\frac{1}{n}\sum _{i=1}^n[\tilde{X}_i^\tau (\beta -\hat{\beta }_n)]^2+\frac{1}{n}\sum _{i=1}^n\tilde{M}_i^2+\frac{1}{n}\sum _{i=1}^n(S_i\epsilon )^2 \nonumber \\&\quad \ +\frac{2}{n}\sum _{i=1}^n[\tilde{X}_i^\tau (\beta -\hat{\beta }_n)]\tilde{M}_i+\frac{2}{n}\sum _{i=1}^n[\tilde{X}_i^\tau (\beta -\hat{\beta }_n)]\tilde{\epsilon }_i+\frac{2}{n}\sum _{i=1}^n\tilde{M}_i\tilde{\epsilon }_i\!-\!\frac{2}{n}\sum _{i=1}^n\epsilon _iS_i\epsilon \nonumber \\&:=\frac{1}{n}\sum _{i=1}^n(\epsilon _i^2-\sigma ^2)+\sum _{j=1}^7A_{1j}. \end{aligned}$$
(5.5)

Note that from the proof of Lemma 3 in Owen (1990) and (A3), we have \(\max _{1\le i\le n}\Vert X_i\Vert =o(n^{1/{2\delta }})~~a.s.\) and \(\max _{1\le i\le n}\Vert W_i\Vert =o(n^{1/{2\delta }})~~a.s.\)

Furthermore, from Lemma 6.6 and (A2), we have

$$\begin{aligned} \max _{1\le i\le n}\Vert \tilde{X}_i\Vert&\le \max _{1\le i\le n}\Vert X_i\Vert +\max _{1\le i\le n}\Vert W_i^\tau \Gamma ^{-1}(T_i)\Phi (T_i)\Vert \{1+O_p(c_n)\} \\&\le O_p(n^{1/{2\delta }})+C\max _{1\le i\le n}\Vert W_i^\tau \Vert \{1+O_p(c_n)\}=O_p(n^{1/{2\delta }}). \end{aligned}$$

Lemma 6.9 (i) gives \(\Vert \hat{\beta }_n-\beta \Vert =O_p(n^{-1/2})\), therefore

$$\begin{aligned} A_{11}=\frac{1}{n}\sum _{i=1}^n[\tilde{X}_i^\tau (\beta -\hat{\beta }_n)]^2\le \max _{1\le i\le n}\Vert \tilde{X}_i\Vert ^2\Vert \beta -\hat{\beta }_n\Vert ^2=O_p(n^{1/\delta -1})=o_p(n^{-1/2}). \end{aligned}$$
(5.6)

From (A1)–(A4), one can easily obtain that \(P\big (\frac{1}{n}\sum _{i=1}^n(W_i^\tau a(T_i))^2>\eta \big ) \le \frac{E[a^\tau (T_1)\Gamma (T_1)a(T_1)]}{\eta }<\frac{C}{\eta }, \) which implies \(\frac{1}{n}\sum _{i=1}^n(W_i^\tau a(T_i))^2=O_p(1)\). Together with (6.9) and (A5) we have

$$\begin{aligned} A_{12}=\frac{1}{n}\sum _{i=1}^n(W_i^\tau a(T_i))^2O_p(c_n^2)= O_p(c_n^2)=o_p(n^{-1/2}). \end{aligned}$$
(5.7)

Note that \(\frac{1}{n}\sum _{i=1}^nW_iW_i^\tau =O_p(1)\). Therefore, together with (6.14), we have

$$\begin{aligned} A_{13}=\frac{1}{n}\sum _{i=1}^n(S_i\epsilon )^2= \frac{1}{n}\sum _{i=1}^nW_i^\tau W_i O_p\Big (\frac{\log n}{nh}\Big ) =O_p\Big (\frac{\log n}{nh}\Big )=o_p(n^{-1/2}). \end{aligned}$$
(5.8)

From (6.9), (A3) and (A4), we have \( \max _{1\le i\le n}|\tilde{M}_i|=\max _{1\le i\le n}|W_i^\tau a(T_i)|O_p(c_n) =O_p(n^{1/{2\delta }})O_p(c_n). \) Similar to the proof of (5.6), one can obtain that

$$\begin{aligned} |A_{14}|\le 2(\max _{1\le i\le n}\Vert \tilde{X}_i^\tau \Vert \Vert \beta -\hat{\beta }_n\Vert \max _{1\le i\le n}|\tilde{M}_i|) =O_p(n^{1/\delta -1/2}c_n)=o_p(n^{-1/2}). \end{aligned}$$
(5.9)

As to \(A_{15}\), by (6.6), (6.14), Lemma 6.10, (A1), (A2) and (A5), we have

$$\begin{aligned} |A_{15}|&=\Bigg |\frac{2}{n}\sum _{i=1}^n\tilde{X}_i^\tau (\beta -\hat{\beta }_n)\epsilon _i-\frac{2}{n}\sum _{i=1}^n\tilde{X}_i^\tau (\beta -\hat{\beta }_n)W_i^\tau O_p\Bigg (\sqrt{\frac{\log n}{nh}}\Bigg )\Bigg | \nonumber \\&\le \Bigg \Vert \frac{2}{n}\sum _{i=1}^nX_i^\tau \epsilon _i\Bigg \Vert \Bigg \Vert \beta -\hat{\beta }_n\Bigg \Vert [1+O_p(c_n)] \nonumber \\&\ \quad +\Bigg \Vert \frac{2}{n}\sum _{i=1}^nW_i^\tau \Gamma ^{-1}(T_i)\Phi (T_i)\epsilon _i\Bigg \Vert \Bigg \Vert \beta -\hat{\beta }_n\Bigg \Vert [1+O_p(c_n)] \nonumber \\&\quad \ +\max _{1\le i\le n}\Vert \tilde{X}_i^\tau \Vert \max _{1\le i\le n}\Vert W_i^\tau \Vert \Vert \beta -\hat{\beta }_n\Vert O_p\Bigg (\sqrt{\frac{\log n}{nh}}\Bigg ) \nonumber \\&=o(n^{-1/4})O_p(n^{-1/2})+O_p(n^{1/\delta })O_p(n^{-1/2})O_p\Bigg (\sqrt{\frac{\log n}{nh}}\Bigg ) =o_p(n^{-1/2}). \end{aligned}$$
(5.10)

From (A1), (A2) and (A4), it is easy to verify that \(|\frac{1}{n}\sum _{i=1}^na^\tau (T_i)W_iW_i^\tau \mathbf 1 |=O_p(1)\). Therefore, with Lemma 6.10, (6.9) and (6.14), we have

$$\begin{aligned} |A_{16}|&=\Bigg |\frac{2}{n}\sum _{i=1}^na^\tau (T_i)W_i\tilde{\epsilon }_i\Bigg |O_p(c_n)\le \Bigg |\frac{2}{n}\sum _{i=1}^na^\tau (T_i)W_i\epsilon _i\Bigg |O_p(c_n) \nonumber \\&\qquad +\,\Bigg |\frac{2}{n}\sum _{i=1}^na^\tau (T_i)W_iW_i^\tau \mathbf 1 \Bigg |O_p(c_n)O_p\Bigg (\sqrt{\frac{\log n}{nh}}\Bigg ) =o_p(n^{-1/2}). \end{aligned}$$
(5.11)

From Lemma 6.10 and (6.14), it is directly derived that

$$\begin{aligned} |A_{17}|=\Bigg |\frac{2}{n}\sum _{i=1}^n\epsilon _iW_i^\tau \mathbf 1 \Bigg |O_p\Bigg (\sqrt{\frac{\log n}{nh}}\Bigg ) =o(n^{-1/4})O_p\Bigg (\sqrt{\frac{\log n}{nh}}\Bigg )=o_p(n^{-1/2}). \end{aligned}$$
(5.12)

Hence, with (5.5)–(5.12), we finish the proof of (5.2). Write

$$\begin{aligned} A_2&=\frac{1}{n}\sum _{i=1}^n\beta ^\tau (e_ie_i^\tau -\Sigma _e)\beta +\frac{1}{n}\sum _{i=1}^n(\hat{\beta }_n-\beta )^\tau (e_ie_i^\tau -\Sigma _e)(\hat{\beta }_n-\beta ) \nonumber \\&\quad \ +\frac{1}{n}\sum _{i=1}^n(\hat{\beta }_n-\beta )^\tau (e_ie_i^\tau -\Sigma _e)\beta +\frac{1}{n}\sum _{i=1}^n\beta ^\tau (e_ie_i^\tau -\Sigma _e)(\hat{\beta }_n-\beta ) \nonumber \\&:=\frac{1}{n}\sum _{i=1}^n\beta ^\tau (e_ie_i^\tau -\Sigma _e)\beta +A_{21}+A_{22}+A_{23}. \end{aligned}$$
(5.13)

Note that \(\frac{1}{n}\sum _{i=1}^ne_ie_i^\tau -\Sigma _e=o_p(1)\) by the strong law of large numbers for i.i.d. random variables, and \(\Vert \hat{\beta }_n-\beta \Vert =O_p(n^{-1/2})\). Then

$$\begin{aligned} |A_{21}|&=\Bigg |(\hat{\beta }_n-\beta )^\tau \Bigg [\frac{1}{n}\sum _{i=1}^ne_ie_i^\tau -\Sigma _e\Bigg ](\hat{\beta }_n-\beta )\Bigg |=o_p(n^{-1})=o_p(n^{-1/2}), \end{aligned}$$
(5.14)
$$\begin{aligned} |A_{22}|&=\Bigg |(\hat{\beta }_n-\beta )^\tau \Bigg [\frac{1}{n}\sum _{i=1}^ne_ie_i^\tau -\Sigma _e\Bigg ]\beta \Bigg | =o_p(n^{-1/2}), \end{aligned}$$
(5.15)
$$\begin{aligned} |A_{23}|&=\Bigg |\beta ^\tau \Bigg [\frac{1}{n}\sum _{i=1}^ne_ie_i^\tau -\Sigma _e\Bigg ](\hat{\beta }_n-\beta )\Bigg | =o_p(n^{-1/2}). \end{aligned}$$
(5.16)

Hence, by (5.13)–(5.16), we complete the proof of (5.3). Write

$$\begin{aligned} A_3&=\frac{2}{n}\sum _{i=1}^n\epsilon _ie_i^\tau \beta -\frac{2}{n}\sum _{i=1}^nS_i\epsilon e_i^\tau \beta +\frac{2}{n}\sum _{i=1}^n[\tilde{X}_i^\tau (\beta -\hat{\beta }_n)+\tilde{M}_i]\beta ^\tau e_i \nonumber \\&\quad \ +\frac{2}{n}\sum _{i=1}^n[\tilde{X}_i^\tau (\beta -\hat{\beta }_n)+\tilde{M}_i+\tilde{\epsilon }_i](\hat{\beta }_n-\beta )^\tau e_i \nonumber \\&:=\frac{2}{n}\sum _{i=1}^n\epsilon _ie_i^\tau \beta +A_{31}+A_{32}+A_{33}. \end{aligned}$$
(5.17)

Applying Lemma 6.3, we have \(\Vert \frac{1}{n}\sum _{i=1}^nW_ie_i^\tau \Vert =O_p(n^{-1/2})\). Then by (6.14), we have

$$\begin{aligned} |A_{31}|=\Bigg |\frac{2}{n}\sum _{i=1}^n\mathbf 1 ^\tau W_ie_i^\tau \beta \Bigg |O_p\Bigg (\sqrt{\frac{\log n}{nh}}\Bigg ) =O_p(n^{-1/2})O_p\Bigg (\sqrt{\frac{\log n}{nh}}\Bigg )=o_p(n^{-1/2}). \end{aligned}$$
(5.18)

Similarly, by (6.6) and (6.9), one can obtain that

$$\begin{aligned} |A_{32}|&=\Bigg |(\beta -\hat{\beta }_n)^\tau \Bigg [\frac{2}{n}\sum _{i=1}^n\tilde{X}_ie_i^\tau \Bigg ]\beta +\Bigg [\frac{2}{n}\sum _{i=1}^n\tilde{M}_ie_i^\tau \Bigg ]\beta \Bigg | \nonumber \\&\le \,\Bigg |(\beta -\hat{\beta }_n)^\tau \Bigg [\frac{2}{n}\sum _{i=1}^nX_ie_i^\tau \Bigg ]\beta \Bigg |[1+O_p(c_n)] \nonumber \\&\quad +\,\Bigg |(\beta -\hat{\beta }_n)^\tau \Bigg [\frac{2}{n}\sum _{i=1}^n\Phi (T_i)\Gamma ^{-1}(T_i)W_ie_i^\tau \Bigg ]\beta \Bigg |[1+O_p(c_n)] \nonumber \\&\quad +\,\Bigg |\Bigg [\frac{2}{n}\sum _{i=1}^na^\tau (T_i)W_ie_i^\tau \Bigg ]\beta \Bigg |O_p(c_n)=o_p(n^{-1/2}), \end{aligned}$$
(5.19)
$$\begin{aligned} |A_{33}|&=\Bigg |(\beta -\hat{\beta }_n)^\tau \Bigg [\frac{2}{n}\sum _{i=1}^n\tilde{X}_ie_i^\tau \Bigg ](\beta -\hat{\beta }_n)\Bigg |+\Bigg |\Bigg [\frac{2}{n}\sum _{i=1}^n\tilde{M}_ie_i^\tau \Bigg ](\hat{\beta }_n-\beta )\Bigg | \nonumber \\&\qquad +\,\Bigg |\Bigg [\frac{2}{n}\sum _{i=1}^n\tilde{\epsilon }_ie_i^\tau \Bigg ](\hat{\beta }_n-\beta )\Bigg | \nonumber \\&\le \,O_p(n^{-1})O_p(n^{-1/2})+O_p(n^{-1})O_p(c_n)+\Bigg |\frac{2}{n}\sum _{i=1}^n\epsilon _ie_i^\tau (\hat{\beta }_n-\beta )\Bigg | \nonumber \\&\qquad +\,\Bigg |\frac{2}{n}\sum _{i=1}^nS_i\epsilon e_i^\tau (\hat{\beta }_n-\beta )\Bigg | \nonumber \\&=\,O_p(n^{-3/2})+O_p(n^{-1}c_n)\!+\!O_p(n^{-1})\!+\!O_p(n^{-1})O_p\Bigg (\!\sqrt{\frac{\log n}{nh}}\Bigg ) \!=\!o_p(n^{-1/2}). \end{aligned}$$
(5.20)

Combining (5.17)–(5.20), we prove (5.4). As a result, (5.1) can be written as

$$\begin{aligned} \hat{\sigma }_n^2-\sigma ^2=\frac{1}{n}\sum _{i=1}^n[(\epsilon _i-e_i^\tau \beta )^2-(\sigma ^2+\beta ^\tau \Sigma _e\beta )]+o_p(n^{-1/2}). \end{aligned}$$

This completes the proof of Theorem 3.1 (i).

(ii) To prove \(\sqrt{n}(\hat{\sigma }_J^2-\sigma ^2)=\sqrt{n}(\hat{\sigma }_n^2-\sigma ^2)+o_p(1)\), it is sufficient to prove that \(\hat{\sigma }_J^2=\hat{\sigma }_n^2+o_p(n^{-1/2}).\) According to the definition, we have \(\hat{\sigma }_J^2=\hat{\sigma }_n^2+\frac{n-1}{n}\sum _{i=1}^n(\hat{\sigma }_n^2-\hat{\sigma }_{n,-i}^2)\). Therefore, to obtain the desired result, we only need to prove

$$\begin{aligned} \sqrt{n}\sum _{i=1}^n(\hat{\sigma }_n^2-\hat{\sigma }_{n,-i}^2)=o_p(1). \end{aligned}$$
(5.21)

Note that \(\sum _{i=1}^n[\tilde{\xi }_i(\tilde{Y}_i-\tilde{\xi }_i^\tau \hat{\beta }_n)+\Sigma _e\hat{\beta }_n]=0\). Together with Lemma 6.4, we have

$$\begin{aligned} \sum _{i=1}^n(\hat{\sigma }_n^2-\hat{\sigma }_{n,-i}^2)&=\frac{1}{n-1}\sum _{i=1}^n(\hat{\beta }_n-\hat{\beta }_{n,-i})^\tau \tilde{\xi }_i\tilde{\xi }_i^\tau (\hat{\beta }_n-\hat{\beta }_{n,-i}) \\&\quad +\,\frac{2}{n-1}\sum _{i=1}^n[\tilde{\xi }_i^\tau (\tilde{Y}_i-\tilde{\xi }_i^\tau \hat{\beta }_n)+\hat{\beta }_n^\tau \Sigma _e](\hat{\beta }_n-\hat{\beta }_{n,-i}) \\&\quad +\,\sum _{i=1}^n(\hat{\beta }_n-\hat{\beta }_{n,-i})^\tau \Sigma _e(\hat{\beta }_n-\hat{\beta }_{n,-i}) \\&\quad -\,\frac{1}{n-1}\sum _{i=1}^n(\hat{\beta }_n-\hat{\beta }_{n,-i})^\tau \sum _{j=1}^n\tilde{\xi }_j\tilde{\xi }_j^\tau (\hat{\beta }_n-\hat{\beta }_{n,-i}):=\sum _{k=1}^4B_k. \end{aligned}$$

Therefore, to prove (5.21), it is sufficient to prove \(B_k=o_p(n^{-1/2}),~~k=1,2,3,4.\)

From Lemmas 6.7 and 6.11, we have

$$\begin{aligned} B_1=\frac{1}{n-1}\sum _{i=1}^n[(\hat{\beta }_n-\hat{\beta }_{n,-i})^\tau \tilde{\xi }_i\tilde{\xi }_i^\tau (\hat{\beta }_n-\hat{\beta }_{n,-i})] =O_p(n^{-2}). \end{aligned}$$
(5.22)

Similarly, one can easily check that

$$\begin{aligned} B_3&=\sum _{i=1}^n(\hat{\beta }_n-\hat{\beta }_{n,-i})^\tau \Sigma _e(\hat{\beta }_n-\hat{\beta }_{n,-i})=O_p(n^{-1}), \end{aligned}$$
(5.23)
$$\begin{aligned} B_4&=\sum _{i=1}^n(\hat{\beta }_n-\hat{\beta }_{n,-i})^\tau \frac{1}{n-1}\sum _{j=1}^n \tilde{\xi }_j\tilde{\xi }_j^\tau (\hat{\beta }_n-\hat{\beta }_{n,-i})=O_p(n^{-1}). \end{aligned}$$
(5.24)

Using Lemmas 6.11 and 6.5, we have

$$\begin{aligned} B_2^2&=\frac{4}{(n-1)^2}\Big \{\sum _{i=1}^n[\tilde{\xi }_i^\tau (\tilde{Y}_i-\tilde{\xi }_i^\tau \hat{\beta }_n)+\hat{\beta }_n^\tau \Sigma _e](\hat{\beta }_n-\hat{\beta }_{n,-i})\Big \}^2 \\&\le \frac{4n}{(n-1)^2}\sum _{i=1}^n(\tilde{\xi }_i^\tau (\tilde{Y}_i-\tilde{\xi }_i^\tau \hat{\beta }_n)+\hat{\beta }_n^\tau \Sigma _e)^2O_p(n^{-2})=O_p(n^{-2}). \end{aligned}$$

Therefore, one can obtain that

$$\begin{aligned} |B_2|=o_p(n^{-1/2}). \end{aligned}$$
(5.25)

Hence, combining (5.22)–(5.25), we finish the proof of (5.21). \(\square \)

Proof of Theorem 3.2

Define \(g(\lambda )=\frac{1}{n}\sum _{i=1}^n\frac{\sigma _{J_i}^2-\sigma ^2}{1+\lambda (\sigma _{J_i}^2-\sigma ^2)}\). It is easy to check that

$$\begin{aligned} 0=|g(\lambda )|&=\Big |\frac{1}{n}\sum _{i=1}^n(\sigma ^2_{J_i}-\sigma ^2)-\frac{\lambda }{n}\sum _{i=1}^n\frac{(\sigma ^2_{J_i}-\sigma ^2)^2}{1+\lambda (\sigma ^2_{J_i}-\sigma ^2)}\Big |\ge \frac{|\lambda | S_{\sigma ^2}}{1+|\lambda |R_n} \nonumber \\&\quad -\,\Big |\frac{1}{n}\sum _{i=1}^n\sigma _{J_i}^2-\sigma ^2\Big |, \end{aligned}$$

where \(S_{\sigma ^2}=\frac{1}{n}\sum _{i=1}^n(\sigma _{J_i}^2-\sigma ^2)^2\), \(R_n=\max _{1\le i\le n}|\sigma _{J_i}^2-\sigma ^2|\). Next we prove

$$\begin{aligned}&R_n=\max _{1\le i\le n}|\sigma _{J_i}^2-\sigma ^2|=o_p(\sqrt{n}), \end{aligned}$$
(5.26)
$$\begin{aligned}&S_{\sigma ^2}=\frac{1}{n}\sum _{i=1}^n(\sigma _{J_i}^2-\sigma ^2)^2\mathop {\rightarrow }\limits ^\mathrm{P} \Sigma _4. \end{aligned}$$
(5.27)

Write

$$\begin{aligned} \hat{\sigma }_n^2-\hat{\sigma }_{n,-i}^2&=\frac{1}{n-1}[(\tilde{Y}_i-\tilde{\xi }_i^\tau \hat{\beta }_n)^2-\hat{\sigma }_n^2-\hat{\beta }_n^\tau \Sigma _e\hat{\beta }_n] \\&\quad +\frac{1}{n-1}(\hat{\beta }_n-\hat{\beta }_{n,-i})^\tau \tilde{\xi }_i\tilde{\xi }_i^\tau (\hat{\beta }_n-\hat{\beta }_{n,-i}) \\&\quad +\frac{2}{n-1}[\tilde{\xi }_i^\tau (\tilde{Y}_i-\tilde{\xi }_i^\tau \hat{\beta }_n)+\hat{\beta }_n^\tau \Sigma _e](\hat{\beta }_n-\hat{\beta }_{n,-i}) \\&\quad +(\hat{\beta }_n-\hat{\beta }_{n,-i})^\tau \Sigma _e(\hat{\beta }_n-\hat{\beta }_{n,-i}) \\&\quad -\frac{1}{n-1}(\hat{\beta }_n-\hat{\beta }_{n,-i})^\tau \sum _{j=1}^n\tilde{\xi }_j\tilde{\xi }_j^\tau (\hat{\beta }_n-\hat{\beta }_{n,-i}):=\sum _{k=1}^5b_{ki}. \end{aligned}$$

Hence, to prove (5.26) we only need to prove \(\max _{1\le i\le n}|b_{ki}|=o_p(n^{-1/2})\) for \(k=1,2,3,4,5.\)

Apparently, we have

$$\begin{aligned} \frac{(n-1)^2}{n}\sum _{i=1}^nb_{1i}^2&=\frac{1}{n}\sum _{i=1}^n\left[ (\epsilon _i-e_i^\tau \beta )^4+(\tilde{\xi }_i^\tau (\beta -\hat{\beta }_n))^4+4(\epsilon _i-e_i^\tau \beta )^3\tilde{\xi }_i^\tau (\beta -\hat{\beta }_n)\right. \\&\left. \quad +\,4(\epsilon _i-e_i^\tau \beta )(\tilde{\xi }_i^\tau (\beta -\hat{\beta }_n))^3+6(\epsilon _i-e_i^\tau \beta )^2(\tilde{\xi }_i^\tau (\beta -\hat{\beta }_n))^2\right] \\&\quad -\,(\hat{\sigma }_n^2+\hat{\beta }_n^\tau \Sigma _e\hat{\beta }_n)^2. \end{aligned}$$

From (A3), we have

$$\begin{aligned} P\Big (n^{-3/2}\Big |\sum _{i=1}^n(\epsilon _i-e_i^\tau \beta )^3\tilde{\xi }_i\Big |>\eta \Big ) \le \frac{1}{\eta }n^{-3/2}\sum _{i=1}^nE|(\epsilon _i-e_i^\tau \beta )^3\tilde{\xi }_i|\rightarrow 0, \end{aligned}$$

which implies \(\frac{4}{n}\sum _{i=1}^n(\epsilon _i-e_i^\tau \beta )^3\tilde{\xi }_i^\tau (\beta -\hat{\beta }_n)=o_p(1)\) from \(\Vert \hat{\beta }_n-\beta \Vert =O_p(n^{-1/2})\) given by Lemma 6.9 (i). Similarly, \(\frac{4}{n}\sum _{i=1}^n(\epsilon _i-e_i^\tau \beta )(\tilde{\xi }_i^\tau (\beta -\hat{\beta }_n))^3=o_p(1)\), \(\frac{6}{n}\sum _{i=1}^n(\epsilon _i-e_i^\tau \beta )^2(\tilde{\xi }_i^\tau (\beta -\hat{\beta }_n))^2=o_p(1)\) and \(\frac{1}{n}\sum _{i=1}^n(\tilde{\xi }_i^\tau (\beta -\hat{\beta }_n))^4=o_p(1)\). Therefore, from Lemma 6.5, we have

$$\begin{aligned} \frac{(n-1)^2}{n}\sum _{i=1}^nb_{1i}^2 \mathop {\rightarrow }\limits ^\mathrm{P} E(\epsilon _1-e_1^\tau \beta )^4-(\sigma ^2+\beta ^\tau \Sigma _e\beta )^2=\Sigma _4. \end{aligned}$$
(5.28)

From (5.28), one can derive that

$$\begin{aligned} \max _{1\le i\le n}|b_{1i}|=o_p(n^{-1/2}). \end{aligned}$$
(5.29)

By the same arguments as used in (5.22)–(5.25), one can easily check

$$\begin{aligned} \max _{1\le i\le n}|b_{ki}|=O_p(n^{-1}),~~k=2,3,4,5. \end{aligned}$$
(5.30)

Hence, together with (5.29) and (5.30), we have proved (5.26).

According to Theorem 3.1, one can write \( S_{\sigma ^2}=\frac{1}{n}\sum _{i=1}^n(\sigma _{J_i}^2)^2-(\sigma ^2)^2+o_p(1), \)

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^n(\sigma _{J_i}^2)^2 =(\hat{\sigma }_n^2)^2 +\frac{2(n-1)}{n}\hat{\sigma }_n^2\sum _{i=1}^n(\hat{\sigma }_n^2-\hat{\sigma }_{n,-i}^2) +\frac{(n-1)^2}{n}\sum _{i=1}^n(\hat{\sigma }_n^2-\hat{\sigma }_{n,-i}^2)^2. \end{aligned}$$

Therefore, to prove (5.27), we first need to investigate the convergence of \(\frac{1}{n}\sum _{i=1}^n(\sigma _{J_i}^2)^2\).

From (5.21), we have \(\frac{2(n-1)}{n}\hat{\sigma }_n^2\sum _{i=1}^n(\hat{\sigma }_n^2-\hat{\sigma }_{n,-i}^2)=o_p(n^{-1/2})\). Using the same techniques as in the proof of (5.26), one can get \( \frac{(n-1)^2}{n}\sum _{i=1}^n(\hat{\sigma }_n^2-\hat{\sigma }_{n,-i}^2)^2 =\frac{(n-1)^2}{n}\sum _{i=1}^nb_{1i}^2+o_p(1). \) Together with (5.28), we have

$$\begin{aligned} S_{\sigma ^2}=\frac{(n-1)^2}{n}\sum _{i=1}^nb_{1i}^2+o_p(1) \mathop {\rightarrow }\limits ^\mathrm{P} \Sigma _4, \end{aligned}$$

which proves (5.27).

Applying Theorem 3.1, we have \(|\frac{1}{n}\sum _{i=1}^n\sigma _{J_i}^2-\sigma ^2|=O_p(n^{-1/2})\). Together with (5.27), we have \(\frac{|\lambda |}{1+|\lambda |R_n}=O_p(n^{-1/2})\). From (5.26), it follows that \(|\lambda |=O_p(n^{-1/2})\). Let \(\gamma _i=\lambda (\sigma _{J_i}^2-\sigma ^2)\), then still by (5.26), \(\max _{1\le i\le n}|\gamma _i|=|\lambda |R_n=o_p(1)\). Note that

$$\begin{aligned} 0=g(\lambda )&=\frac{1}{n}\sum _{i=1}^n(\sigma _{J_i}^2-\sigma ^2)\frac{1}{1+\gamma _i} =\frac{1}{n}\sum _{i=1}^n(\sigma _{J_i}^2-\sigma ^2)(1-\gamma _i+\frac{\gamma _i^2}{1+\gamma _i}) \\&=\frac{1}{n}\sum _{i=1}^n(\sigma _{J_i}^2-\sigma ^2)-\lambda S_{\sigma ^2} +\frac{1}{n}\sum _{i=1}^n(\sigma _{J_i}^2-\sigma ^2)\frac{\gamma _i^2}{1+\gamma _i}. \end{aligned}$$

By (5.26) and (5.27), it is easy to derive that \(\frac{1}{n}\sum _{i=1}^n(\sigma _{J_i}^2-\sigma ^2)\frac{\gamma _i^2}{1+\gamma _i} =\frac{1}{n}\sum _{i=1}^n(\sigma _{J_i}^2-\sigma ^2)^2\lambda ^2(\sigma _{J_i}^2-\sigma ^2)\frac{1}{1+\gamma _i} =o_p(n^{-1/2})\). Therefore

$$\begin{aligned} \lambda S_{\sigma ^2}=\frac{1}{n}\sum _{i=1}^n(\sigma _{J_i}^2-\sigma ^2)+o_p(n^{-1/2}). \end{aligned}$$

Denote \(\lambda = S_{\sigma ^2}^{-1}\frac{1}{n}\sum _{i=1}^n(\sigma _{J_i}^2-\sigma ^2)+\phi _n\), where \(|\phi _n|=o_p(n^{-1/2})\). Let \(\eta _i=\sum _{k=3}^\infty \frac{(-1)^{k-1}}{k}\gamma _i^k\), then \(\eta _i=O(\gamma _i^3)\), which implies \( |\sum _{i=1}^n\eta _i|\le C\sum _{i=1}^n|\gamma _i|^3 =C\sum _{i=1}^n|\lambda ^2(\sigma _{J_i}^2-\sigma ^2)^2\gamma _i| \le Cn\lambda ^2S_{\sigma ^2}\max _{1\le i\le n}|\gamma _i|=o_p(1)\). Hence

$$\begin{aligned} l(\sigma ^2)&=2\sum _{i=1}^n\gamma _i-\sum _{i=1}^n\gamma _i^2+2\sum _{i=1}^n\eta _i= 2\lambda n(\hat{\sigma }_J^2-\sigma ^2)-n\lambda ^2S_{\sigma ^2}+2\sum _{i=1}^n\eta _i \\&=2n(\hat{\sigma }_J^2-\sigma ^2)[S_{\sigma ^2}^{-1}(\hat{\sigma }_J^2-\sigma ^2)+\phi _n] -nS_{\sigma ^2}[S_{\sigma ^2}^{-1}(\hat{\sigma }_J^2-\sigma ^2)+\phi _n]^2+2\sum _{i=1}^n\eta _i \\&=nS_{\sigma ^2}^{-1}(\hat{\sigma }_J^2-\sigma ^2)^2-nS_{\sigma ^2}\phi _n^2+2\sum _{i=1}^n\eta _i \\&=nS_{\sigma ^2}^{-1}(\hat{\sigma }_J^2-\sigma ^2)^2+o_p(1). \end{aligned}$$

Finally, together with Theorem 3.1, we finish the proof of Theorem 3.2. \(\square \)