1 Introduction

1.1 The nonlinear regression model

First, we consider the nonlinear regression model for the observations \(X^n:=\left( X_1,X_2,\ldots ,X_n\right) \):

$$\begin{aligned} X_t=f_t\left( \theta \right) +\varepsilon _t,~~t=1,2,\ldots ,n, \end{aligned}$$
(1)

where the \(f_t\) are known continuous functions on a parameter set \(\varTheta \subset \mathscr {R}^k\), the \(\varepsilon _t\) are random errors and \(\theta \in \varTheta \) is the true value of the parameter. Denote

$$\begin{aligned} Q_n\left( \theta \right) =\sum \limits _{t=1}^n\left( X_t-f_t(\theta )\right) ^2. \end{aligned}$$

Let \(\hat{\theta }_n(X_1,X_2,\ldots ,X_n)\) denote the least squares (LS) estimator of the parameter \(\theta \in \varTheta \), that is, an estimator such that

$$\begin{aligned} Q_n\left( \hat{\theta }_n\right) =\inf \limits _{\theta \in \varTheta }\sum \limits _{t=1}^n\left( X_t-f_t(\theta )\right) ^2. \end{aligned}$$
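
As a concrete illustration (not part of the theoretical development), the following minimal Python sketch computes \(\hat{\theta }_n\) numerically for a scalar parameter by direct minimization of \(Q_n\); the helper name ls_estimate, the choice of optimizer and the regression function \(f_t(\theta )=(t+\theta )^2\) (the power curve model used later in Sect. 3) are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ls_estimate(X, f, theta_bounds):
    """LS estimator: minimize Q_n(theta) = sum_t (X_t - f_t(theta))^2 over a bounded interval."""
    t = np.arange(1, len(X) + 1)
    Q = lambda theta: np.sum((X - f(t, theta)) ** 2)
    return minimize_scalar(Q, bounds=theta_bounds, method="bounded").x

# Hypothetical example: f_t(theta) = (t + theta)^2 with i.i.d. N(0,1) errors
rng = np.random.default_rng(0)
n, theta0 = 100, 1.0
t = np.arange(1, n + 1)
X = (t + theta0) ** 2 + rng.normal(size=n)
print(ls_estimate(X, lambda t, th: (t + th) ** 2, (0.1, 5.0)))   # close to theta0 = 1
```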

The LS method plays a central role in the inference on parameters in nonlinear regression models. The asymptotic properties of the LS estimator have been the main subject of investigation, since it is in general difficult to obtain the exact distribution of the LS estimator for a fixed sample size. For the LS estimator of the nonlinear model based on independent identically distributed (i.i.d.) errors, Jennrich (1969) presented the asymptotic normality, Malinvaud (1970) obtained the consistency, and Wu (1981) established a necessary and sufficient condition for the strong consistency. In addition, for the nonlinear model based on independent but not necessarily identically distributed errors, Bunke and Schmidt (1980) established the strong consistency and asymptotic normality of the weighted LS estimator, Ibragimov and Has’minskii (1981) obtained some large deviation results for the maximum likelihood (ML) estimator, and Sieders and Dzhaparidze (1987) extended the results of Ibragimov and Has’minskii (1981) to the M-estimator and applied them to obtain large deviation results for the LS estimator. For the LS estimator of the nonlinear model, Prakasa Rao (1984a) extended the result of Ibragimov and Has’minskii (1981) to the case of i.i.d. Gaussian errors, and Hu (1993) extended the results of Prakasa Rao (1984a) and Sieders and Dzhaparidze (1987) to the cases of locally generalized Gaussian errors, martingale differences, etc.

As far as we know, there is no large deviation result for the LS estimator of model (1) based on extended negatively dependent (END, Liu 2009) errors. END sequences form a wide class of dependence structures which covers several negatively dependent sequences such as negatively orthant dependent (NOD, Lehmann 1966), negatively superadditive dependent (NSD, Hu 2000) and negatively associated (NA, Joag-Dev and Proschan 1983) sequences. Based on the M-estimator framework of Sieders and Dzhaparidze (1987), we obtain large deviation results, namely Theorems 2.1–2.4 and Corollary 2.1, for the LS estimator \(\hat{\theta }_n\) of \(\theta \in \mathscr {R}^k\) in model (1), which can be applied to establish a weak uniform consistency result and a complete convergence rate.

Now, we recall the M-estimator. Let \(\mathscr {E}^{(n)}=\{\mathscr {X}^{(n)},\mathscr {U}^{(n)},P_\theta ^{(n)},\theta \in \varTheta \}\) be a family of probability spaces, where \(P_\theta ^{(n)}\) does not necessarily have a known form. The parameter set \(\varTheta \) is a Borel subset of k-dimensional Euclidean space. We shall consider the M-estimator maximizing an M-functional \(C_n:\mathscr {X}^{(n)}\times \varTheta \rightarrow [0,\infty )\), which is assumed to be, for all \(X^n\in \mathscr {X}^{(n)}\), a positive continuous function of \(\theta \) and, for all \(\theta \in \varTheta \), a measurable functional of \(X^n\).

Throughout the paper, we assume that, for all \(\theta \in \varTheta \) and \(P_\theta ^{(n)}\)-almost all \(X^n\), a solution \(\hat{\theta }_n\) to the equation

$$\begin{aligned} C_n\left( X^n,\hat{\theta }_n\right) =\sup \limits _{\theta \in \varTheta }C_n\left( X^n,\theta \right) \end{aligned}$$
(2)

exists (this is certainly true if \(\varTheta \) is compact). The solution \(\hat{\theta }_n\) is called the M-estimator of \(\theta \). In particular, the LS estimator \(\hat{\theta }_n\) maximizes the M-functional

$$\begin{aligned} C_n\left( X^n,\theta \right) :=\exp \Bigg (-\frac{1}{2}\sum \limits _{t=1}^n\left( X_t-f_t(\theta )\right) ^2\Bigg ). \end{aligned}$$

For all \(n\in \mathcal {N}\) and \(\theta \in \varTheta \subset \mathscr {R}^k\), let \(u\in \mathscr {R}^k\), let \(\phi _n(\theta )\) be a nonsingular \(k\times k\) matrix, and define the normalized M-ratio

$$\begin{aligned} Z_{n,\theta }(u):=Z_{n,\theta }\left( X^n,u\right) =\frac{C_n\left( X^n,\theta +\phi _n(\theta )u\right) }{C_n\left( X^n,\theta \right) }, \end{aligned}$$
(3)

which, for fixed observation \(X^n\), is a continuous, nonnegative finite function on the set

$$\begin{aligned} U_{n,\theta }=\phi ^{-1}_n(\theta )\left( \varTheta -\theta \right) . \end{aligned}$$

Throughout the paper, for a matrix \(A_{m\times n}, |A_{m\times n}|\) denotes its norm. Define

$$\begin{aligned} {\varGamma }_{n,\theta , R}:=\bar{U}_{n,\theta }\cap \left\{ u:R\le |u|\le R+1\right\} , \end{aligned}$$

where \(\bar{U}_{n,\theta }\) is the closure of \(U_{n,\theta }\).

Similar to Theorem 1.5.1 of Ibragimov and Has’minskii (1981) and Theorem 2.1 of Sieders and Dzhaparidze (1987), we define the following sets of functions.

\(\mathbf G \) is the set of all functions \(g_n(\cdot )\) possessing the following properties:

(i) for fixed n, \(g_n(\cdot )\) is a function on \([0,\infty )\) monotonically increasing to infinity;

(ii) for all \(N>0\),

$$\begin{aligned} \lim \limits _{\begin{array}{c} R\rightarrow \infty \\ n\rightarrow \infty \end{array}}R^N\exp (-g_n(R))=0. \end{aligned}$$
(4)

Remark 1.1

If \(g_n(R)=R^{\alpha }\) and \(\alpha >0\), then \(g_n\in \mathbf G \).
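
Indeed, property (i) is immediate, and (4) holds because, for any fixed \(N>0\),

$$\begin{aligned} R^N\exp (-g_n(R))=\exp \left( N\log R-R^{\alpha }\right) \rightarrow 0\quad \text {as }R\rightarrow \infty , \end{aligned}$$

uniformly in n, since \(R^{\alpha }\) eventually dominates \(N\log R\).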

Let K be a measurable subset of \(\varTheta \) and \(\mathbf H _K\) be the set of all functions \(\eta _{n,\theta }(\cdot )\) possessing the following properties:

(iii) for fixed n and \(\theta \in \varTheta \), \(\eta _{n,\theta }(\cdot )\) is a function \(U_{n,\theta }\rightarrow (0,\infty )\);

(iv) there exists a polynomial pol\(_K(R)\) in R such that, for R and n sufficiently large,

$$\begin{aligned} \sup \limits _{\theta \in K;~u\in {\varGamma }_{n,\theta , R}}\left( \eta _{n,\theta }(u)\right) ^{-1}\le \text {pol}_K(R). \end{aligned}$$
(5)

For each n and \(\theta \), let \(\zeta _{n,\theta }:[0,\infty )\rightarrow \mathscr {R}\) be a monotonically nondecreasing continuous function and define the random function

$$\begin{aligned} \zeta _{n,\theta }(u):=\zeta _{n,\theta }\left( Z_{n,\theta }(u)\right) . \end{aligned}$$
(6)

As a generalization of Theorem 1.5.1 of Ibragimov and Has’minskii (1981), Sieders and Dzhaparidze (1987, Theorem 2.1) obtained a large deviation result for the M-estimator as follows.

Theorem 1.1

Let the functionals \(\zeta _{n,\theta }(u)\) possess the following properties: given a measurable subset \(K\subset \varTheta \subset \mathscr {R}^k\), there correspond to it numbers m and \(\alpha \), where \(m\ge \alpha >k\), functions \(g_{n}\in \mathbf G \) and \(\eta _{n,\theta } \in \mathbf H _K\), and a polynomial pol\(_K(R)\) in R such that, for all R and n large enough,

$$\begin{aligned} E_\theta ^{(n)}|\zeta _{n,\theta }(u)-\zeta _{n,\theta }(v)|^m\le & {} |u-v|^\alpha \text {pol}_K(R), \nonumber \\&\text {for all} ~\theta \in K~ \text {and}~ u,v \in {\varGamma }_{n,\theta , R}, \end{aligned}$$
(7)
$$\begin{aligned} P_\theta ^{(n)}(\zeta _{n,\theta }(u)-\zeta _{n,\theta }(0)\ge -\eta _{n,\theta }(u))\le & {} \exp (-g_{n}(R)) \nonumber \\&\text {for all} ~\theta \in K~ \text {and}~ u \in {\varGamma }_{n,\theta , R}. \end{aligned}$$
(8)

Then there exist positive constants \(B_0\) and \(b_0\) such that, for all n and H large enough,

$$\begin{aligned} \sup \limits _{\theta \in K}P_{\theta }^{(n)}\Big \{|\phi ^{-1}_n(\theta )\left( \hat{\theta }_n-\theta \right) |\ge H\Big \}\le B_0\exp \left( -b_0g_n(H)\right) . \end{aligned}$$
(9)

The constant \(b_0\) can be made arbitrarily close to \((\alpha -k)/(\alpha -k+mk)\) by choosing \(B_0\) large enough.

Remark 1.2

In view of Sieders and Dzhaparidze (1987), the condition (8) can be replaced by the following condition

$$\begin{aligned} P_\theta ^{(n)}\left( \zeta _{n,\theta }(u)-\zeta _{n,\theta }(0)\ge -\eta _{n,\theta }(u)\right) \le C\exp \left( -g_{n}(R)\right) \end{aligned}$$
(10)

for all \(\theta \in K\) and \(u \in {\varGamma }_{n,\theta , R}\), where C is a positive constant independent of n and \(\theta \).

Ibragimov and Has’minskii (1981, Theorem 1.5.1) obtained the large deviation result (9) for the ML estimator. Under i.i.d. Gaussian errors, Prakasa Rao (1984a) obtained the following result for the LS estimator \(\hat{\theta }_n\): for all \(\rho >0\) and \(n\ge 1\),

$$\begin{aligned} \sup \limits _{\theta \in K}P_\theta \left( n^{1/2}|\hat{\theta }_n-\theta |>\rho \right) \le B \mathrm{e}^{-b\rho ^2}, \end{aligned}$$

where K is a compact subset of \(\varTheta \subset \mathscr {R}\), and B and b are some positive constants. Hu (1993) extended (9) to the cases of locally generalized Gaussian errors and martingale differences. In addition, Ivanov (1976) investigated the LS estimator \(\hat{\theta }_n\) of model (1) based on i.i.d. errors, assuming that there exist positive constants \(D_1\) and \(D_2\) such that, for all \(\theta ,\theta ^{\prime }\in \varTheta \subset \mathscr {R}\),

$$\begin{aligned} D_1n\left( \theta -\theta ^{\prime }\right) ^2\le \sum \limits _{t=1}^n\left( f_t(\theta )-f_t(\theta ^{\prime })\right) ^2\le D_2n\left( \theta -\theta ^{\prime }\right) ^2. \end{aligned}$$
(11)

Under (11) and some other conditions, Ivanov (1976) showed that for all \(\rho >0\) and all \(n\ge 1\),

$$\begin{aligned} P_\theta \left( n^{1/2}|\hat{\theta }_n-\theta |>\rho \right) \le C\rho ^{-p}, \end{aligned}$$
(12)

where p is a positive constant with \(p\ge 2\), and C is a positive constant independent of n and \(\rho \). Prakasa Rao (1984b) extended (12) to the dependent cases of \(\varphi \)-mixing and \(\alpha \)-mixing errors. Under some general conditions and \(\sup \nolimits _{n\ge 1} E|\varepsilon _n|^p<\infty \) for some \(p>2\), Hu (2002) also obtained (12) and gave some applications to the dependent cases of martingale differences, \(\varphi \)-mixing sequences and NA sequences. Under the condition \(\sup \nolimits _{n\ge 1} E|\varepsilon _n|^p<\infty \) for some \(1<p\le 2\), Hu (2004) established that

$$\begin{aligned} P_\theta \left( n^{1/2}|\hat{\theta }_n-\theta |>\rho \right) \le Cn^{1-p/2}\rho ^{-p}, \end{aligned}$$
(13)

for all \(\rho >0\), \(n\ge 1\) and some \(C>0\), which was also applied to some dependent errors. In view of (12) and (13), using only some moment information on the errors, Yang and Hu (2014) obtained results similar to (12) and (13), which can be used in some cases where \(\sup \nolimits _{n\ge 1}E|\varepsilon _n|^p=\infty \) for some \(p>1\).

For more work on nonlinear regression models, one can refer to Ivanov and Leonenko (1989) and Ivanov (1997) for some basic asymptotic theory, Midi (1999) for the robustness of the weighted LS estimator under i.i.d. errors with mean zero and unknown variance \(\sigma ^2\), Ivanov and Leonenko (2008) for the consistency and asymptotic distribution theory of the LS estimator under long-range-dependent noise, etc.

Due to the importance of END random variables and of the LS estimator of a nonlinear regression parameter, we investigate the LS estimator \(\hat{\theta }_n\) for the model (1) based on END errors which are not necessarily identically distributed. Using the exponential inequalities for END random variables given in Sect. 5, we obtain large deviation results for the LS estimator \(\hat{\theta }_n\), which can be applied to obtain a weak uniform consistency result and the complete convergence rate \(\hat{\theta }_n-\theta =O(n^{-1/2}\log ^{1/2} n)\), completely (see our results in Sect. 2). Some examples and simulations for the nonlinear models are presented in Sect. 3, and the conclusions are given in Sect. 4. Finally, the proofs are collected in Sect. 5.

1.2 The concept of END random variables

In this subsection, we recall the concept of END random variables, which was introduced by Liu (2009).

Definition 1.1

Random variables \(\left\{ Z_n,n\ge 1\right\} \) are said to be END if there exists a constant \(M>0\) such that both

$$\begin{aligned} P\left( Z_i>z_i, i=1, 2, \ldots , n\right) \le M\prod \nolimits _{i=1}^nP\left( Z_i>z_i\right) \end{aligned}$$

and

$$\begin{aligned} P\left( Z_i\le z_i, i=1, 2, \ldots , n\right) \le M\prod \nolimits _{i=1}^nP\left( Z_i\le z_i\right) \end{aligned}$$

hold for each \(n\ge 1\) and all real numbers \(z_1,z_2,\ldots ,z_n\).
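
As a quick numerical illustration of Definition 1.1 (a Monte Carlo check, not a proof), the following Python sketch estimates both sides of the first inequality for a negatively correlated bivariate normal vector, which is NA and hence END with \(M=1\) (see the discussion of NA normal vectors later in this subsection); the sample size, correlation and thresholds are arbitrary choices, and the comparison holds up to Monte Carlo error.

```python
import numpy as np

rng = np.random.default_rng(1)
rho = -0.4                                    # negative correlation: NA, hence END with M = 1
cov = np.array([[1.0, rho], [rho, 1.0]])
Z = rng.multivariate_normal([0.0, 0.0], cov, size=500_000)

for z1, z2 in [(0.0, 0.0), (0.5, -0.5), (1.0, 1.0)]:
    joint = np.mean((Z[:, 0] > z1) & (Z[:, 1] > z2))          # estimate of P(Z1 > z1, Z2 > z2)
    product = np.mean(Z[:, 0] > z1) * np.mean(Z[:, 1] > z2)   # estimate of P(Z1 > z1) P(Z2 > z2)
    print(f"z=({z1:+.1f},{z2:+.1f}): joint={joint:.4f} <= product={product:.4f}")
```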

If \(\left\{ Z_n,n\ge 1\right\} \) is a sequence of END random variables, then for any fixed \(m\ge 1\), \(\left\{ Z_{n+m},n\ge 1\right\} \) is also a sequence of END random variables with the same dominating coefficient M. This property follows from Definition 1.1 and the continuity of probability.

Let \(\left\{ Z_n,n\ge 1\right\} \) be a sequence of random variables. If \(P\left( Z_i\le z_i\right) =0\) for some \(1\le i\le n\), then \(P\left( Z_1\le z_1, Z_2\le z_2,\ldots ,Z_n\le z_n\right) =0\). Similarly, if \(P \left( Z_i> z_i\right) =0\) for some \(1\le i\le n\), then \(P\left( Z_1> z_1, Z_2> z_2,\ldots ,Z_n>z_n\right) =0\). Adopt the convention \(\frac{0}{0}=1\). If

$$\begin{aligned} M_1=\sup \limits _{n\ge 1}\sup \limits _{z_i\in (-\infty ,\infty ),1\le i\le n}\frac{P\left( Z_i>z_i, i=1, 2, \ldots , n\right) }{\prod \nolimits _{i=1}^nP(Z_i>z_i)}<\infty \end{aligned}$$

and

$$\begin{aligned} M_2=\sup \limits _{n\ge 1}\sup \limits _{z_i\in (-\infty ,\infty ),1\le i\le n}\frac{P\left( Z_i\le z_i, i=1, 2, \ldots , n\right) }{\prod \nolimits _{i=1}^nP(Z_i\le z_i)}<\infty \end{aligned}$$

then we can take \(M=\max \{M_1,M_2\}\) in Definition 1.1 and conclude that \(\{Z_n,n\ge 1\}\) are END random variables. Obviously, letting \(z_i=-\infty \) or \(z_i=+\infty \) for all \(1\le i \le n\) in Definition 1.1, it is easy to see that the dominating coefficient satisfies \(M\ge 1\).

Moreover, for any \(n\ge 1\), let \(Z_1,Z_2,\ldots ,Z_n\) be dependent according to a multivariate copula function \(C(u_1,\ldots ,u_n)\) with absolutely continuous marginal distribution functions \(F_1,\ldots ,F_n\). Assume that the joint copula density

$$\begin{aligned} C_{1,\ldots ,n}(u_1,\ldots , u_n)=\frac{\partial ^{n}}{\partial u_1\ldots \partial u_n}C(u_1,\ldots ,u_n) \end{aligned}$$

exists and is uniformly bounded in the whole domain. Then random variables \(\{Z_n,n\ge 1\}\) are END (see Example 4.2 of Liu 2009). By Remark 3.1 of Ko and Tang (2008), the copulas in the Frank family of the form

$$\begin{aligned} C(u_1,\ldots ,u_n;\theta )=-\frac{1}{\theta }\ln \Big (1+\frac{(\mathrm{e}^{-\theta u_1}-1)\ldots (\mathrm{e}^{-\theta u_n}-1)}{(\mathrm{e}^{-\theta }-1)^{n-1}}\Big ),~~\theta <0 \end{aligned}$$

belong to this category. Meanwhile, Chen et al. (2010) showed that every n-dimensional Farlie–Gumbel–Morgenstern (FGM) distribution describes a specific END structure.

If \(M=1\), then END random variables reduce to NOD random variables (see Lehmann 1966), which contain NA random variables and NSD random variables (see Joag-Dev and Proschan 1983; Hu 2000; Wang et al. 2015a). Joag-Dev and Proschan (1983) established that a permutation distribution is NA. Recall that a family of real-valued random variables \(Z=\left\{ Z_t,t\in T\right\} \) is called a normal (or Gaussian) system if all its finite-dimensional distributions are Gaussian. Let \(Z=\left( Z_1,\ldots ,Z_n\right) \) be a normal random vector, \(n\ge 2\). Joag-Dev and Proschan (1983) proved that it is NA if and only if its components are non-positively correlated. They also pointed out that NA random variables are NOD random variables, but that the converse is not always true. For various examples of NA random variables and related fields, we refer to Bulinski and Shaskin (2007), Prakasa Rao (2012), Oliveira (2012) and so on. Since END random variables form a wide class of dependent random variables, many researchers have paid attention to their properties. For example, Liu (2009, 2010) studied precise large deviations and moderate deviations of END sequences with heavy tails; Chen et al. (2010) obtained a strong law of large numbers for END sequences and also established some large deviation inequalities with applications to risk theory and renewal theory; Shen (2011) obtained some moment inequalities for END sequences; Wang et al. (2013) and Hu et al. (2015) investigated complete convergence for END sequences; Wang et al. (2015b) investigated the nonparametric regression model under END errors, etc.

2 The large deviation results of the LS estimator

Let \(\varTheta \) be a Borel subset of \(\mathscr {R}^k, f_t(\theta )\) be a continuous deterministic function from \(\varTheta \) to \(\mathscr {R}\) for each \(t\in \mathcal N\). Assume that \(X^n:=\left( X_1,X_2,\ldots ,X_n\right) \) are the observed random variables of the nonlinear regression model (1).

The LS estimator \(\hat{\theta }_n\), which we assume to exist (see (2)), maximizes the M-functional

$$\begin{aligned} C_n(X^n,\theta ):=\exp \Big (-\frac{1}{2}\sum \limits _{t=1}^n\left( X_t-f_t(\theta )\right) ^2\Big ). \end{aligned}$$
(14)

Given a sequence of nonsingular \(k\times k\) matrix norming factors \(\phi _n(\theta )\), we define the ratio

$$\begin{aligned} Z_{n,\theta }(u):= & {} \frac{C_n\left( X^n,\theta +\phi _n(\theta )u\right) }{C_n\left( X^n,\theta \right) }\nonumber \\= & {} \exp \Bigg (\sum \limits _{t=1}^n d_{tn\theta }(u)\varepsilon _t-\frac{1}{2}\sum \limits _{t=1}^n d^2_{tn\theta }(u)\Bigg ), \end{aligned}$$
(15)

where

$$\begin{aligned} d_{tn\theta }(u)=f_t\left( \theta +\phi _n(\theta )u\right) -f_t(\theta ). \end{aligned}$$
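
The second equality in (15) follows from a direct computation: under \(P_\theta ^{(n)}\) we have \(X_t-f_t(\theta )=\varepsilon _t\) and \(X_t-f_t(\theta +\phi _n(\theta )u)=\varepsilon _t-d_{tn\theta }(u)\), so that

$$\begin{aligned} \log Z_{n,\theta }(u)=\frac{1}{2}\sum \limits _{t=1}^n\left[ \varepsilon _t^2-\left( \varepsilon _t-d_{tn\theta }(u)\right) ^2\right] =\sum \limits _{t=1}^n d_{tn\theta }(u)\varepsilon _t-\frac{1}{2}\sum \limits _{t=1}^n d^2_{tn\theta }(u). \end{aligned}$$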

Similar to Theorem 3.1 of Sieders and Dzhaparidze (1987), we assume that, for some Borel subset K of \(\varTheta \), there exist functions \(g_n(R)\in \mathbf G \), constants \(r>0\), \(\varLambda _1\in (0,\infty ]\), \(\delta \in (0,1/2)\), \(k_1>0\), \(\rho \in (0,1]\) and a polynomial pol(R) in R such that, for all n and R large enough, the following inequalities hold:

(N.1) for all \(t\in \mathcal {N}\) and \(|\lambda |\le \varLambda _1\),

$$\begin{aligned} E\exp \left( \lambda \varepsilon _t\right) \le \exp \left( \frac{1}{2}r\lambda ^2\right) ; \end{aligned}$$
(16)

(N.2) for all \(\theta \in K\) and \(u,v\in {\varGamma }_{n,\theta , R}\), where \(|u-v|\le k_1\), one has

$$\begin{aligned} \sum \limits _{t=1}^n\left[ f_t\left( \theta +\phi _n(\theta )u\right) -f_t\left( \theta +\phi _n(\theta )v\right) \right] ^2\le |u-v|^{2\rho }\text {pol}(R) \end{aligned}$$
(17)

and

$$\begin{aligned} \sum \limits _{t=1}^nd_{tn\theta }^2(u)\le \text {pol}(R); \end{aligned}$$
(18)

(N.3) for all \(\theta \in K\) and \(u\in {\varGamma }_{n,\theta , R}\), one has

$$\begin{aligned} \sum \limits _{t=1}^nd_{tn\theta }^2(u)\ge \max \Big (\frac{8r}{\delta ^2},\frac{4}{\varLambda _1\delta }\max \limits _{1\le t\le n}\left| d_{tn\theta }(u)\right| \Big )g_n(R). \end{aligned}$$
(19)

By (N.1)–(N.3), we have the following large deviation result.

Theorem 2.1

Assume that the errors \(\left\{ \varepsilon _t \right\} \) in the nonlinear regression model (1) are END random variables with \(E\varepsilon _t=0\) for all \(t\in \mathcal {N}\). For some \(K\subset \varTheta \subset \mathscr {R}^k\) and suitably chosen nonsingular \(\phi _n(\theta )\), let the assumptions (N.1)–(N.3) be fulfilled. Then there exist positive constants \(B_0\) and \(b_0\) such that, for all n and H large enough,

$$\begin{aligned} \sup \limits _{\theta \in K}P_{\theta }^{(n)}\Big \{|\phi ^{-1}_n(\theta )(\hat{\theta }_n-\theta )|\ge H\Big \}\le B_0\exp \left( -b_0g_n(H)\right) . \end{aligned}$$
(20)

Moreover, for any \(\beta >0\), we can choose \(B_0\) large enough such that \(b_0\ge \frac{\rho }{\rho +k}-\beta \).

We list two assumptions (N.1)\(^\prime \) and (N.4) as follows:

  • (N.1)\(^{\prime }\) for some \(r>0\), condition (N.1) holds with \(\varLambda _1=\infty \);

  • (N.4) there exist positive constants \(D_1\) and \(D_2\) such that, for all \(\theta ,\theta ^{\prime }\in \varTheta \subset \mathscr {R}^k\) and n large enough,

    $$\begin{aligned} D_1|\phi ^{-1}_n\left( \theta -\theta ^{\prime }\right) |^2\le \sum \limits _{t=1}^n\left( f_t(\theta )-f_t(\theta ^{\prime })\right) ^2\le D_2|\phi ^{-1}_n\left( \theta -\theta ^{\prime }\right) |^2. \end{aligned}$$
    (21)

Replacing (N.1)–(N.3) by (N.1)\(^\prime \) and (N.4), we obtain the following result.

Theorem 2.2

Assume that the errors \(\left\{ \varepsilon _t \right\} \) in the nonlinear regression model (1) are END random variables with \(E\varepsilon _t=0\) for all \(t\in \mathcal {N}\). For a suitably chosen nonsingular \(\phi _n(\theta )\), let the assumptions (N.1)\(^\prime \) and (N.4) be fulfilled. Then there exist positive constants \(B_0\) and b such that, for all n and H large enough,

$$\begin{aligned} \sup \limits _{\theta \in \varTheta }P_{\theta }^{(n)}\Big \{|\phi ^{-1}_n(\theta )(\hat{\theta }_n-\theta )|\ge H\Big \}\le B_0\exp (-bH^2). \end{aligned}$$
(22)

For any \(\beta >0\), \(B_0\) can be chosen large enough that \(b\ge \frac{D_1}{32r(1+k)}-\beta \).

Similar to (N.1) and (N.3), we give the following assumptions:

(N.1)\(^{*}\) for all \(t\in \mathcal {N}\), suppose that there exists a positive number L such that

$$\begin{aligned} |E\varepsilon _t^m|\le \frac{m!}{2}\sigma ^2L^{m-2}, \end{aligned}$$

for all positive integers \(m\ge 2\), where \(\sigma ^2=Var(\varepsilon _t)\);

(N.3)\(^{\prime }\) for all \(\theta \in K\) and \(u\in {\varGamma }_{n,\theta , R}\), one has

$$\begin{aligned} \sum \limits _{t=1}^nd_{tn\theta }^2(u)\ge \max \Big (\frac{16\sigma ^2}{\delta ^2},\frac{8L}{\delta }\max \limits _{1\le t\le n}|d_{tn\theta }(u)|\Big )g_n(R), \end{aligned}$$
(23)

where \(0<\delta <1/2\).

Therefore, similar to Theorems 2.1 and 2.2, we also establish the following results:

Theorem 2.3

Assume that the errors \(\left\{ \varepsilon _t \right\} \) in the nonlinear regression model (1) are END random variables with \(E\varepsilon _t=0\) and \(Var(\varepsilon _t)=\sigma ^2\) for all \(t\in \mathcal {N}\). For some \(K\subset \varTheta \subset \mathscr {R}^k\) and suitably chosen nonsingular \(\phi _n(\theta )\), let the assumptions (N.1)\(^{*}\), (N.2) and (N.3)\(^{\prime }\) be fulfilled. Then there exist positive constants \(B_0\) and \(b_0\) such that, for all n and H large enough, (20) holds. For any \(\beta >0\), \(B_0\) can be chosen large enough that \(b_0\ge \frac{\rho }{\rho +k}-\beta \).

Theorem 2.4

Assume that the errors \(\left\{ \varepsilon _t \right\} \) in the nonlinear regression model (1) are END random variables with \(E\varepsilon _t=0\) and \(Var(\varepsilon _t)=\sigma ^2\) for all \(t\in \mathcal {N}\). For a suitably chosen nonsingular \(\phi _n(\theta )\), let the conditions (N.1)\(^{*}\) and (N.4) be fulfilled. Then there exist positive constants \(B_0\) and \(C_0\) such that, for all n and H large enough,

$$\begin{aligned} \sup \limits _{\theta \in \varTheta }P_{\theta }^{(n)}\Big \{|\phi ^{-1}_n(\theta )\left( \hat{\theta }_n-\theta \right) |\ge H\Big \}\le B_0\exp (-C_0H). \end{aligned}$$
(24)

For all \(\theta \in \varTheta \subset \mathscr {R}^k\), all \(\rho >0\) and n large enough, by taking \(\phi _n(\theta )=n^{-1/2}I_k\) and \(H=n^{1/2}\rho \) in Theorem 2.2, we obtain the following corollary, where \(I_k\) is a \(k\times k\) unit matrix.

Corollary 2.1

Assume that the errors \(\{\varepsilon _t \}\) in the nonlinear regression model (1) are mean zero END random variables satisfying (N.1)\(^{\prime }\). Assume that there exist positive constants \(D_1\) and \(D_2\) such that, for all \(\theta ,\theta ^{\prime }\in \varTheta \subset \mathscr {R}^k\) and n large enough,

$$\begin{aligned} D_1n|\theta -\theta ^{\prime }|^2\le \sum \limits _{t=1}^n\left( f_t(\theta )-f_t(\theta ^{\prime })\right) ^2\le D_2n|\theta -\theta ^{\prime }|^2. \end{aligned}$$
(25)

Then for all \(\rho >0\) and n large enough,

$$\begin{aligned} \sup \limits _{\theta \in \varTheta }P_{\theta }^{(n)}\Big \{|\hat{\theta }_n-\theta |\ge \rho \Big \}\le B_0\exp (-b\rho ^2n), \end{aligned}$$
(26)

where \(B_0\) and b are as in (22). It follows that

$$\begin{aligned} \hat{\theta }_n-\theta =O\left( n^{-1/2}\log ^{1/2} n\right) ,\quad completely,\quad as\quad n\rightarrow \infty . \end{aligned}$$
(27)

Remark 2.1

Conditions (N.1), (N.1)\(^\prime \) and (N.1)\(^{*}\) control the tails of the errors \(\varepsilon _t\) for all \(t\in \mathcal {N}\). Similar to Condition III of Ivanov (1976) (see (11)), Assumption A(ii) of Wu (1981), (2.5) of Prakasa Rao (1984a) and (N.2) of Sieders and Dzhaparidze (1987), condition (N.2) is a Hölder-type continuity condition on the parametrization \(\theta \rightarrow f(\theta )\). Similar to (N.3) of Sieders and Dzhaparidze (1987), conditions (N.3) and (N.3)\(^\prime \) prescribe the rate of asymptotic separation. Asymptotic separation is a necessary condition for consistent estimation (see Theorem 1 of Wu 1981); similar conditions can be found in Condition III of Ivanov (1976), (2.6) of Prakasa Rao (1984a), etc. In addition, by the proof of Theorem 2.2 in Sect. 5, (N.2) and (N.3) follow from condition (N.4) together with \(\varLambda _1=\infty \). Similarly, by the proof of Theorem 2.4, (N.2) and (N.3)\(^\prime \) follow from (N.4).
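
For instance, (N.1)\(^\prime \) holds for Gaussian errors: if \(\varepsilon _t\sim N(0,\sigma _t^2)\) with \(\sup _{t\ge 1}\sigma _t^2\le r\), then for all \(\lambda \in \mathscr {R}\),

$$\begin{aligned} E\exp \left( \lambda \varepsilon _t\right) =\exp \Big (\frac{1}{2}\sigma _t^2\lambda ^2\Big )\le \exp \Big (\frac{1}{2}r\lambda ^2\Big ). \end{aligned}$$

Similarly, by Hoeffding's lemma, bounded mean zero errors with \(|\varepsilon _t|\le b\) satisfy (N.1)\(^\prime \) with \(r=b^2\), and (N.1)\(^{*}\) holds, for example, whenever \(|\varepsilon _t|\le L\) and \(Var(\varepsilon _t)=\sigma ^2\), since then \(|E\varepsilon _t^m|\le \sigma ^2L^{m-2}\le \frac{m!}{2}\sigma ^2L^{m-2}\) for all \(m\ge 2\).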

Assume that \(\{a_n,n\ge 1\}\) is a sequence of positive constants satisfying \(a_n\rightarrow 0\). For all \(\theta \in \varTheta \subset \mathscr {R}^k\), let \(\phi _n(\theta )=a_nI_k\) and let the conditions of Theorem 2.1 be fulfilled, where \(I_k\) is a \(k\times k\) unit matrix. Then for all \(\rho >0\), taking \(H=\frac{1}{a_n}\rho \) in (20), we obtain that

$$\begin{aligned} \sup \limits _{\theta \in \varTheta }P_\theta ^{(n)}\left( |\hat{\theta }_n-\theta |>\rho \right)= & {} \sup \limits _{\theta \in \varTheta }P_\theta ^{(n)}\left( \frac{1}{a_n}|I_k\left( \hat{\theta }_n-\theta \right) |>\frac{\rho }{a_n}\right) \\\le & {} B_0\exp \left( -b_0g_n(\rho /a_n)\right) \rightarrow 0, \end{aligned}$$

in view of property (i) of \(\mathbf G \). Hence the LS estimator \(\hat{\theta }_n\) is a weakly uniformly consistent estimator of \(\theta \).

3 Some examples and simulations

In this section, some examples and simulations for the LS estimator of nonlinear regression models are presented.

Example 3.1

In the nonlinear model (1), let

$$\begin{aligned} f_t(\theta )=\frac{1}{\theta ^{-1}+t^{1/4}},~~t=1,2,\ldots ,n, \end{aligned}$$

where \(\theta \in \varTheta =\{\theta :0<\delta _1\le \theta \le \delta _2<\infty \}\). Obviously, there exist some positive constants \(D_1\) and \(D_2\) such that, for all \(\theta ,\theta ^{\prime }\in \varTheta \) and n large enough,

$$\begin{aligned} D_1\left( \theta -\theta ^{\prime }\right) ^2\log n\le \sum \limits _{t=1}^n \left( f_t(\theta )-f_t(\theta ^{\prime })\right) ^2\le D_2 \left( \theta -\theta ^{\prime }\right) ^2\log n, \end{aligned}$$

where \(D_1<\frac{1}{\delta _2^4}\) and \(D_1\) can be chosen arbitrarily close to \( \frac{1}{\delta _2^4}\). Let the conditions of Theorem 2.2 hold. Then there exist some constants \(B_0\) and b such that, for all n and H large enough,

$$\begin{aligned} \sup \limits _{\theta \in \varTheta }P_{\theta }^{(n)}\Big \{(\log {n})^{1/2}|\hat{\theta }_n-\theta |\ge H\Big \}\le B_0\exp (-bH^2), \end{aligned}$$
(28)

where b can be chosen arbitrarily close (from below) to \( \frac{1}{64r\delta _2^4}\).
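
The \(\log n\) rate and the constant \(\frac{1}{\delta _2^4}\) above can be seen from the elementary identity

$$\begin{aligned} f_t(\theta )-f_t(\theta ^{\prime })=\frac{1/\theta ^{\prime }-1/\theta }{\left( \theta ^{-1}+t^{1/4}\right) \left( (\theta ^{\prime })^{-1}+t^{1/4}\right) }=\frac{\theta -\theta ^{\prime }}{\left( 1+\theta t^{1/4}\right) \left( 1+\theta ^{\prime }t^{1/4}\right) }, \end{aligned}$$

so that \(\left( f_t(\theta )-f_t(\theta ^{\prime })\right) ^2\sim \frac{(\theta -\theta ^{\prime })^2}{\theta ^2\theta ^{\prime 2}t}\) as \(t\rightarrow \infty \), \(\sum \nolimits _{t=1}^n t^{-1}\sim \log n\), and \(\frac{1}{\theta ^2\theta ^{\prime 2}}\ge \frac{1}{\delta _2^4}\) for all \(\theta ,\theta ^{\prime }\in \varTheta \).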

If the errors \(\{\varepsilon _t\}\) are i.i.d. random variables and \(\varepsilon _1\sim N(0,\sigma ^2)\), then by Theorem 5 of Wu (1981), it holds that

$$\begin{aligned} (\log {n})^{1/2}(\hat{\theta }_n-\theta )\xrightarrow {\mathscr {L}}N(0,\sigma ^2 \theta ^4), \end{aligned}$$
(29)

which yields

$$\begin{aligned} \lim \limits _{\begin{array}{c} H\rightarrow \infty \\ n\rightarrow \infty \end{array}}\Big (-H^{-2}\log P_{\theta }^{(n)}\Big \{(\log {n})^{1/2}|\hat{\theta }_n-\theta |\ge H\Big \}\Big )=\frac{1}{2\sigma ^2\theta ^4}, \end{aligned}$$
(30)

(see Example 1 of Sieders and Dzhaparidze 1987).

Moreover, by (28), we can get that

$$\begin{aligned} \liminf \limits _{\begin{array}{c} H\rightarrow \infty \\ n\rightarrow \infty \end{array}}\Big (-H^{-2}\log P_{\theta }^{(n)}\Big \{(\log {n})^{1/2}|\hat{\theta }_n-\theta |\ge H\Big \}\Big )\ge \frac{1}{64r\theta ^4}. \end{aligned}$$
(31)

Comparing (31) with (30), we see that the large deviation result (28) under END errors has a bound of the same order as the optimal bound in the independent case.

Example 3.2

Consider the linear model

$$\begin{aligned} X_t=\theta +\varepsilon _t,~~t=1,2,\ldots ,n,~\theta \in \varTheta \subset \mathscr {R}, \end{aligned}$$

where the errors \(\left\{ \varepsilon _t\right\} \) are mean zero END random variables satisfying (N.1)\(^{\prime }\). Applying Theorem 2.2 with \(D_1=D_2=1\) and \(\phi _n(\theta )=n^{-1/2}\), we have that for all n and H large enough,

$$\begin{aligned} \sup \limits _{\theta \in \varTheta }P_{\theta }^{(n)}\Big \{n^{1/2}|\hat{\theta }_n-\theta |\ge H\Big \}\le B_0\exp (-bH^2), \end{aligned}$$
(32)

where \(B_0\) and b are positive constants and b can be chosen arbitrarily close (from below) to \( \frac{1}{64r}\). For all \(\theta \in \varTheta \), we take \(H=\sqrt{\frac{2}{b}}\log ^{\frac{1}{2}}n\) in (32) and obtain

$$\begin{aligned} \sum _{n=1}^\infty P_{\theta }^{(n)}\Big \{n^{1/2}|\hat{\theta }_n-\theta |\ge \sqrt{\frac{2}{b}}\log ^{\frac{1}{2}}n\Big \}\le B_0\sum _{n=1}^\infty \exp (-2\log n)<\infty , \end{aligned}$$

i.e.,

$$\begin{aligned} \hat{\theta }_n-\theta =O(n^{-\frac{1}{2}}\log ^{\frac{1}{2}}n),~~\text {completely},~\text {as}~n\rightarrow \infty .\end{aligned}$$

Example 3.3

Consider the power curve model

$$\begin{aligned} X_t=(t+\theta )^d+\varepsilon _t,~~t=1,2,\ldots ,n, \end{aligned}$$
(33)

where \(d>1/2\) and \(\theta \in \varTheta =\{\theta :0<\delta _1\le \theta \le \delta _2<\infty \}\). Let \(f_t(\theta )=(t+\theta )^d\). Then there exist some positive constants \(D_1\) and \(D_2\) such that, for all \(\theta ,\theta ^{\prime }\in \varTheta \),

$$\begin{aligned} D_1 n^{2d-1}\left( \theta -\theta ^{\prime }\right) ^2\le \sum \limits _{t=1}^n \left( f_t(\theta )-f_t(\theta ^{\prime })\right) ^2\le D_2 n^{2d-1}\left( \theta -\theta ^{\prime }\right) ^2. \end{aligned}$$
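
Indeed, by the mean value theorem, for each t there is \(\xi _t\) between \(\theta ^{\prime }\) and \(\theta \) (so \(\delta _1\le \xi _t\le \delta _2\)) such that

$$\begin{aligned} \sum \limits _{t=1}^n \left( f_t(\theta )-f_t(\theta ^{\prime })\right) ^2=d^2\left( \theta -\theta ^{\prime }\right) ^2\sum \limits _{t=1}^n\left( t+\xi _t\right) ^{2d-2}, \end{aligned}$$

and, since \(2d-2>-1\), the last sum is bounded above and below by constant multiples of \(\sum \nolimits _{t=1}^n t^{2d-2}\asymp \frac{n^{2d-1}}{2d-1}\).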

Let the errors \(\{\varepsilon _t\}\) be mean zero END random variables satisfying (N.1)\(^{\prime }\). Applying Theorem 2.2 with \(\phi _n(\theta )=n^{1/2-d}\), we establish that for all n and H large enough,

$$\begin{aligned} \sup \limits _{\theta \in \varTheta }P_{\theta }^{(n)}\Big \{n^{d-1/2}|\hat{\theta }_n-\theta |\ge H\Big \}\le B_0\exp (-bH^2), \end{aligned}$$
(34)

where \(B_0\) and b are positive constants and b can be chosen arbitrarily close (from below) to \( \frac{D_1}{64r}\). For all \(\theta \in \varTheta \), taking \(H=\sqrt{\frac{2}{b}}\log ^{\frac{1}{2}}n\) in (34), we obtain that

$$\begin{aligned} \sum _{n=1}^\infty P_{\theta }^{(n)}\Big \{n^{d-\frac{1}{2}}|\hat{\theta }_n-\theta |\ge \sqrt{\frac{2}{b}}\log ^{\frac{1}{2}}n\Big \}\le B_0\sum _{n=1}^\infty \exp (-2\log n)<\infty , \end{aligned}$$

i.e.,

$$\begin{aligned} \hat{\theta }_n-\theta =O(n^{-(d-\frac{1}{2})}\log ^{\frac{1}{2}}n),~~\text {completely},~\text {as}~n\rightarrow \infty , \end{aligned}$$

where \(d>1/2\). Under independent errors, Wu (1981) investigated the power curve model and obtained the strong consistency of the LS estimator \(\hat{\theta }_n\) of \(\theta \). We extend the result of Wu (1981) to the END case and establish the complete convergence rate for the LS estimator \(\hat{\theta }_n\) of \(\theta \).

Fig. 1 Box plots of the LS estimator for the power curve model with \(d=2\), \(\theta =1\) and \(n=10,50,100,200\), based on 10,000 replications

Simulation 3.1 For simplicity, we carry out the simulation for the power curve model (33) with \(d=2\), i.e.,

$$\begin{aligned} X_t=(t+\theta )^2+\varepsilon _t,~~t=1,2,\ldots ,n, \end{aligned}$$

where \(\theta \in \varTheta =\left\{ \theta :0<\delta _1\le \theta \le \delta _2<\infty \right\} \). Let \((\varepsilon _1,\varepsilon _2,\ldots ,\varepsilon _n)\) be a normal random vector such that \((\varepsilon _1,\varepsilon _2,\ldots ,\varepsilon _n)\sim N_n(\mathbf 0 ,\Sigma )\), where \(\mathbf 0 \) is the zero vector and \(\Sigma \) is

$$\begin{aligned} \Sigma = \begin{bmatrix} 1+\rho&-\rho&-\rho ^2&0&\cdots&0&0&0&0\\ -\rho&1+\rho&-\rho&-\rho ^2&\cdots&0&0&0&0 \\ -\rho ^2&-\rho&1+\rho&-\rho&\cdots&0&0&0&0 \\ 0&-\rho ^2&-\rho&1+\rho&\cdots&0&0&0&0 \\ \vdots&\vdots&\vdots&\vdots&\ddots&\vdots&\vdots&\vdots&\vdots \\ 0&0&0&0&\cdots&1+\rho&-\rho&-\rho ^2&0 \\ 0&0&0&0&\cdots&-\rho&1+\rho&-\rho&-\rho ^2 \\ 0&0&0&0&\cdots&-\rho ^2&-\rho&1+\rho&-\rho \\ 0&0&0&0&\cdots&0&-\rho ^2&-\rho&1+\rho \\ \end{bmatrix}_{n\times n}, \end{aligned}$$

for \(0<\rho <1\). By Joag-Dev and Proschan (1983), it can be seen that \((\varepsilon _1,\varepsilon _2,\ldots ,\varepsilon _n)\) is an NA vector, and hence also an END vector. By (14), the LS estimator is \({\hat{\theta }}_{n}=\mathop {{{\mathrm{arg\,min}}}}\limits _{\theta \in {\varTheta }}\sum \nolimits _{t=1}^n(X_t-(t+\theta )^2)^{2}\). Setting \(\frac{\mathrm{d}(\sum \nolimits _{t=1}^n(X_t-(t+\theta )^2)^2)}{\mathrm{d}\theta }=0\) gives

$$\begin{aligned} n\theta ^3+\left( 3\sum \limits _{t=1}^n t\right) \theta ^2+\left( 3\sum \limits _{t=1}^nt^2-\sum \limits _{t=1}^n X_t\right) \theta +\sum \limits _{t=1}^nt^3-\sum \limits _{t=1}^ntX_t=0. \end{aligned}$$
(35)

Equation (35) is a cubic equation in \(\theta \). By selecting the appropriate root of this cubic equation, one can obtain \(\hat{\theta }_n\). For \(\theta =1\), \(\rho =0.1,0.2,0.3,0.4,0.5\) and sample sizes \(n=10,50,100,200\), we use MATLAB to obtain the roots of the cubic equation (35), repeating the experiment 10,000 times, and find that in each experiment there are one real root and two complex roots. We therefore take the real root as the LS estimator \(\hat{\theta }_n\) and plot the box plots in Fig. 1.
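
For readers who prefer an open-source environment, the following Python sketch reproduces this experiment on a smaller scale (the original computations were done in MATLAB); the helper name simulate_theta_hat, the seed, the value \(\rho =0.3\) and the number of replications are illustrative assumptions.

```python
import numpy as np

def simulate_theta_hat(n, theta, rho, rng):
    """One replication: draw NA normal errors with the banded covariance Sigma,
    form X_t = (t + theta)^2 + eps_t, and solve the cubic equation (35)."""
    Sigma = ((1 + rho) * np.eye(n)
             - rho * (np.eye(n, k=1) + np.eye(n, k=-1))
             - rho ** 2 * (np.eye(n, k=2) + np.eye(n, k=-2)))
    eps = rng.multivariate_normal(np.zeros(n), Sigma)
    t = np.arange(1, n + 1)
    X = (t + theta) ** 2 + eps
    # coefficients of (35): n, 3*sum(t), 3*sum(t^2) - sum(X), sum(t^3) - sum(t*X)
    roots = np.roots([n, 3 * t.sum(), 3 * (t ** 2).sum() - X.sum(),
                      (t ** 3).sum() - (t * X).sum()])
    real_roots = roots.real[np.isclose(roots.imag, 0.0, atol=1e-6)]
    # one real root is expected; if several, take the one minimizing Q_n
    return min(real_roots, key=lambda th: np.sum((X - (t + th) ** 2) ** 2))

# rho = 0.3 keeps Sigma strictly diagonally dominant, hence positive definite
rng = np.random.default_rng(2024)
estimates = [simulate_theta_hat(n=100, theta=1.0, rho=0.3, rng=rng) for _ in range(1000)]
print(np.median(estimates))   # should be close to the true theta = 1
```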

Similarly, for \(\theta =2\), \(\rho =0.1,0.2,0.3,0.4,0.5\) and sample sizes \(n=10,50,100,200\), we carry out the simulation by repeating the experiment 10,000 times and plot the box plots of the LS estimator \(\hat{\theta }_n\) in Fig. 2.

Fig. 2 Box plots of the LS estimator for the power curve model with \(d=2\), \(\theta =2\) and \(n=10,50,100,200\), based on 10,000 replications

In Fig. 1a–d, with the same \(\theta =1\) but different \(\rho =0.1,\ldots ,0.5\), the median of the LS estimator \(\hat{\theta }_n\) is close to 1, and the variation range shrinks as the sample size n increases through 10, 50, 100 and 200. Likewise, in Fig. 2e–h, with the same \(\theta =2\) but different \(\rho =0.1,\ldots ,0.5\), the median of \(\hat{\theta }_n\) is close to 2 and the variation range shrinks as n increases.

We also give normal Q–Q plots with \(\theta =1,2\), \(\rho =0.1,0.2\) and \(n=100\), based on 10,000 replications, to examine the normality of the LS estimator \(\hat{\theta }_n\); see Fig. 3. Figure 3 suggests that the LS estimator \(\hat{\theta }_n\) is asymptotically normal in this multivariate normal experiment.

Fig. 3 Normal Q–Q plots of the LS estimator with \(\theta =1,2\), \(\rho =0.1,0.2\) and \(n=100\), based on 10,000 replications

4 Conclusion

In this paper, we investigate the LS estimator \(\hat{\theta }_n\) of \(\theta \) for the nonlinear model based on END errors which are not necessarily identically distributed. Under general conditions, we establish some large deviation results, namely Theorems 2.1–2.4, for the LS estimator \(\hat{\theta }_n\). As applications, under some simple conditions, a weak uniform consistency result for \(\hat{\theta }_n\) is established (see Remark 2.1), and the complete convergence rate \(\hat{\theta }_n-\theta =O(n^{-1/2}\log ^{1/2} n)\), completely, is presented in Corollary 2.1. Some examples of nonlinear regression models and simulations are given as illustrations in Sect. 3. We extend the results of Sieders and Dzhaparidze (1987), Prakasa Rao (1984a) and Hu (1993) for independent, Gaussian, locally generalized Gaussian and martingale difference errors to the case of END random variables. Since the class of END random variables contains NOD, NSD and NA random variables, the results obtained in this paper also hold for these dependence structures.

5 Proofs

Before proving our results, we give some technical preliminaries as follows.

Lemma 5.1

(cf. Liu 2010, Lemma 3.1) Let \(\{Y_n,n\ge 1\}\) be a sequence of END random variables. Then the following hold:

(1) if the functions \(\{f_n,n\ge 1\}\) are all nondecreasing (or all nonincreasing), then \(\{f_n(Y_n),n\ge 1\}\) is also a sequence of END random variables;

(2) for each \(n\ge 1\), there exists a positive constant M such that

$$\begin{aligned} E\Bigg (\prod \limits _{i=1}^n Y_i^{+}\Bigg )\le M\prod \limits _{i=1}^n EY_i^{+}. \end{aligned}$$

Lemma 5.2

Let \(\{Y_n, n\ge 1\}\) be a sequence of END random variables and \(\{r_n,n\ge 1\}\) be a sequence of positive numbers. For fixed \(n\ge 1\), suppose that there exists a positive number \(\varLambda _1\) such that

$$\begin{aligned} E\exp \left( \lambda Y_i\right) \le \exp \left( \frac{1}{2}r_i\lambda ^2\right) ,~~0\le |\lambda |\le \varLambda _1,~~i=1,2,\ldots , n. \end{aligned}$$
(36)

Denote \(S_n=\sum \nolimits _{i=1}^n Y_i\) and \(G_n=\sum _{i=1}^nr_i\), \(n\ge 1\). Then there exists a positive constant M such that

$$\begin{aligned} P(S_n\ge x)\le \left\{ \begin{array}{ccc} M\exp \left( -\frac{x^2}{2G_n}\right) ,~ 0\le x\le G_n\varLambda _1,\\ M\exp \left( -\frac{\varLambda _1 x}{2}\right) ,~~x\ge G_n\varLambda _1,~~~~ \end{array}\right. \end{aligned}$$
(37)

and

$$\begin{aligned} P(S_n\le -x)\le \left\{ \begin{array}{ccc} M\exp \left( -\frac{x^2}{2G_n}\right) ,~ 0\le x\le G_n\varLambda _1,\\ M\exp \left( -\frac{\varLambda _1 x}{2}\right) ,~~x\ge G_n\varLambda _1.~~~~ \end{array}\right. \end{aligned}$$
(38)

Consequently,

$$\begin{aligned} P(|S_n|\ge x)\le \left\{ \begin{array}{ccc} 2M\exp \left( -\frac{x^2}{2G_n}\right) ,~ 0\le x\le G_n\varLambda _1,\\ 2M\exp \left( -\frac{\varLambda _1 x}{2}\right) ,~~x\ge G_n\varLambda _1.~~~~ \end{array}\right. \end{aligned}$$
(39)

Proof

For all x, by Markov’s inequality, Lemma 5.1 and (36), we obtain that

$$\begin{aligned} P\left( S_n\ge x\right)\le & {} \exp (-\lambda x)E\exp (\lambda S_n)= \exp (-\lambda x)E\left( \prod \nolimits _{i=1}^n\exp (\lambda Y_i)\right) \\\le & {} M\exp (-\lambda x)\prod \nolimits _{i=1}^nE\exp (\lambda Y_i)\\\le & {} M\exp \left( \frac{G_n\lambda ^2}{2}-\lambda x\right) , \quad \text {for}~ 0<\lambda \le \varLambda _1. \end{aligned}$$

Hence,

$$\begin{aligned} P(S_n\ge x)\le M\inf _{0<\lambda \le \varLambda _1}\exp \left( \frac{G_n\lambda ^2}{2}-\lambda x\right) =M\exp \left( \inf \limits _{0<\lambda \le \varLambda _1}\left( \frac{G_n\lambda ^2}{2}-\lambda x\right) \right) . \end{aligned}$$
(40)

For the fixed \(x\ge 0\), if \(\varLambda _1\ge \frac{x}{G_n}\ge 0\), then

$$\begin{aligned} \exp \left( \inf \limits _{0<\lambda \le \varLambda _1}\left( \frac{G_n\lambda ^2}{2}-\lambda x\right) \right) =\exp \left( -\frac{x^2}{2G_n}\right) . \end{aligned}$$
(41)

Meanwhile, for fixed \(x\ge 0\), if \(\varLambda _1\le \frac{x}{G_n}\), then

$$\begin{aligned} \exp \left( \inf \limits _{0<\lambda \le \varLambda _1}\left( \frac{G_n\lambda ^2}{2}-\lambda x\right) \right) =\exp \left( \frac{G_n\varLambda _1^2}{2}-\varLambda _1 x\right) \le \exp \left( -\frac{\varLambda _1 x}{2}\right) . \end{aligned}$$
(42)

Consequently, (37) follows from (40)–(42) immediately.

According to Lemma 5.1 (1), \(\{-Y_n\}\) are also END random variables. Therefore, (37) yields

$$\begin{aligned} P(S_n\le -x)=P(-S_n\ge x)\le \left\{ \begin{array}{ccc} M\exp \left( -\frac{x^2}{2G_n}\right) ,~ 0\le x\le G_n\varLambda _1,\\ M\exp \left( -\frac{\varLambda _1 x}{2}\right) ,~~x\ge G_n\varLambda _1,~~~~ \end{array}\right. \end{aligned}$$

which implies (38). Combining (37) with (38), we obtain (39) finally. \(\square \)

Remark 5.1

Lemma 5.2 is an extension of exponential inequalities for the independent case (see Theorem 2.6 of Petrov 1995) and NOD case (see Theorem 2.1 of Wang et al. 2010) to the END structure case.

Corollary 5.1

Let \(\{Y_n, n\ge 1\}\) be a sequence of END random variables, \(\{d_n, n\ge 1\}\) be a sequence of real numbers and \(\{r_n, n\ge 1\}\) be a sequence of positive numbers. Suppose there exists a positive constant \(\varLambda _1\) (\(\varLambda _1\) possibly \(\infty \)) such that for all \(|\lambda |\le \varLambda _1\), (36) holds true. Denote \(\tilde{S}_n=\sum \nolimits _{i=1}^n d_i Y_i\), \(\tilde{G}_n=\sum \nolimits _{i=1}^n r_id_i^2\) and \(\varLambda =\varLambda _1/\max \nolimits _{1\le i\le n}|d_i|\). Then for all \(x\ge 0\), there exists a positive constant M such that

$$\begin{aligned} P(\tilde{S}_n\ge x)\le & {} 2M\exp \left\{ -\min \left( \frac{x^2}{8\tilde{G}_n},\frac{\varLambda x}{4}\right) \right\} , \end{aligned}$$
(43)
$$\begin{aligned} P(\tilde{S}_n\le -x)\le & {} 2M\exp \left\{ -\min \left( \frac{x^2}{8\tilde{G}_n},\frac{\varLambda x}{4}\right) \right\} , \end{aligned}$$
(44)
$$\begin{aligned} P(|\tilde{S}_n|\ge x)\le & {} 4M\exp \left\{ -\min \left( \frac{x^2}{8\tilde{G}_n},\frac{\varLambda x}{4}\right) \right\} . \end{aligned}$$
(45)

Proof

Obviously, for \(|\lambda |\le \varLambda =\varLambda _1/\max \nolimits _{1\le i\le n}|d_i|\), we have \(|\lambda d_i^+|\le \varLambda |d_i|\le \varLambda _1\). So by (36), it can be argued that

$$\begin{aligned} E\exp \left( \lambda d_i^+Y_i\right) \le \exp \left( \frac{1}{2}r_i\left( d_i^+\right) ^2\lambda ^2\right) \le \exp \left( \frac{1}{2}r_id_i^2\lambda ^2\right) ,~\text {for}~|\lambda |\le \varLambda . \end{aligned}$$

According to Lemma 5.1 (1), \(d_1^{+}Y_1,\ldots ,d_n^{+}Y_n\) are still END random variables. Denote \(\tilde{S}_n(1)=\sum \nolimits _{i=1}^n d_i^+Y_i\) and \(\tilde{G}_n=\sum \nolimits _{i=1}^n r_id_i^2\). Then, we apply Lemma 5.2 and establish that

$$\begin{aligned} P(\tilde{S}_n(1)\ge x)\le M\exp \left\{ -\min \left( \frac{x^2}{2\tilde{G}_n},\frac{\varLambda x}{2}\right) \right\} . \end{aligned}$$
(46)

Meanwhile, \(d_1^{-}Y_1,\ldots ,d_n^{-}Y_n\) are still END random variables. Denote \(\tilde{S}_n(2)=\sum \nolimits _{i=1}^n d_i^-Y_i\). Similar to the proof of (46), now using the lower-tail bound (38) of Lemma 5.2, one has

$$\begin{aligned} P(\tilde{S}_n(2)\le -x)\le M\exp \left\{ -\min \left( \frac{x^2}{2\tilde{G}_n},\frac{\varLambda x}{2}\right) \right\} . \end{aligned}$$
(47)

Since \(\tilde{S}_n=\tilde{S}_n(1)-\tilde{S}_n(2)\), we have

$$\begin{aligned} P(\tilde{S}_n\ge x) \le P\left( \tilde{S}_n(1)\ge x/2\right) +P\left( \tilde{S}_n(2)\le -x/2\right) , \end{aligned}$$
(48)

and thus, by (46)–(48), we obtain the result (43). Combining the arguments for (38), (39) and (43), we have the results (44) and (45) immediately. \(\square \)

Corollary 5.2

Let \(\left\{ Y_n, n\ge 1\right\} \) be a sequence of END random variables satisfying \(EY_i=0\) and \(EY_i^2=\sigma _i^2<\infty \), \(i=1,2,\ldots \), and let \(\{d_n, n\ge 1\}\) be a sequence of real numbers. Denote \(\tilde{S}_n=\sum \nolimits _{i=1}^n d_i Y_i\) and \(\tilde{B}_n^2=\sum \nolimits _{i=1}^n \sigma _i^2d_i^2\). For fixed \(n\ge 1\), suppose that there exists a positive number L such that

$$\begin{aligned} |EY_i^m|\le \frac{m!}{2}\sigma _i^2L^{m-2},~~i=1,2,\ldots , n \end{aligned}$$
(49)

for all positive integers \(m\ge 2\). Then there exists a positive constant M such that for all \(x\ge 0\),

$$\begin{aligned} P(\tilde{S}_n\ge x)\le & {} 2M\exp \left\{ -\min \left( \frac{x^2}{16\tilde{B}_n^2},\frac{x}{8L\max \limits _{1\le i\le n}|d_i|}\right) \right\} , \end{aligned}$$
(50)
$$\begin{aligned} P(\tilde{S}_n\le -x)\le & {} 2M\exp \left\{ -\min \left( \frac{x^2}{16\tilde{B}_n^2},\frac{x}{8L\max \limits _{1\le i\le n}|d_i|}\right) \right\} , \end{aligned}$$
(51)
$$\begin{aligned} P(|\tilde{S}_n|\ge x)\le & {} 4M\exp \left\{ -\min \left( \frac{x^2}{16\tilde{B}_n^2},\frac{x}{8L\max \limits _{1\le i\le n}|d_i|}\right) \right\} . \end{aligned}$$
(52)

Proof

It can be argued by \(EY_i=0\), \(EY_i^2=\sigma _i^2\) and (49) that

$$\begin{aligned} E\exp (\lambda Y_i)= & {} 1+\frac{\lambda ^2}{2}\sigma _i^2+\frac{\lambda ^3}{6}EY_i^3+\cdots +\frac{\lambda ^k}{k!}EY_i^k+\cdots \\\le & {} 1+\frac{\lambda ^2}{2}\sigma _i^2\left( 1+L|\lambda |+L^2\lambda ^2+\cdots +L^{k-2}|\lambda |^{k-2}+\cdots \right) ,~i=1,2,\ldots ,n. \end{aligned}$$

If \(|\lambda |\le \frac{1}{2L}\), then

$$\begin{aligned} E\exp (\lambda Y_i)\le 1+\frac{\lambda ^2\sigma _i^2}{2}\frac{1}{1-L|\lambda |}\le 1+\lambda ^2\sigma _i^2\le \exp \left( \lambda ^2\sigma _i^2\right) := \exp \left( \frac{1}{2}r_i\lambda ^2\right) , \end{aligned}$$
(53)

where \(r_i=2\sigma _i^2\) and \(i=1,2,\ldots ,n\). Taking \(\varLambda _1=\frac{1}{2L}\) and \(\tilde{G}_n=\sum _{i=1}^nr_id_i^2=2\sum _{i=1}^n\sigma _i^2d_i^2=2\tilde{B}_n^2\) in Corollary 5.1, we have the results (50)–(52) immediately. \(\square \)

Lemma 5.3

For some \(m\ge 2\), let \(\left\{ Y_n, n\ge 1\right\} \) be a sequence of END random variables with \(EY_n=0\) and \(E|Y_n|^m<\infty \), \(n=1,2,\ldots \). Assume that \(\left\{ a_{ni}, 1\le i\le n, n\ge 1\right\} \) is a triangular array of real numbers. Denote \(S_n=\sum \nolimits _{i=1}^n a_{ni}Y_i\). Then there exists a positive constant C not depending on n such that

$$\begin{aligned} E|S_n|^m\le C\max \limits _{1\le i\le n}E|Y_i|^m\Bigg (\sum \limits _{i=1}^na_{ni}^2\Bigg )^{m/2}. \end{aligned}$$
(54)

Proof

Denote \(S_n(1)=\sum \nolimits _{i=1}^n a_{ni}^{+}Y_i\) and \(S_n(2)=\sum \nolimits _{i=1}^n a_{ni}^{-}Y_i\). For \(m\ge 1\), by \(C_r\) inequality, one has

$$\begin{aligned} E|S_n|^m=E|S_n(1)-S_n(2)|^m\le 2^{m-1}(E|S_n(1)|^m+E|S_n(2)|^m). \end{aligned}$$
(55)

Obviously, by Lemma 5.1 (1), \(\{a_{ni}^{+}Y_i,1\le i\le n\}\) and \(\{a_{ni}^{-}Y_i,1\le i\le n\}\) are also END random variables. Then, for \(m\ge 2\), Corollary 3.2 of Shen (2011) yields that

$$\begin{aligned} E|S_n(1)|^m\le & {} C_1\Bigg (\sum \limits _{i=1}^n (a_{ni}^{+})^mE|Y_i|^m+\Bigg (\sum \limits _{i=1}^n (a_{ni}^{+})^2EY_i^2\Bigg )^{m/2}\Bigg )\nonumber \\\le & {} C_1\max \limits _{1\le i\le n}E|Y_i|^m\Bigg (\sum \limits _{i=1}^n |a_{ni}|^m+\Bigg (\sum \limits _{i=1}^n a_{ni}^2\Bigg )^{m/2}\Bigg )\nonumber \\\le & {} C_2\max \limits _{1\le i\le n}E|Y_i|^m\Bigg (\sum \limits _{i=1}^n a_{ni}^2\Bigg )^{m/2}. \end{aligned}$$
(56)

Similarly, one has

$$\begin{aligned} E|S_n(2)|^m\le C_3\max \limits _{1\le i\le n}E|Y_i|^m\Bigg (\sum \limits _{i=1}^n a_{ni}^2\Bigg )^{m/2}. \end{aligned}$$
(57)

Thus, (54) follows from (55) to (57) immediately. \(\square \)

Proof of Theorem 2.1

Let \(\theta \in K\) and \(u,v\in {\varGamma }_{n,\theta ,R}\). If \(|u-v|\ge k_1\), then we obtain by (18) that

$$\begin{aligned} \sum \limits _{t=1}^n\left( d_{tn\theta }(u)-d_{tn\theta }(v)\right) ^2\le 2\text {pol}(R)\le 2|u-v|^{2\rho }k_1^{-2\rho }\text {pol}(R). \end{aligned}$$

Combining this with (17), which covers the case \(|u-v|\le k_1\), we find that

$$\begin{aligned} \sum \limits _{t=1}^n\left( d_{tn\theta }(u)-d_{tn\theta }(v)\right) ^2\le |u-v|^{2\rho }\text {pol}(R),~~\text {for all}~u,v\in {\varGamma }_{n,\theta ,R}. \end{aligned}$$
(58)

Taking \(\zeta _{n,\theta }(u):=\log Z_{n,\theta }(u)\) (i.e., choosing the logarithm as the monotone function in (6)), we obtain from (15) that

$$\begin{aligned} \zeta _{n,\theta }(u)-\zeta _{n,\theta }(v)=\sum \limits _{t=1}^n(A_t\varepsilon _t-B_t), \end{aligned}$$
(59)

where

$$\begin{aligned} A_t=d_{tn\theta }(u)-d_{tn\theta }(v),~~~B_t=\frac{1}{2}(d^2_{tn\theta }(u)-d^2_{tn\theta }(v)). \end{aligned}$$
(60)

For all \(m\ge 1\), the \(C_r\) inequality yields

$$\begin{aligned} E_\theta ^{(n)}|\zeta _{n,\theta }(u)-\zeta _{n,\theta }(v)|^m\le 2^{m-1}\left( E_\theta ^{(n)}\Big |\sum \limits _{t=1}^nA_t\varepsilon _t\Big |^m+\Big |\sum \limits _{t=1}^nB_t\Big |^m\right) . \end{aligned}$$
(61)

Obviously, the condition (N.1) implies that \(\left\{ E|\varepsilon _t|^m,t\in \mathcal {N}\right\} \) is uniformly bounded. So, by (58) and Lemma 5.3 with (N.1) and \(E\varepsilon _t=0\), we obtain that for all \(m\ge 2\)

$$\begin{aligned} E_\theta ^{(n)}\Big |\sum \limits _{t=1}^nA_t\varepsilon _t\Big |^m\le C\left( \sum \limits _{t=1}^n A_t^2\right) ^{m/2}\le |u-v|^{\rho m}\text {pol}(R). \end{aligned}$$
(62)

Meanwhile, by the Cauchy–Schwarz inequality, (18) and (58), one has that

$$\begin{aligned} \Bigg |\sum \limits _{t=1}^n B_t\Bigg |\le & {} \frac{1}{2}\sum \limits _{t=1}^n|d_{tn\theta }(u)-d_{tn\theta }(v)|\cdot |d_{tn\theta }(u)+d_{tn\theta }(v)|\nonumber \\\le & {} \frac{1}{2}\Bigg \{\sum \limits _{t=1}^n A_t^2\cdot \sum \limits _{t=1}^n(d_{tn\theta }(u)+d_{tn\theta }(v))^2\Bigg \}^{1/2}\le |u-v|^{\rho }\text {pol}(R). \end{aligned}$$
(63)

So it follows from (61)–(63) that for all \(\theta \in K\) and \(u,v\in {\varGamma }_{n,\theta ,R}\)

$$\begin{aligned} E_\theta ^{(n)}|\zeta _{n,\theta }(u)-\zeta _{n,\theta }(v)|^m\le |u-v|^{\rho m} \text {pol}(R). \end{aligned}$$
(64)

Taking \(m>\max (2,k/\rho )\) in (64), we find that (7) is fulfilled with \(\alpha =\rho m\).

Next, we verify that (10) holds true. For \(0<\delta <\frac{1}{2}\), let

$$\begin{aligned} \eta _{n,\theta }(u)=\left( \frac{1}{2}-\delta \right) \sum \limits _{t=1}^n d^2_{tn\theta }(u). \end{aligned}$$
(65)

By (19), it can be argued that

$$\begin{aligned} \sum \limits _{t=1}^n d^2_{tn\theta }(u)\ge 32rg_n(R), \end{aligned}$$
(66)

which shows that \(\eta _{n,\theta }(u)\in \mathbf H _K\): since \(g^{-1}_n(R)\le 1\) for all n and R large enough, (5) holds with a constant polynomial. Denote \(d_{tn\theta }(u)=d_t\) and \(\max \limits _{1\le t\le n}|d_{tn\theta }(u)|=\max |d_{t}|\). Then, by (19), (59), (60), (65), (66) and Corollary 5.1, we get

$$\begin{aligned} P_\theta ^{(n)}\Big (\zeta _{n,\theta }(u)-\zeta _{n,\theta }(0)\ge -\eta _{n,\theta }(u)\Big )= & {} P_\theta ^{(n)}\Bigg (\sum \limits _{t=1}^n d_t\varepsilon _t\ge \delta \sum \limits _{t=1}^n d_t^2\Bigg )\\\le & {} 2M\exp \left\{ -\sum \limits _{t=1}^n d_t^2\min \Big (\frac{\delta ^2}{8r},\frac{\delta \varLambda _1}{4\max |d_t|}\Big )\right\} \\= & {} 2M\exp \left\{ -\sum \limits _{t=1}^n d_t^2/\max \Big (\frac{8r}{\delta ^2},\frac{4\max |d_t|}{\delta \varLambda _1}\Big )\right\} \\\le & {} 2M\exp (-g_n(R)), \end{aligned}$$

which implies that (10) is fulfilled. Combining Theorem 1.1 with Remark 1.2, we obtain (20). Meanwhile, by choosing \(\alpha =\rho m\) in Theorem 1.1, for all \(\beta >0\) and m large enough, there exists a positive \(B_0\) such that (20) holds, where \(b_0\ge \frac{\rho }{\rho +k}-\beta \).   \(\square \)

Proof of Theorem 2.2

By (21), (17) is fulfilled with \(\rho =1\) and pol\((R)=D_2\). Obviously, for all \(u\in {\varGamma }_{n,\theta ,R}\), \(\theta \in \varTheta \), one has

$$\begin{aligned} \sum \limits _{t=1}^n d_{tn\theta }^2\le D_2|u|^2\le D_2(R+1)^2, \end{aligned}$$
(67)

which implies that (18) is satisfied. Meanwhile, for all \(0<\delta <1/2\) and \(u\in {\varGamma }_{n,\theta ,R}\), it can be argued that

$$\begin{aligned} \sum \limits _{t=1}^n d_{tn\theta }^2\ge D_1|u|^2\ge D_1R^2. \end{aligned}$$
(68)

Then, by (68) and \(\varLambda _1=\infty \), (19) is fulfilled with \(g_n(R)=\frac{D_1\delta ^2}{8r}R^2\). Letting \(\delta \rightarrow \frac{1}{2}\), we apply Theorem 2.1 and obtain the result (22) finally. \(\square \)

Proof of Theorem 2.3

Combining Corollary 5.2 with the proof of Theorem 2.1, where r is replaced by \(2\sigma ^2\) and \(\varLambda _1\) is replaced by \(\frac{1}{2L}\), we have (20) immediately. \(\square \)

Proof of Theorem 2.4

Combining (21) with (67), for all \(u\in {\varGamma }_{n,\theta ,R}\), \(\theta \in \varTheta \), it can be checked that

$$\begin{aligned} d_{tn\theta }^2\le \sum \limits _{t=1}^n d_{tn\theta }^2\le D_2|u|^2\le D_2(R+1)^2,~~1\le t\le n, \end{aligned}$$

which implies

$$\begin{aligned} \max \limits _{1\le t\le n}|d_{tn\theta }|\le \sqrt{D_2}(R+1). \end{aligned}$$
(69)

Let \(g_n(R)=C_1R\), where \(C_1\) is a positive constant to be specified below. Next, we prove that (23) in (N.3)\(^\prime \) is fulfilled for all R large enough. By (21), (68) and (69), for all \(u\in {\varGamma }_{n,\theta ,R}\), \(\theta \in \varTheta \), we can take a positive constant \(C_1\) such that, for all R large enough,

$$\begin{aligned} \sum \limits _{t=1}^n d_{tn\theta }^2\ge D_1R^2\ge \frac{16\sigma ^2}{\delta ^2}C_1R=\frac{16\sigma ^2}{\delta ^2}g_n(R) \end{aligned}$$
(70)

and

$$\begin{aligned} \sum \limits _{t=1}^n d_{tn\theta }^2\ge D_1R^2\ge \frac{8L}{\delta }\sqrt{D_2}(R+1)C_1R\ge \frac{8L}{\delta }\max \limits _{1\le t\le n}|d_{tn\theta }|g_n(R). \end{aligned}$$
(71)

Thus, by (70) and (71), (23) is fulfilled. Consequently, by the proofs of Theorems 2.1 and 2.3, we apply Theorem 2.3 and establish that for all n and H large enough,

$$\begin{aligned} \sup \limits _{\theta \in \varTheta }P_{\theta }^{(n)}\Big \{|\phi ^{-1}_n(\theta )(\hat{\theta }_n-\theta )|\ge H\Big \}\le B_0\exp (-C_0H), \end{aligned}$$
(72)

where \(C_0=b_0C_1\), and \(B_0\) and \(b_0\) are as in Theorem 2.3. \(\square \)

Proof of Corollary 2.1

For all \(\theta \in \varTheta \subset \mathscr {R}^k\) and all \(\rho >0\), taking \(\phi _n(\theta )=n^{-1/2}I_k\) and \(H=n^{1/2}\rho \) in (22), one establishes the result (26) immediately, where \(I_k\) is a \(k\times k\) unit matrix. Taking \(C_1\) large enough that \(bC_1^2>1\) and applying (26) with \(\rho =C_1n^{-\frac{1}{2}}\log ^{\frac{1}{2}}n\), we establish that

$$\begin{aligned} \sum _{n=1}^\infty P_{\theta }^{(n)}\Big \{|\hat{\theta }_n-\theta |\ge C_1n^{-\frac{1}{2}}\log ^{\frac{1}{2}}n\Big \}\le & {} B_0\sum _{n=1}^\infty \exp (-bC_1^2\log n)<\infty . \end{aligned}$$

This completes the proof of (27). \(\square \)