Abstract
In this work, we introduce the random weighting method to the nonlinear regression model and study the asymptotic properties of the randomly weighted least squares estimator with dependent errors. The results show that this new estimator is consistent. Moreover, some simulations are carried out to illustrate the performance of the proposed estimator.
1 Introduction
Consider the nonlinear regression model:

\(X_{n}=g_{n}(\theta )+\xi _{n},\quad n\ge 1,\qquad (1.1)\)
where \(X_{n}\) is observed, \(\{g_{n}(\theta )\}\) is a known sequence of continuous functions, possibly nonlinear in \(\theta \in \Theta \), a closed interval on the real line, and \(\{\xi _{n},n\ge 1\}\) is a sequence of random errors with zero mean. Nonlinear regression models have significant advantages over linear models. The main one is that they typically require essentially fewer unknown parameters. Moreover, the parameters of nonlinear models often have the meaning of physical variables, while linear parameters are usually devoid of physical significance. Therefore, it is of great interest to study the nonlinear regression model. In most studies devoted to regression analysis over the past decades, the central place is occupied by the least squares method of parameter estimation, which has a long history. Let
\(Q_{n}(\theta )=\sum _{i=1}^{n}\omega _{i}\left( X_{i}-g_{i}(\theta )\right) ^{2},\)

where \(\{\omega _{i}\}\) is a known sequence of positive numbers. An estimator \(\theta _{n}\) is said to be an ordinary least squares estimator (OLSE, for short) of \(\theta \) if it minimizes \(Q_{n}(\theta )\), that is, \(Q_{n}(\theta _{n})=\inf _{\theta \in \Theta }Q_{n}(\theta )\).
The study of asymptotic properties of the OLSE for parameters in nonlinear regression models has long been a main subject of investigation. The investigation is challenging because the OLSE of parameters entering a regression function nonlinearly cannot be found in explicit form, which complicates the description of its mathematical properties. Hence, in introducing nonlinear regression analysis into statistics, it is necessary to overcome a series of mathematical difficulties that have no analogues in the linear theory. For the OLSE of the nonlinear model based on i.i.d. random errors, Jennrich (1969) established the asymptotic normality, Malinvaud (1970) investigated the consistency, and Wu (1981) established a necessary and sufficient condition for the strong consistency, among others. In particular, Ivanov (1976) obtained the following large deviation result for the OLSE with \(\omega _{i}\equiv 1\) based on i.i.d. random errors.
Theorem 1.1
Let \(\{\xi _{n},n\ge 1\}\) be i.i.d. random variables with \(E|\xi _{1}|^{p}<\infty \) for some \(p\ge 2\). Suppose there exist some constants \(0<c_{1}\le c_{2}<\infty \) such that for all \(\theta _{1},\theta _{2}\in \Theta \) and \(n\ge 1\),

\(c_{1}n(\theta _{1}-\theta _{2})^{2}\le \sum _{i=1}^{n}\left( g_{i}(\theta _{1})-g_{i}(\theta _{2})\right) ^{2}\le c_{2}n(\theta _{1}-\theta _{2})^{2}.\)
Then for every \(\rho >0\) and all \(n\ge 1\), we have

\(P\left( n^{1/2}|\theta _{n}-\theta _{0}|\ge \rho \right) \le c\rho ^{-p},\)
where \(\theta _{0}\) is the true parameter such that \(\theta _{0}\in \) interior of \(\Theta \) and c is a positive constant independent of n and p.
Prakasa Rao (1984) extended Theorem 1.1 from the i.i.d. case to some dependent cases such as \(\varphi \)-mixing and \(\alpha \)-mixing assumptions. Hu (2002) extended Theorem 1.1 to martingale differences, \(\varphi \)-mixing and negative association (NA, for short) assumptions under \(\sup _{i\ge 1}E|\xi _{i}|^{p}<\infty \) for some \(p>2\) without assuming identical distributions. Hu (2004) further considered the large deviation result under the moment condition \(\sup _{i\ge 1}E|\xi _{i}|^{p}<\infty \) for some \(1<p\le 2\). Yang and Hu (2014) obtained some general results on large deviations, which remain valid even in some cases where \(\sup _{i\ge 1}E|\xi _{i}|^{p}=\infty \) for some \(p>1\); Yang et al. (2017) established some large deviation results under extended negatively dependent (END, for short) random errors; and so on. However, a new challenge emerges if the errors are heteroscedastic: estimating the variances of the errors is not an easy task.
It is well known that the bootstrap is an excellent method, which has been used extensively in many statistical models, including the nonlinear regression model; see Staniewski (1984) for example. As an alternative approach, the random weighting method, or Bayesian bootstrap method, has received increasing attention since it was originally suggested by Rubin (1981). The random weighting method is motivated by the bootstrap method and can be regarded as a kind of smoothing of the bootstrap. Instead of re-sampling from the original data set, the random weighting method generates a group of random weights directly on the computer and then uses them to weight the original samples. In comparison with the bootstrap method, the random weighting method has advantages such as simplicity of computation, suitability for large samples, and no need to know the distribution function. Therefore, this method has been adopted in various statistical models. For more details, we refer the readers to Zheng (1987), Gao et al. (2003), Xue and Zhu (2005), Fang and Zhao (2006), Barvinok and Samorodnitsky (2007), Gao and Zhong (2010), and so forth.
However, to the best of our knowledge, there is no literature considering the randomly weighted estimation in nonlinear regression models. In this paper, the random weighting method is adopted for the first time for the least squares estimation in nonlinear regression models. We are now in a position to present this method.
Definition 1.1
(cf. Ng et al. 2011) Let \((W_{1},\cdots ,W_{n})\) be a random vector with \(W_{i}\ge 0\) and \(\sum _{i=1}^{n}W_{i}=1\). Then the Dirichlet probability density function of \((W_{1},\cdots ,W_{n})\) is defined as

\(f(w_{1},\cdots ,w_{n-1})=\frac{\Gamma (\alpha _{0})}{\prod _{i=1}^{n}\Gamma (\alpha _{i})}\prod _{i=1}^{n}w_{i}^{\alpha _{i}-1},\)
where \(\alpha _{i}>0\), \(\alpha _{0}=\sum _{i=1}^{n}\alpha _{i}\), \(w_{i}\ge 0\), \(\sum _{i=1}^{n-1}w_{i}\le 1\) and \(w_{n}=1-\sum _{i=1}^{n-1}w_{i}\). This distribution is denoted by \(Dir(\alpha _{1},\cdots ,\alpha _{n})\).
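As a concrete illustration (ours, not part of the paper), a \(Dir(\alpha _{1},\cdots ,\alpha _{n})\) vector can be generated by normalizing independent Gamma draws, the standard construction also underlying Narayanan (1990); the following minimal Python sketch uses NumPy, and the sample size and seed are arbitrary choices.

```python
import numpy as np

def rdirichlet(alpha, size, rng):
    """Sample `size` Dirichlet(alpha) vectors by normalizing independent
    Gamma(alpha_i, 1) draws -- the standard construction."""
    g = rng.gamma(shape=np.asarray(alpha), scale=1.0, size=(size, len(alpha)))
    return g / g.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n = 8
W = rdirichlet([4.0] * n, size=10_000, rng=rng)
# Rows are nonnegative and sum to 1; for Dir(4,...,4), E[W_i] = 4/(4n) = 1/n.
```

NumPy also provides `rng.dirichlet` directly; the explicit gamma normalization is shown only to make the construction transparent.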
By virtue of the concept of the Dirichlet distribution, we can propose the randomly weighted least squares estimator of \(\theta \) as follows. Let

\(H_{n}(\theta )=\sum _{i=1}^{n}W_{i}\left( X_{i}-g_{i}(\theta )\right) ^{2},\)
where \(W_{i}\)’s are independent of \(\xi _{i}\)’s and the random vector \({\varvec{W}}=(W_{1},\cdots ,W_{n})\) obeys the Dirichlet distribution \(Dir(4,4,\ldots ,4)\), namely, \(\sum _{i=1}^{n}W_{i}=1\) and the joint density of \(W_{1},\cdots ,W_{n-1}\) is

\(f(w_{1},\cdots ,w_{n-1})=\frac{\Gamma (4n)}{(\Gamma (4))^{n}}\left( \prod _{i=1}^{n-1}w_{i}^{3}\right) \left( 1-\sum _{i=1}^{n-1}w_{i}\right) ^{3},\)
where \((w_{1},\cdots ,w_{n-1})\in D_{n-1}\) and \(D_{n-1}=\{(w_{1},\cdots ,w_{n-1}):w_{i}\ge 0,i=1,\ldots ,n-1,\sum _{i=1}^{n-1}w_{i}\le 1\}\). An estimator \(\hat{\theta }_{n}\) is said to be a randomly weighted least squares estimator (RWLSE, for short) of \(\theta \) if \(\hat{\theta }_{n}=\arg \inf _{\theta \in \Theta }H_{n}(\theta )\).
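To make the definition concrete, here is a minimal numerical sketch (our own illustration, not from the paper) for the special linear case \(g_{i}(\theta )=\theta x_{i}\), in which \(H_{n}(\theta )\) is quadratic in \(\theta \) and the RWLSE has a closed form; the design points and error distribution below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, theta0 = 200, 2.0
x = rng.uniform(0.5, 1.5, size=n)              # hypothetical design points
X = theta0 * x + rng.normal(0.0, 0.3, size=n)  # observations with zero-mean errors

# Dirichlet(4,...,4) weights via normalized independent Gamma(4,1) draws.
g = rng.gamma(4.0, size=n)
W = g / g.sum()

# For g_i(theta) = theta * x_i, minimizing H_n(theta) = sum_i W_i (X_i - theta x_i)^2
# gives the weighted least squares solution in closed form:
theta_hat = np.sum(W * x * X) / np.sum(W * x * x)
```

For a general nonlinear \(g_{i}\), the same \(H_{n}\) would instead be minimized numerically over \(\Theta \).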
Since the independence assumption is usually implausible in reality, we adopt a relatively broad dependence structure, namely the END assumption, in the sequel. The concept of END random variables was introduced by Liu (2009) as follows.
Definition 1.2
A finite collection of random variables \(X_1,X_2,\cdots ,X_n\) is said to be END if there exists a constant \(M > 0\) such that both

\(P(X_{1}>x_{1},X_{2}>x_{2},\cdots ,X_{n}>x_{n})\le M\prod _{i=1}^{n}P(X_{i}>x_{i})\)

and

\(P(X_{1}\le x_{1},X_{2}\le x_{2},\cdots ,X_{n}\le x_{n})\le M\prod _{i=1}^{n}P(X_{i}\le x_{i})\)

hold for all real numbers \(x_1, x_2, \cdots , x_n\). An infinite sequence \(\{X_n, n\ge 1\}\) is said to be END if every finite sub-collection is END.
Liu (2009) provided some examples satisfying the END structure, one of which shows that if \(X_1,X_2,\cdots ,X_n\) are dependent according to a multivariate copula function \(C(u_{1},\cdots ,u_{n})\) with absolutely continuous distribution functions \(F_{1},\cdots ,F_{n}\), and the joint copula density \(c(u_{1},\cdots ,u_{n})=\frac{\partial ^{n}C(u_{1},\cdots ,u_{n})}{\partial u_{1}\cdots \partial u_{n}}\) exists and is uniformly bounded in the whole domain, then \(\{X_{n},n\ge 1\}\) are END. If we take \(M=1\), then the END structure degenerates to the negatively orthant dependent (NOD, for short) structure, which was introduced by Lehmann (1966) (cf. also Joag-Dev and Proschan 1983). The END structure can reflect not only a negative dependence structure but also, to some extent, a positive one. Liu (2009) pointed out that END random variables can be negatively or positively dependent and provided some interesting examples to support this idea. Joag-Dev and Proschan (1983) also pointed out that negatively associated (NA, for short) random variables are NOD but the converse is not necessarily true; thus NA random variables are also END. Hence, the consideration of the END structure is reasonable and of great interest. Many applications have been found for END random variables. For example, Liu (2010) studied the sufficient and necessary conditions of moderate deviations for END random variables with heavy tails; Chen et al. (2010) established the strong law of large numbers for END random variables and gave applications to risk theory and renewal theory; Shen (2011) established some exponential probability inequalities for END random variables and presented some applications; Wang and Wang (2013) investigated the precise large deviations for random sums of END real-valued random variables with consistent variation; Wang et al. (2014) proved some results on complete convergence of END random variables; Lita da Silva (2015) established the almost sure convergence for sequences of END random variables; Wang et al. (2015) and Yang et al. (2018) studied the complete consistency of the estimator of nonparametric regression models based on END errors; Wu et al. (2019) investigated the complete \(f\)-moment convergence for END random variables; and so on.
For the proposed RWLSE, we establish two general large deviation results for the RWLSE of the parameter \(\theta \) with \(p>2\) and, respectively, \(1<p\le 2\) under END errors. As direct corollaries, the rates of complete consistency, strong consistency, and weak consistency are obtained, which show that the proposed RWLSE is a consistent estimator of \(\theta \). The numerical analysis reveals that the RWLSE performs as well as the OLSE in heteroscedastic nonlinear regression models, and sometimes better. As we pointed out earlier, it is not easy to estimate the variances of heteroscedastic errors, so this paper provides an alternative method to estimate the parameters in a heteroscedastic nonlinear regression model.
Throughout this paper, the symbol C represents a positive constant which can differ from place to place. \(C(p),C'(p),C_{1}(p),C_{2}(p),\cdots \) are positive constants depending only on p. Let I(A) be the indicator function of the event A and \(\lfloor x\rfloor \) denote the integer part of x. Denote \(x^{+}=xI(x\ge 0)\) and \(x^{-}=-xI(x<0)\). Let \(\log x=\ln \max (x,e)\), where \(\ln x\) represents the natural logarithm of x.
The rest of this paper is organized as follows: the main results are stated in Sect. 2; the numerical analysis is provided in Sect. 3; the proofs of the main results are presented in Sect. 4; and some lemmas used in proving the main results are given in the Appendix.
2 Main results
The main results on large deviations are presented as follows.
Theorem 2.1
Let \(p>2\). In model (1.1), assume that \(\{\xi _{n},n\ge 1\}\) is a sequence of END random errors with zero mean and \(E|\xi _{n}|^{p}<\infty \) for each \(n\ge 1\). If there exist positive numbers \(\lambda _n\le \Lambda _n\) for each \(n\ge 1\), such that
then there exists a positive constant C(p) depending only on p such that for all \(\rho >0\) and each \(n\ge 1\),
where \(\Delta _{np}=\sum _{i=1}^{n}E|\xi _{i}|^{p}\) and \(\nabla _{np}=\left( \sum _{i=1}^{n}E\xi _{i}^{2}\right) ^{p/2}\).
Theorem 2.2
Let \(1<p\le 2\). In model (1.1), assume that \(\{\xi _{n},n\ge 1\}\) is a sequence of END random errors with zero mean and \(E|\xi _{n}|^{p}<\infty \) for each \(n\ge 1\). If (2.1) holds, then there exists a positive constant \(C'(p)\) depending only on p such that for all \(\rho >0\) and each \(n\ge 1\),
Remark 2.1
It is easy to see that if \(\sup _{n\ge 1}E|\xi _{n}|^{p}<\infty \) for some \(p>2\), then Theorem 2.1 extends Theorem 1.1 from the i.i.d. assumption to END random errors with not necessarily identical distributions. Similarly, if \(\sup _{n\ge 1}E|\xi _{n}|^{p}<\infty \) for some \(1<p\le 2\), then Theorem 2.2 extends the corresponding result of Hu (2004).
Remark 2.2
Yang and Hu (2014) also established similar results for the OLSE of \(\theta \) with NOD errors. By taking \(\lambda _{n}=c_{1}\) and \(\Lambda _{n}=c_{2}\) for some \(0<c_1\le c_2\), we point out that the conditions in Theorem 2.1 and those in the corresponding result of Yang and Hu (2014) do not imply each other. For example, \(n^{-1}\sum _{i=1}^{n}E|\xi _{i}|^{p}>n^{-p/2}\sum _{i=1}^{n}E|\xi _{i}|^{p}\) but \(n^{-p/2}\left( \sum _{i=1}^{n}E\xi _{i}^{2}\right) ^{p/2}\le n^{-p/2}\left( \sum _{i=1}^{n}(E|\xi _{i}|^{p})^{2/p}\right) ^{p/2}\). However, if \(\sup _{n\ge 1}E|\xi _{n}|^{p}<\infty \) for some \(p>2\), they are equivalent. Hence, our results extend the corresponding ones of Yang and Hu (2014).
By Theorem 2.1, we can obtain the result concerning the rate of complete consistency and strong consistency as follows.
Corollary 2.1
In model (1.1), assume that \(\{\xi _{n},n\ge 1\}\) is a sequence of END random errors with zero mean and \(\sup _{n\ge 1}E|\xi _{n}|^{p}<\infty \) for some \(p>2\). If (2.1) holds with \(\lambda _{n}=c_1\) and \(\Lambda _{n}=c_{2}\) for some \(0<c_{1}\le c_{2}<\infty \), then for any \(\epsilon >0\),
and thus
By Theorem 2.2, we can also obtain the following result on the rate of weak consistency of the RWLSE \(\hat{\theta }_{n}\).
Corollary 2.2
In model (1.1), assume that \(\{\xi _{n},n\ge 1\}\) is a sequence of END random errors with zero mean and \(\sup _{n\ge 1}E|\xi _{n}|^{p}<\infty \) for some \(1<p\le 2\). If (2.1) holds with \(\lambda _{n}=c_1\) and \(\Lambda _{n}=c_{2}\) for some \(0<c_{1}\le c_{2}<\infty \), then
In particular, if \(p=2\), then for any positive sequence \(\{\tau _{n},n\ge 1\}\) satisfying \(\tau _{n}=o(n)\),
3 Some examples and numerical analysis
3.1 Some examples
In this subsection, we present some examples for the RWLSE of nonlinear regression models.
Example 3.1
Consider the linear model
where \(\{\xi _{n},n\ge 1\}\) is a sequence of END random errors with zero mean and \(E|\xi _{n}|^{p}<\infty \) for each \(n\ge 1\). Obviously, (2.1) holds with \(\lambda _{n}=\Lambda _{n}=1\). Hence, Theorems 2.1 and 2.2 follow from \(E|\xi _{n}|^{p}<\infty \) with \(p>2\) and \(1<p\le 2\), respectively.
Example 3.2
Consider the Michaelis-Menten model (see Sieders and Dzhaparidze (1987) or Miao and Tang (2021) for example)
which is used to describe the relation between the velocity V of an enzyme reaction and the concentration v of the substrate. The parameter L denotes the maximal reaction velocity and the parameter N characterizes the chemical affinity. Based on the model above, for each concentration \(v_{i}\), there is a measurement of the velocity \(V_{i}\) with error \(\xi _{i}\), i.e.,
Assume that the parameter set \((L, N)\in \Theta \) is a bounded open set in the positive quadrant. Consider the following simple form of model (3.5)
which follows from (3.5) by assuming N/L is known (without loss of generality, we may assume that \(N/L=1\)) and letting \(v_{i}=i^{-\mu }\), \(0<\mu <\min \{(p-1)/(4p),1/8\}\), where \(p>1\). It is easy to see that
for some \(0<c_{3}\le c_{4}<\infty \). Assume further that \(\{\xi _{n},n\ge 1\}\) is a sequence of END random errors with zero mean and \(E|\xi _{n}|^{p}<\infty \) for each \(n\ge 1\), then Theorems 2.1 and 2.2 hold. Moreover, by choosing \(\rho =n^{1/2}\epsilon \), we can obtain the weak consistency for \(p>1\) and strong consistency for some p large enough.
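As a numerical illustration of this example (our own sketch, not from the paper): assuming the reduced regression function \(g_{i}(N)=Nv_{i}/(N+v_{i})\) with \(v_{i}=i^{-\mu }\), and a hypothetical noise level, the RWLSE of N can be computed by a simple grid search.

```python
import numpy as np

rng = np.random.default_rng(3)
n, N0, mu = 400, 0.2, 0.1
i = np.arange(1, n + 1)
v = i ** (-mu)                            # design v_i = i^{-mu}

def g(N):
    # Assumed reduced Michaelis-Menten response with N/L = 1.
    return N * v / (N + v)

V = g(N0) + rng.normal(0.0, 0.05, n)      # noisy velocities (hypothetical sigma)

W = rng.gamma(4.0, size=n); W /= W.sum()  # Dir(4,...,4) weights

# RWLSE via grid search: minimize H_n(N) = sum_i W_i (V_i - g_i(N))^2 over a grid.
grid = np.linspace(0.05, 0.5, 2001)
H = [(W * (V - g(N)) ** 2).sum() for N in grid]
N_hat = grid[int(np.argmin(H))]
```

A grid search is used only for transparency; any one-dimensional optimizer over \(\Theta \) would serve.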
3.2 Numerical analysis
In this section, we will carry out some simulations to study the finite sample performance of the RWLSE in the homoscedastic and heteroscedastic nonlinear regression models. The data are generated from model (3.4) (denoted as Model 1) and (3.6) (denoted as Model 2) respectively. For Model 1, set \(\theta =1\) and for Model 2, set \(N=1/5\). Set the sample size \(n=50,100,200,400,800,1600\). Let \((\epsilon _1,\cdots ,\epsilon _n)\sim N_n(0,\Sigma )\) with
The weights \(W\sim Dir(4,4,\cdots ,4)\) are generated by the method of Narayanan (1990).
We first use the RWLSE to estimate \(\theta \) for Model 1 and N for Model 2 with homoscedasticity, i.e., \(\xi _i=\epsilon _i\) for each \(1\le i\le n\). We repeat the procedure 1000 times and calculate the mean and variance of the estimator. The results are given in Table 1. We can see that \(\hat{\theta }\) for Model 1 is unbiased, while \(\hat{N}\) for Model 2 is asymptotically unbiased. The fact that \(Var[\sqrt{n}(\hat{\theta }-\theta )]\) and \(Var[\sqrt{n}(\hat{N}-N)]\) remain bounded indicates that the convergence rate of the RWLSE is asymptotically \(O(n^{-1/2})\). To compare the RWLSE with the OLSE under END errors, we further present the mean and variance of the OLSE in Table 2. The results show that there is no intrinsic difference between the means of the two estimators. The mean and the variance of the RWLSE are slightly inferior to those of the OLSE in both models.
We now consider the heteroscedastic case, i.e., \(\xi _i=\big [1+\frac{(-1)^{i}(i-1)}{n}\big ]\epsilon _{i}\) for each \(1\le i\le n\). Other settings are the same as above. The results are given in Tables 3 and 4. The mean and the variance of the RWLSE are better than those of the OLSE in Model 1 but slightly weaker in Model 2. However, the convergence rates of the two estimators are almost the same. Note that in our simulation the heteroscedasticity is known; in many realistic applications, however, it is not easy to estimate the variance of heteroscedastic errors. Therefore, our simulation results show that the RWLSE performs well without estimating the variance of the errors first, which provides an alternative choice when dealing with similar issues.
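The heteroscedastic experiment above can be sketched as follows, taking Model 1 in the linear form \(g_{i}(\theta )=\theta x_{i}\) with \(\theta =1\). This is our own simplified illustration: since the covariance \(\Sigma \) is not reproduced here, the \(\epsilon _{i}\) are drawn i.i.d. standard normal, and the design is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2024)
n, theta0, reps = 400, 1.0, 500
x = np.linspace(0.1, 2.0, n)                 # hypothetical design for Model 1
i = np.arange(1, n + 1)
scale = 1.0 + ((-1.0) ** i) * (i - 1) / n    # heteroscedastic factors: xi_i = scale_i * eps_i

ols, rwls = [], []
for _ in range(reps):
    eps = rng.normal(size=n)                 # i.i.d. stand-in for the dependent errors
    X = theta0 * x + scale * eps
    W = rng.gamma(4.0, size=n); W /= W.sum() # Dir(4,...,4) weights
    ols.append(np.sum(x * X) / np.sum(x * x))          # OLSE, closed form
    rwls.append(np.sum(W * x * X) / np.sum(W * x * x)) # RWLSE, closed form

bias_ols, bias_rwls = np.mean(ols) - theta0, np.mean(rwls) - theta0
# Both estimators come out essentially unbiased; their Monte Carlo variances
# can then be compared in the spirit of Tables 3 and 4.
```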
4 Proofs of the main results
Proof of Theorem 2.1
Denote
and
Note from \(\sum _{i=1}^{n}W_{i}=1\) and (2.1) that
For all \(\omega \in (|\hat{\theta }_{n}-\theta _0|>\varepsilon )\), where \(\varepsilon >0\) is arbitrary, we have that \(\hat{\theta }_{n}\ne \theta _0\) and thus
which together with \(\Psi _{n}(\hat{\theta }_{n},\theta _{0})>0\) implies \(U_{n}(\hat{\theta }_{n})\ge 1/2\). Hence, \((|\hat{\theta }_{n}-\theta _0|>\varepsilon )\subseteq (U_{n}(\hat{\theta }_{n})\ge 1/2)\). Choosing \(\varepsilon =\rho n^{-1/2}\), we have
By Cauchy’s inequality, we can see that for all \(\theta \ne \theta _{0}\),
Observing that \(\sum _{i=1}^{n}W_{i}=1\) and \(f(x)=|x|^{r}\) is a convex function for all \(r\ge 1\), we have by \(p>2\) and Lemma A.3 that
Moreover, we obtain by (4.1) that
Hence, it follows from (4.3)–(4.5) and Markov’s inequality that
For \(m=0,1,2,\ldots ,\lfloor n^{1/2}\rfloor \), let \(\theta (m)=\theta _{0}+\frac{\rho }{n^{1/2}}+\frac{m\rho }{\lfloor n^{1/2}\rfloor }\) and \(\rho _{m}=\theta (m)-\theta _{0}=\frac{\rho }{n^{1/2}}+\frac{m\rho }{\lfloor n^{1/2}\rfloor }\). It follows from (4.1) again that
Hence, it yields that
By Lemma A.3 and Stirling’s approximation, we have that for each \(1\le i\le n\), when n is sufficiently large,
Note that \(0=EW_{i}E\xi _{i}=EW_{i}\xi _{i}=EW_{i}\xi _{i}^{+}-EW_{i}\xi _{i}^{-}\) for \(1\le i\le n\) and from Lemma A.1 that \(\{W_{n}\xi _{n}^{+}-EW_{n}\xi _{n}^{+},n\ge 1\}\) and \(\{W_{n}\xi _{n}^{-}-EW_{n}\xi _{n}^{-},n\ge 1\}\) are still sequences of END random variables with zero mean. Hence, applying Markov’s inequality, Lemma A.2, (2.1) and (4.8), one can easily obtain that
Similarly, we also obtain by Lemma A.2, (2.1) and (4.8) that for all \(\theta _{1},\theta _{2}\in \Theta \) and n large enough,
Hence, taking \(r=1+\alpha =p\), \(C=C(n,p)\), \(\varepsilon =\rho /\lfloor n^{1/2}\rfloor \), \(a=\lambda _n^{2}\rho _{m}^{2}/4\), and \(\gamma \in (2,p+1)\) in Lemma A.4, we obtain
Noting that \(\rho _{0}=\rho n^{-1/2}\), \(\rho _{m}>m\rho n^{-1/2}\) and \(p>2\), we obtain by (4.7), (4.9) and (4.10) that
Similarly, we also have
The desired result (2.2) follows from (4.2), (4.6), (4.11) and (4.12) immediately. \(\square \)
Proof of Theorem 2.2
The proof is similar to that of Theorem 2.1. Thus, we only present the differences. It follows from (4.1) that
Therefore, we have by (4.8), (4.13) and \(C_{r}\)-inequality that
Applying Markov’s inequality, the Marcinkiewicz-Zygmund inequality in Lemma A.2, (2.1) and (4.8), we can also obtain that for all n large enough,
and
Hence, analogous to the proof of (4.10), we have
Analogous to the proof of (4.11), we obtain by (4.7), (4.15) and (4.16) that
Similarly, we also have
Combining (4.2), (4.14), (4.17), and (4.18), we obtain (2.3) immediately. \(\square \)
Proof of Corollary 2.1
Taking \(\rho =\epsilon n^{1/p}\sqrt{\log n}\) in Theorem 2.1, we have that
which together with the Borel-Cantelli lemma yields the rate of strong consistency. \(\square \)
Proof of Corollary 2.2
Noting that \(\sup _{n\ge 1}E|\xi _{n}|^{p}<\infty \) for some \(1<p\le 2\), we may assume that \(\Delta _{np}\le 1\) for each \(n\ge 1\). Hence, for any \(\epsilon >0\), taking \(\rho =\left( \frac{C'(p)}{\epsilon }\right) ^{1/p}n^{1/p-1/2}\) in Theorem 2.2, we have
The second conclusion follows immediately by choosing \(\rho =\epsilon \sqrt{n/\tau _n}\) in Theorem 2.2. This completes the proof. \(\square \)
5 Conclusions
In this work, we mainly consider the following nonlinear regression model:

\(X_{n}=g_{n}(\theta )+\xi _{n},\quad n\ge 1,\)
where \(X_{n}\) is observed, \(\{g_{n}(\theta )\}\) is a known sequence of continuous functions possibly nonlinear in \(\theta \in \Theta \), and \(\{\xi _{n},n\ge 1\}\) is a sequence of random errors with zero mean.
The nonlinear regression model not only requires essentially fewer unknown parameters, but its parameters also often have the meaning of physical variables, while linear parameters are usually devoid of physical significance. Therefore, it is of great interest to study the nonlinear regression model.
In this work, in view of the concept of the Dirichlet distribution, we introduce the random weighting method to the nonlinear regression model and propose the randomly weighted least squares estimator of \(\theta \) as follows:

\(\hat{\theta }_{n}=\arg \inf _{\theta \in \Theta }\sum _{i=1}^{n}W_{i}\left( X_{i}-g_{i}(\theta )\right) ^{2},\)
where \(W_{i}\)’s are independent of \(\xi _{i}\)’s and the random vector \({\varvec{W}}=(W_{1},\cdots ,W_{n})\) obeys the Dirichlet distribution \(Dir(4,4,\ldots ,4)\).
In this work, we establish the asymptotic properties of the randomly weighted least squares estimator with END errors. The results show that this new estimator is consistent. Moreover, some simulations are carried out to compare it with the ordinary least squares estimator, showing advantages especially in a heteroscedastic nonlinear regression model.
References
Barvinok A, Samorodnitsky A (2007) Random weighting, asymptotic counting, and inverse isoperimetry. Israel J Math 158(1):159–191
Block HW, Savits TH, Shaked M (1982) Some concepts of negative dependence. Ann Prob 10(3):765–772
Bozorgnia A, Patterson RF, Taylor RL (1996) Limit theorems for dependent random variables. In: Proceedings of the first world congress of nonlinear analysts 92 (II), Walter de Gruyter, Berlin, 1639–1650
Chen PY, Bai P, Sung SH (2014) The von Bahr-Esseen moment inequality for pairwise independent random variables and applications. J Math Anal Appl 419(2):1290–1302
Chen YQ, Chen AY, Ng KW (2010) The strong law of large numbers for extended negatively dependent random variables. J Appl Prob 47:908–922
Fang YX, Zhao LC (2006) Approximation to the distribution of LAD estimators for censored regression by random weighting method. J Stat Plan Inference 136(4):1302–1316
Gao SS, Zhang JM, Zhou T (2003) Law of large numbers for sample mean of random weighting estimate. Inf Sci 155(1–2):151–156
Gao SS, Zhong YM (2010) Random weighting estimation of kernel density. J Stat Plan Inference 140(9):2403–2407
Hu SH (2002) The rate of convergence for the least squares estimator in nonlinear regression model with dependent errors. Sci China Ser A 45(2):137–146
Hu SH (2004) Consistency for the least squares estimator in nonlinear regression model. Stat Prob Lett 67(2):183–192
Ivanov AV (1976) An asymptotic expansion for the distribution of the least squares estimator of the nonlinear regression parameter. Theory Prob Appl 21(3):557–570
Jennrich RI (1969) Asymptotic properties of nonlinear least squares estimators. Ann Math Stat 40(2):633–643
Joag-Dev K, Proschan F (1983) Negative association of random variables with applications. Ann Stat 11:286–295
Lita da Silva J (2015) Almost sure convergence for weighted sums of extended negatively dependent random variables. Acta Math Hungarica 146(1):56–70
Liu L (2009) Precise large deviations for dependent random variables with heavy tails. Stat Prob Lett 79:1290–1298
Liu L (2010) Necessary and sufficient conditions for moderate deviations of dependent random variables with heavy tails. Sci China Ser Math 53(6):1421–1434
Lehmann E (1966) Some concepts of dependence. Ann Math Stat 37:1137–1153
Malinvaud E (1970) The consistency of nonlinear regression. Ann Math Stat 41(3):956–969
Miao Y, Tang YY (2021) Large deviation inequalities of LS estimator in nonlinear regression models. Stat Prob Lett, 168, Article ID 108930, https://doi.org/10.1016/j.spl.2020.108930
Ng KW, Tian GL, Tang ML (2011) Dirichlet and related distributions: theory, methods and applications, vol 888. John Wiley & Sons, London (Chapter 1, 2)
Narayanan A (1990) Computer generation of Dirichlet random vectors. J Stat Comput Simul 36(1):19–30
Prakasa Rao BLS (1984) The rate of convergence of the least squares estimator in a non-linear regression model with dependent errors. J Multivariate Anal 14(3):315–322
Rubin DB (1981) The Bayesian bootstrap. Ann Stat 9:130–134
Shen AT (2011) Probability inequalities for END sequence and their applications. J Inequal Appl 2011:98, 12
Sieders A, Dzhaparidze K (1987) A large deviation result for parameter estimators and its application to nonlinear regression analysis. The Annals of Statistics 15(3):1031–1049
Staniewski P (1984) The Bootstrap in nonlinear regression. In: Rasch D, Tiku ML (eds) Robustness of statistical methods and nonparametric statistics. Theory and decision library (series b: mathematical and statistical methods), vol 1. Springer, Dordrecht, pp 139–142
Wang SJ, Wang XJ (2013) Precise large deviations for random sums of END real-valued random variables with consistent variation. J Math Anal Appl 402:660–667
Wang XJ, Hu SH, Shen AT, Yang WZ (2011) An exponential inequality for a NOD sequence and a strong law of large numbers. Appl Math Lett 24:219–223
Wang XJ, Li XQ, Hu SH, Wang XH (2014) On complete convergence for an extended negatively dependent sequence. Commun Stat Theory Methods 43(14):2923–2937
Wang XJ, Zheng LL, Xu C, Hu SH (2015) Complete consistency for the estimator of nonparametric regression models based on extended negatively dependent errors. Statistics 49(2):396–407
Wu CF (1981) Asymptotic theory of nonlinear least squares estimation. Ann Stat 9(3):501–513
Wu Y, Wang XJ, Hu T-C, Volodin A (2019) Complete \(f\)-moment convergence for extended negatively dependent random variables. RACSAM 113:333–351
Xue LG, Zhu LX (2005) \(L_1\)-norm estimation and random weighting method in a semiparametric model. Acta Math Appl Sin 21(2):295–302
Yang WZ, Hu SH (2014) Large deviation for a least squares estimator in a nonlinear regression model. Stat Prob Lett 91:135–144
Yang WZ, Xu HY, Chen L, Hu SH (2018) Complete consistency of estimators for regression models based on extended negatively dependent errors. Stat Pap 59(2):449–465
Yang WZ, Zhao ZR, Wang XH, Hu SH (2017) The large deviation results for the nonlinear regression model with dependent errors. TEST 26(2):261–283
Zheng ZG (1987) Random weighting method. Acta Math Appl Sin 10(2):247–253
Acknowledgements
The authors are most grateful to the Editor and anonymous referee for carefully reading the manuscript and valuable suggestions which helped in improving an earlier version of this paper. Supported by the National Natural Science Foundation of China (12201004, 12201079, 12201600), the National Social Science Foundation of China (22BTJ059), and the Natural Science Foundation of Anhui Province (2108085MA06).
Ethics declarations
Conflict of interests
There is no conflict of interest.
Appendix
Lemma A.1
Let \((W_{1},\cdots ,W_{n})\sim Dir(\alpha _{1},\cdots ,\alpha _{n})\) for each \(n\ge 1\) and let \(\{\xi _{n},n\ge 1\}\) be a sequence of nonnegative END random variables. Then \(\{W_{n}\xi _{n},n\ge 1\}\) is still a sequence of END random variables.
Proof
It follows from Example 5.4 of Block et al. (1982) that \(\{W_{n},n\ge 1\}\) is a sequence of nonnegative NOD random variables, which is independent of \(\{\xi _{n},n\ge 1\}\). Hence, for any real numbers \(z_1,\cdots ,z_n\), we have by Definition 1.2 and the properties of NOD random variables (see Bozorgnia et al. (1996) or Lemmas 1.1 and 1.2 of Wang et al. (2011), for example) that
Similarly, we also have that
Therefore, \(\{W_{n}\xi _{n},n\ge 1\}\) is still a sequence of END random variables by Definition 1.2 again. \(\square \)
Lemma A.2
Let \(\{a_{ni},1\le i\le n,n\ge 1\}\) be an array of real numbers and \(\{X_{n},n\ge 1\}\) be a sequence of END random variables with \(EX_{n}=0\) and \(E|X_{n}|^{p}<\infty \) for each \(n\ge 1\) and some \(p>1\). Then there exist positive constants \(C_{p}\) and \(C_{p}'\) depending only on p such that
Proof
Noting that \(a_{ni}=a_{ni}^{+}-a_{ni}^{-}\) for each \(1\le i\le n\) and \(n\ge 1\), the Rosenthal-type inequality above is a direct consequence of Corollary 3.2 in Shen (2011) together with the \(C_{r}\)-inequality. The second inequality, i.e., the Marcinkiewicz-Zygmund-type inequality, can be obtained from the first one by the method used in the proof of Theorem 2.1 in Chen et al. (2014). The details are omitted. \(\square \)
Lemma A.3
Let \((Y_{1},\cdots ,Y_{n})\sim Dir(\alpha _{1},\cdots ,\alpha _{n})\) and \(\alpha _0=\sum _{i=1}^{n}\alpha _{i}\). Then for any \(p>0\) and each \(1\le i\le n\),

\(EY_{i}^{p}=\frac{\Gamma (\alpha _{0})\Gamma (\alpha _{i}+p)}{\Gamma (\alpha _{0}+p)\Gamma (\alpha _{i})}.\)
Proof
Without loss of generality, we only need to show \(EY_{1}^{p}=\frac{\Gamma (\alpha _{0})\Gamma (\alpha _{1}+p)}{\Gamma (\alpha _{0}+p)\Gamma (\alpha _{1})}\). By Definition 1.1 and some standard calculation, we have that
where the second equality above follows by letting \(\eta _{n-1}=y_{n-1}/\left( 1-\sum _{i=1}^{n-2}y_{i}\right) \). \(\square \)
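The moment formula in Lemma A.3 is easy to check numerically. The following sketch (ours, using NumPy's Dirichlet sampler) compares the exact value with a Monte Carlo estimate for \(Dir(4,\cdots ,4)\) with \(n=5\) and \(p=2\).

```python
import math
import numpy as np

def dirichlet_moment(alpha, p, i=0):
    """Exact E[Y_i^p] for (Y_1,...,Y_n) ~ Dir(alpha), from Lemma A.3."""
    a0 = sum(alpha)
    return (math.gamma(a0) * math.gamma(alpha[i] + p)
            / (math.gamma(a0 + p) * math.gamma(alpha[i])))

alpha = [4.0] * 5                     # Dir(4,...,4) with n = 5, so alpha_0 = 20
exact = dirichlet_moment(alpha, p=2)  # Gamma(20)Gamma(6)/(Gamma(22)Gamma(4)) = 1/21

rng = np.random.default_rng(7)
Y = rng.dirichlet(alpha, size=200_000)
mc = (Y[:, 0] ** 2).mean()            # Monte Carlo estimate of E[Y_1^2]
```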
Lemma A.4
(cf. Hu 2004) Let \((\Omega ,\mathscr {F},P)\) be a probability space, \([T_{1},T_{2}]\) be a closed interval on the real line. Assume that \(V(\theta )=V(\omega ,\theta )\) \((\theta \in [T_{1},T_{2}],\omega \in \Omega )\) is a stochastic process such that \(V(\omega ,\theta )\) is continuous for all \(\omega \in \Omega \). If there exist numbers \(\alpha >0\), \(r>0\) and \(C=C(T_{1},T_{2})<\infty \) such that
then for any \(\varepsilon >0\), \(a>0\), \(\theta _{0},\theta _{0}+\varepsilon \in [T_{1},T_{2}]\), and \(\gamma \in (2,2+\alpha )\), it has
Wu, Y., Yu, W. & Wang, X. Large deviations for randomly weighted least squares estimator in a nonlinear regression model. Metrika 87, 551–570 (2024). https://doi.org/10.1007/s00184-023-00926-0