1 Introduction

The normal model, although very attractive, is not always appropriate for fitting a dataset, especially when the data present extreme or outlying observations. To address this problem, regression models that are less easily affected by extreme or outlying observations have been developed in the statistical literature. The symmetric family of distributions provides an extension of the normal distribution that includes distributions with both heavier and lighter tails than the normal, such as the Cauchy, Student-t, generalized Student-t, logistic I and II, generalized logistic, power exponential, Kotz and generalized Kotz distributions, among others. This family provides a vast source of alternative models for analyzing data containing outlying observations, and these models have been widely studied in the statistical literature. Several recent articles consider symmetric distributions; see Cordeiro et al. (2000), Cordeiro (2004), Cysneiros et al. (2010a), Cysneiros et al. (2010b), Vanegas et al. (2013), and Maior and Cysneiros (2016). Further details about the symmetric family of distributions can be found in Fang et al. (1990) and Fang and Anderson (1990).

When the dispersions are not constant over the observations, the inference strategies for the regression parameters are different. Thus, it is extremely important to test whether variable dispersion is present in the data. One way to do this is to model the dispersion parameter as a function of regressors and an unknown parameter vector in such a way that, for a specific value of the parameter vector, the function corresponds to constant dispersion. Following this approach, one can formulate a hypothesis test in which the null hypothesis leads to constant dispersion. To that end, the likelihood ratio test is commonly used. Under the null hypothesis, the likelihood ratio statistic (LR) is asymptotically chi-square (\(\chi ^2\)) distributed up to an error of order \(n^{-1},\) where n is the sample size. However, it is well known that for small samples this test can yield very distorted rejection rates, because the \(\chi ^2\) approximation to the null distribution of the LR statistic is not accurate. One way to improve this approximation, and consequently reduce the distortion, is to multiply the LR statistic by a Bartlett correction factor (Bartlett 1937), a method later generalized by Lawley (1956). The resulting statistic has a \(\chi ^2_k\) null distribution up to an error of order \(n^{-2}\), where k is the difference between the dimensions of the parameter spaces under the two hypotheses being tested.

Another factor that must be considered is the presence of nuisance parameters, which can have a profound impact on inference, and many approaches have been proposed to eliminate or reduce their impact. In the presence of nuisance parameters, it is generally feasible to perform inference based on the profile likelihood, obtained by replacing the nuisance parameters in the likelihood function with consistent estimates, resulting in a function that depends only on the parameters of interest. The profile likelihood has some properties of the usual likelihood (Pace and Salvan 1997, Chapter 4), but for problems with large numbers of nuisance parameters this procedure results in inconsistent or inefficient estimates. Another problem caused by the number of nuisance parameters is the poor \(\chi ^2\) approximation to the distribution of the LR statistic. To reduce the effect of the nuisance parameters, Cox and Reid (1987, 1992) proposed a modified profile likelihood. The modified profile likelihood ratio statistic (\(LR_m\)) has an asymptotic \( \chi ^ 2_k\) null distribution up to an error of order \(n^{-1}\), and it can also be Bartlett corrected (DiCiccio and Stern 1994), producing more accurate inference, as can be seen in Ferrari et al. (2004, 2005), Cysneiros and Ferrari (2006) and Melo et al. (2009).

To improve the large-sample \(\chi ^2\) approximation to the null distribution of the LR and \(LR_m\) statistics in many parametric models, the Bartlett correction is widely used. Although the Bartlett correction factors are somewhat complex to obtain, they can be readily implemented in computer programs. Moreover, this is a worthwhile practice, since Bartlett corrections generally provide a considerable improvement. In recent years, interest in Bartlett corrections has resurfaced and some articles have been published considering this issue. See for example Fujita et al. (2010), Lemonte et al. (2012), Bayer and Cribari-Neto (2013), Stein et al. (2014).

The main purpose of this article is to derive Bartlett correction factors to improve inference about the dispersion parameter based on the likelihood ratio statistic and the modified profile likelihood ratio statistic in the class of heteroscedastic symmetric nonlinear models (HSNLM), a class of models proposed by Cysneiros et al. (2010a), when the number of observations available is small. To that end, we follow the approach of modeling the dispersion parameter vector as a function of regressors and unknown parameters such that under the null hypothesis the function is constant, that is, the null hypothesis leads to the symmetric nonlinear regression model. Our results extend some of those obtained in Cordeiro (2004), since we consider a regression structure for the dispersion parameter, whereas that work assumes the dispersion parameter is a scalar in the class of symmetric nonlinear regression models. We also extend the results obtained in Ferrari et al. (2004), who improved likelihood-based tests for heteroscedasticity in linear regression models. In this work, we also consider the bootstrap Bartlett correction introduced by Rocke (1989) as a numerical alternative to the analytical Bartlett correction. A Monte Carlo simulation study is performed to evaluate the performance of the corrected tests and their uncorrected versions. We expected the proposed tests to deliver more trustworthy inference in small samples, and our Monte Carlo simulation results show that this is indeed the case. That is, the corrected likelihood ratio and corrected modified profile likelihood ratio tests proposed here are attractive alternatives to the usual likelihood ratio test in the HSNLM class when the sample size is small. We are unaware of any simulation study in the statistical literature comparing the performance of the proposed tests in HSNLM; this paper fills that gap.

The article is organized as follows. In Sect. 2, we define the model and present some inferential aspects such as estimation and hypothesis testing of regression parameters. In Sect. 3, we discuss Bartlett corrections to improve the usual likelihood ratio and modified profile likelihood ratio tests in HSNLM. We also present the bootstrap Bartlett correction for likelihood ratio statistic. A Monte Carlo simulation is performed in Sect. 4 to evaluate the performance of the studied tests. An application using real data is considered in Sect. 5. Conclusions about the results obtained are presented in Sect. 6. Finally, appendices with technical details are presented at the end.

2 Heteroscedastic symmetric nonlinear models

Let \(y_1,\ldots ,y_n\) be n independent random variables. Each \(y_\ell ,\,\,\ell =1,\ldots ,n,\) follows a continuous symmetric distribution with location parameter \(\mu _\ell \in {\mathbb {R}}\) and dispersion parameter \(\phi _\ell >0\) if its probability density function is of the form

$$\begin{aligned} \pi (y_\ell ;\mu _\ell ,\phi _\ell )=\frac{1}{\sqrt{\phi _\ell }}g(u_\ell ), \quad y_\ell \in {\mathbb {R}} \end{aligned}$$
(1)

where \(g:{\mathbb {R}} \rightarrow [0, \infty )\) is generally known as the density generator, such that \(\int _0^{\infty }g(u)du<\infty ,\) with \(u_\ell =(y_\ell -\mu _\ell )^2/\phi _\ell \). In what follows, we will denote \(y_\ell \sim S(\mu _\ell ,\phi _\ell ,g).\) The density generator \(g(\cdot )\) for some symmetric distributions is given in Table 1.
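As an illustration, density (1) can be evaluated directly once a density generator is chosen. The Python sketch below is ours (the paper's computations use Ox); it uses the standard normal and Student-t generators, whose functional forms are the usual ones.

```python
import math

def g_normal(u):
    # Normal density generator: g(u) = (2*pi)^(-1/2) * exp(-u/2)
    return math.exp(-u / 2.0) / math.sqrt(2.0 * math.pi)

def g_student_t(u, nu=5):
    # Student-t density generator with nu degrees of freedom
    c = math.gamma((nu + 1) / 2.0) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2.0))
    return c * (1.0 + u / nu) ** (-(nu + 1) / 2.0)

def symmetric_pdf(y, mu, phi, g):
    # Density (1): pi(y; mu, phi) = g(u) / sqrt(phi), u = (y - mu)^2 / phi
    u = (y - mu) ** 2 / phi
    return g(u) / math.sqrt(phi)
```

For example, `symmetric_pdf(0.0, 0.0, 1.0, g_normal)` recovers the standard normal density at zero, \(1/\sqrt{2\pi } \approx 0.3989\).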

Table 1 Density generation function \(g(\cdot )\) for some symmetric distributions

The heteroscedastic symmetric nonlinear regression model proposed by Cysneiros et al. (2010a) is defined as:

$$\begin{aligned} y_\ell = \mu _\ell + \sqrt{\phi _\ell }e_\ell , \quad \ \ell =1,\ldots ,n, \end{aligned}$$
(2)

where \(\mu _\ell =f(x_\ell ;{\varvec{\beta }})\) is a continuous, twice differentiable nonlinear regression structure with respect to the components of the vector of unknown regression parameters \({\varvec{\beta }}=(\beta _1,\ldots ,\beta _p)^\top \) \((p<n)\), \({\varvec{x}}_\ell =(x_{\ell 1},\ldots ,x_{\ell P})^\top \) is a vector of known explanatory variables associated with the \(\ell \)th observation and \(e_\ell \sim S(0,1,g)\). Moreover, we assume that \({\varvec{\beta }}\) is defined in a subset \(\varvec{\varOmega _{\beta }} \subset {\mathbb {R}}^p\) such that the \(n \times p\) matrix of derivatives of \({\varvec{\mu }}=(\mu _1, \ldots , \mu _n)^\top \) with respect to \({\varvec{\beta }},\) denoted by \(\tilde{\varvec{X}}=\partial {\varvec{\mu }}/\partial {\varvec{\beta }},\) has rank p for all \({\varvec{\beta }}.\) In addition, we consider \( \phi _\ell =\sigma ^2 m(\varvec{\omega }_\ell ^\top {\varvec{\delta }})\), where \(m(\cdot )>0\) is any known one-to-one continuously differentiable function, \(\varvec{\omega }_\ell =(\omega _{\ell 1},\ldots ,\omega _{\ell k})^\top \) is a vector of explanatory variables that may have components in common with \({\varvec{x}}_\ell \), \({\varvec{\delta }}=(\delta _1,\ldots ,\delta _k)^\top \) is a vector of unknown parameters to be estimated, and \(\sigma ^2 \in (0 , +\infty )\) is an unknown constant. We also assume that there exists a unique value \(\varvec{\delta _0}\) of \({\varvec{\delta }}\) such that \(m(\varvec{\omega _\ell }^\top \varvec{\delta _0})=1\) for all \(\ell .\) Consequently, \(\phi _\ell =\sigma ^2\) if \({\varvec{\delta }}=\varvec{\delta _0},\) which implies that the \(y_\ell \)'s have constant dispersion. Note that the concept of heteroscedasticity in this context refers to varying dispersion, i.e., we say the model is homoscedastic when all dispersion parameters \(\phi _1,\ldots , \phi _n\) are equal; otherwise we say the model is heteroscedastic.
To perform the procedure, an explicit form for m must be chosen. A possible and common choice is to consider \(m(\varvec{\omega _\ell }^\top {\varvec{\delta }})=\exp (\varvec{\omega _\ell }^\top {\varvec{\delta }}),\) since this functional form for m does not impose any restriction on the components of \(\varvec{\omega _\ell }\) (Cook and Weisberg 1983; Lin et al. 2009).

We are thus interested in assessing the constancy of the dispersion parameter in model (2) by testing the null hypothesis \(H_0:{\varvec{\delta }}=\varvec{\delta _0}\) against the alternative hypothesis \(H_1:{\varvec{\delta }}\ne \varvec{\delta _0}\), where \(\varvec{\delta _0}\) is a \(k \times 1\) vector of specified constants such that \(m_\ell =m(\varvec{\omega }_\ell ^\top \varvec{\delta _0})=1\) for \(\ell =1,\ldots ,n\). In other words, we are performing a test for heteroscedasticity in symmetric nonlinear regression models, since under the null hypothesis model (2) reduces to the aforementioned class of models. The number of parameters of interest is k and the number of nuisance parameters is \(p+1\). The total log-likelihood function for the parameter vector \({\varvec{\theta }}=({\varvec{\beta }}^\top , {\varvec{\delta }}^\top , \sigma ^2)^\top \) given \(y_1,\ldots , y_n\) in model (2) is expressed by

$$\begin{aligned} l(\varvec{y};{\varvec{\theta }})=-\frac{n}{2}\log \sigma ^2 -\frac{1}{2}\sum _{\ell =1}^n \log (m_\ell ) + \sum _{\ell =1}^n t(z_\ell ), \end{aligned}$$

where \(t(z_\ell )=\log g(z^2_\ell )\) with \(z_\ell =\sqrt{u_\ell }=\frac{(y_\ell - \mu _\ell )}{\sqrt{\phi _\ell }}\).
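For concreteness, the log-likelihood above can be coded directly. The Python sketch below is our illustration (the paper's simulations use Ox); it assumes the multiplicative form \(m_\ell =\exp (\varvec{\omega }_\ell ^\top {\varvec{\delta }})\) discussed earlier and a user-supplied generator g.

```python
import math

def g_normal(u):
    # Normal density generator: g(u) = (2*pi)^(-1/2) * exp(-u/2)
    return math.exp(-u / 2.0) / math.sqrt(2.0 * math.pi)

def total_loglik(y, mu, omega, delta, sigma2, g):
    """Total log-likelihood of model (2) with phi_l = sigma2 * m_l,
    where m_l = exp(omega_l' delta) (multiplicative dispersion)."""
    n = len(y)
    ll = -0.5 * n * math.log(sigma2)
    for yl, mul, wl in zip(y, mu, omega):
        m = math.exp(sum(wi * di for wi, di in zip(wl, delta)))
        phi = sigma2 * m
        z = (yl - mul) / math.sqrt(phi)          # standardized residual
        ll += -0.5 * math.log(m) + math.log(g(z * z))  # t(z) = log g(z^2)
    return ll
```

With \({\varvec{\delta }}=\varvec{0}\) and \(\sigma ^2=1\) all \(m_\ell =1\), and the function reduces to the homoscedastic log-likelihood, e.g. the sum of standard normal log-densities under `g_normal`.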

Note that to obtain the maximum likelihood estimator (MLE) of \({\varvec{\delta }}\) we maximize the profile log-likelihood function

$$\begin{aligned} l_p({\varvec{\delta }})=l\Big (\varvec{y};{\varvec{\delta }},\varvec{\hat{\beta }_\delta }, {\hat{\sigma }}^2_{{\varvec{\delta }}}\Big ), \end{aligned}$$

where \(\varvec{\hat{\beta }_\delta }\) and \({\hat{\sigma }}^2_{{\varvec{\delta }}}\) are the MLEs of \({\varvec{\beta }}\) and \(\sigma ^2\) given \({\varvec{\delta }}\), respectively. Under the usual regularity conditions, \(\varvec{\hat{\beta }}_{{\varvec{\delta }}}\) and \({\hat{\sigma }}^2_{{\varvec{\delta }}}\) are the solutions of the equations \(\varvec{U_\beta } = \varvec{0}\) and \(U_{\sigma ^2} = 0\), respectively, which cannot be obtained in closed form. Thus, \(\varvec{\hat{\beta }}_{{\varvec{\delta }}}\) and \({\hat{\sigma }}^2_{{\varvec{\delta }}}\) are obtained by iterative restricted maximization techniques. Further details of these techniques can be found in Nocedal and Wright (1999).

The likelihood ratio statistic (LR) for testing \(H_0\) can be written as

$$\begin{aligned} LR=2\{l_p(\varvec{\hat{\delta }})-l_p(\varvec{\delta _0}) \}, \end{aligned}$$

where \(\varvec{\hat{\delta }}\) is the MLE of \({\varvec{\delta }}\). Asymptotically and under the null hypothesis, LR has \(\chi ^2_k\) distribution.

When replacing the nuisance parameters by their maximum likelihood estimates, we are in a way treating them as known, and as a consequence the profile log-likelihood function may present biases in the score and information function (Ferrari et al. 2005). This procedure is also known to provide inconsistent or inefficient estimates in problems with large numbers of nuisance parameters. Cox and Reid (1987) proposed a modified version of the profile likelihood function in order to attenuate the impact of the number of nuisance parameters on the resulting inference. However, that version requires orthogonality between the parameters of interest and the nuisance ones. Therefore, in our case \({\varvec{\delta }}\) should be orthogonal to the remaining parameters. For this, a transformation \(({\varvec{\delta }}^\top , {\varvec{\beta }}^\top , \sigma ^2)^\top \rightarrow ({\varvec{\delta }}^\top , {\varvec{\beta }}^\top , \gamma )^\top \) is required such that \(E[-\partial ^2 l/\partial \delta _a \partial \gamma ]=0, \ a=1,\ldots ,k.\) Following Cox and Reid (1987, Eq. 4), the desired transformation is obtained by solving

$$\begin{aligned} \frac{n}{2 \sigma ^4} \frac{\partial \sigma ^2}{\partial \delta _a} = -\frac{1}{2 \sigma ^2} \sum _{\ell =1}^n \frac{\partial m_\ell }{\partial \delta _a} \frac{1}{m_\ell }, \end{aligned}$$

which has solution (Simonoff and Tsai 1994)

$$\begin{aligned} \sigma ^2 = \frac{\gamma }{(\prod _{\ell =1}^n m_\ell )^{1/n}}. \end{aligned}$$
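Indeed, this solution can be verified by direct substitution: writing \(\log \sigma ^2 = \log \gamma - n^{-1}\sum _{\ell =1}^n \log m_\ell \) and differentiating gives

$$\begin{aligned} \frac{\partial \sigma ^2}{\partial \delta _a} = -\frac{\sigma ^2}{n}\sum _{\ell =1}^n \frac{1}{m_\ell }\frac{\partial m_\ell }{\partial \delta _a}, \end{aligned}$$

so the left-hand side of the orthogonality equation becomes \(\frac{n}{2 \sigma ^4}\left( -\frac{\sigma ^2}{n}\right) \sum _{\ell =1}^n \frac{1}{m_\ell }\frac{\partial m_\ell }{\partial \delta _a} = -\frac{1}{2 \sigma ^2} \sum _{\ell =1}^n \frac{\partial m_\ell }{\partial \delta _a} \frac{1}{m_\ell },\) which is exactly the right-hand side.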

Considering the reparameterized model, the modified profile log-likelihood function for \({\varvec{\delta }}\) (Cox and Reid 1987) is given by

$$\begin{aligned} l_{CR}^{*}({\varvec{\delta }}) = l_p^{*}({\varvec{\delta }}) - \frac{1}{2} \log \{ \det [j^{*}({\varvec{\delta }}; \varvec{\hat{\beta }_{\delta }}, \hat{\gamma }_{{\varvec{\delta }}})] \}, \end{aligned}$$
(3)

where \(l^{*}_p({\varvec{\delta }})=l^*(\varvec{y};{\varvec{\delta }},\varvec{\hat{\beta }_{\delta }}, \hat{\gamma }_{{\varvec{\delta }}})= -\frac{n}{2} \log \gamma + \sum _{\ell =1}^n t(z_\ell )\) corresponds to the profile log-likelihood function for \({\varvec{\delta }}\), \(j^{*}({\varvec{\delta }};\varvec{\hat{\beta }_{\delta }}, \hat{\gamma }_{{\varvec{\delta }}})\) denotes the block of the observed information matrix for the nuisance parameters \(({\varvec{\beta }}^\top , \gamma )^\top \) evaluated at \(({\varvec{\delta }},\varvec{\hat{\beta }_{\delta }}, \hat{\gamma }_{{\varvec{\delta }}}),\) and \(\hat{\gamma }_{{\varvec{\delta }}}={\hat{\sigma }}^2_{{\varvec{\delta }}}\left( \prod _{\ell =1}^n m_\ell \right) ^{1/n}.\) The matrix \(j^{*}({\varvec{\delta }};\varvec{\hat{\beta }_{\delta }}, \hat{\gamma }_{{\varvec{\delta }}})\) is shown in Appendix A.

The modified profile likelihood ratio statistic (\(LR_m\)) for testing \(H_0\) against \(H_1\) is given by

$$\begin{aligned} LR_m = 2 \{ l_{CR}^{*}(\varvec{\hat{\delta }}) - l_{CR}^{*}(\varvec{\delta _0}) \}, \end{aligned}$$

where \(\varvec{\hat{\delta }}\) is the MLE of \({\varvec{\delta }}\). Under the null hypothesis, the \(LR_m\) statistic has asymptotic \(\chi ^2_k\) distribution.

3 Bartlett corrections

For large samples, the null distributions of the LR and \(LR_m\) statistics are approximated by the \(\chi ^2\) distribution. However, if the sample size is not large enough, it is well known that these approximations may not be satisfactory, leading to size-distorted tests. In order to improve them, correction factors for the LR and \(LR_m\) statistics have been proposed in the literature, yielding corrected test statistics whose null distributions are better approximated by the reference \(\chi ^2\) distribution; that is, the approximation error is reduced from order \(n^{-1}\) to \(n^{-2}.\) The ideas of transforming the LR and \(LR_m\) statistics to make their distributions better approximated by the chi-squared distribution are due to Bartlett (1937) and DiCiccio and Stern (1994), respectively. The correction factors are general in the sense that they are not tied to a particular parametric model, but they must be derived for each problem of interest.

3.1 Bartlett correction for the LR statistic

It is known that for large samples and under the null hypothesis, the LR statistic has a chi-square distribution up to an error of order \(n^{-1}\). Bartlett (1937) proposed multiplying the LR statistic by a correction factor, denoted by \((1+c/k)^{-1}\), resulting in a corrected statistic \(LR^*\) given by

$$\begin{aligned} LR^*=\frac{LR}{1+c/k}, \end{aligned}$$

where c is a constant of order \(n^{-1}\) that can be estimated under \(H_0\). Moreover, c can be written in terms of moments of likelihood derivatives up to the fourth order (see Lawley 1956). In particular, \(P(LR^*\le x)=P(\chi ^2_k\le x)+O(n^{-2})\) under the null hypothesis. Further details on Bartlett corrections can be seen in Cordeiro and Cribari-Neto (2014).
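As a quick numerical illustration (our own sketch, not part of the paper), the corrected statistic and its \(\chi ^2_k\) p value can be computed as follows; `chi2_sf` implements the standard regularized lower incomplete gamma series, which is adequate for the moderate statistic values arising in testing.

```python
import math

def bartlett_corrected_lr(lr, c, k):
    # Bartlett-corrected statistic: LR* = LR / (1 + c/k)
    return lr / (1.0 + c / k)

def chi2_sf(x, k, terms=300):
    # Survival function P(chi^2_k > x) via the regularized lower
    # incomplete gamma series P(s, z), with s = k/2 and z = x/2.
    s, z = k / 2.0, x / 2.0
    if z <= 0.0:
        return 1.0
    term = 1.0 / s
    total = term
    for i in range(1, terms):
        term *= z / (s + i)
        total += term
    p_lower = total * math.exp(-z + s * math.log(z) - math.lgamma(s))
    return 1.0 - p_lower
```

For instance, `bartlett_corrected_lr(10.0, 0.5, 3)` gives approximately 8.571, and `chi2_sf(7.815, 3)` is approximately 0.05, matching the usual \(\chi ^2_3\) critical value at the 5% level.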

In what follows, we shall consider the case of heteroscedasticity with multiplicative effects, that is, the special case where \(m_\ell =\exp (\varvec{\omega _\ell }^\top {\varvec{\delta }})\). Thus, to test \(H_0: {\varvec{\delta }}=\varvec{\delta _0}\) against \(H_1: {\varvec{\delta }}\ne \varvec{\delta _0}\) in the class of HSNLM, the constant c from the Bartlett correction factor for the LR statistic can be expressed as

$$\begin{aligned} c = \epsilon _k({\varvec{\delta }})+\epsilon _{p,k}({\varvec{\beta }},{\varvec{\delta }})+\epsilon _{p,k}({\varvec{\delta }},\gamma )+\epsilon _{p,k}({\varvec{\beta }},{\varvec{\delta }},\gamma ), \end{aligned}$$
(4)

where

$$\begin{aligned} \epsilon _k({\varvec{\delta }})= & {} \frac{\varDelta _2}{4}\text{ tr }(\varvec{H_d^2}) + \frac{\varDelta _1^2}{6}\varvec{\iota }^\top \varvec{H^{(3)}}\varvec{\iota } + \frac{\varDelta _1^2}{4}\varvec{\iota }^\top \varvec{H}\varvec{H_d}\varvec{H}\varvec{\iota },\\ \epsilon _{p,k}({\varvec{\beta }},{\varvec{\delta }})= & {} - \frac{\varDelta _6}{4\delta _{(0,1,0,0,0)}}\varvec{\iota }^\top \varvec{Q}\varvec{H_d}\varvec{Z_{\beta d}}\varvec{\iota } - \frac{\varDelta _7}{4\delta _{(0,1,0,0,0)}}\varvec{\iota }^\top \varvec{H_d}\varvec{Z_{\beta d}}\varvec{\iota }\\&+ \frac{\delta _{(0,0,1,0,1)}}{2\delta _{(0,1,0,0,0)}}\varvec{\iota }^\top \varvec{Q}\varvec{H_d}\varvec{Z_{\beta d}}\varvec{\iota } + \varvec{\iota }^\top \varvec{Q}\varvec{H_d}\varvec{Z_{\beta d}}\varvec{\iota }\\&- \frac{\varDelta _6}{4\delta _{(0,1,0,0,0)}}\varvec{\iota }^\top \varvec{Q}\varvec{H_d}\varvec{Z_{\beta d}}\varvec{\iota } - \frac{\varDelta _7}{4\delta _{(0,1,0,0,0)}}\varvec{\iota }^\top \varvec{H_d}\varvec{Z_{\beta d}}\varvec{\iota }\\&+ \left( \frac{\varDelta _7^2}{2(\delta _{(0,1,0,0,0)})^2} - \frac{\varDelta _7}{\delta _{(0,1,0,0,0)}} \right) \varvec{\iota }^\top \varvec{Q}\varvec{Z_{\beta }}\odot \varvec{H}\odot \varvec{Z_{\beta }}\varvec{Q}\varvec{\iota }\\&+ \frac{\varDelta _7^2}{4(\delta _{(0,1,0,0,0)})^2}\varvec{\iota }^\top \varvec{Q}\varvec{Z_{\beta d}}\varvec{H}\varvec{Z_{\beta d}}\varvec{Q}\varvec{\iota }\\&+\frac{\varDelta _1 \varDelta _7}{2\delta _{(0,1,0,0,0)}}\varvec{\iota }^\top \varvec{Q}\varvec{Z_{\beta d}}\varvec{H}\varvec{H_d}\varvec{\iota },\\ \epsilon _{p,k}({\varvec{\delta }}, \gamma )= & {} -\frac{\varDelta _4\varDelta _8}{2} \text{ tr }(\varvec{H_d}) - \varDelta _1 \varDelta _4 \text{ tr }(\varvec{H_d})- \frac{\varDelta _1^2 \varDelta _4}{2}\varvec{\iota }^\top \varvec{H^{(2)}}\varvec{\iota }\\&- \frac{\varDelta _1^2 \varDelta _4}{4} [\text{ tr }(\varvec{H_d})]^2 + \left( \frac{\varDelta _1 \varDelta _5}{4} + \frac{\varDelta _1(\varDelta _5 - 4\varDelta _3)}{4} \right) \varDelta _4^2\,\text{ tr }(\varvec{H_d}) \quad \text{ and }\\ \epsilon _{p,k}({\varvec{\beta }},{\varvec{\delta }},\gamma )= & {} - \frac{\varDelta _1 \varDelta _4 \varDelta _7}{4\delta _{(0,1,0,0,0)}} \left[ (\varvec{\iota }^\top \varvec{W}\varvec{Z_{\beta d}} \varvec{\iota })\odot (\varvec{\iota }^\top \varvec{H_d}\varvec{\iota }) + \varvec{\iota }^\top \varvec{Q}\varvec{H_d}\varvec{Z_{\beta d}}\varvec{\iota } \right] , \end{aligned}$$

where \(\varDelta _\ell ,\,\,\ell =1,\ldots ,8\), are scalars expressed as

$$\begin{aligned} \varDelta _1= & {} -\frac{1}{8} \{\delta _{(0,0,1,0,3)} + 3\delta _{(0,1,0,0,2)} + \delta _{(1,0,0,0,1)}\},\\ \varDelta _2= & {} \frac{1}{16} \{ \delta _{(0,0,0,1,4)} + 6\delta _{(0,0,1,0,3)} + 7\delta _{(0,1,0,0,2)} + \delta _{(1,0,0,0,1)}\},\\ \varDelta _3= & {} - \frac{n}{2} \{2 + 3\delta _{(1,0,0,0,1)} + \delta _{(0,1,0,0,2)}\},\\ \varDelta _4= & {} \frac{4}{n \{2+\delta _{(0,1,0,0,2)} + 3\delta _{(1,0,0,0,1)} \}},\\ \varDelta _5= & {} -\frac{n}{8} \{8 + 15\delta _{(1,0,0,0,1)} + 9\delta _{(0,1,0,0,2)} + \delta _{(0,0,1,0,3)} \},\\ \varDelta _6= & {} \frac{1}{4} \{\delta _{(0,0,0,1,2)} + 3\delta _{(0,0,1,0,1)} \}, \\ \varDelta _7= & {} \frac{1}{2} \{\delta _{(0,0,1,0,1)} + 2\delta _{(0,1,0,0,0)} \} \ \text{ and } \\ \varDelta _8= & {} \frac{1}{16} \{ \delta _{(0,0,0,1,4)} +8\delta _{(0,0,1,0,3)} + 13\delta _{(0,1,0,0,2)} + 3\delta _{(1,0,0,0,1)} \}. \\ \end{aligned}$$

The \(\delta \)'s correspond to \(\delta _{(a,b,c,d,e)}=E\{t(z_\ell )^{(1)a}t(z_\ell )^{(2)b}t(z_\ell )^{(3)c}t(z_\ell )^{(4)d} z_\ell ^e \}\) for \(a,\,b,\,c,\,d,\,e \in \{0,1,2,3,4\}\) and \(t(z_\ell )^{(k)}=\frac{\partial ^k t(z_\ell )}{\partial z_\ell ^k}\) for \(k=1,2,3,4\). Values of the \(\delta \)'s for symmetric distributions studied in the literature can be found in Uribe-Opazo et al. (2008). In addition, \(\varvec{H}=\{ h_{\ell s} \} = - (\varvec{W} - {\bar{\varvec{W}}})[(\varvec{W} - {\bar{\varvec{W}}})^\top \varvec{V} (\varvec{W} - {\bar{\varvec{W}}})]^{-1}(\varvec{W} - {\bar{\varvec{W}}})^\top \), with \((\varvec{W} - {\bar{\varvec{W}}})=(\varvec{w}_1 - \bar{\varvec{w}},\ldots ,\varvec{w}_n - \bar{\varvec{w}})^\top \), \(\varvec{V}\) representing the diagonal matrix of order n with elements \(v_\ell =(1-\delta _{(0,1,0,0,2)})/4\) and \(\ell , s = 1,\ldots ,n.\) We also have \(\varvec{H^{(2)}}=(h_{\ell s}^2)\), \(\varvec{H^{(3)}}=(h_{\ell s}^3)\), \(\varvec{Q}=\text{ diag }(q_1,\ldots ,q_n)\), with \(q_\ell =\exp \{-(\varvec{\omega }_\ell - \bar{\varvec{\omega }})^\top {\varvec{\delta }} \}\) and \(\bar{\varvec{\omega }}=(\bar{\omega }_1,\ldots ,\bar{\omega }_k)^\top \), \(\varvec{Z_\beta }=\tilde{\varvec{X}}(\tilde{\varvec{X}}^\top \varvec{Q} \tilde{\varvec{X}})^{-1}\tilde{\varvec{X}}^\top ,\) an \(n \times k\) matrix \(\varvec{W}\) whose \(\ell j\)th element is \(\omega _{\ell j}\), and an \(n \times 1\) vector of ones represented by \(\varvec{\iota }.\) The subscript d in some matrices indicates that only the diagonal elements of those matrices are considered. Finally, the symbol \(\odot \) denotes the Hadamard (elementwise) product of matrices.
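To make these matrix quantities concrete, the sketch below (our illustration, in plain Python) builds \(\varvec{H}\) and its elementwise powers for the simplest case of a single dispersion covariate (\(k=1\)), where the bracketed inverse reduces to a scalar.

```python
def build_H(w, v):
    """H = -(W - Wbar)[(W - Wbar)' V (W - Wbar)]^{-1}(W - Wbar)' for a
    single dispersion covariate (k = 1): the bracketed term is the scalar
    sum(v_l * c_l^2), with c_l = w_l - wbar."""
    n = len(w)
    wbar = sum(w) / n
    c = [wl - wbar for wl in w]
    denom = sum(vl * cl * cl for vl, cl in zip(v, c))
    return [[-ci * cj / denom for cj in c] for ci in c]

def hadamard_power(H, r):
    # Elementwise power: H^(r) = (h_ls ** r)
    return [[h ** r for h in row] for row in H]

def trace_diag(H):
    # tr(H_d): sum of the diagonal elements of H
    return sum(H[i][i] for i in range(len(H)))
```

A useful sanity check: when all \(v_\ell \) equal a common value v, \(\text{ tr }(\varvec{H_d})=-1/v\), e.g. \(-4\) for \(v=1/4\).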

Therefore, the constant c involves only simple matrix operations and can be easily implemented in symbolic computation packages and programming languages that support basic linear algebra operations, such as Ox and R. Details about the derivation of the constant c are presented in Appendices B and C.

3.2 Bartlett correction for the \(LR_m\) statistic

The modified profile likelihood ratio statistic, \(LR_m\), like the usual likelihood ratio statistic LR, has a null asymptotic \(\chi ^2_k\) distribution up to an error of order \(n^{-1}\). DiCiccio and Stern (1994) proposed a Bartlett correction for \(LR_m\), reducing the error of the \(\chi ^2\) approximation to its null distribution to order \(n^{-2}\). The corrected statistic is defined by

$$\begin{aligned} LR_m^*=\frac{LR_m}{1+c_m/k}, \end{aligned}$$

where \(c_m\) is a constant of order \(n^{-1}\) such that, under the null hypothesis, the expected value of the corrected modified test statistic satisfies \(E(LR^*_m)=k + O(n^{-3/2})\). The general expression of \(c_m\) was given in DiCiccio and Stern (1994, Eq. 25). In this context, Ferrari et al. (2004, Eq. 5) obtained an expression for \(c_m\) in normal linear regression models that can be used in any class of models adopting the same partition of the parameter vector, provided parameter orthogonality holds. To test \(H_0\) in the HSNLM class under heteroscedasticity with multiplicative effects, the constant \(c_m\) of the correction factor for the \(LR_m\) statistic, presented in detail in Appendix D, can be written in matrix notation as

$$\begin{aligned} c_m= & {} \frac{1}{4} \varDelta _2 \text{ tr } (\varvec{H_d^2}) - \frac{1}{4} \varDelta _1^2 \varDelta _4 [\text{ tr }(\varvec{H_d})]^2 + \frac{1}{4} \varDelta ^2_1 \varvec{\iota }^\top \varvec{H_d H H_d} \varvec{\iota } + \frac{1}{6} \varDelta ^2_1 \varvec{\iota }^\top \varvec{H^{(3)}}\varvec{\iota } \nonumber \\&- \varDelta _1 \varDelta _4 \text{ tr }(\varvec{H_d}) - \varDelta _1 \varDelta _3 \varDelta _4^2 \text{ tr }(\varvec{H_d}) - \frac{1}{2} \varDelta ^2_1 \varDelta _4 \varvec{\iota }^\top \varvec{H^{(2)}}\varvec{\iota }. \end{aligned}$$
(5)

We can see that the constant \(c_m\) involves only simple matrix operations, like the constant c in (4) for the correction factor of the LR statistic. We also observe that \(c_m\) in (5) involves only the covariate matrix \(\varvec{W}\) (defined in Sect. 3.1), the number of unknown parameters in \(\phi _\ell \) and the number of observations; it does not depend on unknown parameters or on the number of nuisance parameters. Both c and \(c_m\) depend on the symmetric distribution considered, since the \(\delta \)'s change from one distribution to another.
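Given a precomputed \(\varvec{H}\) and the \(\varDelta \) scalars of the chosen symmetric distribution, constant (5) is only a few lines of code. The sketch below (ours, in plain Python) spells out each matrix quantity elementwise.

```python
def c_m_constant(H, d1, d2, d3, d4):
    """Constant c_m of Eq. (5), assuming a precomputed matrix H and the
    scalars Delta_1, ..., Delta_4 (d1..d4) of the chosen distribution."""
    n = len(H)
    Hd = [H[i][i] for i in range(n)]
    trHd = sum(Hd)                                   # tr(H_d)
    trHd2 = sum(h * h for h in Hd)                   # tr(H_d^2)
    sH2 = sum(H[i][j] ** 2 for i in range(n) for j in range(n))  # iota' H^(2) iota
    sH3 = sum(H[i][j] ** 3 for i in range(n) for j in range(n))  # iota' H^(3) iota
    # iota' H_d H H_d iota = sum over i, j of h_ii * h_ij * h_jj
    sHdHHd = sum(Hd[i] * H[i][j] * Hd[j] for i in range(n) for j in range(n))
    return (0.25 * d2 * trHd2 - 0.25 * d1 ** 2 * d4 * trHd ** 2
            + 0.25 * d1 ** 2 * sHdHHd + d1 ** 2 * sH3 / 6.0
            - d1 * d4 * trHd - d1 * d3 * d4 ** 2 * trHd
            - 0.5 * d1 ** 2 * d4 * sH2)
```

The function mirrors (5) term by term; a mechanical check with the \(1 \times 1\) matrix \(\varvec{H}=[-1]\) and all \(\varDelta \)'s equal to 1 returns \(13/12\).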

3.3 Bootstrap Bartlett correction

As an alternative to the asymptotic likelihood ratio and modified profile likelihood ratio tests, we can carry out inference based on a test whose critical values (p values) are obtained from the bootstrap technique introduced by Efron (1979). The bootstrap likelihood ratio test (\(LR_{boot}\)) offers reliable inference and does not involve complex calculations; however, it is computationally very intensive. Using the bootstrap as a numerical alternative to the analytical Bartlett correction factor, Rocke (1989) proposed the bootstrap Bartlett corrected likelihood ratio statistic (\(LR^*_{boot}\)), which is obtained as follows. Initially, we generate B bootstrap resamples \((y_1^*,\ldots ,y_B^*)\) from the assumed model under \(H_0,\) replacing the unknown parameter vector with its restricted estimates (i.e., the estimates obtained under the null hypothesis), calculated using the original sample \((y_1,\ldots ,y_n)\). After that, we calculate the LR statistic for each pseudo sample \(y_1^*,\ldots ,y^*_B\), denoted by \(LR^b_{boot}, \ b=1,\ldots ,B\). The bootstrap Bartlett corrected likelihood ratio statistic is obtained as

$$\begin{aligned} LR^*_{boot}=\frac{LR}{\overline{LR}^*_{boot}}k, \end{aligned}$$

where \(\overline{LR}^*_{boot}=\frac{1}{B}\sum ^B_{b=1}LR^b_{boot}\) and k is the number of restrictions imposed by \(H_0\). Under the null hypothesis, \(LR^*_{boot}\) has asymptotic \(\chi ^2_k\) distribution (Rocke 1989).

The \(LR_{boot}\) test does not rely on the \(\chi ^2\) distribution. Instead, it is performed as follows: for a fixed nominal level \(\alpha ,\) we compute the \(1-\alpha \) percentile of the bootstrap statistics, estimated by \(\hat{q}_{(1-\alpha )}\) such that \(\#\{LR^b_{boot} \le \hat{q}_{(1-\alpha )}\}/B= 1-\alpha ,\) with \(\#\) denoting the cardinality of the set. Then, we reject the null hypothesis if \(LR > \hat{q}_{(1-\alpha )}.\) Alternatively, the decision rule can be based on the bootstrap p value given by \(p^*= \#\{LR^b_{boot} \ge LR \}/B.\)

Recent works have developed inference based on these tests; see for example Bayer and Cribari-Neto (2013), Cribari-Neto and Queiroz (2014), and Loose et al. (2016). An advantage of the bootstrap Bartlett correction over the usual bootstrap test is its computational efficiency: obtaining a critical value with the bootstrap Bartlett correction requires considerably fewer resamples than the usual bootstrap technique, which makes it computationally more efficient (Rocke 1989).
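The scheme above is straightforward to implement. The Python sketch below is our illustration on a toy model, not the paper's HSNLM code: it tests \(H_0: \sigma ^2=1\) in a zero-mean normal sample, where the LR statistic has the closed form \(n({\hat{\sigma }}^2 - 1 - \log {\hat{\sigma }}^2)\) with \({\hat{\sigma }}^2 = n^{-1}\sum y_\ell ^2\) and \(k=1\).

```python
import math
import random

def lr_variance(y):
    # Closed-form LR statistic for H0: sigma^2 = 1, zero-mean normal sample
    n = len(y)
    s2 = sum(v * v for v in y) / n
    return n * (s2 - 1.0 - math.log(s2))

def bootstrap_bartlett_lr(y, B=500, k=1, seed=0):
    """Rocke's (1989) scheme: resample under H0, average the bootstrap LR
    statistics, and rescale the observed LR so its null mean matches k.
    Also returns the bootstrap p-value #{LR_b >= LR}/B."""
    rng = random.Random(seed)
    n = len(y)
    lr_obs = lr_variance(y)
    lr_boot = [lr_variance([rng.gauss(0.0, 1.0) for _ in range(n)])  # model under H0
               for _ in range(B)]
    lr_bar = sum(lr_boot) / B
    p_boot = sum(lrb >= lr_obs for lrb in lr_boot) / B
    return lr_obs * k / lr_bar, p_boot
```

For the HSNLM setting, `lr_variance` would be replaced by the profile-likelihood LR of Sect. 2 and the resamples generated from model (2) with the restricted estimates, but the correction logic is identical.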

Table 2 Null rejection rates: \(t_5\) model with \(p=3,5\), \(k=3\) and several values for n

4 Simulation results

In this section, we present Monte Carlo simulation results comparing the performance of seven tests in HSNLM in small and moderate-sized samples: the likelihood ratio test (LR); the modified profile likelihood ratio test (\(LR_m\)); the bootstrap likelihood ratio test (\(LR_{boot}\)); their corrected versions, denoted by \(LR^*\), \(LR^*_m\) and \(LR^*_{boot}\), respectively; and the score test (\(S_r\)). As is well known, an advantage of the \(S_r\) test over the others is its easy implementation, since it involves estimation under the null hypothesis only. The number of Monte Carlo replications was 10,000 and for each Monte Carlo replication we performed 1000 bootstrap replications. All simulations were performed using the programming language Ox (Doornik 2006). We considered the heteroscedastic symmetric nonlinear regression model given by

$$\begin{aligned} y_\ell = \beta _0 + \exp \{\beta _1 x_{\ell 1} \} + \sum _{s=2}^{p-1} \beta _s x_{\ell s} + \epsilon _\ell , \quad \ \ell =1,\ldots ,n, \end{aligned}$$

where \(\epsilon _\ell \sim S(0, \sigma ^2\exp \{\varvec{\omega _\ell }^\top {\varvec{\delta }}\}, g)\). The covariates \(x_1, \ldots , x_{p-1}\) and \(\omega _1,\ldots ,\omega _q\) were obtained as random draws from the uniform U(0, 1) distribution and their values were kept fixed during the simulations. We considered two symmetric distributions for the errors, namely Student-t with 5 degrees of freedom (\(\nu \)) and power exponential with shape parameter \(\kappa =0.3\). The hypothesis to be tested is \(H_0:\delta _1=\cdots =\delta _q=0\) against \(H_1:\delta _i \ne 0\) for at least one \(i, \ i=1,\ldots ,q\). The true values of the parameters for the simulations were taken as \(\beta _0=\cdots =\beta _{p-1}=1, \ \sigma ^2=1, \ \delta _1=0.1, \delta _2=0.3, \delta _3=0.5\) and \(\delta _4=\delta _5=1.0\). The null hypothesis was tested for sample sizes 30, 35, 40, 50 and 100, considering the three nominal levels \(\alpha =10, \ 5 \ \text{ and } \ 1\%\).

Table 3 Null rejection rates: power exponential model with \(\kappa =0.3\), \(p=3,5\), \(k=3\) and several values for n

The null rejection rates of the seven tests for different sample sizes are presented in Tables 2 and 3. These tables show that the likelihood ratio test is substantially oversized; for example, in Table 2, when \(p=5\), \(\alpha =5\%\) and considering all sample sizes (\(n=30,40,50\) and 100), the null rejection rates of the LR test are \(22.6, 16.8, 12.3 \ \text{ and } \ 7.7\%,\) respectively. The corrected version of the likelihood ratio test attenuates the oversized behavior of the usual likelihood ratio test, but still presents rejection rates above the nominal levels. For example, considering again \(p=5, \alpha =5\%\) and all sample sizes, the \(LR^*\) test presents the following null rejection rates: \(12.9, 8.8, 7.4 \ \text{ and } \ 5.8\%\), respectively. In general, for both tests, the distortion decreases as the sample size increases.

The simulation results for the \(t_5\) model shown in Table 2 indicate that the \(S_r\) test and the corrected tests \(LR^*_m\) and \(LR^*_{boot}\) present better results than the other ones. Furthermore, for \(p=5 \ \text{ and } \ \alpha =5\%\), the rejection rates for the \(S_r\) test considering all four sample sizes are, respectively, 5.5, 5.5, 5.4 and \(5.1\%\) and the corresponding rates for the \(LR^*_m\) test are 4.5, 5.2, 4.6 and \(5.2\%\) and for the \(LR^*_{boot}\) test are 5.8, 4.8, 5.2 and \(4.7\%\). Table 3 presents the results for the power exponential model. We can observe that, in general, the tests \(LR_{boot}\) and \(LR^*_{boot}\) perform better than the others, followed by the \(S_r\) test. For example, when \(p=5 \ \text{ and } \ \alpha =10\%\), the rejection rates for the \(LR_{boot}\) test considering all four sample sizes are 9.9, 10.6, 9.3 and \(9.6\%\), while for the \(LR^*_{boot}\) test they are 9.8, 10.7, 9.3 and \(9.7\%\), and for the \(S_r\) test they are 11.1, 10.4, 10.8 and \(10.3\%.\)

Table 4 Null rejection rates: the \(t_5\) model and power exponential \(\kappa =0.3\) with \(n=35\), \(k=3\) and multiple values for p

In Table 4 we present the null rejection rates of the tests and evaluate the effect of the number of nuisance parameters on their performance, fixing the sample size (\(n=35\)) and the number of parameters of interest (\(k=3\)) while varying the number of nuisance parameters (\(p= 2, 3, 4 \) and 5). For both models considered, the usual LR test and its corrected version are quite distorted, and the distortion of the \(LR \ \text{ and } \ LR^*\) tests increases with the number of nuisance parameters. In contrast, the other tests are essentially insensitive to the number of nuisance parameters, and the bootstrap tests \(LR_{boot}\) and \(LR^*_{boot}\) performed best. For example, for the \(t_5\) model with \(p=4 \ \text{ and } \ \alpha =5\%\), the null rejection rates of the tests are \(15.8\% \ (LR)\), \(8.8\% \ (LR^*)\), \(4.5\% \ (LR_m)\), \(4.7\% \ (LR^*_m)\), \(5.0\% \ (LR_{boot})\), \(4.8\% \ (LR^*_{boot})\) and \(4.3\% \ (S_r)\). In the same scenario, the power exponential model yields the following null rejection rates: \(14.7\% \ (LR)\), \(8.8\% \ (LR^*)\), \(3.4\% \ (LR_m)\), \(4.1\% \ (LR^*_m)\), \(5.2\% \ (LR_{boot})\), \(5.2\% \ (LR^*_{boot})\) and \(4.9\% \ (S_r)\).

Table 5 Null rejection rates: \(t_5\) model and power exponential \(\kappa =0.3\) with \(n=35\), \(p=3\) and different values of k

Table 5 reports results for the situation where \(n=35\), \(p=3\) and \(k=2,3,4 \ \text{ and } \ 5.\) The tests’ performances are similar to those shown in Table 4, with the \(LR^*_{boot}\) test outperforming the other ones, especially in the power exponential model. For example, considering the scenario where \(k=3\) and \(\alpha =1\%\), the null rejection rates for the power exponential model are \(3.9\% \ (LR)\), \(1.1\% \ (LR^*)\), \(0.7\% \ (LR_m)\), \(0.9\% \ (LR^*_m)\), \(1.1\% \ (LR_{boot})\), \(1.2\% \ (LR^*_{boot})\) and \(1.1\% \ (S_r)\). Considering the same scenario for the \(t_5\) model, we have \(4\% \ (LR)\), \(1.8\% \ (LR^*)\), \(0.8\% \ (LR_m)\), \(0.8\% \ (LR^*_m)\), \(1.1\% \ (LR_{boot})\), \(0.9\% \ (LR^*_{boot})\) and \(0.9\% \ (S_r)\).

The numerical results presented in Tables 2, 3, 4 and 5 show that the corrected tests outperform the uncorrected tests in small and moderate sample sizes, except for the \(S_r\) test, which performs as well as the corrected tests in the indicated scenarios. Moreover, in some cases the \(S_r\) test also outperforms the \(LR_m^*\) test, one of the best performing tests. The simulation results showed that the \(LR_{boot}^*\) and \(LR^*_m\) tests are the best performing corrected tests, followed by the \(S_r\) test. The \(LR^*\) test attenuates the oversized behavior of the LR test but still presents distorted rejection rates, especially as the number of parameters in the model increases.

Table 6 Non-null rejection rates: \(t_5\) model and power exponential \(\kappa =0.3\) with \(n=35\), \(\alpha =10\%\), \(p=3\) and \(k=3\)

Table 6 presents the tests’ non-null rejection rates, i.e., their power. The data were generated using different values of \(\delta .\) We only considered the tests \(LR_m\), \(LR_m^*\), \(LR_{boot}\), \(LR^*_{boot}\) and \(S_r;\) the usual and Bartlett-corrected likelihood ratio tests are not included in the power comparison since they are considerably size distorted. The results in Table 6 show the expected pattern: the tests become more powerful as \(\delta \) moves away from zero. The differences in power are very small. For the \(t_5\) model, the \(LR_{boot}\), \(LR^*_{boot}\) and \(S_r\) tests are slightly more powerful than the others; for the power exponential model, the \(LR_{boot}\) and \(LR^*_{boot}\) tests are slightly more powerful than the others.

Fig. 1 Quantile relative discrepancies for the \(t_5\) and power exponential models with \(p=3, \ k=3\) (panels a and c) and \(p=5, \ k=3\) (panels b and d)

Figure 1a–b shows the relative quantile discrepancies of the test statistics against the corresponding asymptotic (\(\chi ^2\)) quantiles for the \(t_5\) model, and Fig. 1c–d for the power exponential model, considering \(n=35, \ p=3,5 \ \text{ and } \ k=3\). The relative quantile discrepancy is defined as the difference between the exact quantile (estimated by simulation) and the asymptotic quantile, divided by the asymptotic quantile. We consider the score test and the corrected tests because of their performances. The closer a curve is to the zero ordinate, the better the test statistic’s null distribution is approximated by the reference \(\chi ^2\) distribution. For both models, it is clear that the null distribution of the corrected likelihood ratio statistic is not well approximated by the reference \(\chi ^2\) distribution in any of the considered scenarios. In contrast, the null distributions of the score statistic, the corrected modified profile likelihood ratio statistic and the bootstrap Bartlett corrected likelihood ratio statistic are well approximated by the reference \(\chi ^2\) distribution, since their quantile discrepancy curves lie very close to the zero ordinate, as reflected in the performance of these tests in Tables 2, 3, 4 and 5.
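The discrepancy curves of Fig. 1 can be computed directly from the simulated statistics by comparing empirical quantiles with the corresponding \(\chi ^2_k\) quantiles. A minimal sketch (function and variable names are ours):

```python
import numpy as np
from scipy.stats import chi2

def relative_quantile_discrepancy(stats, k, probs):
    """(exact quantile - asymptotic quantile) / asymptotic quantile,
    with the exact quantile estimated by the empirical quantile of the
    simulated statistics and the asymptotic quantile taken from chi2_k."""
    exact = np.quantile(np.asarray(stats), probs)
    asym = chi2.ppf(probs, df=k)
    return (exact - asym) / asym

# If the statistic really followed chi2_k, the curve would hover near zero.
rng = np.random.default_rng(1)
stats = rng.chisquare(df=3, size=200_000)
probs = np.linspace(0.05, 0.95, 19)
disc = relative_quantile_discrepancy(stats, k=3, probs=probs)
```

Plotting `disc` against `chi2.ppf(probs, df=3)` for each test statistic reproduces the kind of curves shown in the figure.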

5 An illustrative example

In this section, we apply the test methods presented in the previous sections to a real dataset. The computer code for computing these statistics can be requested from us by email. The data refer to the weight of the eye lens of the European rabbit (Oryctolagus cuniculus) in Australia, y,  in mg, and the age of the animal, x,  in days, in a sample of 71 observations. These data were analyzed by Wei (1998, Example 6.8), who verified the suspicion of two aberrant points under least squares estimation, indicating that the dataset supports errors with heavier tails than the normal. Cysneiros et al. (2005) also analyzed this dataset under the Student-t distribution with 4 degrees of freedom, the degrees of freedom having been chosen as those of the model with the lowest AIC. The residual plot (see Cysneiros et al. 2005, Sect. 3.2) shows that the residuals are not uniformly distributed around zero, giving evidence of heteroscedasticity. Motivated by this, we decided to consider a more general model than the one fitted by Cysneiros et al. (2005), introducing a regression structure for modeling the dispersion.

We consider the following heteroscedastic model

$$\begin{aligned} y_\ell = \exp \left( \beta _0 - \frac{\beta _1}{x_\ell + \beta _2} \right) e^{\epsilon _\ell }, \end{aligned}$$

where \(\epsilon _\ell \sim S(0,\sigma ^2 \exp \{\delta x_\ell \}, g)\), \(\ell =1,\ldots ,71.\) Our main goal now is to test \(H_0: \delta =0\) (homoscedasticity) against a two-sided alternative (heteroscedasticity).

To test \(H_0,\) the observed values of the test statistics are \(LR=8.368\) (p-value: 0.004), \(LR^*=7.865\) (p-value: 0.005), \(LR_m=8.919\) (p-value: 0.003), \(LR_m^*=8.871\) (p-value: 0.003), \(S_r=6.766\) (p-value: 0.009) and \(LR^*_{boot}=5.777\) (p-value: 0.016). The p-value of the bootstrap likelihood ratio test is 0.026. Hence, at the \(1\%\) nominal level, the bootstrap-based tests do not reject the null hypothesis, that is, they indicate that the dispersion is constant over the observations (homoscedasticity), while all other tests reject the null hypothesis, pointing to heteroscedasticity. Recalling the previous section and the literature, the LR and \(S_r\) tests are size distorted when we deal with small or even moderate-sized samples, so we should not rely on the inference they deliver here. Recall also from our simulation study that the bootstrap-based tests performed best in most scenarios and were not affected by the sample size or by the numbers of interest and nuisance parameters; hence, the bootstrap-based tests should be preferred.
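Both bootstrap-based decisions above rest on replicates of the LR statistic generated under \(H_0\). A hedged sketch of the two ingredients, assuming the vector of bootstrap replicates is already available (function names are ours, and the form of the bootstrap Bartlett rescaling shown is the standard one, not necessarily the paper's exact implementation):

```python
import numpy as np
from scipy.stats import chi2

def bootstrap_lr_pvalue(lr_obs, lr_boot):
    """Empirical bootstrap p-value: share of replicates (generated
    under H0) at least as large as the observed statistic."""
    return np.mean(np.asarray(lr_boot) >= lr_obs)

def bootstrap_bartlett(lr_obs, lr_boot, k):
    """Bootstrap Bartlett correction: rescale LR so that its mean
    matches the chi2_k mean k, then use the chi2_k upper tail."""
    lr_star = lr_obs * k / np.mean(lr_boot)
    return lr_star, chi2.sf(lr_star, df=k)

# Toy illustration with k = 1 (as in the test of H0: delta = 0),
# using synthetic replicates in place of real bootstrap output.
rng = np.random.default_rng(2)
lr_boot = rng.chisquare(df=1, size=5000)
p_boot = bootstrap_lr_pvalue(8.368, lr_boot)
lr_star, p_star = bootstrap_bartlett(8.368, lr_boot, k=1)
```

In practice `lr_boot` would be obtained by refitting the model to samples simulated from the fitted null (homoscedastic) model and recomputing LR for each.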

6 Concluding remarks

Symmetric models have received increasing attention in the statistical literature in recent years, largely because such models are less sensitive than the normal model to the presence of outlying observations in the data being modeled. Many works have addressed the class of symmetric models. Cysneiros et al. (2010a) proposed the HSNLM class, whose parameters are estimated by numerically maximizing the log-likelihood function, since the maximum likelihood estimators do not have closed form. In this class of models, hypothesis testing of model parameters is usually based on the likelihood ratio test, which relies on first-order asymptotic approximations. Thus, for small and even moderate sample sizes, the null distribution of the likelihood ratio statistic is not well approximated by the reference \(\chi ^2\) distribution and, as a consequence, the test shows distorted rejection rates, making it important to develop strategies that yield more accurate inferences when the sample size is not large.

In this paper, we presented Bartlett corrections to improve hypothesis tests based on the likelihood ratio and modified profile likelihood ratio statistics in the HSNLM class. Our work extends some results presented by Cordeiro (2004) and Ferrari et al. (2004), who obtained Bartlett correction factors for the likelihood ratio statistic in symmetric nonlinear models and for the modified profile likelihood ratio statistic in normal linear models, respectively. In order to compare the performance of the tests, we also considered the score test and the bootstrap likelihood ratio and bootstrap Bartlett corrected likelihood ratio tests.

Numerical results showed that the usual likelihood ratio test is somewhat oversized and that its corrected version, although attenuating this tendency, still has distorted rejection rates that increase with the number of parameters in the model. The simulation results also showed that inference based on the modified profile likelihood ratio test outperforms the usual test and is not affected by increases in the number of model parameters. Moreover, the numerical evidence showed the better performance of the corrected tests and of the uncorrected score test. The score test is very simple to compute, since it involves only estimation under the null hypothesis. In particular, the numerical results showed the superior performance of the corrected modified profile likelihood ratio test and of both bootstrap-based tests in small and moderate sample sizes. Thus, we encourage practitioners to use these tests in applications.