1 Introduction

Zero-adjusted regression models (ZAR models) are often used to fit variables that are discrete at zero and continuous at some interval of the positive real numbers. They are used in many areas such as insurance (Bortoluzzo et al. 2011), botany (Thomson et al. 2018), credit risk (Tong et al. 2016), microbiology (Rocha et al. 2017), biodiversity (Rubec et al. 2016) and meteorology (Zamani and Bazrafshan 2020). ZAR models are also known as zero-augmented regression models (Nogarotto et al. 2020) or zero-inflated regression models. The last expression is used especially when the continuous component of the response variable is beta distributed (Ospina and Ferrari 2012). However, zero-inflated regression models usually refer to models in which the response variable is discrete with more zeros than expected by a known probability distribution (Lambert 1992). Recent works related to ZAR models include Tomazella et al. (2019); Calsavara et al. (2019); Hashimoto et al. (2019); Pereira et al. (2020); Michaelis et al. (2020); Ye et al. (2021) and Silva et al. (2021).

Zero-adjusted generalized linear models (ZAGLMs) are a subclass of ZAR models in which the continuous component of the regression model is a generalized linear model (Dunn and Smyth 2018). The main members of the class of ZAGLMs are the zero-adjusted gamma regression models (ZAGA regression models, Tong et al. 2013) and the zero-adjusted inverse Gaussian regression models (ZAIG regression models, Heller et al. 2006).

Hypothesis testing inference in the class of ZAGLMs is usually performed using the likelihood ratio (LR) statistic, especially when the null hypothesis of interest involves more than one parameter. Under the null hypothesis, the LR statistic has an asymptotic chi-squared distribution (Sen et al. 2010). However, in many regression models, the chi-squared distribution is not a good approximation of the null distribution of the LR statistic when the sample size is small (Melo et al. 2009; Pereira and Cribari-Neto 2014). As a consequence, in these cases, the test based on the LR statistic is often size distorted.

One way to improve the chi-squared approximation to the exact null distribution of the LR statistic is to use the Bartlett correction (Bartlett 1937; Lawley 1956). The Bartlett correction is usually effective in bringing the true size of the test closer to the nominal level (Botter and Cordeiro 1997). In practice, the type I error rate of the corrected statistic is expected to be closer to the chosen nominal level (usually 1%, 5% or 10%) than that of the original statistic. Many authors have presented Bartlett correction factors for specific regression models. Cordeiro (1983) derived the Bartlett correction factor for generalized linear models and Botter and Cordeiro (1997) extended it to double generalized linear models. This correction was derived for mixed linear models and for beta regression by Melo et al. (2009) and Bayer and Cribari-Neto (2013), respectively. Moulton et al. (1993) and Das et al. (2018) showed that the Bartlett correction also improves the LR statistic in logistic regression. Recent works related to this topic include Loose et al. (2018); Araújo et al. (2020); Magalhães and Gallardo (2020); Rauber et al. (2020); Guedes et al. (2020, 2021) and Melo et al. (2022).

Few works have proposed small-sample adjustments to the LR statistic in ZAR models. Pereira and Cribari-Neto (2014) derived a correction for the LR statistic known as Skovgaard’s adjustment (Skovgaard 2001) for zero-adjusted beta regressions. For the same model, Loose et al. (2017) proposed a Bartlett correction based on bootstrap (Efron 1979), instead of the traditional analytical correction. To the best of our knowledge, no study has proposed small-sample adjustments to the LR statistic in ZAGLMs. Moreover, previous works involving ZAR models have not studied the behavior of the adjusted LR statistic when the null hypothesis of interest involves parameters of more than one submodel of the ZAR model. In practice, it is often desirable to test whether the distribution of the response variable is related to a given covariate. In these cases, the null hypothesis involves parameters of all ZAGLM submodels.

The chief goal of our paper is to improve the LR statistic in the class of ZAGLMs. Two approaches are used. First, we derive an analytical Bartlett correction of the LR statistic. Second, we propose two bootstrap-based adjustments to the LR statistic (Cordeiro and Cribari-Neto 2014). The performance of the adjusted statistics in small and medium-sized samples is compared with that of the usual LR statistic through extensive Monte Carlo simulation studies.

The remainder of the paper is organized as follows. Section 2 defines the ZAGLMs and presents some of their inferential aspects. The adjusted LR statistic is derived in Sect. 3, where the bootstrap corrections are also described. In Sect. 4, Monte Carlo simulation studies are performed to compare the finite sample behavior of the different LR statistics. Section 5 presents an application to real data. Concluding remarks are provided in Sect. 6.

2 Model

Suppose that the univariate random variable \(Y \in \{0\} \cup (0, \infty )\) has a density of the following form:

$$\begin{aligned} g(y; \pi , \theta , \phi ) = \left\{ \begin{array}{ll} 0 &{} \text{ if } y < 0, \\ \pi &{} \text{ if } y = 0, \\ (1 - \pi ) f(y; \theta , \phi ) &{} \text{ if } y > 0, \end{array} \right. \end{aligned}$$
(1)

where \(\pi = \mathbb {P}(Y = 0)\) and \(f(y; \theta , \phi )\) is a probability density function (PDF) of a positive continuous random variable. The expression (1) can be written as:

$$\begin{aligned} g(y; \pi , \theta , \phi ) = \left\{ \pi ^{\mathbb {I}_{\{0\}}^{(y)}} (1 - \pi )^{1 - \mathbb {I}_{\{0\}}^{(y)}} \right\} \left\{ f(y; \theta , \phi )^{1 - \mathbb {I}_{\{0\}}^{(y)}} \right\} , \end{aligned}$$
(2)

with

$$\begin{aligned} \mathbb {I}_{\{0\}}^{(y)} = \left\{ \begin{array}{ll} 1 &{} \text{ if } y = 0, \\ 0 &{} \text{ otherwise }. \end{array} \right. \end{aligned}$$

In our work, we define

$$\begin{aligned} f(y; \theta , \phi ) = \exp \left\{ \phi [y \theta - b(\theta ) + c(y)] + d_1(y) + d_2(\phi ) \right\} , \end{aligned}$$
(3)

where \(b(\cdot )\), \(c(\cdot )\), \(d_1(\cdot )\) and \(d_2(\cdot )\) are known functions, i.e., \(f(\cdot ; \theta , \phi )\) is the PDF of a member of the exponential family (EF) with parameters \(\theta \) and \(\phi \), the canonical and the precision parameters, respectively (the inverse, \(\phi ^{-1}\), is the dispersion parameter). If \(Z \sim \) EF(\(\theta , \phi \)), then \(\mathbb {E}(Z) = db(\theta )/d\theta = \mu \) and Var\((Z) = \phi ^{-1} d^2b(\theta )/d\theta ^2 = \phi ^{-1} V(\mu )\), where \(V = V(\mu )\) is the variance function. Note that the first factor in (2), which depends on \(y\) only through \(\mathbb {I}_{\{0\}}^{(y)}\), is the probability function of a Bernoulli distribution with success probability \(\pi \), which is also a member of the EF. It follows from (1) that \(\mathbb {E}(Y) = (1-\pi )\mu \) and Var\((Y) = (1-\pi )[\pi \mu ^2 + \phi ^{-1} V(\mu )]\). Table 1 lists the quantities appearing in (3) for the EF distributions used in this paper.
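The moment formulas above can be checked numerically. The following is a minimal sketch for the zero-adjusted gamma case only, assuming the gamma component is parameterized by its mean \(\mu \) and precision \(\phi \) (shape \(\phi \), scale \(\mu /\phi \)), so that \(V(\mu ) = \mu ^2\). The function names `zaga_moments` and `zaga_sample` are hypothetical and introduced only for illustration.

```python
import random


def zaga_moments(pi, mu, phi):
    """Mean and variance of a zero-adjusted gamma variable.

    Assumption: the gamma component has mean mu and precision phi,
    hence V(mu) = mu**2 and Var(Z) = mu**2 / phi.
    """
    mean = (1.0 - pi) * mu
    var = (1.0 - pi) * (pi * mu**2 + mu**2 / phi)
    return mean, var


def zaga_sample(pi, mu, phi, rng):
    """Draw one value: 0 with probability pi, otherwise gamma(shape=phi, scale=mu/phi)."""
    if rng.random() < pi:
        return 0.0
    return rng.gammavariate(phi, mu / phi)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [zaga_sample(0.3, 10.0, 4.0, rng) for _ in range(100000)]
    sample_mean = sum(draws) / len(draws)
    sample_var = sum((d - sample_mean) ** 2 for d in draws) / len(draws)
    print(zaga_moments(0.3, 10.0, 4.0), sample_mean, sample_var)
```

With \(\pi = 0.3\), \(\mu = 10\) and \(\phi = 4\), the formulas give \(\mathbb {E}(Y) = 7\) and Var\((Y) = 38.5\); the simulated moments should agree with these values up to Monte Carlo error.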

Table 1 Important positive distributions belonging to the exponential family

We consider that (1) has three systematic components, which are parameterized as \(\mu = \mu ({{\varvec{\beta }}})\), \(\phi = \phi ({{\varvec{\delta }}})\) and \(\pi = \pi ({{\varvec{\gamma }}})\). The systematic components are:

$$\begin{aligned} \eta _1 = h_1(\mu ) = {{\varvec{x}}}^{\top } {{\varvec{\beta }}}, \ \eta _2 = h_2(\phi ) = {{\varvec{t}}}^{\top } {{\varvec{\delta }}}, \ \eta _3 = h_3(\pi ) = {{\varvec{s}}}^{\top } {{\varvec{\gamma }}}, \end{aligned}$$
(4)

where \(h_1(\cdot )\), \(h_2(\cdot )\) and \(h_3(\cdot )\) are known link functions, assumed to be one-to-one and four times continuously differentiable, \(\eta _1\), \(\eta _2\) and \(\eta _3\) are the linear predictors, \({{\varvec{x}}} = (x_1, \ldots , x_{p_{1}})^{\top }\), \({{\varvec{t}}} = (t_1, \ldots , t_{p_{2}})^{\top }\) and \({{\varvec{s}}} = (s_1, \ldots , s_{p_{3}})^{\top }\) are specified regressor vectors, and \({{\varvec{\beta }}} = (\beta _1, \ldots , \beta _{p_{1}})^{\top }\), \({{\varvec{\delta }}} = (\delta _1, \ldots , \delta _{p_{2}})^{\top }\) and \({{\varvec{\gamma }}} = (\gamma _1, \ldots , \gamma _{p_{3}})^{\top }\) are vectors of unknown parameters to be estimated. We also assume that \({{\varvec{\beta }}}\), \({{\varvec{\delta }}}\) and \({{\varvec{\gamma }}}\) are functionally independent of each other.

Consider \(Y_{1}\), \(\ldots \), \(Y_{n}\) independent random variables from (1) and the parameter vector \({{\varvec{\lambda }}} = ({{\varvec{\beta }}}^{\top }, {{\varvec{\delta }}}^{\top }, {{\varvec{\gamma }}}^{\top })^{\top }\). The logarithm of the likelihood (log-likelihood) function based on a sample of n independent observations is given by

$$\begin{aligned} l({{\varvec{\lambda }}}) = l_1({{\varvec{\beta }}}, {{\varvec{\delta }}}) + l_2({{\varvec{\gamma }}}), \end{aligned}$$
(5)

where

$$\begin{aligned} l_1({{\varvec{\beta }}}, {{\varvec{\delta }}}) = \sum _{\ell : y_{\ell } \in (0, \infty )} \left\{ \phi _{\ell } \left[ y_{\ell } \theta _{\ell } - b \left( \theta _{\ell } \right) + c \left( y_{\ell } \right) \right] + d_1 \left( y_{\ell } \right) + d_2 \left( \phi _{\ell } \right) \right\} \end{aligned}$$

and

$$\begin{aligned} l_2({{\varvec{\gamma }}}) = \sum _{\ell = 1}^{n} \left\{ \mathbb {I}_{\{0\}}^{(y_{\ell })} \log \left( \frac{\pi _{\ell }}{1 - \pi _{\ell }} \right) + \log (1 - \pi _{\ell })\right\} . \end{aligned}$$
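The decomposition in (5) can be illustrated for the gamma case. The sketch below assumes the standard exponential-family form of the gamma distribution with mean \(\mu \) and precision \(\phi \), namely \(\theta = -1/\mu \), \(b(\theta ) = -\log (-\theta )\), \(c(y) = \log y\), \(d_1(y) = -\log y\) and \(d_2(\phi ) = \phi \log \phi - \log \Gamma (\phi )\). The function name `zaga_loglik` is hypothetical, and the fitted values \(\mu _{\ell }\), \(\phi _{\ell }\), \(\pi _{\ell }\) are passed in directly rather than computed from linear predictors.

```python
import math


def zaga_loglik(y, mu, phi, pi):
    """Return (l1, l2) for a zero-adjusted gamma sample.

    l1 sums the exponential-family log-density over the positive
    observations only; l2 is the Bernoulli log-likelihood of the
    zero indicator over all n observations.
    """
    l1 = 0.0
    l2 = 0.0
    for y_l, mu_l, phi_l, pi_l in zip(y, mu, phi, pi):
        is_zero = 1.0 if y_l == 0 else 0.0
        l2 += is_zero * math.log(pi_l / (1.0 - pi_l)) + math.log(1.0 - pi_l)
        if y_l > 0:
            theta = -1.0 / mu_l          # canonical parameter (gamma, assumed form)
            b = math.log(mu_l)           # b(theta) = -log(-theta)
            c = math.log(y_l)
            d1 = -math.log(y_l)
            d2 = phi_l * math.log(phi_l) - math.lgamma(phi_l)
            l1 += phi_l * (y_l * theta - b + c) + d1 + d2
    return l1, l2


if __name__ == "__main__":
    y = [0.0, 2.0, 0.0, 5.0]
    l1, l2 = zaga_loglik(y, [3.0, 3.0, 4.0, 4.0], [2.0] * 4, [0.2, 0.3, 0.4, 0.5])
    print(l1, l2, l1 + l2)
```

Because \(l_1\) does not involve \({{\varvec{\gamma }}}\) and \(l_2\) does not involve \(({{\varvec{\beta }}}, {{\varvec{\delta }}})\), the two parts can be maximized separately.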

The function (5) is assumed to be regular with respect to all \({{\varvec{\lambda }}}\) derivatives. The score vector, obtained by differentiation of the log-likelihood function \(l({{\varvec{\lambda }}})\) with respect to \({{\varvec{\lambda }}}\), can be written as \({{\varvec{U}}} = {{\varvec{U}}}({{\varvec{\lambda }}}) = \left( {{\varvec{U}}}_{{\varvec{\beta }}}({{\varvec{\lambda }}})^{\top }, {{\varvec{U}}}_{{\varvec{\delta }}}({{\varvec{\lambda }}})^{\top }, {{\varvec{U}}}_{{\varvec{\gamma }}}({{\varvec{\lambda }}})^{\top }\right) ^{\top }\), with

$$\begin{aligned} {{\varvec{U}}}_{{\varvec{\beta }}}({{\varvec{\lambda }}})&= {{\varvec{X}}}^{\top } {{\varvec{\Phi }}} {{\varvec{W}}}^{1/2} {{\varvec{V}}}^{-1/2} {{\varvec{I}}}_{{\varvec{y}}} ({{\varvec{y}}} - {{\varvec{\mu }}}), \end{aligned}$$

where \({{\varvec{X}}} = ({{\varvec{x}}}_1, \ldots , {{\varvec{x}}}_n)^{\top }\) is a specified \(n \times p_1\) matrix of full rank \(p_1 < n\), \({{\varvec{\Phi }}} =\) diag\(\{\phi _1, \ldots , \phi _n\}\), \({{\varvec{W}}} =\) diag\(\{w_1, \ldots , w_n\}\), \(w_{\ell } = (d\mu _{\ell }/d\eta _{1\ell })^2 V_{\ell }^{-1}\), \({{\varvec{V}}} =\) diag\(\{V_{1}, \ldots , V_{n}\}\), \({{\varvec{I}}}_{{\varvec{y}}} =\) diag\(\left\{ 1 - \mathbb {I}_{\{0\}}^{(y_{1})}, \ldots , 1 - \mathbb {I}_{\{0\}}^{(y_{n})} \right\} \), \({{\varvec{y}}} = (y_{1}, \ldots , y_{n})^{\top }\), \({{\varvec{\mu }}} = (\mu _{1}, \ldots , \mu _{n})^{\top }\),

$$\begin{aligned} {{\varvec{U}}}_{{\varvec{\delta }}}({{\varvec{\lambda }}})&= {{\varvec{T}}}^{\top } {{\varvec{\Phi }}}_{1} {{\varvec{I}}}_{{\varvec{y}}} {{\varvec{\nu }}}, \end{aligned}$$

where \({{\varvec{T}}} = ({{\varvec{t}}}_1, \ldots , {{\varvec{t}}}_n)^{\top }\) is a specified \(n \times p_2\) matrix of full rank \(p_2 < n\), \({{\varvec{\Phi }}}_{1} =\) diag\(\{\phi _{11}, \ldots , \phi _{1n}\}\), \(\phi _{1\ell } = d\phi _{\ell }/d\eta _{2\ell }\), \({{\varvec{\nu }}} = (\nu _{1}, \ldots , \nu _{n})^{\top }\), \(\nu _{\ell } = y_{\ell } \theta _{\ell } - b(\theta _{\ell }) + c(y_{\ell }) + d d_2(\phi _{\ell })/d\phi _{\ell }\),

$$\begin{aligned} {{\varvec{U}}}_{{\varvec{\gamma }}}({{\varvec{\lambda }}})&= {{\varvec{S}}}^{\top } {{\varvec{W}}}_{\pi }^{1/2} {{\varvec{V}}}_{\pi }^{-1/2} ({{\varvec{y}}}^I - {{\varvec{\pi }}}), \end{aligned}$$

where \({{\varvec{S}}} = ({{\varvec{s}}}_1, \ldots , {{\varvec{s}}}_n)^{\top }\) is a specified \(n \times p_3\) matrix of full rank \(p_3 < n\), \({{\varvec{W}}}_{\pi } =\) diag\(\{w_{\pi 1}, \ldots , w_{\pi n}\}\), \(w_{\pi \ell } = (d\pi _{\ell }/d\eta _{3\ell })^2 V_{\pi \ell }^{-1}\), \({{\varvec{V}}}_{\pi } =\) diag\(\{V_{\pi 1}, \ldots , V_{\pi n}\}\), \(V_{\pi \ell } = \pi _{\ell } (1-\pi _{\ell })\), \({{\varvec{y}}}^I = (y_{1}^I, \ldots , y_{n}^I)^{\top }\), \(y_{\ell }^I = \mathbb {I}_{\{0\}}^{(y_{\ell })}\), \({{\varvec{\pi }}} = (\pi _{1}, \ldots , \pi _{n})^{\top }\). The partition \({{\varvec{\lambda }}} = ({{\varvec{\beta }}}^{\top }, {{\varvec{\delta }}}^{\top }, {{\varvec{\gamma }}}^{\top })^{\top }\) induces a corresponding partitioned Fisher information matrix for these parameters. This matrix is block-diagonal given by:

$$\begin{aligned} {{\varvec{K}}}_{{{\varvec{\lambda }}}, {{\varvec{\lambda }}}} = \left( \begin{array}{ccc} {{\varvec{X}}}^{\top } {{\varvec{\Delta }}} {{\varvec{W}}} {{\varvec{\Phi }}} {{\varvec{X}}} &{} {{\varvec{0}}} &{} {{\varvec{0}}} \\ {{\varvec{0}}} &{} - {{\varvec{T}}}^{\top } {{\varvec{\Delta }}} {{\varvec{D}}}_2 {{\varvec{\Phi }}}_{1}^{2} {{\varvec{T}}} &{} {{\varvec{0}}} \\ {{\varvec{0}}} &{} {{\varvec{0}}} &{} {{\varvec{S}}}^{\top } {{\varvec{W}}}_{\pi } {{\varvec{S}}} \\ \end{array} \right) , \end{aligned}$$
(6)

where \({{\varvec{\Delta }}} =\) diag\(\{1 - \pi _1, \ldots , 1 - \pi _n\}\), \({{\varvec{D}}}_2 =\) diag\(\{d_{21}, \ldots , d_{2n}\}\), \(d_{2\ell } = d^2 d_2(\phi _{\ell })/d\phi _{\ell }^2\), \({{\varvec{\Phi }}}_{1}^2 =\) diag\(\{\phi _{11}^2, \ldots , \phi _{1n}^2\}\), \(\phi _{1\ell }^2 = (d\phi _{\ell }/d\eta _{2\ell })^2\). Thus, the parameters \({{\varvec{\beta }}}\), \({{\varvec{\delta }}}\), \({{\varvec{\gamma }}}\) are globally orthogonal (Cox and Reid 1987) and their maximum likelihood estimates \({{\varvec{{\hat{\beta }}}}}\), \({{\varvec{{\hat{\delta }}}}}\) and \({{\varvec{{\hat{\gamma }}}}}\) are asymptotically independent. The former property is necessary to simplify the calculations of the Bartlett corrections, whereas the latter is desirable in the context of inference. The Fisher scoring method can be used to compute \({{\varvec{{\hat{\beta }}}}}\), \({{\varvec{{\hat{\delta }}}}}\) and \({{\varvec{{\hat{\gamma }}}}}\) by iteratively solving the following equations:

$$\begin{aligned} \left( \begin{array}{cc} {{\varvec{{\hat{\beta }}}}}^{(m+1)} \\ {{\varvec{{\hat{\delta }}}}}^{(m+1)} \\ {{\varvec{{\hat{\gamma }}}}}^{(m+1)} \\ \end{array} \right) = \left( \begin{array}{cc} {{\varvec{{\hat{\beta }}}}}^{(m)} \\ {{\varvec{{\hat{\delta }}}}}^{(m)} \\ {{\varvec{{\hat{\gamma }}}}}^{(m)} \\ \end{array} \right) + \left[ {{\varvec{K}}}_{{{\varvec{\lambda }}}, {{\varvec{\lambda }}}}^{-1}\right] ^{(m)} \times {{\varvec{U}}}^{(m)}. \end{aligned}$$
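Since \(l_2\) depends only on \({{\varvec{\gamma }}}\) and (6) is block-diagonal, the \({{\varvec{\gamma }}}\) block of this iteration can be run on its own. The sketch below is illustrative only: it assumes a logit link for \(h_3\), under which \(w_{\pi \ell } = V_{\pi \ell }\) and the score reduces to \({{\varvec{S}}}^{\top }({{\varvec{y}}}^I - {{\varvec{\pi }}})\), and it handles just an intercept plus one covariate so that the \(2 \times 2\) information matrix can be inverted by hand. The function name `fisher_scoring_logit` is hypothetical.

```python
import math


def fisher_scoring_logit(s, y_zero, n_iter=25):
    """Fisher scoring for gamma in logit(pi_l) = g0 + g1 * s_l.

    s      : covariate values s_l
    y_zero : zero indicators y_l^I (1 if y_l == 0, else 0)
    """
    g0, g1 = 0.0, 0.0
    for _ in range(n_iter):
        u0 = u1 = 0.0              # score vector U_gamma = S^T (y^I - pi)
        k00 = k01 = k11 = 0.0      # information K_gamma = S^T W_pi S
        for s_l, z_l in zip(s, y_zero):
            pi_l = 1.0 / (1.0 + math.exp(-(g0 + g1 * s_l)))
            w_l = pi_l * (1.0 - pi_l)   # w_pi = V_pi under the logit link
            r_l = z_l - pi_l
            u0 += r_l
            u1 += s_l * r_l
            k00 += w_l
            k01 += w_l * s_l
            k11 += w_l * s_l * s_l
        det = k00 * k11 - k01 * k01
        # gamma^(m+1) = gamma^(m) + K^{-1} U, using the explicit 2x2 inverse
        g0 += (k11 * u0 - k01 * u1) / det
        g1 += (-k01 * u0 + k00 * u1) / det
    return g0, g1


if __name__ == "__main__":
    # Binary covariate: 3 zeros out of 10 when s = 0, 6 zeros out of 10 when s = 1.
    s = [0.0] * 10 + [1.0] * 10
    y_zero = [1] * 3 + [0] * 7 + [1] * 6 + [0] * 4
    print(fisher_scoring_logit(s, y_zero))
```

With a binary covariate the model is saturated, so the iteration should converge to the logits of the observed proportions of zeros in each group, which gives a simple check of the update.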

In many problems, the restrictions under a test involve a subset of the \({{\varvec{\beta }}}\), \({{\varvec{\delta }}}\) and \({{\varvec{\gamma }}}\) parameters. We partition the parameters as \({{\varvec{\beta }}} = \left( {{\varvec{\beta }}}_1^{\top }, {{\varvec{\beta }}}_2^{\top }\right) ^{\top }\), \({{\varvec{\delta }}} = \left( {{\varvec{\delta }}}_1^{\top }, {{\varvec{\delta }}}_2^{\top }\right) ^{\top }\) and \({{\varvec{\gamma }}} = \left( {{\varvec{\gamma }}}_1^{\top }, {{\varvec{\gamma }}}_2^{\top }\right) ^{\top }\), where \({{\varvec{\beta }}}_1 = \left( \beta _1, \ldots , \beta _{q_1}\right) ^{\top }\), \({{\varvec{\beta }}}_2 = \left( \beta _{q_1 + 1}, \ldots , \beta _{p_1}\right) ^{\top }\), \({{\varvec{\delta }}}_1 = \left( \delta _1, \ldots , \delta _{q_2}\right) ^{\top }\), \({{\varvec{\delta }}}_2 = \left( \delta _{q_2 + 1}, \ldots , \delta _{p_2}\right) ^{\top }\), \({{\varvec{\gamma }}}_1 = \left( \gamma _1, \ldots , \gamma _{q_3}\right) ^{\top }\) and \({{\varvec{\gamma }}}_2 = \left( \gamma _{q_3 + 1}, \ldots , \gamma _{p_3}\right) ^{\top }\). The partitions of \({{\varvec{\beta }}}\), \({{\varvec{\delta }}}\) and \({{\varvec{\gamma }}}\) induce the corresponding partitions \({{\varvec{X}}} = ({{\varvec{X}}}_1, {{\varvec{X}}}_2)\), \({{\varvec{T}}} = ({{\varvec{T}}}_1, {{\varvec{T}}}_2)\), \({{\varvec{S}}} = ({{\varvec{S}}}_1, {{\varvec{S}}}_2)\), \({{\varvec{U}}} = \left( {{\varvec{U}}}_{{\varvec{\beta }}_1}({{\varvec{\beta }}}_1, {{\varvec{\beta }}}_2)^{\top }, {{\varvec{U}}}_{{\varvec{\beta }}_2}({{\varvec{\beta }}}_1, {{\varvec{\beta }}}_2)^{\top }, {{\varvec{U}}}_{{\varvec{\delta }}_1}({{\varvec{\delta }}}_1, {{\varvec{\delta }}}_2)^{\top }, {{\varvec{U}}}_{{\varvec{\delta }}_2}({{\varvec{\delta }}}_1, {{\varvec{\delta }}}_2)^{\top }, {{\varvec{U}}}_{{\varvec{\gamma }}_1}({{\varvec{\gamma }}}_1, {{\varvec{\gamma }}}_2)^{\top }, {{\varvec{U}}}_{{\varvec{\gamma }}_2}({{\varvec{\gamma }}}_1, {{\varvec{\gamma }}}_2)^{\top }\right) ^{\top }\) and

$$\begin{aligned} {{\varvec{K}}}_{{{\varvec{\beta }}}, {{\varvec{\beta }}}} = \left( \begin{array}{cc} {{\varvec{K}}}_{{{\varvec{\beta }}}_{11}} &{} {{\varvec{K}}}_{{{\varvec{\beta }}}_{12}} \\ {{\varvec{K}}}_{{{\varvec{\beta }}}_{21}} &{} {{\varvec{K}}}_{{{\varvec{\beta }}}_{22}} \\ \end{array} \right) , \ {{\varvec{K}}}_{{{\varvec{\delta }}}, {{\varvec{\delta }}}} = \left( \begin{array}{cc} {{\varvec{K}}}_{{{\varvec{\delta }}}_{11}} &{} {{\varvec{K}}}_{{{\varvec{\delta }}}_{12}} \\ {{\varvec{K}}}_{{{\varvec{\delta }}}_{21}} &{} {{\varvec{K}}}_{{{\varvec{\delta }}}_{22}} \\ \end{array} \right) , \ {{\varvec{K}}}_{{{\varvec{\gamma }}}, {{\varvec{\gamma }}}} = \left( \begin{array}{cc} {{\varvec{K}}}_{{{\varvec{\gamma }}}_{11}} &{} {{\varvec{K}}}_{{{\varvec{\gamma }}}_{12}} \\ {{\varvec{K}}}_{{{\varvec{\gamma }}}_{21}} &{} {{\varvec{K}}}_{{{\varvec{\gamma }}}_{22}} \\ \end{array} \right) , \end{aligned}$$

where \({{\varvec{X}}}_1\), \({{\varvec{X}}}_2\), \({{\varvec{T}}}_1\), \({{\varvec{T}}}_2\), \({{\varvec{S}}}_1\) and \({{\varvec{S}}}_2\) are known matrices of full rank and dimensions \(n \times q_1\), \(n \times (p_1 - q_1)\), \(n \times q_2\), \(n \times (p_2 - q_2)\), \(n \times q_3\), \(n \times (p_3 - q_3)\), respectively, and \({{\varvec{K}}}_{{{\varvec{\beta }}}_{11}} = {{\varvec{X}}}_1^{\top } {{\varvec{\Delta }}} {{\varvec{W}}} {{\varvec{\Phi }}} {{\varvec{X}}}_1\), \({{\varvec{K}}}_{{{\varvec{\beta }}}_{12}} = {{\varvec{K}}}_{{{\varvec{\beta }}}_{21}}^{\top } = {{\varvec{X}}}_1^{\top } {{\varvec{\Delta }}} {{\varvec{W}}} {{\varvec{\Phi }}} {{\varvec{X}}}_2\), \({{\varvec{K}}}_{{{\varvec{\beta }}}_{22}} = {{\varvec{X}}}_2^{\top } {{\varvec{\Delta }}} {{\varvec{W}}} {{\varvec{\Phi }}} {{\varvec{X}}}_2\), \({{\varvec{K}}}_{{{\varvec{\delta }}}_{11}} = - {{\varvec{T}}}_1^{\top } {{\varvec{\Delta }}} {{\varvec{D}}}_2 {{\varvec{\Phi }}}_{1}^{2} {{\varvec{T}}}_1\), \({{\varvec{K}}}_{{{\varvec{\delta }}}_{12}} = {{\varvec{K}}}_{{{\varvec{\delta }}}_{21}}^{\top } = - {{\varvec{T}}}_1^{\top } {{\varvec{\Delta }}} {{\varvec{D}}}_2 {{\varvec{\Phi }}}_{1}^{2} {{\varvec{T}}}_2\), \({{\varvec{K}}}_{{{\varvec{\delta }}}_{22}} = - {{\varvec{T}}}_2^{\top } {{\varvec{\Delta }}} {{\varvec{D}}}_2 {{\varvec{\Phi }}}_{1}^{2} {{\varvec{T}}}_2\), \({{\varvec{K}}}_{{{\varvec{\gamma }}}_{11}} = {{\varvec{S}}}_1^{\top } {{\varvec{W}}}_{\pi } {{\varvec{S}}}_1\), \({{\varvec{K}}}_{{{\varvec{\gamma }}}_{12}} = {{\varvec{K}}}_{{{\varvec{\gamma }}}_{21}}^{\top } = {{\varvec{S}}}_1^{\top } {{\varvec{W}}}_{\pi } {{\varvec{S}}}_2\) and \({{\varvec{K}}}_{{{\varvec{\gamma }}}_{22}} = {{\varvec{S}}}_2^{\top } {{\varvec{W}}}_{\pi } {{\varvec{S}}}_2\).

We are interested in testing

$$\begin{aligned} \left\{ \begin{array}{cl} {\mathcal {H}}: &{} {{\varvec{\beta }}}_1 = {{\varvec{\beta }}}_1^{(0)}, {{\varvec{\delta }}}_1 = {{\varvec{\delta }}}_1^{(0)}, {{\varvec{\gamma }}}_1 = {{\varvec{\gamma }}}_1^{(0)} \\ {\mathcal {A}}: &{} \text{ violation } \text{ of } \text{ at } \text{ least } \text{ one } \text{ equality } \end{array} \right. \end{aligned}$$
(7)

where \({{\varvec{\beta }}}_1^{(0)}\), \({{\varvec{\delta }}}_1^{(0)}\) and \({{\varvec{\gamma }}}_1^{(0)}\) are specified vectors of dimensions \(q_1\), \(q_2\) and \(q_3\), respectively. We assume that \(0 \le q_1 \le p_1\), \(0 \le q_2 \le p_2\) and \(0 \le q_3 \le p_3\), but the trivial case \(q_1 = q_2 = q_3 = 0\) is excluded because the null hypothesis then imposes no restrictions on the parameters. Let \({{\varvec{{\hat{\lambda }}}}} = \left( {{\varvec{{\hat{\beta }}}}}^{\top }, {{\varvec{{\hat{\delta }}}}}^{\top }, {{\varvec{{\hat{\gamma }}}}}^{\top }\right) ^{\top }\) be the unrestricted maximum likelihood estimate of \({{\varvec{\lambda }}}\) and \({{\varvec{{\tilde{\lambda }}}}} = \left( {{\varvec{\beta }}}_1^{(0)\top }, {{\varvec{{\tilde{\beta }}}}}_2^{\top }, {{\varvec{\delta }}}_1^{(0)\top }, {{\varvec{{\tilde{\delta }}}}}_2^{\top }, {{\varvec{\gamma }}}_1^{(0)\top }, {{\varvec{{\tilde{\gamma }}}}}_2^{\top } \right) ^{\top }\) be its restricted maximum likelihood estimate under \({\mathcal {H}}\). The likelihood ratio statistic for testing \({\mathcal {H}}\) is

$$\begin{aligned} \text{ LR } = 2 \left[ l\left( {{\varvec{{\hat{\beta }}}}}_1, {{\varvec{{\hat{\beta }}}}}_2, {{\varvec{{\hat{\delta }}}}}_1, {{\varvec{{\hat{\delta }}}}}_2, {{\varvec{{\hat{\gamma }}}}}_1, {{\varvec{{\hat{\gamma }}}}}_2 \right) - l\left( {{\varvec{\beta }}}_1^{(0)}, {{\varvec{{\tilde{\beta }}}}}_2, {{\varvec{\delta }}}_1^{(0)}, {{\varvec{{\tilde{\delta }}}}}_2, {{\varvec{\gamma }}}_1^{(0)}, {{\varvec{{\tilde{\gamma }}}}}_2 \right) \right] , \end{aligned}$$

which is, under \({\mathcal {H}}\) and some regularity conditions, asymptotically distributed as \(\chi _{q_1 + q_2 + q_3}^2\) with approximation error of order \(n^{-1}\).

3 Corrected likelihood ratio tests

It is known that, under the null hypothesis and general regularity conditions, the likelihood ratio statistic, \(\text{ LR }\), has an asymptotic \(\chi _q^2\) distribution, where q is the number of restrictions imposed on the parameters by the null hypothesis. Lawley (1956) improved the \(\text{ LR }\) statistic by defining a statistic, say LR\(_\textrm{c}\), such that \(\mathbb {P}_{{\mathcal {H}}} \left( \text{ LR}_\textrm{c} \le w \right) = \mathbb {P} \left( \chi _q^2 \le w \right) + {\mathcal {O}}(n^{-2})\), while \(\mathbb {P}_{{\mathcal {H}}} \left( \text{ LR } \le w \right) = \mathbb {P} \left( \chi _q^2 \le w \right) + {\mathcal {O}}(n^{-1})\), where \(\mathbb {P}_{{\mathcal {H}}} \left( \cdot \right) \) denotes probability computed under the null hypothesis. The improved statistic is given by LR\(_\textrm{c} = {\tilde{c}}^{-1} \text{ LR }\), where \({\tilde{c}} = q^{-1} \tilde{\mathbb {E}}(\text{ LR})\) is a consistent estimate of \(c = q^{-1} \mathbb {E}(\text{ LR})\) and \(\mathbb {E}(\text{ LR})\) is the expectation of the likelihood ratio statistic, evaluated under the null hypothesis up to order \(n^{-1}.\) The factor c is known as the Bartlett correction factor.

For the model presented in Sect. 2, define \(q = q_1 + q_2 + q_3\) and \(p = p_1 + p_2 + p_3.\) The expected likelihood ratio statistic to order \({\mathcal {O}}(n^{-1})\) for the test of hypotheses in (7) is \(\mathbb {E}(\text{ LR}) = q + \varepsilon _{p} - \varepsilon _{p - q}\), where

$$\begin{aligned} \varepsilon _{p}&= \varepsilon _{\beta _{p_{1}}} + \varepsilon _{\delta _{p_{2}}} + \varepsilon _{\gamma _{p_{3}}} \nonumber \\&\quad + \frac{1}{2} \text{ tr } \left\{ {{\varvec{\Delta }}} {{\varvec{\Phi }}}_{2} {{\varvec{W}}} {{\varvec{Z}}}_{\beta d} {{\varvec{Z}}}_{\delta d} {}- {{\varvec{\Delta }}} {{\varvec{\Phi }}}_1 {{\varvec{W}}} ({{\varvec{Z}}}_{\beta }^{(2)} \odot {{\varvec{Z}}}_{\delta }) {{\varvec{W}}} {{\varvec{\Phi }}}_1 {{\varvec{\Delta }}} \right\} \nonumber \\&\quad + \frac{1}{4} {{\varvec{1}}}^{\top } {{\varvec{\Delta }}} {{\varvec{\Phi }}}_1 {{\varvec{W}}} {{\varvec{Z}}}_{\beta d} {{\varvec{Z}}}_{\delta } \left[ {{\varvec{Z}}}_{\beta d} {{\varvec{W}}} + 2 {{\varvec{Z}}}_{\delta d} \left( {{\varvec{\Phi }}}_1^2 {{\varvec{D}}}_{3} + {{\varvec{\Phi }}}_2 {{\varvec{D}}}_{2} \right) \right] {{\varvec{\Phi }}}_1 {{\varvec{\Delta }}} {{\varvec{1}}}, \end{aligned}$$
(8)

with \({{\varvec{1}}} = (1, \ldots , 1)_{n \times 1}^{\top }\), \({{\varvec{\Phi }}}_{2} =\) diag\(\{\phi _{21}, \ldots , \phi _{2n}\}\), \(\phi _{2\ell } = d^2\phi _{\ell }/d\eta _{2\ell }^2\), \({{\varvec{Z}}}_{\beta } = {{\varvec{X}}} {{\varvec{K}}}^{{{\varvec{\beta }}}, {{\varvec{\beta }}}} {{\varvec{X}}}^{\top }\), \({{\varvec{Z}}}_{\beta d} =\) diag\(\{z_{\beta , 11}, \ldots , z_{\beta , nn}\}\), \({{\varvec{Z}}}_{\delta } = {{\varvec{T}}} {{\varvec{K}}}^{{{\varvec{\delta }}}, {{\varvec{\delta }}}} {{\varvec{T}}}^{\top }\), \({{\varvec{Z}}}_{\delta d} =\) diag\(\{z_{\delta , 11}, \ldots , z_{\delta , nn}\}\), \({{\varvec{Z}}}_{\gamma } = {{\varvec{S}}} {{\varvec{K}}}^{{{\varvec{\gamma }}}, {{\varvec{\gamma }}}} {{\varvec{S}}}^{\top }\), \({{\varvec{Z}}}_{\gamma d} =\) diag\(\{z_{\gamma , 11}, \ldots , z_{\gamma , nn}\}\), \({{\varvec{D}}}_3 =\) diag\(\{d_{31}, \ldots , d_{3n}\}\), \(d_{3\ell } = d^3 d_2(\phi _{\ell })/d\phi _{\ell }^3\), \({{\varvec{Z}}}_{\beta }^{(2)} = {{\varvec{Z}}}_{\beta } \odot {{\varvec{Z}}}_{\beta }\), \({{\varvec{Z}}}_{\beta }^{(3)} = {{\varvec{Z}}}_{\beta }^{(2)} \odot {{\varvec{Z}}}_{\beta }\), \({{\varvec{Z}}}_{\delta }^{(2)} = {{\varvec{Z}}}_{\delta } \odot {{\varvec{Z}}}_{\delta }\), \({{\varvec{Z}}}_{\delta }^{(3)} = {{\varvec{Z}}}_{\delta }^{(2)} \odot {{\varvec{Z}}}_{\delta }\), \({{\varvec{Z}}}_{\gamma }^{(2)} = {{\varvec{Z}}}_{\gamma } \odot {{\varvec{Z}}}_{\gamma }\), \({{\varvec{Z}}}_{\gamma }^{(3)} = {{\varvec{Z}}}_{\gamma }^{(2)} \odot {{\varvec{Z}}}_{\gamma }\), where \(\odot \) denotes the Hadamard (elementwise) product of matrices. For the sake of brevity, the expressions for \(\varepsilon _{\beta _{p_{1}}}\), \(\varepsilon _{\delta _{p_{2}}}\) and \(\varepsilon _{\gamma _{p_{3}}}\) are given in the Appendix. The term \(\varepsilon _{p}\) is of order \({\mathcal {O}}(n^{-1})\) and is evaluated at the true parameter point. We do not discuss particular cases of (8) here; they can be obtained along the lines of Botter and Cordeiro (1997).

The Bartlett-corrected LR test statistic for testing (7) is

$$\begin{aligned} \text{ LR}_\textrm{c} = \frac{\text{ LR }}{(1 + \zeta )}, \end{aligned}$$

where \(\zeta = \left( \varepsilon _p - \varepsilon _{p-q}\right) /q\) and \(\varepsilon _{p-q}\) is obtained from (8) with \({{\varvec{X}}}_2\), \({{\varvec{T}}}_2\) and \({{\varvec{S}}}_2\) in place of \({{\varvec{X}}}\), \({{\varvec{T}}}\) and \({{\varvec{S}}}\), respectively.
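As a brief check of why this rescaling works, the following sketch treats \(\zeta \) as known rather than estimated and takes the remainder in the expansion of \(\mathbb {E}(\text{ LR})\) to be of order \(n^{-2}\). From the expression for \(\mathbb {E}(\text{ LR})\) given above,

$$\begin{aligned} \mathbb {E}_{{\mathcal {H}}}(\text{ LR}) = q + \varepsilon _{p} - \varepsilon _{p - q} + {\mathcal {O}}(n^{-2}) = q (1 + \zeta ) + {\mathcal {O}}(n^{-2}), \end{aligned}$$

and therefore, since \(\zeta = {\mathcal {O}}(n^{-1})\),

$$\begin{aligned} \mathbb {E}_{{\mathcal {H}}}(\text{ LR}_\textrm{c}) = \frac{\mathbb {E}_{{\mathcal {H}}}(\text{ LR})}{1 + \zeta } = q + {\mathcal {O}}(n^{-2}), \end{aligned}$$

so the mean of the corrected statistic matches the mean of the \(\chi _q^2\) reference distribution to this order.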

The Bartlett correction factor is very general, and in some cases it is very difficult or even impossible to particularize its formula for specific regression models. For instance, although Cordeiro et al. (1994) found the Bartlett correction factor for the dispersion models, it has no closed form for the simplex distribution. As can be seen in (8), we were able to obtain the correction explicitly for our model. In the continuous case, i.e., \(\pi _{\ell } = 0, \forall \ell = 1, \ldots , n\), \(\varepsilon _{\gamma _{p_{3}}}\) vanishes, \({{\varvec{\Delta }}}\) is the identity matrix and the term \(\varepsilon _{p}\) in (8) coincides with equation (5) from Botter and Cordeiro (1997). Thus, our result generalizes their work.

An alternative strategy for improving LR testing inference is to use the bootstrap procedure. Suppose that \({{\varvec{{\tilde{\mu }}}}}\), \({{\varvec{{\tilde{\phi }}}}}\) and \({{\varvec{{\tilde{\pi }}}}}\) are the restricted MLEs of \({{\varvec{\mu }}}\), \({{\varvec{\phi }}}\) and \({{\varvec{\pi }}}\) from the original dataset. In the parametric bootstrap case, B pseudo-samples of size n are generated from ZAGLM(\({{\varvec{{\tilde{\mu }}}}}, {{\varvec{{\tilde{\phi }}}}}, {{\varvec{{\tilde{\pi }}}}}\)), and, for each b, \(b = 1, \ldots , B\), the unrestricted and restricted estimates of the regression coefficients, \({{\varvec{{\hat{\lambda }}}}}^{(b)}\) and \({{\varvec{{\tilde{\lambda }}}}}^{(b)}\), are computed and the LR statistic is calculated as:

$$\begin{aligned} \text{ LR}^{(b)} = 2 \left[ l\left( {{\varvec{{\hat{\lambda }}}}}^{(b)} \right) - l\left( {{\varvec{{\tilde{\lambda }}}}}^{(b)} \right) \right] . \end{aligned}$$

The null hypothesis (7) is rejected if \(p^{\star } = (k + 1)/(B + 1)\) is smaller than or equal to the significance level \(\alpha \). Here, k is the number of bootstrap replications in which \(\text{ LR}^{(b)}\) is greater than the LR statistic computed using the original sample and \(p^{\star }\) is the bootstrap p-value.
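The decision rule above can be written as a short helper. This is a minimal sketch that takes the observed statistic and the list of bootstrap statistics as given; the function names `bootstrap_p_value` and `reject` are hypothetical and not part of any library.

```python
def bootstrap_p_value(lr_obs, lr_boot):
    """Return p* = (k + 1) / (B + 1), where k = #{b : LR^(b) > LR}."""
    k = sum(1 for lr_b in lr_boot if lr_b > lr_obs)
    return (k + 1) / (len(lr_boot) + 1)


def reject(lr_obs, lr_boot, alpha):
    """Reject the null hypothesis when p* <= alpha."""
    return bootstrap_p_value(lr_obs, lr_boot) <= alpha


if __name__ == "__main__":
    lr_boot = [1.0, 2.0, 6.0, 7.0, 3.0, 0.5, 4.0, 8.0, 0.1]
    print(bootstrap_p_value(5.0, lr_boot), reject(5.0, lr_boot, 0.05))
```

Adding one to both the numerator and the denominator keeps \(p^{\star }\) strictly positive, so the smallest attainable p-value is \(1/(B + 1)\).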

Rocke (1989) proposed bootstrap resampling to estimate the Bartlett correction factor in the following form: B parametric bootstrap samples, imposing the null hypothesis, are produced and the Bartlett-corrected bootstrap test statistic is computed as

$$\begin{aligned} \text{ LR}_\textrm{boot2} = \frac{q \text{ LR }}{\overline{\text{ LR }}^{\star }}, \end{aligned}$$

where \(\overline{\text{ LR }}^{\star } = B^{-1} \sum _{b=1}^{B} \text{ LR}^{(b)}\) is the average of all bootstrap statistics. Under the null hypothesis, \(\text{ LR}_\textrm{boot2}\) is approximately distributed as \(\chi _q^2\). For more details on Bartlett corrections, including the bootstrap Bartlett adjustment, see Cordeiro and Cribari-Neto (2014).
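The bootstrap Bartlett adjustment can be illustrated on a deliberately simple toy problem rather than on a full ZAGLM. The sketch below assumes an intercept-only model for the zero indicator and tests \({\mathcal {H}}: \pi = \pi _0\) (so \(q = 1\) and there are no nuisance parameters), for which the LR statistic has a closed form. The function names `lr_bernoulli` and `bootstrap_bartlett` are hypothetical.

```python
import math
import random


def lr_bernoulli(z, n, pi0):
    """LR statistic for H: pi = pi0 given z zeros among n observations."""
    def term(count, p_hat, p0):
        # Convention 0 * log(0) = 0 handles z = 0 and z = n.
        return 0.0 if count == 0 else count * math.log(p_hat / p0)

    p_hat = z / n
    return 2.0 * (term(z, p_hat, pi0) + term(n - z, 1.0 - p_hat, 1.0 - pi0))


def bootstrap_bartlett(z, n, pi0, B=1000, seed=1):
    """Return (LR, LR_boot2) using B parametric bootstrap samples drawn under H."""
    rng = random.Random(seed)
    lr_obs = lr_bernoulli(z, n, pi0)
    lr_boot = []
    for _ in range(B):
        # Pseudo-sample generated with the null hypothesis imposed.
        z_b = sum(1 for _ in range(n) if rng.random() < pi0)
        lr_boot.append(lr_bernoulli(z_b, n, pi0))
    q = 1
    lr_bar = sum(lr_boot) / B
    return lr_obs, q * lr_obs / lr_bar


if __name__ == "__main__":
    print(bootstrap_bartlett(z=20, n=50, pi0=0.3, B=500))
```

In a ZAGLM, each bootstrap replicate would instead require fitting both the restricted and the unrestricted model, but the rescaling step by \(q / \overline{\text{ LR }}^{\star }\) is the same.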

4 Numerical results

This section presents the results of Monte Carlo simulation studies performed to evaluate the finite sample performance of the following tests: the likelihood ratio statistic (LR), the Bartlett-corrected statistic (\(\text{ LR}_\textrm{c}\)), the bootstrap test based on \(p^{\star }\) \((\text{ LR}_\textrm{boot1})\), and the Bartlett-corrected bootstrap statistic \((\text{ LR}_\textrm{boot2})\). For each considered scenario, we used 5,000 Monte Carlo replications and \(B = 1,000\). The simulations were performed using the Ox language (Doornik 2009).

The following ZAGA regression model was considered in the simulations:

$$\begin{aligned} \left\{ \begin{array}{lll} h_1(\mu _l) &{} = &{} \beta _1 + \beta _2x_{2l} + \beta _3x_{3l}, \\ h_2(\phi _l) &{} = &{} \delta _1 + \delta _2t_{2l} + \delta _3t_{3l}, \\ h_3(\pi _l) &{} = &{} \gamma _1 + \gamma _2s_{2l} + \gamma _3s_{3l}, \end{array} \right. \end{aligned}$$
(9)

in which we considered \(x_{2l}=t_{2l}=s_{2l}\) and \(x_{3l}=t_{3l}=s_{3l}\). First, we evaluated the null rejection rates of the different tests considering three nominal levels \((1\%, 5\% \text { and } 10\%)\) and three sample sizes (50, 75 and 100). We did not consider sample sizes smaller than 50 because, in ZAR models, they lead to bootstrap samples with very few observations equal to zero or very few observations greater than zero. As a consequence, the parameters cannot be estimated in many bootstrap samples when the sample size is smaller than 50.

Initially, we considered four scenarios to test the following hypothesis:

$$\begin{aligned} \left\{ \begin{array}{cl} {\mathcal {H}}: &{} \beta _3 = 0, \delta _3 = 0, \gamma _3 = 0, \\ {\mathcal {A}}: &{} \text{ violation } \text{ of } \text{ at } \text{ least } \text{ one } \text{ equality. } \end{array} \right. \end{aligned}$$
(10)

Note that in (10), we are testing whether the distribution of the response variable is a function of the covariate \(x_3\). In Scenarios 1 to 3, the covariate values were taken as random draws from the standard uniform distribution. In Scenario 4, \(x_2\) and \(x_3\) were drawn from the gamma distribution and from the inverse Gaussian distribution, respectively, both with the same mean and variance as the standard uniform distribution. In all scenarios, covariate values were kept fixed across the Monte Carlo replicates.

In Scenario 1, we considered \(\beta _1 = 3.0\), \(\beta _2 = 2.0\), \(\beta _3 = 0.0\), \(\delta _1 = 4.0\), \(\delta _2 = -2.0\), \(\delta _3 = 0.0\), \(\gamma _1 = 0.1\), \(\gamma _2 = -1.0\) and \(\gamma _3 = 0.0\). These parameter values yielded \(\mu \in (20.08,148.42)\), \(\phi \in (7.38,54.60)\), and \(\pi \in (0.289,0.525)\). In the second scenario, we changed the value of \(\gamma _1\) to 0.8, which yielded \(\pi \in (0.450,0.690)\). In the third scenario, the value of \(\delta _1\) was changed to 2.0, which yielded \(\phi \in (1.00,7.39)\). Finally, in Scenario 4, we changed the distribution used to generate the covariates as mentioned before.
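The quoted parameter ranges can be reproduced under an assumption that this section does not state explicitly, namely that \(h_1\) and \(h_2\) are log links and \(h_3\) is the logit link, with the covariate \(x_2\) taking values in (0, 1). Since \(\beta _3 = \delta _3 = \gamma _3 = 0\), only the intercept and the coefficient of \(x_2\) matter. The helper `predictor_range` is hypothetical and used only for this check.

```python
import math


def expit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


def predictor_range(intercept, slope, inverse_link):
    """Range of inverse_link(intercept + slope * x) for x in (0, 1).

    The inverse links used here are monotone, so the endpoints suffice.
    """
    ends = (inverse_link(intercept), inverse_link(intercept + slope))
    return min(ends), max(ends)


# Scenario 1 (assumed links: log for mu and phi, logit for pi)
mu_range = predictor_range(3.0, 2.0, math.exp)
phi_range = predictor_range(4.0, -2.0, math.exp)
pi_range = predictor_range(0.1, -1.0, expit)
# Scenario 2 changes gamma_1 to 0.8; Scenario 3 changes delta_1 to 2.0
pi_range_s2 = predictor_range(0.8, -1.0, expit)
phi_range_s3 = predictor_range(2.0, -2.0, math.exp)

if __name__ == "__main__":
    print(mu_range, phi_range, pi_range, pi_range_s2, phi_range_s3)
```

Under these assumed links the computed endpoints agree with the intervals reported in the text up to rounding in the last digit.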

Table 2 presents the null rejection rates for the test of (10) in the four scenarios described above. The LR test was considerably liberal for \(n=50\), reaching, for example, a rejection rate of \(4.00\%\) when \(\alpha =1\%\) in Scenario 4. Even when \(n=100\), the rejection rates of the LR test were not close to the nominal levels in any scenario. On the other hand, in general, the three improved tests presented rejection rates close to the nominal levels even when \(n=50\). For example, in Scenario 4, for \(\alpha =1\%\) and \(n=50\), the rejection rates of \(\text{ LR}_\textrm{c}\), \(\text{ LR}_\textrm{boot1}\) and \(\text{ LR}_\textrm{boot2}\) were \(1.08\%\), \(0.84\%\) and \(1.04\%\), respectively. There were some exceptions, mainly in Scenario 2, in which the rejection rates of the test based on the \(\text{ LR}_\textrm{boot1}\) statistic were not very close to the nominal levels.

Table 2 Null rejection rates \((\%)\) of tests for three parameters

We also performed simulation studies considering a hypothesis with a single parameter. We considered Scenario 1 and three different null hypotheses: \({\mathcal {H}}: \beta _3 = 0\), \({\mathcal {H}}: \delta _3 = 0\), and \({\mathcal {H}}: \gamma _3 = 0\). Table 3 presents the null rejection rates for these tests. The LR test is also considerably liberal for the two tests related to the continuous component of the model, but it performs well for the test related to the discrete component of the model. On the other hand, the improved LR tests have null rejection rates close to the nominal levels for the three hypotheses, even when \(n=50\).

Table 3 Null rejection rates \((\%)\) for tests for a single parameter in Scenario 1

In Tables 4, 5 and 6, we present the simulation results on the non-null rejection rates (power) of the tests based on the statistics \(\text{ LR}_\textrm{c}\), \(\text{ LR}_\textrm{boot1}\) and \(\text{ LR}_\textrm{boot2}\). The LR statistic is not included in this study because our simulations showed that the LR test is oversized. We considered the following three sets of hypotheses:

  • \({\mathcal {H}}: \beta _3 = 0, \delta _3 = 0, \gamma _3 = 0\) versus \({\mathcal {A}}: \beta _3 \ne 0, \delta _3 \ne 0, \gamma _3 \ne 0,\) considering in \({\mathcal {A}}\) \(\beta _3=\delta _3=\gamma _3=\tau \) (Table 4)

  • \({\mathcal {H}}: \beta _3 = 0, \delta _3 = 0, \gamma _3 = 0\) versus \({\mathcal {A}}: \beta _3 \ne 0, \delta _3 \ne 0, \gamma _3 \ne 0,\) considering in \({\mathcal {A}}\) \(\beta _3=\tau , \delta _3 = \gamma _3 = 0\) (Table 5)

  • \({\mathcal {H}}: \beta _3 = 0\) versus \({\mathcal {A}}: \beta _3 \ne 0,\) considering in \({\mathcal {A}}\) \(\beta _3=\tau \) (Table 6)

where \(\tau \in \{-0.50, -0.25, 0.25, 0.50\}\). The remaining parameters were as in Scenario 1.

In all three tables, for fixed values of \(\alpha \), n and \(\tau \), the non-null rejection rates of the three tests were similar. As expected, in all cases, the non-null rejection rates increased with n and with the absolute value of \(\tau \).

Table 4 Non-null rejection rates \((\%)\) for \({\mathcal {H}}: \beta _3 = 0, \delta _3 = 0, \gamma _3 = 0\) - Scenario 1 with \(\beta _3=\delta _3=\gamma _3=\tau \)
Table 5 Non-null rejection rates \((\%)\) for \({\mathcal {H}}: \beta _3 = 0, \delta _3 = 0, \gamma _3 = 0\) - Scenario 1 with \(\beta _3=\tau , \delta _3 = \gamma _3 = 0\)
Table 6 Non-null rejection rates \((\%)\) for tests for \({\mathcal {H}}: \beta _3 = 0\) - Scenario 1 with \(\beta _3=\tau \)

5 Application

This section presents an application to real data using the LR test and its improved versions. The data are part of the work presented in Rocha et al. (2009) and refer to the production of mycotoxin FB\(_2\) in corn grains in Brazil. Fusarium verticillioides is a species of fungus that commonly produces mycotoxins in corn grains. When these substances are produced in large quantities, the corn grains become unfit for consumption.

The dataset consists of 200 unit samples, each composed of 30 g of corn grains. The response variable is the quantity of mycotoxin FB\(_2\) (measured in \(\mu \)g/g) and the covariates are the percentage of water activity (\(x_2\)) and the percentage of grains with F. verticillioides infection (\(x_3\)). A total of 51 out of 200 unit samples did not contain mycotoxin FB\(_2\).

We fitted ZAGA and ZAIG regression models for the quantity of mycotoxin FB\(_2\) with the following three systematic components:

$$\begin{aligned} \left\{ \begin{array}{lll} \log (\mu _l) &{} = &{} \beta _1 + \beta _2x_{2l} + \beta _3x_{3l}, \\ \log (\phi _l) &{} = &{} \delta _1 + \delta _2x_{2l} + \delta _3x_{3l}, \\ \text{ logit }(\pi _l) &{} = &{} \gamma _1 + \gamma _2x_{2l} + \gamma _3x_{3l}. \end{array} \right. \end{aligned}$$
(11)

Note that in (11) we included the two covariates in the three components of the model. We used a logarithmic link function in the submodels for \(\mu \) and \(\phi \) and a logit link in the submodel for \(\pi \). Diagnostic analysis, omitted here for the sake of brevity, suggested that the ZAGA regression model fits these data adequately and that the ZAIG regression model produces a worse fit. In addition, the AIC value is much lower for the ZAGA regression model (373.09) than for the ZAIG regression model (429.37).

First, we tested whether the three parameters related to the percentage of grains with Fusarium verticillioides are all equal to zero. The p-values of the four tests considered here are greater than 0.1. Therefore, there is no evidence that the quantity of mycotoxin FB\(_2\) is a function of the percentage of grains with F. verticillioides and we excluded this covariate from the model.

Second, we fitted a model using water activity as the single covariate and tested whether the three parameters related to this covariate were all equal to zero. The second and third columns of Table 7 present the results. Note that the value of the statistic is greater for the LR test than for the corrected test, in agreement with the simulation results. However, all tests yielded the same conclusion at the usual nominal levels: the quantity of mycotoxin FB\(_2\) is a function of the water activity.

Table 7 Test results for the ZAGA model in the complete mycotoxin database

Finally, using the same model, we tested whether each of the parameters of the model was equal to zero; the results are presented in the last 6 columns of Table 7. At the usual nominal levels, all statistics again led to the same conclusion in each of the three tests. The mean of the continuous component of the quantity of mycotoxin FB\(_2\) and the probability of the quantity of mycotoxin FB\(_2\) assuming a zero value are functions of the water activity, but there is no evidence that the dispersion parameter of the continuous component of the quantity of mycotoxin FB\(_2\) is a function of the water activity.

Table 8 presents the estimates of the parameters and their standard errors for the final model. To interpret the estimates of the parameters, the table also presents the exponential of the estimates. The results indicated that, for every percentage point increase in the water activity, the mean quantity of mycotoxin FB\(_2\), given that there is some FB\(_2\) (mean of the continuous component of the response variable), increased by \(5.5\%\). It was also estimated that, for every percentage point increase in the water activity, the odds of a random sample of 30 g of corn not containing FB\(_2\) decreased by \(12.8\%\).
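The percentage interpretation follows directly from the link functions. As a short worked step, writing \(\beta _2\) and \(\gamma _2\) for the water activity coefficients in the submodels for \(\mu \) and \(\pi \), a one-unit increase in \(x_2\) gives

$$\begin{aligned} \frac{\mu (x_2+1)}{\mu (x_2)} &= \frac{\exp \{\beta _1 + \beta _2(x_2+1)\}}{\exp \{\beta _1 + \beta _2x_2\}} = e^{\beta _2}, \\ \frac{\pi (x_2+1)/\{1-\pi (x_2+1)\}}{\pi (x_2)/\{1-\pi (x_2)\}} &= e^{\gamma _2}, \end{aligned}$$

so the percentage changes are \(100\,(e^{\beta _2}-1)\%\) in the mean and \(100\,(e^{\gamma _2}-1)\%\) in the odds of a zero. The reported changes of \(5.5\%\) and \(-12.8\%\) therefore correspond to \(e^{{\hat{\beta }}_2} \approx 1.055\) and \(e^{{\hat{\gamma }}_2} \approx 0.872\). These two values are back-calculated from the percentages quoted in the text and are not read from Table 8.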

Table 8 Estimates and standard errors for the final ZAGA model in the complete mycotoxin database

Similar to Melo et al. (2022), we randomly selected a subset to illustrate that the conclusions of different tests may be different. We selected the sample using a binomial random variable with probability of success of 0.2 and obtained a dataset with \(n=41\). For this reduced dataset, Table 9 presents the results of the same tests performed before for the complete data. Note that, at a significance level of \(1\%\) or \(5\%\), the conclusions of the LR tests and the improved LR tests differed. Based on the results of the simulation studies, if the reduced database were the true one, we would rely on the conclusions reached when using the improved LR tests.
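One possible reading of this selection scheme, stated here as an assumption, is that each of the 200 unit samples was retained independently with probability 0.2, so that the size of the reduced dataset is a binomial(200, 0.2) random variable with expected value 40. The sketch below illustrates that reading; the function name and the seed are hypothetical and the selected indices do not reproduce the authors' subset.

```python
import random


def bernoulli_subsample(n_total, keep_prob, seed):
    """Return the indices kept when each unit is retained independently.

    Every index in range(n_total) is kept with probability keep_prob, so
    the number of kept units follows a binomial(n_total, keep_prob) law.
    """
    rng = random.Random(seed)
    return [i for i in range(n_total) if rng.random() < keep_prob]


# Hypothetical seed: a different seed gives a different subset and size.
kept = bernoulli_subsample(n_total=200, keep_prob=0.2, seed=2024)
print(len(kept))  # varies with the seed; the expected size is 200 * 0.2 = 40
```

Under this reading, a reduced dataset with \(n=41\) is well within the usual variation around the expected size of 40.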

Table 9 Test results for the ZAGA model in the reduced mycotoxin database

6 Concluding remarks

Response variables that are discrete at zero and continuous on the positive real numbers are common in many areas and they are usually fitted using zero-adjusted generalized linear models. In many regression models, the likelihood ratio test is used to perform hypothesis testing, especially when the null hypothesis involves more than one parameter. However, the likelihood ratio test is considerably liberal (oversized) in the class of ZAGLMs when the sample size is small. In this work, we derived an analytical Bartlett-corrected likelihood ratio test and considered two bootstrap-based corrected likelihood ratio tests. We conducted Monte Carlo simulation studies, which showed that the null rejection rates of the three improved tests are close to the nominal levels for small sample sizes. We also concluded that the three improved likelihood ratio tests considered here have similar power. An application illustrated the usefulness of the improved statistics.

Zero-adjusted regression models are a wide class of regression models that contain the ZAGLMs. There are no previous studies that have evaluated the performance of improved hypothesis tests that simultaneously involve parameters of the continuous and discrete component of the model. Therefore, this work is the first to show that the improved likelihood ratio tests perform well for these kinds of hypotheses, which are useful when one wants to test if the distribution of the response variable is a function of a covariate.

Based on the results of the simulation studies and the features of the three corrected likelihood ratio tests considered here, we suggest that practitioners use the analytical Bartlett-corrected likelihood ratio test when the sample size is small and they want to perform hypothesis testing in the class of ZAGLMs. Our recommendation rests mainly on two reasons. First, the three tests performed similarly in terms of size and power, but in one of the scenarios considered here, one of the bootstrap-corrected tests was slightly size distorted. Second, the bootstrap relies on random resampling. Consequently, when the p-value of the test is close to the significance level, two practitioners can reach different conclusions for the same database and hypothesis. This does not happen with the analytical correction.
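The effect of bootstrap randomness near the significance level can be illustrated with a small sketch. It is not an implementation of \(\text{ LR}_\textrm{boot1}\) or \(\text{ LR}_\textrm{boot2}\): as a stand-in for statistics recomputed on bootstrap samples, the null replicates are simply drawn from a chi-squared distribution with 3 degrees of freedom, and the observed statistic, the number of replicates and the seeds are hypothetical values chosen for illustration.

```python
import random


def bootstrap_p_value(observed, n_boot, df, seed):
    """Monte Carlo p-value: share of simulated null statistics >= observed.

    Assumption: the null replicates are drawn from a chi-squared(df)
    distribution, standing in for statistics recomputed on bootstrap samples.
    """
    rng = random.Random(seed)
    # A chi-squared(df) variate is a gamma variate with shape df/2 and scale 2.
    replicates = [rng.gammavariate(df / 2.0, 2.0) for _ in range(n_boot)]
    exceed = sum(stat >= observed for stat in replicates)
    # Adding 1 to numerator and denominator keeps the p-value strictly positive.
    return (exceed + 1) / (n_boot + 1)


observed_lr = 7.8  # hypothetical statistic, near the 5% critical value for 3 df
p_values = [
    bootstrap_p_value(observed_lr, n_boot=500, df=3, seed=s) for s in range(5)
]
print([round(p, 3) for p in p_values])
```

With the same data and hypothesis, the five seeds play the role of five practitioners. Their p-values scatter around 0.05, so some may reject at the \(5\%\) level while others do not, whereas an analytical correction returns the same p-value every time.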