1 Introduction

Let \((X_1,Y_1)\) and \((X_2,Y_2)\) be two independent random vectors. Assume that they satisfy the general nonparametric regression models,

$$\begin{aligned} Y_k=m_k(X_k)+\sigma _k(X_k)\varepsilon _k, \end{aligned}$$
(1)

where \( m_k(x) = E(Y_k \mid X_k=x) \) is the regression function, \(\sigma _k^2(x) = Var(Y_k \mid X_k=x)\) is the conditional variance function and \(\varepsilon _k\) is the regression error, which is assumed to be independent of \(X_k\), \(k=1,2\). By construction, \(E(\varepsilon _k)=0\) and \(Var(\varepsilon _k)=1\). The regression functions, the variance functions, the error distributions and the covariate distributions are unknown, and no parametric models are assumed for them. We are interested in testing for the equality of the error distributions, that is, in tests of the null hypothesis

$$\begin{aligned} H_0: F_1= F_2, \end{aligned}$$

versus the alternative

$$\begin{aligned} H_1: F_1\ne F_2, \end{aligned}$$

where \(F_1\), \(F_2\) stand for the cumulative distribution function (CDF) of \(\varepsilon _1\) and \(\varepsilon _2\), respectively. In view of the uniqueness of the characteristic function (CF), the null hypothesis can equivalently be stated as

$$\begin{aligned} H_0: C_1=C_2, \end{aligned}$$

versus the alternative

$$\begin{aligned} H_1: C_1 \ne C_2, \end{aligned}$$

where \(C_k\) denotes the CF corresponding to \(F_k\), that is, \(C_k(t)=\int \exp (\mathrm{i}tx)\,dF_k(x)=R_{k}(t)+\mathrm{i}I_{k}(t)\), \(k=1,2\).

The equality of the error distributions is a usual assumption in several statistical problems, such as testing for the equality of regression curves (see, for example, Young and Bowman 1995; Hall and Hart 1990; Kulasekera and Wang 2001). The equality of the error distributions may considerably simplify some procedures. For instance, under \(H_0\), the asymptotic null distribution of the test statistic for the equality of regression functions in Pardo-Fernández et al. (2015a) coincides with that of the classical ANOVA test for comparing means (this also requires the equality of the densities of the covariates). Another example is given by the asymptotic null distribution of the test statistic for the equality of variance functions in Pardo-Fernández et al. (2015b), based on CFs. The authors prove that, when the covariates are identically distributed and the error distributions are equal, the asymptotic distribution coincides with that of the classical Levene test for comparing variances. Thus, the problem of testing \(H_0\) is of considerable practical interest.

The problem of testing whether two samples come from the same population has generated a considerable amount of research. Many different approaches have been proposed to deal with this problem when the data are observable (see, for example, the references in Meintanis 2005; Alba-Fernández et al. 2008; Hobza et al. 2014; Baringhaus and Kolbe 2015; Modarres 2016). In our setting the errors are not observable and the inference must be based on the residuals, which are not independent even if the original data are. The number of proposals dealing with this case is considerably smaller. Mora (2005) proposed tests for \(H_0\) when the regression models are linear; for the more general model given in (1), Pardo-Fernández (2007), PF07 from now on, also proposed tests for \(H_0\). These two papers study Kolmogorov–Smirnov (KS) and Cramér–von Mises (CvM) type test statistics based on the empirical CDFs of the residuals. Since the null distribution of these test statistics is unknown, both papers use a smooth bootstrap to approximate the critical values. These procedures have two main drawbacks: they impose strong conditions on the error distributions, which, among other things, are assumed to have a smooth density; in addition, although quite easy to implement, the bootstrap can become computationally expensive as the sample sizes increase.

This paper proposes a test for \(H_0\) that is based on comparing the empirical CF (ECF) of the residuals in samples from the models. It can be seen as a residual version of the test in Alba-Fernández et al. (2008), designed for the two-sample problem when observable independent and identically distributed (IID) data are available from each population. A weighted bootstrap (WB) estimator, in the sense of Burke (2000), is proposed to approximate the critical values. This method has been previously used in Kojadinovic and Yan (2012) and Ghoudi and Rémillard (2014), to approximate the null distribution of goodness-of-fit tests based on the empirical CDF, in Jiménez-Gamero and Kim (2015), to approximate the null distribution of goodness-of-fit tests based on the ECF, in Quessy and Éthier (2012), for the two-sample problem for dependent data, and in Jiménez-Gamero et al. (2016), for the two-sample problem for observable independent data, among others. In view of the good properties of the WB in these and other papers, it is also expected to work well for approximating the null distribution of the test statistic considered in this paper.

Compared to the tests in PF07, the procedure suggested in this paper has two main advantages. First, it requires less stringent assumptions on the distribution of the regression errors. Specifically, we do not assume that the error distribution has a probability density function, so the method can be applied when that distribution is arbitrary: continuous, discrete or mixed. Secondly, the WB approximation is computationally more efficient than the one based on the smooth bootstrap.

The paper is organized as follows. Section 2 describes the test statistic. The problem of approximating the null distribution of the proposed test statistic is dealt with in Sect. 3, where the use of a WB estimator is studied. The consistency of the resulting null distribution estimator is proved. It is also shown that the resulting test is consistent, in the sense of being able to detect any alternative. Some practical issues are addressed in Sect. 4. Section 5 reports the results of some simulation experiments designed to study the finite sample performance of the proposed approximation and to compare it with other methods, as well as a real data application. From this numerical study it is concluded that the WB approximation works, in the sense of providing levels close to the nominal values, and that the power of the test is comparable to, or even greater than, the power of the test based on the empirical CDF. Section 6 shows how the proposed test can be extended to the general case of comparing \(d>2\) error distributions. All proofs and technical details are deferred to the last section.

The following notation will be used throughout the paper: all vectors are column vectors; the superscript \(^T\) denotes transpose; \(1_n\in \mathbb {R}^n\) has all its components equal to 1; if \(x \in \mathbb {R}^k\), with \(x^T=(x_1, \ldots , x_k)\), then diag(x) is the \(k \times k\) diagonal matrix whose \((i,i)\) entry is \(x_i\), \(1\le i \le k\); \(P_0\), \(E_0\) and \(Cov_0\) denote probability, expectation and covariance, respectively, by assuming that the null hypothesis is true; \(P_*\), \(E_*\) and \(Cov_*\) denote the conditional probability law, expectation and covariance, given the data, respectively; \(\mathop {\rightarrow }\limits ^{\mathcal {L}}\) denotes convergence in distribution; \(\mathop {\rightarrow }\limits ^{P}\) denotes convergence in probability; \(\mathop {\rightarrow }\limits ^{a.s.}\) denotes almost sure convergence; for any complex number \(z=a+\text{ i }b\), |z| is its modulus; an unspecified integral denotes integration over the whole real line \(\mathbb {R}\); for a given non-negative real-valued function \(\omega \), \(\Vert \cdot \Vert _{\omega }\) and \(\langle \cdot , \cdot \rangle _\omega \) denote the norm and the scalar product, respectively, in the Hilbert space \(L^2(\omega )=\{g:\mathbb {R} \rightarrow \mathbb {C}, \, \int |g(t)|^2\omega (t)dt<\infty \}\).

2 The test statistic

Let \((X_{kj},Y_{kj})\), \(j=1,2,\ldots ,n_k\), be an IID sample from \((X_k,Y_k)\) satisfying (1), and let \(\varepsilon _{k1},\ldots ,\varepsilon _{kn_k}\) denote the associated errors, \(k=1,2\). Since the hypothesis \(H_0\) establishes the equality of the distributions of the errors \(\varepsilon _{kj}\) and they are not observable, the inference must be based on the residuals,

$$\begin{aligned} \hat{\varepsilon }_{kj}=\frac{Y_{kj}-\hat{m}_k(X_{kj})}{\hat{\sigma }_k(X_{kj})}, \quad j=1,2,\ldots ,n_k, \end{aligned}$$
(2)

where \(\hat{m}_k\) and \(\hat{\sigma }_k\) are estimators of \({m}_k\) and \(\sigma _k\), respectively, \(k=1,2\). Several choices are possible for \(\hat{m}_k\) and \(\hat{\sigma }_k\). Here, as in PF07, we use the following kernel estimators for the regression function \(m_k\) and the variance function \(\sigma _k^2\),

$$\begin{aligned} \hat{m}_k(x)= \sum _{j=1}^{n_k} W_{kj}(x;h_{k})Y_{kj}, \quad x \in S, \end{aligned}$$
$$\begin{aligned} \hat{\sigma }_k^2(x)=\sum _{j=1}^{n_k} W_{kj}(x;h_{k})\left\{ Y_{kj}-\hat{m}_k(x)\right\} ^2, \quad x \in S, \end{aligned}$$

where

$$\begin{aligned} W_{kj}(x;h_{k})=\frac{K_{h_{k}}\left( X_{kj}-x\right) }{\sum _{s=1}^{n_k} K_{h_{k}}\left( X_{ks}-x\right) }, \quad x \in S, \end{aligned}$$

\(K_{h}(\cdot )=\frac{1}{h}K(\frac{\cdot }{h})\), \(K(\cdot )\) is a kernel and \(h_{k}\) is the bandwidth, satisfying certain conditions that will be specified later. The proposed test statistic takes the form

$$\begin{aligned} T_{n_1,n_2}=\Vert \hat{C}_1-\hat{C}_2\Vert ^2_\omega , \end{aligned}$$

where

$$\begin{aligned} \hat{C}_k(t)=\frac{1}{n_k}\sum _{j=1}^{n_k} \exp (\mathrm{i }t\hat{\varepsilon }_{kj})=\hat{R}_k(t)+\mathrm{i} \hat{I}_k(t), \end{aligned}$$
(3)
$$\begin{aligned} \hat{R}_k(t)=\frac{1}{n_k}\sum _{j=1}^{n_k}\cos (t\hat{\varepsilon }_{kj}),\quad \hat{I}_k(t)=\frac{1}{n_k}\sum _{j=1}^{n_k} \sin (t\hat{\varepsilon }_{kj}), \end{aligned}$$

\(k=1,2,\) and \(\omega (t)\) is a non-negative function.
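For illustration, the residuals in (2) and the ECFs in (3) can be obtained along the following lines. This is a minimal R sketch, assuming the Epanechnikov kernel used later in Sect. 5; the bandwidth h and the grid of t values are left as arguments and are not prescribed by the theory beyond Assumption (A.3).

```r
## Nadaraya-Watson estimators of m_k and sigma_k^2 and the residuals in (2)
epa <- function(u) 0.75 * (1 - u^2) * (abs(u) <= 1)   # Epanechnikov kernel

nw_residuals <- function(x, y, h) {
  n <- length(x)
  eps <- numeric(n)
  for (j in 1:n) {
    w <- epa((x - x[j]) / h)
    w <- w / sum(w)                    # weights W_kj(x; h_k)
    mhat  <- sum(w * y)                # hat m_k(x_j)
    s2hat <- sum(w * (y - mhat)^2)     # hat sigma_k^2(x_j)
    eps[j] <- (y[j] - mhat) / sqrt(s2hat)
  }
  eps
}

## Empirical CF of the residuals, hat C_k(t), on a grid of t values
ecf <- function(eps, tgrid) {
  complex(real      = sapply(tgrid, function(t) mean(cos(t * eps))),
          imaginary = sapply(tgrid, function(t) mean(sin(t * eps))))
}
```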

In order to study properties of \(T_{n_1,n_2}\) some assumptions will be required. Next we list them.

  1. (A.1)

    All limits in this paper are taken when \(n_1,n_2 \rightarrow \infty \) in such a way that

    $$\begin{aligned} \lim \frac{n_1}{N}=\tau , \quad \text {for some} \quad \tau \in (0,1), \end{aligned}$$

    where \(N=n_1+n_2\).

  2. (A.2)

    The weight function \(\omega (t)\) is a non-negative symmetric function,

    $$\begin{aligned} \omega (t)=\omega (-t), \quad \forall t, \end{aligned}$$

    and \(\int t^4 \omega (t)dt<\infty \).

There is no loss of generality in assuming that the weight function \(\omega (t)\) is symmetric because otherwise, by defining \(\omega _1(t)=0.5\{\omega (t)+\omega (-t)\}\), which is clearly symmetric, we have that

$$\begin{aligned} \Vert C_1-C_2\Vert _{\omega }=\Vert C_1-C_2\Vert _{\omega _1}, \end{aligned}$$

for any two CFs \(C_1\) and \(C_2\). Note that the symmetry of \(\omega \) implies that

$$\begin{aligned} T_{n_1,n_2}=\Vert \hat{R}_{1}-\hat{R}_{2}+\hat{I}_{1}-\hat{I}_{2}\Vert ^2_\omega . \end{aligned}$$

The following assumption will be required to ensure that \(\hat{m}_k\) and \(\hat{\sigma }_k\) provide consistent estimators of \(m_k\) and \(\sigma _k\), respectively.

  1. (A.3)

    For \(k=1,2\),

    1. (i)

      \(E(\varepsilon _{k}^4)<\infty \).

    2. (ii)

      \(X_k\) is absolutely continuous with compact support S and density function \(f_{k}\).

    3. (iii)

      \(f_k, m_k\) and \(\sigma _k\) are twice continuously differentiable.

    4. (iv)

      \(\inf _{x\in S}f_k>0\) and \(\inf _{x\in S}\sigma _k>0\).

    5. (v)

      \(n_kh_{k}^4\rightarrow 0\) and \(n_kh_{k}^2/\ln n_k \rightarrow \infty \).

    6. (vi)

      K is a symmetric density function with compact support and twice continuously differentiable.

For simplicity we assume that the same kernel function, K, is used in both populations. Nevertheless, the results to be stated remain true if different kernels are used, whenever they satisfy Assumption (A.3)(vi).

In order to give a sound justification of \(T_{n_1,n_2}\) as a test statistic for testing \(H_0\) we next derive its limit.

Theorem 1

Suppose that Assumptions (A.1)–(A.3) hold, then \(T_{n_1,n_2}\mathop {\rightarrow }\limits ^{P} \kappa =\Vert C_1-C_2\Vert ^2_{\omega }.\)

Note that \(\kappa \ge 0\). If \(H_0\) is true then \(\kappa =0\). Since two distinct characteristic functions can coincide on a finite interval (Feller 1971, p. 506), a general way to ensure that \(\kappa =0\) if and only if \(H_0\) is true is to take \(\omega \) positive for almost all (with respect to the Lebesgue measure) points in \(\mathbb {R}\). Thus, a reasonable test for \(H_0\) should reject the null hypothesis for large values of \(T_{n_1,n_2}\). Now, in order to decide which values are large, we must calculate its null distribution, or at least an approximation to it. This is the topic of the next section.

3 Approximating the null distribution

The null distribution of \(T_{n_1,n_2}\) is clearly unknown, so it must be approximated. We first try to estimate it by means of its asymptotic null distribution.

Let

$$\begin{aligned} R'_k(t)=-\int x \sin (tx)dF_k(x), \quad I'_k(t)= \int x \cos (tx)dF_k(x), \end{aligned}$$

\(k=1,2\). Note that under the null hypothesis \(R_1(t)=R_2(t)=R(t)\), \(I_1(t)=I_2(t)=I(t)\), \(R'_1(t)=R'_2(t)=R'(t)\) and \(I'_1(t)=I'_2(t)=I'(t)\).

Theorem 2

Suppose that Assumptions (A.1)–(A.3) hold. If \(H_0\) is true, then

$$\begin{aligned} \frac{n_1n_2}{N}T_{n_1,n_2}\mathop {\longrightarrow }\limits ^{\mathcal {L}}\Vert Z\Vert ^2_{\omega }, \end{aligned}$$

where \(\{Z(t), \, t\in \mathbb {R}\}\) is a centered Gaussian process on \(L_2(\omega )\) with covariance kernel \(\varrho _0(t,s)=Cov_0\{Z_0(\varepsilon ;t),Z_0(\varepsilon ;s)\}\) and

$$\begin{aligned} Z_0(\varepsilon ;t)&= \cos (t\varepsilon )+t\varepsilon I(t)-t\frac{\varepsilon ^2-1}{2}R'(t)-R(t) \\&\quad + \sin (t\varepsilon )-t\varepsilon R(t)-t\frac{\varepsilon ^2-1}{2}I'(t)-I(t). \end{aligned}$$

Remark 1

The asymptotic null distribution of \(\frac{n_1n_2}{N}T_{n_1,n_2}\) can be expressed as

$$\begin{aligned} \Vert Z\Vert _{\omega }^2 \mathop {=}\limits ^{d} \sum _{j\ge 1} \lambda _j Z_j^2, \end{aligned}$$
(4)

where \(\mathop {=}\limits ^{d}\) stands for equality in distribution, \(Z_1, Z_2, \ldots \) are independent standard normal variables and \(\{ \lambda _j\}_{j \ge 1}\) are the non-zero eigenvalues of the integral equation

$$\begin{aligned} \int \varrho _0(t,s)g_j(t)\omega (t)dt=\lambda _j g_j(s), \end{aligned}$$

with corresponding eigenfunctions \(\{g_j(\cdot )\}_{j \ge 1}\).

From Remark 1 it becomes evident that the asymptotic null distribution of \(T_{n_1,n_2}\) does not provide a useful approximation to its null distribution since it depends on the unknown common distribution. So, we next study another way of approximating it by means of a WB estimator.

Let

$$\begin{aligned} C_\tau (t)= \tau C_1(t)+(1-\tau )C_2(t)=R_\tau (t)+\mathrm{i}I_\tau (t). \end{aligned}$$

Let \(\xi _{11},\ldots ,\xi _{1n_1},\xi _{21},\ldots ,\xi _{2n_2}\) be IID random variates with mean 0 and variance 1, which are independent of \((X_{11},Y_{11}),\ldots ,(X_{1n_1},Y_{1n_1}),(X_{21},Y_{21}),\ldots ,\)\((X_{2n_2},Y_{2n_2})\). We define the following WB version of \(T_{n_1,n_2}\),

$$\begin{aligned} T_{1,n_1,n_2}^*=\Vert C_{1}^*-C_{2}^*\Vert ^2_\omega , \end{aligned}$$

where

$$\begin{aligned} {C}_{k}^*(t)=\frac{1}{n_k}\sum _{j=1}^{n_k}\xi _{kj}Z_{k,\tau }(\varepsilon _{kj};t), \end{aligned}$$
$$\begin{aligned} Z_{k,\tau }(\varepsilon ; t)&= \cos (t\varepsilon )+t\varepsilon I_{k}(t)-t\frac{\varepsilon ^2-1}{2}R_{k}'(t)-R_\tau (t)\\&\quad +\sin (t\varepsilon )-t\varepsilon R_{k}(t)-t\frac{\varepsilon ^2-1}{2}I_{k}'(t)-I_\tau (t), \end{aligned}$$
(5)

\(k=1,2\). The next result gives the weak limit of the conditional distribution of \(T_{1,n_1,n_2}^*\), given the data \((X_{11},Y_{11}),\ldots ,\)\((X_{1n_1},Y_{1n_1}),(X_{21},Y_{21}),\ldots , \)\((X_{2n_2},Y_{2n_2})\).

Theorem 3

Suppose that Assumptions (A.1)–(A.3) hold, then

$$\begin{aligned} \sup _x \left| P_*\left\{ \frac{n_1n_2}{N}T_{1,n_1,n_2}^*\le x\right\} -P\left\{ T_\tau \le x \right\} \right| \mathop {\longrightarrow }\limits ^{P}0, \end{aligned}$$

where \(T_\tau =\Vert Z_\tau \Vert ^2_\omega \), \(\{Z_\tau (t), \, t\in \mathbb {R}\}\) is a centered Gaussian process on \(L_2(\omega )\) with covariance kernel \(\varrho _\tau (t,s)=(1-\tau )\varrho _{1,\tau }(t,s)+\tau \varrho _{2,\tau }(t,s)\) and \(\varrho _{k,\tau }(t,s)=E\{Z_{k,\tau }(\varepsilon _k;t)Z_{k,\tau }(\varepsilon _k;s)\}\), \(k=1,2\).

The result in Theorem 3 is valid whether the null hypothesis is true or not. If \(H_0\) holds, then the kernels \(\varrho _0(t,s)\) and \(\varrho _\tau (t,s)\) coincide. Therefore, a direct consequence of Theorems 2 and 3 is that the conditional distribution of \(T_{1,n_1,n_2}^*\), given the data, provides a consistent estimator of the distribution of \(T_{n_1,n_2}\) when \(H_0\) is true. However, from a practical point of view, this result is useless because the function \(Z_{k,\tau }(\varepsilon _{kj};t)\) depends on the non-observable error \(\varepsilon _{kj}\) and on the unknown values of the functions \(R_{k}(t)\), \(I_{k}(t)\), \(R_{k}'(t)\) and \(I_{k}'(t)\), \(j=1,\ldots ,n_k\), \(k=1,2\). To overcome these difficulties we replace \(\varepsilon _{kj}\) by \(\hat{\varepsilon }_{kj}\), \(R_{k}(t)\) by \(\hat{R}_{k}(t)\), \(I_{k}(t)\) by \(\hat{I}_{k}(t)\), \(R_{k}'(t)\) by \(\hat{R}_{k}'(t)\) and \(I_{k}'(t)\) by \(\hat{I}_{k}'(t)\), with

$$\begin{aligned} \hat{R}'_{k}(t)= -\frac{1}{n_k}\sum _{j=1}^{n_k} \hat{\varepsilon }_{kj}\sin (t\hat{\varepsilon }_{kj}), \quad \hat{I}'_{k}(t)=\frac{1}{n_k}\sum _{j=1}^{n_k} \hat{\varepsilon }_{kj}\cos (t\hat{\varepsilon }_{kj}). \end{aligned}$$

So, instead of \(T_{1,n_1,n_2}^*\), now we consider

$$\begin{aligned} T_{2,n_1,n_2}^*= \Vert \hat{U}^*_1-\hat{U}^*_2\Vert ^2_\omega , \end{aligned}$$

where

$$\begin{aligned} \hat{U}^*_{k}(t)=\frac{1}{n_k}\sum _{j=1}^{n_k}\bigg \{&\cos (t\hat{\varepsilon }_{kj})+t\hat{\varepsilon }_{kj} \hat{I}_{k}(t)-t\frac{\hat{\varepsilon }_{kj}^2-1}{2}\hat{R}_{k}'(t)-\hat{R}_\tau (t) \\&+\sin (t\hat{\varepsilon }_{kj})-t\hat{\varepsilon }_{kj} \hat{R}_{k}(t)-t\frac{\hat{\varepsilon }_{kj}^2-1}{2}\hat{I}_{k}'(t)-\hat{I}_\tau (t)\bigg \}\xi _{kj}, \end{aligned}$$
(6)

\(k=1,2\), and

$$\begin{aligned} \hat{R}_\tau (t) =\frac{n_1}{N} \hat{R}_{1}(t)+ \frac{n_2}{N} \hat{R}_{2}(t), \quad \hat{I}_\tau (t) =\frac{n_1}{N} \hat{I}_{1}(t)+ \frac{n_2}{N} \hat{I}_{2}(t). \end{aligned}$$

The next theorem states that replacing \(\varepsilon _{kj}\) by \(\hat{\varepsilon }_{kj}, \ldots , \, I_{k}'(t)\) by \(\hat{I}_{k}'(t)\) in the expression of \(T_{1,n_1,n_2}^*\) has no asymptotic effect, in the sense that both \(T_{1,n_1,n_2}^*\) and \(T_{2,n_1,n_2}^*\) have the same conditional asymptotic distribution, given the data. Observe that all quantities involved in the definition of \(T_{2,n_1,n_2}^*\) are known; thus, in principle, one can compute, or at least accurately approximate, its conditional distribution given the data. This practical issue is handled in Sect. 4.

Theorem 4

Suppose that Assumptions (A.1)–(A.3) hold, then

$$\begin{aligned} \sup _x \left| P_*\left\{ \frac{n_1n_2}{N}T_{2,n_1,n_2}^* \le x \right\} -P\left\{ T_\tau \le x \right\} \right| \mathop {\longrightarrow }\limits ^{P} 0, \end{aligned}$$

where \(T_\tau \) is as defined in Theorem 3.

The result in Theorem 4 is valid whether the null hypothesis \(H_0\) is true or not. As observed before for \(T_{1,n_1,n_2}^*\), an immediate consequence of this fact and Theorem 2 is the following.

Corollary 1

If \(H_0\) is true and the assumptions in Theorem 4 hold, then

$$\begin{aligned} \sup _x \left| P_*\left\{ \frac{n_1n_2}{N}T_{2,n_1,n_2}^* \le x\right\} -P\left\{ \frac{n_1n_2}{N}T_{n_1,n_2}\le x\right\} \right| \mathop {\longrightarrow }\limits ^{P}0. \end{aligned}$$

Let \(\alpha \in (0,1)\). For testing \(H_0\) we consider

$$\begin{aligned} \Psi _*=\left\{ \begin{array}{lll} 1, &{} \quad \text {if} \quad T_{n_1,n_2}\ge t_{2,n_1,n_2,\alpha }^*,\\ 0, &{}\quad \text {otherwise,} \end{array}\right. \end{aligned}$$

where \( t_{2,n_1,n_2,\alpha }^*\) is the \(1-\alpha \) percentile of the conditional distribution of \(T_{2,n_1,n_2}^*\), or equivalently, \(\Psi _*=1\) if \(p^*\le \alpha \), where \(p^*=P_* \left\{ T_{2,n_1,n_2}^* \ge T_{n_1,n_2,obs}\right\} \) and \(T_{n_1,n_2,obs}\) is the observed value of the test statistic. The result in Corollary 1 states that \(\Psi _*\) is asymptotically correct, in the sense that its type I error probability is asymptotically equal to the nominal value \(\alpha \).

From Theorems 1, 2 and 4, the next result readily follows.

Corollary 2

Suppose that \(H_0\) is not true, the assumptions in Theorem 4 hold and \(\omega \) is such that

$$\begin{aligned} \kappa =\Vert C_1-C_2\Vert ^2_{\omega }>0, \end{aligned}$$
(7)

then \(P(\Psi _*=1)\rightarrow 1\).

Corollary 2 shows that, if \(\omega \) is such that (7) holds, then the test \(\Psi _*\) is consistent in the sense of being able to asymptotically detect any (fixed) alternative. As discussed before, a general way to ensure (7) is to take \(\omega \) positive for almost all (with respect to the Lebesgue measure) points in \(\mathbb {R}\).

Remark 2

The results stated so far remain true if, instead of the raw multipliers, \(\xi _{11},\ldots ,\xi _{1n_1}\), \(\xi _{21},\ldots ,\xi _{2n_2}\), we use the centered multipliers, \(\xi _{11}-\bar{\xi }_1,\ldots ,\xi _{1n_1}-\bar{\xi }_1\), \(\xi _{21}-\bar{\xi }_2,\ldots ,\xi _{2n_2}-\bar{\xi }_2\), where \(\bar{\xi }_k=\frac{1}{n_k}\sum _{j=1}^{n_k}\xi _{kj}\), \(k=1,2\), as suggested in Burke (2000) and Kojadinovic and Yan (2012).

Remark 3

In Remark 1 we saw that the asymptotic null distribution of \(T_{n_1,n_2}\) is a linear combination of independent \(\chi ^2\) variables, the weights in that linear combination being the eigenvalues \(\{ \lambda _j\}_{j \ge 1}\) of a certain integral equation. Routine algebra shows that the conditional distribution of \(T_{2,n_1,n_2}^*\), given the data, can also be expressed as a linear combination of a finite set of variables, \(\sum _{j=1}^N \hat{\lambda }_j W_j^2\), where \(\hat{\lambda }_1, \ldots , \hat{\lambda }_{N}\) are the eigenvalues of the symmetric \(N \times N\) matrix \(M_2\) defined in the next section (see Eqs. (9) and (10)), and \((W_1, \ldots , W_{N})=(\xi _{11}, \ldots , \xi _{1n_1},\xi _{21}, \ldots , \xi _{2n_2})H\), H being the matrix containing the eigenvectors associated with the eigenvalues \(\hat{\lambda }_1, \ldots , \hat{\lambda }_{N}\), that is, \(M_2=H \, diag(\hat{\lambda }_1, \ldots , \hat{\lambda }_{N}) H^T\). Moreover, the set \(\{\hat{\lambda }_j\}_{j =1}^N\) converges to \(\{ \lambda _j\}_{j \ge 1}\) (see Dehling and Mikosch 1994).

Remark 4

From Remark 3 it becomes evident that the conditional distribution of \(T_{2,n_1,n_2}^*\), given the data, depends on the distribution of \((W_1, \ldots , W_{N})\). The distribution of this random vector is, in general, unknown. For the special case where the multipliers come from a standard normal distribution, the vector \((W_1, \ldots , W_{N})\) has independent components distributed according to a standard normal distribution, and thus the conditional distribution of \(T_{2,n_1,n_2}^*\), given the data, is a finite linear combination of independent \(\chi ^2\) variables, where the weights in the linear combination are \(\hat{\lambda }_1, \ldots , \hat{\lambda }_{N}\). Note that in this special case the WB distribution of the test statistic is of the same type as its asymptotic null distribution.

4 On the practical calculation

This section describes some computational issues related to the calculation of the test statistic \(T_{n_1,n_2}\) and the WB approximation to its null distribution.

4.1 Calculation of the test statistic

Let

$$\begin{aligned} M=\left( \begin{array}{cc} M_{11} &{}\quad M_{12}\\ M_{21} &{}\quad M_{22}\end{array}\right) , \quad A=\left( \begin{array}{cc} \frac{1}{n_1^2}1_{n_1}1_{n_1}^T &{}\quad \frac{-1}{n_1n_2}1_{n_1}1_{n_2}^T\\ \frac{-1}{n_1 n_2}1_{n_2}1_{n_1}^T &{}\quad \frac{1}{n_2^2}1_{n_2}1_{n_2}^T\end{array}\right) , \end{aligned}$$

with \(M_{rs}=(\varphi _{\omega }(\hat{\varepsilon }_{rj}-\hat{\varepsilon }_{sv}))_{1\le j \le n_r,\, 1\le v \le n_s}\), \(r,s=1,2\), and

$$\begin{aligned} \varphi _{\omega }(x)=\int \cos (tx)\omega (t)dt. \end{aligned}$$
(8)

Let v be the vector of \(\mathbb {R}^{N}\) with the first \(n_1\) components equal to \(1/n_1\) and the rest equal to \(-1/n_2\). In practice, the test statistic \(T_{n_1,n_2}\) can be computed by using the following expression (see Lemma 1 in Alba-Fernández et al. 2008)

$$\begin{aligned} T_{n_1,n_2}&= \frac{1}{n_1^2}\sum _{j,r=1}^{n_1}\varphi _{\omega }(\hat{\varepsilon }_{1j}-\hat{\varepsilon }_{1r})+ \frac{1}{n_2^2}\sum _{l,v=1}^{n_2}\varphi _{\omega }(\hat{\varepsilon }_{2l}-\hat{\varepsilon }_{2v})\\&\quad - \frac{2}{n_1n_2}\sum _{j=1}^{n_1}\sum _{l=1}^{n_2}\varphi _{\omega }(\hat{\varepsilon }_{1j}-\hat{\varepsilon }_{2l})\\&= v^TMv\\&= 1_{N}^TM_11_{N}, \end{aligned}$$

with \(M_1=M \odot A\), \(\odot \) denoting the Hadamard product.
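As an illustration, for the Gaussian weight \(\omega (t)=\exp (-\beta t^2)\) used in Sect. 5 one has \(\varphi _{\omega }(x)=\sqrt{\pi /\beta }\,\exp \{-x^2/(4\beta )\}\), and the above expression can be coded directly. This is a sketch; the default value of \(\beta \) is only illustrative.

```r
## T_{n1,n2} = v^T M v for the Gaussian weight omega(t) = exp(-beta t^2),
## for which phi_omega(x) = sqrt(pi/beta) * exp(-x^2 / (4 beta))
Tstat <- function(eps1, eps2, beta = 0.15) {
  n1 <- length(eps1); n2 <- length(eps2)
  phi <- function(x) sqrt(pi / beta) * exp(-x^2 / (4 * beta))
  M <- outer(c(eps1, eps2), c(eps1, eps2), function(a, b) phi(a - b))
  v <- c(rep(1 / n1, n1), rep(-1 / n2, n2))
  drop(t(v) %*% M %*% v)
}
```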

The WB version of \(T_{n_1,n_2}\), \(T_{2,n_1,n_2}^*\), can be expressed as

$$\begin{aligned} T_{2,n_1,n_2}^{*}=\xi ^TM_2 \xi , \end{aligned}$$

with \(\xi ^T=(\xi _{11}, \ldots , \xi _{1n_1},\xi _{21}, \ldots , \xi _{2n_2})\) and

$$\begin{aligned} M_2=M_3\odot A, \quad M_3=\left( \begin{array}{cc} M_{3,11} &{}\quad M_{3,12}\\ M_{3,21} &{}\quad M_{3,22}\end{array}\right) , \end{aligned}$$
(9)
$$\begin{aligned} M_{3,rs}=\left( \int Z_{r,\tau }(\hat{\varepsilon }_{rj};t)Z_{s,\tau }(\hat{\varepsilon }_{sv};t)\omega (t)dt\right) _{1\le j \le n_r,\, 1\le v \le n_s}, \end{aligned}$$
(10)

\( r,s=1,2\). An explicit expression for \(M_3\) is given in the Appendix.
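The explicit expression of \(M_3\) in the Appendix avoids numerical integration. Purely for illustration, the entries of \(M_3\) can also be approximated numerically on a grid of t values, since every quantity in (10) is computable from the residuals; the following sketch does so for the Gaussian weight, with the grid and \(\beta \) as illustrative choices.

```r
## Numerical-integration sketch of M_3 in (10) for omega(t) = exp(-beta t^2)
build_M3 <- function(eps1, eps2, beta = 0.15, tgrid = seq(-20, 20, by = 0.01)) {
  n1 <- length(eps1); n2 <- length(eps2); N <- n1 + n2
  w  <- exp(-beta * tgrid^2)          # omega(t) on the grid
  dt <- tgrid[2] - tgrid[1]
  ecf_parts <- function(eps) {        # hat R_k, hat I_k, hat R'_k, hat I'_k
    list(R  = sapply(tgrid, function(t)  mean(cos(t * eps))),
         I  = sapply(tgrid, function(t)  mean(sin(t * eps))),
         Rp = sapply(tgrid, function(t) -mean(eps * sin(t * eps))),
         Ip = sapply(tgrid, function(t)  mean(eps * cos(t * eps))))
  }
  p1 <- ecf_parts(eps1); p2 <- ecf_parts(eps2)
  Rtau <- (n1 * p1$R + n2 * p2$R) / N
  Itau <- (n1 * p1$I + n2 * p2$I) / N
  Zrow <- function(e, p)              # hat Z_{k,tau}(hat eps; t) on the grid
    cos(tgrid * e) + tgrid * e * p$I - tgrid * (e^2 - 1) / 2 * p$Rp - Rtau +
    sin(tgrid * e) - tgrid * e * p$R - tgrid * (e^2 - 1) / 2 * p$Ip - Itau
  Z <- rbind(t(sapply(eps1, Zrow, p = p1)),
             t(sapply(eps2, Zrow, p = p2)))
  Z %*% (w * dt * t(Z))               # (j, v) entry ~ int Z_r Z_s omega(t) dt
}
```

The matrix \(M_2\) in (9) is then obtained as the Hadamard product of this matrix with A.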

4.2 Calculation of the WB distribution of the test statistic

Normal multipliers. As observed in Remark 4, if the multipliers have a normal distribution then, conditional on the data, \(T_{2,n_1,n_2}^{*}\) is distributed as \(W=\sum _{j=1}^{N}\hat{\lambda }_j\chi ^2_{1,j}\), where \(\hat{\lambda }_1, \ldots , \hat{\lambda }_{N}\) are the eigenvalues of \(M_2\) and \(\chi ^2_{1,1}, \ldots , \chi ^2_{1,N}\) are independent variables having a chi-squared distribution with 1 degree of freedom. The law of W can be numerically approximated by using, for example, Imhof’s method (Imhof 1961). In this case, the WB estimator of the p-value can be calculated as follows:

Algorithm 1

  1. 1.

    Calculate the residuals \(\hat{\varepsilon }_{11}, \ldots ,\hat{\varepsilon }_{1n_1}\),\(\hat{\varepsilon }_{21}, \ldots ,\hat{\varepsilon }_{2n_2}\).

  2. 2.

    Calculate the observed value of the test statistic \(T_{n_1,n_2}\), \(T_{n_1,n_2,obs}\).

  3. 3.

    Calculate the eigenvalues of \(M_2\), \(\hat{\lambda }_1, \ldots , \hat{\lambda }_{N}\).

  4. 4.

    Approximate the p-value by \(\hat{p}^{*}=P_*\left( \sum _{j=1}^{N}\hat{\lambda }_j\chi ^2_{1,j}>T_{n_1,n_2,obs}\right) \).
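A sketch of Algorithm 1 in R, assuming that \(M_2\) in (9) and the observed statistic \(T_{n_1,n_2,obs}\) have already been computed; the function imhof from the CompQuadForm package (see Sect. 5) evaluates the tail probability of a linear combination of independent \(\chi ^2_1\) variables.

```r
library(CompQuadForm)

## Steps 3 and 4 of Algorithm 1: eigenvalues of M2 and Imhof's tail probability
wb_pvalue_normal <- function(M2, T_obs) {
  lambda <- eigen(M2, symmetric = TRUE, only.values = TRUE)$values
  lambda <- lambda[lambda > 1e-12]    # keep the non-null eigenvalues
  imhof(T_obs, lambda)$Qq             # P( sum_j lambda_j chi^2_{1,j} > T_obs )
}
```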

Arbitrary multipliers. As also observed in Remark 4, for arbitrary multipliers the WB distribution of the test statistic is not available in closed form. Nevertheless, the WB p-value estimator can be easily approximated by simulation as follows. Let \(\Delta (u)=1\) if \(u>0\) and \(\Delta (u)=0\) if \(u\le 0\).

Algorithm 2

  1. 1.

    Calculate the residuals \(\hat{\varepsilon }_{11}, \ldots ,\hat{\varepsilon }_{1n_1}\),\(\hat{\varepsilon }_{21}, \ldots ,\hat{\varepsilon }_{2n_2}\).

  2. 2.

    Calculate the observed value of the test statistic \(T_{n_1,n_2}\), \(T_{n_1,n_2,obs}\).

  3. 3.

    Calculate \(M_{2}\).

  4. 4.

    For some large integer B, repeat the following steps for every \(b \in \{1,\ldots ,B\}\):

    1. (a)

      Generate \(\xi _{11}, \ldots , \xi _{1n_1},\xi _{21}, \ldots , \xi _{2n_2}\) IID variables with mean 0 and variance 1.

    2. (b)

      Calculate \(T^{*b}_{2,n_1,n_2}=\xi ^TM_2\xi \).

  5. 5.

    Approximate the p-value by \(\hat{p}^*=\frac{1}{B}\sum _{b=1}^B \Delta (T^{*b}_{2,n_1,n_2}-T_{n_1,n_2,obs})\).
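A sketch of Algorithm 2 with centered standard normal multipliers (the per-sample centering follows Remark 2), again assuming that \(M_2\) and \(T_{n_1,n_2,obs}\) are available.

```r
## Steps 4 and 5 of Algorithm 2
wb_pvalue_sim <- function(M2, T_obs, n1, n2, B = 1000) {
  Tstar <- numeric(B)
  for (b in 1:B) {
    xi1 <- rnorm(n1); xi2 <- rnorm(n2)
    xi  <- c(xi1 - mean(xi1), xi2 - mean(xi2))   # centered multipliers
    Tstar[b] <- drop(t(xi) %*% M2 %*% xi)        # replicate of T*_{2,n1,n2}
  }
  mean(Tstar > T_obs)                            # hat p*
}
```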

5 Numerical results

5.1 Finite sample performance

The properties so far studied are asymptotic. In order to empirically investigate the performance of the proposed test for finite sample sizes, we carried out a simulation experiment. The objective of this experiment is fourfold: first, to study the goodness of the WB approximation to the null distribution of the test statistic; second, to analyze the WB approximation in terms of power, comparing it to the power that results when the bootstrap employed in PF07 is used to approximate the null distribution of the proposed test statistic (denoted as Boot in the tables); third, to compare the power of the proposed test to that of the CvM type test in PF07 (denoted as CM in the tables; the KS test is not considered in our simulation study because, in the simulations carried out in PF07, it was less powerful than the CvM test); and finally, to compare the WB and the bootstrap approximations in terms of the required CPU time. This section reports and summarizes the obtained results. All computations in this paper were performed using programs written in the R language (R Core Team 2015). Specifically, to numerically approximate the WB p-value by Imhof's method, the R package CompQuadForm (Duchesne and Lafaye de Micheaux 2010) was used.

Three specifications for the functions \(m_k\) and \(\sigma _k\) were considered:

  1. S1:

    \(Y_{kj}=X_{kj}+X_{kj}^2+(X_{kj}+0.5)\varepsilon _{kj}\), \(1\le j \le n_k\), \(k=1,2\),

  2. S2:

    \(Y_{kj}=X_{kj}+0.5\varepsilon _{kj}\), \(1\le j \le n_k\), \(k=1,2\),

  3. S3:

    \(Y_{1j}=X_{1j}+X_{1j}^2+(X_{1j}+0.5)\varepsilon _{1j}\), \(1\le j \le n_1\), and \(Y_{2j}=X_{2j}+0.5\varepsilon _{2j}\), \(1\le j \le n_2\),

with \(X_{kj}\sim U(0,1)\), \(1\le j \le n_k\), \(k=1,2\). For each of these specifications for \(m_k\) and \(\sigma _k\), the following three cases were considered for the error distribution:

  1. (i)

    \(\varepsilon _1, \varepsilon _2 \sim N(0,1)\),

  2. (ii)

    \(\varepsilon _1\sim N(0,1)\), \(\varepsilon _2 \sim E(1)-1\),

  3. (iii)

    \(\varepsilon _1\sim N(0,1)\), \(\varepsilon _2 \sim U(-\sqrt{3},\sqrt{3})\),

where E(1) stands for a negative exponential law with mean 1. Case (i) corresponds to the null hypothesis, while cases (ii) and (iii) are alternatives.
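For concreteness, one Monte Carlo data set under specification S1 with error case (ii) can be generated as follows (a sketch; the sample sizes are illustrative).

```r
## Specification S1 with error case (ii): covariates U(0,1), errors N(0,1)
## in the first sample and centered unit exponential, E(1) - 1, in the second
n1 <- 100; n2 <- 100
x1 <- runif(n1); e1 <- rnorm(n1)
x2 <- runif(n2); e2 <- rexp(n2, rate = 1) - 1
y1 <- x1 + x1^2 + (x1 + 0.5) * e1
y2 <- x2 + x2^2 + (x2 + 0.5) * e2
```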

To estimate the regression function and the conditional variance, the Epanechnikov kernel \(K(u)=0.75(1-u^2)\), \(|u|\le 1\), was employed.

As weight function for \(T_{n_1,n_2}\) we took \(\omega (t)=\exp (-\beta t^2)\). This weight function has been considered in many other test statistics involving ECFs (see, for example, the tests in Alba-Fernández et al. 2008; Meintanis 2005; Meintanis et al. 2015; Pardo-Fernández et al. 2015a, b, among others).

Another issue is the choice of the bandwidth. Since the choice of the bandwidth for tests based on smoothing remains an open issue (see, for example, de Uña-Álvarez 2013; González-Manteiga and Crujeiras 2013; Sperlich 2013), we proceeded as in the simulation studies in PF07 and Pardo-Fernández et al. (2015a): we took \(h_k=c\times n_k^{-a}\), where c and a are real constants. To decide the values of a and c, we performed an extensive simulation experiment with the aim of selecting those values giving type I error probabilities closest to the nominal values. We also tried several values for \(\beta \), specifically \(\beta \in \{0.05,0.10,0.15,0.20,0.25\}\). In general, better results, in the sense of agreement between the observed type I error probabilities and the target values, were obtained for \(\beta =0.15\). For this reason we fixed \(\beta =0.15\) in all simulations.

For each case and each specification of \(m_k\) and \(\sigma _k\), 1000 samples with sizes \(n_1,n_2 \in \{50,100\}\) were generated. For each sample, to approximate the WB p-value of the observed value of the test statistic, we applied Algorithm 2, with raw and centered multipliers generated from a standard normal distribution and \(B=1000\), as well as Algorithm 1. In the simulations we observed that, as expected, these approximations provided quite similar values. Nevertheless, the WB with centered multipliers gives slightly better results, in the sense of yielding type I error probabilities a bit closer to the nominal values than the other two. For this reason, we recommend its use. All results displayed in the tables were obtained by using Algorithm 2 with centered multipliers. To approximate the bootstrap p-value we proceeded as in PF07, generating 200 bootstrap samples. The tables report the fractions of p-values less than or equal to 0.05 and 0.10.

Tables 1, 2, 3 display the results for the level. Looking at them it can be concluded that for \(n_1, \, n_2 = 100\) all choices for a and c in these tables give values very similar to the true value of \(\alpha \), for all specifications and for all tests. In general, \(a=0.3\) and \(c=1.0\) give quite reasonable results, so we set these values for a and c to study the power.

Table 4 displays the results for the power. In case (ii) all tests have similar power for all considered specifications; in case (iii) the test proposed in this paper exhibits larger power than the one based on the empirical CDF. As for the WB and bootstrap approximations to the null distribution of \(T_{n_1,n_2}\), the bootstrap test is slightly more powerful than the one based on the WB approximation. Nevertheless, as the sample size increases, the powers of both tests become closer. This fact was also observed in Kojadinovic and Yan (2012) and Ghoudi and Rémillard (2014) for goodness-of-fit tests based on the empirical CDF. As will be seen below, the practical importance of this fact lies in that, for larger sample sizes, the bootstrap becomes extremely time consuming.

Table 1 Percentage of rejections at the significance levels \(5\%\) (upper entry) and \(10\%\) (lower entry) for case (i) and specification S1
Table 2 Percentage of rejections at the significance levels \(5\%\) (upper entry) and \(10\%\) (lower entry) for case (i) and specification S2
Table 3 Percentage of rejections at the significance levels \(5\%\) (upper entry) and \(10\%\) (lower entry) for case (i) and specification S3
Table 4 Percentage of rejections at the significance levels \(5\%\) and \(10\%\) for cases (ii) and (iii) and specifications S1–S3

We also compared the bootstrap and the WB approximations in terms of the required computing time. To calculate the WB approximation we used Algorithm 1 (denoted as WB1 in Table 5) and Algorithm 2 (denoted as WB2 in Table 5). For the comparison to be fair, we took \(B=1000\) for both the bootstrap and the WB2 estimators. As for the raw and the centered multipliers, the difference in the required computing time is negligible. Table 5 shows the CPU time, in seconds, needed to obtain a p-value for testing the equality of the error distributions for several sample sizes. Looking at this table it becomes evident that WB2 is more efficient than the bootstrap approximation in terms of the required computing time, especially for larger sample sizes. The difference between WB1 and WB2 is really small. The gain in computational efficiency of the WB over the bootstrap stems from the fact that one does not have to re-estimate \(m_k\) and \(\sigma _k\) at each iteration, which slows down the process considerably; by contrast, for the WB approximation, once the matrix \(M_3\) is calculated, the WB replicates \(T_{2,n_1,n_2}^{*1},\ldots , T_{2,n_1,n_2}^{*B}\) are obtained very rapidly.

Table 5 CPU time in seconds for the calculation of a p-value

Finally, we ran simulations where the error distributions are of mixed type. Specifically, we considered the following cases:

  1. (iv)
    $$\begin{aligned} \varepsilon _1, \varepsilon _2 \sim \left\{ \begin{array}{l@{\quad }l} 0, &{} \text {with probability} \ 0.20, \\ N(0,\sqrt{5/4}) &{} \text {with probability} \ 0.80, \end{array}\right. \end{aligned}$$
  2. (v)
    $$\begin{aligned} \varepsilon _1 \sim \left\{ \begin{array}{l@{\quad }l} 0, &{} \text {with probability} \ 0.20, \\ N(0,\sqrt{5/4}) &{} \text {with probability} \ 0.80, \end{array}\right. \end{aligned}$$
    $$\begin{aligned} \varepsilon _2 \sim \left\{ \begin{array}{l@{\quad }l} 0, &{} \text {with probability} \ 0.50,\\ N(0,\sqrt{2}) &{} \text {with probability} \ 0.50. \end{array}\right. \end{aligned}$$

Case (iv) corresponds to the null hypothesis and case (v) is an alternative. In practice, these cases could model a situation where the observations come from two devices, one of them with no measurement error. The test in PF07 cannot be applied in this setting because it requires the error distribution to have a smooth density. Table 6 displays the obtained results for the test proposed in this paper for \(n_1,n_2=100\). Again the empirical levels are close to the target values and the test has power against the alternative.
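As an illustration, errors from the mixture in case (iv) can be generated as below; the scale \(\sqrt{5/4}\) of the normal component makes the mixture have mean 0 and variance 1, as required by model (1).

```r
## Case (iv): point mass at 0 with probability 0.2, N(0, sd = sqrt(5/4)) with
## probability 0.8, so that E(eps) = 0 and Var(eps) = 0.8 * 5/4 = 1
r_case_iv <- function(n) ifelse(runif(n) < 0.2, 0, rnorm(n, sd = sqrt(5/4)))
```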

Table 6 Percentage of rejections at the significance levels \(5\%\) (upper entry) and \(10\%\) (lower entry) for cases (iv) and (v) and specifications S1–S3

5.2 Real data analysis

Finally, we applied the proposed test to a real data set, which we briefly describe next. To estimate the p-value we applied Algorithm 2 with \(B=1000\); several values of a and c were tried.

Young and Bowman (1995) proposed a method for testing the equality and parallelism of two or more smooth curves. Their method assumes that the errors are equally distributed in each population. They applied it to a data set consisting of the yield (g/plant) and density (plants/\(m^2\)) of White Spanish Onions from two South Australian localities, namely Purnong Landing (first group, 42 observations) and Virginia (second group, 42 observations). This data set is available in the R package sm (Bowman and Azzalini 2014). Table 7 displays the estimated p-values when the test proposed in this paper is used for testing the equality of the error distributions. As in Young and Bowman (1995), the test was applied to the logarithm of the data. Looking at this table we see that the equality of the error distributions cannot be rejected.
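A sketch of the data preparation for this example is given below. The object and column names (onions, Density, Yield, Locality) are assumptions about how the data set is exposed in the sm package and may differ across package versions; as in Young and Bowman (1995), the logarithm of the data is used.

```r
library(sm)
## Hypothetical column names; the two locality codes identify the two samples
pl <- onions$Locality == unique(onions$Locality)[1]     # Purnong Landing
x1 <- log(onions$Density[pl]);  y1 <- log(onions$Yield[pl])
x2 <- log(onions$Density[!pl]); y2 <- log(onions$Yield[!pl])
## residuals, T_{n1,n2} and the WB p-value then follow Sects. 2 and 4, e.g.
## eps1 <- nw_residuals(x1, y1, h = c * length(x1)^(-a)), and so on
```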

Table 7 p-values for the data set

6 Testing for the equality of \(d>2\) error distributions

The proposed test can be extended to testing for the equality of \(d>2\) error distributions as follows. Let \((X_k,Y_k)\), \(1\le k \le d\), be d independent random vectors satisfying the general nonparametric regression model (1), \( 1\le k \le d\). Let \(F_k\) and \(C_k=R_k+\mathrm{i} I_k\) denote the CDF and the CF of \(\varepsilon _k\), respectively, \(1\le k \le d\). Suppose that independent samples are available from each population: \((X_{k1}, Y_{k1}), \ldots , (X_{kn_k},Y_{kn_k})\), \(1\le k \le d\). Let \(N=n_1+\cdots +n_d\). For testing

$$\begin{aligned} H_{0d}: \,F_1=\cdots =F_d \quad \Leftrightarrow \quad C_1=\cdots =C_d, \end{aligned}$$

against the general alternative

$$\begin{aligned} H_{1d}: \, H_{0d} \, \text{ is } \text{ not } \text{ true }, \end{aligned}$$

for observable data, Hušková and Meintanis (2008) proposed to compare the ECF associated with the sample from each population to the ECF of all the available data which, under \(H_{0d}\), estimates the common CF, say \(C=C_1=\cdots =C_d\). A residual version of such a test can be used for testing \(H_{0d}\) in our setting. Specifically, let \(\hat{\varepsilon }_{kj}\), \(1\le j \le n_k\), \(1\le k \le d\), be defined as in (2) and let

$$\begin{aligned} T_N=\sum _{k=1}^d n_k\Vert \hat{C}_k-\hat{C}\Vert _{\omega }^2, \end{aligned}$$

where \(\hat{C}_k\) is as defined in (3) and

$$\begin{aligned} \hat{C}=\frac{1}{N}\sum _{k=1}^d n_k \hat{C}_k. \end{aligned}$$

Analogous results to those given in Theorems 1, 2 and 4 can be stated for \(T_N\). We state them next without proofs because they closely follow those provided for \(d=2\).

Theorem 5

Suppose that \(n_k/N \rightarrow \tau _k>0\) and that Assumptions (A.2) and (A.3) hold for all \(1\le k \le d\); then \(\frac{1}{N}T_{N}\mathop {\rightarrow }\limits ^{P} \sum _{k=1}^d \tau _k\Vert {C}_k-C_0\Vert ^2_{\omega },\) with \(C_0=\sum _{k=1}^d \tau _k{C}_k\).

Theorem 6

Suppose that the assumptions in Theorem 5 hold. If \(H_{0d}\) is true, then

$$\begin{aligned} T_{N}\mathop {\longrightarrow }\limits ^{\mathcal {L}}\sum _{k=1}^d \Vert Z_k-\sqrt{\tau _k}Z_0\Vert ^2_{\omega }, \end{aligned}$$

where \(\{Z_k(t), \, t\in \mathbb {R}\}\), \(k=1,\ldots ,d\), are d IID centered Gaussian processes on \(L_2(\omega )\) with covariance kernel \(\varrho _0(t,s)\) as defined in Theorem 2 and \(Z_0=\sum _{k=1}^d \sqrt{\tau _k}{Z}_k\).

Now let \(\xi _{1,1},\ldots ,\xi _{1,n_1}, \ldots , \xi _{d,1},\ldots ,\xi _{d,n_d}\) be IID random variates with mean 0 and variance 1, which are independent of the data, \((X_{k1}, Y_{k1}), \ldots , (X_{kn_k},\)\(Y_{kn_k})\), \(1\le k \le d\). Let

$$\begin{aligned} T^*_N=\sum _{k=1}^d n_k\Vert \hat{U}^*_k-\hat{U}^*_0\Vert _{\omega }^2, \end{aligned}$$

where \(\hat{U}^*_k\) is as defined in (6) with \(\hat{R}_{\tau }\) and \(\hat{I}_{\tau }\) replaced by \(\hat{R}_{\tau _1,\ldots , \tau _d}\) and \(\hat{I}_{\tau _1,\ldots , \tau _d}\), respectively,

$$\begin{aligned} \hat{R}_{\tau _1,\ldots , \tau _d}(t) =\frac{1}{N}\sum _{k=1}^d n_k \hat{R}_{k}(t), \quad \hat{I}_{\tau _1,\ldots , \tau _d}(t) =\frac{1}{N}\sum _{k=1}^d n_k \hat{I}_{k}(t), \end{aligned}$$

and

$$\begin{aligned} \hat{U}^*_0=\frac{1}{N}\sum _{k=1}^d n_k \hat{U}^*_k. \end{aligned}$$

Theorem 7

Suppose that assumptions in Theorem 5 hold, then

$$\begin{aligned} \sup _x \left| P_*\left\{ T_{N}^* \le x \right\} -P\left\{ T_{\tau _1,\ldots , \tau _d} \le x \right\} \right| \mathop {\longrightarrow }\limits ^{P} 0, \end{aligned}$$

where \(T_{\tau _1,\ldots , \tau _d}=\sum _{k=1}^d \Vert Z_{k,\tau _1,\ldots ,\tau _d}-\sqrt{\tau _k}Z_{0,\tau _1,\ldots ,\tau _d}\Vert ^2_{\omega }\), \(\{Z_{k,\tau _1,\ldots ,\tau _d}(t), \, t\in \mathbb {R}\}\), \(1 \le k \le d\), are independent centered Gaussian processes on \(L_2(\omega )\) with covariance kernel

$$\begin{aligned} \varrho _{k,\tau _1,\ldots ,\tau _d}(t,s)=E\{Z_{k,\tau _1,\ldots ,\tau _d}(\varepsilon _k;t)Z_{k,\tau _1,\ldots ,\tau _d}(\varepsilon _k;s)\}, \end{aligned}$$

\(Z_{k,\tau _1,\ldots ,\tau _d}(\varepsilon _k;t)\) is defined as in (5) with \(R_{\tau }\) and \(I_{\tau }\) replaced by \({R}_{\tau _1,\ldots , \tau _d}\) and \({I}_{\tau _1,\ldots , \tau _d}\), respectively,

$$\begin{aligned} {R}_{\tau _1,\ldots , \tau _d}(t) =\sum _{k=1}^d\tau _k {R}_{k}(t), \quad {I}_{\tau _1,\ldots , \tau _d}(t) =\sum _{k=1}^d\tau _k {I}_{k}(t), \end{aligned}$$

and \(Z_{0,\tau _1,\ldots ,\tau _d}=\sum _{k=1}^d \sqrt{\tau _k}{Z}_{k,\tau _1,\ldots ,\tau _d}\).

Similar results to those stated in Corollaries 1 and 2 for \(T_{n_1,n_2}^*\) can be given for \(T^*_N\). To save space we omit them.
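On the practical side, \(T_N\) can be evaluated with the same device as in Sect. 4.1: writing \(\hat{C}_k-\hat{C}\) as a weighted sum of complex exponentials of the residuals, \(\Vert \hat{C}_k-\hat{C}\Vert ^2_{\omega }\) becomes a quadratic form in \(\varphi _{\omega }\). The following sketch uses the Gaussian weight, for which \(\varphi _{\omega }(x)=\sqrt{\pi /\beta }\,\exp \{-x^2/(4\beta )\}\); eps_list is assumed to be a list containing the d residual vectors.

```r
## T_N = sum_k n_k || hat C_k - hat C ||^2_omega for the Gaussian weight
TN_stat <- function(eps_list, beta = 0.15) {
  eps <- unlist(eps_list)
  N   <- length(eps)
  phi <- function(x) sqrt(pi / beta) * exp(-x^2 / (4 * beta))
  M   <- outer(eps, eps, function(a, b) phi(a - b))
  idx <- rep(seq_along(eps_list), lengths(eps_list))   # population labels
  TN  <- 0
  for (k in seq_along(eps_list)) {
    nk <- length(eps_list[[k]])
    u  <- (idx == k) / nk - 1 / N          # weights of hat C_k - hat C
    TN <- TN + nk * drop(t(u) %*% M %*% u)
  }
  TN
}
```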