1 Introduction

Let \({\mathscr {P}}\) be the family of the Pareto distributions with the distribution function (d.f.)

$$\begin{aligned} F(x) = 1-x^{-\lambda }, \quad x \ge 1, \ \lambda >0. \end{aligned}$$
(1)

In this paper we develop goodness-of-fit tests for the Pareto distribution using a new characterization based on a property of order statistics. The problem is formulated as follows: let \(X_1,\ldots ,X_n\) be positive i.i.d. random variables (rv’s) with continuous d.f. F. Consider testing the composite hypothesis \(H_0: F \in {\mathscr {P}}\) against the general alternative \(H_1: F \notin {\mathscr {P}}\), assuming that the alternative d.f. is also concentrated on \([1,\infty )\).

Goodness-of-fit tests for the Pareto distribution have been discussed in Beirlant et al. (2006), Gulati and Shapiro (2008), Martynov (2009), Rizzo (2009). We exploit a different idea for constructing and analyzing statistical tests, based on a characterization by the equidistribution of certain statistics, by means of so-called U-empirical d.f.’s (Janssen 1988; Korolyuk and Borovskikh 1994). This method was developed earlier in several articles, particularly in Nikitin (1996), Nikitin and Peaucelle (2004), Nikitin and Tchirina (1996), Nikitin and Volkova (2010), Nikitin and Volkova (2012), Litvinova (2004). Tests for the Pareto distribution using this approach were obtained and analyzed in Jovanovic et al. (2014). One can observe that new tests based on characterizations have reasonably high efficiencies and can be competitive with previously known goodness-of-fit tests. Let us explain our approach.

We will say that the d.f. F belongs to the class of distributions \(\mathscr {F}\) if either \(\bar{F}(x_1x_2)\le \bar{F}(x_1)\bar{F}(x_2)\) for all \(x_1, x_2 \ge 1\), or \(\bar{F}(x_1x_2)\ge \bar{F}(x_1)\bar{F}(x_2)\) for all \(x_1, x_2 \ge 1\), where \(\bar{F}=1-F\) is the corresponding survival function, see Ahsanullah (1989).

Let \(X_1,\ldots ,X_n\) be i.i.d. positive absolutely continuous random variables with the d.f. F from the class \(\mathscr {F}\). Denote by \(X_{(1,n)}\le X_{(2,n)}\le \ldots \le X_{(n,n)}\) the order statistics of the sample \(X_1,\ldots ,X_n\).

We present a new characterization within the class \(\mathscr {F}\).

Theorem 1

For a fixed \(k\ge 2\) let \(X_1,\ldots ,X_k\) be i.i.d. positive rv’s having an absolutely continuous (with respect to Lebesgue measure) d.f. F from the class \(\mathscr {F}\). Then the equality in law of \(X_1\) and \(X_{(k,k)}/X_{(k-1,k)}\) takes place iff \(X_1\) has some d.f. from the family \({\mathscr {P}}\).

Proof

Let \(Y=\ln {X}\) and let G denote the d.f. of Y. It is easily seen that \(F \in \mathscr {F}\) iff G is NBU (“new better than used”) or NWU (“new worse than used”) (Ahsanullah 1977). Further, since the logarithm is a monotone transformation, \(X_1\) and \(X_{(k,k)}/X_{(k-1,k)}\) are identically distributed iff \(Y_1\) and \(Y_{(k,k)}-Y_{(k-1,k)}\) are identically distributed. It follows from Ahsanullah (1977) that this happens iff \(Y=\ln {X}\) has the exponential distribution with some scale parameter \(\lambda \); therefore \(X_1\) has the Pareto distribution with the same parameter \(\lambda \). \(\square \)

In the case \(k=2\) our characterization coincides with the characterization of the Pareto distribution considered in Jovanovic et al. (2014), see also Nikitin and Volkova (2012). Thus our characterization extends the one used in Jovanovic et al. (2014).
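For illustration, the characterization is easy to check by simulation. The following minimal sketch (our own, assuming NumPy and SciPy are available) compares, for \(k=3\), the ratio of the two largest observations in Pareto triples with single Pareto observations via a two-sample Kolmogorov-Smirnov test.

```python
import numpy as np
from scipy import stats

# Empirical check of Theorem 1 for k = 3: under the Pareto law the ratio
# X_(3,3)/X_(2,3) of the two largest observations in a triple has the same
# distribution as a single observation X_1.
rng = np.random.default_rng(0)
N = 20000
triples = np.sort(1.0 / (1.0 - rng.random((N, 3))), axis=1)  # rows: standard Pareto (lambda = 1) triples
ratios = triples[:, 2] / triples[:, 1]                       # largest / second largest
singles = 1.0 / (1.0 - rng.random(N))                        # independent Pareto observations
print(stats.ks_2samp(ratios, singles))                       # a large p-value is expected
```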

Based on this characterization, we construct the U-empirical d.f. by the formula

$$\begin{aligned} H_n(t)={n \atopwithdelims ()k}^{-1}\sum _{1 \le i_1<\ldots < i_k \le n}\mathbf 1 \{X_{(k,\{i_1,\ldots ,i_k\})}/X_{(k-1,\{i_1,\ldots ,i_k\})}< t\}, \quad t\ge 1, \end{aligned}$$

where \(X_{(s,\{i_1,\ldots ,i_k\})}, \, s\in \{k-1,k\},\) denotes the \(s\)-th order statistic of the subsample \(X_{i_1},\ldots ,X_{i_k}\). For the rv X itself the U-statistical d.f. is simply the usual empirical d.f. \(F_n(t)=n^{-1}\sum _{i=1}^n\mathbf 1 (X_i<t), \, t \in R^1\), based on the observations \(X_1,\dots ,X_n\).
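For concreteness, a direct (and deliberately naive) way to evaluate \(H_n(t)\) and \(F_n(t)\) at a given point t is sketched below; this is our own illustration, assuming NumPy, and the helper names are hypothetical.

```python
import numpy as np
from itertools import combinations

def H_n(x, t, k=3):
    # U-empirical d.f.: fraction of k-element subsamples whose ratio of the
    # largest to the second largest element is smaller than t
    ratios = [c[-1] / c[-2] for c in (sorted(c) for c in combinations(x, k))]
    return np.mean(np.array(ratios) < t)

def F_n(x, t):
    # usual empirical d.f. based on the observations themselves
    return np.mean(np.asarray(x) < t)

rng = np.random.default_rng(1)
x = 1.0 / (1.0 - rng.random(20))        # standard Pareto (lambda = 1) sample of size n = 20
print(H_n(x, 2.0), F_n(x, 2.0))         # both should be close to F(2) = 1/2 under H_0
```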

It is known that the properties of U-empirical d.f.’s are similar to the properties of usual empirical d.f.’s (Helmers et al. 1988; Janssen 1988). Hence the difference \(H_n- F_n\) for large n should be almost surely close to zero under \(H_0\), and we can measure their closeness by using some test statistics, assuming their large values to be critical.

We suggest two test statistics

$$\begin{aligned} I_n^{(k)}&=\int _{1}^{\infty } \left( H_n(t)-F_n(t)\right) dF_n(t), \end{aligned}$$
(2)
$$\begin{aligned} D_n^{(k)}&=\sup _{t \ge 1}\mid H_n(t)-F_n(t)\mid . \end{aligned}$$
(3)

Note that both proposed statistics under \(H_0\) are invariant with respect to the change of variables \( X \rightarrow X^{\frac{1}{\lambda }}\), so we may set \(\lambda =1\).

We discuss their limiting distributions under the null hypothesis and find the logarithmic asymptotics of large deviations under \(H_0\). Next we calculate their efficiencies against some parametric alternatives from the class \(\mathscr {F}\). We use the notion of local exact Bahadur efficiency (BE) (Bahadur 1971; Nikitin 1995), as the statistic \(D_n^{(k)}\) has a non-normal limiting distribution, so that the Pitman approach to efficiency is not applicable. However, it is known that the local BE and the limiting Pitman efficiency usually coincide, see Wieand (1976), Nikitin (1995).

Finally, we study the conditions of the local optimality of our tests, describe the “most favorable”   alternatives for them and compare the powers of our tests with some standard goodness-of-fit tests.

The family of d.f.’s in the null hypothesis is a particular case of the so-called Pareto type I distribution with the d.f. \(P_1(x) = 1-(\frac{x}{\beta })^{-\lambda }, \ x \ge \beta > 0, \ \lambda >0\), see, for example, Arnold (1983). For practical goodness-of-fit testing based on our new tests, the unknown parameters of the hypothesized Pareto distribution can be estimated by a number of methods, see Arnold (1983, Ch. 5), Kleiber and Kotz (2003, Ch. 3), Brazausskas and Serfling (2003), Rizzo (2009). One can first estimate the parameter \(\beta \), for example by the MLE \(\hat{\beta }= \min _{i=1, \ldots , n} X_i\). Then the sample \(X_1,\ldots ,X_n\) can be transformed to the new sample \(Y_1,\ldots ,Y_n\), where \(Y_i=X_i/\hat{\beta }\), which has the d.f. given in (1).

There exists a second Pareto model, the so-called Pareto type II distribution, with d.f. \(P_2(x) = 1-(1+\frac{x-\mu }{\beta })^{-\lambda }, \ x \ge \mu , \ \mu \in \mathbb {R}, \ \beta >0, \ \lambda >0\). The Pareto type I and type II models are related by the following transformation: if a rv X has a Pareto type II distribution, then \(X-(\mu -\beta )\) has a Pareto type I distribution. Therefore, using an estimator of the location parameter \(\mu \), one can reduce a Pareto type II rv to our model. We do not discuss the differences between the parameter estimators and concentrate on the construction of the goodness-of-fit tests.
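A minimal sketch of the reduction just described for the type I model (our own illustration, with hypothetical data, assuming NumPy):

```python
import numpy as np

x = np.array([3.1, 4.7, 2.6, 8.9, 2.4, 5.3])  # hypothetical Pareto type I observations, unknown scale beta
beta_hat = x.min()                             # MLE of the scale parameter beta
y = x / beta_hat                               # rescaled sample supported on [1, infinity), as in (1)
print(beta_hat, y)
```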

2 Integral statistic \(I_{n}^{(k)}\)

The statistic \(I_{n}^{(k)}\) is asymptotically equivalent to the U-statistic of degree \((k+1)\) with the centered kernel

$$\begin{aligned} {\varPsi }_k(X_{i_1},\ldots , X_{i_{k+1}})\!=\! \frac{1}{k+1}\sum _{\pi (i_1, \ldots , i_{k+1})}\mathbf 1 (X_{(k,\{i_1,\ldots ,i_k\})}/X_{(k-1,\{i_1,\ldots ,i_k\})} < X_{i_{k+1}}) -\frac{1}{2}, \end{aligned}$$

where \(\pi (i_1, \ldots , i_{k+1})\) means all permutations of different indices from \(\{i_1, \ldots ,i_{k+1}\}\).

Let \(X_1,\ldots , X_{k+1}\) be independent rv’s from the standard Pareto distribution. It is known that non-degenerate U-statistics are asymptotically normal (Hoeffding 1948; Korolyuk and Borovskikh 1994). To prove that the kernel \({\varPsi }_k(X_{1},\ldots , X_{{k+1}})\) is non-degenerate, we calculate its projection \(\psi _k(s)\). For a fixed \(X_{{k+1}}=s, \, s\ge 1\) we have:

$$\begin{aligned} \psi _k(s):= & {} E({\varPsi }_k(X_{1},\ldots , X_{{k+1}})\mid X_{{k+1}}=s) \\= & {} \frac{k}{k+1}\mathbb {P}(X_{(k,\{2,\ldots , k, s\})}/X_{(k-1,\{2,\ldots , k, s\})} < X_{1})\\&+\frac{1}{k+1}\mathbb {P}(X_{(k,\{1,\ldots ,k\})}/X_{(k-1,\{1,\ldots , k\})}< s)-\frac{1}{2}. \end{aligned}$$

It follows from the above characterization that the second probability is equal to:

$$\begin{aligned} \mathbb {P}(X_{k,\{1,\ldots , k\}}/X_{k-1,\{1,\ldots , k\}}< s)=\mathbb {P}(X_1<s) = F(s). \end{aligned}$$

It remains to calculate the first term. For this purpose we decompose the probability as \(\mathbb {P}(X_{k,\{2,\ldots , k,s \}}/X_{k-1,\{2,\ldots , k,s\}} < X_{1}) = \mathbb {P}_1+\mathbb {P}_2+\mathbb {P}_3\), where \(\mathbb {P}_i, \, i=1,2,3,\) are the probabilities corresponding to the following cases:

  (1)

    Let the elements be ordered as \(X_2 < \ldots < X_k <s\). Then our probability transforms into

    $$\begin{aligned} \mathbb {P}_1= & {} (k-1)! \, \mathbb {P}\left( \frac{s}{X_k} <X_1, X_2 < \ldots < X_k <s\right) \\= & {} (k-1)! \, \mathbb {P}\left( X_k< s, X_1 > \frac{s}{X_k} , X_2 < X_3, X_3 <X_4, \ldots , X_{k-1} < X_{k}\right) . \end{aligned}$$

    After some calculations we obtain that the last probability is equal to:

    $$\begin{aligned}&(k-1)! \int _1^s \left( 1-F\left( \frac{s}{x_k}\right) \right) \frac{F^{k-2}(x_k)}{(k-2)!} d F(x_k)\\&\quad = F^{k-1}(s)-(k-1) \int _1^s \left( 1-\frac{1}{x}\right) ^{k-2} \left( 1-\frac{x}{s}\right) \frac{dx}{x^2}. \end{aligned}$$

    The integral in the second term can be evaluated using integration by parts and binomial representation of the function \((1-\frac{1}{x})^{k-1}\). Finally we have:

    $$\begin{aligned}&\int _1^s \left( 1-\frac{1}{x}\right) ^{k-2} \left( 1-\frac{x}{s}\right) \frac{dx}{x^2} = \frac{1}{s(k-1)}\int _{1}^{s}\sum _{j=0}^{k-1}(-1)^j {{k-1} \atopwithdelims ()j}x^{-j} dx\\&\quad = \frac{1}{s(k-1)}\left( s-1-(k-1)\ln {(s)}+\sum _{j=2}^{k-1}(-1)^j {{k-1} \atopwithdelims ()j}\frac{1-s^{-(j-1)}}{j-1}\right) . \end{aligned}$$

    Thus the initial probability in this case is equal to

    $$\begin{aligned} \mathbb {P}_1=F^{k-1}(s)-F(s)+(k-1)\frac{\ln {s}}{s}-\frac{1}{s}\sum _{j=2}^{k-1}(-1)^j {{k-1} \atopwithdelims ()j}\frac{1-s^{-(j-1)}}{j-1}. \end{aligned}$$
  (2)

    Now let the elements be ordered as \(X_2< X_3< \ldots< X_{k-1}< s < X_k\); then for this case we have:

    $$\begin{aligned} \mathbb {P}_2= & {} (k-1)! \, \mathbb {P}\left( \frac{X_k}{s} <X_1, X_2 < X_3<\ldots X_{k-1}< s < X_k\right) \\= & {} (k-1)! \, \mathbb {P}\left( X_k> s, X_1 > \frac{X_k}{s} , X_2 < X_3, X_3 < X_4, \ldots , X_{k-1}<s\right) \\= & {} (k-1)! \int _s^{\infty } \left( 1-F\left( \frac{x_k}{s}\right) \right) \frac{F^{k-2}(s)}{(k-2)!} d F(x_k)\\= & {} \frac{(k-1)}{2s}F^{k-2}(s). \end{aligned}$$
  (3)

    The last case is when s occupies the \(j\)-th place \((1 \le j \le {k-2})\) in the variational series of the set \(\{X_2, \ldots , X_k, s\}\). This means that the elements are ordered as \(X_2< \ldots< s< \ldots<X_{k-2}< X_{k-1} < X_k\), where s may also stand in the first or in the \((k-2)\)-th place. Then the required probability is equal to

    $$\begin{aligned} \mathbb {P}_3= & {} (k-1)! \, \mathbb {P}\left( \frac{X_k}{X_{k-1}} <X_1, X_2< \ldots < s < \ldots <X_{k-2}< X_{k-1} < X_k\right) \\= & {} \frac{1}{2} C_{k-1}^{j-1}(1-F(s))^{k-j} F^{j-1}(s), \, 1 \le j \le {k-2}. \end{aligned}$$

Combining the results we get that the first term in the projection has the form:

$$\begin{aligned}&\mathbb {P}(X_{(k,\{ 2,\ldots , k, s\})}/X_{(k-1,\{2,\ldots , k, s\})} < X_{1}) = F^{k-1}(s)-F(s)+(k-1)\frac{\ln {s}}{s}\\&\quad -\frac{1}{s}\sum _{j=2}^{k-1}(-1)^j {{k-1} \atopwithdelims ()j}\frac{1-s^{-(j-1)}}{j-1}+\frac{1}{2} \sum _{j=1}^{k-1}C_{k-1}^{j-1}(1-F(s))^{k-j} F^{j-1}(s). \end{aligned}$$

Note that the last sum is equal to \(\sum _{j=1}^{k-1}C_{k-1}^{j-1}(1-F(s))^{k-j} F^{j-1}(s)=1-F^{k-1}(s)\). Thus for the initial probability we get the result:

$$\begin{aligned}&\mathbb {P}(X_{(k,\{ 2,\ldots , k, s\})}/X_{(k-1,\{2,\ldots , k, s\})} < X_{1}) = \frac{1}{2} F^{k-1}(s)\\&\quad -F(s)+(k-1)\frac{\ln {s}}{s}-\frac{1}{s}\sum _{j=2}^{k-1}(-1)^j {{k-1} \atopwithdelims ()j}\frac{1-s^{-(j-1)}}{j-1}+\frac{1}{2}. \end{aligned}$$

Hence we get the final expression for the projection of the kernel \({\varPsi }_k\):

$$\begin{aligned} \psi _k(s)= & {} \frac{kF^{k-1}(s)-1}{2(k+1)}-\frac{k-1}{k+1}F(s)+\frac{k(k-1)}{k+1}\frac{\ln {s}}{s} \nonumber \\&- \frac{k}{s(k+1)} \sum _{j=2}^{k-1}(-1)^j {{k-1} \atopwithdelims ()j}\frac{1-s^{-(j-1)}}{j-1}. \end{aligned}$$
(4)

The calculation of the variance for the projection \(\psi _k\) in the general case is too complicated, therefore we calculate it only for particular k.

2.1 Integral statistic \(I_{n}^{(3)}\)

The projection \(\psi _k(s)\) for the case \(k=3\) has the form:

$$\begin{aligned} \psi _3(s)=\frac{9}{8s^2}+\frac{3\ln {s}}{2s}-\frac{1}{s}-\frac{1}{4}. \end{aligned}$$
(5)

The variance of this projection \({\varDelta }_3^2 = E\psi _3^2(X_1)\) under \(H_{0}\) is given by

$$\begin{aligned} {\varDelta }_3^2 = \int _{1}^{\infty } \psi _3^2 (s) \frac{1}{s^2}ds =\frac{11}{1920} \approx 0.006. \end{aligned}$$

Therefore the kernel \({\varPsi }_3\) is centered and non-degenerate. We can apply Hoeffding’s theorem on asymptotic normality of U-statistics, see again Hoeffding (1948), Korolyuk and Borovskikh (1994), which implies that the following result holds

Theorem 2

Under the null hypothesis as \(n \rightarrow \infty \) the statistic \(\sqrt{n}I_{n}^{(3)}\) is asymptotically normal so that

$$\begin{aligned} \sqrt{n}I_{n}^{(3)} \mathop {\longrightarrow }\limits ^{d}{\mathscr {N}}\left( 0,\frac{11}{120}\right) . \end{aligned}$$
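These constants are easy to verify numerically; the following short check is our own sketch and assumes SciPy is available.

```python
import numpy as np
from scipy.integrate import quad

psi3 = lambda s: 9 / (8 * s**2) + 3 * np.log(s) / (2 * s) - 1 / s - 0.25
var3, _ = quad(lambda s: psi3(s)**2 / s**2, 1, np.inf)   # Delta_3^2
print(var3, 11 / 1920)                                   # both are ~ 0.00573
print(16 * var3, 11 / 120)                               # limiting variance of sqrt(n) I_n^(3)
```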

Now we shall evaluate the large deviation asymptotics of the sequence of statistics \(I_{n}^{(3)}\) under \(H_0\). According to the theorem on large deviations of such statistics from Nikitin and Ponikarov (1999), see also DasGupta (2008), Nikitin (2010), and due to the fact that the kernel \({\varPsi }_3\) is centered, bounded and non-degenerate, we obtain the following result.

Theorem 3

For \(a>0\)

$$\begin{aligned} \lim _{n\rightarrow \infty } n^{-1} \ln P ( I_n^{(3)} >a) = - f_I^{(3)}(a), \end{aligned}$$

where the function \(f_I^{(3)}\) is continuous for sufficiently small \(a>0\), and

$$\begin{aligned} f_I^{(3)}(a) \sim \frac{a^2}{32 {\varDelta }_3^2} = 5.455\, a^2, \quad \text{ as } \, a \rightarrow 0. \end{aligned}$$

2.2 Some notions from the Bahadur theory

Suppose that under the alternative \(H_1\) the observations have the d.f. \(G(\cdot ,\theta )\) and the density \(g(\cdot ,\theta ), \ \theta \ge 0\), such that \(G(\cdot , 0) \in {\mathscr {P}}\). The measure of the Bahadur efficiency (BE) for a sequence \(\{T_n\}\) of test statistics is the exact slope \(c_{T}(\theta )\), describing the rate of exponential decrease of the attained level (p-value) under the alternative d.f. \(G(\cdot ,\theta )\). According to the Bahadur theory (Bahadur 1971; Nikitin 1995) the exact slopes may be found by using the following Proposition.

Proposition 1

Suppose that the following two conditions hold:

  a)

          \(T_n \ \mathop {\longrightarrow }\limits ^{{P_\theta }} \ b(\theta ),\qquad \theta > 0\), where \(-\infty < b(\theta ) < \infty \), and \(\mathop {\longrightarrow }\limits ^{{P_\theta }}\) denotes convergence in probability under \(G(\cdot \ ; \theta )\).

  b)

          \(\mathop {\lim }\limits _{n\rightarrow \infty } n^{-1} \ \ln \ P_{H_0} \left( T_n \ge t \ \right) \ = \ - h(t)\)

for any t in an open interval I, on which h is continuous and \(\{b(\theta ), \, \theta > 0\}\subset I\). Then

$$\begin{aligned} c_T(\theta ) \ = \ 2 \ h(b(\theta )). \end{aligned}$$

We have already found the large deviation asymptotics. In order to evaluate the exact slope it remains to find the limit in probability \(b(\theta )\) from condition a) of the Proposition.

Note that the exact slopes for any \(\theta \) satisfy the inequality (Bahadur 1971; Nikitin 1995)

$$\begin{aligned} c_T(\theta ) \le 2 K(\theta ), \end{aligned}$$
(6)

where \(K(\theta )\) is the Kullback-Leibler “distance”  between the alternative and the null-hypothesis \(H_0\). In our case \(H_0\) is composite, hence for any alternative density \(g_j(x,\theta )\) one has

$$\begin{aligned} K_j(\theta ) = \inf _{\lambda >0} \int _1^{\infty } \ln [g_j(x,\theta ) / \lambda x^{-\lambda -1} ] g_j(x,\theta ) \ dx. \end{aligned}$$

This quantity can be easily calculated as \(\theta \rightarrow 0\) for particular alternatives. According to (6), the local BE of the sequence of statistics \({T_n}\) is defined as

$$\begin{aligned} e^B (T) = \lim _{\theta \rightarrow 0} \frac{c_T(\theta )}{2K(\theta )}. \end{aligned}$$

2.3 The local Bahadur efficiency of \(I_n^{(3)}\)

According to the Bahadur theory, the alternatives under consideration should approach the null hypothesis as \(\theta \rightarrow 0\). We consider three such alternatives to the Pareto distribution. The first two are obtained by the skewing mechanism of Ley and Paindaveine (2008); we call them Ley–Paindaveine alternatives.

  i)

    First Ley–Paindaveine alternative with the d.f.

    $$\begin{aligned} G_1(x,\theta )=F(x)e^{-\theta (1-F(x))},\quad \theta \ge 0, x \ge 1; \end{aligned}$$
  ii)

    Second Ley–Paindaveine alternative with the d.f.

    $$\begin{aligned} G_2(x,\theta )=F(x)-\theta \sin {\pi F(x)}, \quad \theta \in [0,\pi ^{-1}], x\ge 1; \end{aligned}$$
  iii)

    log-Weibull alternative with the d.f.

    $$\begin{aligned} G_3(x,\theta )=1-e^{-(\ln {x})^{\theta +1}},\quad \theta \in (0,1), x\ge 1. \end{aligned}$$

Let us find the local BE for the alternatives under consideration.

According to the Law of Large Numbers for U-statistics (Korolyuk and Borovskikh 1994), the limit in probability of \(I_n^{(3)}\) under \(H_1\) is equal to

$$\begin{aligned} b_1(\theta )=P_{\theta }(X_{(3,3)}/X_{(2,3)}<Y)-\frac{1}{2}, \end{aligned}$$

where Y is independent of \(X_1, X_2, X_3\) and all four rv’s have the d.f. \(G_1(\cdot ,\theta )\).

It is easy to show (Jovanovic et al. 2014) that

$$\begin{aligned} b_1(\theta ) \sim 4\theta \int _{1}^{\infty } \psi _3(s)h_1(s)ds, \end{aligned}$$

where \(h_1(s)=\frac{\partial }{\partial \theta }g_1(s,\theta )\mid _{\theta =0}=(s-2)/s^3\) and \(\psi _3(s)\) is the projection from (5). Therefore for the first Ley–Paindaveine alternative we have

$$\begin{aligned} b_1(\theta ) \sim 4\theta \int _{1}^{\infty }\left( \frac{9}{8s^2}+\frac{3\ln {s}}{2s}-\frac{1}{s}-\frac{1}{4}\right) \frac{s-2}{s^3}\, ds \sim \frac{\theta }{12}, \quad \theta \rightarrow 0, \end{aligned}$$

and the local exact slope of the sequence \(I_n^{(3)}\) as \(\theta \rightarrow 0\) admits the representation

$$\begin{aligned} c_1(\theta )=b^2_1(\theta )/(16{\varDelta }_3^2) \sim \frac{5}{66}\,\theta ^2, \quad \theta \rightarrow 0. \end{aligned}$$

The Kullback-Leibler “distance”   \(K_1(\theta )\) between the alternative and the null-hypothesis \(H_0\) admits the following asymptotics (Jovanovic et al. 2014):

$$\begin{aligned} 2K_1(\theta )\sim \theta ^2 \left[ \int _1^\infty h_1^2(x)x^2\, dx -\left( \int _1^\infty h_1(x) \ln {(x)}\,dx\right) ^2\right] ,\quad \theta \rightarrow 0. \end{aligned}$$

Therefore in our case

$$\begin{aligned} K_1(\theta ) \sim \theta ^2/24, \,\theta \rightarrow 0. \end{aligned}$$
(7)

Consequently, the local efficiency of the test is

$$\begin{aligned} e^B_1(I)=\lim _{\theta \rightarrow 0}\frac{c_1(\theta )}{2K_1(\theta )}=\frac{10}{11}\approx 0.909. \end{aligned}$$
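The constants entering the last few displays can be reproduced by direct numerical integration; the following short check is our own sketch, assuming SciPy.

```python
import numpy as np
from scipy.integrate import quad

psi3 = lambda s: 9 / (8 * s**2) + 3 * np.log(s) / (2 * s) - 1 / s - 0.25
h1   = lambda s: (s - 2) / s**3              # derivative of g_1(s, theta) in theta at theta = 0

A, _ = quad(lambda s: psi3(s) * h1(s), 1, np.inf)      # = 1/48, so b_1(theta) ~ 4*A*theta = theta/12
delta3sq = 11 / 1920
slope = (4 * A)**2 / (16 * delta3sq)                   # coefficient of theta^2 in c_1(theta): 5/66
m2, _ = quad(lambda s: h1(s)**2 * s**2, 1, np.inf)
m1, _ = quad(lambda s: h1(s) * np.log(s), 1, np.inf)
kl = m2 - m1**2                                        # coefficient of theta^2 in 2*K_1(theta): 1/12
print(A, slope, kl, slope / kl)                        # local efficiency 10/11 ~ 0.909
```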

Omitting the calculations similar to previous cases, we get for the second Ley–Paindaveine alternative \(b_2(\theta )\sim 0.353\, \theta \), \(c_2(\theta )\sim 1.363\,\theta ^2\), \(\theta \rightarrow 0\). It is easy to show that \(K_2(\theta ) \sim 0.753\, \theta ^2\), \(\theta \rightarrow 0\). Therefore the local BE is equal to 0.905.

After some calculations in case of the log-Weibull alternative we have:

$$\begin{aligned} b_3(\theta ) \sim \left( \frac{3}{4}-\ln {3}+\ln {2}\right) \theta \approx 0.345\, \theta , \quad \theta \rightarrow 0, \end{aligned}$$

and the local exact slope of the sequence \(I_n^{(3)}\) as \(\theta \rightarrow 0\) admits the representation \(c_3(\theta ) \sim 1.295\, \theta ^2\). Moreover for the log-Weibull distribution \(K_3(\theta )\) satisfies \(K_3(\theta ) \sim \frac{\pi ^2\theta ^2}{12}\), \(\theta \rightarrow 0\). Hence the local BE for the last case is equal to 0.787.

Table 1 gathers values of the local BE.

Table 1 Local Bahadur efficiencies for \(I_n^{(3)}\)

2.4 Integral statistic \(I_{n}^{(4)}\)

For the case \(k=4\) the projection \(\psi _k(s)\) has the form:

$$\begin{aligned} \psi _4(s)=\frac{12\ln {s}}{5s}-\frac{4}{5s^3}+\frac{18}{5s^2}-\frac{13}{5s}-\frac{3}{10}. \end{aligned}$$
(8)

The variance of this projection under \(H_{0}\) is equal to

$$\begin{aligned} {\varDelta }_4^2 = \int _{1}^{\infty } \psi _4^2 (s) \frac{1}{s^2} d s =\frac{271}{52500} \approx 0.005. \end{aligned}$$

Therefore the kernel \({\varPsi }_4\) is centered, non-degenerate and bounded. Due to Hoeffding’s theorem on asymptotic normality of U-statistics (Hoeffding 1948; Korolyuk and Borovskikh 1994), we have that:

Theorem 4

Under the null hypothesis as \(n \rightarrow \infty \) the statistic \(\sqrt{n}I_{n}^{(4)}\) is asymptotically normal so that

$$\begin{aligned} \sqrt{n}I_{n}^{(4)} \mathop {\longrightarrow }\limits ^{d}{\mathscr {N}}\left( 0,\frac{271}{2100}\right) . \end{aligned}$$

The large deviation asymptotics of the sequence of statistics \(I_{n}^{(4)}\) under \(H_0\) follows from the following result. It was derived using the theorem on large deviations (see again Nikitin and Ponikarov 1999; DasGupta 2008; Nikitin 2010), applied to the centered, bounded and non-degenerate kernel \({\varPsi }_4\).

Theorem 5

For \(a>0\)

$$\begin{aligned} \lim _{n\rightarrow \infty } n^{-1} \ln P ( I_n^{(4)} >a) = - f_I^{(4)}(a), \end{aligned}$$

where the function \(f_I^{(4)}\) is continuous for sufficiently small \(a>0\), and

$$\begin{aligned} f_I^{(4)}(a) \sim \frac{a^2}{50 {\varDelta }^2_4} = 3.875\, a^2, \quad \text{ as } \, a \rightarrow 0. \end{aligned}$$

2.5 The local Bahadur efficiency of \(I_n^{(4)}\)

For this case the limit in probability under \(H_1\) has the following asymptotics

$$\begin{aligned} b_1(\theta ) \sim 5\theta \int _{1}^{\infty } \psi _4(s)h_1(s)ds, \end{aligned}$$

where again \(h_1(s)=\frac{\partial }{\partial \theta }g_1(s,\theta )\mid _{\theta =0}\) and \(\psi _4(s)\) is the projection from (8). Therefore for the first Ley–Paindaveine alternative we have

$$\begin{aligned} b_1(\theta ) \sim 5\theta \int _{1}^{\infty }\left( \frac{12\ln {s}}{5s}-\frac{4}{5s^3}+\frac{18}{5s^2}-\frac{13}{5s}-\frac{3}{10}\right) \frac{s-2}{s^3}\, ds \sim \frac{\theta }{10}, \quad \theta \rightarrow 0, \end{aligned}$$

and the local exact slope of the sequence \(I_n^{(4)}\) as \(\theta \rightarrow 0\) admits the representation

$$\begin{aligned} c_1(\theta )=b^2_1(\theta )/(25{\varDelta }_4^2) \sim \frac{21}{271}\,\theta ^2\approx 0.077\,\theta ^2,\quad \theta \rightarrow 0. \end{aligned}$$

The Kullback-Leibler “distance” for this alternative was already found above, and it satisfies \(K_1(\theta ) \sim \theta ^2/24, \,\theta \rightarrow 0\). Thus the local efficiency of the test is

$$\begin{aligned} e^B_1(I)=\lim _{\theta \rightarrow 0}\frac{c_1(\theta )}{2K_1(\theta )}=\frac{252}{271}\approx 0.930. \end{aligned}$$

For the other alternatives the calculations are similar. Omitting the details, we gather the values of the local BE for this case in Table 2.

Table 2 Local Bahadur efficiencies for \(I_n^{(4)}\)

In Table 3 we present the efficiencies from Tables 1 and 2 together with the maximal values of the efficiencies against the alternatives under consideration.

Table 3 Comparative table of local efficiencies for the statistic \(I_n^{(k)}\)

3 Kolmogorov-type statistic \(D_n^{(k)}\)

Now we consider the Kolmogorov type statistic (3). The differences \(H_n(t) - F_n(t)\), \(t\ge 1\), form a family of U-statistics with the kernels

$$\begin{aligned} {\varXi }_k(X_{i_1},\ldots , X_{i_k};t)= \mathbf 1 (X_{(k,\{i_1,\ldots ,i_k\})}/X_{(k-1,\{i_1,\ldots ,i_k\})} < t) -\frac{1}{k}\sum _{l=1}^{k} \mathbf 1 (X_{i_l} <t) . \end{aligned}$$

The projection \(\xi _k(s;t)\) of these kernels for a fixed \(t \ge 1\) has the form:

$$\begin{aligned}&\xi _k(s;t) := E({\varXi }_k(X_{1},\ldots , X_{k};t)\mid X_{k}=s)\\&\quad = \mathbb {P}(X_{(k,\{1,\ldots ,k-1, s\})}/X_{(k-1,\{1,\ldots ,k-1, s\})} < t)-\frac{1}{k} \mathbf 1 \{s <t\}-\frac{k-1}{k}\mathbb {P}\{X_1 <t\}. \end{aligned}$$

It remains to calculate the first term. For this purpose, as in the previous section, we write the decomposition

$$\begin{aligned} \mathbb {P}(X_{k,\{1, \ldots , k-1, s\}}/X_{k-1,\{1, \ldots , k-1, s\}} < t) = \mathbb {P}_1+\mathbb {P}_2+\mathbb {P}_3, \end{aligned}$$

where \(\mathbb {P}_i, i=1,2,3\), are the initial probabilities, computed in one of the following cases:

  (1)

    Let the elements be ordered as \(X_1 < X_2< \ldots < X_{k-1} <s\). Then the probability can be expressed as

    $$\begin{aligned} \mathbb {P}_1= & {} (k-1)! \, \mathbb {P}\left( \frac{s}{X_{k-1}} < t, X_1 < X_2< \ldots < X_{k-1} < s\right) \\= & {} (k-1)! \, \mathbf 1 (s\ge t) \mathbb {P}\left( \frac{s}{t}<X_{k-1}< s, X_1 < X_2< \ldots < X_{k-1}\right) \\&+(k-1)! \, \mathbf 1 (s < t) \mathbb {P}(X_1 < X_2< \ldots < X_{k-1} < s)\\= & {} F^{k-1}(s)-\mathbf 1 ( s\ge t) F^{k-1}\left( \frac{s}{t}\right) . \end{aligned}$$
  (2)

    Now let the elements be ordered as \(X_1< X_2<\ldots< X_{k-2}< s < X_{k-1}\); then for this case we have:

    $$\begin{aligned} \mathbb {P}_2= & {} (k-1)! \, \mathbb {P}\left( \frac{X_{k-1}}{s} < t, X_1 < X_2<\ldots X_{k-2}< s < X_{k-1}\right) \\= & {} (k-1)! \, \mathbb {P}(s < X_{k-1} < st, X_1 < X_2<\ldots X_{k-2}< s)\\= & {} (k-1)! \frac{F^{k-2}(s)}{(k-2)!}(F(st)-F(s)) =\frac{(k-1)}{s}\left( 1-\frac{1}{s}\right) ^{k-2}\left( 1-\frac{1}{t}\right) . \end{aligned}$$
  (3)

    In the last case let s occupy the \(l\)-th place \((1 \le l \le {k-2})\) in the variational series of the set \(\{X_1, \ldots , X_{k-1}, s\}\). Then the required probability transforms into:

    $$\begin{aligned} \mathbb {P}_3= & {} (k-1)! \, \mathbb {P}\left( \frac{X_{k-1}}{X_{k-2}} < t, X_1 < \ldots<s< \ldots< X_{k-2} < X_{k-1}\right) \\= & {} \left( 1-\frac{1}{t}\right) C_{k-1}^{l-1}(1-F(s))^{k-l} F^{l-1}(s), \, 1 \le l \le {k-2} . \end{aligned}$$

Combining these results we get that the first term in the projection is equal to:

$$\begin{aligned}&\mathbb {P}(X_{(k,\{1, \ldots , k-1, s\})}/X_{(k-1,\{1, \ldots , k-1, s\})} < t)\\&\quad = F^{k-1}(s)-\mathbf 1 ( s\ge t) F^{k-1}\left( \frac{s}{t}\right) + \left( 1-\frac{1}{t}\right) \sum _{l=1}^{k-1}C_{k-1}^{l-1}(1-F(s))^{k-l} F^{l-1}(s). \end{aligned}$$

Again we can see that the last sum can be simplified as

$$\begin{aligned} \sum _{l=1}^{k-1}C_{k-1}^{l-1}(1-F(s))^{k-l} F^{l-1}(s)=1-F^{k-1}(s). \end{aligned}$$

Thus the initial probability is equal to

$$\begin{aligned} \mathbb {P}(X_{(k,\{1, \ldots , k-1, s\})}/X_{(k-1,\{1, \ldots , k-1, s\})} < t)= 1+\frac{1}{t}(F^{k-1}(s)-1) -\mathbf 1 ( s\ge t)F^{k-1}\left( \frac{s}{t}\right) . \end{aligned}$$

Hence we get the final expression for the projection of the family of kernels \({\varXi }_k(\cdot ,t)\):

$$\begin{aligned} \xi _k(s;t) =\frac{1}{t}\left( \left( 1-\frac{1}{s}\right) ^{k-1}-\frac{1}{k}\right) - \mathbf 1 ( s\ge t)\left( \left( 1-\frac{t}{s}\right) ^{k-1}-\frac{1}{k}\right) . \end{aligned}$$
(9)

It is easy to show that \(E(\xi _k (X; t))=0\). After some calculations we get that the variance of this projection under \(H_{0}\) is, for any \(t\ge 1\),

$$\begin{aligned} \delta _k^2(t)= & {} \frac{t+1}{(2k-1)t^2}+\frac{t-1}{k^2t^2}- \sum _{j=0}^{k-1}\frac{(-1)^{j}2(k-1)!(k-1)!}{(k+j)!(k-j-1)!}t^{j-1}\\&+\, (-1)^{k+1}\frac{2(k-1)!(k-1)!}{(2k-1)!}t^{k-2}F^{2k-1}(t)-\frac{2}{k^2t}F^k(t). \end{aligned}$$

3.1 Kolmogorov-type statistic \(D_n^{(3)}\)

In the case \(k=3\) the projection of the family of kernels \({\varXi }_3 (X,Y, Z;t)\), namely \(\xi _3 (s;t):=E({\varXi }_3 (X, Y, Z; t)\mid X=s)\) is equal to:

$$\begin{aligned} \xi _3 (s;t)= \frac{1}{t}\left( \frac{1}{s^2}-\frac{2}{s}+\frac{2}{3}\right) - \mathbf 1 \{s \ge t\}\left( \frac{t^2}{s^2}-\frac{2t}{s}+\frac{2}{3}\right) . \end{aligned}$$
(10)

Now we calculate variances of these projections \(\delta _3^2(t)\) under \(H_{0}\). Elementary calculations show that

$$\begin{aligned} \delta _3^2(t)=\frac{1}{45t^4}\left( 4t^3+4t^2-15t+7\right) . \end{aligned}$$

Hence our family of kernels \({\varXi }_3 (X,Y,Z;t)\) is non-degenerate in the sense of Nikitin (2010) and

$$\begin{aligned} \delta _3^2=\sup _{ t\ge 1} \delta _3^2(t)=0.035. \end{aligned}$$

This value will be important in the sequel when calculating the large deviation asymptotics (Figs. 1, 2, 3).
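The supremum can also be located numerically, for instance as in the following sketch (our own illustration, assuming SciPy).

```python
from scipy.optimize import minimize_scalar

delta3_sq = lambda t: (4 * t**3 + 4 * t**2 - 15 * t + 7) / (45 * t**4)
res = minimize_scalar(lambda t: -delta3_sq(t), bounds=(1, 20), method='bounded')
print(res.x, -res.fun)     # maximum ~ 0.035, attained near t ~ 1.9
```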

Fig. 1 Plot of the function \(\delta _3^2(t)\)

Fig. 2 Plot of the function \(b_1(t,\theta )\), first Ley–Paindaveine alternative

Fig. 3 Plot of the function \(\delta _4^2(t)\)

The limiting distribution of the statistic \(D_n^{(3)}\) is unknown. Using methods of Silverman (1983), one can show that the U-empirical process

$$\begin{aligned} \eta _n(t) =\sqrt{n} \left( H_n(t) - F_n(t)\right) , \quad t\ge 1, \end{aligned}$$

weakly converges in \(D(1,\infty )\) as \(n \rightarrow \infty \) to a certain centered Gaussian process \(\eta (t)\) with calculable covariance. Then the sequence of statistics \(\sqrt{n} D_n^{(3)}\) converges in distribution to the rv \(\sup _{t\ge 1} |\eta (t)|\), but at present it is impossible to find its distribution explicitly. Hence it is reasonable to determine the critical values of the statistic \(D_n^{(3)}\) by simulation.
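A simulation of this kind might look as follows; this is a minimal sketch of our own, assuming NumPy, and `D3` is a hypothetical helper computing \(D_n^{(3)}\) directly from the definition.

```python
import numpy as np
from itertools import combinations

def D3(x):
    # Kolmogorov-type statistic D_n^(3), computed by brute force over all triples.
    x = np.sort(np.asarray(x, dtype=float))
    triples = np.array(list(combinations(x, 3)))              # each row is already sorted
    ratios = np.sort(triples[:, 2] / triples[:, 1])           # largest / second largest
    ts = np.unique(np.concatenate([ratios, x]))
    # evaluate |H_n - F_n| from both sides of every jump point
    H_lo = np.searchsorted(ratios, ts, side='left') / len(ratios)
    H_hi = np.searchsorted(ratios, ts, side='right') / len(ratios)
    F_lo = np.searchsorted(x, ts, side='left') / len(x)
    F_hi = np.searchsorted(x, ts, side='right') / len(x)
    return max(np.abs(H_lo - F_lo).max(), np.abs(H_hi - F_hi).max())

rng = np.random.default_rng(2)
n, reps = 20, 2000                                               # the paper uses 10,000 replications
sims = [D3(1.0 / (1.0 - rng.random(n))) for _ in range(reps)]    # samples from the standard Pareto law
print(np.quantile(sims, [0.90, 0.95, 0.99]))                     # critical values for alpha = 0.1, 0.05, 0.01
```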

Now we obtain the logarithmic large deviation asymptotics of the sequence of statistics \(D_n^{(3)}\) under \(H_0\). The family of kernels \(\{{\varXi }_3(X, Y, Z; t), t\ge 1\}\) is not only centered but also bounded. Using results from Nikitin (2010) on large deviations for the supremum of non-degenerate U-statistics, we obtain the following result.

Theorem 6

For \(a>0\)

$$\begin{aligned} \lim _{n\rightarrow \infty } n^{-1} \ln P ( D_n^{(3)} >a) = - f_D^{(3)}(a), \end{aligned}$$

where the function \(f_D^{(3)}\) is continuous for sufficiently small \(a>0\), moreover

$$\begin{aligned} f_D^{(3)}(a) = (18 \delta _3^2)^{-1} a^2(1 + o(1)) \sim 1.598\, a^2, \quad \text{ as } \, \, a \rightarrow 0. \end{aligned}$$

3.2 The local Bahadur efficiency of \(D_n^{(3)}\)

To evaluate the efficiency, first consider again the first Ley–Paindaveine alternative with the d.f. \(G_1(x,\theta ), \ \theta \ge 0, \ x \ge 1,\) given above. By the Glivenko-Cantelli theorem for U-statistics (Janssen 1988) the limit in probability under the alternative for the statistic \(D_n^{(3)}\) is equal to

$$\begin{aligned} b_1(\theta ):= \sup _{t\ge 1}|b_1(t,\theta )|= \sup _{t\ge 1} |P_{\theta }(X_{(3,3)}/X_{(2,3)}<t)-G_1(t,\theta )|. \end{aligned}$$

It is not difficult to show that

$$\begin{aligned} b_1(t,\theta ) \sim 3\theta \int _{1}^{\infty } \xi _3(s; t)h_1(s)ds, \end{aligned}$$

where again \(h_1(s)=\frac{\partial }{\partial \theta }g_1(s,\theta )\mid _{\theta =0}\) and \(\xi _3(s;t)\) is the projection defined above in (10). Hence for the first Ley–Paindaveine alternative we have for \(t\ge 1\):

$$\begin{aligned} b_1(t,\theta ) \sim \frac{t-1}{2t^2}\theta , \quad \theta \rightarrow 0. \end{aligned}$$

Thus \(b_1(\theta )=\sup _{t\ge 1}|b_1(t,\theta )| \sim 0.125\,\theta \) (the supremum is attained at \(t=2\)), and it follows that the local exact slope of the sequence of statistics \(D_n^{(3)}\) admits the representation:

$$\begin{aligned} c_1(\theta ) \sim b^2_1(\theta )/(9\delta ^2_3) \sim 0.05\,\theta ^2, \quad \theta \rightarrow 0. \end{aligned}$$

The Kullback-Leibler information in this case is given by (7). Hence the local Bahadur efficiency of our test is \(e^B_1(D)= 0.599\).

Next we take the second Ley–Paindaveine distribution, where calculations are similar, and the local BE is equal to 0.689. In the case of the log-Weibull density we find that the local BE is 0.467.

We collect the values of the local BE in Table 4.

Table 4 Local Bahadur efficiencies for \(D_n^{(3)}\)

3.3 Kolmogorov-type statistic \(D_n^{(4)}\)

In the case \(k=4\) the projection of the family of kernels \({\varXi }_4 (X,Y, Z, W;t)\) is equal to:

$$\begin{aligned} \xi _4 (s;t)= \frac{1}{t}\left( \left( 1-\frac{1}{s}\right) ^3-\frac{1}{4}\right) - \mathbf 1 \{s \ge t\}\left( -\left( \frac{t}{s}\right) ^3+3\left( \frac{t}{s}\right) ^2-\frac{3t}{s}+\frac{3}{4}\right) . \end{aligned}$$

Therefore we get that the variance of these projections \(\delta _4^2(t)\) under \(H_{0}\) equals

$$\begin{aligned} \delta _4^2(t)=\frac{1}{560t^5}\left( 45t^4+45t^3-252t^2+224t-62\right) . \end{aligned}$$

Hence our family of kernels \({\varXi }_4 (X,Y,Z,W;t)\) is non-degenerate in the sense of Nikitin (2010) and

$$\begin{aligned} \delta _4^2=\sup _{ t\ge 1} \delta _4^2(t)=0.026. \end{aligned}$$

As in the previous section, the limiting distribution of the statistic \(D_n^{(4)}\) is unknown.

The logarithmic large deviation asymptotics of the sequence of statistics \(D_n^{(4)}\) under \(H_0\) is given in the next theorem.

Theorem 7

For \(a>0\)

$$\begin{aligned} \lim _{n\rightarrow \infty } n^{-1} \ln P ( D_n^{(4)} >a) = - f_D^{(4)}(a), \end{aligned}$$

where the function \(f_D^{(4)}\) is continuous for sufficiently small \(a>0\), moreover

$$\begin{aligned} f_D^{(4)}(a) =(32 \, \delta _4^2)^{-1} a^2(1 + o(1)) \sim 1.211\, a^2, \quad \text{ as } \, \, a \rightarrow 0. \end{aligned}$$

3.4 The local Bahadur efficiency of \(D_n^{(4)}\)

In Table 5 we collect the calculated efficiencies for the statistic \(D_n^{(k)}\), together with the results from Table 4 and the maximal values of the efficiencies against our alternatives.

Table 5 Comparative table of local efficiencies for statistic \(D_n^{(k)}\)

We observe that the efficiencies of the Kolmogorov-type test are lower than those of the integral test. However, this is the usual situation in goodness-of-fit testing (Nikitin 1995; Rank 1999; Nikitin 2010).

3.5 Critical values

Tables 6 and 7 show the critical values of the null distribution of \(D_n^{(3)}\) and \(D_n ^{(4)}\) for significance levels \(\alpha = 0.1, 0.05, 0.01\) and several sample sizes n. Each entry is obtained by Monte-Carlo simulation with 10,000 replications.

Table 6 Critical values for the statistic \(D_n^{(3)}\)
Table 7 Critical values for the statistic \(D_n^{(4)}\)

4 Power comparison

We recall the computational formulae for the statistics \(I_n^{(k)}\) and \(D_n^{(k)}\), \(k \in \{3, 4\}\):

$$\begin{aligned} I_n^{(k)}= & {} \int _{1}^{\infty } \left( H_n(t)-F_n(t)\right) dF_n(t)\\= & {} {n \atopwithdelims ()k+1}^{-1}\sum _{1 \le i_1<\ldots < i_{k+1} \le n} \frac{1}{k+1}\sum _{\pi (i_1, \ldots , i_{k+1})}\mathbf 1 \left( X_{(k,\{i_1,\ldots ,i_k\})}/X_{(k-1,\{i_1,\ldots ,i_k\})} < X_{i_{k+1}}\right) -\frac{1}{2}, \end{aligned}$$

where \(\pi (i_1, \ldots , i_{k+1})\) means all permutations of different indices from \(\{i_1, \ldots ,i_{k+1}\}\).

$$\begin{aligned} D_n^{(k)}= & {} \sup _{t \ge 1}\mid H_n(t)-F_n(t)\mid \\= & {} \sup _{t \ge 1}\mid {n \atopwithdelims ()k}^{-1}\sum _{1 \le i_1<\ldots < i_k \le n} \bigg [\mathbf 1 (X_{(k,\{i_1,\ldots ,i_k\})}/X_{(k-1,\{i_1,\ldots ,i_k\})} < t) -\frac{1}{k}\sum _{l=1}^{k} \mathbf 1 (X_{i_l} <t)\bigg ] \mid . \end{aligned}$$
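As an illustration of the first formula, a direct (brute-force) implementation of \(I_n^{(3)}\) might look as follows; this is our own sketch assuming NumPy, not code used for the computations in the paper.

```python
import numpy as np
from itertools import combinations

def I3(x):
    # Integral statistic I_n^(3): a U-statistic of degree 4, computed by brute force.
    x = np.asarray(x, dtype=float)
    quads = list(combinations(range(len(x)), 4))
    total = 0.0
    for quad in quads:
        inner = 0.0
        for j in quad:                                   # index playing the role of i_{k+1}
            rest = sorted(x[i] for i in quad if i != j)
            inner += rest[2] / rest[1] < x[j]            # 1{ratio of the two largest < X_j}
        total += inner / 4.0
    return total / len(quads) - 0.5

rng = np.random.default_rng(3)
x = 1.0 / (1.0 - rng.random(20))         # standard Pareto sample of size n = 20
print(I3(x))                             # close to 0 under H_0; sqrt(n) I_n^(3) is asymptotically N(0, 11/120)
```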

This section presents the results of a Monte-Carlo study comparing the powers of the new tests with those of the Kolmogorov-Smirnov (KS) and Cramer-von Mises (CvM) tests, which are widely applied for such hypotheses. The comparison is done for the sample size \(n=20\) and for the significance level \(\alpha =0.05\). All calculations were done using JAVA (The Apache Commons Mathematics Library) and the R package with 10,000 replications. We consider the following distributions as alternatives to the Pareto distribution:

  1)

    the Gamma alternative \({\varGamma }(\theta )\) with the density \(({\varGamma }(\theta ))^{-1}x^{\theta -1}\exp (-x)\);

  2)

    the log-normal law \(LN(\theta )\) with the density \((\theta x)^{-1}(2\pi )^{-1/2}\exp (-(\log {x})^2/2\theta ^2)\);

  3)

    the first Ley–Paindaveine alternative \(LP1(\theta )\) with the d.f. \((1-\frac{1}{x})\exp (-\theta /x),\theta \ge 0, x \ge 1\);

  4)

    log-Weibull alternative with the d.f. \(1-e^{-(\ln {x})^{\theta +1}},\theta \in (0,1), x\ge 1\);

  5)

    the Weibull distribution \(W(\theta )\) with the density \(\theta x^{\theta -1}\exp (-x^{\theta })\).

The KS and CvM tests are not directly applicable to a composite hypothesis, so we first estimate the parameter \(\lambda \) by its maximum likelihood estimator (MLE) \(\hat{\lambda }=n(\sum _{k=1}^n \ln {X_k})^{-1}\), and then calculate the critical values of the corrected tests and the powers for the sample size \(n=20\) using the Monte-Carlo procedure. The powers are given in Table 8.
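The estimation step for the classical tests can be sketched as follows (our own illustration assuming NumPy and SciPy; only the KS statistic with the estimated parameter is shown, and its critical values must still be obtained by simulation, as described above).

```python
import numpy as np
from scipy import stats

def ks_pareto_mle(x):
    # KS statistic for H_0: Pareto with unknown lambda, replaced by its MLE.
    # Since lambda is estimated, the usual KS critical values do not apply;
    # they have to be recomputed by Monte-Carlo, as described in the text.
    x = np.asarray(x, dtype=float)
    lam = len(x) / np.log(x).sum()                     # MLE: n / sum(ln X_i)
    return stats.kstest(x, lambda t: 1.0 - t ** (-lam)).statistic

rng = np.random.default_rng(4)
x = 1.0 / (1.0 - rng.random(20))                       # a sample from the null model
print(ks_pareto_mle(x))
```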

Table 8 Comparative table of the power simulation for different statistics

We observe that the powers of our tests correspond to their local Bahadur efficiencies for the alternatives considered. On the whole, in comparison with the classical tests, our new statistics are more sensitive to alternatives whose density shape is similar to that of the Pareto distribution, such as the first Ley–Paindaveine alternative. However, they are less responsive to close alternatives with a different shape (for example, when the density has bends that differ from the Pareto density), such as the gamma and log-Weibull alternatives.

5 Application to the real data

In this section we apply our tests to the real data example from Hogg and Klugman (1984), where it is discussed in detail. The data set represents the losses due to wind-related catastrophes in 1977, rounded to the nearest million dollars; only losses exceeding $2 million are included:

$$\begin{aligned} \begin{array}{llllllllllllllllllll} 2 &{} \quad 2 &{} \quad 2 &{} \quad 2 &{} \quad 2 &{} \quad 2 &{} \quad 2 &{} \quad 2 &{} \quad 2 &{} \quad 2 &{} \quad 2 &{} \quad 2 &{} \quad 3 &{} \quad 3 &{} \quad 3 &{} \quad 3 &{} \quad 4 &{} \quad 4 &{} \quad 4 &{} \quad 5\\ 5 &{} \quad 5 &{} \quad 5 &{} \quad 6 &{} \quad 6 &{} \quad 6 &{} \quad 6 &{} \quad 8 &{} \quad 8 &{} \quad 9 &{} \quad 15 &{} \quad 17 &{} \quad 22 &{} \quad 23 &{} \quad 24 &{} \quad 24 &{} \quad 25 &{} \quad 27 &{} \quad 32 &{} \quad 43 \end{array} \end{aligned}$$

These data are widely analyzed in the literature, see Brazausskas and Serfling (2003) for details; new goodness-of-fit tests for them were proposed in Rizzo (2009). First we apply the same data de-grouping method as Brazausskas and Serfling (2003) and Rizzo (2009). This method is needed because of the initial rounding of the data: it spreads the grouped discrete observations uniformly over their grouping intervals. Put

$$\begin{aligned} X_k = \left( 1-\frac{k}{m+1}\right) A + \frac{k}{m+1}B, k=1,\ldots ,m, \end{aligned}$$

where \((A,B)\) is a grouping interval containing m observations. For example, following Brazausskas and Serfling (2003), for the observations recorded as 2 we take \((A,B)=(1.5, 2.5)\).
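The de-grouping step takes only a couple of lines; a sketch (our own, assuming NumPy):

```python
import numpy as np

def degroup(A, B, m):
    # spread the m rounded observations uniformly over their grouping interval (A, B)
    k = np.arange(1, m + 1)
    return (1 - k / (m + 1)) * A + k / (m + 1) * B

print(degroup(1.5, 2.5, 12))   # the twelve losses recorded as 2 million dollars
```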

We tested the null hypothesis \(H_0: X\) has the Pareto distribution with scale parameter \(\sigma = 1.5\) and tail parameter estimated by the MLE \(\lambda = 0.764\). These parameter values are considered in Brazausskas and Serfling (2003), Hogg and Klugman (1984) and Rizzo (2009); Philbrick and Jurschak (1981) applied \(\sigma = 2.0\).

Applying our tests to these data, we obtain the p-values of the test statistics \(I_n^{(k)}\) and \(D_n^{(k)}\) reported in Table 9, based on 10,000 simulations.

Table 9 Goodness-of-fit tests for fitted Pareto models based on MLE \(\lambda = 0.764\) and \(\sigma = 1.5\) for statistics \(I_n^{(k)}\), \(D_n^{(k)}\)

So we conclude that our tests do not reject the null hypothesis. This corresponds to the results of Brazausskas and Serfling (2003) and Rizzo (2009).

6 Conditions of the local asymptotic optimality

In this section we are interested in conditions of the local asymptotic optimality (LAO) in the Bahadur sense for both sequences of statistics \(I_n^{(k)}\) and \(D_n^{(k)}\). This means describing the local structure of the alternatives for which a given statistic has maximal potential local efficiency, so that the relation

$$\begin{aligned} c_T(\theta ) \sim 2 K(\theta ),\quad \theta \rightarrow 0, \end{aligned}$$

holds (Nikitin 1995; Nikitin and Tchirina 1996). Such alternatives form the domain of LAO for the given sequence of statistics.

Consider functions

$$\begin{aligned} H(x)=G^{'}_{\theta }(x,\theta )\mid _{\theta =0},\quad h(x)=g^{'}_{\theta } (x,\theta )\mid _{\theta =0}. \end{aligned}$$

We will assume that the following regularity conditions are true (see also Nikitin and Tchirina 1996):

$$\begin{aligned}&\int _{1}^{\infty } h^2(x)x^2 \, dx < \infty \quad \text{ where } \quad h(x)=H'(x), \, \end{aligned}$$
(11)
$$\begin{aligned}&\frac{\partial }{\partial \theta }\int _{1}^{\infty } g(x,\theta ) \ln {x} \, dx \mid _{\theta =0} \ = \ \int _{1}^{\infty } h(x)\ln {x} \, dx . \end{aligned}$$
(12)

Denote by \(\mathscr {G}\) the class of densities \(g(x,\theta )\) with d.f.’s \(G(x,\theta )\), satisfying the regularity conditions (11)–(12). We are going to deduce the LAO conditions in terms of the function h(x).

Recall that for alternative densities from \({\mathscr {G}}\) the following asymptotics is valid:

$$\begin{aligned} 2K(\theta )\sim \theta ^2 \left[ \int _{1}^{\infty } h^2(x)x^2\, dx -\left( \int _{1}^{\infty } h(x)\ln {x} \, dx\right) ^2\right] ,\quad \theta \rightarrow 0. \end{aligned}$$

6.1 LAO conditions for \(I_n^{(k)}\)

First consider the integral statistic \(I_n^{(k)}\) with the kernel \({\varPsi }_k(X_1, \ldots , X_{k+1})\) and its projection \(\psi _k(x)\) from (4). Let us introduce the auxiliary function

$$\begin{aligned} h_0(x) = h(x) - \frac{(\ln {x}-1)}{x^2}\int _1^\infty h(u)\ln {u}\, du. \end{aligned}$$

Simple calculations show that

$$\begin{aligned}&\int _{1}^{\infty } h^2(x)x^2dx -\left( \int _{1}^{\infty } h(x)\ln {x} \, dx\right) ^2=\int _{1}^{\infty } h_0^2(x) x^2 dx,\\&\int _{1}^{\infty } \psi _k(x)h(x)dx = \int _{1}^{\infty } \psi _k(x)h_0(x)dx. \end{aligned}$$

Hence the local asymptotic efficiency takes the form

$$\begin{aligned} e^B (I_n^{(k)})= & {} \lim _{\theta \rightarrow 0} b_I^2(\theta ) / \left( (k+1)^2{\varDelta }_k^2 \cdot 2K(\theta )\right) \\= & {} \left( \int _{1}^{\infty } \psi _k(x)h_0(x)dx\right) ^2/\left( \int _{1}^{\infty }\psi _k^2(x) \frac{dx}{x^2} \cdot \int _{1}^{\infty } h_0^2(x)x^2 dx \right) . \end{aligned}$$

By Cauchy-Schwarz inequality we obtain that the expression in the right-hand side is equal to 1 iff \(h_0(x)=C_1\psi _k(x)\frac{1}{x^2}\) for some constant \(C_1>0\), so that

$$\begin{aligned} h(x) =(C_1\psi _k(x)+ C_2 (\ln {x}-1))\frac{1}{x^2} \quad \text { for some constants } C_1>0 \text { and } C_2. \end{aligned}$$
(13)

The set of distributions for which the function h(x) has this form generates the domain of LAO in the class \(\mathscr {G}\). The simplest examples of such alternative densities \(g(x,\theta )\) for small \(\theta > 0\) are given in Table 10.

Table 10 Examples of LAO alternative density \(g(x,\theta )\) for statistic \(I_n^{(k)}\)

6.2 LAO conditions for \(D_n^{(k)}\)

Now let us consider the Kolmogorov type statistic \(D_n^{(k)}\) with the family of kernels \({\varXi }_k\) and their projections \(\xi _k(x;t)\) from (9). After simple calculations we get

$$\begin{aligned} \int _{1}^{\infty } \xi _k(x; t)h(x)dx = \int _{1}^{\infty } \xi _k(x; t )h_0(x)dx, \quad \forall t \in [1,\infty ). \end{aligned}$$

Hence the local efficiency takes the form

$$\begin{aligned} e^B (D_n^{(k)})= & {} \lim _{\theta \rightarrow 0} b_D^2(\theta )\Big /\left( \sup _{t \ge 1}\left( k^2 \delta _k^2(t)\right) \cdot 2K(\theta ) \right) \\= & {} \frac{ \sup _{t \ge 1}\bigg ( \int _{1}^{\infty } \xi _k(x;t)h_0(x)dx\bigg )^2 }{ \ \sup _{t \ge 1} \bigg ( \int _{1}^{\infty } \xi _k^2 (x;t) \frac{dx}{x^2} \cdot \int _{1}^{\infty } h_0^2(x)x^2 dx\bigg )} \le 1. \end{aligned}$$

We can once again apply the Cauchy-Schwarz inequality to the numerator in the last ratio. It follows that the sequence of statistics \(D_n^{(k)}\) is locally asymptotically optimal, and \(e^B (D_n^{(k)})=1\) iff

$$\begin{aligned} h(x)=(C_3\xi _k(x; t_0)+ C_4 (\ln {x}-1))\cdot \frac{1}{x^2} \quad \text { for } t_0= \arg \sup _{t \ge 1}\delta _k^2(t)\, \end{aligned}$$

and some constants \(C_3>0\) and \(C_4\).

Distributions with such h(x) form the domain of LAO in the class \({\mathscr {G}}\). The simplest examples are given in Table 11.

Table 11 Examples of LAO alternative density \(g(x,\theta )\) for statistic \(D_n^{(k)}\)

7 Conclusion

We constructed two new goodness-of-fit tests for the Pareto distribution based on a new characterization of this distribution. We described their limiting distributions and large deviation asymptotics. The Bahadur efficiencies for some alternatives have been obtained and turned out to be reasonably high. We also derived the conditions of local optimality of our tests. These tests were compared with some commonly used goodness-of-fit tests, and in most cases our tests are more powerful. They can be of use in statistical research, especially when the alternative is close to an alternative from the LAO class.