Abstract
A new characterization of the Pareto distribution is proposed, and new goodness-of-fit tests based on it are constructed. The test statistics are functionals of U-empirical processes. The first statistic is of integral type and is similar to the classical statistic \(\omega _n^1\). The second is a Kolmogorov type statistic. We show that the kernels of our statistics are non-degenerate. The limiting distributions and large deviation asymptotics of the new statistics under the null hypothesis are described. Their local Bahadur efficiency for parametric alternatives is calculated. This type of efficiency is most appropriate for our problem, since the Kolmogorov type statistic is not asymptotically normal, and hence the Pitman approach is not applicable to it. For the second statistic we evaluate the critical values by Monte-Carlo methods. Conditions of local optimality of the new statistics in the Bahadur sense are also discussed, and examples of such special alternatives are given. For small sample sizes we compare the power of these tests with that of some common goodness-of-fit tests.
1 Introduction
Let \({\mathscr {P}}\) be the family of the Pareto distributions with the distribution function (d.f.)
In this paper we develop goodness-of-fit tests for the Pareto distribution using a new characterization based on a property of order statistics. The problem is formulated as follows: let \(X_1,\ldots ,X_n\) be positive i.i.d. random variables (rv’s) with continuous d.f. F. Consider testing the composite hypothesis \(H_0: F \in {\mathscr {P}}\) against the general alternative \(H_1: F \notin {\mathscr {P}}\), assuming that the alternative d.f. is also concentrated on \([1,\infty )\).
Goodness-of-fit tests for the Pareto distribution have been discussed in Beirlant et al. (2006), Gulati and Shapiro (2008), Martynov (2009), Rizzo (2009). We exploit a different idea for constructing and analyzing statistical tests, based on characterizations by the equidistribution of linear statistics, by means of so-called U-empirical d.f.’s (Janssen 1988; Korolyuk and Borovskikh 1994). This method was developed earlier in several articles, in particular in Nikitin (1996), Nikitin and Peaucelle (2004), Nikitin and Tchirina (1996), Nikitin and Volkova (2010), Nikitin and Volkova (2012), Litvinova (2004). Tests for the Pareto distribution using this approach were obtained and analyzed in Jovanovic et al. (2014). One can observe that the new tests based on characterizations have reasonably high efficiencies and can be competitive with previously known goodness-of-fit tests. Let us explain our approach.
We will say that the d.f. F belongs to the class of distributions \(\mathscr {F}\), if \(\forall x_1, x_2\): either \(F(x_1x_2)\le F(x_1)F(x_2)\) or \(F(x_1x_2)\ge F(x_1)F(x_2)\), see Ahsanullah (1989).
Let \(X_1,\ldots ,X_n\) be i.i.d. positive absolutely continuous random variables with d.f. F from the class \(\mathscr {F}\). Denote by \(X_{(1,n)}\le X_{(2,n)}\le \ldots \le X_{(n,n)}\) the order statistics of the random sample \(X_1,...,X_n\).
We present a new characterization within the class \(\mathscr {F}\).
Theorem 1
For fixed k let \(X_1,...,X_k\) be i.i.d., positive and bounded rv’s having an absolutely continuous (with respect to Lebesgue measure) d.f. F(x). Then \(X_1\) and \(X_{(k,k)}/X_{(k-1,k)}\) are equal in law iff \(X_1\) has some d.f. from the family \({\mathscr {P}}\).
Proof
Let \(Y=\ln {X}\) and let G denote the d.f. of Y. It is easily seen that \(F \in \mathscr {F}\) iff G is NBU (“new better than used”) or NWU (“new worse than used”) (Ahsanullah 1977). Further, since the logarithm is a monotone transformation, \(X_1\) and \(X_{(k,k)}/X_{(k-1,k)}\) are identically distributed iff \(Y_1\) and \(Y_{(k,k)}-Y_{(k-1,k)}\) are identically distributed. It follows from Ahsanullah (1977) that this holds iff \(Y=\ln {X}\) has the exponential distribution with some scale parameter \(\lambda \); therefore \(X_1\) has the Pareto distribution with the same parameter \(\lambda \). \(\square \)
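The characterization behind Theorem 1 can be checked numerically. The sketch below (all names and the sample sizes are ours) simulates standard Pareto data with \(\lambda = 1\) and compares the empirical d.f. of \(X_{(k,k)}/X_{(k-1,k)}\) for \(k=3\) with that of a single observation via the two-sample Kolmogorov distance:

```python
# Monte-Carlo sanity check of Theorem 1 for k = 3 (a sketch, not part of the
# formal argument): under the standard Pareto law F(x) = 1 - 1/x, x >= 1,
# the ratio X_(k,k)/X_(k-1,k) should be distributed as a single observation.
import bisect
import random

random.seed(1)

def pareto():
    # Standard Pareto (lambda = 1) via inverse transform: X = 1/(1 - U).
    return 1.0 / (1.0 - random.random())

def top_ratio(xs):
    # X_(k,k) / X_(k-1,k) for an already sorted subsample xs.
    return xs[-1] / xs[-2]

k, m = 3, 20000
ratios = sorted(top_ratio(sorted(pareto() for _ in range(k))) for _ in range(m))
singles = sorted(pareto() for _ in range(m))

# Two-sample Kolmogorov distance between the two empirical d.f.'s;
# small values support the equidistribution claim.
ks = max(
    abs(bisect.bisect_left(ratios, t) / m - bisect.bisect_left(singles, t) / m)
    for t in ratios + singles
)
print(ks)
```

For two samples of this size, the Kolmogorov distance is of order \(0.01\) when the two laws coincide, in agreement with the theorem.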
In the case \(k=2\) our characterization coincides with another characterization of the Pareto distribution considered in Jovanovic et al. (2014), see also Nikitin and Volkova (2012). Thus our characterization extends the one involved in Jovanovic et al. (2014).
According to our characterization we construct the U-empirical d.f. by the formula
where \(X_{(s,\{i_1,\ldots ,i_k\})}, \, s\in \{k-1,k\},\) denotes the \(s\)-th order statistic of the subsample \(X_{i_1},\ldots ,X_{i_k}\). For a single rv X the \(U\)-statistical d.f. is simply the usual empirical d.f. \(F_n(t)=n^{-1}\sum _{i=1}^n\mathbf 1 (X_i<t), t \in R^1\), based on the observations \(X_1,\dots ,X_n\).
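The construction above can be sketched in code. The concrete form of \(H_n(t)\) used below, namely the proportion of k-element subsamples whose ratio of the two largest order statistics lies below t, is our reading of the displayed definition; function names are ours.

```python
# Sketch of the U-empirical d.f. from the characterization: H_n(t) counts
# k-subsamples {X_{i_1},...,X_{i_k}} with X_(k,.)/X_(k-1,.) < t, and
# F_n(t) is the usual empirical d.f. based on the observations.
from itertools import combinations
from math import comb
import random

def H_n(sample, t, k=3):
    hits = 0
    for sub in combinations(sample, k):
        xs = sorted(sub)
        if xs[-1] / xs[-2] < t:     # X_(k,.) / X_(k-1,.) < t
            hits += 1
    return hits / comb(len(sample), k)

def F_n(sample, t):
    return sum(x < t for x in sample) / len(sample)

random.seed(2)
sample = [1.0 / (1.0 - random.random()) for _ in range(30)]  # standard Pareto
print(H_n(sample, 2.0), F_n(sample, 2.0))  # close to each other under H_0
```

Under \(H_0\) both values estimate \(F(2) = 1/2\), so their difference is small, which is exactly the closeness exploited by the test statistics below.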
It is known that the properties of U-empirical d.f.’s are similar to the properties of usual empirical d.f.’s (Helmers et al. 1988; Janssen 1988). Hence the difference \(H_n- F_n\) for large n should be almost surely close to zero under \(H_0\), and we can measure their closeness by using some test statistics, assuming their large values to be critical.
We suggest two test statistics
Note that both proposed statistics under \(H_0\) are invariant with respect to the change of variables \( X \rightarrow X^{\frac{1}{\lambda }}\), so we may set \(\lambda =1\).
We discuss their limiting distributions under the null hypothesis and find the logarithmic asymptotics of large deviations under \(H_0\). Next we calculate their efficiencies against some parametric alternatives from the class \(\mathscr {F}\). We use the notion of local exact Bahadur efficiency (BE) (Bahadur 1971; Nikitin 1995), since the statistic \(D_n^{(k)}\) has a non-normal limiting distribution, so that the Pitman approach to efficiency is not applicable. However, it is known that the local BE and the limiting Pitman efficiency usually coincide, see Wieand (1976), Nikitin (1995).
Finally, we study the conditions of the local optimality of our tests, describe the “most favorable” alternatives for them and compare the powers of our tests with some standard goodness-of-fit tests.
The family of d.f.’s under the null hypothesis is a particular case of the so-called Pareto type I distribution with the d.f. \(P_1(x) = 1-(\frac{x}{\beta })^{-\lambda }, \ x \ge \beta > 0, \ \lambda >0\), see, for example, Arnold (1983). In practical goodness-of-fit testing based on our new tests, the unknown parameters of the hypothesized Pareto distribution can be estimated by a number of methods, see Arnold (1983, Ch. 5), Kleiber and Kotz (2003, Ch. 3), Brazausskas and Serfling (2003), Rizzo (2009). One can first estimate the parameter \(\beta \), for example by the MLE \(\hat{\beta }= \min _{i=1, \ldots , n} X_i\). Then the sample \(X_1,\ldots ,X_n\) can be transformed to the new sample \(Y_1,\ldots ,Y_n\), where \(Y_i=X_i/\hat{\beta }\), which has the d.f. considered in (1).
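The pre-processing step just described can be sketched as follows (the parameter values are ours, chosen for illustration):

```python
# Sketch of the scale pre-processing: estimate beta by its MLE, the sample
# minimum, and rescale so the transformed sample has support [1, infinity).
import random

random.seed(3)
beta, lam = 2.5, 1.7
# Pareto type I sample via inverse transform: X = beta * (1 - U)^(-1/lambda).
xs = [beta * (1.0 - random.random()) ** (-1.0 / lam) for _ in range(50)]

beta_hat = min(xs)                  # MLE of the scale parameter
ys = [x / beta_hat for x in xs]     # transformed sample with d.f. (1)
print(beta_hat, min(ys))            # min(ys) equals 1 by construction
```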
There also exists a second Pareto model, the so-called Pareto type II distribution with d.f. \(P_2(x) = 1-(1+\frac{x-\mu }{\beta })^{-\lambda }, \ x \ge \mu \in \mathbb {R}, \ \lambda >0\). The Pareto type I and type II models are related by the following transformation: if a rv X has a Pareto type II distribution, then \(X-(\mu -\beta )\) has a Pareto type I distribution. Therefore, using one of the estimators of the location parameter \(\mu \), one can reduce a Pareto type II rv to our model. We do not discuss the differences between the parameter estimators and concentrate on constructing the goodness-of-fit tests.
2 Integral statistic \(I_{n}^{(k)}\)
The statistic \(I_{n}^{(k)}\) is asymptotically equivalent to the U-statistic of degree \((k+1)\) with the centered kernel
where \(\pi (i_1, \ldots , i_{k+1})\) means all permutations of different indices from \(\{i_1, \ldots ,i_{k+1}\}\).
Let \(X_1,\ldots , X_{k+1}\) be independent rv’s from the standard Pareto distribution. It is known that non-degenerate U-statistics are asymptotically normal (Hoeffding 1948; Korolyuk and Borovskikh 1994). To prove that the kernel \({\varPsi }_k(X_{1},\ldots , X_{{k+1}})\) is non-degenerate, we calculate its projection \(\psi _k(s)\). For a fixed \(X_{{k+1}}=s, \, s\ge 1\) we have:
It follows from the above characterization that the second probability is equal to:
It remains to calculate the first term. For this purpose we decompose the probability as \(\mathbb {P}(X_{k,\{2,\ldots , k,s \}}/X_{k-1,\{2,\ldots , k,s\}} < X_{1}) = \mathbb {P}_1+\mathbb {P}_2+\mathbb {P}_3\), where \(\mathbb {P}_i, \ i=1,2,3,\) are the corresponding probabilities computed in the following mutually exclusive cases:
-
(1)
Let the sample units be ordered as \(X_2 < \ldots < X_k <s\). Then our probability transforms into
$$\begin{aligned} \mathbb {P}_1= & {} (k-1)! \, \mathbb {P}\left( \frac{s}{X_k} <X_1, X_2 < \ldots < X_k <s\right) \\= & {} (k-1)! \, \mathbb {P}\left( X_k< s, X_1 > \frac{s}{X_k} , X_2 < X_3, X_3 <X_4, \ldots , X_{k-1} < X_{k}\right) . \end{aligned}$$After some calculations we obtain that the last probability is equal to:
$$\begin{aligned}&(k-1)! \int _1^s \left( 1-F\left( \frac{s}{x_k}\right) \right) \frac{F^{k-2}(x_k)}{(k-2)!} d F(x_k)\\&\quad = F^{k-1}(s)-(k-1) \int _1^s \left( 1-\frac{1}{x}\right) ^{k-2} \left( 1-\frac{x}{s}\right) \frac{dx}{x^2}. \end{aligned}$$The integral in the second term can be evaluated using integration by parts and binomial representation of the function \((1-\frac{1}{x})^{k-1}\). Finally we have:
$$\begin{aligned}&\int _1^s \left( 1-\frac{1}{x}\right) ^{k-2} \left( 1-\frac{x}{s}\right) \frac{dx}{x^2} = \frac{1}{s(k-1)}\int _{1}^{s}\sum _{j=0}^{k-1}(-1)^j {{k-1} \atopwithdelims ()j}x^{-j} dx\\&\quad = \frac{1}{s(k-1)}\left( s-1-(k-1)\ln {(s)}+\sum _{j=2}^{k-1}(-1)^j {{k-1} \atopwithdelims ()j}\frac{1-s^{-(j-1)}}{j-1}\right) . \end{aligned}$$Thus the initial probability in this case is equal to
$$\begin{aligned} \mathbb {P}_1=F^{k-1}(s)-F(s)+(k-1)\frac{\ln {s}}{s}-\frac{1}{s}\sum _{j=2}^{k-1}(-1)^j {{k-1} \atopwithdelims ()j}\frac{1-s^{-(j-1)}}{j-1}. \end{aligned}$$ -
(2)
Suppose the sample units are ordered as \(X_2< X_3<\ldots< X_{k-1}< s < X_k\); then we have:
$$\begin{aligned} \mathbb {P}_2= & {} (k-1)! \, \mathbb {P}\left( \frac{X_k}{s} <X_1, X_2 < X_3<\ldots X_{k-1}< s < X_k\right) \\= & {} (k-1)! \, \mathbb {P}\left( X_k> s, X_1 > \frac{X_k}{s} , X_2 < X_3, X_3 < X_4, \ldots , X_{k-1}<s\right) \\= & {} (k-1)! \int _s^{\infty } \left( 1-F\left( \frac{x_k}{s}\right) \right) \frac{F^{k-2}(s)}{(k-2)!} d F(x_k)\\= & {} \frac{(k-1)}{2s}F^{k-2}(s). \end{aligned}$$ -
(3)
The last case we consider is when s is situated in the \(j\)-th position \((1 \le j \le {k-2})\) of the variational series of the sample \(X_2, \ldots , X_{k-2}\). This means that the sample units are ordered as \(X_2< \ldots< s< \ldots<X_{k-2}< X_{k-1} < X_k\), where s may also stand in the first or the \((k-2)\)-th position. Then the required probability is equal to
$$\begin{aligned} \mathbb {P}_3= & {} (k-1)! \, \mathbb {P}\left( \frac{X_k}{X_{k-1}} <X_1, X_2< \ldots < s < \ldots <X_{k-2}< X_{k-1} < X_k\right) \\= & {} \frac{1}{2} C_{k-1}^{j-1}(1-F(s))^{k-j} F^{j-1}(s), \, 1 \le j \le {k-2}. \end{aligned}$$
Combining the results we get that the first term in the projection has the form:
Note that the last sum is equal to \(\sum _{j=1}^{k-1}C_{k-1}^{j-1}(1-F(s))^{k-j} F^{j-1}(s)=1-F^{k-1}(s)\). Thus for the initial probability we get the result:
Hence we get the final expression for the projection of the kernel \({\varPsi }_k\):
The calculation of the variance for the projection \(\psi _k\) in the general case is too complicated, therefore we calculate it only for particular k.
2.1 Integral statistic \(I_{n}^{(3)}\)
The projection \(\psi _k(s)\) for the case \(k=3\) has the form:
The variance of this projection \({\varDelta }_3^2 = E\psi _3^2(X_1)\) under \(H_{0}\) is given by
Therefore the kernel \({\varPsi }_3\) is centered and non-degenerate. We can apply Hoeffding’s theorem on asymptotic normality of U-statistics, see again Hoeffding (1948), Korolyuk and Borovskikh (1994), which implies that the following result holds
Theorem 2
Under the null hypothesis as \(n \rightarrow \infty \) the statistic \(\sqrt{n}I_{n}^{(3)}\) is asymptotically normal so that
Now we evaluate the large deviation asymptotics of the sequence of statistics \(I_{n}^{(3)}\) under \(H_0\). Since the kernel \({\varPsi }_3\) is centered, bounded and non-degenerate, the theorem on large deviations of such statistics from Nikitin and Ponikarov (1999), see also DasGupta (2008), Nikitin (2010), yields the following result.
Theorem 3
For \(a>0\)
where the function \(f_I^{(3)}\) is continuous for sufficiently small \(a>0\), and
2.2 Some notions from the Bahadur theory
Suppose that under the alternative \(H_1\) the observations have the d.f. \(G(\cdot ,\theta )\) and the density \(g(\cdot ,\theta ), \ \theta \ge 0\), such that \(G(\cdot , 0) \in {\mathscr {P}}\). The measure of the Bahadur efficiency (BE) for any sequence \(\{T_n\}\) of test statistics is the exact slope \(c_{T}(\theta )\) describing the rate of an exponential decrease for the attained level under the alternative d.f. \(G(\cdot ,\theta )\). According to the Bahadur theory (Bahadur 1971; Nikitin 1995) the exact slopes may be found by using the following Proposition.
Proposition 1
Suppose that the following two conditions hold:
-
a)
\(T_n \ \mathop {\longrightarrow }\limits ^{{P_\theta }} \ b(\theta ),\qquad \theta > 0\), where \(-\infty < b(\theta ) < \infty \), and \(\mathop {\longrightarrow }\limits ^{{P_\theta }}\) denotes convergence in probability under \(G(\cdot \ ; \theta )\).
-
b)
\(\mathop {\lim }\limits _{n\rightarrow \infty } n^{-1} \ \ln \ P_{H_0} \left( T_n \ge t \ \right) \ = \ - h(t)\)
for any t in an open interval I, on which h is continuous and \(\{b(\theta ), \, \theta > 0\}\subset I\). Then
We have already found the large deviation asymptotics. In order to evaluate the exact slope, it remains to verify condition a) of the Proposition, i.e., to find the limit in probability \(b(\theta )\).
Note that the exact slopes for any \(\theta \) satisfy the inequality (Bahadur 1971; Nikitin 1995)
where \(K(\theta )\) is the Kullback-Leibler “distance” between the alternative and the null-hypothesis \(H_0\). In our case \(H_0\) is composite, hence for any alternative density \(g_j(x,\theta )\) one has
This quantity can be easily calculated as \(\theta \rightarrow 0\) for particular alternatives. According to (6), the local BE of the sequence of statistics \({T_n}\) is defined as
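In the standard notation of Bahadur theory (Bahadur 1971; Nikitin 1995), and consistently with the numerical values obtained below, the exact slope from the Proposition and the local efficiency can be written as:

```latex
% Exact slope via Proposition 1 and the local Bahadur efficiency
c_T(\theta) \;=\; 2\,h\bigl(b(\theta)\bigr), \qquad
e^{B}(T) \;=\; \lim_{\theta \to 0^{+}} \frac{c_T(\theta)}{2K(\theta)} .
```

For instance, for the second Ley–Paindaveine alternative treated below, \(1.363/(2\cdot 0.753)\approx 0.905\).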
2.3 The local Bahadur efficiency of \(I_n^{(3)}\)
According to the Bahadur theory, the alternatives considered should be close to the null hypothesis as \(\theta \rightarrow 0\). We suggest three such alternatives to the Pareto distribution. The first two are obtained by the skewing mechanism of Ley and Paindaveine (2008), and we call them Ley–Paindaveine alternatives.
-
i)
First Ley–Paindaveine alternative with the d.f.
$$\begin{aligned} G_1(x,\theta )=F(x)e^{-\theta (1-F(x))},\quad \theta \ge 0, x \ge 1; \end{aligned}$$ -
ii)
Second Ley–Paindaveine alternative with the d.f.
$$\begin{aligned} G_2(x,\theta )=F(x)-\theta \sin {\pi F(x)}, \quad \theta \in [0,\pi ^{-1}], x\ge 1; \end{aligned}$$ -
iii)
log-Weibull alternative with the d.f.
$$\begin{aligned} G_3(x,\theta )=1-e^{-(\ln {x})^{\theta +1}},\quad \theta \in (0,1), x\ge 1. \end{aligned}$$
Let us find the local BE for the alternatives under consideration.
According to the Law of Large Numbers for U-statistics (Korolyuk and Borovskikh 1994), the limit in probability under \(H_1\) is equal to
It is easy to show (Jovanovic et al. 2014) that
where \(h_1(s)=\frac{\partial }{\partial \theta }g_1(s,\theta )\mid _{\theta =0}\) and \(\psi _3(s)\) is the projection from (5). Therefore for the first Ley–Paindaveine alternative we have
and the local exact slope of the sequence \(I_n^{(3)}\) as \(\theta \rightarrow 0\) admits the representation
The Kullback-Leibler “distance” \(K_1(\theta )\) between the alternative and the null-hypothesis \(H_0\) admits the following asymptotics (Jovanovic et al. 2014):
Therefore in our case
Consequently, the local efficiency of the test is
Omitting the calculations similar to previous cases, we get for the second Ley–Paindaveine alternative \(b_2(\theta )\sim 0.353\, \theta \), \(c_2(\theta )\sim 1.363\,\theta ^2\), \(\theta \rightarrow 0\). It is easy to show that \(K_2(\theta ) \sim 0.753\, \theta ^2\), \(\theta \rightarrow 0\). Therefore the local BE is equal to 0.905.
After some calculations in case of the log-Weibull alternative we have:
and the local exact slope of the sequence \(I_n^{(3)}\) as \(\theta \rightarrow 0\) admits the representation \(c_3(\theta ) \sim 1.295\, \theta ^2\). Moreover for the log-Weibull distribution \(K_3(\theta )\) satisfies \(K_3(\theta ) \sim \frac{\theta ^2}{12}\), \(\theta \rightarrow 0\). Hence the local BE for the last case is equal to 0.787.
Table 1 gathers values of the local BE.
2.4 Integral statistic \(I_{n}^{(4)}\)
For the case \(k=4\) the projection \(\psi _k(s)\) has the form:
The variance of this projection under \(H_{0}\) is equal to
Therefore the kernel \({\varPsi }_4\) is centered, non-degenerate and bounded. Due to Hoeffding’s theorem on asymptotic normality of U-statistics (Hoeffding 1948; Korolyuk and Borovskikh 1994), we have that:
Theorem 4
Under the null hypothesis as \(n \rightarrow \infty \) the statistic \(\sqrt{n}I_{n}^{(4)}\) is asymptotically normal so that
The large deviation asymptotics of the sequence of statistics \(I_{n}^{(4)}\) under \(H_0\) follows from the next result, which is derived by applying the theorem on large deviations (see again Nikitin and Ponikarov 1999; DasGupta 2008; Nikitin 2010) to the centered, bounded and non-degenerate kernel \({\varPsi }_4\).
Theorem 5
For \(a>0\)
where the function \(f_I^{(4)}\) is continuous for sufficiently small \(a>0\), and
2.5 The local Bahadur efficiency of \(I_n^{(4)}\)
For this case the limit in probability under \(H_1\) has the following asymptotics
where again \(h_1(s)=\frac{\partial }{\partial \theta }g_1(s,\theta )\mid _{\theta =0}\) and \(\psi _4(s)\) is the projection from (8). Therefore for the first Ley–Paindaveine alternative we have
and the local exact slope of the sequence \(I_n^{(4)}\) as \(\theta \rightarrow 0\) admits the representation
The Kullback-Leibler “distance” for this alternative was already found above, and it satisfies \(K_1(\theta ) \sim \theta ^2/24, \,\theta \rightarrow 0\). Thus the local efficiency of the test is
For the other alternatives the calculations are similar. Omitting the details, we gather the values of the local BE for this case in Table 2.
In Table 3 we present the efficiencies from Tables 1 and 2 together with the maximal values of the efficiencies against the presumed alternatives.
3 Kolmogorov-type statistic \(D_n^{(k)}\)
Now we consider the Kolmogorov type statistic (3). For fixed t the difference \(H_n(t) - F_n(t)\) is a family of U-statistics with kernels depending on \(t\ge 1\):
The projection of these kernels, \(\xi _k(s;t)\), for fixed \(t \ge 1\) has the form:
It remains to calculate the first term. For this purpose, as in the previous cases, we write the decomposition
where \(\mathbb {P}_i, i=1,2,3\), are the initial probabilities, computed in one of the following cases:
-
(1)
Let the sample units be ordered as \(X_1 < X_2< \ldots < X_{k-1} <s\). Then the probability can be expressed as
$$\begin{aligned} \mathbb {P}_1= & {} (k-1)! \, \mathbb {P}\left( \frac{s}{X_{k-1}} < t, X_1 < X_2< \ldots < X_{k-1} < s\right) \\= & {} (k-1)! \, \mathbf 1 (s\ge t) \mathbb {P}\left( \frac{s}{t}<X_{k-1}< s, X_1 < X_2< \ldots < X_{k-1}\right) \\&+(k-1)! \, \mathbf 1 (s < t) \mathbb {P}(X_1 < X_2< \ldots < X_{k-1} < s)\\= & {} \mathbf 1 ( s\ge t) \left( F^{k-1}(s)-F^{k-1}\left( \frac{s}{t}\right) \right) . \end{aligned}$$ -
(2)
Suppose the sample units are ordered as \(X_1 < X_2<\ldots< X_{k-2}< s < X_{k-1}\); then we have:
$$\begin{aligned} \mathbb {P}_2= & {} (k-1)! \, \mathbb {P}\left( \frac{X_{k-1}}{s} < t, X_1 < X_2<\ldots X_{k-2}< s < X_{k-1}\right) \\= & {} (k-1)! \, \mathbb {P}(s < X_{k-1} < st, X_1 < X_2<\ldots X_{k-2}< s)\\= & {} (k-1)! \frac{F^{k-2}(s)}{(k-2)!}(F(st)-F(s)) =\frac{(k-1)}{s}\left( 1-\frac{1}{s}\right) ^{k-2}\left( 1-\frac{1}{t}\right) . \end{aligned}$$ -
(3)
In the last case let s be situated in the \(l\)-th position \((1 \le l \le {k-2})\) of the variational series of the sample \(X_1, \ldots , X_{k-2}\). Then the required probability transforms into:
$$\begin{aligned} \mathbb {P}_3= & {} (k-1)! \, \mathbb {P}\left( \frac{X_{k-1}}{X_{k-2}} < t, X_1 < \ldots<s< \ldots< X_{k-2} < X_{k-1}\right) \\= & {} \left( 1-\frac{1}{t}\right) C_{k-1}^{l-1}(1-F(s))^{k-l} F^{l-1}(s), \, 1 \le l \le {k-2} . \end{aligned}$$
Combining these results we get that the first term in the projection is equal to:
Again we can see that the last sum can be simplified as
Thus the initial probability is equal to
Hence we get the final expression for the projection of the family of kernels \({\varXi }_k(\cdot ,t)\):
It is easy to show that \(E(\xi _k (X; t))=0\). After some calculations we get that, for any t, the variance of this projection under \(H_{0}\) is
3.1 Kolmogorov-type statistic \(D_n^{(3)}\)
In the case \(k=3\) the projection of the family of kernels \({\varXi }_3 (X,Y, Z;t)\), namely \(\xi _3 (s;t):=E({\varXi }_3 (X, Y, Z; t)\mid X=s)\) is equal to:
Now we calculate variances of these projections \(\delta _3^2(t)\) under \(H_{0}\). Elementary calculations show that
Hence our family of kernels \({\varXi }_3 (X,Y,Z;t)\) is non-degenerate in the sense of Nikitin (2010) and
This value will be important in the sequel when calculating the large deviation asymptotics (Figs. 1, 2, 3).
The limiting distribution of the statistic \(D_n^{(3)}\) is unknown. Using methods of Silverman (1983), one can show that the U-empirical process
weakly converges in \(D(1,\infty )\) as \(n \rightarrow \infty \) to a certain centered Gaussian process \(\eta (t)\) with calculable covariance. Then the sequence of statistics \(\sqrt{n} D_n^{(3)}\) converges in distribution to the rv \(\sup _{t\ge 1} |\eta (t)|\), but it is currently impossible to find its distribution explicitly. Hence it is reasonable to determine the critical values of the statistic \(D_n^{(3)}\) by simulation.
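The simulation of critical values can be sketched as follows. The helper names and the small replication count are ours; the supremum over \(t\ge 1\) is approximated on the grid of jump points of \(H_n\) and \(F_n\), and the paper itself uses 10,000 replications.

```python
# Monte-Carlo sketch of critical values for D_n^(3) = sup_t |H_n(t) - F_n(t)|
# under the standard Pareto null, evaluated on the jump points of H_n, F_n.
from itertools import combinations
from math import comb
import bisect
import random

def top_ratio(xs):
    # X_(k,k) / X_(k-1,k) for an already sorted subsample xs.
    return xs[-1] / xs[-2]

def D_n(sample, k=3):
    n = len(sample)
    ratios = sorted(top_ratio(sorted(sub)) for sub in combinations(sample, k))
    svals = sorted(sample)
    nsub = comb(n, k)
    return max(
        abs(bisect.bisect_left(ratios, t) / nsub
            - bisect.bisect_left(svals, t) / n)
        for t in ratios + svals
    )

random.seed(4)
n, reps = 20, 200      # small illustrative run; the paper uses 10,000
stats = sorted(
    D_n([1.0 / (1.0 - random.random()) for _ in range(n)]) for _ in range(reps)
)
crit_05 = stats[int(0.95 * reps)]   # approximate upper 5% critical value
print(crit_05)
```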
Now we obtain the logarithmic large deviation asymptotics of the sequence of statistics \(D_n^{(3)}\) under \(H_0\). The family of kernels \(\{{\varXi }_3(X, Y, Z; t), t\ge 1\}\) is not only centered but also bounded. Using results from Nikitin (2010) on large deviations of the supremum of non-degenerate U-statistics, we obtain the following result.
Theorem 6
For \(a>0\)
where the function \(f_D^{(3)}\) is continuous for sufficiently small \(a>0\); moreover
3.2 The local Bahadur efficiency of \(D_n^{(3)}\)
To evaluate the efficiency, first consider again the first Ley–Paindaveine alternative with the d.f. \(G_1(x,\theta ),\theta \ge 0, x \ge 1\) given above. By the Glivenko-Cantelli theorem for U-statistics (Janssen 1988) the limit in probability under the alternative for statistics \(D_n^{(3)}\) is equal to
It is not difficult to show that
where again \(h_1(s)=\frac{\partial }{\partial \theta }g_1(s,\theta )\mid _{\theta =0}\) and \(\xi _3(s;t)\) is the projection defined above in (10). Hence for the first Ley–Paindaveine alternative we have for \(t\ge 1\):
Thus \(b_1(\theta )=\sup _{t\ge 1}|b_1(t,\theta )| \sim 0.125\,\theta \), and it follows that the local exact slope of the sequence of statistics \(D_n^{(3)}\) admits the representation:
The Kullback-Leibler information in this case is given by (7). Hence the local Bahadur efficiency of our test is \(e^B_1(D)= 0.599\).
Next we take the second Ley–Paindaveine distribution, where calculations are similar, and the local BE is equal to 0.689. In the case of the log-Weibull density we find that the local BE is 0.467.
We collect the values of the local BE in Table 4.
3.3 Kolmogorov-type statistic \(D_n^{(4)}\)
In the case \(k=4\) the projection of the family of kernels \({\varXi }_4 (X,Y, Z, W;t)\) is equal to:
Therefore the variances of these projections, \(\delta _4^2(t)\), under \(H_{0}\) are
Hence our family of kernels \({\varXi }_4 (X,Y,Z,W;t)\) is non-degenerate in the sense of Nikitin (2010) and
The limiting distribution of the statistic \(D_n^{(4)}\) is unknown as in the previous section.
The logarithmic large deviation asymptotics of the sequence of statistics \(D_n^{(4)}\) under \(H_0\) is given in the next theorem.
Theorem 7
For \(a>0\)
where the function \(f_D^{(4)}\) is continuous for sufficiently small \(a>0\); moreover
3.4 The local Bahadur efficiency of \(D_n^{(4)}\)
In Table 5 we collect the calculated efficiencies for the statistic \(D_n^{(4)}\) together with the results from Table 4 and the maximal values of the efficiencies against our alternatives.
We observe that the efficiencies of the Kolmogorov-type test are lower than those of the integral test. However, this is the usual situation in goodness-of-fit testing (Nikitin 1995; Rank 1999; Nikitin 2010).
3.5 Critical values
Tables 6 and 7 show the critical values of the null distribution of \(D_n^{(3)}\) and \(D_n ^{(4)}\) for the significance levels \(\alpha = 0.1, 0.05, 0.01\) and specific sample sizes n. Each entry is obtained by Monte-Carlo simulation with 10,000 replications.
4 Power comparison
We recall computation formulae for statistics \(I_n^{(k)}\) and \(D_n^{(k)}\) for \(k=\{3, 4\}\):
where \(\pi (i_1, \ldots , i_{k+1})\) means all permutations of different indices from \(\{i_1, \ldots ,i_{k+1}\}\).
This section presents the results of a Monte-Carlo study comparing the powers of the new tests with those of the Kolmogorov-Smirnov (KS) and Cramer-von Mises (CvM) tests widely applied for these types of hypotheses. The comparison is done for the sample size \(n=20\) and the significance level \(\alpha =0.05\). All calculations were done using JAVA (The Apache Commons Mathematics Library) and the R package with 10,000 replications. We consider the following distributions as alternatives to the Pareto distribution:
-
1)
the Gamma alternative \({\varGamma }(\theta )\) with the density \(({\varGamma }(\theta ))^{-1}x^{\theta -1}\exp (-x)\);
-
2)
the log-normal law \(LN(\theta )\) with the density \((\theta x)^{-1}(2\pi )^{-1/2}\exp (-(\log {x})^2/2\theta ^2)\);
-
3)
the first Ley–Paindaveine alternative \(LP1(\theta )\) with the d.f. \((1-\frac{1}{x})\exp (-\theta /x),\theta \ge 0, x \ge 1\);
-
4)
log-Weibull alternative with the d.f. \(1-e^{-(\ln {x})^{\theta +1}},\theta \in (0,1), x\ge 1\);
-
5)
the Weibull distribution \(W(\theta )\) with the density \(\theta x^{\theta -1}\exp (-x^{\theta })\).
The KS and CvM tests are not directly applicable to the composite hypothesis, so first we estimate the parameter \(\lambda \) by its maximum likelihood estimator (MLE) \(\hat{\lambda }=n(\sum _{k=1}^n \ln {X_k})^{-1}\), and then calculate the critical values of the corrected tests and their powers for the sample size \(n=20\) using the Monte-Carlo procedure. The powers are given in Table 8.
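The Monte-Carlo recipe just described can be sketched for the corrected KS test; the replication count and the alternative parameter below are ours, chosen for a quick illustration.

```python
# Sketch of the power computation for the KS test with estimated lambda:
# critical value from null simulations, then power against an alternative.
import math
import random

def ks_pareto(sample):
    # KS distance between the empirical d.f. and the fitted Pareto d.f.
    # 1 - x^(-lambda_hat), with lambda_hat the MLE n / sum(ln X_i).
    n = len(sample)
    lam = n / sum(math.log(x) for x in sample)
    xs = sorted(sample)
    d = 0.0
    for i, x in enumerate(xs):
        f = 1.0 - x ** (-lam)
        d = max(d, abs(f - i / n), abs(f - (i + 1) / n))
    return d

random.seed(5)
n, reps = 20, 500

# Critical value under H_0: standard Pareto samples, lambda re-estimated
# for each replication (the "corrected" test).
null = sorted(ks_pareto([1.0 / (1.0 - random.random()) for _ in range(n)])
              for _ in range(reps))
crit = null[int(0.95 * reps)]

# Empirical power against the log-Weibull alternative with theta = 0.5,
# simulated by inverse transform: X = exp((-ln(1 - U))^(1/(theta + 1))).
theta = 0.5
power = sum(
    ks_pareto([math.exp((-math.log(1.0 - random.random()))
                        ** (1.0 / (theta + 1))) for _ in range(n)]) > crit
    for _ in range(reps)
) / reps
print(crit, power)
```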
We observe that the powers of our tests correspond to the local Bahadur efficiencies for the alternatives considered. On the whole, our new statistics, in comparison with the classical tests, are more favorable to alternatives with density shapes similar to the Pareto distribution, such as the first Ley–Paindaveine alternative. However, they are less sensitive to close alternatives with different shapes (for example, when the density has some twists distinct from the Pareto density), such as the gamma and log-Weibull alternatives.
5 Application to the real data
In this section we apply our tests to a real data example from Hogg and Klugman (1984), where the data are discussed in detail. The data set represents the losses due to wind-related catastrophes in 1977, rounded to the nearest million dollars; only losses of more than $2 million are included:
These data have been widely analyzed in the literature, see Brazausskas and Serfling (2003) for details; new goodness-of-fit tests were proposed in Rizzo (2009). First we apply the same de-grouping method as in Brazausskas and Serfling (2003) and Rizzo (2009). This method is needed because the initial data were rounded; it turns the grouped discrete observations into approximately uniformly distributed data. Put
where (A, B) is a grouping interval containing m observations. Following Brazausskas and Serfling (2003), for example, for the observations recorded as 2 we take (A, B) to be the interval (1.5, 2.5).
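The de-grouping step can be sketched as follows. The exact formula is the one displayed above; the equally spaced variant below is one common deterministic choice and is our assumption, as are the function name and parameters.

```python
# Sketch of de-grouping: spread the m rounded observations of a grouping
# interval (A, B) over that interval at equally spaced points.
def degroup(value, m, half_width=0.5):
    a, b = value - half_width, value + half_width   # e.g. 2 -> (1.5, 2.5)
    return [a + (b - a) * (2 * j - 1) / (2 * m) for j in range(1, m + 1)]

ys = degroup(2, m=5)
print(ys)  # five equally spaced values inside (1.5, 2.5)
```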
We tested the null hypothesis \(H_0: X\) has the Pareto distribution with the scale parameter \(\sigma = 1.5\) and the MLE of the tail parameter \(\lambda = 0.764\). These parameter values were considered in Brazausskas and Serfling (2003), Hogg and Klugman (1984) and Rizzo (2009); Philbrick and Jurschak (1981) applied \(\sigma = 2.0\).
Applying our tests to these data, we get in the Table 9 the following p-values of test statistics \(I_n^{(k)}\) and \(D_n^{(k)}\), based on 10,000 simulations.
Thus we conclude that our tests do not reject the null hypothesis. This corresponds to the results of Brazausskas and Serfling (2003) and Rizzo (2009).
6 Conditions of the local asymptotic optimality
In this section we are interested in conditions of local asymptotic optimality (LAO) in the Bahadur sense for both sequences of statistics \(I_n^{(k)}\) and \(D_n^{(k)}\). This means describing the local structure of the alternatives for which the given statistic has maximal potential local efficiency, so that the relation
holds (Nikitin 1995; Nikitin and Tchirina 1996). Such alternatives form the domain of LAO for the given sequence of statistics.
Consider functions
We will assume that the following regularity conditions are true (see also Nikitin and Tchirina 1996):
Denote by \(\mathscr {G}\) the class of densities \(g(x,\theta )\) with d.f.’s \(G(x,\theta )\), satisfying the regularity conditions (11)–(12). We are going to deduce the LAO conditions in terms of the function h(x).
Recall that for alternative densities from \({\mathscr {G}}\) the following asymptotics is valid:
6.1 LAO conditions for \(I_n^{(k)}\)
First consider the integral statistic \(I_n^{(k)}\) with the kernel \({\varPsi }_k(X_1, \ldots , X_{k+1})\) and its projection \(\psi _k(x)\) from (4). Let us introduce the auxiliary function
Simple calculations show that
Hence the local asymptotic efficiency takes the form
By the Cauchy-Schwarz inequality, the expression on the right-hand side is equal to 1 iff \(h_0(x)=C_1\psi _k(x)\frac{1}{x^2}\) for some constant \(C_1>0\), so that
The set of distributions for which the function h(x) has such a form generates the domain of LAO in the class \(\mathscr {G}\). The simplest examples of such alternative densities \(g(x,\theta )\) for small \(\theta > 0\) are given in Table 10.
6.2 LAO conditions for \(D_n^{(k)}\)
Now consider the Kolmogorov type statistic \(D_n^{(k)}\) with the family of kernels \({\varXi }_k\) and their projections \(\xi _k(x;t)\) from (9). After simple calculations we get
Hence the local efficiency takes the form
We can once again apply the Cauchy-Schwarz inequality to the numerator of the last ratio. It follows that the sequence of statistics \(D_n^{(k)}\) is locally asymptotically optimal, and \(e^B (D_n^{(k)})=1\), iff
for some constants \(C_3>0\) and \(C_4\).
Distributions with such h(x) form the domain of LAO in the class \({\mathscr {G}}\). The simplest examples are given in Table 11.
7 Conclusion
We constructed two new goodness-of-fit tests for the Pareto distribution, based on a new characterization of this distribution. We described their limiting distributions and large deviation asymptotics. The local Bahadur efficiency for some alternatives has been obtained and turned out to be reasonably high. We also derived the conditions of local optimality for our tests. The tests were compared with some commonly used goodness-of-fit tests, and in most cases our tests are more powerful. They can be of use in statistical research, especially when the alternative is close to an alternative from the LAO class.
References
Ahsanullah M (1977) A characteristic property of the exponential distribution. Ann Stat 5(3):580–582
Ahsanullah M (1989) On characterizations of the uniform distribution based on functions of order statistics. Aligarh J Stat 9:1–6
Arnold BC (1983) Pareto distributions. International Co-operative Publishing House, Fairland, MD
Bahadur RR (1971) Some limit theorems in statistics. SIAM, Philadelphia
Beirlant J, de Wet T, Goegebeur Y (2006) A goodness-of-fit statistic for Pareto-type behaviour. J Comput Appl Math 186:99–116
Brazauskas V, Serfling R (2003) Favorable estimators for fitting Pareto models: a study using goodness-of-fit measures with actual data. ASTIN Bull 33(2):365–381
DasGupta A (2008) Asymptotic theory of statistics and probability. Springer, New York
Gulati S, Shapiro S (2008) Goodness of fit tests for the Pareto distribution. In: Vonta F, Nikulin M, Limnios N, Huber C (eds) Statistical models and methods for biomedical and technical systems. Birkhäuser, Boston, pp 263–277
Helmers R, Janssen P, Serfling R (1988) Glivenko–Cantelli properties of some generalized empirical DF’s and strong convergence of generalized L-statistics. Prob Theory Relat Fields 79:75–93
Hoeffding W (1948) A class of statistics with asymptotically normal distribution. Ann Math Stat 19:293–325
Hogg RV, Klugman SA (1984) Loss distributions. Wiley, New York
Janssen PL (1988) Generalized empirical distribution functions with statistical applications. Limburgs Universitair Centrum, Diepenbeek
Jovanovic M, Milosevic B, Obradovic M (2014) Goodness of fit tests for Pareto distribution based on a characterization and their asymptotics. arXiv:1310.5510. Accepted for publication in Statistics doi:10.1080/02331888.2014.919297
Kleiber C, Kotz S (2003) Statistical size distributions in economics and actuarial sciences. Wiley, Hoboken, NJ
Korolyuk VS, Borovskikh YV (1994) Theory of \(U\)-statistics. Kluwer, Dordrecht
Ley C, Paindaveine D (2008) Le Cam optimal tests for symmetry against Ferreira and Steel’s general skewed distribution. J Nonparametr Stat 21:943–967
Litvinova VV (2004) Asymptotic properties of goodness-of-fit and symmetry tests based on characterizations. Dissertation, Saint-Petersburg University
Martynov GV (2009) Cramér-von Mises test for the Weibull and Pareto distributions. In: Proceedings of the Dobrushin International Conference, Moscow, pp 117–122
Nikitin Y (1995) Asymptotic efficiency of nonparametric tests. Cambridge University Press, New York
Nikitin YY (1996) Bahadur efficiency of a test of exponentiality based on a loss of memory type functional equation. J Nonparametr Stat 6(1):13–26
Nikitin YY (2010) Large deviations of \(U\)-empirical Kolmogorov–Smirnov tests, and their efficiency. J Nonparametr Stat 22:649–668
Nikitin YY, Peaucelle I (2004) Efficiency and local optimality of distribution-free tests based on \(U\)- and \(V\)- statistics. Metron LXII:185–200
Nikitin YY, Ponikarov EV (1999) Rough large deviation asymptotics of Chernoff type for von Mises functionals and \(U\)-statistics. In: Proceedings of the St. Petersburg Math Soc, vol 7, pp 124–167. Engl transl (2001) AMS Transl, vol 2(203), pp 107–146
Nikitin YY, Tchirina AV (1996) Bahadur efficiency and local optimality of a test for the exponential distribution based on the Gini statistic. Stat Methods Appl 5:163–175
Nikitin YY, Volkova KY (2010) Asymptotic efficiency of exponentiality tests based on order statistics characterization. Georgian Math J 17:749–763
Nikitin YY, Volkova KY (2012) Asymptotic efficiency of goodness-of-fit test for power distribution based on Puri–Rubin characterization. Zapis Nauchnykh Semin POMI 408:115–130
Philbrick SW, Jurschak J (1981) Discussion of “Estimating casualty insurance loss amount distributions”. In: Proceedings of the Casualty Actuarial Society, LXVIII
Rank RF (1999) Statistische Anpassungstests und Wahrscheinlichkeiten grosser Abweichungen. Dissertation, University of Hannover
Rizzo ML (2009) New goodness-of-fit tests for Pareto distribution. ASTIN Bull 39(2):691–715
Silverman BW (1983) Convergence of a class of empirical distribution functions of dependent random variables. Ann Probab 11:745–751
Wieand HS (1976) A condition under which the Pitman and Bahadur approaches to efficiency coincide. Ann Stat 4:1003–1011
Acknowledgments
The authors express their deep gratitude to the Referees and the Associate Editor for their useful suggestions for the improvement of the paper.
Research supported by Grant RFBR No. 13-01-00172, Grant NSh No. 2504.2014.1 and by SPbGU Grant No. 6.38.672.2013.
Volkova, K. Goodness-of-fit tests for the Pareto distribution based on its characterization. Stat Methods Appl 25, 351–373 (2016). https://doi.org/10.1007/s10260-015-0330-y