Abstract
The beta distribution is used in different models of environmental research. The power of the test for beta distribution of Raschke [Biased transformation and its application in goodness-of-fit tests for the beta and gamma distribution. Commun. Statist. B–Computa. Simula. 38 (2009): 1870–1890] is researched here. The power of the Kolmogorov–Smirnov, Kuiper, Cramér-von Mises, Watson and Anderson–Darling tests are researched for different sample sizes, levels of significance and parameters of the beta distribution. The limitation to these tests is discussed including the differences between previous publications. The empirical behaviour is investigated by extensive Monte Carlo simulations. The most powerful test for the beta distribution is the Anderson–Darling test for the considered constellations of alternative distribution, contamination or scaling. The second best test is the Cramér-von Mises test, followed by the Watson test. The analysis of relative humidity data of meteorology and of runoff coefficients of the hydrology demonstrates the advantages of the new tests and the necessity to test an assumption of beta distribution.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
The beta distribution has the probability density function (PDF)
wherein B(p,q) is the beta function. This distribution is used in different fields of environmental research. Gottschalk and Weingarter (1998) used the beta distribution in the hydrology for runoff coefficients. Yao (1974) modelled the distribution of relative humidity in meteorology by a beta distribution. Flynn (2004) used the beta distribution in a model for human exposure to airborne contaminants. Nadarajah and Kotz (2007) described and disscused modifications of the beta distribution and their applications.
The problem in the statistical inference of the beta distribution was a lack of practical goodness-of-fit tests. Recently, it is possible to test the hypothesis of a beta distribution by the biased transformation (BT) of Raschke (2009) and a classical test for normality. But only the Anderson–Darling test has been researched for the beta distribution up to now. The goal of this paper is to research further EDF tests (EDF—empirical distribution function) and their power. The aim of this research might not be very interesting from the theoretical point of view but it is necessary from the practical one. The goodness-of-fit of models for environmental data have been the research object of previous studies, for example in Song and Singh (2010) and Crujeiras et al. (2010). The test procedure and the different tests for normality are explained in Sect. 2. The selection of the tests is also discussed in this section. The power of the test for the beta distribution is researched in Sect. 3. Empirical data are analysed in Sect. 4 to demonstrate the practical relevance. The results are concluded in Sect. 5. Large tables containing simulation results can be found in Appendix.
2 Test procedures
According to Raschke (2009) the basic procedure to test the hypothesis that a random variable X is beta distributed is the
-
Maximum likelihood estimation of all parameters in Eq. 1 for the sample,
-
Computation of a sample of Y with Y = F −1normal (F beta(X)),
-
Maximum likelihood estimation of all parameters for a normal distribution for Y with the PDF
$$ f(y) = 1/\left( {\sigma \sqrt {2\pi } } \right)\exp \left[ { - (y - \mu )^{2} /(2\sigma^{2} )} \right]_{{}}^{{}} ,\quad \sigma > 0\;{\text{and}} $$(2) -
Application of a Goodness-of-fit test for normal distributions.
The Hypothesis H 0 is that X is beta distributed. The Hypothesis H 0′ is that Y is normally distributed. H 0 is rejected if H 0′ is rejected with the same level of significance. The inverse F −1normal is the inverse function of a normal distribution with µ = 0 and σ = 1. The function F beta is the cumulative distribution function (CDF) of the beta distribution Eq. 1 with the estimated parameters. The condition that all observations are identical and independent distributed has been fulfilled.
There exist different tests for normality. del Barrio et al. (2000) describe the basics and the history of important tests. Landry and Leparge (1992) have researched the power of such tests. Only the EDF tests are considered here because they are more powerful than most of the other tests, according to the results of Landry and Leparge (1992). “More powerful” means that the average and the minimum share of rejections is higher when H 0 is really false. Some other tests like the Oja test according to Oja (1981, 1983) are also powerful. But there do not exist critical values for every sample size. The same—high power and missing critical values—is for the Shapiro–Wilk test (Shapiro and Wilk 1965) and its approximation (Shapiro and Francia 1972) too. Such tests are not considered here. The test based on the L2-Wasserstein distance (del Barrio et al. 1999; del Barrio et al. 2000) is not considered because of different reasons. Krauzci (2009) has developed a concrete test procedure (critical values) for this approach but the power is not researched and compared satisfiable with other tests. Results are published only for sample size n = 20 and classical EDF tests are applied with a mistake. The older critical values for the Anderson–Darling test (Stephens 1974) are used instead of the current values (Stephens 1986). This mistake was also made by Landry and Leparge (1992). Although the basic of the Anderson–Darling and the Kolmogorov–Smirnov test is the same in these studies there are considerable differences between the share of rejections computed by Landry and Leparge (1992, Table 5) and Krauzci (2009, Tables 2, 3). The differences are noted in the cases “Be(1,1)”-“Beta(1,1)”, “Be(2,2)”-“Beta(2,2)”, “We(2,1)”-“Weibull(2)”, “t10”-”Studenten(10)” and “C(0,1)”-”Cauchy”. Last but not least, the computational effort for the test based at the L2-Wasserstein distance seems to be considerable.
Here, the EDF tests according to Stephens (1986) are used only with values for the ordered sample of Y
and
The Kolmogorov–Smirnov test value is represented by D* and is defined by
The Kuiper test value is represented by V* and defined by
The Cramér-von Mises test value is represented by W 2* and defined by
The Watson test value is represented by U 2* and defined by
The Anderson–Darling test value is represented by A 2 * and defined by
The F(Y i ) is the CDF of the normal distribution with the maximum likelihood estimated parameters. The critical values for a significance level α = 5 and 10% are 0.895 and 0.819 for the Kolmogorov–Smirnov test, 1.489 and 1.386 for the Kuiper test, 0.126 and 0.104 for the Crámer-von Mises test, 0.117 and 0.096 for the Watson test and 0.752 and 0.631 for the Anderson–Darling test (Stephens 1986, Table 4.7, upper tail).
Beta distributed samples for X were simulated, parameters were estimated and the tests were applied for 10,000 samples in order to verify the tests. The share of rejections is listed in Table 2 for different sample sizes and parameter constellations in Eq. 1 and significance levels of α = 5 and 10%. The considered parameters are relative small, the largest value is p,q = 20. Small changes of them in range of p,q < 3 influence the shape of the beta distribution considerable. When p and q become large, the beta distribution becomes the shape of a normal distribution (s. Johnson et al. 1995). This is also the case when p and q become much larger. That is why the equal shares of rejections can be expected for the parameter p,q > 20 as for p,q = 20.
The used EDF tests work. The tests also work for α = 1%, but the results are not shown here.
3 Research of the power of the tests
3.1 Alternative distributions
The power of the EDF tests is researched in the first case with samples of a random variable X which are not beta distributed. The log-normal distribution is used with the PDF
This distribution for x ≥ 0 is truncated here at x = F −1 (0.99) of the non-truncated case and then scaled so that 0<x≤1. The gamma distribution is used in the same way. The gamma PDF is defined with
A normal distribution is truncated at x = F −1 (0.01) and x = F −1 (0.99) of the non-truncated case. Then the distribution is scaled and moved so that 0 ≤ x ≤ 1.
The triangle distribution is also used, with a density function
with 0 < c < 1. And the trapeze distribution is used with
with 0 < c < d < 1.
Samples of X were simulated, parameters were estimated for an assumed beta distribution and the tests were applied for 10,000 repetitions. The share of rejections is listed in Table 3 for different sample sizes and parameter constellations and significance levels of α = 5 and 10%. The Anderson–Darling test is the most powerful and has the highest average of shares of rejection. It is followed by the Crámer-von Mises and the Watson test.
3.2 Contaminated beta distribution
In this research beta distributed samples of X are generated. But 20% of the samples are from a different beta distributed variable with parameters p* and q*. The share of rejections for 10,000 repetitions is listed in Table 4 for different sample sizes and parameter constellations and significance levels of α = 5 and 10%. The Anderson–Darling test is the most powerful once again and has the highest average of shares of rejections. It is followed again by the Crámer von Mises and the Watson test.
3.3 Scaled beta distribution
In final research case, the beta distribution is scaled so that 0 ≤ x ≤ s but is assumed as non-scaled (0 ≤ x ≤ 1). The value s is the scale parameter. The share of rejections for 10,000 repetitions of samples is listed in Table 5 for different sample sizes and parameter constellations and significance levels of α = 5 and 10%. The Anderson–Darling test is again the most powerful and has the highest average of shares of rejections. It is followed again by the Crámer von Mises and the Watson test.
3.4 Discussion of results
The Anderson–Darling test is the most powerful test in almost all researched constellations with non beta distributed, contaminated or scaled samples of X (Tables 3, 4, 5). It has the highest share of rejections. The power of the different tests is shown in Fig. 1 as comparison of the minima and averages of the shares of rejection. According to this figure, it is obviously that the Anderson–Darling test is the most powerful. The Cramér-von Mises test is the second most powerful followed by the Watson test. The Kolmogorov–Smirnov test and the Kuiper test are the least powerful. The power of the test increases by increasing sample size as expected.
The power of the test is qualitatively the same for a significance level of α = 1%. These results are not listed here as they are not informative.
4 Practical application in environmental data
The procedure of EDF-tests for the beta distribution is applied now for empirical data. The Kolmogorov–Smirnov test for a fully specified distribution is applied also as done by Chia and Hutchinson (1991) and the classical χ2 test is applied also as done by Yao (1974) to demonstrate the differences from the new tests.
Some random variables are assumed to be beta distributed in the meteorology like the relative humidity of surface by Yao (1974), the daily cloud duration by Chia and Hutchinson (1991) and the sunshine duration by Sulaiman et al. (1999). That’s why an example of meteorological data is analysed here. The data of Haarweg Wageningen weather station of Wageningen University (Netherlands, 2009) of relative humidity (RH) of the air of May 2007 and 2008 has been selected (see Appendix) and analysed separately like Yao (1974) analysed data per month. The results of the estimation and the different tests are listed in Table 1.
The observations and the estimated CDF are shown in Fig. 2a. The hypothesis that the relative humidity RH is beta distributed is not rejected for the May 2007 with a significance level of ≥10% according to all new EDF-tests. The Kolmogorov–Smirnov test (full specified) accepts H 0 also for α ≥ 10%. In contrary to this the χ2 test accepts the hypothesis only for α ≥ 5%. The hypothesis is rejected for the data of May 2008 with a significance level of 1% for the new tests except the Kolmogorov–Smirnov test. This new test rejects the hypothesis with α ≥ 5% like the classical χ2 test. In contrary to these tests, the Kolmogorov–Smirnov test for the full specified distribution accepts the hypothesis at a level of α ≥ 10%.
The runoff coefficient c of catchments is modelled in the hydrology as a beta distributed random variable (Gottschalk and Weingarter 1998) that is why this variable is selected for a further empirical example. The runoff coefficient data has been provided by the authors Gottschalk and Weingartner (personnel communication, 2009) except the data for Emme-Eggiwill and Langeten-Huttwil (Gottschalk and Weingarter 1998, Table 1). Samples of four regions of Switzerland are analysed: Alpine, Pre-alpine, Midland and Southern alpine. The results of the ML estimation and the tests are listed in Table 2. The observations and the estimated CDF are shown in Fig. 2b. The runoff coefficient c can be assumed as beta distributed for the Alpine and Midland with a significance level α > 10% according to the new tests. In contrary to these results, the hypothesis of a beta distributed runoff coefficient is rejected for Pre-alpine and Southern alpine with a significance level of α ≥ 1–10% partly with α < 1%. The hypothesis H 0 would be accepted by the classical χ2 test for all regions and the Kolmogorov–Smirnov test for the full specified distribution with α ≥ 10% except the χ2 test for the Southern alpines which accepts H 0 only with α ≥ 5%.
The EDF test for the beta distribution works well in contrast to the classical χ2 test and the Kolmogorov–Smirnov test for the fully specified distribution (compare Table 2 with Table 6 in Raschke 2009).
5 Conclusion
The power of different EDF tests applied for the beta distribution by using the BT has been researched. The Anderson–Darling test has the highest power followed by the Cramer von Mises and the Watson test in the researched cases. They are recommended for application in this order. The classical χ2 test and the Kolmogorov–Smirnov test for the fully specified distribution should not be applied for the assumption of a beta distribution with estimated parameters. They do not work well. The test for normality based on L2-Wasserstein distance (del Barrio et al. 1999; del Barrio et al. 2000; Krauzci 2009) should be considered for the test for beta distribution in the future when the power of this test for normality is compared better with other tests.
References
Chia E, Hutchinson MF (1991) The beta distribution as a probability model for daily cloud duration. Agric For Meteorol 56:195–208
Crujeiras RM, Casal RF, González-Manteiga W (2010) Goodness-of-fit tests for the spatial spectral density. Stoch Env Res Risk Assess 24:67–79
del Barrio E, Cuesta-Albertos JA, Matrán C, Rodríguez-Rodríguez JM (1999) Tests of goodness of fit based on the L2-Wasserstein distance. Ann Stat 27:1230–1239
del Barrio E, Cuesta-Albertos JA, Matrán C (2000) Contributions of empirical and quantile processes to the asymptotic theory of goodness-of-fit tests [with discussion]. Test 9:1–96
Flynn MR (2004) The beta distribution—a physically consistent model for human exposure to airborne contaminants. Stoch Env Res Risk Assess 18:306–308
Gottschalk L, Weingarter R (1998) Distribution of peak flow derived from a distribution of rainfall volume and runoff coefficient, and a unit hydrograph. J Hydrol 208:148–162
Johnson NL, Kotz S, Balakrishnan N (1995) Continous univariate distributions–Vol. 2, 2nd edn. New York, Wiley
Krauzci E (2009) A study of the quantile correlation test for normality. Test 18:156–165
Landry L, Leparge Y (1992) Empirical behavior of some tests for normality, communications in statistics. Simul Comput 21:971–999
Nadarajah S, Kotz S (2007) Multitude of beta distributions with applications. Statistics 41:153–179
Oja H (1981) Two location and scale free goodness-of-fit tests. Biometrika 68:637–640
Oja H (1983) New tests for normality. Biometrika 70:297–299
Raschke M (2009) Biased transformation and its application in goodness-of-fit tests for the beta and gamma distribution. Commun Stat B Comput Simul 38:1870–1890
Shapiro SS, Francia RS (1972) An approximate analysis of variance test for normality. J Am Stat Assoc 67:215–216
Shapiro SS, Wilk MB (1965) An analysis of variance test for normality (complete samples). Biometrika 52:591–611
Song S, Singh VP (2010) Meta-elliptical copulas for drought frequency analysis of periodic hydrologic data. Stoch Env Res Risk Assess 24:425–444
Stephens MA (1974) EDF statistics for hoodness of fit and some comparison. J Am Stat Assoc 69:730–737
Stephens MA (1986) Tests based on EDF statistics. In: D’Augustino RB, Stephens MA (eds) Goodness-of-fit techniques. Statistics: textbooks and monographs vol 68. Marcel Dekker, New York, pp 97–194
Sulaiman MY, Hlaing Oo WM, Abd Wahab M, Zakaria A (1999) Application of beta distribution model to Malaysian sunshine data. Renew Energy 18:573–579
Wageningen University, Netherlands, Meteorology and air quality section (2009, download). Weather station Haarweg Wageningen. Data: http://www.met.wau.nl/(2009). Accessed 5 October 2009
Yao AYM (1974) A statistical model for the surface relative humidity. J Appl Meteorol 13:17–21
Author information
Authors and Affiliations
Corresponding author
Appendix
Appendix
Data of relative humidity of the Haarweg-Wageningen weather station
May 2007: 0.4, 0.44, 0.5, 0.55, 0.58, 0.62, 0.65, 0.69, 0.72, 0.72, 0.73, 0.75, 0.77, 0.8, 0.81, 0.81, 0.83, 0.83, 0.85, 0.85, 0.85, 0.85, 0.86, 0.86, 0.87, 0.87, 0.89, 0.92, 0.94, 0.94, 0.97
May 2008: 0.39, 0.4, 0.42, 0.43, 0.43, 0.43, 0.44, 0.46, 0.48, 0.49, 0.51, 0.52, 0.53, 0.54, 0.56, 0.59, 0.62, 0.64, 0.66, 0.73, 0.75, 0.76, 0.83, 0.85, 0.88, 0.91, 0.92, 0.92, 0.95, 0.97, 0.98
Rights and permissions
About this article
Cite this article
Raschke, M. Empirical behaviour of tests for the beta distribution and their application in environmental research. Stoch Environ Res Risk Assess 25, 79–89 (2011). https://doi.org/10.1007/s00477-010-0410-3
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00477-010-0410-3