Abstract
The performance of the Standardized Precipitation Index (SPI) is affected by the choice of an incorrect probability distribution function, which can skew the values of the index, exaggerating or minimizing the severity of drought. This study aims to test data fitability of ten statistical distribution functions (gamma, weibull, exponential, lognormal, gumbel, cauchy, logistic, chi-square, burr, pareto) for SPI computation at time scales (TSs) of 3, 6, 9, 12, 15, 18, 21 and 24 months, and to quantify the errors made if the gamma function were used by default as is the case in general. Monthly precipitation data collected at 24 meteorological stations distributed in the five Agro-Ecological Zones (AEZs) of Cameroon were used for the period 1951-2005. The parameters of the distribution functions were estimated with the Maximum Likelihood (ML) method, and the Kolmogorov-Smirnov (K-S) test was applied to assess how well these functions fit the data. The results show that the logistic and burr distributions provide for several stations of the five AEZs the best data fits. A comparative study between the SPIs from the appropriate distribution and the gamma functions shows a significant qualitative and quantitative difference in several stations and the root mean square error (RMSE) increases with the TS and with the severity of drought.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
Introduction
Drought, like other natural phenomena closely linked to climate change, is increasingly affecting the four corners of the globe (Bhaga et al. 2020). It is one of the costliest natural disasters in the world, affecting more people than other forms of disasters (Zarei et al. 2021). Indeed it is a natural hazard that begins slowly so that we often speak of a slowly evolving phenomenon (Sylla et al. 2016). As a result, its manifestations may take longer to make themselves felt (Han et al. 2019). The impacts of drought vary according to regions, needs, and disciplinary perspectives (Liu et al. 2012; Dai et al. 2020). They depend on the socio-economic environment in which it occurs, since each region has its own climatic characteristics (Maia et al. 2015; Gebremeskel Haile et al. 2019). According to climatologists and meteorologists, it is the state of an environment facing a significantly long and severe lack of water, less than normal with negative impacts on flora, fauna and societies (Quenum et al. 2019). Although occasional droughts have always been part of the earth’s natural phenomena, higher temperatures, greater water evaporation and less vegetation cover all contribute to exacerbating the phenomenon (Ojha et al. 2021). Since drought can be analyzed and interpreted from different angles and different perceptions (Liu et al. 2018), there is no single definition of drought accepted worldwide (Wilhite and Glantz 1985). In general, it is defined according to the situation experienced from one area to another (Qin et al. 2015). Depending on the manifestations observed, droughts are classified into four types: meteorological, hydrological, agricultural and socio-economic (Wang et al. 2016). The three first types refer to the deficits in precipitation, in soil moisture and in streamflow, respectively (Dai et al. 2020) while socio-economic drought refers to the insufficiency of water resources systems to meet the water demand (Zhao et al. 2019). The current study focuses on meteorological drought which, according to Wang et al. (2016) is the starting phase for other types of droughts.
Since each region has its own climatic characteristics, drought monitoring involves many different methods because the amount of precipitation, the seasonal cycle and the nature of the precipitation vary from region to region (Wilhite et al. 2007; Park et al. 2017; Bhaga et al. 2020). This complexity in accurately describing the phenomenon has prompted researchers to define drought indices ranging from the simplest to the most complex. These indices make it possible to characterize the drought by intensity, duration, spatial extent, probability of recurrence (Spinoni et al. 2014; Huang et al. 2017), and its detection at different stages of its evolution (location, time of appearance and end) (He et al. 2018; Santé et al. 2019; Zhang and Li 2020). There are several drought indicators, the choice of which depends on the type of impact to be taken into account as part of the mechanism for monitoring and understanding changes in the vulnerability of the phenomenon (Huang et al. 2019; Bae et al. 2019). Of these different indices, we can cite the Palmer Drought Index (PDSI: Palmer (1965)), the Standardized Precipitation and Evapotranspiration Index (SPEI: Vicente-Serrano et al. (2010)), the Standardized precipitation index (SPI: McKee et al. (1993)), the rainfall anomaly index (RAI: Van Rooy (1965)) and the Reconnaissance Drought Index (RDI: Tsakiris and Vangelis (2005)). The SPI is recommended by the World Meteorological Organization as a standard for characterizing meteorological droughts (Hayes et al. 2011) because of the particular advantages: it offers good flexibility of use for multiple TSs (Hayes et al. 1999; Gidey et al. 2018; Guenang et al. 2019), it applies to all climatic regimes and has good spatial consistency which allows comparisons between different areas subject to different climates (Hayes et al. 1999; Pieper et al. 2020), its probabilistic nature places it in a historical context which is well suited to decision-making (Gebremichael et al. 2022). Due to these exceptional advantages, the index have shown effectiveness in detecting various historical drought events in many regions of the world (Dogan et al. 2012; Ndayiragije et al. 2022). Motivated by all these strengths of the SPI and the fact that it only depends on the precipitation for which data were available, it was used as a drought indicator in this study.
SPI proponents have suggested using the gamma distribution to fit cumulative precipitation in the calculation of this index, but many studies show the limitations of this distribution (Stagge et al. 2015; Touma et al. 2015; Blain et al. 2018), and researchers have indicated that the applicability of theoretical distributions to describe cumulative precipitation was inconsistent between different regions and climates (Raziei 2021). So, some findings point at the gamma and weibull distributions to be the best suited for long periods (larger than 3 months) and for short periods (smaller than 3 months) respectively (Stagge et al. 2015). Other studies around the world and in particular in Africa, have shown some distribution functions more appropriate than the gamma function which is in most cases used by default (Angelidis et al. 2012; Guenang and Mkankam Kamga 2014; Okpara and Tarhule 2015; Pieper et al. 2020; Zhang and Li 2020). However, all these studies are limited to a reduced number of distribution functions and the quantification of errors made by using inappropriate distributions remains a real challenge.
The objective of this study is to find appropriate statistical rainfall distribution models for the computation of SPI and to quantify the errors made on the SPI values if inappropriate statistical models of precipitation distribution were used beforehand for its computation. Given that the study area has a high rainfall variability, it was necessary to go further by increasing the number of distribution functions to be tested in order to increase the probability of finding the most appropriate leading to a more accurate SPI. Subsequently, the error made by using the default gamma function as is most often the case was quantified. The next section (“Materials and methods”) shows materials and methods. “Results” shows the results and “Discussion and conclusion” provides the discussion and conclusion.
Materials and methods
Study area
Located in the heart of Central Africa, between 1.40\(^{\circ }\)N and 13\(^{\circ }\)N of latitude and 8.30\(^{\circ }\)E–16.10\(^{\circ }\)E of longitude, at the bottom of the Gulf of Guinea, Cameroon covers an area of \(475,650 km^{2}\), including \(466,050 km^{2}\) of continental area and \(9,600 km^{2}\) of maritime area with a 402 km long maritime facade. The country belongs to the junction area between equatorial Africa in the south and tropical Africa to the north (Net 2019). Cameroon is characterized by an extraordinarily contrasting relief where high and low lands alternate. It is traditionally divided into five AEZs which are defined on the basis of their ecological, edaphic and climatic characteristics (Nfornkah et al. 2021). Details on geographical and climatic characteristics of these AEZs are presented in Table 1.
Data used
Monthly precipitation data ranging from 1951 to 2005 were obtained from the database of the National Meteorological Service of Cameroon. They are from 24 measuring meteorological stations and were successfully used in many studies Penlap et al. (2004); Guenang and Mkankam Kamga (2014). The geographical positions of these stations and the topography of the domain are shown in Fig. 1. The stations on which the study were focused are distinguished by a different coloring from the others. The selection of representative stations by zone was made on the basis of the minimum missing values.
Computation of the SPI
The SPI is computed by fitting an appropriate probability density function to the frequency distribution of precipitation, summed over a considered TS (1, 3, 6, 9, 12, 15, 18, 21 and 24 months) and the adjusted distribution is transformed into a normal standardized distribution, so that the average SPI is equal to zero (Raziei 2021). The ML estimation method was used to find the optimal parameters of the distribution functions to be tested and the K-S test was then performed to choose the best fit distribution from the following ten functions: gamma, weibull, exponential, lognormal, gumbel, cauchy, logistic, \(chi-square\), burr and pareto. The lowest K-S statistics determines the best fit distribution. This was afterwards used in the data generation to calculate the cumulative distribution function (CDF) which were transformed into normalized random variables, and then into SPI. The same procedure is applied for each station and all TSs.
The period covered by the SPI varies according to the type of drought subject to the analysis and applications envisaged (Gebremichael et al. 2022). Thus, the interpretation of the SPI indicates the anomalies, which are deviations from the average of the total precipitation observed for any period. However, precipitation with high positive values corresponds to very wet periods (positive SPI) while high negative values correspond to periods of extreme drought (negative SPI). McKee et al. (1993), uses the SPI classification indicated in Table 2 to define different drought categories.
The ML method
The ML method makes it possible to estimate the parameters of a regression model, under the assumption that the true law of distribution of said parameters is known (Streit and Luginbuhl 1994). It consists, for a given sample, in maximizing the likelihood function (joint density function) with respect to the parameters. It seeks to find the parameter capable (with a high probability) of reproducing the true values of the sample (those actually observed), i.e. to find the most likely value of the parameter of a population starting from a given sample (Horváth 1993). Applied to a set of data, it provides values of the distribution parameter which maximize the likelihood function (Meng et al. 2014).
Either a random sample \( X_1, X_2, X_3,..,X_n \) from a distribution F(x; \( \theta _1,\theta _2,....\theta _p \)). When they exist, the estimators obtained by the ML method are the solutions \( \hat{\theta _1},\hat{\theta _2},....\hat{\theta _p}\) of the system of p equations:
with r=1,2,..,p; where the likelihood function is defined by:
It is often easier to maximize the logarithm of the likelihood function than the likelihood itself. Either method leads to the same maximum because the logarithmic function is a monotonically increasing function.
The ML method is considered to be a very efficient estimator because it generates results with a lower variance value. Moreover, in long series (n > 100) the results are even more satisfactory. It has the desirable properties of a good estimator; in fact it is correct (it tends in probability towards the true value \(\theta \)), asymptotically unbiased (the mathematical expectation of the estimator \(\hat{\theta } \) is equal to the true value of the parameter \(\theta \)) and asymptotically efficient (Horváth 1993).
Law of statistical distributions used to fit data
The gamma law
Several researches have been made on the gamma law, in particular (Choi and Wette 1969) deal in detail with the gamma law. The random variable X follows a gamma distribution if its probability density function (PDF) is:
To obtain the gamma cumulative function, we proceed as follows:
with
\(\alpha \) and \(\beta \) are obtained by the ML method as follows:
With n the number of observation years. We also note that for \(x=0\), this function is not defined, and its modified cumulative function is in the form:
With q the probability at each station of having zero precipitation over the entire considered period.
The weibull law
The PDF of a random positive variable X distributed according to the Weibull law (Panahi and Asadi 2011) is:
Where \(\alpha \) and \(\beta \) are respectively the shape and scale parameters which are obtained by the ML method seen above and which are presented in detail by Wu (2002). There is no closed-form expressions of the parameters \(\alpha \) and \(\beta \), and therefore they are estimated by maximizing the log-likelihood expression of the equation (Panahi and Asadi 2011). Its complementary cumulative distribution function is a stretched exponential function so its explicit form is given by:
The exponential law
A random variable X is distributed according to an exponential law if its PDF is given by:
with \( x\ge \mu \) and \(\beta >0 \), where \(\mu \) is the location parameter and \(\beta \) the scale parameter (Rahman and Pearson 2007). The scale parameter is often denoted \(\lambda =\frac{1}{\beta }\) and is called constant failure rate. The PDF of the exponential law can therefore be written:
Its distribution function is of the form:
The parameters \(\mu \) and \(\lambda \) are estimated from a random and independent sample. The ML estimator is determined by canceling the derivative of the logarithm of the likelihood function of the exponential law, which leads to:
with \(\bar{x}=\frac{1}{n}\sum _{i=1}^n x_i \)
The lognormal law
A positive random variable x follows a lognormal distribution if the logarithm of the random variable is normally distributed. The PDF of a lognormal distribution is defined as (Mage and Ott 1984):
with \(x>0\), \(\sigma >0\) and \( - \infty< \mu < +\infty \)
The term \(\mu \) is the scale parameter that stretches or shrinks the distribution, and \(\sigma ^2\) is the shape parameter that affects the shape of the distribution. They can be determined by the ML estimator method as follows:
The gumbel law
Also called doubly exponential law or law of extreme values, a random variable X is distributed according to a Gumbel law (Cooray 2010) if its PDF is given by :
with
The terms \(\mu \) and \(\beta \) are estimated using the ML method. Its cumulative distribution function is of the form:
Gumbel’s law represents the maximum and minimum of a number of samples of normally distributed data.
The cauchy law
A random variable X follows a Cauchy law or even a Lorentz law if its PDF depending on the two parameters \(\mu > 0\) and \(\beta > 0\) (Schuster 2012) is defined by:
with \(-\infty< x < +\infty \). The particularity of this law is that it has neither expectation nor variance. The term \(\mu \) is the position parameter and the term \(\beta \) the scale parameter, that is the spread parameter. Likewise the term \(\mu \) represents both the mode and the median. These two parameters are estimated by the ML method. Its cumulative distribution function is of the form:
The logistic law
A random variable X follows a logistic law if its PDF is given by Pérez-Sánchez and Senent-Aparicio (2018):
\(-\infty<x<+\infty \), with \(\alpha \) the shape parameter and \(\beta \) the scale parameter non-zero and positive. Its cumulative distribution function is given by:
The parameters \(\alpha \) and \(\beta \) are estimated by the ML method and were considered as starting value for the program \(\alpha = 0\) and \(\beta = 1\).
The chi-square law
It is a continuous distribution with k degrees of freedom, used to describe the distribution of a sum of squared random variables (Robertson 1969). Similarly, its importance also comes from its usefulness for independent data sets to test the goodness of fit of a data distribution (Canal 2005). A random variable X follows a \(chi-square\) distribution if its PDF is given by:
with \(x\ge 0\). Its cumulative distribution function is :
with \(\gamma {(\frac{k}{2},\frac{x}{2})}\) the lower incomplete gamma function
The burr law
The burr type XII distribution is a continuous and widely known distribution because it includes the characteristics of various well-known distributions such as for example the weibull and gamma distributions (Pérez-Sánchez and Senent-Aparicio 2018). A random variable X follows a burr or burr type XII distribution if its PDF is:
with
The estimation of these parameters with the ML method is the most common (Ghitany and Al-Awadhi 2002). Its cumulative distribution function is:
The pareto law
A random variable X follows a Pareto law if its PDF is defined by the relation (Pérez-Sánchez and Senent-Aparicio 2018):
with \(\beta \le x \le \infty \), \(\alpha > 0\) and \(\beta > 0\). The terms \(\alpha \) and \(\beta \) are respectively the shape and scale parameters, which are estimated by the ML method as follows:
Its cumulative distribution function is :
The K-S fit test
As mentioned by Stephens (1970), this test is inspired by the statistics proposed by Kolmogorov (1933) for fitting to a distribution. It determines to what extent the data Xi (i=1, ...n) follow a specific distribution law F(X). The K-S test is a nonparametric test that can be used to compare a sample with a reference probability distribution or to compare two samples (Mitchell 1971). The idea is to calculate the maximum difference, in absolute value, between the empirical cumulative distribution and the theoretical cumulative distribution under the null hypothesis for the running sum of the chosen TS. Under the H0 hypothesis, this difference is small and the distribution of observations fits well into a given distribution (Berger and Zhou 2014). For a specific data set and distribution, the better the law fits the data, the weaker the K-S test will be. So, for a law to be the best, its K-S test must be considerably weaker than the others. It quantifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution or between the empirical distribution functions of two samples. Therefore, the smaller the D statistic, the closer the theoretical distribution is to the empirical distribution (Massey Jr 1951; Ramachandran and Tsokos 2015). For a distribution function cumulative F(x) given, Stephens (1970) defined the statistic (K-S) by:
with \(- \infty< x < + \infty \), and by Glivenko-Cantelli theorem (DeHardt 1971),
where
Results
Determination of the adequate distribution functions
Figures 2, 3 and 4 show comparative results of the CDF for historical precipitation data and for each of the ten trial distribution functions. The results are shown for six target stations of the AEZs (Poli, Ngaounderé, Koundja, Bafia, Douala and Nkongsamba) and for 3, 12 and 24 months TS. The following abbreviations were adopted for the functions: gamma (g), weibull (w), exponential (e), lognormal (ln), gumbel (gu), cauchy (c), logistic (lo), \(chi-square\) (ch), burr (bu) and pareto (p). The K-S test was applied and the results are presented in Tables 3, 4 and 5 for 3, 12 and 24 months TSs respectively.
The results show that at 3-month TS (Table 3), the logistic distribution is the best fit in the four stations namely Poli, Ngaoundere, Bafia and Koundja. For the two other stations Douala and Nkongsamba, burr and weibull are the best fit respectively. At 12-month TS (Table 4), data from the stations of Poli, Nkongsamba and Douala fit better with the burr distribution; Koundja shows the gamma as the best fit, while data from Ngaoundere and Bafia are better suited to logistic distribution. At 24-month TS (Table 5), the burr distribution is the best choice at Poli, Nkongsamba and Bafia while Ngaoundere, Koundja and Douala show a preference to logistic, gamma and gumbel as the best fits respectively.
The results for all 24 stations and for eight TSs (3, 6, 9, 12, 15, 18, 21 and 24 months) are presented in Table 6. In general, the logistic and burr distributions are the most suitable for most stations except for 9-month TS where weibull followed by burr outperform the others. In Table 6, it is observed that for short (3-month) and long (> 6-month) TSs , the logistic and burr distributions are the most appropriates respectively. From the statistical point of view and at all TSs, the function burr is the most representative followed by logistic and then gamma. Table 7 summarizes the best fits for all AEZs. It is observed a few cases where functions that better fit the data are in equal numbers.
Analysis of computed SPIs with adequate and default gamma distributions
Time series of SPIs computed with adequate distributions
SPI time series were calculated using the best fit distribution at each station and results are shown in Figs. 5, 6 and 7. The SPIs on a 3-month TS (Fig. 5) show for each station, a high frequency of drought events ranging from mild to extreme categories. For 12-month SPI (Fig. 6), each station shows at least one extremely dry episode. Throughout the study period, the six stations differ markedly in the frequency of extreme drought periods (4 in Bafia and Poli, 3 in Nkongsamba, 2 in Ngaoundéré and Koundja, and none in Douala). For 24-month SPI (Fig. 7) and during the first 30 years, Ngaounderé, Douala and Nkongsamba stations only recorded very few drought but the following years recorded more frequent drought events. The dramatic drought events of the 1970s and 1980s are highlighted in each station and the magnitude and duration of the drought increased, especially from the mid-1970s.
Evaluating the shift in SPI values due to the use of gamma function instead of the appropriate functions
Figures 5, 6 and 7 show SPIs computed using both gamma and the best-fit distributions. At 3-month TS (Fig. 5), the values of SPIs obtained with the gamma distribution (\(SPI_g\)) are in general smaller than those obtained with the best distribution (\(SPI_{bd}\)), which means that gamma leads to an underestimation of extreme humidity and an overestimation of severe and extreme drought events. For the stations of the AEZ4 (Douala and Nkongsamba) in the Littoral area, the SPIs are less sensitive to the choice of the distribution function. Similar patterns are observed at 12-month TS (Fig. 6), but the differences between both SPIs (\(SPI_g\) and \(SPI_{bd}\)) are higher and more depicted. As for 24-month TS (Fig. 7), results are similar to those obtained at 3 and 12 months in the AEZs 1 and 2 (Poli and Ngaoundéré respectively) while the reverse situation is observed in the AEZ4 (Douala). However, in the AEZs 4 and 5 (Nkongsamba and Bafia respectively), the gamma distribution leads to an overestimation of extremely humid and drought intensity.
Figure 8 shows the root mean square error (RMSE) between the SPI values computed from the appropriate distribution function and the gamma function for 3, 6, 9, 12, 15, 18, 21 and 24 months TSs at each station. Results are shown for the four drought categories. In general, the RMSE increases with the severity of drought (from mild to extreme drought) for each TS and in any AEZ; likewise, it increases with TS for the AEZs 1, 2 and 5 (Poli, Ngaoundere and Bafia respectively), but for other areas (AEZs 3 and 4) no consistent increase with TS is observed and the RMSEs are the lowest, sometimes equal to zero because gamma is the most suitable and match the default function (gamma) or equal to low values due to the fact that the appropriate distribution function found has flexibility similar to the gamma function.
Discussion and conclusion
Discussion
In most studies on SPI, gamma is chosen by default as the best fit without any comparison with other distributions. It was shown that the SPI with the commonly used gamma distribution leads to shortcomings in evaluating ensemble simulations (Pieper et al. 2020). For West Africa, Okpara and Tarhule (2015) verified that the type two gamma distribution was a better model for adjusting precipitation over the Niger basin. So, it is clear that several functions can override the gamma distribution as the best fit in many stations and provide better SPI values. From this current study on a larger number of distribution functions, we found that new functions (burr and logistic) are able to better fit the data in some stations as compared to the findings of Guenang and Mkankam Kamga (2014) where only four functions (gamma, weibull, lognormal and exponential) were tested. Therefore, the choice of the appropriate distribution function depends on the geographical location of the station and the TS considered. This is confirmed by the results of the current study that show different distribution functions for different areas. The results also corroborate those of Cindric and Pasaric (2012), who suggested that it is not possible to recommend a single, optimal distribution because the ratio of skewness and the coefficient of variation of data precipitation could be the indicator for the choice of the most appropriate distribution for a particular region. Moreover, Angelidis et al. (2012) and Stagge et al. (2015) suggested that the suitable probability distribution is related with the TS of precipitation data to be fitted.
Stagge et al. (2015) compared the seven probability distributions and concluded that the gamma distribution produces the best fit for precipitation with long period (>6 months), while weibull is consistently the best for precipitation with short period (1 to 3 month). In the present study, the logistic distribution produces the best fit for precipitation with short accumulation (3 months TS), while for long periods (\(>6\) months TS) the burr distribution performs the best. Pieper et al. (2020) and Zhang and Li (2020) estimated that the appropriate probability distribution is related to the number of parameters of the PDF of the distribution to be fitted to rainfall data. They demonstrated that distributions with three parameters such as exponential weibull and \(log-logistic\) respectively, perform better than the correspondents two-parameter distributions, which is in agreement with the results obtained in this paper where the three-parameter burr distribution gave the best results in most cases. Some distribution functions such as \(chi-square\), cauchy, exponential and pareto generally show poorer fit in the study area. Guenang and Mkankam Kamga (2014) also found that the exponential distribution is the least suitable for the domain.
Considering the effects of different probability distributions on SPI characteristics in comparison with the default gamma distribution, it is observed that SPIs for mild and moderate droughts are less sensitive to the distribution functions used, than those corresponding to severe and extreme droughts. This statement agrees with that of Angelidis et al. (2012) who concludes that the consistency of the calculated SPI with different distributions is good for normal periods, while becoming poor for very dry or very wet periods. We also found that different probability distributions lead to a large difference in severe and extreme droughts. In fact, the SPI time series patterns obtained with gamma lead to an underestimation of extreme humidity and an overestimation of severe and extreme drought events as compare to that obtained with the best distribution. In a context of climate change, particularly due to global warming, strong fluctuations in average precipitation have a severe effect on the occurrence of drought. The current study also found that the magnitude and duration of drought increased with time for both short and long TSs. This may be the consequence of reduced precipitation resulting from climate change as suggested by Vicente-Serrano et al. (2010) and Tirivarombo et al. (2018) because temperature is also an important factor that can influence the availability of water as it controls the rates of evapotranspiration.
Conclusion
This study was undertaken to contribute to the improvement of mathematical tools for modeling drought which is a dangerous phenomenon and whose adaptation is difficult in developing countries such as Cameroon. The SPI used as drought indicator was studied in this paper by examining the relevance of using probability distribution functions different from those commonly used to fit and describe observed precipitation data, as preliminary step for SPI computation. Ten statistical distribution functions were tested to find the best fit in each of the 24 observation stations belonging to the five AEZs of Cameroon over the period 1951–2005 and for different TSs (3, 6, 12, 15, 18, 21 and 24 months). The ML method was used to estimate the parameters of the distribution functions. The K-S statistic was used to select the distribution functions that better fit station data, which were then used to calculate the SPI. The results were used to study drought occurrence and quantify the errors made if non appropriate distribution functions were used.
The appropriate distribution function for precipitation data was found to depend on the location of the station and the number of months in TS. The gamma distribution usually used as default is not always the best fit for SPI computation if many functions were tested. It was found that the logistic probability distribution remains the best choices in most cases for 3-month TS while above, burr shows the best fit. For 12 months and more, gamma, burr and logistic are found to be the best fits for many different stations. In all cases, a significant difference was found between the SPIs calculated with the best fit function and those calculated with the default gamma distribution; the differences between both SPIs are more significant from 12-month TS with higher values of the RMSEs.
This study raises the importance and the necessity of a preliminary study consisting in finding the best distribution functions fitting the data and using them for the calculation of SPI in order to reduce errors and increase the accuracy of the results.
Availability of data and materials
The datasets generated and/or analyzed during the current study are available on request by contacting the corresponding author.
References
Angelidis P, Maris F, Kotsovinos N et al. (2012) Computation of drought index spi with alternative distribution functions. Water Resour Manag 26:2453–2473. https://doi.org/10.1007/s11269-012-0026-0
Awchi TA, Kalyana MM (2017) Meteorological drought analysis in northern iraq using spi and gis. Sustainable Water Resour Manag 3:451–463. https://doi.org/10.1007/s40899-017-0111-x
Bae H, Ji H, Lim YJ et al (2019) Characteristics of drought propagation in south korea: Relationship between meteorological, agricultural, and hydrological droughts. Natural Hazards 99(1):1–16. https://doi.org/10.1007/s11069-019-03676-3
Berger VW, Zhou Y (2014) Kolmogorov-smirnov test: Overview. Statistics reference online, Wiley statsref. https://doi.org/10.1002/9781118445112.stat06558
Bhaga TD, Dube T, Shekede MD et al (2020) Impacts of climate variability and drought on surface water resources in sub-saharan africa using remote sensing: A review. Remote Sens 12(24):4184. https://doi.org/10.3390/rs12244184
Blain GC, de Avila AMH, Pereira VR (2018) Using the normality assumption to calculate probability-based standardized drought indices: selection criteria with emphases on typical events. Int J Climatol 38:e418–e436. https://doi.org/10.1002/joc.5381
Canal L (2005) A normal approximation for the chi-square distribution. Comput Stat Data Anal 48(4):803–808. https://doi.org/10.1016/j.csda.2004.04.001
Choi SC, Wette R (1969) Maximum likelihood estimation of the parameters of the gamma distribution and their bias. Technometrics 11(4):683–690. https://doi.org/10.2307/1266892
Cindric JJK, Pasaric Z (2012) Statistical distributions for the spi computation. EMS Annual Meeting AbstractsBerlin: EMS (Berlin) pp GF48: EMS2012–316
Cooray K (2010) Generalized gumbel distribution. J. Appl Stat 37(1):171–179. https://doi.org/10.1080/02664760802698995
Dai M, Huang S, Huang Q et al (2020) Assessing agricultural drought risk and its dynamic evolution characteristics. Agric Water Manag 231(106):003. https://doi.org/10.1016/j.agwat.2020.106003
DeHardt J (1971) Generalizations of the glivenko-cantelli theorem. Annals Math Stat 42(6):2050–2055. https://www.jstor.org/stable/2240133
Dogan S, Berktay A, Singh VP (2012) Comparison of multi-monthly rainfall-based drought severity indices, with application to semi-arid konya closed basin, turkey. J Hydrol 470:255–268. https://doi.org/10.1016/j.jhydrol.2012.09.003
Gebremeskel Haile G, Tang Q, Sun S et al. (2019) Droughts in east africa: Causes, impacts and resilience. Earth-Sci Rev 193:146–161. https://doi.org/10.1016/j.earscirev.2019.04.015
Gebremichael HB, Raba GA, Beketie KT et al. (2022) Temporal and spatial characteristics of drought, future changes and possible drivers over upper awash basin, ethiopia, using spi and spei. Environ, Develop Sustain 1–39. https://doi.org/10.1007/s10668-022-02743-3
Ghitany M, Al-Awadhi S (2002) Maximum likelihood estimation of burr xii distribution parameters under random censoring. J Appl Stat 29(7):955–965. https://doi.org/10.1080/0266476022000006667
Gidey E, Dikinya O, Sebego R et al. (2018) Modeling the spatio-temporal meteorological drought characteristics using the standardized precipitation index (spi) in raya and its environs, northern ethiopia. Earth Syst Environ 2:281–292. https://doi.org/10.1007/s41748-018-0057-7
Guenang GM, Mkankam Kamga F (2014) Computation of the standardized precipitation index (spi) and its use to assess drought occurrences in cameroon over recent decades. J Appl Meteorol Climatol 53:2310–2324. https://doi.org/10.1175/JAMC-D-14-0032.1
Guenang GM, Komkoua Mbienda A, Pokam Mba W et al. (2019) Sensitivity of spi to distribution functions and correlation between its values at different time scales in central africa. Earth Syst Environ 3:203–214. https://doi.org/10.1007/s41748-019-00102-3
Han Z, Huang S, Huang Q et al (2019) Propagation dynamics from meteorological to groundwater drought and their possible influence factors. J Hydrol 578(124):102. https://doi.org/10.1016/j.jhydrol.2019.124102
Hayes M, Svoboda M, Wall N et al. (2011) The lincoln declaration on drought indices: universal meteorological drought index recommended. Bullet Am Meteorolog Soc 92(4):485–488. http://www.jstor.org/stable/26226865
Hayes MJ, Svoboda MD, Wiihite DA et al (1999) Monitoring the 1996 drought using the standardized precipitation index. Bullet Am Meteorolog Soc 80(3):429–438. https://doi.org/10.1175/1520-0477(1999)080<0429:MTDUTS>2.0.CO;2
He Z, Liang H, Yang C et al. (2018) Temporal-spatial evolution of the hydrologic drought characteristics of the karst drainage basins in south china. Int J Appl Earth Observation Geoinfor 64:22–30. https://doi.org/10.1016/j.jag.2017.08.010
Horváth L (1993) The maximum likelihood method for testing changes in the parameters of normal observations. Annals Stat 671–680. https://doi.org/10.1214/aos/1176349143
Huang S, Li P, Huang Q et al. (2017) The propagation from meteorological to hydrological drought and its potential influence factors. J Hydrol 547:184–195. https://doi.org/10.1016/j.jhydrol.2017.01.041
Huang S, Wang L, Wang H et al. (2019) Spatio-temporal characteristics of drought structure across china using an integrated drought index. Agric Water Manag 218:182–192. https://doi.org/10.1016/j.agwat.2019.03.053
Kolmogorov AN (1933) Sulla determinazione empirica di una legge didistribuzione. Giorn Dell’inst Ital Degli Att 4:89–91
Liu D, You J, Xie Q et al (2018) Spatial and temporal characteristics of drought and flood in quanzhou based on standardized precipitation index (spi) in recent 55 years. J Geosci Environ Protection 6(8):25–37. https://doi.org/10.4236/gep.2018.68003
Liu L, Hong Y, Bednarczyk CN et al. (2012) Hydro-climatological drought analyses and projections using meteorological and hydrological drought indices: a case study in blue river basin, oklahoma. Water Resour Manag 26:2761–2779. https://doi.org/10.1007/s11269-012-0044-y
Mage DT, Ott WR (1984) An evaluation of the methods of fractiles, moments and maximum likelihood for estimating parameters when sampling air quality data from a stationary lognormal distribution. Atmospheric Environ 18(1):163–171. https://doi.org/10.1016/0004-6981(84)90239-7
Maia R, Vivas E, Serralheiro R et al. (2015) Socioeconomic evaluation of drought effects. main principles and application to guadiana and algarve case studies. Water Resour Manag 29:575–588. https://doi.org/10.1007/s11269-014-0883-9
Massey FJ Jr (1951) The kolmogorov-smirnov test for goodness of fit. J Am Stat Assoc 46(253):68–78. https://doi.org/10.1080/01621459.1951.10500769
McKee T, Doesken N, Kliest J (1993) The relationship of drought frequency and duration to time scales. in proceedings of the 8th conference of applied climatology, 17-22 january, anaheim, ca. Am Meterolog Soc, Boston, Massachusetts pp 179–184
Meng XP, Zhao CQ, Huo L (2014) Maximum likelihood method for parameter estimation of weibull distribution model based on fruit fly optimization algorithm. In: Applied Mechanics and Materials, Trans Tech Publ 3508–3511. https://doi.org/10.4028/www.scientific.net/AMM.602-605.3508
Mitchell B (1971) A comparison of chi-square and kolmogorov-smirnov tests. Area 237–241. https://www.jstor.org/stable/20000590
Ndayiragije JM, Li F, Nkunzimana A (2022) Assessment of two drought indices to quantify and characterize drought incidents: A case study of the northern part of burundi. Atmosphere 13(11):1882. https://doi.org/10.3390/atmos13111882
Net F (2019) Cameroon livelihood zone map and descriptions. FEWS NET, Washington, DC
Nfornkah BN, Enongene K, Kaam R, et al. (2021) Growth potential and sustainability of economically important rattan species in agro-ecological zones of cameroon. INBAR Working Paper. https://www.inbar.int/wp-content/uploads/2021/06/June-2021_Growth-Potential-and-Sustainability-of-Economically-Important-Rattan-Species-in-Agro-Ecological-Zones-of-Cameroon.pdf
Ojha SS, Singh V, Roshni T (2021) Comparison of meteorological drought using spi and spei. Civ Eng J 7:2130–2149. https://doi.org/10.28991/cej-2021-03091783
Okpara J, Tarhule A (2015) Evaluation of drought indices in the niger basin, west africa. J Geogr Earth Sci 3:1–32. https://doi.org/10.15640/jges.v3n2a1
Palmer W (1965) Meteorological drought. research paper no. 45. US Weather Bureau: Washington, DC. https://www.droughtmanagement.info/literature/USWB_Meteorological_Drought_1965.pdf
Panahi H, Asadi S (2011) Estimation of the weibull distribution based on type-ii censored samples. Appl Math Sci 5(52):2549–2558. http://www.m-hikari.com/ams/ams-2011/ams-49-52-2011/panahiAMS49-52-2011.pdf
Park S, Im J, Park S et al. (2017) Drought monitoring using high resolution soil moisture through multi- sensor satellite data fusion over the korean peninsula. Agric Forest Meteorol 237:257–269. https://doi.org/10.1016/j.agrformet.2017.02.022
Penlap K, Matulla C, Storch H et al. (2004) Downscaling of gcm scenarios to assess precipitation changes in the little rainy season (march-june) in cameroon. Climate Res 26:85–96. https://doi.org/10.3354/cr026085
Pérez-Sánchez J, Senent-Aparicio J (2018) Analysis of meteorological droughts and dry spells in semiarid regions: a comparative analysis of probability distribution functions in the segura basin (se spain). Theoretical Appl Climatol 133:1061–1074. https://doi.org/10.1007/s00704-017-2239-x
Pieper P, Düsterhus A, Baehr J (2020) A universal standardized precipitation index candidate distribution function for observations and simulations. Hydrol Earth Syst Sci 24(9):4541–4565. https://doi.org/10.5194/hess-24-4541-2020
Qin Y, Yang D, Lei H et al. (2015) Comparative analysis of drought based on precipitation and soil moisture indices in haihe basin of north china during the period of 1960-2010. J Hydrol 526:55–67. https://doi.org/10.1016/j.jhydrol.2014.09.068, drought processes, modeling, and mitigation
Quenum GML, Klutse NA, Dieng D et al (2019) Identification of potential drought areas in west africa under climate change and variability. Earth Syst Environ 3:429–444. https://doi.org/10.1007/s41748-019-00133-w
Rahman M, Pearson LM (2007) Estimation in two-parameter exponential distributions. J Stat Comput Simulat 70(4):371–386. https://doi.org/10.1080/00949650108812128
Ramachandran KM, Tsokos CP (2015) Mathematical statistics with applications in R, Second Edition, Chapter 7. Academic Press. https://doi.org/10.1016/C2012-0-07341-3
Raziei T (2021) Performance evaluation of different probability distribution functions for computing standardized precipitation index over diverse climates of iran. Int J Climatol 41(5):3352–3373. https://doi.org/10.1002/joc.7023
Robertson G (1969) Computation of the noncentral chi-square distribution. Bell Syst Technical J 48(1):201–207. https://doi.org/10.1002/j.1538-7305.1969.tb01111.x
Santé N, N’Go YA, Soro GE et al (2019) Characterization of meteorological droughts occurrences in côte d’ivoire: case of the sassandra watershed. Climate 7(4):60. https://doi.org/10.3390/cli7040060
Schuster S (2012) Parameter estimation for the cauchy distribution. In: 2012 19th International conference on systems, signals and image processing (IWSSIP), pp 350–353
Spinoni J, Naumann G, Carrao H et al (2014) World drought frequency, duration, and severity for 1951–2010. Int J Climatol 34(8):2792–2804. https://doi.org/10.1002/joc.3875
Stagge JH, Tallaksen LM, Gudmundsson L et al (2015) Candidate distributions for climatological drought indices (spi and spei). Int J Climatol 35(13):4027–4040. https://doi.org/10.1002/joc.4267
Stephens MA (1970) Use of the kolmogorov-smirnov, cramer-von mises and related statistics without extensive tables. J Royal Stat Soc: Series B (Methodological) 32(1):115–122. https://doi.org/10.1111/j.2517-6161.1970.tb00821.x
Streit RL, Luginbuhl TE (1994) Maximum likelihood method for probabilistic multihypothesis tracking. In: Signal and data processing of small targets 1994, SPIE, pp 394–405, https://doi.org/10.1117/12.179066
Sylla MB, Nikiema PM, Gibba P et al. (2016) Climate change over west africa: Recent trends and future projections. Adaptation to climate change and variability in rural West Africa pp 25–40. https://doi.org/10.1007/978-3-319-31499-0_3
Tirivarombo S, Osupile D, Eliasson P (2018) Drought monitoring and analysis: standardised precipitation evapotranspiration index (spei) and standardised precipitation index (spi). Physics and Chemistry of the Earth, Parts A/B/C 106:1–10. https://doi.org/10.1016/j.pce.2018.07.001
Touma D, Ashfaq M, Nayak MA et al (2015) A multi-model and multi-index evaluation of drought characteristics in the 21st century. J Hydrol 526:196–207. https://doi.org/10.1016/j.jhydrol.2014.12.011
Tsakiris G, Vangelis H (2005) Establishing a drought index incorporating evapotranspiration. European Water 9(10):3–11
Van Rooy M (1965) A rainfall anomally index independent of time and space, notos. Notos 14:43–48
Vicente-Serrano SM, Beguería S, López-Moreno JI (2010) A multiscalar drought index sensitive to global warming: the standardized precipitation evapotranspiration index. J Climate 23(7):1696–1718. https://doi.org/10.1175/2009JCLI2909.1
Vondou DA, Guenang GM, Djiotang TLA et al (2021) Trends and interannual variability of extreme rainfall indices over cameroon. Sustainability 13(12):6803. https://doi.org/10.3390/su13126803
Wang W, Ertsen MW, Svoboda MD et al (2016). Propagation of drought: from meteorological drought to agricultural and hydrological drought. https://doi.org/10.1155/2016/6547209
Wilhite DA, Glantz MH (1985) Understanding: the drought phenomenon: the role of definitions. Water Int 10(3):111–120. https://doi.org/10.1080/02508068508686328
Wilhite DA, Svoboda MD, Hayes MJ (2007) Understanding the complex impacts of drought: A key to enhancing drought mitigation and preparedness. Water Resour Manag 21:763–774. https://doi.org/10.1007/s11269-006-9076-5
Wu SJ (2002) Estimations of the parameters of the weibull distribution with progressively censored data. J Japan Stat Soc 32(2):155–163. https://doi.org/10.14490/jjss.32.155
Zarei AR, Shabani A, Moghimi MM (2021) Accuracy assessment of the spei, rdi and spi drought indices in regions of iran with different climate conditions. Pure Appl Geophys 178:1387–1403. https://doi.org/10.1007/s00024-021-02704-3
Zhang Y, Li Z (2020) Uncertainty analysis of standardized precipitation index due to the effects of probability distributions and parameter errors. Front Earth Sci 8:76. https://doi.org/10.3389/feart.2020.00076
Zhao M, Huang S, Huang Q et al (2019) Assessing socio-economic drought evolution characteristics and their possible meteorological driving force. Geomatics, Natural Hazards and Risk 10(1):1084–1101. https://doi.org/10.1080/19475705.2018.1564706
Acknowledgements
The authors are very grateful to the National Meteorological Service of Cameroon for providing observed precipitation data.
Funding
Not applicable.
Author information
Authors and Affiliations
Contributions
“Material preparation, data collection and analysis were performed by A.R. Gamgo Fotse and G.M. Guenang. The first draft of the manuscript was written by A.R. Gamgo Fotse. The authors A.R. Gamgo, G. M. Guenang, A.J. Komkoua Mbienda and D.A. Vondou commented and approved the final manuscript.”
Corresponding author
Ethics declarations
Ethical Approval
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Communicated by: H. Babaie.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Fotse, A.R.G., Guenang, G.M., Mbienda, A.J.K. et al. Appropriate statistical rainfall distribution models for the computation of standardized precipitation index (SPI) in Cameroon. Earth Sci Inform 17, 725–744 (2024). https://doi.org/10.1007/s12145-023-01188-0
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s12145-023-01188-0