1 Introduction

The estimate of design events must be reasonably accurate to avoid high costs of constructing hydraulic structures. When flood magnitudes are overvalued, the cost of constructing hydraulic structures may be high, while excessive damage and even loss of human lives may be unavoidable if the flood potential is underestimated. Hence, it is essential to estimate how often a specific heavy rain leading to a flooding event will occur or how massive a storm will be for a particular probability of exceedance or recurrence interval. This might be achieved through the estimation of distributional parameters and the extrapolation of cumulative distribution functions (CDFs) to generate extreme flood values.

Design estimates of rainfall can be susceptible to different uncertainties and variabilities that question their credibility. Therefore, by ignoring credibility assessment of design rainfall estimates in terms of uncertainties and variabilities, practitioners fail to highlight the inherent dangers of designing unsustainable drainage infrastructure (Chuah et al. 2017). According to Hassan et al. (2019), uncertainties are unavoidable in hydrologic frequency analysis, and they may arise from various sources distinguished as data uncertainties, structural uncertainties and parameter uncertainties (Leandro et al. 2013; Leandro et al. 2019). Various origins of uncertainties have also been described by Mamoon and Rahman (2014). According to Nathan and Weinmann (2013) and Renard et al. (2010), the credibility of hydrologic simulations could be attenuated by the model selection process, the input of the model and its parameters, the structure of the model established and assumptions made, the slips contained in data used for simulation training and the output of the model. Climate change has also been proved to affect the credibility of design estimates of rainfall for flood control (Wang et al. 2013), due to alterations of future severity and frequency of storms (Hajani and Rahman, 2018a, b).

The usual practice in hydrologic frequency analysis is to select a small sample size of peak rains or discharge, which may not represent the statistical characteristics of the distribution of the population; as a result, estimated parameters of selected PDF (probability density function) are expected to be prone to uncertainties despite employing the proper parameter estimation methods. Uncertainty is always present when planning, developing, managing and operating water resources systems. It arises because many factors that affect the performance of water resources systems are not and cannot be known with certainty.

Variability is associated with temporal, geographical and/or individual heterogeneity among affiliates of a statistical population, and it is a complicated and integral characteristic of a system (Pouillot and Delignette-Muller 2010). Begg et al. (2014) asserted that variability can be assessed by a distribution of frequencies of several occurrences of the quantity resulting from an observed record. For instance, Loveridge and Rahman (2018) affirmed that frequent flood events pronounce the variability of factors that lead to flood, which further intensifies the uncertainties. Measurement or observation of data through statistical manipulations is used for computing variability (Begg et al. 2014), and these can be represented by frequency distribution. In flood risk management, variability and uncertainty have different implications and should be treated separately (Pouillot and Delignette-Muller 2010). The mean population risk and its connected uncertainty are important to decision-makers, which involve estimating the uncertainty of an estimate after its variability has been computed. This distinction has been advanced and validated by various methods in recent times.

Different methods have been proposed for the valuation of uncertainties to increase the credibility of design estimates, which include analytical, Monte Carlo simulation, Bayesian, first-order variance estimation centered on the development of Taylor's series, bootstrapping, cross-validation and fuzzy-based methods (Mamoon and Rahman 2014). But bootstrapping and Monte Carlo simulation have been widely applied for assessing the credibility of design rainfall estimates used in the construction of hydraulic structures. For instance, Chuah et al. (2017) and Mamoon and Rahman (2019) adopted Monte Carlo simulation in computing credibility intervals of design rainfall, while Seo and Park (2011), Hassan et al. (2019), Mamoon and Rahman (2019) adopted the bootstrapping method to ascertain the credibility of the parameters of suitable models.

Various studies have amplified the importance of using the two-dimensional Monte Carlo simulation framework in quantitative risk assessment involving the distinction of variabilities and their associated uncertainties (Pouillot and Delignette-Muller 2010; Begg et al. 2014). The two-dimensional Monte Carlo simulation framework is a valuable method for reducing the complexity of estimating the densities of distribution, through the development of an empirical distribution that is asymptotically related to the distribution of the risk crosswise a population (Pouillot and Delignette-Muller 2010). The uncertainty of risk arising from parameter estimates can be evaluated by the two-dimensional Monte Carlo simulation, and it has been used in numerous risk assessment domains (Vicari et al. 2007; Jang et al. 2009a, b; Jones et al. 2009; Özkaynak et al. 2009; Guo et al. 2020), flood management inclusive (Rahman et al. 2002; Kalyanapu et al. 2011; Liu and Liu 2019).

Extensive studies have been recently done in Nigeria on hydrologic frequency analysis to reduce flood risk based on peak rainfall and river discharge (Manta and Ahaneku 2009; Agbede and Abiona 2012; Ehiorobo and Izinyon 2013; Izinyon and Ajumuka 2013; Gbadebo et al. 2014; Agbede and Aiyelokun 2016; Aiyelokun et al. 2017; Okeke and Ehiorobo 2017; and Aiyelokun et al. 2018). Some of these studies endeavored to present statistical models or plots that relate design floods or rainfall estimates with return periods but failed to address the level of reliability of the models and credible intervals of their variability.

The assessment of the credibility of design rainfall estimates has been disregarded by many researchers owning to difficulties associated with the procedure of conducting it (Huang et al. 2016), intentional underestimation of risk by practitioners (Overeem et al. 2008) and the lack of interest by researchers, which have led to limited studies on credibility of design rainfall estimates (Tung and Wong 2014). It is also interesting to note that in situations where uncertainty and variability were conducted to validate hydraulic designs, they were not investigated spatially across a large region. In this regard, the innovative aspect of the study is that it presents a simplified framework for assessing the uncertainty and variability of design rainfall estimates to optimize flood risk reduction and management. The outcome of this study is beneficial to policymakers, researchers and engineers by informing a reliable risk assessment practice when designing hydraulic structures for flood control in Nigeria and other parts of the world.

The present study aims to assess the level of disregard of credibility evaluation of design rainfall estimations by Nigerian researchers and to establish a framework for assessing the credibility of design rainfall estimates and models, using the north central region of Nigeria as a case study. The objectives of the study are (1) to conduct a systematic review on the extent of disregard of credibility assessment of design rainfall estimation in Nigeria, (2) to evaluate the performance of selected probability models in fitting the empirical distribution of 24-h annual maxima series (AMS) in the study area, (3) to assess the uncertainty of the parameters of suitable models, (4) to assess the uncertainty of design rainfall estimate for different return periods and (5) to assess the variabilities of AMS and their uncertainties at 95% credible level through stochastic simulations across the study region.

2 Materials and methods

2.1 Study area and data acquisition

The study area is the north central region of Nigeria, which is located approximately between longitude 2° 40′ 20.14" and 10° 37′ 47.93" east and latitude 6° 25′ 57.09" and 11° 26′ 39.57" north. The region consists of Benue, Nasarawa, Plateau, Federal capital territory (FCT), Koji, Kwara and Niger States (Fig. 1), covering a landmass of about 229,028 km2. The study area has a tropical climate characterized by wet and dry seasons. Monthly mean rainfall varies from 6.50 to 225.9 mm in the southwestern (Ilorin) part and from 0.61 to 271.38 mm in the northeastern region (Jos), while monthly maximum temperature ranged from 29.04 to 35.66 °C in Ilorin and 24.13–30.91 °C in Jos. This statistic is based on the monthly climate data recorded between 1960 and 2016. A daily 30-year (1981–2010) rainfall data recorded at Lokoja, Bida, Makurdi, Minna, Yelwa, Abuja, Jos and Ilorin meteorological stations were obtained from the Nigerian Meteorological Agency (NIMET) to generate AMS of rains for north central Nigeria.

Fig. 1
figure 1

Map of north central Nigeria

2.2 Synthetic literature review

The study adopted a systematic literature review. Available publications were searched from existing databases such as Web of Science, SCOPUS, Google Scholar, African Journal Online, professional association affiliated journals and other online sources published in the past decade. The publications were systematically searched using keywords such as rainfall, design storms, probability distributions, stochastic modeling, frequency analysis of annual maximum and Nigeria. The primary source of the review was adopted for this study; herein, the report of the original researchers was assessed and analyzed (Cronin et. al. 2008). The qualitative assessment method (Grant and Booth 2009) in the form of the synthesis matrix was adopted; this was organized to present the indexed database of the publications, the year of publication, study area, period of data selection, assessment of the goodness of fit of various probability models, developed model, assessment of the credibility of model parameters and assessment of the credibility of model quantiles.

2.3 Proposed framework for credibility assessment of design rainfall estimates

To inform the practice of assessing the credibility of design rainfall estimates by practitioners before employing them in the designs of drainage infrastructure, a methodological framework was constructed based on the authors’ synthesis of closely related studies as shown in Fig. 2. The framework constitutes five (5) stages, which include the preliminary assessment, performance evaluation of selected theoretical distributions, credibility assessment of model parameters, the credibility of suitable model quantiles and propagation of variability and uncertainty stage.

Fig. 2
figure 2

Framework for assessing the credibility of design rainfall estimates

The preliminary assessment stage constitutes data selection and preprocessing, which involves sourcing for data by the practitioner. This has been reported to be a major bottleneck for reliable hydrological modeling, as data acquisition is very difficult in most regions of the world. Data are either not readily available for research purposes or characterized by missing values. Therefore, at this step, it is important to investigate the data for error detection and removal of anomalies. In general, rainfall data used must be relevant, adequate and accurate (Ojha et al. 2008).

After the relevance, adequacy and accuracy of the data have been ensured, annual maximum or instantaneous peak rainfall intensity for different durations or in the case where they are not available, the 24-h duration is then abstracted. This is followed by the computation of descriptive statistics of the abstracted data. Descriptive statistics are employed to support the selection of candidate probability distributions to fit the abstracted AMS (Delignette-Muller and Dutang 2015). These statistics include central tendency estimates such as mean, mode, median; dispersion estimates such as the coefficient of variation, standard deviation and the shape estimates such as skewness and kurtosis quantities of the annual maximum series which measures the empirical distribution. More so, a Cullen and Frey graph (skewness–kurtosis graph) could be employed to select potential probability distributions (Delignette-Muller and Dutang 2015).

Having selected candidate probability distributions, the second stage of the framework is employed, which involves the estimation of the parameters and evaluation of the performance of the selected probability model to select the most suitable model that explains the distribution of the abstracted AMS. Various parameter estimation methods have been proposed; however, any of maximum likelihood estimation (MLE), probability-weighted moment (PWM) and linear moment (L-M) method may be used since they have been reported to be robust than the method of moment (MOM) procedure.

The second stage also involves quantitative and qualitative assessment procedures. The quantitative constitutes the adoption of the goodness-of-fit statistics and goodness-of-fit criteria. The goodness-of-fit statistics such as Cramer–von Mises, Kolmogorov–Smirnov and Anderson–Darling statistics (Delignette-Muller and Dutang 2015) or Chi-square statistics (Delignette-Muller et al 2015) may also be employed. Also, the goodness-of-fit criteria based on the log-likelihood such as AIC and BIC or with an inclusion of diagnostic statistics such as the D-index test (Vivekanandan 2015) may be employed at the second stage. The qualitative assessment procedure basically constitutes the adoption of graphical plots to assess the extent of fit of the candidate distributions; examples include a density plot of candidate distributions with the histogram empirical distribution, CDF plot that combines the empirical and the candidate probability distributions, Q–Q plot which relates the quantiles of the empirical distribution to the candidate probability distribution and the P–P plot which relates the probabilities of the empirical distribution to the candidate probability distributions. The final step of this stage is to establish the most suitable model based on the combination of the quantitative and qualitative procedures, the practitioner may adopt rank scores as found in Hassan et al (2019) and Mamoon and Rahman (2019); however, the more attention may be given to the Anderson–Darling statistics due to its robustness and emphasis of fit around the tails, with the backing of a qualitative assessment in situations where more than one GOF test favors a particular probability distribution.

Once the most suitable probability distribution has been established for a particular annual maximum series, parametric bootstrapping is conducted, while the level of the credibility of the estimated parameters may be assessed by comparing their estimates with the median and the 95% confidence interval (CI) of the bootstrapped estimates. Those estimates of the parameters that are close to the median and also within the 95% CI of the bootstrapped estimate may be termed to be credible.

The next stage of the framework is based on the assessment of model quantiles of the bootstrapped samples for different return periods through the adoption of the quantile function and parameters of the most suited probability model. Then the estimates of the quantiles and their upper and lower bounds (95% credible intervals) are presented in the form of plots or tables. Quantiles with wider uncertainty or larger standard error are considered to be less reliable.

The final stage of the framework involves the propagation of variability and uncertainty of the design rainfall estimates using the two-dimensional Monte Carlo (2D-MC) simulation framework and the evaluation of various percentiles and their uncertainty for informed decision-making.

The codes used for implementing the framework are based on the integration of two packages of R programming environment that include the “mc2d” and “fitdisrtplus” which are freely available.

2.4 Statistical tools for credibility assessment

2.4.1 Statistical description and plotting position

Statistical summaries of the generated AMS were computed using mean, standard deviation, skewness and kurtosis to provide concise information of the series. The generated 1-day extremes were arranged in descending order of magnitude. After that, the probabilities that the ranked maxima will be equaled or exceeded for any return period were computed by Hazen’s plotting position represented as:

$$T_{r} = \frac{{\left( {m - 0.5} \right)}}{n}$$
(1)

where Tr is the return period, m is the order or rank, while n is the number of years of study.

2.4.2 Probability distributions

Preliminary assessment of the AMS leads to the selection of four probability distributions. These include the two-parameter distributions, such as Weibull, gamma, log-normal, normal. The probability density function (PDF) of Weibull, gamma, log-normal and normal distributions are represented in Eqs. (25), respectively.

$$f\left( {x;\alpha ,\beta } \right) = \frac{\beta }{\alpha }\left( {\frac{\alpha }{x}} \right)^{{\beta + 1}} e^{{ - \left( {\alpha /x} \right)^{\beta } }} , - \infty < x < \infty ,\alpha > 0$$
(2)
$$f\left( {x;\alpha ,\beta } \right) = \frac{1}{{\alpha ^{\beta } \Gamma \beta }}x^{{\beta - 1}} e^{{ - \left( {\frac{x}{\alpha }} \right)}} ,x,\alpha ,\beta > 0$$
(3)
$$f\left( {x,\mu _{y} ,\sigma _{y} } \right) = \frac{1}{{\sigma _{y} x\sqrt {2\pi } }}e^{{ - \frac{1}{2}\left( {\frac{{\ln \left( x \right) - \mu _{y} }}{{\alpha _{y} }}} \right)^{2} }} ,x,y > 0$$
(4)
$$f\left( {x,\mu _{x} ,\sigma _{x} } \right)\frac{1}{{\alpha _{x} \sqrt {2\pi } }}e^{{ - \frac{1}{2}\left( {\frac{{\left( x \right) - \mu _{x} }}{{\sigma _{x} }}} \right)^{2} }} ,x,y > 0$$
(5)

where \(\mu _{x} \) and \(\sigma _{y} \) are the mean and standard deviation of the series of the annual extreme storm; \(\mu _{y}\) and \(\sigma _{y} \) are the mean and standard deviation of the log-transformed series of extreme annual rainfall, α and β are the scale and location parameters, respectively (Rao and Hamed 2000).

These distributions are suitable for series with positive skewness and kurtosis not far from zero, which were observed for all the stations within the study area. This study uses the MLE method to estimate distributions’ parameters. The theoretical terms and the formulae for parameter estimation using ML of the selected distributions are given by Hassan et al. (2019).

2.4.3 Goodness-of-fit statistics

The goodness-of-fit (GOF) statistics based on the Anderson–Darling (AD) test (Laio 2004), Kolmogorov–Smirnov (KS) test (Chowdhury et al. 1991) and Cramer–von Mises (CM) test (Arnoldand Emerson 2011) were employed to assess the suitability of the selected probability distributions.

2.4.3.1 The Anderson–Darling (AD) test

The Anderson–Darling (AD) test was used to compare the cumulative distribution functions of the empirical and the probability distributions, and it gives more attention to outlier detection. AD test (A2) is represented in Eq. (6) as:

$$A^{2} = - N - \frac{1}{N}\sum _{{i = 1}}^{N} \left( {2i - 1} \right) * \left( {\ln F_{e} \left( {Q_{i} } \right)} \right) + \left( {\ln \left( {1 - F_{D} \left( {Q_{i} } \right)} \right)} \right)$$
(6)

where A2 is the Anderson–Darling test statistic, Fe is the cumulative distribution function of the specified distribution and \(Q_{i}\) is the ordered observed data.

2.4.3.2 Kolmogorov–Smirnov (KS) test

Kolmogorov–Smirnov (KS) test is based on the maximum vertical distance between the CDFs of empirical distribution and the theoretical distribution and is represented as:

$$KS = Max\left( {F\left( {Q_{i} } \right)\frac{{i - 1}}{N},\frac{i}{N}F\left( {Q_{I} } \right)} \right)$$
(7)

where \(F({Q}_{i}\)) is the theoretical cumulative distribution of distribution being assessed.

2.4.3.3 Cramer–von Mises (W2) test

Cramer–von Mises statistic (W2) was used to test the discrepancy between the CDFs of the empirical and hypothetical distribution by measuring the mean squared difference between them (Laio, 2004); and it is represented in Eq. (8) as:

$$W^{2} = \sum _{{i = 1}}^{N} \left( {F\left( {Q_{i} } \right) - \frac{{i - 0.5}}{N}} \right) + \frac{1}{{12N}}$$
(8)

2.4.4 Goodness-of-fit criteria

The goodness-of-fit (GOF) criteria such as the Akaike's information criterion (AIC) and Bayesian information criterion (BIC) were employed to further established the appropriateness of suitable models.

2.4.4.1 Akaike information criterion (AIC)

Concerning the adopted GOF criteria, the Akaike information criterion is widely adopted for selecting suitable stochastic models and is represented in Eq. (9):

$$AIC = n\left( {\log \sigma ^{2} + 1} \right) + 2p$$
(9)

where σ2 and p present the variance and the number of parameters of the subset stochastic model.

2.4.4.2 Bayesian Information Criterion (BIC)

BIC is closely related to AIC, and it is partially based on the likelihood function and is represented by Eq. (10):

$$- 2\ln p\left( {x\left| k \right.} \right) \approx - 2\ln L + K\ln \left( n \right)$$
(10)

where x is the observed data, n is the sample size, k is the number of free parameters to be estimated, p(x|k) is the probability of likelihood of parameters for given data set and L is the maximized value of the likelihood function.

In a situation where more than one distribution is favored by multiple goodness-of-fit statistics or criteria, the AD test that has been considered to be very powerful (Laio 2004) was used for final selection. Herein, the probability distribution whose cumulative distribution function (CDF) is close to the empirical distribution, that is, whose AD is lowest, was selected. A graphical or qualitative assessment method in the form of CDF plots was further used to affirm the fit of the distributions.

2.4.5 Parametric bootstrap method

The study employed the parametric bootstrap method for assessing the uncertainty of the parameters of the most suitable statistical models for north central Nigeria. The parametric bootstrap method, which is based on the resampling of distributions with the best fit, was employed for evaluating how well statistical estimates are accurate and for estimating the standard error of summary statistics based on the plug-in principle (Seo and Park 2011).

Assuming that \(X = \left( {x_{1} x,x_{2} x,...,x_{n} x} \right)\) is the constructed series of peak rainfall; its empirical distribution denoted by \({F}_{n}\) may be defined below:

$$F_{n} = \left\{ {\begin{array}{*{20}l} {0,x < x_{{\left( 1 \right)}} } \\ {\frac{k}{n},x_{k} \le x < x_{{\left( {k + 1} \right)}} } \\ {1,x \ge x_{{\left( k \right)}} } \\ \end{array} } \right.$$
(11)

Once the empirical function \(F_{n}\) has been calibrated to a known parametric distribution \(F_{\theta } * \left( x \right)\), with its estimated parameters θ*, the parametric bootstrap is based on generating several samples of \(F_{\theta } * \left( x \right)\) based on the sample size of the rainfall series, after which the credible intervals of quantiles and parameters may be estimated. In this study, bootstrapped sampling of the parameters of suitable distributions was done one thousand and one (1001) times.

2.4.6 2-dimensional Monte Carlo simulation framework

The two-dimensional Monte Carlo (2D-MC) simulation framework employed in the study is based on the one recommended by McMillan et al. (2018) and Pouillot and Delignette-Muller (2010) (Fig. 3). The 2D-MC simulation, as explained by Pouillot and Delignette-Muller (2010), is established after the disintegration of all input into the uncertain or variable input. This is followed by the random sampling of uncertain inputs from their separate distributions in row vectors nu of elements; thereafter, variable inputs are sampled randomly from their separate distributions in the column vectors nv. where nu and nv are specified number of iterations in the row and column dimension, respectively; 1001 iterations similar to what was adopted for the bootstrap method was used. Then a nv × nu matrix is developed based on the conditional random sampling of nv (variable parameters) to their nu values (uncertain parameters). Finally, the quantitative risk assessment model conglomerates these elements to derive the nv × nu ultimate matrix of outputs. A detailed procedure for adopting the 2D-MC is presented in Pouillot and Delignette-Muller (2010) and Pouillot et al. (2016).

Fig. 3
figure 3

Schematic representation of a two-dimensional Monte Carlo simulation

3 Results and discussion

3.1 Disregard of credibility assessment of design rainfall estimation in Nigeria

Table 1 presents a synthesis matrix that summarizes the investigations, which have been conducted by various researchers in Nigeria for the last decade on the estimation of design rainfall for flood control. It could be seen that the majority of the available publications that were used for the systematic literature review were drawn from Google Scholar, followed by those drawn from the database of African journal online (AJO), while the least publication was drawn from a Nigeria-based association journal. No related work conducted in Nigeria was found in other existing online databases such as Web of Science and Scopus. The selected studies span from 2010 to 2019 and were conducted in different parts of Nigeria. The majority of the work was done specifically in areas within the southwest, southeast and south–south regions of Nigeria. However, Nnaji (2014) is the only researcher that conducted a study spanning across selected states in Nigeria, including few cities in Northern Nigeria region, such as Ilorin, Jos, Kaduna, Bauchi, Kano and Gusau.

Table 1 Summary of systematic review of design rainfall estimates in Nigeria

As shown in Table 1, both empirical and theoretical methods have been used by the researchers in Nigeria to predict design storms for different return periods. Their results were presented in form of IDF (intensity–duration–frequency) or depth–frequency plots and sometimes in combination with equations. It could be seen that not all researchers endeavor to compare different candidate probability distributions to ascertain their fit; for instance, Nwaogazie and Ekwueme (2017) and Olatunde and Adejoh (2017) disregarded the adequacy of fit of Gumbel distribution, while Agbede and Abiona (2012) did not evaluate the fit of the empirical methods adopted.

Furthermore, Table 1 show that all the researchers in the selected studies did not investigate the reliability of the models or equations presented. The uncertainty of the developed models parameters and their predicted quantiles was not assessed. The assessment of uncertainty and variability dimensions of the rainfall estimations were generally neglected by investigators. This shows that the credibility of the design rainfall estimations for flood control structures has been largely disregarded in Nigeria.

3.2 Application of proposed framework in central Nigeria

When designing hydraulic infrastructures, the credibility of design estimates of rainfall should be assessed and reported, but unfortunately, it is usually ignored (Overeem et al. 2008; Tung and Wong 2014; Huang et al. 2016). This study conducted a systematic review of past literature in the last decade on how Nigerian researchers have ignored the assessment of the uncertainty of probability models and their parameters, before being used in the design of flood control structures. The study then proposed a simple framework (Fig. 3) for assessing the credibility of design rainfall estimates by practitioners, using the north central region of Nigeria as a case study. In order to apply the framework for the first time, the study evaluated the performance of selected probability models in fitting the empirical distribution of 24-h annual maxima (AM) in the study area, assessed the uncertainty of the parameters of suitable models, assessed the uncertainty of design rainfall estimate for different return periods and assessed the variabilities of AM and their uncertainties at 95% credible level through stochastic simulations.

3.3 Statistical description of AM rainfall in the study area

The statistical summary of the series of 1-day extreme rains for north central Nigeria is presented in Table 2. It could be observed in the table that Lokoja recorded the highest minimum peak instantaneous rainfall, while Yelwa recorded the lowest minimum; Abuja recorded the highest maximum peak rainfall, while Yelwa recorded the lowest maximum peak storm. Furthermore, Ilorin recorded the highest median, while Jos recorded the lowest average, and the highest mean peak rainfall was observed at Abuja, while the lowest mean was recorded at Jos. Also, the table shows that Abuja had the largest spread from its mean values when compared with other stations, while Jos had the lowest dispersion from its mean. In general, areas with lower elevations appear to be subjected to the highest extreme values (such as Abuja station) and areas at higher elevations tend to have had lower extremes (such as Jos station) in the study area.

Table 2 Statistical summary of peak rainfall stations (1981–2010)

Table 2 further reveals that the peak rainfall distribution at all stations is not symmetric since the skewness is not equal to zero, which majorly informed the choice of the candidate probability distributions used in fitting the data. Furthermore, the kurtosis estimations reveal that 1-day extreme rains at Lokoja, Minna and Yelwa have their extremes tails lower than a normal distribution, while other parts of north central Nigeria have higher peaks close to or higher than the rear of the normal distribution. The basic statistics presented in Table 2, it shows the intensity of heavy rains is largely dispersed across the study area.

These results corroborate similar studies conducted for the southwestern geopolitical region of Nigeria. For instance, rains were found to be positively skewed with spatial variation in kurtosis (Aiyelokun et al. 2018), while Olofintoye et al. (2009) investigated fifty-four (54) rainfall stations in Nigeria and revealed that all stations were positively skewed.

3.4 Performance evaluation of probability distribution models

The goodness-of-fit (GOF) statistics and goodness-of-fit (GOF) criteria employed for the study to select the most suitable model for Lokoja, Bida, Makurdi and Minna are presented in Table 3. The log-normal distribution was found to be favored by both the GOF statistics and criteria for Lokoja, Bida and Makurdi stations (Table 3). The Weibull distribution was found to fit best to the Minna station (Table3). The fit of the selected distributions is further emphasized in the CDF plots (Fig. 4).

Table 3 Summary of GOF statistics and criteria of stochastic models for Lokoja, Bida, Makurdi and Minna stations
Fig. 4
figure 4

CDF plots of fitted distribution and empirical distributions of (a) Lokoja, (b) Bida, (c) Makurdi and (d) Minna stations

All the distributions describe the left tail of the empirical distribution of Lokoja expect Weibull (Fig. 4a). Figure 4b emphasizes the ability of log-normal to fit the left and right tails as well as part of the center of the distribution of heavy rains at Bida station. Figure 4c emphasizes the weak fit of gamma distribution at the center and right tail, while Fig. 4d emphasizes the fit of Weibull against other distributions to heavy rains at Minna station; the fit was found to be strong at the center and the right tail of the empirical distribution.

The GOF statistics and criteria employed for the study to select the most suitable model for Yelwa, Abuja, Jos, Ilorin are explored in Table 4, which reveals that the GOF statistics and GOF criteria favor the gamma distribution for Yelwa station. Log-normal distribution was found to be favored by both the GOF statistics and criteria for Bida station (Table 4), while for Jos and Ilorin, both the GOF statistics and criteria prefer the log-normal distribution. The fit of the selected distributions is further emphasized in the CDF plots Fig. 5.

Table 4 Summary of GOF statistics and criteria of stochastic models for Yelwa, Abuja, Jos and Ilorin stations
Fig. 5
figure 5

CDF plots of fitted distribution and empirical distributions of (a) Yelwa, (b) Abuja, (c) Jos and (d) Ilorin stations

Every distribution was close to the empirical distribution of Yelwa; however, the Weibull distribution fits most to the heavy rains recorded at Yelwa station (Fig. 5a). Figure 5b emphasizes the ability of log-normal fit to the left and right tails of the empirical distribution of heavy rains at Abuja station. Figure 5c emphasizes the strong fit of log-normal distribution to heavy rains at Jos station, while Fig. 5d emphasizes the fit of all the distributions to heavy rains at the Ilorin station, although the log-normal outperformed others being stronger at the left and right tails of the empirical distribution.

These results show that no single candidate distribution can be selected for a particular region, as it has been revealed in similar hydrologic frequency analysis studies presented by Drissia et al. (2019); Hassan et al. (2019); Aiyelokun et al. (2018); Ahmad et al. (2015) and Olofintoye et al. (2009).

3.5 Establishment of suitable statistical models

Figure 6 shows the flood frequency curves of the most suitable stochastic model for each rainfall station. The figure also shows the established logarithm models for each station along with their respective coefficient of determination (R2) being equal to 0.992, 0.992, 0.993, 0.925, 0.934, 0.998, 0.988 and 0.992 for Lokoja, Bida, Makurdi, Minna, Yelwa, Abuja, Jos and Ilorin stations, respectively. The models are considered adequate for enhancing decision-making concerning drainage infrastructure for flood management, because of their strong ability to predict design storms based on return periods, as they can predict more than 92% of the variation of heavy rains of a particular return period. In general, the design rainfall estimate of any return period can be estimated from the statistical models.

Fig. 6
figure 6

Rainfall frequency curves for design floods based on suitable probability distributions for each station, with their established models

3.6 Credibility analysis of quantiles and parameters of selected statistical models

The uncertainty of the estimated parameters for the most suitable models based on the maximum likelihood estimation method was assessed by employing the parametric bootstrap method. Table 5 shows the comparison between the bootstrapped estimates at the median and 95 percent confidence intervals and the estimates of the selected distributions for each station. It could be observed in the table that the estimated parameters of the suitable distributions were close to the median as well as being within the 95 percent confidence level of the bootstrapped parameters. These results established that the parameters of the suitable probability models incurred slight chance of being uncertain.

Table 5 Comparison of parametric bootstrapped and probability distribution parameters for north central Nigeria

Figures 7 and 8 depict the plots of the estimated quantiles for 5, 10, 25, 50, 100, 200, 500 and 1000-year return periods based on the quantile function of the bootstrapped samples and parameter values of the best-fitted distributions for each station. Both Figs. 7 and 8 show that for each of the stations, the estimated quantiles for return periods were within the confidence interval of the bootstrapped sample. However, the larger spread of the quantiles of the bootstrapped percentiles (at 95% confidence intervals) at the right tails shows that the estimated parameters of the established models are more uncertain between 100- and 1000-year return periods. This is in agreement with Mamoon and Rahman (2019) who reported that rainfall intensity estimates of 100-year return period were more susceptible to uncertainty as a result of larger confidence band compared to lower return period. Furthermore, Overeem et al. (2008), Hu et al. (2013) and Huang et al. (2016) established that uncertainties of rainfall estimates increase substantially with increase in length of return period, despite using a long record of rainfall data, indicating that increasing the length of rainfall records does not guarantee certainty of records (Chuah et al. 2017). This further reinstates that practitioners should always ascertain the credibility of rainfall estimates and not to assume that continuous data collection to increase the length of rainfall records would reduce uncertainty.

Fig. 7
figure 7

Estimation of design storm of different return periods by bootstrap method at 95% confidence interval and by most suitable probability distributions for (a) Lokoja, (b) Bida, (c) Makurdi and (d) Minna. The plots emphasized that uncertainty becomes larger with an increase in return periods from 100 years above

Fig. 8
figure 8

Estimation of design storm of different return period by bootstrap method at 95% confidence interval and by most suitable probability distributions for (a) Yelwa, (b) Abuja, (c) Jos and (d) Ilorin. The plots emphasized that uncertainty becomes larger with increase in return periods from 100 years above

3.7 Variability and uncertainty analysis of simulated extreme rainfall estimates

The variability of regional extreme rains and their uncertainty within the 95% credible interval based on the Monte Carlo simulation is presented in Table 6. The table shows that the variability dimension is evaluated based on nine statistics, which include the mean, standard deviation, minimum, 2.5th percentile, 25th percentile, 50th percentile, 75th percentile, 97.5th percentile and maximum, while the uncertainty dimension of the median of the estimators was evaluated for 2.5% and 97.5% values (95% confidence interval). The table shows that the expected extreme rainfall at Lokoja is 79.6 mm and has a 95% chance of falling within 74.3 mm and 85.3 mm. Extreme rainfall at Bida has an expected value of 79.0 mm and has a 95% chance of falling within 71.6 mm and 87.0 mm; extreme rainfall at Makurdi station with an expected value of 84.3 mm has a 95% chance of falling within 75.9 mm and 93.4 mm, while Minna has an expected value of 80.2 mm and a 95% chance of falling within 74.4 mm and 86.0 mm. Table 6 further shows that the expected extreme rains at Yelwa could be up to 70.3 mm with a 95% chance of falling within 64.1 mm and 76.1 mm; the expected extreme rains at Abuja could be up to 93.4 mm with a 95% chance of falling within 81.3 mm and 107.2 mm; the extreme rainfall at Jos has an expected value of 66.2 mm and has a 95% chance of falling within 61.3 mm and 71.5 mm, while Ilorin station has an expected extreme rain of 87.4 mm and a 95% chance of falling within 79.4 mm and 96.3 mm. The median of the simulated statistics for all stations was within the 95% credible level as shown in Table 6. Depending on the interest of the risk assessor, the estimates of 2.5th to 97.5th percentiles of extreme rainfall with their uncertainty at 95% confidence level in the study area may be abstracted from Table 6 for designing hydraulic works in the study area.

Table 6 Variability and uncertainty of regional extreme rains based on Monte Carlo simulation

Figure 9(a–d) and Fig. 10(a–d) show the variability of cumulative distribution plots with uncertainty indications of extreme rains in the study area. It could be observed that the figures show the credibility bands on each quantile of the variability of peak rainfall at 50% CI (gray band) and 95% (light gray band) in the study region. A close observation of the plots reveals that at each station the extent of credibility varied for each quantile of rainfall variability. For example, extreme rainfall of 88.8 mm–142.0 mm at Lokoja (Fig. 9a), 90.7 mm–125.0 mm at Bida (Fig. 9b) and 97.2 mm–137.0 mm at Makurdi (Fig. 9c), respectively, representing the 75th to 97.5th percentile is more unreliable, whereas at Minna (Fig. 9d), 2.5% percentile, as well as the 75th to 97.5th percentile, were found to have high uncertainty (47.0 mm, 91.4 mm and 107.5 mm, respectively). In addition, Fig. 10(a-d) shows that at Yelwa station, peak rainfall estimates between 50th and 75th percentile of values of 71.0–82.2 mm are more credible than other percentiles, at Abuja station peak rainfall estimates between 2.5th and 50th percentile of values of 43.2–87.4 mm are more credible than other percentiles, at Jos station peak rainfall estimates between 2.5th and 75th percentile of values of 43.3 mm–74.5 mm are more credible than other percentiles, while at Ilorin station, peak rainfall estimate of 84.7–100.6 mm, respectively, for 75th–97.5th percentile were found to be less credible. These imply that the extent of the credibility of quantitative estimates of extreme rains leading to flooding is not equal for each variability indicator across the north central region of Nigeria. Hence, to optimize informed decision-making regarding flood risk reduction by risk assessor, variability and uncertainty of design rainfall estimates should be assessed spatially to minimize erroneous deductions.

Fig. 9
figure 9

Variability cumulative distribution plots with uncertainty indications based on Monte Carlo simulations of extreme rains: (a) Lokoja, (b) Bida, (c) Makurdi and (d) Minna stations

Fig. 10
figure 10

Variability cumulative distribution plots with uncertainty indications based on Monte Carlo simulations of extreme rains: (a) Yelwa, (b) Abuja, (c) Jos and (d) Ilorin stations

The median of the simulated statistics for all stations was within the 95% credible level implying that that quantitative estimates of extreme rains leading to flooding are reliable for the optimization of informed decision-making regarding flood risk reduction in north central Nigeria. However, these credibility intervals can be inferred as a degree of the precision of expected daily instantaneous peak rainfall in the study area. The extent of the credibility of quantitative estimates of extreme rains leading to flooding was found to vary for each variability indicator across the study area. Hence, flood risk managers are encouraged to not only estimate design storms but also assess their variability and uncertainty to minimize erroneous deductions. The framework proposed in the study could be adapted for any region of the world, and every stage could be implemented by integrating two packages of R computing environment (R Core Team 2019) such as the “mc2d” and “fitdisrtplus” which are available at Comprehensive R Archive Network (CRAN, http://cran.r-project.org) for free.

Furthermore, Figs. 11,12,13 demonstrate the spatial interpolation of design storms across north central Nigeria. The results presented in this study are useful for regional flood risk and water resources management of north central Nigeria. This research could also be useful by allowing researchers to better understand the hydrological phenomenon of the region of study as well as for setting plans for future actions. The probabilistic analysis presented in this study may be useful to risk assessors and decision-makers by providing quantitative insights about the credibility of extreme rains and design storms when dealing with flood risk reduction concerns.

Fig. 11
figure 11

Spatial prediction of heavy rains for different return periods within north central Nigeria for a 2-year, b 5-year, c 10-year and d 20-year return periods. The southern region of the study area is more susceptible to flood risk from heavy rains

Fig. 12
figure 12

Spatial prediction of heavy rains for different return periods within north central Nigeria for a 50-year, b 100-year, c 200-year and d 500-year return periods. The southern region of the study area is more susceptible to flood risk from heavy rains

Fig. 13
figure 13

Spatial prediction of heavy rains for the 1000-year return period within north central Nigeria

4 Conclusions

In many regions of the world, rainfall depths and patterns differ significantly, whether on a global, regional or local scale and significantly increase risks of extreme events such as floods and droughts. One of the major challenges of informed decision-making in flood risk management is the lack of adequate empirical data, which encourages the practice of fitting the empirical distributions of hydrologic frequencies with probability distributions. In order to solve these challenges, the present study has developed a simple five-stage methodological framework. The stages of the framework include preliminary assessment of annual maximum rainfall series, performance evaluation of selected theoretical distributions, credibility assessment of model parameters, credibility assessment of suitable model quantiles and propagation of variability and uncertainty of design rainfall estimates. In particular, different statistical tools for credibility assessment comprising of four probability models, the parametric bootstrap and the two-dimensional Monte Carlo (2D-MC) simulation framework were integrated in the developed framework.

The facts that emerged from this research are that credibility assessment has been largely ignored in Nigeria; no single probability distribution is sufficient to fit the whole north central region of Nigeria; and estimated parameters of selected probability distributions are reliable based on the bootstrap method. The reliability of design storms for hydraulic engineering constructions reduces from 100–1000 year return periods and that the quantitative estimates of extreme rains leading to flooding vary spatially across the region of study in terms of the credibility for all variability indicators.

The framework presented in this study is expected to encourage future researchers to assess the credibility of estimates generated from hydrologic frequency studies. Majority of the hydrologic frequency investigations in Nigeria have disregarded uncertainty analysis of parameters and quantiles of suitable probabilistic models. There is relatively no study that extensively propagated the variability and uncertainty of design rainfall estimates separately across of large region.