1 Introduction

Probability distributions are statistical tools frequently used to describe the likelihood of stochastic variables such as rainfall, and hence to estimate and predict them. In hydro-meteorological prediction, rainfall is one of the most critical variables because its distribution is skewed and, therefore, does not follow a uniform trend. Probability and frequency analysis of rainfall data nevertheless helps in determining the expected rainfall occurrence at various levels of confidence (Bhakar et al. 2008; Singh et al. 2012) and, thus, in better understanding the spatio-temporal pattern of rainfall distribution. Scientific prediction of rainfall occurrence using probability distribution models, with crop planning based on it, is a significant tool for sustaining the food and livelihood security of resource-poor farmers (Panigrahi and Panda 2001; Singh et al. 2005; Suhaila and Jemain 2007; Bhakar et al. 2008; Zaw and Naing 2008; Roman et al. 2012). However, the probability levels as well as the best-fit models for predicting rainfall over different return periods vary from place to place. As a result, several researchers have attempted over the years to select location-specific best-fit models and to estimate the desired amount of rainfall for different return periods at various probability levels (Mehta et al. 2002; Salauddin and Yusuf 2008; Nemichandrappa et al. 2010; Sharma and Singh 2010; Liang et al. 2012; Roman et al. 2012; Singh et al. 2012; Olumide et al. 2013). Therefore, it is imperative to understand the stochastic process of precipitation using statistical models (Acreman 1990; Katz and Parlange 1996; Liang et al. 2012). Estimation and prediction of annual maximum daily rainfall (AMDR), seasonal maximum daily rainfall (SMDR) and monthly maximum daily rainfall (MMDR) would enhance the management of water resources, including their effective utilization. 
Therefore, probability analysis of rainfall is necessary for managing rainfall anomaly (deficit or excess)-induced intermittent water stress and the risk of crop failure, particularly in rainfed agriculture (Panigrahi and Panda 2001; Singh et al. 2012; Mazandarani et al. 2013).

Sagar Island of India, by virtue of its geographic location in the delta region of the Bay of Bengal, is one of the islands most vulnerable to climate change, particularly rainfall anomalies (Mandal et al. 2013). Extreme climate-driven threats, including tidal surges, inundation with saline seawater, increasing incidence of severe cyclonic storms, floods, droughts and water scarcity, have taken a toll on the environmental security of the island (Bandyopadhyay 1997; Gopinath 2010; Majumdar and Das 2011). Sole dependence on erratically distributed monsoon rainfall poses a serious threat to the predominantly paddy-based rainfed agricultural production vis-à-vis food grain and livelihood security of the rapidly growing population of the island (Mandal et al. 2013). Boojh (2008) projected that by 2030, abrupt climate change and its secondary consequences such as sea level rise, floods and permanent submergence of land may cause permanent habitat loss for more than 70,000 people in the region, including the Sundarbans of the Bay of Bengal. Although several researchers (Bandyopadhyay 1997; Gopinath 2010; Majumdar and Das 2011) have studied the island in the context of coastal erosion, cyclones, tidal ingression, sea level rise, etc., there is no comprehensive study on probability analysis of rainfall anomalies (intensity, frequency, amount and distribution). Recognizing the need for a better understanding of monsoon rainfall behaviour, the present study aimed to identify the most reliable probability distribution function(s) and best-fit models for annual, seasonal and monthly maximum daily rainfall (MDR) at Sagar Island. We also estimated rainfall for various recurrence intervals and predicted the probability of occurrence of minimum rainfall amounts for sustainable management of rainfed agricultural (lowland rice) production in the island. 
The study would enable a better understanding of the rainfall distribution pattern in the present context of climate change, population pressure, crop production and food security in Sagar Island.

2 Materials and methods

2.1 Study area and data collection

The study area (Sagar Island) extends from 21° 37′ 20″ N to 21° 52′ 28″ N latitude and 88° 2′ 17″ E to 88° 10′ 25″ E longitude, with an elevation of 2.5–3.5 m above mean sea level, and falls under the ‘coastal saline’ agro-climatic zone of West Bengal, India (Fig. 1). It is the largest island in the Sundarbans deltaic complex, situated on the continental shelf of the Bay of Bengal. The total area of the island is 28,211.4 ha, of which only 55 % is available for cultivation, and lowland rice alone occupies 97.4 % of the total cultivated area. Rainfed rice occupies more than 80 % of the area, while only 19.8 % is under irrigated rice; the sources of irrigation are farm ponds and tanks (Anonymous 2010–2011). In the present study, long-period (1982–2010) daily rainfall data were collected from the lone meteorological observatory of the island, maintained by the India Meteorological Department (IMD), Government of India (Sagar IMD, station index 42903, ~3 m above mean sea level) (Fig. 1).

Fig. 1 Administrative boundary map of West Bengal and location of study site (Sagar Island)

Fig. 2 Rainfall extremities comparison with mean monthly rainfall at Sagar Island (1982–2010)

2.2 Selection of the probability distribution models

The study period was divided into annual, pre-monsoon (MAM, March–May), monsoon (JJAS, June–September), post-monsoon (ONDJF, October–February), summer (AMJJASO, April–October) and winter (NDJFM, November–March) seasons, and into individual months, for probability analysis of AMDR, SMDR and MMDR. For convenience, we considered October to February as the post-monsoon months (Choudhury et al. 2012). Six families of probability distribution, viz. Normal, Lognormal, Gamma, Weibull, Pearson and Generalized extreme value, were selected to fit the AMDR, SMDR and MMDR data. Different forms of these families, namely Normal, Lognormal (2P, 3P), Gamma (2P, 3P), Generalized gamma (3P, 4P), Log gamma, Weibull (2P, 3P), Pearson 5 (2P, 3P), Pearson 6 (3P, 4P), Log Pearson 3 and Generalized extreme value, were fitted to the MDR data; altogether, 16 probability distributions were tested to find the best-fit models for predicting AMDR, SMDR and MMDR. Probability distribution analysis was carried out in accordance with standard procedures (Olofintoye et al. 2009; Nemichandrappa et al. 2010; Sharma and Singh 2010; Roman et al. 2012; Olumide et al. 2013). The model parameters μ and σ represent the mean and standard deviation for the normal distribution and the scale and shape parameters for the lognormal distribution, while α (α1, α2), β and σ indicate the shape, scale and location parameters for the rest of the distributions (Table 3). In generalized gamma (3P, 4P) and in generalized extreme value, the shape and location parameters were denoted by μ and k, respectively. The shape parameter (α) is a measure of the skewness of the distribution. As the value of α increases, the distribution curve becomes more symmetrical. When α < 1, the curve has a reverse ‘J’ shape and the probability density is at its maximum when x ~ 0. The scale parameter (β) is a measure of the steepness of the distribution. 
The smaller the value of β, the more positively skewed the curve becomes (Liang et al. 2012).
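The effect of the shape parameter can be illustrated with the two-parameter gamma density, whose skewness equals 2/√α. The following is a minimal sketch and not the fitting procedure used in the study; the function names `gamma_pdf` and `gamma_skewness` and all parameter values are assumptions chosen for illustration.

```python
import math

def gamma_pdf(x, alpha, beta):
    """Density of the two-parameter gamma distribution (shape alpha, scale beta)."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta ** alpha)

def gamma_skewness(alpha):
    """Skewness of the two-parameter gamma distribution: 2 / sqrt(alpha)."""
    return 2.0 / math.sqrt(alpha)

xs = [1, 5, 10, 20, 40]  # illustrative rainfall depths (mm), hypothetical

# alpha < 1: the density falls monotonically from x ~ 0 (reverse 'J' shape)
reverse_j = [gamma_pdf(x, 0.5, 20.0) for x in xs]

# alpha > 1: the density rises towards a mode at (alpha - 1) * beta = 40 mm
humped = [gamma_pdf(x, 3.0, 20.0) for x in xs]

for alpha in (0.5, 3.0, 30.0):
    print(f"alpha = {alpha:5.1f}  skewness = {gamma_skewness(alpha):.2f}")
```

The printed skewness shrinks as α grows, which is the sense in which a larger shape parameter gives a more symmetrical curve.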

2.3 Testing of the probability distribution models

The acceptability and reliability of the fitted models were tested by goodness-of-fit tests, namely the Kolmogorov–Smirnov test (K-S), the Anderson–Darling test (A²) and the Chi-square test (X²). The K-S statistic is the largest vertical difference between the theoretical and the empirical cumulative distribution function (ECDF), while A² compares the fit of an observed cumulative distribution function (CDF) to an expected CDF and gives more weight to the tails than the K-S test (Sharma and Singh 2010). The Chi-square test (X²) compares the observed frequency (f0) of different classes with the expected frequency (fc) (Das 2007). The statistical tests were carried out in accordance with standard procedures described by previous researchers (Husak et al. 2007; Olofintoye et al. 2009; Sharma and Singh 2010; Roman et al. 2012). The goodness-of-fit tests were applied to test the null hypothesis H0: the daily maximum rainfall data follow the specified distribution, against the alternative Ha: the daily maximum rainfall data do not follow the specified distribution. The three tests (K-S, A² and X²) were applied at the α = 0.01 level of significance for selecting the best-fit probability distribution.

  1. (i)

    The Kolmogorov–Smirnov statistic (K-S) is defined as the largest vertical difference between the theoretical and the empirical cumulative distribution function (ECDF):

    $$ D = \max_{1\le i\le n}\left\{ F\left({x}_i\right)-\frac{i-1}{n},\ \frac{i}{n}- F\left({x}_i\right)\right\} $$
    (1)

    where x_i is the i-th value of the random sample arranged in increasing order, i = 1, 2, …, n. The empirical distribution function F_n for n iid observations X_i is defined as

    $$ \mathrm{CDF} = F_n(x) = \frac{1}{n}\sum_{i=1}^{n} I_{X_i\le x} $$
    (2)

    where I_{X_i ≤ x} is the indicator function, equal to 1 if X_i ≤ x and equal to 0 otherwise. This test is used to decide whether a sample comes from a hypothesized continuous distribution (Azumi et al. 2010; Sharma and Singh 2010; Salarpour et al. 2012).

  2. (ii)

    The Anderson–Darling statistic (A²) is defined as

    $$ {A}^2 = - n - S,\ \mathrm{where}\ S = \sum_{k=1}^{n}\frac{2 k-1}{n}\left[\ln F\left({Y}_k\right)+\ln \left\{1- F\left({Y}_{n+1- k}\right)\right\}\right] $$
    (3)

    where Y_1 ≤ Y_2 ≤ … ≤ Y_n are the ordered observations and F is the CDF of the distribution being tested. The test statistic can then be compared against the critical values of the theoretical distribution. Unlike the K-S test, this test gives more weight to the tails (Sharma and Singh 2010; Roman et al. 2012; Khudri and Sadia 2013).

  3. (iii)

    The value of the Chi-squared statistic (X²) is

    $$ {X}^2=\sum_{i=1}^{k}\frac{{\left({O}_i-{E}_i\right)}^2}{E_i} $$
    (4)

where X² is Pearson's cumulative test statistic, which asymptotically approaches a chi-squared distribution; O_i is the observed frequency in bin i; E_i is the expected (theoretical) frequency in bin i, asserted by the null hypothesis; and i is the bin index (1, 2, …, k). E_i is calculated by

$$ E_i= F\left({x}_2\right)- F\left({x}_1\right) $$
(5)

where F is the CDF of the probability distribution being tested, and x_1 and x_2 are the lower and upper limits of bin i. The number of bins (k) is computed from the equation given below

$$ k= 1+{\log}_2 n $$
(6)

where n is the sample size (Azumi et al. 2010; Salarpour et al. 2012; Khudri and Sadia 2013).
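The three statistics in Eqs. 1 to 6 can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions, not the software used in the study: the sample values are hypothetical, the candidate distribution is a Normal fitted by sample moments, the bins are of equal width with k truncated to an integer, and expected counts are taken as n times the bin probability F(x_2) − F(x_1).

```python
import math
from statistics import NormalDist

def ks_statistic(sample, cdf):
    """Eq. 1: largest vertical gap between the fitted CDF and the ECDF."""
    xs = sorted(sample)
    n = len(xs)
    return max(
        max(cdf(x) - (i - 1) / n, i / n - cdf(x))
        for i, x in enumerate(xs, start=1)
    )

def anderson_darling(sample, cdf):
    """Eq. 3: A^2 = -n - S on the ordered sample Y_1 <= ... <= Y_n."""
    ys = sorted(sample)
    n = len(ys)
    s = sum(
        (2 * k - 1) / n * (math.log(cdf(ys[k - 1])) + math.log(1 - cdf(ys[n - k])))
        for k in range(1, n + 1)
    )
    return -n - s

def chi_squared(sample, cdf):
    """Eqs. 4-6: Pearson statistic over k = 1 + log2(n) equal-width bins."""
    n = len(sample)
    k = int(1 + math.log2(n))  # ASSUMPTION: k truncated to an integer
    lo, hi = min(sample), max(sample)
    width = (hi - lo) / k
    stat = 0.0
    for j in range(k):
        x1, x2 = lo + j * width, lo + (j + 1) * width
        last = j == k - 1  # the last bin also includes the sample maximum
        observed = sum(1 for x in sample if x1 <= x < x2 or (last and x == hi))
        expected = n * (cdf(x2) - cdf(x1))  # ASSUMPTION: expected count, not probability
        stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical annual maxima (mm), for illustration only; not the station record
sample = [74, 98, 110, 122, 130, 139, 146, 155, 168, 181, 205, 254]
fitted = NormalDist.from_samples(sample)  # candidate distribution: Normal

d = ks_statistic(sample, fitted.cdf)
a2 = anderson_darling(sample, fitted.cdf)
x2 = chi_squared(sample, fitted.cdf)
print(f"K-S = {d:.3f}, A2 = {a2:.3f}, X2 = {x2:.3f}")
```

Each function accepts any callable CDF, so the same code could be reused for other candidate distributions once their CDFs are available; smaller statistics indicate a closer fit.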

2.4 Identification of best-fitted probability models

The best-fit distribution model for each dataset was determined from the total score obtained in the three goodness-of-fit tests. For each test, the probability distributions were ranked from 1 to 16 on the basis of the minimum test statistic. For a particular distribution, the total score was obtained by summing its scores over the three goodness-of-fit tests. The probability model with the highest score was considered the best-fitted distribution model for a particular time series. Similarly, the model with the second highest score was considered the second best-fit model (Sharma and Singh 2010; Olofintoye et al. 2009). The coefficient of determination (R²) was also calculated to justify the reliability of the best-fitted models.
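The scoring scheme can be sketched as follows. The mapping from rank to score is an assumption made for illustration (the smallest statistic earns the maximum score, the largest earns 1), ties are not handled, and the test statistics are hypothetical values for four candidates rather than figures from Table 2.

```python
def total_scores(stats_by_test):
    """stats_by_test: {test name: {distribution name: test statistic}}."""
    totals = {}
    for stats in stats_by_test.values():
        ordered = sorted(stats, key=stats.get)  # smallest statistic first (rank 1)
        m = len(ordered)
        for rank, name in enumerate(ordered, start=1):
            # ASSUMPTION: rank 1 earns the maximum score m, rank m earns 1
            totals[name] = totals.get(name, 0) + (m + 1 - rank)
    return totals

# Hypothetical statistics for four candidate distributions (not from Table 2)
stats = {
    "K-S": {"Normal": 0.10, "Weibull": 0.12, "Gamma": 0.15, "Lognormal": 0.11},
    "A2": {"Normal": 0.40, "Weibull": 0.45, "Gamma": 0.90, "Lognormal": 0.60},
    "X2": {"Normal": 1.20, "Weibull": 0.90, "Gamma": 2.50, "Lognormal": 1.50},
}
scores = total_scores(stats)
best, second = sorted(scores, key=scores.get, reverse=True)[:2]
print(best, second, scores)
```

With 16 candidates and three tests, the same function would yield totals between 3 and 48 under this assumed mapping.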

2.5 Estimation and prediction of MDR

With the help of the best-fitted probability distribution models, the MDR was estimated for different return periods, and the probability (%) of receiving a desired amount of MDR was projected for annual, seasonal and monthly rainfall at Sagar Island. The probability level (%) was categorized into low (<30 %), moderate (30–70 %) and high (>70 %). Kurtosis of the distribution was classified as normal or mesokurtic (Ku = 0.263), platykurtic (Ku > 0.263) and leptokurtic (Ku < 0.263) (Das 2007). Similarly, skewness was classified as approximately symmetrical (−0.5 to +0.5), moderately skewed (−1 to −0.5 or +0.5 to +1) and highly skewed (< −1 or > +1) (Bulner 1979).
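The estimation step can be sketched for a Normal model as below. The Normal parameters are hypothetical, and the return level uses the conventional definition P(X > x_T) = 1/T, which is an assumption here and may differ from the procedure behind the tabulated values of this study; the two classifiers follow the probability and skewness classes defined in this section.

```python
from statistics import NormalDist

def exceedance_probability(dist, threshold_mm):
    """P(MDR > threshold), in percent, under the fitted distribution."""
    return 100.0 * (1.0 - dist.cdf(threshold_mm))

def return_level(dist, return_period_years):
    """Depth exceeded on average once in T years: P(X > x_T) = 1 / T."""
    return dist.inv_cdf(1.0 - 1.0 / return_period_years)

def probability_class(p_percent):
    """Low (<30 %), moderate (30-70 %) or high (>70 %)."""
    if p_percent < 30:
        return "low"
    if p_percent <= 70:
        return "moderate"
    return "high"

def skewness_class(sk):
    """Approximately symmetrical, moderately skewed or highly skewed."""
    a = abs(sk)
    if a <= 0.5:
        return "approximately symmetrical"
    if a <= 1.0:
        return "moderately skewed"
    return "highly skewed"

# ASSUMPTION: hypothetical Normal parameters (mm), not the fitted values of Table 3
dist = NormalDist(mu=100.0, sigma=20.0)

for threshold in (50, 100, 150):
    p = exceedance_probability(dist, threshold)
    print(f"P(MDR > {threshold} mm) = {p:.1f} % ({probability_class(p)})")
for years in (2, 5, 10, 25):
    print(f"{years}-year return level = {return_level(dist, years):.1f} mm")
```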

3 Results and discussion

3.1 Probability analysis of annual maximum daily rainfall

Descriptive statistics revealed that the island received an average AMDR of 145.6 (±46.5) mm over a span of 29 years (1982–2010) with 32 % variability. The highest AMDR (254.4 mm) was observed in June 1988 and the lowest (73.5 mm) in August 2010 (Table 1, Fig. 2). About 52 % of the observations (15 out of 29 years) exceeded the long period average (LPA, 1982–2010) AMDR (145.6 ± 46.5 mm), and 67 % of these (about 10 years) fell in the first two decades (1982–1999). The occurrence of AMDR, hence, exhibited an irregular decreasing trend, which was supported by the moderately positive skewness (+0.5 < 0.64 < +1.0) of the distribution (Table 1). The excess kurtosis (Ku = −0.04) and the lower concentration of data around the mean (145.6 mm) indicated higher variation of AMDR and a platykurtic distribution. The unevenness of the AMDR distribution was also affirmed by the standard error (SE = 8.64 mm) and standard deviation (SD = 46.5 mm). Among the fitted probability models, generalized gamma (4P) and lognormal (3P) were identified as reliable distributions considering their individual goodness-of-fit tests, with test statistics of 0.112 (K-S), 4.147 (A²) and 0.353 (X²), respectively (Table 2). Combining the three goodness-of-fit tests, the Normal model (score = 36) was selected as the best-fitted model, followed by Weibull (score = 34) as the second best-fitted model for AMDR data (Table 3). Our finding of the normal distribution as the best-fitted model for AMDR data at Sagar Island corroborates the observations reported by other workers for the comparable plain areas of Kharagpur, West Bengal, in eastern India (Panigrahi and Panda 2001) and Banswara in Northwest India (Bhakar et al. 2006). 
However, there are reports that, on shifting from plain to hilly tracts in India, the lognormal distribution fitted AMDR data best for Nilgiri in south India (Sulochanan and Koteeswarn 2000) and Pantnagar in Northwest India (Sharma and Singh 2010).
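The variability and standard error quoted for AMDR follow directly from the reported mean, standard deviation and record length (n = 29), assuming the usual definitions of the coefficient of variation and the standard error of the mean:

$$ \mathrm{CV}=\frac{\mathrm{SD}}{\overline{x}}\times 100=\frac{46.5}{145.6}\times 100\approx 32\ \%,\qquad \mathrm{SE}=\frac{\mathrm{SD}}{\sqrt{n}}=\frac{46.5}{\sqrt{29}}\approx 8.6\ \mathrm{mm} $$

The small difference from the reported SE of 8.64 mm is consistent with rounding of the standard deviation.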

Table 1 Descriptive statistics for annual, seasonal and monthly maximum daily rainfall during the periods of 1982-2010 at Sagar Island
Table 2 Distribution of first rank probability functions in three goodness of fit test for annual, seasonal and monthly maximum daily rainfall at Sagar Island
Table 3 Best fit probability models with parameters for annual, seasonal and monthly maximum daily rainfall at Sagar Island

Estimation of the expected AMDR for different return periods, and of the probability levels (%) of receiving a desired amount of AMDR, using the Normal probability distribution model revealed that the probability of getting AMDR >50 mm was very high (99 %), while the probability of getting AMDR >250 mm was only 3 %. Observed data also corroborated that, in the last three decades (1982–2010), AMDR of >250 mm occurred only once (254.4 mm in 1988). Similarly, the probability of getting >150 mm daily rainfall reduced to 40 %, while for >200 mm it decreased to 12 %. Higher probability (>70 %) was observed for lesser AMDR of >50 to <150 mm, while moderate probability (30–70 %) was observed for AMDR >150 mm. Very low probability of occurrence (<30 %) was observed for AMDR of >200–250 mm (Table 4). Panigrahi and Panda (2001) also reported higher probability (95 %) for the occurrence of lesser AMDR (48.1 mm) in the Kharagpur region of West Bengal. Hence, the moderate probability level for AMDR of >150 to <200 mm assures the continuation of cultivation practices for rainfed agriculture during the wet months (JJAS) in the island. Therefore, probability analysis of rainfall offers better scope for predicting the minimum assured rainfall to help crop planning in rainfed regions (Panigrahi and Panda 2001). A maximum AMDR of 50 mm was expected for a 2-year return period, while 114 mm was expected at least once in 25 years (Table 4). Our findings are comparable to the reported AMDR values of 59.5 mm for a 2-year and 123.8 mm for a 25-year return period at Fatehabad, India (Maurya et al. 2013). Similarly, Manikandan et al. (2011) reported AMDR of 77.8 mm for a 2-year and 127.80 mm for a 25-year return period in Tamil Nadu, India. The estimated AMDR for different return periods (Fig. 3) showed that a logarithmic trend gave a high coefficient of determination (R² = 0.99), which affirmed the suitable fit of the estimated rainfall for different return periods vis-à-vis the most appropriate fit of the distribution function.
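A logarithmic trend of the form y = a ln(T) + b and its coefficient of determination can be obtained by ordinary least squares on ln(T). The sketch below assumes this functional form and uses synthetic values generated from a known curve so that the fit can be checked; the function name `fit_logarithmic` and the data are illustrative and are not the estimates of Table 4.

```python
import math

def fit_logarithmic(return_periods, rainfall_mm):
    """Least-squares fit of y = a * ln(T) + b; returns (a, b, r_squared)."""
    xs = [math.log(t) for t in return_periods]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(rainfall_mm) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, rainfall_mm))
    a = sxy / sxx
    b = mean_y - a * mean_x
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, rainfall_mm))
    ss_tot = sum((y - mean_y) ** 2 for y in rainfall_mm)
    return a, b, 1.0 - ss_res / ss_tot

# Synthetic data from y = 20 ln(T) + 40 (hypothetical curve, for checking only)
periods = [2, 5, 10, 25, 50, 100]
depths = [20.0 * math.log(t) + 40.0 for t in periods]

a, b, r2 = fit_logarithmic(periods, depths)
print(f"y = {a:.2f} ln(T) + {b:.2f}, R2 = {r2:.3f}")
```

Because the synthetic data lie exactly on the curve, the fit recovers a = 20, b = 40 and R² = 1; real estimates would scatter around the curve and give R² below 1.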

Table 4 Estimated rainfall for various return periods and expected probabilities for annual and seasonal maximum daily rainfall at Sagar Island (ODMR, one day maximum rainfall)
Fig. 3 Logarithmic curve fitting on estimated annual maximum daily rainfall against return periods

3.2 Probability analysis of seasonal maximum daily rainfall

MDR during the pre-monsoon season (MAM, March–May) varied widely (CV = 54 %), from the lowest of 0.06 mm (in 1999) to 105.4 mm (in 1995), with a long period mean of 46.1 (±25) mm. During the monsoon season (JJAS, June–September), MDR showed wide inter-annual variation (CV = 34 %) and ranged from 56.2 mm (in 2008) to as high as 254.4 mm (in 1988), with a mean value of 135.2 (±45.7) mm (Table 1). Similarly, the long period average MDR received during the post-monsoon (ONDJF, October–February), summer (AMJJASO, April–October) and winter (NDJFM, November–March) seasons was 84 (±54.8) mm, 145.6 (±46.5) mm and 39.8 (±25.7) mm, respectively. Compared to other seasons, the pre-monsoon and summer seasons exhibited relatively higher uniformity in both amount and distribution of MDR, which was evident from the platykurtic distribution pattern (Ku = −0.06 and Ku = −0.04) and moderate variability (CV = 32–54 %) in MDR distribution. All the seasons exhibited moderately (SK = 0.64 to 0.73) to strongly (SK = 1.26 > +1) positively skewed distributions (Table 1). Higher MDR during the monsoon season was probably due to the arrival of the southwest monsoon with higher uniformity and consistency. Therefore, rainfed rice (Aman) cultivation during the rainy season has a better chance of water availability than cultivation in the winter season, when MDR is lower and supplemental irrigation will be a prerequisite for Rabi crop production in the island.

The best-fitted probability models for pre-monsoon, monsoon, post-monsoon, summer and winter seasons were Lognormal, Weibull, Normal, Normal and Pearson 5, respectively. The corresponding coefficients of determination (R²) for the five models were 0.96, 0.96, 0.98, 0.97 and 0.63, respectively (Fig. 4). However, in the hilly tracts of the Northwestern Himalayan Region of India (Pantnagar), Sharma and Singh (2010) reported Gamma 3P, Generalized extreme value and Log gamma distributions as the best-fitted probability models for pre-monsoon, monsoon and post-monsoon MDR, respectively. This deviation from our findings might be due to topographic variation (island at 3 m above msl vs. hilly tracts of Pantnagar located at 300–400 m above msl) as well as the wide variability in intra-annual rainfall distribution across India (Mall et al. 2006). The probability of receiving a minimum assured SMDR of >50 mm was 44, 97, 73, 98 and 100 % for pre-monsoon, monsoon, post-monsoon, summer and winter seasons, respectively (Table 4). Similarly, for MDR of >100 mm, all the seasons except pre-monsoon exhibited moderate to high probability (38–85 %) of occurrence, while the pre-monsoon season was restricted to 2 % probability. For MDR of >250 mm, all the seasons exhibited negligible probabilities (0–3 %). Like the annual, monsoon and summer series, the winter season also exhibited a relatively high probability (40 %) of receiving a minimum rainfall of >150 mm. This unusual behaviour might be due to the deep depressions that often form in the Bay of Bengal during October and November (locally termed Ashinerjar), which cause sudden heavy downpours. Aman rice is normally harvested in October and November, and the crop is often damaged by these downpours. 
Therefore, the probability analysis indicates the need for rescheduling agronomic management practices and for proper structural measures to reduce the risk of damage from flooding and subsequent waterlogging during this season.

Fig. 4 Logarithmic curve fitting on estimated seasonal maximum daily rainfall against return periods

MDR of 29, 102, 47, 114 and 6.1 mm was expected once in a 25-year return period for pre-monsoon, monsoon, post-monsoon, summer and winter seasons, respectively. However, for a 2-year return period, it decreased to 1, 48, 50 and 1.5 mm in pre-monsoon, monsoon, summer and winter seasons, while the post-monsoon season did not show any probability of occurrence (Table 4, Fig. 4). Estimated MDR for all return periods was higher in the summer season than in the winter season. As a result, the winter season is considered drier and the summer season wetter in the island. The highest probability of MDR was estimated for the summer season, since the arrival of the southwest monsoon with heavy rainfall coincides with this period in the island.

3.3 Probability analysis of monthly maximum daily rainfall

During the study period, the highest mean MMDR was observed in July (95.6 ± 48.1 mm) and the lowest in December (3.5 ± 7.9 mm). Relatively less variability (CV = 46 to 62 %) was observed during the monsoon months compared to the rest of the seasons/months (CV = 59 to 230 %) (Table 1). Mean monthly rainfall during June, July, August and September was 251.2, 372, 303.6 and 344.3 mm, respectively, while the mean MDR for the corresponding months was 81.1, 95.6, 82.7 and 90.4 mm, respectively (Figs. 6 and 7). Sharma and Singh (2010) also reported comparable mean MDR of 50.64, 101.84, 84.64 and 79.64 mm for June, July, August and September, respectively, for Northwest India (Pantnagar). In the island, temperature variation is not a limiting factor for crop growth; rather, it is the rainfall amount and its intra-annual distribution (Mandal et al. 2013) that mostly influence the choice of cropping pattern/sequence, duration of crop season and crop intensification. As a result, farmers in the island are left with few options for crop area diversification and intensification. Criteria proposed by the International Rice Research Institute and adopted by FAO (1977) state that monthly precipitation should be at least 200 mm for three consecutive wet months to allow cultivation of lowland wet rice (Oldeman and Frere 1982). Based on these criteria, the present study revealed that the monsoon months could be considered wet months, favourable for transplanted lowland Aman rice cultivation, since the crop water requirement of 7 mm day⁻¹ or 50 mm week⁻¹ (Singh and Singh 2000; Choudhury et al. 2013) can be met from the monsoon rainfall of 65.5 mm week⁻¹ during mid-July to mid-November (Aman growing period) in the island. Without supplemental irrigation, the rainfall amount and distribution during pre- and post-monsoon months are not sufficient to support Aus and Boro rice cultivation in the island. 
Therefore, alternative sources of water supply for irrigation are very much needed for crop intensification in the island. The probability of receiving even >25 mm of rainfall during December to April was very low (<30 %) (Figs. 5 and 7). On the other hand, a very high probability (>70 %) of receiving more than 50 mm MDR was observed during June to September (Figs. 6 and 7). The comparatively higher probability of receiving MDR in April and May might be due to the Norwesters that often occur over the island in these months. In addition, during these months, low pressure forms in the Pacific Ocean, which results in intermittent showers (Barman et al. 2012). Probability analysis showed that for MDR of >75 mm, the monsoon months extended from June to October with moderate probability (33 to 63 %), but for >125 mm, the probability decreased considerably, from 22 to 0.2 % (Table 5). For a 25-year return period, estimated rainfall was 26 to 60 mm during May to October, while during November it was 12 mm. Similarly, for a 2-year return period, estimated MDR was 7–30 mm during May to October and 0.9–1.0 mm during April to November (Table 5).
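The wet-month criterion and the weekly water balance quoted in this section can be checked with a short calculation. The sketch below uses only the figures stated in the text (mean monthly rainfall for June to September, the 200 mm threshold, 7 mm per day requirement and 65.5 mm per week supply); the function name `longest_wet_run` is an illustrative assumption.

```python
def longest_wet_run(monthly_mm, threshold_mm=200.0):
    """Length of the longest run of consecutive months at or above the threshold."""
    best = current = 0
    for depth in monthly_mm:
        current = current + 1 if depth >= threshold_mm else 0
        best = max(best, current)
    return best

# Mean monthly rainfall (mm) for June to September, as quoted in the text
monsoon_means = [251.2, 372.0, 303.6, 344.3]
meets_criterion = longest_wet_run(monsoon_means) >= 3

# Crop water requirement: 7 mm/day is 49 mm/week (quoted as about 50 mm/week)
weekly_requirement = 7.0 * 7
weekly_surplus = 65.5 - weekly_requirement

print(f"consecutive wet months: {longest_wet_run(monsoon_means)}")
print(f"criterion met: {meets_criterion}")
print(f"weekly surplus: {weekly_surplus:.1f} mm")
```

All four monsoon months exceed 200 mm, so the three-consecutive-month criterion is met, and the quoted weekly rainfall exceeds the weekly requirement by 16.5 mm.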

Fig. 5 Logarithmic curve fitting on estimated monthly (January–April) maximum daily rainfall against return periods

Fig. 6 Logarithmic curve fitting on estimated monthly (May–August) maximum daily rainfall against return periods

Fig. 7 Logarithmic curve fitting on estimated monthly (September–December) maximum daily rainfall against return periods

Table 5 Estimated rainfall for various return periods and expected probabilities for monthly maximum daily rainfall at Sagar Island

4 Conclusions

Through the analysis of best-fitted probability models, it was possible to examine the likelihood of an area receiving a given amount of MDR for different return periods and the probability of getting a desired amount of MDR. Knowing the different levels (low, moderate and high) of rainfall probability might be helpful in agricultural planning, particularly in the adjustment of agronomic practices (crop types, sowing date, fertilization, harvesting, cropping sequence, etc.). The estimated higher MDR probability during October to November suggests that measures should be taken to avoid damage to the Aman rice crop during the peak harvesting period in Sagar Island. During monsoon and summer seasons, appropriate planning and hydrologic design for harvesting excess rainwater are needed to conserve soil and water and to allow its multiple use (including agricultural activities) during the water-scarce post-monsoon and winter months. The risk from natural hazards due to extreme rainfall events can also be addressed by knowing the MDR over the years. Thus, the results could be useful for improving agricultural productivity vis-à-vis food and environmental security of the island.