Introduction

Drought is a weather-related natural disaster. It denotes a deficiency of precipitation over an extended period of time, usually for one season, a year or more. It occurs in virtually all climatic zones, and its characteristics vary significantly among regions. Drought definitions are of two types: conceptual and operational. Conceptual definitions help understand the meaning of drought and its effects. Operational definitions help identify the drought’s beginning, end and degree of severity.

The types of drought are meteorological, agricultural, hydrological and socioeconomic (Wilhite and Glantz 1985). Hydrological drought refers to a persistently low discharge and/or volume of water in rivers and reservoirs (Tallaksen and van Lanen 2004; Nalbantis and Tsakiris 2009); it causes the greatest damage of all natural disasters (Wilhite et al. 2006). The data required to assess hydrological drought include surface water area and volume, surface runoff, streamflow measurements, infiltration, water-table fluctuations and aquifer parameters.

According to Peters et al. (2006) and Tallaksen et al. (2006), the propagation of drought through the hydrological cycle can be analyzed using gridded information. Their results revealed that catchment control can modify the drought signal arising from a series of short-duration rainfall droughts covering large parts of the catchment. To develop measures for the mitigation of hydrological droughts, many studies have analyzed and estimated hydrological drought. These studies can be divided into two groups. Some studies have employed hydrological drought indices for quantitative estimation, such as the cumulative streamflow anomaly (Fleig et al. 2006; Vasiliades et al. 2011), the surface water supply index (SWSI), which is a suitable measure of hydrological drought in mountainous regions where snow contributes significantly to the annual streamflow (Niu et al. 2015), or a water balance using soil moisture information (Palmer Drought Severity Index) (Vasiliades et al. 2011; Kousari et al. 2014). Other studies have conducted hydrological drought analysis using a deterministic approach (Piechota and Dracup 1996). Starting with Yevjevich (1967), many studies have addressed univariate analysis based on the runs test (Cancelliere and Salas 2004). The Wald–Wolfowitz runs test (or simply the runs test) is a nonparametric statistical test that checks a randomness hypothesis for a two-valued data sequence.

Drought duration and severity are often modeled by different distributions, so a copula is employed to construct the bivariate drought distribution (Shiau et al. 2007). A copula is a function that links univariate marginal distributions to form a bivariate distribution. Bivariate return periods can also be established to explore the characteristics of historical droughts. Copula functions were originally introduced by Sklar (1959) to describe multivariate distributions. De Michele and Salvadori (2003) used these functions in hydrological studies; in that paper, an improved storm intensity-duration model was developed, which describes the dependence between these variables by means of a suitable copula function. The concept of copula functions was then quickly applied in different areas of hydrology, including flood frequency analysis (De Michele and Salvadori 2003; Shiau 2006; Shiau et al. 2007), multivariate analysis (Zhang and Singh 2012) and properties of precipitation (Serinaldi et al. 2009). Drought analysis using copula functions is a relatively new subject in hydrology. The following studies have applied copula functions in hydrological drought analysis. Shiau (2006) modeled the joint distribution of drought duration and intensity in Wushantou (Taiwan) using seven 2D copulas (Ali–Mikhail–Haq, Clayton, Farlie–Gumbel–Morgenstern, Frank, Galambos, Gumbel–Hougaard and Plackett). The copula fits for drought duration and intensity were quite satisfactory, and the Galambos function was chosen as the best. Song and Singh (2010) used two-dimensional copula functions to analyze the frequency of drought in Texas, USA. Serinaldi et al. (2009) applied copulas to the probabilistic analysis of drought characteristics (drought length, mean and minimum SPI values and drought mean areal extent), based on mean areal precipitation observed in Sicily, Italy, for 1921–2003.
The results showed significant dependence between almost all the considered pairs of properties and that copulas are adequate to jointly model drought properties. Reddy and Ganguli (2012) analyzed the bivariate flood frequency of annual maximum flow using Archimedean copula functions. Wong et al. (2013) analyzed drought under different climatic conditions (El Niño, La Niña and ENSO) using a three-dimensional Gumbel–Hougaard copula function in two basins of Australia. They determined three drought properties (duration, maximum intensity and moderate intensity) using the Standardized Precipitation Index (SPI). Chen et al. (2013) used copula functions to construct multivariate joint distributions. Van Huijgevoort et al. (2014) determined the characteristics of discharge changes during hydrological drought. The literature review shows the importance of drought analysis. Duration and severity are two significant parameters of a drought event and need to be examined by bivariate analysis. Kwak et al. (2016) showed a strong correlation among drought characteristics in the Sacramento River basin and found that the drought with a 20-year return period could be considered a critical level of drought for water shortages. Recent studies show that copula functions have been used for meteorological drought and other hydrological phenomena. The objective of the present research is the bivariate analysis of the severity and duration of hydrological drought by applying copula functions in the Tajan River basin, northern Iran. In addition, the hydrological drought characteristics (duration, severity and return period) are estimated for water resources management.

Materials and methods

Study area

The Tajan River originates in the Alikhani Mountains, 78 km southeast of Sari, Mazandaran province, Iran (Fig. 1). The watershed area of this river is 4147 km². The Shahid Rajaei (Soleyman Tangeh) dam is constructed on this river, and there are eight active hydrometric stations in the basin. In this research, 14,000 daily discharge records covering 40 years (1974–2014) from the Aliabad hydrometric station on the Tajan River are used (Fig. 1).

Fig. 1 Location of Soleyman Tangeh Dam and Aliabad station on the Tajan River

Reconstruction and homogeneity of data

The incomplete discharge data were reconstructed by calculating the average daily discharge in the control station and then filling the gaps for the low flows in Aliabad station. The runs test was used to analyze the homogeneity of data.
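
The runs test mentioned above can be sketched as follows. This is only a minimal illustration, assuming the series is dichotomized about its median (the cut point used in the study is not stated) and that the normal approximation is acceptable; the function name `runs_test_z` is hypothetical.

```python
import math
from statistics import median


def runs_test_z(series):
    """Wald-Wolfowitz runs test about the median.

    Returns (number_of_runs, z_statistic) using the normal approximation.
    """
    cut = median(series)
    # Dichotomize the series; values equal to the median are dropped
    # (an assumed convention, not taken from the paper).
    signs = [x > cut for x in series if x != cut]
    n1 = sum(signs)              # count above the median
    n2 = len(signs) - n1         # count below the median
    if n1 == 0 or n2 == 0:
        raise ValueError("need values on both sides of the median")
    # A new run starts whenever two neighbours fall on different sides.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n = n1 + n2
    mean = 2.0 * n1 * n2 / n + 1.0
    var = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n ** 2 * (n - 1.0))
    return runs, (runs - mean) / math.sqrt(var)
```

Under this approximation, |z| > 1.96 would reject the randomness hypothesis at the 5% level; too few runs (strongly negative z) suggests a trend or shift in the record.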

Univariate index and distributions of hydrological drought

The low-flow index was chosen for the hydrological drought analysis. The reasons for this selection include: (1) extensive use of this method, (2) reliability of the results, (3) comparability of low flows at various stations and (4) limited access to reliable information on an appropriate scale for using other indices.

A study by Hadiani et al. (2013) showed that in Mazandaran province, with a few exceptions, the average 30-day minimum river discharge with a return period of 2 years was the best low-flow index for the drought threshold. Based on the low-flow index, the probability distributions of drought severity and duration were selected using the Kolmogorov–Smirnov goodness-of-fit test. The results were then compared with the chi-square (χ²) test. As a result, the univariate Gamma distribution was chosen for drought severity and the exponential distribution for drought duration.
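
To illustrate how a low-flow threshold turns a discharge series into drought events, the sketch below uses the theory of runs: an event is a run of consecutive values below the threshold, its duration is the run length, and its severity is the cumulative deficit. These definitions and the name `drought_events` are assumptions made for illustration; the paper does not state the exact severity definition or units used.

```python
def drought_events(flows, threshold):
    """Return a list of (duration, severity) tuples, one per drought event.

    duration: number of consecutive time steps with flow below threshold
    severity: cumulative deficit (threshold - flow) over the event
              (assumed definition, for illustration only)
    """
    events = []
    duration, deficit = 0, 0.0
    for q in flows:
        if q < threshold:
            # Still inside (or entering) a drought run.
            duration += 1
            deficit += threshold - q
        elif duration > 0:
            # Flow recovered: close the current event.
            events.append((duration, deficit))
            duration, deficit = 0, 0.0
    if duration > 0:
        # The series ended while a drought was still in progress.
        events.append((duration, deficit))
    return events
```

For example, with hypothetical flows `[5, 3, 2, 6, 1, 7]` and a threshold of 4, the function returns two events: one lasting two steps and one lasting a single step.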

Copula functions

If the random variables X and Y have probability density functions \(f_{x}(x)\) and \(f_{y}(y)\), respectively, then their cumulative distribution functions evaluated at the variables themselves, \(F_{x}(X)\) and \(F_{y}(Y)\), are uniformly distributed on (0, 1). According to Sklar's theorem, if \(F_{x}(x)\) and \(F_{y}(y)\) are continuous, then there is a unique copula function C, which is a cumulative distribution function with uniform marginals, that is, \(C:[0,1]^{2} \to [0,1]\), such that the bivariate distribution can be written as \(F\left( {x,y} \right) = C\left( {F_{x} \left( x \right),F_{y} \left( y \right)} \right)\). There are several copula functions whose parameters control the strength of the correlation (Nelsen 2006). In this study, five different copula functions were used (Table 1).

Table 1 Relationships and parameters of five different copula functions
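
As a concrete instance of a copula function, the sketch below evaluates the Galambos copula in its standard textbook form, \(C(u,v) = uv\exp \{ [( - \ln u)^{ - \theta } + ( - \ln v)^{ - \theta } ]^{ - 1/\theta } \}\) with θ > 0 (Nelsen 2006). The parameterization in Table 1 may differ, and the function name is hypothetical.

```python
import math


def galambos_cdf(u, v, theta):
    """Galambos copula C(u, v) for theta > 0 (standard textbook form)."""
    if theta <= 0.0:
        raise ValueError("theta must be positive")
    # Boundary cases follow from the copula axioms:
    # C(u, 0) = C(0, v) = 0, C(u, 1) = u, C(1, v) = v.
    if u <= 0.0 or v <= 0.0:
        return 0.0
    if u >= 1.0:
        return min(v, 1.0)
    if v >= 1.0:
        return u
    a = (-math.log(u)) ** (-theta)
    b = (-math.log(v)) ** (-theta)
    return u * v * math.exp((a + b) ** (-1.0 / theta))
```

Because the exponential factor is at least 1, the result is never below the independence value uv, so this family can only represent positive dependence, which suits drought duration and severity.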

Correlation of drought severity and duration

Kendall's tau correlation coefficient for n observed pairs \((x_{1}, y_{1}), \ldots, (x_{n}, y_{n})\) is estimated by the following relationship:

$$\tau = \binom{n}{2}^{ - 1} \mathop \sum \limits_{1 \le i < j \le n} {\text{sgn}}\left[ {\left( {x_{i} - x_{j} } \right)\left( {y_{i} - y_{j} } \right)} \right],\quad i,\,j = 1,2, \ldots ,n$$
(1)

where

$${\text{sgn}}\left( {\left( {x_{i} - x_{j} } \right)\left( {y_{i} - y_{j} } \right)} \right) = \left\{ {\begin{array}{*{20}l} 1 \hfill & {{\text{if }} \,\left( {x_{i} - x_{j} } \right)\left( {y_{i} - y_{j} } \right) > 0} \hfill \\ 0 \hfill & { {\text{if }} \,\left( {x_{i} - x_{j} } \right)\left( {y_{i} - y_{j} } \right) = 0} \hfill \\ { - 1 } \hfill & { {\text{if }}\, \left( {x_{i} - x_{j} } \right)\left( {y_{i} - y_{j} } \right) < 0} \hfill \\ \end{array} } \right.$$
(2)

where τ is Kendall's correlation coefficient, sgn is the sign function, \(x_{i}\) and \(y_{i}\) are the severity and duration variables, and n is the number of observations.
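
Equations (1) and (2) translate directly into a double loop over all pairs with i < j. The sketch below is a plain O(n²) illustration with hypothetical function names, which is adequate for a small sample of drought events.

```python
def sign(z):
    """Sign function of Eq. (2): 1, 0 or -1."""
    return (z > 0) - (z < 0)


def kendall_tau(x, y):
    """Kendall's tau of Eq. (1) for paired samples x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    total = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie.
            total += sign((x[i] - x[j]) * (y[i] - y[j]))
    # Divide by the number of pairs, n choose 2.
    return total / (n * (n - 1) / 2)
```

A value of 1 means every pair is concordant (longer droughts are always more severe), and a value of -1 means every pair is discordant.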

Estimation of parameter θ and goodness-of-fit of copula functions

There are four main methods for estimating the parameter θ, of which the maximum likelihood method is the most widely used (Nazemi and Elshorbagy 2011). This method is used in the present research. The optimum value of the parameter is the one that maximizes the log-likelihood function, which is given by:

$$L\left( \theta \right) = \mathop \sum \limits_{i = 1}^{n} { \log }\left( {c\left( {F_{D} \left( {d_{i} } \right),F_{S} \left( {s_{i} } \right)} \right)} \right) = \mathop \sum \limits_{i = 1}^{n} { \log }\left( {c\left( {u_{i} ,v_{i} } \right)} \right)$$
(3)

where θ is the parameter of the copula function, \(F_{D}(d)\) and \(F_{S}(s)\) are the distribution functions of drought duration (D) and severity (S), with \(u = F_{D}(d)\) and \(v = F_{S}(s)\), and c is the density of the copula function, which is calculated by the following equation (Zhang and Singh 2012):

$$c\left( {u,v} \right) = \frac{{\partial^{2} C\left( {u,v} \right)}}{\partial u\partial v}$$
(4)
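
The maximization of Eq. (3) can be sketched with a one-dimensional search. The example below uses the Clayton copula only because its density has a simple closed form; the same procedure would apply to the Galambos density. The golden-section search, the search bounds and all function names are assumptions made for illustration, and the search presumes the log-likelihood has a single peak on the interval.

```python
import math


def clayton_log_density(u, v, theta):
    """Log of the Clayton copula density c(u, v), theta > 0."""
    a = u ** (-theta) + v ** (-theta) - 1.0
    return (math.log1p(theta)
            - (theta + 1.0) * (math.log(u) + math.log(v))
            - (2.0 + 1.0 / theta) * math.log(a))


def log_likelihood(theta, pairs):
    """Eq. (3): sum of log densities over the (u_i, v_i) pairs."""
    return sum(clayton_log_density(u, v, theta) for u, v in pairs)


def fit_theta(pairs, lo=0.01, hi=20.0, tol=1e-6):
    """Golden-section search for the theta that maximizes Eq. (3)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = log_likelihood(c, pairs), log_likelihood(d, pairs)
    while b - a > tol:
        if fc > fd:
            # The maximum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = log_likelihood(c, pairs)
        else:
            # The maximum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = log_likelihood(d, pairs)
    theta = 0.5 * (a + b)
    return theta, log_likelihood(theta, pairs)
```

Here `pairs` holds the values \(u_{i} = F_{D}(d_{i})\) and \(v_{i} = F_{S}(s_{i})\), each strictly between 0 and 1. The returned maximum log-likelihood is the quantity compared across candidate copulas.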

The best copula function is selected when it has the highest maximum log-likelihood value (Nazemi and Elshorbagy 2011). Other than the highest maximum log-likelihood, Akaike information criterion (AIC), root-mean-square error (RMSE) and Nash–Sutcliffe efficiency (NSE) were calculated for the five copula functions as follows. The best copula function has the lowest RMSE and AIC and the highest NSE values:

$${\text{RMSE}} = \sqrt {\frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left[ {C_{\text{p}} \left( i \right) - C_{\text{e}} \left( i \right)} \right]^{2} }$$
(5)
$${\text{AIC}}\, = \, -\, 2{\text{ln ML}} + 2K$$
(6)
$${\text{NSE}} = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{n} \left( {C_{\text{p}} \left( i \right) - C_{\text{e}} \left( i \right)} \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{n} \left( {C_{\text{e}} \left( i \right) - \bar{C}_{\text{e}} } \right)^{2} }}$$
(7)

where \(C_{\text{p}}\) is the value computed from the parametric copula function, \(C_{\text{e}}\) is the probability of the observed data obtained from the empirical copula function, \(\bar{C}_{\text{e}}\) is the mean of the \(C_{\text{e}}(i)\) values, n is the number of observations, ln ML is the maximum log-likelihood, and K is the number of parameters of the copula function (Cisty et al. 2015).
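
The three criteria in Eqs. (5) to (7) can be computed as in the following sketch. The empirical copula is taken here as the fraction of observations that are less than or equal to the i-th observation in both variables, which is one common convention and an assumption about the paper's procedure; all function names are hypothetical.

```python
import math


def empirical_copula(x, y):
    """Empirical copula value at each observation (count / n convention)."""
    n = len(x)
    return [
        sum(1 for j in range(n) if x[j] <= x[i] and y[j] <= y[i]) / n
        for i in range(n)
    ]


def rmse(cp, ce):
    """Eq. (5): root-mean-square error between parametric and empirical."""
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(cp, ce)) / len(ce))


def aic(max_log_lik, k):
    """Eq. (6): Akaike information criterion for k fitted parameters."""
    return -2.0 * max_log_lik + 2.0 * k


def nse(cp, ce):
    """Eq. (7): Nash-Sutcliffe efficiency (1 is a perfect fit)."""
    mean_e = sum(ce) / len(ce)
    num = sum((p - e) ** 2 for p, e in zip(cp, ce))
    den = sum((e - mean_e) ** 2 for e in ce)
    return 1.0 - num / den
```

An NSE of 0 means the parametric copula predicts no better than the mean of the empirical values, so a candidate copula is only useful when its NSE is clearly above 0.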

Conditional probability

The selected bivariate distribution provides information that is valuable for drought management. For example, the probability that drought severity and duration both exceed predetermined values is calculated by the following equation (Shiau 2006):

$$P\left( {D \ge d,S \ge s} \right) = 1 - F_{D} \left( d \right) - F_{S} \left( s \right) + C\left( {F_{D} \left( d \right),F_{S} \left( s \right)} \right)$$
(8)

It is also possible to calculate the conditional probability of the two variables, that is, the probability of one variable given that the other exceeds a certain threshold value. The following equations are used (Shiau 2006):

$$P\left( {S \le s |D \ge d^{{\prime }} } \right) = \frac{{F_{S} \left( s \right) - C\left( {F_{D} \left( {d^{{\prime }} } \right),F_{S} \left( s \right)} \right)}}{{1 - F_{D} \left( {d^{{\prime }} } \right)}}$$
(9)
$$P\left( {D \le d |S \ge s^{{\prime }} } \right) = \frac{{F_{D} \left( d \right) - C\left( {F_{D} \left( d \right),F_{S} \left( {s^{{\prime }} } \right)} \right)}}{{1 - F_{S} \left( {s^{{\prime }} } \right)}}$$
(10)
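
Equations (8) to (10) need only three numbers: the two marginal probabilities and the copula value at that point. The sketch below uses hypothetical function names and takes these numbers as inputs, so it works with any fitted copula.

```python
def joint_exceedance(fd, fs, c):
    """Eq. (8): P(D >= d, S >= s) from F_D(d), F_S(s) and C(F_D, F_S)."""
    return 1.0 - fd - fs + c


def severity_given_duration(fs, fd_prime, c):
    """Eq. (9): P(S <= s | D >= d'), with c = C(F_D(d'), F_S(s))."""
    return (fs - c) / (1.0 - fd_prime)


def duration_given_severity(fd, fs_prime, c):
    """Eq. (10): P(D <= d | S >= s'), with c = C(F_D(d), F_S(s'))."""
    return (fd - c) / (1.0 - fs_prime)
```

A quick sanity check is the independence case, where C = F_D F_S. With illustrative values F_D = 0.8 and F_S = 0.5 (so C = 0.4), the joint exceedance probability reduces to (1 - 0.8)(1 - 0.5) = 0.1, and each conditional probability reduces to the corresponding marginal.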

Joint and conditional return periods

The joint return period for severity (S) and duration (D) of drought is defined for the following two cases (Shiau 2006): (1) the return period for drought duration and severity both exceeding predetermined thresholds, and (2) the return period for drought severity or duration greater than or equal to a certain value (Eqs. 11, 12, respectively):

$$D \ge d \,{\text{and}}\, S \ge s \to T_{DS} \, = \,\frac{E\left( L \right)}{{1 - F_{D} \left( d \right) - F_{S} \left( s \right) + C\left( {F_{D} \left( d \right),F_{S} \left( s \right)} \right)}}$$
(11)
$$D \ge d\, {\text{or}} \,S \ge s \to T_{DS}^{{\prime }} = \frac{E\left( L \right)}{{1 - C\left( {F_{D} \left( d \right),F_{S} \left( s \right)} \right)}}$$
(12)

where L is the time interval between the beginning of a drought and the beginning of the next drought, and E(L) is the expected value of the time interval between two consecutive droughts.

Conditional return periods can be defined in two forms: the return period of drought duration when drought severity exceeds a certain value, and the return period of drought severity when drought duration exceeds a certain value (Eqs. 13, 14):

$$T_{D|S \ge s} = \frac{{T_{S} }}{{P\left( {D \ge d,S \ge s} \right)}} = \frac{E\left( L \right)}{{\left[ {1 - F_{S} \left( s \right)} \right]\left[ {1 - F_{D} \left( d \right) - F_{S} \left( s \right) + C\left( {F_{D} \left( d \right),F_{S} \left( s \right)} \right)} \right]}}$$
(13)
$$T_{S|D \ge d} = \frac{{T_{D} }}{{P\left( {D \ge d,S \ge s} \right)}} = \frac{E\left( L \right)}{{\left[ {1 - F_{D} \left( d \right)} \right]\left[ {1 - F_{D} \left( d \right) - F_{S} \left( s \right) + C\left( {F_{D} \left( d \right),F_{S} \left( s \right)} \right)} \right]}}$$
(14)
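
The four return periods in Eqs. (11) to (14) share the same inputs, so they can be evaluated together. The sketch below is a minimal illustration with hypothetical function names; `e_l` stands for E(L) in years, and the numerical values in the note that follows are illustrative only, not results from the study.

```python
def return_period_and(e_l, fd, fs, c):
    """Eq. (11): return period of D >= d and S >= s."""
    return e_l / (1.0 - fd - fs + c)


def return_period_or(e_l, fd, fs, c):
    """Eq. (12): return period of D >= d or S >= s."""
    return e_l / (1.0 - c)


def return_period_d_given_s(e_l, fd, fs, c):
    """Eq. (13): conditional return period of duration given S >= s."""
    return e_l / ((1.0 - fs) * (1.0 - fd - fs + c))


def return_period_s_given_d(e_l, fd, fs, c):
    """Eq. (14): conditional return period of severity given D >= d."""
    return e_l / ((1.0 - fd) * (1.0 - fd - fs + c))
```

With E(L) = 4 years, F_D = 0.8, F_S = 0.5 and C = 0.4, the "and" return period is 40 years and the "or" return period is about 6.7 years. The "or" case is always the shorter of the two because it covers a larger set of events.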

Results and discussion

Drought severity and duration distributions

According to Table 2, which is based on the estimated low-flow index, 10 drought events occurred during the studied period. The significant statistical characteristics of these droughts are listed in Table 3. Based on the goodness-of-fit tests described in the “Univariate index and distributions of hydrological drought” section, the Gamma and exponential distributions showed the best fit for drought severity and duration, respectively.

Table 2 Severity and duration of the 10 droughts in Tajan Watershed
Table 3 Significant statistics of duration and severity of droughts observed in the hydrometric station

After fitting the exponential and Gamma distributions to the drought duration and severity data and estimating the parameters by maximum likelihood, the exponential distribution parameter λ = 3 and the Gamma distribution parameters α = 2.2635 and β = 0.0644 were obtained. Figure 2 shows the theoretical and empirical distributions of severity and duration of drought.

Fig. 2 Empirical and theoretical distributions: a drought severity; b drought duration

Optimum copula function

With a Kendall coefficient of 0.5556 and a Spearman coefficient of 0.697, the severity and duration of droughts are correlated. The best copula function is obtained by maximizing the log-likelihood function and checking the AIC, RMSE and NSE indices. The results for the five copula functions are shown in Table 4.

Table 4 Parameter θ, maximum log-likelihood and error indices (RMSE, AIC, NSE) for the copula functions

Table 4 shows that Galambos copula function is the best copula function to describe the bivariate distribution of severity and duration of droughts in Tajan Watershed. Figure 3 shows a theoretical Galambos copula plotted versus the empirical copula for Aliabad station.

Fig. 3 Fitted Galambos copula versus the empirical copula

In Fig. 3, the relationship is close to the 1:1 line, which supports the use of the Galambos copula for both characterizing the dependence structure and constructing the bivariate model. The combined probability of drought severity and duration for different levels of severity and duration is shown in Fig. 4. The curves in this figure could be useful in water resources management during drought periods of the Tajan River. According to the results, the longest drought duration was 5 months and the highest drought severity was 0.32. To predict the probability of a hydrological drought with severity of more than 0.32 and duration of more than 5 months, the following calculation is made based on Eq. (8):

$$F_{D} (5) = 0.8592,\;\;F_{S} (0.32) = 0.8301\,\, \text{and}\,\, C\left( {F_{D} \left( 5 \right),F_{S} \left( {0.32} \right)} \right) = 0.7503$$
$$P(D \ge d,S \ge s) = 1 - F_{D} (d) - F_{S} (s) + C(F_{D} (d),F_{S} (s)) = 1 - 0.8592 - 0.8301 + 0.7503 = 0.061$$

Therefore, the combined probability is 6.1%.

Fig. 4 Combined probability of drought severity and duration

The conditional probability of drought can also be calculated using the Galambos bivariate distribution. Figure 5 shows the conditional probability of drought duration for given values of drought severity at the hydrometric station on the Tajan River, and Fig. 6 shows the conditional probability of drought severity for given values of drought duration. For example, for drought duration of less than 5 months and severity of more than 0.32, the conditional probability based on Eq. (10) is 28.5%.

Fig. 5 Conditional probability of drought severity for different durations

Fig. 6 Conditional probability of drought duration for different values of severity

Return period

The return periods of \((D \ge 5\;{\text{and}}\;S \ge 0.32)\) and \((D \ge 5\;{\text{or}}\;S \ge 0.32)\) are 74.4 and 17.9 years, respectively (Figs. 7, 8). One of the goals of drought studies is to predict the properties of future critical drought events based on present and previous observations. Therefore, in Figs. 4, 7 and 8, the severity and return period curves were extrapolated for drought durations up to 10 months.

Fig. 7 Return period (year) for the case of drought duration and severity exceeding a certain value

Fig. 8 Return period for drought severity or duration greater than or equal to a certain value

If drought duration is 5 months and severity is more than 0.32, then the conditional return period \((T_{{D\left| {S \ge s} \right.}} )\) is 905 years (Fig. 9). The results showed that as drought duration and its concurrent severity increase, the combined and conditional return periods get larger. Kwak et al. (2016) used the copula method for bivariate drought analysis in the Sacramento River basin. Their results showed that a strong correlation exists among drought characteristics, and that the drought with a 20-year return period could be considered a critical level of drought for water shortages.

Fig. 9 Conditional return period of drought duration for different drought severities \((T_{{D\left| {S \ge s} \right.}} )\) in the hydrometric station of Tajan River

Conclusions

In this paper, the methodology of constructing a bivariate statistical distribution of drought severity and duration was applied to 40 years of discharge data from the Tajan River (1974–2014). The advantage of using copula functions is the possibility of using different univariate marginal distributions. Based on the low-flow index, 10 hydrological drought events were identified in the studied period, all of which happened after the construction of the Soleyman Tangeh dam. By analyzing the severity and duration of droughts, it was found that the Gamma and exponential distributions had the best fit for severity and duration, respectively. The worst hydrological drought event in this area lasted for 5 months with a severity of 0.32. Five well-known copula functions were then used to combine the marginal functions of drought severity and duration. The parameters of these copula functions were determined by maximizing the log-likelihood value, and the best copula function was selected based on the highest maximum log-likelihood value. In this study, Galambos was the best copula function. Using the best copula function, the joint and conditional probability characteristics of drought duration and severity were determined. The results showed that the combined probability of hydrological drought with severity of more than 0.32 and duration of more than 5 months is 6.1%. For drought duration of less than 5 months and severity of more than 0.32, the conditional probability is 28.5%. In general, the curves prepared for predicting joint or conditional probabilities of drought events with specified values of severity and duration are very useful in the planning and management of water resources.