Introduction

The use of spring water is a common solution to meet the drinking water needs of cities and villages globally. Since the water from these springs is often of good quality, it does not need to be treated and no energy needs to be expended to extract it. In order to exploit the water from the springs and allocate it to the different needs, information is needed about the quantity of water supply, the changes in discharge during different months of the year and during periods of drought, and also the quality of the water. Therefore, it is necessary to continuously monitor the quality and quantity of spring water. In Iran, measurement of spring discharge and quality sampling are usually carried out seasonally (once every three months) and in some cases monthly. Various methods are used to measure spring discharge, such as the weir method, the volumetric method, the velocity–area method, and less frequently the dye tracing and pressure transducer methods (Gil‐Márquez et al., 2017). The application of each method depends on the flow rate, the morphology of the spring outlet, the available tools, and the purpose of the measurement (Kresic & Stevanovic 2010).

However, because some springs are inaccessible, they are measured only annually, or no observational data have been recorded at all. Planning the use of water from such springs is very difficult because there is no reliable basic information to estimate the amount of spring water and conduct accurate planning (Jeannin et al. 2021). Therefore, providing reliable methods for estimating (predicting) the discharge from such springs where long-term hydrometric data are unavailable is crucial. To date, researchers have developed several methods to estimate spring discharge.

One of the common methods for estimating the discharge of a spring is to develop or use existing hydrologic models that simulate the behavior of the spring and its watershed (Chen & Goldscheider 2014; Zhou et al. 2019; Meng et al. 2021; Guo et al. 2023). These models take into account some effective factors such as precipitation, snowmelt, soil properties, and topography to estimate the discharge of the spring (Zhang et al. 1996; Hartmann et al. 2014; Peng et al. 2021). It is also very common to use numerical models to simulate groundwater flow. To build these models, much information is needed about the hydrodynamic characteristics of the aquifer (such as hydraulic conductivity, transmissivity, and specific yield), the geology, and also long-term observational data (Luo et al. 2016a; Katsanou et al. 2022; Wang et al. 2022). This is because the relationships used in these models should be calibrated using long-term observational data and the model should eventually be validated (Jeannin et al. 2021). The method of isotopic analysis is also used in some cases to trace the source and movement of groundwater, which can provide insights into the discharge patterns of the spring. However, the application of this method requires a lot of time and money, and its accuracy may decrease under the influence of some factors (Osati et al. 2014; Gil‐Márquez et al. 2016; Guo et al. 2022; Wang et al. 2022). Remote sensing technologies such as satellite imagery, aerial photography, and drones have also been used to estimate spring discharge (Hunn & Cherry 1970; Loheide & Steven 2009; Jou-Claus et al. 2021; Bandini et al. 2021). In recent years, using data mining and artificial intelligence to estimate the discharge of springs has attracted the attention of researchers, although these methods require long-term observational data (Lambrakis et al. 2000; Granata et al. 2018; Savary et al. 2021; Gai et al. 2023; Mukherjee et al. 2023).

Some other indirect methods have been used to estimate the flow of springs, but they are less accurate and subject to limitations and uncertainties. These methods mainly provide an estimate of the potential flow of the spring or its extreme values. For example, by examining vegetation density, types of plants grown, and wetland conditions downstream of the spring, an estimate of the minimum discharge of the spring can be made (Andreo et al. 2016; Luo et al. 2016b), because certain plant species or wetland conditions require specific levels of water flow, which indicates the minimum discharge needed to maintain them (Kokinou et al. 2023). Another method is based on hydrologic similarity. First, geomorphological, hydrological, and environmental observations (e.g., signs of erosion, sediment deposition, or vegetation patterns) are collected through field surveys of the spring and its surroundings. Next, these features are compared with those of nearby springs that have similar conditions and measured discharge data, to obtain an estimate of the historical discharge of the unmeasured spring (Zhang et al. 2022).

It should be noted that the use of each of these methods depends on the resources available, the characteristics of the spring, and the degree of precision desired, and that it is sometimes possible to refine estimates by combining information obtained from two or more methods.

This study was concerned with estimating the discharge of the Absefid spring. Due to the inaccessible location of this spring, its discharge was not measured regularly. In contrast, the discharge of some other springs near this area was measured monthly. A comparison of the discharge time series of the other springs in the region shows that their discharges vary in a similar way. Considering the geological features of the region and the fact that most of these springs were formed around the same fault, the similarity in the behavior of the springs may be due to the geological structure of the region, in addition to the shared climatic conditions. After comparing the measured data of the Absefid spring with those of the neighboring springs, it was found that the discharge of the Absefid spring has a higher correlation with that of the Gerdebisheh spring than with the other springs. Considering the short distance between the two springs, the almost equal elevation, and the fact that both springs are located near the same fault, we propose a bivariate probability model between the discharge of the Absefid spring and the discharge of the Gerdebisheh spring. Using this bivariate model, it is possible to estimate the Absefid spring discharge from the Gerdebisheh spring discharge at an assumed probability level (e.g., 0.9). Copulas can be applied to construct a bivariate distribution between the discharges of two springs. Copula functions are a flexible statistical tool that has been used in recent years to create multivariate distributions in various hydro-climatology studies (Poonia et al. 2021; Birjandi et al. 2023; Amini et al. 2023; Sadeghfam et al. 2022; Vahidi et al. 2023). To the best of our knowledge, this is the first study that uses probabilistic multivariate analysis to estimate spring discharge.
The main objective of this research is to use the simulation approach based on the bivariate copula in the estimation of spring discharge.

Materials and methods

Description of the studied area

Borujen County includes the cities of Borujen, Faradonbeh, Boldaji, Gandoman, Naqneh and Sefiddasht, a large part of which has a cold and dry climate. Currently, the population of this county is about 125,000 people, whose drinking water is often supplied from groundwater resources through wells. In recent years, with the development of agriculture in this region and the overexploitation of groundwater, the groundwater level in the plains of Borujen, Faradonbeh and Sefiddasht has declined drastically. As a result, the drinking water supply of these cities faces problems: water quality has declined, and on most days the household water supply is cut off for several hours. Measures taken so far include digging deep wells and planning to transfer water from the Zayandeh Rud River (Ben-Borujen Water Transmission Project), but the transfer has not yet been implemented (Sharifi et al. 2021). One of the solutions for a sustainable supply of drinking water for Borujen is to transfer water from the Absefid spring.

The Absefid spring is located in Borujen County, Iran, 8 km west of the village of Sarpir. The spring is reached via the 54 km paved road from Borujen to Dorahan and the 10 km dirt road from Dorahan to Sarpir, followed by a path of about 8 km that is very difficult to traverse.

The outlet of the spring is located on the slope of Hezar Dareh Mountain, at an altitude of 1720 m above sea level and in the geographical position of 51° 5′ 14" east longitude and 31° 38′ 22" north latitude (Fig. 1). According to the latest report of Chaharmahal and Bakhtiari Regional Water Authority, the average discharge of this spring is about 750 L per second. The water of this spring has very good quality and can be used as drinking water without any treatment.

Fig. 1 Location of Absefid and Gerdebisheh springs in Borujen County and Iran

The water from this spring flows into the Karebas River after traveling a distance of about 500 m on a steep slope. Due to the steep and rocky path from the outlet of the spring to the place where it flows into the Karebas River, the water of the spring is seen in white color, which is why it is called White Water Spring (Absefid in Persian). The Karebas River flows in a deep, impassable valley and then joins the Sabzkoh River and finally the Great Karun. The two sides of the Karebas River are relatively high in elevation and have a relative slope of more than 30 degrees. The northern slope of the Karebas River, which forms the heights of Hezar Dareh, is mostly rocky due to its predominant lithology, which is mainly carbonate, and the valleys are mostly U-shaped, the slopes are irregular and rough, and the vegetation cover is very sparse. The southern elevations of the river, Mount Pazan Pir and Mount Kalamooyi, consist of ancient sediments and part of the Paleozoic sediments.

According to the geological classification of Iran (Stocklin 1968), the feeding area of Absefid is located on the crushed zone of the main Zagros thrust (crushed Zagros) between the two structural zones of Sanandaj–Sirjan and the Zagros fold and thrust belt. One of the most important features of the Zagros fold and thrust belt is its folds parallel to the northwest-southeast axial direction and numerous thrusts in the same direction, which were formed during a bending and shortening mechanism (Ghorbani 2013).

The outlet of Absefid spring is located at the contact of the Surmeh Carbonate Formation with the Shale and Neiriz Marl units. Between the downstream of Absefid spring and the Karebas River, tectonic mixed units are seen that include a mixture of the Khaneh-Kat, Mila, Zagon, and Lalon formations, and many parts of the complex are composed of shale and thin sandstone units of the Mila Formation.

The slope of the layering in this area ranges from 25 to 38 degrees in the northeast direction and opposite to the topographic direction. Structurally, this area is located near the southeast plunge of the Sabzehkouh syncline. This syncline is formed as a pop-up structure between the end part of the Dena fault (Sabzekouh fault) and the thrust faults Hezar Dareh and Dopolan. In addition to the mentioned faults, due to the high tectonic pressure and the location of the region in the collision zone of the Dena fault and the Zagros main thrust, many faults and fractures have formed in the brittle carbonate units as transverse and systemic faults, which play a special role in creating secondary permeability and karst development in the area. In the incised ridges of the Sabzekouh alluvium, there are carbonate formations of the second era. At their core is the impermeable Gurpi Formation, which acts as a hydraulic boundary separating the karstic aquifers of the northwest ridge from the southwest ridge. Around this syncline and at the fault intersections, the karst aquifers were drained and numerous karstic springs were formed, such as Tang Siah, Bagh Khan, Haft Cheshme Boldaji, Hossein Abad, Nasir Abad, Bidak, Tang Westgan, Chehel Gazi in Bijgerd on the northeast side and Chehrazgun, Tang Zendan, Absharan, Gerdebisheh and Absefid springs on the southeast side (Fig. 2).

Fig. 2 Location of springs around Sabzehkouh and Hezar Dareh faults

Absefid spring originates from the limestone and gray dolomitic limestone of the Surmeh Formation. In this area, the Surmeh Formation is underlain by red to gray shale and sandstone units of the Neiriz Formation, which, as an impermeable bedrock, played a special role in storing the infiltrating water in the overlying karst formations and in the formation of the springs. The large thickness of carbonate formations, the faults, fractures and useful joint porosity, the development of karst, and the high precipitation in the region, especially on the high and snowy peaks, have enhanced the formation of the karst aquifer in this deposit. As a result, the faults and fractures drain part of this aquifer as fault-karst springs such as Absefid.

Field investigations of the Absefid spring outlet show that this spring has many outlets originating in the area of the heights of the Hezar Dareh Mountains in the direction of the small fault valley. The higher elevation outlets become less watery and dry with the decline of snowmelt in the dry season and drought years, but the lower elevation outlets have permanent water. Therefore, the spring has a static reserve and a dynamic reserve, and the volume of dynamic discharge is relatively high due to the high gradient and adequate precipitation, which has resulted in large fluctuations in the total water supply to the spring during the wet and dry months of the year and also during the wet and drought periods.

Used data

Assuming the same hydrogeological regime (intra-annual flow fluctuation) for Gerdebisheh spring and Absefid spring, this study aimed to create a bivariate distribution between the monthly discharges of these two springs. With this distribution, the discharge of Absefid spring can be estimated from the monthly discharge of Gerdebisheh spring at an assumed probability level. The flow of Absefid spring is not measured continuously because the area is so difficult to access, and the only complete observational record from this spring is the monthly flow measured in the water year September 2006–August 2007. In 2022, the spring’s flow was measured in June, July and August. The measured discharge values of the Absefid spring are given in Table 1.

Table 1 Measured monthly discharge of Absefid spring in 2006, 2007, and 2022

The monthly discharge values measured at Gerdebisheh spring from March 1993 to August 2022 are presented in the Appendix. The discharge values of Gerdebisheh spring were used as the input of the copula-based model to estimate the discharge of Absefid spring.

Copula function

In multivariate analysis, a copula function is a mathematical tool to describe the dependence structure between multiple random variables and their marginal distributions. Copulas are particularly useful when dealing with complex relationships between variables that may not be adequately captured by linear correlation measures. The concept of a copula originates from probability theory and is often used in the field of finance, risk assessment, and various other disciplines where understanding the joint behavior of multiple variables is important. The key idea behind copulas is to separate the joint distribution of variables into two components: the marginal distributions of each variable and the copula function that captures their dependence structure (Nelsen 2006).

Copula functions are more flexible than common multivariate distributions because the marginal distributions combined in a copula-based multivariate model can be of different types. Copulas are multivariate distribution functions on the unit hypercube with uniform marginals, which allows them to capture different types of dependence structures while preserving the marginal distributions. Some common copula families include the Gaussian (normal) copula, the Clayton copula, the Student’s t copula, the extreme value copulas, and many others. Each copula family has its own characteristics and is suitable for modeling different types of dependencies, such as positive dependence, negative dependence, and tail dependencies (Nazeri Tahroudi et al. 2021).

The introduction and presentation of the copula are credited to Sklar (1959), who elucidated a theory explaining how various univariate distribution functions can be linked to form a multivariate distribution.

For N-dimensional continuous random variables \(X_{1} ,X_{2} ,...,X_{N}\) with corresponding marginal CDFs denoted as \(F_{{X_{i} }} \left( {x_{i} } \right) = P\left( {X_{i} \le x_{i} } \right)\), the joint CDF of the variables \(X_{i}\) can be defined as follows:

$$H_{{X_{1} ,...,X_{N} }} \left( {x_{1} ,x_{2} ,...,x_{N} } \right) = P\left[ {X_{1} \le x_{1} ,X_{2} \le x_{2} ,...,X_{N} \le x_{N} } \right]$$
(1)

A copula is a function that combines univariate marginal CDFs to construct a multivariate distribution function. Sklar (1959) demonstrated that the multivariate probability distribution H can be expressed by the copula function (C) incorporating the marginal distributions and the dependence structure:

$$C\left( {F_{{X_{1} }} \left( {x_{1} } \right),F_{{X_{2} }} \left( {x_{2} } \right),...,F_{{X_{N} }} \left( {x_{N} } \right)} \right) = H_{{X_{1} ,...,X_{N} }} \left( {x_{1} ,x_{2} ,...,x_{N} } \right)$$
(2)

where \(F_{{X_{i} }} \left( {x_{i} } \right)\) denotes the ith marginal distribution and \(H_{{X_{1} ,...,X_{N} }}\) represents the CDF of joint distribution of \(X_{1} ,X_{2} ,...,X_{N}\).

Considering that for continuous random variables (RV), the cumulative distribution function (CDF) of the margins is non-decreasing from zero to one, we can consider the copula C as a transformation of \(H_{{X_{1} ,...,X_{N} }}\) from \(\left[ { - \infty , + \infty } \right]^{N}\) to \(\left[ {0,1} \right]^{N}\). This transformation separates the marginal distributions from the dependence, and as a result, the copula function C is only related to the relationship between the variables and gives a comprehensive description of the dependence structure. For the two-dimensional case (variables \(X_{1}\) and \(X_{2}\) with CDFs \(u_{1} = F_{{X_{1} }} \left( {x_{1} } \right)\) and \(u_{2} = F_{{X_{2} }} \left( {x_{2} } \right)\)), Eq. 2 is as follows (Nelsen 2006):

$$H\left( {x_{1} ,x_{2} } \right) = C\left( {u_{1} ,u_{2} } \right) = C\left( {F_{{X_{1} }} \left( {x_{1} } \right),F_{{X_{2} }} \left( {x_{2} } \right)} \right)$$
(3)
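Equation 3 can be illustrated with a minimal Python sketch. The Joe copula CDF below is written in its standard textbook form; the two marginal distributions and the value of the dependence parameter are hypothetical and chosen only for illustration, not taken from the study.

```python
import numpy as np
from scipy import stats


def joe_cdf(u, v, theta):
    """Joe copula CDF in its standard form (valid for theta >= 1)."""
    a = (1.0 - u) ** theta
    b = (1.0 - v) ** theta
    return 1.0 - (a + b - a * b) ** (1.0 / theta)


def joint_cdf(x1, x2, marg1, marg2, theta):
    """Eq. 3: H(x1, x2) = C(F1(x1), F2(x2))."""
    return joe_cdf(marg1.cdf(x1), marg2.cdf(x2), theta)


# Hypothetical marginals of different types (illustration only)
marg1 = stats.lognorm(s=0.5, scale=1.0)
marg2 = stats.genextreme(c=-0.1, loc=1.0, scale=0.3)
theta = 2.0  # hypothetical dependence parameter

h = joint_cdf(1.2, 1.1, marg1, marg2, theta)
print(f"H(1.2, 1.1) = {h:.4f}")
```

The sketch shows the point made in the text: the two marginals belong to different families, and the copula alone carries the dependence.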

In practice, when using copulas, one typically follows these steps:

  1. Determine the marginal distributions of the individual variables.
  2. Choose a suitable copula family that fits the observed dependence structure.
  3. Estimate the dependency parameter (θ) of the copula based on the observed data.
  4. Simulate joint samples from the copula and combine them with the marginal distributions to obtain joint samples of the multivariate data.

There are several methods to estimate copula dependence parameters, including the method of moments, maximum likelihood estimation (MLE), inversion of Kendall’s tau, canonical maximum likelihood estimation, pseudo maximum likelihood estimation, optimization-based techniques, and the inference function for margins (IFM) method. Among these methods, MLE and IFM have been used most often. In the present study, we used the IFM method to estimate the dependency parameter of the copula (θ). The IFM method, one of the most widely used approaches for estimating the copula parameter, was suggested by Joe (1997) and includes two steps: (a) maximizing the log-likelihood function of each univariate marginal distribution and (b) maximizing the log-likelihood function of the copula, with the marginal parameters held fixed, to estimate the dependency parameter of the copula. For more details on this method, see Joe (1997) and Mirabbasi et al. (2012).
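The two IFM steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the variable names are hypothetical, the GEV and log-normal marginals are chosen to mirror the families named later in the paper, and the Joe copula log-density is written in its standard form.

```python
import numpy as np
from scipy import optimize, stats


def joe_log_density(u, v, theta):
    """Log-density of the Joe copula (standard form, theta >= 1)."""
    ub, vb = 1.0 - u, 1.0 - v
    s = ub**theta + vb**theta - (ub**theta) * (vb**theta)
    return ((theta - 1.0) * (np.log(ub) + np.log(vb))
            + (1.0 / theta - 2.0) * np.log(s)
            + np.log(theta - 1.0 + s))


# Synthetic, positively dependent "discharge" pairs (NOT observed data)
rng = np.random.default_rng(0)
n = 200
gs = stats.genextreme.rvs(c=-0.1, loc=1.0, scale=0.3, size=n, random_state=rng)
as_ = 1.3 * gs * np.exp(rng.normal(0.0, 0.15, size=n))

# Step (a): fit each marginal separately by maximum likelihood
gev_params = stats.genextreme.fit(gs)        # (shape, loc, scale)
ln_params = stats.lognorm.fit(as_, floc=0)   # (shape, loc, scale)

# Transform the data to the unit interval with the fitted marginal CDFs
eps = 1e-6
u = np.clip(stats.lognorm.cdf(as_, *ln_params), eps, 1.0 - eps)
v = np.clip(stats.genextreme.cdf(gs, *gev_params), eps, 1.0 - eps)


# Step (b): maximize the copula log-likelihood with the marginals held fixed
def neg_loglik(theta):
    return -np.sum(joe_log_density(u, v, theta))


res = optimize.minimize_scalar(neg_loglik, bounds=(1.0001, 20.0), method="bounded")
theta_hat = res.x
print(f"IFM estimate of theta: {theta_hat:.3f}")
```

Keeping the marginal fit and the copula fit separate makes each optimization low-dimensional, which is the main practical appeal of IFM over a full joint MLE.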

To select the appropriate copula from several candidate copulas, the cumulative distribution function (CDF) values of each theoretical copula are compared to the corresponding empirical copula values. The concept of the empirical copula is akin to that of the plotting position formulas used in univariate analyses. The empirical copula is the cumulative distribution of the rank-transformed variables (Nelsen 2006). For a sample of size n, the empirical two-dimensional copula can be calculated as follows:

$$C_{n} \left( {u_{1} ,u_{2} } \right) = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} I\left( {\frac{{R_{i1} }}{n + 1} \le u_{1} ,\frac{{R_{i2} }}{n + 1} \le u_{2} } \right)$$
(4)

In the above relationship, n denotes the sample size and \(I\left( \omega \right)\) denotes the indicator function, which takes a value of one when the logical expression \(\omega\) is true and zero otherwise. \(R_{i1}\) and \(R_{i2}\) are the ranks of the ith observation of the first and second variables, respectively, and \(u_{k}\) is the CDF value of the kth variable at which the empirical copula is evaluated.
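A minimal Python sketch of the empirical copula of Eq. 4 is given below, evaluated at a point \(\left( {u_{1} ,u_{2} } \right)\). The function name and the four-point sample are hypothetical and serve only to show the rank transformation.

```python
import numpy as np
from scipy.stats import rankdata


def empirical_copula(x, y, u1, u2):
    """Empirical bivariate copula C_n(u1, u2) built from ranks / (n + 1)."""
    n = len(x)
    pseudo_x = rankdata(x) / (n + 1)  # R_i1 / (n + 1)
    pseudo_y = rankdata(y) / (n + 1)  # R_i2 / (n + 1)
    # Fraction of observations whose scaled ranks fall below both thresholds
    return np.mean((pseudo_x <= u1) & (pseudo_y <= u2))


# Hypothetical toy sample (not measured discharges)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.5, 2.5, 2.0, 5.0])

c = empirical_copula(x, y, 0.5, 0.7)
print(c)
```

In a goodness-of-fit comparison, this function would be evaluated at the pseudo-observations of each data pair and compared with the theoretical copula CDF at the same points.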

Dependency structure analysis

Normally, Pearson's linear correlation coefficient (r) is used to measure the relationship between two variables. However, this measure has some flaws, including that the r coefficient is strongly affected by outliers. Also, if X or Y or both are transformed monotonically, for example raised to a positive power other than one, the value of the correlation coefficient changes, whereas their rank correlation does not. Pearson's correlation coefficient is suitable for data that follow an elliptical distribution (Nelsen 2006).

Nonparametric statistics, including Kendall's τ, can be used to overcome the drawbacks of Pearson's correlation coefficient. Nonparametric correlation coefficients can describe different types of association because the data distribution and outliers have little effect on them. Kendall's τ is a rank correlation coefficient widely used with copula functions and is defined as follows:

$$\tau = P\left[ {\left( {X_{1} - X_{2} } \right)\left( {Y_{1} - Y_{2} } \right) > 0} \right] - P\left[ {\left( {X_{1} - X_{2} } \right)\left( {Y_{1} - Y_{2} } \right) < 0} \right]$$
(5)

The term \(P\left[ {\left( {X_{1} - X_{2} } \right)\left( {Y_{1} - Y_{2} } \right) > 0} \right]\) is the probability of concordance and \(P\left[ {\left( {X_{1} - X_{2} } \right)\left( {Y_{1} - Y_{2} } \right) < 0} \right]\) is the probability of discordance. If \(\left( {x_{i} - x_{j} } \right)\left( {y_{i} - y_{j} } \right) > 0\), the two pairs are concordant; if \(\left( {x_{i} - x_{j} } \right)\left( {y_{i} - y_{j} } \right) = 0\), the two pairs are neither concordant nor discordant; and if \(\left( {x_{i} - x_{j} } \right)\left( {y_{i} - y_{j} } \right) < 0\), the two pairs are discordant. Kendall's coefficient lies within the range \(\left[ { - 1,1} \right]\), where 1 indicates complete concordance, zero indicates no association, and − 1 indicates complete discordance.

For a random sample of n paired observations, \(\left( {x_{1} ,y_{1} } \right),\left( {x_{2} ,y_{2} } \right), \ldots ,\left( {x_{n} ,y_{n} } \right)\), the sample estimator of Kendall's tau can be calculated by the following relationship:

$$\hat{\tau } = \left( {\begin{array}{*{20}c} n \\ 2 \\ \end{array} } \right)^{ - 1} \mathop \sum \limits_{1 \le i < j \le n} {\text{sgn}} \left[ {\left( {x_{i} - x_{j} } \right)\left( {y_{i} - y_{j} } \right)} \right]$$
(6)

where \(i,j = 1,2, \ldots ,n\) and \({\text{sgn}} \left( \psi \right)\) is the sign function:

$${\text{sgn}} \left( \psi \right) = \left\{ {\begin{array}{*{20}c} 1 & {{\text{if}}} & {\psi > 0} \\ 0 & {{\text{if}}} & {\psi = 0} \\ { - 1} & {{\text{if}}} & {\psi < 0} \\ \end{array} } \right.$$
(7)
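The sample estimator in Eqs. 6 and 7 can be written directly as a short Python function. The sketch below is illustrative only; the function name and the toy data are hypothetical.

```python
from itertools import combinations

import numpy as np


def kendall_tau(x, y):
    """Sample Kendall's tau: average sign over all n(n-1)/2 pairs (Eq. 6)."""
    n = len(x)
    total = sum(
        np.sign((x[i] - x[j]) * (y[i] - y[j]))  # sgn(.) as in Eq. 7
        for i, j in combinations(range(n), 2)
    )
    return total / (n * (n - 1) / 2)


# Hypothetical toy sample: 5 concordant pairs and 1 discordant pair
x = [1.0, 2.0, 3.0, 4.0]
y = [1.5, 2.5, 2.0, 5.0]

tau = kendall_tau(x, y)
print(round(tau, 4))  # (5 - 1) / 6
```

For larger samples, `scipy.stats.kendalltau` is a faster alternative; it computes the tau-b variant by default, which coincides with this estimator when the data contain no ties.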

In this study, in order to construct a bivariate distribution of the discharge of Gerdebisheh spring and Absefid spring, the fitness of 8 different copulas was evaluated. The CDF formula and the range of dependency parameter (θ) of the studied copula functions are given in Table 2.

Table 2 CDF formula and dependence parameter range of copula functions used in this study (Nelsen 2006)

Conditional state of the copula function

One of the important issues in using conditional joint distributions is the uncertainty of the results (Tahroudi et al. 2020). To overcome this shortcoming, this study used an alternative method based on conditional density relations of copulas. In this method, the following algorithm is implemented in each step:

  1. The copula density \(c\left( {u,v} \right)\) is plotted in two-dimensional mode for the investigated variables, with one of the CDF values (u or v) held fixed. As an illustration, if the goal is to estimate the discharge of the Absefid spring given the Gerdebisheh spring discharge values, v is set equal to the CDF of the Gerdebisheh spring discharge, because the discharge of Absefid spring is treated as dependent on the discharge of Gerdebisheh spring.
  2. For each value of Gerdebisheh spring discharge, a curve of \(c\left( {u,v} \right)\) is drawn as a function of u.
  3. In each curve, the maximum value of \(c\left( {u,v} \right)\) is chosen and its corresponding value is determined on the x axis.
  4. These values at the maxima correspond to the Absefid spring discharge associated with each Gerdebisheh spring discharge.

The copula density used in this algorithm is obtained from the copula CDF by the following equation:

$$c\left( {u,v} \right) = \frac{{\partial^{2} C\left( {u,v} \right)}}{\partial u\partial v}$$
(8)

where c and C are the PDF and CDF of the copula function, respectively. u and v are the CDFs of the marginal distributions of variables \(X_{1}\) and \(X_{2}\), respectively.
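The algorithm above can be sketched in Python for the Joe copula. This is an interpretation under stated assumptions rather than the authors' code: the density is the standard closed form obtained from Eq. 8, the maximum is located on a simple grid, the marginal distributions and their parameters are hypothetical, and mapping the maximizing u back to a discharge through the inverse marginal CDF is an assumed reading of step 4. Only the value θ = 3.97 is taken from the paper's results.

```python
import numpy as np
from scipy import stats


def joe_cdf(u, v, theta):
    """Joe copula CDF (standard form, theta >= 1)."""
    a, b = (1.0 - u) ** theta, (1.0 - v) ** theta
    return 1.0 - (a + b - a * b) ** (1.0 / theta)


def joe_density(u, v, theta):
    """Joe copula density c(u, v) = d^2 C / (du dv), in closed form."""
    ub, vb = 1.0 - u, 1.0 - v
    s = ub**theta + vb**theta - (ub**theta) * (vb**theta)
    return (ub ** (theta - 1.0) * vb ** (theta - 1.0)
            * s ** (1.0 / theta - 2.0) * (theta - 1.0 + s))


def most_likely_u(v, theta, grid=None):
    """Steps 2-3: for a fixed v, return the u that maximizes c(u, v)."""
    if grid is None:
        grid = np.linspace(0.001, 0.999, 999)
    return grid[np.argmax(joe_density(grid, v, theta))]


theta = 3.97  # dependence parameter reported for the Joe copula

# Hypothetical marginals (the fitted parameters are not given in the text)
gs_marginal = stats.genextreme(c=-0.1, loc=1.0, scale=0.3)
as_marginal = stats.lognorm(s=0.4, scale=1.5)


def estimate_as(gs_value):
    """Step 1 and step 4: GS discharge -> v -> maximizing u -> AS discharge."""
    v = gs_marginal.cdf(gs_value)
    u_star = most_likely_u(v, theta)
    return as_marginal.ppf(u_star)


print(estimate_as(0.9), estimate_as(1.5))
```

A grid search is used for transparency; a bounded scalar optimizer would serve equally well once the density curve is known to have a single peak.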

Choosing the best copula function

To choose the most suitable copula function, a suitable marginal distribution was first identified for each studied variable (monthly discharge of Gerdebisheh spring and monthly discharge of Absefid spring), and the parameters of the marginal distributions were estimated by the MLE method. Several copula functions were then considered for linking these two marginal distributions, and their parameters were estimated using the IFM method (Joe 1997). In the next step, the appropriate copula function was selected by comparing the joint probability values of each copula with the corresponding empirical copula values, based on the Nash–Sutcliffe efficiency coefficient (NSE), root mean square error (RMSE), mean absolute error (MAE) and Akaike information criterion (AIC):

$${\text{NSE}} = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{n} (C_{pi} - C_{ei} )^{2} }}{{\mathop \sum \nolimits_{i = 1}^{n} (C_{ei} - \overline{{C_{e} }} )^{2} }}$$
(9)
$${\text{RMSE}} = \sqrt {\frac{1}{n}\mathop \sum \limits_{i = 1}^{n} (C_{pi} - C_{ei} )^{2} }$$
(10)
$${\text{MAE}} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left| {C_{pi} - C_{ei} } \right|$$
(11)
$${\text{AIC}} = 2m - 2\ln \left( L \right)$$
(12)

In the above relations, n is the sample size, \(C_{p}\) denotes the joint probability values calculated from the theoretical copula, \(C_{e}\) denotes the observed values of the empirical copula, \(\overline{C}_{e}\) is the mean value of the empirical copula, m denotes the number of parameters, and L is the maximum value of the likelihood function. The most suitable copula is the one for which the RMSE and MAE values are closest to zero, the AIC is lowest, and the NSE value is closest to one. The value of the NSE index ranges from \(- \infty\) to 1. An NSE value within the range of 0.75–1.0 denotes very good model performance, a range of 0.36–0.75 indicates satisfactory performance, and a value less than 0.36 signifies poor model performance (Nash & Sutcliffe 1970).
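Equations 9 to 12 translate into a few lines of Python. The sketch below uses hypothetical function names and made-up probability values purely to show the arithmetic.

```python
import numpy as np


def nse(c_p, c_e):
    """Nash-Sutcliffe efficiency (Eq. 9)."""
    return 1.0 - np.sum((c_p - c_e) ** 2) / np.sum((c_e - c_e.mean()) ** 2)


def rmse(c_p, c_e):
    """Root mean square error (Eq. 10)."""
    return np.sqrt(np.mean((c_p - c_e) ** 2))


def mae(c_p, c_e):
    """Mean absolute error (Eq. 11)."""
    return np.mean(np.abs(c_p - c_e))


def aic(log_likelihood, m):
    """Akaike information criterion (Eq. 12), given ln(L) and m parameters."""
    return 2.0 * m - 2.0 * log_likelihood


# Made-up empirical and theoretical copula values (illustration only)
c_e = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
c_p = c_e + np.array([0.02, -0.02, 0.02, -0.02, 0.02])

print(nse(c_p, c_e), rmse(c_p, c_e), mae(c_p, c_e), aic(3.0, 1))
```

Note that `aic` takes the maximized log-likelihood ln(L) directly, so a one-parameter copula with ln(L) = 3.0 gives an AIC of −4.0.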

Copula-based model for estimating the discharge of Absefid Spring

In this study, the Kendall’s tau coefficient was calculated between the monthly discharge data of Gerdebisheh spring (GS) and the monthly discharge of Absefid spring (AS). Because the value of this coefficient was high, it can be concluded that the discharges of Absefid spring and Gerdebisheh spring have similar behavior and fluctuations. In the next step, the best marginal distribution was determined for each of the two variables, the monthly discharge of Gerdebisheh spring and the monthly discharge of Absefid spring. For this purpose, the fit of 11 different distributions was examined, and by comparing the probability values obtained from each distribution with the respective empirical probability values, the distribution with the best fit for each variable was identified.

Then, the fitness of 8 different copulas listed in Table 2 was tested on the pair of GS discharge and AS discharge data, and the appropriate copula function was determined and its dependence parameter was estimated. After determining the appropriate copula, bivariate distribution of GS and AS monthly discharge was created. In the next step, taking into account the probability of 90% and having the monthly flow rate in Gerdebisheh spring from March 1993 to August 2022, the corresponding monthly discharge of Absefid spring was estimated.

Results and discussion

The results of the fitting of 11 different univariate distributions on the monthly discharge data of Gerdebisheh spring and Absefid spring are given in Table 3. As depicted in this table, based on the NSE and RMSE statistics, the GEV and Log-Normal distributions have the best fit on the discharge data of Gerdebisheh and Absefid springs, respectively.

Table 3 Results of examining the fit of marginal distributions on the discharge of Gerdebisheh and Absefid springs

Correlation analysis of springs discharge

In order to measure the correlation between the discharges of Gerdebisheh and Absefid springs, the Kendall's tau statistic and the Pearson correlation coefficient were calculated, and the results are presented in Figs. 3 and 4, respectively. The presence of a high correlation between the pair of investigated variables is a prerequisite for the implementation of copula-based simulation. The results of the correlation analysis show that, based on both Kendall's tau and the Pearson correlation coefficient, there is an acceptable correlation between the monthly discharges of these two springs (Nelsen 2006; Salvadori et al. 2007). In various studies, such as Wiboonpongse et al. (2015) and Nazeri Tahroudi et al. (2022), a Kendall's tau of more than 0.3 was considered an acceptable correlation. Therefore, it is possible to perform copula-based simulations using the conditional density of copulas.

Fig. 3 Results of correlation analysis of pairs of GS–AS values based on Kendall's Tau statistic

Fig. 4 Results of correlation analysis of pairs of GS–AS values based on Pearson correlation coefficient

Choosing the best copula

The results of fitting eight different copula functions to the GS–AS discharge pairs are given in Table 4. The Joe copula, with AIC, RMSE, NSE, and MAE values equal to − 4.51, 0.059, 0.97, and 0.046, respectively, has the best fit to the GS–AS pair, although the Galambos and Gumbel–Hougaard (GH) copulas also fit the studied variables well. Therefore, the bivariate distribution of the discharge of Gerdebisheh and Absefid springs was created using the Joe copula. The dependency parameter of the Joe copula was estimated as 3.97 using the IFM method.

Table 4 Results of the goodness of fit test of different copula functions on the discharge data of Gerdebisheh and Absefid springs

Copula-based simulation using conditional density

Using the Joe copula as the best-fitting copula and its conditional density, a copula-based simulation was performed to analyze the frequency of pairs of GS–AS values. The values of Kendall's tau in the bivariate simulation of GS–AS values are presented in Fig. 5. Black numbers and circles represent the simulated values, and red numbers and circles represent the observed values. The results show that the simulated values have a higher correlation than the observed values.
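A conditional simulation of this kind can be sketched as follows, using the Joe conditional distribution (the partial derivative of the copula with respect to its first argument) and inverting it numerically. The parameter 3.97 is the value reported above; the function names, the bisection inversion, and the random seed are illustrative assumptions. The outputs are on the probability scale and would still have to be mapped to discharges through the fitted marginal distributions.

```python
import random

THETA = 3.97  # Joe dependence parameter reported in the text

def joe_h(v, u, theta=THETA):
    """Conditional CDF P(V <= v | U = u) of the Joe copula, for 0 <= u < 1."""
    a = (1.0 - u) ** theta
    b = (1.0 - v) ** theta
    s = a + b - a * b
    return s ** (1.0 / theta - 1.0) * (1.0 - u) ** (theta - 1.0) * (1.0 - b)

def joe_h_inverse(p, u, theta=THETA, tol=1e-10):
    """Solve joe_h(v, u) = p for v by bisection (joe_h is increasing in v)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if joe_h(mid, u, theta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def simulate_pairs(n, theta=THETA, seed=0):
    """Draw n dependent (u, v) pairs by conditional inversion."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = rng.random()  # probability level of the conditioning variable
        p = rng.random()  # conditional probability level
        pairs.append((u, joe_h_inverse(p, u, theta)))
    return pairs

# 90% conditional quantile of V when U sits at its median
v90 = joe_h_inverse(0.90, 0.5)
print(f"v at 90% conditional probability given u = 0.5: {v90:.3f}")
print(simulate_pairs(3))
```

Fixing the conditional probability at 0.90 instead of drawing it at random gives a single conditional quantile per GS value, which is one way to read the 90% probability level used in this study; that reading is an interpretation, not a statement of the authors' exact procedure.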

Fig. 5

Simulation results of GS–AS values obtained from the copula-based model

The simulated values of AS under the condition of occurrence of GS values are presented in Fig. 6. Some overestimation and underestimation can be seen in the estimated Absefid spring discharge (AS) values. Nevertheless, according to the RMSE and R statistics and the 66% efficiency of the copula-based model, the results are satisfactory despite the short length of the input data.

Fig. 6

Results of simulation of AS values under the condition of occurrence of GS values using the copula-based model

Presenting a relationship to predict the monthly discharge of Absefid spring

After creating a bivariate model of the AS and GS pair and simulating the AS values, a regression relationship was created between the copula-based simulated values and the observed values. This relation can be used to estimate the monthly discharge of the Absefid spring (AS) given the monthly discharge of the Gerdebisheh spring (GS):

$${\text{AS}}_{i} = 1.2904 \times {\text{GS}}_{i} + 0.5822$$
(13)

It should be noted that Eq. 13 was estimated under the assumption of a 90% joint probability. To simulate a joint probability of more than 90%, a frequency analysis is first conducted to estimate the probability of joint occurrence, and the simulation is then performed based on the conditional density of the copula for that probability level. Since Eq. 13 is a regression equation fitted to the simulated discharge values of the Absefid spring, it is less accurate than the direct use of the copula-based model, but it is much easier to apply. The simulated discharge of Absefid spring (AS) conditional on the discharge of Gerdebisheh spring (GS) at different probability levels is shown in Fig. 7. Using Fig. 7, the values of AS can be estimated from the values of GS at different probability levels. For example, if the discharge of Gerdebisheh spring is 2.5 cubic meters per second, then according to Fig. 7 the discharge of Absefid spring will be 3.8 cubic meters per second with a 90% probability.
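Eq. 13 can be applied directly, as in the short sketch below. The function name is illustrative, and the unit of cubic meters per second for both springs is an assumption inferred from the worked example in the text.

```python
def estimate_as_discharge(gs_discharge):
    """Eq. 13: AS monthly discharge (m^3/s) from GS monthly discharge (m^3/s)."""
    return 1.2904 * gs_discharge + 0.5822

gs_example = 2.5  # m^3/s, the example value used in the text
as_example = estimate_as_discharge(gs_example)
print(f"AS = {as_example:.2f} m^3/s")  # 3.81, in line with the 3.8 read from Fig. 7
```

For GS = 2.5 the equation gives 1.2904 × 2.5 + 0.5822 = 3.8082, which agrees with the value of about 3.8 cubic meters per second obtained graphically at the 90% level.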

Fig. 7

Joint probability of occurrence of AS values under the condition of occurrence of GS values

Using Eq. 13, the discharge of Absefid Spring in the period from March 1993 to August 2022 was estimated with a 90% probability of occurrence and is presented in Fig. 8 along with the time series of the discharge recorded at Gerdebisheh spring. The discharge values of Absefid spring estimated for the same period using the bivariate copula model with a 90% probability of occurrence are presented in Table 5.

Fig. 8

Monthly time series of measured discharge in Gerdebisheh spring and estimated discharge in Absefid spring using copula-based model

Table 5 Estimated discharge values of Absefid spring using copula-based model at 90% probability level

To determine the reliable drinking water supply for the city of Borujen from Absefid spring, the average of its minimum monthly discharge values can be recommended (Fig. 9). As seen in Table 5 and Fig. 9, the lowest estimated discharge of the Absefid spring is 600 L per second, in June 2001, whereas the average minimum discharge of this spring is about 920 L per second and the maximum average flow is 3240 L per second (Figs. 9, 10). The average flow during the simulated period is about 1830 L per second (Fig. 11).

Fig. 9

Minimum discharge values of Absefid Spring in different months of the year (1993–2022)

Fig. 10

Maximum discharge values of Absefid Spring in different months of the year (1993–2022)

Fig. 11

Long-term average of monthly discharge of Absefid Spring (1993–2022)

Conclusion

Geological investigations show that Gerdebisheh and Absefid springs probably share the same origin. The correlation between the discharges of the two springs over the period when both had observational data also indicated similar behavior and fluctuations. Therefore, in this study, a copula-based model was used to create a bivariate distribution of the discharges of Gerdebisheh and Absefid springs, and the conditional probability was used to estimate the Absefid spring discharge with a probability of 90% for the period March 1993–August 2022. Based on the results, the lowest discharge estimated for Absefid spring is 600 L per second, although the average minimum discharge of this spring is about 920 L per second. Therefore, for planning the allocation of this spring's water to drinking needs, the safe and reliable yield (firm yield) of this spring can be taken as 600 L per second, or 18.92 MCM/year.
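The annual volume quoted for the firm yield follows from a unit conversion of the 600 L per second flow, assuming a 365-day year (the symbol for the firm yield is introduced here only for illustration):

$$Q_{\text{firm}} = 600\ \frac{\text{L}}{\text{s}} \times 10^{-3}\ \frac{\text{m}^{3}}{\text{L}} \times 86{,}400\ \frac{\text{s}}{\text{day}} \times 365\ \frac{\text{days}}{\text{year}} \approx 18.92 \times 10^{6}\ \frac{\text{m}^{3}}{\text{year}} = 18.92\ \text{MCM/year}$$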

It should be noted that the environmental water requirements of the Karebas river must be evaluated to accurately determine the allocable water, because the water of the Absefid spring flows into this river, and allocating the spring water for drinking purposes may have adverse effects on the environment of the Karebas river basin. Also, because of the karst nature of this spring, its catchment area extends beyond the surface basin, and determining the factors that control the water supply of this spring requires karst hydrogeological studies in the area. The water quality of Absefid spring was not studied here, so continuous measurement of the water quality of this spring is suggested. In particular, according to local residents, the turbidity of the spring water sometimes increases at the entrance to the river when it rains, which needs further investigation. To apply the proposed method to estimate the flow rate of a spring, the geological characteristics of the study area must be investigated, and a spring with the same hydrogeological regime whose flow has an acceptable dependence on that of the studied spring should be selected for modeling. In addition, the accuracy of the method depends strongly on the length of the observed data.