1 Introduction

Accurate and up-to-date space–time information on crop requirements under a specific regional climate is essential for water resources management and agricultural decision-making (Shelia et al. 2019). Agricultural parameters are usually estimated through field sampling and measurement (Wellens et al. 2017). However, because this traditional estimation is costly and time-consuming (Mohamed Sallah et al. 2019), crop models are needed for agricultural planning and irrigation scheduling (Midingoyi et al. 2021).

Several significant crop models have been developed in the last few decades to understand the relationships within the soil-crop-atmosphere system and its main controlling factors (Bouman et al. 1996; Droutsas et al. 2019; Yu et al. 2021). Crop models are practical tools in research and management for different purposes, such as ecology, environment, and agronomy (Jin et al. 2018). They quantify the growth and production of crops. These models can be helpful for irrigation scheduling, climate change impact evaluation, field management, and crop yield prediction (Hossard et al. 2017; Huang et al. 2020).

Crop models (such as PMWIN, DSSAT, CropSyst, and APSIM) represent agricultural processes mathematically based on theory and empirical research; the representation therefore entails assumptions and simplifications of reality that make the output variables uncertain and inaccurate (Han et al. 2019; Andrea Saltelli et al. 2008). In addition, many parameters (up to hundreds) must be specified to describe the properties of the soil-crop-atmosphere system (Ganot and Dahlke 2021; Thorp et al. 2020). Estimating every parameter in the model requires extensive field measurements, which is costly and time-consuming. To reduce cost and time, the small set of parameters that affect crop growth the most needs to be identified (Kelly and Foster 2021; Poulose et al. 2021; Xu et al. 2016).

The AquaCrop model is a water-driven model developed by FAO that simulates crop parameters under different management conditions. It strikes a good balance between robustness, simplicity, and output accuracy (Raes et al. 2009; Steduto et al. 2009; Vanuytrecht et al. 2014a, b). The model is based on the concepts of crop yield response to water developed by Doorenbos and Kassam (Delgoda et al. 2016; Doorenbos et al. 1980). Several studies have used the AquaCrop model to simulate yields for various crops under normal and different stress conditions, e.g., wheat (Jalil et al. 2020; Ruane et al. 2016; XING et al. 2017), maize (Elbeltagi et al. 2020; Jalil et al. 2020; Sandhu and Irmak 2019), barley (Hellal et al. 2019; López-Urrea et al. 2020), potato (Montoya et al. 2016; Razzaghi et al. 2017), rice (Er-Raki et al. 2021; Xu et al. 2019; Zhai et al. 2019), grape (Er-Raki et al. 2021), date (Nunes et al. 2021), soybeans (Adeboye et al. 2019), and cotton (Tsakmakis et al. 2019). Although these results demonstrated the AquaCrop model's accuracy, its need for extensive data is undesirable. There is therefore a need to reduce the number of inputs, find the most influential ones, and understand the relations between different parameters and their best-fitted values to calibrate the model more efficiently (Hamby 1994; Shirazi et al. 2021; Zhang et al. 2022).

To quantify and compare the impact of various parameters on a model's output, sensitivity analysis (SA) can be used (Green and Whittemore 2005). SA is an uncertainty analysis technique that characterizes the impact of the input factors on the output of a model (Sarrazin et al. 2016) and is considered a prerequisite step in the model-building process (Campolongo et al. 2007). This diagnostic tool suggests considering high-impact parameters while neglecting the low-impact ones (Stella et al. 2014) by identifying parameters that have a significant impact on model simulations for specific regions (van Griensven et al. 2006). The modeling domain and the aim of the specific application control the type of approach, level of complexity, and purposes of SA (Pianosi et al. 2016).

SA methods are classified as local SA (LSA) and global SA (GSA) methods. In the local SA method, only one input factor varies at a time, while the others are fixed at nominal values (Wang et al. 2013). Although this method is efficient, quick, and easy to use (Xu et al. 2016), it cannot be used to study the joint effects of several model parameters on the model output (DeJonge et al. 2015). To examine the interactions among several factors and to vary the model parameters simultaneously, global SA algorithms were developed (Hamm et al. 2006).
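The limitation of the local method can be illustrated with a minimal Python sketch. The two-parameter `toy_model`, its nominal values, and the 10% perturbation step are hypothetical choices made only for illustration; they are not part of AquaCrop or of the cited studies.

```python
def toy_model(x1, x2):
    """Hypothetical model with an interaction term (x1 * x2)."""
    return x1 + x1 * x2


def oat_sensitivity(model, nominal, rel_step=0.1):
    """Perturb one input at a time while the others stay at nominal values."""
    base = model(*nominal)
    effects = []
    for i, value in enumerate(nominal):
        perturbed = list(nominal)
        perturbed[i] = value * (1.0 + rel_step)
        # Finite-difference slope of the output with respect to input i
        effects.append((model(*perturbed) - base) / (value * rel_step))
    return effects


# The local sensitivity to x1 depends on where x2 is fixed
effects_low_x2 = oat_sensitivity(toy_model, [1.0, 2.0])
effects_high_x2 = oat_sensitivity(toy_model, [1.0, 5.0])
print(effects_low_x2[0], effects_high_x2[0])
```

In this toy case the local sensitivity to `x1` roughly doubles when the nominal value of `x2` is moved from 2 to 5, which a single one-at-a-time experiment around one nominal point cannot reveal.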

GSA is a set of mathematical techniques that investigate how variations in the inputs of a numerical model affect variations in its output. GSA has been used for different purposes, such as apportioning output uncertainty to different sources of uncertainty in a model (e.g., unknown parameters, measurement errors in input data) (Pianosi et al. 2015), model calibration, verification, diagnostic evaluation, or simplification (Sieber and Uhlenbrook 2005), uncertainty reduction (Hamm et al. 2006), analysis of the dominant controls of a system (Pastres et al. 1999), and robust decision-making (Anderson et al. 2014).

Several GSA methods have been developed (Morris 1991; Pappenberger et al. 2008; Saltelli et al. 1999; Sobol 1993; Yang 2011), which are commonly used as auxiliary tools in different fields, e.g., hydrology (Mehdi Ahmadi et al. 2014), ecology (Ciric et al. 2012), and crop modeling (Vazquez-Cruz et al. 2014). Although GSA is an essential tool for developing and calibrating models, the application of its techniques is still rather limited in some domains (Pianosi et al. 2015). Some freely available GSA tools are a repository of Matlab and Fortran functions maintained by the Joint Research Centre, the GUI-HDMR Matlab package, the C++ based PSUADE software, the Python Sensitivity Analysis Library SALib, and the Matlab toolbox Sensitivity Analysis For Everybody (SAFE) (Pianosi et al. 2015). The last one, SAFE, is used in this study.

In recent years, there has been a surge in the number of methods for calculating meaningful uncertainty bounds on model predictions, such as classical Bayesian (Kuczera and Parent 1998; Liu et al. 2021; Vrugt et al. 2001; Yin et al. 2021), pseudo-Bayesian (Beven and Binley 1992; Freer et al. 1996; Freni et al. 2009), set-theoretic (Klepper et al. 1991; Van Straten and Keesman 1991; Jasper A. Vrugt et al. 2003), multiple criteria (Gupta et al. 1998; Madsen 2000; Henrik Madsen 2003; Yapo et al. 1998), sequential data assimilation (Blasone et al. 2008; Fan et al. 2016; Moradkhani et al. 2005; Jasper A. Vrugt et al. 2005), and multi-model averaging methods (Jasper A. Vrugt and Robinson 2007). Although each of these methods has its own advantages and disadvantages, the main difference between them lies in their assumptions and in the kinds of errors that are treated and made explicit (Blasone et al. 2008). In this study, Generalized Likelihood Uncertainty Estimation (GLUE) is used to study different model parameters of the AquaCrop model. This method, introduced by Beven and Binley (1992), was one of the first attempts to represent prediction uncertainty.

Monte-Carlo simulations can be used to combine probability distributions and examine the relationships between model input and outcome variables (Nash and Hannah 2011). Monte-Carlo simulations have been used in different fields, such as flood zoning (Natale and Savi 2007), agriculture (Baranyai and Zude 2009; Nash and Hannah 2011; Qin and Lu 2009), environmental modeling (Jasper A. Vrugt 2016), hydrological modeling (Jeremiah et al. 2012), groundwater modeling (Hassan et al. 2009), rivers (Berends et al. 2018), wastewater management (Piri et al. 2021), and coastal lands (Cooper et al. 2019).

Recently, Adabi et al. (2020) studied the LSA of the AquaCrop model for wheat and maize in two plains in Iran. They examined the effect of 47 crop parameters on five output variables of this closed-source crop model. The relative Nash–Sutcliffe Efficiency Index was used to evaluate the sensitivity of these parameters. They found that around half of the selected parameters in the Qazvin plain were ineffective, and that calibrating the AquaCrop model would be more efficient and simpler than different GSA methods (Adabi et al. 2020).

In this study, we extended that research by applying GSA and GLUE methods to wheat in the Qazvin plain to find efficient domains for each of the 47 parameters, so that the AquaCrop model can be calibrated with the highest output accuracy. Five outputs were studied: soil evaporation, crop transpiration, evapotranspiration, crop biomass at maturity, and grain yield. For this purpose, the model was calibrated with two years of observed data and then run with 36 years of data from the synoptic station in the Qazvin plain. After that, 3000 random runs based on the Monte-Carlo method were conducted. Then a new domain of each parameter was introduced with a 5% error with respect to the real data. Finally, the probabilistic behavior of each parameter for the five outputs was introduced.

2 Methods and materials

2.1 The AquaCrop model

In this study, the AquaCrop model was calibrated by GSA methods. AquaCrop, a water-driven model, evolved from the earlier Doorenbos and Kassam (1980) approach and simulates crop development from different factors. A water balance approach is used in this model to simulate the soil water condition in the root zone. The FAO-PM evapotranspiration is divided into actual crop transpiration and soil evaporation by using the soil water status and the canopy cover information. First-order kinetics is used to describe the development of the canopy cover. The model also considers different stresses, such as water, temperature, and salinity stress. Biomass production is then estimated from the actual crop transpiration using a normalized form of the water productivity. Finally, biomass production is used to calculate the crop yield through a specific harvest index. The model inputs can be classified into four main groups: meteorological conditions, initial values of the model parameters, soil characteristics, and management practices (Li et al. 2016; Linker et al. 2016; Steduto et al. 2009). The ability of the model to predict the total biomass and yield of a wide range of crops under various irrigation strategies with high accuracy has been demonstrated in multiple studies (Andarzian et al. 2011; Araya et al. 2010; Battilani et al. 2015; Heng et al. 2009).
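The last two steps of this chain can be sketched in Python, assuming the standard AquaCrop relations B = WP* · Σ(Tr/ETo) and Y = HI · B. The function names, the three daily values, the WP* of 15 g m−2, and the harvest index of 0.45 are illustrative assumptions and not values calibrated in this study.

```python
def biomass_from_transpiration(wp_star, transpiration, et0):
    """Biomass (g/m2) as WP* times the sum of daily Tr/ETo ratios."""
    return wp_star * sum(tr / eto for tr, eto in zip(transpiration, et0))


def grain_yield(biomass, harvest_index):
    """Yield as the harvest-index fraction of the biomass."""
    return harvest_index * biomass


# Illustrative daily crop transpiration and reference evapotranspiration (mm/day)
daily_tr = [2.0, 3.0, 4.0]
daily_et0 = [4.0, 5.0, 5.0]

biomass = biomass_from_transpiration(15.0, daily_tr, daily_et0)
crop_yield = grain_yield(biomass, 0.45)
print(biomass, crop_yield)
```

Normalizing transpiration by the reference evapotranspiration is what makes the water productivity term transferable between climates, which is why the ratio, rather than the raw transpiration, is summed.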

The model has two main aspects that set it apart from other crop models. First, proportional green canopy cover is used instead of leaf area index. In this way, the output is more accessible from visual field observations and remote sensing (Calera et al. 2001; Carlson and Ripley 1997; Johnson and Trout 2012; Kim and Kaluarachchi 2015). Second, as mentioned above, the model considers more stresses than other crop models (Foster et al. 2017). This model can be run in thermal time mode or calendar time mode (Vanuytrecht et al. 2014a, b). In this study, AquaCrop version 5 was used in thermal time mode. Table 1 lists the names of the 47 crop parameters used in the AquaCrop model. These parameters are varied to understand their impact on the output results. The outputs are: soil evaporation (E), crop transpiration (T), evapotranspiration (ET), crop biomass at maturity (Biomass), and grain yield (Y).

Table 1 Names of the crop parameters used in the AquaCrop model and varied to study their impact on the results

2.2 SAFE toolbox and Easy Fit software

The SAFE toolbox, which is designed for both specialist and non-specialist users, is used in this study for evaluating the uncertainty of 47 model parameters based on five outputs. Non-specialist users with basic knowledge of GSA or Matlab can use this toolbox, while experienced users can understand, customize, and develop the code (Pianosi et al. 2015). The initial release of the toolbox comprises the Elementary Effects Test (EET or Morris method (Morris 1991)), Regional Sensitivity Analysis (RSA (Spear and Hornberger 1980; Thorsten Wagener and Kollat 2007)), Variance-Based Sensitivity Analysis (VBSA, or Sobol method (Andrea Saltelli et al. 2008)), Fourier Amplitude Sensitivity Test (FAST (Cukier et al. 1973)), Dynamic Identifiability Analysis (DYNIA (T. Wagener et al. 2003)), and PAWN (Pianosi and Wagener 2015). The SAFE toolbox is written in Matlab, but it may also be used with the free GNU Octave environment (www.gnu.org/software/octave) on any operating system (Windows, Linux, and Mac OS X) (Pianosi et al. 2015). A version of the SAFE toolbox is also available for R. In this research, a personal computer with a 2.2 GHz Core i7 processor and 8.0 GB of RAM was used to run the SAFE toolbox.

In this research, the Easy Fit software, Version 5.5, was used to determine the best-fitting probability density function (PDF) of the parameters using the Kolmogorov–Smirnov goodness-of-fit test. The software evaluated 65 different PDFs and reported the best PDF for each output.
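The idea of ranking candidate distributions by the Kolmogorov–Smirnov statistic can be sketched in plain Python. This is not the Easy Fit procedure itself: only two hypothetical candidates (a uniform and an exponential CDF with fixed, assumed parameters) are compared, and the sample is synthetic.

```python
import math
import random


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of `sample` and a candidate `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d


def uniform_cdf(x, low=0.0, high=1.0):
    return min(1.0, max(0.0, (x - low) / (high - low)))


def exponential_cdf(x, rate=2.0):
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


random.seed(0)
sample = [random.uniform(0.0, 1.0) for _ in range(500)]  # synthetic parameter values

candidates = {"Uniform": uniform_cdf, "Exponential": exponential_cdf}
scores = {name: ks_statistic(sample, cdf) for name, cdf in candidates.items()}
best = min(scores, key=scores.get)  # smallest KS distance = best fit
print(best, scores)
```

A full fitting tool would also estimate the parameters of every candidate family before comparing them; here they are fixed to keep the sketch short.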

2.3 Monte-Carlo simulations methods

Monte-Carlo simulation is a mathematical modeling tool that can be used to obtain values for uncertain variables. Monte-Carlo simulations can also be used to assess the impacts of individual terms on model outcomes when using stochastic data and stochastic models (Metropolis and Ulam 1949). These methods belong to a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. They are mainly used in optimization, numerical integration, and generating draws from probability distributions, the last of which is the reason for using them here (Kroese et al. 2014). In this study, 3000 runs were conducted to simulate time series for each output target by the Monte-Carlo method. This large number of runs was needed for the results to converge.
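One simple way to judge whether the number of runs is sufficient is to track the running mean of an output as runs accumulate. The sketch below is only an illustration of that idea: the source does not state which convergence criterion was applied, and the uniformly distributed `outputs` are a hypothetical stand-in for model results.

```python
import random


def running_means(values):
    """Mean of the first k values, for k = 1 .. len(values)."""
    total = 0.0
    means = []
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means


random.seed(3)
# Hypothetical outputs of 3000 Monte-Carlo runs
outputs = [random.uniform(0.0, 10.0) for _ in range(3000)]

means = running_means(outputs)
# Change of the running mean over the last 500 runs; a small value suggests convergence
drift = abs(means[-1] - means[-501])
print(means[-1], drift)
```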

2.4 The GLUE methodology

The GLUE approach is a Monte Carlo method that aims to find a set of behavioral models from many possible model/parameter combinations. Each group of parameter values is assigned a likelihood value based on the comparison of predicted and observed responses. Higher likelihood function values often suggest a better match between model predictions and observations. The complete sample of simulations is then divided into behavioral and non-behavioral parameter combinations based on a cutoff threshold. The likelihood values of the preserved solutions are then rescaled to create the output prediction's cumulative distribution function (CDF). In most published GLUE studies, the deterministic model prediction is supplied by the median of the output distribution, and the related uncertainty is determined from the CDF, commonly at the 5% and 95% confidence levels, which are known as the 90% confidence bounds or prediction limits (Blasone et al. 2008).
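The rescaling step and the resulting prediction limits can be sketched as a likelihood-weighted quantile. The five predictions and their likelihood values below are hypothetical; when an error measure such as NRMSE is used, it would first have to be converted into a weight that increases with goodness of fit.

```python
def weighted_quantile(values, weights, q):
    """Value at which the cumulative rescaled likelihood first reaches q."""
    total = sum(weights)
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight / total  # rescale so that the weights sum to 1
        if cumulative >= q:
            return value
    return max(values)


# Hypothetical behavioral predictions of one output and their likelihood values
predictions = [4.0, 5.0, 6.0, 7.0, 8.0]
likelihoods = [1.0, 2.0, 4.0, 2.0, 1.0]

lower = weighted_quantile(predictions, likelihoods, 0.05)
median = weighted_quantile(predictions, likelihoods, 0.50)
upper = weighted_quantile(predictions, likelihoods, 0.95)
print(lower, median, upper)
```

The interval from `lower` to `upper` corresponds to the 90% prediction limits, and `median` plays the role of the deterministic prediction.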

The GLUE method has three steps: (1) Monte Carlo sampling from the possible parameter space with a uniform distribution. Because no prior distribution of the parameters is available, this distribution is selected for its simplicity (Migliaccio and Chaubey 2008). The range of each model parameter is divided into a number of intervals of equal probability, and for each run of the AquaCrop model one set of parameters is randomly selected from the possible ranges. In this study, random sampling is used to draw the parameter values from each interval. (2) Definition of a likelihood function comparing AquaCrop's outputs against observed values. The NRMSE is selected because it is a widely used likelihood measure for the GLUE method (Arabi et al. 2007; Beven and Freer 2001). (3) Definition of a threshold value for identifying behavioral and non-behavioral sets of parameters. Based on the literature (Beven and Binley 1992; Beven and Freer 2001), an NRMSE value of 10% is considered a reasonable threshold for AquaCrop simulation. Parameter sets with NRMSE values lower than 10% are chosen as behavioral parameter sets (best simulations). The mean simulated time series of the best simulations is taken as the optimum simulated time series, and the mean of the corresponding parameter sets is taken as the calibrated values.
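The interval-based sampling of step (1) can be sketched as follows. This is an assumed reading of the description, drawing one random value from each equal-probability interval of a uniform range; it is not taken from the SAFE toolbox. The two ranges are the primitive domains reported for X2 and X23, while the number of runs is reduced to 10 for brevity.

```python
import random


def stratified_uniform(low, high, n_intervals, rng):
    """Draw one random value from each of n equal-probability intervals."""
    width = (high - low) / n_intervals
    values = [low + (i + rng.random()) * width for i in range(n_intervals)]
    rng.shuffle(values)  # decouple the interval order from the run order
    return values


rng = random.Random(1)
ranges = {"X2": (0.0, 5.0), "X23": (0.0196, 0.065)}
n_runs = 10

# One column of sampled values per parameter, then one parameter set per run
columns = {name: stratified_uniform(lo, hi, n_runs, rng)
           for name, (lo, hi) in ranges.items()}
parameter_sets = [{name: columns[name][i] for name in ranges}
                  for i in range(n_runs)]
print(parameter_sets[0])
```

Shuffling each column independently keeps every interval of every parameter represented exactly once while pairing the parameters at random across runs.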

However, the GLUE prediction bounds do not have a statistical meaning. The GLUE method has been criticized for not being properly Bayesian, which is one of the main drawbacks of this approach (Christensen 2004; Y. Liu and Gupta 2007; Mantovan and Todini 2006; Montanari 2005; Vogel et al. 2008). Although the GLUE method is for this reason considered incoherent and inconsistent from a statistical point of view, some easy adjustments to GLUE can mitigate these drawbacks (Beven et al. 2007).

In this study, the following steps were used for the GLUE approach:

  • Primitive parameter distributions definition (Table 2—The first and second columns).

  • Random parameter sets generation based on the Monte Carlo methods (SAFE toolbox).

  • Model run (from the previous step) (MATLAB).

  • Likelihood values calculation (SAFE toolbox) (The index was Normalized Root Mean Square Error (NRMSE) (Eq. 1), and the threshold was 10%).

  • Secondary distribution construction (SAFE toolbox and Easy Fit software).

    $$\text{NRMSE} = \frac{\text{RMSE}}{\mu} = \frac{\sqrt{\frac{\sum (X_{s} - X_{c})^{2}}{n}}}{\mu}$$
    (1)
Table 2 Maximum and minimum temperature and average monthly effective rainfall of the Qazvin plain

In Eq. 1, RMSE is the root mean square error, μ is the mean of the comparison criterion, n is the number of data points (length of the time series), Xs is the simulated output, and Xc is the comparison criterion.
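The likelihood, threshold, and calibration steps listed above can be sketched end to end in Python. The linear `toy_model`, its two parameters `a` and `b`, their ranges, and the synthetic comparison series are hypothetical stand-ins for AquaCrop and its data; only the NRMSE definition (Eq. 1), the 3000 runs, and the 10% threshold follow the text.

```python
import math
import random


def nrmse(simulated, criterion):
    """Eq. 1: RMSE divided by the mean of the comparison criterion."""
    n = len(criterion)
    rmse = math.sqrt(sum((s - c) ** 2 for s, c in zip(simulated, criterion)) / n)
    return rmse / (sum(criterion) / n)


def toy_model(a, b, inputs):
    """Hypothetical stand-in for one crop model run: y = a * x + b."""
    return [a * x + b for x in inputs]


random.seed(42)
inputs = [1.0, 2.0, 3.0, 4.0, 5.0]
criterion = toy_model(2.0, 1.0, inputs)        # synthetic comparison series
ranges = {"a": (0.0, 4.0), "b": (0.0, 2.0)}    # assumed primitive domains

# Random parameter sets drawn from uniform distributions
samples = [{k: random.uniform(lo, hi) for k, (lo, hi) in ranges.items()}
           for _ in range(3000)]
# Likelihood measure of every run
scores = [nrmse(toy_model(s["a"], s["b"], inputs), criterion) for s in samples]
# Behavioral sets: NRMSE below the 10% threshold
behavioral = [s for s, e in zip(samples, scores) if e < 0.10]
# Calibrated values: mean of the behavioral parameter sets
calibrated = {k: sum(s[k] for s in behavioral) / len(behavioral) for k in ranges}
# Subsequent domain of each parameter: range spanned by the behavioral sets
subsequent = {k: (min(s[k] for s in behavioral), max(s[k] for s in behavioral))
              for k in ranges}
print(len(behavioral), calibrated, subsequent)
```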

The three main phases for all GSA approaches are: (i) Taking a sample of the inputs within their range of variability, (ii) Testing the model against the sample input combinations, and (iii) Calculating sensitivity indices after post-processing the input/output samples (Pianosi et al. 2015).
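These three phases can be illustrated with a small regional sensitivity analysis written in plain Python rather than with the SAFE toolbox. The two-input model, in which `x2` has almost no effect, and the output threshold of 0.3 are hypothetical; the sensitivity index is the maximum distance between the input CDFs of the behavioral and non-behavioral groups.

```python
import random


def max_cdf_distance(group_a, group_b):
    """Two-sample Kolmogorov-Smirnov distance between empirical CDFs."""
    a, b = sorted(group_a), sorted(group_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


random.seed(7)
# (i) sample the inputs within their ranges of variability
samples = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(2000)]
# (ii) evaluate the (hypothetical) model for every sampled input combination
outputs = [x1 + 0.01 * x2 for x1, x2 in samples]
# (iii) post-process: split the runs by an output threshold, compare input CDFs
behavioral = [s for s, y in zip(samples, outputs) if y < 0.3]
non_behavioral = [s for s, y in zip(samples, outputs) if y >= 0.3]
indices = [max_cdf_distance([s[k] for s in behavioral],
                            [s[k] for s in non_behavioral]) for k in range(2)]
print(indices)
```

A large index marks an input whose value largely decides whether a run is behavioral, whereas an index near zero marks an input that can be fixed with little loss.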

2.5 Case study and field data

Qazvin province is located in the central and northern region of Iran (48° 53' to 36° 50' longitude, and 53° 35' to 35° 18' latitude) with an area of 15 821 square kilometers (Fig. 1). This province has an average annual precipitation of 300 mm, which varies from 210 mm in the eastern parts to more than 550 mm in the northeastern heights. The maximum and minimum temperature as well as the average monthly effective rainfall of the Qazvin Plain are shown in Table 2. The daily and monthly average evapotranspiration of the reference plant as well as the average monthly rainfall are given in Table 3. The longitude, latitude, and elevation of the synoptic station in the province are 50.03, 36.15, and 1279.2 m, respectively (Mojgan Ahmadi et al. 2021). This province contains various geomorphological regions such as steep mountains (to the north), upland areas (south and west), and plains (in the center), which makes agricultural activities a major challenge (Darvishi et al. 2015; Yousefi et al. 2020). Nevertheless, this province has an important role in agriculture and is responsible for 3% of the country's total Gross Domestic Product (GDP) (Census Center of Iran, https://www.amar.org.ir/). The plain in the province is the most extensive plain in the Salt Lake basin, with an arid to semi-arid climate. According to the Regional Water Company of Qazvin, there are 22 000 deep and semi-deep wells in this province (http://www.qzrw.ir/).

Fig. 1 The location of the Qazvin province and the synoptic station in the province (Mojgan Ahmadi et al. 2021)

Table 3 Daily and monthly average evapotranspiration of the reference plant, and the average monthly rainfall

The agricultural lands are irrigated by a network of irrigation dams and channels. Wheat, maize, barley, alfalfa, saffron, sugar beet, lentils, beans, potatoes, walnuts, grapes, hazelnuts, and hawthorns are the crops cultivated in the province. More than 144 000 ha of land in this province is dedicated to wheat, the most important crop in the region, with a production of about 315 000 t yr−1. Maize accounts for more than 11% of the irrigated land of the province and yields 1,008,015 tons per year (Mojtabavi et al. 2018). In this study, both wheat and maize were investigated, but because the results were similar, only the results for wheat are presented.

3 Results and discussion

Finding observed historical data for calibrating a model is a crucial issue in time-series-generating models. Because long-term data were lacking and to save time in running the model, we calibrated the model with the accessible data and then ran it for a long-term period. This long-term run can, to a good approximation, be treated as observed data. In this study, the AquaCrop model was calibrated against the yield of wheat and maize for two years and then run with 36 years of meteorological data from the Qazvin synoptic station (from 1979 to 2014). For this research, 3000 random runs were conducted, with all 47 parameters from Table 1 sampled from uniform probability distributions. The number of runs was increased until all output results converged. There were five output targets: soil evaporation, crop transpiration, evapotranspiration, crop biomass at maturity, and grain yield. The 10% threshold of the NRMSE index was used in this study for applying the Acceptable Sample Rate (ASR). In other words, the RMSE should be less than 10% of the average of the comparison criterion over the time series period. The 5% confidence level of the observed data was used as another index.

Simulated time series of the five output targets (soil evaporation (E), crop transpiration (T), evapotranspiration (ET), crop biomass at maturity (Biomass), and grain yield (Yield)) are presented in Fig. 2. In other words, the 47 model parameters in Table 1 are used to produce these five targets. In Fig. 2, the blue lines represent all 3000 time series simulated for each output target by the Monte-Carlo method. The black points, which are located between the green lines, represent the observed data of the 36-year simulation. The 10% NRMSE threshold around these black points is shown by red lines as the ASR. The 95% confidence level (2.5% below and above the red time series) is shown by green lines, which represent the uncertainty bounds. All of the comparison criteria are covered by the 95% confidence level. This is because the comparison criterion time series is itself a specific run of the model and behaves similarly to its neighboring Monte-Carlo time series. The probability distribution functions (PDFs) of all important parameters are shown in Fig. 3.

Fig. 2 Simulated time series of five target outputs. Blue lines represent the 3000 simulated time series, black points are real data, red lines illustrate the 10% threshold of NRMSE, and green lines are the uncertainty bounds of the 5% confidence level

Fig. 3 The probability distribution functions (PDFs) of all important parameters of the AquaCrop model

Table 4 quantifies the results shown in Fig. 2. It gives the number of accepted time series based on the 95% ASR threshold for soil evaporation (E), crop transpiration (T), evapotranspiration (ET), crop biomass at maturity (Biomass), and grain yield (Yield).

Table 4 The number and percent of accepted time series of 3000 runs based on NRMSE < 10% threshold and 5% confidence level for soil evaporation (E), crop transpiration (T), evapotranspiration (ET), crop biomass at maturity (Biomass), and grain yield (Yield)

In Table 4, the second row gives the number of accepted time series for each of the five targets among all 3000 runs. The last row gives the percentage of accepted time series, which is the ratio of the second row to the total number of runs multiplied by 100. This table indicates how much an uncalibrated model can be trusted. As seen from the table, evaporation and grain yield are the least reliable targets, while crop transpiration, evapotranspiration, and crop biomass are more reliable. Among all these targets, evapotranspiration is the most reliable, with 26% of time series accepted. Transpiration and biomass are the second and third most reliable targets, with 18.3 and 13.26%, respectively, while evaporation and yield are not reliable, with 1.03 and 2.67% accepted time series. So, if there is no opportunity to calibrate the AquaCrop model, the three outputs that can be trusted most are evapotranspiration, transpiration, and biomass.

For each of the 47 parameters in Table 1, there is an efficient range of values that can be used as input to the model. As mentioned, because of limits on technology, time, and cost, we are not able to measure every parameter in the field. If we cannot gather data from a field, we have to assume a value within this efficient domain. At the same time, we have to maximize the likelihood that our choice is correct. In other words, we have to guess a value precisely enough to maximize the accuracy of the output and minimize the error. In Table 5, the domain of values that can be used as input for each parameter is given in the second and third columns. This domain, which is considered the primitive domain, is the domain that lies within the 10% threshold from Fig. 2. After applying the ASR threshold at the 95% confidence level, the subsequent distribution of the model parameters is derived from the accepted domains. In the subsequent columns, the minimum, maximum, and percentage of cover (POC) for each of the five targets are given for all 47 parameters. POC is defined as the subsequent distribution domain divided by the primitive distribution domain for each model parameter and each output variable. This new domain is smaller than or equal to the primitive domain in the second and third columns. The aim of this table is to minimize the domain of estimation, which maximizes the accuracy of the model. For each target, there is a column giving the percentage of the primitive distribution that is covered. Parameters marked with "*" accept only integer values. Finally, the last column gives the average POC of the five targets.

Table 5 Primitive, subsequent distribution, and Percentage of Cover after applying the 10% threshold of NRMSE in 95% confidence level for 47 model parameters and 5 targets

To better understand this table, consider three example parameters. The primitive range of the first parameter, X1, after applying the 10% NRMSE threshold and 95% confidence level, is from 0 to 2. It is also marked as integer-only, which means that the model accepts only 0, 1, or 2 as input for this parameter. Looking at the results for evaporation, the minimum and maximum of the input are the same as those of the primitive distribution domain. This means that the subsequent domain covers the entire primitive distribution domain; in other words, we cannot reduce the size of the domain. In the last column of this row, the average POC of this parameter is 100%.

The second row corresponds to the X2 parameter. The primitive distribution domain of this parameter is from 0 to 5. It is not marked, so the model accepts both integers and decimals. Unlike the first row, the domain of this parameter differs from the primitive distribution domain. As can be seen from the first and second columns of the evaporation section, the evaporation domain ranges from 0.1523 to 2.751, which is much smaller than the primitive domain. The percentage of the primitive distribution covered by the subsequent distribution is given in the third column and is 52%. In other words, the domain is reduced by 48% of the primitive domain. So, if we do not have data for estimating the model parameters, we can use the new domain to increase the estimation efficiency. However, the domains for transpiration, evapotranspiration, and biomass have not decreased significantly: their POCs are 99.6, 99.4, and 99.5%, respectively. The POC in the last column of this row is 87.2%, which is the average of the POCs of the five targets.
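The POC calculation can be checked with a short Python sketch using the X2 values for evaporation quoted above. The function name `percentage_of_cover` is chosen here for illustration only.

```python
def percentage_of_cover(primitive, subsequent):
    """Width of the subsequent domain as a percentage of the primitive domain."""
    p_min, p_max = primitive
    s_min, s_max = subsequent
    return 100.0 * (s_max - s_min) / (p_max - p_min)


# X2, evaporation target: primitive domain 0 to 5, subsequent domain 0.1523 to 2.751
poc_x2_evaporation = percentage_of_cover((0.0, 5.0), (0.1523, 2.751))
print(round(poc_x2_evaporation))
```

The result rounds to the 52% reported for this parameter and target.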

Finally, for the 23rd parameter, X23, the domain has been reduced for all targets. As can be seen from the table, the primitive distribution domain of this parameter is [0.0196–0.065]. After applying the subsequent distribution, the evaporation domain shrinks to 33% of the primitive domain, from 0.0214 to 0.0363. The subsequent domain of transpiration ranges from 0.0196 to 0.0364, which covers only 36.9% of the primitive domain. The subsequent distribution domains of the other three targets are given in the following columns, with POCs of 36.9, 36.7, and 36%, respectively. The average POC of these five targets is 35.9%, according to the last column of the 23rd row of this table. In other words, the new domain has been reduced to 35.9% of the primitive domain on average.

Finally, the probabilistic behavior of the crop parameters for these five output variables is analyzed. Table 6 presents the probabilistic distribution of the crop parameters for each output after applying the 10% NRMSE threshold at the 95% confidence level. The optimized PDF across these targets is given in the last column. It should be mentioned that the yield output was used to calibrate the model.

Table 6 Probabilistic distribution of crop parameters of each output after applying the 10% threshold of NRMSE and 95% confidence level, with the optimized PDF of each output

Table 6 shows the probabilistic behavior of each parameter for each target. For instance, the X1 parameter has a Poisson distribution for one target, while for the other four targets it has a uniform distribution; its optimized PDF is therefore considered Uniform. Another example is the probabilistic distribution of X36. For this parameter, Wakeby, GenPareto, Johnson SB, Wakeby, and Johnson SB describe the probabilistic behavior for soil evaporation, crop transpiration, evapotranspiration, crop biomass at maturity, and grain yield, respectively.

We can deduce from Table 6 that there are different types of PDFs for each parameter. Table 7 shows the percentage of dominant PDF types for the 47 parameters of the AquaCrop model and thus summarizes Table 6. According to this table, 34% of the PDFs are Uniform. Johnson SB and Wakeby are the second and third most common types, with 23.4 and 21.3%, respectively. Other types account for the remaining 21.3% of PDFs. This table shows that about 80% of all parameters have a Uniform, Johnson SB, or Wakeby PDF.

Table 7 Percentage of PDF types of the parameters for yield output

In other words, the statistical population of the parameters as seen from the yield output differs from that seen from soil evaporation (or from the other target outputs). However, because (i) the model is calibrated by yield and (ii) multi-output calibration by GLUE is intended, we need to select just one PDF for sampling from each target parameter domain.

As we know, to trust and use each output of a multi-output model (such as AquaCrop), the model should be calibrated against that output itself. But this process is costly and time-consuming. One of the best approaches in this situation is multi-output calibration, which provides all outputs in their optimized mode. From an optimization perspective, we recommend that the probabilistic distributions of all 47 parameters in the optimized condition of the five target outputs be taken from the last column of Table 6. This column is derived by considering all of the accepted series for all five target outputs. For multi-criteria calibration of AquaCrop by GLUE at the Qazvin synoptic station, we can refer to this column.

4 Conclusion

Although there has been significant progress in developing new crop models in recent years, measuring the large amount of input data needed for a specific region and crop is difficult, time-consuming, and costly. We therefore have to find the most important and influential model parameters that significantly impact the outputs. In this study, we tried to make the uncalibrated model more efficient.

The research is summarized below:

  • The AquaCrop model was chosen because of its simplicity, output accuracy, and application in research and management.

  • Forty-seven parameters were chosen as input of the AquaCrop model to understand their impact on outputs.

  • The Qazvin province was used as the case study to evaluate the results, and wheat was chosen as the most important crop in the province.

  • Five output targets were chosen to be studied: soil evaporation, crop transpiration, evapotranspiration, crop biomass at maturity, and grain yield.

  • The Global Sensitivity Analysis (GSA) methods were used to evaluate the model parameters' impact on output targets.

  • The SAFE toolbox in the MATLAB environment was used for GSA and for analyzing the uncertainty of the inputs and their impact on the outputs.

  • The model was calibrated with two years of observed data and then run with 36 years of data from the synoptic station in the province.

  • Based on the Monte-Carlo method, 3000 runs were conducted for each parameter to evaluate the reliability of the different outputs of the AquaCrop model when it has not been calibrated.

  • The 10% NRMSE threshold and 5% confidence level of real data were used as an index.

  • Five targets were ranked according to their reliability when the model is not calibrated.

  • A domain of each output target with a 5% confidence level of real data was determined.

  • The new domain was introduced. The domain size of some model parameters was reduced significantly, while that of others remained approximately the same.

  • The probabilistic behavior of each of the 47 parameters for all five outputs was introduced.

  • The Easy Fit software was used to determine the probability density function (PDF) of the parameters based on the Kolmogorov–Smirnov goodness-of-fit test. The software evaluated 65 different PDFs.

  • The optimized PDF of each parameter was introduced, which was the average of probabilistic behavior of five outputs.

  • Four classes of optimized PDFs of each parameter were found, including Uniform, Johnson SB, Wakeby, and others. It was found that about 80% of all parameters have one of the first three PDFs, and only 20% of them have different probabilistic behavior.

  • About 66% of PDFs were not uniform. This demonstrates that the first assumption of GLUE (uniform sampling) does not hold for these parameters. In other words, for these parameters, some values are more probable than others.

This study will help agriculture and irrigation managers make better estimates of different inputs when they cannot measure them in the field. They can predict the probabilistic behavior of each parameter and find an efficient domain for it to maximize the accuracy of the outputs. The methods and process of this study are suggested for other crop models, for regions with different climates, and for other crops.