Introduction

Agricultural production managers, natural resource managers, and strategic decision-makers require accurate, timely, and cost-effective information to maintain a quality food and fiber supply for the nation and the world (Chenu, 2017). Wheat (Triticum aestivum), in particular, is the most widely grown of all crops (Shewry, 2009), with over 730 million tons (MT) of production in 2018 (Faostat, 2018). In Iran, wheat production reached over 13 MT in 2017, making it the country's most widely cultivated grain crop (Ahmadi et al., 2017). However, the country has been experiencing a prolonged drought with limited water resource availability (Jamshidi, 2020), particularly in southern Iran, where the largest wheat-cultivated areas are located. This situation challenges growers and decision-makers to evaluate and optimize the region's costs and benefits. In this regard, crop models can provide the tools needed to simulate crop growth and yield and to optimize water productivity for the region (Jin, 2018).

Early crop models were developed during the 1960s using a simplified version of water-balance equations for crop growth simulations (Monteith, 1996). Crop models have since evolved through improvements in two prime aspects: (1) the physics and (2) the structure of the model. The physics of crop models has been improved by implementing explicit mathematical methods such as advanced numerical algorithms (Bartzanas, 2013; Noshadi & Jamshidi, 2014; De Wit & Van Diepen, 2007) and machine learning techniques (Folberth, 2019; Kuwata & Shibasaki, 2015; Liakos, 2018). The structure of crop models has been enhanced by advances in the accuracy of the input datasets, including both ground-based data and remotely sensed methods and algorithms for generating crop-based datasets. For example, Jamshidi (2019a, b) and Niyogi (2020) used multiresolution data sources (MODIS, Landsat, and reanalysis) to estimate evapotranspiration over winter wheat and forage corn fields.

Accordingly, several crop models with different physics and structures, including empirical (Choudhury, Idso, & Reginato, 1987; Gustafson, 2005), logistic (Overman, Scholtz, & Martin, 2003; Sepaskhah, Fahandezh-Saadi, & Zand-Parsa, 2011), and mechanistic (Bezuidenhout, 2000; Estes, 2013) models, have been developed for predicting the growth, development, and yield of different crops. The CERES (Crop Environment Resource Synthesis)-Wheat model was initially developed by the USDA-ARS (Ritchie, 1985) and has been used in several studies (Dettori, 2011; Hlavinka, 2010; Singh et al., 2008; Xiong, 2008). In a review study by Timsina and Humphreys (2006), CERES-Wheat showed high accuracy in simulating the anthesis and maturity dates, with an RMSE (root mean squared error) of 4–5%, and reasonable accuracy for predicting the yield (RMSE of 13–16%). The moderate accuracy of the model for predicting the grain yield has been primarily attributed to poor estimates of the leaf area index (Dente, 2008; Hussain et al., 2018).

Multi-spectral remote sensing from satellites (particularly the visible and near-infrared bands) has been widely applied for crop monitoring (Jamshidi, Zand-Parsa, & Niyogi, 2021a; Wu, 2015). In particular, vegetation-related variables such as NDVI (normalized difference vegetation index), LAI (leaf area index), and SIF (sun-induced chlorophyll fluorescence) have been used as proxies to monitor plant condition (Duan, 2017; Hosseini, 2015; Jamshidi et al., 2021a) and as yield estimators (Cai, 2019; Huang, 2015). Crop-based products derived from satellite data can also be coupled with crop models. This approach is particularly beneficial in data-void regions with limited or low-accuracy in situ observations, and where a crop model performs poorly in simulating a particular vegetation variable (e.g., CERES-wheat for LAI estimates).

Therefore, in this study, we attempted to calibrate the CERES-wheat model over the study region and to identify the areas in which the model does not perform optimally. Based on this analysis, we integrated LAI values derived from the Landsat satellite into the CERES-wheat model to evaluate the performance of the model, particularly for simulating the grain yield. This study provides the first trial of the CERES-wheat model over the study region and links the model to remotely sensed data, which is a fundamental step toward applying the model at a regional scale.

Material and Methods

Study Site

The experiments were carried out near Firouzabad city in southern Iran (Fig. 1) for three cropping seasons from 2015 to 2019. The climate of the study area is classified as semiarid, with a mean annual rainfall of 321 mm. The winter is mild, with a mean air temperature (Tair) of 5 °C and relative humidity (RH) of 65%. The summer is extreme, with a mean Tair of 41 °C and RH of 20%. Weather data, including rainfall (P), Tair, RH, wind speed (U2), and radiation, were recorded at a nearby weather station.

Fig. 1

The location of the study area in southern Iran, Firouzabad city. The red boxes show the locations where the samples for the measurements were collected

The experiment plots included a 25-hectare agricultural farm planted with winter wheat (Pishtaz cultivar) during mid-December in each cropping season. Winter wheat was planted with a grain drill to a depth of 2.5 to 3.5 cm at a seeding rate of about 80 kg ha−1 and a row spacing of 0.22 m. Crop irrigation requirement was determined based on the amount of actual evapotranspiration using the following equation:

$${\text{ET}}_{{\text{c}}} = K_{{\text{c}}} \times {\text{ET}}_{{\text{o}}}$$
(1)

where ETc is the crop evapotranspiration or actual crop water requirement, Kc is the crop coefficient, and ETo is the reference evapotranspiration. For each growing stage, the crop coefficient was retrieved from the study by Moghimi et al. (2015) for a similar climatic condition in the north of the province, and the reference evapotranspiration was determined using the Penman–Monteith equation:

$${\text{ET}}_{{\text{o}}} = \frac{{\Delta \times (R_{{\text{n}}} - G) + \rho_{{\text{a}}} \times C_{{\text{p}}} \times \frac{\Delta e}{{r_{{\text{a}}} }}}}{{\left[ {\Delta + \gamma \left( {1 + \frac{{r_{{\text{c}}} }}{{r_{{\text{a}}} }}} \right)} \right] \times \lambda }}$$
(2)

where Rn is the net radiation; G is the ground heat flux; ∆ is the rate of change of saturation specific humidity with air temperature; γ is the psychrometric constant; λ is the latent heat of vaporization; Cp is the specific heat capacity of air; ρa is the dry air density; ra and rc are the aerodynamic and canopy resistances, respectively; and ∆e is the difference between the saturated and actual vapor pressure in kPa.
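As an illustration of Eqs. 1 and 2, the sketch below evaluates the Penman–Monteith combination equation in its standard form (an energy term plus an aerodynamic term) and scales the result by a crop coefficient. The function names, the SI unit choices, the default latent heat value, and the input numbers are assumptions made for this example and are not taken from the study.

```python
SECONDS_PER_DAY = 86400.0


def reference_et(rn, g, delta, gamma, rho_a, cp, vpd, ra, rc, lam=2.45e6):
    """Penman-Monteith evapotranspiration in mm per day.

    Assumed units: rn and g in W m-2 (daily means); delta and gamma in
    kPa per deg C; rho_a in kg m-3; cp in J kg-1 K-1; vpd in kPa;
    ra and rc in s m-1; lam in J kg-1 (default is an assumed value).
    """
    energy_term = delta * (rn - g)
    aerodynamic_term = rho_a * cp * vpd / ra
    # Latent heat flux in W m-2
    latent_heat_flux = (energy_term + aerodynamic_term) / (
        delta + gamma * (1.0 + rc / ra)
    )
    # W m-2 / (J kg-1) = kg m-2 s-1, i.e. mm of water per second
    return latent_heat_flux / lam * SECONDS_PER_DAY


def crop_et(kc, eto):
    """Eq. 1: crop evapotranspiration from a crop coefficient and ETo."""
    return kc * eto


# Illustrative inputs only (not measurements from the study site)
eto = reference_et(rn=150.0, g=0.0, delta=0.145, gamma=0.066,
                   rho_a=1.2, cp=1013.0, vpd=1.0, ra=100.0, rc=70.0)
etc = crop_et(1.1, eto)
print(f"ETo = {eto:.2f} mm/day, ETc = {etc:.2f} mm/day")
```

Because the canopy resistance rc appears only in the denominator, a larger rc always lowers the computed evapotranspiration, which is the expected behavior for a water-stressed canopy.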

Furrow irrigation was applied at 10-day intervals, with the soil water status monitored in different parts of the field. Soil analysis was performed to determine the physical and chemical properties of the study field (Table 1). According to the soil analysis, 32 kg ha−1 of phosphorus fertilizer (P2O5) and 155 kg ha−1 of nitrogen fertilizer as urea (46% N) were applied at the planting and stem elongation stages. Weeds were controlled using herbicides, and the field was pest- and disease-free during the study. The fields were harvested at the end of May.

Table 1 The physical and chemical properties of soil in the study area

The effective wheat LAI (cm2 cm−2) was measured weekly throughout the growing season using the LAI-2200 plant canopy analyzer (Li-Cor, Inc., Lincoln, NE, USA). The average effective LAI was measured at eight different field locations (30 m × 30 m boxes), each associated with one of 8 Landsat pixels. Multiple measurements were taken in each box, each based on the average of one above-canopy and four below-canopy LAI readings.

CERES-Wheat Model

The CERES-wheat model is a deterministic model (implemented in the DSSAT) that can simulate the effects of environmental conditions (i.e., weather and soil properties), genotype, and management aspects on the growth and development of wheat (Attia, 2016; Bannayan, Crout, & Hoogenboom, 2003).

The model requires four major groups of inputs: meteorological, crop (physiological parameters), soil, and management data. The minimum meteorological data required by the model include daily solar radiation, Tair (maximum and minimum), and rainfall amount. For these inputs, we used the daily meteorological data measured by the nearby weather station. The soil parameterization inputs include the hydraulic and chemical properties: initial soil moisture and nitrogen content, field capacity and saturated water content, bulk density, pH, factors related to root growth, soil albedo and first-stage evaporation, and the deep percolation and runoff coefficients. We measured the physical and chemical soil properties from soil samples collected in the experimental area. The main management input data consist of sowing date and depth, sowing density, and the amounts of applied irrigation and fertilizer. The management data were historically available for the study site, and those that were not available were measured during the study. The model simulates the physiological processes using genotype coefficients (e.g., photoperiod sensitivity, kernel number per biomass). Therefore, these coefficients need to be provided to the model and typically require calibration. The genotype coefficients were calibrated using an iterative procedure (Boote, 1998) that generated the genetic coefficient values with the lowest RMSE between the measured and simulated data. The calibration was considered complete when the RMSE between the observed and simulated values was lower than 10% (Bannayan & Hoogenboom, 2009).
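The iterative calibration idea described above can be sketched as a search over candidate coefficient sets that keeps the set with the lowest normalized RMSE and reports whether the 10% criterion is met. This is a simplified illustration, not the procedure of Boote (1998): `run_model` is a hypothetical stand-in for a CERES-Wheat run, and the exhaustive grid search and candidate values are assumptions of this example.

```python
import itertools
import math


def nrmse_percent(observed, simulated):
    """RMSE expressed as a percentage of the observed mean."""
    n = len(observed)
    rmse = math.sqrt(sum((s - o) ** 2 for o, s in zip(observed, simulated)) / n)
    return 100.0 * rmse / (sum(observed) / n)


def calibrate(run_model, observed, candidates, threshold=10.0):
    """Try every combination of candidate coefficient values.

    run_model: hypothetical callable mapping a dict of coefficients to a
               list of simulated values aligned with `observed`.
    candidates: dict mapping coefficient name -> list of trial values.
    Returns (best coefficients, best error in percent, criterion met?).
    """
    names = list(candidates)
    best_params, best_err = None, float("inf")
    for values in itertools.product(*(candidates[name] for name in names)):
        params = dict(zip(names, values))
        err = nrmse_percent(observed, run_model(params))
        if err < best_err:
            best_params, best_err = params, err
    return best_params, best_err, best_err < threshold


# Toy stand-in for the crop model, for demonstration only
def toy_model(params):
    return [params["G1"] * 1.0, params["G2"] * 1.0]


best, err, ok = calibrate(toy_model, [10.0, 50.0],
                          {"G1": [8, 10, 12], "G2": [40, 50]})
print(best, err, ok)
```

In practice each call to the model is expensive, so the grid would be replaced by a coarser or guided search; the acceptance test against the threshold stays the same.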

To simulate vegetative and reproductive growth, the model divides the phenological development into nine stages according to heat unit accumulation. The number of leaves is calculated as a function of the vegetative growth stage, and the number of developing leaves is computed from the duration of grain filling (P5) and the phyllochron interval. The model then uses leaf-related factors (i.e., LAI, leaf expansion, and appearance rate) and environmentally driven inputs, including solar radiation and its use efficiency, the canopy extinction coefficient, and plant population, to partition the assimilated carbon among the plant parts. Grain yield is simulated using the plant population, grain number, and grain weight at physiological maturity (G1 and G2, respectively).

The CERES-wheat model applies a one-dimensional soil water-balance approach to calculate the variation of soil water content. The approach considers the feedback among the rainfall and irrigation (as the main sources of input water to the soil) and runoff, drainage, evapotranspiration (soil evaporation and plant transpiration), and plant water uptake (as the main sources of output water of the soil). Rainfall is imported as a user input to the model and is divided into two parts: (1) runoff (calculated based on the curve number method) and (2) stored in the soil profile. The model divides the soil profile into different soil compartments, and each compartment is specified with a drained upper and lower limit (DUL and LL, respectively) and saturated water content (SAT). When the water content of a soil layer is between LL and DUL, the model considers an unsaturated upward flow, and when the water content exceeds the DUL, downward saturated flow occurs.
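The split of rainfall into runoff and infiltration can be illustrated with the widely used SCS curve number relation. This is a simplified sketch under assumed metric units; the function name and the 0.2 initial-abstraction ratio are assumptions of this example, and the routine implemented inside the model may differ in detail.

```python
def scs_runoff(rain_mm, curve_number, ia_ratio=0.2):
    """Return (runoff, infiltration) in mm for one rainfall event.

    Standard SCS curve number form:
        S = 25400 / CN - 254        (potential retention, mm)
        Q = (P - Ia)^2 / (P - Ia + S) when P > Ia, otherwise 0
    with the initial abstraction Ia = ia_ratio * S (assumed 0.2 here).
    """
    retention = 25400.0 / curve_number - 254.0
    initial_abstraction = ia_ratio * retention
    if rain_mm <= initial_abstraction:
        runoff = 0.0
    else:
        excess = rain_mm - initial_abstraction
        runoff = excess ** 2 / (excess + retention)
    return runoff, rain_mm - runoff


# Illustrative event: 50 mm of rain on a soil with an assumed CN of 75
runoff, infiltration = scs_runoff(50.0, 75.0)
print(f"runoff = {runoff:.1f} mm, infiltration = {infiltration:.1f} mm")
```

The infiltrated part is what would then be distributed among the soil layers according to their LL, DUL, and SAT limits.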

The model calculates crop evapotranspiration (ETc) from ETo and Kc (Eq. 1). The reference evapotranspiration in the model can be calculated using multiple options, e.g., Priestley–Taylor (Priestley & Taylor, 1972) or Penman–Monteith (Allen, 1998). The Kc in the model (named Kcs) is calculated using the following equation:

$$K_{{{\text{cs}}}} = 1 + \left( {{\text{EORATIO}} - 1} \right)\frac{{{\text{LAI}}}}{6}$$
(3)

where EORATIO is the maximum of Kcs when LAI equals 6. The ETc is then partitioned into the potential evaporation (ESo) and transpiration (Tp) using the following equations:

$${\text{ES}}_{{\text{o}}} = {\text{ET}}_{{\text{c}}} \times \exp ( - K_{{{\text{EP}}}} \times {\text{LAI}})$$
(4)
$$T_{{\text{p}}} = {\text{ET}}_{{\text{c}}} - {\text{ES}}_{{\text{o}}}$$
(5)

where KEP is the energy extinction coefficient of the canopy (ranging from 0.5 to 0.8). The detailed model algorithm and processes can be found in Tsuji, Uehara, and Balas (1994).
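A short sketch of Eqs. 3 to 5 shows how LAI controls both the crop coefficient and the split between soil evaporation and transpiration. It assumes the Beer's-law form of the partitioning, in which potential soil evaporation equals ETc for bare soil (LAI = 0) and decays exponentially as the canopy closes. The function names and the numerical values are illustrative assumptions.

```python
import math


def crop_coefficient(lai, eoratio):
    """Eq. 3: Kcs rises linearly with LAI and equals EORATIO at LAI = 6."""
    return 1.0 + (eoratio - 1.0) * lai / 6.0


def partition_et(etc, lai, kep=0.65):
    """Eqs. 4 and 5: split ETc into potential soil evaporation and transpiration.

    kep is the canopy energy extinction coefficient; 0.65 is an assumed
    value inside the 0.5 to 0.8 range quoted in the text.
    """
    soil_evaporation = etc * math.exp(-kep * lai)
    transpiration = etc - soil_evaporation
    return soil_evaporation, transpiration


# Illustrative values only
for lai in (0.0, 2.0, 6.0):
    kcs = crop_coefficient(lai, eoratio=1.2)
    es, tp = partition_et(etc=5.0, lai=lai)
    print(f"LAI={lai:.0f}  Kcs={kcs:.2f}  ESo={es:.2f}  Tp={tp:.2f}")
```

This makes the sensitivity to LAI explicit: an underestimated LAI shifts water use from transpiration to soil evaporation, which in turn affects simulated growth.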

Remotely Sensed Data and LAI Calculations

For remotely sensed data, we used the spectral imagery of Landsat 8 to calculate multiple vegetation indices. Cloud-free images were retrieved from the USGS Earth Explorer website (www.earthexplorer.usgs.gov) for path/row 162/40 from December to May (sowing to harvest) of 2017, 2018, and 2019. Landsat 8 provides images at a 16-day temporal resolution and 30 m spatial resolution and includes eleven spectral bands. The dates on which the images were acquired during each growing season, together with the corresponding winter wheat development stages, are provided in Table 2.

Table 2 The acquisition dates of the Landsat images and the corresponding development stages of winter wheat

The LAI cannot be directly calculated from the multi-spectral data of Landsat imagery; rather, vegetation indices (VIs) are investigated, and empirical correlations are developed to estimate the LAI (Li, 2017a). Accordingly, we employed four vegetation indices that are widely used to estimate biophysical parameters. These indices include the normalized difference vegetation index (NDVI) (Rouse, 1974), the soil-adjusted vegetation index (SAVI) (Huete & Huete, 1988), the enhanced vegetation index (EVI) (Huete et al., 1994), and the two-band EVI or EVI-2 (Jiang, 2008). The following formulations were used to calculate these indices.

$${\text{NDVI}} = \frac{{\rho_{{{\text{Nir}}}} - \rho_{{{\text{Red}}}} }}{{\rho_{{{\text{Nir}}}} + \rho_{{{\text{Red}}}} }}$$
(6)
$${\text{SAVI}} = \frac{{\rho_{{{\text{Nir}}}} - \rho_{{{\text{Red}}}} }}{{\rho_{{{\text{Nir}}}} + \rho_{{{\text{Red}}}} + L}}(1 + L)$$
(7)
$${\text{EVI}} = 2.5 \times \frac{{\rho_{{{\text{Nir}}}} - \rho_{{{\text{Red}}}} }}{{\rho_{{{\text{Nir}}}} + 6 \times \rho_{{{\text{Red}}}} - 7.5 \times \rho_{{{\text{Blue}}}} + 1}}$$
(8)
$${\text{EVI - 2}} = 2.5 \times \frac{{\rho_{{{\text{Nir}}}} - \rho_{{{\text{Red}}}} }}{{\rho_{{{\text{Nir}}}} + 2.4 \times \rho_{{{\text{Red}}}} + 1}}$$
(9)
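Eqs. 6 to 9 translate directly into a few functions of surface reflectance. The sketch below is illustrative: the function names, the SAVI soil adjustment factor L = 0.5, and the sample reflectances are assumptions of this example, and EVI is written with the commonly published coefficient of 6 on the red band.

```python
def ndvi(nir, red):
    """Eq. 6: normalized difference vegetation index."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Eq. 7: soil-adjusted vegetation index (L = 0.5 is an assumed value)."""
    return (nir - red) / (nir + red + soil_factor) * (1.0 + soil_factor)


def evi(nir, red, blue):
    """Eq. 8: enhanced vegetation index (uses the blue band)."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def evi2(nir, red):
    """Eq. 9: two-band EVI (no blue band required)."""
    return 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)


# Hypothetical reflectances for a single vegetated pixel
nir, red, blue = 0.50, 0.10, 0.05
for name, value in (("NDVI", ndvi(nir, red)), ("SAVI", savi(nir, red)),
                    ("EVI", evi(nir, red, blue)), ("EVI-2", evi2(nir, red))):
    print(f"{name}: {value:.3f}")
```

For Landsat 8, the red, near-infrared, and blue inputs would come from the corresponding reflectance bands after conversion from digital numbers.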

The correlation between the in situ measurements of LAI and the VIs was evaluated using the data from the first year (December to May of 2017). The VI with the highest correlation was selected as the LAI representative, and the correlation was validated against the data from 2018 and 2019. The LAI based on Landsat data was termed LAIRS and was ingested into the model as a replacement for the LAI calculated by the model.
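The fit-then-validate workflow can be sketched with an ordinary least-squares line that maps a vegetation index to LAI. The data values below are made up for illustration, and the linear form is an assumption of this example.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_lai(vi_values, slope, intercept):
    """Apply the fitted relation to new vegetation index values."""
    return [slope * v + intercept for v in vi_values]


# Hypothetical calibration-year pairs of (vegetation index, measured LAI)
vi_train = [0.15, 0.30, 0.45, 0.60, 0.75]
lai_train = [0.6, 1.9, 3.2, 4.4, 5.9]
slope, intercept = fit_line(vi_train, lai_train)

# Hypothetical validation-year index values converted to remotely sensed LAI
lai_rs = predict_lai([0.25, 0.55], slope, intercept)
print(slope, intercept, lai_rs)
```

Keeping the validation years out of the fit, as done in the study, is what makes the later comparison against in situ LAI an independent check.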

Assimilating LAI into the CERES-Wheat Model

To assimilate the LAIRS (LAI based on Landsat 8 data) into the CERES-wheat model, we used the residual resampling particle filter algorithm. The particle filter (PF) is based on Monte Carlo methods, which use particle sets to represent probabilities. A detailed description of the algorithm is provided in Canty (2019), and a schematic flowchart is presented in Fig. 2.

Fig. 2

The flowchart of the process used for retrieving the assimilated and remotely sensed LAI using Landsat images, the CERES-wheat model, and the particle filter (PF) method

The LAI in the dynamic state-space CERES-wheat model is not represented as a state variable, as the model uses the plant leaf area to calculate LAI at each time step independently. However, the plant leaf area can be considered a state variable since it is expressed as a function of heat units, and the number of leaves sets its potential value. One common issue with the PF method is particle degeneracy, which demands high computational power and time. Therefore, we applied residual resampling to resolve the degeneracy issue of the PF method. To account for the problem of sample impoverishment (i.e., lack of particle diversity), a repeated Gaussian perturbation was applied to the state particles.
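One update step of a particle filter with residual resampling and Gaussian perturbation can be sketched as follows. This is a generic, minimal illustration and not the study's implementation: the Gaussian observation likelihood, the standard deviations, the function names, and the particle values are all assumptions of this example.

```python
import math
import random


def residual_resample(weights, rng):
    """Return resampled particle indices using residual resampling.

    Each particle i is first copied floor(N * w_i) times; the remaining
    slots are drawn at random in proportion to the fractional residuals.
    """
    n = len(weights)
    total = sum(weights)
    norm = [w / total for w in weights]
    counts = [int(math.floor(n * w)) for w in norm]
    indices = [i for i, c in enumerate(counts) for _ in range(c)]
    remaining = n - len(indices)
    if remaining > 0:
        residuals = [n * w - c for w, c in zip(norm, counts)]
        indices += rng.choices(range(n), weights=residuals, k=remaining)
    return indices


def pf_update(particles, observation, obs_sd, jitter_sd, rng):
    """Weight particles by an assumed Gaussian likelihood, resample, perturb."""
    weights = [math.exp(-0.5 * ((observation - p) / obs_sd) ** 2)
               for p in particles]
    if sum(weights) == 0.0:
        # All likelihoods underflowed: fall back to equal weights
        weights = [1.0] * len(particles)
    indices = residual_resample(weights, rng)
    # Gaussian perturbation restores diversity; the state is kept non-negative
    return [max(0.0, particles[i] + rng.gauss(0.0, jitter_sd)) for i in indices]


rng = random.Random(0)
prior = [1.0, 2.0, 3.0, 4.0, 5.0]      # hypothetical model LAI particles
posterior = pf_update(prior, observation=4.0, obs_sd=0.5,
                      jitter_sd=0.1, rng=rng)
print(sum(prior) / len(prior), sum(posterior) / len(posterior))
```

The deterministic copies guarantee that high-weight particles survive, while only the leftover slots are random, which is why residual resampling has lower variance than plain multinomial resampling.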

Based on the LAIRS assimilated into the model and PF method, the optimum yield was estimated and compared to the yield estimates based on the default LAI in the model.

Results and Discussion

Model Calibration and Evaluation

This study's first objective was to calibrate the CERES-wheat model so that it can be used in the study region to simulate winter wheat growth and yield accurately. Accordingly, certain parameters, including initial conditions, cultivar genetic coefficients, species, and ecotype determination, were calibrated, and the results are presented in Table 3. The calibration was based on the data acquired during the first year of the study (2017 growing season). The cultivar genetic coefficients (i.e., G1, G2, G3) have typically been identified as the parameters that need calibration in the model (Iglesias, 2006; Timsina, 2008). Adjusting these coefficients to 10 (G1), 50 (G2), and 2 (G3) resulted in accurate simulation of the anthesis and physiological maturity dates (97 and 132, respectively) compared to the observed phenological dates (94 and 138, respectively). The other important cultivar parameter was PHINT (the interval between successive leaf tip appearances), which was set to 100. This adjustment allowed the model to simulate 7 to 8 main-stem leaves during the heading stage, which is recommended for winter wheat simulation in a semiarid region (Attia, 2016). Calibrating the genotype information alone was not adequate for the model to accurately simulate all crop specifications. Large errors were still observed in the simulated leaf emergence stages and terminal spikelet initiation, as reported in Johnen et al. (2012) and Andarzian (2015). To account for these caveats, the ecotype information, including the durations of the phases from the end of the juvenile stage to the end of the grain-fill lag (i.e., P1, P2, P3, and P4), was adjusted to 420, 305, 195, and 195 °C d (GDD), respectively. Additional ecotype parameters were also adjusted: SLAS and LSPHE (for better simulation of leaf area expansion) were calibrated to 405 cm2 g−1 and 6, respectively, and PARUE and PARU2 (for better simulation of biomass production) were calibrated to 3.2 and 3.3 g MJ−1, respectively.

Table 3 The calibrated parameters used in the CERES-wheat model for simulating winter wheat (cultivar Pishtaz)

The calibrated model was evaluated using the data from the next two growing seasons (2018 and 2019). The evaluation results are presented in Table 4. It should be noted that Table 4 presents the results before assimilating satellite data. In the evaluation stage, the model simulated the phenological dates with an overall RMSE of 11.5, MBE of 9.2, and R2 of 0.79. The maximum LAI values were simulated as 4.8 and 5.5, compared to the observed 6.9 and 6.6, for the growing seasons of 2018 and 2019, respectively. The grain yield and biomass were simulated with RMSEs of 910 and 1215 kg ha−1, respectively. The calibration results were comparable with the values reported by Mehrabi and Sepaskhah (2019), whose study was conducted in a region with a similar climatic condition and cultivar. Andarzian (2015) and Attia (2016), for a similar climatic condition, reported a relatively similar range for the calibrated parameters; however, they planted a different cultivar, and certain differences are expected. For example, Andarzian (2015) set the vernalization coefficient to zero for their spring-type cultivar, whereas in our study the value was set to 35 for a winter-type cultivar.
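The three evaluation statistics used throughout this section can be computed as in the sketch below. The sign convention for MBE (simulated minus observed, so that underestimation gives a negative value) and the use of the squared Pearson correlation for R2 are assumptions of this example, as are the sample numbers.

```python
import math


def rmse(observed, simulated):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(observed, simulated)) / n)


def mbe(observed, simulated):
    """Mean bias error; negative when the model underestimates."""
    n = len(observed)
    return sum(s - o for o, s in zip(observed, simulated)) / n


def r_squared(observed, simulated):
    """Squared Pearson correlation between observed and simulated values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov ** 2 / (var_o * var_s)


# Hypothetical observed and simulated LAI values
obs = [1.0, 2.5, 4.0, 6.5]
sim = [0.9, 2.0, 3.2, 5.0]
print(rmse(obs, sim), mbe(obs, sim), r_squared(obs, sim))
```

Note that a high R2 can coexist with a large bias, which is why RMSE and MBE are reported alongside it.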

Table 4 CERES-wheat simulation results for phenological dates, grain yield, and biomass without LAI assimilation

Evaluation of Vegetation Indices

We investigated the correlation between four commonly used vegetation indices and the observed LAI to identify the VI that could represent the LAI most accurately. The results of these comparisons and the statistical analysis are provided in Fig. 3 and Table 5. The correlations from all the VIs were statistically significant (P value < 0.05), with R2 > 0.73 against the LAI values. The NDVI estimates from the Landsat-8 images showed a moderately good correlation, with an MBE of −2.94 and an RMSE of 3.46. The SAVI–LAI relationship resulted in a slightly better correlation with a lower range of error (MBE = −2.89 and RMSE = 3.32). The higher accuracy of SAVI could be linked to the index's ability to remove the soil background bias, which is embedded in the NDVI. Notably, when the LAI is low during the early growing stages, the soil background in the NDVI could result in a large error in its correlation with the LAI. Removing this source of error using the SAVI index improved the accuracy of the relationship. The EVI correlation with LAI was higher than those of NDVI and SAVI, with R2 = 0.82, MBE = −2.45, and RMSE = 3.01. EVI corrects for specific atmospheric conditions and canopy background noise and is more sensitive in areas with dense vegetation. In areas of dense canopy where the LAI is high, the VI values can be improved by leveraging information in the blue wavelength (Zhao, 2011), as is done in the EVI. These enhancements allowed the index to reduce the background and atmospheric noise, which resulted in a higher correlation with in situ LAI values. Among the VIs, EVI-2 showed the highest correlation, with R2 = 0.87, MBE = −2.04, and RMSE = 2.64. While the information from the Landsat blue band (used in EVI) can help achieve more accurate LAI estimates, it does not necessarily provide additional biophysical information on vegetation characteristics, particularly under atmospheric conditions with low particulate matter. For our study area, with an insignificant atmospheric aerosol effect, the blue band added more complexity to the index than benefit for atmospheric correction, as using EVI-2 improved the accuracy of the LAI estimates. It should be noted that, since the investigated VIs typically have a smaller range of values (typically below 1) than LAI (with values ranging from 0 to 6), the MBE showed negative values. We should also highlight that only the linear fits are presented here; we tested other fitting equations as well, and EVI-2 also returned the best result in those cases.

Fig. 3

The resulting correlations from the Landsat-derived vegetation indices and observational leaf area index (LAI) based on data from the 2017 growing season

Table 5 The statistical analysis of the resulting equation between the Landsat-derived vegetation indices and observational leaf area index (LAI)

Based on the resulting equation between the EVI-2 and LAI (Table 5), we selected the EVI-2 index for further analysis. Accordingly, we validated the correlation using the two following years of data (2018 and 2019). The resulting comparison is plotted in Fig. 4 for each growth stage separately.

Fig. 4

The validation result of the correlation between the Landsat-derived EVI-2 and the observational leaf area index (LAI) based on data from 2018 and 2019 growing seasons

Evaluation of the Assimilated CERES-LAI Trajectories

As the LAI based on the EVI-2 index showed the best performance (discussed in the previous section), the LAI-EVI-2 data were coupled with the CERES-wheat model. The resulting LAI simulated with the default model (without LAI assimilation) and the modified model (with assimilated LAI based on the EVI-2 index) is plotted in Fig. 5. In both growing seasons, the LAI assimilated using the Landsat EVI-2 data showed a closer agreement with the observed data. Based on the observations for the 2018 growing season, the LAI increased from 70 days after planting and reached 3.8 by the end of the stem elongation period (~ 150 days after planting). The LAI reached its maximum (LAImax) of 6.8 ± 0.4 during maturity and then decreased. A similar trend was observed during the 2019 growing season, with a marginally smaller LAImax (6.4 ± 0.3). The CERES model underestimated the LAI when the default model was used. The LAI was accurately simulated during the early growing stages, particularly before stem elongation. However, the simulation started to deviate from the observations, with a large error during the maximum greenness of the plant. The default CERES model simulated the LAImax with an MBE of −2.08. When the model was coupled with the assimilated Landsat EVI2-LAI, the simulation error was reduced significantly to −1.14. The assimilated LAI resulted in higher LAI values from the CERES model; nevertheless, the values were still underestimated compared to the observations. During the second year of the experiment (2019 growing season), the assimilated LAI resulted in a better simulation, with an MBE of −0.75. The difference between the simulated LAI from the CERES default mode and EVI2-LAI was not statistically significant during the early growing stages (beginning of the stem elongation), and the maximum difference was observed during plant maturity.

Fig. 5

Time series of the observed and simulated winter wheat leaf area index based on the model using default and Landsat EVI2-LAI during the 2018 and 2019 growing seasons

The improvement in LAI simulated with Landsat EVI2-LAI could also be detected when the results were compared with the one-to-one line (Fig. 6). By implementing Landsat EVI2-LAI in the CERES-wheat model, the RMSE was reduced from 1.32 to 0.81 during the 2018 growing season and from 1.37 to 0.78 during the 2019 growing season. Considering all growing seasons, the MBE of the model-simulated LAI was reduced by 38% when the satellite data were incorporated in the model.

Fig. 6

Comparison of the observed and simulated winter wheat leaf area index (LAI). The observed LAI refers to the in situ measurements. The simulated LAI refers to values estimated by coupling the default CERES-wheat model with Landsat EVI2-LAI data during the 2018 and 2019 growing seasons. Black solid lines are 1:1 lines, and blue dotted lines are the best fit to the data

Evaluation of the CERES Components with Assimilated Data

After using assimilated LAI, the model accuracy has improved in several areas that are presented in Table 6. The model accurately simulated the anthesis and maturity dates (as the day of the year) with an overall RMSE of 5.8, MBE of 4.7, and R2 of 0.90. The most significant improvement was observed in the maximum LAI, which has already been discussed in the previous section. The LAI improvement led to a better simulation of grain yield and biomass production. The model simulated the average grain yield as 9.8 and 10.1 Mg ha−1 compared to 10.5 and 10.8 Mg ha−1 during the 2018 and 2019 growing seasons, respectively. Considering the biomass production, the model simulation resulted in 16.3 and 16.9 Mg ha−1 compared to 17.2 and 17.8 Mg ha−1 during the 2018 and 2019 growing seasons, respectively.

A more detailed analysis was performed on the harvested yield simulated by the model with the default settings and with the assimilated Landsat EVI2-LAI. The results were compared against the measured field data and are shown in Fig. 7. The harvested yield of winter wheat at the study site ranged from 7.9 to 12 Mg ha−1. The CERES-wheat model in the default mode simulated the crop yield with an RMSE of 0.87–0.94 Mg ha−1 and R2 of 0.49–0.51. A better LAI representation in the model significantly improved the simulated yield accuracy in both growing seasons. When the model was configured with the assimilated Landsat EVI2-LAI, the RMSE was reduced to 0.51–0.58 Mg ha−1, and R2 increased to 0.69–0.71. In a study by Dente (2008), when the CERES-wheat model was configured with assimilated LAI from ASAR (ENVISAT Advanced Synthetic Aperture Radar), the model was able to simulate the yield with an accuracy of 360 to 420 kg ha−1. In another study by Li (2017b), assimilating LAI into the CERES-wheat model resulted in a predicted yield with R2 of 0.61 and RMSE of 523 kg ha−1, which is comparable with that of our study.

Fig. 7

Comparisons of the CERES-wheat simulated grain yield (using the default and assimilated LAI) with the harvested yield during the 2018 and 2019 growing seasons

Discussion

The growth and development of a plant is a complex process that involves interactions between the plant, soil, and atmosphere. The complexity increases as environmental conditions depart from the optimal growth condition. For example, in the study region, limited water supply and severe drought have altered the optimal condition for crop growth. Two approaches can help to address this complexity: testing new irrigation/management techniques or improving the performance of the simulation model. For example, Jafari (2021) and Jamshidi, Zand-Parsa, and Niyogi (2021b) have tested different irrigation techniques on citrus species under the study region's climate. Regarding model improvement, we should note that simulation models often simplify certain phenological and chemi-physiological aspects of plant development. Therefore, the uncertainty inherent in the model's structure leads to deviation between the model's results and the true process. One approach to enhance the model outcome is to improve the model's structure; however, this approach would add more complexity to the model routine (Tao, 2018). As an alternative approach to alleviate the uncertainties of crop models, the state variables with high simulation errors and a significant impact on the model's outcome can be provided from other sources with higher precision. In our initial calibration and validation results (Table 4), the LAI was identified as having large errors during the validation process. A similar result has also been reported by Hussain et al. (2018), in which the poor LAI simulation of the CERES-wheat model was demonstrated by comparison against other crop models. A poor LAI simulation can negatively impact the carbon assimilation processes and the partitioning of carbon to the crop components (i.e., stem, root, grain) (Basso, Cammarano, & Carfagna, 2013).

Several studies have highlighted the potential of remotely sensed data from satellite optical sensors for monitoring crop water status (Jamshidi et al., 2021a) and particularly for retrieving LAI (Wang, 2013; Yi, 2008; Zheng & Moskal, 2009). LAI cannot be measured directly from satellite data, so vegetation indices (VIs) are typically used as a proxy to retrieve it. However, different VIs may yield discrepant estimates because the spectral response function varies among them (Yang, 2015). In our study, the accuracy of the LAI derived from the different VIs remained within an acceptable range (0.72 < R2 < 0.87), and EVI-2 showed the highest agreement with the ground measurements. The differences between the NDVI-based LAI and the LAI from the other indices (i.e., SAVI, EVI, and EVI-2) were particularly evident during the early growing stages. This occurred because of noise from soil backscattering under the low vegetation cover of the early growth stages. The inconsistency between NDVI and LAI in periods and areas of sparse or low vegetation has also been addressed in other studies (Houborg & McCabe, 2018; Pontailler, Hymus, & Drake, 2003). From an applicability viewpoint, soil backscattering could be particularly problematic for platforms with coarser spatial resolution (e.g., MODIS), as canopy signals cannot be separated from the soil background in VIs that do not account for the soil effect. Accordingly, the VIs that suppress the soil response (e.g., SAVI, EVI, and EVI-2 in our study) gave more accurate LAI estimates. The preferred index differs among studies: Li et al. (2017a) used EVI-2 to estimate winter wheat LAI from Landsat, Huang (2015) used SAVI for grapevine LAI from Landsat and MODIS, and Herrmann et al. (2011) used REIP (Red-Edge Inflection Point) to retrieve LAI from Sentinel imagery. Therefore, the optimal vegetation index for representing LAI is crop-specific and may vary with the characteristics of the optical sensor.
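To make the comparison among the four indices concrete, the following minimal sketch computes NDVI, SAVI, EVI, and EVI-2 from surface reflectance. It assumes the standard published forms of the indices with commonly used coefficients (soil adjustment factor of 0.5 for SAVI; 2.5, 6, 7.5, and 1 for EVI; 2.5 and 2.4 for EVI-2); the exact coefficients used in this study are not restated in this section, and the function name and reflectance values are hypothetical.

```python
import numpy as np

def vegetation_indices(nir, red, blue, soil_factor=0.5):
    """Return NDVI, SAVI, EVI and EVI-2 from surface reflectance (0-1 scale).

    Coefficients are the commonly used defaults (an assumption, not
    necessarily the values applied in the study).
    """
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    blue = np.asarray(blue, dtype=float)

    ndvi = (nir - red) / (nir + red)
    # SAVI adds a soil adjustment factor to damp the soil background signal
    savi = (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)
    # EVI uses the blue band to correct for atmospheric aerosol effects
    evi = 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)
    # EVI-2 drops the blue band and relies on NIR and red only
    evi2 = 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)
    return {"NDVI": ndvi, "SAVI": savi, "EVI": evi, "EVI2": evi2}

# Illustrative (hypothetical) reflectances: sparse early canopy, then dense canopy
vi = vegetation_indices(nir=[0.25, 0.45], red=[0.15, 0.05], blue=[0.08, 0.03])
for name, values in vi.items():
    print(name, np.round(values, 3))
```

Because EVI-2 needs only the NIR and red bands, it can be computed from any sensor that provides NDVI inputs, which is one practical reason it may be preferred where aerosol effects are small.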

The calibrated model simulated overall yield reasonably well in its default mode (R2 = 0.49–0.51). Other studies, such as Mehrabi and Sepaskhah (2019) and Andarzian et al. (2015), reported considerably higher accuracy of grain yield simulation by the CERES-wheat model than we obtained. Those studies, however, considered only a small area (< 1 ha). Applying the crop model to a small plot reduces field heterogeneity in soil characteristics, irrigation uniformity, and fertilizer distribution, and variation in these factors has been reported to affect grain yield significantly (Rathore, 2017). In our study, the grain data were collected over 25 ha of winter wheat fields that may have been subject to different agricultural practices. Therefore, a lower accuracy in the grain yield simulated by CERES-wheat was expected.
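For reference, the agreement statistics quoted in this discussion (R2, RMSE, and MBE) can be computed as in the sketch below. It assumes that R2 is the squared Pearson correlation between observed and simulated values and that MBE is the mean of simulated minus observed; the definitions used in the study are given in its methods and may differ. The function names and the yield values are hypothetical.

```python
import numpy as np

def agreement_stats(observed, simulated):
    """R2 (squared Pearson r), RMSE and MBE between observed and simulated values."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    err = sim - obs                      # assumed sign convention: simulated - observed
    r = np.corrcoef(obs, sim)[0, 1]
    return {
        "R2": float(r ** 2),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MBE": float(np.mean(err)),
    }

def percent_reduction(before, after):
    """Percentage reduction in the magnitude of an error statistic."""
    return 100.0 * (abs(before) - abs(after)) / abs(before)

# Illustrative (hypothetical) grain yields in t/ha
observed = [4.0, 5.0, 6.0, 7.0]
simulated = [4.5, 5.5, 6.5, 7.5]
print(agreement_stats(observed, simulated))
print(percent_reduction(before=2.0, after=0.5))
```

Taking absolute values in `percent_reduction` lets the same helper handle MBE, which can be negative when the model underestimates.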

Improving the accuracy of the CERES-simulated LAI with satellite data had a positive effect on the accuracy of the other model components, as shown by comparing Table 4 (results before LAI assimilation) with Table 6 (results after LAI assimilation). In particular, the simulated biomass and grain yield improved significantly, with a 31% improvement in R2 and a 62% reduction in error (average improvement of RMSE and MBE). Consistent with our results, Li (2017b) reported a higher accuracy (RMSE of 0.31 to 0.44) and a significant improvement in CERES-wheat simulated LAI when high-resolution satellite imagery was used to assimilate leaf area index into the model. Li et al. (2017a), however, used data from multiple satellites (i.e., Landsat, GF-1, HJ-1), which increased the temporal resolution of the remotely sensed data. This likely resulted in a higher accuracy of the simulated LAI than in our study. The Landsat revisit time is 16 days, and the gap between usable images may be longer under cloudy conditions. Jamshidi et al. (2019a) showed that the optimal temporal resolution of satellite imagery for capturing crop phenology is one week during vegetative growth and two weeks during reproductive growth. Therefore, our results could have been further improved with platforms that have a more frequent revisit time, such as Sentinel or MODIS. It should be noted that spatial resolution is also essential for capturing phenological change; thus, MODIS data (spatial resolution of 500 m) may be more suitable for regional assessment than for field-scale experiments.
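The assimilation step discussed here can be illustrated with a single analysis step of a generic bootstrap particle filter for a scalar LAI state. This is a sketch under stated assumptions, not the configuration used in the study: it assumes a Gaussian observation error and systematic resampling, and the ensemble size, error standard deviation, and LAI values are hypothetical.

```python
import numpy as np

def pf_update(particles, lai_obs, obs_sd, rng):
    """One particle-filter analysis step: weight particles by the observation, then resample.

    Assumes a Gaussian observation error with standard deviation obs_sd.
    """
    particles = np.asarray(particles, dtype=float)
    n = particles.size

    # Log-likelihood of the satellite LAI given each simulated LAI particle
    log_w = -0.5 * ((lai_obs - particles) / obs_sd) ** 2
    weights = np.exp(log_w - log_w.max())   # subtract max for numerical stability
    weights /= weights.sum()

    # Systematic resampling: one random offset, n evenly spaced positions
    positions = (rng.random() + np.arange(n)) / n
    idx = np.searchsorted(np.cumsum(weights), positions)
    idx = np.minimum(idx, n - 1)            # guard against round-off in the cumulative sum
    return particles[idx], weights

rng = np.random.default_rng(0)
# Hypothetical ensemble of model-simulated LAI on a satellite overpass date
prior = rng.normal(loc=2.0, scale=0.6, size=500)
posterior, weights = pf_update(prior, lai_obs=3.0, obs_sd=0.3, rng=rng)
print(round(prior.mean(), 2), round(posterior.mean(), 2))
```

In a full workflow, a step of this kind would be repeated on each date with a usable satellite LAI estimate, so the computational cost grows with the number of particles, image dates, and simulated fields.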

Conclusion

Given the importance of crop models for assessing yield, regional water productivity, and sustainability, we evaluated the Crop Environment Resource Synthesis for Wheat (CERES-Wheat) model in Firouzabad, an agriculturally dominated area in southern Iran. The model was first calibrated with field measurements from the 2017 growing season and then evaluated for 2018 and 2019. Our analysis showed that the CERES model performed relatively poorly in simulating the leaf area index (LAI). To improve the model's accuracy, remotely sensed LAI was assimilated into the model using the particle filter (PF) approach. Four vegetation indices (NDVI, SAVI, EVI, and EVI-2) were evaluated to identify the best estimator of winter wheat LAI from the Landsat optical sensor. All the vegetation indices gave LAI estimates within an acceptable range (0.72 < R2 < 0.87), with NDVI the least accurate and EVI-2 the most accurate estimator. NDVI was particularly problematic during the early growth stage because of noise from soil backscattering. Although EVI uses Landsat data from three spectral bands (NIR, red, and blue), it did not improve the LAI estimates compared with EVI-2, which uses only two bands (NIR and red). Therefore, for our study area, where the atmospheric aerosol effect is insignificant, the blue band adds complexity rather than accuracy, and the EVI-2 index is preferred. Applying the PF method to assimilate data into the CERES model was relatively simple and user-friendly over the study site (~25 ha); however, the complexity and computational demand may increase if the model is applied at a regional scale. Assimilating the Landsat EVI-2-based LAI into the CERES model improved the model's overall performance, particularly for grain yield and biomass simulations (RMSE and MBE were reduced by ~62% on average).
LAI heterogeneity increases as the study area expands from the field to the catchment and regional levels. Remotely sensed LAI helps capture LAI dynamics more accurately at these larger scales. In this context, the methodology and framework used in our research to assimilate LAI into the model will be particularly useful for regional-scale applications. Our study also provides practical information on the calibrated CERES model parameters for simulating winter wheat (cultivar Pishtaz) in the study region, which facilitates the regional application of the model.