Introduction

Rice (Oryza sativa L.), one of the major crops in the world, forms the staple diet of about 2.7 billion people and is grown in diverse climatic zones (FAO 2010). In India, it is cultivated in about 150 million hectares, producing 132,013,000 metric tons, which accounts for about 26% of global rice production (Global Rice Science Partnership 2011). To meet the increasing demand for rice due to rising population and income, rice production in India and other Asian countries needs to be increased. The Global Rice Science Partnership (GRiSP) (CGIAR 2011) projected that by 2020 rice production would consistently meet demand, as the world would be able to sustainably produce the required 85 million additional tons of paddy. Dobermann and Fairhurst (2000) suggest that by 2020, average irrigated rice yield must rise by 30% to about 7 tons/hectare. This increase appears to be achievable, but requires improved germplasm with a yield potential of 12 tons/hectare in the dry season and 8–9 tons/hectare in the wet season within the span of the next 10 years. Moreover, significant improvements in soil and crop management are necessary, particularly in nutrient and pest management, to lift average farm yields to about 70% of the yield potential. All this must be achieved in an environment where climate change and its effects on crop management practices are taking place, mainly triggered and driven by socioeconomic changes and competition for natural and human resources.

Studies on crop yield are traditionally carried out by using conventional agronomic researches, in which crop production functions are derived mainly on the basis of statistical analysis without referring much to the underlying biological or physical principles. The disadvantages of this approach and the need for greater in-depth analysis have long been recognized. In recent years, application of the crop yield simulation systems approach to agricultural management has been gaining popularity owing to the expanding knowledge of processes that are involved in the growth of plants, coupled with the availability of inexpensive and powerful computers.

Rice growth is influenced by several factors such as the genetics of the crop, soil, water and agro-climatic conditions. Recently, many computerized decision support systems have been developed for simulating crop yields at farm field level: Decision Support System for Agrotechnology Transfer (DSSAT) (ICASA 2011; Rezzoug et al. 2008); crop-specific, crop-growth-stage-specific yield model (YIELD) (Burt et al. 1981); CENTURY (Parton et al. 1995); WOrld FOod STudies (WOFOST) (Boogaard et al. 1998); Simulation Model for Rice-Weather relations (SIMRIW) (Horie et al. 1987, 1994; Sudharsan et al. 2010); and DeNitrification-DeComposition (DNDC) (Zucong et al. 2003). Keeping in view the importance of climate change and its effect on rice yields, and the generic simulation models available for yield prediction, an attempt has been made with the widely used stand-alone DSSAT CERES-Rice model and the web-based SIMRIW model to obtain confidence levels (for further use) with field-level data. These confidence levels on yield predictions will be utilized in the ongoing long-term weather-based rice yield predictions with dynamic sensory data from distributed sensing systems (GeoSense 2011). The stand-alone DSSAT CERES-Rice model simulates growth and development of the rice crop and the water balance under flooded as well as rainfed conditions with fluctuating water regimes. On the other hand, with the dynamic web-based SIMRIW, one can simulate growth and yield of irrigated rice crops in relation to weather on the basis of vegetative and reproductive development (Horie et al. 1994). However, the model can be improved by including water flow relationships and soil, floodwater and plant N dynamics, as suggested by Godwin and Jones (1991) and Singh (2003). The model was developed from a simple, rational description of the physiological and physical processes underlying the growth of the rice crop (Horie et al. 1987). It requires few weather and crop parameters, which are generally obtained from well-defined field experiments.

The objective of the present study was to evaluate the CERES-Rice and SIMRIW models, with the main emphasis on assessing the accuracy of the self-scripted, web-based SIMRIW model for an Indian rice variety in order to develop confidence levels for further and wider usage. This accuracy assessment and the comparative statistical results will help to establish confidence levels for dynamic crop yield modeling using a distributed sensor network system with real-time data. In addition, by coupling such sensory data with the SIMRIW model, one can obtain vegetative index, leaf area index, dry weight, grain yield and potential yield in real or near real time, which helps the farming community make better decisions.

Materials and methods

Test bed

Long-term field experiments have been carried out since 1980 at the Agricultural Research Institute (ARI), Acharya N G Ranga Agricultural University (ANGRAU), Rajendranagar, Hyderabad, Andhra Pradesh, India to study the yield and biomass of rice crop. The test bed is situated at 17°19’00” latitude and 78°23’00” longitude and at an altitude of 543.3 m above mean sea level (MSL). Meteorological parameters such as minimum-maximum temperatures, sunshine hours, growing degree days (GDD), carbon dioxide (CO2) and accumulated sunshine hours (ASH) were obtained from the weather station, which is in close proximity (~100 meters) to the test bed. The test bed falls under the semi-arid tropics, and belongs to the Southern Telangana agro-climatic region with an average annual rainfall of 531.5 mm and mean temperatures ranging from 15°C to 41.6°C. The physico-chemical analysis of the experimental site indicates that the soil is a clay loam with high organic carbon content, medium in available nitrogen and phosphorus, and low in available potassium.

Field experiment

A standard experimental design (randomized block design) was laid out, replicated thrice, with three different planting dates (Table 1). In the present study, rice experiments carried out during 1994–1997 were chosen for calculating yields with the CERES-Rice and SIMRIW simulation models in both agricultural seasons, i.e., monsoon (Kharif) and post-monsoon (Rabi). Daily maximum and minimum air temperature, rainfall and daily solar radiation data during 1994–1997 were obtained from the closely situated meteorological station of ARI. Data used by Rao and Reddy (1998) on soils, agronomic and management practices were taken as input to the models.

Table 1 Details of rice experiment

Crop management factors

Rice cultivar and transplantation of rice seedlings

A long-duration (140–145 days), weakly photosensitive variety that is popular in the local agricultural system, Samba Mahsuri (BPT5204), was used in the experiment. The potential yield of the variety is 5.5–6.0 t/ha. Standard rice transplanting techniques were adopted when rice seedlings attained 20 cm height with four leaves, at a spacing of 15 × 15 cm (44.4 hills/m2).

Fertilizer, weed, plant protection and irrigation management

A common dose of 60 kg P2O5 and 40 kg K2O along with 50 kg ZnSO4 per hectare was applied before the final puddling. In addition, 120 kg N per hectare in the form of urea was added in three equal splits at the time of transplantation, 20 days after planting and at panicle initiation stages, respectively. The crop was kept free from biotic stresses throughout the growth so as to observe the yield variations primarily due to weather change. Throughout the crop growth period, a minimum of 3–5 cm of water level was maintained, thus the crop was raised under submerged water conditions.

Weather station data

The daily meteorological data was recorded in a Class 1 observatory situated about 100 meters away from the experimental field. The observatory includes an automatic weather station, which served to cross-check the weather parameters. The micro-climatic data, such as temperature, humidity, wind speed, solar radiation, etc., was recorded/collected within the crop (above and below) canopy with the help of a handheld weather tracer device.

Rainfall, temperature and sunshine hours

The annual amount of rainfall recorded in 1994 was 425.1 mm, compared to the decennial average rainfall of 531.5 mm. However, totals of 793.3 mm, 877.6 mm and 741.1 mm were observed in 1995, 1996 and 1997, respectively. This indicates that 1996 was a relatively wet year compared to 1994, 1995 and 1997. The minimum and maximum temperatures during 1994–1997 were 12.2°C, 42.8°C; 8.3°C, 43.5°C; 9.3°C, 42.9°C; and 8.4°C, 41.9°C, respectively.

A glance at the availability of solar radiation in terms of bright sunshine hours during the main crop season of these years reveals that 1995 had more cloud cover, and thereby received less radiation than 1994. The total hours of bright sunshine during September, October and November of 1995 were only 163.2, 155.1 and 261.5 hours against 216.1, 210.7 and 312.7 hours in 1994, respectively. This indicates that the Kharif plantings of 1994 received more hours of bright sunshine compared to those of 1995, 1996 and 1997. During the Rabi season of 1994–1995, weekly mean hours of bright sunshine throughout the cropping period (1–20 weeks) were high and more than 9 hours per day in the majority of the cases. However, during the second and third weeks of May (Julian weeks 19 and 20), because of cloud cover, the sunshine was reduced to 5.5 and 7.5 hours per day. In the Rabi season of 1995–1996, weekly mean bright sunshine hours throughout the crop period were more than 8.3 hours per day. However, this was less than in the previous Rabi season (1994–1995). But the crop had experienced only a few weeks (11, 17 and 19) of bright sunshine hours (10–10.2 hours) in 1994–1995. During the 1996–1997 season, weekly mean bright sunshine hours were more than 11.85 hours per day in the majority of cases, which was relatively high when compared with the previous Rabi season (1995–1996).

In the present study, the observed, DSSAT-predicted and SIMRIW-predicted yields were evaluated and correlated with normal, below-normal and above-normal weather conditions. The weather conditions were classified on the basis of the mean solar radiation, which is a major and common factor that affects the yield of the rice crop, during the cropping period of both agricultural seasons during 1994, 1995, 1996 and 1997.

Models

Two different types of process-based (mainly weather-driven) rice yield simulation models were selected for the study: the stand-alone, widely used and well-established “CERES-Rice” model of DSSAT, and a web-based, self-scripted (Java) simplified model, “SIMRIW”.

DSSAT was initially developed by the International Benchmark Sites Network for Agrotechnology Transfer (IBSNAT) group (Uehara and Tsuji 1993) and has been improved by the International Consortium for Agricultural Systems Applications (ICASA 2011). DSSAT crop simulation models simulate growth, development and yield as a function of the soil-plant-atmosphere dynamics. Its crop simulation models have been used for many applications ranging from on-farm and precision management to regional assessments of the impact of climate variability and climate change. Currently, the DSSAT software application program (version 4.5) comprises crop simulation models for over 28 crops, including the CERES-Rice model.

SIMRIW, which was published in the report of Grants-in-Aid for Scientific Research-KAKENHI (Horie et al. 1995), was initially a FORTRAN-based program developed by Kyoto University, Japan. Subsequently, a web application version of SIMRIW was introduced by the National Agricultural Research Center (NARC), Japan (NARC 2012). SIMRIW helps to simulate the potential growth and yield of irrigated rice in relation to temperature, solar radiation, and CO2 concentration in the atmosphere. In the present study, SIMRIW was tested to: (1) develop a ubiquitous and cost-effective system for a rural extension community, and (2) gain a confidence level for its use by comparing its yield predictions with those of the well-established CERES-Rice model and with observed yield data, before applying the model with dynamic distributed sensory data (Sudharsan et al. 2011).

The CERES-Rice and SIMRIW models are discussed in the following sections.

CERES-rice model

CERES-Rice is a rice crop growth simulation model that has been extensively used to understand the relationship between rice and its environment. This model has been used at local (Rao and Reddy 1998), regional and global levels (Bachelet and Gay 1993; Rosenzweig and Parry 1994) to study the impact of climate change on rice production. In addition, this model has also been successfully applied in a number of countries to estimate the impacts of climate change on rice productivity (Jones et al. 1998; Timsina and Humphreys 2006; Jayanta Kumar et al. 2010; Cheyglinted et al. 2001). The CERES-Rice model, which is variety-specific (Kumar and Sharma 2004), has been used to predict the yield of rice in various environmental conditions (arid and semi-arid) (Surendran et al. 2010).

Plant growth model

The CERES-Rice (plant growth sub-model) assumes that cultivar, soil-water conditions, weather and crop management are the primary influences on rice productivity (Bachelet and Gay 1993). Input parameters required for the model include weather, pedo-hydrological and management practices (Table 2) (Ritchie et al. 1998; Tsuji et al. 1994; Hoogenboom et al. 1995; Hunt and Boote 1998; Rezaul 1998). The growth stages in the model are juvenile, floral, heading, flowering, grain filling, maturing, and harvesting. Accomplishment of these growth stages is determined by accumulation of growing degree-days (GDD), calculated as:

$$ GDD = \begin{cases} T - 10 & \text{for } 10\,^{\circ}\mathrm{C} < T < 35\,^{\circ}\mathrm{C} \\ 45 - T & \text{for } 35\,^{\circ}\mathrm{C} < T < 45\,^{\circ}\mathrm{C} \\ 0.0 & \text{for } T \leqslant 9\,^{\circ}\mathrm{C} \text{ and } T \geqslant 45\,^{\circ}\mathrm{C} \end{cases} $$
(1)

where, base temperature is 10°C and T is the daily mean air temperature (°C). When T reaches 35°C and approaches 45°C, GDD values decrease linearly towards zero, whereas GDD is equal to zero when temperature reaches 45°C. Beer’s Law is used to measure the solar radiation absorption from the following equation:

$$ I / I_o = \exp\left( -k\,LAI \right) $$
(2)

where I/Io is the sunlight transmission ratio, k is the extinction coefficient for the rice plant (0.625), and LAI is the leaf area index. Potential dry matter production (DMpot), in g m-2, is given by the equation:

$$ DMpot = PUE\;PAR\;\left[ {1 - \exp \;\left( { - k\,LAI} \right)} \right] $$
(3)

where PUE is the radiation use efficiency (g MJ-1), and PAR is the photosynthetically active radiation, assumed to equal 50% of the incoming solar radiation. LAI is not an input to this model; it is simulated as a function of leaf-tip appearance and the rate of leaf area expansion. In the CERES-Rice model, assimilates stored in the stem are used by the plant partly or totally for grain filling, depending on the degree of environmental stress and the resulting shortfall in biomass production. At the beginning of a rice plant’s growth, only a small fraction of assimilates is partitioned to the stems; this fraction becomes large when leaf growth stops. Allocation of biomass to the roots depends on the growth stage and influences the density of roots and their efficiency in supplying nutrients to the crop. In the CERES-Rice model, partitioning to roots increases under water or nitrogen stress during all stages of the crop except grain filling. The model maintains a constant proportionality between root mass and length throughout the growing season. Yield (panicle weight at maturity) at the end of the season is calculated from the individual kernel weight and the number of plants per unit area.
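Equations 1–3 can be sketched in a few lines of Python. This is an illustrative sketch only, not the CERES-Rice source code: the function names are hypothetical, and the handling of the boundary temperatures that Eq. 1 leaves undefined (exactly 35°C, and the interval between 9°C and 10°C) is an assumption.

```python
import math

def growing_degree_days(t_mean):
    """Daily GDD following Eq. 1; t_mean is the daily mean air temperature (deg C)."""
    # Assumption: temperatures at or below the 10 deg C base contribute nothing.
    if t_mean <= 10.0 or t_mean >= 45.0:
        return 0.0
    # Assumption: exactly 35 deg C is assigned to the first branch.
    if t_mean <= 35.0:
        return t_mean - 10.0
    return 45.0 - t_mean  # declines linearly towards zero between 35 and 45 deg C

def light_transmission(lai, k=0.625):
    """Sunlight transmission ratio I/Io from Beer's law (Eq. 2)."""
    return math.exp(-k * lai)

def potential_dry_matter(pue, solar_radiation, lai, k=0.625):
    """Potential dry matter production (g m-2) following Eq. 3."""
    par = 0.5 * solar_radiation  # PAR taken as 50% of incoming solar radiation
    return pue * par * (1.0 - light_transmission(lai, k))
```

For example, `growing_degree_days(25.0)` returns 15.0, and `potential_dry_matter` returns zero for a canopy with no leaf area, since no radiation is intercepted.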

Table 2 Input parameters for CERES-Rice model

Soil water balance (SWB)

The CERES-Rice (SWB sub-model) calculates infiltration and evapotranspiration. The model offers the option of using the Priestley-Taylor method to estimate potential evapotranspiration (PET) (Priestley and Taylor 1972):

$$ PET = \frac{\alpha\, s(T_a)}{s(T_a) + \gamma}\,\left( K_n + L_n \right)\,\frac{1}{\rho_w \lambda_v} $$
(4)

where \( K_n \) is the short-wave radiation, \( L_n \) is the long-wave radiation, \( s(T_a) \) is the slope of the saturation vapour pressure versus temperature curve, \( \gamma \) is the psychrometric constant, \( \rho_w \) is the mass density of water, and \( \lambda_v \) is the latent heat of vaporization. Estimates of PET using the Priestley-Taylor equation are scaled as a function of the difference in albedo:

$$ PET = 0.05 \cdot PET + 0.95 \cdot PET \cdot \frac{1 - alb}{1 - albe} $$
(5)

where alb is the albedo of the different land classes and albe is the albedo at the site. Ritchie’s model (Ritchie 1972) neglects wind speed, and potential ET is determined based on the leaf area index. Ritchie’s model computes soil evaporation and plant transpiration separately. Evapotranspiration is computed by:

$$ E_0 = \frac{0.0504\,H_0\,\Delta}{0.68 + \Delta} $$
(6)

where \( E_0 \) is potential evapotranspiration (cm) calculated by Priestley-Taylor (1972), \( H_0 \) is net solar radiation, and \( \Delta \) is the slope of the saturation vapor pressure curve at the mean air temperature:

$$ \Delta = \frac{5304}{T^2}\,\exp\left( 21.55 - \frac{5304}{T} \right) $$
(7)

where \( T \) is the daily temperature (K). Finally, the grain yield (\( Y_G \), in g m-2) forms a specific proportion of the total dry matter production (\( W_t \), in g m-2) of a crop:

$$ Y_G = h\,W_t $$
(8)

where h is the harvest index.
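As a hedged illustration of Eqs. 6–8, the following Python sketch evaluates the slope of the saturation vapour pressure curve, the potential evapotranspiration and the grain yield. The function names are hypothetical, the numerical constants are copied as printed in Eqs. 6 and 7, and the units of the net solar radiation are assumed to be those expected by Eq. 6.

```python
import math

def vapour_pressure_slope(t_kelvin):
    """Slope of the saturation vapour pressure curve (Eq. 7); T in kelvin."""
    return (5304.0 / t_kelvin ** 2) * math.exp(21.55 - 5304.0 / t_kelvin)

def potential_evapotranspiration(h0, t_kelvin):
    """Potential evapotranspiration E0 (Eq. 6) from net solar radiation h0."""
    delta = vapour_pressure_slope(t_kelvin)
    return 0.0504 * h0 * delta / (0.68 + delta)

def grain_yield(harvest_index, total_dry_matter):
    """Grain yield (g m-2) as a fixed proportion of total dry matter (Eq. 8)."""
    return harvest_index * total_dry_matter
```

Because the ratio in Eq. 6 is always below one, the computed E0 can never exceed 0.0504 times the net radiation, which gives a quick sanity check on any implementation.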

The CERES-Rice model estimates potential water uptake by the roots and, in combination with potential evapotranspiration, calculates a water stress deficit factor. This factor ranges from 0.0 (no water stress) to 1.0 (extreme water stress). The model runs at a daily time step over the growing season to estimate yield. In addition, the model simulates the physiological processes and phasic growth of the rice plant, as well as the soil water balance, at daily temporal resolution. This helps to identify the plant’s responses to various soil and atmospheric conditions and crop-management practices.

Simulation model for rice-weather relations (SIMRIW)

SIMRIW (SImulation Model for Rice-Weather relations) is a simplified process model for simulating growth and yields of irrigated rice in relation to weather. The model was developed from a simple, rational description of the physiological and physical processes underlying the growth of the rice crop (Horie 1987). SIMRIW needs only a limited number of crop parameters, which can be obtained easily from well-defined field experiments. The model is based on the principle that the grain yield (\( Y_G \), in g m-2) forms a specific proportion of the total dry matter production (\( W_t \), in g m-2) of a crop, as described below:

$$ Y_G = h\,W_t $$
(9)

where h is the harvest index.

The relationship of radiation and crop biomass is

$$ dW_t/dt = C_s I_s $$
(10)

where \( C_s \) is the conversion efficiency of absorbed short-wave radiation to rice crop biomass (g dry matter MJ-1), \( I_s \) is the absorbed radiation per unit time (MJ m-2 d-1), and t is the time in days (d).

Over one day, Eq. (10) gives the daily growth rate of the crop:

$$ \Delta W_t = C_s S_s $$
(11)

where \( \Delta W_t \) is the daily increment of the crop weight (g m-2 d-1) and \( S_s \) is the daily total absorbed radiation (MJ m-2 d-1).

Figure 1 schematically shows the processes of rice growth, development and yield in the SIMRIW model (Horie et al. 1987). The x-axis represents the development index (DVI), which is a measure of the crop development stage on a given day, and the y-axis represents the dry weight (\( W_t \)). The quantities h, \( C_s \) and \( S_s \) are functions of the environment; T is temperature, L is photoperiod (h), F is leaf area index and \( C_s \) is the conversion efficiency.

Fig. 1 Systematic representation of processes of growth, development and yield formation of rice used in the SIMRIW model (source: Horie et al. 1987)

Phenological development of the crop

The developmental processes of the rice crop such as ear initiation, booting, heading, flowering and maturation are strongly influenced both by the environment and crop genotype. In SIMRIW, these are described by the developmental index (DVI). This variable is defined as 0.0 at crop emergence, 1.0 at heading, and 2.0 at maturity. Thus, the development stage at any time in the life of the crop is represented by a value between 0.0 and 2.0. The value of DVI is calculated by summing the developmental rate (DVR) with respect to time:

$$ DVI_t = \sum\limits_{i = 0}^{t} DVR_i $$
(12)

where \( DVI_t \) is the development index at day t and \( DVR_i \) is the developmental rate on the i-th day from emergence.

Day length and temperature are known to be the major environmental factors determining DVR. In rice, DVR from emergence to heading can be represented as a function of day length (L) and daily mean temperature (T) (Horie and Nakagawa 1994):

$$ DVR = \begin{cases} \dfrac{1}{G_y \left\{ 1 + \exp\left[ -A\left( T - T_h \right) \right] \right\}} & DVI \leqslant DVI^{*} \\ \dfrac{1 - \exp\left[ B\left( L - L_c \right) \right]}{G_y \left\{ 1 + \exp\left[ -A\left( T - T_h \right) \right] \right\}} & DVI \geqslant DVI^{*} \text{ and } L \leqslant L_c \\ 0 & DVI \geqslant DVI^{*} \text{ and } L > L_c \end{cases} $$
(13)

where DVI* is the value of DVI at which the crop becomes sensitive to photoperiod, \( L_c \) is the critical day length (h), \( T_h \) is the temperature (°C) at which DVR is half its maximum rate at the optimum temperature, \( G_y \) is the minimum number of days required for heading of a cultivar under optimum conditions, and A and B are empirical constants.
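The accumulation of DVI in Eqs. 12 and 13 can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions rather than the SIMRIW implementation: the function names are hypothetical, the parameter values used in the usage note are placeholders and not calibrated values for any cultivar, and a DVI exactly equal to DVI* is assigned to the photoperiod-insensitive branch.

```python
import math

def developmental_rate(dvi, temp, day_length, gy, a, th, b, lc, dvi_star):
    """Daily developmental rate (DVR) from emergence to heading, following Eq. 13."""
    temp_term = gy * (1.0 + math.exp(-a * (temp - th)))
    if dvi <= dvi_star:
        return 1.0 / temp_term  # phase that is insensitive to photoperiod
    if day_length <= lc:
        return (1.0 - math.exp(b * (day_length - lc))) / temp_term
    return 0.0  # day length above the critical value: no development

def days_to_heading(temps, day_lengths, **params):
    """Sum DVR day by day (Eq. 12) and return the day on which DVI reaches 1.0."""
    dvi = 0.0
    for day, (temp, day_length) in enumerate(zip(temps, day_lengths), start=1):
        dvi += developmental_rate(dvi, temp, day_length, **params)
        if dvi >= 1.0:
            return day
    return None  # heading not reached within the supplied weather record
```

With placeholder parameters such as `gy=60.0, a=0.4, th=18.0, b=0.5, lc=14.0, dvi_star=0.3`, a constant 30°C and a 12 h day length, the number of days to heading stays above the minimum of 60 days set by `gy`, as Eq. 13 requires.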

In SIMRIW, the harvest index h is represented as a function of the fraction of sterile spikelets (γ) and the crop developmental index (DVI) as given below:

$$ h = h_m \left( 1 - \gamma \right) \left\{ 1 - \exp\left[ -K_h \left( DVI - 1.22 \right) \right] \right\} $$
(14)

where:

DVI = crop developmental index

\( h_m \) = maximum harvest index of a given cultivar (under optimum conditions)

\( K_h \) = empirical constant.

On the basis of observation that the period from emergence to heading was shortened by 4% in rice under doubled CO2 conditions (Nakagawa et al. 1993), the basic vegetative period (Gy) of Eq. 13 was given by the following equation as a function of CO2 concentration:

$$ G_y = G \left[ 1 - 0.000114 \left( C_a - 350.0 \right) \right] $$
(15)

where,

\( C_a \) = atmospheric CO2 concentration (ppm)

G = value of G y at C a  = 350 ppm.
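A quick numerical check of Eq. 15 is given below as a Python sketch (the function name is hypothetical). At a doubled CO2 concentration of 700 ppm, the factor in square brackets equals 1 − 0.000114 × 350 = 0.9601, i.e. a shortening of about 4%, in agreement with the observation on which the equation is based.

```python
def basic_vegetative_period(g, co2_ppm):
    """Minimum number of days to heading as a function of CO2 (Eq. 15).

    g is the value of the period at 350 ppm CO2.
    """
    return g * (1.0 - 0.000114 * (co2_ppm - 350.0))
```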

The amount of radiation absorbed by the canopy (\( S_s \)) is a function of leaf area index (F), and the structure and optical properties of the canopy. In SIMRIW, \( S_s \) (MJ m-2 d-1) is calculated by using the formula of Monsi and Saeki (1953):

$$ S_s = S_0 \left\{ 1 - r - \left( 1 - r_0 \right) \exp\left[ -\left( 1 - m \right) k\,F \right] \right\} $$
(16)

where,

\( S_0 \) = daily total incident solar radiation (MJ m-2 d-1)

r and \( r_0 \) = reflectance of the canopy and bare soil

m = scattering coefficient

k = extinction coefficient of the canopy to daily short-wave radiation.

Dry matter production

The canopy reflectance r is given by the following equation (Research Group of Evapotranspiration 1967):

$$ r = r_f - \left( r_f - r_0 \right) \exp\left( -F/2 \right) $$
(17)

where,

\( r_f \) = reflectance of vegetation

\( r_0 \) = reflectance of the bare soil

F = leaf area index.
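Equations 16 and 17 can be combined in a short Python sketch of the radiation absorbed by the canopy. The function names are hypothetical, and the reflectance, scattering and extinction values in the usage note are placeholders chosen only for illustration.

```python
import math

def canopy_reflectance(lai, rf, r0):
    """Canopy reflectance r as a function of leaf area index (Eq. 17)."""
    return rf - (rf - r0) * math.exp(-lai / 2.0)

def absorbed_radiation(s0, lai, rf, r0, m, k):
    """Daily radiation absorbed by the canopy, Ss (MJ m-2 d-1), following Eq. 16."""
    r = canopy_reflectance(lai, rf, r0)
    return s0 * (1.0 - r - (1.0 - r0) * math.exp(-(1.0 - m) * k * lai))
```

With no leaf area (F = 0) the canopy reflectance reduces to that of bare soil and the absorbed radiation is zero. As F grows, the absorbed fraction approaches 1 − r_f; for instance, with the placeholder values `rf=0.22, r0=0.1, m=0.25, k=0.6` it stays below 0.78 of the daily solar radiation.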

Daily dry matter production is calculated by multiplying the Ss value by an appropriate value of the radiation conversion efficiency Cs. As it has been shown in Horie and Sakuratani (1985), Cs is constant until the middle of the grain-filling stage, after which it decreases gradually toward zero at maturity. This pattern is simulated using the following equation:

$$ \begin{aligned} 0.0 < DVI < 1.0: &\quad C_0 = C \\ 1.0 \leqslant DVI < 2.0: &\quad C_0 = C\left( 1 + B \right)/\left\{ 1 + B \exp\left[ \left( DVI - 1 \right)/t \right] \right\} \end{aligned} $$
(18)

where,

DVI = crop developmental index

\( C_0 \) = radiation conversion efficiency at 330 ppm CO2 (g MJ-1)

C and B = empirical constants

t = time unit (d) day.

The expansion of leaf area is modeled independently of leaf weight, for reasons outlined by Horie et al. (1987). It is well documented that CO2 enrichment has little or no effect on leaf area development in rice (Imai et al. 1985; Baker et al. 1990a; Nakagawa et al. 1993) under the optimal cultivation conditions (water and nutrients are not limiting factors in the expansion of leaf area), and the main governing factor is temperature. In SIMRIW, the relationship between relative growth rate of leaf area index (F) and daily mean temperature (T) for the period before heading is given as:

$$ \frac{1}{F}\,\frac{dF}{dt} = A \left\{ 1 - \exp\left[ -K_f \left( T - T_{cf} \right) \right] \right\} \left[ 1 - \left( F/F_{as} \right)^{h} \right] $$
(19)

where,

A = maximum relative growth rate of LAI (m2 m-2), obtained under optimum conditions (temperature, solar radiation, nutrients, pests, and diseases are not limiting)

\( T_{cf} \) = minimum temperature for LAI growth (°C)

\( F_{as} \) = asymptotic value of leaf area index when temperature is not limiting (m2 m-2),

K f and h = empirical constants.
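The leaf area growth relation of Eq. 19 can be integrated numerically, as in the Python sketch below. Several choices here are assumptions that are not part of the model description: the daily forward-Euler step, the clamping of the temperature response to zero below the minimum temperature, the function names, and the placeholder parameter values in the usage note.

```python
import math

def lai_relative_growth_rate(f, temp, a, kf, tcf, fas, h):
    """Relative growth rate of leaf area index, (1/F) dF/dt, following Eq. 19."""
    # Assumption: no LAI growth (rather than shrinkage) below the minimum temperature.
    temp_term = max(0.0, 1.0 - math.exp(-kf * (temp - tcf)))
    return a * temp_term * (1.0 - (f / fas) ** h)

def simulate_lai(f0, temps, **params):
    """Integrate Eq. 19 with a one-day forward-Euler step (an assumed scheme)."""
    f = f0
    series = [f]
    for temp in temps:
        f += f * lai_relative_growth_rate(f, temp, **params)
        series.append(f)
    return series
```

Called as `simulate_lai(0.1, [28.0] * 100, a=0.2, kf=0.07, tcf=10.0, fas=6.5, h=1.5)`, the simulated LAI rises sigmoidally and levels off at the asymptotic value `fas`, which is the behaviour that the last factor of Eq. 19 is designed to produce.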

Yield formation

The model terminates when DVI reaches 2.0. Yield depends on \( W_y \) (dry weight of the panicles) and \( W_t \) (whole crop dry weight, including roots), which vary with phenological stage (i.e. from booting to maturation). In SIMRIW, the harvest index h is a function of the fraction of sterile spikelets (γ) and the crop development index (DVI):

$$ h = h_m \left( 1 - \gamma \right) \left\{ 1 - \exp\left[ -K_h \left( DVI - 1.22 \right) \right] \right\} $$
(20)

where,

\( h_m \) = maximum harvest index of a given cultivar

\( K_h \) = empirical constant.

In SIMRIW, predicted potential yield (Y p) can be converted to actual yield (Y a) by

$$ Ya = K\;Yp $$
(21)

where K is a technology coefficient representing the level of technology applied to the experiments. In applying this model to evaluate the effects of climate change, K is assumed to be constant, and only the relative predicted changes in the potential yield are considered.
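The yield formation step (Eqs. 9, 20 and 21) can be sketched in Python as follows. The function names are hypothetical, the parameter values in the usage note are placeholders, and setting the harvest index to zero before DVI reaches 1.22 is an assumption made here to avoid negative values.

```python
import math

def harvest_index(dvi, hm, gamma, kh):
    """Harvest index h as a function of DVI and spikelet sterility (Eq. 20)."""
    h = hm * (1.0 - gamma) * (1.0 - math.exp(-kh * (dvi - 1.22)))
    return max(0.0, h)  # assumption: no grain yield before DVI = 1.22

def potential_yield(dvi, total_dry_weight, hm, gamma, kh):
    """Potential grain yield (g m-2): harvest index times total dry weight (Eq. 9)."""
    return harvest_index(dvi, hm, gamma, kh) * total_dry_weight

def actual_yield(yp, k_tech):
    """Actual yield from potential yield and the technology coefficient K (Eq. 21)."""
    return k_tech * yp
```

For example, `actual_yield(800.0, 0.75)` returns 600.0, and at maturity (DVI = 2.0) the harvest index remains below its maximum value `hm` whenever some spikelets are sterile.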

The SIMRIW model can cope effectively with Indian agro-climatic farming environments because (1) it is a weather-based model, (2) it requires only a few, yet relevant, parameters, which is an advantage when data are sparse and difficult to obtain, and (3) it is cost-effective. However, it may require fine tuning to improve accuracy, and it cannot be applied under un-irrigated conditions.

Statistical approach

A few statistical techniques were used to assess how closely (accurately or with bias) the SIMRIW- and DSSAT-simulated crop yields match the observed crop yields.

Correlation coefficient

The computational formula for the simple Pearson product–moment correlation coefficient between variable X and variable Y is

$$ r_{xy} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[ n\sum X^2 - \left( \sum X \right)^2 \right]\left[ n\sum Y^2 - \left( \sum Y \right)^2 \right]}} $$
(22)

where,

\( r_{xy} \) = correlation coefficient between X (observed) and Y (simulated SIMRIW/DSSAT) variables

n = size of the sample (9 for each of Kharif and Rabi)

X = individual’s score on the X variable (58.2 for Kharif and 72.7 for Rabi)

Y = individual’s score on the Y variable [(DSSAT: 62.7 for Kharif and 78.3 for Rabi) and (SIMRIW: 62.3 for Kharif and 80.7 for Rabi)]

XY = product of each X score times its corresponding Y score

\( X^2 \) = individual X score, squared: \( (58.2)^2 \) for Kharif and \( (72.7)^2 \) for Rabi

\( Y^2 \) = individual Y score, squared: DSSAT, \( (62.7)^2 \) for Kharif and \( (78.3)^2 \) for Rabi; SIMRIW, \( (62.3)^2 \) for Kharif and \( (80.7)^2 \) for Rabi
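The computational formula of Eq. 22 translates directly into code. The Python sketch below uses a hypothetical function name and is meant only to show how the sums enter the formula; it is not the script used in this study.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation using the computational formula (Eq. 22)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator
```

Here `x` would hold the observed yields and `y` the yields simulated by DSSAT or SIMRIW for the same plantings. Two series that rise together in exact proportion give a coefficient of 1, and exactly opposed series give −1.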

Linear regression

Linear regression analyzes the relationship between two variables, X (observed) and Y (simulated). The value \( R^2 \) is a fraction between 0.0 and 1.0, and has no units. An \( R^2 \) value of 0.0 means that knowing X does not help to predict Y: there is no linear relationship between X (observed) and Y (simulated), and the best-fit line is a horizontal line going through the mean of all Y values. When \( R^2 \) equals 1.0, all points lie exactly on a straight line with no scatter, and knowing X lets one predict Y perfectly.

Hypothesis test

The null hypothesis is a hypothesis about a population parameter. The purpose is to test the viability of the null hypothesis in the light of the experimental data. Depending on the data, the null hypothesis either will or will not be rejected as a viable possibility. It also provides a benchmark against which observed and predicted outcomes can be compared to see whether the differences are due to some other factors. The null hypothesis, which states that there is no difference between the means for population 1 and population 2, is expressed as:

$$ H_0 : \mu_1 = \mu_2 $$
(23)

where,

\( H_0 \) = the null hypothesis

\( \mu_1 \) = the theoretical average for the population of the first group (average population of observed value)

\( \mu_2 \) = the theoretical average for the population of the second group (average population of DSSAT/SIMRIW value).

If the results are not consistent with the null hypothesis (i.e., that there is no difference between the means of observed and simulated values), then the research hypothesis is accepted:

$$ H_1 : \mu_1 \ne \mu_2 $$
(24)

where,

\( H_1 \) = the research hypothesis

\( \mu_1 \) = the theoretical average for the population of the first group (observed)

\( \mu_2 \) = the theoretical average for the population of the second group (simulated).

Test significance

The test of significance is based on the fact that each type of null hypothesis is associated with a particular type of statistical technique used in the study (correlation coefficient and t-test). Each of the statistical techniques is associated with a specific distribution, against which the statistic computed from the observed and simulated data is compared. A comparison between the observed and the simulated (DSSAT/SIMRIW) resulting distribution indicates that if the simulated values are different from the observed values then the significance level of risk is not 100%. The significance level is the level of risk that a user is willing to take in rejecting a null hypothesis that is actually true.

The t-test for significance of the difference between the means of two correlated results

The t-test has a major assumption that the amount of variability in each of the two groups (observed and simulated) is equal. The formula for computing the t value for independent means is:

$$ t = \frac{X_1 - X_2}{\sqrt{\left[ \frac{\left( n_1 - 1 \right)s_1^2 + \left( n_2 - 1 \right)s_2^2}{n_1 + n_2 - 2} \right]\left[ \frac{n_1 + n_2}{n_1\,n_2} \right]}} $$
(25)

where,

X 1 = mean of group 1 (observed)

X 2 = mean of group 2 (DSSAT/SIMRIW)

n 1 = number of participants in group 1 (9 observed)

n 2 = number of participants in group 2 (9 DSSAT/SIMRIW)

\( s_1^2 \)= variance for group 1 (variance of observed)

\( s_2^2 \)= variance for group 2 (variance of DSSAT/SIMRIW).

The difference between the means of observed and simulated values makes up the numerator, whereas the amount of variation within the group and between each group makes up the denominator. As presented in Eq. 23, the null hypothesis states that there is no difference between the means for the first (observed) and second groups (DSSAT/SIMRIW). In our case, the research hypothesis (Eq. 24) states that, there is a difference between the means of the two groups. The research hypothesis is two-tailed and non-directional as it posits a difference but in no particular direction. The level of risk (level of significance or type I error) associated with the null hypothesis is set to 0.50/0.40/0.30/0.20/0.10/0.05/0.02/0.01/0.02 and 0.001 to give the type of errors. The risk level of significance is totally a decision of the researcher (Salkind 2007).
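As an illustration of the t statistic for two group means, the following Python sketch uses the usual pooled variance, in which the two variances, each weighted by its degrees of freedom, are added. The function name is hypothetical and the sketch is not the script used in this study.

```python
import math
from statistics import mean, variance

def pooled_t(group1, group2):
    """t statistic for the difference between two group means (pooled variance)."""
    n1, n2 = len(group1), len(group2)
    s1_sq = variance(group1)  # sample variance (n - 1 in the denominator)
    s2_sq = variance(group2)
    pooled = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / math.sqrt(pooled * (n1 + n2) / (n1 * n2))
```

Here `group1` would hold the observed yields and `group2` the DSSAT or SIMRIW yields. With nine values in each group, the statistic is compared with the critical value of the t distribution for 9 + 9 − 2 = 16 degrees of freedom at the chosen two-tailed significance level.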

Modified Nash-Sutcliffe efficiency (E) and index agreement (d)

The modified Nash–Sutcliffe (Moriasi et al. 2007) method was used to describe the predictive accuracy of simulated models with observed data (it is also known as the coefficient of determination):

$$ E = 1 - \frac{\sum\limits_{t = 1}^T \left( Q_o^t - Q_m^t \right)^2}{\sum\limits_{t = 1}^T \left( Q_o^t - \overline{Q}_o \right)^2} $$
(26)

where,

\( Q_o \) is observed crop yield (\( \overline{Q}_o \) denotes its mean)

\( Q_m \) is modeled crop yield

\( Q_o^t \) is observed crop growth at time t.

Nash–Sutcliffe efficiencies (E) can range from −∞ to 1. An efficiency of 1 (E = 1) corresponds to a perfect match of modeled yield to the observed yield data. An efficiency of 0 (E = 0) indicates that the model predictions are only as accurate as the mean of the observed data, whereas an efficiency less than zero (E < 0) occurs when the observed mean is a better predictor than the model. Equation 26 was coded and executed in MATLAB.
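The behaviour of E at its reference points can be checked with a short sketch. The Python function below is a minimal illustration of Eq. 26 and is not the MATLAB code used in the study. The name `nash_sutcliffe` and all numbers are hypothetical.

```python
def nash_sutcliffe(observed, modeled):
    """Nash-Sutcliffe efficiency: 1 - SS(residuals) / SS(observed about its mean)."""
    mean_obs = sum(observed) / len(observed)
    residual_ss = sum((o - m) ** 2 for o, m in zip(observed, modeled))
    total_ss = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual_ss / total_ss


# Hypothetical yields in t/ha, for illustration only (not data from the study)
observed = [4.0, 5.0, 6.0]

e_perfect = nash_sutcliffe(observed, [4.0, 5.0, 6.0])   # model reproduces the observations
e_mean = nash_sutcliffe(observed, [5.0, 5.0, 5.0])      # model always predicts the observed mean
e_worse = nash_sutcliffe(observed, [6.0, 5.0, 4.0])     # model is worse than the observed mean
print(e_perfect, e_mean, e_worse)
```

The three cases correspond to E = 1, E = 0 and E < 0, respectively.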

The index of agreement (d) developed by Willmott et al. (1985) is a standard measure of the degree of model prediction error and varies between 0 and 1. A value of 1 indicates a perfect match, and 0 indicates no agreement at all. The index of agreement can detect additive and proportional differences in the observed and simulated means and variances; however, it is overly sensitive to extreme values due to the squared differences (Legates and McCabe 1999).

$$ d = 1 - \frac{\sum\nolimits_{i = 1}^N \left( O_i - S_i \right)^2}{\sum\nolimits_{i = 1}^N \left( \left| S_i - \overline{O} \right| + \left| O_i - \overline{O} \right| \right)^2} $$
(27)

where,

d = index of agreement

\( O_i \) = observed grain yield

\( S_i \) = simulated grain yield

\( \overline O \) = mean of the observed grain yield.
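The index of agreement can be sketched in the same way. The Python function below assumes Willmott's standard form, in which each term of the denominator is squared. The name `index_of_agreement` and the yield values are hypothetical.

```python
def index_of_agreement(observed, simulated):
    """Willmott's index of agreement d (1 = perfect match, 0 = no agreement)."""
    mean_obs = sum(observed) / len(observed)
    # Sum of squared differences between observed and simulated values
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Potential error: distances of both series from the observed mean, squared
    potential_error = sum(
        (abs(s - mean_obs) + abs(o - mean_obs)) ** 2
        for o, s in zip(observed, simulated)
    )
    return 1.0 - squared_error / potential_error


# Hypothetical yields in t/ha, for illustration only (not data from the study)
observed = [4.0, 5.0, 6.0]
simulated = [5.0, 6.0, 7.0]
d = index_of_agreement(observed, simulated)
print(f"d = {d:.4f}")
```

Because both the numerator and the denominator use squared terms, a single large miss can dominate the value of d, which is the sensitivity to extreme values noted above.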

Results and discussion

The main objective is to provide simple, yet accurate, yield predictions for the user community and to make ubiquitous, real-time use of dynamic weather data. In this report, a web-based SIMRIW model and a robust, stand-alone DSSAT model were evaluated with four years of weather and related parameters, and their simulated yields were compared with the observed (actual) yields to assess accuracy.

DSSAT combines crop, soil and weather databases in standard formats for use by crop models and application programs. The user can simulate multi-year outcomes of management strategies for different crops. It also supports validation of crop model results by allowing users to compare simulated results with observed values. Fulfilling all the data requirements of the CERES-Rice model is a difficult and challenging task. However, the model can provide reasonably accurate yield estimates for the prevailing environmental conditions from the available data. Weather parameters were integrated over three different crop growth phases (Active Vegetative Phase-AVP, Reproductive Phase-RP and Maturity Phase-MP) to study the rice yield.

The web-based cost-effective and user-friendly SIMRIW model (Fig. 2) interface has been developed using Java and Java applets. The users can simulate yield with a few simple steps with relevant data: (1) select the temperature and solar radiation data, with an option of user defined data, (2) select the cultivar type, (3) enter the CO2 level in ppm and (4) enter date of sowing to calculate the value of DVI and LAI. Dry weight and other physical properties are already embedded in the SIMRIW interface depending on the cultivar that one selects. At the end, the user can obtain leaf area index, dry weight, grain yield, potential yield and actual yield on a daily basis.

Fig. 2 Web-interface window for SImulation Model for RIce-Weather relations (SIMRIW) analysis

Table 3 compares the rice yields simulated by DSSAT-CERES (Rao and Reddy 1998) and SIMRIW with the observed yields. For both models, the predicted yields are generally higher than or close to the observed yields. The error could be due to the models treating the test bed as being under optimum conditions. The comparison with actual yields helps decision makers (scientists, policy makers, emergency managers, etc.) gain confidence in choosing either the pc-based DSSAT or the ubiquitous SIMRIW for predicting or estimating crop yield from minimal weather and crop parameters.

Table 3 Comparison of observed, DSSAT and SIMRIW predicted grain yields with respect to weather condition

The linear regression analysis in this study was carried out to examine how closely the observed and model-simulated crop yields are related (Fig. 3). The figure plots weighted observed (x-axis) versus weighted simulated (y-axis) values, covering three different weather conditions within the agricultural year (both Kharif and Rabi). The related correlation and regression coefficients were computed and are presented in Table 3.

Fig. 3 Correlation coefficients weighted for Decision Support System for Agrotechnology Transfer (DSSAT) and SImulation Model for RIce-Weather relations (SIMRIW) versus weighted observed yields

The correlation analysis shows a significant positive correlation between observed and DSSAT-simulated grain yield, with \( R^2 \) values of 0.714, 0.804, 0.973, 0.927, 0.799 and 0.993 across the cropping seasons of 1994 to 1997. These \( R^2 \) trends are in conformity with Rao and Reddy (1998). The SIMRIW model was also validated against the same experiment; its correlation coefficients are 0.951, 0.871, 0.968, 0.911, 0.942 and 0.985 over the same seasons. It is also observed that SIMRIW simulated results are closer to observed yields under the normal weather condition (Kharif 1994, Rabi 1994–1995, Rabi 1996–1997), above normal condition (Kharif 1995, Rabi 1996–1996) and below normal condition (Kharif 1996). The differences between seasons could be due to differences in temperature, solar radiation or sunshine hours in the corresponding season. The amount of rainfall may not have much impact on the SIMRIW result, as the model generally assumes irrigated conditions. The overall correlation coefficient is 0.94 for the CERES-Rice simulation and 0.93 for the SIMRIW simulation, which motivated the development of a web-based SIMRIW crop yield simulation, driven by a real/near real-time distributed sensor network, to predict daily/monthly/season-wise crop yields.
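For readers who wish to reproduce this kind of comparison, the sketch below computes the Pearson correlation coefficient, its square and the least-squares regression line of simulated on observed yields. It is a minimal illustration and not the procedure used to produce Table 3. The function name `correlation_and_fit` and the yield values are hypothetical.

```python
import math


def correlation_and_fit(x, y):
    """Return (r, slope, intercept) for a simple linear regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)      # Pearson correlation coefficient
    slope = sxy / sxx                   # least-squares slope
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical yields in t/ha, for illustration only (not data from the study)
observed = [4.1, 4.8, 5.5, 6.0, 6.4]
simulated = [4.5, 5.0, 5.9, 6.3, 7.0]
r, slope, intercept = correlation_and_fit(observed, simulated)
print(f"r = {r:.3f}, R^2 = {r * r:.3f}, simulated = {slope:.3f} * observed + {intercept:.3f}")
```

A high correlation alone does not show that a model is unbiased: a model that overestimates every yield by the same amount still gives r = 1, which is why the slope and intercept are reported alongside it.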

A t-test for significance of the difference between the means of two correlated results

Tables 4 and 5 display the t-test summary statistics of the observed, DSSAT and SIMRIW samples. The t-test results were obtained from the mean of the differences between the paired observations and the standard deviation of these differences, followed by a 95% confidence interval for the mean difference between simulated and observed values and the two-tailed P-value (less than 0.05). The conclusion is that the mean difference between the paired observations is significantly different from 0. In general, if the obtained value (observed) is more than the critical value (simulated values), the null hypothesis cannot be accepted. However, in the present study, the obtained value did not exceed the critical value. The differences in the result could be due to sampling error, rounding error or simple variability in the simulation results.
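The paired calculation described above (mean difference, standard deviation of the differences and a 95% confidence interval) can be sketched as follows. This is a minimal illustration that assumes nine paired values, hence 8 degrees of freedom, for which the two-tailed 5% critical value of Student's t is 2.306. The function name `paired_t` and the yield values are hypothetical and do not reproduce Tables 4 and 5.

```python
import math
from statistics import mean, stdev

# Two-tailed 5% critical value of Student's t for 8 degrees of freedom (n = 9 pairs)
T_CRIT_DF8 = 2.306


def paired_t(observed, simulated, t_crit):
    """Paired t statistic and confidence interval for the mean difference."""
    diffs = [s - o for o, s in zip(observed, simulated)]
    mean_diff = mean(diffs)
    std_err = stdev(diffs) / math.sqrt(len(diffs))   # standard error of the mean difference
    t_stat = mean_diff / std_err
    interval = (mean_diff - t_crit * std_err, mean_diff + t_crit * std_err)
    return t_stat, interval


# Hypothetical yields in t/ha, for illustration only (not data from the study)
observed = [4.0, 4.5, 5.0, 5.5, 6.0, 4.2, 4.8, 5.2, 5.8]
simulated = [4.3, 4.6, 5.6, 5.4, 6.5, 4.9, 4.7, 5.9, 6.0]
t_stat, ci = paired_t(observed, simulated, T_CRIT_DF8)
reject_null = abs(t_stat) > T_CRIT_DF8
print(f"t = {t_stat:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), reject H0: {reject_null}")
```

The null hypothesis of no mean difference is rejected at the 5% level when the absolute value of t exceeds the critical value, or equivalently when the confidence interval excludes 0.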

Table 4 Observed, DSSAT & SIMRIW t-test result (Kharif season)
Table 5 Observed, DSSAT & SIMRIW t-test result (Rabi season)

The obtained (observed) values and the critical (simulated) values do not satisfy the null hypothesis condition. Hence, DSSAT and SIMRIW performed differently with respect to the level of risk defined by the user (according to the situation). Tables 4, 5, 6 and 7 show different tails (Tail 1 and Tail 2) and different types (Type 1, Type 2 and Type 3) for both agricultural seasons (Kharif and Rabi), as is typically done in the literature. The two-tailed t-test shows the differences between the model results: the Type 2, Tail 2 (T2) function gives 0.2269 for DSSAT and 0.3155 for SIMRIW in the Kharif season, and 0.2694 for DSSAT and 0.1170 for SIMRIW in the Rabi season. The results indicate that SIMRIW was more accurate in Rabi than in Kharif, as SIMRIW considers the test bed under optimum and irrigated conditions.

Table 6 Observed, DSSAT & SIMRIW t-test (two-tailed) T1 results (Kharif season)
Table 7 Observed, DSSAT & SIMRIW t-test (two-tailed) T2 result (Rabi season)

Modified Nash-Sutcliffe efficiency (E) and index agreement (d) results

The modified Nash–Sutcliffe efficiency (E) for rice yield was 0.73 for DSSAT and 0.62 for SIMRIW. A screenshot of the MATLAB results for E is shown in Fig. 4. The index of agreement (d), which detects additive and proportional differences in the observed and simulated means and variances, was 0.57 for DSSAT and 0.47 for SIMRIW (Table 8). On both measures the SIMRIW values are lower than the DSSAT values, as SIMRIW considers only minimal weather and crop parameters.

Fig. 4 A MATLAB-based user interface window for modified Nash-Sutcliffe efficiency (E) analysis

Table 8 Observed, DSSAT and SIMRIW predicted grain yields with index agreement (d)

The present study confirms the potential of a web-based SIMRIW model to simulate rice yields with reasonable accuracy. SIMRIW is also a feasible model for predicting rice yield in near real-time mode, as it was developed with Java and Java applets for use over the Internet. The SIMRIW simulated yield was, in general, higher than the observed yield, which may be mainly because the model neglects the contribution of nitrogen (N) and the amount of water (irrigation water/rainwater) involved. The model could be improved by adding nutrient and water dimensions (Sudharsan et al. 2011). The results also indicate that there were differences in yield even though the test bed was maintained under the same optimum conditions. This may be due to the influence of climatic/environmental factors, even under optimum (protected) conditions.

Conclusions

In the present study, simulated rice yields (pc-based DSSAT and web-based SIMRIW) were evaluated against observed rice yields grown under different weather conditions (normal, above normal and below normal). The study was carried out in a test bed in the semi-arid tropics of India, with a weather data set (obtained from a nearby weather station) covering the 1994 to 1997 agricultural (monsoon and post-monsoon) seasons, in which the crops were transplanted on different dates. One of the main purposes of this study was to obtain confidence levels and to study the feasibility of a ubiquitous SIMRIW model driven by the proposed dynamic, real-time datasets from distributed sensing systems via a wireless sensor network (GeoSense 2011). Quantitative yield comparisons and accuracy assessment of the simulated results were carried out with relevant statistical methods (correlation coefficient, linear regression, R-square, hypothesis test, t-test significance, and E and d agreement). These techniques revealed that the simulated (SIMRIW and DSSAT) yields are closely related to the observed yields. The correlation coefficient and linear regression R-square techniques relate the simulated results to the observed yields with accuracy levels ranging from 85 to 99%, while the t-test does so with an accuracy of 95–100% (depending on the level of risk that the user defines). The modified Nash–Sutcliffe efficiency is 0.73 for DSSAT and 0.62 for SIMRIW, and the index of agreement is 0.57 and 0.47, respectively.

This paper shows that yield responses vary with the date of transplanting and the weather conditions of the cropping period. One observation from this research is that transplanting the rice crop on the regular transplanting date results in the lowest yield. It appears that potential yield increases rapidly if transplanting is done 10–15 days later than the regular date, provided the weather conditions are normal. Thus, it is concluded that a 10–15 day shift in transplanting will give better yields in the semi-arid tropical region.

In general, the simulation models (DSSAT and SIMRIW) overestimated the grain yield, but the values still fall within permissible limits. DSSAT is relatively more accurate than SIMRIW because it incorporates more crop/soil/weather parameters (it is a data-hungry model), whereas SIMRIW considers only a few crop and weather parameters and treats the test bed as being under optimum and irrigated conditions. SIMRIW therefore needs to be extended to account for nutrient and irrigation water amounts. Nevertheless, the SIMRIW model helps the user community (farmers, decision makers and policymakers) to simulate daily/weekly/monthly/seasonal rice crop yield with their own (user-defined) data in real to near real-time mode.