1 Introduction

Statistical and dynamical downscaling are the two main techniques for transforming coarse spatial resolution climate projections into finer spatial resolution projections. The main concept of statistical downscaling (SD) is to derive statistical transfer functions between large-scale variables that are resolved by general circulation models (GCMs) and local variables of interest that are not resolved (Leung et al. 2003). Typical SD methods include analogue (Zorita and von Storch 1999; Timbal et al. 2008), circulation-pattern-downscaling or pattern-scaling (Mitchell 2003) and synoptic downscaling (Charles et al. 2003). Dynamical downscaling (DD) has developed in parallel with the advent of regional climate models (RCMs; McGregor 1997; Zhang et al. 2001). Stretched grid (variable resolution) and time-slice experiments are the state-of-the-art dynamical downscaling methods in modelling regional climate (Leung et al. 2003; Wang et al. 2004). These downscaling techniques have their own advantages and disadvantages. For example, the analogue method, one of the SD approaches, is based on historical analogues and therefore inherently performs better against observed climate than dynamical downscaling. Dynamical downscaling is nevertheless needed because some future climatic conditions have no historical analogue.

It is widely recognised that uncertainties propagate through every step of climate risk assessment: the projection of greenhouse gases (GHGs), climate sensitivity, the projection of local climate changes (CC), the use of different downscaling techniques, the way climate change scenarios (CCSs) are constructed and the use of different impact models. Substantial work has been done in identifying, quantifying and managing uncertainties in the projection of GHG emissions, global warming and regional CC (Webster and Sokolov 1998; Reilly et al. 2001; Stott and Kettleborough 2002) and in the construction of local CCSs for risk assessment (Luo et al. 2005a, b, 2006, 2010; Baigorria et al. 2007; Semenov and Stratonovitch 2010). Several comparison studies have been conducted in modelling regional CC (Macadam et al. 2010) and crop yields and in using different methods for constructing CCSs. Significant differences were found in simulated crop yields between the use of high resolution scenarios and the use of coarser resolution GCM scenarios (Mearns et al. 2001, 2003). Baron et al. (2005) compared millet yields between spatially and temporally aggregated climate variables and non-aggregated climate variables and found that aggregated rainfall overestimated millet yield by 10 % to 50 % in dry latitudes, while aggregated solar radiation caused significant bias in wetter zones in West Africa. Ines and Hansen (2006) found that corrected daily GCM rainfall improved corn yield prediction in Kenya. Luo et al. (2009, 2010) found a significant difference in the mean of simulated wheat yields, depending on whether or not changes in climate variability (CV) are considered. A recent study by Oettli et al. (2011) applied a bias correction technique (cumulative distribution function - transformation) to the outputs of eight RCMs in West Africa and found that the combination of both dynamical and statistical downscaling techniques would better contribute to the reduction of uncertainties in projected sorghum yield impact.

A couple of studies addressed uncertainty issues arising from the use of impact models such as crop growth models. Wolf et al. (1996) validated and calibrated five commonly used wheat models and applied them to quantify CC impact on wheat production in Europe. Ewert et al. (2002) tested the performance of three wheat crop models of different complexity against experimental datasets from Free Air CO2 Enrichment and Open Top Chamber experiments. There is increasing interest in inter-comparing CC impact and adaptation results obtained from different crop models and/or models of different complexity in agricultural risk assessment, for example, the global Agricultural Model Inter-comparison and Improvement Project (Rosenzweig et al. 2013). Rötter et al. (2011) identified several urgent needs for improving climate-crop modelling.

To the best of our knowledge, very few studies have investigated the uncertainties originating from the use of different downscaling techniques from the perspective of agricultural application, even though some comparison studies do exist in the climatology and water resource domains. Zorita and von Storch (1999) inter-compared four types of statistical downscaling techniques. Hay et al. (2000) applied the delta change approach and the statistical downscaling approach in estimating current and future runoff in three river basins in America. Frost et al. (2011) tested the performance of six downscaling methods in producing a range of statistics in the context of hydrological studies. This gap motivated us to carry out a comparison study between the two streams of downscaling approaches. This research aims to evaluate the performance of dynamical and statistical downscaling techniques against observed climate data, and to compare the difference in their projection of future CC and the resultant impact on wheat crop yields.

2 Method

2.1 Study sites

This study focused on three major wheat production areas in New South Wales (NSW), Australia: Bingara, Peak Hill and Deniliquin, as shown in Fig. 1. These three sites span a large geographical region and thus have quite different climate patterns. Bingara, located in the northeast of NSW, has a summer-dominant rainfall pattern while Deniliquin, in the southwest of the NSW wheat belt, has a winter-dominant rainfall pattern. Peak Hill, located in the middle of NSW, has a rainfall pattern intermediate between the two. These three sites also differ in rainfall amount: Bingara and Peak Hill belong to medium-high rainfall areas with growing season (GS, May–Oct. inclusive) rainfall of 349 mm and 305 mm respectively, while Deniliquin belongs to a low rainfall area with GS rainfall of 258 mm.

Fig. 1 Location of study sites. Inset shows the Murray Darling Basin (MDB) within Australia

2.2 Downscaling techniques and climate data sources

Dynamically and statistically downscaled daily climate data for the periods 1980–1999 and 2046–2065, representing respectively the current and future climates, were used in this study. For the DD approach, outputs of the CSIRO Conformal-Cubic Atmospheric Model (CCAM) were used. CCAM is a stretched-grid model (McGregor and Dix 2008), which has a roughly uniform grid over the area of interest, and a coarser-resolution grid over the remainder of the globe. CCAM was driven by the CSIRO Mk 3.5 model fields for the A2 scenario of the Special Report on Emission Scenarios. It should be noted that outputs of CCAM with a spatial resolution of 15 by 15 km were used to represent station level in this study. The soundness of this representation was justified in Online Resource 1.

SD climate data were obtained from the Bureau of Meteorology (BoM) statistical downscaling model (SDM) based on an analogue approach and high quality historical climate datasets (Timbal et al. 2008). More details on the analogue approach can be found in Online Resource 2. The BoM SDM was applied to the outputs of the CSIRO Mk 3.5 model under the A2 emission scenario. Results from BoM SDM for the three study locations were obtained directly using a Graphical User Interface described by Timbal et al. (2008). These stations are located in the Southern Murray-Darling basin (as shown in their Fig. 13) and the optimal combination of predictors for each surface predictand is shown in their Tables 4a and 4b (Timbal et al. 2008). For agricultural application, information on both temperature and rainfall is usually needed. However there are limited choices of locations which have both SD temperature and rainfall in the NSW wheat belt. Among the three locations considered, Deniliquin has both SD temperature and rainfall. The other two only have SD rainfall data. Downscaled temperature data from nearby locations (within 60 km) such as Moree for Bingara and Dubbo for Peak Hill were used instead.

Observed climate data for the three study sites for the period 1980–1999 were extracted from BoM high quality datasets. Once again, high quality temperature data at Moree and Dubbo were used for Bingara and Peak Hill respectively. However, high quality temperature data for Moree are only available up to the end of August 1998. To obtain the same length of observed temperature record as at the other two locations, Tmax and Tmin at Moree from the SILO Patched Point Dataset (http://www.longpaddock.qld.gov.au/silo) were used to fill the gap from September 1998 to December 1999.
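The gap-filling step described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually coded for this study: the function name `fill_gap`, the date-keyed dictionaries and all temperature values are hypothetical.

```python
from datetime import date, timedelta

def fill_gap(primary, secondary, start, end):
    """Return a daily series from start to end (inclusive).

    Values come from `primary` where present, otherwise from `secondary`.
    Both inputs map datetime.date -> value. Days missing from both stay None.
    """
    merged = {}
    day = start
    while day <= end:
        value = primary.get(day)
        if value is None:
            value = secondary.get(day)
        merged[day] = value
        day += timedelta(days=1)
    return merged

# Hypothetical Tmax values (deg C): the primary record stops on 31 August 1998,
# so later days are taken from the secondary dataset.
high_quality = {date(1998, 8, 30): 21.4, date(1998, 8, 31): 22.0}
patched_point = {date(1998, 8, 31): 21.8, date(1998, 9, 1): 23.1, date(1998, 9, 2): 19.7}

tmax = fill_gap(high_quality, patched_point, date(1998, 8, 30), date(1998, 9, 2))
```

Where both records overlap, the sketch keeps the primary value, so the secondary dataset only contributes to the days the primary record does not cover.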

2.3 Derivation of climatic changes and construction of local CCSs

It is widely recognised that the daily outputs of downscaled climate projections cannot be directly used by impact models as these projections are associated with climate model bias. Further steps such as the use of the LARS-WG are needed to reduce the bias by linking with observed climate data (Luo and Yu 2012). In this study dynamically and statistically downscaled daily outputs for the above-mentioned two periods were used firstly to derive monthly changes in mean climate and in CV through a stochastic weather generator, LARS-WG, the most widely used weather generator in this research area (White et al. 2011). Within this weather generator, daily Tmax and Tmin are considered as stochastic processes with daily means and daily standard deviations conditioned on the wet or dry status of the day. The rationale of using this weather generator is that changes in both mean climate and in CV (including daily and interannual variability) can be incorporated into CCSs. Changes in mean climate included monthly mean temperature and mean rainfall. Changes in solar radiation were not available from the SD approach so they were not considered in this comparison study. Changes in CV included monthly average length of wet and dry spells and variability (standard deviation) of monthly mean temperature. A spell is defined as three or more consecutive days with wet or dry condition. Based on the characteristics of historical climatic data (1980–1999), derived monthly changes were then reapplied to the LARS-WG to produce 100-year climate (baseline and future) scenarios for impact assessment. A diagram of these procedures was given in Online Resource 3. A detailed description of the LARS-WG can be found at http://www.rothamsted.ac.uk/mas-models/larswg.php and Luo et al. (2003). The performance of the LARS-WG in diverse climates around the world was evaluated by Semenov et al. (1998), Qian et al. (2004, 2005) and Semenov (2008).
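As an illustration of one of the CV statistics described above, the average length of wet or dry spells can be computed from a daily rainfall series as sketched below. This is not LARS-WG code. The wet-day threshold of 0.0 mm, the function name `mean_spell_length` and the rainfall values are assumptions, and the change between periods is expressed here as a simple ratio.

```python
from itertools import groupby

def mean_spell_length(daily_rain_mm, wet=True, wet_threshold_mm=0.0, min_length=3):
    """Average length of wet (or dry) spells in a daily rainfall series.

    A day is wet when rainfall exceeds wet_threshold_mm (an assumed threshold).
    Runs shorter than min_length days are not counted as spells.
    Returns None when the series contains no qualifying spell.
    """
    is_wet = [r > wet_threshold_mm for r in daily_rain_mm]
    lengths = []
    for state, run in groupby(is_wet):
        n = sum(1 for _ in run)
        if state == wet and n >= min_length:
            lengths.append(n)
    return sum(lengths) / len(lengths) if lengths else None

# Hypothetical daily rainfall (mm) for the same month in each period.
baseline_rain = [0, 0, 0, 5, 2, 1, 3, 0, 0, 0, 0, 4, 6, 2, 0, 0, 0]
future_rain = [0, 0, 0, 0, 5, 2, 1, 0, 0, 0, 0, 0, 4, 6, 2, 0, 0]

baseline_wet = mean_spell_length(baseline_rain, wet=True)
future_wet = mean_spell_length(future_rain, wet=True)
wet_spell_ratio = future_wet / baseline_wet  # below 1 means shorter wet spells
```

The `min_length=3` default follows the spell definition given in the text; passing `wet=False` gives the corresponding dry-spell statistic.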

2.4 Model setting

The Agricultural Production System sIMulator (APSIM)-Wheat model (version 7.1) was used to quantify the potential effects of CC on wheat grain yield by coupling the dynamically and statistically downscaled outputs of the Mk 3.5 GCM. This crop model package has been widely applied to CC/CV impact studies (White et al. 2011) and in farming system studies in Australia, Europe and China. Description and validation of this model can be found in Probert et al. (1998). Information on cultivars used and sowing rules applied in this study can be found in Luo and Kathuria (2013). Irrigation was not applied in this modelling exercise. Table 1 shows crop management information across study sites and [CO2] set in the wheat model. A red-brown earth soil was assumed. Soil water, soil nitrogen and residue were reset to their initial conditions on the 1st of March each year. The purpose of this resetting is to exclude the interaction between CC and soil conditions so that clear messages of CC impacts on wheat production can be obtained.

Table 1 Management information and CO2 concentration as set in the APSIM-Wheat model

2.5 Statistical analysis

We examined the mean, variation and distribution of observed, dynamically and statistically downscaled climate with t-test, F-test and Kolmogorov-Smirnov (KS) test respectively. These tests were also conducted to compare the projected future CC and the resultant wheat grain yields for the period centred on 2055. We used R 2.13.0 (R Development Core Team 2011) for all statistical analysis.
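The quantities behind the three tests can be sketched in a few lines. The example below uses only the Python standard library and computes the test statistics, not the p values, which in practice would come from a statistics package (in R, functions such as `t.test`, `var.test` and `ks.test`). The Welch form of the t statistic is an assumption, since the options used in the analysis are not stated, and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t statistic for a difference in means (unequal variances assumed)."""
    return (mean(x) - mean(y)) / sqrt(variance(x) / len(x) + variance(y) / len(y))

def f_ratio(x, y):
    """Ratio of sample variances, the statistic of the F-test."""
    return variance(x) / variance(y)

def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:
        cdf_x = sum(1 for a in xs if a <= v) / len(xs)
        cdf_y = sum(1 for b in ys if b <= v) / len(ys)
        d = max(d, abs(cdf_x - cdf_y))
    return d

# Hypothetical daily Tmax values (deg C) for one month.
observed = [1.0, 2.0, 3.0, 4.0]
downscaled = [3.0, 4.0, 5.0, 6.0]

t_stat = welch_t(observed, downscaled)
f_stat = f_ratio(observed, downscaled)
d_stat = ks_statistic(observed, downscaled)
```

In this toy case the two samples have identical spread but shifted means, so the variance ratio is 1 while the t and KS statistics both register the shift, which is why the three tests are reported separately.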

3 Results

3.1 Performance of downscaling techniques

Figure 2 shows observed, DD and SD monthly rainfall, Tmax and Tmin across locations. DD underestimated rainfall at Bingara in most months, with a greatest error of −35 mm. At this location SD overestimated rainfall in 5 months with error up to 43 mm and underestimated rainfall in 4 months with error down to −25 mm. DD underestimated autumn and summer rainfall with greatest error of −26 mm at Peak Hill and −15 mm at Deniliquin and overestimated winter rainfall with error up to 23 mm at both locations. SD underestimated rainfall in most months, with maximum error of −24 mm and −16 mm for Peak Hill and Deniliquin respectively.

Fig. 2 Observed, dynamically downscaled (DD) and statistically downscaled (SD) mean climate for the period 1980–1999 across study locations

DD underestimated Tmax in the majority of the months at Bingara with error down to −3 °C. SD overestimated Tmax in 7 months with error up to 0.5 °C and underestimated Tmax in 5 months with error down to −1.22 °C. DD overestimated spring and summer Tmax with greatest error of 2.4 °C for Peak Hill and 3.6 °C for Deniliquin and underestimated autumn and winter Tmax (down to −2 °C) at these two locations. SD overestimated Tmax in half of the months and underestimated it in the other half at these two locations, with greatest errors of 1.2 °C and −1.7 °C respectively.

Both DD and SD underestimated Tmin at Bingara with maximum error of −2 °C for DD and −1.5 °C for SD. DD overestimated Tmin at Peak Hill and Deniliquin with maximum error of 2.7 °C for the former and 2.9 °C for the latter. SD overestimated Tmin in 7 months and underestimated Tmin in 5 months with greatest error of 1 °C at Peak Hill and overestimated Tmin at Deniliquin in the majority of the months with greatest error of 1.2 °C.
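The monthly errors quoted above are differences between downscaled and observed monthly means, with negative values indicating underestimation. A minimal sketch of that calculation follows. The function names and the rainfall values are hypothetical and are not taken from Fig. 2.

```python
def monthly_bias(observed, downscaled):
    """Bias per month: downscaled minus observed (negative means underestimation)."""
    return {m: downscaled[m] - observed[m] for m in observed}

def largest_errors(bias):
    """Return the most negative and most positive monthly bias as (month, value) pairs."""
    worst_under = min(bias.items(), key=lambda kv: kv[1])
    worst_over = max(bias.items(), key=lambda kv: kv[1])
    return worst_under, worst_over

# Hypothetical monthly mean rainfall (mm) for three months.
obs = {"Jan": 80.0, "Feb": 70.0, "Mar": 50.0}
dd = {"Jan": 45.0, "Feb": 60.0, "Mar": 55.0}

bias = monthly_bias(obs, dd)
under, over = largest_errors(bias)
```

Keeping the sign of the bias, rather than its absolute value, is what allows over- and underestimation to be reported separately for each month.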

Generally speaking, biases arising from the use of SD were much smaller than those of DD across the climate variables, months and locations considered. This is because SD re-samples the local observations and hence has the in-built capability to match the statistics of the local stations. In comparison, dynamical downscaling acts as a climate model producing spatial averages relevant for an entire grid box, albeit a much smaller grid box than that of the GCM itself. More explanation can be found in the Discussion and Conclusions section. The difference among locations is the result of the different climate regimes and, in the case of SD, the different combinations of predictors which were used to relate to the local predictands (Timbal et al. 2008). It is known that the SD technique does not perform as well for tropical summer rainfall, as can be seen at Bingara in summer.

Compared with the historical records, both techniques reproduced significantly different climatic sequences (i.e. Tmax, Tmin and rainfall) in terms of mean, distribution and variability for all locations (Table 2); there is no clear direction of discrepancy (i.e. greater or less than the observed values; Fig. 2). Although results were inconsistent across test methods, locations and climate variables, SD generally performed better than DD (Table 2). For rainfall, SD produced climate which was closer to observations according to the three tests, with the exception of the F-test at Peak Hill. SD also simulated better daily Tmax time series in terms of mean and distribution but not variation (F-test). For Tmin, SD outperformed DD at Bingara and Deniliquin (except for the mean value), while DD outperformed SD at Peak Hill (except for the mean value). For GS months (May–Oct. inclusive, data shown in parentheses in Table 2), DD performed better in relation to Tmin while SD performed better in relation to Tmax. The significant difference in the performance of these two downscaling approaches implies that significant differences may exist in the projection of future climate change and possibly in the resultant impact assessment.

Table 2 Evaluation of dynamical and statistical downscaling techniques against observed climate (1980–1999): number of months with p value ≥0.05 (difference is not significant at 95 % confidence level). Numbers in parentheses are the number of months with p value ≥0.05 within growing season (May–Oct. inclusive)

3.2 Climate projection

Significant differences exist between the two downscaling approaches in the mean, variance and distribution of projected climate variables for both the GS and the whole year, except for the mean value of rainfall, across the three locations and in most months (Table 3). It should be noted that the two downscaling approaches matched better in mean rainfall for GS months at Bingara and Peak Hill than at Deniliquin. When translating these climate projections into climate variables relevant to the weather generator and crop production (Fig. 3), it was found that (1) DD projected a decrease (−7 % to −3 %) in GS rainfall across the three locations while SD projected an increase except at Deniliquin, where a relatively large decrease of −13 % was projected; (2) both downscaling techniques projected a decrease in the average length of GS wet spells, with SD projecting a smaller decrease; (3) both downscaling techniques projected an increase in the average length of GS dry spells (3 % to 8 %); (4) DD projected a decrease in the variability of GS mean temperature except at Peak Hill while SD projected an increase across the three sites; (5) both downscaling techniques projected an increase in average GS Tmax and Tmin, although DD projected a much bigger increase than SD.
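The GS rainfall changes listed above are percentage changes relative to the baseline, and the direction of change is what separates the two techniques at some sites. The sketch below shows that arithmetic under stated assumptions: the function names are hypothetical and the monthly rainfall values are invented round numbers, not results from this study.

```python
GS_MONTHS = (5, 6, 7, 8, 9, 10)  # May to October inclusive, as defined in the text

def gs_total(monthly_mm):
    """Sum monthly rainfall (dict: month number -> mm) over the growing season."""
    return sum(monthly_mm[m] for m in GS_MONTHS)

def percent_change(baseline, future):
    """Percentage change from the baseline value to the future value."""
    return 100.0 * (future - baseline) / baseline

def same_direction(change_a, change_b):
    """True when both changes are increases or both are decreases."""
    return change_a != 0 and change_b != 0 and (change_a > 0) == (change_b > 0)

# Hypothetical monthly rainfall (mm) for a baseline and two future scenarios.
baseline = {m: 50.0 for m in range(1, 13)}
dd_future = {m: 47.5 for m in range(1, 13)}
sd_future = {m: 55.0 for m in range(1, 13)}

dd_change = percent_change(gs_total(baseline), gs_total(dd_future))
sd_change = percent_change(gs_total(baseline), gs_total(sd_future))
agree = same_direction(dd_change, sd_change)
```

With these invented inputs the two scenarios disagree in sign, which mirrors the situation reported for GS rainfall at Bingara and Peak Hill.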

Table 3 Significance tests between DD and SD future climates for the period centred on 2055: number of months with p value ≥0.05 (difference is not significant at 95 % confidence level). Numbers in parentheses are the number of months with p value ≥0.05 within growing season (May–Oct. inclusive)

Fig. 3 Growing season average climate changes for the period 2046–2065 across direct outputs of Mark 3.5, downscaling techniques and locations. GSR: GS mean rainfall, GS wet/dry spell: average length of GS wet/dry spells, GS Tvar: daily and interannual variability of GS mean temperature, GS Tmin/Tmax: GS minimum/maximum temperature. DD: dynamical downscaling, SD: statistical downscaling, NDS: not downscaled (direct outputs of Mark 3.5). Only GS mean changes from the direct outputs of Mark 3.5 are presented

3.3 Projected wheat yields

Figure 4 shows wheat yield distribution for the period centred on 2055 under baseline conditions and for the two downscaling approaches across the three locations. Compared with the baseline, median grain yield under DD CCSs decreases at Bingara and Peak Hill and is unchanged at Deniliquin; median grain yield under SD CCSs is unchanged at Bingara, increases at Peak Hill and decreases at Deniliquin. In comparison with the baseline, yield variability slightly decreases under DD CCS at Bingara and Deniliquin while it increases at Peak Hill. Yield variability slightly increases under SD CCS at Bingara while it decreases at Peak Hill and Deniliquin. The median wheat yield resulting from SD CCSs is higher than that from DD CCSs at Bingara and Peak Hill due to an increase in GS rainfall and a smaller increase in both Tmax and Tmin projected by SD (Fig. 3). Compared with DD CCSs, SD CCSs result in smaller variability in wheat yield at Peak Hill. The relatively large increase in GS rainfall (21 %) has contributed to the reduction in wheat yield variability at this location (Fig. 3). Statistical tests show that there are significant differences in projected wheat yields from the two downscaling approaches adopted at these two sites, except for variance at Bingara. On the other hand, median wheat yield resulting from SD CCSs is lower than that from DD CCSs at Deniliquin due to the larger decrease in GS rainfall projected by the SD approach. Statistical tests show that the difference in the distribution, variance and mean of projected wheat yields between the two downscaling approaches is not significant at this site. An explanation follows in the Discussion section.

Fig. 4 Distribution of wheat grain yield for the period 2046–2065 across downscaling techniques and locations. BL baseline, DD dynamical downscaling, SD statistical downscaling

4 Discussion and conclusions

Based on statistical tests, SD outperformed DD for most of the climate variables, including their mean, variance and distribution, for the locations considered under the baseline scenario (Table 2). According to the maximum error, SD outperformed DD in Tmax and Tmin across the three locations. However, DD outperformed SD in rainfall at Bingara and was equivalent to SD at the other two locations regarding rainfall amount (Fig. 2). The finding that SD performed better than DD for the present climate is to be expected, as SD is based on the observed climate. One of the assumptions of SD is that the predictors used to determine future local climate should not lie outside the range of the climatology which is used to calibrate the SD model. This implies a limitation of SD in projecting future CC.

Significant differences exist between the two downscaling approaches in the mean, variance and distribution of projected climate variables in 2055, except for the mean value of rainfall, across the three locations and for most of the months (Table 3). Differences exist in GS CC, including mean and variability, in terms of direction and magnitude (Fig. 3). The significant difference in the projected future climate indicates that ongoing improvement in the two downscaling techniques is needed in future regional climate modelling efforts. For DD, improvement in estimating Tmax, Tmin and rainfall (variance and distribution) is desirable, given the small number of months with p value ≥0.05 in Table 2. This is consistent with the finding of Frost et al. (2011), which suggested that improvement of CCAM is needed in terms of daily rainfall variability. As to SD, our results indicate that improvement is needed in the mean (Bingara, Fig. 2) and variability of rainfall, the variance and distribution of Tmax, and the distribution and/or the mean of Tmin (Table 2). Frost et al. (2011) suggested that improvement is needed in the analogue approach in terms of the production of the number of wet days and mean rainfall statistics. This is in line with our study. Oettli et al. (2011) found that bias correction of regional climate outputs improved the performance of DD, which became comparable with that of SD. This indicates that bias correction is an essential step in reducing the uncertainties arising from the use of DD climate outputs. Combining both DD and SD techniques is a robust approach in local CC projection and in climate change risk assessment which needs to be promoted in this area. As noted earlier, outputs of CCAM with a spatial resolution of 15 × 15 km were used in this study to compare with SD climate datasets which were at station scale. The different spatial scales at which DD and SD operated in this study must have contributed to the differences in projected local CC and wheat yield. To be compatible, SD may need to be based on gridded historical datasets at a spatial resolution similar to that of DD.

The significant difference in projected future climate between the two downscaling approaches mentioned earlier resulted in significant differences in simulated wheat yield for 2055 at Bingara and Peak Hill but not at Deniliquin (Fig. 4). A detailed examination of Fig. 3 found that the sign of the change in average GS rainfall is opposite and the difference in Tmax and Tmin between the two downscaling approaches is relatively large (1.28–1.73 °C) at Bingara and Peak Hill. This led to a significant difference in simulated wheat crop yields arising from the use of the two downscaling approaches. On the other hand, the sign of the change in average GS rainfall at Deniliquin is the same and the difference in Tmax/Tmin between the two downscaling techniques is much smaller (0.84–1.29 °C) compared with the other two locations. This resulted in an insignificant difference in projected wheat grain yield at Deniliquin. Even though these locations have different sensitivity to heat stress (more sensitive at northern locations) and drought (more sensitive at southern locations), sub-seasonal metrics do affect wheat yield. The differences in projected wheat yield across locations and downscaling approaches were mainly due to the sign and magnitude of the change in specific climatic variables, especially rainfall, which is the major driving force of dryland farming in Australia.

Research findings from this study indicated that downscaling techniques can be an important source of uncertainties in local climate projection and in CC risk assessment, like other sources of uncertainties. This study is one of the very few studies that evaluated the performance of different downscaling techniques against observed climate and compared projected future climate and resultant wheat grain yield due to the use of different downscaling techniques from the perspective of agricultural application. This research contributes to the reduction of uncertainties in the areas of local CC projection and CC risk assessment through the identification of future improvement directions of these two downscaling approaches and through the promotion of using better performing downscaling approaches and/or the combination of these two approaches.