1 Introduction

Wheat, Triticum aestivum, is a major crop in Australia, with 11–13 million hectares planted annually, approximately 40 percent of this in south-west Western Australia [1]. Aside from soil fertility and other agronomic considerations, the major constraints on production are meteorological and climatological [2–9]. The soil moisture profile at the time of planting, which is itself influenced by rainfall during the summer-autumn fallow period [12], influences crop success [10, 11]. Water availability during the growing period is the most critical factor affecting crop yields [10, 13, 14]. The timing of the arrival of adequate autumn rains for sowing, the so-called autumn ‘break’, is a key factor in the crop establishment phase [6]. In particular, the timing of rainfall events in relation to crop requirements is regarded as more important than the total rainfall received during the life cycle of the crop [6, 15]. Therefore, accurate and detailed forecasts of rainfall are important for wheat farmers worldwide [16], including those in Western Australia.

Phenomena thought to affect rainfall, in particular the El Niño-Southern Oscillation (ENSO), have been extensively studied for Australia [17–21]. Relationships between ENSO and yields of wheat and maize have been studied for Argentina [22] and Brazil [23]. In the USA, yields of a variety of crops, including wheat, cotton and sugarcane, have been found to have some dependence on ENSO [24, 25]. In Asia, relationships between ENSO and yields of wheat and rice have been studied for both China [26] and India [27].

Until May 2013, the official seasonal forecast for south-west Western Australia issued by the Australian Bureau of Meteorology (BOM) was based on an empirical statistical scheme using an ENSO index as the primary predictor in a relatively simple statistical model [28]. Reviewing the performance of these forecasts, Fawcett and Stone (2010) described the demonstrated skill level for seasonal rainfall forecasting as “only moderate, although better than climatology and randomly guessed forecasts”. Since June 2013, the BOM has used output from a general circulation model (GCM), the Predictive Ocean Atmosphere Model for Australia (POAMA), to forecast for large grid areas (62,500 square kilometres) across the Australian continent. No detailed quantitative study has been published that enables a direct comparison between the skill of the forecasts from POAMA and that of the earlier statistical models, including for south-west Western Australia. Because the operational forecasts from POAMA are provided in a two-category format (above or below median rainfall), common measures of forecast skill, including root mean square error (RMSE), cannot be calculated. Furthermore, the BOM is unwilling to provide the output from POAMA used in operational forecasts in a form facilitating comparison with other methods, for example, as deterministic forecasts for point locations [29].

The BOM’s seasonal forecasts are considered too unreliable as a basis for major cropping decisions [30]. This is consistent with earlier studies [31] that suggest accurate seasonal rainfall forecasts do not exist for Western Australia. More recent research findings [32], based on output from the latest version of POAMA (version 2.4), also indicate that for grid cells corresponding to the Western Australian wheat belt, monthly rainfall forecasts fall into categories corresponding to a zero, negative, or very low positive level of skill relative to climatology. This was the case even after extensive post-processing of POAMA 2.4 output with statistical bridging models [32]. Research [33] considering the reliability and skill of POAMA 2.4 relative to earlier versions for the entire Australian continent also indicates a forecast skill only comparable to climatology, with little or no skill beyond climatology after 3 months. An assessment of the skill of POAMA after downscaling for locations in south-eastern Australia again showed skill scores only comparable to climatology, with mean correlations of approximately 0.4 at one month lead, declining for longer leads [34].

POAMA, like other GCMs, attempts to simulate climate from an understanding of physical processes. An alternative approach to both POAMA and the earlier statistical models [28, 35, 36] is machine learning using artificial neural networks (ANNs) [37–39]. In principle, the forecasting of monthly rainfall in the wheat belt should be amenable to ANNs because there exist annual rainfall patterns modified by global ocean-atmospheric circulation patterns measurable through climatic indices including ENSO and the Indian Ocean Dipole. These climate indices, as well as historical rainfall and temperature records for many locations in the wheat belt, potentially provide over 100 years of data for training and validation of ANN models.

Machine learning with ANNs focuses on prediction based on known properties learned from exposure to historical data sets during the training process. A core objective of the learning process is to be able to generalize from experience [40]. Machine learning has been widely utilized in hydrology, for example in studying rainfall-runoff relationships [41]. Machine learning has also become important in the medical diagnostic field [42], where information needs to be combined from different tests, each carrying some relevant, but limited, diagnostic information. There may be no consistently useful method of combining the relevant information, with a traditional reliance on the skill and experience of the medical practitioner. A vast literature shows that 13 different climate indices can have some influence on seasonal rainfall across Australia, with their influences varying temporally and seasonally, and with forecasts improved when information is provided as a lag, typically from one to three months in length [43, 44].

In this study, arrays of data were constructed using 7 climate indices and 3 local climate attributes (rainfall, maximum and minimum temperature), with each of these input attributes lagged at monthly increments for up to 12 months. The arrays were input into a state-of-the-art probabilistic ANN, and models were built that could be used for monthly rainfall forecasting from one to 18 months in advance for the three Western Australian wheat-belt towns of Narrogin, Merredin and Southern Cross (Fig. 1). Output from these models, as monthly rainfall forecasts for the test period July 2004 to June 2014, is presented as time-series charts, and the skill of the forecasts is measured using Pearson correlations, root mean square errors (RMSE), and mean absolute errors (MAE). Variations in forecast skill, between and within locations, and variability in the sensitivity of the input attributes were also analysed. This work builds on earlier research focused on the application of ANN models for forecasting monthly rainfall in north-eastern Australia using a less sophisticated ANN platform [39].

Fig. 1. Map of the Western Australian wheat belt showing the POAMA grid lines and the towns of Narrogin, Merredin and Southern Cross

2 Data and Method

The skill of a rainfall forecast from an ANN will depend on the quality and relevance of the data provided as input to the model. As with any statistical model, including ANNs, it is generally preferable to train and test with historical data sets extending as far back as possible. Narrogin, Merredin and Southern Cross have relatively long rainfall and temperature records, and fall within three of the four different BOM grid areas used in POAMA forecasts [32] covering the Western Australia wheat belt shown in Fig. 1.

All three towns experience predominantly winter rainfall, with Narrogin historically receiving significantly more rain. Narrogin (station number 10614, latitude 32.93°S, longitude 117.18°E, elevation: 338 m) began recording rainfall in 1891, and is still operating as a weather station today. Merredin (station number 10092, latitude 31.48°S, longitude 118.28°E, elevation: 315 m) opened in 1903 and is still operational. The original weather station at Southern Cross (station number 12074, latitude 31.23°S, longitude 119.33°E, elevation: 355 m) opened in 1889 and was closed in December 2007. Another station was opened at Southern Cross Airfield in October 1996 (station number 12320, latitude 31.24°S, longitude 119.36°E, elevation: 347 m) and this remains operational. The historical data used in this study indicate a long-term decline in rainfall at Narrogin (−0.52 mm per year, 1891–2013). However, mean annual rainfall has been increasing at Southern Cross (0.72 mm per year, 1903–2013) and shows negligible trend at Merredin (−0.01 mm per year, 1889–2013).

In this study, local temperatures for Merredin were a composite constructed from records at Merredin (station 10092) and Merredin Research Station (station 10093). The temperature records used for Southern Cross were a composite constructed from records at Southern Cross Airfield (station 12320) and Southern Cross (station 12074). The temperature records used for Narrogin were a composite constructed from records at Katanning Comparison (station 10579) and Katanning (station 10916).

Variations in rainfall in Western Australia, as for many other parts of the world, are associated with large-scale climate phenomena (including ENSO), which can be described by climate indices. Six of the seven indices used in this study are associated with Pacific Ocean phenomena. Changes in the equatorial Pacific sea temperatures and atmospheric pressures are known to precede seasonal climate variation in other regions by one to at least 12 months [45] and in turn to affect crop yields through an influence on rainfall [46, 47]. There are four climate indices, Niño 1.2, Niño 3, Niño 3.4 and Niño 4 that directly measure changes in sea surface temperature across the equatorial Pacific associated with ENSO. All four Niños were used in this study with values sourced from the Royal Netherlands Meteorological Institute Climate Explorer – a web application that is part of the World Meteorological Organisation and European Climate Assessment and Dataset project. Pressure differences associated with ENSO are typically represented by the Southern Oscillation Index (SOI), calculated as the pressure difference between Tahiti and Darwin, with values for this study obtained from the BOM website. A sixth Climate Index, the Inter-decadal Pacific Oscillation (IPO), was also inputted with values sourced directly from Chris Folland at the UK Met Office. The IPO is considered a measure of temperature and pressure over the central North Pacific with its negative and positive phases thought to modulate the ENSO cycles described by the SOI and Niño values [48].

The Indian Ocean Dipole (IOD) is the only non-Pacific index used in this study. It is a coupled ocean-atmosphere phenomenon measured by the difference in sea surface temperatures in the western and eastern equatorial Indian Ocean [44]. The index is called the Dipole Mode Index (DMI) with values for this study sourced from the Royal Netherlands Meteorological Institute Climate Explorer.

These climate indices, together with local temperatures and rainfall, henceforth referred to as attributes, were input to Neurosolutions Infinity software and used to build ANN models to forecast monthly rainfall for Narrogin, Southern Cross and Merredin for lead times of 1, 3, 6, 9, 12 and 18 months, using a test period between July 2004 and June 2014. In some of our earlier reported investigations of applying neural networks to forecast rainfall in Australia, less sophisticated ANN software was available, including Neurosolutions 6 for Excel [39] and Peltarion Synapse [49]. In those investigations, the neural network was configured through manual trial and error, eventually selecting an Elman ANN. A major advantage of Neurosolutions Infinity is that it automates the testing of many ANN configurations, so that an optimal network can be selected. In our more recent investigations using Infinity [50, 51], we have studied monthly rainfall forecasting in several regions in eastern Australia, and found that, in each case studied so far, the configuration selected is a probabilistic ANN. Infinity also selected a probabilistic ANN automatically, after testing various alternative configurations, in the present investigation.

All the forecasts were run with the full set of attributes, including local rainfall, each lagged at monthly increments for up to 12 months. The concepts of lag and lead are important in rainfall forecasting. Lag can be defined as the time period between the current value of a parameter and the value of the same parameter at some time in the past. Lead can be defined as the time period between the current value of a parameter and the value of the same parameter at some time in the future. In this study, lags extended back 12 months, while leads (representing the forecast period) extended for 1, 3, 6, 9, 12 and 18 months forward relative to a particular current value.
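The lag and lead definitions above can be illustrated with a minimal Python sketch. It is not the procedure used inside Neurosolutions Infinity, whose internals are not described here. The function name `build_lagged_rows`, the column ordering, and the choice to include the current month (lag 0) alongside lags 1 to 12 are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence, Tuple


def build_lagged_rows(
    series: Dict[str, Sequence[float]],
    target: str,
    max_lag: int = 12,
    lead: int = 6,
) -> Tuple[List[List[float]], List[float]]:
    """Pair lagged inputs with the target value `lead` months ahead.

    Hypothetical helper: for each "current" month t, the input row holds
    every attribute at t, t-1, ..., t-max_lag (assumption: lag 0 is
    included), and the label is series[target][t + lead].
    """
    n_months = len(series[target])
    if any(len(values) != n_months for values in series.values()):
        raise ValueError("all monthly series must have the same length")

    names = sorted(series)  # fixed, reproducible column order
    rows: List[List[float]] = []
    labels: List[float] = []
    # t must leave room for max_lag months behind it and `lead` months ahead.
    for t in range(max_lag, n_months - lead):
        row: List[float] = []
        for name in names:
            values = series[name]
            row.extend(values[t - k] for k in range(max_lag + 1))
        rows.append(row)
        labels.append(series[target][t + lead])
    return rows, labels
```

With 10 attributes and `max_lag=12`, each row under these assumptions would hold 130 values. Changing `lead` changes only which future month is used as the label, not the inputs.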

The reliability of the probabilistic ANN forecasts was first tested by running seven identical experiments for Narrogin, each time letting the Infinity software find a best model for Narrogin at a six-month lead. While the choice of attributes used by Infinity to arrive at the final solution varied somewhat between experiments, there was little variability in the skill of the monthly forecast for Narrogin for the test period July 2004 to June 2014 (r = 0.73 ± 0.01). This same technique was then used to predict monthly rainfall for Narrogin, Southern Cross and Merredin for 1, 3, 6, 9, 12 and 18 months in advance, for the 10-year test period July 2004 to June 2014.

For every run the total data was divided into training (70%), evaluation (20%) and test sets (10%). The test set was not used in network training, but was important in the choice of the final model. Pearson correlation coefficients (r), Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) were used to compare the skill of the rainfall forecast from the best model for each ANN run against observed monthly rainfall for the test period, from July 2004 to June 2014.
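The three skill measures named above have standard definitions, sketched below in plain Python. The function names are illustrative and are not taken from the software used in the study.

```python
import math
from typing import Sequence


def pearson_r(observed: Sequence[float], forecast: Sequence[float]) -> float:
    """Pearson correlation coefficient between observed and forecast values."""
    n = len(observed)
    if n != len(forecast) or n < 2:
        raise ValueError("need two series of equal length (at least 2 values)")
    mean_o = sum(observed) / n
    mean_f = sum(forecast) / n
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(observed, forecast))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_f = sum((f - mean_f) ** 2 for f in forecast)
    if ss_o == 0 or ss_f == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(ss_o * ss_f)


def mae(observed: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error, in the units of the data (mm for rainfall)."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("need two non-empty series of equal length")
    return sum(abs(o - f) for o, f in zip(observed, forecast)) / len(observed)


def rmse(observed: Sequence[float], forecast: Sequence[float]) -> float:
    """Root mean square error, in the units of the data (mm for rainfall)."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("need two non-empty series of equal length")
    squared = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    return math.sqrt(squared / len(observed))
```

Because RMSE squares each error before averaging, a single badly missed wet month raises it more than it raises MAE. This is one reason to report both measures alongside the correlation.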

Because total annual rainfall varies across locations, in order to compare RMSE and MAE across locations it is necessary to normalise for differences in rainfall. In this study we normalised only for RMSE by dividing the RMSE values by mean monthly rainfall for each location generating a normalised RMSE. The forecast skill was found to be lower, and more variable, at Merredin and Southern Cross as measured through normalised RMSE. Consequently, it was decided to also explore variability in the forecasts at the 6-month lead for Merredin and Southern Cross.
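The normalisation described above is a single division, sketched here in Python. The function name is illustrative. The text does not state whether the mean monthly rainfall is the long-term mean or the test-period mean, so the function takes it as an argument rather than computing it.

```python
def normalised_rmse(rmse_mm: float, mean_monthly_rainfall_mm: float) -> float:
    """RMSE divided by mean monthly rainfall (dimensionless).

    Which mean to use (long-term or test-period) is left to the caller,
    because the text does not specify it.
    """
    if mean_monthly_rainfall_mm <= 0:
        raise ValueError("mean monthly rainfall must be positive")
    return rmse_mm / mean_monthly_rainfall_mm


# Illustrative arithmetic only, not a value taken from Table 1:
# a mean annual rainfall of 494.4 mm is treated as 494.4 / 12 = 41.2 mm
# per month (an assumption), and combined with an RMSE of 19.4.
example = normalised_rmse(19.4, 494.4 / 12)
```

A lower normalised RMSE means the typical error is smaller relative to the rainfall a location usually receives. This is what allows a wetter site to be compared with a drier one.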

3 Results

Despite the lack of true seasonality at Narrogin during the test period from July 2004 to June 2014, the probabilistic ANN models generated surprisingly accurate and consistent forecasts of the peaks and troughs in monthly rainfall six months in advance, as shown in Table 1 and Fig. 2. The best and worst runs (in terms of RMSE) both forecast the unseasonal and heavy rainfall event in December 2011 of 143 mm with a high level of skill, as illustrated in Fig. 2a and b. Other months with heavy rain were also reliably forecast, including May 2005, July 2007, June 2009 and September 2013, as shown in Fig. 2a and b.

Table 1. Forecast skills for Narrogin, Merredin and Southern Cross relative to climatology.
Fig. 2. Mean monthly rainfall for Narrogin, observed vs ANN output - test period July 2004 to June 2014

Some of the runs overestimated these events, while others underestimated them. The most accurate forecast (r = 0.80, RMSE = 19.4, MAE = 14.7) was generated by averaging the monthly output from the seven runs; this average is referred to as the ensemble and is shown in Fig. 2c. The ensemble forecast had a better skill score than the mean of the 7 individual runs at the 6-month lead (r = 0.80 versus r = 0.73), and was also better than any of the individual forecast runs. Table 1 shows forecast skill from the ANN models and ensembles of models for Narrogin, Merredin and Southern Cross relative to climatology. Climatology was calculated using the long-term mean monthly rainfall listed on the BOM website for each location, compared against observed monthly rainfall for the test period of July 2004 to June 2014. Ensemble means were calculated by averaging output for each month for the listed models for the test period, and were likewise compared against observed monthly rainfall for the test period.
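The ensemble and climatology calculations described above can be sketched as follows. This is a minimal illustration with hypothetical function names. It assumes each run is a list of monthly forecasts covering the same months, and that the climatology baseline repeats the twelve long-term calendar-month means across the test period.

```python
from typing import List, Sequence


def ensemble_mean(runs: Sequence[Sequence[float]]) -> List[float]:
    """Average the forecasts month by month across several runs."""
    if not runs:
        raise ValueError("at least one run is required")
    n_months = len(runs[0])
    if any(len(run) != n_months for run in runs):
        raise ValueError("all runs must cover the same months")
    # zip(*runs) groups the values that different runs forecast for one month.
    return [sum(month_values) / len(runs) for month_values in zip(*runs)]


def climatology_forecast(
    long_term_monthly_means: Sequence[float],
    start_month: int,
    n_months: int,
) -> List[float]:
    """Repeat the 12 long-term calendar-month means over a test period.

    long_term_monthly_means[0] is January; start_month is 1-12.
    """
    if len(long_term_monthly_means) != 12:
        raise ValueError("expected 12 calendar-month means")
    if not 1 <= start_month <= 12:
        raise ValueError("start_month must be between 1 and 12")
    return [
        long_term_monthly_means[(start_month - 1 + i) % 12]
        for i in range(n_months)
    ]
```

For the test period used here, the baseline would be `climatology_forecast(means, start_month=7, n_months=120)`, covering July 2004 to June 2014. Both the ensemble and the baseline can then be scored against observed rainfall with the same correlation, RMSE and MAE measures.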

The skill of a monthly rainfall forecast from a GCM typically declines as lead time increases from one to three months [33, 34]. In contrast, the forecast 18 months in advance from the ANN was the most skilful, as measured by highest Pearson correlation and lowest RMSE (r = 0.77, RMSE = 21.2), Table 1. This forecast was uniquely able to forecast the unseasonally wet January months at Narrogin in 2006 and 2011. This counter-intuitive result would suggest that the predictors of monthly rainfall are in place up to 18 months in advance for Narrogin, and that differences in the presentation of input data associated with the varying lead times (i.e. 6 months versus 18 months) are a more significant constraint on the skill of the ANN forecast.

Table 2 shows the relative contribution of the input attributes for the seven runs at Narrogin with a six-month lead and also for the six runs at the different lead times for Narrogin, Merredin and Southern Cross. ‘Niño’ includes Niño 1.2, Niño 3, Niño 3.4 and Niño 4. ‘Temps’ includes maximum and minimum temperatures for each location. ‘Complex’ refers to situations where the ANN has combined attributes into a formula.

Table 2. Relative contributions of the input attributes

At Narrogin, there was some variability in the relative contribution of the different input attributes for the six-month forecasts that was also apparent comparing the forecasts for the different lead times (1, 3, 6, 9, 12 and 18 months), as shown in Table 2. Local atmospheric temperatures contributed on average 35% to the skill of the forecasts for both the 6-month lead forecast and other lead times. The Indian Ocean Dipole measured by the DMI, was the next most consistently important input attribute, contributing on average 14–15% of the skill, but with this contribution varying from 8–20% for individual runs. The contribution of the Niños and SOI was more variable, but on average they contributed 12.8 and 8.2% to the 7 forecast runs for the 6 month lead, and 7.5 and 13.6% to the single runs at the variable lead times, respectively. Table 2 also shows that the IPO contributed on average less than 3% to the skill of the forecasts.

While the ANN forecasts for Narrogin showed considerable skill in anticipating the peaks and troughs in monthly rainfall for the period of July 2004 to June 2014, including the heaviest rainfall events, as shown in Fig. 2, this was not the case for Merredin. At Merredin, each of the seven runs at the six-month lead failed to forecast the two heaviest rainfall events, in January 2006 and November 2012. As for Narrogin, a better monthly rainfall forecast at the 6-month lead was achieved by averaging the monthly output from the individual runs and calculating a value for the ensemble, r = 0.72, Table 1. But even this ensemble showed no skill at forecasting the exceptionally wet months, as shown in Fig. 3a. The mean of the 7 runs gave a Pearson correlation of 0.58, as shown in Table 1.

Fig. 3. Mean monthly rainfall for Merredin, observed vs ANN output - test period July 2004 to June 2014

The individual runs at forecast lead times of 1, 3, 6, 9, 12 and 18 months also failed to anticipate these periods of heavy rainfall, Fig. 3b. The worst individual run, Fig. 3c, which was at a lead time of 12 months, with a Pearson correlation of 0.49, was nevertheless better than climatology at 0.32, as shown in Table 1.

Fig. 4. Mean monthly rainfall for Southern Cross, observed vs ANN output - test period July 2004 to June 2014

The mean of the ensemble of the 6 runs at the leads of 3, 6, 9, 12 and 18 months gave a better score than the ensemble mean for the 7 runs at the 6-month lead, Table 1. As for the ensemble based on the 6-month lead, this forecast did not anticipate the exceptionally wet months, Fig. 3, but did have the best overall skill score, as shown in Table 1. While the two wettest months in the 10-year forecast period were not forecast, the ANN models consistently forecast the peaks in June 2007, and July 2008, 2009 and 2010. The ANN models also consistently forecast the unseasonally heavy rain in January 2011 of 70.2 mm, shown in Fig. 3.

Consistent with Narrogin, local atmospheric temperatures were again the most important input followed by the Indian Ocean Dipole, Table 2. In the case of the worst forecast measured in terms of the Pearson correlation, which is the 12-month lead, the DMI contributed 34.8% to the skill of the forecast, as illustrated in Table 2.

At Southern Cross, as for Narrogin and Merredin, the best forecasts were achieved by creating ensembles, Table 1. Averaging the monthly output from the 6, 9, 12 and 18 month forecasts creates a forecast with the highest Pearson correlation, and lowest RMSE and MAE, as shown in Table 1. This ensemble is, in effect, a 6-month forecast, but with a better skill score than the ensemble created from the seven runs at the 6-month lead, or any of the individual runs. Combining the seven runs at the 6-month lead with the individual 9, 12 and 18 month runs (not including the 6-month lead run because it is already counted in the 7 runs at 6 months) does not achieve a better skill score, as shown in Table 1.

While the Pearson correlations for the ensembles for Southern Cross are comparable to those achieved for Narrogin, Table 1, visual inspection of the output indicates that the forecasts are not as skilful, Fig. 2 versus Fig. 4. This is consistent with the higher normalised RMSE for Southern Cross and Merredin, relative to Narrogin, Table 1. While for Narrogin the ANN successfully forecast the wettest months during the 10-year period, at Southern Cross this was not the case. Also at Southern Cross, there was considerable variability in the individual runs. For example, the first run with the six-month lead time forecast only 9 mm for December 2011, Fig. 4b, while the third run forecast 118 mm, Fig. 4c. Results from the ANN were nevertheless better than climatology, which suggests monthly rainfall at Southern Cross is historically erratic, as shown in Table 1. At Southern Cross the ANN did a surprisingly good job of forecasting the low winter rainfall of 2006 and 2007, as illustrated in Fig. 4.

At Southern Cross, the four Niños were consistently more important inputs, contributing from 11 to 33% to the skill of the forecast, with a mean of 25%, as shown in Table 2. Atmospheric temperatures and DMI were still important, contributing 21 and 16% respectively.

4 Discussion and Conclusions

Western Australian wheat farmers would benefit from more skilful, site-specific rainfall forecast information than is currently provided in the operational forecasts from the BOM. In particular, they need more information on the likely quantity, duration and timing of rainfall [52–55]. Forecasts from ANNs are in the deterministic, location-specific format that studies have identified as potentially most valuable for wheat farmers. Our results indicate that skilful rainfall forecasts can be achieved using ANNs for the three locations of Narrogin, Merredin and Southern Cross, with skill scores above climatology consistently achieved up to 18 months in advance, as illustrated in Table 1.

It is well established that rainfall in the Western Australian wheat belt is affected by global atmospheric and oceanic circulation patterns that can be described by climate indices, also known as teleconnection signals. It is also well established that these climate indices exhibit non-linearity such that their influence varies in complex ways with season and geography [44]. Perhaps not surprisingly, the very nature of the problem of successfully forecasting monthly rainfall in the wheat belt is amenable to the use of machine learning using ANN as demonstrated by the relatively skilful rainfall forecasts achieved in this study as shown in Table 1 and Figs. 2, 3 and 4.

The skill of the monthly rainfall forecasts was calculated through direct comparisons of output from the best ANN models developed through the machine learning process with observed rainfall for the test period, in this study extending for ten years from July 2004 to June 2014. Table 1 shows that the most skilful forecasts were achieved through the use of ensembles, whereby the forecast is calculated after averaging monthly output from multiple runs. Table 1 also shows that when the output from multiple runs was considered as an ensemble, Pearson correlation coefficients consistently exceeded 0.7 for all three locations. This result is better than any so far reported for POAMA for any location in Australia at any long lead time [33, 34, 43].

The lack of deterioration in the forecast skill with lead time is significant and suggests that some predictors of rainfall are in place more than one year in advance. Changes in ENSO are known to precede seasonal climate variation by up to 12 months [45]. The ANN models used in this study appear to be utilizing and integrating such information in order to generate skilful forecasts up to 18 months in advance, particularly for Narrogin.

We have combined climate indices and other potentially predictive inputs (atmospheric temperature and local rainfall) into long arrays, each lagged incrementally for up to 12 months. We have then used this information to build location-specific models for monthly rainfall forecasting with analysis of the sensitivity of the input attributes, Table 2, in effect confirming the work of others [43, 44] that no single climate index is ever likely to contribute more than 20% to forecast skill. Considering broad categories of effect, we found that the DMI could contribute 13.8 to 20.2% to the skill of a forecast for locations within the wheat belt, as shown in Table 2. In the case of Southern Cross, a combination of the four Niños could contribute up to 25% to the skill of the forecast, and typically the contribution from individual climate indices was less than this.

The significantly more skilful forecasts for Narrogin relative to the forecasts for Merredin and Southern Cross may be a function of the lower mean rainfall for Merredin and Southern Cross of 325.8 mm and 292.8 mm respectively, relative to Narrogin at 494.4 mm. Generally, the lower and more erratic the rainfall at a particular location, the more difficult it will be for an ANN to achieve a skilful forecast.

The forecasts for all three locations may be amenable to further improvement through the incorporation of additional climate indices. Schepen et al. [43] identified the Indian Ocean East Pole Index (EPI), the ENSO Modoki Index (EMI) and Blocking as dominant influences for particular grid cells in the wheat belt. Other climate indices that should be considered in future research are the Madden Julian Oscillation (MJO) and the Southern Annular Mode (SAM), which are also measures of climate phenomena thought to impact Western Australia’s climate. These indices were not considered here because they are only available from the 1950s. We chose to use longer datasets and were thus restricted to the climate indices available from the late 1800s.

The neural network approach illustrated here for three locations in the Western Australian wheat belt provides the potential for significant improvement in medium-term rainfall forecasts, particularly relative to forecasts currently reliant on general circulation models that have skill levels often little better than climatology particularly at long lead times [32, 34]. Several dozen other sites can be identified within the Western Australian wheat belt where long rainfall records are available and the neural network approach can also potentially be applied. These improved rainfall forecasts can be tailored to the specific requirements of farmers, and would likely translate into tangible economic benefits [3, 56].

The official rainfall forecasts issued by the BOM are produced using the general circulation model POAMA. Monthly rainfall forecasts with long lead times of up to 8 months for the south-west region of Western Australia have low skill, and are generally little better than climatology [32], a simple averaging of rainfall over an extended period. In contrast, skill significantly superior to climatology is achievable using the ANN approach. We have previously reported similar findings for studies of monthly rainfall forecasting for eastern Australia, including Queensland [50] and the Murray Darling Basin [51]. Low skill in rainfall forecasting with general circulation models has been reported for other regions of the world, for example in the United States [57]. There is significant scope to further improve the skill of the monthly rainfall forecasts presented here for sites in Western Australia. For example, our investigations have demonstrated that generating forecasts one month at a time with an ANN, instead of for all months together, produces an enhancement in forecast skill [50]. This approach has been particularly useful in forecasting extreme rainfall events, for example during the flooding of Brisbane in the summer of 2010/2011 [58].