1 Introduction

Soil moisture is an important bridge in land–atmosphere coupling and strongly affects weather and climate (Schär et al. 1999; Koster et al. 2004; Zhou et al. 2007; Wei et al. 2008; Seneviratne et al. 2010; Zhang and Zuo 2011; Bellucci et al. 2015). Abnormal soil moisture may have a lasting impact on subsequent atmospheric variability and extreme weather anomalies. Compared with some other terrestrial factors, soil moisture is more persistent and can increase the predictability of the atmosphere through land–atmosphere interactions, so it is valued in weather and climate prediction (Koster and Suarez 2001; Timbal et al. 2002; Seneviratne et al. 2006; Douville et al. 2007; Dirmeyer et al. 2009; Lorenz et al. 2010; McColl et al. 2017). Many previous studies have examined the predictability of soil moisture. For example, Schlosser and Milly (2002) found that the predictability of soil moisture lasts from several weeks to several months, and Kanamitsu et al. (2003) showed that soil moisture predictability has a non-uniform spatiotemporal distribution.

Soil moisture predictability may stem from its own persistence or from external forcing factors. In terms of persistence, studies have shown that the persistence of soil moisture is usually higher in arid and humid regions than in semi-arid areas (Oglesby and Erickson 1989; Koster and Suarez 2001). However, not all semi-arid zones have lower soil moisture predictability, because there are other sources of predictability. Global sea surface temperature (SST) is one of the most important external forcing factors (Quan et al. 2004; Nicolai-Shaw et al. 2016; Hua et al. 2018a). SST can affect the distribution and transport of global heat and moisture, thereby affecting local and global climate (Rodwell et al. 1999; Chang et al. 2000; Ashfaq et al. 2011). For example, a previous study showed that the contribution of the El Niño–Southern Oscillation (ENSO) to soil moisture predictability is stronger than that of soil moisture persistence in many regions (Sospedra-Alfonso and Merryfield 2018). However, there are many SST factors in addition to ENSO, and these factors affect the predictability of soil moisture differently in different regions. Therefore, the contributions of the various SST factors to soil moisture predictability, and how these effects vary among regions, need further clarification.

The specific routes by which SST affects soil moisture predictability are also of interest. The variability of soil moisture is affected by many factors, such as precipitation, temperature, evaporation and cloud cover, and these factors are themselves affected by SST to a large extent. Studies have shown that drought events in North America can largely be explained in terms of atmospheric circulation anomalies forced by tropical SST (Schubert et al. 2009; Seager and Hoerling 2014). Evapotranspiration, an important factor in soil moisture variability, is affected by meteorological factors such as air temperature and wind speed, which are also strongly influenced by SST in different ways (Wallace et al. 1989; Feudale and Shukla 2011). It is therefore necessary to investigate how SST is connected with soil moisture predictability in different regions.

Hence, our study employed the Community Earth System Model (CESM) for multiple groups of ensemble numerical simulations and used canonical correlation analysis (CCA) between soil moisture and the empirical orthogonal function (EOF) modes of SST to construct a regression model. This model was used to study the respective contributions of soil moisture persistence and SST forcing to global soil moisture predictability. This paper is organized as follows. Sect. 2 describes the data sets, the model and experimental design, and the method used to quantify the predictability of soil moisture. The results are presented in Sect. 3, followed by the discussion and conclusions in Sect. 4.

2 Methods

The Community Earth System Model (CESM) is a state-of-the-art Earth system climate model composed of multiple components, such as the Community Atmosphere Model (CAM version 5) (Neale et al. 2010) and the Community Land Model (CLM version 4.5). CESM has been widely used for climate research, and its performance is among the best of the current climate models (Neale et al. 2010; Hurrell et al. 2013; Oleson et al. 2013). We used CESM version 1.2.2 to conduct 20 AMIP-type ensemble experiments (Gates et al. 1999; Hua et al. 2018b) that ran for 50 years from 1965 to 2014, with a horizontal resolution of 0.9° × 1.25°. The SST data and the natural and anthropogenic forcings were identical across all ensemble members. SST data were obtained from the Hadley Centre (Hadley Centre Sea Ice and Sea Surface Temperature data set, HadISST), while external forcing data were taken from the default dataset of the model. The 20 experiments differ only in their initial states: data from the last 20 continuous days (1965-01-01 to 1965-01-20) in the spin-up results of the first 15 years (1950–1964) were used as the initial states of the individual experiments. The different initial conditions ensured the divergence of the ensemble results (Yang et al. 1995; Cosgrove et al. 2003; Zuo et al. 2013). Using model data instead of observational data avoids inconsistencies caused by observational biases, and the ensemble spread provides an uncertainty range for the results. To further validate the results obtained from the model simulations, we also used soil moisture data from the Global Land Data Assimilation System (GLDAS) for comparative analyses (Rodell et al. 2004).

There are multiple definitions of soil moisture predictability. In this study, we employed a regression model to study the predictability of soil moisture and the contributions to it from soil moisture persistence and SST (Nicolai-Shaw et al. 2016). First, Xt is defined as the modelled soil moisture in the surface layer (0–9.1 cm, the top three soil layers of the CESM model) at time t, and the following regression formula (the autoregressive model) is constructed:

$$\mathrm{X}_{\mathrm{t}} = \alpha_{\mathrm{X}} \mathrm{X}_{\mathrm{t}-\lambda} + \alpha_{0}$$

where Xt−λ is the soil moisture λ months before time t (leading month λ = 1, 2, …, 5), αX is the regression coefficient, and α0 is the error term. Before the regression was carried out, the data were detrended and standardized.
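The autoregressive step can be illustrated with a minimal sketch. It assumes a single continuous monthly series at one grid point and a synthetic AR(1) input. The function names `detrend_standardize` and `lagged_r2` are hypothetical, and the seasonal sampling used in the study is not reproduced here.

```python
import numpy as np

def detrend_standardize(x):
    """Remove a linear trend, then scale to zero mean and unit variance."""
    t = np.arange(x.size)
    slope, intercept = np.polyfit(t, x, 1)
    residual = x - (slope * t + intercept)
    return (residual - residual.mean()) / residual.std()

def lagged_r2(x, lam):
    """R^2 of the least-squares fit X_t = a_X * X_{t-lam} + a_0."""
    y = x[lam:]                                        # X_t
    A = np.column_stack([x[:-lam], np.ones(y.size)])   # [X_{t-lam}, 1]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid.var() / y.var()

# Synthetic stand-in for a soil moisture series (assumption: AR(1), phi = 0.8).
rng = np.random.default_rng(0)
raw = np.zeros(600)
for i in range(1, raw.size):
    raw[i] = 0.8 * raw[i - 1] + rng.standard_normal()

sm = detrend_standardize(raw)
r2_by_lead = [lagged_r2(sm, lam) for lam in range(1, 6)]
print([round(r, 2) for r in r2_by_lead])
```

For a persistent series of this kind, the R² values fall off as the leading month grows, which is the behaviour the autoregressive model is meant to capture.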

To evaluate the influence of SST on global soil moisture, the time series of soil moisture at each land grid point and the time series of the top 50 EOF modes of global SST (explaining about 66% of the variance) were optimally correlated through CCA (Wilks 2011); the data were also detrended and standardized before CCA. The first canonical variable obtained from CCA is the linear combination of the 50 EOF time series that optimally correlates with soil moisture at each grid point (Orlowsky and Seneviratne 2010); using 5–20 EOFs also gives robust results. This first canonical variable was added to the aforementioned equation to obtain a new regression equation (the SST regression model):

$$\mathrm{X}_{\mathrm{t}} = \alpha_{\mathrm{X}} \mathrm{X}_{\mathrm{t}-\lambda} + \alpha_{\mathrm{sst}} \mathrm{INDEX}_{\mathrm{sst},\mathrm{t}-\lambda} + \alpha_{0}$$

where INDEXsst is the first canonical variable obtained from CCA, which represents the effect of SST on soil moisture (it differs for each grid point), and αsst is the corresponding regression coefficient. We use the coefficient of determination (R2) to evaluate the performance of the regression models. R2 ranges from 0 to 1, and the greater the value, the better the fit of the regression model. The R2 values from the two regression models are adjusted for the degrees of freedom. The difference in R2 between the two models measures the respective effects of persistence and SST forcing on soil moisture predictability. Student's t-tests are conducted for the difference fields using the variance across the 20 ensemble members. For comparison, the two regression models are also calculated with the GLDAS soil moisture data and the HadISST data.
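The construction of the SST index can be sketched as follows, under stated assumptions. When one of the two CCA variable sets is a single time series (soil moisture at one grid point), the first canonical variable of the other set is proportional to the fitted values of an ordinary multiple regression, so the sketch uses least squares in place of a full CCA routine. The functions `leading_pcs` and `sst_index`, the synthetic data, and the omission of area weighting and of the lag alignment are all simplifications for illustration.

```python
import numpy as np

def leading_pcs(sst_anom, k):
    """Standardized time series of the k leading EOFs (input: time x space)."""
    centered = sst_anom - sst_anom.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    pcs = u[:, :k] * s[:k]
    return pcs / pcs.std(axis=0)

def sst_index(pcs, sm):
    """Linear combination of the PCs that best correlates with one series."""
    coef, *_ = np.linalg.lstsq(pcs, sm - sm.mean(), rcond=None)
    index = pcs @ coef
    return index / index.std(), coef

# Synthetic stand-ins (assumption): 120 time steps, 40 ocean points, 5 EOFs.
rng = np.random.default_rng(1)
sst = rng.standard_normal((120, 40))
pcs = leading_pcs(sst, k=5)
sm = 0.7 * pcs[:, 0] - 0.4 * pcs[:, 2] + 0.5 * rng.standard_normal(120)

index, coef = sst_index(pcs, sm)
r_index = np.corrcoef(index, sm)[0, 1]
print(round(r_index, 2))
```

By construction, the correlation between `index` and the soil moisture series is at least as large as that of any single principal component, which is why a combined index can carry more information than one mode alone.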

3 Results

Soil moisture predictability arises from its persistence and from external forcing factors. The effects of these two factors on predictability vary by region and season. Figures 1 and S1–S3 show maps of the coefficient of determination, R2, of the autoregressive model for soil moisture. Here, R2 represents the predictability caused by soil moisture persistence. At one leading month, the R2 of the autoregressive model passes the significance test in many areas, indicating that the persistence of soil moisture itself increases predictability in these regions at this lead. As the number of leading months increases, R2 rapidly decreases in most regions and seasons. At five leading months, R2 is significant in very few regions globally in any season, showing that the predictability provided by soil moisture itself does not last for 5 months.

Fig. 1

The coefficients of determination (R2) of the autoregressive model (left panels; shaded areas are significant at the 0.05 level), the differences in R2 between the SST regression model and the autoregressive model (middle panels; shaded areas are significant at the 0.1 level), and the differences in R2 between the SST regression model and the autoregressive model calculated from GLDAS soil moisture data (right panels) for JJA at three leading months (LM = 1, 3, 5). The model results are the mean values of the 20 ensemble members.

However, when the impact of SST is considered, the predictability of soil moisture increases significantly. The distribution of the differences in R2 between the SST regression model and the autoregressive model is also shown in Figs. 1 and S1–S3. The R2 of the regression model increases significantly in all seasons after the impact of SST is considered. At one leading month, the regions that show an increase in R2 are mainly tropical and arid–humid transition regions (Amazon, Sahel, Australia, etc.). As the lead time increases, the difference also increases. At three leading months, a significant difference can be seen in South America, which moves from south to north with the rain belt over the four seasons. In addition, regions with significantly increased predictability are also present in southern Africa and parts of Asia. At five leading months, regions with increased R2 are still present in South America. Increased predictability can still be seen in the Sahel in autumn and winter, but the difference becomes less significant in other regions. The differences in R2 between the SST regression model and the autoregressive model from the GLDAS data are similar to the model results (Figs. 1 and S1–S3, right panels) in the regions with large values, such as North and South America, the Sahel and Australia. However, a significance test like that applied to the ensemble results cannot be conducted, because the GLDAS data provide only one set of soil moisture results. The GLDAS results therefore inevitably contain noise and are not fully consistent with the ensemble results in some areas. The results also show that considering SST as an influencing factor can increase soil moisture predictability by 2–3 months, and by at most 5 months in tropical regions (Fig. 2). As the number of leading months increases, the contribution of SST to soil moisture predictability also gradually increases (Fig. 3). As can be seen in Fig. 3, at one leading month, the contribution of soil moisture persistence to the SST regression model is greater than that of SST in most land regions, except for a few tropical areas. At three leading months, the contribution of SST becomes greater than that of soil moisture persistence, especially in JJA and SON. At five leading months, in contrast to one leading month, the contribution of SST is dominant in most parts of the world.
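The comparison of the two regression models across the ensemble can be sketched as below. The adjusted R² uses the standard degrees-of-freedom correction. Testing the R² difference against zero with a one-sample t statistic over the 20 members is an assumption about how the ensemble variance enters the test, and the data are synthetic. The function names are hypothetical.

```python
import numpy as np

def adjusted_r2(y, predictors):
    """Adjusted R^2 of an OLS fit of y on the predictors plus an intercept."""
    n, p = predictors.shape
    A = np.column_stack([predictors, np.ones(n)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - resid.var() / y.var()
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def t_statistic(diffs):
    """One-sample t statistic for the hypothesis that the mean of diffs is zero."""
    diffs = np.asarray(diffs)
    return diffs.mean() / (diffs.std(ddof=1) / np.sqrt(diffs.size))

# Synthetic ensemble (assumption): 20 members, 50 years, one grid point.
rng = np.random.default_rng(2)
diffs = []
for member in range(20):
    sm_lag = rng.standard_normal(50)     # X_{t-lam}
    sst_idx = rng.standard_normal(50)    # INDEX_{sst,t-lam}
    y = 0.3 * sm_lag + 0.6 * sst_idx + 0.7 * rng.standard_normal(50)
    r2_ar = adjusted_r2(y, sm_lag[:, None])
    r2_sst = adjusted_r2(y, np.column_stack([sm_lag, sst_idx]))
    diffs.append(r2_sst - r2_ar)

t = t_statistic(diffs)
# 2.093 is the two-sided 0.05 critical value of Student's t with 19 degrees of freedom.
significant = abs(t) > 2.093
print(round(float(np.mean(diffs)), 2), bool(significant))
```

Note that the adjusted R² can fall slightly below zero when a model has almost no explanatory power, so small negative values in a difference field are not necessarily an error. A single data set such as GLDAS gives only one difference per grid point, which is why no comparable test is available for it.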

Fig. 2

The coefficients of determination (R2) of the SST regression model for the four seasons (from top to bottom) at different leading months (1, 3, 5 months from left to right). The results are the ensemble means of the 20 members, and the white areas are nonsignificant (p > 0.05).

Fig. 3

Overview of the main contributor (persistence vs SST forcing) to the SST regression model for the four seasons (from top to bottom) at different leading months (1, 3, 5 months from left to right). The white areas over land are nonsignificant (p > 0.05).

We then selected six regions with the most significant increases in soil moisture predictability for detailed analysis. Figure 4 shows the extents of the selected regions and how R2 varies with the number of leading months in these six regions. In general, R2 is greater in the local winter than in summer, showing that soil moisture predictability is higher in winter than in summer. Considering the SST influence slows the overall decrease in R2 in the various regions, and the corresponding R2 at each leading month is higher than the autoregressive R2. Furthermore, the seasonal differences in predictability vary among regions: Australia and Africa exhibit larger seasonal differences, possibly because these two regions are controlled by monsoons and their seasonal changes are more pronounced. In North America, the seasonal differences in predictability are the smallest.

In order to study which SST mode dominates the effect on soil moisture predictability, we further analyzed the SST EOF mode corresponding to the largest loading factor in the first canonical variable of the CCA. Figure 5 shows the spatial distribution of the EOF modes with the largest loading factor in the CCA. The first five SST modes (of the 50 EOF modes in total) dominate almost 90% of the land area. The first SST mode (which corresponds to ENSO) has the greatest contribution and is the main contributor to increased predictability, indicating that ENSO is the main source of soil moisture predictability. However, the other four SST EOF modes shown in Fig. 5 contribute more to the CCA than ENSO in many areas, such as most of Asia, some subtropical regions of South America, and Canada. In order to confirm the contribution of the SST mode with the largest loading factor in the CCA to the increase in soil moisture predictability, we used the top-contributing SST mode at each grid point for the soil moisture regression. Figure 6 shows the distribution of the differences in R2 between this regression model and the autoregressive model. The results show that the top SST EOF mode can explain the predictability increases only in some tropical regions and that other SST modes also play important roles in soil moisture prediction.
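Selecting the dominant EOF mode at each grid point can be sketched as below. The sketch assumes the canonical coefficients are stored as an array of shape (grid points, modes) and that the EOF time series were standardized, so that coefficient magnitudes are comparable. The cosine-latitude weighting used for the area fraction is an assumption, and the small arrays are made up for illustration.

```python
import numpy as np

def dominant_mode(coefs):
    """1-based index of the EOF with the largest absolute coefficient per point."""
    return np.argmax(np.abs(coefs), axis=-1) + 1

def area_fraction(modes, lat_deg, max_mode=5):
    """Cos-latitude weighted fraction of points whose dominant mode is <= max_mode."""
    weights = np.cos(np.deg2rad(lat_deg))
    return float(np.sum(weights * (modes <= max_mode)) / np.sum(weights))

# Hypothetical coefficients for three grid points and three EOF modes.
coefs = np.array([
    [0.9, -0.1, 0.2],    # mode 1 dominates
    [0.1, -0.7, 0.3],    # mode 2 dominates (sign is ignored)
    [0.2, 0.1, -0.05],   # mode 1 dominates
])
lat = np.array([0.0, 30.0, 60.0])

modes = dominant_mode(coefs)
frac_mode1 = area_fraction(modes, lat, max_mode=1)
print(modes.tolist(), round(frac_mode1, 3))
```

Applied to a full land grid with `max_mode=5`, the same calculation would give an area share comparable in kind to the "almost 90%" figure quoted above, although the exact weighting used in the study is not stated.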

Fig. 4

The regional mean differences in the coefficients of determination (R2) between the autoregressive model and the SST regression model for the four seasons in the six regions. The abscissa indicates the leading month (LM = 1–5) and the ordinate the R2 value.

Fig. 5

The top SST EOF modes based on the sorted canonical coefficients of the first canonical variables (top left). The top five EOF modes of the global monthly mean SST are also shown.

Fig. 6

The differences in the coefficients of determination (R2) between the SST regression model based on the top SST EOF mode and the autoregressive model in the four seasons (from left to right) at three leading months (LM = 1, 3, 5). The results are the mean values of the 20 ensemble members, and the white areas are nonsignificant (p > 0.05).

In order to examine how SST increases soil moisture predictability, we compared the lagged correlations of the SST factor and of soil moisture with various meteorological factors. Precipitation and surface temperature are the two main factors that affect soil moisture. An increase or decrease in precipitation generally results in an increase or decrease in soil moisture. Surface temperature affects surface evaporation: when the surface temperature rises, surface evaporation may increase and soil moisture may then decrease. Clouds also affect the amount of solar radiation that reaches the ground, which in turn affects surface temperature and evaporation (Seneviratne et al. 2010; Zhou et al. 2016). The results show that the correlations of precipitation and surface temperature with both the SST factor and soil moisture decrease as the number of leading months increases. However, the correlations between precipitation and the SST factor are much higher than those between precipitation and soil moisture at all five leading months (Fig. 7; JJA is taken as an example, and the other seasons show similar results). For surface temperature, the correlations with the SST factor and with soil moisture are quite close at leading month 1, but as the leading month increases, the correlations with the SST factor decrease much more slowly, suggesting that the SST factor has a more sustained influence on surface temperature (Fig. 8). The variability of soil moisture is strongly affected by precipitation and surface temperature. The higher correlations of the SST factor with precipitation and surface temperature after leading month 1 indicate that the information in the SST factor can bring more predictability to soil moisture by affecting local precipitation and temperature. The analysis using GLDAS soil moisture data shows fairly similar characteristics: the correlation coefficients between the SST factor and precipitation or temperature decrease more slowly with increasing leading month than those of soil moisture (Figures S4 and S5). The regional analysis shows that in South America, the SST factor mainly affects precipitation and surface temperature, thus increasing soil moisture predictability (Figs. 7, 8, 9, 10). The correlations of soil moisture with surface temperature and with precipitation in South America are significant after one leading month, whereas the correlations of SST with surface temperature and with precipitation are still significant after three leading months, and the percentage of grid points that pass the t-test is twice that for the soil moisture correlations. In North America, SST mainly affects precipitation to increase soil moisture predictability. Figure 9 shows that the percentage of grid points passing the t-test for the SST factor in this region is significantly higher than that for soil moisture, with the greatest difference in spring. In Africa, SST mainly affects surface evaporation through surface temperature to control soil moisture predictability (Figs. 8 and 10). In winter over Europe and in spring over West Asia, the increase in soil moisture predictability is due to the effects of SST on local cloud cover (Figure S6), which affects surface evaporation (Figure S7). Thus, SST affects soil moisture predictability through the combined effects of multiple meteorological factors.
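The lagged-correlation comparison and the percentage of grid cells passing the t-test can be sketched as follows. Here `driver` stands for the SST factor or soil moisture at the earlier time and `target` for precipitation or surface temperature. Both names, the synthetic fields, and the critical value passed in by the caller are assumptions for illustration.

```python
import numpy as np

def lagged_corr(driver, target, lam):
    """Correlation of driver at t - lam with target at t, per grid cell (time is axis 0)."""
    a = driver[:-lam] - driver[:-lam].mean(axis=0)
    b = target[lam:] - target[lam:].mean(axis=0)
    return (a * b).sum(axis=0) / np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))

def percent_significant(r, n, t_crit):
    """Percentage of cells whose correlation passes a two-sided t-test with n pairs."""
    t = np.abs(r) * np.sqrt((n - 2) / (1.0 - r ** 2))
    return 100.0 * float(np.mean(t > t_crit))

# Synthetic fields (assumption): 50 time steps, 100 grid cells,
# with the target responding to the driver one step earlier.
rng = np.random.default_rng(3)
driver = rng.standard_normal((50, 100))
target = 0.5 * rng.standard_normal((50, 100))
target[1:] += 0.8 * driver[:-1]

# About 1.68 is the two-sided 0.1 critical value of Student's t near 48 degrees of freedom.
pct = {}
for lam in (1, 5):
    r = lagged_corr(driver, target, lam)
    pct[lam] = percent_significant(r, n=driver.shape[0] - lam, t_crit=1.68)
print(pct)
```

In this synthetic case the percentage is high at a lead of one step and drops to roughly the false-positive rate at five steps. Comparing such curves for the SST factor and for soil moisture is the kind of contrast summarized in the bottom panels of Figs. 7 and 8.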

Fig. 7

The correlation coefficients between the simulated precipitation and the CCA time series of SST (left panels) or the soil moisture (right panels) for JJA at leading months 1 and 5. The white areas indicate nonsignificant correlation coefficients (p > 0.1). The bottom panel shows the global percentage of grid cells where the correlation coefficient passes the 0.1 significance level for JJA. The abscissa indicates the leading month (LM = 1–5) and the ordinate the percentage of grid cells that pass the t-test.

Fig. 8

Same as Fig. 7, but for surface temperature.

Fig. 9

The difference in the percentage of grid cells where the correlation coefficients pass the 0.1 significance level between the simulated precipitation and the simulated soil moisture or the CCA time series of SST, for the four seasons in the six regions. The abscissa indicates the leading month (LM = 1–5) and the ordinate the percentage of grid cells that pass the t-test. The six regions are shown in Fig. 4.

Fig. 10

Same as Fig. 9, but for surface temperature.

4 Discussion and conclusions

In this study, we employed the CESM model for multiple groups of ensemble experiments to analyze the spatiotemporal characteristics of global soil moisture predictability. We studied the effects of SST as an external forcing on soil moisture predictability, and the routes through which SST can increase soil moisture predictability. The main conclusions of this study are as follows:

The results of the ensemble experiments show that the predictability arising from soil moisture persistence lasts for only 1–2 months in most regions of the world, and that the persistence of soil moisture is higher in arid and permafrost regions. After SST is considered as an external forcing, soil moisture predictability can increase by 1–3 months in many regions. In tropical regions (Amazon, Sahel, India, Indochina, and Australia), soil moisture predictability increases more after SST is considered. In addition, in winter and spring in North America and the Iranian Plateau, the effects of SST in increasing predictability are more significant. Note that snowmelt and soil freeze–thaw processes are also important for soil moisture variation at high latitudes and altitudes. For comparison with the influence of SST, snow depth and soil ice were added to a regression model together with SST, and the main-contributor analysis shows that soil ice makes a greater contribution to the predictability of soil moisture in the high latitudes of the Northern Hemisphere in JJA (Figure S8). The contribution of snow depth is not as large as that of soil ice and the SST factors. To focus the discussion on the relationship between SST and soil moisture predictability, the areas dominated by soil ice and snow in Figure S8 are masked in the R2 difference figures.

Regression of the top SST EOF mode (corresponding to ENSO) exhibited good performance in Sahel and South America, which means the ENSO signal can explain the increase of soil moisture predictability there. This is also reported by other researchers (Nicolai-Shaw et al. 2016; Sospedra-Alfonso and Merryfield 2018). However, it does not result in a significant increase in predictability in other regions, particularly extratropical regions, where CCA with more SST EOF modes can significantly increase the predictability of soil moisture. This indicates that considering more SST modes in these regions could bring more information to the prediction of soil moisture variability.

Soil moisture is affected by various meteorological factors such as precipitation and surface temperature, and these factors are in turn strongly affected by the SST factor. Many studies have shown that the SST factor has a significant impact on future precipitation and surface temperature (Ropelewski and Halpert 1986; Bradley et al. 1987; Dai and Wigley 2000; Misra 2003; Kushnir et al. 2010; Donat et al. 2014; Manatsa and Reason 2017; McCoy et al. 2017, etc.). The SST factor has longer-term predictability than soil moisture, so the information in the SST factor can flow into soil moisture through meteorological factors such as precipitation or temperature, which brings long-term predictability to soil moisture (Klopper et al. 1998; Power et al. 2006; Sospedra-Alfonso and Merryfield 2018). The correlation analysis shows that in South America SST mainly affects precipitation and surface temperature, enhancing soil moisture predictability for 1–2 months. In Africa, surface temperature and surface evaporation are important routes through which SST affects soil moisture predictability. During winter and spring in North America and West Asia, SST affects soil moisture predictability mainly through precipitation and cloud cover, respectively. We further used GLDAS soil moisture and temperature data with the same method to calculate the predictability of soil moisture for comparison with the ensemble experiment results, and found that the differences between the two are not large, which supports the reliability of the conclusions.

In this paper, we mainly focus on the soil moisture of the surface layer. For deep layers (2–3 m), the effect of soil moisture persistence on its predictability is stronger and the influence of SST is similar in spatial patterns but gets weaker as the depth increases. In addition to SST, the physiological effects of vegetation, human activities, and soil properties may also affect the predictability of soil moisture. Their parameterizations in climate models still have large uncertainties and require further improvement. Therefore, as the models continue to develop, new challenges for soil moisture predictability research will inevitably arise.