1 Introduction

Soil temperature is a key land surface parameter that contributes considerably to climate variations and predictions (Tang et al. 1988; Wang 1991; Liu and Avissar 1999; Zhou and Huang 2006; Fan 2009; Wang et al. 2013). As a reflection of the land surface thermal conditions, soil temperature plays an important role in the energy and water balance of the land surface. Through interaction with the atmosphere, soil temperature has been demonstrated to have substantial effects on monthly to interannual climate variations (Hu and Feng 2004a, b; Mahanama et al. 2008; Wu and Zhang 2014). By inducing an eastward-propagating cyclone, a warm May subsurface soil temperature in the western United States can lead to more June precipitation in the southern United States and less precipitation in the north (Xue et al. 2012). The modeling work of Wu and Zhang (2014) emphasized the importance of subsurface soil temperature on summer surface air temperature variability over arid and semi-arid regions of Eastern Asia. As a slow variable of the land surface, soil temperature can “remember” climate anomalies and release their effects in subsequent seasons (Hu and Feng 2004a). The memory of soil temperature can persist for 1 month to years, depending on soil depth, season, and climate regime (Liu and Avissar 1999; Yang and Zhang 2015). Soil temperature memory is considered to be a potential predictor for seasonal climate anomalies and extremes, and could improve forecasts of monthly to interannual climate (Xue et al. 2012; Yang and Zhang 2015).

Reanalysis datasets have been widely adopted in climatology research to complement incomplete observational records (Robock et al. 2000; Koster et al. 2004; Zhang et al. 2008; Kim and Alexander 2013). The evaluation of reanalysis products, which is essential and critical because of the uncertainty that may be caused by data assimilation and forecast models, provides a reference for applying reanalysis datasets in different regions and fields (Dirmeyer et al. 2004; Hodges et al. 2011; Bao and Zhang 2012; Shah and Mishra 2014). Many variables, such as surface air temperature, radiative fluxes, precipitation, and wind speed, from different reanalysis datasets have been widely evaluated (Simmons et al. 2004; Chaudhuri et al. 2012; Lindsay et al. 2014). According to previous evaluation works, generally speaking, no one product performs better than the others in all fields and regions (Makshtas et al. 2007; Mao et al. 2010; Chaudhuri et al. 2012). The reanalysis data of land surface parameters, such as soil moisture, have also been evaluated. The 40-year European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis (ERA-40) dataset has been found to perform better than the National Centers for Environmental Prediction-National Center for Atmospheric Research (NCEP-NCAR) reanalysis 1 and the NCEP-Department of Energy (NCEP-DOE) datasets for reproducing the mean value and interannual variability of soil moisture in China (Li et al. 2005). Albergel et al. (2012) compared soil moisture in the ECMWF’s Interim reanalysis (ERA-Interim) dataset with in situ soil moisture observations from 117 stations across the world (Australia, Africa, America, and Europe). They found that the ERA-Interim dataset generally overestimates soil moisture, especially over dry land. However, it still performs well for surface soil moisture variability.

Considering the importance of soil temperature in the land system, recently, several works have assessed the forecast quality of soil temperature in numerical weather predictions (Holmes et al. 2012; Albergel et al. 2015). Holmes et al. (2012) assessed surface soil temperature from the Integrated Forecasting System from ECMWF, the modern-era retrospective analysis for research and applications (MERRA) from the NASA Global Modeling and Assimilation Office, and the global data assimilation system used by NCEP over Oklahoma. Albergel et al. (2015) used soil temperature measurements over the United States and Europe to assess ECMWF forecasts of soil temperature during 2012. They found that the ECMWF forecasts can generally represent the annual and diurnal cycle of soil temperature. Furthermore, they highlighted the importance of orographic data for estimating soil temperature. However, to the best of our knowledge, evaluations of soil temperature data from reanalysis products over China are still scarce.

Our present study evaluates four well-known reanalysis datasets, namely the land surface reanalysis of ECMWF (ERA-Interim/Land), the second MERRA (MERRA-2), the NCEP Climate Forecast System Reanalysis (CFSR), and version 2 of the Global Land Data Assimilation System (GLDAS-2.0), by comparing them with the observational soil temperature data over China, to provide a reference for the ability of these products to describe the seasonal mean, interannual variability and linear trend of soil temperature, and also, for their ability to estimate soil temperature memory over China. To explore the possible reasons for the different behaviors of the four reanalysis datasets, we further examine the relationship between soil temperature and surface energy balance components. The remainder of this paper is arranged as follows. The observational and reanalysis datasets are described in Sect. 2. Section 3 shows the results of the evaluation. Discussion and conclusions are presented in Sects. 4 and 5, respectively.

2 Data and methods

2.1 Observations

The observational data used in this study are the monthly mean soil temperature of 626 stations over China for the period of 1981–2005, provided by the China Meteorological Administration. The dataset has nine soil layers of 0, 5, 10, 15, 20, 40, 80, 160, and 320 cm. Owing to the limitation of data availability, we only retain the stations with complete records for specific periods and soil layers. Figure 1 shows the spatial distribution of the stations with available soil temperature data for all nine soil layers for summer (June, July, and August), winter (December, January, and February), and for all 12 months of all 25 years. These stations are mostly located over the east of China. In this study, bilinear interpolation is used to interpolate reanalysis data to stations.
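The station interpolation mentioned above can be illustrated with a minimal sketch. It assumes a regular latitude–longitude grid with ascending coordinates; the function name `bilinear_to_station` and its arguments are hypothetical and are not taken from the processing code used in this study.

```python
import numpy as np

def bilinear_to_station(lats, lons, field, st_lat, st_lon):
    """Bilinearly interpolate a 2-D gridded field to one station location.

    lats, lons: 1-D ascending grid coordinates (degrees).
    field: 2-D array with shape (len(lats), len(lons)).
    Assumes the station lies inside the grid (no extrapolation check).
    """
    # Lower-left corner of the enclosing cell, clipped so i + 1 and j + 1 exist.
    i = int(np.clip(np.searchsorted(lats, st_lat) - 1, 0, len(lats) - 2))
    j = int(np.clip(np.searchsorted(lons, st_lon) - 1, 0, len(lons) - 2))
    # Fractional position of the station inside the cell (0..1 in each direction).
    ty = (st_lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (st_lon - lons[j]) / (lons[j + 1] - lons[j])
    # Weighted average of the four surrounding grid values.
    return ((1 - ty) * (1 - tx) * field[i, j]
            + (1 - ty) * tx * field[i, j + 1]
            + ty * (1 - tx) * field[i + 1, j]
            + ty * tx * field[i + 1, j + 1])
```

Because the weights depend only on the station position within its grid cell, the same function can be applied to each reanalysis product on its native grid.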

Fig. 1

Spatial distribution of stations with complete soil temperature records for all nine soil layers for a summer, b winter, and c all 12 months of all 25 years

2.2 ERA-Interim/Land

The ERA-Interim/Land dataset (Dee et al. 2011; Balsamo et al. 2015) is the newest land surface model simulation produced by ECMWF, covering 1979–2010. Based on a spatial resolution of 80 km (T255 spectral), the soil temperature data in ERA-Interim/Land have four layers with depths of 0–7, 7–28, 28–100, and 100–289 cm. Forced by the near-surface meteorological fields from ERA-Interim (Dee et al. 2011) and precipitation adjustments based on the Global Precipitation Climatology Project Version 2.1 (Huffman et al. 2009), ERA-Interim/Land is executed using the latest version of the Hydrology-Tiled ECMWF Scheme for Surface Exchanges over land (HTESSEL, Balsamo et al. 2009). Compared with the Tiled ECMWF Scheme for Surface Exchanges over Land (TESSEL) used in ERA-Interim, HTESSEL has significant improvements in terms of soil hydrology, snow scheme, vegetation climatology, and bare-soil evaporation. The ERA-Interim/Land dataset has been found to show more agreement with the observations for latent and sensible heat fluxes, soil moisture, and snow than the ERA-Interim dataset, and is considered to be more suitable for climate applications in terms of land surface parameters. The ERA-Interim/Land data are used on a 0.5°×0.5° grid in our study.

2.3 MERRA-2

The MERRA-2 dataset (Bosilovich et al. 2016), as a replacement for the MERRA reanalysis (Rienecker et al. 2011) produced by NASA, uses an upgraded version of the Goddard Earth Observing System Model, Version 5 (GEOS-5) data assimilation system, including the GEOS-5 atmospheric model (Rienecker et al. 2008; Molod et al. 2015) and the Gridpoint Statistical Interpolation (GSI) analysis scheme (Wu et al. 2002). Compared with MERRA, MERRA-2 uses observation-based precipitation data instead of model-generated precipitation to force the land surface parameterization, and it includes numerous additional satellite observations. The soil temperature in MERRA-2 has six layers with thicknesses of 9.88, 19.52, 38.59, 76.27, 150.7, and 1000 cm, and is provided on a grid with 576 points in the longitudinal direction and 361 points in the latitudinal direction (0.625°×0.5°).

2.4 CFSR

The CFSR dataset (Saha et al. 2010; Meng et al. 2012) is the newest global, high-resolution reanalysis covering 1979–2009 developed by NCEP. Using the Noah four-layer land surface model, CFSR adopts the NASA land information system (LIS) to execute the global land data assimilation system (GLDAS/LIS; Mitchell et al. 2004; Rodell et al. 2004; Peters-Lidard et al. 2007) to perform the land surface analysis. GLDAS/LIS is forced by the atmospheric data assimilation output of CFSR and observational precipitation, including the pentad data of the Climate Prediction Center (CPC) Merged Analysis of Precipitation (Xie and Arkin 1997) and the CPC unified global daily gauge analysis. The soil depths of the four soil layers are 0–10, 10–40, 40–100, and 100–200 cm.

2.5 GLDAS-2.0

The GLDAS-2.0 dataset (Rodell et al. 2004) is the newest reanalysis, covering 1948–2010, produced as part of the mission of NASA’s Earth Science Division, and is archived and distributed by the Goddard Earth Sciences (GES) Data and Information Services Center (DISC). Based on the Noah model, GLDAS-2.0 is forced by the global meteorological forcing dataset from Princeton University (Sheffield et al. 2006). In addition to the model version upgrade, GLDAS-2.0 uses the MODIS-based land surface parameter datasets and includes initialization of soil moisture over desert. The bottom-layer temperature in the Noah model is also updated, compared with GLDAS-1. GLDAS-2.0 is available at two resolutions, 1°×1° and 0.25°×0.25°; the 0.25°×0.25° version is used in our study. Similar to CFSR, GLDAS-2.0 has four soil layers with depths of 0–10, 10–40, 40–100, and 100–200 cm.

3 Results

In land surface models, as an unavoidable limitation, soil temperature is given as an average of a soil layer. Linear interpolation is usually adopted to approximate the soil temperature at specific soil depths, which may introduce bias into the evaluation. To present the best behavior of each reanalysis dataset, we evaluate both the layer-averaged soil temperature (LA-ST) and the soil temperature interpolated to the nine observational soil depths using linear interpolation (INTER-ST) by comparing them with the observations (OBS-ST). Figure 2 shows the vertical distribution of the mean LA-ST, INTER-ST, and OBS-ST of the stations with complete observational records in summer and winter. As an observational fact supporting the applicability of linear interpolation, the vertical variation of OBS-ST is approximately linear, especially at 0–40 cm depth. All four reanalysis datasets generally underestimate the soil temperature. For each reanalysis product, whether LA-ST or INTER-ST is closer to the observations depends on soil depth. Comparing LA-ST for the four datasets, in general, the ERA-Interim/Land and GLDAS-2.0 datasets are closer to the observations than the MERRA-2 and CFSR datasets. In summer, the LA-ST of the GLDAS-2.0 dataset at 0–10 cm is higher than that of the other datasets, and the ERA-Interim/Land dataset has similar values for LA-ST at 0–7 cm. For summer LA-ST at 10–28 and 40–100 cm, the ERA-Interim/Land dataset gives a higher estimate than the other datasets. The LA-ST of the GLDAS-2.0 dataset at 28–40 and 100–200 cm is relatively closer to the observations. The MERRA-2 dataset represents the summer LA-ST well in its second soil layer, which may be because it has more soil layers than the other reanalysis datasets. The LA-ST of the CFSR dataset is lower than that of the other datasets.
For LA-ST in winter, the ERA-Interim/Land dataset has the highest estimate at 0–10 and 28–69.77 cm, and even exceeds the observations (estimated based on the assumption of linear variation) around 28 and 100 cm. The GLDAS-2.0 dataset has a relatively higher estimate of soil temperature at 10–28 cm. The MERRA-2 dataset has the highest values for LA-ST at 69.77–100 cm and 144.26–294.94 cm (the soil depth of the fifth soil layer of MERRA-2). The MERRA-2 dataset uses six soil layers, the most among the four datasets, which brings its winter soil temperature closest to the observations in the deep soil layers but also leads to a large bias in its summer estimate. In addition, we find that the bias between the reanalysis data and the observations is larger for soil temperature in summer than in winter.
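The construction of INTER-ST from LA-ST can be sketched as follows. This is only an illustration under one stated assumption: each layer-averaged temperature is assigned to the midpoint depth of its layer before linear interpolation, which the text does not specify. The function name and the temperature values in the usage note are hypothetical.

```python
import numpy as np

def interpolate_to_depths(layer_bounds_cm, layer_temps, target_depths_cm):
    """Linearly interpolate layer-averaged temperatures to target depths.

    layer_bounds_cm: layer interfaces, e.g. [0, 10, 40, 100, 200] for four layers.
    layer_temps: one layer-averaged temperature per layer.
    Assumption: each layer mean is valid at the layer midpoint.
    """
    bounds = np.asarray(layer_bounds_cm, dtype=float)
    mids = 0.5 * (bounds[:-1] + bounds[1:])  # midpoint depth of each layer
    # np.interp holds the end values constant outside the midpoint range,
    # so depths above the first or below the last midpoint are not extrapolated.
    return np.interp(target_depths_cm, mids, layer_temps)
```

For the CFSR and GLDAS-2.0 layering (0–10, 10–40, 40–100, and 100–200 cm), the midpoints are 5, 25, 70, and 150 cm, so with hypothetical layer means of 20, 18, 15, and 12 °C the value returned for 40 cm lies between the second and third layer means. The 0 and 320 cm observation depths fall outside the midpoint range and simply take the nearest layer value in this sketch.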

Fig. 2

Vertical distribution of the mean observational soil temperature (°C), layer-averaged soil temperature (solid lines), and the soil temperature interpolated to the nine observational soil depths using linear interpolation (dashed lines) of all stations with complete observational records in a, b summer and c, d winter. a, c The soil temperature at 0–40 cm, and b, d the soil temperature at 40–320 cm

To investigate the ability of the reanalysis data to rebuild the spatial distribution of soil temperature in summer and winter, we choose 40 cm as an example. For each reanalysis dataset, INTER-ST or LA-ST at 40 cm is selected, depending on which one is closer to observations (shown in Fig. 2), to present the spatial distribution. Therefore, INTER-ST at 40 cm in the ERA-Interim/Land and MERRA-2 datasets and LA-ST of the second layer of the CFSR and GLDAS-2.0 datasets are adopted as the soil temperature at 40 cm in summer. The LA-ST of the third layer of the four datasets is chosen as the soil temperature at 40 cm in winter. The observed soil temperature in summer shows a clear north–south difference over the east of China, with relatively high values in the south and relatively low values in the north (Fig. 3a). Over the west of China, summer soil temperature is relatively larger in the north than in the south. All four reanalysis datasets, which mainly show negative anomalies, generally capture those spatial characteristics, and have fewer discrepancies with the observations in the east than in the west of China. Relatively, the ERA-Interim/Land and GLDAS-2.0 datasets show more in common with the observations compared with the other products. The MERRA-2 dataset also performs well at reproducing the summer soil temperature in central and south China, but has a relatively large bias for soil temperature in north China (Fig. 3). For soil temperature in winter, as shown in Fig. 4, the observations also show a north–south disparity in the east of China, which is generally reproduced by the four reanalysis datasets. The GLDAS-2.0 dataset stands out as having the smallest bias, with a positive anomaly in the north and a negative anomaly in the south. 
The ERA-Interim/Land dataset has a relatively large overestimation of winter soil temperature over north and northwest China, which could be the main reason for the large national mean LA-ST of the ERA-Interim/Land dataset at 40 cm shown in Fig. 2. The MERRA-2 and CFSR datasets show good behavior in terms of reproducing the winter soil temperature over the east of China, and the CFSR dataset has a relatively smaller bias. Compared with the estimation of soil temperature in summer, the reanalysis datasets have a smaller bias for soil temperature in winter.

Fig. 3

Spatial distribution of the soil temperature (°C) for a observations, b ERA-Interim/Land, d MERRA-2, f CFSR, h GLDAS-2.0 at 40 cm in summer, and the spatial distribution of the bias of c ERA-Interim/Land, e MERRA-2, g CFSR, i GLDAS-2.0 compared with the observed summer soil temperature at 40 cm. Stations with complete observed records of summer soil temperature at 40 cm are selected for comparison. Reanalysis data are interpolated to these stations using bilinear interpolation

Fig. 4

Similar to Fig. 3, but for soil temperature in winter. Stations with complete observed records of winter soil temperature at 40 cm are selected for comparison

We also calculate the multiyear mean, correlation coefficient with observations, and root mean square difference of the mean soil temperature of stations with complete records at 40 cm in summer and winter. As shown in Table 1, except for the ERA-Interim/Land dataset in winter, the reanalysis data underestimate soil temperature compared with the observations. The GLDAS-2.0 dataset shows the smallest bias of the multiyear mean in both summer and winter. It also performs better than the other datasets in terms of correlation coefficient and root mean square difference, except for a relatively low correlation coefficient with the observations in winter.
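The statistics listed above can be computed from two yearly series of station-mean seasonal soil temperature, as in the following sketch. The function name and dictionary keys are hypothetical, and two choices are assumptions rather than details given in the text: the standard deviation uses the sample form (ddof=1), and the root mean square difference is taken on the raw values rather than on anomalies.

```python
import numpy as np

def evaluate_series(reanalysis, observed):
    """Summary statistics for one reanalysis series against observations.

    Both inputs are 1-D yearly series (one seasonal mean per year) of equal length.
    """
    rea = np.asarray(reanalysis, dtype=float)
    obs = np.asarray(observed, dtype=float)
    return {
        "MM": rea.mean(),                            # multiyear mean
        "bias": rea.mean() - obs.mean(),             # mean difference from observations
        "CC": np.corrcoef(rea, obs)[0, 1],           # correlation with observations
        "SD": rea.std(ddof=1),                       # interannual variability (sample SD)
        "RMSD": np.sqrt(np.mean((rea - obs) ** 2)),  # root mean square difference
    }
```

A series that is uniformly 1 °C colder than the observations gives a bias of -1 °C, a correlation of 1, and an RMSD of 1 °C, which shows why the mean bias and the correlation need to be read together.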

Table 1 Multiyear mean (MM, °C), correlation coefficient with observations (CC), standard deviation (SD), root mean square difference (RMSD, °C), linear trend (LT, °C/year) and memory lengths (STM, months) of mean soil temperature of stations with complete records at 40 cm in summer (JJA) and winter (DJF)

Standard deviation is adopted to represent the interannual variability of soil temperature. The standard deviation of soil temperature in summer for the observations has a spatial distribution characterized by obvious regional disparity, with higher values in the north than in the south (Fig. 5a). The reanalysis datasets also show a north–south gradient, and generally underestimate the interannual variability of summer soil temperature. Relatively, the ERA-Interim/Land dataset has a more similar spatial distribution to the observations than the other datasets. The other three datasets can mainly capture the interannual variability of summer soil temperature over south China (Fig. 5). Unlike the simple north–south disparity of summer soil temperature, the interannual variability of the observed soil temperature for winter is characterized by a high–low–high pattern from north to south (Fig. 6a). The CFSR dataset has a similar distribution to the observations. The GLDAS-2.0 dataset shows an underestimation in most areas, and can generally reproduce the spatial patterns of the observations over the east of China. The ERA-Interim/Land and MERRA-2 datasets do not rebuild the large standard deviation over north China, but they still capture the spatial characteristics of the observations over central China and south China (Fig. 6). In addition, ERA-Interim/Land and MERRA-2 datasets show the smallest bias of the standard deviation of mean soil temperature of stations with complete records in summer and winter respectively (Table 1).

Fig. 5

Spatial distribution of the standard deviation of soil temperature from a observations, b ERA-Interim/Land, d MERRA-2, f CFSR, h GLDAS-2.0 at 40 cm in summer, and the spatial distribution of the bias of c ERA-Interim/Land, e MERRA-2, g CFSR, i GLDAS-2.0 compared with the standard deviation of the observational summer soil temperature at 40 cm. Stations with complete observed records of summer soil temperature at 40 cm are selected for comparison. Reanalysis data are interpolated to these stations using bilinear interpolation

Fig. 6

Similar to Fig. 5, but for soil temperature in winter. Stations with complete observed records of winter soil temperature at 40 cm are selected for comparison

Figure 7 shows the spatial distributions of the linear trend of soil temperature from the observations and reanalysis datasets in summer. The linear trend of observed summer soil temperature is relatively high in the north and relatively low in the south, a pattern that is reproduced by the CFSR and GLDAS-2.0 datasets. The four reanalysis datasets generally underestimate the linear trend over north and northwest China. Among them, the GLDAS-2.0 dataset has the smallest bias. For the linear trend of the mean soil temperature of stations with complete records in summer, GLDAS-2.0 is also closer to the observations than the other datasets (Table 1). Compared with the linear trend of observed soil temperature in summer, the linear trend of observed soil temperature in winter is relatively higher over south China (Fig. 8). All the reanalysis datasets underestimate the linear trend over north China and fail to reproduce the north–south disparity shown in the observations. The MERRA-2 dataset is relatively closer to the observations for the linear trend of the mean soil temperature of stations with complete records in winter (Table 1).
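A linear trend in °C/year can be estimated for each station or for the station mean with an ordinary least-squares fit, as sketched below. The text does not state which fitting method was used, so least squares is an assumption here, and the function name is hypothetical.

```python
import numpy as np

def linear_trend(years, values):
    """Least-squares slope of values against years (units of values per year)."""
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    # Centering the years improves numerical conditioning; the slope is unchanged.
    slope, _intercept = np.polyfit(years - years.mean(), values, 1)
    return slope
```

Applied to the 25 seasonal means for 1981–2005 at each station, this yields one trend value per station, and the bias is the reanalysis trend minus the observed trend.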

Fig. 7

Spatial distribution of the linear trend of soil temperature (°C/year) from a observations, b ERA-Interim/Land, d MERRA-2, f CFSR, h GLDAS-2.0 at 40 cm in summer, and the spatial distribution of the bias of c ERA-Interim/Land, e MERRA-2, g CFSR, i GLDAS-2.0 compared with the linear trend of the observational summer soil temperature at 40 cm. Stations with complete observed records of summer soil temperature at 40 cm are selected for comparison. Reanalysis data are interpolated to these stations using bilinear interpolation

Fig. 8

Similar to Fig. 7, but for soil temperature in winter. Stations with complete observed records of winter soil temperature at 40 cm are selected for comparison

Soil temperature, as a reflection of the land surface thermal conditions, is highly related to the surface energy balance:

$$\text{SR}_{\text{n}} + \text{LR}_{\text{n}} + \text{SH} + \text{LH} = \text{G}$$
(1)

where SRn is the net downward shortwave radiation, LRn is the net downward longwave radiation, SH is the sensible heat flux, LH is the latent heat flux, and G is the soil heat flux (Meng et al. 2012). Therefore, the quality of the estimations of surface energy balance components may influence the estimations of soil temperature. Figure 9 shows the seasonal cycle of the observed soil temperature for 0–320 cm and the soil heat flux. During April–September, soil temperature in the upper layers is higher than in the deeper layers, and during November–March, the deeper layers have a higher soil temperature than the upper layers. Surface soil temperature peaks in July, while soil temperature at 80 and 320 cm peaks in August and October, respectively. In general, the four reanalysis products show a similar annual cycle of the soil heat flux, which turns from negative to positive around February, peaks in April, and turns negative again around September–October. Positive and negative soil heat fluxes correspond to the soil gaining and losing energy, respectively. Therefore, the land surface gains energy from the atmosphere in spring and summer and loses energy to the atmosphere in autumn and winter. In the land surface models, LA-ST is usually calculated based on the heat diffusion equation. During April–September, the land surface transfers the energy received from the atmosphere to the deep soil layers, and during autumn and winter, the deep soil releases energy upward to the land surface.

Fig. 9

a Seasonal cycle of observational mean soil temperature of stations with complete observational records shown in Fig. 1c at 0, 20, 40, 80, 160, and 320 cm. b Seasonal cycle of mean soil heat flux of stations with complete observational records shown in Fig. 1c from the four reanalysis datasets

The land surface receives the largest amount of energy from the atmosphere during April–June, while the highest land surface soil temperature appears around June–August. This phenomenon could be due to the memory that soil temperature retains of climate conditions. We compare the increment of soil temperature from spring to summer in the first layer of the reanalysis products with the averaged soil heat flux during April–June (Fig. 10a, c). We use summer soil temperature minus spring soil temperature because the energy from the atmosphere modifies the summer soil temperature starting from the spring soil temperature. The increment of soil temperature from spring to summer corresponds well to the soil heat flux for the four products, with the CFSR dataset showing the highest values, followed by the ERA-Interim/Land and MERRA-2 datasets. The GLDAS-2.0 dataset has the smallest values of both the soil temperature increment and the soil heat flux. Therefore, in land surface models, the estimations of land surface energy balance components during April–June can significantly influence the estimations of summer soil temperature. In autumn and winter, energy is transmitted upward from the deep layers to the surface. Therefore, the soil heat flux should be anti-correlated with the reduction in soil temperature, as a smaller reduction in surface soil temperature means that more energy is transmitted from the deep soil layers to the surface, leading to more energy being released to the atmosphere. As shown in Fig. 10b, d, the reduction of soil temperature from autumn to winter and the soil heat flux in winter correspond well. With the largest soil temperature reduction from autumn to winter, the CFSR dataset has the smallest winter soil heat flux. The ERA-Interim/Land dataset has the smallest soil temperature reduction and the largest winter soil heat flux.

Fig. 10

a Mean summer and spring soil temperature differences in the first layer, b mean winter and autumn soil temperature differences in the first layer, c mean soil heat flux during April–June, and d mean soil heat flux in winter of four reanalysis products for stations with complete observational records shown in Fig. 1a (a, c), and Fig. 1b (b, d)

Except for the ERA-Interim/Land dataset, the magnitudes of standard deviation for soil temperature and soil heat flux also correspond well (Fig. 11). The CFSR dataset, with the largest standard deviation for the increase of soil temperature from spring to summer, has the largest standard deviation of soil heat flux during April–June. The GLDAS-2.0 dataset has smaller standard deviations than the MERRA-2 dataset for the soil temperature increase and soil heat flux. The comparisons of standard deviations for the decrease of soil temperature from autumn to winter and for soil heat flux in winter are similar to the comparisons in summer. Therefore, the accuracy of estimating the interannual variability of soil temperature can be influenced by the estimation of the interannual variability of the soil heat flux.

Fig. 11

a Mean standard deviation of summer and spring soil temperature differences in the first layer, b mean standard deviation of winter and autumn soil temperature differences in the first layer, c mean standard deviation of soil heat flux during April–June, and d mean standard deviation of soil heat flux in winter of four reanalysis products for stations with complete observational records shown in Fig. 1a (a, c), and Fig. 1b (b, d)

Our previous work investigated the spatiotemporal characteristics of soil temperature memory over China based on the same observations, and emphasized its potential for improving our ability to predict seasonal climate (Yang and Zhang 2015). Owing to missing observational data, investigations of soil memory in some areas, especially northeast China and the Tibetan Plateau, are still scarce. Reanalysis products, as evaluated in this study, can be very helpful in compensating for this lack of data.

Based on the analysis in our previous work, we adopt the red noise method to calculate the soil temperature memory, \(r(\tau) = \exp(-\tau/d)\), where d is the decay time scale that characterizes the red noise process and r(τ) is the autocorrelation coefficient at lag time τ (1 month in this study) (Jones 1975; Delworth and Manabe 1988). The 1-month autocorrelation coefficients between June and July and between July and August are averaged to give the 1-month autocorrelation coefficient of summer, and those between November and December and between December and the following January are averaged to give the 1-month autocorrelation coefficient of winter.
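Solving the red noise relation for the decay time scale gives d = -τ / ln r(τ), which can be sketched as follows for the summer case. The function names are hypothetical, and two details are assumptions not stated in the text: the two monthly autocorrelations are averaged before d is computed, and the memory is treated as undefined when the averaged coefficient does not lie strictly between 0 and 1.

```python
import numpy as np

def lag1_autocorrelation(month_a, month_b):
    """Correlation across years between two consecutive calendar months."""
    return np.corrcoef(month_a, month_b)[0, 1]

def memory_length(r, lag_months=1.0):
    """Decay time scale d (months) from r(tau) = exp(-tau / d).

    Returns NaN when r is not strictly between 0 and 1, where d is undefined.
    """
    if not (0.0 < r < 1.0):
        return float("nan")
    return -lag_months / np.log(r)

def summer_memory(june, july, august):
    """Summer memory from the mean of the June-July and July-August coefficients."""
    r = 0.5 * (lag1_autocorrelation(june, july) + lag1_autocorrelation(july, august))
    return memory_length(r)
```

For example, a 1-month autocorrelation of exp(-0.5), about 0.61, corresponds to a memory of 2 months. The winter value follows in the same way from the November–December and December–January pairs.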

The main spatial characteristic of soil temperature memory is a northwest to southeast gradient, with relatively high values in northwest China and relatively low values in southeast China, which can also be seen in Fig. 12a. In summer, the ERA-Interim/Land dataset mainly underestimates soil temperature memory. It has a longer memory length over northwest China than over south China, which is generally consistent with the observations. The MERRA-2 dataset has a small bias of 0–2 months compared with the observations over the east of China, and a relatively larger bias over northwest China. The CFSR and GLDAS-2.0 datasets show similar spatial distributions of summer soil memory, with an overestimation over north China, and they do not reproduce the northwest–south disparity well. For winter soil memory (Fig. 13), the ERA-Interim/Land dataset shows the northwest–south gradient, with overestimations over north China and northwest China. The memory lengths for the MERRA-2 dataset in northwest China are shorter than in the other datasets. As in summer, the CFSR and GLDAS-2.0 datasets have similar spatial patterns of soil memory in winter. They have a relatively smaller bias in the east than in the west of China. Despite their poor skill in representing the spatial distribution of soil temperature memory, the CFSR and GLDAS-2.0 datasets show the smallest bias of the memory lengths of mean soil temperature of stations with complete records in summer and winter, respectively (Table 1).

Fig. 12

Spatial distribution of the soil temperature memory (months) for a observations, b ERA-Interim/Land, d MERRA-2, f CFSR, h GLDAS-2.0 at 40 cm in summer, and the spatial distribution of the bias for c ERA-Interim/Land, e MERRA-2, g CFSR, i GLDAS-2.0 compared with the observed soil temperature memory at 40 cm in summer. Stations with complete observed records of summer soil temperature at 40 cm are selected for comparison. Reanalysis data are interpolated to these stations using bilinear interpolation

Fig. 13

Similar to Fig. 12, but for soil temperature memory in winter. Stations with complete observed records of winter soil temperature at 40 cm are selected for comparison

4 Discussion

Figure 2 shows that the four reanalysis datasets perform differently at different soil depths, and no single product performs better than the others at all soil depths. The evaluation results for soil temperature at 40 cm may not be applicable to soil temperature at other depths. We also evaluated soil temperature at 80 cm using the same methods (not shown). In general, the four reanalysis datasets show similar, but not identical, spatial distributions of the seasonal mean and standard deviation for soil temperature at 40 and 80 cm. The MERRA-2 dataset reproduces the spatial characteristics of soil temperature at 80 cm better than the other products. Therefore, the evaluation of soil temperature at 40 cm cannot be applied directly to soil temperature at other depths.

In this study, we only investigated the relationship between soil temperature and surface energy balance components to determine the reason for the different behaviors of the reanalysis datasets. In fact, except for energy balance, many land and atmospheric parameters and processes can influence or interact with soil temperature. Albergel et al. (2015) emphasized the importance of orography, soil moisture, and snow cover on the forecast of soil temperature in ECMWF. The orography over the west of China is much more complex than in the east of China, which could be the main reason that reanalysis products perform better in east than in west. Soil moisture, as a crucial parameter of the land–atmosphere interaction, is highly correlated with soil temperature (Subin et al. 2012). Heat transport in the soil column is usually based on the thermal gradient, and the heat conductivity is closely related to soil moisture (Koster et al. 2000). Soil moisture can also influence the surface energy balance by impacting the latent heat flux. The dynamics of snow, which is sensitive to air temperature, can influence the surface energy balance and then alter the soil temperature (Zhang et al. 2005a, b; Khoshkhoo et al. 2015). Soil and vegetation characteristics, and soil frost also play an important role in land energy and water balance (D’Odorico et al. 2007; Wu et al. 2011; Collow et al. 2014). The estimations of these parameters and processes can influence the estimation of soil temperature in reanalysis products to varying degrees.

A major conclusion of our study is that reanalysis datasets generally underestimate soil temperature over China, which is consistent with the evaluation of soil temperature from ECMWF forecasts during 2012 over Europe (Albergel et al. 2015). Ma et al. (2008) also found that the ERA-40, NCEP/NCAR, and NCEP/DOE datasets underestimate air temperature (which is highly correlated with soil temperature) over China. As mentioned above, Albergel et al. (2015) investigated the influence of orography data and snow cover on the estimation of soil temperature. They chose Darrington station (Washington DC, USA) as an example and found that the orography correction raises the surface soil temperature relative to the original data and brings it closer to the observations. Zhao et al. (2008) demonstrated that “topographic correction” can notably improve the quality of surface air temperature in NCEP-NCAR and ERA-40, which generally underestimate the surface air temperature over China. Albergel et al. (2015) also investigated the impact of snow on soil temperature by running the ECMWF land surface model offline for a single grid point over the Wild Basin station (Colorado, USA). They found that in the model the soil does not start warming until the snow depth falls to 10 cm, whereas in the observations the soil starts warming when the snow depth is still 40 cm. They also noted that in the model the soil does not start warming until the snow depth over the entire grid point is less than 10 cm. These two limitations of the land surface model can result in an underestimation of soil temperature. Another factor, land use change, can also lead to an underestimation of soil temperature in reanalysis data. Urbanization (the urban heat island effect) and other land use changes can contribute to a high surface temperature (Zhang et al. 2005a, b), yet these anthropogenic changes in land surface conditions are poorly described in models.

We found that reanalysis data estimate soil temperature better in winter than in summer, especially over the east of China. This may be due to the effect of precipitation on soil temperature. Precipitation influences the soil moisture and latent heat flux of the land surface, which in turn directly or indirectly affect the soil temperature. Influenced by the Asian monsoon, China has more precipitation in summer than in winter. Figure 14 shows the correlation coefficients between soil temperature and precipitation from observations and reanalysis datasets. The observational precipitation data are provided by the China National Climate Center (http://ncc.cma.gov.cn/Website/index.php?ChannelID=43&WCHID=5). The stations providing observational soil temperature data (ST-STATION) differ from those providing observational precipitation data (PRE-STATION). For each PRE-STATION, the soil temperature of the adjacent ST-STATION (whose latitude and longitude each differ from those of the PRE-STATION by less than 1°) is selected, or averaged if there is more than one adjacent ST-STATION, as the soil temperature of this PRE-STATION. PRE-STATIONs with no adjacent ST-STATION are discarded. The reanalysis datasets generally show spatial distributions similar to those of the observations. In the four reanalysis datasets, soil temperature in summer is highly correlated with precipitation over most areas of the east of China, while in winter there is no significant correlation between soil temperature and precipitation over those regions (Fig. 14). Therefore, the quality of the summer precipitation estimates can play an important role in the estimation of summer soil temperature. The observed precipitation data adopted by the reanalysis products (mentioned in Sect. 2) can be very helpful for improving the reliability of the estimated soil temperature in summer.
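The station-matching rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the function name `match_soil_temperature`, the dictionary layout, and the station names and values are all hypothetical, and each soil temperature series is assumed to cover the same years in the same order.

```python
from statistics import mean


def match_soil_temperature(pre_stations, st_stations, max_diff_deg=1.0):
    """Assign a soil temperature series to each precipitation station.

    pre_stations: {name: (lat, lon)}                  (hypothetical layout)
    st_stations:  {name: (lat, lon, [yearly values])} (hypothetical layout)

    An ST-STATION is "adjacent" when both its latitude and its longitude
    differ from those of the PRE-STATION by less than max_diff_deg.
    """
    matched = {}
    for name, (lat, lon) in pre_stations.items():
        neighbours = [
            series
            for st_lat, st_lon, series in st_stations.values()
            if abs(st_lat - lat) < max_diff_deg and abs(st_lon - lon) < max_diff_deg
        ]
        if not neighbours:
            continue  # no adjacent ST-STATION: this PRE-STATION is discarded
        # One neighbour: its series is used as is; several: year-by-year mean.
        matched[name] = [mean(values) for values in zip(*neighbours)]
    return matched


# Made-up stations, for illustration only.
pre_stations = {"P1": (30.0, 110.0), "P2": (45.0, 90.0)}
st_stations = {
    "S1": (30.5, 110.4, [10.0, 12.0]),
    "S2": (29.6, 109.5, [14.0, 16.0]),
    "S3": (40.0, 100.0, [5.0, 6.0]),
}
matched = match_soil_temperature(pre_stations, st_stations)
```

Here P1 has two adjacent soil temperature stations (S1 and S2), whose series are averaged, while P2 has none and is dropped.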

Fig. 14 Spatial distribution of the correlation coefficients between soil temperature at 40 cm and precipitation for (a) observations in summer, (b) observations in winter, (c) ERA-Interim/Land in summer, (d) ERA-Interim/Land in winter, (e) MERRA-2 in summer, (f) MERRA-2 in winter, (g) CFSR in summer, (h) CFSR in winter, (i) GLDAS-2.0 in summer, and (j) GLDAS-2.0 in winter. Correlations of ±0.35, ±0.41, and ±0.52 are significant at the 90, 95, and 99% levels, respectively. The interannual trend of the soil temperature and precipitation data was removed before calculation.
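The detrended correlation reported in Fig. 14 can be illustrated with the following minimal sketch. It assumes that "removing the interannual trend" means subtracting a least-squares linear fit from each yearly series, which the caption does not state explicitly. The function names are hypothetical, and the significance thresholds are simply those quoted in the caption rather than values derived in the code.

```python
import math


def detrend(series):
    """Subtract the least-squares linear trend from an evenly spaced series."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def detrended_correlation(soil_temperature, precipitation):
    """Correlation of the two series after linear detrending (assumed method)."""
    return pearson(detrend(soil_temperature), detrend(precipitation))


def significance_level(r):
    """Map |r| to the confidence level (in %) quoted in the Fig. 14 caption."""
    for threshold, level in ((0.52, 99), (0.41, 95), (0.35, 90)):
        if abs(r) >= threshold:
            return level
    return None  # not significant at the 90% level
```

Detrending before correlating keeps a shared long-term trend (for example, warming together with a drift in precipitation) from inflating the correlation, so the coefficient reflects year-to-year covariation only.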

5 Conclusions

In this study, we evaluated soil temperature from four reanalysis datasets, namely ERA-Interim/Land, MERRA-2, CFSR, and GLDAS-2.0, in terms of climatological mean, interannual variability, linear trend, and memory length by comparison with observational data over China for 1981–2005. The soil temperature averaged over the study period is generally underestimated by all four reanalysis datasets, which may be due to the limitations of the models in reproducing topographic characteristics, snow cover, and land use changes. The national means of the ERA-Interim/Land and GLDAS-2.0 datasets are closer to the observations than those of the MERRA-2 and CFSR datasets. Benefitting from its six soil layers, the MERRA-2 dataset reproduces the winter soil temperature well. For soil temperature at 40 cm, all four datasets reproduce spatial distributions similar to the observations, and the GLDAS-2.0 dataset stands out with a smaller bias in both summer and winter. The ERA-Interim/Land dataset shows a spatial distribution similar to that of the GLDAS-2.0 dataset for soil temperature in summer. The spatial distribution of the interannual variability of soil temperature, as characterized by the standard deviation, is well reproduced by the ERA-Interim/Land dataset in summer and by the CFSR dataset in winter. The reanalysis datasets can generally reproduce the linear trend of soil temperature in summer.

The reanalysis products generally perform better in the east of China than in the west, possibly because the orography over the west of China is much more difficult to describe than that over the east. Furthermore, the reanalysis products perform better in winter than in summer. Soil temperature in the reanalysis data is significantly correlated with precipitation in summer over the east of China, while in winter the correlation is small and insignificant. Hence, the estimation of summer precipitation can have an important influence on the estimation of summer soil temperature.

We have demonstrated that summer soil temperature is highly correlated with the soil heat flux during April–June, and winter soil temperature is related to the soil heat flux in winter, which highlights the importance of the estimation of land surface energy balance components on the estimation of soil temperature.

The four datasets largely reproduce the northwest–southeast gradient of soil temperature memory, and the ERA-Interim/Land dataset is more consistent with the observations than the other datasets in both summer and winter. In addition, the four reanalysis datasets differ in their ability to estimate soil temperature at different soil depths. The results of the evaluation of soil temperature at 40 cm may not be applicable to the other layers.