1 Introduction

El Niño Southern Oscillation (ENSO) is the most prominent mode of climate variability on interannual time scales and affects global climate and weather. Moreover, the impact of ENSO differs from region to region. For example, during El Niño events, some regions suffer from severe droughts while others suffer from severe floods (e.g. Larkin and Harrison 2005). Therefore, Coupled Global Climate Models (CGCMs) have aimed to simulate ENSO more realistically for more accurate prediction of climate and weather. Although considerable progress has been made towards more realistic ENSO simulation, systematic errors still persist (Capotondi et al. 2006; Guilyardi et al. 2009a; Kim and Jin 2011b; Lloyd et al. 2009; Lin 2007; Zhang and Jin 2012). In particular, the Coupled Model Intercomparison Project phase 3 (CMIP3; Meehl et al. 2007) models generally underestimate thermodynamic damping and the positive feedbacks, including the zonal advective and thermocline feedbacks (Kim and Jin 2011b; Lloyd et al. 2009), which are responsible for ENSO variability, and display a large diversity of ENSO amplitude, stability, and teleconnections (Guilyardi 2006; Yu and Kim 2010; Cai et al. 2009; Kim and Jin 2011b).

Recently, CMIP5, which includes generally higher resolution models and a broader set of experiments relative to CMIP3, has been coordinated for the IPCC Fifth Assessment Report (Taylor et al. 2012). Some improvements in ENSO simulation from CMIP3 to CMIP5 have been reported. For example, Kim and Yu (2012) reported that CMIP5 models simulate the observed spatial patterns of the two types (or flavors) of ENSO more realistically and show a significant reduction in the inter-model diversity of the amplitudes of the two types of ENSO. A recent paper by Zhang et al. (2012) demonstrated a modest improvement in simulating the meridional width of the ENSO sea surface temperature (SST) anomaly, which generally tends to be narrow, an improvement attributable to a more realistic simulation of equatorial winds and ENSO periodicity in CMIP5 models. However, it has not yet been documented whether the simulation of the various air-sea feedbacks in the equatorial Pacific Ocean responsible for ENSO variability has improved. This is the aim of our present study.

In this study, we implement the Bjerknes Stability (BJ) index formula for overall linear ENSO stability (Jin et al. 2006; Kim and Jin 2011a) to quantify air-sea feedbacks associated with ENSO variability across the historical simulations of the CMIP3 and CMIP5 models, and to assess whether and how improvements have been made since CMIP3. The BJ index depicts the dependence of the growth rate of the leading coupled ENSO-like mode on the positive feedbacks (i.e., zonal advective feedback, thermocline feedback, and Ekman feedback) and damping processes (i.e., mean advection and net heat flux across the ocean surface). The positive feedback terms are each a product of mean state quantities and a series of coefficients that measure the response sensitivity of the atmosphere (i.e., surface winds) to SST changes, and of the ocean (i.e., zonal currents, upwelling, and thermocline) to wind changes (see summary in Table 1). The BJ index has been found to be a useful tool for a comprehensive and quantitative analysis of the relative contributions of the positive feedbacks and damping processes associated with ENSO variability (Kim and Jin 2011a, b). In particular, the BJ index has been utilized in studies of possible changes in ENSO behavior that can arise from variations in air-sea coupling and/or changes in the mean state (e.g. Kim and Jin 2011a, b; Santoso et al. 2011, 2012).

Table 1 Contributing terms in the BJ index and important parameters

The rest of the paper is organized as follows. In Sect. 2, the models and observations used in the study and the BJ index analysis are briefly described. Section 3 explores the inter-model diversity of ENSO stability, and in Sect. 4 results from coupled models and observations are compared to provide possible guidance for improvement in simulating ENSO variability. A summary and conclusions are presented in Sect. 5.

2 Data and method

In this study, we analyze the historical experiments from 19 CMIP5 models (ACCESS1-0, ACCESS1-3, CCSM4, CNRM-CM5, CSIRO-MK3-6-0, FGOALS-g2, GFDL-ESM2G, GFDL-ESM2M, GISS-E2-H, GISS-E2-R, HadCM3, HadGEM2-CC, HadGEM2-ES, IPSL-CM5A-LR, IPSL-CM5A-MR, MIROC5, MPI-ESM-LR, MRI-CGCM3, and NorESM1-M). The results are compared with those from 12 CMIP3 models (cgcm3.1-t47, cgcm3.1-t63, cnrm-cm3, csiro-mk3.5, gfdl-cm2.0, gfdl-cm2.1, iap-fgoals1.0g, ipsl-cm4, miroc3.2-med, mpi-echam5, mri-cgcm2.3a, and ncar-ccsm3.0; lowercase names are used for the CMIP3 models to distinguish them from the CMIP5 models) used in Kim and Jin (2011b) to gauge the improvement in simulating ENSO stability and its associated air-sea feedbacks in the CMIP5 models. These models were chosen based on the availability of the variables required for a BJ index calculation, such as net surface heat fluxes, wind stresses, ocean potential temperatures, and ocean currents. More detailed information on the CMIP3 and CMIP5 coupled models can be obtained from http://www-pcmdi.llnl.gov/ipcc/model_documentation/ipcc_model_documentation.php and http://cmip-pcmdi.llnl.gov/cmip5/availability.html, respectively.

For estimating the observed BJ index, we use datasets (e.g. ocean potential temperatures, ocean currents, and wind stresses) from the Simple Ocean Data Assimilation Reanalysis version 2.0.2 (SODA; Carton and Giese 2008). For net surface heat fluxes, we use the 40-year European Centre for Medium-Range Weather Forecasts Reanalysis (ERA40; Simmons and Gibson 2000).

The detailed BJ index formulation and analysis procedure can be found in Kim and Jin (2011b). Table 1 lists the five contributing terms in the BJ index and their formulation. The response sensitivity coefficients in the formulas for these terms are obtained by a least-squares regression analysis of a set of anomalous quantities. The anomalous quantities are obtained by removing the long-term mean seasonal cycle and a linear trend. Before applying the analysis, a 7-year running mean is removed from the anomalous quantities to remove decadal and longer variability (Fang et al. 2008; Choi et al. 2009; Dewitte et al. 2012). The BJ index is estimated over the period 1958–1999 for the reanalysis and over 1950–1999 for all the CMIP5 models in the equatorial Pacific Ocean domain (5°N–5°S, 120°E–80°W), as was done for the CMIP3 models in Kim and Jin (2011b).
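The preprocessing and regression steps described above can be sketched as follows. This is an illustrative outline under stated assumptions rather than the procedure of Kim and Jin (2011b): the function names are hypothetical, the input is assumed to be a gap-free monthly series longer than the filter window, and the running mean near the ends of the record is taken over the available points only.

```python
import numpy as np


def monthly_anomaly(x):
    """Remove the mean seasonal cycle and a linear trend from a monthly series."""
    x = np.asarray(x, dtype=float)
    months = np.arange(x.size) % 12
    # Long-term mean for each calendar month.
    clim = np.array([x[months == m].mean() for m in range(12)])
    anom = x - clim[months]
    # Least-squares linear trend of the anomalies.
    t = np.arange(x.size)
    slope, intercept = np.polyfit(t, anom, 1)
    return anom - (slope * t + intercept)


def remove_running_mean(x, window=84):
    """Subtract a running mean (84 months = 7 years) from a series."""
    x = np.asarray(x, dtype=float)
    kernel = np.ones(window) / window
    # Normalizing by the filtered ones-array averages over available points
    # at the edges instead of padding with zeros.
    norm = np.convolve(np.ones_like(x), kernel, mode="same")
    smooth = np.convolve(x, kernel, mode="same") / norm
    return x - smooth


def regression_coefficient(predictor, response):
    """Least-squares slope of response regressed on predictor."""
    return np.polyfit(predictor, response, 1)[0]
```

For example, a coefficient such as μ_a would be estimated by passing filtered SST anomalies as `predictor` and filtered wind stress anomalies as `response`, each averaged over the region appropriate to that coefficient.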

3 Diversity of ENSO stability

Figure 1 shows scatter plots of the BJ index and ENSO amplitude, defined as the standard deviation of the Niño 3.4 index (SST anomalies averaged over 5°N–5°S and 170°W–120°W), from the 12 CMIP3 models (Fig. 1a) and the 19 CMIP5 models (Fig. 1b) to compare the inter-model diversity of ENSO stability as well as amplitude. The colored circles, each representing a model, cluster more tightly around the observed value (black circle) in CMIP5 than in CMIP3. The decrease in the diversity of ENSO stability and amplitude among the CMIP5 models is further highlighted by their respective inter-model standard deviations (STD), which are indicated by red bars in Fig. 1. The STDs of ENSO stability and amplitude decrease from 0.60 year⁻¹ and 0.46 °C, respectively, for the CMIP3 models to 0.39 year⁻¹ and 0.26 °C for the CMIP5 models. These differences are statistically significant according to an F test (p = 0.09 and p = 0.03, respectively). It is worth mentioning that, since the BJ index is a measure of the overall linear ENSO growth rate, it is essentially related to ENSO amplitude, and so a significant inter-model relationship between the BJ index and the Niño 3.4 standard deviation is expected. A significant relationship is found for the CMIP3 models (r = 0.79; Fig. 1a), as reported by Kim and Jin (2011b); however, this linear relationship collapses (r = 0.26) in CMIP5 due to apparent outliers (CNRM-CM5, GFDL-ESM2M, MIROC5, and MPI-ESM-LR). When these CMIP5 outliers are excluded, the correlation coefficient increases to 0.58. One possible reason for these outliers is that those models have different levels of nonlinearity and noise from the other CMIP5 models, which are factors that can also control ENSO amplitude (e.g. An 2008, 2009; Zavala-Garay et al. 2003; Fedorov 2002; Eisenman et al. 2005; Jin et al. 2003, 2007). Those factors are not considered in the formulation of the BJ index, which takes into account only the role of the background state in determining ENSO stability (e.g. Fedorov and Philander 2001; Bejarano and Jin 2008).
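The comparison of inter-model spreads can be reproduced approximately with a variance-ratio F test. The sketch below is an assumption about how such a test might be set up (a two-sided test with 11 and 18 degrees of freedom for the 12 and 19 models); the text does not state these details. It uses only the standard library, and the helper names are hypothetical.

```python
import math


def betainc_reg(a, b, x, n=20000):
    """Regularized incomplete beta function I_x(a, b) by midpoint integration.

    Intended for a >= 1 and b >= 1, where the integrand is bounded.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return total * h / math.exp(log_beta)


def f_test_two_sided(std1, n1, std2, n2):
    """Variance ratio and two-sided p-value for two sample standard deviations."""
    f = (std1 / std2) ** 2
    d1, d2 = n1 - 1, n2 - 1
    if f < 1.0:
        # Put the larger variance in the numerator.
        f, d1, d2 = 1.0 / f, d2, d1
    # CDF of the F distribution expressed through the incomplete beta function.
    x = d1 * f / (d1 * f + d2)
    upper_tail = 1.0 - betainc_reg(d1 / 2.0, d2 / 2.0, x)
    return f, min(1.0, 2.0 * upper_tail)
```

With the rounded STDs quoted above, the variance ratios are (0.60/0.39)² ≈ 2.4 for the BJ index and (0.46/0.26)² ≈ 3.1 for the amplitude; the p-values returned by this sketch depend on the rounding of the inputs and on the two-sided assumption.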

Fig. 1 Scatter plots of BJ index versus ENSO amplitude from (a) CMIP3 and (b) CMIP5 coupled models. The ENSO amplitude is estimated by the standard deviation of the Niño 3.4 index. The BJ index and ENSO amplitude computed with the reanalysis data (SODA and ERA40) are also shown on each panel to represent the observed values. The inter-model standard deviations of ENSO amplitude and BJ index are indicated by red bars

Figure 2a displays the total BJ index and its five contributing terms from the 19 CMIP5 models and the reanalysis. As in the reanalysis, the thermocline feedback (TH in Fig. 2a) is the most dominant of the positive feedback terms in all the CMIP5 models except MRI-CGCM3. As with the CMIP3 models (Fig. 9 of Kim and Jin 2011b), most CMIP5 models still have problems simulating the zonal advective feedback (ZA), which plays a secondary role in ENSO growth in the reanalysis. Only four models (CCSM4, GFDL-ESM2G, GFDL-ESM2M, and MIROC5) have a zonal advective feedback term that is greater than the Ekman feedback (EK). The damping terms in the reanalysis are dominated by the thermodynamic damping term (TD), but this feature is simulated by only 6 of the 19 CMIP5 models: CCSM4, CNRM-CM5, FGOALS-g2, GFDL-ESM2M, GISS-E2-H, and GISS-E2-R. In other words, damping in most of the CMIP5 coupled models is dominated by the mean advection (MA) rather than thermodynamic damping, in contrast to the reanalysis.

Fig. 2 (a) BJ index and its contributing terms in the CMIP5 models and (b) RMSE in each of the 19 models in the order of increasing RMSEs. The observed BJ index is estimated using SODA and ERA40 data. MA stands for damping by mean advection, TD for thermodynamic damping, ZA for zonal advective feedback, TH for thermocline feedback, and EK for Ekman feedback. Correlation coefficients between models and observations, which are calculated with the five contributing terms and the total BJ index, are also displayed. Colored model names indicate models with correlations that are significant above the 90 % level: models in red at the 99 % level, models in green at the 95 % level, and models in blue at the 90 % level. The RMSE in each model is calculated using the departures from the observed (reanalysis) values of each of the five BJ terms and the BJ total. The reanalysis and the models each have six “samples” (five BJ terms and one BJ total), all with an identical sequence, allowing a calculation of correlation and RMSE

To reveal which CMIP5 models are capable of simulating more realistic ENSO stability with respect to the observed one represented by the reanalysis, we calculated correlation coefficients and the root mean square error (RMSE) between each model and the reanalysis, in terms of the relative importance of the five contributing terms and the total BJ index, following the sequence shown in Fig. 2a. Each model has six samples (five BJ terms and one BJ total), all with an identical sequence, allowing a calculation of correlation and RMSE. Figure 2b shows the RMSE of each of the 19 models in order of increasing RMSE. The first eight models in Fig. 2b (GFDL-ESM2M, MIROC5, GISS-E2-R, FGOALS-g2, NorESM1-M, CCSM4, ACCESS1-0, GISS-E2-H), which have small RMSEs, also have correlations with the reanalysis that are significant at the 99 % level (i.e., red colored models in Fig. 2a).
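The six-sample comparison described above amounts to a correlation and an RMSE between two short vectors. A minimal sketch follows; the function names and the ordering of the terms (MA, TD, ZA, TH, EK, total) are assumptions made for illustration, and any numbers passed to it would come from the user's own BJ index calculation.

```python
import numpy as np


def pattern_scores(model_terms, obs_terms):
    """Correlation and RMSE between model and observed BJ terms.

    Both inputs hold the five contributing terms and the BJ total,
    listed in the same order.
    """
    m = np.asarray(model_terms, dtype=float)
    o = np.asarray(obs_terms, dtype=float)
    r = np.corrcoef(m, o)[0, 1]
    rmse = np.sqrt(np.mean((m - o) ** 2))
    return r, rmse


def rank_by_rmse(models, obs_terms):
    """Return model names sorted by increasing RMSE against the observations."""
    scores = {name: pattern_scores(terms, obs_terms)[1]
              for name, terms in models.items()}
    return sorted(scores, key=scores.get)
```

The two scores are complementary: the correlation measures whether a model reproduces the relative importance of the terms, while the RMSE also penalizes a uniform offset in their magnitudes.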

We now explore what causes the reduced inter-model diversity of ENSO stability in CMIP5 compared to CMIP3 by examining the inter-model STDs (i.e., vertical bars in Fig. 3) of the contributing terms of the BJ index, and their associated mean states and response sensitivity coefficients in the CMIP3 and CMIP5 models. The inter-model discrepancy of the zonal advective feedback (ZA in Fig. 3a) and Ekman feedback (EK in Fig. 3a) terms is relatively small in both CMIP phases. The thermocline feedback (TH in Fig. 3a), the most dominant contribution to ENSO growth, has a relatively large inter-model diversity among the positive feedback terms. Figure 3a shows that the inter-model STD of the thermocline feedback is reduced from 0.43 to 0.39 year⁻¹ and that of the Ekman feedback from 0.14 to 0.12 year⁻¹. Although these decreases in the inter-model STD from CMIP3 to CMIP5 are not significant according to an F test, they do contribute to a decrease in the inter-model discrepancy of ENSO stability in CMIP5 relative to that in CMIP3. For the zonal advective feedback, the inter-model STD shows no difference between CMIP3 and CMIP5.

Fig. 3 (a) Comparison of the multi-model ensemble mean (MEM) of the BJ index and its contributing terms from CMIP3 (blue) and CMIP5 (red) with the observed ones (black). Error bars indicate the inter-model standard deviations. (b) MEM of the response sensitivity coefficients (μ_a, ×10⁻³ N/m²/°C; β_h, ×10 °C/N/m²; β_w, ×10⁻⁴ m/s/N/m²; β_u, m/s/N/m²; a_h, °C/°C) and mean state quantities (W, mean upwelling, ×10⁻⁸ m/s; dT/dx, mean zonal temperature gradient, ×10⁻⁷ °C/m; dT/dz, mean vertical ocean temperature gradient, ×10⁻² °C/m)

The reduced diversity in the thermocline feedback is mainly due to the reduced spread of two response sensitivity coefficients, as shown in Fig. 3b: the response of the anomalous thermocline slope to equatorial wind change (β_h) and the effect of thermocline depth variability on ocean subsurface temperature (a_h). As for the Ekman feedback, the mean vertical temperature gradient (dT/dz) contributes to the reduction in its inter-model discrepancy. However, the inter-model diversity of the two damping terms, namely mean advection damping (MA in Fig. 3a) and thermodynamic damping (TD in Fig. 3a), has actually increased from CMIP3 to CMIP5. The inter-model STD increases from 0.16 to 0.23 year⁻¹ for mean advection damping and from 0.29 to 0.34 year⁻¹ for thermodynamic damping. This raises the question of whether there are other reasons that contribute to the decreased diversity of ENSO stability or amplitude in the CMIP5 models. This will be discussed in the next section.

4 Positive feedbacks and dampings in the models and observations

In Fig. 3a, the multi-model ensemble mean (MEM) of the thermocline feedback and zonal advective feedback, together with their error bars, indicates that these primary positive feedback terms are still underestimated in most CMIP5 models, which was also the case in the CMIP3 models (Kim and Jin 2011b), although a moderate improvement is found in the thermocline feedback term in the CMIP5 models. The underestimation is mainly due to a weaker thermocline slope response to wind forcing (β_h) and weaker mean upwelling (Fig. 3b), both involved in the thermocline feedback, and to a weaker response of zonal currents to wind variability (β_u) and a weaker mean zonal temperature gradient, both involved in the zonal advective feedback (Fig. 3b). The weaker surface wind response to SST variability (μ_a) also contributes to the underestimation of the two feedbacks. In simulating thermodynamic damping (TD in Fig. 3a), the coupled models have not improved from CMIP3 to CMIP5. The underestimations of the simulated thermodynamic damping and of the two positive feedback terms tend to offset each other, and thus the MEM of the total BJ index in CMIP3 and CMIP5 is close to the observed one. This error cancellation is likely to contribute to the reduced diversity among the CMIP5 models, in addition to the reduced spread of the thermocline feedback and Ekman feedback terms. In other words, there may be a larger error cancellation across the feedback terms in the CMIP5 models that makes the simulated ENSO stability, as estimated by the BJ index, appear closer to the observed (see Figs. 1b, 3a).

Next, we suggest how the two underestimated positive feedback terms (thermocline and zonal advective feedback) can be improved in the coupled models. First, we determine which factors control each of the two feedback terms by evaluating inter-model correlation coefficients between each feedback term and its contributing factors, as shown in Fig. 4. A higher correlation coefficient with a particular factor suggests that this factor has a stronger control on the simulated feedback intensity. Figure 4 indicates that the thermocline slope response to wind stress change (β_h) controls the thermocline feedback, and the oceanic zonal current response to wind forcing (β_u) controls the zonal advective feedback, since they have the highest significant correlations (0.83 and 0.50, respectively). In other words, models with a stronger thermocline feedback have a stronger coupling between the thermocline slope and surface winds, and models with a stronger zonal advective feedback have a stronger coupling between the zonal ocean currents and wind forcing. Furthermore, the response sensitivity coefficients (β_h and β_u) are underestimated in most of the coupled models (Fig. 3b). Therefore, improvements toward more realistic values of the two coefficients can lead to improvement of the zonal advective and thermocline feedback terms across the coupled climate models. Many previous studies have linked the intensity of the zonal advective feedback and thermocline feedback to the climatological mean ocean zonal temperature gradient and mean upwelling, respectively (e.g. An and Jin 2001; Fedorov and Philander 2001). However, our results suggest that the response sensitivity of the ocean to wind forcing is also an important factor controlling the two primary ENSO feedbacks.

Fig. 4 Scatter plots of (a) thermocline feedback (TH) versus its contributing factors, mean upwelling (W), μ_a, and β_h, and (b) zonal advective feedback (ZA) versus mean zonal temperature gradient (dT/dx), μ_a, and β_u. Blue circles correspond to CMIP3 models, red circles to CMIP5 models, and the black circle to the reanalysis. A regression line (black line) fitted to all CMIP3 and CMIP5 models is also displayed

For a realistic simulation of ENSO variability, a proper simulation of the climatological mean state appears to be important, as demonstrated by previous studies that found a strong linkage between ENSO stability and mean state change (e.g. Battisti and Hirst 1989; An and Jin 2000; Guilyardi 2006; Fedorov and Philander 2001; Bejarano and Jin 2008; Santoso et al. 2011). Here we show that there is indeed a significant relationship between the simulation of the mean state and ENSO feedback processes. To this end, we explore the spatial pattern of inter-model correlation coefficients between selected atmosphere and ocean mean state quantities (e.g. zonal wind stress, thermocline depth defined as the depth of the 20 °C isotherm, and upper ocean temperature) at each grid point and the two response sensitivity coefficients (β_h and β_u), as shown in Fig. 5a–f. The spatial patterns of the inter-model correlation coefficients are obtained using all 31 models, combining the 12 CMIP3 and the 19 CMIP5 models.
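The inter-model correlation pattern described above can be sketched as a correlation across the model dimension at every grid point. The sketch assumes that the mean state fields from all models have already been interpolated to a common grid, which is an assumption made here; the function names are hypothetical. The threshold helper converts a tabulated Student t value into the corresponding correlation magnitude.

```python
import numpy as np


def intermodel_correlation_map(mean_state, coeff):
    """Correlate a per-model scalar with a mean-state field at each grid point.

    mean_state: array of shape (n_models, ny, nx) on a common grid.
    coeff: array of shape (n_models,), e.g. a response sensitivity coefficient.
    Returns an (ny, nx) array of correlation coefficients across models.
    """
    field = np.asarray(mean_state, dtype=float)
    c = np.asarray(coeff, dtype=float)
    fa = field - field.mean(axis=0)  # departures from the multi-model mean
    ca = c - c.mean()
    cov = np.tensordot(ca, fa, axes=(0, 0))
    denom = np.sqrt((ca ** 2).sum() * (fa ** 2).sum(axis=0))
    # Grid points with no inter-model spread yield NaN rather than an error.
    with np.errstate(invalid="ignore", divide="ignore"):
        return cov / denom


def critical_r(t_crit, n_models):
    """Correlation magnitude corresponding to a critical Student t value."""
    dof = n_models - 2
    return t_crit / np.sqrt(dof + t_crit ** 2)
```

For 31 models (29 degrees of freedom), the tabulated two-sided 90 % value of t is about 1.699, so correlations with a magnitude above roughly 0.30 would be shaded under this sketch.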

Fig. 5 Spatial patterns of inter-model correlation coefficients between grid-point mean state quantities, namely (a, d) mean 20 °C isothermal depth, (b, e) mean tropical trade winds, and (c, f) mean vertical ocean temperature along the equator, and two response sensitivity coefficients, (left column) β_h and (middle column) β_u, from 31 coupled models combining 12 CMIP3 and 19 CMIP5 models; (g) spatial pattern of inter-model correlation coefficients between mean SST at each grid point and the thermodynamic damping coefficient. Correlation coefficients that are significant at the 90 % confidence level (Student's t test) are shaded. The MEM of mean 20 °C isothermal depth (thick black solid line) and mixed layer depth (thick black dashed line) is also shown. The mixed layer depth is defined as the depth where the temperature change from the surface ocean temperature is 0.5 °C

Models with a strong thermocline slope response to zonal wind stress have a deeper mean thermocline in the eastern equatorial Pacific (Fig. 5a), as indicated by the statistically significant correlation coefficients. Models that simulate a shallower mean thermocline in the western-to-central tropical Pacific, or a flatter equatorial mean thermocline, tend to have a stronger zonal current response to wind forcing (Fig. 5d). Models with weaker tropical Pacific trade winds, indicating a weaker mean Walker circulation, tend to simulate a stronger response of zonal currents and thermocline slope to zonal wind variability (Fig. 5b, e), although the correlations between the thermocline-slope response and the zonal trade winds are not statistically significant. From Fig. 5c, f, it can also be seen that models with a warmer mean temperature in the upper ocean and a colder mean temperature in the deeper ocean along the equator systematically simulate larger values of the two response sensitivity coefficients. Therefore, the associated stronger upper ocean temperature stratification favors a more intense oceanic response in coupled models. In particular, the stratification across the mean thermocline depth (thick black solid line in Fig. 5c, f) in the eastern equatorial Pacific is important for the strength of the zonal thermocline-slope response to winds. On the other hand, the strength of the zonal current response to the equatorial wind stress appears to be influenced by the temperature structure across the base of the mixed layer (thick black dashed line in Fig. 5c, f; the mixed layer depth is defined as the vertical location where the temperature is 0.5 °C cooler than the sea surface; Levitus 1982) over the western to central equatorial Pacific.
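The two depth diagnostics used above, the 20 °C isotherm depth and the mixed layer depth at which the temperature is 0.5 °C cooler than the surface, can both be obtained from a vertical temperature profile. The sketch below is one possible implementation under stated assumptions: linear interpolation between model levels, depth increasing downward, and the shallowest level standing in for the sea surface. The function names are hypothetical.

```python
import numpy as np


def depth_of_isotherm(temp, depth, target):
    """Depth at which a profile first cools to `target` (linear interpolation).

    temp and depth are 1-D arrays ordered from the surface downward.
    Returns NaN if the profile never reaches the target temperature.
    """
    temp = np.asarray(temp, dtype=float)
    depth = np.asarray(depth, dtype=float)
    below = np.where(temp <= target)[0]
    if below.size == 0:
        return np.nan
    k = below[0]
    if k == 0:
        return depth[0]
    # temp[k-1] > target >= temp[k], so the fraction lies in (0, 1].
    frac = (temp[k - 1] - target) / (temp[k - 1] - temp[k])
    return depth[k - 1] + frac * (depth[k] - depth[k - 1])


def thermocline_depth(temp, depth):
    """Depth of the 20 degC isotherm."""
    return depth_of_isotherm(temp, depth, 20.0)


def mixed_layer_depth(temp, depth, delta=0.5):
    """Depth where temperature is `delta` degC cooler than the top level."""
    return depth_of_isotherm(temp, depth, temp[0] - delta)
```

Applying these functions along the equator to each model's mean temperature section would give the solid and dashed reference lines of the kind drawn in Fig. 5c, f.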

In summary, the two response sensitivity coefficients (i.e., β_h and β_u) controlling the thermocline and zonal advective feedbacks, which are underestimated in most CMIP5 and CMIP3 models, are stronger in the models with weaker tropical Pacific mean trade winds, a flatter mean thermocline along the equator (deeper thermocline in the east and shallower in the west), a warmer mean temperature in the eastern Pacific, and a stronger vertical upper layer stratification. Therefore, the results from Fig. 5a–f suggest that the underestimation of the zonal advective and thermocline feedbacks in the coupled models may be associated with ocean temperatures within the upper 200 m that are too cold, which is evident in Fig. 6. Figure 6 shows the difference in vertical temperature along the equator (averaged over 5°N–5°S) between the MEM of the 31 coupled models and the reanalysis (i.e., SODA). This cold equatorial SST bias is one of the systematic mean state errors that persist in coupled models and has been found to be associated with trade winds that are too strong and/or an oceanic mixed layer that is too shallow (e.g. Lin 2007; Brown et al. 2011; Vannière et al. 2013). The overly cold subsurface ocean temperature in the coupled models also contributes to the weaker thermodynamic damping, as suggested by Fig. 5g; the damping can also be strongly affected by cumulus parameterization schemes in the coupled models (Guilyardi et al. 2009b). Figure 5g indicates that thermodynamic damping is stronger in the models with a warmer mean SST in the Pacific cold tongue area, since warmer SSTs tend to increase evaporative cooling and clouds (Knutson and Manabe 1994).

Fig. 6 Differences in vertical mean ocean temperature along the equator between the MEM and SODA. Contour interval is 0.5 °C

5 Summary and conclusion

We estimated the overall ENSO stability and the relative contribution of positive feedbacks and damping processes to the stability in historical simulations of the 19 CMIP5 models using the BJ index analysis. When compared with the CMIP3 models, the ENSO stability in the CMIP5 models converges more closely around the observed value, which is estimated from the atmosphere and ocean reanalysis data sets. The reduced diversity in ENSO stability can be partly attributed to a reduced inter-model spread of the thermocline feedback and Ekman feedback terms, which is mainly due to a decreased inter-model discrepancy in the strength of the response of the thermocline slope to equatorial wind change, in the effect of thermocline depth change on the subsurface ocean temperature, and in the oceanic mean vertical temperature gradient.

Furthermore, a moderate improvement is found in the thermocline feedback term in the CMIP5 models, although the majority of the CMIP5 models analyzed in this study still underestimate the zonal advective feedback, the thermocline feedback, and the thermodynamic damping terms with respect to the observed, a bias that has persisted from CMIP3 to CMIP5. The underestimated positive feedback and damping terms offset each other and cause the MEM of the total BJ index to be close to the observed one. The error cancellation in CMIP5 appears to be greater, and also contributes to the reduced diversity of ENSO stability in the CMIP5 models.

This study also attempted to give some suggestions for improving the two dominant positive feedback terms and the thermodynamic damping term that are underestimated by coupled models. In particular, the mean surface-to-subsurface ocean temperatures are too low, making the simulated mean vertical ocean stratification weaker than the observed. This weaker vertical ocean stratification, in turn, affects the intensity of the response of the equatorial thermocline slope and of the zonal currents to wind forcing, which control the strength of the thermocline feedback and zonal advective feedback, respectively.

How can the mean vertical ocean thermal stratification affect the intensity of the two response sensitivity coefficients? A stronger vertical stratification in the upper ocean may cause wind stress-forced momentum to be confined to a shallower oceanic mixed layer, leading to an increase in the response sensitivity of zonal currents to wind stress. Also, the vertical ocean stratification may affect the diffusion of ocean temperatures around the thermocline, which is likely to influence the intensity of the thermocline response to wind forcing. If that is the case, a diffusive thermocline, which is also one of the systematic biases in coupled models, may be a possible reason for a simulated weak thermocline feedback. Extensive sensitivity experiments with numerical models may be needed to reveal conclusively why the response sensitivity increases with a stronger stratification.