1 Introduction

North American climate variability is largely influenced by the El Niño Southern Oscillation (ENSO) (Ropelewski and Halpert 1986, 1987, 1996; Kiladis and Diaz 1989; Halpert and Ropelewski 1992; Livezey et al. 1997; Montroy 1997; Gershunov and Barnett 1998; Barsugli and Sardeshmukh 2002; among others). Traditionally, sea surface temperature anomalies (SSTA) during warm ENSO events or El Niño peak in the eastern Pacific, but during some events SSTA warming peaks in the central Pacific (Kao and Yu 2009; Yeh et al. 2009; Kug et al. 2009; Lee and McPhaden 2010; Ren and Jin 2011; Takahashi et al. 2011; Capotondi et al. 2014; Yeh et al. 2014). The events are sometimes referred to as Eastern Pacific (EP) or Central Pacific (CP) El Niño events, respectively (Kao and Yu 2009), and CP events seem to be more prevalent in recent years (Yeh et al. 2009, 2014). During wintertime warm ENSO events, the western part of the United States tends to be warm and the east tends to be cold and wet (Trenberth et al. 1998). However, EP and CP events can cause differing climatic responses over North America (Larkin and Harrison 2005a, b; Ashok et al. 2007; Weng et al. 2007, 2009; Mo 2010; Yu et al. 2012; Yu and Zou 2013).

Of course, no two ENSO events are alike (Wyrtki 1975; Trenberth and Stepaniak 2001), and the position of tropical Pacific SSTA can lead to diverse influence on North American climate (Barsugli and Sardeshmukh 2002; Hoerling and Kumar 2002). The position of SSTA during EP and CP events causes differing local atmospheric responses and tropical Pacific precipitation (Kug et al. 2009), which can affect the extratropics (Wallace and Gutzler 1981; Sardeshmukh and Hoskins 1988; Barsugli and Sardeshmukh 2002). During EP events, anomalous local precipitation is located near the dateline, but for CP events local precipitation is shifted westward, leading to a reduction of precipitation in the eastern Pacific during CP events (Yeh et al. 2014). Anomalous southwesterly low level flow from the western tropical Pacific brings moisture to the southwestern US during CP events, and from the eastern tropical Pacific to the southeastern US during EP events (Weng et al. 2009). The typical 2-meter temperature (T2m) response to EP (CP) events is warming in the north/northeast (northwest) and cooling in the south/southwest (southeast) (Mo 2010; Yu et al. 2012). For precipitation, there is more rainfall over the southwest and less over the southeast during CP events versus EP events (Mo 2010), and CP events can enhance (weaken) dry (wet) impacts (Yu and Zou 2013).

Though not all climate anomalies can be explicitly understood as due to ENSO because of interfering influences from other ocean basins and/or atmospheric processes (Rasmusson and Wallace 1983), knowledge of the SSTA can aid in determining the likelihood of a particular pattern (Hoerling and Kumar 1997). Climate models, however, have difficulty simulating tropical Pacific SSTA differences for EP and CP events and favor EP events (Ham and Kug 2012). For example, using the Climate Forecast System model (CFS), Kim et al. (2012) found realistic simulation of EP and CP events, but CP events were weaker than observed. In the Predictive Ocean Atmosphere Model for Australia (POAMA), Hendon et al. (2009) found distinguishable SSTA patterns of EP and CP events at leads of up to 3 months, but the patterns became indistinguishable at longer leads. Kirtman et al. (2013) found similar results in the North American Multi-Model Ensemble (NMME) system for Intra-seasonal to Interannual (ISI) Prediction. NMME predictions are too warm in the eastern Pacific during CP events as compared to observations, thus strong eastern Pacific events are better predicted (Capotondi et al. 2014). Our discussion focuses on the extratropical (North American) response to EP and CP events in NMME, with particular attention to forecast ensemble agreement.

We concentrate on short lead NMME predictions of January–February–March (JFM) North American precipitation and T2m during EP and CP events. JFM is chosen due to the strong atmospheric teleconnection response to warm ENSO events (Wang and Fu 2000). The winter season coincides with the mature stage of warm ENSO events; thus, the influence of tropical heating is felt more by the climate system than it would be in other seasons (Trenberth et al. 1998). There has also been some study on the influence of CP events on North America in summer (June–July–August), in which there is a tendency for warming and drying in western North America; however, the large-scale atmospheric features that accompany tropical Pacific warming are weaker than those seen in winter months (Weng et al. 2007). As such, these results may not be generalized to other seasons.

Our focus in this manuscript is on warm ENSO events, though cold ENSO events (La Niña) should not be discounted, as they also have associated teleconnection responses (e.g. Hoerling et al. 1997; Dommenget et al. 2013). Similar to warm ENSO events, non-linearity exists among La Niña events as well. For example, Dommenget et al. (2013) showed that strong (weak) warm ENSO events are typically EP (CP) events, whereas strong (weak) cold ENSO events are usually CP (EP) events. As cold ENSO event teleconnections are likely to differ due to this non-linearity, such a discussion could be important for climate predictions, but is beyond the scope of this manuscript.

Predictability is said to increase during strong warm events due to a stronger atmospheric signal (Kumar and Hoerling 1998). Though SSTA amplitude tends to be weaker during CP events compared to EP events, Kug et al. (2009) found that the observed atmospheric response (and therefore teleconnections) may be stronger during events with NINO4 SSTA warming. Yu et al. (2012) categorized observed EP and CP events by strength, identifying “typical” North American T2m impacts for the strongest events in both categories, though they used a larger sample of events than shown here. In a study using outgoing longwave radiation (OLR) classification of warm events, Chiodi and Harrison (2012) found statistically significant patterns of T2m and precipitation over some US regions based on a composite analysis. OLR classification highlights some of the strongest events, 1982–1983, 1986–1987, 1991–1992, and 1997–1998, though it does not classify events into CP and EP. The authors' analysis of the remaining defined warm events using typical definitions showed less significant results, likely due to reduced atmospheric heating during these events.

We discuss the agreement in sign for the NMME forecasts over North America during EP and CP El Niño events. For our purposes here “observational agreement” refers to the NMME ensemble forecasting the correct sign of the observed anomaly and correlation of predictions with observations. This is distinct from “ensemble agreement” which refers to the NMME ensemble members agreeing on the sign of the forecasted anomaly without any reference to the observations. Our main focus is the change in ensemble agreement with change in strength of event, i.e. assessing whether stronger events lead to larger ensemble agreement in NMME. There are additional important metrics of prediction skill not discussed here; for instance, skill metrics for the NMME system as a whole in Kirtman et al. (2014), and for southeast US rainfall prediction in Infanti and Kirtman (2013).

In the NMME predictions, there is less observational agreement of North American T2m and precipitation during CP events versus EP events. We examine whether North American ensemble agreement is tied to east or central Pacific anomalies, and whether stronger events lead to larger ensemble agreement. The influence of the strength and position of tropical Pacific anomalies during warm ENSO events on ensemble agreement varies regionally over North America.

2 Data and methods

Observational estimates for verification of the NMME system include National Climate Data Center Optimum Interpolation Monthly Sea Surface Temperature Analysis (NCDC/OISST) (Reynolds et al. 2002), Climate Prediction Center (CPC) Merged Analysis of Precipitation (CMAP) (Xie and Arkin 1997), and Global Historical Climatology Network/Climate Anomaly Monitoring System (GHCN/CAMS) 2-meter temperature (T2m) (Fan and van den Dool 2008). All datasets are on a 1.0 × 1.0 degree grid, and we calculate anomalies relative to the 1982–2009 climatology. The prediction data considered are Phase-1 North American Multi-Model Ensemble (NMME) (Kirtman et al. 2014) predictions of SSTA, T2m, and precipitation. Models included in NMME are CFS version 1 and 2 (Saha et al. 2006, 2013), GFDL CM2.2 (Zhang et al. 2007), IRI-ECHAM4-f and a2 (DeWitt 2005), CCSM3.0 (Kirtman and Min 2009), GEOS5 (Vernieres et al. 2012), CMC1-CanCM3 and CanCM4 (Merryfield et al. 2013). Ensemble members for each model range from 6 to 24, for 109 total ensemble members. We consider short lead analysis, as skill of North American T2m and precipitation decreases at longer leads (Sect. 3.1). We calculate anomalies with respect to the above climatology for each ensemble member, and pool assuming equal weights to form the NMME ensemble mean anomaly.
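
The pooling step described above can be illustrated with a minimal sketch. It assumes that each member's climatology is the mean of that member's own hindcast series, which the text does not state explicitly; the function name, data layout, and toy values are hypothetical and are not part of the NMME data.

```python
from statistics import fmean

def pooled_ensemble_mean_anomaly(models, target_index):
    """Equal-weight mean of member anomalies for one target season.

    models: {model_name: [member_series, ...]}, where each member_series is a
    list of seasonal values over the hindcast years (hypothetical layout).
    """
    anomalies = []
    for members in models.values():
        for series in members:
            # Assumption: climatology is the member's own hindcast-period mean.
            climatology = fmean(series)
            anomalies.append(series[target_index] - climatology)
    # Every member counts once, so models with more members carry more weight.
    return fmean(anomalies), anomalies

# Toy example: two hypothetical models with 2 and 1 members, three seasons each.
toy = {
    "model_a": [[1.0, 2.0, 3.0], [2.0, 2.0, 5.0]],
    "model_b": [[0.0, 1.0, 2.0]],
}
mean_anom, member_anoms = pooled_ensemble_mean_anomaly(toy, target_index=2)
```

Pooling members rather than averaging the model means implies that a model contributing 24 members influences the ensemble mean more than one contributing 6.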

EP and CP ENSO events are defined using the consensus method from Yu et al. (2012). The consensus method combines EP/CP index identification from Kao and Yu (2009), NINO3 and NINO4 index identifications from Yeh et al. (2009), and the El Niño Modoki index of Ashok et al. (2007). Yu et al. (2012) focus on December–January–February (DJF), and we instead concentrate on JFM as this season shows the strongest extratropical response (Wang and Fu 2000). In this manner, observed EP events occur during JFM1983, 1987, 1998, 2007; CP events during JFM1992, 1995, 2003, 2005. JFM1998 and JFM1995 serve as examples of EP and CP events, respectively, as in Kirtman et al. (2013). JFM1998 is chosen as an example due to its overall strength. JFM1995, though not the strongest event in the group, shows the largest contrast between central and eastern Pacific warm SSTA. The consensus method also includes JFM1988 as a weak CP event, but it is not included in this analysis in order to use an equal number of EP and CP events. A substantial CP event occurred in winter 2009–2010; however, JFM2010 is not included in the defined hindcast period. The nomenclature of EP and CP is common in the literature surrounding ENSO diversity, but is a qualitative description as events occur with a large range of longitudinal distributions (Capotondi et al. 2014).

Figure 1a shows observed NINO3.4 (5S to 5N; 190E to 240E) (shading), NINO3 (5S to 5N; 210E to 270E) (black contour) and NINO4 (5S to 5N; 160E to 210E) (green contour) 3-month running mean SSTA. NINO3.4 SSTA above 0.4 °C indicates an El Niño event (e.g. Trenberth 1997). Defined events are indicated by open circles. Regions of interest over North America, loosely based on Barsugli and Sardeshmukh (2002) and Schubert et al. (2009), are shown in Fig. 1b–e. Regions include Northwest (NW), Southwest (SW), Great Plains (GP), Northeast (NE), and Southeast (SE) North America. We stratify our analysis based on NINO3 and NINO4 regions to represent EP and CP events, respectively. Though NINO4 can be considered “western Pacific”, it captures the CP warming and allows us to differentiate between the two events; thus, we refer to NINO3 as eastern Pacific warming and NINO4 as central Pacific warming to match the event classes. Observed expected impacts due to EP and CP events in each region are shown in Fig. 1b, c for precipitation and Fig. 1d, e for T2m. Expected impacts are based on observed composites of EP and CP events. Shading represents regions where the composite is significantly different from zero using a Student’s T Test with (p < 0.1), and contours represent regions where the composite is significantly different from zero with (p < 0.05).
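
As an illustration of the composite significance test, the sketch below applies a one-sample, two-sided Student's t test to the anomalies of the four events in one composite at a single grid point. The one-sample, two-sided form is an assumption, since the text does not specify the variant used; the critical values are standard tabulated values for 3 degrees of freedom, and the anomaly values are hypothetical.

```python
from math import sqrt
from statistics import fmean, stdev

# Two-sided critical values of Student's t for n - 1 = 3 degrees of freedom
# (four events per composite). Tabulated values; the test setup is assumed.
T_CRIT_90 = 2.353  # p < 0.1
T_CRIT_95 = 3.182  # p < 0.05

def composite_t(event_anomalies):
    """Composite mean and t statistic against a null hypothesis of zero mean."""
    n = len(event_anomalies)
    mean = fmean(event_anomalies)
    standard_error = stdev(event_anomalies) / sqrt(n)  # sample std, n - 1
    return mean, mean / standard_error

# Hypothetical anomalies at one grid point for the four events of a composite.
mean, t = composite_t([1.2, 0.8, 1.5, 0.9])
significant_90 = abs(t) > T_CRIT_90
significant_95 = abs(t) > T_CRIT_95
```

With only four events per composite, the critical values are large, so a composite anomaly must be consistent across events to pass either threshold.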

Fig. 1

a Observed SSTA in NINO3.4 (shading), NINO3 (black contour), NINO4 (green contour). Open black/green circles indicate EP/CP events based on consensus method. Gray dashed lines placed at ±0.4 °C as in the typical definition of an El Niño event. b “Typical” CP precipitation impacts where green indicates positive precipitation anomalies and brown indicates negative precipitation anomalies. Typical impacts are calculated based on an observed composite of CP events. Shading (white contour) indicates areas where composite anomaly is significantly different from zero at 90 % (95 %) confidence level based on a Student’s T Test. Boxes indicate regions (NW, SW, etc.). c As in b, but for EP impacts. d As in b, but for T2m where blue indicates negative T2m, red indicates positive. e As in d, but for EP impacts

3 Results

3.1 Anomaly correlation

The purpose of this section is to provide an estimate of the deterministic JFM hindcast skill in the NMME system, as the results concerning individual events must be placed in context with the hindcast skill. Anomaly correlation coefficient is used as an overall measure of skill for the JFM seasonal hindcasts (e.g. Wilks 2005) for SST/T2m and precipitation at “short” and “long” lead times (Fig. 2a–d). A short (long) lead-time hindcast for JFM has a January (September) start month. For example, if we were to predict JFM1998, a “short-lead” hindcast would be initialized in January 1998, predicting January 1998 (0.5 lead hindcast), February 1998 (1.5 lead hindcast), and March 1998 (2.5 lead hindcast), averaged over the season. A “long-lead” hindcast is initialized in September 1997, predicting January 1998 (4.5 lead hindcast), February 1998 (5.5 lead hindcast) and March 1998 (6.5 lead hindcast), averaged over the season. Anomaly correlation provides a deterministic comparison of anomalous fields; in this case NMME hindcast SST, T2m and precipitation versus observed SST, T2m, and precipitation. The period of reference for this Figure is the defined hindcast period of 1982–2009, and anomalies are calculated with respect to this climatology. It is typical to say that a “skillful” hindcast or forecast has correlations above 0.6 (Wilks 2005), though we have calculated significance of the correlation based on a Student’s T Test. The 99 % significance level is contoured, and correlation values are shaded.
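
A minimal sketch of the anomaly correlation at a single grid point follows, assuming a centered (Pearson) correlation between the ensemble mean hindcast anomalies and the observed anomalies over the hindcast years, and the usual t transform of the correlation with n - 2 degrees of freedom. Both choices are assumptions, as the exact formulas are not given in the text; the function names and sample values are hypothetical.

```python
from math import sqrt

def anomaly_correlation(forecast, observed):
    """Pearson correlation between two anomaly time series of equal length."""
    n = len(forecast)
    f_mean = sum(forecast) / n
    o_mean = sum(observed) / n
    cov = sum((f - f_mean) * (o - o_mean) for f, o in zip(forecast, observed))
    f_var = sum((f - f_mean) ** 2 for f in forecast)
    o_var = sum((o - o_mean) ** 2 for o in observed)
    return cov / sqrt(f_var * o_var)

def correlation_t_statistic(r, n):
    """t statistic for a correlation r from n samples (n - 2 degrees of freedom)."""
    return r * sqrt((n - 2) / (1.0 - r * r))

# Hypothetical ensemble mean and observed JFM anomalies for five years.
hindcast = [0.5, -0.2, 1.0, -0.8, 0.1]
observed = [0.4, -0.1, 1.2, -1.0, 0.3]
r = anomaly_correlation(hindcast, observed)
t_stat = correlation_t_statistic(r, len(hindcast))
```

For the 28 JFM seasons of the 1982–2009 hindcast period, this form of the test would use 26 degrees of freedom.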

Fig. 2

Anomaly correlation of NMME with observations for JFM SST, T2m, and precipitation. Anomaly correlation is calculated with observed anomaly (1982–2009 climatology) and ensemble mean NMME JFM predictions (1982–2009 climatology). Shading indicates correlation coefficient (R), and contour indicates significance at the 99 % confidence level based on a Student’s T Test. a Anomaly correlation of NMME SST (ocean based shading) and NMME T2m (land based shading) versus observation for short lead prediction verifying in JFM. b As in a, but for long lead prediction verifying in JFM. Note that SST correlation coefficient shading ranges from ±0.6 to ±1.0. c Anomaly correlation of NMME precipitation versus observation for short lead prediction verifying in JFM. d As in c, but for long lead prediction verifying in JFM

Skill is significant in the tropical Pacific for SSTA and precipitation at both lead times. Note that SSTA correlation coefficient shading ranges from ±0.6 to ±1.0. Skill decreases for land-based T2m overall, but there is significant skill in the northwest at short leads (Fig. 2a). Precipitation skill is significant in the western US and coastal southeastern US at short leads (Fig. 2c). Skill is largely insignificant for land-based T2m and precipitation at long leads in most regions (Fig. 2b, d). Though short lead skill is only significant in some regions for T2m and precipitation, T2m shows largely positive correlation over much of North America, compared to precipitation, which does not. It has also been shown that hindcast skill is larger during ENSO years versus non-ENSO years, particularly over the Pacific-North American region, when considering both T2m and precipitation predictions (Shukla et al. 2000; Weigel et al. 2008).

Because of the (comparatively) lower skill in long-lead hindcasts, we focus our results on short lead in order to avoid complications due to decreased skill over land. In addition, it was shown in Kirtman et al. (2013) (among others) that CP events tend to show excess eastern Pacific warming at longer leads, and in utilizing the short lead analysis we obtain a larger contrast between central and eastern Pacific warming. This decrease in skill is mainly due to initialization strategies in the NMME system. Broadly, some of the models included in NMME include ocean, atmosphere, and land initialization in their hindcast strategy, which add to skill for a period of up to 1 month (atmosphere), 2–3 months (land), and approximately 6 months (ocean) (e.g. Dirmeyer 2003; Koster and Suarez 2003; Paolino et al. 2011, among others). The forecast centers involved in NMME use differing initialization strategies, and a full assessment of skill of the NMME hindcasts is beyond the scope of this paper, but can be found in Kirtman et al. (2014).

3.2 Forecast observational agreement

JFM1995 and 1998 are used as examples of CP and EP events, respectively, and are shown for observations and NMME predictions (Fig. 3). Remaining events are not shown, but were similarly analyzed. Observed anomalies (shading) are computed based on a 1982–2009 climatology, defined as the difference between monthly mean precipitation, SST, or T2m and the corresponding climatology, averaged over the given JFM season. Ensemble mean anomalies (shading) are also computed based on a 1982–2009 climatology using a similar strategy. The climatology is removed from each ensemble member individually, and the resulting JFM anomalies are pooled assuming equal weights to form the ensemble mean anomaly, as in Infanti and Kirtman (2013). Contours in Fig. 3e–h show the percent of ensemble members that agree with the sign of the observed anomaly (above or below zero). The 70 and 95 % observational agreement contours are shown. Hatching shows NMME ensemble agreement on the sign of the anomaly at the 70 % (80 %) level over land (oceans). Ensemble agreement is discussed here and more thorough discussion is found in Sect. 3.3. We also include a brief analysis of the tropical Pacific for completeness.
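
The two agreement measures can be sketched for a single grid point as follows. This is an illustration only: the function names and member values are hypothetical, and counting an anomaly of exactly zero as "below zero" is an assumption.

```python
def percent_positive(member_anomalies):
    """Percent of ensemble members with an above-zero anomaly."""
    positive = sum(1 for a in member_anomalies if a > 0)
    return 100.0 * positive / len(member_anomalies)

def ensemble_agreement(member_anomalies):
    """Percent of members sharing the majority sign; observations are not used."""
    pos = percent_positive(member_anomalies)
    return max(pos, 100.0 - pos)

def observational_agreement(member_anomalies, observed_anomaly):
    """Percent of members whose sign matches the sign of the observed anomaly."""
    pos = percent_positive(member_anomalies)
    return pos if observed_anomaly > 0 else 100.0 - pos

# Ten hypothetical member anomalies at one land grid point.
members = [0.4, 0.1, -0.2, 0.3, 0.6, -0.1, 0.2, 0.5, 0.7, -0.3]
ens = ensemble_agreement(members)                 # 70.0: meets the land threshold
obs_wet = observational_agreement(members, 0.8)   # 70.0: observed sign matched
obs_dry = observational_agreement(members, -0.5)  # 30.0: observed sign missed
```

The last two lines show why the measures are distinct: the same ensemble can agree strongly with itself while most members have the wrong sign relative to observations.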

Fig. 3

Observed (a–d) and hindcast (e–h) SST, T2m, and precipitation anomalies, ensemble agreement, and observational agreement for representative seasons of JFM1995 (CP event), and JFM1998 (EP event). All calculated anomalies are based on a period of 1982–2009. a Observed SSTA (ocean based shading) and T2m anomaly (land based shading) in JFM1995. b Observed SSTA and T2m anomaly in JFM1998. c Observed precipitation anomaly in JFM1995. d Observed precipitation anomaly in JFM1998. e As in a, but for NMME short lead prediction verifying in JFM1995. Shading represents ensemble mean anomaly, in which the climatology is removed from each ensemble member, and is then pooled assuming equal weights to form the ensemble mean anomaly. Contours represent percent of ensemble members agreeing with observed sign of anomaly, at 70 and 95 % levels. Hatching represents ensemble agreement on the sign of the anomaly at 70 % over land and 80 % over oceans. f As in e, but for JFM1998. g JFM1995 NMME hindcast precipitation. h As in g, but for JFM1998

NMME JFM1995 SSTA (Fig. 3e) shows ensemble and observational agreement in the tropical Pacific. However, there is relatively large error as measured by RMSE in the eastern tropical Pacific due to the forecasts producing too much warming; note the neutral observed eastern Pacific SSTA in Fig. 3a. NMME JFM1995 T2m (Fig. 3e) shows little ensemble and observational agreement over North America, but captures the northwest-to-southeast warm-to-cool structure (Mo 2010; Yu et al. 2012; Fig. 1d) in the ensemble mean. NMME JFM1998 SSTA (Fig. 3f) also shows ensemble and observational agreement in the tropical Pacific, and closely matches observations (Fig. 3b). NMME JFM1998 T2m (Fig. 3f) shows relatively better observational and ensemble agreement than JFM1995 T2m, and also shows the expected result in the mean (Fig. 1e).

Studies have shown that tropical Pacific precipitation during El Niño events is important for extratropical teleconnections, where changes in latent heat release and subsequent vertical motions and winds can propagate into the extratropics (Sardeshmukh and Hoskins 1988; Trenberth et al. 1998; Barsugli and Sardeshmukh 2002; McPhaden 2004). As stated in the introduction, EP and CP events have marked differences in their respective local response, which can in turn lead to differences in extratropical responses (Yeh et al. 2014; among others). NMME robustly captures tropical Pacific precipitation and agrees with observations in JFM1998 (Fig. 3h). JFM1995 (Fig. 3g) shows little ensemble agreement and only agrees with observation in a small region of the western Pacific. Land based precipitation over North America in NMME captures expected large scale features in the mean for both JFM1995 and 1998 (see Fig. 1b, c), but JFM1995 shows low ensemble and observational agreement.

To further examine the agreement with observations in NMME predictions, we calculate pattern (spatial) correlation for each ensemble member versus observational estimates for all North American regions and defined events. For each region and event, we calculate the coefficient of determination (squared correlation) for each NMME ensemble member, and determine the average over all ensemble members (Table 1). There is a large amount of variance in correlation among regions and events, thus we identify the two events in each region with the highest (red) and lowest (blue) average correlation. Of the highest correlations for precipitation (Table 1), 7/10 (3/10) occur during EP (CP) events. Of the lowest correlations, 2/10 (8/10) occur during EP (CP) events. T2m results are similar (Table 1). Regionally, the highest precipitation correlations occur in the NW, SW, and SE. For T2m, the highest correlations occur in the NW, NE, and SE.
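
The averaging of the squared pattern correlation can be sketched as below, for one region and one event. Each field is represented as a flat list of grid-point anomalies within the region; the absence of area weighting, the function names, and the toy values are assumptions made for illustration.

```python
from math import sqrt

def pattern_correlation(field_a, field_b):
    """Spatial (pattern) correlation between two fields on the same grid points."""
    n = len(field_a)
    a_mean = sum(field_a) / n
    b_mean = sum(field_b) / n
    cov = sum((a - a_mean) * (b - b_mean) for a, b in zip(field_a, field_b))
    a_var = sum((a - a_mean) ** 2 for a in field_a)
    b_var = sum((b - b_mean) ** 2 for b in field_b)
    return cov / sqrt(a_var * b_var)

def mean_coefficient_of_determination(member_fields, observed_field):
    """Average of the squared pattern correlation over all ensemble members."""
    r2 = [pattern_correlation(m, observed_field) ** 2 for m in member_fields]
    return sum(r2) / len(r2)

# Hypothetical region with four grid points and three ensemble members.
observed = [1.0, 2.0, 3.0, 4.0]
members = [
    [1.0, 2.0, 3.0, 4.0],  # same pattern as observed, r = 1
    [4.0, 3.0, 2.0, 1.0],  # reversed pattern, r = -1
    [1.0, 3.0, 2.0, 4.0],  # partly matching pattern, r = 0.8
]
avg_r2 = mean_coefficient_of_determination(members, observed)
```

In this toy case the reversed member contributes the same squared correlation as the matching member, so the averaged coefficient of determination measures pattern similarity regardless of sign.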

Table 1 Average spatial correlation coefficient of determination in each North American region and season for NMME short lead prediction versus observation verifying in JFM of each defined warm event

For precipitation, the highest correlations occur during the 1983 and 1998 EP events (indicated by superscript plus in Table 1). These are among the strongest EP events. Low correlations vary among weak and strong events. For T2m, the highest correlations occur during the 1992 and 2003 CP events, which have strong central Pacific warming. The overall majority of high T2m correlations occur during 1987, a somewhat weak EP event. Strong EP events correspond to larger regional agreement between NMME precipitation and observational estimates, and T2m shows mixed results in that the strongest warm ENSO events do not necessarily correspond to larger agreement between NMME and observations.

There is more observational agreement over North America during EP events versus CP, but there is apparent regionality. The NW and SE both show (respectively) higher correlation and observational agreement for precipitation during strong EP events, though JFM2003 (strong CP) also had correlations among the highest. Additionally, JFM1998 (EP) showed larger ensemble agreement over North America than JFM1995 (CP), but agreement is mainly in the Northwest and Southeast.

This apparent regionality has two likely causes, related to NMME hindcast skill as well as the regional response to warm ENSO events over North America. Hindcast skill, as shown in Fig. 2, is significant for T2m in the Northwest US. Though this is a measure of deterministic skill for the JFM season overall, we find some similarities with the observational agreement analysis. For example, T2m results are mixed, but the NW shows some high pattern correlation (Table 1), consistent with JFM anomaly correlation skill (Fig. 2a). Though JFM T2m anomaly correlation is not significant in the NE and SE, it is non-zero (Fig. 2a), which may explain the more mixed results. Comparing this to precipitation skill, we find that the JFM anomaly correlation is significant in the western and coastal southeastern US with some non-zero skill in the NE, but close to zero elsewhere (Fig. 2c). This is a likely factor in the observational agreement in these regions.

A second consideration is the typical response to warm ENSO events over North America. Wintertime warm episode relations are shown in Figure 2 of Trenberth et al. (1998), which highlights the same regions we have found in this analysis, though does not stratify into EP and CP events. This regionality is due to changes in the large-scale atmospheric circulation patterns associated with warm ENSO events, in that the winter jet shifts southward bringing winter storms southward as well, and shifts even further south during CP events, causing differing regional influences (Yu and Zou 2013).

We further examine the regional link between strength of tropical Pacific EP and CP events and North American precipitation and T2m in Sect. 3.3, focusing on ensemble agreement.

3.3 Forecast ensemble agreement

As shown above, JFM1995 showed relatively less ensemble agreement over North America for both T2m and precipitation than JFM1998. We expect more predictability during stronger ENSO events (Kumar and Hoerling 1998), and thus might expect larger ensemble agreement during EP events as they are typically stronger than CP events. This section focuses on the position and strength of NMME SSTA and precipitation in the tropical Pacific versus NMME ensemble agreement over North America during warm ENSO events. Specifically, we discuss the relationship between percent ensemble agreement in NMME forecasts regionally over North America versus event strength using anomalies in the eastern and central/western tropical Pacific.

For the remaining warm events (not shown), we analyze ensemble agreement similarly to above. Notable cases include JFM1992, a CP event with some ensemble agreement in both variables, and JFM2007, an EP event with low ensemble agreement in both variables. Though JFM1992 is classified as CP, anomalies are large in both NINO3 and NINO4 regions (see Fig. 1a, NMME results similar). Similarly, JFM2007 is classified as EP, but anomalies are of similar size in the east and west Pacific. JFM2003 (CP) shows ensemble agreement in Northwestern T2m, and NMME also predicts some warming in the eastern Pacific.

We define ensemble agreement by the percent of ensemble members agreeing on the sign of the anomaly in the NMME forecast, i.e. “Percent Positive” or “Percent Negative.” For defined warm ENSO events, we determine the percent of ensemble members that agree on above or below zero anomalies in each North American region for precipitation and T2m, and plot this value versus the corresponding NMME ensemble mean SSTA in NINO3 (Fig. 4a, c) and NINO4 (Fig. 4b, d) for the given event. The reader should note that all warm events are considered for this analysis, irrespective of their classification. In this manner, CP (EP) events may show neutral or weak SSTA in NINO3 (NINO4).

Fig. 4

Percent of ensemble members agreeing on the sign of the anomaly during defined warm events in all North American regions. Left y-axis shows percent of ensemble members agreeing on above zero anomalies, right y-axis agreeing on below zero anomalies. X-axis shows SSTA amplitude in NINO3 and NINO4 during the given event. a NMME NINO3 SSTA versus North American regional precipitation ensemble agreement. b As in a, but for NINO4 SSTA. c As in a, but for NINO3 versus T2m. d As in b, but for NINO4 versus T2m

As a specific example, consider the green line and points corresponding to the SE in Fig. 4a. Each point corresponds to a defined warm ENSO event. On the y-axes, we show the percent of ensemble members that agree on the sign of the anomaly (above zero on the left, below zero on the right). Thus, for the top rightmost point, approximately 70 % (30 %) of ensemble members predict an above zero (below zero) precipitation anomaly in the SE during the given event. The x-axis shows the corresponding NMME ensemble mean SSTA in NINO3 during the event, approximately a 2.6 °C anomaly, a strong warm event. Similarly, for the bottom leftmost point, approximately 45 % (55 %) of ensemble members predict an above zero (below zero) precipitation anomaly in the SE, corresponding to a 0.6 °C SSTA in the NINO3 region, or weak warming. We repeat this analysis for all warm events. A linear regression line is fit through the points, and the goodness of fit (R2) and slope are calculated for this line.
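
The regression step can be illustrated with an ordinary least squares fit, which is assumed here to be the fitting method. In the sketch below, only the first and last points (0.6 °C with 45 % and 2.6 °C with 70 %) follow the approximate values quoted above; the six intermediate points are hypothetical and are included only to give eight events.

```python
def linear_fit(x, y):
    """Ordinary least squares fit of y on x; returns slope, intercept, and R2."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    r_squared = (sxy * sxy) / (sxx * syy)  # squared correlation of x and y
    return slope, intercept, r_squared

# NINO3 ensemble mean SSTA (deg C) and percent of members with above-zero
# SE precipitation anomaly for eight warm events (mostly hypothetical values).
nino3_ssta = [0.6, 0.8, 1.0, 1.2, 1.5, 1.9, 2.2, 2.6]
percent_pos = [45.0, 50.0, 48.0, 55.0, 58.0, 62.0, 66.0, 70.0]
slope, intercept, r_squared = linear_fit(nino3_ssta, percent_pos)
```

A positive slope in this setting means that the percent of members predicting an above-zero anomaly rises with NINO3 SSTA amplitude, and the percent predicting a below-zero anomaly falls by the same amount.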

Table 2 shows goodness of fit and slope calculations for regional North American precipitation and T2m versus NINO3 and NINO4 SSTA, based on the linear regression lines in Fig. 4. Significance of R2 is calculated using an F-Test with 6 degrees of freedom at 90 % confidence level and is indicated by italicized text. Values in which the slope is significantly different from 0 at 90 % confidence level with 6 degrees of freedom are similarly indicated, based on a Student’s T Test. Figure 5a–d and Table 3 are similar, but versus NMME ensemble mean tropical Pacific precipitation anomaly during the warm ENSO events. A positive (negative) slope indicates that as the amplitude of SSTA or precipitation in NINO3 or NINO4 increases, the ensemble agreement of positive precipitation or T2m increases (decreases) in the given region. Correspondingly, for agreement on negative anomalies, results will be opposite. A neutral slope and/or insignificant R2 indicates lack of association between changing amplitude of tropical Pacific SSTA or precipitation during warm events and ensemble agreement, i.e. that both weak and strong warm events correspond to similar ensemble agreement.

Table 2 R2 (explained variance or goodness of fit) and slope of regression line from Fig. 4
Fig. 5

As in Fig. 4, but NINO3 and NINO4 precipitation anomaly versus North American regional ensemble agreement

Table 3 As in Table 2, but for NMME precipitation amplitude in NINO3 and NINO4 versus North American precipitation and T2m

Figure 4a, b and Table 2 show NINO3 and NINO4 NMME SSTA versus percent of ensemble members predicting positive and negative rainfall during defined warm ENSO events in each North American region. NINO3 SSTA explains significant variance in the SE and SW, and NINO4 explains significant variance in the NW and NE (Table 2). Slope of the linear regression line is significant for SE precipitation ensemble agreement versus NINO3 SSTA (Fig. 4a; Table 2); and for GP, NW, SW, NE ensemble agreement versus NINO4 SSTA, though explained variance is not significant for the GP and SW (Fig. 4b; Table 2). NINO3 also explains significant variance in the SW (Table 2), but the slope is neutral, so it is likely that changing SSTA amplitude does not play a large role in precipitation ensemble agreement in this region. Overall, precipitation ensemble agreement in the SE, NW, and NE shows the strongest relationship with tropical Pacific SSTA amplitude increase.

The large explained variance and positive slope in the SE due to NINO3 SSTA indicate that as the strength of the event in NINO3 increases, ensemble agreement in predicting positive precipitation anomalies increases. The expected impact during EP events is positive precipitation in the southeast (Fig. 1c). In the NW, the larger NINO4 explained variance and negative slope are also encouraging, as the expected impact during CP events is negative (Fig. 1b), and the forecast agreement of negative precipitation anomalies increases as the strength of the event in NINO4 increases. The NE is negative during CP events and mixed during EP (Fig. 1b, c) and we find a stronger relationship with NINO4 SSTA. We note that there is one strong CP event (rightmost points in Fig. 4b, d) that may be adding to the negative slope, and we further examine ensemble agreement versus event amplitude in later Figures.

For SSTA versus T2m (Fig. 4c, d; Table 2), NINO3 SSTA explains significant variance of ensemble agreement in the SE, GP, and SW. NINO4 SSTA does not explain significant variance in any region. T2m is more closely related to anomalies in the NINO3 region, but the neutral slope in the GP, NW, and NE implies that the strength of the event does not exert strong control on ensemble agreement. Event strength is important in the SW and SE for T2m: these regions are cold when there is NINO3 warming (Fig. 1e) and show larger ensemble agreement for strong warm events in the NINO3 region (Fig. 4c; Table 2). In the NW there is large ensemble agreement on positive T2m overall; both weak and strong events in the NINO3 and NINO4 regions correspond to percent agreement over approximately 60% for most of the events.

We conduct similar analyses using NINO3 and NINO4 NMME precipitation versus North American regional ensemble agreement in Fig. 5a, b and Table 3. Precipitation is perhaps a better proxy due to its control on atmospheric teleconnections (Sardeshmukh and Hoskins 1988; Trenberth et al. 1998; Barsugli and Sardeshmukh 2002; McPhaden 2004). This analysis further refines the above conclusions. There is significant variance explained for SE precipitation ensemble agreement by NINO3 precipitation, and in the NW and NE by NINO4 precipitation. The slope does not pass our significance test in any region, but is of the anticipated sign given the expected impacts in Fig. 1. Similarly, for North American T2m, we find that NINO3 precipitation explains significant variance for SE, GP, SW, and NE ensemble agreement, but the slope does not pass our significance test. We hesitate to state that there is no relationship between changing NINO3 and NINO4 precipitation amplitude and North American ensemble agreement; the slope simply does not pass the current significance test. Lowering the significance level would change this conclusion.

To further examine the influence of event strength in NINO3 and NINO4 on precipitation ensemble agreement, we consider conditional probability density functions. We focus on precipitation results for the SE and NW, as these regions showed an increase in ensemble agreement with stronger events, as well as correlation with observations.

Figure 6 shows the probability density function (PDF) of NINO3 and NINO4 SSTA (Fig. 6a) and precipitation (Fig. 6b) based on a kernel estimate of the PDF. The PDFs are computed using all standardized ensemble member SST and precipitation anomalies during defined warm ENSO events. Ensemble members are binned based on the standard deviation. For example, bin 1 in Fig. 6a contains all ensemble members predicting SSTA within 0–0.5 standard deviations from the mean (weak or neutral SSTA). Bin 5 in Fig. 6a contains all ensemble members predicting SSTA within 2.0–2.5 standard deviations from the mean (strong warm SSTA). Precipitation in Fig. 6b is similar but with additional bins to include the full range of possible precipitation values. We use this classification to plot North American precipitation PDFs corresponding to weak to strong tropical Pacific SSTA and precipitation.
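The binning and kernel estimate described above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the authors' procedure: the names `bin_index` and `gaussian_kde` are hypothetical, distance from the mean is taken as an absolute value, the kernel is Gaussian, and the bandwidth uses a normal-reference rule of thumb, none of which is specified in this section.

```python
import math
from collections import Counter

def bin_index(z, width=0.5):
    """Return the 1-based bin of a standardized anomaly.

    Bin 1 covers [0, 0.5) standard deviations, bin 2 covers [0.5, 1.0), etc.
    Assumption: distance from the mean is measured with the absolute value.
    """
    return int(abs(z) // width) + 1

def gaussian_kde(samples, bandwidth=None):
    """Return pdf(x), a Gaussian kernel estimate of the sample density."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    if bandwidth is None:
        mean = sum(samples) / n
        std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
        bandwidth = 1.06 * std * n ** (-0.2)  # normal-reference rule (assumed)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x):
        # Average of Gaussian kernels centered on each sample.
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return pdf

# Synthetic standardized anomalies, for illustration only.
members = [0.1, 0.4, 0.6, 0.9, 1.2, 1.4, 2.3]
counts = Counter(bin_index(z) for z in members)
pdf = gaussian_kde(members)
print(sorted(counts.items()))   # number of members per bin
print(round(pdf(1.0), 3))       # estimated density at one standard deviation
```

The per-bin counts play the same role as the member counts listed in the Fig. 6 caption.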

Fig. 6

Probability density function of NMME ensemble member standardized anomalies in NINO3 and NINO4 during defined warm events. Y-axis shows probability density. X-axis shows standardized anomalies. a NINO3 (dark red) and NINO4 (light red) PDFs. Bin 1 refers to all ensemble members within 0–0.5 standard deviations from the mean (neutral or weak SSTA warming); bin 5 refers to all ensemble members within 2.0–2.5 standard deviations from the mean (strong SSTA warming). The numbers of ensemble members in bins 1–5 for NINO3 anomalies are 97, 386, 162, 30, and 165; for NINO4 anomalies, 59, 372, 369, 67, and 1. b As in a, but for NINO3 and NINO4 precipitation. The numbers of ensemble members in bins 1–8 for NINO3 anomalies are 109, 291, 165, 72, 25, 28, 61, and 68; for NINO4 anomalies, 12, 11, 201, 292, 183, 58, 13, and 1

For all ensemble members in each bin, we find the corresponding North American regional precipitation, and plot the resulting PDF (Figs. 7, 8). Thus, the dark blue PDF in Fig. 7a labeled “Bin 1” shows SE precipitation corresponding to neutral or weak SSTA in NINO3. Similarly, the dark red PDF labeled “Bin 5” shows SE precipitation corresponding to strong SSTA in NINO3. Dashed contours highlight the PDF corresponding to the weakest and strongest SSTA. Bins that do not include at least 15 ensemble members are excluded. Similar analysis, but binned according to tropical Pacific precipitation amplitude, is shown in Fig. 8. A centered, well-dispersed PDF shows no strong ensemble agreement of either sign, whereas a PDF shifted toward positive values shows ensemble agreement on positive precipitation.
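The conditioning step described above amounts to grouping each member's regional anomaly by the bin of its paired tropical anomaly and discarding sparse bins. The sketch below is a hedged illustration: `conditional_groups` and `summarize` are hypothetical names, the data are synthetic, and the mean and median are used only as simple stand-ins for how far a conditional PDF is shifted from zero.

```python
from collections import defaultdict
from statistics import mean, median

MIN_MEMBERS = 15  # bins with fewer members are excluded, as in the text

def conditional_groups(tropical_z, regional_z, width=0.5, min_members=MIN_MEMBERS):
    """Group regional anomalies by the bin of the paired tropical anomaly."""
    if len(tropical_z) != len(regional_z):
        raise ValueError("inputs must be paired member by member")
    groups = defaultdict(list)
    for t, r in zip(tropical_z, regional_z):
        # Same 0.5 standard deviation bins as Fig. 6 (absolute value assumed).
        groups[int(abs(t) // width) + 1].append(r)
    return {b: groups[b] for b in sorted(groups) if len(groups[b]) >= min_members}

def summarize(groups):
    """Return {bin: (member count, mean, median)} for each retained bin."""
    return {b: (len(v), mean(v), median(v)) for b, v in groups.items()}

# Synthetic members, for illustration only: weak events scatter around zero,
# strong events are shifted positive, and one sparse bin is dropped.
tropical = [0.2] * 20 + [1.2] * 5 + [2.2] * 20
regional = [-0.5, 0.5] * 10 + [0.3] * 5 + [0.75, 1.25] * 10
for b, (n, avg, med) in summarize(conditional_groups(tropical, regional)).items():
    print(f"bin {b}: n={n}, mean={avg:+.2f}, median={med:+.2f}")
```

Each retained group could then be passed to a kernel density estimate to draw one curve per bin.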

Fig. 7

a Probability density functions for southeast US precipitation corresponding to NINO3 SSTA (see Fig. 6a). Bin 1 includes all ensemble member precipitation corresponding to weak NINO3 SSTA, bin 5 includes all ensemble member precipitation corresponding to strong NINO3 SSTA, and so on. b As in a, but for northwest US precipitation. c As in a, but for southeast US precipitation corresponding to NINO4 SSTA. Bin 5 is not included due to the small number of ensemble members in the bin. d As in c, but for northwest US precipitation corresponding to NINO4 SSTA

Fig. 8

As in Fig. 7, but for NINO3 and NINO4 precipitation bins (see Fig. 6b)

There is a clear relationship between NINO3 SSTA strength and SE precipitation in Fig. 7a. The precipitation PDF corresponding to strong positive SSTA is almost entirely positive, indicating ensemble agreement on positive precipitation. The precipitation PDF corresponding to neutral SSTA is more centered, showing no strong ensemble agreement on either sign. SE precipitation PDFs corresponding to NINO4 SSTA do not show a strong relationship between ensemble agreement and strength of SSTA in NINO4 (Fig. 7c).

Similar analysis for NW precipitation is shown in Fig. 7b, d. NW precipitation PDFs corresponding to NINO3 SSTA (Fig. 7b) are similar regardless of which bin is considered. While there was some relationship between the strength of the event in NINO4 and percent negative NW precipitation in Fig. 4b, it is not as apparent in this analysis. For NW PDFs corresponding to NINO4 SSTA (Fig. 7d), we see only a hint of sensitivity to NINO4 amplitude, in the tails of the distribution. It is likely that the sensitivity to NINO4 amplitude is stronger for more extreme NW precipitation.

Figure 8a–d shows the same analysis, but binned by NINO3 and NINO4 precipitation. PDFs for SE precipitation again show more sensitivity to NINO3 precipitation amplitude than to NINO4 (Fig. 8a, c). NW precipitation corresponding to NINO3 precipitation shows little sensitivity (Fig. 8b). NW precipitation corresponding to NINO4 precipitation shows a slight shift towards negative values in the mean for strong positive NINO4 precipitation (Fig. 8d). Additionally, both the left and right tails of the distribution shift toward negative values as NINO4 precipitation amplitude increases.

4 Summary and discussion

There has been much discussion of the expected or typical impacts on North America due to EP and CP events in the tropical Pacific in both observations and modeling studies (Larkin and Harrison 2005a, b; Kug et al. 2009; Mo 2010; Yu et al. 2012; Yu and Zou 2013). Research has also addressed the predictability of North American variables during warm ENSO events, showing that the likelihood of events can be assessed with prior knowledge of SST and that predictability increases with larger SSTA (Hoerling and Kumar 1997; Kumar and Hoerling 1998).

We consider an analysis of the North American response to EP and CP events in NMME. Our focus is on the amplitude of warming and precipitation in the eastern and central Pacific versus regional North American precipitation and T2m ensemble agreement in NMME forecasts. Specifically, we assess whether stronger warm events lead to larger ensemble agreement in NMME forecasts. Also considered is North American regional forecast agreement with observational estimates of precipitation and T2m during EP and CP events.

NMME forecast agreement with observations is highest for North American regional precipitation during strong EP events. The SE, SW, and NW show the highest correlations with observations during the JFM1983 and 1998 strong EP events. JFM2003 (CP) also showed some correlation with observational estimates; however, overall results showed that the majority of high correlations with observations occur during EP events for precipitation. For T2m, the numerically highest correlations occurred in the NW, NE, and SE during CP events, but the majority of high correlations occurred during EP events of varying strength. As such, precipitation predictions in the SE, SW, and NW and their agreement with observations are more strongly tied to event strength than T2m predictions. The reader should note that though these results are important for seasonal climate prediction of precipitation and T2m, there are additional metrics of skill that should be considered in a complete analysis (see e.g. Kirtman et al. 2014).

NMME ensemble agreement is defined in this manuscript as the percent of ensemble members forecasting an anomaly of the same sign (above or below zero). We examine forecast ensemble agreement versus amplitude of SSTA and precipitation in the eastern and central Pacific during warm ENSO events. Forecast ensemble agreement for SE precipitation demonstrates a strong connection to changing amplitude of SSTA in the NINO3 region. For stronger predicted NINO3 SSTA, there is larger ensemble agreement on positive precipitation than for weaker predicted NINO3 SSTA. Forecast ensemble agreement for NW precipitation demonstrates a connection to changing amplitude of SSTA in the NINO4 region, though the sensitivity is not as pronounced as in the SE. For the NW, as the amplitude of SSTA in the NINO4 region increases, ensemble agreement on negative precipitation increases, most noticeably for strongly negative NW precipitation. Other North American regions show less connection to the amplitude of warm events, with similar ensemble agreement regardless of event strength.
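As a concrete reading of the definition above, ensemble agreement for one region and one event can be computed by counting members on each side of zero. The function name `ensemble_agreement` and the ten member values below are hypothetical, and the treatment of members that are exactly zero is an assumption.

```python
def ensemble_agreement(member_anomalies):
    """Return (percent_positive, percent_negative) across ensemble members.

    Assumption: members with an anomaly of exactly zero count toward neither sign.
    """
    n = len(member_anomalies)
    if n == 0:
        raise ValueError("need at least one ensemble member")
    positive = sum(1 for a in member_anomalies if a > 0)
    negative = sum(1 for a in member_anomalies if a < 0)
    return 100.0 * positive / n, 100.0 * negative / n

# Hypothetical regional precipitation anomalies from ten members for one event.
se_members = [0.8, 1.1, -0.2, 0.4, 0.9, 1.5, -0.6, 0.3, 0.7, 1.0]
pct_pos, pct_neg = ensemble_agreement(se_members)
print(f"{pct_pos:.0f}% positive, {pct_neg:.0f}% negative")
```

Values near 50% on both sides indicate no agreement on sign, while values near 100% on one side indicate that nearly all members forecast the same sign.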

T2m forecast ensemble agreement showed a connection to NINO3 amplitude in the SW and SE. NW T2m showed large ensemble agreement regardless of event strength, and thus does not demonstrate sensitivity to changing amplitude. Using NINO3 and NINO4 precipitation amplitude as a proxy shows similar results. Thus, the influence of tropical Pacific SSTA and precipitation amplitude on forecast ensemble agreement in NMME is regional: for instance, SE precipitation ensemble agreement is strongly tied to NINO3 warm event strength, whereas NW precipitation ensemble agreement is only weakly tied to NINO4 warm event strength.

The regionality of this response is due to a few factors. The first is JFM hindcast skill, shown in Fig. 2. T2m shows largely positive skill over North America, with significant skill in the NW. Precipitation skill is neutral over much of North America, positive over the NE, and significant in the NW and SE. The NW, SW, and to an extent the NE, all show a response in T2m and precipitation to warm ENSO events (Trenberth et al. 1998, among others). The sensitivity of the atmosphere to tropical SSTA in various regions was also studied by Barsugli and Sardeshmukh (2002), who noted that an anomaly in the tropics forces local changes that can spread globally. Barsugli and Sardeshmukh (2002) highlighted the NW and SE as regions of interest, and though their findings mainly showed NINO4 as dominant, the authors did not discount other tropical Pacific regions as sources of sensitivity.

These findings have a number of implications for forecasting North American precipitation and T2m. In the SW and GP, the low forecast skill when compared with observations coincides with equally low ensemble agreement, and is consistent with other literature that does not show strong sensitivity to tropical Pacific anomalies in these regions (e.g. Barsugli and Sardeshmukh 2002). Though this is disheartening in terms of prediction skill even during ENSO events, the small ensemble agreement in these forecasts does not point to over-confidence (excessive forecast agreement) in a region with traditionally low skill. In the NE, which does show some response to wintertime warm ENSO (Trenberth et al. 1998), we find some forecast skill but lower ensemble agreement, so the forecasts are under-confident in this region. A tendency for forecast models to show, for instance, anomalously high SSTA in the eastern Pacific during CP events could lead to over-confidence in regions such as the Southeast, where there is forecast skill and where large NINO3 anomalies lead to larger agreement in the prediction of positive precipitation anomalies. In the NW, there is large ensemble agreement overall for T2m, regardless of event strength, and we also see positive forecast skill, which means the forecasts are neither under- nor over-confident. However, the lack of sensitivity in the NW to event strength in either tropical Pacific region should be treated with caution, as the results for hindcast skill were mixed, and forecasts could be over-confident during a weaker warm ENSO event.

Climate models have shown difficulty in simulating EP and CP events (Ham and Kug 2012; Kim et al. 2012; Kirtman et al. 2013); in some cases the events were indistinguishable, with anomalies of similar strength in the east and west Pacific (Hendon et al. 2009). In addition, it has been shown that the characteristics of El Niño events may be changing in terms of predictability. McPhaden (2012) found that CP events are indeed becoming more prevalent, but noted that predictability of SST at lead times of 2–3 seasons may suffer due to less effective thermocline feedback during CP events. Barnston et al. (2011) found similar results, in that there was a decrease in ENSO prediction skill in 2002–2011 due to decadal variability of ENSO. Given these new challenges in predicting ENSO, as well as the increased prevalence of CP events in recent years, it is important to begin studying the predicted teleconnection response during the two types of El Niño.

Our analysis shows that the sensitivity of North American forecast ensemble agreement to tropical Pacific amplitude during warm ENSO events varies not only by North American region, but also between eastern and central Pacific anomalies. Because CP events are typically weaker than EP events (Dommenget et al. 2013), this could imply less predictability, which is disconcerting from a forecasting standpoint given a possible increase in CP events in recent years (Yeh et al. 2009). However, expected T2m impacts were found for the strongest EP and CP events (Yu et al. 2012), and the atmospheric response is stronger during CP events (Kug et al. 2009); thus, there may still be predictability for North American impacts during CP events despite their comparatively weak SSTA. Our results show that NMME predictions of precipitation and T2m during EP and CP events captured the large-scale, expected impacts, but forecast ensemble agreement is regional over North America. Some regions are more sensitive to changing NINO3 or NINO4 amplitude, while others are unrelated to warm event strength. Regional North American forecast ensemble agreement does not necessarily correspond solely to strong El Niño events; rather, both the amplitude and position of tropical Pacific anomalies are tied to North American ensemble agreement.