1 Introduction

Every year, tropical cyclones (TCs) are responsible for catastrophic damage to coastal and inland regions (e.g., Mendelsohn et al. 2012; Pielke et al. 2005, 2008; Woodruff et al. 2013; Zhang et al. 2009). Among the different ocean basins, the western North Pacific (WNP) produces the highest number of TCs (Chan 1985; Gray 1968; Knapp et al. 2010), and these storms are responsible for catastrophic effects both in terms of economic damage and fatalities. For instance, Super Typhoon “Fred” caused 1126 deaths in 1994, and Super Typhoon “Herb” alone led to more than 10 billion US dollars in economic losses in 1996 (Zhang et al. 2009). Therefore, a better understanding of the variability of WNP TCs can potentially lead to a better prediction, which is of paramount importance for preparedness, mitigation and management of TC-related natural hazards.

The last few decades have witnessed significant improvements in terms of our understanding of the impacts of different climate modes on WNP TCs. For instance, the El Niño Southern Oscillation (ENSO) is known to exert strong impacts on TC genesis locations in the WNP (e.g., Camargo et al. 2007; Chan 1985; Chia and Ropelewski 2002; Li and Zhou 2012; Wang and Chan 2002; Zhang et al. 2016a). A new type of El Niño, known as Central Pacific (CP) El Niño (Ashok et al. 2007), has a strong association with the frequency of WNP TCs (WNP TCF) (Chen and Tam 2010; Kim et al. 2011; Zhang et al. 2012). The summer western Pacific subtropical high, represented by the 850-hPa geopotential height, has been used as a predictor for basin-wide and landfalling WNP tropical storms (Wang et al. 2013). Indian Ocean sea surface temperature (SST) warming tends to suppress WNP TCF (Du et al. 2011; Zhan et al. 2011) by forcing an anticyclonic flow pattern in the western Pacific with warm Kelvin wave responses (Xie et al. 2009). Moreover, the North Pacific Gyre Oscillation (NPGO) exhibits a significant linkage with WNP TCF (Zhang et al. 2013). The Pacific Meridional Mode (PMM) profoundly impacts WNP TCF in observations and in a 1000-year climate model simulation (Zhang et al. 2016b, 2017a) by modulating vertical wind shear associated with Matsuno–Gill responses (Gill 1980). PMM has also been found to exert strong impacts on the 2015 and 2016 typhoon seasons (Zhan et al. 2017; Hong et al. 2018). Recently, Atlantic SST warming has been found to inhibit WNP TC genesis in observations and climate model experiments (Huo et al. 2015; Yu et al. 2016). Moreover, the positive phase of the Atlantic Meridional Mode (AMM) and Atlantic Multidecadal Oscillation (AMO) can lead to reduced WNP TCs at various time scales by modulating vertical wind shear associated with the Walker circulation change (Zhang et al. 2017b, 2018). The advancements in our understanding of WNP TCs discussed above can potentially be used for improving the seasonal forecasting of WNP TCs.

Seasonal forecasting of WNP TCF has been a challenging task for the scientific community and operational centers, evolving over time in light of scientific advances and improvements in numerical and computational techniques. Different studies have detailed the capabilities of statistical models in forecasting WNP TCF (Chan et al. 1998; Fan and Wang 2009; Lu et al. 2010; Wang et al. 2013), while dynamical models have proven increasingly skillful because of improvements in the understanding of the physical processes, and in the parametrizations and dynamical cores (Camargo and Zebiak 2002; Chen and Lin 2013; Huang and Chan 2014; Vecchi et al. 2014; Vitart 2006; Xiang et al. 2015). Despite major progress in climate modeling, SST biases and biases in other key variables (e.g., vertical wind shear) in climate models limit their seasonal forecast capability (Vecchi et al. 2014). Even when the SST biases are corrected with atmospheric-oceanic flux adjustments, skillful forecasting of WNP TCs remains a challenging problem (Vecchi et al. 2014; Zhang et al. 2017a), although a higher spatial resolution in the atmospheric component of TC-permitting climate models appears to improve seasonal forecasting of TCs (Murakami et al. 2015, 2016a).

Over the years, statistical-dynamical hybrid forecast models have been developed to integrate the advantages of both physically-based dynamical models and the statistical association between large-scale environmental variables and WNP TCF, showing promising results (Choi et al. 2016; Kim et al. 2017; Li et al. 2013; Zhan and Wang 2016; Zhang et al. 2016c, 2017a). However, these hybrid models rely heavily on the identification of proper predictors in representing TC genesis and frequency (Li et al. 2013; Zhan and Wang 2016; Zhang et al. 2016c, 2017a). For example, current hybrid models for TCs use variables (e.g., SST, vertical wind shear, 850-hPa relative humidity and 500-hPa geopotential height) averaged over selected spatial domains, with different models having different regions and sets of relevant predictors (Wang et al. 2013; Zhan and Wang 2016; Zhang et al. 2016c, 2017a). Therefore, given this dependence on the climate model, it is difficult to transfer relevant predictors across models, potentially limiting the applicability of these seasonal forecast models for WNP TCF.

Moreover, not all drivers associated with TC genesis, development and tracking are equally predictable: SST in the tropics is more predictable than atmospheric drivers (Kumar et al. 2011; Li et al. 2013; Zhang et al. 2016c, 2017a). Therefore, hybrid statistical-dynamical models that rely only on SST may provide more skillful and longer-lead forecasts, similar to what has been done for North Atlantic TCs (Vecchi et al. 2011, 2013; Villarini et al. 2018). If SST-based predictors related to WNP TCF can be identified, and if these predictors can be forecast reasonably well, we will be well-positioned to obtain skillful forecasts of WNP TCF. Therefore, the goal of this study is to develop statistical models that relate WNP TCF to SST-based predictors, and to forecast TC activity using SST forecasts from the North American Multi-Model Ensemble (NMME) project (Kirtman et al. 2014).

2 Data and methods

The observations of WNP TCs are obtained from the China Meteorological Administration (CMA) TC data in the International Best Track Archive for Climate Stewardship (IBTrACS v03r10; Knapp et al. 2010). We also examined the sensitivity of our results to the use of the Japan Meteorological Agency (JMA) data in IBTrACS, and obtained results similar to those presented here. We use SST data from the Met Office Hadley Centre Sea Ice and Sea Surface Temperature data set (HadISST 1.1; Rayner et al. 2003).

The forecast SST data are obtained from the NMME project, with hindcasts and forecasts (e.g., http://iridl.ldeo.columbia.edu/SOURCES/.Models/.NMME/) available from climate modeling centers in the United States and Canada (Kirtman et al. 2014). The NMME project provides hindcasts up to December 2010 and forecasts since January 2011. Information on the climate models used in this study is provided in Table 1. We focus on the period 1982–2016 for the seasonal forecasts of WNP TCF. When multiple members are available for a given model, we use their average as representative of that model.

Table 1 Information about the NMME climate models used in this study

In terms of predictors, we focus on the PMM and the North Atlantic SST anomalies (NASST) because of their physical connection with WNP TCF. Defined as the first maximum covariance analysis mode of SST and 10-m wind vectors in the eastern Pacific (Chiang and Vimont 2004), PMM has been found to perform quite well in representing and forecasting WNP TCF in observations and climate model simulations (Zhang et al. 2016b, 2017a). However, 10-m surface wind field forecasts are not available from the NMME project. Because the impacts of PMM on WNP TCF are mainly associated with SST forcing, it is reasonable to use instead the first Empirical Orthogonal Function (EOF) of SST in the spatial domain of PMM as defined in Chiang and Vimont (2004) (165°E–95°W, 70°S–22°N). Before the EOF analysis, we remove the linear fit of SST in the PMM domain to the cold tongue index (CTI; SST anomalies averaged over 180–90°W, 6°S–6°N); because of the dominant role of ENSO, the first EOF of SST in this domain would otherwise be an ENSO-like pattern. For simplicity, we refer to the first EOF of SST in the spatial domain of PMM as “pure-SST” PMM (PSPMM). Moreover, it has been shown that NASST (covering the area 85°W–10°W, 0–30°N) has strong impacts on WNP TC genesis and frequency (Huo et al. 2015; Yu et al. 2016; Zhang et al. 2017b).
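
The two-step construction of PSPMM described above (remove the CTI-related signal, then take the leading EOF) can be sketched as follows. This is only an illustrative sketch, not the processing code used in the study: it assumes the SST anomalies have already been subset to the PMM domain and flattened to a (time, grid point) array, it omits area weighting, and all function names and the synthetic data are hypothetical. The sign and scaling of an EOF are arbitrary, so the resulting index would still need to be oriented and scaled as desired.

```python
import numpy as np

def remove_cti_signal(sst_anom, cti):
    """Remove, at every grid point, the linear fit of SST anomalies to the CTI."""
    cti_c = cti - cti.mean()
    anom_c = sst_anom - sst_anom.mean(axis=0)
    # Least-squares slope of each grid-point time series on the CTI.
    slope = cti_c @ anom_c / (cti_c @ cti_c)
    return anom_c - np.outer(cti_c, slope)

def leading_eof(field):
    """Return the first EOF pattern and its principal-component time series."""
    u, s, vt = np.linalg.svd(field, full_matrices=False)
    return vt[0], u[:, 0] * s[0]

# Synthetic stand-in data (NOT real SST): 120 months, 50 grid points.
rng = np.random.default_rng(0)
n_time, n_space = 120, 50
cti = rng.normal(size=n_time)                 # hypothetical ENSO index
mode = rng.normal(size=n_time)                # hypothetical non-ENSO mode
sst_anom = (3.0 * np.outer(cti, rng.normal(size=n_space))
            + np.outer(mode, rng.normal(size=n_space))
            + 0.1 * rng.normal(size=(n_time, n_space)))

residual = remove_cti_signal(sst_anom, cti)
pattern, pspmm_index = leading_eof(residual)
```

In this synthetic setting the ENSO-like component dominates the raw field, so without the regression step the leading EOF would follow `cti` rather than `mode`, which mirrors the motivation given in the text.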

Although previous studies have reported that SST anomalies in other parts of the global ocean [e.g., Eastern Pacific (EP) El Niño, Central Pacific (CP) El Niño and Indian Ocean SST anomaly] have impacts on WNP TCs, these climate modes are excluded because some are indirectly included in NASST and PSPMM and some are not closely associated with the frequency of WNP TCs. The conventional/EP El Niño exerts strong impacts on the genesis locations of WNP TCs, with more (fewer) geneses in the southeastern (northwestern) part of WNP during strong El Niño years (Wang and Chan 2002; Chen and Tam 2010). However, the Niño 3.4 index has a relatively weak impact on the total frequency of WNP TCs (correlation = −0.05) for the period 1965–2016, consistent with previous studies (e.g., Chen and Tam 2010). Meanwhile, CP El Niño has a weaker correlation with WNP TC frequency than the PMM and is closely associated with PMM (Chang et al. 2007); for these reasons we did not include it in our forecast model. Although the Indian Ocean SST anomalies can also be a potential predictor (Zhan et al. 2011), we do not include them in our model because of their strong correlation with the North Atlantic SST anomaly (correlation coefficient of 0.77 over the 1965–2016 period). The predictors we selected are based on regions (i.e., the PMM region and the North Atlantic) outside of the western North Pacific basin, in agreement with previous studies that found that local SST is not closely associated with WNP TC activity (e.g., Chan and Liu 2004), and are not correlated with each other (correlation coefficient of −0.01 over the 1965–2016 period). Therefore, PSPMM and NASST are used as predictors for WNP TCF.

Poisson regression is used to analyze the relationship between the frequencies of WNP TCs (our predictand) and the SST-based predictors (e.g., Elsner and Jagger 2006; Villarini et al. 2010). Here we model the parameter µ of the Poisson distribution as a linear function of our two predictors (via a logarithmic link function):

$$\log\left(\mu_i\right) = \beta_0 + \sum_{j} \beta_j S_{ij},$$
(1)

where β0 is the intercept, βj is the coefficient for the jth predictor S (in our case j = 1,2 as we only consider two predictors), while i refers to the ith year.

We first train the Poisson regression model using the observed WNP TC frequency and the observed values of NASST and PSPMM; we then use the coefficients estimated via Poisson regression, and replace the observed predictors with their forecast values based on NMME outputs.
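
The train-then-substitute procedure described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the code used in this study: the Poisson regression of Eq. 1 is fitted here with a hand-written Newton iteration (a statistical package would normally be used instead), the data are synthetic, and the names `fit_poisson`, `predict_rate`, `X_obs` and `X_fcst` are hypothetical.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50, tol=1e-10):
    """Fit log(mu) = b0 + X @ b by Newton's method (maximum likelihood)."""
    A = np.column_stack([np.ones(len(y)), X])    # add intercept column
    beta = np.zeros(A.shape[1])
    beta[0] = np.log(y.mean())                   # start from the mean count
    for _ in range(n_iter):
        mu = np.exp(A @ beta)
        grad = A.T @ (y - mu)                    # score vector
        hess = A.T @ (A * mu[:, None])           # Fisher information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def predict_rate(beta, X):
    """Poisson rate mu for each row of predictors X."""
    return np.exp(beta[0] + X @ beta[1:])

# Synthetic stand-in data (NOT the data of this study): 52 "years", 2 predictors.
rng = np.random.default_rng(42)
X_obs = rng.normal(size=(52, 2))
true_beta = np.array([3.0, -0.4, 0.2])           # hypothetical values
y_obs = rng.poisson(np.exp(true_beta[0] + X_obs @ true_beta[1:]))

# Step 1: train on observed predictors and observed counts.
beta_hat = fit_poisson(X_obs, y_obs)

# Step 2: keep the coefficients, swap in (imperfect) forecast predictors.
X_fcst = X_obs + rng.normal(scale=0.3, size=X_obs.shape)
tc_forecast = predict_rate(beta_hat, X_fcst)
```

Keeping the coefficients fixed and changing only the predictor values means that any loss of skill in `tc_forecast` relative to the fit on observed predictors comes from errors in the forecast predictors.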

3 Results

3.1 Predictors in observations

EOF analysis was used to calculate the PSPMM index from the monthly SST data. The spatial pattern of observed PSPMM, obtained by regressing the monthly SST onto the monthly PSPMM index, features SST warming in the northwestern part and cooling in the southeastern part of the PMM region (Fig. 1a), which is quite similar to the classic PMM (Fig. 1b) (Chiang and Vimont 2004). The correlation coefficient between the PSPMM index and the classic PMM index averaged from June to November (JJASON) for 1965–2016 is 0.94, supporting the use of PSPMM as a predictor for WNP TCF. The other predictor for WNP TCF is NASST, which modulates the vertical wind shear by altering the Walker Circulation (Li et al. 2016; McGregor et al. 2014; Zhang et al. 2017b).

Fig. 1 Regression of SST (unit: °C) onto (a) the PSPMM index, which is the first EOF of SST in the spatial domain in the black rectangle after linearly removing the impacts of ENSO, and (b) the PMM index. (c) Time series of the PSPMM and PMM indices. The PSPMM is obtained by multiplying the first EOF of SST in the PMM region by ¼

We develop a Poisson regression model of WNP TCF with PSPMM and NASST as predictors for the JJASON months. The estimated values of the coefficients are:

$$\mu_i = \exp\left(3.0989 - 0.4608 \cdot \mathrm{NASST}_i + 0.0056 \cdot \mathrm{PSPMM}_i\right).$$
(2)

If we focus on the median of the fitted regression model as our best estimate, we obtain a high value of the correlation coefficient (0.73), with a root mean square error (RMSE) equal to 3.2 TC/year (Fig. 2). This suggests that our simple statistical model with observed PSPMM and NASST indices during JJASON can reproduce the interannual variability in WNP TCF well, making it a viable option for seasonal forecasting of TC activity in this ocean basin.
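
As a worked illustration of how Eq. 2 yields a best estimate and percentile bands, the sketch below evaluates the fitted rate for given index values and reads off quantiles of the corresponding Poisson distribution. The coefficients are those of Eq. 2; the index values are hypothetical and are assumed to be in the same units and scaling as the indices used for the fit, and the function names are illustrative. With both indices at zero the rate is exp(3.0989), or about 22.2 TCs per JJASON season.

```python
import math

# Coefficients taken from Eq. 2.
B0, B_NASST, B_PSPMM = 3.0989, -0.4608, 0.0056

def expected_tc_count(nasst, pspmm):
    """Poisson rate mu from Eq. 2 for given JJASON index values."""
    return math.exp(B0 + B_NASST * nasst + B_PSPMM * pspmm)

def poisson_quantile(mu, q):
    """Smallest integer k whose Poisson(mu) cumulative probability is >= q."""
    k = 0
    pmf = math.exp(-mu)      # P(N = 0)
    cdf = pmf
    while cdf < q:
        k += 1
        pmf *= mu / k        # P(N = k) from P(N = k - 1)
        cdf += pmf
    return k

mu_neutral = expected_tc_count(0.0, 0.0)
mu_warm_atlantic = expected_tc_count(0.5, 0.0)   # hypothetical NASST of +0.5

median_neutral = poisson_quantile(mu_neutral, 0.5)
band_5_95 = (poisson_quantile(mu_neutral, 0.05),
             poisson_quantile(mu_neutral, 0.95))
```

Because the NASST coefficient is negative, the hypothetical warm North Atlantic case gives a lower rate (about 17.6) than the neutral case, in line with the suppressing effect of Atlantic warming discussed in the Introduction.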

Fig. 2 Poisson regression modeling of observed WNP TCF (white circles) using observed PSPMM and NASST as predictors. The black line represents the median of the fitted Poisson distribution, while the dark (light) grey regions represent the 25–75th (5–95th) percentiles. The unit of RMSE is TC/year

3.2 Forecasts with NMME SST

Given that simple statistical models can represent the WNP TC activity well, we examine how well we can forecast the predictors and what skill we obtain in forecasting WNP TC activity during the JJASON months. We consider different lead times, with forecasts initialized from June (shortest lead time) to January (longest lead time). We consider the skill for each of the individual models in Table 1 as well as the multi-model ensemble average.

We start by examining how well the NMME models can forecast the two predictors. Overall, the forecast skill of PSPMM and NASST (correlation) can reach ~0.8 in the forecasts initialized in June or May for each individual model and their ensemble mean (Fig. 3), consistent with previous studies which evaluate the forecast skill of SST in NMME (Kirtman et al. 2014). The skill for NASST is slightly higher than that for PSPMM in terms of correlation coefficient. When the initialization months go from January to June (i.e., from longer to shorter lead time), the values of the correlation coefficient increase from ~0.6 to ~0.9 (Fig. 3). Overall, the GFDL FLOR and NASA models tend to better forecast NASST, while CCSM3 tends to perform the worst; in terms of PSPMM, the performance of the different models is more comparable, with CCSM3 and CFSv2 lagging slightly behind.
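
The skill measures used throughout this section (correlation coefficient and RMSE, evaluated for each model and for the multi-model ensemble average) can be sketched as below. This is an illustrative sketch only: the forecasts are synthetic, the model names are placeholders rather than the NMME models of Table 1, and the assumption that model errors are independent is made purely for the example.

```python
import numpy as np

def correlation(forecast, observed):
    """Pearson correlation between a forecast series and the observations."""
    return float(np.corrcoef(forecast, observed)[0, 1])

def rmse(forecast, observed):
    """Root mean square error between a forecast series and the observations."""
    diff = np.asarray(forecast, dtype=float) - np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic stand-in: 35 years and 4 hypothetical models with independent errors.
rng = np.random.default_rng(1)
observed = rng.normal(size=35)
models = {f"model_{k}": observed + rng.normal(scale=0.8, size=35)
          for k in range(4)}
ensemble_mean = np.mean(list(models.values()), axis=0)

skill = {name: correlation(fc, observed) for name, fc in models.items()}
skill["ensemble_mean"] = correlation(ensemble_mean, observed)
```

Averaging across models reduces the part of the error that is not shared among them, which is one reason an ensemble average can score better than most of its members.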

Fig. 3 The forecast skill in terms of correlation coefficient for the two predictors, (a) PSPMM and (b) NASST, using different climate models (x-axis) at different initialization months (y-axis; from January to June) for the 1982–2016 period

Based on the results in Fig. 3, it is possible to skillfully forecast the two predictors, even a few months ahead of the TC season; this insight, combined with the capability of the Poisson regression model, encourages us to investigate the skill in forecasting WNP TCF. Figure 4 shows the forecasted WNP TCF with respect to the observed one for different initialization months. The Poisson regression model with the NMME predictors is able to capture well both the year-to-year and the decadal variability in the observations (Fig. 4). For the May- and June-initialized forecasts, the skill is very similar with a correlation of 0.61 and 0.63, respectively. As we increase the lead time, the skill decreases rapidly; for instance, forecasts initialized in March and April have correlation coefficients on the order of 0.5, a decrease from ~0.6 in May and June. This decrease in skill for increasing lead times is expected, and is consistent with what we observed in terms of the forecasting of the predictors (Fig. 3). Besides the decrease in skill, there is also a decrease in the year-to-year variation, with smoother results when initialized in January than in June (Fig. 4), consistent with a larger RMSE for the forecasts initialized in January (Fig. 5). This is also similar to what was observed in Villarini et al. (2018) in relation to the forecasting of North Atlantic TC activity. Overall, the forecast may fall short in capturing the WNP TC frequency with very high or low values (Fig. 4).

Fig. 4 Forecasted WNP TCF using NMME-based predictors, initialized from January to June, based on the ensemble average. The white circles represent the observations. The black line represents the median of the fitted Poisson distribution, while the dark (light) grey regions represent the 25–75th (5–95th) percentiles. The unit of RMSE is TC/year

Fig. 5 The forecast skill (top panel: correlation; bottom panel: RMSE) based on Poisson regression with NMME-forecasted NASST and PSPMM indices as predictors over the 1982–2016 period. The results on the x-axis are for the different models and their ensemble average, while the initialization month is on the y-axis

To be more quantitative in our assessment of the forecast skill, we use the correlation coefficient and RMSE. Overall, the forecast skill for WNP TCF measured by the correlation coefficient is very promising using the two predictors (Fig. 5, top panel). The seasonal forecast of WNP TCF in June–November based on the ensemble average of the SST forecasts outperforms almost all the forecasts of WNP TCF based on the individual climate model forecasts of SST. For the forecast initialized in June, most of the individual models have correlations greater than 0.4, and using the ensemble mean of the predictors we obtain a correlation of 0.63 for the period 1982–2016 (Fig. 5, top panel). Overall, the RMSE of the forecasts using NMME-based predictors is consistent with the results based on the correlation coefficient. The RMSE of the forecast based on the ensemble average is smaller than that of almost all individual model-based forecasts across initialization months, and increases from ~3.27 to ~3.63 TC/year as the initialization month moves from June back to January (Fig. 5, bottom panel). To further justify the use of our hybrid model for seasonal forecasts, we apply leave-one-out cross-validation to the forecast results. Specifically, we train the Poisson regression model and estimate its coefficients by excluding one year of observations at a time during the period 1982–2016. The results after leave-one-out cross-validation (Fig. 6) are quite similar to those using the full record (Fig. 5), with cross-validation leading to only slightly lower skill in terms of correlation and RMSE.
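
One way to read the leave-one-out cross-validation described above is sketched below: for each target year, the Poisson regression is refitted without that year and the withheld year is then predicted from its forecast predictors. This is a sketch under assumptions rather than the exact procedure of this study; in particular, the training and forecast arrays are given the same length for simplicity, the data are synthetic, and the function names are hypothetical.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50):
    """Fit log(mu) = b0 + X @ b by Newton's method (maximum likelihood)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(A.shape[1])
    beta[0] = np.log(y.mean())
    for _ in range(n_iter):
        mu = np.exp(A @ beta)
        beta = beta + np.linalg.solve(A.T @ (A * mu[:, None]), A.T @ (y - mu))
    return beta

def loocv_predictions(X_train, y, X_forecast):
    """Refit without year i, then predict year i from its forecast predictors."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                 # drop the target year
        beta = fit_poisson(X_train[keep], y[keep])
        preds[i] = np.exp(beta[0] + X_forecast[i] @ beta[1:])
    return preds

# Synthetic stand-in data (NOT the data of this study): 35 "years".
rng = np.random.default_rng(7)
X_obs = rng.normal(size=(35, 2))
y = rng.poisson(np.exp(3.0 - 0.4 * X_obs[:, 0] + 0.2 * X_obs[:, 1]))
X_fcst = X_obs + rng.normal(scale=0.3, size=X_obs.shape)

cv_pred = loocv_predictions(X_obs, y, X_fcst)
cv_corr = float(np.corrcoef(cv_pred, y)[0, 1])
cv_rmse = float(np.sqrt(np.mean((cv_pred - y) ** 2)))
```

Since the withheld year never informs the coefficients used to predict it, `cv_corr` and `cv_rmse` give a less optimistic skill estimate than scores computed with coefficients fitted to the full record.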

Fig. 6 The forecast skill (top panel: correlation; bottom panel: RMSE) based on Poisson regression with NMME-forecasted NASST and PSPMM indices as predictors over the 1982–2016 period after leave-one-out cross-validation. The results on the x-axis are for the different models and their ensemble average, while the initialization month is on the y-axis

The uncertainties in our forecast model, represented by the RMSE of the forecasts, may originate from three main sources: the Poisson regression coefficients, the NMME-based predictors, and the deficiency of statistical models. The uncertainties related to the regression coefficients and the NMME-based predictors can be directly calculated from the forecasts. The uncertainty related to the “statistical model deficiency” can be obtained by subtracting these two uncertainties from the total uncertainty, which is the RMSE between the forecasted and observed WNP TC frequencies (the rightmost column of the bottom panel of Fig. 5). To quantify the uncertainty of the forecasts arising from the Poisson regression coefficients, the 52-year samples are divided into ten parts for validation/calibration, with nine validation sets having five samples and one having seven samples (for a total of 52 years). We obtain the standard deviation of the ten sets of coefficients (Table 2), which is used in the forecast equation to quantify the uncertainty of the forecast WNP TC frequency. Specifically, we first calculate the forecast WNP TC frequency using the original coefficients (Eq. 2), and compare it with the forecast TC frequency obtained with new coefficients formed by adding the standard deviation of the ten sets of coefficients, keeping the predictors unchanged. The RMSE between the two sets of forecasted frequencies is taken to represent the uncertainty related to the coefficients (Fig. 7, left column). Similarly, we calculate the standard deviation of the predictors across the ensemble members of the NMME forecasts, which is used to quantify the uncertainty related to the predictors. The uncertainty associated with the predictors in the different ensemble members is represented by the RMSE between two sets of forecasted frequencies obtained with two sets of predictors (Fig. 7, middle column). Figure 7 indicates that “model deficiency” accounts for the largest portion of the uncertainties in the forecast model, followed by those associated with the Poisson regression coefficients and NMME-based predictors.
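
The coefficient-related part of this uncertainty estimate can be sketched as follows. Several details are assumptions made only for illustration: the ten parts are taken as contiguous blocks of years, the standard deviation is the population one (NumPy default), the data are synthetic, and all function names are hypothetical. The sketch is not the code used to produce Fig. 7 or Table 2.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50):
    """Fit log(mu) = b0 + X @ b by Newton's method (maximum likelihood)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(A.shape[1])
    beta[0] = np.log(y.mean())
    for _ in range(n_iter):
        mu = np.exp(A @ beta)
        beta = beta + np.linalg.solve(A.T @ (A * mu[:, None]), A.T @ (y - mu))
    return beta

def predict_rate(beta, X):
    return np.exp(beta[0] + X @ beta[1:])

def ten_part_split(n_years=52, n_parts=10, part_size=5):
    """Nine blocks of five years and a final block holding the remaining years."""
    edges = [k * part_size for k in range(n_parts)] + [n_years]
    return [np.arange(edges[k], edges[k + 1]) for k in range(n_parts)]

def coefficient_uncertainty(X, y, X_forecast):
    """RMSE between forecasts with original and SD-shifted coefficients."""
    beta_full = fit_poisson(X, y)
    all_years = np.arange(len(y))
    betas = []
    for held_out in ten_part_split(len(y)):
        keep = np.setdiff1d(all_years, held_out)   # exclude one subset
        betas.append(fit_poisson(X[keep], y[keep]))
    beta_sd = np.std(betas, axis=0)                # spread of the ten fits
    base = predict_rate(beta_full, X_forecast)
    shifted = predict_rate(beta_full + beta_sd, X_forecast)
    return beta_sd, float(np.sqrt(np.mean((shifted - base) ** 2)))

# Synthetic stand-in data (NOT the data of this study): 52 "years".
rng = np.random.default_rng(3)
X = rng.normal(size=(52, 2))
y = rng.poisson(np.exp(3.0 - 0.4 * X[:, 0] + 0.2 * X[:, 1]))
beta_sd, rmse_coef = coefficient_uncertainty(X, y, X)
```

The predictor-related term can be estimated in the same way by holding the coefficients fixed and shifting the predictors by their spread across ensemble members.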

Table 2 Coefficients for PSPMM and NASST trained by using samples excluding one subset

Fig. 7 Uncertainties of the forecasts (forecast errors) related to Poisson regression coefficients, NMME-based predictors and model deficiency for different initial months

Because of the different lengths in the hindcast and forecast periods of NMME, we have analyzed the differences in RMSE during the two periods (Fig. 8). Overall, the RMSE during 2011–2016 is larger than that during 1982–2010 (Fig. 8), suggesting a lower skill in the forecast period. In addition, the RMSE during 1982–2010 increases with the increasing forecast lead time (Fig. 8, top panel) while the RMSE during 2011–2016 is relatively stable for different lead times (Fig. 8, bottom panel). The differences in RMSE of the forecast model during 1982–2010 and 2011–2016 may be associated with the prediction skill of the predictors (Fig. 9). Taking a closer look at the forecast errors of predictors, Fig. 9 shows larger errors for both predictors during the period 2011–2016 compared with those during 1982–2010. Overall, the forecast errors for NASST and PSPMM decrease as the lead month decreases for the period 1982–2010 (Fig. 9a, c). However, the forecast errors for PSPMM increase as the lead month decreases for the period 2011–2016 (Fig. 9b). We speculate that this difference in dependence of RMSE on lead time is related to the transition of the forecast system (e.g., climate models) from hindcast to forecast (Kirtman et al. 2014). In addition, all NMME forecasts are bias corrected by making use of the hindcasts (Kirtman et al. 2014), which may lead to the differences in dependence of RMSE on lead time.

Fig. 8 The forecast skill (RMSE) based on Poisson regression with NMME-forecasted NASST and PSPMM indices as predictors over the (a) 1982–2010 and (b) 2011–2016 periods. The results on the x-axis are for the different models and their ensemble average, while the initialization month is on the y-axis

Fig. 9 Forecast errors (RMSE) for (a, b) PSPMM and (c, d) NASST during 1982–2010 and 2011–2016

4 Discussion and conclusion

This study proposes and develops a simple yet skillful approach to the seasonal forecasting of WNP TC activity, and falls within the broad group of hybrid statistical-dynamical approaches. As we advance our understanding of the physical mechanisms underlying the drivers of TC frequency, we are building forecast models with a great deal of complexity, which may have the negative effect of limiting the generalization of these forecast models. For example, most statistical-dynamical forecast models for WNP TCF use spatial domains which depend on climate models and on individual predictors. Another potential limitation associated with more complex models is their use of several atmospheric predictors: on the one hand, these atmospheric predictors provide a more comprehensive representation of the processes at play, but on the other hand they have more limited predictability than SST-driven predictors. To address these issues and to identify a trade-off that captures the important but predictable drivers, we took advantage of the SST forecasts by NMME, building a seasonal forecast model for WNP TC activity.

Here our goal was to simplify the model complexity but still maintain good forecast skill. Based on previous research findings, we have used two SST-based predictors: one is based on the first EOF of SST in the PMM region, while the other one is the SST anomalies averaged over the North Atlantic. We have used the first EOF of SST in the PMM region because NMME does not provide 10-m surface winds which are necessary to calculate the PMM index. The Poisson regression model trained on the observed WNP TCF and the two predictors exhibits very good skill (correlation coefficient equal to 0.73 and RMSE of 3.2 TC/year) for the period of 1965–2016. Using SST forecasts by NMME, we have built a forecasting system for WNP TCF with promising skill in terms of correlation coefficient and RMSE (Figs. 3, 4, 5), with forecasts initialized as early as January.

The seasonal forecast model developed in this study shows prediction skill (COR = 0.63, RMSE = 3.27 TC/year initialized in June) comparable to previous studies. By using a regional climate model to dynamically downscale the National Centers for Environmental Prediction Climate Forecast System version 2 (NCEP CFSv2) hindcasts, Huang and Chan (2014) built a forecast model with a correlation of 0.55 between predicted and observed WNP TC frequency for the 1980–2001 period. The Met Office fully coupled atmosphere–ocean Global Seasonal Forecast System 5 (GloSea5) produces a correlation of 0.57 for 1996–2009 (Camp et al. 2015). Zhang et al. (2017a) developed a hybrid model based on GFDL FLOR with values of the correlation coefficient ranging from 0.56 to 0.69 for different initialization months over the 1980–2015 period. It is noted that the dynamical seasonal forecasts using GFDL FLOR produce correlations of 0.3–0.5 between forecasted and observed WNP TC frequencies (Zhang et al. 2016c), although SST is well forecasted. This might be due to the fact that the atmospheric responses to SST forcing are not realistically reproduced in FLOR.

This study highlights the crucial role of PMM and NASST in modulating WNP TCF presented in previous studies (Yu et al. 2016; Zhan et al. 2017; Zhang et al. 2016b, 2017a, b; Hong et al. 2018). As the forecast skill of SST in the tropics is expected to improve thanks to higher spatial resolution and better representation of the physical processes (Murakami et al. 2015, 2016b; Wehner et al. 2014), the forecast skill of WNP TCF is expected to improve further as well. This study has attempted to balance the complexity and capability of a seasonal forecast model for WNP TC activity by combining observations and model simulations, paving the way for the future development of seasonal forecast models for other weather and climate events.