1 Introduction

Tropical cyclones (TCs) are known across the globe for their devastating and life-threatening nature. Millions of people live along coastal regions and are exposed to the strong winds, heavy precipitation, storm surges, and severe weather associated with TCs. In recent decades, improved track forecasts and warnings, better communication systems, and increased public awareness have reduced the loss of life associated with TCs. The North Indian Ocean (NIO) region, mainly the Bay of Bengal (BoB), is a potentially energetic region for the development of TCs (Bhaskar Rao and Ashok 2001), which are highly variable in movement and intensification (Raghavan and Sen Sarma 2000), and it contributes about 7% of the world’s total TCs. However, the devastation is very high because the shallow waters of the BoB, the low, flat coastal terrain, and the funnel shape of the coastline favor stronger storm surges than in other basins (India Meteorological Department 2011). Accurately predicting the track, intensity, and associated storm surges of these devastating storms at least 48–72 h in advance is therefore a very important task in weather forecasting.

Observations and Numerical Weather Prediction (NWP) models are used worldwide to provide advance information on the intensity and movement of these violent storms. In the recent decade, high-resolution mesoscale models have improved their skill in predicting the track and intensity of storms, especially through better representation of the axisymmetric inner-core structure and evolution of the TC, owing to increases in vertical and horizontal resolution, advanced data assimilation techniques, etc. (Bender et al. 2007; Gentry and Lackmann, 2010; Davis et al. 2010; Gopalakrishnan 2010; Routray et al. 2016; Osuri et al. 2017). Recent studies (Gopalakrishnan et al. 2011, 2013; Tallapragada et al. 2014) showed a reduction in storm size errors and an improvement in storm structure, at the initial time as well as at forecast lead times, in the upgraded version of the high-resolution, triple-nested Hurricane Weather Research and Forecasting model. These improvements are mainly attributed to the higher horizontal resolution and improved vortex initialization, which allow the model to properly resolve the inner core of the TCs. The evolution of forecast intensity is sensitive to the initial inner-core tropospheric moisture and to the initial wind fields, which play an important role in the predictability and forecast skill of the model for TC prediction (Emanuel and Zhang 2016, 2017). The track of a TC is mainly influenced by the large-scale environment, whereas the intensity change of the storm is a complex, multiscale interaction of large and mesovortex scales.

Mesoscale models are nested in global models by specifying lateral boundary conditions (LBCs), which degrade the information transmitted from the global model into the mesoscale model. The forecast skill of the mesoscale model therefore deteriorates because of the LBCs as well as the domain size (Pielke 2002; Wu et al., 2005; Mohanty et al. 2010; Goswami et al. 2010, 2014). The large-scale features simulated by mesoscale models may be better described in the global model (Waldron et al. 1996). A global model covers the whole globe and does not need LBCs; it can therefore properly describe planetary and large-scale features. The large-scale environmental flow (steering and shear flows) plays a major role in the track prediction of TCs (Harr and Elsberry 1991; Carr and Elsberry 2000; Zhang et al. 2013). Hence, global models are usually considered good guidance for TC track forecasts. On the other hand, a global model underpredicts the intensity and overpredicts the size of the storm because of its low horizontal resolution. In recent years, advances in numerical algorithms and increases in computing power have made it possible to run global models at higher horizontal and vertical resolutions and for longer forecast integrations. The leading operational centres around the globe run various global NWP forecasting systems with sophisticated data assimilation techniques and other features that provide skillful guidance for TC forecasts (Heming et al. 1995; Goerss and Jeffries 1994; Kurihara et al. 1998; Vitart et al. 2010; Elsberry et al. 2009, 2010; Gall et al. 2011, etc.). Goerss et al. (2004) examined the TC track forecasting skill of operational global NWP models and their consensus over the eastern North Pacific and found that the forecast skill of the models improved dramatically over the past decade (1992–2002). Numerous changes in the UK Met Office (UKMO) global model, including the introduction of new data sources, a new dynamical core, and 4D variational data assimilation (4DVAR), model physics upgrades, and changes in both horizontal and vertical resolution, have had a positive impact on track forecast errors (reduced by 5%) and significantly reduced the weakening bias of TCs (Heming and Greed 2002; Heming and Goerss 2010).

The track forecast errors of models predicting TCs over the NIO basin have been appreciably reduced in recent years owing to the development of various sources of observations and of data assimilation methods that are effectively incorporated in dynamical global and regional models (Mohapatra et al. 2013b; Osuri et al. 2013, 2015; Routray et al. 2016). Mohapatra et al. (2013a, b) evaluated the TC track and intensity forecasts issued by the India Meteorological Department (IMD) during 2003–2011 (9 years) over the NIO region. They found that the forecast skill for the track and intensity of TCs over the NIO region has improved significantly in recent years. However, the forecast skill is lower for TCs forming over the NIO region than for those over the northwest Pacific and North Atlantic Oceans. Goswami et al. (2006) claim that a general circulation model could give results comparable with those of a mesoscale model, for better accuracy as well as longer range, in tropical cyclone forecasts. Kumkar et al. (2018) found that the GME model forecast underestimates a cyclone’s intensity but can capture the evolution of cyclone intensity, especially its weakening during landfall.

In recent years, operational centers have been moving toward running global NWP forecasting models operationally at the same resolution as the current operational mesoscale models (around 10–12-km horizontal resolution). The weather community should therefore systematically evaluate the global models for high-impact weather events as well as for longer forecast lead times. The National Centre for Medium Range Weather Forecasting (NCMRWF) adopted the UKMO Unified Model (named NCUM) as its operational global forecast system (Davies et al. 2005; Rajagopal et al. 2012). No detailed scientific analysis has yet been carried out to assess the capability of the NCUM for extreme weather events (mainly TCs) over the NIO region. This study is therefore undertaken to evaluate the skill of the NCUM model for track and intensity prediction of TCs on the basis of 43 forecast cases of 11 TCs that formed over the NIO region between 2012 and 2015. During this period, the NCUM underwent a few important changes, including expansion of the data assimilation procedures to include a greater number of satellite radiance measurements and additional surface observations over the Indian region (Observation monitoring report; www.ncmrwf.gov.in/). Section 2 briefly discusses the NCUM modeling system and the configuration used in this study. The methodology used to calculate the track errors of the TCs is presented in Sect. 3. The simulation results are discussed in Sect. 4. Section 5 presents the broad conclusions of the study.

2 Modeling System and Configuration

The NCUM model, based on the Met Office’s UM (Version 7.9), is a non-hydrostatic model with a rotated latitude–longitude horizontal grid and Arakawa-C staggering. It has a terrain-following hybrid-height vertical coordinate with Charney–Phillips staggering. The model was run operationally at NCMRWF at N512L70 resolution (~ 25 km in the horizontal and 70 levels in the vertical); the system includes the atmospheric forecast model, the Observation Processing System (OPS), and 4DVAR. The model was recently upgraded to NCUM N768 (i.e., horizontal resolution ~ 17 km). The OPS prepares quality-controlled observations for 4DVAR in the desired format. The 4DVAR system produces the analysis, which is the best estimate of the atmospheric state and is used as the initial condition for the NCUM forecast model. The main model variables are the three components of wind (westerly, southerly, and vertical), potential temperature, Exner pressure, density, and the components of moisture (vapour, cloud water, and cloud ice). The rotated latitude–longitude grid ensures a quasi-uniform grid length over the whole integration domain. The model solves the deep-atmosphere dynamics using a semi-implicit, semi-Lagrangian, predictor–corrector numerical scheme (Cullen et al. 1997; Davies et al. 2005). The predictor step includes all the processes (including the physics) but approximates some of the (non-linear) terms. The corrector step then updates the approximated terms to achieve a more accurate solution. The model includes physical parameterization schemes for the surface (Essery et al. 2001), boundary layer (Lock et al. 2000; Martin et al. 2000), mixed-phase cloud microphysics (Wilson and Ballard 1999), and convection (Gregory and Rowntree 1990; Grant and Brown 1999), with additional downdraft and momentum transport parameterizations.

In the present study, 11 TCs that formed over the NIO region between 2012 and 2015 are considered. Of these 11 cases, seven formed over the BoB and the remaining four over the Arabian Sea (AS). For each TC, forecasts up to landfall (or dissipation) were initialized at 00 UTC throughout the life of the TC, starting from the genesis stage (formation of the depression). The IMD best tracks of all 11 TCs over the BoB and AS are shown in Fig. 1. Table 1 shows the period of the simulations as well as the time of landfall for each TC. These numerical simulations resulted in 43 prediction cases. The details of the synoptic situations and the best-track data of the 11 TCs were obtained from the IMD Regional Specialized Meteorological Centre (RSMC) reports.

Fig. 1 Observed tracks of tropical cyclones (from IMD) over the Arabian Sea (AS) and Bay of Bengal (BoB)

Table 1 Details of the model simulations and observed landfall time of each TC

3 Method

The NCUM model simulations are evaluated using standard metrics: the Direct Position Error (DPE), the latitudinal (meridional, or DY) and longitudinal (zonal, or DX) errors, and the gain (or loss) in forecast skill. The intensity of the TCs is also evaluated in terms of the Absolute Error (AE) and Root-Mean-Square Error (RMSE) of the Maximum Sustained Wind (MSW), the gain (or loss) in skill of the intensity forecast, and the mean sea-level pressure (MSLP).
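The two intensity metrics named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical and the wind values are invented, not taken from the study.

```python
import math

def absolute_errors(forecast, observed):
    """Absolute error |forecast - observed| at each verification time."""
    return [abs(f - o) for f, o in zip(forecast, observed)]

def rmse(forecast, observed):
    """Root-mean-square error over all verification times."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical MSW values (kt), invented for illustration only.
forecast_msw = [35.0, 45.0, 60.0, 50.0]
observed_msw = [40.0, 55.0, 65.0, 45.0]

ae = absolute_errors(forecast_msw, observed_msw)
mean_ae = sum(ae) / len(ae)
msw_rmse = rmse(forecast_msw, observed_msw)
print(ae, mean_ae, round(msw_rmse, 2))
```

Because the RMSE squares each difference before averaging, it weights large misses more heavily than the mean AE does, so the two values together give a rough idea of how uneven the errors are.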

IMD estimates the MSW of a storm from the 3-min average wind speed, whereas the Joint Typhoon Warning Center (JTWC) reports a 1-min mean wind speed (Levinson et al. 2010; Mohapatra et al. 2012, 2013a). To determine the position of the cyclone, IMD uses only geostationary satellite (INSAT and Kalpana-1) cloud images along with synoptic and radar (when the storm is close to the coast) observations. The JTWC, however, considers all available geostationary and polar satellite observations, including scatterometer data, along with other observations. Knapp and Kruk (2010) analyzed the best-track data sets reported by various operational agencies over different basins and found that the MSW estimates from JTWC and IMD over the NIO are in good agreement for weaker systems. For stronger storms (MSW > 50 kt), however, IMD generally reports lower MSW values (by ~ 22%) than JTWC. The World Meteorological Organization (WMO) has given IMD the responsibility for forecasting and monitoring TCs over the NIO region, and IMD is also the officially designated operational center for India. We therefore use the IMD best-track data for comparison with our results.

The DPE is the distance between the observed and forecast cyclone positions at the forecast verification time. The DX and DY errors provide information about the direction of the error in the zonal and meridional directions, respectively. A positive (negative) DX error indicates that the forecast position of the TC lies to the right (left) of the best-track position, and a positive (negative) DY error indicates that it lies ahead of (behind) the best-track position. However, the DX and DY errors are difficult to interpret in terms of the northward and westward motion of TCs. Therefore, the Cross-Track (CT) and Along-Track (AT) errors are calculated relative to the observed track to determine whether the forecast TC position is to the left (negative CT) or right (positive CT) of the track and whether the forecast TC is slower (negative AT) or faster (positive AT) than observed. The CT error is estimated as the minimum distance from the forecast position of the TC to an interpolated observed track (Fig. 2). The AT error is defined as the great-circle distance between the observed cyclone position and the point of intersection of the cross track with the interpolated observed track (Fig. 2). The average AT error reveals the forecast bias in the AT direction. The CT and AT errors help disaster managers to determine the area and time of evacuation needed in the case of a landfalling TC. The detailed method for calculating these errors can be found in Froude et al. (2007) and Osuri et al. (2013).
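As a rough illustration of the DPE, DX, and DY definitions, the following Python sketch computes them from latitude and longitude pairs using the haversine great-circle formula. The function names, the Earth radius, and the sign convention (DX positive when the forecast lies east of the observation, DY positive when it lies north) are assumptions made for this example; the procedure actually used in the study is the one given in the cited references.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance (km) between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def position_errors(fc_lat, fc_lon, ob_lat, ob_lon):
    """Return (DPE, DX, DY) in km for a forecast and an observed position."""
    dpe = great_circle_km(fc_lat, fc_lon, ob_lat, ob_lon)
    # DX: zonal distance along the observed latitude; positive if forecast is east (assumed).
    dx = math.copysign(great_circle_km(ob_lat, ob_lon, ob_lat, fc_lon), fc_lon - ob_lon)
    # DY: meridional distance along the observed longitude; positive if forecast is north (assumed).
    dy = math.copysign(great_circle_km(ob_lat, ob_lon, fc_lat, ob_lon), fc_lat - ob_lat)
    return dpe, dx, dy

# Hypothetical positions: forecast 1 degree north and 1 degree east of the observation.
dpe, dx, dy = position_errors(16.0, 86.0, 15.0, 85.0)
print(round(dpe, 1), round(dx, 1), round(dy, 1))
```

The DPE is always non-negative, whereas DX and DY keep their sign, which is what allows a mean DX or DY over many cases to reveal a systematic bias.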

Fig. 2 Graphical illustration of DPE, DX, DY, CT, and AT (Routray et al. 2017)
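The CT and AT decomposition can be sketched as a projection of the forecast-minus-observed displacement onto the observed direction of motion. The Python example below is a simplified illustration, not the procedure of Froude et al. (2007) or Osuri et al. (2013): it assumes a locally flat Earth, takes the track direction from two consecutive best-track positions instead of an interpolated track, and uses hypothetical function names and positions.

```python
import math

KM_PER_DEG = 111.195  # approximate km per degree of latitude (assumed spherical Earth)

def to_local_km(lat, lon, ref_lat, ref_lon):
    """East and north offsets (km) of a point from a reference point, flat-Earth approximation."""
    east = (lon - ref_lon) * KM_PER_DEG * math.cos(math.radians(ref_lat))
    north = (lat - ref_lat) * KM_PER_DEG
    return east, north

def cross_along_track(fc, ob_prev, ob_now):
    """Return (CT, AT) in km; each argument is a (lat, lon) tuple in degrees.

    CT > 0: forecast lies to the right of the observed direction of motion.
    AT > 0: forecast lies ahead of (is faster than) the observed position.
    """
    # Unit vector of observed motion, from the previous to the current best-track position.
    tx, ty = to_local_km(ob_now[0], ob_now[1], ob_prev[0], ob_prev[1])
    norm = math.hypot(tx, ty)
    if norm == 0.0:
        raise ValueError("observed track is stationary; direction is undefined")
    ux, uy = tx / norm, ty / norm
    # Displacement of the forecast position from the current observed position.
    ex, ey = to_local_km(fc[0], fc[1], ob_now[0], ob_now[1])
    at = ex * ux + ey * uy   # component along the track
    ct = ex * uy - ey * ux   # component toward the right-hand side of the track
    return ct, at

# Hypothetical case: the observed TC moves due north; the forecast is south-east of it.
ct, at = cross_along_track(fc=(14.5, 85.5), ob_prev=(14.0, 85.0), ob_now=(15.0, 85.0))
print(round(ct, 1), round(at, 1))
```

In this example the forecast position is to the right of the track (positive CT) and behind the observed position (negative AT), which is the combination the text describes as a rightward and slow bias.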

The skill of a model is its performance relative to some reference technique. The climatology and persistence (CLIPER; Pike and Neumann 1987) model is used as the reference for calculating the skill of the NCUM track forecasts. The CLIPER model combines the persistence method with a climatological forecast of TC movement based on data from past years. The persistence forecast is based on the principle that the TC will continue, over the next 72 h at 12-h intervals, along the path it followed during the past 12 h. In the climatology method, it is assumed that the TC will move with the average direction and speed of all past TCs that formed in the same latitude–longitude box. The climatological monthly average direction (degrees) and speed (kt) of TCs were computed on a 2.5° latitude–longitude square grid. The computational procedures of the persistence and climatology methods are described in Bandyopadhyay and Singh (2006). Because the CLIPER model is based entirely on the climatological behavior of TC tracks in a particular basin, it differs from one basin to another (WMO 2009). The DPE of the CLIPER model is calculated using the Neumann and Mandal (1978) algorithm. In this study, we used CLIPER model data collected from IMD. The CLIPER model is used as the reference for measuring track forecast skill at all operational TC forecasting centres (Mohapatra et al. 2013b). The gain in skill (%) with respect to the CLIPER model is defined as follows:

$$\text{Gain in skill} = \frac{\text{CLIPER DPE} - \text{DPE}}{\text{CLIPER DPE}} \times 100.$$

A positive (negative) value of the gain in skill represents a gain (loss) in model skill relative to CLIPER.
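The formula translates directly into code. In the sketch below the function name and the DPE values are hypothetical and chosen only to show one gain and one loss.

```python
def gain_in_skill(cliper_dpe_km, model_dpe_km):
    """Gain in skill (%) of a model over CLIPER; negative values mean a loss in skill."""
    if cliper_dpe_km <= 0:
        raise ValueError("CLIPER DPE must be positive")
    return (cliper_dpe_km - model_dpe_km) / cliper_dpe_km * 100.0

# Hypothetical DPEs (km), not taken from the study.
gain = gain_in_skill(200.0, 130.0)   # model error smaller than CLIPER: positive skill
loss = gain_in_skill(200.0, 250.0)   # model error larger than CLIPER: negative skill
print(round(gain, 1), round(loss, 1))
```

A model whose DPE equals the CLIPER DPE has zero skill, and a perfect forecast (DPE of 0 km) has a skill of 100%.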

The NCEP automated tracking system (Marchok 2002) is used to determine the centre of the TCs. The position of the TC is determined from the spatial distribution of seven low-level parameters: the minimum MSLP, relative vorticity at 850 and 700 hPa, geopotential height at 850 and 700 hPa, and minimum wind speed at 850 and 700 hPa. A detailed explanation of the NCEP TC tracker can also be found in Buckingham et al. (2010).
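The idea of combining centre fixes from several fields can be illustrated with a toy example. The Python sketch below is not the NCEP tracker, which is considerably more elaborate; it simply locates the grid-point extremum of each field and averages the resulting positions. The function names and the tiny 3 × 3 grids are hypothetical, and taking the vorticity fix at the field maximum is an assumption appropriate to cyclonic systems in the Northern Hemisphere.

```python
def extremum_position(field, lats, lons, kind):
    """Return (lat, lon) of the grid-point minimum or maximum of a 2-D field.

    field[i][j] is the value at latitude lats[i] and longitude lons[j].
    """
    pick = min if kind == "min" else max
    points = [(field[i][j], i, j) for i in range(len(lats)) for j in range(len(lons))]
    _, i, j = pick(points)
    return lats[i], lons[j]

def consensus_centre(fields, lats, lons):
    """Average the centre fixes obtained from a list of (field, kind) pairs."""
    fixes = [extremum_position(f, lats, lons, kind) for f, kind in fields]
    lat = sum(p[0] for p in fixes) / len(fixes)
    lon = sum(p[1] for p in fixes) / len(fixes)
    return lat, lon

# Hypothetical grids: MSLP (hPa) and 850-hPa relative vorticity (arbitrary units).
lats = [14.0, 15.0, 16.0]
lons = [84.0, 85.0, 86.0]
mslp = [[1004, 1002, 1003],
        [1001,  996,  999],
        [1003, 1001, 1002]]
vort850 = [[1, 2, 2],
           [2, 5, 7],
           [1, 3, 4]]

centre = consensus_centre([(mslp, "min"), (vort850, "max")], lats, lons)
print(centre)
```

Averaging several fixes makes the estimated centre less sensitive to noise in any single field, which is one motivation for using several parameters instead of the MSLP minimum alone.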

4 Results and Discussion

4.1 Initial Vortex Position Errors

The mean initial vortex position errors of each TC and the ensemble mean of the 43 forecast cases over the NIO region are depicted in Fig. 3. The mean initial vortex position errors vary from 28 to 149 km across all TC cases. The average error over the NIO basin is 75 km, with a 95% Confidence Interval (CI) of 14 km. The mean initial position errors are 80 and 73 km over the BoB and AS, respectively. The 95% CIs of the mean initial vortex position error for the BoB and the AS are 18 and 28 km, respectively. The standard deviations of the initial position errors are 47, 46, and 50 km for the TCs formed over the NIO, BoB, and AS, respectively. The initial position of the storm can be further improved by increasing the model resolution, performing vortex initialization, and assimilating more observations, such as dropwindsonde data and various remotely sensed observations in the vicinity of the cyclone (Xiao et al. 2000; Bhaskar Rao et al. 2009; Aberson et al. 2011; Routray et al. 2014, 2016; Davidson et al. 2014; Osuri et al. 2015).
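The text does not state how the 95% CIs were computed. A minimal Python sketch, assuming a normal approximation (half-width of 1.96 times the standard error of the mean), is given below; the function names and the sample values are hypothetical.

```python
import math
import statistics

def ci95_from_sd(sd, n):
    """Half-width of a 95% CI of the mean, normal approximation (assumed method)."""
    return 1.96 * sd / math.sqrt(n)

def mean_sd_ci95(values):
    """Return (mean, sample standard deviation, 95% CI half-width) of a sample."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample (n - 1) standard deviation
    return mean, sd, ci95_from_sd(sd, len(values))

# Hypothetical initial position errors (km), invented for illustration.
mean, sd, half_width = mean_sd_ci95([30.0, 60.0, 90.0, 120.0])
print(round(mean, 1), round(sd, 1), round(half_width, 1))

# Standard deviation (47 km) and number of cases (43) quoted in the text for the NIO.
print(round(ci95_from_sd(47.0, 43), 1))
```

With the standard deviation of 47 km and the 43 cases quoted for the NIO, this approximation gives a half-width of about 14 km, in line with the reported value. For the smaller BoB and AS samples a Student-t multiplier would give somewhat wider intervals, so the normal approximation should be read as illustrative only.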

Fig. 3 Mean initial vortex position error (km) for each TC and for the overall mean

4.2 Mean Track Forecast Errors Over the NIO

The model-simulated tracks of all the TCs from different initial conditions (ICs), along with the IMD best tracks, are depicted in Fig. 4. The model predicts the tracks of these TCs reasonably well from most of the ICs. The simulated track of TC Helen from the IC of 00 UTC 20 November 2013 shows a larger southward displacement from the observed track (Fig. 4e), whereas the tracks simulated from the other ICs are closer to the observed track. The mean DPE is about 234 km in this particular simulation; however, the DPEs are significantly reduced, by 40%, in the runs from later ICs. In the cases of Lehar (Fig. 4f) and Hudhud (Fig. 4i), the simulated tracks diverge from the observed track for more of the ICs. In the case of TC Madi (Fig. 4g), the looping of the storm is captured by the model simulations. However, for the first three ICs (00 UTC of 7–9 December 2013), the tracks are oriented further south than the observed track. This TC had a unique track, moving nearly northward until 15.7°N and then recurving south-westwards and crossing the Tamil Nadu coast under the influence of the lower- and middle-tropospheric steering ridge (Regional Specialized Meteorological Centre (RSMC), Cyclone Warning Division, India Meteorological Department, India 2014).

Fig. 4 Model-predicted tracks of TCs (a) Murjan; (b) Nilam; (c) Viyaru; (d) Phailin; (e) Helen; (f) Lehar; (g) Madi; (h) Nanauk; (i) Hudhud; (j) Nilofar; (k) Ashobaa at different initial times (shown in Table 1) along with the IMD best track

The mean error statistics (DX, DY, CT, and AT errors; km) of the NCUM and CLIPER models based on 43 IC cases of the NIO TC systems are provided in Table 2. The average DX values from the NCUM model are positive for all forecast periods, suggesting that the average forecast positions of the TCs lie to the right of the best-track positions in all of the simulations. In other words, the NCUM model has a rightward (eastward) bias in the predicted movement of TCs. The CLIPER forecasts mostly place the TC to the left (negative DX) of the observed positions, indicating that CLIPER has a leftward (westward) bias in predicting the TCs. The mean CT errors of the NCUM-predicted tracks are positive for all forecast periods, and the CT error becomes more positive at longer forecast lengths because of the rightward movement of the TCs simulated by the NCUM model. As in the analysis of DY, the mean AT errors of the NCUM-predicted tracks are negative at all forecast lengths, showing that the forecast positions are generally behind the observed positions. The CLIPER model, however, shows track positions ahead of the observed tracks, as its AT errors are positive throughout the forecast length. Thus, the track error is elliptical in nature, with its major axis along the track. In other words, the spread of the forecast track across the observed track is smaller than the spread along it. The 95% CI of the mean error from the NCUM forecasts is smaller than that of the CLIPER track errors at all forecast intervals, which suggests that the NCUM model forecasts are, in general, more consistent at all forecast intervals.

Table 2 Mean DX, DY, CT, and AT errors (km) of predicted tracks up to 72-h forecast length from the NCUM model and from the CLIPER model for TCs over NIO as a whole

Figure 5a shows the mean DPE (km) and forecast skill of the TC forecast positions from the NCUM and CLIPER models over the NIO region. The DPEs from NCUM are clearly smaller than those from CLIPER at all forecast lengths. The DPEs increase gradually with forecast lead time. The 95% CI of the mean DPE of the NCUM (CLIPER) for the 12-, 24-, 48-, and 72-h forecasts is approximately 18 (15), 21 (25), 53 (70), and 61 (118) km, respectively. The average DPE over the NIO is about 118, 179, and 248 km for 24-, 48-, and 72-h forecasts, respectively. These DPEs are smaller than those reported in other studies over the NIO region (Mohapatra et al. 2013b; Osuri et al. 2013) at the corresponding forecast hours. The average DPE over the NIO region in Mohapatra et al. (2013b) is about 140, 262, and 386 km for 24-, 48-, and 72-h forecasts, respectively, during 2003–2011. Similarly, Osuri et al. (2013) evaluated the performance of the WRF model at 27-km horizontal resolution for real-time prediction of TCs over the NIO region using 17 TCs during 2007–11 and found DPEs of about 140, 248, and 375 km for 24-, 48-, and 72-h forecasts, respectively. In the present study, the resolution of the NCUM model is ~ 25 km. The higher accuracy of NCUM may be due to the horizontal resolution and/or the smaller number of TC cases considered during the 3-year period (2012–2014). The performance of the NCUM model needs to be investigated further over a longer period and with more diverse TC cases. The skill of the NCUM track forecast relative to CLIPER is about 35, 53, and 57% for 24-, 48-, and 72-h forecasts, respectively (Fig. 5a). This skill is also higher than that reported in the other study over the NIO region: the average skill in Mohapatra et al. (2013b) during 2003–2011 is about 27, 39, and 50% for the same forecast hours. The ensemble mean of the skill, averaged over all forecast hours, is around 43%.

Fig. 5 Mean DPE (km; shaded bars) with error bars at 95% confidence interval (CI) and gain in skill (%; line) from NCUM and CLIPER

4.3 Mean Track Forecast Errors Over the BoB

The mean DPEs of the NCUM and CLIPER models, along with the gain in skill of NCUM over CLIPER, for the TCs over the BoB are depicted in Fig. 5b. The mean DPE of NCUM is smaller than that of CLIPER at all forecast lengths. The mean DPE increases gradually with forecast period for both models, except at the 72-h forecast from NCUM. A similar pattern was also noted in Mohapatra et al. (2013b). The DPE (skill) is about 105 km (38%), 191 km (48%), and 51 km (91%) for 24-, 48-, and 72-h forecasts, respectively. The skill increases gradually at longer forecast hours. The ensemble mean of the skill is around 47%. The 95% CI of the mean DPE for the 12-, 24-, 48-, and 72-h forecast lengths is 18, 24, 45, and 47 km, respectively. For individual TCs, the error is higher for Nilam, Helen, and Madi. This may be because the mean initial vortex position error is larger for these TCs (Fig. 3).

The mean DX, DY, CT, and AT errors from the NCUM and CLIPER models for the seven BoB TCs, based on 27 IC cases, are provided in Table 3 (similar to Table 2). The mean DX and DY values follow a trend similar to that for all TCs taken together over the NIO. The CT errors are positive at all forecast lengths, which contributes a bias to the right of the observed TC positions over the BoB. The errors from NCUM are smaller than those from CLIPER. The AT errors are negative at all forecast lengths for NCUM but positive for CLIPER. The 95% CIs of the mean DX, DY, CT, and AT errors are smaller than those of the mean errors from the CLIPER model. Hence, the NCUM model predicted the movement of the TCs reasonably well in comparison with the CLIPER model.

Table 3 Same as Table 2, but for BoB TCs

4.4 Mean Track Forecast Errors Over the AS

Figure 5c depicts the mean DPEs of the NCUM and CLIPER models, along with the gain in skill of NCUM over CLIPER, for the four AS TCs based on 16 IC cases. The mean DPEs from both models increase gradually with forecast length. However, the mean DPEs from NCUM are smaller than those from CLIPER at all forecast lengths. The ensemble mean DPEs of NCUM and CLIPER are 160 and 356 km, with 95% CIs of 61 and 100 km, respectively. The skill of the NCUM track forecast with respect to CLIPER is about 30, 60, and 64% at 24-, 48-, and 72-h forecast times, respectively. The figure clearly shows that the skill increases gradually with forecast length. The ensemble mean of the skill is around 46%. The 95% CI of the mean DPEs of the NCUM model is small compared with that of CLIPER, suggesting that the NCUM model is more consistent, as over the BoB and NIO. The DX, DY, CT, and AT errors from the NCUM and CLIPER models for the AS TCs are provided in Table 4. The DX errors are positive at all forecast lengths over the AS, suggesting that the NCUM track positions are also biased toward the right over the AS basin; the CLIPER track positions, however, have leftward biases throughout the forecast length. Similarly, the model track positions lie behind the actual positions, as the DY values are negative at all forecast lengths. A similar pattern is also observed over the NIO and BoB regions. The DY values from CLIPER, however, are positive for all forecast periods. The CT and AT errors from CLIPER are likewise opposite in sign to those from NCUM. The NCUM track positions lie to the right of and behind the observed positions, whereas the CLIPER track positions lie to the left of and ahead of the observations. The 95% CIs of the track errors (Table 4) are considerably higher for CLIPER than for NCUM. For these cases, the AT errors are considerably higher than the CT errors at all forecast lengths. The error is elliptical in shape with its major axis along the track, similar to that over the NIO and BoB regions.

Table 4 Same as Table 2, but for Arabian Sea TCs

4.5 Mean Track Forecast Errors Relative to Intensity of TCs

The performance of the NCUM model is further examined for TC forecasts initialized at different intensity categories, viz., the Depression, Cyclonic Storm (CS), and Severe Cyclonic Storm (SCS) stages. The analysis is carried out with respect to the intensity stage at the time of model initialization. Of the 11 TC cases in the study, five reached the CS stage and the remaining six reached the SCS stage (Table 1). Accordingly, there are 12, 14, and 16 model ICs at the depression, CS, and SCS stages, respectively, at the model initial time. The DPEs decrease gradually throughout the forecast length as the initialization stage moves from depression to SCS. The DPEs improve by about 7–30 km when predictions are initialized at the CS stage rather than the depression stage (not shown). However, a substantial difference in DPEs is found between initializations at the CS and SCS stages (Fig. 6a). The differences increase gradually up to the 60-h forecast, ranging from 22 to 62 km. The DPEs for both the CS and SCS stages also increase with forecast length. The DPEs from the SCS stage vary from 71 to 243 km, compared with 92–252 km from the CS stage. The mean DPEs at all forecast lengths are smaller for forecasts initialized at the SCS stage. The track forecast skill for the SCS stage relative to the CS stage is about 24, 25, 32, 31, 27, and 4% at the different forecast lead times (12–72 h), respectively. The ensemble mean of the skill is around 24%. The mean initial position error of the TCs is about 75 km at the CS stage and 66 km at the SCS stage, with 95% CIs of 26 and 19 km, respectively. The 95% CI is also significantly smaller for the SCS initializations than for the CS initializations. The 95% CIs from the SCS stage range from 14 to 106 km, compared with 33–140 km from the CS stage. It is evident that the TC vortices are well represented at the SCS stage in terms of horizontal and vertical structure, resulting in stronger intensity. Similar results were also observed when the model was initialized at the SCS stage (i.e., stronger storms) over the Atlantic basin (Gopalakrishnan et al. 2012). The gain in skill of the NCUM track prediction with respect to CLIPER is higher (7–54%) when the model is initialized at the SCS stage than at the CS stage (5–52%) at all forecast lengths (Fig. 6b). The skill for the SCS stage is higher than that for the CS stage by 11, 26, and 72% at the 24-, 48-, and 72-h forecast times, respectively. The ensemble mean skill is 22% lower for the CS stage than for the SCS stage. The 95% CIs of the mean CT and AT errors are smaller for forecasts issued at the SCS stage than for those issued at the CS stage. This shows that the stronger the storm, the smaller the range of the track forecast error. Osuri et al. (2013) and Ryerson (2006) demonstrated similar results using the WRF model, showing that the model is more skillful in track prediction when initialized at the SCS stage or greater intensity than at the CS stage or lower, over the NIO and western Pacific Ocean regions, respectively. Mohapatra et al. (2013b) verified the operational TC forecasts of IMD and found that the track predictability is higher for the SCS category than for the CS category.

Fig. 6 (a) Mean DPEs (km) with error bars at 95% CI and (b) gain in skill (%; based on CLIPER) with respect to Cyclonic Storms (CS; maximum sustained wind speed of 34–47 kt) and Severe Cyclonic Storms (SCS; maximum sustained wind speed of 48 kt or more) over the NIO
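The stratification by intensity stage at initialization can be sketched as follows. The CS and SCS wind thresholds are those given in the caption; the label for weaker systems, the function names, and the forecast cases are hypothetical and serve only to show the grouping step.

```python
from collections import defaultdict

def intensity_stage(msw_kt):
    """Classify the stage at initialization from the maximum sustained wind (kt)."""
    if msw_kt >= 48:
        return "SCS"       # Severe Cyclonic Storm: 48 kt or more
    if msw_kt >= 34:
        return "CS"        # Cyclonic Storm: 34-47 kt
    return "below CS"      # depression stages; their thresholds are not given in the text

def mean_dpe_by_stage(cases):
    """cases: iterable of (initial MSW in kt, DPE in km). Returns {stage: mean DPE}."""
    groups = defaultdict(list)
    for msw_kt, dpe_km in cases:
        groups[intensity_stage(msw_kt)].append(dpe_km)
    return {stage: sum(v) / len(v) for stage, v in groups.items()}

# Hypothetical forecast cases: (MSW at initial time, DPE at some fixed lead time).
cases = [(30, 150.0), (40, 120.0), (45, 100.0), (55, 80.0), (70, 90.0)]
means = mean_dpe_by_stage(cases)
print(means)
```

The same grouping step applies to any other stratification of the forecast cases, for example by basin or by track type, by swapping the classification function.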

4.6 Mean Track Forecast Errors for Straight Movers and Recurving TCs

The TCs considered in this study are further classified into two categories on the basis of their characteristic movement (irrespective of the location of genesis): (1) straight-moving TCs, viz., Murjan, Nilam, Phailin, Helen, Lehar, and Hudhud (contributing 20 ICs), and (2) recurving TCs, viz., Viyaru, Madi, Nanauk, Nilofar, and Ashobaa (contributing 23 ICs). The mean track errors with 95% CIs and the associated gain in skill of the model with respect to CLIPER for these two categories at different forecast hours are shown in Fig. 7a and b, respectively. Figure 7a shows that the mean DPEs and 95% CIs of the recurving TCs are higher than those of the straight-moving TCs. The mean DPEs vary from 78 to 270 km over the forecast hours for recurving TCs and from 76 to 219 km for straight-moving TCs. The DPE of the recurving TCs is higher than that of the straight-moving TCs by about 14, 73, and 23 km for 24-, 48-, and 72-h forecasts, respectively. The 95% CIs of the mean DPEs at the initial time and at the 12-, 24-, 48-, and 72-h forecast lengths are 42, 56, 79, 136, and 191 km for recurving TCs and 38, 51, 76, 116, and 166 km for straight-moving TCs, respectively.

Fig. 7 (a) Mean DPEs (km) with error bars at 95% CI and (b) gain in skill (%; based on CLIPER) for straight movers and recurving TCs over the NIO

The skill of the model (Fig. 7b) is gradually increased in both the cases at all forecast lengths. However, the skill is higher in the case of straight-moving (9–63%) TCs as compared to the recurving TCs (7–58%). The skill is more by about 4, 12, 15, and 5% at 24-, 36-, 48-, and 72-h forecasts, respectively, than that of recurving TCs. However, the skill is not much (5%) at 60- and 72-h forecasts. Ramarao et al. (2006) and Mohapatra et al. (2013b) demonstrated that the forecast errors increase for recurving cyclones and also found that the predictability is less (more) in the recurving/looping (straight movers) TCs; these real-time forecast issued by IMD. Osuri et al. (2013) evaluated the performance of the WRF modeling system in real-time prediction of TCs over NIO with 17 TCs during 2007–11; it is clearly seen that the DPE is higher and skill is less in recurving TCs as compared to the straight movers. It is noticed that the skill of the NCUM model for recurving/looping TCs is higher around 14, 14, and 45% in 24-, 48-, and 72-h as compared with the Osuri et al. (2013), respectively. The improvement of the skill may be due to properly represent the large-scale environmental flow by the NCUM model. However, it is found that the DPEs (142, 265, and 242 km) from the NCUM are higher at 24- and 48-h except 72-h forecast, respectively, in case of recurving TCs as compared to the IMD-forecasted DPEs (50, 200 and 300 km) from Mohapatra et al. (2013b). The analysis showed that the NCUM model still has difficulty in predicting recurving TCs and need for further improvement. The synoptic-scale weather systems are very important in influencing the environmental wind. The most prominent ones are upper level trough at mid-latitude, subtropical ridge, monsoon systems, and presence of other cyclonic disturbances nearby (Harr and Elsberry 1995; Chen et al. 2009). 
Sometimes, the NCUM model may not be able to accurately capture the large-scale environmental flows that lead to recurvature of the system. Therefore, it is desirable to incorporate high-resolution data assimilation techniques, increase the model horizontal resolution, upgrade to newer dynamics, and couple the model with ocean models for a better representation of air–sea interaction, in order to improve the predictions of recurving and looping storms in NCUM (Heming 2010; Hanley et al. 2015).
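
The gain in skill discussed above is measured against CLIPER. The following is a minimal sketch of that calculation, assuming the conventional definition of skill as the percentage reduction in DPE relative to the reference forecast. The function name and the sample DPE values are hypothetical and are not taken from this study.

```python
def gain_in_skill(dpe_model_km: float, dpe_cliper_km: float) -> float:
    """Percentage improvement of the model DPE over the CLIPER DPE.

    Assumed definition: 100 * (DPE_CLIPER - DPE_model) / DPE_CLIPER.
    Positive values mean the model beats CLIPER; negative values mean
    the model is worse than the reference.
    """
    if dpe_cliper_km <= 0:
        raise ValueError("CLIPER DPE must be positive")
    return 100.0 * (dpe_cliper_km - dpe_model_km) / dpe_cliper_km


# Hypothetical DPEs (km), for illustration only.
example_skill = gain_in_skill(dpe_model_km=150.0, dpe_cliper_km=300.0)
print(f"gain in skill: {example_skill:.1f}%")
```

Under this definition, a model whose DPE equals the CLIPER DPE has zero skill, so the skill ranges quoted for straight-moving and recurving TCs can be read directly as the fraction of the CLIPER error removed by the model.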

4.7 Landfall Errors of TCs

The mean landfall (LF) position and intensity (AE of 10-m MSW) errors from the model simulations for the 8 of the 11 cases that made landfall are depicted in Fig. 8a and b, respectively. The remaining three TCs, viz., Nanauk, Nilofar, and Ashobaa, dissipated over the ocean without making landfall. In Fig. 8a, the maximum mean LF position error (345 km) is noticed for Viyaru and the second highest (291 km) for Lehar. The minimum LF position errors are 48 km and 79 km for Phailin and Murjan, respectively. Excluding Viyaru and Lehar, the mean LF position errors vary from 48 to 189 km. In Fig. 8b, the mean LF intensity errors are high for Phailin and Hudhud because these two cyclones crossed the coast as very severe cyclonic storms. The other TCs crossed the coast at the lower intensity of cyclonic storm as per the IMD reports, and hence their intensity errors are smaller. Similarly, the % of intensity error is high for these two cyclones and significantly lower for the other cases. Overall, the NCUM model predicted the landfall of the storms reasonably well. Landfall prediction is a challenging task when there is a rapid change in the direction and/or speed of motion near landfall. Such complex situations involve factors such as recurvature, rapid movement of the TC during landfall, and slow movement or stationarity of the TC near the coast. Therefore, the landfall errors are large for a few TC cases in this study even though their initial position errors are small (Fig. 3). 
As per the IMD RSMC reports, these TCs evolved with unique characteristics: recurvature and fast movement (Viyaru); quick changes in speed and direction, a period of near-stationarity, and very fast movement on the day of landfall (Nilam); a change in direction and rapid weakening before landfall (Helen); rapid weakening over the sea from the VSCS stage to a depression before landfall (Lehar); and rapid intensification before landfall (Hudhud). Therefore, these TCs have higher landfall errors on many occasions.

Fig. 8

Mean errors of (a) position (km) and (b) intensity (kts; AE of 10-m MSW) along with % of error (line) at the time of landfall of each TC

The LF position (km) and time (h) errors of each cyclone from the different ICs are further examined (Table 5). The model predicts the LF position and time reasonably well in most TC cases. However, in a few cases the LF time errors reach a maximum of ±18 h, i.e., the simulated landfall is delayed or ahead of the observed one. For Viyaru and Lehar, the LF position errors from the individual ICs are higher (ranging from 213 to 348 km and from 40 to 499 km, respectively) than in the other cases, which is reflected in the mean LF position errors (Fig. 8a).
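
The LF position and time errors described above can be illustrated with a short sketch. It assumes that the position error is the great-circle (haversine) distance between the forecast and observed landfall points on a spherical Earth, and that the time error is forecast minus observed, so that a positive value means a delayed landfall. The function names and the sample coordinates and times are hypothetical and are not values from Table 5.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical Earth is an assumption)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def landfall_errors(fcst_lat, fcst_lon, fcst_time, obs_lat, obs_lon, obs_time):
    """Return (position error in km, time error in h).

    Time error is forecast minus observed: positive means the forecast
    landfall is delayed, negative means it is ahead of the observed one.
    """
    position_error_km = great_circle_km(fcst_lat, fcst_lon, obs_lat, obs_lon)
    time_error_h = (fcst_time - obs_time).total_seconds() / 3600.0
    return position_error_km, time_error_h


# Hypothetical landfall points and times, for illustration only.
pos_err, time_err = landfall_errors(
    19.0, 85.0, datetime(2000, 1, 2, 6),
    19.0, 86.0, datetime(2000, 1, 1, 18),
)
print(f"LF position error: {pos_err:.0f} km, LF time error: {time_err:+.0f} h")
```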

Table 5 Landfall (LF) errors of the TCs from different ICs of NCUM simulations

4.8 Mean Intensity Forecast Errors of TCs

The mean absolute errors (AE) along with the % of errors (line) and the root-mean-square error (RMSE) of the intensity forecasts based on MSW are shown in Fig. 9a and b, respectively. There are 27, 16, and 43 forecast cases for the different forecast lengths (Table 1) for TCs over the BoB, AS, and NIO as a whole, respectively. The AEs (Fig. 9a) are smaller for TCs over the AS than over the BoB at all forecast lengths. The higher accuracy over the AS is mainly attributed to the fact that the TCs formed over the AS are less intense than those over the BoB (Table 1). The AEs for TCs over the NIO as a whole are also smaller than those for the BoB at all forecast times. Similarly, the % of errors is higher over the BoB than over the AS and NIO. The AEs and % of errors increase gradually with forecast length for all the basins because the TCs gradually intensified with lead time. The ensemble mean of the % of error is 38, 36, and 37% in the BoB, AS, and NIO regions, respectively. The 95% CI of the mean AEs is wider for TCs of the AS than for the NIO and BoB at all forecast times, and it widens with increasing forecast length. In general, the TCs formed over the BoB are more intense than those over the AS. It is evident that the model is less skillful in predicting the intensity of intense storms. The RMSE of the intensity forecasts (Fig. 9b) is also generally smaller for TCs of the AS than for those of the BoB and NIO throughout the forecast hours. The RMSEs are higher for TCs of the BoB than for those of the NIO region, and they increase gradually with forecast length.
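
For reference, the three statistics reported in Fig. 9 can be computed as in the sketch below. It assumes that the 95% CI of the mean AE is obtained from a normal approximation (1.96 times the standard error); the paper does not state how the interval was derived here, so that choice, the function name, and the sample MSW values are all hypothetical.

```python
import math
import statistics


def intensity_error_stats(forecast_kt, observed_kt, z=1.96):
    """Return (mean AE, RMSE, 95% CI half-width of the mean AE), all in kt.

    The CI uses a normal approximation, z * s / sqrt(n). This is an
    assumption; a Student-t interval may suit small samples better.
    """
    if len(forecast_kt) != len(observed_kt) or len(forecast_kt) < 2:
        raise ValueError("need two equal-length samples with at least 2 cases")
    errors = [f - o for f, o in zip(forecast_kt, observed_kt)]
    abs_errors = [abs(e) for e in errors]
    mean_ae = statistics.fmean(abs_errors)
    rmse = math.sqrt(statistics.fmean([e * e for e in errors]))
    ci_half_width = z * statistics.stdev(abs_errors) / math.sqrt(len(abs_errors))
    return mean_ae, rmse, ci_half_width


# Hypothetical forecast and observed MSW (kt), for illustration only.
mean_ae, rmse, ci = intensity_error_stats([43.0, 51.0], [40.0, 55.0])
print(f"mean AE = {mean_ae:.2f} kt, RMSE = {rmse:.2f} kt, 95% CI = +/-{ci:.2f} kt")
```

Because the RMSE squares each error before averaging, it is never smaller than the mean AE and is more sensitive to a few large misses, which is consistent with both measures being highest for the more intense BoB storms.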

Fig. 9

(a) Mean absolute error (AE; kts) along with error bars at 95% CI (vertical black line) and % of errors (horizontal line); (b) root-mean-square error (RMSE) of 10-m maximum sustained wind for the TCs over the North Indian Ocean (NIO) as a whole, Bay of Bengal (BoB), and Arabian Sea (AS)

We examined the intensity prediction of the model at different stages of evolution of the TCs (CS and SCS or higher). There are five TCs of CS intensity and six of SCS or higher intensity during 2012–2014 (Table 1). Accordingly, 12, 14, and 16 model ICs correspond to the depression, CS, and SCS or higher intensity stages, respectively, at the model initial time. The mean AE and RMSE of the intensity forecasts from the different stages (CS and SCS or higher) of TCs over the NIO region are depicted in Fig. 10a and b, respectively. The mean AE is significantly higher for the SCS or higher intensity stages than for the CS stage at all forecast lengths (Fig. 10a). The average AEs increase gradually with forecast length for TCs at the CS stage, whereas for TCs at the SCS stage the AEs increase up to the 24-h forecast and decrease thereafter. The AEs from the CS stages vary between 10 and 21 kts, compared with 22–41 kts from the SCS stages. The 95% CI is also significantly narrower for the CS initializations than for the SCS initializations: the 95% CIs vary from 3 to 17 kts for the CS stages, compared with 9–20 kts for the SCS stages. We further examined whether the NCUM model under- or overestimates the intensity at different stages of evolution of the TCs. The NCUM model underestimates (overestimates) the intensity of the TCs in about 67% (33%), 79% (21%), and 87% (13%) of cases at the depression, CS, and SCS stages, respectively. Thus, the share of underestimated (overestimated) intensity forecasts gradually increases (decreases) with increasing TC intensity. Hence, the NCUM model does not accurately predict the intensity at the various stages of evolution of the TCs. Similarly, the RMSE is also higher for the SCS stage than for the CS stage at all forecast lengths (Fig. 10b). 
In recent years, adequate progress has been made in TC track prediction by advanced NWP models; however, skillful intensity forecasting still remains a challenge for forecasters (Elsberry et al. 2007; Mohapatra et al. 2013a). The low skill of intensity prediction is mainly due to the complexity of the TC intensification process, which involves scale interactions between the environment, the storm, and convection, and to the fact that a low-resolution model does not properly resolve the storm structures.
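
The shares of under- and overestimated forecasts quoted above can be counted as in the following sketch. It assumes that a forecast is classed as underestimated when the forecast MSW is below the observed MSW, and that exact matches count toward neither share. The function name and the sample values are hypothetical.

```python
def bias_shares(forecast_kt, observed_kt):
    """Return (% underestimated, % overestimated) of intensity forecasts.

    A forecast is underestimated when forecast MSW < observed MSW.
    Exact matches count toward neither share (an assumption).
    """
    if len(forecast_kt) != len(observed_kt) or not forecast_kt:
        raise ValueError("need two non-empty samples of equal length")
    n = len(forecast_kt)
    under = sum(1 for f, o in zip(forecast_kt, observed_kt) if f < o)
    over = sum(1 for f, o in zip(forecast_kt, observed_kt) if f > o)
    return 100.0 * under / n, 100.0 * over / n


# Hypothetical forecast and observed MSW (kt), for illustration only.
under_pct, over_pct = bias_shares([40.0, 50.0, 60.0], [45.0, 55.0, 55.0])
print(f"underestimated: {under_pct:.0f}%, overestimated: {over_pct:.0f}%")
```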

Fig. 10

(a) Mean AE (kts) with error bars at 95% CI and (b) RMSE of 10-m maximum sustained wind with respect to CS and SCS over the NIO

The mean AEs with 95% CI and the associated RMSE of the intensity forecasts for straight-moving and recurving/looping TCs over the NIO are illustrated in Fig. 11a and b, respectively. There are six straight-moving and five recurving/looping TCs. The number of TC intensity forecasts issued for the different forecast lengths (Table 1) is 20 for straight-moving and 23 for recurving/looping TCs. The AEs and RMSEs are clearly higher for the straight-moving TCs than for the recurving/looping TCs, and the errors increase gradually with forecast length. The 95% CI associated with the recurving TCs is narrower than that of the straight-moving TCs at all forecast hours. The straight movers are more intense than the recurving/looping TCs (Table 1); therefore, the intensity errors are larger for straight movers than for recurving/looping TCs. Similar results have also been demonstrated by Mohapatra et al. (2013a).

Fig. 11

Same as Fig. 10 but for recurving and straight mover TCs over the NIO

5 Conclusion

The overall performance of the NCUM global model with horizontal resolution of 25 km is assessed for the track and intensity prediction of TCs over the NIO as well as individual TC basins of NIO (BoB and AS). This evaluation is carried out on the basis of forecast cases of 11 TCs which formed during the period 2012–15. The broad conclusions drawn from the present study can be summarized as follows.

The overall performance of the NCUM model in predicting TCs over the NIO basin is reasonably good, as it demonstrates good skill irrespective of the region of formation, nature of movement, and intensity. The mean initial vortex position errors of eight of the 11 storms over the NIO region range from 28 to 92 km, which is considerably smaller than for the other TCs (96–149 km). The mean errors are approximately 80 and 73 km for the TCs formed over the BoB and AS, respectively.

The NCUM model predicts the tracks of the TCs reasonably well for most of the ICs at different stages of the storms. However, the predictability of the NCUM is relatively lower for the tracks of recurving/looping TCs. The mean DPEs over the NIO vary from 97 to 248 km for 12–72-h forecast lengths. The skill of the track forecast is about 35, 53, and 57% for 24-, 48-, and 72-h forecasts, respectively. The ensemble mean of the skill, averaged over all forecast hours, is around 43%.

The NCUM model is more skillful for track prediction when it is initialized at the SCS stage rather than at the CS or depression stage. Therefore, the DPEs are smaller in the runs from higher-intensity ICs in each TC case. The model predicts the LF position and time reasonably well in most cases.

The NCUM results show that the average forecast tracks lie to the right (eastwards bias) of the best-track position in all simulations for the TCs which formed over NIO as well as BoB and AS, as the CT or DX values are positive for all the forecast periods. The analysis of AT errors revealed that the model forecasted positions are biased to the south of (behind) the observed positions. It is evident that the NCUM forecasts are slower relative to the actual translation speed of the system for all forecast lengths, and the NCUM model predicts a delayed landfall. The magnitude of CT errors is less in comparison with AT errors in the NCUM model. Hence, the NCUM model is more accurate in predicting TC landfall location than landfall time.

The 95% CI of the mean error from the NCUM forecasts is narrower than that of the CLIPER track errors at all forecast intervals. This suggests that the NCUM model forecasts are, in general, more consistent across forecast intervals.

The DPEs gradually decrease throughout the forecast length as the initialization stage advances from depression to SCS. There is an improvement of about 7–30 km in DPEs when predictions are carried out from the CS stage in comparison with those from the depression stage. The mean DPEs for all forecast lengths are smallest for forecasts initialized at the SCS stage. The 95% CI is also significantly narrower for the SCS initializations than for the CS initializations. The 95% CI of the mean CT and AT errors is likewise narrower for forecasts issued at the SCS stage than for those generated at the CS stage. This shows that the stronger the storm, the narrower the range of the track forecast error.

The mean DPEs and the associated 95% CI are higher for the recurving/looping TCs than for the straight-moving TCs. The mean DPEs vary from 78 to 270 km for recurving/looping TCs and from 76 to 219 km for straight-moving TCs throughout the forecast hours. The gain in skill of the model increases gradually with forecast length in both categories. However, the gain in skill is higher for straight-moving TCs (9–63%) than for recurving/looping TCs (7–58%).

The mean AEs and % of errors increase gradually with forecast length for all the basins because the TCs gradually intensified with lead time. The ensemble mean of the % of error is 38, 36, and 37% in the BoB, AS, and NIO regions, respectively. The intensity errors for TCs of SCS or higher intensity are higher than those for CS TCs at all forecast lengths. The intensity errors are also higher for the straight movers than for the recurving/looping TCs. In general, the straight movers and the TCs forming over the BoB are more intense than the recurving/looping TCs and those of the AS basin, respectively. It is evident that the NCUM model is less skillful in predicting the intensity of intense storms.

The NCUM model underestimates (overestimates) the intensity of the TCs in about 67% (33%), 79% (21%), and 87% (13%) of cases at the depression, CS, and SCS stages, respectively. The share of underestimated (overestimated) intensity forecasts gradually increases (decreases) with increasing TC intensity. Hence, the NCUM model does not accurately predict the intensity at the various stages of evolution of the TCs.

The model shows low forecast skill for intensity, perhaps because of limited understanding of the complexity of the TC intensification process, which involves scale interactions between the environment, storm, and convection. The intensity forecast skill of the model can be improved further by utilizing new satellite and dropwindsonde observations with sophisticated high-resolution data assimilation techniques such as the Ensemble Kalman Filter (EnKF) and hybrid methods, as well as ocean–atmosphere coupling for a better representation of heat, moisture, and momentum exchanges (Aberson and Franklin, 1999; Aksoy et al. 2013; Routray et al. 2016) in the model.
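
The cross-track (CT) and along-track (AT) errors referred to in the conclusions above can be illustrated with a short sketch. It assumes a local flat-Earth approximation around the observed position, takes positive CT to mean that the forecast lies to the right of the observed track (as stated above), and takes negative AT to mean that the forecast lies behind the observed position. The function name, argument layout, and sample positions are hypothetical and do not reproduce the procedure used in this study.

```python
import math

KM_PER_DEG = 6371.0 * math.pi / 180.0  # km per degree of latitude (spherical Earth)


def track_error_components(obs_prev, obs_now, fcst_now):
    """Decompose a forecast position error into DX, DY, CT, and AT (km).

    Each argument is a (lat, lon) pair in degrees. obs_prev and obs_now are
    consecutive observed positions that define the direction of motion.
    Uses a local flat-Earth approximation centred on obs_now.
    """
    lat0, lon0 = obs_now
    cos_lat = math.cos(math.radians(lat0))

    def to_xy(point):
        # East and north offsets (km) of a point relative to obs_now.
        lat, lon = point
        return (lon - lon0) * KM_PER_DEG * cos_lat, (lat - lat0) * KM_PER_DEG

    dx, dy = to_xy(fcst_now)           # forecast minus observed (east, north)
    px, py = to_xy(obs_prev)
    mx, my = -px, -py                  # observed motion vector
    speed = math.hypot(mx, my)
    if speed == 0:
        raise ValueError("observed storm is stationary; track direction undefined")
    ux, uy = mx / speed, my / speed    # unit vector along the observed track
    at = dx * ux + dy * uy             # > 0 ahead of, < 0 behind the observed position
    ct = dx * uy - dy * ux             # > 0 right of, < 0 left of the observed track
    return dx, dy, ct, at


# Hypothetical positions: storm moving due north, forecast lagging to the south-east.
dx, dy, ct, at = track_error_components((10.0, 80.0), (11.0, 80.0), (10.5, 80.5))
print(f"DX = {dx:.1f} km, DY = {dy:.1f} km, CT = {ct:.1f} km, AT = {at:.1f} km")
```

In this example the forecast position is east of and behind a northward-moving storm, so CT is positive and AT is negative, the same combination of rightward and slow biases reported for the NCUM forecasts.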