1 Introduction

Recent storms have produced devastating flooding from hurricane-generated storm surge, wave setup, wave run-up, and rainfall-induced runoff. Emergency managers require fast and accurate estimates of inundation to make critical decisions for evacuations, structure closures, and emergency response before, during, and after storm landfall. Emergency managers’ decisions directly impact the safety of the public and emergency responders. If the decisions are too conservative, unnecessary evacuation can lead to significant socioeconomic costs and complacency. On the other hand, insufficient warning leads to a wide array of social impacts, potentially including widespread casualties.

Key emergency management decisions must be made more than 24 h in advance of storm landfall. The NOAA National Hurricane Center (NHC) provides forecasts of the hurricane track, storm size, and intensity, all of which carry significant uncertainty. Emergency managers routinely rely on the NHC’s storm surge forecasts. These forecasts are made using ensemble runs of the low-fidelity numerical storm surge model SLOSH (Sea, Lake, and Overland Surges from Hurricanes; Jelesnianski et al. 1992). The SLOSH model used by the NHC is computationally efficient and stable. For emergency management decisions on a regional scale, these predictions are essential and extremely useful (Forbes and Rhome 2012). However, the model lacks fidelity because it lacks near-coast resolution, it neglects key processes such as wave forcing, and it uses parameterized wind and pressure fields for forcing (Blain et al. 1994; Resio and Westerink 2008; Kerr et al. 2013). As discussed in Forbes and Rhome (2012), the NHC SLOSH predictions are satisfactory for regional surge prediction considering that average track prediction errors exceed 40 miles at 24 h from landfall and 70 miles at 48 h.

On the other hand, the US Army Corps of Engineers (USACE) Coastal and Hydraulics Laboratory (CHL) has led application of coupled high-fidelity wave and surge modeling with intense development since Hurricane Katrina (e.g., IPET 2007). A coupled ADCIRC (ADvanced CIRCulation model: Luettich et al. 1992; Westerink et al. 1992) for storm surge modeling with WAM (WAve prediction Model: WAMDI Group 1988) and STWAVE (Steady-State Spectral Wave Model: Smith et al. 2001) for wave modeling (FEMA 2007; IPET 2007; LACPR 2009) has been used to produce highly accurate simulations of hurricane inundation. Coupled model validation has been run for Hurricanes Katrina, Gustav, and Ike (IPET 2007; LACPR 2009). The USACE Coastal and Hydraulics Laboratory routinely runs the coupled ADCIRC–STWAVE model known as CSTORM-MS in a forecast mode in order to provide USACE emergency managers with high-fidelity forecasts for operations involving flood gates, pump stations, and similar functions as well as general operational coordination and health and safety functions in coordination with FEMA and state and local governments (http://chl.erdc.usace.army.mil/chl.aspx?p=s&a=Spotlight;12). Dietrich et al. (2013) provide a summary of a similar system. In addition to high fidelity, these models can provide inundation depths and surge, as well as waves, winds, and currents over the grids. Unfortunately, the increased complexity of the high-fidelity models, the additional wave forcing process, and the increased number of computational points produce a large computational demand. Thousands of hours of CPU time are required for each simulation. For flood prediction, CSTORM-MS is usually run with H*Wind wind and pressure fields from NOAA (Powell et al. 2010) or something similar. These wind fields are typically provided well into the NOAA 6 h advisory update period resulting in limited time available from update to update for high-fidelity forecast runs. 
Consequently, even with high-performance computers, multiple simulations of an approaching hurricane for probabilistic analysis are not presently feasible. In order to run multiple realizations or run a new between-update realization given a change in hurricane conditions, a compromise would be required that reduces the fidelity of the CSTORM-MS coupled model. In addition, updated wind and pressure fields would not be available. In summary, during a typical 6 h NOAA hurricane advisory update period, the low-fidelity NHC SLOSH model provides the public and emergency managers with approximate region-scale storm surge estimates, while high-fidelity, high-accuracy models provide a single deterministic estimate. However, no operational system in the US provides high-fidelity estimates of hurricane response, including inundation, rapidly and in a statistical context. This leaves emergency managers in an unenviable position when trying to make critical decisions.

The significant barrier to solving the problem of fast and accurate inundation forecasts is the huge computational effort required to make accurate forecasts. The Surge and Wave Island Modeling Study (SWIMS: Smith et al. 2011, 2012) in the USACE developed a surrogate model to predict peak storm surge and wave height over a gridded domain based on a moving least-square response-surface methodology with a database of high-fidelity simulations (Taflanidis et al. 2012, 2013a, b, 2014). Hsieh and Ratcliff (2010) developed an artificial neural network for prediction of the peak storm surge response for Louisiana. However, the above surrogate approaches are based on a basis set of peak responses and cannot deal with time series forecasting of storm surge and wave height at a specific area. Prediction of the inundation depth hydrograph and associated wave and wind response is essential for emergency managers to understand how flooding will develop and recede as the storm passes. So a time-dependent surrogate model is needed for high-fidelity prediction of hurricane response statistics using machine learning techniques.

Among the machine learning techniques, artificial neural networks have been widely used for time series forecasting of not only storm surge but also waves and hydrological variability. Storm surge has been forecast by an artificial neural network with measured surge data at a single tide gauge and/or meteorological data including surface pressure, wind forcing, and tidal level estimated by a harmonic analysis or numerical simulation. While only measured data are used as input neurons in some studies (Makarynskyy et al. 2004; Rajasekaran et al. 2005; Makarynska and Makarynskyy 2008), both the measured surge data and the meteorological data were used in more cases (Lee 2006; Tseng et al. 2007; Charhate and Deo 2007; Lee 2008; Siek et al. 2008; De Oliveiria et al. 2009; You and Seo 2009; Bajo and Umgiesser 2010; Hsieh and Ratcliff 2010; Chen et al. 2012). The above methods are virtually all site specific and not effective for predicting real-time time-dependent storm surge.

Emergency managers and decision makers need a tool to predict real-time storm surge inundation as a hurricane approaches critical facilities including public facilities, important infrastructure, and residential districts. The term real time is used to describe the very fast prediction of storm surge inundation in advance of storm landfall immediately following the issuance of a NOAA tropical storm advisory. The purpose of this study is to develop a time-dependent surrogate model in order to quickly and accurately predict time series of storm surge and storm surge inundation. The time-dependent surrogate model, based on an artificial neural network trained with a database of high-resolution and high-fidelity numerical model results, will run in a very short execution time, on the order of seconds, on a stand-alone PC. The fast execution time allows real-time predictions for a range of hurricane conditions and tracks that are statistically plausible, and allows probabilistic simulations to evaluate risk and support emergency management decision making.

2 Synthetic hurricanes

2.1 Simulation of synthetic hurricanes

A set of 446 synthetic storms has been simulated by STWAVE + ADCIRC as a coupled high-fidelity numerical hydrodynamic model (FEMA 2007; IPET 2007; LACPR 2009). Here, storms are referred to as synthetic because they are not historical but were defined from a joint probability model. This set of synthetic hurricanes spans and efficiently populates the practical parameter space for the coast of Louisiana. Two sets of 152 high-intensity storms each were selected for eastern and western Louisiana by considering the probable combinations of central pressure, radius of maximum wind speed, forward speed, heading direction, and track. Figure 1 shows the storm tracks for both synthetic and historical storms used in this study. The synthetic storms are defined by five primary tracks and four secondary tracks in eastern Louisiana and a similar number for western Louisiana. Each set of 152 storms includes 50 in Category 3, 52 in Category 4, and 50 in Category 5 on the Saffir-Simpson intensity scale. Furthermore, surge levels are influenced by the storm size as well as the storm intensity (e.g., IPET 2007). Radii of the maximum wind speed for each hurricane category are distributed as follows: 11–35 nautical miles in Category 3, 8–25 nautical miles in Category 4, and 6–21 nautical miles in Category 5. The recurrence interval of the 152 high-intensity storms ranges between 1 in 50 years and 1 in 3,500 years. In addition, 71 low-intensity storms were generated for each of the eastern and western Louisiana regions, yielding a total of 446 storms (2 × 152 + 2 × 71).

Fig. 1 Storm tracks for Louisiana (LACPR 2009). Synthetic storms are shown as colored lines: eastern Louisiana storms in red and western Louisiana storms in blue. Hurricane Katrina is the solid black line, and Hurricane Gustav is the dashed black line

Figure 2 shows the atmospheric–hydrodynamic modeling system with four modeling components and their interaction. Wind and atmospheric pressure were calculated by the planetary boundary layer (PBL) model, and these were used as input for ADCIRC and WAM. The WAM results were used as boundary conditions for STWAVE. The surge levels were then calculated by ADCIRC forced with wind and atmospheric pressure and wind-wave radiation stresses from the coupled STWAVE model including river discharge. This surge model was validated for Hurricanes Katrina and Rita with the actual tide, but the local mean still water level was used for the simulation of the surge level for the synthetic storms. In all of the synthetic storms, a uniform steric water level adjustment of 0.36576 m from terrestrial datum (NAVD88 2004.65) was applied over the simulation domain to account for the seasonal thermal expansion of sea surface and the average offset between local mean sea level (LMSL) and NAVD88 (FEMA 2007). In application or validation, the steric adjustment should be subtracted from the storm surge computed by the surrogate model, and then, the actual tide should be added at a specific area.
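As a compact restatement of the adjustment described above, the water level used in application or validation can be written as follows. The symbols \(\eta_{\text{total}}\), \(\eta_{\text{surrogate}}\), and \(\eta_{\text{tide}}\) are illustrative labels introduced here and do not appear in the original text.

$$\eta_{\text{total}} = \eta_{\text{surrogate}} - 0.36576\;{\text{m}} + \eta_{\text{tide}}$$

Here \(\eta_{\text{surrogate}}\) is the surge computed by the surrogate model (which includes the steric adjustment) and \(\eta_{\text{tide}}\) is the actual tide at the location of interest.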

Fig. 2 Atmospheric–hydrodynamic modeling system (LACPR 2009)

2.2 Selection of forecast points in southern Louisiana

Figure 3 and Table 1 show the selected save points in southern Louisiana for storm surge prediction. These points were selected based on critical components of the flood protection system, namely flood gates (Seabrook, 17th Street Canal, and WCC), the IHNC surge barrier, and potentially vulnerable levee points (i.e., levee junctures), because these areas demand detailed hydrograph response prediction for effective emergency management. Specifically, decision makers and emergency managers can better control flood gates and pump stations based on the predicted inundation level. Two points were also co-located near water level gage stations to assist in forecasting and validation. Furthermore, the points were distributed within the vulnerable area in southern Louisiana.

Fig. 3 The selected 30 points in southern Louisiana

Table 1 Latitude and longitude of the selected points in southern Louisiana

3 Time-dependent surrogate modeling

3.1 Surrogate model based on an artificial neural network

The artificial neural network has been widely applied in complicated engineering problems because this surrogate modeling strategy relatively easily detects complex relationships between inputs and outputs. A neural network consists of a group of numerous interconnected artificial neurons. Figure 4 shows the neural network architecture employed in the present study for predicting surge level with single input, hidden, and output layers. The first network \({\mathbf{n}}_{1}\) is expressed as

$${\mathbf{n}}_{1} = {\mathbf{W}}_{1} \cdot {\mathbf{p}} + {\mathbf{b}}_{1}$$
(1)

where \({\mathbf{p}}(R \times 1)\) is the input with R elements, and \({\mathbf{W}}_{1} (S^{1} \times R)\) and \({\mathbf{b}}_{1} (S^{1} \times 1)\) are the input weight matrix and the bias of the hidden layer with \(S^{1}\) neurons, respectively. The output of the hidden layer \({\mathbf{a}}_{1} = {\mathbf{f}}_{1} ({\mathbf{n}}_{1} )\) is calculated with the first network, and this output is also used as the input for the output layer. The second network \({\mathbf{n}}_{2}\) is expressed similarly:

$${\mathbf{n}}_{2} = {\mathbf{W}}_{2} \cdot {\mathbf{a}}_{1} + {\mathbf{b}}_{2}$$
(2)

where \({\mathbf{W}}_{2} (S^{2} \times S^{1} )\) and \({\mathbf{b}}_{2} (S^{2} \times 1)\) are the weighting factors and bias of the output layer with \(S^{2}\) neurons. The final output \({\mathbf{a}}_{2} = {\mathbf{f}}_{2} ({\mathbf{n}}_{2} )\) is calculated with the second network. In Eqs. (1) and (2), \({\mathbf{f}}_{1}\) and \({\mathbf{f}}_{2}\) are transfer functions in the hidden layer and the output layer, respectively. The neurons in each layer can be various types of functions such as a linear approximation and a classification decision. This process represents a biologically plausible approximation to a real neuron-based system.
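The forward pass in Eqs. (1) and (2) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the study's implementation: the hidden-layer transfer function is assumed here to be a hyperbolic tangent, the output transfer function is taken as linear, and the layer sizes and random weights are hypothetical placeholders.

```python
import numpy as np

def forward(p, W1, b1, W2, b2):
    """Two-layer feedforward pass following Eqs. (1) and (2)."""
    n1 = W1 @ p + b1    # Eq. (1): first network, shape (S1, 1)
    a1 = np.tanh(n1)    # f1: tan-sigmoid hidden transfer function (assumed)
    n2 = W2 @ a1 + b2   # Eq. (2): second network, shape (S2, 1)
    a2 = n2             # f2: linear output transfer function
    return a2

# Hypothetical sizes: R = 6 inputs, S1 = 16 hidden neurons, S2 = 1 output.
rng = np.random.default_rng(0)
R, S1, S2 = 6, 16, 1
W1, b1 = rng.normal(size=(S1, R)), rng.normal(size=(S1, 1))
W2, b2 = rng.normal(size=(S2, S1)), rng.normal(size=(S2, 1))
p = rng.normal(size=(R, 1))   # placeholder input vector
a2 = forward(p, W1, b1, W2, b2)
```

The array shapes mirror the dimensions given with Eqs. (1) and (2), so a shape mismatch in the weights raises an error at the corresponding matrix product.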

Fig. 4 Neural network architecture

Table 2 lists the characteristics of the present artificial neural network model. The network was trained by the Levenberg–Marquardt algorithm (LMA; Levenberg 1944; Marquardt 1963). The LMA numerically solves minimization problems that arise in least-squares curve fitting and nonlinear programming, and it is widely used for generic curve-fitting problems in many software applications. Since the LMA interpolates between the Gauss–Newton algorithm (GNA) and the method of gradient descent, it is more robust than the GNA, which means that a solution is easily found even if the starting point is very far from the final minimum. The backpropagation algorithm used with the LMA is suited to multilayer feedforward networks and is a supervised learning method. The tan-sigmoid and the linear transfer functions are the most commonly used transfer functions for a backpropagation algorithm. When the sigmoid transfer function is used in the output layer of a multilayer network, the outputs are limited to a small range. However, when the linear transfer function is used in the output layer, the outputs can take on any value, so the linear transfer function was used in the present study. The number of hidden layers was based on previous neural network studies in coastal engineering (Mase et al. 1995; Van Gent et al. 2007; Yoon et al. 2013). The number of neurons normally ranges from one to 20, but in this study it varied between 16 and 25 depending on the target performance defined by a specific value of the correlation coefficient. Data were divided by a random process with 70 % for training, 15 % for validation, and 15 % for testing.
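The random 70/15/15 division can be illustrated with a short sketch. It assumes, for illustration only, that the division is made over the 446 storms and that subset sizes are obtained by rounding; the function name `divide_indices` and the seed are hypothetical.

```python
import numpy as np

def divide_indices(n_samples, fractions=(0.70, 0.15, 0.15), seed=None):
    """Randomly split sample indices into training, validation, and test subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)              # random ordering of all samples
    n_train = int(round(fractions[0] * n_samples))
    n_val = int(round(fractions[1] * n_samples))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]                  # remainder goes to testing
    return train, val, test

train_idx, val_idx, test_idx = divide_indices(446, seed=0)
```

Changing the seed produces a different random division, which is how repeated trainings with different data divisions could be generated.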

Table 2 Characteristics of the present artificial neural network model

Table 3 shows an example of a time series simulation of a synthetic storm for the time-dependent surrogate model. Each of the synthetic storm simulations consists of 92 time steps, and the landfall in each storm occurred at the 44th time step. As the time interval of one step is 30 min, there are 21.5 and 24 h before and after the landfall, respectively. In Table 3, \(x_{\text{lon}}\) and \(x_{\text{lat}}\) are longitude and latitude of storm track positions, \(c_{p} ,v_{f} ,\theta\) are, respectively, the central pressure, the moving speed, and the heading direction, and \(R_{p}\) is the radius of exponential pressure profile for the PBL wind and pressure model. \(R_{p}\) is related to the radius of maximum wind speed.
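As a quick check of these durations, counting the 30 min intervals from the first step to landfall and from landfall to the last step gives:

$$(44 - 1) \times 0.5\;{\text{h}} = 21.5\;{\text{h}},\qquad (92 - 44) \times 0.5\;{\text{h}} = 24\;{\text{h}}$$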

Table 3 An example of inputs and target of a synthetic storm (no. 001) for ANN

In the present time-dependent surrogate model, the network is individually trained with the inputs and the target at each time step as follows:

$${\text{net}}^{{t_{i} }} = {\text{train}}\, ({\mathbf{x}}^{{t_{i} }} , \eta^{{t_{i} }} )$$
(3)

where \({\mathbf{x}} = [x_{\text{lon}} ,x_{\text{lat}} ,c_{p} ,v_{f} ,\theta ,R_{p} ]\) is the vector of storm parameters and \(\eta^{{t_{i} }}\) is the target storm surge at each time step \(t_{i}\) (i = 1, 2, …, 92). Figure 5 shows the calculation process for the storm surge prediction. As described in Table 2, the acceptable performance criterion of the initial network is a correlation coefficient of 0.995 computed with all the data, including training, test, and validation data. The initial network is trained against this criterion with 16 neurons in the hidden layer and 20 different random divisions of the data into training, validation, and test subsets. Each training is repeated up to a maximum of 200 epochs. If the trained network does not satisfy the criterion, the number of neurons is increased stepwise with a uniform step size up to 25. If the performance of the network still does not reach the initial criterion, the criterion is reset to the maximum correlation coefficient obtained in the previous 200 trainings (i.e., 20 different data divisions × 10 different numbers of neurons). The network is then trained again with the modified criterion by the same process. Finally, the surge level at each time step is simulated by the trained network with new inputs of the six storm parameters at each time step as follows:

$$\eta_{N}^{{t_{i} }} = {\text{sim}}\, ({\text{net}}^{{t_{i} }}, {\mathbf{x}}_{N}^{{t_{i} }} )$$
(4)

where \(\eta_{N}^{{t_{i} }}\) and \({\mathbf{x}}_{N}^{{t_{i} }}\) are, respectively, the predicted surge level and new inputs at time step \(t_{i}\).
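The per-time-step training and selection logic of Eqs. (3) and (4) can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the trainer `train_stand_in` is a deliberately simple substitute (a random tanh hidden layer with a least-squares linear output layer) and is not the Levenberg–Marquardt algorithm, it has no epochs and ignores the validation and test subsets, and all function names and the synthetic data are hypothetical.

```python
import numpy as np

def train_stand_in(X, y, n_hidden, train_idx, rng):
    """Stand-in trainer (NOT Levenberg-Marquardt): random tanh hidden layer,
    linear output weights fitted by least squares on the training subset."""
    W1 = rng.normal(size=(X.shape[1], n_hidden))
    b1 = rng.normal(size=n_hidden)

    def hidden(X_in):
        # Hidden-layer outputs plus a constant column for the output bias
        return np.column_stack([np.tanh(X_in @ W1 + b1), np.ones(len(X_in))])

    w2, *_ = np.linalg.lstsq(hidden(X[train_idx]), y[train_idx], rcond=None)
    return lambda X_new: hidden(X_new) @ w2

def search(X, y, criterion, rng, neurons=range(16, 26), n_divisions=20):
    """Try 10 neuron counts x 20 random data divisions; stop once cc >= criterion."""
    best_net, best_cc = None, -np.inf
    n = len(y)
    n_train = int(round(0.70 * n))
    for n_hidden in neurons:
        for _ in range(n_divisions):
            train_idx = rng.permutation(n)[:n_train]
            net = train_stand_in(X, y, n_hidden, train_idx, rng)
            cc = np.corrcoef(y, net(X))[0, 1]   # cc evaluated on all the data
            if cc > best_cc:
                best_net, best_cc = net, cc
            if cc >= criterion:
                return net, cc, True
    return best_net, best_cc, False

def train_time_step(X, y, rng, criterion=0.995):
    """Eq. (3) for one time step, with the relaxed-criterion retry."""
    net, cc, ok = search(X, y, criterion, rng)
    if not ok:
        # Reset the criterion to the best cc found so far and repeat the process
        net2, cc2, _ = search(X, y, cc, rng)
        if cc2 >= cc:
            net, cc = net2, cc2
    return net, cc

# Made-up data: 60 storms, 3 time steps, 6 storm parameters per step.
rng = np.random.default_rng(0)
n_storms, n_steps, n_params = 60, 3, 6
X = rng.normal(size=(n_storms, n_steps, n_params))
eta = np.tanh(X[..., 0]) + 0.5 * X[..., 2] * X[..., 5]   # hypothetical target surge

# One network per time step, Eq. (3)
nets = [train_time_step(X[:, t, :], eta[:, t], rng) for t in range(n_steps)]

# Prediction for a new storm, Eq. (4)
x_new = rng.normal(size=(n_steps, n_params))
eta_new = np.array([nets[t][0](x_new[t:t + 1])[0] for t in range(n_steps)])
```

Because each time step has its own network, the predictions at neighboring steps are not tied to each other, which is the source of the oscillation that a subsequent smoothing step would need to address.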

Fig. 5 Flow chart of the storm surge prediction procedure

3.2 Moving average of the surrogate model

Because the relationship between the response at time \(t_{i}\) and that at \(t_{i - 1}\) was not considered in the above calculation process, the predicted storm surge can oscillate discontinuously. To mitigate this problem, a moving average is applied to the storm surge calculated by the previous equation as follows:

$$\eta_{*}^{{t_{i} }} = {\text{filter}} \,(\eta_{N}^{{t_{i} }} , W_{L} )$$
(5)

where \(\eta_{*}^{{t_{i} }}\) is the filtered storm surge at each time step \(t_{i}\), and \(W_{L}\) is the window length in the moving average. This moving average is equivalent to low-pass filtering with the response of the smoothing given by the following equation:

$$\eta_{*}^{{t_{i} }} = \frac{1}{2M + 1}\left\{ {\eta_{N}^{{t_{i + M} }} + \eta_{N}^{{t_{i + M - 1} }} + \cdots + \eta_{N}^{{t_{i - M} }} } \right\}$$
(6)

where \(M\) is the number of neighboring data points on either side of \(\eta_{N}^{{t_{i} }}\), and \(2M + 1\) is the window length.
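A centered moving average of the form in Eq. (6) can be sketched as below. Two points are assumptions made for illustration: near the ends of the series the window is shrunk symmetrically so that it stays inside the data, and a double moving average is interpreted as two successive passes with different window lengths. The function name `moving_average` is hypothetical.

```python
import numpy as np

def moving_average(eta, window_length):
    """Centered moving average, Eq. (6), with window_length = 2M + 1 (odd)."""
    if window_length % 2 != 1:
        raise ValueError("window_length must be odd (2M + 1)")
    eta = np.asarray(eta, dtype=float)
    M = window_length // 2
    n = len(eta)
    out = np.empty(n)
    for i in range(n):
        # Assumption: shrink the window symmetrically near the series ends
        m = min(M, i, n - 1 - i)
        out[i] = eta[i - m:i + m + 1].mean()
    return out

# A single spike is spread over the window and its peak is reduced.
spike = moving_average([0.0, 0.0, 3.0, 0.0, 0.0], 3)

# Double moving average, interpreted here as two successive passes.
raw = np.sin(np.linspace(0.0, np.pi, 92))      # placeholder 92-step series
smoothed = moving_average(moving_average(raw, 5), 11)
```

The spike example shows the trade-off involved in choosing the window length: longer windows suppress oscillation more strongly but also flatten sharp peaks.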

3.3 Performance of the surrogate model

In order to improve the accuracy of the surrogate model, the moving average was applied. It is important for a window length of the moving average to be optimized because the result can be different depending on the window length. Therefore, a sensitivity analysis was performed to determine the optimal window length of the moving average. Of the 30 original save point locations, 25 points trained reasonably well using the raw model output, while 5 had complications with training such as too few storms with inundation. The 5 save points with complicated training (13, 15, 17, 20, and 29) required modifications of the raw model output in order to train well. For the present discussion, these save points are discarded. The focus in this paper is on save point locations 3, 4, 9, and 18 as shown in Table 4. These locations were selected among the 30 locations to facilitate external validations because measured data existed at these locations for Hurricanes Katrina and Gustav. In the next section, this information will be used for the historical hurricane validation. The 4 selected points are representative of the larger group of 25 points that trained well.

Table 4 Sensitivity analysis of the surrogate model at save points 3, 4, 9, and 18

Two primary metrics are used to define surrogate model accuracy: correlation coefficient and mean square error. Here, the correlation coefficient (cc) is defined as a normalized covariance with respect to the standard deviation of X and Y in the following equation.

$${\text{cc}}_{XY} = \frac{{{\text{COV}}(X,Y)}}{{\sigma_{X} \sigma_{Y} }} = \frac{{\sum\nolimits_{i = 1}^{N} {(x_{i} - {\overline{x}})(y_{i} - {\overline{y}})} }}{{\sqrt {\sum\nolimits_{i = 1}^{N} {(x_{i} - {\overline{x}})^{2} } \sum\nolimits_{i = 1}^{N} {(y_{i} - {\overline{y}})^{2} } } }}$$
(7)

where COV and \(\sigma\) indicate the covariance and the standard deviation, respectively, \(x_{i}\) is the target surge, \(y_{i}\) is the predicted surge by the surrogate model at the same moment, N is the number of time steps, and \({\overline{x}}\) and \({\overline{y}}\) are the mean values of target surge and predicted surge. The parameter \(\overline{\text{cc}}\) with overbar, with values given in Table 4, denotes the ensemble average of the correlation coefficients for 446 synthetic storms. The mean square error is expressed as follows:

$${\text{mse}} = \frac{{\sum\nolimits_{i = 1}^{N} {(y_{i} - x_{i} )^{2} } }}{N}$$
(8)

As above, \(\overline{\text{mse}}\) with overbar, with values shown in Table 4, denotes the ensemble average of the mean square errors for 446 synthetic storms. \(\overline{\text{cc}}\) for the 4 select locations without the moving average applied was 0.982 while that for the 25 points was 0.983. \(\overline{\text{mse}}\) for the 4 points was 0.003 while that for the 25 points was 0.004.
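The two metrics in Eqs. (7) and (8), together with their ensemble averages over storms, can be computed as in the following sketch. The function names and the two-storm example values are hypothetical and serve only to exercise the formulas.

```python
import numpy as np

def correlation_coefficient(target, predicted):
    """Eq. (7): normalized covariance of target (x) and predicted (y) surge."""
    x = np.asarray(target, dtype=float)
    y = np.asarray(predicted, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

def mean_square_error(target, predicted):
    """Eq. (8): mean of squared differences over the N time steps."""
    x = np.asarray(target, dtype=float)
    y = np.asarray(predicted, dtype=float)
    return np.mean((y - x) ** 2)

def ensemble_average(metric, targets, predictions):
    """Average of a per-storm metric over all storms (the overbar quantities)."""
    return float(np.mean([metric(x, y) for x, y in zip(targets, predictions)]))

# Made-up time series for two storms.
targets = [np.array([0.0, 1.0, 2.0]), np.array([0.0, 2.0, 1.0])]
predictions = [np.array([0.0, 1.0, 4.0]), np.array([0.0, 2.0, 1.0])]
cc_bar = ensemble_average(correlation_coefficient, targets, predictions)
mse_bar = ensemble_average(mean_square_error, targets, predictions)
```

Note that the correlation coefficient is undefined for a constant series (zero standard deviation), which may matter at locations that stay dry for a given storm.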

A single moving average was used with window lengths of 3, 5, 7, 9, 11, and 13, and a double moving average was also applied with window lengths of (3, 5), (3, 7), (3, 9), (3, 11), (3, 13), (5, 7), (5, 9), (5, 11), (5, 13), (7, 9), (7, 11), (7, 13), (9, 11), (9, 13), and (11, 13). Because the peak storm surge appears near the moment of landfall and the storm landfall occurs at the 44th time step, 92 time steps are divided into before landfall (time steps 1–50) and after landfall (time steps 51–92).

For all 25 points, \(\overline{\text{cc}}\) is significantly higher for all window lengths than without the moving average. This is illustrated in Table 4 for the 4 select points. In order to determine the optimal window length, the save points were ranked based on bands of \(\overline{\text{cc}}\) and \(\overline{\text{mse}}\). Generally, \(\overline{\text{cc}}\) was banded in increments of 0.002, while \(\overline{\text{mse}}\) was banded in increments of 0.0005, and the two ranks were summed. The optimal window sizes were selected as those corresponding to the lowest rank sums. Results were similar for the 25 points and the 4 select points. The accuracy of the predicted surges with window lengths (5, 7), (5, 9), (5, 11), and (7, 9) is better than that with the others. Finally, the window length of (5, 11) was selected in this study because a longer window reduces the oscillation of the predicted surge after landfall better than a shorter one. The final selection was somewhat subjective because the best \(\overline{\text{cc}}\) and \(\overline{\text{mse}}\) metrics varied between pre-landfall, peak surge, and post-landfall. However, \(\overline{\text{cc}}\) and \(\overline{\text{mse}}\) metrics for the best performing window lengths were very similar, so the final choice would not be expected to impact the results significantly and the peak surge was uniformly well predicted.

Figure 6 shows the performance of the developed surrogate model with and without consideration of the moving average at the four select points. The correlation coefficient increases considerably when the moving average is incorporated. The high- and low-intensity storms in eastern Louisiana are numbered from 1 to 152 and 376 to 446, and the high- and low-intensity storms in western Louisiana are numbered from 153 to 304 and 305 to 375, respectively. In general, because the four select save point locations are all located in eastern Louisiana, low correlation coefficients are associated with storms 153 to 375 in western Louisiana. When the correlation coefficient of the surrogate model is less than 0.8, the improvement from the moving average is particularly noticeable, as shown in Fig. 6 and Table 6. Table 5 shows the distribution of the correlation coefficient at the points. The neural network of save point 4 was well trained; there is no case with a correlation coefficient less than 0.9. Because save point 4 is located near the boundary between ocean and land in southern Louisiana, the relationship between storm surge and the hurricane parameters (i.e., track, central pressure deficit, moving speed, heading direction, and radius of maximum wind speed) is relatively strong. However, the correlation coefficients of the networks at save points 3, 9, and 18 near inland lakes (Lakes Pontchartrain and Borgne) are smaller than that of the network at save point 4. Nevertheless, the percentage of storms in the synthetic storm suite with a correlation coefficient exceeding 0.9 is over 95 % at those points. So, the trained neural networks at these points were acceptable for engineering application, as shown in Figs. 7 and 8. In order to improve the neural networks at points 3, 9, and 18, other parameters (e.g., wind speed and wind direction) may be required in the training process. However, if these new parameters are added in the training process, extra operational computational effort would be required because this information is not readily available at the time of the NOAA 6 h tropical storm advisory.

Fig. 6 Correlation coefficients with or without consideration of a moving average as a function of storm number. a Save point 3, b save point 4, c save point 9, d save point 18

Table 5 Distribution of correlation coefficient for the surrogate model at save points 3, 4, 9, and 18

Fig. 7 Performance of the time-dependent surrogate model at save point 4. a Storm 50 (cc = 0.999), b storm 100 (cc = 0.996), c storm 150 (cc = 0.999), d storm 200 (cc = 0.998), e storm 231 (cc = 0.993), f storm 244 (cc = 0.976), g storm 330 (cc = 0.998), h storm 400 (cc = 0.999)

Fig. 8 Performance of the time-dependent surrogate model at save point 18. a Storm 50 (cc = 0.997), b storm 100 (cc = 0.979), c storm 150 (cc = 0.974), d storm 200 (cc = 0.989), e storm 231 (cc = 0.952), f storm 244 (cc = 0.460), g storm 330 (cc = 0.831), h storm 400 (cc = 0.990)

4 Historical hurricane validation

Hurricanes Katrina and Gustav passed over the region of southern Louisiana in 2005 and 2008, respectively. Areas were severely damaged, and high storm surges were recorded during the two hurricanes. In addition, extensive measured data sets, including high water marks and continuous gage data, exist for these storms. So, these storms represent good validation data sets for the time-dependent surrogate model. Among the selected 30 save points given in Table 1, continuous water level measurement stations are located near points 3, 4, 9, and 18, and these provide both tide and water level data during the two hurricanes. For point 18, there is no tide information during Katrina. Furthermore, the time-dependent surrogate model is also validated with the modeled storm surge from Dietrich et al. (2012) at the same points. Dietrich et al. (2013) developed a coupled wave and circulation model using SWAN + ADCIRC, with a high-resolution unstructured mesh in the Gulf of Mexico and southern Louisiana.

The historical hurricane parameters are obtained from the Atlantic hurricane database (HURDAT2; http://www.nhc.noaa.gov/data/#hurdat) and the extended best track from the National Hurricane Center (NHC), as shown in Table 7. The HURDAT2 database does not provide the radius of maximum wind speed, so that value was obtained from the extended best track, which is based on observations. Because the PBL wind and pressure model uses the radius of exponential pressure profile \(R_{p}\) instead of the radius of maximum wind speed \(R_{\hbox{max} }\), the two radii were related through the empirical equation proposed by Cardone (1999)

$$R_{\hbox{max} } = 0.5387 + 0.9524\;R_{p} - 0.00575\;R_{p}^{2} + 1.17 \times 10^{ - 5} \;R_{p}^{3}$$
(9)

where both radii are in nautical miles. For small values of \(R_{p}\) (less than roughly 10 nm), the two radii are roughly equal. For large values of \(R_{p}\) (more than 100 nm), \(R_{\hbox{max} }\) is approximately \(R_{p} /2\). The moving speed and the heading direction of the storm are calculated from the longitude and latitude of the storm track as given in Table 7. The historical hurricane parameters are then interpolated to produce 92 time steps with a time interval of 30 min, and these are used as inputs for the surrogate model at each time step.
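Eq. (9) and the limiting behavior described above can be checked numerically with the sketch below. The inverse function is an assumption added for illustration: when only an observed \(R_{\hbox{max} }\) is available, \(R_{p}\) can be recovered by bisection, which works because the cubic in Eq. (9) is increasing in \(R_{p}\). The function names and the bracketing interval of 0 to 200 nautical miles are hypothetical choices.

```python
def rmax_from_rp(rp):
    """Eq. (9): radius of maximum wind speed from R_p (both in nautical miles)."""
    return 0.5387 + 0.9524 * rp - 0.00575 * rp**2 + 1.17e-5 * rp**3

def rp_from_rmax(rmax, lo=0.0, hi=200.0, tol=1e-9):
    """Invert Eq. (9) by bisection (illustrative; bracket chosen by assumption)."""
    if not rmax_from_rp(lo) <= rmax <= rmax_from_rp(hi):
        raise ValueError("rmax lies outside the bracketed range of Eq. (9)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rmax_from_rp(mid) < rmax:
            lo = mid      # root lies in the upper half
        else:
            hi = mid      # root lies in the lower half
    return 0.5 * (lo + hi)

small = rmax_from_rp(10.0)     # close to R_p for small radii
large = rmax_from_rp(100.0)    # close to R_p / 2 for large radii
rp_back = rp_from_rmax(rmax_from_rp(20.0))
```

Evaluating the polynomial at 10 and 100 nautical miles gives values near 9.5 and 50, consistent with the two limits stated in the text.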

Table 6 Average correlation coefficient with or without consideration of moving average at the four points
Table 7 Historical hurricane parameters

Figure 9 shows a comparison of storm surge among the surrogate model, the measurements, and the high-fidelity numerical model for the two historical hurricanes. The storm surge predicted by the surrogate model is in good agreement with the measured data of Hurricanes Katrina and Gustav and with the high-fidelity model at save point 4 because the neural network was well trained, as shown in Figs. 6b and 7. This point is located at the end of the Mississippi River adjacent to the Gulf of Mexico, so the relationship between the input parameters and storm surge is strong. The peak value of the surrogate model prediction is much closer to the measured peak surge than that of the SWAN + ADCIRC model. However, because the other points are located near inland lakes, i.e., Lake Pontchartrain (points 3 and 18) and Lake Borgne (point 9), the accuracy of the storm surge predicted by the surrogate model at these points is lower than that at save point 4, as shown in Figs. 6a, c, d, and 8. Nevertheless, the predicted storm surge before storm landfall is similar to that of the SWAN + ADCIRC model at points 3 and 9, as shown in Fig. 9. The trend of storm surge after storm landfall at point 9 is also similar to that of SWAN + ADCIRC and the measured data. Although the predicted storm surge at save point 18 for Hurricane Gustav approaches the measured data before storm landfall, the predicted surge at the same save point for Hurricane Katrina is lower than both the SWAN + ADCIRC simulation and the measured data.

Fig. 9 Historical hurricane validation of the time-dependent surrogate model. a Katrina at save point 3, b Gustav at save point 4, c Katrina at save point 4, d Gustav at save point 9, e Katrina at save point 18, f Gustav at save point 18

As the accuracy of the surrogate model is similar to that of a high-fidelity model, i.e., SWAN + ADCIRC or STWAVE + ADCIRC, the present time-dependent surrogate model is reasonable and acceptable for application. The results after storm landfall at save points 3, 9, and 18 along inland lakes are relatively poor compared to those at save point 4 near a boundary between ocean and land. Several factors may explain the lower accuracy of the surrogate model at these points. First, the bathymetry/topography and morphology of those areas are complicated by marsh, inland lakes, and artificial and natural structures. Many of these features are not explicitly resolved in the model mesh. Second, the characteristics of the synthetic storms may differ from those of the historical storms after storm landfall. The post-landfall infilling and decay within the synthetic storms are highly idealized and typically not of particular significance in most modeling studies because of the focus on peak surge. In complicated topography such as exists near save points 3, 9, and 18, it is likely that more detailed characterization of the post-landfall storm characteristics and resulting storm surge will be required to better resolve these areas. Third, while the data set that was used in the training process was based on a half-hour time interval, the historical hurricane parameters were provided at a 6-h time interval. So, the temporal quality of the two data sets was fundamentally different. In order to resolve this problem in future studies, the characteristics of the synthetic storm set should be updated to cover those of the historical storms.

5 Conclusions

In this study, a time-dependent surrogate model that accurately and rapidly predicts storm surge was developed using a basis of 446 synthetic hurricane simulations computed by STWAVE + ADCIRC. The basis data set was constructed specifically to span and optimally sample practical probability space and parameter space. Even though it is difficult to achieve fast and highly accurate simulation of storm surge prediction by a single high-fidelity numerical hydrodynamic model, the present study resolved this problem by using an artificial neural network trained on the 446 storms in the basis data set. The network inputs were the storm parameters track latitude, track longitude, central pressure, moving speed, heading direction, and radius of exponential scaling pressure, and the output was storm surge. The network has a two-layer feedforward architecture and was trained by the backpropagation algorithm to reach a criterion of maximum correlation coefficient between output and target storm surge at each time step. The artificial neural network was shown to model the 446 basis storms at 25 save point locations with high accuracy. The developed surrogate model was validated with two historical hurricanes at four of the 25 save points. While the performance of the present model was excellent at the boundary between ocean and land (i.e., save point 4), the accuracy of the model decreased in the complex inland areas, which are composed of marsh, inland lakes, and artificial and natural structures (i.e., save points 3, 9, and 18). Moreover, at a given location, the accuracy depended on the position of the storm. In general, while the storm surge was accurately predicted before storm landfall and including peak surge, the accuracy decreased after storm landfall. The relatively poor prediction post-landfall was likely due to highly idealized modeling of the post-landfall storm surge in the 446 synthetic storms because the focus of the synthetic storm modeling was the peak storm surge. So, storm infilling and overland drainage were of relatively lower importance. The time-dependent surrogate model based on an artificial neural network was shown to predict storm surge with accuracy similar to that of the basis data set from coupled high-fidelity hydrodynamic modeling.