Abstract
We explored the distributional changes in tsunami height along the eastern coast of the Korean Peninsula resulting from virtual and historical tsunami earthquakes. The results confirm significant distributional changes in tsunami height depending on the location and magnitude of earthquakes. We further developed a statistical model to jointly analyse tsunami heights from multiple events, considering the functional relationships; we estimated parameters conveying earthquake characteristics in a Weibull distribution, all within a Bayesian regression framework. We found the proposed model effective and informative for the estimation of tsunami hazard analysis from an earthquake of a given magnitude at a particular location. Specifically, several applications presented in this study showed that the proposed Bayesian approach has the advantage of conveying the uncertainty of the parameter estimates and its substantial effect on estimating tsunami risk.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
A tsunami can be described as a series of sea waves associated with earthquakes, underwater volcanic eruptions, and changes in fault stability (Cho 1995). Tsunami waves are distinct from general currents or ordinary wind-driven sea waves due to their longer wavelength, which indicates that the waves can carry more energy, leading to potentially large wave heights along the shoreline. Although the direct effects of tsunami events are likely to be limited to coastal areas, their destructive power can result in deadly and devastating floods in entire ocean basins (Sohn et al. 2009; Mimura et al. 2011).
Deterministic numerical models that specify the earthquake source can provide scenarios associated with tsunami run-up, propagation, and inundation at particular locations of interest along the coast. The numerical approach is conducive in understanding the spatiotemporal dynamics of tsunami waves and their effects on coastal areas (Imamura et al. 1988; Cho and Yoon 1998; Yoon 2002; Cho et al. 2007). However, substantial amounts of bathymetric, coastline, and earthquake data are required to estimate tsunami heights, so tsunami hazard analyses are typically applied to only a single tsunami event (Choi et al. 2007; Kim et al. 2012; Cho et al. 2013). Moreover, numerical models do not provide the probability distribution information needed for the likelihood estimation step of hazard analysis.
Estimating the probability of tsunami occurrence is a key step in tsunami hazard assessment, in that the probability distribution provides an efficient way of describing tsunami heights. Evaluating the likelihood of a large tsunami carries significance for the design of coastal structures and disaster mitigation strategies, such as a tsunami warning system and evacuation routes. Since Van Dorn (1965) applied a statistical method to estimate the exceedance probability associated with tsunami heights on the coast of the Hawaiian Islands, a statistical approach to tsunami events has been of great interest from scientific and practical viewpoints (Liu et al. 1995a; Choi et al. 2002, 2005, 2006, 2011; Grezio et al. 2010; Cho et al. 2013; Yadav et al. 2013b). Historical tsunami observations are ideally suited to construct a hazard curve, however, a sufficiently long record is not readily available in most cases. In certain cases, it may be appropriate to consider performing numerical simulations for all relevant tsunami sources and combining the results to develop the hazard curve (Geist and Lynett 2014). In these contexts, a formal probabilistic tsunami hazard analysis (PTHA) has been employed over the last decade to combine the likelihood of a tsunami appearing with the distribution of tsunami heights, exceeding a certain threshold for a given tsunami (McCloskey et al. 2008; Geist and Parsons 2009; González et al. 2009; Grezio et al. 2010; Kumar 2012a, b; Mitsoudis et al. 2012; Sørensen et al. 2012; Lorito et al. 2015). Most PTHA analysis are directly based on the estimation of seismic return periods or the likelihood of a given seismic intensity, based on Gutenberg-Richter relationships. Numerical simulation for each tsunami source and its parameters is then performed to evaluate tsunami risk at a particular location. In cases when several seismic sources of information are available, these are combined in a logic-tree to yield PTHA. More generally, probability distributions of tsunami heights are separately estimated for all relevant tsunami sources, and the logic-tree is typically introduced to deal with uncertainty in the model parameters and processes due to limited data and knowledge (Geist and Lynett 2014). The logic-tree PTHA can combine different sources of information to produce a probabilistic quantity, which can be used as a basis for a risk assessment. However, the logic-tree PTHA is constructed for each of the tsunami sources in a discrete manner, which may lead to less efficient simulation and complexity in the modelling stage. In this article, we suggest a framework for simultaneously specifying potential tsunami sources and probability distributions of tsunami heights in a continuous manner.
In the PTHA, numerous statistical models based on different probability distributions have been proposed to estimate the return periods (or likelihoods) of tsunamis in different regions of the world (Orfanogiannaki and Papadopoulos 2007; Geist and Parsons 2008; Hasumi et al. 2009; Grezio et al. 2010; Yadav et al. 2013b; Clare et al. 2014; Knighton and Bastidas 2015; Omira et al. 2015; Shin et al. 2015). On the other hand, relatively little attention has been given to the use of statistical models for tsunami heights (Pelinovsky et al. 1997a; Choi et al. 2002; Geist 2002; Kulikov et al. 2005; Choi et al. 2006, 2011; Kim et al. 2014).
In the PTHA, the probability distributions for the tsunami magnitude and fault location (including the slip distribution) are key elements for describing tsunami risk. Moreover, nearshore wave heights are substantially influenced by bathymetry and resonance effects in bays and harbours (Rabinovich 1997; Geist and Parsons 2006). Among the various factors influencing tsunami generation, in this study, the location and magnitude of an earthquake are primarily considered as key factors in identifying link functions of the parameters in the probability distribution. On the other hand, other earthquake factors such as fault parameters and slip distributions are mainly taken to be constant, based on the virtual earthquakes established by the Korea Peninsula Energy Development Organization (KEDO 1999). Although the PTHA has been an active research field over the last decade, a limited number of studies have been conducted to simultaneously estimate the probability distributions for multiple tsunami events and to explicitly utilize their relationships with earthquake characteristics to produce rapid estimates of the tsunami height distribution along a coast. In addition, the estimation of uncertainty in the parameters of distributions used in tsunami hazard assessments is not often addressed, and it needs to be taken into consideration to provide practical guidance for decision-making. Therefore, formal hazard analyses that rely on probability distributions are of limited use in formulating risk management plans.
Given that background, we explore the following questions:
-
1.
Can the probability distributions of tsunami heights from multiple tsunami events be simultaneously estimated, and can their parameters be functionalized in a regression framework with both the locations and magnitudes of earthquakes?
-
2.
How can the uncertainty in the identified functions of the parameters be quantified within a Bayesian framework? What are advantages and potential limitations of the proposed Bayesian model for quantifying the tsunami hazard?
We here develop a Bayesian model for tsunami heights to investigate those questions in a statistical framework with numerically generated time series along the eastern coast of the Korean Peninsula. Following the brief introduction provided in this section, we present an overview of key data in Sect. 2. The numerical scheme for tsunami modelling and our Bayesian approach to parameter estimation are described in Sect. 3. We summarize our results and discussion in Sect. 4. Finally, we provide our conclusions and suggestions for future work in Sect. 5.
2 Study area and numerical simulation
Concerns on seismic risk in Korea had not been as high as those in Japan. However, a recent earthquake measured a magnitude of 5.8 on the Richter scale, which is the largest recorded in the Korean Peninsula since 1978. Therefore, researchers have shown great interest in examining how seismic activity is potentially related to hazards such as tsunamis and landslides. It is difficult to assess the comprehensive effect of a tsunami on coastal areas because of insufficient historic tsunami data. Only about 20 tsunamis resulting from earthquakes have been recorded and studied in Korea (Pelinovsky et al. 1997a; Choi et al. 2006; Tanioka et al. 2006).
2.1 Study area
The study area includes the eastern coast of the Korean Peninsula, the East Sea, and Japan, as shown in Fig. 1. We performed numerical simulations of the wave propagation with a Boussinesq model for 30 virtual tsunamis to obtain the tsunami height along the eastern coast of the Korean Peninsula (rectangular boxes in Fig. 1). Three historical tsunami events affected the eastern coast: in Japan in 1964 (Niigata tsunami), in 1983 (Central East Sea tsunami), and 1993 (Hokkaido tsunami). In all three cases, substantial property damage and loss of life were officially reported. Since the tsunami in 1993, a tsunami warning system has become a regular part of risk mitigation work along the eastern coast of the Korean Peninsula.
2.2 Numerical simulation of tsunami
Because our aim is to investigate the underlying distributions of tsunami heights based on numerical simulation results, we next provide a brief summary of the theoretical background and numerical simulation procedure for tsunami propagation.
The Boussinesq equations are generally used in numerical models for the simulation of tsunami propagation in shallow seas. In this study, we mainly used a tsunami wave propagation model developed by Cho et al. (2007) to simulate the tsunami heights. The modelling framework is based on the dispersion–correction and leap-frog schemes (Imamura et al. 1988; Cho and Yoon 1998), and the modelling scheme we used allows us to efficiently control the spatiotemporal resolution and provides more practical solutions than conventional algorithms. The utility and accuracy of the model were assessed and verified with virtual and historical events, and the enhanced accuracy in the tsunami run-up estimation was confirmed by Sohn et al. (2009). For more details about the governing equations and physical parameterization, see (Liu et al. 1995b), and for the numerical algorithms, see (Cho 1995; Cho et al. 2007; Sohn et al. 2009).
In this study, we considered 30 virtual earthquakes at different locations and magnitudes (\(\varvec{M}_{\varvec{w}}\)) to investigate distributional changes in tsunami heights on the eastern coast of the Korean Peninsula. In the 1900s, there were four tsunamis that occurred on the east coast of Korea. The most vulnerable tsunami event was the Akita earthquake that occurred in 1983 on the west side of Japan. At this time, a maximum wave height was recorded at about 4.2 m in the Imwon harbour in Korea. In this study, only the tsunami sources in that area were considered for the tsunami propagation analysis. The locations of the virtual earthquakes were established by the Korea Peninsula Energy Development Organization (KEDO 1999) as a set of plausible scenarios for treatment as boundary conditions for free surface displacement. Moreover, we use data from the three historical tsunamis, induced by the Niigata Earthquake \(\text{(}\varvec{M}_{\varvec{w}} :\,7.5)\) in 1964, Central East Sea Earthquake \((\varvec{M}_{\varvec{w}} :\,7.2)\) in 1983, and Hokkaido Earthquake \((\varvec{M}_{\varvec{w}} :\,7.8)\) in 1993, to validate our model. Thus, we consider 30 virtual and 3 historical tsunamis. The virtual tsunamis occur in 10 locations at three different magnitudes: 7.1, 7.7, and 8.3. The earthquake locations for the 10 virtual tsunamis are areas of initial free surface displacement, and their fault parameters are summarized and displayed in Table 1 and Fig. 1, respectively.
In Table 1, \(\varvec{S}\) is the depth of the fault plane, \(\varvec{\theta}\) is the strike angle, \(\varvec{\omega}\) is the dip angle of the fault, \(\varvec{\zeta}\) is the slip angle of the fault, \(\varvec{L}\) is the length of the fault, \(\varvec{W}\) is the width of the fault, \(\varvec{D}\) is the displacement of the fault, and \(\varvec{M}_{\varvec{w}}\) is the scale of the earthquake.
A more realistic representation of bathymetry was found to be crucial for simulating certain features of tsunami propagation, and thus the grid size of the finite difference should to be small enough to represent local bathymetric features. However, for a large domain including the East Sea, the use of such a small grid size would not be possible in terms of simulating tsunami propagation (Kim et al. 2014; Sohn et al. 2009). In these contexts, we used a nested multi-grid approach in which the spatial resolution gradually increased from the open sea toward the East Sea in order to better simulate tsunami wave propagation and inundation near the shore. Four computational domains were used in the numerical analysis, labelled A–D in Fig. 1. Region A used the free transmission condition, and the other regions used the dynamic linking method as a boundary condition for the open sea. Fully reflected conditions for regions A to D are given in Table 2. The maximum tsunami heights (H) under the 30 virtual tsunami scenarios considered in this study are illustrated along with the grid index in Fig. 2. The maximum tsunami heights are extracted from each grid along the eastern coast of the Korean Peninsula. The maximum tsunami height at the coastline ranges broadly from 0.1 to 9.9 m, with a tendency to decrease in height as the maximum height occurs further from the location of earthquake, which generally agrees with (Cho et al. 2013; Kim et al. 2014). On the figure, the tsunami heights under the earthquake scenarios #8–10, #18–20 and #28–30 are exceptionally lower than those under the other scenarios. This phenomenon arises mainly due to the Russian Far East coast, which decreases the wave heights of earthquake-induced tsunamis. The spatial patterns of the maximum tsunami heights were largely similar across the different scenarios.
3 Statistical methods for modelling tsunami heights
3.1 Overview of probability distributions for tsunami heights
The log-normality of the tsunami heights was theoretically represented by (Go 1987, 1997) and was further validated by other researcher (Choi et al. 2002). Several studies have suggested that the tsunami heights can be adequately described by the log-normal distribution (Van Dorn 1965; Kajiura 1983; Go 1987, 1997; Choi et al. 2006). On the other hand, other studies have shown that the underlying distribution of tsunami heights is found to be different from the log-normal distribution (Mazova et al. 1989; Choi et al. 1994; Pelinovsky et al. 1997a, b; Kim et al. 2014). The log-normality of the tsunami height along the eastern coast of Korean Peninsula has been extensively investigated on the length of the coastal line (Kim et al. 2014). It was found that the log-normality assumption may be inappropriate as the length of the coastal line increases. In this perspective, we reviewed different types of probability distributions—Weibull, normal, log-normal, log-logistic, logistic, inverse Gaussian, gamma, generalized extreme value, and exponential–to represent tsunami heights. Among them, the Weibull distribution was identified as the best-fit, using the BIC (Bayesian information criterion) as shown in Table S1. The distribution with the lowest BIC is preferred for model selection. For the eastern coast of Korean Peninsula, the selected distributions exhibit similarities over the all scenarios. The results aligning with those reported from previous study (Kim et al. 2014), suggest that both the characteristics of the undersea earthquakes and the bathymetry throughout its propagation path are equally effective in terms of characterizing the relative magnitude of the tsunami height.
The Weibull distribution (Weibull 1951) is a continuous probability distribution, a special case of the generalized extreme value distribution (GEV). The Weibull distribution has been successfully applied in many fields, including hydrologic engineering (Singh 1987; Vogel and Kroll 1989; Wilks 1989), wind engineering (Justus et al. 1978; Seguro and Lambert 2000; Pishgar-Komleh et al. 2015), earthquake modelling (Hagiwara 1974; Hasumi et al. 2009; Pasari and Dikshit 2014) and tsunami hazard modelling (Muraleedharan et al. 2006; Yadav et al. 2013a; Fukutani et al. 2016). The probability distribution and cumulative distribution function for the Weibull distribution can be written as follows:
where \(\varvec{\lambda}\) is the rate parameter, \(\varvec{\nu}\) is the shape parameter of the distribution, and H is the maximum tsunami height.
The shape of density function given in Eq. ( 1 ) changes significantly with the value of shape parameter \(\varvec{\nu}\); the skewness depends only on the shape parameter. The mean and variance of the Weibull distribution with the rate and shape parameter can be expressed as:
where, \({\varvec{\Gamma}}\) is the gamma function.
We used a Bayesian approach to estimate the parameters and their uncertainty. The posterior distribution \(p(\Theta |{\mathbf{H}})\) of the parameter vector Θ, is given by Bayes theorem:
where \({\varvec{\Theta}}\) is a set of parameters (λ and ν) of the distribution to be fitted, \(p({\mathbf{H}}|{\varvec{\Theta}}\)) is the likelihood function, and \(p({\varvec{\Theta}}\)) is the prior distribution.
The random variables, \({\mathbf{H}}\), can be regarded as exchangeable if their joint distribution is invariant under permutations of the variables. More specifically, exchangeability (or similarity) is determined based on if a given statistical property holds for every finite subset of the random variables for all permutations. The exchangeability condition from a Bayesian perspective is comparable to the independent identically distributed (iid) condition in frequentist theory. The assumption that a sequence of random variables is exchangeable allows us to infer the existence of unobserved subsets of the sequence that were observed within an inductive statistical paradigm.
Conjugate distributions are probability distributions whose prior and posterior distributions are in the same family. In particular, conjugate distributions are favourable for computational reasons. When the shape parameter is unknown, it is known that the Weibull distribution does not have a continuous conjugate joint prior distribution. In this case, the same gamma prior distributions can be assumed for both the shape and rate parameters (Berger and Sun 1993; Kundu and Joarder 2006). In these contexts, the prior distributions for the parameters \(\varvec{\nu}\) and \(\varvec{\lambda}\) were given vague gamma priors, i.e. gamma priors of (0.1, 0.1), indicating a mean of 1 and a variance of 10. The joint posterior distribution of the parameters for individual cases can be estimated by combining the prior distributions and the likelihood function as follows:
where N is the number of grids for the wave heights.
3.2 Integrated statistical model for multiple tsunami events
We further explored an integrated model for jointly analysing the probability distributions of tsunami heights in multiple tsunami events. More specifically, we investigated functional aspects in the estimation of distributional parameters for tsunami height, which might be functionally or physically related to the location and magnitude of an earthquake, in a Bayesian regression framework. Thus, the parameters \(\varvec{\nu}\) and \(\varvec{\lambda}\) in Eq. (6) become functions of the location and magnitude of earthquakes, as \(\varvec{F}_{\varvec{\nu}}\) and \(\varvec{F}_{\varvec{\lambda}}\). More specifically, the dependences of the parameters \(\varvec{\nu}\) and \(\varvec{\lambda}\) on the predictors X are specified via a link function, which is assumed to be related in a nonlinear way to the response’s distribution. The joint posterior distribution for multiple tsunami events can be formally reformulated as follows:
where c is the case number, and \({\mathbf{X}}\) is a vector of independent variables (the latitude, longitude and magnitude of earthquakes). Note that a nonlinear relationship between the parameters and predictors is described by the terms \({\mathbf{\beta X}}\) and \({\mathbf{\delta X}}\), which are composed of low-order polynomial and power functions. The functional forms of the two parameters are provided in the following section. The terms α and γ are constants and the parameters \({\varvec{\upbeta}}\) and \({\varvec{\updelta}}\) are the 4 × 1 vectors consisting of regression coefficients. It can be seen that there is enough information, with about 14,000 maximum tsunami heights (\({\mathbf{H}}\)) corresponding to 30 tsunami scenarios for the model proposed here; these data are used to estimate the twelve desired parameters in Eq. (7). Hence, non-informative prior distributions for the regression parameters are assigned, as suggested in the literatures (Gelman 2006; Gelman et al. 2014). In other words, the regression coefficients are assumed to be Gaussian with zero-mean and precision 10–4, and prior distributions for the shape and rate parameters (\(\theta_{\nu } \,\,{\text{and}}\,\, \theta_{\lambda }\)) are assumed to be gamma distributions.
The posterior distribution of the model parameters \(\Theta \left( {\alpha ,{\varvec{\upbeta}},\gamma , {\varvec{\updelta}},\varvec{ }\theta_{\nu } ,\theta_{\lambda } } \right)\) is simultaneously estimated using the MCMC (Markov Chain Monte Carlo) method, specifically the Gibbs sampler (Gelman and Hill 2006). Please refer to Gilks et al. (1995) for further information on the construction of the Gibbs sampler for Bayesian MCMC.
4 Results and discussion
4.1 Distributional changes in tsunami heights
In the Bayesian framework, parameters are treated as random variables conditional on evidence inferred from the data. The Bayesian model is specified by a prior distribution over a set of the parameters and a likelihood, which results in a posterior distribution through Bayesian updating for Eq. (6). The posterior medians with 95% credible intervals for the parameters of a Weibull distribution based on 30 virtual earthquakes are summarized in Table S2. The credible intervals that are reported in Table S2 are based on independently generated chains of parameter estimates and thus normality is not assumed. The estimates with a narrow uncertainty bound relative to their medians are statistically significant.
Tsunami height depends on various factors, of which earthquake magnitude and distance from the epicentre are considered the most important. As a way of delineating distributional changes in tsunami height, we explored functional relationships between the estimated parameters and the earthquake location and magnitude. As illustrated in Fig. 3, clear patterns in the parameters (\(\upnu \,\,{\text{and}}\,\,\uplambda\)) across locations and magnitudes are evident in tsunami heights. The rate parameter \(\uplambda\) shows a concave upward curve corresponding to the longitude, with higher values along the edges around 139°E due to the long distance from the earthquakes. For the magnitude, a similar profile is obtained, but the lower rate parameter corresponds to the higher magnitude. For the most part, patterns in the rate parameter with latitude are similar to those with longitude. There are no significant dissimilarities between the rate and shape parameter \(\upnu\) regarding their associations with the location and magnitude of earthquakes, except for smaller variation in the shape parameter. However, the shape parameter forms a concave downward curve corresponding to longitude. The shape of the Weibull distribution changes significantly with the shape parameter (ν). For ν > 1, as the shape parameter increases, the density function is monotonically increased until it reaches the mode, at which it begins to decrease. It is interesting to note that the variance of the Weibull distribution tends to decrease as the value of the shape parameter increases. Moreover, the skewness relies only on the shape parameter. The relationships identified in this study appear to be physically interpretable. The concave relationships between parameters and earthquake attributes (e.g., location and magnitude) reflect the fact that the distance from the earthquake epicentre affects the overall characteristics of the tsunami height. Specifically, the statistical moments (e.g., mean, variance, and skewness) associated with tsunami height all become higher both as the distance from the earthquake epicentre decreases and as the magnitude becomes higher. As illustrated in Fig. 4, the functional forms of the mean and variance of the Weibull distribution exhibit largely similar patterns of association with the Weibull parameters. The maximum mean tsunami height is obtained around 138.5°E and 40°N, as well as the minimum (or maximum) values for the Weibull parameters. The relationships identified in this study appear to be physically interpretable. Generally, the statistical moments (e.g., mean, variance, and skewness) associated with tsunami height become higher as the distance from the earthquake epicentre decreases and at higher magnitude. We will further explore the implications of those patterns to increase the practical use of the information within a Bayesian regression framework.
4.2 Functional model for estimating parameters
We next present a functional model for the statistical analysis of tsunami height, simulated from numerical modelling. In our study, we developed a Bayesian generalized linear regression (GLM) model for the estimation of parameters in a Weibull distribution by allowing the tsunami heights to be a function of earthquake characteristics through a link function. The GLM can be regarded as a generalization of an ordinary regression that assumes that the dependent variables have non-Gaussian distributions. We use a stepwise regression, an iterative procedure for constructing a model, to identify the best combination sets of the individual explanatory variables (latitude, longitude, and magnitude) and to minimize the difference between the observed and modelled data. The identified functional forms of the two parameters, ν and λ, can be written as follows:
where Xlon, Xlat, and Xint are the longitudes, latitudes, and magnitudes of the earthquakes, respectively. Again, note that subscript “c” denotes the number of cases (i.e. 30 virtual tsunami events).
Substituting Eq. (8) into (7) allows simultaneous estimation of the joint posterior distribution of the model parameters \({\varvec{\Theta}}\) for multiple tsunami events. The estimated parameters and their uncertainties are summarized in Table 3. The estimates with stable posterior means and narrow 95% credible intervals can be considered statistically significant. Figure 5a compares the jointly estimated parameters and their uncertainty in the Weibull distribution across cases, using Eq. (8), with the individual estimates of the parameters across the cases, using Eq. (6). The posterior median values of the jointly estimated parameters are almost identical to the individual estimates, within the 95% credible bounds for most cases. We further evaluated the degree of similarity between them using statistics of efficiency, such as the correlation coefficient, Nash–Sutcliffe model efficiency coefficient, and index of agreement. For more details on efficiency statistics, please see Legates and McCabe (1999). There is good agreement among the parameters, with correlation measures over 0.9 for the entire parameter range, as summarized in Table S3. Additionally, we calculated the mean and variance using the theoretically derived moment equations (Eqs. 3, 4) along with the estimated parameters, and compared the results with the observed mean and variance. A scatter plot of the observed (abscissa) and estimated (ordinate) moments (i.e. mean and variance) for all cases is illustrated in Fig. 5b. The near-perfect match along the reference line indicates that the integrated model can reproduce the statistical moments associated with tsunami heights for all of the cases at the different magnitudes and locations. Consequently, the results demonstrate that the proposed model can be effective and informative about tsunami height from an earthquake of a given magnitude at a particular location.
For further validation, we used the location and magnitude of three historical tsunami events to infer the parameters of the Weibull distribution through the proposed Bayesian GLM framework. We sampled the parameters from their fully conditional posterior distributions, which are derived from Eqs. (7)–(8), and compared those results with the independent estimates. The set of boxplots in Fig. 6 illustrates the uncertainty bound for the rate and shape parameters of the historical tsunamis in 1964, 1983, and 1993. Most of the true values (i.e., independent estimates) fall within the interquartile range of the simulated values, suggesting that the set of parameters obtained through the proposed Bayesian GLM offers accuracy comparable to the independent estimates. However, the true value of the shape parameter ν for the Niigata Earthquake in 1964 clearly falls outside the range of the simulated values. This difference could be because the Niigata Earthquake occurred distant from the earthquakes we used in fitting the model. On the other hand, the overall range of the shape parameter is narrower than that of the rate parameter, so the extent of the deviation will not affect the tsunami height predictions. Note that different virtual cases covering a wide range of tsunami mechanisms related to magnitudes and distances from the epicentre need to be explored to better understand the distributional changes in tsunami height, globally as well as locally.
4.3 Inundation probability estimation
The failure probability of a system can be defined as the probability that the loading condition exceeds its respective threshold (i.e. resistance). In this study, we used a rather simple definition for the estimation of overall inundation probability for the entire coastal area with respect to the tsunami heights as the probability that the tsunami height (i.e. loading) will exceed the ground elevation (i.e. resistance). A conceptual illustration of the estimation of failure probability using two normal distributions for the loading and resistance condition is represented in Fig. 7. Intuitively, the failure probability is the cumulative probability of the overlap between the loading and the resistance density function. Thus, a mathematical illustration of the failure probability can be formulated as follows:
where, l, r and pf are loading, resistance condition and failure probability, respectively.
Hence, in our case, the failure probabilities for all tsunami heights are then determined using following integral representation (Ang and Tang 1984). Specifically, the integration is performed over the failure area (i.e. r − l ≤ 0) to estimate failure probability, as illustrated in Fig. 7.
In order to explore loading-resistance inference, the probability distributions of the loading (fl) and resistance (fr) need to be estimated. In this manner, we used the joint posterior distributions of two parameters in Weibull distribution, estimated from the functional form as represented in Sect. 4.2. For the distribution of resistance (i.e. ground elevation), fr, the log-normal distribution was considered as the best-fit, given by the negative log-likelihood and BIC value. For presentation purposes, we illustrated the probability density functions of the tsunami height and ground elevation for 6 scenarios (i.e. #4, 8, 14, 18, 28 and 30), as shown in Fig. 8. The grey-shaded band represents the 95% Bayesian credible interval. The failure probability corresponds to the cyan-shaded area. It is clearly shown that the cyan-shaded area for the case of magnitude Mw-8.3 (#14 and #18) is larger than the case of magnitude Mw-7.1 (#4 and #8) and Mw-7.7 (#24 and #28). A direct numerical integration over the failure region was then performed to estimate the failure probability. The failure probability estimates and their credible intervals for 30 virtual tsunamis are presented in Fig. 9. The credible interval for the failure probability was obtained by repeatedly integrating over the failure domains, corresponding to the uncertainty in the probability distributions of tsunami heights. In Fig. 9, the lower and upper edges of the box indicate the 25th and 75th percentiles (i.e., interquartile range) of the failure probability, respectively. The line in the middle represents the median value. On the other hand, the vertical lines extending from the edge of the box correspond to values that are no greater than 1.5 times the interquartile range. How the failure probability and its uncertainty range differ substantially among different scenarios, suggests that the estimation of tsunami hazard is significantly sensitive to the loading conditions. The failure probabilities for the case of magnitude Mw-8.3 are notably higher compared to the case of magnitude Mw-7.1 and Mw-7.7, especially, over the range from latitude 38°– 42° along the fault zone. The results confirm that the proposed modelling scheme can translate the uncertainty in the model parameters into uncertainty in the estimation of tsunami hazard, which, thus, would offer an improved strategy for tsunami hazard mitigation program.
5 Concluding remarks
In this study, we investigated distributional changes in tsunami heights on the eastern coast of the Korean Peninsula using virtual earthquakes. We further examined a possible association between the distribution parameter for tsunami height and earthquake location and magnitude by exploring a functional relationship within a Bayesian generalized linear regression framework. We focused on developing a practical statistical tool that will allow rapid evaluation of tsunami hazard using a regression analysis with a small number of predictors. Our primary results are summarized as follows.
-
1.
We confirmed significant distributional changes in tsunami height depending on earthquake location and magnitude in the East Sea. The rate parameter has a concave upward (or downward) trend along the longitude and latitude. We also identified an increased pattern in the parameters as magnitude increased. In general, the statistical moments for the tsunami heights become higher as the distance from the earthquake epicentre decreases as well as at higher magnitudes.
-
2.
We developed a Bayesian GLM model to jointly analyse tsunami height in multiple events. The proposed model explicitly considers functional relationships with earthquake characteristics through a link function for the estimation of parameters in a Weibull distribution. The results show that the proposed model can practically and effectively estimate the tsunami hazard from an earthquake of a given magnitude at a particular location. Specifically, the correlation coefficients between the true and estimated values for both rate and shape parameters were over 0.9. In addition, as an experimental study, we applied the Bayesian GLM model to historical tsunami events to estimate the distributional parameters. That study generally confirmed that the proposed model effectively estimates parameters with a small number of predictors.
-
3.
The joint posterior distribution was used to measure the failure region against the ground elevation for each scenario. Then, the failure probability was computed by directly integrating probability over the identified failure region. Consequently, the failure probability (or region) differed significantly from scenario to scenario, with implication that the failure probability estimation is rather sensitive to the loading scenarios. In this regard, several applications presented in this study showed that the proposed Bayesian approach has the advantage of conveying the uncertainty of the parameter estimates and its substantial effect on modelling results. Especially, the proposed model can effectively translate the uncertainty in the model parameter into the uncertainty of the estimated tsunami hazard.
-
4.
On the other hand, it should be noted that the proposed model showed limitations in accurately estimating the shape parameter for earthquakes separated in space from the virtual earthquakes used to fit the model (i.e., Niigata Earthquake in 1964). Furthermore, the proposed model does not consider other earthquake factors, such as fault parameters. Thus, the proposed model might not be relevant to earthquakes significantly different from those modelled in this study.
Future work could focus on further exploring other earthquake factors and integrating the proposed model into a tsunami hazard analysis tool to handle the lack of adequate tsunami records in coastal areas.
References
Ang AHS, Tang WH (1984) Probability concepts in engineering planning and design, vol 1. Wiley, New York
Berger JO, Sun D (1993) Bayesian analysis for the poly-Weibull distribution. J Am Stat Assoc 88:1412–1418
Cho Y-S (1995) Numerical simulations of tsunami propagation and run-up. Cornell University, Ithaca
Cho Y-S, Yoon SB (1998) A modified leap-frog scheme for linear shallow-water equations. Coast Eng J 40:191–205
Cho Y-S, Sohn D-H, Lee SO (2007) Practical modified scheme of linear shallow-water equations for distant propagation of tsunamis. Ocean Eng 34:1769–1777
Cho Y-S, Kim YC, Kim D (2013) On the spatial pattern of the distribution of the tsunami run-up heights. Stoch Environ Res Risk Assess 27:1333–1346
Choi BH, Woo SB, Pelinovsky E (1994) A numerical simulation of the East Sea tsunami. J Korean Soc Coast Ocean Eng 6:404–412
Choi BH, Pelinovsky E, Ryabov I, Hong SJ (2002) Distribution functions of tsunami wave heights. Nat Hazards 25:1–21
Choi BH, Pelinovsky E, Lee HJ, Woo SB (2005) Estimates of tsunami risk zones on the coasts adjacent to the East (Japan) sea based on the synthetic catalogue. Nat Hazards 36:355–381
Choi BH, Hong SJ, Pelinovsky E (2006) Distribution of runup heights of the December 26, 2004 tsunami in the Indian Ocean. Geophys Res Lett 33:L13601. https://doi.org/10.1029/2006GL025867
Choi BH, Kim DC, Pelinovsky E, Woo SB (2007) Three-dimensional simulation of tsunami run-up around conical island. Coast Eng 54:618–629
Choi BH, Min B, Pelinovsky E, Tsuji Y, Kim K (2012) Comparable analysis of the distribution functions of runup heights of the 1896, 1933 and 2011 Japanese tsunamis in the Sanriku area. Nat Hazards Earth Syst Sci 12:1463–1467
Clare MA, Talling PJ, Challenor P, Malgesini G, Hunt J (2014) Distal turbidites reveal a common distribution for large (> 0.1 km3) submarine landslide recurrence. Geology 42:263–266
Fukutani Y, Anawat S, Imamura F (2016) Uncertainty in tsunami wave heights and arrival times caused by the rupture velocity in the strike direction of large earthquakes. Nat Hazards 80:1749–1782
Geist EL (2002) Complex earthquake rupture and local tsunamis. J Geophys Res Solid Earth 107(B5). https://doi.org/10.1029/2000JB000139
Geist EL, Lynett PJ (2014) Source processes for the probabilistic assessment of tsunami hazards. Oceanography 27:86–93
Geist EL, Parsons T (2006) Probabilistic analysis of tsunami hazards. Nat Hazards 37:277–314
Geist EL, Parsons T (2008) Distribution of tsunami interevent times. Geophys Res Lett 35:L02612. https://doi.org/10.1029/2007GL032690
Geist EL, Parsons T (2009) Assessment of source probabilities for potential tsunamis affecting the U.S Atlantic Coast. Mar Geol 264:98–108. https://doi.org/10.1016/j.margeo.2008.08.005
Gelman A (2006) Prior distributions for variance parameters in hierarchical models (comment on article by Browne and Draper). Bayesian Anal 1:515–534
Gelman A, Hill J (2006) Data analysis using regression and multilevel/hierarchical models. Cambridge University Press, Cambridge
Gelman A, Carlin JB, Stern HS, Rubin DB (2014) Bayesian data analysis, vol 2. Chapman & Hall/CRC, Boca Raton
Gilks WR, Best N, Tan K (1995) Adaptive rejection Metropolis sampling within Gibbs sampling. Appl Stat 44:455–472
Go CN (1987) Statistical properties of tsunami runup heights at the coast of Kuril Island and Japan. Institute of Marine Geology and Geophysics, Sakhalin
Go CN (1997) Statistical distribution of the tsunami heights along the coast Tsunami and accompanied phenomena. Sakhalin 7:73–79
González FI et al (2009) Probabilistic tsunami hazard assessment at seaside, Oregon, for near- and far-field seismic sources. J Geophys Res Oceans. https://doi.org/10.1029/2008JC005132
Grezio A, Marzocchi W, Sandri L, Gasparini P (2010) A Bayesian procedure for probabilistic tsunami hazard assessment. Nat Hazards 53:159–174
Hagiwara Y (1974) Probability of earthquake occurrence as obtained from a Weibull distribution analysis of crustal strain. Tectonophysics 23:313–318
Hasumi T, Akimoto T, Aizawa Y (2009) The Weibull–log Weibull distribution for interoccurrence times of earthquakes. Phys A 388:491–498
Imamura F, Shuto N, Goto C (1988) Numerical simulation of the transoceanic propagation of tsunamis. Paper presented at the sixth congress of the Asian and Pacific regional division international association hydraulic research, Kyoto, Japan
Justus C, Hargraves W, Mikhail A, Graber D (1978) Methods for estimating wind speed frequency distributions. J Appl Meteorol 17:350–353
Kajiura K (1983) Some statistics related to observed tsunami heights along the coast of Japan. Tsunamis–their Science and Engineering, Terra Pub, Tokyo, pp 131–145
KEDO (1999) Estimation of tsunami height for KEDO LWR project. Korea Power Engineering Company Inc, Gimcheon
Kim YC, Choi M, Cho Y-S (2012) Tsunami hazard area predicted by probability distribution tendency. J Coast Res 28:1020–1031
Kim D, Kim BJ, Lee S-O, Cho Y-S (2014) Best-fit distribution and log-normality for tsunami heights along coastal lines. Stoch Environ Res Risk Assess 28:881–893
Knighton J, Bastidas LA (2015) A proposed probabilistic seismic tsunami hazard analysis methodology. Nat Hazards 78:699–723
Kulikov EA, Rabinovich AB, Thomson RE (2005) Estimation of tsunami risk for the coasts of Peru and Northern Chile. Nat Hazards 35:185–209. https://doi.org/10.1007/s11069-004-4809-3
Kumar TS, Nayak S, Kumar CP, Yadav RBS, Ajay Kumar B, Sunanda MV, Devi EU, Kumar NK, Kishore SA, Shenoi SSC (2012a) Successful monitoring of the 11 April 2012 tsunami off the coast of Sumatra by Indian tsunami early warning centre. Curr Sci 102(11):1519–1526
Kumar TS et al (2012b) Performance of the tsunami forecast system for the Indian Ocean. Curr Sci 102:110–114
Kundu D, Joarder A (2006) Analysis of Type-II progressively hybrid censored data. Comput Stat Data Anal 50:2509–2528
Legates DR, McCabe GJ (1999) Evaluating the use of “goodness-of-fit” measures in hydrologic and hydroclimatic model validation. Water Resour Res 35(1):233–241
Liu PL-F, Cho Y-S, Briggs MJ, Kanoglu U, Synolakis CE (1995a) Runup of solitary waves on a circular island. J Fluid Mech 302:259–285
Liu PL-F, Cho Y-S, Yoon S, Seo S (1995b) Numerical simulations of the 1960 Chilean tsunami propagation and inundation at Hilo, Hawaii. In: Tsunami: progress in prediction, disaster prevention and warning. Springer, pp 99–115
Lorito S, Selva J, Basili R, Romano F, Tiberti M, Piatanesi A (2015) Probabilistic hazard for seismically induced tsunamis: accuracy and feasibility of inundation maps. Geophys J Int 200:574–588
Mazova R, Pelinovsky E, Poplavsky A (1989) Physical interpretation of tsunami height repeatability law. Vulcanol Seismol 8:94–101
McCloskey J et al (2008) Tsunami threat in the Indian Ocean from a future megathrust earthquake west of Sumatra. Earth Planet Sci Lett 265:61–81. https://doi.org/10.1016/j.epsl.2007.09.034
Mimura N, Yasuhara K, Kawagoe S, Yokoki H, Kazama S (2011) Damage from the Great East Japan Earthquake and Tsunami-a quick report. Mitig Adapt Strat Glob Change 16:803–818
Mitsoudis D, Flouri E, Chrysoulakis N, Kamarianakis Y, Okal E, Synolakis C (2012) Tsunami hazard in the southeast Aegean Sea. Coast Eng 60:136–148
Muraleedharan G, Sinha M, Rao AD, Murty TS (2006) Statistical simulation of boxing day tsunami of the indian ocean and a predictive equation for beach run up heights based on work-energy theorem. Mar Geodesy 29:223–231. https://doi.org/10.1080/01490410600939215
Omira R, Baptista M, Matias L (2015) Probabilistic tsunami hazard in the Northeast Atlantic from near-and far-field tectonic sources. Pure Appl Geophys 172:901–920
Orfanogiannaki K, Papadopoulos GA (2007) conditional probability approach of the assessment of tsunami potential: application in three tsunamigenic regions of the Pacific Ocean. In: Satake K, Okal EA, Borrero JC (eds) Tsunami and its hazards in the Indian and Pacific Oceans. Birkhäuser Basel, Basel, pp 593–603. https://doi.org/10.1007/978-3-7643-8364-0_18
Pasari S, Dikshit O (2014) Impact of three-parameter Weibull models in probabilistic assessment of earthquake hazards. Pure Appl Geophys 171:1251–1281
Pelinovsky E, Yuliadi D, Prasetya G, Hidayat R (1997a) The 1996 Sulawesi tsunami. Nat Hazards 16:29–38
Pelinovsky E, Yuliadi D, Prasetya G, Hidayat R (1997b) The January 1, 1996 Sulawesi Island tsunami. Int J Tsunami Soc 15:107–123
Pishgar-Komleh S, Keyhani A, Sefeedpari P (2015) Wind speed and power density analysis based on Weibull and Rayleigh distributions (a case study: firouzkooh county of Iran). Renew Sustain Energy Rev 42:313–322
Rabinovich AB (1997) Spectral analysis of tsunami waves: separation of source and topography effects. J Geophys Res Oceans 102:12663–12676
Seguro J, Lambert T (2000) Modern estimation of the parameters of the Weibull wind speed distribution for wind energy analysis. J Wind Eng Ind Aerodyn 85:75–84
Shin JY, Chen S, Kim T-W (2015) Application of Bayesian Markov Chain Monte Carlo method with mixed gumbel distribution to estimate extreme magnitude of tsunamigenic earthquake. KSCE J Civ Eng 19:366
Singh VP (1987) On application of the Weibull distribution in hydrology. Water Resour Manag 1:33–43
Sohn D-H, Ha T, Cho Y-S (2009) Distant tsunami simulation with corrected dispersion effects. Coast Eng J 51:123–141
Sørensen MB, Spada M, Babeyko A, Wiemer S, Grünthal G (2012) Probabilistic tsunami hazard in the Mediterranean Sea. J Geophys Res Solid Earth. https://doi.org/10.1029/2010JB008169
Tanioka Y, Yudhicara Kususose T, Kathiroli S, Nishimura Y, Iwasaki SI, Satake K (2006) Rupture process of the 2004 great Sumatra-Andaman earthquake estimated from tsunami waveforms. Earth Planet Space 58:203–209
Van Dorn WG (1965) Tsunamis. In: Chow VT (ed) Advances in hydroscience. Academic Press, London, pp 1–48
Vogel RM, Kroll CN (1989) Low-flow frequency analysis using probability-plot correlation coefficients. J Water Resour Plan Manag 115:338–357
Weibull W (1951) Wide applicability. J Appl Mech 103:293–297
Wilks DS (1989) rainfall intensity, the weibull distribution, and estimation of daily surface runoff. J Appl Meteorol 28:52–58. https://doi.org/10.1175/1520-0450(1989)028<0052:ritwda>2.0.co;2
Yadav R, Tripathi J, Kumar TS (2013a) Probabilistic assessment of tsunami recurrence in the Indian Ocean. Pure Appl Geophys 170:373–389
Yadav R, Tsapanos T, Tripathi J, Chopra S (2013b) An evaluation of tsunami hazard using Bayesian approach in the Indian Ocean. Tectonophysics 593:172–182
Yoon SB (2002) Propagation of distant tsunamis over slowly varying topography. J Geophys Res Oceans 107:C10
Acknowledgements
The authors thank the Associate Editor and the two anonymous reviewers for their constructive criticism of the paper. The insightful comments provided by the Associated Editor and reviewers have greatly improved the original manuscript. This research was supported by Korea Institute of Marine Science and Technology promotion. The third author was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2017-2015-0-00378) supervised by the IITP (Institute for Information & communications Technology Promotion). The data used in this study are available upon request from the corresponding author via email (hkwon@jbnu.ac.kr).
Author information
Authors and Affiliations
Corresponding author
Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
About this article
Cite this article
Kim, KH., Cho, YS. & Kwon, HH. An integrated Bayesian approach to the probabilistic tsunami risk model for the location and magnitude of earthquakes: application to the eastern coast of the Korean Peninsula. Stoch Environ Res Risk Assess 32, 1243–1257 (2018). https://doi.org/10.1007/s00477-017-1488-7
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00477-017-1488-7