1 Introduction

In this paper, we extend the probabilistic seismic hazard assessment (PSHA) methodology to evaluate hazard for induced seismicity and develop regional risk estimates. PSHA is a widely used tool to estimate hazard from tectonic (or natural) seismicity (Petersen et al. 2014), largely based on work by Cornell (1968). It describes a framework to account for both epistemic and aleatory uncertainties involved at various levels of seismic hazard—earthquake sources, earthquake ruptures, magnitude distributions, soil velocity and ground motion propagation. The methods described here build upon concepts related to induced seismicity that have been described in previous research by the authors and have not been included here for succinctness—a change-point approach for estimating changing seismicity rates (Gupta and Baker 2017), and a ground-motion prediction equation developed for Oklahoma (Gupta et al. 2017). Additionally, an extension of the framework involving hazard assessment using injection volumes in Oklahoma has been described by Gupta (2017) but is not presented here.

The motivation for this paper is the significant increase in seismicity that has been observed in the central and eastern US (CEUS) (Ellsworth 2013) since 2008. Numerous studies have linked this increased seismicity to disposal of oilfield wastewater by injection (e.g., Ellsworth 2013; Katie 2014; Walsh and Zoback 2015; Horton 2012; Hornbach et al. 2015) and hence it is referred to here as induced seismicity.

PSHA has been proposed as a valuable tool to develop hazard estimates for induced seismicity. The United States Geological Survey (USGS) has evaluated short-term seismic hazard for induced seismicity using PSHA (Petersen et al. 2016, 2017). van Eck (2006) and Bourne (2015) estimated hazard for induced earthquakes in the Netherlands, and van Elk (2017) additionally estimated the risk. Baker and Abhineet (2016) present a Bayesian approach to account for uncertainties in induced seismicity, such as earthquake rates and fault locations, in probabilistic hazard analysis. Several studies have been published on the two major components of hazard assessment: estimating seismicity rates (e.g., Llenos and Michael 2013, 2016; Gupta and Baker 2017), and developing new ground motion prediction equations for regions of induced seismicity (e.g., Atkinson 2015; Yenier and Atkinson 2015; Gupta et al. 2017). Bommer et al. (2015) emphasize the importance of using seismic risk as a metric for decision making by regulators for regions of induced seismicity. Walters (2015) present a traffic light system that qualitatively takes into account the seismic hazard, exposure and vulnerability of a region. Liu et al. (2017) present the sensitivity of building collapse and nonstructural component falling risks for induced seismicity. Mignan (2015) estimate the portfolio induced seismicity risk caused by an Enhanced Geothermal System in Basel, Switzerland, based on discrete damage states of the assets within a 14 km radius.

Here we extend the PSHA framework to take into account the changing seismicity rates in regions of induced seismicity. We use a multiple-change-point approach to identify changes in seismicity rates, and perform hazard and risk assessments using a stochastic Monte Carlo based method. We apply the approach to Oklahoma, and discuss how the results may be useful in risk management decisions. Finally, we perform sensitivity analyses to assess the impacts of changes in the following parameters on Oklahoma’s hazard and regional risk—seismicity rates, magnitude distribution (b-value in Gutenberg-Richter relation, minimum, and maximum magnitudes), ground-motion prediction equations and exposure’s vulnerability. More informed decisions can be made on resource allocation, research efforts and risk mitigation measures by understanding these impacts.

2 Framework for hazard and risk assessments from induced seismicity

In this section, we describe a framework for hazard assessment from induced seismicity and apply these hazard estimates to develop regional risk estimates.

2.1 Hazard assessment

Seismic hazard refers to the annual rate of exceeding a certain level of ground shaking. In traditional PSHA for tectonic seismicity, the rate of an intensity measure \(\mathrm {IM}\) exceeding an amplitude x, \(\lambda (\mathrm {IM}\ge x)\), is estimated by evaluating Eq. 1. Intensity measure is a catch-all term for various metrics of ground shaking, such as peak ground acceleration, peak ground velocity, spectral acceleration, or Modified Mercalli Intensity (Baker 2015).

$$\begin{aligned} \lambda (\mathrm {IM}\ge x)&= \sum _{n=1}^{N} \left [ \lambda (M_n \ge m_{\min }) \sum _{j=1;k=1}^{J_n;K_n} p (\mathrm {IM}\ge x \mid M_n = m_j; R_n = r_k) \ldots \right.\\&\left.\quad p(R_n = r_k \mid M_n = m_j) p(M_n = m_j) \right ] \end{aligned}$$
(1)

where \(\lambda (a)\) is the annual rate of a, \(p(a \mid b)\) is the probability of a given b, \(n=1, \ldots , N\) is the earthquake source, \(M_n = m \ge m_{\min }\) is the earthquake magnitude for source \(n\), \(m_{\min }\) is the minimum magnitude considered at the source, \(R_n = r\) is the distance from earthquake source to site of interest, and \(J_n\) and \(K_n\) are the number of discretized magnitudes and source-to-site distances, respectively, for source n. The probability \(p(\mathrm {IM}\ge x \mid M_n = m; R_n = r)\) is typically characterized by a ground motion prediction equation (GMPE) (e.g., Atkinson 2015). Earthquakes are typically assumed to occur as a Poisson process with rate \(\lambda\), with \(p(R_n = r \mid M_n = m)\) developed based on the source geometry, and \(p(M_n = m)\) developed based upon a recurrence relationship (e.g., Gutenberg and Richter 1949).
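
As a rough illustration of how the terms in Eq. 1 combine, the following Python sketch evaluates the double sum for a list of sources. The GMPE coefficients, source rates and distance probabilities are hypothetical placeholders chosen for this example (they are not the models used in this study), and \(p(R_n \mid M_n)\) is assumed not to depend on magnitude.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def truncated_gr_pmf(m_min, m_max, b, n_bins):
    """Discretize a truncated Gutenberg-Richter distribution into bins.

    Returns bin-center magnitudes m_j and probabilities p(M = m_j).
    """
    beta = b * math.log(10.0)
    norm = 1.0 - math.exp(-beta * (m_max - m_min))
    width = (m_max - m_min) / n_bins

    def cdf(m):
        return (1.0 - math.exp(-beta * (m - m_min))) / norm

    mags = [m_min + (j + 0.5) * width for j in range(n_bins)]
    probs = [cdf(m_min + (j + 1) * width) - cdf(m_min + j * width)
             for j in range(n_bins)]
    return mags, probs

def toy_prob_exceed(x, m, r_km, sigma_ln=0.7):
    """p(IM >= x | m, r) for a HYPOTHETICAL lognormal GMPE (illustration only)."""
    ln_median = -4.0 + 1.0 * m - 1.3 * math.log(r_km + 10.0)
    return 1.0 - norm_cdf((math.log(x) - ln_median) / sigma_ln)

def hazard_rate(x, sources, n_bins=50):
    """Annual rate of IM >= x, summed over sources, magnitudes and distances."""
    total = 0.0
    for src in sources:
        mags, p_m = truncated_gr_pmf(src["m_min"], src["m_max"], src["b"], n_bins)
        for m, pm in zip(mags, p_m):
            for r_km, pr in src["distance_pmf"]:
                total += src["rate"] * toy_prob_exceed(x, m, r_km) * pr * pm
    return total

# Hypothetical sources: annual rate of M >= m_min and a discrete distance PMF.
sources = [
    {"rate": 2.0, "m_min": 3.0, "m_max": 8.0, "b": 1.3,
     "distance_pmf": [(10.0, 0.5), (20.0, 0.5)]},
    {"rate": 0.5, "m_min": 3.0, "m_max": 8.0, "b": 1.3,
     "distance_pmf": [(50.0, 1.0)]},
]
hazard_curve = [(x, hazard_rate(x, sources)) for x in (0.01, 0.05, 0.1, 0.3)]
```

At very small amplitudes every earthquake exceeds x, so the computed rate approaches the total source rate (2.5 per year for these placeholder sources), which is a convenient sanity check on the discretization.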

Due to epistemic uncertainties, there may exist multiple source characteristics, GMPE’s and magnitude distributions for the same region. These uncertainties are accounted for by estimating hazard for each of the individual possibilities, which we then represent as individual branches in a logic tree. Each branch \(d = 1, \ldots , D\), is assigned weight \(w_d\) such that \(\sum _{d=1}^D w_d = 1\), and the hazard is computed by the weighted contribution from each branch (Petersen et al. 2014).

$$\begin{aligned} \lambda (\mathrm {IM}\ge x)&= \sum _{d=1}^D w_d \lambda _d(\mathrm {IM}\ge x) \end{aligned}$$
(2)

where \(\lambda _d(\mathrm {IM}\ge x)\) is the hazard for branch d.
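
The weighted combination in Eq. 2 can be sketched in a few lines of Python. The two branch hazard curves and their weights below are hypothetical values used only to show the arithmetic.

```python
def logic_tree_mean(branch_curves, weights):
    """Weighted mean hazard curve over logic-tree branches (Eq. 2).

    branch_curves[d][i] is the exceedance rate of branch d at the i-th IM level.
    """
    if len(branch_curves) != len(weights):
        raise ValueError("one weight is required per branch")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("branch weights must sum to 1")
    n_levels = len(branch_curves[0])
    return [sum(w * curve[i] for w, curve in zip(weights, branch_curves))
            for i in range(n_levels)]

# Hypothetical exceedance rates at three IM levels for two branches.
branch_curves = [[0.20, 0.05, 0.010], [0.10, 0.02, 0.002]]
mean_curve = logic_tree_mean(branch_curves, [0.6, 0.4])
```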

When the seismicity rates are changing over time, as for induced seismicity, then the estimated hazard is also time dependent. We represent hazard at time t as \(\lambda (\mathrm {IM}\ge x) (t)\) and calculate it by replacing the constant seismicity rate in Eq. 1 with time-dependent \(\lambda (M_n \ge m_{\min })(t)\). Then the mean hazard per unit time over a time duration \([t_1, t_2]\) is calculated by

$$\begin{aligned} \lambda (\mathrm {IM}\ge x) = \frac{1}{t_2 - t_1} \int _{t_1}^{t_2} \lambda (\mathrm {IM}\ge x) (t) \mathop {}\mathrm {d}t \end{aligned}$$
(3)
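
If the seismicity rate is treated as piecewise constant between change points (an assumption of this sketch), the integral in Eq. 3 reduces to a duration-weighted average. Because each source's contribution in Eq. 1 is proportional to its rate, the averaging can be applied to the source rate before the hazard calculation. The rate history below is hypothetical.

```python
def mean_rate(segment_starts, segment_rates, t1, t2):
    """Time-averaged rate over [t1, t2] for a piecewise-constant rate history.

    segment_rates[i] applies from segment_starts[i] until the next start;
    the last segment is assumed to extend indefinitely.
    """
    if not t2 > t1:
        raise ValueError("t2 must be greater than t1")
    if t1 < segment_starts[0]:
        raise ValueError("t1 precedes the first segment")
    total = 0.0
    for i, rate in enumerate(segment_rates):
        start = segment_starts[i]
        end = segment_starts[i + 1] if i + 1 < len(segment_starts) else float("inf")
        overlap = min(end, t2) - max(start, t1)
        if overlap > 0.0:
            total += rate * overlap
    return total / (t2 - t1)

# Hypothetical history (events per year) with one change point at 2013.5.
example = mean_rate([2009.0, 2013.5], [10.0, 100.0], 2013.0, 2014.0)
```

Here half of the window lies before the change point and half after it, so the mean rate is the simple average of the two segment rates.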

Hazard estimates are forecasts of anticipated future seismic shaking. Due to the transient nature of induced seismicity, these forecasts are meaningful only over short durations (the next 6 to 24 months), in contrast to the National Seismic Hazard forecasts developed for the next 50 years (Petersen et al. 2014).

Equation 1 estimates hazard at a single site of interest. For multiple sites, this calculation is required separately at each of the sites. This is computationally expensive, and a Monte Carlo-based stochastic approach may be more efficient (Ross 2009). In this approach, for each source, we first simulate earthquakes according to the corresponding seismicity rate \(\lambda (M_n \ge m_{\min })\). For each earthquake, we assign a magnitude according to the magnitude distribution \(f_n(M_n = m)\), a location according to the source geometry, and finally estimate shaking at each of our sites using GMPE’s. Each simulation is independent and 10,000 one-year simulations are carried out. This approach also allows for implementation of spatial correlation between ground shaking at multiple sites from the same earthquake (e.g., Jayaram and Baker 2009). The detailed algorithm for this approach is described by Gupta (2017) and is implemented here using the OpenQuake-engine (Pagani 2014).
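
The simulation steps above can be sketched for a single source and a single site as follows. This is a minimal illustration and not the OpenQuake implementation: the GMPE coefficients, the source rate and the candidate distances are hypothetical, spatial correlation is omitted, and only 2000 one-year simulations are run to keep the example fast.

```python
import math
import random

def sample_magnitude(rng, m_min, m_max, b):
    """Inverse-CDF sample from a truncated Gutenberg-Richter distribution."""
    beta = b * math.log(10.0)
    norm = 1.0 - math.exp(-beta * (m_max - m_min))
    return m_min - math.log(1.0 - rng.random() * norm) / beta

def sample_event_count(rng, annual_rate):
    """Number of Poisson arrivals in one year, via exponential waiting times."""
    count, t = 0, rng.expovariate(annual_rate)
    while t <= 1.0:
        count += 1
        t += rng.expovariate(annual_rate)
    return count

def toy_ln_median_pga(m, r_km):
    """HYPOTHETICAL GMPE median (natural log of PGA in g), illustration only."""
    return -4.0 + 1.0 * m - 1.3 * math.log(r_km + 10.0)

def simulate_year(rng, annual_rate, distances_km,
                  m_min=3.0, m_max=8.0, b=1.3, sigma_ln=0.7):
    """Return the PGA values (g) produced at the site in one simulated year."""
    pgas = []
    for _ in range(sample_event_count(rng, annual_rate)):
        m = sample_magnitude(rng, m_min, m_max, b)
        r_km = rng.choice(distances_km)  # stand-in for sampling source geometry
        ln_pga = rng.gauss(toy_ln_median_pga(m, r_km), sigma_ln)
        pgas.append(math.exp(ln_pga))
    return pgas

def exceedance_rates(years, thresholds):
    """Mean number of events per simulated year with PGA >= each threshold."""
    n_years = len(years)
    return [sum(1 for pgas in years for p in pgas if p >= x) / n_years
            for x in thresholds]

rng = random.Random(0)
years = [simulate_year(rng, 5.0, [10.0, 30.0, 60.0]) for _ in range(2000)]
rates = exceedance_rates(years, [0.001, 0.01, 0.1])
```

Because the same simulated catalog can be reused for every site, the cost of adding sites is limited to the ground-motion step, which is the efficiency argument made above.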

2.2 Risk assessment

Seismic risk refers to the annual rate of exceeding a certain level of loss to structures, population or other entities. The risk for loss \(\psi\) on a single asset s with a vulnerability distribution \(f(\psi _s \mid \mathrm {IM}= y)\) is described by Krawinkler and Miranda (2004) and shown below.

$$\begin{aligned} \lambda (\psi _s \ge x) = \int _{\mathrm {IM}_s} \lambda (\mathrm {IM}_s=y) p(\psi _s \ge x \mid \mathrm {IM}_s = y) \mathop {}\mathrm {d}y \end{aligned}$$
(4)

For a set of assets \(s = 1, \ldots , S\), the total loss \(\Psi\) is obtained by combining losses of all assets, \(\Psi = \sum _{s=1}^{S} \psi _s\). Then the probability distribution of \(\Psi\) represents a sum of random variables and Eq. 4 is modified as shown below.

$$\begin{aligned} \lambda (\Psi \ge x)&= \int _{\varvec{\mathrm {IM}}} \lambda (\varvec{\mathrm {IM}}=\varvec{y}) \idotsint _{\mathcal {S}} f(\psi _1, \ldots , \psi _S \mid \varvec{\mathrm {IM}} = \varvec{y}) \mathop {}\mathrm {d}\mathcal {S} \mathop {}\mathrm {d}y \\ \text{ and } \qquad \mathcal {S}&= \{x_1, \ldots , x_S \mid \sum _{s = 1}^S x_s \ge x ; x_s \ge 0 \; \forall \; s = 1,\ldots , S \} \end{aligned}$$
(5)

where \(f(\psi _1, \ldots , \psi _S \mid \varvec{\mathrm {IM}} = \varvec{y})\) is the joint probability distribution for loss to the S assets and \(\varvec{\mathrm {IM}}\) is a vector of \(\mathrm {IM}_s\) at each asset s. We use the stochastic Monte-Carlo simulation approach to develop risk assessments at a statewide scale, similar to our approach for hazard assessment. In this case, the ground shaking at each site from the hazard estimation is used as input to estimate losses for risk assessment. This algorithm is detailed in Gupta (2017), and is implemented here through OpenQuake, with the results processed in MATLAB.

3 Risk assessment for Oklahoma

We implement the framework described in Sect. 2 to assess hazard and state-wide post-earthquake repair costs for Oklahoma. While the hazard is computed at all locations in Oklahoma, we show estimates here from a single site at \({35.45}^{\circ }\hbox {N}\) and \({97.55}^{\circ }\hbox {W}\) in Oklahoma City. Due to the transient and rapidly changing nature of induced seismicity (Petersen et al. 2017), the hazard and risk forecasts presented here through 2017 are meaningful only for short durations of the next 6 to 24 months, although the framework might be used to update these estimates with more recent data.

For reference, we will compare some subsequent hazard results with USGS short-term hazard curves (Petersen et al. 2016, 2017). The USGS computed hazard using the weighted mean of multiple branches. Their seismicity rate estimates are based on observed seismicity over the previous 1-year, 2-year and 36-year windows. They use the same GMPE’s that are used in the 2014 hazard maps for the central and eastern United States (Petersen et al. 2014) and the Atkinson (2015) GMPE.

3.1 Parameters for risk assessment

3.1.1 Seismic sources

Seismicity rates are considered within Oklahoma and in southern Kansas near Oklahoma’s northern border. The background rates (before induced seismicity) are multiple orders of magnitude lower than those from induced seismicity (Petersen et al. 2014) and contribute negligibly to short-term hazard and risk; hence we only consider regions with a recent rate increase. We use the change-point method, with sequential bisection to detect multiple change points, to estimate rates for \(M \ge 3\) earthquakes (Gupta and Baker 2017; Gupta 2017). Rates are estimated from a seismicity catalog declustered using the method proposed by Reasenberg (1985) with an effective lower magnitude cutoff of 3.0, based on Oklahoma’s catalog completeness threshold. We chose this declustering method because the alternative Gardner and Knopoff (1974) declustering removes many non-dependent earthquakes, as shown in Fig. 1a. The Reasenberg approach, on the other hand, appears to follow the number of monthly earthquakes much more closely and to smooth out the peaks that could be a result of dependent events. Stiphout et al. (2012) have also described that the Gardner-Knopoff approach tends to remove more events from the catalog than other approaches. Finally, we did not use the more recent and robust ETAS approach (Ogata 1992) because it requires establishing a constant background seismicity rate, while the background rate is itself variable for regions of induced seismicity.

Fig. 1

a Monthly \(\hbox {M}\ge 3\) earthquakes in Oklahoma and b number of earthquakes exceeding a specified magnitude, for non-declustered catalog and catalogs declustered using Reasenberg and Gardner-Knopoff declustering

Seismic sources are considered as area sources of \({0.1}^{\circ }\) latitude by \({0.1}^{\circ }\) longitude, similar to the USGS implementation. Seismicity rates are estimated at the center of these area sources, every 6 months from 2009 through 2017 and are shown in Fig. 2. For each point in time, only the catalog up to that date is considered. This allows us to evaluate how hazard and risk assessments would have evolved over time, had this approach been implemented over the past decade. Figure 3 shows that the model corresponds well with observed earthquakes at the statewide level; the approximately 6-month lag between the two lines is because the observed earthquakes are for a future 12-month period, while the estimated rates are empirically-based with no forecasting based on injection rates or other forward-looking metrics.

We use a truncated Gutenberg-Richter relation for magnitude distribution with a b-value of 1.3, a minimum magnitude of 3.0 and maximum magnitude of 8.0 at all sources. The b-value is selected based on our qualitative analysis of the seismic catalog (as shown in Fig. 1b) and the observation by Langenbruch and Zoback (2016). Different studies have suggested different b-values for the region, including a study by Rubinstein et al. (2018) that estimated \(b = 1\) for Kansas. The impact of b-values on hazard and risk is shown in Sect. 4. We include a distribution of focal depths within the hazard framework, instead of in a logic tree, through a probability mass function that reflects the depth distribution in the earthquake catalog. Depths of 3, 4, 5, 6 and 7 km are modeled as occurring with probabilities of 0.05, 0.15, 0.6, 0.15 and 0.05, respectively.
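
The magnitude and depth models stated above can be written down directly. The sketch below uses the parameter values from the text (b = 1.3, magnitudes from 3.0 to 8.0, and the five-point depth distribution); the function and variable names are ours and are only illustrative.

```python
import random

M_MIN, M_MAX, B_VALUE = 3.0, 8.0, 1.3
# Focal depth (km) -> probability, as listed in the text.
DEPTH_PMF = {3.0: 0.05, 4.0: 0.15, 5.0: 0.60, 6.0: 0.15, 7.0: 0.05}

def gr_survival(m, m_min=M_MIN, m_max=M_MAX, b=B_VALUE):
    """P(M >= m) for the truncated Gutenberg-Richter distribution."""
    if m <= m_min:
        return 1.0
    if m >= m_max:
        return 0.0
    tail = 10.0 ** (-b * (m_max - m_min))
    return (10.0 ** (-b * (m - m_min)) - tail) / (1.0 - tail)

def sample_depth(rng):
    """Draw one focal depth (km) from the probability mass function."""
    depths = list(DEPTH_PMF)
    weights = list(DEPTH_PMF.values())
    return rng.choices(depths, weights=weights, k=1)[0]

mean_depth = sum(d * p for d, p in DEPTH_PMF.items())  # probability-weighted depth
frac_m5 = gr_survival(5.0)  # share of M >= 3 events that reach M >= 5
```

With these parameters the depth distribution is symmetric about 5 km, and roughly 0.25% of \(M \ge 3\) earthquakes are expected to reach \(M \ge 5\).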

Fig. 2

Predicted rate of \(\hbox {M}\ge 3\) earthquakes using the change-point model, based on earthquakes observed prior to the given date

Fig. 3

Annual rates estimated using the change-point method based on earthquakes observed prior to the given date, and number of earthquakes observed in the year following the given date

3.1.2 Ground-motion prediction equation

We use the scaled version of Shahjouei and Pezeshk (2016) GMPE as described by Gupta et al. (2017), with spatial correlation in the ground motion fields using the Jayaram and Baker (2009) model. This GMPE has been developed for ground motions in Oklahoma and is applicable to earthquakes with magnitude \(\ge 3\).

3.1.3 Exposure and vulnerability

We use HAZUS data regarding building structure types and counts at a census block level, based on the 2010 census (Holmes et al. 2015). Building types in the large number of census blocks (\(\approx {255{,}000}\) census-blocks, \(3.9\times 10^{6}\) data rows) are aggregated on a \({0.1}^{\circ }\) latitude by \({0.1}^{\circ }\) longitude grid (1852 grid points, \(\approx {28{,}500}\) data rows). This approximately corresponds to a 10 km by 10 km grid. Bal et al. (2010) concluded that the difference in the accuracy and precision of loss estimates that come from working at a coarse spatial resolution is likely to be insignificant in comparison with the uncertainties associated with the prescription of recurrence intervals for major earthquakes in a fully probabilistic loss model. Bazzurro and Park (2007) discuss impacts of aggregating assets, one of which is the introduction of artificial correlations that tend to systematically underestimate frequent, small losses and overestimate large, rare ones. One of the reasons for this correlation is the use of the same spectral acceleration for all assets aggregated at a site. To address this issue, we aggregate assets by distributing them to the nearest grid-points in proportion to their closeness to each point. In other words, each grid-point receives a contribution of the assets from the neighboring grid, instead of combining all the assets within 5 km north, west, south and east of the point. As a result, each asset’s loss is computed based on the spectral accelerations observed at its nearest grid-points, instead of only one grid-point. A summary of the assets is provided in Table 1. Figure 4 shows the total asset cost at each grid point, along with markers for major cities and the Prague M5.7 and Pawnee M5.8 earthquakes.
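
One way to distribute an asset among its nearest grid points in proportion to closeness is bilinear weighting over the four surrounding nodes. The sketch below assumes that scheme purely for illustration; the exact weighting used in this study is described in Gupta (2017), and the function name and asset value here are hypothetical.

```python
import math

def distribute_asset(lat, lon, value, spacing=0.1):
    """Split an asset's value over the four surrounding grid nodes.

    Bilinear weights are used, so a node closer to the asset receives a
    larger share and the shares always sum to the original value.
    """
    i = math.floor(lat / spacing)
    j = math.floor(lon / spacing)
    fy = lat / spacing - i  # fractional position between node rows (0 to 1)
    fx = lon / spacing - j  # fractional position between node columns (0 to 1)
    shares = {}
    for di, wy in ((0, 1.0 - fy), (1, fy)):
        for dj, wx in ((0, 1.0 - fx), (1, fx)):
            node = (round((i + di) * spacing, 6), round((j + dj) * spacing, 6))
            shares[node] = value * wy * wx
    return shares

# An asset at the center of a grid cell is split equally among four nodes.
shares = distribute_asset(35.45, -97.55, 100.0)
```

Because each asset then draws on the shaking at several nodes, assets within one cell no longer share a single ground-motion value, which is the source of artificial correlation noted above.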

Table 1 Buildings summary in Oklahoma
Fig. 4

Total asset value for each grid point. Major cities and epicenters of Prague M5.7 and Pawnee M5.8 earthquakes are marked. The circles around the epicenters are 100 km in diameter and mark the approximate region with \(\hbox {PGA}\ge 0.05\, \hbox {g}\) based on USGS Shakemaps

Fig. 5

a Vulnerability function for low-code classification of the most commonly occurring building types in Oklahoma. b Upgraded vulnerability curves developed for this study based on Krawinkler et al. (2012). The upgraded curves refer to upgrade from HAZUS curves based on previous research and do not reflect any structural intervention

We use HAZUS vulnerability functions that relate \(\mathrm {IM}\) to asset losses, as shown in Fig. 5a. HAZUS provides damage fragility functions for each asset that relate peak ground acceleration (PGA) to four distinct damage levels. Then, at various discrete levels of PGA, the probability of being in each damage level can be obtained. HAZUS also provides mean loss ratios for each damage level. To obtain the vulnerability functions, we estimate the probability of loss at each PGA level based on the probability of each damage level and its corresponding mean loss ratio. We then assume a log-normal distribution for loss at each PGA level and estimate its parameters based on the probability of loss. This yields a vulnerability function that is defined by a log-normal distribution at various PGA levels. We have obtained these vulnerability functions from OpenQuake developers through personal communication (Anirudh Rao, 2016), with the structural loss ratio mapped to total building loss ratio as the loss measure \(\psi\). Additionally, HAZUS classifies buildings as pre-code, low-code, moderate-code and high-code, based on their location and year of construction. HAZUS categorizes post-1975 buildings in low seismicity regions as low-code; hence all buildings in Oklahoma are classified as low-code. The vulnerability functions showing variation of the mean loss-ratios with PGA for the most common building categories are shown in Fig. 5. The variation in losses at each PGA level as characterized by the log-normal distribution is not shown in the figure. HAZUS’s PGA-based fragility functions are developed for large magnitude events and hence there is a possibility of introducing bias when using these for the short durations and low energy of the motions associated with smaller earthquakes in this study. We have explored the impact of vulnerability functions on risk assessment in Sect. 4.4; however, specifically exploring the bias of HAZUS fragility curves for small magnitude earthquakes is beyond the scope of this study.
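
The step from damage fragility functions to a mean loss ratio at each PGA level can be sketched as below. The fragility medians, dispersion and loss ratios are hypothetical placeholders (they are not the HAZUS values used in this study), and only the mean of the loss distribution is computed, not its log-normal spread.

```python
import math

def lognormal_cdf(x, median, beta):
    """Probability that a lognormal variable with this median is <= x."""
    z = math.log(x / median) / beta
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# HYPOTHETICAL fragility parameters for four damage levels (illustration only).
DAMAGE_MEDIANS_G = [0.10, 0.20, 0.40, 0.80]  # median PGA to reach each level
DAMAGE_BETA = 0.64                           # common lognormal dispersion
MEAN_LOSS_RATIOS = [0.02, 0.10, 0.50, 1.00]  # mean loss ratio in each level

def damage_state_probabilities(pga):
    """Probability of being in each damage level at a given PGA (g)."""
    exceed = [lognormal_cdf(pga, med, DAMAGE_BETA) for med in DAMAGE_MEDIANS_G]
    exceed.append(0.0)  # nothing exceeds the most severe level
    return [exceed[k] - exceed[k + 1] for k in range(len(DAMAGE_MEDIANS_G))]

def mean_loss_ratio(pga):
    """Expected loss ratio: sum over damage levels of P(level) * loss ratio."""
    probs = damage_state_probabilities(pga)
    return sum(p * lr for p, lr in zip(probs, MEAN_LOSS_RATIOS))

curve = [(pga, mean_loss_ratio(pga)) for pga in (0.05, 0.1, 0.2, 0.4, 0.8)]
```

Using a common dispersion for all damage levels keeps the fragility curves from crossing, so the damage-level probabilities stay non-negative in this simplified setup.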

OpenQuake implements complete correlation of losses between assets of the same type at a site. For example, if there are 6 wood buildings aggregated at a site, then each building will have an identical loss ratio for a given simulation. We also assumed mutual independence between assets of different types and at different sites (i.e., the loss ratio given a PGA for one asset type or site does not influence the loss ratio given PGA for another asset type or site). Asset losses may be correlated at different sites when they follow similar designs or construction quality, for example, when constructed by the same contractor. However, we did not have such information and hence assumed independence. Asset correlation will have the effect of reducing the occurrence of lower losses and increasing the occurrence of higher losses.

Fig. 6

a Induced seismicity hazard in Oklahoma City, and b statewide risk using seismicity rates estimated using change-point method. Hazard reported for Oklahoma City by Petersen et al. (2016) and for Los Angeles (Petersen et al. 2014) are also shown for comparison

We calculated risk curves for these vulnerability functions and noted that they substantially overestimate the observed losses. For example, Fig. 6 shows losses of \(\approx \${2.8}\,{\hbox {billion}}\) with 10% annual probability (exceedance rate of roughly once in 10 years on average) and \(\approx \${383}\,{\hbox {million}}\) with an exceedance rate of once per year, out of a total portfolio cost of \(\${240.15}\,{\hbox {billion}}\). In the 6 years since 2011, when the first \(M > 5\) earthquake occurred in Oklahoma, there have been multiple cases when buildings have been damaged, but their exact loss values are not available. However, based on estimates generated from news reports, we believe that losses have not exceeded \(\approx \${10}\,{\hbox {million}}\) for any of the earthquakes. Given our risk estimates, the probability of exceeding a loss of \(\${2.8}\,{\hbox {billion}}\) in 6 years is 45%, and that of \(\${383}\,{\hbox {million}}\) is 99.7%. Since losses of this size have not been observed, we believe that our risk estimate is higher than the true risk. We further explore the reasons for this discrepancy in losses.
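
The multi-year probabilities quoted above follow from treating loss exceedances as a Poisson process, so that the probability of at least one exceedance in t years at annual rate \(\nu\) is \(1 - e^{-\nu t}\). The short sketch below applies this relation, interpreting the 10% annual probability as a rate of 0.1 per year (our reading of the text).

```python
import math

def prob_exceed(annual_rate, years):
    """Probability of at least one exceedance in `years` for a Poisson process."""
    return 1.0 - math.exp(-annual_rate * years)

p_large = prob_exceed(0.1, 6.0)  # loss level exceeded about once in 10 years
p_small = prob_exceed(1.0, 6.0)  # loss level exceeded about once per year
```

These evaluate to about 0.45 and 0.9975, respectively, in line with the percentages given in the text.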

Figure 6a shows that our hazard estimate for the Reasenberg (1985) declustering approach is higher than that of USGS. Since the Gardner and Knopoff (1974) approach used by the USGS removes a greater number of earthquakes from the catalog, as described by Stiphout et al. (2012) and shown in Fig. 1, the hazard estimate based on this approach is much lower. Moreover, our hazard estimates and those of USGS for Oklahoma City are both greater than that of Los Angeles. This high hazard, combined with higher expected vulnerability of the Oklahoma building stock, results in our high loss estimates. Figures 6a and 8a also illustrate that our hazard estimates based on the change-point approach are in good agreement with those of USGS using a completely independent approach (Fig. 7).

Fig. 7

Earthquakes that cause a loss \(\ge \${1}\) billion as a fraction of all earthquakes within that magnitude bin. Earthquakes are also binned by distance such that the cost of all assets within the shown distance is \(\ge\) twice the loss for that earthquake. The percentages marked on the figure represent the fraction of all earthquakes in that bin that caused the loss

Based on our high predicted losses but hazard estimates comparable to those of USGS, we believe that our vulnerability curves are too conservative; however, few studies exist that provide fragility curves for buildings in the central and eastern US, and for small magnitude earthquakes. Other effects like aggregation of assets and asset loss correlations can also affect loss estimates; however, their impacts are not large enough to completely explain the high estimated losses. Krawinkler et al. (2012) developed fragility functions for unreinforced masonry parapets and chimneys using observations from California and computer modeling. Since unreinforced masonry structures in California predate modern seismic design requirements in the region, we believe that these fragility functions developed for chimneys and parapets are reasonable estimates for unreinforced masonry structures in Oklahoma. We note that chimneys and parapets are not braced at the top and hence these fragility functions are still conservative when used for buildings. We use these fragility functions here because they have been created specifically for unreinforced masonry using more data and modeling than the HAZUS functions; however, further research is required to generate Oklahoma-specific fragility functions, which is beyond the scope of this study. The median PGA for the toppling fragility function by Krawinkler et al. (2012) is \({0.5}\,{\hbox {g}}\) compared to 0.35 g for the loss vulnerability curve in our study based on HAZUS. To update our vulnerability curves, we increase our median PGA for unreinforced masonry to 0.5 g while keeping the same variability of the curve. Similar studies could not be found for other building types and hence we make the assumption to increase the median PGA for all vulnerability functions by a ratio of 1.43 (\(= \frac{0.5}{0.35}\)). Some of these updated vulnerability curves are shown in Fig. 5b. We use these updated vulnerability curves in all subsequent calculations, unless otherwise specified.
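
For a curve that depends on PGA only through the ratio of PGA to its median, scaling the median by a factor while keeping the variability is equivalent to evaluating the original curve at PGA divided by that factor. The sketch below illustrates this with a lognormal-shaped curve; the dispersion value is hypothetical, and the curves used in this study are tabulated rather than analytic.

```python
import math

def lognormal_curve(median, beta):
    """Return a curve that depends on PGA only through ln(PGA / median)."""
    def curve(pga):
        z = math.log(pga / median) / beta
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return curve

def shift_median(curve, ratio):
    """Scale the median PGA of a curve by `ratio`, keeping its variability."""
    return lambda pga: curve(pga / ratio)

UPGRADE_RATIO = 0.5 / 0.35  # approximately 1.43, from the text
original = lognormal_curve(median=0.35, beta=0.6)  # beta is a hypothetical value
upgraded = shift_median(original, UPGRADE_RATIO)
```

The upgraded curve reaches its midpoint at 0.5 g instead of 0.35 g and lies below the original curve at every PGA, which lowers the predicted losses.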

Finally, we note that the August 24, 2014 M6.0 earthquake in Napa incurred a loss of \(\${700}\,{\hbox {million}}\) (http://www.iii.org/issue-update/earthquakes-risk-and-insurance-issues, accessed August 09, 2017). Approximately 410,000 households were affected by that earthquake, compared to \(\approx {337{,}000}\) households in Oklahoma County (https://www.census.gov/2010census/popmap, accessed August 09, 2017). This suggests that it would be possible to observe losses on the order of \(\${500}\,{\hbox {million}}\) in Oklahoma City from a nearby \(\approx\) M6.0 earthquake, though fortunately previous earthquakes have caused losses on the order of only \(\${10}\,{\hbox {million}}\) as they have not occurred in densely populated regions of the state.

3.2 Oklahoma results for 2017

Figure 8 shows the hazard in Oklahoma City and statewide risk from induced seismicity based on the updated vulnerability curves shown in Fig. 5b. The annual exceedance rates for PGA using the change-point seismicity rates are approximately twice those of the USGS 2017 hazard estimates (Petersen et al. 2017). This comparison is not anticipated to produce an exact match, due to differences in assumed seismicity rates and logic trees, but the rough correspondence of results is reassuring.

Fig. 8

a Induced seismicity hazard in Oklahoma City and b statewide risk with updated vulnerability functions. Hazard reported by Petersen et al. (2016) in USGS 2016 report is also shown for comparison

Due to the transient nature of induced seismicity, we consider these calculations as short-term forecasts and consider only annual rates of exceedance \(\ge 0.01\) in our figures. Our estimates indicate that Oklahoma City will experience a peak ground acceleration of \(\approx {0.08}\,{\hbox {g}}\) with 10% annual probability and \(\approx {0.3}\,{\hbox {g}}\) with 1% annual probability. Generally, building losses occur at accelerations \(> {0.1}\,{\hbox {g}}\), but might occur at \(> {0.05}\,{\hbox {g}}\) in Oklahoma due to higher building vulnerability, as shown in Fig. 5.

The statewide risk in Fig. 8b indicates loss of \(\approx \${1.2}\,{\hbox {billion}}\) with 10% annual probability and \(\approx \${5.5}\,{\hbox {billion}}\) with 1% annual probability. Our estimate indicates a loss of \(\approx \${125}\,{\hbox {million}}\) expected once every year on average. The total asset cost for our exposure portfolio is \(\${240}\,{\hbox {billion}}\) for the state. This indicates loss ratios of \(\approx\) 2.3% at the 1% annual probability level, which appears reasonable given the high hazard and the vulnerability curves for wood buildings that are 53% of the total cost. However, the loss estimates are still substantially higher than those actually observed in the state to date. Since our hazard estimates are comparable to those of the USGS, we explore the relationship between vulnerability models and losses in Sect. 4.4.

4 Sensitivity analysis

In this section, we study the impacts of changes in seismicity rates, magnitude distribution (b-value in Gutenberg-Richter relation, minimum and maximum magnitudes), ground-motion prediction equations and exposure’s vulnerability on induced seismicity hazard and statewide loss risk in Oklahoma. Unless noted otherwise, the results are estimated based on seismicity rates estimated on 2017-01-01, with minimum and maximum magnitudes of 3.0 and 8.0 respectively, a b-value of 1.3, the \(\hbox{SP16}_{\mathrm {scaled}}\) GMPE and the vulnerability with upgrade ratio of 1.43 as described in the previous section.

4.1 Changes in seismicity rates

We illustrate the effect of changing seismicity rates by studying the evolution of hazard and risk in Oklahoma over time. We use the multiple change-point model to estimate rates at 6-month intervals, starting in 2009 (Figs. 2, 3).

We observe in Fig. 9 that shaking in Oklahoma City increases considerably at a given exceedance level between 2009 and 2010. There is little difference in PGA increase after 2010, however, despite high rate increases in the state, because the more recent rate increases occurred in northern Oklahoma (an area with less exposure). We observe a significant increase in statewide risk between 2013 and 2014, which agrees with the rate increase from the change-point model during the same time. Observed seismicity in the state has declined since 2015, and this decline is reflected in the rate estimates from the change-point model starting in 2016, as shown in Fig. 3. However, this reduction is not pronounced in hazard estimates for Oklahoma City in Fig. 9a, while the loss estimates show some reduction. This is because most of the rate reduction in 2015 occurred in northern Oklahoma and southern Kansas, while Oklahoma City is in central Oklahoma. This is also illustrated by the reduction of hazard in Wakita in northern Oklahoma (location shown in Fig. 4), as shown in Fig. 10. The statewide loss risk has only reduced slightly since earthquake rates have not decreased uniformly across the urban centers.

Fig. 9

a Evolving hazard over time in Oklahoma City and b statewide risk at 10%, 50% and 90% annual rates of exceedance. Seismicity rates are too low for 2009-01-01 with the number of years considered in our simulations to generate loss estimates at the 50% and 90% annual rates of exceedance

Fig. 10

Evolving hazard over time in Wakita in northern Oklahoma at 10%, 50% and 90% annual rates of exceedance

4.2 Changes in magnitude distribution

We use a truncated Gutenberg-Richter magnitude distribution, and vary the minimum magnitude from 3 to 5 and the maximum magnitude from 5 to 8. In hazard analysis, the minimum magnitude is set at a level below which shaking is not relevant because it will not affect buildings (Bommer and Crowley 2017), and the maximum magnitude is governed by the largest earthquake that a seismic source can produce. For induced seismicity, the maximum possible magnitude continues to be an active area of study (McGarr 2014; Ellsworth 2013) and understanding its influence can inform future research. Figure 11 shows the impact of these parameters on hazard and risk. We observe that using a minimum magnitude \(m_{\min } = 5\) yields lower shaking and losses than the other cases, because \(M<5\) earthquakes do contribute to shaking and losses in the baseline analysis case. We observed in Fig. 7 that only a small percentage of \(M<5\) earthquakes cause losses larger than \(\${1}\,{\hbox {billion}}\). However, since \(M<5\) earthquakes are much more frequent than \(M>5\) earthquakes, setting a larger \(m_{\min }\) has the potential to reduce the risk at these fairly high loss values. As the loss value is increased further, setting \(m_{\min } \ge 5\) does not change the risk significantly because smaller earthquakes do not cause losses larger than \(\${10}\,{\hbox {billion}}\). This also explains the difference observed between \(m_{\min } = 3\) and \(m_{\min } = 4\) for the lower shaking and loss levels at the higher exceedance rates. The high frequency of \(M<4\) earthquakes contributes to the low levels of shaking at PGA \(\le {0.1}\,{\hbox {g}}\) and, combined with the high vulnerability of our exposure, this difference in hazard at low shaking levels also propagates to risk at lower loss levels. The difference becomes negligible for losses \(\ge \${100}\,{\hbox {million}}\) because \(M\ge 4\) earthquakes are responsible for most of these losses.
We observe that values of \(m_{\max }>6\) have little influence on shaking and loss levels for a similar reason: these larger earthquakes are infrequent and hence contribute little to the short-term hazard and risk estimates at these high annual rates of exceedance. As expected, the influence of \(m_{\max }\) increases as the shaking and loss levels increase.
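The weak influence of large \(m_{\max }\) values can be illustrated with a minimal sketch of a doubly truncated Gutenberg-Richter distribution. The function names, the b-value of 1.0, and the rate of 500 \(M \ge 3\) events per year are hypothetical choices made for illustration and are not inputs taken from this study; the sketch also assumes that the rate of events above \(m_{\min }\) is held fixed while \(m_{\max }\) is varied.

```python
def truncated_gr_exceedance_fraction(m, m_min, m_max, b=1.0):
    """Fraction of events with magnitude >= m under a Gutenberg-Richter
    distribution truncated to the interval [m_min, m_max]."""
    if m <= m_min:
        return 1.0
    if m >= m_max:
        return 0.0
    tail_at_m = 10.0 ** (-b * (m - m_min))
    tail_at_m_max = 10.0 ** (-b * (m_max - m_min))
    return (tail_at_m - tail_at_m_max) / (1.0 - tail_at_m_max)


def annual_rate_above(m, rate_m_min, m_min, m_max, b=1.0):
    """Annual rate of events with magnitude >= m, given the annual rate
    of events with magnitude >= m_min."""
    return rate_m_min * truncated_gr_exceedance_fraction(m, m_min, m_max, b)


if __name__ == "__main__":
    RATE_M3 = 500.0  # hypothetical annual rate of M >= 3 events
    for m_max in (6.0, 7.0, 8.0):
        rate = annual_rate_above(5.5, RATE_M3, m_min=3.0, m_max=m_max)
        print(f"m_max = {m_max:.0f}: rate of M >= 5.5 = {rate:.3f} per year")
```

Under these assumptions, raising \(m_{\max }\) from 7 to 8 changes the rate of \(M \ge 5.5\) events by only a few percent, because the truncation term is small relative to the Gutenberg-Richter tail at that magnitude.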

Fig. 11

a Hazard in Oklahoma City and b statewide risk for different values of minimum and maximum magnitudes

Figure 12 shows the variation of hazard and risk with changes in b-value. Dempsey et al. (2016) show that induced earthquakes follow the Gutenberg-Richter relation, with b-values estimated between 0.8 and 1.5 for most regions. For a given overall earthquake rate, a smaller b-value implies a higher relative frequency of large-magnitude earthquakes. As expected, we observe that increasing b-values reduce both hazard and risk due to the lower frequency of large-magnitude events. This reduction is greater at higher shaking and loss values, because those levels are controlled by large-magnitude earthquakes, whose frequency is most sensitive to the b-value.
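As a rough worked illustration of this sensitivity, consider the untruncated Gutenberg-Richter relation with a fixed annual rate \(\nu \) of \(M \ge 3\) events. The symbol \(\nu \), the choice of \(m_{\min } = 3\), and the neglect of the \(m_{\max }\) truncation are simplifying assumptions made here, not values taken from the analysis:

\[
\frac{N(M \ge 5)}{\nu } = 10^{-b\,(5-3)} =
\begin{cases}
10^{-1.6} \approx 0.025, & b = 0.8 \\
10^{-3.0} = 0.001, & b = 1.5
\end{cases}
\]

Under these assumptions, moving across the range of b-values reported by Dempsey et al. (2016) changes the expected fraction of \(M \ge 5\) events by a factor of about 25. The same calculation for \(M \ge 6\) gives a factor of \(10^{0.7 \times 3} \approx 126\), which is consistent with the larger b-value sensitivity at higher shaking and loss levels.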

Fig. 12

a Hazard in Oklahoma City and b statewide risk for different b-values at different minimum and maximum magnitudes

4.3 Changes in ground-motion prediction equations

Well-constrained ground-motion prediction equations for Oklahoma have only become available recently (Yenier et al. 2017); they had not been developed earlier because of the extremely low seismicity in the region. Moreover, induced earthquakes have generally been located at shallower depths (\(\approx {5}\,{\hbox {km}}\)) than tectonic earthquakes (\(\approx {10}\,{\hbox {km}}\)), and it has been contended that ground motions from induced earthquakes behave differently from those of tectonic earthquakes (Hough 2014; Cremen et al. 2017; Gupta et al. 2017). In Fig. 13, we compare hazard and risk for the Atkinson (2015) (A15) and the Gupta et al. (2017) (\(\hbox{SP16}_{\mathrm {scaled}}\)) GMPEs that have been developed for application in Oklahoma. We observe that hazard and risk estimates based on A15 are lower than those based on \(\hbox{SP16}_{\mathrm {scaled}}\). The two models predict similar amplitudes at source-to-site distances of \(\le {60}\,{\hbox {km}}\), while A15 predicts lower amplitudes than \(\hbox{SP16}_{\mathrm {scaled}}\) at larger distances. The two GMPEs have similar standard deviations. This explains the differences in our estimates in Fig. 13. We also observe that the differences increase at larger acceleration values, as we would expect, because larger values are governed by larger-magnitude earthquakes, for which ground shaking at longer distances is a more important factor. However, this increased difference is not reflected in the risk curve, because the higher losses at our exceedance levels of interest are governed by damage to cities with large asset values located at short distances from earthquake epicenters. This analysis emphasizes the need for better-constrained GMPEs for regions of induced seismicity, especially at short distances, to better resolve the shaking and losses resulting from small-magnitude earthquakes.

Fig. 13

Hazard in Oklahoma City (a) and statewide risk (b) for A15 and \(\hbox{SP16}_{\mathrm {scaled}}\) GMPEs

4.4 Changes in vulnerability

We consider the reduction in risk obtained by decreasing the exposure’s vulnerability, which we model by increasing the medians of the vulnerability curves by a certain ‘upgrade ratio.’ In Sect. 3.1 we increased the medians by a ratio of 1.43. Here we further upgrade the vulnerability curves by ratios of 2.0 and 3.0. This upgrade could be achieved by retrofitting the buildings to a newer code standard or to the code standard applicable for high seismicity regions like California. Figure 14 shows the anticipated result that decreased vulnerability (or higher upgrade ratio) yields lower risk. For upgrade ratios of 2.0 and 3.0, respectively, the losses exceeded once a year on average are \(\${63}\,{\hbox {million}}\) and \(\${26}\,{\hbox {million}}\), and the losses exceeded with 10% annual probability are \(\${700}\,{\hbox {million}}\) and \(\${344}\,{\hbox {million}}\).
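The effect of an upgrade ratio on a single curve can be sketched as follows. This is a minimal illustration that assumes a lognormal curve shape of the kind used for HAZUS-style fragility functions; the function names, the base median of 0.2 g, and the dispersion of 0.6 are hypothetical and are not the parameters used in this study.

```python
import math


def lognormal_cdf(x, median, beta):
    """Lognormal CDF with the given median and logarithmic standard deviation."""
    if x <= 0.0:
        return 0.0
    z = math.log(x / median) / beta
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def damage_probability(pga_g, base_median_g, beta, upgrade_ratio=1.0):
    """Probability of reaching a damage state at a given PGA, after the
    curve median has been multiplied by upgrade_ratio."""
    return lognormal_cdf(pga_g, base_median_g * upgrade_ratio, beta)


if __name__ == "__main__":
    BASE_MEDIAN_G = 0.2  # hypothetical median PGA of the unmodified curve
    BETA = 0.6           # hypothetical logarithmic standard deviation
    for ratio in (1.0, 1.43, 2.0, 3.0):
        p = damage_probability(0.1, BASE_MEDIAN_G, BETA, ratio)
        print(f"upgrade ratio {ratio:.2f}: P(damage | PGA = 0.1 g) = {p:.4f}")
```

Multiplying the median shifts the whole curve toward higher PGA without changing its dispersion, so the damage probability at any fixed shaking level decreases as the upgrade ratio increases.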

In Sect. 3.2, we mentioned that based on observed losses in Oklahoma, risk in recent years might be on the order of \(\${100}\,{\hbox {million}}\) exceeded with 10% annual probability. This indicates that vulnerability curves associated with an upgrade ratio of 3.0 might be more representative of the building vulnerability in Oklahoma. This may reflect either stronger than expected seismic strength of buildings, or lower damage potential of ground motions with a given PGA in Oklahoma, e.g., due to short shaking duration or low long-period energy. This vulnerability roughly corresponds to the High-code classification in HAZUS in the case of masonry structures and exceeds this classification for wood structures. High-code classification in HAZUS is used for fragility functions of new buildings in California. Risk analysis for different vulnerability levels can be a useful tool for city officials and operators to quantify benefit-cost ratios of upgrading structures in a region.
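The loss levels in this section are quoted both as annual exceedance rates ("exceeded once a year on average") and as annual exceedance probabilities. A short sketch of the conversion between the two is given below, assuming that exceedances follow a Poisson process; that assumption and the function names are introduced here for illustration and are not stated in the analysis above.

```python
import math


def rate_to_annual_probability(rate_per_year):
    """Probability of at least one exceedance in one year for a Poisson
    process with the given annual rate."""
    return 1.0 - math.exp(-rate_per_year)


def annual_probability_to_rate(probability):
    """Annual Poisson rate corresponding to a given probability of at
    least one exceedance in one year."""
    return -math.log(1.0 - probability)


if __name__ == "__main__":
    print(rate_to_annual_probability(1.0))   # loss exceeded once a year on average
    print(annual_probability_to_rate(0.10))  # loss exceeded with 10% annual probability
```

For small rates the two measures are nearly equal, but they diverge at high rates: under the Poisson assumption, a loss exceeded once a year on average has an annual exceedance probability of about 63%, not 100%.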

Fig. 14

Statewide risk for vulnerability curves with medians increased by the ratio shown, corresponding to change-point rates on January 01, 2017

5 Conclusions

We have presented a framework to estimate temporally-varying hazard for induced seismicity, and a stochastic Monte-Carlo simulation procedure to estimate regional risks. We estimated seismic risk for the state of Oklahoma, and confirmed that short-term hazard and risk are significantly elevated due to induced seismicity. We estimated peak ground acceleration of 0.08 g with 10% annual exceedance probability and 0.3 g with 1% annual exceedance probability in Oklahoma City. The statewide risk indicated losses of \(\${1.2}\,{\hbox {billion}}\) with 10% annual exceedance probability and \(\${5.5}\,{\hbox {billion}}\) with 1% annual exceedance probability. These hazard estimates are of the same order of magnitude as those estimated by USGS, but the risk estimates are an order of magnitude higher than anticipated based on observed losses from recent earthquakes. We explored this inconsistency by changing the vulnerability curves for buildings in Oklahoma and observed that curves with median PGA equal to three times those specified by HAZUS yielded risk curves in the expected range. The losses from this upgraded vulnerability were \(\${344}\,{\hbox {million}}\) with 10% annual exceedance probability and \(\${2.2}\,{\hbox {billion}}\) with 1% annual exceedance probability. Similar analyses with changing vulnerability curves can be used to quantify the benefits of retrofitting buildings to higher seismic resistance.

Analysis of Oklahoma hazard and risk over time indicates that risk increased substantially between 2009 and 2010, and then again between 2013 and 2014. More recently, a reduction in seismicity rates, potentially resulting from reduction in injection volumes in the state as a result of regulation (Baker 2017) and market conditions, has caused a decrease in statewide risk. We also assessed the impacts on hazard and risk from changes in magnitude distribution and ground-motion prediction equations. Due to higher vulnerability of buildings in Oklahoma, buildings could be impacted by magnitude \(\le 5\) earthquakes, hence we suggest using minimum magnitudes of \(M \le 3\) for hazard and risk assessment. Maximum magnitudes above 6 did not have significant impacts on hazard and risk for the annual exceedance rates of interest. Since we have already observed a M5.8 earthquake in Oklahoma, we suggest using \(M \ge 6\) for maximum magnitude. b-values and GMPEs impacted risk significantly, indicating that further research on these topics will benefit risk assessments.

The risk analyses presented here served three main objectives: (1) to demonstrate the framework, (2) to suggest how the current results can be used to inform policy, and (3) to evaluate the reasonableness of model inputs. Some of our observations, such as the issues with assumed building vulnerabilities, were a result of our implementation of the framework within the constraints of previously available data and research. There remain uncertainties associated with seismicity rates, ground-motion prediction equations, asset loss correlations, and building vulnerability functions and their assumed distributions that should be further studied to better constrain the risk analyses.

The seismicity rates for induced seismicity need to be updated regularly, and resulting assessments can be used to quantify time-varying hazard and regional risk as presented in this study. Risk assessment using this framework for different vulnerability levels and seismicity rates can be performed in an automated and ongoing manner, and will help stakeholders to quantify the benefits of various risk mitigation measures, thus serving as a valuable decision support tool.