1 Introduction

Catastrophe risk modelling has gained considerable popularity in recent decades and is now considered crucial for many organizations, governments, insurance and reinsurance companies, enterprises, and communities when evaluating ways to manage and reduce their risk from natural perils. Long-term hazard and risk assessments are the basis for defining long-term risk mitigation actions. In this regard, many initiatives have developed seismic risk analyses at various scales around the world. However, some of them are not easily available or open source, and for those that are, few analyses have validated them or evaluated the effects of the different assumptions made when performing seismic risk assessments with those inputs at a local scale.

Studies including Crowley et al. (2005), Bazzurro et al. (2007), Crowley et al. (2014), Pitilakis (2015), Sousa et al. (2018), Silva (2018), Silva (2019), Silva et al. (2019) and Kalakonas et al. (2020) have sought to highlight the shortcomings of, and future trends in, the modelling approach of seismic risk assessments. These include comments on the treatment and propagation of uncertainties, the biases introduced by the assumptions and input parameters used, and recommendations of future directions for the different stages of the analyses. Regarding site-specific analyses, they recommend that analytical fragility models should be not only structure-specific but also hazard-specific, and that the epistemic uncertainty of the hazard model should also be propagated into the fragility model.

Another group of studies has illustrated, through applied case studies, the significant variability in losses when different inputs, scales, assumptions, and uncertainties are considered or neglected in the different stages of local analyses (Kohrangi et al. 2017; Riga et al. 2017). However, the number of such studies is still low, and more attention is needed on the calibration and validation of risk results at different scales and in different locations, so that the uncertainty and variability arising from different models, assumptions, and inputs in this type of analysis can be quantified.

When performing a probabilistic seismic risk assessment (PSRA) at a local scale, the common practice when no local information is available is to take inputs from existing global or regional analyses, or from local studies in other parts of the world, and adjust them to the reality of the site of interest. However, even when following this path, several problems can be encountered: (1) few open-source databases are readily available; (2) the available data may not be directly convertible or adjustable to the site of interest; (3) the results may not reflect the reality and specific characteristics of the study site. Any PSRA generally requires three main components: a probabilistic seismic hazard model, an exposure dataset, and a set of vulnerability functions, each contributing to the uncertainty or bias of the results. In the case of vulnerability, localized risk analyses for specific cities or areas are being conducted using global or regional vulnerability models, or models adjusted from them, which could disregard local hazard conditions and building practices, thereby constraining and biasing the seismic risk results. As stated by Villar-Vega et al. (2017) and Kohrangi et al. (2017), models derived with a higher level of detail should be considered for the assessment of earthquake losses at a local scale.

The previous statement is particularly true for regions where the contribution of the subduction tectonic regime is significant. As stated by Martins and Silva (2020), the modeller should consider the distinct tectonic environments when selecting ground motion records, as their specific characteristics can influence the resulting fragility functions, particularly when an inefficient intensity measure (IM) is used. Kohrangi et al. (2017) also noted that using randomly selected record sets to perform dynamic analysis, without at least some consideration of spectral shape and hazard consistency, can generate biased risk estimates. In many cases, it is assumed that the vulnerability functions of two identical buildings located at different sites in the world, region, or country are identical. However, according to Kohrangi et al. (2017), the structural response estimates even of identical buildings are sensitive to the characteristics of the earthquakes that control the hazard at each site, and thus the vulnerability functions should be consistent with the local hazard and building practices.

From a structural perspective, fragility curves cannot be derived in the same manner for different sites, as construction practices, materials, and geometric properties can change significantly from one country to another, or even within the same country. This suggests that the common approach of applying a single fragility function to multiple sites can bring large uncertainty and bias into loss estimation unless the accelerograms used for its development are carefully selected to be consistent with the seismic hazard of the region or site of interest, and unless the local characteristics of materials and building practices are considered in the analyses.

This study addresses several of these issues in local seismic risk assessment, focusing on three main aspects: (1) the selection of the input motion, (2) the inclusion of specific local characteristics of the structure and (3) the methodology followed for deriving the fragility or vulnerability models. It then provides recommendations for the modeller and the decision-maker, depending on the intended final use of the risk results. Deriving more detailed models can be inconvenient or extremely difficult in practice or outside the academic world, where the time, resources, or expertise may not be available. It is therefore important to know the implicit uncertainties and biases of using simpler models or readily available data, and to be able to allocate resources to improving the aspects that contribute the most to reducing the uncertainty and bias of the results, without compromising the efficiency of the analyses.

2 Methodology

The focus of the study is to provide insights into the sensitivity of risk metrics when using different inputs or assumptions in a local PSRA, focusing particularly on the vulnerability component. The three main components Exposure, Hazard, and Vulnerability are described in the following sections.

2.1 Exposure modelling

The city of Medellín, Colombia, located in the Andean zone of the country, was selected as the case study. As the second-largest city in Colombia, it accounts for 6.15% of the country's GDP and 5.2% of its population (DANE 2018). Studies in the region have estimated the number of buildings in the city at between 245 and 345 thousand and the exposed value at between 24 and 145 billion USD (Osorio 2015; Salgado et al. 2014; González 2017). These estimates include a significant contribution of informal construction (with no seismic provisions) and low-code reinforced concrete buildings, which jointly account for over 60% of the total building stock of the city (Table 1). This study focuses on these assets, which represent the most vulnerable building classes in Medellín: unreinforced masonry and pre-code or low-code non-ductile mid- to high-rise concrete buildings with masonry infills.

Table 1 Contribution of vulnerable building classes in the total exposure of Medellin

To compute the number of buildings and their economic value for the present study, open-source databases were consulted. Most cadastral databases are only partially available (if they are available at all), and even then they lack the variables needed to construct reliable exposure datasets. For this reason, publicly available information from research previously conducted in the city and from the municipality's open data platform was used to derive the exposure in this study. Specifically, the area per typology in each neighbourhood, the area per dwelling, and the number of dwellings per typology were obtained from González (2017), using the tables in the appendix of that study as input. A further description of how these tables were derived can be found in González (2017) and Acevedo et al. (2020).

Since the data of that study are aggregated by macro taxonomy, they were further distributed using (1) the socio-economic strata of each neighbourhood and (2) the number of buildings per storey range in each case, as reported in the strata distribution per district on the municipality's open data site (Alcaldía de Medellín 2020). Three main structural typologies were assumed to represent the most vulnerable classes in the city. For low-rise unreinforced masonry houses with no ductility, only the counts of 2-storey houses were considered (referred to herein as MUR_H2). For mid-rise reinforced concrete buildings with masonry infills and low ductility, the representative buildings selected were the 4-storey buildings (referred to herein as CR_H4). Finally, for high-rise pre-code reinforced concrete buildings with masonry infills and low ductility, the 8-storey buildings were used (referred to herein as CR_H8). These three building typologies were selected as index buildings for analysing the behaviour of the most vulnerable classes in the city (see Table 2). Maps of the distribution of buildings and the cost of these typologies in Medellín are shown in Fig. 1.

Table 2 Values of the exposure model derived for this study
Fig. 1 Number of buildings and replacement cost for the portfolio of buildings in Medellín: MUR_H2 (top), CR_H4 (centre), and CR_H8 (bottom)

Once the index building classes were chosen, a literature review was conducted to establish whether previous local studies in the country had focused on these specific typologies and could be used to define the structural parameters needed to build capacity curves. These were later used for the derivation of the vulnerability functions and to define the fundamental periods for the ground motion selection procedure described in Sect. 2.2. Among the consulted studies, Sinisterra (2017) provided good insights into the behaviour of low- and high-rise reinforced concrete buildings designed with the first seismic code in Colombia, the CCCSR-84 (AIS-84). According to that study, a structural period of 0.5 s should be considered for mid-rise buildings and 1.0 s for high-rise buildings. For the unreinforced masonry case, two studies were consulted, Acevedo et al. (2017) and Spinel et al. (2019), and a structural period of 0.25 s was assumed.

2.2 Hazard modelling

Colombia is located in the northwestern corner of South America, in a tectonic environment governed by the convergence of three plates (the Caribbean plate to the north, the Nazca plate to the west, and the South American plate to the east), together with the contribution of the Panama and North Andean blocks (Fig. 2). The interaction between the Nazca and South American plates produces subduction events at shallow to intermediate depths along the Pacific coast. Likewise, the Caribbean plate moves towards the South American plate, creating stresses in that zone. Additionally, the country hosts the Bucaramanga seismic nest, located in the northeast of the country and characterized by deep events, and the active shallow crustal seismicity zone along the Andes Mountains, where most of the important cities of the country are located, including Medellín (Taborda et al. 2000; Paris et al. 2000; Pulido 2003).

Fig. 2 Location of Medellín city and the Colombian seismicity (dashed lines: tectonic plate boundaries; dash-dotted lines: active fault traces)

The Colombian Geological Survey (SGC, by its Spanish acronym) and the Global Earthquake Model (GEM) Foundation recently developed a seismic hazard model for the entire country (Riviera et al. 2020), which is available upon request and was used for the analyses in this study. The model (referred to herein as the SGC-GEM model) considers information from seismological and geological studies, digital elevation models, and a homogeneous earthquake catalogue updated from the national seismological network. These datasets facilitated the definition of seismic sources for the different tectonic regime types (TRT): active shallow crust, subduction interface, subduction intraslab, and deep seismicity, as well as the selection of several Ground Motion Prediction Equations (GMPEs) according to the local intensity levels recorded by the national accelerometric network. The epistemic uncertainty is considered in two stages: (1) in the seismic sources, through a logic tree of two branches, and (2) in the attenuation models, through a logic tree of three GMPEs per tectonic regime type, leading to a total of 162 branches. Further details of the model can be found in SGC (2018) and Riviera et al. (2020).
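The branch count quoted above follows from the two source-model branches combined with three GMPEs for each of the four tectonic regime types. The short sketch below only reproduces that arithmetic by enumeration; the branch labels are hypothetical placeholders, since the text gives the counts but not the names.

```python
from itertools import product

# Hypothetical labels: the text states only the counts (2 source-model
# branches and 3 GMPEs for each of the 4 tectonic regime types).
source_branches = ["source_model_A", "source_model_B"]
tectonic_regime_types = [
    "active shallow crust",
    "subduction interface",
    "subduction intraslab",
    "deep seismicity",
]
gmpes_per_trt = 3

# One GMPE choice per tectonic regime type, combined with each source branch.
gmpe_choices = [range(gmpes_per_trt) for _ in tectonic_regime_types]
realizations = list(product(source_branches, *gmpe_choices))

n_realizations = len(realizations)
print(n_realizations)  # 2 * 3**4
```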

The SGC-GEM model is implemented in the OpenQuake-engine (Silva et al. 2014; Pagani et al. 2014) and is used here to perform a probabilistic seismic hazard analysis (Cornell 1968; Esteva 1970) as well as a seismic hazard disaggregation (Bazzurro and Cornell 1999) by tectonic regime for the city of Medellín. The aim is to determine the contribution of the different seismic sources for the five intensity measure (IM) types of interest according to the exposure model and the other consulted global and regional databases, namely Sa(0.25 s), Sa(0.30 s), Sa(0.50 s), Sa(0.60 s), and Sa(1.0 s), and for thirteen (13) intensity measure levels (IML) with return periods between 73 and 100,000 years. These IMs were selected close to the fundamental periods, T1, of the systems from the global, regional, and local studies presented in Sect. 3, to increase the efficiency and ensure lower uncertainties in the response predictions (Luco and Cornell 2007). Figure 3 presents the contribution of each tectonic environment to the seismic hazard for IM = Sa(0.50 s), and shows that the subduction interface and active shallow crust regimes dominate the hazard at the site. Figure 3 also shows the uniform hazard spectrum for the 2475-year return period for the principal tectonic regimes.
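For readers more used to probabilities of exceedance than return periods, the conversion can be sketched as follows. This assumes Poissonian occurrence of exceedances and a 50-year investigation time; both are illustrative assumptions rather than settings reported in the text, and the function names are hypothetical.

```python
import math

def poe_from_return_period(return_period, investigation_time=50.0):
    """Probability of at least one exceedance, assuming Poisson occurrences."""
    return 1.0 - math.exp(-investigation_time / return_period)

def return_period_from_poe(poe, investigation_time=50.0):
    """Inverse of poe_from_return_period under the same Poisson assumption."""
    return -investigation_time / math.log(1.0 - poe)

# Return periods mentioned in the text: the two ends of the IML range and
# the 2475-year level used for the uniform hazard spectrum.
for rp in (73, 2475, 100_000):
    print(f"{rp:>7} years -> PoE in 50 years = {poe_from_return_period(rp):.4f}")
```

Under these assumptions the 2475-year level corresponds to roughly a 2% probability of exceedance in 50 years, and the 73-year level to roughly 50%.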

Fig. 3 Left: Seismic hazard disaggregation by tectonic regime type for IM = Sa (0.5 s). Right: Uniform hazard spectrum for the principal tectonic regimes at the site of analysis

Although the subduction intraslab and deep seismicity sources contribute across all IMs, those tectonic regimes are not considered in the analysis because they do not represent the majority of the hazard for the IMs and IMLs considered. The authors are aware that these seismic sources make a contribution that may or may not affect further analysis, but this assumption is believed to be sufficient for the scope of this study. Therefore, from this point forward only the subduction interface and active shallow crust sources are analysed. This differentiation by tectonic regime is made to estimate hazard parameters (magnitude, distance, and epsilon) that allow realistic scenarios to be found for each seismic source and a proper record selection to be performed. In the view of the authors, using mean hazard parameters estimated from the combination of tectonic regimes would result in events that are not real or possible within the seismic sources for the site of interest.

In addition, target intensity levels per tectonic regime for each intensity measure type and return period were computed for rock conditions (soil type B). Soil conditions and amplification factors were not included in this study, even though their contribution is well known in real local analyses. This decision was taken to avoid introducing more variability into the sensitivity analyses and to follow a procedure similar to that of Villar-Vega et al. (2017), who used ground motions computed using only rock-site records during the selection process.

Then, to select the ground motion records, we used the conditional-spectrum (CS) method (Jayaram et al. 2011), computing the mean scenario (i.e., mean magnitude, M, mean distance, R, and mean epsilon, ε) that best represents the site of analysis for the mentioned intensity measure levels and types (Harmsen 2001) and the selected tectonic regimes. The GMPE model contains a logic tree for each tectonic environment (Riviera et al. 2020), so an approximate CS target spectrum was calculated from the mean values of M, R, and ε using the weights of each branch (Lin et al. 2013). The corresponding correlation models for each tectonic regime were also considered (Baker and Jayaram 2008; Jayaram and Baker 2009; Jaimes and Candia 2019). Table 3 presents an example of the mean values of magnitude, distance, and epsilon computed for the 2475-year return period for the principal tectonic regimes.
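As a reminder of what the CS target looks like, the sketch below evaluates the usual conditional mean and standard deviation of ln Sa at each period, given ε at the conditioning period T*. All numerical inputs are placeholders: in practice the unconditional means and standard deviations would come from the GMPEs evaluated at the mean M and R (weighted over the logic tree), and the correlations from the cited correlation models. The function name is hypothetical.

```python
import math

def conditional_spectrum(mu_ln_sa, sigma_ln_sa, rho_with_tstar, epsilon_tstar):
    """Conditional mean and std of ln Sa(T_i) given epsilon at T*.

    mean_i = mu_i + rho_i * epsilon * sigma_i
    std_i  = sigma_i * sqrt(1 - rho_i**2)
    """
    cond_mean = [
        m + r * epsilon_tstar * s
        for m, s, r in zip(mu_ln_sa, sigma_ln_sa, rho_with_tstar)
    ]
    cond_std = [
        s * math.sqrt(1.0 - r * r)
        for s, r in zip(sigma_ln_sa, rho_with_tstar)
    ]
    return cond_mean, cond_std

# Placeholder inputs (NOT values from the study).
periods = [0.25, 0.50, 1.00]                             # s
mu = [math.log(0.60), math.log(0.40), math.log(0.20)]    # GMPE mean ln Sa (g)
sigma = [0.60, 0.65, 0.70]                               # GMPE std of ln Sa
rho = [0.80, 1.00, 0.75]                                 # correlation with T* = 0.50 s
eps = 1.5                                                # mean epsilon at T*

cond_mean, cond_std = conditional_spectrum(mu, sigma, rho, eps)
for t, m, s in zip(periods, cond_mean, cond_std):
    print(f"T = {t:.2f} s: median = {math.exp(m):.3f} g, std(ln Sa) = {s:.3f}")
```

At the conditioning period the correlation equals one, so the conditional standard deviation collapses to zero and the target matches the hazard level exactly; away from T* the spread widens, which is what the percentile bands of the target spectrum reflect.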

Table 3 Mean values of magnitude (M), Joyner–Boore distance (R, in km), and epsilon (ε) for IM = Sa (0.50 s), 2475-year return period for the principal tectonic regimes

The accelerograms were collected from several ground motion databases worldwide that include events for the tectonic environments considered in the analysis, such as the Pacific Earthquake Engineering Research (PEER) NGA-West2 (Ancheta et al. 2014) and NGA-Sub (Bozorgnia et al. 2020) databases, the Colombian Geological Survey (SGC 2020), the K-NET and KiK-net networks (NIED 2019), the National Seismological Service of Mexico (SSN-UNAM 2020) and the SIBER-RISK strong motion database of Chilean earthquakes (SIBER-RISK 2019). Multiple records were selected per TRT, IML and IM. When no unscaled records were available for a particular target acceleration value, records were scaled, with a maximum scaling factor of 5.0. The CS-selected records for IM = Sa (0.5 s) and the 2475-year return period are shown in Fig. 4 for both the subduction interface and active shallow crust tectonic regimes.

Fig. 4 CS record selection for IM = Sa (0.5 s), TR = 2475-year return period for subduction interface (left) and active shallow crust (right) showing the 2.5th, 50th and 97.5th target percentiles of the spectral acceleration and the selected records

These records were later combined into an ‘indifferent’ set that merges records from both the active shallow crust and subduction interface regimes. To perform this merge, we used the results of the disaggregation by TRT (see Fig. 3) at each IM and IML to distribute the records according to the contribution of each tectonic environment, ensuring a minimum of 20 records per case and prioritizing those with the smallest scaling factors. In total, 1300 records were selected; this number has been assumed to be statistically sufficient to represent the structural response of the buildings and the tectonic environment in previous analyses such as Sousa et al. (2017) and Martins and Silva (2020).
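One possible way to carry out such a merge is sketched below. It is an illustration under stated assumptions, not the procedure actually scripted for the study: the disaggregation contributions are invented, the helper names are hypothetical, "smallest scaling factor" is interpreted as "closest to 1", and the largest-remainder rounding is a design choice of this sketch. The last line simply checks that 20 records for each of the 5 IMs and 13 IMLs is consistent with the total of 1300.

```python
import math

def allocate_records(contributions, n_records=20):
    """Split n_records among tectonic regimes in proportion to their
    disaggregation contribution (largest-remainder rounding)."""
    total = sum(contributions.values())
    exact = {trt: n_records * c / total for trt, c in contributions.items()}
    counts = {trt: math.floor(x) for trt, x in exact.items()}
    leftover = n_records - sum(counts.values())
    by_remainder = sorted(exact, key=lambda t: exact[t] - counts[t], reverse=True)
    for trt in by_remainder[:leftover]:
        counts[trt] += 1
    return counts

def pick_records(candidates, n, max_scale=5.0):
    """Keep (record_id, scale_factor) pairs within the scaling limit,
    preferring scale factors closest to 1 (an assumption of this sketch)."""
    allowed = [rec for rec in candidates if rec[1] <= max_scale]
    return sorted(allowed, key=lambda rec: abs(math.log(rec[1])))[:n]

# Hypothetical disaggregation result for one IM and one IML.
contrib = {"subduction interface": 0.62, "active shallow crust": 0.38}
counts = allocate_records(contrib)
print(counts)

# Hypothetical candidate records: (identifier, scale factor).
candidates = [("a", 1.2), ("b", 6.0), ("c", 0.9), ("d", 3.0)]
chosen = pick_records(candidates, n=2)
print(chosen)

total_records = 5 * 13 * 20  # IMs x IMLs x records per case
print(total_records)
```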

2.3 Vulnerability modelling

Fragility and vulnerability models are one of the key elements of seismic risk assessment and, at the same time, an important source of uncertainty in the seismic risk modelling process. When analytical curves are derived, some of the primary sources of uncertainty in the vulnerability modelling are: (1) the selection of the input motion, (2) the characteristics of the structure, (3) the modelling procedure, and (4) the methodology and statistical procedure followed for deriving the fragility curves. This study focuses on the effect of the assumptions in the input record selection, the inclusion or omission of local-specific characteristics of the structures, and the methodology followed in the derivation of the fragility curves.

In the following sections, a description of the databases used and the procedure for the derivation of fragility and vulnerability curves are presented. As previously stated, open-source global and regional databases were the principal sources of information for the fragility curve selection. Three different groups of fragility functions were considered: those from the Global Earthquake Model database (Global), those derived in the SARA project (Regional), and those derived specifically for this study (Local). These local curves are based on the local characteristics of the structures (obtained from local studies) but using the equations provided in the regional or global studies for the conversion to the equivalent Single Degree of Freedom (SDOF) systems, and thus the derivation of the capacity curves. Different vulnerability functions were selected and derived for each of the three typologies previously identified in the exposure section, with variations considering the ground motion input selection, the structural characteristics and the derivation methodology of the fragility and vulnerability curves.

2.3.1 Global database: global earthquake model—GEM

The capacity curves, fragility functions, and vulnerability functions from the Global Earthquake Model, derived following the procedure reported in Martins and Silva (2020), were used. These are hosted in a GitHub repository (https://github.com/lmartins88/global_fragility_vulnerability). The fragility curves selected from the database are: (1) low-rise non-ductile unreinforced masonry buildings in South America (MUR_SAmerica_LWAL-DNO_H2), (2) mid-rise low-code reinforced concrete buildings with infills (CR_LFINF-DUL_H4), and (3) high-rise low-code reinforced concrete buildings with infills (CR_LFINF-DUL_H8) (see Table 4). These original fragility and vulnerability functions were compared with curves derived in the present study using the median capacity curves reported in Martins and Silva (2020), whose parameters are shown in Table 5, but following the hazard-consistent ground motion record selection of Sect. 2.2. The methodology was that of Martins and Silva (2020), in which a censored cloud analysis is used to derive the vulnerability functions as a post-process after conducting Nonlinear Time History Analysis (NLTHA) on equivalent single-degree-of-freedom (SDOF) oscillators with the Risk Modeller’s ToolKit, RMTK (Silva et al. 2015). The building-to-building variability is treated just as in Martins and Silva (2020), by inflating the standard deviation by the same factor reported in their study. The damage thresholds (Table 6) and damage-to-loss model (Table 7) presented in Martins and Silva (2020) were used to allow a one-to-one comparison with the original fragility and vulnerability curves (as shown in Figs. 5 and 8).
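To give a feel for how a cloud analysis turns NLTHA results into fragility parameters, the sketch below fits ln(EDP) = a + b ln(IM) by ordinary least squares and converts the fit into a lognormal median and dispersion for one damage threshold. It deliberately omits the censoring step and the inflation for building-to-building variability, so it is plain cloud analysis rather than the procedure of Martins and Silva (2020); the data, the threshold, and the function names are invented for illustration.

```python
import math

def cloud_regression(ims, edps):
    """Ordinary least-squares fit of ln(EDP) = a + b * ln(IM).
    Returns (a, b, sigma), with sigma the residual standard deviation."""
    x = [math.log(v) for v in ims]
    y = [math.log(v) for v in edps]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_mean - b * x_mean
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sigma = math.sqrt(sse / (n - 2))
    return a, b, sigma

def fragility_from_cloud(a, b, sigma, edp_threshold):
    """Lognormal fragility (median theta, dispersion beta) for one threshold."""
    theta = math.exp((math.log(edp_threshold) - a) / b)
    beta = sigma / b
    return theta, beta

# Synthetic cloud (NOT study data): ln EDP = ln(0.02) + 1.0 * ln IM + residual.
ln_im = [-2.0, -1.0, 0.0, 1.0, 2.0]
residuals = [0.1, -0.2, 0.2, -0.2, 0.1]
ims = [math.exp(v) for v in ln_im]
edps = [math.exp(math.log(0.02) + v + r) for v, r in zip(ln_im, residuals)]

a_hat, b_hat, sigma_hat = cloud_regression(ims, edps)
theta, beta = fragility_from_cloud(a_hat, b_hat, sigma_hat, edp_threshold=0.02)
print(round(b_hat, 3), round(theta, 3), round(beta, 3))
```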

Table 4 Structural and dynamic parameters used to define the capacity curves
Table 5 Median capacity curve parameters for the different analyses for each of the three selected typologies
Table 6 Damage state thresholds defined for the derivation following the two procedures
Table 7 Damage to loss model defined for the derivation following the two procedures
Fig. 5 Comparisons between (top row) the global fragility curves from Martins and Silva 2020 (original) and the ones derived following their procedure but considering the site-specific record selection; (second row) the regional fragility curves from Villar-Vega et al. 2017 (original) and the ones derived following their procedure but considering the site-specific record selection; and (third and fourth rows) the local fragility curves derived using the site-specific record selection (local 1 following the procedure described in Martins and Silva 2020 and local 2 following the derivation method of Villar-Vega et al. 2017), for the three structural typologies studied: MUR_H2, CR_H4, CR_H8

2.3.2 Regional database: SARA project database

For the regional database, the curves derived for the South America Risk Assessment (SARA) project following the procedure of Villar-Vega et al. (2017) were consulted. The selected fragility curves, as reported in that paper, are (1) low-rise unreinforced masonry buildings (MUR/H:2), (2) mid-rise reinforced concrete buildings with infills (CR/LFINF/DNO/H:4), and (3) high-rise reinforced concrete buildings with infills (CR/LFINF/DNO/H:7), with parameters as shown in Table 4. As in the global case, the original regional vulnerability functions were compared with vulnerability curves derived in this study, starting from the median capacity curves with the parameters reported in Villar-Vega et al. (2017), as shown in Table 5, and following the methodology reported in that study but applying the hazard-consistent ground motion records selected with the procedure of Sect. 2.2. In this methodology, sets of 150 single-degree-of-freedom oscillators (to account for building-to-building variability), generated through Monte Carlo sampling around the median capacity curve, are subjected to a series of ground motion records using nonlinear time history analyses in the RMTK (Silva et al. 2015), following a Multiple Stripe Analysis (MSA). The results are then converted to fragility functions by fitting lognormal distributions to the damage probability matrix for the different damage states, using a least-squares regression. As in the global case, the damage thresholds (Table 6) and damage-to-loss model (Table 7) presented in Villar-Vega et al. (2017) were used to allow a one-to-one comparison with the original fragility and vulnerability curves (as shown in Figs. 5 and 8).
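The fitting step can be illustrated with a minimal sketch. Here the lognormal parameters are obtained by linear least squares in probit space, which is a simplification chosen for this example; the study may instead have minimised squared errors directly on the probabilities. The intensity levels and exceedance fractions are synthetic, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def fit_lognormal_fragility(im_levels, exceedance_fractions):
    """Fit theta and beta so that P(exceed | IM) ~ Phi(ln(IM / theta) / beta).

    Uses linear least squares on probit(p) = (ln IM - ln theta) / beta.
    Fractions equal to 0 or 1 are skipped because their probit is infinite.
    """
    std_normal = NormalDist()
    points = [
        (math.log(im), std_normal.inv_cdf(p))
        for im, p in zip(im_levels, exceedance_fractions)
        if 0.0 < p < 1.0
    ]
    n = len(points)
    x_mean = sum(x for x, _ in points) / n
    z_mean = sum(z for _, z in points) / n
    sxx = sum((x - x_mean) ** 2 for x, _ in points)
    sxz = sum((x - x_mean) * (z - z_mean) for x, z in points)
    slope = sxz / sxx                     # equals 1 / beta
    intercept = z_mean - slope * x_mean   # equals -ln(theta) / beta
    beta = 1.0 / slope
    theta = math.exp(-intercept / slope)
    return theta, beta

# Synthetic stripes: exceedance fractions generated from a known curve.
im_levels = [0.1, 0.2, 0.4, 0.8, 1.6]   # g
true_theta, true_beta = 0.5, 0.4
fractions = [
    NormalDist().cdf(math.log(im / true_theta) / true_beta) for im in im_levels
]

theta_hat, beta_hat = fit_lognormal_fragility(im_levels, fractions)
print(round(theta_hat, 4), round(beta_hat, 4))
```

With noise-free fractions the fit recovers the generating parameters; with real stripe results the two least-squares variants can give slightly different medians and dispersions, especially when many stripes sit near 0 or 1.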

2.3.3 Derivation of local fragility and vulnerability curves

For the derivation of local fragility curves, the capacity curves were derived using as input the local values of the structural characteristics reported in Sinisterra (2017), Acevedo et al. (2017), and Spinel et al. (2019), which are based on data from site visits and experimental analyses performed on these structural classes in Colombia and are listed in Table 4. The capacity curve parameters finally used for the local cases are presented in Table 5.

After having these capacity curves as basis, to avoid adding uncertainties and to be able to do a sensitivity analysis on (a) the record selection, (b) the regression method used and (c) the inclusion or not of local structural characteristics, two sets of locally derived fragility models were produced: (1) using the censored cloud analysis following the same procedure reported in Martins and Silva (2020) and (2) using the same procedure as reported in Villar-Vega et al. (2017).

For the former, the methodology explained in Sect. 2.3.1 was followed, using the local capacity curve of each typology as the median capacity curve. These curves are referred to as Local 1 hereafter. For the latter, the methodology explained in Sect. 2.3.2 was followed, using the local capacity curves as the median and generating the set of 150 curves through a Monte Carlo procedure to account for building-to-building variability. These curves are referred to as Local 2. Note that the derivation methodology, the damage state thresholds, and the damage-to-loss model differ between the two methodologies and could be sources of differences in the results of the two models. To avoid confusion, the name and definition of each analysis and the assumptions made in each case are shown in Table 8.

Table 8 Denomination of the different analyses and reference data used to derive the vulnerabilities in each case

At this point it is important to mention that, although the local capacity curves had yield and ultimate drift values similar to those of the global case (see Table 4), the main factor behind the significant differences in the local capacity curves is the assumed inter-storey height, which is significantly lower for the unreinforced masonry case in Colombia. In addition, given that the damage states are conditioned on the yield displacement (Sdy) and ultimate displacement (Sdu) (see Table 6), these changes in the local capacity curves produce significant differences in both the fragilities and vulnerabilities, as shown in the following section.

2.3.4 Derived fragility and vulnerability models and comparisons between them

The fragility curve parameters for the global, regional, and locally derived models using hazard-consistent records are given in Table 9. The analyses are presented for each structural typology. The reported values include the original global and regional curves as presented in the global and regional studies, the global and regional curves derived using the CS method of Sect. 2.2, and the local curves derived using both regression methods (Martins and Silva 2020 and Villar-Vega et al. 2017). Comparisons of the fragility curves are shown in Fig. 5, while comparisons of the vulnerability curves are given in Figs. 6, 7 and 8.
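The link between the fragility parameters (median θ and log standard deviation β) and a vulnerability curve can be sketched as follows: the lognormal fragility curves give the probability of reaching or exceeding each damage state, differences between consecutive curves give the probability of being in each state, and a damage-to-loss model turns these into a mean loss ratio. The parameter values and damage ratios below are placeholders, not those of Tables 7 and 9, and the function names are hypothetical.

```python
import math

def lognormal_cdf(im, theta, beta):
    """P(damage state reached or exceeded | IM) for a lognormal fragility."""
    return 0.5 * (1.0 + math.erf(math.log(im / theta) / (beta * math.sqrt(2.0))))

def mean_loss_ratio(im, fragility_params, damage_ratios):
    """Mean loss ratio at a given IM.

    fragility_params: (theta, beta) pairs ordered from the lightest to the
    most severe damage state; damage_ratios: loss ratio of each state.
    """
    p_exceed = [lognormal_cdf(im, theta, beta) for theta, beta in fragility_params]
    p_exceed.append(0.0)  # nothing lies beyond the most severe state
    p_in_state = [p_exceed[i] - p_exceed[i + 1] for i in range(len(damage_ratios))]
    return sum(p * ratio for p, ratio in zip(p_in_state, damage_ratios))

# Placeholder parameters (NOT the values of Table 9 or Table 7).
# A common beta keeps the curves from crossing in this illustration.
params = [(0.2, 0.5), (0.4, 0.5), (0.8, 0.5), (1.2, 0.5)]
ratios = [0.05, 0.25, 0.60, 1.00]

for im in (0.1, 0.4, 1.0, 2.0):
    print(f"Sa = {im:.1f} g -> mean loss ratio = {mean_loss_ratio(im, params, ratios):.3f}")
```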

Table 9 Median (θ) and log standard deviation (β) of the fragility functions defined for each building class
Fig. 6 Comparisons between (top row) the global vulnerability curves from Martins and Silva 2020 (original) and the ones derived following their procedure but considering the site-specific record selection; and (second row) the regional vulnerability curves from Villar-Vega et al. 2017 (original) and the ones derived following their procedure but considering the site-specific record selection, for the three structural typologies: MUR_H2, CR_H4, CR_H8. *Original: Sa (0.3), Indifferent: Sa (0.25). **Original: Sa (1.0), Indifferent: Sa (0.85)

Fig. 7 Comparisons between the local vulnerability curves derived using the site-specific record selection (local 1 following the procedure described in Martins and Silva (2020) and local 2 following the derivation method of Villar-Vega et al. (2017)), for the three structural typologies studied: MUR_H2, CR_H4, CR_H8

Fig. 8 Comparisons between (top row) the global vulnerability curves from Martins and Silva (2020) (CS-based) and the Local-1 curves; and (bottom row) the regional vulnerability curves from Villar-Vega et al. (2017) (CS-based) and the Local-2 curves, for the three structural typologies studied: MUR_H2, CR_H4, CR_H8

2.4 Impact of assumptions in input parameters

The impacts of the record selection (see Fig. 6), the vulnerability derivation method (see Fig. 7) and the use of capacity curves that include local characteristics (see Fig. 8) are shown below.

3 Risk results

The OpenQuake engine (Silva et al. 2014; Pagani et al. 2014) was used to conduct an event-based risk analysis for the different cases, using the vulnerability functions previously derived. The results are reported in this section. To achieve convergence in the risk analyses, 100,000 stochastic event sets (SES) of 1-year duration per logic tree branch were used. This configuration was chosen after conducting a convergence analysis on the loss exceedance curves, as shown in Fig. 9. The analysis illustrates that an SES of 100,000 years gives results of comparable variability to those of a 500,000-year SES for return periods up to 1000 years, with a 95% confidence interval. This is in line with Silva (2018), who reported that 100,000 was the minimum number of years needed to achieve convergence. However, it is important to mention that in this case a full enumeration of the logic tree was conducted, reaching 162 realizations.

Two risk metrics were considered for the sensitivity analyses of the risk results: the average annual loss ratio (AALR) and the probable maximum loss (PML). The AALR represents the expected loss per year, taken as the annuity required to compensate for all future modelled losses, normalized by the total exposed value. The AALR was obtained individually for each building typology, to allow a one-to-one comparison between the different analyses and cases. These results are reported in Fig. 10.
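The arithmetic behind this metric can be sketched from a simulated event loss table: the losses are summed, divided by the effective number of simulated years, and normalised by the exposed value of each typology. The loss and exposure figures below are invented, the function name is hypothetical, and the OpenQuake engine performs its own aggregation internally.

```python
def average_annual_loss_ratio(event_losses, effective_years, exposed_value):
    """Average annual loss divided by the total exposed value."""
    average_annual_loss = sum(event_losses) / effective_years
    return average_annual_loss / exposed_value

# Hypothetical event losses (USD) over 100,000 simulated years.
effective_years = 100_000
portfolio = {
    "MUR_H2": {"losses": [2.0e6, 5.0e5, 1.5e6], "exposed_value": 1.0e9},
    "CR_H4": {"losses": [4.0e6, 1.0e6], "exposed_value": 2.0e9},
}

aalr = {
    name: average_annual_loss_ratio(data["losses"], effective_years, data["exposed_value"])
    for name, data in portfolio.items()
}
for name, value in aalr.items():
    print(f"{name}: AALR = {value:.2e}")
```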

PML curves, in turn, relate loss levels to their expected return periods. These curves are used extensively; for example, the aggregated loss associated with a return period of 200 years is frequently used by insurance and reinsurance companies to establish their premiums and portfolio coverage. These curves can also be used by government officials to select earthquake scenarios for the preparation and design of post-disaster emergency plans.

Fig. 9

Analysis of the number of years per logic tree branch needed for convergence

Fig. 10

Comparison of AALR using the global vulnerabilities (left), regional vulnerabilities (centre) and using the local 1 and local 2 vulnerabilities (right)

In the present study, three different analyses and comparisons are shown using PMLs: (1) the effect of the ground motion record selection (see Fig. 11), (2) the effect of using different regression methods (see Fig. 12) and (3) the effect of including the local characteristics (see Fig. 13) for the fragility and vulnerability curve derivation.

Fig. 11

Impact of the record selection (original vs indifferent ‘CS-based using Sect. 2.2’) in the PML curves for a range up to 1000 years of return period using the global vulnerabilities (top-row) and using the regional vulnerabilities (bottom-row) for the three different typologies

Fig. 12

Impact of the vulnerability curve derivation method in the PML curves for a range up to 1000 years of return period based on locally derived curves: local 1 following Martins and Silva (2020) procedure and local 2 using Villar-Vega et al. (2017), for the three different typologies

Fig. 13

Impact of the inclusion of local characteristics in the PML curves for a range up to 1000 years of return period. Comparison of global case using Sect. 2.2 CS method vs. local 1 (top-row) and regional case using Sect. 2.2 CS method vs. local 2 (bottom-row), for the three different typologies

4 Analysis of results

As stated in previous sections, three main factors that could influence the results from the vulnerability point of view were studied: (1) the effect of the ground motion record selection, (2) the effect of using different vulnerability derivation methods and (3) the effect of including the local characteristics of the structure as input parameters for the capacity curve derivation but still using the same regression formula used in the global and regional cases.

Regarding the ground motion record selection (comparison of the global and regional Original and Indifferent cases), the AALR (Fig. 10) and the PML (Fig. 11) for the unreinforced masonry case present the smallest difference between the two analyses, while for the CR_H4 and CR_H8 cases both risk metrics appear to be reduced to almost half when local hazard-consistent records are considered. A possible reason is that the records selected from the different databases have similar content in the high-frequency range, while in the low-frequency range the process of scaling the records to a specific IM level using the CS procedure could be reducing the expected intensities and, in consequence, the possible losses. As an example, for the regional case, Villar-Vega et al. (2017) state that the “ranges of ground shaking were defined according to the minimum level expected to cause damage and the maximum ground shaking expected in the regions with the highest seismic hazard”.

On the other hand, the impact of the vulnerability derivation methodology was studied by comparing the local 1 and local 2 analyses, which make use of the same capacity curve but consider two different methodologies (Martins and Silva 2020; Villar-Vega et al. 2017) for the derivation of the vulnerability models. It is important to mention that, in addition to using different derivation or regression methodologies, each procedure also considers different damage state and consequence models, which could be the source of the differences shown in the PML and AALR metrics. In this case, even though the AALRs present significant differences, the differences among the PMLs (Fig. 12) are negligible for return periods of up to 100–200 years in all cases. It is also interesting to note that, among the three typologies, only the CR_H4 case shows lower results for the local 2 case. This seems odd when looking at the vulnerability curves presented in Fig. 7, where the local 2 curve tends to be higher over most of its range; however, for this typology the local 2 curve is indeed below the local 1 curve in the lower-intensity range, up to the point where they cross each other at Sa(0.5 s) of 0.5–0.6 g.

Turning to the effect of including the local characteristics of the structures, important differences can be seen in both the PML and AALR metrics between the curves that considered these local characteristics and the generic global and regional curves that did not. This is particularly true for the low-rise structures, where the losses of the case considering local conditions were double or triple those obtained with the global or regional vulnerability models. As mentioned previously in the vulnerability section, this could be mostly because of the impact that the inter-storey height has on the yield and ultimate spectral displacements and the fact that the damage states are defined based on these parameters. Given the considerable difference encountered between the global, regional and local curves, there is a need to revise and calibrate, at a local level, the regression equations used for the bilinearization of the capacity curves in these procedures, in order to establish whether they can be used in local studies in different latitudes or only in countries with building practices similar to those of Turkey, where these height-based regressions were derived (Bal et al. 2008). There is also a need to validate these models against reported damage and losses from previous local events.

Given the previous observations, in particular the one from the comparison of vulnerability derivation methods, an additional analysis was conducted to check the sensitivity of the results to the analysed range of return periods of the PML. Given that the analyses consider PMLs for return periods of up to 1000 years, it was hypothesized that frequent, low-intensity events are probably the ones contributing the most to this range of the PML curve, so that the behaviour and characteristics of the left tail of the vulnerability curves could dictate the PML shape and trend.

To test this hypothesis, event-based risk analyses were conducted for Medellin independently for each of the cases (building typology and vulnerability curve) using the OpenQuake engine. The ground motion values (GMVs) from the events reporting losses with return periods equal to or below 1000 years were computed and reported for a central site in the city. These analyses showed that the largest recorded GMVs in Medellin reached values of 0.45 g for Sa(0.30 s), 0.24 g for Sa(0.60 s), and 0.16 g for Sa(1.00 s). In the vulnerability curves of Figs. 6, 7 and 8, these intensity values fall in the lower-left portion of the curve. Given that the event-based analysis will only take events reporting GMVs of up to 0.5 g at most, it is thought that most of the PML curve would be dictated by the values, shape, and characteristics of the lower-left tail of the vulnerability curve. To check this, the vulnerability curves were plotted once again, using 0.5 g, 0.3 g, and 0.2 g as IM thresholds for MUR_H2, CR_H4, and CR_H8 respectively (Fig. 14).
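The truncation step described above can be sketched as follows: a vulnerability curve, given as pairs of intensity measure levels and mean loss ratios, is cut at an IM threshold, with the end point obtained by linear interpolation. The curve values used here are made up for illustration and the function name is hypothetical; only the 0.3 g threshold for CR_H4 comes from the text.

```python
import numpy as np


def left_tail(imls, loss_ratios, im_threshold):
    """Return the part of a vulnerability curve with IM <= im_threshold."""
    imls = np.asarray(imls, dtype=float)
    loss_ratios = np.asarray(loss_ratios, dtype=float)
    keep = imls < im_threshold
    # Close the tail exactly at the threshold by linear interpolation.
    end_ratio = np.interp(im_threshold, imls, loss_ratios)
    return np.append(imls[keep], im_threshold), np.append(loss_ratios[keep], end_ratio)


# Made-up curves that cross each other (illustrative only).
imls = np.array([0.05, 0.1, 0.2, 0.4, 0.8, 1.6])
curve_a = np.array([0.002, 0.010, 0.040, 0.150, 0.450, 0.800])
curve_b = np.array([0.001, 0.004, 0.020, 0.140, 0.600, 0.950])

tail_im, tail_a = left_tail(imls, curve_a, 0.3)
_, tail_b = left_tail(imls, curve_b, 0.3)
# Curve B is higher at large intensities, but lower in the left tail.
print("A above B in the left tail:", bool(np.all(tail_a >= tail_b)))
print("A above B at the largest IM:", bool(curve_a[-1] > curve_b[-1]))
```

If only intensities below the threshold occur in the simulated events, the ranking of the two curves in the left tail, rather than over the full range, is what carries over to the losses.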

Fig. 14

Left tails of the vulnerability curves for MUR_H2, CR_H4, CR_H8

When comparing the shapes of the curves presented in Fig. 14 with those of Figs. 11 and 12, similar trends can be seen for all cases. For example, considering the analysis of the impact of record selection (Fig. 11), it is shown that for the MUR_H2 typology of the global case (top left graph) there is a negligible difference between the curves. If we compare this with the top left graph of Fig. 14, an almost identical trend is seen, with both curves diverging minimally from each other at the end. The trends can also be seen when comparing all the other graphs of Fig. 11 with the first two rows of Fig. 14, which show higher original curves for the reinforced concrete cases and lower ones for the unreinforced masonry cases.

This left-tail hypothesis could also explain the behaviour of the CR_H4 case in Fig. 12, regarding the comparison of the local 1 and local 2 analyses. If only the left tail of the CR_H4 vulnerability curves for local 1 and local 2 is considered (see bottom centre graph of Fig. 14), in this portion of the curve (IM levels below 0.3 g) the local 1 curve is indeed higher than the local 2 curve; thus the local 2 PML up to a return period of 1000 years, which considers only events with intensity levels below approximately 0.3 g, should follow the same trend.

This shows an important result: given the range of loss return periods of the PML analyses and the expected hazard for the site, in this case and in sites with similar characteristics, particular attention should be paid to the left tail of the vulnerability curves, since this portion seems to dictate the shape and trend of the PMLs in the return period range that is commonly used. This is important because this lower portion is in many cases overlooked or compromised in favour of a better fit of the main body of the fragility or vulnerability curve; in light of the previous results, the fitting methods should probably give a higher weight to this portion of the curve, considering its impact on the PML results.
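One way to give the left tail a higher weight, offered here only as a sketch under stated assumptions and not as the fitting procedure of Martins and Silva (2020) or Villar-Vega et al. (2017), is a weighted least-squares fit in which the points below an IM threshold receive a larger weight. The example assumes a lognormal-CDF shape for the mean loss ratio, uses a plain grid search over the median and dispersion, and relies on made-up data; all names and the weight value are hypothetical.

```python
import math
import numpy as np


def lognormal_cdf(im, median, beta):
    """Lognormal CDF used here as an assumed functional form of the curve."""
    z = np.log(np.asarray(im, dtype=float) / median) / (beta * math.sqrt(2.0))
    return 0.5 * (1.0 + np.vectorize(math.erf)(z))


def tail_weights(im, im_threshold, tail_weight):
    """Weight of tail_weight at or below the threshold and 1 elsewhere."""
    return np.where(np.asarray(im, dtype=float) <= im_threshold, tail_weight, 1.0)


def fit_grid(im, loss_ratio, weights, medians, betas):
    """Weighted least squares by exhaustive search over (median, beta)."""
    best = None
    for median in medians:
        for beta in betas:
            residual = lognormal_cdf(im, median, beta) - loss_ratio
            error = float(np.sum(weights * residual ** 2))
            if best is None or error < best[0]:
                best = (error, float(median), float(beta))
    return best[1], best[2]


def tail_error(im, loss_ratio, params, im_threshold):
    """Unweighted squared error restricted to the left tail."""
    mask = np.asarray(im, dtype=float) <= im_threshold
    residual = lognormal_cdf(im, *params) - loss_ratio
    return float(np.sum(residual[mask] ** 2))


# Made-up loss ratios that no lognormal CDF matches exactly (illustrative only).
im = np.array([0.05, 0.1, 0.2, 0.3, 0.5, 0.8, 1.2, 2.0])
loss_ratio = np.array([0.004, 0.015, 0.05, 0.09, 0.30, 0.60, 0.85, 0.97])
medians = np.linspace(0.3, 1.5, 61)
betas = np.linspace(0.2, 1.2, 51)

uniform = fit_grid(im, loss_ratio, np.ones_like(im), medians, betas)
weighted = fit_grid(im, loss_ratio, tail_weights(im, 0.3, 20.0), medians, betas)
print("uniform fit:", uniform, "tail error:", tail_error(im, loss_ratio, uniform, 0.3))
print("weighted fit:", weighted, "tail error:", tail_error(im, loss_ratio, weighted, 0.3))
```

With a tail weight above 1, the weighted fit cannot have a larger left-tail error than the uniform fit over the same parameter grid; the price is a poorer fit to the main body of the curve, so the weight is a modelling choice to be justified case by case.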

5 Conclusions

As stated in other studies regarding local seismic risk analysis, the derivation of vulnerability curves consistent with the local hazard and building practices seems crucial in the development of a local PSRA, as different assumptions do have a great impact on the loss results. In this study, significant differences were encountered between loss curves when considering different ground motion selection strategies for their derivation, the inclusion or not of local characteristics of the structures, and the use of different vulnerability curve derivation methods.

In order to quantify and propagate the epistemic uncertainties involved in choosing a fragility model, an approach similar to that used for the GMPMs could be explored for the fragility component, by assigning different weights to each possible fragility model (considering different record selection approaches, fragility derivation methodologies, and input capacity curves). This could help to fill the gap regarding the epistemic uncertainties in the fragility models, which have received little consideration until now in PSRA.
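The weighting idea can be sketched in the same way as a ground-motion logic tree: each candidate fragility or vulnerability model forms a branch with a weight, and the branch results are combined into a weighted mean and a weighted spread. The branch names echo the cases of this study, but the weights and AALR values below are purely hypothetical.

```python
import math


def combine_branches(branch_values, branch_weights):
    """Weighted mean and standard deviation of a risk metric across branches."""
    total = sum(branch_weights.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("branch weights must sum to 1")
    mean = sum(branch_weights[name] * branch_values[name] for name in branch_weights)
    variance = sum(
        branch_weights[name] * (branch_values[name] - mean) ** 2
        for name in branch_weights
    )
    return mean, math.sqrt(variance)


# Hypothetical weights and AALR values (not results from the study).
weights = {"global": 0.2, "regional": 0.3, "local": 0.5}
aalr = {"global": 0.001, "regional": 0.002, "local": 0.004}
mean_aalr, std_aalr = combine_branches(aalr, weights)
print(f"weighted mean AALR = {mean_aalr:.4f}, epistemic std = {std_aalr:.4f}")
```

The weighted standard deviation gives a first measure of how much of the spread in the risk metric is attributable to the choice of fragility model alone.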

As expected, the major differences in the losses were obtained when including the local structural characteristics: the locally derived curves in some cases doubled or tripled the AALR and PMLs of their global or regional counterparts. As stated before, this points to a need to calibrate the formulas used to derive the capacity curves for the local case before using them, as it was shown that the inclusion of an additional parameter such as the inter-storey height may lead to significantly different capacity, fragility, vulnerability, and risk results.

Considering the use of different ground motion selection strategies, the CS-based record selection brings into the risk assessment the advantage of considering a site-dependent hazard level, which is in most cases missing from the calculation of fragility curves, especially at a local scale, where this factor could be over-estimated when using other record selection procedures. However, special attention must be paid when the seismic hazard model contains a logic tree of several GMPEs that may have been derived for other regions, leading to a bias in the estimation; in other words, all the detail gained through a rigorous record selection could still rest on an inaccurate hazard level.

After analysing all the results, a major finding emerged regarding the left tails of the vulnerability curves, which are often disregarded in the vulnerability or fragility fitting procedures. According to the results, these lower tails condition the PML shapes and trends. This brings to light the importance of being aware of the left tails of the vulnerability curves in this type of analysis, where losses with return periods of up to 1000 years are considered. It is therefore recommended that the fitting methods give a higher weight to this portion of the vulnerability curve, considering its impact on the PML results.

Finally, it is known that the process of deriving local vulnerability curves may be, in many cases, extremely difficult within a risk study, because of the lack of data, time, resources, or expertise. For this reason, given the results presented above, the main recommendation is to be aware of the constraints in the use of global or regional open-source vulnerabilities and of the derivation procedures from global and regional studies. The assumptions need to be discussed and shared with the end-user, and the uncertainties need to be treated and propagated throughout the analysis in a detailed and conscious manner. In addition, if there are resources to invest in improving the risk model, particular attention should be paid to (1) the calibration of equations to account for local structural characteristics of the site of study, (2) the process of selecting hazard-consistent records, and (3) the validation of the damage and losses against reported values from local events. In this sense, it is important to at least validate or calibrate the curves if they are going to be obtained from global or regional studies and adjusted to the local conditions.