1 Introduction

In order to reduce risk, several seismic-prone cities around the world (Tucker et al. 2013), such as Quito in Ecuador (Chatelain et al. 1999), Kathmandu in Nepal (KVERMP 1998), Istanbul in Turkey (Erdik et al. 2003), Catania in Italy (Faccioli et al. 1999) and Nice in France (Bard et al. 2005), have been analyzed in terms of seismic risk, with the objectives of educating the public, producing seismic scenarios to simulate losses and operational problems, and implementing an action plan to manage seismic risk. Benson and Twigg (2004) claim that the $40 million invested in preventive measures worldwide in the 1990s reduced economic losses by $280 million. They also support the view of emergency specialists, who are increasingly insistent on the need to invest in preparation, prevention and disaster attenuation measures, such as those supported by seismic scenarios, to reduce losses. Moreover, seismic risk scenarios are useful for studying the best investment framework for the seismic retrofitting of buildings, an investment that can be attractive against the effects of events with long return periods (Smyth et al. 2004). In fact, predicted damage in existing structures is the key parameter for anticipating direct and indirect seismic losses and fatalities. Consequently, before simulating and testing losses, efforts must be concentrated on the seismic vulnerability assessment of existing structures at urban or regional scales, such as in Algeria, where the first seismic code, for public buildings, dates back to 1981. Assuming a low rate of renewal, the majority of Algeria’s building stock was therefore built without seismic design and can be considered vulnerable to regional earthquakes, as demonstrated by the 2003 Boumerdes earthquake, in which many reinforced concrete buildings were extensively damaged (Meslem et al. 2012).

Assessing vulnerability at the urban scale is a complex task due to the number of buildings concerned, their heterogeneity in terms of structural design and seismic response, and the lack of basic information concerning their design (Guéguen 2013). Macro-scale methods have been developed, based on data collected in the field during post-seismic periods. They propose damage functions calibrated on observed damage and building typology, such as those proposed by the Federal Emergency Management Agency (FEMA) in the USA (Hazus 1997), the Gruppo Nazionale per la Difesa dai Terremoti in Italy (Benedetti and Petrini 1984; GNDT 1993) or the Risk-UE method developed within the framework of the eponymous European project (Spence and Brun 2006; Lestuzzi et al. 2016). Barbat et al. (2010) published a review of the seismic vulnerability assessment methods for urban application. Whatever the method, the main difficulty remains the building inventory, which involves costly visual screening throughout the city. This is even more complex in moderate seismic-prone regions, where resources for seismic evaluation are limited and often difficult to mobilize (Guéguen et al. 2007), even though the seismic hazard is not negligible.

Recently, simplified macro-scale methods have been adapted to reduce the inherent cost of in situ building surveys at the urban scale. Such initiatives consist in simplifying the visual screening stage by considering only the key parameters whose contribution to seismic vulnerability is significant (Guéguen et al. 2007) or by using remote sensing methods (Mueller et al. 2006; Geiß et al. 2014; Riedel et al. 2015). Datamining-based methods have also been developed to derive the best proxy linking building features that are easily assessed by remote sensing or from preexisting databanks (e.g., national census) with the seismic vulnerability of buildings (Riedel et al. 2014, 2015).

In this study, we propose to validate this datamining-based approach in a seismic-prone city, Constantine in Algeria, for which an extensive vulnerability assessment using the Risk-UE method is available. After presenting the seismic context and urbanization of Constantine, the method is developed and applied to the city. In the last section, the results are compared with those of the Risk-UE method, before concluding on the efficiency of this approach.

2 The city of Constantine

The city of Constantine (Fig. 1a) is located in the northern part of Algeria. The tectonic context of the Constantine region results from the convergence of the Eurasian and African plates. Descriptions of this tectonic context can be found in the abundant literature (e.g., McKenzie 1972; Aoudia and Meghraoui 1995; Mickus and Jallouli 1999; Yelles-Chaouche et al. 2006; Hamdache et al. 2012). As a result of this convergence, the region is one of the most seismically active of the Mediterranean (Buforn et al. 1995; Kherroubi et al. 2009). This activity mainly affects the northern part of Algeria, home to the country’s most important and most densely populated cities, which concentrate housing, infrastructure, and economic and industrial activities.

Fig. 1

a General situation of the city of Constantine in Algeria. b Location of the main fault and historical earthquakes mentioned in the text (after Bounif et al. 1987). c Seismic macro-zoning map of Algeria (after RPA03 2003). d Location of the historic downtown area concerned by this study and aerial view of HDT with subzones Z1, Z2 and Z3 described in this study

The city is exposed to a complex and moderate to strong seismic hazard (Peláez et al. 2006; Baba Hamed et al. 2013). The Ain Smara Fault, situated southeast of Constantine (Fig. 1b), is a major active fault and a primary source of earthquakes affecting the Eastern Tellian Atlas (northeast Algeria), as confirmed by surface ruptures observed after the 1985 earthquake (Bounif et al. 1987). This fault can be considered an important potential source of seismic activity according to the probabilistic seismic hazard study carried out for Algeria (Peláez et al. 2006; Fig. 1c). It generated the strongest earthquakes, in 1908, 1947 and 1985. The last of these (Ms = 5.9, October 27, 1985) is the strongest event recorded since the implementation of instrumental seismology (Bounif et al. 1987; Ousadou et al. 2013) and caused significant damage in Constantine, with macro-seismic intensities ranging from IX to X.

Constantine has about 1 million inhabitants according to the last census by the National Office of Statistics in 2008 (Office national des statistiques, ONS 2008). It is a very active city in social, economic and industrial terms, considered to be the third most important city in Algeria. It is also famous for its cultural heritage buildings, including constructions of urban and architectural value. The city of Constantine has been analyzed in the past using the Hazus method (Boukri et al. 2014), which concluded that the city is highly vulnerable. For our study, the inventory phase covered only the historic downtown area (HDT, Fig. 1d), for the collection of building details required for the seismic vulnerability assessment using the Risk-UE method (Milutinovic and Trendafiloski 2003). According to official statistics published by the ONS (2008), this area is characterized by a dense, old stock of residential buildings and is highly populated, with 448,374 inhabitants, corresponding to 2089 inhabitants/km2.

In total, 2252 buildings were surveyed by visual screening, collecting data on the state of conservation and on all structural and non-structural characteristics. A specific technical form was established (Appendix 1) for building-by-building inspection, after consultation with the directorate in charge of urban planning and construction in Constantine (Direction de l’Urbanisme et de la Construction Constantine, DUC) and the Technical Control of Construction (CTC) in Constantine, and with the help of several technical reports and documents available on building construction in Constantine. Moreover, a specific datasheet was developed to compute the vulnerability index and the associated damage (Appendix 1). The building inventory is certainly the most time-consuming step of the seismic vulnerability assessment process. It took 3 years to collect detailed information on the design and construction types of Constantine’s buildings according to Risk-UE attributes, i.e., type of material (masonry, reinforced concrete, steel), year of construction classified into six classes, number of floors, plan and elevation irregularities, state of maintenance, aggregation conditions distinguishing stand-alone, middle, corner or header buildings (i.e., adjacent to another building on one side or on two or more sides), and information on soil morphology (cliff or slope). Vulnerability indexes by typology and modifiers used for Risk-UE are given in Tables 1 and 2, respectively. At the same time, each building was classified according to the European Macro-seismic Scale typology (EMS98, Grünthal and Levret 2001).

Table 1 Representative values of vulnerability indexes for each class of the building typology used in Risk-UE (after Milutinovic and Trendafiloski 2003)
Table 2 Scores for the vulnerability modifiers used for masonry and reinforced concrete buildings, adapted from Risk-UE (after Milutinovic and Trendafiloski 2003) to the city of Constantine

This information was fed into the Constantine Building Databank (CBD). Moreover, a detailed but partial field survey carried out in 2009 by the national Algerian organization for technical and building inspection confirmed the CBD information, by cross-checking modifiers and vulnerability assessment for several buildings. Finally, the HDT was divided into three zones (Z1, Z2 and Z3) according to the historic urbanization of the area (Fig. 1d). In Z1, about 63% of the total building stock was surveyed (1185/1887 buildings), in Z2 about 83% (1016/1221) and in Z3 about 41% (51/123). The distribution of surveyed buildings according to building material reflects the evolution of construction methods and urbanization trends: unreinforced masonry buildings represent 94% of all buildings in Z1, 64% in Z2 and 0% in Z3.

A general description of the CBD is shown in Fig. 2. In this figure, buildings are grouped according to four attributes: (1) period of construction, according to the historic urbanization of the city and evolutions in Algerian design code; (2) number of stories, defined according to the interval given in the Risk-UE method, i.e., low-rise (1–2), mid-rise (3–5) and high-rise (>6) buildings; (3) type of material found in the studied area, i.e., reinforced concrete (RC), unreinforced masonry and steel; and (4) roof shape (slope or flat). These four attributes were selected for their ease of observation, with no attribution error, on the urban scale. In total, 78% of the buildings surveyed were unreinforced masonry, 21% reinforced concrete and only 1% steel. The CBD contains 2252 buildings:

Fig. 2

Description of the number of buildings surveyed in this study according to four attributes: period of construction, number of floors, material type and roof shape

  • 448 built before 1837, composed mainly of unreinforced masonry buildings in adobe and rubble stone;

  • 744 from 1837 to 1920, corresponding to unreinforced masonry and RC buildings;

  • 819 between 1921 and 1962, corresponding to the generalization of reinforced concrete in buildings but without application of any seismic code measures;

  • 119 from 1963 to 1981;

  • 38 between 1982 and 2003;

  • 84 after 2003.

Most of the constructions of the last three periods are RC buildings. The first two periods correspond to the Ottoman and French epochs, representing approximately 50% of the buildings surveyed herein. Almost all buildings (94%) were erected before publication of the first Algerian seismic code in 1981 (RPA81 1981) and are therefore considered as non-engineered buildings; 2% were designed between 1981 and publication of the latest version of the Algerian seismic code in 2003 (RPA03 2003) and are considered as buildings with moderate seismic design according to Risk-UE (moderate code); and only 4% were built after 2003 and are considered as high seismic design (high-code) buildings. Moreover, the databank contains about 120 high-rise buildings, which represent only 5% of the building stock, with 845 (38%) and 1287 (57%) buildings classified as low-rise and mid-rise, respectively.

Table 3 shows the distribution of buildings classified according to the Risk-UE typology (Milutinovic and Trendafiloski 2003). Solid stone, class M1.3 (called Ashlar), and adobe, class M2, are the most representative typologies in the building stock surveyed. Compared with the EMS98 typology, the vulnerability classes range from A (most vulnerable) to F (least vulnerable), with 99% of unreinforced masonry buildings in classes A and B, 75% of RC buildings in class C and the remaining 25% being divided among classes D and E. The vulnerability reflects the low seismic design of the existing RC and unreinforced masonry buildings, built before the issue of the first Algerian seismic code.

Table 3 Distribution of surveyed buildings by Risk-UE typology

3 Methods

The flowchart of the methods followed in this study is shown in Fig. 3, each branch being explained in this section.

Fig. 3

Flowchart of the methods followed in this study

3.1 Brief description of Risk-UE level 1

In the first step, the seismic vulnerability of zones 1–3 was assessed using the Risk-UE method, considered as the reference assessment method in this study. The Risk-UE vulnerability assessment method was developed for the eponymous European project considering seven major European cities (Spence and Brun 2006). First, the method consists in classifying each building into a typology defined by the materials and/or structural systems. Basic vulnerability indexes are attributed to each typology class (Milutinovic and Trendafiloski 2003), corresponding to the median value IV* and to the lower IV− and upper IV+ bounds of the possible values of the vulnerability index. Modifier factors are then applied to IV* to take into account height, irregularities, position, etc. The final vulnerability index, independent of the hazard, is the sum of IV* and the weighted values of the modifier factors as follows:

$${\text{IV}} = {\text{IV}}^{*} + \Delta V_{\text{M}} + \Delta V_{\text{R}}$$
(1)

where ∆V M represents the seismic behavior modifiers and ∆V R is a regional vulnerability factor (considered equal to 0). Once vulnerability has been assessed, the average damage grade µ D is given by the following equation:

$$\mu_{\text{D}} = 2.5\left[ {1 + \tanh \left( {\frac{{I + 6.25{\text{IV}} - 13.1}}{2.3}} \right)} \right]$$
(2)

where I is the seismic hazard described in terms of EMS98 macro-seismic intensity. µ D varies from 0 (no damage) to 5 (severe damage or destruction), following the six-grade damage scale D k given in EMS98, which ranges from no damage (D0) to complete destruction (D5) (k ∈ [0, 5]). The binomial distribution proposed in Lagomarsino and Giovinazzi (2006) and adjusted to post-earthquake observation is then used to give damage as the probability P(D k) of reaching each damage grade D k (k ∈ [0, 5]) for a given µ D as follows:

$$P\left( {D_{\text{k}} } \right) = \frac{5!}{{k!\left( {5 - k} \right)!}}\left( {\frac{{\mu_{\text{D}} }}{5}} \right)^{k} \left( {1 - \frac{{\mu_{\text{D}} }}{5}} \right)^{5 - k} \quad \left( {!{:}{\text{ factorial operator}}} \right).$$
(3)

The vulnerability curves are then plotted after computing the probability distribution P(D k) for each vulnerability class. For each vulnerability class, they represent the probability of exceeding damage grade D k at a given intensity (Fig. 4).
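As an illustration, Eqs. 2 and 3 can be evaluated with a few lines of Python. This is only a minimal sketch: the function names and the numerical inputs (intensity VIII and a vulnerability index of 0.78) are arbitrary choices made for the example and are not taken from the study, and the exceedance probability is assumed here to mean the probability of reaching or exceeding grade D k.

```python
import math


def mean_damage_grade(intensity, iv):
    """Eq. 2: mean damage grade for an EMS98 intensity and a vulnerability index."""
    return 2.5 * (1.0 + math.tanh((intensity + 6.25 * iv - 13.1) / 2.3))


def damage_distribution(mu_d):
    """Eq. 3: binomial probabilities P(D_k) for the six grades k = 0..5."""
    p = mu_d / 5.0
    return [math.comb(5, k) * p**k * (1.0 - p) ** (5 - k) for k in range(6)]


def exceedance(probs):
    """Probability of reaching or exceeding each grade (assumed definition)."""
    return [sum(probs[k:]) for k in range(6)]


# Hypothetical inputs, for illustration only.
example_intensity = 8      # EMS98 intensity VIII
example_iv = 0.78          # arbitrary vulnerability index (Eq. 1 result)

mu_d = mean_damage_grade(example_intensity, example_iv)
probs = damage_distribution(mu_d)
for k, (p_k, p_exc) in enumerate(zip(probs, exceedance(probs))):
    print(f"D{k}: P = {p_k:.3f}, P(>= D{k}) = {p_exc:.3f}")
```

Because the distribution is binomial with parameter µ D/5, its mean equals µ D by construction, which provides a quick consistency check on any implementation.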

Fig. 4

Probability of exceeding damage P(D k) for each class of vulnerability computed for macro-seismic intensities V–XII (according to Eq. 3 and Lagomarsino and Giovinazzi 2006)

3.2 Association rule learning classification

The building-by-building inventory of structural attributes is certainly the most costly step in any process to estimate seismic vulnerability at an urban scale. Riedel et al. (2014) thus propose a datamining-based method for developing a seismic vulnerability proxy using basic attributes. There is no doubt that additional information such as material or resistance of structural elements contributes to the vulnerability of existing buildings. However, datamining consists in discovering patterns and trends that go beyond simple analysis, and finding “hidden” correlations among different attributes and targets in large databanks. In the field of seismic vulnerability assessment, it consists in establishing correlations using mathematical algorithms (if/then statements) between basic attributes that are easily available (e.g., number of stories or roof shape) and the vulnerability classes.

In this study, we applied a popular method called association rule learning (ARL; Agrawal et al. 1993), which is relevant to seismic vulnerability applications (Riedel et al. 2014, 2015). Applied to the CBD, conditional probabilities between basic structural information (attribute X) and EMS-98 vulnerability classes (target Y = A, B, C, D, E) are derived to give the Constantine vulnerability proxy (CVP). Association rules take the form X → Y i , where X (antecedent) and Y i (consequent) are two disjoint sets of items. Each relationship between X and Y i can be represented in binary format [0,1]: knowing the building attributes X, the probability of belonging to class Y i is expressed by

$$P\left( {Y_{i} |X} \right) = \frac{{P\left( {Y_{i} \cap X} \right)}}{P\left( X \right)}$$
(4)

or in practice, P(Y i |X) can be calculated as:

$$P\left( {Y_{i} |X} \right) = \frac{{N_{\text{XY}} }}{{N_{\text{X}} }}$$
(5)

where N X is the total number of buildings with attribute X and N XY the number of buildings with attribute X that belong to class Y i . One limitation of the ARL method is that, by searching massive numbers of possible associations, there is a significant risk that the results will include inconsistencies due to false associations. We therefore applied two phases to the CBD in order to test the efficiency of the CVP.
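The counting behind Eq. 5 can be sketched as follows. The function name and the five toy records are invented for the example and do not come from the CBD; each record is assumed to pair a tuple of attributes with an EMS98 class.

```python
from collections import Counter, defaultdict


def learn_proxy(buildings):
    """Return {attributes: {class: P(class | attributes)}} following Eq. 5.

    `buildings` is an iterable of (attributes, vulnerability_class) pairs,
    where `attributes` is a hashable tuple such as (period, height).
    """
    n_x = Counter()                # buildings per attribute combination
    n_xy = defaultdict(Counter)    # buildings per combination and class
    for attrs, cls in buildings:
        n_x[attrs] += 1
        n_xy[attrs][cls] += 1
    return {
        attrs: {cls: n / n_x[attrs] for cls, n in counts.items()}
        for attrs, counts in n_xy.items()
    }


# Made-up toy records, for illustration only.
sample = [
    (("<1837", "low-rise"), "A"),
    (("<1837", "low-rise"), "A"),
    (("<1837", "low-rise"), "A"),
    (("<1837", "low-rise"), "B"),
    (("1921-1962", "mid-rise"), "C"),
]
proxy = learn_proxy(sample)
print(proxy[("<1837", "low-rise")])
```

With these toy records, three of the four buildings sharing the first attribute combination are in class A, so the conditional probability of class A for that combination is 3/4.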

3.2.1 First phase: learning

Once the databank was ready, a learning phase was applied to a subset of the data. The size of the learning subset was chosen according to the results obtained by Riedel et al. (2015), who found that the quality of the estimate reaches an asymptote once the learning subset represents 30% of the total data; we assumed the same subset size for this study. We therefore randomly selected 2500 sets of data, each representing 30% of the whole databank. Because classification accuracy is sensitive to the dataset, many combinations were then tested considering one or several of the basic attributes (i.e., period of construction, number of stories, material and roof shape). Classification accuracy is shown in Fig. 5 as the percentage of correctly classified classes of vulnerability for a given combination of attributes.
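The repeated random sub-sampling described above can be sketched as below. Several points are assumptions made for the example rather than details given in the text: each building is assigned the most probable class learned for its attribute combination, accuracy is measured on the buildings left out of the learning subset, and all names (`fit`, `accuracy`, `repeated_holdout`) are hypothetical.

```python
import random
import statistics
from collections import Counter, defaultdict


def fit(train):
    """Learn the most probable class for each attribute combination."""
    counts = defaultdict(Counter)
    for attrs, cls in train:
        counts[attrs][cls] += 1
    return {attrs: c.most_common(1)[0][0] for attrs, c in counts.items()}


def accuracy(rule, test):
    """Share of test buildings whose predicted class matches the surveyed one."""
    hits = sum(1 for attrs, cls in test if rule.get(attrs) == cls)
    return hits / len(test)


def repeated_holdout(data, fraction=0.3, repeats=2500, seed=0):
    """Mean and standard deviation of accuracy over random learning subsets."""
    rng = random.Random(seed)
    n_train = round(fraction * len(data))
    scores = []
    for _ in range(repeats):
        shuffled = rng.sample(data, len(data))   # random permutation
        train, test = shuffled[:n_train], shuffled[n_train:]
        scores.append(accuracy(fit(train), test))
    return statistics.mean(scores), statistics.pstdev(scores)
```

Calling `repeated_holdout(records)` on a list of `(attributes, class)` pairs returns the mean accuracy and its spread, which correspond in spirit to the distributions summarized in Fig. 5.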

Fig. 5

Accuracy of the vulnerability classification considering 2500 random subsets of 30% of the buildings in the Constantine Building Databank for several attribute combinations: a construction period and number of stories; b construction period, number of stories and material; c construction period, number of stories and roof shape; d construction period, number of stories, material, and roof shape

By applying ARL, we observed that the classification (i.e., assessment of vulnerability classes) is quite relevant even when only basic information is considered. Firstly, for all cases, the variability of the accuracy distribution is less than 1% (σ ≤ 0.61%), showing that the classification is largely independent of the selected subset. Secondly, mean accuracy is over 73% for two attributes alone (period of construction and number of stories), reaching about 83% for combinations including material. This attribute is considered to be a key parameter, having significant weight in the vulnerability assessment. However, as reported by Riedel et al. (2015), period of construction and material are two dependent attributes (e.g., 98% of buildings built before 1920 were unreinforced masonry constructions, and after 1962 all were RC), implying that the benefit in terms of accuracy of considering material is rather limited. Material is certainly the most difficult attribute to assess accurately, and the balance between the cost and accuracy of the inventory by visual screening and the quality of the classification must be evaluated before starting a survey. We also observed that adding roof shape improves the classification only slightly, suggesting that this attribute is correlated with the others.

3.2.2 Second phase: validation on the rest of the Constantine Building Databank

After completion of the learning phase, the vulnerability proxy (CVP) was defined for all combinations and vulnerability classes. Table 4 shows the conditional values (i.e., CVP) of classification in the EMS98 vulnerability classes, knowing the two most basic building attributes (i.e., period of construction and number of stories). For example, a randomly selected building in Constantine known to have been built before 1837 and with less than 2 floors has a probability of 82.7% of being in Class A and 17.3% of being in Class B.

Table 4 Conditional probability for each EMS-98 vulnerability class according to building attribute (period of construction, number of stories) obtained using the ARL method

This proxy was then applied to the rest of the CBD, i.e., 70% of the databank, considering several attribute combinations and vulnerability classes. The buildings were then grouped into subsets or, in our case, into geographical units. The vulnerability within each unit was then expressed as the probability of being in class j, as follows:

$$P_{j} \left( Y \right) = \mathop \sum \limits_{1}^{N} \frac{{N_{ji} P\left( {Y|X_{i} } \right)}}{N}$$
(6)

with P j (Y) the probability of one building being in class Y i  = {A, B, C, D, E}, N ji being the number of buildings with attributes X i in class Y j , P(Y|X i ) the proxy value in Table 4 and N the number of buildings. Table 5 shows an example of a confusion matrix that compares the predicted vulnerability with the ground truth for three attributes (period of construction, number of stories, roof shape) and five classes. The values on the diagonal are the buildings that were assigned correctly.
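One possible reading of Eq. 6 is a weighted average of the proxy values over the buildings of a geographical unit, with N ji taken as the number of buildings in the unit that share attribute combination X i and N as the number of buildings in the unit. The sketch below follows that reading, which is an interpretation and not a statement of the authors' implementation. Only the 82.7%/17.3% pair is quoted from the text; the second attribute combination and the building counts are hypothetical.

```python
def unit_class_probabilities(attr_counts, proxy, classes="ABCDE"):
    """Aggregate proxy values over one geographical unit (reading of Eq. 6).

    attr_counts: {attributes: number of buildings in the unit}
    proxy:       {attributes: {class: P(class | attributes)}}
    """
    n = sum(attr_counts.values())
    return {
        c: sum(cnt * proxy[a].get(c, 0.0) for a, cnt in attr_counts.items()) / n
        for c in classes
    }


example_proxy = {
    ("<1837", "low-rise"): {"A": 0.827, "B": 0.173},   # values quoted in the text
    ("hypothetical", "combination"): {"C": 1.0},       # placeholder, not from Table 4
}
example_unit = {("<1837", "low-rise"): 10, ("hypothetical", "combination"): 10}

print(unit_class_probabilities(example_unit, example_proxy))
```

For this hypothetical unit, half of the buildings carry the 82.7%/17.3% split and half are assigned entirely to class C, so the unit-level probabilities sum to 1 across the five classes.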

Table 5 Example of a confusion matrix obtained with the ARL method to classify the buildings in Constantine into EMS98 seismic vulnerability classes based on three attributes

In this example, overall accuracy of building assignment is 74.24%. Accuracy was lower for classes A and B: 261 class A buildings were correctly classified but 191 and 6 buildings were incorrectly classified in B and C, respectively; 706 class B buildings were correctly classified but 49 and 10 were misclassified in A and C, respectively. As observed by Riedel et al. (2014), distinguishing between classes A and B is difficult because of the equivalent vulnerability associated with the different types of buildings, according to EMS98. Most of the misclassified class A buildings are actually in B, and we can improve classification accuracy by merging these two classes. For classes C, D and E, the accuracy of the classification is better, with 86, 85 and 95% of well-classified buildings, respectively. Figure 6 shows the overall accuracy of the classification considering four classes, i.e., after merging A and B. Considering three attributes (period of construction, number of stories, roof shape), accuracy increases to 89%, and to 99% if the material attribute is also included, with very slight variability (0.13%).
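The effect of merging classes A and B can be checked directly from the counts quoted above. The sketch below uses only those counts and assumes that no class A or class B building was assigned to D or E, which the text does not state; the helper name is hypothetical.

```python
# Rows of the confusion matrix for surveyed classes A and B,
# restricted to the counts quoted in the text (predicted class -> count).
true_a = {"A": 261, "B": 191, "C": 6}
true_b = {"A": 49, "B": 706, "C": 10}


def share_correct(row, accepted):
    """Share of a row's buildings whose predicted class is in `accepted`."""
    return sum(row[c] for c in accepted) / sum(row.values())


a_alone = share_correct(true_a, {"A"})          # class A kept separate
b_alone = share_correct(true_b, {"B"})          # class B kept separate

# After merging, any A or B prediction counts as correct for both rows.
merged_hits = sum(true_a[c] + true_b[c] for c in ("A", "B"))
merged_total = sum(true_a.values()) + sum(true_b.values())
merged = merged_hits / merged_total

print(f"A alone: {a_alone:.3f}, B alone: {b_alone:.3f}, A+B merged: {merged:.3f}")
```

Under that assumption, only the 6 and 10 buildings assigned to class C remain misclassified once A and B are merged, which is why the merged class is recovered far more reliably than class A alone.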

Fig. 6

Accuracy of the vulnerability classification considering 2500 random subsets of 30% of the buildings in the Constantine Building Databank, merging classes A and B, and for two attribute combinations: a construction period, number of stories and roof shape; b construction period, number of stories, material, and roof shape

In the rest of the paper, we used the CVP given in Table 4, which requires only two attributes. The purpose is to extend the proxy to other Algerian cities, based on only these two parameters. The need for only two attributes eliminates one of the main difficulties related to any vulnerability study, while remaining relevant at the urban scale. Application of the proxy to other Algerian cities means assuming that the characteristics of the Constantine urbanization are similar in all cities throughout the region: This can be a weak assumption at the first order as shown by Riedel et al. (2015). Finally, we compared the results of the vulnerability assessment by ARL with the mean values of vulnerability given by Risk-UE. Having attributed the vulnerability classes, damage was computed following two strategies.

The first strategy, named ARL0 herein, is based on the EMS98 damage scale. For a given intensity, the number of buildings expected to suffer damage is defined by the terms “Few,” “Many” and “Most,” translated into numerical values 5, 35 and 80%, respectively (Lagomarsino and Giovinazzi 2006; Bernardini et al. 2010) and finally represented in a continuous manner by Riedel et al. (2015) to give the damage probability matrix given in Table 6 for vulnerability class A.

Table 6 Damage probability matrix for vulnerability class A interpreted according to EMS98 (after Riedel et al. 2015)

The damage probability for a given intensity P EMS98(D k) is then computed by the equation:

$$P_{EMS98} \left( {D_{\text{k}} } \right) = \frac{1}{N}\mathop \sum \limits_{i = A}^{i = E} N_{i} P(D_{\text{k}} |i,I_{EMS98} )$$
(7)

where N is the total number of buildings, N i the number of buildings with vulnerability class i (i = A, B, C, D, E) and P(D k|i,I EMS98) the probability of damage grade D k for a given vulnerability class i and intensity I EMS98 provided in Table 6.
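Equation 7 is a class-weighted average of the damage probability matrix, which can be sketched as follows. The function name is hypothetical, and the two-class matrix and building counts are placeholders chosen to make the arithmetic easy to follow; they are not the values of Table 6.

```python
def damage_probability(class_counts, dpm):
    """Eq. 7: P_EMS98(D_k) for k = 0..5 at one intensity.

    class_counts: {vulnerability class: N_i}
    dpm:          {vulnerability class: [P(D_0 | i, I), ..., P(D_5 | i, I)]}
    """
    n = sum(class_counts.values())
    return [
        sum(class_counts[c] * dpm[c][k] for c in class_counts) / n
        for k in range(6)
    ]


# Placeholder values, for illustration only (not from Table 6).
example_counts = {"A": 1, "B": 3}
example_dpm = {
    "A": [0.0, 0.0, 0.5, 0.5, 0.0, 0.0],
    "B": [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
}

print(damage_probability(example_counts, example_dpm))
```

As long as each row of the matrix sums to 1, the aggregated distribution also sums to 1, whatever the class counts.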

The second strategy, called ARL1, uses the Risk-UE damage probability model (Eq. 3), considering the relationship between the EMS98 vulnerability class and the Risk-UE vulnerability index (Lagomarsino and Giovinazzi 2006; Bernardini et al. 2010), as shown in Fig. 4.

4 Results

4.1 The vulnerability map of Constantine

Figure 7 shows vulnerability in zones 1, 2 and 3 computed by Risk-UE and based on the EMS98 classification. Almost 80% of the buildings have a Risk-UE vulnerability index over 0.70, which is equivalent to vulnerability classes A and B. In comparison, 78% of the buildings are classified as A and B according to the ARL method, which confirms the efficiency of the datamining-based method for assessing vulnerability at the urban scale with basic building attributes compared to a more sophisticated method (Risk-UE). We observe that zone 1, corresponding to the historic center, is the most vulnerable, with classes A and B. This area was built before 1837 and partially between 1837 and 1920, with 94% of unreinforced masonry constructions. The rest were built during the French period, when most buildings were designed without earthquake engineering. In zone 2, built between 1920 and 1962, the ARL proxy classified the majority of buildings in the northern part of the zone in A and B and indicated a mix of all classes for the southern area. This is the most heterogeneous area, which contains buildings of all construction periods. The modern part of the city corresponds to zone 3, urbanized mainly between 1962 and 1981, composed only of RC buildings (frame and shear walls) and characterized by the least vulnerable classes (C and D). Based on the correlation between the Risk-UE indexes and the EMS98 classes, the distributions of the most vulnerable buildings are roughly similar, with a slight difference in geographic distribution. A building-by-building comparison could be conducted, but in order to be consistent with the statistical approach required for vulnerability assessment at the global scale, the comparison will be based on the probability of exceeding damage per geographical unit.

Fig. 7

Vulnerability distribution in zones 1, 2 and 3 of Constantine, represented by the EMS98 classes assigned using the ARL method (upper row) and using the Risk-UE indexes (lower row)

4.2 Damage assessment

The probability P(D k) of exceeding damage level D k was computed according to three different approaches: the Risk-UE model (Eq. 3) and the ARL0 and ARL1 methods, considering a spatially uniform seismic intensity. For this comparison, only intensity is considered, without accounting for possible additional effects such as site effects, soil nonlinearity or triggered effects (e.g., landslides) that might modify the seismic hazard. In order to check the relevance of the ARL methods in terms of damage prediction, we computed the total absolute error ε of predicted damage P(D k) considering all damage grades k and different macro-seismic intensity scenarios (Fig. 8): ε 1 between Risk-UE and ARL0, ε 2 between ARL1 and Risk-UE and ε 3 between ARL0 and ARL1. Overall, we found ε 2 < ε 1 < ε 3. The differences between Risk-UE and ARL1 (ε 2) are mainly due to the seismic vulnerability assessment methods; the differences between ARL0 and ARL1 (ε 3) are due to the model for computing damage with the same vulnerability; and the differences between Risk-UE and ARL0 (ε 1) are a combination of the two.
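The three error terms can be sketched as pairwise comparisons of the damage probabilities produced by each method. The text does not specify whether the absolute differences are summed or averaged over damage grades; the sketch below assumes a plain sum, and the function names and two-grade toy vectors are hypothetical.

```python
def total_absolute_error(p_a, p_b):
    """Sum of absolute differences between two damage probability vectors."""
    if len(p_a) != len(p_b):
        raise ValueError("both vectors must cover the same damage grades")
    return sum(abs(a - b) for a, b in zip(p_a, p_b))


def pairwise_errors(risk_ue, arl0, arl1):
    """Errors between methods, named as in the text."""
    return {
        "eps1": total_absolute_error(risk_ue, arl0),   # Risk-UE vs ARL0
        "eps2": total_absolute_error(arl1, risk_ue),   # ARL1 vs Risk-UE
        "eps3": total_absolute_error(arl0, arl1),      # ARL0 vs ARL1
    }


# Toy two-grade vectors, for illustration only.
print(pairwise_errors([0.5, 0.5], [0.25, 0.75], [0.5, 0.5]))
```

In this toy case ARL1 reproduces the Risk-UE vector exactly, so ε 2 is zero while ε 1 and ε 3 are equal.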

Fig. 8

Probability of exceeding damage P(D k) a for intensities between V and XII computed with the ARL0, ARL1 and Risk-UE methods and b for intensity VIII and vulnerability classes A–E. ε 1 is the absolute error between Risk-UE and ARL0, ε 2 between ARL1 and Risk-UE and ε 3 between ARL0 and ARL1

For all intensities (Fig. 8a), the absolute error is very small. The largest values are found at the extreme intensities V, VI, XI and XII, equal to 0.10, 0.07, 0.06 and 0.06 for ε 1 and to 0.06, 0.04, 0.06 and 0.07 for ε 3, respectively. The smallest error corresponds to ε 2, i.e., the error obtained with the same damage model (Risk-UE) but with two different methods for vulnerability assessment. This confirms that, for damage prediction, assessing seismic vulnerability using the ARL method provides the same order of magnitude of damage as Risk-UE, with differences of less than 0.04. For the highest intensities (XI and XII), we obtained the same probability of exceedance with both ARL0 and ARL1: For these intensities, the generalization of damage to the most vulnerable buildings smooths the error related to damage prediction.

For a given intensity VIII (Fig. 8b), the largest errors were observed for vulnerability classes A and E, with values equal to 0.05 and 0.07 for ε 1 and ε 3. ε 2 often has the lowest values (classes B, C and E) and remains very low except for class A. The probabilities P(D k) are therefore comparable, regardless of the methods used for damage and vulnerability assessment. This confirms that with only two attributes, damage predictions are reliable even if vulnerability may be slightly variable. This remark is important because at the scale of the city, it means that an individual building inventory is not necessary to collect data on all the attributes and modifiers required to assess the Risk-UE index. An alternative is to use existing data, for example, from preexisting national databases or collected by remote sensing for the roof shape and the number of stories, or a combination of both.

4.3 Comparing the seismic damage scenario for Constantine

Figure 9 presents the damage distribution obtained for a given seismic scenario (intensity VIII) considering the same intensity throughout Constantine. This scenario corresponds to the last major earthquake that hit the region in 1985 (I = VIII). The three zones are divided into subzones, defined as being homogeneous in terms of urbanization (construction period and typology). We observed similar results for the three approaches.

Fig. 9 Spatial distribution of damage in Constantine computed for the intensity VIII scenario. Damage levels are grouped into slight (D0 + D1), moderate (D2 + D3) and severe (D4 + D5) and given as the number of buildings: (a) ARL0 method; (b) ARL1 method; (c) Risk-UE method

The scenario results for zone 1 (i.e., the oldest part of the city) indicate moderate to severe damage according to the Risk-UE method and moderate damage for both ARL0 and ARL1 methods. This is mainly due to the poor quality of buildings in this zone, with 94% of buildings classified as A or B, zone 1 being mainly composed of unreinforced masonry buildings (95% of the building stock). This area corresponds to a densely urbanized zone, with residential and commercial buildings. Zones 2 and 3 may suffer slight to moderate damage, because of the presence of a majority of frame and shear-wall RC buildings (urbanized between 1920 and 1980).

The damage distribution for Risk-UE and ARL methods is given in Table 7 and shown in Fig. 10. For an earthquake comparable to the 1985 earthquake (I = VIII), the damage level computed by Risk-UE shows that nearly 29% of the buildings of the studied area would be undamaged or only slightly damaged (D0 + D1), while about 45% of buildings would suffer moderate damage (D2 + D3), and more than 25% of buildings would suffer severe damage (D4 + D5). Using ARL, similar results are obtained for slight damage (29%), whereas differences are observed for moderate (57%) and severe damage (14%); these result from the occasional difficulty of distinguishing classes A and B. Nevertheless, in terms of damage prediction for seismic risk management and anticipation, both results are very similar and provide the representation of seismic risk required to implement a seismic risk management policy.
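The grouping of damage grades into slight (D0 + D1), moderate (D2 + D3) and severe (D4 + D5) can be illustrated with a minimal Python sketch. The function name `group_damage` is hypothetical, and the building counts are invented so that the shares land close to the Risk-UE percentages quoted above; they are not the values of Table 7.

```python
# Damage grades D0..D5 grouped into the three classes used in the text.
GROUPS = {"slight": (0, 1), "moderate": (2, 3), "severe": (4, 5)}


def group_damage(counts):
    """Aggregate building counts per damage grade into three classes.

    counts: six building counts, one per damage grade D0..D5.
    Returns {class name: (number of buildings, percentage of total)}.
    """
    if len(counts) != 6:
        raise ValueError("expected six counts, one per grade D0..D5")
    total = sum(counts)
    grouped = {}
    for name, grades in GROUPS.items():
        n = sum(counts[g] for g in grades)
        share = 100.0 * n / total if total else 0.0
        grouped[name] = (n, share)
    return grouped


# Hypothetical counts for 1000 buildings (illustrative only).
example_counts = [100, 190, 250, 200, 160, 100]
result = group_damage(example_counts)
```

Keeping both the count and the percentage for each class mirrors the way the comparison is reported, as numbers of buildings in the maps and as shares of the building stock in the text.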

Table 7 Comparison of the damage distribution (number of buildings and percentage) obtained by ARL and Risk-UE in Constantine for intensity VIII
Fig. 10 Comparison of the number of damaged buildings for intensity VIII computed by Risk-UE and ARL methods

5 Conclusions

One of the largest challenges in seismic vulnerability analysis at the urban scale is to compensate for the lack or poor quality of information concerning construction characteristics; collecting this information is certainly the most costly step of any seismic risk analysis. To overcome this problem, a recent approach based on the ARL datamining method was tested for the city of Constantine. In this manuscript, we have shown that simplified analysis methods based on single building-type attributes can be used to represent and analyze seismic risk at the urban scale. This will enable more detailed studies of existing buildings and the assessment of future seismic measures.

One of the most significant characteristics of this method is that it extracts “hidden” relationships between elementary attributes of buildings and seismic vulnerability. A vulnerability proxy is then derived that assigns a vulnerability class to every combination of building attributes. The proxy was derived during a learning phase and then applied to the rest of the databank of Constantine’s buildings. Taking the Risk-UE vulnerability assessed in Constantine as the ground truth, we observed a robust and accurate assessment of vulnerability, regardless of the building subset used for the learning phase. The Risk-UE method, against which the accuracy of the ARL method was evaluated, requires a lot of information on building characteristics, yet the two approaches provide similar results. This suggests that the ARL methodology was successfully applied to the Constantine data. For the same damage model, vulnerability assessment by ARL provides the same damage estimate as Risk-UE, even if only two attributes are used. Vulnerability assessment using the datamining approach therefore provides a relevant estimation of seismic damage at a cost far lower than that of conventional methods such as Risk-UE. In case of structural modification or retrofitting, the ARL-based classification could be updated in the same manner as the Risk-UE vulnerability, with the date and nature of the modification considered as an additional attribute for the classification. Finally, the ARL method was able to give an overall assessment of seismic risk in an urban area. Assuming similar urbanization features throughout Algeria, the ARL-based vulnerability proxy derived for Constantine could be applied to other cities to enable seismic vulnerability and seismic risk to be assessed at the scale of the country, using national census data or remote sensing data.