1 Introduction

The Indo-Gangetic Plain is a foredeep basin that follows the trend of the Himalayan collision zone. The basin is filled with thick alluvium deposits of varying degrees of compaction overlying the basement faults, ridges, and other tectonic features, obliterating their surface expression. The sediment thickness in the Indo-Gangetic Foredeep (IGF) varies from 500 m to 3.9 km with maximum thickness observed along the foothills of the Himalaya (Srivatsava 2001; Srinivas et al. 2013; Chadha et al. 2016). The near-surface geological features, especially the overburden soils and soft sediments in the basin, can dramatically amplify seismic waves and cause severe damage and fatalities, even at sites relatively far away from the epicenter (e.g., Cassidy and Rogers 2004; Nath and Thingbaijam 2009, 2011b; Nath et al. 2013, 2014; Maiti et al. 2017). The occurrence of some devastating earthquakes viz. 1833 Nepal earthquake of Mw 7.6, 1934 Nepal-Bihar earthquake of Mw 8.1, 1988 Bihar-Nepal earthquake of Mw 6.8, 2011 Sikkim earthquake of Mw 6.9, and 2015 Nepal earthquake of Mw 7.8 in a region of 500 km radius around IGF signifies how vulnerable the region is toward seismic devastations.

In India, urban centers are more susceptible to seismic risk due to high population density, unplanned growth, inadequate planning, poor land use, and substandard construction practices, thus necessitating sound disaster mitigation and management plans through a judicious interplay of seismic hazard and structural vulnerability. Therefore, the Ministry of Earth Sciences (MoES), Govt. of India in its XII FYP period planned to take up seismic hazard microzonation (SHM) of urban agglomeration of 30 targeted cities with population more than one million and those located in seismic zones III, IV, and V in the BIS zonation map of India (BIS 2002). In the present study, we worked out seismic scenario in the Indo-Gangetic Foredeep region with focus on probabilistic seismic hazard and damage assessment of the capital cities of Patna and Lucknow, and the Hindu religious city of Varanasi. The Bureau of Indian Standards (BIS 2002) placed the entire IGF region in seismic zones II–V, corresponding to a peak ground acceleration (PGA) range of 0.1–0.36 g. The lowest hazard zone marked II pertains to the southwestern part of the IGF, while zone III covers the central part of the Indo-Gangetic Foredeep region. The cities such as Varanasi, Lucknow, Agra, Kanpur, and Bareilly are classified under zone III. The northern part of the IGF encompassing Patna, Delhi, Meerut, Ghaziabad, Faridabad, Gurgaon, and Noida is classified under zone IV. Zone V occupies predominantly the northernmost part of Bihar encompassing Madhubani, Supaul, Araria, Sitamarhi, Dharbhanga, Madhepura, and Saharsa region facing direct threat from the Nepal-Bihar earthquake regime. 
Stratigraphically, the IGF is characterized by Holocene alluvial deposits that are prone to liquefaction during a large earthquake, as reported in the GSI memoirs (GSI 1939, 1993) for the 1833 Nepal earthquake of Mw 7.6, the 1934 Nepal-Bihar earthquake of Mw 8.1, and the 1988 Bihar-Nepal earthquake of Mw 6.8.

In order to develop effective earthquake mitigation measures, it is, therefore, imperative that the seismic hazard is systematically estimated for the terrain. Several attempts have been made by various researchers in the past for seismic hazard assessment in different parts of the country (viz. Khattri et al. 1984; Bhatia et al. 1999; Parvez et al. 2003; Das et al. 2006; Jaiswal and Sinha 2007; Mahajan et al. 2010; NDMA 2010; Nath and Thingbaijam 2012; Sitharam and Kolathayar 2013; Nath and Adhikari 2013; Sitharam et al. 2015; Nath et al. 2014; Adhikari and Nath 2016; Maiti et al. 2017). In assessing probabilistic seismic hazard, we include two different earthquake source models viz. layered polygonal seismogenic sources and active tectonic seismogenic sources at three hypocentral depth ranges viz. 0–25, 25–70, and 70–180 km. The seismicity parameters, i.e., a value and b value, along with the maximum earthquake (Mmax) have been estimated. Seismicity smoothing has been performed to estimate the activity rate of earthquake occurrence for both the layered polygonal and tectonic sources at different hypocentral depth ranges. Due to nonavailability of well-established attenuation relations for the region, six new next generation attenuation (NGA) models have been developed by nonlinear regression analysis for the three tectonic provinces, namely the Central Himalaya, the Central Indian Peninsular Shield, and the Indo-Gangetic Alluvium Basin. The ground motion prediction equation (GMPE) models given by Atkinson and Boore (2006) and Campbell and Bozorgnia (2003) have been used as fundamental equations for the generation of NGA models appropriate for this region. Apart from the new NGA models, we have altogether incorporated in this study 15 regional and global prediction equations based on suitability testing. Eventually, all the hazard contributing components viz. source attribution, seismic activity rates, and GMPEs are judiciously integrated with appropriate ranks and weights in a logic tree framework to generate probabilistic seismic hazard distribution with 10 and 2% probability of exceedance in 50 years, corresponding to return periods of 475 and 2475 years, in terms of PGA and 5% damped pseudo-spectral acceleration (PSA) for different time periods at firm rock site (site class: B/C, Vs ~ 760 m s−1). Thereupon, surface consistent seismic hazard for 10% probability of exceedance in 50 years has been estimated by using both the short and long period site amplification factors from the International Building Code (IBC 2009) corresponding to various sites classified from Vs30 values derived from the topographic gradient (Allen and Wald 2009). Subsequently, damage probability of different model building types has also been computed using the capacity spectrum method in Seismic Loss Estimation Approach (SELENA) using a logic tree based on damage functions and relational analysis protocol and validated qualitatively from damage reporting for historical earthquakes in the region.
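The pairing of 10 and 2% probability of exceedance with return periods of 475 and 2475 years follows from the usual Poisson assumption for earthquake occurrence over a 50-year exposure time. The short sketch below illustrates that conversion; the function names are illustrative and are not taken from the study.

```python
import math


def return_period(prob_exceedance: float, exposure_years: float) -> float:
    """Return period T (years) for a Poisson process in which the
    probability of at least one exceedance during `exposure_years`
    equals `prob_exceedance`:  P = 1 - exp(-t / T)."""
    if not 0.0 < prob_exceedance < 1.0:
        raise ValueError("prob_exceedance must lie strictly between 0 and 1")
    return -exposure_years / math.log(1.0 - prob_exceedance)


def exceedance_probability(return_period_years: float, exposure_years: float) -> float:
    """Inverse relation: probability of at least one exceedance."""
    return 1.0 - math.exp(-exposure_years / return_period_years)


if __name__ == "__main__":
    for p in (0.10, 0.02):
        t = return_period(p, 50.0)
        print(f"{p:.0%} in 50 years -> return period of about {t:.0f} years")
```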

2 Seismotectonism of Indo-Gangetic Foredeep and its adjoining region

The Indo-Gangetic Plain lies in the Himalayan foredeep which constitutes vast alluvium plains of the Ganges, Indus, and their tributaries. The basin is formed as a consequence of flexing of the Indian lithosphere due to the continued northward movement of the Indian plate and the thrust fold loading of the Himalayan Orogen. The structural limit between the Indo-Gangetic Foredeep and the Himalayan region in the north is defined by the Himalayan Frontal Thrust (HFT) which is a direct consequence of the compression resulting from collision of the Indian and the Eurasian plates. The major subsurface ridges along the Indo-Gangetic plains are the Faizabad Ridge, the Munger-Saharsa Ridge, and the Goalpara Ridge (Kayal 2008). The basement of the Indo-Gangetic Foredeep is formed by the extension of basement rocks and the depression. This basement configuration and the presence of thick sediments in the region render the Indo-Gangetic Foredeep prone to seismic hazard. Some major faults identified in the Indo-Gangetic Foredeep as reported in the Seismotectonic Atlas of India (Dasgupta et al. 2000) are the Hathusar Fault, the Moradabad Fault, the Mahendragarh Dehradun Fault, the Great Boundary Fault, the Lucknow Fault, the Main Frontal Thrust, the West Patna Fault, the East Patna Fault, the Munger-Saharsa Fault, and the Rajmahal Fault as depicted in Fig. 1. It is believed that most of the faults extend northwards transversely to the Himalayan belt (Valdiya 1976).

Fig. 1

a, b Seismotectonic setting of the Indo-Gangetic Foredeep and its adjoining region (modified after Dasgupta et al. 2000; Yin et al. 2009)

The seismicity of the IGF is the result of the collision tectonics between the Indian plate and the Eurasian plate (Chandra 1978; Seeber et al. 1981; Ni and Barazangi 1984; Kayal 2008). Due to these collision tectonic activities, major seismic events of Mw ≥ 6.0 from the near- and far-field seismogenic sources like the Central Himalaya, the Indo-Gangetic Alluvial Basin, and the Central Indian Peninsular Shield impacted the IGF and its adjoining region as shown in Fig. 1. Some notably disastrous earthquakes are the 1833 Nepal earthquake of Mw 7.6, 1934 Nepal-Bihar earthquake of Mw 8.1, 1988 Bihar-Nepal earthquake of Mw 6.8, 2015 Nepal earthquake of Mw 7.8, 1999 Chamoli earthquake of Mw 6.8, and 1905 Kangra earthquake of Mw 7.8 that caused widespread damage and loss of life and property in the IGF and its adjoining region.

Historical reports show that the damage potential in the IGF varies from MM intensity V to IX for both near- and far-field earthquakes. The 1833 Nepal earthquake of Mw 7.6 killed nearly 500 people, with most of the fatalities in the Kathmandu valley and the northern part of the IGF as shown in Fig. 2a (Dasgupta and Mukhopadhyay 2015). The 1905 Kangra earthquake of Mw 7.8 occurred in the West-Central Himalaya causing damage equivalent to MM intensity VI in the northern part of the IGF as depicted in Fig. 2b. The 1934 Nepal-Bihar earthquake of Mw 8.1 occurred at the Indo-Nepal boundary causing widespread damage and 10,500 fatalities in both Nepal and India (www.asc-india.org). Seismic intensities of MM VII to IX were reported in Patna district, MM VI to VII in Varanasi, and MM V to VI in Lucknow City (Dasgupta et al. 2000; GSI 1939) as depicted in Fig. 2c. The 1988 Bihar-Nepal earthquake of Mw 6.8 killed 721 persons, injured 6553, and damaged 64,470 buildings in eastern Nepal and India. MM intensities of IV to IX were felt in the IGF as depicted in Fig. 2d. The 1999 Chamoli earthquake of Mw 6.8 occurred in the West-Central Himalaya causing damage equivalent to MM intensity V–VII in the northern part of the IGF region as depicted in Fig. 2e. The 2015 Nepal earthquake of Mw 7.8 was the worst natural disaster to strike Nepal since the 1934 Nepal-Bihar earthquake and destroyed 138,182 houses completely and 122,694 houses partially across Nepal and its adjoining region. It took more than 9000 lives, injured many more, rendered more than 50,000 persons homeless, and caused an economic loss that was estimated to exceed the GDP rate of the country (www.usgs.gov). The earthquake was felt over large areas, in nearly 1608 cities of Nepal, India, Bangladesh, and China. MM intensities of IV–VII were felt in the IGF as shown in Fig. 2f. The structural destruction and building collapse caused by near- and far-source moderate to large earthquakes in the Indo-Gangetic Foredeep region are depicted in the photographs in Fig. 3.

Fig. 2

Isoseismal maps in the IGF for the a 1833 Nepal earthquake of Mw 7.6, b 1905 Kangra earthquake of Mw 7.8, c 1934 Nepal-Bihar earthquake of Mw 8.1, d 1988 Bihar-Nepal earthquake of Mw 6.8, e 1999 Chamoli earthquake of Mw 6.8, and f 2015 Nepal earthquake of Mw 7.8 (adopted from Dasgupta et al. 2000; Pandey and Molnar 1988; Dasgupta and Mukhopadhyay 2015; Ghosh and Mahajan 2013; www.usgs.gov)

Fig. 3

Photographs of structural damage and building destruction due to near- and far-source moderate to large magnitude earthquakes viz. a–c 1934 Nepal-Bihar earthquake of Mw 8.1, d–f 1988 Bihar-Nepal earthquake of Mw 6.8, g–i 2015 Nepal earthquake of Mw 7.8, j–l 1905 Kangra earthquake of Mw 7.8, and m–o 1999 Chamoli earthquake of Mw 6.8 in the Indo-Gangetic Foredeep region (Nasu 1935; Duggal and Sato 1989; Rai et al. 2016; Shrikhande et al. 2000; http://www.nicee.org/nepaleq/; http://123himachal.com/dharamsala/links/1905.htm)

3 Seismic hazard potential of the Indo-Gangetic Foredeep region

Seismicity of the Himalaya and the IGF is not the sole factor accountable for the seismic hazard of the basin. The interaction with sediment and subsequent amplification of seismic waves enhances the vulnerability of the Himalayan Foredeep region (Bagchi and Raghukanth 2017). Additionally, thick deposition of alluvial or cratonic sediment over hard crust is the reason behind the impedance contrast between the sediment and the crust along the vertical direction. The variation in the mechanical material properties of the sediment also creates interfaces of vertical impedance contrast among various strata. This is the reason behind the ‘stratigraphic effect.’ The combined effect of the basin and the strata alters the frequency content of the resultant ground motion at the surface (Nath 2011). Repetitive reflection and refraction of the waves within the basin can cause either attenuation or amplification of the incident seismic energy depending upon their frequency content. Therefore, an in-depth site characterization study is essential to understand the ‘basin/stratigraphic effect’ of the IGF. Generally, geotechnical and geophysical tests are carried out to characterize the soil overburden. However, the topographic slope data is widely used for seismic site characterization at the macroscale level (Nath et al. 2013). The site characterization of the IGF has been carried out using topographic gradient-based approach generated from the digital elevation model (DEM) data. The global data from post-processing on the original Shuttle Radar Topography Mission (SRTM) digital elevation data generated by the National Aeronautics & Space Administration (NASA) and the National Imagery & Mapping Agency (NIMA) are used for this purpose. The processing includes application of hole-filling algorithm to provide seamless and complete elevation surfaces for the entire region while eliminating areas of no data. 
Using the spatial analysis tool available in ArcGIS, the topographic slope at each grid point has been evaluated. Based on the correlation studies conducted for active and stable continental regions, Allen and Wald (2009) proposed slope ranges corresponding to each National Earthquake Hazards Reduction Program (NEHRP) site class given in Table 1. The entire IGF is associated with the stable continental region (Nath and Thingbaijam 2011a). We, therefore, used the stable continental region correlation for the assessment of site characteristics of the entire IGF. Based on the average shear-wave velocity to a depth of 30 m (Vs30), the entire region has been classified into NEHRP site classes (Phillips 2011) as E (Vs30 < 180 m s−1), D1 (Vs30 180–240 m s−1), D2 (Vs30 240–300 m s−1), D3 (Vs30 300–360 m s−1), C1 (Vs30 360–490 m s−1), C2 (Vs30 490–620 m s−1), C3 (Vs30 620–760 m s−1), and B (Vs30 > 760 m s−1) as depicted in Fig. 4.
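The subdivided site classes listed above amount to a simple lookup on Vs30. A minimal sketch of that lookup is given here; the function name is hypothetical, and the handling of values that fall exactly on a class boundary (assigned to the stiffer class) is an assumption, since the text does not specify it.

```python
from bisect import bisect_right

# Upper Vs30 bounds (m/s) of classes E, D1, D2, D3, C1, C2, C3 as quoted in the text.
_BOUNDS = [180.0, 240.0, 300.0, 360.0, 490.0, 620.0, 760.0]
_CLASSES = ["E", "D1", "D2", "D3", "C1", "C2", "C3", "B"]


def nehrp_site_class(vs30: float) -> str:
    """Map a Vs30 value (m/s) to the subdivided NEHRP site class.

    Assumption: a value exactly on a boundary goes to the stiffer class
    (e.g. 180 m/s -> D1); the source text leaves this open.
    """
    if vs30 <= 0.0:
        raise ValueError("Vs30 must be positive")
    return _CLASSES[bisect_right(_BOUNDS, vs30)]


if __name__ == "__main__":
    for v in (150, 200, 270, 330, 400, 500, 700, 800):
        print(v, nehrp_site_class(v))
```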

Table 1 Summary of slope ranges for subdivided NEHRP Vs30 categories for active tectonic and stable continental region (Allen and Wald 2009)

Fig. 4

Seismic site condition map of the Indo-Gangetic Foredeep and its adjoining region derived from topographic gradient-based correlations of Allen and Wald (2009) exhibiting that the entire Gangetic Plain is associated with site class D. Subplots (a–c) represent Vs30 distribution in Patna, Lucknow, and Varanasi cities derived from topographic conversion

To assess the agreement between the topography-derived site classes and those based on geotechnical/geophysical investigations, with Vs30 as proxy, the two site classification maps of Lucknow City are compared in Fig. 5. The maps show around 95% similarity in site class D, while in site class C the agreement is about 60% on accuracy statistics, which, as a first approximation, is considered satisfactory in the present case. As seen on the map, the entire IGF is mostly associated with site classes E and D, while site classes C and B are located in the southern and northern parts of the region. According to the NEHRP recommendation, soils with lower Vs30 values (i.e., toward site classes D and E) will experience more ground shaking due to the wave-amplifying properties of the soil. Generally, soft soils increase the ground motion amplitude during an earthquake and, therefore, are responsible for greater earthquake-induced disaster (Thitimakorn and Channoo 2012). Thus, most of the IGF is expected to exhibit higher ground motion amplification during a moderate to large magnitude earthquake.

Fig. 5

Site classification map of Lucknow City developed through the a topographic gradient-based approach and b geotechnical borehole analysis and MASW surveys (Anbazhagan et al. 2013a)

The surface and subsurface soil layers are very fertile in the IGF because the low-level floods in the Ganges continually replenish the surface soil. The subsurface lithological depth section of the cities of Patna, Lucknow, and Varanasi is characterized by fine- to coarse-grained sand overlain by a silt–clay horizon near the surface as depicted in Fig. 6. The ground water table reported by the Central Ground Water Board (CGWB) varies from 0.3 to 3.0 m in the region. Hence, the thick alluvium with a shallow ground water table is at high risk of undergoing excessive liquefaction and subsidence during an earthquake. In addition, the population density in this region is very high, with about 10,000 persons per square kilometer. The total population in the IGF is about 300 million, which is 25% of the total population of the country (Census 2011; http://censusindia.gov.in/). Therefore, the urban centers located in different parts of the IGF are susceptible to earthquake damage due to their proximity to the seismically active Himalayan belt and the local site effects associated with thick alluvium deposits.

Fig. 6

Subsurface lithological depth section of the cities of a Patna, b Lucknow, and c Varanasi which exhibits loose to medium cohesionless sediment deposits (adopted from http://www.aquiferindia.org/AboutAQUIM_Watershed_Patna.aspx for the city of Patna, Anbazhagan et al. (2013a) for the city of Lucknow, and Shukla and Raju (2008) for the city of Varanasi)

4 Probabilistic seismic hazard and structural damage modeling

The occurrence of devastating earthquakes in and around the IGF region has drawn attention to the seismic hazard and risk of the region and suggests that the Indo-Gangetic Foredeep is neotectonically active and, therefore, may generate damaging earthquakes in the near future. The contiguity to the active Himalayan region and the Peninsular Shield region, the nature of sediments, and the observed neotectonic activity render this region vulnerable to seismic devastation, especially due to amplification of ground motion in an alluvium-filled terrain through local site effects and soil liquefaction. Considering the seismogenic sources already defined, the subsurface lithology, the NEHRP-defined site condition, and the destructive earthquakes that have occurred in and around the IGF, it is evident that the capital cities of Patna and Lucknow and the famous Hindu religious city of Varanasi are vulnerable to earthquake-induced damage and destruction, thus necessitating an estimation of the seismic hazard potential of the region with a focus on these three important smart cities.

The evaluation of probabilistic seismic hazard and structural damage involves the combination of three major components: (a) probabilistic seismic hazard at firm rock condition through a logic tree framework, (b) surface level PSH model based on the International Building Code (IBC 2009)-defined site effects per NEHRP site class, and (c) damage model defining discrete damage probability based on the capacity spectrum method implemented in SELENA. The key components and workflow of the probabilistic seismic hazard and damage assessment of the Indo-Gangetic Foredeep region are illustrated in Fig. 7. The hazard modeling involves hierarchical development of different hazard components such as seismogenic sources, ground motion prediction equations, and NEHRP site conditions, appropriate hazard formulations considering both the epistemic and aleatory uncertainties, and integration of all of these in a logic tree framework.
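As an illustration of component (b), the sketch below looks up the short-period site coefficient Fa by site class and mapped short-period spectral acceleration Ss, with linear interpolation between tabulated columns. The coefficient values are those commonly quoted from the IBC 2009 site coefficient table and should be checked against the code text before any use. Collapsing the subdivided classes (D1, C2, and so on) onto their parent class is an assumption made here for illustration, and how the study applies the factor to the firm-rock hazard is not reproduced.

```python
# Ss breakpoints (g) and short-period site coefficients Fa by site class.
# Values as commonly quoted from IBC 2009; verify against the code text.
SS_BREAKS = [0.25, 0.50, 0.75, 1.00, 1.25]
FA_TABLE = {
    "A": [0.8, 0.8, 0.8, 0.8, 0.8],
    "B": [1.0, 1.0, 1.0, 1.0, 1.0],
    "C": [1.2, 1.2, 1.1, 1.0, 1.0],
    "D": [1.6, 1.4, 1.2, 1.1, 1.0],
    "E": [2.5, 1.7, 1.2, 0.9, 0.9],
}


def short_period_site_coefficient(site_class: str, ss: float) -> float:
    """Fa for a site class and Ss (g), linearly interpolated in Ss.

    Assumption: subdivided classes such as 'D2' use the parent class row.
    """
    row = FA_TABLE[site_class[0].upper()]
    if ss <= SS_BREAKS[0]:
        return row[0]
    if ss >= SS_BREAKS[-1]:
        return row[-1]
    for i in range(len(SS_BREAKS) - 1):
        x0, x1 = SS_BREAKS[i], SS_BREAKS[i + 1]
        if x0 <= ss <= x1:
            t = (ss - x0) / (x1 - x0)
            return row[i] + t * (row[i + 1] - row[i])
    raise RuntimeError("unreachable: ss lies inside the tabulated range")


if __name__ == "__main__":
    print(short_period_site_coefficient("D2", 0.375))
```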

Fig. 7

Integrated computational framework for probabilistic seismic hazard and structural damage assessment

At the onslaught of a destructive earthquake in a region, the prevailing georeferenced methodologies such as HAZUS (Hazard-US, FEMA 1999), RADIUS (Risk Assessment Tools for Diagnosis of Urban areas against Seismic Disasters, Okazaki et al. 2000), ELER (Earthquake Loss Estimation Routine, Hancilar et al. 2010), EPEDAT (The Early Post-Earthquake Damage Assessment Tool, Eguchi et al. 1997), and SELENA (Molina and Lindholm 2005; Molina et al. 2010) either individually or in unison are used for damage and loss modeling. In the present study, we implemented the capacity spectrum method in SELENA. Damage probability of different model building types is computed and categorized into five damage states viz. ‘none,’ ‘slight,’ ‘moderate,’ ‘extensive,’ and ‘complete’ in terms of total damaged area or the number of damaged buildings. The computational steps for seismic building damage modeling have also been illustrated in Fig. 7.
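To make the five damage states concrete, the following sketch converts a spectral displacement demand into discrete damage probabilities using lognormal fragility curves of the form used in HAZUS-type methodologies. In the capacity spectrum method the demand would be the performance point. All median and dispersion values below are hypothetical placeholders, not parameters from this study.

```python
import math

STATES = ["none", "slight", "moderate", "extensive", "complete"]


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def damage_state_probabilities(sd, medians, betas):
    """Discrete damage probabilities for a spectral displacement `sd`.

    `medians` and `betas` hold the median displacement and lognormal
    dispersion of the slight, moderate, extensive, and complete
    thresholds. P[ds >= k | sd] = Phi(ln(sd / median_k) / beta_k).
    Note: with unequal betas the curves can cross; this sketch does not
    correct for that case.
    """
    exceed = [normal_cdf(math.log(sd / m) / b) for m, b in zip(medians, betas)]
    probs = [1.0 - exceed[0]]
    probs += [exceed[k] - exceed[k + 1] for k in range(3)]
    probs.append(exceed[3])
    return dict(zip(STATES, probs))


if __name__ == "__main__":
    # Hypothetical thresholds (cm) and dispersions for one building type.
    medians = [1.0, 2.0, 4.0, 8.0]
    betas = [0.8, 0.8, 0.8, 0.8]
    for state, p in damage_state_probabilities(3.0, medians, betas).items():
        print(f"{state:10s} {p:.3f}")
```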

4.1 Probabilistic seismic hazard assessment

In the present study, rigorous formulations of hazard components have been adopted to build a probabilistic seismic hazard model (PSHM) of the cities of Patna, Lucknow, and Varanasi based on underlying seismogenic source zones in the Indo-Gangetic Foredeep and its adjoining region, employing the earthquake catalog supplemented by records of historical earthquakes, focal mechanism data, published literature, paleoseismicity findings, and neotectonic database. In view of the site characterization study of Nath et al. (2013) across the country, Nath and Thingbaijam (2012) considered firm rock site condition (standard engineering bedrock) to be most appropriate for regional hazard computation purposes. Therefore, the standard engineering bedrock conforming to Vs30 ~ 760 m s−1 (defined as boundary site class B/C) is considered as the rock site for which the hazard computation is performed (Nath and Thingbaijam 2012). In the probabilistic seismic hazard analysis, the annual rate of ground motion exceeding a specific value is computed to account for different return periods of the hazard. Contributions from all the relevant sources and possible events are considered. Thereafter, a logic tree framework is developed toward hazard computation at each site to incorporate multiple models in source considerations, GMPEs, and seismicity parameters. The hazard computation is performed on grid points covering the entire study region in the cities of Patna, Lucknow, and Varanasi at 0.0005° × 0.0005° interval. The hazard distributions are computed for the source zones at each depth section separately and, thereafter, integrated. The preliminary model comprises spatial distributions of seismic hazard in terms of PGA and 5% damped PSA.
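Once the annual rate of exceedance has been computed at a set of ground motion levels for a grid point, the hazard value for a given return period is read off the resulting hazard curve. A minimal sketch of that step, using log-log interpolation, is shown below; the synthetic power-law curve and the function name are illustrative assumptions, not results of this study.

```python
import math


def ground_motion_at_return_period(levels, annual_rates, return_period_years):
    """Interpolate a hazard curve (log-log) at the rate 1 / return period.

    `levels` are ground motion values in increasing order (e.g. PGA in g);
    `annual_rates` are the matching annual exceedance rates (decreasing).
    """
    target = 1.0 / return_period_years
    for i in range(len(levels) - 1):
        r0, r1 = annual_rates[i], annual_rates[i + 1]
        if r0 >= target >= r1:
            if r0 == r1:
                return levels[i]
            t = (math.log(target) - math.log(r0)) / (math.log(r1) - math.log(r0))
            log_a = math.log(levels[i]) + t * (math.log(levels[i + 1]) - math.log(levels[i]))
            return math.exp(log_a)
    raise ValueError("target rate lies outside the computed hazard curve")


if __name__ == "__main__":
    # Synthetic hazard curve: rate = 1e-4 * a**-2 (illustrative only).
    levels = [0.05, 0.1, 0.2, 0.4, 0.8]
    rates = [1e-4 * a ** -2 for a in levels]
    for rp in (475, 2475):
        pga = ground_motion_at_return_period(levels, rates, rp)
        print(f"{rp} years: PGA of about {pga:.3f} g")
```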

4.1.1 Preparation of earthquake catalog for the IGF

The preparation of a homogeneous and declustered earthquake catalog is the starting point for PSHA of the region under study, ideally considering a period from the prehistoric era to the present time. We, thus, prepared an earthquake catalog of this region spanning the period 1900–2016 by considering three major earthquake data sources, namely the International Seismological Centre (ISC, http://www.isc.ac.uk), U.S. Geological Survey/National Earthquake Information Center (USGS/NEIC, http://neic.usgs.gov.us), and Global Centroid Moment Tensor (GCMT, http://www.globalcmt.org), wherein the hypocentral depth entries have been computed using the algorithm given by Engdahl et al. (1998). Other data sources used include the India Meteorological Department (IMD, http://www.imd.ernet.in) and Jaiswal and Sinha (2004). For uniform magnitude scaling and establishing data homogeneity for meaningful statistical analysis, Mw is preferred owing to its applicability for all ranges of earthquakes: large or small, far or near, or shallow or deep-focused. To implement uniform magnitude scaling for the instrumental catalog, the Mw entries found in GCMT are retained. The magnitude entry from the ISC catalog is selected maintaining a preference order of Ms,ISC, mb,ISC, ML,ISC, MD,ISC, mpv,ISC, and MLv,ISC; the magnitude types mb,ISC, ML,ISC, MISC, MN,ISC, MD,ISC, MLv,ISC, mpv,ISC, mb,USGS, Ms,USGS, ukUSGS, and mw,USGS are converted into Mw,GCMT. For this purpose, the conversion relations derived by Nath et al. (2017) through the orthogonal standard regression (OSR) have been used to convert all the magnitude types into Mw,GCMT along with the existing reported equations as illustrated in Table 2. The uncertainties of the unified moment magnitude due to the usage of the conversion equations are incorporated during the compilation.

Table 2 Conversion relations used to convert all the magnitude types into Mw,GCMT (Nath et al. 2017)

Eventually, we obtained a compilation with higher data volume compared to the original sources. Thereafter, the entire catalog has been declustered to remove foreshocks and aftershocks to derive a main shock catalog as elaborated in Nath et al. (2017). The space–time clustering of seismicity is mostly exhibited by foreshocks and aftershocks. Main shock catalogs are derived by eliminating these clusters. Windowing algorithms are generally used for the purpose. The available algorithms (e.g., Gardner and Knopoff 1974; Reasenberg 1985; Uhrhammer 1986; Zhuang et al. 2002; Hainzl et al. 2006) generally differ in terms of spatiotemporal window parameters. On the other hand, deciding on optimal parameters is difficult in the light of diverse seismotectonic conditions (Gomberg et al. 2003). In the present study, we used the window-based declustering algorithm of Gardner and Knopoff (1974) to identify aftershocks and foreshocks depending on interevent space–time distance. According to Gardner and Knopoff (1974), the length and duration of the windows are given in Table 3. This method does not consider secondary and higher order aftershocks.

Table 3 Aftershock identification windows (Gardner and Knopoff 1974)

We adopted this technique since (a) there is higher likelihood of aftershocks of larger main shock events being recorded in the catalog compared to those for the smaller ones and (b) the spatial spans of aftershocks, especially for those associated with larger earthquakes, are dynamic depending not only on the magnitude of the event but also on the geological background.
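The window-based declustering idea can be sketched as follows. Because the Table 3 values are not reproduced in this text, the sketch substitutes a commonly used closed-form approximation to the Gardner and Knopoff (1974) windows; this substitution, the event tuple layout, and the function names are assumptions. Only aftershocks are flagged here, and the zone-inspection and foreshock steps described in the text are not implemented.

```python
import math


def gk_window(mag):
    """Approximate Gardner-Knopoff window: (distance in km, time in days).

    Closed-form fit commonly used in place of the original table;
    assumed here, not taken from Table 3 of the study.
    """
    dist_km = 10.0 ** (0.1238 * mag + 0.983)
    if mag >= 6.5:
        time_days = 10.0 ** (0.032 * mag + 2.7389)
    else:
        time_days = 10.0 ** (0.5409 * mag - 0.547)
    return dist_km, time_days


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2.0 * 6371.0 * math.asin(math.sqrt(a))


def decluster(events):
    """Return main shocks from `events`, a list of (time_days, lat, lon, mag).

    Events are visited from largest to smallest magnitude; any event of
    equal or smaller magnitude inside the space-time window that follows
    an unflagged event is marked as a dependent aftershock.
    """
    order = sorted(range(len(events)), key=lambda i: -events[i][3])
    dependent = [False] * len(events)
    for i in order:
        if dependent[i]:
            continue
        t0, lat0, lon0, m0 = events[i]
        dist_km, time_days = gk_window(m0)
        for j, (t, lat, lon, m) in enumerate(events):
            if j == i or dependent[j] or m > m0:
                continue
            if 0.0 <= t - t0 <= time_days and haversine_km(lat0, lon0, lat, lon) <= dist_km:
                dependent[j] = True
    return [e for e, d in zip(events, dependent) if not d]


if __name__ == "__main__":
    catalog = [
        (0.0, 27.0, 85.0, 7.0),     # main shock
        (10.0, 27.1, 85.1, 5.0),    # nearby, soon after: aftershock
        (20.0, 20.0, 75.0, 5.0),    # far away: independent
        (2000.0, 27.0, 85.0, 4.5),  # same place, outside the time window
    ]
    print(decluster(catalog))
```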

The parameters listed in Table 3 are adopted for magnitudes 3.0 ≤ Mw,GCMT ≤ 8.0, and the aftershock zone is identified by inspecting continuous spatial windows of 0.25° × 0.25° for the presence of at least one event within the specified time limit (in days) corresponding to the main shock of a given magnitude. Once the zones are demarcated, the events found within each zone from the start to the end of the catalog are examined by plotting the cumulative number of events against time. Nyffenegger and Frohlich (2000) observed that the aftershock sequences for intermediate as well as deep earthquakes do not behave differently from those of the shallower ones. The algorithm, therefore, remains the same for the deeper (hypocentral depth ≥ 70 km) earthquakes, and the termination of the aftershock sequences is decided accordingly. The analysis has uncertainties due to errors associated with epicentral locations, time, and magnitudes. In the processing, the epicenters are grouped within a distance bound, and consequently, the associated location errors are significantly reduced; the same holds for the time bins, while the magnitude-wise correlation between the events is done with the assigned magnitudes. We restricted the analysis to the identification of the most likely aftershocks; hence, errors in the magnitudes are not given additional treatment. The same approach is used for the detection of likely foreshocks based on the increasing seismic activity. Finally, Fig. 8 represents a seismicity map prepared using the derived main shock catalog.

Fig. 8

Declustered seismicity covering the period 1900–2016 and comprising 4597 main shock events. Subplots represent the histogram of declustered earthquake distribution for the three hypocentral depth ranges of 0–25, 25–70, and 70–180 km

The hypocentral depth entries in the compiled catalog (only main shock events considered) are either fixed (i.e., uncertainty not provided) or have standard deviation assigned. In order to reduce the associated uncertainties, the Engdahl–van der Hilst–Buland (EHB) catalog from the International Seismological Centre (2009) is consulted. The EHB catalog was prepared using data from the ISC and preliminary determination of epicenters of USGS/NEIC; hypocentral depth was recomputed using the algorithm given by Engdahl et al. (1998). The EHB catalog spans through 1960–2006 and does not recalculate the magnitudes. The epicentral location and the depth entries along with the uncertainty (standard deviation) in the present compilation are updated on the basis of event-to-event comparison with the entries in the EHB catalog. Additionally, records given by Stork et al. (2008) are selectively used to update entries not covered by the EHB catalog.

Thus, the complete and homogeneous earthquake catalog prepared for Indo-Gangetic Foredeep and its adjoining region has been used for probabilistic seismic hazard assessment in terms of seismogenic source zonation, seismicity analysis, smoothened seismicity modeling, and seismic hazard assessment protocol.
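As an illustration of the seismicity analysis step, the sketch below estimates the Gutenberg-Richter b value from a main shock catalog with the Aki (1965) maximum likelihood formula, including the usual half-bin correction, and derives the matching a value. The estimator actually used in the study is not stated in this section, so this choice, along with the function names and the completeness magnitude argument, is an assumption.

```python
import math


def b_value_mle(mags, m_c, bin_width=0.1):
    """Maximum likelihood b value and its first-order standard error.

    b = log10(e) / (mean(M) - (Mc - bin_width / 2)), using only events
    with M >= Mc; the standard error is approximated by b / sqrt(N).
    """
    selected = [m for m in mags if m >= m_c]
    if len(selected) < 2:
        raise ValueError("need at least two events above the completeness magnitude")
    mean_m = sum(selected) / len(selected)
    b = math.log10(math.e) / (mean_m - (m_c - bin_width / 2.0))
    return b, b / math.sqrt(len(selected))


def a_value(n_events, m_c, b, years):
    """Annual a value from log10 N(>= Mc) = a - b * Mc."""
    return math.log10(n_events / years) + b * m_c


if __name__ == "__main__":
    mags = [4.0, 4.1, 4.1, 4.3, 4.4, 4.6, 4.9, 5.2, 5.8, 6.3]  # illustrative only
    b, sigma_b = b_value_mle(mags, m_c=4.0)
    a = a_value(len(mags), 4.0, b, years=117.0)
    print(f"b = {b:.2f} +/- {sigma_b:.2f}, a = {a:.2f}")
```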

4.1.2 Seismogenic source definition in Indo-Gangetic Foredeep and its adjoining region

The source characterization includes both a homogeneous earthquake catalog of the region just illustrated in the previous section and also the fault database which is compiled on the Geographical Information System. The sources include Seismotectonic Atlas of India and NGLM published by the Geological Survey of India (Dasgupta et al. 2000; http://www.portal.gsi.gov.in) and the one extracted from Landsat TM/MSS/ETM, LISS III/IV, Cartosat-DEM, and SRTM data. The seismogenic sources are defined by superimposing the homogeneous and declustered earthquake catalog for the period of 1900–2016 on the fault pattern in the region. In the present study, we classified seismogenic sources as (a) layered polygonal seismogenic sources and (b) active tectonic seismogenic sources as illustrated in the following subsections.

Layered polygonal seismogenic source zones

A popular approach in the seismogenic localization process is the areal source zonation, wherein the objective is to capture uniform seismicity. Source delineation is primarily based on tectonic trends and seismicity of the region. It has been observed that seismicity patterns and source dynamics have significant variation with hypocentral depth (Christova 1992; Tsapanos 2000; Allen et al. 2004). Therefore, the consideration of a single set of seismicity parameters over the entire hypocentral depth range may lead to erroneous hazard estimation. Accordingly, we considered three hypocentral depth ranges: upper crust (0–25 km), lower crust (25–70 km), and upper mantle (70–180 km). The methodology adopted in the present study comprises three steps: (1) delineation of areal source zones on the basis of seismicity distribution and fault patterns complemented by available focal mechanism data, (2) formulation of seismicity model and associated uncertainty values for each source zone, and (3) application of a seismicity smoothing algorithm to obtain activity rates for specific threshold magnitude/s. The source zonation at each depth layer is carried out by considering the seismicity patterns, fault networks, and similarity in the style of focal mechanisms (e.g., Cáceres et al. 2005). The layered seismogenic source model in the present study is expected to resolve source characteristics more precisely than the single layer scheme that has been considered hitherto in the study region under the Global Seismic Hazard Assessment Program (GSHAP) and by other researchers. In order to establish the seismotectonic description at each layer, we constructed a representative focal mechanism tensor (i.e., F̄) by calculating the weighted average of the known moment tensors as follows:

$$ {\overline{F}}_{ij}=\frac{\sum \limits_{n=1}^N{M}_0^n{F}_{ij}^n}{\sum \limits_{n=1}^N{M}_0^n} $$
(1)

Where N is the total number of the focal mechanisms, \( {M}_0^n \) is the scalar moment of the nth focal mechanism, and \( {F}_{ij}^n \) is a function of the strike, dip, and rake of this focal mechanism (Aki and Richards 1980). Thus, following the above steps and the findings of Nath and Thingbaijam (2012), 27 areal source zones have been identified for the Indo-Gangetic Foredeep region as depicted in Fig. 9a-c, while the GSHAP considered only 14 areal sources as shown in Fig. 9d.
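The moment-weighted averaging of Eq. (1) can be illustrated with a minimal Python sketch. It assumes the standard Aki and Richards double-couple expressions for the tensor components (x north, y east, z down); the function names and the sample strike/dip/rake values are hypothetical and are not taken from the study.

```python
import numpy as np

def unit_moment_tensor(strike_deg, dip_deg, rake_deg):
    """Unit double-couple tensor F_ij from strike, dip, and rake
    (Aki and Richards convention: x north, y east, z down)."""
    phi, delta, lam = np.radians([strike_deg, dip_deg, rake_deg])
    sd, cd = np.sin(delta), np.cos(delta)
    s2d, c2d = np.sin(2 * delta), np.cos(2 * delta)
    sl, cl = np.sin(lam), np.cos(lam)
    sp, cp = np.sin(phi), np.cos(phi)
    s2p, c2p = np.sin(2 * phi), np.cos(2 * phi)
    mxx = -(sd * cl * s2p + s2d * sl * sp**2)
    mxy = sd * cl * c2p + 0.5 * s2d * sl * s2p
    mxz = -(cd * cl * cp + c2d * sl * sp)
    myy = sd * cl * s2p - s2d * sl * cp**2
    myz = -(cd * cl * sp - c2d * sl * cp)
    mzz = s2d * sl
    return np.array([[mxx, mxy, mxz], [mxy, myy, myz], [mxz, myz, mzz]])

def representative_tensor(mechanisms, scalar_moments):
    """Eq. (1): average of the unit tensors weighted by scalar moment M0."""
    tensors = np.array([unit_moment_tensor(*m) for m in mechanisms])
    weights = np.asarray(scalar_moments, dtype=float)
    return np.tensordot(weights, tensors, axes=1) / weights.sum()

# Hypothetical input: two thrust-type mechanisms (strike, dip, rake) with M0 in N m.
F_bar = representative_tensor([(290, 20, 95), (300, 15, 110)], [2.0e19, 5.0e18])
```

Because the weights are the scalar moments, the largest events in a layer dominate the representative tensor.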

Fig. 9

Layered polygonal seismogenic source framework for the Indo-Gangetic Foredeep region at the hypocentral depth ranges: a 0–25 km, b 25–70 km, and c 70–180 km as adopted from Nath and Thingbaijam (2012), and d areal sources identified by GSHAP (1999)

Active tectonic seismogenic sources

Additional seismogenic sources considered here are the active tectonic features such as faults and lineaments (Azzaro et al. 1998; Slemmons and McKinney 1977). Many active faults and lineaments capable of producing earthquakes of Mw 3.5 and above are expected to influence the seismic hazard of the Indo-Gangetic Foredeep region and, hence, have been extracted from the Seismotectonic Atlas of India (Dasgupta et al. 2000), the National Geomorphological & Lineament Mapping (NGLM) on a 1:50,000 scale (available at http://www.Portal.Gsi.gov.in/portal/page; http://bhuvan.nrsc.gov.in/gis/thematic/index.php), and additional features picked by image processing of Landsat TM, SRTM, ASTER, and LISS IV data as depicted in Fig. 10. Emphasis has, however, been given to large-scale lineaments having relevance to geomorphology, vegetation patterns, tectonic contact zones, and aligned abrupt drainage patterns/rivers, which are generally related to faults. The seismicity of this region can thus be linked to active faults/lineaments that were not identified or implicated earlier, although they are also potential seismic sources in the region. The focal mechanism data employed in the present study are derived from the Global Centroid Moment Tensor (GCMT, available at www.globalcmt.org) database covering the period 1976–2016 and other available sources such as Dasgupta et al. (2000), Chandra (1977), Singh and Gupta (1980), and Bilham and England (2001). We thus identified about 527 active tectonic features, shown in Fig. 10, in the hypocentral depth ranges of 0–25, 25–70, and 70–180 km.

Fig. 10

Active tectonic sources as identified to have seismic hazard contribution to the Indo-Gangetic Foredeep region

4.1.3 Smoothened gridded seismicity model

The contribution of background events to the hazard is calculated using smoothened gridded seismicity models, wherein discrete earthquake distributions are modeled as spatially continuous probability distributions using the Frankel (1995) methodology. In this study, the IGF and its adjoining region are gridded into 0.1° × 0.1° cells. The smoothing function used is given as,

$$ N\left({m}_{\mathrm{r}}\right)=\frac{\sum \limits_j{n}_j\left({m}_{\mathrm{r}}\right){e}^{-{\left({d}_{ij}/c\right)}^2}}{\sum \limits_j{e}^{-{\left({d}_{ij}/c\right)}^2}} $$
(2)

Where nj(mr) is the number of events with magnitude ≥mr, dij is the distance between the ith and jth cells, and c denotes the correlation distance. The annual activity rate \( {\lambda}_{m_{\mathrm{r}}} \) is computed each time as N(mr)/T, where T is the subcatalog period. The present analyses make use of different subcatalogs with the threshold magnitudes of Mw 3.5, 4.5, and 5.5, respectively, as summarized in Table 4 at the hypocentral depth levels of 0–25, 25–70, and 70–180 km. Correlation distances of 55, 65, and 85 km are decided for the respective cases by calibrating the outputs from several runs of the algorithm with the observed seismicity.
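The smoothing of Eq. (2) can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: cell centers are given in planar kilometre coordinates rather than on the 0.1° geographic grid, and the function names and sample numbers are hypothetical.

```python
import numpy as np

def smooth_counts(counts, x_km, y_km, c_km):
    """Eq. (2): Gaussian-kernel smoothing of per-cell event counts n_j(m_r).

    counts     : number of events with magnitude >= m_r in each cell
    x_km, y_km : planar coordinates of the cell centers (assumption)
    c_km       : correlation distance c
    """
    counts = np.asarray(counts, dtype=float)
    x = np.asarray(x_km, dtype=float)
    y = np.asarray(y_km, dtype=float)
    d2 = (x[:, None] - x[None, :]) ** 2 + (y[:, None] - y[None, :]) ** 2
    kernel = np.exp(-d2 / c_km**2)          # exp(-(d_ij / c)^2)
    return kernel @ counts / kernel.sum(axis=1)

def annual_activity_rate(smoothed_counts, t_years):
    """Annual activity rate N(m_r) / T for a subcatalog of duration T."""
    return np.asarray(smoothed_counts, dtype=float) / t_years

# Hypothetical row of five cells spaced 10 km apart with all events in the middle cell.
x = np.arange(5) * 10.0
y = np.zeros(5)
smoothed = smooth_counts([0, 0, 4, 0, 0], x, y, c_km=10.0)
rates = annual_activity_rate(smoothed, t_years=50.0)
```

A larger correlation distance spreads the observed counts over more neighbouring cells, which is why the value of c is calibrated against the observed seismicity.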

Table 4 The subcatalogs for the three threshold magnitudes are given here

The smoothened seismicity models given in Fig. 11 depict possible stressed zones within the East-Central Himalaya which incidentally had been the source for triggering the 1934 Nepal-Bihar earthquake of Mw 8.1, 1988 Bihar-Nepal earthquake of Mw 6.8, and the recent 25th April 2015 Nepal earthquake of Mw 7.8.

Fig. 11

Smoothened seismicity models for different threshold magnitudes at three hypocentral depth levels of 0–25, 25–70, and 70–180 km

4.1.4 Analysis of seismic activity rate on active tectonic sources

In the present study, seismic activity rates are calculated for each active tectonic source inscribed in each polygonal seismogenic source for the threshold magnitudes (mo) of Mw 3.5, 4.5, and 5.5 and for the focal depth ranges of 0–25, 25–70, and 70–180 km. We employed the fault degradation technique of Iyengar and Ghosh (2004) for this purpose. The number of earthquakes per year with m > mo in a given layered polygonal seismogenic source consisting of n faults is denoted as N(mo). According to the fault degradation technique, N(mo) should equal the sum of the numbers of earthquakes Ns(mo) along all the faults delineated in the seismogenic source zone, i.e., N(mo) = ∑Ns(mo), where Ns(mo) represents the annual frequency of occurrence of an event on the sth subfault (s = 1, 2, …, n) with mo = 3.5, 4.5, and 5.5. The number of events Ns(mo) that occur on a given fault depends on factors such as the length of the fault (Ls) and the number of past earthquakes (ns) of magnitude mo and above associated with the sth fault; both are used as weights for calculating Ns(mo). For example, if Nt is the total number of events occurring within a polygonal areal source, the weighting factors can be estimated as,

$$ {\alpha}_s={L}_s/\varSigma {L}_s\kern0.5em \mathrm{and}\kern0.5em {\delta}_s={n}_s/{N}_{\mathrm{t}} $$
(3)

Taking the mean of the above two weighting factors which indicates the seismic activity of the sth fault in the zone, we can compute the annual activity rate of the sth fault by,

$$ {N}_s\left({m}_{\mathrm{o}}\right)=0.5\left({\alpha}_s+{\delta}_s\right)\times N\left({m}_{\mathrm{o}}\right) $$
(4)
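Equations (3) and (4) amount to a simple apportioning of the zone rate among its faults, as the following Python sketch shows. The function name and the sample fault lengths and event counts are hypothetical; when no zone total Nt is supplied, the sketch assumes that every event in the zone is associated with one of the listed faults.

```python
import numpy as np

def fault_activity_rates(lengths_km, past_events, zone_rate, n_total=None):
    """Degrade the zone rate N(m_o) into per-fault rates N_s(m_o).

    lengths_km  : fault lengths L_s
    past_events : past events n_s (magnitude >= m_o) associated with each fault
    zone_rate   : annual rate N(m_o) of the polygonal source zone
    n_total     : total events N_t in the zone; defaults to sum(n_s) (assumption)
    """
    lengths = np.asarray(lengths_km, dtype=float)
    events = np.asarray(past_events, dtype=float)
    if n_total is None:
        n_total = events.sum()
    alpha = lengths / lengths.sum()              # Eq. (3), length weight
    delta = events / n_total                     # Eq. (3), seismicity weight
    return 0.5 * (alpha + delta) * zone_rate     # Eq. (4)

# Hypothetical zone with three faults and a zone rate of 2 events per year.
rates = fault_activity_rates([100.0, 50.0, 50.0], [10, 5, 5], zone_rate=2.0)
```

When all events in the zone are assigned to faults, both weights sum to one and the per-fault rates add up to N(mo), as the technique requires.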

The annual activity rate of each tectonic feature inscribed in each of the 27 polygonal areal seismic sources has thus been computed wherein the regional recurrence is degraded into individual faults/lineaments. In Fig. 12, we presented representative plots of annual activity rate versus magnitude for a group of active tectonic features inscribed in the polygonal areal seismogenic source zones ‘2’ and ‘3’ for each of the three threshold magnitudes of Mw 3.5, 4.5, and 5.5. The spatial distribution of fault activity rate for different threshold magnitudes at three hypocentral depth ranges of 0–25, 25–70, and 70–180 km is also depicted in Fig. 13.

Fig. 12

Representative plots of annual activity rate versus magnitude for a group of active tectonic features inscribed in the polygonal areal seismogenic sources ‘2’ and ‘3’ for the threshold magnitudes of Mw 3.5, 4.5, and 5.5

Fig. 13

Fault activity rate for various threshold magnitudes at the three hypocentral depth ranges of 0–25, 25–70, and 70–180 km

4.1.5 Seismicity analysis

Seismicity parameters: ‘a’ and ‘b’ value assessment

The evaluation of seismicity parameters is one of the most important steps in the seismic hazard estimation. Earthquake occurrences across the globe follow the Gutenberg and Richter (GR) (Gutenberg and Richter 1944) relationship,

$$ {\log}_{10}\lambda (m)=a- bm $$
(5)

Where λ(m) is the cumulative number of events with magnitude ≥m. The slope parameter, commonly termed the b value, is often employed as an indicator of stress regime in the tectonic reinforcements and to characterize seismogenic zones (Schorlemmer et al. 2005). The maximum likelihood method for the estimation of b value given by Aki (1965) and Utsu (1965) has been used here as,

$$ b=\frac{\log_{10}(e)}{\left[{m}_{\mathrm{mean}}-\left({m}_{\mathrm{t}}-\frac{\varDelta m}{2}\right)\right]} $$
(6)

Where mmean is the average magnitude, mt is the minimum magnitude of completeness, and Δm is the magnitude bin size (= 0.1 in the present study). The standard deviation of the b value (δb) has been computed by the bootstrapping method suggested by Schorlemmer et al. (2003), which involves repeated computations, each time on a resampled data set in which events drawn from the catalog may be selected more than once. A minimum magnitude constraint is generally applied on the GR relation given by Eq. (5) on the basis of the magnitude of completeness, as indicated by the linearity of the GR relation in the lower magnitude range. An upper magnitude bound has been suggested in accordance with the physical dissipation of energy and the constraints due to the tectonic framework (Kijko 2004). This is achieved by establishing the maximum earthquake Mmax physically capable of occurring within a defined seismic regime in an underlying tectonic setup. The magnitude distribution is, therefore, truncated at Mmax such that Mmax ≫ mmin. A modified version of Eq. (5) formulated by Page (1968) and Cornell and Vanmarcke (1969) is a truncated exponential distribution (TGR) as follows,

$$ \lambda (m)=\lambda \left({m}_{\mathrm{min}}\right)\frac{10^{-b\left(m-{m}_{\mathrm{min}}\right)}-{10}^{-b\left({M}_{\mathrm{max}}-{m}_{\mathrm{min}}\right)}}{1-{10}^{-b\left({M}_{\mathrm{max}}-{m}_{\mathrm{min}}\right)}} $$
(7)

Where mmin is the minimum magnitude and Mmax is the upper-bound magnitude. The maximum earthquake (Mmax) is the largest seismic event characteristic of the terrain under tectonostratigraphic consideration. Return periods are assigned to the incomplete data (including the historical data) according to the GR and TGR models. The linear GR relation can statistically accommodate large events if the seismic source zone is of appropriate size and the temporal coverage of the catalog is long enough, while the TGR model is reckoned to be more appropriate considering the energy dissipation at larger magnitudes. In several cases, zones with similar tectonics are merged to achieve a sufficient number of events (≥ 50 in the present case) as well as an acceptable uncertainty in the estimated seismicity parameters. This ultimately produced 20 zones out of the 27 zones initially considered. Seismicity analysis has been performed in these zones to estimate both the ‘a’ and ‘b’ values. The sample frequency magnitude distribution plots for main shock events in the seismogenic source zones ‘2,’ ‘3,’ and ‘18’ are depicted in Fig. 14, and the seismicity parameters estimated for all the polygonal seismogenic sources are listed in Table 5.
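The estimators of Eqs. (6) and (7) can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names, the number of bootstrap resamples, and the synthetic catalog are all assumptions.

```python
import numpy as np

def b_value(mags, m_t, dm=0.1):
    """Eq. (6): maximum likelihood b value with the bin-width correction."""
    mags = np.asarray(mags, dtype=float)
    mags = mags[mags >= m_t - 1e-9]      # small tolerance for binned magnitudes
    return np.log10(np.e) / (mags.mean() - (m_t - dm / 2.0))

def b_value_bootstrap_std(mags, m_t, dm=0.1, n_boot=200, seed=0):
    """Standard deviation of b from resampling the catalog with replacement."""
    rng = np.random.default_rng(seed)
    mags = np.asarray(mags, dtype=float)
    mags = mags[mags >= m_t - 1e-9]
    samples = [b_value(rng.choice(mags, size=mags.size, replace=True), m_t, dm)
               for _ in range(n_boot)]
    return float(np.std(samples))

def tgr_rate(m, rate_min, b, m_min, m_max):
    """Eq. (7): truncated GR cumulative annual rate of events >= m."""
    tail = 10.0 ** (-b * (m_max - m_min))
    return rate_min * (10.0 ** (-b * (m - m_min)) - tail) / (1.0 - tail)

# Synthetic catalog (assumption): exponential magnitudes with b = 1, binned to 0.1.
rng = np.random.default_rng(1)
catalog = np.round(3.95 + rng.exponential(1.0 / np.log(10.0), 5000), 1)
b_hat = b_value(catalog, m_t=4.0)
b_std = b_value_bootstrap_std(catalog, m_t=4.0)
```

The truncated relation returns the full rate at mmin and falls to zero at Mmax, which is the behaviour that distinguishes it from the unbounded GR line at large magnitudes.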

Fig. 14

Representative frequency magnitude distribution plots at some typical polygonal seismogenic source zones ‘2,’ ‘3,’ and ‘18.’ The red line represents the truncated Gutenberg–Richter (TGR) relation, the blue line represents the Gutenberg–Richter (GR) relation, while the circles and squares represent the instrumental events (the complete data coverage) and incomplete data (including the historical data as extreme data coverage), respectively

Table 5 Estimated seismicity parameters for all the polygonal seismogenic sources considered for PSHA in the Indo-Gangetic Foredeep region

Maximum earthquake prognosis

The maximum earthquake (Mmax) is the largest seismic event characteristic of the terrain under the seismotectonic consideration. The Mmax values are often calculated from fault dimensions and geodetic inferences (Wells and Coppersmith 1994; Anderson et al. 1996), in addition to the frequency magnitude distribution indicated by past earthquakes. Maximum earthquake prognosis has been performed for both the layered polygonal sources and the active tectonic sources. For polygonal sources, a maximum likelihood method for maximum earthquake estimation referred to as the Kijko–Sellevoll–Bayesian technique (Kijko 2004; Kijko and Graham 1998) is used. The technique is based on Bayesian equation of frequency magnitude distribution. It has been observed that empirical magnitude distribution deviates moderately from the Gutenberg–Richter relation following an exponential tail of a Gamma function at larger magnitudes. Hence, a readily available Mmax estimator computer code proposed by Kijko (2004) has been employed in the present study. The estimated maximum earthquake (Mmax) and the observed maximum earthquake of each polygonal seismogenic source are listed in Table 5.

The deterministic assessment of the characteristic earthquake, viz. the maximum earthquake from a fault, is generally achieved with a relationship between earthquake magnitude and co-seismic subsurface fault rupture length. The primary method used to estimate subsurface rupture length and rupture area is the spatial pattern of early aftershocks (Wells and Coppersmith 1994). Aftershocks that occur within a few hours to a few days of the main shock generally define the maximum extent of co-seismic fault rupture (Kanamori and Anderson 1975; Dietz and Ellsworth 1990; Wong et al. 2000). In essence, an aftershock zone roughly corresponds to the fault area ruptured during the main shock. Detailed studies indicate that aftershocks are concentrated near the margin of the fault area where large displacement occurred (e.g., Das and Henry 2003; Utsu 2002). The general assumption, based on worldwide data, is that one third to one half of the total length of the fault would rupture when it generates the maximum earthquake (Mark 1977; Kayabalia and Akin 2003; Shukla and Choudhury 2012; Seyrek and Tosun 2011). In the present study, the fault rupture segmentation is identified using the maximum length of the well-aligned main shock and aftershocks along the faults (e.g., Besana and Ando 2005; Utsu 2002; Wells and Coppersmith 1994). On digitization in GIS, this yields the subsurface rupture length of each active tectonic feature, whose maximum earthquake is then estimated using the relationship between co-seismic subsurface fault rupture dimension and magnitude of Wells and Coppersmith (1994). Table 6 lists some major active tectonic sources, their total fault length (TFL), the associated observed maximum earthquakes (Mmax,obs), the subsurface rupture length (RLD), and the maximum predicted earthquake (Mmax) from RLD, which is seen to fall within the bounds of the one third and one half approximations of Mark (1977), Kayabalia and Akin (2003), Shukla and Choudhury (2012), and Seyrek and Tosun (2011).
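As an illustration of this step, the sketch below applies a magnitude versus subsurface rupture length regression of the form M = a + b log10(RLD). The coefficients a = 4.38 and b = 1.49 are the values commonly quoted for the all-slip-type regression of Wells and Coppersmith (1994); the text does not state which slip-type regression was used, so both the choice and the numerical values are assumptions to be checked against the original tables. The function names and sample lengths are hypothetical.

```python
import math

# Assumed coefficients: all-slip-type regression of magnitude on subsurface
# rupture length (RLD, km), as commonly quoted from Wells and Coppersmith (1994).
WC94_A = 4.38
WC94_B = 1.49

def mmax_from_rld(rld_km):
    """Estimate the maximum magnitude from subsurface rupture length in km."""
    return WC94_A + WC94_B * math.log10(rld_km)

def rupture_fraction(rld_km, total_fault_length_km):
    """Ratio RLD / TFL, to compare with the one third to one half rule."""
    return rld_km / total_fault_length_km

# Hypothetical fault: 250 km total length with a 100 km rupture segment.
m_max = mmax_from_rld(100.0)
fraction = rupture_fraction(100.0, 250.0)
```

For this hypothetical fault the rupture fraction is 0.4, which lies between the one third and one half bounds discussed above.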

Table 6 Some major active tectonic features considered to be seismogenic in the region along with their total fault lengths (TFL), observed maximum earthquake (Mmax,obs), projected subsurface rupture length (RLD), and estimated maximum earthquake (Mmax)

4.1.6 Region-specific ground motion prediction equations

An evaluation of seismic hazard, whether deterministic or probabilistic, requires an estimate of the expected ground motion at the site of interest. The most common means of estimating this ground motion in engineering practice, including probabilistic seismic hazard analysis, is to use an existing attenuation relationship that relates a specific strong ground motion parameter, such as PGA, peak ground velocity (PGV), PSA, pseudo-spectral velocity (PSV), peak ground displacement (PGD), or intensity, to one or more attributes of an earthquake (e.g., Campbell and Bozorgnia 2003; Atkinson and Boore 2003; Nath and Thingbaijam 2011a; Nath et al. 2012). A large volume of work is already available on the development and application of GMPEs; recent reviews can be found in Douglas (2003), Campbell (2003), Power et al. (2008), Nath and Thingbaijam (2011a), and Anbazhagan et al. (2015a).

The ground motion parameters at a site of interest are evaluated by using a ground motion prediction equation that relates a specific strong motion parameter of ground shaking to one or more seismic attributes (Campbell and Bozorgnia 2003; Nath et al. 2012). Appropriate ground motion prediction equations are not only useful in rapid hazard assessment but also important for seismic risk analysis. The selection of a model for the prediction equation is important, as it should be both realistic and practical, neither too complex nor too simple. Owing to the paucity of strong ground motion data with good magnitude coverage, a realistic prognosis of possible seismic effects in terms of tectonic regime, earthquake size, local geology, and near-fault conditions necessitates systematic ground motion synthesis by analytical or numerical approaches. There are several algorithms available for ground motion synthesis. However, the finite-fault stochastic method is considered to be best suited for handling large fault rupture distances as well as the source characteristics for near-field approximation (Nath et al. 2009, 2012, 2014). Thus, in the present study, the stochastic finite-fault simulation is performed using EXSIM of Motazedian and Atkinson (2005) for strong ground motion synthesis. In order to create a strong ground motion database, we simulated earthquakes from Mw 3.5 up to the maximum earthquake, at 0.2 Mw intervals, in the three tectonic provinces, namely the Central Himalaya, the Central Indian Peninsular Shield, and the Indo-Gangetic Alluvium Basin. The source functions for earthquake simulation using EXSIM have been obtained from published literature and are listed in Table 7. The amplification due to shallow crustal effects, considered an important attribute for ground motion simulation at the crustal level, is used in the ground motion synthesis. The 1D crustal velocity model of the Indo-Gangetic Foredeep adopted from Monsalve et al. (2006), as shown in Fig. 15a, is used to incorporate crustal amplification. The crustal amplification as a function of frequency is presented in Fig. 15b and is calculated from the shear-wave velocity profile using the quarter wavelength approximation (Boore and Joyner 1997).

Table 7 Parameters used for strong ground motion simulation in the Indo-Gangetic Foredeep region
Fig. 15

a Crustal velocity model for the Indo-Gangetic Foredeep region (adopted from Monsalve et al. 2006). b First-order approximated crustal amplification is estimated using the crustal velocity model following Boore and Joyner (1997) for the Indo-Gangetic Foredeep region

Figure 16 exhibits the simulated acceleration time history as well as the corresponding spectrum of (a) the 2015 Nepal earthquake of Mw 7.8 simulated at Patna City, (b) the 1997 Jabalpur earthquake of Mw 5.8 simulated at Lucknow City, and (c) the 1934 Nepal-Bihar earthquake of Mw 8.1 simulated at Varanasi City.

Fig. 16

Simulated acceleration time history and the corresponding spectrum of a the 2015 Nepal earthquake of Mw 7.8 simulated at Patna City, b the 1997 Jabalpur earthquake of Mw 5.8 simulated at Lucknow City, and c the 1934 Nepal-Bihar earthquake of Mw 8.1 simulated at Varanasi City

Thereupon, nonlinear regression analyses have been performed for different shaking parameters Y (i.e., PGA, PSA, PGV, and PGD at different periods) following least square error minimization to estimate the coefficients of NGA models following Atkinson and Boore (2006) and Campbell and Bozorgnia (2003) ground motion prediction formulations as given in Eqs. (8) and (9), respectively, for the three major tectonic provinces viz. the Central Himalaya, the Central Indian Peninsular Shield, and the Indo-Gangetic Alluvium Basin. The fundamental models adopted for nonlinear regression analysis are individually given as,

(a) Atkinson and Boore (2006) (BA06):

$$ \mathrm{LogPSA}={C}_1+{C}_2M+{C}_3{M}^2+\left({C}_4+{C}_5M\right){f}_1+\left({C}_6+{C}_7M\right){f}_2+\left({C}_8+{C}_9M\right){f}_0+{C}_{10}{R}_{\mathrm{cd}} $$
(8)
$$ {\displaystyle \begin{array}{l}\mathrm{Where},\\ {}{f}_0=\max \left(\log \left({R}_0/{R}_{\mathrm{cd}}\right),0\right);{f}_1=\min \left(\log {R}_{\mathrm{cd}},\log {R}_1\right),\\ {}{f}_2=\max \left(\log \left({R}_{\mathrm{cd}}/{R}_2\right),0\right);{R}_0=10;{R}_1=70;{R}_2=140,\end{array}} $$

M is the magnitude in Mw, Rcd represents the fault distance in kilometers, and C1–C10 are the regression coefficients. The regression coefficients obtained for the Central Himalaya, the Central Indian Peninsular Shield, and the Indo-Gangetic Alluvium Basin seismogenic zones in the Indo-Gangetic Tectonic Province using this next generation attenuation model are given in Table 8.
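The functional form of Eq. (8) and a least-squares fit of its coefficients can be sketched as follows. Once f0, f1, and f2 are evaluated, Eq. (8) is linear in C1–C10, so this sketch uses ordinary linear least squares in place of the nonlinear regression described in the text. The coefficient values and the synthetic data set are hypothetical and are not those of Table 8.

```python
import numpy as np

R0, R1, R2 = 10.0, 70.0, 140.0   # hinge distances (km) from Eq. (8)

def ba06_terms(mag, r_cd):
    """Design matrix whose columns multiply C1..C10 in Eq. (8)."""
    mag = np.asarray(mag, dtype=float)
    r_cd = np.asarray(r_cd, dtype=float)
    f0 = np.maximum(np.log10(R0 / r_cd), 0.0)
    f1 = np.minimum(np.log10(r_cd), np.log10(R1))
    f2 = np.maximum(np.log10(r_cd / R2), 0.0)
    return np.column_stack([np.ones_like(mag), mag, mag**2, f1, mag * f1,
                            f2, mag * f2, f0, mag * f0, r_cd])

def ba06_log_psa(coeffs, mag, r_cd):
    """Evaluate log PSA for given coefficients, magnitudes, and distances."""
    return ba06_terms(mag, r_cd) @ np.asarray(coeffs, dtype=float)

def fit_ba06(mag, r_cd, log_psa):
    """Least-squares estimate of C1..C10 from simulated ground motions."""
    coeffs, *_ = np.linalg.lstsq(ba06_terms(mag, r_cd), log_psa, rcond=None)
    return coeffs

# Synthetic, noise-free data set generated from hypothetical coefficients.
true_c = np.array([0.9, 0.98, -0.066, -2.7, 0.16, -0.8, 0.05, 0.2, -0.03, -0.0004])
mags, dists = np.meshgrid(np.arange(3.5, 8.2, 0.2), np.logspace(0.0, 2.7, 30))
mag, r_cd = mags.ravel(), dists.ravel()
fitted = fit_ba06(mag, r_cd, ba06_log_psa(true_c, mag, r_cd))
```

With real simulated motions the same fit would be repeated for PGA and for PSA at each period, giving one coefficient set per shaking parameter and tectonic province.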

Table 8 Regression coefficients obtained by considering the Atkinson and Boore (2006) (BA06) NGA model for the Central Himalaya, the Central Indian Peninsular Shield, and the Indo-Gangetic Alluvium Basin seismogenic source zones

(b) Campbell and Bozorgnia (2003) (CB03):

$$ \ln Y={c}_1+{f}_1\left({M}_{\mathrm{w}}\right)+{c}_4\ln \sqrt{f_2\left({M}_{\mathrm{w}},{r}_{\mathrm{seis}},S\right)}+{f}_3(F)+{f}_4(S) $$
(9)
$$ {\displaystyle \begin{array}{l}\mathrm{Where},\\ {}{f}_1\left({M}_{\mathrm{w}}\right)={c}_2{M}_{\mathrm{w}}+{c}_3{\left(8.5-{M}_{\mathrm{w}}\right)}^2,\\ {}{f}_2\left({M}_{\mathrm{w}},{r}_{\mathrm{seis}},S\right)={r}_{\mathrm{seis}}^2+g{(S)}^2{\left(\exp \left[{c}_8{M}_{\mathrm{w}}+{c}_9{\left(8.5-{M}_{\mathrm{w}}\right)}^2\right]\right)}^2,\\ {}g(S)={c}_5+{c}_6\left({S}_{\mathrm{VFS}}+{S}_{\mathrm{SR}}\right)+{c}_7{S}_{\mathrm{FR}},\\ {}{f}_3(F)={c}_{10}{F}_{\mathrm{RV}}+{c}_{11}{F}_{\mathrm{TH}},\\ {}{f}_4(S)={c}_{12}{S}_{\mathrm{VFS}}+{c}_{13}{S}_{\mathrm{SR}}+{c}_{14}{S}_{\mathrm{FR}}\end{array}} $$

SVFS = 1 (very firm soil), SSR = 1 (soft rock), SFR = 1 (firm rock), SVFS = SSR = SFR = 0 (firm soil), FTH = 1 (thrust faulting), FRV = 1 (reverse faulting), and FRV = FTH = 0 (strike-slip and normal faulting). Mw represents the moment magnitude and rseis represents the closest distance to seismogenic rupture. According to Campbell and Bozorgnia (2003), the nonlinear site effects inherent in large ground motion on firm soil do not permit a significant increase in ground motion over the hanging wall effect. Moreover, the hanging wall effect dies out for rseis < 8 km, or sooner if rjb ≥ 5 km or δ ≥ 70°. Hence, in the present scenario, the hanging wall effect is not considered and the prediction equation has been modified after Campbell and Bozorgnia (2003) to generate the next generation attenuation model suitable for the entire Indo-Gangetic Foredeep Tectonic Province. The regression coefficients of the NGA models worked out for the Central Himalaya, the Central Indian Peninsular Shield, and the Indo-Gangetic Alluvium Basin seismogenic zones contributing to the seismic hazard of the Indo-Gangetic Foredeep Tectonic Province using the fundamental Eq. (9) are given in Table 9.

Table 9 Regression coefficients obtained by considering the Campbell and Bozorgnia (2003) (CB03) NGA model for the Central Himalaya, the Central Indian Peninsular Shield, and the Indo-Gangetic Alluvium Basin seismogenic source zones

For establishing the accuracy of these six NGA models worked out for the Indo-Gangetic Foredeep Tectonic Province, we compared in Fig. 17 the PGA values of the predicted BA06 NGA models with the simulated ones in the Central Himalaya, the Indo-Gangetic Alluvium Basin, and the Central Indian Peninsular Shield seismogenic zones with a satisfactory agreement prevailing among all the three seismogenic source zones.

Fig. 17

The blue dots represent the simulated PGA and the red dots represent the predicted PGA from the predicted NGA models BA06 for a the Indo-Gangetic Alluvium Basin, b the Central Himalaya, and c the Central Indian Peninsular Shield seismogenic sources

The predicted NGA BA06 and CB03 models have further been validated using both PGA and PSA residual assessment following the formulations,

$$ \mathrm{residual}={\log}_{10}\left(\frac{Y_{\mathrm{os}}}{Y_{\mathrm{p}}}\right) $$
(10)

Where Yos is the simulated PGA/PSA and Yp is the estimated PGA/PSA from the empirical attenuation relations (BA06 and CB03 in this case). Residual plots for PGA as a function of fault distance for the NGA BA06 models for the Central Himalaya, the Indo-Gangetic Alluvium Basin, and the Central Indian Peninsular Shield seismogenic sources are presented in Fig. 18. It is evident that the residuals have a zero mean and are uncorrelated with fault distance. The residual analysis of PGA and PSA thus indicates that the NGA models derived in the present investigation are unbiased with respect to both magnitude and fault distance. They can, therefore, be used in a logic tree framework for seismic hazard assessment along with other ground motion prediction equations already available for the IGF and its adjoining region and those available for similar tectonic setups.

Fig. 18

Residuals of PGA with respect to fault distance for a the Indo-Gangetic Alluvium Basin, b the Central Himalaya, and c the Central Indian Peninsular Shield seismogenic sources

Apart from the NGA models worked out as a part of this investigation, we also incorporated some regional and global prediction models, based on a suitability test performed on each such model, for the estimation of seismic hazard of the region. Altogether, we adopted 15 GMPEs as given in Table 10. The coefficients of the nine GMPEs already available for the region have been adopted from their original publications. GMPEs are selected and ranked through the ‘efficacy test’ proposed by Scherbaum et al. (2009), which makes use of the average sample log-likelihood (LLH) for ranking. The method has been tested successfully by Delavaud et al. (2009) and applied in the Indian context by Nath and Thingbaijam (2011a) and Anbazhagan et al. (2015a). The LLH is computed as,

$$ \mathrm{LLH}=-\frac{1}{N}\sum \limits_{i=1}^N{\log}_2\left(g\left({x}_i\right)\right) $$
(11)

Where xi represents the observed data for i = 1,... N. The parameter N is the total number of events and g(xi) is the likelihood that model g has produced the observation xi. In this case, g is the probability density function given by a GMPE to predict the observation produced by an earthquake with magnitude M at a site i that is located at a distance R from the source.

Table 10 Selected ground motion prediction equations for PSHA of any cities and urban centers located in the Indo-Gangetic Foredeep region
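The LLH of Eq. (11) can be computed with the short Python sketch below. It assumes that each GMPE supplies a normal density in natural-log ground motion, with a median prediction and a standard deviation for every observation; the function name and the sample values are hypothetical, and the study itself ranks the models against macroseismic intensity data.

```python
import numpy as np

def llh(observed_ln_y, predicted_ln_y, sigma_ln_y):
    """Eq. (11): negative average log2-likelihood of the observations
    under a normal density centred on the model prediction."""
    obs = np.asarray(observed_ln_y, dtype=float)
    pred = np.asarray(predicted_ln_y, dtype=float)
    sigma = np.asarray(sigma_ln_y, dtype=float)
    z = (obs - pred) / sigma
    density = np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * sigma)
    return float(-np.mean(np.log2(density)))

# Hypothetical observations and two candidate models with the same sigma.
obs = np.log([0.05, 0.10, 0.20, 0.08])
model_a = np.log([0.06, 0.09, 0.22, 0.07])   # close to the data
model_b = np.log([0.15, 0.30, 0.60, 0.25])   # systematically too high
llh_a = llh(obs, model_a, 0.6)
llh_b = llh(obs, model_b, 0.6)
```

The model with the smaller LLH is the one under which the observations are more probable, which is why it receives the higher rank.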

The smaller the value of LLH, the higher is the ranking index of the GMPE. The ranking analysis has been carried out using macroseismic intensity data (Martin and Szeliga 2010; https://earthquake.usgs.gov/) and the PGA–European Macroseismic Scale (EMS, Grünthal 1998) relation at rock sites as given in Nath and Thingbaijam (2011a). Figure 19 presents the intensity as a function of distance for the indicated earthquakes derived from the ground motion prediction equations. The individual normalized weights of each GMPE have been derived by preparing a pairwise comparison matrix (Saaty 1980, 2000). The ranking analysis has been performed based on LLH values along with the weight assigned to each GMPE for the Central Himalaya, the Indo-Gangetic Alluvium Basin, and the Central Indian Peninsular Shield seismogenic sources for the IGF as illustrated in Table 11.

Fig. 19

The intensity as a function of distance for the indicated earthquakes derived from the ground motion prediction equations for suitability testing of GMPEs for a the Central Indian Peninsular Shield, b the Central Himalaya, and c the Indo-Gangetic Alluvium Basin seismogenic sources

Table 11 The weights and ranks assigned to respective GMPEs based on the average LLH ranking in the Central Indian Peninsular Shield, the Central Himalaya, and the Indo-Gangetic Alluvium Basin seismogenic zones

4.1.7 PSHA logic tree framework for the Indo-Gangetic Foredeep region

The seismic hazard at a particular site is usually quantified in terms of level of ground shaking observed in the region. The methodology for probabilistic seismic hazard analysis incorporates how often the annual rate of ground motion exceeds a specific value for various return periods of hazard at a particular site of interest. In the hazard computation, all the relevant sources and possible earthquake events are considered. A synoptic probabilistic seismic hazard model is generated at engineering bedrock based on the protocol given by Nath and Thingbaijam (2012), Nath et al. (2014), Adhikari and Nath (2016), and Maiti et al. (2017). The basic methodology of the probabilistic seismic hazard analysis involves computation of ground motion thresholds that are exceeded with a mean return period of say 475 years/2475 years at a particular site of interest. The effects of all the earthquakes of different sizes occurring at various locations for all the seismogenic sources at various probabilities of occurrences are integrated into one curve that shows the probability of exceeding different levels of a ground motion parameter at the site during a specified time period. The computational formulation as developed by Cornell (1968), Esteva (1970), and McGuire (1976) is given as,

$$ \nu \left(a>A\right)=\sum \limits_i{\lambda}_i{\int}_m{\int}_r{\int}_{\delta }P\left(a>A|m,r,\delta \right){f}_m(m){f}_r(r){f}_{\varDelta}\left(\delta \right)\mathrm{d}m\mathrm{d}r\mathrm{d}\delta $$
(12)

where ν(a > A) is the annual frequency of exceedance of ground motion amplitude A, λi is the annual activity rate of the ith seismogenic source for a threshold magnitude, and the function P yields the probability of the ground motion parameter a exceeding A for a given magnitude m, source-to-site distance r, and residual δ. The corresponding probability density functions are represented by fm(m), fr(r), and fΔ(δ). The probability density function for the magnitudes is generally derived from the GR relation (Gutenberg and Richter 1944). In practice, this relationship is truncated at lower and upper magnitude values, which are the truncation parameters corresponding to the minimum (mmin) and maximum (Mmax) magnitudes obtained by different methods. The present implementation makes use of the density function given by Bender (1983) as,

$$ {f}_m(m)=\frac{\beta \exp \left[-\beta \left(m-{m}_{\mathrm{min}}\right)\right]}{1-\exp \left[-\beta \left({M}_{\mathrm{max}}-{m}_{\mathrm{min}}\right)\right]} $$
(13)

Where β = b ln(10), and b refers to the b value of the GR relation. The distribution is bounded within a minimum magnitude mmin and a maximum magnitude Mmax. fΔ(δ) is the probability density function (in lognormal distribution) associated with the standard deviation of the residuals in GMPE. The GMPEs are described as relationships between a ground motion parameter ‘Y’ (i.e., PGA, PGV, or PSA at different periods), earthquake magnitude ‘M,’ source-to-site distance ‘R,’ and uncertainty or residual (δ) as,

$$ \ln (Y)=f\left(M,R\right)+\delta $$
(14)

The ground motion uncertainty δ is modeled as a normal distribution with a standard deviation, σln,y. Hence, the above equation can be expressed as,

$$ \ln (Y)=f\left(M,R\right)+\varepsilon {\sigma}_{\ln, y} $$
(15)

Where ε is the normalized residual, which follows a standard normal distribution, and σln,y is the standard deviation associated with the GMPE. In the PSHA formulation given in Eq. (12), δ is the residual associated with each GMPE. The probability density function fΔ(δ) follows a lognormal distribution that can be expressed as,

$$ {f}_{\varDelta}\left(\delta \right)=\frac{1}{\sqrt{2\pi }{\sigma}_{\ln, y}}\exp \left[-\frac{{\left(\ln y-\ln {y}_{mr}\right)}^2}{2{\sigma}_{\ln, y}^2}\right] $$
(16)

where lnymr = f(M, R) is the functional form of the prediction model in terms of magnitude and distance. Ground motion variability constitutes aleatory uncertainty intrinsic to the definition of GMPEs and, consequently, to that of PSHA. Computations based only on the median ground motions, ignoring the associated variability, are known to underestimate the hazard, especially at low annual frequencies of exceedance (Bommer and Abrahamson 2006). Values of εmax ranging from 2 to 4 have often been employed in probabilistic seismic hazard estimations (e.g., Marin et al. 2004). However, truncation at εmax < 3 has been suggested to be inappropriate (e.g., Bommer and Abrahamson 2006). In the present study, truncation at εmax = 4 is considered to be pragmatic and is implemented uniformly for all the GMPEs.
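A reduced version of the hazard integral in Eq. (12) is sketched below for a single source at a fixed distance, so that only the magnitude integral and the truncated ground motion variability remain. The toy GMPE, its coefficients, the value of σ, and all function names are hypothetical and serve only to make the example runnable; they are not the models of Tables 8 to 10.

```python
import numpy as np
from math import erf, sqrt

def magnitude_pdf(m, b, m_min, m_max):
    """Eq. (13): truncated exponential magnitude density."""
    beta = b * np.log(10.0)
    return beta * np.exp(-beta * (m - m_min)) / (1.0 - np.exp(-beta * (m_max - m_min)))

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def prob_exceed(ln_a, ln_median, sigma, eps_max=4.0):
    """P(a > A | m, r) for a normal residual truncated at +/- eps_max."""
    eps = (ln_a - ln_median) / sigma
    if eps >= eps_max:
        return 0.0
    if eps <= -eps_max:
        return 1.0
    return (norm_cdf(eps_max) - norm_cdf(eps)) / (norm_cdf(eps_max) - norm_cdf(-eps_max))

def toy_gmpe_ln_pga(m, r_km):
    """Hypothetical median model, ln PGA (g); not a published GMPE."""
    return -3.5 + 0.9 * m - 1.2 * np.log(r_km + 10.0)

def annual_exceedance_rate(a_g, rate_min, b, m_min, m_max, r_km, sigma=0.6, n=200):
    """Midpoint-rule integration of Eq. (12) over magnitude for one source."""
    edges = np.linspace(m_min, m_max, n + 1)
    mids = 0.5 * (edges[:-1] + edges[1:])
    dm = edges[1] - edges[0]
    total = 0.0
    for m in mids:
        p = prob_exceed(np.log(a_g), toy_gmpe_ln_pga(m, r_km), sigma)
        total += p * magnitude_pdf(m, b, m_min, m_max) * dm
    return rate_min * total

# Hazard curve ordinates for a hypothetical source 50 km from the site.
levels = (0.01, 0.1, 0.5)
curve = [annual_exceedance_rate(a, 0.5, 0.9, 4.0, 8.5, 50.0) for a in levels]
```

In a full computation the same integral is evaluated for every source and rupture location on the grid, and the contributions are summed as in Eq. (12).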

The distance probability function fr(r) represents the probability of occurrence of a given earthquake at a distance in the range (r, r + dr). In the present analysis, instead of considering probability function for the source-to-site distance distinctively, we have implemented gridded point locations within the source zone, where finite-fault ruptures are constructed based on the rupture dimensions estimated for each magnitude.

The hazard computation is performed using a Poisson occurrence model given by Eq. (17) below on grid points covering the entire study region at a spacing of 0.0005° × 0.0005°.

$$ P=1-{e}^{-\lambda t} $$
(17)

Where λ is the rate of occurrence of the event (annual activity rate) and t is the time period of exceedance. With this, the annual rate of exceedance for an event with 10% probability in 50 years is given by,

$$ \lambda =-\left[\ln \left(1-0.1\right)/50\right]=0.0021/\mathrm{year} $$
(18)
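The arithmetic of Eqs. (17) and (18) can be checked with a short Python sketch. The function name is illustrative only; the inputs are the 10% and 2% probabilities of exceedance in 50 years discussed in this study.

```python
import math


def annual_rate(prob_exceedance, exposure_years):
    """Invert P = 1 - exp(-lambda * t) for the annual rate lambda."""
    return -math.log(1.0 - prob_exceedance) / exposure_years


rate_10pct = annual_rate(0.10, 50.0)
rate_2pct = annual_rate(0.02, 50.0)

# The return period is the reciprocal of the annual rate of exceedance
print(f"10% in 50 yr: rate = {rate_10pct:.4f}/yr, return period = {1.0 / rate_10pct:.0f} yr")
print(f" 2% in 50 yr: rate = {rate_2pct:.5f}/yr, return period = {1.0 / rate_2pct:.0f} yr")
```

The reciprocals of these rates give the 475-year and 2475-year return periods quoted for the two probability levels.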

A logic tree framework depicted in Fig. 20 is employed in the computation of probabilistic seismic hazard for the capital cities of Patna and Lucknow and the famous Hindu religious city of Varanasi at 0.0005° × 0.0005° grid resolutions to incorporate multiple models in the source considerations, GMPEs, and seismicity parameters. In the present study, the seismogenic sources, i.e., tectonic and layered polygonal sources, are assigned weights equal to 0.60 and 0.40, respectively. The three derivatives for the threshold magnitude of Mw 3.5, 4.5, and 5.5 are assigned weights equal to 0.20, 0.35, and 0.45, respectively. The seismicity model parameters, namely the annual rate of earthquakes λ(m) and β pair, are assigned weights of 0.36, while the respective ± 1 standard deviation gets weight equal to 0.32. Similar weight allotment is performed for Mmax. The weights are allocated following the statistical rationale suggested by Grünthal and Wahlström (2006). In order to define appropriate weights, the percentage of probability mass in a normal distribution for the mean value and ± 1 standard deviation are considered corresponding to the center of two equal halves.
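The weight bookkeeping described above can be sketched in Python as follows. The sketch assumes, purely for illustration, that every option on one node combines with every option on the others (a full cross-product); the actual nesting of branches is the one shown in Fig. 20, and the GMPE weights are omitted because they are not listed in this paragraph. All variable names are hypothetical.

```python
from itertools import product

# Branch weights quoted in the text
source_weights = {"tectonic": 0.60, "layered_polygonal": 0.40}
threshold_weights = {"Mw3.5": 0.20, "Mw4.5": 0.35, "Mw5.5": 0.45}
seismicity_weights = {"mean": 0.36, "plus_1sd": 0.32, "minus_1sd": 0.32}
mmax_weights = {"mean": 0.36, "plus_1sd": 0.32, "minus_1sd": 0.32}

nodes = [source_weights, threshold_weights, seismicity_weights, mmax_weights]

# Each end branch carries the product of the weights along its path
branches = []
for combo in product(*(node.items() for node in nodes)):
    labels = tuple(label for label, _ in combo)
    weight = 1.0
    for _, w in combo:
        weight *= w
    branches.append((labels, weight))

total_weight = sum(w for _, w in branches)
print(len(branches), "end branches; total weight =", round(total_weight, 12))


def weighted_mean_hazard(branch_values):
    """Weighted mean of per-branch hazard values (same order as branches)."""
    return sum(w * v for (_, w), v in zip(branches, branch_values))
```

Because the weights on each node sum to one, the end-branch weights also sum to one, so the weighted mean is a proper average of the branch results.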

Fig. 20

A logic tree formulation for probabilistic seismic hazard computation at each node of the region gridded at 0.0005° × 0.0005° interval

4.2 Surface-consistent probabilistic seismic hazard modeling

In the present study, site response for both short and long periods is taken from the site coefficients provided by IBC (2006) for the NEHRP site classes in the IGF. Representative values for the city of Lucknow are shown in Table 12; they agree satisfactorily with the regional site amplification factors derived through geophysical and geotechnical investigation for the same city by Anbazhagan et al. (2010). Convolving these site factors with the firm rock level PGA and PSA values yields the surface-consistent probabilistic seismic hazard of the cities of Patna, Lucknow, and Varanasi for a return period of 475 years.

Table 12 Summary of site amplification factor given by IBC (2006) for NEHRP site classes and site amplification factors derived through geophysical and geotechnical investigation for Lucknow City by Anbazhagan et al. (2010)

4.3 Seismic damage modeling

The damage probability in various socioeconomic clusters of the cities of Patna, Lucknow, and Varanasi has been estimated as a function of a given ground motion parameter to evaluate the building performance for a particular seismic event. The computation uses SELENA, an open-source MATLAB-based package developed by NORSAR (Norwegian Seismic Array)/ICG (International Center for Geohazards, Norway) and the University of Alicante, Spain, for systematic seismic risk assessment using the capacity spectrum method. The methodology consists of (i) classification of buildings in different model building types as per FEMA nomenclature, (ii) development of uniform hazard response spectra for each socioeconomic cluster, (iii) definition of capacity and fragility curve for each model building type, and (iv) assessment of discrete damage probability according to different damage states. The detailed computational workflow, as used in SELENA, is presented in Fig. 21.

Fig. 21

Computational framework used in SELENA for structural damage assessment (modified after Molina et al. 2010; Nath 2016; Ghatak et al. 2017)

4.3.1 Definitions of major building typologies in the cities of Patna, Lucknow, and Varanasi

The rapid visual screening (RVS) procedure has been developed to identify and screen buildings that are potentially seismically hazardous (FEMA 2000). The RVS procedure is based on a sidewalk survey of a building and a data collection form, which the person conducting the survey completes from visual observation of the building and a set of questionnaires on the form. Based on the RVS and building characteristics, we have selected six model building types in the cities of Patna, Lucknow, and Varanasi, which are described as ‘IGW-RCF2IL (PAGER/FEMA:C1L),’ ‘IGW-RCF21M (PAGER/FEMA:C1M),’ ‘IGW-RCF11L (PAGER/FEMA:C3L),’ ‘IGW-RCF21H (PAGER/FEMA:C1H),’ ‘IGW-RCF11M (PAGER/FEMA:C3M),’ and ‘PAGER/FEMA:C3H’ based on the construction, height, and number of stories, following the PAGER/FEMA (2000) and Pathak et al. (2015) nomenclature illustrated in Table 13.

Table 13 The model building types identified in the cities of Patna, Lucknow, and Varanasi as per FEMA (2000), WHE-PAGER (2008), and Pathak et al. (2015)

4.3.2 Structural damage assessment using the capacity spectrum method

The capacity spectrum method (CSM) is a nonlinear static analysis, which compares the capacity curve of a structure in terms of force and displacement with the seismic response spectrum (Freeman 1978). It consists of steps like generation of the capacity spectrum, computation of the design response spectrum, and determination of performance point. Structural capacity is represented by a force–displacement curve. A pushover analysis is performed for a structure with increasing lateral forces, representing the inertial forces of the structure under seismic demand. The process is continued till the structure becomes unstable.

Capacity curve

A building capacity curve is a plot of lateral load resistance as a function of characteristic lateral displacement (Yeh et al. 2000). It can be derived from a plot of base shear versus roof displacement when the building is subjected to equivalent static forces. The building capacity curve has three control points: design, yield, and ultimate capacity. A building is typically assumed to deform beyond the ultimate point without loss of stability, but the structural system provides no additional resistance to lateral load. Figure 22 depicts the capacity curves for IGW-RCF2IL (PAGER/FEMA:C1L), IGW-RCF21M (PAGER/FEMA:C1M), and IGW-RCF21H (PAGER/FEMA:C1H) model building types as obtained from NIBS (2002).

Fig. 22

Representative capacity curves for IGW-RCF2IL (PAGER/FEMA:C1L), IGW-RCF21M (PAGER/FEMA:C1M), and IGW-RCF21H (PAGER/FEMA:C1H) model building types (adopted from NIBS 2002)

Seismic demand input

The design response spectrum (DRS) is defined as the smoothened plot of maximum acceleration as a function of frequency or time period of vibration for specific damping ratio for earthquake excitations at the base of a single degree of freedom system (Nath 2016). The earthquake actions are represented in the form of a design response spectrum in terms of PGA and PSA. The scheme given by IBC (2006, 2009) scales the design spectrum corresponding to the short and long periods, respectively, as presented in Table 12. The computational procedure for the design response spectrum is given in the Appendix. The spectral displacement has been calculated for the assessment of ultimate capacity of the building as,

$$ {S}_{\mathrm{D}}=9.8\times {S}_{\mathrm{A}}\times {T}^2 $$
(19)

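Where SD is the spectral displacement in inches, SA is the spectral acceleration in g, and T is the time period in seconds; the factor 9.8 approximates g/(4π²) with g expressed in inches per second squared.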
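As a quick check on Eq. (19), the Python sketch below compares the rounded coefficient 9.8 with the exact conversion S_D = S_A·g·T²/(4π²), taking g in inches per second squared. The function names and the sample inputs (S_A = 0.30 g, T = 1.0 s) are hypothetical.

```python
import math

# Standard gravity converted from m/s^2 to in/s^2 (1 inch = 0.0254 m)
G_IN_PER_S2 = 9.80665 / 0.0254


def spectral_displacement_in(sa_g, period_s):
    """Eq. (19): spectral displacement in inches from SA in g and T in s."""
    return 9.8 * sa_g * period_s ** 2


def spectral_displacement_exact_in(sa_g, period_s):
    """Same conversion without rounding the coefficient g / (4 pi^2)."""
    return sa_g * G_IN_PER_S2 * period_s ** 2 / (4.0 * math.pi ** 2)


sd_rounded = spectral_displacement_in(0.30, 1.0)
sd_exact = spectral_displacement_exact_in(0.30, 1.0)
print(f"Eq. (19): {sd_rounded:.3f} in, exact: {sd_exact:.3f} in")
```

The two values differ by well under 1%, so the rounded coefficient is adequate for the damage computation.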

Fragility curve

The fragility curves express the probability of structural damage due to earthquakes as a function of ground motion indices viz. PGA and PSA. For the computation of damage probabilities, vulnerability curves or fragility curves for five damage states are essential, which are developed as lognormal probability distribution of damage from the capacity curve. The fragility curve of a particular building can be constructed by (i) selecting earthquake ground motion in terms of PGA and PSA, (ii) defining five limiting states for discrete damage levels as per FEMA guidelines (FEMA 1999), (iii) analyzing building response using inelastic dynamic analysis, and (iv) conducting risk analysis to obtain the probability of exceeding various limiting states. In the present study, fragility curves for all model building types have been adopted from NIBS (2002) as listed in Table 14. For an expected displacement, cumulative probabilities are defined to obtain discrete damage probabilities of a structure in terms of ‘none,’ ‘slight,’ ‘moderate,’ ‘extensive,’ and ‘complete.’ The representative fragility curves for IGW-RCF2IL (PAGER/FEMA:C1L), IGW-RCF21M (PAGER/FEMA:C1M), and IGW-RCF21H (PAGER/FEMA:C1H) model building types are presented in Fig. 23.

Table 14 Fragility curve parameters for each model building type defined by NIBS (2002)
Fig. 23

Representative fragility curves for IGW-RCF2IL (PAGER/FEMA:C1L), IGW-RCF21M (PAGER/FEMA:C1M), and IGW-RCF21H (PAGER/FEMA:C1H) model building types (adopted from NIBS 2002)

Determination of performance point for the computation of discrete damage probability

The peak building response at the point of interaction of the capacity curve and the design response spectrum is used with fragility curve for the estimation of damage state probability. The cumulative damage probabilities of all the model building types in terms of none, slight, moderate, extensive, and complete have been calculated by,

$$ p\left[\mathrm{ds}|{S}_{\mathrm{d}}\right]=\varPhi \left[\frac{1}{\beta_{\mathrm{ds}}}\ln \left(\frac{S_{\mathrm{d}}}{{\overline{S}}_{\mathrm{d},\mathrm{ds}}}\right)\right] $$
(20)

Where p[ds| Sd] = probability of being in or exceeding a damage state, ds; Sd = given spectral displacement (inches); \( {\overline{S}}_{\mathrm{d},\mathrm{ds}} \) = median value of Sd at which the building reaches the threshold of the damage state ds; βds = lognormal standard deviation of spectral displacement of damage state, ds; and Φ = standard normal cumulative distribution function. Both \( {\overline{S}}_{\mathrm{d},\mathrm{ds}} \) and βds depend on a building type and its seismic design level.
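The step from the cumulative probabilities of Eq. (20) to discrete damage state probabilities can be sketched in Python as below. The median displacements and β values are placeholders chosen for illustration only; they are not the NIBS (2002) parameters of Table 14, and the function names are hypothetical.

```python
import math


def std_normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cumulative_damage_prob(sd, median_sd, beta):
    """Eq. (20): probability of reaching or exceeding a damage state."""
    return std_normal_cdf(math.log(sd / median_sd) / beta)


def discrete_damage_probs(sd, params):
    """Difference successive cumulative probabilities into discrete ones.

    params is a list of (state, median_sd, beta) ordered from the
    least to the most severe damage state.
    """
    cum = [cumulative_damage_prob(sd, m, b) for _, m, b in params]
    probs = {"none": 1.0 - cum[0]}
    for i, (state, _, _) in enumerate(params):
        next_cum = cum[i + 1] if i + 1 < len(cum) else 0.0
        probs[state] = cum[i] - next_cum
    return probs


# Placeholder fragility parameters (inches, dimensionless), NOT from Table 14
placeholder_params = [
    ("slight", 1.0, 0.9),
    ("moderate", 2.0, 0.9),
    ("extensive", 5.0, 0.9),
    ("complete", 12.0, 0.9),
]

damage = discrete_damage_probs(2.0, placeholder_params)
for state, p in damage.items():
    print(f"{state:>9s}: {p:.3f}")
```

The five discrete probabilities sum to one by construction. When the β values differ strongly between damage states, the cumulative curves can cross and a difference may turn negative, which would need separate handling.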

5 Results and discussion

The hazard distribution is estimated for the source zones at all the hypocentral depth ranges of 0–25, 25–70, and 70–180 km separately and thereupon integrated to obtain the total hazard value. Hazard curves exhibit the probability of exceeding different levels of a ground motion parameter at a particular site of interest. Figure 24 depicts the seismic hazard curves for the cities of Patna, Lucknow, and Varanasi corresponding to PGA and PSA at 0.2 and 1.0 s, respectively, at engineering bedrock. Both 10 and 2% probabilities of exceedance in 50 years, corresponding to return periods of 475 and 2475 years, respectively, are demarcated by dotted lines in the diagram for firm rock condition.

Fig. 24

Annual frequency of exceedance versus ground acceleration plots usually termed as seismic hazard curves for the selected locations like a Danapur in Patna City, b Aliganj in Lucknow City, and c BHU in Varanasi City for peak and spectral accelerations at 0.2 and 1.0 s for uniform firm rock site condition. Both 10 and 2% probabilities of exceedance in 50 years have been demarcated by horizontal dotted lines in each plot
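Reading a design ground motion off a hazard curve at a target annual frequency of exceedance can be sketched in Python as follows. Log-log interpolation between tabulated points is an assumed convention here, not a procedure stated in this study, and the hazard curve in the example is synthetic (a power law), not one of the curves in Fig. 24.

```python
import math


def ground_motion_at_rate(levels, rates, target_rate):
    """Log-log interpolate a hazard curve at a target annual rate.

    levels: ground motion levels in increasing order (e.g., PGA in g)
    rates:  annual frequencies of exceedance, strictly decreasing
    """
    pairs = list(zip(levels, rates))
    for (x0, r0), (x1, r1) in zip(pairs, pairs[1:]):
        if r0 >= target_rate >= r1:
            frac = (math.log(r0) - math.log(target_rate)) / (
                math.log(r0) - math.log(r1)
            )
            return math.exp(math.log(x0) + frac * (math.log(x1) - math.log(x0)))
    raise ValueError("target rate lies outside the tabulated hazard curve")


# Synthetic hazard curve: rate = 1e-4 * level**-2 (illustration only)
synthetic_levels = [0.05, 0.10, 0.20, 0.40]
synthetic_rates = [1e-4 * x ** -2 for x in synthetic_levels]

# Target rate for 10% probability of exceedance in 50 years, Eq. (18)
target = -math.log(1.0 - 0.10) / 50.0
pga_475 = ground_motion_at_rate(synthetic_levels, synthetic_rates, target)
print(f"Synthetic 475-year PGA = {pga_475:.3f} g")
```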

The seismic hazard maps of Patna City corresponding to the spatial distribution of PGA and PSA at 0.2, 0.3, and 1.0 s for 10% probability of exceedance in 50 years are depicted in Fig. 25, which exhibits a PGA variation of 0.138 to 0.149 g. The regions of Takiapar, Panapur Taufir, Sadikpur, and Bahadurpur are placed in the higher hazard zone, while a moderate hazard level is associated with the regions of Danapur, Deedarganj, Gardanibagh, and Ramkrishna Nagar. A low hazard level of PGA 0.138 g is observed in the southern part of the city encompassing areas of Anisabad, Ranipur, Murlichack, and Chhoti Badalpura. The PSA distribution for the short period of 0.2 s exhibits a variation between 0.207 and 0.238 g, and at 0.3 s, it is seen to vary from 0.204 to 0.232 g, while for a longer period of 1.0 s, spectral acceleration is seen to vary from 0.068 to 0.090 g.

Fig. 25

Seismic hazard distribution maps of Patna City in terms of PGA and PSA at 0.2, 0.3, and 1.0 s for 10% probability of exceedance in 50 years at firm rock site condition

Figure 26 depicts the seismic hazard maps of Lucknow City corresponding to the spatial distribution of PGA and PSA at 0.2, 0.3, and 1.0 s for 10% probability of exceedance in 50 years with a return period of 475 years, exhibiting a PGA variation of 0.168 to 0.185 g. The regions of Janakipuram, Shivaji Puram, and Kamta are seen with a higher hazard value, while a moderate hazard level is associated with the regions of Thakurganj, Vikas Khand, and Aliganj. A low hazard level associated with PGA 0.168 g is observed in the southern part of the city encompassing areas of Eldeco II, Nilmatha, and Munnu Khera. The PSA distribution for the short period of 0.2 s exhibits a variation between 0.297 and 0.338 g, and at 0.3 s, it is seen to vary from 0.258 to 0.289 g, while for a longer period, spectral acceleration at 1.0 s ranges from 0.109 to 0.126 g.

Fig. 26

Seismic hazard distribution maps of Lucknow City in terms of PGA and PSA at 0.2, 0.3, and 1.0 s for 10% probability of exceedance in 50 years at firm rock site condition

The seismic hazard maps of Varanasi City corresponding to the spatial distribution of PGA and PSA at 0.2, 0.3, and 1.0 s for 10% probability of exceedance in 50 years with a return period of 475 years are depicted in Fig. 27, which shows a PGA variation of 0.091 to 0.109 g. The regions of Lamhi, Balirampur, BHU, and Newada are seen with a higher hazard level, while a moderate hazard level is associated with the regions of Jaitpura, Barthara, and Kurauti township areas. A low hazard level of PGA 0.091 g is observed in the northeastern part of the city encompassing the area of Hiramanpur township. The PSA distribution for the short period of 0.2 s exhibits a variation between 0.174 and 0.210 g, and at 0.3 s, it is seen to vary from 0.153 to 0.182 g, while for a longer period of 1.0 s, spectral acceleration is seen to vary from 0.039 to 0.050 g.

Fig. 27

Seismic hazard distribution maps of Varanasi City in terms of PGA and PSA at 0.2, 0.3, and 1.0 s for 10% probability of exceedance in 50 years at firm rock site condition

The results presented here indicate that the hazard distributions are significantly higher than those specified in the earlier published works as listed in Table 15. The differences in the estimated hazard distribution compared to the previously published maps can be attributed to several factors such as (a) inclusion of new NGAs developed in this study and also the employment of multiple GMPEs as appropriate for similar seismotectonic regimes globally which were not included in the earlier studies, (b) layered seismogenic source framework considerations and smoothened gridded seismicity models conforming to the variation of seismotectonic attributes with hypocentral depth, (c) depth-wise active tectonic specific source classification apart from the already considered layered polygonal sources, and (d) multiple models of activity rates for both the layered polygonal and tectonic sources based on intensive seismicity analysis.

Table 15 Comparison among all the computed PGAs reported here and estimated by other researchers for the cities of Patna, Lucknow, and Varanasi for 10% probability of exceedance in 50 years at firm rock condition

To understand the applicability of probabilistic seismic hazard to the vulnerability aspect, we calculated damage probability in various socioeconomic clusters of the cities of Patna, Lucknow, and Varanasi in relationship with the given ground motion parameters to evaluate building performance for a particular seismic event. The design response spectrum of pseudo-spectral acceleration, the peak building response, and the cumulative damage probabilities have been calculated for all the model building types based on surface-consistent ground motion and existing capacity and fragility curves. The seismic hazard maps presented in Fig. 28 correspond to the spatial distribution of PGA at the surface level, generated by convolving the firm rock maps with the site amplification factor given by IBC (2009) in compliance with the site classes in the cities of Patna, Lucknow, and Varanasi. The estimated surface-consistent PGA varies from 0.222 to 0.238 g for Patna City, from 0.257 to 0.295 g for Lucknow City, and from 0.146 to 0.172 g for Varanasi City. Thereafter, the MM intensity has been estimated from the surface-consistent probabilistic PGA using the relationship given by Wald et al. (1999). The predicted MM intensity varies from VII to VIII for both Patna City and Lucknow City, and from VI to VII for Varanasi City. In comparison, the maximum intensity observed to date from the past moderate to large earthquakes that affected the IGF region varies from MM intensity V to VIII.

Fig. 28

Seismic hazard distribution in the cities of a Patna, b Lucknow, and c Varanasi in terms of PGA spatial variation for 10% probability of exceedance in 50 years at the surface for a return period of 475 years

Figure 29 presents 5% damping design response spectra for the cities of Patna, Lucknow, and Varanasi at NEHRP site class D (Vs30 ~ 180–360 ms−1). The spectral displacement has been computed based on the intersection between the design response spectra and capacity curve of a model building type in order to locate the performance point. This performance point in conjunction with the fragility curve of the model building type estimates the damage state probability of each model building type in all the socioeconomic clusters of the cities of Patna, Lucknow, and Varanasi to generate a composite damage scenario of each of these three cities.

Fig. 29

The 5% damped design response spectra for the cities of Patna, Lucknow, and Varanasi at surface-consistent level compliant with NEHRP site class D (Vs30 ~ 180–360 ms−1)
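The intersection step described above can be sketched in Python under strong simplifying assumptions: the demand spectrum is kept at its elastic 5% damped ordinates (no iteration on effective damping), the capacity curve is piecewise linear through hypothetical yield and ultimate points, and the demand spectrum shape is synthetic. It is therefore an illustration of locating a performance point, not the SELENA procedure.

```python
def capacity_sa(sd, yield_pt, ultimate_pt):
    """Piecewise-linear capacity curve: SA (g) as a function of SD (in)."""
    dy, ay = yield_pt
    du, au = ultimate_pt
    if sd <= dy:
        return ay * sd / dy  # elastic branch through the origin
    if sd <= du:
        return ay + (au - ay) * (sd - dy) / (du - dy)
    return au  # no additional resistance beyond the ultimate point


def demand_sa(period_s):
    """Synthetic demand spectrum (g): plateau followed by a 1/T branch."""
    return min(0.5, 0.2 / period_s)


def performance_point(periods, yield_pt, ultimate_pt):
    """First crossing of demand and capacity in SD-SA coordinates."""
    prev = None
    for t in periods:
        sa = demand_sa(t)
        sd = 9.8 * sa * t ** 2  # Eq. (19), SD in inches
        gap = sa - capacity_sa(sd, yield_pt, ultimate_pt)
        if prev is not None and prev[2] > 0.0 >= gap:
            sd0, sa0, gap0 = prev
            frac = gap0 / (gap0 - gap)  # linear interpolation on the gap
            return sd0 + frac * (sd - sd0), sa0 + frac * (sa - sa0)
        prev = (sd, sa, gap)
    return None


# Hypothetical control points (SD in inches, SA in g)
periods = [0.05 + 0.01 * i for i in range(300)]
sd_pp, sa_pp = performance_point(periods, (0.5, 0.2), (5.0, 0.4))
print(f"Performance point: SD = {sd_pp:.3f} in, SA = {sa_pp:.3f} g")
```

The resulting spectral displacement is the quantity that would be passed to the fragility relation of Eq. (20).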

It has been estimated that out of 5000 buildings in Patna City, about 48% of the buildings are expected to suffer ‘moderate’ damage followed by ~ 16% ‘complete,’ ~ 22% ‘extensive,’ and ~ 8% ‘slight’ damage, and almost 6% of the buildings are seismic resistant as shown in Fig. 30a. In Lucknow City, approximately 42% of the buildings are expected to suffer ‘moderate’ damage followed by ~ 16% ‘complete,’ ~ 24% ‘extensive,’ and ~ 13% ‘slight’ damage, while approximately 5% of the buildings are seismic resistant as shown in Fig. 30b. In Varanasi City, nearly 47% of the buildings are expected to suffer ‘moderate’ damage followed by ~ 12% ‘complete,’ ~ 27% ‘extensive,’ and ~ 10% ‘slight’ damage, and about 4% of the buildings are seismic resistant as shown in Fig. 30c. The discrete damage probability shows that the ‘IGW-RCF2IL (PAGER/FEMA:C1L),’ ‘IGW-RCF21M (PAGER/FEMA:C1M),’ and ‘IGW-RCF11L (PAGER/FEMA:C3L)’ building typologies will suffer the minimum damage, while the ‘IGW-RCF21H (PAGER/FEMA:C1H),’ ‘IGW-RCF11M (PAGER/FEMA:C3M),’ and ‘PAGER/FEMA:C3H’ building typologies will experience severe destruction during a strong impending earthquake of Mw 7.2 as predicted in this study in the cities of Patna, Lucknow, and Varanasi and their adjoining region. The discrete damage probabilities for different model building types in the cities of Patna, Lucknow, and Varanasi are depicted in Fig. 31, which shows that the ‘complete’ and ‘extensive’ damage probabilities are higher in the city of Patna followed by Lucknow and Varanasi. Incidentally, it is observed that the unreinforced masonry buildings in the region are the most seismically vulnerable ones (GSI 1939; Nath 2016) and, therefore, carry a probability of ‘complete’ damage. The present study also shows that all the buildings in these cities are susceptible to damage and destruction at different levels from ‘slight’ to ‘complete’ damage states.

Fig. 30

The discrete damage states ‘ds’ considering all the model building types in the cities of a Patna, b Lucknow, and c Varanasi

Fig. 31

The discrete damage probability computed from the cumulative damage probability of ‘IGW-RCF2IL (PAGER/FEMA:C1L),’ ‘IGW-RCF21M (PAGER/FEMA:C1M),’ ‘IGW-RCF11L (PAGER/FEMA:C3L),’ ‘IGW-RCF21H (PAGER/FEMA:C1H),’ ‘IGW-RCF11M (PAGER/FEMA:C3M),’ and ‘PAGER/FEMA:C3H’ model building types in terms of ‘slight,’ ‘moderate,’ ‘extensive,’ and ‘complete’ damage states for the cities of a Patna, b Lucknow, and c Varanasi

6 Conclusion

Seismic hazard analysis has emerged as an important issue in high-risk urban centers across the globe and is considered an integral part of earthquake-induced disaster mitigation practices. Probabilistic seismic hazard analysis provides useful solutions for end users, mainly as input to seismic design. This study delivers a next generation probabilistic seismic hazard model of the cities of Patna, Lucknow, and Varanasi in the IGF with the incorporation of various seismic hazard components, namely the seismogenic source models, seismicity analysis, and ground motion prediction equations in a logic tree framework. In addition, structural damage has been estimated for the probable earthquake scenario of the cities of Patna, Lucknow, and Varanasi for a return period of 475 years with a view to possible disaster mitigation and management. The resulting high-resolution probabilistic seismic hazard maps and damage scenario will provide substantial information for the development of these cities in terms of land use planning and design of future infrastructures. The emergency response capabilities can be significantly improved to reduce casualties by rapid, selective, and effective use of provided services. Architects and civil engineers may also use this information to assess the failure risk of the existing structures and, thus, design future earthquake-resistant structures in these cities in the Indo-Gangetic Foredeep region.