1 Introduction

The only nuclear power plant site in Brazil is the Central Nuclear Almirante Álvaro Alberto (CNAAA) located close to the town of Angra dos Reis in the state of Rio de Janeiro, some 130 km southwest of the city of Rio de Janeiro, close to the boundary with São Paulo state, in southeast Brazil. The plant currently includes two operating pressurised water reactors, with a combined capacity of almost 1.9 GWe, which jointly account for about 4% of Brazil’s total electricity supply. The first unit was connected to the grid in 1985, the second in 2000. A third unit is currently under construction.

In common with global practice in the nuclear industry at the time, the original seismic hazard studies conducted for the CNAAA site in the 1970s and 1980s were performed deterministically. The resulting peak ground acceleration (PGA) at the site was estimated as 0.07g; following recommendations of the US Nuclear Regulatory Commission (NRC), a minimum value of 0.1g was adopted and used to anchor a standard response spectrum, which was essentially that proposed in Regulatory Guide 1.60 (USAEC 1973).

In the late 1990s, as required by the Brazilian regulatory body, the Comissão Nacional de Energia Nuclear (CNEN), the plant operator, Eletrobrás Eletronuclear (ETN), initiated the first probabilistic seismic hazard analysis (PSHA) for the site, specifically for the Angra 3 reactor unit. Following the Fukushima Daiichi nuclear accident caused by the Tohoku Mw 9 earthquake of 11 March 2011, in common with many nuclear power plant operators worldwide, ETN initiated a re-evaluation of the safety of the CNAAA site, particularly with regard to natural hazards. This has included a comprehensive PSHA conducted according to modern best practice, including the use of a logic-tree formulation to capture all major sources of epistemic uncertainty. This paper provides an overview of the PSHA study, which faced the challenge common to many intraplate regions of limited datasets due to the infrequent occurrence of significant earthquakes. In particular, with the short historical record of earthquakes in southeast Brazil compared to the recurrence intervals of large earthquakes and relatively low investment to date in palaeoseismology studies in this region, the seismogenic potential of geological faults in the site region is poorly determined. The study adopted an innovative approach to incorporate the potential contributions of these faults to the site hazard in a manner consistent with the current state of knowledge.

Following this brief introduction, we present the seismic setting of SE Brazil (Sect. 2) and then describe in some detail the earthquake catalogue developed for the project (Sect. 3). The seismic source characterisation (SSC) model is presented in Sects. 4 (area sources) and 5 (fault sources) and the ground-motion characterisation (GMC) model in Sect. 6. The PSHA calculations are then presented in Sect. 7 in the form of hazard curves for response spectral accelerations at multiple oscillator frequencies and uniform hazard response spectra (UHRS) constructed for specified annual frequencies of exceedance (AFE). Disaggregations of the hazard to identify the relative contributions from different earthquake scenarios are also presented. The paper concludes with a brief discussion of the main findings and possible future refinements of the hazard model for this site.

2 Seismic setting of Southeast Brazil

The CNAAA site is located in the Mantiqueira province, composed of Neoproterozoic to Early Palaeozoic fold belts, which was affected by the Atlantic rifting in the Mesozoic. The site is located in the foothills of the Serra do Mar coastal range, a relief originated from multiple Cenozoic tectonic reactivations of NE- to E-W-oriented Neoproterozoic shear zones in a continental rift context in the passive continental margin of southeastern Brazil (Riccomini et al. 2004) (Fig. 1). Thermal histories inferred from apatite fission-track (AFT) and (U-Th)/He analysis indicate important cooling events related to continental break-up in the Early Cretaceous to Cretaceous alkaline magmatism and to Cenozoic tectonism (Hiruma et al. 2010; Cogné et al. 2011; Cogné et al. 2012).

Fig. 1

Regional geologic context of the study region: 1 São Francisco Craton; 2 Brasília Belt; 3 Ribeira Belt; 4 Palaeozoic sedimentary rocks of the Paraná Basin; 3 Early Cretaceous tholeiitic volcanic rocks of the Serra Geral Formation; 4 Mesozoic to Cenozoic alkaline rocks; 5 Cenozoic basins of the Continental Rift of Southeastern Brazil (CRSB) (1 Itaboraí Basin, 2 Barra de São João Graben, 3 Macacu Basin, 4 Volta Redonda Basin, 5 Resende Basin, 6 Taubaté Basin, 7 São Paulo Basin, 8 Sete Barras Graben, 9 Pariqüera-Açu Formation, 10 Alexandra Formation and Guaraqueçaba Graben, 11 Curitiba Basin, 12 Cananéia Graben); 6 Precambrian shear zones, in part reactivated during the Mesozoic and Cenozoic. Modified from Riccomini et al. (2004) and references therein. The concentric red circles represent areas with radius of 101 and 323 km around the CNAAA site

The inversion of striae from Paleogene-Neogene faults indicated that the stress regimes in the region varied during the Cenozoic (Riccomini et al. 2004). Rifting processes, including faulting, sedimentation and magmatism, in the continent occurred during the Eocene-Oligocene, firstly as a result of reactivation of NE- to E-W-oriented shear zones as normal faults under NNW-SSE-oriented extension. After its installation, the rift system was subjected to four phases of deformation, which initiated in the early Miocene with left-lateral strike-slip and minor thrust reactivation of NE to E-W shear zones, under a general strike-slip regime with NW-SE extension and local NE-SW compression. The second phase of deformation, during the Late Pleistocene to Holocene, is recorded by right-lateral strike-slip and thrust reactivation of NE to E-W-oriented Neoproterozoic shear zones, resulting from a NW-SE compression and NE-SW extension. During the Holocene, the region has experienced a rapid change of the stress regime, initially an E-W to WNW-ESE extension responsible for the development of N-S-oriented grabens, and finally an E-W compression, which affects colluvial and alluvial deposits younger than 3410 year BP (Riccomini and Assumpção 1999; Modenesi-Gauttieri et al. 2002).

The different pulses of uplift and changes in the stress field played a major role in the development of the drainage network of this region. The major rivers in the Serra do Mar region are controlled by E-W to NE-SW-oriented basement structures, but the pulses of tectonic activity along NW-SE-oriented faults, mainly during Neogene and Quaternary, have promoted numerous river captures (Riccomini et al. 2010).

Although Brazil has a low level of seismicity compared to several other intraplate regions in the world, some significant earthquakes are worth mentioning (Fig. 2). The largest known crustal earthquake (1955, Mato Grosso state) had a magnitude of 6.2 mb (but probably Mw 6) and was felt up to 500 km away (Barros et al. 2009). There is also limited historical evidence for a magnitude ~ 7 earthquake in the Amazon in 1690 (Veloso 2014).

Fig. 2

Epicentres of the raw catalogue with magnitudes ≥ 2.0 mR for the period 1720–2015

In SE Brazil, the largest event occurred in 1922 (magnitude 5.1 mb, ~ 4.8 Mw) near Mogi-Guaçu, SP, with intensities up to VI MMI and an average felt radius of 300 km (Berrocal et al. 1984). Offshore, the largest magnitudes were 5.5 mb (Mw ~ 5.2) in 1939, off Santa Catarina state, and 6.1 mb (~ 5.8 Mw) off Espírito Santo in 1955.

3 Earthquake catalogue

A catalogue of the earthquakes that have occurred is of fundamental importance to modelling both the location and average recurrence intervals of potential future earthquakes. The Brazilian earthquake catalogue has been produced as a joint effort of the universities of São Paulo and Brasilia with additional cooperation from Federal University of Rio Grande do Norte (Natal, RN), Technological Research Institute (São Paulo, SP) and University of the State of São Paulo (SP). More details regarding the earthquake catalogue are presented by Bianchi et al. (2018).

3.1 Historical and instrumental sources

For the seismic hazard evaluation of the CNAAA site, a revision of the Brazilian catalogue was carried out, both for historical events in SE Brazil and for more recent instrumental data (Bianchi et al. 2018). For historical events in SE Brazil, within 320 km of Angra dos Reis, some events were revised with additional information, resulting in larger magnitudes. This radius reflects regulatory guidance regarding study areas for PSHAs conducted for nuclear sites (e.g., USNRC 2007; IAEA 2010) and also encompasses the more populated areas of the states of Rio de Janeiro, São Paulo and Minas Gerais that have more complete newspaper archives. New information on pre-instrumental earthquakes was obtained from historical newspapers and other documents in European archives (Paola Albini, written communication, 2015). The magnitudes of the historical events were estimated from the felt area based on the empirical relation of Assumpção et al. (2014).

Until the 1960s, the instrumental data in the Brazilian catalogue were those provided by international agencies (mainly USGS and ISC) obtained from stations in other countries. In Brazil, although the first station (RDJ, Rio de Janeiro) was installed in 1906, only in the late 1960s and the 1970s did instrumental recording of Brazilian events begin in earnest. The regional earthquake catalogue, covering a region defined by a radius of some 600 km around the CNAAA site, was revised and updated for this project. Large events from the Brazilian catalogue were re-evaluated for location (epicentre and depth) and magnitude. Small events recorded by the local station at Angra dos Reis (ESAR, which is operated on behalf of the CNAAA plant owner ETN) were also updated to help define possible limits of seismic source areas and the frequency-magnitude relation. Although regulatory guidance generally indicates a study area defined by a radius of at least ~ 300 km, we opted to extend the study area to 600 km because of the slow attenuation rates of seismic waves in Brazil. Moreover, using a larger region partly compensates for the rather sparse earthquake catalogue and allows more stable estimates of recurrence parameters. The labour-intensive historical investigations, however, were limited to a smaller area for pragmatic reasons.

3.2 Homogenisation of magnitudes

The earthquake catalogue assigns earthquake size on the regional magnitude scale, mR, which is determined from P-wave measurements from seismograph recordings at distances from about 200 to 2000 km (Assumpção 1983) and has been shown to be equivalent to the teleseismic body-wave magnitude, mb, in the range 3.5 to 5.5. The overwhelming majority of modern ground-motion prediction equations (GMPEs) are based on the moment magnitude scale, Mw (or more correctly, M), so the key task was to homogenise the measures of earthquake size in terms of Mw.

Data recorded by the Brazilian seismograph network from 1993 to 2014 have been used to perform spectral analysis of S-waves, extending the work of Drouet and Assumpção (2013). Acceleration Fourier spectra are inverted to compute source (including moment magnitude), site and propagation terms. We also included Mw values from moment-tensor inversions using waveform modelling, such as those published by Assumpção and Suárez (1988), Assumpção et al. (2011), Agurto et al. (2015), Barros et al. (2015) and Dias et al. (2016). The moment magnitudes computed in these analyses are compared with the regional mR magnitude in Fig. 3. The relationship has a gradient of unity and implies that the difference between the two magnitude scales is a constant offset of 0.34. As can be appreciated from Fig. 3, while the relationship is well constrained over the magnitude range covered by the available data, there is also considerable scatter around the regression line. This variability needs to be taken into account in the calculation of the recurrence parameters. The error on the determination of mR values from the Brazilian seismograph network is 0.2 on average (Assumpção et al. 2014), and the standard deviation of the data with respect to the regression established in Fig. 3 is 0.14; hence, combining the two in quadrature, the standard deviation on the resulting Mw values is 0.24.
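The conversion and the associated uncertainty can be illustrated with a minimal sketch. It assumes that the offset is applied as Mw = mR - 0.34 (the direction implied by the magnitude pairs quoted for the final catalogue) and that the two error terms are independent, so that they combine in quadrature. The function and constant names are illustrative only and are not taken from the project software.

```python
import math

# Values quoted in the text; the names are hypothetical.
MR_TO_MW_OFFSET = 0.34    # constant offset between mR and Mw (unit gradient)
SIGMA_MR = 0.20           # average error on network mR determinations
SIGMA_REGRESSION = 0.14   # scatter of the data about the mR-Mw regression


def mr_to_mw(m_r: float) -> float:
    """Convert regional magnitude mR to moment magnitude Mw (assumed Mw = mR - 0.34)."""
    return m_r - MR_TO_MW_OFFSET


def sigma_mw(sigma_mr: float = SIGMA_MR, sigma_reg: float = SIGMA_REGRESSION) -> float:
    """Combine two independent error terms in quadrature."""
    return math.hypot(sigma_mr, sigma_reg)


if __name__ == "__main__":
    for m_r in (2.8, 5.2):
        print(f"mR {m_r:.1f} -> Mw {mr_to_mw(m_r):.2f}")
    print(f"sigma(Mw) = {sigma_mw():.2f}")
```

Under these assumptions the combined standard deviation evaluates to about 0.24, in agreement with the value stated above.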

Fig. 3

Relationship between regional and moment magnitudes for Brazil. The Mw values come from S-wave spectral analysis (Drouet and Assumpção 2013) and waveform inversion published in the literature

3.3 De-clustering

The algorithm chosen for de-clustering is that proposed by Gardner and Knopoff (1974), and it was applied using the OpenQuake HMTK code. The algorithm applies magnitude-dependent windows of time and distance to each earthquake in the catalogue, and an event falling within such a window is considered to be part of a cluster. After application of these windows to the full catalogue, a number of clusters are defined, and then the catalogue is filtered such that only the largest earthquake within each cluster is retained. The Gardner and Knopoff (1974) algorithm was chosen for this application, despite being originally derived for application in Southern California, because it has been widely used throughout the world and has been shown not to be highly sensitive to the parameters chosen for the window (e.g., Van Stiphout et al. 2012). The method has been found to perform well even when applied on a global scale (Shearer and Stark 2012) and interestingly yields very similar results to the independently derived approach of Grünthal (1985).
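The windowing logic can be sketched as follows. This is a simplified illustration, not the OpenQuake HMTK implementation used in the study: it assumes the commonly quoted analytical fit to the Gardner and Knopoff (1974) windows, applies the same time window before and after each mainshock (a simplification), and uses a spherical-Earth great-circle distance. All function names and the event layout are hypothetical.

```python
import math


def gk_window(mag):
    """Distance (km) and time (days) windows from the commonly quoted
    analytical fit to the Gardner and Knopoff (1974) table (assumed here)."""
    dist_km = 10 ** (0.1238 * mag + 0.983)
    if mag >= 6.5:
        time_days = 10 ** (0.032 * mag + 2.7389)
    else:
        time_days = 10 ** (0.5409 * mag - 0.547)
    return dist_km, time_days


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def decluster(events):
    """events: list of dicts with keys 't_days', 'lon', 'lat', 'mag'.
    Returns one flag per event: 0 mainshock, -1 foreshock, +1 aftershock."""
    # Process events from largest to smallest so that the largest event
    # in each cluster is the one retained as the mainshock.
    order = sorted(range(len(events)), key=lambda i: -events[i]["mag"])
    flags = [None] * len(events)
    for i in order:
        if flags[i] is not None:
            continue  # already assigned to a larger event's cluster
        flags[i] = 0
        main = events[i]
        dist_win, time_win = gk_window(main["mag"])
        for j, ev in enumerate(events):
            if flags[j] is not None:
                continue
            dt = ev["t_days"] - main["t_days"]
            if abs(dt) > time_win:
                continue
            if haversine_km(main["lon"], main["lat"], ev["lon"], ev["lat"]) > dist_win:
                continue
            flags[j] = 1 if dt >= 0 else -1
    return flags


if __name__ == "__main__":
    # Synthetic events for illustration only (not from the project catalogue).
    catalogue = [
        {"t_days": 100.0, "lon": -45.0, "lat": -23.0, "mag": 5.0},
        {"t_days": 110.0, "lon": -45.0, "lat": -23.0, "mag": 3.0},
        {"t_days": 95.0, "lon": -45.0, "lat": -22.9, "mag": 3.2},
        {"t_days": 105.0, "lon": -40.0, "lat": -23.0, "mag": 3.5},
    ]
    print(decluster(catalogue))
```

The de-clustered catalogue is then the subset of events flagged 0.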

Application of the de-clustering algorithm to the initial Brazilian catalogue of 1147 events with magnitudes above 2.8 (as shown in Fig. 2) led to the removal of 309 foreshocks and aftershocks. The effect of the de-clustering algorithm on the earthquake catalogue for SE Brazil is shown in Fig. 4, where the dashed line indicates the area covered by the catalogue and the solid line the region of the seismic source zones. The final catalogue for the study region has 229 independent events with mR magnitudes in the range from 2.8 to 5.2 (Mw values from 2.46 to 4.86).

Fig. 4

Foreshocks (green) and aftershocks (red circles) identified in the earthquake catalogue for the CNAAA study region through the application of the Gardner and Knopoff (1974) de-clustering algorithm. The outermost line indicates the area covered by the catalogue, the solid line the region of the seismic source zones. The green line and the purple line show the areas used to define the catalogue of the offshore (shelf) source and Minas Gerais (MG) source, respectively. The area between purple and green lines delimits the earthquake catalogue for the coastal zone. The polygons that define the three source zones are limited by the intersections of the red line and the boundaries of the catalogues

4 Source model: diffuse seismicity

In situations where most earthquakes cannot be unambiguously associated with major geological faults, it is common to model future earthquakes in the form of diffuse seismicity. For the CNAAA site, the diffuse seismicity was modelled by the classical approach of using area source zones.

4.1 Seismic source areas

In a source zone, earthquakes are considered to be equally likely at any location and it is also assumed that recurrence rates, focal depth distributions, style-of-faulting and the maximum seismogenic potential are all constant across the source zone. In this PSHA, three alternative source zone configurations were used to model the diffuse seismicity.

Model 1 separates and defines three source zones that reflect both the observations of earthquake activity and the association with geological/tectonic structures: offshore (shelf), Minas Gerais and the coastal zone (Fig. 4). This is considered a highly credible model and is assigned a weight of 0.5. The Minas Gerais (MG) zone is an area around the São Francisco Craton, including the NE part of the Paraná Basin with the Alto Paranaiba Igneous Province (APIP) where stress concentration in the upper crust could be caused by lithospheric thinning (e.g., Rocha et al. 2011). The largest event in this area is the 1922 Mogi-Guaçu event with 5.1 mb. The coastal seismic source is the intervening, low-seismicity region between the “MG” and the “Shelf” zones. The largest event in this region is the 1861 Lorena earthquake, the magnitude of which was revised to 5.0 mb based on a much larger felt area. Most of the other larger (m ~ 4) earthquakes in this area (1886, 1917, 1946) are old historical events recorded as a result of the higher population density between the São Paulo and Rio de Janeiro provinces.

To define the limiting boundary between the offshore continental shelf and the onshore region, physiographic criteria were invoked. Most of the present deep sedimentary basins offshore started as rift basins in the Mesozoic bounded by major coast-parallel normal faults caused by crustal stretching and flexural stresses. These faults are probably reactivated, under the current compressional stresses, as reverse faults. Earthquakes offshore concentrate along the continental slope (approximately where sedimentary thicknesses are largest) and where the stretched continental crust underneath is presumably more fractured and weak. For this reason, the hinge faults can be viewed as the limit of the seismic zone near the continent. We take the NW limit of the shelf seismic zone roughly parallel to the main hinge fault, about 30 km closer to the coastline. This configuration of three individual sources is called model 1. Figure 4 shows the polygons that define this model.

Model 2 acknowledges the possibility that the seismicity in Minas Gerais may not be spatially constrained to that region, and in order to allow for the more conservative option of this concentration of seismic activity spreading to the CNAAA site, the two onshore area sources are merged in the continental zone. This is considered a credible but less likely configuration, for which reason a lower weight of 0.3 is assigned. Model 3 is the most conservative source characterisation, which removes any tectonic association of the earthquake activity, allowing both onshore and offshore earthquakes to happen anywhere. This is viewed as an unlikely model since the separation of oceanic and continental seismic activity is quite clear and also physically explained (see for instance Assumpção 1998a; Assumpção et al. 2011; Assumpção et al. 2014), but the model is included to ensure adequate capture of epistemic uncertainty. The weight assigned to this model is 0.2.

The maximum magnitude, Mmax, is the largest earthquake considered physically possible within a specific seismic source. When the seismic source is a geological fault, Mmax can be estimated from empirical relationships between rupture dimensions and moment magnitude, together with assumptions regarding how much of the total fault length could participate in a single rupture. For source zones of diffuse seismicity, the estimation of Mmax is more challenging, and a great deal of attention, including the major study of Johnston et al. (1994), has been devoted to the development of approaches for estimating Mmax in regions of low seismicity. Normal practice would also lead to SSC logic-tree branches for alternative estimates of Mmax. However, the recurrence intervals of earthquakes of Mmax tend to be extremely long, and the impact of this parameter on the hazard is generally modest, particularly for spectral accelerations at higher oscillator frequencies. For this reason, we decided to adopt single values of Mmax for each source zone to avoid inflation of the logic-tree (and hence of the calculation times) for a parameter that will not exert a strong influence on the hazard estimates. Table 1 summarises the final values of Mmax assigned to the area sources.

Table 1 Summary of maximum magnitudes for area source zones

Average recurrence rates for earthquakes of different magnitude are estimated from the earthquake catalogue. As indicated in Fig. 4, for the calculation of recurrence parameters, earthquakes were included in the source zone-specific catalogues from an area larger than the source zones to allow for uncertainty in epicentral locations, especially in the offshore region. Before calculating recurrence parameters, it is important to account for the fact that as one goes further back in time, the catalogues are likely to become incomplete since not all earthquakes will have been registered. Initial estimates of magnitude completeness levels for the project catalogue were based on “expert knowledge/experience”, which takes into account the history of population increase and deployment of seismic stations in Brazil. These initial estimates were then refined with an approach based on the time evolution of the cumulative number of events (also called cumulative visual method or CUVI, Mulargia et al. 1987), which works well with sparse earthquake catalogues.
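The curve on which this kind of visual inspection relies can be sketched as below. The sketch only builds the normalised cumulative count of events above a magnitude threshold; the choice of the year from which the curve becomes approximately linear (the start of the complete period) is still made by eye. It is an illustration under these assumptions, not a reproduction of the procedure of Mulargia et al. (1987), and the function names and sample data are hypothetical.

```python
def cumulative_curve(years, mags, m_min):
    """Return (event years, normalised cumulative count) for events with
    magnitude >= m_min, sorted in time. The count is normalised to unity."""
    selected = sorted(y for y, m in zip(years, mags) if m >= m_min)
    n = len(selected)
    return selected, [(k + 1) / n for k in range(n)]


def mean_rate_since(years, mags, m_min, start_year, end_year):
    """Mean annual rate of events with magnitude >= m_min between a candidate
    completeness year and the end of the catalogue (both inclusive)."""
    count = sum(1 for y, m in zip(years, mags)
                if m >= m_min and start_year <= y <= end_year)
    return count / (end_year - start_year + 1)


if __name__ == "__main__":
    # Synthetic data for illustration only.
    yrs = [1900, 1950, 1970, 1980, 1990, 2000]
    mgs = [4.5, 3.0, 3.6, 3.5, 3.1, 3.9]
    print(cumulative_curve(yrs, mgs, 3.5))
    print(mean_rate_since(yrs, mgs, 3.5, 1962, 2015))
```

Plotting the second list against the first for several thresholds gives curves of the kind used to judge completeness; comparing mean rates for alternative start years gives a rough numerical check on the visual pick.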

The completeness intervals for different magnitude levels were estimated together with uncertainty intervals, but only the best estimate periods were used in the derivation of recurrence parameters. Tests were performed to investigate the impact of this decision, and these showed that the uncertainty on completeness had a very small impact on the computed recurrence intervals. Investigations were also performed to explore the influence of the smallest magnitude considered in the calculation of the recurrence parameters. These tests showed that the recurrence parameters became unstable as the lower limit of magnitude was increased; consequently, a decision was taken to adopt a common minimum value of Mw 2.7 in all the calculations. Completeness levels for the whole Brazilian catalogue vary from 5.0 mb since 1960 to 3.5 mb since 1980 (Bianchi et al. 2018). The regional catalogue, however, has better completeness because of better historical coverage and a larger number of seismic stations. In the continental area, the regional catalogue is believed to be complete for magnitudes above 3.2 Mw since about 1962 and 4.2 Mw since about 1890 (Fig. 5). Separate completeness levels were determined for the oceanic areas; the results are summarised for the continental and oceanic regions in Tables 2 and 3, respectively.

Fig. 5

Earthquake record for the continental area and the estimation of completeness periods. Upper: Time evolution of the magnitudes: the solid blue line represents the initial ‘expert’ judgement regarding completeness, whereas the dashed lines represent the minimum and maximum dates as suggested by the curves in the lower plot. Lower: Cumulative number of events (normalised to unity) for magnitudes above 3.0, 3.5 and 4.0 mR; small blue marks and circles are ‘expert judgement’, minimum and maximum estimates for the inflexion point, respectively

Table 2 Completeness periods for the “continental” areas of the regional catalogue
Table 3 Completeness periods for the “oceanic” areas of the regional catalogue

For this study, the doubly truncated exponential recurrence relationship is used. Since empirical conversions have been applied to homogenise the earthquake catalogue, it is important to apply an adjustment for the effect of propagation of the uncertainty in the magnitude conversion equations. The effect of this variability in the empirical magnitude conversion relationship is to modify the activity rate of the recurrence relationship but not the b-value (Musson 2012). A number of approaches have been proposed to correct for this effect, and we adopted the method of Tinti and Mulargia (1985). The seismic activity parameters are computed using the Weichert (1980) maximum-likelihood method. In all cases, in order to capture epistemic uncertainty on seismic activity rates, three alternative recurrence relationships are derived. The first is computed directly from the data, while the second is obtained with the b-value fixed equal to 1.0, which is considered by some seismologists to be a global constant. A third model is computed with a fixed b-value mid-way between 1.0 and the b-value determined from the data (Figs. 6, 7 and 8).
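The recurrence model and the adjustment can be sketched as follows. The sketch assumes the usual cumulative form of the doubly truncated exponential relationship and the commonly quoted form of the magnitude-uncertainty correction, in which apparent rates are inflated by a factor exp(β²σ²/2) with β = b ln 10, so that the adjusted rate is the apparent rate multiplied by exp(-β²σ²/2). The use of σ = 0.24 in the example and all names and numerical inputs are assumptions for illustration; they are not the project values.

```python
import math


def truncated_exponential_rate(m, rate_mmin, b, m_min, m_max):
    """Annual rate of events with magnitude >= m for a doubly truncated
    exponential model; rate_mmin is the annual rate of events >= m_min."""
    if m >= m_max:
        return 0.0
    m = max(m, m_min)
    beta = b * math.log(10.0)
    tail = math.exp(-beta * (m_max - m_min))
    return rate_mmin * (math.exp(-beta * (m - m_min)) - tail) / (1.0 - tail)


def magnitude_uncertainty_factor(b, sigma_m):
    """Multiplicative factor applied to apparent rates to remove the bias
    caused by magnitude uncertainty (assumed form: exp(-beta^2 sigma^2 / 2))."""
    beta = b * math.log(10.0)
    return math.exp(-0.5 * (beta * sigma_m) ** 2)


def b_value_branches(b_data):
    """Three alternative b-values: from data, fixed at 1.0, and mid-way."""
    return b_data, 1.0, 0.5 * (b_data + 1.0)


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    b_data, rate_mmin, m_min, m_max, sigma = 0.85, 0.5, 2.7, 6.5, 0.24
    for b in b_value_branches(b_data):
        factor = magnitude_uncertainty_factor(b, sigma)
        rate_m4 = truncated_exponential_rate(4.0, rate_mmin * factor, b, m_min, m_max)
        print(f"b = {b:.3f}: factor = {factor:.3f}, rate(M >= 4.0) = {rate_m4:.4f} per year")
```

Because the factor depends only on b and σ, it rescales the whole curve without changing its slope, which is consistent with the statement that the adjustment modifies the activity rate but not the b-value. In practice the activity rate would be re-estimated for each fixed b-value rather than held constant as in this example.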

Fig. 6

Recurrence relationships for the continental shelf (offshore) source of model 1. The dashed lines are as derived from the data, the solid lines after application of the Tinti and Mulargia (1985) adjustment. The data are plotted on the x-axis at the lower limit of 0.1 unit bins of magnitude

Fig. 7

Recurrence relationships for the Minas Gerais source of model 1. The dashed lines are as derived from the data, the solid lines after application of the Tinti and Mulargia (1985) adjustment

Fig. 8

Recurrence relationships for the coastal source of model 1. The dashed lines are as derived from the data, the solid lines after application of the Tinti and Mulargia (1985) adjustment

Each source zone has been assigned a maximum magnitude, and three alternative recurrence relationships have been derived for each source zone (except for Minas Gerais in Model 1 where only the b = 1 branch is included since the distance of this source from the site renders the results insensitive to the uncertainty). In terms of the recurrence relationships for each source, we believe that it is logical to assign the highest weight to the recurrence parameters derived from the earthquake data. The other two branches are given equal weights since we have no basis for preferring one over the other. The final distribution of weights agreed upon by the team was a weight of 0.4 on the data-driven recurrence parameters and 0.3 on each of the other two branches. Tables 4, 5 and 6 summarise the recurrence parameters for the three SSC models; other details of the SSC models, including the coordinates of the vertices of all the source zones, are provided in the electronic supplement to the paper.
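The weights quoted above can be combined as in the following sketch. It assumes, purely for illustration, that a single recurrence branch is applied jointly to all zones of a source model, and it ignores the Minas Gerais exception in model 1; neither simplification is stated in the text. The branch labels and the hazard values in the example are hypothetical.

```python
SOURCE_MODEL_WEIGHTS = {"model 1": 0.5, "model 2": 0.3, "model 3": 0.2}
RECURRENCE_WEIGHTS = {"b from data": 0.4, "b = 1.0": 0.3, "b mid-way": 0.3}


def enumerate_branches():
    """Return {(source model, recurrence branch): combined weight}."""
    return {
        (sm, rec): w_sm * w_rec
        for sm, w_sm in SOURCE_MODEL_WEIGHTS.items()
        for rec, w_rec in RECURRENCE_WEIGHTS.items()
    }


def weighted_mean(values_by_branch):
    """Weighted mean of a quantity (for example an annual frequency of
    exceedance at a fixed ground-motion level) over all end branches."""
    branches = enumerate_branches()
    return sum(w * values_by_branch[key] for key, w in branches.items())


if __name__ == "__main__":
    branches = enumerate_branches()
    print(f"{len(branches)} end branches, total weight {sum(branches.values()):.3f}")
    # Hypothetical exceedance frequencies, identical on every branch.
    afe = {key: 1.0e-4 for key in branches}
    print(f"mean AFE = {weighted_mean(afe):.2e}")
```

Checking that the end-branch weights sum to unity is a simple safeguard when further nodes, such as the GMC branches, are added to the tree.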

Table 4 Recurrence parameters for SSC model 1
Table 5 Recurrence parameters for SSC model 2
Table 6 Recurrence parameters for SSC model 3

5 Source model: seismogenic faults

There are many mapped geological faults in the region surrounding the site (Fig. 9, where the black circle indicates the area with radius of 100 km centred on CNAAA). Faults within the polygonal area (red) were considered in the study. This area was chosen because it is where faults have been mapped in most detail and few large faults have been identified beyond this distance. Moreover, faults close to the site are the ones most likely to contribute to the hazard estimates; an expanded area that would include few additional faults would have simply resulted in a more diluted seismicity model. The blue circles are epicentres and their sizes are proportional to magnitudes. Note the low correlation between fault traces and epicentres.

Fig. 9

Possible Quaternary faults in the study area; the black circle is 100 km radius around the site and the red polygon the equivalent source zone defined for characteristic earthquakes on these faults

A robust seismic hazard assessment should always consider the potential contributions of geological faults, but this presents a significant challenge in regions of low seismicity. Initial studies concerning the definition of the regional and local geological stability for the installation of CNAAA indicated that the chosen site was in a stable intraplate region without potentially seismogenic faults. However, this sense of presumed tectonic stability of the faults has been substantially modified since the late 1980s (Riccomini et al. 1989).

To model an individual fault as a seismogenic source in PSHA, the location and geometry of the fault are required, together with an estimate of the average slip rate and the largest (or characteristic) earthquake expected on the fault. The locations of the faults in the region surrounding the current site are generally known, together with estimates of their geometry in terms of dip angles and directions. However, while there is field evidence of offsets on several of these structures that suggests the possibility of their being seismogenic within the current tectonic stress regime, there are presently rather limited geochronology data to constrain both a regional Quaternary stratigraphy model and the most recent movements on the individual structures. The slip rate has been estimated at just one site in southeastern Brazil (~ 0.01 mm/year, Campos do Jordão Fault) and is not significantly different from those of other stable continental regions, although the slip rate is known for just one other fault in Brazil (Jundiaí Fault, Nogueira et al. 2010) and for only three faults in the central and eastern United States (CEUS; Crone and Wheeler 2000). Consequently, there is considerable uncertainty regarding the potential seismic capacity and slip rates of these faults. This uncertainty is also heightened because the field evidence for movements is generally localised rather than persistent over the landscape, while the influence of erosional processes in concealing movements is not well quantified. In view of such uncertainty, it would be necessary to develop logic-tree branches to represent the possible distribution of slip rates on each fault. With a minimum of three branches on each of the 163 faults, the computational burden would be extremely demanding and probably not justified given the relatively limited information that is actually available regarding the characteristics of these faults.

We recognise that there are multiple options for how to incorporate the geological information available for the study region into our site-specific PSHA. A common option in seismic hazard studies for stable regions is simply to ignore any specific contribution from faults (other than perhaps indirectly in the estimates of maximum magnitude) and to base the modelling of seismic source characteristics solely on the earthquake catalogue, implicitly assuming that this will automatically accommodate any activity associated with any mapped faults. We did not consider such an approach acceptable for our study, especially since the available evidence does point to the possibility of significant seismogenic capability for some of the mapped faults. At the same time, we do not have well-constrained information on any of the key parameters needed to define the recurrence models for any individual fault (characteristic magnitude and recurrence interval or slip rate), for which reason we do not believe that it would be appropriate to model individual fault sources without uncertainty bands. Our alternative approach is conservative (although it only strongly affects the hazard at very low annual frequencies of exceedance), and it represents a baseline that would most likely be reduced through geological investigations to develop the Quaternary record for the region and for the individual faults. We did consider the option of applying our approach to the smaller faults and modelling the larger faults as individual sources, but we believe that the choice of which faults warranted individual representation would not necessarily be straightforward. Firstly, smaller faults closer to the NPP site could contribute more to the hazard than larger faults at distance since the decay of ground-motion amplitudes over distances of, say, 10–20 km can be much greater than the increase due to a unit increase in magnitude. Secondly, since all faults, except the very smallest, are assigned the same slip rate, the shorter faults among the more significant ones will have shorter recurrence intervals for their characteristic earthquakes and thus contribute more to the hazard than longer faults producing larger earthquakes with longer recurrence intervals.

As an alternative, it was proposed to develop an additional area source to represent the potential seismic activity associated with these faults. This is a rather novel solution but it was considered an effective way of incorporating the potential seismic contribution of the faults to the site hazard while not building an excessively—and unjustifiably—complex source model. The area source is a simple polygon defined by eight vertices (Fig. 9).

In order to characterise the potential seismicity associated with the 163 faults in the CNAAA region, the following steps were followed:

  1. A characteristic or maximum magnitude was assigned to each fault based on its length and the assumption that the entire fault would rupture in each earthquake. These magnitudes were obtained from the median predictions from the empirical equation for all fault types of Wells and Coppersmith (1994). The resulting magnitudes ranged from 4.42 to 7.33.

  2. The average fault slip, AD (m), associated with each earthquake was estimated as the median value from another empirical relationship provided by Wells and Coppersmith (1994).

  3. Slip rates were assigned to the faults on the basis of discussions with several geologists during a field survey in May–June 2016, on the basis of the maximum slip rate that would be expected without producing a more pronounced and visible signature on the landscape. These slip rates were selected as 0.01 mm/year for faults of less than 10 km in length and 0.025 mm/year for longer faults.

  4. The average recurrence rate of the characteristic earthquake on each fault was then estimated as the ratio of the slip rate to the slip-per-event, AD.
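The four steps above can be sketched numerically as follows. This is a minimal illustration and not the authors' code: the regression coefficients are the all-slip-type surface-rupture-length and average-displacement relations as commonly quoted from Wells and Coppersmith (1994), and the text does not state which of their length regressions was actually used. The function name and the 10 km example fault are hypothetical.

```python
import math

# Assumed coefficients (all slip types), as commonly quoted from
# Wells and Coppersmith (1994); the paper does not state which length
# regression was used, so treat these as illustrative only.
A_M, B_M = 5.08, 1.16      # M = A_M + B_M * log10(length in km)
A_AD, B_AD = -4.80, 0.69   # log10(AD in m) = A_AD + B_AD * M


def characteristic_rate(length_km):
    """Return (magnitude, average slip in m, annual rate) for one fault."""
    # Step 1: characteristic magnitude, assuming full-length rupture
    magnitude = A_M + B_M * math.log10(length_km)
    # Step 2: median average displacement per event (m)
    avg_slip_m = 10.0 ** (A_AD + B_AD * magnitude)
    # Step 3: slip rate from the length-based rule in the text (mm/year)
    slip_rate_mm = 0.01 if length_km < 10.0 else 0.025
    # Step 4: annual rate = slip rate / slip per event (both in metres)
    annual_rate = (slip_rate_mm / 1000.0) / avg_slip_m
    return magnitude, avg_slip_m, annual_rate


magnitude, avg_slip_m, annual_rate = characteristic_rate(10.0)
recurrence_years = 1.0 / annual_rate
print(f"M = {magnitude:.2f}, AD = {avg_slip_m:.2f} m, "
      f"recurrence ~ {recurrence_years:.0f} years")
```

Under these assumed coefficients, a 10 km fault has a recurrence interval on the order of ten thousand years, which illustrates why such sources matter mainly at low annual frequencies of exceedance.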

The equivalent source zone representing the faults being discussed herein is only included in model 1, where in effect it is superimposed on the coastal sierra zone as an additional source of seismicity. In view of the Mmax assigned to the diffuse seismicity in the coastal sierra area source zone being 6.5 and the break in the recurrence relationship estimated for the faults at Mw 6.3, it was decided only to include the higher linear trend from the fault-based seismicity (since the smaller earthquakes can be assumed to be represented by the three recurrence relationships on the logic-tree branches for the coastal sierra source zone). Since the largest earthquake in the fault-based seismicity catalogue has magnitude 7.3, it was considered appropriate to assign an Mmax of 7.5 to the fault source zone, which is the same value as that assigned in model 3.

Therefore, the final recurrence parameters were obtained by fitting a doubly truncated exponential Gutenberg-Richter (G-R) recurrence relationship to the data from Mw 6.3 and above. The coefficients of the recurrence relationship were obtained using maximum-likelihood fitting, which first required the data to be re-sampled at constant intervals of magnitude. In Fig. 10 (left frame), the cumulative recurrence data are shown by red squares and the re-sampled data in constant magnitude bins as black circles. The dashed black line is the linear Gutenberg-Richter recurrence model and the solid black line is the truncated exponential relationship proposed for the model.
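For readers unfamiliar with the functional form, the sketch below evaluates a doubly truncated exponential G-R relationship between the Mw 6.3 and 7.5 bounds given in the text. It uses the standard textbook form of the truncated exponential model; the activity rate and b-value are hypothetical placeholders, since the fitted coefficients are not quoted in this section.

```python
import math


def truncated_gr_rate(m, rate_mmin, b_value, m_min=6.3, m_max=7.5):
    """Annual rate of events with magnitude >= m (doubly truncated G-R)."""
    if m <= m_min:
        return rate_mmin
    if m >= m_max:
        return 0.0
    beta = b_value * math.log(10.0)
    span = m_max - m_min
    numerator = math.exp(-beta * (m - m_min)) - math.exp(-beta * span)
    denominator = 1.0 - math.exp(-beta * span)
    return rate_mmin * numerator / denominator


# Hypothetical parameters, for illustration only
RATE_MMIN = 1.0e-3   # assumed annual rate of events with M >= 6.3
B_VALUE = 1.0        # assumed b-value

rates = [truncated_gr_rate(m, RATE_MMIN, B_VALUE)
         for m in (6.3, 6.8, 7.3, 7.5)]
for m, rate in zip((6.3, 6.8, 7.3, 7.5), rates):
    print(f"M >= {m}: {rate:.3e} per year")
```

The truncation makes the cumulative rate fall away from the straight-line G-R model as the magnitude approaches Mmax, reaching zero at Mw 7.5.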

Fig. 10

Left: Incremental (blue) and cumulative (red) recurrence data estimated for the faults, based on the assumptions of the entire fault length rupturing in characteristic events and slip rates of 0.01 and 0.025 mm/year for short and longer (> 10 km) faults. Right: Comparison of recurrence relationships (normalised by area) for fault and coastal zones in model 1: the curves in the lower magnitude range correspond to the recurrence parameters for the original coastal area source in the model, and the curves in the higher magnitude range to the equivalent area source for the faults

In view of the inconclusive evidence for seismogenic potential on most of the faults in the current stress regime, the faults are collectively assigned a probability of 10% of being active. There are several reasons for assigning this rather low value, in addition to the lack of compelling and consistent field indicators of repeated recent fault movements, which would be expected to be more clearly visible in the landscape. Only about 10% of the faults actually displace deposits that are believed to be of Quaternary age. Another important observation is that the longer fault traces generally show less evidence of recent activity and therefore the more important structures would actually be assigned lower probabilities of being seismogenic.

In order to estimate the likely impact of the fault source zone on the final hazard estimates, the seismic activity modelled by this new source is compared with that represented by the coastal sierra source in SSC model 1. To make this comparison more meaningful, the recurrence rates are normalised by the source areas, which are calculated as 269,911 km² for the coastal sierra zone and 18,801 km² for the fault source zone. In Fig. 10 (right frame), the upper and lower recurrence relationships for the coastal sierra source are shown in green and blue, respectively. The magenta curves are the recurrence values for the fault source, with the yellow curves showing the effect of applying the Tinti and Mulargia (1985) correction for magnitude estimation. The red curve shows these same recurrence rates scaled down by a factor of 0.1, to approximate the effect of the 10% probability assigned to the faults being seismogenic. Even with this final adjustment, the values are significantly higher than those that would be predicted by extrapolation of the G-R relationship for the coastal sierra source. This is consistent with the characteristic earthquake model, in which the rates obtained from geological constraints for larger-magnitude earthquakes are higher than expected from extrapolation of the rates constrained by seismicity data for smaller-magnitude earthquakes. Figure 10 suggests a very large increase in seismicity due to adding in the fault source, but it should be borne in mind that the coastal sierra source zone of model 1 is the single most seismically quiet of all six source zones in the SSC logic-tree.
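The effect of the area normalisation and of the 0.1 scaling can be illustrated with simple arithmetic. In the sketch below the two areas and the 10% probability come from the text, while the common annual rate is a hypothetical value chosen only to isolate the effect of the differing areas.

```python
COASTAL_AREA_KM2 = 269_911.0     # coastal sierra zone (from the text)
FAULT_ZONE_AREA_KM2 = 18_801.0   # fault source zone (from the text)
P_SEISMOGENIC = 0.10             # probability that the faults are active


def rate_density(annual_rate, area_km2):
    """Annual rate per square kilometre."""
    return annual_rate / area_km2


# Hypothetical: the same absolute annual rate in both zones
HYPOTHETICAL_RATE = 1.0e-3

coastal_density = rate_density(HYPOTHETICAL_RATE, COASTAL_AREA_KM2)
fault_density = rate_density(HYPOTHETICAL_RATE, FAULT_ZONE_AREA_KM2)

area_ratio = fault_density / coastal_density   # effect of area alone
scaled_ratio = P_SEISMOGENIC * area_ratio      # after the 0.1 scaling
print(f"density ratio: {area_ratio:.1f}, after scaling: {scaled_ratio:.2f}")
```

For equal absolute rates, normalisation alone raises the rate density of the fault zone by a factor of about 14 because of its much smaller area.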

The full SSC logic-tree is presented in Sect. 7.

6 Ground-motion characterisation model

The basis for the construction of the GMC logic-tree is a procedure that has become known as the ‘backbone GMPE’ approach. Traditionally, GMC logic-trees have been constructed by selecting GMPEs and assigning weights to these equations. There are several shortcomings in this approach, including the fact that the resulting distribution of predicted ground motions is difficult to visualise (and may be much narrower than would be suspected if there are several branches predicting similar levels of motion). Another important consideration is that for regions with only sparse local data, the range of predicted motions obtained from a suite of imported GMPEs is unlikely to capture the full range of epistemic uncertainty. Both of these issues can be addressed by adopting a single GMPE that is well suited to the region and then populating the other logic-tree branches with scaled versions of this backbone model (Atkinson et al. 2014).

6.1 Site characteristics

The first steps in defining a backbone for the site are to establish the VS profile and the site kappa. On the basis of the reported shear modulus of the rock (20 GN/m²), it was estimated that the shear-wave velocity, VS, in the rock is on the order of 2800 m/s. This is consistent with the cross-hole measurements made at a nearby site in 1985 as part of the site investigations (Promon 1989). The indications are that the velocity profile at the site corresponds to a very hard rock site that is consistent with the site conditions implicit in most GMPEs developed for the Central and Eastern United States (CEUS); see Fig. 11 (left). This is also supported by the fact that the crustal velocity structures in the upper crust of the two regions are very similar, as shown in Fig. 11 (right); the apparent differences in the top 1 km simply reflect the low resolution in the SE Brazil model.
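The link between shear modulus and shear-wave velocity is the elastic relation VS = sqrt(G/ρ). The sketch below applies it to the reported modulus; the rock density is not given in the text, so the value used here is an assumption chosen as plausible for hard crystalline rock.

```python
import math

SHEAR_MODULUS_PA = 20.0e9        # 20 GN/m² (from the text)
ASSUMED_DENSITY_KG_M3 = 2550.0   # assumption: density is not reported


def shear_wave_velocity(shear_modulus_pa, density_kg_m3):
    """Shear-wave velocity (m/s) from VS = sqrt(G / rho)."""
    return math.sqrt(shear_modulus_pa / density_kg_m3)


vs_m_per_s = shear_wave_velocity(SHEAR_MODULUS_PA, ASSUMED_DENSITY_KG_M3)
print(f"VS ~ {vs_m_per_s:.0f} m/s")
```

With this assumed density the result is close to the 2800 m/s quoted above; a denser rock would give a somewhat lower velocity for the same modulus.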

Fig. 11

Left: VS profile at Ponta Grande from cross-hole measurements at the site (blue) compared with the generic CEUS hard rock profile (green) of Boore and Joyner (1997); Right: Shear-wave velocity profiles in the uppermost 5 km of the crust in SE Brazil and in CEUS, the latter again from Boore and Joyner (1997)

In terms of the site kappa value, a very valuable source of information is the recordings from the ESAR seismograph station installed adjacent to the CNAAA site on a rock outcrop. Fourier amplitude spectra (FAS) were generated from many recordings at this location and plotted on log-linear axes to estimate the high-frequency filter parameter, κ, following the procedure of Anderson and Hough (1984). Plotting these kappa values against the epicentral distance to the ESAR station allows the value of the site kappa, κ0, to be estimated from the intercept at zero distance. This analysis suggests a κ0 of 0.0036 s with an associated standard error of about 0.005 s; negative values of kappa are meaningless but these results confirm that the site kappa has a very low value (< 0.01 s), which is consistent with very hard rock conditions.
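The two regressions described above can be sketched as follows: in the Anderson and Hough (1984) model the high-frequency part of the spectrum decays as exp(−πκf), so κ follows from the slope of ln(FAS) against frequency, and κ0 is then the zero-distance intercept of κ against distance. The spectra here are synthetic and noise-free, and the site and path terms and the fitting band are hypothetical values, not those of the ESAR analysis.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kappa_from_spectrum(freqs_hz, amplitudes):
    """Kappa (s) from the high-frequency slope of ln(FAS) versus f."""
    slope, _ = linear_fit(freqs_hz, [math.log(a) for a in amplitudes])
    return -slope / math.pi


# Synthetic records (hypothetical): kappa grows linearly with distance
TRUE_KAPPA0 = 0.004       # s, assumed site term
KAPPA_PER_KM = 0.00005    # s/km, assumed path term
freqs = [10.0 + 2.0 * i for i in range(16)]   # 10-40 Hz fitting band
distances_km = [50.0, 100.0, 200.0, 400.0]

kappas = []
for r in distances_km:
    kappa_r = TRUE_KAPPA0 + KAPPA_PER_KM * r
    spectrum = [math.exp(-math.pi * kappa_r * f) for f in freqs]
    kappas.append(kappa_from_spectrum(freqs, spectrum))

# Site kappa is the intercept of kappa versus distance at zero distance
_, kappa0 = linear_fit(distances_km, kappas)
print(f"kappa0 = {kappa0:.4f} s")
```

With real recordings the scatter in the individual κ estimates controls the standard error of the intercept, which is why the uncertainty quoted above is comparable to the estimate itself.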

6.2 The backbone GMPE

In order to explore the suitability of various candidate GMPEs from stable regions for SE Brazil, a database of Brazilian ground motions was assembled. It includes the largest events of the database used for the inversion of Fourier amplitude spectra (FAS) and a few additional events for which only one or two records are available. For those additional events, moment magnitude and corner frequency were determined by using the site and path terms from the main inversion and inverting only for the source terms. Figure 12 shows the magnitude-distance distribution of the collected data: only small events, with Mw between 2.2 and 4.5 and mainly recorded beyond 100 km, are available.

Fig. 12

Distribution of ground-motion dataset in terms of moment magnitude and epicentral distance (Repi)

Based on these data, a statistical analysis of the residuals between the recorded data and the amplitudes expected from GMPEs has been performed following the methods of Scherbaum et al. (2004) and Scherbaum et al. (2009). A set of 15 pre-selected GMPEs, predominantly from stable continental regions (SCR), was used for testing. A clear conclusion from these tests is that it would not be possible to construct a GMC logic-tree based on their results. The best-fitting GMPE changes from one period to another, and the standard deviation of the normalised residuals for PGA is also quite large, between 1.0 and 1.4, showing that the observed variability is larger than the predicted variability. Interestingly, the GMPEs tend to overestimate observed ground motions at short periods.
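The quantity underlying these tests is the normalised residual, z = (ln observed − ln predicted median) / σ, where σ is the logarithmic standard deviation of the GMPE. A well-fitting model gives z values with mean near zero and standard deviation near one. The sketch below uses invented amplitudes and an assumed σ purely to show the calculation; it does not reproduce the ranking schemes of the cited methods.

```python
import math
import statistics


def normalised_residuals(observed, predicted_median, sigma_ln):
    """z = (ln obs - ln median prediction) / sigma, one value per record."""
    return [(math.log(obs) - math.log(pred)) / sigma_ln
            for obs, pred in zip(observed, predicted_median)]


# Hypothetical PGA values in g, for illustration only
observed_pga = [0.0010, 0.0004, 0.0030, 0.0002, 0.0015]
predicted_pga = [0.0020, 0.0015, 0.0025, 0.0012, 0.0020]
SIGMA_LN = 0.7   # assumed logarithmic standard deviation of the GMPE

z_values = normalised_residuals(observed_pga, predicted_pga, SIGMA_LN)
mean_z = statistics.mean(z_values)
std_z = statistics.stdev(z_values)
print(f"mean z = {mean_z:.2f}, std z = {std_z:.2f}")
```

A negative mean indicates that the model overestimates the data on average, and a standard deviation above one indicates more scatter in the observations than the model predicts, which are the two patterns reported above.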

In view of the failure of data-driven approaches to identify any clearly suitable GMPEs for application in Brazil, the selection was made on more qualitative (but, in our view, also more robust) criteria. The chosen equations are those of Toro et al. (1997) as modified by Toro (2002), which are also among the more conservative GMPEs in current use in the Central and Eastern United States.

The equation is calibrated to a generic site with shear-wave velocity of 2.74 km/s (as inferred from the 9000 ft/s specified in the paper, which was erroneously published as 6000 ft/s according to Dr. Gabriel Toro, personal communication, 2012). This very closely matches the shear-wave velocity of 2.8 km/s inferred for the rock at the CNAAA site (Sect. 6.1). Moreover, the kappa values adopted for the derivation of the equations—equally weighted values of 0.003, 0.006 and 0.012 s—are consistent with the value of 0.0036 s estimated from the ESAR recordings (Sect. 6.1).

Figure 13 compares the distance scaling of the recorded ground motions with that modelled by the selected GMPE for magnitudes Mw 3.5 and 4.0. The scaling with distance of the selected GMPE does not perfectly match the data, especially for smaller magnitudes, but here again, the tendency is towards over- rather than under-estimation. The choice of the Toro (2002) equation therefore seems to be a defensible and safely conservative choice. It may also be noted that the Toro (2002) GMPE and the scaled versions of this backbone model (represented by the discontinuous blue lines in the figures) generally envelop the distribution of the other GMPEs, without being an extreme outlier. In these figures, the Toro (2002) GMPE is shown with scaling factors of 0.5, 0.75, 1.0 and 1.33 (dotted, dashed, solid and dashed-dotted blue lines, respectively). The other GMPEs for SCR are shown as grey lines and recorded data in Brazil as red crosses.

Fig. 13

Upper: Scaling of accelerations with distance at Mw 3.5 for PGA (left) and Sa (1.0 s) (right). The Toro (2002) GMPE is shown with scaling factors of 0.5, 0.75, 1.0 and 1.33 (dotted, dashed, solid and dashed-dotted blue lines, respectively). The other GMPEs for SCR are shown as grey lines and recorded data in Brazil as red crosses; Lower: The same for Mw 4.0

6.3 GMC logic-tree

The GMC logic-tree consists of four branches. The justification for the branch models and associated weights is as follows:

  • On the basis of the very low stress drops estimated for Brazilian earthquakes, a branch is included with a model predicting ground motions equal to half of the median predictions from the Toro (2002) GMPE, which is based on a median stress drop of 120 bars. The lower stress drops encountered in Brazil are consistent with the shallower focal depths than those of earthquakes in CEUS. However, because the low stress drops are determined from small-to-moderate magnitude earthquakes in Brazil, it is acknowledged that the extrapolation to larger earthquakes is uncertain, for which reason this low branch is assigned a weight of only 20%.

  • The remaining 80% of the weight is assigned to a symmetrical distribution of three discrete branches centred on the original Toro (2002) GMPE without modification, which is given a weight of 40%. This branch reflects the possibility that for moderate-to-large magnitude earthquakes, ground motions in SE Brazil may be essentially equivalent to those in CEUS, and any effects of lower stress drops due to shallower depths may be offset by the use of the RJB distance metric.

  • The intermediate lower branch, with a weight of 20%, also reflects both lower median stress drops and the predominance of strike-slip faulting in onshore SE Brazil (Assumpção 1998b), whereas the Toro (2002) GMPE is calibrated to a region where reverse-faulting earthquakes dominate, the latter generally producing significantly higher levels of motion (e.g., Bommer et al. 2003).

  • The uppermost branch represents the unlikely but nonetheless feasible scenario that for moderate-to-large magnitude earthquakes, ground motions could be higher in Brazil than in CEUS. Although this is considered to be unlikely, this branch is conservatively assigned a weight of 20%.
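The four branches described above can be summarised as pairs of scaling factor and weight. In the sketch below the factors 0.5 and 1.0 and all four weights come from the text; the assignment of 0.75 to the intermediate lower branch and 1.33 to the uppermost branch is inferred from the scaling factors listed for Fig. 13 and should be read as an assumption. The function name and the example backbone median are hypothetical.

```python
import math

# (scaling factor on the backbone median, branch weight)
GMC_BRANCHES = [
    (0.50, 0.20),   # low branch: half of the backbone median
    (0.75, 0.20),   # intermediate lower branch (factor inferred)
    (1.00, 0.40),   # unmodified backbone GMPE
    (1.33, 0.20),   # uppermost branch (factor inferred)
]


def branch_medians(backbone_median_g):
    """Scaled median accelerations (g) paired with their branch weights."""
    return [(factor * backbone_median_g, weight)
            for factor, weight in GMC_BRANCHES]


total_weight = sum(weight for _, weight in GMC_BRANCHES)
weighted_mean_factor = sum(f * w for f, w in GMC_BRANCHES)
weighted_geomean_factor = math.exp(
    sum(w * math.log(f) for f, w in GMC_BRANCHES))

print(branch_medians(0.10))   # hypothetical backbone median of 0.10 g
print(f"weights sum to {total_weight:.2f}; "
      f"weighted mean factor = {weighted_mean_factor:.3f}")
```

Both the arithmetic and the geometric weighted averages of the factors fall below 1.0, so under this reading the tree as a whole leans towards motions somewhat lower than the unmodified backbone.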

7 Hazard calculations

The complete logic-tree is shown in Fig. 14. In summary, there are a total of 486 SSC branch combinations. Each of these may be applied with each of the four GMC branches, leading to a total of 1944 branch combinations. For this site-specific PSHA, a minimum magnitude of Mw 4.5 was chosen. The main reason for adopting a value lower than the standard of 5.0 is that we are using moment magnitudes converted from another magnitude scale, and the additional margin allows for the uncertainty in such conversions. In this way, we can be confident that we are not removing potentially important contributions to the hazard while at the same time conforming to international best practice (Bommer and Crowley 2017).

Fig. 14

Complete logic-tree for the PSHA

The hazard calculations were executed using the OpenQuake hazard engine that has been developed for the Global Earthquake Model (GEM) project (Pagani et al. 2014). The hazard was calculated at 11 response frequencies between 0.5 and 100 Hz. The basic output from the PSHA calculations at each of these oscillator frequencies is a mean hazard curve on outcropping rock at the CNAAA site. The fractiles of the hazard are also calculated at the 5, 15, 50, 85 and 95% confidence levels. The fractiles are required for probabilistic risk analyses, and they also provide a measure of the total uncertainty in the seismic hazard assessment. Figure 15 shows the full suites of hazard curves for oscillator frequencies of 1, 10 and 100 Hz as illustrative examples. The most striking feature in all cases is how much the mean and median hazard curves separate at AFEs below 10⁻³ and how the mean curve tends towards rather high fractiles, exceeding the 85th-percentile curve at low AFEs. The separation of the mean and median hazard curves is a direct indication of the degree of epistemic uncertainty in the PSHA input models, and it is clear that for the CNAAA study, the level of uncertainty captured is appreciable. This is consistent with the current database for SE Brazil and the lack of a new data collection campaign as part of this PSHA study.
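The way a mean hazard curve can rise above the 85th-percentile curve is easy to reproduce with a toy example. The sketch below takes the AFEs predicted by five logic-tree branches at one fixed acceleration level and computes the weighted mean and weighted fractiles. The branch values and weights are invented for illustration, with a single low-weight branch predicting a much higher AFE, and the fractile function is a simple stepwise definition rather than the one used by any particular hazard code.

```python
def weighted_fractile(values, weights, fraction):
    """Smallest value whose cumulative weight reaches the given fraction."""
    pairs = sorted(zip(values, weights))
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= fraction - 1e-12:
            return value
    return pairs[-1][0]


# Hypothetical AFEs from five branches at one acceleration level
branch_afes = [1e-6, 2e-6, 3e-6, 5e-6, 1e-4]
branch_weights = [0.2, 0.3, 0.2, 0.2, 0.1]

mean_afe = sum(a * w for a, w in zip(branch_afes, branch_weights))
median_afe = weighted_fractile(branch_afes, branch_weights, 0.50)
p85_afe = weighted_fractile(branch_afes, branch_weights, 0.85)

print(f"mean = {mean_afe:.2e}, median = {median_afe:.2e}, "
      f"85th = {p85_afe:.2e}")
```

Because the mean is an arithmetic average of values spread over orders of magnitude, one low-weight but high-hazard branch is enough to pull it above the upper fractiles.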

Fig. 15

Seismic hazard curves for Sa (1 Hz), Sa (10 Hz) and Sa (100 Hz) at the CNAAA site

The high fractiles reached by the mean hazard are primarily due to the rather conservative modelling choices made for the inclusion of potential hazard contributions from the mapped geological faults in the CNAAA site region. These modelling choices also reflect uncertainty inasmuch as there is currently a lack of definitive evidence regarding the seismogenic potential of these faults; in the face of this uncertainty, it was considered appropriate to include the possible contributions from these faults. Although the probability of seismogenicity assigned to these faults is low, their contribution to the hazard estimate is significant for AFEs lower than 10⁻⁴.

Using the hazard curves discussed above, the uniform hazard response spectra (UHRS) for 5% of critical damping are evaluated at multiple AFEs. Figure 16 compares the final UHRS (solid line) with the UHRS evaluated without the contributions from the equivalent area source for the seismicity associated with the geological faults (dashed line). This comparison shows that the contribution from the faults only manifests at lower AFEs, which makes perfect sense because of the long recurrence intervals associated with most of the characteristic earthquakes associated with those structures.
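A UHRS is obtained by reading each hazard curve at the same target AFE and collecting the resulting accelerations across oscillator frequencies. The sketch below does this with log-log interpolation, which is one common choice and not necessarily the scheme used in this study. The hazard curves are invented for illustration, and the function and variable names are hypothetical.

```python
import math


def accel_at_afe(accels_g, afes, target_afe):
    """Log-log interpolation of a hazard curve at a target AFE.

    Assumes AFEs decrease strictly as acceleration increases.
    """
    points = list(zip(accels_g, afes))
    for (a1, f1), (a2, f2) in zip(points[:-1], points[1:]):
        if f2 <= target_afe <= f1:
            t = ((math.log(target_afe) - math.log(f1))
                 / (math.log(f2) - math.log(f1)))
            return math.exp(math.log(a1)
                            + t * (math.log(a2) - math.log(a1)))
    raise ValueError("target AFE outside the range of the hazard curve")


# Hypothetical hazard curves: frequency (Hz) -> (accelerations in g, AFEs)
hazard_curves = {
    1.0: ([0.01, 0.1, 1.0], [1e-3, 1e-5, 1e-7]),
    10.0: ([0.01, 0.1, 1.0], [1e-2, 1e-4, 1e-6]),
}

TARGET_AFE = 1e-4
uhrs = {freq: accel_at_afe(accels, afes, TARGET_AFE)
        for freq, (accels, afes) in hazard_curves.items()}
for freq, sa in sorted(uhrs.items()):
    print(f"{freq:5.1f} Hz: Sa = {sa:.4f} g")
```

Repeating the same reading for curves computed with and without a given source gives the kind of paired spectra compared in Fig. 16.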

Fig. 16

Comparison of the final UHRS (solid lines) with the UHRS evaluated without the contributions from the equivalent area source for the seismicity associated with the geological faults (dashed lines)

The disaggregation of PSHA results is essentially the process of examining the hazard integrations to quantify the contributions to the hazard estimates of different combinations of magnitude, distance and epsilon, the latter being the number of GMPE standard deviations by which the acceleration lies above the median prediction. This information reveals which seismic sources and which earthquake scenarios are most influential on any particular hazard estimate. The dominant scenarios identified in this way can also be used as the starting point for more detailed representations of the hazard beyond the 5%-damped UHRS, including the vertical response spectra, response spectra for other damping values and acceleration time-histories.

Disaggregation is performed for the spectral acceleration at a specified response frequency, f, and annual frequency of exceedance (AFE), for the CNAAA site-specific PSHA. Some results are displayed in 3D representations that show the contributions from magnitude, M, distance, R, and epsilon (ε) values, in Figs. 17 and 18 for AFEs of 10⁻⁴ and 10⁻⁵, which are the most relevant to defining design spectra for nuclear installations (USNRC 2007).

Fig. 17

Disaggregation in terms of M-R-ε triplets of the 10⁻⁴ hazard estimates in terms of Sa (10 Hz) (upper) and Sa (1 Hz) (lower)

Fig. 18

Disaggregation in terms of M-R-ε triplets of the 10⁻⁵ hazard estimates in terms of Sa (10 Hz) (upper) and Sa (1 Hz) (lower)

The patterns visible in these disaggregation plots are entirely consistent with general expectations for PSHA in a low-seismicity region. The modal contributions to the hazard, as represented by the tallest columns, correspond to moderate magnitudes, which tend to be a little larger for lower oscillator frequencies and lower AFEs. The dominant distances also decrease with decreasing AFEs, with the dominant earthquake scenarios moving closer to the site. The epsilon values of the dominant scenarios also increase with decreasing AFEs, but the dominant contributions come from ε values of about 1.5, with only small contributions from higher values, which is the result of the AFEs being largely determined by the long recurrence intervals of the earthquakes.

8 Discussion and conclusion

A site-specific PSHA has been performed for the nuclear power plant at Angra dos Reis, located in the seismically quiet region of SE Brazil. The study was undertaken in response to the Fukushima disaster and to generate an assessment of the seismic demand at this site that is consistent with current good practice in this field. The sparse earthquake catalogue for this region, the limited palaeoseismological investigations in the surrounding region and the lack of strong-motion recordings from moderate-to-large earthquakes inevitably mean that there was considerable epistemic uncertainty in both the seismic source and ground-motion characterisation models.

The seismic hazard assessment made full use of all existing data to constrain the PSHA input models but also recognised the large epistemic uncertainties that were captured through a logic-tree formulation. However, in order to avoid excessive computational burden as a result of an excessively complicated logic-tree, a structure was developed that enabled capture of wide ranges of uncertainty on key parameters in an efficient manner. The logic-tree consisted of three nodes, the first acknowledging the uncertainty in the spatial distribution of future earthquakes through the definition of three different source zones for diffuse seismicity. The uncertainty on the recurrence relationships within each source zone was captured through three branches with pairs of a- and b-values. Rather than increase computational demand through branches for Mmax, it was considered sufficient to define relatively conservative estimates of this parameter, namely Mw 7.0 or 7.5 in each source zone. The one exception to this is in Model 1, where a source zone is defined along the coastal region that contains the NPP site. The maximum magnitude in this zone was set to 6.5 but within this zone, the model also allows for larger earthquakes associated with geological faults.

There are many mapped faults within the onshore region surrounding the site, several of these showing evidence for movements at specific locations. To incorporate each fault as a separate seismic source with branches to account for the uncertainty in slip rates would create an enormously complicated logic-tree and consequently a huge computational demand. The innovative approach adopted was to assign a maximum (characteristic) event to each fault based on median estimates from empirical relationships and assuming that the entire fault would rupture. Slip rates were assigned to longer and shorter faults based on the expert judgement of several geologists regarding the highest rates that could exist without clearer manifestation of the faults in the landscape. On this basis, a virtual catalogue of characteristic events was generated and a Gutenberg-Richter recurrence relationship derived from this catalogue. The recurrence relationship was applied in a source zone surrounding the NPP site with a minimum magnitude of 6.3 and a maximum magnitude of 7.5. The combination of the diffuse seismicity up to Mw 6.5 and the higher recurrence rates for larger earthquakes on the faults effectively mimics a characteristic earthquake model.

For the other source zones, the minimum magnitude was set to 4.5, slightly smaller than the value of 5 often used for nuclear sites (e.g. USNRC 2007), to account for the fact that the original earthquake catalogue was defined in a scale other than moment magnitude.

The ground-motion logic-tree was developed using what has become known as the backbone GMPE approach. The CEUS model of Toro (2002) was selected on the basis of having been derived for sites with shear-wave velocity and kappa values comparable to those encountered at the CNAAA site, plus the fact that the available data from Brazil showed comparable, albeit slightly faster, attenuation rates over distance. Alternative branches for the logic-tree were created by applying scaling factors to this backbone GMPE that would reflect host-to-target region differences in stress drop.

Disaggregation of the hazard at annual exceedance frequencies of 10⁻⁴ and 10⁻⁵ indicates dominant contributions from earthquakes in the ranges Mw 5.5–6.1 and Mw 5.8–6.4, respectively. At 10⁻⁵, the hazard is dominated by contributions from the equivalent source zone representing the contributions from potential earthquakes on the geological faults. There is consequently strong motivation to invest in palaeoseismological studies of these faults (accompanied by an extensive geochronology campaign to develop the Quaternary record for the region), since the current model is based on several assumptions regarding the seismogenic potential of these faults, some of which may be highly conservative.