1 Introduction

The Po Plain is a deep sedimentary basin in Northern Italy, between the European Alps and the northern Apennines. It is a densely populated area of high economic importance for Italy. The thick foredeep sediments buried below the Po Plain are mainly of Pliocene–Pleistocene age and show depocenters up to \(\sim\) 7 to 8 km deep, especially in the south-western sector (e.g. Pieri and Groppi 1981; Molinari et al. 2015). Amplification of predominantly long-period waves makes the Po Plain region prone to significant shaking during earthquakes. The region hosts relatively infrequent, low- to moderate-magnitude seismicity. During a recent earthquake sequence in May 2012, two events reached magnitudes of up to \(\sim 6.0\), causing unexpectedly heavy damage. The sequence, occurring in the Emilia-Romagna Region, had significant social and economic impact: the earthquakes caused 27 casualties, around 400 injuries and left 15,000 people homeless (Magliulo et al. 2014). The economic loss was estimated to be over 13 billion Euro (Munich Re 2018). The largest events were felt all over the plain, even at distances beyond 200 km. Peak ground acceleration exceeded 0.3 g (Luzi et al. 2013), and liquefaction phenomena were observed in the epicentral area (Civico et al. 2015).

Sedimentary basins can significantly amplify and prolong the long-period (< 1 Hz) ground motions generated by moderate to large regional events (e.g. Anderson et al. 1986; Koketsu and Kikuchi 2000; Galetzka et al. 2015). Basin resonances with eigenperiods up to 10 s are known for the Po Plain (Luzi et al. 2013; Massa and Augliera 2013). Long-period ground motions have the potential to damage large structures such as tall buildings, long bridges and large industrial complexes.

Traditionally, prediction of expected ground motion from a given source location and magnitude relies on empirical ground motion prediction equations (GMPEs) (e.g. Douglas 2018 and references therein). These relations contain empirical correction terms for, e.g., type of soil, faulting style and directivity. Despite the possibility today to compute kinematic or dynamic rupture models coupled with simulations of three-dimensional wave propagation, the parametric GMPEs have remained the main tool for seismic hazard studies. As in some other cases, however, it has been shown for the May 2012 Emilia earthquakes that GMPE results are improving, but are still not able to accurately predict ground motion across the complete Po Plain basin (Massa et al. 2012; de Nardis et al. 2014; Lanzano et al. 2016).

Today, numerical, full-waveform, deterministic simulation of ground motion is becoming increasingly important as an alternative method to model seismic shaking for hazard and risk assessment. Civil engineers are expected to model structures as non-linear multi-degree-of-freedom systems. Such simulations require complete time series, commonly from 0.1 Hz up to frequencies of 10 Hz (e.g. Irie and Nakamura 2000).

Moreover, full waveforms, computed taking into account the 3D structure of the basin fill, carry a great amount of information and they may provide a more realistic and physics-based distribution of ground-motion than shaking computed using empirical GMPEs. They could reduce the epistemic uncertainty in probabilistic hazard assessment.

As of today, full waveform simulations are very challenging, especially at frequencies above 1 Hz. The low frequency range of the spectrum is commonly computed using deterministic numerical methods. Earlier studies have simulated waveforms in the lower frequency range, up to a maximum of 1.5 Hz, in the Po Plain region (Molinari et al. 2015; Paolucci et al. 2015). Other notable examples worldwide include the Los Angeles basin (e.g. Olsen et al. 2003), the Kanto basin (Koketsu and Kikuchi 2000; Yoshimoto and Takemura 2014), the Osaka basin (Kagawa et al. 2004), and the Grenoble basin (Chaljub et al. 2010). Generally these simulations compare satisfactorily with observed seismograms at long periods. A satisfactory fit with observed data at higher frequencies with deterministic methods, however, is not yet achievable, as this requires a very accurate 3D velocity model, precise knowledge about the source mechanism and detailed information about the near-source structure where seismic scattering dominates.

Therefore, to simulate waveforms of the desired frequency range of 0.1–10 Hz, a hybrid method is required, where deterministically simulated low frequency (< 1 Hz) seismograms are combined with high frequency seismograms calculated by empirical-stochastic methods simulating the scattering (Zeng 1993; Hartzell et al. 2005; Mai and Beroza 2003; Mai et al. 2010; Mena et al. 2010), or by a stochastic representation of source radiation (Liu et al. 2006; Frankel 2009; Graves and Pitarka 2010). In this study, we apply the procedure proposed by Mai et al. (2010) where in a first step LF spectral element synthetic seismograms are calculated, in a second step the HF scattering effects for each observer location are determined and in the final third step the LF and HF seismograms are reconciled to form the hybrid broadband seismogram (Fig. 1).

The goals of this study are to simulate earthquake ground motions in the Po Plain, to verify the results by providing a first-order fit to broadband seismograms recorded in the region, and to delineate the most important steps for further improvement. As mentioned above, precise and reliable knowledge of the basin infill and crustal 3D structure (geometries and velocities) is of pivotal importance to model wave propagation. The structure beneath the Po Plain and surrounding regions has long been under investigation through different geophysical methods. Molinari et al. (2015) exploited industry geological and geophysical data to describe the main tectonic and structural features of the inner structure of the basin with the 3D velocity model MAMBo. Molinari et al. (2016) imaged the S-velocity structure of the northern Italian crust by ambient noise surface wave tomography. We apply the hybrid method of Mai et al. (2010) to these two 3D models and to a widely used 1D velocity model for the region (Mele et al. 2010), to calculate waveforms and analyse their differences in comparison with observed seismograms from selected recent earthquakes.

2 Data

2.1 Earthquake Source Mechanism and Waveform Data

We compute synthetic seismograms for four selected earthquakes that occurred in the Po Plain area between 2012 and 2017 with moderate magnitude (\(M_w>4.4\)) and shallow hypocentral depth. This choice ensures that the earthquakes have been clearly recorded instrumentally throughout the whole basin. We include the second large shock of the Emilia sequence, with magnitude \(M_w= 5.6\), which occurred on May 29th, 2012. The other large shock was very similar in magnitude, location, and source mechanism and is not expected to yield different information. A smaller event, on June 3rd, 2012, with magnitude \(M_w=\) 4.5, is located in the basin itself near the Emilia sequence location. Our third event is the \(M_w=4.5\), June 30th, 2013, quake that occurred under the Apennines, to the South of the Po Plain basin. The most recent event we consider was located at the edge of the basin and occurred on November 19th, 2017, with \(M_w=4.4\). While the events in the basin all have similar thrust faulting mechanisms, the 2013 event exhibits normal faulting, typical for this sector of the Apennines (see Fig. 2 and Table 1).

The events were located by the national network of the Istituto Nazionale di Geofisica e Vulcanologia (INGV) of Italy. Except for the main shock of the Emilia sequence, for all event sources we use the time domain moment tensor (TDMT) solution, routinely calculated by INGV (Scognamiglio et al. 2009; Dreger 2003). For the largest shock of the Emilia sequence (\(M_w\) 5.6), we consider instead the finite-fault source model by Paolucci et al. (2015) for more detailed simulations of source effects, in particular at small epicentral distances.

Besides the network operated by INGV—IV (INGV Seismological Data Centre 1997), observed broadband waveforms were collected from the North-eastern Italy Seismic Network [NI—OGS (Istituto Nazionale di Oceanografia e di Geofisica Sperimentale) and University of Trieste 2001], the Province Südtirol Network (SI), the Regional Seismic Network of North Western Italy (GU—University of Genova 1967), the Mediterranean Very Broadband Seismographic Network (MN—MedNet project partner institutions 1988) and the Swiss Seismological Network (CH—Swiss Seismological Service (SED) at ETH Zürich 1983).

Table 1 Origin time, magnitude and hypocenter location of the simulated earthquakes (INGV Centro Nazionale Terremoti 2018, http://terremoti.ingv.it/en, accessed October 2018)

2.2 Seismic Velocity Models

We compute synthetic seismograms for three different velocity models, to test their ability to reproduce longer-period observations.

(a) MAMBo First-order velocity and density discontinuities at the base of the sedimentary trough generate important wave resonance effects that considerably impact the shaking amplitude and duration. Their geometry is therefore quite consequential for our purposes, and needs to be carefully considered. The basin structure in MAMBo has been constructed by combining information from seismic reflection profiles, borehole data and geological maps. It consists of seven geological units of laterally varying thickness, covering the Po Plain sedimentary basin. Within each layer, velocity profiles are defined by combining available information on layer composition. Outside the sedimentary layers, MAMBo includes a simple 1D crustal velocity structure and a Moho topography taken from EPcrust (Molinari and Morelli 2011) (see Molinari et al. 2015, for details). We smoothed MAMBo to remove unrealistic features arising at points where geological boundaries end (see Fig. 3).

(b) Ambient noise tomography (ANT) 3D velocity model Consistent and precise knowledge of S wave velocity across the whole study region is of prime importance in the far field as most seismic energy is carried by surface waves. Where a uniformly dense station network is available, ANT achieves uniform data density. The technique is therefore well suited to derive models with laterally uniform resolution, nearly independent of source distribution. The ambient noise surface-wave tomography model (Molinari et al. 2016) describes the 3D S-wave velocity structure in the crust and uppermost mantle of North Italy. The model was derived by inverting the group velocity of surface waves from the cross correlations of one year of noise records (2011) of 110 North Italian seismic broadband stations. The method applied to this dataset to derive the 3D model is similar to the one described in Molinari et al. (2015). While ANT is able to accurately resolve average layer velocities, the topography of layer interfaces is generally poorly resolved. Consequently, there are differences in the shape of the basin and in Moho topography relative to MAMBo, and velocity gradients across the main discontinuities are rather smooth (see Fig. 3).

(c) 1D velocity model 1D velocity models are often used in combination with randomly distributed scatterers to represent the near-source subsurface structure that dominates the high-frequency wave field in the source region. The 1D model we test is the reference model, employed to routinely locate earthquakes for the INGV Seismic Bulletin (Battelli et al. 2013). It consists of two layers over a half-space. The first layer has a thickness of 11.1 km and \(v_{P}\) 5.0 km/s, the second has a thickness of 26.9 km and \(v_{P}\) 6.5 km/s, and the half-space exhibits \(v_{P}\) 8.05 km/s. Shear-wave velocity \(v_{S}\) is calculated with a constant velocity ratio \(v_{P}/v_{S}= 1.732\). Although a 1D model is known to fit long period observations at more distant stations rather poorly, we use the 1D model for the LF computation to link our results with previous studies that used the GMPE approach.
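The layered model just described is small enough to write out explicitly. The following is a minimal sketch, not the authors' implementation: the container layout and the names `LAYERS`, `build_model` and `vs_at_depth` are illustrative assumptions, while the thicknesses, \(v_{P}\) values and the \(v_{P}/v_{S}\) ratio are taken from the text.

```python
# Minimal sketch of the 1D reference model described in the text.
# Data layout and function names are illustrative assumptions.
VP_VS_RATIO = 1.732

# (thickness in km, vp in km/s); thickness None marks the half-space.
LAYERS = [
    (11.1, 5.0),
    (26.9, 6.5),
    (None, 8.05),
]


def build_model(layers, vp_vs=VP_VS_RATIO):
    """Return a list of layers with top depth, vp and the derived vs."""
    model = []
    top = 0.0
    for thickness, vp in layers:
        model.append({"top_km": top, "vp": vp, "vs": vp / vp_vs})
        if thickness is not None:
            top += thickness
    return model


def vs_at_depth(model, depth_km):
    """S-wave velocity (km/s) of the deepest layer whose top is above depth_km."""
    vs = model[0]["vs"]
    for layer in model:
        if depth_km >= layer["top_km"]:
            vs = layer["vs"]
    return vs


MODEL = build_model(LAYERS)
```

With these values the half-space starts at the sum of the two layer thicknesses, and `vs_at_depth(MODEL, 5.0)` returns the first-layer \(v_{S}\) of 5.0/1.732 km/s.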

3 Hybrid Simulation Method

We selected the approach of Mai et al. (2010) to compute the hybrid seismograms. We compute the LF and the HF parts of a seismogram using, in turn, a deterministic and a stochastic approach, and then combine the two components in a broadband synthetic record. Long period seismograms are computed using SPECFEM3D (Peter et al. 2011). For the HF contribution, the method of Mai et al. (2010) allows us to introduce the elastic wave scattering model of Zeng (1993). Mai et al. (2010) considered only multiple S-to-S and single S-to-P conversions based on the models of Zeng et al. (1991) and Sato (1977), respectively. We explicitly consider the full multiple P-to-P, S-to-P, P-to-S and S-to-S energy conversions, which we fine-tuned for each station separately to take into account the heterogeneous scattering properties of the Po Plain. The long period and short period waveforms are combined in the frequency domain at a matching frequency of about 0.8 Hz, where the spectrum of the high frequency seismogram is scaled to match the low frequency spectrum. For an extensive explanation of the hybrid method applied, we refer to the work of Mai et al. (2010) and references therein. In the following, we briefly discuss the methodology of each part of the hybrid procedure.

3.1 Low Frequency Simulation

We simulate low frequency (< 1 Hz) seismograms using the community software SPECFEM3D (e.g., Komatitsch et al. 2005; Peter et al. 2011). It implements the spectral element method to simulate wave propagation in complex 3D media. The spectral element method was introduced by Patera (1984) in fluid dynamics. The approach has also proven very useful for elastic wave propagation in a solid earth medium (Faccioli et al. 1997; Komatitsch and Vilotte 1998). It combines the flexibility of finite element methods with the accuracy of spectral methods. It can accurately handle distorted mesh elements, enabling topography and other major boundaries to be correctly implemented. Komatitsch et al. (2005) described the method in detail for seismology, and Peter et al. (2011) presented the implementation of SPECFEM3D on a Cartesian grid.

We construct the computational mesh using the internal SPECFEM3D mesher. The mesh consists of hexahedral elements with a minimum width of 0.4 km, allowing accurate simulation up to frequencies of 1.0 Hz, with 8 nodes per element, a polynomial degree of 4 and minimum S-wave velocity of 400 m/s. To best reflect the major and well-known velocity boundaries, the ambient noise tomography model elements follow the free surface topography and our implementation of the MAMBo model includes both free surface and Moho topography. Everywhere else, MAMBo is smoothed with a horizontal 2D Gaussian filter (\(\sigma\) = 4 km) to avoid sharp discontinuities that could generate artefacts in the synthetic wavefield. The element size is doubled for depths greater than 8 km, to maintain a similar number of points per wavelength everywhere in the model space. For comparison we also implemented the 1D model described in the previous paragraph.
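The link between element size, minimum S-wave velocity and resolved frequency can be checked with a short calculation. The sketch below assumes the commonly quoted rule of thumb of roughly one degree-4 spectral element per minimum wavelength. That rule and the function names are assumptions for illustration; the text does not state how the mesh was sized.

```python
def max_resolved_frequency(vs_min_m_s, element_size_m, elements_per_wavelength=1.0):
    """Highest frequency (Hz) for which the shortest S wavelength still spans
    `elements_per_wavelength` elements (rule of thumb, assumed here)."""
    return vs_min_m_s / (elements_per_wavelength * element_size_m)


def gll_intervals_per_wavelength(vs_m_s, freq_hz, element_size_m, degree=4):
    """Average number of grid intervals per wavelength for elements of the
    given polynomial degree (each element edge has `degree` intervals)."""
    wavelength_m = vs_m_s / freq_hz
    return wavelength_m / element_size_m * degree
```

For the values given in the text (400 m/s, 0.4 km elements) this yields 1.0 Hz. Once the element size is doubled at depth, the same resolution is kept wherever the S-wave velocity is at least twice the minimum value.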

Because no good 3D model of the quality factor Q is available for the Po Plain, attenuation is implemented following the method of Olsen et al. (2003), who showed that Q depends linearly on the S-wave velocity in the low frequency range tested (< 0.5 Hz). Different values for the Olsen ratio \(Q_{s}/v_{s}\) (with \(v_{s}\) in m/s) were tested. Differences in amplitude were minimal and \(Q_{s}/v_{s}=0.02\ \text{s/m}\) was chosen for the simulation (for details see Figure I in the supplementary material). The chosen Olsen ratio results in Q values ranging from 8.6 for \(v_{s} = 430\, {\text {m/s}}\) to 92 for \(v_{s} = 4600\, {\text {m/s}}\) in the velocity models. After each simulation, the resulting seismograms were convolved with a Brune source time function to account for the source signature of the four selected events. The rise times were determined following Allmann and Shearer (2009).
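The velocity-dependent attenuation rule amounts to a single multiplication. The snippet below is a minimal sketch that reproduces the two end-member Q values quoted in the text; the function name is an illustrative assumption.

```python
OLSEN_RATIO = 0.02  # Q_s per (m/s) of S-wave velocity, value chosen in the text


def qs_from_vs(vs_m_s, ratio=OLSEN_RATIO):
    """Shear quality factor from S-wave velocity in m/s (linear scaling)."""
    return ratio * vs_m_s
```

Evaluating `qs_from_vs` at 430 m/s and 4600 m/s gives the quoted range of about 8.6 to 92.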

3.2 High Frequency Simulation

The high frequency part of the seismogram is modelled by introducing the scattering model of Zeng (1993) in the technique of Mai et al. (2010). First the P- and S-wave arrival times are computed using raytracing of low frequency waves through the MAMBo velocity model. Then, site-specific scattering Green’s functions are computed using local scattering parameters, and, lastly, these Green’s functions are convolved with a Brune source time function to implement the source mechanism. To account for specific site conditions, the local S-wave velocity at each station is used, from the USGS VS30 values, computed from the topographic slope (Allen and Wald 2007).
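The final convolution step can be sketched as follows. This assumes the widely used Brune-type pulse \(s(t) = (t/\tau^{2})\,e^{-t/\tau}\), normalised to unit area. How the characteristic time \(\tau\) relates to the rise times used in this study is not specified here, so `tau` and the helper names should be read as placeholders.

```python
import numpy as np


def brune_stf(dt, tau, n_tau=20.0):
    """Unit-area Brune-type pulse s(t) = t / tau**2 * exp(-t / tau).

    dt: sampling interval (s); tau: characteristic time (s, assumed input);
    n_tau: pulse length in units of tau.
    """
    t = np.arange(0.0, n_tau * tau, dt)
    stf = (t / tau**2) * np.exp(-t / tau)
    return t, stf


def convolve_with_stf(green, stf, dt):
    """Convolve a Green's function with the source pulse, keeping the
    original trace length; dt scales the discrete sum to an integral."""
    return np.convolve(green, stf)[: len(green)] * dt
```

Because the pulse integrates to one, the convolution shapes the spectrum without changing the low-frequency level of the Green's function.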

The elastic scattering model of Zeng (1993) requires four scattering parameters accounting for the energy transfer between modes: \(\eta _{ss}\), \(\eta _{pp}\), \(\eta _{sp}\) and \(\eta _{ps}\). Intrinsic attenuation is represented by \(\eta _s\) and \(\eta _p\). Following Sato (1977), we reduce the number of independent parameters by setting:

$$\begin{aligned} \eta _{ps} = 2 \bigg ( \frac{\alpha }{\beta } \bigg )^{2} \eta _{sp} \end{aligned}$$
(1)

and

$$\begin{aligned} \alpha \eta _{p} = \beta \eta _{s} \end{aligned}$$
(2)

where α and β are the P- and S-wave velocities, respectively. A common practice is to compute these scattering parameters and intrinsic absorption uniformly for a specific region. However, the high amplitude part of the S-coda is dependent on the local physical properties of the rock (see Sato 1977). Because of the heterogeneous character of the Po Plain region, with relatively porous sediments in the basin and denser rock in the southern Alps and Apennines, we determine the independent parameters at each individual station in the Po Plain area where at least 6 events were recorded (see Table S1 in the Supplementary Material). For each station, we manually picked P- and S-wave arrival times for a total of 24 different events occurring between 2011 and 2015, with relatively low magnitude (\(3<M_w<4.5\)) in order to minimise finite source effects. Assuming that the scattering coefficient and the intrinsic absorption coefficient are independent of the wave path, we took the median of the resulting parameters of each event for each station.
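Equations (1) and (2) and the per-station median can be written compactly. The sketch below is illustrative only: the function names and the choice to return `None` for stations with too few events are assumptions, not part of the published procedure.

```python
import statistics


def dependent_parameters(eta_sp, eta_s, alpha, beta):
    """Derive eta_ps (Eq. 1) and eta_p (Eq. 2) from the independent
    parameters; alpha and beta are the P- and S-wave velocities."""
    eta_ps = 2.0 * (alpha / beta) ** 2 * eta_sp
    eta_p = beta * eta_s / alpha
    return eta_ps, eta_p


def station_median(per_event_values, min_events=6):
    """Median of one parameter over all events recorded at a station.

    Returns None when fewer than `min_events` estimates are available,
    mirroring the 6-event threshold mentioned in the text.
    """
    if len(per_event_values) < min_events:
        return None
    return statistics.median(per_event_values)
```

For a Poisson medium, where \(\alpha /\beta = \sqrt{3}\), Eq. (1) reduces to \(\eta _{ps} = 6\,\eta _{sp}\).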

The estimation of the scattering parameters is based on the comparison between observed and synthetic envelopes at several frequency bands: 1–2 Hz, 2–4 Hz, 4–8 Hz and 8–16 Hz. We use a non-linear inversion algorithm (Sambridge 1999) to minimise the misfit function, defined as the L2 norm of the difference between the logarithmic envelopes. During the search, we further constrain the parameter space by rejecting models not satisfying the following conditions

$$\begin{aligned} 0.5< \frac{\alpha (\eta _{p}+\eta _{ps})}{\beta (\eta _{s}+\eta _{sp})} < 3.0 \end{aligned}$$
(3)

based on global observations (Sato et al. 2012). The search range for all scattering parameters is set to [0.0001, 0.1] at all stations. We consider a Poisson medium and choose an average shear wave speed of 2.8 km/s. Finally, the P/S energy radiation ratio is assumed to be 0.05, consistent with a DC source in a Poisson medium (Sato et al. 2012). Figure 4 shows the fit between the observed and synthetic envelopes at one of the stations for a single event, for all four frequency bands.
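The misfit and the rejection rule of Eq. (3) can be sketched as below. Several details are assumptions made for illustration: the logarithm base (base 10 here), the small `eps` guard against zero envelope values, and all function names. The inversion algorithm itself (Sambridge 1999) is not reproduced.

```python
import numpy as np


def log_envelope_misfit(obs_env, syn_env, eps=1e-20):
    """L2 norm of the difference between logarithmic envelopes."""
    obs = np.asarray(obs_env, dtype=float)
    syn = np.asarray(syn_env, dtype=float)
    diff = np.log10(obs + eps) - np.log10(syn + eps)
    return float(np.sqrt(np.sum(diff**2)))


def satisfies_constraint(eta_p, eta_ps, eta_s, eta_sp, alpha_over_beta=3.0**0.5):
    """Eq. (3): keep models with 0.5 < alpha(eta_p + eta_ps) /
    (beta(eta_s + eta_sp)) < 3.0; the default ratio is a Poisson medium."""
    ratio = alpha_over_beta * (eta_p + eta_ps) / (eta_s + eta_sp)
    return bool(0.5 < ratio < 3.0)


def in_search_range(*etas, lo=1e-4, hi=0.1):
    """True if every parameter lies inside the search interval [lo, hi]."""
    return all(lo <= eta <= hi for eta in etas)
```

A candidate model would be evaluated only if both `in_search_range` and `satisfies_constraint` hold, and then ranked by `log_envelope_misfit` in each frequency band.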

In contrast to other similar studies (e.g. Przybilla et al. 2009; Jing et al. 2014; Zeng 2017), we not only invert envelopes for each station separately, but also limit the maximum lapse time to twice the S-wave arrival time. Our decision is based on two arguments. First, our main goal is not to derive the best scattering parameters for the crust but rather to reproduce, as closely as possible, the observed envelopes where the largest amount of energy is concentrated. Typically, at short and intermediate epicentral distances, this occurs between the direct P-wave and the early coda waves. Second, in a sedimentary basin such as the Po Plain, we expect that a relevant amount of the coda at larger lapse times and at low to moderate frequencies may consist of scattered surface waves, which are not considered in the scattering model of Zeng (1993).

Based on the obtained scattering and intrinsic attenuation parameters (see Table S1 in the Supplementary Material), we computed for each station a set of 8 high-frequency synthetics by changing initial seed numbers at each iteration in order to reflect different distributions of the velocity heterogeneities. These were combined with the low frequency seismograms in the frequency domain at a frequency of 0.8 Hz. The final goodness of fit (GOF) scores for peak ground velocity (PGV) and duration (shown in Sect. 4) have been computed from the hybrid seismograms by averaging the GOF scores of the 8 simulations.
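The combination step can be illustrated with a deliberately simplified frequency-domain merge. This is a sketch under stated assumptions, not the procedure of Mai et al. (2010): it uses a hard crossover at the matching frequency, scales the HF spectrum by the ratio of mean spectral amplitudes in a narrow band around that frequency, and assumes both traces share the same length and sampling. The band half-width and the function name are hypothetical.

```python
import numpy as np


def merge_lf_hf(lf, hf, dt, f_match=0.8, half_width=0.1):
    """Merge equal-length LF and HF traces in the frequency domain.

    The HF spectrum is scaled so that its mean amplitude in the band
    f_match +/- half_width equals that of the LF spectrum; the hybrid
    spectrum takes LF below f_match and scaled HF at and above it.
    Returns the hybrid time series and the scale factor.
    """
    n = len(lf)
    freqs = np.fft.rfftfreq(n, d=dt)
    lf_spec = np.fft.rfft(lf)
    hf_spec = np.fft.rfft(hf)

    band = np.abs(freqs - f_match) <= half_width
    lf_level = np.mean(np.abs(lf_spec[band]))
    hf_level = np.mean(np.abs(hf_spec[band]))
    scale = lf_level / hf_level if hf_level > 0 else 1.0

    hybrid_spec = np.where(freqs < f_match, lf_spec, scale * hf_spec)
    return np.fft.irfft(hybrid_spec, n=n), scale
```

With 8 HF realisations per station, the function would be called once per realisation and the resulting GOF scores averaged. A smooth taper around the matching frequency would reduce the spectral step that a hard crossover can introduce.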

4 Validation of Results

To better understand the difference in performance between the two 3D models (MAMBo and the ANT model) and the 1D model, we simulate the May 29th 2012 Emilia event as a point source at imaginary receivers evenly spaced on a section crossing the basin South to North (see Fig. 5). Observed seismograms from five stations along this section are shown for reference. The section simulated with MAMBo (Fig. 5b) clearly shows the amplification effect of the basin. The amplitude of surface waves is larger inside than outside the basin, and the duration of the shaking is significantly longer. Moreover, the P- and S-wave arrival times coincide closely with those of the observed seismograms. On the other hand, the waves simulated with the ANT model (Fig. 5c) seem to arrive with a delay of a few seconds. More significantly, the ANT model is not able to predict the full amplification: amplitudes are smaller and the duration in the basin is shorter than shown by observed seismograms. The main reason is that the ANT model does not have an explicit description of the sharp velocity discontinuities within and at the bottom of the basin. The 1D model (Fig. 5d) shows the importance of a 3D model for simulation in the Po Plain. Although the arrival times are relatively accurate, the 1D model is unable to create amplification in the basin, resulting in waveforms that are too short and surface waves that are too low in amplitude. Out of the three tested models, the MAMBo model clearly shows the best performance, as it is the only model able to provide a long period fit to the observations. Although the other two models might match the observed amplitude, it is clear that they will not fit the recorded full waveform and especially the ground motion duration. Consequently, we used the MAMBo model for all further calculations.

Although visual inspection gives a good first impression of the quality of the synthetic seismograms, it becomes less practical at higher frequencies. We are not yet able to fit each wiggle of the observed seismogram with the synthetic one. A quantitative comparison between observations and predictions is nevertheless necessary to assess the accuracy of our broadband simulations. Different validation methods for synthetic waveforms have been brought forward (Kristeková et al. 2006; Anderson 2004; Olsen and Mayhew 2010).

In principle, we follow the method proposed by Anderson (2004) for assessing the goodness of fit (GOF) of broadband synthetics with respect to engineering applications. With this GOF measure, a score of over 8 represents an excellent fit, a score of 6–8 a good fit, 4–6 a fair fit, and scores below 4 denote a poor fit. The two main criteria we employ to check the GOF are peak ground velocity (PGV) and duration. Following Olsen and Mayhew (2010), the calculation of the GOF score in this study diverges slightly from the original formula proposed by Anderson (2004) (for details see Figure S2 in the Supplementary Material). We define the PGV GOF score as:

$$\begin{aligned} S_{\text {PGV}} = 10 \cdot \text {erfc} \bigg ( \bigg [ \frac{2 (V_{1}-V_{2})}{V_{1}+V_{2}} \bigg ] \bigg ) \quad \text {where} \quad V_{i}=\max |v_{i}(t)| \end{aligned}$$
(4)

where \(v_{1}(t)\) and \(v_{2}(t)\) are the synthetic and observed velocity time series.
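The score of Eq. (4) and the quality classes can be evaluated as in the sketch below. Two points are assumptions rather than statements of the published formula: the normalised PGV difference is taken in absolute value so that the score stays between 0 and 10 regardless of which trace has the larger peak, and the handling of scores exactly on a class boundary is a choice made here.

```python
import math


def pgv_gof(v_syn, v_obs):
    """PGV goodness-of-fit score on a 0-10 scale.

    v_syn, v_obs: sequences of synthetic and observed velocity samples.
    Assumption: the absolute value of the normalised difference is used.
    """
    v1 = max(abs(x) for x in v_syn)
    v2 = max(abs(x) for x in v_obs)
    return 10.0 * math.erfc(abs(2.0 * (v1 - v2) / (v1 + v2)))


def gof_class(score):
    """Map a score to the quality classes quoted in the text
    (boundary handling is an assumption)."""
    if score > 8:
        return "excellent"
    if score >= 6:
        return "good"
    if score >= 4:
        return "fair"
    return "poor"
```

Identical peak amplitudes give a score of 10, while a factor-of-two difference in PGV already falls into the poor class.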

The GOF score for duration is again calculated somewhat differently than Anderson (2004) to better account for later arriving waves. The durations are computed following the method of Novikova and Trifunac (1995) and the same scaling as for the PGV GOF score (see Eq. 4) is applied.

Example synthetic broadband seismograms accompanied by their Fourier spectra and the GOF scores for PGV and duration of all simulated stations, computed with the MAMBo model, are shown in Figs. 7, 8, 9 and 10 for the 2013 event, 2012 Emilia event, the June 2012 event and the 2017 event respectively.

As the 2012 Emilia event had magnitude \(M_w=5.6\), a finite fault model seems needed in order to take into account directivity and other effects of the source. Here we implement the finite fault model of Paolucci et al. (2015). The same event was simulated using a point source by Molinari et al. (2015). Their maximum amplitude of shaking is higher, due to a difference in the choice of the moment tensor solution, where we chose the TDMT solution of INGV to be consistent with the other three simulated events. Besides the difference in amplitude, the shaking seems to be more directed towards the west and the south (Fig. 6) when a finite fault source is applied. In general, the improvement in the fit between observed and synthetic PGV and duration is limited. The fit of the seismograms for stations in the proximity of the finite source is in general relatively poor, although GOF scores for duration are fair (Fig. 7). This indicates that we are lacking specific information for this particular event, either in the description of the source, in the velocity model around the hypocenter, or most probably both.

The records (Figs. 7, 8, 9 and 10) show different characteristics for each event, resulting from the difference in location and/or magnitude. However, some general trends in the data fit are visible for all four events.

First of all, for all four events the amplitudes are often overestimated. This is most prominent for coda waves at stations located in the central and westernmost part of the basin. Larger amplitude surface waves arise in the long period simulation. The Fourier spectra of these stations show that the high frequency part of the synthetic waveforms in general follows the decaying trend of the amplitude with increasing frequency. However, due to scaling of the high frequency to the low frequency amplitude spectra, the synthetic amplitudes are mostly overestimated. Thus, the difference between synthetic and observed PGV for most of these stations originates in the long period simulation. The MAMBo model seems to overamplify surface waves in the basin, especially in the western part.

More realistic amplitudes result in the northern part of the basin. For stations SANR and TEOL, the fit to observed waveforms seems relatively good, both in amplitude and duration. Finally, stations located in the Apennines south of the basin generally show quite poor GOF scores for PGV, with the exception of the 2013 event, located in the northern Apennines. This is explained by the fact that MAMBo is a 1D model outside the basin, and clearly lacks first order 3D velocity structures here.

The GOF scores for duration are more promising, showing fair to excellent GOF scores for most simulated stations. Throughout the Po Plain basin (excluding southern Alps and northern Apennines), and for all the events, the duration of shaking is comparable to the observed seismograms. Note again that this fit in duration directly relates to the duration simulated by the low frequency simulation. As the focus in the high frequency simulation was on the fit of the high energy part of the waveform, the amplitude distribution (and thus duration) of the coda waves is determined by the low frequency part of the spectrum.

5 Discussion

To better visualize the performance of the MAMBo model for the different events, Fig. 11 shows the mean GOF score for stations which were simulated for 3 or 4 events and for which the GOF scores differ by less than 2 (for details see Figure S3 in the supplementary material). Only a limited number of stations satisfy these conditions. For the mean GOF score of the PGV, most stations located in the basin do not qualify. Apparently the source location has a large effect on how well the waveforms are simulated for each station. It is noteworthy, however, that two stations with a very poor mean GOF score are located along the edge of the sedimentary basin, where potentially misrepresented lateral variation in the sediment thickness has the strongest effects. The mean GOF score of duration is represented by more stations in the basin and, as we have already concluded for the individual events, the mean GOF scores in the basin show a good fit.
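The station selection behind the mean scores can be expressed in a few lines. The sketch below is illustrative: the function name, the dictionary layout and the station codes in the usage note are hypothetical, while the thresholds (at least 3 events, score spread below 2) follow the text.

```python
def mean_gof_per_station(scores_by_station, min_events=3, max_spread=2.0):
    """Mean GOF score for stations with enough events and consistent scores.

    scores_by_station: dict mapping a station code to a list of GOF scores,
    one per simulated event. Stations with fewer than `min_events` scores,
    or whose scores differ by `max_spread` or more, are dropped.
    """
    result = {}
    for station, scores in scores_by_station.items():
        if len(scores) < min_events:
            continue
        if max(scores) - min(scores) >= max_spread:
            continue
        result[station] = sum(scores) / len(scores)
    return result
```

For example, with hypothetical input `{"STA1": [6.0, 7.0, 6.5], "STA2": [2.0, 8.0, 5.0]}`, only `STA1` is retained because the scores at `STA2` vary too strongly between events.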

It is clear that in general the simulations are able to obtain higher GOF scores for duration than for PGV. As the maximum shaking amplitude is traditionally computed using GMPEs, we test the performance of our simulation in terms of prediction of the PGV against the GMPE set ITA10 (Bindi et al. 2011). We compare the synthetic PGV of stations located in the sedimentary basin and the PGV computed with the GMPE with observed data, for the 2012 Emilia event, the 2013 event, the June 2012 event and the 2017 event (Fig. 12). We calculate the Root Mean Square (RMS) difference between data and either ITA10 or our hybrid synthetics. ITA10 derives peak amplitudes for the geometrical mean of the horizontal components (GeoH) and the vertical component for periods between 0.04 and 2 s (Bindi et al. 2011). The geometrical mean of the components is calculated using \(G(Y_{ew},Y_{ns})=\sqrt{Y_{ew} Y_{ns}}\), where Y is a scalar ground motion parameter, ew the east-west component and ns the north-south component. The GMPE for soil class B (according to the EC8 classes) was chosen, as the stations in the basin generally have class A–C: A for stations on rock, just at the basin edges; B for stations on very dense sediments; and C for stations on dense sediments.
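The two quantities used in this comparison can be sketched as follows. The text does not state whether the RMS difference is taken on linear or logarithmic amplitudes, so the `use_log10` switch (and its default) is an assumption, as are the function names.

```python
import math


def geometric_mean_horizontal(y_ew, y_ns):
    """Geometric mean of the east-west and north-south values of a
    scalar ground motion parameter: G = sqrt(Y_ew * Y_ns)."""
    return math.sqrt(y_ew * y_ns)


def rms_difference(observed, predicted, use_log10=True):
    """Root mean square difference between observed and predicted values.

    With use_log10=True (an assumption) residuals are computed on
    log10 amplitudes, which requires strictly positive inputs.
    """
    residuals = []
    for obs, pred in zip(observed, predicted):
        if use_log10:
            residuals.append(math.log10(obs) - math.log10(pred))
        else:
            residuals.append(obs - pred)
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))
```

The same `rms_difference` call would be made twice per event and component, once with the ITA10 predictions and once with the hybrid synthetics, so that the two RMS values are directly comparable.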

As a general observation, the synthetic broadband PGV follows the trend of the data and of the GMPE, although in places it under- or over-predicts both horizontal and vertical components. In particular, for the 2013 event, observed peak amplitudes fall within one standard deviation of ITA10, meaning that the GMPE is able to estimate the observed PGV within error estimates. The synthetic seismograms overestimate the PGV, especially at large distances. RMS values for ITA10 are lower than for the synthetics for both horizontal and vertical components. Similar observations can be made for the June 2012 event. For the 2012 Emilia event, ITA10 seems to slightly overestimate the PGV. The synthetic seismograms tend to better estimate the amplitudes for the vertical components, having a lower RMS than ITA10. For the 2017 event, both ITA10 and the synthetics seem to overestimate amplitudes at stations at larger distance.

Overall, especially for the lower-magnitude events, synthetic seismograms overestimate the amplitude of shaking and ITA10 GMPEs result in better estimates of PGV. This over-estimation was documented before in the broadband seismograms and the PGV GOF scores (Figs. 7, 8, 9 and 10). Both in terms of PGV and duration, the success of the simulation heavily depends on the low frequency part and thus on the 3D velocity model used. It seems that MAMBo is able to estimate the shaking duration quite accurately, but it is unable to completely reproduce the correct amplification of waves throughout the basin. Still, out of all tested velocity models, it showed the best performance for the low frequency simulations. As noted before, the success of this model over the ambient noise tomography model most likely results from the presence of well-defined geometries of first-order velocity discontinuities. However, we may expect that the ambient noise velocity model carries more accurate information about S-wave velocity throughout the basin. Comparing the two models (Fig. 3), in the ambient noise tomography model sediments with velocity \(v_{S} \approx 3.0\) km/s reach a much larger depth than in MAMBo. This indicates that the different sedimentary layers in the MAMBo model possibly do not have accurate thicknesses and/or velocities, especially in the center of the basin.

Overestimation of the PGV is also visible for the synthetics of the 2012 Emilia event (Fig. 12). Here, however, the synthetics reproduce the amplitude of shaking better than the GMPE, especially for the vertical component at large distances. It has been mentioned before that GMPEs are not able to accurately estimate amplitudes for the 2012 Emilia events. For this larger event our hybrid broadband simulation gives better predictions than the GMPE, which points to the importance of accounting for the 3D geometry of the basin and the directivity effects of the source.

Fig. 1

Workflow of the hybrid method applied in this work, based on the method described in Mai et al. (2010)

Fig. 2

In red, the locations of the four simulated earthquakes (see Table 1) and their focal mechanisms. In grey, seismic events that occurred in the period 2005–2015 with magnitudes larger than 1.5 (Chiarabba and De Gori 2016). The red rectangle defines the modelled space

Fig. 3

A N–S section, at longitude 10.55, of the ambient noise tomography (ANT) velocity model (top) and the smoothed MAMBo model (bottom). The MAMBo model contains a better-defined topography of layer interfaces. The ANT model contains more accurate volume-averaged S-wave velocities, but first-order velocity discontinuities are less well resolved

Fig. 4

An example of the envelope fitting for four different frequency bands to obtain the scattering parameters (see Table I in the Supplementary Material). The example shown is the envelope of the waveform at station IV.BAG8 for an event of magnitude 3.6 that occurred in the Modena region on May 29th 2012 at 07:13. A maximum lapse time of only twice the S-wave arrival time is chosen to focus on reproducing the observed envelope where the largest amount of energy is concentrated (for details see text)

Fig. 5

Synthetic and observed transverse-component seismograms of the 2012 Emilia event, simulated as a point source, on (a) a SW–NE section, for (b) the MAMBo model, (c) the ambient noise tomography (ANT) model and (d) the 1D model. Synthetic seismograms in red, observed in blue; the black stripes represent the P-wave arrival. The ANT model results in amplitudes that are generally too low, and waves arrive with a delay of a few seconds. The 1D model gives a good fit for body-wave arrivals and a very poor fit for long-period reverberations. Synthetics computed with the MAMBo model show a fairly good fit with observations

Fig. 6

Peak ground velocity (cm/s) predicted by the 3D MAMBo model and the finite fault solution from Paolucci et al. (2015) for periods T > 0.5 s for the \(M_w\) 5.63 earthquake (TDMT solution, http://terremoti.ingv.it/en/tdmt, accessed September 2018) that occurred on May 29th 2012 (maximum of 1.442 cm/s)

Fig. 7

Simulation results of the 2012 Emilia event using the MAMBo velocity model and a finite fault source. Top: broadband seismograms (filtered 0.1–10 Hz) and the corresponding Fourier spectra. Bottom: the average GOF scores for PGV and duration for 8 high frequency simulations with different random scatter distributions at the simulated stations

Fig. 8

Simulation results of the 2013 event using the MAMBo velocity model. Top: broadband seismograms (filtered 0.1–10 Hz) and the corresponding Fourier spectra. Bottom: the average GOF scores for PGV and duration for 8 high frequency simulations with different random scatter distributions at the simulated stations

Fig. 9

Simulation results of the June 2012 event using the MAMBo velocity model. Top: broadband seismograms (filtered 0.1–10 Hz) and the corresponding Fourier spectra. Bottom: the average GOF scores for PGV and duration for 8 high frequency simulations with different random scatter distributions at the simulated stations

Fig. 10

Simulation results of the 2017 event using the MAMBo velocity model. Top: broadband seismograms (filtered 0.1–10 Hz) and the corresponding Fourier spectra. Bottom: the average GOF scores for PGV and duration for 8 high frequency simulations with different random scatter distributions at the simulated stations

Fig. 11

Mean GOF scores for each station that was simulated for 3 or 4 events and for which the maximum difference in GOF score between simulated events is less than 2 (see Figure III in the Supplementary Material for the maximum differences for each station)

Fig. 12

Comparison of the peak ground velocity recorded by the stations in the Po Plain, the PGV predicted by the GMPE ITA10 (Bindi et al. 2011) for soil class B with minus and plus one standard deviation (light grey area), and the PGV from our broadband synthetic seismograms for: (a) the May 29th 2012 Emilia \(M_w=5.6\) event; (b) the June 30th 2013 \(M_w=4.5\) event; (c) the June 3rd 2012 \(M_w=4.7\) event; (d) the November 19th 2017 \(M_w=4.4\) event. In each panel, the left plot shows the PGV from the geometric mean of the horizontal components and the right plot the PGV on the vertical component

6 Conclusions

Using the hybrid broadband method discussed here, we are able to simulate full waveforms at stations throughout the complete Po Plain basin. Synthetic high frequency seismograms can be simulated accurately with a relatively simple scattering model, provided that local scattering parameters are introduced for each station. However, because the high frequency spectrum is scaled to match the low frequency spectrum, the duration of shaking and the PGV are primarily determined by the low frequency simulation. While the duration of shaking is well reproduced, amplitudes are overestimated in most parts of the basin, especially in the western part. This overestimation is mostly due to incomplete or inaccurate knowledge of the source parameters and to inaccuracy of the velocity model used.
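Why the low frequency simulation controls the broadband amplitude can be seen in a deliberately simplified sketch of spectral scaling. It is not the workflow of Fig. 1: it matches amplitudes only (no phase matching), uses a hard crossover instead of smooth filters, and the function name, the matching frequency `f_match` and the band `half_width` are illustrative assumptions.

```python
import numpy as np


def merge_broadband(lf, hf, dt, f_match=1.0, half_width=0.2):
    """Scale the high frequency trace to the low frequency trace near f_match,
    then keep the LF spectrum below f_match and the scaled HF spectrum above it.

    Simplified illustration only: amplitude-only matching, hard crossover.
    Both traces must have the same length and sampling interval dt (s).
    """
    lf = np.asarray(lf, dtype=float)
    hf = np.asarray(hf, dtype=float)
    n = len(lf)
    freqs = np.fft.rfftfreq(n, d=dt)
    spec_lf = np.fft.rfft(lf)
    spec_hf = np.fft.rfft(hf)

    # Ratio of mean spectral amplitudes in a narrow band around f_match.
    band = np.abs(freqs - f_match) <= half_width
    scale = np.mean(np.abs(spec_lf[band])) / np.mean(np.abs(spec_hf[band]))

    # LF spectrum below the matching frequency, scaled HF spectrum above it.
    spec_bb = np.where(freqs < f_match, spec_lf, scale * spec_hf)
    return np.fft.irfft(spec_bb, n=n), scale


# Toy traces: 10 s at 100 Hz sampling, both with energy at the 1 Hz matching frequency.
dt = 0.01
t = np.arange(1000) * dt
lf = 2.0 * np.sin(2 * np.pi * 0.5 * t) + 1.0 * np.sin(2 * np.pi * 1.0 * t)
hf = 0.1 * np.sin(2 * np.pi * 1.0 * t) + 0.1 * np.sin(2 * np.pi * 5.0 * t)

bb, scale = merge_broadband(lf, hf, dt)
bb_doubled, scale_doubled = merge_broadband(2.0 * lf, hf, dt)
```

In this toy case, doubling the low frequency trace doubles the scale factor and therefore the whole broadband trace, including its high frequency part. Any amplitude error in the low frequency simulation thus propagates to the broadband PGV.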

Adjusting the high frequency simulation so that the amplitudes of the seismograms are simulated correctly, without the need to scale the high frequency spectrum to the low frequency spectrum, should increase the accuracy of synthetic broadband seismograms. Moreover, using a more complex but complete scattering model would result in scattering parameters that reflect the local scattering properties of the rock. This would allow us to simulate high frequency waves throughout the whole basin, by using scattering parameters obtained by interpolation between seismic stations, considering local site conditions.

At present, however, the largest improvement is to be gained in the low frequency part. Our results confirm the importance of accurate knowledge about the source and the 3D S-wave velocity field and, moreover, the importance of an appropriate geometry of first-order velocity discontinuities. The results obtained using MAMBo show the best correlation with observed seismograms. This shows that the 3D geometry of the top of the basement is of pivotal importance for achieving a good fit to observations in the far field. Further improvements may be expected from a 3D model that updates MAMBo with the velocity information from the ANT model.