1 Introduction

In some early ground-motion models (GMMs), an earthquake recording site was simply classified as either rock or soil (e.g., Si and Midorikawa 1999). Later models characterized sites explicitly, first through a piecewise site classification scheme (e.g., Atkinson and Boore 2003) and, more recently, through a continuous site proxy (e.g., Boore et al. 2014). The time-averaged shear-wave velocity in the first 30 m, VS30 (Borcherdt 1994), is undoubtedly the most widely used site or site-class delineator.

However, many studies have shown that VS30 alone is not adequate to distinguish the site effects at one site from those at another (e.g., Gallipoli and Mucciarelli 2009; Kotha et al. 2018). Thus, efforts have been made to search for site proxies (or combinations of proxies) that are alternative or complementary to VS30. Site period T0 is used for site classification in engineering design practice in Japan (Japan Road Association 1990). Zhao et al. (2006) proposed using T0 and horizontal-to-vertical (H/V) response spectral ratios over a wide period range to classify K-NET stations in Japan and developed a GMM using the period-based classification scheme. Later, Zhao and Xu (2013) found that T0 was better than VS30 for soft soil sites (T0 > 0.6 s) at long periods (above 1.0 s), whereas for other site classes and spectral period bands the two proxies performed equally well.

Cadet et al. (2012) compared the misfits of five single site proxies, namely VSz (shear-wave velocity averaged to a depth z, with z = 5, 10, 20, and 30 m) and T0, and four site proxy pairs (T0, VSz) in modelling KiK-net surface-to-borehole amplification. T0 and (T0, VS30) were found to be the best single proxy and proxy pair, respectively. Régnier et al. (2014) concluded that, in addition to VS30, T0 could reduce the site-to-site amplification variability of KiK-net sites in deep sedimentary basins within a specific VS30 class. Applying a neural network approach to KiK-net data, Derras et al. (2017) considered VS30 and T0 to be the best single proxies for periods below and above 0.6 s, respectively.

Outside Japan, McVerry (2011) reported that, in New Zealand, site effects could be better characterized using T0 than using VS30 for oscillator periods of 0.5 s or longer. Hassani and Atkinson (2018a) concluded that T0 was better than VS30 in parameterizing sites in Central and Eastern North America. Meanwhile, for sites in California, Hassani and Atkinson (2018b) achieved an average 5% further reduction in the standard deviation of residuals by including T0 after accounting for the site effects associated with VS30 and Z1.0 (depth to VS = 1.0 km/s). For recording stations in Italy, Luzi et al. (2011) obtained a significant reduction in standard deviation when VS30 and T0 were used together, compared with a VS30-based site classification. In addition to these empirical investigations, Stambouli et al. (2017) utilized the proxy pair (VS30, T0) to model numerical site responses of hundreds of global soil columns and achieved a reduction in intersite variance of over 60% compared with a model without a site term.

Several previous studies (e.g., Zhao and Xu 2013) focused on searching for a better alternative to VS30. However, because site amplification is period-dependent, no single proxy performs best at all oscillator periods. Thus, to improve site amplification estimates over the whole period range of engineering interest, it is more viable to use a combination of site proxies than a single predictor variable. Meanwhile, the effort devoted to measuring or inferring VS30 over the past decades, as well as the established status of VS30 in current seismic regulations, renders the idea of replacing VS30 unappealing. Therefore, to reduce the uncertainty associated with site characterization, it is deemed practical to find an additional, easy-to-measure site proxy that is complementary to VS30 and characterizes the site effects that VS30 alone cannot capture.

Site period T0, as well as site depths Z0.8 and Z1.0, are all potential additional parameters, where Z0.8 and Z1.0 are the depths in meters to isosurfaces having shear-wave velocities of 0.8 and 1.0 km/s, respectively. The question then arises as to which parameter is the most suitable proxy secondary to VS30 in modelling site response. T0 can be reliably obtained for many sites at relatively low cost using the horizontal-to-vertical (H/V) spectral ratio (HVSR) technique (Nakamura 1989; Lermo and Chávez-García 1993) and thus was proposed by Pitilakis et al. (2004, 2013, 2018) as a parameter for site classification, specifically in Eurocode 8. However, the NGA-West2 GMMs (e.g., Boore et al. 2014, hereafter referred to as BSSA14) chose site depth as the secondary proxy, and there is no unanimous answer as to which depth parameter should be utilized. For instance, the revised European building code Eurocode 8 proposes to include Z0.8, whereas BSSA14 opted for Z1.0 in their site terms.

Therefore, we dedicate this article to the search for the best-performing site proxies, both alternative and complementary to VS30, in characterizing linear site response. We first present the KiK-net dataset used in this study, including site data and ground-motion data. We then describe the techniques used to evaluate site effects, namely the response-, Fourier- and cross-spectral ratio approaches, and compare the site amplifications (AF) derived using these methods. This is followed by residual analyses in which site amplification is modelled as a function of various site proxies or their combinations. The performance of a site proxy or a combination is gauged by the standard deviation of the residuals between the observed amplification and the amplification predicted from the proxy or the combination.

2 Data selection

We use a KiK-net database which consists of about 157,000 ground-motion time-series recorded between October 1997 and December 2011. These records were processed by Dawood et al. (2016) following a stringent fully-automated processing protocol. Data processing was elaborated by Dawood et al. (2016), and thus we only briefly introduce the procedure.

Time-series were first baseline corrected, tapered and zero-padded. Then, a high-pass corner frequency fc (0.04 Hz) was pre-selected to filter a record (three components for the surface station and three for the borehole station) using a high-pass, acausal, fourth-order Butterworth filter. The selection of fc was confirmed if the filtered record passed checks on its final displacement, the ratio between the final and the maximum displacements, the slope of the trailing portion of the displacement and velocity time-histories, and the trend of the smoothed Fourier amplitude spectrum. Otherwise, the filtering was repeated with a higher fc until a suitable fc was found. Each component was then checked for its signal-to-noise ratio (SNR ≥ 3) in the frequency range between 2fc and 30 Hz and was flagged according to whether it passed this check. For more details about data processing, readers are referred to Dawood et al. (2016).

From the processed database, we select ground motions with a rupture distance (Rrup) of up to 400 km, with fc no higher than 0.12 Hz, passing the signal-to-noise ratio (SNR ≥ 3) check, and from earthquakes with a moment magnitude (Mw) between 3.0 and 8.0. Thus, all selected records have a maximum usable period of at least 1/(2fc) = 4.17 s. Some parameters are correlated, e.g., short-period rock-site spectral amplitude with source distance, and long-period rock-site spectral amplitude with earthquake magnitude. However, magnitude and source distance exert little influence on site amplification if the effect of nonlinear response is represented by a function of rock-site spectral values (Bazzurro and Cornell 2004). Thus, within the linear domain, site effects are often treated as independent of earthquake magnitude and source-to-site distance in GMMs.

Therefore, to minimize the potential influence of magnitude and distance on site response, we limit our research to linear amplification. This is achieved by further selecting ground motions that are not significantly affected by soil nonlinearity, as judged from the induced shear strain. Fujimoto and Midorikawa (2006) defined the minimum level of shear strain for a possible nonlinear response to be \( 3 \times 10^{-4} \). The maximum shear strain within a velocity profile can be estimated using:

$$ \gamma_{eff}^{'} = 0.4PGV/V_{S30} $$
(1)

where \( \gamma_{eff}^{'} \) is the maximum shear strain at a recording site, and PGV is the peak ground velocity (m/s). Ground motions with \( \gamma_{eff}^{'} \) exceeding \( 3 \times 10^{-4} \) are excluded from our dataset (Fig. 1a). Thus, we can assume a linear site response in this study.
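The screening in Eq. 1 can be illustrated with a minimal sketch. The record identifiers and the PGV and VS30 values below are hypothetical, and VS30 is assumed to be in m/s so that the strain proxy is dimensionless.

```python
STRAIN_LIMIT = 3e-4  # threshold quoted in the text (Fujimoto and Midorikawa 2006)


def shear_strain_index(pgv_m_per_s: float, vs30_m_per_s: float) -> float:
    """Eq. 1: strain proxy = 0.4 * PGV / VS30 (both in m/s, result dimensionless)."""
    if vs30_m_per_s <= 0:
        raise ValueError("VS30 must be positive")
    return 0.4 * pgv_m_per_s / vs30_m_per_s


def is_linear(pgv_m_per_s: float, vs30_m_per_s: float,
              limit: float = STRAIN_LIMIT) -> bool:
    """Keep a record only if its strain proxy does not exceed the limit."""
    return shear_strain_index(pgv_m_per_s, vs30_m_per_s) <= limit


# Hypothetical records, for illustration only.
records = [
    {"id": "rec-A", "pgv": 0.05, "vs30": 400.0},  # 0.4 * 0.05 / 400 = 5.0e-5
    {"id": "rec-B", "pgv": 0.50, "vs30": 300.0},  # 0.4 * 0.50 / 300 = 6.7e-4
]
kept = [r["id"] for r in records if is_linear(r["pgv"], r["vs30"])]
```

In this example only the first record would be retained for the linear analysis.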

Fig. 1 (a) Nonlinearity check; (b) site conditions of borehole stations

In this study, site amplification at a surface station is referenced to its borehole sensor which is situated in layers at different depths and with various shear-wave velocities for different stations (Fig. 1b). To minimize the influence of inhomogeneous reference site condition on surface-to-borehole spectral ratios, we only utilize KiK-net sites with a shear-wave velocity at borehole (VS,hole) above 800 m/s (Régnier et al. 2014). As shown in Fig. 1b, selected recording stations have VS,hole in the range between 800 and 3300 m/s and are at least 100 m below the ground surface. As demonstrated by Oth et al. (2011), only a few KiK-net velocity profiles have an abrupt velocity contrast below the depth of 100 m, and on average site effects of velocity structures deeper than 100 m are insignificant. Thus, we do not correct the borehole records to a common depth.

For a site with multiple recordings, the intra-site (within-site) amplification variability can be partially attributed to the lateral inhomogeneity of near- and/or sub-surface velocity structures. We therefore only use stations that recorded at least three seismic events. The surface-to-borehole spectral ratios are then averaged over all records at a site to minimize the intra-site variability associated with the azimuthal effects of incoming waves on site response (Field et al. 1992). The deviation of the observed amplification at a specific site from the median amplification predicted using a site proxy or proxies is then treated as the site-to-site (inter-site) residual, which arises from the inadequacy of the site proxy or proxies, e.g., VS30, (VS30, T0) or (VS30, Z0.8), in characterizing a site (e.g., Al Atik et al. 2010). The standard deviation of amplification residuals is taken as the inter-site standard deviation, which represents site-to-site amplification variability.

Though inter-site and intra-site residuals can be partitioned using a mixed-effect model (Abrahamson and Youngs 1992), we do not adopt this approach herein for two reasons. First, the inter-site and intra-site residuals cannot be completely separated from each other even when a mixed-effect model is used (Zhao and Xu 2013). More importantly, the purpose of this study is to evaluate the performance of various site proxies in modelling site effects, and a good proxy or proxy combination is supposed to reduce both the total and inter-site variabilities, as shown by Derras et al. (2017). Therefore, a stringent partition between inter- and intra-site standard deviations is not implemented in this study.

Finally, 1840 ground motions are selected in this study. The distribution of selected records in Mw-Rrup space is presented in Fig. 2a, and the number of recordings at each period and at each station is illustrated in Fig. 2b and c, respectively. Average velocities (VSz) and site depths (Z0.8 and Z1.0) of recording stations are derived from available one-dimensional (1D) velocity profiles, which are established from downhole PS-logging (Aoi et al. 2000; Okada et al. 2004). Though these PS-logging data might be unreliable at some sites (Kawase and Matsuo 2004; Pilz and Cotton 2019), they are the best information publicly available at this stage.
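As an illustration of how such parameters can be read off a layered 1D profile, the sketch below computes a time-averaged velocity VSz and a depth to a target velocity. The profile values are hypothetical, and taking the site depth as the top of the first layer whose velocity reaches the target is an assumption of this sketch, not necessarily the procedure applied to the PS-logging data.

```python
def time_averaged_vs(thickness_m, vs_m_per_s, z_m):
    """Time-averaged shear-wave velocity to depth z: z / sum(h_i / v_i)."""
    travel_time = 0.0
    remaining = z_m
    for h, v in zip(thickness_m, vs_m_per_s):
        used = min(h, remaining)      # part of this layer that lies above depth z
        travel_time += used / v
        remaining -= used
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("velocity profile is shallower than the requested depth")
    return z_m / travel_time


def depth_to_velocity(thickness_m, vs_m_per_s, target_m_per_s):
    """Depth (m) to the top of the first layer with Vs >= target; None if never reached."""
    depth = 0.0
    for h, v in zip(thickness_m, vs_m_per_s):
        if v >= target_m_per_s:
            return depth
        depth += h
    return None


# Hypothetical three-layer profile (layer thickness in m, Vs in m/s).
thickness = [10.0, 20.0, 70.0]
vs = [200.0, 400.0, 1000.0]

vs30 = time_averaged_vs(thickness, vs, 30.0)   # 30 / (10/200 + 20/400)
z08 = depth_to_velocity(thickness, vs, 800.0)  # top of the 1000 m/s layer
z10 = depth_to_velocity(thickness, vs, 1000.0)
```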

Fig. 2 Selected ground motions: (a) Mw versus Rrup; (b) number of recordings at each period; (c) number of recordings per station

Wang et al. (2018) derived fundamental periods (T0) of KiK-net sites using the horizontal-to-vertical spectral ratio (HVSR) technique. An individual HVSR curve was computed from the 5%-damped response spectra of earthquake ground motions (complete waveforms), and the curves were then averaged over all recordings (at least 10) at a given site. Local maxima on an average HVSR curve with amplitudes larger than both \( 1.48 \times \overline{HVSR} \) and 2.0 were considered significant peaks, where \( \overline{HVSR} \) is the average HVSR curve integrated over all usable periods. For sites with more than one significant peak, the longest-period peak was taken as the fundamental mode and its period as T0. However, T0 was only assigned to a site if it passed a consistency check, namely 0.5 < T0/TR < 2.0, where TR is the theoretical fundamental period calculated from a velocity profile using the Rayleigh method (Biggs 1964). The site fundamental periods derived by Wang et al. (2018) are utilized in this study.
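The peak-picking rules described above can be sketched as follows. This is not the code of Wang et al. (2018): the function name and sample curve are hypothetical, and the arithmetic mean of the sampled curve is used here as a stand-in for the period-averaged amplitude \( \overline{HVSR} \).

```python
def pick_t0(periods, hvsr, t_rayleigh=None):
    """Return T0 from an averaged HVSR curve, or None if no acceptable peak exists.

    periods     sampled periods (s), in increasing order
    hvsr        averaged HVSR amplitudes at those periods
    t_rayleigh  optional theoretical period TR for the consistency check
    """
    mean_amp = sum(hvsr) / len(hvsr)       # assumption: simple mean of the curve
    threshold = max(1.48 * mean_amp, 2.0)  # amplitude must exceed both criteria

    # Interior local maxima that exceed the threshold are significant peaks.
    peaks = [
        periods[i]
        for i in range(1, len(hvsr) - 1)
        if hvsr[i] > hvsr[i - 1] and hvsr[i] > hvsr[i + 1] and hvsr[i] > threshold
    ]
    if not peaks:
        return None

    t0 = max(peaks)  # the longest-period significant peak is the fundamental mode
    if t_rayleigh is not None and not (0.5 < t0 / t_rayleigh < 2.0):
        return None  # fails the consistency check against TR
    return t0


# Hypothetical averaged HVSR curve with two significant peaks.
periods = [0.1, 0.2, 0.3, 0.5, 0.8, 1.0, 1.5]
hvsr = [1.0, 3.0, 1.2, 1.1, 4.0, 1.3, 1.0]

t0_unchecked = pick_t0(periods, hvsr)
t0_consistent = pick_t0(periods, hvsr, t_rayleigh=0.7)
t0_rejected = pick_t0(periods, hvsr, t_rayleigh=0.3)
```

With this curve the peaks at 0.2 and 0.8 s both qualify, and the longer one is selected unless the consistency check rejects it.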

All site parameters used in this investigation, including VSz, T0, Z0.8 and Z1.0, are obtained from site-specific measurements. The distributions of VS30 against Z0.8 and Z1.0 are illustrated in Fig. 3. VS30 is correlated with Z0.8 and Z1.0 to different extents. T0 is depicted against VSz in Fig. 4, which shows that there are a few sites with T0 larger than 1.0 s. The 5th, 50th and 95th percentile values of site data are also given in Table 1.

Fig. 3 Selected KiK-net recording stations: (a) VS30 versus Z0.8; (b) VS30 versus Z1.0

Fig. 4 Selected KiK-net recording stations: (a) VS10 and VS30 versus T0; (b) VS20 and VS5 versus T0

Table 1 Percentile values of site data

3 Surface-to-borehole spectral ratios

3.1 Fourier spectral ratio (SSRFAS)

In the frequency domain, the amplitude spectrum Aki(f) of a recording at a surface site (i) during an earthquake (k) can be represented as the product of the source, path and site terms (equivalent to their convolution in the time domain):

$$ A_{ki} \left( f \right) = O_{k} \left( f \right) \cdot P_{ki} \left( f \right) \cdot S_{i} \left( f \right) $$
(2)

where Ok(f) is the source term of event k; Pki(f) is the path term between station i and event k; and Si(f) is the site term for station i.

The standard spectral ratio (SSR) technique calculates the ratio of the Fourier amplitude spectrum (FAS) of a recording at a site of interest (i) to that at a reference site (j). If the reference site has similar source and path effects to the site of interest and has negligible site response (ideally a flat transfer function with an amplitude of one), the spectral ratio is an estimate of site response at site i. The SSRFAS of the recording at station i to that at the reference station j can then be simplified as follows:

$$ SSR_{k,ij} \left( f \right) = \frac{{A_{k,i} \left( f \right)}}{{A_{k,j} \left( f \right)}} = \frac{{O_{k} \left( f \right) \cdot P_{k,i} \left( f \right) \cdot S_{i} \left( f \right)}}{{O_{k} \left( f \right) \cdot P_{k,j} \left( f \right) \cdot S_{j} \left( f \right)}} = \frac{{S_{i} \left( f \right)}}{{S_{j} \left( f \right)}} \approx S_{i} \left( f \right) $$
(3)
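A minimal numerical sketch of Eq. 3 is given below. It assumes NumPy is available, uses synthetic white noise in place of real recordings, and omits the smoothing that would normally precede the ratio; the function names are illustrative only.

```python
import numpy as np


def fourier_amplitude(acc, dt):
    """One-sided Fourier amplitude spectrum of an acceleration time series."""
    freqs = np.fft.rfftfreq(len(acc), d=dt)
    amp = np.abs(np.fft.rfft(acc)) * dt
    return freqs, amp


def ssr_fas(surface, reference, dt):
    """Eq. 3: ratio of the FAS at the site of interest to the FAS at the reference."""
    freqs, a_site = fourier_amplitude(surface, dt)
    _, a_ref = fourier_amplitude(reference, dt)
    ratio = np.full_like(a_site, np.nan)
    ok = a_ref > 0                      # avoid division by zero at empty bins
    ratio[ok] = a_site[ok] / a_ref[ok]
    return freqs, ratio


# Synthetic check: a "surface" record that is exactly 3 times the reference
# record should give a flat spectral ratio of 3 at every frequency.
rng = np.random.default_rng(0)
dt = 0.01
reference = rng.standard_normal(2048)
surface = 3.0 * reference
freqs, ratio = ssr_fas(surface, reference, dt)
```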

3.2 Response spectral ratio (SSRPSA)

SSR is conventionally calculated using the FAS of an earthquake recording. However, the response spectrum, or pseudo-spectral acceleration (PSA), has also been used by some researchers (e.g., Zhao et al. 2006; Cadet et al. 2012) because of some of its merits. First, response spectral amplitude reflects structural response (a series of oscillators of varying natural frequency) and thus is widely adopted in engineering-oriented research, although the Fourier spectrum has an explicit physical meaning. In addition, the Fourier spectrum requires extra smoothing to reduce the effects of noise on spectral ratios (Safak 1997). In contrast, the damping ratio (e.g., 5%) of the response spectrum has a uniform smoothing effect on all recordings (Zhao et al. 2006).

Cadet (2007) calculated spectral ratios of KiK-net sites using both FAS and PSA and found that, in a statistical sense, SSRPSA was comparable to SSRFAS in the period range between 0.07 and 2.0 s but was higher than SSRFAS outside this range. However, Safak (1997) reported that SSRPSA was only comparable to SSRFAS at long periods (> ~ 1.0 s) and was higher than SSRFAS at other periods. Given the conflicting results, both SSRPSA and SSRFAS are obtained for each surface-borehole station pair in this study.

3.3 Cross-spectral ratio (c-SSRFAS)

The SSR approach should be implemented with caution when the reference site is in a borehole rather than on surface rock. This is because the downhole recording usually contains downgoing waves reflected from the free surface and other interfaces between sedimentary layers, as well as waves scattered from local inhomogeneities (e.g., Shearer and Orcutt 1987). These downgoing waves can interfere destructively with the upgoing incident waves at some frequencies, producing a notch in the FAS of the borehole recording and thus artificial peaks in surface-to-borehole spectral ratios (e.g., Steidl 1993).

Cross-spectral ratio (c-SSR) is defined as the product of the spectral ratio estimate and the coherence function (Eqs. 4, 5), implicitly accounting for the record coherence in the formulation of site response estimate (e.g., Bendat and Piersol 1980). Thus, the c-SSR technique is recommended for surface-borehole recordings (Steidl 1993; Safak 1997; Assimaki et al. 2008). Since recording stations with VS,hole < 800 m/s are excluded in this study, effects of downgoing waves on borehole recordings can be minimized but cannot be ruled out completely. Hence, the c-SSR approach is also implemented in this study.

$$ c - SSR_{xy} \left( f \right) = \frac{{P_{xy} \left( f \right)}}{{P_{xx} \left( f \right)}} $$
(4)
$$ C_{xy} \left( f \right) = \frac{{\left| {P_{xy} \left( f \right)} \right|^{2} }}{{P_{xx} \left( f \right)P_{yy} \left( f \right)}} $$
(5)

in which Pxx(f) and Pyy(f) are the power spectral densities of the surface (x) and borehole (y) recordings, respectively; Pxy(f) is the cross power spectral density between x and y; and Cxy(f) is the magnitude-squared coherence, a function of frequency with values between 0 and 1.
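The estimators in Eqs. 4 and 5 can be sketched with segment-averaged spectra. The code below is a simplified Welch-type scheme (non-overlapping Hann-windowed segments) written with NumPy only, not the processing used in this study; it follows the symbol order of Eq. 4 literally, with Pxx in the denominator, and reports the modulus of the complex ratio. The signals are synthetic.

```python
import numpy as np


def _segment_spectra(sig, nperseg):
    """FFTs of non-overlapping, demeaned, Hann-windowed segments."""
    nseg = len(sig) // nperseg
    segs = np.reshape(sig[: nseg * nperseg], (nseg, nperseg))
    segs = segs - segs.mean(axis=1, keepdims=True)
    return np.fft.rfft(segs * np.hanning(nperseg), axis=1)


def cross_spectral_ratio(x, y, nperseg=256):
    """Return (|Pxy| / Pxx, magnitude-squared coherence) following Eqs. 4 and 5."""
    X = _segment_spectra(np.asarray(x, dtype=float), nperseg)
    Y = _segment_spectra(np.asarray(y, dtype=float), nperseg)
    pxx = np.mean(np.abs(X) ** 2, axis=0)      # auto-spectrum of x
    pyy = np.mean(np.abs(Y) ** 2, axis=0)      # auto-spectrum of y
    pxy = np.mean(np.conj(X) * Y, axis=0)      # cross-spectrum between x and y
    c_ssr = np.abs(pxy) / pxx                  # Eq. 4 (modulus)
    coherence = np.abs(pxy) ** 2 / (pxx * pyy) # Eq. 5
    return c_ssr, coherence


# Synthetic check: y is a scaled copy of x, with and without independent noise.
rng = np.random.default_rng(0)
x = rng.standard_normal(4096)
c_clean, coh_clean = cross_spectral_ratio(x, 2.0 * x)
c_noisy, coh_noisy = cross_spectral_ratio(x, 2.0 * x + 2.0 * rng.standard_normal(4096))
```

For the noise-free pair the coherence is one and the ratio recovers the scale factor; adding incoherent noise to y lowers the coherence, which is the property that motivates the cross-spectral formulation.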

All three techniques, SSRPSA, SSRFAS and c-SSRFAS, are utilized to derive surface-to-borehole spectral ratios in this study. SSRPSA for each horizontal component is computed as the ratio of the 5%-damped pseudo-spectral acceleration of the waveform recorded at the ground surface to that recorded at the borehole. The geometric mean of the SSRPSA values in the two horizontal directions is taken and then averaged over the events (≥ 3) recorded at the site. SSRFAS and c-SSRFAS are obtained in a similar manner, with two differences. First, SSRFAS is calculated as the ratio of FAS between surface and borehole recordings, while c-SSRFAS is derived as the ratio of the cross-power density spectrum between the surface and downhole recordings to the power density spectrum of the surface recording using Welch’s method (Welch 1967). Second, for SSRFAS and c-SSRFAS, the spectra are smoothed before the spectral ratio is derived. Konno-Ohmachi smoothing (Konno and Ohmachi 1998) is utilized with a bandwidth coefficient b = 20.
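For reference, the smoothing and component-averaging steps can be sketched as below. The window is the commonly quoted Konno-Ohmachi form, \( [\sin(b\log_{10}(f/f_c)) / (b\log_{10}(f/f_c))]^{4} \); normalizing the weights to sum to one and evaluating on strictly positive frequencies are choices of this sketch, and the spectrum is synthetic.

```python
import numpy as np


def konno_ohmachi(freqs, spectrum, b=20.0):
    """Smooth a spectrum with a Konno-Ohmachi window of bandwidth coefficient b.

    freqs must be strictly positive (drop the 0 Hz bin beforehand).
    """
    freqs = np.asarray(freqs, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    smoothed = np.empty_like(spectrum)
    for i, fc in enumerate(freqs):
        arg = b * np.log10(freqs / fc)
        # np.sinc(t) = sin(pi t) / (pi t), so sinc(arg / pi) = sin(arg) / arg,
        # which equals 1 at the centre frequency where arg = 0.
        weights = np.sinc(arg / np.pi) ** 4
        smoothed[i] = np.sum(weights * spectrum) / np.sum(weights)
    return smoothed


def geometric_mean(h1, h2):
    """Geometric mean of two horizontal-component spectra or ratios."""
    return np.sqrt(np.asarray(h1, dtype=float) * np.asarray(h2, dtype=float))


# Synthetic example on a log-spaced frequency axis.
rng = np.random.default_rng(0)
freqs = np.logspace(-1.0, 1.5, 200)
flat = np.ones_like(freqs)
noisy = np.exp(rng.normal(0.0, 0.5, freqs.size))

flat_smoothed = konno_ohmachi(freqs, flat)
noisy_smoothed = konno_ohmachi(freqs, noisy)
```

A flat spectrum is left unchanged by the normalized window, while the scatter of a noisy spectrum is reduced.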

3.4 Comparison of SSRPSA, SSRFAS, and c-SSRFAS

Values of SSRFAS are compared with those of SSRPSA and c-SSRFAS at T = 0.2 and 2.0 s in Fig. 5. SSRFAS, SSRPSA and c-SSRFAS values at different spectral periods are given in Online Resource 1. It is worth noting that, in Fig. 5, amplifications at T = 2.0 s are much less notable than at T = 0.2 s. This is because most selected KiK-net sites do not have very thick sediments. There are only 20 (or 10%) sites with Z0.8 greater than 100 m (Fig. 3) and only a few sites with T0 above 1.0 s (Fig. 4). Thus, at the investigated KiK-net stations, site effects are not significant at relatively long periods (> ~ 1.0 s).

Fig. 5 Comparison between SSRPSA, SSRFAS and c-SSRFAS at 0.2 s (first column) and 2.0 s (second column)

Ratios between SSRPSA and SSRFAS, as well as between c-SSRFAS and SSRFAS, at various periods are presented in Fig. 6. At T = 0.02 s, SSRPSA deviates significantly from SSRFAS (Fig. 6a). This is attributable to a peculiarity of the response spectrum: at some oscillator periods (or frequencies), its amplitude is not directly proportional to that of the corresponding Fourier spectrum. For instance, peak ground acceleration (PGA) is controlled by the entire Fourier spectrum (e.g., Bora et al. 2016). In addition, the strong scenario-dependency of response spectral ratios at short periods (Stafford et al. 2017) may explain the large scatter at T = 0.02 s in Fig. 6a. However, at T = 0.2 s, SSRPSA is, on average, only slightly smaller than SSRFAS, with SSRPSA/SSRFAS = 0.94 (± 0.12) (Fig. 6a). The ratio then increases gradually with oscillator period, reaching 1.09 (± 0.35) at T = 4.0 s, which suggests an insubstantial divergence in the period range between 0.2 and 4.0 s. The trend of SSRPSA/SSRFAS with period (Fig. 6a) is compatible with the results of Cadet (2007). In Fig. 6b, the median value of c-SSRFAS/SSRFAS is systematically lower than unity in the period range from 0.1 to 4.0 s. For instance, the ratio between c-SSRFAS and SSRFAS is 0.71 (± 0.11) at T = 0.2 s. As pointed out by many researchers (e.g., Field et al. 1992; Steidl 1993), c-SSR gives a consistently lower amplification than SSR.

Fig. 6 Ratios of (a) SSRPSA to SSRFAS and (b) c-SSRFAS to SSRFAS at various spectral periods

As indicated by Assimaki et al. (2008) and Thompson et al. (2009), downgoing waves could be identified in KiK-net recordings but did not severely contaminate the borehole records. This is especially true for our KiK-net dataset because only sites with VS,hole ≥ 800 m/s are included. For these sites, abrupt velocity contrasts are more likely to be present above borehole sensors, resulting in the trapping of seismic waves between a velocity contrast and the free surface and thus inhibiting surface reflections from reaching downhole stations (e.g., Oth et al. 2011). However, destructive interference at downhole stations, though deemed insubstantial here, still cannot be completely excluded and may contribute to the upward bias in SSRFAS. This is evidenced by the deviation shrinking as the spectral period increases. Since, for most selected KiK-net sites, TS,ave, the period at which the fundamental mode of destructive interference between incident waves and surface reflections occurs, is below 1.0 s (85th percentile), as shown in Fig. 7, destructive interference has a negligible impact on the selected dataset at spectral periods above 1.0 s, e.g., c-SSRFAS/SSRFAS = 0.97 (± 0.07) at T = 2.0 s and 0.99 (± 0.08) at T = 4.0 s (Fig. 6b).

Fig. 7 Fundamental period of destructive interference between up- and down-going seismic waves at a borehole station. TS,ave = 4HHole/VS,hole, in which HHole is the distance, and the velocity is the time-averaged velocity, between the ground surface and the downhole sensor

Another reason for the deviation between SSRFAS and c-SSRFAS is that noise in the signals affects them differently. For noise-free data, SSRFAS is identical to c-SSRFAS. If noise is present in the recordings, it modifies SSRFAS and c-SSRFAS by factors of \( [(1 + SNR_{S}^{-2})/(1 + SNR_{B}^{-2})]^{0.5} \) and \( 1/(1 + SNR_{B}^{-2}) \), respectively, where SNRS and SNRB represent the signal-to-noise ratios of the surface and borehole recordings, respectively. There is relatively little noise in downhole recordings compared with those at the ground surface (Field et al. 1992); thus, SSRFAS will be scaled up whereas c-SSRFAS will remain relatively unchanged.
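The two noise factors can be evaluated directly. The sketch below reads the exponents in the expressions above as \( SNR^{-2} \), and the signal-to-noise ratios used are hypothetical values chosen only to show the direction of the bias.

```python
def ssr_noise_factor(snr_surface: float, snr_borehole: float) -> float:
    """Factor applied to SSR by noise: sqrt((1 + SNR_S^-2) / (1 + SNR_B^-2))."""
    return ((1.0 + snr_surface ** -2) / (1.0 + snr_borehole ** -2)) ** 0.5


def cssr_noise_factor(snr_borehole: float) -> float:
    """Factor applied to c-SSR by noise: 1 / (1 + SNR_B^-2)."""
    return 1.0 / (1.0 + snr_borehole ** -2)


# Hypothetical case: a noisier surface record (SNR = 3) and a quiet borehole
# record (SNR = 10).
ssr_factor = ssr_noise_factor(3.0, 10.0)   # greater than 1: SSR is scaled up
cssr_factor = cssr_noise_factor(10.0)      # close to 1: c-SSR is nearly unchanged
```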

4 Site proxies alternative to VS30

After obtaining site amplifications using SSRPSA, SSRFAS and c-SSRFAS (Online Resource 1), we first model the observed amplification factor (AF) at a given spectral period using a single site proxy, i.e., VSz, T0, Z0.8 or Z1.0. Fig. 8a–d depict the AF at T = 0.4 s derived using c-SSRFAS against VS30, T0, Z0.8 and Z1.0, respectively. AF decreases with increasing VS30 at T = 0.4 s (Fig. 8a) and at all other spectral periods except 0.02 and 0.1 s, at which AF scales positively with VS30. Kawase and Matsuo (2004) also found an increase of AF with VS30 at very short periods (e.g., 0.06 and 0.08 s) using K-NET, KiK-net and JMA stations.

Fig. 8 Site amplification (c-SSRFAS) at T = 0.4 s versus (a) VS30, (b) T0, (c) Z0.8, and (d) Z1.0 in ln–ln space. Dots represent sites with 800 < VS,hole < 1800 m/s, whereas circles are sites with 1800 < VS,hole < 3300 m/s. The dotted line in each plot represents the regression to all data points, and R² is the coefficient of determination

For a simple configuration of layered sediments underlain by bedrock, surface-to-bedrock amplification is governed by two competing phenomena: impedance contrast and attenuation (e.g., intrinsic material damping, geometric spreading and scattering). The increase of AF with VS30 at short periods may arise because the detrimental impact of short-period (high-frequency) attenuation (e.g., Van Houtte et al. 2011) on amplification outweighs the incremental effect of impedance contrast. For a site with a large VS30, high-frequency wave energy tends to be attenuated to a lesser extent than at a softer site, resulting in a higher AF at the stiffer site.

In the linear range, many amplification models or site terms in GMMs (e.g., BSSA14) introduce a limiting velocity beyond which amplification no longer scales with VS30. This limiting velocity (Vc in BSSA14) is period-dependent and decreases with the spectral period. For example, the value of Vc in BSSA14 is as large as 1500 m/s at 0.02 s, reducing to 844 m/s at 4.0 s. Since only a few sites investigated in this study have a VS30 higher than 800 m/s (Fig. 3), a limiting velocity is considered unnecessary in this research, and a simple linear function (Eq. 6) is used for VS30. However, as shown in Fig. 8b–d, AF does not scale linearly with T0, Z0.8 and Z1.0. The relationships of AF with site period and depth can be represented by piecewise linear functions, as shown in Fig. 9a and b, respectively. Hassani and Atkinson (2018a, b) modelled the trend of AF with T0 using a function similar to that shown in Fig. 9a and fixed a1 and a2 to 0.05 and 2.0 s, respectively. However, our results show that all these coefficients (e.g., a1 and b1) are period-dependent. With the current dataset, it is difficult to constrain all these period-dependent coefficients.

Fig. 9 Trends of amplification with (a) T0 and (b) Z0.8 (or Z1.0) at a given spectral period

However, based on visual inspection, the relationship between AF and T0 can also be described by a polynomial function. Given that the purpose of the present research is to evaluate the efficacy of site proxies rather than to propose a robust AF estimation model, we depict the trend using a polynomial function. The function with the fewest terms that are all statistically significant is a quadratic function. Hence, a second-order polynomial (Eq. 7) is adopted to characterize the variations of AF with T0, Z0.8 and Z1.0. We calculate the amplification residual Res1 (Eq. 8) of each observation about the regression line (Eq. 6 or 7), as well as the standard deviation of amplification residuals \( \sigma_{Res1} \) according to Eq. 9. The performance of each site parameter is assessed based on \( \sigma_{Res1} \), which is displayed in Fig. 10 for each amplification model using a single proxy.

Fig. 10 Standard deviations of intersite amplification residuals of AF models using a single site proxy

$$ \ln \left[ {AF\left( {V_{S30} } \right)} \right] = a_{0} + a_{1} \ln (V_{S30} ) $$
(6)
$$ \ln \left[ {AF\left( Y \right)} \right] = b_{0} + b_{1} ln\left( Y \right) + b_{2} \left[ {ln\left( Y \right)} \right]^{2} $$
(7)
$$ Res_{1} = ln\left( {AF_{obs} } \right) - ln\left( {AF_{pre} } \right) $$
(8)
$$ \sigma_{Res1} = \sqrt {\frac{{\mathop \sum \nolimits_{1}^{N} \left[ {ln\left( {AF_{obs} } \right) - ln\left( {AF_{pre} } \right)} \right]^{2} }}{N}} $$
(9)

where AFobs is the observed site amplification at a given period, derived using SSRPSA, SSRFAS or c-SSRFAS; AFpre is the predicted site amplification, either AF(VS30) or AF(Y); AF(VS30) and AF(Y) are the site amplifications predicted using VS30 and Y, respectively; Y is the site period or a site depth, i.e., T0, Z0.8 or Z1.0; Res1 is the amplification residual; \( \sigma_{Res1} \) is the standard deviation of amplification residuals; and N is the number of sites.
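Equations 6 to 9 can be illustrated with a short sketch. Ordinary least squares via NumPy is assumed here as the fitting method, and all data are synthetic; the function name is illustrative only.

```python
import numpy as np


def fit_ln_af(proxy, af, order):
    """Fit ln(AF) as a polynomial in ln(proxy): order=1 for Eq. 6, order=2 for Eq. 7.

    Returns the coefficients (highest power first), the residuals (Eq. 8)
    and their standard deviation with an N denominator (Eq. 9).
    """
    x = np.log(np.asarray(proxy, dtype=float))
    y = np.log(np.asarray(af, dtype=float))
    coeffs = np.polyfit(x, y, order)
    res1 = y - np.polyval(coeffs, x)          # Eq. 8
    sigma_res1 = np.sqrt(np.mean(res1 ** 2))  # Eq. 9
    return coeffs, res1, sigma_res1


rng = np.random.default_rng(1)

# Synthetic, noise-free quadratic trend in ln(T0): the fit should recover
# b2 = -0.2, b1 = 0.5, b0 = 1.0 with essentially zero residual scatter.
t0 = rng.uniform(0.1, 2.0, 50)
af_t0 = np.exp(1.0 + 0.5 * np.log(t0) - 0.2 * np.log(t0) ** 2)
coeffs_t0, _, sigma_t0 = fit_ln_af(t0, af_t0, order=2)

# Synthetic linear trend in ln(VS30) with scatter of 0.3 natural-log units.
vs30 = rng.uniform(150.0, 800.0, 200)
af_vs30 = np.exp(5.0 - 0.6 * np.log(vs30) + rng.normal(0.0, 0.3, vs30.size))
coeffs_vs30, res1_vs30, sigma_vs30 = fit_ln_af(vs30, af_vs30, order=1)
```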

Figure 10 shows that, among VS30, T0, Z0.8 and Z1.0, no single proxy is the best at all spectral periods. However, regardless of the technique used to derive amplification, T0 clearly has the best overall performance, especially for oscillator periods between 0.1 and 4.0 s. VS30 performs relatively well for periods from 0.2 to 0.7 s and is second only to T0 in this period range, but for periods longer than 0.7 s, VS30 is among the worst-performing proxies. Site depth Z0.8 exhibits nearly identical performance to T0 at periods over 0.5 s, but the depth to a stiffer layer, Z1.0, does not further improve amplification estimation. Comparing the efficacies of VSz (z = 5, 10, 20 and 30 m), shear-wave velocity averaged to a larger depth leads to better overall performance, and the improvement is manifested in the period range from 0.2 to 0.7 s.

T0 is shown to be the best-performing single proxy among VSz, T0, Z0.8 and Z1.0 in modelling linear site response at KiK-net sites. We then depict in Fig. 11a the reduction in \( \sigma_{Res1} \) brought about by replacing VS30 with T0. Compared with the conventional parameter VS30, using T0 can reduce the site-to-site amplification variability by up to 17%, 24% and 27% for SSRPSA, SSRFAS and c-SSRFAS, respectively. The reductions are negligible for spectral periods below 0.1 s, suggesting that T0 cannot improve amplification prediction in this range. There are troughs in the period range between 0.2 and 0.7 s (Fig. 11a) since VS30 performs relatively well at these periods, as shown in Fig. 11b, which illustrates the Pearson correlation coefficient (R2) between AF and VS30. Fig. 11b is nearly identical to Fig. 23 in the paper by Kawase and Matsuo (2004). At 0.1 s, AF scales poorly with VS30, but at periods from 0.2 to 0.7 s, AF is well correlated with VS30. Thus, for periods between 0.2 and 0.7 s, substituting T0 for VS30 yields only a limited reduction. The decrease in intersite variability is most pronounced at relatively long periods (> 0.7 s), implying the efficiency of T0 in this period band (Fig. 11a).

Fig. 11 (a) Reduction in intersite variability of model AF(T0) relative to model AF(VS30); (b) Pearson correlation coefficient between AF and VS30

5 Site proxies (secondary) complementary to VS30 (primary)

It is known that VS30 alone cannot account for all aspects of site effects. However, given the widespread use of VS30 in current seismic provisions, replacing VS30 with a new site proxy appears unappealing to many. Considering that the VS30-corrected residual amplification Res1 shows a certain degree of dependence on T0, as well as on site depths Z0.8 and Z1.0, it is desirable to search for a site proxy complementary to VS30 in classifying a site or parameterizing site effects. The question then arises as to which site parameter among T0, Z0.8 and Z1.0 is the best proxy secondary to VS30 in modelling the residual amplification. To address this issue, we first model Res1 as a quadratic function (Eq. 10) of a secondary site proxy (Y), i.e., T0, Z0.8 or Z1.0. Then we derive the residual Res2 (Eq. 11) between Res1 and the proxy-based prediction Res1(Y), as well as its standard deviation \( \sigma_{Res2} \) (Eq. 12). The difference between σRes1 and σRes2 is then utilized to gauge the efficacy of each secondary site proxy. The best-performing candidate is considered to be the one that produces the largest reduction.

$$ Res_{1}(Y) = b_{0} + b_{1} \ln(Y) + b_{2} \left[ \ln(Y) \right]^{2} $$
(10)
$$ Res_{2} = Res_{1} - Res_{1}(Y) $$
(11)
$$ \sigma_{Res2} = \sqrt{\frac{\sum_{1}^{N} \left( Res_{1} - Res_{1}(Y) \right)^{2}}{N}} $$
(12)
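Eqs. 10 to 12 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names `secondary_proxy_fit` and `relative_reduction` are hypothetical, ordinary least squares via `np.polyfit` is assumed as the fitting method, \( \sigma_{Res1} \) is assumed to be computed in the same root-mean-square form as Eq. 12, and the data are synthetic.

```python
import numpy as np

def secondary_proxy_fit(res1, y):
    """Fit Eq. 10 by least squares and return coefficients, Res2 and sigma_Res2."""
    res1 = np.asarray(res1, dtype=float)
    ln_y = np.log(np.asarray(y, dtype=float))
    # np.polyfit returns coefficients from highest to lowest degree: b2, b1, b0.
    b2, b1, b0 = np.polyfit(ln_y, res1, deg=2)
    res2 = res1 - (b0 + b1 * ln_y + b2 * ln_y ** 2)   # Eq. 11
    sigma_res2 = np.sqrt(np.mean(res2 ** 2))           # Eq. 12
    return (b0, b1, b2), res2, sigma_res2

def relative_reduction(sigma_res1, sigma_res2):
    """Percentage reduction of sigma_Res2 relative to sigma_Res1."""
    return 100.0 * (sigma_res1 - sigma_res2) / sigma_res1

# Synthetic illustration only: Res1 depends on a secondary proxy Y plus scatter.
rng = np.random.default_rng(0)
y = rng.uniform(0.05, 2.0, size=300)                   # e.g. T0 in seconds
res1 = 0.3 * np.log(y) + 0.1 * np.log(y) ** 2 + rng.normal(0.0, 0.2, size=300)

sigma_res1 = np.sqrt(np.mean(res1 ** 2))               # assumed analogue of Eq. 12
coeffs, res2, sigma_res2 = secondary_proxy_fit(res1, y)
reduction_pct = relative_reduction(sigma_res1, sigma_res2)
```

Because the fitted quadratic includes an intercept, `sigma_res2` can never exceed `sigma_res1` on the data used for fitting, so the comparison among proxies rests on how large the reduction is rather than on its sign.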

Figure 12 depicts the standard deviation (Eq. 12) and its relative reduction brought by adding a secondary site proxy on top of VS30 (primary). It shows that, for the investigated oscillator periods, all secondary parameters (T0, Z0.8 and Z1.0) reduce intersite amplification variability to different extents. At periods around 0.4 s, adding a secondary parameter to AF(VS30) leads to a very limited reduction in variability because VS30 already performs well there (Fig. 11b), so an extra predictor variable brings only an inconsequential improvement. However, for periods between 0.7 and 2.0 s, these additional variables are rather effective in improving amplification prediction.

Fig. 12

Standard deviation of intersite amplification residuals (first column) of each amplification model and its relative reduction (second column) compared with that of model AF(VS30) using SSRPSA, SSRFAS, and c-SSRFAS

The most remarkable reduction is achieved by T0, with reductions of as much as 20% (T = 1.0 s), 27% (T = 2.0 s), and 26% (T = 2.0 s) for SSRPSA, SSRFAS and c-SSRFAS, respectively. T0 is followed by Z0.8, whose inclusion in the model AF(VS30) leads to reductions of up to 16% (T = 1.0 s), 17% (T = 1.0 s), and 15% (T = 0.7 s) for SSRPSA, SSRFAS and c-SSRFAS, respectively. In general, Z1.0 is more difficult to obtain than Z0.8, yet it does not yield a larger reduction in site-to-site variability (Fig. 12). The percentage reductions associated with Z1.0 are 12% (T = 1.0 s), 17% (T = 2.0 s), and 15% (T = 2.0 s) for SSRPSA, SSRFAS and c-SSRFAS, respectively.

A three-dimensional (3D) subsurface velocity structure model, the Japan Seismic Hazard Information Station (J-SHIS) model, was constructed by Fujiwara et al. (2009) for the whole of Japan (see Data and Resources). Regional velocity models are also available for some areas outside Japan, e.g., the 3D Community Velocity Models CVM-S4 (Kohler et al. 2003) and CVM-H1.1.0 (Süss and Shaw 2003) for southern California, as well as the 3D Velocity Model of the Bay Area (Boatwright et al. 2004) for northern California. These regional velocity models were developed primarily to model the propagation of long-period ground motions (> ~ 1.0 s). However, it is tempting to extract site parameters from these existing velocity models in a site-specific investigation. For instance, in the NGA-West2 project (Ancheta et al. 2014), subsurface models for Japan and California were queried to establish the site depth database. It is therefore of interest to examine to what extent the depth parameters extracted from a regional velocity model can improve site-response estimation when they are used as complementary proxies to VS30.

We denote depths inferred from the J-SHIS model as Zx_infer (x = 0.8, 1.0, 1.5 and 2.5 km/s). The subscript “infer” distinguishes them from the measured depths. Following the same procedure as for Z0.8 and Z1.0, we gauge the performance of Zx_infer based on the decrease in site-to-site variability due to its incorporation into the amplification model AF(VS30). Fig. 13 displays the percentages of reduction in standard deviation. Z0.8_infer, Z1.0_infer, Z1.5_infer and Z2.5_infer all secure reductions, but these do not exceed 8% (T = 0.7 s), 8% (T = 0.7 s), 10% (T = 1.0 s) and 9% (T = 1.0 s), respectively. Comparing measured (Fig. 12) with inferred depths (Fig. 13), the former (Z0.8 and Z1.0) outperform their inferred counterparts (Z0.8_infer and Z1.0_infer). This is mainly attributable to the different levels of uncertainty in inferred and measured depth data. Inferred depths from the J-SHIS model are biased and have a substantial amount of uncertainty when compared with site-specific depth measurements, as indicated by Zhu et al. (2019).

Fig. 13

Relative reduction in the standard deviation of intersite amplification residual due to the incorporation of inferred site depth Zx_infer (x = 0.8, 1.0, 1.5 and 2.5 km/s) compared to model AF(VS30) using SSRFAS

Although Zx_infer (x = 0.8, 1.0, 1.5 and 2.5 km/s) perform less well than their measured counterparts, introducing Zx_infer into AF(VS30) can lead to a noticeable improvement in prediction. Thus, one can exploit an available regional velocity model for purposes other than long-period ground motion simulation. However, the comparison with measured depths shows that the regional velocity model J-SHIS warrants further improvement. As reported by Dhakal and Yamanaka (2013), there are evident discrepancies between the J-SHIS model and other subsurface models for the same region. For regions outside Japan, the 3D Community Velocity Model CVM (Version 2, Magistrale et al. 2000) for southern California was also found to have a significant bias (Stewart et al. 2005) and, as pointed out by Graves and Aagaard (2011), further refinement of the CVM was needed.

6 Site amplification as a function of T0 (primary) and VSz (secondary)

In engineering practice, VS30 has already been derived at many recording stations and is widely adopted to categorize sites in current seismic codes (e.g., European code EC8) or to describe site effects in many GMMs (e.g., NGA-West2 GMMs). However, VS30 entails higher costs than T0, which can be readily acquired using the HVSR technique on either ambient noise (Nakamura 1989) or earthquake recordings (Lermo and Chávez-García 1993). More importantly, T0 proves to be the best-performing single-proxy (Fig. 10) among VSz, T0, Z0.8 and Z1.0 in depicting linear site effects. Thus, considering both the performance and engineering utility of T0, as well as the established status of VSz, especially VS30, we parameterize site effects using both parameters, with T0 as the primary and VSz as the secondary site indicator.

Figure 14 shows the site-to-site variabilities of each site-effect model, including AF(T0) and AF(T0, VSz), as well as the percentages of reduction in variability due to the incorporation of VSz into AF(T0). Adding VSz as a secondary site proxy can decrease estimation uncertainty. The reductions brought by VS5 and VS10 are insignificant, but the reduction increases with z (z = 5, 10, 20 and 30 m). VS20 and VS30 lead to a noticeable variability reduction in the period range from 0.2 to 0.7 s, suggesting that, besides VS30 (Fig. 11b), VS20 also performs well in describing linear amplification in this period range. In comparison, the reduction resulting from the inclusion of VS30 in AF(T0) is less significant than that due to the incorporation of T0 into AF(VS30) (Fig. 12), indicating that T0 alone can account for most of the site effects.

Fig. 14

Standard deviations of residuals of site-effects models using T0 as the primary site proxy with or without VSz as the secondary predictor (left column); and relative reductions in the standard deviation of residuals of models AF(T0, VS5), AF(T0, VS10), AF(T0, VS20), and AF(T0, VS30) compared with AF(T0) (right column)
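For readers less familiar with VSz, the time-averaged velocity to depth z is the depth divided by the vertical shear-wave travel time through the layers above z. The sketch below illustrates this for a layered profile. The function name `vsz`, the three-layer profile, and the choice to raise an error when the profile is shallower than z (rather than extrapolate) are assumptions of this illustration, not details taken from the study.

```python
import numpy as np

def vsz(thickness_m, vs_mps, z):
    """Time-averaged shear-wave velocity (m/s) from the surface to depth z (m)."""
    h = np.asarray(thickness_m, dtype=float)
    v = np.asarray(vs_mps, dtype=float)
    top = np.concatenate(([0.0], np.cumsum(h)[:-1]))   # depth to top of each layer
    bottom = top + h                                    # depth to bottom of each layer
    if bottom[-1] < z:
        raise ValueError("profile is shallower than z")
    # Portion of each layer lying above depth z (zero for layers entirely below z).
    h_in = np.clip(np.minimum(bottom, z) - top, 0.0, None)
    travel_time = np.sum(h_in / v)                      # vertical S-wave travel time (s)
    return z / travel_time

# Hypothetical three-layer profile: thicknesses in m, velocities in m/s.
thickness_m = [5.0, 10.0, 25.0]
vs_mps = [150.0, 300.0, 600.0]
vs_values = {z: vsz(thickness_m, vs_mps, z) for z in (5.0, 10.0, 20.0, 30.0)}
```

For this profile, VS10 = 10 / (5/150 + 5/300) = 200 m/s and VS30 = 30 / (5/150 + 10/300 + 15/600) ≈ 327 m/s. The shallower averages weight the soft surface layer more heavily, which is why VS5 and VS10 carry less information about the deeper structure than VS20 or VS30.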

Figure 15a compares the intersite variability of the amplification (SSRPSA) model AF(T0, VS30) with that of the model AF(VS30, T0). The former uses T0 as the primary and VS30 as the secondary predictor, whereas the latter uses VS30 as the primary and T0 as the secondary variable. The level of estimation uncertainty associated with AF(T0, VS30) is lower than that of AF(VS30, T0), implying that the sequence in which predictors enter the model affects model efficacy. If both T0 and VS30 are to be included in an amplification model, using T0 as the primary indicator is preferable to the other way around, which is consistent with the finding of Hassani and Atkinson (2018a). Fig. 15b shows that, compared with the configuration of VS30 (primary) and T0 (secondary), the combination of T0 (primary) and VS30 (secondary) can reduce model uncertainty by up to 12% (SSRPSA), 7% (SSRFAS) and 12% (c-SSRFAS).

Fig. 15

a Standard deviations of residuals of amplification models of AF(T0, VS30) and AF(VS30, T0) using SSRPSA; b reduction in the uncertainty of model AF(T0, VS30) relative to model AF(VS30, T0)
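Why the order of predictors matters in a staged fit can be illustrated with a small synthetic experiment. In the sketch below, each stage fits a quadratic in the logarithm of one proxy and passes its residual to the next stage. The quadratic form for the primary stage, the function names `quad_residual` and `sequential_sigma`, and the synthetic correlated proxies are assumptions of this illustration; the numbers it produces say nothing about the KiK-net results.

```python
import numpy as np

def quad_residual(target, proxy):
    """Residual of target after a least-squares quadratic fit in ln(proxy)."""
    ln_p = np.log(proxy)
    coeffs = np.polyfit(ln_p, target, deg=2)
    return target - np.polyval(coeffs, ln_p)

def sequential_sigma(ln_af, primary, secondary):
    """Residual standard deviation after fitting the primary, then the secondary proxy."""
    res1 = quad_residual(ln_af, primary)     # stage 1: primary proxy
    res2 = quad_residual(res1, secondary)    # stage 2: secondary proxy on stage-1 residual
    return np.sqrt(np.mean(res2 ** 2))

# Synthetic, mutually correlated proxies (illustration only).
rng = np.random.default_rng(1)
n = 400
ln_vs30 = rng.normal(6.0, 0.4, size=n)
ln_t0 = -1.2 * (ln_vs30 - 6.0) + 0.4 * rng.normal(size=n)
ln_af = (0.5 * ln_t0 + 0.3 * ln_t0 ** 2
         - 0.2 * (ln_vs30 - 6.0) + 0.15 * rng.normal(size=n))
t0, vs30 = np.exp(ln_t0), np.exp(ln_vs30)

sigma_t0_first = sequential_sigma(ln_af, t0, vs30)    # analogue of AF(T0, VS30)
sigma_vs30_first = sequential_sigma(ln_af, vs30, t0)  # analogue of AF(VS30, T0)
```

Because the two proxies are correlated, the primary stage absorbs part of the signal that the other proxy would otherwise explain, and the secondary stage only sees what is left. The two orderings therefore generally end with different residual standard deviations, unlike a joint regression on both proxies, which does not depend on order.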

7 Discussion and conclusions

In an effort to pinpoint the best-performing site proxy or optimal combination of site proxies in modelling linear site response, we selected 1840 ground-motion recordings from a KiK-net database processed by Dawood et al. (2016). Site effects were estimated using surface-to-borehole spectral ratios. T0 was found to be the best-performing single-proxy among VSz (z = 5, 10, 20 and 30 m), T0, Z0.8 and Z1.0. Substituting T0 for VS30 in an amplification model could induce a significant reduction in site-to-site amplification variability, especially for spectral periods of 0.7 s or longer. There seems to be a consensus that T0 has a better overall performance than VS30 in parameterizing site effects (e.g., Zhao and Xu 2013; Cadet et al. 2012; McVerry 2011; Derras et al. 2017; Stambouli et al. 2017; Hassani and Atkinson 2018a). T0 is also recommended as one of the main proxies in the draft of the revised EC8.

In addition, T0 was found in this study to be the most descriptive of the residual amplification after VS30-correction among T0, Z0.8, Z1.0 and Zx_infer (x = 0.8, 1.0, 1.5 and 2.5 km/s). Adding T0 (secondary) to an amplification model based on VS30 (primary) could reduce site-to-site amplification variability by up to 20–27% at relatively long periods (> 0.7 s). In site classification, Luzi et al. (2011) also achieved a significant reduction in variability when T0 was included as a complementary proxy to VS30. This suggests that T0 can be incorporated into current site-response models (or site terms) and site classification schemes to better account for site effects.

When T0 was used as the primary proxy to model site response, T0 alone could capture most of the site effects. Adding VS5 or VS10 (secondary) could introduce only a very limited further reduction in intersite amplification variability. In contrast, incorporating VS20 or VS30 (secondary) could noticeably decrease variability at periods between 0.2 and 0.7 s, but by no more than 10% and 14%, respectively. Furthermore, using T0 as the primary and VS30 as the secondary predictor variable is better than the configuration of VS30 (primary) and T0 (secondary), which confirms the results of Hassani and Atkinson (2018a).

Figure 16 compares the reductions in standard deviations of residuals between amplification observations and T0-based predictions relative to the conventional model AF(VS30). These relative reductions are given in Table 2. All T0-based amplification models have less uncertainty than AF(VS30), especially for periods over 0.7 s. Adding VS10, VS20 or VS30 to the model AF(T0) can further improve model performance, especially for spectral periods between 0.2 and 0.7 s. Although models AF(T0, VS30) and AF(VS30, T0) utilize the same site proxies, the former performs better than the latter because of the difference in the sequence of variables entering the model. For the same reason, AF(T0, VS10) can achieve at least comparable overall performance to AF(VS30, T0). Figure 16 also demonstrates that (T0, VS30) is the best-performing proxy pair but is only slightly better than (T0, VS20). Given that VS20 is strongly correlated with VS30 (e.g., Boore et al. 2011) and that VS30 may entail higher costs than VS20, VS20 (secondary) may be adequate for engineering use.

Fig. 16

Relative reductions in the standard deviation of residuals of models AF(T0), AF(T0, VS10), AF(T0, VS20), AF(T0, VS30) and AF(VS30, T0) compared with AF(VS30)

Table 2 Relative reductions in the standard deviation of residuals compared with AF(VS30)

T0 is shown to be preferable in site characterization. However, our previous study (Zhu et al. 2019) found discrepancies in T0 derived using the HVSR technique by different teams. For sites with prominent 2D or 3D features, HVSR is not always reliable in determining T0 (Gueguen et al. 2007). This may be one source of uncertainty in T0 and will inevitably affect its efficacy. In addition, T0 is identified as corresponding to a significant peak on HVSR, but significant peaks are often defined rather subjectively, which may be another source of uncertainty in T0. Thus, these factors should be given due consideration when determining T0 using the HVSR method.

In this study, we used only KiK-net sites, which are often located on weathered rock or thin sediments (Aoi et al. 2000). This is compatible with the maximum usable period of our selected recordings (at least 4.17 s), since there are no prominent site effects at these sites beyond this period. However, there exist sites with rather thick sediments, especially those in deep sedimentary basins, e.g., the Kanto (Tokyo) basin in Japan and the Los Angeles basin in California. At these sites, there are significant amplifications at periods longer than 4.17 s. Also, basin-generated surface waves are likely to have a strong presence in the recordings. Thus, the best-performing site proxy or combination of proxies should be further tested on deep basin sites and over a broad period range, preferably up to ~ 10.0 s.

Meanwhile, it needs to be stressed that site amplifications in this research are referenced to downhole bedrock with different velocities (larger than 800 m/s). Since all site proxies are gauged on the same dataset, inhomogeneous reference site conditions will not affect the results. However, amplification should be normalized to a common reference for the development of empirical prediction models. In addition, we limit our study to the linear domain, but soft soil sites exhibit nonlinear behavior during strong ground shaking. Thus, the efficiency of various site parameters in depicting nonlinear site response needs to be investigated in future studies.

What distinguishes this work from many preceding ones is that we include many competing site proxies in a single study. In addition to VSz (z = 5, 10, 20 and 30 m) and T0, we also consider various depth parameters, differentiating measured depths (Z0.8 and Z1.0) from inferred depths (Zx_infer, x = 0.8, 1.0, 1.5 and 2.5 km/s). Though Z1.5 and Z2.5 are also candidate proxies utilized in some research, e.g., Z1.5 in the amplification model by Choi et al. (2005) and Z2.5 in the site term of Campbell and Bozorgnia (2014), we do not include them here since there are not enough sites with measured Z1.5 and Z2.5 available. For instance, there are only 42 KiK-net sites whose boreholes penetrate the 2.5 km/s horizon.

In summary, our results show that T0 is the best-performing single-proxy among VSz, T0, Z0.8 and Z1.0 in modelling linear site response at KiK-net sites. Thus, T0 can be used as a substitute for VS30. Meanwhile, T0 is also the best-performing proxy among T0, Z0.8, Z1.0 and Zx_infer complementary to VS30 in capturing the VS30-corrected residual amplification. Hence, T0 can be utilized as an add-on to calibrate existing VS30-based amplification models or site terms. Moreover, T0 alone can capture most of the site effects and should be utilized as the primary site proxy. (T0, VS30) is found to be the best-performing proxy pair among (VS30, T0), (VS30, Z0.8), (VS30, Z1.0), (VS30, Zx_infer) and (T0, VSz) but is only slightly better than (T0, VS20). Given that VS30 may entail higher costs than VS20, the configuration of T0 (primary) and VS20 (secondary) is considered to be the optimal combination.