Article Highlights

  • We present an alternative passive surface wave method to capture the dominant azimuth of noise sources and estimate phase velocity by polarization analysis with multicomponent ambient noise records

  • We deduce that noise sources are mainly classified as either inline or offline distribution according to the dominant azimuth

  • Results from synthetic data and field examples show that the velocity overestimation due to offline noise source distribution is corrected using the proposed method

1 Introduction

Surface waves propagate along the earth–air or the earth–water interfaces and are usually characterized by relatively low velocity, low frequency, high amplitude, and dispersion in all earth models except the elastic half-space. Geometric dispersion of surface waves is used to infer the properties of the medium by estimating model parameters (e.g., shear (S)-wave velocity) for applications in near-surface geology, environment, and engineering (Socco et al. 2010). Stokoe and Nazarian (1983) and Nazarian et al. (1983) introduced a surface-wave method, called spectral analysis of surface waves (SASW), which analyzes the dispersive phase velocity of Rayleigh waves to determine S-wave velocity profiles. However, the SASW method is susceptible to coherent noise and cannot extract multimode surface waves. Multichannel analysis of surface waves (MASW) was then developed to evaluate S-wave velocity profiles with a multichannel recording system and has been widely applied in near-surface applications because it improves quality control during data acquisition and processing (Song et al. 1989; Park et al. 1999; Xia et al. 1999, 2003, 2012). The standard procedure for MASW includes acquisition of high-frequency broad-band Rayleigh waves, extraction of Rayleigh wave dispersion curves, and inversion of dispersion curves. For the extraction of dispersion curves, wavefield transformation techniques such as the phase shift (Park et al. 1998) and slant stack methods (McMechan and Yedlin 1981; Xia et al. 2007) are commonly used to transform the original time–space domain data into the frequency-velocity domain. Rahimi et al. (2021) evaluated the performance of four common MASW wavefield transformation techniques and concluded that the best practice is to use at least two different transformation methods to enhance the data quality, especially for complex stratigraphy environments.
Once the dispersion curves are identified, linear or nonlinear optimization algorithms can be used to estimate near-surface S-wave velocity by the inversion of phase velocities (Xia et al. 1999; Wathelet et al. 2004; Dal Moro et al. 2007; Aleardi et al. 2020; Barone et al. 2021). Currently, the MASW method is widely used in geotechnical engineering for near-surface site characterization, infrastructure evaluation and liquefaction assessment (Rix et al. 2002; Socco and Strobbia 2004; Cardarelli et al. 2014; Wood et al. 2017; Mi et al. 2019).

Active surface-wave surveys, however, often do not achieve sufficient depth of investigation since most active sources such as mechanical weight drops or sledgehammers are bandwidth-limited due to their physical limitations. To overcome this limitation, passive surface-wave methods utilize seismic ambient noise generated from natural or anthropogenic sources to extract dispersion information at long wavelengths, providing a wide range of investigation depths from a few tens to hundreds of meters (Aki 1957; Asten 1978; Park et al. 2004). Passive seismic methods have received increasing attention in the near-surface geophysical and geotechnical communities since they not only reach greater investigation depths but also overcome the challenges of performing seismic investigations in noisy urban environments (Cheng et al. 2018; Chen et al. 2021; Pang et al. 2022). The origins of passive surface-wave methods date back to the pioneering work of Aki (1957), who treated the ambient noise as surface waves and proposed the theory of spatial autocorrelation (SPAC). A 2-D receiver array, such as a triangular or L-shaped layout, is however impractical and uneconomical in urban areas populated with buildings. Chávez-García et al. (2006) presented an extension of the SPAC method to obtain reliable dispersion curves from noise recordings without constraints on the geometry of the array. Two-station microtremor SPAC was recently developed to extract dispersion curves between station pairs, allowing flexible receiver geometries (Hayashi et al. 2013; Cho 2020). Asten and Hayashi (2018) summarized the fundamental theory of the SPAC method and reviewed recent developments including choice of array, alternative SPAC processing methodologies, and the comparison of frequency-wavenumber and SPAC methods. Louie (2001) developed the refraction microtremor technique to process passive data recorded with a 1-D linear receiver array.
It should be noted that the phase velocity measured by the 1-D method equals the true phase velocity only when wave propagation is parallel to the receiver line. Considering different types of wave propagation, Park et al. (2004, 2008) introduced dispersion imaging schemes of inline plane waves, offline plane waves, and offline cylindrical waves for roadside passive surveys. Baglari et al. (2020) presented the influence of the length of the receiver array and the acquisition time on the resolution of the dispersion image obtained from roadside passive surveys and compared the results obtained from passive and active surveys in terms of the dispersion imaging and S-wave velocity profiles at specific sites. To obtain strong higher-order dispersion information, the frequency-Bessel transform (F-J) method was recently developed to extract multimode surface wave dispersion curves from active or passive source data without relying on the plane-wave assumption (Forbriger 2003; Wang et al. 2019; Li and Chen 2020; Xi et al. 2021). Hu et al. (2020) extended the F-J method to the multicomponent cross-correlation function of noise recordings for the multimode dispersion curve extraction of Rayleigh and Love waves.

The cross-correlation of noise recordings at two different stations has been shown to be a powerful method for estimating Green’s function between the stations (Claerbout 1968; Weaver et al. 2001; Shapiro and Campillo 2004; Sabra et al. 2005; Bensen et al. 2007). Many theoretical studies demonstrate that the extraction of Green’s function can be derived from normal modes (Lobkis and Weaver 2001), time-reversal symmetry (Derode et al. 2003), representation theorems (Wapenaar 2004; Wapenaar and Fokkema 2006), and stationary phase analysis (Snieder et al. 2006). Shapiro et al. (2005) constructed tomographic images of the principal geological units of California using hundreds of interstation surface wave group-velocity measurements determined by cross-correlating long sequences of ambient seismic noise recorded at stations in Southern California. Numerous applications of the ambient noise cross-correlation technique on continental and regional scales have since emerged (Yao et al. 2006; Moschetti et al. 2007; Lin et al. 2008; Yang et al. 2008; Foster et al. 2014; Shen and Ritzwoller 2016; Zhou et al. 2021). In recent years, seismic interferometry by cross-correlation has been increasingly applied to near-surface investigations (Nakata et al. 2011; Draganov et al. 2013; Behm et al. 2014; Cheng et al. 2015; Quiros et al. 2016; Guan et al. 2021; Liu et al. 2021; Ning et al. 2021). Le Feuvre et al. (2015) introduced the use of cross-correlations in the passive multichannel analysis of surface waves and showed an improvement in the determination of subsurface shear velocities from ambient seismic noise. For complex urbanized environments, Cheng et al. (2016) showed that directional noise sources introduce azimuthal effects into the phase-velocity estimation because true randomness of ambient noise cannot be achieved in reality.

To minimize the azimuthal effects, different kinds of algorithms have been developed to improve passive imaging with ambient noise. They can be summarized mainly as: (1) Beamforming analysis. Cheng et al. (2016) presented the multichannel analysis of passive surface waves (MAPS) to correct the azimuthal effects of cultural noise by identifying the azimuth of the predominant noise sources based on beamforming analysis. Xia et al. (2017) demonstrated the advantages of MAPS in urban environments and suggested that MAPS could be used to accurately and rapidly image the surface wave dispersion energy, even for only a few minutes of noise recordings. Pang et al. (2019) introduced a data selection technique in the time domain for selective stacking of cross-correlations and applied it to improve the MAPS measurement. Liu et al. (2020) used a beamforming algorithm to correct the apparent velocities of waves traveling obliquely to the array by adding two more offline receivers to a conventional linear array. Morton et al. (2021) proposed the passive 1D-2D MASW method to select optimal time windows when passive seismic sources are of sufficient quality and aligned with the 1D receiver spread for improving passive-data dispersion measurement. Ning et al. (2022) used beamforming analysis to select noise segments coming from sources in the stationary phase zone to obtain virtual shot gathers with high signal-to-noise ratio and high-resolution dispersion energy. Mi et al. (2022) applied the MAPS method to traffic-induced noise recorded by dense linear arrays for investigating near-surface structures in Hangzhou City, Eastern China. (2) Filter and weight stacking. For the biased Green’s function due to directional sources, Carrière et al. (2014) designed matrix-based spatial filters to remove unwanted contributions in the cross-correlations.
Weaver and Yoritomo (2018) introduced stacking schemes for optimizing the time-varying weight to make the effective noise source distribution isotropic. Wu et al. (2020) developed an adapted eigenvalue-based filter to attenuate the interference of strong directional sources and improve cross-correlations. Zhao et al. (2021) applied the wavelet-domain filter to ambient noise cross-correlations for enhancing the signal-to-noise ratio of virtual shot gathers and improving the subsequent phase-velocity dispersion measurement. Shirzad et al. (2022) developed a new method based on weighted root-mean-square stacking to improve the distorted Green’s function in the case of non-uniform distribution of noise sources. (3) Waveform inversion. For anisotropic seismic source distributions and heterogeneous subsurface structures, Sager et al. (2018) demonstrated that full waveform ambient noise inversion is an effective and promising approach to estimate source distribution. The theoretical foundation of this approach dates back to earlier work (Tromp et al. 2010; Fichtner 2015). Xu et al. (2019) further demonstrated that using both vertical–vertical and radial–radial cross-correlations can better constrain estimation of the source distribution than vertical–vertical cross-correlations alone. Datta et al. (2019) introduced a new method based on the theoretical framework of sensitivity kernels for cross-correlation waveforms to determine ambient noise directionality. Zhou et al. (2022) developed a waveform joint imaging algorithm to invert noise source distributions and the corresponding unbiased surface wave velocities.

Compared with the other kinds of methods, beamforming analysis is an effective tool for estimating the directionality of the wavefield and correcting the velocity bias for 2-D receiver arrays or pseudo-linear arrays (Rost and Thomas 2002; Roux 2009; Behr et al. 2013; Cheng et al. 2016; Liu et al. 2020; Ning et al. 2022). Unfortunately, it is challenging to recover the correct noise source distribution with beamforming on a linear array (Liu et al. 2020). With the development of multicomponent seismic instruments, approaches based on the polarization of seismic waves are used to estimate the wavefield directionality (Perelberg and Hornbostel 1994; Baker and Stevens 2004; Schulte-Pelkum et al. 2004; Tanimoto et al. 2006; Schimmel et al. 2011; Takagi et al. 2018; Dangwal and Behm 2021). Assuming the seismic wave type of the observed data, we can infer a direction of incidence from the particle motion (Koper and Hawley 2010). An advantage of polarization analysis is that the back azimuth is estimated from the three-component records at a single station. Chevrot et al. (2007) compared results from beamforming to particle motion polarization analysis and concluded that polarization analysis is able to constrain the direction of arrival with only one three-component station. Takagi et al. (2018) provided a detailed description of polarization analysis of incident waves from a distributed source more suitable for microseisms and applied the approach to estimate the directionality of ambient noise recorded by Hi-net. Suemoto et al. (2020) applied polarization analysis of InSight seismic data to estimate the temporal variation and frequency dependence of the Martian ambient noise field and analyzed the presence of several ambient noise sources as well as geological structure at the landing site. Zenhäusern et al. (2022) used polarization analysis to estimate the back azimuth of seismic events recorded by InSight and create a new and extended set of marsquake locations.

A 2-D array requires wide space for the deployment of receivers, which may not be easily available in highly populated urban areas. In this paper, we propose an alternative passive surface wave method that applies polarization analysis to multicomponent noise recorded by a 1-D linear array to capture the dominant azimuth of the noise-source distribution and perform accurate dispersion measurements. The theoretical background of the proposed method is introduced first. Next, three synthetic tests demonstrate the advantages of our method in detecting the dominant noise-source direction and estimating unbiased dispersion measurements under various source distributions. Finally, results from two field examples in urban environments show the feasibility of the proposed method.

2 Method

When noise sources show a strong directionality, linear arrays may produce biased phase-velocity measurements (Park and Miller 2008; Le Feuvre et al. 2015). Capturing the dominant azimuth of noise sources is key to the azimuthal adjustment in the MAPS method (Cheng et al. 2016).

Assuming that Rayleigh waves are incident as plane waves on each seismic station, we can represent the multicomponent wavefields by a superposition of incident plane waves as follows:

$$\begin{gathered} u_{Z} \left( f \right) = \mathop \smallint \limits_{ - \pi }^{\pi } {\text{d}}\varphi A_{{\text{R}}} \left( {\varphi ,f} \right) \hfill \\ u_{{\text{N}}} \left( f \right) = \mathop \smallint \limits_{ - \pi }^{\pi } {\text{d}}\varphi iH\left( f \right)A_{{\text{R}}} \left( {\varphi ,f} \right)\sin \varphi \hfill \\ u_{{\text{E}}} \left( f \right) = \mathop \smallint \limits_{ - \pi }^{\pi } {\text{d}}\varphi iH\left( f \right)A_{{\text{R}}} \left( {\varphi ,f} \right)\cos \varphi \hfill \\ \end{gathered}$$
(1)

where f is frequency; \(u_{Z} \left( f \right)\), \(u_{{\text{N}}} \left( f \right)\), and \(u_{{\text{E}}} \left( f \right)\) are the Fourier spectra of the vertical, north and east wavefields, respectively; \(A_{{\text{R}}} \left( {\varphi ,f} \right)\) is the Fourier spectrum of the vertical wavefield of the Rayleigh wave propagating in azimuth \(\varphi\) (in degrees counter-clockwise from east); and \(H\left( f \right)\) is the horizontal-over-vertical ratio of the Rayleigh waves. Following Takagi et al. (2018), we define the Fourier transform in time as \(x\left( f \right) = \mathop \smallint \limits_{ - \infty }^{ + \infty } X\left( t \right)e^{i2\pi ft} {\text{d}}t\). In this convention, the particle motion is retrograde when \(H\left( f \right)\) is positive. According to previous studies (Harmon et al. 2010; Takagi et al. 2018), the relationships between the vertical-horizontal cross-spectra and the azimuthal power spectrum of incident Rayleigh waves under the assumption of random uncorrelated waves are given by (see Appendix A for the details)

$$\begin{gathered} \left\langle {u_{Z}^{*} \left( f \right)u_{N} \left( f \right)} \right\rangle = iH\left( f \right)b_{R1} \hfill \\ \left\langle {u_{Z}^{*} \left( f \right)u_{E} \left( f \right)} \right\rangle = iH\left( f \right)a_{R1} \hfill \\ \end{gathered}$$
(2)

where ⟨⟩ denotes the ensemble average; \({a}_{R1}\) and \({b}_{R1}\) are first-order Fourier coefficients of the azimuthal power spectrum. When Rayleigh waves are radiated from a distributed source, it is indicated from Eq. (2) that the first-order terms have a single peak that represents the centroid of the propagation directions of the incident Rayleigh waves.
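
The step from Eq. (1) to Eq. (2) can be sketched as follows. This is only an outline under stated assumptions, not the derivation of Appendix A: \(H\left( f \right)\) is taken to be real, waves arriving from different azimuths are taken to be uncorrelated so that \(\left\langle {A_{{\text{R}}}^{*} \left( {\varphi ,f} \right)A_{{\text{R}}} \left( {\varphi^{\prime},f} \right)} \right\rangle = P_{{\text{R}}} \left( {\varphi ,f} \right)\delta \left( {\varphi - \varphi^{\prime}} \right)\), and the first-order coefficients are written without any extra normalization factor. The symbol \(P_{{\text{R}}}\) for the azimuthal power spectrum is notation introduced here for illustration.

$$\begin{aligned} \left\langle {u_{Z}^{*} \left( f \right)u_{{\text{N}}} \left( f \right)} \right\rangle &= iH\left( f \right)\int_{ - \pi }^{\pi } {\int_{ - \pi }^{\pi } {\left\langle {A_{{\text{R}}}^{*} \left( {\varphi ,f} \right)A_{{\text{R}}} \left( {\varphi^{\prime},f} \right)} \right\rangle \sin \varphi^{\prime}\,{\text{d}}\varphi \,{\text{d}}\varphi^{\prime}} } = iH\left( f \right)\int_{ - \pi }^{\pi } {P_{{\text{R}}} \left( {\varphi ,f} \right)\sin \varphi \,{\text{d}}\varphi } = iH\left( f \right)b_{R1} \\ \left\langle {u_{Z}^{*} \left( f \right)u_{{\text{E}}} \left( f \right)} \right\rangle &= iH\left( f \right)\int_{ - \pi }^{\pi } {P_{{\text{R}}} \left( {\varphi ,f} \right)\cos \varphi \,{\text{d}}\varphi } = iH\left( f \right)a_{R1} \end{aligned}$$

Under these assumptions both cross-spectra are purely imaginary, and their ratio \(b_{R1} /a_{R1}\) does not depend on \(H\left( f \right)\) or on the chosen normalization. For a source confined to a narrow range around a single azimuth \(\varphi_{0}\), the ratio reduces to \(\tan \varphi_{0}\).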

Following Takagi et al. (2018), the propagation azimuth of the Rayleigh waves can be estimated from one-station multicomponent records without the ambiguity of 180° when the direction of the rotational motion (prograde or retrograde) is known. Although the rotational direction of the fundamental mode Rayleigh waves is retrograde in most cases, the rotational motion of Rayleigh waves can be prograde for the fundamental mode in sedimentary areas. To resolve the uncertainty in the direction of rotation, we utilize the asymmetry of the cross-correlation functions to determine the predominant direction of Rayleigh waves regardless of the direction of rotational motion. Note that we take the orientation from the first station to the end station as the forward direction of incident Rayleigh waves. Conversely, the backward direction of incident Rayleigh waves is from the end station to the first station. The causal and acausal parts of cross-correlation functions of vertical-component records at several stations can be used to determine whether incident Rayleigh waves come from the forward or backward direction (Stehly et al. 2006; Pang et al. 2019). If the amplitude of the causal part of the cross-correlation functions is much larger than that of the acausal part, it is deduced that incident Rayleigh waves mainly originate from the forward direction. Similarly, incident Rayleigh waves mainly originate from the backward direction when the amplitude of the causal part of the cross-correlation functions is smaller than that of the acausal part. We then define the azimuth \(\mathrm{\varphi }\) that characterizes the directionality of incident Rayleigh waves (Takagi et al. 2018; Suemoto et al. 2020),

$$\varphi_{j} \left( f \right) = \left\{ \begin{gathered} \arctan \left( {\frac{{{\text{Im}} u_{Z}^{*} \left( f \right)u_{{\text{N}}} \left( f \right)}}{{{\text{Im}} u_{Z}^{*} \left( f \right)u_{{\text{E}}} \left( f \right)}}} \right) + \pi = \arctan \left( {\frac{{b_{R1} }}{{a_{R1} }}} \right) + \pi ,\quad {\text{if}}\;{\text{forward}} \hfill \\ \arctan \left( {\frac{{{\text{Im}} u_{Z}^{*} \left( f \right)u_{{\text{N}}} \left( f \right)}}{{{\text{Im}} u_{Z}^{*} \left( f \right)u_{{\text{E}}} \left( f \right)}}} \right) = \arctan \left( {\frac{{b_{R1} }}{{a_{R1} }}} \right),\quad \quad \quad \quad {\text{if}}\;{\text{backward}} \hfill \\ \end{gathered} \right.$$
(3)
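
A minimal NumPy sketch of Eq. (3) for one station is given below. It is an illustration under stated assumptions rather than the authors' implementation: the ensemble average is approximated by averaging over non-overlapping time windows, the forward/backward decision compares the summed squared amplitude of the causal and acausal halves of a stacked cross-correlation, and the function names `propagation_direction` and `station_azimuth` are hypothetical. NumPy's FFT uses the opposite sign convention to the Fourier transform defined above, which flips the sign of both imaginary parts and leaves their ratio unchanged.

```python
import numpy as np

def propagation_direction(ccf):
    """Classify a stacked cross-correlation as 'forward' or 'backward'.

    ccf: 1-D array with an odd number of samples and zero lag at the centre.
    Assumption: the comparison uses summed squared amplitude of each half.
    """
    ccf = np.asarray(ccf, dtype=float)
    mid = len(ccf) // 2
    acausal_energy = np.sum(ccf[:mid] ** 2)
    causal_energy = np.sum(ccf[mid + 1:] ** 2)
    return "forward" if causal_energy > acausal_energy else "backward"

def station_azimuth(z, n, e, fs, nwin, direction):
    """Azimuth per frequency (radians) from Z, N, E records, following Eq. (3)."""
    nseg = len(z) // nwin

    def spectra(x):
        # Split the record into nseg windows of nwin samples and transform each.
        segs = np.asarray(x, dtype=float)[:nseg * nwin].reshape(nseg, nwin)
        return np.fft.rfft(segs, axis=1)

    Z, N, E = spectra(z), spectra(n), spectra(e)
    # Window average stands in for the ensemble average of the cross-spectra.
    czn = np.mean(np.conj(Z) * N, axis=0)
    cze = np.mean(np.conj(Z) * E, axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        phi = np.arctan(czn.imag / cze.imag)
    if direction == "forward":
        phi = phi + np.pi  # first branch of Eq. (3)
    return np.fft.rfftfreq(nwin, d=1.0 / fs), phi
```

Frequencies at which both imaginary parts vanish (for example the zero-frequency bin) yield NaN and would need to be excluded before any further averaging.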

where \({\mathrm{\varphi }}_{j}(f)\) corresponds to the azimuth estimated from station j of the multicomponent records. For M seismic stations of 1-D linear array, we can obtain M repeated measurements of the azimuth. Although single station polarization analysis is commonly used for the estimation of the azimuth, we calculate the mean azimuth \(\overline{\varphi }(f)\) by stacking all seismic stations for a robust estimation considering that some stations may be affected by the random disturbance,

$$\overline{\varphi }\left( f \right) = \frac{1}{M}\mathop \sum \limits_{j = 1}^{M} \varphi_{j} \left( f \right)$$
(4)

It follows from Eq. (4) that \(\overline{\varphi }(f)\) is frequency-dependent. For a particular frequency, one corresponding azimuth can be calculated. Statistically, we obtain the frequency (count) distribution of the mean azimuth within a broad frequency band, denoted \(P(\overline{\varphi })\). The dominant azimuth \(\widehat{\varphi }\) for the linear array is detected by maximizing the counts of \(P(\overline{\varphi })\) as follows

$$\hat{\varphi } = \mathop {{\text{argmax}}}\limits_{{\overline{\varphi }}} P\left( {\overline{\varphi }} \right)$$
(5)
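
Eqs. (4) and (5) can be sketched as a station average followed by a histogram peak search. The bin width of 5° and the azimuth range of −90° to 270° (the combined range of the two branches of Eq. (3)) are assumptions made for this illustration, and `dominant_azimuth` is a hypothetical name. The mean is the plain arithmetic mean of Eq. (4); it does not handle angular wrap-around.

```python
import numpy as np

def dominant_azimuth(phi_deg, freqs, fmin=1.0, fmax=30.0, bin_width_deg=5.0):
    """Mean azimuth per frequency (Eq. 4) and dominant azimuth (Eq. 5).

    phi_deg: array of shape (M, F), azimuth in degrees for M stations and
             F frequencies; freqs: array of shape (F,) in Hz.
    """
    phi_deg = np.asarray(phi_deg, dtype=float)
    mean_phi = np.mean(phi_deg, axis=0)           # Eq. (4): average over stations
    band = (freqs >= fmin) & (freqs <= fmax)      # analysis band
    values = mean_phi[band]
    values = values[np.isfinite(values)]          # drop undefined estimates
    edges = np.arange(-90.0, 270.0 + bin_width_deg, bin_width_deg)
    counts, edges = np.histogram(values, bins=edges)   # P(mean azimuth)
    k = int(np.argmax(counts))                    # Eq. (5): bin with most counts
    return mean_phi, 0.5 * (edges[k] + edges[k + 1])
```

The returned dominant azimuth is the centre of the most populated bin, so its resolution is limited by the chosen bin width.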

After detecting the dominant azimuth, we deduce that sources of noise records are predominantly classified as either inline or offline distribution. Note that inline distribution is considered as the stationary phase zone of the linear array (Boschi and Weemstra 2015; Ning et al. 2022). As for the inline case, phase-velocity dispersion measurements without azimuthal adjustment are implemented by the MAPS method introduced by Cheng et al. (2016):

$$E\left( {f,c} \right) = \left| {\mathop \sum \limits_{j = 1}^{M - 1} \mathop \sum \limits_{k = j + 1}^{M} \exp \left( {\frac{{i2\pi fx_{jk} }}{c}} \right)\frac{{C_{jk}^{ + } \left( f \right) + C_{jk}^{ - } \left( f \right)}}{2}} \right|,$$
(6)

where \(E(f,c)\) is the relative dispersion energy matrix for a particular frequency \(f\) and a scanning velocity \(c\); \({C}_{jk}^{+}\left(f\right)\) and \({C}_{jk}^{-}\left(f\right)\) are the Fourier transformation of the causal and acausal parts of cross-correlation between station j and station k of the vertical-component records, respectively; \({x}_{jk}\) corresponds to the distance between station j and station k.
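
The double sum of Eq. (6) can be written as a scan over trial velocities, as in the sketch below. The function name `dispersion_energy` and the array layout (one row per station pair) are assumptions for illustration. The sign of the phase carried by the cross-correlation spectra depends on the Fourier convention, so the spectra passed in must be consistent with the steering term \(\exp (i2\pi fx_{jk} /c)\) of Eq. (6).

```python
import numpy as np

def dispersion_energy(C_pos, C_neg, x, freqs, vels):
    """Relative dispersion energy E(f, c) following Eq. (6).

    C_pos, C_neg: complex arrays of shape (P, F), spectra of the causal and
                  acausal parts for P station pairs and F frequencies.
    x:     array of shape (P,), inter-station distances x_jk in metres.
    freqs: array of shape (F,) in Hz; vels: array of shape (V,) in m/s.
    Returns an array of shape (V, F).
    """
    C = 0.5 * (C_pos + C_neg)                     # average of both sides
    # Steering term exp(i 2 pi f x / c), broadcast to shape (V, P, F).
    phase = np.exp(1j * 2.0 * np.pi * freqs[None, None, :]
                   * x[None, :, None] / vels[:, None, None])
    return np.abs(np.sum(phase * C[None, :, :], axis=1))   # sum over pairs
```

Picking the velocity of maximum energy at each frequency then gives one candidate dispersion point per frequency.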

Substituting the detected azimuth \(\widehat{\varphi }\) into Eq. (6), accurate relative dispersion energy matrix \(E(f,c)\) for the offline case is updated by:

$$E\left( {f,c} \right) = \left\{ \begin{gathered} \left| {\mathop \sum \limits_{j = 1}^{M - 1} \mathop \sum \limits_{k = j + 1}^{M} \exp \left( {\frac{{i2\pi fx_{jk} \cos \hat{\varphi }}}{c}} \right)C_{jk}^{ + } \left( f \right)} \right|\quad , {\text{if}}\;{\text{forward}} \hfill \\ \left| {\mathop \sum \limits_{j = 1}^{M - 1} \mathop \sum \limits_{k = j + 1}^{M} \exp \left( {\frac{{i2\pi fx_{jk} \cos \hat{\varphi }}}{c}} \right)C_{jk}^{ - } \left( f \right)} \right|\quad , {\text{if}}\;{\text{backward}} \hfill \\ \end{gathered} \right.$$
(7)
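
A corresponding sketch of Eq. (7) differs from the inline case in two ways: the inter-station distance is scaled by \(\cos \hat{\varphi }\), and only the causal or the acausal spectrum is used, depending on the detected direction. The name `azimuth_corrected_energy` is hypothetical, and the same caveat about the phase sign convention of the input spectra applies.

```python
import numpy as np

def azimuth_corrected_energy(C_pos, C_neg, x, freqs, vels, phi_hat_deg, direction):
    """Azimuth-corrected dispersion energy following Eq. (7).

    C_pos, C_neg: complex arrays of shape (P, F) for P station pairs.
    x: (P,) distances in metres; freqs: (F,) in Hz; vels: (V,) in m/s.
    phi_hat_deg: dominant azimuth in degrees; direction: 'forward' or 'backward'.
    Returns an array of shape (V, F).
    """
    C = C_pos if direction == "forward" else C_neg        # one-sided spectrum
    x_eff = x * np.cos(np.deg2rad(phi_hat_deg))           # projected distance
    phase = np.exp(1j * 2.0 * np.pi * freqs[None, None, :]
                   * x_eff[None, :, None] / vels[:, None, None])
    return np.abs(np.sum(phase * C[None, :, :], axis=1))
```

Calling the function with `phi_hat_deg=0` gives the one-sided energy without azimuthal correction, which is convenient for comparing the two images.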

3 Numerical Tests

To demonstrate the performance of the proposed method, we conducted three synthetic tests for passive surface wave imaging: one for inline noise sources, and the others for offline noise sources. The normal mode summation code in the Computer Programs in Seismology package (Herrmann 2013) was used to generate the surface wave synthetic noise waveforms with 10,000 point single force sources at the free surface. The medium is a two-layer model (Table 1), whose properties are taken from Bonnefoy-Claudet et al. (2006). The 24 stations named H01 to H24 with H13 near the origin are deployed (the solid black triangles from left to right in Fig. 1a), and the interval between stations is 5 m.

Table 1 Parameters of a two-layer model for surface wave synthetic noise data
Fig. 1

a Sources and receiver distribution in case A. Noise sources (the red dots) are randomly distributed at polar coordinates between R1 = 1000 m, \({\theta }_{1}\text{ = -}10^\circ\) and R2 = 3000 m, \({\theta }_{2}\text{ = }10^\circ\). A 24-channel linear array (the solid black triangles) with a 5-m interval is used. b Randomly distributed force intensity at different directions. FE, force in the eastward/westward direction (eastward positive); FN, force in the northward/southward direction (northward positive); FD, force in the downward/upward direction (downward positive). c Synthetic multicomponent noise data recorded by one station marked as H01

We first consider case A of inline noise sources, which are randomly distributed at polar coordinates between R1 = 1000 m, \(\theta_{1} = - 10^\circ\) and R2 = 3000 m, \(\theta_{2} { = }10^\circ\), as shown by the red dots in Fig. 1a. The orientations and amplitudes of the point force sources were also randomly distributed (Fig. 1b) as suggested by Herrmann (2013), which generates both Rayleigh and Love waves. Examples of the synthetic ambient noise recorded by a station in the seismic array are shown in Fig. 1c. The noise recordings are experimentally divided into 10 s windows. Following the processing procedure described in Cheng et al. (2016), cross-correlation functions with H01 as the virtual source are calculated (Fig. 2a). The asymmetry of the cross-correlations shows that incident Rayleigh waves mainly come from the backward direction. Figure 2b and c display the azimuth measured by station H01 and station H24, respectively. The mean azimuth for the 1-D linear array is shown in Fig. 2d. The mean azimuth is more robust than the azimuth from a single station. We then obtained the frequency distribution of the mean azimuth for the 1-D linear array within a broad frequency range from 1 to 30 Hz (Fig. 2e). The dominant azimuth, detected from the peak of the counts, is \(\hat{\varphi }{ = }0^\circ\). Consequently, we can deduce that the distribution of noise sources is predominantly inline. Figure 2f exhibits phase velocity–frequency diagrams calculated by Eq. (6). Peaks of the dispersion diagrams match well with the theoretical curve (the solid white line in Fig. 2f).
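
The source and receiver geometry of case A can be set up as in the sketch below. The text does not state how sources are drawn within the sector, so sampling uniformly in radius and angle is an assumption, as is placing H13 exactly at the origin; the function names are hypothetical. The waveform synthesis itself (normal mode summation) is not reproduced here.

```python
import numpy as np

def random_sector_sources(n, r1, r2, theta1_deg, theta2_deg, rng):
    """Draw n source positions (x east, y north, in metres) in a polar sector.

    Assumption: radius and angle are sampled uniformly and independently.
    """
    r = rng.uniform(r1, r2, n)
    theta = np.deg2rad(rng.uniform(theta1_deg, theta2_deg, n))
    return np.column_stack((r * np.cos(theta), r * np.sin(theta)))

def linear_array(n_sta=24, dx=5.0, center_index=12):
    """Station coordinates along the east axis; index 12 (H13) sits at x = 0."""
    x = (np.arange(n_sta) - center_index) * dx
    return np.column_stack((x, np.zeros(n_sta)))

rng = np.random.default_rng(0)
# Case A: 10,000 sources between 1000 m and 3000 m, azimuths -10 to 10 degrees.
sources = random_sector_sources(10000, 1000.0, 3000.0, -10.0, 10.0, rng)
stations = linear_array()
```

Changing the angular limits to 125° and 145° would give the offline sector used for case B.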

Fig. 2

a Cross-correlation functions with H01 as the virtual source. b The azimuth measured by station H01. c The azimuth measured by station H24. d The mean azimuth for the 1-D linear array. e The frequency distribution of the mean azimuth within a broad frequency band from 1 to 30 Hz in case A. f Phase velocity–frequency diagrams in case A. The solid white line is the theoretical dispersion curve. The diagram is normalized along the frequency axis in the full frequency band for display only

We next consider case B of offline noise sources, which are randomly distributed at polar coordinates between R1 = 1000 m, \(\theta_{1} { = 125}^\circ\) and R2 = 3000 m, \(\theta_{2} { = }145^\circ\), as shown by the red dots in Fig. 3. From the cross-correlation functions (Fig. 4a), we infer that incident Rayleigh waves mainly come from the forward direction since the amplitude of the causal part of the cross-correlation functions is much larger than that of the acausal part. Figure 4b and c display the mean azimuth for the 1-D linear array and the frequency distribution of the mean azimuth in the frequency band of 1–30 Hz, respectively. The dominant azimuth, detected from the peak of the counts, is \(\hat{\varphi }{ = 135}^\circ\). Therefore, it is deduced that the distribution of noise sources is predominantly offline. Figure 4d, obtained using Eq. (6), displays velocity–frequency diagrams without azimuthal correction. Figure 4d shows a systematic deviation between the surface-wave energy and the theoretical dispersion curve (the solid white line in Fig. 4d) because of the offline noise source distribution. Figure 4e, obtained using Eq. (7), displays velocity–frequency diagrams after azimuthal correction. The picked phase velocities agree with the theoretical dispersion curve (the solid white line in Fig. 4e).

Fig. 3

Sources and receiver distribution in case B. Noise sources (the red dots) are randomly distributed at polar coordinates between R1 = 1000 m, \({\theta }_{1}\text{ = 125}^\circ\) and R2 = 3000 m, \({\theta }_{2}\text{ = }145^\circ\). A 24-channel linear array (the solid black triangles) with a 5-m interval is used

Fig. 4

a Cross-correlation functions with H01 as the virtual source. b The mean azimuth for the 1-D linear array. c The frequency distribution of the mean azimuth within a broad frequency band from 1 to 30 Hz in case B. d Phase velocity–frequency diagrams without azimuthal correction, and e phase velocity–frequency diagrams after azimuthal correction in case B. The solid white lines are the theoretical dispersion curves

We now consider case C, which includes offline and inline noise sources simultaneously. Offline noise sources (10,000 point single force sources) are randomly distributed at polar coordinates between R1 = 1000 m, \(\theta_{1} { = }50^\circ\) and R2 = 3000 m, \(\theta_{2} { = }70^\circ\). Inline noise sources (2000 point single force sources) are randomly distributed in two distinct areas, which are in the range of R1 = 1000 m, \(\theta_{3} = - 10^\circ\) and R2 = 3000 m, \(\theta_{4} { = }10^\circ\), and R1 = 1000 m, \(\theta_{5} { = 17}0^\circ\) and R2 = 3000 m, \(\theta_{6} { = }190^\circ\) in polar coordinates, as shown by the red dots (Fig. 5). As described previously, we calculated cross-correlation functions (Fig. 6a). Although surface-wave signals exist in the causal part of the cross-correlation functions, the amplitude of the causal part is smaller than that of the acausal part. This indicates that incident Rayleigh waves mainly come from the backward direction. The mean azimuth for the 1-D linear array is shown in Fig. 6b, and the frequency distribution of the mean azimuth in the frequency range from 1 to 30 Hz is shown in Fig. 6c. The dominant azimuth, detected from the peak of the counts, is \(\hat{\varphi }{ = 6}0^\circ\). As a result, it is deduced that the distribution of noise sources is predominantly offline. Figure 6d, obtained using Eq. (6), displays velocity–frequency diagrams without azimuthal correction. The offline noise source distribution results in a significant deviation between the surface-wave energy and the theoretical dispersion curve (the solid white line in Fig. 6d). To correct the overestimation caused by azimuthal effects, we applied Eq. (7) to obtain velocity–frequency diagrams after azimuthal correction (Fig. 6e). Phase velocities of surface waves can be clearly identified in Fig. 6e and match the theoretical dispersion curve (the solid white line in Fig. 6e).
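
As a rough check on the size of the bias in cases B and C, consider the idealized situation of a single plane wave arriving at azimuth \(\hat{\varphi }\). Eq. (7) differs from the uncorrected scan only by replacing \(x_{jk}\) with \(x_{jk} \cos \hat{\varphi }\), so an uncorrected scan peaks at an apparent velocity \(c_{{{\text{app}}}}\) (notation introduced here) that exceeds the true phase velocity \(c\):

$$c_{{{\text{app}}}} = \frac{c}{{\left| {\cos \hat{\varphi }} \right|}},\quad \quad \frac{1}{{\left| {\cos 135^\circ } \right|}} = \sqrt 2 \approx 1.41\;({\text{case}}\;{\text{B}}),\quad \quad \frac{1}{{\cos 60^\circ }} = 2\;({\text{case}}\;{\text{C}})$$

These factors are for a single plane wave only; the synthetic sources span ±10° around the central azimuths, and case C also contains inline sources, so the deviation seen in the dispersion images need not equal these values exactly.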

Fig. 5

Sources and receiver distribution in case C. Offline noise sources (10,000 point single force sources) are randomly distributed at polar coordinates between R1 = 1000 m, \({\theta }_{1}\text{ = }50^\circ\) and R2 = 3000 m, \({\theta }_{2}\text{ = }70^\circ\). Inline noise sources (2000 point single force sources) are randomly distributed in two distinct areas, which are in the range of R1 = 1000 m, \(\theta_{3} = - 10^\circ\) and R2 = 3000 m, \({\theta }_{4}\text{ = }10^\circ\), and R1 = 1000 m, \({\theta }_{5}\text{ = 17}0^\circ\) and R2 = 3000 m, \({\theta }_{6}\text{ = }190^\circ\), as shown by the red dots. A 24-channel linear array (the solid black triangles) with a 5-m interval is used

Fig. 6

a Cross-correlation functions with H01 as the virtual source. b The mean azimuth for the 1-D linear array. c The frequency distribution of the mean azimuth within a broad frequency band from 1 to 30 Hz in case C. d Phase velocity–frequency diagrams without azimuthal correction, and e phase velocity–frequency diagrams after azimuthal correction in case C. The solid white lines are the theoretical dispersion curves

4 Applications to Field Data

4.1 Qianjiang New City Experiment

The experiment site was a wasteland beside the main road in Qianjiang New City, Hangzhou, China (Fig. 7a). A linear array of 12 Fairfield Nodal ZLand 3C 5 Hz seismic sensors at a 10 m spacing was deployed to continuously record cultural noise from 9:30 am on 15 June to 9:30 am on 16 June 2019 (local time), with a sampling frequency of 1000 Hz (Fig. 7b). Note that the two horizontal components of the sensors are oriented in the inline and crossline directions. Hence, the inline and crossline components are treated as the east and north components, respectively. Furthermore, an active seismic measurement was performed along the linear array with a hammer source at a nearest offset of 10 m to evaluate the accuracy and validity of the corrected phase velocities.

Fig. 7

a An aerial photograph of the survey area in Qianjiang New City, Hangzhou, China. A 1-D linear array of 12 Fairfield Nodal ZLand 3C 5 Hz seismic sensors (the solid red triangles) with a 10-m spatial interval was deployed. b Multicomponent (East (E), North (N), Vertical (Z)) noise data recorded by the first trace of the 1-D linear array

We divided the vertical-component noise recordings into a series of 60-s time segments, a length chosen empirically, after de-meaning and de-trending, and calculated the cross-correlation functions with the virtual source located at the first trace (H01) of the survey area (Fig. 8a). The amplitude of the causal part of the cross-correlation functions is smaller than that of the acausal part, which indicates that incident Rayleigh waves mainly come from the backward direction. Figure 8b and c display the mean azimuth for the 1-D linear array and the frequency distribution of the mean azimuth in the frequency band of 1–30 Hz. The dominant azimuth, determined by detecting the peak location of the counts, is \(\hat{\varphi } = - 27^\circ\). Consequently, we deduce that the noise source distribution is predominantly offline. Figure 8d shows velocity–frequency diagrams without azimuthal correction, obtained using Eq. (6). The velocity overestimation caused by the offline source distribution is corrected in the dispersion image (Fig. 8e) obtained using Eq. (7). We picked phase velocities between 2 and 16 Hz by following the higher-amplitude peaks along the energy trend (Fig. 8d and e).
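The preprocessing and the causal/acausal comparison described above can be sketched as follows. This is a minimal illustration rather than the processing chain used in the study: the function names are hypothetical, the stack is a plain average without spectral whitening or other normalization, and the demonstration uses short synthetic noise instead of 60-s field segments.

```python
import numpy as np

def split_and_detrend(trace, fs, seg_len_s=60.0):
    """Cut a trace into fixed-length segments and remove a least-squares
    line (mean and linear trend) from each segment."""
    n = int(round(seg_len_s * fs))
    n_seg = len(trace) // n
    segs = np.array(trace[: n_seg * n], dtype=float).reshape(n_seg, n)
    t = np.arange(n)
    for i in range(n_seg):
        slope, intercept = np.polyfit(t, segs[i], 1)
        segs[i] -= slope * t + intercept
    return segs

def stacked_cross_correlation(src_segs, rec_segs):
    """Average the segment-wise cross-correlations (virtual source = src).
    A positive lag means the receiver lags the virtual source."""
    n = src_segs.shape[1]
    nfft = 2 * n                                  # zero-pad against wrap-around
    s = np.fft.rfft(src_segs, nfft, axis=1)
    r = np.fft.rfft(rec_segs, nfft, axis=1)
    cc = np.fft.irfft(r * np.conj(s), nfft, axis=1).mean(axis=0)
    cc = np.concatenate((cc[-(n - 1):], cc[:n]))  # order lags -(n-1)..(n-1)
    return np.arange(-(n - 1), n), cc

def causal_to_acausal_ratio(lags, cc):
    """Energy at positive lags divided by energy at negative lags."""
    return np.sum(cc[lags > 0] ** 2) / np.sum(cc[lags < 0] ** 2)

# Synthetic demonstration: the receiver sees the source noise 5 samples later.
rng = np.random.default_rng(0)
fs = 100.0
src = rng.standard_normal(2000)
rec = np.roll(src, 5)
lags, cc = stacked_cross_correlation(split_and_detrend(src, fs, 2.0),
                                     split_and_detrend(rec, fs, 2.0))
ratio = causal_to_acausal_ratio(lags, cc)
```

In this sketch a ratio above 1 corresponds to energy travelling from the virtual source toward the receiver (the forward direction), and a ratio below 1 to the backward direction, as observed at this site.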

Fig. 8

a Cross-correlation functions with the virtual source located at the first trace of the survey area in Qianjiang New City experiment. b The mean azimuth for 1-D linear array. c The frequency distribution of the mean azimuth with a broad frequency band from 1 to 30 Hz. d Phase velocity–frequency diagrams without azimuthal correction, and e phase velocity–frequency diagrams after azimuthal correction. Phase velocities in dispersion images generated from passive source data (the solid white dots) are picked

The active shot gather is shown in Fig. 9a. Figure 9b displays the dispersion image obtained by applying the phase-shift method (Park et al. 1999) to the active seismic data. The surface-wave energy trend is poorly defined in the low-frequency band (< 14 Hz) but well defined in the high-frequency band of 14–30 Hz. We picked phase velocities from 14 to 30 Hz by following the higher-amplitude peaks along the energy trend. Comparing the active and passive dispersion measurements, we found that the phase-velocity dispersion curve picked in the corrected dispersion image generated from passive source data generally agrees with that from active source data (Fig. 10), which supports the accuracy of the proposed method.
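For readers unfamiliar with the transform, a phase-shift style dispersion image can be sketched as below. This is a generic illustration under stated assumptions, not the code used in this study: the function name, the phase-only normalization, and the synthetic non-dispersive gather (200 m/s, 15 Hz Ricker wavelet, 24 receivers at 5-m spacing) are all hypothetical choices.

```python
import numpy as np

def phase_shift_image(gather, dt, offsets, velocities, fmin, fmax):
    """Phase-shift style transform of a shot gather.

    gather: array (n_receivers, n_samples); offsets in m; velocities in m/s.
    Returns (freqs, image) with image of shape (n_velocities, n_freqs).
    """
    freqs = np.fft.rfftfreq(gather.shape[1], dt)
    keep = (freqs >= fmin) & (freqs <= fmax)
    freqs = freqs[keep]
    spec = np.fft.rfft(gather, axis=1)[:, keep]
    spec = spec / (np.abs(spec) + 1e-12)          # keep phase information only
    # Steering term exp(+i 2 pi f x / c): (n_receivers, n_velocities, n_freqs)
    steer = np.exp(2j * np.pi * offsets[:, None, None] * freqs[None, None, :]
                   / velocities[None, :, None])
    image = np.abs((steer * spec[:, None, :]).sum(axis=0)) / len(offsets)
    return freqs, image

# Synthetic, non-dispersive gather built by delaying a Ricker wavelet.
dt, n_t, c0, f0 = 0.002, 1000, 200.0, 15.0
offsets = 5.0 + 5.0 * np.arange(24)
tau = np.arange(n_t) * dt - 0.2
wavelet = (1.0 - 2.0 * (np.pi * f0 * tau) ** 2) * np.exp(-(np.pi * f0 * tau) ** 2)
f_all = np.fft.rfftfreq(n_t, dt)
delay = np.exp(-2j * np.pi * f_all[None, :] * offsets[:, None] / c0)
gather = np.fft.irfft(np.fft.rfft(wavelet)[None, :] * delay, n_t, axis=1)

velocities = np.arange(100.0, 501.0, 10.0)
freqs, image = phase_shift_image(gather, dt, offsets, velocities, 10.0, 20.0)
picked = velocities[np.argmax(image, axis=0)]     # peak velocity per frequency
```

Picking the maximum of the image at each frequency mirrors, in a simplified way, the manual picking of higher-amplitude peaks along the energy trend.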

Fig. 9

a A shot gather obtained using a hammer as the source in Qianjiang New City experiment, and b An image of dispersive energy from the active seismic data and the picked phase velocities (the solid white dots)

Fig. 10

Comparison between the picked phase-velocity dispersion curves in dispersion images generated from passive source data and active source data in Qianjiang New City experiment

4.2 Fengqi Road Experiment

The experiment was carried out along Fengqi Road in the city of Hangzhou, Zhejiang Province, China (Fig. 11a). A linear array of 24 Fairfield Nodal ZLand 3C 5 Hz seismic sensors was deployed with a sampling frequency of 1000 Hz and a 5-m spatial interval. The two horizontal components of the sensors are oriented in the inline and crossline directions, and the inline and crossline components are treated as the east and north components, respectively. Continuous traffic noise data were recorded for up to 36 h starting at 10:50 am local time on 10 June 2019 (Fig. 11b). As in the former example, we also conducted an active seismic measurement on the linear array, using a hammer source with a nearest offset of 5 m.

Fig. 11

a An aerial photograph of the survey line along the Fengqi Road in the city of Hangzhou, Zhejiang province of China. A 1-D linear array of 24 Fairfield Nodal ZLand 3C 5 Hz seismic sensors (the solid red triangles) in a 5-m spatial interval was deployed. b Multicomponent (East (E), North (N), Vertical (Z)) noise data recorded by the first trace of the 1-D linear array

The vertical-component noise recordings were split into 60-s time segments, a length chosen empirically, after de-meaning and de-trending. The cross-correlation functions with the virtual source located at the first trace (H01) of the survey line (Fig. 12a) are asymmetric: the amplitude of the causal part is much larger than that of the acausal part, which indicates that incident Rayleigh waves mainly come from the forward direction. The mean azimuth for the 1-D linear array is shown in Fig. 12b, and the frequency distribution of the mean azimuth in the frequency band of 1–30 Hz is shown in Fig. 12c. The dominant azimuth, determined by detecting the peak location of the counts, is \(\widehat{\varphi } = 162^\circ\). We deduce that the noise source distribution is predominantly offline. Figure 12d shows velocity–frequency diagrams without azimuthal correction, obtained using Eq. (6). The velocity overestimation caused by the offline source distribution is corrected in the dispersion image (Fig. 12e) obtained using Eq. (7). We picked phase velocities between 2 and 15 Hz by following the higher-amplitude peaks along the energy trend (Fig. 12d and e).
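The peak detection on the frequency distribution of the mean azimuth can be sketched as a histogram search. The snippet below is a minimal illustration: the function name, the 5° bin width, the 0° to 360° range, and the synthetic azimuth values (clustered near 162° with scattered outliers) are assumptions, not values taken from the field processing.

```python
import numpy as np

def dominant_azimuth(mean_azimuth_deg, az_min_deg, az_max_deg, bin_width_deg=5.0):
    """Return the centre of the most populated azimuth bin and the counts.

    mean_azimuth_deg holds one mean azimuth per frequency sample.
    """
    n_bins = int(round((az_max_deg - az_min_deg) / bin_width_deg))
    edges = np.linspace(az_min_deg, az_max_deg, n_bins + 1)
    counts, _ = np.histogram(mean_azimuth_deg, bins=edges)
    k = int(np.argmax(counts))
    return 0.5 * (edges[k] + edges[k + 1]), counts

# Synthetic mean azimuths: a cluster near 162 degrees plus uniform outliers.
rng = np.random.default_rng(1)
az = np.concatenate((rng.normal(162.0, 3.0, 200), rng.uniform(0.0, 360.0, 50)))
peak_az, counts = dominant_azimuth(az, 0.0, 360.0)
```

The resolution of the estimate is limited by the bin width, so a narrower bin gives a finer azimuth at the cost of noisier counts.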

Fig. 12

a Cross-correlation functions with the virtual source located at the first trace of the survey line in Fengqi Road experiment. b The mean azimuth for 1-D linear array. c The frequency distribution of the mean azimuth with a broad frequency band from 1 to 30 Hz. d Phase velocity–frequency diagrams without azimuthal correction, and e phase velocity–frequency diagrams after azimuthal correction. Phase velocities in dispersion images generated from passive source data (the solid white dots) are picked

Figure 13a and b display the active shot gather and the image of dispersive energy generated by transforming the active seismic data from the time–space domain to the frequency–velocity domain. The surface-wave energy trend is poorly defined in the low-frequency band (< 8 Hz) but well defined in the high-frequency band of 8–25 Hz. We picked phase velocities from 8 to 25 Hz by following the higher-amplitude peaks along the energy trend. Figure 14 compares the active and passive dispersion measurements. After the azimuthal correction, the close match between the phase-velocity dispersion curves picked in the dispersion images generated from passive and active source data further supports the validity of the proposed method. Unlike the previous field example, this one shows a higher mode of surface waves in the passive result but not in the active result. This phenomenon is discussed in Sect. 5.2.

Fig. 13

a A shot gather obtained using a hammer as the source in Fengqi Road experiment, and b An image of dispersive energy from the active seismic data and the picked phase velocities (the solid white dots)

Fig. 14

Comparison between the picked phase-velocity dispersion curves in dispersion images generated from passive source data and active source data in Fengqi Road experiment

5 Discussion

5.1 Complex Noise Source Conditions

For urban surface-wave surveys, passive surface-wave measurements can be improved provided that the directionality of the noise sources is accounted for (Asten 2006; Halliday et al. 2008; Cheng et al. 2016). We apply polarization analysis rather than beamforming to determine the dominant azimuth of noise sources from multicomponent noise data recorded by a 1-D linear array, because beamforming with a linear array has difficulty recovering the correct noise source distribution, as indicated by the array response function of the 1-D linear array (Liu et al. 2020). Synthetic tests and field examples have demonstrated that the proposed method is suitable for directional noise source distributions in urban environments. Here we discuss the performance of the proposed method under complex noise-source conditions, namely different dominant azimuths for Rayleigh waves and Love waves, and two dominant directions.

5.1.1 Different Dominant Azimuth for Rayleigh Waves and Love Waves

Noise sources generating Rayleigh waves (the red dots in Fig. 15) are randomly distributed in the range of R1 = 1000 m, \({\theta }_{1} = 125^\circ\) and R2 = 3000 m, \({\theta }_{2} = 145^\circ\) in polar coordinates, and noise sources generating Love waves (the blue dots in Fig. 15) are randomly distributed in the range of R1 = 1000 m, \({\theta }_{3} = 200^\circ\) and R2 = 3000 m, \({\theta }_{4} = 220^\circ\) in polar coordinates. We calculated the cross-correlation functions and observed that the causal part has clear surface-wave signals (Fig. 16a). Incident Rayleigh waves mainly come from the forward direction because the amplitude of the causal part is larger than that of the acausal part. Figure 16b and c display the mean azimuth for the 1-D linear array and the frequency distribution of the mean azimuth in the frequency range from 1 to 30 Hz, respectively. The dominant azimuth, determined by detecting the peak location of the counts, is \(\widehat{\varphi } = 138^\circ\). This indicates that the distribution of noise sources is predominantly offline. Figure 16d, obtained using Eq. (6), displays velocity–frequency diagrams without azimuthal correction. There is a significant deviation between the surface-wave energy and the theoretical dispersion curve (the solid white line in Fig. 16d) due to the offline noise source distribution. To minimize azimuthal effects, we applied Eq. (7) to obtain velocity–frequency diagrams after azimuthal correction (Fig. 16e). Compared with Fig. 16d, Fig. 16e provides phase velocities that deviate less from the theoretical dispersion curve (the solid white line in Fig. 16e). This shows that the proposed method can still obtain relatively accurate phase velocities when Rayleigh waves and Love waves have different dominant azimuths.

Fig. 15

Noise sources generating Rayleigh waves (the red dots) are randomly distributed in the range of R1 = 1000 m, \({\theta }_{1} = 125^\circ\) and R2 = 3000 m, \({\theta }_{2} = 145^\circ\) in polar coordinates, and noise sources generating Love waves (the blue dots) are randomly distributed in the range of R1 = 1000 m, \({\theta }_{3} = 200^\circ\) and R2 = 3000 m, \({\theta }_{4} = 220^\circ\) in polar coordinates. A 24-channel linear array (the solid black triangles) with a 5-m interval is used

Fig. 16

a Cross-correlation functions with H01 as the virtual source. b The mean azimuth for 1-D linear array. c The frequency distribution of the mean azimuth with a broad frequency band from 1 to 30 Hz for complex noise sources with different dominant azimuth for Rayleigh waves and Love waves. d Phase velocity–frequency diagrams without azimuthal correction, and e phase velocity–frequency diagrams after azimuthal correction. The solid white lines are theoretical dispersion curves

5.1.2 Two Dominant Directions

Noise sources are randomly distributed in two distinct areas, bounded by R1 = 1000 m, \({\theta }_{1} = 125^\circ\) and R2 = 3000 m, \({\theta }_{2} = 145^\circ\), and by R1 = 1000 m, \({\theta }_{3} = 200^\circ\) and R2 = 3000 m, \({\theta }_{4} = 220^\circ\) in polar coordinates. To simulate noise sources with different frequency contents in different azimuth ranges, 10 Hz and 25 Hz Ricker wavelets are chosen as the source impulses in the former and latter areas, respectively. The cross-correlation functions (Fig. 17a) show that incident Rayleigh waves mainly come from the forward direction because the amplitude of the causal part is larger than that of the acausal part. The mean azimuth for the 1-D linear array is shown in Fig. 17b, and the frequency distribution of the mean azimuth in the frequency range from 1 to 30 Hz is shown in Fig. 17c. By detecting the two peak locations of the counts, we found that noise sources in the frequency range from 1 Hz to about 18 Hz are dominated by the azimuth \({\widehat{\varphi }}_{1} = 135^\circ\), while noise sources in the frequency range from about 18 Hz to 30 Hz are dominated by the azimuth \({\widehat{\varphi }}_{2} = 210^\circ\). Figure 18a, obtained using Eq. (6), displays velocity–frequency diagrams without azimuthal correction. The surface-wave energy appears to break at around 18 Hz because noise sources have different frequency contents in different azimuth ranges, which makes it difficult to obtain accurate phase velocities. Figure 18b and c display the velocity–frequency diagrams after azimuthal correction using the azimuths \({\widehat{\varphi }}_{1} = 135^\circ\) and \({\widehat{\varphi }}_{2} = 210^\circ\), respectively, over the broad frequency band of 1–30 Hz. Clearly, it is inappropriate to use a single dominant azimuth over the whole frequency band for phase-velocity correction.
To minimize azimuthal effects, the dispersion measurement is performed using the azimuth \({\widehat{\varphi }}_{1} = 135^\circ\) in the frequency range from 1 Hz to about 18 Hz and \({\widehat{\varphi }}_{2} = 210^\circ\) in the frequency range from 18 to 30 Hz (Fig. 18d). The picked phase velocities match the theoretical dispersion curve (the solid white line in Fig. 18d). This demonstrates that the proposed method has the potential to deal with two or more dominant directions when noise sources have different frequency contents in different azimuth ranges, provided that the dominant azimuth is correctly determined in each frequency band. Using a single dominant azimuth over the broad frequency band is reasonable for the field examples above because the frequency distribution of the mean azimuth shows no distinct peaks at other azimuths and the surface-wave energy before azimuthal correction has no clear break.
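The band-wise choice of azimuth can be illustrated with a short sketch. It assumes a plane-wave bias of the form \(c/\left| {\cos \varphi } \right|\) along the array, which stands in for Eqs. (6) and (7) and may differ from them; the function names, the picked frequencies, and the 250 m/s velocity are hypothetical.

```python
import numpy as np

def azimuth_by_band(freqs_hz, split_hz, az_low_deg, az_high_deg):
    """Use one dominant azimuth below split_hz and another at or above it."""
    return np.where(np.asarray(freqs_hz) < split_hz, az_low_deg, az_high_deg)

def correct_velocity(c_apparent, azimuth_deg):
    """Assumed plane-wave correction: c = c_apparent * |cos(azimuth)|."""
    return np.asarray(c_apparent) * np.abs(np.cos(np.radians(azimuth_deg)))

freqs = np.array([5.0, 12.0, 17.0, 20.0, 28.0])       # hypothetical picks, Hz
az = azimuth_by_band(freqs, 18.0, 135.0, 210.0)

c_true = np.full(freqs.shape, 250.0)                  # hypothetical, m/s
c_app = c_true / np.abs(np.cos(np.radians(az)))       # biased apparent values

c_banded = correct_velocity(c_app, az)                # azimuth chosen per band
c_single = correct_velocity(c_app, 135.0)             # one azimuth everywhere
```

With the band-wise azimuth the assumed bias is removed at every frequency, whereas the single azimuth leaves the picks above 18 Hz biased, which is the behaviour contrasted in Fig. 18b to d.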

Fig. 17

a Cross-correlation functions with H01 as the virtual source. b The mean azimuth for the 1-D linear array. c The frequency distribution of the mean azimuth with a broad frequency band from 1 to 30 Hz for complex noise sources that have different frequency contents for two dominant directions

Fig. 18

a Phase velocity–frequency diagrams without azimuthal correction, and phase velocity–frequency diagrams after azimuthal correction using b \({\widehat{\varphi }}_{1}= \text{135} ^\circ\) in the frequency band of 1–30 Hz, c \({\widehat{\varphi }}_{2}= \text{210} ^\circ\) in the frequency band of 1–30 Hz, and d \({\widehat{\varphi }}_{1}= \text{135} ^\circ\) in the frequency range from 1 Hz to about 18 Hz and \({\widehat{\varphi }}_{2}= \text{210} ^\circ\) in the frequency range from 18 to 30 Hz. The solid white lines are theoretical dispersion curves

We also tested the case in which noise sources have similar frequency contents in different azimuth ranges by using a 10 Hz Ricker wavelet as the source impulse in both areas. The calculated cross-correlation functions (Fig. 19a) show that incident Rayleigh waves mainly come from the forward direction, since the amplitude of the causal part is larger than that of the acausal part. Figure 19b and c display the mean azimuth for the 1-D linear array and the frequency distribution of the mean azimuth in the frequency band of 1–30 Hz, respectively. The dominant azimuth, determined by detecting the peak location of the counts, is \(\widehat{\varphi } = 173^\circ\). Figure 19d and e display velocity–frequency diagrams obtained using Eqs. (6) and (7), respectively. The deviation between the surface-wave energy (Fig. 19e) and the theoretical dispersion curve (the solid white line in Fig. 19e) still exists after azimuthal correction. The proposed method therefore does not work for two or more dominant directions when noise sources have similar frequency contents in different azimuth ranges. The main reason is that the dominant azimuth measured by polarization analysis reflects the average effect of the two or more dominant directions at a particular frequency, so the calculated dominant azimuth can no longer correct the velocity deviation for this noise-source distribution. How to eliminate the deviation caused by complex noise sources that have similar frequency contents for two or more dominant directions is a realistic and challenging problem that needs further study. Because the characteristics of noise sources are unknown in practical applications, we suggest using active measurements to check the accuracy of passive measurements after azimuthal correction.
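A short calculation illustrates why an averaged azimuth cannot correct this case. The sketch assumes a plane-wave bias of the form \(c/\left| {\cos \varphi } \right|\) and treats the measured azimuth as the arithmetic mean of the two sector centres; both are simplifying assumptions, and the averaging performed by polarization analysis need not be a simple mean.

```python
import numpy as np

def correction_factor(azimuth_deg):
    """Assumed plane-wave factor |cos(azimuth)| applied to apparent velocity."""
    return abs(np.cos(np.radians(azimuth_deg)))

az_1, az_2 = 135.0, 210.0           # centres of the two source sectors
az_mean = 0.5 * (az_1 + az_2)       # 172.5, close to the measured 173

needed = (correction_factor(az_1), correction_factor(az_2))
applied = correction_factor(az_mean)
```

Under these assumptions the averaged azimuth gives a factor close to 1, so the apparent velocities are left almost unchanged, whereas the two source directions individually call for factors of about 0.71 and 0.87.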

Fig. 19

a Cross-correlation functions with H01 as the virtual source. b The mean azimuth for 1-D linear array. c The frequency distribution of the mean azimuth with a broad frequency band from 1 to 30 Hz for complex noise sources that have similar frequency contents for two dominant directions. d Phase velocity–frequency diagrams without azimuthal correction and e phase velocity–frequency diagrams after azimuthal correction. The solid white lines are theoretical dispersion curves

5.2 Why Higher Modes Appear only in the Passive Survey

It is interesting to consider why higher modes appear only in the passive survey for the Fengqi Road experiment. The main difference between the passive and active surveys lies in the source attributes. The active source is a hammer struck at the earth's surface, whereas passive sources are various cultural activities excited at or below the surface. On the one hand, Halliday and Curtis (2008) showed that sources at depth play a role in the retrieval of higher-mode surface waves by ambient noise cross-correlation. In addition, simulations and field studies (Ikeda et al. 2015) demonstrate that higher modes of surface waves become dominant as the source depth increases. Thus, one possible reason is that the higher modes of passive surface waves are generated by subsurface sources in the surrounding environment. On the other hand, the dispersion image generated with active data shows no higher-mode surface-wave energy, possibly because the nearest source–receiver offset is only 5 m, which may be too short to extract the higher-order information (Zhang et al. 2004).

5.3 Prospects

The method proposed in this study can be used to capture the dominant azimuth of noise sources and obtain accurate dispersion measurements for directional noise sources using multicomponent seismic noise recorded by a 1-D linear array. In recent years, distributed acoustic sensing (DAS) has become an emerging technology for urban seismic site characterization and has attracted increasing attention from the seismology and near-surface geophysics communities (Lindsey et al. 2017; Ajo-Franklin et al. 2019; Fang et al. 2020; Fenta et al. 2021; Lindsey and Martin 2021). DAS-recorded ambient noise from linear fiber optic cables has been utilized for near-surface imaging in populated urban areas (Dou et al. 2017; Spica et al. 2020; Song et al. 2021). Although DAS does not provide multicomponent records, a sparse set of additional multicomponent seismometers near the fiber optic cables can help determine the Rayleigh-wave directionality with polarization analysis, as shown by this study. From this perspective, the proposed method could be applied to DAS-recorded ambient noise to retrieve accurate phase-velocity dispersion curves and obtain time-lapse images of near-surface S-wave velocity structures.

6 Conclusions

In urban areas populated with buildings, it is convenient and practical to employ a 1-D linear array for passive surface-wave investigations. We used synthetic data sets under various source distributions to estimate the dominant azimuth of noise sources by polarization analysis and obtain phase-velocity dispersion information. For inline noise source distributions, accurate phase-velocity measurements can be performed directly; for offline noise source distributions, we applied the proposed method to correct the velocity overestimation due to azimuthal effects. We also applied this approach to two field data sets in the city of Hangzhou, Zhejiang Province, China. The results show that the distributions of noise sources are predominantly offline. We then eliminated the velocity bias caused by the offline source distribution and picked phase velocities in the dispersion image. To evaluate the accuracy of the corrected phase velocities, active seismic measurements were conducted with a hammer and a linear array. After the azimuthal correction, the phase-velocity dispersion curves picked in the dispersion images obtained from passive source data match well with those from active source data, which demonstrates the validity and feasibility of the proposed method.