1 Introduction

The dynamic description of mechanical and civil engineering systems constitutes a fundamental target towards the characterization of structural modal properties. To this purpose, not only ambient or experimentally controlled vibrations, but also earthquake-induced excitations may be adopted (Ntotsios et al. 2009; Sevim et al. 2016). In Earthquake Engineering, specifically developed identification procedures may be adopted in order to detect structural modifications and to achieve reliable strong ground motion modal parameter estimates.

In the Earthquake Engineering field, most known identification procedures pertain directly to Experimental Modal Analysis (EMA), where both input and output signals need to be available for the estimation procedures to operate properly (Heylen et al. 2006; Steiger et al. 2016). During the last decades, Operational Modal Analysis (OMA) has been increasingly adopted in this field (Rainieri and Fabbrocino 2014; Bindi et al. 2015). With OMA procedures, only structural responses (input signals for the identification technique) need to be known, which makes them particularly suitable for treating ambient vibrations, but in principle also for processing earthquake excitations.

Generally, for OMA applications, the unknown excitation input acting on the structure shall be considered to be similar, in main characteristics, to that of a (stationary) white noise signal, as it may be recorded under typical ambient or operational excitations. In the field of OMA algorithms, the adoption of input channels coming from (short-duration, non-stationary) earthquake-induced structural response signals has been considered quite a few times in the dedicated literature, either in the Time or in the Frequency Domain. On that, a brief summary on recent literature contributions is reported below.

By adopting earthquake-induced structural response signals, most current OMA techniques refer to Time Domain methods. In Pridham and Wilson (2004), a Stochastic Subspace Identification (SSI) approach was combined with an Expectation Maximization method in order to identify shear-type frames under seismic base excitation. In Lin et al. (2005), an Ibrahim time domain method with modified random decrement was adopted, by relying only on a few floor acceleration seismic responses. In Kun et al. (2009), Taylor’s first-order approximations for the identification equations were adopted and solved with damped iterative Least-Squares (LS). In Ghahari et al. (2013), spatial time-frequency distributions were used for blind identification of strong motion modal parameters. In Gouache et al. (2013), OMA under harmonic transient input was attempted through a phase analysis. In Pioldi and Rizzi (2016a, b), a Full Dynamic Compound Inverse Method (FDCIM) was developed, to simultaneously identify output-only modal parameters, element-level structural features and earthquake input excitation, through a two-stage iterative identification algorithm.

Conversely, by adopting OMA techniques at seismic input in the Frequency Domain, only a few notable exceptions have been investigated in the field. In Ventura et al. (2005), a real street overpass was studied through ambient vibrations and earthquake ground motions, by a commercial software version of Frequency Domain Decomposition (FDD) (Brincker et al. 2001) (and of SSI, Peeters and De Roeck 1999). In Mahmoudabadi et al. (2007), a method based on iterative Least-Squares was proposed to identify classically damped linear systems. In Michel et al. (2010), the seismic response of a building to weak earthquakes was studied through a FDD algorithm.

By making reference to such output-only identification techniques, earthquake-induced structural responses are adopted here as input channels for a refined Frequency Domain Decomposition (rFDD) algorithm (Brincker et al. 2001) (Frequency Domain) and an improved Data-Driven Stochastic Subspace Identification (SSI-DATA) technique (Peeters and De Roeck 1999) (Time Domain). Both methods were implemented autonomously within MATLAB, for rFDD in Pioldi et al. (2015a, b), Pioldi and Rizzi (2017) and for SSI-DATA starting from Pansieri (2016) and as definitively presented here. The traditional versions of these algorithms rely on the typical assumption of white noise input, which no longer holds with seismic response input. On the contrary, the present methods have been specifically developed to deal with earthquake-induced structural responses and simultaneous heavy damping (in terms of identification challenge, i.e. with modal damping ratios well above a few percent, up to \(10\%\) or higher).

In the present investigation, the autonomously-implemented rFDD and SSI-DATA techniques are separately adopted first to identify the modal properties of a reference linear heavy-damped 3-dof shear-type frame under ten different selected strong ground motions. This work shall constitute a first basis for a comparison between the two OMA identification methods within the Earthquake Engineering range, in order to inspect their positive and negative aspects, concerning their reliability in correctly identifying strong ground motion modal parameters, within the linear range of seismic response, as recorded by synthetic response signals. From the performed analyses, the achieved estimates are compared with each other and with the known target values computed beforehand, in order to extract general and specific considerations on the efficiency and consistency of both OMA algorithms and to investigate and compare their effectiveness in identifying all current strong ground motion modal parameters. Second, a real structural case is considered, based on a single earthquake record and the associated real seismic response signals. The two identification techniques are further employed, again with very consistent results, which further corroborates and completes the previous necessary condition analysis with synthetic seismic response signals.

Presentation is structured as follows. In Sects. 2 and 3, necessary main theoretical backgrounds (Sects. 2.1, 3.1) and enhancements (Sects. 2.2, 3.2) of the developed rFDD and SSI-DATA algorithms are outlined, respectively. In Sect. 4 the selected earthquake dataset and the adopted numerical model for the first synthetic analyses are presented (Sect. 4.1), jointly with the results separately achieved by the two OMA identification methods for a reference three-storey frame (Sect. 4.2). Then, real recorded seismic response signals for an existing instrumented building are finally processed by the two identification techniques and presented in Sect. 5. In the end, salient conclusions on the whole investigation are gathered in closing Sect. 6.

2 Fundamentals of the present refined Frequency Domain Decomposition algorithm

2.1 Classical FDD theoretical background

Classical FDD theory is based on a general input/output expression as a function of frequency \(\omega \) for a MDoF system (Bendat and Piersol 1986; Brincker et al. 2001):

$$ {{\bf G}}_{yy}(\omega )=\overline{{{\bf H}}}(\omega ){{\bf G}}_{xx}(\omega ) {{\bf H}}^{{\mathrm {T}}}(\omega ) $$
(1)

where \({{\bf G}}_{xx}(\omega )\in {\mathcal {R}}^{r \times r}\) is the input Power Spectral Density (PSD) matrix (excitations), r is the number of input channels; \({{\bf G}}_{yy}(\omega )\in {\mathcal {R}}^{m \times m}\) is the output PSD matrix (responses), m is the number of output response signals; \({{\bf H}}(\omega )\in {\mathcal {R}}^{m \times r}\) is the Frequency Response Function (FRF) matrix; overbar denotes complex conjugate and apex symbol \({\mathrm {T}}\) transpose.

FRF \({{\bf H}}(\omega )\) may also be written in pole/residue form as (Heylen et al. 2006):

$$ {{\bf H}}(\omega )=\sum _{k=1}^n \dfrac{{{\bf R}}_{k}}{{\mathrm {i}}\, \omega -\lambda _{k}}+\dfrac{\overline{{{\bf R}}}_{k}}{{\mathrm {i}}\, \omega -\bar{\lambda }_{k}} $$
(2)

where n is the number of modes, \(\lambda _k\) and \(\bar{\lambda }_{k}\) are the kth poles (in complex conjugate pairs) of the FRF function (Heylen et al. 2006), and \({{\bf R}}_{k}=\varvec{\phi }_{k}{\varvec {\Gamma }}_k^\mathrm {T} \in {\mathcal {R}}^{m \times r}\) is the residue matrix (Brincker et al. 2001; Reynders 2012), obtained by the product between mode shape vector \(\varvec{\phi }_{k}\in {\mathcal {R}}^{m \times 1}\) and modal participation factor vector \({\varvec {\Gamma }}_k\in {\mathcal {R}}^{r \times 1}\).

When all output measurements are taken as input references (i.e. when \(m = r\)), \({{\bf H}}(\omega )\) becomes a square matrix. So, by defining the residue matrix of PSD output \({{\bf A}}_{k} \in {\mathcal {R}}^{m \times m}\), which corresponds to kth pole \(\lambda _k\), as (Brincker et al. 2001):

$$ {{\bf A}}_{k} = \sum _{s=1}^n \Biggl ( \dfrac{ {{\bf R}}_{s} }{ -\lambda _{k}-\lambda _{s}} + \dfrac{\overline{{{\bf R}}}_{s} }{ -\lambda _{k}-\bar{\lambda }_{s}}\Biggr ) {\mathrm {G}}_{xx}{{\bf R}}_{k}^{{\mathrm {T}}} $$
(3)

output PSD matrix \({{\bf G}}_{yy}(\omega )\) of Eq. (1), after substituting Eqs. (2) and (3), and by applying the Heaviside partial fraction expansion theorem, may be reduced to the following final pole/residue form (Brincker et al. 2001; Wang et al. 2005):

$$ {{\bf G}}_{yy}(\omega )\simeq \sum _{k=1}^n \dfrac{{{\bf A}}_{k}}{{\mathrm {i}}\,\omega -\lambda _{k}} + \dfrac{{{\bf A}}_{k}^{{\mathrm {H}}}}{-{\mathrm {i}}\,\omega -\bar{\lambda }_{k}} +\dfrac{\overline{{{\bf A}}}_{k}}{{\mathrm {i}}\,\omega -\bar{\lambda }_{k}} +\dfrac{{{\bf A}}_{k}^{{\mathrm {T}}}}{-{\mathrm {i}}\,\omega -\lambda _{k}} $$
(4)

where Hermitian apex symbol \({\mathrm {H}}\) denotes complex conjugate and transpose. Under the assumption of stationary white noise input (Bendat and Piersol 1986), PSD matrix \({{\bf G}}_{xx}(\omega )\) degenerates to a real-valued non-negative single scalar constant \({\mathrm {G}}_{xx}\).

When the structure is lightly damped (small modal damping ratios \(\zeta _k \ll 1\)), the residue matrix associated with the pole in the vicinity of kth modal frequency \(\omega _k\) can be expressed in approximate form as (Brincker et al. 2001):

$$ {{\bf A}}_{k} \simeq \dfrac{ \overline{{{\bf R}}}_{k} {\mathrm {G}} _{xx} {{\bf R}}_{k}^{{\mathrm {T}}}}{ 2\zeta _{k}\omega _{k}} = \dfrac{ \bar{\varvec{\phi }}_{k}\varvec{\varGamma }_k^{\mathrm {H}} {\mathrm {G}} _{xx} \varvec{\varGamma }_k^{\mathrm {T}} \varvec{\phi }_{k}}{ 2\zeta _{k}\omega _{k}} = {\mathrm {d}}_{k}\, \bar{\varvec{\phi }}_{k} \varvec{\phi }_{k}^{{\mathrm {T}}} $$
(5)

where index k spans modes \(k = 1,\ldots , n\), \(\zeta _{k}\) is the modal damping ratio and term \({\mathrm {d}}_{k}\) can be proven to be a real scalar. So, by substituting Eq. (5) into Eq. (4), in the narrow band with spectrum lines in the vicinity of a modal frequency, Eq. (4) can be simplified to:

$$ {{\bf G}}_{yy}(\omega ) \simeq \sum _{k=1}^n \dfrac{{\mathrm {d}}_{k} \, \bar{\varvec{\phi }}_{k} \varvec{\phi }_{k}^{{\mathrm {T}}}}{\mathrm {i}\,\omega -\lambda _{k}} + \dfrac{{\mathrm {d}}_{k} \, \bar{\varvec{\phi }}_{k} \varvec{\phi }_{k}^{{\mathrm {T}}}}{-{\mathrm {i}}\,\omega -\bar{\lambda }_{k}} = \bar{{\varvec {\Phi }}} \left\{ {\text {diag}} \left[ {\mathrm {Re}}\left( \dfrac{2 {\mathrm {d}}_{k}}{{\mathrm {i}}\,\omega -\lambda _{k}}\right) \right] \right\} {\varvec {\Phi }}^{\mathrm {T}} $$
(6)

where \({\varvec {\Phi }}\) is the eigenvector matrix, gathering all n eigenvectors \(\varvec{\phi }_{k}\) as columns.

Therefore, the first step of classical FDD methods is the estimation of the PSD matrix of system responses \({{\bf G}}_{yy}(\omega )\) in previous Eq. (6) by time correlation first, in the Time Domain, and then Fourier transform to the Frequency Domain. Then, its transpose shall be decomposed by performing a Singular Value Decomposition (SVD) at each discrete frequency line \(\omega =\omega _i\):

$$ {{\bf G}}_{yy}^{{\mathrm {T}}}(\omega =\omega _i) \simeq {\varvec {\Phi }}_i \left\{ {\text {diag}} \left[ {\mathrm {Re}} \left( \dfrac{2 {\mathrm {d}}_{j}}{{\mathrm {i}}\,\omega _i-\lambda _{j}}\right) \right] \right\} {\varvec {\Phi }}_i^{{\mathrm {H}}} = {\bf U}_i {{\bf S}}_i {\bf U}_i^{{\mathrm {H}}} $$
(7)

where \({\varvec {\Phi }}_i\) is the ith eigenvector matrix, gathering all m eigenvectors \(\varvec{\phi }_{ij}\) as columns, and index \(j=1,\ldots ,m\), with m being the number of output response signals, i.e. the input channels for the algorithm. In parallel, \({{\bf U}}_i\) is a unitary complex matrix holding singular vectors \({{\bf u}}_{ij}\) and \({{\bf S}}_i\) is a real diagonal matrix holding Singular Values (SV) \({\mathrm {s}}_{ij}\).
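The first FDD step of Eq. (7) may be illustrated by the following minimal Python/NumPy sketch. It is not the authors' MATLAB implementation: the averaged, Hann-windowed PSD estimator with 50% overlap, the helper names `psd_matrix` and `fdd_svd`, and the synthetic two-channel signal are all assumptions made only for illustration.

```python
import numpy as np

def psd_matrix(y, nfft):
    """Averaged cross-PSD matrix G_yy[f, a, b] ~ E[conj(Y_a) Y_b] (unscaled)."""
    m, N = y.shape
    win = np.hanning(nfft)
    starts = range(0, N - nfft + 1, nfft // 2)     # 50% overlap (assumption)
    G = np.zeros((nfft // 2 + 1, m, m), dtype=complex)
    for s in starts:
        Y = np.fft.rfft(y[:, s:s + nfft] * win, axis=1)   # shape (m, n_freq)
        G += np.einsum("af,bf->fab", np.conj(Y), Y)
    return G / len(starts)

def fdd_svd(G):
    """SVD of the transposed PSD matrix at every frequency line, cf. Eq. (7)."""
    U, S, _ = np.linalg.svd(np.transpose(G, (0, 2, 1)))
    return U, S

# Synthetic two-channel response (illustrative values): one component at
# 2 Hz with shape phi, plus weak white noise.
fs, f0 = 50.0, 2.0
t = np.arange(0.0, 200.0, 1.0 / fs)
phi = np.array([1.0, 0.5])
rng = np.random.default_rng(0)
y = np.outer(phi, np.sin(2 * np.pi * f0 * t))
y += 0.01 * rng.standard_normal(y.shape)

nfft = 500
freqs = np.fft.rfftfreq(nfft, 1.0 / fs)
U, S = fdd_svd(psd_matrix(y, nfft))
i_peak = int(np.argmax(S[:, 0]))    # peak-picking on the first SV curve
f_peak = freqs[i_peak]              # estimated (damped) modal frequency
u1 = U[i_peak, :, 0]                # first singular vector at the peak
```

In this sketch, `u1` plays the role of the mode shape estimate at the picked peak; it is determined only up to a complex scale factor, which is why mode shapes are compared through normalized indexes rather than entry by entry.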

Starting from the SVD in Eq. (7), the identification of mode q can be made around a modal peak in the frequency domain, which can be located by an appropriate peak-picking procedure on the SV representations. Then, the response PSD matrix of the qth mode, in correspondence of identified damped modal frequency \(\omega _q\), can be approximated as (Brincker et al. 2001):

$$ {{\bf G}}_{yy}^{{\mathrm {T}}}(\omega _{i}=\omega _{q}) \simeq {\varvec{\phi }}_{q} \left\{ {\text {diag}} \left[ {\mathrm {Re}}\left( \dfrac{2 {\mathrm {d}}_{q}}{{\mathrm {i}}\,\omega _q-\lambda _{q}}\right) \right] \right\} \varvec{\phi }_{q}^{{\mathrm {H}}} = {\mathrm {s}}_{1} {{\bf u}}_{q1} {{\bf u}}_{q1}^{{\mathrm {H}}} $$
(8)

where first singular vector \({{\bf u}}_{q1}\) at qth resonance frequency \(\omega _q\) leads to an estimate of related mode shape vector \(\hat{\varvec{\phi }}_q = {{\bf u}}_{q1}\). Associated Singular Value \(s_1\) is the Auto-PSD function of the corresponding SDoF system, which may be detected by comparing the identified mode shape \(\varvec{\phi }_q\) with the surrounding singular vectors around the peak. For this purpose, the Modal Assurance Criterion (MAC) index may be classically used (Brincker et al. 2001):

$$ {\mathrm {MAC}} \left( \varvec{\phi }_q,\varvec{\phi }_s \right) = \frac{\bigl | \varvec{\phi }_q^{{\mathrm {H}}} \varvec{\phi }_s^{~} \bigr |^2}{\bigl | \varvec{\phi }_q^{{\mathrm {H}}} \varvec{\phi }_q^{~} \bigr | \, \bigl | \varvec{\phi }_s^{{\mathrm {H}}} \varvec{\phi }_s^{~} \bigr |} $$
(9)

If MAC index is 1, the two compared vectors are considered to be identical; if 0, clearly distinct (orthogonal).
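As a small illustration of Eq. (9), the MAC index may be coded as below (Python/NumPy sketch; the function name `mac` and the test vectors are assumptions introduced here).

```python
import numpy as np

def mac(phi_q, phi_s):
    """Modal Assurance Criterion between two (possibly complex) mode shapes."""
    # np.vdot conjugates its first argument, i.e. it returns phi_q^H phi_s
    num = abs(np.vdot(phi_q, phi_s)) ** 2
    den = abs(np.vdot(phi_q, phi_q)) * abs(np.vdot(phi_s, phi_s))
    return num / den

phi_a = np.array([1.0, 2.0, 3.0])
mac_same = mac(phi_a, (1.0 + 2.0j) * phi_a)                   # complex rescaling
mac_orth = mac(np.array([1.0, 0.0]), np.array([0.0, 1.0]))    # orthogonal pair
```

Since the index is normalized, it is insensitive to the (complex) scaling of the compared vectors, which suits singular vectors returned by a SVD.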

Then, the modal damping ratio may be identified as follows. The Inverse Discrete Fourier Transform (IDFT) (Time Domain) of the located qth Auto-PSD function (Frequency Domain) yields an estimate of the SDoF Auto-Correlation Function (ACF) related to the located resonance peak. In this process, the remaining parts of the Auto-PSD function are simply reset to zero (Brincker et al. 2001). All ACF extrema (i.e. peaks and valleys), which represent the free amplitude decay of a damped SDoF system, may be detected by peak-picking within an appropriate time window. The logarithmic decrement, which is classically defined as \(\delta _q =\left( 2/j \right) {\mathrm {ln}} \left( r_0 / \left| r_j \right| \right) \), can be estimated by a linear regression on \(\delta \, j\) and \(2 {\mathrm {ln}} (\left| r_j \right| )\), where \(j=1,2,\ldots \) is an integer index counter of the jth ACF extremum and \(r_0\), \(r_j\) are the initial and the jth extreme value of the ACF, respectively. Then, modal damping ratio \(\zeta _q\) can be typically estimated as:

$$ \zeta _q = \frac{\delta _q}{\sqrt{4\pi ^2 + \delta _q^2}} $$
(10)

Finally, knowing the estimated modal damping ratio, the undamped natural frequency can be obtained from the estimated damped modal frequency by dividing it by factor \(\sqrt{1 - \zeta _q^2}\).
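The damping estimation just described may be sketched as follows (Python/NumPy). The sketch assumes that a clean SDoF ACF is already available (here generated analytically rather than by IDFT of a spectral bell); the function name `damping_from_acf`, the 3.1 s time window and the target values are illustrative assumptions.

```python
import numpy as np

def damping_from_acf(r, dt):
    """Estimate (zeta, f_d, f_n) from a decaying SDoF auto-correlation function."""
    a = np.abs(r)
    # Interior local maxima of |r| are the peaks and valleys of r
    idx = np.where((a[1:-1] > a[:-2]) & (a[1:-1] >= a[2:]))[0] + 1
    j = np.arange(1, idx.size + 1)            # extremum counter j = 1, 2, ...
    # 2 ln|r_j| = 2 ln(r_0) - delta * j, so the slope of the fit is -delta
    slope, _ = np.polyfit(j, 2.0 * np.log(a[idx]), 1)
    delta = -slope
    zeta = delta / np.sqrt(4.0 * np.pi ** 2 + delta ** 2)     # Eq. (10)
    half_period = np.mean(np.diff(idx)) * dt  # extrema are half a period apart
    f_d = 1.0 / (2.0 * half_period)           # damped frequency (Hz)
    f_n = f_d / np.sqrt(1.0 - zeta ** 2)      # undamped frequency (Hz)
    return zeta, f_d, f_n

# Analytical ACF of a SDoF system with assumed targets: 2 Hz, 5% damping
zeta_true, f_n_true = 0.05, 2.0
w_n = 2.0 * np.pi * f_n_true
w_d = w_n * np.sqrt(1.0 - zeta_true ** 2)
dt = 0.001
t = np.arange(0.0, 3.1, dt)
r = np.exp(-zeta_true * w_n * t) * np.cos(w_d * t)

zeta_est, f_d_est, f_n_est = damping_from_acf(r, dt)
```

Only the slope of the regression is used, so the estimate does not depend on how the initial value \(r_0\) is normalized.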

2.2 Main enhancements of the present rFDD algorithm

Main assumptions of classical FDD methods consist of (stationary) white noise input, light damping (modal damping ratios in the order of 1%) and geometrically-orthogonal mode shapes of close modes (Brincker et al. 2001).

The present rFDD method, whose original theoretical background has been reported in Pioldi et al. (2015a, b), conceptually derives from classical FDD methods (Brincker et al. 2001), but has been specifically developed to deal with earthquake-induced structural response signals and concurrent heavy damping (in terms of FDD identification challenge, i.e. for realistic modal damping ratios up to \(10\%\)).

Pioldi et al. (2015a, b) have discussed the theoretical validity and efficacy of the present rFDD technique, through the use of synthetic seismic response signals in the linear range, possibly affected by simulated noise of different levels. Trials with real earthquake responses and damage scenarios in the non-linear range have been effectively performed as well in Pioldi et al. (2017). In Pioldi and Rizzi (2017), further rFDD computational strategies have been introduced, by adopting excitation data from the complete FEMA P695 earthquake database, towards achieving an extensive validation in the Earthquake Engineering range. In Pioldi et al. (2016), the rFDD technique has been also applied to frames under Soil-Structure Interaction (SSI) effects, towards obtaining the identification of flexible- and fixed-base modal parameters from earthquake-induced structural response signals.

For the sake of completeness, the following computational steps (and references quoted therein) summarize the main workflow of the present rFDD algorithm:

  • Suitably-developed filtering applied to the structural response input signals (earthquake-induced structural responses) before starting the modal identification process (Pioldi et al. 2015a).

  • Coupling of rFDD to a time-frequency Gabor Wavelet Transform (GWT), towards achieving a correct evaluation of the time-frequency features of the signals and a best setup for rFDD identification (Pioldi and Rizzi 2017).

  • Processing of the auto- and cross-correlation matrix entries, by aiming at obtaining clearer and well-defined SVs out of seismic response signals (Pioldi et al. 2015b).

  • Integrated PSD matrix computation, implementing simultaneously both Wiener-Khinchin and Welch’s modified periodogram methods (Pioldi et al. 2015a, b). The Wiener-Khinchin approach works especially well with short signals, allowing for a clearer peak detection, not only on the first SV curve, but also on the subsequent ones. Welch’s method, instead, implements averaging and windowing before the frequency-domain convolution, which yields better mode shapes, despite a poorer separation of the signals in the modal space. Then, the integrated PSD matrix computation aims at extracting better modal estimates, by taking simultaneous advantage of both PSD evaluation methods.

  • Iterative loop and optimization algorithm towards achieving effective modal damping ratio estimates, especially under heavy-damping identification conditions (Pioldi et al. 2015a).

  • Coupled Chebyshev Type II bandpass filters computational procedure, aiming at enhancing the SDoF spectral bells towards estimation improvement, when challenging seismic input and heavy-damping conditions apply (Pioldi and Rizzi 2017).

  • Estimation of modal parameters by operating on different SVs and on their composition, to detect each SV contribution and to reconstruct the original SDoF spectral bells (Pioldi et al. 2015a).

  • Inner procedure for frequency resolution enhancement, without the need of higher frequency sampling, as first outlined in Pioldi et al. (2015a).

  • Combined use of different MAC indexes towards modal validation purposes (Pioldi et al. 2015a, 2017). After a preliminary “peak-picking” (Pioldi et al. 2015a), the use of Modal Assurance Criterion (MAC) and Modal Phase Collinearity (MPC) indexes (Pioldi et al. 2015b) becomes necessary to discern “spurious peaks” from true (physical) modal ones. In particular, these indexes have been used to discard spurious peaks that exhibit a complex-number character (i.e. displaying modal deflection phases that significantly deviate from 0 or \(\pi \)).
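Among the steps listed above, the two PSD estimators underlying the integrated PSD computation may be contrasted on a single channel by the generic sketch below (Python/NumPy). It only illustrates the textbook Wiener-Khinchin and Welch estimators; how the rFDD algorithm actually integrates them is not reproduced here, and the function names, window, overlap and signal are assumptions.

```python
import numpy as np

def psd_wiener_khinchin(x):
    """PSD as the Fourier transform of the biased auto-correlation estimate."""
    N = x.size
    acf = np.correlate(x, x, mode="full") / N   # lags -(N-1), ..., (N-1)
    acf = np.fft.ifftshift(acf)                 # move lag 0 to index 0
    return np.real(np.fft.rfft(acf))            # transform of length 2N-1

def psd_welch(x, nfft):
    """Averaged, Hann-windowed periodogram with 50% overlap (unscaled)."""
    win = np.hanning(nfft)
    starts = range(0, x.size - nfft + 1, nfft // 2)
    P = np.zeros(nfft // 2 + 1)
    for s in starts:
        P += np.abs(np.fft.rfft(x[s:s + nfft] * win)) ** 2
    return P / len(starts)

# Short noisy record with a 2 Hz component (illustrative values)
fs, f0, N = 50.0, 2.0, 2048
t = np.arange(N) / fs
rng = np.random.default_rng(0)
x = np.sin(2.0 * np.pi * f0 * t) + 0.1 * rng.standard_normal(N)

P_wk = psd_wiener_khinchin(x)
f_wk = np.fft.rfftfreq(2 * N - 1, 1.0 / fs)[np.argmax(P_wk)]

nfft = 256
P_w = psd_welch(x, nfft)
f_w = np.fft.rfftfreq(nfft, 1.0 / fs)[np.argmax(P_w)]
```

In this sketch, the Wiener-Khinchin estimate keeps the full record length and thus a fine frequency grid, while the Welch estimate trades frequency resolution for averaging, which is consistent with the complementary roles described in the list above.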

Consistently, the rFDD results reported later in Sect. 4 demonstrate the robustness of the developed rFDD algorithm in returning reliable modal parameter estimates at seismic response input and concurrent heavy damping. These are going to be compared to those independently achieved by a separate SSI-DATA implementation, as introduced in the next section.

3 Fundamentals of the present improved Data-Driven Stochastic Subspace Identification algorithm

3.1 Classical SSI theoretical background

A classical system of m second-order differential equations of motion of a dynamical linear structural system may be written as (spatial model):

$$ {{\bf M}}{\ddot{{{\bf u}}}}(t)+{{\bf C}}{\dot{{{\bf u}}}}(t) +{{\bf Ku}}(t)={\bar{{{\bf B}}}} {{\bf f'}}(t) $$
(11)

where \({{\bf M}}, {{\bf C}}\) and \({{\bf K}} \in {\mathcal {R}}^{m \times m}\) are the mass, damping and stiffness matrices, matrix \({\bar{{\bf B}}} \in {\mathcal {R}}^{m \times m}\) defines the location of the input channels, \({{\bf f'}}(t) \in {\mathcal {R}}^{m \times 1}\) is the input force vector and \({\ddot{{\bf u}}}(t)\), \({\dot{{\bf u}}}(t)\) and \({{\bf u}}(t) \in {\mathcal {R}}^{m \times 1}\) are the vectors of total acceleration, velocity and displacement structural responses.

By switching to State-Space form, the m second-order differential equations of motion in Eq. (11) can be rewritten into 2m first-order differential equations as (Van Overschee and De Moor 1996):

$$ \left\{ \begin{array}{l} {\dot{{\bf x}}}(t) = {{\bf A}}_c{{{\bf x}}}(t)+{{\bf B}}_c{{\bf f}}'(t) \\ {{\bf y}}(t) = {{\bf C}}_c{{\bf x}}(t)+{{\bf D}}_c{{\bf f}}'(t) \end{array}\right. $$
(12)

being the first equation the state equation, in terms of state vector of responses \({{\bf x}}(t) \in {\mathcal {R}}^{2m \times 1}\), \({{\bf x}}(t) = \{ {{\bf u}}(t) \; {\dot{{\bf u}}}(t) \}^{{\mathrm {T}}}\) and its derivative \({\dot{{\bf x}}}(t) = \{ {\dot{{\bf u}}}(t) \; {\ddot{{\bf u}}}(t) \}^{{\mathrm {T}}}\), and the second equation the observer equation, in terms of observer vector of responses \({{\bf y}}(t) \in {\mathcal {R}}^{m \times 1}\) (either displacements, velocities and/or, typically, accelerations). From Eq. (12) state matrix \( {{\bf A}}_c \in {\mathcal {R}}^{2m\times 2m} \), input matrix \( {{\bf B}}_c \in {\mathcal {R}}^{2m \times m} \), output matrix \( {{\bf C}}_{c} \in {\mathcal {R}}^{m \times 2m} \) and feed-through matrix \( {{\bf D}}_{c} \in {\mathcal {R}}^{m \times m} \) are defined as follows, where subscript c denotes continuous time (Van Overschee and De Moor 1996):

$$ \begin{aligned} {{\bf A}}_c&= \left[ \begin{array}{cc} {{\bf 0}}_{m \times m} &{} \;\;{{\bf I}}_{m \times m} \\ -{{\bf M}}^{-1}{{\bf K}} &{} \;\;-{{\bf M}}^{-1}{{\bf C}} \end{array}\right] ,\;\;\;&{{\bf B}}_c = \left[ \begin{array}{c} {{\bf 0}}_{m \times m} \\ {{\bf M}}^{-1}{\bar{{{\bf B}}}} \end{array}\right] \\ {{\bf C}}_c&= [ -{{\bf M}}^{-1}{{\bf K}} \quad \;\; -{{\bf M}}^{-1}{{\bf C}}] ,\;\;\;&{{\bf D}}_c = {{\bf M}}^{-1} {{\bar{{\bf B}}}} \end{aligned}$$
(13)

Here, the structures of matrices \({{\bf C}}_c\) and \({{\bf D}}_c\) refer to the adoption of total acceleration responses for the observer vector, namely \({{\bf y}}(t)={\ddot{{\bf u}}}(t)\).
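The assembly of the continuous-time matrices of Eq. (13) may be sketched as follows (Python/NumPy). The three-storey shear-type frame data (unit masses, storey stiffness, stiffness-proportional damping) are hypothetical values chosen only to make the example runnable; they are not those of the reference frame analysed later.

```python
import numpy as np

def state_space_matrices(M, C, K, B_bar):
    """Continuous-time state-space matrices for total acceleration outputs."""
    m = M.shape[0]
    Minv_K = np.linalg.solve(M, K)
    Minv_C = np.linalg.solve(M, C)
    Minv_B = np.linalg.solve(M, B_bar)
    A_c = np.block([[np.zeros((m, m)), np.eye(m)],
                    [-Minv_K, -Minv_C]])
    B_c = np.vstack([np.zeros_like(Minv_B), Minv_B])
    C_c = np.hstack([-Minv_K, -Minv_C])
    D_c = Minv_B
    return A_c, B_c, C_c, D_c

# Hypothetical 3-dof shear-type frame
k = 1000.0
M = np.eye(3)
K = k * np.array([[2.0, -1.0, 0.0],
                  [-1.0, 2.0, -1.0],
                  [0.0, -1.0, 1.0]])
C = 0.002 * K                     # stiffness-proportional damping (assumption)
B_bar = np.eye(3)

A_c, B_c, C_c, D_c = state_space_matrices(M, C, K, B_bar)

# For classical damping, the pole moduli equal the undamped frequencies
lam = np.linalg.eigvals(A_c)
w_from_poles = np.sort(np.abs(lam))[::2]      # one value per conjugate pair
w_undamped = np.sqrt(np.sort(np.real(np.linalg.eigvals(np.linalg.solve(M, K)))))
```

The final check relies on the classical damping assumed in the example: in that case each pole pair satisfies \(|\lambda _k| = \omega _k\), which offers a quick consistency test of the assembled state matrix.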

In the OMA context, structures are excited by unmeasurable and spatially distributed input excitations; this means that the information from excitation \({{\bf f}}'(t)\) is not available. Also, experimental tests yield measurements taken at discrete time instants, while Eqs. (11)–(13) are actually expressed in continuous time. For a given sampling time interval \( \Delta t \), continuous-time equations can be discretized and solved at discrete time instants \( t_{k} = k \Delta t, k = 1, \ldots , N \), with N being the total number of sampling points of the signal.
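The link between continuous-time and discrete-time state matrices may be illustrated by the short sketch below (Python/NumPy). It assumes the standard relation \({{\bf A}} = \exp ({{\bf A}}_c \Delta t)\), which is not spelled out in the text above, evaluates it through the eigendecomposition of \({{\bf A}}_c\), and then maps the discrete-time eigenvalues back to continuous time; the SDoF data are illustrative.

```python
import numpy as np

def discretize(A_c, dt):
    """A = exp(A_c * dt), computed via the eigendecomposition of A_c."""
    lam, V = np.linalg.eig(A_c)
    return np.real(V @ np.diag(np.exp(lam * dt)) @ np.linalg.inv(V))

# Illustrative SDoF oscillator: 2 Hz, 5% damping, sampled at 50 Hz
f_n, zeta, dt = 2.0, 0.05, 0.02
w_n = 2.0 * np.pi * f_n
A_c = np.array([[0.0, 1.0],
                [-w_n ** 2, -2.0 * zeta * w_n]])

A = discretize(A_c, dt)
mu = np.linalg.eigvals(A)                 # discrete-time eigenvalues
lam_back = np.log(mu) / dt                # back to continuous time
f_n_back = np.abs(lam_back) / (2.0 * np.pi)
zeta_back = -np.real(lam_back) / np.abs(lam_back)
```

The principal branch of the complex logarithm recovers the original poles only if the damped frequency lies below the Nyquist frequency, i.e. if \(\omega _d \Delta t < \pi \), which holds for the values assumed here.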

By taking into account the kth time instant and assuming unknown/unmeasured input (treated as white noise), classical SSI-DATA theory takes as a typical starting point of the identification process the following OMA Stochastic State-Space model in discrete-time notation (Van Overschee and De Moor 1996):

$$ \left\{ \begin{array}{l} {{\bf x}}_{k+1} ={{\bf A}}{\bf x}_{k}+{{\bf w}}_{k} \\ \,\,\,\,\,{{\bf y}}_{k}={{\bf C}}{{\bf x}}_{k}+{{\bf v}}_{k} \end{array}\right. $$
(14)

being \({{\bf x}}_{k} = \{ {{\bf u}}_k \; {{\bf u}}_{k+1} \}^{{\mathrm {T}}}\) the state vector of responses, \({{\bf x}}_{k+1} = \{ {{\bf u}}_{k+1} \; {{\bf u}}_{k+2} \}^{{\mathrm {T}}}\) its one-step-ahead counterpart (taking the role of the state derivative in Eq. (12)), and \({{\bf y}}_{k}\) the observer vector of responses (either displacements, velocities and/or, typically, accelerations, as considered here). Notation \({{\bf u}}_{k}, {{\bf u}}_{k+1}\) and \({{\bf u}}_{k+2}\) refers to the discrete-time counterparts of continuous-time vectors of displacement \({{\bf u}}(t)\), velocity \({\dot{{\bf u}}}(t)\) and acceleration \({\ddot{{\bf u}}}(t)\) responses, respectively. Also, \( {{\bf A}}\) and \( {{\bf C}}\) are the discrete-time counterparts of \( {{\bf A}}_c \) and \( {{\bf C}}_c \) matrices, while vectors \({{\bf w}}_{k} \in {\mathcal {R}}^{2m \times 1}\) and \({{\bf v}}_{k} \in {\mathcal {R}}^{m \times 1}\) are zero mean, stationary white noise stochastic processes, representing process noise and measurement noise, respectively (Van Overschee and De Moor 1996), with dimensions matching those of state vector \({{\bf x}}_{k}\) and observer vector \({{\bf y}}_{k}\). These \({{\bf w}}_{k}\) and \({{\bf v}}_{k}\) stochastic processes become necessary and shall be included, in order to describe real measurement data, which are also driven by uncertainty and noise.

Then, the salient mathematical and computational steps described in the following summarize the main workflow of the present SSI-DATA algorithm.

The first step of classical SSI-DATA identification algorithms is the computation of the so-called block Hankel matrix \(\varvec{H}_{0|2i-1}\in {\mathcal {R}}^{2mi \times j}\) of responses, which is directly calculated from the measurement data (Van Overschee and De Moor 1996), i.e. system structural responses \({{\bf y}}_k\) (in the present case, total accelerations), as:

$$ \varvec{H}_{0|2i-1}=\frac{1}{\sqrt{j}}\left[ \begin{array}{ccc} {{\bf y}}_{0} &{} \cdots &{} {{\bf y}}_{j-1}\\ \vdots &{} \ddots &{} \vdots \\ {{\bf y}}_{i-1} &{} \cdots &{} {{\bf y}}_{i+j-2}\\ {{\bf y}}_{i} &{} \cdots &{} {{\bf y}}_{i+j-1}\\ \vdots &{} \ddots &{} \vdots \\ {{\bf y}}_{2i-1} &{} \cdots &{} {{\bf y}}_{2i+j-2} \end{array}\right] =\left[ \begin{array}{c} \varvec{Y}_{0|i-1}\\ \varvec{Y}_{i|2i-1} \end{array}\right] =\left[ \begin{array}{c} \varvec{Y}_{p}\\ \varvec{Y}_{f} \end{array}\right] $$
(15)

where the two partition sub-matrices refer to past \(\varvec{Y}_{p} = \varvec{H}_{0|i-1}\) and future \(\varvec{Y}_{f} = \varvec{H}_{i|2i-1}\) output channel matrices, where subscripts on the lefthand and righthand sides of delimiter | denote the first and the last element of the first column, respectively, of block Hankel matrix \(\varvec{H}_{0|2i-1}\). So, matrices \(\varvec{Y}_{p}\) and \(\varvec{Y}_{f}\) are defined by splitting matrix \(\varvec{H}_{0|2i-1}\) in two equal parts of i block rows, where number of block rows i shall be determined in agreement with condition \(m \cdot i \ge n\) (Peeters 2000), being m the number of output (acquisition) channels and n the so-called system order (i.e. the dimension of square matrix \({{\bf A}}\) in the identification process, i.e. the rank of diagonal matrix \(\varvec{\varSigma }_{1}\) defined below). Number of columns j of block Hankel matrix \(\varvec{H}_{0|2i-1}\) is usually taken as \(j=N-2i+1\), which implies that all recorded data samples are used (Van Overschee and De Moor 1996). Thus, all following quantities showing subscript i refer to the assumed number of block rows. By observing Eq. (15), it is clear that the block Hankel matrix consists of the repetition of the same element in each anti-diagonal term.
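A minimal sketch of the block Hankel matrix of Eq. (15) and of its past/future partition is given below (Python/NumPy). The function name `block_hankel` and the small integer-valued test record are assumptions, chosen so that the repeated anti-diagonal structure can be checked directly.

```python
import numpy as np

def block_hankel(y, i):
    """Block Hankel matrix of responses y (shape m x N) with 2i block rows."""
    m, N = y.shape
    j = N - 2 * i + 1                        # all recorded samples are used
    rows = [y[:, k:k + j] for k in range(2 * i)]   # block row k: y_k ... y_{k+j-1}
    return np.vstack(rows) / np.sqrt(j)

# Small illustrative record: m = 2 channels, N = 12 samples, i = 2 block rows
m, N, i = 2, 12, 2
y = np.arange(m * N, dtype=float).reshape(m, N)

H = block_hankel(y, i)
j = H.shape[1]
Y_p, Y_f = H[:m * i], H[m * i:]              # past and future output matrices
```

With these values, H has \(2mi = 8\) rows and \(j = 9\) columns, and each block row equals the previous one shifted by one column, which is the anti-diagonal repetition noted above.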

The second computational step is based on the calculation of projection matrix \(\varvec{P}_{i}\in {\mathcal {R}}^{mi \times mi}\), i.e. the orthogonal projection of the row space of future output channels \(\varvec{Y}_{f}\) into the row space of past output channels \(\varvec{Y}_{p}\), which can be expressed as (Van Overschee and De Moor 1996):

$$ \varvec{P}_{i}=\varvec{Y}_{f} /\varvec{Y}_{p}=\varvec{Y}_{f}\varvec{Y}_{p}^{{\mathrm {T}}} (\varvec{Y}_{p}\varvec{Y}_{p}^{{\mathrm {T}}})^{\dagger }\varvec{Y}_{p} = \varvec{O}_{i}\hat{\varvec{S}}_{i} $$
(16)

where symbol \( \dagger \) indicates Moore-Penrose pseudo-inverse, whilst the factorization of projection matrix \(\varvec{P}_{i}\) into the product of observability matrix \( \varvec{O}_{i} \in {\mathcal {R}}^{mi \times n}\) and Kalman filter state sequence \( \hat{\varvec{S}}_{i} \in {\mathcal {R}}^{n \times mi}\) defines a main theorem of SSI-DATA (Van Overschee and De Moor 1996; Rainieri and Fabbrocino 2014), where n is the selected system order, as detailed in the following.
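The orthogonal projection of Eq. (16) may be sketched as below (Python/NumPy). The sketch forms the product of Eq. (16) literally, which returns a matrix with as many columns as \(\varvec{Y}_{p}\); practical implementations commonly rely on a LQ/QR factorization of the block Hankel matrix instead, which is not shown. The function name `project` and the random test data are assumptions.

```python
import numpy as np

def project(Y_f, Y_p):
    """Orthogonal projection of the row space of Y_f onto the row space of Y_p."""
    return Y_f @ Y_p.T @ np.linalg.pinv(Y_p @ Y_p.T) @ Y_p

# Random two-channel record, arranged in past/future Hankel partitions
rng = np.random.default_rng(0)
m, N, i = 2, 200, 3
y = rng.standard_normal((m, N))
j = N - 2 * i + 1
H = np.vstack([y[:, k:k + j] for k in range(2 * i)]) / np.sqrt(j)
Y_p, Y_f = H[:m * i], H[m * i:]

P_i = project(Y_f, Y_p)
residual = Y_f - P_i          # part of Y_f that is orthogonal to the rows of Y_p
```

Two properties characterize the projection and are easy to verify numerically: the residual is orthogonal to every row of \(\varvec{Y}_{p}\), and projecting \(\varvec{P}_{i}\) once more leaves it unchanged.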

Then, through the application of specific weighting matrices \(\varvec{W}_{1}\) and \(\varvec{W}_{2}\) to projection matrix \(\varvec{P}_{i}\), a SVD may be derived, by holding non-zero singular values only (Van Overschee and De Moor 1996; Rainieri and Fabbrocino 2014), as:

$$ \varvec{W}_{1}\varvec{P}_{i}\varvec{W}_{2}=[{{\bf U}}_{1} {{\bf U}}_{2}]\left[ \begin{array}{cc} {\varvec {\Sigma }}_{1} &{} {{\bf 0}}\\ {\bf 0} &{} {{\bf 0}} \end{array}\right] \left[ \begin{array}{c} {{\bf V}}_{1}^{{\mathrm {T}}}\\ {{\bf V}}_{2}^{{\mathrm {T}}} \end{array}\right] ={{\bf U}}_{1}{\varvec {\Sigma }}_{1}{\bf V}_{1}^{\mathrm {T}} $$
(17)

where \({\bf U}_{k}\) and \({\bf V}_{k} \in \mathcal {R}^{mi \times n}\), \(k=1,2\,\) are the singular vector matrices, and \({\bf \Sigma }_{1} \in \mathcal {R}^{n \times n}\) is the diagonal matrix holding the non-zero singular values (which allows to estimate the rank of matrix \( \varvec{P}_{i} \)). The selection of the dimension (n) of square diagonal matrix \({\bf \Sigma }_{1}\) fixes system order n of the State-Space model, which is adopted for the subsequent computational steps.

As concerning weighting matrices \(\varvec{W}_{1}\in {\mathcal {R}}^{mi \times mi}\) and \(\varvec{W}_{2}\in {\mathcal {R}}^{mi \times mi}\), they may be defined according to different weighting proposals (Van Overschee and De Moor 1996), namely Principal Component (PC), Unweighted Principal Component (UPC) and Canonical Variate Analysis (CVA). Accordingly, the corresponding weighting matrices can be defined as follows:

  • for the Principal Component (PC) weighting:

    $$ \varvec{W}_{1}={{\bf I}}, \;\;\; \varvec{W}_{2}=\varvec{Y}_{p}^{{\mathrm {T}}} \left( \frac{1}{j}\varvec{Y}_{p}\varvec{Y}_{p}^{{\mathrm {T}}} \right) ^{-\tfrac{1}{2}}\varvec{Y}_{p}; $$
    (18)
  • for the Unweighted Principal Component (UPC) weighting:

    $$ \varvec{W}_{1}={{\bf I}}, \;\;\; \varvec{W}_{2}={{\bf I}}; $$
    (19)
  • for the Canonical Variate Analysis (CVA) weighting (mainly used in the following):

    $$ \varvec{W}_{1}=\left( \frac{1}{j}\varvec{Y}_{f}\varvec{Y}_{f}^{{\mathrm {T}}} \right) ^{-\tfrac{1}{2}}, \;\;\; \varvec{W}_{2}={{\bf I}}; $$
    (20)

where j is again the number of columns of block Hankel matrix \(\varvec{H}_{0|2i-1}\) (taken as \(j=N-2i+1\) in the present case, i.e. all recorded data samples are used).
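The weighting options of Eqs. (18)–(20) and the truncated SVD of Eq. (17) may be sketched as follows (Python/NumPy). Identity weights are never formed explicitly; the helper names `inv_sqrtm`, `weighted_projection` and `truncated_svd`, as well as the random data and the chosen order, are assumptions made for illustration.

```python
import numpy as np

def inv_sqrtm(S):
    """Inverse square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def weighted_projection(P, Y_p, Y_f, kind="CVA"):
    """Return W1 @ P @ W2 for the PC, UPC or CVA choice of Eqs. (18)-(20)."""
    j = P.shape[1]
    if kind == "UPC":
        return P
    if kind == "PC":
        return P @ Y_p.T @ inv_sqrtm(Y_p @ Y_p.T / j) @ Y_p
    if kind == "CVA":
        return inv_sqrtm(Y_f @ Y_f.T / j) @ P
    raise ValueError("kind must be 'PC', 'UPC' or 'CVA'")

def truncated_svd(WPW, n):
    """Keep the n largest singular triplets, cf. Eq. (17)."""
    U, s, Vt = np.linalg.svd(WPW, full_matrices=False)
    return U[:, :n], s[:n], Vt[:n], s

# Random two-channel record, arranged as in Eq. (15)
rng = np.random.default_rng(0)
m, N, i = 2, 400, 3
y = rng.standard_normal((m, N))
j = N - 2 * i + 1
H = np.vstack([y[:, k:k + j] for k in range(2 * i)]) / np.sqrt(j)
Y_p, Y_f = H[:m * i], H[m * i:]
P_i = Y_f @ Y_p.T @ np.linalg.pinv(Y_p @ Y_p.T) @ Y_p

W1 = inv_sqrtm(Y_f @ Y_f.T / j)                   # CVA weight of Eq. (20)
WPW = weighted_projection(P_i, Y_p, Y_f, "CVA")
U1, s1, V1t, s_all = truncated_svd(WPW, n=m * i)  # full order kept here
```

In practice, the order n would be selected from the decay of the singular values in `s_all` (or from a stabilization diagram) rather than fixed to the full size as done in this sketch.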

Then, starting from Eq. (17), observability matrix \(\varvec{O}_{i}\) and Kalman filter state sequence \(\hat{\varvec{S}}_{i}\) may be computed as:

$$ \varvec{O}_{i}={{\bf U}}_{1}{\varvec {\Sigma }}_{1}^{1/2}{{\bf T}}, \;\;\;\hat{\varvec{S}}_{i}=\varvec{O}_{i}^{\dagger }\varvec{P}_{i} $$
(21)

where \({{\bf T}} \in {\mathcal {R}}^{n \times n}\) is a further possible weighting matrix, generally taken as an identity matrix, \({{\bf T}} = {\bf I} \) (as done here). By taking into account Eqs. (16) and (21), Kalman filter state sequence \(\hat{\varvec{S}}_{i+1}\) and output sequence \(\varvec{Y}_{i|i}\) may be calculated as shown in Van Overschee and De Moor (1996). Especially, \(\varvec{Y}_{i|i}\) comes directly from a different partition of block Hankel matrix \(\varvec{H}_{0|2i-1}\):

$$ \varvec{H}_{0|2i-1}= \left[ \begin{array}{c} \varvec{Y}_{0|i-1}\\ \varvec{Y}_{i|i}\\ \varvec{Y}_{i+1|2i-1} \end{array}\right] =\left[ \begin{array}{c} \varvec{Y}_{0|i}\\ \varvec{Y}_{i+1|2i-1} \end{array}\right] = \left[ \begin{array}{c} \varvec{Y}_{p}^{+}\\ \varvec{Y}_{f}^{-} \end{array}\right] $$
(22)

where superscripts \(+\) and \(-\) stand for the addition and the subtraction of one block row to and from the original \(\varvec{Y}_{p}\) and \(\varvec{Y}_{f}\) matrices, respectively. Then, from projection matrix \(\varvec{P}_{i-1}\) it is possible to obtain:

$$ \varvec{P}_{i-1}=\varvec{Y}_{f}^{-} /\varvec{Y}_{p}^{+}=\varvec{Y}_{f}^{-}{\varvec{Y}_{p}^{+}} ^{{\mathrm {T}}}(\varvec{Y}_{p}^{+} {\varvec{Y}_{p}^{+}}^{{\mathrm {T}}})^{\dagger }\varvec{Y}_{p}^{+} = \varvec{O}_{i-1}\hat{\varvec{S}}_{i+1} $$
(23)

and Kalman filter state sequence \(\hat{\varvec{S}}_{i+1}\) can be determined as:

$$ \hat{\varvec{S}}_{i+1}=\varvec{O}_{i-1}^{\dagger }\varvec{P}_{i-1} $$
(24)

At this stage, discrete-time State-Space matrices \({{\bf A}}\) and \({{\bf C}}\) may be computed through an asymptotically-unbiased least squares estimate as (Van Overschee and De Moor 1996):

$$ \left[ \begin{array}{c} {{\bf A}}\\ {{\bf C}} \end{array}\right] =\left[ \begin{array}{c} \hat{\varvec{S}}_{i+1}\\ \varvec{Y}_{i|i} \end{array}\right] \hat{\varvec{S}}_{i}^{\dagger }. $$
(25)
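The sequence of steps in Eqs. (21)-(25) may be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the original implementation: UPC weighting is adopted (\(\varvec{W}_{1}=\varvec{W}_{2}={\bf I}\), \({{\bf T}}={\bf I}\)), so that Eq. (21) applies directly to the SVD of \(\varvec{P}_{i}\); the row-space projection is computed through an orthonormal basis obtained by SVD rather than through the explicit pseudo-inverse of Eq. (23); all function names and the `rtol` threshold are hypothetical.

```python
import numpy as np

def block_hankel(y, i):
    """Stack 2i block rows of the m x N output record y into H_{0|2i-1}."""
    m, N = y.shape
    j = N - 2 * i + 1                      # all recorded samples are used
    return np.vstack([y[:, k:k + j] for k in range(2 * i)])

def row_space_projection(Yf, Yp, rtol=1e-10):
    """Orthogonal projection Yf / Yp of the row space of Yf onto that of Yp."""
    _, s, Vt = np.linalg.svd(Yp, full_matrices=False)
    V = Vt[s > rtol * s[0]]                # orthonormal basis of row space of Yp
    return Yf @ V.T @ V

def ssi_data_upc(y, i, n):
    """Estimate discrete-time matrices (A, C) of order n, UPC weighting."""
    m = y.shape[0]
    H = block_hankel(y, i)
    Yp, Yf = H[:m * i], H[m * i:]                         # past / future
    Yp_plus, Yf_minus = H[:m * (i + 1)], H[m * (i + 1):]  # shifted, Eq. (22)
    Yii = H[m * i:m * (i + 1)]                            # block row Y_{i|i}
    Pi = row_space_projection(Yf, Yp)
    Pim1 = row_space_projection(Yf_minus, Yp_plus)        # Eq. (23)
    U, s, _ = np.linalg.svd(Pi, full_matrices=False)
    Oi = U[:, :n] * np.sqrt(s[:n])                        # Eq. (21), T = I
    Si = np.linalg.pinv(Oi) @ Pi                          # Eq. (21)
    Sip1 = np.linalg.pinv(Oi[:-m]) @ Pim1                 # Eq. (24), O_{i-1}
    AC = np.vstack([Sip1, Yii]) @ np.linalg.pinv(Si)      # Eq. (25)
    return AC[:n], AC[n:]
```

Here \(\varvec{O}_{i-1}\) is obtained by dropping the last block row (last \(m\) rows) of \(\varvec{O}_{i}\). As a sanity check, when `y` is the noise-free free-decay response of a known linear system of order \(n\), the eigenvalues of the returned state matrix should coincide with those of the generating system, up to round-off.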

Then, the eigenvalue decomposition of discrete-time state matrix \({{\bf A}}= \varvec{\varPsi }\varvec{\mathcal {M}}\varvec{\varPsi }^{-1}\) allows for the estimation of the modal parameters, through matrices \(\varvec{\varPsi }\) and \(\varvec{\mathcal {M}}\), holding discrete-time eigenvectors \(\varvec{\psi }_{r}\) and eigenvalues \(\mu _{r}\), respectively. Afterwards, discrete-time eigenvalues \( \mu _{r} \) are converted to continuous-time eigenvalues \( \lambda _{r} \), as \( \lambda _{r}={\mathrm {ln}(\mu _{r})}/{\Delta t}\) (Rainieri and Fabbrocino 2014), so that the so-called system poles are obtained. Finally, the rth mode shape, natural frequency, damped modal frequency and modal damping ratio estimates may be computed as (Rainieri and Fabbrocino 2014):

$$ \varvec{\phi }_{r}={{\bf C}}\varvec{\psi }_{r}; \;\;\; f_{r}=\frac{\left| \lambda _{r} \right| }{2\pi }, \;\;\; f_{r,d}=\frac{{\mathrm {Im}}(\lambda _{r})}{2\pi }, \;\;\; \zeta _{r}=-\frac{{\mathrm {Re}}(\lambda _{r})}{\left| \lambda _{r} \right| }. $$
(26)
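The conversion from the identified state-space matrices to the modal parameters of Eq. (26) may be sketched as follows. This is a minimal NumPy sketch: function name `modal_parameters` is hypothetical, and the choice of keeping only the poles with positive imaginary part (one pole per complex-conjugate pair, real poles discarded) is an assumption made here for compactness.

```python
import numpy as np

def modal_parameters(A, C, dt):
    """Modal parameters from discrete-time matrices A, C (sampling step dt).

    Returns natural frequencies f [Hz], damped frequencies f_d [Hz],
    damping ratios zeta and mode shapes Phi (columns), sorted by frequency.
    """
    mu, Psi = np.linalg.eig(A)                  # discrete-time eigenpairs
    lam = np.log(mu.astype(complex)) / dt       # continuous-time poles
    keep = lam.imag > 0                         # one pole per conjugate pair
    lam, Psi = lam[keep], Psi[:, keep]
    order = np.argsort(np.abs(lam))
    lam, Psi = lam[order], Psi[:, order]
    f = np.abs(lam) / (2 * np.pi)               # natural frequency
    f_d = lam.imag / (2 * np.pi)                # damped modal frequency
    zeta = -lam.real / np.abs(lam)              # modal damping ratio
    Phi = C @ Psi                               # observed mode shapes
    return f, f_d, zeta, Phi
```

Poles returning a negative `zeta` (positive real part of \(\lambda _{r}\)) correspond to the mathematical poles mentioned in Sect. 3.2 and would normally be discarded before building the stabilization diagram.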

3.2 Main enhancements of the present improved SSI-DATA algorithm

Main assumptions of classical SSI methods consist of (stationary) white noise input and adequately long structural response signals, to achieve a suitable stabilization of the estimated poles. Light damping (modal damping ratios in the order of \(1 \div 2\%\)) leads to better estimates, too, since it may reduce the occurrence of noise poles or of mathematical poles (e.g. false stable poles characterized by a positive real part and a negative damping ratio) (Van Overschee and De Moor 1996; Rainieri and Fabbrocino 2014).

The present SSI-DATA algorithm, whose theoretical background comes from the general formulation in Van Overschee and De Moor (1996), as outlined in Sect. 3.1 above, may be regarded as a first attempt to implement SSI algorithms capable of dealing with earthquake-induced structural response signals under concurrent heavy structural damping, a condition that makes modal identification particularly challenging.

Therefore, the following fundamental items summarize the main steps and issues related to the present SSI-DATA implementation:

  • A first issue is to appropriately define weighting matrices \(\varvec{W}_{1}\) and \(\varvec{W}_{2}\). Extensive preliminary simulations performed in Pansieri (2016), under white noise input or seismic excitation, showed that the Canonical Variate Analysis (CVA) weighting (Van Overschee and De Moor 1996), with weighting matrices \(\varvec{W}_{1}\) and \(\varvec{W}_{2}\) as given in Eq. (20), turns out to be the most stable and best-performing weighting option towards achieving reliable estimates at seismic response input and concurrent heavy damping (as demonstrated later in Sects. 4 and 5). This type of weighting, as opposed to the widely-used Principal Component (PC) [Eq. (18)] and Unweighted Principal Component (UPC) [Eq. (19)] weightings, returns fewer noise or mathematical poles and appears best able to separate true physical modes from possible spurious earthquake harmonics.

  • For the correct selection of system order n and for the determination of the stable poles (i.e. the poles whose frequency, mode shape and modal damping ratio estimates prove to be stable and do not derive from noise or mathematical poles), a stabilization diagram may be constructed from the SSI-DATA identification outcomes (Cara et al. 2013). It displays the poles that are obtained according to different considered system orders, as a function of the estimated frequency lines. The Singular Value (SV) curves extracted from the SVD of SSI-DATA output spectral matrix \({{\bf G}}_{yy}(\omega )\) may be reported too, within the same stabilization diagram. This matrix may typically be calculated from the estimated SSI model as outlined in Peeters (2000), by adopting the estimated next state-output covariance matrix and the output covariance matrix (Rainieri and Fabbrocino 2014). As an original alternative in the present work, a novel integrated arrangement considers instead the use of SV curves that are computed from the SVD of direct output spectral matrix \({{\bf G}}_{yy}(\omega )\). This is calculated through a routine of the previous rFDD algorithm, by adopting Welch’s modified periodogram (Pioldi et al. 2017). Such proposed integration of SSI and FDD information proves to be a reliable tool to support the identification of the stable SSI poles within the stabilization diagram, especially when dealing with earthquake-induced structural response signals and at concurrent heavy damping, as handled here.

  • In any case, the most severe issue in the present SSI identification remains the fact that seismic response signals are characterized by rather short durations (specifically with respect to ambient vibration recordings). This directly affects the achievable estimates, since the poles are intrinsically harder to stabilize. So, the maximum system order employed in the analysis shall be increased, jointly with a careful setting of the adopted number of block rows in the Hankel matrix. This has been specifically pursued in the present implementation (see also preliminary investigation results in Pansieri 2016). In this way, better natural frequency and mode shape estimates may be achieved. As regards the modal damping ratios, they appear to be the most challenging parameters to be detected by the present SSI identification, especially in relation to the cited very short durations of the seismic response signals.
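The SV curves used to support the reading of the stabilization diagram may be sketched as follows. This is a minimal NumPy sketch of a Welch-type estimate of output spectral matrix \({{\bf G}}_{yy}\) followed by an SVD at each frequency line; it is not the rFDD routine of Pioldi et al. (2017). Function names, the two-sided scaling and the segment-wise mean removal are assumptions made here for illustration.

```python
import numpy as np

def welch_csd_matrix(y, fs, nperseg=1024, overlap=2 / 3):
    """Welch estimate of the output spectral matrix G_yy(f).

    y: m x N array of response channels, fs: sampling frequency [Hz].
    Returns frequencies (F,) and G of shape (F, m, m).
    """
    m, N = y.shape
    step = max(1, int(round(nperseg * (1 - overlap))))
    win = np.hanning(nperseg)                      # Hanning smoothing window
    freqs = np.fft.rfftfreq(nperseg, d=1 / fs)
    G = np.zeros((freqs.size, m, m), dtype=complex)
    count = 0
    for start in range(0, N - nperseg + 1, step):
        seg = y[:, start:start + nperseg]
        seg = seg - seg.mean(axis=1, keepdims=True)   # remove segment mean
        Y = np.fft.rfft(seg * win, axis=1)            # (m, F) spectra
        G += np.einsum("af,bf->fab", Y, Y.conj())     # outer product per line
        count += 1
    if count == 0:
        raise ValueError("record shorter than one segment")
    return freqs, G / (count * fs * np.sum(win**2))

def singular_value_curves(G):
    """Singular values of G_yy at each frequency line, in descending order."""
    return np.linalg.svd(G, compute_uv=False)         # shape (F, m)
```

The first column of the returned array is the first SV curve, whose peaks may be overlaid on the stabilization diagram. The overall scaling of the spectral matrix does not affect the location of such peaks.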

4 Numerical attempts and identification outcomes

4.1 Adopted earthquake dataset and numerical models

The current output-only algorithms require as input channels the recorded structural responses of the considered building. In the present work, these responses are first obtained from synthetic signals calculated from a heavy-damped linear three-storey frame. Out of several preliminary MDoF trials in Pansieri (2016), this 3-DoF case has been selected since it already constitutes a good and challenging synthetic sample for the present identification methodologies within the seismic engineering range, providing a consistent estimation of all the modal characteristics of the three modes and allowing for a compact presentation of all the achieved results. Also, a coherence subsists with the forthcoming analysis of a real MDoF case that will be exposed in subsequent Sect. 5, which will target and report as well the successful identification of the first three modes of vibration of an existing monitored building.

In the identification analyses, total storey accelerations are considered as input data for the identification algorithms. These numerical response recordings are generated prior to the modal identification by taking as base acceleration an earthquake signal out of a set of ten selected seismic ground motions (see Table 1).

The ten adopted earthquake records have been chosen as representative ones from a variety of available seismic excitations, displaying rather different characteristics, i.e. time-frequency spectra [see e.g. further information reported in Pioldi et al. (2015b)], duration, sampling, epicentral distance, magnitude M and PGA. Also, they have been specifically selected as potentially challenging instances for the present rFDD and SSI-DATA output-only identification purposes, given their strong non-stationary nature. Seismic signals have been input as is, as base excitation, and then considered as unknown in the subsequent output-only identification process.

The simulated structural responses are calculated by direct time integration of the equations of motion, via Newmark’s (average acceleration) method. The use of synthetic signals shall fulfil a first necessary condition for the algorithms’ effectiveness, since modal parameters are determined via direct modal analysis before identification and adopted as known targets for validation purposes.
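The generation of the synthetic responses may be sketched as follows. This is a minimal NumPy sketch of Newmark’s average acceleration method (\(\beta = 1/4\), \(\gamma = 1/2\)) for a linear MDoF system under base acceleration; it is not the original code. Function name `newmark_base_excitation`, the unit influence vector (all storeys translating with the ground) and the zero initial conditions are assumptions made here.

```python
import numpy as np

def newmark_base_excitation(M, C, K, ag, dt, beta=0.25, gamma=0.5):
    """Integrate M u'' + C u' + K u = -M r ag(t), with r = vector of ones.

    Returns relative displacements u and total accelerations (steps x DoFs).
    """
    ag = np.asarray(ag, dtype=float)
    n = M.shape[0]
    P = -np.outer(ag, M @ np.ones(n))             # effective seismic load
    u = np.zeros((ag.size, n))
    v = np.zeros_like(u)
    a = np.zeros_like(u)
    a[0] = np.linalg.solve(M, P[0])               # zero initial u and v
    K_eff = K + gamma / (beta * dt) * C + M / (beta * dt**2)
    for k in range(ag.size - 1):
        rhs = (P[k + 1]
               + M @ (u[k] / (beta * dt**2) + v[k] / (beta * dt)
                      + (0.5 / beta - 1) * a[k])
               + C @ (gamma / (beta * dt) * u[k] + (gamma / beta - 1) * v[k]
                      + dt * (0.5 * gamma / beta - 1) * a[k]))
        u[k + 1] = np.linalg.solve(K_eff, rhs)
        a[k + 1] = ((u[k + 1] - u[k]) / (beta * dt**2) - v[k] / (beta * dt)
                    - (0.5 / beta - 1) * a[k])
        v[k + 1] = v[k] + dt * ((1 - gamma) * a[k] + gamma * a[k + 1])
    return u, a + ag[:, None]                     # total = relative + ground
```

The second returned array holds the total storey accelerations, i.e. the quantities that are then passed as input channels to the output-only identification algorithms.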

Table 1 Main features of the adopted set of ten selected earthquake base excitations

The frame structure that has been adopted for initial verification is a reference three-storey shear-type frame, subjected to each single base excitation instance from the above adopted set of ten strong ground motions. This reference 3-DoF case has been characterized by a modal damping ratio \(\zeta _k = 7\%\), for all the structural modes, a rather high value for the present OMA identification purposes, especially within the seismic engineering scenario. Structural and modal dynamic characteristics of the adopted three-storey frame (Pioldi et al. 2015a) are reported in Table 2.

Table 2 Properties of the analyzed reference three-storey frame (Pioldi et al. 2015a)

4.2 Results from synthetic output-only modal dynamic identification

By taking as base excitation the single instances from the set of ten selected earthquake recordings presented in Sect. 4.1 (Table 1), separate dynamic identification analyses have been performed here with the present rFDD and SSI-DATA algorithms, in order to identify all strong ground motion modal parameters.

Both rFDD and SSI-DATA identification methods adopt constant time series lengths of 400 s, corresponding to a 0.0025 Hz frequency resolution (\(1/400\,{\mathrm {s}}\)), by applying the method outlined in Pioldi et al. (2015a), which allows the frequency resolution of the recordings to be increased, despite the shortness of the seismic histories.

As concerning rFDD, the following parameters have been adopted for the performed analysis:

  • Butterworth low-pass filtering, order 8, cut-off frequency 15 Hz, applied to the earthquake-induced structural response signals (input channels for the rFDD algorithm);

  • Different decimation settings, as a function of the frequency sampling of the recordings, i.e. no decimation (50 Hz signals), decimation of order 2 (100 Hz signals) and of order 5 (200 Hz signals);

  • Integrated PSD matrix computation through both Welch’s Modified Periodogram, generally set with 1024-points Hanning smoothing windows and 66.7% overlapping (2048-points Hanning smoothing windows when no decimation is applied), and Wiener-Khinchin’s method, set with a de-trended biased correlation matrix (see Pioldi et al. 2015b for more details).

Then, as concerning SSI-DATA, the performed analysis adopted the following settings:

  • Butterworth low-pass filtering, order 8, cut-off frequency 25 Hz, no decimation of the signals;

  • Block Hankel matrix parameters: number of block rows set to \(i=50\) (in general, adopted as variable in the range \(30\le i\le 80\)), number of columns \(j=N-2i+1\), for all the analyzed cases;

  • Stabilization diagram parameters: maximum order \(n=150\) (in general, adopted as \(n=m \cdot i\), as a function of number of response channels m and number of block rows i), detection of stable poles with tolerance levels set to \(\Delta f_k=|(f_{k}-f_{k+1})|/f_{k}<0.01\), \(\Delta \zeta _k=|(\zeta _{k}-\zeta _{k+1})|/\zeta _{k}<0.075\) and \(\Delta {\mathrm {MAC}}_k=1-{\mathrm {MAC}}(\varvec{\phi }_{k},\varvec{\phi }_{k+1})<0.02\) for frequencies, modal damping ratios and MACs, respectively, where k is the current model order. Notice that the higher tolerance value of 7.5% on the damping ratios has been set based on the experience gained by running the various cases, which pointed out an (expected) higher dispersion of the damping ratio estimates; it is basically the lowest value that still allowed the results in the reported diagrams to be produced;

  • Stabilization diagram SVs computed through SVD of the PSD matrix as estimated via Welch’s Modified Periodogram, set with 2048-points Hanning smoothing windows and 66.7% overlapping.
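The stable pole criterion listed above, with tolerances of 1%, 7.5% and 2% on frequency, damping ratio and MAC, may be sketched as follows. This is a minimal NumPy sketch: function names are hypothetical, and the way in which a pole at order \(k\) is paired with a pole at order \(k+1\) (e.g. by closest frequency) is not covered here.

```python
import numpy as np

def mac(phi_a, phi_b):
    """Modal Assurance Criterion between two (possibly complex) mode shapes."""
    num = abs(np.vdot(phi_a, phi_b)) ** 2          # vdot conjugates phi_a
    den = np.vdot(phi_a, phi_a).real * np.vdot(phi_b, phi_b).real
    return num / den

def is_stable(pole_k, pole_k1, tol_f=0.01, tol_zeta=0.075, tol_mac=0.02):
    """Check a pole at order k against its counterpart at order k+1.

    Each pole is a tuple (frequency, damping ratio, mode shape).
    """
    f_k, z_k, phi_k = pole_k
    f_k1, z_k1, phi_k1 = pole_k1
    return bool(abs(f_k - f_k1) / f_k < tol_f
                and abs(z_k - z_k1) / z_k < tol_zeta
                and 1.0 - mac(phi_k, phi_k1) < tol_mac)
```

For instance, a pole at 1.50 Hz with 7.0% damping is flagged as stable against a pole at 1.505 Hz with 7.2% damping and a proportional mode shape, while a frequency shift to 1.55 Hz (about 3.3%) exceeds the 1% tolerance.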

A sample of the outcomes from the two rFDD and SSI-DATA algorithms, operating here at seismic response input and concurrent heavy damping, is represented in Fig. 1. In this representation, the Singular Value Decomposition with peak-picking of the modes of vibration (left) and the enhanced Global stabilization diagram with the first rFDD Singular Value, representing rFDD and SSI-DATA all together (right), respectively, are depicted. The Global stabilization diagram displays the stable poles, at increasing system order n, concerning frequency, damping ratio and mode shape. Stable poles are selected as poles with a stable frequency, damping ratio and mode shape, between two consecutive system orders (they are marked with blue circles in Fig. 1).

Fig. 1

Singular Value Decomposition and peak-picking of the modes of vibration (rFDD) and enhanced Global Stabilization Diagram with 1st rFDD SV (SSI-DATA), three-storey frame, synthetic response signals, rFDD (left) and SSI-DATA (right) algorithms, earthquake of Maule (CH) (Table 1)

Fig. 2

Deviations of estimated natural frequencies (first column), modal damping ratios (second column) and MAC indexes (third column), three-storey frame, synthetic response signals, rFDD (first row) and SSI-DATA (second row) algorithms, complete considered earthquake dataset

The case of Maule (CH) excitation has been selected here because it constitutes a non-stationary case characterized by several earthquake harmonics in the structural frequency range of interest, which makes modal identification rather challenging. In any case, similar features have also been displayed by the SV curves and stabilization diagrams obtained from the other seismic excitations.

As concerning rFDD outcomes, the three modal peaks are detectable from the frequency lines on the first SV. Also, modal peaks are repeated on the remaining SVs, which is a clear index of existence of the related mode of vibration. Anyway, MAC and MPC indexes may be simultaneously adopted towards modal peak selection and validation, as extensively outlined in Pioldi et al. (2015b, 2017).

Then, regarding SSI-DATA, the achieved enhanced Global Stabilization Diagram, coupled with the representation of the \(1^{st}\) rFDD SV (see Sect. 3.2), shows three clear lines of stabilization of the poles. These lines may be discerned from the remaining poles, which are spurious poles coming from the earthquake harmonics or from numerical or mathematical poles. For example, the left line of stabilization for the first modal frequency contains several noise and mathematical poles, with poor consistency also by adopting different and combined MAC indexes, while the right one clearly defines better poles, and constitutes the line that is chosen for the identification purposes. Then, the use of the rFDD SV considerably helps in the selection of the correct stable poles, since it provides a better indication with respect to classically adopted PSDs within the stabilization diagrams. For example, in the right plot in Fig. 1 there appears a bifurcation of stable pole lines for the first mode of vibration, which can be anyway discerned by reading as well the underlying SV representation.

Thus, by the strategies and settings reported above, complete synthetic output-only analyses with the two OMA methods have been performed. A synopsis of all the achieved results is reported in Fig. 2. Estimates are reported in terms of absolute deviations of the rFDD and SSI-DATA estimated natural frequencies (\(\Delta f=|(f_{est}-f_{targ})/f_{targ}|\)) and modal damping ratios (\(\Delta \zeta =|(\zeta _{est}-\zeta _{targ})/\zeta _{targ}|\)) from the target parameters, and of achieved MAC indexes between the estimated mode shapes and the target ones (\({\mathrm {MAC}} = |\varvec{\phi }_{est}^{{\mathrm {H}}} \varvec{\phi }_{targ} |^2 / (| \varvec{\phi }_{est}^{{\mathrm {H}}} \varvec{\phi }_{est} | \, | \varvec{\phi }_{targ}^{{\mathrm {H}}} \varvec{\phi }_{targ} |)\)).

Fig. 3

Dispersion diagrams of deviations of estimated natural frequencies (first row), modal damping ratios (second row) and MAC indexes (third row), three-storey frame, synthetic response signals, rFDD (first column) and SSI-DATA (second column) algorithms, complete considered earthquake dataset. Minimum, mean and maximum values, and standard deviations are indicated

The rFDD estimated frequencies show deviations that are always below \(5\%\), except for the last modes of the NO, LP and TA earthquake cases (Table 1), where deviations increase up to \(9\%\). SSI-DATA estimated frequencies show more scattered deviations, with values up to \(14\%\). The estimated modal damping ratios display low deviations, at around \(10\%\), for the rFDD algorithm. This is not true for the SSI-DATA algorithm, where modal damping ratios display discrepancies rising up to \(70\%\). By this method, the order of magnitude is more or less caught, but deviations look much higher and are often unacceptable in engineering terms. However, it should be recalled that really tough heavy-damping identification conditions have been considered here. Generally, the frequency and damping estimates from the rFDD cases prove to be much closer to the target values than those from the SSI ones. In this sense, Frequency Domain rFDD appears superior to Time Domain SSI-DATA, within the considered seismic engineering scenario at simultaneous heavy damping and according to the present implementations and their achieved level of refinement.

MAC values are always higher than 0.91 for the rFDD instances, for all the modes. With SSI-DATA, MAC indexes perform slightly less well. For the first two modes, MAC values are always higher than 0.75, with acceptable values in engineering terms. The third modes, instead, display some problems, especially with the IV, NO and NZ cases (Table 1), which return quite unreliable mode shapes. Thus, also in terms of mode shape estimates, rFDD performs better than SSI-DATA, in the present seismic and heavy damping context.

Then, global results on the achieved modal estimates are further summarized in Fig. 3, where the absolute deviations of estimated natural frequencies and modal damping ratios, and the MAC indexes are represented, in terms of suitably-designed dispersion diagrams (Pioldi and Rizzi 2017). The synthetic estimates for the adopted three-storey frame have been condensed all together, by displaying the minimum, the mean and the maximum (absolute) deviations, in blue, black and red coloured lines and markers, respectively. Then, the normalized truncated Gaussian Probability Density Functions (PDF) related to the dispersion of the estimates have been depicted for each mode, jointly with an indication of standard deviation \(\sigma \) of the estimated values. In the present case, frequency and modal damping ratio deviations shall turn out strictly positive (since absolute percentage deviations are adopted), while MAC indexes shall vary between 0 and 1. By taking into account such boundaries, truncated Gaussians are fitted on the achieved estimates, for each examined case.

These truncated Gaussians represent the probability of appearance of a certain deviation, as associated with each estimate, between the minimum and the maximum value, and are centred on the mean value. As can be appreciated, the maximum deviations are always on the Gaussian tails, while the minimum deviations lie in the Gaussian center. This confirms the goodness of the achieved modal identification estimates, especially for the rFDD outcomes.
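The truncated Gaussian PDFs adopted in the dispersion diagrams may be sketched as follows. This is a minimal sketch under stated assumptions, since the fitting procedure is not detailed here: the parent Gaussian is assumed to take the sample mean and standard deviation of the estimates, the truncation bounds are \([0, \infty )\) for the deviations and \([0, 1]\) for the MAC indexes, and the PDF is normalized to unit area over such bounds. Function name `truncated_gaussian_pdf` is hypothetical.

```python
import math
import numpy as np

def truncated_gaussian_pdf(x, mean, std, lower, upper):
    """Gaussian PDF truncated to [lower, upper] and renormalized to unit area."""
    x = np.asarray(x, dtype=float)

    def cdf(value):
        # Cumulative distribution of the parent (untruncated) Gaussian
        return 0.5 * (1.0 + math.erf((value - mean) / (std * math.sqrt(2.0))))

    mass = cdf(upper) - cdf(lower)          # probability kept by truncation
    pdf = np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))
    inside = (x >= lower) & (x <= upper)
    return np.where(inside, pdf / mass, 0.0)
```

For the frequency and damping deviations, `upper` may be set to `math.inf`, so that only the lower bound at zero is active.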

Finally, the achieved results are also reported in statistical form in Fig. 4, through appropriate boxplots. In these representations, each boxplot relates to the natural frequencies, modal damping ratios and MAC values estimated from the performed analyses. In each boxplot the inner rectangular box represents the central 50% of the identified parameters, while the centred line indicates their median. Then, the left and right boundary segments depict the 25% and 75% quantiles of the related statistical distributions, respectively. Finally, the vertical through-plot dashed green lines mark the known target modal parameters for the identification procedure.

Fig. 4

Boxplot diagrams for estimated natural frequencies, modal damping ratios and MAC values, three-storey frame, synthetic response signals, rFDD (first row) and SSI-DATA (second row) algorithms, complete considered earthquake dataset

Presented boxplots confirm again the goodness of the achieved results, especially as concerning the rFDD outcomes. Natural frequency estimates show to substantially catch the expected target values. Also MAC indexes reveal very good mode shape estimates, also for the last modes of vibration, where SSI-DATA returns less accurate results, though still acceptable in engineering terms. Finally, modal damping ratios display very good estimates for the rFDD outcomes, by showing a rather contained dispersion, while for the SSI results some troubles appear, especially on the second and third modes of vibration.

The effect of artificial noise of a controlled amount added to the synthetic response signals does not impede the output-only identification analysis, up to noise levels that could be associated to those typical of practical instrumentations, as extensively demonstrated in Pioldi et al. (2015a, b). This is also testified by the effective processing of real signals, likely endowed with a certain amount of inherent noise, which is presented next, for final validation (see also Pioldi et al. 2017).

5 rFDD and improved SSI-DATA analyses with real earthquake-induced structural response signals

Following the numerical analysis reported earlier in Sect. 4, where synthetic response signals have been adopted first, as a necessary validation condition, the present rFDD and improved SSI-DATA OMA approaches are now applied to real earthquake-induced structural response recordings. The selected building is the San Bruno six-storey office building (in short SBOB), California (Fig. 5). Data are taken from the Center for Engineering Strong Motion Data (CESMD) online database (CESMD Database 2016), and represent one of the likely few well-documented and available cases, already studied in the dedicated EMA literature (Marshall et al. 1992; Celebi 1996) (notice that here OMA is instead attempted).

Fig. 5

External view of the San Bruno six-storey office building (SBOB), California, and three-dimensional sensor layout (adapted from Marshall et al. 1992). Input data are taken from the CESMD database (storey acceleration responses)

This six-storey building is constituted by RC moment resisting frames, with individual spread footings, located in the city of San Bruno, in the metropolitan area of San Francisco. The design was made in 1978 and displays a plan of \( 60.96\,{\mathrm {m}} \times 27.43\,{\mathrm {m}}\), and a height of 23.77 m. More information on the building, its history and characteristics may be found in Marshall et al. (1992) and Celebi (1996).

Thanks to the California Strong Motion Instrumentation Program, the building was instrumented in 1985 (CSMIP Station n. 58490). The recording system consisted of 13 accelerometers, on four levels of the building: four channels were devoted to the NS direction, eight channels to the WE direction and one channel to the UP direction. Fig. 5 briefly represents the overall building dimensions and the instrumentation layout.

In this work, the adopted SBOB seismic response data refer to the local earthquake excitation of Loma Prieta (1989), already considered in the previous synthetic analysis (see Table 1), now with main local characteristics as recorded at the building site and reported in Table 3. Recorded data belong to the total accelerations of the seven WE channels and of the three NS channels (the channels at the ground floor have not been considered, both for the OMA purposes of the present analysis and for their low signal-to-noise ratio).

Table 3 Main local characteristics and properties of the Loma Prieta earthquake (CESMD database). PGA and PSA values are referred to as the maximum between NS and WE components

The developed rFDD and SSI-DATA algorithms have been adopted to analyse the real earthquake-induced structural response data coming from the local Loma Prieta earthquake excitation. A sample of the outcomes from the two identification techniques is represented in Fig. 6. In this figure, the Singular Value Decomposition with modal peak-picking and the enhanced Global Stabilization Diagram with the first rFDD Singular Value, for the NS response component, are depicted. Despite the use of real seismic responses, the first three modes of vibration can still be detected from the graphs.

Fig. 6

Singular Value Decomposition and peak-picking of the modes of vibration (rFDD) and enhanced Global Stabilization Diagram with 1st rFDD SV (SSI-DATA), San Bruno six-storey office building, real response signals, rFDD (left) and SSI-DATA (right) algorithms, earthquake of Loma Prieta (LP), NS component (Table 3)

Then, main results, in terms of estimated natural frequencies and modal damping ratios, are reported in Table 4. The analyses and the resulting modal parameter estimates have been subdivided into the two spatial components of the excitation, namely the NS and the WE responses. The first three modes of vibration can be identified as well, and deviations between the two methods are reported, as \(\Delta f = [(f_{\mathrm {SSI}} - f_{{\mathrm {rFDD}}})/f_{{\mathrm {rFDD}}}] \%\) and \(\Delta \zeta = [(\zeta _{{\mathrm {SSI}}} - \zeta _{{\mathrm {rFDD}}})/\zeta _{{\mathrm {rFDD}}}] \%\). These deviations prove to be very limited concerning the natural frequencies, with a maximum of \(8.14\%\) on the third natural frequency of the NS case. Then, also the modal damping ratios reveal rather reasonable deviations, with a maximum of about \(30\%\) for the first mode of the NS case. The only exceptions are related to the second modes of vibration, where deviations increase, for both NS and WE components. In these cases, SSI-DATA shows much higher modal damping ratios than rFDD; this is probably due to the presence of two very close modes of vibration, i.e. the first and the second, which may negatively affect the damping estimates.

Table 4 Identified natural frequencies \( f_i \), modal damping ratios \( \zeta _i \) and related deviations, computed between rFDD and SSI-DATA estimates, San Bruno six-storey office building, real response signals, earthquake of Loma Prieta (LP), NS and WE components (Table 3)

Mode shapes, instead, are reported in terms of MAC indexes in Fig. 7, i.e. by adopting \({\mathrm {MAC}}( \varvec{\phi }_{{\mathrm {rFDD}}},\varvec{\phi }_{{\mathrm {SSI}}}) = | \varvec{\phi }_{{\mathrm {rFDD}}}^{{{\mathrm {H}}}} \varvec{\phi }_{{\mathrm {SSI}}}^{~} |^2 / (| \varvec{\phi }_{{\mathrm {rFDD}}}^{{{\mathrm {H}}}} \varvec{\phi }_{{\mathrm {rFDD}}}^{~} | \, | \varvec{\phi }_{{\mathrm {SSI}}}^{{{\mathrm {H}}}} \varvec{\phi }_{{\mathrm {SSI}}}^{~} |) \), as calculated between the rFDD and the SSI-DATA outcomes. Here, 3D MAC barplots are made by combining the rFDD and the SSI mode shapes, for each mode of vibration. On the diagonal terms, where the same modes are combined, MAC values turn out very close to one, as expected (since the first rFDD mode shape is combined with the first SSI mode shape, and so on), while off-diagonal terms shall be close to zero, as effectively detected (since different modes are combined with each other, and turn out to be mutually orthogonal).
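The cross-MAC values underlying such 3D barplots may be sketched as follows. This is a minimal NumPy sketch: function name `mac_matrix` is hypothetical, and mode shapes are assumed to be stored as columns, with the same sensor ordering in both sets.

```python
import numpy as np

def mac_matrix(Phi_a, Phi_b):
    """Cross-MAC between two sets of mode shapes stored as columns.

    Entry (r, s) compares mode r of Phi_a with mode s of Phi_b.
    """
    cross = np.abs(Phi_a.conj().T @ Phi_b) ** 2     # |phi_a^H phi_b|^2
    norm_a = np.sum(np.abs(Phi_a) ** 2, axis=0)     # phi_a^H phi_a per mode
    norm_b = np.sum(np.abs(Phi_b) ** 2, axis=0)     # phi_b^H phi_b per mode
    return cross / np.outer(norm_a, norm_b)
```

Since MAC is insensitive to the (possibly complex) scaling of each mode shape, the rFDD and SSI-DATA eigenvectors do not need to be normalized in the same way before the comparison.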

Notice that the direct comparison between the two rFDD and SSI-DATA methods (with respect to natural frequencies, modal damping ratios and mode shapes), after the earlier numerical analysis with synthetic response signals made in Sect. 4, helps with the validation of the achieved strong ground motion modal parameters. In fact, by adopting the two validated methods, it is possible to achieve a set of rFDD and SSI-DATA modal parameters, which can be “self-compared” in order to reach final identification outcomes.

Fig. 7

MAC indexes for the estimated mode shapes, computed between rFDD and SSI-DATA eigenvector estimates, San Bruno six-storey office building, real response signals, earthquake of Loma Prieta (LP), NS and WE components (Table 3)

Finally, the barplots proposed in Fig. 8 represent the achieved natural frequencies and modal damping ratios, computed with the rFDD and SSI-DATA methods for the LP earthquake responses, NS and WE components. In particular, jointly with the estimated parameters, the target ones are also reported, for further comparison purposes. Target modal parameters are taken from the works of Marshall et al. (1992) and Celebi (1996). For the NS case, the first three natural frequencies and the first modal damping ratio are available, while for the WE case the first two natural frequencies and the first modal damping ratio are provided. These target parameters are marked in red in Fig. 8, and deviations are calculated on them as \(\Delta f = [(f_{{\mathrm {ID}}} - f_{{\mathrm {TARGET}}})/f_{{\mathrm {TARGET}}}] \%\) for natural frequencies and \(\Delta \zeta = [(\zeta _{{\mathrm {ID}}} - \zeta _{{\mathrm {TARGET}}})/\zeta _{{\mathrm {TARGET}}}] \%\) for modal damping ratios. The very limited deviations between the estimated and the target values confirm once again the reliability of the proposed rFDD and SSI-DATA techniques, also in detecting strong ground motion modal parameters with real earthquake-induced structural response signals.

Fig. 8

Natural frequency and modal damping ratio barplots for rFDD and SSI-DATA, San Bruno six-storey office building, real response signals, earthquake of Loma Prieta (LP), NS and WE components (Table 3). Percentage deviations from the target values (where available) are indicated

6 Conclusions

In this work, two different output-only identification algorithms have been adopted to achieve strong ground motion modal parameter estimates. Simultaneous heavy damping, as a further identification challenge, has been considered as well. In this framework, both synthetic (Sect. 4) and real (Sect. 5) seismic response signals have been effectively analyzed.

Consistently, two enhanced OMA methods have been originally developed and implemented within MATLAB, by referring either to the Frequency Domain (refined Frequency Domain Decomposition, rFDD) or to the Time Domain (improved Data-Driven Stochastic Subspace Identification, SSI-DATA). Starting from classical implementations (Sects. 2.1, 3.1), the present algorithms have been specifically implemented to deal with seismic vibration and simultaneous heavy damping. Specifically, a series of further peculiar strategies, described in Sects. 2.2 and 3.2, have been successfully implemented to handle such a challenging Earthquake Engineering identification context.

By adopting the two output-only methods, consistent estimates have been achieved from both synthetic (reference linear three-storey shear-type frame) and real (six-storey instrumented RC building in California) seismic response signals. The present rFDD and SSI-DATA methodologies show to be rather robust in terms of global modal identification capability, within the considered seismic engineering range. In these terms, readable estimates of all the modal parameters have been achieved.

The obtained results from the adoption of synthetic earthquake-induced response signals have been demonstrated and validated by the consistent outcomes from the use of real seismic response recordings. With the present enhanced output-only implementations (to be used separately or all together), the challenging modal identification within the Earthquake Engineering range may effectively be performed. As a specific identification outcome from the analyzed cases, rFDD overall looks superior to SSI-DATA, at this particular stage of development and implementation. In such a sense, better modal estimates may be achieved, especially for the modal damping ratios.

Future developments shall concern additional refinements of both methods, especially by focusing on the improvement of the present SSI-DATA implementation, and more extensive analyses and case studies.