1 Introduction

Rapid socio-economic development in the recent past has forced infrastructure to perform under extreme conditions, pushing its operational envelope to meet ever-increasing demand. This requires continuous performance evaluation to maintain optimal output without interruption. In this context, vibration-based condition assessment has advantages over other methods, as it gives a quick overview of structural health with excellent accuracy. Doebling et al. [15] presented a review of vibration-based parameter estimation and damage detection. These methods are broadly classified into two categories: input–output-based methods and output-only methods. To implement the first technique, the user needs to know the structural details, the input excitation, and the response, while in the second case, only the recorded responses are required for system identification. Stochastic subspace identification (SSI) [25, 31], wavelet transformation (WT) [10, 11, 29], Hilbert–Huang transformation (HHT) [9, 17, 27], and blind source separation (BSS) [41, 42, 43] are among the most widely used output-only methods. Output-only techniques are easy and fast to implement for structural vibration control [5, 16, 19], retrofitting and strengthening of structures [8, 33, 34], and other decision-making processes.

SSI, HHT, and BSS have been extensively used for system identification in the recent past. While HHT has become popular for its simple implementation and data independence, it lacks a strong mathematical foundation. In many cases, it is observed to produce spurious information and mode-mixing [39], and it demands strong user interaction that is often heuristic in nature. Other methods (e.g., SSI and BSS) are based on either singular value decomposition or eigenvalue decomposition. These mathematical frameworks help to segregate the components of a signal very quickly. The main drawback of these methods is the limited control over the decomposition level. These techniques often provide only the strong components present in the signal; hence, it is very difficult to study frequencies that have lower energy content.

Among all these methods, WT enjoys popularity for its robustness and for the better control it offers over its parameters to obtain goal-oriented results. The main parameter in WT is the scale, which decides the number of components into which the main signal is decomposed. With an increase in the number of scales, a user can get finer details of the frequency content. Thus, to decompose a signal containing very high frequencies, the number of scales in this integral transformation also needs to be very high. In this context, the relation between scale and frequency is given by

$$\begin{aligned} a=\frac{f_{\mathrm{c}}}{f_{\mathrm{ps}}}\frac{1}{\varDelta _{f}}. \end{aligned}$$
(1)

In the above relation, a, \(f_{\mathrm{c}}\), \(f_{\mathrm{ps}}\), and \(\varDelta _{f}\) represent the scale, central frequency, pseudo-frequency, and sampling rate, respectively. As shown in Fig. 1, the scale a varies linearly with the pseudo-frequency \(f_{\mathrm{ps}}\) on a logarithmic scale. The figure also shows the typical dominant structural frequencies and their respective scales. For example, a cable-stayed bridge can have higher modes excited during its operation, which demand a different set of scales from its regular structural modes, which are much lower. Besides \(f_{\mathrm{ps}}\), the sampling rate \(\varDelta _f\), which remains unchanged during an experiment, is also a deciding factor in the wavelet transformation. Thus, to get a higher level of accuracy, a user needs to use a larger number of scales, which, in turn, requires more computational time. Although computational time alone is not an issue in inverse problem-based health monitoring, an equal division of the frequency scale in regions where the frequencies are well separated produces a large amount of redundant data. Moreover, the participation of modal frequencies may change depending upon many factors. Therefore, frequency tracking over a large range is a major challenge. It is also a major hurdle for real-time decision making and for incorporating the results in closed-loop vibration control. Frequency tracking over a wider range also demands better resolution than the original wavelet transformation can provide for data with a prefixed sampling rate.
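The log-linear relation stated above can be checked with a short numerical sketch. The function name and the numerical values of \(f_{\mathrm{c}}\) and \(\varDelta _f\) below are illustrative assumptions, not values from this study; \(\varDelta _f\) is used exactly as it appears in Eq. (1).

```python
import math

def scale_from_pseudo_frequency(f_c, f_ps, delta_f):
    """Eq. (1): a = (f_c / f_ps) * (1 / delta_f)."""
    return f_c / (f_ps * delta_f)

# Illustrative (assumed) values, not taken from the paper.
f_c = 1.0        # central frequency of the mother wavelet
delta_f = 0.01   # the quantity Delta_f of Eq. (1), fixed during an experiment

pseudo_frequencies = [0.1, 1.0, 10.0, 100.0]
scales = [scale_from_pseudo_frequency(f_c, f, delta_f) for f in pseudo_frequencies]

# Slope of log(a) versus log(f_ps); Eq. (1) implies a slope of -1.
slopes = [
    (math.log(scales[i + 1]) - math.log(scales[i]))
    / (math.log(pseudo_frequencies[i + 1]) - math.log(pseudo_frequencies[i]))
    for i in range(len(scales) - 1)
]
```

Each decade of pseudo-frequency therefore maps to one decade of scale, which is why covering both low structural modes and much higher modes with uniformly fine resolution requires many scales.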

Fig. 1 Scale requirements for different frequency resolutions

Recently, the synchrosqueezed transformation (SST) [12, 38] has been introduced as an improvement over WT. This technique is capable of enhancing the resolution between two scales and therefore offers better clarity in frequency localization. Owing to this property, many researchers [18, 22, 23, 32, 36, 37, 40] have applied it to separate frequencies from a noisy signal, to identify damage, to track frequencies, etc. Yang [40] and Thakur et al. [36] studied the efficiency of SST using a signal that contained different harmonic frequencies. They found that an SST-based algorithm can identify low-frequency components owing to its enhanced resolution. In another study, Li and Liang [22] used SST to identify the harmonic components of gearbox vibrations and damage, if any, by detecting the sifted harmonic components. In all these studies, SST showed impressive performance, though these applications are not very critical for WT, as finer scales could perform the same task. Wu et al. [38] showed mathematically that, in spite of the improvement in resolution, SST fails to identify closely spaced frequencies if it is not designed with proper judgment. Their study showed that if the difference in frequency between two closely spaced modes is below the sampling rate, SST also fails to separate them and demands finer sampling of the original signal. Besides the sampling rate, frequency tracking itself poses several challenges, as the response of a vibrating body contains dominant modes that may evolve with time, apart from the frequency content of the input excitation. In this context, it may be noted that spurious modes present in the transformed data also make modal identification difficult.

1.1 Objective

The literature review presented above clearly shows the evolution of wavelet-based time–frequency analysis in the light of system identification and damage detection. In this context, the recent development of SST-based time–frequency analysis has shown its potential for different engineering applications. This signal-processing tool offers better resolution in an adaptive framework similar to the Hilbert–Huang transformation. However, its potential in structural system identification and health monitoring is yet to be fully explored. With this in view, the following objectives are set for the present work:

  • develop an efficient modal identification strategy using SST. The main contribution, in this context, is to apply this tool to extract a large number of frequencies with significant accuracy. Here, the major challenge is to separate the modal frequencies from other frequencies (e.g., excitation frequencies and spurious modes), which is difficult using the conventional wavelet transform. This will be demonstrated using numerical results (both synthetic and experimental).

  • study the advantages and disadvantages of SST-based signal processing in the light of modal identification. Here, the two main aspects are (a) the effect of resolution on frequency tracking and (b) the effect of the choice of basis function on the quality of the end results.

  • develop an automated strategy by combining SST-based signal processing with machine learning to avoid user intervention. In this context, wavelet transformation-based identification needs significant user intervention, which introduces subjectivity into the parameter estimation.

All these issues will be discussed and the performance of the proposed algorithm will be demonstrated with the help of different examples in the following sections.

2 Review of WT and its synchrosqueezed version

In this section, a brief overview of the wavelet transform and its synchrosqueezed variant used for signal processing is presented. The reader may refer to [12, 13, 14] for further details of these time–frequency-based signal-processing tools.

The continuous wavelet transform of a finite-energy signal x(t) in \({\mathbf {L}}^{2}({\mathbb {R}})\) is given by

$$\begin{aligned} W_{\psi }x(a,b)=\int _{-\infty }^{+\infty }\frac{1}{\sqrt{|a|}}x(t) \psi ^{*}\left( \frac{t-b}{a} \right) \,\mathrm{d}t. \end{aligned}$$
(2)

In the above equation, \(*\) represents complex conjugate and \(\psi _{a,b}(t)\) is the dilated and time localized version of the mother wavelet which is given by

$$\begin{aligned} \psi _{a,b}(t)=\frac{1}{\sqrt{|a|}}\psi \left( \frac{t-b}{a} \right) \quad a,b\in {{\mathbb {R}}},a\ne {0}. \end{aligned}$$
(3)

Parameter a in Eq. (3) stretches or dilates \(\psi (t)\) to control the frequency content, while parameter b centers it in and around \(t=b\) to extract the time-localized features of the signal x(t). It may be noted that \(\psi (t)\) must satisfy the admissibility criterion [14], which is not discussed here for brevity. Unlike the Fourier transform, where the exponential function is used as the basis function, different bases have been proposed in the literature for the wavelet transformation. In this paper, three different bases (i.e., complex Morlet, log-normal, and generalized Morse) are considered to study their performance in parameter estimation. In this analysis, the signal is decomposed in logarithmic frequency levels as follows:

$$\begin{aligned} \omega _j = 2{\pi }2^{(j-j_0)/n_{\mathrm{s}}}. \end{aligned}$$
(4)

Here, j represents the index of the discretized frequency and \(n_{\mathrm{s}}\) is the number of scales in which the frequencies are segregated. It depends upon the frequency range \([f_1,f_2]\) over which the signal is decomposed and is given by

$$\begin{aligned} n_{\mathrm{s}} = \frac{N\log {2}}{\log {f_2}-\log {f_1}}. \end{aligned}$$
(5)

Parameter N is an integer greater than 1. The lower and upper bounds of the frequency interval (i.e., \(f_1\) and \(f_2\), respectively) depend on the search domain of the specific problem. For example, the dominant natural frequencies of civil infrastructure normally remain within 0 to 10 Hz. Therefore, such values of \(f_1\) and \(f_2\) can be adopted for modal identification of buildings and bridges. This discretized frequency is equivalent to the pseudo-frequency described in Eq. (1), and the two may be combined in the following form:

$$\begin{aligned} \omega _j = 2\pi {f_{\mathrm{ps}_j}}=2\pi \frac{f_{\mathrm{c}}}{\varDelta _f}\frac{1}{a_j}. \end{aligned}$$
(6)
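As a small numerical illustration of Eqs. (4) and (5), the sketch below generates the discretized frequencies and confirms that they are logarithmically spaced. It assumes that the index \(j-j_0\) runs from 0 to N, which the text does not state explicitly, and that \(f_1\) is strictly positive, since Eq. (5) involves \(\log f_1\). The function names and the values of N, \(f_1\), and \(f_2\) are illustrative.

```python
import math

def number_of_scales(N, f1, f2):
    """Eq. (5): n_s = N log 2 / (log f2 - log f1); requires 0 < f1 < f2."""
    if not 0.0 < f1 < f2:
        raise ValueError("Eq. (5) needs 0 < f1 < f2")
    return N * math.log(2.0) / (math.log(f2) - math.log(f1))

def discretized_frequencies(N, f1, f2):
    """Eq. (4) evaluated for j - j0 = 0, 1, ..., N (assumed index range)."""
    n_s = number_of_scales(N, f1, f2)
    return [2.0 * math.pi * 2.0 ** (j / n_s) for j in range(N + 1)]

# Assumed illustrative values.
N, f1, f2 = 256, 0.001, 10.0
omega = discretized_frequencies(N, f1, f2)

# Ratio between the last and the first level, and between successive levels.
span = omega[-1] / omega[0]
step_ratios = [omega[j + 1] / omega[j] for j in range(N)]
```

Under these assumptions the levels span exactly the ratio \(f_2/f_1\), and the ratio between successive levels is constant, i.e., the grid is uniform on a logarithmic axis.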

Apart from scales and pseudo-frequencies, different mother wavelets or bases are also prescribed for different purposes. In the following subsections, a brief overview is presented of the three mother wavelets whose performance in modal identification is investigated in this study.

2.1 Complex Morlet basis function

The complex Morlet wavelet basis function is defined in analogy with the Gaussian window and is expressed as [18]

$$\begin{aligned} \psi (f)=\frac{C_{f_{0}}}{\sqrt{2{\pi }}}\left( \mathrm{e}^{i2{\pi }f_{0}f} -\mathrm{e}^{-(2{\pi }f_{0})^{2}/2}\right) \mathrm{e}^{-f^{2}/2}. \end{aligned}$$
(7)

Here, \(f_{0}\) is the resolution parameter and \(C_{f_{0}}\) is a normalizing constant. In this study, \(C_{f_{0}}\) is considered to be unity.
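The role of the constant term subtracted inside the bracket of Eq. (7) can be illustrated numerically: it removes the mean of the Gaussian-windowed complex exponential, a property linked to the admissibility requirement. The sketch below is a minimal check under assumed values; the helper names and the choice \(2\pi f_0=1\) are illustrative, and the variable x stands for the argument written as f in Eq. (7).

```python
import cmath
import math

def complex_morlet(x, f0, c_f0=1.0):
    """Eq. (7), with x standing for the argument written as f in the text."""
    w0 = 2.0 * math.pi * f0
    carrier = cmath.exp(1j * w0 * x)
    correction = math.exp(-w0 ** 2 / 2.0)
    window = math.exp(-x ** 2 / 2.0)
    return c_f0 / math.sqrt(2.0 * math.pi) * (carrier - correction) * window

def integrate(func, lo=-12.0, hi=12.0, n=24000):
    """Plain Riemann sum; adequate here because the integrand decays like a Gaussian."""
    h = (hi - lo) / n
    return sum(func(lo + k * h) for k in range(n + 1)) * h

f0 = 1.0 / (2.0 * math.pi)   # assumed value, chosen so that 2*pi*f0 = 1

# Mean of the wavelet of Eq. (7), and of the same expression without the correction term.
mean_with_correction = integrate(lambda x: complex_morlet(x, f0))
mean_without_correction = integrate(
    lambda x: cmath.exp(1j * x) * math.exp(-x ** 2 / 2.0) / math.sqrt(2.0 * math.pi)
)
```

With the correction the mean vanishes to numerical precision, whereas without it the mean equals \(\mathrm{e}^{-1/2}\) for this choice of \(f_0\). For large \(f_0\) the correction term becomes negligible.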

2.2 Log-normal basis function

The frequency resolution obtained from the Morlet wavelet is uniform, while, in practice, resolution varies with the scale: lower frequencies stay dense within a band, while higher frequencies are dispersed over a wider range. This, in turn, demands a large number of scales to cover the complete range of frequencies present in the response. In this situation, a basis function whose frequency resolution follows a logarithmic variation is expected to perform better. Hence, the log-normal wavelet basis was developed, which is given by [18]

$$\begin{aligned} \psi (f)=\mathrm{e}^{-(2{\pi }f_{0}\log {f})^{2}/2}, \quad f>0, \end{aligned}$$
(8)

where \(f_{0}\) is the resolution parameter.
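The logarithmic resolution of Eq. (8) can be made concrete with a short sketch. The function names and the value of the resolution parameter are assumptions for illustration only.

```python
import math

def log_normal_wavelet(f, f0):
    """Eq. (8); defined for f > 0 only."""
    if f <= 0.0:
        raise ValueError("Eq. (8) is defined for f > 0")
    return math.exp(-((2.0 * math.pi * f0 * math.log(f)) ** 2) / 2.0)

def half_amplitude_band(f0):
    """Frequencies at which Eq. (8) drops to one half of its peak value."""
    half_width = math.sqrt(2.0 * math.log(2.0)) / (2.0 * math.pi * f0)
    return math.exp(-half_width), math.exp(half_width)

f0 = 1.0                      # assumed resolution parameter
peak = log_normal_wavelet(1.0, f0)
f_low, f_high = half_amplitude_band(f0)
```

The function peaks at f = 1 and satisfies \(\psi (f)=\psi (1/f)\), so its band is symmetric on a logarithmic frequency axis (\(f_{\mathrm{low}}f_{\mathrm{high}}=1\)) rather than on a linear one.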

2.3 Generalized Morse basis function

Although different wavelet bases have been proposed in the literature for specific end uses, recent research shows a trend toward developing a unified basis that can be tuned for different applications. One such wavelet basis is the generalized Morse wavelet [13, 24, 30], which is given by

$$\begin{aligned} \psi _{\beta ,\gamma }(\omega )=U(\omega )\alpha _{\beta ,\gamma }\omega ^{\beta }\mathrm{e}^{-\omega ^{\gamma }}. \end{aligned}$$
(9)

Here, \(\alpha _{\beta ,\gamma }\) is a normalizing constant and its value can be estimated as

$$\begin{aligned} \alpha _{\beta ,\gamma }=2\left( \frac{\mathbf {e}\gamma }{\beta }\right) ^{\frac{\beta }{\gamma }}. \end{aligned}$$
(10)

In the above equation, \(\mathbf {e}\) is Euler’s number, whose value is approximately 2.7183 [24], and \(U(\omega )\) is the unit Heaviside function. The resolution is controlled by \(\beta\) and \(\gamma\), which are considered to be positive for all practical purposes. By tuning these two parameters, one can obtain the resolution achieved by other wavelet bases [24].
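A brief numerical sketch of Eqs. (9) and (10) is given below. It reads Eq. (10) as \(2(\mathbf {e}\gamma /\beta )^{\beta /\gamma }\); with that reading, the wavelet attains a peak value of 2 at \(\omega =(\beta /\gamma )^{1/\gamma }\), which follows from setting the derivative of \(\omega ^{\beta }\mathrm{e}^{-\omega ^{\gamma }}\) to zero. The function names and the parameter pairs are illustrative assumptions.

```python
import math

def morse_alpha(beta, gamma):
    """Eq. (10): normalizing constant 2 * (e * gamma / beta) ** (beta / gamma)."""
    return 2.0 * (math.e * gamma / beta) ** (beta / gamma)

def morse_wavelet(omega, beta, gamma):
    """Eq. (9): the unit step U(omega) restricts the support to omega > 0."""
    if omega <= 0.0:
        return 0.0
    return morse_alpha(beta, gamma) * omega ** beta * math.exp(-omega ** gamma)

def morse_peak_frequency(beta, gamma):
    """Maximiser of omega**beta * exp(-omega**gamma)."""
    return (beta / gamma) ** (1.0 / gamma)

# Assumed (beta, gamma) pairs, used only to show the effect of tuning.
pairs = [(3.0, 3.0), (6.0, 3.0), (20.0, 3.0), (4.0, 1.0)]
peak_values = [morse_wavelet(morse_peak_frequency(b, g), b, g) for b, g in pairs]
```

The peak value is the same for every pair, so changing \(\beta\) and \(\gamma\) alters the location and width of the pass band while the normalization keeps the amplitude fixed.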

2.4 Synchrosqueezing and instantaneous frequency

Using the continuous wavelet transformation along with a given basis function, the instantaneous frequency of a time signal can be tracked, as explained here. For this purpose, let us consider an amplitude-modulated harmonic signal of the form \(x(t)=F_{0}\mathrm{e}^{-\eta {t}}\cos (\omega {t})\), as this trend is commonly observed in the response of linear dynamic systems. Applying the wavelet transform to this signal using Eq. (2), the coefficient can be expressed as

$$\begin{aligned} W_{\psi }x(a,b)= \frac{F_{0}\sqrt{a}}{4\pi }\mathrm{e}^{-\eta {t}}\psi ^{*}(a\omega )\mathrm{e}^{{\mathrm {i}}b\omega }. \end{aligned}$$
(11)

From this expression, the instantaneous frequency of the signal x(t) can be evaluated as

$$\begin{aligned} \omega _{x}(a,b) =-{\mathrm {i}}(W_{\psi }x(a,b))^{-1} \frac{\partial }{\partial {b}}W_{\psi }x(a,b). \end{aligned}$$
(12)

The above mathematical operation is known as synchrosqueezing. This operation enhances the gradient of the wavelet coefficient near the instantaneous frequencies, as it uses the derivative of the coefficient with respect to ‘b’. Thus, it increases the resolution near the dominant frequencies present in the signal. Furthermore, the energy of the time signal x(t) in the jth scale can be obtained as [11]

$$\begin{aligned} {E_{x}}_{j} = \frac{1}{2\pi {C_{\psi }}}\int \limits _{-\infty }^{+\infty }\frac{\left[ W_{\psi } x(a_{j},b)^2\right] }{a_{j}^2}\,\mathrm{d}b \quad \text{where } j=1,2,\ldots ,J. \end{aligned}$$
(13)

Here, it may be noted that the energy of the signal described above is localized in and around the harmonic frequencies \(\omega\). Therefore, the scale \(a_j\) corresponding to \(\omega\) shows sharp energy concentration in the scalogram obtained from Eq. (13). Once the instantaneous frequency \(\omega _{\mathrm{int}}\) is identified, the response in the wavelet domain can be reconstructed in the following way [38]:

$$\begin{aligned} T(\omega _{\mathrm{int}},b)=(\varDelta \omega )^{-1}\sum W_{\psi }{{x}}(a_{n},b)a_{n}^{-3/2}(\varDelta {a})_n. \end{aligned}$$
(14)

Here, the scale \(a_n\) should be selected, such that

$$\begin{aligned} {{a_{n}:|\omega (a_{n},b)-\omega _{\mathrm{int}}|\le \varDelta \omega /2}}. \end{aligned}$$
(15)

An inverse wavelet transform in and around a particular scale (say \(a_n\)) using Eq. (14) will produce a single-tone time signal (i.e., a mono-component). Theoretically, synchrosqueezing can separate the dominant frequencies present in the signal if the separation of these frequencies is larger than \(\varDelta \omega\) [38]. In practice, \(\varDelta \omega\) is constant once the data are collected from the experiment. Therefore, if the sampling rate is more than the separation between two closely spaced modes, SST fails to segregate them even when the number of scales is increased [38].

This property of SST helps to improve the resolution significantly compared to the scalogram obtained from wavelet transformation alone. Thus, identification of the instantaneous frequency becomes easier in SST than in the wavelet scalogram, where ridges and skeletons are used. The conventional use of ridges and skeletons demands significant user intervention (the user needs to study each of them) and also produces spurious modes. As outlined in the objectives, this advantage of SST is utilized for modal identification, as described in the following section.
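The reassignment described by Eqs. (12), (14), and (15) can be sketched in a few lines. The following is a minimal illustration, not the implementation used in this study: the function name, the uniformly spaced frequency bins, the relative threshold that discards negligible coefficients, and the synthetic wavelet coefficients of a single tone (built from an assumed Gaussian frequency-domain wavelet centred at \(\xi _0\)) are all assumptions.

```python
import numpy as np

def synchrosqueeze(W, scales, b, omega_bins, rel_threshold=1e-8):
    """Reassign CWT coefficients W[n, m] = W(a_n, b_m) to frequency bins.

    Sketch of Eqs. (12), (14) and (15); omega_bins must be uniformly spaced.
    """
    d_omega = omega_bins[1] - omega_bins[0]
    d_a = np.gradient(scales)                       # (Delta a)_n
    dW_db = np.gradient(W, b, axis=1)               # finite-difference d/db
    significant = np.abs(W) > rel_threshold * np.abs(W).max()
    # Eq. (12): omega(a, b) = -i W^{-1} dW/db; the real part is the frequency.
    safe_W = np.where(significant, W, 1.0)          # avoid division by ~0
    omega_inst = np.real(-1j * dW_db / safe_W)
    # Eq. (15): nearest bin centre, i.e. |omega(a_n, b) - omega_bin| <= d_omega / 2.
    bin_index = np.rint((omega_inst - omega_bins[0]) / d_omega).astype(int)
    in_range = significant & (bin_index >= 0) & (bin_index < len(omega_bins))
    # Eq. (14): accumulate W * a^{-3/2} * (Delta a)_n / Delta omega in each bin.
    weights = W * (scales ** -1.5 * d_a)[:, None] / d_omega
    T = np.zeros((len(omega_bins), W.shape[1]), dtype=complex)
    n_idx, m_idx = np.nonzero(in_range)
    np.add.at(T, (bin_index[n_idx, m_idx], m_idx), weights[n_idx, m_idx])
    return T

# Synthetic coefficients of a pure tone at omega0 (assumed closed form).
omega0, xi0 = 10.0, 5.0                             # rad/s; assumed wavelet centre
scales = np.linspace(0.1, 2.0, 96)
b = np.arange(2000) * 0.001
W = (np.sqrt(scales) / 2.0 * np.exp(-(scales * omega0 - xi0) ** 2 / 2.0))[:, None] \
    * np.exp(1j * omega0 * b)[None, :]

omega_bins = np.arange(0.0, 21.0, 1.0)
T = synchrosqueeze(W, scales, b, omega_bins)
energy = (np.abs(T) ** 2).sum(axis=1)               # energy per frequency bin
```

In this synthetic case the wavelet coefficients are spread over many scales around \(a=\xi _0/\omega _0\), whereas after reassignment all of the energy falls in the single bin centred at \(\omega _0\). Two tones closer than the bin width would be assigned to the same bin, which mirrors the resolution limit discussed above.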

3 Proposed synchrosqueezed clustering for modal identification

In this section, the dynamics of a structural system are first expressed in the wavelet domain. The equilibrium equation of a linear multi-degree-of-freedom system takes the following form:

$$\begin{aligned}{}[{\mathbf {M}}]\left\{ \ddot{{\mathbf {u}}}(t)\right\} +[{\mathbf {C}}] \left\{ \dot{{\mathbf {u}}}(t)\right\} + [{\mathbf {K}}]\left\{ {\mathbf {u}}(t)\right\} =\left\{ {\mathbf {P}}(t)\right\} . \end{aligned}$$
(16)

Here, \([{\mathbf {M}}]\), \([{\mathbf {C}}]\), and \([{\mathbf {K}}]\) are the system matrices, i.e., mass, damping, and stiffness, respectively. The displacement vector is denoted by \({\mathbf {u}}\), and the overdot represents the derivative with respect to time. In the above equation, \({\mathbf {P}}\) is the generalized force vector. Applying WT to both sides of Eq. (16), it can be expressed in the wavelet domain as follows [6]:

$$\begin{aligned}&[{\mathbf {M}}]\left\{ \frac{\partial ^2}{\partial {b}^2}W_{\psi }{\mathbf {u}}(a,b)\right\} + [{\mathbf {C}}]\left\{ \frac{\partial }{\partial {b}}W_{\psi }{\mathbf {u}}(a,b)\right\} \nonumber \\&\quad +\,{[{\mathbf {K}}]}\left\{ W_{\psi }{\mathbf {u}}(a,b)\right\} =\left\{ W_{\psi }{\mathbf {P}}(a,b)\right\} . \end{aligned}$$
(17)

The above expression shows that the dynamic equilibrium expressed in Eq. (16) is valid in the ‘b’ domain for a given scale ‘a’ that corresponds to a particular frequency. Moreover, it can be observed that the coupling in the generalized coordinates [as in Eq. (16)] is also present in the wavelet domain. Therefore, decoupling using modal coordinates is also applicable in the wavelet domain, which is given by

$$\begin{aligned} W_{\psi }{\mathbf {u}}(a,b)=\sum \limits _{q}\varvec{\Phi }^{q}W_{\psi }{u}^{q}(a,b), \end{aligned}$$
(18)

where \(\varvec{\Phi }\) is the mode shape vector obtained from the eigen analysis of mass and stiffness matrices. Using this orthogonal decomposition, Eq. (17) in the wavelet domain can be expressed in the modal coordinates as

$$\begin{aligned}&[{\mathbf {M}}_{l}]\left\{ \frac{\partial ^{2}}{\partial {b}^2}W_{\psi }{u}^{q}(a,b)\right\} + [{\mathbf {C}}_{l}]\left\{ \frac{\partial }{\partial {b}}W_{\psi }{u}^{q}(a,b)\right\} \nonumber \\&\quad +\,{[{\mathbf {K}}_{l}]}\left\{ W_{\psi }{u}^{q}(a,b)\right\} =\left\{ W_{\psi }{P}^{q}(a,b)\right\} . \end{aligned}$$
(19)

In the above equation, \([{\mathbf {M}}_{l}]\), \([{\mathbf {C}}_{l}]\), and \([{\mathbf {K}}_{l}]\) are the modal mass, damping, and stiffness matrices, while \(\left\{ W_{\psi }{P}^{q}(a,b)\right\}\) is the modal load vector in the wavelet domain corresponding to scale ‘a’. Here, it may be observed that modal dynamics in the wavelet domain follow the same mathematical framework as in the original time domain. Hence, convolution integral can be adopted to evaluate the response in the wavelet domain corresponding to the scale factor ‘a’. Following this analogy, the response due to the modal force vector \(\{W_{\psi }{P}^{q}(a,b)\}\) can be expressed as

$$\begin{aligned} W_{\psi }{u}_{k}^{q}(a,b)=\int \limits _{0}^{b}h_{k} (b-\tau )W_{\psi }{P}_{k}^{q}(a,\tau )\,\mathrm{d}{\tau }. \end{aligned}$$
(20)

Here, \(h_{k}(b-\tau )\) is the impulse response function of the decoupled system in the qth mode. In the above equation, \(\{W_{\psi }{P}_{k}^{q}(a,b)\}\) can be considered as a pulse train, where the total response is obtained by linear summation of the responses due to the individual pulses acting at time instants \(\tau\). Therefore, the modal load vector can be expressed as follows:

$$\begin{aligned} W_{\psi }{P}_{k}^{q}(a,\tau ) = \sum \limits _{i}{P_k}_{i}^{q}\delta (\tau -b_{i}). \end{aligned}$$
(21)

Using Eq. (21) in Eq. (20), the acceleration response in the wavelet domain corresponding to a scale ‘a’ can be derived as follows:

$$\begin{aligned} W_{\psi }\ddot{u}_{k}^{q}(a,b)= & {} \int \limits _{0}^{b}\frac{A_{k}}{m_{k}{\omega _{\mathrm{d}}}_{k}} \mathrm{e}^{-\eta _{k}{\omega _{n}}_k(b-\tau )}\sin \left\{ {\omega _{\mathrm{d}}}_k(b-\tau )-\vartheta _{k} \right\} \nonumber \\&\times \sum \limits _{i}{P_k}_{i}^{q}\delta (\tau -b_{i})\,\mathrm{d}{\tau } \nonumber \\= & {} \sum \limits _{i}\frac{A_{k}{P_k}_{i}^{q}}{m_{k}{\omega _{\mathrm{d}}}_{k}}\mathrm{e}^{-\eta _{k}{\omega _n}_{k}(b-b_{i})} \sin \left\{ {\omega _{\mathrm{d}}}_{k}(b-b_i)-\vartheta _{k} \right\} , \end{aligned}$$
(22)

where \(A_{k} =\eta _{k}{\omega _n}_{k}^{2}\sqrt{4-3\eta _{k}^2}\) and \(\vartheta _{k}=\tan ^{-1}\frac{2\eta _{k}\sqrt{1-\eta _{k}^2}}{2\eta _{k}^{2}-1}\). In the above equation, \(m_k\), \({\omega _{\mathrm{d}}}_k\), and \(\eta _k\) are the modal mass, damped natural frequency, and modal damping ratio, respectively. Using Eqs. (18) and (22), the global acceleration response in the wavelet domain corresponding to scale ‘a’ is evaluated as

$$\begin{aligned} W_{\psi }\ddot{{\mathbf {u}}}(a,b)= & {} \sum \limits _{q}\sum \limits _{i}\frac{A_{k}{P_k}_{i}^{q}}{m_{k}{\omega _{\mathrm{d}}}_{k}}\mathrm{e}^{-\eta _{k}{\omega _n}_{k}(b-b_{i})} \nonumber \\&\cos \left\{ {\omega _{\mathrm{d}}}_{k}(b-b_i)+\vartheta ^{k} \right\} , \end{aligned}$$
(23)

where \(\vartheta ^{k}=\frac{\pi }{2}-\vartheta _{k}\) is the phase lag between the modal load and the response in the wavelet domain corresponding to scale ‘a’. Once the total response is obtained in the wavelet domain, the analytic signal can be constructed as

$$\begin{aligned} W_{\psi }\ddot{{\mathbf {u}}}^{\mathrm{{an}}}(a,b)= & {} W_{\psi }\ddot{{\mathbf {u}}}(a,b)+{\mathrm {i}}\,{\mathcal {H}}(W_{\psi }\ddot{{\mathbf {u}}}(a,b)) \quad \forall {a}\in {{\mathbb {R}}^{+}}\nonumber \\= & {} \sum \limits _{q}\sum \limits _{i}\frac{A_{k}{P_k}_{i}^{q}}{m_{k}{\omega _{\mathrm{d}}}_{k}}\mathrm{e}^{-\eta _{k}{\omega _n}_{k}(b-b_{i})} \mathrm{e}^{{\mathrm {i}}\theta }. \end{aligned}$$
(24)

In the above expression, \({\mathcal {H}}(\cdot )\) represents the Hilbert transform [7], and the superscript \(^{\mathrm{{an}}}\) denotes the analytic signal. In Eq. (24), \(\theta ={\omega _{\mathrm{d}}}_{k}(b-b_i)+\vartheta ^{k}\) and \({\mathrm {i}}=\sqrt{-1}\). The energy content of the signal in different scales is evaluated as follows:

$$\begin{aligned} {E_{\ddot{{\mathbf {u}}}^{\mathrm{{an}}}}}_{j} = \frac{1}{2\pi {C_{\psi }}}\int \limits _{-\infty }^{+\infty } \frac{\left[ W_{\psi }{\ddot{{\mathbf {u}}}^{\mathrm{{an}}}} (a_{j},b)^2\right] }{a_{j}^2}\,\mathrm{d}b. \end{aligned}$$
(25)

Since the total response is expressed as a linear summation of the modal responses [i.e., Eq. (18)], the energy content of the measured response is also localized in and around different frequencies (i.e., scales in the wavelet domain) corresponding to structural modes and input forces. In traditional wavelet transform-based identification, the wavelet scalograms of the measured responses are used to develop the ridges and skeletons. This algorithm is not discussed here to avoid repetition; the reader may refer to [21, 35] for the details. However, ridge-skeleton-based identification demands user intervention to locate the scales. Moreover, traditional modal identification uses either broad-banded excitation or ambient vibration, which has an inherent advantage, as the output features modal frequencies only. The use of these techniques for non-stationary forced excitation is often heuristic. For this reason, the present study aims to use advanced time–frequency analysis (i.e., synchrosqueezed transformation) for efficient modal identification, as stated in the objectives. This is achieved in two steps: synchrosqueezing for better resolution, followed by machine learning for unbiased frequency localization.

Thus, the above analytic signal in Eq. (24) is used for SST to enhance the resolution of instantaneous amplitude and phase as explained in Eq. (12), that is

$$\begin{aligned} \omega _{\mathrm{s}}=-{\mathrm {i}}[W_{\psi }\ddot{{\mathbf {u}}}^{\mathrm{{an}}}(a,b)]^{-1} \frac{\partial }{\partial {b}}W_{\psi }\ddot{{\mathbf {u}}}^{\mathrm{{an}}}(a,b) = {\omega _{\mathrm{d}}}_{k}+{\mathrm {i}}\eta _{k}{\omega _{n}}_{k}. \end{aligned}$$
(26)

Here, it may be noted that above instantaneous frequency obtained from the analytic signal contains both real and imaginary parts. Thus, the modal frequency and damping ratio can be evaluated from the above expression which are given by

$$\begin{aligned} {\omega _n}_{k}&= {\mathrm {abs}}(\omega _{\mathrm{s}})=\sqrt{{{\omega _{\mathrm{d}}}_{k}}^2 + \eta _{k}^2{\omega _{n}}_{k}^2} \end{aligned}$$
(27a)
$$\begin{aligned} \eta _k&= \tan \left( {\mathrm {arg}}(\omega _{\mathrm{s}})\right) . \end{aligned}$$
(27b)
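Equations (26) and (27) can be exercised on a single complex instantaneous frequency. In the sketch below, the function name and the modal values are assumptions. Note that, for \(\omega _{\mathrm{s}}\) as written in Eq. (26), the tangent of the argument equals \(\eta _k/\sqrt{1-\eta _k^2}\), which is close to \(\eta _k\) for light damping; the sine-based variant shown alongside it is an addition for comparison and is not part of the original formulation.

```python
import cmath
import math

def modal_parameters(omega_s):
    """Natural frequency and damping ratio from the complex frequency of Eq. (26)."""
    omega_n = abs(omega_s)                        # Eq. (27a)
    eta_eq27b = math.tan(cmath.phase(omega_s))    # Eq. (27b) as printed
    eta_sine = omega_s.imag / omega_n             # sin(arg); added for comparison
    return omega_n, eta_eq27b, eta_sine

# Assumed modal values used to build omega_s = omega_d + i * eta * omega_n.
omega_n_true = 2.0 * math.pi * 1.5                # rad/s
eta_true = 0.02
omega_d = omega_n_true * math.sqrt(1.0 - eta_true ** 2)
omega_s = complex(omega_d, eta_true * omega_n_true)

omega_n, eta_eq27b, eta_sine = modal_parameters(omega_s)
```

For a damping ratio of 2%, the two damping estimates differ only in the sixth decimal place, so the distinction matters only for heavily damped modes.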

However, Eq. (26) provides instantaneous frequency corresponding to every scale which may be either modal frequencies or frequencies corresponding to the input force.

These frequencies are separated by exploiting the inherent properties of modes (i.e., modal responses are in phase). Using Eq. (14) near \(\omega _n\), corresponding wavelet coefficients can be used to reconstruct the signal as follows:

$$\begin{aligned} T(\omega _n,b)=(\varDelta \omega )^{-1}\sum W_{\psi }\ddot{{\mathbf {u}}}(a_{n},b)a_{n}^{-3/2}(\varDelta {a})_n. \end{aligned}$$
(28)

Thus, by inverse wavelet transformation of \(T(\omega _n,b)\) for each \(\omega _n\), the response \(\ddot{{\mathbf {u}}}^q(t)\) can be generated in the time domain. To check whether the extracted response (a mono-component, as per theory) represents a modal frequency or not, phase portraits of the same signal obtained from different sensors are used. Here, it may be noted that the instantaneous phase can also be found by constructing the analytic signal as follows:

$$\begin{aligned} \theta ^q_k = \text{phase}[\ddot{{\mathbf {u}}}^q(t)+{\mathrm {i}}\,{\mathcal {H}}(\ddot{{\mathbf {u}}}^q(t))]. \end{aligned}$$
(29)

Once the phase portraits of the signals from different sensors are obtained for the scales where energies are localized, they are compared to check for unison (i.e., crossing zero or reaching peaks at the same time), which is the typical behavior of modal vibration. In this context, the synchrosqueezed wavelet scalogram with improved resolution helps to segregate energies in different scales, as opposed to the traditional ridges and skeletons of the wavelet coefficients obtained from free or ambient vibration analysis. In reality, measured responses are often transient due to arbitrary forcing functions contaminated with noise and other interference. For this reason, the wavelet scalogram often shows energy localization over different regions instead of specific scales and also suffers from discontinuity in the ‘b’ domain. Hence, it is difficult to identify dominant frequencies from the wavelet scalogram. The problem is more prominent where a large number of modes with closely spaced frequencies are present. To avoid these problems (i.e., user intervention to decide the dominant frequencies, which involves subjectivity and leads to erroneous estimation), further analysis of the synchrosqueezed wavelet coefficients is proposed in this paper. Here, two major improvements are adopted: (a) machine learning is applied to the synchrosqueezed wavelet transform data to segregate them into different frequency bins, and (b) the phase portraits are then extracted to locate the modal frequencies, which are in unison, as explained below. The second step, in particular, is very helpful, as only a limited number of clusters are searched, as opposed to the large number of ridges and skeletons in the original wavelet-based identification.
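The unison check on the instantaneous phase can be sketched as follows. This is an illustration under assumptions: the analytic signal is formed as \(x+{\mathrm {i}}{\mathcal {H}}(x)\) with a standard FFT-based construction, the function names and the tolerance are hypothetical, and the two-sensor signals are synthetic mono-components rather than data from this study.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal x + i*H(x) of a real 1-D array."""
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)

def phase_difference(x1, x2):
    """Wrapped instantaneous phase difference between two sensor signals."""
    z1, z2 = analytic_signal(x1), analytic_signal(x2)
    return np.angle(z1 * np.conj(z2))

def in_unison(x1, x2, tol=1e-3):
    """True when the phase difference stays at 0 or +/-pi (zeros and peaks coincide)."""
    return bool(np.max(np.abs(np.sin(phase_difference(x1, x2)))) < tol)

# Synthetic mono-components at 2 Hz seen by two sensors (assumed data).
t = np.arange(800) * 0.005
sensor_1 = np.cos(2.0 * np.pi * 2.0 * t)
sensor_2_modal = -0.6 * np.cos(2.0 * np.pi * 2.0 * t)     # same mode, opposite sign
sensor_2_lagged = 0.6 * np.cos(2.0 * np.pi * 2.0 * t - 0.9)

modal_check = in_unison(sensor_1, sensor_2_modal)
lagged_check = in_unison(sensor_1, sensor_2_lagged)
dphi_lagged = phase_difference(sensor_1, sensor_2_lagged)
```

A component that differs between sensors only by the amplitude and sign of a mode shape passes the check, while a component with an arbitrary phase lag between sensors does not.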

3.1 k-means clustering-based frequency localization

As discussed above, the energy concentration of a signal varies with time, as the system vibrates under arbitrary excitation and is susceptible to measurement noise. This change is difficult to capture by visual inspection of the ridges and skeletons obtained from the spectrogram of the WT or SST. Thus, clustering is adopted to identify the energy concentration of a signal in different scales. Here, the popular, stable, partition-based k-means clustering algorithm is used to extract the underlying pattern of energy localization from the spectrogram of the SST analysis. The details of this algorithm may be found in Abonyi and Feil [2]. Here, only the equations relevant to the present problem are explained. The k-means clustering objective for a data set is defined as

$$\begin{aligned} {\mathcal {J}}(x,v)=\sum _{i=1}^{c}\sum _{\begin{array}{c} k=1 \\ x_{k}\in {c_{i}} \end{array}}^{n}\Vert {x_{k}}-v_{i}\Vert ^2. \end{aligned}$$
(30)

In the above equation, \(x_k = |W_{\psi }\ddot{{\mathbf {u}}}(a,b)|^2\). Here, n and c are the number of data points and the number of clusters, respectively, and the center of the ith cluster is denoted by \(v_{i}\). The individual weight index of each cluster is given by

$$\begin{aligned} \lambda =\frac{\sum \limits _{i=1}^{n}|W_{\psi }\ddot{{\mathbf {u}}}(a,b_i)|^2}{\sum \limits _{k=1}^{c}\sum \limits _{\begin{array}{c} i=1\\ x_k\in {c_k} \end{array}} ^{n}|W_{\psi }\ddot{{\mathbf {u}}}(a,b_i)|^2}. \end{aligned}$$
(31)

Based on this weight index, it is possible to locate the energy concentration at the same scales in the SST spectrogram and, hence, the underlying dominant frequencies. To identify the optimum number of clusters c, the gap value (GV) is estimated, which is given by

$$\begin{aligned} \text{GV}_{n}(c) = E_{n}\left\{ \log {W_{\mathrm{c}}}\right\} -\log \left( W_{\mathrm{c}}\right) . \end{aligned}$$
(32)

In the above equation, \(W_{\mathrm{c}}\) represents the pooled within-cluster dispersion, which is evaluated as

$$\begin{aligned} W_{\mathrm{c}} = \sum \limits _{i=1}^{c}\frac{1}{2n_{i}}d_{i}. \end{aligned}$$
(33)

Here, the number of data points in the ith cluster is represented by \(n_{i}\), whereas \(d_{i}\) is the sum of the pairwise distances for all points within that cluster. The optimal number of clusters c is obtained iteratively. The clustering starts with an initial value of c (typically 1), which is increased by 1 in every successive iteration, and the gap values are studied over a wide range of c. From this analysis, the optimal c is identified as the value at which GV saturates. In this context, it is relevant to explain the use of unsupervised learning rather than supervised learning in the proposed modal identification strategy. The reasons behind the selection of the k-means algorithm (i.e., unsupervised learning) are

  • to avoid the bias associated with the training data used in supervised learning, which will correspond to a specific class of modal frequencies and non-stationary excitation used for training. In reality, there is no guarantee that the actual frequencies will be in the same pool used for training or that similar non-stationary excitation will occur in testing.

  • as machine learning is used in this study to segregate the data into groups of dominant frequencies to avoid user intervention, unsupervised learning is sufficient to carry out this task.

Using this technique, the energy localization in different clusters is identified and the signals corresponding to those clusters are obtained by the inverse synchrosqueezed transformation, as explained in the previous section. These signals are further used to identify the modal parameters, as explained in Sect. 2.
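The clustering quantities of Eqs. (30), (32), and (33) can be sketched for one-dimensional data as follows. This is a minimal illustration and not the implementation used in this study. The assumptions are: a deterministic quantile initialization of the cluster centers, squared Euclidean distances in \(d_i\), a uniform reference distribution over the data range for the expectation in Eq. (32), and a synthetic feature array standing in for the SST data. All function names are hypothetical.

```python
import numpy as np

def kmeans_1d(x, c, n_iter=100):
    """Lloyd iterations minimising Eq. (30) for 1-D data (quantile initialisation)."""
    centres = np.quantile(x, (np.arange(c) + 0.5) / c)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(x[:, None] - centres[None, :]), axis=1)
        new_centres = np.array([
            x[labels == i].mean() if np.any(labels == i) else centres[i]
            for i in range(c)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    labels = np.argmin(np.abs(x[:, None] - centres[None, :]), axis=1)
    return centres, labels

def pooled_dispersion(x, labels, c):
    """Eq. (33): W_c = sum_i d_i / (2 n_i), with d_i summed over all ordered pairs."""
    total = 0.0
    for i in range(c):
        xi = x[labels == i]
        if len(xi) > 0:
            total += np.sum((xi[:, None] - xi[None, :]) ** 2) / (2.0 * len(xi))
    return total

def gap_value(x, c, n_ref=20, seed=0):
    """Eq. (32): mean log W_c over uniform reference sets minus log W_c of the data."""
    rng = np.random.default_rng(seed)
    _, labels = kmeans_1d(x, c)
    log_w = np.log(pooled_dispersion(x, labels, c))
    ref_logs = []
    for _ in range(n_ref):
        ref = rng.uniform(x.min(), x.max(), size=len(x))
        _, ref_labels = kmeans_1d(ref, c)
        ref_logs.append(np.log(pooled_dispersion(ref, ref_labels, c)))
    return float(np.mean(ref_logs) - log_w)

# Synthetic 1-D features with three well-separated groups (assumed data).
offsets = np.linspace(-0.1, 0.1, 20)
x = np.concatenate([m + offsets for m in (1.0, 5.0, 9.0)])

centres, labels = kmeans_1d(x, 3)
w_3 = pooled_dispersion(x, labels, 3)
objective_3 = float(np.sum((x - centres[labels]) ** 2))   # Eq. (30) at convergence
gaps = {c: gap_value(x, c) for c in (1, 3)}
```

With squared distances, the pooled dispersion of Eq. (33) coincides with the k-means objective of Eq. (30) evaluated at the cluster means, and the gap value for the correct number of groups is clearly larger than that for a single cluster in this example.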

3.2 Algorithm of the proposed identification strategy

In this subsection, the algorithm for the sequential clustering of the synchrosqueezed wavelet transform coefficients, as explained in the previous sections, is presented.

figure a

4 Numerical results and discussion

In this section, the proposed algorithm is applied to parameter identification using simulated and experimental data. In the first example, synthetic measurements of a 3 degree-of-freedom (dof) system excited by white noise are used in three different cases to validate the algorithm. In the second example, the acceleration response of a building during two real seismic events is considered. This building, located on the IIT Guwahati campus, serves as a testbed for seismic research. It is designated the BRNS building, as it is sponsored by the Board of Research in Nuclear Sciences (BRNS), Government of India. Finally, the response of a thin beam tested in the laboratory by Adhikari et al. [3] is used. In this example, a beam with additional point masses is considered, whose dominant natural frequencies vary over a wide range. The performance of the proposed algorithm on these examples is discussed below.

4.1 Validation using synthetic experiment

Table 1 Parameters of 3dof system

In this simulated case, a hypothetical 3-dof system [35] with white noise as support excitation is considered. Three cases are studied: (a) case I, with strong and well-separated modes; (b) case II, with two closely spaced strong modes; and (c) case III, with two closely spaced weak modes. Their properties are shown in Table 1. It may be noted that the three systems are independent, i.e., case II and case III are not derived from case I. In principle, they represent three different eigensystems used to validate the performance of the proposed algorithm. Figure 2 shows the frequency response function (FRF) of the third dof for all three cases. From this figure, it may be noted that case II is the most challenging, as the 2nd and 3rd frequencies are very close to each other with almost the same energy content. Figure 3a shows the response of the 3rd dof to the simulated white noise excitation. The same excitation is used for all three cases. The simulated response is subjected to wavelet transformation with a complex Morlet basis, starting with \(2^{8}\) scales that cover a frequency range of 0.001 to 10 Hz. The algorithm finally converges with 2980 scales. Convergence is checked using the frequencies identified in two successive iterations: the iteration is stopped when the absolute error falls below the tolerance limit (say \(10^{-2}\), which remains the same for the other examples). The scalogram of this WT is shown in Fig. 3b. It is extremely difficult to identify the frequencies by visual inspection of this scalogram, although the regions of scales containing the dominant frequencies can be detected. To improve the resolution, synchrosqueezing is applied to the WT coefficients, and the resulting scalogram is shown in Fig. 3c. From these figures, it is clear that the frequency localization improves after SST, but it is still difficult to identify the frequencies by visual inspection or by the ridge and skeleton approach proposed in the literature [23].
To alleviate this issue, as stated in the objective, k-means clustering is applied to the SST coefficients. Clusters are formed on the basis of energy concentration, which is equivalent to the square of the modulus of the coefficients obtained from SST analysis, as given in Eq. (25). For this purpose, the optimum number of clusters is determined using the gap statistic, as discussed in Eqs. (32) and (33). This is done iteratively over a realistic range (typically up to 15), with \(c=1\) as the initial number, which is raised by one in every successive iteration. The gap value for every iteration is evaluated and plotted in Fig. 4a. From this figure, the optimal number of clusters is identified as 4. Based on this analysis, 4 clusters are formed and the median of each cluster is taken as the identified frequency. These clusters are arranged according to the cluster weights described in Eq. (31), as shown in Fig. 4b. Here, it can be observed that 4 frequencies are identified from the median values of these clusters, as opposed to the 3 modal frequencies of the system. To identify the modes, the phase spectra of the responses corresponding to the median frequencies of the clusters are evaluated as per Eq. (29). Figure 5 shows the time-domain responses obtained from Eq. (28) for each cluster median, with a band of \(\pm {2.5\%}\) on either side of this value. Figure 6 shows the phase difference, where the 1st dof is taken as the reference. It may be noted that the responses corresponding to the first three clusters are modes, as their phases are in unison, while the response in the fourth cluster shows randomness, which is contrary to the fundamental property of a mode. Based on this analysis, the identified frequencies in case I are 0.801, 2.164 and 3.251 Hz, which are very close to their theoretical values, as shown in Table 2. Once the frequencies are identified, the modal responses are considered for estimation of the damping ratio. However, it may be noted from Fig. 5 that the time histories corresponding to each cluster do not exhibit decay, as these are forced responses. Hence, they are not used directly for damping estimation, as the modal damping ratio is very sensitive and is best evaluated from the decay of the transient response. For this purpose, NExT [20] is applied to these modal responses prior to damping estimation; it is not discussed here, as it is not the theme of this study. Once the free response is obtained from NExT, Eq. (27b) is adopted to evaluate the modal damping ratio (as a fraction of critical damping). The estimated modal damping ratios in case I are 0.053, 0.03 and 0.02, respectively, which are very close to their respective theoretical values.
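The phase check used above to separate modes from spurious clusters can be sketched as follows. This is a minimal sketch, not the authors' implementation. It assumes that the instantaneous phase of Eq. (29) can be approximated by the phase of the analytic signal, computed here with an FFT-based Hilbert transform. The `phase_locking` score is a hypothetical summary measure introduced only for illustration, whereas the paper judges unison from the phase-difference plots. The signals are synthetic.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal of a real 1-D array via the FFT (Hilbert transform)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0      # double the positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)


def phase_difference(reference, other):
    """Instantaneous phase of `other` relative to `reference`, in radians."""
    z_ref = analytic_signal(reference)
    z_oth = analytic_signal(other)
    return np.angle(z_oth * np.conj(z_ref))


def phase_locking(reference, other):
    """Score in [0, 1]: near 1 for a constant phase difference (hypothetical measure)."""
    dphi = phase_difference(reference, other)
    return float(np.abs(np.mean(np.exp(1j * dphi))))


# Illustrative signals only: two dof sharing a 0.8 Hz component behave like
# a mode, whereas two independent noise records do not.
fs = 100.0                        # sampling rate in Hz (assumed)
t = np.arange(2000) / fs
dof1 = np.sin(2 * np.pi * 0.8 * t)
dof3 = 0.5 * np.sin(2 * np.pi * 0.8 * t)
rng = np.random.default_rng(0)
noise1 = rng.standard_normal(2000)
noise3 = rng.standard_normal(2000)

print(phase_locking(dof1, dof3))      # close to 1: phases in unison
print(phase_locking(noise1, noise3))  # much smaller: no fixed phase relation
```

A constant phase difference of either 0 or \(\pi\) gives a score near 1, which matches the in-phase or out-of-phase motion expected of a real mode.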

Fig. 2 Frequency response function (FRF) of 3dof system

Fig. 3 Time histories and scalograms: a input and output; b WT scalograms for case I; c SST scalograms for case I

Fig. 4 k-means clustering: a GV statistics and b clusters

Fig. 5 Reconstructed signals corresponding to median frequencies

Fig. 6 Phase portrait of reconstructed signals (subscript in legend represents dof)

Following the same steps, case II and case III are solved, and the scalograms obtained using WT and SST are shown in Figs. 7 and 8, respectively. The numbers of scales needed in case II and case III are 2980 and 4476, respectively. This shows that a large number of scales is required for closely spaced weak modes, as in case III. As before, visual inspection of the scalograms does not reveal the modal frequencies. Hence, the scalograms with improved resolution (i.e., after SST) are used for clustering, and the optimal number of clusters remains 4 in both cases. These plots are not shown here to avoid repetition. In this context, it is relevant to explain why clustering is performed on the SST coefficients and not directly on the WT coefficients. Because of their poor resolution, the WT coefficients spread the energy over different frequency bands instead of localizing it in and around a few scales, as observed in the SST scalogram. As a result, spurious clusters (and hence spurious modes) are generated, leading to inaccurate estimation of the modal frequencies. This is illustrated in Fig. 9, which shows the additional clusters obtained from the wavelet transformation coefficients. Finally, instantaneous phases are compared, as described for case I, to identify the modal frequencies. Table 2 shows the identified frequencies and modal damping ratios for all these cases. It can be observed that the estimated values are well within \(5\%\) of their respective theoretical values, except in one case, where the error is \(7\%\). This occurs in case II, which is the most critical, as the second and third frequencies are closely spaced with almost the same energy content. For this reason, extraction of mono-component signals through SST is difficult, which leads to a higher estimation error. As mentioned earlier, SST also fails to pinpoint scales (and the respective frequencies) if the difference between them is close to or below \(\varDelta \omega\) (i.e., sampling rate). With this validation exercise, it may be concluded that the performance of the proposed algorithm is satisfactory and that it can be used for field implementation.
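The scale counts quoted above result from the stopping rule described for case I, in which the frequencies identified in two successive iterations are compared against a tolerance. A minimal sketch of that comparison is given below. It assumes the tolerance is an absolute difference in Hz and that frequencies are matched after sorting. The function name is hypothetical, and the rule by which the number of scales is increased between iterations is not reproduced here.

```python
def frequencies_converged(previous, current, tol=1e-2):
    """Return True when two successive sets of identified frequencies agree.

    Both sets must contain the same number of frequencies, and every
    matched pair (after sorting) must differ by less than `tol` (assumed Hz).
    """
    if len(previous) != len(current):
        return False
    pairs = zip(sorted(previous), sorted(current))
    return all(abs(f_old - f_new) < tol for f_old, f_new in pairs)


# Hypothetical frequencies (Hz) from two successive refinements of the scales.
coarse = [0.80, 2.16, 3.25]
refined = [0.801, 2.164, 3.251]
print(frequencies_converged(coarse, refined))
```

Requiring equal set sizes is a design choice of this sketch: a change in the number of identified frequencies is treated as a lack of convergence.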

Fig. 7 Scalograms: a wavelet coefficients in case II; b SST coefficients in case II

Fig. 8 Scalograms: a wavelet coefficients in case III; b SST coefficients in case III

Fig. 9 Clusters from WT and SST analysis for case I

Table 2 Identified parameters of 3dof system

4.2 Building under earthquake excitation

In this example, a full-scale building on the IIT Guwahati campus is used for experimental verification. Figure 10 shows a photograph of this building along with its structural dimensions. There are two identical buildings, one on either side of the central staircase. The buildings are separated from the staircase so that they can move freely. The building on the right side has base isolators, while the building on the left side is supported on isolated footings (i.e., fixed base); the latter is used in this study for modal identification. The sensor details of this building are provided in Mahato et al. [26]. Accelerations at the top floor recorded during actual seismic events are used here for identification. Figure 11 shows the top floor responses in the X and Y directions recorded on 3rd September 2009. The first column of this figure shows the recorded earthquake ground motions in the X and Y directions along with the building responses. Wavelet transformation of these responses is carried out as described earlier, with \(n_{\mathrm{s}} = 2^{8}\) as the initial value. The middle column of Fig. 11 shows the scalograms of the responses obtained using the complex Morlet wavelet. The central frequency of this basis function is taken as 3 Hz in this study and, after convergence, the number of scales is 1904, covering a frequency range of 0.001 to 30 Hz. It can be observed from Fig. 11 that the frequencies are localized in different zones, but the resolution is not high enough to identify them clearly. To improve the resolution, synchrosqueezing is adopted, and the results are shown in the third column of Fig. 11. Here, the clarity of the scalogram improves; however, it still fails to locate the modal frequencies directly, as the relative magnitude of the energy distribution is very low. Thus, k-means clustering is invoked to segregate the frequencies based on their energy content, as shown in Fig. 13. In this case, the optimal number of clusters is estimated to be 7. Finally, the modes are identified using the instantaneous phase portrait; the phase difference between two dof for the same cluster is shown in Fig. 14. For this purpose, the 1st and 4th dof are considered, and the responses in the 1st and 4th clusters are used for demonstration. The time-domain responses are obtained by inverse transformation, as described in the previous example, and the instantaneous phases are evaluated using Eq. (29). The first row of Fig. 14 shows the responses in the 1st and 4th clusters, while the second row shows their phase differences. From this figure, it can be concluded that the 1st cluster corresponds to the 1st mode, as the phase difference is zero at most time instants (except near the zero crossings of the individual signals, which appear as large spikes), while the 4th cluster is a spurious mode. The other clusters are verified in the same way to identify the modal frequencies. The identified frequencies are listed in Table 3.
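The clustering step invoked above, followed by taking each cluster median as a candidate frequency, can be sketched in one dimension. This is only an illustration under stated assumptions. The samples are assumed to be the frequencies (in Hz) at which the SST energy is significant, which is a simplification of the energy-based clustering of the paper. The centres are initialised deterministically rather than at random, the cluster weights of Eq. (31) are not computed, and all names and values are hypothetical.

```python
from statistics import median


def kmeans_1d(values, c, iterations=100):
    """Lloyd's k-means on scalar samples; returns a list of c clusters."""
    data = sorted(values)
    # Deterministic start (assumption): centres at evenly spaced positions
    # of the sorted data instead of a random initialisation.
    centres = [data[(2 * j + 1) * len(data) // (2 * c)] for j in range(c)]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(c)]
        for v in data:
            nearest = min(range(c), key=lambda j: abs(v - centres[j]))
            clusters[nearest].append(v)
        # Move each centre to the mean of its members; keep it if empty.
        updated = [sum(g) / len(g) if g else centres[j]
                   for j, g in enumerate(clusters)]
        if updated == centres:
            break
        centres = updated
    return clusters


def cluster_medians(clusters):
    """Median of every non-empty cluster, taken as a candidate frequency."""
    return [median(g) for g in clusters if g]


# Hypothetical frequencies (Hz) at which the SST energy is significant.
energetic_freqs = [1.48, 1.50, 1.52, 4.10, 4.12, 4.14, 7.95, 8.00, 8.05]
print(cluster_medians(kmeans_1d(energetic_freqs, 3)))
```

The medians returned here are only candidates; as in the text, each one still has to pass the phase check before it is accepted as a modal frequency.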

Fig. 10 Details of BRNS building

Fig. 11 Earthquake responses and scalograms: a recorded motion on 03/09/09 and top floor response in x-direction; b wavelet coefficients of top floor response in x-direction; c synchrosqueezed transform coefficients of top floor response in x-direction; d recorded motion on 03/09/09 and top floor response in y-direction; e wavelet coefficients of top floor response in y-direction and f synchrosqueezed transform coefficients of top floor response in y-direction

Fig. 12 a Equivalent strut model of the building, b Fourier amplitude spectrum of response on 03/09/09 and c Fourier amplitude spectrum of response on 21/09/09

In Table 3, the first column refers to the discrete model, in which the infill walls are represented by diagonal struts, as shown in Fig. 12a. This model is created as per the Indian Standard guidelines for reinforced concrete structures, and the strut is modelled as proposed by Mondal and Jain [28], which is based on the static analysis of a 2D frame with infills. Therefore, the discrete model is the best possible representation of the field structure and not an exact one. However, the discrete model is tuned so that it replicates the fundamental modes obtained from Fourier analysis as closely as possible. Error estimation using these values is therefore not carried out here, as an exact benchmark is not available. In the absence of actual benchmark values, only the consistency of the modes estimated by the proposed algorithm is checked, and they are found to be in accordance with the values obtained from the discrete model and from another identification strategy [26]. To study the behavior of the full-scale building, responses recorded on two different occasions are reported here, namely the seismic events of 3rd September 2009 and 21st September 2009. The peak ground accelerations of the major components on these two dates are 0.0138 g and 0.026 g, respectively, which are far below the level used to design the structure. Hence, damage due to these two successive events is not expected. Moreover, the dates of occurrence are so close that the structure does not experience any change in material properties due to weathering or other surrounding activities. Figure 12b, c shows the Fourier transforms of the X and Y components of the acceleration response at the top floor. These amplitude spectra clearly show a difference in the fundamental frequencies, which is due to the uncertainty associated with field experiments involving non-stationary excitation. Table 3 shows the identified modal parameters for these two seismic events. Although the values differ between the two events, the estimates from the proposed algorithm follow the same trend in both. These deviations should not be attributed to a change in structural properties (i.e., damage) or to underperformance of the identification strategy. As the non-stationary input has frequencies close to the structural frequencies, they are bound to interact. In this context, it may be noted that non-stationary excitation is not recommended for damage detection, which is better carried out using the free response or the response due to broadband excitation (if possible). The authors wish to clarify that the proposed algorithm is applied to the actual building under seismic excitation not to locate possible damage, but to pinpoint the dominant frequencies and their variation across seismic events for the tuning of a passive controller, which is not discussed here as it is beyond the scope of this work. Once the frequencies are identified, modal damping is estimated using NExT, as described in the previous example. Table 3 shows the identified modal damping of the BRNS building (Figs. 13, 14).

Fig. 13 k-means clustering: a median of clusters for response due to recorded earthquake on 03/09/09 and b median of clusters for response due to recorded earthquake on 21/09/09

Fig. 14 Phase portrait for seismic event on 03/09/09: a response in 1st cluster; b response in 4th cluster; c phase difference in 1st cluster and d phase difference in 4th cluster

Table 3 Identified modal parameters of BRNS building from different seismic events

4.3 Laboratory experiment

The final problem considered in this study is a thin beam with lumped masses at different locations. The test was carried out by Adhikari et al. [3] in the Bristol Laboratory for Advanced Dynamic Engineering, and the data are freely available on the internet [1]. Figure 15 shows a schematic diagram of this beam. The beam is 1.2 m long and 2.05 mm thick, with both ends fixed. Twelve discrete masses, each weighing 2 g, are placed on the beam, and their locations can be changed for different tests. Table 4 summarizes these locations for the tests used in this study. Together, the discrete masses contribute only \(1.6\%\) of the mass of the beam. Three accelerometers are placed at 23 cm, 50 cm and 102 cm from the left end. As the locations of the discrete masses change, the frequencies of the beam also change. The beam is excited by an impulse at the middle, and the responses, shown in Fig. 16, are recorded at a sampling rate of 16,384 Hz. The data are recorded in an FFT analyzer, and the time-domain data are subsequently generated by inverse Fourier transformation after reducing the sampling rate to 4096 Hz. The peak acceleration responses at all sensors are very high. It may be noted that the beam is very thin and its effective length is large, so that a large number of frequencies can be excited. The beam is excited by an impulse hammer, which produces a large acceleration response with significant energy in the higher modes. In this context, it may be noted that the proposed algorithm is based on the frequency distribution in the scalogram and its phase spectrum. Hence, the proposed algorithm never uses the absolute amplitude of the response for modal identification. In this example, three different wavelet basis functions are used to study their relative performance on the same problem. First, the complex Morlet basis is used with \(n_{\mathrm{s}}=1024\), i.e., \(2^{10}\).
The iteration in the proposed algorithm is continued until the process converges. The complex Morlet basis converges with \(n_{\mathrm{s}}=1255\), which remains the same for the Morse basis, while the log-normal basis converges with \(n_{\mathrm{s}}=5020\). Figure 17 shows the scalograms of the wavelet coefficients obtained from the three basis functions to demonstrate their relative performance, covering the frequency range of 0.001 to 800 Hz. As expected, the frequencies are localized in different regions of these scales (i.e., they are segregated into different bands). It can be easily shown that even the ridges and skeletons at these scales fail to separate out these frequencies. This, in turn, calls for better frequency resolution of the scalograms. With this in view, synchrosqueezing of the original wavelet coefficients is carried out, and the enhanced scalograms are plotted in the second column of Fig. 17, which shows that the resolution improves drastically due to synchrosqueezing. Here, only the scalograms of the response from sensor 2 in test 1 are shown to avoid repetition. However, these scalograms are still not clear enough to identify the modal frequencies distinctly. Thus, k-means clustering-based machine learning is adopted, as described in the proposed algorithm. The optimal number of clusters is identified in the same way as described in the algorithm and is found to be 15 for all three bases. Figure 18 shows the cluster weights corresponding to the median frequencies for the three tests. It may be noted that the relative weight of each cluster indicates the energy associated with that frequency. To avoid any false alarm (i.e., spurious modes), phase portraits are then obtained, as described in Eq. (29), from the signals reconstructed by inverse synchrosqueezing. Figure 19 shows the differences in phase angle obtained from the three sensors. Here, only the 2nd and 4th clusters are used for demonstration. The first row of Fig. 19 shows the time histories corresponding to the median frequencies of the 2nd and 4th clusters, obtained from inverse synchrosqueezed transformation. The second row of the same figure shows the instantaneous phase differences for all three sensors. In the second cluster, the phase differences are found to be in unison (i.e., attaining peaks and zero crossings simultaneously), indicating that it corresponds to a modal frequency. Table 5 summarizes the modal frequencies obtained using the proposed algorithm. Altogether, 14 modal frequencies are identified in each test with errors well below \(5\%\), except in the 1st mode, where the error is \(7\%\). It may be noted that \(f_{\mathrm{exp}}\) in Table 5 is obtained for each test from the FFT analyzer and not from an exact theoretical model. Thus, \(f_{\mathrm{exp}}\) reported in Table 5 also contains measurement error. However, these values are preferred over theoretical values, as they are observed during the experiment. Hence, the magnitude of error reported for the present algorithm is not absolute. Moreover, the relatively high error in mode 1 is due to the weak energy concentration in the 1st mode and should not be attributed to a drawback of the proposed algorithm. The weakness of the 1st modal frequency is also reported by Adhikari and Phani [4]. The frequency response functions (FRF) are obtained from the identified modal parameters and compared with the experimental observations, as shown in Fig. 20. A close match in frequencies is observed, indicating accurate estimation of these parameters. The difference in shape between the identified and experimental FRF is attributed to the approximate damping ratio used in the identified FRF. Here, it may be mentioned that an effort was made to identify the modal damping of this thin beam.
However, as the beam is very flexible with very high modal frequencies, the estimation of viscous damping suffers from large errors, and hence an approximate value of modal damping is used, as suggested by Adhikari et al. [3]. Finally, the performance of different basis functions is investigated, and the results are tabulated in Table 6. Here, only the first five frequencies, up to two decimal places, are used for comparison. It can be observed from this table that the choice of basis function does not affect the quality of estimation in the proposed algorithm, as the same frequencies are identified using the three different bases. However, the number of scales differs between basis functions. This is another advantage of the proposed algorithm, which automatically adjusts the scales to achieve the desired accuracy in estimation.
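The construction of an FRF from identified modal parameters, and the sensitivity of its shape to the assumed damping, can be illustrated with a modal superposition sketch. It is not the authors' procedure. It assumes viscous damping and a receptance (displacement per unit force) form, and the residues stand in for mode-shape products that are not given in the text. All numerical values are hypothetical.

```python
import math


def receptance_frf(freq_hz, modal_freqs_hz, damping_ratios, residues):
    """Complex receptance at `freq_hz` as a sum of single-mode terms.

    Each mode r contributes A_r / (w_r^2 - w^2 + 2j * zeta_r * w_r * w),
    where A_r is an assumed residue (mode-shape product over modal mass).
    """
    omega = 2.0 * math.pi * freq_hz
    total = 0j
    for f_r, zeta_r, a_r in zip(modal_freqs_hz, damping_ratios, residues):
        omega_r = 2.0 * math.pi * f_r
        total += a_r / complex(omega_r ** 2 - omega ** 2,
                               2.0 * zeta_r * omega_r * omega)
    return total


# Hypothetical modal data: two modes with unit residues.
modes_hz = [10.0, 27.5]
unit_residues = [1.0, 1.0]
for zeta in (0.01, 0.02):
    peak = abs(receptance_frf(10.0, modes_hz, [zeta, zeta], unit_residues))
    print(zeta, peak)
```

In this form the peak locations are set by the modal frequencies, while the peak heights scale roughly with the inverse of the damping ratio. This is consistent with the observation that an approximate damping value changes the shape of the identified FRF but not the frequencies at which it peaks.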

Fig. 15 Schematic diagram of experimental setup [3]

Fig. 16 Recorded time histories: a test-1, b test-2 and c test-3

Based on these results, it can be inferred that the modal frequencies are identified satisfactorily without any prior knowledge of the system or input and without any user intervention.

Fig. 17 Scalograms obtained from WT (1st column) and SST (2nd column) of sensor 2 for test 1; a, b complex Morlet; c, d log-normal wavelet and e, f Morse wavelet

Fig. 18 Median of each cluster estimated from sensor 2 data: a test-1, b test-2 and c test-3

Table 4 Mass locations for different tests (in cm)
Table 5 Identified modal frequencies of thin beam
Table 6 Identified frequencies from sensor 1 in test 1 with different wavelet bases
Fig. 19 Phase portrait: a response in 2nd cluster; b response in 4th cluster; c phase difference in 2nd cluster and d phase difference in 4th cluster

Fig. 20 FRF at sensor location 2 for test-1

5 Conclusion

In this study, synchrosqueezed wavelet transformation is combined with unsupervised k-means clustering to identify modal parameters. Synchrosqueezing offers better frequency resolution, while k-means clustering-based machine learning helps to segregate the frequency localization without user intervention. The proposed identification strategy is applied to different engineering problems to demonstrate its efficiency and accuracy. The major observations from this study are as follows:

  • Wavelet transformation has proved to be an efficient tool for modal identification. However, wavelet coefficients alone fail to pinpoint the modal frequencies, as shown in the different scalograms, and demand either user intervention (which is often subjective) or additional mathematical processing for efficient frequency tracking. The second option is implemented here in two steps: (i) improving frequency resolution by synchrosqueezing and (ii) clustering-based identification of energy concentration in different scales.

  • Synchrosqueezing operates on the wavelet coefficients to improve the clarity of the scalogram (i.e., better resolution). Hence, it retains the characteristics of the wavelet scalogram, except that it narrows the frequency bands that contain signal energy.

  • k-means clustering can efficiently identify the energy localization without any user intervention. However, it may return frequencies that do not correspond to modal vibration; in the proposed technique, these are detected using the phase portrait. This involves inverse synchrosqueezed transformation of the coefficients in the clusters identified by the k-means algorithm. Once the modal frequencies are identified, the modal damping ratios can also be estimated using any standard algorithm (e.g., NExT).

  • The numerical applications presented in this study show that the proposed algorithm works efficiently for a wide range of problems. In particular, the second and third examples demonstrate that the large number of modal frequencies often observed in civil and mechanical vibrations can be traced effectively. In this context, it may be noted that different basis functions (e.g., complex Morlet, log-normal, generalized Morse) offer the same quality of end result. Hence, any of them can be adopted for frequency tracking.

In general, the proposed synchrosqueezed wavelet transformation-based clustering technique has proved to be an effective tool for vibration-based system identification. The numerical results show its accuracy in different civil and mechanical problems. The algorithm can be readily adapted for damage detection and for real-time frequency tracking in control problems, which the authors wish to address in future work.