Abstract
In this article, an original data-driven approach is proposed to detect both linear and nonlinear damage in structures using output-only responses. The method combines variational mode decomposition (VMD) and the generalized autoregressive conditional heteroscedasticity (GARCH) model for signal processing and feature extraction. To this end, VMD first decomposes the response signals into intrinsic mode functions (IMFs), and a GARCH model is then fitted to represent the statistics of each IMF. The GARCH coefficients of the IMFs construct the primary feature vector. Kernel-based principal component analysis (KPCA) and kernel discriminant analysis (KDA) are utilized to reduce the redundancy of the primary features by mapping them to a new feature space. The informative features are then fed separately into three supervised classifiers: support vector machine (SVM), k-nearest neighbor (kNN), and fine tree. The performance of the proposed method is evaluated on two experimental scaled models in terms of linear and nonlinear damage assessment. Kurtosis and ARCH tests confirm the suitability of the GARCH model. The results demonstrate that the proposed technique reaches accuracies of 100% and 98.82% in classifying linear and nonlinear damage, respectively. Moreover, its accuracy remains above 80% in the presence of noise with a signal-to-noise ratio (SNR) above 10 dB.
1 Introduction
Today’s structural engineering industry requires attention to be directed towards structural health monitoring (SHM) and optimizing safety. With forecasts of an increasing world population, structural infrastructure will be subject to increased loading and deformation. To decrease the effects and consequences of structural deterioration, SHM must be performed more frequently, with high levels of accuracy necessary to achieve asset preservation. Hence, there has been a surge of interest in SHM and the development of automated defect evaluation systems in an attempt to maintain existing structural networks and allow for asset expansion.
Concerning structural behavior, damage leads to deviations in the structure’s dynamic characteristics and is considered a reliable indicator for anomaly diagnosis. It can also cause a system with typically linear behavior to exhibit nonlinear responses, including cracking, impacts and rattling, delamination, stick or slip, rub, or deformation in connections [1, 2]. Nonlinear behavior is less predictable and more sophisticated than linear behavior. As a case in point, experimental investigation has proven that natural frequencies can rise instead of decrease in the breathing-crack phenomenon [3]. This reaction originates from the fact that the crack alternately opens and closes during the experimental test. Consequently, the detection of nonlinear anomalies is considered more challenging than that of linear damage [4].
Over the decades, researchers have proposed several techniques for anomaly identification. Generally speaking, such methods are divided into physics-based (or model-based) and data-driven approaches [5]. In physics-based methods, anomalies are tracked by monitoring variations within the simulated responses of a structural numerical model [6]. This model is a detailed mathematical abstraction linking a studied system’s input and output variables through known or presumed properties [7]. Post-analysis is required to determine damage location and quantification. Finite-element methods (FEMs), boundary element methods (BEMs), and spectral finite-element methods (SFEMs) are some of the techniques used in this regard. FEMs are considered the most systematic of these owing to their flexibility in modeling complicated structures [8]. When damage occurs, particular parameters of the simulated models are updated according to response measurements. Optimization algorithms are typically used to minimize variations between experimental and numerical responses by comparing mechanical characteristics such as stiffness, damping, or mass [6].
Despite the broad potential of physics-based approaches in damage assessment, especially for the evaluation of complex systems such as multi-story buildings and multi-span bridges, they have some limitations. For example, exact modeling of a structure entails sufficient information regarding the different components of a monitored system, such as loading states, boundary conditions, material properties, and the precise coordinates of members. Moreover, optimization solutions commonly experience numerical instability as well as ill-conditioning [9]. The performance of such optimization techniques degrades substantially as the number of variables in the problem grows.
On the other side, data-driven SHM provides bottom–up solutions founded on tracking changes within the output signals, appropriate for complex systems where knowledge about geometries, properties, and initial conditions is limited [5]. Any sudden changes in the output signals are observed and analyzed through signal processing tools and pattern recognition procedures to determine probable damage. Independence from an initial model and prior knowledge makes data-driven SHM a faster technique and an economical, practical solution for online SHM. Signal processing techniques synthesize, modify, and analyze the recorded responses and highlight different features in the time, frequency, and time–frequency domains. Machine learning algorithms are typically employed in conjunction with such methods to identify and interpret the features extracted from signals and to recognize the generated patterns. Machine learning includes clustering, regression, neural networks, ensemble learning, deep learning, Bayesian methods, instance-based methods, decision trees, and dimensionality reduction [10].
Data-driven methods are helpful compared to physics-based techniques when [11], first, the structure’s physical characteristics are unavailable or challenging to model; second, there is an adequate number of sensors installed for capturing the structure’s responses; and third, the computational operations are costly in the SHM project. In addition, multi-physics models, which involve several physical processes in a system (e.g., thermal interactions, water precipitation, and magnetostatic and chemical reactions), may not be efficient for utilizing a large amount of sensor data. The accuracy of physics-based methods depends on the response measurements; the best performance is achieved in an environment with the least amount of noise. In real-world structures, however, and especially under in-service conditions, the amount of noise is considerable. As such, data-driven damage identification deploying actual responses has revealed preferable adaptability and has thereby become an inspiring solution in the realm of SHM [10].
1.1 Need for research
Although nonlinear damage has been studied before and practical solutions have been proposed in this realm, most studies focus on damage identification, the first level of Rytter’s classification in SHM [12]. Hence, limited research has been conducted to reach higher levels (e.g., damage localization and classification). This study attempts to address nonlinear damage detection in building structures through a robust data-driven approach. Adverse conditions, such as environmental and operational effects, in recording responses and analyzing signals are other crucial points that should be considered. These issues become more challenging in the case of buildings, where story correlations can affect the structural responses. Therefore, proposing a robust model with appropriate precision in identifying different kinds of linear and nonlinear anomalies while considering these issues leads to a practical approach for assessing real-world structures under adverse conditions.
Accordingly, the rest of the paper is organized as follows. In Sect. 2, related works are discussed, and gaps are highlighted once again. Case studies are presented in detail in Sect. 3. Section 4 provides the details of the proposed data-driven approach. Experimental results and discussion are given in Sect. 5. Finally, Sect. 6 concludes the work and suggests future directions.
2 Background
Signal processing techniques play a fundamental role in data-driven SHM for analyzing responses in the time, frequency, or time–frequency domains. Fourier spectra, spectrum analysis, difference-frequency analysis, and the high-frequency resonance technique are appropriate for damage identification, especially for gear faults and roller bearings [13]. Wavelets have proved efficient for damage and deterioration detection in building structures based on a stochastic approach [14]. The Fourier transform (FT) and fast Fourier transform (FFT) are considered the main concepts for anomaly detection. A time-series model is a promising tool for simulating and predicting structural signals in the time domain. Since this method is based on a partial structural dynamics model, it can identify even small vibrations [15]. In this area, autoregressive (AR) models have been investigated for damage and deterioration detection in buildings and bridges [16–18]. The autoregressive moving average (ARMA) model, as well as the generalized autoregressive conditional heteroscedasticity (GARCH) model, has proved to be beneficial for nonlinear damage identification in building specimens [19]. Transient behaviors caused by damage or adverse environmental conditions can be recognized through a signal’s time–frequency form [20].
From a broad perspective, real-world signals are nonlinear and nonstationary and are coupled with noise. Consequently, linear signal processing techniques, such as spectral analysis, are not appropriate in this scope [21]. The Hilbert–Huang transform (HHT), introduced by Huang et al. [22], consists of two sequential steps. The first step, called empirical mode decomposition (EMD), separates the complicated initial signal into a determined and commonly limited number of intrinsic mode functions (IMFs), or modes, and a residue. Each mode is an oscillatory function with time-varying frequencies that reveals the input signal’s local features and corresponds to a different frequency band [23, 24]. The algorithm detects the maxima/minima recursively, estimates the envelopes using the extrema, and removes the average envelope, which isolates the high-frequency bands [25]. In the next step, the Hilbert transform (HT) produces each IMF’s orthogonal pair with a 90° phase difference [26]. As a result, each IMF and its corresponding pair can evaluate instantaneous variations of signal magnitude and frequency with respect to time. Compared to wavelet analysis and the Fourier transform, EMD benefits from tracing out the IMFs by interpolating between the extrema instead of using any given wavelet basis. Despite the wide usage of EMD in a variety of time–frequency applications, such as medicine [27], economics [28], climate prediction [29], SHM [30, 31], and many other fields, it may face some issues, such as sensitivity to noise and sampling frequency, which make its performance rely on the frequency ratio [25, 31, 32].
Some modified algorithms have been developed to address these limitations, including ensemble EMD (EEMD), complete ensemble EMD with adaptive noise (CEEMDAN), and variational mode decomposition (VMD) [32]. VMD is a relatively new algorithm that decomposes a signal into distinct amplitude- and frequency-modulated sub-signals that together reproduce the primary input signal [32]. This approach is entirely non-recursive, and the sub-signals are extracted simultaneously; VMD has been shown to outperform the EMD algorithm in various areas, such as signal analysis and damage detection.
Variational mode decomposition has been deployed in real SHM applications by some researchers. For instance, Bagheri et al. [31] calculated damping ratios for each modal response extracted by VMD. The mode shape vector was obtained for each decomposed structural mode and was then applied for damage identification in three case studies, including numerical, experimental, and field specimens. Xin et al. [33] established two damage indices relying on modal parameters obtained from VMD. Experimental and numerical assessments demonstrated the efficiency of the method in finding the location and severity of nonlinear damage scenarios in the models. Das and Saha [34] investigated the impact of a heavy-noise environment on a new hybrid algorithm combining VMD with frequency-domain decomposition (FDD). It was deduced that the hybrid method could detect the damage location accurately for noise levels above 20%. A novel methodology is illustrated and assessed in the following sections on two experimental specimens with linear and nonlinear damage scenarios.
3 Case studies
In this section, two case studies used in this work are thoroughly explained and discussed.
3.1 Case study 1: linear damage
The first case study is a three-story metal frame with aluminum columns and floors, investigated for linear damage simulation [35]. A roller at the base supports the specimen, which can be moved horizontally using a hydraulic jack. Each floor is instrumented with a piezoelectric single-axis accelerometer. Nine scenarios are imitated by reducing the stiffness of columns and by adding a 1.2 kg mass. Fifty signals are recorded for each state at a sampling rate of 320 Hz; therefore, 450 signals are acquired for all scenarios, as illustrated in Table 1. As depicted, there are nine states: the healthy condition (S1), representing the intact structure without any changes in components; two scenarios simulating operational and environmental effects by changing the mass of floors (S2 and S3); and six damage scenarios realized by changing the stiffness of columns (S4–S9).
Additionally, Fig. 1 presents sample recorded signals in different scenarios, where \(y_{1} (t)\), \(y_{2} (t)\), and \(y_{3} (t)\) represent the data recorded by sensors 1, 2, and 3, respectively. It is evident that the recorded responses for all damage scenarios follow a random pattern, and time-domain data alone cannot discriminate damaged states from healthy cases. Thus, the output responses need to be modeled through signal processing techniques to find suitable features indicating variations in the signals.
3.2 Case study 2: nonlinear damage
This case is the adjusted model of the first case study and is used to study the impact of nonlinear damage. The sampling rate is close to that of the linear model and is set to 322.58 Hz, with 8192 data points for each record. Ten measurements are recorded for each state. As in the initial specimen, this frame glides on rails that enable translation in one direction with the aid of an actuator. Four accelerometers with a sensitivity of 1000 mV/g are attached at the center of the floors on the side opposite the shaker; thus, they cannot capture the specimen’s torsional modes.
To simulate nonlinear damage, a mechanical bumper and a center column are installed on the frame. This mechanism imitates a breathing crack and causes nonlinear behavior whenever the installed column hits the bumper, which is placed on the second floor. The adjustable gap between the bumper and the installed column defines different degrees of nonlinearity: the larger the gap, the smaller the nonlinear behavior. The specimen’s outline and the damage scenarios are provided in Fig. 2 and Table 2, respectively. Some recorded nonlinear signals are given in Fig. 3, where \(y_{1} (t)\), \(y_{2} (t)\), \(y_{3} (t)\), and \(y_{4} (t)\) represent the data recorded by sensors 1, 2, 3, and 4, respectively. Similar to the previous case, the time-domain representation of the responses cannot properly indicate variations due to damage.
As noted, two three-story models were presented for linear and nonlinear damage scenarios. Linear damage was simulated by reducing the cross-sectional area of columns, while nonlinear behavior was produced by a mid-column hitting a bumper in the second case study. Environmental and operational conditions were also considered by adding a mass in different damage scenarios. Story accelerations were recorded for damage identification and classification, with a novel methodology discussed in the following section.
4 Proposed method
In this work, anomaly detection is performed in three steps. First, VMD decomposes the signal into several sub-signals with separated bandwidths. Second, primary features are extracted using time-series modeling, and the number of features is then reduced by KPCA and KDA. Finally, three supervised classifiers are separately deployed to discriminate different damage states within the two specimens. A schematic workflow of the proposed method is depicted in Fig. 4. In the following, these stages are described thoroughly.
4.1 Signal processing
Herein, the input acceleration signals are decomposed using VMD, so that an input signal \(S(t)\) is broken down into \(d\) limited-bandwidth IMFs depicted as [36]
where \(A_{k} (t)\) and \(\omega_{k} (t)\) represent the instantaneous amplitude and frequency of \(u_{k} (t)\), respectively. The constrained variational problem is constructed using the Hilbert transform as follows:
such that
where \(\partial_{t}\) denotes the partial derivative with respect to \(t\); \(\{ u_{k} (t)\} = \{ u_{1} (t),...,u_{d} (t)\}\) and \(\{ \omega_{k} \} = \{ \omega_{1} ,...,\omega_{d} \}\) denote the IMFs of the signal \(S(t)\) and the center frequency of each sub-band, respectively. Equation (2) is recast as a Lagrange function using \(\lambda\) and \(\alpha\) as the multiplier operator and penalty factor, respectively, to solve the optimization problem
Afterward, Eq. (4) is transformed into the time–frequency space, and the equivalent extremum solution is solved to obtain the frequency-domain form of the modal element \(u_{k} (t)\) as well as the center frequency \(\omega_{k}\)
Finally, the alternative direction of multipliers (ADMM) is deployed to optimize the constrained variational model. Subsequently, the initial signal \(S(t)\) is broken down by \(d\) IMFs as described in the following:
-
Initialize the parameters \(\{ u_{k} \} ,\{ \omega_{k} \} ,\{ \lambda^{1} \}\), and set \(n \to 0\).
-
Update \(u_{k}^{n + 1}\) and \(\omega_{k}^{n + 1}\) according to Eqs. (5) and (6).
-
Update \(\lambda^{n + 1}\) as stated in
$$\lambda^{n + 1} (\omega ) = \lambda^{n} (\omega ) + \tau \left( {f(\omega ) - \sum\limits_{k} {u_{k}^{n + 1} (\omega )} } \right).$$(7)
-
Repeat steps 2 and 3 until the following criterion is satisfied:
$$\sum\limits_{k} {\frac{{\left\| {u_{k}^{n + 1} - u_{k}^{n} } \right\|_{2}^{2} }}{{\left\| {u_{k}^{n} } \right\|_{2}^{2} }}} < \varepsilon .$$(8)
Provided that the above condition is met, the iteration stops; otherwise, the procedure returns to step 2, and finally \(d\) IMFs are extracted [31, 36]. In Figs. 5, 6, 7, and 8, the IMFs of linear and nonlinear signals are shown. Due to space limitations, only the first two IMFs are presented.
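The frequency-domain ADMM iteration described above can be sketched in NumPy. This is a minimal, illustrative implementation only: the mirror extension and analytic-signal bookkeeping of the full algorithm are simplified, and the function name `vmd`, the initialization of the center frequencies, and the default parameters are our own choices, not the paper's.

```python
import numpy as np

def vmd(signal, K=2, alpha=2000.0, tau=0.0, tol=1e-7, max_iter=500):
    """Minimal variational mode decomposition sketch (frequency-domain ADMM)."""
    T = len(signal)
    f_hat = np.fft.fftshift(np.fft.fft(signal))
    freqs = np.arange(T) / T - 0.5          # shifted normalized frequency axis
    half = freqs >= 0                       # keep only the one-sided spectrum
    f_plus = np.where(half, f_hat, 0.0)
    u_hat = np.zeros((K, T), dtype=complex)
    omega = np.linspace(0.05, 0.45, K)      # illustrative initial center frequencies
    lam = np.zeros(T, dtype=complex)        # Lagrange multiplier
    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            others = u_hat.sum(axis=0) - u_hat[k]
            # Wiener-filter update of mode k around its current center frequency
            u_hat[k] = (f_plus - others + lam / 2) / (1 + 2 * alpha * (freqs - omega[k]) ** 2)
            # center-frequency update: power-weighted mean frequency of the mode
            p = np.abs(u_hat[k, half]) ** 2
            omega[k] = np.sum(freqs[half] * p) / (np.sum(p) + 1e-16)
        lam = lam + tau * (f_plus - u_hat.sum(axis=0))   # dual ascent, Eq. (7)
        diff = np.sum(np.abs(u_hat - u_prev) ** 2) / (np.sum(np.abs(u_prev) ** 2) + 1e-16)
        if diff < tol:                      # convergence test, Eq. (8)
            break
    # back to the time domain; factor 2 compensates the one-sided spectrum
    modes = np.empty((K, T))
    for k in range(K):
        modes[k] = 2 * np.real(np.fft.ifft(np.fft.ifftshift(u_hat[k])))
    return modes, omega
```

Applied to a two-tone test signal, the sketch recovers one narrow-band mode per tone, and the sum of the modes approximately reconstructs the input.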
4.2 Feature extraction
4.2.1 GARCH modeling of IMFs
Generally speaking, a signal can be modeled via an ARMA time series to evaluate the conditional mean. As an illustration, the ARMA(p, q) prediction of the conditional mean is formulated as [37]
where p denotes the autoregressive model order, \(\varphi_{i}\) the autoregressive coefficients, q the moving average model order, \(\theta_{j}\) the moving average coefficients, \(\varepsilon_{t}\) the residual, and c a constant. The residual is usually assumed to have a mean of zero and constant variance; in some time series, however, it is heteroscedastic and does not have constant variance [37]. In this case, the time-varying variance is called the conditional variance, which is described as
The GARCH model, established by Bollerslev [38], is a dynamic model that addresses the conditional heteroscedasticity, or volatility clustering, of an innovation process using a weighted combination of past conditional variances coupled with past squared residuals. This reduces the number of parameters and the complexity of the model. A \({\text{GARCH}}(r,m)\) model for the conditional variance of the residual \(\varepsilon_{t}\) is formed as
in which \(\beta\), \(b_{i}\), and \(a_{j}\) are the parameters of the GARCH model. Herein, the following constraints are defined to ensure that the conditional variance is positive:
Moreover, the following formula is defined to make the covariance stationary:
This paper utilizes the GARCH model to create the conditional variance model for the IMFs obtained from VMD. The GARCH model has shown reliable performance in nonlinear problems, as discussed in [19]. The coefficients of \({\text{GARCH}}(r,m)\), i.e., \(\{ b_{i} \}\) and \(\{ a_{j} \}\), are considered as features. Hence, the kth IMF is described by \(\left\{ {b_{1}^{(k)} , \ldots ,b_{r}^{(k)} ,a_{1}^{(k)} , \ldots ,a_{m}^{(k)} } \right\}\). Considering \(d\) IMFs, the feature vector of a signal with \(d(r + m)\) features, \({\mathbf{f}}_{(d(r + m)) \times 1}^{{}}\), is constructed as
Finally, since each signal is recorded by several sensors, each record is described with \(n_{f} = \sum\nolimits_{i = 1}^{n} {d_{i} (r + m)}\) features, where \(n\) is the number of sensors and \(d_{i}\) is the number of IMFs used to decompose the signal of the ith sensor. Hence, the feature vector of a signal with \(n\) sensors is given as \({\mathbf{f}} = \left[ {{\mathbf{r}}_{1}^{{\text{T}}} , \ldots ,{\mathbf{r}}_{n}^{{\text{T}}} } \right]^{{\text{T}}}\). Not all obtained features are suitable for classification, and the feature vectors may contain redundant features. Hence, feature reduction techniques should be utilized to remove such features from the feature vector.
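Assuming the GARCH coefficients have already been estimated (in practice by maximum likelihood, e.g., with a statistics package), the conditional-variance recursion, the positivity and stationarity constraints, and the assembly of the per-record feature vector can be sketched as follows. The function names are illustrative, not from the paper.

```python
import numpy as np

def garch_cond_variance(eps, beta, b, a):
    """GARCH(r, m) recursion:
    sigma2[t] = beta + sum_i b_i * sigma2[t-1-i] + sum_j a_j * eps[t-1-j]**2."""
    b = np.atleast_1d(np.asarray(b, float))
    a = np.atleast_1d(np.asarray(a, float))
    assert beta > 0 and np.all(b >= 0) and np.all(a >= 0)  # positivity constraints
    assert b.sum() + a.sum() < 1                           # covariance stationarity
    T = len(eps)
    # start the recursion at the unconditional variance beta / (1 - sum b - sum a)
    sigma2 = np.full(T, beta / (1.0 - b.sum() - a.sum()))
    for t in range(max(len(b), len(a)), T):
        sigma2[t] = (beta
                     + sum(b[i] * sigma2[t - 1 - i] for i in range(len(b)))
                     + sum(a[j] * eps[t - 1 - j] ** 2 for j in range(len(a))))
    return sigma2

def garch_feature_vector(coeffs_per_imf):
    """Stack the {b_i} and {a_j} coefficients of all d IMFs into one feature vector."""
    return np.concatenate([np.r_[np.atleast_1d(b), np.atleast_1d(a)]
                           for b, a in coeffs_per_imf])
```

When the squared residuals are held at the unconditional variance, the recursion stays at that value, which is a quick sanity check of the implementation.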
4.2.2 Feature reduction
The general concept of kernel-based feature reduction is to deploy a particular nonlinear mapping function to project the initial vector f into a high-dimensional feature space F. In the new feature space, the principal components are obtained through regular principal component analysis (PCA). In other words, the nonlinear principal components in the initial space correspond to the principal components in feature space F. Kernel functions, including the polynomial, radial basis function, and sigmoid kernels, are used to perform the nonlinear mapping in KPCA [39].
Assume a nonlinear mapping \(\phi\); the initial data space \({\mathbb{R}}^{{n_{f} }}\) is mapped into a new feature space \({\rm H}\) as [40]
for a training sample set \({\mathbf{f}}_{1} ,{\mathbf{f}}_{2} ,...,{\mathbf{f}}_{M}\) in \({\mathbb{R}}^{{n_{f} }}\), where \(M\) denotes the number of training samples. Subsequently, the covariance matrix is formulated as [40]
such that
Since \({\mathbf{S}}_{{}}^{\phi }\) is a bounded, compact, positive, and symmetric matrix, its nonzero eigenvalues are positive. To find these nonzero eigenvalues, Schölkopf et al. [41] suggested linearly expressing every eigenvector of \({\mathbf{S}}_{{}}^{\phi }\) by [40]
To compute expansion coefficients, the Gram matrix is formed as \(\tilde{R} = {\mathbf{Q}}^{{\text{T}}} {\mathbf{Q}}\), where \({\mathbf{Q}} = [\phi ({\mathbf{F}}_{1} ),...,\phi ({\mathbf{F}}_{M} )]\). Consequently, each component \({\mathbf{Q}}\) is computed using kernel tricks as [40]
Accordingly, \({\tilde{\mathbf{R}}}\) is centralized by [40]
where
Afterward, the orthonormal eigenvectors \(\gamma_{1} ,\cdots,\gamma_{{n_{{\text{p}}} }}\) of \({\tilde{\mathbf{R}}}\) are calculated, corresponding to the \(n_{{\text{p}}}\) largest positive eigenvalues, such that \(\lambda_{1} \ge \lambda_{2} \ge \cdots \ge \lambda_{{n_{{\text{p}}} }}\). Consequently, the corresponding orthonormal eigenvectors \(\beta_{1} ,\beta_{2} ,...,\beta_{{n_{p} }}\) of \({\mathbf{S}}_{{}}^{\phi }\) are obtained via [40]
After that, the KPCA transformed feature \({\mathbf{y}} = \left( {y_{1} ,...,y_{{n_{{\text{p}}} }} } \right)^{{\text{T}}}\) vector is obtained by the projection of the mapped sample \(\phi ({\mathbf{f}})\) onto the eigenvector \(\beta_{1} ,\beta_{2} ,...,\beta_{{n_{{\text{p}}} }}\) as formulated below [40]
The training matrix \({\mathbf{F}} = \left[ {{\mathbf{f}}_{1}^{{\text{T}}} ;\,\,{\mathbf{f}}_{2}^{{\text{T}}} ; \ldots ;{\mathbf{f}}_{M}^{{\text{T}}} } \right]^{{\text{T}}}\) with the size of \(n_{{\text{f}}} \times M\) is mapped to the matrix \({\mathbf{Y}} = \left[ {{\mathbf{y}}_{1}^{{\text{T}}} \,;\,\,{\mathbf{y}}_{2}^{{\text{T}}} ; \ldots ;{\mathbf{y}}_{M}^{{\text{T}}} } \right]^{{\text{T}}}\) with the size of \(n_{{\text{p}}} \times M.\)
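The KPCA steps above (Gram matrix, centering in the implicit feature space, eigendecomposition, and eigenvector normalization) can be condensed into a short NumPy sketch. The RBF kernel, the `gamma` parameter, and the function names are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # pairwise squared Euclidean distances -> Gaussian (RBF) kernel
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def kpca_fit(F, n_components, gamma=1.0):
    """Fit kernel PCA on a training matrix F of M samples (rows)."""
    M = F.shape[0]
    K = rbf_kernel(F, F, gamma)
    one = np.full((M, M), 1.0 / M)
    # center the Gram matrix in the (implicit) feature space
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    # scale eigenvectors so the feature-space eigenvectors are unit length
    alphas = vecs / np.sqrt(np.maximum(vals, 1e-12))
    return alphas, vals, Kc

# projections of the training samples: Y = Kc @ alphas (one row per sample)
```

A useful property for checking the implementation: the columns of `Kc @ alphas` are orthogonal, and each column's squared norm equals the corresponding eigenvalue.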
The aim of linear LDA is as follows [42]:
where \({\mathbf{S}}_{{\text{b}}}\) and \({\mathbf{S}}_{{\text{w}}}\) reveal the between-class and within-class scatter matrices, which are obtained as
where \({\varvec{\mu}}\) is the global mean, \(m_{k}\) stands for the number of samples in the kth class, and \({\varvec{\mu}}^{(k)}\) denotes the mean of the kth class. Afterward, the total scatter matrix is defined as \({\mathbf{S}}_{t} = {\mathbf{S}}_{{\text{b}}} + {\mathbf{S}}_{{\text{w}}}\). The optimum values of a correspond to the nonzero eigenvalues of the eigenproblem
A maximum of \(n_{{\text{c}}} - 1\) eigenvectors corresponding to nonzero eigenvalues are obtained, because the rank of \({\mathbf{S}}_{{\text{b}}}\) is limited to \(n_{{\text{c}}} - 1\). A mapping similar to (15) is considered to extend LDA to the nonlinear case. Hence, \({\mathbf{S}}_{{\text{b}}}^{\varphi }\), \({\mathbf{S}}_{{\text{w}}}^{\varphi }\), and \({\mathbf{S}}_{t}^{\varphi }\), respectively, stand for the between-class, within-class, and total scatter matrices in the feature space, which are obtained by the following formulation:
Assume that \({\varvec{\nu}}\) shows the projective function in feature space, and the associated objective function in feature space is defined as
This function can be solved via the eigenproblem
and we have
Then, we can define an equivalent problem as:
where \({\varvec{\alpha}} = [\alpha_{1} ,...,\alpha_{M} ]^{{\text{T}}}\). The corresponding eigenproblem is as \({\mathbf{KWK}}\varvec{\alpha} = \lambda {\mathbf{KK}}{\varvec{\alpha}}\), where K shows the kernel matrix, i.e., \(K_{ij} = \kappa ({\mathbf{y}}_{i} ,{\mathbf{y}}_{j} )\) and W is defined as
Each eigenvector \({\varvec{\alpha}}\) provides a projective function \({\varvec{\nu}}\) in the feature space. Let y be a data sample; then, we have
where \(\kappa \left( {:,{\varvec{y}}} \right) \doteq \left[ {\kappa \left( {{\varvec{y}}_{1} ,{\varvec{y}}} \right), \ldots ,\kappa \left( {{\varvec{y}}_{M} ,{\varvec{y}}} \right)} \right]^{{\text{T}}}\). Let \(\left\{ {{\varvec{\alpha}}_{1} , \ldots ,{\varvec{\alpha}}_{{n_{{\text{c}}} - 1}} } \right\}\) be the \(n_{{\text{c}}} - 1\) eigenvectors of the eigenproblem associated with nonzero eigenvalues. The transformation matrix \(\Theta = \left[ {{\varvec{\alpha}}_{1} , \ldots ,{\varvec{\alpha}}_{{n_{{\text{c}}} - 1}} } \right]\) is an \(M \times (n_{{\text{c}}} - 1)\) matrix that embeds a data sample y into the \((n_{{\text{c}}} - 1)\)-dimensional subspace by
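The eigenproblem \({\mathbf{KWK\alpha}} = \lambda {\mathbf{KK\alpha}}\) can be solved directly from a precomputed kernel matrix. The sketch below is a hedged illustration: the small ridge term added to \({\mathbf{KK}}\) for numerical stability is our own choice, and the function name is illustrative.

```python
import numpy as np

def kda(K, labels, n_components=None, ridge=1e-6):
    """Solve K W K a = lambda K K a for the projective coefficients a."""
    M = K.shape[0]
    classes = np.unique(labels)
    W = np.zeros((M, M))
    for c in classes:
        idx = np.where(labels == c)[0]
        W[np.ix_(idx, idx)] = 1.0 / len(idx)   # block with 1/m_k inside class k
    A = K @ W @ K
    B = K @ K + ridge * np.eye(M)              # ridge keeps B invertible
    vals, vecs = np.linalg.eig(np.linalg.solve(B, A))
    order = np.argsort(-vals.real)
    nc = (len(classes) - 1) if n_components is None else n_components
    return vecs[:, order[:nc]].real

# embedding of the training data: Z = K @ kda(K, labels)
```

With a linear kernel on two well-separated one-dimensional classes, the leading projection separates the classes completely, which makes a simple correctness check.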
4.3 Classification
In this step, three classifiers are applied to the previously selected features, called predictors. These classifiers are prevalent in the realm of machine learning: support vector machine (SVM), fine tree, and k-nearest neighbor (kNN). SVM is a supervised training algorithm founded on the fact that samples can be considered points in a feature space; the classes can be separated by a line in a two-dimensional problem and by a hyperplane in a higher-dimensional one [43]. Regarding kNN, despite its simplicity, it is commonly used with large training datasets. It assigns an estimated label to a new sample on the basis of a plurality or weighted vote of the k-nearest neighbors in the training set [44]. Classification using a decision tree (fine tree) algorithm is very fast and suitable for high-dimensional classification problems. A fine tree is a predictive algorithm mapping from observations about an item to conclusions about its target value; leaves represent the labels, nodes are the features, and branches denote the conjunction of features resulting in the label classification [45]. The predictions of these classifiers are compared with each other in the following sections.
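Of the three classifiers, kNN is the simplest to write down explicitly. The following sketch shows a plain Euclidean-distance, majority-vote kNN; the function name and defaults are illustrative, not a description of the toolbox implementations used in the paper.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """k-nearest-neighbor classification: Euclidean distance, majority vote."""
    # squared distances between every test sample and every training sample
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argsort(d2, axis=1)[:, :k]
    preds = []
    for row in nearest:
        labels, counts = np.unique(y_train[row], return_counts=True)
        preds.append(labels[np.argmax(counts)])   # plurality vote among neighbors
    return np.array(preds)
```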
5 Results and discussion
This section provides the experimental results and relevant discussions. Fivefold cross-validation was considered to assess the performance of the proposed method. To this end, the data were randomly partitioned into five equal-sized groups, and the training and testing procedures were repeated for five trials. In each trial, one group was used as the testing data, and the other groups were used to train the classifier. Finally, the results were averaged.
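The fivefold protocol described above can be sketched generically: shuffle the indices, split them into k folds, hold out one fold per trial, and average the per-trial test scores. The function name and the `train_and_score` callback are illustrative assumptions.

```python
import numpy as np

def kfold_accuracy(X, y, train_and_score, k=5, seed=0):
    """k-fold cross-validation: each fold is the test set exactly once."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # train_and_score(X_train, y_train, X_test, y_test) -> test accuracy
        scores.append(train_and_score(X[train], y[train], X[test], y[test]))
    return float(np.mean(scores))
```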
5.1 The effect of the number of IMFs on residual
The number of IMFs has a considerable effect on the number of extracted features and the complexity of the proposed method. Here, we determine the efficient number of IMFs based on the mean absolute value of the residuals, shown in Fig. 9 for different numbers of IMFs of the nonlinear signals. It is observed that the residual generally reduces as the number of IMFs increases. However, the slope of reduction varies among the sensors: the residuals of sensors 2, 3, and 4 reduce faster than that of sensor 1. As observed, the residual of sensor 1 does not vary significantly when the number of IMFs is greater than ten. On the other hand, the reduction in the residuals of sensors 2, 3, and 4 is not notable when the number of IMFs exceeds seven. Hence, we consider ten IMFs for sensor 1 and seven IMFs for the remaining sensors. Considering 31 IMFs and two features extracted from each IMF, each record is described with 62 features.
For the linear case, as observed in Fig. 10, the residuals of all sensors dwindle gradually at nearly the same pace. Beyond eight IMFs, the residual does not show significant changes. Thus, for the linear signals, eight IMFs are assigned to the sensors of all stories. Considering two features for each IMF, each record is denoted by 48 features.
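The residual-based selection can be sketched as follows. The mean absolute residual after subtracting the first d IMFs mirrors the quantity plotted in Figs. 9 and 10; the "curve flattens" stopping rule (`rel_tol`) is our own illustrative criterion, since the paper picks the count visually from the plots.

```python
import numpy as np

def mean_abs_residual(signal, modes):
    """Mean absolute residual after subtracting the first d IMFs, d = 1..K."""
    partial = np.zeros_like(signal, dtype=float)
    res = np.empty(modes.shape[0])
    for d in range(modes.shape[0]):
        partial = partial + modes[d]
        res[d] = np.mean(np.abs(signal - partial))
    return res

def pick_num_imfs(res, rel_tol=0.05):
    """Smallest d at which the residual curve flattens (illustrative rule)."""
    for d in range(len(res) - 1):
        if res[d] - res[d + 1] < rel_tol * res[0]:
            return d + 1
    return len(res)
```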
5.2 Classification accuracy
To assess the stability of the proposed method and evaluate the effect of features on results, the authors considered four cases as follows:
-
SA: no feature reduction method is employed
-
SB: only KPCA is used for feature reduction
-
SC: only KDA is employed for feature reduction
-
SD: KPCA is applied first and then KDA for feature reduction.
The number of features in conditions SB, SC, and SD is obtained based on the normalized cumulative summation of eigenvalues (NCSE): the efficient number of features is the index at which the NCSE first exceeds 0.95. Considering \(\left[ {\lambda_{1} , \cdots ,\lambda_{{n_{{\text{f}}} }} } \right]\) as the eigenvalues sorted in descending order, the NCSE is calculated as follows:
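In code, the NCSE rule reads as a cumulative sum of the sorted eigenvalues normalized by their total; the function name and return convention are illustrative.

```python
import numpy as np

def ncse_num_features(eigvals, threshold=0.95):
    """Normalized cumulative sum of eigenvalues; first count reaching threshold."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]   # descending order
    ncse = np.cumsum(lam) / lam.sum()
    # smallest number of retained features whose NCSE reaches the threshold
    return int(np.argmax(ncse >= threshold)) + 1, ncse
```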
Classification accuracy of the proposed method for nonlinear and linear data considering kNN, SVM, and fine tree classifiers and different lengths of signals obtained from sensors are given in Tables 3 and 4, respectively.
Concerning the nonlinear case, the minimum and maximum performance are observed in scenarios SA and SD, with accuracies of 76.92% and 98.82%, respectively. In all scenarios, the fine tree classifier is more efficient than the others. Moreover, kNN is the second most accurate classifier, and SVM exhibits the lowest performance in this case. It is noteworthy that the signal length has the highest impact on SB and the lowest on SD, with relative variations (\(\Delta_{\max }\)) of 9.09% and 3.69%, respectively.
Regarding the linear case study, the highest and lowest performance, as in the nonlinear case, are observed in SD and SA, with accuracies of 100.0% and 89.56%, respectively. Similar to the previous case, the fine tree is the most suitable classifier in all proposed scenarios. Except for SB, kNN outperforms SVM. Based on \(\Delta_{\max }\), scenario SB is the least sensitive to the signal length, whereas scenario SA is the most sensitive.
5.3 Confusion matrix
In this part, the classification performance for both case studies is presented through confusion matrices. From the confusion matrix, we compute the recall or sensitivity (Sens.), precision (Prec.), total accuracy (Acc.), and F-score, which are defined as

$$ {\text{Sens}}. = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FN}}}},\quad {\text{Prec}}. = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FP}}}},\quad {\text{Acc}}. = \frac{{{\text{TP}} + {\text{TN}}}}{{{\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}}},\quad F{\text{-score}} = \frac{{2 \times {\text{Prec}}. \times {\text{Sens}}.}}{{{\text{Prec}}. + {\text{Sens}}.}}, $$
where TP, TN, FP, and FN denote the true positive, true negative, false positive, and false negative, respectively.
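These four metrics follow directly from the confusion-matrix counts; a minimal sketch:

```python
def confusion_metrics(tp, tn, fp, fn):
    """Recall (sensitivity), precision, total accuracy, and F-score
    from confusion-matrix counts (true/false positives/negatives)."""
    sens = tp / (tp + fn)                      # recall / sensitivity
    prec = tp / (tp + fp)                      # precision
    acc = (tp + tn) / (tp + tn + fp + fn)      # total accuracy
    f_score = 2 * prec * sens / (prec + sens)  # harmonic mean of the two
    return sens, prec, acc, f_score
```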
The results for the linear damage case are given in Table 5, with the performance metrics computed for the nine scenarios described earlier. As indicated, the proposed method identifies all damage scenarios without error. Consequently, this approach attains the highest possible performance in discriminating the linear damage states considered in this study.
Regarding the nonlinear case study, 17 separate states of the specimen are predicted by the presented technique, and the results are summarized in the confusion matrix of Table 6. In the majority of the damage states, the prediction accuracy is 100%; for the remaining two of the seventeen scenarios, it is 90.0%. Hence, the established strategy recognizes both nonlinear and linear damage with considerable precision.
5.4 The effect of noise
Noise of various intensities, quantified by the signal-to-noise ratio (SNR), is applied to the responses to assess the stability of the proposed method, as depicted in Fig. 11. As observed, the proposed method remains effective even in severely noisy environments (SNR = 1). Furthermore, the approach maintains its performance against noise, showing insignificant variations for SNRs of 20 and 15.
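Contaminating a response with noise at a prescribed SNR amounts to scaling zero-mean Gaussian noise so that the signal-to-noise power ratio matches the target in dB; a minimal sketch:

```python
import numpy as np

def add_noise(signal, snr_db, seed=None):
    """Add zero-mean Gaussian noise to `signal`, scaled so the
    signal-to-noise power ratio equals `snr_db` (in dB)."""
    rng = np.random.default_rng(seed)
    signal = np.asarray(signal, dtype=float)
    p_signal = np.mean(signal ** 2)            # average signal power
    p_noise = p_signal / (10 ** (snr_db / 10)) # target noise power
    noise = rng.normal(0.0, np.sqrt(p_noise), signal.shape)
    return signal + noise
```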
6 GARCH effect assessment
In this section, two tests are applied to demonstrate the compatibility of the GARCH model [46]: the Kurtosis test and the ARCH test, presented in the following subsections.
6.1 Kurtosis test
The GARCH model is appropriate for signals whose distributions have heavy tails. Therefore, the Kurtosis test is used to determine whether the signals have heavy tails. The kurtosis of a distribution s is formulated as follows [46]:

$$ {\text{Kurt}}(s) = \frac{{E\left[ {\left( {s - \mu } \right)^{4} } \right]}}{{\sigma^{4} }}, $$
where \(\mu\) and \(\sigma\) denote the mean and standard deviation of the distribution s, respectively, and \(E(s)\) stands for the expected value of s. For a Gaussian distribution, the kurtosis equals three; higher values indicate that the distribution has a heavier tail than the Gaussian. This paper applies the test to the IMFs of each sensor, and the averaged minimum and maximum values over the sub-bands are presented in Table 7. The maximum values exceed 3, which confirms that the IMFs do not follow a Gaussian distribution.
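A minimal numerical sketch of this heavy-tail check, using the standard fourth-moment definition of kurtosis:

```python
import numpy as np

def kurtosis(s):
    """Kurtosis E[(s - mu)^4] / sigma^4; equals 3 for a Gaussian,
    larger values indicate a heavier-tailed distribution."""
    s = np.asarray(s, dtype=float)
    mu, sigma = s.mean(), s.std()
    return np.mean((s - mu) ** 4) / sigma ** 4
```

Heavy-tailed samples (e.g. Laplace-distributed) yield values well above 3, while Gaussian samples cluster around 3, which is the decision rule applied to the IMFs.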
6.2 ARCH test
Based on the hypothesis test proposed in [47], the ARCH test is deployed to examine the existence of an ARCH/GARCH effect in the IMFs of each sensor. This reference presents a regression-based Lagrange multiplier test whose statistic is asymptotically Chi-square distributed with q degrees of freedom [46].
Thus, in this part, the ARCH test is applied to the IMFs of the different sub-bands, and the average results over the signals are shown in Table 8. In this table, h stands for the Boolean decision variable, where 1 denotes rejection of the null hypothesis that no GARCH effect exists. The p value is the significance level at which the test rejects the null hypothesis, and GARCHstat and CriticalValue are the ARCH test statistic and the critical value of the Chi-square distribution, respectively. If GARCHstat is less than the critical value, no GARCH effect exists. In this study, the significance level is set to 0.05, as frequently used [48]. Notably, these results are averaged over all signals; for example, the average value of h for the fourth IMF of the first sensor is 0.74, meaning that 74% of the signals exhibit the GARCH effect. In general, the results of the table confirm the existence of the GARCH effect in most cases.
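A sketch of the Lagrange multiplier statistic behind this test, assuming Engle's standard formulation (regress squared residuals on q of their own lags, then LM = n·R², asymptotically \(\chi^2(q)\)); library implementations such as statsmodels' `het_arch` would normally be used instead:

```python
import numpy as np

def arch_lm_stat(resid, q):
    """Engle's ARCH LM statistic: regress the squared series on q of
    its own lags and return n * R^2 (asymptotically chi-square, q dof)."""
    e2 = np.asarray(resid, dtype=float) ** 2
    y = e2[q:]
    # design matrix: intercept plus q lagged squared residuals
    X = np.column_stack([np.ones(len(y))] +
                        [e2[q - i:-i] for i in range(1, q + 1)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid_fit = y - X @ beta
    r2 = 1.0 - np.sum(resid_fit ** 2) / np.sum((y - y.mean()) ** 2)
    return len(y) * r2
```

A series with conditional heteroscedasticity yields a large statistic (GARCH effect present), while white noise yields a small one, matching the decision rule reported in Table 8.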
7 Conclusion
In this paper, a novel methodology was proposed to identify and classify linear and nonlinear damage in building structures. The VMD was applied to address the variational conditioning of the input signals, and the GARCH model was used to model the decomposed signals; the GARCH coefficients of the IMFs were then deployed as features of the input signals. It was observed that, beyond a certain number of IMFs, additional modes did not notably reduce the residuals. KPCA and KDA were applied to the extracted features, in that order, to find the optimum features, and it was observed that kernel-based dimensionality reduction enhanced the classification performance of the SVM, kNN, and fine tree algorithms. Using two experimental models, it was demonstrated that the proposed method discriminates linear damage states without any error and classifies nonlinear damage with significant accuracy. Moreover, the proposed method proved its efficiency even in highly noisy environments with SNRs of 20 and 15. Finally, the Kurtosis and ARCH tests were deployed to verify the existence of the GARCH effect; the results showed that the IMFs follow the GARCH effect and are therefore appropriate candidates for the proposed method.
The authors suggest applying VMD and the GARCH model in unsupervised approaches and reinforcement learning. Moreover, optimization algorithms such as particle swarm optimization (PSO) and the grey wolf optimizer (GWO) could be deployed to find the optimum number of features. The current limitation of the proposed method is its sensitivity to noisy signals, which can be addressed by SNR estimation and noise reduction through signal processing. Semi-supervised schemes could also be considered to reduce the effect of noisy features on the performance of the feature reduction schemes.
References
Nichols JM, Todd MD (2009) Nonlinear features for SHM applications. Encycl Struct Health Monit, p. 649-663
Haroon M Free and forced vibration models. Encycl Struct Health Monit, p.24-52
Gudmundson P (1983) The dynamic behaviour of slender structures with cross-sectional cracks. J Mech Phys Solids 31(4):329–345
Sinou J-J (2009) A review of damage detection and health monitoring of mechanical systems from changes in the measurement of linear and nonlinear vibrations. Nova Science Publishers Inc, New York
Cavadas F, Smith IF, Figueiras J (2013) Damage detection using data-driven methods applied to moving-load responses. Mech Syst Signal Process 39(1–2):409–425
Pawar PM, Ganguli R (2011) Structural health monitoring using genetic fuzzy systems. Springer Science & Business Media, New York
Chatzi EN, Papadimitriou C (2016) Identification methods for structural health monitoring, vol 567. Springer, New York
Gopalakrishnan S, Ruzzene M, Hanagud S (2011) Computational techniques for structural health monitoring. Springer Science & Business Media, New York
Monavari B (2019) SHM-based structural deterioration assessment. Queensland University of Technology, Brisbane
Azimi M, Eslamlou AD, Pekcan G (2020) Data-driven structural health monitoring and damage detection through deep learning: state-of-the-art review. Sensors 20(10):2778
Smarsly K, Dragos K, Wiggenbrock J (2016) Machine learning techniques for structural health monitoring. In: Proceedings of the 8th European workshop on structural health monitoring (EWSHM 2016), Bilbao, Spain
Rytter A (1993) Vibrational based inspection of civil engineering structures. Fracture and Dynamics R9314(44):193
Farrar CR et al (1999) A statistical pattern recognition paradigm for vibration-based structural health monitoring. Struct Health Monit 2000:764–773
Gharehbaghi VR et al (2021) Deterioration and damage identification in building structures using a novel feature selection method. In: Structures. Elsevier, Amsterdam
Das S, Saha P, Patro S (2016) Vibration-based damage detection techniques used for health monitoring of structures: a review. J Civ Struct Health Monit 6(3):477–507
Khuc T et al (2020) A nonparametric method for identifying structural damage in bridges based on the best-fit auto-regressive models. Int J Struct Stab Dyn 20(10):1–17
Gharehbaghi VR et al (2020) Supervised damage and deterioration detection in building structures using an enhanced autoregressive time-series approach. J Build Eng. 30:101292
Monavari B et al (2020) Structural deterioration localization using enhanced autoregressive time-series analysis. Int J Struct Stab Dyn 20(10):2042013
Cheng C, Yu L, Chen LJ (2012) Structural nonlinear damage detection based on ARMA-GARCH model. In: Applied mechanics and materials. Trans Tech Publ, Bäch.
Beale C, Niezrecki C, Inalpolat M (2020) An adaptive wavelet packet denoising algorithm for enhanced active acoustic damage detection from wind turbine blades. Mech Syst Signal Process 142:106754
Huang NE, Wu Z (2008) A review on Hilbert–Huang transform: method and its applications to geophysical studies. Rev Geophys 46(2):228-251
Huang NE et al (1998) The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc R Soc Lond Ser A Math Phys Eng Sci 454:903–995
Ai Q et al (2018) Advanced rehabilitative technology: neural interfaces and devices. Academic Press, Cambridge
Huang B-L, Yao Y (2014) Batch-to-batch steady state identification via online ensemble empirical mode decomposition and statistical test. Computer aided chemical engineering. Elsevier, pp 787–792
Dragomiretskiy K, Zosso D (2013) Variational mode decomposition. IEEE Trans Signal Process 62(3):531–544
Phan SK, Chen C (2017) Big data and monitoring the grid. The power grid. Elsevier, pp 253–285
Blanco-Velasco M, Weng B, Barner KE (2008) ECG signal denoising and baseline wander correction based on the empirical mode decomposition. Comput Biol Med 38(1):1–13
Oladosu G (2009) Identifying the oil price–macroeconomy relationship: an empirical mode decomposition analysis of US data. Energy Policy 37(12):5417–5426
Lee T, Ouarda TB (2011) Prediction of climate nonstationary oscillation processes with empirical mode decomposition. J Geophys Res Atmos 116(6):352-367
Lei Y et al (2013) A review on empirical mode decomposition in fault diagnosis of rotating machinery. Mech Syst Signal Process 35(1–2):108–126
Bagheri A, Ozbulut OE, Harris DK (2018) Structural system identification based on variational mode decomposition. J Sound Vib 417:182–197
Maji U, Pal S (2016) Empirical mode decomposition vs. variational mode decomposition on ECG signal processing: a comparative study. In: 2016 International conference on advances in computing, communications and informatics (ICACCI). IEEE
Xin Y, Li J, Hao H (2020) Damage detection in initially nonlinear structures based on variational mode decomposition. Int J Struct Stab Dyn 20(10):2042009
Das S, Saha P (2020) Performance of hybrid decomposition algorithm under heavy noise condition for health monitoring of structure. J Civ Struct Health Monit 10:679–692
Figueiredo E, Park G, Figueiras J, Farrar C, Worden K (2009) Structural health monitoring algorithm comparisons using standard data sets. https://doi.org/10.2172/961604
Wang Z et al (2019) Application of parameter optimized variational mode decomposition method in fault diagnosis of gearbox. IEEE Access 7:44871–44882
Guo H, Zhou R (2019) Experimental research of nonlinear damage diagnosis using ARMA/GARCH method. In: IOP conference series: materials science and engineering. IOP Publishing
Bollerslev T (1986) Generalized autoregressive conditional heteroskedasticity. J Econometr 31(3):307–327
Yin S et al (2016) PCA and KPCA integrated support vector machine for multi-fault classification. In: IECON 2016–42nd annual conference of the IEEE industrial electronics society. IEEE
Yang J et al (2005) KPCA plus LDA: a complete kernel Fisher discriminant framework for feature extraction and recognition. IEEE Trans Pattern Anal Mach Intell 27(2):230–244
Schölkopf B, Smola A, Müller K-R (1998) Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput 10(5):1299–1319
Cai D, He X, Han J (2011) Speed up kernel discriminant analysis. VLDB J 20(1):21–33
Dukart J (2015) Basic concepts of image classification algorithms applied to study neurodegenerative diseases. Neurosci Biobehav Psychol 3(1):641-646
Richman JS (2011) Multivariate neighborhood sample entropy: a method for data reduction and prediction of complex data. Methods Enzymol 487:397–408
Tan L (2015) Code comment analysis for improving software quality. The art and science of analyzing software data. Elsevier, pp 493–517
Kalbkhani H, Shayesteh MG, Zali-Vargahan B (2013) Robust algorithm for brain magnetic resonance image (MRI) classification based on GARCH variances series. Biomed Signal Process Control 8(6):909–919
Engle RF (1982) Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometr J Econometr Soc 50(4):987–1007
Park C-S et al (2008) Automatic modulation recognition of digital signals using wavelet features and SVM. In: 2008 10th international conference on advanced communication technology. IEEE
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Gharehbaghi, V.R., Kalbkhani, H., Noroozinejad Farsangi, E. et al. A data-driven approach for linear and nonlinear damage detection using variational mode decomposition and GARCH model. Engineering with Computers 39, 2017–2034 (2023). https://doi.org/10.1007/s00366-021-01568-4