1 Introduction

Civil structures and infrastructure occupy a major position in the economy and play a vital role in facilitating daily life for the world population. These assets have been incurring premature damage and are approaching the end of their service lives [9]. Replacing such structures would be costly and labor intensive, and would exceed available financial and human resources. Hence, engineers have developed various techniques to enhance the safety and structural integrity of those constructions [64] and to mitigate the possible financial losses and loss of life associated with their failure. Figure 1 illustrates different damage detection disciplines in SHM.

Fig. 1 Damage detection disciplines

This paper focuses on Structural Health Monitoring (SHM) as a damage detection process. SHM consists of implementing a scheme for monitoring the structure, for instance using periodically spaced dynamic response measurements, and extracting damage-sensitive features from these measurements and their statistical analyses to assess the actual health of the system [17]. Long-term SHM yields periodically updated information on the ability of the structure to continue serving in the presence of other influencing factors, such as degradation and aging. Consider, for example, a sudden blast loading [132] or a severe seismic event [77]. SHM could be used to provide information on the performance of the structural system during the load event and to assess its structural integrity thereafter (also termed Rapid Condition Screening) [3]. Indeed, SHM can appraise the current state and behavior of a structure by automatically analyzing data acquired by tailored devices and sensors installed at engineered locations across the structure. Hence, anomalies can be detected promptly, allowing engineers to assess the reliability of the structure immediately after the catastrophic event and to identify corrective measures before the damage escalates to more costly or riskier levels.

Considering such advantages of SHM, related research has been growing rapidly and attracting increasing attention from diverse stakeholders. Accordingly, several SHM systems have emerged and been implemented in bridges [2], high-rise buildings [98], towers [89], dams [91], tunnels [80] and so forth. This has led to the acquisition of big data, which requires powerful, intelligent and sophisticated computational techniques and has opened the door to deploying Artificial Intelligence (AI) in SHM problems.

Artificial Intelligence emerged between the 1950s and 1970s in the field of computer science and achieved substantial success in various subfields such as robotics [14, 15], data mining [130], pattern recognition [94], knowledge representation [14, 15] and agent systems [128]. Conversely, AI has attracted the attention of civil engineering experts only recently. For instance, it has been used to perform several tasks in SHM applications dealing with knowledge-based systems [38], fuzzy logic algorithms [92] and artificial neural networks [7]. The increasing number of AI applications has led scientists and engineers to train more complex models and create more robust AI tools. Machine Learning (ML) has more recently emerged as a strong contender to deal with this need. It is defined as a subset of AI that uses statistical models to improve the accuracy of machines by understanding the structure of data and then fitting it into models [38].

A machine can learn via supervised, unsupervised or reinforcement learning (Fig. 2). Supervised learning (SL) uses labels or captions so that the machine can associate the features of each object with the label attached to it. SL thus provides a learning scheme with labeled data to deal with regression and classification problems. In the SHM domain, SL can be used, for instance, to detect the type and severity of damage [117]. Conversely, unsupervised learning is the process of learning with unlabeled data, i.e. via datasets with unspecified outputs that fit a general rule and can be grouped together following a certain trend. This can be used, for example, to detect the existence of damage by clustering structural response data. As shown in Fig. 3, ML is a straightforward process: it starts from the input (database), passes through the selected algorithm, produces an output, and then either stops or restarts according to the feedback provided. The process ends when an accurate, well-predicted result is obtained.

Fig. 2 ML taxonomy

Fig. 3 ML life cycle

2 Hierarchy of ML Algorithms

For the sake of clarity, a brief guideline on how to manipulate each of the ML steps of the general process is provided below.

2.1 Input Configuration

Starting at the input stage, a better understanding of the data can help in selecting the appropriate algorithm. Some algorithms perform well with small sample sets, while others require very large samples. Also, some work better with certain types of data than others. As illustrated in Fig. 4, data need to be well understood and explored using mathematical tools such as data statistics and data visualization before any machine learning algorithm is applied. In data statistics, percentiles and summary measures such as the range, average and median are used to describe the central tendency of the data, and correlations are used to learn how the variables are linked together [60]. In data visualization, density plots and histograms are used to show the distribution of the data, along with box plots to identify problems such as outliers [107]. Then, the data need to be ‘cleaned’, which involves dealing with missing values and outliers that can be a concern for some algorithms and decrease predictive accuracy. Finally, the data can be augmented or enriched to make the models easier to interpret, reduce data redundancy and dimensionality, capture complex relationships, and rescale some variables.

Fig. 4 Input configuration
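The input configuration steps described above (summary statistics, cleaning of missing values and outliers, rescaling) can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the strain readings, the 1.5 × IQR outlier rule and the function names are assumptions made for the example and are not taken from the reviewed studies.

```python
import statistics

def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
    }

def remove_outliers_iqr(values, k=1.5):
    """Drop missing entries (None) and points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in present if low <= v <= high]

def min_max_rescale(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical strain readings with one missing value and one spike.
raw = [101.0, 99.5, None, 100.2, 98.8, 100.9, 250.0, 99.9, 100.4]
clean = remove_outliers_iqr(raw)   # missing value and spike removed
scaled = min_max_rescale(clean)    # values now lie in [0, 1]
stats = summarize(clean)
```

The interquartile-range rule is only one common choice; which cleaning rule is appropriate depends on the sensor and on the algorithm that consumes the data.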

After manipulating the data, the problem needs to be categorized following an input–output process. For the input process, if the data is labeled, it will consist of a supervised learning problem. However, if it is unlabeled, the learning problem is considered unsupervised. On the other hand, the output process is categorized by task. If the output is a set of input groups, the problem shall be recognized as a clustering problem. Understanding the constraints of the problem is also a main task in selecting an appropriate algorithm.

Several kinds of constraints can arise when selecting an ML algorithm, starting with the available data storage capacity. Furthermore, prediction time can play a major role in the selection process, since some SHM problems must be solved in a timely manner. For example, real-time object detection needs to be fast enough to avoid losing information during the process of object recognition [30]. In addition, model training should be rapid in cases where the model is frequently exposed to new data and must process it instantly. To select the appropriate algorithm, other factors need to be considered, such as the accuracy and scale of the model, model pre-processing, and model complexity in terms of the features included to learn and predict more complex polynomial terms and interactions, which entail more computational overhead. The commonly used ML algorithms in SHM applications are summarized in Fig. 5.

Fig. 5 List of ML algorithms applied to SHM

2.2 Algorithm Manipulation

The most commonly used ML algorithms for SHM purposes are outlined below.

Support Vector Machine (SVM), also called Support Vector Network (SVN), is a supervised learning algorithm used for classification and regression problems. An SVM sorts data into one of two categories and outputs a map of the sorted data that maximizes the margin between the two. It performs both linear and non-linear classification thanks to the use of kernel functions [19]. Its architecture is detailed in Fig. 6.

Back Propagation Neural Networks (BPNNs) are multi-layer perceptrons trained with the supervised back propagation algorithm. The algorithm searches for the minimum of the error function in the weight space using a gradient descent technique. The set of weights that minimizes the loss function is the solution to the learning problem [50].

K-Nearest Neighbors (K-NN) is a family of classifiers used for pattern classification and ML [35]. Given a set x of n points and a distance function, K-NN searches for the points in x that are closest to a query point or set of query points y.

Principal Component Analysis (PCA) is a data analysis method that transforms correlated variables into uncorrelated ones, called principal components. This technique helps the user reduce the number of variables and make the information less redundant [59].

Convolutional Neural Network (CNN) is an architecture used in deep learning (DL), which is a subset of ML, to perform both descriptive and generative tasks. It is dedicated mainly to image processing, using machine vision libraries that contain image and video recognition scripts. The main difference between the ML and DL processes is the hidden layers located between the input and output of DL algorithms, as illustrated in Fig. 7. These can include multiple convolutional or deconvolutional layers as well as pooling, activation, fully connected and normalization layers, depending on the use.

Fig. 6 SVM classifier architecture

Fig. 7 Commonly used configuration for CNN
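As an illustration of the nearest-neighbor search described above, the following sketch classifies a query point by majority vote among its k closest training points under the Euclidean distance. The feature pairs (natural frequency, damping ratio) and their labels are hypothetical values chosen for the example, and the function name is not taken from any cited work.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    dists = [(math.dist(x, query), label) for x, label in zip(train_x, train_y)]
    dists.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical features: (natural frequency in Hz, damping ratio in %).
train_x = [(4.9, 1.0), (5.0, 1.1), (5.1, 0.9), (4.2, 2.0), (4.1, 2.2), (4.3, 1.9)]
train_y = ["healthy", "healthy", "healthy", "damaged", "damaged", "damaged"]

state_a = knn_predict(train_x, train_y, (4.95, 1.05))  # close to the healthy group
state_b = knn_predict(train_x, train_y, (4.15, 2.10))  # close to the damaged group
```

In practice the features would usually be rescaled first, since a distance-based vote is sensitive to the units of each feature.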

2.3 Output Manipulation

The output of an SHM application varies from one problem to another and can be, for example, settlement, damage detection, damage classification, object detection, temperature prediction or a health index. The process should end with an accurate and precise output; otherwise, feedback is provided to the machine so that it can learn from the experience and attempt to provide better results.

3 Structural Health Monitoring (SHM)

3.1 Bridge Health Monitoring (BHM)

BHM is the application of SHM and inspection techniques to bridge structures. Causes of degradation of bridge structures include materials aging [49], corrosion of metals [137] and structural supports [140], mechanical overloading and other damage mechanisms [24]. BHM consists of collecting quantitative data from various sensors located within or on the surface of the structure [48]. This real-time feedback builds up a monitoring dataset used to assess the condition of the bridge. Processing complex real-time big data has been a challenge in BHM. According to [95], BHM can be separated into three key stages. The first is the construction control (CC) stage, where engineers are responsible for monitoring construction progress. The second is the routine monitoring (RM) stage, which directly follows construction of the bridge. In this period, a large amount of data is acquired from the installed sensors and stored. To process this data, ML algorithms are being developed to provide real-time feedback for understanding the health condition of the bridge. The last is the damage detection (DD) stage, where engineers assess the safety of the structure and detect any damage that develops.

3.2 Building Health Monitoring (BUHM)

Buildings are often exposed to damage from earthquakes, wind, overloading, vibration, impact, landslides, floods, aging, environmental action and other damage mechanisms. Without adequate monitoring, maintenance and repair, this can lead to inadequate service and possible economic losses and loss of life. Thus, understanding how buildings perform in real conditions can help engineers design and build safer, more resilient, more reliable and more durable structures. In particular, there has recently been rapid growth in the construction of high-rise buildings, which require smarter and more robust monitoring [5]. Monitoring the deformation of such buildings has long been a concern. More recently, experts have introduced ML algorithms to monitor the condition of high-rise buildings, considering their proven effectiveness in other fields.

3.3 Dam Health Monitoring (DHM)

Dams play a key role in providing drinking and irrigation water, flood defense, power generation, water storage and so forth. Their deterioration can lead to massive financial losses and possibly a disastrous number of casualties [16]. Thus, safe operation of dams is needed, and any anomalous behavior should be detected in its early stages to avoid any failure or mis-operation. Dam Health Monitoring (DHM) is a discipline that is often based on traditional visual inspection and other monitoring of the dam and foundation [28]. This requires robust analysis of dam monitoring data obtained from the installed sensors in the short and long term. For short-term monitoring, the engineer is responsible for comparing the measured data with reference values that correspond to the response of the dam to loads in a normal or safe condition. Anomalies are detected when measurements fall above or below the predicted interval around the reference values. For long-term monitoring, however, analysis of the behavior models and the observed data is needed to assess the performance of the dam in terms of loads and observed output [61]. DHM can also consist of static and dynamic monitoring aspects. Statically, many features can be monitored, including reservoir storage levels, cracks, displacements, strains and stresses. Dynamically, other parameters can be identified, such as the stiffness, damping ratio and mode shapes under excitation from wind, water waves and ground motions [40]. The structural behavior of dams has complicated relationships with environmental factors, hydraulics (e.g. water level) and geo-mechanisms (e.g. pore pressure, rock deformability) [46]. To describe the behavior of concrete dams based on real-time monitoring, several mathematical models have been proposed, including statistical, deterministic and hybrid models. Such models serve to assess the behavior of dams by analyzing real-time data, considering hydrostatic pressure, environmental temperature and time effects to be the main variables [121]. Due to the uncertainties in this kind of approach, several AI techniques have been implemented that fuse conventional models with heuristic algorithms, leading to hybrid models. In recent years, ML has become a new and accurate tool in DHM.
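The short-term check described above, in which measurements are compared with reference values for the normal condition, can be sketched as a simple band test. The example below assumes that a behavior model has already supplied a predicted value for each reading and that the band is the prediction plus or minus k times the standard deviation of past residuals; the band width, the displacement values and the function name are illustrative assumptions.

```python
def flag_anomalies(measured, predicted, residual_std, k=2.0):
    """Label each reading relative to the band predicted +/- k * residual_std."""
    flags = []
    for m, p in zip(measured, predicted):
        if m > p + k * residual_std:
            flags.append("above")
        elif m < p - k * residual_std:
            flags.append("below")
        else:
            flags.append("normal")
    return flags

# Hypothetical radial displacements (mm): model reference vs. sensor readings.
predicted = [10.0, 10.5, 11.0, 11.2]
measured = [10.1, 10.4, 12.5, 9.9]
flags = flag_anomalies(measured, predicted, residual_std=0.3)
```

With these numbers the band half-width is 0.6 mm, so the third reading is flagged as lying above the band and the fourth as lying below it.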

3.4 Wind Turbine Health Monitoring (WTHM)

To limit the need for traditional sources of energy such as fossil fuels, eco-friendly sources of energy that can mitigate climate change are being sought [47]. Wind Turbines (WT) have gained acceptance owing to the maturity of their technology. Larger WT have emerged to harvest more wind energy, seeking efficiency and productivity. However, this increase in size has complicated maintenance and repair work for facility managers. Several attempts to monitor the structural integrity of WT have been reported. These cover the different problems faced by wind turbine blades (WTB) during their lifecycle [27] and the methods used to detect damage in WT, including acoustic emission event detection [122], thermal imaging [8], ultrasonic methods [119], modal-based approaches [116], fiber optics [123], laser Doppler vibrometer [81], electrical resistance-based damage detection [83], strain memory alloy [125], X-radioscopy [119], eddy current [45] and other methods. Accordingly, big data have accumulated. Data science is needed for classification and prediction of WT damage, hence the need for ML.

4 DL and ML Applications in SHM

This section surveys different ML and DL approaches and algorithms used in SHM problems. Various algorithms have been used in SHM applications over the last 10 years, including the Back Propagation (BP) algorithm, Support Vector Machine (SVM), Neural Networks (NNs), K-Nearest Neighbors and Convolutional Neural Networks (CNNs). Uses of those algorithms in several applications, including SHM of bridges, high-rise buildings, dams and wind turbines, are outlined below.

4.1 Artificial Neural Networks (ANNs)

4.1.1 Feed Forward Neural Networks (NNs)

Gonzalez et al. [44] presented a damage identification method for steel moment frame structures. The method uses NNs with the first flexural modes (frequencies and mode shapes obtained from a finite element model of a five-story office building) as input. It was based on two main steps: the first calibrates the healthy structure, while the second identifies the damaged structure after a seismic event. They predicted the mass and stiffness of the structure to provide a damage index at each story and reported robust model prediction of damage. More recently, Chang et al. [22] extended this approach and applied it not only to detect damage, but also to localize it and predict its severity in order to appraise the remaining performance of the damaged members. Two critical structures were studied: (1) a seven-story building with single and multiple damaged columns, and (2) a scaled twin tower with weak braces installed in some floors.

To detect damage (DD) in bridges, three different algorithms were applied. The NN technique was used in the Jamboree road over-crossing, Irvine, California to assess parameters including aging, long-term structural parameters, stiffness and mass [120]. Many applications have used this algorithm owing to its simplicity and accuracy compared to traditional methods. For instance, it was used to determine radial dam displacements with different sets of inputs [31, 63, 82, 104, 105]. Other uses were reported in [88, 100, 101, 114] to detect the pore pressure in dams, to predict the tangential displacement [96] and to monitor the leakage flow [112]. A summary of the used algorithms is provided in Table 1.

Table 1 Summary of the different NN applications in SHM

4.1.2 Back Propagation Neural Networks (BPNNs)

The BP algorithm was applied during the early stages of construction of the Yangtze River Bridge in China to track girder elevation changes during the construction phase, using input parameters such as cable tension and deflection parameters and the deflection of the deck. Another study [95] employed a BP algorithm to track variation of the deflection of the Hubei Danjiangkou bridge deck throughout the Construction Control (CC) phase, using inputs including temperature, the value of deflection of the deck after stretching and the height of the stretched section. Other uses of the BP algorithm were in the Routine Monitoring (RM) stage. For instance, it was used to predict pile settlement as a function of the pile displacement sequence [95] and to track the normality of points according to their deflection [133]. The Kentucky Louisville truss bridge in the USA was subjected to an extensive campaign to measure parameters such as frequency, mode shapes and the number of degrees of freedom, which served as inputs for measuring the damage potential of truss joints [41, 84]. The Yangtze River Bridge was also monitored to track girder elevation changes based on cable tension and deflection parameters using BPNN, as illustrated in [136]. Four distinct uses of ML to detect damage and identify its degree in the main structural elements of a building using the BP algorithm were reported in [37]. The first consisted of identifying damage in a reinforced concrete frame structure using the changing ratio of modal strain energy, which is taken as the damage location factor. The second explored damage location and degree in a simply supported beam, coupled with finite element simulation to calculate the first two natural frequencies of the structure using the curvature mode of some critical points highlighted in the frame. The third application identified the damage degree in a scaled four-story steel frame structure, where the inputs of the algorithm consisted of ratios of natural frequency and the applied load simulated wind load. Finally, a damage identification method was applied to the Kewitte single-layer spherical reticulated shell. The above methods achieved adequate accuracy in detecting damage for different kinds of structures (Table 2).

Table 2 Summary of the different BPNN applications in SHM
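The weight update at the core of the BP algorithm is a gradient descent step on the error function. The sketch below shows that step in its simplest setting, a single linear neuron with no hidden layer, fitted to synthetic noise-free data; a full BPNN applies the same kind of update to every layer by propagating the error backwards with the chain rule. The data, learning rate and function name are assumptions made for illustration and do not reproduce any of the cited studies.

```python
def train_linear_neuron(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w*x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic example: a response that depends linearly on one input (y = 2x + 1).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = train_linear_neuron(xs, ys)
```

The learning rate has to be small enough for the iteration to remain stable; for real, noisy monitoring data the fitted weights would only approximate the underlying relationship.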

4.1.3 Convolutional Neural Networks (CNNs)

More recently, Deep Learning [71] has emerged as a sophisticated subset of AI. It has been proposed to perform more advanced tasks using innovative algorithms. Its main application in structural health monitoring is detecting defects from surface images, such as cracks, efflorescence, steel exposure, rust staining, scaling and spalling of concrete structures, as well as fatigue in steel structures, bolt loosening, and potholes and holes in asphalt pavement. ML allows cracks in civil engineering structures to be detected quickly and reliably, determining the type of crack, its distribution along the section, and its width and length. Thus, engineers can assess the load-carrying capacity and degradation level of structures [113]. This procedure has often been conducted by experts [32] who rely on rather subjective opinions in assessing the health of structures [42] and predicting remaining service life, and it is compounded by the difficulty of accessing hard-to-reach areas. Thus, there is a need for automated and intelligent crack detection methods that do not rely on subjective operator expertise and opinion.

Image-based crack detection is currently among the most advanced and active research fields in SHM. It is still evolving to address difficulties such as the random shapes and irregular sizes of cracks, and concerns with lighting conditions, shading, blemishes and concrete spalling in the obtained images. Recently, automatic crack detection using Deep Learning (DL) has emerged. New optimizations of pre-trained networks such as GoogleNet [115], AlexNet [6], ResNet [131], VGG-16 [1] and YOLO object detection [102] are frequently reported. Yet, from the input dataset to the output, parameters need to be carefully considered. A summary of the most recent applications of CNNs to detect damage in concrete and non-concrete structures is provided in Table 3 and described below.

Table 3 Summary of CNNs applications for SHM

It is widely accepted that the larger and more comprehensive the dataset, the more successful the AI models trained on it can be. Thus, techniques such as data augmentation [39] have been proposed to address the lack of data and to reduce overfitting caused by limited and imbalanced training datasets. Another promising technique that has helped increase prediction accuracy is dropout, which consists of randomly and temporarily ignoring some units of the neural network during computation. Also, to obtain higher accuracy in image data processing, several parameters should be considered, such as uncontrolled image shooting distance [118], lighting conditions [126], shot angle and blurriness.
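Both techniques can be illustrated on toy data. The sketch below augments a small image, stored as a list of rows, by mirroring and rotating it, and applies dropout to a list of activations. The rescaling of the surviving units by 1/(1 - p), often called inverted dropout, is one common implementation choice and is an assumption here, as are the function names and the toy values.

```python
import random

def horizontal_flip(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]

def rotate_90(image):
    """Rotate a 2D image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]

def dropout(activations, p, rng):
    """Zero each unit with probability p and rescale the survivors by 1/(1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    keep = 1.0 - p
    return [a / keep if rng.random() >= p else 0.0 for a in activations]

# Toy 2 x 3 "image"; each variant can be added to the training set.
image = [[0, 1, 2],
         [3, 4, 5]]
augmented = [image, horizontal_flip(image), rotate_90(image)]

# Dropout is applied during training only; a seeded generator keeps runs repeatable.
dropped = dropout([1.0, 2.0, 3.0, 4.0], p=0.5, rng=random.Random(1))
```

Flips and rotations suit crack images because a crack remains a crack under these transformations; augmentations that change the label would not be appropriate.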

Most relevant studies have focused on classifying structures as damaged or not damaged through the presence of cracks. One of the earliest applications of CNNs used different layouts and architectures, varying the number of convolutional blocks, pooling layers and fully connected layers, and adding some features to the available pre-trained networks (Transfer Learning, TL) in order to detect cracks in concrete structures and asphalt pavements [134].

Different configurations have been proposed to optimize crack detection in defective structures. Recently, a robust concept based on transfer learning for the early detection of fatigue cracks in gusset plate joints of steel bridges was proposed in [36] as an alternative to training a neural network from scratch. The authors used the output features of the VGG16 network architecture previously trained on the ImageNet dataset, then fine-tuned the top layer of VGG16, which helped achieve the best precision. This indicated that fine-tuning a well-trained fully connected layer together with the top convolutional layer of VGG16, in combination with data augmentation, is among the best-performing combinations for detecting cracks in structures. Numerous applications in the literature have sought the most robust algorithm for crack detection [34, 67, 68, 72, 74, 78, 85, 97, 126, 127, 139] by varying the architecture of the CNN used: changing the number of convolutional blocks, which varied between two [36] and eleven [74], introducing more pooling at the end of each convolutional block, adding more activation and normalization layers, etc.

Other research efforts did not limit their scope to the binary classification of structures (cracked or not). More innovative and useful monitoring tasks have been explored, for instance detecting efflorescence and spalling [56, 74], bolt loosening [139], rutting of asphalt pavements and potholes [79], and the typology of cracks along with their length and width [134]. For instance, [56] proposed a three-stage concrete defect classifier that can classify unhealthy, defective bridge areas and determine their specific defect type compared to inspection guidelines. The process consisted of fine-tuning three separate pre-trained networks on a multi-source dataset for concrete walls, beams, columns, etc.

Another successful application of CNN was discussed in [43], which proposed a baseline recognition task that determines the component type, checks the spalling condition, evaluates the damage level (no damage, minor damage, medium to severe damage, collapse) and predicts the mechanical source of damage. For example, if the crack is horizontal, the mechanical force that initiated it is an axial (tensile or compressive) force; if the crack is slightly vertical, a bending moment could be the main cause; and if the crack is inclined, shear force would be the main cause. Accordingly, a dataset composed of 10,000 images was collected from the ImageNet platform and then labeled manually for the specified recognition tasks. To avoid overfitting, TL based on VGGNet was applied using two different strategies, fine-tuning and feature extraction. Two sets of experiments were carried out to find the relatively optimal model parameters and hyperparameters, including learning rate, mini-batch size, number of epochs, initial weights, etc. Both strategies proved effective in recognition applications.

Similarly, a study conducted by [76] proposed a three-level image-based approach for post-disaster monitoring of reinforced concrete bridges using image classification, object detection and semantic segmentation, respectively, to assess failure of the overall system, detect the structural element (deck, column, beam, wall) where the damage is located and then zoom to the exact location on that element to localize the damage. This study achieved over 90% accuracy for the three deep learning models, which supports further research into new solutions for these kinds of problems.

Deep learning and CNN scholars did not limit their scope to image recognition, and attempted diverse applications to detect crack damage in real time, for instance using unmanned aerial vehicles or drones, as illustrated in [23, 67, 68, 69, 79]. Collecting images and labeling them manually can be a repetitive and time-consuming task. For this reason, different tools have been used in the literature to save time, such as Scrapebox, proposed in [66], which scrapes images from a search engine site (e.g., Google Images, Baidu Images, etc.) for a keyword (e.g., concrete crack), and LabelImg, used as a graphical image annotation tool in [12].

Only a few applications of CNNs have quantified the cracks detected in images by calculating their width and length. For instance, R-CNN-based transfer learning was applied to 384 collected images in [70]. Those images were cropped to the regions where a crack had been located. To quantify cracks, the exact pixel size in the image and the focal distance were determined using GPS data from the Unmanned Aerial Vehicle (UAV) system. The crack quantification algorithm was verified in a small-scale laboratory test that yielded a relative error of 1–2%. Another application [86] proposed a DL-enabled quantitative crack width measurement method. The study presented a novel crack width estimation method based on the Zernike moment operator, which achieved high accuracy for thin cracks.
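The conversion from a crack width measured in pixels to a physical width can be sketched with the pinhole camera relation, in which the size covered by one pixel on the surface equals the sensor pixel pitch multiplied by the working distance and divided by the focal length. The sketch assumes that the camera axis is perpendicular to the surface; the camera parameters, the reference width and the function names are hypothetical and are not those used in the cited studies.

```python
def mm_per_pixel(sensor_pixel_size_mm, working_distance_mm, focal_length_mm):
    """Pinhole-camera scale: physical size covered by one pixel on the target."""
    return sensor_pixel_size_mm * working_distance_mm / focal_length_mm

def crack_width_mm(width_in_pixels, scale_mm_per_pixel):
    """Convert a crack width counted in pixels to millimetres."""
    return width_in_pixels * scale_mm_per_pixel

def relative_error(estimated, reference):
    """Relative error of an estimate against a reference measurement."""
    return abs(estimated - reference) / reference

# Hypothetical camera: 0.004 mm pixel pitch, 25 mm lens, 2 m from the surface.
scale = mm_per_pixel(0.004, 2000.0, 25.0)   # about 0.32 mm per pixel
width = crack_width_mm(3, scale)            # about 0.96 mm for a 3-pixel crack
error = relative_error(width, 1.0)          # against a hypothetical 1.0 mm gauge reading
```

The example makes clear why thin cracks are difficult: at this scale a crack narrower than about 0.3 mm occupies less than one pixel, so sub-pixel methods or a shorter working distance would be needed.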

4.2 Support Vector Machine (SVM)

SVM has been widely used in BHM applications, for instance to determine damage in the Hangzhou bridge using strain vibration, distortion and cable tension [26]. For the Flushing 149st bridge in New York, Impact Echo (IE) data was collected to classify damage of the deck using SVM [73]. Moreover, an attempt was made to use SVM for crack detection in the Sydney Harbor bridge, Australia, using inputs including force, acceleration and time histories recorded during normal bridge operation [4]. The SVM algorithm was also used in the RM stage, for example in the Humboldt Bay middle channel bridge to evaluate the correct position of the pier using some pier features. To predict scour depth near the bridge piers of the Taiwan High-Speed Rail System Bridge, features such as pile length, Young’s modulus of soil and natural frequency of the bridge were used with an SVM algorithm [65].

To detect and localize damage, two potential applications of SVM have been reported. The first [75] used a radial basis function for regressing and optimizing the input (mode curvature change). Good accuracy and generalization ability, along with resistance to noise from the surrounding environment, were achieved. In the second, [90] applied the SVM algorithm to vibration signals from sensors installed on a wooden brace inside a wooden house (Timber Health Monitoring) to track the degradation of wood, assess and localize damage, and then compare the results to those of the k-Nearest Neighbors algorithm. SVM was found to be more accurate and more precise than the K-NN algorithm for this kind of application. Two other main applications consisted of calculating tangential displacements of the Iron Gate two dams between Serbia and Romania using the downstream height, upstream height, their lags and the lag of the output itself for next iterations [100, 101]. This was intended to predict radial displacements (Rad-Disp) and uplift pressure [25]. Also, an evaluation of the correct position of piers installed in the Humboldt Bay middle channel bridge in California was reported in [18]. The various SVM applications are summarized in Table 4.

Table 4 Summary of the different SVM applications in SHM
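Assuming that the radial basis function mentioned above refers to the usual Gaussian kernel, the kernel value between two feature vectors x and y is exp(-gamma * ||x - y||^2). The sketch below computes this value and the resulting kernel (Gram) matrix that an SVM solver operates on; the parameter gamma, the sample points and the function names are illustrative assumptions.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def gram_matrix(points, gamma=1.0):
    """Pairwise kernel values for a list of feature vectors."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]

# Hypothetical two-dimensional feature vectors.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]
K = gram_matrix(points, gamma=0.5)
```

The kernel equals 1 for identical points and decays towards 0 as points move apart, which is what allows a linear separation in the kernel-induced space to act as a non-linear boundary in the original features.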

4.3 Other Algorithms

Table 5 lists various algorithm applications in SHM. The Principal Component Analysis (PCA) algorithm was used for DD purposes in BHM, for instance in Japan’s Hayakawa truss Bridge (Fig. 8), where data acquired from sensors installed on the bridge were deployed in the PCA algorithm combined with an Auto-Regressive (AR) model to detect damage [124]. Another application of this algorithm was in Taiwan’s prestressed concrete Hanxi bridge, where data from single channel deflection signals were used to detect deflection of concrete, shrinkage and creep strains and prestress loss.

Table 5 Other ML algorithms

Fig. 8 3D Model of the Hayakawa Bridge, Japan
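To make the PCA step concrete, the sketch below computes the first principal direction and its share of the total variance for two correlated measurement channels, using the closed-form eigenvalues of the 2 x 2 sample covariance matrix. It illustrates PCA only, not the combined PCA and AR procedure of [124]; the data and the function name are assumptions, and the function expects at least two distinct points.

```python
import math

def pca_2d(points):
    """Return (first principal direction, explained variance ratio) for 2D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2 x 2 matrix.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam1, lam2 = half_trace + root, half_trace - root
    # Eigenvector belonging to lam1, the axis of largest variance.
    if sxy != 0:
        vx, vy = lam1 - syy, sxy
    elif sxx >= syy:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm), lam1 / (lam1 + lam2)

# Hypothetical readings from two perfectly correlated channels (y = 2x).
readings = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
direction, explained = pca_2d(readings)
```

Because the two channels here are perfectly correlated, a single principal component carries essentially all of the variance, which is the redundancy reduction that PCA is used for.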

One application of the Tree-structured Gaussian Process (TGP) algorithm was during the RM stage of BHM, where important features related to the Tamar bridge in the UK were extracted, including its natural frequency, the traffic loading applied to the bridge, and wind direction and speed. Those features were fed to the TGP algorithm to study the effects of wind conditions on the behavior of the main structural elements of the bridge. A second application was in Switzerland’s Z24 Bridge, where modal parameters, air and soil temperature, and soil humidity data were used to assess several conditions such as settlement of the pier, landslide prediction, concrete spalling, concrete hinge failure, anchor head failure and tendon rupture [129].

A methodology to detect local and global health conditions of structural systems using the ambient vibration response of structures collected by installed sensors was proposed [99]. An unsupervised deep Boltzmann machine (DBM) was combined with numerical methods such as wavelet and Fast Fourier transforms to extract features from the frequency domain of the recorded signals and create a classification index for the local and global health of the structure using a probability density function. The algorithm was validated through a verification test case using actual experimental data obtained on a 1:20 scaled residential 42-story concrete building in Hong Kong (Fig. 9). A Hybrid Multi Objective Optimization (HMOO) algorithm was proposed to detect damage by solving the inverse problem of limiting the change of modified modal strain energy in structural elements [21]. A scaled model of the building was designed and then numerically modeled by Finite Element Analysis to assess the performance of the algorithm. The approach was compared to other traditional methods using a single-objective Genetic Algorithm (GA). HMOO achieved better performance in detecting multiple minor damages, which had little effect on the modal properties of the structure. Moreover, the proposed method demonstrated an ability to mitigate the difficulties of measuring rotational components of each mode shape by using incomplete mode shapes that incorporated only global translational components.
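The frequency-domain feature extraction step mentioned above can be illustrated with a short Python sketch based on the Fast Fourier transform. It covers only the feature extraction, not the DBM or the classification index of [99]. The function names, the choice of features (dominant frequency and band energies) and the synthetic 5 Hz test signal are assumptions made for illustration.

```python
import numpy as np

def dominant_frequency(signal, sampling_rate_hz):
    """Frequency (Hz) of the largest spectral peak after removing the mean."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sampling_rate_hz)
    return float(freqs[np.argmax(spectrum)])

def band_energies(signal, sampling_rate_hz, band_edges_hz):
    """Spectral energy in each (low, high) band; a simple frequency-domain feature vector."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sampling_rate_hz)
    return [float(power[(freqs >= lo) & (freqs < hi)].sum()) for lo, hi in band_edges_hz]

if __name__ == "__main__":
    fs = 100.0                              # assumed sampling rate in Hz
    t = np.arange(200) / fs                 # two seconds of data
    response = np.sin(2 * np.pi * 5.0 * t)  # synthetic 5 Hz vibration
    print(dominant_frequency(response, fs))
    print(band_energies(response, fs, [(0.0, 10.0), (10.0, 50.0)]))
```

Feature vectors of this kind, computed per sensor and per time window, are the sort of input that an unsupervised model could then use to separate healthy from anomalous states.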

Fig. 9

a 3D model of the 42-story high-rise building in Hong Kong. b Scaled prototype of the substructures and location of sensors along the height of the building

The K-means clustering algorithm was also applied to detect and localize damage in joints of the Sydney Harbour Bridge, Australia [33]. Moreover, Bayesian Networks (BN) were deployed to rate the condition and structural reliability of the Albert railway bridge in Brisbane, Australia [46]. Another approach [106] used Boosted Regression Trees (BRT) combined with a 100-m finite element numerical model to detect anomalies in a dam (Rad_Disp) (Fig. 10). This algorithm was effective compared to causal models (considering only external variables, e.g., reservoir level) and non-causal models (including both internal and lagged variables as predictors). In addition, [61, 62] compared four algorithms, namely BPNNs, Multiple Linear Regression (MLR), Step Wise Multiple Regression (SWMR) and Extreme Learning Machine (ELM), applied to a dataset obtained on the Fengman Dam in China and found that ELM was the most accurate algorithm.
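The distinction between models driven only by external variables and models that also use lagged variables can be sketched as a difference in how the predictor matrix is assembled. The following Python sketch is an illustration only: the function name, the variable names and the number of lags are hypothetical and are not taken from [106] or [61, 62].

```python
def build_predictors(upstream, downstream, displacement, n_lags=2, use_lags=False):
    """Assemble predictor rows and targets for estimating displacement[t].

    With use_lags=False, only the external variables at time t are used.
    With use_lags=True, lagged external variables and lagged values of the
    output itself are appended to each row.
    """
    rows, targets = [], []
    for t in range(n_lags, len(displacement)):
        row = [upstream[t], downstream[t]]
        if use_lags:
            for k in range(1, n_lags + 1):
                row += [upstream[t - k], downstream[t - k], displacement[t - k]]
        rows.append(row)
        targets.append(displacement[t])
    return rows, targets

if __name__ == "__main__":
    # Hypothetical readings: water levels (m) and measured displacement (mm).
    upstream = [60.1, 60.4, 60.9, 61.3, 61.0]
    downstream = [12.0, 12.1, 12.3, 12.2, 12.1]
    displacement = [3.1, 3.2, 3.5, 3.8, 3.7]
    plain_rows, y = build_predictors(upstream, downstream, displacement)
    lagged_rows, _ = build_predictors(upstream, downstream, displacement, use_lags=True)
    print(len(plain_rows[0]), len(lagged_rows[0]))
```

Either predictor matrix could then be passed to any regression algorithm, including those compared in the studies cited above.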

Fig. 10

a A disposition of the installed sensors in a dam. b Flow diagram of DM data analysis

A technique called Pitch and Catch was used to detect ice thickness on blades using a combination of Guided Ultrasonic Waves (GUW) and a supervised ML algorithm. Several case studies of ice on the WTB surface have been used to test and validate the approach. The data needed to be well processed before running the algorithm, using four feature extraction methods: two linear (Autoregressive (AR) and PCA) and two nonlinear (nonlinear-AR exogenous and Hierarchical non-linear PCA). Feature selection was done by NCA. Twenty ML classifiers were used, including DT, DA, SVM, K-NN and EC. The results were reasonably accurate and were verified in single frequency and multi-frequency modes [57, 58]. A different study [57, 58] used the same technique with similar features to detect dirt and mud layers on WTB. The same supervised machine learning (pattern recognition) algorithm was used to classify signals based on the fault. Another application to detect damage on WTB was proposed in [103] using an acoustic method based on Linear Regression (LR) and SVM algorithms combined with optimal feature selection to make accurate decisions. A laboratory-scale wind turbine was built with an external microphone to monitor blade damage, while being internally ensonified by wireless speakers.

To detect the integral health of wind turbines, [138] implemented a method to extract numerical characteristics and predict the health condition from a data stream acquired from sensors, as illustrated in Fig. 11. The SVM algorithm classifies the health condition of the WTB online in both time and frequency domains based on a stream of data received from sensors installed on a WT in China. The algorithm proved able to detect vibration online and predict the health condition. Another application [10] proposed a method to classify the operating regimes from coarse-resolution Supervisory Control and Data Acquisition (SCADA) data recorded by the turbine supervisory controller, and finally to classify damage of the WT using the K-NN algorithm with PCA to treat the data. Furthermore, a mix between a nonlinear curve method and other ML algorithms (SVM with different kernel functions and BPNNs) has been set up to detect scouring conditions along pipelines for thermometry-based Tunnel Health Monitoring (THM) [141]. The SVM model with a radial basis function was found to be the best classifier for scour monitoring, reaching accuracies of 99.9% and 98.9% for the training and testing sets, respectively. Other references, such as [20], measured the vibration of the gearbox, rack and pinion, and motor to detect damage in a movable bridge. Moreover, Ye et al. [135] used a single channel deflection signal for a prestressed concrete bridge, employing PCA and Ensemble Empirical Modal Decomposition (EEMD) to detect the deflection of the girder, concrete shrinkage, creep and prestress loss. Other ML algorithms and their corresponding uses are summarized in Table 5.

Fig. 11

Sensors for WTHM

5 Analysis and Discussion

Tables 1, 2, 3 and 4 present a summary of different applications of machine learning and deep learning algorithms in the field of SHM. Based on the comprehensive review provided above, the different applications, their advantages and drawbacks, along with the knowledge gaps and research needs of the different ML algorithms in SHM, have been identified and summarized.

PCA was primarily used to reduce the dimensions of data, which helps reduce computational cost and obtain higher accuracy in most cases. However, the problem of calculation time remains a drawback. PCA was used in [29] to model the vibration response of a stand in the Giuseppe Meazza stadium, and Fig. 12 displays an outline of the installed sensors. The aim was to illustrate the state of the structure in 2D or 3D space principal directions, and to interpret how this data processing considers the different effects of operational and environmental conditions. The results showed good agreement with actual temperature and humidity values and therefore provide a good simulation of the behavior of the structure during major events like concerts and football matches.

Fig. 12

Sensors installed in Giuseppe Meazza Stadium, Italy

NNs can work with so-called “incomplete knowledge”, in that they can produce output even with incomplete information after successful training. NNs perform very well with repetitive events, so they can learn and make decisions based on similar tasks already done (supervised learning). Another key point is that NNs are fault tolerant to a certain extent: the corruption of one or more cells of the NN will not prevent it from producing an output. Most applications in the open literature were in the field of DHM, because of the simplicity and accuracy of NNs compared to traditional statistical and heuristic models. Despite their great success in some areas of research, NNs are now outdated in SHM applications. More advanced ML algorithms are being implemented to achieve a balance between the performance of the network and its computational time.

BPNNs can be easily distracted in the case of noisy data and can lead to erroneous results, including overfitting and drastic deterioration of the classification or regression task. However, BPNNs performed very well in bridge and building health monitoring as mentioned in Sect. 4.1.2. One of the greatest advantages of BPNN is that it simplifies the network structure by removing the unnecessary weighted links that do not have valuable effect on the trained network.

More recently, CNNs have proved very successful in deep learning tasks, especially computer vision-based applications. CNNs outperformed traditional neural networks on conventional image recognition, classification and segmentation tasks. Another key advantage of CNNs in image recognition, compared to conventional image processing techniques and other artificial neural networks, is that the features of the images are automatically extracted and do not require manual handling. Furthermore, CNNs are very efficient in pre-training tasks and can reduce the computational time and save memory since the network does not have to be trained from scratch each time. Only the classifier must be trained based on the provided labels.

CNNs were first applied in SHM problems about five years ago. The major application aimed at detecting cracks as a first indicator of structural damage in sidewalks, asphalt pavements, concrete and steel structures. Several sub-models employing CNNs are rapidly evolving, including Inception V2 and V3, ResNet 50 and 100 and many others. However, these kinds of networks need powerful computational resources (GPUs) and massive data for training; otherwise the network will overfit and lead to erroneous results.

SVM proved its effectiveness in binary classification, training, building and regression tasks. For instance, the SVM algorithm has one important feature called “L2 Regularization”, which gives it superior generalization capability. Another characteristic of SVM is that it performs very well on non-linear data from different sensors installed on structures. The processing of such data has presented an obstacle for other kinds of algorithms, such as neural networks, especially when there is a certain change in the data. On the contrary, SVM showed great stability since such a change does not affect the hyperplane. However, the use of the SVM algorithm can be challenging since the filter or the kernel needs to be appropriately chosen to handle non-linear data, and this can lead to generating too many support vectors, which will lead to more calculation time. Moreover, the data obtained from sensors need first to be scaled manually, which reduces the time to effectively obtain classification and regression results. SVM has been applied to almost every kind of structure given its great accuracy when dealing with problems having a clear margin of separation between classes (safe structure and damaged one), but its application is still dependent on the computation time, which is one of the most important factors in AI tasks.
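The role of data scaling and kernel choice discussed above can be illustrated with a small Python sketch. It standardizes each sensor feature to zero mean and unit variance and then evaluates a radial basis function (RBF) kernel, which is the similarity measure an RBF-kernel SVM works with. The helper names, the value of gamma and the sample readings are assumptions made for illustration and are not taken from the reviewed studies.

```python
import math
from statistics import mean, pstdev

def fit_scaler(samples):
    """Compute per-feature mean and standard deviation from training samples."""
    columns = list(zip(*samples))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) or 1.0 for c in columns]  # guard against constant features
    return means, stds

def transform(sample, means, stds):
    """Standardize one sample using the fitted statistics."""
    return [(v - m) / s for v, m, s in zip(sample, means, stds)]

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

if __name__ == "__main__":
    # Hypothetical features with very different units, e.g. strain and frequency.
    readings = [[1.0, 100.0], [3.0, 300.0]]
    means, stds = fit_scaler(readings)
    scaled = [transform(r, means, stds) for r in readings]
    print("kernel on raw data:   ", rbf_kernel(readings[0], readings[1]))
    print("kernel on scaled data:", rbf_kernel(scaled[0], scaled[1]))
```

Without scaling, the feature with the largest numerical range dominates the distance inside the kernel, so the kernel value can collapse towards zero regardless of the other features. This is one reason why scaling is usually performed before training.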

Other algorithms like TGP, HMOO, K-NN, K-means clustering, and ELM were discussed in Sect. 4.3. Those algorithms were used in several applications of SHM but did not achieve the popularity of NNs and SVM. For example, ELM was first proposed in [52,53,54,55] as a tool that is faster in the training phase, which may result in better interpolation, but did not necessarily produce more precise and accurate results. For ML problems, more importance is assigned to the accuracy of the algorithm. Thus, ELM was not as credible in SHM applications.
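The reason ELM trains quickly can be seen in a minimal Python sketch: the hidden-layer weights are drawn at random and left fixed, so only the output weights are computed, in a single least-squares step rather than by iterative backpropagation. The sketch below is a generic illustration and not the formulation of [52,53,54,55]. The function names, the tanh activation, the number of hidden units and the synthetic data are assumptions.

```python
import numpy as np

def train_elm(X, y, n_hidden=20, seed=0):
    """Train a single-hidden-layer ELM: random hidden layer, least-squares output layer."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never updated
    b = rng.normal(size=n_hidden)                # random hidden biases, never updated
    H = np.tanh(X @ W + b)                       # hidden-layer outputs
    beta = np.linalg.pinv(H) @ y                 # output weights via the pseudo-inverse
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Evaluate the trained ELM on new inputs."""
    return np.tanh(X @ W + b) @ beta

if __name__ == "__main__":
    # Synthetic one-dimensional regression problem standing in for sensor data.
    X = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
    y = np.sin(3.0 * X[:, 0])
    W, b, beta = train_elm(X, y)
    error = np.mean((predict_elm(X, W, b, beta) - y) ** 2)
    print("training mean squared error:", error)
```

Because the hidden layer is random, results depend on the seed and on the number of hidden units, which is consistent with the observation above that fast training does not by itself guarantee higher accuracy.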

In the present critical review, such methods have been divided into two main categories, namely vibration-based and image-based algorithms. The strengths and weaknesses of those algorithms were investigated and critically discussed. It has been found that more dedicated studies need to be performed concerning the following aspects:

Vibration-based algorithms need to concentrate more on wind-induced vibrations, especially for high-rise buildings, bridges, and towers. Moreover, other sophisticated algorithms can be applied in SHM of civil engineering structures since they have proved their applicability and high prediction accuracy in other fields, such as mechanical and aerospace engineering. These include Naïve Bayes (NB) classifier, Self-Organizing Maps (SOM) and k-means clustering [87]. However, the main issue with the applicability of these algorithms is the accuracy of the selection of the structure concerning the number of layers and the combined algorithms with those classifiers.

For image recognition tasks using CNNs, more research is needed to maintain a robust algorithm with high accuracy using small datasets and a smaller number of convolutional blocks, since these affect the computation time and the need for high computational resources. Furthermore, this algorithm should handle the different distortions that can arise from lighting conditions, shooting distance, angle of shooting, etc.

Most algorithms that are available in the open literature are supervised learning algorithms whose training data need to be labelled manually. There is a need to implement unsupervised learning for monitoring tasks using clustering to broaden the scope of applications of CNNs. Of the existing applications, about 95% have limited detection algorithms on the shallow scale of the distribution of cracks, dealing with crack distribution, width, length, spalling, scaling and efflorescence. More advanced studies go beyond that scope to determine whether the reinforcement is exposed, the steel rebars are corroded, etc. However, in order to make algorithms more robust and therefore more appealing to the industry, researchers need to relate these concepts not only to the diagnosis level, but also to the damage mechanisms within concrete. For instance, several chemical mechanisms can occur underneath the concrete surface, while the exterior surface may appear integral and free of cracks and damage. Accordingly, further research is needed to cover the following aspects:

Relating crack initiation to concrete mixture design, curing conditions, mechanical and environmental conditions of the structure, such as the chemistry of the pore solution, mechanical loading, seismicity of the area, temperature, humidity, etc. Some phenomena that are dependent on those conditions include carbonation of the concrete cover, corrosion of steel reinforcement, freeze–thaw damage, sulfate attack, shrinkage strains and cracking, etc. While this is a major undertaking, it could be done by combining available algorithms with experimental data from techniques such as infrared thermography, radar, impact-echo and other ultrasonic techniques, half-cell potential and polarization scanning, etc. [93]. Some applications have related chemical, physical and mechanical testing conditions to associated damage. A proof-of-concept evaluation of using CNNs was performed [111]. The study aimed to identify damage features in images of concrete samples at a microscopic scale. This was based on a management protocol developed by Bérubé et al. [13]. Improved guidelines have then been proposed in [108,109,110] to optimize testing protocols and models and explore numerous distress processes in concrete, such as Alkali-Aggregate Reaction (AAR), Delayed Ettringite Formation (DEF), and cyclic Freezing and Thawing (FT). The developed approach was based on three phases. The first succeeded in predicting seven different Damage Rating Indices (DRI) features, but with an average accuracy of only 64%, due to the limited size of the microscopic image dataset. The second aimed to use the same explicit DRI formula that an expert petrographer would apply based on crack counts. The third aimed to use the refined ML algorithm for assessing other damage mechanisms, such as external and internal sulfate attack, FT damage and steel corrosion, to generate a comprehensive protocol that could be used to assess critical aging infrastructure.
Ongoing research is being carried out to improve the accuracy of phase 1 by conducting more experiments and then providing additional training data. Phase 2 is still being processed. Phase 3 will not start until phase 2 has been successfully implemented for AAR cases.

Relating the cause of cracks to structural conditions, for example by detecting the mechanical loads causing the cracks, applying fracture mechanics with the possibility of predicting the stress field around the crack [11, 51], and then assessing the remaining stresses that the structural element could resist in the short and long term. This could be broadened by empowering the algorithm to propose solutions for the diagnosed problems based on available resources, such as the knowledge of experts, international codes, etc. Another evolving research item in this field is real-time concrete crack detection, which needs more consideration and greater efforts to move from still images to video rendering that could efficiently detect cracks in a timely manner.

6 Conclusions

There has been a rapid increase in the volume of research on applications of machine learning algorithms in the field of structural health monitoring. Such studies explore the important benefits of ML, enhance its applicability and accuracy, and strive to reduce the associated computational effort. The application of ML algorithms to detect, assess, and possibly repair and rehabilitate damage in civil engineering structures is garnering increasing attention. We stand at the brink of a technological revolution where artificial intelligence could dominate what we do in structural health monitoring and the management of ageing civil infrastructure assets. In this paper, the main techniques and algorithms that have been deployed for this purpose in the open literature have been critically surveyed, discussed and analyzed. Detailed tables have been made to summarize the state of the art and provide the reader with convenient access to the volume of work that has been conducted in this domain. The advantages and limitations of these techniques have been identified and best practice recommendations for their use have been formulated. Knowledge gaps and future research needs have been outlined. This critical review should better position engineers for decision making regarding the use of machine learning and deep learning algorithms in the domain of structural health monitoring.