1 Introduction

Rapid seismic damage assessment of structures and infrastructure is critical for post-event emergency response and recovery. This is especially true for bridges in mountainous areas, which carry the main traffic flows of the transportation network and are therefore essential to emergency rescue in earthquake-stricken regions.

The most traditional method for seismic damage evaluation is on-site investigation, which involves a large number of structural engineers and related organizations (Muvafik 2014; Nie et al. 2018). This time-consuming method raises practical difficulties, such as ensuring work crew safety, improving efficiency and responding rapidly to the emergency center. As an alternative, and benefiting from the rapid development of national sensor networks, distribution maps of seismic intensity measures (IMs), including peak ground accelerations (PGAs), can be generated by terminal systems from historical data and numerical simulations. They are useful in providing near-real-time information on expected damage and losses after earthquake events (Ebrahimian et al. 2015; Chopra et al. 2015). However, this kind of coarse estimation can only be treated as a macro-level measure of economic losses and casualties, because it lacks accuracy and structure-oriented analysis.

In recent years, fragility-based approaches have been proposed for damage estimation (Wang et al. 2019; Nguyen et al. 2022; Noghabi and Bargi 2022). They can be regarded as a further application of the computed distribution maps, since a fragility curve gives the probability that a structure reaches or exceeds a certain damage state for a given input intensity (usually an IM such as PGA). Developing fragility curves for a structure requires numerous nonlinear analyses, typically by incremental dynamic analysis (IDA), to cover ground motions of different intensity levels and as many other underlying uncertainties as possible (Vamvatsikos 2011; Oncu and Yon 2017). Various finite element software packages, such as OpenSees, ANSYS, ABAQUS and SAP2000, are available for nonlinear dynamic analysis and can predict the seismic performance of structures with reasonable accuracy. However, nonlinear time history analysis (NLTHA) can be computationally demanding, especially when wide disparities in the geometric, material and other properties of structures need to be taken into account (Ramanathan 2012). Moreover, the fragility method often uses only a single-valued ground motion intensity measure; consequently, non-stationary characteristics of ground motions, such as amplitude and/or frequency content that change over time, are hard to account for. Previous studies show that such non-stationary features have a significant influence on structural response: the fragilities of structures can vary by 61% depending on the characteristics of the ground motions (Mangalathu et al. 2019; Li et al. 2022), which implies that the bias of emergency relief decisions based on fragility with a single intensity measure can be unexpectedly high (Mercedes et al. 2022).

As discussed above, a desirable approach for seismic damage estimation needs to be both accurate and computationally efficient, so that it can guide emergency response, probabilistic risk analysis and loss assessment. In the past few years, various soft computing methods have been proposed to develop metamodels for solving engineering problems. Artificial neural networks (ANNs), fuzzy logic, decision tree analysis and other popular methods have been widely used as powerful tools to alleviate the computational burden (Vazirizade et al. 2017; Ozkul et al. 2014; Karbassi et al. 2014; Kouchaki et al. 2023). Among these techniques, the convolutional neural network (CNN), a type of feed-forward ANN, has shown great capacity in identifying damage states through high-level learning from images of damaged structures. It incorporates feature extraction and classification in a single model, and its capability has been demonstrated and recognized in the field of earthquake engineering (Savino and Tondolo 2021; Qing et al. 2022; Ogunjinmi et al. 2022; Mahmoudi et al. 2023).

In this respect, CNN can be a powerful tool for seismic damage identification, given formalized images containing structural characteristics. High accuracy can be achieved only if the images carry sufficiently detailed information about the structural response. As many previous studies show, structural responses are very sensitive to ground motion features, specifically the time-domain and frequency-domain features (Spanos et al. 2007; Honda and Ahmed 2011; Krishnan and Muto 2013; Ahmadi and Anvari 2018a, b). Therefore, the time and frequency characteristics are indispensable structural information for guaranteeing recognition precision. Because NLTHA of structures excited by numerous ground motions is computationally demanding, a compromise is to use a suite of representative ground motions and establish the mapping rule between structural response and ground motion features. This mapping rule is then applied to the remaining ground motions to predict structural performance and classify damage states based on certain criteria. To sum up, this hybrid approach relies on three key techniques: the first is to present ground motion features as formalized images; the second is to establish optimal CNN models that can recognize the mapping rule between structural response and ground motion features; and the last is to quantify damage states based on structural response and an appropriate damage criterion (Ahmadi and Anvari 2018a, b; Ahmadi et al. 2021).

Based on the observations above, this study proposes a hybrid method to evaluate the seismic damage of bridges considering the time–frequency characteristics of ground motions. The ground motion features were extracted by the wavelet transform technique, and the structural response was obtained by nonlinear time history analysis of a typical RC bridge under a suite of representative ground motions consistent with the seismic hazard at the site. Damage states were then identified from the response according to damage parameters. Finally, optimized CNN models were used to establish the predictive relationship between structural damage and the characteristics of ground motions. Compared with existing damage assessment methods, the proposed framework has several advantages. First, the time–frequency dependence of structural response is emphasized by differentiating ground motions. Second, damage states are quantitatively identified by a more reasonable demand parameter, which reflects not only the degraded status but also the structural ductility, both of which are vital for emergency transportation and occupancy. At the same time, the use of deep learning remarkably enhances the assessment efficiency without sacrificing accuracy. Therefore, the method has great value in estimating seismic damage and losses after earthquake disasters.

2 Methodology

The framework of seismic damage assessment for RC bridges in this study is illustrated in Fig. 1. It can be summarized in six main steps: (1) performing NLTHA of the bridge under a suite of ground motions; (2) identifying the damage status of the structure with a performance-based criterion; (3) obtaining the time–frequency characteristics of the ground motions by wavelet transform; (4) constructing the mapping relationship between time–frequency characteristics and damage status; (5) establishing CNN models by transfer learning, then training, testing and verifying the prediction models using the datasets; (6) applying the trained model with the best performance to conduct damage prediction.

Fig. 1 The framework of the proposed method

3 Demonstration of the Proposed Method

3.1 Description of the Case Study Bridge

To demonstrate the proposed methodology in detail, a real pre-stressed concrete box girder bridge was studied in this section. It is located in a mountainous area of southwest China and has spans of 40 m + 70 m + 40 m, with a single-cell box girder and pot rubber bearings; the rectangular columns have a height of 10 m. C50 and C30 concrete were used for the box girder and piers, respectively, and HRB400 steel was used for both the longitudinal reinforcement and the transverse stirrups.

3.2 Numerical Bridge Model

The numerical bridge model was developed on the OpenSees platform, considering the mechanical behavior of bridges under seismic events. The main components were simulated as follows: (1) The deck was assumed to remain elastic under seismic excitation and was therefore modeled by elastic beam–column elements at intervals of 3 m; at important positions, such as mid-span and structural supports, the interval was reduced to 0.5 m to refine the model. The mass of the superstructure was calculated and assigned to the elements. (2) Zero-length elements were used to simulate the pot rubber bearings. This kind of element is a spring element connecting two nodes at the same location, for which only the directions and material properties need to be defined. Structural responses in the longitudinal direction were likely to be larger than those in other directions; based on the material parameters of the bearings provided by the manufacturer and the design institute, a lateral stiffness of \(7 \times 10^{3}\) kN/m was assigned to the bearings, and a stiffness of \(1 \times 10^{8}\) kN/m was adopted as an approximation for the vertical and rotational directions, since their responses are negligible under seismic excitation. (3) Columns are the key components that resist seismic ground motions and thus needed to be modeled in detail. To capture the inelastic deformation of the columns, fiber-type displacement-based beam–column elements were used. The Steel 02 model in OpenSees was used to model the stress–strain behavior of the reinforcing steel; the Concrete 01 (Kent–Park) model and the Mander model were used for unconfined and confined concrete, respectively. In the fiber section, the unconfined concrete was divided into 35 and 2 segments in the lengthwise and widthwise directions, respectively, and the confined concrete into 80 and 35 segments. The developed numerical bridge model and the behavioral models mentioned above are illustrated in Fig. 2.

Fig. 2 The illustration of the finite element model and the behavioral models: (a) the finite element model; (b) Steel 02 model; (c) Concrete 01 model; (d) Mander model

The numerical bridge model was verified by two approaches: (1) visualizing the fiber section with the OSLite software and double-checking the segments defined for the confined concrete, the unconfined concrete and the reinforcing steel; (2) establishing the numerical bridge model in another popular finite element platform (ANSYS) and comparing the vibration modes of the two numerical models. The first five vibration modes were compared, and the relative errors were between 0.3% and 8.7%, indicating that the established numerical bridge model is credible.

3.3 Nonlinear Time History Analysis of the Bridge

3.3.1 Ground Motion Package

Ground motion records are usually selected based on the response spectrum given in seismic design specifications. For the case study bridge, the seismic fortification category and intensity were Class B and degree VII, respectively, in accordance with the Chinese specifications for seismic design, and the site belonged to Class II. The design response spectrum for this bridge was then calculated according to the Chinese Specifications for Seismic Design of Highway Bridges (Ministry of Transportation of China 2020). In compliance with this response spectrum and the site condition, 128 ground motion recordings covering a wide range of earthquake intensities and frequency components were selected from the PEER database. These recordings were pre-processed for the subsequent damage assessment and prediction: the time step was uniformly set to 0.01 s, and each record was truncated to a duration of 30 s centered on the peak acceleration (15 s before to 15 s after the occurrence of PGA), to capture the most severe response and improve computational efficiency. Ground motions with a duration of less than 30 s were discarded. In the end, 100 recordings remained as the benchmark set, whose PGA distribution is shown in Fig. 3. To generate a comprehensive dataset of damage states, and especially to cover the severe damage states, amplitude scaling was applied to the benchmark recordings with scaling factors of 1.0, 1.5, 2.0, 2.5, 3.0, 3.5 and 4.0. Multiplying each of the 100 ground motions by these seven factors produced 700 scaled ground motions. The response spectra of the 700 recordings and their median spectrum are compared with the target design spectrum in Fig. 4; as shown, their spectral characteristics are consistent.
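The windowing and scaling steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each record is already sampled at 0.01 s, and the handling of a PGA that lies close to either end of a record (shifting the window so that it stays inside the record) is an assumption, since the text does not specify it. The function names are hypothetical.

```python
from typing import List, Optional

DT = 0.01       # s, uniform time step stated in the text
WINDOW = 30.0   # s, truncated duration stated in the text
SCALE_FACTORS = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]  # stated in the text


def window_around_pga(acc: List[float], dt: float = DT,
                      window: float = WINDOW) -> Optional[List[float]]:
    """Return a `window`-second segment centred on the PGA sample.

    Records shorter than the window are discarded (None), as in the text.
    Assumption: if the PGA is too close to either end of the record,
    the window is shifted so that it stays inside the record.
    """
    n_win = int(round(window / dt))  # 3000 samples for 30 s at 0.01 s
    if len(acc) < n_win:
        return None
    i_pga = max(range(len(acc)), key=lambda i: abs(acc[i]))
    start = i_pga - n_win // 2
    start = min(max(start, 0), len(acc) - n_win)  # keep the window in bounds
    return acc[start:start + n_win]


def scale_record(acc: List[float],
                 factors: List[float] = SCALE_FACTORS) -> List[List[float]]:
    """Amplitude-scale one record by every factor, one output per factor."""
    return [[f * a for a in acc] for f in factors]
```

Applied to the 100 benchmark records, `scale_record` yields 100 × 7 = 700 scaled records, matching the dataset size reported above.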

Fig. 3 The distribution of benchmark PGAs

Fig. 4 Response spectra of ground motions

3.3.2 Nonlinear Dynamic Analysis

To perform the nonlinear dynamic analysis, the following built-in solvers and commands were used in OpenSees: the norm displacement increment test, the Newton algorithm, the transformation constraint method and the UMFPACK solver. The analysis time step was set equal to the 0.01 s time step of the ground motions, giving 3000 analysis steps for each 30 s ground motion. Rayleigh damping with a damping ratio of 5% was used for this typical concrete structure. This study focused on longitudinal responses, since the longitudinal deformations were likely to be larger than the transverse deformations. The time histories of the drifts and lateral forces of the columns were extracted for analysis. Figure 5 shows, as an example, the time history of one ground motion and the corresponding structural drifts and forces.
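The step count and the Rayleigh damping coefficients follow from simple arithmetic, sketched below. The text does not state which two modes anchor the 5% damping ratio, so the frequencies of 1.0 Hz and 5.0 Hz used here are purely hypothetical placeholders, as are the function names.

```python
import math


def rayleigh_coefficients(f_i: float, f_j: float, zeta: float = 0.05):
    """Return (a0, a1) for C = a0*M + a1*K such that the damping ratio
    equals `zeta` at the two anchor frequencies f_i and f_j (in Hz)."""
    w_i, w_j = 2.0 * math.pi * f_i, 2.0 * math.pi * f_j
    a0 = 2.0 * zeta * w_i * w_j / (w_i + w_j)  # mass-proportional term
    a1 = 2.0 * zeta / (w_i + w_j)              # stiffness-proportional term
    return a0, a1


def damping_ratio(f: float, a0: float, a1: float) -> float:
    """Damping ratio implied by (a0, a1) at frequency f (in Hz)."""
    w = 2.0 * math.pi * f
    return a0 / (2.0 * w) + a1 * w / 2.0


# 30 s record at a 0.01 s step gives the number of analysis steps
n_steps = round(30.0 / 0.01)

# Hypothetical anchor frequencies; replace with the modal frequencies of the model
a0, a1 = rayleigh_coefficients(1.0, 5.0)
```

Between the two anchor frequencies the implied damping ratio is below the target, and outside them it is above, which is worth checking when the modes of interest are chosen.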

Fig. 5 Structural responses under ground motion excitation: (a) time history of ground motion; (b) lateral drift of the column; (c) lateral force of the column

3.4 Seismic Damage States

Post-earthquake investigations and many previous studies have pointed out that columns are the most representative components for characterizing the global damage state of a bridge system. Columns are usually expected to use the plastic deformation capacity provided by their plastic hinge zones to protect other components such as beams and bearings. Under this seismic design concept, superstructures often perform better during a seismic event. Therefore, the bridge damage state can be reduced to the damage state of the columns. Various damage indexes and damage classifications have been proposed in the past decades, among which the Park–Ang damage index (Park and Ang 1985) can be regarded as one of the most widely recognized, owing to its intuitive representation of the relationship between structural ductility and structural damage. Many later studies were derived from it and, in turn, verified the original method (Rajabi et al. 2013; Huang et al. 2016; Cao et al. 2019; Lakhade et al. 2020). Accordingly, this study employed the work of Park and Ang to identify the damage states of columns by the damage index (DI), which can be expressed as follows:

$$D = \frac{\delta_{m}}{\delta_{u}} + \frac{\beta}{P_{y} \delta_{u}}\int dE \tag{1}$$

In addition, the damage states of structures should be compatible with the seismic performance objectives specified in design codes. Thus, three typical damage states were employed in accordance with the specifications issued by many countries. More importantly, this classification helps in estimating the residual capacity of structures after earthquakes: the light, moderate and severe damage states correspond to continued occupancy, retrofitting and abandonment, respectively. The proposed damage states and the corresponding DI values are listed in Table 1.

Table 1 Seismic damage states and damage index

In Eq. (1), as applied to the present analysis, \(\delta_{m}\) denotes the maximum deformation of the columns under the earthquake, obtained from the NLTHA conducted in OpenSees as described above; \(\delta_{u}\) denotes the ultimate deformation of the columns under monotonic loading; \(P_{y}\) is the calculated yield strength; \(\int {dE}\) is the hysteretic energy dissipated by the columns under the earthquake; and \(\beta\) is a non-negative parameter that weights the contribution of cyclic loading (hysteretic energy) to damage.
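As a minimal sketch of how Eq. (1) can be evaluated from the drift and lateral force histories of a column, the hysteretic energy can be approximated by trapezoidal integration of force over displacement. The function names and the numerical inputs in the usage lines are hypothetical; in particular, the value of \(\beta\) is not given in the text and is chosen here only for illustration.

```python
from typing import Sequence


def hysteretic_energy(disp: Sequence[float], force: Sequence[float]) -> float:
    """Approximate the integral of force over displacement along the
    response history with the trapezoidal rule."""
    return sum(0.5 * (force[k] + force[k - 1]) * (disp[k] - disp[k - 1])
               for k in range(1, len(disp)))


def park_ang_index(disp: Sequence[float], force: Sequence[float],
                   delta_u: float, p_y: float, beta: float) -> float:
    """Park-Ang damage index: D = delta_m/delta_u + beta/(P_y*delta_u) * int(dE)."""
    delta_m = max(abs(d) for d in disp)  # maximum deformation
    energy = hysteretic_energy(disp, force)
    return delta_m / delta_u + beta * energy / (p_y * delta_u)


# Hypothetical half-cycle: displacement in m, force in kN
disp = [0.0, 0.01, 0.03, 0.02]
force = [0.0, 100.0, 100.0, 0.0]
di = park_ang_index(disp, force, delta_u=0.10, p_y=100.0, beta=0.05)
```

For this made-up half-cycle the deformation term contributes 0.03/0.10 and the energy term contributes the remainder, so the index is dominated by the peak drift, as is typical for small values of \(\beta\).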

To calculate these parameters, the hysteretic curve of the column under each ground motion excitation was generated. In addition, the cyclic performance of the numerical models was validated against experimental test data available in the literature and the PEER database. For example, the hysteretic curve of the column under the ground motion shown in Fig. 5 is plotted in Fig. 6, in which the cyclic behavior was validated against the experiment conducted by Priestley and Benzoni (1996), which had similar structural parameters and seismic excitation. As shown, the hysteretic performance of the numerical column model matched the experimental data very well, indicating that the simulation results were reliable. The distribution of all calculated DI values is plotted in Fig. 7.

Fig. 6 Hysteretic curve of the column

Fig. 7 Distribution of calculated Park index

3.5 The Approach of Seismic Damage Assessment

3.5.1 Extracting Time–Frequency Characteristics of Ground Motions

Transient signals encountered in earthquake engineering and structural dynamics are inherently non-stationary, in the sense that both their frequency content and amplitude vary with time. Previous studies pointed out that capturing the time-varying dominant frequencies present in a seismic accelerogram facilitates the assessment of its structural damage potential, and that the damage assessment varies significantly depending on the characteristics of the ground motions (Iyama and Kuwamura 1999). Herein, the continuous wavelet transform was used to extract the time–frequency characteristics (TFC) of each earthquake accelerogram selected above, that is, to identify the distribution of frequency components and the times at which they appear. Unlike ordinary Fourier analysis, which can only provide an "average" spectral decomposition of a signal, the wavelet transform resolves both scales and locations, which improves the accuracy of seismic damage estimation. After many trials, the "Complex Gaussian wavelet 8" was found to give the clearest representation as the mother wavelet and was therefore used for all accelerograms. Note that these images form the database for the CNN networks and therefore needed further processing: they were cropped to the same size and appearance to discard meaningless information that would interfere with CNN learning, such as coordinate axes, titles and legends. An image processed by continuous wavelet transform is shown in Fig. 8. As seen, the strongest frequency components were between 1.0 and 2.5 Hz and occurred around 8 s.

Fig. 8 Time–frequency characteristics presented by wavelet transform: (a) wavelet image; (b) cropped image
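A continuous wavelet transform of this kind can be sketched with the PyWavelets library, whose `cgau8` wavelet is the eighth-order complex Gaussian wavelet. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the choice of PyWavelets, the frequency grid (0.5 to 10 Hz in 39 steps) and the function names are all assumptions, and the rendering of the scalogram to a cropped image is omitted.

```python
import numpy as np
import pywt


def scalogram(acc, dt=0.01, f_min=0.5, f_max=10.0, n_freq=39, wavelet="cgau8"):
    """Magnitude of the continuous wavelet transform on a linear frequency grid.

    Returns (freqs, power), where power has shape (n_freq, len(acc)).
    The frequency range is an assumed choice, not taken from the paper.
    """
    target = np.linspace(f_min, f_max, n_freq)
    # A scale s corresponds to frequency f = fc / (s * dt),
    # with fc the central frequency of the mother wavelet.
    scales = pywt.central_frequency(wavelet) / (target * dt)
    coefs, freqs = pywt.cwt(np.asarray(acc, dtype=float), scales, wavelet,
                            sampling_period=dt)
    return freqs, np.abs(coefs)


def dominant_component(freqs, power, dt=0.01):
    """Frequency (Hz) and time (s) at which the scalogram is strongest."""
    i_f, i_t = np.unravel_index(np.argmax(power), power.shape)
    return float(freqs[i_f]), float(i_t * dt)


# Synthetic check: a 2 Hz burst centred at 8 s in a 30 s record
t = np.arange(3000) * 0.01
acc = np.exp(-((t - 8.0) / 1.5) ** 2) * np.sin(2.0 * np.pi * 2.0 * t)
freqs, power = scalogram(acc)
f_dom, t_dom = dominant_component(freqs, power)
```

The `power` array is what would be rendered as the wavelet image; saving it without axes, titles or legends corresponds to the cropping step described in the text.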

3.5.2 The Mapping Relationship Between Time–Frequency Characteristics and Seismic Damage

The ultimate purpose was to predict structural damage in future earthquake events by constructing the mapping relationship between accelerograms and damage states. The former were represented by their TFC and the latter were quantified by the Park–Ang index as listed in Table 1. Each processed TFC image could thus be classified into the light, moderate or severe damage state and tagged as Green (available), Yellow (retrofit) or Red (abandon), respectively, according to the damage index it produced. Taking the image in Fig. 8 as an example, under this ground motion excitation the bridge had a damage index of 0.30 and the image was thus tagged as Yellow, as described in Table 1. This means that any ground motion with similar time–frequency characteristics is likely to cause moderate damage to the structure. In this way, the mapping relationship between a ground motion and its destructive capacity could be established.
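The tagging step can be sketched as a simple threshold rule. The threshold values of Table 1 are not reproduced in this text, so the two limits passed in the usage lines (0.2 and 0.5) are hypothetical placeholders chosen only so that the example index of 0.30 falls in the Yellow class, as stated above. The names `Tag` and `tag_from_index` are also hypothetical.

```python
from enum import Enum


class Tag(Enum):
    GREEN = "light"      # available
    YELLOW = "moderate"  # retrofit
    RED = "severe"       # abandon


def tag_from_index(di: float, light_max: float, moderate_max: float) -> Tag:
    """Map a Park-Ang damage index to a damage tag.

    `light_max` and `moderate_max` are the upper limits of the light and
    moderate states; the actual values should be taken from Table 1.
    """
    if di < light_max:
        return Tag.GREEN
    if di < moderate_max:
        return Tag.YELLOW
    return Tag.RED


# Placeholder limits (NOT the values of Table 1)
label = tag_from_index(0.30, light_max=0.2, moderate_max=0.5)
```

Each wavelet image then inherits the tag of the damage index computed for its ground motion, which gives the labelled dataset used for training.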

3.5.3 CNN Network Structures

3.5.3.1 A Brief Introduction of CNN

CNN has emerged as a powerful tool for classifying images into categories, based on its distinguished ability to learn image features (Lei et al. 2019; Waheed et al. 2023). The typical structure of a CNN includes input, hidden and output layers; the hidden layers contain the core of the network, namely the convolution, pooling and fully connected layers. Specifically, the input layers perform preliminary processing to extract initial features, which are then passed through a series of filters in the convolution layers to extract local image features. These convolved features then undergo nonlinear operations through an activation function and are delivered to the next layers. Pooling layers compute a statistic over a position and its adjacent area and output the maximum value or the result of another pooling operation. The outputs of the feature extraction are passed through a fully connected layer to produce the distribution of the output responses.
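The three operations described above (filtering, nonlinear activation and pooling) can be illustrated on a tiny image with plain Python. This is a didactic sketch, not a network used in the study; the function names, the 5 × 5 test image and the 2 × 2 edge kernel are made up for the example, and the "convolution" is the unflipped sliding-window product that deep learning frameworks implement.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(fmap):
    """Activation: keep positive responses, set negative ones to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2, stride=2):
    """Keep the maximum of each size x size neighbourhood."""
    out_h = (len(fmap) - size) // stride + 1
    out_w = (len(fmap[0]) - size) // stride + 1
    return [[max(fmap[r * stride + i][c * stride + j]
                 for i in range(size) for j in range(size))
             for c in range(out_w)]
            for r in range(out_h)]


# A 5 x 5 image with a vertical edge and a kernel that responds to it
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]
features = max_pool(relu(conv2d_valid(image, kernel)))
```

The pooled map is nonzero only where the edge lies, which is the sense in which convolution layers extract local features and pooling layers summarize them.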

3.5.3.2 Transfer Learning of Various CNN Architectures

So far, a large number of CNN models have been developed in various engineering fields. Creating a CNN architecture from scratch requires a huge amount of data. To improve applicability and practicability, some researchers proposed the transfer learning method, which adapts an architecture trained on one dataset for use on other datasets (Yang et al. 2013; Karbalayghareh et al. 2018; Oeztuerk et al. 2022). Previous studies have verified the outstanding capacity of transfer learning, which has evolved from solving specific problems to more comprehensive ones. Therefore, this study used various pre-trained CNN models as the initial architectures for seismic damage assessment, and these network structures were further modified by the transfer learning technique.

Herein, six popular CNN networks were selected. Four classical ones, AlexNet (Krizhevsky et al. 2012; Abd Almisreb et al. 2018), VGG16Net (Simonyan and Zisserman 2015; Zhang 2021), ResNet34 and ResNet101 (He et al. 2016; Feng et al. 2019), are widely used models in the deep learning field. The other two were developed more recently; they are lightweight architectures that drastically reduce the required computational resources and working time and can even be used on small mobile devices (Sandler et al. 2018; Howard et al. 2019; Sinha and El-Sharkawy 2019). In this paper, these CNN networks were constructed in Python under the PyTorch deep learning framework developed by Facebook. To optimize the CNN models, the same settings were used for all six networks: programming was done in the PyCharm editor, the learning rate was set to 0.0001, and the dropout probability was set to 0.5. Based on the size of the training dataset, a batch size of 32 and 50 epochs were selected. GPU acceleration was also used to improve training efficiency.

As mentioned above, 700 ground motions were processed by wavelet transform. Of the resulting images, 70% were used for network training and the remaining 30% for testing; that is, the training dataset had 490 wavelet images and the testing dataset had 210. Since the CNN networks require input images of 224 × 224 pixels, some pre-processing also needed to be conducted on the wavelet images before network training, including resizing the images and enlarging the amount of sample data by random rotation or mirroring. These operations do not change the information in the images but provide more samples with different angles and directions, which remarkably enhances recognition accuracy. The transfer learning and training process is briefly described for each network below. Interested readers can find more details in the cited literature.
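The 70/30 split and the mirroring augmentation can be sketched with the standard library alone. The random seed and the function names are hypothetical, and the text does not say whether the split was stratified by damage class, so a plain random split is assumed here.

```python
import random


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and testing parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # the seed is a hypothetical choice
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


def mirror(image):
    """Left-right mirror of an image stored as a list of pixel rows."""
    return [row[::-1] for row in image]


train_idx, test_idx = split_indices(700)
```

With 700 images this gives 490 training and 210 testing indices, matching the dataset sizes stated above; augmentation such as `mirror` would normally be applied to the training part only.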

  1. AlexNet structure

AlexNet consists of input, convolution, pooling, local response normalization (LRN) and fully connected layers. For the seismic damage prediction of interest in this study, the last fully connected layer for output classification was replaced with one having three neurons, in accordance with the three damage states defined above. In the PyTorch framework, the operating parameters such as learning rate, dropout probability and batch size were set accordingly. Then, the refined CNN network was trained and tested on the image datasets to output the recognized damage tags as well as several evaluation indexes.

  2. VGG16Net structure

VGG16Net is one of the two main types of VGGNet; it has 13 convolution layers and three fully connected layers, and a pooling layer follows each block of convolution layers. The convolution kernel in every convolution layer is fixed at 3 × 3, but the number of kernels differs between layers. Max pooling is used in this network, with a fixed 2 × 2 pooling kernel that moves with a stride of two between pooling operations. At the end of the network architecture, three fully connected layers classify the outputs. Similarly, the VGG16Net structure was modified for the three damage states and then trained and tested on the same datasets.

  3. ResNet34 and ResNet101

ResNet34 and ResNet101 belong to the same ResNet family. They have a similar structure but differ in the number of layers, output dimensions and some other parameters. The two networks were refined by the same transfer learning procedure: first, the last (fully connected) layer was replaced with a three-way fully connected classification layer; second, the new model was initialized with the pre-trained weights of the ResNet networks; and finally, the new model was trained and its parameters were updated. Note that using the pre-trained parameters remarkably reduced training time without sacrificing recognition accuracy, since the front layers capture universal image features and can thus be reused in other network models.

  4. MobileNetV2 and MobileNetV3

With the development of deep learning techniques, recognition rates have greatly improved and can even approach 100%, provided that a massive number of parameters are trained and updated. This requires high-end computational resources and long processing times. Such requirements are unfriendly to mobile or embedded devices, whose hardware resources are too limited to handle large-scale network models. Yet portable devices are the ones most likely to be available for damage investigation after seismic disasters. Therefore, small-scale network models that provide sufficient accuracy are preferred for quick damage assessment. MobileNet, developed by Google, strikes this balance and is a milestone among CNN networks. In this study, MobileNetV2 and MobileNetV3-Large, two excellent candidates of the MobileNet family, were refined by transfer learning for damage recognition. Specifically, the last layer was replaced by a new fully connected layer with three outputs; the pre-trained weights were used for the preceding layers, and the new last layer was then trained and updated.

3.6 Results of Seismic Damage Recognition

As mentioned previously, the AlexNet, VGG16Net, ResNet34, ResNet101, MobileNetV2 and MobileNetV3 networks were trained on the training dataset to find the best prediction model, and the performance of each model was evaluated on the test set. The results are presented in the form of confusion matrices (Ahmad et al. 2022), which also indicate the performance of the classification algorithm in each model. For both the training set and the test set, correctly predicted entries appear as diagonal elements of the confusion matrix, while incorrectly predicted entries appear as off-diagonal terms. In addition, several indexes were used to evaluate the performance of the algorithm for a given class (Green, Yellow or Red herein): (a) precision, the number of items correctly predicted as the class divided by the total number of items predicted as the class; (b) recall, the number of items correctly predicted as the class divided by the number of items that truly belong to the class; (c) specificity, the number of items of other classes correctly predicted as not belonging to the class divided by the total number of items of other classes. For example, in the confusion matrix of the training set in Fig. 9g, 239 Green, 63 Yellow and 108 Red samples were correctly predicted by the ResNet101 network. For the Green class, the three evaluation indexes were 0.952, 0.972 and 0.951, respectively.
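These indexes can be computed directly from a confusion matrix, as sketched below. The matrix in the usage lines is made up for illustration and is not taken from Fig. 9; the convention that rows are true classes and columns are predicted classes is also an assumption, as are the function names.

```python
def class_metrics(cm, k):
    """Precision, recall and specificity of class k.

    `cm[i][j]` is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]                                # class k predicted as k
    fn = sum(cm[k]) - tp                         # class k predicted as other
    fp = sum(cm[r][k] for r in range(n)) - tp    # other predicted as k
    tn = total - tp - fn - fp                    # other predicted as other
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def accuracy(cm):
    """Fraction of all samples on the diagonal of the confusion matrix."""
    return sum(cm[i][i] for i in range(len(cm))) / sum(sum(row) for row in cm)


# Illustrative matrix only (order: Green, Yellow, Red); not data from the paper
cm = [[50, 3, 2],
      [4, 20, 6],
      [1, 5, 30]]
green = class_metrics(cm, 0)
overall = accuracy(cm)
```

In a three-class problem each class is scored one-versus-rest in this way, so every class has its own precision, recall and specificity, while accuracy is a single value for the whole matrix.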

Fig. 9 Confusion matrix for six CNN models: a + d for AlexNet; b + e for VGG16Net; c + f for ResNet34; g + j for ResNet101; h + k for MobileNetV2; i + l for MobileNetV3

4 Comparison of Results

4.1 Comparison of Various CNN Models Constructed in this Study

To compare the performance of the six CNN architectures, their accuracy, precision, recall and specificity are compared in Fig. 10.

Fig. 10 Comparison of evaluation performance of the six CNN models: (a) accuracy of performance; (b) precision of test sets; (c) recall of test sets; (d) specificity of test sets

In general, the recognition accuracy of all network models was acceptable, and ResNet101 achieved the best prediction performance, reaching an accuracy of 83.5% for both training and testing datasets, which implies its high potential for structural damage assessment in future earthquake events. Among the three tag categories, the Green class attained the highest precision and recall, which can be attributed to its large number of image samples. However, the performance on the Yellow class was unfavorable, as it was incorrectly recognized as Red in several cases; this problem may be solved by adding more ground motions of medium intensity. The two small-scale CNN models needed far fewer parameters than the four large-scale ones, but their accuracy did not decrease sharply. Taking the classical VGG16Net model as an example, its number of parameters was 60 and 32 times those of MobileNetV2 and MobileNetV3, respectively, yet its accuracy was even lower than that of the latter two. Small-scale networks may therefore be the best option for earthquake emergency response when computing resources and time are limited.

To demonstrate the application of the proposed approach to damage prediction, and also to examine the functionality of the CNN models in reverse, three wavelet images were randomly picked from the Green, Yellow and Red classes and treated as unknown ground motion inputs, and the constructed CNN models were then used to predict the seismic damage states. The predicted results are compared with the actual results in detail in Table 2. The uncertainty parameters related to model number and model size are listed in Table 3.

Table 2 Comparison of predicted results and actual results
Table 3 Uncertainty parameters for CNN models

4.2 Comparison with Other Study

Other researchers have also studied seismic damage assessment. One example is the work of Mangalathu and Jeon (2020), which considered a non-ductile building frame and a bridge and analyzed structural damage based on the drifts of reinforced beam joints, column joints and columns, respectively. However, modern structures are usually designed with a ductile concept, and ductile capacity has become a universal feature of most of them; therefore, damage measures that represent ductile failure would be more reasonable. To validate this statement, the recognition results of this study were compared with those of the referenced work. The results for the same ResNet101 network are listed in Table 4. As seen, the recognition precision improved significantly from 76.2 to 83.3% and the recall from 83.3 to 95.2%, indicating the positive effect of performance-based criteria on seismic damage assessment (Fig. 11).

Table 4 Evaluation performance of this study and other study
Fig. 11 Comparison of evaluation performance of the proposed method with other study: (a) precision of test sets; (b) recall of test sets

5 Conclusions

In this study, a hybrid approach to seismic damage assessment was proposed. The method emphasizes the effects of the time–frequency characteristics of ground motions on structural damage and quantifies their relationship by constructing a mapping rule and criterion. Specifically, the continuous wavelet transform was used to extract the time–frequency characteristics of earthquake accelerograms, and, by performing nonlinear dynamic analysis, structural damage was quantified and classified into three tagged states based on the Park damage index. Taking advantage of the powerful learning ability of CNN architectures, the mapping rule between TFC and structural damage was established. As a result, seismic damage can be predicted once wavelet images are input into the trained networks. The method was applied to a real pre-stressed concrete box girder bridge, and its ability to classify damage states was validated, indicating its great potential for rapid post-earthquake assessment. The main findings can be summarized as follows:

  (1) Capturing the non-stationary characteristics of strong ground motions is necessary for structural damage analysis, and the wavelet transform is an efficient technique for providing the scales and locations of frequency components in seismic acceleration signals. These time–frequency characteristics have profound effects on structural damage estimation, and the processed ground motions provide a credible foundation for further analysis.

  (2) The mapping rule constructed between TFC and damage states based on the Park index proved to be preferable, because it intuitively expresses the relationship between structural ductility and structural damage. The three tagged damage states comply reasonably with seismic design criteria and are convenient for judging the availability of structures after disaster events.

  (3) CNN architectures showed a powerful capacity for seismic damage recognition with high prediction accuracy. The choice of CNN model is vital for evaluation performance, and rebuilding pre-trained networks by transfer learning is feasible for seismic assessment. The ResNet101 network had satisfactory performance with an accuracy of up to 83.3%. It is recommended if computational resources are available after an earthquake, and the lighter MobileNetV3 network can be a good alternative if resources are limited, since it is compatible with mobile devices and also has acceptable prediction accuracy.

It should be noted that the damage state was quantified by the Park index of the columns; exploring damage parameters for the whole bridge system (including local and global damage) may give a more comprehensive basis for seismic loss estimation. In addition, some other neural networks (such as graph convolutional networks) have also exhibited outstanding performance in image identification, and their application to rapid seismic damage assessment could be explored further. Moreover, more bridge types need to be covered to further validate this method.