High-speed railway seismic response prediction using CNN-LSTM hybrid neural network

Zhang, Xuebing; Xie, Xiaonan; Tang, Shenghua; Zhao, Han; Shi, Xueji; Wang, Li; Wu, Han; Xiang, Ping

doi:10.1007/s13349-023-00758-6

Original Paper
Published: 11 March 2024

Volume 14, pages 1125–1139, (2024)

Journal of Civil Structural Health Monitoring

Xuebing Zhang¹,
Xiaonan Xie¹,
Shenghua Tang¹,
Han Zhao²,
Xueji Shi³,
Li Wang¹,
Han Wu¹ &
…
Ping Xiang ORCID: orcid.org/0000-0002-1636-4111²

Abstract

In addressing the challenges of analyzing seismic response data for high-speed railroads, this research introduces a hybrid prediction model combining convolutional neural networks (CNN) and long short-term memory networks (LSTM). The model's novelty lies in its ability to significantly improve the precision of fiber grating monitoring for high-speed railroads. Using quasi-distributed fiber optic gratings, seven grating monitoring points were placed on each fiber to capture responses of the track plate, rail, base plate, and beam during seismic activities. Using data from peripheral gratings, the model predicts the central point's response. A continuous feature map, formed via a time-sliding window from the rail's acquisition location, undergoes initial feature extraction with the CNN. These features are then arranged as a sequence for the LSTM network, which produces the prediction. Empirical results validate the model's efficacy, with an RMSE of 0.3753, an MAE of 0.2968, and an R² of 0.9371, underscoring its potential in earthquake response analysis for rail infrastructures.

1 Introduction

To ensure the safety and operational efficiency of high-speed trains, railway bridges must maintain sufficient stiffness and high natural frequencies. They must also withstand low-intensity earthquakes without damage and ensure the safety of passengers within train carriages during high-intensity seismic events. With the extensive laying of seamless lines, the ballastless track provides some longitudinal restraint for bridges, which improves bridge integrity and links neighboring simply supported girder spans into a connected structure [1].

China railway track system Type II (CRTSII) is a typical structural style of ballastless track [2]. It is specifically developed to address the demands of railroad bridges, particularly those that span considerable distances. CRTSII has found application on several long railway lines, including the Beijing-Tianjin, Shanghai-Hangzhou, and Beijing-Shanghai routes [3,4,5,6].

China's high-speed railways have expanded dramatically over the last several years [7, 8]. The impact of the track structure on the bridge's seismic resistance cannot be ignored, as the track bears part of the ground vibration transmitted by the foundation during an earthquake. Ensuring train running safety is therefore of both theoretical importance and practical engineering value [4,5,6, 9,10,11].

Researchers assessing the seismic performance of a simply supported girder bridge and CRTSII track slabs [12, 13] found that stresses in the rails, track plates, and base plates occurred close to the abutments or anchorages of the bridge. Montenegro et al. [14] proposed an analytical approach for nonlinear train-bridge interaction and assessed running safety using safety criteria from the literature. Zhao et al. [4,5,6] proposed the velocity-related SI index for the first time and overcame limitations of the conventional train running safety indices. Su et al. [15] used spectrally similar short- and long-period ground vibrations to analyze how ground motion duration affects the seismic behavior of reinforced concrete (RC) bridge piers. To evaluate the seismic performance of the track structure from a stochastic perspective, Li et al. [16] evaluated the stochastic seismic response of a high-speed railroad's coupled rail-bridge system using a probability density evolution technique.

In the realm of seismic testing methods, the shaking table test methodology is critical for precise earthquake vibration simulation when studying structural behavior in a laboratory setting [17]. Jiang et al. [18] conducted experiments on continuous girder bridges for high-speed railroads to analyze the structural damage state and investigate the effects of various seismic levels and installation orientations on structural seismic response. Wang et al. [19] examined liquefaction's impact on seismic safety and bridge collapse dynamics using shake-table tests on pile-group supported bridges in liquefiable and non-liquefiable sands. Yang et al. [20] studied the effects of collision on the lateral seismic response of a bridge and the damping effect of rubber buffers through a series of shaking table experiments on a 1/6 scale bridge model.

Over the past few decades, Fiber Bragg Grating (FBG) sensors have gained prominence in assessing the structural condition of existing infrastructure. Zhang et al. [21] employed quasi-distributed optical fiber technology to investigate the force characteristics, ductility performance, and damage mechanisms of the ballastless CRTS II plate in shear failure scenarios. Chan et al. [22] provided an extensive conceptual study of fiber grating sensors in current infrastructures. Wang et al. [23,24,25] employed strain transfer analysis to assess the utility of fiber optic sensing technology for in-situ monitoring of the structural integrity of single and multi-layer asphalt pavements. To monitor the deterioration response of asymmetrical concrete-reinforced cracking structures exposed to increasing seismic stress, Zhang et al. [26, 27] utilized fiber-optic grating sensors. In the context of bridge engineering, Lu et al. [28] investigated the dynamic and stationary pressure division technique for large-scale strain gauges on substantial-span rigid bridges under vehicle loading using an externally attached fiber-optic grating strain transmitter. Zhao et al. [29] built a train-bridge dynamics model and investigated the influence of temperature deformation of the main beam on beam deflection generated by the train.

Based on existing fiber grating detection examples in engineering, this paper employs fiber grating Wavelength Division Multiplexing (WDM) technology [22] to realize a quasi-distributed fiber grating sensing system. Multiple FBG sensors are connected in series on a single fiber [30,31,32], and the series fiber grating is attached to the scaled CRTS II track model to achieve long-range multi-point acquisition. The advent of advanced technologies such as big data, machine learning, and artificial intelligence has heralded novel concepts and methodologies in seismic mitigation theories and technologies for bridge structures.

Artificial neural networks (ANNs) have been widely used in recent years to predict structural seismic responses [33,34,35], damage state [36, 37], and failure mode [38], as well as to evaluate structural seismic performance [39] and damage state [40] by demonstrating superior nonlinear function modeling capability [41]. Wang et al. [31, 32] investigated machine learning (ML) methodologies for an accurate estimate of bearing deformation and column drift ratio responses of bridges, particularly those supported by extended pile shafts. To forecast the time series of seismic reactions of ground structures, one-dimensional convolutional neural networks (1D-CNN) and long-short term memory (LSTM) neural networks were built using extensive research on artificial neural networks [27, 35, 42]. Several recent studies have delved into the potential of deep learning in this domain. For instance, Bilal et al. [43] developed an early warning system for earthquake prediction from seismic data using batch normalized graph convolutional neural network with attention mechanism that can successfully predict the depth and magnitude of an earthquake event at any number of seismic stations in any number of locations.

Similarly, Zhang et al. [21] and Zhao et al. [44] utilized recurrent neural networks (RNNs), particularly LSTM networks, to model temporal sequences of seismic data, showcasing their efficacy in real-time earthquake detection. Furthermore, hybrid models that combine the strengths of multiple deep-learning architectures have gained traction.

In light of these advancements, this research aims to further the understanding of seismic response prediction through deep learning techniques. It proposes a CNN-LSTM hybrid model for response prediction to increase the accuracy of seismic response fitting. The approach combines the features of the CNN and the LSTM network, employs quasi-distributed fiber gratings to gather the stresses of simply supported girder bridges under seismic impacts, and creates a continuous feature map of the observed grating locations, seismic orientations, and peak accelerations as input. With this deep learning algorithm, the model predicts the strain across the span, showing the potential for improved accuracy in seismic response predictions.

However, it's essential to acknowledge the potential limitations of our proposed method. While our model has demonstrated superior performance in controlled experiments, its computational effectiveness in real-life scenarios, especially for high-speed railway bridges, warrants further exploration. Deep learning models, particularly hybrid ones, can be computationally intensive. When applied to complex structures like high-speed railway bridges with vast amounts of data, the computational time might increase, potentially affecting real-time applications. Moreover, the model's accuracy could be influenced by the quality and quantity of the data available. In real-world scenarios, data might be noisy, incomplete, or not as diverse as in controlled experiments, which could impact the model's predictions.

Nevertheless, the promising results from our research provide a strong foundation for future studies. Further optimizations and refinements, both in terms of the model architecture and data processing, could address these limitations, making the method more suitable for real-life applications.

2 Data gathering and processing

A scaled simply supported girder bridge was constructed on a shaking table testing system [11], and quasi-distributed fiber-optic gratings were installed at the track plates of the scaled bridge's mid-span section to measure the strain response in various directions along the same line.

2.1 Seismic table experimental setup

In this study, we use the multi-span simply supported girders of the CRTSII plate ballastless track system as the research object, create a scaled-down model of the bridge with a similarity ratio of 1:10, and build a bridge operation test platform with four rows of shaking tables.

The prototype of the scaled-down model is a Chinese high-speed railway simply supported box girder bridge [45]. The piers are round-end solid piers with heights varying from 3 to 20 m, and the girders are concrete simply supported box girders with an overall length of 32.5 m. Piers with a height under 14 m have an equal section, whereas piers with a height over 14 m have a variable section with a slope of 1:45. The anti-fall girder mechanism has a trigger spacing of 20 cm. The basin rubber bearings have maximum vertical and horizontal bearing capacities of 5000 kN and 1000 kN, respectively. Polytetrafluoroethylene (PTFE) plates with a low coefficient of friction allow relative movement between the upper plate and the bottom basin. Under three-dimensional stress, the rubber behaves like a fluid, which allows the main beam to rotate. Seals keep the rubber from deteriorating due to exposure to air.

The track is a ballastless slab-type track system known as CRTSII. To reduce the temperature stress on the track structure, a sliding layer is inserted between the box girder and the base plate. A layer of CA mortar is installed between the base plate and the track plate as a buffer layer. Fasteners hold the rail to the track plate; transverse blocks are installed on both sides of the base plate and the track plate to limit their lateral movement; shear reinforcement is installed between the base plate and the track plate at the ends of the girder joints; and shear grooves are installed on the surface of the box girder above the fixed supports to limit the movement of the base plate. Fasteners and blocks are spaced at 0.65 m and 6.5 m, respectively.

The test used two sine wave seismic excitations, the characteristics of which are reported in Table 1. The excitation frequency is 10 Hz, and the peak accelerations are 0.1 g and 0.2 g. Figure 1 depicts the model installation. The scaled-down bridge is a 1:10 steel bridge with 11 spans, each 3.25 m long. Table 2 shows the model similarity coefficients. Steel plates were used for the fasteners. Rails, track plates, base plates, girders, and piers were made based on equivalent bending stiffness. For shear bars, shear gears, lateral blocks, and bearings, specimens of different sizes were tested based on the principle of equivalent effectiveness and displacement, and the most suitable size was selected from the experimental results. The shaking table testing system consists of one fixed 4 m by 4 m six-degree-of-freedom table and three movable 4 m by 4 m six-degree-of-freedom tables. There is an adjustable separation of 625 m between the tables in the array.

Table 1 Parameters of sine wave

Table 2 Scaled-down model similarity coefficients [46]

2.2 Data capture device with a fiber grating

An optical fiber with seven grating points is attached with epoxy resin to the track at the middle portion of the bridge span, arranged as illustrated in Fig. 2. To ensure that the grating points were uniformly distributed on the monitored structure, the fiber was pasted on the track plate so that the fourth grating point coincides with the midpoint of the bridge span. Figures 2 and 3 illustrate the data collection and schematic diagrams, respectively.

3 Model architecture

The CNN-LSTM model combines two models: CNN and LSTM. The CNN first extracts a feature vector, which is then arranged into a time-series sequence and used as input data for the LSTM network. The LSTM network then forecasts the responses. The CNN extracts spatial information from the response data at each time point, primarily via convolution and pooling operations. After feature extraction, the original response sequence is transformed into a deep feature time series, on which the LSTM model is trained. The whole procedure can be broken down into two stages: data pre-processing and model training.

3.1 Introduction to CNN models

Processing image data with a conventional ANN is inefficient because there are too many inputs and training parameters. CNNs were created to work around these constraints when analyzing image-like data. The CNN is the first genuinely successful multilayer neural network technique with high network depth scalability. Convolution and pooling reduce the number of network parameters while preserving the deep characteristics of multidimensional data. As a result, CNNs are frequently used in image recognition and computer vision. The CNN architecture comprises convolutional layers, pooling layers, and fully connected layers. Brief descriptions of these layers are provided below.

(1) Convolution operation [47].

The convolution operation is a crucial step in a CNN. The convolutional layer computes its output from its input through the convolution operation and a nonlinear activation function, as shown below.

$$y_{i} = \sigma \left( {k_{i} * x + b_{i} } \right), \quad i = 1,2...K$$

(1)

where $x$ is the convolutional layer's input with width $W_{1}$, height $H_{1}$, and depth $D_{1}$, and $*$ denotes the convolution operation. $k_{i}$ is the $i$th trainable convolutional filter, with dimensions $F \times F \times D_{1}$ (width, height, and depth, respectively); $b_{i}$ is the bias of filter $k_{i}$; $\sigma$ is the nonlinear activation function; and $y_{i}$ is the output matrix of the $i$th filter. Each layer uses $K$ convolution filters. Figure 4 shows the convolution procedure with a stride of 1. Each output $y_{i}$ has width $W_{2}$ and height $H_{2}$, and stacking the $K$ outputs gives a layer output of depth $K$. Each element of the $i$th output matrix is computed as the dot product of the $i$th filter $k_{i}$ with the corresponding patch of the input matrix.

$$\left\{ \begin{gathered} W_{2} = \frac{{W_{1} - F + 2P}}{S} + 1 \hfill \\ H_{2} = \frac{{H_{1} - F + 2P}}{S} + 1 \hfill \\ \end{gathered} \right.$$

(2)

The output width $W_{2}$ and height $H_{2}$ follow from Eq. (2), where $S$ is the stride and $P$ is the number of zeros padded on each side. When the stride is $S = 1$, setting $P = \left( {F - 1} \right)/2$ ensures that the input and output have the same spatial size. In Fig. 4, the convolution pads a layer of zeros (gray) around each side of the original input matrix (purple).
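Eqs. (1) and (2) can be illustrated with a minimal pure-Python sketch. It assumes a single filter of depth 1, ReLU as the activation $\sigma$, and the sliding dot product without filter flipping that deep learning frameworks usually call convolution. The function names `conv_output_size` and `conv2d_relu` are hypothetical and do not come from the paper.

```python
def conv_output_size(n_in, f, p=0, s=1):
    """Output length along one axis per Eq. (2): (n_in - F + 2P) / S + 1."""
    span = n_in - f + 2 * p
    if span < 0 or span % s != 0:
        raise ValueError("filter, padding, and stride do not tile the input")
    return span // s + 1


def conv2d_relu(x, k, b=0.0, p=0, s=1):
    """Apply one F x F filter k (depth 1) with bias b to a 2D input x, then ReLU."""
    f = len(k)
    h1, w1 = len(x), len(x[0])
    h2 = conv_output_size(h1, f, p, s)
    w2 = conv_output_size(w1, f, p, s)
    # Surround the input with p rows and columns of zeros on every side.
    padded = [[0.0] * (w1 + 2 * p) for _ in range(h1 + 2 * p)]
    for r in range(h1):
        for c in range(w1):
            padded[r + p][c + p] = float(x[r][c])
    out = []
    for r in range(h2):
        row = []
        for c in range(w2):
            # Dot product of the filter with the input patch, plus the bias.
            acc = b
            for i in range(f):
                for j in range(f):
                    acc += k[i][j] * padded[r * s + i][c * s + j]
            row.append(max(0.0, acc))  # ReLU stands in for sigma
        out.append(row)
    return out


x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
k = [[1, 0], [0, 1]]
print(conv2d_relu(x, k))                      # 2 x 2 output, no padding
same_p = (3 - 1) // 2                         # P = (F - 1) / 2 for F = 3
print(conv_output_size(3, 3, p=same_p, s=1))  # same size as the input
```

With $F = 3$, $S = 1$, and $P = 1$, the output keeps the 3 × 3 size of the input, which is the "same size" case described above.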

(2) Pooling and fully connected layers.

The pooling operation is a down-sampling method that reduces each sampling window of the input to a single element (such as the maximum value, the average value, or the L2 norm). Figure 5 illustrates the down-sampling procedure of the pooling layer. The output matrix has the same depth as the input matrix because the maximum within the pooling window is taken separately on each slice along the input depth dimension. The stride is usually equal to the width or height of the pooling window, and the width and height of the output can be determined in the same way as for the convolutional layer. Pooling reduces the spatial size of the representation while keeping deep features. It reduces the number of parameters in the convolutional neural network, which lowers the computing cost of the model, helps prevent overfitting, and improves the CNN's generalization capability. The fully connected layer, as the name indicates, connects every neuron in one layer to every neuron in the next.
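As a minimal sketch of the max pooling described above, the function below takes the maximum of each window on a single depth slice, with the stride defaulting to the window size. The name `max_pool2d` and the toy input are illustrative assumptions, not the paper's code.

```python
def max_pool2d(x, size=2, stride=None):
    """Max pooling on one 2D depth slice; stride defaults to the window size."""
    stride = size if stride is None else stride
    h1, w1 = len(x), len(x[0])
    h2 = (h1 - size) // stride + 1
    w2 = (w1 - size) // stride + 1
    return [
        [
            max(
                x[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(w2)
        ]
        for r in range(h2)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [7, 0, 3, 3],
    [1, 8, 2, 6],
]
print(max_pool2d(feature_map))  # 4 x 4 input reduced to 2 x 2
```

Applying the function to each depth slice separately leaves the depth unchanged, as the paragraph above notes.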

3.2 Introduction to LSTM models

The LSTM network is a modified recurrent neural network (RNN), later enhanced by the addition of a forget gate. The enhanced LSTM network mitigates the vanishing gradient problem in model training and can learn both long-term and short-term dependencies in time series. Figure 6 depicts the network's core unit.

The LSTM network's fundamental unit consists of a forget gate, an input gate, and an output gate [48]. The forget gate uses the input $x_{t}$ together with the previous intermediate output $h_{t - 1}$ to determine which part of the state memory unit $S_{t - 1}$ is forgotten. In the input gate, $x_{t}$ passes through sigmoid and tanh transformations that jointly decide which information is retained in the state memory cell. The updated state $S_{t}$, together with the output gate $o_{t}$, determines the intermediate output $h_{t}$, which is computed as follows [49].

$$f_{t} = \sigma \left( {W_{{{\text{fx}}}} x_{t} + W_{{{\text{fh}}}} h_{t - 1} + b_{f} } \right)$$

(3)

$$i_{t} = \sigma \left( {W_{{{\text{ix}}}} x_{t} + W_{{{\text{ih}}}} h_{t - 1} + b_{i} } \right)$$

(4)

$$g_{t} = \phi \left( {W_{{{\text{gx}}}} x_{t} + W_{{{\text{gh}}}} h_{t - 1} + b_{g} } \right)$$

(5)

$$o_{t} = \sigma \left( {W_{{{\text{ox}}}} x_{t} + W_{{{\text{oh}}}} h_{t - 1} + b_{{\text{o}}} } \right)$$

(6)

$$S_{t} = g_{t} \odot i_{t} + S_{t - 1} \odot f_{t}$$

(7)

$$h_{t} = \phi (S_{t} ) \odot o_{t}$$

(8)

where $f_{t}$, $i_{t}$, $g_{t}$, $o_{t}$, $h_{t}$, and $S_{t}$ are the states of the forget gate, input gate, input node, output gate, intermediate output, and state unit, respectively. $W_{{{\text{fx}}}}$, $W_{{{\text{fh}}}}$, $W_{{{\text{ix}}}}$, $W_{{{\text{ih}}}}$, $W_{{{\text{gx}}}}$, $W_{{{\text{gh}}}}$, $W_{{{\text{ox}}}}$ and $W_{{{\text{oh}}}}$ are the weight matrices of the corresponding gates, applied to the input $x_{t}$ and the intermediate output $h_{t - 1}$, respectively. $b_{f}$, $b_{i}$, $b_{g}$, $b_{o}$ are the bias terms of the associated gates. $\odot$ denotes element-wise multiplication of vectors, $\sigma$ denotes the sigmoid function, and $\phi$ denotes the tanh function.
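To make Eqs. (3) to (8) concrete, the following pure-Python sketch performs one LSTM time step. The dictionary keys (`W_fx`, `b_f`, and so on) simply mirror the symbols above; the function names, the zero-valued toy parameters, and the vector sizes are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in w]


def lstm_step(x_t, h_prev, s_prev, params):
    """One LSTM step following Eqs. (3)-(8); returns (h_t, S_t)."""
    def gate(name, act):
        zx = matvec(params["W_" + name + "x"], x_t)
        zh = matvec(params["W_" + name + "h"], h_prev)
        return [act(a + b + c) for a, b, c in zip(zx, zh, params["b_" + name])]

    f_t = gate("f", sigmoid)    # Eq. (3), forget gate
    i_t = gate("i", sigmoid)    # Eq. (4), input gate
    g_t = gate("g", math.tanh)  # Eq. (5), input node
    o_t = gate("o", sigmoid)    # Eq. (6), output gate
    # Eq. (7): new state from the gated input node and the gated old state.
    s_t = [g * i + s * f for g, i, s, f in zip(g_t, i_t, s_prev, f_t)]
    # Eq. (8): intermediate output.
    h_t = [math.tanh(s) * o for s, o in zip(s_t, o_t)]
    return h_t, s_t


def zero_params(n_in, n_hidden):
    """Toy parameters (all zeros), only to show the expected shapes."""
    params = {}
    for name in "figo":
        params["W_" + name + "x"] = [[0.0] * n_in for _ in range(n_hidden)]
        params["W_" + name + "h"] = [[0.0] * n_hidden for _ in range(n_hidden)]
        params["b_" + name] = [0.0] * n_hidden
    return params


h_t, s_t = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -2.0], zero_params(3, 2))
print(h_t, s_t)
```

With all weights and biases at zero, every sigmoid gate evaluates to 0.5 and the input node to 0, so the new state is half the previous state. This gives a quick sanity check on the update in Eq. (7).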

3.3 CNN-LSTM network hybrid model

3.3.1 Model architecture of CNN-LSTM

This paper examines the fiber grating measurement data associated with the track plate. The CNN-LSTM hybrid model is illustrated in Fig. 7. It stacks 17 functional layers in two main segments: the CNN, dedicated to feature extraction, and the LSTM network, responsible for response prediction.

Before delving into the model's architecture, it's crucial to understand the nature of the input data. In Block 1 of the CNN algorithm, the input data comprises spatial–temporal features derived from the fiber grating measurement data. These features capture both the spatial relationships inherent in the data and the temporal dynamics over time. The CNN, with its convolutional layers, is adept at extracting spatial patterns and relationships from this data. Once these spatial features are extracted by the CNN, they are then transformed into a format suitable for the LSTM.

The LSTM, being a recurrent neural network, excels at processing sequences and time-series data. By feeding the spatial information extracted by the CNN into the LSTM, we leverage the strengths of both networks: the CNN's ability to recognize spatial patterns and the LSTM's capacity to understand temporal dynamics.

The CNN-specific parameters are listed in Table 3. Our model is built with Python's scikit-learn machine learning toolkit together with the PyTorch framework. The primary input of the hybrid CNN-LSTM network is the time series feature map. Data elements such as grating location, seismic wave type, and monitoring time each retain their own character as time series. Drawing a parallel with natural language processing, we employ the word vector representation method. This technique represents the strain at each instant sequentially by aligning it with its associated features, producing a new time series dataset. Each data point encapsulates the characteristics of the historical response.

Table 3 Detailed configuration of CNN network architecture

To further refine the model's input, we employ the sliding window approach. The window width is set to 30,000 records to facilitate subsequent network computations, so the unit feature map has dimensions 30,000 × 6. Table 3 lists the sizes and step sizes of the model's five convolutional layers.
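The sliding window step can be sketched as follows. The paper specifies only the window width (30,000 records) and the six feature columns. The step between windows, the function name `sliding_windows`, and the toy series below are assumptions for illustration.

```python
def sliding_windows(records, width, step=1):
    """Yield consecutive windows of `width` records, advancing by `step`."""
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    for start in range(0, len(records) - width + 1, step):
        yield records[start:start + width]


# Toy series: 10 time steps, 6 hypothetical feature columns per record.
series = [[float(t + 0.1 * c) for c in range(6)] for t in range(10)]
maps = list(sliding_windows(series, width=4, step=2))
print(len(maps), len(maps[0]), len(maps[0][0]))  # windows, rows, columns
```

Each yielded window plays the role of one unit feature map. With the paper's settings, `width` would be 30,000 and each map would have shape 30,000 × 6.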

The input subsequence is first processed in Block1, which contains three functional layers in order: convolution, ReLU, and pooling. In the diagram, they are labeled Conv_1, ReLU_1, and Maxpool_1. Conv_1 has an input size of 30,000 × 1 × 30 and consists of 32 convolution kernels of size 1000 × 1 × 30 with a sliding step size of 100. The ReLU layer does not change the output size. The pooling layer has a size of 2 × 1 × 32 and a step size of 2. As a result, the output size of Block1 is 146 × 6 × 32. The convolutional layers extract the distinguishing features of the input data. The choice of five convolutional layers is based on LeNet-5 classification recognition results [50, 51].
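The stated Block1 output length can be checked against Eq. (2) with a short worked calculation. It uses the window length of 30,000, the kernel length of 1000, and the step size of 100 given above. The zero padding ($P = 0$) and the rounding up in the pooling layer are assumptions, since the text does not state them.

$$W_{\text{conv}} = \frac{30{,}000 - 1000 + 2 \cdot 0}{100} + 1 = 291, \qquad W_{\text{pool}} = \left\lceil \frac{291 - 2}{2} \right\rceil + 1 = 146$$

Under these assumptions the result matches the reported length of 146. Rounding down in the pooling step would give 145 instead.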

3.3.2 Experimental evaluation index

The prediction results are reviewed to validate the CNN-LSTM model's prediction accuracy. The coefficient of determination R-square (R²) [52], the Mean Absolute Error (MAE) and the Root Mean Squared Error (RMSE) [53] are used to statistically analyze the model prediction outcomes. The specifics are provided below.

$${\text{RMSE}} = \sqrt {\frac{1}{{T_{e} }}\sum\nolimits_{i = 1}^{{T_{e} }} {\left( {\hat{F}_{i} - F_{i} } \right)^{2} } }$$

(9)

$${\text{MAE}} = \frac{1}{{T_{e} }}\sum\nolimits_{i = 1}^{{T_{e} }} {\left| {\hat{F}_{i} - F_{i} } \right|}$$

(10)

$$R^{2} = 1 - \frac{{\sum\nolimits_{i = 1}^{{T_{e} }} {\left( {\hat{F}_{i} - F_{i} } \right)^{2} } }}{{\sum\nolimits_{i = 1}^{{T_{e} }} {\left( {F_{i} - \bar{F} } \right)^{2} } }}$$

(11)

The RMSE and MAE both measure how far the predicted value deviates from the measured value. The RMSE squares the deviations first, which magnifies large errors. The coefficient of determination R² measures the model's overall goodness of fit; it equals one minus the ratio of the sum of squared prediction errors to the sum of squared deviations of the measurements from their mean. Here $F_{i}$ is the measured seismic response value, $\hat{F}_{i}$ is the predicted seismic response value, $T_{e}$ is the number of detection points, and $\bar{F}$ is the mean of the measured values. The closer the RMSE and MAE are to zero, and the closer R² is to one, the higher the model's prediction accuracy.
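The three indices in Eqs. (9) to (11) can be computed with a few lines of pure Python. The sketch below is illustrative: the function names and the toy values are assumptions and are unrelated to the paper's data.

```python
import math


def rmse(pred, true):
    """Root mean squared error, Eq. (9)."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def mae(pred, true):
    """Mean absolute error, Eq. (10)."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def r2(pred, true):
    """Coefficient of determination, Eq. (11)."""
    mean_true = sum(true) / len(true)
    ss_res = sum((p - t) ** 2 for p, t in zip(pred, true))
    ss_tot = sum((t - mean_true) ** 2 for t in true)
    return 1.0 - ss_res / ss_tot


measured = [1.0, 2.0, 3.0, 4.0]   # toy values, not from the experiment
predicted = [1.1, 1.9, 3.2, 3.8]
print(rmse(predicted, measured), mae(predicted, measured), r2(predicted, measured))
```

In practice, `sklearn.metrics` provides `mean_squared_error`, `mean_absolute_error`, and `r2_score` for the same quantities.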

4 Analysis of experimental results

4.1 High-speed railroad seismic response dataset

In this part, we evaluate the grating strain response data under various seismic excitations, and the findings are displayed along with the predictions for the operating conditions in Table 2. Figure 8 displays the seismic response dataset for high-speed rail, which consists of 900 data points in total, of which 720 form the training set and 180 form the test set. Pre-processing removes 49 outliers in total. These outliers are caused by noise in the demodulator's data acquisition and are recorded as "NaN" in the original data, so they are deleted.
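The pre-processing described above can be sketched in two steps: drop records that contain NaN, then split the remainder 80/20, which matches the 720/180 ratio. The paper does not say whether the split is chronological or random, so the ordered split below is an assumption, as are the function names and the toy records.

```python
import math


def drop_nan_records(records):
    """Remove every record that contains at least one NaN value."""
    return [r for r in records if not any(math.isnan(v) for v in r)]


def ordered_split(records, train_fraction=0.8):
    """Split records in their existing order into training and test sets."""
    n_train = int(round(len(records) * train_fraction))
    return records[:n_train], records[n_train:]


# Toy records: (time index, strain); two samples are corrupted.
raw = [[float(t), 0.5 * t] for t in range(12)]
raw[3][1] = float("nan")
raw[7][1] = float("nan")

clean = drop_nan_records(raw)
train, test = ordered_split(clean)
print(len(raw), len(clean), len(train), len(test))
```

For a dataset of 900 clean records, `ordered_split` with `train_fraction=0.8` yields 720 training and 180 test records.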

4.2 Performance comparison with other deep learning-based models

In this study, we compared the performance metrics (MAE, RMSE, and R²) of the proposed CNN-LSTM method with those of three other prominent prediction techniques: long short-term memory (LSTM), backpropagation (BP), and gated recurrent unit (GRU). BP is a supervised learning algorithm used for training artificial neural networks, and the GRU is a type of recurrent neural network architecture. A detailed comparison of these methods can be found in Table 4.

Table 4 Comparison of deep learning techniques' performance

The experimental findings demonstrate that the proposed CNN-LSTM model outperforms conventional deep learning approaches in predicting the seismic response, with LSTM ranking second. The tests show that the CNN-LSTM technique is competitive for seismic response prediction. Figure 9 depicts the CNN-LSTM network's prediction results: the predicted values generally follow the observed strain trend, and the forecast accuracy is high for both large and small strain values.

4.3 Model prediction effectiveness in quasi-distributed grating monitoring

In the present study, a systematic control variable methodology was employed to facilitate incremental adjustments to the model. The potential ramifications on predictive accuracy, stemming from augmenting the model's depth, were meticulously assessed by incrementally enhancing the number of layers within the long short-term memory network. To maintain a consistent benchmark, the influence of varying long short-term memory network layers on predictive outcomes was scrutinized, while retaining a static Convolutional Neural Network layer for unaltered feature extraction. The empirical findings are succinctly presented in Table 5, wherein the tabulated data represents the mean values of the evaluative indices across the seven distinct points of examination. It was discerned that while augmenting the number of long short-term memory network layers to deepen the model can potentially bolster predictive prowess, there is a concomitant increase in the error rate when the long short-term memory layers surpass a count of four, indicative of potential overfitting. Consequently, an optimal configuration of a three-layer long short-term memory network was adopted for the experiments conducted in this study.

Table 5 Model combination structure test results

Figure 10 shows the cross-entropy loss for the training and validation sets over the course of training. Training is halted once the cross-entropy no longer declines within a specified duration. As Fig. 10 shows, there is a clear gap between the training and validation losses. This gap narrows markedly during the first five cycles and reaches its minimum at twenty-eight epochs. By the fortieth epoch, the gap begins to widen sharply and becomes unstable, which indicates the onset of overfitting. Training is therefore stopped after fifty epochs, and the CNN-LSTM model at twenty-eight epochs, which minimizes the gap, is identified as the optimal model.
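The two rules described above can be sketched in plain Python: stop when the validation loss has not improved for a set number of epochs, and pick the epoch with the smallest gap between training and validation loss. The patience value, the function names, and the toy loss curves are assumptions, since the paper does not give the exact stopping criterion.

```python
def early_stop_epoch(val_loss, patience):
    """Return the 1-based epoch at which training would stop."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_loss, start=1):
        if loss < best:
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_loss)


def best_epoch_by_gap(train_loss, val_loss):
    """Return the 1-based epoch with the smallest train/validation gap."""
    gaps = [abs(v - t) for t, v in zip(train_loss, val_loss)]
    return min(range(len(gaps)), key=gaps.__getitem__) + 1


# Toy loss curves, not taken from Fig. 10.
train_curve = [1.0, 0.6, 0.4, 0.3, 0.2]
val_curve = [1.5, 0.9, 0.45, 0.5, 0.6]
print(early_stop_epoch(val_curve, patience=2))
print(best_epoch_by_gap(train_curve, val_curve))
```

In a real training loop, the model weights at the selected epoch would be saved as a checkpoint so that the chosen model can be restored after training stops.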

Figure 11 provides an illustrative comparison between the anticipated and actual values derived from the intermediate grating point algorithm model. Figure 11a–f elucidates that, in the context of the strain response under varied seismic excitations, the CNN-LSTM algorithm model retains its capability to extract strain information from one location based on the grating strain response observed at different locations concurrently. However, it is noteworthy that the congruence of data at peak values exhibits some deviations. A closer examination of Fig. 11g, h reveals that the deep learning model proposed in this manuscript exhibits enhanced applicability when considering the track plate, rail, and base plate, as opposed to its performance with the box girder. The strategic positioning of the grating intimates that the box girder might be situated at a considerable distance from the primary site, with the model predominantly relying on strain data sourced from the track plate. Given the proximity of the rail and base plate to the track plate, their predictive outcomes are more aligned. Conversely, the box girder, being remote from the track and serving as a pivotal bearing point for seismic excitations, manifests strain values that deviate significantly from other locations. This divergence culminates in a suboptimal prediction performance for the box girder.

5 Conclusion

In this study, we installed a quasi-distributed fiber grating system on the track plate, rail, base plate, and beam to monitor strain variations in a simple beam bridge model excited on a shaking table. We then introduced a hybrid model that combines CNN and LSTM networks: the CNN extracts salient features from the data, while the LSTM handles the time-series analysis. The main findings are as follows.

(1) Using time-sliding windows as input, we construct continuous feature maps from multi-source data. This approach exploits the feature extraction capability of convolutional neural networks to draw out more of the relevant information contained in the dataset. The resulting feature vectors, arranged as a time series, form the input to the long short-term memory network, a configuration well suited to the nonlinear interactions and temporal characteristics of the response data.
(2) The CNN-LSTM hybrid model, which combines the capabilities of both networks, proved robust and effective in our analysis. Measured by MAE, RMSE, and R², the hybrid model outperforms the individual networks, offering better feature representation and higher predictive accuracy.
(3) The CNN-LSTM hybrid model is an effective tool for predicting the seismic response of bridges from fiber grating measurements. It is suitable for most of the monitored locations, which indicates broad applicability in civil engineering.
(4) Maintaining gratings is challenging because of inherent material properties and unforeseen strain variations at the monitored sites. This study shows that deep learning can predict strain at other locations from the data of the measured grating points. Such an approach could reduce monitoring costs and prevent data loss caused by grating failure.
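The sliding-window construction in item (1) and the metrics named in item (2) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the strain record is taken to be a list of time steps, each holding one value per grating; the label of each window is assumed to be the target grating's strain at the window's last time step; and the helper names `make_windows` and `regression_metrics` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Window = List[List[float]]  # one feature map: window rows x peripheral gratings


def make_windows(strain: Sequence[Sequence[float]],
                 target_col: int,
                 window: int) -> Tuple[List[Window], List[float]]:
    """Slice a (time, gratings) strain record into sliding-window samples.

    Each sample stacks `window` consecutive time steps of every grating
    except `target_col`. Its label (an assumption of this sketch) is the
    target grating's strain at the last time step of the window.
    """
    if window < 1 or window > len(strain):
        raise ValueError("window must be between 1 and the record length")
    features: List[Window] = []
    labels: List[float] = []
    for end in range(window, len(strain) + 1):
        rows = strain[end - window:end]
        features.append(
            [[v for j, v in enumerate(row) if j != target_col] for row in rows]
        )
        labels.append(strain[end - 1][target_col])
    return features, labels


def regression_metrics(y_true: Sequence[float],
                       y_pred: Sequence[float]) -> Tuple[float, float, float]:
    """Return (RMSE, MAE, R^2) for paired true and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and equally long")
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for a constant target")
    return rmse, mae, 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy record: 5 time steps x 3 gratings; column 1 plays the central point.
    record = [[1, 10, 100], [2, 20, 200], [3, 30, 300],
              [4, 40, 400], [5, 50, 500]]
    X, y = make_windows(record, target_col=1, window=3)
    print(len(X), y)
    print(regression_metrics([1, 2, 3, 4], [1, 2, 3, 6]))
```

In a full pipeline each window would be passed to the CNN for feature extraction and the resulting feature sequence to the LSTM. Those stages are omitted here because the network configuration is not restated in this section.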

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This research work was jointly supported by the National Natural Science Foundation of China (Grant No. 11972379), and Hunan Science Fund for Distinguished Young Scholars (2021JJ10061).

Author information

Authors and Affiliations

College of Civil Engineering, Xiangtan University, Xiangtan, 411105, Hunan, China
Xuebing Zhang, Xiaonan Xie, Shenghua Tang, Li Wang & Han Wu
School of Civil Engineering, Central South University, Changsha, 410075, Hunan, China
Han Zhao & Ping Xiang
School of Transportation, Southeast University, Nanjing, 211189, Jiangsu, China
Xueji Shi

Contributions

XZ: Supervision, Writing – review and editing, Funding acquisition. XX: Conceptualization, Methodology, Software, Validation, Writing – original draft. ST: Supervision. HZ: Validation, Investigation. XS: Conceptualization. LW: Conceptualization, Data curation. HW: Visualization. PX: Supervision, Writing – review and editing, Funding acquisition.

Corresponding author

Correspondence to Ping Xiang.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

About this article

Cite this article

Zhang, X., Xie, X., Tang, S. et al. High-speed railway seismic response prediction using CNN-LSTM hybrid neural network. J Civil Struct Health Monit 14, 1125–1139 (2024). https://doi.org/10.1007/s13349-023-00758-6

Received: 10 May 2023
Accepted: 22 December 2023
Published: 11 March 2024
Issue Date: June 2024
DOI: https://doi.org/10.1007/s13349-023-00758-6

1 Introduction