1 Introduction

In recent years, China's railway transportation system has been continuously developing. The track circuit, an important means of ensuring traffic safety and improving transport efficiency [1, 2], is a necessary component in train detection [3]. It is designed to be fail-safe to prevent accidents [4].

Currently, the ZPW-2000A track circuit is used for train occupancy detection and information transmission at railway stations. At present, China's railway operating mileage has reached 130,000 km, of which the operating mileage of high-speed railways exceeds 20,000 km. In order to improve safety, it is necessary to be able to determine which component should be repaired or replaced before a fault turns into a failure [5]. Therefore, it is important to design a safe and reliable intelligent fault diagnosis method.

In recent years, data-driven deep learning methods [6, 7] have been widely used and have achieved state-of-the-art results in various challenging fields, such as image processing and fog prediction. Some of the research in these fields can be applied to the problem of fault diagnosis.

Sequence data correlation can be explained as the temporal or spatial correlation in sequence observations. Normal system behavior depends largely on the previous environment, which may evolve over time.

Convolutional neural networks (CNNs) are widely used in image recognition and classification tasks. They can automatically learn higher-level features from data, and convolution kernels operate only in local areas, thus better capturing local features.

Long short-term memory (LSTM) network [8] has been shown to be particularly suitable for modeling patterned sequences [9, 10]. Experiments also show that when processing time series data, LSTM performs well in searching along the sequence and modeling long-range contextual information.

Yu et al. proposed a hierarchical deep learning algorithm with 3 long short-term memory (LSTM) modules and a dropout layer for each module [11]. Hu et al. proposed a fault diagnosis method that combines gray theory with an expert system [12]. Tian and Liu proposed a deep convolutional neural network (CNN) for real-time roller bearing damage detection [13], comprising 3 convolutional layers, 3 pooling layers, and 1 fully connected layer, with the softmax function for final fault prediction. Zhao et al. established a tuning zone simulation model and a back propagation neural network according to the working principle of the tuning zone of the track circuit [14].

The various algorithms and methods discussed above have achieved satisfactory results. However, there are still some limitations:

  1. Manual feature extraction is a very laborious and time-consuming task, which usually requires a lot of core knowledge related to signal processing and mathematics.

  2. The model used to obtain experimental data is not complete enough, and only a limited number of fault types can be diagnosed.

  3. There is still considerable room for improvement in the accuracy of fault diagnosis.

In this article, we propose a CNN-LSTM fault diagnosis model to solve the aforementioned issues. The CNN-LSTM consists of a feature extractor and a classifier, allowing raw one-dimensional time series data to be directly input into the model for automatic feature extraction of track circuit fault signals. The extracted features are then input into a classifier, an LSTM network. This method first uses CNN to extract local features from the data, which better captures the valuable features in the data. Then, it makes use of the advantages of the LSTM network in processing time-series data to capture the temporal dependence features of the data and perform classification. The method has the following advantages:

  1. A compact network structure allows for real-time state detection with direct input of raw data.

  2. Features are automatically learned from the raw signal, eliminating the need for data preprocessing.

  3. It is capable of accurately identifying six track circuit states: the normal state and five types of faults.

The rest of this paper is organized as follows. Section 2 introduces related work. Section 3 describes the proposed method in detail. In Sect. 4 a series of experiments are constructed and the experimental results and analysis are given. Finally, conclusions are drawn in Sect. 5.

2 Related Works

2.1 The Basic Principle of ZPW-2000A Track Circuits

In China, in order to ensure the efficiency and safety of train transportation, ZPW-2000A track circuits are used to detect train occupancy and transmit information. The working principle of the ZPW-2000A track circuit is to divide the track into several segments and set compensation capacitors between each segment to form a circular circuit [15]. When a train passes through a certain track segment, it will cause a short circuit in that segment of the track circuit, causing a change in the current flowing through the circuit and sending relevant information to the signal system.

At the station, the track circuit is powered by the transmitter. When the track circuit is idle, the transmitter generates a high-precision, stable, and powerful frequency-shift signal, which reaches the main rail and the running rail through the SPT cable and the transmitter transformer. It then reaches the adjacent track circuit receiver transformer, SPT cable, and attenuator through the running rail. It also reaches the reception device through the receiving end transformer, SPT cable, and attenuator in this section. The specific components are shown in Fig. 1. When the voltage on the main rail is not less than 240 mV, and the voltage on the running rail is not less than 100 mV, the relay picks up to indicate that the section is unoccupied. When the voltage on the main rail is less than 140 mV, the relay drops, indicating that the section is occupied by a train.

Fig. 1. Diagram of the working rules of ZPW-2000A track circuit

The ZPW-2000A track circuit consists of the following components:

  (1) Transmitter: It generates high-precision, stable, and powerful frequency-shift signals for automatic block signaling, locomotive signaling, and overspeed protection.

  (2) Receiver: It receives signals from the main rail and checks the status of the associated tuning section running rail circuit.

  (3) Two rails: They can be considered as the transmission lines for signal transmission within the track circuit.

  (4) Attenuator: It adjusts the voltage of the main rail circuit and the tuning section running rail circuit.

  (5) Matching transformer: It changes the rail voltage. It is used to reduce the rail voltage at the transmitter end and increase the rail voltage at the receiver end.

  (6) Compensation capacitor: It extends the transmission distance of the signal.

  (7) SPT cable: It is used to connect all components of the track circuit.

  (8) Tuning unit: It achieves electrical isolation between adjacent track circuits.

The signal transmitted through the track flows into the receiving devices of this section and the adjacent sections. The track relay is energized only when the following two conditions are met simultaneously:

  1. The receiving voltage on the main rail is not less than 240 mV;

  2. The receiving voltage on the running rail is not less than 100 mV.

When these two conditions cannot be met at the same time, the track relay is in the de-energized state.
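The two pick-up conditions can be written as a small predicate. This is only a minimal sketch: the function and constant names are hypothetical, and only the two thresholds come from the text.

```python
MAIN_RAIL_PICKUP_MV = 240.0     # main rail threshold stated in the text
RUNNING_RAIL_PICKUP_MV = 100.0  # running rail threshold stated in the text


def relay_energized(main_rail_mv: float, running_rail_mv: float) -> bool:
    """Return True only when both receiving voltages reach their thresholds."""
    return (main_rail_mv >= MAIN_RAIL_PICKUP_MV
            and running_rail_mv >= RUNNING_RAIL_PICKUP_MV)


print(relay_energized(500.0, 130.0))  # both conditions met
print(relay_energized(500.0, 90.0))   # running rail voltage too low
```

Because the two conditions are combined with a logical AND, a drop in either receiving voltage is enough to de-energize the relay.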

3 Method

3.1 Fault Diagnosis

There are two states in the track circuit: the adjust state and the route-divide state. The adjust state refers to the situation where there is no train in the section. In this state, the rail voltage must be high enough for the relay to pick up correctly. The route-divide state refers to the situation where a train is running in the section. In this state, the rail voltage needs to be low enough to ensure that the relay drops down correctly. To ensure the safe operation of trains, it is very important to detect all possible faults in the detection system. Each type of equipment fault has different electrical characteristics, which can cause different voltage values in the track circuit. This article considers measuring the voltages of three adjacent track circuits in the same geographic area over a long period of time because different faults may affect adjacent track circuits. The types of faults considered in this article are as follows:

  1. Tuning unit fault: The tuning unit is used to achieve electrical isolation. When this device fails, it may affect the rail voltage of adjacent sections.

  2. Transformer fault: The transformer is mainly used to raise or lower the voltage. When the transformer fails, the rail voltage will drop.

  3. Attenuator fault: The attenuator mainly adjusts the voltage of the main rail and the tuning section running rail circuit. If this device is faulty, it will cause the receiving voltage of the main rail and the tuning section running rail circuit to drop, which may lead to the wrong judgment that the section is occupied by a train.

  4. Broken rail: The steel rail can be regarded as the transmission line for the signal. When the line is disconnected, the receiving voltage of the main rail and the relay voltage may drop, which may lead to the wrong judgment that the section is occupied by a train.

  5. Compensation capacitor fault: In order to extend the transmission distance of the signal, compensation capacitors are installed in the rails. If these compensation capacitors are seriously damaged, it will cause the receiving voltage of the main rail and the relay voltage to drop, which may lead to the wrong judgment that the section is occupied by a train.

The classes studied in this article are the normal state of the track circuit and the five types of faults above, which are labeled 0, 1, 2, 3, 4, and 5, respectively.

3.2 Network Architecture

This network consists of a feature extraction module and a classification module, as shown in Fig. 2. The network adopts an end-to-end approach, directly inputting the raw data into the network. The feature extraction module consists of a convolutional neural network [16, 17], including an input layer, three convolutional layers, and a pooling layer. The convolution kernel sizes are set to 3, 5, and 1, respectively. As the figure shows, the input data undergoes two rounds of convolution with ReLU activation and one pooling operation to extract local features. Finally, the number of feature channels is adjusted by a third convolution.

Fig. 2. The CNN-LSTM network architecture for fault diagnosis
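The feature extraction path described above (convolution with kernel size 3, ReLU, convolution with kernel size 5, ReLU, pooling, then a kernel-size-1 convolution) can be sketched in plain NumPy. The text does not give the channel counts, the pooling size, or how the 600 inputs are arranged, so the 5 x 120 input layout, the 16/32/32 channels, the pooling size of 2, and the random weights below are all assumptions made for illustration.

```python
import numpy as np


def conv1d(x, w, b):
    """Valid 1-D convolution. x: (C_in, L); w: (C_out, C_in, K); b: (C_out,)."""
    k = w.shape[2]
    # windows has shape (C_in, L - K + 1, K)
    windows = np.lib.stride_tricks.sliding_window_view(x, k, axis=1)
    return np.einsum("clk,ock->ol", windows, w) + b[:, None]


def relu(x):
    return np.maximum(x, 0.0)


def max_pool1d(x, size):
    """Non-overlapping max pooling along the time axis."""
    length = (x.shape[1] // size) * size
    return x[:, :length].reshape(x.shape[0], -1, size).max(axis=2)


def init_conv(rng, c_out, c_in, k):
    """Random weights and zero biases (placeholders for trained parameters)."""
    return rng.normal(0.0, 0.1, size=(c_out, c_in, k)), np.zeros(c_out)


def extract_features(x, params, pool_size=2):
    (w1, b1), (w2, b2), (w3, b3) = params
    h = relu(conv1d(x, w1, b1))    # kernel size 3
    h = relu(conv1d(h, w2, b2))    # kernel size 5
    h = max_pool1d(h, pool_size)   # the single pooling operation
    return conv1d(h, w3, b3)       # kernel size 1 adjusts the channel count


rng = np.random.default_rng(0)
params = [init_conv(rng, 16, 5, 3),    # assumed channel counts
          init_conv(rng, 32, 16, 5),
          init_conv(rng, 32, 32, 1)]
x = rng.normal(size=(5, 120))          # assumed layout: 5 channels x 120 samples
features = extract_features(x, params)
print(features.shape)                  # (32, 57)
```

With these assumed sizes the time axis shrinks from 120 to 118, then 114, and pooling halves it to 57, while the kernel-size-1 convolution changes only the channel dimension.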

The classification module consists of an LSTM network [18, 19], and the number of LSTM units has a significant impact on the classification performance. Too few units may give poor results, but a larger network does not always improve results or reduce training time. In this paper, the classification module includes an input layer, an LSTM layer, and a fully connected layer. First, the locally extracted features are used as inputs; the LSTM layer then extracts the temporal features of the data; finally, the fully connected layer outputs the fault type. The LSTM layer includes 64 LSTM units. Empirically, this architecture reliably produces good results for this problem.
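A forward pass through the classification module can be sketched as a single LSTM layer followed by a fully connected layer. Only the 64 LSTM units and the six output classes come from the text; the feature dimension, the sequence length, the gate ordering, and the random weights are assumptions, and the standard LSTM cell equations are used.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_last_hidden(seq, w_x, w_h, b):
    """Run one LSTM layer over seq (T, D) and return the final hidden state.

    w_x: (4H, D), w_h: (4H, H), b: (4H,).
    Assumed gate order in the stacked weights: input, forget, cell, output.
    """
    hidden = w_h.shape[1]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x_t in seq:
        i, f, g, o = np.split(w_x @ x_t + w_h @ h + b, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update the cell state
        h = sigmoid(o) * np.tanh(c)                   # expose the hidden state
    return h


HIDDEN = 64        # number of LSTM units stated in the text
N_CLASSES = 6      # normal state plus five fault types
FEATURE_DIM = 32   # assumed size of each CNN feature vector
SEQ_LEN = 57       # assumed number of time steps after pooling

rng = np.random.default_rng(0)
features = rng.normal(size=(SEQ_LEN, FEATURE_DIM))  # stand-in for CNN output
w_x = rng.normal(0.0, 0.1, size=(4 * HIDDEN, FEATURE_DIM))
w_h = rng.normal(0.0, 0.1, size=(4 * HIDDEN, HIDDEN))
b = np.zeros(4 * HIDDEN)
w_fc = rng.normal(0.0, 0.1, size=(N_CLASSES, HIDDEN))
b_fc = np.zeros(N_CLASSES)

h_last = lstm_last_hidden(features, w_x, w_h, b)
logits = w_fc @ h_last + b_fc   # fully connected layer, one score per class
print(h_last.shape, logits.shape)
```

Using only the final hidden state for classification is one common design; the text does not say whether the authors use the last state or all time steps.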

(1) Input data

Through the analysis of the time characteristics of the fault signal, we found that the minimum duration of the fault signal was 120 s. In order to fully utilize the temporal and spatial dependencies of the fault, the main rail and small rail voltage magnitudes of each of the three track circuits are sampled 120 times within 120 s. The voltage of the rear small rail is not collected, which leaves five voltage channels, so a total of 5 × 120 = 600 voltage values are obtained as inputs to the network.
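The assembly of the 600 input values can be illustrated as follows. This is a sketch under assumptions: the dictionary layout, the channel order, and the function name are hypothetical, and the constant voltages are placeholders rather than measured data.

```python
import numpy as np

N_SAMPLES = 120  # samples per channel within the 120 s window (from the text)


def build_input(front, middle, rear):
    """Stack the five measured voltage channels into a (5, 120) array.

    Each argument is a dict with "main" and "small" voltage sequences.
    The small rail voltage of the rear section is deliberately left out.
    """
    channels = [front["main"], front["small"],
                middle["main"], middle["small"],
                rear["main"]]
    x = np.asarray(channels, dtype=float)
    if x.shape != (5, N_SAMPLES):
        raise ValueError(f"expected shape (5, {N_SAMPLES}), got {x.shape}")
    return x


# Placeholder signals: constant voltages in mV for every section.
section = {"main": np.full(N_SAMPLES, 500.0), "small": np.full(N_SAMPLES, 130.0)}
x = build_input(section, section, section)
print(x.shape, x.size)  # (5, 120) 600
```

Keeping the channels as separate rows preserves which section each value came from; flattening with `x.reshape(-1)` gives the 600-element vector.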

(2) Output data

The output layer of the network consists of six softmax classification units, one of which represents the healthy state, and the other five are associated with these fault types.
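The mapping from six output scores to a fault label can be sketched with a numerically stable softmax. The label order follows the numbering 0 to 5 used in Sect. 3.1; the example scores are hypothetical.

```python
import numpy as np

CLASS_NAMES = [
    "normal",                        # label 0
    "tuning unit fault",             # label 1
    "transformer fault",             # label 2
    "attenuator fault",              # label 3
    "broken rail",                   # label 4
    "compensation capacitor fault",  # label 5
]


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    shifted = np.asarray(logits, dtype=float) - np.max(logits)  # avoids overflow
    exp = np.exp(shifted)
    return exp / exp.sum()


logits = np.array([0.2, -1.0, 3.1, 0.5, -0.3, 0.0])  # hypothetical network scores
probs = softmax(logits)
predicted = int(np.argmax(probs))
print(predicted, CLASS_NAMES[predicted])
```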

3.3 Network Training

We use the cross-entropy function as the loss function in the LSTM network to evaluate network performance. After the model produces its output, the cross-entropy between the output value and the sample label is calculated, which measures how closely the actual output matches the expected output. The formula is as follows:

$$ L = - \left[ y\log \hat{y} + (1 - y)\log (1 - \hat{y}) \right] $$
(1)

where y is the true label value (1 for the positive class, 0 for the negative class) and \(\hat{y}\) is the predicted probability (\(\hat{y} \in (0,1)\)). The loss represents the difference between the true sample label and the predicted probability. In this article, the optimizer is stochastic gradient descent (SGD) with a learning rate of 0.01, and the parameters are updated through the backpropagation algorithm.
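The loss in Eq. (1) and one SGD update can be sketched as below. Eq. (1) is the binary form; the categorical form shown next to it is the usual extension to six softmax outputs and is an assumption about how the loss is applied here. The toy parameter vector (a bias feeding the softmax directly) is hypothetical and stands in for the full network.

```python
import numpy as np

LEARNING_RATE = 0.01  # value stated in the text


def binary_cross_entropy(y, y_hat, eps=1e-12):
    """Eq. (1): y is the 0/1 label, y_hat the predicted probability."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    return -(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))


def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()


def categorical_cross_entropy(label, probs, eps=1e-12):
    """Multi-class counterpart: negative log-probability of the true class."""
    return -np.log(max(probs[label], eps))


def sgd_step(params, grads, lr=LEARNING_RATE):
    return params - lr * grads


# Toy example: six bias terms fed straight into softmax, true class 2.
label = 2
bias = np.zeros(6)
probs = softmax(bias)
loss_before = categorical_cross_entropy(label, probs)

grad = probs.copy()
grad[label] -= 1.0              # gradient of the loss w.r.t. the logits
bias = sgd_step(bias, grad)
loss_after = categorical_cross_entropy(label, softmax(bias))
print(loss_before > loss_after)
```

In the full network the same gradient is propagated back through the fully connected, LSTM, and convolutional layers by backpropagation.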

3.4 The Equivalent Circuit Model of ZPW-2000A Track Circuits

Deep learning performs well when trained on a large dataset. In practice, however, it is not feasible to collect a large amount of on-site data from ZPW-2000A track circuits, so the data used in this study were obtained through simulation. A ZPW-2000A track circuit simulation model is proposed to obtain the experimental data, as shown in Fig. 3.

Fig. 3. The equivalent model of ZPW-2000A track circuits

The basis of this model is a quantitative understanding of the system, the impact of faults, and a limited set of measurement data obtained from real-world track circuits. The ZPW-2000A track circuit equivalent circuit model and description of data acquisition are as follows.

The voltage (U1) of the transmitter is the input data. Based on the equivalent model, the main rail and the small rail voltages can be calculated using Eqs. (2) and (3).

$$ \left[ \begin{array}{c} U_{1} \\ I_{1} \end{array} \right] = T_{SPT} \cdot T_{Mt} \cdot T_{CC} \cdot T_{RAIL} \cdot T_{Mt} \cdot T_{Mt} \cdot T_{SPT} \cdot T_{A1} \cdot \left[ \begin{array}{c} U_{8} \\ I_{8} \end{array} \right] $$
(2)
$$\begin{aligned} &\left[ \begin{array}{c} U_{1} \\ I_{1} \end{array} \right] = T_{SPT} \cdot T_{Mt} \cdot T_{CC} \cdot T_{F1} \cdot T_{RAIL} \cdot T_{SVA}\\&\quad \cdot T_{RAIL} \cdot T_{F2} \cdot T_{Mt} \cdot T_{Mt} \cdot T_{SPT} \cdot T_{A2} \cdot \left[ \begin{array}{c} U_{8} \\ I_{8} \end{array} \right] \end{aligned}$$
(3)

where U1 and I1 are the voltage and current of the transmitter, TSPT is the transmission cable parameter matrix; TMt is the matching transformer parameter matrix; TCC is the compensating capacitor parameter matrix; TRAIL is the rail parameter matrix; TA1 is the attenuator main track circuit parameter matrix; TA2 is the attenuator small track circuit parameter matrix; TF1 is the tuning unit 1 parameter matrix; TF2 is the tuning unit 2 parameter matrix; TSVA is the hollow coil parameter matrix.
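The cascade of two-port matrices in Eqs. (2) and (3) can be illustrated numerically. Solving for the output voltage requires a termination, which the text does not specify, so the load impedance `z_load` is an assumption, as are the helper names; the series-resistor and shunt-resistor stages are a simple check case, not track circuit components.

```python
from functools import reduce

import numpy as np


def cascade(*matrices):
    """Multiply T-parameter matrices in order from sending end to receiving end."""
    return reduce(np.matmul, matrices)


def series_impedance(z):
    return np.array([[1.0, z], [0.0, 1.0]], dtype=complex)


def shunt_admittance(y):
    return np.array([[1.0, 0.0], [y, 1.0]], dtype=complex)


def output_voltage(u_in, t_total, z_load):
    """Solve [U_in, I_in]^T = T [U_out, I_out]^T with U_out = z_load * I_out."""
    a, b = t_total[0, 0], t_total[0, 1]
    return u_in / (a + b / z_load)


# Check case: 100 ohm in series, then 100 ohm in shunt, into a 100 ohm load.
t_total = cascade(series_impedance(100.0), shunt_admittance(1.0 / 100.0))
u_out = output_voltage(1.0, t_total, z_load=100.0)
print(abs(u_out))  # one third of the input, as a hand-computed divider gives
```

The same `cascade` call would take the matrices of Eq. (2) or Eq. (3) in the order written, since each matrix maps the quantities on its receiving side to those on its sending side.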

SPT Cable

The SPT cable can be considered as a uniform transmission line. Based on the transmission line theory, the T parameter matrix TSPT of the SPT cable can be defined as:

$$ T_{SPT} = \left[ \begin{array}{cc} \cosh (\gamma l) & Z_{c}\sinh (\gamma l) \\ \frac{\sinh (\gamma l)}{Z_{c}} & \cosh (\gamma l) \end{array} \right] $$
(4)
$$ \left\{ \begin{array}{l} \gamma = \sqrt{Z_{0} Y_{0}} \\ Z_{c} = \sqrt{Z_{0} / Y_{0}} \\ Z_{0} = R_{2} + jwL_{2} \\ Y_{0} = G_{2} + jwC_{2} \end{array} \right. $$
(5)

where \(\gamma\) is the propagation constant of the cable, Zc is the characteristic impedance of the cable, l is the length of the cable, Z0 is the series impedance of the cable per unit length, Y0 is the shunt admittance of the cable per unit length, w is the signal angular frequency, and R2, L2, G2, and C2 are the resistance, inductance, conductance, and capacitance of the cable per unit length.
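The cable matrix can be evaluated numerically from the distributed parameters. The sketch below uses the standard uniform transmission-line form, with sinh(γl)/Zc in the lower-left entry, for which the determinant equals 1. The per-kilometre parameter values, the cable length, and the 1700 Hz signal frequency are placeholder assumptions, not values from the paper.

```python
import numpy as np


def line_matrix(r, l, g, c, length, freq_hz):
    """T-parameter matrix of a uniform line from per-unit-length R, L, G, C."""
    w = 2.0 * np.pi * freq_hz
    z0 = r + 1j * w * l          # series impedance per unit length
    y0 = g + 1j * w * c          # shunt admittance per unit length
    gamma = np.sqrt(z0 * y0)     # propagation constant
    zc = np.sqrt(z0 / y0)        # characteristic impedance
    gl = gamma * length
    return np.array([[np.cosh(gl), zc * np.sinh(gl)],
                     [np.sinh(gl) / zc, np.cosh(gl)]])


# Placeholder per-km values for illustration only.
t_spt = line_matrix(r=45.0, l=0.8e-3, g=1e-6, c=28e-9, length=10.0, freq_hz=1700.0)
print(abs(np.linalg.det(t_spt)))  # close to 1 for a reciprocal two-port
```

The same function applies to the rail matrix in Eq. (8), with the rail's propagation constant and characteristic impedance in place of the cable's.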

Matching Transformer

According to the two-port theory and equivalent circuit, the T parameter matrix TMt of a matching transformer can be defined as:

$$ T_{Mt} = \left[ \begin{array}{cc} \frac{n}{m} & 2j\left( \frac{m}{n} \cdot wL_{t1} - \frac{n}{m} \cdot \frac{1}{wC_{t1}} \right) \\ 0 & \frac{m}{n} \end{array} \right] $$
(6)

where n and m are the number of turns of the primary and secondary coils of the matching transformer, Lt1 is the inductance of the matching transformer, and Ct1 is the capacitance of the matching transformer.

Compensation Capacitor

According to the two-port theory and equivalent circuit, the T-parameter matrix TCC of a compensation capacitor can be defined as:

$$ T_{CC} = \left[ \begin{array}{cc} 1 & 0 \\ jwC_{b} & 1 \end{array} \right] $$
(7)

where Cb is the capacitance of the compensation capacitor.
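Eq. (7) is the matrix of a shunt admittance jwCb, which makes the effect of a parameter change easy to see. In the sketch below, the capacitance and frequency values are placeholders, and representing a failed capacitor by Cb = 0 (capacitor disconnected) is an illustrative assumption rather than the fault model used in the paper.

```python
import numpy as np


def capacitor_matrix(c_b, freq_hz):
    """T-parameter matrix of a shunt compensation capacitor, as in Eq. (7)."""
    w = 2.0 * np.pi * freq_hz
    return np.array([[1.0, 0.0], [1j * w * c_b, 1.0]], dtype=complex)


healthy = capacitor_matrix(c_b=40e-6, freq_hz=1700.0)  # placeholder values
disconnected = capacitor_matrix(c_b=0.0, freq_hz=1700.0)

print(healthy[1, 0])                         # shunt admittance jwCb
print(np.allclose(disconnected, np.eye(2)))  # Cb = 0 reduces to the identity
```

With Cb = 0 the matrix becomes the identity, so the capacitor no longer contributes to the cascade in Eqs. (2) and (3).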

Two Steel Rails

Based on transmission line theory and equivalent circuit, the T-parameter matrix TRAIL of a rail can be defined as:

$$ T_{RAIL} = \left[ \begin{array}{cc} \cosh (\gamma_{r} d) & Z_{r}\sinh (\gamma_{r} d) \\ \frac{\sinh (\gamma_{r} d)}{Z_{r}} & \cosh (\gamma_{r} d) \end{array} \right] $$
(8)

where \(\gamma_{r}\) is the propagation constant of the rail, Zr is the characteristic impedance of the rail, and d is the length of the rail.

Attenuator

The simulation circuit of the attenuator for small track circuit adjustment is shown in Fig. 4.

Fig. 4. Simulation circuit for small track adjustment

According to two-port theory and equivalent circuit, the T-parameter matrices TA1 and TA2 of the main track circuit and the small track circuit of the attenuator can be defined as:

$$ T_{A1} = \left[ \begin{array}{cc} \frac{116}{n} & 0 \\ 0 & \frac{R_{A1} + R_{d3}}{R} \cdot \frac{n}{116} \end{array} \right] $$
(9)
$$ T_{A2} = \left[ \begin{array}{cc} 1 & R_{A2} \\ 0 & 1 \end{array} \right] $$
(10)

where RA1 represents the coil impedance of the main track circuit input end of the attenuator, Rd3 represents the resistance at the input end, and RA2 represents the series resistance of the small track circuit of the attenuator.

Tuning Unit

The tuning unit is composed of tuning unit F1, tuning unit F2 and hollow coil. Based on the two-port theory and equivalent circuit, the T-parameter matrix can be defined as:

$$ T_{F1} = \left[ \begin{array}{cc} 1 & 0 \\ \frac{jwC_{f1}}{1 - w^{2} L_{f1} C_{f1}} & 1 \end{array} \right] $$
(11)
$$ T_{SVA} = \left[ \begin{array}{cc} 1 & 0 \\ \frac{1}{jwL_{f2}} & 1 \end{array} \right] $$
(12)
$$ T_{F2} = \left[ \begin{array}{cc} 1 & 0 \\ \frac{1 - w^{2} L_{f2} C_{f2}}{jwC_{f2} + jwC_{f3}(1 - w^{2} L_{f2} C_{f2})} & 1 \end{array} \right] $$
(13)

where Lf1, Lf2 and Lf3 are inductors, and Cf1, Cf2 and Cf3 are capacitors. The voltage and current after the tuning unit can be obtained according to the above process.

4 Results

4.1 The Equivalent Circuit Model Accuracy

Through the simulation model, the voltage change across each piece of equipment in the track circuit can be obtained. When the sending terminal voltage is 135 V, the voltage that reaches the rail surface through the SPT cable and matching transformer is 2.58 V. In the actual track circuit, the voltage on the rail surface at the sending end is 2.6 V. The voltage change across the transmitting end equipment of the simulated track circuit therefore agrees closely with that of the actual track circuit.

The voltage is 0.72 V after being transmitted by the rail. Table 1 shows the voltage comparison between the two ends of the rail in the simulation circuit and the real track circuit. The data show that the voltage change in the rail simulation circuit meets the requirements of the actual rail transmission voltage change.

Table 1. Voltage at both ends of the rail

The voltage on the rail surface at the receiving end is first raised by the matching transformer and then enters the attenuator through the SPT cable. The receiving voltage of the main track and the receiving voltage of the small track are obtained through the adjusting circuit in the attenuator. The simulation results are shown in Table 2.

Table 2. Simulation circuit results

In the actual ZPW-2000A track circuit, the receiving voltage of the main track is 500 mV, and the receiving voltage of the small track reaches 130 mV. By comparing the data, the simulation results of the receiver equipment meet the requirements of the actual track circuit. In summary, the model is verified to be feasible. Therefore, by adjusting the parameter values in Eqs. (2) and (3), the equivalent circuit model can generate various data for training the network, such as compensation capacitor fault data, broken rail fault data, transformer fault data, attenuator fault data, tuning unit fault data, and normal data.

4.2 Training and Testing Results

The size of the fault dataset used in this article is 1200, with a training set size of 900 and a test set size of 300. The study employs accuracy to validate the model. The accuracy was recorded every 50 iterations. The accuracy of the test set is 98.33%, as shown in Fig. 5.

Fig. 5. Accuracy of the fault diagnosis method proposed in this paper

The confusion matrix of the fault diagnosis method is shown in Fig. 6. The method correctly classified 50 normal states, 50 tuning unit faults, 47 matching transformer faults, 48 attenuator faults, 50 broken rail faults, and 50 compensation capacitor faults. Among them, 2 transformer faults were mistakenly classified as attenuator faults, and 3 attenuator faults were mistakenly classified as matching transformer faults. The reason for this is that these two types of faults have similar characteristics.
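The reported test accuracy can be reproduced from the per-class counts of correctly classified samples quoted above. The dictionary keys are just descriptive labels.

```python
TEST_SET_SIZE = 300  # test set size stated in Sect. 4.2

correct_per_class = {
    "normal": 50,
    "tuning unit fault": 50,
    "matching transformer fault": 47,
    "attenuator fault": 48,
    "broken rail": 50,
    "compensation capacitor fault": 50,
}

n_correct = sum(correct_per_class.values())
accuracy = n_correct / TEST_SET_SIZE
print(n_correct, f"{accuracy:.2%}")  # 295 98.33%
```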

Fig. 6. Confusion matrix

4.3 Compared with Other Methods

In order to verify the superiority of our proposed method, we conducted experiments on several other fault diagnosis methods using the same dataset. As shown in Table 3, the highest accuracy among the other methods is 83.33%, whereas ours reaches 98.33%. This supports the advantage of the proposed method on this dataset.

Table 3. Comparison of results from different methods.

4.4 t-SNE

t-SNE (t-Distributed Stochastic Neighbor Embedding) is a popular data visualization technique that maps high-dimensional data to a lower-dimensional space for visualization [20]. In order to gain a deeper understanding of what the neural networks have learned, we used t-SNE to investigate whether the different network stages learned useful feature information. In Fig. 7, panel (a) displays the original data, panel (b) shows the results after the CNN feature extraction network, and panel (c) shows the results after the LSTM layer. It can be observed from the figure that the classes are separated more clearly after the CNN and LSTM stages than in the original data. This indicates that our proposed CNN-LSTM network structure can learn relevant dependencies in the data.

Fig. 7. t-SNE visualization of original fault data and different networks.

5 Conclusion

This article proposes a CNN-LSTM fault diagnosis method for ZPW-2000A track circuits that exploits spatio-temporal information. Based on the structure and working principle of track circuits, this study combines circuit principles, two-port network theory, and transmission line theory to establish an equivalent model of the ZPW-2000A track circuit. The network is trained and tested using synthesized data from the equivalent circuit model. The results show that the method classifies faults in ZPW-2000A track circuits with a test accuracy of 98.33%. Compared with the other fault diagnosis methods tested, the proposed method performs better.