
1 Introduction

As a key link in the circuit-pneumatic-mechanical control loop of the train braking system, the pneumatic valve plays an important role in the system's safety and reliability [1]. Pneumatic valves gradually deteriorate as working time accumulates, which leads to a decline in performance [2]. Because of their complex structure, the internal condition of pneumatic valves cannot be observed directly, which makes it difficult to predict when faults will occur.

Current fault prediction methods can be divided into physical model-based, statistical model-based and machine learning-based methods [3]. Physical model-based methods use thresholds on the deviation between actual and model values as indicators for prediction [4]. Chetan et al. [5] used a mathematical model describing the physical principles of component degradation, together with a Bayesian filter algorithm for parameter state estimation and fault prediction. Physical model-based methods require accurate mathematical models, but such models are difficult to build for pneumatic valves [6].

Instead of building a physical model, the statistical methods use historical data to establish health indicators of the degradation process [7]. Jin et al. [8] derived an auto-regressive model for filtering fault-independent signals to track the degradation process of the bearing and designed an extended Kalman filter for fault prediction. Principal component analysis is widely used in the field of fault prediction where the degradation state can be accurately measured according to the degree of deviation of the statistical indicators [9].

In recent years, machine learning-based methods for fault prediction have drawn a lot of attention, since they need only sufficient historical fault data to address complex fault prediction problems [10]. For example, long short-term memory (LSTM) recurrent neural networks have been used to perform multi-step-ahead voltage prediction for battery system failure prediction [11]. Hack-Eun et al. [12] introduced a support vector machine classifier to estimate the health state and to make long-term predictions of the bearing degradation state.

The purpose of this paper is to predict the degradation process in pneumatic valves and to provide early warning before a fault occurs. A support vector regression-based fault prediction method for pneumatic valves is proposed, and different fault prediction models are established for different types of faults. Firstly, the similarities in the failure characteristics of different pneumatic valves are discussed, and the features are reconstructed according to the fault types. Then a principal component analysis method is used to extract the health indicators. Finally, the support vector regression model is trained separately for each fault type to predict the time of fault occurrence.

The remainder of this paper is organized as follows. The degradation and simulation models of pneumatic valves are established in Section 2. Section 3 describes the SVR-based fault prognosis method. Simulation results are presented in Section 4. We conclude the paper in Section 5.

2 Modeling

In this section, the structure and working principle of the pneumatic valve are analyzed. Simulation models of the train/equalization control unit of the electro-air braking system are built in AMESim and MATLAB to analyze the different fault types and their effect on performance, and to obtain the fault data used to verify the proposed method.

A pneumatic valve is a mechanical valve that does not rely on electromagnetic drive. In the braking system, pneumatic valves mainly comprise relay valves and other valves. The relay valve, as one of the core parts of the braking system, is highly representative of the different types of pneumatic valves. Therefore, this paper takes the relay valve as the research object to verify the effectiveness of the method. The brake control unit is built with the Simulink toolbox in MATLAB.

The simulation model of the train/equalization control unit is realized with AMESim and the Simulink toolbox in MATLAB. The model includes two parts: the air channel and the brake control unit. The air channel is simulated with the Pneumatic Kit and the Pneumatic Component Design Kit in AMESim.

The AMESim simulation model of the relay valve consists of three parts: the air supply valve, the air evacuation valve and the main piston. The function of the relay valve is to control the pressure of the train pipe according to the pressure of the equalizing reservoir, and to realize the functions of air inflation, exhaust and maintaining pressure.

The inflation function increases the pressure in the train pipe so that the train can run normally. The exhaust function reduces the pressure in the train pipe to slow down or stop the train. The pressure maintaining function keeps the train in the braking state.

As shown in Fig. 1, this paper proposes a fault prognosis method for pneumatic valves based on health indicators and support vector regression. Different fault prognosis models are trained with the data generated by different fault types. The importance of the degradation features is evaluated, and the group of features with the highest weight for each fault type is selected as sample data. Principal component analysis is used to calculate the statistics \(T^{2}\) and squared prediction error (SPE) as health indicators. The prognosis model is established based on the support vector regression algorithm.

Fig. 1 The proposed fault prognosis framework for pneumatic valve

The framework mainly consists of two parts: feature extraction using principal component analysis, and model establishment based on support vector regression.

3 Fault Prognosis Using Health Index Extraction and Support Vector Regression

Pneumatic valves are key components of the train braking system. An effective fault prediction method can prevent pneumatic valve failures and ensure the safety and reliability of train operation. Therefore, a fault prediction model for pneumatic valves based on health indicators and support vector regression is proposed in this paper.

3.1 Health Index Extraction Based on Principal Component Analysis

After the sample data have been reconstructed according to the importance of the fault features, the health indicators corresponding to the data samples are extracted according to the fault symptom type of each sample. Because health indicators for pneumatic valves are difficult to extract, this section proposes, on the basis of the reconstructed features, a health indicator extraction method based on a principal component analysis model. An orthogonal transformation decomposes the original data space into a principal component subspace and a residual subspace, which reduces the data dimension while retaining the main information. A health indicator is then extracted from each of the two subspaces.

The reconstructed degradation features of pneumatic valves are used as the input data of the health indicator extraction method. Consider a two-dimensional data matrix \(X \in {\mathbb{R}}^{n \times m}\), where \(n\) is the number of observations and \(m\) is the number of process variables included in the principal component analysis. The matrix is decomposed as:

$$ X = \hat{X} + E = TP^{T} + E = \sum\limits_{i = 1}^{A} {t_{i} p_{i}^{T} } + E $$
(1)

where \(\hat{X}\) is the part of the data reconstructed in the principal component subspace as a sum of outer products, \(A\) is the number of retained principal components, \(t_{i}\) is the \(i\)-th principal component (score) vector, \(p_{i}\) is the corresponding projection direction (loading vector) obtained by the change of basis, and \(E \in {\mathbb{R}}^{n \times m}\) is the residual matrix containing the prediction error of the principal component model. \(E\) captures the uncertainty and the weak trends in the degradation process. From these two parts, two statistics (\(T^{2}\) and SPE) are obtained to monitor the system state, as shown in Eqs. (2) and (3):

$$ T_{k}^{2} = \sum\limits_{i = 1}^{A} {\frac{{t_{ki}^{2} }}{{\sigma_{i} }}} $$
(2)
$$ Q_{k} = \sum\limits_{j = A + 1}^{m} {t_{kj}^{2} } $$
(3)

where \(T_{k}^{2}\) and \(Q_{k}\) are the values of \(T^{2}\) and SPE at time \(k\); \(t_{ki}\) and \(t_{kj}\) are the component scores at time \(k\); and \(\sigma_{i} \in {\mathbb{R}}\) is the estimated variance of the \(i\)-th score variable. Data from the pneumatic valve in its normal working state are collected to establish the principal component model, and a separate principal component model is established for each fault type. When the pneumatic valve begins to degrade, the health indicators \(T^{2}\) and SPE gradually deviate from the values expected under the principal component model, and a deviation beyond the threshold indicates failure of the pneumatic valve. The threshold of the health indicator \(T^{2}\) can be represented as:

$$ T_{\alpha }^{2} = \frac{{A\left( {n - 1} \right)}}{n - A}F_{A,n - A,\alpha } $$
(4)

where \(n\) is the number of samples, \(A\) is the number of principal components, and \(\alpha\) is the significance level, which is set to 0.01. \(A\) and \(n - A\) are the degrees of freedom of the F distribution.

The threshold of the health indicator SPE can be represented as:

$$ Q_{\alpha } = \theta_{1} \left( {\frac{{C_{\alpha } \sqrt {2\theta_{2} h_{0}^{2} } }}{{\theta_{1} }} + \frac{{\theta_{2} h_{0} \left( {h_{0} - 1} \right)}}{{\theta_{1}^{2} }} + 1} \right)^{{\frac{1}{{h_{0} }}}} $$
(5)

where \(C_{\alpha }\) is the standard normal deviate corresponding to the significance level \(\alpha\), and \(\theta_{i}\) and \(h_{0}\) are calculated by Eqs. (6) and (7), in which \(\lambda_{j}\) denotes the \(j\)-th eigenvalue of the covariance matrix.

$$ \theta_{i} = \sum\limits_{j = A + 1}^{m} {\lambda_{j}^{i} } \left( {i = 1,2,3} \right) $$
(6)
$$ h_{0} = 1 - \frac{{2\theta_{1} \theta_{3} }}{{3\theta_{2}^{2} }} $$
(7)
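As a minimal sketch of Eqs. (5) to (7), the SPE control limit can be computed from the eigenvalues discarded by the principal component model. The function name `spe_threshold` and the eigenvalues are hypothetical, chosen only for illustration. \(h_{0}\) is computed in the standard Jackson-Mudholkar form, with \(\theta_{2}\) squared in the denominator, and \(C_{\alpha}\) is assumed to be the upper \(\alpha\) quantile of the standard normal distribution.

```python
from statistics import NormalDist


def spe_threshold(residual_eigenvalues, alpha=0.01):
    """Control limit Q_alpha of the SPE statistic (Eq. (5)).

    residual_eigenvalues: eigenvalues lambda_{A+1..m} of the covariance
    matrix that belong to the residual subspace.
    """
    # theta_i = sum_j lambda_j ** i for i = 1, 2, 3 (Eq. (6))
    theta1, theta2, theta3 = (
        sum(lam ** i for lam in residual_eigenvalues) for i in (1, 2, 3)
    )
    # h_0 (Eq. (7)), Jackson-Mudholkar form
    h0 = 1.0 - 2.0 * theta1 * theta3 / (3.0 * theta2 ** 2)
    # Assumption: C_alpha is the (1 - alpha) quantile of N(0, 1)
    c_alpha = NormalDist().inv_cdf(1.0 - alpha)
    base = (
        c_alpha * (2.0 * theta2 * h0 ** 2) ** 0.5 / theta1
        + theta2 * h0 * (h0 - 1.0) / theta1 ** 2
        + 1.0
    )
    return theta1 * base ** (1.0 / h0)


# Hypothetical residual eigenvalues, for illustration only
residual = [0.5, 0.3, 0.2]
q_99 = spe_threshold(residual, alpha=0.01)
q_95 = spe_threshold(residual, alpha=0.05)
```

A stricter significance level gives a higher limit, so `q_99` exceeds `q_95`, and both exceed \(\theta_{1}\), the expected value of SPE under normal operation.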

This section yields, for each sample, the health indicators and threshold values corresponding to its degradation symptom type. The health indicators are used to evaluate the current degradation state of the pneumatic valve and to make predictions, and the thresholds are used to judge whether a predicted health indicator corresponds to a failure.
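The two statistics of Eqs. (2) and (3) can be sketched as follows, assuming the score vector of one observation has already been obtained by projecting it onto all \(m\) principal directions. The function names and the numbers are hypothetical; the first \(A\) scores belong to the principal component subspace and the remaining ones to the residual subspace.

```python
def hotelling_t2(scores, variances, n_components):
    """T^2 of one observation: variance-scaled sum of the first A squared scores."""
    return sum(scores[i] ** 2 / variances[i] for i in range(n_components))


def spe(scores, n_components):
    """SPE (Q) of one observation: sum of squared scores in the residual subspace."""
    return sum(t ** 2 for t in scores[n_components:])


# Hypothetical observation at time k with m = 4 variables and A = 2 components
scores_k = [2.0, -1.0, 0.3, 0.1]
variances = [4.0, 1.0]  # estimated variances of the two retained scores
A = 2

t2_k = hotelling_t2(scores_k, variances, A)  # 4/4 + 1/1 = 2.0
q_k = spe(scores_k, A)                       # 0.09 + 0.01, approximately 0.10
```

Each value is then compared with its own control limit to judge the state of the valve.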

According to the \(T^{2}\) and SPE, the error conditions can be divided into four cases:

  1. \(T^{2}\) and SPE are both within their threshold limits;
  2. \(T^{2}\) is within its threshold limit, but SPE is larger than its threshold limit;
  3. SPE is within its threshold limit, but \(T^{2}\) is larger than its threshold limit;
  4. \(T^{2}\) and SPE are both larger than their threshold limits.

Generally, cases 2, 3 and 4 are failure conditions, while case 1 is the normal condition.
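The four cases can be expressed as a small decision function. This is only an illustrative sketch; the function names and the numerical limits are hypothetical.

```python
def classify_condition(t2, spe, t2_limit, spe_limit):
    """Return the case number (1 to 4) for one pair of health indicators."""
    t2_over = t2 > t2_limit
    spe_over = spe > spe_limit
    if not t2_over and not spe_over:
        return 1  # both indicators within their limits
    if not t2_over and spe_over:
        return 2  # only SPE exceeds its limit
    if t2_over and not spe_over:
        return 3  # only T^2 exceeds its limit
    return 4      # both indicators exceed their limits


def is_failure(case):
    """Cases 2, 3 and 4 are treated as failure; case 1 is normal."""
    return case != 1


# Hypothetical limits and indicator values
case = classify_condition(t2=12.5, spe=0.8, t2_limit=9.2, spe_limit=4.2)
```

Here `case` is 3, because only \(T^{2}\) exceeds its limit, so the sample is judged as a failure.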

3.2 Error Prediction Based on Support Vector Regression

In this section, support vector regression models are built for each fault type of the pneumatic valve to predict the values of the corresponding health indicators. The health indicators of the pneumatic valve are arranged into training samples:

$$\begin{aligned} & D = \left\{ {\left( {x_{1} ,y_{1} } \right),\left( {x_{2} ,y_{2} } \right), \ldots ,\left( {x_{m} ,y_{m} } \right)} \right\} \\ & f\left( x \right) = \omega^{T} x + b \\ \end{aligned}$$
(8)

where \(x_{i} \in {\mathbb{R}}^{n}\) and \(y_{i} \in {\mathbb{R}}\). \(f(x)\) is the estimated output of the model, and \(\omega\) and \(b\) are the weight vector and the bias, respectively. A loss is counted only when the distance between \(f(x)\) and \(y\) is greater than \(\varepsilon\). At the same time, \(f\) should be as flat as possible, which means that \(\left\| \omega \right\|\) should be small. This leads to the optimization problem:

$$ \mathop {\min }\limits_{\omega ,b} \frac{1}{2}\left\| \omega \right\|^{2} + C\sum\limits_{i = 1}^{m} {\ell_{\varepsilon } } \left( {f\left( {x_{i} } \right) - y_{i} } \right) $$
(9)

where \(C\) is the regularization constant, and \(\ell_{\varepsilon }\) is the \(\varepsilon\)-insensitive loss function, which can be formulated as:

$$ \ell_{\varepsilon } \left( z \right) = \begin{cases} 0, & \text{if } \left| z \right| \le \varepsilon \\ \left| z \right| - \varepsilon , & \text{if } \left| z \right| > \varepsilon \end{cases} $$
(10)

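where \(z = y_{i} - \left( {\omega^{T} x_{i} + b} \right)\); the sign of \(z\) is immaterial here because only \(\left| z \right|\) enters Eq. (10). Introducing the slack variables \(\xi_{i}\) and \(\hat{\xi }_{i}\) and using Eq. (10), Eq. (9) can be rewritten as Eq. (11):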

$$ \mathop {\min }\limits_{{\omega ,b,\xi_{i} ,\hat{\xi }_{i} }} \frac{1}{2}\left\| \omega \right\|^{2} + C\sum\limits_{i = 1}^{m} {\left( {\xi_{i} + \hat{\xi }_{i} } \right)} $$
(11)

subject to \(- \varepsilon - \xi_{i} \le z \le \varepsilon + \hat{\xi }_{i}\), \(\xi_{i} \ge 0\), \(\hat{\xi }_{i} \ge 0\) \(\left( {i = 1,2, \ldots ,m} \right)\). Applying the Lagrange multiplier method and the optimality conditions, the kernel form of the support vector regression model can be expressed as Eq. (12):

$$ f\left( x \right) = \sum\limits_{i = 1}^{m} {\left( {\hat{\alpha }_{i} - \alpha_{i} } \right)\kappa \left( {x,x_{i} } \right)} + b $$
(12)

where \(\hat{\alpha }_{i}\) and \(\alpha_{i}\) are the Lagrange multipliers and \(\kappa \left( {x,x_{i} } \right) = \phi \left( x \right)^{T} \phi \left( {x_{i} } \right)\) is the kernel function. The commonly used kernel functions of support vector regression algorithms include the linear, polynomial, radial basis and sigmoid kernel functions. Because the radial basis kernel function performs well on regression problems, it is selected as the kernel function of the support vector regression prediction model.
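The \(\varepsilon\)-insensitive loss of Eq. (10) and the kernel prediction of Eq. (12) can be sketched in a few lines. The radial basis kernel is assumed here in its common form \(\exp ( - \gamma \left\| {x - x_{i} } \right\|^{2} )\); the parameter \(\gamma\), the function names and all numbers are hypothetical, and the dual coefficients would in practice come from solving the optimization problem rather than being set by hand.

```python
import math


def eps_insensitive_loss(z, eps):
    """Eq. (10): zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(z) - eps)


def rbf_kernel(x, xi, gamma):
    """Radial basis kernel exp(-gamma * ||x - xi||^2) (assumed form)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xi)))


def svr_predict(x, support_vectors, dual_coefs, b, gamma):
    """Eq. (12), with dual_coefs[i] standing for (alpha_hat_i - alpha_i)."""
    return sum(
        coef * rbf_kernel(x, sv, gamma)
        for coef, sv in zip(dual_coefs, support_vectors)
    ) + b


# Hypothetical trained model with two support vectors
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
dual_coefs = [0.8, -0.3]
y_hat = svr_predict([0.0, 0.0], support_vectors, dual_coefs, b=0.1, gamma=0.5)
```

Only samples whose residual lies on or outside the \(\varepsilon\) tube receive non-zero dual coefficients, which is why the sum in Eq. (12) effectively runs over the support vectors alone.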

Because the degradation of pneumatic valves generally takes a long time, complete life-cycle data are very difficult to obtain in large quantities. To make full use of the training samples and improve the accuracy of the model, a sliding time window is used to construct the prediction targets artificially. The degradation data of a pneumatic valve are cut, window by window and in time order, into multiple sample sets, each of which contains training samples and the prediction targets used as test samples. The support vector regression algorithm is then trained on these sample sets repeatedly, which improves the accuracy of the model.
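One way to build such samples is sketched below, under the assumption that each window of past health indicator values is paired with the value a fixed number of steps ahead. The window length, the prediction horizon and the series are hypothetical.

```python
def sliding_windows(series, window, horizon=1):
    """Cut a time series into (input window, prediction target) pairs."""
    samples = []
    for start in range(len(series) - window - horizon + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        samples.append((inputs, target))
    return samples


# Hypothetical health indicator values over six cycles
health_index = [0.1, 0.2, 0.4, 0.7, 1.1, 1.6]
pairs = sliding_windows(health_index, window=3)
# pairs == [([0.1, 0.2, 0.4], 0.7), ([0.2, 0.4, 0.7], 1.1), ([0.4, 0.7, 1.1], 1.6)]
```

A single degradation run of six cycles thus yields three training pairs, which is how the sliding window stretches a scarce data set.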

4 Simulation Results and Discussions

In this section, the evaluation indicators are introduced first, then the method for acquiring fault data from the joint simulation model is described, and finally the time of fault occurrence is determined by predicting the health indicators from degradation data.

4.1 Failure Prediction Evaluation Index

The prediction of failure is to predict future health indicators based on existing health indicators. In this paper, Mean Squared Error (MSE) and R Squared (\(R^{2}\)) are selected as the evaluation indexes of the pneumatic valve failure prediction model, which are calculated as shown in Eqs. (13) and (14).

$$ MSE = \frac{1}{m}\sum\limits_{i = 1}^{m} {\left( {\hat{y}^{\left( i \right)} - y^{\left( i \right)} } \right)^{2} } $$
(13)
$$ \begin{aligned} R^{2} & = 1 - \frac{{\sum\limits_{i} {\left( {\hat{y}^{\left( i \right)} - y^{\left( i \right)} } \right)^{2} } }}{{\sum\limits_{i} {\left( {\overline{y} - y^{\left( i \right)} } \right)^{2} } }} = 1 - \frac{{\sum\limits_{i} {\left( {\hat{y}^{\left( i \right)} - y^{\left( i \right)} } \right)^{2} /m} }}{{\sum\limits_{i} {\left( {\overline{y} - y^{\left( i \right)} } \right)^{2} /m} }} \\ & = 1 - \frac{{MSE\left( {\hat{y},y} \right)}}{Var\left( y \right)} \\ \end{aligned} $$
(14)

According to Eqs. (13) and (14), the magnitude of the regression evaluation index MSE depends on the scale (units) of the predicted quantity: the larger the scale, the larger the MSE. At the same scale, a smaller MSE indicates better model performance. The regression evaluation index \(R^{2}\) normalizes the MSE by the variance of the target and is therefore independent of scale. Its value is at most 1, it equals 0 for a model that always predicts the mean, and it becomes negative for a model that performs worse than that. Models with a higher \(R^{2}\) have better accuracy.
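Both indexes follow directly from Eqs. (13) and (14), as the following sketch shows. The function names and the sample values are hypothetical.

```python
def mse(y_true, y_pred):
    """Mean squared error, Eq. (13)."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R^2 = 1 - MSE / Var(y), Eq. (14)."""
    mean = sum(y_true) / len(y_true)
    variance = sum((t - mean) ** 2 for t in y_true) / len(y_true)
    return 1.0 - mse(y_true, y_pred) / variance


# Hypothetical true and predicted health indicator values
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]

error = mse(y_true, y_pred)      # approximately 0.025
score = r_squared(y_true, y_pred)  # approximately 0.98
```

Multiplying both series by 10 would multiply the MSE by 100 but leave \(R^{2}\) unchanged, which is the scale independence described above.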

4.2 Preparation for Fault Prediction Data

To obtain process data that include the occurrence of a fault, a joint simulation model of AMESim and MATLAB was built. Simulink was used to adjust the parameters of a part of the relay valve dynamically, following an artificially specified degradation curve for that part, in order to generate the fault process data. The model parameters are held constant within each cycle and differ between cycles.

In Fig. 2, three degradation curves represent three different degradation processes. We model these degradation processes for rubber ring wear in the air supply valve of the relay valve and demonstrate how the proposed failure prediction method predicts the time of failure when the form of degradation is uncertain.

Fig. 2 Parameter change curves of the degradation process for the rubber ring wear failure of the relay valve air supply valve

4.3 Extracting Health Indicators by Principal Component Analysis

Principal component models are established for the degradation process of the rubber ring wear fault of the relay valve air supply valve in order to extract health indicators. Since the principal component model assumes that the original data are normally distributed, the Kolmogorov-Smirnov (K-S) test is used to verify that the original data obey the normal distribution. The principal component analysis method is then used to extract the health indicators \(T^{2}\) and SPE from the original space.

Fig. 3 Health index \(T^{2}\) of relay valve

Fig. 4 Health index SPE of relay valve

In Figs. 3 and 4, the health indicators \(T^{2}\) and SPE are represented by the blue curve, and the control limits of the two indicators are represented by the black dotted line. The fault judgment results are shown in Fig. 5.

In Fig. 5, the index \(T^{2}\) is represented by the blue curve, the control limit of \(T^{2}\) by the blue dotted line, the index SPE by the green curve, the control limit of SPE by the green dotted line, and the fault detection result by the red dotted line. When the red dotted line is equal to 0, the relay valve has no fault. When the red dotted line is not equal to 0, the relay valve has a fault. It can be seen from the figure that the relay valve fails after the 105th cycle.

Fig. 5 Fault judgment results
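The detection signal described for Fig. 5 can be sketched as a flag series that is 0 while both indicators stay within their control limits and 1 otherwise. The series, the limits and the function names below are hypothetical and do not reproduce the simulation data.

```python
def fault_flags(t2_series, spe_series, t2_limit, spe_limit):
    """Per-cycle flag: 0 if both indicators are within limits, 1 otherwise."""
    return [
        int(t2 > t2_limit or q > spe_limit)
        for t2, q in zip(t2_series, spe_series)
    ]


def first_fault_cycle(flags):
    """1-based index of the first flagged cycle, or None if none is flagged."""
    for cycle, flag in enumerate(flags, start=1):
        if flag:
            return cycle
    return None


# Hypothetical indicator values over five cycles
t2_series = [1.2, 1.5, 2.0, 9.8, 12.4]
spe_series = [0.3, 0.4, 4.5, 5.1, 6.0]
flags = fault_flags(t2_series, spe_series, t2_limit=9.2, spe_limit=4.2)
onset = first_fault_cycle(flags)
```

In this toy series SPE crosses its limit one cycle before \(T^{2}\), so the flags are `[0, 0, 1, 1, 1]` and `onset` is 3.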

4.4 Result Analysis

The grid search method is used to select the hyperparameter values of the support vector regression model. The value range and step size of each hyperparameter are chosen according to experience, and the grid search then evaluates every hyperparameter combination within the value ranges at the given step sizes.
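The search itself is an exhaustive loop over the Cartesian product of the candidate values. The sketch below assumes three hyperparameters named `C`, `gamma` and `epsilon` and uses a hypothetical `toy_score` function as a stand-in for the validation score of a trained model; neither the grid values nor the scoring function come from the experiments reported here.

```python
from itertools import product


def grid_search(score_fn, grid):
    """Evaluate every combination in the grid and keep the best-scoring one."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(C, gamma, epsilon):
    """Hypothetical stand-in for a validation R^2; peaks at C=10, gamma=0.1, epsilon=0.01."""
    return -((C - 10) ** 2) / 100 - (gamma - 0.1) ** 2 - (epsilon - 0.01) ** 2


# Hypothetical value ranges, chosen by experience in practice
grid = {
    "C": [1, 10, 100],
    "gamma": [0.01, 0.1, 1.0],
    "epsilon": [0.01, 0.1],
}
best_params, best_score = grid_search(toy_score, grid)
```

With this grid the loop evaluates 18 combinations. In practice `toy_score` would be replaced by a function that trains the regression model with the given hyperparameters and returns its score on held-out windows.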

It can be seen from Table 1 that the R-squared values of the first two training samples are relatively large, and the prediction results are relatively good, while the R-squared values of \(T^{2}\) health indicators of the last two training samples are relatively small, indicating that the prediction results are relatively poor. On the whole, the method proposed in this paper has good performance in both fault prediction results and fault prediction accuracy, and can effectively realize the fault prediction of pneumatic valve.

Table 1 Evaluation of SVR fault prediction results

5 Conclusion

This paper proposes a novel data-driven fault prognosis method for pneumatic valves in the train braking system. Two health indicators, \(T^{2}\) and SPE, are extracted with the PCA method and are then used to train a fault prognosis model based on SVR. The proposed method is validated on a semi-physical simulation verification platform of the DK-2 braking system. The results show that the proposed fault prognosis model can accurately estimate the expected time of pneumatic valve faults.