1 Introduction

Accurate prediction of track geometry degradation is critical for efficient railway operations, as it can aid in accident prevention and improve maintenance planning [1]. However, predicting track degradation is challenging due to the complexity of the factors that affect it, including traffic, load, maintenance, and environmental factors [2]. Predicting track degradation involves analyzing how these factors interact and affect the degradation rate or level at a specific point in time. Artificial Neural Networks (ANNs) have been suggested as a potential solution for predicting track degradation, considering multiple input features, including traffic patterns, load conditions, maintenance schedules, and environmental factors [3]. ANNs are machine learning models that mimic the information processing mechanism of the human brain. There are various types of ANNs, including Feedforward ANNs (FF-ANNs), Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), and Deep Belief Networks (DBNs), among others. A common approach is to train a NN model on historical degradation data and then apply the resulting model to forecast degradation in unseen data [4]. The degradation rate or level at a specific point in time can serve as the output variable of the ANN [4].

The degradation pattern of a railway track can also be treated as a time series, and predicting its future degradation can be approached as a time series forecasting problem [5]. Historically, there has been a common belief that ANNs are not ideal for time series, primarily due to the typically short length of most time series [3, 6]. However, in the recent large-scale forecasting competition organized by the International Institute of Forecasters (IIF), the M4 competition, RNNs achieved impressive performance, and a hybrid model combining exponential smoothing and an RNN emerged as the winner [7]. Given the increasing use of sensors and condition monitoring tools for the predictive maintenance of railway tracks, a vast amount of data can be leveraged to predict degradation patterns and optimize track maintenance. RNNs have shown great potential in dealing with sequential data [3, 8], such as degradation patterns, and can be used to forecast future trends in track conditions. Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks are two variants of RNNs that have demonstrated strong performance in learning long-term dependencies in sequential data [9]. However, most previous studies [4, 10,11,12,13] have neglected the high capability of RNNs to deal with sequential data such as degradation patterns.

In addition, an essential aspect of utilizing ANNs for degradation analysis is the selection of suitable structures and hyperparameters. Hyperparameter tuning involves selecting the best values for NN model parameters, such as the number of hidden layers, the number of neurons per layer, the learning rate, the activation function, the batch size, and the optimizer [14]. Finding the optimal combination of hyperparameters is essential to enhance model accuracy [15]. However, in the literature on applying ANNs to railway track maintenance reviewed in this paper [4, 10,11,12,13, 16,17,18], the issue of hyperparameter tuning has not received sufficient attention, and ANNs have typically been constructed using a manual search approach. To enhance the performance of ANNs in predicting degradation patterns, it may be necessary to further investigate the potential of advanced techniques, such as Bayesian optimization, for optimizing hyperparameters.

This paper aims to investigate the performance of different ANNs, i.e., FF-ANN, simple RNN, LSTM, and GRU, in predicting track geometry degradation. This study is the first of its kind to provide a comprehensive overview of the predictive capabilities of these neural networks for track geometry degradation while also addressing the issue of hyperparameter tuning.

The second section of the paper provides a literature review on the use of ANNs for predicting railway track geometry. In section three, the ANNs used in this study are explained. The fourth section elaborates on the hyperparameter tuning process adopted to optimize the performance of the neural network models. Section five presents a detailed analysis of the neural network models' performance based on data from a case study on a track section with 110 segments. Finally, section six presents the conclusions and their implications for the field of railway track maintenance.

2 Review of the Literature

ANNs have been suggested as a method for analyzing degradation in railway tracks, aiming to predict the degradation using diverse input features [4, 10,11,12]. Several studies have used ANNs to predict different aspects of railway track degradation. Guler [4] used an ANN to predict track geometry degradation by considering track structure, traffic characteristics, layout, environmental factors, geometry, and maintenance and renewal data. Moridpour et al. [11] explored the impact of increased tram traffic on rail infrastructure and presented an ANN model to predict tram track degradation using existing data to reduce maintenance costs and improve system performance. Lee et al. [12] used an ANN and support vector regression to predict track geometry degradation based on several input variables, such as the track quality index value, curvature, velocity, and million gross tonnage. Ali et al. [13] applied a backpropagation ANN to construct a deterioration model for railway tracks in the UK, using factors such as track geometry, ballast fouling index, train speed, catch pits, ballast age, and sleeper age. Gerum et al. [19] used RNNs to predict track defects and classify them into two groups: red and yellow defects. Falamarzi et al. [20] used a regression model and an ANN model to predict tram track gauge deviation, and both showed good performance with determination coefficients above 0.7. Finally, Khajehei et al. [10] used track geometry measurements, asset information, and maintenance history to predict track geometry degradation with an ANN.

In other related studies, not necessarily focused on degradation prediction, Bruin et al. [16] proposed using LSTM for fault detection and identification in railway track circuits based on commonly available measurement signals, achieving a correct classification rate of 99.7% and outperforming a convolutional network. Popov et al. [17] used an ANN on data from a high-speed line in the UK to assess the efficiency of railway track maintenance.

The review of the literature revealed two gaps in the research on using ANNs to predict railway track degradation patterns:

  • Firstly, none of the reviewed articles addressed the issue of hyperparameter tuning. The ANN structure was constructed using a manual search approach, which lacks a sophisticated hyperparameter tuning method. This could potentially limit the accuracy and efficiency of the model.

  • Secondly, all the reviewed articles used only one type of ANN, mostly FF-ANN, while neglecting the high capability of RNNs to deal with sequential data such as degradation patterns. Therefore, incorporating RNNs into the degradation analysis could provide more precise track degradation predictions and help enhance the decision support system for railway track management.

To address the above-mentioned gaps, four ANNs, i.e., FF-ANN, RNN, LSTM, and GRU, are used for predicting track geometry degradation, with Bayesian optimization used for hyperparameter optimization.

3 Neural Network Methods

3.1 FF-ANN

The most frequently used form of ANN is the FF-ANN model, which comprises three types of layers: input, output, and hidden. In this model, each output layer node is linked to a target variable, while the input layer nodes are associated with predictor variables, as shown in Fig. 1 [18, 21]. The number of hidden layers and the number of nodes (neurons) in each layer together determine the complexity of the FF-ANN model. FF-ANNs with multiple non-linear hidden layers can capture complex relationships between input and output variables, but with limited training data, sampling noise may create apparent relationships that do not exist in the test data, leading to overfitting [18, 21].

Fig. 1 FF-ANN structure: the inputs \(x_1\) through \(x_j\) in the input layer feed hidden layers 1 through \(n\), which feed the output layer producing the output \(\hat{y}_k\)

The number of hidden layers in an FF-ANN model should reflect the complexity of the problem under study, and experiments are typically used to determine the optimal number of hidden layers [18].
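As a minimal sketch, the FF-ANN described above could be built in TensorFlow/Keras as follows; the layer widths and input dimension are illustrative assumptions, not the exact configuration used in this study.

```python
import tensorflow as tf

def build_ff_ann(n_features: int) -> tf.keras.Model:
    """Feedforward network: input layer, two hidden layers, one output node."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),            # predictor variables
        tf.keras.layers.Dense(32, activation="relu"),   # hidden layer 1
        tf.keras.layers.Dense(16, activation="relu"),   # hidden layer 2
        tf.keras.layers.Dense(1, activation="linear"),  # target variable
    ])
```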

3.2 RNN

RNNs are designed to handle sequential data and are often used for time series analysis [3, 8]. Therefore, RNNs can be used for degradation analysis, which aims to monitor and predict a system's condition over time. The RNN's structure resembles a multilayer perceptron but with time-delay connections between hidden units that retain information about the past and discover temporal correlations between distant events in the data [22, 23], as shown in Fig. 2. This approach allows building ANNs that process and analyze data effectively over time.

Fig. 2 RNN structure: the inputs \(x_1\) through \(x_j\) in the input layer feed hidden layers 1 through \(n\) and then the output layer, producing the output \(\hat{y}_k\)

Even though RNNs can handle sequential data, such as time series, they struggle to learn long-term dependencies due to the vanishing-gradient problem [9, 24, 25]. The vanishing-gradient problem is a challenge encountered in RNNs that arises when the gradients computed for each time step are repeatedly multiplied by the recurrent weight matrix, causing them to diminish in magnitude over time [9]. This problem causes recent information to be prioritized over past events and hinders the learning of long-term dependencies [9]. As a result, LSTM and GRU models were developed to address these issues [24]. LSTM addresses this by controlling the flow of information within neurons using a gating mechanism that regulates the addition and deletion of information from an iteratively propagated cell state [9].

LSTM cells have three gates (input, forget, and output) that modify a cell state vector, which is iteratively propagated to capture long-term dependencies [9, 24, 25]. The controlled information flow within the cell helps the network remember multiple time dependencies with varying characteristics [24]. GRU has a simpler cell structure than LSTM and uses a gating system with only an update gate and a reset gate [3, 24]. Hewamalage et al. [3] provide further details on the mathematical models and structures of RNN, GRU, and LSTM.
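As an illustrative sketch, the three recurrent variants can be instantiated in TensorFlow/Keras as shown below; the sequence length, feature count, and unit count are assumptions for demonstration, not the study's settings.

```python
import tensorflow as tf

def build_recurrent(cell: str, timesteps: int, n_features: int) -> tf.keras.Model:
    """One recurrent hidden layer followed by a one-step-ahead forecast."""
    layer_cls = {
        "rnn": tf.keras.layers.SimpleRNN,  # plain recurrent cell
        "lstm": tf.keras.layers.LSTM,      # input, forget, output gates + cell state
        "gru": tf.keras.layers.GRU,        # update and reset gates only
    }[cell]
    return tf.keras.Sequential([
        tf.keras.Input(shape=(timesteps, n_features)),  # windowed time series
        layer_cls(32),                                  # recurrent hidden layer
        tf.keras.layers.Dense(1),                       # degradation forecast
    ])
```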

4 Hyperparameter Tuning

Choosing the right architecture and hyperparameters is crucial for implementing ANNs in degradation analysis, as they heavily impact the behaviour of training algorithms and model performance [14]. The main hyperparameters of a NN include [14, 26]:

  • The number of hidden layers: ANNs can have one or more hidden layers.

  • The number of neurons per layer: An appropriate number of neurons should be selected to prevent overfitting or underfitting.

  • Activation functions: Different activation functions, such as ReLU, sigmoid, and Tanh, introduce nonlinearity into the NN. The choice of activation function can significantly impact the model's performance.

  • Learning rate: The learning rate controls the step size during optimization and determines how quickly the model converges.

  • The number of epochs: The number of epochs determines how many times the training process will iterate over the entire training set.

  • Batch size: During training, data is processed in batches. The batch size is a hyperparameter that determines the number of samples in each batch.

  • Optimizer: The optimizer is used to update the model parameters during training. Different optimizers, such as stochastic gradient descent (SGD) and Adam, are available.

4.1 Bayesian Optimization

Bayesian optimization is a powerful and effective method for hyperparameter tuning in ANNs [14, 26]. A search space for hyperparameters is defined to implement Bayesian optimization, along with an acquisition function that balances exploitation and exploration [27, 28]. The acquisition function determines where to sample next based on the current state of the model and the target function [27]. The process is repeated until the optimal hyperparameters are found, or a stopping criterion is met [27].
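A minimal sketch of this loop with the bayes_opt package (used later in this paper) is shown below. The objective here is a toy stand-in; in practice it would train a network and return a score to maximize, such as negative validation error. The bounds are illustrative assumptions.

```python
from bayes_opt import BayesianOptimization

def objective(n_units, learning_rate):
    # Toy placeholder standing in for "train a model, return -validation_MSE".
    return -((n_units - 64) ** 2) * 1e-4 - (learning_rate - 0.01) ** 2

optimizer = BayesianOptimization(
    f=objective,
    pbounds={"n_units": (1, 100), "learning_rate": (1e-4, 1e-1)},  # search space
    random_state=1,
)
# init_points encourages exploration; subsequent n_iter steps are guided by
# the acquisition function, which balances exploration and exploitation.
optimizer.maximize(init_points=5, n_iter=25)
print(optimizer.max)  # best hyperparameters found and their score
```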

5 Results and Discussion

Historical data from track section 119 (TN-HO19), which spans 30 km along the Swedish Iron Ore Line between Boden and Luleå, is utilized to assess the performance of the investigated ANNs.

5.1 Data Preparation

5.1.1 Data Collections

The data used in the study include the shortwave measure of the rail's longitudinal level, which is an important track geometry variable. The shortwave measurement is defined as the amplitude of longitudinal-level waves with wavelengths between 3 and 25 m, as per the EN 13848-1:2017 standard [29]. The study utilizes data gathered between 2007 and 2022 by a measurement train that records the shortwave amplitude every 25 cm on both rails. This results in 800 measurement points per rail for each 200 m segment.

5.1.2 Data Cleaning

To make the data less sensitive to errors, the measurement data was divided into 200 m segments, and segment statistics were calculated. Missing observations were removed, and segments with fewer than 50% complete observations were treated as missing. Data for objects such as switches, crossings, or platforms were removed as they were not relevant to the study.

5.1.3 Data Scaling

A standard scaling procedure is used to normalize the features of the dataset. Standard scaling involves subtracting the mean of each feature from its values and dividing by its standard deviation.
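In formula form, each value \(x\) of a feature with mean \(\mu\) and standard deviation \(\sigma\) is transformed as

$$x' = \frac{x - \mu}{\sigma},$$

which centers each feature at zero with unit variance.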

5.2 Application and Evaluation of ANNs

5.2.1 Input Features

As input features, the standard deviation of the longitudinal level and the history of maintenance actions for 110 segments of track section 119 are used to train the ANNs.

5.2.2 Performance Indicators

Various performance indicators, including Mean Absolute Error (MAE), Mean Squared Error (MSE), Mean Absolute Percentage Error (MAPE), and prediction accuracy based on MAPE, are used to evaluate the performance of ANNs, as shown in Table 1.

Table 1 Performance indicators
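For reference, the standard definitions of these indicators, which the formulations in Table 1 are assumed to follow, are given below, with \(y_i\) the observed value, \(\hat{y}_i\) the predicted value, and \(n\) the number of observations:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|, \qquad \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2,$$

$$\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|, \qquad \text{Accuracy} = 100\% - \mathrm{MAPE}.$$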

5.2.3 Pseudocode of ANNs

Two scenarios are considered in this paper. The first scenario, N-HO, involves no hyperparameter optimization: the hyperparameters are set to values found by manual search or suggested in previous studies. The second scenario, HO, involves hyperparameter optimization by Bayesian optimization. Pseudocodes for both scenarios are presented in Tables 2 and 3. The hyperparameters tuned in HO are the number of neurons in the second and third layers, the activation functions, and the number of epochs. In both scenarios, the Python programming language, version 3.10, is used along with the TensorFlow library [30]. Additionally, the BayesianOptimization class from the bayes_opt library is utilized for hyperparameter optimization in HO.

Table 2 Pseudocode for N-HO
Table 3 Pseudocode for HO

The development of a NN model with TensorFlow in Python involves the following steps (a minimal end-to-end sketch follows the list):

  • Feature scaling: A standard feature scaling is applied.

  • Splitting data into train and test sets: 80% of the data are used as the training set.

  • Building the NN model: The next step is identifying the number of layers, the activation functions, the number of neurons in each layer, and other hyperparameters. This paper chooses a NN model with one input layer, two hidden layers, and one output layer. In N-HO, the ReLU activation function is selected for hidden layers, and the Linear activation function is selected for the output layer.

  • Training the model: Once the NN model has been defined, the next step is to train it on the data. Training involves optimizing the model's parameters (weights and biases) using the Adam optimization algorithm. During training, the model is iteratively updated based on the mean squared error (loss) between its predictions and the true values in the training data.

  • Evaluating the model: After training, the model is evaluated using the performance metrics described in Table 1.

  • Tuning the hyperparameters: In HO, Bayesian Optimization is used to find the optimal hyperparameters and run the model again based on the optimized values of hyperparameters.
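The following is a minimal end-to-end sketch of the N-HO pipeline just described, using TensorFlow/Keras and scikit-learn. The layer widths, epoch count, batch size, and use of a random split are illustrative assumptions (for strictly chronological forecasting, a time-ordered split may be preferable).

```python
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def run_n_ho_pipeline(X: np.ndarray, y: np.ndarray) -> dict:
    # Split: 80% training, 20% test (random split assumed for illustration).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)

    # Standard scaling: fit on the training set only to avoid leakage.
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

    # Build: one input layer, two ReLU hidden layers, linear output (N-HO).
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(X.shape[1],)),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1, activation="linear"),
    ])

    # Train: Adam optimizer, mean squared error loss.
    model.compile(optimizer="adam", loss="mse", metrics=["mae", "mape"])
    model.fit(X_train, y_train, epochs=100, batch_size=32, verbose=0)

    # Evaluate with the indicators from Table 1.
    mse, mae, mape = model.evaluate(X_test, y_test, verbose=0)
    return {"MSE": mse, "MAE": mae, "MAPE": mape, "Accuracy": 100.0 - mape}
```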

5.2.4 Comparing the Performance of ANNs

Table 4 presents performance indicators for four different ANNs: FF-ANN, simple RNN, GRU, and LSTM. For each NN, two scenarios are considered: one without hyperparameter optimization (N-HO) and one with hyperparameter optimization (HO). For each performance indicator, the table presents the minimum, mean, and maximum values across all 110 segments. Based on the minimum, mean, and maximum values, the performance of the different ANNs and the impact of hyperparameter optimization are compared.

Table 4 Comparison of ANNs' performance

The ranges of values for the hyperparameters are as follows (a sketch of encoding this search space for Bayesian optimization appears after the list):

  • Number of neurons per second layer \(\in\) [1, 100]

  • Number of neurons per third layer \(\in\) [1, 100]

  • Activation functions \(\in\) {ReLU (Rectified Linear Unit), Sigmoid, Tanh (Hyperbolic Tangent), Softmax, ELU (Exponential Linear Unit), SELU (Scaled Exponential Linear Unit), and Swish}

  • Number of epochs \(\in\) [1, 200].
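Since bayes_opt only handles continuous parameters, discrete and categorical hyperparameters such as neuron counts and activation functions must be encoded. The sketch below shows one common encoding (rounding to integers and indexing into a list of activations); this mapping and the train_and_score helper are assumptions for illustration, not the paper's exact implementation.

```python
from bayes_opt import BayesianOptimization

ACTIVATIONS = ["relu", "sigmoid", "tanh", "softmax", "elu", "selu", "swish"]

def train_and_score(n2, n3, activation, epochs):
    # Hypothetical placeholder: build a network with these hyperparameters,
    # train it, and return a validation error to minimize.
    return (n2 - 50) ** 2 * 1e-4 + (n3 - 20) ** 2 * 1e-4 + epochs * 1e-5

def objective(n2, n3, act_idx, epochs):
    n2, n3 = int(round(n2)), int(round(n3))   # neurons in layers 2 and 3
    activation = ACTIVATIONS[int(act_idx)]    # categorical choice via index
    epochs = int(round(epochs))
    return -train_and_score(n2, n3, activation, epochs)  # maximize -error

optimizer = BayesianOptimization(
    f=objective,
    pbounds={"n2": (1, 100), "n3": (1, 100),
             "act_idx": (0, len(ACTIVATIONS) - 1e-6),  # floors to indices 0..6
             "epochs": (1, 200)},
    random_state=1,
)
optimizer.maximize(init_points=10, n_iter=40)
```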

The results presented in Table 4 show that the hyperparameter optimization process, HO, leads to better performance than using default or expert-suggested hyperparameters, N-HO. The improvement is particularly evident in the minimum and maximum values of the performance indicators, such as MSE, MAE, and MAPE, as well as in the prediction accuracy. These results suggest that the hyperparameter optimization process can help identify better NN configurations that result in improved performance.

Furthermore, with hyperparameter optimization (HO), the GRU model achieved the best overall performance among the four ANNs. In general, all RNN variants performed better than FF-ANN in HO. However, in N-HO, the simple RNN and LSTM performed worse than FF-ANN. It can be concluded that hyperparameter optimization has a greater impact on the performance of RNN, LSTM, and GRU than on FF-ANN.

It is also important to note that the computational time required to train the models varies depending on the scenario (N-HO or HO) and the ANN model. Generally, the HO scenario requires more computational time than N-HO due to the additional hyperparameter optimization process. Additionally, the LSTM and GRU models typically require more computational time than FF-ANN and the simple RNN.

The choice between N-HO and HO depends on the specific problem and the available resources. While HO can result in better performance, it also requires additional computational time. Therefore, it is important to weigh the benefits of hyperparameter optimization against its costs before deciding on a particular approach. In practice, computational time may not be critical when real-time prediction is not required.

6 Conclusion

The prediction of track geometry degradation poses significant challenges due to the many factors influencing it. Developing advanced, efficient, and effective prediction models is imperative to ensure accurate prediction of track geometry degradation. The present study underscores the criticality of selecting appropriate NN architectures capable of capturing the temporal dependencies inherent in the track geometry degradation process. This study provides insights into the effectiveness of different ANNs (FF-ANN, simple RNN, LSTM, and GRU) for predicting track geometry degradation. The GRU model exhibited the most promising overall performance of the four ANNs evaluated in this research. As such, it is recommended that future research prioritize the exploration and optimization of GRU and LSTM as RNN variants when developing prediction models for track geometry degradation.

In addition, this study investigated the importance of hyperparameter tuning in improving predictive performance. While hyperparameter optimization can enhance the performance of ANNs, it is also important to consider the computational time required for this process. Therefore, future research should focus on developing more efficient hyperparameter optimization processes to achieve better performance while reducing computational time.

Overall, the accurate prediction of track geometry degradation can significantly improve maintenance planning and prevent accidents in railway operations. The use of advanced sensors, condition monitoring tools, and machine learning models such as ANNs can facilitate this task. Further research in this area can lead to the development of more accurate and efficient predictive models that can be used in practice to enhance railway maintenance and safety. A limitation of this study is the use of only track geometry data and previous maintenance history for training the ANNs. In future studies, additional input features can be included in the model to improve the predictive performance of ANNs.