Introduction

In recent years, machine learning has been widely used in many fields because it is data-driven and does not require mechanistic models. Machine learning algorithms can be divided into online and offline learning algorithms (Ozay et al. 2016). In offline learning, a model is trained on a fixed dataset and the trained model is then used for prediction. Most machine learning algorithms studied at present are offline algorithms, and they have achieved excellent results in many classification and regression tasks (Ozay et al. 2016). However, offline learning has several disadvantages that limit its application: (1) model training is inefficient, costing considerable time and memory; (2) the training process does not suit big data scenarios; and (3) the trained model cannot adapt to a dynamically changing environment, so its accuracy degrades when the external data change.

Compared with offline learning, online learning means that the model receives training data sequentially (Mao et al. 2017) and is updated with each sample received. Online learning algorithms do not require all training data to be stored; instead, the model adjusts itself automatically as the data distribution changes. These advantages make online learning well suited to processing massive data and responding promptly to dynamic changes in the external environment, and the study of online learning algorithms has therefore attracted considerable attention in recent years. Cavalcante et al. (2016) discussed applications of online learning in computational finance: in stock trading systems, trading data are generated at high frequency in real time and users demand fast trading, so online learning is well suited to the problem of online portfolio selection. Malaca et al. (2019) applied an online inspection system based on machine learning to fabric texture classification in the automotive industry; the online algorithm improves the real-time performance of the inspection system and thus its economic benefit. Song et al. (2016) proposed a large-scale, context-aware recommender system that relies on a novel online learning algorithm to learn users' item preferences from their click behavior; experimental results show that it outperforms state-of-the-art algorithms by over 20% in terms of click-through rate. He et al. (2011) proposed a general adaptive online learning framework capable of learning from continuous raw data and using that knowledge to improve future learning and prediction performance.

However, online learning algorithms are rarely used in industrial production, especially in equipment fault diagnosis and prediction. In industrial production, the state of equipment may change over time; if a previously generated model is still used, large errors arise and the accuracy of the model suffers. For example, because of the aging of compressor components or changes in the production environment, the normal operating range of the vibration signal changes, and a previously generated vibration signal prediction model then produces considerable errors or even misjudges the equipment state (Liu et al. 2019; Carino et al. 2018). Therefore, in this paper the idea of online learning is introduced into the compressor vibration signal prediction model to keep the model accurate. Vibration signal prediction has been widely studied in recent years (Fei 2016; Liu and Yang 2018; Wu and Lei 2019) and belongs to the time series paradigm (Chen et al. 2006). The Long Short-Term Memory (LSTM) network (Hochreiter and Schmidhuber 1997) is the most widely used time series prediction model: it can handle long-term dependence in time series data (Gers et al. 2000) as well as the nonlinearity and non-stationarity of vibration signals (Tian et al. 2019). Hence, the LSTM network is used as the basic prediction model in this paper.

To ensure the continued validity of the model, an online learning algorithm is introduced into the LSTM model. Operational efficiency is an important index of an online learning model; however, the complex structure and large number of parameters of the LSTM model lead to a high computational cost (Mohamed 2018). On a large dataset, training may take too long and prediction may be too slow to meet online update requirements. Prabhavalkar et al. (2016) proposed an improved recurrent neural network (RNN) compression model based on low-rank decomposition and linear projection that accelerates training with negligible loss of precision. Tang (2019) proposed a parallel improved LSTM neural network to predict the workload of large-scale computing systems; it improves model efficiency through parallelism and an improved error back-propagation method, and performs well in large-scale workload prediction. Rizk and Awad (2019) proposed a non-iterative training algorithm, mainly for feedforward artificial neural networks, whose training speed is significantly higher than that of back-propagation. However, the existing literature considers only the training speed of the LSTM model during updates, not its accuracy. To address this problem, an online learning algorithm is introduced into the LSTM model, and an update model based on Error-LSTM (E-LSTM) is proposed in this paper. The main idea of the E-LSTM model is to improve both the accuracy and the efficiency of the model according to its test error. The contributions of this paper are threefold:

  (1) Online learning is introduced into the vibration signal prediction model to solve the problem of model adaptability when the data distribution changes.

  (2) Hidden layer neurons are divided into blocks, and only part of the neurons are computed at each time step, which addresses the efficiency problem of the LSTM model during updates.

  (3) The test error determines the number of neurons updated at each time step: the greater the error, the more neurons are activated, which ensures the accuracy of the model.

The organization of this paper is as follows. The “Methodology” section describes the proposed algorithm and the relevant theory in detail. The validity of the proposed algorithm is then verified through different experiments, and the results are analyzed in detail, in the “Experiments and analysis” section. Finally, the paper is summarized in the “Conclusion” section.

Methodology

Long Short-Term Memory (LSTM)

Compressor vibration signals are time series data with strong nonlinearity and non-stationarity (Tian et al. 2019). The LSTM network is widely used for nonlinear time series problems. It was proposed by Hochreiter and Schmidhuber (1997) as a special kind of recurrent neural network (RNN), and it alleviates the vanishing-gradient problem encountered by RNNs on tasks with long-term dependence (Hochreiter and Schmidhuber 1997; Gers et al. 2000). Therefore, the LSTM network is used as the base forecasting model for the vibration signal. The structure of the LSTM network is shown in Fig. 1; a memory block consists of a cell, an input gate, an output gate, and a forget gate. The gate computations and the cell state update can be expressed as follows:

$$ f_{t} = \sigma (w_{f} [h_{t - 1} ,x_{t} ] + b_{f} ) $$
(1)
$$ i_{t} = \sigma (w_{i} [h_{t - 1} ,x_{t} ] + b_{i} ) $$
(2)
$$ \tilde{c}_{t} = \tanh (w_{c} [h_{t - 1} ,x_{t} ] + b_{c} ) $$
(3)
$$ o_{t} = \sigma (w_{o} [h_{t - 1} ,x_{t} ] + b_{o} ) $$
(4)
$$ c_{t} = f_{t} \odot c_{t - 1} + i_{t} \odot \tilde{c}_{t} $$
(5)
$$ h_{t} = o_{t} \odot \tanh (c_{t} ) $$
(6)

where \( i_{t} \), \( f_{t} \), \( o_{t} \), \( c_{t} \) and \( h_{t} \) are the outputs of the input gate, forget gate, output gate, cell and memory block at time step t; \( x_{t} \) is the input vector of the memory block at time step t; \( h_{t-1} \) is the output vector of the memory block at time step t − 1; \( \tilde{c}_{t} \) denotes the candidate information of the input gate; \( \sigma \) and tanh denote activation functions; and \( \odot \) denotes the element-wise (Hadamard) product of vectors. Additionally, \( w_{f} \), \( w_{i} \), \( w_{c} \) and \( w_{o} \) are the weights to be learned, and \( b_{f} \), \( b_{i} \), \( b_{c} \) and \( b_{o} \) are the corresponding bias vectors.
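To make the memory-block computation concrete, Eqs. (1)–(6) can be written directly in a few lines of NumPy. This is a minimal illustrative sketch, not the implementation used in the experiments; the dictionary keys mirror the weight and bias symbols above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, w, b):
    """One LSTM memory-block step, following Eqs. (1)-(6).

    w["f"], w["i"], w["c"], w["o"] each have shape (hidden, hidden + input);
    b["f"], b["i"], b["c"], b["o"] have shape (hidden,).
    """
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(w["f"] @ z + b["f"])       # forget gate, Eq. (1)
    i_t = sigmoid(w["i"] @ z + b["i"])       # input gate, Eq. (2)
    c_tilde = np.tanh(w["c"] @ z + b["c"])   # candidate information, Eq. (3)
    o_t = sigmoid(w["o"] @ z + b["o"])       # output gate, Eq. (4)
    c_t = f_t * c_prev + i_t * c_tilde       # cell state update, Eq. (5)
    h_t = o_t * np.tanh(c_t)                 # memory block output, Eq. (6)
    return h_t, c_t
```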

Fig. 1 The structure of LSTM network

As can be seen from Fig. 1, to achieve better results than an RNN, the LSTM model needs more parameters, and it also trains more slowly than an ordinary RNN, which makes it difficult to adapt to online updating scenarios (Mohamed 2018; Rizk and Awad 2019). To alleviate this issue, the Error-LSTM model is proposed, which accelerates LSTM training and improves model accuracy during updates.

Update model based on Error-LSTM

In industrial production, the state of the compressor changes as equipment components age, which changes the characteristics of the collected vibration signals (Ye and Dai 2018). Hence, to remain effective, the model must be updated constantly, and to meet the demands of online updating, the prediction model must be computationally efficient. Because of its complex structure, the LSTM model is ill-suited to online updating, so the E-LSTM model is proposed to improve both efficiency and accuracy. The framework of the updating model based on Error-LSTM is shown in Fig. 2; its main idea is to improve model performance according to the test error. The following subsections show in detail how E-LSTM improves the efficiency and accuracy of the model.

Fig. 2 The framework of updating model based on Error-LSTM

Improvement of model efficiency

E-LSTM, like LSTM, contains an input layer, a hidden layer and an output layer, with forward connections from the input to the hidden layer and from the hidden to the output layer. Unlike LSTM, however, the E-LSTM hidden layer neurons are divided into g modules of size k (if the number of hidden layer nodes is m, then m = k × g). To speed up training, only part of the modules are executed at each time step, while the other modules retain their output values from the previous time step. Supposing two modules are activated at time step t, the calculation of the hidden layer output gate is illustrated in Fig. 3.

Fig. 3 Calculation of the hidden layer output gate

The backward pass of error propagation is also similar to LSTM. The only difference is that the error propagates only through modules that were executed at time step t. The error of non-activated modules is copied back in time (just as the activations of nodes not activated at time step t are copied during the corresponding forward pass), where it is added to the back-propagated error.
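As a rough sketch of the forward pass with module partitioning, the step below computes only the first n_active contiguous modules of size k and copies the rest from the previous time step. The paper does not fix which modules are chosen, so taking the first n_active modules is an assumption made here for simplicity; the sigmoid helper matches the earlier sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def partial_lstm_step(x_t, h_prev, c_prev, w, b, n_active, k):
    """LSTM step in which only the first n_active modules (of size k)
    are executed; all other modules retain their previous outputs and
    cell states. Only n_active * k rows of each weight matrix are used,
    which is where the speed-up comes from."""
    rows = slice(0, n_active * k)                  # rows of the active modules
    z = np.concatenate([h_prev, x_t])              # full [h_{t-1}, x_t] input
    h_t, c_t = h_prev.copy(), c_prev.copy()        # non-activated modules copied
    f = sigmoid(w["f"][rows] @ z + b["f"][rows])
    i = sigmoid(w["i"][rows] @ z + b["i"][rows])
    c_tilde = np.tanh(w["c"][rows] @ z + b["c"][rows])
    o = sigmoid(w["o"][rows] @ z + b["o"][rows])
    c_t[rows] = f * c_prev[rows] + i * c_tilde     # update active cells only
    h_t[rows] = o * np.tanh(c_t[rows])             # update active outputs only
    return h_t, c_t
```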

The number of modules activated at each time step is determined by the test error. The basic idea is that if the model generated in the previous time period performs well on the dataset of the current time period, the two datasets are similar; the number of activated modules can then be reduced when updating the model on the current period, and vice versa. For clarity, taking the update process from time period T1 to T2 as an example, the algorithm proceeds as follows:

Algorithm: update procedure from time period T1 to T2 (pseudocode figure; a sketch of the module-selection step follows the next paragraph)

Note that since there is no test error when training the initial model M1, M1 is trained with all neurons activated. Note also that when calculating the number of modules to update, we consider the block-averaged error series Err_mean rather than the raw error series Err. If the data contain abnormal points, the model error at an abnormal point is large; considering only the error at the current moment would activate many modules and make the model focus on fitting the abnormal points, which is clearly undesirable. Averaging the error over L time steps (the size of L is tuned in the experimental section) weakens the influence of abnormal points and thus improves the accuracy of the model.
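The paper states that more modules are activated when the block-averaged test error is larger, but does not give the exact mapping from error to module count. The sketch below therefore assumes a simple linear rule between two hypothetical thresholds, err_low and err_high; only the use of Err_mean (averages over blocks of L steps) rather than the raw error series follows directly from the text.

```python
import numpy as np

def modules_to_activate(err, L, g, err_low, err_high):
    """Map per-step test errors to a number of active modules per block.

    err               : 1-D array of test errors of model M1 on the T2 window
    L                 : error block length (tuned in the experiments)
    g                 : total number of hidden-layer modules
    err_low, err_high : hypothetical error levels mapped to 1 and g modules
    """
    n_blocks = len(err) // L
    err_mean = np.array([err[i * L:(i + 1) * L].mean()   # Err_mean series
                         for i in range(n_blocks)])
    # Assumed linear rule: normalize each block error into [0, 1],
    # then scale to between 1 and g active modules.
    frac = np.clip((err_mean - err_low) / (err_high - err_low), 0.0, 1.0)
    return np.maximum(1, np.ceil(frac * g)).astype(int)
```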

Improvement of model accuracy

As is well known, updating the model makes the most sense when the data distribution changes (Razavi-Far et al. 2019). Figure 4 shows compressor vibration signal data collected over a long run, divided into three sub-datasets that represent different distributions. If the model is never updated and the model trained on dataset1 is used as the final vibration signal prediction model, it may perform poorly on dataset2 and dataset3. An update model based on LSTM uses incremental learning to continuously update the model on new data, but it does not consider the influence of the data distribution on accuracy, which leads to low accuracy when the distribution changes. The update model based on E-LSTM instead applies different update strategies to data with different distributions according to the test error, which improves the accuracy of the updated model.

Fig. 4 Data distribution of the entire dataset

To explain the accuracy improvement of the update model based on E-LSTM, dataset2 is divided into two parts, dataset21 and dataset22, as shown in Fig. 5. Dataset21 has a distribution similar to dataset1, while dataset22 has a distribution similar to dataset3. The following steps explain why the update model based on E-LSTM outperforms the update model based on LSTM.

Fig. 5 Approximate distribution of dataset2

  • Step 1 The LSTM model is used to train model M1 on dataset1.

  • Step 2 Model M1 is tested on dataset21 and dataset22, yielding test errors E21 and E22.

  • Step 3 Since dataset21 and dataset1 have similar distributions, model M1 theoretically performs better on dataset21, so E21 is less than E22.

  • Step 4 Hence, the number of update modules for dataset22 is greater than for dataset21, so E-LSTM training focuses on dataset22, finally yielding model M2. In addition, the LSTM model is used to train model M2′ on dataset2.

  • Step 5 Model M2 performs better on dataset3 than model M2′ for two reasons: (1) dataset22 and dataset3 have similar distributions, and model M2 focuses on dataset22; (2) more importantly, model M2 pays little attention to dataset21, whose distribution differs from dataset3, which reduces the interference of differently distributed data. Therefore, the update model based on E-LSTM adapts quickly to changes in the data and avoids large errors.

In the “Methodology” section, the proposed model and related theory have been introduced in detail. First, because of its strength in handling time series, the LSTM model is adopted as the basic model for forecasting the vibration signal. Second, because the LSTM model is not suitable for online update scenarios, the E-LSTM model is proposed to improve its performance. Finally, it is argued theoretically why E-LSTM improves both the accuracy and the efficiency of the model. In the next section, experiments are conducted to verify the effectiveness of the proposed model.

Experiments and analysis

In this section, two different datasets are used to verify the superiority of the proposed model. First, we construct a set of test functions with a changing tendency to verify the validity of the model. In addition, to verify its applicability in industrial production, the vibration signal of a compressor is used to test the model. The experimental data were collected from a reciprocating compressor on an oil production platform in Bohai, China; the speed sensor is fixed on the wall of the compressor's main motor, and the sampling interval is 1 min. Three commonly used evaluation indices are employed to evaluate the performance of the E-LSTM model: mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE), defined as follows:

$$ MAE = \frac{1}{N}\sum\limits_{i = 1}^{N} {\left| predicted_{i} - observed_{i} \right|} $$
(7)
$$ RMSE = \sqrt{\frac{1}{N}\sum\limits_{i = 1}^{N} {(predicted_{i} - observed_{i})^{2}}} $$
(8)
$$ MAPE = \frac{1}{N}\sum\limits_{i = 1}^{N} {\left| \frac{predicted_{i} - observed_{i}}{observed_{i}} \right|} \times 100\% $$
(9)

where \( predicted_{i} \) and \( observed_{i} \) denote the predicted and observed values of the i-th sample, respectively, and N denotes the sample size. In addition, to demonstrate the superiority of the proposed model more clearly, the promoting percentages of the MAE (PMAE), RMSE (PRMSE), MAPE (PMAPE) and time (PTime) are used, defined as follows:

$$ P_{MAE} = (MAE_{1} - MAE_{2} )/MAE_{1} $$
(10)
$$ P_{RMSE} = (RMSE_{1} - RMSE_{2} )/RMSE_{1} $$
(11)
$$ P_{MAPE} = (MAPE_{1} - MAPE_{2} )/MAPE_{1} $$
(12)
$$ P_{Time} = (Time_{1} - Time_{2} )/Time_{1} $$
(13)
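These indices translate directly into code. The following is a minimal sketch operating on NumPy arrays of predictions and observations; it is illustrative only.

```python
import numpy as np

def mae(pred, obs):
    return np.mean(np.abs(pred - obs))                    # Eq. (7)

def rmse(pred, obs):
    return np.sqrt(np.mean((pred - obs) ** 2))            # Eq. (8)

def mape(pred, obs):
    return 100.0 * np.mean(np.abs((pred - obs) / obs))    # Eq. (9)

def promoting_percentage(m1, m2):
    """Eqs. (10)-(13): relative improvement of model 2 over model 1
    in any index (MAE, RMSE, MAPE or time)."""
    return (m1 - m2) / m1
```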

Note that all experiments run in a Python 3.6 environment on a 2.80 GHz PC with an Intel i5-7440HQ processor and 16 GB RAM. To account for randomness, every experiment in this paper is run independently 10 times and the results are averaged.

Test function dataset

In this section, we generate a test function dataset with a trend change to test the proposed model. The test functions are defined as follows:

$$ h_{1} (t) = 4\sin 40\pi t $$
(14)
$$ h_{2} (t) = (1 + 0.5\sin 5\pi t)\cos (250\pi t + 20\pi t^{2} ) $$
(15)
$$ f_{1} (t) = h_{1} (t) + h_{2} (t)\quad t \in [0,1] $$
(16)
$$ f_{2} (t) = h_{1} (t) + h_{2} (t) + 6t\quad t \in [0,1] $$
(17)
$$ f_{3} (t) = h_{1} (t) + h_{2} (t) + 6\quad t \in [0,1] $$
(18)

Each of \( f_{1} (t) \), \( f_{2} (t) \) and \( f_{3} (t) \) was sampled 2000 times, and the three functions were spliced together to obtain the final test signal, shown in Fig. 6. We first tune the hyperparameters of the model and then compare it with other forecasting models.

Fig. 6 The dataset of test function
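The test signal of Fig. 6 can be reproduced directly from Eqs. (14)–(18); the sketch below assumes each function is sampled uniformly on [0, 1].

```python
import numpy as np

def test_function_dataset(n=2000):
    """Sample f1, f2, f3 of Eqs. (16)-(18) n times each and splice them."""
    t = np.linspace(0.0, 1.0, n)
    h1 = 4.0 * np.sin(40.0 * np.pi * t)                      # Eq. (14)
    h2 = (1.0 + 0.5 * np.sin(5.0 * np.pi * t)) * np.cos(
        250.0 * np.pi * t + 20.0 * np.pi * t ** 2)           # Eq. (15)
    f1 = h1 + h2                                             # Eq. (16)
    f2 = h1 + h2 + 6.0 * t                                   # Eq. (17)
    f3 = h1 + h2 + 6.0                                       # Eq. (18)
    return np.concatenate([f1, f2, f3])                      # 6000 points
```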

Hyperparameter tuning

The E-LSTM model has many hyperparameters that greatly affect its prediction performance, so to ensure the accuracy of the model we first tune them. Since the trial-and-error method offers a wide parameter search range and a fast search speed (Elsayed et al. 2015), we adopt it for parameter tuning. The main parameters affecting model accuracy are the number of hidden layers, the number of hidden layer neurons, the window size and the size of the error blocks. Candidate values for these parameters are set based on prior knowledge, as shown in Table 1.
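The one-parameter-at-a-time search of subsections A–D below can be sketched as a simple loop over the candidate values of Table 1. Here train_and_evaluate is a hypothetical stand-in for a full E-LSTM training run returning a validation RMSE, not a function from the paper.

```python
def train_and_evaluate(hidden_layers, neurons, window_size, error_block):
    """Hypothetical stand-in: train E-LSTM with these settings and
    return the validation RMSE (replace with the real training run)."""
    raise NotImplementedError

candidates = {                          # candidate values as in Table 1
    "hidden_layers": [1, 2, 5, 10, 20],
    "neurons": [20, 60, 100, 200],
    "window_size": [500, 1000, 2000],
    "error_block": [5, 10, 20, 50],
}
params = {"hidden_layers": 1, "neurons": 60,
          "window_size": 500, "error_block": 5}   # starting values

for name, values in candidates.items():
    # Sweep one parameter at a time, keeping the others fixed, and
    # retain the value with the lowest RMSE (subsections A-D below).
    scores = {v: train_and_evaluate(**{**params, name: v}) for v in values}
    params[name] = min(scores, key=scores.get)
```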

Table 1 Setting of experimental parameters
A. The number of hidden layers

The number of hidden layers is an important parameter affecting the performance of a neural network: more hidden layers can improve accuracy but also increase model complexity. To examine this effect, the number of hidden layers is set to 1, 2, 5, 10 and 20, while the other three key parameters (number of neurons, window size and error block size) are fixed at 60, 500 and 5. The results are shown in Table 2.

Table 2 The effect of number of hidden layers on the model performance

As can be seen from Table 2, increasing the number of hidden layers to 2 or 5 does not significantly improve accuracy, while the computation time increases markedly. Increasing further to 10 or 20 layers makes accuracy decline rather than improve, because the model becomes so complex that overfitting arises. Since model efficiency is an important index during updates, the number of hidden layers is set to 1.

B. The number of hidden layer neurons

The number of hidden layer neurons is another important parameter of a neural network model. Too few neurons prevent the model from fully learning the features in the data, while too many not only reduce efficiency but may also lead to overfitting (Henriquez and Ruz 2018). To examine this effect, the number of neurons is set to 20, 60, 100 and 200, with the other three key parameters (number of hidden layers, window size and error block size) fixed at 1, 500 and 5. The results are shown in Table 3, with the best performance highlighted in bold.

Table 3 The effect of number of neurons on the model performance

It can be seen from Table 3 that the model performs best with 60 hidden layer neurons. As the number of neurons increases to 100 or even 200, performance declines rather than improves, indicating overfitting. Therefore, the number of hidden layer neurons is set to 60 in subsequent experiments.

C. The window size

In model updating, the proper selection of the window size is significant for model accuracy. Too small a window yields an insufficient sample size and an undertrained model; conversely, too large a window introduces information redundancy during training, which harms accuracy (Youn et al. 2018). To evaluate this effect, the window size is set to 500, 1000 and 2000, with 1 hidden layer, 60 neurons and an error block size of 5. The results are shown in Table 4.

Table 4 The effect of window size on the model performance

It can be seen from Table 4 that the model error is smallest when the window size is 1000; hence, a window size of 1000 is adopted for this dataset.

D. The size of error blocks

Similarly, to evaluate the effect of the error block size on model performance, it is set to 5, 10, 20 and 50, with 1 hidden layer, 60 neurons and a window size of 1000. The results are shown in Table 5.

Table 5 The effect of error blocks size on the model performance

As can be seen from Table 5, the model performs best with an error block size of 20; in other words, the information at the current time step is most relevant to that of the previous 20 time steps. As the error block size increases to 50, performance declines, indicating that information too far in the past interferes with the current time step.

Based on the above experimental results, the tuned key parameters are: 1 hidden layer, 60 neurons, a window size of 1000 and an error block size of 20. Note that for a new dataset, the trial-and-error search must be repeated. In the following section, we verify the validity of the E-LSTM model with these optimized parameters.

Checking overfitting for the model

A model may fit the training dataset well yet fail to generalize to new examples (Bouktif et al. 2018), so it is necessary to plot a learning curve showing model performance on training and testing data. For the optimized E-LSTM model (i.e., after hyperparameter tuning), the RMSE on both the training and testing datasets decreases as the number of iterations increases and converges to similar values, which shows that our model is not overfitting, as shown in Fig. 7.

Fig. 7 Learning curve for E-LSTM model
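A learning curve like Fig. 7 takes only a few lines of matplotlib, assuming the per-iteration RMSE values on the training and testing sets have been recorded during training:

```python
import matplotlib.pyplot as plt

def plot_learning_curve(train_rmse, test_rmse):
    """Plot RMSE per training iteration on training and testing data;
    convergence to similar values indicates the model is not overfitting."""
    iterations = range(1, len(train_rmse) + 1)
    plt.plot(iterations, train_rmse, label="training RMSE")
    plt.plot(iterations, test_rmse, label="testing RMSE")
    plt.xlabel("iteration")
    plt.ylabel("RMSE")
    plt.legend()
    plt.show()
```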

Comparison and analysis

In this experiment, we verify the superiority of the E-LSTM model in both algorithm efficiency and model accuracy, using the LSTM model for comparison. For a fair comparison, the two models use the same parameters, shown in Table 6.

Table 6 The parameters of models
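The comparison in this section follows a simple block-wise protocol, described in the next paragraph: the model trained so far is tested on each new data block and then updated on it. A minimal sketch, with the model-specific training code abstracted into callables (train, test and update are placeholders, not functions from the paper):

```python
def blockwise_evaluation(data, train, test, update, window=1000):
    """Cut the series into blocks of `window` points; test the current
    model on each new block (recording the error), then update on it."""
    blocks = [data[i:i + window] for i in range(0, len(data), window)]
    model = train(blocks[0])               # initial model, all modules active
    errors = []
    for block in blocks[1:]:
        errors.append(test(model, block))  # test error on the unseen block
        model = update(model, block)       # then update the model on it
    return errors
```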

Since the dataset contains 6000 data points and the window size is 1000, it can be divided into 6 data blocks. The purpose of the experiment is to compare the test error and training time of the models on each data block. Because the first data block has no test error, we study only the remaining 5 data blocks (labeled 0–4). The forecasting results of the two models are shown in Fig. 8 and Table 7, from which the following conclusions can be drawn:

Fig. 8 The forecasting results of different models on the test function dataset: a RMSE; b MAE; c MAPE; d time

Table 7 The promoting percentages of different models on test function dataset
  (1) As Fig. 8a–c shows, the three error indicators RMSE, MAE and MAPE follow similar trends. In almost all data blocks, the E-LSTM model has smaller errors than the LSTM model. When the data distribution changes, the E-LSTM model adapts quickly and avoids large errors, whereas the LSTM model, which does not account for the data distribution, incurs larger errors. From Table 7, the average promoting percentages of the E-LSTM model over the LSTM model in RMSE, MAE and MAPE are 11.78%, 10.39% and 38.28%, respectively.

  (2) As Fig. 8d shows, the E-LSTM model trains faster than the LSTM model. From Table 7, the average promoting percentage in time is 20.71%.

  (3) Compared with the LSTM model, the E-LSTM model clearly improves both accuracy and efficiency, which indicates that the proposed model is effective.

Note: the data blocks in Table 7 correspond to those in Fig. 8.

Vibration signal dataset

To further verify the applicability of the proposed model in actual production scenarios, the vibration signal generated by a compressor is used in this section to test the model's performance. To better exercise the model, a dataset with obvious changes is selected, as shown in Fig. 9. The dataset contains 30,000 data points with a sampling interval of 1 min.

Fig. 9 The dataset of vibration signal

Similarly, the four parameters (number of hidden layers, number of hidden layer neurons, window size and error block size) are tuned; owing to limited space, the tuning process is given in the “Appendix”. The resulting settings are 1 hidden layer, 60 neurons, a window size of 1000 and an error block size of 20. We verify the performance of the E-LSTM model by comparing it with the LSTM model. As before, since the first data block has no test error, we study only the remaining 29 data blocks (labeled 0–28). The forecasting results of the two models are shown in Fig. 10 and Table 8.

Fig. 10 The forecasting results of different models on the vibration signal dataset: a RMSE; b MAE; c MAPE; d time

Table 8 The promoting percentages of different models on vibration signal dataset

The following conclusions can be drawn from Fig. 10 and Table 8:

  (1) As Fig. 10a–c shows, the three error indicators RMSE, MAE and MAPE follow similar trends. In almost all data blocks, the E-LSTM model has smaller errors than the LSTM model. From Table 8, the average promoting percentages of the E-LSTM model over the LSTM model in RMSE, MAE and MAPE are 16.26%, 16.12% and 16.25%, respectively.

  (2) As Fig. 10d shows, the E-LSTM model trains faster than the LSTM model in most data blocks. In some data blocks with large test errors, the number of updated modules is large, so the training time of the E-LSTM model approaches that of the LSTM model. Over the whole dataset, however, the E-LSTM model still holds an efficiency advantage; from Table 8, the average promoting percentage in time is 8.18%.

In this section, two types of datasets were used to verify the performance of the model. The experimental results show that the E-LSTM model outperforms the LSTM model in both efficiency and accuracy. Therefore, the E-LSTM model keeps accuracy from degrading when the data distribution changes, and it offers a new approach to industrial big data mining.

Conclusion

In industrial big data, equipment accumulates a large amount of data over time, and as time passes the data distribution changes, so that a previously generated model no longer fits the current data. The model therefore needs to be updated constantly to match the current distribution. To solve this problem, an online learning algorithm is introduced into the prediction model, and a new vibration signal prediction model, Error-LSTM (E-LSTM), is proposed in this paper. Building on the LSTM model, E-LSTM improves both accuracy and efficiency. First, the hidden layer neurons are divided into blocks, and only part of the modules are computed at each time step, which accelerates training and improves efficiency. Second, for data with different distributions within a dataset, the E-LSTM model adopts different training strategies according to the test error: it focuses on training data with similar distributions and reduces the interference of differently distributed data, thus improving accuracy.

In the experimental part, two different datasets were used to test the model, comparing the LSTM model with the E-LSTM model, and the improvement was quantified to better demonstrate the superiority of E-LSTM. In summary, compared with the LSTM model, the E-LSTM model has advantages in both accuracy and efficiency, which indicates that the proposed model is effective.

In conclusion, this work is motivated by the fact that the accuracy of a vibration signal prediction model decreases when the data distribution changes; the application scenario of the E-LSTM model is therefore forecasting time series whose data distribution changes. If the time series is smooth and its distribution does not change, the accuracy of the E-LSTM model is not necessarily higher than that of the LSTM model, although its efficiency still is. In addition, while the research object of this paper is industrial big data, the idea of the E-LSTM model can also be applied in other fields in the future, such as financial big data and medical big data.