1 Introduction

Ball screws have been widely applied in CNC machine tools owing to their high efficiency, precision, and stiffness [1]. The productivity of a machine tool is determined by its speed and feed rate, both of which have trended markedly higher in recent years. However, high speeds and feed rates bring severe thermal issues. The temperature rise causes thermally induced error of ball screws, which seriously degrades the machining precision of CNC machine tools, especially in high-speed and precision machining. It is reported that thermal error accounts for 40–70% of the total error of machine tools [1, 2]. Generally, the heat in ball screws is mainly generated by the motor, nuts, and bearings; it causes thermal deformation of the screw and thus a loss of position accuracy [3]. Therefore, as manufacturing precision requirements increase, it is extremely important to reduce the influence of the thermal issues of ball screws on position accuracy.

Reduction and compensation of thermal deformation are the two main technical measures for addressing the thermal issues of ball screws. In terms of thermal error reduction, Xu et al. [4, 5] and Shi et al. [6] discussed air/liquid cooling systems in a ball screw shaft to reduce thermal errors and reach thermal equilibrium quickly. A series of tests showed that the position accuracy could be significantly improved. Nevertheless, such systems become more complicated and cause a loss of stiffness. Gao et al. [7] proposed an adaptive method based on carbon fiber–reinforced plastics to reduce thermal deformation, but the designed ball screws are difficult to implement in engineering practice because of the oversized clamping structures at the two ends. Guo et al. [8] proposed bio-inspired graphene-coated ball screws, modeled on the Saharan silver ant, to reduce thermal deformation. However, the coating may peel off from the surface of the ball screw, causing the method to fail.

Thermal error compensation is a cost-effective method to solve this problem, which depends only on exact measurement and accurate prediction of the thermal error during machining. In the last decades, the most widely used algorithm in thermal error modeling was multivariate linear regression analysis. Yang et al. [9] proposed a thermal error model based on a cerebellar model articulation controller (CMAC) neural network, which can capture the nonlinear and interaction characteristics between the thermal errors and the temperature field of machine tools. However, the convergence rate and prediction accuracy are unsatisfactory since the model parameters are not optimized. Yang et al. [10] reduced the number of sensors according to thermal expansion and thermal bending mode analysis and established a multivariate linear regression model to compensate the thermal error of a CNC turning center. Nevertheless, a practical regression model with high accuracy and robustness is difficult to establish by this method. Zhao et al. [11] proposed a method for determining the convection heat transfer coefficient and then dynamically simulated the temperature field and thermal errors using finite element analysis (FEA), with results satisfactory enough to replace the experimental ones. Zhu et al. [12] developed a temperature sensor placement scheme and thermal error modeling strategy by using finite element analysis, the thermal mode concept, and linear regression models. However, the boundary conditions under actual working conditions were not addressed in these studies. Some researchers focused on mathematical models to reveal the error generation mechanism. However, this modeling approach is less adaptable, since the temperature rise varies in a complicated way and the analytic relationship therefore changes with the working condition [13, 14].
In recent years, with the development of computer technology and intelligent algorithms, gray theory, support vector machines, artificial neural networks, and some hybrid approaches have gradually been employed by many researchers. Ramesh et al. [15, 16] employed Bayesian networks to classify the thermal error under different parameter settings and adopted the support vector machine algorithm to determine the thermal error of each class. Wu et al. [17] established a GA-BPN–based thermal error model for online prediction of the thermal errors and developed a real-time compensation system to compensate for the thermal drift errors, but the performance of the model under complex conditions was not revealed. Zhang et al. [18] used artificial neural networks and gray theory to enhance the robustness and accuracy of the thermal error model, but the weight coefficients seem difficult to adjust in real time. Wang et al. [19], Miao et al. [20, 21], and Yang et al. [22] used a combination of fuzzy clustering analysis and advanced algorithms to establish thermal error models. Abdulshahed et al. [23, 24] established ANN and ANFIS thermal error models for thermal error compensation on CNC machine tools. Liu et al. [25] established a thermal error model by using the ridge regression algorithm to suppress the adverse influence of collinearity on the robustness of thermal error prediction. Santos et al. [26] established physical models combined with data-driven models based on the ANN and FEM simulation to predict the thermal error. Huang et al. [27] used a genetic algorithm (GA) to optimize the initial weights and thresholds of a back propagation (BP) network for training on the thermal error sample data and modeling the thermal error. Rojek et al. [28] used single-directional multilayered neural networks with error back propagation (MLP), radial basis function neural networks (RBF), and Kohonen networks to establish the compensation model of ball screws. Li et al. [29] proposed a thermal error compensation model that uses a genetic algorithm to optimize a wavelet neural network. The robustness of the thermal error model, variable searching, and modeling time can be optimized theoretically. Although these methods can enhance the prediction robustness of the model, the prediction accuracy is reduced at the same time [23]. The robustness and accuracy of the model need to be improved since the thermal errors are related not only to the temperature of selected sensitive points at a given moment but also to historical temperature values. Therefore, it is necessary to establish an accurate real-time and historical mapping relationship between temperature fields and thermal errors. The long short-term memory (LSTM) network, one of the modern deep learning models, has a strong ability for time series forecasting in various fields, since it can dynamically learn new information while maintaining a persistent memory of historical information. Sagheer et al. [30] implemented time series forecasting of petroleum production based on LSTM recurrent networks optimized by a genetic algorithm (GA). Zhang et al. [31] employed the LSTM network to predict the remaining useful life (RUL) of lithium-ion batteries. Zhang et al. [32] developed an LSTM model to predict water table depth in agricultural areas and evaluated the ability of the proposed model. Qin et al. [33] employed the LSTM model to predict gear remaining life. In short, the LSTM network captures both short-term correlation and long-term dependence, so it can characterize the dependence relationships in time sequence data and predict the trend of the time series. However, research on thermal error prediction of machine tools using the LSTM network is rarely reported, especially in the field of ball screws.

In this research, a novel thermal error model of ball screws is proposed. First, an LSTM model is developed to forecast the time series thermal errors of ball screws. To establish an accurate time-varying mapping relationship between temperature fields and thermal errors, the particle swarm optimization (PSO) algorithm is employed to optimize the hyperparameters of the LSTM network and thereby improve the performance of the model. Furthermore, the effectiveness and robustness of the PSO-LSTM model are verified with the collected experimental data. Finally, the performance of the proposed model is compared with that of traditional ones. The rest of this paper is organized as follows. In Section 2, the modeling process and relevant theory of the proposed method are introduced. In Section 3, the thermal mode analysis and experiments are conducted, and the performance of the proposed data-driven model is discussed and compared with other methods. Section 4 summarizes the main conclusions.

2 Thermal error prediction of ball screws based on PSO-LSTM

In this research, a deep learning model based on PSO-LSTM is proposed to predict the thermally induced error of ball screws from temperature and deformation data measured by temperature sensors and an eddy current displacement sensor, respectively. The essential configuration of the deep learning model is then optimized by the particle swarm optimization (PSO) algorithm. The model is intended to determine the thermal deformation quickly and to be used for online thermal error compensation in the working state. It is established and optimized with the Matlab deep learning toolbox and global optimization toolbox, respectively.

2.1 LSTM neural network

The long short-term memory (LSTM) network, one of the popular deep learning architectures of recent years, was developed specifically to learn and handle long-term dependencies in sequential data, such as the temperature rise and thermal deformation of machine tools during machining. The exploding/vanishing gradient problem, which causes the learning of a recurrent neural network (RNN) to stop or become too slow, can be solved by introducing a memory cell and three gating mechanisms into the architecture, which modifies the RNN structure. Figure 1 shows the framework of an LSTM neural network. The core of the LSTM network is the cell memory state, represented by the horizontal line through the cell; it is similar to a conveyor belt running through the entire cell with only a few branches. The network also includes three gates, known as the forget gate, input gate, and output gate, to control and update the cell state. Therefore, the network can pass information through the cell and update it to maintain its memory state over time. Owing to these characteristics, LSTM can not only handle variable-length time series data and capture long-term dependencies but also learn new information dynamically while maintaining a persistent memory of historical information [34, 35, 36, 37].

Fig. 1 Framework of the LSTM neural network

The first step of the LSTM is to remove unimportant information, which is performed by a unit called the forget gate and can be expressed as

$$ {f}_t=\sigma \left({W}_f\cdot \left[{h}_{t-1},{x}_t\right]+{b}_f\right) $$
(1)

where \( f_t \) represents the forget gate, σ is the sigmoid activation function, \( W_f \) is the weight matrix of the forget gate, \( h_{t-1} \) is the output at the previous time \( t-1 \), \( x_t \) is the current input, and \( b_f \) is a bias vector.

The next step is to select and add valuable information to the network through input gate and to produce new cell information waiting to be selected. This process can be expressed as

$$ {i}_t=\sigma \left({W}_i\cdot \left[{h}_{t-1},{x}_t\right]+{b}_i\right) $$
(2)
$$ {\tilde{C}}_t=\tanh \left({W}_C\cdot \left[{h}_{t-1},{x}_t\right]+{b}_C\right) $$
(3)

where \( i_t \) and \( {\tilde{C}}_t \) are the input gate and the intermediate (candidate) cell value during the calculation, \( W_i \) and \( W_C \) are the weight matrices of the input gate and internal state, and \( b_i \) and \( b_C \) are the biases of the input gate and internal state.

Subsequently, new cell information is updated in the network through input gate and forget gate, which is formulated as

$$ {C}_t={f}_t\ast {C}_{t-1}+{i}_t\ast {\tilde{C}}_t $$
(4)

where \( C_t \) and \( C_{t-1} \) are the cell states at the current time t and the previous time \( t-1 \), respectively, and ∗ denotes element-wise multiplication.

Finally, the output of the LSTM is determined by the output gate, which selects from the updated cell state, and is written as

$$ {o}_t=\sigma \left({W}_o\cdot \left[{h}_{t-1},{x}_t\right]+{b}_o\right) $$
(5)
$$ {h}_t={o}_t\ast \tanh \left({C}_t\right) $$
(6)

where \( o_t \), σ, \( W_o \), and \( b_o \) are the output gate, the activation function, the weight matrix of the output gate, and the bias of the output gate, respectively, and \( h_t \) is the output at the current time t.

The above equations reveal the internal calculation mechanism of the LSTM: the network output at each time step depends on the current input, the previous output, and the cell state, which allows the network to handle variable-length time series data and capture long-term dependencies when predicting future values.
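As an illustrative sketch only (the model in this paper is built with the Matlab deep learning toolbox, not with this code), one time step of Eqs. (1)–(6) can be written in plain Python. The function names `lstm_step`, `affine`, and `gate` and the layout of the `params` dictionary are assumptions introduced here for illustration, and σ is taken to be the logistic sigmoid.

```python
import math


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def affine(W, v, b):
    """Return W·v + b, with the weight matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]


def gate(params, name, z, activation):
    """Apply one gate: activation(W·z + b) for the (W, b) pair stored under name."""
    W, b = params[name]
    return [activation(a) for a in affine(W, z, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params maps 'f', 'i', 'C', 'o' to (W, b) pairs (hypothetical layout)."""
    z = list(h_prev) + list(x_t)                 # concatenation [h_{t-1}, x_t]
    f_t = gate(params, "f", z, sigmoid)          # Eq. (1): forget gate
    i_t = gate(params, "i", z, sigmoid)          # Eq. (2): input gate
    c_tilde = gate(params, "C", z, math.tanh)    # Eq. (3): candidate cell value
    c_t = [f * c + i * g                         # Eq. (4): cell state update
           for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    o_t = gate(params, "o", z, sigmoid)          # Eq. (5): output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]  # Eq. (6): output
    return h_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate value to 0, so the step simply halves the previous cell state, which makes a convenient sanity check.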

2.2 Thermal error modeling by LSTM

The thermally induced error of ball screws can be processed by the LSTM network because the temperature and deformation are time series data that rise or fall with time once the machine tool is working. First, the essential configuration of the LSTM network is designed based on the input and output data, where the numbers of input and output features correspond to the temperature measurement points and the thermal deformation, respectively. Multiple LSTM layers and an added fully connected layer are introduced into the network model. In the model training process, the weights and biases are updated by the Adam optimizer, and the root mean square error (RMSE) is used as the fitness function to evaluate the performance of the LSTM. Additionally, the hyperparameters of the LSTM are tuned by an optimization algorithm because of its ability to find good solutions quickly.

2.3 Hyperparameter optimization algorithm

To establish a network with accurate performance, it is necessary to search for the optimal parameters of the model. The particle swarm optimization (PSO) algorithm is derived from imitating the foraging behavior of birds. As one of the evolutionary computation techniques, PSO lets individuals in the group collaborate and share information to find the optimal solution quickly, which makes it suitable for searching for optimal parameters. Therefore, PSO is applied in this study to optimize the network hyperparameters. The algorithm continuously searches for and updates the personal best of each particle and the global best of the swarm, while updating the position and velocity of each particle for the next iteration. The search process of this algorithm is summarized in Fig. 2.

Fig. 2 Search process of particles
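The search loop described above can be sketched as a basic global-best PSO. This is a minimal sketch under stated assumptions, not the configuration used in this study: the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the swarm size, and the function name `pso_minimize` are hypothetical, and the quadratic `toy_objective` merely stands in for the validation RMSE of a trained LSTM.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over box bounds with a global-best particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # global best of the swarm

    for _ in range(n_iters):
        for k in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal best + pull toward global best.
                vel[k][d] = (w * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])
                             + c2 * r2 * (gbest[d] - pos[k][d]))
                # Position update, clamped to the search bounds.
                pos[k][d] = min(max(pos[k][d] + vel[k][d], lo), hi)
            val = objective(pos[k])
            if val < pbest_val[k]:
                pbest[k], pbest_val[k] = pos[k][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[k][:], val
    return gbest, gbest_val


def toy_objective(p):
    # Hypothetical stand-in for "train an LSTM with these settings and return its RMSE".
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best, best_val = pso_minimize(toy_objective, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

In a hyperparameter search of the kind described here, each coordinate of a particle would encode one setting (for example the learning rate or the unit number, rounded to an integer where needed), and each objective evaluation would involve training and scoring a network, so the swarm size and iteration count directly set the computational cost.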

2.4 Data normalization

To ensure the equivalence and homogeneity of the various factors, the sample data must be converted to dimensionless normalized data. Therefore, the Z-score method is used to normalize the data of the model uniformly in this study, as shown in Eqs. (7)–(9).

$$ \overline{x}=\frac{1}{n}\sum \limits_{i=1}^n{x}_i $$
(7)
$$ S=\sqrt{\frac{1}{n-1}\sum \limits_{i=1}^n{\left({x}_i-\overline{x}\right)}^2} $$
(8)
$$ {x}_i^{\hbox{'}}=\frac{x_i-\overline{x}}{S} $$
(9)

where \( x_i \) and \( \overline{x} \) are the sample data and average value, S and \( {x}_i^{\hbox{'}} \) are the standard deviation and normalized value, and n is the number of samples.
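Equations (7)–(9) translate directly into a few lines of Python. The sketch below is illustrative; the function name `zscore` is an assumption, and it uses the sample standard deviation with the n − 1 denominator exactly as in Eq. (8).

```python
import math


def zscore(values):
    """Return (normalized values, mean, standard deviation) following Eqs. (7)-(9)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two samples are needed for the n - 1 denominator")
    mean = sum(values) / n                                            # Eq. (7)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))   # Eq. (8)
    if std == 0.0:
        raise ValueError("constant data cannot be Z-score normalized")
    return [(v - mean) / std for v in values], mean, std              # Eq. (9)
```

Returning the mean and standard deviation alongside the normalized values is one common design choice: it allows the same statistics to be reused on new data and lets predictions be mapped back to physical units.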

2.5 Evaluation metrics of the model

RMSE, MAE, MSE, and MAPE are the four metrics used to evaluate the performance of the model. A smaller value of each metric indicates better model performance. They are expressed as

$$ RMSE=\sqrt{\frac{1}{n}\sum \limits_{i=1}^n{\left({y}_i-{y}_i^{\hbox{'}}\right)}^2} $$
(10)
$$ MAE=\frac{1}{n}\sum \limits_{i=1}^n\left|{y}_i-{y}_i^{\hbox{'}}\right| $$
(11)
$$ MSE=\frac{1}{n}\sum \limits_{i=1}^n{\left({y}_i-{y}_i^{\hbox{'}}\right)}^2 $$
(12)
$$ MAPE=\frac{100}{n}\sum \limits_{i=1}^n\left|\frac{y_i-{y}_i^{\hbox{'}}}{y_i}\right| $$
(13)

where n is the number of samples and \( y_i \) and \( {y}_i^{\hbox{'}} \) are the test (measured) and predicted values of the i-th sample, respectively.
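For reference, the four metrics of Eqs. (10)–(13) can be computed as in the following sketch. The function name `regression_metrics` and the dictionary return type are assumptions made for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute RMSE, MAE, MSE, and MAPE (in percent) following Eqs. (10)-(13)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    n = len(y_true)
    errors = [yt - yp for yt, yp in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n                      # Eq. (12)
    mae = sum(abs(e) for e in errors) / n                     # Eq. (11)
    # Eq. (13) divides by the measured value, so every y_true must be nonzero.
    mape = 100.0 / n * sum(abs(e / yt) for e, yt in zip(errors, y_true))
    return {"RMSE": math.sqrt(mse), "MAE": mae, "MSE": mse, "MAPE": mape}  # RMSE: Eq. (10)
```

Because MAPE divides by the measured value, samples with a measured value of zero make it undefined; how such samples are handled is not stated in the text and is left open here.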

2.6 Thermal error prediction based on PSO-LSTM model

The learning rate is the most crucial hyperparameter, followed by the network size [38]. In this paper, two hyperparameters of the LSTM network, the learning rate and the unit number, are searched by the PSO algorithm, with the root mean square error as the fitness function, to find the learning rate and unit number that give the best fitness value. Simultaneously, the time window size of the data set is determined by PSO; the time window is introduced into the data set to improve the accuracy of the model. In this research, the relationship between the thermal deformation and the temperature rise of the measured points is given by

$$ {Y}_t=f\left({Y}_{t-1},{T}_{t-1},{T}_{t-2},\cdots, {T}_{t-n}\right) $$
(14)

where \( Y_t \) and \( Y_{t-1} \) denote the predicted thermal error at the current time t and the previous time \( t-1 \); \( T_{t-1}, T_{t-2}, \cdots, T_{t-n} \) are the previous temperatures; and n is the time window size.
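One way to turn recorded series into input/target pairs with the structure of Eq. (14) is sketched below. The function name `make_windows` and the flat feature layout are assumptions, and the measured error at \( t-1 \) is used in place of the predicted one when building training samples, which is a common convention rather than something stated in the text.

```python
def make_windows(temps, errors, n):
    """Build (features, target) pairs shaped like Eq. (14).

    temps  : list of per-time-step temperature readings, one list per step
    errors : list of thermal errors, one value per step
    n      : time window size
    """
    if len(temps) != len(errors):
        raise ValueError("temps and errors must have the same length")
    if n < 1:
        raise ValueError("the time window size must be at least 1")
    samples = []
    for t in range(n, len(errors)):
        # Temperatures T_{t-1}, T_{t-2}, ..., T_{t-n}, most recent first.
        window = [temps[t - k] for k in range(1, n + 1)]
        features = [errors[t - 1]] + [v for row in window for v in row]
        samples.append((features, errors[t]))   # target is Y_t
    return samples
```

A series of length N yields N − n samples, since the first n time steps do not have a full window of history.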

The time series data of temperature and axial thermal deformation are measured by temperature sensors and an eddy current displacement sensor, respectively. These data are then imported into the LSTM network to train the data-driven LSTM model. Subsequently, the optimal learning rate and unit number determined by the PSO algorithm are input into the network to obtain the optimal configuration of the model for predicting the thermally induced error. The flow chart of the proposed PSO-LSTM model is shown in Fig. 3.

Fig. 3 Flow chart of the thermal error prediction modeling process. Software: Matlab R2019a, deep learning and global optimization toolboxes, Win7 64-bit operating system. Hardware: RAM 8 GB, Intel(R) Core(TM) i5-5200U CPU 2.20 GHz

3 Results and validation

To validate the PSO-LSTM model for predicting the thermally induced error of ball screws, thermal characteristic experiments for time series thermal error forecasting are carried out in this paper. First, the thermally sensitive points of ball screws are identified by thermal modal analysis (TMA) to determine the key temperature measurement points. Second, a thermal experiment with a step motor speed is carried out to obtain temperature and deformation data and validate the effectiveness of the PSO-LSTM model, and then another thermal experiment with random speed is carried out to further verify the robustness of the proposed model. Finally, the performance of the proposed model is compared with that of the RBF model and the BP model.

3.1 Thermal sensitive point

Ball screw feed drive systems mainly include a motor, a coupling, bearings, and a ball screw. In the working state, the thermal deformation of ball screws is mainly caused by the temperature rise of the bearings, nut, and screw. For thermal error modeling, a correlation between the temperature rise of the components and the deformation must be established. Therefore, the selection of temperature-sensitive points based on thermal modes is introduced in this section. The thermal modes of the system represent the distribution of the temperature field for the corresponding eigenvalues, each of which is the reciprocal of a time constant. The transient response of the temperature field can be obtained by superposing the thermal modes of each order. Additionally, the areas with significant temperature changes can be identified from the shapes of the low-order thermal modes. To analyze the thermally sensitive points of ball screws, a finite element (FE) model of ball screws is established and the thermal modal analysis is carried out.

To perform the thermal modal analysis, the finite element solution of the underlying heat transfer problem needs to be solved [12], which requires the integration of coupled differential equations of the form

$$ \left[{C}_T\right]\left\{\dot{T}(t)\right\}+\left[{K}_T\right]\left\{T(t)\right\}=\left\{Q(t)\right\} $$
(15)

where \( [C_T] \) and \( [K_T] \) are the heat capacity matrix and the heat conductivity matrix, respectively, \( \{T(t)\} \) is the nodal temperature vector, and \( \{Q(t)\} \) is the nodal thermal load vector.

The eigen-problem associated with Eq. (15) is

$$ \left[{K}_T\right]\left[{\Phi}_T\right]=\left[{C}_T\right]\left[{\Phi}_T\right]\left[\Lambda \right] $$
(16)

where \( [\Lambda] \) is a diagonal matrix composed of all the eigenvalues and \( [\Phi_T] \) is the corresponding eigenvector matrix. Theoretically, the time constant of each mode is the reciprocal of the corresponding eigenvalue:

$$ {\tau}_i=\frac{1}{\lambda_i} $$
(17)

where \( \lambda_i \) and \( \tau_i \) are the i-th eigenvalue and time constant, respectively. The time constant describes how quickly the mode responds to thermal loads.
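As a short worked step connecting Eqs. (15)–(17), consider the unloaded case \( \{Q(t)\}=\{0\} \) and a trial solution consisting of a single mode shape that decays exponentially. The symbol \( \{\phi_i\} \), denoting the i-th column of \( [\Phi_T] \), is introduced here for illustration. Substituting the trial solution into Eq. (15) gives

$$ \left\{T(t)\right\}=\left\{{\phi}_i\right\}{e}^{-{\lambda}_it}\kern1em \Rightarrow \kern1em -{\lambda}_i\left[{C}_T\right]\left\{{\phi}_i\right\}{e}^{-{\lambda}_it}+\left[{K}_T\right]\left\{{\phi}_i\right\}{e}^{-{\lambda}_it}=\left\{0\right\}\kern1em \Rightarrow \kern1em \left[{K}_T\right]\left\{{\phi}_i\right\}={\lambda}_i\left[{C}_T\right]\left\{{\phi}_i\right\} $$

which is Eq. (16) written for one mode. With \( \tau_i = 1/\lambda_i \), the mode evolves as \( e^{-t/\tau_i} \), so after a time \( t=\tau_i \) its amplitude has fallen to \( 1/e \) (about 36.8%) of the initial value. A large time constant therefore corresponds to a slowly varying mode.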

The first four thermal modes, with their time constants and corresponding temperature field distributions, are calculated by thermal modal analysis, and the temperature magnitudes of each mode are normalized by Eq. (18).

$$ {T}_j^{\hbox{'}}=\frac{T_j}{{\left\Vert \mathbf{T}\right\Vert}_{\infty }} $$
(18)

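where \( T_j \) is the temperature result at the j-th position, \( {\left\Vert \mathbf{T}\right\Vert}_{\infty } \) is the largest absolute temperature value in the mode, and \( {T}_j^{\hbox{'}} \) is the normalized temperature result.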
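The normalization in Eq. (18) amounts to dividing every nodal value by the largest absolute value in the mode, as the following small sketch shows. The function name `normalize_mode` is an assumption made for illustration.

```python
def normalize_mode(temps):
    """Scale a mode shape by its infinity norm, as in Eq. (18)."""
    peak = max(abs(t) for t in temps)   # infinity norm of the temperature vector
    if peak == 0.0:
        raise ValueError("an all-zero mode shape cannot be normalized")
    return [t / peak for t in temps]
```

After this scaling the largest entry of each mode has magnitude 1, so the shapes of different modes can be compared on a common scale.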

The first four thermal modes with their temperature fields and thermal time constants are shown in Fig. 4. The thermal time constants of the ball screw are large, indicating that the temperature field varies slowly. Significant temperature rise occurs at the fixed bearing, the support bearing, and the nut/screw interface. The reason is that the heat is mainly generated by friction between components, which markedly raises the temperature of the frictional contact regions. Therefore, the temperature-sensitive points are determined accordingly, and three temperature measuring points are arranged on the surface of the nut and the end surfaces of the two bearings in this research.

Fig. 4 First four thermal modes with temperature fields and time constants. a 1st order, b 2nd order, c 3rd order, d 4th order

3.2 Experimental results and validation

To verify the effectiveness and robustness of the proposed model and to compare its performance with that of other models, two thermal characteristic experiments, with step speed and random speed, are conducted on the high-speed precision ball screw test bench in our laboratory. The experimental setup is shown in Fig. 5a–e. Parameters of the measuring instruments are shown in Table 1. Three temperature sensors are mounted on the feed drive system to obtain the temperature data; they are arranged on the front bearing seat, the rear bearing seat, and the nut, respectively. An eddy current displacement sensor is installed at the end face of the screw to measure the thermal deformation. Additionally, the ambient temperature is monitored during the experiment to account for its effect on thermal deformation.

Fig. 5 Experimental setup. a Test bench of ball screws, b temperature sensor mounted on nut, c temperature sensor mounted near fixed bearing, d temperature and thermal deformation test, e data acquisition system

Table 1 Parameters of experimental instrument

First, the ball screw runs in a reciprocating cycle at a rotational speed of 500 r/min for 10 min to avoid the influence of clearances between the parts of the ball screw and the instability of the machine at the startup stage, so that the accuracy of the experimental results is guaranteed. After these 10 min, the experiment is carried out according to the speed spectrum, and the nut moves along the screw in a reciprocating cycle. The temperatures of the front bearing housing, the rear bearing housing, and the nut, the ambient temperature, and the thermal deformation of the screw are collected in real time; the data are recorded by a self-designed data acquisition system with a sampling period of 600 ms. After each experiment, the components are allowed to cool naturally to the ambient temperature before the next experiment begins.

A thermal characteristic experiment with a step motor speed is carried out to validate the effectiveness of the PSO-LSTM model. The motor speed is changed every 20 min, increasing from 300 r/min to 1000 r/min and then decreasing to 300 r/min (see Fig. 6). The experimental result based on step speed is shown in Fig. 7. The measured temperature rise and thermal deformation data are normalized and fed to the neural network models. The first 80% of the data is taken as the training set and the last 20% as the testing set. The PSO search gives a time window size, learning rate, and unit number of 5, 0.0037, and 88, respectively. To verify the advantage of the proposed model, its results are compared with those of other models. The parameters of each network model are assigned fairly. Comparative results of each model under the step speed are shown in Fig. 8. The result of the PSO-LSTM model is almost entirely consistent with the experimental result. The RBF and BP models can reasonably predict values that never appeared in training, but the error between the predicted and experimental results of both models is greater than that of the PSO-LSTM model. The result on the testing set indicates the performance of a model and is the most critical for validating it. Figure 9 illustrates the relative error distribution characteristics of the three models. The results of the PSO-LSTM model and the BP model are better than those of the RBF model on the testing set. Both the PSO-LSTM model and the BP model have high accuracy, but the PSO-LSTM model is more accurate than the BP model. Figure 10 presents the absolute error distribution characteristics of the three methods. Many error points of the RBF and BP models exceed 10 μm, and some of them are even close to 20 μm. In contrast, the absolute errors of the PSO-LSTM model are mostly within 5 μm; only a few points exceed 5 μm, and none is larger than 8 μm.

Fig. 6 Step speed spectrum

Fig. 7 Experimental results of thermal error in step speed

Fig. 8 Comparative results between the experiment and the prediction of the three models. a Predictive results of the PSO-LSTM model, b predictive results of the RBF model, c predictive results of the BP model

Fig. 9 Absolute relative error of the three models

Fig. 10 Absolute error scatterplot of the testing set. a PSO-LSTM model, b RBF model, c BP model

Figure 11 shows the results of regression analysis for the three models. The regression prediction performance of the PSO-LSTM model is better than that of the others. To quantify the predictive performance of the models, the four criteria, RMSE, MAE, MSE, and MAPE, on the testing set of the three models are listed in Table 2. The PSO-LSTM model has the smallest RMSE, MAE, MSE, and MAPE and thus the highest precision, followed by the BP model and then the RBF model. Consequently, the proposed model provides an accurate prediction of the global trend of the thermal error, and the thermal error at a given time can also be accurately determined. As the predicted results agree well with the experimental ones, the PSO-LSTM model is validated. Comparison with the other models shows that the PSO-LSTM model has an advantage in prediction accuracy.

Fig. 11 Regression analysis. a PSO-LSTM model, b RBF model, c BP model

Table 2 Comparison among different predictive models

Another thermal experiment is carried out according to the random speed spectrum (see Fig. 12) to verify the robustness of the proposed model. The experimental results based on random speed are shown in Fig. 13. The experimental data are normalized and imported into the model. The first 80% of the data is taken as the training set and the last 20% as the testing set. The PSO search gives a time window size of 5, a learning rate of 0.0029, and a unit number of 86. For a fair comparison, the RBF and BP models are set with the same structure. The prediction results of the three models are shown in Fig. 14. It can be seen from Fig. 14 that the PSO-LSTM model outperforms the others and that the accuracy of the RBF and BP models declines noticeably. Although the results of the RBF and BP models are acceptable for the training set, some data in the testing set show large errors, a few of them close to 6%. The results of the RBF and BP models for the testing set both contain obvious errors, which indicates that their generalization ability is not strong enough to predict data that never appear in the training stage. In contrast, the proposed PSO-LSTM model can predict the thermal error variation more accurately.

Fig. 12 Random speed spectrum

Fig. 13 Experimental results of thermal error in random speed

Fig. 14 Comparative results between the experiment and the prediction of the three models. a Predictive results of the PSO-LSTM model, b predictive results of the RBF model, c predictive results of the BP model

The relative error obtained by the PSO-LSTM model is smaller than that of the other two models, as depicted in Fig. 15. The absolute error distribution characteristics of the three methods are shown in Fig. 16. The PSO-LSTM model has a smaller and narrower error band. Most error points of the RBF and BP models in the testing set exceed 10 μm, and some of them are even close to 20 μm. In contrast, the absolute errors of the PSO-LSTM model are all within 5 μm, which shows the excellent performance of the proposed model. The results of regression analysis for the three models are shown in Fig. 17. The regression prediction performance of the PSO-LSTM model is better than that of the others. The statistical evaluation indexes on the testing set of the three models are given in Table 3. The PSO-LSTM model still has the smallest RMSE, MAE, MSE, and MAPE and the highest precision. The accuracy of the PSO-LSTM, RBF, and BP models at step speed is 98.5%, 97.5%, and 98.1%, respectively, and their accuracy at random speed is 99.4%, 96.5%, and 97.2%, respectively. The PSO-LSTM model has higher accuracy and lower error than the RBF and BP models in both thermal characteristic experiments. According to the experimental results, the generalization ability of the PSO-LSTM model is superior to that of the other two models. The proposed model not only provides an accurate prediction of the thermal error but also maintains a stable and satisfactory performance even under complex working conditions. Therefore, the effectiveness and robustness of the proposed model are verified. The proposed method can be implemented to predict the thermal error of ball screws and provides a foundation for thermal error compensation.

Fig. 15 Absolute relative error of the three models

Fig. 16 Absolute error scatterplot of the testing set. a PSO-LSTM model, b RBF model, c BP model

Fig. 17 Regression analysis curve. a PSO-LSTM model, b RBF model, c BP model

Table 3 Comparison among different prediction models

In addition, two predictive results based on different divisions of the training set and testing set are shown in Fig. 18 to illustrate the performance of the PSO-LSTM model. To facilitate comparison, the experimental data based on random speed are used to establish the model and predict the thermal error of ball screws, with the training set and testing set changed by two randomly chosen divisions. Figure 18a is the result of the PSO-LSTM model when the first 74% of the data is assigned as the training set and the last 26% as the testing set. The predictive accuracy of this division reaches 98.5%; it is lower than that of the division in Fig. 14a but still high. In the other division, the first 82% of the data is taken as the training set and the rest as the testing set. The predictive result of this division is shown in Fig. 18b. The predictive accuracy is 99.5%, slightly higher than that of the division in Fig. 14a. The results show that the PSO-LSTM model has considerable stability and high predictive accuracy under different divisions, and that its accuracy can be improved by increasing the proportion of the training set.

Fig. 18 Effects of data division on the performance of PSO-LSTM model. a First 74% data assigned as training data, b first 82% data assigned as training data

4 Conclusions and future work

In this paper, a novel data-driven model based on PSO-LSTM is proposed for predicting the thermally induced error of ball screws, in which a deep learning model combined with an intelligent optimization algorithm is established from experimental results. The thermal error of ball screws can be accurately predicted by the proposed method, which provides a foundation for thermal error compensation to improve machining accuracy. The effectiveness and robustness of the proposed model are validated with thermal errors obtained from experiments at step speed and random speed, and the proposed model is compared with traditional models. The following conclusions can be drawn:

(1) A novel data-driven thermal error model of ball screws based on PSO-LSTM is proposed, and the effectiveness of this model is validated by a thermal characteristic experiment at step speed on a self-designed ball screw test bench. Comparison between the PSO-LSTM model and the experiment shows that the thermal error of ball screws can be accurately predicted by the proposed model, which can provide a foundation for thermal error compensation.

(2) The robustness of the proposed model is verified by another thermal characteristic experiment at random speed. Comparison between the predicted and experimental results shows that the PSO-LSTM model not only provides an accurate prediction of the thermal error but also maintains stable and satisfactory robustness even under complex working conditions.

(3) Comparative results between the proposed model and the traditional models are analyzed. For predicting the thermal error of ball screws in this paper, the PSO-LSTM model has higher accuracy and lower error than the RBF and BP models and has the smallest RMSE, MAE, MSE, and MAPE. It is concluded that the PSO-LSTM model outperforms the others.

Although the predictive thermal error model of ball screws is established, the effect of thermal error compensation has not been checked yet. The thermal compensation based on this proposed model will be conducted, and its effects will be checked in the next stage of our research work.