Introduction

Thermal error accounts for over 70% of the total error of machine tools (Mayr et al., 2012). Accurate thermal error modeling is therefore of great significance for improving machining accuracy. Various methods have been presented to analyse, model and predict thermal errors; they can be categorized into physics-based methods and data-driven methods.

For the physics-based methods, the relevant research focuses on mechanism analysis of the heat generation, heat transfer, and resultant deformation in the manufacturing system. Liu et al. (2019) proposed an optimization method for the thermal boundary conditions, including the thermal loads, the convective heat transfer coefficient and the thermal contact resistance, to reduce thermal elongation. Świć et al. (2021) presented a thermo-mechanical method based on the thermal deformation mechanism to increase the accuracy of long low-rigidity shafts. Liu et al. (2021) established a physically-based model for the time-varying nonlinear thermal error of the screw in a servo axis. Grama et al. (2018) calculated the heat generation and dissipation of a motorized spindle and presented an effective cooling strategy to reduce the thermal error. Physics-based methods often require an in-depth physical understanding of the manufacturing system in order to develop closed-form mathematical models of the temperature and deformation fields. However, such prior knowledge of system behavior is not always available because of the complex error sources in actual machining, where the thermal system is high-order, nonlinear, dynamic, and accompanied by hysteresis. To address these shortcomings, data-driven methods have drawn increasing attention.

Data-driven methods aim to explore the internal characteristics of process data and establish the mapping between temperature and thermal error without considering the intrinsic physical process. Liu et al. (2020) applied multivariable regression analysis (MRA) to the thermal error modeling of a spindle; however, its prediction accuracy is limited by a relatively weak nonlinear fitting ability. The artificial neural network (ANN) has since become popular for thermal error modeling owing to its good performance in fitting nonlinear functions, with variants including the back propagation neural network (BPNN) (Yin et al., 2018), the recurrent neural network (Yang & Ni, 2005) and the convolutional neural network (Fujishima et al., 2018). However, the parameters of a neural network model, such as the number of neurons in the hidden layer, the weights, and the thresholds, are not easy to tune. Moreover, gradient descent makes the training prone to local extrema, and a large amount of training data and a long training time are required. The support vector machine (SVM) is a comparable alternative. Miao et al. (2013) found that SVM achieves better prediction accuracy and robustness than MRA with small training datasets, and Ramesh et al. (2002) concluded that SVM is more applicable in a production environment since it requires fewer training data and parameters than ANN. But parameter tuning for SVM is likewise unavoidable, and finding a precise kernel function is critical. Furthermore, the network structures of ANN and SVM must be determined in advance, so many scholars have combined them with parameter optimization methods such as the genetic algorithm (Tian & Luo, 2020), particle swarm optimization (Katherasan et al., 2014) and ant colony optimization (Zhang & Wong, 2018); in essence, these are search algorithms for determining the thresholds and weights of the model. Other data-driven methods include the Bayesian approach (Mosallam et al., 2016), fuzzy logic (Kovac et al., 2013), ridge regression (Liu et al., 2017), and combinations of several methods (Abdulshahed et al., 2016). However, these existing methods are hard to interpret and, more importantly, fail to meet the requirements of actual engineering applications: high accuracy and strong robustness at low measurement and computational cost, achieved with a small amount of data so as to reduce the prohibitive downtime for experimental tests.

The temperatures are generally measured as the input variables, whose collinearity with each other and correlation with the output variable drastically affect model accuracy and robustness (Miao et al., 2015); the locations and number of temperature measuring points are therefore extremely significant. Poorly placed or too few temperature sensors result in poor prediction accuracy, whereas too many sensors also degrade accuracy and robustness, because each sensor may introduce noise and some temperatures inevitably correlate highly with others. Meanwhile, the measurement cost of experiments and the computational cost of modeling would also increase. To seek an optimal sensor layout strategy, many researchers have explored methods of selecting the key temperature points, such as correlation analysis (Lo et al., 1999), grey system theory (Li et al., 2006), fuzzy clustering (Abdulshahed et al., 2015), and the least absolute shrinkage and selection operator (LASSO) (Tan et al., 2017). Nevertheless, each of these methods has drawbacks. Correlation analysis with common measures such as the Pearson correlation coefficient is sensitive only to linear relationships between variables; even for a one-to-one nonlinear mapping, the correlation coefficient can be rather low. Grey system theory fails to eliminate the coupling and collinearity between temperature features and cannot measure negative correlations. Fuzzy clustering, a typical unsupervised learning algorithm, relies on an empirical threshold that largely depends on engineering experience. LASSO arbitrarily selects one feature among highly correlated features and neglects all the others, which easily leads to unstable predictions. In summary, there is as yet no effective method for optimally selecting the key temperature points. In addition, all these methods are conducted separately before establishing the thermal error prediction model, which itself is unable to evaluate feature importance.

The hysteresis effect is a nonnegligible factor that makes conventional static or instantaneous modeling methods less robust; it refers to a system with memory, in which the effect of the current input is experienced with a certain delay in time (Hassani et al., 2014). The temperature usually lags behind the thermal error because temperature sensors mounted on the surface cannot reflect the real internal temperatures. The worst hysteresis behaviour generally occurs in large machine tools with bigger volumes, longer strokes and heavier cutting loads (Tan et al., 2014). The thermal system of a machine tool is nonstationary and time-varying, with thermal time constants that vary with working conditions, so the thermal error depends not only on the current thermal status but also on previous thermal statuses. To account for the hysteresis effect, existing research relies on physics-based analysis of the dynamic characteristics of the thermal system (Yang & Ni, 2003) or requires an extra test of the basic thermal characteristics of the system (Xiang et al., 2018). Hence, it is of prime concern to put forward a concise and efficient data-driven method that can be implemented directly on the process data.

With the rapid advancement of artificial intelligence (Nti et al., 2021), this paper presents a novel thermal error modeling method based on random forest (RF), an ensemble of decision trees. Based on the out-of-bag (OOB) data, the proposed model itself can evaluate feature importance by comparing the decrease in prediction accuracy after randomly shuffling the values of the target feature. The key temperature points are then selected by iterative elimination to improve model performance and save measurement and computational cost. Furthermore, the hysteresis effect between temperature and deformation is considered. The accuracy and robustness of the proposed model were validated through a thermal error experiment, and comparisons with the extensively used BPNN and SVM models demonstrate the superiority of the proposed RF model.

The rest of the paper is organized as follows. Section 2 details the random forest algorithm, including model structure, model construction process and hyper-parametric tuning. Section 3 proposes the method of optimally selecting key temperature points in the thermal system. Section 4 presents the method of determining the time lag considering hysteresis effect between temperature and deformation. A thermal error experiment is carried out in a machine tool and the result and discussion are detailed in Sect. 5. Concluding remarks and future research directions are presented in Sect. 6.

Random forest

The random forest algorithm, developed by Breiman (2001), is a tree-based ensemble learning method consisting of a forest of decision trees and has been widely applied to classification and regression. RF uses bagging to increase the diversity of the trees by growing them from bootstrap samples and random subsets of the input features. In addition, aggregating the predictions of all the diverse decision trees significantly attenuates the influence of noise in the dataset and reduces the overall variance of the model. Compared with conventional machine learning algorithms, RF requires little data preparation, is simple to interpret, and is less likely to overfit a dataset.

Model structure

Figure 1 illustrates the model structure of random forest that constructs N decision trees from bootstrap samples of a training dataset.

Fig. 1
figure 1

Model structure of random forest

Each decision tree is composed of branches and nodes. Each internal node represents a test on a certain input feature, and each branch represents an outcome of the test. A leaf node, i.e. a node that does not split further, represents a class label for classification or a response value for regression. A decision tree in which each node has at most two branches is called a binary tree, which is generally used for regression problems with a continuous response, such as the thermal error modeling in this paper. The classification and regression trees (CART) algorithm supports numerical target variables and is usually implemented to construct binary trees using, at each node, the feature and threshold that yield the largest impurity decrease. Each decision tree is a weak learner; multiple decision trees are grown in parallel to reduce the variance of the random forest model. The final response of the model is obtained by averaging the predicted values of all N regression trees.

Model construction process

First, the training dataset of each regression tree is acquired by sampling from the original training dataset with replacement; such a dataset is called a bootstrap sample. Bootstrap aggregating, or bagging, thus generates N new training datasets of size n to grow N regression trees. The number of regression trees is an important parameter governing the complexity of the model.

Next, for each regression tree, M features are randomly selected without replacement from all available input features as split candidates at each non-leaf node. Such feature bagging reduces the correlation among the trees. Then, starting from the root node, the best split among these features is chosen according to the splitting criterion until the terminal leaf nodes are generated.

The splitting criterion at each node is that the residual sum of squares is minimized after the split. A decision tree recursively partitions the feature space so that samples with the same labels (classification) or similar target values (regression) are finally grouped together from the initially mixed samples. In essence, the tree growing process decreases the impurity of the whole dataset, measured by information entropy or Gini impurity for classification, and by the residual sum of squares for regression, as in this paper.

Suppose that the training dataset \(S = \left\{ {(x_{1} ,y_{1} ),(x_{2} ,y_{2} ), \ldots ,(x_{n} ,y_{n} )} \right\}\) of the vth regression tree is partitioned into m regions R1, R2,…, Rm by the tree, and denote the response in the jth region by a constant cj. The response of the regression tree can then be modeled as

$$ T_{v} (x) = \sum\limits_{j = 1}^{m} {c_{j} I} (x \in R_{j} ) $$
(1)

where I(·) is an indicator function that returns 1 if its argument is true and 0 otherwise.

Under this splitting criterion, the best response \(\hat{c}_{j}\) is the mean of the yi in region Rj

$$ \hat{c}_{j} = mean(y_{i} |x_{i} \in R_{j} ) $$
(2)

Consider a splitting feature k and a split point p, and define the pair of half-planes

$$ R_{1} (k,p) = \left\{ {X|X_{k} < p} \right\}\quad {\text{and}}\quad R_{2} (k,p) = \left\{ {X|X_{k} \ge p} \right\} $$
(3)

The splitting feature k and split point p need to satisfy

$$ \mathop {\min }\limits_{k,p} \left[ {\mathop {\min }\limits_{{c_{1} }} \sum\limits_{{x_{i} \in R_{1} (k,p)}} {(y_{i} - c_{1} )}^{2} + \mathop {\min }\limits_{{c_{2} }} \sum\limits_{{x_{i} \in R_{2} (k,p)}} {(y_{i} - c_{2} )}^{2} } \right] $$
(4)

The minimization within the square brackets can be solved as

$$\begin{aligned} &\hat{c}_{1} = mean(y_{i} |x_{i} \in R_{1} (k,p))\quad {\text{and}}\\ & \hat{c}_{2} = mean(y_{i} |x_{i} \in R_{2} (k,p))\end{aligned}$$
(5)

Herein, the best split has been found. The dataset is divided into two regions, and the splitting process is repeated recursively on each resultant region until a predefined stopping criterion is satisfied, e.g. reaching the maximum allowable depth of the tree or the minimum number of samples allowed in a leaf node.
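As a minimal sketch of the split search in Eqs. (3)–(5), the following Python function exhaustively scans every feature and candidate threshold and returns the pair minimizing the residual sum of squares of Eq. (4); the function name `best_split` and the scan over unique feature values are illustrative choices, not the paper's implementation.

```python
import numpy as np

def best_split(X, y):
    """Exhaustively search the split (k, p) of Eq. (4) that minimizes
    the residual sum of squares; returns (feature, threshold, rss)."""
    best_k, best_p, best_rss = None, None, np.inf
    for k in range(X.shape[1]):              # candidate splitting feature
        for p in np.unique(X[:, k]):         # candidate split point
            left = X[:, k] < p               # region R1(k, p), Eq. (3)
            right = ~left                    # region R2(k, p)
            if not left.any() or not right.any():
                continue
            c1, c2 = y[left].mean(), y[right].mean()   # Eq. (5)
            rss = ((y[left] - c1) ** 2).sum() + ((y[right] - c2) ** 2).sum()
            if rss < best_rss:               # keep the best split so far
                best_k, best_p, best_rss = k, p, rss
    return best_k, best_p, best_rss
```

Production CART implementations accelerate this scan by sorting each feature once, but the underlying logic is the same.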

Last, N regression trees \(\left\{ {T_{v} } \right\}_{1}^{N}\) can be constructed. The final response of the prediction model at a new input x is obtained by averaging the predicted values of all the N regression trees.

$$ f_{RF}^{{}} (x) = \frac{1}{N}\sum\limits_{v = 1}^{N} {T_{v} (x)} $$
(6)

Hyper-parametric tuning

The model's hyper-parameters significantly affect its accuracy, robustness and generalization capability. Three hyper-parameters are crucial in the random forest algorithm: the number of trees (N), the maximum depth of a tree (D), and the number of randomly selected features (M). N is closely related to the computational cost, so a reasonable value must be sought to trade off predictive performance against computation time; once increasing the number of trees no longer improves the prediction results significantly, the value can be considered acceptable. D largely decides the generalization capability of the model, and oversized regression trees lead to serious overfitting on new datasets. M serves the feature bagging process and determines the strength of the variable selection; for most regression problems, M is set to the dimension of the input feature vector (Geurts et al., 2006). Compared with conventional machine learning techniques, the hyper-parameters of RF are more intuitive and easier to optimize.

The grid searching method is implemented to find the optimal values of these hyper-parameters: it executes the subprocess for all combinations of the selected parameter values and then delivers the optimal combination (Bardak et al., 2021). Cross-validation is integrated to prevent over-fitting and to evaluate the model's performance on unseen data. In this paper, fivefold cross-validation is carried out: the training dataset is divided into five equal subsets, each subset is successively regarded as the validation dataset, and the remaining four subsets are taken as the training dataset.
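A minimal sketch of this tuning procedure with scikit-learn follows; the synthetic dataset and the candidate grids are illustrative assumptions, not the settings used in the paper.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the temperature/thermal-error data.
X, y = make_regression(n_samples=300, n_features=8, noise=0.1, random_state=0)

# Candidate values for the three hyper-parameters; grids are illustrative.
param_grid = {
    "n_estimators": [20, 40, 60, 80],    # N: number of trees
    "max_depth": [4, 8, 12, None],       # D: maximum tree depth
    "max_features": [1.0, 0.5, "sqrt"],  # M: features considered per split
}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=5,                                # fivefold cross-validation
    scoring="neg_mean_squared_error",
)
search.fit(X, y)
print(search.best_params_)               # optimal hyper-parameter combination
```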

Selection of key temperature points

For thermal error modeling, M temperature sensors are placed at different positions on the machine tool, and the measured temperatures are taken as the input features. A displacement sensor measures the thermal error, which is taken as the output variable; the dimension of the dataset is thus M + 1, and its size depends on the sampling period and the total sampling time. Too many temperature features have a negative influence on the model's accuracy and robustness, because each temperature sensor may introduce noise and some temperatures are highly correlated with each other. Additionally, the measurement cost of experiments and the computational cost of modeling need to be considered in actual engineering applications. Thus, it is of great significance to select the key temperature points to improve the performance and practicability of the model.

Evaluation of temperature feature importance

Compared with conventional machine learning algorithms, RF enables assessment of the relative importance of the input features, which supports dimensionality reduction and improves the model's performance on high-dimensional problems. Since the dataset of each regression tree is generated by sampling with replacement (bootstrapping), some observations are repeated and others are not selected. On average, approximately one third of the samples in the dataset are not used in constructing a given regression tree; they are called the OOB samples of that tree, and with them RF can natively perform an unbiased estimation of the generalization error without an external dataset. The OOB samples thus act as a testing dataset for each tree and can be used to evaluate feature importance by randomly permuting the values of a certain feature and measuring the decrease in prediction accuracy on the OOB samples. The greater the decrease, the more important the feature. Not only the individual influence of each feature but also the interactive effect of multiple features on the response is thereby considered.

Suppose there are bootstrap samples v = 1, 2, …, N. The importance factor of each input feature can be computed in the following five steps (a code sketch follows the list).

(1) Starting from v = 1, construct a regression tree \(T_{v}\) from a bootstrap sample and denote its OOB data as \(L_{v}^{oob}\).

(2) Make predictions on \(L_{v}^{oob}\) with the constructed regression tree and calculate the prediction error \(errOOB_{v}\).

(3) For a certain input feature k (k = 1, 2, …, M), randomly shuffle the values of the feature in \(L_{v}^{oob}\) and denote the resultant data as \(L_{vk}^{oob}\). Similarly, make predictions on \(L_{vk}^{oob}\) with the constructed regression tree and calculate the prediction error \(errOOB_{vk}\).

(4) Repeat steps (1)–(3) for v = 2, …, N.

(5) The importance factor of feature k is evaluated as

$$ F(k) = \frac{1}{N}\sum\limits_{v = 1}^{N} {(errOOB_{vk} - errOOB_{v} )} $$
(7)
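The sketch below implements steps (1)–(5) directly with scikit-learn decision trees; the function name `oob_importance` and the use of the MSE as the OOB error are assumptions made for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def oob_importance(X, y, n_trees=60, rng=None):
    """OOB permutation importance of Eq. (7): mean increase in OOB error
    after shuffling feature k, averaged over all trees."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    scores = np.zeros(d)
    for _ in range(n_trees):
        idx = rng.integers(0, n, n)                 # bootstrap sample
        oob = np.setdiff1d(np.arange(n), idx)       # OOB indices of this tree
        if oob.size == 0:
            continue
        tree = DecisionTreeRegressor().fit(X[idx], y[idx])
        err = np.mean((tree.predict(X[oob]) - y[oob]) ** 2)    # errOOB_v
        for k in range(d):
            Xp = X[oob].copy()
            Xp[:, k] = rng.permutation(Xp[:, k])    # shuffle feature k
            err_k = np.mean((tree.predict(Xp) - y[oob]) ** 2)  # errOOB_vk
            scores[k] += err_k - err                # contribution to F(k)
    return scores / n_trees
```

A large positive score means that shuffling the feature noticeably degrades the OOB predictions, i.e. the feature is important.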

Selection of key temperature points based on iterative elimination

To find the optimal locations and number of temperature sensors, a method of selecting key temperature points based on iterative elimination is presented. First, rank the temperature features by their importance factors from the proposed model and record the model's prediction accuracy. Then, eliminate the least important feature and repeat the above steps. The iterations are performed M − 1 times, until only one feature is left. Last, compare the model's prediction accuracy under the different feature combinations and select the combination with the highest prediction accuracy.

Furthermore, to ensure the reasonability and reliability of the results, fivefold cross-validation is applied: the validation process, together with the corresponding feature ranking, is performed five times at each iteration. At the ith (i = 1, 2,…, 5) validation, the prediction accuracy on the validation dataset is denoted as \(A_{cur} [i]\). The feature ranking of the validation with the highest prediction accuracy \(A_{curMax}\) is taken as the basis for feature elimination at the current iteration, and the mean prediction accuracy \(A_{curMean}\) over the five validations is deemed the prediction accuracy of the current iteration. The highest accuracy \(A_{allMax}\) among the M − 1 iterations is regarded as the final model accuracy, and its corresponding temperature feature combination is taken as the set of key temperature points, denoted as keyFC. Figure 2 and Algorithm 1 depict the procedure of selecting the key temperature points based on iterative elimination; a simplified code sketch follows Algorithm 1.

Fig. 2
figure 2

Algorithm flowchart of key temperature points selection based on iterative elimination

figure a
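The following is a simplified sketch of Algorithm 1. For brevity it ranks features with scikit-learn's built-in impurity importance and a plain fivefold cross-validation score, whereas the paper uses the OOB permutation importance of Eq. (7) and the \(A_{curMax}\)-based ranking; the function and variable names are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

def select_key_points(X, y, names):
    """Iterative elimination: drop the least important feature until one
    remains; return the feature combination with the best CV score."""
    feats = list(range(X.shape[1]))
    best_score, best_feats = -np.inf, feats[:]
    while len(feats) > 1:                        # M - 1 iterations in total
        rf = RandomForestRegressor(n_estimators=60, random_state=0)
        score = cross_val_score(rf, X[:, feats], y, cv=5,
                                scoring="neg_mean_squared_error").mean()
        if score > best_score:                   # record the best combination
            best_score, best_feats = score, feats[:]
        rf.fit(X[:, feats], y)
        feats.pop(int(np.argmin(rf.feature_importances_)))  # drop weakest
    return [names[i] for i in best_feats]

# Example usage with the synthetic data from the tuning sketch:
# print(select_key_points(X, y, [f"T{i+1}" for i in range(X.shape[1])]))
```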

Thermal error modeling considering hysteresis effect

Hysteresis effect

Previous studies (Xiang et al., 2018; Yang & Ni, 2003) provide sufficient evidence of a hysteresis phenomenon between temperature and thermal deformation, which makes conventional modeling approaches based on a static premise less accurate and robust owing to the absence of a time variable in the modeling process. The temperature usually lags behind the thermal error when the rate of temperature change is lower than the response speed of the thermal deformation and when the sensors mounted on the surface do not reflect the real internal temperatures; the dynamic response of the thermal displacement to each surface temperature sensor has a different time constant. To demonstrate the hysteresis effect with a sample dataset, Fig. 3a illustrates the measured temperatures and spindle thermal error over a cycle of two hours of heating followed by three hours of cooling. The temperature variations at three different positions on the spindle are the input features, denoted ΔT1, ΔT2 and ΔT3; the thermal error induced by the spindle thermal elongation is the output variable, denoted ΔL. The sampling period is 1 min, so the dataset contains 300 data samples. A certain time delay between temperature and thermal error can be noticed, and the interval is not identical for the different temperature features.

Fig. 3
figure 3

Measurement results when being heated and cooled: a Thermal error and temperature variations varying with time, b Thermal error varying with temperature variations

Furthermore, Fig. 3b shows the temperature–deformation relationship during heating and cooling, which more intuitively indicates the hysteresis of the temperatures relative to the deformation as well as the linear correlation between these variables. Hence, taking the hysteresis effect of the thermal system into account yields more precise predictions.

Determination of time lag based on permutation test

Existing work formulates the time lag with physics-based methods or requires an extra test of the basic thermal characteristics. To develop a more concise and efficient data-driven method, this paper determines the time lag with a permutation test, incorporating the time variable into the thermal error modeling in the following steps:

(1) Rearranging the original time series of the target feature with a certain time lag, while keeping the remaining features unchanged, is called a permutation test.

(2) The time series of the target feature is rearranged with a different time lag in each permutation test.

(3) The relative importance of the differently lagged time series is evaluated with the proposed random forest model.

(4) The optimal time lag is determined by comparing the relative importance factors.

(5) The above steps are repeated until the time lags of all the temperature features are obtained.

Generally, the sampling period \(SampleT\) in the actual engineering application is taken as the resolution of the time lag in the permutation test. Denote the number of permutation tests by \(NumPT\) and the time lag by \(LagT\). Figure 4 and Algorithm 2 describe the procedure for determining the time lag based on the permutation test; a simplified code sketch follows Algorithm 2.

Fig. 4
figure 4

Algorithm flowchart of determining time lag based on permutation test

figure b
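Below is a simplified sketch of Algorithm 2. The direction of the shift (advancing the temperature series so that y(t) is paired with the later reading T(t + lag), since the surface temperature lags the deformation) and the tail padding are illustrative assumptions; `importance_fn` can be, e.g., the `oob_importance` function sketched in Sect. 3.

```python
import numpy as np

def advance_series(x, lag):
    """Advance series x by `lag` samples (multiples of SampleT) so that
    y(t) is paired with x(t + lag); the tail is padded with the last value."""
    return np.concatenate([x[lag:], np.full(lag, x[-1])]) if lag else x.copy()

def best_lag(X, y, k, max_lag, importance_fn):
    """One permutation test per candidate lag for feature k; return the
    lag whose rearranged series is most important (steps (1)-(4))."""
    scores = []
    for lag in range(max_lag + 1):       # NumPT = max_lag + 1 tests
        Xs = X.copy()
        Xs[:, k] = advance_series(X[:, k], lag)   # rearrange feature k only
        scores.append(importance_fn(Xs, y)[k])
    return int(np.argmax(scores))        # optimal LagT for feature k
```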

Experiment

Experimental setup

As the core component of a machine tool, the spindle generates a large amount of heat during machining, which is the main source affecting machine precision (Li et al., 2015). In this paper, a spindle thermal error experiment is conducted on a three-axis vertical machining center with a mechanical spindle, using the spindle error analyzer manufactured by Lion Precision (USA), as shown in Fig. 5. The tool holder of the machine tool clamps the high-precision standard balls. A capacitive displacement sensor with an accuracy of 0.1 μm is installed at the bottom of the probe nest to measure the Z-direction displacement of the front standard ball, namely the spindle thermal error induced by thermal elongation. The temperature sensor chip is the Tsic506F from IST (Switzerland), with an accuracy of 0.1 °C. Eight temperature sensors are originally used to measure the temperature data; their numbering and placements are listed in Table 1 and shown in Fig. 5.

Fig. 5
figure 5

Experimental setup for the thermal error measurement using the spindle error analyzer

Table 1 Sensors and locations

Results and discussion

Key temperature points

In the test, the spindle rotates without load at a speed of 6000 r/min for 2 h and then remains stopped for 3 h. The difference between each measured temperature and the ambient temperature is taken as a model input, denoted ΔTi, where i is the number of the temperature sensor. The spindle length is denoted L, and the thermal error induced by the spindle thermal elongation, denoted ΔL, is taken as the model output. The sampling period for temperature and thermal error is 1 min, yielding a nine-dimensional dataset (eight temperature features plus the thermal error) of 300 data samples, as shown in Fig. 6.

Fig. 6
figure 6

Measured data of the temperatures and thermal error

In this research, a random forest is constructed using 60 regression trees. 75% of the measured data are randomly selected as the training dataset, and the remainder is taken as the testing dataset. The RF algorithm is implemented with scikit-learn in Python. Through the presented method of selecting key temperature points based on iterative elimination, the importance factors of the features in each iteration are recorded in Fig. 7, and the corresponding prediction accuracy, measured by the mean square error (MSE), is depicted in Fig. 8.
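A minimal sketch of this workflow is shown below, with a synthetic stand-in for the measured 300 × 8 dataset, which is not reproduced here.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 300-sample, 8-feature temperature dataset.
X, y = make_regression(n_samples=300, n_features=8, noise=0.1, random_state=1)

# 75% of the data for training, the remainder for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.75, random_state=1)

rf = RandomForestRegressor(n_estimators=60, random_state=1)  # 60 trees
rf.fit(X_train, y_train)
print("test MSE:", mean_squared_error(y_test, rf.predict(X_test)))
```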

Fig. 7
figure 7

Importance factors of the temperature features during iteration process

Fig. 8
figure 8

Prediction accuracy varying with the feature combination

According to the feature importance ranking, the least important feature is iteratively eliminated until only one feature is left. As Fig. 8 shows, the prediction accuracy increases continuously as redundant features and measurement noise are eliminated, until three features remain; the accuracy then decreases because significant features are eliminated and useful information is lost. Comparing the model's prediction accuracy under the different groups of temperature features, the combination with the highest accuracy, i.e. T2, T4 and T5, is finally selected as the key temperature points. Herein, the dimension and size of the dataset are 4 and 300, respectively. It can be concluded that the presented method identifies and eliminates the redundant features and improves the model performance.

Time lag considering hysteresis effect

Through the presented method of determining the time lag, a permutation test is conducted on each feature with different time lags, which are integer multiples of the sampling period (1 min in this case). The relative importance of each feature at the different time lags is then obtained, as exhibited in Fig. 9.

Fig. 9
figure 9

Relative importance of the temperature feature with different time lags: a T2, b T4, c T5

It can be seen that the temperature T2 of the rear bearings, the temperature T4 of the front bearings and the temperature T5 of the headstock lag behind the thermal deformation by 2 min, 3 min and 4 min, respectively. This can reasonably be attributed to the distances of the machine components from the heat sources and to the specific placements of the sensors attached to them, as shown in Fig. 5. T2 and T4 are closer to the main heat sources, the rear and front bearings, which generate heat through friction while the spindle rotates, whereas the sensor for T5 is placed relatively farther from the internal heat sources of the spindle.

Model prediction and comparison

Considering the random factors in the experimental process, the test was repeated four times, yielding four datasets with 1200 data samples in total, as shown in Fig. 10.

Fig. 10
figure 10

Measured datasets of the temperatures and thermal error

Three of the measured datasets are randomly selected as the training dataset (e.g. Fig. 10a–c) and the remaining one (e.g. Fig. 10d) is taken as the testing dataset to establish the proposed RF model. A regression tree in the proposed model can be partially visualized, as shown in Fig. 11.

Fig. 11
figure 11

Partial visualization of a regression tree in the random forest

Compared with conventional machine learning techniques such as BPNN and SVM, both typical and extensively used approaches for thermal error modeling, the proposed model is no longer a black box: it can be visualized, any given prediction path is observable, and the decision at every node can be explained by simple Boolean logic. Thus, the proposed model has better interpretability.
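For instance, one tree of the forest can be rendered as nested if/else rules with scikit-learn; the snippet below assumes a forest refit on the three key temperature points (`X3`, `y` and the feature names are illustrative placeholders for the measured data).

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import export_text

# X3: the three key temperature columns (T2, T4, T5); y: thermal error.
rf3 = RandomForestRegressor(n_estimators=60, random_state=1).fit(X3, y)

# Print the first levels of the first tree; deeper levels are truncated.
print(export_text(rf3.estimators_[0],
                  feature_names=["dT2", "dT4", "dT5"], max_depth=3))
```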

To further compare model performance, the thermal error is also modeled using BPNN and SVM. For BPNN, by varying the number of hidden layers, the number of neurons per hidden layer and the connection weights among the neurons, and comparing the resulting performance (Bardak et al., 2016), the selected architecture comprises one input layer with three neurons, one hidden layer with ten neurons and one output layer with one neuron. For SVM, the Gaussian radial basis function kernel, one of the most effective kernel functions for thermal error modeling (Miao et al., 2013), is selected. The BPNN and SVM models are constructed using MATLAB toolboxes, which provide powerful tools for efficient modeling. The predicted values of BPNN, SVM and RF together with the observed values are exhibited in Fig. 12. The accuracy of the three models is measured by the coefficient of determination R2, the mean absolute error (MAE) and the MSE, as listed in Table 2.
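The three metrics can be computed uniformly for each model with scikit-learn, for example as follows (the helper name `report` is illustrative).

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def report(y_true, y_pred, label):
    """Print the Table 2 metrics (R2, MAE, MSE) for one model."""
    print(f"{label}: R2 = {r2_score(y_true, y_pred):.3f}, "
          f"MAE = {mean_absolute_error(y_true, y_pred):.3f}, "
          f"MSE = {mean_squared_error(y_true, y_pred):.3f}")

# Example usage with the RF model fitted earlier:
# report(y_test, rf.predict(X_test), "RF")
```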

Fig. 12
figure 12

Comparison of observed errors and predicted errors using three methods: a BPNN, b SVM, c RF

Table 2 Comparison of model accuracy using BPNN, SVM and RF

From Table 2, the higher R2 of RF indicates that a higher percentage of the response variable variation is explained by the model, i.e. the proposed RF model fits the data better. Furthermore, the smaller MAE and MSE of RF demonstrate the higher prediction accuracy of the proposed model.

As noted in Sect. 2, RF grows its trees from bootstrap samples and random subsets of the input features and aggregates their predictions, which suppresses the influence of noise in the dataset and reduces the overall variance of the model, yielding stronger robustness. To verify this robustness, another test with varying spindle speeds is conducted to examine whether the proposed model maintains the expected accuracy when the operating conditions change. The spindle speed spectrum and the model predictions of BPNN, SVM and RF are illustrated in Fig. 13.

Fig. 13
figure 13

Robustness experiment with the varying spindle speeds: a Spindle speed spectrum, b Model prediction using BPNN, c Model prediction using SVM, d Model prediction using RF

From Table 3, it can be seen that even under varying operating conditions, the proposed RF model still fits the data better and maintains higher prediction accuracy than BPNN and SVM: the maximum residual error is less than 3 μm, compared with 6 μm for BPNN and 5 μm for SVM. Thus, the proposed RF model is further demonstrated to be more robust.

Table 3 Comparison of model accuracy using BPNN, SVM and RF

Conclusions

To further improve thermal error modeling accuracy and robustness, this paper presents a novel thermal error modeling method based on random forest that requires less training data, enables faster and more intuitive parameter tuning, achieves higher prediction accuracy, and has stronger robustness. The following conclusions can be drawn:

1. Unlike the existing error prediction models, the proposed model itself can evaluate feature importance, since the OOB data generated during the modeling process can be used to perform an unbiased estimation of the feature importance factors.

2. The method of selecting key temperature points based on iterative elimination effectively eliminates redundant features and measurement noise, improving the prediction accuracy and reducing the measurement and computational cost.

3. Considering the hysteresis effect in the thermal system, the method of determining the time lag based on the permutation test further improves the model performance.

4. Compared with conventional machine learning methods such as BPNN and SVM, the proposed RF model is simple to interpret, achieves higher prediction accuracy, and has stronger robustness, while requiring less training data and offering faster parameter tuning.

A thermal error experiment was conducted on a machine tool to validate the accuracy and robustness of the proposed model, which consistently achieves a prediction accuracy of over 90% even when the operating conditions change.

The proposed model can also be applied to other machine tool errors, such as geometric errors and cutting-tool wear-induced errors. Furthermore, these RF models can be integrated into an error compensation software system to adaptively improve machine precision based on big data from the production line, contributing to intelligent manufacturing. These potential directions will be explored in future research.