1 Introduction

The accuracy and reliability of CNC machine tools are key indicators of a country’s manufacturing level. The motorized spindle is a key component of high-speed, high-precision CNC machine tools, and its accuracy directly affects the machining accuracy [1]. By integrating the machine spindle with the spindle motor, the motorized spindle eliminates the errors introduced by transmission mechanisms such as screws and gears, and thus improves the accuracy of CNC machine tools [2]. However, because of its compact structure and numerous internal heat sources, the motorized spindle generates a large amount of heat during operation, which causes thermal deformation and reduces the accuracy of CNC machine tools. Thermal error accounts for 40%-70% of the total error of CNC machine tools [3]. Therefore, reducing the thermal error of the motorized spindle can effectively improve the accuracy of CNC machine tools.

There are three main methods to reduce the thermal error of motorized spindles: thermal error prevention, thermal error control and thermal error compensation [4]. Thermal error prevention reduces the thermal error by using materials with a small thermal expansion coefficient. Thermal error control reduces the thermal error by improving the cooling system. Thermal error compensation first determines the thermal error and then compensates or corrects it through the CNC controller; it is widely used because of its low cost. Thermal error compensation can be divided into direct compensation and indirect compensation. Direct compensation obtains the thermal error by directly measuring the machining error, but the measurement process is tedious and cannot achieve real-time compensation. Indirect compensation predicts and compensates the thermal error in real time by building a thermal error prediction model. Therefore, indirect thermal error compensation has become one of the most widely used compensation methods.

The performance of indirect thermal error compensation depends mainly on the accuracy of the thermal error prediction model. Prediction models are mainly divided into numerical simulation models and experimental data models [5]. A numerical simulation model simulates the thermal error from the main parameters of the motorized spindle; the main methods are the finite element method (FEM) [6, 7], the finite difference method (FDM) [8] and the thermal resistance network [9, 10]. Such a model requires an accurate three-dimensional model of the motorized spindle and detailed material properties, so the process is complex and time-consuming, and the prediction accuracy is not high. Therefore, the experimental data model has received a lot of attention. It obtains temperature and thermal error data through a thermal error experiment and establishes the thermal error prediction model using machine learning. The experimental data model usually has high prediction accuracy and can compensate the motorized spindle in real time. Establishing an experimental data model includes selecting the temperature sensitive points and establishing the thermal error prediction model.

Because the best position for temperature measurement is not known in advance, multiple temperature measuring points are used in the thermal error experiment. However, collinearity between these temperature measuring points reduces the accuracy of the thermal error prediction model. Therefore, it is necessary to select the best temperature measuring points, which are called temperature sensitive points [11]. The selection of temperature sensitive points mainly involves cluster analysis, correlation analysis and principal component analysis. Yang et al. [12] used a combination of fuzzy clustering and correlation analysis to select temperature sensitive points. Tsai et al. [13] used principal component regression to select temperature sensitive points, so as to eliminate the collinearity between temperature measuring points. Li et al. [14] proposed a temperature sensitive point selection method based on an improved binary grasshopper optimization algorithm (IBGOA). Li et al. [15] selected temperature sensitive points through fuzzy clustering and grey relational analysis. Current methods cannot determine the optimal cluster number when clustering temperature measuring points, so the cluster number is usually set manually. As a result, the clustering results have strong randomness and have a great impact on the prediction accuracy of the thermal error prediction model of the motorized spindle.

Thermal error prediction models of the motorized spindle mainly include regression algorithm models and neural network models. The regression algorithm model is suitable for small sample data. Fu et al. [16] applied the multiple linear regression model to establish the axial thermal error model of the motorized spindle. Zhou et al. [17] combined particle swarm optimization (PSO) with the simulated annealing algorithm (SA) to optimize the hyperparameters of the support vector machine (SVM), and established a high-precision prediction model for the thermal error of the motorized spindle. Li et al. [18] established a prediction model of the thermal error of the motorized spindle based on the least squares support vector machine (LSSVM) optimized by the Aquila Optimizer (AO). The neural network model has high prediction accuracy, but needs a lot of data for model training. Liu et al. [19] introduced the BP neural network to predict the thermal error of a five-axis machining center. Kosarac et al. [20] proposed a prediction method for the thermal error of the motorized spindle based on the BP neural network combined with the Adam optimization algorithm. Cheng et al. [21] proposed a thermal error prediction model combining long short term memory (LSTM) and a convolutional neural network (CNN). To sum up, thermal error prediction models of the motorized spindle mainly combine an optimization algorithm with a prediction algorithm. The optimization boundary of the optimization algorithm is mostly determined manually, and the selection of the boundary has a great impact on the prediction accuracy. At present, there is no method to select the boundary automatically.

In order to improve the prediction accuracy of the thermal error prediction model of the motorized spindle, this paper proposes a thermal error prediction model based on adaptive boundary Harris hawks optimization (ABHHO) and the least squares support vector machine (LSSVM). To address the shortcomings of the existing temperature sensitive point selection methods, a multi speed temperature sensitive point selection method based on clustering by fast search and find of density peaks (CFSFDP) is proposed. The framework of this paper is as follows. In Sect. 1, the existing research methods and their shortcomings are summarized. In Sect. 2, the thermal error experiment is carried out and the experimental data are analyzed. In Sect. 3, a multi speed temperature sensitive point selection method based on CFSFDP is proposed, which can determine the optimal number of temperature sensitive points. In Sect. 4, a thermal error prediction model of the motorized spindle based on adaptive boundary Harris hawks optimization and the least squares support vector machine (ABHHO-LSSVM) is established. In Sect. 5, the effectiveness of the proposed model is verified by comparison with traditional thermal error prediction models. Finally, the conclusions are given in Sect. 6.

2 Thermal error experiment of the motorized spindle

2.1 Experiment setup

The thermal error experiment is the first step in establishing a thermal error prediction model. It provides the temperature and thermal error data of the motorized spindle at different speeds. The specific experimental scheme is as follows:

Fig. 1 Sensor arrangement

The thermal error experiment scheme is shown in Fig. 1. Twelve temperature sensors were mounted on the motorized spindle to collect temperature data. T1-T3 detect the temperature of the front end and front bearing of the motorized spindle. T4-T7 detect the temperature of the motor inside the motorized spindle. T8 and T9 detect the temperature of the rear bearing of the motorized spindle. T10 and T11 detect the temperature of the rear of the motorized spindle. The ambient temperature is detected by T12. The layout of the temperature sensors is shown in Table 1. The temperature sensors are sampled once a minute. The displacement sensors are fixed on the workbench through a bracket. The three displacement sensors measure the axial, horizontal and vertical displacement changes of the standard ball at the end of the motorized spindle. The sampling frequency of the displacement sensors is the same as that of the temperature sensors.

Table 1 Layout position of temperature sensor

The CWY-DO-502 eddy current displacement sensor is selected as the displacement sensor. Its advantages include a wide frequency response, a wide linear measurement range, small volume, strong anti-interference ability and convenient installation. The PT100 magnetic temperature sensor is selected as the temperature sensor; it has an accuracy of 0.1 °C and a measuring range of -60 °C to 120 °C. It attaches magnetically to the motorized spindle body during sampling.

The rated speed of the motorized spindle used in the experiment is 4000 r/min. The thermal error experiment with four different speeds was carried out, and the speeds were 1000 r/min, 2000 r/min, 3000 r/min and 4000 r/min. The sampling frequency of temperature and thermal error is once a minute. Each speed was measured for three hours, and 180 sample points were collected.

Fig. 2 Temperature data

2.2 Analysis of experimental results

The experimental data are shown in Figs. 2 and 3. As shown in Fig. 2, the temperature of the motorized spindle increases with running time, and the steady-state temperature increases with speed. The temperature change differs between the parts of the motorized spindle. Sensors T1-T3 are located at the front bearing of the motorized spindle, which heats up severely, so their temperature is the highest. Sensors T8-T11 are located at the rear bearing and rear of the motorized spindle, which generate less heat, so their temperature is low. Sensors T4-T7 are located at the built-in motor of the motorized spindle, and their temperature fluctuates continuously because of the cooling water.

Fig. 3 Thermal error data

Thermal errors of the motorized spindle mainly occur in the axial direction (Z-axis), while the thermal errors in the two radial directions (X and Y axes) are small. Therefore, this paper focuses on the modeling and analysis of the axial thermal error of the motorized spindle, and all subsequent thermal errors refer to the axial thermal error. Figure 3 shows the thermal error data of the spindle at four speeds. Since the water cooling device is turned on in the experiment, the measured thermal error is small. However, the small magnitude of the experimental data does not affect the verification of the subsequent prediction model.

3 Multi speed temperature sensitive point selection method based on CFSFDP

3.1 Clustering by fast search and find of density peaks(CFSFDP)

The CFSFDP [22] is a clustering analysis method published by Rodriguez and Laio in Science in 2014. It can quickly find the density peaks of a dataset of arbitrary shape, and thus obtain the optimal cluster number of the dataset. The CFSFDP assumes that a cluster center has the following two characteristics at the same time:

  (1) Its density is large, and it is surrounded by points whose density is smaller than it.

  (2) It is more distant from other data points with higher density.

For each data point i, the CFSFDP calculates two quantities: its local density \(\rho _{i}\) and its distance \(\delta _{i}\) from points of higher density. The data set is represented by a decision graph (the abscissa is the density \(\rho \) and the ordinate is the distance \(\delta \)). The sample points whose \(\rho _{i}\) and \(\delta _{i}\) are both clearly greater than those of the other points are selected as the cluster center points.

The local density \(\rho _{i}\) is calculated as follows:

$$\begin{aligned} \rho _{i} = \sum _{j} e^{-\left( \frac{d_{ij}}{d_{c}}\right) ^2} \end{aligned}$$
(1)

Where \(\rho _{i}\) is the local density of data point i, \(d_{ij}\) is the distance from sample point i to sample point j and \(d_{c}\) is the cutoff distance, which needs to be specified manually.

The distance \(\delta _{i}\) is calculated as follows:

$$\begin{aligned} \delta _{i} = \min _{j:\rho _{j} > \rho _{i}}(d_{ij}) \end{aligned}$$
(2)

For the point with highest density, we conventionally take \(\delta _{i} = \max (d_{ij})\).

In order to consider both \(\rho \) and \(\delta \), a new variable \(\gamma _{i}\) is defined.

$$\begin{aligned} \gamma _{i} = \rho _{i} \cdot \delta _{i} \end{aligned}$$
(3)

The larger \(\gamma _{i}\) is, the more likely point i is to be a cluster center.
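As a minimal sketch of Eqs. (1)-(3), the following Python function computes \(\rho _{i}\), \(\delta _{i}\) and \(\gamma _{i}\) for a small data set. The function name `density_peaks`, the use of the Euclidean distance and the exclusion of the self term from the sum in Eq. (1) are assumptions made for illustration; they are not taken from the original CFSFDP implementation.

```python
import numpy as np

def density_peaks(points, d_c):
    """Return (rho, delta, gamma) for each row of `points`, per Eqs. (1)-(3)."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances d_ij (assumed distance measure).
    diff = points[:, None, :] - points[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2))
    # Eq. (1): Gaussian-kernel local density. The j = i term is removed here,
    # which only shifts every rho_i by the same constant 1.
    rho = np.exp(-(d / d_c) ** 2).sum(axis=1) - 1.0
    # Eq. (2): distance to the nearest point of higher density; the densest
    # point takes its largest distance to any other point instead.
    n = len(points)
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        delta[i] = d[i, higher].min() if higher.any() else d[i].max()
    # Eq. (3): combined score used to rank candidate cluster centers.
    gamma = rho * delta
    return rho, delta, gamma
```

When the method is applied to temperature measuring points, each row of `points` would hold the temperature series of one sensor. The points whose \(\gamma \) values stand clearly apart from the rest are the candidate cluster centers, and their count gives the cluster number.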

3.2 Fuzzy c-means clustering (FCM) and grey relational analysis (GRA)

The combination of fuzzy c-means clustering (FCM) and grey relational analysis (GRA) is a common method for selecting temperature sensitive points [23]. The temperature measuring points are clustered into different clusters by FCM, and then the correlation coefficient between each temperature measuring point and thermal error is calculated by GRA. The point with the highest correlation coefficient in each cluster is selected as the temperature sensitive point.

FCM obtains the clustering result by minimizing the objective function \(J_{m}\) using Eqs. (4)-(6):

$$\begin{aligned} J_{m}= & {} \sum _{i=1}^{N} \sum _{j=1}^{C} u_{ij}^m \Vert x_{i} - c_{j} \Vert ^2, 1 \le m < \infty \end{aligned}$$
(4)
$$\begin{aligned} u_{ij}= & {} \frac{1}{\sum _{k=1}^{C} (\frac{\Vert x_{i} - c_{j} \Vert }{\Vert x_{i} - c_{k} \Vert })^\frac{2}{m-1}}\end{aligned}$$
(5)
$$\begin{aligned} c_{j}= & {} \frac{\sum _{i=1}^{N} u_{ij}^m \cdot x_{i}}{\sum _{i=1}^{N} u_{ij}^m} \end{aligned}$$
(6)

Where m is the fuzzy coefficient, which must exceed 1 for Eq. (5) to be defined and is generally chosen in [1.5, 2.5], N is the number of samples, C is the number of cluster centers, \(c_{j}\) is the jth cluster center, \(x_{i}\) is the ith sample, and \(u_{ij}\) represents the degree of membership of \(x_{i}\) to \(c_{j}\). The iteration termination condition is as follows:

$$\begin{aligned} \max _{ij} (|u_{ij}^{(t+1)} - u_{ij}^{(t)} |) < \varepsilon \end{aligned}$$
(7)

Where t is the number of iteration steps, and \(\varepsilon \) is a small constant representing the error threshold. Once the cluster number C is set, the corresponding clustering results can be obtained.
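The alternating updates of Eqs. (5) and (6) with the stopping rule of Eq. (7) can be sketched as follows. This is a minimal illustration, assuming random initial memberships and the Euclidean norm; the function name `fcm` and its default arguments are hypothetical and do not come from the paper.

```python
import numpy as np

def fcm(x, n_clusters, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Fuzzy c-means following Eqs. (4)-(7); returns (centers, memberships)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalised to sum to 1 (assumption).
    u = rng.random((len(x), n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        um = u ** m
        # Eq. (6): membership-weighted cluster centers.
        centers = (um.T @ x) / um.sum(axis=0)[:, None]
        # Distances ||x_i - c_j||, floored to avoid division by zero.
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)
        # Eq. (5): u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1)).
        ratio = (dist[:, :, None] / dist[:, None, :]) ** (2.0 / (m - 1.0))
        u_new = 1.0 / ratio.sum(axis=2)
        # Eq. (7): stop when the largest membership change is below eps.
        converged = np.abs(u_new - u).max() < eps
        u = u_new
        if converged:
            break
    return centers, u
```

A hard assignment can be read off with `u.argmax(axis=1)`. The cluster number `n_clusters` still has to be supplied by the caller.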

Grey relational analysis (GRA) is a method for quantitatively describing and comparing the development and change of a system. In GRA, the reference data and the comparative data should be determined first. In this paper, the reference data is the thermal error of the motorized spindle, and the comparison data is the temperature measurement point data of the motorized spindle. The data are then made dimensionless; the most common methods are initializing and averaging.

  (1) Initializing: \(x_{i}(k) = \frac{x_{i}(k)}{x_{i}(1)}, k = 1, 2, \ldots , n; i = 1, 2, \ldots , m\)

  (2) Averaging: \(x_{i}(k) = \frac{x_{i}(k)}{\overline{x}_{i}}, k = 1, 2, \ldots , n; i = 1, 2, \ldots , m\), where \(\overline{x}_{i}\) is the mean of sequence i.

The correlation coefficient formula of reference data and comparison data is as follows:

$$\begin{aligned} \zeta _{i}(k) = \frac{\min \limits _{i}\min \limits _{k} |y(k) - x_{i}(k) |+ \mu \max \limits _{i}\max \limits _{k} |y(k) - x_{i}(k) |}{|y(k) - x_{i}(k) |+ \mu \max \limits _{i}\max \limits _{k} |y(k) - x_{i}(k) |} \end{aligned}$$
(8)

Where \(\mu \) is the resolution coefficient, which is usually taken as \(\mu =0.5\).
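A minimal sketch of Eq. (8) is given below. It assumes the averaging normalisation, with every sequence divided by its own mean, and it aggregates the coefficients \(\zeta _{i}(k)\) over k by a plain mean to obtain one grade per temperature measuring point. The aggregation step and the function name `grey_relational_grade` are assumptions, since the text does not state how the coefficients are combined.

```python
import numpy as np

def grey_relational_grade(y, x, mu=0.5):
    """Grey relational grade of each row of `x` against the reference `y`."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Averaging normalisation: divide every sequence by its own mean.
    y_n = y / y.mean()
    x_n = x / x.mean(axis=1, keepdims=True)
    # Absolute differences |y(k) - x_i(k)| and their global extremes.
    diff = np.abs(y_n[None, :] - x_n)
    d_min, d_max = diff.min(), diff.max()
    # Eq. (8): grey relational coefficient at every step k.
    zeta = (d_min + mu * d_max) / (diff + mu * d_max)
    # Grade: mean coefficient over k (assumed aggregation).
    return zeta.mean(axis=1)
```

Here `y` would be the thermal error series and each row of `x` the series of one temperature measuring point. A grade close to 1 means the normalised temperature series follows the thermal error closely.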

3.3 Multi speed temperature sensitive point selection

According to the temperature and thermal error data of the motorized spindle collected from the thermal error experiment, the CFSFDP can be used to calculate the optimal cluster number at each speed, and then FCM and GRA can be used for cluster analysis and correlation analysis to obtain the temperature sensitive points of the motorized spindle.

Fig. 4 Decision diagram

Figure 4 shows the CFSFDP decision diagrams for the temperature data of the motorized spindle at different speeds. The optimal number of temperature sensitive points differs between speeds, so each speed must be analyzed separately when selecting the temperature sensitive points. Based on the optimal cluster number at each speed, the temperature measurement points are clustered by FCM. The optimal cluster number and clustering results for each speed are shown in Table 2.

Table 2 Optimal cluster number and clustering results at different speeds

Using the temperature measurement point data and the thermal error data, the grey correlation coefficient between each temperature measurement point and the thermal error is calculated at each speed. The calculation results are shown in Table 3.

Table 3 The grey correlation coefficient between temperatures and thermal errors

Based on the FCM and GRA calculation results, the temperature sensitive points at different speeds are obtained, as shown in Table 4. After comprehensive consideration, the optimal number of temperature sensitive points of the motorized spindle is 4, and the temperature sensitive points are (T1, T2, T7, T9).
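The per-cluster selection rule, in which the measuring point with the highest grey correlation coefficient in each cluster becomes the temperature sensitive point, can be sketched as below. The cluster contents and grade values are hypothetical numbers chosen only to make the example runnable; they are not the values of Tables 2 and 3.

```python
def select_sensitive_points(clusters, grades):
    """Pick, from each cluster, the sensor with the highest grey relational grade."""
    return [max(cluster, key=lambda name: grades[name]) for cluster in clusters]

# Hypothetical clustering result and grades, for illustration only.
clusters = [["T1", "T2", "T3"], ["T4", "T5"]]
grades = {"T1": 0.81, "T2": 0.77, "T3": 0.69, "T4": 0.58, "T5": 0.64}
print(select_sensitive_points(clusters, grades))
```

Because the rule returns one point per cluster, the number of temperature sensitive points at a given speed equals the cluster number found for that speed.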

In order to verify the optimal cluster number obtained by CFSFDP, the motorized spindle temperature data at 2000 r/min are taken as an example. The optimal cluster number is determined independently by calculating the Silhouette Coefficient (SC) of the clustering results for different cluster numbers. The SC combines the cohesion and separation of the clusters to evaluate the clustering effect. It takes values in [-1, 1], and a larger SC indicates a better clustering effect. The calculation formula of SC is as follows:

$$\begin{aligned} S = \frac{b-a}{\max (a,b)} \end{aligned}$$
(9)

Where a is the average distance from sample i to the other samples in the same cluster, and b is the minimum, over all other clusters, of the average distance from sample i to the samples of that cluster.
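Eq. (9) gives the score of a single sample; the SC of a clustering is commonly taken as the mean of these scores over all samples. The sketch below follows that convention and assigns a score of 0 to a sample that is alone in its cluster. Both choices, as well as the function name `silhouette_coefficient`, are assumptions rather than details stated in the text, and the function expects at least two clusters.

```python
import numpy as np

def silhouette_coefficient(x, labels):
    """Mean silhouette value over all samples, using Eq. (9) per sample."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    scores = []
    for i in range(len(x)):
        same = labels == labels[i]
        same[i] = False  # exclude the sample itself from its own cluster
        if not same.any():
            scores.append(0.0)  # singleton cluster: score 0 (assumed convention)
            continue
        a = d[i, same].mean()
        b = min(d[i, labels == other].mean()
                for other in np.unique(labels) if other != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))
```

Evaluating this function for the FCM result at each candidate cluster number and keeping the number with the largest value reproduces the kind of check reported in Table 5.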

Table 5 shows the clustering results and SC of the motorized spindle at 2000 r/min under different cluster numbers. When the number of clusters is 2, the SC is 0.71, so 2 is the best number of clusters. Therefore, the method of selecting temperature sensitive points based on CFSFDP is feasible, and the best cluster number of the temperature measuring points of the motorized spindle can be found by this method.

4 Thermal error prediction model of the motorized spindle based on ABHHO-LSSVM

4.1 Harris hawks optimization(HHO)

Harris hawks optimization (HHO) [24] is a metaheuristic algorithm proposed by Heidari and Mirjalili in 2019. It simulates the predation process of Harris hawks and uses Lévy flight to solve complex multidimensional problems. Research [25] shows that the HHO has good optimization performance. The HHO mainly includes an exploration phase, a transition from exploration to exploitation, and an exploitation phase.

4.1.1 Exploration phase

In the exploration phase, the prey is detected according to the following two strategies:

$$\begin{aligned} X(t+1) = {\left\{ \begin{array}{ll} X_{rand}(t)-r_{1}|X_{rand}(t)-2r_{2}X(t)|, &amp; q \ge 0.5\\ (X_{rabbit}(t)-X_{ave}(t))-r_{3}(lb+r_{4}(ub-lb)), &amp; q < 0.5\\ \end{array}\right. } \end{aligned}$$
(10)

Where q,\(r_{1}\),\(r_{2}\),\(r_{3}\) and \(r_{4}\) are all random numbers in [0,1], ub and lb are the upper and lower bounds of the search space, \(X_{rand}\) is the position of a random individual, \(X_{rabbit}\) is the prey position, and \(X_{ave}\) is the average position of all individuals in the population.

4.1.2 Transition from exploration to exploitation

The Harris hawks switch between the exploration phase and the exploitation phase according to the escaping energy of the prey. The escaping energy is calculated as follows:

$$\begin{aligned} E = 2 E_{0} (1 - t / t_{max}) \end{aligned}$$
(11)

Where E is escaping energy, \(E_{0}\) is the initial escaping energy, which varies randomly in [-1,1], t is the current iteration number, \(t_{max}\) is the maximum number of iterations. When \(|E |\ge 1\), the HHO is in the exploration phase. When \(|E |< 1\), the HHO is in the exploitation phase.
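The phase switch driven by Eq. (11) can be sketched in a few lines. The function names `escaping_energy` and `phase` are hypothetical, and \(E_{0}\) is assumed to be drawn uniformly from [-1, 1].

```python
import random

def escaping_energy(t, t_max, rng=random):
    """Eq. (11): E = 2 * E0 * (1 - t / t_max), with E0 uniform in [-1, 1]."""
    e0 = rng.uniform(-1.0, 1.0)
    return 2.0 * e0 * (1.0 - t / t_max)

def phase(energy):
    """Phase implied by the magnitude of the escaping energy."""
    return "exploration" if abs(energy) >= 1.0 else "exploitation"
```

Since \(|E_{0}| \le 1\), the condition \(|E| \ge 1\) requires \(1 - t/t_{max} \ge 0.5\), so exploration can only occur during the first half of the iterations and the second half is spent entirely in exploitation.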

Table 4 Temperature sensitive points at different speeds
Table 5 Clustering results and SC of the motorized spindle temperature measurement points at 2000r/min

4.1.3 Exploitation phase

In the exploitation phase, let r be the escaping probability of the prey. When \(r<0.5\), the prey escapes successfully; when \(r \ge 0.5\), the escape fails. Combined with the escaping energy E, there are four cases:

  (1) Soft besiege: When \(r \ge 0.5, |E| \ge 0.5\), the location optimization strategy is as follows:

    $$\begin{aligned} X(t+1) = X_{rabbit}(t) - X(t) - E |J X_{rabbit}(t) - X(t)| \end{aligned}$$
    (12)

    Where \(J = 2 (1 - r_{5})\) is the jumping distance of the prey during the escape, and \(r_{5}\) is a random number in (0,1).

  (2) Hard besiege: When \(r \ge 0.5, |E| < 0.5\), the location optimization strategy is as follows:

    $$\begin{aligned} X(t+1) = X_{rabbit}(t) - E |X_{rabbit}(t) - X(t) |\end{aligned}$$
    (13)
  (3) Soft besiege with progressive rapid dives: When \(r < 0.5, |E| \ge 0.5\), the location optimization strategy is as follows:

    $$\begin{aligned} X(t+1)= & {} {\left\{ \begin{array}{ll} Y, fitness(Y)< fitness(X(t))\\ Z, fitness(Z) < fitness(X(t))\\ \end{array}\right. }\end{aligned}$$
    (14)
    $$\begin{aligned} Y= & {} X_{rabbit}(t) - E |J X_{rabbit}(t) - X(t) |\end{aligned}$$
    (15)
    $$\begin{aligned} Z= & {} Y + S \times LF(D) \end{aligned}$$
    (16)

    Where D is the dimension of the problem, S is a random vector of size \(1 \times D\), and LF(D) is the Lévy flight function.

  (4) Hard besiege with progressive rapid dives: When \(r< 0.5, |E| < 0.5\), the location optimization strategy is as follows:

    $$\begin{aligned} X(t+1)= & {} {\left\{ \begin{array}{ll} Y, fitness(Y)< fitness(X(t))\\ Z, fitness(Z) < fitness(X(t))\\ \end{array}\right. }\end{aligned}$$
    (17)
    $$\begin{aligned} Y= & {} X_{rabbit}(t) - E |J X_{rabbit}(t) - X_{m}(t) |\end{aligned}$$
    (18)
    $$\begin{aligned} Z= & {} Y + S \times LF(D)\end{aligned}$$
    (19)
    $$\begin{aligned} X_{m}(t)= & {} \frac{1}{N} \sum _{i=1}^{N} X_{i}(t) \end{aligned}$$
    (20)

    The HHO realizes the optimization solution by adjusting the four trapping mechanisms between the harris hawks and prey.
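The case selection and the two besiege updates without rapid dives, Eqs. (12) and (13), can be sketched as follows. The Lévy flight term of Eqs. (16) and (19) is left out because its exact form is not given in the text, and the function names are hypothetical.

```python
import numpy as np

def strategy(r, energy):
    """Name of the exploitation strategy selected by r and |E|."""
    soft = abs(energy) >= 0.5
    if r >= 0.5:
        return "soft besiege" if soft else "hard besiege"
    return ("soft besiege with progressive rapid dives" if soft
            else "hard besiege with progressive rapid dives")

def soft_besiege(x, x_rabbit, energy, jump):
    """Eq. (12): X(t+1) = (X_rabbit - X) - E * |J * X_rabbit - X|."""
    return (x_rabbit - x) - energy * np.abs(jump * x_rabbit - x)

def hard_besiege(x, x_rabbit, energy):
    """Eq. (13): X(t+1) = X_rabbit - E * |X_rabbit - X|."""
    return x_rabbit - energy * np.abs(x_rabbit - x)
```

In a full implementation, `jump` would be drawn as \(J = 2(1 - r_{5})\) at every update, and the two rapid-dive cases would evaluate the fitness of the candidates Y and Z before accepting a move.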

4.2 Least squares support vector machine(LSSVM)

The least squares support vector machine (LSSVM) [26] is a machine learning algorithm proposed by J.A.K. Suykens in 1999. Through a kernel function, the LSSVM can capture the nonlinear dependence of thermal error on temperature. The regression function of the LSSVM is as follows:

$$\begin{aligned} f(x) = w^T \phi (x) + b \end{aligned}$$
(21)

Where f(x) is the predicted value, w is the weight vector, x is the input value, \(\phi (x)\) is the nonlinear mapping of the input into the feature space, and b is the bias.

The optimization objectives of the LSSVM are as follows:

$$\begin{aligned} {\left\{ \begin{array}{ll} \min J(w, \xi ) = \frac{1}{2} w^T w + \frac{1}{2} c \sum _{i=1}^{m} \xi _{i}^{2}\\ s.t.\ y_{i} = w^T \phi (x_{i}) + b + \xi _{i}, \quad i = 1, 2, \ldots ,m\\ \end{array}\right. } \end{aligned}$$
(22)

Where \(\xi _{i}\) is the error variable and c is the adjustable penalty parameter. The prediction accuracy of the model can be improved by adjusting c.

Introducing the Lagrangian function into Eq. (22) and eliminating w and \(\xi _{i}\) based on the KKT conditions gives the following linear equations:

$$\begin{aligned} \begin{bmatrix} 0 &amp; 1 &amp; \cdots &amp; 1 \\ 1 &amp; K(x_{1}, x_{1})+1/c &amp; \cdots &amp; K(x_{1}, x_{m}) \\ \vdots &amp; \vdots &amp; \ddots &amp; \vdots \\ 1 &amp; K(x_{m}, x_{1}) &amp; \cdots &amp; K(x_{m}, x_{m})+1/c \\ \end{bmatrix} \cdot \begin{bmatrix} b \\ a_{1} \\ \vdots \\ a_{m} \\ \end{bmatrix} = \begin{bmatrix} 0 \\ y_{1} \\ \vdots \\ y_{m} \\ \end{bmatrix} \end{aligned}$$
(23)

Where \(a_{i}(i = 1, 2, \ldots , m)\) is the Lagrange multiplier and \(K(x_{i}, x_{j})\) is a kernel function. The commonly used kernel function is the Gaussian kernel, whose specific form is as follows:

$$\begin{aligned} K(x_{i}, x_{j}) = \exp \left( \frac{-\Vert x_{i} - x_{j} \Vert ^2}{2 \sigma ^2}\right) \end{aligned}$$
(24)

Where \(\sigma \) is the width parameter of the kernel function.

The precision of the LSSVM can be improved by optimizing c and \(\sigma \). Finally, the LSSVM regression function is obtained as follows:

$$\begin{aligned} f(x) = \sum _{i=1}^{m} a_{i} K(x,x_{i}) + b \end{aligned}$$
(25)
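Training an LSSVM therefore reduces to solving one linear system, which the following sketch illustrates with NumPy. It uses the standard LSSVM formulation in which \(1/c\) is added to the diagonal of the kernel matrix only, together with the Gaussian kernel of Eq. (24). The function names `gaussian_kernel`, `lssvm_fit` and `lssvm_predict` are hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between the rows of `a` and the rows of `b`."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def lssvm_fit(x, y, c, sigma):
    """Solve the LSSVM linear system for the bias b and the multipliers a."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = len(x)
    lhs = np.zeros((m + 1, m + 1))
    lhs[0, 1:] = 1.0
    lhs[1:, 0] = 1.0
    # Kernel matrix plus 1/c on the diagonal (standard LSSVM form).
    lhs[1:, 1:] = gaussian_kernel(x, x, sigma) + np.eye(m) / c
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(lhs, rhs)
    return sol[0], sol[1:]

def lssvm_predict(x_new, x_train, b, a, sigma):
    """Eq. (25): f(x) = sum_i a_i K(x, x_i) + b."""
    x_new = np.asarray(x_new, dtype=float)
    x_train = np.asarray(x_train, dtype=float)
    return gaussian_kernel(x_new, x_train, sigma) @ a + b
```

In the setting of this paper, each row of `x` would hold the readings of the temperature sensitive points at one sampling instant and `y` the corresponding axial thermal error. The pair `(c, sigma)` is what the optimization algorithm has to tune.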

4.3 Thermal error prediction model of the motorized spindle based on ABHHO-LSSVM

The thermal error prediction model of the motorized spindle based on LSSVM needs to adjust the hyperparameters c and \(\sigma \) to improve the prediction accuracy. The HHO has a strong global search ability, and the combination of the HHO and LSSVM can obtain a higher prediction accuracy. In order to avoid overfitting, cross validation is used to validate the model. The fitness function is the root mean square error (RMSE) between the predicted thermal error and the actual thermal error.

$$\begin{aligned} RMSE = \sqrt{\frac{\sum _{i=1}^{N} (\hat{r}_{i} - r_{i})^2}{N}} \end{aligned}$$
(26)

Where N is the number of predictions, \(\hat{r}_{i}\) is the ith predicted thermal error, and \(r_{i}\) is the ith actual thermal error.
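For reference, Eq. (26) corresponds to the short function below. The name `rmse` is a hypothetical helper, not a function from the paper.

```python
import numpy as np

def rmse(predicted, actual):
    """Eq. (26): root mean square error between predictions and measurements."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))
```

Used as a fitness function, it would be evaluated on the validation folds for each candidate pair of hyperparameters, and the optimizer would minimize the returned value.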

When using the HHO, it is necessary to determine the upper and lower bounds of the optimization parameters, that is, the optimization boundary. The selection of the optimization boundary is very important for the accuracy of the prediction model, and the optimal parameters should lie within the optimization boundary as far as possible. Therefore, this paper proposes adaptive boundary Harris hawks optimization (ABHHO), which adaptively adjusts the optimization boundary to improve the prediction accuracy when establishing the thermal error prediction model of the motorized spindle.

The ABHHO adjusts the optimization boundary based on the best parameters found so far. If the optimal parameter lies near either end of the optimization boundary, better parameters may exist outside the boundary. In this case, the boundary is re-centered on the optimal parameter and the search is repeated. If the optimal parameter lies in the middle of the optimization boundary, the boundary is narrowed around the optimal parameter and the search is repeated. When the iteration number or the iteration accuracy meets the termination conditions, the program ends and the optimal parameters and fitness function values are output.

To sum up, the specific steps of the ABHHO are as follows:

  (1) Determine the initial boundary (a, b), the iteration number n, the boundary length \(c=b-a\) and the boundary reduction percentage m, where m lies in (0,1). Calculate the optimal parameter \(x_{1}\) and the fitness function value \(y_{1}\) with the HHO on the initial boundary. Set the optimal parameter \(x_{f} = x_{1}\) and the optimal fitness function value \(y_{f} = y_{1}\).

  (2) Adjust the boundary according to \(x_{f}\). The adjustment scheme is as follows:

    (a) If \(a \le x_{f} < a + c/3\) or \(b - c/3 < x_{f} \le b\), the updated boundary is \((x_{f}-c/2, x_{f}+c/2)\).

    (b) If \(a + c/3 \le x_{f} \le b - c/3\), update the boundary length to \(c_{1} = mc\) and the boundary to \((x_{f}-c_{1}/2, x_{f}+c_{1}/2)\).

  (3) Calculate the best parameter \(x_{2}\) and the fitness function value \(y_{2}\) on the updated boundary.

  (4) If \(y_{f} \le y_{2}\), \(x_{f}\) and \(y_{f}\) remain unchanged. Repeat from step 2.

  (5) If \(y_{f} > y_{2}\), set \(x_{f} = x_{2}\) and \(y_{f} = y_{2}\). Repeat from step 2.

  (6) When the optimal parameter and fitness function value have remained unchanged for n iterations, the program ends and the optimal parameter \(x_{f}\) and fitness function value \(y_{f}\) are output.

The process of ABHHO is shown in Fig. 5.
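The boundary adjustment loop can be sketched for a single parameter as follows. To keep the example short and self-contained, the inner optimizer is a plain random search that stands in for the HHO; it is not the HHO itself. The sketch also assumes that the boundary length shrinks cumulatively each time case (b) applies, that "unchanged for n times" means n consecutive rounds without improvement, and that a cap `max_rounds` guards against endless loops. All function and argument names are hypothetical.

```python
import random

def random_search(fitness, lo, hi, n_samples=200, rng=random):
    """Stand-in for the HHO: best of n_samples uniform draws in [lo, hi]."""
    best_x = min((rng.uniform(lo, hi) for _ in range(n_samples)), key=fitness)
    return best_x, fitness(best_x)

def adaptive_boundary_search(fitness, a, b, m=0.5, n=5,
                             optimizer=random_search, max_rounds=200):
    """Adaptive-boundary loop for a single parameter, following steps (1)-(6)."""
    c = b - a                                   # boundary length
    x_f, y_f = optimizer(fitness, a, b)         # step (1)
    unchanged = 0
    for _ in range(max_rounds):                 # safety cap (assumption)
        if unchanged >= n:                      # step (6)
            break
        # Step (2): shrink only when x_f lies in the middle third.
        if a + c / 3 <= x_f <= b - c / 3:
            c = m * c                           # case (b)
        a, b = x_f - c / 2, x_f + c / 2         # both cases recentre on x_f
        x_2, y_2 = optimizer(fitness, a, b)     # step (3)
        if y_2 < y_f:                           # step (5)
            x_f, y_f, unchanged = x_2, y_2, 0
        else:                                   # step (4)
            unchanged += 1
    return x_f, y_f
```

With a fitness function whose minimum lies outside the initial boundary, for example \((x - 422.787)^2\) with the boundary [1, 100], the loop first shifts the boundary toward the minimum through case (a) and then narrows it through case (b). For the two LSSVM hyperparameters, the same rule would be applied to c and \(\sigma \) separately, which is again an assumption about the implementation.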

Fig. 5 Process of ABHHO-LSSVM

Fig. 6 Fitting results of four models at 1000 r/min

Fig. 7 Fitting results of four models at 2000 r/min

Fig. 8 Fitting results of four models at 3000 r/min

Fig. 9 Fitting results of four models at 4000 r/min

Table 6 Fitting accuracy of four models at four speeds

Fig. 10 Test data set for thermal error prediction model

The initial boundary is set to [1, 100]. Because of the complexity of the algorithm, 75% of the experimental data are randomly selected as the training set. Optimizing the hyperparameters of the LSSVM with ABHHO gives \((c=422.787, \sigma =1)\).

Fig. 11 Predicting results of four models

5 Thermal error prediction model performance analysis

5.1 Traditional thermal error prediction model

In order to verify the feasibility and superiority of ABHHO-LSSVM, multiple linear regression (MLR), support vector machine (SVM) and least squares support vector machine (LSSVM) models were established for comparison with ABHHO-LSSVM. According to Sect. 3.3, four temperature measuring points (T1, T2, T7, T9) are selected as temperature sensitive points.

For a fair comparison, the hyperparameters of SVM and LSSVM are determined by HHO with the initial boundary [1, 100]. The resulting hyperparameters are \((c=1.090, \sigma =1)\) for SVM and \((c=67.419, \sigma =15.921)\) for LSSVM. The MLR model established by the least squares method is as follows.

$$\begin{aligned} y = -21.57 - 0.418 t_{1} + 1.021 t_{2} + 0.278 t_{3} - 0.150 t_{4} \end{aligned}$$
(27)

5.2 Fitting performance comparison

Fitting performance refers to the ability of a model to fit training set data. The fitting performance of four thermal error prediction models of the motorized spindle at four speeds is compared. The training set data includes temperature and thermal error of four speeds: 1000r/min, 2000r/min, 3000r/min and 4000r/min, with 180 samples for each speed.

Figures 6, 7, 8 and 9 show the thermal error fitting results of the four thermal error prediction models at four speeds.

Table 6 lists the fitting accuracy of the four thermal error prediction models at four speeds. The RMSE of all four models at all four speeds is less than 0.25 \(\mu m\). The fitting accuracy of ABHHO-LSSVM is the highest: its RMSE is less than 0.04 \(\mu m\) at all four speeds and lower than that of the other three models. Therefore, the thermal error prediction model of the motorized spindle based on ABHHO-LSSVM has good fitting performance. High fitting accuracy only shows that the model fits the training data well; the more important aspect of a thermal error prediction model is its prediction performance at different speeds.

Table 7 Predicting accuracy of four models

5.3 Prediction performance comparison

In order to verify the prediction performance of the thermal error prediction model at different speeds (the speed of training set data is different from that of test set data), the temperature and thermal error data of the motorized spindle at 1500 r/min, 2500 r/min and 3500 r/min were measured through the thermal error experiment, and the sampling frequency was once a minute. The experimental scheme is as follows.

  (1) The motorized spindle runs at 4000 r/min for 3 h until the temperature is stable.

  (2) Adjust the speed to 3500 r/min, run for half an hour, and collect 30 samples in total.

  (3) Adjust the speed to 2500 r/min, run for half an hour, and collect 30 samples in total.

  (4) Adjust the speed to 1500 r/min, run for half an hour, and collect 30 samples in total.

The collected temperature and thermal error data are shown in Fig. 10.

The four thermal error prediction models are trained on the training set data in Sect. 5.1. The test set data includes temperature and thermal error data at 1500 r/min, 2500 r/min and 3500 r/min, with a total of 90 samples. Figure 11 shows the prediction results of the four models.

Table 7 shows the RMSE and \(R^2\) of the prediction results of the four models. ABHHO-LSSVM has the best prediction performance, with an RMSE of 0.117 \(\mu m\). MLR and HHO-LSSVM also predict well, with RMSEs of 0.182 \(\mu m\) and 0.279 \(\mu m\) respectively. The prediction performance of HHO-SVM is poor, with an RMSE of 0.363 \(\mu m\). Compared with MLR, HHO-SVM and HHO-LSSVM, the RMSE of ABHHO-LSSVM is lower by 36%, 68% and 58% respectively. Therefore, the thermal error prediction model of the motorized spindle based on ABHHO-LSSVM is feasible and superior.
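The relative reductions quoted above follow directly from the reported RMSE values. The short check below is a sketch. The RMSE figures are those stated in the text, while the names `RMSE_UM` and `relative_reduction` are introduced here only for illustration.

```python
# Prediction RMSE values in micrometers, as reported in the text.
RMSE_UM = {
    "MLR": 0.182,
    "HHO-SVM": 0.363,
    "HHO-LSSVM": 0.279,
    "ABHHO-LSSVM": 0.117,
}


def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return (baseline - proposed) / baseline * 100.0


proposed = RMSE_UM["ABHHO-LSSVM"]
reductions = {
    name: round(relative_reduction(value, proposed))
    for name, value in RMSE_UM.items()
    if name != "ABHHO-LSSVM"
}
# reductions == {"MLR": 36, "HHO-SVM": 68, "HHO-LSSVM": 58}
```

Each percentage is relative to the baseline model's own RMSE, which is why the largest reduction is against the model with the largest error, HHO-SVM.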

6 Conclusions

To improve the accuracy of thermal error prediction for the motorized spindle, this paper proposes a thermal error prediction model based on ABHHO-LSSVM and a method for selecting temperature sensitive points based on CFSFDP. The rationality and feasibility of the proposed method and model are verified through the thermal error experiment. The conclusions are summarized as follows:

  (1) The CFSFDP-based method for selecting temperature sensitive points of the motorized spindle can determine the optimal cluster number and select temperature sensitive points according to it. The temperature sensitive points change with the spindle speed.

  (2) The ABHHO has better optimization performance than the HHO. With the adaptive boundary, the optimization boundary changes with the optimal solution, which avoids the situation in which the optimal solution lies outside the optimization boundary.

  (3) The ABHHO-LSSVM has better fitting performance than MLR, HHO-SVM and HHO-LSSVM. The RMSE of ABHHO-LSSVM is less than 0.04 \(\mu m\) in the fitting experiments at 1000 r/min, 2000 r/min, 3000 r/min and 4000 r/min.

  (4) The ABHHO-LSSVM has better prediction performance than MLR, HHO-SVM and HHO-LSSVM. When predicting the thermal error of the motorized spindle, the RMSE of ABHHO-LSSVM is only 0.117 \(\mu m\). Compared with MLR, HHO-SVM and HHO-LSSVM, the RMSE of ABHHO-LSSVM is lower by 36%, 68% and 58% respectively.

Because the water cooling device was turned on during the thermal error measurement experiment, the measured thermal error was small, and the prediction error of the models in the prediction experiment is correspondingly small. Although the measured values are small, the prediction experiment shows that the prediction model based on ABHHO-LSSVM has high prediction accuracy. The magnitude of the experimental data does not affect the prediction accuracy of the prediction model. The proposed prediction model can compensate more than 80% of the thermal error.

The ABHHO-LSSVM model proposed in this paper can, in theory, predict the linear thermal error of the motorized spindle from the temperature of the spindle body. Owing to the limitations of the experimental conditions, the model has not been verified experimentally on other motorized spindles. The modeling complexity of the proposed model is high, and high-precision prediction models with lower modeling complexity will be explored in the future. This study did not consider thermal error compensation of the motorized spindle. Future research should study thermal error compensation of the motorized spindle to improve its machining accuracy.