1 Introduction

The five-axis computer numerical control (CNC) machine tool is the “industrial master machine” of modern manufacturing. As the core component of the direct-drive milling head, the motorized spindle works with the machining center and dramatically improves machining accuracy and efficiency. The motorized spindle mainly consists of bearings, a motor, a cooling system, and a tool change system. It has a compact internal structure, and the motor and bearings generate a large amount of heat when running at high speed. The accumulated heat causes thermal deformation of the spindle components, which affects the bearing preload state and the machining accuracy. During heat transfer, owing to the influence of the temperature gradient and the ambient temperature, the thermally induced error is characterized chiefly by time delay, time variation, and multi-directional coupling [1]. It is therefore difficult to predict the relationship between the temperature field and the thermal displacement with a simple mathematical model, and effective control of the thermally induced error is particularly important.

The methods of controlling thermal error mainly include thermal error prevention and thermal error modeling compensation. Thermal error prevention refers to eliminating the thermal error in the design and manufacturing stage [2], whereas thermal error modeling compensation establishes a thermal error prediction model by using a neural network. The thermal error prevention method is a high-cost “hard technology,” so the low-cost, high-benefit modeling compensation method receives more attention in actual production [3]. To ensure the input quality of the model, scholars have put forward a variety of strategies for optimizing the temperature measurement points and thereby improving modeling accuracy. Yang et al. and Li et al. [4, 5] used a combination of fuzzy C-means clustering and correlation analysis to group temperature variables and selected the points with large correlation coefficients as temperature-sensitive points. The difference is that Li built a comprehensive temperature information (STI) matrix and used the effective clustering index (CVI) to determine the number of clusters. Chiu et al. selected four characteristic temperature points using the Pearson correlation coefficient [6]. Tan et al. [7] used the least absolute shrinkage and selection operator (LASSO) to screen the key thermal points; however, the coupling between the temperatures was not considered. Miao et al. [8] put forward a traversal optimization method for screening temperature-sensitive points, which effectively alleviated the multicollinearity among the temperature variables. To reduce the variable search time, Lu [9] applied the gray relational analysis algorithm to optimize temperature-sensitive points. Li et al. proposed the improved binary grasshopper optimization algorithm (IBGOA) combined with multiple linear regression to screen temperature-sensitive points [10], which significantly improved the fitting accuracy of the prediction model. Zhou [11] used the K-means algorithm to select temperature-sensitive points; however, the K-means algorithm is sensitive to the choice of initial cluster centers, and randomly initializing the cluster centers reduces clustering accuracy. The K-means++ algorithm selects each new cluster center at random according to a weighted probability distribution and then assigns the samples to the cluster of the nearest center according to the closest distance principle, which solves the random initialization problem of the K-means algorithm and reduces this sensitivity.

Many scholars have proposed neural network thermal error modeling methods to reveal the relationship between temperature and thermal error, such as radial basis function (RBF) neural network modeling, back-propagation (BP) neural network modeling, and gray neural network (GNN) modeling. Yang et al. established an RBF neural network prediction model to identify the key thermal stiffness of machine tools [12]. Zhang et al. used a genetic algorithm (GA) to globally search the center values and connection weights of the RBF network [13], which is more accurate than a single RBF neural network model. BP networks are widely used, but they suffer from low model accuracy, slow convergence, and a tendency to fall into local minima, and many methods have been proposed to address these problems. Ma et al. used a GA to optimize the weights of the BP network to improve accuracy [14]. However, the network structure is determined empirically, and the GA training takes a long time and is prone to premature convergence. To solve this problem, Ma et al. combined gray cluster grouping and relational analysis to optimize the temperature measurement points, improving the input quality of the model [15]. Qianjian Guo, Jianguo Yang, and Xiaoying Zhang optimized the structure and parameters of a BP neural network using an ant colony optimization (ACO) algorithm [16,17,18]. However, the scarcity of initial pheromones in ACO consumes a large amount of search time and easily traps the results in local optima. Abdulshahed et al. used the particle swarm optimization (PSO) algorithm to optimize the weights of GNN nodes [19], but it consumed a lot of computing time. In addition, several scholars have also carried out heat transfer analyses.
The K-means++ clustering algorithm has also been used to establish each index rank affiliation function and determine the dynamic characteristic affiliation rank [20].

There are still some problems in establishing the thermal error model using an artificial neural network. An excessive number of temperature measuring points causes a thermal coupling effect, resulting in inaccurate identification of temperature-sensitive points. The BP network is too sensitive to the initial weight selection: the weights are adjusted along the direction of local improvement, so the network easily falls into local minima, which reduces the accuracy of the BP model. The initial pheromone distribution of the ACO-BP neural network is poor, which makes the random path search of the ants inefficient, and when the optimization objective function is complex, the convergence speed is slow. Currently, few scholars use the GA-ACO algorithm to optimize BP neural networks. To fill this research gap in the field of motorized spindles, this paper combines the GA and ACO algorithms to search for the optimal parameters of the BP neural network and establish a neural network thermal error prediction model. First, a thermal-structural simulation analysis is carried out to provide a theoretical basis for the arrangement of temperature measuring points. Then, the K-means++ algorithm is used to group the temperature variables, and gray relational analysis is used to select the temperature-sensitive points with high relational degrees. Finally, the global search ability of the GA is introduced to initialize the pheromone distribution of the ant colony, and the optimized ant colony algorithm searches for the optimal weights and thresholds of the BP network to establish the GA-ACO-BP neural network model.

2 Thermal characteristic analysis

2.1 Finite element simulation analysis

The A03 motorized spindle in the milling head is taken as the experimental object in this paper. It is a core component of a high-end five-axis CNC machine tool, with a positioning diameter of Φ170 mm. It adopts torque motor direct-drive technology and built-in hydraulic clamping, featuring high speed, high precision, and large torque. The maximum torque of the A03 motorized spindle is 200 Nm, and the rated speed is 6000 r/min. The motorized spindle works in a sealed environment with constant temperature and humidity. To simulate the motor heating condition and locate the temperature measurement points, only the effect of internal heat sources is considered in this paper.

2.1.1 Setup of the simulation analysis

The motorized spindle is the core component of the milling head. Under working conditions, a large amount of heat accumulates in the motorized spindle, which results in the deviation of the tooltip position. Therefore, it is necessary to analyze the main heat sources. The primary heat sources of the milling head are the motorized spindle and A/C axis. Each heat source affects the thermal displacement of the motorized spindle in different ways, so it is particularly important to find out the one that has the greatest influence on it.

Motor heat and bearing heat are the main factors leading to thermal error [21]. Therefore, a 3D model is established for the thermal characteristics simulation analysis. However, an overly complicated model leads to poor mesh quality, which reduces the accuracy of the analysis results. The following simplifications are therefore made in the modeling process [22]:

  1. Ignore process holes such as screw holes, threads, bolt holes, and positioning holes, and ignore process structures such as chamfers, fillets, bosses, and seal retaining rings.

  2. Use fixed connections for internal parts, simplify each bearing into a ring, and use the “Bonded” command to define the bearing connection surfaces.

  3. Import the simplified model into the Workbench platform and add the material properties to the material library. Select the automatic meshing method to mesh the model, and refine the mesh of the motor and bearing parts. The room temperature is set to 22 °C. The steady-state temperature and thermal deformation are then solved.

2.1.2 Simulation results and analysis

After the solution, the temperature field shown in Fig. 1a is obtained. The rotor of the motorized spindle has the highest temperature, followed by the rotor of the A-axis, while the temperature of the C-axis part is lower. Therefore, the temperature measurement points are arranged near the motorized spindle and A-axis.

Fig. 1 a Temperature field simulation results. b Thermal deformation field simulation results

From Fig. 1b, it can be seen that the thermal deformation is as high as 177.23 μm in the Z-direction and 75.897 μm in the X-direction, but only 21.46 μm in the Y-direction. The milling head is axisymmetric about the Y-axis, and this thermally symmetric structure makes the heat distribution uniform, so the thermal deformation in the Y-direction is minimal and negligible. The motorized spindle heats up severely, and the thermal deformation at its end is the largest. The error of the milling head therefore mainly originates from the thermal deformation in the Z-direction and X-direction, so the displacement sensors are arranged in these two directions for the subsequent experiments.

2.2 Thermal deformation experiment

Under working conditions, the thermal deformation of the end of the motorized spindle is affected not only by its thermal elongation but also by the thermal deformation of the A/C axis. It can be seen from the finite element simulation that the heat generated by C-axis is less, and it is difficult to transfer the heat to the shell to affect the thermal deformation. However, the heat generated by A-axis is large, and it will be transferred to the motorized spindle through the shell connector. At the same time, the Z-direction and X-direction deformation of the A-axis can also affect the deformation of the motorized spindle through the shell connector.

To verify that the A-axis heat source has a great influence on thermal displacement, three groups of experiments are carried out to measure the thermal displacement. The arrangement of the displacement sensors is shown in Fig. 3b. In the first group, the C-axis is shut down while the A-axis and motorized spindle run; in the second group, the A-axis is shut down while the C-axis and motorized spindle run; in the third group, the motorized spindle and the A/C axes run at the same time. The results show that the thermal displacements of the first and third groups are the same and much larger than that of the second group. The A-axis heat source therefore has a great influence on thermal displacement, and the C-axis heat source can be neglected. To avoid cluttering the data and to improve the efficiency of the prediction model, measuring points are arranged only for the A-axis and motorized spindle. In summary, this paper considers the effects of the A-axis and spindle when measuring the thermal error.

2.2.1 Experimental platform construction

The temperature measurement points and displacement measurement locations are arranged according to Fig. 1. The air conditioning temperature is set to 22 °C. The instruments used in the experiment are shown in Fig. 2. The temperature acquisition software uses COM 1 with the baud rate set to 9600. In the real-time displacement acquisition software, the calibration coefficient is set to 1.2, the signal acquisition period and recording period are set to 60 s, and the signal smoothing parameter is set to 1. The temperature detection device is a PT100 temperature sensor with a resolution of 0.1 °C. The temperature acquisition module model is JF-32PT100-4-033. The displacement detection device mainly includes an AEC eddy current displacement sensor, a signal converter, a power supply, and a data collector, with a resolution of 0.3 μm and a measurement range of 0~1 mm.

Fig. 2 Experimental instruments

High temperature is mainly concentrated in the motorized spindle and A-axis parts, and the heat is distributed along the shaft core. Accordingly, the temperature measurement points are arranged on these two parts, and the specific locations are shown in Table 1. The Z-direction thermal deformation directly affects the vertical machining accuracy of the tool, and the X-direction thermal deformation aggravates the horizontal offset of the tool; both seriously reduce the machining accuracy. Because the thermally symmetric structure greatly reduces the Y-direction thermal deformation, this experiment arranges displacement sensors only for the X-direction and Z-direction. Figure 3 shows the overall experimental setup, where X1 measures the Z-direction thermal displacement and X2 measures the X-direction thermal displacement. T9 and T11, T2 and T3, and T6 and T10 are symmetrically arranged to collect the temperature data accurately. The speed of the motorized spindle is usually 10000 r/min, so the experiment lasts for 3 h, and displacement data are collected at 3000 r/min, 6000 r/min, and 9000 r/min.

Table 1 Positions of temperature measurement points
Fig. 3 Overall arrangement of the thermal deformation experiment

2.2.2 Experimental results and analysis

Thermal equilibrium is reached when the temperature curves gradually level off. In Fig. 4a, b, and c, the temperature rises within 0~80 min, more so at higher rotating speeds, and fluctuates only slightly after 80 min. The temperature rise trends of T9 and T11 are consistent, and those of T2, T3, and T7 are consistent. T9 and T11 have the highest temperature rise at every speed, followed by T2, T3, and T7. The temperature rise of the other measuring points is small. When the temperature reaches thermal equilibrium, the thermal deformation also tends to be stable. As shown in Fig. 4d and e, the Z-direction thermal deformation at 3000 r/min, 6000 r/min, and 9000 r/min is about 16.5 μm, 25 μm, and 36 μm, respectively. The X-direction thermal deformation at 3000 r/min, 6000 r/min, and 9000 r/min is 4.8 μm, 8.1 μm, and 13 μm, respectively.

Fig. 4 Temperature and thermal displacement variation curves at different rotational speeds

The motorized spindle bearings generate a large amount of heat and are poorly cooled, so the temperatures at T9, T11, T2, and T3 are high. The rear bearing carries a heavy load, and the cooling water flows through the front bearing first, so the temperatures of T9 and T11 are higher than those of T2 and T3. The temperature of T7 is slightly lower than that of T2 because the A-axis bearing generates less heat than the spindle bearing. There are cooling water pipes around the A-axis stator, so the temperatures of T6 and T10 are the lowest. When the coolant is introduced, it takes away a small amount of heat at the rotor, making the temperature of T8 lower than that of T7. T1 and T4~T8 are far from the heat source and dissipate heat well, so their temperature changes are small.

Thermal deformation is closely related to temperature changes. At 3000 r/min, the overall temperature rise is slight, and the thermal deformation is relatively small. As the rotational speed increases, the thermal deformation grows with the temperature, and the largest thermal deformation occurs at 9000 r/min. When the motorized spindle reaches thermal equilibrium, the thermal deformation is stable and no longer changes. Because heat is transferred by convection and conduction, which mostly exhibit hysteresis, the thermal deformation stabilizes slightly later than the temperature.

3 Optimized selection of temperature-sensitive points

Optimizing the choice of temperature-sensitive points reduces the coupling effect while still fully reflecting the temperature field information. To reduce the redundancy of input data and improve the model prediction accuracy, this paper uses the K-means++ algorithm for clustering analysis, combined with gray relational analysis to select the points with higher relational degrees as temperature-sensitive points.

3.1 K-means++ algorithm

The K-means algorithm clusters well, but the random initialization of the centroids influences the clustering result. The K-means++ algorithm mitigates this random initialization problem and improves the convergence speed. The clustering process of the K-means++ algorithm is as follows:

  1. Randomly select a temperature sample as the initial centroid u1.

  2. Calculate the distance D(x) from each sample x to the nearest centroid already selected (initially u1) [23].

  3. Use a weighted probability distribution to randomly choose a new point as the next centroid, where the selection probability P is positively related to D(x).

  4. Repeat steps 2 and 3 until K centroids are selected.

  5. Calculate the Euclidean distances from each temperature sample to the K centroids, and assign each sample to the cluster Ci of its nearest centroid according to the closest distance principle.

  6. Recompute the centroid ui of each cluster Ci as the mean of its samples according to Eq. (1), and re-cluster.

$${u}_i=\frac{1}{\mid {C}_i\mid}\sum_{x\in {C}_i}x$$
(1)

  7. Repeat steps 5 and 6 until the sum of squared errors D [24] from the temperature data to their centroids, given by Eq. (2), no longer decreases, which yields the final clustering result.

$$D=\sum_{i=1}^K\sum_{x\in {C}_i}{\left\Vert x-{u}_i\right\Vert}^2$$
(2)
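
The clustering steps and Eqs. (1) and (2) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that each row of `X` is the time series of one temperature measuring point, that the selection probability is proportional to D(x)² (the usual K-means++ choice; the text only states that P grows with D(x)), and that the best of a few restarts is kept. The names `kmeans_pp_init`, `kmeans_pp`, `n_init`, and `seed` are hypothetical.

```python
import numpy as np

def kmeans_pp_init(X, k, rng):
    """Steps 1-4: choose k initial centroids among the rows of X."""
    centroids = [X[rng.integers(len(X))]]                  # step 1: random first centroid
    while len(centroids) < k:
        c = np.array(centroids)
        # step 2: squared distance to the nearest centroid chosen so far
        d2 = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2).min(axis=1)
        # step 3: draw the next centroid with probability proportional to D(x)^2
        # (assumption; requires at least k distinct samples so that d2.sum() > 0)
        centroids.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centroids)

def nearest(X, u):
    """Index of the nearest centroid for every sample (closest distance principle)."""
    return np.linalg.norm(X[:, None, :] - u[None, :, :], axis=2).argmin(axis=1)

def kmeans_pp(X, k, n_iter=100, n_init=5, seed=0):
    """Return (labels, centroids, D) for the restart with the lowest D of Eq. (2)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_init):
        u = kmeans_pp_init(X, k, rng)
        for _ in range(n_iter):
            labels = nearest(X, u)                         # step 5: assignment
            # step 6, Eq. (1): centroid = cluster mean (kept unchanged if the cluster is empty)
            new_u = np.array([X[labels == i].mean(axis=0) if np.any(labels == i) else u[i]
                              for i in range(k)])
            if np.allclose(new_u, u):                      # step 7: centroids stopped moving
                break
            u = new_u
        labels = nearest(X, u)
        D = ((X - u[labels]) ** 2).sum()                   # Eq. (2): sum of squared errors
        if best is None or D < best[2]:
            best = (labels, u, D)
    return best
```

Calling `kmeans_pp(X, 3)` on an array with one row per measuring point returns the cluster label of each point, the centroids, and the value of D.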

3.2 Gray relational analysis

To filter out the points where the temperature has a considerable influence on the thermal error, the gray relation analysis is performed. Gray relation analysis can evaluate the relation degree between temperature and thermal error.

  1. Determine the analysis series. The reference series is the thermal error series Y={Y(t)|t=1,2,···,T}; the comparison series are the temperature measurement point series Xi={Xi(t)|t=1,2,···,T}, i=1,2,···,m.

  2. Make the series dimensionless.

  3. Calculate the gray relation coefficient ξi(t) according to Eq. (3) [25].

$${\xi}_i(t)=\frac{\min_i{\min}_t\mid {Y}^{\ast }(t)-{X}_i^{\ast }(t)\mid +\rho \cdot {\max}_i{\max}_t\mid {Y}^{\ast }(t)-{X}_i^{\ast }(t)\mid }{\mid {Y}^{\ast }(t)-{X}_i^{\ast }(t)\mid +\rho \cdot {\max}_i{\max}_t\mid {Y}^{\ast }(t)-{X}_i^{\ast }(t)\mid }$$
(3)

In the formula, Y*(t) and Xi*(t) are the dimensionless thermal error series and the dimensionless series of the ith temperature measurement point, respectively, and ρ is an adjustable parameter in [0,1], usually ρ = 0.5.

  4. Calculate the gray relation degree ri for each temperature.

$${r}_i=\frac{1}{T}\sum_{t=1}^T{\xi}_i(t)$$
(4)

  5. Judge the effect on the thermal error according to the gray relation degree. The larger the relational degree of a temperature point, the greater its influence on the thermal error.
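
As an illustration of Eqs. (3) and (4), the following NumPy sketch computes the gray relation degree of several temperature series with respect to one thermal error series. The text does not say which dimensionless treatment is used, so division by the series mean is an assumption made here; the function name `gray_relation_degree` is hypothetical.

```python
import numpy as np

def gray_relation_degree(Y, X, rho=0.5):
    """Gray relation degree r_i of each row of X (m x T) with respect to Y (length T)."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    # Step 2: dimensionless treatment. Mean normalization is an assumption of this
    # sketch; it requires series with nonzero mean.
    Ys = Y / Y.mean()
    Xs = X / X.mean(axis=1, keepdims=True)
    diff = np.abs(Ys[None, :] - Xs)               # |Y*(t) - X_i*(t)| for all i and t
    dmin, dmax = diff.min(), diff.max()           # two-level minimum and maximum over i, t
    # Eq. (3); dmax must be positive, i.e. the series must not all coincide with Y*
    xi = (dmin + rho * dmax) / (diff + rho * dmax)
    return xi.mean(axis=1)                        # Eq. (4): average over time
```

A series that is proportional to the thermal error coincides with it after normalization and therefore obtains the maximum degree of 1, while series with a different shape obtain smaller values.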

3.3 Evaluation and analysis of clustering results

Different K values yield different clustering results. The best number of clusters can be determined intuitively and accurately using the Silhouette Coefficient (St) and the elbow method. St is the mean Silhouette Coefficient over all temperature points and takes values in [−1,1]. The larger St, the better the clustering effect. Si is the Silhouette Coefficient of an individual point xi. St and Si are calculated by Eq. (5) [26].

$$\left\{\begin{array}{l}S_t=\frac1N\sum \limits _{i=1}^NS_i\\S_i=\frac{b\left(x_i\right)-a\left(x_i\right)}{\max\left\{a\left(x_i\right),b\left(x_i\right)\right\}}\end{array}\right.$$
(5)

In the formula, N is the sample size; a(xi) is the average distance from point xi to the other elements in the same cluster, and b(xi) is the smallest average distance from xi to the elements of any other cluster.

In the elbow method, the sum of the distances from the temperature points to their centroids is called the distortion degree J. A smaller J indicates a higher density of temperature points within the clusters; conversely, a larger J indicates loosely grouped points. J decreases as K increases, and the inflection point beyond which the decrease slows markedly is the critical point at which the clustering effect is the best.

The data at 3000 r/min are used as training samples to calculate St and J. From Fig. 5, it can be seen that St reaches its maximum of 0.76 at K = 3 and gradually decreases as the number of clusters increases further. J is highest at K = 1 and has dropped substantially by K = 3; as K increases further, J decreases only slowly, so K = 3 is the critical point of the distortion degree. In summary, K = 3 is the optimal number of clusters. Evaluating the number of clusters with both the Silhouette Coefficient and the elbow method improves clustering accuracy.
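
The two evaluation quantities can be sketched in NumPy as follows, for a given data array and a given cluster assignment. Two choices are assumptions of this sketch: a point that is alone in its cluster receives Si = 0, and J is taken as the sum of squared distances to the cluster means, in line with Eq. (2). The function names are hypothetical, and the Silhouette Coefficient needs at least two clusters.

```python
import numpy as np

def silhouette_coefficient(X, labels):
    """S_t of Eq. (5): mean of S_i over all points (requires at least two clusters)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)   # pairwise distances
    clusters = np.unique(labels)
    s = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue                               # singleton cluster: S_i = 0 (assumed convention)
        a = dist[i, same].sum() / (same.sum() - 1)  # mean distance to own cluster, self excluded
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s.mean()

def distortion(X, labels):
    """Distortion degree J, here the sum of squared distances to the cluster means."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    return sum(((X[labels == c] - X[labels == c].mean(axis=0)) ** 2).sum()
               for c in np.unique(labels))
```

Evaluating both functions for the clustering obtained at each candidate K gives the two curves from which the number of clusters is read off.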

Fig. 5 K value evaluation coefficient

The K value is set to 3, and cluster analysis is performed on the temperature data. The clustering results are shown in Table 2, which classifies the temperatures into the following three groups: [T1, T4, T5, T6, T8, T10], [T2, T3, T7], and [T9, T11].

Table 2 K-means++ clustering results

Too many temperature points lead to redundant input data, so gray relational analysis is used to calculate the relational degree between each temperature point and the thermal error, and the points with the highest relational degree in each group are selected as the input to the model. The gray relational degree of the temperatures at each speed is shown in Fig. 6, and the ranking results are shown in Table 3. T9 and T11, and T2 and T3, are symmetrically arranged and have the same temperature change trend, so T11 and T2, which have the larger relational degrees, are selected. Finally, T2, T7, T8, and T11 are chosen as temperature-sensitive points.
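
The basic selection rule, keeping the point with the highest relational degree in each cluster, can be written as a short sketch. The function name and the numbers in the usage note are hypothetical and are not taken from Table 3. The rule returns one point per cluster, whereas the final set above contains four points, so the selection reported here evidently involves further judgment that the sketch does not model.

```python
def select_sensitive_points(names, labels, degrees):
    """Map each cluster label to the point with the highest gray relation degree in it."""
    best = {}
    for name, cluster, r in zip(names, labels, degrees):
        # keep the point if its cluster is new or it beats the current best of the cluster
        if cluster not in best or r > best[cluster][1]:
            best[cluster] = (name, r)
    return {cluster: name for cluster, (name, r) in best.items()}
```

For example, with illustrative inputs `names = ["T2", "T3", "T9", "T11"]`, `labels = [1, 1, 2, 2]`, and `degrees = [0.90, 0.85, 0.92, 0.95]`, the function keeps T2 for cluster 1 and T11 for cluster 2.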

Fig. 6 Gray relational degree of temperature variables

Table 3 Sorting results of temperature variables

4 Thermal error modeling and analysis

A highly accurate thermal error prediction model can effectively predict thermal elongation. In this paper, a GA is used to optimize the pheromone distribution of ACO; the ACO searches for the optimal parameters of the BP network and establishes the GA-ACO-BP neural network model. The evaluation index of model accuracy is introduced to compare with ACO-BP neural network and RBF neural network.

4.1 Principle of algorithm optimization

4.1.1 Genetic algorithm

GA is an optimization search algorithm that mimics the evolutionary mechanism of organisms in nature. Individuals with good fitness survive and reproduce during the search, and new populations are formed through selection, crossover, inheritance, and mutation to obtain the optimal solution. The specific steps of the GA are as follows.

  1. Initialize the parameters (number of iterations, population size, gene length, crossover probability, mutation probability).

  2. Calculate the fitness function f and the individual fitness F.

$$\left\{\begin{array}{l}f=\frac1E=\frac1{\sum \limits _{q=1}^n\left(y_q-y'_q\right)^2}\\F=k\frac1{\sum \limits _{q=1}^n\left|y_q-y'_q\right|}\end{array}\right.$$
(6)

In the formula, E is the output error, n is the number of output nodes, k is a constant, yq is the qth measured value, and y′q is the qth predicted value.

  3. Select the better individuals for genetic operations (selection, crossover, mutation) to generate new individuals.

  4. Check the end condition. If the accuracy requirement or the maximum number of iterations is met, the algorithm ends; otherwise, repeat steps 2 and 3.
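
The four steps can be illustrated with a small real-coded GA in NumPy that uses the fitness f = 1/E of Eq. (6). This is a sketch under stated assumptions, not the implementation used in this paper: roulette-wheel selection, arithmetic crossover, Gaussian mutation, and elitism are choices made here, the names `fitness`, `run_ga`, and `predict` are hypothetical, and the default population size, iteration limit, and probabilities simply mirror the values quoted in Section 4.2.2, reading the second probability as the mutation probability.

```python
import numpy as np

def fitness(y, y_pred):
    """f = 1/E from Eq. (6), with E the sum of squared errors."""
    E = np.sum((np.asarray(y, dtype=float) - np.asarray(y_pred, dtype=float)) ** 2)
    return 1.0 / (E + 1e-12)       # small constant avoids division by zero (assumption)

def run_ga(predict, y, n_genes, pop_size=20, n_iter=100, pc=0.9, pm=0.005, seed=0):
    """Search a parameter vector p so that predict(p) approximates y."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-1.0, 1.0, (pop_size, n_genes))        # step 1: initial population
    history = []
    for _ in range(n_iter):
        fit = np.array([fitness(y, predict(ind)) for ind in pop])   # step 2
        elite = pop[fit.argmax()].copy()
        history.append(fit.max())
        # step 3: roulette-wheel selection, arithmetic crossover, Gaussian mutation
        parents = pop[rng.choice(pop_size, size=pop_size, p=fit / fit.sum())]
        children = parents.copy()
        for a in range(0, pop_size - 1, 2):
            if rng.random() < pc:
                w = rng.random()
                children[a] = w * parents[a] + (1 - w) * parents[a + 1]
                children[a + 1] = w * parents[a + 1] + (1 - w) * parents[a]
        mask = rng.random(children.shape) < pm
        children[mask] += rng.normal(0.0, 0.1, mask.sum())
        children[0] = elite                                   # elitism: best fitness never drops
        pop = children
    # step 4 is reduced to a fixed iteration budget in this sketch
    fit = np.array([fitness(y, predict(ind)) for ind in pop])
    return pop[fit.argmax()], history
```

Here `predict` is any function that maps a parameter vector to predicted outputs, for example a line `p[0] * x + p[1]` fitted to measured values `y`; `history` records the best fitness at the start of each generation.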

4.1.2 Ant colony optimization algorithm

The basic principle of ACO is that ants release pheromones on their paths during the food search. The pheromone concentration is inversely proportional to the path length and volatilizes with time. Later ants prefer the path with the higher pheromone concentration and release pheromone on it in turn, so the pheromone concentration on the optimal path keeps increasing, forming a positive pheromone feedback mechanism. Finally, the shortest path found by the ant colony is taken as the optimal solution. The flow of the ant colony algorithm is as follows:

  1. Initialize the number of ants k, the pheromone volatility factor ρ, the pheromone decay coefficient ξ, etc.

  2. Construct the solution space of the ant colony. Randomly select the starting point of each ant, and calculate the probability Pkij [27] with which an ant moves from point i to point j at moment t according to Eq. (7), until the ant has traversed all path points.

$$P_{ij}^k(t)=\frac{{\left[{\tau}_{ij}(t)\right]}^{\alpha}\cdot {\left[{\delta}_{ij}(t)\right]}^{\beta }}{\sum \limits _{s\in \textrm{allowed}}{\left[{\tau}_{is}(t)\right]}^{\alpha}\cdot {\left[{\delta}_{is}(t)\right]}^{\beta }}$$
(7)

In the formula, {allowed} denotes the set of path points that can still be selected; τij(t) is the residual pheromone on the path from i to j; δij(t) is the path visibility; α is the pheromone heuristic value; and β is the fitness heuristic value.

  3. Update the pheromone concentration. The pheromone τij on a path is updated according to the local pheromone update rule when an ant finishes that path, and the global pheromone τij(t+1) is updated when the ants have finished all paths. The update formula is given in Eq. (8), where Δτnij(t) is the pheromone released by the nth ant on the path.

$$\left\{\begin{array}{l}\tau_{ij}=\left(1-\xi\right)\tau_{ij}\\\tau_{ij}\left(t+1\right)=\left(1-\rho\right)\tau_{ij}(t)+\sum \limits _{n=1}^k\Delta\tau_{ij}^n(t)\end{array}\right.$$
(8)

  4. Judge the termination condition. While the number of iterations is less than the maximum number of iterations, steps 2 and 3 are repeated; otherwise, the calculation terminates and the optimal solution is output.
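
Eqs. (7) and (8) translate directly into a few lines of NumPy. In this sketch the pheromone and visibility are stored as square matrices `tau` and `delta`, and the visibility is taken as the reciprocal of the path length; the text does not define δ, so this common choice is an assumption, as are the function names.

```python
import numpy as np

def transition_prob(tau, delta, i, allowed, alpha=1.0, beta=5.0):
    """Eq. (7): probability of moving from point i to each point in `allowed`."""
    w = tau[i, allowed] ** alpha * delta[i, allowed] ** beta
    return w / w.sum()

def local_update(tau, i, j, xi):
    """First line of Eq. (8): decay the pheromone on the path (i, j) just travelled."""
    tau = tau.copy()
    tau[i, j] *= (1.0 - xi)
    return tau

def global_update(tau, rho, deposits):
    """Second line of Eq. (8): evaporation plus the deposits of all ants."""
    return (1.0 - rho) * tau + np.sum(deposits, axis=0)
```

With uniform pheromone, α = 1, and β = 5, a path of length 1 is preferred over a path of length 2 by a factor of 2⁵ = 32, which shows how strongly β weights the visibility term.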

4.2 GA-ACO-BP neural network thermal error prediction model

4.2.1 Modeling process

The numbers of input layer, hidden layer, and output layer nodes are m, h, and n, respectively. wij is the input layer-hidden layer weight, vjq is the hidden layer-output layer weight, r is the hidden layer threshold, and s is the output layer threshold. The process of building the GA-ACO-BP neural network prediction model is as follows:

  1. Determine the structure of the BP network, and initialize the basic parameters of the network and the ant colony.

  2. The k ants perform path searches. Each ant selects a random path and updates its pheromone concentration in real time.

  3. The GA performs crossover and mutation operations on the ACO.

  4. The optimal solution found by the ACO is used as the parameters of the BP network, and the error eq is calculated according to Eq. (9).

$${e}_q={y}_q-{\hat{y}}_q$$
(9)

  5. ACO optimizes the weights and thresholds of the BP network, and the update formulas are shown in Eqs. (10) and (11) [28]. xi is the ith input, Hj is the hidden layer output, and η is the learning rate. i∈[1,m]; j∈[1,h]; q∈[1,n].

$$\left\{\begin{array}{l}w_{ij}=w_{ij}+\eta H_j\left(1-H_j\right)x_i\sum \limits _{q=1}^nv_{jq}e_q\\v_{jq}=v_{jq}+\eta H_je_q\end{array}\right.$$
(10)
$$\left\{\begin{array}{l}r_j=r_j+\eta H_j\left(1-H_j\right)\sum \limits _{q=1}^nv_{jq}e_q\\s_q=s_q+e_q\end{array}\right.$$
(11)

  6. Determine whether the termination conditions are met. If the accuracy requirements are met, output the optimal solution; otherwise, repeat steps 4 and 5. Figure 7 shows the process of building the GA-ACO-BP neural network model.

Fig. 7 GA-ACO-BP neural network modeling process
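
One update of Eqs. (9) to (11) for a single training sample can be sketched in NumPy as follows. The sketch assumes a logistic sigmoid in the hidden layer, which is what the factor Hj(1−Hj) implies, a linear output layer, and thresholds that enter as additive biases; the output threshold is updated without the learning rate, exactly as Eq. (11) is written. The function name `bp_step` and the array layout are hypothetical.

```python
import numpy as np

def bp_step(x, y, w, v, r, s, eta=0.01):
    """One pass of Eqs. (9)-(11). Shapes: x (m,), y (n,), w (m, h), v (h, n), r (h,), s (n,)."""
    H = 1.0 / (1.0 + np.exp(-(x @ w + r)))    # hidden layer output (sigmoid, assumed)
    y_hat = H @ v + s                          # linear output layer (assumed)
    e = y - y_hat                              # Eq. (9)
    back = H * (1.0 - H) * (v @ e)             # H_j (1 - H_j) * sum over outputs of v * e
    w_new = w + eta * np.outer(x, back)        # Eq. (10), input-hidden weights
    v_new = v + eta * np.outer(H, e)           # Eq. (10), hidden-output weights
    r_new = r + eta * back                     # Eq. (11), hidden thresholds
    s_new = s + e                              # Eq. (11), output thresholds (no eta, as written)
    return w_new, v_new, r_new, s_new, e
```

In the scheme described above, the weights and thresholds passed to such a step would come from the GA-ACO search rather than from random initialization.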

4.2.2 Model parameter setting

Temperature data and thermal displacement data collected in the thermal deformation experiment are used for the thermal error modeling. The temperature-sensitive points T2, T7, T8, and T11 are taken as the model input, the thermal error is taken as the output, and the number of hidden layer nodes is 20. The GA population size is 20, the iteration limit is set to 100, and the crossover and mutation probabilities are 0.9 and 0.005, respectively. The number of ants k is 50, and the pheromone heuristic value α, the fitness heuristic value β, the pheromone volatility factor ρ, and the target error E are set to 1, 5, 0.9, and 0.01, respectively. The network uses the tansig function as the hidden layer activation function, the purelin function as the output layer activation function, and the trainscg function as the training function. The learning rate is set to 0.01, and the neural network is trained.

4.3 Analysis of prediction results

The data at 3000 r/min are used as training samples, and the data at 6000 r/min and 9000 r/min are used as verification samples to evaluate the model’s prediction ability. After the model is trained, the samples at each speed are input for prediction. The fitness values of the ACO and GA-ACO algorithms are shown in Fig. 8. At different speeds, the GA-ACO algorithm needs fewer iterations and reaches a lower fitness value. Therefore, the GA-ACO algorithm converges faster and has stronger optimization capability.

Fig. 8 Convergence curve of algorithm fitness function at each speed

To further evaluate the accuracy of the GA-ACO-BP model, this paper uses the same test samples to establish an RBF neural network and an ACO-BP neural network for comparison. The predicted values of the thermal errors for each model at different speeds are shown in Fig. 9. The prediction curves and residuals show that:

Fig. 9 Thermal error prediction curve

  • Prediction effect: The predicted value of the ACO-BP neural network is higher than the measured value at 6000 r/min and lower than the measured value at 9000 r/min. The RBF neural network predicts slightly more accurately than the ACO-BP neural network, and the GA-ACO-BP neural network has the best prediction accuracy and fits better than the RBF neural network. Especially in the second half of the prediction, the predicted value of the GA-ACO-BP neural network almost perfectly matches the measured value.

  • Residual comparison: The residual range of the ACO-BP neural network is much larger than those of the RBF and GA-ACO-BP neural networks. The residual of the ACO-BP neural network reaches up to 3 μm, while the average residuals of the RBF and GA-ACO-BP neural networks are within 2 μm. The residual fluctuation of the GA-ACO-BP neural network is the smallest, and that of the ACO-BP neural network is the largest. From the above analysis, the GA-ACO-BP neural network is superior to the ACO-BP and RBF neural networks in prediction accuracy and goodness of fit.

4.4 Evaluation of model prediction effect

To better evaluate the prediction performance of the thermal error model, this paper selects the correlation coefficient (R), the coefficient of determination (R²), RMSE, and MAE as the prediction accuracy evaluation indexes. The closer R is to 1, the stronger the linear correlation between the predicted and measured values; the closer R² is to 1, the better the fit of the prediction curve; the smaller the RMSE, the better the nonlinear fit; and the smaller the MAE, the smaller the error.
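
For reference, the four indexes can be computed as in the sketch below. It assumes the standard textbook definitions (Pearson correlation for R, and R² = 1 − SS_res/SS_tot); the exact formulas behind Tables 4 and 5 are not restated in this section and may differ in detail. The function name and the sample values are hypothetical.

```python
import math


def evaluation_metrics(measured, predicted):
    """Return R, R^2, RMSE and MAE for paired measured/predicted values."""
    n = len(measured)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    # Sums of squares and cross-products about the means.
    ss_m = sum((m - mean_m) ** 2 for m in measured)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    cross = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    if ss_m == 0 or ss_p == 0:
        raise ValueError("R and R^2 are undefined for constant sequences")
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return {
        "R": cross / math.sqrt(ss_m * ss_p),   # Pearson correlation
        "R2": 1.0 - ss_res / ss_m,             # coefficient of determination
        "RMSE": math.sqrt(ss_res / n),         # root mean square error
        "MAE": sum(abs(m - p) for m, p in zip(measured, predicted)) / n,
    }


if __name__ == "__main__":
    # Hypothetical values, NOT data from Tables 4 or 5.
    print(evaluation_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]))
```

RMSE penalizes large residuals more heavily than MAE, which is why the two can rank models differently when a few large errors are present.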

The evaluation index values at different speeds are shown in Tables 4 and 5. For the Z-directional thermal error, the GA-ACO-BP neural network has the highest average R and R², 0.9965 and 0.993, and the lowest average RMSE and MAE, 0.567 and 0.388, respectively. The model evaluation indexes at different speeds are shown in Fig. 10a. At 6000 r/min, compared with the RBF neural network and BP neural network, the R of the GA-ACO-BP neural network increased by 0.004 and 0.012, R² increased by 0.069 and 0.022, RMSE decreased by 48.7% and 64.7%, and MAE decreased by 69.1% and 52.4%, respectively. At 9000 r/min, compared with the RBF neural network and BP neural network, the R of the GA-ACO-BP neural network increased by 0.002 and 0.005, R² increased by 0.016 and 0.01, RMSE decreased by 42% and 50.6%, and MAE decreased by 60.5% and 55.1%, respectively.

Table 4 Evaluation index values of model Z-directional thermal errors at different rpm
Table 5 Evaluation index values of model X-directional thermal errors at different rpm
Fig. 10 Model evaluation indexes at different speeds

For the X-directional thermal error, the average values of R and R² of the GA-ACO-BP neural network are 0.998 and 0.996, and the average values of RMSE and MAE are 0.207 and 0.1695, respectively. The model evaluation indexes at different speeds are shown in Fig. 10b. At 6000 r/min, compared with the RBF neural network and BP neural network, the R of the GA-ACO-BP neural network increased by 0.014 and 0.02, R² increased by 0.026 and 0.065, RMSE decreased by 55.4% and 68%, and MAE decreased by 46.4% and 68.1%, respectively. At 9000 r/min, compared with the RBF neural network and BP neural network, the R of the GA-ACO-BP neural network increased by 0.006 and 0.004, R² increased by 0.018 and 0.012, RMSE decreased by 47.6% and 46.5%, and MAE decreased by 56.4% and 54.2%, respectively.
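
Note that the comparisons above mix two kinds of change: R and R² are reported as absolute increases, whereas RMSE and MAE are reported as percentage decreases relative to the baseline model. A minimal sketch of both calculations follows; the function names and the numbers are hypothetical and are not entries from Tables 4 or 5.

```python
def absolute_increase(baseline, improved):
    """Absolute gain of a score such as R or R^2 over a baseline model."""
    return improved - baseline


def percent_decrease(baseline, improved):
    """Relative reduction (in %) of an error metric such as RMSE or MAE."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - improved) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical values, NOT taken from the paper's tables.
    print(absolute_increase(0.90, 0.95))   # gain in a correlation-type score
    print(percent_decrease(2.0, 1.0))      # reduction in an error-type metric
```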

Comparing R shows that the GA-ACO-BP neural network prediction curve has the highest degree of linear correlation with the measured values. Comparing R² shows that its curve-fitting effect is the best. Comparing RMSE and MAE shows that it has the best nonlinear fit and the smallest residual error. Compared with the RBF neural network, the fitting effect of the ACO-BP neural network is poor, while the prediction accuracy and fitting effect of the GA-ACO-BP neural network are significantly improved. In other words, the BP neural network optimized by the GA and ACO algorithms has strong robustness and high prediction accuracy.

5 Conclusion

Taking the A03 motorized spindle as the research object, this paper first sets up temperature measurement points according to the temperature field distribution and collects experimental data, then uses the K-means++ algorithm to screen temperature-sensitive points, and finally establishes a GA-ACO-BP neural network model based on the characteristic temperature points to predict the thermal error. The following conclusions are drawn:

  1. The axial thermal elongation of the motorized spindle is the largest, and the A-axis has the greatest influence on the axial thermal displacement. The thermal deformation varies with the temperature at the temperature-sensitive points, and the two often show a time-lagged, nonlinear relationship.

  2. The K-means++ algorithm effectively mitigates the accuracy problem caused by randomly initializing the centroids. Combined with gray relational analysis, it reduces the 11 temperature measurement points to 4, which reduces the collinearity between temperature variables and improves the operating efficiency of the model.

  3. The GA-ACO-BP neural network model predicts the thermal error better than the compared models, with an average residual of less than 1 μm. Compared with the ACO-BP neural network model, its error metrics (RMSE and MAE) are more than 45% lower, so it has the highest prediction accuracy and the best fit.

  4. The GA and ACO algorithms optimize the main parameters and objective function of the BP network, and the resulting convergence speed and prediction accuracy are better than those of the traditional neural network. The Z-directional and X-directional prediction residuals are both within 2 μm, which improves the accuracy of the model.

  5. The neural network model has high prediction accuracy and low cost, so establishing an effective compensation mechanism or compensation system is particularly important. Implanting the GA-ACO-BP neural network model into the thermal error compensation system is the direction of future work and provides a new idea for real-time compensation of motorized spindles.
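
To make conclusion 2 concrete, the sketch below shows the two ingredients it names: K-means++ seeding, which spreads the initial centroids instead of drawing them uniformly at random, and the gray relational grade, which can rank candidate temperature series against the thermal error. This is a minimal illustration and not the authors' implementation: the function names, the min-max normalization, the distinguishing coefficient `rho = 0.5`, and the sample series are all assumptions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seeds(points, k, rng):
    """Pick k initial centroids; far-away points are more likely to be chosen."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen centroid.
        d2 = [min(squared_distance(p, c) for c in centers) for p in points]
        if sum(d2) == 0:  # every point coincides with a centroid
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


def min_max(seq):
    """Scale a sequence to [0, 1]; a constant sequence maps to zeros."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        return [0.0] * len(seq)
    return [(v - lo) / (hi - lo) for v in seq]


def gray_relational_grades(reference, candidates, rho=0.5):
    """Gray relational grade of each candidate series against the reference."""
    ref = min_max(reference)
    deltas = [[abs(r - c) for r, c in zip(ref, min_max(cand))]
              for cand in candidates]
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)
    if d_max == 0:  # all candidates match the reference exactly
        return [1.0] * len(candidates)
    return [sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
            for row in deltas]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical temperature series (one tuple per measurement point).
    temps = [(20.0, 24.0, 27.0, 29.0), (20.0, 21.0, 21.5, 22.0),
             (19.0, 23.5, 26.0, 28.5), (20.0, 20.5, 23.0, 21.0)]
    thermal_error = [0.0, 4.0, 7.0, 9.0]  # hypothetical reference series
    print(kmeanspp_seeds(temps, 2, rng))
    print(gray_relational_grades(thermal_error, temps))
```

Under these assumptions, one plausible selection rule is to cluster the temperature series and then keep, from each cluster, the point with the highest grade against the thermal error.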