1 Introduction

Many factors affect the accuracy of machine tools, including geometric errors, thermal errors, control errors, and cutting force errors [1]. Thermal errors are the largest single source, accounting for approximately 40–70% of the total error [2, 3]. As a critical component of high-speed CNC machine tools, the motorized spindle generates a large amount of heat during operation. This heat causes thermal expansion of spindle parts and tool deformation, which degrades the accuracy of the motorized spindle and can even alter the preload of the bearings, and thus affects the machining accuracy and service life of the machine tool [4]. Reducing thermal error is therefore key to the development of high-speed precision machining technology. Researchers have used the finite element method [5, 6] and the finite difference method [7] to simulate the thermal characteristics of the motorized spindle. Because the boundary conditions and heat transfer characteristics cannot be clearly defined, these techniques provide only qualitative analysis, and other methods are still needed to reduce or eliminate the errors generated during machining. The main methods for reducing the thermal error of a motorized spindle are thermal error avoidance, thermal error control, and thermal error compensation [8]. Among them, thermal error compensation does not require changing the mechanical properties (structure, material, etc.) of the motorized spindle. Once a thermal error model has been established, the size of the thermal error can be predicted in advance and offset by compensation, which makes this the most economical and effective method.

Choosing appropriate temperature-sensitive points can effectively increase the accuracy and robustness of the thermal error model. Too many temperature-sensitive points increase the collinearity between temperature variables and complicate the thermal error model. Too few cannot adequately capture the temperature field distribution of the motorized spindle and may lose critical information, so the thermal error model cannot reach the required prediction accuracy. The screening of temperature-sensitive points mainly consists of grouping the temperature variables and selecting the temperature measuring points that are highly correlated with the thermal error. Guo et al. [9] selected the temperature-sensitive points using grey correlation analysis and reduced the input variables of the model from 24 to 8, which reduced the calculation time of the thermal error model and improved the measurement efficiency of thermal error. Chiu et al. [10] used the Pearson correlation coefficient to describe the correlation between temperature measuring points and thermal error and removed the temperature points with low correlation. Liu et al. [11] used a fuzzy clustering algorithm and grey relational analysis to optimize the temperature measuring points, which reduced the number of independent variables in the model. Li et al. [12] proposed a method for screening temperature-sensitive points based on an improved binary grasshopper optimization algorithm, which improved the accuracy of the thermal error prediction model compared with screening by the traditional fuzzy c-means clustering method. Liu et al. [13] proposed a clustering algorithm that uses the elbow method to determine the optimal number of key temperature points, which effectively reduces the dimensionality of the initial temperature points. Li et al. [14] proposed a temperature-sensitive point selection method based on integrated temperature information, which solved the problems of incomplete clustering and of the same number of temperature-sensitive points giving different errors. Miao et al. [15] proposed a modeling method based on a principal component regression algorithm, which can eliminate the influence of multi-collinearity among temperature variables.

In the traditional cluster analysis of temperature-sensitive points, only the internal data of the temperature measurement points are considered, and the degree of correlation between these measurement points and the thermal error is ignored. As a result, temperature measuring points that are highly correlated with the thermal error may be overlooked. In addition, the temperature-sensitive points change with the number of clusters, and there is currently no general method to determine the optimal number of clusters. Neither situation is conducive to thermal error modeling. To avoid these problems, a new method for screening temperature-sensitive points is proposed. The optimal number of clusters is determined by pedigree clustering and experience, and k-means clustering is then used to group the temperature measurement points according to that number. At the same time, the correlation between each temperature measuring point and the thermal error is analyzed by the grey correlation degree, and in each group the temperature measuring point with the largest correlation degree with the thermal error is selected as the temperature-sensitive point.

Thermal errors in high-speed spindles are influenced by operating conditions, ambient temperature, cooling conditions, and other parameters, resulting in a time-varying, nonlinear, and non-stationary process. Accurate and robust thermal error modeling is therefore difficult. Scholars at home and abroad have established thermal error models with strong robustness and high prediction accuracy using multiple regression, support vector machines, artificial neural networks, and grey system theory. Wu et al. [16] used a deep learning convolutional neural network (CNN) to model the axial and radial thermal errors of the spindle. The prediction accuracy of the model is higher than that of the BP model, and it has good robustness. Abdulshahed et al. [17] combined the grey system model with the artificial neural network and proposed a grey neural network model with convolution integral, which overcomes the shortcomings of applying the two models independently. Wu et al. [18] proposed a thermal error model based on a genetic algorithm optimized back propagation (GA-BP) neural network, which increased the prediction accuracy of the thermal deformation of a turning center. Yang et al. [19] proposed the integrated recurrent neural network (IRNN) thermal error modeling method and verified that the robustness of the IRNN model was better than that of multivariate regression analysis (MRA), the multilayer feedforward neural network (MFN), and the recurrent neural network (RNN) under a variety of working conditions. Hou et al. [20] proposed a modeling method for the thermal error of CNC lathes based on a multi-objective genetic algorithm and verified the stability of the model under different working conditions. Liu et al. [21] optimized the hyper-parameters of LSTM neural networks with the grey wolf (GW) algorithm to remove the sensitivity of predictive performance to the hyper-parameters, which improved the robustness and generalization of thermal error models. Li et al. [22] proposed a random new inverse model to predict the thermal error by combining stochastic theory, a genetic algorithm, and a radial basis function neural network. This method can describe the thermal error more accurately while considering the randomness of the factors that influence it. Liu et al. [23] combined the ridge regression model and the state-space model, and the resulting robust state-space model has high model accuracy and prediction robustness under changes in temperature and spindle speed. Li et al. [24] established a generalized regression neural network thermal error model based on particle swarm optimization (PSO-GRNN), which has high prediction accuracy and generalization ability under different feed speeds.

To improve the prediction accuracy of thermal error models, many scholars combine population-based evolutionary algorithms with neural networks. The Elman neural network is a dynamic recurrent neural network that adds local memory units to the structure of the BP network. By storing its internal state, the network gains a dynamic mapping capability, which lets it adapt to time-varying characteristics and improves its learning ability. It can handle problems such as rapid optimization, fitting, and regression prediction, and is therefore suitable for modeling the thermal error of the motorized spindle. However, the Elman neural network tends to fall into local extrema, converges slowly, and has low efficiency. The sparrow search algorithm simulates the behavior of sparrows foraging for food and avoiding natural enemies in nature. The algorithm has certain advantages in convergence speed, accuracy, and stability compared with traditional optimization algorithms. Therefore, to increase the accuracy of the Elman neural network thermal error prediction model, the weights and thresholds of the Elman neural network are optimized by the sparrow search algorithm, and a thermal error prediction model of the high-speed motorized spindle based on the Elman neural network optimized by the sparrow search algorithm is established.

In summary, to establish a thermal error prediction model with high accuracy and good robustness, this paper first uses pedigree clustering and k-means clustering to group the temperature variables of the high-speed motorized spindle, which reduces the number of temperature measurement points and the complexity of the thermal error model. Secondly, the weights and thresholds of the Elman neural network are optimized by the sparrow search algorithm, and the thermal error model of the motorized spindle based on the SSA-Elman neural network is established to improve the prediction accuracy and robustness of the Elman neural network model. Finally, the validity of the prediction model is verified with experimental data, and its prediction performance is compared with that of the PSO-Elman neural network model, which provides a reference for thermal error modeling.

2 Temperature and thermal error experiment

2.1 Thermal characteristic analysis

A high-speed motorized spindle runs at high speed and has a compact internal structure in which heat accumulates easily, which causes thermal deformation and affects machining accuracy. To collect enough temperature information in the temperature data acquisition experiment, the thermal characteristics of the high-speed motorized spindle are analyzed and its temperature distribution is studied, which provides a reference for the arrangement of temperature measuring points. The A02 motorized spindle studied in this paper, produced by CNC technology in Ningbo, is mainly used for high-speed milling of workpiece surfaces in machining centers. Heat transfer in the motorized spindle includes heat conduction, heat convection, and heat radiation. The thermal boundary parameters of the motorized spindle are obtained by calculation and are presented in Tables 1 and 2.

Table 1 Heat generation rate of each component of motorized spindle
Table 2 Heat transfer coefficient of motorized spindle components

The above parameters are set in the finite element analysis software Workbench to solve the steady temperature field of a high-speed motorized spindle. The steady-state thermal analysis results of the motorized spindle are presented in Fig. 1.

Fig. 1 Steady-state thermal simulation results of motorized spindle

The simulation results show that the highest temperature inside the high-speed motorized spindle is about 58.7 °C, at the middle air gap of the built-in motor. The built-in motor is the largest heat source of the high-speed motorized spindle, and because there is no corresponding cooling device inside, heat dissipation is poor and this region is the hottest. The temperature of the front and rear bearings is relatively low: about 40.5 °C at the front bearing and about 43.5 °C at the rear bearing. The temperature at the cooling water jacket of the motorized spindle is the lowest, with a simulated value of 21.3 °C. The temperature field distribution shows that the temperature of the high-speed motorized spindle varies greatly along the axial direction and only slightly along the circumferential direction. Therefore, in the experiment, the temperature measuring points are arranged mainly along the axial direction of the motorized spindle so that enough temperature information can be collected.

2.2 Experimental platform construction

In this paper, the thermal error measurement experiment is carried out on the A02 motorized spindle used in the joint laboratory of SKY NC High-speed Motorized Spindle of Harbin University of Science and Technology. The surface temperature of the motorized spindle is measured by PT100 thermal resistors. The front and rear bearings of the motorized spindle are equipped with PT1000 thermal resistors. The above analysis shows that the main heat sources of the motorized spindle are the motor stator, rotor, front bearing, and rear bearing, and that the temperature changes considerably along the axial direction. Therefore, the temperature measuring points in the experiment are arranged mainly along the axial direction of the motorized spindle. A large number of temperature sensors must be arranged so that missing temperature information does not prevent the prediction model from reaching the required accuracy. Combined with laboratory research on the A02 motorized spindle, this paper selected 10 temperature measuring points. The layout of the temperature sensors is shown in Table 3. At different rotational speeds, the motorized spindle has axial and radial thermal displacements. Compared with the axial thermal displacement, the radial thermal drift is minimal and can be ignored [25]. In this paper, the axial thermal error data of the motorized spindle are collected by an eddy current displacement sensor. The installation of the temperature and displacement sensors is shown in Fig. 2.

Table 3 Scheme design of temperature measurement point
Fig. 2 Installation of temperature and displacement sensors

2.3 Result analysis

In actual machining, a high-speed motorized spindle often works at a constant speed for a long time. In this experiment, the A02 high-speed motorized spindle was taken as the research object. The experiment was divided into five groups by rotation speed, from 2000 to 10000 r/min (2000, 4000, 6000, 8000, and 10000 r/min). Each group ran for 180 min, during which the temperature and thermal error data of the motorized spindle were collected. To reduce the influence of ambient temperature on the experimental data, the ambient temperature of the laboratory was kept at 22 °C. Because the internal structure of the motorized spindle is compact and dissipates heat poorly, an interval of 12 h was left between experiments to reduce the interaction between successive groups, and the next experiment was carried out only after the motorized spindle had completely cooled. The temperature and thermal error data collected according to the above scheme are shown in Fig. 3.

Fig. 3 Temperature and thermal displacement curves of motorized spindle at different speeds. a Temperature measuring point curve of motorized spindle at 2000 r/min. b Temperature measuring point curve of motorized spindle at 4000 r/min. c Temperature measuring point curve of motorized spindle at 6000 r/min. d Temperature measuring point curve of motorized spindle at 8000 r/min. e Temperature measuring point curve of motorized spindle at 10000 r/min. f Axial thermal drift curve of motorized spindle

The curves of each temperature measuring point at different rotational speeds show that the temperature of each measuring point gradually increases with time and then reaches a dynamic balance. Temperature measurement points T5 and T10 have higher temperatures because they are close to internal heat sources. The other temperature measuring points are located on the surface of the motorized spindle, so their temperatures are lower. The axial thermal displacement curves show that the axial thermal error of the motorized spindle follows roughly the same trend at different speeds and increases as the speed increases. The temperature and thermal error data obtained from the experiment provide data support for the temperature measurement point optimization and thermal error modeling of the motorized spindle. The thermal characteristic simulation of the motorized spindle shows that the front and rear bearings are among the largest heat sources of the spindle. The temperatures of the front and rear bearings measured in the experiment are also high, which supports the accuracy of the simulation. Therefore, once the thermal simulation model has been corrected, the temperature distribution of the motorized spindle under different working conditions can be obtained accurately simply by setting the simulation parameters.

3 Temperature measurement point optimization

To represent the full temperature field information with the temperature-sensitive point information of the motorized spindle, it is necessary not only to reflect the temperature field distribution of the motorized spindle, but also to consider the correlation and collinearity between the temperature measurement points. As a classical clustering analysis algorithm, the k-means algorithm has the advantages of simple modeling, low complexity, and fast grouping. However, the k-means algorithm needs to give the number of clusters in advance [26]. To obtain accurate clustering, this paper uses the pedigree map and experience to determine the optimal number of clusters and then combines the k-means algorithm to perform cluster analysis on each temperature measurement point. At the same time, the grey correlation degree is used to analyze the tightness of the relationship between the thermal error of the motorized spindle and the temperature measuring points. In each group, the temperature measuring point with the greatest correlation with thermal error is selected as the temperature-sensitive point.

3.1 Cluster analysis of temperature measurement points

3.1.1 Pedigree clustering

Pedigree clustering classifies the research objects according to the idea of biological taxonomy. First, each temperature measuring point is regarded as a class. The most similar temperature measuring points are then clustered into subclasses, and the resulting subclasses are merged again according to the similarity between classes. As the similarity threshold weakens, all subclasses are eventually clustered into one large class, and a pedigree graph ordered by similarity is obtained. The steps of pedigree clustering are as follows:

  1. The feature matrix is composed according to the characteristics of the objects to be classified:

    $$X = \left[ {\begin{array}{*{20}c} {x_{11} } & {x_{12} } & \cdots & {x_{1p} } \\ {x_{21} } & {x_{22} } & \cdots & {x_{2p} } \\ \vdots & \vdots & {} & \vdots \\ {x_{m1} } & {x_{m2} } & \cdots & {x_{mp} } \\ \end{array} } \right]$$
    (1)

    where xij is the jth feature value of the ith object.

  2. Set the initial distances between objects. Each of the n temperature measurement points (the rows of X, so n = m) initially forms its own class, and the shortest distance between classes (initially, between individual temperature measurement points) is calculated to obtain the distance matrix D0:

    $$D_{0} = \left[ {\begin{array}{*{20}c} 0 & {d_{12} } & \cdots & {d_{1n} } \\ {d_{21} } & 0 & \cdots & {d_{2n} } \\ \vdots & \vdots & {} & \vdots \\ {d_{n1} } & {d_{n2} } & \cdots & 0 \\ \end{array} } \right]$$
    (2)
  3. Find the minimum off-diagonal element of D0 and denote it Dpq. Merge Tp and Tq into a new class Tr = (Tp, Tq), delete the rows and columns of Tp and Tq from D0, and add the distances between the new class Tr and the remaining classes to obtain the matrix D1 of order n − 1.

  4. Repeat step 3 on D1 to obtain matrix D2, and continue in this way until all temperature measurement points are clustered into one large class.

  5. Draw the pedigree clustering diagram.
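
The merging procedure in steps 2 to 4 can be sketched in a few lines of Python. This is a minimal illustration, not the MATLAB routine used in the paper: it assumes Euclidean distance between feature rows and reads "shortest distance" as single linkage, and the function name `pedigree_clustering` and the toy feature rows are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pedigree_clustering(features):
    """Agglomerative clustering with shortest-distance (single) linkage.

    Returns the merge history as (members_a, members_b, distance) tuples,
    which is the information a pedigree diagram is drawn from.
    """
    clusters = [[i] for i in range(len(features))]  # every point starts as a class
    merges = []
    while len(clusters) > 1:
        best = None
        # Search the off-diagonal class distances for the minimum (step 3).
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = min(euclidean(features[a], features[b])
                        for a in clusters[p] for b in clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        merges.append((clusters[p], clusters[q], d))
        merged = clusters[p] + clusters[q]
        # Drop the two merged classes and add the new one (matrix order shrinks by 1).
        clusters = [c for idx, c in enumerate(clusters) if idx not in (p, q)]
        clusters.append(merged)
    return merges


# Hypothetical toy data: 4 measuring points, 2 temperature samples each.
features = [[20.0, 21.0], [20.5, 21.5], [30.0, 32.0], [45.0, 50.0]]
merges = pedigree_clustering(features)
for members_a, members_b, distance in merges:
    print(members_a, members_b, round(distance, 2))
```

Recomputing every class distance from the raw rows keeps the sketch short; updating the distance matrix in place, as the steps describe, gives the same merges with less work.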

3.1.2 K-means clustering

K-means clustering algorithm is an unsupervised clustering algorithm, with Euclidean distance as the similarity measure and the sum of squares of errors as the clustering criterion, which can complete the clustering analysis with the lowest similarity between classes and the highest similarity within the class. The steps of clustering temperature measurement points based on k-means are as follows:

  1. The optimal number of clusters k is determined by pedigree clustering analysis, and k temperature measuring points are selected as the initial cluster centers:

    $${L}^{(0)}=\left\{{x}_{1}^{(0)},{x}_{2}^{(0)},\cdots ,{x}_{k}^{(0)}\right\}$$
    (3)
  2. The Euclidean distance between each temperature measuring point and every cluster center is calculated, and each point is assigned to the nearest cluster center:

    $$\begin{aligned}{G}_{i}^{(0)}=&\left\{x:d\left(x,{x}_{i}^{(0)}\right)\le d\left(x,{x}_{j}^{(0)}\right),\right.\\&\left.j=1,2,\cdots ,k,\ j\ne i\right\}\quad(i=1,2,\cdots ,k)\end{aligned}$$
    (4)

    This gives the initial classification:

    $${G}^{(0)}=\left\{{G}_{1}^{(0)},{G}_{2}^{(0)},\cdots ,{G}_{k}^{(0)}\right\}$$
    (5)
  3. Calculate the mean of the temperature measuring points in each set, where ni is the number of points in the ith set:

    $${x}_{i}^{(1)}=\frac{1}{{n}_{i}}\sum_{{x}_{l}\in {G}_{i}^{(0)}}{x}_{l}\quad\left(i=1,2,\cdots ,k\right)$$
    (6)

    The average value in each set is taken as the new clustering center.

    $${L}^{(1)}=\left\{{x}_{1}^{(1)},{x}_{2}^{(1)},\cdots ,{x}_{k}^{(1)}\right\}$$
    (7)
  4. The Euclidean distance from each measuring point to the new cluster centers is calculated, and each point is assigned to the nearest cluster center. A new classification is obtained in the same way:

    $${G}^{(1)}=\left\{{G}_{1}^{(1)},{G}_{2}^{(1)},\cdots ,{G}_{k}^{(1)}\right\}$$
    (8)
  5. Repeat steps 3 and 4 until the cluster centers no longer change, at which point the final clustering result is obtained:

    $${G}^{(m)}=\left\{{G}_{1}^{(m)},{G}_{2}^{(m)},\cdots ,{G}_{k}^{(m)}\right\}$$
    (9)
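
The assignment and update steps above can be sketched as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: the function name `kmeans`, the one-dimensional toy data, and the choice of initial centers are hypothetical, and an empty cluster simply keeps its previous center.

```python
import math


def kmeans(points, centers, max_iter=100):
    """Plain k-means: assign each point to its nearest center, then recompute centers."""
    dim = len(points[0])
    for _ in range(max_iter):
        # Steps 2 and 4: assign every point to the nearest cluster center.
        groups = [[] for _ in centers]
        for idx, x in enumerate(points):
            nearest = min(range(len(centers)), key=lambda i: math.dist(x, centers[i]))
            groups[nearest].append(idx)
        # Step 3: the mean of each group becomes the new center.
        new_centers = []
        for i, members in enumerate(groups):
            if members:
                new_centers.append(
                    [sum(points[m][j] for m in members) / len(members) for j in range(dim)]
                )
            else:
                new_centers.append(list(centers[i]))  # assumption: keep an empty cluster's center
        # Step 5: stop when the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return groups, centers


# Hypothetical toy data: six measuring points described by one feature each.
points = [[20.0], [21.0], [22.0], [40.0], [41.0], [60.0]]
initial_centers = [list(points[i]) for i in (0, 3, 5)]  # k = 3 initial centers
groups, centers = kmeans(points, initial_centers)
print(groups)
print(centers)
```

Because the result depends on the initial centers, fixing k from the pedigree diagram first, as the text proposes, removes one source of arbitrariness but not the sensitivity to initialization.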

3.1.3 Clustering result evaluation

The Silhouette function draws a contour (silhouette) plot based on the k-means clustering results. From this plot, it can be seen whether the classification of each point is reasonable. The contour value of the ith point is calculated as follows:

$$s\left(i\right)=\frac{b\left(i\right)-a\left(i\right)}{\mathrm{max}\left(a\left(i\right),b\left(i\right)\right)},\quad i=1,2,\cdots ,n$$
(10)

where a(i) is the dissimilarity of point i to the class it currently belongs to, usually measured by the average Euclidean distance to the other points in that class, and b(i) is the minimum dissimilarity of point i to the other classes. The value range of s(i) is [−1, 1]. The larger the value of s(i), the more reasonable the classification of point i. When s(i) < 0, the classification of point i is unreasonable.
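
A small sketch of Eq. (10) may help make a(i) and b(i) concrete. It is an illustration only: the function name `silhouette_values` and the toy data are hypothetical, at least two classes are assumed, and a point that is alone in its class is given the value 0, which is one common convention and not something stated in the text.

```python
import math


def silhouette_values(points, labels):
    """Contour (silhouette) value s(i) of every point, following Eq. (10)."""
    values = []
    for i, x in enumerate(points):
        own = [math.dist(x, y) for j, y in enumerate(points)
               if j != i and labels[j] == labels[i]]
        if not own:
            values.append(0.0)  # assumption: singleton classes get s(i) = 0
            continue
        a = sum(own) / len(own)  # mean distance to the other points of the same class
        # b: smallest mean distance from point i to the points of any other class
        b = min(
            sum(math.dist(x, y) for y, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Hypothetical toy data with an obvious two-class structure.
points = [[0.0], [1.0], [10.0], [11.0]]
good = silhouette_values(points, [0, 0, 1, 1])  # sensible grouping
bad = silhouette_values(points, [0, 1, 0, 1])   # deliberately mixed grouping
print([round(v, 3) for v in good])
print([round(v, 3) for v in bad])
```

With the sensible grouping every value is close to 1, while the mixed grouping produces negative values, which is the pattern the text uses to judge whether a classification is reasonable.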

3.2 Grey relational analysis

Cluster analysis only classifies the temperature measurement points and cannot determine the location of the optimal temperature measurement points. The grey correlation degree was used to analyze the correlation degree between each temperature measurement point and the thermal error. The temperature measurement points with the greatest correlation degree with the thermal error were selected from each group.

  1. Data normalization. To avoid numerical error, the data are made dimensionless by dividing each sequence by its mean:

    $${x}_{i}\left(n\right)=\frac{{x}_{i}^{(0)}\left(n\right)}{\overline{{x }_{i}}}$$
    (11)

    where: xi(n) is the processed data, xi(0)(n) is the original data, and \(\overline{{x}_{i}}\) is the mean value of the data in column i.

  2. Calculate the grey correlation coefficient. Let the thermal error data be \(x_{0} = \left\{ {x_{0} \left( k \right)|k = 1,2, \cdots ,m} \right\}\) and the temperature measurement point data be \(x_{i} = \left\{ {x_{i} \left( k \right)|i = 1,2, \cdots ,n; \, k = 1,2, \cdots ,m} \right\}\). The correlation coefficient of x0 with respect to xi at point k is then calculated as follows:

    $${\xi }_{0i}\left(k\right)=\frac{\underset{i}{\mathrm{min}}\underset{k}{\mathrm{min}}\left|{x}_{0}\left(k\right)-{x}_{i}\left(k\right)\right|+\rho \underset{i}{\mathrm{max}}\underset{k}{\mathrm{max}}\left|{x}_{0}\left(k\right)-{x}_{i}\left(k\right)\right|}{\left|{x}_{0}\left(k\right)-{x}_{i}\left(k\right)\right|+\rho \underset{i}{\mathrm{max}}\underset{k}{\mathrm{max}}\left|{x}_{0}\left(k\right)-{x}_{i}\left(k\right)\right|}$$
    (12)

    where ρ is the resolution coefficient, generally ρ = 0.5. In practical calculation, the ρ value can be adjusted appropriately to improve the resolution.

  3. Calculate the grey correlation degree. The grey correlation degree between the thermal error sequence and each temperature measuring point sequence is the mean of the correlation coefficients of the two sequences over all time points:

    $$r_{0i} = \frac{1}{m}\sum\limits_{k = 1}^{m} {\xi_{0i} \left( k \right)} , \quad i = 1,2, \cdots ,n$$
    (13)
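
The three steps can be sketched compactly as below. This is an illustrative sketch, not the authors' code: the function name `grey_relational_degrees` and the toy sequences are hypothetical, every sequence is assumed to have a non-zero mean, and the case where all sequences coincide with the reference after normalization is handled by returning a degree of 1 for each.

```python
def grey_relational_degrees(reference, candidates, rho=0.5):
    """Grey correlation degree of each candidate sequence with the reference, Eqs. (11)-(13)."""

    def normalise(seq):
        mean = sum(seq) / len(seq)  # Eq. (11): divide by the sequence mean
        return [v / mean for v in seq]

    x0 = normalise(reference)
    xs = [normalise(c) for c in candidates]
    diffs = [[abs(a - b) for a, b in zip(x0, xi)] for xi in xs]
    d_min = min(min(row) for row in diffs)  # two-level minimum over i and k
    d_max = max(max(row) for row in diffs)  # two-level maximum over i and k
    if d_max == 0:
        return [1.0] * len(candidates)  # assumption: identical sequences are fully correlated
    degrees = []
    for row in diffs:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]  # Eq. (12)
        degrees.append(sum(coeffs) / len(coeffs))  # Eq. (13)
    return degrees


# Hypothetical toy data: a thermal error sequence and two temperature sequences.
thermal_error = [1.0, 2.0, 3.0, 4.0]
temperatures = [
    [2.0, 4.0, 6.0, 8.0],  # same shape as the thermal error after normalisation
    [4.0, 3.0, 2.0, 1.0],  # opposite trend
]
degrees = grey_relational_degrees(thermal_error, temperatures)
print([round(r, 3) for r in degrees])
```

The first sequence has the same shape as the reference once both are divided by their means, so its degree is 1, while the sequence with the opposite trend scores lower.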

3.3 Result analysis

To eliminate the collinearity between the temperature measuring points and reduce the complexity of the thermal error model, this paper uses pedigree clustering combined with k-means clustering to cluster the temperature measuring points. It uses grey correlation degree to analyze the correlation degree between the temperature measuring points and the thermal error, to screen the temperature-sensitive points.

From the pedigree cluster analysis in Sect. 3.1, the pedigree diagram of each temperature measurement point clustering is obtained using MATLAB software, as shown in Fig. 4.

Fig. 4 Cluster diagram of pedigree

The general range of the number of clusters is [Kmin, Kmax], where Kmin is usually taken as 2 and Kmax is usually taken as \(\sqrt{n}\), with n the total number of samples in the data set. In this paper, ten temperature measuring points are selected, so the range of the number of clusters is [2, 3.16]. When the points are divided into two categories, T10 forms one category and the rest form the other. In that case, the inter-class distance between T5 and T1, T2, T3, T4, T6, T7, T8, and T9 is 68.12, which is too large for them to be placed in one group. When the points are divided into four categories, the distance between T1, T2, T3, and T9 and T4, T6, T7, and T8 is 5.45, which is too close to justify splitting them into two groups. Based on these two situations, dividing the ten temperature points into three groups gives the best result.
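
As a quick check of the range quoted above, with n = 10 measuring points:

$$K_{\min } = 2,\qquad K_{\max } = \sqrt{n} = \sqrt{10} \approx 3.16$$

so the only integer cluster counts inside the range are 2 and 3.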

The optimal number of classes, 3, is used as the k value for k-means clustering. Cluster analysis is then carried out, and the resulting groups are shown in Table 4.

Table 4 Clustering results

The Silhouette function is used to analyze the clustering results, and the contour values of each temperature measuring point are shown in Fig. 5. It can be seen from the figure that the contour value of each measuring point is close to 1 when divided into three categories, so the classification is reasonable.

Fig. 5 Contour value of each temperature measuring point

In this paper, to verify the accuracy and robustness of the model, the 6000 r/min data are selected from the five rotational speeds as the training data set, and the data at the remaining rotational speeds are used as the prediction set for verification. Through grey correlation degree analysis, the grey correlation degree between each temperature measurement point and the axial thermal error at 6000 r/min is obtained, as shown in Fig. 6.

Fig. 6 Grey relational degree between each temperature measurement point and thermal error

According to the clustering results of the temperature measuring points and the grey correlation degree between each temperature measuring point and the axial thermal error, the temperature measuring point with the maximum grey correlation degree in each group is selected as the temperature-sensitive point. Thus, T7 of the first group, T5 of the second group, and T10 of the third group are obtained as the temperature-sensitive points of the motorized spindle.
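
The selection rule itself is a per-group maximum, as the following sketch shows. The group membership is inferred from the discussion of the pedigree diagram (Table 4 is not reproduced here), and the grey correlation degrees are made-up placeholders chosen for illustration, not the values plotted in Fig. 6. The function name `select_sensitive_points` is hypothetical.

```python
def select_sensitive_points(groups, degrees):
    """Pick, in each group, the measuring point with the largest grey correlation degree."""
    return [max(group, key=lambda name: degrees[name]) for group in groups]


# Group membership inferred from the text; treat it as an assumption.
groups = [
    ["T1", "T2", "T3", "T4", "T6", "T7", "T8", "T9"],
    ["T5"],
    ["T10"],
]

# Placeholder grey correlation degrees (NOT the paper's data).
degrees = {
    "T1": 0.62, "T2": 0.60, "T3": 0.58, "T4": 0.66, "T5": 0.71,
    "T6": 0.64, "T7": 0.70, "T8": 0.65, "T9": 0.59, "T10": 0.73,
}

sensitive_points = select_sensitive_points(groups, degrees)
print(sensitive_points)
```

Choosing one point per group keeps the selected inputs in different clusters, which is how the method limits collinearity while still favouring points that track the thermal error.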

4 Thermal error modeling analysis

The thermal error caused by the uneven temperature distribution of the motorized spindle is nonlinear and time-varying. The artificial neural network has good nonlinear mapping and self-learning abilities, which make it especially suitable for thermal error modeling. This paper adopts the Elman dynamic neural network, which is based on the BP network structure with a receiving (context) layer added as a one-step delay operator. This gives the network the ability to adapt to time-varying characteristics and enhances its global stability and learning ability. The deficiency of the Elman neural network lies in the random initial assignment of its weights and thresholds, which makes it prone to falling into local minima and lowers the prediction accuracy. Therefore, this paper introduces the sparrow search algorithm to optimize the initial weights and thresholds of the Elman neural network. Finally, the SSA-Elman neural network thermal error prediction model of the motorized spindle is established.
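
To make the role of the receiving layer concrete, the following is a minimal sketch of an Elman forward pass. It is not the network configuration used in this paper: the tanh hidden activation, the linear single output, the zero initial context, the function name `elman_forward`, and all weight values are assumptions chosen for illustration.

```python
import math


def elman_forward(inputs, w_in, w_ctx, b_hid, w_out, b_out):
    """Forward pass of a single-output Elman network over a sequence of input vectors."""
    hidden_size = len(b_hid)
    context = [0.0] * hidden_size  # receiving layer starts empty (assumption)
    outputs = []
    for x in inputs:
        hidden = []
        for h in range(hidden_size):
            s = b_hid[h]
            s += sum(w_in[h][j] * x[j] for j in range(len(x)))
            # The receiving layer feeds back the previous hidden state.
            s += sum(w_ctx[h][c] * context[c] for c in range(hidden_size))
            hidden.append(math.tanh(s))
        outputs.append(b_out + sum(w_out[h] * hidden[h] for h in range(hidden_size)))
        context = hidden  # one-step delay: store the state for the next time step
    return outputs


# Hypothetical weights: 1 input, 2 hidden units, 1 output.
w_in = [[0.5], [-0.3]]
w_ctx = [[0.2, 0.1], [0.0, 0.4]]
b_hid = [0.0, 0.0]
w_out = [1.0, 1.0]
b_out = 0.0

# The same input twice gives different outputs because of the stored state.
outputs = elman_forward([[1.0], [1.0]], w_in, w_ctx, b_hid, w_out, b_out)
print([round(v, 4) for v in outputs])
```

In the SSA-Elman model, the entries of `w_in`, `w_ctx`, `w_out` and the thresholds would be the quantities whose initial values the sparrow search algorithm optimizes.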

4.1 Principles of sparrow algorithm

The sparrow search algorithm, proposed in 2020, is a new swarm intelligence algorithm based on simulating the foraging and predator-evading behaviors of sparrow populations in nature, with significant advantages in convergence speed, accuracy, and stability. The population of sparrows is divided into discoverers, joiners, and scouts. The discoverers provide the foraging area and direction for the whole population. Joiners forage based on the clues provided by the discoverers and compete for resources to obtain more food and energy. Scouts, the highly alert individuals on the periphery, send out warning signals when there is danger, and the population then responds with anti-predator behavior. The algorithm process is as follows:

  1. Establish the sparrow population. In the population matrix, n is the number of sparrows, and d is the dimension of the optimization problem (the variables to be optimized are the initial weights and thresholds of the neural network):

    $$X=\left[\begin{array}{cccc}{x}_{11}& {x}_{12}& \cdots & {x}_{1d}\\ {x}_{21}& {x}_{22}& \cdots & {x}_{2d}\\ \vdots & \vdots & & \vdots \\ {x}_{n1}& {x}_{n2}& \cdots & {x}_{nd}\end{array}\right]$$
    (14)

    The ability of each sparrow to find food is expressed by its fitness value. The fitness function f is selected according to the actual problem, and the fitness values of the population are expressed as:

    $${F}_{X}=\left[\begin{array}{c}f\left(\left[\begin{array}{cccc}{x}_{11}& {x}_{12}& \cdots & {x}_{1d}\end{array}\right]\right)\\ f\left(\left[\begin{array}{cccc}{x}_{21}& {x}_{22}& \cdots & {x}_{2d}\end{array}\right]\right)\\ \vdots \\ f\left(\left[\begin{array}{cccc}{x}_{n1}& {x}_{n2}& \cdots & {x}_{nd}\end{array}\right]\right)\end{array}\right]$$
    (15)
  2. Update the discoverers' locations. Compared with the joiners, the discoverers have higher energy reserves and provide the joiners with safe foraging areas and directions. Their locations are updated as:

    $${X}_{ij}^{t+1}=\begin{cases}{X}_{ij}^{t}\times \mathrm{exp}\left(\frac{-i}{\alpha \times ite{r}_{\mathrm{max}}}\right)& {R}_{2}<ST\\ {X}_{ij}^{t}+QL& {R}_{2}\ge ST\end{cases}$$
    (16)

    where t is the current number of iterations, itermax is the maximum number of iterations, and j is the dimension. Xij denotes the position of the ith sparrow in the jth dimension. R2 and ST are the alarm value and the safety threshold, respectively. Q is a random number with normal distribution, and L is a 1 × d matrix. When R2 < ST, there is no predator in the foraging area, and the discoverers can search the surrounding area. When R2 ≥ ST, some sparrows in the population have found predators and issued a warning, and all sparrows fly to a safe place to feed.

  3.

    The joiner’s location update. The joiners can adjust their positions at any time. Their location is updated as:

    $${X}_{ij}^{t+1}=\begin{cases}{Q}\times \mathrm{exp}\left(\frac{{X}_{worst}^{t}-{X}_{ij}^{t}}{\alpha \times ite{r}_{\mathrm{max}}}\right)& i>n/2\\ {X}_{p}^{t+1}+\left|{X}_{ij}^{t}-{X}_{p}^{t+1}\right|\times {A}^{+}\times L& \mathrm{otherwise}\end{cases}$$
    (17)

    where Xp is the best position currently occupied by the discoverers, Xworst is the current global worst position, and A is a 1 × d matrix whose elements are randomly assigned 1 or −1, with A⁺ = Aᵀ(AAᵀ)⁻¹. If i > n/2, the ith joiner has not obtained food; its energy supply is insufficient, and it needs to forage elsewhere to improve its fitness.

  4.

    The scout’s location update. Scouts account for 10–20% of the sparrow population. When danger approaches, they give up the current food and move to a new location. The location update formula is:

    $${X}_{ij}^{t+1}=\begin{cases}{X}_{best}^{t}+\beta \times \left|{X}_{ij}^{t}-{X}_{best}^{t}\right|& {f}_{i}>{f}_{g}\\ {X}_{ij}^{t}+K\times \left(\frac{{X}_{ij}^{t}-{X}_{best}^{t}}{\left({f}_{i}-{f}_{w}\right)+\varepsilon }\right)& {f}_{i}={f}_{g}\end{cases}$$
    (18)

    where Xbest is the current global optimal position; β is the step-size control parameter; K is a uniform random number in [−1, 1]; fi is the current fitness value of the sparrow; fg and fw are the current best and worst fitness values; and ε is a small constant that keeps the denominator from being zero. When fi > fg, the sparrow is at the edge of the population and is vulnerable to predators. When fi = fg, the sparrow is in the center of the population, has become aware of the danger, and needs to move closer to other sparrows to reduce the probability of being preyed on.
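
The update rules in Eqs. (16) to (18) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in this paper: the function name `ssa_step`, the default values of `pd`, `sd`, and `st`, the clamp on the exponent in Eq. (17), the reading of A⁺ × L as an average of signed distances, and the greedy acceptance step at the end are all introduced here for the example. Lower fitness is treated as better.

```python
import math
import random


def ssa_step(X, f, iter_max, pd=0.2, sd=0.2, st=0.8, rng=random):
    """One SSA iteration for minimisation, following Eqs. (16)-(18)."""
    n, d = len(X), len(X[0])
    order = sorted(range(n), key=lambda i: f(X[i]))   # best (lowest f) first
    old = [X[i][:] for i in order]
    fit = [f(x) for x in old]
    new = [x[:] for x in old]
    x_best, f_g, x_worst, f_w = old[0], fit[0], old[-1], fit[-1]
    n_disc = max(1, int(pd * n))                      # number of discoverers

    # Eq. (16): discoverers search locally if R2 < ST, otherwise move by Q*L.
    r2 = rng.random()
    for i in range(n_disc):
        if r2 < st:
            alpha = 1.0 - rng.random()                # alpha in (0, 1]
            scale = math.exp(-(i + 1) / (alpha * iter_max))
            new[i] = [v * scale for v in old[i]]
        else:
            q = rng.gauss(0.0, 1.0)
            new[i] = [v + q for v in old[i]]
    x_p = min(new[:n_disc], key=f)                    # best discoverer position

    # Eq. (17): the worse half of the population relocates, the rest follow x_p.
    for i in range(n_disc, n):
        if i + 1 > n / 2:
            q, alpha = rng.gauss(0.0, 1.0), 1.0 - rng.random()
            # The exponent is clamped at 50 to avoid overflow (not in the source).
            new[i] = [q * math.exp(min((x_worst[j] - old[i][j]) / (alpha * iter_max), 50.0))
                      for j in range(d)]
        else:
            a = [rng.choice((-1.0, 1.0)) for _ in range(d)]
            step = sum(abs(old[i][j] - x_p[j]) * a[j] for j in range(d)) / d
            new[i] = [x_p[j] + step for j in range(d)]

    # Eq. (18): randomly chosen scouts react to danger.
    for i in rng.sample(range(n), max(1, int(sd * n))):
        if fit[i] > f_g:
            beta = rng.gauss(0.0, 1.0)
            new[i] = [x_best[j] + beta * abs(old[i][j] - x_best[j]) for j in range(d)]
        else:
            k = rng.uniform(-1.0, 1.0)
            denom = (fit[i] - f_w) + 1e-12            # epsilon avoids division by zero
            new[i] = [old[i][j] + k * (old[i][j] - x_best[j]) / denom for j in range(d)]

    # Greedy acceptance (assumption): keep a move only if it lowers the fitness.
    return [nx if f(nx) < fx else ox for nx, ox, fx in zip(new, old, fit)]
```

Calling `ssa_step` repeatedly on a list of n candidate vectors, with `f` set to the fitness function, gives one possible realization of steps 2 to 4. Because of the greedy acceptance, the best fitness in the population cannot get worse from one iteration to the next.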

4.2 Elman neural network

The structure of the Elman neural network is shown in Fig. 7.

Fig. 7 Elman neural network structure

The mathematical model of the Elman neural network is:

$$y\left(k\right)=g\left({\omega }_{3}x\left(k\right)\right)$$
(19)
$$x\left(k\right)=f\left({\omega }_{1}{x}_{c}\left(k\right)+{\omega }_{2}\left(u\left(k-1\right)\right)\right)$$
(20)
$${x}_{c}\left(k\right)=x\left(k-1\right)$$
(21)

where y is the m-dimensional output node vector; x is the n-dimensional intermediate (hidden) layer node vector; u is the r-dimensional input vector; xc is the n-dimensional feedback state vector; ω1 is the connection weight from the connecting layer to the hidden layer; ω2 is the connection weight from the input layer to the hidden layer; ω3 is the connection weight from the hidden layer to the output layer; g(·) is the transfer function of the output neurons; and f(·) is the transfer function of the intermediate layer neurons.
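
To make the recurrence in Eqs. (19) to (21) concrete, the following sketch runs the three equations over an input sequence in plain Python. It is a minimal illustration, not the network used in this paper: the names `matvec` and `elman_forward` are hypothetical, thresholds are omitted because Eqs. (19) to (21) do not contain them, the feedback state is assumed to start at zero, and `tanh` and the identity function are chosen as defaults only to mirror the tansig and purelin transfer functions named in Sect. 4.5.

```python
import math


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def elman_forward(inputs, w1, w2, w3, f=math.tanh, g=lambda z: z):
    """Apply Eqs. (19)-(21) to a sequence of input vectors; returns all y(k)."""
    x_c = [0.0] * len(w1)            # feedback state x_c, assumed zero initially
    outputs = []
    for u in inputs:                 # u plays the role of u(k-1)
        net = [a + b for a, b in zip(matvec(w1, x_c), matvec(w2, u))]
        x = [f(z) for z in net]                          # Eq. (20)
        outputs.append([g(z) for z in matvec(w3, x)])    # Eq. (19)
        x_c = x                                          # Eq. (21): next x_c is x(k)
    return outputs
```

Because the feedback state carries the previous hidden output forward, the output at step k depends on earlier inputs as well as the current one, which is what lets the network follow the time-varying thermal error.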

4.3 SSA-Elman neural network thermal error model

To address the shortcomings of the Elman neural network, this paper uses the sparrow search algorithm to optimize its initial weights and thresholds, and the SSA-Elman neural network thermal error prediction model is established. The modeling steps are as follows:

  1.

    Initialize the sparrow population and related parameters. Determine the population size, the proportions of discoverers, joiners, and scouts in the population, etc.

  2.

    Determine the structure of the Elman neural network. The three nodes of the input layer are the temperature-sensitive points, and the single node of the output layer is the axial thermal error of the motorized spindle. The number of hidden layer nodes is generally determined by the formula \(h=\sqrt{m+n}+a\), where m is the number of nodes in the input layer, n is the number of nodes in the output layer, and a is generally an integer between 1 and 10.

  3.

    Determine the fitness function. Each sparrow in the population contains the weights and thresholds required by the network, and the problem dimension is determined by the network structure. The output of the neural network is obtained by applying the transfer functions and the linear output relation. The mean square error (MSE) between the predicted result and the actual value is taken as the fitness function:

    $$fitness=MSE=\frac{1}{N}\sum_{i=1}^{N}{\left({\overline{y} }_{i}-{y}_{i}\right)}^{2}$$
    (22)

    where N is the number of training set samples; \({\overline{y} }_{i}\) and yi represent the predicted output and the actual value.

  4.

    Sort the sparrow individuals by their fitness values to generate the initial positions of the individuals in the population.

  5.

    Update the positions of discoverers, joiners, and scouts.

  6.

    Calculate the fitness values and update the sparrow positions.

  7.

    Judge whether the termination condition is met; if so, go to step 8; otherwise, return to step 5 and continue iterating.

  8.

    After the termination condition is met, the optimal weights and thresholds are assigned to the Elman neural network for training and prediction. The SSA-Elman thermal error modeling process is shown in Fig. 8.

Fig. 8 SSA optimized Elman neural network process
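
Steps 2 and 3 can be illustrated with a short calculation. The sketch below lists the hidden-layer sizes allowed by the empirical formula, counts the variables one sparrow would have to carry, and evaluates Eq. (22). The parameter layout is an assumption made here, since the paper does not spell out the encoding: h × m input weights, h × h connecting-layer weights, n × h output weights, and one threshold per hidden and output node. The function names are hypothetical, and rounding the square root to an integer is also an assumption.

```python
import math


def hidden_candidates(m, n):
    """Candidate hidden-layer sizes h = sqrt(m + n) + a for a = 1, ..., 10."""
    base = math.sqrt(m + n)
    return [round(base + a) for a in range(1, 11)]


def problem_dimension(m, h, n):
    """Length d of one sparrow under the assumed layout of weights and thresholds."""
    weights = h * m + h * h + n * h   # input, connecting-layer, and output weights
    thresholds = h + n                # one per hidden node and per output node
    return weights + thresholds


def mse(predicted, actual):
    """Eq. (22): mean square error between predictions and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
```

For the three-input, single-output network described in step 2, `hidden_candidates(3, 1)` gives the sizes 3 to 12, and a hidden layer of 7 nodes corresponds to a = 5. Under the assumed layout, `problem_dimension(3, 7, 1)` is 85, which would be the value of d in Eq. (14).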

4.4 PSO-Elman neural network thermal error model

To verify the prediction accuracy and performance of the SSA-Elman model, it is compared with a thermal error model in which the Elman neural network is optimized by particle swarm optimization (PSO) [27]. The thermal error modeling steps of the PSO-Elman neural network model are as follows:

  1.

    Initialize the particle swarm and related parameters.

  2.

    Determine the structure of the Elman neural network.

  3.

    Encode the particle information in the particle swarm and establish the correspondence between the particle components and the weights and thresholds of the Elman neural network.

  4.

    Determine the particle fitness function. The mean square error between the predicted output of the Elman neural network and the actual value, as in Eq. (22), is used as the fitness function of particle swarm optimization.

  5.

    Find the individual extremum and global extremum according to the initial particle fitness values.

  6.

    Update the velocity and position of the particles.

  7.

    Update the individual extremum and global extremum according to the fitness values of the new particles.

  8.

    Judge whether the termination condition is met. If so, go to step 9; otherwise, return to step 6 and continue iterating.

  9.

    After the termination condition is met, the optimal weights and thresholds are assigned to the Elman neural network for training and prediction. The PSO-Elman thermal error modeling process is shown in Fig. 9.

Fig. 9 PSO optimized Elman neural network process
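
Steps 5 to 7 can be sketched as follows. The paper does not give the PSO update equations, so the sketch assumes the standard inertia-weight form, v ← w·v + c1·r1·(pbest − x) + c2·r2·(gbest − x) followed by x ← x + v. The function names `pso_step` and `update_bests` are hypothetical, and the defaults w = 0.9 and c1 = c2 = 2 are taken from the parameter values listed in Sect. 4.5.

```python
import random


def pso_step(pos, vel, pbest, gbest, w=0.9, c1=2.0, c2=2.0, rng=random):
    """Step 6: update velocities and positions with the inertia-weight PSO rule."""
    new_pos, new_vel = [], []
    for x, v, p in zip(pos, vel, pbest):
        nv = [w * vj + c1 * rng.random() * (pj - xj) + c2 * rng.random() * (gj - xj)
              for xj, vj, pj, gj in zip(x, v, p, gbest)]
        new_vel.append(nv)
        new_pos.append([xj + vj for xj, vj in zip(x, nv)])
    return new_pos, new_vel


def update_bests(pos, pbest, f):
    """Steps 5 and 7: refresh the individual and global extrema (minimisation)."""
    pbest = [x if f(x) < f(p) else p for x, p in zip(pos, pbest)]
    return pbest, min(pbest, key=f)
```

In a full run, `f` would decode a particle into the weights and thresholds of the Elman network and return the fitness value; here it can be any function that maps a position vector to a number.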

4.5 Prediction and analysis of thermal error of motorized spindle

Generally, a trained thermal error model predicts well under the experimental conditions it was trained on, but its prediction accuracy is uncertain when the experimental conditions change. Therefore, it is necessary to verify the prediction performance of the model at different rotational speeds. In this paper, the experimental data at 6000 r/min are used as the training data set. To verify the accuracy and robustness of the model and to rule out chance results, the data at 2000 and 4000 r/min (below 6000 r/min) and at 8000 and 10000 r/min (above 6000 r/min) are selected as the prediction sets for verification.

In the Elman network, the maximum number of training iterations is 1000, and the learning rate is 0.01. The transfer function of the hidden layer is the hyperbolic tangent sigmoid function tansig, the transfer function of the output layer is the linear function purelin, and the back-propagation training function is trainlm. After several trial simulations with the data and network structure selected in this paper, the optimal number of hidden layer nodes is determined to be 7. The parameters of the sparrow search algorithm are as follows: the population size is 35, the number of iterations is 100, the early warning value ST is 0.7, the proportion of discoverers is 0.8, and the proportion of scouts is 0.2. The parameters of PSO are as follows: the population size is 35, the number of iterations is 100, the learning factors c1 and c2 are both 2, and the inertia weight is 0.9.

The SSA-Elman and PSO-Elman neural networks established in this paper use the same fitness function and the same number of iterations. The fitness curves of the two models at different speeds are shown in Fig. 10. The fitness curves show that the SSA algorithm needs fewer iterations to converge than the PSO algorithm at every speed, so the SSA algorithm optimizes faster than the PSO algorithm.

Fig. 10 Optimal fitness value curves at different speeds. a 2000 r/min, b 4000 r/min, c 8000 r/min, and d 10000 r/min

The prediction curves of each model at different speeds are shown in Figs. 11, 12, 13, and 14. The figures show that the SSA-Elman and PSO-Elman neural networks predict more accurately than the Elman neural network, and that the SSA-Elman neural network is slightly more accurate than the PSO-Elman neural network. The predictions of the optimized Elman models are closer to the actual values, a clear improvement over the unoptimized Elman model. For every model and speed, the prediction fluctuates most when the thermal error approaches its steady state. The ranges of the prediction residuals of the Elman, PSO-Elman, and SSA-Elman models are, respectively, −5.457 to 4.394 μm, −2.352 to 3.096 μm, and −2.983 to 3.586 μm at 2000 r/min; −6.072 to 5.843 μm, −3.069 to 3.996 μm, and −2.094 to 4.393 μm at 4000 r/min; −2.974 to 8.940 μm, −3.968 to 4.589 μm, and −3.883 to 3.771 μm at 8000 r/min; and −6.075 to 7.219 μm, −3.543 to 3.732 μm, and −4.024 to 3.570 μm at 10000 r/min. Judged by these residual ranges, the SSA-Elman and PSO-Elman neural networks are more accurate than the Elman neural network, while the two optimized models have roughly the same prediction accuracy. Other evaluation indexes are therefore needed to compare the two optimized models accurately.

Fig. 11 Prediction curves and residuals of the models at 2000 r/min. a SSA-Elman prediction curve, b PSO-Elman prediction curve, c Elman prediction curve, and d prediction residuals

Fig. 12 Prediction curves and residuals of the models at 4000 r/min. a SSA-Elman prediction curve, b PSO-Elman prediction curve, c Elman prediction curve, and d prediction residuals

Fig. 13 Prediction curves and residuals of the models at 8000 r/min. a SSA-Elman prediction curve, b PSO-Elman prediction curve, c Elman prediction curve, and d prediction residuals

Fig. 14 Prediction curves and residuals of the models at 10000 r/min. a SSA-Elman prediction curve, b PSO-Elman prediction curve, c Elman prediction curve, and d prediction residuals

To further evaluate the thermal error prediction models, the determination coefficient (R²), root mean square error (RMSE), mean absolute error (MAE), and modeling accuracy (η) are used as evaluation indexes. After repeated training, the evaluation indexes of each model at different speeds are obtained, as shown in Table 5.

Table 5 Comparison of different model evaluation indicators

R² is the determination coefficient, an index of how well the model fits the samples. Its theoretical range is (−∞, 1]: the closer it is to 1, the better the model fits the data, and the closer it is to 0, the worse the fit. MAE and RMSE are the mean absolute error and the root mean square error of the model, respectively, both with the range [0, +∞). They equal 0 when the predicted values coincide exactly with the true values, which corresponds to a perfect model, and they grow as the prediction error grows. η is the modeling accuracy; the closer η is to 1, the higher the model accuracy.

The evaluation results at 8000 r/min are taken as an example. The R² values of Elman, PSO-Elman, and SSA-Elman are 0.648, 0.935, and 0.977, respectively; the MAE and RMSE are 5.663 and 6.366, 2.145 and 2.627, and 1.279 and 1.694, respectively; and η is 0.857, 0.939, and 0.965, respectively. Compared with Elman, the determination coefficients of PSO-Elman and SSA-Elman increase by 0.287 and 0.329, respectively, and the R² of the SSA-Elman model reaches 0.977, higher than that of the PSO-Elman model. The MAE and RMSE are reduced by 3.518 and 3.739 for PSO-Elman and by 4.384 and 4.672 for SSA-Elman, so the SSA-Elman model reduces the error the most. The modeling accuracy is improved by 8.2 and 10.8 percentage points, respectively.

Averaged over the prediction speeds, the R², MAE, RMSE, and η of the Elman model are 0.705, 4.29, 4.88, and 0.846, respectively; those of the PSO-Elman model are 0.925, 2.225, 2.62, and 0.92; and those of the SSA-Elman model are 0.964, 1.31, 1.648, and 0.953. The PSO-Elman and SSA-Elman models thus improve the modeling accuracy by 7.4 and 10.7 percentage points, respectively, compared with the Elman model. Compared with the PSO-Elman model, the SSA-Elman model improves the accuracy by 3.3 percentage points, reaching 95.3%. The SSA-Elman model outperforms the PSO-Elman and Elman models on R², MAE, RMSE, and η. In summary, both the SSA-Elman and PSO-Elman neural network models significantly improve on the performance of the Elman neural network model, and the SSA-Elman model has the best predictive performance.
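
For reference, three of the four indexes can be computed as in the sketch below, which uses the usual textbook definitions of the determination coefficient, MAE, and RMSE. The function names are hypothetical, and η is left out because its defining formula is not given in this section.

```python
import math


def r2_score(actual, predicted):
    """Determination coefficient: 1 - SS_res / SS_tot, range (-inf, 1]."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mae(actual, predicted):
    """Mean absolute error, range [0, +inf)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, range [0, +inf)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
```

A model that always predicts the mean of the measured values gets a determination coefficient of 0, and a model that does worse than that gets a negative value, which is why the lower end of the range is open.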

At 2000 and 4000 r/min, the modeling accuracies of Elman, PSO-Elman, and SSA-Elman are 0.82 and 0.808, 0.901 and 0.9, and 0.929 and 0.952, respectively. At 8000 and 10000 r/min, they are 0.857 and 0.901, 0.94 and 0.94, and 0.965 and 0.966, respectively. Each model therefore predicts less accurately at the low speeds of 2000 and 4000 r/min than at the high speeds of 8000 and 10000 r/min, and the prediction accuracy and robustness of each model at low speed need to be further improved. To obtain higher prediction accuracy, it is necessary to use data at a low rotational speed as the training set and data at higher rotational speeds as the prediction set.
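
The per-speed modeling accuracies listed in this paragraph are enough to check the reported averages and the low-speed versus high-speed comparison. The short sketch below does this. The dictionary `eta` and the helper `mean_eta` are hypothetical names, and the numbers are copied from the text.

```python
# Modeling accuracy (eta) of each model at each prediction speed, in r/min.
eta = {
    "Elman":     {2000: 0.820, 4000: 0.808, 8000: 0.857, 10000: 0.901},
    "PSO-Elman": {2000: 0.901, 4000: 0.900, 8000: 0.940, 10000: 0.940},
    "SSA-Elman": {2000: 0.929, 4000: 0.952, 8000: 0.965, 10000: 0.966},
}


def mean_eta(model, speeds=(2000, 4000, 8000, 10000)):
    """Average modeling accuracy of one model over the given speeds."""
    return sum(eta[model][s] for s in speeds) / len(speeds)


# Difference between the high-speed and low-speed averages for each model.
gap = {m: mean_eta(m, (8000, 10000)) - mean_eta(m, (2000, 4000)) for m in eta}
```

The averages over all four speeds agree with the reported 0.846, 0.92, and 0.953 to within rounding, and `gap` is positive for all three models, which matches the observation that accuracy is lower at 2000 and 4000 r/min.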

To sum up, with appropriately selected temperature-sensitive points, the SSA-Elman model exhibits good stability and generalization ability at different rotational speeds. The SSA-Elman model has the highest prediction accuracy, reaching 96.6% at 10000 r/min. Therefore, the SSA-Elman model is more suitable for predicting the thermal error of the motorized spindle.

5 Conclusions

In this paper, the temperature field and thermal error of the A02 high-speed motorized spindle are measured. Pedigree clustering and k-means clustering are used to group the temperature measuring points, and the temperature-sensitive points are screened in combination with the grey relational degree. The SSA-Elman thermal error prediction model of the motorized spindle is established. The main conclusions are as follows:

  1.

    Pedigree clustering and k-means clustering are used to cluster the temperature measuring points, and the grey relational degree is used to explore the correlation between the temperature measuring points and the thermal error. Three temperature-sensitive points, at the front and rear bearings of the motorized spindle and at the built-in motor shell, are selected. This effectively eliminates the collinearity between temperature variables, which is of great significance for improving the robustness and modeling accuracy of thermal error models.

  2.

    The thermal error of a high-speed motorized spindle is modeled by the SSA-Elman neural network. Firstly, the Elman neural network adapts to time-varying characteristics and has global stability, so it can capture the nonlinear relationship between temperature variables and thermal error. Secondly, the sparrow search algorithm is used to optimize the Elman neural network, which addresses shortcomings of the Elman neural network model such as poor convergence, low prediction accuracy, and a tendency to fall into local extrema. The proposed SSA-Elman model significantly improves the predictive performance of the Elman model.

  3.

    The thermal error model is verified on the A02 high-speed motorized spindle, and its accuracy and robustness are checked at different speeds. The results show that the average prediction accuracy of the Elman, PSO-Elman, and SSA-Elman models at different speeds is 84.6%, 92%, and 95.3%, respectively. The PSO-Elman and SSA-Elman models improve the prediction accuracy of the Elman model by 7.4 and 10.7 percentage points, respectively. The prediction accuracy of the SSA-Elman model is 3.3 percentage points higher than that of the PSO-Elman model, and the SSA-Elman model shows better generalization ability and prediction accuracy. Therefore, the SSA-Elman model is more suitable for predicting the thermal error of the motorized spindle.