1 Introduction

In recent years, as the precision and speed of feed drive systems have improved, the ball screw, the core component of the precision feed drive system, has had an increasingly obvious influence on system performance. A large number of studies have shown that the thermal error accounts for 40–70% of the positioning error [1]. Studying the thermal error of ball screw systems and its compensation will therefore improve the performance of feed drive systems. Feng et al. showed that the influence of nonlinear friction on the conventional drive feed system (CDFS) can be reduced significantly by the two-axis differential micro-feed system (TDMS) [2,3,4,5,6]. However, the friction of the supporting bearings and the ball screw in the system leads to axial thermal deformation, which is a main factor restricting high-precision micro-feeds. Due to the structure and motion characteristics of the transmission parts of the TDMS, the distribution of the heat sources is different from that of the CDFS. It is therefore essential to study thermal error measurement, modelling, and compensation methods for the TDMS.

High-precision thermal error models can improve machining accuracy effectively [7]. Neural networks are one of the important development directions of artificial intelligence algorithms. Compared with a traditional classical theoretical model, a neural network model has good parallel data processing ability, storage capacity, data fault tolerance, and nonlinear mapping characteristics. Many scholars use back-propagation (BP) neural networks [8], radial basis function (RBF) neural networks [9], or support vector machines (SVMs) [10] to build thermal error models. Because the prediction results of neural network models depend largely on the selected initial values, many scholars incorporate genetic algorithms [8], particle swarm optimization [8], Grey theory [11], and fuzzy theory [12] into the neural network as initial value optimization methods, thereby further improving the prediction results. Ma et al. utilized particle swarm optimization and a genetic algorithm to optimize the thermal error modelling of a BP neural network [8] but did not consider the thermoelastic effect of thermal deformation. This omission reduces the prediction accuracy when the operating conditions are complex.

In this paper, the thermal error is predicted by a dynamic model that takes the elastic effect of thermal deformation into account. The model input is a time series of variables, such as temperature and speed. The thermoelastic effect can be understood as follows: the temperature change near the location of the heat source is faster than the change in the thermal deformation, while the temperature change far from the heat source is slower than the change in the thermal deformation. A dynamic neural network is often used for thermal error modelling because of its delay or feedback links. Chang et al. used a dynamic feedforward neural network, with its input variables corrected according to the time series [13]. Yang et al. simplified the heat conduction problem of the spindle into a one-dimensional infinitely long heat conduction rod problem [14]. The analytical solution of the problem is given, and the internal regression diagonal neural network model is used for thermal error prediction. Xia et al. used the packet display algorithm to solve the heat problem of the finite length one-dimensional bar and used the neural network model and NARMAX time series model to predict the axial deformation of a screw feed system [15]. Yang et al. used an Elman neural network to build the thermal error model of a machine tool and compared it with the RBF neural network, showing an improvement in the prediction accuracy [16]. Similarly, Huang et al. [17] utilized the Elman model optimized by the genetic algorithm for thermal error prediction. Zhu et al. [18] proposed a thermal error model using the OIF-Elman network. These studies show that the Elman dynamic neural network can significantly improve the modelling accuracy. However, they did not consider the influence of different operating conditions on the thermal error. Chen et al. [19] used an Elman neural network to build the thermal error model of the ball-screw drive system and considered operating conditions. However, the Elman neural network was not optimized.

In order to improve the accuracy and robustness of the model, a new thermal error modelling method for the TDMS is proposed in this paper. In Sect. 2, the system structure is presented to analyze the heat generation. The thermal error model based on Elman and DE-Elman is introduced in Sect. 3. Thermal error modelling and compensation experiments are conducted in Sect. 4. Finally, the conclusions are presented in Sect. 5.

2 Heat effect analysis

The block diagram of the TDMS is shown in Fig. 1. The experimental platform mainly consists of a nut-driven ball screw, linear motion guides, and two permanent magnet synchronous motors (PMSMs).

Fig. 1 Block diagram of the TDMS

As shown in Fig. 2, the outer ring of the supporting bearing in the TDMS nut assembly is integrated with the flange in the nut assembly, the inner ring of the supporting bearing is integrated with the outer ring of the rotating nut, and the nut and the screw are driven separately. Because of this structure and driving method, the heat sources of the mechanism are located differently from those of the conventional drive feed mechanism, and the temperature distribution differs accordingly. Due to the special driving mode of the TDMS, the two axes have to be maintained at a relatively high speed, and the heat generated by the TDMS per unit time is greater than that of the CDFS. Therefore, thermal error compensation is important for the TDMS.

Fig. 2 Heat source distribution

3 Elman neural network based differential evolution algorithm

3.1 Elman network

The BP neural network is a feedforward neural network. Because it is a static neural network, only nonlinear static mapping of the input and output can be achieved. The Elman neural network is a typical recurrent neural network, as shown in Fig. 3. The recurrent neural network can respond to externally input signals according to the timing, and then dynamically process the input signals to achieve dynamic nonlinear mapping of the input and output. On the basis of static neural networks, recurrent neural networks add feedback connections. Through the storage of neuron history records, delay of the output based on the input is realized so that the prediction model has a memory function for the historical data, which improves the recognition accuracy of the dynamic system. Therefore, recurrent neural networks have many applications in modelling and prediction of nonlinear dynamic systems.
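The feedback mechanism described above can be illustrated with a minimal sketch of an Elman forward pass. The layer sizes, the tanh activation, the weight range, and all names (`ElmanNetwork`, `W_ctx`, and so on) are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


class ElmanNetwork:
    """Illustrative Elman network: one hidden layer plus a context layer.

    Activation (tanh) and initial weight range are assumptions.
    """

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.uniform(-0.5, 0.5, (n_hidden, n_in))       # input -> hidden
        self.W_ctx = rng.uniform(-0.5, 0.5, (n_hidden, n_hidden))  # context -> hidden
        self.b_h = np.zeros(n_hidden)                              # hidden thresholds
        self.W_out = rng.uniform(-0.5, 0.5, (n_out, n_hidden))     # hidden -> output
        self.b_o = np.zeros(n_out)                                 # output thresholds
        self.context = np.zeros(n_hidden)

    def reset(self):
        """Clear the stored hidden state before a new sequence."""
        self.context = np.zeros_like(self.context)

    def step(self, u):
        """Process one time sample; the context holds the previous hidden state."""
        h = np.tanh(self.W_in @ u + self.W_ctx @ self.context + self.b_h)
        self.context = h.copy()  # remembered for the next time step
        return self.W_out @ h + self.b_o

    def predict(self, U):
        """U has shape (n_in, T); returns an array of shape (n_out, T)."""
        self.reset()
        return np.stack([self.step(U[:, t]) for t in range(U.shape[1])], axis=1)
```

Because the context layer stores the previous hidden state, the same input sample can produce different outputs at different times, which is the memory behaviour a static BP network lacks.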

Fig. 3 Structure of Elman neural network

3.2 Differential evolution algorithm

Differential evolution (DE) generates a population of individuals coded as floating-point vectors. In the DE optimization process, first, two individuals are selected from the parents and subtracted to generate a difference vector; second, another individual is selected and added to the scaled difference vector to generate an experimental individual; then, the parent and the corresponding experimental individual are crossed to generate a new offspring individual; finally, selection between the parent and the offspring is performed to preserve the eligible individuals in the next generation [20].

1. Initialization:

$$ \left\{{x}_i(0)\,|\,{x}_i(0)=\left[{x}_{i1},{x}_{i2},{x}_{i3},\dots ,{x}_{iD}\right],\ i=1,2,\cdots ,NP\right\} $$
(1)
$$ {x}_{ij}={a}_j+\operatorname{rand}\times \left({b}_j-{a}_j\right),\kern0.3em i=1,2,\cdots ,NP,\ j=1,2,\cdots ,D $$
(2)

where xi(0) is the ith individual in the initial population, xij denotes the jth component of the ith individual, aj and bj are the lower and upper bounds of the jth component, D is the dimension of each individual, NP represents the population size, and rand is a uniformly distributed random number lying between 0 and 1.

2. Determination of fitness function:

$$ {F}_{\mathrm{min}}=1/\sum \limits_{i=1}^{n_0}{\left({y}_i-{Q}_i\right)}^2 $$
(3)

where n0 is the number of neurons in the output layer; yi represents the predicted outputs, and Qi represents the expected outputs.

3. Mutation: The DE algorithm realizes the mutation operation through a differential mode. The basic method is to randomly select two distinct individuals in the current population, scale their difference vector, and add it to a third individual to generate a new individual.

$$ {V}_i\left(g+1\right)={X}_{r1}(g)+F\times \left({X}_{r2}(g)-{X}_{r3}(g)\right) $$
(4)

where i ≠ r1 ≠ r2 ≠ r3; i = 1, 2, …, NP; r1, r2, and r3 are random integers in the interval [1, NP]; Xi(g) is the ith individual of the gth-generation population; g is the generation number; and F is the scaling factor. After mutation, the gth-generation population produces a new intermediate population:

$$ \left\{{V}_i\left(g+1\right),i=1,2,\cdots, NP\right\} $$
(5)

4. Crossover: Inter-individual crossover operations are performed on the gth-generation population {Xi(g), i = 1, 2, ⋯, NP} and its intermediate population {Vi(g + 1), i = 1, 2, ⋯, NP}:

$$ {u}_{ij}\left(g+1\right)=\left\{\begin{array}{ll}{v}_{ij}\left(g+1\right),& \text{if}\ \operatorname{rand}\le CR\ \text{or}\ j={j}_{rand}\\ {x}_{ij}(g),& \text{otherwise}\end{array}\right. $$
(6)

where i = 1, 2, …, NP; j = 1, 2, …, D; rand is a uniformly distributed random number in the interval (0, 1); Ui(g + 1) = [ui1, ui2, ui3, ⋯, uiD] represents the ith individual in the (g + 1)th new population; and uij(g + 1) and vij(g + 1) represent the jth components of Ui(g + 1) and Vi(g + 1), respectively. CR represents the crossover probability, and jrand is a random integer within the interval [1, D]. This crossover strategy ensures that at least one component of Ui(g + 1) is contributed by the corresponding component in Vi(g + 1).

5. Selection: The DE algorithm uses a greedy strategy to select individuals entering a new population based on the size of the objective function.

$$ {X}_i\left(g+1\right)=\left\{\begin{array}{ll}{U}_i\left(g+1\right),& \text{if}\ f\left({U}_i\left(g+1\right)\right)\le f\left({X}_i(g)\right)\\ {X}_i(g),& \text{otherwise}\end{array}\right. $$
(7)

where i = 1, 2, …, NP.

6. Stopping rule: If the number of iterations g exceeds the maximum number of iterations or the solution accuracy meets the requirements, the search is stopped; otherwise, the population is again subjected to mutation, crossover, and selection until the condition is met.
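The six steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the objective `f` is treated as a cost to be minimized, consistent with the "≤" in Eq. (7), and the default values of `NP`, `F`, `CR`, `max_gen`, and `tol` are placeholders rather than the values in Table 2.

```python
import numpy as np


def differential_evolution(f, lower, upper, NP=20, F=0.5, CR=0.9,
                           max_gen=200, tol=1e-10, seed=0):
    """Minimize f over a box using the mutation/crossover/selection steps above.

    Parameter defaults are illustrative assumptions. Requires NP >= 4.
    """
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    D = lower.size

    # Step 1, Eqs. (1)-(2): uniform random initialization within [a_j, b_j]
    X = lower + rng.random((NP, D)) * (upper - lower)
    cost = np.array([f(x) for x in X])

    for g in range(max_gen):
        X_new, cost_new = X.copy(), cost.copy()
        for i in range(NP):
            # Step 3, Eq. (4): three distinct individuals, all different from i
            candidates = [r for r in range(NP) if r != i]
            r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
            V = X[r1] + F * (X[r2] - X[r3])

            # Step 4, Eq. (6): binomial crossover; j_rand forces at least
            # one component to come from the mutant vector V
            j_rand = rng.integers(D)
            mask = rng.random(D) <= CR
            mask[j_rand] = True
            U = np.where(mask, V, X[i])

            # Step 5, Eq. (7): greedy selection between trial and parent
            cost_U = f(U)
            if cost_U <= cost[i]:
                X_new[i], cost_new[i] = U, cost_U
        X, cost = X_new, cost_new

        # Step 6: stop early once the required accuracy is reached
        if cost.min() <= tol:
            break

    best = int(np.argmin(cost))
    return X[best], float(cost[best])
```

For example, `differential_evolution(lambda x: float(np.sum(x ** 2)), [-5] * 3, [5] * 3)` searches for the minimum of a three-dimensional sphere function. The mutant vector is not clipped to the bounds here; a bounded variant would need an additional repair step.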

3.3 DE-Elman modelling

In the training process of the Elman model, the important parameters to be determined are the initial values of the connection weights and thresholds. Reasonable and accurate selection of these parameters enables the Elman neural network to carry out nonlinear approximation. The DE algorithm optimizes the Elman model to obtain the best initial weights and thresholds, which lays a foundation for the thermal deformation modelling of the TDMS. The specific process is shown in Fig. 4.

1) Initialization of the Elman model: The inputs of the DE-Elman model are the typical temperature variables, and the output of the DE-Elman model is the thermal error.

2) Initialization of the DE algorithm: The maximum number of generations, population size N, minimum optimal fitness, mutation factor, and crossover probability are set, and the initial network connection weights and thresholds obtained in step 1) are mapped to the individuals of the population using real-number coding.

3) Determination of fitness function: see Eq. (3).

4) Mutation: see Eqs. (4) and (5).

5) Crossover: see Eq. (6).

6) Selection: see Eq. (7).

7) Calculation of fitness value.

8) Obtain the best initial weights and thresholds of the DE-Elman model.

9) Training of the model: The prediction model is trained with the training data. The parameters of the Elman model are shown in Table 1, and the parameters of the DE are shown in Table 2.
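The real-number coding in step 2) can be sketched as follows: all weights and thresholds of an Elman network are flattened into one vector, so that each DE individual is a candidate set of initial parameters. The parameter ordering, the tanh activation, the hidden-layer size used in the example, and the function names are assumptions for illustration (Table 1 is not reproduced here). The cost is the sum of squared errors, that is, the denominator of Eq. (3).

```python
import numpy as np


def param_shapes(n_in, n_hidden, n_out):
    """Shapes of the weight/threshold arrays, in an assumed fixed order."""
    return [(n_hidden, n_in),      # input -> hidden weights
            (n_hidden, n_hidden),  # context -> hidden weights
            (n_hidden,),           # hidden thresholds
            (n_out, n_hidden),     # hidden -> output weights
            (n_out,)]              # output thresholds


def n_params(n_in, n_hidden, n_out):
    """Length D of one real-coded DE individual."""
    return sum(int(np.prod(s)) for s in param_shapes(n_in, n_hidden, n_out))


def unpack(theta, n_in, n_hidden, n_out):
    """Map a flat individual back to the network's arrays."""
    parts, k = [], 0
    for shape in param_shapes(n_in, n_hidden, n_out):
        size = int(np.prod(shape))
        parts.append(np.asarray(theta[k:k + size], dtype=float).reshape(shape))
        k += size
    return parts


def elman_predict(theta, U, n_hidden, n_out=1):
    """Run an Elman forward pass over U (n_in, T) with parameters theta."""
    n_in, T = U.shape
    W_in, W_ctx, b_h, W_out, b_o = unpack(theta, n_in, n_hidden, n_out)
    ctx = np.zeros(n_hidden)
    Y = np.empty((n_out, T))
    for t in range(T):
        ctx = np.tanh(W_in @ U[:, t] + W_ctx @ ctx + b_h)  # new hidden state
        Y[:, t] = W_out @ ctx + b_o
    return Y


def sse(theta, U, Q, n_hidden):
    """Sum of squared errors between predicted and expected outputs Q."""
    Y = elman_predict(theta, U, n_hidden, Q.shape[0])
    return float(np.sum((Y - Q) ** 2))
```

A DE routine that minimizes a cost function could then be given `lambda theta: sse(theta, U, Q, n_hidden)` with `n_params(...)` as the dimension, and the best individual would serve as the initial weights and thresholds for subsequent training.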

Fig. 4 Flowchart of DE-Elman

Table 1 Parameters of the Elman model
Table 2 Parameters of DE

4 Experiment based thermal error modelling

4.1 Experimental equipment

A schematic diagram of the temperature increase and the thermal elongation of the system is shown in Fig. 5. The experimental environment is a room with constant temperature and humidity, that is, a temperature of 20 ± 1 °C and a humidity of 50 ± 10%. The positioning error of the worktable is measured using a Renishaw XL-80 dual-frequency laser interferometer. The hardware of the detection system is mainly composed of temperature sensors (T1, T23, T4, and T5), a transmitter, and a data acquisition card. The temperature sensor T1 is installed near the screw motor bearing to measure the temperature of the rear bearing. The temperature sensor T4 is mounted away from the screw motor bearing to measure the temperature of the front bearing. The temperature sensor T23 is mounted on the flange of the nut to measure the temperature of the nut. The temperature sensor T5 is used to measure the temperature of the environment. The temperature sensors are PT100 platinum resistance sensors with matched temperature transmitters. The temperature range is 0 to 100 °C, and the output is 0 to 5 V. An Advantech USB-4711A multi-channel data acquisition card with 16 analogue input channels and 12-bit resolution is used. During the experiment, the experimental data, including the temperature values of the key points and the positioning errors of the worktable, were recorded every 5 min.
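As a rough check on the measurement chain, the temperature resolution implied by a 12-bit converter can be estimated. The sketch below assumes that the card's input range is set to match the 0 to 5 V transmitter output and that the transmitter is linear over 0 to 100 °C; neither detail is stated in the text, and the function names are illustrative.

```python
def counts_to_celsius(count, bits=12, t_min=0.0, t_max=100.0):
    """Convert a raw ADC count to temperature, assuming a linear transmitter
    whose 0-5 V output spans the converter's full input range."""
    full_scale = 2 ** bits - 1          # 4095 for a 12-bit converter
    return t_min + (t_max - t_min) * count / full_scale


def temperature_resolution(bits=12, t_min=0.0, t_max=100.0):
    """Temperature change represented by one ADC count."""
    return (t_max - t_min) / (2 ** bits - 1)
```

Under these assumptions one count corresponds to roughly 0.024 °C.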

Fig. 5 TDMS experimental setup

The experimental conditions, shown in Table 3, include the running speed, the stroke, and the running time of the table. The model is built and verified by changing these three operating conditions and then measuring the temperature and thermal error. The table runs reciprocally, and the positioning error in the experiment is measured by a laser interferometer. Experiment I is used to build the thermal error model. Experiments II–IV are used to compensate the thermal errors and verify the validity of the model. In each experiment, the table runs for 150 min at the set speed and stroke, and experiment IV consists of three consecutive stages with different speeds and strokes.

Table 3 Operation conditions of experiments

4.2 Thermal error modelling

The result of experiment I is shown in Fig. 7. Figure 7a shows the temperature curves of the key points. To facilitate plotting, the Y-axis shows the temperature increase rather than the absolute temperature; the same applies to Figs. 8, 9, and 10. Figure 7b shows the change in the thermal error. Based on these two sets of data, a thermal error neural network model can be obtained. To compare the modelling effect of the feedforward and the dynamic neural network, the BP and Elman networks are used to model the temperature and the thermal error data obtained in experiment I. The inputs of the BP and Elman networks are the temperature increases at the four key temperature points, and the output variable is the thermal error of the system. In training, the input variable is the 4 × 30 array collected in experiment I, and the output variable is the 1 × 30 array of the system thermal error at the corresponding times.
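The 4 × 30 input array can be read as four temperature channels sampled 30 times, which matches a 150 min run sampled once every 5 min. A minimal sketch of how temperature-rise inputs might be formed from absolute readings is given below; taking the first sample as the reference and the name `temperature_rise` are assumptions for illustration.

```python
import numpy as np


def temperature_rise(T):
    """Convert absolute temperatures T, shape (n_sensors, n_samples), into
    temperature increases relative to the first sample of each sensor
    (assumed reference)."""
    T = np.asarray(T, dtype=float)
    return T - T[:, [0]]


# 150 min of running sampled every 5 min gives 30 samples per channel
n_samples = 150 // 5
```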

The thermal error prediction values and residual values of the BP and Elman networks are shown in Fig. 7b. The estimated residual values of the BP and the Elman model are −1.85 to 1.88 μm and −0.45 to 0.46 μm, respectively. Both models reduce the thermal error, and the smaller residual range shows that the Elman model has the better modelling effect.
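The residual ranges quoted here can be computed as the minimum and maximum of the difference between measured and predicted thermal error. The sign convention (measured minus predicted) and the function name are assumptions, since the text does not define them.

```python
import numpy as np


def residual_range(measured, predicted):
    """Return (min, max) of the residual, taken here as measured - predicted."""
    r = np.asarray(measured, dtype=float) - np.asarray(predicted, dtype=float)
    return float(r.min()), float(r.max())
```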

Next, the robustness of the BP and Elman network models obtained in experiment I is compared. The temperatures of the key points under different operating conditions are used as the inputs of the models.

First, the temperature variation data collected in experiment II are used as the input, and the BP and Elman models are used to predict the thermal error. The results are shown in Fig. 8. The estimated residual of the BP model fluctuates between −2.51 and 2.42 μm, and the estimated residual of the Elman model fluctuates between −0.87 and 0.87 μm; the estimation made using the Elman model is clearly better. This is because feedforward neural networks contain only static neurons and lack time factors, and thus cannot accurately describe the dynamic thermal characteristics of the screw. The first layer of the Elman network has a feedback connection, which can store the previous value. Therefore, the dynamic nonlinear thermal error model of the system established with the Elman network has strong approximation performance and robustness.

Next, the temperature variation data of experiment III are used as the input, and the Elman and DE-Elman models are used to predict the thermal error. The parameters of the differential evolution algorithm are shown in Table 2, and the prediction results are shown in Fig. 9. The experimental results can be divided into two stages. Before 30 min, the estimation error of both models is large, with maximum errors of 2.68 μm and 1.21 μm for the Elman and DE-Elman models, respectively. After 30 min, the amount of model data increases as time passes, and the model estimation error gradually becomes stable. The estimation error of the Elman neural network is then −1.93 to 2.04 μm, and the estimation error of DE-Elman is −1.12 to 1.18 μm. Therefore, the convergence time of DE-Elman is much shorter than that of the Elman model, and the prediction accuracy of the DE-Elman model is higher once the model converges.

Finally, experiment IV is analyzed. To simulate the movement of the carriage in the actual machining process, the experiment is divided into three stages with different operating conditions. Figure 10 shows the prediction results of the two models. Both models show a large deviation in the estimation of the thermal error. The estimated residual errors of the Elman and the DE-Elman model are −2.56 to 2.93 μm and −1.54 to 1.79 μm, respectively. Although the DE-Elman model has a better estimation performance than the Elman model, its prediction performance is considerably lower than that in experiment III.

Figs. 7, 8, 9, and 10 show that the Elman model with a feedback structure and the DE-Elman model predict the thermal error of the screw well. This shows that the dynamic neural network is suitable for describing the dynamically changing thermal characteristics of the dual-axis differential system. However, when the operating conditions are more complex, the temperature increase data and the thermal error of the key temperature measurement points also undergo complicated changes. When this occurs, the Elman neural network and DE-Elman fail to achieve good prediction results. Therefore, based on the DE-Elman model and considering the effect of the operating conditions on the thermal error, the prediction model of the dual-axis differential system is set up, and the thermal error compensation experiment is conducted based on the model.

The DE-Elman model considering the actual operating conditions has seven input nodes: the temperature of the near-end bearing, the temperature of the screw nut, the temperature of the far-end bearing, the temperature of the environment, the feed speed of the worktable, and the speed and stroke of the nut axis. The model input is the array composed of the feed speed, nut shaft speed, and stroke at the corresponding time together with the previous temperature increase data. The thermal error distribution of the system is obtained by the interpolation method, and the thermal error model of the system is built. The model data are introduced into the fixed high motion controller to achieve the thermal error compensation function. As shown in Figs. 6, 7, 8, 9, 10, and 11 and Table 4, with thermal error compensation based on the DE-Elman model that considers the operating conditions, the positioning error of the worktable fluctuates between −0.88 and 0.83 μm. This is a better result than that of the Elman neural network model, whose error fluctuates between −1.42 and 1.73 μm.
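The seven-input arrangement can be sketched as stacking the four temperature-rise channels with the three operating-condition channels. The row order, the units, and the function name `build_inputs` are assumptions for illustration; operating conditions may be given as constants or as per-sample sequences (for example, the three stages of experiment IV).

```python
import numpy as np


def build_inputs(temp_rise, feed_speed, nut_speed, stroke):
    """Stack 4 temperature-rise rows with 3 operating-condition rows.

    temp_rise: array of shape (4, T).
    feed_speed, nut_speed, stroke: scalars or length-T sequences
    (units are whatever the experiment uses; not specified here).
    Returns an array of shape (7, T).
    """
    temp_rise = np.asarray(temp_rise, dtype=float)
    T = temp_rise.shape[1]
    conditions = np.vstack([
        np.broadcast_to(np.asarray(v, dtype=float), (T,))
        for v in (feed_speed, nut_speed, stroke)
    ])
    return np.vstack([temp_rise, conditions])
```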

Fig. 6 Photograph of the experimental setup

Fig. 7 Experiment I. a Temperature data. b Thermal error prediction

Fig. 8 Experiment II. a Temperature data. b Thermal error prediction

Fig. 9 Experiment III. a Temperature data. b Thermal error prediction

Fig. 10 Experiment IV. a Temperature data. b Thermal error prediction

Fig. 11 Experiment IV

Table 4 Comparison of the predicting errors in experiments

To summarize, when the temperature increase data of the feed system are taken as the input, the BP, Elman, and DE-Elman networks can all be used to build a thermal error model. When the operating state of the feed system remains unchanged, the Elman network model based on the differential evolution algorithm is more effective in compensating for the thermal error of the system. When the conditions are complex, these three neural network models predict the thermal error insufficiently, whereas the DE-Elman model that takes the operating conditions as input terms achieves good estimation for all conditions, with a better compensation effect and stronger robustness.

5 Conclusions

This paper first analyzes the dynamic characteristics of the heat source and temperature field of a two-axis differential micro-feed servo system. Based on the analysis of the thermal characteristics of the system, the DE-Elman model, which considers the varying operating conditions, is used to model the thermal deformation error of the feed system, and good results are obtained. The experiments show that when the operating conditions of the TDMS are more complex, the thermal deformation estimation residual of the DE-Elman model considering the operating conditions fluctuates between −0.88 and 0.83 μm. Compared with the BP and Elman neural networks, this method has better prediction accuracy and robustness and shows strong potential for engineering applications.