1 Introduction

Numerical control machining technology is widely used to produce parts with complicated geometries and high precision. Tool wear leads to reduced machining accuracy, shortened tool life, and inefficient production. Tool life is generally defined as the time required to reach a predetermined flank wear width [1]. Tool life prediction and tool wear state identification are therefore the two tasks most relevant to tool health monitoring. Thus, tool wear prediction is a crucial field of Prognostics and Health Management (PHM), aiming at improving machining accuracy and production efficiency, maximizing tool utilization, and reducing machining cost.

Due to the high complexity and nonlinearity of the tool wear process, it is challenging to develop a universal tool wear prediction model applicable to industrial production. Many studies on tool wear have been conducted recently, and a variety of models have been put forth, which can be broadly categorized into physical models and data-driven models [2].

The physical model establishes the relationship between the physical quantities of the machining conditions and tool wear based on prior knowledge of tool cutting. Existing research includes force model coefficient analysis [3], the adhesive wear model [4], the diffusion wear model [5], and so on. Built under specific assumptions, these models have clear physical meaning, reliability, and interpretability, and they can produce good results in a continuous, single processing environment. The link between tool wear status and cutting parameters is described by Taylor’s tool life formula and its many extensions [6,7,8,9,10,11]. These semi-empirical functions are well known in the field of tool health monitoring. Empirical expressions that directly relate tool wear to cutting time based on the wear-time curve have also been proposed in many studies [12,13,14,15,16]. Because of the complexity and nonlinearity of tool degradation, these models are typically built on assumptions and simplifications. Many random factors are inevitably involved in the cutting process, and the empirical formulas do not take this randomness into account. Hence, the application of the physical method is restricted.

The data-driven model learns the mapping between the tool state and the monitoring signals from tool machining sensors [17]. Data-driven models have become increasingly common in the context of industrial big data, with ongoing advancements in machine learning and deep learning algorithms. Model construction frequently employs support vector regression (SVR) [18,19,20], artificial neural networks (ANN) [21,22,23], neural fuzzy reasoning systems (NFRS) [24, 25], etc. Large amounts of historical sample data are necessary for the development of data-driven models, and there are also challenges with signal processing and feature extraction caused by noise or interference from the machining environment. The outputs of most data-driven methods are difficult to interpret in terms of their inputs or to connect to any physical meaning (cutting parameters, tool geometry parameters, machined materials, etc.).

Based on the above analysis, tool wear modeling needs to be further improved. In this paper, a hybrid physics data-driven model-based fusion framework for tool wear prediction is proposed, which makes use of the advantages of both model types to improve prediction accuracy. The major highlights of this paper are summarized as follows: (1) Physical information is introduced as an input of the data-driven model to build a hybrid physics data-driven (HPDD) model. By using unlabeled samples for data expansion, data mining and physics principles are effectively integrated. (2) A Bayesian fusion mechanism combines physical information with the HPDD models to manage the uncertainty of the different prediction results. (3) The tool wear process is divided into different stages to reduce the difficulty of parameter estimation and improve prediction interpretability. The proposed framework can address the poor prediction accuracy and generalization ability of the physical model and the lack of interpretability and physical consistency of the data-driven model.

The rest of this paper is organized as follows. The research status of tool wear prediction methods is described in Sect. 2. The construction of the hybrid physics data-driven model-based fusion framework is introduced in Sect. 3. The practicability of the method is verified by experiment in Sect. 4. The conclusions are drawn in Sect. 5.

2 Related works

Recent research on physical models has provided solutions to practical problems, most of which are simple, effective, and easy to apply. Bai et al. [26] established a semi-analytical model for tool wear prediction based on the interaction between tool and workpiece. Zhang et al. [27] proposed a universal tool wear model with adjustable coefficients and carried out tool life prediction and tool state evaluation. Slamani et al. [12] compared a multiplicative statistical model and an exponential model of tool wear and analyzed the advantages and disadvantages of each model under different circumstances. Most empirical formulas for tool wear rely on predefined cutting parameters and tool geometry. The model parameters lack representativeness and flexibility: they are sensitive to processing conditions and adapt poorly to the variety of actual processing conditions. Current physical models are heavily simplified, accounting for just a few dominant factors. The reliability and accuracy of a physical model are significantly impaired when its assumptions do not hold. Numerical simulation has made some progress in tool wear condition monitoring in recent years. This approach can address the difficulty of obtaining process parameters, but it requires in-depth study of the tool wear mechanism. Tang et al. [28] analyzed the influence of different physical and structural parameters on drilling wear through a numerical simulation model. Attanasio et al. [29] developed a finite element analysis method with modifiable model geometry parameters to evaluate the influence of cutting conditions on tool wear. Zhu et al. [30] used the finite element method based on the Johnson–Cook model to monitor the status of milling tools. The simplified friction model is frequently used in actual modeling even though it cannot fully account for contact during the actual cutting process. This causes a discrepancy between simulation results and actual values. Moreover, some model inputs cannot be monitored or calculated during the cutting process, which severely limits the use of physical models. Therefore, physical models are difficult to promote widely and to improve further in practical applications and tool health monitoring.

The data-driven method mines the internal laws of tool wear from condition monitoring data. Under the mutual coupling of multiple factors such as working conditions and machining technology, the relationship between tool wear and the monitored quantities becomes intricate and changeable, and it is unrealistic to deduce a mathematical law between the physical quantities and tool wear. Given reasonable and sufficient samples, the data-driven model, with its excellent nonlinear mapping ability, can uncover the internal relationships among the physical quantities in depth without relying heavily on prior knowledge and processing mechanisms. It is therefore appropriate for tool wear monitoring, whose mechanism is not yet fully understood. Duan et al. [31] proposed a novel multi-scale stacked sparse principal component analysis network to select training indexes for tool prediction. He et al. [32] proposed a cross-domain adaptive network based on an attention mechanism to accurately identify tool wear states under different machining parameters. Qin et al. [33] established a tool wear identification model based on a stacked sparse autoencoder and a tool wear prediction model based on a BP neural network to monitor tool wear according to different task requirements and guide tool replacement in the actual cutting process. Wei et al. [34] proposed a salp swarm algorithm combining chaotic mapping and an attenuation factor, which was combined with a neural network to achieve effective tool wear prediction. However, the data-driven model also has significant drawbacks. The mapping relationship depends entirely on monitoring data, and the final results may be inconsistent with physical laws and common-sense understanding. Training samples and test samples are required to be independent and processed under the same conditions. Obtaining life cycle historical monitoring data for various operating situations requires a significant investment of manpower and material resources. An insufficient sample size may make the model sensitive to operating conditions, amplify individual differences, and result in substantial prediction error.

More and more scholars are concentrating on physics data-driven fusion models in order to overcome the respective drawbacks of the physical model and the data-driven model. The fusion model introduces physical information into the data-driven model to constrain or guide its health status monitoring. Its benefits are as follows: (1) Through the integration of physical information, the data-driven model can be constrained to a relatively small solution space, and unknown samples can be filled in to a certain extent. (2) The advantages of both methods are fully used to improve predictive performance and provide more reasonable theoretical guidance for tool health management. (3) Compared with the original data-driven model, the physics data-driven fusion model is more consistent with the model interpretability required by modern manufacturing. Although research on tool wear prediction using the physics data-driven fusion method is quite scarce, there is still some pertinent literature to consult. Hanachi et al. [35] built a physical data-driven hybrid framework based on regularized particle filtering, and the results of tool wear state estimation were significantly improved. Huang et al. [36] constructed a mathematical description of the tool wear degradation process and integrated it with a multi-layer perceptron (MLP). The results showed that the hybrid model performed better than any single model. Li et al. [37] proposed a hybrid physics meta-learning framework, which used physics-informed loss terms to constrain model optimization, and verified the effectiveness of the method. These studies have achieved some success, but the algorithm must not only fit the relationship between the sensor signal and the tool machining state, but also complete the mapping between the monitoring signal and the physical mechanism. Because the data amount and distribution of sensor signals frequently differ from those of the actual physical quantities, it can be challenging to match useful information properly without being overburdened by redundant information. Physical information is currently integrated into data-driven models only as specific, simple prior knowledge. Further in-depth investigation is still required to determine how best to blend physical and data-driven methods into fusion models. To assess tool wear status more precisely, an appropriate fusion system must be built.

3 Hybrid physics data-driven model-based fusion framework for tool wear prediction

The proposed hybrid physics data-driven model-based fusion framework mainly consists of four modules: wear stage division, physical information, the hybrid physics data-driven (HPDD) model, and the physics data-driven fusion mechanism, as shown in Fig. 1. In the wear stage division module, the tool wear trend is fitted according to tool wear characteristics, and the wear process is divided into three stages: the initial wear stage, the normal wear stage, and the severe wear stage. Local features of the sensor monitoring signals are extracted according to the stage division. In the physical information module, the physical model parameters are optimized using wear rate and cutting force data. The optimized parameters are used to predict tool wear, which yields the physical information of tool wear in the three stages. In the HPDD model module, with the extracted local features as the first input and the physical information as the second input, the HPDD model is trained to obtain tool wear prediction results. In the physics data-driven fusion mechanism module, the physical information is taken as prior information, and the prediction results of the HPDD model are taken as observation information. The final tool wear prediction results are obtained by fusing the two kinds of information based on Bayesian theory. The tool wear stage division is the basis for the piecewise prediction of the physical information module and the HPDD module. The physical information and local features are the joint inputs of the HPDD model. Finally, the physical information and the predicted results of the HPDD model are integrated through the fusion mechanism. The proposed framework can ameliorate the poor prediction accuracy and generalization ability of the physical model and the lack of interpretability and physical consistency of the data-driven model.

Fig. 1 The scheme of tool wear prediction

3.1 Tool wear stage division and fitting

The complicated phenomena of tool wear are induced by many factors, including abrasive wear caused by the mechanical action of hard particles, diffusion wear caused by atomic motion, adhesive wear caused by shear plane deformation, and oxidation wear at high temperature. The combined activity of these mechanisms ultimately leads to wear forms such as rake face wear, flank wear, and boundary wear. Tool flank wear results from strong friction between the tool clearance surface and the workpiece surface. It receives considerable attention from the academic community because it directly affects the quality of the machined surface.

Tool wear is very rapid in the initial wear stage due to the large number of microscopic flaws on the tool surface, low metal surface strength, and point contact between the cutting tool and the workpiece. In the normal wear stage, because the micro convex peaks on the tool surface have been worn away, the contact area between tool and workpiece increases, all parts of the tool contact surface are stressed uniformly, and the tool material is worn slowly and evenly. Then, in the severe wear stage, the tool’s cutting edge becomes dull, the temperature in the cutting area rises, the cutting force increases, and friction in the cutting area rapidly increases, all of which contribute to greater tool wear, as illustrated in Fig. 2. Wear rate and wear acceleration are computed as follows to analyze the changing trend of tool wear:

$$\left\{\begin{array}{c}{w}^{\prime}\left(t\right)=\Delta w/\Delta t=\left[w\left(t+\Delta t\right)-w\left(t\right)\right]/\Delta t\\ {w}^{^{\prime\prime} }\left(t\right)=\Delta {w}^{\prime}/\Delta t=\left[{w}^{\prime}\left(t+\Delta t\right)-{w}^{\prime}\left(t\right)\right]/\Delta t\end{array}\right.$$
(1)

where \(t\) is the sampling time on the wear trend curve, \(\Delta w\) and \(\Delta {w}^{\prime}\) are the changes in wear and in wear rate over one sampling interval, so that \(\Delta w/\Delta t\) and \(\Delta {w}^{\prime}/\Delta t\) are the wear rate and wear acceleration, and \(\Delta t\) is the sampling interval of milling time on the wear curve. In Fig. 2, \({t}_{a}\) is the moment when the wear acceleration is \({w}^{^{\prime\prime} }=0\). This critical time point divides the tool wear curve into convex and concave parts.
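As an illustration of Eq. (1), the following minimal sketch computes the forward-difference wear rate and wear acceleration and locates the first sign change of the acceleration, which approximates \({t}_{a}\). The wear values, the uniform sampling interval, and the function names are hypothetical and are not taken from the experiment.

```python
def finite_differences(wear, dt):
    """Forward differences per Eq. (1): returns (wear rate, wear acceleration)."""
    rate = [(wear[i + 1] - wear[i]) / dt for i in range(len(wear) - 1)]
    accel = [(rate[i + 1] - rate[i]) / dt for i in range(len(rate) - 1)]
    return rate, accel


def inflection_index(accel):
    """Index of the first sample where acceleration changes from negative
    to non-negative (approximate location of t_a), or None if it never does."""
    for i in range(1, len(accel)):
        if accel[i - 1] < 0 <= accel[i]:
            return i
    return None


# Hypothetical wear values (um), sampled every 1 min of milling time
wear = [0.0, 40.0, 60.0, 70.0, 76.0, 84.0, 100.0, 130.0]
rate, accel = finite_differences(wear, 1.0)
t_a_index = inflection_index(accel)
```

With noisy measurements the differences computed this way scatter strongly, which is the motivation for fitting a smooth wear curve before differentiating.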

Fig. 2 Trend chart of tool wear

The change in tool wear rate is used to classify the wear stages. It is evident from the tool wear process that tool wear does not follow a simple linear relationship, so it is unreasonable to describe the tool wear rate with a straightforward difference relationship. Due to factors such as tool installation, measurement errors, the processing environment, and individual differences in the workpiece, there may be deviations in the experimental measurements of tool wear, and the measurement time interval of the experimental data is large. Hence, the difference method results in significant error and severe dispersion of the wear rate data. A tool wear fitting model, by contrast, can describe the degradation trend of tool wear well.

A tool flank wear fitting model w(t) was proposed in [27]. It consists of two transition functions as follows:

$$w\left(t\right)={w}_{E}\left(t\right)+{w}_{L}\left(t\right)$$
(2)
$$\left\{\begin{array}{c}{{w}_{E}}^{\prime}={a}_{1}/\left(t+{b}_{1}\right)+{c}_{1}\\ {{w}_{L}}^{{\prime}{\prime}}={a}_{2}t+{b}_{2}\end{array}\right.$$
(3)

where \({w}_{E}\left(t\right)\) and \({w}_{L}\left(t\right)\) represent the fitting curves of the convex and concave parts, respectively, and \({a}_{1}\left(\mu {\text{m}}\right)\), \({a}_{2}\left({\mu \text{m/min}}^{3}\right)\), \({b}_{1}\left({\text{min}}\right)\), \({b}_{2}\left({\mu \text{m/min}}^{2}\right)\), and \({c}_{1}\left(\mu \text{m/min}\right)\) are the basic transition coefficients.

Starting from the transition expressions in Eq. (3), the model is integrated and the low-order terms that have little influence on tool wear in the normal wear stage are removed. To improve model flexibility and fitting accuracy under various wear conditions, an exponent x is introduced into the model. The tool wear fitting model is therefore summarized as:

$$w\left(t\right)=A{\text{ln}}\left(Bt+1\right)+C{t}^{x}+D$$
(4)
$${w}^{\prime}\left(t\right)=AB{\left(Bt+1\right)}^{-1}+xC{t}^{x-1}$$
(5)

where A (μm), B (\({\text{min}}^{-1}\)), C (\({\mu \text{m/min}}^{3}\)), and D (μm) are fitting coefficients.
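Equations (4) and (5) can be checked against each other numerically. The sketch below evaluates both expressions and compares the analytic wear rate with a central-difference derivative of the fitted wear curve. The coefficient values are hypothetical placeholders, not fitted values from this paper.

```python
import math


def wear_fit(t, A, B, C, D, x):
    """Tool wear fitting model, Eq. (4): w(t) = A*ln(B*t + 1) + C*t**x + D."""
    return A * math.log(B * t + 1.0) + C * t ** x + D


def wear_rate_fit(t, A, B, C, x):
    """Analytic wear rate, Eq. (5): w'(t) = A*B/(B*t + 1) + x*C*t**(x - 1)."""
    return A * B / (B * t + 1.0) + x * C * t ** (x - 1.0)


# Hypothetical coefficients, chosen only for illustration
A, B, C, D, x = 20.0, 0.5, 1e-4, 30.0, 3.0

t = 50.0   # milling time (min)
h = 1e-4   # step for the central-difference check
numeric = (wear_fit(t + h, A, B, C, D, x) - wear_fit(t - h, A, B, C, D, x)) / (2.0 * h)
analytic = wear_rate_fit(t, A, B, C, x)
```

Note that D is the wear value at t = 0, and that for x < 1 the rate expression is undefined at t = 0, so the first sample should then be evaluated at a small positive time.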

A number of empirical physical models of tool wear have been established in past studies. Slamani et al. [12] proposed the multiplicative index tool wear model w1. In his doctoral thesis, Muller presented an empirical model w2 that combines a linear function and an exponential function, which is considered to be the tool wear model with the best fit in the initial stage [38]. Sipos introduced an empirical wear model w3 with exponential and polynomial functions in his doctoral dissertation [39]. The three typical tool wear fitting models are as follows:

$$\left\{\begin{array}{c}{w}_{1}={b}_{0}{v}_{c}^{{b}_{1}}{f}^{{b}_{2}}{L}^{{b}_{3}}\\ {w}_{2}={a}_{B}t+{b}_{B}\left(1-{e}^{-{c}_{B}t}\right)\\ {w}_{3}={D}_{C}t\exp\left({A}_{C}+{B}_{C}t+{C}_{C}{t}^{2}\right)\end{array}\right.$$
(6)

where \({v}_{c}\) (mm/min) is the cutting speed, f (mm/min) is the cutting feed, and L (mm) is the cutting length. \({b}_{0}\), \({b}_{1}\), \({b}_{2}\), \({b}_{3}\), \({a}_{B}\), \({b}_{B}\), \({c}_{B}\) and \({A}_{C}\), \({B}_{C}\), \({C}_{C}\), \({D}_{C}\) are constants defined in the literature.

To compare the fitting performance of the models, their practicability was verified using the Milling dataset from the NASA data repository. The analysis shows that model w(t) has the best fitting performance for the tool wear degradation process [27]. This model is therefore used to preprocess the experimental data to obtain smoother and more regular wear rate data.

3.2 Physical information

Some existing physical models have good fitting performance. Nevertheless, in actual tool wear prediction these models often depend entirely on training with previous tool wear data, and the prediction is completely detached from the condition monitoring of the test tool, resulting in low prediction accuracy. Several studies have demonstrated that tool wear leads to an increase in cutting force and have validated the effectiveness of an improved Taylor tool life equation for tool wear estimation [40]. This physical model takes the cutting force signal as an important index for tool wear prediction and shows better prediction performance. The Taylor equation describing the relationship between cutting force and tool wear rate is as follows:

$$\frac{dw}{dt}=C{N}^{m}$$
(7)

where dw/dt represents tool wear rate, and C and m are constants. N represents tool cutting force.

According to the tool wear characteristics in Fig. 2, the tool wear process can be divided into three stages: the initial stage, the normal stage, and the severe stage. The tool wear rate in each stage follows its own pattern of change. Since the variation of the tool wear rate differs across the three stages, the relationship between cutting force and tool wear rate in each stage is dynamic. Therefore, the wear process is fitted piecewise, and these dynamic changes are directly reflected in the parameter changes across wear stages. According to the change in tool wear rate, the tool wear rate and the cutting force signal are divided into three stages:

$$N=[{N}_{\text{ini}},{N}_{\text{nor}},{N}_{\text{sev}}]$$
(8)
$$\Delta w=[\Delta {w}_{\text{ini}},\Delta {w}_{\text{nor}},\Delta {w}_{\text{sev}}]$$
(9)

where \(\Delta {w}_{\text{ini}}\), \(\Delta {w}_{\text{nor}}\), and \(\Delta {w}_{\text{sev}}\) are the wear rates and \({N}_{\text{ini}}\), \({N}_{\text{nor}}\), and \({N}_{\text{sev}}\) are the cutting forces of the three stages. Their values are substituted into Eq. (7) to obtain the optimized model parameters of the corresponding stages. The optimized parameters are then used to predict tool wear, which yields the physical information for the subsequent steps. As shown in Fig. 3, the prediction process of the first tool is taken as an example, with superscripts denoting different tools. The experimentally measured tool wear values are fitted to obtain the tool wear rate. The tool wear rate and the force signal are then split into the data of the three wear stages. All tools other than the first are used as training data. The stage data of the training tools compose the following arrays: initial wear stage: \(\left[\left[\Delta {w}_{\text{ini}}^{2},\Delta {w}_{\text{ini}}^{3},\cdots ,\Delta {w}_{\text{ini}}^{\text{n}}\right],\left[{N}_{\text{ini}}^{2},{N}_{\text{ini}}^{3},\cdots ,{N}_{\text{ini}}^{\text{n}}\right]\right]\); normal wear stage: \(\left[\left[\Delta {w}_{\text{nor}}^{2},\Delta {w}_{\text{nor}}^{3},\cdots ,\Delta {w}_{\text{nor}}^{\text{n}}\right],\left[{N}_{\text{nor}}^{2},{N}_{\text{nor}}^{3},\cdots ,{N}_{\text{nor}}^{\text{n}}\right]\right]\); severe wear stage: \(\left[\left[\Delta {w}_{\text{sev}}^{2},\Delta {w}_{\text{sev}}^{3},\cdots ,\Delta {w}_{\text{sev}}^{\text{n}}\right],\left[{N}_{\text{sev}}^{2},{N}_{\text{sev}}^{3},\cdots ,{N}_{\text{sev}}^{\text{n}}\right]\right]\). The array of each wear stage is substituted into the Taylor physical model to obtain the optimized model parameters of that stage, which serve as the physical model parameters of the first tool, \(\left(\left[{C}_{\text{ini}}^{1},{m}_{\text{ini}}^{1}\right],\left[{C}_{\text{nor}}^{1},{m}_{\text{nor}}^{1}\right],\left[{C}_{\text{sev}}^{1},{m}_{\text{sev}}^{1}\right]\right)\). Finally, the physical information \(\left[\Delta {w}_{\text{ini}}^{\text{1pre}},\Delta {w}_{\text{nor}}^{\text{1pre}},\Delta {w}_{\text{sev}}^{\text{1pre}}\right]\) of the first tool is obtained.
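The stage-wise parameter estimation can be sketched as follows. The text does not state which optimizer is used for C and m, so this example assumes an ordinary least-squares fit of the log-linearized form of Eq. (7), ln(dw/dt) = ln C + m ln N, applied separately to each stage. The stage data are synthetic and the names `fit_taylor`, `predict_rate`, `true_params`, and `train_forces` are hypothetical.

```python
import math


def fit_taylor(rates, forces):
    """Least-squares fit of ln(rate) = ln(C) + m*ln(N); returns (C, m).
    Assumes all rates and forces are strictly positive."""
    xs = [math.log(n) for n in forces]
    ys = [math.log(r) for r in rates]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((xv - mean_x) ** 2 for xv in xs)
    sxy = sum((xv - mean_x) * (yv - mean_y) for xv, yv in zip(xs, ys))
    m = sxy / sxx
    C = math.exp(mean_y - m * mean_x)
    return C, m


def predict_rate(C, m, force):
    """Wear rate predicted by Eq. (7): dw/dt = C * N**m."""
    return C * force ** m


# Synthetic per-stage training data pooled from the training tools (illustrative only)
true_params = {"ini": (0.02, 1.2), "nor": (0.005, 1.0), "sev": (0.001, 1.8)}
train_forces = {
    "ini": [20.0, 25.0, 30.0, 35.0],
    "nor": [40.0, 45.0, 50.0, 55.0],
    "sev": [60.0, 70.0, 80.0, 90.0],
}

fitted = {}
for stage, (c_true, m_true) in true_params.items():
    forces = train_forces[stage]
    rates = [c_true * n ** m_true for n in forces]
    fitted[stage] = fit_taylor(rates, forces)

# Physical information for a test tool: predicted wear rate from its own force signal
test_rate_nor = predict_rate(*fitted["nor"], 48.0)
```

Because each stage receives its own pair (C, m), the change in the force-wear relationship between stages appears directly as a change in the fitted parameters.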

Fig. 3 The flow chart of physical information

3.3 Hybrid physics data-driven model

The hybrid physics data-driven (HPDD) model is constructed mainly by fusing physical information and sensor monitoring signals as the input of a data-driven model. HPDD models are constructed for the three stages of the cutting tool, and the results of each stage are integrated to obtain the prediction results of the test tool. Compared with the traditional data-driven model, which is trained directly on monitoring signals, the HPDD model incorporates physical information into the data-driven model input, which improves the accuracy and interpretability of the model. First, features are extracted from the sensor monitoring signals (including cutting force signals, vibration signals, and acoustic emission signals) according to the piecewise setting of tool characteristics. Then the extracted features and the physical information are used together as feature indexes of the data-driven model. These indexes are input to the corresponding piecewise HPDD model to obtain prediction results.

To address noise and signal source contamination in the original sensor monitoring signals, a large number of data preprocessing methods have been proposed to obtain effective signal information. In this paper, a local feature extraction method is adopted. Compared with the traditional global feature extraction method, it can accurately capture subtle signal changes and eliminate some redundant information. Under small-sample prediction conditions, the traditional feature extraction method leads to a sharp decline in the number of samples. An insufficient number of training samples may lead to model overfitting and a decline in accuracy and robustness. Obtaining tool life cycle data takes a great deal of manpower and material resources, and tool wear datasets are small-batch samples in most cases, so the local feature extraction method makes it possible to quickly extract a large amount of effective information. Local features are used as the first input of the HPDD model to improve prediction accuracy.

The interpretability of data-driven models has been a focus of researchers. The real relationship between tool wear and various physical quantities is difficult to determine. Therefore, the data-driven model also adopts piecewise prediction according to tool wear characteristics. Because tool wear follows different patterns of change in each stage, piecewise prediction helps establish the relationship between input and output quickly and accurately. Physical information is applied as the second input of the HPDD model. Physical information is based on empirical knowledge accumulated from long-term studies of real tool wear. As unlabeled samples, physical information not only expands the training samples but also facilitates the exploration of hidden information beyond the local features, which helps to improve the interpretability and robustness of model prediction.

3.4 Physics data-driven fusion mechanism

Both the physical information and the HPDD model prediction results have a certain theoretical basis and reliability. Thus, this section combines the two types of information to obtain higher prediction accuracy. Bayesian theory is an effective method for dealing with random factors in data analysis. Accumulated knowledge, expert experience, and other important information enter the decision as prior knowledge, while measurement data from the monitoring system are used to continually update the posterior information, so Bayesian updating effectively improves the reliability of the posterior results. Therefore, Bayesian updating theory has wide application prospects in uncertainty problems of health status monitoring.

The tool wear process can be regarded as a nonlinear system. The physical information is taken as the state equation of the system \({f}_{k}\left(\cdot \right)\), and the predicted results of the HPDD model are used as the observation equation of the system \({h}_{k}\left(\cdot \right)\). The state space model can be defined as follows:

$${x}_{k}={f}_{k}\left({x}_{k-1},{u}_{k}\right)$$
(10)
$${y}_{k}={h}_{k}\left({x}_{k},{v}_{k}\right)$$
(11)

where \({x}_{k}\) and \({y}_{k}\) respectively represent the state value and the observed value at moment k, and \({u}_{k}\) and \({v}_{k}\) are the state noise and the observation noise, respectively. In order to describe the relationship between tool wear at the present moment and at the previous moment, the state equation in Eq. (10) can be rewritten as:

$${x}_{k}={y}_{k-1}+C{N}_{k-1}^{m}$$
(12)

where \({x}_{k}\) is the current tool wear state value, and \({y}_{k-1}\) is the final prediction result at the previous time, which is also the tool wear state value at the previous time. \(C{N}_{k-1}^{m}\) is the tool wear rate at the previous time. The HPDD model constructs the mapping relationship between tool wear and monitoring signals. The observation equation of tool wear is:

$${y}_{k}={\text{HPDD}}\left({z}_{k}\right)$$
(13)

where HPDD is the hybrid physics data-driven tool wear prediction model, and \({z}_{k}\) represents its input features, consisting of the local features and the physical information.

According to the above equations, the current tool wear state value and observed value can be calculated from the known tool wear state value at the previous moment and the current input value. Since the tool wear value is uncertain at every moment, tool wear can be assumed to be a first-order Markov process. For the current tool wear value \({x}_{k}\), the prior probability density can be derived as follows:

$$p\left({x}_{k}\left|{y}_{1:k-1}\right.\right)=\int p\left({x}_{k}\left|{x}_{k-1}\right.\right)p\left({x}_{k-1}\left|{y}_{1:k-1}\right.\right)d{x}_{k-1}$$
(14)

The posterior probability of xk updated by observation yk is:

$$p\left({x}_{k}\left|{y}_{1:k}\right.\right)=\frac{p\left({y}_{k}\left|{x}_{k}\right.\right)p\left({x}_{k}\left|{y}_{1:k-1}\right.\right)}{p\left({y}_{k}\left|{y}_{1:k-1}\right.\right)}$$
(15)
$$p\left({y}_{k}\left|{y}_{1:k-1}\right.\right)=\int p\left({y}_{k}\left|{x}_{k}\right.\right)p\left({x}_{k}\left|{y}_{1:k-1}\right.\right)d{x}_{k}$$
(16)
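For intuition, Eqs. (12) and (14) to (16) have a closed form in a special case. The sketch below assumes, purely for illustration, that the state noise and observation noise are additive, zero-mean, and Gaussian, and that the HPDD output observes the wear value directly. Under these assumptions the prior and posterior are Gaussian and the update reduces to a scalar Kalman-type correction. All numerical values and function names are hypothetical.

```python
def predict_prior(y_prev, var_prev, C, m, force_prev, process_var):
    """Prior for x_k: mean from Eq. (12), variance inflated by an assumed
    additive Gaussian state noise with variance process_var."""
    mean = y_prev + C * force_prev ** m
    return mean, var_prev + process_var


def bayes_update(prior_mean, prior_var, obs, obs_var):
    """Posterior of x_k from Eq. (15) for a Gaussian prior and a Gaussian
    likelihood p(y_k | x_k) centered on x_k with variance obs_var."""
    gain = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + gain * (obs - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


# Hypothetical values: previous fused wear 100 um, stage parameters C and m,
# previous cutting force 50 N, and an HPDD prediction of 103.5 um
prior_mean, prior_var = predict_prior(100.0, 4.0, 0.01, 1.0, 50.0, 5.0)
post_mean, post_var = bayes_update(prior_mean, prior_var, 103.5, 9.0)
```

The fused estimate lies between the physical prediction and the HPDD prediction, weighted by their variances, and its variance is smaller than either. When the Gaussian assumptions do not hold, the integrals in Eqs. (14) and (16) generally require sampling methods.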

Bayesian statistics problems involve multi-dimensional random variables and parameters, and the error of numerical integration methods increases significantly as the dimension rises. The Markov chain Monte Carlo (MCMC) method is an indirect sampling method that obtains sample values of a random variable by constructing a Markov chain. It circumvents the difficulty of sampling directly from an incompletely known probability distribution and thus provides an effective computing method for Bayesian algorithms. The Metropolis–Hastings (M-H) algorithm is widely used for this purpose. The fusion mechanism process for tool wear prediction based on Bayesian theory is shown in Fig. 4.
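A minimal random-walk Metropolis–Hastings sampler is sketched below for a one-dimensional posterior of the form in Eq. (15). The Gaussian prior and likelihood, the proposal step size, the sample count, and all names are assumptions made for illustration and do not reproduce the settings of this paper. Only the unnormalized posterior is needed, so the evidence in Eq. (16) is never computed.

```python
import math
import random


def log_gauss(value, mean, var):
    """Log of a Gaussian density up to an additive constant."""
    return -0.5 * (value - mean) ** 2 / var


def metropolis_hastings(log_target, x0, n_samples, step, rng, burn_in=1000):
    """Random-walk M-H with a symmetric Gaussian proposal."""
    x = x0
    lp = log_target(x)
    samples = []
    for i in range(n_samples + burn_in):
        cand = x + rng.gauss(0.0, step)
        lp_cand = log_target(cand)
        # Accept with probability min(1, target(cand) / target(x))
        if rng.random() < math.exp(min(0.0, lp_cand - lp)):
            x, lp = cand, lp_cand
        if i >= burn_in:
            samples.append(x)
    return samples


# Hypothetical prior (physical information) and observation (HPDD output), in um
PRIOR_MEAN, PRIOR_VAR = 100.5, 9.0
OBS, OBS_VAR = 103.5, 9.0


def log_posterior(x):
    """Unnormalized log posterior: log prior plus log likelihood."""
    return log_gauss(x, PRIOR_MEAN, PRIOR_VAR) + log_gauss(OBS, x, OBS_VAR)


rng = random.Random(0)
samples = metropolis_hastings(log_posterior, PRIOR_MEAN, 20000, 3.0, rng)
mh_mean = sum(samples) / len(samples)
mh_var = sum((s - mh_mean) ** 2 for s in samples) / len(samples)
```

For this Gaussian test case the exact posterior has mean 102 and variance 4.5, so the sample statistics should land close to those values. The same sampler applies unchanged when the prior or likelihood is non-Gaussian.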

Fig. 4 The flowchart of fusion mechanism

4 Experiment validation

4.1 Experimental data description

To evaluate the performance of the developed framework objectively, the public dataset from the “prognostic data challenge 2010” is selected. The experiment was carried out on a high-speed CNC milling machine (Roders Tech RFM760), using a 6 mm three-flute ball-nose tungsten carbide tool to mill a stainless steel workpiece (HRC52). In order to accelerate tool wear, dry milling was adopted, and a whole life cycle experiment was carried out for each of three milling cutters. A down-milling machining method was adopted. After each end-milling pass with a length of 108 mm, the cutting tool returned to the starting side of the milling path. The experimental processing parameters are shown in Table 1. A high cutting speed was selected in this experiment so that the wear monitoring signal data of the tool life cycle could be obtained faster under dry milling. Each cutting tool performed 315 end-milling passes with these machining parameters. In each milling pass, an acquisition card (NI DAQ PCI 1200) was used to collect the signals of a dynamometer (Kistler 9265B), a vibration sensor (Kistler 8636C), and an acoustic emission sensor (Kistler 8152) at a sampling frequency of 50 kHz. After each milling pass, the tool wear value was measured with a microscope (LEICA MZ12). The dynamometer measured the cutting force in the form of charge, which was converted into a voltage signal by a charge amplifier (Kistler 5019A) and then transmitted to the acquisition card. The tool wear experimental platform is shown in Fig. 5.

Table 1 Experimental machining parameter
Fig. 5 Tool wear experimental platform

The whole life cycle dataset of three tools (C1, C4, and C6) was obtained in the experiment, including the three-direction cutting force signal, the three-direction vibration signal, and the one-dimensional acoustic emission signal collected in the 315 end-milling passes, as well as the wear values of the three flutes measured after each pass. The maximum of the three flank wear values is used as the final tool wear value for each end-milling pass. Since tool life is defined as the time required to reach a predetermined flank wear width, the predetermined flank wear value of the tool is set to 160 μm in this paper. Two tools in the dataset are used as training data to predict the wear state of the remaining tool.
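The labeling and data split described above can be sketched as follows. The wear values are made up for illustration and the function names are hypothetical. Only the rule (maximum over the three flutes, 160 μm threshold, one tool held out for testing) comes from the text.

```python
def wear_labels(flute_wear):
    """Per-pass wear label: the maximum flank wear over the three flutes."""
    return [max(cut) for cut in flute_wear]


def cuts_to_threshold(labels, threshold=160.0):
    """1-based index of the first pass whose wear reaches the threshold,
    or None if the threshold is never reached."""
    for i, wear in enumerate(labels, start=1):
        if wear >= threshold:
            return i
    return None


# Made-up flank wear (um) of the three flutes after four milling passes
flute_wear = [
    (50.0, 48.0, 52.0),
    (120.0, 125.0, 118.0),
    (158.0, 161.0, 155.0),
    (170.0, 172.0, 169.0),
]
labels = wear_labels(flute_wear)
n_cuts = cuts_to_threshold(labels)

# Each tool is held out once for testing while the other two are used for training
tools = ["C1", "C4", "C6"]
splits = [([t for t in tools if t != test], test) for test in tools]
```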

4.2 Data preprocessing

4.2.1 Piecewise division of the degradation process

Due to the uncertainty of the experimental measurement data, it is difficult to divide the stages directly according to the change in tool wear rate. The experimentally measured tool wear data are therefore fitted with the fitting model in Eq. (4). The tool wear fitting results and wear rates are shown in Fig. 6, and their close agreement with the measured data reflects good fitting performance. Since the tool wear change is unknown during prediction, clear and uniform criteria for wear stage division are necessary. Cluster analysis [41] is performed on the tool wear rate, and the clustering results of the three tools are integrated to set the stage division shown in Fig. 6.

Fig. 6 Physical model fitted results and wear stage division. The first three panels are the physical model fit results of the tools (C1, C4, and C6). Experiment value: tool wear values measured by experiment; fitted value: tool wear results fitted by physical model; wear rate: tool wear rate results fitted by physical model. The last panel is the division of three wear stages

4.2.2 Local features extraction

To remove noise and redundant information from the monitoring signals and to accelerate model convergence, local features are extracted from the monitoring data. For each end-milling pass, the original data of each of the 7 channels are divided into 20 segments, and the time- and frequency-domain features listed in Table 2 are extracted from each segment. The original signal of each channel is thereby converted into a feature matrix of size (20, 11). The features of the three force channels, the three vibration channels, and the acoustic emission channel are concatenated into a matrix of size (20, 77). If n denotes the number of end-milling passes at which the tool flank wear reaches 160 μm, the total local feature size of each tool is (20, 77, n). A correlation analysis between each feature dimension and tool wear is then carried out, and 15 feature dimensions with high correlation are selected and normalized. After data preprocessing, a local feature matrix of size (20, 15, n) is obtained for each tool.
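
The shape bookkeeping of this step can be illustrated with a short NumPy sketch. It rests on several assumptions: the 11 features computed here are common time- and frequency-domain statistics used as stand-ins for the list in Table 2, the correlation is taken between the per-pass mean of each feature and tool wear, and min-max scaling is used for normalization. The function names are hypothetical.

```python
import numpy as np

FS = 50_000      # sampling frequency in Hz, from the text
N_SEGMENTS = 20  # segments per end-milling pass, from the text
EPS = 1e-12      # guards against division by zero


def segment_features(x: np.ndarray, fs: float = FS) -> np.ndarray:
    """Return 11 illustrative features of one segment (stand-ins for Table 2)."""
    x = np.asarray(x, dtype=float)
    mean, std = x.mean(), x.std()
    rms = np.sqrt(np.mean(x ** 2))
    peak = np.max(np.abs(x))
    z = (x - mean) / (std + EPS)
    # Power spectrum of the mean-removed segment
    power = np.abs(np.fft.rfft(x - mean)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    total = power.sum()
    return np.array([
        mean, std, rms, peak,
        x.max() - x.min(),                       # peak-to-peak
        np.mean(z ** 3),                         # skewness
        np.mean(z ** 4),                         # kurtosis
        peak / (rms + EPS),                      # crest factor
        (freqs * power).sum() / (total + EPS),   # spectral centroid
        freqs[np.argmax(power)],                 # dominant frequency
        total / x.size,                          # mean spectral power
    ])


def local_features(pass_signals: np.ndarray) -> np.ndarray:
    """Map one pass of shape (n_samples, n_channels) to (20, 11 * n_channels)."""
    segments = np.array_split(pass_signals, N_SEGMENTS, axis=0)
    return np.stack([
        np.concatenate([segment_features(seg[:, c]) for c in range(seg.shape[1])])
        for seg in segments
    ])


def select_and_normalize(features: np.ndarray, wear: np.ndarray, k: int = 15):
    """Keep the k features most correlated with wear; features has shape (20, F, n)."""
    per_pass = features.mean(axis=0)                      # (F, n)
    f_c = per_pass - per_pass.mean(axis=1, keepdims=True)
    w_c = wear - wear.mean()
    corr = np.abs(f_c @ w_c) / (np.linalg.norm(f_c, axis=1) * np.linalg.norm(w_c) + EPS)
    keep = np.sort(np.argsort(corr)[-k:])
    sel = features[:, keep, :]
    lo = sel.min(axis=(0, 2), keepdims=True)
    hi = sel.max(axis=(0, 2), keepdims=True)
    return (sel - lo) / (hi - lo + EPS), keep
```

With 7 channels, `local_features` returns a (20, 77) matrix per pass; stacking n passes along a third axis and calling `select_and_normalize` gives the (20, 15, n) input described above.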

Table 2 List of extracted features

4.3 Performance evaluation

4.3.1 Results discussion

The Taylor model parameters are optimized on the training data. The optimized parameters are then combined with the cutting force monitoring signals of the test tool, and the predicted tool wear rate is obtained by leave-one-out cross-validation, as shown in Fig. 7. The wear rate obtained by differencing the measured tool wear is highly dispersed. The fitting parameters related to tool wear rate differ between stages, so fitting them stage by stage reduces the difficulty of parameter estimation. The tool wear derived from the predicted wear rate is used as physical guidance information for the hybrid physics data-driven (HPDD) model. As the second input of the data-driven model, this physical information breaks the limitation of traditional methods that consider only labeled samples and makes full use of unlabeled samples to enhance the accuracy and physical consistency of the prediction results.
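
One simple way to derive tool wear from a predicted wear rate is cumulative summation over passes. The sketch below assumes a wear rate expressed per pass and a known initial wear; the function name and both parameters are hypothetical, and the paper's actual integration scheme may differ.

```python
from itertools import accumulate
from typing import List, Sequence


def wear_from_rate(rates: Sequence[float], initial_wear: float = 0.0,
                   step: float = 1.0) -> List[float]:
    """Wear after each pass: w_k = w_0 + sum of rate_i * step for i <= k."""
    increments = (r * step for r in rates)
    # accumulate(..., initial=w_0) yields w_0 first; drop it to align with passes
    return list(accumulate(increments, initial=initial_wear))[1:]
```

For example, rates of 1.0, 0.5, and 0.25 μm per pass starting from 10 μm give wear values of 11.0, 11.5, and 11.75 μm.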

Fig. 7 Physical model wear rate prediction. Experiment value: tool wear rate values calculated by difference according to experimental measurement tool wear values; fitted value: tool wear rate values fitted by the physical fitting model; physical predicted value: tool wear rate values predicted by the physical prediction model

Several classical deep learning and machine learning methods from the PHM field are selected for comparison: convolutional neural network (CNN), LSTM, SVR, and bidirectional gated recurrent unit (Bi-GRU) network. All four models take the extracted local features as inputs, with the same settings for loss function, learning rate, optimizer, and number of iterations, and their predicted results are compared. Figure 8 shows that Bi-GRU achieves the best prediction accuracy among the four methods in terms of the root mean square error (RMSE) and mean relative error (MARE) of its prediction results. Therefore, Bi-GRU is selected as the data-driven model in the HPDD method. The Bi-GRU configuration is: one bidirectional GRU layer (number of units: 10) and one fully connected layer (number of units: 10); learning rate: 0.001; optimizer: Adam; iterations: 100.

The HPDD model is constructed from the Bi-GRU: the extracted local features are the first input of the model, and the prediction result of the Taylor model is the physical information input. The prediction results and errors of the HPDD model are shown in Fig. 9. Compared with the Bi-GRU model without physical information, the HPDD model has higher prediction accuracy. The HPDD model improves accuracy and interpretability through the physical information input and, to a certain extent, eases the difficulty of model building. The physical information input not only expands the sample with unlabeled data but also helps the data-driven model quickly extract tool wear information from the local features, providing a theoretical basis for accurate tool wear prediction.
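
For reference, the two evaluation metrics can be computed as in the sketch below. The RMSE definition is standard; the MARE definition used here (mean of the absolute error divided by the measured value) is an assumption, since the paper's exact formula and scaling are not reproduced in this section.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between measured and predicted wear."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mare(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean relative error as a fraction (assumed definition).

    Multiply by 100 for a percentage. Measured values must be nonzero.
    """
    return sum(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)
```

Both functions take the measured wear first and the predicted wear second, so the same pair of sequences can be scored with either metric.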

Fig. 8 Data-driven model comparison

Fig. 9 Comparison of proposed method predicted results with various independent methods (C1). Comparison of accuracy of prediction results by 5 methods (proposed model, HPDD model, Bi-GRU model, Taylor model, and fitting model); experiment value: tool wear values measured by experiment; dotted line: the prediction results of various methods; error: the absolute error between the predicted result and the experimental measurement; confidence interval: the distribution of predicted results of the proposed method

The physical information and the prediction results of the HPDD model are combined in a Bayesian framework, and the final prediction results are obtained by drawing Bayesian posterior samples with Markov chain Monte Carlo (MCMC). In this paper, the Metropolis-Hastings (M-H) algorithm is used to compute posterior samples of tool wear. A total of 10,000 sampling cycles are carried out: the first 5,000 cycles serve as a burn-in period that lets the chain converge, and the last 5,000 updated samples are taken as samples of the posterior distribution of the current tool wear state. The prediction results are shown in Fig. 9 and Fig. 10. Bayesian theory addresses prediction uncertainty from the perspective of probability distributions. The confidence interval of the prediction is given by the posterior sample distribution of tool wear, and the mean of each posterior distribution is taken as the final tool wear prediction. Compared with the experimentally measured tool wear, the prediction results are highly accurate, and the predictive distribution represented by the confidence interval shows that the method predicts stably. The Bayesian framework fuses the two kinds of information and gradually reduces the error of both results, improving the accuracy of the fused prediction.
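
The sampling step can be sketched for a single wear state as follows. Only the M-H algorithm, the 10,000 cycles, the 5,000-cycle burn-in, and the use of the posterior mean and interval come from the text. The Gaussian prior centered on the physical prediction, the Gaussian likelihood for the HPDD prediction, the standard deviations, the random-walk proposal, and the name `mh_fuse` are all assumptions made for illustration.

```python
import math
import random
from typing import Tuple


def log_gauss(x: float, mu: float, sigma: float) -> float:
    """Log of a Gaussian density, dropping the constant term."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma)


def mh_fuse(prior_mu: float, prior_sigma: float, obs: float, obs_sigma: float,
            n_iter: int = 10_000, burn_in: int = 5_000,
            step: float = 2.0, seed: int = 0) -> Tuple[float, float, float]:
    """Random-walk Metropolis-Hastings for one tool wear state.

    prior_mu / prior_sigma: physical (Taylor) prediction and assumed spread.
    obs / obs_sigma: HPDD prediction and assumed observation noise.
    Returns the posterior mean and an approximate 95% interval.
    """
    rng = random.Random(seed)

    def log_post(w: float) -> float:
        return log_gauss(w, prior_mu, prior_sigma) + log_gauss(obs, w, obs_sigma)

    w = prior_mu
    lp = log_post(w)
    kept = []
    for i in range(n_iter):
        cand = w + rng.gauss(0.0, step)
        lp_cand = log_post(cand)
        # Symmetric proposal: accept with probability min(1, posterior ratio)
        if lp_cand >= lp or rng.random() < math.exp(lp_cand - lp):
            w, lp = cand, lp_cand
        if i >= burn_in:  # discard the burn-in cycles
            kept.append(w)
    kept.sort()
    post_mean = sum(kept) / len(kept)
    lo = kept[int(0.025 * len(kept))]
    hi = kept[min(len(kept) - 1, int(0.975 * len(kept)))]
    return post_mean, lo, hi
```

Under these Gaussian assumptions the posterior is available in closed form, so the sampler is not strictly needed; it is shown because M-H carries over unchanged to priors and likelihoods without a closed-form posterior.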

Fig. 10 Tool wear prediction results (C4, C6). The legend is the same as Fig. 9. The bar chart is a comparison of 5 methods (1, proposed model; 2, HPDD model; 3, Bi-GRU model; 4, Taylor model; and 5, fitting model)

A comparative experiment is conducted among the hybrid physics data-driven model-based fusion framework proposed in this paper, the HPDD model, the Bi-GRU model, the Taylor extended formula, and the fitting model. Figure 9 shows the prediction results for tool C1 and the absolute difference between the experimental measurements and the predicted values of each model. The comparison results of the five methods for tools C4 and C6 are shown in Fig. 10. These results show that the proposed method outperforms the other methods: its prediction error is the smallest, and its convergence is the best from the start to the end of prediction. The RMSE and MARE of the prediction results show the accuracy and stability of the proposed method more intuitively.

Some existing physical models do not take factors such as the processing environment and cutting parameters into account; they rely on empirical models that summarize a large amount of tool wear data and observed wear trends, and they show good fitting performance. However, their trend prediction is completely divorced from the actual machining of the test tool and relies only on training with historical data, so high-precision prediction results are difficult to obtain. The prediction performance of the fitting model in the figures differs greatly between tools, and its overall prediction performance is the worst among the compared methods. This model is therefore better suited to preprocessing tool wear data, where it removes, to a certain extent, the errors in the measured tool wear and maintains physical consistency between the experimental data and the wear process. The Taylor model adopted in this paper incorporates the cutting force signals monitored by sensors into the physical model. Considering the characteristics of tool wear, piecewise prediction is carried out according to how the wear rate changes in different stages, which improves prediction ability compared with the fitting model. The HPDD model takes the local features of the sensor monitoring signals and the physical information as inputs simultaneously, which helps the deep learning network learn tool wear information from more dimensions. As the comparison in Fig. 9 shows, the HPDD model, as a physics data-driven fusion method, predicts tool wear better than the independent Bi-GRU model and the Taylor model. The proposed method combines the physical information from the Taylor model with the HPDD model through Bayesian theory to obtain the final prediction: the physical information serves as prior information to improve the interpretability of the model, and the prediction results of the HPDD model serve as observation information that continually updates the prediction to improve its accuracy and robustness.

4.3.2 Comparison with other methods

In recent years, much research has been carried out on tool wear, and this dataset has been widely used in the field of tool health monitoring to demonstrate the performance of methods. In this paper, many related works are summarized to assess the effectiveness of the proposed method objectively. Common machine learning and deep learning algorithms such as MLP [42], SVR [43], CNN [44], and recurrent neural network (RNN) [17] have a certain predictive effect. However, because these algorithms are general-purpose and not specific to tool wear, their prediction accuracy is generally low. Improved models for tool wear have been proposed in the literature, including the CNN combined with LSTM model (CNN + LSTM) [45], the time-distributed ConvLSTM model (TDConvLSTM) [46], parallel convolutional neural networks with multi-scale feature fusion and channel attention mechanism (FFCA + PCNN) [47], the physics guided neural network (PGNN) [48], and the deep heterogeneous GRUs model (DHGRU) [49], which significantly improve prediction performance compared with those general networks. For the three tools, the RMSE of the proposed method is 2.29, 1.77, and 2.29, and the MARE is 2.14, 2.27, and 1.65, respectively. To analyze the prediction performance of different methods quantitatively, the prediction results of each method are shown in Fig. 11 and Table 3. The comparison and analysis indicate that the hybrid physics data-driven model-based fusion framework for machining tool wear prediction in this paper is the most reliable and accurate among the compared methods.

Fig. 11 Comparison with existing methods

Table 3 Comparison with existing methods

5 Conclusions

Tool wear has a significant impact on machining accuracy and part quality. A reliable method is required to monitor and predict tool wear conditions. In this paper, a hybrid physics data-driven model-based fusion framework for tool wear prediction is proposed. Some conclusions can be drawn as follows:

  • a) To address the low prediction accuracy and poor generalization ability of physical models and the lack of interpretability and physical consistency of data-driven models, the proposed method combines the advantages of both kinds of models to improve prediction accuracy.

  • b) Physical information is used as an input to the data-driven model to build the HPDD model. Using unlabeled samples for data expansion overcomes the shortage of labeled samples. The model integrates data mining and physical theory effectively and explores the hidden relation between input and output.

  • c) Factors such as sensor noise and measurement techniques cause uncertainty in the physical information and in the HPDD model prediction results. A Bayesian fusion mechanism is introduced to integrate the two types of information. The results demonstrate that high-precision tool wear prediction is achieved and that the prediction error is significantly reduced compared with independent models and existing methods.