1 Introduction

High precision and high efficiency are the goals pursued by the modern manufacturing industry; however, chatter during milling severely reduces surface quality and processing efficiency. According to its mechanism, chatter is mainly divided into regenerative chatter and mode coupling chatter. Regenerative chatter has been studied more widely because it occurs before mode coupling chatter [1]. Studies on regenerative chatter mainly focus on chatter prediction [2, 3], chatter identification [4, 5], and chatter control [6]. Chatter prediction can offer suitable machining parameters to avoid chatter; however, chatter-free machining is difficult to realize because of the time dependence and uncertainties in the dynamic characteristics of the flexible workpiece [7]. By contrast, chatter identification and control have been verified to be reliable. Therefore, chatter identification and control are receiving increasing attention as means of monitoring chatter, as shown in Fig. 1. Chatter identification diagnoses chatter in time or in advance by applying signal processing methods to the signals measured during processing. The process mainly includes signal acquisition, feature extraction, and pattern recognition.

Fig. 1 Flow chart of chatter monitoring

The signals must first be collected; they include the cutting force signal [8], acceleration signal [9, 10], displacement signal [11, 12], and acoustic emission signal [13, 14]. Besides the abovementioned signals, signals available from the machine tool itself, such as motor current, are also used to monitor chatter. However, Aslan and Altintas [15] pointed out that the effects of the structural dynamic modes of the spindle should be compensated via a proposed observer. Luo et al. [16] presented an instrumented wireless milling cutter system with thin-film sensors embedded in each cutting insert; thus, the cutting forces acting on each cutting edge could be measured without reducing the stiffness and dynamic characteristics of the machining system. Axinte et al. [17] experimentally verified that the cutting force was more sensitive to chatter than the acoustic emission and vibration signals. Hence, the force signal is chosen in this work.

After the data are collected, feature extraction must be performed on the original signal to obtain the chatter index. At present, chatter feature extraction methods comprise time-domain, frequency-domain, and time-frequency methods [18]. Among them, only time-frequency methods can localize a signal in time and frequency simultaneously. Time-frequency methods include the short-time Fourier transform (STFT), wavelet (WV) analysis, and ensemble empirical mode decomposition (EEMD). The STFT is built on the Fourier transform with a fixed window, so it is poorly suited to nonstationary signals. As for WV, how to choose the wavelet basis function and decomposition level reasonably remains an open question. The EEMD has been extensively employed in chatter detection [19, 20] because it is a self-adaptive analysis method for nonlinear and nonstationary signals [21].

In addition, some nonlinear dimensionless indicators can be applied to chatter detection, such as approximate entropy (AE) [22], sample entropy (SE) [23], and energy entropy (EE) [24, 25]. Canales [22] proposed a chatter identification method using the AE index and stated that AE can estimate the randomness content in chattering signals. Yang et al. [23] used AE and SE to detect the onset of chatter. Compared with AE, SE is more consistent and relies less on data length when measuring time series complexity, because SE modifies some steps of the AE calculation. Besides these time-domain indicators, EE can serve as an important indicator because it describes changes in the energy distribution [24, 25].

Chatter identification is essentially a pattern classification problem, whose purpose is to establish a mapping between chatter feature vectors and cutting conditions (stable, chatter). In general, a constant or varying threshold [23, 26] is set to judge whether chatter occurs in the milling process. Recently, with the continuous development of artificial intelligence (AI), more sophisticated algorithms have been developed, including the artificial neural network (ANN) model [27] and the support vector machine (SVM) [28,29,30]. Among these algorithms, the SVM can overcome the problems of multiple local minima and over-fitting, and it is suitable for high-dimensional, small-sample, and nonlinear cases [11]. However, the penalty factor (c) and the kernel function (KF) parameter (g) strongly affect the identification results, so it is necessary to search for the best parameters for the SVM.

Owing to the requirement of lightweight design in various fields of industrial manufacturing, the proportion of thin-walled parts is rising quickly [31]. However, vibration (chatter) easily occurs when machining thin-walled parts because of their low stiffness and deformation. Recently, scholars have begun to pay attention to chatter detection in the machining of thin-walled parts. To account for both the time-varying and position-dependent characteristics of milling a thin-walled workpiece, Liu et al. [26] presented a varying threshold method to detect chatter. Wang et al. [11] proposed the Q-factor to show the change of bandwidth caused by chatter; however, the Q-factor was calculated with linear predictive analysis, whereas chatter is a complex nonlinear phenomenon. To describe the nonlinear characteristics of chatter, Dong [9] used a complexity index to detect chatter. Ye et al. [32] defined the coefficient of variation (CV) and verified that the CV was robust to different machining materials and machining parameters. The abovementioned complexity index and CV are calculated only in the time domain, so the frequency-domain information is missing. Given the nonstationary characteristics of chatter, Gao et al. [33] applied the cmor continuous wavelet transform (CMWT) to locate the time and frequency of chatter. However, the natural frequency of the thin-walled workpiece and the tooth-passing frequency of the milling system must be obtained in advance.

In summary, chatter in the milling of thin-walled workpieces becomes more complicated, nonlinear, and nonstationary because of the lower stiffness. The EEMD and nonlinear dimensionless indicators should therefore be suitable for chatter detection in the milling of thin-walled workpieces. Because unstable chattering is related to the emergence of random dynamics [22], SE is well suited for predicting chatter formation. However, it is very difficult to extract chatter indicators from raw signals that contain considerable noise caused by the deformation and low stiffness of the thin-walled workpiece, and a single chatter indicator may not be valid for the thin-walled workpiece. Hence, another indicator is needed. EE captures the energy redistribution caused by chatter, so it is adopted as a complementary indicator for chatter detection. Therefore, chatter detection methods for the peripheral milling of thin-walled workpieces using SE and EE are presented in this paper.

The rest of this article is organized as follows. The chatter detection method using SE and EE is proposed in Section 2. As the EEMD is adopted to decompose the raw signal, it is first introduced in Section 2.1. In Section 2.2, the cutting states are determined through a stability lobe diagram (SLD). Modal parameters are an important input of the SLD, so modal experiments are conducted in Section 2.2.1. Then, the SLD is drawn and verified in Section 2.2.2. Section 2.3 introduces the mathematical models of SE and EE to show their validity for chatter detection. Subsequently, in Section 3, the chatter features are extracted based on the experimental setup after the raw data are decomposed into a set of intrinsic mode functions (IMFs) by the EEMD, and a correlation analysis is made to retain the chatter-related IMFs. To further capture the chatter information automatically, principal component analysis (PCA) is used in Section 4. After that, chatter detection using an improved nonlinear SVM is conducted in Section 5, and the identification results are verified by experiments. Finally, Section 6 concludes the paper.

2 The proposed chatter detection method

Figure 2 shows the proposed chatter detection method using SE and EE. The raw measured cutting forces are decomposed by the EEMD, whose effectiveness is verified with a simulation signal. A set of IMFs is thus obtained. After that, a correlation analysis is conducted to retain the chatter-related IMFs. The SE and EE of the retained IMFs are extracted as the initial chatter feature vector. The change analysis of SE and EE shows that IMF1 contains more chatter information than the other IMFs. Therefore, correlation analysis alone is not sufficient to isolate the chatter information. To further reduce the feature vector dimension automatically, the PCA is adopted after SE and EE are extracted from the 4 retained IMFs. The selected features are then used to train the SVM. Chatter detection using an improved nonlinear SVM is conducted, and the identification results are verified by experiments. In this part, a suitable KF is first chosen, and the parameters of the selected KF are optimized by GA and GE. In addition, the surface profile is used to further verify the effectiveness of the SVM.

Fig. 2 Flow chart of the chatter detection method using SE and EE

At the same time, the cutting states of the signals (stable cutting and unstable cutting in this paper) are determined by off-line chatter prediction. The identified cutting states serve as the labels for the training and test sets. However, the SLD needs to be drawn and validated first. The SLD is affected by the dynamic behavior, which can be measured through modal experiments, so modal experiments are conducted. To obtain a more accurate transfer function, the concept of a relative transfer function is introduced through the modal analysis of the tool and workpiece. Then, the SLD is drawn and verified by experiments. Based on the verified SLD, different cutting conditions are chosen, and cutting force signals are collected under these conditions to form the raw data set.

2.1 EEMD analysis of simulation signal

Compared with the wavelet transform, the EEMD is better suited to analyzing nonlinear and nonstationary data because its basis functions are determined by the data itself, whereas the wavelet transform requires selecting a wavelet basis [21]. Hence, in this paper, the EEMD is chosen to decompose the raw signal into a set of IMFs and a residue. The EEMD was proposed by N.E. Huang [34] to solve the modal aliasing in the empirical mode decomposition (EMD) algorithm. The EEMD takes full advantage of the uniform spectral distribution of white noise by repeatedly adding different white-noise series to the signal. The noise-containing signals are then decomposed by the EMD. Because white noise has zero mean, its influence cancels out when the results are averaged. The specific steps of the EEMD are as follows:

(1) Determine the ensemble size m and add the ith white-noise series ni(t) (1 ≤ i ≤ m), with zero mean and constant standard deviation, to the original signal x(t). The new signal xi(t) containing the ith white noise is given by:

$$ {x}_i(t)=x(t)+{n}_i(t) $$
(1)

(2) Decompose xi(t) by the EMD. The jth IMF is written as cij(t) and the residual as ri(t), so xi(t) is given by:

$$ {x}_i(t)=\sum_{j=1}^I{c}_{ij}(t)+{r}_i(t) $$
(2)

where I denotes the number of IMFs of each trial.

(3) Repeat steps (1) and (2) for all m trials. The ensemble mean cj(t) of the m trials is then computed for each IMF of the decomposition:

$$ {c}_j(t)=\frac{1}{m}\sum_{i=1}^m{c}_{ij}(t),\kern1em j=1,2,\cdots, I $$
(3)

Each cj (j = 1, 2, ⋯, I) is taken as a final IMF of the EEMD. Because white noise is added over the entire time-domain sequence, the noise cancels in the ensemble mean and the modal aliasing resulting from the uneven distribution of extreme points can be eliminated. To demonstrate the validity of the EEMD, a simulation signal is constructed and then decomposed by the EEMD. Figure 3a shows the constructed signal, in which the low-frequency sinusoid x1 = sin(30πt) is superimposed with intermittent segments of the high-frequency signal x2 = 0.3 cos(240πt). The simulation signal is thus x0 = x1 + x2.
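The ensemble-averaging steps (1) to (3) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the function name `eemd_average`, the default noise level, and the `decompose` argument are introduced here for illustration only. `decompose` is a hypothetical placeholder for any EMD routine and is assumed to return the same number of IMFs on every call.

```python
import random


def eemd_average(x, decompose, m=100, noise_std=0.2, seed=0):
    """Average the IMFs obtained from m noisy copies of x (Eqs. 1-3).

    `decompose` is a hypothetical stand-in for an EMD routine: it takes a
    list of samples and returns a list of IMFs (each a list of samples).
    It is assumed to return the same number of IMFs for every trial.
    """
    if m < 1:
        raise ValueError("the ensemble size m must be at least 1")
    rng = random.Random(seed)
    sums = None
    for _ in range(m):
        # Eq. (1): add a zero-mean white-noise series to the signal.
        noisy = [v + rng.gauss(0.0, noise_std) for v in x]
        # Eq. (2): decompose the noisy copy into IMFs.
        imfs = decompose(noisy)
        if sums is None:
            sums = [[0.0] * len(x) for _ in imfs]
        for j, imf in enumerate(imfs):
            for k, value in enumerate(imf):
                sums[j][k] += value
    # Eq. (3): ensemble mean of each IMF over the m trials.
    return [[s / m for s in row] for row in sums]
```

Passing a trivial decomposer such as `lambda s: [list(s)]` returns a single component close to `x`, which illustrates how the added noise cancels in the ensemble mean as m grows.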

Fig. 3 The analysis of the simulation signal. a Simulation signal and its composition. b EMD result of simulation signal. c EEMD result of simulation signal. d Hilbert time-frequency spectrum from EEMD

IMF1-IMF5 and the residual of the simulation signal are obtained by the EMD and the EEMD, as shown in Fig. 3 b and c, respectively. As can be seen, IMF1 coincides with the high-frequency component of the simulation signal x0, and IMF2 is similar to the low-frequency component of x0. However, modal aliasing appears in the EMD result. Besides, the false IMFs produced by the EEMD have smaller amplitudes than those produced by the EMD. The Hilbert time-frequency spectrum obtained from the IMFs of the EEMD is shown in Fig. 3d. The frequencies are clearly separated by the EEMD. So the EEMD can be used to decompose the raw signal and obtain more accurate IMFs.

2.2 Cutting state identification based on SLD

2.2.1 Modal experiment setup and results

An important input of the SLD is the dynamic behavior, which can be measured by a modal experiment. A modal experiment is mainly composed of vibration excitation, vibration pickup, and data analysis. The setup is shown schematically in Fig. 4a. During the experiment, the single-point excitation and single-point response test method was used to measure and analyze the frequency response function of the milling system. To obtain a more accurate result, the hammer test was repeated three times and the results were averaged. As a result, the modal parameters of the processing system and the corresponding vibration were obtained. The connection of the modal experiment is shown in Fig. 4b.

Fig. 4 Modal experiment. a Schematic diagram. b Connection diagram. c Modal test for workpiece. d Modal test for tool

The modal experiment was conducted on a TH5650 vertical machining center produced by Shenyang Machine Tool Company. First, the acceleration sensor was fixed on the cutter mounted in the machine tool spindle. The mounting position was kept close to the tool tip and within the effective sensing range of the acceleration sensor. Similarly, the acceleration sensor was attached to the workpiece. The measured signal was transmitted to the data acquisition device, and the transfer function of the system was then calculated and analyzed with dedicated software on the computer. The modal tests of the workpiece and the tool are shown in Fig. 4 c and d, respectively.

The frequency response function of the tool in x direction is shown in Fig. 5a; the frequency response function of the tool in y direction is shown in Fig. 5b. The modal parameters corresponding to the tool in x and y directions are shown in Table 1.

Fig. 5 The frequency response function of the cutter. a x-direction. b y-direction

Table 1 The modal parameters of the cutter

Similarly, the modal experiments of the workpiece were conducted, and the result is shown in Fig. 6a. The modal parameters of the workpiece are shown in Table 2. The modal fit of the thin-walled workpiece has a relatively large error owing to the complexity introduced by its deformation and lower stiffness; however, the peaks almost coincide. Based on the parameters of the tool and the workpiece obtained from the modal tests, the relative transfer function is used because it has been shown to give a proper transfer function of the machine tool [35]. The transfer functions of the tool, the workpiece, and the machine tool are compared in Fig. 6 b and c.

Fig. 6 Modal parameters and relative transfer function. a The frequency response function of the workpiece. b The amplitude of the three transfer functions. c Image of the three transfer functions

Table 2 The modal parameters of the workpiece

According to the relative transfer function theory, the relative transfer function of the machine tool depends on the relative relationship between the tool and the workpiece. As can be seen from Fig. 6, at lower frequencies the transfer function of the machine tool is dominated by the modes of the workpiece, whereas at higher frequencies it is controlled by the modes of the tool. Hence, the relative transfer function of the machine tool is taken as the input of the chatter stability simulation.

2.2.2 Stability prediction and experimental verification

Another important input of the SLD is the set of cutting force coefficients. From slot milling experiments at different feed rates and the related calculations, the tangential cutting force coefficient Kt is 1773.1 N/mm² and the radial cutting force coefficient Kr is 558.2 N/mm². The SLD is then calculated, as shown in Fig. 7a. To verify the effectiveness of the SLD, two points representing stable (B) and chatter (A) cutting parameters were tested, and the surface profiles after machining were observed with a 3D profiler and a scanning electron microscope (SEM). The detailed processing parameters of A and B are listed in Table 5. During the experiment (Fig. 8), carbide-coated flat tools were used. The tool length is 70 mm, the overhang length is 45 mm, the tool diameter is 10 mm, and the tool has 4 teeth. The workpiece material is titanium alloy. The radial depth of cut is 0.5 mm, and down milling is used. The observed surface profiles are shown in Fig. 7 b and c.

Fig. 7 SLD and machining surfaces under different cutting conditions. a SLD diagram. b Results through SEM. c Results through the 3D profile

Fig. 8 Experimental platform

As shown in Fig. 7 b and c, the surface at point B has no obvious vibration marks (no chatter), whereas the surface at point A has obvious vibration marks (chatter). The observed results are consistent with those predicted by the SLD. Therefore, the processing state under different processing parameters can be determined through the SLD.

2.3 Mathematical model of SE and EE

In thermodynamics, entropy is essentially a measure of the degree of disorder. However, traditional entropy concepts require an infinite data series with infinitely accurate precision and resolution [22]. So in the early 1990s, the concept of AE was proposed by Pincus [36] to quantify the regularity and complexity of time series. The higher the AE, the higher the probability of generating a new pattern within a time series. This theory has been successfully applied to the analysis of biological time series. However, AE counts self-matches, which makes its results more dependent on the data length and leads to inconsistent results. To overcome these shortcomings, SE was proposed by Richman and Moorman [37]. In the mathematical model of SE, the self-matching of vectors is not included in the probability calculation. Therefore, SE relies less on data length and takes less computation time than AE. Besides, SE is a nonlinear measure and is suited to nonstationary signals. Furthermore, the occurrence of chatter redistributes the frequency content of the signal. As a consequence, the probability of generating a new subsequence in the time series rises. Therefore, changes in SE can indicate the occurrence of chatter; that is, SE, a time-domain indicator, can be used to detect chatter.

For a time series of N points x(1), x(2), ⋯, x(N), SE is calculated as follows:

(a) Form m-dimensional vectors from the original time series:

$$ {X}_m(i)=\left\{x(i),x\left(i+1\right),\cdots, x\left(i+m-1\right)\right\},1\le i\le N-m+1 $$
(4)

(b) Define the distance between \( {X}_m(i) \) and \( {X}_m(j) \) as the maximum absolute difference between their corresponding elements:

$$ d\left[{X}_m(i),{X}_m(j)\right]=\underset{k}{\max}\left(\left|x\left(i+k\right)-x\left(j+k\right)\right|\right),k\in \left[0,m-1\right] $$
(5)

(c) Given a tolerance r, count the number of vectors \( {X}_m(j) \) (j ≠ i) for which \( d\left[{X}_m(i),{X}_m(j)\right]\le r \), recorded as \( {A}_i \). The ratio of \( {A}_i \) to N − m + 1 is written as:

$$ {B}_i^m(r)=\frac{A_i}{N-m+1},1\le i\le N-m $$
(6)

(d) Compute the average \( {B}^m(r) \) of \( {B}_i^m(r) \):

$$ {B}^m(r)=\frac{1}{N-m}\sum_{i=1}^{N-m}{B}_i^m(r) $$
(7)

(e) Similarly, the average \( {B}^{m+1}(r) \) of \( {B}_i^{m+1}(r) \) can be obtained. Thus, the theoretical SE of the time series is defined as:

$$ \mathrm{SampEn}\left(m,r\right)=\underset{N\to \infty }{\lim}\left\{-\ln \frac{B^{m+1}(r)}{B^m(r)}\right\} $$
(8)

When N is finite, the estimated value of the SE is given by:

$$ \mathrm{SampEn}\left(m,r,N\right)=-\ln \frac{B^{m+1}(r)}{B^m(r)} $$
(9)

From the above formula of SE, it can be seen that the values of m and r affect the calculation results, so the choice of these two parameters is important. In this paper, m = 1 and r = 0.1SD are chosen (SD is the standard deviation of the original time series). Because SE is less sensitive to data length, the number of data points can be kept as small as possible provided the computed results remain stable. So N = 4000 is selected in this paper.
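As an illustration of the SE calculation, a minimal pure-Python sketch is given below. It is not the code used in this paper. The function name `sample_entropy` and the two test signals are introduced here for illustration. As an assumption following the common Richman and Moorman convention, the same N − m templates are used for lengths m and m + 1 and raw match counts are compared, so the normalising constants cancel in the ratio. Self-matches are excluded, and r is set to a fraction of the standard deviation of the series.

```python
import math
import random


def sample_entropy(x, m=1, r_factor=0.1):
    """Estimate SampEn(m, r, N) of the sequence x with r = r_factor * SD."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    r = r_factor * sd

    def count_matches(length):
        # Count template pairs (i, j), j > i, whose Chebyshev distance is
        # at most r. Pairs with j == i (self-matches) are never counted.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(x[i + k] - x[j + k]) <= r for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined for this data; reported as infinite
    return -math.log(a / b)


# Illustrative signals (not data from this paper).
regular = [math.sin(2 * math.pi * k / 20) for k in range(300)]
rng = random.Random(0)
irregular = [rng.gauss(0.0, 1.0) for _ in range(300)]
```

A periodic signal such as `regular` should give a lower value than the random sequence `irregular`, which mirrors the argument that SE rises when new, less predictable subsequences appear. The double loop is quadratic in N, so a vectorised implementation would be preferable for N = 4000.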

Owing to the complexity of chatter in thin-walled part machining, another indicator is needed in addition to the time-domain feature. According to related references, the energy distribution in the milling process changes with the cutting condition. In the stable cutting stage, energy is chiefly concentrated at the machine-dominant frequency and its harmonics. In the unstable cutting stage, the amplitude at the chatter frequencies increases dramatically, which means that energy concentrates in the frequency bands containing the chatter frequencies. EE is an extension of entropy to the energy domain. Furthermore, EE is nonlinear. So it is effective and practicable to judge whether chatter occurs in the cutting process by EE [24].

The IMFs after EEMD are given by u1(t), u2(t), ⋯un(t), which represent the frequency bands, so the energy Ri of each IMF is given by:

$$ {R}_i={\int}_{-\infty}^{+\infty }{\left|{u}_i(t)\right|}^2 dt\kern3.25em i=1,2\cdots n $$
(10)

Because the IMFs are approximately orthogonal, the energy of the signal can be expressed as the sum of the energies of all IMFs, R = R1 + R2 + ⋯ + Rn. The EE of the IMFs is given by:

$$ \mathrm{EE}=-\sum_{i=1}^n{T}_i\log {T}_i $$
(11)

where Ti = Ri/R is the fraction of the whole signal's energy R contributed by the ith IMF.
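The EE calculation can be sketched as follows for sampled IMFs. This is a minimal illustration rather than the implementation used in this paper. The function name `energy_entropy` is introduced here, the integral of Eq. (10) is replaced by a sum of squared samples, and the Shannon form −Σ Ti ln Ti with the natural logarithm is assumed (the sign convention and the log base are assumptions of this sketch).

```python
import math


def energy_entropy(imfs):
    """Energy entropy of a list of IMFs, each given as a list of samples."""
    # Discrete counterpart of Eq. (10): energy of each IMF.
    energies = [sum(v * v for v in imf) for imf in imfs]
    total = sum(energies)
    if total == 0:
        raise ValueError("the IMFs carry no energy")
    ee = 0.0
    for r_i in energies:
        t_i = r_i / total          # energy fraction of this IMF
        if t_i > 0:                # 0 * log(0) is treated as 0
            ee -= t_i * math.log(t_i)
    return ee
```

With this convention, the value is zero when all energy sits in one IMF and reaches ln n when the energy is spread evenly over n IMFs, so a redistribution of energy between frequency bands changes the EE.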

3 Experimental setup and chatter feature extraction using SE and EE

In our experiment, the milling force signals under different cutting states were collected. Figure 8 presents the experimental platform, consisting of the machining center, dynamometer, cutter, and thin-walled workpiece (100 × 100 × 5 mm). The cutting force was collected by the dynamometer, and the sampling frequency of the dynamometer was set to 4000 Hz. Specific processing parameters are shown in Table 3.

Table 3 Machining parameters of milling thin-walled workpiece

After the cutting force signals are collected based on above experimental setup, the following steps are conducted:

(1) The EEMD decomposes the experimental signal into a set of IMFs and a residual.

(2) Because false IMFs may be produced by the EEMD, a correlation analysis is conducted to obtain more reasonable IMFs. The n IMFs with relatively high correlation coefficients are taken as reasonable components.

(3) The SE λ1 and EE λ2 of the n IMFs are calculated and assembled into a total feature matrix [λ1, λ2].

To show the validity of SE and EE, two signals, one stable and one unstable (chatter), are chosen for analysis. The raw signals are decomposed by the EEMD, and the results are shown in Fig. 9 a and b, where 9 IMFs are obtained for each signal. Then, the correlation analysis is conducted. The correlation coefficients between each IMF and the original signal are denoted μi, i = 1, 2, ⋯, n, and γ = 0.1 is taken as the threshold. If μi ≥ γ, the IMF is retained. The correlation coefficient of each IMF is shown in Fig. 9c. Accordingly, the first 4 IMFs are retained. Subsequently, the SE and EE of the first 4 IMFs are calculated, as shown in Fig. 9d.
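The threshold rule for retaining IMFs can be sketched as below. This is a minimal illustration, assuming that the correlation coefficient is the Pearson coefficient between each IMF and the original signal; the function names `pearson` and `select_imfs` are introduced here and are not from this paper.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / math.sqrt(var_a * var_b)


def select_imfs(signal, imfs, gamma=0.1):
    """Return the indices of the IMFs whose correlation with signal is >= gamma."""
    return [i for i, imf in enumerate(imfs) if pearson(imf, signal) >= gamma]
```

For example, if a signal is the sum of two strong components and one very weak component, only the indices of the two strong components are returned at γ = 0.1.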

Fig. 9 EEMD result and selection of IMFs. a Stable signal. b Chatter signal. c Correlation coefficients of IMFs. d SE and EE

Figure 9d shows that the SE and EE values change between the stable and chatter conditions, which confirms that SE and EE can be used as chatter indexes. It is worth noting that the change is reflected mainly by IMF1 and IMF2. In the SE chart, the SE of IMF1 increases and the SE of IMF2 decreases from stable cutting to unstable cutting. This shows that the complexity of IMF1 becomes higher, which means that IMF1 has a higher probability of generating a new subsequence (presumably the chatter subsequence). By contrast, IMF2 is less likely to generate a new subsequence because its SE decreases. The same trend can be observed in the EE chart: the EE of IMF1 increases and the EE of IMF2 decreases. This illustrates that the energy of the signal is transferred to IMF1, and the energy of IMF2 is reduced correspondingly. This energy transfer is caused by chatter, as energy is crucial for the occurrence of chatter. Therefore, the occurrence of chatter is a process in which the entropy values (both SE and EE) of IMF1 increase. In conclusion, chatter can be revealed by the SE and EE. Besides, the above analysis shows that IMF1 contains more chatter information than IMF2-4, which means that correlation analysis alone is not sufficient to isolate the chatter information. Further feature selection is needed to retain the chatter information, which motivates the feature dimension reduction using the PCA in the next section.

4 Chatter feature selection using PCA

To determine the number of principal components, the cumulative variance contribution rate method is used in the PCA. This method is based on the eigenvalues of the correlation coefficient matrix sorted in descending order. The variance and direction of the first principal component are determined by the largest eigenvalue and its corresponding eigenvector. The variance and direction of the second principal component are determined by the second largest eigenvalue and its corresponding eigenvector, and so on. It is noteworthy that these eigenvalues are nonnegative. The ratio of each eigenvalue to the sum of all eigenvalues is called the contribution rate of the corresponding principal component to the total variance of the sample, which is given as:

$$ {v}_i=\frac{\rho_i}{\sum_{j=1}^n{\rho}_j} $$
(12)

where ρi is the ith eigenvalue and vi is the contribution rate of the ith principal component. The cumulative contribution rate of the first k principal components is then given as:

$$ Q=\sum_{i=1}^k{v}_i $$
(13)

A threshold (85–90%) is often required when selecting the number of principal components with this method. The number of principal components retained is the smallest k for which the cumulative contribution rate Q exceeds the threshold.
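The selection rule of Eqs. (12) and (13) can be sketched as follows. This is a minimal illustration that takes the eigenvalues as given; the function names `contribution_rates` and `num_components` are introduced here, and any eigenvalues passed to them in an example are hypothetical rather than results from this paper.

```python
def contribution_rates(eigenvalues):
    """Eq. (12): contribution rate of each eigenvalue, in descending order."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    return [rho / total for rho in ordered]


def num_components(eigenvalues, threshold=0.90):
    """Smallest k whose cumulative contribution rate Q (Eq. 13) exceeds threshold."""
    q = 0.0
    for k, v in enumerate(contribution_rates(eigenvalues), start=1):
        q += v
        if q > threshold:
            return k
    return len(eigenvalues)
```

For the hypothetical eigenvalues 5, 3, 1.5, 0.3, and 0.2, the cumulative rates of the first three components are 0.5, 0.8, and 0.95, so three components are retained at a 90% threshold.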

In the previous section, SE and EE were extracted from the 4 retained IMFs, forming an 8-dimensional feature vector. However, this dimension is relatively high and some components are redundant, which affects the accuracy of the subsequent classification. The cumulative contribution rate threshold is set at 90%, and the contribution rate of each principal component is shown in Fig. 10a.

Fig. 10 The analysis of PCA. a Principal component contribution rate. b Three-dimensional principal component distribution

As shown in Fig. 10a, the PCA achieves the purpose of dimension reduction. As the cumulative contribution rate of the first three principal components is greater than 90%, they are chosen as the final feature vector. The scatter distribution of the first three principal components is shown in Fig. 10b. It can be seen that, after the PCA, the initial feature matrix is reduced to three dimensions, and the features of stable cutting and chatter cutting each cluster together. The PCA not only reduces the dimension of the original feature vector but also extracts the valid information.

5 Chatter detection result and analysis

After the features are extracted and selected, the nonlinear SVM is employed to identify the cutting states. The basic idea of the SVM can be summarized as follows. First, the input space is mapped to a high-dimensional space through a nonlinear transformation. Then, the optimal classification hyperplane is solved in this high-dimensional space. The nonlinear transformation is carried out by defining a KF. To choose an appropriate KF, several KFs are investigated, as shown in Fig. 11.

Fig. 11 The flow chart of KF selection and parameter optimization in nonlinear SVM

Commonly used KFs are as follows:

(1) Polynomial KF

$$ K\left({x}_i,{x}_j\right)={\left[\left({x}_i\bullet {x}_j\right)+1\right]}^d,d=1,2,\cdots n $$
(14)

(2) Radial basis KF (RBKF)

The most commonly used radial basis KF is the Gaussian KF, which is defined as:

$$ K\left({x}_i,{x}_j\right)=\exp \left\{-\frac{{\left\Vert {x}_i-{x}_j\right\Vert}^2}{2{\sigma}^2}\right\} $$
(15)

where xi and xj represent the sample vectors and σ represents the width of the KF, which controls its range of influence.

(3) Sigmoid KF

$$ K\left({x}_i,{x}_j\right)=\tanh \left(\beta \left({x}_i\bullet {x}_j\right)+\theta \right),\beta >0,\theta <0 $$
(16)
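The three KFs of Eqs. (14) to (16) can be written directly in Python, as in the minimal sketch below. The function names and the default parameter values are introduced here for illustration and are not taken from this paper.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(u, v, d=2):
    """Eq. (14): polynomial KF of degree d."""
    return (dot(u, v) + 1.0) ** d


def gaussian_kernel(u, v, sigma=1.0):
    """Eq. (15): Gaussian radial basis KF of width sigma."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def sigmoid_kernel(u, v, beta=1.0, theta=-1.0):
    """Eq. (16): sigmoid KF with beta > 0 and theta < 0."""
    return math.tanh(beta * dot(u, v) + theta)
```

If g is read as the coefficient of the squared distance in an RBF kernel of the form exp(−g‖xi − xj‖²), as several SVM libraries define it, then g = 1/(2σ²); this reading of g is an assumption rather than something stated in the text.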

To choose a proper KF for the nonlinear SVM, the classification results of the different KFs are calculated with the extracted features. Here, c and g are set to 1 and 0.25, respectively, and the other parameters are left at their defaults. As shown in Fig. 12a, the classification accuracy with the radial basis KF (RBKF) exceeds that of the other two KFs on both the training set and the testing set. Therefore, the SVM with the RBKF is selected to monitor the cutting states.

Fig. 12 The parameter optimization of SVM and prediction accuracy. a The classification accuracy using different KFs. b Fitness optimization using GA. c The classification results of the test data using GA. d The accuracy of the training set and testing set

After the best KF is determined, c and g are optimized by the GE and the GA to improve the accuracy with the RBKF, because c and g have an important influence on the identification results. GE is first employed to optimize c and g in the SVM. A relatively large range is set to roughly locate the best parameters: in this paper, the range of c and g is set to [−23, 23], and the search steps of c and g are 0.8 and 0.6, respectively. After the approximate range is determined, the range is gradually narrowed according to the step size for a refined parameter selection. The classification accuracy is calculated for each group of parameters as the optimization range is gradually changed. Then, the parameters are searched by the GA with the following settings: the population size is 20, the maximum number of iterations is 200, the search ranges of c and g are [0.01, 100] and [0.01, 1000], respectively, and 3-fold cross-validation is used. The fitness function is defined as the prediction accuracy of the SVM, which is to be maximized. The fitness curve of the iterative process of the GA is shown in Fig. 12b.
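The coarse-to-fine range search described for c and g can be sketched generically as below. This is a minimal illustration and not the procedure used in this paper. The function names are introduced here, and `score` is a hypothetical callable standing in for the cross-validated SVM accuracy at a given (c, g). The caller is assumed to keep the candidate values within the range that is valid for the classifier.

```python
def grid_search(score, c_values, g_values):
    """Return (best_score, c, g) over the Cartesian grid of candidates."""
    best = None
    for c in c_values:
        for g in g_values:
            s = score(c, g)
            if best is None or s > best[0]:
                best = (s, c, g)
    return best


def refine(center, step, n=5):
    """Return n candidate values centred on `center`, spaced by `step`."""
    half = n // 2
    return [center + (i - half) * step for i in range(n)]


def coarse_to_fine(score, c_values, g_values, c_step, g_step, rounds=2):
    """Coarse grid search followed by `rounds` refinements with halved steps."""
    best = grid_search(score, c_values, g_values)
    for _ in range(rounds):
        c_step /= 2.0
        g_step /= 2.0
        candidate = grid_search(
            score, refine(best[1], c_step), refine(best[2], g_step)
        )
        if candidate[0] > best[0]:
            best = candidate
    return best
```

Each refinement keeps the current best point as the grid centre, so the best score never decreases between rounds. The number of evaluations grows with the grid size, which is one reason such searches are slower than using fixed parameters.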

As seen from Fig. 12b, the average fitness of the parameter optimization in the SVM using the GA is above 80%, and the optimal fitness value is obtained near the 41st generation through iterative updating. At this point, c = 15 and g = 1.8. The accuracies of the training set and testing set are depicted in Fig. 12c and d. Whether GE or GA is used, parameter optimization improves the prediction accuracy. The SVM with GA optimization achieves the highest chatter detection accuracy: 98% on the training set and 90.9% on the test set, which exceeds that of the SVM with GE and with no parameter optimization. However, finding the best c and g takes more time, especially when the GA is used to optimize the parameters of the SVM. As shown in Table 4, the elapsed time was close to 5 s. The reason it takes so long may be that the parameters of the SVM need to be optimized every time.

Table 4 The comparison of calculation time

To further verify the effectiveness of the proposed method, the surface profiles under stable and unstable cutting (the previously mentioned points A and B) are compared with the prediction results of the proposed method, because observing the surface profile is a very direct and effective way to judge the cutting state. The identification results for these cutting conditions are listed in Table 5.

Table 5 The identification results between the proposed method and surface profile

In Table 5, “1” represents the chatter cutting state and “−1” represents the stable cutting state. The “Output” results are given by the proposed model, and the “Target” results are obtained from the surface profile. As seen from Table 5, the final results are correct, and the cutting states predicted by the SLD are consistent with those identified by the SVM.
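
The comparison between “Output” and “Target” amounts to an element-wise check of two label sequences. A minimal sketch is given below; the function name and the label values in the usage line are purely illustrative and are not the entries of Table 5.

```python
def compare_labels(output, target):
    """Return the agreement rate and the indices where the labels differ.

    Labels follow the convention of the text: 1 for chatter, -1 for stable.
    """
    if len(output) != len(target) or not output:
        raise ValueError("label sequences must be non-empty and of equal length")
    if any(v not in (1, -1) for v in list(output) + list(target)):
        raise ValueError("labels must be 1 (chatter) or -1 (stable)")
    mismatches = [i for i, (o, t) in enumerate(zip(output, target)) if o != t]
    rate = 1.0 - len(mismatches) / len(output)
    return rate, mismatches

# Illustrative labels only (not data from the paper).
rate, mismatches = compare_labels([1, -1, 1, -1], [1, -1, -1, -1])
```

Reporting the mismatched indices alongside the rate makes it easy to go back to the corresponding cutting conditions and inspect their surface profiles.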

In addition, further cutting conditions are chosen to experimentally verify the trained model. The cutting conditions are listed in Table 6. The cutting states predicted by the trained model are compared with the SLD as well as with the surface profiles, which are shown in Fig. 13.

Table 6 Cutting conditions
Fig. 13
figure 13

The surface profiles of the selected cutting conditions. a No. 1. b No. 2. c No. 3. d No. 4

As seen from Fig. 13, No. 1 and No. 2 belong to stable cutting, and No. 3 and No. 4 belong to unstable cutting. The prediction results of the trained model are the same as the observed surface profiles as well as the prediction results of the SLD. Therefore, the proposed method effectively detects chatter in the milling of thin-walled parts under different cutting conditions.

6 Conclusion

In this paper, a chatter detection method for the peripheral milling of thin-walled workpieces using SE and EE is studied. Firstly, the EEMD is introduced to acquire the IMFs, and its validity is verified with a simulated signal. Then, the initial eigenvectors are constructed by extracting SE and EE from the IMFs retained by the correlation analysis. The PCA is employed to reduce the eigenvector dimension. To provide the labels for the SVM, the SLD is drawn and verified. Finally, an improved SVM with optimized parameters is used to detect the chatter. The following conclusions are drawn:

  • The presented method based on EEMD and two nonlinear dimensionless indicators can effectively detect chatter, as confirmed by both the prediction results of the SLD and the observed surface profiles. Based on the analysis of the trends of SE and EE, chatter is essentially a phenomenon of entropy increase; that is, SE and EE increase when the cutting process chatters.

  • To accurately describe the dynamics of the processing system, the relative transfer function is used in this paper. The dynamics of the processing system are represented by the workpiece’s modes when the frequency is low, whereas the tool’s modes represent the dynamics of the processing system when the frequency is high.

  • Correlation analysis alone is not sufficient to extract chatter-related features, as only two IMFs change significantly. Hence, the PCA is used to reduce the feature vector dimension in this paper.

  • Based on the SVM, the chatter identification results show that GA is more suitable than GE for selecting the parameters. However, the proposed model is very time-consuming because parameter optimization is performed each time. To meet the real-time requirements of online monitoring, future work should focus on shortening the calculation time of the algorithm.