1 Introduction

Carbon fiber–reinforced polymers (CFRPs) possess attractive characteristics such as high strength-to-weight and stiffness-to-weight ratios compared with metallic materials [1, 2]. These properties account for the widespread use of CFRPs in advanced structures in the aviation, aerospace, and national defense industries, where structural parts are frequently drilled for riveted assemblies or structural repairs [3]. A high-quality hole is essential to ensure accurate assembly with other structural parts after drilling. Owing to the inherent anisotropy and structural heterogeneity of CFRPs, the drilling operation may cause several types of damage, such as delamination, burrs, fiber pull-out, and matrix thermal degradation, consequently reducing the structural strength and in-service life under fatigue loads [4]. The thrust force is the primary factor that affects the generation and evolution of hole damage in drilling, especially delamination [5]. The value of the thrust force strongly depends on the geometry and material of the tool; workpiece material property parameters (MPPs), such as the strength and stiffness; machining parameters, such as spindle speed and feed rate; and coolant use [6]. The thrust force also varies with the thickness of the CFRP throughout the drilling process. Thus, accurately predicting the thrust force is a necessary strategy for preventing damage in the drilling of CFRP parts.

Theoretical modeling of the thrust force has played a dominant role in understanding hole damage in the drilling of CFRPs over recent decades. Langella et al. [7] and Tsao et al. [8] proposed theoretical models to predict the thrust force during the drilling of composite materials based on material failure analysis and material removal analysis, respectively. Su et al. [9] established a theoretical drilling model of a tapered drill-reamer (TDR) to analyze the thrust force and the delamination comprehensively, building on the classical theoretical prediction model of a twist drill. Zarif et al. [10] developed analytical models to predict the critical thrust force and feed rate at the onset of delamination. Comparison with experiments revealed that the maximum thrust force has an important effect on the delamination. These contributions laid a solid foundation for the prediction of the thrust force in later studies. However, empirical formulas deduced by curve approximation or data fitting no longer provide sufficient prediction precision, because machining accuracy requirements have increased over recent decades.

In addition to experimental studies and formula derivation, the finite element (FE) method has attracted the attention of scholars as a means of predicting the drilling behavior of CFRPs [11]. The thrust force of the entire drilling process can be obtained at the end of the simulation, and the prediction results are generally more accurate than those of the theoretical models. Unfortunately, some parameters of the material failure criteria of CFRPs in the FE model depend on the theoretical model, which leads to errors between the predictions and experiments. In addition, most theoretical prediction models and FE models consider only a single type of CFRP or several similar types. The MPPs of CFRPs are generally regarded as one of the most important factors influencing the thrust force. Nonetheless, they are not considered in previous studies. The major reason is that the MPPs of CFRPs are numerous, and a series of complex steps, such as remodeling and recalculation, must be redone whenever some parameters are updated. Instead, scholars [12,13,14] are more inclined to use numerical simulation to study the influence of machining parameters, such as the feed rate and spindle speed, on the thrust force. The shortcoming is that a complete simulation is time-consuming because the three-dimensional (3D) drilling FE model is solved by dynamic explicit analysis, and a high-performance computing facility is essential for the complex structures of CFRPs.

To make the prediction of the thrust force more rapid and accurate, an artificial neural network (ANN) model is an excellent approach for obtaining the relationships between the thrust force and both the MPPs of the CFRPs and the machining parameters. An ANN overcomes the defects of traditional artificial intelligence based on logical symbols in dealing with intuitive and unstructured information. The entire prediction process depends on establishing mathematical relationships between the input parameters and the output parameters. Different algorithms are adopted to give the ANN the characteristics of self-adaptation, self-organization, and real-time learning.

ANN is an implementation of an algorithm inspired by research on the brain, and ANN models are a technology in which computers learn directly from data, thereby assisting in classification, function estimation, data compression, and similar tasks. An ANN can be seen as a set of computing elements that simulates the structure and function of a biological neural network [15]. ANNs have been widely used in the machining of composites over the last few decades, for example in real-time control and optimization of machining parameters [16, 17], tool wear [18], and prediction of parameters [19, 20]. For the drilling of composites, Mishra et al. [21] first predicted the likely damage before drilling unidirectional FRP based on an ANN model; the diameter and geometry of the tool were used as input parameters. Based on this investigation, Palanikumar et al. [15] first used an ANN model to predict the delamination factor in drilling to achieve a higher-quality hole in fiber-reinforced plastic (FRP). Unfortunately, the MPPs, which are considered among the most influential factors, were neglected. Other parameters used in evaluating drilling quality were also predicted through the ANN model, and the results showed that the well-trained ANN model was able to predict the surface roughness in the drilling of glass FRP. Kahraman [22] predicted the major drilling parameters (such as the bit load and optimum rotation speed) of a diamond bit under different drilling conditions. The results revealed that the proposed prediction method had higher precision than the traditional modeling method; its deficiency was that the weight of each parameter was random when the neural network training model was established, so that the parameters were not all fully trained. In addition, based on the genetic algorithm, Anarghya et al. [23] proposed an optimized multilayer perceptron neural network model to predict the thrust force, which takes as input the command feed rate, previous command feed rate, and drill point angles. The shortcoming of this investigation was the small number of training samples, which were acquired from experiments; as a consequence, the prediction results were closely tied to the experimental values. Scholars have also tried to use numerical analysis models to replace experiments for sample training. Nevertheless, suitable material failure criteria for CFRPs are lacking in FE software, so the results of the FE model and experiments are not in good agreement. More importantly, parameters must be modified and results extracted manually whenever the parameters of the corresponding FE model are updated in the software. Because plenty of samples are needed to train the ANN model, this makes the entire analysis process even more time-consuming.

Based on the aforementioned studies, an integrated and effective methodology, which couples a scale-span drilling FE model with a revised ANN model, is developed to rapidly predict the thrust force in the drilling of CFRPs using a tapered drill-reamer (TDR) under different machining parameters. The global mesh size of the scale-span FE model of CFRPs is optimized to obtain the minimum calculation time on the premise of ensuring the accuracy of the predicted thrust force, while an order-driven FE computation approach is developed for the batch solution of samples. The material property parameters of CFRPs are added as input parameters, and the initial weight indices of each parameter are considered in the revised ANN model. The well-trained ANN model can be used to predict the thrust force in drilling CFRPs more rapidly and precisely than the conventional ANN method.

2 Mesh size optimization of the scale-span drilling FE model

According to the scale-span drilling model established in our previous research, the simulation accuracy depends strongly on the mesh size of the scale-span drilling FE model. A single simulation takes a long time (about 193 h) even on a high-performance computing facility, and hundreds of samples are needed to train the ANN. If the refined mesh size were adopted, generating the samples would take several years, although high-precision simulation results, such as burrs and delamination, could be obtained. Such a campaign would also take up a large amount of disk space and computing resources. Thus, exploring the mesh sensitivity of the scale-span model is an essential task for improving the simulation efficiency.

2.1 Scale-span FE modeling with different mesh sizes and mesh styles

Mesh generation for the CFRP model requires determining the global minimum mesh size, dividing each partition area reasonably, refining the mesh in the contact area, and creating a mesh transition between the different divided areas. Following the previous research, the scale-span drilling FE model with the typical machining parameters (Sr = 2000 rpm, Sf = 0.03 mm/rev) and the refined mesh size (0.05 mm) of the CFRP model is replicated multiple times. Then, the initial mesh elements of the CFRP model are removed and each copied model is numbered. Scale-span drilling FE models with global mesh sizes from 0.05 to 0.4 mm, in steps of 0.05 mm, are then generated using the free and structural mesh generation styles. The number of elements changes with the global size. For instance, the total number of elements is 3382122 when the global size is 0.05 mm, whereas it is only 49983 when the global size is 0.4 mm. In the other simulation models, the total number of elements lies between 49983 and 3382122. Several scale-span models with different mesh sizes are shown as examples in Fig. 1.

Fig. 1 Scale-span drilling FE model with different mesh sizes and mesh styles. a Refinement mesh size FE model. b Medium structural mesh size FE model. c Coarse free mesh size FE model

Because the mesh type and mesh size of each simulation model have changed, the material properties, stacking angle, and stacking sequence must be reassigned to the elements, and the zero-thickness cohesive elements (CEs), which are used for delamination simulation, must be regenerated from the basic mesh of the CFRPs. In addition, the elements located in the contact area must be reset, and the damage failure parameters are modified slightly to prevent nonconvergence of the simulation model.

2.2 Comparison of simulation efficiency and accuracy

Figure 2a shows the predicted thrust force variation of the scale-span FE model with different mesh sizes during drilling. The overall trend is almost the same for all mesh sizes because the same boundary conditions are adopted, and it is also close to the experimental results. However, because the mesh size of the contact elements differs, the amplitude of variation is not the same over the same period: the oscillation amplitude of the thrust force increases with increasing mesh size. The main reason is that a large element can be regarded as containing many small elements. Material that would be deleted as small elements in the refined-mesh FE model is retained in the coarse-mesh FE model, and once a large element has been deleted in the coarse-mesh FE model, the interaction between the corresponding small-element material and the TDR bit model is lost. Therefore, as the mesh size increases, the contact nodes become coarser, which increases the oscillation amplitude of the thrust force.

Fig. 2 Thrust force prediction results of the scale-span FE model with different mesh size and mesh style. a Prediction results of the scale-span FE model with different mesh sizes. b Prediction results of the scale-span FE model with different mesh styles

Figure 2b shows the predicted thrust force variation of the scale-span FE model with different mesh styles. Similarly, there is no obvious difference in the predicted force variation because the basic mesh size and the number of elements of the simulation models are almost the same. The only difference is a small change in the oscillation during contact between the TDR model and the CFRP model. The overall trend of the predicted thrust force hardly changes because the number of elements located in the contact area is almost the same, even though different mesh styles are adopted in the CFRP model.

Figure 3 shows the simulation results for interlaminar and intralaminar damage of the CFRPs with different mesh sizes. Compared with the experimental results, the bearizing phenomenon occurs in the simulation model with the larger mesh size owing to the continuity of the CFRP mesh. Other types of real damage are likewise difficult to simulate with this mesh, and the surface morphology of the hole obtained by the simulation model is relatively rough. In contrast, various damage behaviors, such as pits, burrs, tears, and delamination, can be simulated precisely in the FE model with the refined mesh size. It can also be observed that the damage behavior is closely related to the thrust force. Therefore, accurately predicting the maximum average thrust force is an important and necessary strategy for preventing damage [11].

Fig. 3 Simulation results of the scale-span FE model with different mesh sizes. a Experimental results. b Simulation results of the FE model with refinement mesh size. c Simulation results of the FE model with medium mesh size

According to the comparative analysis of the predicted results, the deviation of the maximum average thrust force is relatively small among all the FE models with different mesh sizes and mesh styles. The absolute percentage deviations [24] between the simulation results and the experimental results are introduced to quantify the prediction accuracy of the scale-span FE model with different mesh sizes and mesh styles. The calculation time of each simulation model is also collected so that the mesh size can be chosen to improve the simulation efficiency on the premise of ensuring accuracy. The prediction accuracy and simulation time of the scale-span FE model with different mesh sizes and mesh styles are shown in Fig. 4.

Fig. 4 Calculation time and the corresponding deviation

As can be seen from Fig. 4, the prediction accuracy for the thrust force is highest when the structural mesh size of 0.1 mm is adopted, for which the maximum deviation is less than 1%. According to the trend of the absolute deviation, the deviation becomes larger as larger mesh sizes are adopted. The deviation of the scale-span FE model with a global mesh size of 0.05~0.35 mm is approximately 5%. However, the deviation is more than 15% when the structural mesh size is greater than 0.4 mm. The total calculation time of each FE model grows exponentially as the mesh size is reduced below 0.35 mm, although the prediction accuracy improves only slightly with mesh refinement.

Because a shorter calculation time comes at the cost of a larger deviation, the mesh size located at the intersection of the simulation time curve and the absolute deviation curve is defined as the optimum value, which balances simulation efficiency and accuracy in this study. On this basis, the optimum mesh size is 0.23 mm. However, because hundreds of samples are required for training the ANN model, the structural mesh size of 0.35 mm is regarded as the most suitable scheme in order to shorten the simulation time as much as possible on the premise of ensuring accuracy. The main reason is that the deviations at these two mesh sizes are not far apart, as shown in Fig. 4. With this mesh, the complete calculation time of one simulation model is approximately 5.5 h on a high-performance computing facility with two 48-core 8160 Platinum processors and 128 GB RAM.

3 Obtaining the sample set, features selection, and ANN algorithm

3.1 Data preparation

3.1.1 Input sample set

The effectiveness of the ANN model for the prediction of the thrust force depends on a large number of samples, which comprise training samples and checking samples. Sobol’s quasi-random sequence based on Algorithm 659 [25] is adopted in this study in view of the uniformity and the generation efficiency of the samples.

Since the generation of Sobol’s sequence is clearly explained in previous studies [2, 26], only the main points are briefly reviewed here to indicate the extra data required to generate the input sample set from the basic input parameters of the FE model. First, a probabilistic model of the MPPs of CFRPs and the machining parameters is established. Then, the statistical characteristics of the random parameters are calculated by observing the probabilistic model or by sampling experiments. Finally, an approximation of the target solution is obtained according to the required accuracy of the solution or the number of solutions.

To generate the jth component of the points in Sobol’s sequence, a primitive polynomial [27, 28] of some degree sj over the field Z2 (the integers modulo 2) needs to be chosen, which can be written as follows:

$$ {P}_{sj, aj}(x)={x}^{s_j}+{a}_{1,j}{x}^{s_j-1}+\cdots +{a}_{s_j-1,j}x+1 $$
(1)

where the coefficients a1,j, a2,j, …, asj−1,j are each either 0 or 1.

These coefficients are used to define a sequence of positive integers {m1,j, m2,j, …} through a recurrence equation [29], which can be written as:

$$ {m}_{k,j}=2{a}_{1,j}{m}_{k-1,j}\oplus {2}^2{a}_{2,j}{m}_{k-2,j}\oplus \cdots \oplus {2}^{s_j-1}{a}_{s_j-1,j}{m}_{k-{s}_j+1,j}\oplus {2}^{s_j}{m}_{k-{s}_j,j}\oplus {m}_{k-{s}_j,j} $$
(2)

where k is equal to or greater than sj+1; ⊕ denotes the bit-by-bit exclusive-OR operator.

The initial values m1,j, m2,j, …, msj,j can be chosen freely provided that each mk,j, 1 ≤ k ≤ sj, is a positive odd integer less than 2^k. A set of direction numbers {v1,j, v2,j, …} is then required, which is defined by

$$ {v}_{k,j}={m}_{k,j}/{2}^k $$
(3)

Thus, the jth component of the ith point in a Sobol’s sequence, xi,j, namely the input sample set generated from the MPPs of the fiber and matrix, can be obtained by [26, 30]

$$ {x}_{i,j}={b}_1{v}_{1,j}\oplus {b}_2{v}_{2,j}\oplus \cdots \oplus {b}_w{v}_{w,j} $$
(4)

where bw denotes the wth bit from the right when i is written in binary, namely (…b2b1)2 is the binary representation of i.
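As an illustration of Eqs. (1) to (4), the following minimal Python sketch generates one component of a Sobol sequence. It is not the MATLAB code used in this study. The polynomial x^3 + x + 1 (sj = 3, a1,j = 0, a2,j = 1) and the initial values m = 1, 3, 7 are assumptions chosen only for the example, and the function names are hypothetical. The points are built directly from the binary digits of i as in Eq. (4), without Gray-code reordering.

```python
def direction_integers(s, a, m_init, n_bits):
    """Return [m_1, ..., m_n_bits] from the recurrence in Eq. (2).

    s      : degree of the primitive polynomial
    a      : coefficients [a_1, ..., a_{s-1}], each 0 or 1
    m_init : initial values [m_1, ..., m_s], each odd and m_k < 2**k
    """
    m = list(m_init)
    for k in range(s, n_bits):                 # list index k stores m_{k+1}
        new = m[k - s] ^ (m[k - s] << s)       # 2^s * m_{k-s} XOR m_{k-s}
        for i in range(1, s):
            if a[i - 1]:
                new ^= m[k - i] << i           # 2^i * a_i * m_{k-i}
        m.append(new)
    return m[:n_bits]


def sobol_component(i, m, n_bits):
    """Return x_i from Eqs. (3)-(4) for 0 <= i < 2**n_bits."""
    x = 0
    k = 1
    while i:
        if i & 1:                              # bit b_k of i is set
            x ^= m[k - 1] << (n_bits - k)      # v_k = m_k / 2^k, scaled by 2^n_bits
        i >>= 1
        k += 1
    return x / (1 << n_bits)


# Assumed example dimension: x^3 + x + 1 with m_1, m_2, m_3 = 1, 3, 7.
m = direction_integers(3, [0, 1], [1, 3, 7], n_bits=8)
points = [sobol_component(i, m, 8) for i in range(1, 5)]
print(m[:5], points)
```

Each value lies in [0, 1) and would still have to be scaled to the range of the corresponding MPP or machining parameter before being used as an FE input.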

A corresponding code was developed in the MATLAB software based on the above inputs [2]. Two independent sample sets for the FE model were generated from the initial MPPs of unidirectional CFRPs (not including the fracture energy, whose value was assumed to be constant) and the machining parameters. The independent sample set of each parameter contained 30 data points. Then, the data of each parameter in the two independent sample sets were combined and replaced. Finally, 540 input samples of the MPPs of CFRPs and machining parameters were obtained, several of which are listed as examples in Table 1.

Table 1 Several samples of the input parameter data set

3.1.2 Result sample set

The thrust force of each input sample, namely the result sample set, is acquired by importing the 540 groups of input parameters into the optimized scale-span FE model one by one. However, performing the entire process manually is time-consuming because operating the FE model through the ABAQUS software is complex. Furthermore, there is a large number of input samples with different parameters, and modifying the parameters and submitting the jobs by hand one by one is impractical in the software.

To solve this problem, an order-driven FE computation approach is proposed, in which an ABAQUS batch file and a Python script file for drilling CFRPs are developed to maximize the utilization of the available computing resources and ensure the efficiency of the FE calculations. The approach also makes the output of results more convenient and rapid.

The whole simulation process of the order-driven FE analysis with 540 input samples of drilling CFRPs is built on the Isight software, as shown in Fig. 5. First, a scale-span FE drilling model is established in the ABAQUS/Explicit GUI, and it is verified that the analysis result files (.rpt) of the FE model can be accessed once the submitted job is finished. Next, the modeling and analysis process files of the FE model are extracted and modified so that all of them can be run with Python; these are the order-driven files. A corresponding ABAQUS batch script file is then developed using the Notepad++ software. After this preparation, the order-driven files (.py), the ABAQUS batch script file (.bat), and the result files are imported into the Simcode module of the Isight software, and a job is submitted to ensure that the analysis results are consistent with those of the initial model before the 540 input samples are imported. The data of the 540 input samples with different parameters are then imported into the DOE module of the Isight software. The most important task is to set up the parameters properly and debug the corresponding code in the main program module so that the parameters of the input samples are modified and the jobs are submitted automatically while the kernel of the ABAQUS software is called in the background. In this way, the analysis results, namely the result set, can be accessed conveniently and rapidly from the output result files of each input sample model.

Fig. 5 Implementation flowchart of the order-driven FE computation approach
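The idea of driving the FE model from a table of samples can be sketched without Isight. The short Python example below is a simplified stand-in for the workflow in Fig. 5, not the implementation used in this study: it only builds one command line per sample and does not run anything. The script name drill_model.py, the job naming scheme, and the second sample row are hypothetical; the first row uses the typical machining parameters quoted in Section 2.1. The command form `abaqus cae noGUI=<script> -- <arguments>` runs an ABAQUS/CAE Python script without the GUI and passes the trailing arguments to it.

```python
import csv
import io


def build_commands(sample_rows, script="drill_model.py"):
    """Return one ABAQUS/CAE command line per input sample (nothing is executed)."""
    commands = []
    for idx, row in enumerate(sample_rows, start=1):
        # Each parameter is passed as name=value after the "--" separator.
        args = " ".join(f"{key}={value}" for key, value in row.items())
        commands.append(f"abaqus cae noGUI={script} -- job=sample_{idx:03d} {args}")
    return commands


# Hypothetical two-row table; a real table would hold all 540 samples
# with every MPP and machining parameter as a column.
table = io.StringIO("Sr,Sf\n2000,0.03\n3000,0.05\n")
rows = list(csv.DictReader(table))
commands = build_commands(rows)
for line in commands:
    print(line)
```

In such a setup the hypothetical script would read the arguments, update the model, submit the job, and write the thrust force to a report file for later collection.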

Finally, the whole calculation of all input sample models takes more than a month on four identical high-performance computers, each with two 48-core 8160 Platinum processors and 128 GB RAM.
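The reported duration can be checked with simple arithmetic. The sketch below uses the figures quoted in this paper (540 samples, about 5.5 h per run with the 0.35 mm mesh, about 193 h per run for the original model, four computers) and assumes, as a simplification, that each computer runs one job at a time with no idle time between jobs.

```python
SAMPLES = 540
HOURS_COARSE = 5.5      # per run, 0.35 mm structural mesh (Section 2.2)
HOURS_ORIGINAL = 193.0  # per run, model before mesh optimization (Section 2)
MACHINES = 4            # assumed to run one job each, back to back


def campaign_days(hours_per_run, samples=SAMPLES, machines=MACHINES):
    """Wall-clock days needed to solve all samples on the given machines."""
    return samples * hours_per_run / machines / 24.0


coarse_days = campaign_days(HOURS_COARSE)
original_years = campaign_days(HOURS_ORIGINAL) / 365.0
print(f"0.35 mm mesh: {coarse_days:.1f} days")
print(f"original model: {original_years:.1f} years")
```

Under these assumptions the optimized mesh needs about 31 days, which agrees with the reported duration of more than a month, whereas the original model would need roughly three years on the same four computers.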

3.2 Acquisition of initial weight indices

Each attribute in the input data set, such as the tensile strength of CFRPs or the spindle feed rate of the machine, is referred to as a feature. The primary purpose of feature selection is to select, from the feature set of the training samples, the feature subset with the greatest correlation with the output data [31]. Feature selection is a significant step in the data processing of the ANN model because the simulation model has a large number of input parameters. Selecting the most relevant subset from all features helps to simplify the training model of the ANN, decrease its calculation cost, and reduce the risk of overfitting.

The feature selection process generally consists of two primary steps: a subset search and a subset evaluation. A machine learning algorithm is adopted to combine feature selection with the training process. The random forests algorithm, a common machine learning approach, can estimate the importance of individual characteristic variables and thereby perform feature selection [32]. This approach takes both the data set and the learning task into account and avoids a large amount of computation. After comparing machine learning algorithms in terms of their ability to rank individual characteristic variables, the extra-trees (extremely randomized trees) forest algorithm is applied for feature selection.

Since the primary purpose of this study is to adopt the scale-span FE model to predict the maximum average thrust force during drilling, parameters that have little effect on the thrust force in drilling CFRPs are ignored, such as the fracture energy, which is assumed to be constant. Thus, the basic MPPs of CFRPs and the machining parameters are regarded as the features to be ranked, and all of these parameters are defined as the input data. The thrust force data set from the result sample set is defined as the output data.

To obtain the initial weight indices of each input parameter for improving the training accuracy of the ANN model, the corresponding code for feature selection was developed in Python software in accordance with the flowchart shown in Fig. 6.

Fig. 6 Implementation flowchart of the feature selection model

According to the feature selection results, the weight indices of the input parameters are shown in Fig. 7. The compression strength ZC and the shear strength SYZ of CFRPs have a significant effect on the thrust force, with weight indices of 0.12884 and 0.10636, respectively. The effects of the other features are individually smaller. However, they cannot be ignored, especially those of the machining parameters, which play an essential role in the whole drilling process of CFRPs. The weight indices of each parameter are then adopted in the input layer of the ANN prediction model to decrease the calculation cost of the training model and improve the prediction precision.

Fig. 7 Weight indices of MPPs and machining parameters
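How tree-based importances lead to weight indices of this kind can be illustrated with a small, self-contained Python sketch. It is a simplified extra-trees-style regressor written only for this example, not the code developed in this study and not a library implementation: every tree uses all samples, each candidate split draws a random threshold, and the reduction in squared error is accumulated per feature and normalized to sum to one. The data are synthetic, and all names and settings (number of trees, depth, leaf size) are assumptions.

```python
import random


def _sse(ys):
    """Sum of squared deviations from the mean."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def _grow(X, y, idx, depth, rng, importance, max_depth, min_leaf):
    """Split one node with random thresholds and accumulate the error reduction."""
    if depth >= max_depth or len(idx) < 2 * min_leaf:
        return
    parent = _sse([y[i] for i in idx])
    best = None
    for f in range(len(X[0])):
        values = [X[i][f] for i in idx]
        lo, hi = min(values), max(values)
        if lo == hi:
            continue
        cut = rng.uniform(lo, hi)               # random threshold, one per feature
        left = [i for i in idx if X[i][f] <= cut]
        right = [i for i in idx if X[i][f] > cut]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        gain = parent - _sse([y[i] for i in left]) - _sse([y[i] for i in right])
        if best is None or gain > best[0]:
            best = (gain, f, left, right)
    if best is None:
        return
    gain, f, left, right = best
    importance[f] += gain
    _grow(X, y, left, depth + 1, rng, importance, max_depth, min_leaf)
    _grow(X, y, right, depth + 1, rng, importance, max_depth, min_leaf)


def weight_indices(X, y, n_trees=50, max_depth=5, min_leaf=3, seed=0):
    """Return normalized feature importances (they sum to 1)."""
    rng = random.Random(seed)
    importance = [0.0] * len(X[0])
    for _ in range(n_trees):
        _grow(X, y, list(range(len(y))), 0, rng, importance, max_depth, min_leaf)
    total = sum(importance)
    return [value / total for value in importance]


# Synthetic data: feature 0 dominates, feature 1 matters less, feature 2 is irrelevant.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(200)]
y = [5.0 * row[0] + 1.0 * row[1] + 0.05 * data_rng.random() for row in X]
w = weight_indices(X, y)
print([round(value, 3) for value in w])
```

In this toy case the dominant feature receives most of the weight, in the same way that ZC and SYZ stand out in Fig. 7.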

3.3 Establishment of ANN prediction model

3.3.1 ANN modeling and training

A multilayer perceptron ANN based on feed-forward propagation and error back propagation is adopted in this study. The basic learning algorithm mainly includes two successive steps. First, a network is established from the training samples to predict the relationship between the input samples and the output samples. After a large number of training epochs, the learning results and the prediction results are compared to obtain the output error, which is used to evaluate the performance of the ANN model; the mean square error (MSE) is generally adopted. An ANN model is regarded as a good network if the errors it produces are less than 11% for both training and validation data [33]. Then, the weight indices of each input parameter are continuously modified to reduce the errors between the prediction results and the learning results. The training process is stopped if the output error is acceptable or the preset number of training epochs is reached. The prediction results are obtained when fresh samples are imported into the well-trained ANN model. The ANN model consists of an input layer, one or more hidden layers, and an output layer. The number of neurons in each layer can be different.

Based on the aforementioned description, the following assumption should be satisfied in the established ANN model:

  i) There are S hidden layers in the ANN model.

  ii) The total number of samples for training is K.

  iii) There are M, H, and N neurons in the input layer, the hidden layer, and the output layer, respectively.

  iv) The weight indices between the input layer and the hidden layer are W, while the weight indices between the output layer and the hidden layer are V.

  v) The thresholds of the hidden layer and the output layer are θ and γ, respectively.

  vi) Three activation functions, namely tanh, ReLU, and sigmoid, are used to activate neurons [34].

According to the integrated algorithm flowchart shown in Fig. 8, Python software is used for the design of the ANN model, and the complete modeling and learning process are as follows:

Fig. 8 Algorithm flowchart of the multi-layer ANN model

  (1) The input vector, the output vector of the hidden layer, the network output vector of the output layer, and the expected output vector of the ith sample can be expressed, respectively, as

$$ \left\{\begin{array}{l}X(i)=\left[{x}_1(i),{x}_2(i),\cdots, {x}_M(i)\right]\\ {}H(i)=\left[{h}_1(i),{h}_2(i),\cdots, {h}_H(i)\right]\\ {}A(i)=\left[{a}_1(i),{a}_2(i),\cdots, {a}_N(i)\right]\\ {}T(i)=\left[{t}_1(i),{t}_2(i),\cdots, {t}_N(i)\right]\end{array}\right. $$
(5)

  (2) All inputs and outputs are normalized to the range 0–1 using the following expression [15]:

$$ {X}_{norm}=\frac{X-{X}_{\mathrm{min}}}{{X}_{\mathrm{max}}-{X}_{\mathrm{min}}} $$
(6)

where Xnorm denotes the normalized value, X denotes the initial value, and Xmin and Xmax denote the minimum and maximum value of inputs, respectively. Simultaneously, all input samples X and the corresponding expected outputs T are imported to the developed code in Python software, while the value of MSE and the number of the maximum epochs are also set, respectively.

  (3) The outputs of the hidden layer and the output layer in the training model are calculated as

$$ \left\{\begin{array}{l}{h}_k(i)=f\left[\sum \limits_{j=1}^M{W}_{jk}{x}_j(i)-{\theta}_k\right],k=1,2,3,\cdots, H\\ {}{a}_o(i)=f\left[\sum \limits_{k=1}^H{V}_{ko}{h}_k(i)-{\gamma}_o\right],o=1,2,3,\cdots, N\end{array}\right. $$
(7)

where f denotes the activation function, Wjk denotes the weight index between the jth neuron of the input layer and the kth neuron of the hidden layer, and Vko denotes the weight index between the kth neuron of the hidden layer and the oth neuron of the output layer.

  (4) The correction errors of the output layer and the hidden layer are solved, and their expressions (for the sigmoid activation function) are written as

$$ \left\{\begin{array}{l}{\delta}_o(i)={a}_o(i)\left[1-{a}_o(i)\right]\left[{a}_o(i)-{t}_o(i)\right]\\ {}{\delta}_k(i)={h}_k(i)\left[1-{h}_k(i)\right]\sum \limits_{o=1}^N{\delta}_o(i){V}_{ko}\end{array}\right. $$
(8)

where δk(i) and δo(i) denote node error of hidden layer and output layer, respectively.

  (5) The weight indices between the hidden layer and the output layer and the thresholds of the output layer neurons are modified. Because δo(i) contains the factor [ao(i) − to(i)] and the thresholds enter Eq. (7) with a negative sign, gradient descent moves the weight indices against δo(i)hk(i) and the thresholds along δo(i). The modified expressions are written as

$$ \left\{\begin{array}{l}{V}_{ko}^{N+1}={V}_{ko}^N-\eta {\delta}_o(i){h}_k(i)\\ {}\Delta {V}_{ko}(i)=-\alpha \frac{\partial e}{\partial {V}_{ko}}=-\alpha {\delta}_o(i){h}_k(i)\end{array}\right. $$
(9)
$$ {\gamma}_o^{N+1}={\gamma}_o^N+\alpha {\delta}_o(i) $$
(10)

where α denotes the learning rate, 0 < α < 1; the superscript N denotes the epoch number; and η denotes the coefficient used to modify the weight indices.

  (6) The weight indices between the input layer and the hidden layer and the thresholds of the hidden layer are modified in the same way, and the modified expressions are written as

$$ \left\{\begin{array}{l}{W}_{jk}^{N+1}={W}_{jk}^N-\eta {\delta}_k(i){x}_j(i)\\ {}\Delta {W}_{jk}(i)=-\mu \frac{\partial e}{\partial {W}_{jk}}=-\mu {\delta}_k(i){x}_j(i)\end{array}\right. $$
(11)
$$ {\theta}_k^{N+1}={\theta}_k^N+\mu {\delta}_k(i) $$
(12)

where μ denotes the learning rate, and 0 <μ <1, and the definitions of other parameters are the same as before.

  1. (7)

    The errors of each neuron are output through the calculation of Eq. (8), and the number of epochs is updated. The training is terminated if the value of output error is less than the preset accuracy or the number of iterations exceeds the preset maximum in the program. Otherwise, steps (3) to (6) are required to be repeated.

  2. (8)

    Based on the renormalization method, parameters (such as thrust force) can be predicted by inputting fresh samples into the well-trained ANN model.
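Steps (4) to (7) can be illustrated with a minimal sketch in Python. It is not the implementation used in this study: it assumes a single hidden layer, sigmoid activations, one output neuron, and sample-by-sample updates, and it applies the usual gradient-descent sign (the gradient is subtracted). The names `train`, `forward`, `a_k`, `n_hidden`, and the default learning rates and tolerance are hypothetical choices made for illustration only.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W, theta, V, gamma):
    """Return the hidden activations a_k and the single output a_o."""
    a_k = [sigmoid(sum(x[j] * W[j][k] for j in range(len(x))) + theta[k])
           for k in range(len(theta))]
    a_o = sigmoid(sum(a_k[k] * V[k] for k in range(len(V))) + gamma)
    return a_k, a_o


def train(samples, n_hidden=3, alpha=0.5, mu=0.5, tol=1e-3,
          max_epochs=20000, seed=0):
    """Backpropagation for a one-hidden-layer network (assumed layout).

    samples: list of (x, t) with x a list of normalized inputs and
    t a normalized target in (0, 1).
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Random initial weights (the conventional choice, assumed here).
    W = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_in)]
    V = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    theta = [0.0] * n_hidden
    gamma = 0.0
    mse = float("inf")
    epoch = 0
    for epoch in range(1, max_epochs + 1):
        sq_err = 0.0
        for x, t in samples:
            a_k, a_o = forward(x, W, theta, V, gamma)
            # Step (4): node errors of the output and hidden layers.
            delta_o = a_o * (1.0 - a_o) * (a_o - t)
            delta_k = [a_k[k] * (1.0 - a_k[k]) * delta_o * V[k]
                       for k in range(n_hidden)]
            # Step (5): hidden-to-output weights and output threshold.
            for k in range(n_hidden):
                V[k] -= alpha * delta_o * a_k[k]
            gamma -= alpha * delta_o
            # Step (6): input-to-hidden weights and hidden thresholds.
            for j in range(n_in):
                for k in range(n_hidden):
                    W[j][k] -= mu * delta_k[k] * x[j]
            for k in range(n_hidden):
                theta[k] -= mu * delta_k[k]
            sq_err += (a_o - t) ** 2
        # Step (7): stop on the preset accuracy or the epoch limit.
        mse = sq_err / len(samples)
        if mse < tol:
            break
    return W, theta, V, gamma, mse, epoch
```

Calling `train` on a few normalized `(x, t)` pairs returns the fitted weights together with the final MSE and the number of epochs used, so either stopping condition of step (7) can be identified afterwards.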

3.3.2 Implementation of thrust force prediction

Based on the established multi-layer perceptron ANN model, the neurons in the input layer correspond to the elastic and strength parameters of CFRPs and the machining parameters reported in Table 1. The output layer corresponds to the maximum average thrust force from the result sample set, which is obtained using the order-driven FE computation approach proposed in Section 3.1.2. More than two hidden layers are generally designed in accordance with the input vector size and the number of input classifications. The number of neurons in each hidden layer is generally greater than the number of neurons in the input layer to achieve a more precise classification of the input parameters. In this study, the momentum coefficient, learning rate, number of hidden layers, and number of neurons per hidden layer are varied to determine a suitable architecture for the proposed ANN, and the MSE is used to evaluate the performance of the ANN model.

In the training process of the ANN model, the initial weight indices of each input parameter must first be imported into the ANN model according to Fig. 7 (in the conventional ANN model, the initial weight indices are random). As training proceeds, the weight indices are continually modified in accordance with Eqs. (9) and (10). The Adam algorithm [35], which is based on gradient descent, is adopted to modify the weight indices in the established ANN model. The training process ends when the calculation error no longer exceeds the preset precision.
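As a rough illustration of the Adam update mentioned above, the following Python sketch applies the standard Adam rule to a generic parameter vector. The hyperparameter defaults (`lr`, `beta1`, `beta2`, `eps`) are the commonly used values and are assumptions, not the settings of this study; the function name `adam_minimize` and the `grad` callback are hypothetical.

```python
import math


def adam_minimize(grad, theta0, lr=0.01, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=3000):
    """Minimize a function with Adam, given a callback returning its gradient."""
    theta = list(theta0)
    m = [0.0] * len(theta)  # first-moment (mean) estimate of the gradient
    v = [0.0] * len(theta)  # second-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        g = grad(theta)
        for i in range(len(theta)):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] ** 2
            # Bias correction compensates for the zero initialization.
            m_hat = m[i] / (1.0 - beta1 ** t)
            v_hat = v[i] / (1.0 - beta2 ** t)
            theta[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta
```

In an ANN setting, `grad` would return the derivatives of the MSE with respect to the weight indices; compared with the fixed-rate updates of Eqs. (9) to (12), Adam scales each step by running estimates of the gradient moments.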

Fig. 9 shows that a configuration with 4 hidden layers and 18 neurons in the hidden layers trains satisfactorily, and it is therefore selected. Through multiple training trials on the optimized scale-span simulation model samples, the ANN model with a 16-18-18-18-16-1 architecture is found to be the most suitable choice for maximum thrust force prediction. The complete architecture of the established ANN model contains 16 neurons in the input layer, 4 hidden layers (18 neurons in each of the first three layers and 16 neurons in the fourth layer), and 1 neuron in the output layer, as shown in Fig. 10.
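To give a sense of the size of the 16-18-18-18-16-1 network, the short Python sketch below counts its trainable parameters. It assumes fully connected layers with one threshold (bias) per non-input neuron, which is the usual multi-layer perceptron layout but is not stated explicitly here; the function name `count_parameters` is hypothetical.

```python
def count_parameters(sizes):
    """Return (weights, thresholds) for a fully connected layer stack."""
    weights = sum(n_in * n_out for n_in, n_out in zip(sizes, sizes[1:]))
    thresholds = sum(sizes[1:])  # one threshold per non-input neuron
    return weights, thresholds


layer_sizes = [16, 18, 18, 18, 16, 1]
n_weights, n_thresholds = count_parameters(layer_sizes)
print(n_weights, n_thresholds, n_weights + n_thresholds)  # 1240 71 1311
```

Under these assumptions the network has 1240 weight indices and 71 thresholds, that is, 1311 trainable parameters in total.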

Fig. 9

MSE variation with the number of epochs. a MSE for different numbers of hidden layers. b MSE for different numbers of neurons per hidden layer with 4 hidden layers

Fig. 10

Architecture of the designed neural network

Using the well-trained ANN model, a series of fresh samples obtained by varying the machining parameters is used to verify the correctness of the prediction results.

4 Results and discussion

4.1 Fresh sample set based on Taguchi method

To verify that the prediction model based on the well-trained ANN handles new test data correctly, a series of samples obtained by varying the machining parameters of the machining center is used as fresh test data. This choice facilitates experimental verification because the material property parameters of CFRPs are difficult to modify in experiments.

A full factorial design of experiments with two factors (spindle feed Sf and spindle rotation Sr), based on the initial material property parameters of CFRPs, is used as the fresh validation data set in this study [36]. The validation data set covers five levels of spindle feed rate (20, 30, 40, 50, and 60 mm/min) and five levels of spindle rotation (2000, 3000, 4000, 5000, and 6000 rpm). In addition, predictions were made for some extreme machining parameters, such as Sf = 10 mm/min and Sr = 8000 rpm, to check that the established ANN model remains accurate over a wider application range. The corresponding prediction results are shown in Fig. 11a.
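The validation grid described above can be enumerated with a few lines of Python. This is a sketch under the assumption that every feed level is combined with every rotation level (a two-factor full factorial); the variable and function names are hypothetical, and the extreme setting is treated here as a single combined case, which is also an assumption.

```python
from itertools import product

feed_rates_mm_min = [20, 30, 40, 50, 60]               # spindle feed Sf
spindle_speeds_rpm = [2000, 3000, 4000, 5000, 6000]    # spindle rotation Sr


def full_factorial(feeds, speeds):
    """Return every (Sf, Sr) combination as a list of dictionaries."""
    return [{"Sf": sf, "Sr": sr} for sf, sr in product(feeds, speeds)]


design = full_factorial(feed_rates_mm_min, spindle_speeds_rpm)

# Extreme setting quoted in the text, assumed to be one combined case.
extreme_cases = [{"Sf": 10, "Sr": 8000}]

print(len(design))  # 25
```

Each dictionary in `design` corresponds to one fresh sample whose machining parameters would be passed to the trained ANN while the material property parameters are held at their initial values.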

Fig. 11

Experimental scheme and prediction results of the ANN. a Prediction results of the well-trained ANN. b Experimental specimen

4.2 Experimental validation

The experimental setup is consistent with that of the previous research; the only difference is that the unit of the spindle feed rate must be converted. Three T700S-12K/YP-H26 CFRP laminate specimens were used to obtain a reliable measured data set. Each experiment was repeated three times under the same machining parameters to improve accuracy. One of the CFRP laminate specimens after drilling is shown in Fig. 11b.

The experimental results for the other machining parameters were obtained by averaging each set of repeated tests with the same machining parameters. To verify the predictions of the well-trained ANN model more rigorously, experiments with extreme machining parameters, such as a higher spindle rotation speed, were added in the reserved experimental area of the CFRP specimens. The reserved area was also used to repeat tests for which no data had been collected. All of the experimental results are shown in Fig. 12a.

Fig. 12

a Experimental and prediction results. b Deviations

The experimental and prediction results show that the predicted thrust force agrees closely with the experimental value and follows almost the same trend. To further quantify the deviation between the experimental and predicted results, the absolute percentage deviation Ea is used again [24]:

$$ \%{E}_a=\left|\frac{F_{\mathrm{exp}}-{F}_{\mathrm{pred}}}{F_{\mathrm{exp}}}\right|\times 100 $$
(13)

where Fexp denotes the experimental value of maximum thrust force and Fpred denotes the predicted value of maximum thrust force based on the well-trained ANN model.
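Eq. (13) translates directly into a small helper, sketched here in Python. The function name `abs_percentage_deviation` and the numerical values in the example are hypothetical and are not measurements from this study.

```python
def abs_percentage_deviation(f_exp, f_pred):
    """Absolute percentage deviation of Eq. (13), in percent."""
    if f_exp == 0:
        raise ValueError("experimental value must be non-zero")
    return abs((f_exp - f_pred) / f_exp) * 100.0


# Hypothetical values for illustration only (same force unit for both).
example = abs_percentage_deviation(100.0, 95.0)
```

With these illustrative inputs the deviation is 5%, since the prediction differs from the experimental value by 5 units out of 100.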

As can be observed from Fig. 12b, the predicted results are largely consistent with the experimental results, with a maximum absolute deviation of only 4.56%. The deviations of the predictions for the extreme machining parameters are also within 5% of the experimental values. The deviations are mainly caused by the discrepancy between the drilling FE model and the experiment under the same machining parameters, followed by the error of the measurement devices. In addition, CFRPs in the drilling FE model are treated as an ideal material; for example, the fiber and resin are assumed to be perfectly bonded at the structural level and free of damage. In practice, a small number of defects (voids, micro-cracks, etc.) may arise during the preparation of CFRPs under high temperature and high pressure.

Therefore, the comparison of the prediction results and the experimental results in this study indicates that the proposed methodology, which couples the scale-span drilling FE model with the ANN model to predict the thrust force with a TDR under different machining parameters, is correct and effective. Compared with the conventional ANN model, the proposed model considers the influence of the MPPs of CFRPs and the corresponding initial weight indices of the input parameters, which enables the thrust force in drilling CFRPs to be predicted more rapidly and precisely. More importantly, with the well-trained multi-layer ANN model, the thrust force of CFRPs with different MPPs can also be predicted rapidly under different machining parameters when a similar bit is adopted. In addition, other hole damage measures in drilling CFRPs, such as the delamination factor, can also be predicted using this approach, which will be the next step of our research.

5 Conclusion

In the present study, a rapid thrust force prediction method was proposed by coupling a scale-span model with a revised ANN for drilling T700S-12K/YP-H26 CFRPs with a TDR. First, the mesh size of the scale-span drilling model of CFRPs was optimized to shorten the simulation time while ensuring accuracy. An order-driven FE computation approach was then proposed to implement the parametric calculation of the FE model and the automatic extraction of the simulation results. Subsequently, a multilayer revised ANN model, which considers the MPPs of CFRPs and the initial weight indices of the input parameters, was developed to predict the average maximum thrust force in drilling. Furthermore, the well-trained ANN model was applied to predict the thrust force of the fresh samples, and the corresponding experimental results were used to evaluate the accuracy of the predictions. Based on the above studies, the main conclusions are as follows:

  1. (1)

    A mesh size of 0.23 mm is the optimum for the scale-span drilling FE model in terms of guaranteeing higher simulation efficiency and accuracy when predicting the maximum average thrust force. However, a structural mesh size of 0.35 mm is regarded as the most suitable scheme because it greatly shortens the simulation time while preserving accuracy.

  2. (2)

    The compression strength ZC and the shear strength SYZ of CFRPs have a significant effect on the thrust force during drilling. Their weight indices are 0.12884 and 0.10636, respectively. In addition, the machining parameters play an essential role in the drilling process.

  3. (3)

    The multilayer ANN model with a 16-18-18-18-16-1 architecture is the most suitable for predicting the maximum average thrust force based on the optimized scale-span drilling FE model samples. The minimum MSE for the training data is computed to be 0.00151 through 70000 training epochs in the ANN model.

  4. (4)

    The maximum average thrust force of CFRPs with different material property parameters can be predicted rapidly and precisely under different machining parameters using the well-trained ANN model. The maximum absolute deviation from the experiments is only 4.56%. The well-trained ANN model can also predict the thrust force for some extreme machining parameters, and the deviations of these predictions are likewise acceptable.