1 Introduction

Aluminum matrix composites (AMCs) reinforced with ceramic particles have potential for many industrial applications where weight saving is of primary concern. These composites offer increased strength, higher elastic modulus, higher service temperature and improved wear resistance compared with the unreinforced alloy. Recently, nano-composites have attracted considerable attention for their unique properties [1]. High-energy mechanical milling is a solid-state powder processing technique involving repeated welding, fracturing and re-welding of powder particles in a high-energy ball mill. The synthesis of metal matrix nano-composites by this process has attracted great interest because it can distribute nano-sized reinforcement particles within the matrix alloy without the typical drawbacks of other processing methods. The capability of mechanical milling to synthesize a variety of metal matrix nano-composites such as Mg/SiC [2], Cu/Al2O3 [3, 4], Zn/Al2O3 [5], Fe/TiC [6], Ni/AlN [7] and various aluminum matrix nano-composites [8,9,10,11] has been demonstrated.

The mechanical properties of AMCs are largely affected by the size and distribution of the second-phase particles. During the milling process, ductile powders undergo plastic deformation, resulting in a gradual change in their morphology and size. Many parameters, such as mill type, ball-to-powder weight ratio, characteristics of the balls, milling atmosphere, process control agent and temperature, influence the powder particle size during milling. When composite powders are subjected to this process, their size is also affected by the type, volume fraction and size of the reinforcing particles.

Artificial neural networks (ANNs) have attracted great interest as predictive models because of their ability to recognize patterns. In data treatment, ANNs are capable of learning what happens in a process without explicitly modeling the physical and chemical laws that govern the system. These networks were developed to simulate biological nervous systems: they recognize diverse patterns and produce correct or nearly correct responses from partially incorrect or incomplete stimuli. The networks consist of a number of computational units known as neurons, connected to each other via weight factors. These factors are not constant and are updated during the training stage. The updating continues until the network outputs converge to the desired values presented to the network at the start of the training stage. Generally speaking, each neuron forms a linear combination of its inputs, each multiplied by its relevant weight factor. This linear combination then undergoes a nonlinear mapping, and the resultant output is passed to the neighboring layer. The success in obtaining a reliable and robust network depends strongly on the choice of process variables involved, as well as on the available set of data and the domain used for training purposes. ANN models have been developed to model different correlations and phenomena in steels [12,13,14], aluminum alloys [15, 16] and Ni-base superalloys [17, 18] as well as mechanical alloying [19, 20].

Multiple linear regression (MLR) is another statistical technique that attempts to model a group of random variables by creating a mathematical relationship between them. The model relates two or more explanatory variables to a response variable through a linear equation that best approximates all the individual data points. The goal of MLR is to determine how the explanatory variables influence the response variable.

In the present study, mechanical milling was used to prepare Al–B4C nano-composite powder mixtures, and the effect of the size and content of the starting powders (Al and B4C) as well as the milling time on the size distribution of the resultant powders was studied by laser particle size analysis. These results were used to develop two different statistical methodologies based on ANN and MLR models. By using the initial size and content of the Al and B4C powders as well as the milling time as input parameters, these models could predict the median particle size (D50) and the extent of size distribution (D90 − D10) of the milled powders.

2 Materials and methods

In this study, different amounts (5 and 10 wt%) of B4C particles with different sizes (90, 700, and 1200 nm) were mixed with Al 6061 alloy powder particles having two different average sizes (21 and 71 µm) and milled in an attrition mill (Union Process, model 1-s). The milling was performed at 320 rpm with a ball-to-powder weight ratio of 20 under argon atmosphere. The size distribution of the powders was quantified by a laser particle size analyzer (Cilas-1064). Powder samples were designated as AlXCY (Z%), in which X, Y and Z indicate the aluminum particle size (µm), B4C particle size (nm) and B4C content (wt%), respectively.

3 Construction of models and processing of data

3.1 Neural networks, architecture and learning

It has been mathematically proven that a three-layer network can map any function to any required accuracy [21]. In the present study, we considered the mechanical milling process parameters as the input values and the median particle size (D50) together with the extent of the particle size distribution (D90 − D10) of the milled powders as the output values. The networks consisted of 5 input nodes (one for each of the four input values and one for the bias, whose value was set to −1), a number of hidden nodes and an output node representing D50 or D90 − D10. Figure 1 illustrates schematically a three-layer neural network and its various parts. The number of nodes in the hidden layer depends on the complexity of the problem, and we took this number as 2n + 1, where n is the number of neurons in the input layer. Therefore, the network architecture used in this study is 5-11-1 (5 nodes in the input layer, 11 nodes in the hidden layer and 1 node in the output layer).

Fig. 1

Schematic illustration of the neural network structure showing the input nodes, hidden units and the output node

Both the input and output values were first normalized within the range of 0 and 1 as follows:

$$ x_{\text{N}} = \frac{{x - x_{\hbox{min} } }}{{x_{\hbox{max} } - x_{\hbox{min} } }} $$
(1)

where xN is the normalized value of x which has maximum and minimum values given by xmax and xmin, respectively.
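As a minimal sketch of Eq. (1), the following Python snippet scales a list of values to the range 0 to 1 and maps a normalized value back to its original units. The function names are illustrative, the three B4C sizes from Sect. 2 serve only as sample input, and the inverse mapping is an assumption about how a normalized network output would be converted back to physical units.

```python
def normalize(values):
    """Scale values to the range [0, 1] following Eq. (1)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("all values are equal; Eq. (1) is undefined")
    return [(x - x_min) / (x_max - x_min) for x in values]


def denormalize(x_n, x_min, x_max):
    """Invert Eq. (1) to recover a value in its original units."""
    return x_n * (x_max - x_min) + x_min


# Sample input: the three B4C particle sizes (nm) listed in Sect. 2.
b4c_sizes_nm = [90, 700, 1200]
normalized = normalize(b4c_sizes_nm)
print(normalized)
print(denormalize(normalized[1], 90, 1200))
```

The smallest and largest values map to 0 and 1, so every input and output presented to the network lies inside the range of the sigmoid activation function.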

A unipolar sigmoid function was selected as the activation function in each layer as follows:

$$ f(y_{i} ) = \frac{1}{{1 + \exp ( - 0.5y_{i} )}} $$
(2)

where \( y_{i} \) is defined as

$$ y_{i} = \sum\limits_{j} {w_{ij} } x_{j} + \theta_{i} $$
(3)

where x is the normalized input value, \( \theta_{i} \) is the threshold for its input neuron and i and j represent the neuron and input numbers, respectively.
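Equations (2) and (3) can be illustrated for a single neuron as follows. This is only a sketch: the weights, inputs and threshold below are hypothetical numbers chosen for illustration, not values from the trained networks.

```python
import math


def sigmoid(y, slope=0.5):
    """Unipolar sigmoid of Eq. (2); the slope 0.5 is the factor in the exponent."""
    return 1.0 / (1.0 + math.exp(-slope * y))


def neuron_output(weights, inputs, theta):
    """Weighted sum of Eq. (3) followed by the activation of Eq. (2)."""
    y = sum(w * x for w, x in zip(weights, inputs)) + theta
    return sigmoid(y)


# Hypothetical weights, normalized inputs and threshold (illustration only).
weights = [0.2, 0.4, 0.6, 0.8]
inputs = [0.5, 0.0, 1.0, 0.25]
theta = -0.1

# Here y = 0.1 + 0.0 + 0.6 + 0.2 - 0.1 = 0.8, so the output is 1 / (1 + exp(-0.4)).
print(neuron_output(weights, inputs, theta))
```

Because the output always lies between 0 and 1, the target values must be normalized to the same range, which is why Eq. (1) is applied to the outputs as well as the inputs.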

A set of input/output patterns, amounting to about 70% of the available experimental data, was randomly chosen for training the networks, and the rest of the data were used to test the performance of the networks. The error backpropagation algorithm was used to train the networks, and a momentum term was used in updating the weights to improve the convergence rate. The training algorithm can be summarized as follows:

Step 1 Selection of the learning constant and momentum coefficient. In this stage, the learning constant \( \eta \) and the momentum coefficient \( \alpha \) were set to 0.5 and 0.9, respectively.

Step 2 Initializing the weight factors. In this work, random numbers between 0 and 1 were taken as the initial values for the weight factors. Because the bias input is fixed at −1, the value of \( \theta_{i} \) in Eq. (3) equals the product of the relevant weight factor and −1, so the threshold can be treated as one more weight factor.

Step 3 Computation of outputs of all neurons based on Eqs. (2) and (3), layer by layer.

Step 4 Calculation of root-mean-square error based on the following equation:

$$ E_{\text{rms}} = \frac{1}{2PK}\sqrt {\sum\nolimits_{p = 1}^{P} {\sum\nolimits_{k = 1}^{K} {\left( {d_{pk} - o_{pk} } \right)^{2} } } } $$
(4)

where P is the number of input patterns, K is the number of output neurons (being equal to 1 in this work), \( d_{pk} \) is the desired output value, and \( o_{pk} \) is the output value produced by the network.

Step 5 Termination criterion: If \( E_{\text{rms}} \) reaches a desired value, which is considered to be 0.2 in this work, the training algorithm is considered to be terminated.

Step 6 Updating the weights along the negative gradient of \( E_{\text{rms}} \). In this step, initially the weights on the output layer are calculated, and then, the result is propagated backwards through the network, layer by layer according to the following equations:

$$ w_{ij}^{(n)} (0) = w_{ij}^{(n)} ( - 1) $$
(5)
$$ w_{ij}^{(n)} (t + 1) = w_{ij}^{(n)} (t) + \eta \delta_{i}^{(n)} f^{(n - 1)} (y_{j} ) + \alpha (w_{ij}^{(n)} (t) - w_{ij}^{(n)} (t - 1)) $$
(6)

where t is the number of weight updates, \( f^{(n - 1)} (y_{j} ) \) is the output of the jth neuron on the preceding layer (the input carried by \( w_{ij}^{(n)} \)) and \( \delta_{i}^{(n)} \) is the error gradient of the ith neuron on the nth layer. Equation (5) makes the momentum term vanish at the first update. Equations (7) and (8) give \( \delta_{i}^{(n)} \) for the output and hidden layers, respectively:

$$ \delta_{i}^{(n)} = (d_{i} - o_{i} )\, f^{\prime}(y_{i} ) $$
(7)
$$ \delta_{i}^{(n)} = f^{\prime}(y_{i} )\sum\limits_{j} {\delta_{j}^{(n + 1)} w_{ji}^{(n + 1)} } $$
(8)

where \( f^{\prime}(y_{i} ) \) is the first derivative of the function \( f(y_{i} ) \) and the sum in Eq. (8) runs over the neurons j of layer n + 1.

Step 7 Repeating the procedure mentioned above by going to step 3.
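The seven steps above can be sketched in plain Python as shown below. This is an illustration under stated assumptions, not the code used in this study: the bias input of −1 is attached to the input layer only (matching the 5-11-1 architecture), weights are updated after every pattern, the hidden-layer error term is written in its standard backpropagation form (the output error propagated back through the connecting weight), and the four training patterns are made-up normalized values. Because of the 1/(2PK) prefactor in Eq. (4), a small toy set already falls below the tolerance of 0.2 before any training, so the demonstration uses a tighter, hypothetical tolerance.

```python
import math
import random

SLOPE = 0.5  # factor in the exponent of Eq. (2)


def f(y):
    """Unipolar sigmoid, Eq. (2)."""
    return 1.0 / (1.0 + math.exp(-SLOPE * y))


def f_prime(o):
    """Derivative of Eq. (2), written in terms of the neuron output o = f(y)."""
    return SLOPE * o * (1.0 - o)


def forward(x, w_hid, w_out):
    """Step 3: outputs of all neurons, layer by layer."""
    xb = x + [-1.0]  # assumption: bias input of -1 on the input layer only
    hidden = [f(sum(w * v for w, v in zip(row, xb))) for row in w_hid]
    out = f(sum(w * h for w, h in zip(w_out, hidden)))
    return xb, hidden, out


def e_rms(patterns, w_hid, w_out):
    """Step 4: Eq. (4) with K = 1 output neuron."""
    sq = sum((d - forward(x, w_hid, w_out)[2]) ** 2 for x, d in patterns)
    return math.sqrt(sq) / (2 * len(patterns))


def train(patterns, n_hidden=11, eta=0.5, alpha=0.9, tol=0.2,
          max_epochs=500, seed=0):
    rng = random.Random(seed)
    n_in = len(patterns[0][0]) + 1  # inputs plus the bias node
    # Step 2: random initial weights between 0 and 1.
    w_hid = [[rng.random() for _ in range(n_in)] for _ in range(n_hidden)]
    w_out = [rng.random() for _ in range(n_hidden)]
    # Eq. (5): previous weight changes start at zero (no momentum at t = 0).
    dw_hid = [[0.0] * n_in for _ in range(n_hidden)]
    dw_out = [0.0] * n_hidden
    for _ in range(max_epochs):
        if e_rms(patterns, w_hid, w_out) <= tol:  # Step 5
            break
        for x, d in patterns:  # Steps 6 and 7
            xb, hidden, out = forward(x, w_hid, w_out)
            delta_out = (d - out) * f_prime(out)  # output-layer error term
            delta_hid = [f_prime(h) * delta_out * w_out[i]
                         for i, h in enumerate(hidden)]  # hidden-layer error term
            for i in range(n_hidden):
                # Weight change = learning term + momentum term, as in Eq. (6).
                dw_out[i] = eta * delta_out * hidden[i] + alpha * dw_out[i]
                w_out[i] += dw_out[i]
                for j in range(n_in):
                    dw_hid[i][j] = eta * delta_hid[i] * xb[j] + alpha * dw_hid[i][j]
                    w_hid[i][j] += dw_hid[i][j]
    return w_hid, w_out


# Made-up normalized patterns: ([Al size, B4C size, B4C content, time], target).
toy = [([0.0, 0.0, 0.0, 0.0], 0.9), ([0.0, 1.0, 1.0, 0.3], 0.6),
       ([1.0, 0.5, 0.0, 0.7], 0.3), ([1.0, 1.0, 1.0, 1.0], 0.1)]
initial = train(toy, max_epochs=0)            # untrained weights, same seed
trained = train(toy, tol=0.01, max_epochs=500)
print(e_rms(toy, *initial), e_rms(toy, *trained))
```

Calling `train` with `max_epochs=0` returns the untrained weights for the same seed, which gives a baseline error against which the trained network can be compared.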

3.2 Multiple linear regression analysis

When there are an arbitrary number of explanatory variables, the linear regression model takes the following form:

$$ y = \beta_{0} + \beta_{1} x_{1} + \beta_{2} x_{2} + \cdots + \beta_{k} x_{k} $$
(9)

where y represents the response (dependent) variable and x1, x2, …, xk represent the explanatory (independent) variables. \( \beta_{0} \), …, \( \beta_{k} \) are constants which are estimated by fitting the equation to the data using the least-squares approach. In the present study, the median size of powder particles (D50) and the extent of size distribution (D90 − D10) have been taken as response variables, whereas milling time, B4C content and B4C size are the explanatory variables. Therefore, the relation of the median size and the size distribution of powder particles to the material and processing variables can be represented by the following equations:

$$ D_{50} = \beta_{0} + \beta_{1} C + \beta_{2} S + \beta_{3} t $$
(10)
$$ D_{90} - D_{10} = \beta^{\prime}_{0} + \beta^{\prime}_{1} C + \beta^{\prime}_{2} S + \beta^{\prime}_{3} t $$
(11)

where C, S and t represent B4C content, B4C size and milling time values, respectively.
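A least-squares fit of the form of Eqs. (10) and (11) can be sketched in plain Python by solving the normal equations. The solver below is illustrative only (the software used in this study is not stated), and the data are synthetic: the responses are generated without noise from hypothetical coefficients, so the fit should simply recover them.

```python
def fit_mlr(rows, y):
    """Estimate beta_0..beta_k of Eq. (9) by solving the normal equations."""
    X = [[1.0] + list(r) for r in rows]  # leading 1 carries the intercept
    k = len(X[0])
    # Build (X^T X) beta = X^T y.
    A = [[sum(xr[i] * xr[j] for xr in X) for j in range(k)] for i in range(k)]
    b = [sum(xr[i] * yi for xr, yi in zip(X, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (b[i] - tail) / A[i][i]
    return beta


# Synthetic example: (C in wt%, S in nm, t in h) with hypothetical coefficients.
rows = [(5, 90, 4), (10, 90, 8), (5, 700, 12),
        (10, 1200, 16), (5, 1200, 8), (10, 700, 4)]
true_beta = [50.0, -0.5, 0.01, -3.0]
y = [true_beta[0] + sum(b * x for b, x in zip(true_beta[1:], r)) for r in rows]

beta = fit_mlr(rows, y)
print(beta)
```

With real measurements the fitted coefficients would not reproduce the data exactly, and the quality of the fit would be judged by R2.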

4 Results and discussion

4.1 Particle size evolution in relation with ANNs analysis

The cumulative size distribution plots for Al71, Al71C1200 (5%) and Al71C1200 (10%) powder samples after different milling times, as measured by the laser particle size analyzer, are shown in Fig. 2. Increasing the milling time from 2 to 4 h generated coarser particles, attributable to cold welding and agglomeration of the powders. For longer milling times, however, the decreased powder size indicates that fragmentation was the predominant mechanism.

Fig. 2

Cumulative size distribution curves obtained after different milling times for a Al71, b Al71C1200 (5%) and c Al71C1200 (10%) powder samples
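D10, D50 and D90 are the particle sizes below which 10, 50 and 90% of the powder volume lies, so they can be read from a cumulative curve such as those in Fig. 2. The sketch below does this by linear interpolation between measured points. The interpolation scheme is an assumption (the analyzer software may interpolate differently), and the size bands and cumulative percentages are hypothetical values, not data from this study.

```python
def percentile_size(sizes_um, cumulative_pct, target_pct):
    """Particle size at a given cumulative volume percent, by linear interpolation."""
    points = list(zip(sizes_um, cumulative_pct))
    for (s0, c0), (s1, c1) in zip(points, points[1:]):
        if c0 <= target_pct <= c1:
            if c1 == c0:
                return s0
            return s0 + (target_pct - c0) * (s1 - s0) / (c1 - c0)
    raise ValueError("target percent lies outside the measured range")


# Hypothetical cumulative distribution (size in µm, cumulative volume %).
sizes_um = [5, 10, 20, 40, 80, 160]
cumulative_pct = [2, 10, 35, 60, 90, 100]

d10 = percentile_size(sizes_um, cumulative_pct, 10)
d50 = percentile_size(sizes_um, cumulative_pct, 50)
d90 = percentile_size(sizes_um, cumulative_pct, 90)
print(d50, d90 - d10)  # median size and extent of the size distribution
```

For these illustrative numbers the median falls between the 20 and 40 µm points, and the extent of the distribution is the difference between the 90% and 10% sizes.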

The variation in the volume percent of different particle sizes during milling of Al71C1200 (5%) sample is shown in Fig. 3 and represents two different behaviors for fine and coarse particles.

Fig. 3

Variation in the volume percent of different particle sizes during milling of Al71C1200 (5%) sample

It can be seen that during the first 4 h of milling, the volume percent of coarse particles increases, while for the finer particles (i.e., d ≤ 15 µm) the opposite trend is observed. For longer milling times, however, the volume percent of coarse particles decreases progressively while the quantity of fine particles increases. These results confirm that during the first 4 h of milling, the finer particles are welded to each other, which depletes the small size bands. At the same time, flattening of the larger particles, together with the cold-welded agglomerates arriving from the smaller size bands, increases the volume percent of particles in the larger size bands. After 4 h of milling, however, the easier fracturing of the larger flattened powders decreases the percent of particles in the large size bands. Consequently, these broken, small-sized particles contribute to the increased volume fraction of finer particles.

The results of laser particle size analysis, representing the variation of the median size (D50) and the extent of the size distribution (D90 − D10) of different batches of powder mixtures with milling time, are compared with the ANN-generated data in Fig. 4a–d. These plots exhibit good agreement between the experimental and ANN results.

Fig. 4

Variation of a, b the median size (D50) and c, d the extent of the size distribution (D90 − D10) of different batches of powder mixtures containing various contents of the same sized B4C particles with milling time as compared to those of the ANN-generated data

The increased median size and width of the size distribution during the first 4 h of milling, shown in these plots for all the investigated powder batches, are attributable to flattening and cold welding of particles. However, for longer milling times, the fracture of larger particles as well as the agglomeration of the smaller ones are the predominant mechanisms and result in decreased D50 and narrower size distributions. Finally, the decreased slopes of D50 versus milling time after 12 h of milling are attributable to the attainment of equilibrium between fracturing and welding.

These results were confirmed by scanning electron microscopy of milled powders in our previous report [22] and are also in agreement with those reported by Arik [23] for an Al–Al4C3 system.

As shown in Fig. 5, addition of B4C to Al powders resulted in decreased size of the powder mixtures, at least for the first 8 h of milling. This effect is intensified when a higher amount of B4C particles is added to the Al powders. Consequently, addition of 10 wt% of B4C particles to Al resulted in finer powder particles than addition of 5% B4C. In most cases, the same trend is observed for the width of the size distribution. These results are in agreement with other reports [9] and suggest that the presence of hard ceramic particles accelerates the milling process. Therefore, the milling time required to attain the equilibrium condition, i.e., formation of fine equiaxed particles, is shortened. The results presented in Fig. 2 confirm that at any identical milling time, the presence of 10 wt% of B4C particles resulted in reduced size of the Al powders. These results can be attributed to the following facts:

Fig. 5

Variation of a, b the median size (D50) and c, d the extent of the size distribution (D90 − D10) of different batches of powder mixtures containing 5% of different sized B4C particles with milling time as compared to those of the ANN-generated data

  1. The as-received B4C particles are finer than the initial Al powder particles; therefore, the increased B4C content in the mixture contributes to decreased size of the powder mixture during milling.

  2. As will be discussed later, a part of the fine B4C particles may be embedded into the Al powders during milling, resulting in their decreased fracture toughness and increased fracturing. This effect is intensified for a higher B4C content in the powder mixture.

Figure 5 also shows reasonably good agreement between the ANN predictions and the results of laser particle size analysis for the median size and the extent of the powder size distribution (D90 − D10) during co-milling of 5% of different sized B4C particles with fine and coarse Al powders. These plots indicate that when Al powders with an initial size of 21 µm were co-milled with B4C particles, smaller B4C particles generated finer powders at most of the milling intervals. However, the opposite results were obtained when the coarser (71 µm) aluminum powder particles were used. These results are attributed to the embedding of fine B4C particles in the coarser aluminum powder particles. In fact, the fine B4C particles can penetrate more easily into the coarser Al powders, which decreases the volume fraction of free B4C particles within the powder mixture and thereby increases the overall particle size.

As mentioned before, the ANN-predicted plots shown in Figs. 4 and 5 reveal reasonably good agreement between the experimentally measured and predicted D50 and (D90 − D10) values. During mechanical milling, the initial size and content of the starting Al and B4C powders, together with several events such as flattening, cold welding, fracturing and agglomeration of the Al powders and embedding of nano-sized B4C particles in them, influence the size and the width of the size distribution of the powder mixtures. Nevertheless, the effects of all these parameters can be reasonably predicted by the ANN model. The performance of the neural networks is best judged from Fig. 6, in which large correlation coefficients (>0.95) resulted when all the experimentally obtained data for D50 and D90 − D10 after 4 h of milling were randomly divided into two series of training and test data and plotted against their corresponding ANN-predicted values.

Fig. 6

Plots of observed a, b D50 and c, d (D90 − D10) versus ANN-predicted data. For a and c, the experimental data used as training data, and for b and d, the experimental data used as validation test data, are plotted against their corresponding ANN-predicted values

4.2 MLR analysis

According to the experimental data presented in Figs. 4 and 5, after 4 h of milling, the dependence of D50 and (D90 − D10) on the milling time is almost linear. Therefore, we used that part of the data in the MLR analysis.

The MLR method related the response values (D50 and (D90 − D10)) to the process parameters, namely milling time (t), B4C size (S) and B4C content (C), via the following equations, where the number in parentheses on the left-hand side is the initial Al powder size (21 or 71 µm):

$$ D_{50} (21) = 60 - 0.3C + 0.003S - 3.76t\quad \left( {R^{2} = 0.93} \right) $$
(12)
$$ D_{50} (71) = 131 - 1.1C - 0.0152S - 6.4t\quad \left( {R^{2} = 0.92} \right) $$
(13)
$$ (D_{90} - D_{10} )(21) = 86 - 0.42C + 0.0066S - 4.49t\quad \left( {R^{2} = 0.90} \right) $$
(14)
$$ (D_{90} - D_{10} )(71) = 236 - 0.0144C - 0.04S - 11.8t\quad \left( {R^{2} = 0.91} \right) $$
(15)
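As a worked example, Eqs. (12)–(15) can be evaluated directly. The sketch below stores the four coefficient sets in a lookup table; the function and key names are illustrative, the inputs are assumed to be in the units used in this study (C in wt%, S in nm, t in h), and the equations apply only to milling times of 4 h or more.

```python
# Coefficients (beta_0, beta_C, beta_S, beta_t) read from Eqs. (12)-(15),
# keyed by response and initial Al powder size in µm.
MLR_COEFFS = {
    ("D50", 21): (60.0, -0.3, 0.003, -3.76),
    ("D50", 71): (131.0, -1.1, -0.0152, -6.4),
    ("D90-D10", 21): (86.0, -0.42, 0.0066, -4.49),
    ("D90-D10", 71): (236.0, -0.0144, -0.04, -11.8),
}


def predict(response, al_size_um, c_wt, s_nm, t_h):
    """Evaluate the fitted linear equation for one powder batch."""
    if t_h < 4:
        raise ValueError("Eqs. (12)-(15) were fitted for t >= 4 h only")
    b0, b_c, b_s, b_t = MLR_COEFFS[(response, al_size_um)]
    return b0 + b_c * c_wt + b_s * s_nm + b_t * t_h


# Al21 with 5 wt% of 90 nm B4C after 8 h:
# 60 - 0.3*5 + 0.003*90 - 3.76*8 = 28.69
print(predict("D50", 21, 5, 90, 8))

# Al71 with 10 wt% of 1200 nm B4C after 8 h:
# 131 - 1.1*10 - 0.0152*1200 - 6.4*8 = 50.56
print(predict("D50", 71, 10, 1200, 8))
```

The two calls show how strongly the milling-time term dominates: over 8 h it contributes about −30 to the first prediction and about −51 to the second, far more than the B4C content or size terms.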

These equations clearly confirm some experimental results as follows:

  • In all the equations, the negative signs of the coefficients of t and C indicate that the median particle size and the extent of the size distribution decrease with increasing milling time and/or B4C content.

  • For fine Al powders (Al21), the sign of the coefficient of the B4C size (S) is positive, whereas the negative sign for coarse Al powder particles (Al71) confirms the embedding of B4C particles in Al71, as discussed before.

  • The larger magnitude of the coefficient of t for Al71 as compared to that for Al21 indicates a more pronounced effect of milling time in decreasing the size of the coarser Al powders.

Figure 7a–d shows plots of the MLR-predicted median particle size (D50) and width of the particle size distribution (D90 − D10) versus milling time (t ≥ 4 h). The experimental data shown in these plots confirm the capability of MLR to predict the response values with reasonable accuracy.

Fig. 7

Variation of a, b the median size (D50) and c, d the extent of the size distribution (D90 − D10) of different batches of powder mixtures with milling time (in excess of 4 h) as compared to those of the MLR-generated data

The validity of the MLR models was checked by plotting the predicted D50 and (D90 − D10) values calculated for different batches of powder particles against their measured values, as shown in Fig. 8a, b. The values of the correlation coefficient (R2) calculated for these plots (0.91–0.93) indicate reasonably good predictions. However, the higher R2 values (>0.95) obtained for the ANN predictions (Fig. 6) indicate that the ANNs predict the size characteristics of the investigated powders more accurately.

Fig. 8

Plots of observed versus predicted data for MLR modeling when all the available sets of data were employed, a D50 and b (D90 − D10)
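The R2 values quoted for the observed versus predicted plots can be illustrated with the short sketch below. It assumes that R2 denotes the squared Pearson correlation coefficient between measured and predicted values, which is one common reading; the exact definition used for Figs. 6 and 8 is not stated. The sample numbers are hypothetical.

```python
def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


# Hypothetical measured and predicted D50 values (µm), for illustration only.
observed = [28.0, 24.5, 20.1, 17.0]
predicted = [28.7, 24.9, 21.2, 17.4]
print(r_squared(observed, predicted))
```

A value close to 1 means that the predicted values follow the measured ones along a straight line, which is the sense in which the ANN and MLR models are compared in this section.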

5 Conclusions

Statistical methodologies based on artificial neural networks (ANNs) and multiple linear regression (MLR) were developed and used to predict the median particle size (D50) and the extent of size distribution (D90 − D10) of Al–B4C nano-composite powders generated during co-milling of different sized Al powders with various amounts of different sized B4C particles.

By using the initial Al powder size, B4C size and content as well as the milling time as input values, the developed ANN model was capable of predicting the D50 and (D90 − D10) of the powders and of capturing the different influential parameters involved during milling. The good performance of this model was confirmed by the large correlation coefficients (>0.95) achieved by plotting all the experimentally obtained data for D50 and D90 − D10 against their corresponding ANN-predicted values.

The MLR method resulted in four equations that could predict D50 and (D90 − D10) of Al + B4C powder mixtures containing coarse or fine Al powders. The input parameters were milling time (t ≥ 4 h), B4C size and B4C content. The signs and values of the coefficients in these equations confirmed some of the experimental observations. However, the smaller correlation coefficients (0.91–0.93) obtained by plotting the MLR-predicted D50 and (D90 − D10) against their measured values, as compared to their ANN counterparts, indicated that the ANNs predict the size characteristics of the investigated powders more accurately.