Introduction

Tubing and pipes are manufactured using different methods, but extrusion is probably the most efficient one. This complex thermoforming process involves heating a raw material (usually plastic, metal, polymer, concrete or ceramic) and forming a final ring-shaped product. A primary advantage of extrusion over other manufacturing processes is its capability to create objects with very complex cross-sectional profiles (Oberg et al. 2012). However, extrusion processing comprises many interdependent input parameters (both process and system variables) and output parameters. Process variables refer to the operating conditions that can be controlled and manipulated directly, whereas the system variables are determined by the process parameters and have an impact on the output parameters (Chevanan et al. 2007).

Unlike cyclic techniques such as injection molding or blow molding, extrusion is a steady-state or continuous process. This means that, for example, a change in the parameters of the extruder will disrupt the steady-state process condition with a non-negligible effect on the quality of the extruded product. Therefore, all input parameters must be identified, controlled and monitored, since quality prediction, control and monitoring are critical to guarantee success in the extrusion process (Khan et al. 2014). Common deficiencies of extruded products are related to visual or geometrical characteristics (e.g., diameter variations, color changes and rough surface) and physical or mechanical properties (e.g., elasticity and rigidity).

Several intelligent and soft computing models (Witten et al. 2011) have been applied to a large variety of manufacturing processes, such as production, fault detection, process planning and monitoring, machine maintenance, and quality prediction and control (Charaniya et al. 2010; Choudhary et al. 2008; Harding et al. 2006; Köksal et al. 2011; Kusiak 2006; Pratihar 2015; Yin et al. 2015). In particular, the use of these techniques for machinery fault detection and product quality prediction has received increasing attention in recent years.

Krömer et al. (2010) showed the ability of genetic programming to evolve fuzzy classifiers on a real-world problem for detecting faulty products in an industrial production process. Multi-layer perceptron neural networks were employed to predict errors in mold surface roughness (Erzurumlu and Oktem 2007) and the product quality in a wave soldering process (Liukkonen et al. 2009). Support vector machines (Jiang et al. 2013) and radial basis function neural networks (Zhang et al. 2014) were used to predict the quality of propylene polymerization in industrial processes. Chien et al. (2007) applied the K-means clustering algorithm and decision trees for the detection of defects in semiconductor manufacturing. The rough set approach was applied to detect solder defects in printed circuit boards (Kusiak and Kurasek 2001). Quality prediction in plastic injection molding processes was tackled using back-propagation neural networks (Sadeghi 2000), support vector machines (Ribeiro 2005) and genetic algorithms (Meiabadi et al. 2013). A combined method based on artificial neural network and particle swarm optimization was proposed to improve the mechanical performance of polymer products (Xu et al. 2015). Adly et al. (2015) presented a simplified subspace regression algorithm for accurate identification of defect patterns in semiconductor wafer maps. Two evolutionary fuzzy ARTMAP neural networks were designed by Tan et al. (2015) to deal with the class imbalance problem in semiconductor manufacturing operations. Ghorai et al. (2013) developed a visual inspection system to localize defects on hot-rolled steel surfaces employing some kernel classifiers, such as the support vector machine and the vector-valued regularized kernel function approximation. Wu et al. (2017) introduced a method based on random forests for tool wear prediction and compared its performance with that of support vector regression and feed-forward back-propagation neural networks. Wang et al. (2018) presented a comprehensive survey of deep learning algorithms for smart manufacturing.

Regarding the particular case of product quality prediction in extrusion processes, several works have employed soft computing techniques. For instance, Wu and Hsu (2002) combined the finite-element approach, a polynomial network and a genetic algorithm to develop a method for the design of the optimal shape of an extrusion die. Li et al. (2004) adopted the cooperation between a three-layer back-propagation neural network and a genetic algorithm to set up the system and optimize the technical parameters in the semi-solid extrusion of composite tubes and bars. Yu et al. (2004) proposed a strategy based on a fuzzy neural-Taguchi network and a genetic algorithm to determine the optimal die gap programming of extrusion blow molding processes. Oke et al. (2006) optimized the flow rate of the plastic extrusion process in a plastic recycling plant with the application of a neuro-fuzzy model. González Marcos et al. (2007) introduced improvements in the rubber extrusion process by predicting the characteristics of rubber with a multi-layer perceptron neural network. Sharma et al. (2009) suggested a model of forward mapping for hot extrusion process using the ANFIS neuro-fuzzy approach. Hsiang et al. (2012) investigated the optimal process parameters that maximize the multiple performance characteristics index for hot extrusion of magnesium alloy bicycle carriers through a fuzzy-based Taguchi method. Ramana and Reddy (2013) proposed to make use of clustering, naïve Bayes, and decision trees to predict and improve the final product quality in a plastic extrusion process. Zhao et al. (2013) employed a Pareto-based genetic algorithm for the optimization of a porthole extrusion die. Support vector regression models and multi-layer perceptron neural networks were compared for the prediction of specific properties of rubber extruded mixtures (Urraca Valle et al. 2013). Carrano et al. (2015) employed an evolutionary computing algorithm to optimize the operational and screw geometrical parameters of a single screw polymer extrusion system. One-class classification methods were used by Kohlert and König (2015) for yield optimization of an extrusion process in a polymer film industry. Chondronasios et al. (2016) introduced a feature extraction technique based on gradient-only co-occurrence matrices to detect blisters and scratches on the surface of extruded aluminum profiles using a two-layer feed-forward artificial neural network.

The main purpose of this paper, therefore, is to analyze the performance of some regression models in the prediction of product quality (regarding the inner and outer diameters) in a tubing extrusion process. From an application perspective, the novelty of this study lies in the specific solution proposed for product quality control in a plastic tube manufacturing plant. To the best of our knowledge, there are no previous reports that analyze the use of parameters taken from the extrusion and pulling processes to predict the inner and outer diameters of an extruded tube using the regression methods considered here.

Henceforth the paper is organized as follows. “Description of the tubing extrusion process” section describes the tubing extrusion process of a Mexican manufacturing company, which provided us with the database used for the subsequent empirical analysis. “Regression models” section introduces the bases of the regression models that will be explored in this study. Next, “Experimental set-up” section presents the experimental set-up and the performance evaluation criteria used in the experiments, while the results are given and discussed in “Results and discussion” section. Finally, “Conclusions and future work” section summarizes the main conclusions and outlines some possible avenues for future research.

Description of the tubing extrusion process

This section provides a general description of the tube extrusion process used by a manufacturing company located in Ciudad Juárez (Chihuahua, Mexico). The extrusion process consists of two stages. In the initial phase, the plastic is fed into the heating chamber of the extruder to melt it (see Fig. 1). Once molten, the plastic is pushed by a screw device through the shaped die, which shapes it into a tube.

Fig. 1 Extrusion process. Process parameters that determine the quality of the tube: [1–4] zone temperatures, [5] melting temperature, [6–9] die temperatures, [10] revolutions per minute of the screw, and [11] base hopper temperature

In the second phase, depicted in Fig. 2, the extruded tube is pulled by a mechanism through a water tank or a blowing system to cool it down and obtain its final form.

Fig. 2 Pulling process. Process parameters that determine the quality of the tube: [1–2] tank temperatures, [3] vacuum pressure, and [4] tension of the pulling mechanism

A defect can be defined as a deviation of the product characteristics from the specifications set up by the manufacturing process (Khan et al. 2014), or the difference between the desired product and the resulting product (Dhafr et al. 2006). It can be caused by a single source or the cumulative effect of several factors, which may arise at any stage of the extrusion process. Typical defects found in extruded parts include rough surface, extruder surging, thickness variation, uneven wall thickness, diameter variation, and centering problems. In this work, the extruded tube quality was defined in terms of the inner diameter (ID) and the outer diameter (OD), as shown in Fig. 3. Although other characteristics could affect the quality of the product (e.g., length of the tube, wall thickness, or color uniformity), the only functional requirements for this application correspond to the inner and outer diameters because these are the critical characteristics stated by the customer.

Fig. 3 Extruded tube. Quality indices: inner and outer diameters

To guarantee the quality of the manufactured tube (i.e., the inner and outer diameters have to be within the design specifications set by the customer), every process parameter must be identified, controlled, and monitored throughout the extrusion process. For example, in the extruder zone, there are several input parameters that might yield significant deviations in the characteristics of the product: the base hopper temperature (BHT), the zone temperature (ZT), the die temperature (DT), the melting temperature (MT), and the revolutions per minute of the screw (SRPM). In the case of the pulling stage, the set of parameters comprises the tank temperature (TT), the vacuum pressure (VP), and the tension of the pulling mechanism (TPM).

In total, there are 15 process parameters that may produce deviations in the functional requirements of the extruded tube: four zone temperatures, four die temperatures, the melting temperature, the revolutions per minute of the screw, the base hopper temperature, two tank temperatures, the vacuum pressure, and the tension of the pulling mechanism. As a result, each sample will be described by these 15 input parameters and the two output variables mentioned in the previous paragraph (ID and OD).
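For illustration only, a single collected sample could be represented as below. The attribute abbreviations follow the text above (ZT, DT, MT, SRPM, BHT, TT, VP, TPM), but the indexed names and all numeric values are hypothetical, not taken from the actual database (whose layout is given in Table 1):

```python
import numpy as np

# Hypothetical layout of one sample: 15 process parameters (inputs)
# followed by the two quality indices (outputs). Values are made up.
INPUT_NAMES = [
    "ZT1", "ZT2", "ZT3", "ZT4",     # four zone temperatures
    "DT1", "DT2", "DT3", "DT4",     # four die temperatures
    "MT", "SRPM", "BHT",            # melting temp., screw RPM, base hopper temp.
    "TT1", "TT2", "VP", "TPM",      # two tank temps., vacuum pressure, pulling tension
]
OUTPUT_NAMES = ["ID", "OD"]         # inner and outer diameters

x = np.array([195.0, 200.0, 205.0, 210.0,   # zone temperatures (illustrative)
              215.0, 215.0, 216.0, 214.0,   # die temperatures (illustrative)
              212.0, 35.0, 60.0,            # MT, SRPM, BHT (illustrative)
              18.0, 19.0, 0.35, 12.0])      # TT1, TT2, VP, TPM (illustrative)
a = np.array([0.170, 0.250])                # ID, OD (illustrative)
```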

Table 1 Characteristics of the tube-extrusion data set used in the experiments

Table 1 reports the main characteristics of the database used in the empirical analysis: the attribute number, the attribute description, and some statistics, namely the minimum and maximum values, the mean, and the standard deviation.

It is important to point out that the different input parameters were measured and recorded using specific sensors during the extrusion and pulling processes, and an operator collected the data at fixed times. Analogously, the inner and outer diameters of the tubes were measured manually with a vernier caliper by the operator. During each shift, these tasks were carried out three times, thus obtaining a data set like the one shown in the example of Table 2.

Table 2 An example of collected data for the tubing extrusion process

Regression models

In this section, we briefly introduce the regression methods that will be further applied to product quality prediction for the tubing extrusion process just described.

Let \(T=\{(\mathbf{x }_1,a_1),\dots ,(\mathbf{x }_n,a_n)\} \in (\mathcal {X} \times \mathcal {A})^n\) be a data set of n independent and identically distributed (i.i.d.) random pairs \((\mathbf{x }_i,a_i)\), where \(\mathbf{x }_i = [x_{i1}, x_{i2}, \dots , x_{iD}]\) represents an instance in a D-dimensional feature space \(\mathcal {X}\) and \(a_i \in \mathcal {A}\) denotes the continuous target value associated with it. The aim of regression is to learn a function \(f:\mathcal {X}\rightarrow \mathcal {A}\) that predicts the target value a for a new sample \(\mathbf{y } = [y_1, y_2, \dots , y_D]\).

Nearest neighbor regression

One of the most popular and successful supervised learning methods corresponds to the nearest neighbor (NN) rule due to its algorithmic simplicity and high prediction performance. This non-parametric technique works under the assumption that new samples share similar properties with the set of stored instances and therefore, it predicts the output of a new sample based on its closest neighbor.

The concept of the NN rule can be generalized to regression: the nearest neighbor method assigns to a new sample \(\mathbf{y }\) the same target value as that of the closest instance in T, according to a particular dissimilarity measure (generally, the Euclidean distance). An extension of this procedure is the k-NN decision rule, in which the algorithm retrieves the k closest instances in T. When \(k=1\), the target value assigned to the input sample is that of its nearest neighbor. For \(k>1\), the k-NN regression model (k-NNR) estimates the target value \(f(\mathbf{y })\) of a new input sample \(\mathbf{y }\) by averaging the target values of its k nearest neighbors (Biau et al. 2012; Guyader and Hengartner 2013; Kramer 2011; Lee et al. 2014):

$$\begin{aligned} f(\mathbf{y }) = \frac{1}{k} \sum \limits _{i=1}^{k} a_i \end{aligned}$$
(1)

where \(a_i\) denotes the target value of the i-th nearest neighbor.
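As a minimal sketch of Eq. (1) for illustration only (the experiments reported later used the WEKA implementation, not this code), the k-NNR prediction can be written in a few lines of Python:

```python
import numpy as np

def knnr_predict(X_train, a_train, y, k=3):
    """k-NN regression (Eq. 1): average the target values of the k
    instances in X_train closest to y under the Euclidean distance."""
    dists = np.linalg.norm(X_train - y, axis=1)  # distance from y to every stored instance
    nn_idx = np.argsort(dists)[:k]               # indices of the k nearest neighbors
    return a_train[nn_idx].mean()                # Eq. (1): mean of their targets
```

For \(k=1\) this reduces to the plain NN rule, returning the target value of the single closest instance.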

Distance-weighted k-NN regression

When the basic k-NN algorithm estimates the target value for the new sample, it ignores some relevant information that each of the k nearest neighbors might provide regarding their distance (Batista and Silva 2009). To overcome this shortcoming, Dudani (1976) proposed a weighting function, which weights closer neighbors more heavily than distant ones, depending on their corresponding distances to the new sample.

Table 3 Average results in terms of RMSE and MAE (with standard deviations) on the outer diameter database

In general, a weighting function is based on the premise that the weights should decrease with increasing sample-to-neighbor distance (Dudani 1976). Let \(\mathbf{x }_i\) (\(i=1,\dots ,k\)) be the k closest instances to an input sample \(\mathbf{y }\), and let \(d_i=d(\mathbf{x }_i,\mathbf{y })\) be the distance between \(\mathbf{x }_i\) and \(\mathbf{y }\). A common weighting technique computes the weight \(w_i\) for the i-th nearest neighbor as the inverse of its distance (Dudani 1976):

$$\begin{aligned} w_i = \frac{1}{d_i}, \quad d_i \ne 0 \end{aligned}$$
(2)

Another possible weighting function (Batista and Silva 2009) can be defined as

$$\begin{aligned} w_i = 1-d_i \end{aligned}$$
(3)

Once the weights \(w_i\) have been computed, the distance-weighted k-NN approach for regression (k-NNRw) estimates the target value as follows (Hall et al. 2009):

$$\begin{aligned} f(\mathbf{y }) = \frac{\sum \limits _{i=1}^k w_i a_i}{\sum \limits _{i=1}^k w_i}\, \end{aligned}$$
(4)
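The two weighting schemes of Eqs. (2) and (3) plug directly into Eq. (4). A sketch under the same assumptions as the previous snippet (illustrative Python, not the WEKA implementation used in the experiments):

```python
import numpy as np

def knnrw_predict(X_train, a_train, y, k=3, scheme="inverse"):
    """Distance-weighted k-NN regression (Eq. 4) using the inverse-distance
    weights of Eq. (2) or the linear weights of Eq. (3)."""
    dists = np.linalg.norm(X_train - y, axis=1)
    nn_idx = np.argsort(dists)[:k]
    d, a_nn = dists[nn_idx], a_train[nn_idx]
    if scheme == "inverse":          # Eq. (2): w_i = 1/d_i, defined for d_i != 0
        if np.any(d == 0):           # exact match: return its target directly
            return a_nn[d == 0].mean()
        w = 1.0 / d
    else:                            # Eq. (3): w_i = 1 - d_i
        w = 1.0 - d                  # (assumes distances scaled to [0, 1])
    return np.dot(w, a_nn) / np.sum(w)   # Eq. (4): weighted average
```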

Linear regression

Multiple linear regression (LR) attempts to model the relationship between two or more independent variables (in this case, the input attributes reported in Table 1) and an output or response variable by fitting a linear equation to the observed data (Draper and Smith 1998). Each combination of values of the independent variables is associated with a value of the response variable. The general form of the multiple linear regression equation can be written as follows:

$$\begin{aligned} f(\mathbf{y }) = \alpha + \sum \limits _{j=1}^{D} \beta _j y_j + \epsilon \end{aligned}$$
(5)

where \(\alpha \) is a constant (the point where the regression line intercepts the Y-axis), \(\beta _j\) are the regression coefficients on the independent variables \(y_j\), and \(\epsilon \) is the residual or fitted error.

The regression coefficients \(\beta _j\) are estimated by curve fitting based on the least squares method, with the aim of minimizing the fitted error (the difference between the observed and estimated values). Equation 5 indicates how the average response of the output variable changes with the independent variables. Thus the LR model can be used to predict the target value a from a new observed sample \(\mathbf{y }\).
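As an illustrative sketch, the coefficients of Eq. (5) can be estimated with ordinary least squares, here via NumPy rather than the WEKA implementation used in the experiments:

```python
import numpy as np

def fit_lr(X, a):
    """Ordinary least-squares fit of Eq. (5): returns the intercept alpha
    and coefficient vector beta minimizing the squared residuals."""
    X1 = np.hstack([np.ones((X.shape[0], 1)), X])   # prepend a ones column for alpha
    coef, *_ = np.linalg.lstsq(X1, a, rcond=None)
    return coef[0], coef[1:]

def lr_predict(alpha, beta, y):
    """Predicted response for a new sample y (residual term omitted)."""
    return alpha + y @ beta
```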

Support vector regression

The foundations of support vector machines are well-known for both classification and regression problems. Smola and Schölkopf (2004) published an excellent tutorial on support vector machines for regression (SVR). The objective of the SVR model is to map the input data to a high-dimensional feature space, in which a linear regression function can fit the data more easily than in the original input space (Chou et al. 2017; Ma et al. 2003),

$$\begin{aligned} f(\mathbf{x }) = \mathbf{W }^T \varPhi (\mathbf{x }) + b \end{aligned}$$
(6)

where \(\mathbf{W }\) is a weight vector, \(\varPhi (\mathbf{x })\) maps the input sample \(\mathbf{x }\) to the high-dimensional feature space, and b is a bias term.

The parameters \(\mathbf{W }\) and b can be obtained by solving an optimization problem (Ma et al. 2003):

$$\begin{aligned} \begin{array}{ll} \underset{\mathbf{W },b}{\text {minimize}}\quad &{}\quad \frac{1}{2} \mathbf{W }^T \mathbf{W } + C {\sum \limits _{i=1}^n (\xi _i+\xi _i^*)} \\ \text {subject to}\quad &{}\quad a_i - (\mathbf{W }^T \varPhi (\mathbf{x }_i) + b) \le \epsilon + \xi _i \\ &{} (\mathbf{W }^T \varPhi (\mathbf{x }_i) + b) - a_i \le \epsilon + \xi _i^* \\ &{} \xi _i, \xi _i^* \ge 0, ~~i = 1, \dots , n \end{array} \end{aligned}$$
(7)

where C is a regularization parameter, \(\epsilon \) defines the width of the insensitive tube around the regression function, and \(\xi _i\) and \(\xi _i^*\) are non-negative slack variables that penalize errors greater than \(\epsilon \) in magnitude.

By introducing the Lagrange multipliers \(\alpha _i\), \(\alpha _i^*\), and a kernel function K, the model in the dual space can be written as:

$$\begin{aligned} f(\mathbf{x }) = {\sum \limits _{i=1}^n (\alpha _i - \alpha _i^*) K(\mathbf{x }_i,\mathbf{x })} + b \end{aligned}$$
(8)

The use of a kernel function makes it possible to deal with feature spaces of arbitrary dimensionality without having to compute the mapping function \(\varPhi (\mathbf{x })\) explicitly (Yang and Shieh 2010). The most commonly used kernels are linear, polynomial, sigmoid, and radial basis functions.
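For illustration, the three SVR configurations evaluated later in this study can be instantiated as follows in scikit-learn (the experiments themselves were run with WEKA; the C and epsilon values below are placeholders, not the tuned settings):

```python
from sklearn.svm import SVR

# Linear kernel (SVR-1), degree-2 polynomial kernel (SVR-2) and RBF
# kernel (SVR-RBF); C and epsilon are illustrative defaults only.
svr_1   = SVR(kernel="linear", C=1.0, epsilon=0.1)
svr_2   = SVR(kernel="poly", degree=2, C=1.0, epsilon=0.1)
svr_rbf = SVR(kernel="rbf", gamma="scale", C=1.0, epsilon=0.1)

# Typical usage: svr_rbf.fit(X_train, a_train); p = svr_rbf.predict(X_test)
```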

Experimental set-up

As already stated, this study aims to evaluate the performance of some regression models for product quality prediction in the tubing extrusion process of a manufacturing plant. Thus we conducted a set of experiments on a data set with 260 samples that were collected using the procedure described in “Description of the tubing extrusion process” section. It has to be remarked that all input attribute values (process parameters) were normalized to the range [0, 1].

We focused our study on the simple k-NNR (no weighting), two weighted versions of k-NNR using Eqs. 2 and 3, hereafter called k-NNRw1 and k-NNRw2 respectively, the LR model, the SVR technique with three different kernels, and the multi-layer perceptron (MLP) neural network. The kernels used in the SVR model were a linear function (SVR-1), a polynomial function of degree 2 (SVR-2) and a radial basis function (SVR-RBF). For the regression algorithms based on the k-NN rule, fifteen odd values of k (\(1,3,\dots , 29\)) were tested. All regression models were taken from the WEKA toolkit (Hall et al. 2009).

Following the standard strategy used to evaluate the performance of regression models when databases are small- or medium-sized, the 10-fold cross-validation method was adopted (Buza et al. 2015; Hall et al. 2009; Hu et al. 2014). The original data set was randomly divided into ten parts of size n/10 (where n denotes the total number of samples in the data set); for each fold, nine parts were gathered as the training set for learning the model, and the remaining part was used as an independent test set. Additionally, with the aim of increasing the statistical significance of the experimental scores, ten repetitions were run for each trial and the results from predicting the output of test samples were averaged across the 100 runs.
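A sketch of this evaluation protocol (10 repetitions of 10-fold cross-validation, i.e., 100 runs) in scikit-learn terms, assuming X is the n × 15 matrix of process parameters and a the corresponding diameter values; the actual experiments used WEKA, and fitting the scaler on each training fold only is an added precaution, as the text simply states that inputs were normalized to [0, 1]:

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold
from sklearn.preprocessing import MinMaxScaler
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error

rkf = RepeatedKFold(n_splits=10, n_repeats=10, random_state=0)
rmse_scores, mae_scores = [], []

for train_idx, test_idx in rkf.split(X):
    scaler = MinMaxScaler()                      # normalize inputs to [0, 1]
    X_tr = scaler.fit_transform(X[train_idx])    # fit on the training fold only
    X_te = scaler.transform(X[test_idx])
    model = KNeighborsRegressor(n_neighbors=3).fit(X_tr, a[train_idx])
    p = model.predict(X_te)
    rmse_scores.append(np.sqrt(mean_squared_error(a[test_idx], p)))
    mae_scores.append(mean_absolute_error(a[test_idx], p))

print(f"RMSE: {np.mean(rmse_scores):.4f} ({np.std(rmse_scores):.4f})")
print(f"MAE:  {np.mean(mae_scores):.4f} ({np.std(mae_scores):.4f})")
```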

Evaluation criteria

In the framework of regression, the purpose of most performance evaluation metrics is to estimate how much the predictions \((p_1,p_2,\ldots , p_n)\) deviate from the actual target values \((a_1,a_2,\ldots , a_n)\). These metrics are minimized when the predicted value for each test sample agrees with its true value (Caruana and Niculescu-Mizil 2004). Two of the most popular performance measures that have frequently been employed to assess model performance in regression problems are the root mean square error (RMSE),

$$\begin{aligned} RMSE = \sqrt{\frac{1}{n} \sum \limits _{i=1}^n(p_i-a_i)^2} \, \end{aligned}$$
(9)

and the mean absolute error (MAE),

$$\begin{aligned} MAE = \frac{1}{n} \sum \limits _{i=1}^n\mid p_i-a_i \mid \, \end{aligned}$$
(10)

Fig. 4 Outer diameter database: performance regression measures with k-NN regression models when varying k. (a) Root mean square error. (b) Mean absolute error

Table 4 Average results in terms of RMSE and MAE (with standard deviations) on the inner diameter database

Both these metrics show how far away the predicted values \(p_i\) are from the target values \(a_i\) by averaging the magnitude of individual errors regardless of their sign.
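Both metrics translate directly into code; a minimal sketch, given predictions p and targets a as NumPy arrays:

```python
import numpy as np

def rmse(p, a):
    """Root mean square error, Eq. (9)."""
    return np.sqrt(np.mean((p - a) ** 2))

def mae(p, a):
    """Mean absolute error, Eq. (10)."""
    return np.mean(np.abs(p - a))
```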

Results and discussion

Since product quality was predicted here using the inner and outer diameters of the extruded tubes, the experiments and the subsequent analysis of results were performed according to these two physical quality indices. Hence, for each database (i.e., the outer and inner diameter databases), we compared the averages of the two performance measures (RMSE and MAE) achieved by each regression method.

Outer diameter database

Table 3 reports the average results in terms of RMSE and MAE across the 100 runs for each regression technique. In the case of the k-NN methods, the values correspond to those of the best k. Based on the root mean square error, one can observe that the three k-NN algorithms, the linear support vector regression (SVR-1) and the SVR-RBF obtained the lowest error rates (very close to 0) when predicting the outer diameter. The same behavior can be observed in terms of MAE.

If we consider that the output values of the outer diameter range from 0.228 to 0.331, then all these regression models appear to be suitable for predicting the quality of extruded tubes. However, the small differences in both RMSE and MAE results do not allow us to draw significant conclusions about which method is the best numerical prediction technique. In fact, even the LR and MLP models could be applied to tackle this problem because their errors were also close to 0.

Figure 4 shows the performance measures for the three versions of k-NN regression when varying the value of k from 1 to 29. The graphical results suggest that, as k increases, k-NNRw1 remains stable across the whole range of values. In the case of the k-NNR and k-NNRw2 models, the error rates increase along with the value of the parameter k. In summary, it appears that k-NNRw1 performed the best for all values of \(k>1\), demonstrating the benefits of applying this technique to predict the quality of extruded tubes with respect to their outer diameter.

Inner diameter database

As in the previous section, we analyzed the behavior of the regression models in predicting the inner diameter of an extruded tube. Table 4 shows the results of RMSE and MAE averaged across the 100 runs for each technique. Results are conceptually similar to those of the outer diameter database: (i) the methods based on k-NN, the SVR-1 and the SVR-RBF yielded very low error values (\(\approx 0\)); (ii) here MLP also appears to be among the best performing algorithms; and (iii) except for the SVR-2 method, the differences in the results of the regression models do not seem to be significant.

Figure 5 depicts the performance results for all versions of the k-NN regression models as a function of k. One can observe that k-NNRw1 achieved very similar performance results regardless of the k value. In the case of the plain k-NNR and k-NNRw2, the error rates decrease slightly as k increases. These results suggest that the three k-NN regression models are suitable for predicting the inner diameter of extruded tubes, although the k-NNR and k-NNRw2 approaches seem to be the best techniques.

Fig. 5 Inner diameter database: performance regression measures with k-NN regression models when varying k. (a) Root mean square error. (b) Mean absolute error

Conclusions and future work

The present paper has focused on predicting two quality indices in a tubing extrusion process. A thorough experimental study has been carried out on a real-life data set provided by an extrusion tube manufacturing plant located in Ciudad Juárez (Chihuahua, Mexico). More specifically, three k-NN regression methods (the straightforward algorithm and two distance-weighted approaches), the linear regression model, three SVR configurations (SVR-1, SVR-2, and SVR-RBF), and a multi-layer perceptron have been used to predict the inner and outer diameters of an extruded tube based on the evaluation of 15 process parameters.

Experimental results suggest that the distance-weighted k-NN regression models, along with the linear and RBF-based support vector regression methods, were the most effective techniques for the prediction of extruded tube quality, achieving RMSE and MAE rates close to 0. From our analysis when varying the k values, we found that the regression performance remains (almost) stable as k increases.

Future research will mainly address the incorporation of a feature selection phase to remove attributes that might be considered noisy or irrelevant. Another avenue for further investigation is the development of regression algorithms based on the surrounding neighborhood concept. Finally, we are also interested in analyzing the behavior of ensembles of regression models.