1 Introduction

Many manufacturing processes are naturally multivariate. Although such processes are common in manufacturing environments, the quality-characteristic models built for them with traditional ordinary least squares (OLS) are often inefficient. These processes are also specific, which implies that a single model cannot be employed for every application, raw material, or operational setup. When searching for the best process operating conditions, the behavior of the desired features must be evaluated as a function of increments in the factors that are, at first, considered significant. This is typically the experimental strategy.

Sometimes the existence of a set of multiple quality characteristics gives rise to a highly correlated structure among the dependent variables. A number of studies evaluate processes with this particular feature [1–4]. However, even when the responses present a highly correlated structure, the authors raise no major concerns regarding the influence of such structures on the estimation of the model coefficients [5–7] or on the independence of the residuals [8].

The best-known optimization method for multiple responses is the so-called Harrington desirability function [9]. However, according to Khuri and Conlon [10], this method is not able to incorporate the correlation structure present in the original data set. This means that the interrelationship among the responses can lead the experimenter to inconclusive results when the analysis is univariate [6], and such a solution can lie far from the simultaneous optimum.

Many authors try to overcome this problem using principal component analysis (PCA) [5–7]. However, two important shortcomings must be considered when PCA is used for multiple-response optimization: (a) the conflict between maximization and minimization objectives within the group of variables that need to be optimized simultaneously and (b) the low significance of the principal components when the inter-correlation in the data is weak. In either of these cases PCA is not adequate to optimize the responses. Conversely, if these shortcomings can be circumvented, the PCA approach should be equivalent to other multiresponse optimization methods.

In this study, the proposal developed by Liao and Chen [11] to treat a multiresponse Taguchi design through data envelopment analysis (DEA) is extended to the case of a multiple-response set obtained with a response surface design, namely a rotatable central composite design (CCD). This approach is then compared with the multivariate response surface approach enhanced in [12], in an attempt to optimize the output variables of a specific manufacturing process. To accomplish the proposed objective, experimentation is conducted in the region of interest according to the chosen design, and the responses of interest are recorded and calculated. Treating each run of the CCD as an alternative DMU (decision making unit), the efficiency of every experimental run is calculated with a typical linear program written in the DEA format. The regression analysis is then carried out using the DMU efficiencies. Finally, the GRG optimization algorithm is used to determine the optimal process parameters based on the full quadratic model obtained with the DEA efficiency as a single response. To compare the results, a multivariate alternative procedure based on an index formed by the weighted largest principal component scores of the correlation matrix of the original set of responses is suggested as an objective function. Using the same nonlinear optimization algorithm, a feasible optimum is investigated.

As a manufacturing process example, a pulsed gas metal arc welding (P-GMAW) process is employed; it is widely used in industry for welding a wide range of ferrous and non-ferrous materials. P-GMAW achieves coalescence of metals by melting a continuously fed current-carrying wire. However, high-quality welding procedures are necessary to achieve a high bead quality. The process is characterized by pulsing of the current between a low-level background current and a high-level peak current in such a way that the mean current is always below the threshold level of spray transfer. The background current maintains the arc, whereas the peak currents last long enough to ensure detachment of the molten droplets [13]. Five bead profile properties are recorded and used as correlated response variables to test the approaches above.

This paper is organized as follows: Sect. 2 presents the overview of the response surface methodology. Section 3 presents the data envelopment analysis. Section 4 presents the multivariate response surface approach. Section 5 presents the experimental procedure and data analysis. Finally, we offer our conclusions in Sect. 6.

2 Overview of the response surface methodology

According to [14], RSM is a collection of mathematical and statistical tools used to model and analyze problems whose responses of interest are influenced by several variables. In general, the relationship between the dependent and independent variables is unknown. Therefore, one must find a reasonable approximation for the true relationship between y and the set of independent variables. Usually, a low-order polynomial over some region of interest is employed. However, if there is curvature in the system, the approximating function must be a polynomial of higher degree, such as the second-order model described by Eq. (1).

$$y = \beta_{0} + \sum\limits_{i = 1}^{k} {\beta_{i} } x_{i} + \sum\limits_{i = 1}^{k} {\beta_{ii} } x_{i}^{2} + \sum\limits_{i < j} {\sum {\beta_{ij} } } x_{i} x_{j} + \varepsilon$$
(1)

Montgomery [14] also considers it unlikely that a specific polynomial model approximates the true model over the whole experimental space covered by the independent variables. For a specific region, however, the approximation is usually efficient. The OLS method is used to estimate the parameters (β), which in matrix form can be written as:

$${\hat{\mathbf{\beta }}} = {\mathbf{(X}}^{{\mathbf{T}}} {\mathbf{X)}}^{{ - {\mathbf{1}}}} {\mathbf{X}}^{{\mathbf{T}}} {\mathbf{y}}$$
(2)

where X is the matrix of factor levels and y is the vector of responses. The presence of curvature in the model is evaluated through the analysis of center points for the factor levels.
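As an illustration of Eqs. (1) and (2), the sketch below fits a full second-order model by OLS for two coded factors. The helper `quadratic_design_matrix` and the data are hypothetical placeholders, not the welding results reported in Sect. 5.

```python
import numpy as np

def quadratic_design_matrix(X):
    """Expand an (n x k) matrix of coded factor levels into the full
    second-order model of Eq. (1): intercept, linear, squared and
    two-factor interaction terms."""
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(k)]                                     # linear terms
    cols += [X[:, i] ** 2 for i in range(k)]                                # quadratic terms
    cols += [X[:, i] * X[:, j] for i in range(k) for j in range(i + 1, k)]  # interactions
    return np.column_stack(cols)

# Hypothetical coded factor levels (small CCD) and a hypothetical response vector
X = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1], [0, 0], [0, 0],
              [-1.414, 0], [1.414, 0], [0, -1.414], [0, 1.414]])
y = np.array([10.2, 12.5, 11.1, 14.8, 13.0, 12.8, 9.9, 13.9, 10.7, 12.9])

Xm = quadratic_design_matrix(X)
# OLS estimate of Eq. (2): beta_hat = (X'X)^-1 X'y (lstsq is numerically safer)
beta_hat, *_ = np.linalg.lstsq(Xm, y, rcond=None)
print(beta_hat)
```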

Derringer and Suich [15], dealing with multiple-response problems, improved the desirability function algorithm of [9]. In this method, the statistical model is first obtained using OLS. Then, using a set of transformations based on the limits imposed on the responses, each response is converted into an individual desirability function d i , with \(0 \le d_{i} \le 1\). These individual values are then combined through a geometric mean, such as:

$$D = \left[ {\prod\limits_{i = 1}^{p} {d_{i} (Y_{i} )} } \right]^{\frac{1}{p}}$$
(3)

The value of D gives a compromise solution and is restricted to the interval [0, 1]. D is close to 1 when the responses are close to their specifications. The type of transformation depends on the desired direction of optimization.
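A minimal sketch of Eq. (3), assuming the standard one-sided larger-is-better ramp as the individual transformation; the helper names and the bounds and targets below are illustrative, not taken from the paper.

```python
import numpy as np

def desirability_larger_is_better(y, low, target, weight=1.0):
    """One-sided larger-is-better transformation: d = 0 below `low`,
    1 above `target`, and a power ramp in between."""
    d = (np.asarray(y, dtype=float) - low) / (target - low)
    return np.clip(d, 0.0, 1.0) ** weight

def overall_desirability(d_values):
    """Geometric mean of the individual desirabilities, Eq. (3)."""
    d = np.asarray(d_values, dtype=float)
    return d.prod() ** (1.0 / d.size)

# Hypothetical predicted responses with hypothetical lower bounds and targets
d1 = desirability_larger_is_better(3.1, low=2.0, target=4.0)   # e.g. a penetration-like response
d2 = desirability_larger_is_better(8.5, low=7.0, target=10.0)  # e.g. a width-like response
print(overall_desirability([d1, d2]))
```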

The desirability function approach to this optimization problem is simple, easy to apply, and permits the user to make subjective judgments about the importance of each response. However, according to [10], the methodology does not take into account the variances and correlations of the responses. Ignoring these correlations can alter the structure of the overall desirability function, which in turn may jeopardize the determination of the optimum operating conditions.

3 Data envelopment analysis

Data envelopment analysis (DEA) is a linear programming-based technique for measuring the relative efficiency of a set of competing DMUs (decision making units) when the presence of multiple inputs and outputs makes comparisons difficult [16]. According to [17], the relative efficiency of a DMU with multiple inputs and outputs is typically defined as a ratio: the weighted sum of the DMU's outputs divided by the weighted sum of its inputs. Hence, for a DMU to attain a higher relative efficiency, the inputs in the ratio must take lower values and the outputs must take higher values [18]. In this paper, DEA is combined with traditional response surface methodology to solve a multiresponse welding problem. Each factor/level combination is treated as a DMU. Following the approach proposed by [11], the larger-is-better welding quality characteristics are considered outputs, and the smaller-is-better responses are treated as inputs. Maximizing the ratio between the weighted sums of each DMU's outputs and inputs leads to the highest efficiency, and larger relative efficiency values imply that the targets for the weld bead quality characteristics are more fully achieved.

Once the efficiency of each experiment in the CCD is calculated, this multiresponse index is used as the dependent variable. Proceeding as in traditional RSM, the model coefficients are obtained with the OLS method. After inspecting the individual significance of the terms, the practitioner can decide whether the reduced or the full quadratic model should be adopted. The quadratic function of the DEA index is then maximized with the GRG algorithm, subject to the experimental region constraint. To compare the results, a hybrid multivariate approach proposed in [12] is used. According to [11], the main shortcomings of the PCA-based methods are: (a) how to trade off among components to select a feasible solution when more than one principal component is retained, and (b) a low correlation structure among the original responses. The method proposed in [12] has shown itself capable of overcoming these drawbacks. Thus, if the correlation structure among the multiple responses is strong enough, the PCA and DEA methods should converge.

In this work, the mathematical notation of [19] will be adopted to represent the DEA CCR model [20]. According to this formulation, the general efficiency measure used by DEA can be summarized as:

$$E_{ks} = \frac{{\sum\nolimits_{y} {O_{sy} v_{ky} } }}{{\sum\nolimits_{x} {I_{sx} u_{kx} } }}$$
(4)

where E ks is the efficiency measure for each experiment s, using the weights of the assessed experiment k; O sy the value of output y for experiment s; I sx the value of input x for experiment s; v ky the weight assigned to trial experiment k for output y; and u kx the weight assigned to trial experiment k for input x.

To determine the optimal set of weights for the experiment k under evaluation (DMU), many mathematical models have been developed. Among them, the CCR model developed by [20] is the most popular. The objective of the CCR model is to maximize the relative efficiency of the experiment k under analysis, relative to a reference set of experiments s, by selecting the optimal weights associated with the input and output measures. The maximum relative efficiency is constrained to 1. The nonlinear programming formulation expressed in Eq. (4) can be written as:

$$\begin{gathered} \max E_{kk} = \frac{{\sum\nolimits_{y} {O_{ky} v_{ky} } }}{{\sum\nolimits_{x} {I_{kx} u_{kx} } }} \hfill \\ {\text{s.t.:}}\;\sum\limits_{y} {O_{sy} v_{ky} } - \sum\limits_{x} {I_{sx} u_{kx} } \le 0,\quad \forall {\text{ designs }}s \hfill \\ u_{kx} ,v_{ky} \ge 0 \hfill \\ \end{gathered}$$
(5)

Equation (5) can be written as a linear programming formulation, as described in Eq. (6), by setting its denominator equal to 1 and maximizing its numerator.

$$\begin{gathered} \max \,E_{kk} = \sum\limits_{y} {O_{ky} v_{ky} } \hfill \\ {\text{s.t.:}}\;\sum\limits_{x} {I_{kx} u_{kx} } = 1 \hfill \\ \quad \;\;\sum\limits_{y} {O_{sy} v_{ky} } - \sum\limits_{x} {I_{sx} u_{kx} } \le 0\quad \forall {\text{ designs }}s \hfill \\ \quad \;\;u_{kx} ,v_{ky} \ge 0 \hfill \\ \end{gathered}$$
(6)

The result of formulation (5) is an optimal efficiency value \(\left( {E_{\text{kk}}^{*} } \right)\) that is at most equal to one. According to [11], when \(E_{\text{kk}}^{*} = 1\), no other experiment is more efficient than experiment k under its selected weights. Efficiency values lower than one imply that the factor/level combination does not lie on the optimal frontier and that there is at least one other experiment that is more efficient under the optimal set of weights determined by formulation (6). On the other hand, it is also possible that no experiment achieves unitary efficiency, since the experiments represent only a portion of the full experimental space, producing what is called "censored data" in [21]. Another feasible situation occurs when the individual efficiencies of all experiments lie in the vicinity of 100 % efficiency. These situations suggest that no run of the experimental design is optimal and indicate that the optimal combination does not belong to the explicit set of chosen experimental runs. To overcome this shortcoming, [17, 21] proposed a Taguchi neural network approach, in which a Taguchi design is used to generate the initial training set of a back-propagation neural network algorithm; this training set is formed by the efficiencies calculated from the design. Another way to overcome this shortcoming is to employ the conceptual framework of design of experiments (DOE). According to the DOE framework, the experimental results should be used to predict response values only inside the region formed by the factor levels. Thus, assuming that every value of each factor between its levels is equally probable, one could adopt Monte Carlo or Latin hypercube sampling, generating data from a uniform probability distribution. A simpler search for the optimum is available if the practitioner adopts a gradient-based method, such as the generalized reduced gradient (GRG). Gradient methods demand a differentiable function, which can be obtained by fitting a response surface.
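A minimal sketch of the CCR multiplier model of Eq. (6), solved once per DMU with scipy.optimize.linprog. The helper `ccr_efficiencies` and the matrices O and I are illustrative placeholders, not the standardized welding responses of Sect. 5.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiencies(outputs, inputs):
    """CCR multiplier model, Eq. (6): for each DMU k, maximize the weighted
    output sum with the weighted input sum normalized to 1, subject to
    efficiency <= 1 for every DMU s. Returns one efficiency per DMU."""
    n, n_out = outputs.shape
    n_in = inputs.shape[1]
    eff = np.zeros(n)
    for k in range(n):
        # Decision variables: [v_k1..v_k,n_out, u_k1..u_k,n_in]
        c = np.concatenate([-outputs[k], np.zeros(n_in)])       # maximize -> minimize negative
        A_ub = np.hstack([outputs, -inputs])                    # O_s v - I_s u <= 0 for all s
        b_ub = np.zeros(n)
        A_eq = np.concatenate([np.zeros(n_out), inputs[k]])[None, :]  # I_k u = 1
        b_eq = np.array([1.0])
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                      bounds=[(0, None)] * (n_out + n_in), method="highs")
        eff[k] = -res.fun
    return eff

# Hypothetical standardized larger-is-better (outputs) and smaller-is-better (inputs) responses
O = np.array([[0.8, 0.6], [0.4, 0.9], [1.0, 0.5]])
I = np.array([[0.3, 0.7], [0.6, 0.2], [0.5, 0.5]])
print(ccr_efficiencies(O, I))
```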

To employ the DEA approach in a multiresponse optimization problem, [11] warn that the responses must be standardized. Although a wide range of standardization procedures exists, in this work the normalization described by Eq. (7) is adopted.

Assuming that X ij is an observation of the ith \(\left( {i = 1,2, \ldots ,m} \right)\) response at the jth \(\left( {j = 1,2, \ldots ,n} \right)\) factor/level combination, its standardized value \(Z_{ij} \left( {0 \le Z_{ij} \le 1} \right)\) is more appropriate because it avoids the shortcomings raised by responses measured in different units. In experiments with replicates, X ij should be taken as the mean of each specific treatment. This formulation of \(Z_{ij}\) can be used for responses that must be minimized or maximized. For this case, the standardization of the responses can be done using Eq. (7) as follows:

$$Z_{ij} = \frac{{X_{ij} - \hbox{min} \left\{ {X_{ij} ,\;j = 1,2, \ldots ,n} \right\}}}{{\hbox{max} \left\{ {X_{ij} ,\;j = 1,2, \ldots ,n} \right\} - \hbox{min} \left\{ {X_{ij} ,\;j = 1,2, \ldots ,n} \right\}}}$$
(7)

for \(i = 1,2, \ldots ,m,\;j = 1,2, \ldots ,n\).
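A one-function sketch of Eq. (7) using numpy, applied column-wise to a hypothetical response matrix with one row per run; the helper name is illustrative.

```python
import numpy as np

def minmax_standardize(X):
    """Eq. (7): rescale each response (column) to [0, 1] across the n runs."""
    X = np.asarray(X, dtype=float)
    return (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Hypothetical raw responses: rows are runs, columns are quality characteristics
raw = np.array([[2.1, 9.3, 0.31],
                [3.4, 8.1, 0.27],
                [2.8, 10.2, 0.35]])
print(minmax_standardize(raw))
```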

4 Multivariate response surface approach

Principal component analysis (PCA) is one of the most widely applied tools for summarizing common patterns of variation among variables. Moreover, PCA retains the meaningful information in the early axes, whereas variation associated with experimental error, measurement inaccuracy, and rounding is relegated to the later axes. According to [22], a principal component is algebraically a linear combination of the p random variables \(X_{1} ,X_{2} , \ldots ,X_{p}\). Geometrically, these combinations represent the selection of a new coordinate system obtained by rotating the original system whose axes are \(X_{1} ,X_{2} , \ldots ,X_{p}\); the new axes represent the directions of maximum variability. The principal components are uncorrelated and depend only on the covariance matrix Σ (or on the correlation matrix ρ) of the variables \(X_{1} ,X_{2} , \ldots ,X_{p}\), and their derivation does not require the multivariate normality assumption.

Assuming that Σ is the covariance matrix associated with the random vector \(X^{T} = \left[ {X_{1} ,X_{2} , \ldots ,X_{p} } \right]\) and that this matrix has eigenvalue–eigenvector pairs \(\left( {\lambda_{1} ,e_{1} } \right),\left( {\lambda_{2} ,e_{2} } \right), \ldots ,\left( {\lambda_{p} ,e_{p} } \right)\), where \(\lambda_{1} \ge \lambda_{2} \ge \cdots \ge \lambda_{p} \ge 0\), the ith principal component is given by:

$$Y_{i} = e_{i}^{T} X = e_{1i} X_{1} + e_{2i} X_{2} + \cdots + e_{pi} X_{p} \quad i = 1,2, \ldots ,p$$
(8)

Since the eigenvectors are orthogonal, the ith component is the solution of:

$$\begin{aligned} &{\text{Maximize Var}}\left( {\ell_{i}^{T} X} \right) \hfill \\ &{\text{Subject to:}} \, \ell_{i}^{T} \, \ell_{i} = 1 \hfill \\ &{\text{Cov}}(\ell_{i}^{T} X,\ell_{k}^{T} X) = 0, \, k < i \hfill \\ \end{aligned}$$
(9)

It is often useful to write the linear combination in the form of a principal component score. Let \(x_{\text{pn}}\) be a random observation, \(\bar{x}_{p}\) the pth response average, \(\sqrt {s_{\text{pp}} }\) the response standard deviation, p the response index, [Z] the matrix of standardized original data and [E] the eigenvector matrix of the multivariate set. Then:

$${\text{PC}}_{\text{score}} = \left[ Z \right] \, . \, \left[ E \right].$$
(10)
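A short sketch of Eqs. (8)–(10) and of the Kaiser criterion discussed below: eigen-decomposition of the correlation matrix of the standardized responses, followed by the score computation of Eq. (10). The helper `pca_scores` and the synthetic response matrix are illustrative, not the welding data.

```python
import numpy as np

def pca_scores(Y):
    """Standardize the responses, decompose their correlation matrix and
    return the PC scores (Eq. (10)), eigenvalues and eigenvectors."""
    Y = np.asarray(Y, dtype=float)
    Z = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)   # standardized data [Z]
    R = np.corrcoef(Y, rowvar=False)                   # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)               # symmetric eigen-decomposition
    order = np.argsort(eigvals)[::-1]                  # sort by decreasing eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs                               # PC scores = [Z][E]
    return scores, eigvals, eigvecs

# Hypothetical correlated responses (rows = runs, columns = responses)
rng = np.random.default_rng(0)
base = rng.normal(size=(30, 1))
Y = np.hstack([base + 0.1 * rng.normal(size=(30, 1)) for _ in range(5)])

scores, eigvals, eigvecs = pca_scores(Y)
kept = eigvals > 1.0                                   # Kaiser criterion (see below)
print(eigvals, kept, eigvals.cumsum() / eigvals.sum())
```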

According to [23], several tests are needed to evaluate the adequacy of the data for PCA. In this work, the isotropy test, the Bartlett sphericity test, the Kullback index (KI), the divergence index (DI) and the generalized correlation index (GCI) [24] are addressed, as shown in Table 1. The number of response variables is denoted p, and m is the number of retained axes, so that the remaining axes are assumed to vary isotropically.

Table 1 Tests and statistics index for PCA application

The number of axes that are assumed to vary isotropically, and can therefore be discarded, is simply r = p − m. The number of experiments is n and R is the correlation matrix of the p variables. For the isotropy test the number of degrees of freedom is \(\nu = 0.5r\left( {r + 1} \right) - 1\), while for the Bartlett test it is \(p\left( {p - 1} \right)/2\). The null hypothesis is rejected when the test statistic exceeds the critical value, which can also be expressed through the P value (P value < α). In this work, α = 0.05 was assumed.
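Since Table 1 is not reproduced here, the sketch below assumes the usual chi-square approximation of the Bartlett sphericity statistic; the helper name and the synthetic data are illustrative.

```python
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(Y, alpha=0.05):
    """Bartlett sphericity test: H0 states that the correlation matrix is the
    identity, i.e. the responses are uncorrelated and PCA is not worthwhile."""
    Y = np.asarray(Y, dtype=float)
    n, p = Y.shape
    R = np.corrcoef(Y, rowvar=False)
    stat = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    p_value = chi2.sf(stat, df)
    return stat, p_value, p_value < alpha   # True -> reject H0 at level alpha

# Hypothetical correlated responses (rows = runs, columns = responses)
rng = np.random.default_rng(1)
base = rng.normal(size=(30, 1))
Y = np.hstack([base + 0.2 * rng.normal(size=(30, 1)) for _ in range(5)])
print(bartlett_sphericity(Y))
```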

There is a variety of stopping rules for estimating the adequate number of non-trivial axes, that is, the number of significant principal components. According to [22], the most popular rules are those based on Kaiser's criterion: only the principal components whose eigenvalues are greater than one should be kept to represent the original set, and the accumulated explained variance should be greater than 80 %. These criteria are adequate when the correlation matrix is used; the covariance matrix should only be used for a set of original responses expressed on the same scale.

Dealing with multivariate responses raises several difficulties. Modeling each response variable independently takes no account of the relationships or correlations among the variables. According to [25], special care is necessary in analyzing multiresponse data to avoid misleading interpretations. The basic problem is associated with fitting multiresponse models while ignoring three kinds of dependencies that can occur: dependence among the errors, linear dependencies among the expected values of the responses, and linear dependencies in the original data. To overcome these difficulties, a hybrid strategy based on multivariate statistics for summarizing and reducing the dimensionality of the data can be employed. Principal component analysis (PCA) factorizes the multivariate data into a number of independent factors that take into account the variances and correlations among the original variables. For this reason, a natural formulation of the multiresponse problem replaces the original response variables by a principal component score, and this new response is modeled through the OLS algorithm. To force the solution to fall inside the experimental region, a constrained nonlinear programming problem written in terms of principal components can be expressed as in the following equation:

$$\begin{aligned} &{\text{Minimize PC}}_{1} = \beta_{0} + \sum\limits_{i = 1}^{k} {\beta_{i} } x_{i} + \sum\limits_{i = 1}^{k} {\beta_{ii} } x_{i}^{2} + \sum\limits_{i < j} {\sum {\beta_{ij} } } x_{i} x_{j} \hfill \\ &{\text{Subject to:}} \, x^{T} x \le \rho^{2} \, \hfill \\ \end{aligned}$$
(11)

Treating the principal components rather than the original response variables has several advantages. If the first principal component represents a high proportion of the total variance in the data, it provides a univariate summary of the multivariate responses. Inspection of the loadings (eigenvectors) reveals the kind of relationship between the ith principal component score equation and the original responses. According to [26], a response surface model for a principal component provides a model of the overall response that takes account of the correlations among the response variables and their relative importance. Linear relationships among the response variables can be immediately identified through zero eigenvalues and omitted from further consideration, which avoids unnecessary work when the number of responses exceeds the number of observations.

Even though a set of variables may be well represented by principal components, a single principal component is not always enough for this representation; indeed, this is rarely the case in complex manufacturing processes. To integrate more than one principal component into a comprehensive index, a simple approach based only on the significant principal components is considered more appropriate. A hypothesis test can reveal which components must be chosen to create the multivariate index.

Considering the eigenvalues of the correlation matrix as a set of weights for the most representative PC scores, Paiva [12] established a multivariate global index (MGI), obtained as the sum of the significant components weighted by their respective eigenvalues. When the MGI response surface is created and modeled, a constrained nonlinear programming strategy can be applied, as described in Eq. (12):

$$\begin{aligned}&{\text{Maximize MGI}} = \sum\limits_{i = 1}^{m} {\left[ {\lambda_{i} \left( {{\text{PCs}}_{i} } \right)} \right]} \\ &{\text{Subject to:}}\,\, x^{T} x \le \rho^{2} \end{aligned}$$
(12)

where m is the number of significant principal components according to the Bartlett sphericity test; \(\lambda_{i}\) is the ith largest eigenvalue and PCs i is the ith largest PC-score.
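A brief sketch of Eq. (12), assuming the scores and eigenvalues come from a decomposition such as the one sketched for Eq. (10); the helper `mgi` and the numbers are illustrative, and the sign of each retained component would still have to be checked against the correlation analysis of Sect. 5.

```python
import numpy as np

def mgi(scores, eigvals, n_significant):
    """Eq. (12): multivariate global index as the eigenvalue-weighted sum of
    the significant principal component scores (one MGI value per run)."""
    m = n_significant
    return scores[:, :m] @ eigvals[:m]

# Hypothetical PC scores (rows = runs) and eigenvalues, with two significant PCs
scores = np.array([[1.2, -0.3, 0.1],
                   [-0.8, 0.5, -0.2],
                   [0.4, 0.9, 0.0]])
eigvals = np.array([2.7, 1.3, 0.6])
print(mgi(scores, eigvals, n_significant=2))
```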

Optimum values can be obtained by locating the stationary point of the fitted surface. The objective is to find the settings of the x's that optimize the objective function subject only to the constraint that defines the region of interest Ω. In other words, the optimum of the fitted objective function is located with a GRG algorithm through a constrained procedure that forces the optimum to lie within the experimental region. In this work, three objective functions are studied: the DMU-based efficiency (DEA), the first principal component score (PC1) and the multivariate global index (MGI). As constraints, two different regions of interest are used in optimization: spherical and cuboidal. For cuboidal designs, the constraint is written as \(- 1 \le x_{i} \le 1\), i = 1, 2,…, k (k is the number of control variables), and for spherical designs the constraint is defined by \(\left( {x^{T} x} \right) \le \rho^{2}\), where ρ is the design radius. The value of ρ should be chosen so as to avoid solutions that fall too far outside the experimental region that generated the response surfaces of Eq. (1). For a central composite design, a logical choice is ρ = α, where α is the axial distance. In the case of cuboidal designs (such as Box–Behnken and factorial or fractional factorial designs), natural choices for the lower and upper bounds on the x's are the experimental low and high coded levels, respectively. In this work, a spherical constraint represented by the axial point distance is adopted. According to [14], for non-blocked response surface designs the axial distance is given by \(\sqrt[4]{{2^{k} }}\), where k is the number of controllable factors present in the design.
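A sketch of the constrained maximization step. The paper uses the GRG algorithm; here scipy's SLSQP solver is used as a readily available gradient-based substitute, and the quadratic coefficients are hypothetical placeholders for a fitted model such as those in Table 9.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical fitted full quadratic model in k = 2 coded factors:
# f(x) = b0 + b'x + x'Bx, where B holds the pure quadratic and interaction terms
b0 = 0.62
b = np.array([0.10, -0.05])
B = np.array([[-0.08, 0.02],
              [0.02, -0.06]])

def fitted_model(x):
    return b0 + b @ x + x @ B @ x

rho = 2.0                                                      # CCD axial distance (design radius)
sphere = {"type": "ineq", "fun": lambda x: rho ** 2 - x @ x}   # x'x <= rho^2

# Maximize the fitted objective (DEA efficiency, PC1 or MGI) inside the sphere
res = minimize(lambda x: -fitted_model(x), x0=np.zeros(2),
               method="SLSQP", constraints=[sphere])
print(res.x, fitted_model(res.x))
```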

According to [27], the GRG method is one of the most robust and efficient methods of constrained nonlinear optimization. The expression "reduced gradient" comes from the substitution of the constraints into the objective function, which decreases the number of variables and consequently reduces the number of gradients involved. Partitioning the original variables into basic (Z, dependent) and non-basic (Y, independent) variables, one can write \(F\left( X \right) = F\left( {Z,Y} \right)\) and \(h\left( X \right) = h\left( {Z,Y} \right)\). To satisfy the optimality condition it is required that \({\text{d}}h_{j} \left( X \right) = 0\). Setting \(A = \nabla_{z} h_{j} \left( X \right)^{ \, T} {\text{ and }}B = \nabla_{Y} h_{j} \left( X \right)^{ \, T}\), then \(dY = - B^{ - 1} A\,dZ\). Consequently, the reduced gradient can be defined as:

$$G_{R} = \frac{d}{dZ}F\left( X \right) = \nabla_{z} F\left( X \right) - \left[ {B^{ - 1} A} \right]^{ \, T} \nabla_{Y} F\left( X \right)^{T}$$
(13)

The search direction is \(S_{X} = \left[ {\begin{array}{*{20}c} { - G_{R} } & {dY} \\ \end{array} } \right]^{ \, T}\). For the iterations one can use \(X^{k + 1} = X^{k} + \alpha S^{k + 1}\), verifying at each step whether \(X^{k + 1}\) is feasible and \(h\left( {X^{k + 1} } \right) = 0\). The final step consists of solving F(X) as a function of α, using a one-dimensional search algorithm such as Newton's method.

5 Experimental procedure and data analysis

In order to meet the objective of this work, a power source operating in pulsed current mode was used, chosen to provide more flexibility in adjusting the parameters. A mechanical tractor attached to the equipment was used to move the torch at the adjustable welding speed. All welding tests were performed with the bead-on-plate (BOP) technique using an AWS ER70S-6 wire of 1.2 mm diameter and ABNT 1045 base material plates of 120 × 40 × 6 mm. The shielding gas was a mixture of argon + 25 % CO2 at a constant flow of 15 l/min. The welding speed was kept constant at 40 cm/min in all tests and the standoff distance was 22.5 mm. The parameters used in the experiments and their levels are shown in Table 2. These parameters were arranged according to the RSM methodology and used in a central composite design.

Table 2 Process parameters

In order to set the duty cycle (Ca), the peak time (tp) was kept fixed at 4 ms and the background time (tb) was varied according to the desired level, following Eq. (14):

$${\text{Ca}}\;{ = }\;\frac{\text{tp}}{{{\text{tp}}\;{ + }\;{\text{tb}}}}$$
(14)

After welding, all test specimens were cross-sectioned, polished and chemically etched. The geometric characteristics penetration (p), reinforcement (r), width (w) and overall area (A) of the weld bead were then determined for each test specimen. The convexity of the weld bead (CI) was also determined, as the ratio between the reinforcement and the width (w). Using an adequate central composite design (CCD) to collect the data on the five responses, the information necessary to build the second-order models was obtained, as can be seen in Table 3.

Table 3 Experimental results obtained using central composite design

Experimentally, two least squares-based models can be obtained: with coded or uncoded parameters. According to [14, 25], the coded-unit approach is better than the uncoded one because it provides a means to eliminate spurious statistical results due to the different measurement scales of the factors. In addition, uncoded units often lead to collinearity among the terms in the model, which inflates the variability of the coefficient estimates and makes them difficult to interpret. For these reasons, the model based on coded units was employed.

To apply the DEA method to the optimization of the multiresponse set, the data must be standardized. In this case there are only maximization and minimization responses, so the standardization can be done using Eq. (7). The results are shown in Table 4.

Table 4 Standardized responses

Moreover, the practitioner must indicate the respective input and output response variables. As mentioned by [21], the response variables to be maximized are considered outputs, while the variables to be minimized are treated as inputs. In the specific case of the P-GMAW process, penetration (P), total penetrated area (A) and bead width (W) are larger-the-better responses, while reinforcement (R) and the convexity index (CI) assume the smaller-the-better form. To determine the efficiency of each individual DMU, Eq. (6) must be used. The efficiency data and the respective coefficients of the input and output responses are described in Table 5.

Table 5 Efficiency of individuals DMU’s

Applying Eq. (10) to the original data set, it is possible to calculate the principal component scores. Since only the principal components PC1 and PC2 have eigenvalues larger than 1 (Fig. 1), these two components were chosen.

Fig. 1 Significant eigenvalues

Table 6 shows that the two principal components together represent 85.5 % of the variation in the responses. It can also be noted that, although there is a strong correlation structure among the quality characteristics studied, the first principal component is not enough to represent the original set, so the second principal component must also be selected. This fact was pointed out by [11] as the main shortcoming of the PCA-based optimization methods. To overcome this difficulty, it is appropriate to use Eq. (12), developed in [12]. Applying this equation, PC1 and PC2 can be combined into a single data set called MGI. The results for MGI are shown in Table 8.

Table 6 PCA for the correlation matrix of the responses

The results shown in Table 7 were calculated using the formulation presented in Table 1 and are based on the experiments. They reveal that the problem is perfectly suitable for the use of PCA, rejecting the null hypothesis that the principal components PC1 and PC2 are not representative of the original set of responses. The indexes also reveal a high correlation among the variables.

Table 7 Tests and indexes for determination of non-trivial axes in PCA

The results of Table 5, the first two principal component scores and the results for MGI are condensed in Table 8.

Table 8 Results for DEA, PCA and MGI

Hence, once the efficiency of each DMU (E kk), the principal component scores PC1 and PC2 of the original responses and the MGI were determined, the OLS method can be applied to create the models for each representative objective function, as shown in Table 9.

Table 9 Ordinary least squares coefficients

Analysis of variance (ANOVA) is usually used to test formally for the significance of the main effects and interactions. To refine the model, a common approach consists of removing the non-significant terms from the full model. As a decision rule, if the P value is lower than 0.05 the corresponding term is considered significant to the model; otherwise, if the P value is greater than 0.05, the term is excluded. According to [14], this procedure is convenient for obtaining a simplified model, but it can decrease the coefficient of determination R 2 and increase the error term S. Moreover, the exclusion of any term should follow the hierarchy principle, a model-building principle stating that when a particular polynomial term is included in a model, all lower-order polynomial terms should also be included, even those that are not individually significant. The hierarchy principle promotes internal consistency in the model. Table 9 shows the complete model for each response. Although some non-significant terms were found, their exclusion from the complete model increased the error S and reduced R 2 (adj). The complete second-order models were therefore retained.

Another important aspect of the statistical model-building process is the amount of variation in the dependent variables y explained by the predictors x. Table 9 shows the adjusted R-squared, a larger-is-better statistic: a larger adjusted R 2 indicates a higher degree of explanation of the response of interest. The adjusted R 2 [R-Sq(adj)] accounts for the fact that R 2 tends to overestimate the actual amount of variation explained in the sample analysis; that is, if one applies the regression equation derived from one sample to another independent sample, the R 2 obtained in the new sample will almost always be smaller than in the original. It can be noted that the models found in the present work are adequate, since all of them exhibit a large adjusted R 2.

Another necessary decision when associating PCA with RSM concerns the direction of optimization that the objective functions written in terms of PC1 and MGI must follow. This is arguably the main reason why PCA is more widely used with Taguchi designs than with response surface designs. The analysis of Taguchi designs employs the concept of the loss function: each kind of optimization (maximization, minimization or normalization) is represented by a proper signal-to-noise ratio which, due to its mathematical nature, must always be maximized. In RSM, the approach is totally different. To surpass this barrier, we propose in this article to determine the direction of optimization by analyzing the correlation of PC1, PC2 and MGI with each original response. If there is a positive correlation between a principal component score and a specific original response, they share the same direction of optimization; if the correlation between the two variables is negative, the maximization of one variable implies the minimization of the other, and vice versa.

Therefore, observing the eigenvectors and the factor analysis shown in Table 6, it is possible to note that there is a strong positive correlation between PC1 and the responses P, W and A, a strong negative correlation between PC1 and CI, and a strong negative correlation between PC2 and R; inspection of the factor loadings of the first and second principal components in Table 6 confirms this pattern. For process improvement, penetration, width and area must be maximized, while reinforcement and convexity index must be minimized. Analyzing the correlations between the principal components and the responses, one finds that maximizing PC1 maximizes P, W and A while minimizing CI, and that maximizing PC2 minimizes R. In this way, two strategies can be proposed, as illustrated by the sketch below: (a) to maximize PC1, according to Eq. (11), and (b) to maximize MGI, the index calculated as the sum of PC1 and PC2 weighted by their respective eigenvalues, according to Eq. (12). In both cases, using the GRG method, a spherical constraint is imposed on the factor levels, i.e., the values that optimize the responses of interest must belong to the experimental interval \(- r \le x_{i} \le + r\). In this case, where a central composite design with four factors was used, the natural choice for the radius is 2.
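As a minimal illustration of this correlation-sign rule, with a hypothetical score vector and response column standing in for the values of Tables 3 and 8; the helper name is illustrative.

```python
import numpy as np

def optimization_direction(score, response):
    """Return 'same' if the PC score and the response are positively
    correlated (optimize in the same direction), 'opposite' otherwise."""
    r = np.corrcoef(score, response)[0, 1]
    return r, ("same" if r > 0 else "opposite")

# Hypothetical PC1 scores and one original response column (e.g. a penetration-like response)
pc1 = np.array([1.2, -0.8, 0.4, 0.9, -1.5])
P = np.array([3.4, 2.1, 2.9, 3.1, 1.8])
print(optimization_direction(pc1, P))   # positive r -> maximizing PC1 also maximizes P
```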

Table 10 exhibits the results obtained with the DEA and multivariate multiresponse optimization approaches.

Table 10 Comparative results among DEA and PCA methods

The optimal response values obtained with DEA, PC1 and MGI were analyzed with a Tukey test, which shows that the differences among the results obtained with the three methods are statistically significant at the 1 % level.

Comparing the PC1 and MGI methods, one finds that the complete PC1 model violates the reinforcement (R) constraint. This is expected, since this component is not able to represent all the responses: the reinforcement response is well represented by PC2 and has no significant correlation with PC1. At worst, the individual optimization of PC2 would only optimize the reinforcement response, disregarding all the other responses.

When the MGI was applied, taking the respective eigenvalues as weights according to Eq. (12), a better solution that satisfies all the constraints was found. Comparing the two methods, the MGI is better and closer to reality, because it considers all the responses involved in the problem.

The correlation among the data variables is the core of PCA. In DEA models, however, the influence of this kind of dependency is irrelevant. As suggested by [10], if output variables are highly correlated, only one of them needs to be kept in the model, since it serves adequately as a proxy for the others. While in MRSM the principal component scores are used instead of the original observations to build the regression, DEA does not break down even when multicollinearity exists in the input or output data. Another difference between the two approaches, according to [28], is that DEA does not reduce the dimensionality of the data as PCA does. Comparing the three methods employed, the optimization using DEA proved to be the best: the optimal values of the larger-the-better responses penetration (P), area (A) and bead width (W) were larger with the DEA method than with the others, and the optimal values of the smaller-the-better responses reinforcement (R) and convexity index (CI) were the smallest. In order to compare the adequacy of the models, the global percentage error (GPE) was calculated for the results of each method, using Eq. (15):

$${\text{GPE}} = \sum\limits_{i = 1}^{m} {\left| {\frac{{y_{i}^{*} }}{{T_{i} }} - 1} \right|}$$
(15)

where \(y_{i}^{*}\) is the value of the optimal response, \(T_{i}\) the defined target and \(m\) the number of objectives.
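A one-function sketch of Eq. (15); the helper name and the optima and targets below are hypothetical, not the values of Table 11.

```python
import numpy as np

def global_percentage_error(y_opt, targets):
    """Eq. (15): sum of the absolute relative deviations of the optimal
    responses from their targets."""
    y_opt, targets = np.asarray(y_opt, float), np.asarray(targets, float)
    return np.abs(y_opt / targets - 1.0).sum()

# Hypothetical optima and targets for the five bead characteristics
print(global_percentage_error([3.1, 2.4, 9.5, 38.0, 0.26],
                              [3.3, 2.2, 10.0, 40.0, 0.25]))
```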

The GPE, as its name indicates, is an error index: it evaluates the distance between the optimal responses obtained and their ideal values. The GPE values are shown in Table 11. They show that the DEA method performed best in the analyzed case, since it yielded the smallest GPE.

Table 11 GPE for DEA, PC1 and MGI

Although DEA performs better than MGI, the results are very close to each other, as can be seen in Fig. 2.

Fig. 2 Contour plot for MGI showing the results of the comparison between the two methods

6 Conclusions

This research showed that a multiresponse manufacturing process with a moderate to high correlation structure is well represented by both the DEA and the multivariate approach. The results showed that DEA is robust to the multicollinearity generated by the input and output responses, and that it does not require reducing the dimensionality of the data. A quadratic function built with the efficiencies allowed the application of a nonlinear optimization algorithm such as GRG. With the multivariate approach, on the other hand, only two principal components were needed to represent 85.5 % of the total variation in the original data set.

As the tests indicated, the correlation among the responses was significant enough to support the multivariate approach, and the correlation between the responses and the principal components was also favorable to the optimization. The MGI function was found to be efficient in this case, since the two significant principal components are combined, and weighting the principal components in the MGI by their eigenvalues was also considered satisfactory. Comparing the DEA and PCA methods, it is possible to conclude that the results are almost the same, indicating that in this case the two approaches play a quite similar role in correlated multiresponse optimization, provided the constraints are not violated.

However, even with such close results for both methods, DEA was considered better because its results are larger for the parameters we wanted to maximize and smaller for the parameters we wanted to minimize.