
1 Introduction

Regression analysis is used to understand the statistical dependence of one variable on other variables. Linear regression is the oldest and most widely used predictive model in decision making in the managerial and environmental sciences, and in any area where possible relationships between two or more variables must be described. The technique shows what proportion of the variance in the dependent variable is explained by the independent variables. The earliest form of regression was the method of least squares, published by Legendre [1] and by Gauss [2]. Linear regression can be classified into two types: simple linear regression, which describes the relationship between two variables, and multiple linear regression (MLR), which describes the relationship between several independent variables and a single dependent variable. A number of methods for estimating the regression parameters are available in the literature, including minimizing the sum of absolute residuals, minimizing the maximum absolute residual, and minimizing the sum of squared residuals [3]; the last of these, popularly known as the least square method, is the most commonly used. Alp et al. [4] showed that linear goal programming (GP) can be proposed as an alternative to the least square method, illustrating this with a vertical network adjustment example. Hassonpour et al. [5] proposed a linear programming model based on GP to calculate regression coefficients.

An interaction occurs when the magnitude of the effect of one independent variable on a dependent variable varies as a function of a second independent variable. This is also known as a moderation effect, although some authors apply stricter criteria to moderation effects than to interactions. Interaction effects in regression models are now a widely studied area of investigation, as there has been a great deal of confusion about the analysis of moderated relationships involving continuous variables. Aiken and West [6] analyzed such interaction effects, and their method has since been applied to several models; for example, Curran et al. [7] applied it to hierarchical linear growth models.

Multiple objective optimization techniques provide more realistic solutions for most problems because they deal with several objectives simultaneously, whereas single objective optimization techniques address only one objective. GP is a multiple objective optimization technique that converts a multi-objective optimization model into a single objective optimization model, and it has proven a valuable tool in support of decision making. The first publication using GP in the form of a constrained regression model was by Charnes et al. [8]. Many books have been devoted to this topic over the years (Ijiri [9]; Lee [10]; Spronk [11]; Ignizio [12]), and the tool often represents a substantial improvement in the modeling and analysis of multi-objective problems (Charnes and Cooper [13]; Eiselt et al. [14]; Ignizio [15]). By minimizing deviations, the GP model can generate decision variable values that coincide with the beta values in some types of multiple regression models. Tamiz et al. [16] present a review of the literature on the branch of multi-criteria decision modeling known as GP. Machiel Kruger [17] proposed a GP approach to managing a bank’s balance sheet efficiently, maximizing returns while taking into account conflicting goals such as minimizing risk, subject to regulatory and managerial constraints. Gupta et al. [18] solved a multi-objective investment management planning problem using a min-sum weighted fuzzy goal programming technique.

Application of a multi-objective programming model such as the GP model is an important tool for studying various aspects of management systems (Sen and Nandi [19]). As an extension to the findings of Sharma et al. [20], this paper focuses on a comparative study of the results obtained through different software packages, namely LINDO and TORA.

2 Regression and Goal Programming Formulation

The regression equation used to analyze and interpret a two-way interaction is:

$$\begin{aligned} y_{ir} =b_0 +b_1 X_i +b_2 Z_i +b_3 X_i ^{2}+b_4 Z_i ^{2}+b_5 X_i Z_i +e_i ,\quad i=1,\,2,\,\ldots ,\,m. \end{aligned}$$

where \(b_0\), \(b_1\), \(b_2\), \(b_3\), \(b_4\), and \(b_5\) are the parameters to be estimated, and \(e_i\) is the error component, assumed to be normally and independently distributed with zero mean and constant variance. The least absolute residual method estimates these unknown parameters so as to minimize \( \sum \nolimits _{{i = 1}}^{m} {\left| {y_{{io}} - y_{{ir}} } \right| } \), where \(y_{io}\) denotes the \(i\)th observed response.
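For concreteness, the least-square fit of this interaction model can be computed numerically. The sketch below is our own illustration (not from the paper), using NumPy and the sample data of Sect. 3.1; it builds the design matrix for the model above and solves for the coefficients:

```python
import numpy as np

# Sample data from Sect. 3.1 (y regressed on x and z).
y = np.array([7.88, 7.43, 8.38, 7.42, 7.97, 7.49, 8.84, 8.29])
x = np.array([3, 2, 4, 2, 3, 2, 5, 4], dtype=float)
z = np.array([2, 1, 3, 1, 2, 2, 3, 2], dtype=float)

# Design matrix for y = b0 + b1*x + b2*z + b3*x^2 + b4*z^2 + b5*x*z.
X = np.column_stack([np.ones_like(x), x, z, x**2, z**2, x * z])

# Least-square estimates minimize the sum of squared residuals.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ b
print("coefficients:", b)
print("sum of absolute residuals:", np.sum(np.abs(residuals)))
```

Note that the least-square criterion is the sum of squared residuals; the sum of absolute residuals of the resulting fit is printed only for comparison with the GP criterion used below.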

Let \(y_{iG}\) be the \(i\)th goal (here, the observed response \(y_{io}\)), \(d_i^+ \) be the positive deviation from the \(i\)th goal, and \(d_i^- \) be the negative deviation from the \(i\)th goal. Then, the problem of minimizing \( \sum \nolimits _{{i = 1}}^{m} {\left| {y_{{io}} - y_{{ir}} } \right| } \) may be reformulated as

Minimize \(\mathop \sum \limits _{{i=1}}^{m} \left( {d_i^+ +d_i^- } \right) \)

Subject to:

$$\begin{aligned} a_0 +a_1 X_{i1} +a_2 X_{i2} +a_3 X_{i3} +a_4 X_{i4} +a_5 X_{i5} +d_i^+ -d_i^- =y_{iG} ,\quad i=1,\,2,\,\ldots ,\,m, \end{aligned}$$
$$\begin{aligned} d_i^+ \ge 0,\quad d_i^- \ge 0,\quad i=1,\,2,\,\ldots ,\,m, \end{aligned}$$

and \(a_0 ,a_1 ,a_2 ,a_3 ,a_4 ,a_5 \) are unrestricted in sign, where \(X_i ,Z_i ,X_i ^{2},Z_i ^{2}\), and \(X_i Z_i \) are relabeled as \(X_{i1} ,X_{i2} ,X_{i3} ,X_{i4}\), and \(X_{i5} \), respectively, to formulate the multiple nonlinear regression problem as a linear GP model.
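The reformulation above is an ordinary linear program, so it can be handed to any LP solver. The sketch below is our own illustration (the helper name `lad_fit` and the use of SciPy are assumptions, not part of the paper); it encodes the objective \(\sum (d_i^+ + d_i^-)\), the goal constraints, and the sign restrictions directly:

```python
import numpy as np
from scipy.optimize import linprog

def lad_fit(X, y):
    """Minimize sum(d+ + d-) subject to X a + d+ - d- = y, d+, d- >= 0, a free."""
    m, p = X.shape
    # Decision vector: [a (p entries, free), d+ (m entries, >=0), d- (m entries, >=0)].
    c = np.concatenate([np.zeros(p), np.ones(2 * m)])
    A_eq = np.hstack([X, np.eye(m), -np.eye(m)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * m)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p], res.fun  # coefficients and minimized total deviation
```

At an optimum, at most one of each pair \(d_i^+, d_i^-\) is nonzero, so the minimized objective equals the sum of absolute residuals.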

3 Mathematical Modeling and Solution

3.1 Mathematical Modeling

The relationship between the two methods can be established with a simple example. We consider a regression of Y on X and Z. The data for illustration are:

y      x    z
7.88   3    2
7.43   2    1
8.38   4    3
7.42   2    1
7.97   3    2
7.49   2    2
8.84   5    3
8.29   4    2

Reformulating the above problem as a linear GP model:

Minimize \(\mathop \sum \limits _{{i=1}}^{8} \left( {d_i^+ +d_i^- } \right) \)

Subject to:

$$\begin{aligned} a_0 +3a_1 +2a_2 +9a_3 +4a_4 +6a_5 +d_1^+ -d_1^-&=7.88\\ a_0 +2a_1 +a_2 +4a_3 +a_4 +2a_5 +d_2^+ -d_2^-&=7.43\\ a_0 +4a_1 +3a_2 +16a_3 +9a_4 +12a_5 +d_3^+ -d_3^-&=8.38\\ a_0 +2a_1 +a_2 +4a_3 +a_4 +2a_5 +d_4^+ -d_4^-&=7.42\\ a_0 +3a_1 +2a_2 +9a_3 +4a_4 +6a_5 +d_5^+ -d_5^-&=7.97\\ a_0 +2a_1 +2a_2 +4a_3 +4a_4 +4a_5 +d_6^+ -d_6^-&=7.49\\ a_0 +5a_1 +3a_2 +25a_3 +9a_4 +15a_5 +d_7^+ -d_7^-&=8.84\\ a_0 +4a_1 +2a_2 +16a_3 +4a_4 +8a_5 +d_8^+ -d_8^-&=8.29 \end{aligned}$$
$$\begin{aligned} d_i^+ \ge 0,\quad i=1,\,2,\,\ldots ,\,8 \end{aligned}$$
$$\begin{aligned} d_i^- \ge 0,\quad i=1,\,2,\,\ldots ,\,8 \end{aligned}$$

\(a_i \) are unrestricted, \(i\,=\,0,\,1,\,2,\,\ldots ,\,5.\)
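The paper solves this instance with LINDO and TORA. As an independent check (our own sketch, not one of the paper's solvers), the same linear program can be passed to SciPy's `linprog`:

```python
import numpy as np
from scipy.optimize import linprog

# Data of the illustrative example.
y = np.array([7.88, 7.43, 8.38, 7.42, 7.97, 7.49, 8.84, 8.29])
x = np.array([3, 2, 4, 2, 3, 2, 5, 4], dtype=float)
z = np.array([2, 1, 3, 1, 2, 2, 3, 2], dtype=float)
X = np.column_stack([np.ones_like(x), x, z, x**2, z**2, x * z])  # 8 goals, 6 coefficients

m, p = X.shape
c = np.concatenate([np.zeros(p), np.ones(2 * m)])    # minimize sum(d+ + d-)
A_eq = np.hstack([X, np.eye(m), -np.eye(m)])         # X a + d+ - d- = y
bounds = [(None, None)] * p + [(0, None)] * (2 * m)  # a unrestricted, deviations >= 0
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
a = res.x[:p]
print("coefficients:", a)
print("total deviation:", res.fun)
```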

3.2 Solution

The values of the coefficients in the above problem obtained through the different methods are tabulated in Table 1.

Table 1 The values of coefficients using different methods

The final results are tabulated in Table 2.

4 Discussion

It is clear from Table 1 that all software packages give identical results for the linear GP formulation.

Table 2 Values of y for paired values of x and z

It is observed from the tabulated results of Table 2 and Fig. 1 that \(\hbox {Minimize}\) \(\mathop \sum \nolimits _{i=1}^m \left| {y_{io} -y_{iG} } \right| <\hbox {Minimize}\mathop \sum \nolimits _{i=1}^m \left| {y_{io} -y_{ir} } \right| \), where \(y_{iG} \) is the estimate of the \(i\)th response using the GP technique, and \(y_{ir} \) is the estimate using the least square method. Hence, it is concluded that the GP technique provides a better estimate of the multiple nonlinear regression parameters with a two-way interaction effect than the least square method.

Fig. 1 Comparative results of residuals through different algorithms

It is clear from Fig. 2 that the data are better fitted when the coefficient values are obtained from the solution of the GP formulation than when they are obtained using the least square method.

Fig. 2 Comparative results of y through different algorithms

5 Conclusion

  1. The software packages TORA and LINDO give identical results for a linear GP problem.

  2. The GP formulation gives a better-fitted model than the traditional least square method.

  3. The error is minimized when the regression model is solved using the GP formulation.