1 Introduction

Correlation analysis and regression analysis are two of the most widely applied statistical tools across disciplines, owing to their applicability and interpretability. In classical statistical theory, both assume that the observations are precise measurements following certain probability distributions. In practice, however, observations may be described by linguistic terms such as “bad”, “good” and “very good”, or may be known only approximately, such as “around 2”. Since the introduction of fuzzy set theory, modeling this kind of uncertainty, which differs from the uncertainty captured by the probabilistic framework, with possibility distributions has been an active research area for data analysts.

Extending both methods to the fuzzy framework has given rise to several proposed methods that utilize different aspects of fuzzy set theory.

2 Fuzzy Correlation Analysis

The correlation coefficient is a statistical measure of both the direction and the strength of the linear relation between two variables. It is defined by

$$r_{XY} = \frac{{\mathop \sum \nolimits_{i = 1}^{n} (x_{i} - \bar{x})(y_{i} - \bar{y})}}{{\sqrt {\mathop \sum \nolimits_{i = 1}^{n} (x_{i} - \bar{x})^{2} \mathop \sum \nolimits_{i = 1}^{n} (y_{i} - \bar{y})^{2} } }}$$
(1)

where \(X\) and \(Y\) are variables whose observed values are denoted by \(\left( {x_{i} ,y_{i} } \right), i = 1,2, \ldots ,n\), and \(\bar{x}\) and \(\bar{y}\) are the corresponding arithmetic means. The coefficient lies in the closed interval [−1, 1]: its sign gives the direction of the linear dependence between the variables and its absolute value gives the strength.
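As a point of reference for the fuzzy versions that follow, the crisp coefficient in (1) can be sketched in a few lines of Python. The function name `pearson_r` and the sample data are illustrative choices, not taken from the chapter.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient of two crisp samples, as in Eq. (1)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with n >= 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of cross-products of the deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the sums of squared deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfectly increasing linear relation.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```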

In the fuzzy setting, observations are either qualitative, such as the linguistic terms “bad”, “good” or “excellent”, or approximately known, for instance a quantity described as “around 2”, and the relation between such variables still needs to be quantified. Both types of data are encountered when subjective or linguistic evaluations are provided by experts in engineering, management or the social sciences [13]. For example, a fuzzy correlation measure is needed in management science to quantify the relation between the technology level and the management achievements of firms, and in engineering when images are partitioned to determine similarity or dissimilarity. Such examples extend readily to other disciplines. Measuring the correlation coefficient between two variables involving fuzziness is therefore necessary, and the computational procedures are more challenging than the one given in (1).

Fuzzy correlation is computed essentially by two different approaches. The first relies on Zadeh’s extension principle, which aims at finding the membership function of the fuzzy correlation. Several methods are available for determining this membership function, providing both analytical and numerical solutions, for example using the weakest t-norm and non-linear programming. Before the methods used to compute fuzzy correlation and fuzzy non-linear regression are explained in detail, some preliminary notions and definitions are needed: fuzzy numbers, \(LR\) type fuzzy numbers, α-cuts of a fuzzy set, the triangular norm (t-norm), Zadeh’s extension principle, and fuzzy arithmetic. A more detailed treatment of these subjects can be found in a variety of books on fuzzy set theory or fuzzy logic [4].

A fuzzy number is a convex fuzzy subset of the real line R with a normalized membership function. For example, an asymmetric triangular fuzzy number \(\tilde{x} = (x,\alpha ,\beta )\) is defined by

$$\tilde{x}\left( t \right) = \left\{ {\begin{array}{*{20}c} {1 - \frac{x - t}{\alpha },\;\;if\,x - \alpha \le t \le x} \\ {1 - \frac{t - x}{\beta }, \;\;if\,x \le t \le x + \beta } \\ {0,\quad \;\,otherwise} \\ \end{array} } \right.$$
(2)

where the center value \(x \in R,\) the left spread \(\alpha > 0,\) and the right spread \(\beta > 0\) determine the fuzzy number. When \(\alpha = \beta ,\) the triangular fuzzy number is called symmetric and is denoted by \(\tilde{x} = \left( {x,\alpha } \right).\) Other types of fuzzy numbers, such as trapezoidal and Gaussian fuzzy numbers, are also defined and utilized in various applications, depending on their suitability, interpretability, and applicability.
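The membership function in (2) can be evaluated directly. The following is a minimal sketch; the parameter names `center`, `left` and `right` stand for \(x\), \(\alpha\) and \(\beta\) and are chosen here only for readability.

```python
def triangular_membership(t, center, left, right):
    """Membership degree of t in the triangular fuzzy number (center, left, right), Eq. (2)."""
    if left <= 0 or right <= 0:
        raise ValueError("spreads must be positive")
    if center - left <= t <= center:
        # Rising edge: grows linearly from 0 at center - left to 1 at center.
        return 1 - (center - t) / left
    if center < t <= center + right:
        # Falling edge: decreases linearly from 1 at center to 0 at center + right.
        return 1 - (t - center) / right
    return 0.0

# Hypothetical fuzzy number "around 2" with left spread 0.5 and right spread 1.0.
for t in (1.5, 1.75, 2.0, 2.5, 3.0):
    print(t, triangular_membership(t, 2.0, 0.5, 1.0))
```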

A fuzzy number \(\tilde{x} = (x,\alpha ,\beta )_{LR}\) of type \(LR\) is a function from the real numbers into the interval [0, 1] defined by

$$\tilde{x}\left( t \right) = \left\{ {\begin{array}{*{20}c} {L\left( {\frac{x - t}{\alpha }} \right)\;\quad for\;\;x - \alpha \le t \le x} \\ {R\left( {\frac{t - x}{\beta }} \right)\;\quad for\;\;x \le t \le x + \beta } \\ \end{array} } \right.$$
(3)

where \(L\) and \(R\) are non-increasing and continuous shape functions from [0,1] to [0,1] satisfying \(L\left( 0 \right) = R\left( 0 \right) = 1\) and \(L\left( 1 \right) = R\left( 1 \right) = 0.\)

An α-cut of a fuzzy set is a crisp set defined by

$$A_{\alpha } = \left\{ {x \in A \mid \mu_{A} (x) \ge \alpha } \right\}$$
(4)
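For a triangular fuzzy number the α-cut in (4) is a closed interval whose end points follow from solving (2) for \(t\) at membership level α. The sketch below assumes the triple representation (center, left spread, right spread); the name `alpha_cut_triangular` is illustrative. Note that the symbol α is used in the text both for the left spread and for the cut level, so the code uses separate names.

```python
def alpha_cut_triangular(center, left, right, alpha):
    """End points of the alpha-cut of the triangular fuzzy number (center, left, right)."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must lie in [0, 1]")
    # Solving 1 - (center - t)/left >= alpha and 1 - (t - center)/right >= alpha for t.
    return (center - (1 - alpha) * left, center + (1 - alpha) * right)

# Hypothetical fuzzy number (2, 0.5, 1.0): the cut shrinks to the center as alpha -> 1.
for level in (0.0, 0.5, 1.0):
    print(level, alpha_cut_triangular(2.0, 0.5, 1.0, level))
```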

A binary operation \(T\) on the unit interval is said to be a triangular norm, or t-norm, if and only if \(T\) is associative, commutative, non-decreasing in each argument, and satisfies \(T\left( {x,1} \right) = x\) for each \(x \in [0,1].\)
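Three common t-norms can be written down and checked against this definition on a finite grid. The sketch below uses the standard definition of the weakest (drastic) t-norm, \(T_{w}(x,y) = \min (x,y)\) if \(\max (x,y) = 1\) and 0 otherwise, which the chapter uses later but does not spell out; the function names are illustrative, and a grid check is only a sanity test, not a proof.

```python
def t_min(x, y):
    """Minimum t-norm, the one used in Zadeh's original extension principle."""
    return min(x, y)

def t_product(x, y):
    """Product t-norm."""
    return x * y

def t_weakest(x, y):
    """Weakest (drastic) t-norm T_w: non-zero only if one argument equals 1."""
    return min(x, y) if max(x, y) == 1 else 0.0

def is_t_norm_on_grid(t, steps=10, tol=1e-12):
    """Check boundary condition, commutativity, associativity and monotonicity on a grid."""
    grid = [k / steps for k in range(steps + 1)]
    for x in grid:
        if t(x, 1.0) != x:
            return False
        for y in grid:
            if t(x, y) != t(y, x):
                return False
            for z in grid:
                if abs(t(t(x, y), z) - t(x, t(y, z))) > tol:
                    return False
                if y <= z and t(x, y) > t(x, z) + tol:
                    return False
    return True

for name, t in (("min", t_min), ("product", t_product), ("weakest", t_weakest)):
    print(name, is_t_norm_on_grid(t), t(0.3, 0.9))
```

The printed values at (0.3, 0.9) illustrate the ordering \(T_{w} \le T_{product} \le T_{min}\), which is why \(T_{w}\) is called the weakest t-norm.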

Ordinary arithmetic can be extended to fuzzy numbers by employing Zadeh’s extension principle, defined by

$$\mu_{B} \left( y \right) = \sup_{{\begin{array}{*{20}c} {\left( {x_{1} , \ldots ,x_{n} } \right) \in U_{1} \times \ldots \times U_{n} } \\ {y = f\left( {x_{1} , \ldots ,x_{n} } \right)} \\ \end{array} }} \min \left( {\mu_{{A_{1} }} \left( {x_{1} } \right), \ldots ,\mu_{{A_{n} }} \left( {x_{n} } \right)} \right)$$
(5)

where \(A = A_{1} \times \ldots \times A_{n}\) and \(U = U_{1} \times \ldots \times U_{n}\) are the Cartesian products of the fuzzy sets \(A_{i} ,(i = 1, \ldots ,n)\) and of their universal sets \(U_{i} , (i = 1, \ldots ,n),\) respectively.
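On finite universes the supremum in (5) becomes a maximum, which makes the principle easy to illustrate. The following sketch assumes discrete fuzzy sets stored as dictionaries mapping elements to membership degrees; the function name `extend` and the data are hypothetical.

```python
from itertools import product

def extend(f, *fuzzy_sets):
    """Image of discrete fuzzy sets under f by the sup-min rule of Eq. (5).

    Each fuzzy set is a dict {element: membership degree}.
    """
    result = {}
    for combo in product(*(fs.items() for fs in fuzzy_sets)):
        args = [x for x, _ in combo]
        degree = min(mu for _, mu in combo)      # min over the argument memberships
        y = f(*args)
        result[y] = max(result.get(y, 0.0), degree)  # sup over all ways of reaching y
    return result

# Hypothetical discrete version of "around 2" added to itself.
around_two = {1: 0.5, 2: 1.0, 3: 0.5}
print(extend(lambda x, y: x + y, around_two, around_two))
```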

2.1 Fuzzy Correlation Coefficient Based on the Weakest t-Norm (Tw) and Fuzzy Arithmetic

When Zadeh’s extension principle is rewritten with a general t-norm \(T\) in place of the minimum, the fuzzy arithmetic operators are defined by

$$\left( {\tilde{A} \oplus \tilde{B}} \right)\left( z \right) = \sup_{x + y = z} T\left( {\tilde{A}\left( x \right),\tilde{B}\left( y \right)} \right)$$
(6a)
$$\left( {\tilde{A} \otimes \tilde{B}} \right)\left( z \right) = \sup_{x \cdot y = z} T\left( {\tilde{A}\left( x \right),\tilde{B}\left( y \right)} \right)$$
(6b)
$$\left( {\tilde{A}{ \oslash }\tilde{B}} \right)\left( z \right) = \sup_{x/y = z} T\left( {\tilde{A}\left( x \right),\tilde{B}\left( y \right)} \right)$$
(6c)

where \(\tilde{A}\) and \(\tilde{B}\) are fuzzy numbers and ⊕ , ⊗ , ⊘ are fuzzy arithmetic operators for addition, multiplication and division, respectively.

To compute the fuzzy correlation, the extension principle based on the weakest t-norm, denoted by \(T_{w}\), is applied to the classical definition of the correlation coefficient given in (1) for a sample of \(n\) independent pairs of \(LR\) type fuzzy numbers [5]. \(T_{w}\) based fuzzy arithmetic is preferred to the minimum based arithmetic because it preserves the shape of the resulting \(LR\) type fuzzy numbers: under the minimum based operators, addition and subtraction keep the \(LR\) type, but multiplication and division generally do not.

When the observations are fuzzy, the sample correlation coefficient given in (1) is rewritten as

$$\tilde{r}_{{\widetilde{X,Y}}} = \frac{{\mathop \sum \nolimits_{i = 1}^{n} (\tilde{x}_{i} - \frac{1}{n}^\circ \mathop \sum \nolimits_{i = 1}^{n} \tilde{x}_{i} )(\tilde{y}_{i} - \frac{1}{n}^\circ \mathop \sum \nolimits_{i = 1}^{n} \tilde{y}_{i} )}}{{\sqrt {\mathop \sum \nolimits_{i = 1}^{n} (\tilde{x}_{i} - \frac{1}{n}^\circ \mathop \sum \nolimits_{i = 1}^{n} \tilde{x}_{i} )^{2} \mathop \sum \nolimits_{i = 1}^{n} (\tilde{y}_{i} - \frac{1}{n}^\circ \mathop \sum \nolimits_{i = 1}^{n} \tilde{y}_{i} )^{2} } }}$$
(7)

where \(\tilde{x}_{i} = (x_{i} ,\gamma_{i} )\) and \(\tilde{y}_{i} = (y_{i} ,\delta_{i} ),\) \(i = 1,2, \ldots ,n,\) are symmetric triangular fuzzy numbers and \(\bar{\tilde{x}} = \frac{1}{n}^\circ \sum\nolimits_{i = 1}^{n} {\tilde{x}_{i} }\) and \(\bar{\tilde{y}} = \frac{1}{n}^\circ \sum\nolimits_{i = 1}^{n} {\tilde{y}_{i} }\) are the average values of fuzzy numbers \(\tilde{X}\) and \(\tilde{Y},\) respectively. Then the average values of fuzzy numbers \(\tilde{X}\) and \(\tilde{Y}\) are calculated based on \(T_{w}\) as follows:

$$\bar{\tilde{x}} = ( {\frac{1}{n}\sum\limits_{i = 1}^{n} {x_{i} } ,\max_{1 \le i \le n} \gamma_{i} })$$
(8a)
$$\bar{\tilde{y}} = ( {\frac{1}{n}\sum\limits_{i = 1}^{n} {y_{i} } ,\max_{1 \le i \le n} \delta_{i} })$$
(8b)
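Expressions (8a) and (8b) translate into a very short computation: the center is the ordinary mean of the centers, and the spread is the largest individual spread. The sketch below follows the formulas exactly as stated; the pair representation `(center, spread)` and the sample values are assumptions made for illustration.

```python
def tw_mean(fuzzy_numbers):
    """T_w based mean of symmetric triangular fuzzy numbers, following Eqs. (8a)-(8b).

    fuzzy_numbers is a list of (center, spread) pairs.
    """
    n = len(fuzzy_numbers)
    center = sum(c for c, _ in fuzzy_numbers) / n   # ordinary mean of the centers
    spread = max(s for _, s in fuzzy_numbers)       # largest spread, not the sum
    return center, spread

# Hypothetical sample of three symmetric triangular fuzzy numbers.
print(tw_mean([(1.0, 0.2), (2.0, 0.5), (3.0, 0.3)]))
```

Under minimum based arithmetic the spreads would add up instead, so the spread of a sum would grow with \(n\); taking the maximum keeps the result much less fuzzy.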

Using \(T_{w}\), the deviation of each fuzzy observation from the fuzzy means in (8a) and (8b) is written as follows:

$$(\tilde{x}_{i} - \bar{\tilde{x}}) = (x_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {x_{k} } ,\max_{1 \le k \le n} \gamma_{k} )_{L}$$
(9a)
$$(\tilde{y}_{i} - \bar{\tilde{y}}) = (y_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {y_{k} } ,\max_{1 \le k \le n} \delta_{k} )_{L}$$
(9b)

Then the product of (9a) and (9b) is obtained as follows:

$$\left( {\left( {x_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {x_{k} } } \right)\left( {y_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {y_{k} } } \right),\;\max \left\{ {\left| {x_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {x_{k} } } \right|\max_{1 \le k \le n} \delta_{k} ,\;\left| {y_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {y_{k} } } \right|\max_{1 \le k \le n} \gamma_{k} } \right\}} \right)_{L}$$
(10)

The numerator of (7) is the sum over \(i\) of the products given in (10), computed with \(T_{w}\) based fuzzy arithmetic, and is denoted by

$$\left( {\sum\limits_{i = 1}^{n} {\left( {x_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {x_{k} } } \right)\left( {y_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {y_{k} } } \right)} ,\;\max_{1 \le i \le n} \max \left\{ {\left| {x_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {x_{k} } } \right|\max_{1 \le k \le n} \delta_{k} ,\;\left| {y_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {y_{k} } } \right|\max_{1 \le k \le n} \gamma_{k} } \right\}} \right)_{L}$$
(11)
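The numerator can be sketched in code under two stated assumptions about \(T_{w}\) based arithmetic: the product of \((a,\alpha )_{L}\) and \((b,\beta )_{L}\) has center \(ab\) and spread \(\max (\alpha |b|,\beta |a|)\), and a sum takes the largest spread of its terms. The function name `tw_numerator`, the pair representation and the data are hypothetical.

```python
def tw_numerator(xs, ys):
    """Center and spread of the numerator of Eq. (7) under T_w based arithmetic.

    xs and ys are lists of (center, spread) pairs of symmetric triangular fuzzy numbers.
    """
    n = len(xs)
    x_bar = sum(c for c, _ in xs) / n
    y_bar = sum(c for c, _ in ys) / n
    gamma_max = max(s for _, s in xs)   # spread of every x-deviation, cf. (9a)
    delta_max = max(s for _, s in ys)   # spread of every y-deviation, cf. (9b)
    # Center: ordinary sum of cross-products of the crisp deviations.
    center = sum((x - x_bar) * (y - y_bar) for (x, _), (y, _) in zip(xs, ys))
    # Spread: largest spread among the n products (sum under T_w keeps the maximum).
    spread = max(
        max(abs(x - x_bar) * delta_max, abs(y - y_bar) * gamma_max)
        for (x, _), (y, _) in zip(xs, ys)
    )
    return center, spread

# Hypothetical data, not the data of Table 1.
xs = [(1.0, 0.5), (2.0, 0.5), (3.0, 0.5)]
ys = [(2.0, 0.25), (4.0, 0.25), (6.0, 0.25)]
print(tw_numerator(xs, ys))
```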

The denominator of (7) is computed by following similar steps. The sums of the squared differences between the fuzzy observations and their fuzzy arithmetic mean, computed for each variable with \(T_{w}\) based fuzzy arithmetic, are given in (12) and (13).

$$\left( {\sum\limits_{i = 1}^{n} {(x_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {x_{k} } )^{2} } , \quad \max_{1 \le i \le n} |x_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {x_{k} } |\max_{1 \le k \le n} \gamma_{k} } \right)_{L}$$
(12)
$$\left( {\sum\limits_{i = 1}^{n} {(y_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {y_{k} } )^{2} } ,\quad \max_{1 \le i \le n} |y_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {y_{k} } |\max_{1 \le k \le n} \delta_{k} } \right)_{L}$$
(13)

Taking the square root of the product of (12) and (13) yields (14) and (15):

$$\sqrt {\sum\limits_{i = 1}^{n} {\left( {x_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {x_{k} } } \right)^{2} } \sum\limits_{i = 1}^{n} {\left( {y_{i} - \frac{1}{n}\sum\limits_{k = 1}^{n} {y_{k} } } \right)^{2} } }$$
(14)
$$\frac{{\max \left\{ {\sum\nolimits_{i = 1}^{n} {(x_{i} - \frac{1}{n}\sum\nolimits_{k = 1}^{n} {x_{k} } )^{2} } \max_{1 \le i \le n} |y_{i} - \frac{1}{n}\sum\nolimits_{k = 1}^{n} {y_{k} } |\max_{1 \le k \le n} \delta_{k} ,\;\sum\nolimits_{i = 1}^{n} {(y_{i} - \frac{1}{n}\sum\nolimits_{k = 1}^{n} {y_{k} } )^{2} } \max_{1 \le i \le n} |x_{i} - \frac{1}{n}\sum\nolimits_{k = 1}^{n} {x_{k} } |\max_{1 \le k \le n} \gamma_{k} } \right\}}}{{\sum\nolimits_{i = 1}^{n} {(x_{i} - \frac{1}{n}\sum\nolimits_{k = 1}^{n} {x_{k} } )^{2} } \sum\nolimits_{i = 1}^{n} {(y_{i} - \frac{1}{n}\sum\nolimits_{k = 1}^{n} {y_{k} } )^{2} } }}$$
(15)

where the expressions in (14) and (15) are the center and the spread of the fuzzy number in the denominator of (7), respectively.

Hence, both the numerator and the denominator are obtained. The last step is to divide these two fuzzy numbers, which is done by applying the expression given in (6c). The result is denoted by

$$\left( {\tilde{A}\,{ \oslash }\,\tilde{B}} \right)\left( z \right) = \left\{ {\begin{array}{*{20}l} {L\left[ {\frac{{\frac{a}{b} - z}}{{\frac{1}{b}\max \left( {\alpha ,z\beta } \right)}}} \right],} & {\min \left\{ {\frac{a - \alpha }{b},\frac{a}{b + \beta }} \right\} \le z \le \frac{a}{b}} \\ {R\left[ {\frac{{z - \frac{a}{b}}}{{\frac{1}{b}\max \left( {\alpha ,z\beta } \right)}}} \right],} & {\frac{a}{b} \le z \le \max \left\{ {\frac{a + \alpha }{b},\frac{a}{b - \beta }} \right\}} \\ \end{array} } \right.$$
(16)

where \(a,b > 0,\) \(L = R\) is assumed, and \(\tilde{A} = (a,\alpha )_{LL}\) and \(\tilde{B} = (b,\beta )_{LL}\) are fuzzy numbers. The remaining cases, in which the two fuzzy numbers have different signs, are defined analogously and are given in [5]. It should be noted that the expression given in (16) holds for \(LL\) type fuzzy numbers.
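The quotient in (16) can be evaluated pointwise once a shape function is fixed. The sketch below assumes the triangular shape \(L(u) = R(u) = \max (0, 1 - u)\), so that membership is clipped to zero wherever the argument of the shape function exceeds one; this clipping and the function name `tw_divide_membership` are assumptions, not part of the source.

```python
def tw_divide_membership(z, a, alpha, b, beta):
    """Membership of z in (a, alpha)_LL divided by (b, beta)_LL, after Eq. (16).

    Assumes a, b > 0, positive spreads, and L(u) = R(u) = max(0, 1 - u).
    """
    if a <= 0 or b <= 0 or alpha <= 0 or beta <= 0:
        raise ValueError("this sketch covers only a, b > 0 with positive spreads")
    scale = max(alpha, z * beta) / b      # (1/b) * max(alpha, z * beta)
    u = abs(a / b - z) / scale            # distance from the peak a/b, in units of scale
    return max(0.0, 1.0 - u)

# Hypothetical numerator (3.58, 1.27) and denominator (3.72, 2.71).
peak = 3.58 / 3.72
for z in (0.0, 0.8, peak, 1.2, 5.0):
    print(round(z, 3), round(tw_divide_membership(z, 3.58, 1.27, 3.72, 2.71), 3))
```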

A small data set presented in Table 1 will be used in order to exemplify calculations.

Table 1 Data set for both fuzzy numbers written in the form of symmetric triangular fuzzy numbers

The fuzzy arithmetic means of the two variables are obtained as \(\bar{\tilde{X}} = (3.26,0.5)\) and \(\bar{\tilde{Y}} = (3.46,0.6).\) Using expression (11) then gives (3.582, 1.27), which is the numerator of (7). The denominator, calculated using (14) and (15), is (3.72, 2.71). Dividing the former by the latter gives the membership function of the correlation coefficient:

$$\tilde{r}_{{\widetilde{XY}}} = \frac{(3.58, 1.27)}{(3.72, 2.71)} = \left\{ {\begin{array}{*{20}c} {1 - \frac{0.96 - z}{{{ \hbox{max} }(0.341,0.728z)}}\quad if\,0.341 \le z \le 0.96 } \\ {1 - \frac{z - 0.96}{{{ \hbox{max} }(0.341,0.728z)}}\quad if\,0.96 \le z \le 1.304} \\ \end{array} } \right.$$

2.2 Fuzzy Correlation Based on Zadeh’s Extension Principle

Another approach to computing the fuzzy correlation coefficient, proposed in [6], uses the α-cuts of the fuzzy numbers to derive the membership function. This method relies on the application of the extension principle and aims at finding the α-cuts of \(\tilde{r}_{{\widetilde{X,Y}}} .\) The α-cuts of \(\tilde{X}_{i}\) and \(\tilde{Y}_{i}\) are denoted by

$$(X_{i} )_{\alpha } = \left[ {(X_{i} )_{\alpha }^{L} ,(X_{i} )_{\alpha }^{U} } \right] = \left[ {\min_{x_{i} } \left\{ {x_{i} \mid \mu_{{\tilde{x}_{i} }} (x_{i} ) \ge \alpha } \right\},\;\max_{x_{i} } \left\{ {x_{i} \mid \mu_{{\tilde{x}_{i} }} (x_{i} ) \ge \alpha } \right\}} \right]$$
(17a)
$$(Y_{i} )_{\alpha } = \left[ {(Y_{i} )_{\alpha }^{L} ,(Y_{i} )_{\alpha }^{U} } \right] = \left[ {\min_{y_{i} } \left\{ {y_{i} \mid \mu_{{\tilde{y}_{i} }} (y_{i} ) \ge \alpha } \right\},\;\max_{y_{i} } \left\{ {y_{i} \mid \mu_{{\tilde{y}_{i} }} (y_{i} ) \ge \alpha } \right\}} \right]$$
(17b)

Also, the interval form containing the values of both variables is denoted by

$$[\left( {X_{i} )_{\alpha } ,(Y_{i} )_{\alpha } } \right] = [\min_{x} \left\{ {x_{i} |\mu_{{\tilde{x}_{i} }} \left( {x_{i} } \right) \ge \alpha } \right\},\max_{x} \left\{ {x_{i} |\mu_{{\tilde{x}_{i} }} \left( {x_{i} } \right) \ge \alpha } \right\}]$$
(18)

where the α-cuts of \(\tilde{X}_{i}\) and \(\tilde{Y}_{i}\) are both crisp sets.

Then, as described in [6], a pair of non-linear mathematical programs is introduced in order to find the lower and upper bounds of the α-cuts of \(\tilde{r}_{{\widetilde{X,Y}}} .\) They are given as follows:

$$\begin{aligned} (r_{XY} )_{\alpha }^{L} & = \min \frac{{\sum\nolimits_{i = 1}^{n} {\left( {x_{i} - \bar{x}} \right)\left( {y_{i} - \bar{y}} \right)} }}{{\sqrt {\sum\nolimits_{i = 1}^{n} {(x_{i} - \bar{x})^{2} } \sum\nolimits_{i = 1}^{n} {(y_{i} - \bar{y})^{2} } } }} \\ s.t.\quad & (X_{i} )_{\alpha }^{L} \le x_{i} \le (X_{i} )_{\alpha }^{U} ,\quad \forall i \\ & (Y_{i} )_{\alpha }^{L} \le y_{i} \le (Y_{i} )_{\alpha }^{U} ,\quad \forall i \\ \end{aligned}$$
(19a)
$$\begin{aligned} (r_{XY} )_{\alpha }^{U} & = \max \frac{{\sum\nolimits_{i = 1}^{n} {\left( {x_{i} - \bar{x}} \right)\left( {y_{i} - \bar{y}} \right)} }}{{\sqrt {\sum\nolimits_{i = 1}^{n} {(x_{i} - \bar{x})^{2} } \sum\nolimits_{i = 1}^{n} {(y_{i} - \bar{y})^{2} } } }} \\ s.t.\quad & (X_{i} )_{\alpha }^{L} \le x_{i} \le (X_{i} )_{\alpha }^{U} ,\quad \forall i \\ & (Y_{i} )_{\alpha }^{L} \le y_{i} \le (Y_{i} )_{\alpha }^{U} ,\quad \forall i \\ \end{aligned}$$
(19b)
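For very small samples the two programs can be approximated without a solver by evaluating the crisp correlation on a grid inside each α-cut. The sketch below is such a crude exhaustive search, not the non-linear programming method of [6]: it can only find grid points, so the true bounds may be slightly wider, and its cost grows exponentially with \(n\). The names `pearson` and `correlation_bounds` and the data are hypothetical.

```python
import math
from itertools import product

def pearson(xs, ys):
    """Crisp correlation coefficient; returns None if a variance is zero."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def correlation_bounds(x_cuts, y_cuts, steps=4):
    """Approximate (r)_alpha^L and (r)_alpha^U by grid search over the alpha-cuts.

    x_cuts and y_cuts are lists of (lower, upper) interval end points.
    """
    def grid(lo, hi):
        if lo == hi:
            return [lo]
        return [lo + (hi - lo) * k / steps for k in range(steps + 1)]

    axes = [grid(lo, hi) for lo, hi in list(x_cuts) + list(y_cuts)]
    n = len(x_cuts)
    best_lo, best_hi = math.inf, -math.inf
    for point in product(*axes):
        r = pearson(point[:n], point[n:])
        if r is None:
            continue  # skip degenerate points where the correlation is undefined
        best_lo, best_hi = min(best_lo, r), max(best_hi, r)
    return best_lo, best_hi

# Hypothetical alpha-cuts of three fuzzy pairs at one alpha level.
x_cuts = [(0.8, 1.2), (1.8, 2.2), (2.8, 3.2)]
y_cuts = [(0.8, 1.2), (2.8, 3.2), (1.8, 2.2)]
print(correlation_bounds(x_cuts, y_cuts, steps=2))
```

At α = 1 each cut collapses to a single point, so both bounds coincide with the crisp coefficient of the centers; repeating the search for several α levels traces an approximate membership function.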

In the case of nonexistence of analytic solutions of non-linear programming problems, it is possible to obtain the numeric solutions for \((r_{XY} )_{\alpha }^{L}\) and \((r_{XY} )_{\alpha }^{U}\) at different α levels, which leads to the approximate shape of \(L(r)\) and \(R(r).\) A small data set which is given in Table 1 will be used to exemplify.

In order to work with a pair of non-linear programming problems, the α-cuts of variables for specified values (α = 0.0, 0.1, …, 0.9, 1.0) are tabulated in Tables 2 and 3, respectively.

Table 2 The α-cuts values for \(\tilde{X}\)
Table 3 The α-cuts values for \(\tilde{Y}\)

For each α value in Table 2, the first column shows the left end point and the second column shows the right end point of the α-cut. A similar construction is made for the fuzzy variable \(\tilde{Y}\) in Table 3.

Based on the values presented in Tables 2 and 3, the pair of non-linear programming problems is solved in order to calculate the lower and upper correlation values at each α level. These are tabulated in Table 4.

Table 4 Upper and lower correlation values for each corresponding α-cuts

3 Fuzzy Non-linear Regressions

Fuzzy linear regression has been utilized as a modeling technique since its introduction in [7] for settings such as linguistically defined values, small data sets, an unknown structure between the dependent and independent variables, and approximate measurements like intervals. Its applications span many disciplines, ranging from quality function deployment to determining claim reserves [8, 9]. It also allows crisp numbers to be utilized in the modeling. Therefore, several types of fuzzy linear regression models and their parameter estimation methods have been proposed. The generic form can be denoted by

$$\tilde{Y}_{i} = \tilde{A}_{0} + \tilde{A}_{1} \otimes \tilde{X}_{i1} + \cdots + \tilde{A}_{k} \otimes \tilde{X}_{ik} , \quad i = 1, \ldots ,n$$
(20)

where parameters, dependent and independent variables are all fuzzy numbers represented as one of the types such as triangular, trapezoidal and Gaussian fuzzy numbers.

Although several estimation methods have been defined, they fall into two groups that have been utilized and have evolved over the course of the research. The first is based on mathematical programming methods such as linear programming, goal programming, and non-linear programming. The second, the so-called fuzzy least squares, relies on minimizing the distance between two fuzzy sets, that is, the squared differences between the observed and estimated values of the dependent variable.

In fuzzy non-linear regression, the same variety of model types and estimation methods is encountered. In this chapter, two types of fuzzy non-linear regression models available in the literature are presented with small but illustrative examples. The first is S-shaped curve fuzzy regression, whose crisp version is widely utilized in modeling complex systems in fields such as biology, agriculture and social economy. Both the input variable and the output variable in this model are fuzzy numbers. Its parameter estimation method is based on minimizing the distance between the fuzzy observed values and the fuzzy estimated values given by the pre-defined model, and the parameter estimates are obtained as crisp values. The second is the quadratic fuzzy regression model, which comes in two forms: the first includes a quadratic term, and the second includes interaction terms. In this model the input variables have crisp values, while the output variable and the parameters are fuzzy. Its parameters are estimated with the distance based methods proposed in [10, 12, 13], which aim at minimizing the difference between the observed and the estimated values.

3.1 S-Shaped Curve Fuzzy Regression

S-shaped curve fuzzy regression was proposed in [11] to model observations from complex systems in fields such as biology, social economy and the agricultural sciences, where growth is slow at the beginning, accelerates rapidly during the process, and ends in saturation in the last phase.

Suppose that \(\tilde{x}_{i}\) and \(\tilde{y}_{i} ,(i = 1, \ldots ,n)\) are the observations to be modeled by

$$\tilde{y} = (a + b \cdot { \exp }\,\,( - \tilde{x}))^{ - 1} , \quad a,b \in R$$
(21)

It is assumed that least squares based metrics between fuzzy numbers give better estimates when the parameters of a fuzzy non-linear regression are estimated. Therefore, the metric defined in (22) will be utilized to determine the parameters of the model given in (21).

$$\tilde{d}\left( {\tilde{A},\tilde{B}} \right) = \left[ {\int_{0}^{1} {w^{2} } \left( \alpha \right)d^{2} \left( {A_{\alpha } ,B_{\alpha } } \right)d\alpha } \right]^{{\frac{1}{2}}} ,\quad \tilde{A},\tilde{B} \in F(R)$$
(22)

where \(w^{2} (\alpha )\) should be chosen as a monotone increasing function in [0, 1], and \(\tilde{A}\) and \(\tilde{B}\) are fuzzy numbers defined on real line denoted by \(F(R).\)

A monotone increasing weight function is chosen so that level sets with a higher degree of membership carry more weight when the distance between fuzzy numbers is determined.

The squared distance between the α-cuts of \(\tilde{A}\) and \(\tilde{B}\) used in (22) is given by

$$d^{2} \left( {\tilde{A}_{\alpha } ,\tilde{B}_{\alpha } } \right) = [l\left( \alpha \right) - p(\alpha )]^{2} + [r\left( \alpha \right) - q(\alpha )]^{2}$$
(23)

where \(\tilde{A}_{\alpha } = [l\left( \alpha \right),r\left( \alpha \right)]\) and \(\tilde{B}_{\alpha } = [p\left( \alpha \right),q\left( \alpha \right)].\)
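The metric in (22) with the α-cut distance in (23) can be approximated numerically. The sketch below assumes triangular fuzzy numbers given as (center, left spread, right spread), the weight \(w^{2} (\alpha ) = \alpha\) as one possible monotone increasing choice, and a midpoint rule for the integral; all three are illustrative assumptions.

```python
import math

def alpha_cut(tri, alpha):
    """Alpha-cut [lower, upper] of a triangular fuzzy number (center, left, right)."""
    center, left, right = tri
    return center - (1 - alpha) * left, center + (1 - alpha) * right

def weighted_distance(a, b, weight_sq=lambda alpha: alpha, steps=1000):
    """Distance of Eq. (22) using the alpha-cut distance of Eq. (23)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        alpha = (k + 0.5) * h                       # midpoint of the k-th subinterval
        l, r = alpha_cut(a, alpha)
        p, q = alpha_cut(b, alpha)
        total += weight_sq(alpha) * ((l - p) ** 2 + (r - q) ** 2) * h
    return math.sqrt(total)

# Two hypothetical symmetric triangular numbers whose centers differ by 2.
print(weighted_distance((0.0, 1.0, 1.0), (2.0, 1.0, 1.0)))
```

For these two numbers both end points differ by 2 at every level, so the integrand is \(8\alpha\) and the distance is \(\sqrt 4 = 2\).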

Utilizing the model and the metric given in (21) and (22), respectively, the least squares optimization problem is written as in (24).

$$Minimize\,\,M\left( {a,b} \right) = \sum\nolimits_{i = 1}^{n} {\tilde{d}^{2} } \left( {a + b\exp \left( { - \tilde{x}_{i} } \right), \frac{1}{{\tilde{y}_{i} }}} \right)$$
(24)

The α-cuts of functions of \(\tilde{X}\) and \(\tilde{Y}\) are represented as follows:

$$\begin{aligned} (\tilde{Y}_{i} )_{\alpha } & = [f_{i} \left( \alpha \right),\,g_{i} \left( \alpha \right)],\quad (\tilde{X}_{i} )_{\alpha } = [u_{i} \left( \alpha \right),\,v_{i} \left( \alpha \right)], \\ \left( {\frac{1}{{\tilde{Y}_{i} }}} \right)_{\alpha } & = \left[ {\frac{1}{{g_{i} (\alpha )}},\frac{1}{{f_{i} (\alpha )}}} \right],\quad (\exp \left( { - \tilde{x}_{i} } \right))_{\alpha } = [\exp \left( { - v_{i} } \right),\exp \left( { - u_{i} } \right)], \\ (\exp \left( {\tilde{x}_{i} } \right))_{\alpha } & = [\exp \left( {u_{i} } \right),\exp \left( {v_{i} } \right)],\quad \alpha \in (0,1] \\ \end{aligned}$$
(25)

where expression in (25) holds for positive fuzzy numbers.
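The interval rules in (25) follow from monotonicity: the reciprocal and \(\exp ( - \cdot )\) are decreasing on the positive reals, so they swap the end points. A minimal sketch, with hypothetical function names and intervals:

```python
import math

def reciprocal_cut(cut):
    """Alpha-cut of 1/Y from the alpha-cut [f, g] of a positive fuzzy number Y."""
    f, g = cut
    if f <= 0:
        raise ValueError("the rule holds only for positive fuzzy numbers")
    return (1 / g, 1 / f)          # decreasing map: end points swap

def exp_neg_cut(cut):
    """Alpha-cut of exp(-X) from the alpha-cut [u, v] of X."""
    u, v = cut
    return (math.exp(-v), math.exp(-u))   # decreasing map: end points swap

print(reciprocal_cut((2.0, 4.0)))
print(exp_neg_cut((0.0, 1.0)))
```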

Two different minimization functions are defined with respect to the sign of \(b,\) which are for \(b \ge 0\) and \(b < 0,\) respectively.

In order to simplify the notation, the argument α in parentheses is dropped in what follows. Likewise, \(w^{2}\) is written instead of \(w^{2} (\alpha ).\)

For \(b \ge 0,\) the α-cut of \((a + b\exp \left( { - \tilde{x}_{i} } \right))_{\alpha }\) is denoted by

$$\left[ {a + b\exp \left( { - v_{i} } \right),a + b\exp \left( { - u_{i} } \right)} \right], \quad (i = 1, \ldots ,n)$$
(26)

Then, the least squares objective function given in (24) is rewritten as

$$\begin{aligned} \min_{a,b} M_{ + } \left( {a,b} \right) & = \sum\nolimits_{i = 1}^{n} {\tilde{d}^{2} } \left( {a + b\exp ( - \tilde{x}_{i} ),\frac{1}{{\tilde{y}_{i} }}} \right) \\ & = \int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left[ {\left( {a + b\exp \left( { - v_{i} } \right) - \frac{1}{{g_{i} }}} \right)^{2} + \left( {a + b\exp \left( { - u_{i} } \right) - \frac{1}{{f_{i} }}} \right)^{2} } \right]} d\alpha \\ \end{aligned}$$
(27)

Setting the derivatives of the objective function in (27) with respect to the parameters \(a\) and \(b\) to zero yields a system of two equations. This first equation system is denoted by ES1:

$$ES1 = \left\{ {\begin{aligned} & 2na\int_{0}^{1} {w^{2} } d\alpha + b\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\exp ( - v_{i} ) + \exp ( - u_{i} )} \right)} d\alpha = \int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\frac{1}{{f_{i} }} + \frac{1}{{g_{i} }}} \right)} d\alpha \\ & a\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\exp ( - v_{i} ) + \exp ( - u_{i} )} \right)} d\alpha + b\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\exp ( - 2v_{i} ) + \exp ( - 2u_{i} )} \right)} d\alpha \\ & \quad = \int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\frac{1}{{f_{i} }}\exp ( - u_{i} ) + \frac{1}{{g_{i} }}\exp ( - v_{i} )} \right)} d\alpha \\ \end{aligned} } \right.$$
(28)

For \(b < 0,\) the α-cut of \((a + b\exp \left( { - \tilde{x}_{i} } \right))_{\alpha }\) is denoted by

$$\left[ {a + b\exp \left( { - u_{i} } \right),a + b\exp \left( { - v_{i} } \right)} \right], \quad (i = 1, \ldots ,n)$$
(29)

Then, the least squares objective function given in (24) is rewritten for this case as

$$\begin{aligned} \min_{a,b} M_{ - } \left( {a,b} \right) & = \sum\limits_{i = 1}^{n} {\tilde{d}^{2} } \left( {a + b\exp ( - \tilde{x}_{i} ),\frac{1}{{\tilde{y}_{i} }}} \right) \\ & = \int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left[ {\left( {a + b\exp \left( { - u_{i} } \right) - \frac{1}{{g_{i} }}} \right)^{2} + \left( {a + b\exp \left( { - v_{i} } \right) - \frac{1}{{f_{i} }}} \right)^{2} } \right]} d\alpha \\ \end{aligned}$$
(30)

Setting the derivatives of the objective function in (30) with respect to the parameters \(a\) and \(b\) to zero yields a system of two equations. This second equation system is denoted by ES2:

$$ES2 = \left\{ {\begin{aligned} & 2na\int_{0}^{1} {w^{2} } d\alpha + b\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\exp ( - v_{i} ) + \exp ( - u_{i} )} \right)} d\alpha = \int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\frac{1}{{f_{i} }} + \frac{1}{{g_{i} }}} \right)} d\alpha \\ & a\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\exp ( - v_{i} ) + \exp ( - u_{i} )} \right)} d\alpha + b\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\exp ( - 2v_{i} ) + \exp ( - 2u_{i} )} \right)} d\alpha \\ & \quad = \int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\frac{1}{{g_{i} }}\exp ( - u_{i} ) + \frac{1}{{f_{i} }}\exp ( - v_{i} )} \right)} d\alpha \\ \end{aligned} } \right.$$
(31)

In order to find the parameters of the S-curve fuzzy regression model, ES1 or ES2 needs to be solved. To decide between them, the criteria defined in [11] and given in (32) and (33) are utilized.

$$\begin{aligned} D_{b} = & 2n\int_{0}^{1} {w^{2} } d\alpha \int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\frac{1}{{f_{i} }}\exp \left( { - u_{i} } \right) + \frac{1}{{g_{i} }}\exp \left( { - v_{i} } \right)} \right)} d\alpha \\ & - \int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\exp \left( { - v_{i} } \right) + \exp \left( { - u_{i} } \right)} \right)} d\alpha \int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\frac{1}{{f_{i} }} + \frac{1}{{g_{i} }}} \right)} d\alpha \\ \end{aligned}$$
(32)
$$\begin{aligned} D_{{b_{ - } }} = & 2n\int_{0}^{1} {w^{2} } d\alpha \int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\frac{1}{{g_{i} }}\exp \left( { - u_{i} } \right) + \frac{1}{{f_{i} }}\exp \left( { - v_{i} } \right)} \right)} d\alpha \\ & - \int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\exp \left( { - u_{i} } \right) + \exp \left( { - v_{i} } \right)} \right)} d\alpha \int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\frac{1}{{f_{i} }} + \frac{1}{{g_{i} }}} \right)} d\alpha \\ \end{aligned}$$
(33)

It is proved by [11] that \(D_{b} \ge D_{{b_{ - } }} .\)

Based on the value of \(D_{b} ,\) the solution of (28) or (31) is found using the following computational procedure.

If \(D_{b} \ge 0,\) expression (28) has a unique solution, which gives the parameter estimates

$$a = \frac{{p_{1} }}{D}\,{\text{and}}\,b = \frac{{D_{b} }}{D}$$
(34)

where \(p_{1}\) and \(D\) are determinant values which are defined by

$$p_{1} = \left| {\begin{array}{*{20}c} {\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\frac{1}{{f_{i} }} + \frac{1}{{g_{i} }}} \right)} d\alpha } & {\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\exp \left( { - v_{i} } \right) + \exp \left( { - u_{i} } \right)} \right)} d\alpha } \\ {\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\frac{1}{{f_{i} }}\exp \left( { - u_{i} } \right) + \frac{1}{{g_{i} }}\exp \left( { - v_{i} } \right)} \right)} d\alpha } & {\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\exp \left( { - 2v_{i} } \right) + \exp \left( { - 2u_{i} } \right)} \right)} d\alpha } \\ \end{array} } \right|$$
(35)
$$D = \left| {\begin{array}{*{20}c} {2n\int_{0}^{1} {w^{2} } d\alpha } & {\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\exp \left( { - v_{i} } \right) + \exp \left( { - u_{i} } \right)} \right)} d\alpha } \\ {\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\exp \left( { - v_{i} } \right) + \exp \left( { - u_{i} } \right)} \right)} d\alpha } & {\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\exp \left( { - 2v_{i} } \right) + \exp \left( { - 2u_{i} } \right)} \right)} d\alpha } \\ \end{array} } \right|$$
(36)

If \(D_{b} < 0,\) then \(D_{{b_{ - } }} \le D_{b} < 0.\) Hence expression (31) has a unique solution, which gives the parameter estimates

$$a = \frac{{p_{2} }}{D}\,\,{\text{and}}\,\,b = \frac{{D_{{b_{ - } }} }}{D}$$
(37)

where \(p_{2}\) is a determinant value which is defined by

$$p_{2} = \left| {\begin{array}{*{20}c} {\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\frac{1}{{f_{i} }} + \frac{1}{{g_{i} }}} \right)} d\alpha } & {\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\exp \left( { - v_{i} } \right) + \exp \left( { - u_{i} } \right)} \right)} d\alpha } \\ {\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\frac{1}{{g_{i} }}\exp \left( { - u_{i} } \right) + \frac{1}{{f_{i} }}\exp \left( { - v_{i} } \right)} \right)} d\alpha } & {\int_{0}^{1} {w^{2} } \sum\limits_{i = 1}^{n} {\left( {\exp \left( { - 2v_{i} } \right) + \exp \left( { - 2u_{i} } \right)} \right)} d\alpha } \\ \end{array} } \right|$$
(38)

A small data set is tabulated in Table 5. Our aim is to determine the parameters of the fuzzy non-linear regression model defined in (21).

Table 5 Data set for S-curve fuzzy regression
$$a = \frac{{p_{1} }}{D} = \frac{ - 6.53}{1.41} = - 4.63\,{\text{and}}\,b = \frac{{D_{b} }}{D} = \frac{56.46}{1.41} = 40.04$$

The estimated model of the form given in (21) is then

$$\tilde{y} = - 4.63 + 40.04\,{ \exp }( - \tilde{x})$$
(39)
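The arithmetic behind (39) can be checked with a short sketch. It reuses only the determinant values quoted for the Table 5 data; the function names and the triangular input are illustrative assumptions, and the alpha-cut propagation assumes a triangular fuzzy input and a positive slope estimate, so that the fitted curve is decreasing in the input.

```python
import math

# Determinant values quoted in the text for the Table 5 data.
P1 = -6.53
D_B = 56.46
D = 1.41

a = P1 / D   # intercept estimate
b = D_B / D  # coefficient of exp(-x)


def predict_crisp(x):
    """Evaluate a + b * exp(-x) at a crisp input x."""
    return a + b * math.exp(-x)


def predict_alpha_cut(center, left_spread, right_spread, alpha):
    """Alpha-cut [lower, upper] of a + b * exp(-x~) for a triangular input.

    Assumption: b > 0, so x -> a + b * exp(-x) is decreasing and the
    upper end of the input interval maps to the lower end of the output.
    """
    x_lo = center - (1.0 - alpha) * left_spread
    x_hi = center + (1.0 - alpha) * right_spread
    return predict_crisp(x_hi), predict_crisp(x_lo)


print(round(a, 2), round(b, 2))
# Hypothetical fuzzy input "around 2" with spreads 0.5, evaluated at its support.
print(predict_alpha_cut(2.0, 0.5, 0.5, alpha=0.0))
```

At the level 1 the interval collapses to the crisp prediction at the center of the input, which gives a quick sanity check on the propagation.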

3.2 Quadratic Fuzzy Regression

The second type of fuzzy non-linear regression model is quadratic fuzzy regression, which is expressed in two different forms. The first includes a quadratic term, while the second contains an interaction term of the independent variables. Both were proposed in [12] and are given in (40) and (41).

$$\tilde{Y}_{i} = A_{0} + A_{1} X_{i1} + A_{2} X_{i1 }^{2}, \quad i = 1,2, \ldots ,n$$
(40)
$$\tilde{Y}_{i} = A_{0} + A_{1} X_{i1} + A_{2} X_{i2} + A_{3} X_{i1} X_{i2}, \quad i = 1,2, \ldots ,n$$
(41)

In both models, it is assumed that the input variables are non-negative crisp values and that the output variable is a normal and convex fuzzy number with either a symmetric or a non-symmetric triangular membership function. The fuzzy least squares method, which aims to minimize the squared differences between the observed fuzzy dependent variable and the estimated fuzzy outputs, is widely applied to estimate the parameters. In order to define the difference between the observed and the estimated fuzzy numbers, several methods that transform those fuzzy numbers into crisp numbers are proposed in [10, 12, 13]. One of them, the Overall Existence Ranking Index (OERI), was proposed in [13]. It is a ranking method developed for fuzzy sets and is based on the inverse membership function. For a given existence level \(w\), the inverse image of the membership function \(\mu (x)\) is defined as

$$\mu^{ - 1} \left( w \right) = \left\{ {x:\mu \left( x \right) = w} \right\}$$
(42)

Then, for any two fuzzy numbers \(A\) and \(B\) and a level \(w \in (0,1]\), if \(A\) is larger than \(B\) at \(w\), then \(\left\{ {\mu_{A}^{ - 1} (w)} \right\} > \left\{ {\mu_{B}^{ - 1} (w)} \right\}\) holds. The converse is not generally true. The OERI for a fuzzy number \(A = (x,\alpha ,\beta )\) is a crisp number defined as

$$OM\left( A \right) = x - \frac{1}{2}X_{1} \left( w \right)\alpha + \frac{{1 - X_{1} (w)}}{2}\beta$$
(43)

where \(X_{1} \left( w \right)\) is a weighting function determined subjectively by decision makers. A more realistic choice is the linear weighting function mentioned in [13]. Then, for fuzzy numbers \(A\) and \(B,\) the distance is defined as

$$D\left( {A,B} \right) = OM\left( A \right) - OM(B)$$
(44)
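A minimal sketch of (43) and (44) may help make the ranking index concrete. It assumes the weighting function \(X_{1}(w)\) has already been evaluated to a single number in [0, 1], passed in as the hypothetical parameter `chi`; the function names and the example fuzzy numbers are illustrative only.

```python
def oeri(center, left_spread, right_spread, chi):
    """Overall Existence Ranking Index of A = (x, alpha, beta), as in (43).

    chi stands in for the value of the weighting function X_1(w);
    treating it as a constant in [0, 1] is an assumption of this sketch.
    """
    return center - 0.5 * chi * left_spread + 0.5 * (1.0 - chi) * right_spread


def oeri_distance(a, b, chi):
    """Signed distance D(A, B) = OM(A) - OM(B), as in (44)."""
    return oeri(*a, chi) - oeri(*b, chi)


# Hypothetical triangular fuzzy numbers (center, left spread, right spread).
A = (5.0, 1.0, 2.0)
B = (3.0, 1.0, 1.0)
print(oeri(*A, chi=0.5), oeri(*B, chi=0.5), oeri_distance(A, B, chi=0.5))
```

Note that (44) is a signed quantity, so swapping the arguments flips its sign; only its square is symmetric.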

When OERI is adapted to the regression problem, the objective function based on this distance can be written as follows:

$$MIN\sum\limits_{i = 1}^{n} {[Y_{i} - (\widehat{Y}_{i} )]^{2} } = \sum\limits_{i = 1}^{n} {[D(Y_{i} ,\widehat{Y}_{i} )]^{2} } = \sum\limits_{i = 1}^{n} {[OM\left( {Y_{i} } \right) - OM(\widehat{Y}_{i} )]^{2} }$$
(45)

The minimization function and its constraints employing OERI can be written as

$$MIN\sum\nolimits_{i = 1}^{n} {\left\{ {\left[ {y_{i}^{m} - ax_{i} - \frac{{X_{1} \left( w \right)}}{2}\left( {y_{i}^{L} - \gamma x_{i} } \right) + \frac{{1 - X_{1} \left( w \right)}}{2}\left( {y_{i}^{R} - \delta x_{i} } \right)} \right]^{2} + \left( {y_{i}^{L} - \gamma x_{i} } \right)^{2} + \left( {y_{i}^{R} - \delta x_{i} } \right)^{2} } \right\}}$$
(46)
$$y_{i}^{m} - \left( {1 - \alpha } \right)y_{i}^{L} \ge ax_{i} - \left( {1 - \alpha } \right)\gamma x_{i}$$
$$y_{i}^{m} + \left( {1 - \alpha } \right)y_{i}^{R} \le ax_{i} + (1 - \alpha )\delta x_{i}$$
$$\gamma ,\delta \ge 0$$

where the fuzzy number \(A = (a,\gamma ,\delta )\) is the parameter of the regression, the fuzzy number \(Y = (y^{m} ,y^{L} ,y^{R} )\) is the observed dependent variable, and \(0 < \alpha \le 1.\) The formulation given in (46) covers the case of one independent variable and can easily be extended to several independent variables.
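To make the structure of (46) easier to follow, the sketch below evaluates the objective and checks the constraints for a candidate parameter triple. It is not a solver. It assumes that \(y^{L}\) and \(y^{R}\) are the left and right spreads of the response, that \(X_{1}(w)\) is supplied as a constant `chi`, and that the level appearing in the constraints is passed as `level`; all names and data are hypothetical.

```python
def oeri_objective(params, xs, ys, chi):
    """Value of the objective in (46) for one crisp regressor.

    params = (a, gamma, delta): center, left spread and right spread of the
    fuzzy coefficient. ys holds triples (y_m, y_L, y_R) for each observation.
    """
    a, gamma, delta = params
    total = 0.0
    for x, (y_m, y_l, y_r) in zip(xs, ys):
        left_gap = y_l - gamma * x
        right_gap = y_r - delta * x
        ranked_gap = (y_m - a * x
                      - 0.5 * chi * left_gap
                      + 0.5 * (1.0 - chi) * right_gap)
        total += ranked_gap ** 2 + left_gap ** 2 + right_gap ** 2
    return total


def is_feasible(params, xs, ys, level, tol=1e-12):
    """Check the two inclusion constraints and the sign constraints of (46)."""
    a, gamma, delta = params
    if gamma < 0 or delta < 0:
        return False
    for x, (y_m, y_l, y_r) in zip(xs, ys):
        lower_ok = y_m - (1 - level) * y_l >= a * x - (1 - level) * gamma * x - tol
        upper_ok = y_m + (1 - level) * y_r <= a * x + (1 - level) * delta * x + tol
        if not (lower_ok and upper_ok):
            return False
    return True


# Hypothetical data generated exactly by A = (2, 0.5, 0.5).
xs = [1.0, 2.0]
ys = [(2.0, 0.5, 0.5), (4.0, 1.0, 1.0)]
print(oeri_objective((2.0, 0.5, 0.5), xs, ys, chi=0.5))
print(is_feasible((2.0, 0.5, 0.5), xs, ys, level=0.5))
```

In practice these two functions would be handed to a constrained optimizer; which optimizer to use is left open here.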

Similarly, Diamond [10] proposed another distance function, which for triangular fuzzy numbers \(A = (x,\alpha ,\beta )\) and \(B = (y,\gamma ,\delta )\) is defined as

$$d^{2} \left( {A,B} \right) = (x - y)^{2} + [\left( {x - y} \right) - (\alpha - \gamma )]^{2} + [\left( {x - y} \right) + (\beta - \delta )]^{2}$$
(47)
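Diamond's distance can also be read as the sum of squared differences between the left end points, the centers and the right end points of the two fuzzy numbers. The sketch below computes it in that end-point form, assuming each fuzzy number is given as (center, left spread, right spread); the function name and the example values are hypothetical.

```python
def diamond_distance_sq(a, b):
    """Squared Diamond distance between two triangular fuzzy numbers.

    Each argument is (center, left_spread, right_spread). The result is the
    sum of squared differences of left end points, centers and right end points.
    """
    (x, alpha, beta), (y, gamma, delta) = a, b
    left_diff = (x - alpha) - (y - gamma)
    right_diff = (x + beta) - (y + delta)
    return (x - y) ** 2 + left_diff ** 2 + right_diff ** 2


A = (5.0, 1.0, 2.0)
B = (3.0, 1.0, 1.0)
print(diamond_distance_sq(A, B))
```

Unlike the signed OERI difference, this quantity is symmetric in its arguments and is zero only when the two triangular numbers coincide.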

The formulation based on the method proposed by Diamond is given as follows:

$$(X^{T} X)\varvec{\alpha}_{L} = X^{T} \varvec{Y}_{L}$$
(48a)
$$(X^{T} X)\varvec{\alpha}_{R} = X^{T} \varvec{Y}_{R}$$
(48b)

where \(\varvec{Y}_{L}\) and \(\varvec{Y}_{R}\) are vectors of the left and right end points of the response values, \(\varvec{\alpha}_{L}\) and \(\varvec{\alpha}_{R}\) are vectors of the left and right end points of the predicted parameters, and \(X\) is the data matrix given by

$$X = \left[ {\begin{array}{*{20}c} 1 & {X_{11} } & {X_{11}^{2} } \\ \vdots & \vdots & \vdots \\ 1 & {X_{n1} } & {X_{n1}^{2} } \\ \end{array} } \right]$$
(49)
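Equations (48a) and (48b) are ordinary normal equations solved once for the left end points and once for the right end points. The following sketch builds the data matrix (49) and solves both systems with a small Gaussian elimination routine; the helper names and the data are hypothetical, and a numerical library could be used instead of the hand-written solver.

```python
def design_matrix_quadratic(xs):
    """Rows [1, x, x^2] of the data matrix in (49)."""
    return [[1.0, x, x * x] for x in xs]


def solve_normal_equations(X, y):
    """Solve (X^T X) beta = X^T y by Gaussian elimination with partial pivoting."""
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; X lacks full column rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, p):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, p + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    beta = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(aug[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = (aug[i][p] - tail) / aug[i][i]
    return beta


# Hypothetical crisp inputs and left/right end points of fuzzy responses.
xs = [0.0, 1.0, 2.0, 3.0]
y_left = [1.0 + 2.0 * x + 3.0 * x * x for x in xs]
y_right = [2.0 + 2.0 * x + 3.5 * x * x for x in xs]

X = design_matrix_quadratic(xs)
left_coeffs = solve_normal_equations(X, y_left)    # as in (48a)
right_coeffs = solve_normal_equations(X, y_right)  # as in (48b)
print(left_coeffs, right_coeffs)
```

Because the hypothetical end points lie exactly on two parabolas, the solver recovers their coefficients up to rounding error, which makes the example easy to verify by hand.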

A similar construction was also proposed in [12] for the quadratic fuzzy regression containing the interaction term of the independent variables.

The generic data matrix for the model given in (41) is given as follows:

$$X = \left[ {\begin{array}{*{20}c} 1 & {X_{11} } & {X_{12} } & {X_{11} X_{12} } \\ \vdots & \vdots & \vdots & \vdots \\ 1 & {X_{n1} } & {X_{n2} } & {X_{n1} X_{n2} } \\ \end{array} } \right]$$
(50)
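As a small illustration of (50), the sketch below assembles the data matrix for the interaction model from two crisp input columns. The function name and the input values are hypothetical; the resulting matrix can take the place of (49) in equations (48a) and (48b).

```python
def design_matrix_interaction(x1, x2):
    """Rows [1, x1, x2, x1 * x2] of the data matrix in (50)."""
    if len(x1) != len(x2):
        raise ValueError("both input columns need the same number of observations")
    return [[1.0, u, v, u * v] for u, v in zip(x1, x2)]


# Hypothetical crisp inputs.
X_int = design_matrix_interaction([1.0, 2.0], [3.0, 4.0])
print(X_int)
```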

A small data set is used in order to illustrate the models given in (40) and (41) (Table 6).

Table 6 Data set for quadratic regression models

The estimated models for (40) and (41) are, respectively,

$$\begin{aligned} \tilde{y} & = \left( {6.37,0.86} \right) + \left( { - 3.12,0.48} \right)x_{1} + (0.59,0.10)x_{1}^{2} \\ \tilde{y} & = \left( {10.05,1.53} \right) + \left( { - 1.62,0.18} \right)x_{1} + \left( { - 2.90,0.51} \right)x_{2} + (0.76,0.12)x_{1} x_{2} \\ \end{aligned}$$
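The fitted models can be evaluated at crisp inputs with a few lines of code. The sketch below assumes that each pair is a symmetric triangular coefficient written as (center, spread) and that the inputs are non-negative crisp values, so the output center is the usual linear combination and the output spread is the sum of the coefficient spreads weighted by the absolute regressor values. The function names and the evaluation points are hypothetical.

```python
# Coefficient pairs quoted in the fitted models, read as (center, spread).
QUADRATIC = [(6.37, 0.86), (-3.12, 0.48), (0.59, 0.10)]
INTERACTION = [(10.05, 1.53), (-1.62, 0.18), (-2.90, 0.51), (0.76, 0.12)]


def predict(coefficients, features):
    """Return (center, spread) of a fuzzy linear combination of crisp features."""
    center = sum(c * f for (c, _), f in zip(coefficients, features))
    spread = sum(s * abs(f) for (_, s), f in zip(coefficients, features))
    return center, spread


def predict_quadratic(x1):
    """Evaluate the fitted model of form (40)."""
    return predict(QUADRATIC, [1.0, x1, x1 * x1])


def predict_interaction(x1, x2):
    """Evaluate the fitted model of form (41)."""
    return predict(INTERACTION, [1.0, x1, x2, x1 * x2])


print(predict_quadratic(2.0))
print(predict_interaction(2.0, 3.0))
```

The spread grows with the size of the regressors, which is a general feature of models with fuzzy coefficients and crisp inputs.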

4 Conclusion

The fuzzy correlation measure is an important fuzzy statistic that helps in understanding the relation between two variables collected as either linguistically defined values or approximately known quantities. In classical statistical theory, the correlation of such variables cannot be calculated without losing some of the information they contain. With the help of fuzzy set theory, which provides mathematical tools for modeling uncertainty different from that defined by the probabilistic approach, the relation between those variables can be quantified using a fuzzy correlation measure. Although several methods are available in the literature, two are chosen here because they use fuzzy concepts directly in their computational procedures and give reliable results. Both rely on Zadeh’s extension principle, combined with either fuzzy arithmetic and the weakest t-norm or a non-linear programming problem. Both methods are run on the same small data set.

Fuzzy non-linear regression benefits fully from the methodological developments in fuzzy linear regression when the model is defined in a form other than a linear structure. Two such models are considered: S-curve fuzzy regression and quadratic fuzzy regression. Distance-based parameter estimation methods perform better than mathematical programming methods when parameters are estimated in fuzzy non-linear regression. A separate small data set is used to illustrate each of the S-curve and quadratic fuzzy regression models.