1 Introduction

Fuzzy regression analysis is a tool for investigating problems with vague variables in complex systems. The first fuzzy linear regression method was derived by Tanaka et al. [1], and many variants of fuzzy regression have since been proposed. According to the review paper [2], fuzzy regression methods can be classified into three main categories: (1) possibilistic regression methods, (2) fuzzy least squares methods, and (3) machine learning methods.

Possibilistic regression originates with Tanaka et al. [1], in which linear programming minimizes the total spread of the fuzzy variables. Other optimization methods have been proposed since, such as nonlinear programming [3] and goal programming [4].

The fuzzy least squares method was first proposed by Celmins [5] and Diamond [6]. The fuzzy regression coefficients are estimated by minimizing the squared distance between the estimated and observed fuzzy outputs. Several enhancements have been proposed. For example, Xu [7, 8] used the integral distance over every level set so that the three vertices of the triangular fuzzy number are treated equally. Diamond and Körner [9] applied the Hukuhara difference to resolve the negative spread problem.

To develop robust fuzzy regression models, several machine learning techniques have been employed, e.g., fuzzy genetic algorithms [10], support vector fuzzy regression machines [11], and back-propagation neural networks combined with fuzzy regression analysis [12].

The main drawback of the fuzzy least squares approach is that the resulting fuzzy regression model's accuracy decreases as the magnitudes and the number of independent variables increase; see [13]. On the other hand, it is well known that regression analyses based on minimizing convex loss functions are sensitive to outliers in the design space [14]. Hence, fuzzy regression analysis based on the least squares method is subject to the same criticism [15].

Some nonparametric methods have been proposed to overcome these drawbacks [6, 13, 16]. Jung et al. [13] employed the rank transform method to construct a fuzzy linear regression model and empirically showed, using three examples, that the rank transform method is robust to outliers; each example consists of fewer than ten samples and three independent variables. Cheng and Lee [16] employed two nonparametric regression techniques: k-nearest neighbor smoothing and kernel smoothing. Wang et al. [6] developed a fuzzy nonparametric regression method based on the Diamond distance measure and the local linear smoothing technique. Choi and Buckley [15], in contrast, proposed the least absolute deviation estimators method, a parametric method, to enhance accuracy.

This paper derives a new fuzzy nonparametric regression method, in the spirit of [6, 16], that combines a convex nonparametric least squares (CNLS) approach with Diamond's fuzzy least squares model. The resulting regression model is a fuzzy nonlinear regression model; hence, the proposed approach may alleviate the drawbacks mentioned above.

In 2008, Kuosmanen [17] derived the representation theorem for CNLS subject to continuity, monotonicity, and concavity constraints. Unlike kernel regression, CNLS requires neither a prior specification of the functional form nor a smoothing parameter. Due to its shape constraints (concavity), CNLS has attracted considerable interest in the literature on productivity efficiency analysis [18, 19]. In short, CNLS is attractive because it avoids functional-form assumptions and fits the data better than ordinary least squares. However, to the best of our knowledge, CNLS has not been employed in fuzzy regression analysis; this research gap motivates the current paper.

To derive the new fuzzy nonparametric regression method, we first separate Diamond's fuzzy least squares model into three fuzzy least squares submodels; a similar approach can be found in [20]. Then, we apply the CNLS technique to each submodel and call this method fuzzy convex nonparametric least squares (Fuzzy-CNLS). The resulting fuzzy regression model of Fuzzy-CNLS is nonlinear. Because the proposed analysis is nonparametric and its resulting model is nonlinear, the drawback of accuracy decreasing as the magnitudes and the number of independent variables increase is alleviated.

The current paper's main contribution is a fuzzy nonparametric regression method (Fuzzy-CNLS) that combines Diamond's fuzzy least squares with a CNLS approach. Fuzzy-CNLS requires neither a prior specification of the regression model's functional form nor a smoothing parameter like the kernel regression. It provides better estimation performance with a set of hyperplanes, so the resulting model is not a black box. In short, the contributions of this paper are:

1. To derive a novel fuzzy nonparametric regression method, Fuzzy-CNLS.

2. To introduce CNLS to Diamond's fuzzy least squares.

3. To derive the nonlinear fuzzy regression model by Fuzzy-CNLS.

4. To derive the forecasting process of the nonlinear fuzzy regression model, which is not a black box.

5. To provide one illustrative example and one application of Fuzzy-CNLS to Shanghai house prices retrieved from the literature.

The remainder of the paper is organized as follows. Section 2 presents Diamond's fuzzy least squares estimation and the convex nonparametric least squares (CNLS) technique. Section 3 derives the new fuzzy nonparametric regression method, Fuzzy-CNLS, discusses the modification and use of the shape constraints, and describes the forecasting process. Section 4 selects a goodness-of-fit measure (similarity) to compare the performance of Fuzzy-CNLS with other fuzzy least squares methods. Section 5 provides two examples to illustrate the use of Fuzzy-CNLS and reports its goodness of fit; Examples 1 and 2 demonstrate the use of different shape constraints. Section 6 concludes the article.

2 Preliminaries

Throughout this paper, R stands for the set of all real numbers, and \(\mathcal{F}\left(R\right)\) stands for the set of all fuzzy numbers in R; both asymmetric and symmetric triangular fuzzy numbers in R are considered. \(\widetilde{a}\) stands for a fuzzy number.

Definition 1

[21] A fuzzy number \(\widetilde{y}\) is a so-called L-R fuzzy number, \(\widetilde{y}=({y}_{c}, {y}_{L}, {y}_{R})\), if the corresponding membership function \({\mu }_{\widetilde{y}}(x)\) satisfies for all \(x\in R\)

$${\mu }_{\widetilde{y}}(x)=\left\{\begin{array}{l}L\left(\frac{{y}_{c}-x}{{y}_{c}-{y}_{L}}\right)\quad {y}_{L}\le x\le {y}_{c},\\ R\left(\frac{x-{y}_{c}}{{y}_{R}-{y}_{c}}\right) \quad {y}_{c}\le x\le {y}_{R},\\ 0 \quad \mathrm{else},\end{array}\right.$$

where \({y}_{c}, {y}_{L}, {y}_{R}\) are the center, left endpoint, and right endpoint of the fuzzy number \(\widetilde{y}\), respectively, and \(L\) and \(R\) are strictly decreasing continuous functions from [0,1] to [0,1] such that \(L\left(0\right)=R\left(0\right)=1\) and \(L\left(1\right)=R\left(1\right)=0\). \(L\left(x\right)\) and \(R\left(x\right)\) are called the left and the right shape function, respectively.

If \({y}_{c}-{y}_{L}={y}_{R}-{y}_{c}={y}_{e}\), then the L-R fuzzy number \(\widetilde{y}\) is called an L-R symmetric fuzzy number and denoted by \(\widetilde{y}=({y}_{c},{y}_{e})\).

Definition 2

[21] A fuzzy number \(\widetilde{y}\) is a so-called triangular fuzzy number, \(\widetilde{y}=({y}_{c}, {y}_{L}, {y}_{R})\), if the corresponding membership function \({\mu }_{\widetilde{y}}(x)\) satisfies for all \(x\in R\)

$${\mu }_{\widetilde{y}}(x)=\left\{\begin{array}{l}\frac{x-{y}_{L}}{{y}_{c}-{y}_{L}}\quad {y}_{L}\le x\le {y}_{c},\\ \frac{{y}_{R}-x}{{y}_{R}-{y}_{c}}\quad {y}_{c}\le x\le {y}_{R},\\ 0 \quad \mathrm{else}.\end{array}\right.$$

Remark

Some papers define \(\widetilde{y}=\left({y}_{c}, {y}_{l}, {y}_{r}\right)\), where \({y}_{l}\) and \({y}_{r}\) are the left and right spreads of \(\widetilde{y}\), i.e., \({y}_{l}={y}_{c}-{y}_{L}\) and \({y}_{r}={y}_{R}-{y}_{c}\). In the current paper, \(\widetilde{y}=\left({y}_{c}, {y}_{L}, {y}_{R}\right)\) is more appropriate because the proposed Fuzzy-CNLS method searches for a set of unique hyperplanes. Hereafter, we call \(\widetilde{y}=\left({y}_{c}, {y}_{l}, {y}_{r}\right)\) the spread format and \(\widetilde{y}=\left({y}_{c}, {y}_{L}, {y}_{R}\right)\) the endpoint format.
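Since both formats recur throughout the paper, the conversion between them can be sketched in a few lines of Python (the helper names are our own):

```python
def to_spread(y):
    """Endpoint format (y_c, y_L, y_R) -> spread format (y_c, y_l, y_r)."""
    yc, yL, yR = y
    return (yc, yc - yL, yR - yc)

def to_endpoint(y):
    """Spread format (y_c, y_l, y_r) -> endpoint format (y_c, y_L, y_R)."""
    yc, yl, yr = y
    return (yc, yc - yl, yc + yr)
```

These helpers are reused in the sketches of the later sections.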

2.1 Fuzzy Linear Regression Models

Diamond [22] derived the fuzzy least squares method for the fuzzy linear regression model of the following form:

$$\tilde{y}_{i} = \tilde{a} + \tilde{b}_{1} x_{{i1}} + \tilde{b}_{2} x_{{i2}} + \ldots + \tilde{b}_{m} x_{{im}}, \quad i = 1,2, \ldots ,n$$
(1)

Let \(\left({x}_{i1},{x}_{i2},\dots ,{x}_{im},{\widetilde{y}}_{i}\right), i=1,\dots ,n,\) be n pairs of crisp input and fuzzy output observations, and let the fuzzy parameters \(\widetilde{a}=({a}_{c}, {a}_{L}, {a}_{R})\) and \({\widetilde{b}}_{j}=({b}_{cj}, {b}_{Lj}, {b}_{Rj})\), \(j=1,2,\dots ,m\), be triangular fuzzy numbers. Moreover, let \({\widehat{\widetilde{y}}}_{i}=\left({\widehat{y}}_{ci}, {\widehat{y}}_{Li}, {\widehat{y}}_{Ri}\right)\) be the estimated fuzzy response. By Zadeh's extension principle for fuzzy numbers, we have

$$\begin{aligned} \tilde{a} + \sum\limits_{{j = 1}}^{m} {\tilde{b}_{j} } x_{{ij}} =\, & \left( {a_{c} + \sum\limits_{{j = 1}}^{m} {b_{{cj}} } x_{{ij}} ,a_{L} + \sum\limits_{{j = 1}}^{m} {b_{{Lj}} } x_{{ij}} ,a_{R} + \sum\limits_{{j = 1}}^{m} {b_{{Rj}} } x_{{ij}} } \right) \\ =\, & \widehat{\tilde{y}}_{i} = \left( {\widehat{y}_{{ci}} ,\widehat{y}_{{Li}} ,\widehat{y}_{{Ri}} } \right). \\ \end{aligned}$$
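As a small worked instance of this arithmetic (with illustrative numbers of our own), take \(m=1\), \(\widetilde{a}=(1, 0, 2)\) and \(\widetilde{b}_{1}=(2, 1, 3)\) in the endpoint format, and \(x_{i1}=2\ge 0\):

$$\widetilde{a}+\widetilde{b}_{1}x_{i1}=\left(1+2\cdot 2,\; 0+1\cdot 2,\; 2+3\cdot 2\right)=\left(5, 2, 8\right),$$

which is again a triangular fuzzy number with center 5 and endpoints 2 and 8.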

2.2 Fuzzy Least Squares Estimation [22]

There are three major approaches to finding the fuzzy parameters, \(\widetilde{a}\) and \({\widetilde{b}}_{j}\), of the regression model: (i) possibilistic regression analysis, (ii) fuzzy least squares methods, and (iii) machine learning; see Chukhrova and Johannssen's review paper for more details [2]. Since we derive Fuzzy-CNLS based on Diamond's fuzzy least squares method [22], we present Diamond's model below and the CNLS approach next.

Diamond [22] developed the fuzzy least squares method, which obtains the fuzzy parameters by minimizing the total squared error of the output in the following least squares problem.

$$\begin{aligned} \mathop {{\text{min}}}\limits_{{a_{c} ,a_{l} ,a_{r} ,b_{{cj}} ,b_{{lj}} ,b_{{rj}} }} \sum\limits_{{i = 1}}^{n} {\left( {\widehat{{\tilde{y}}}_{i} - \tilde{y}_{i} } \right)^{2} } = & \sum\limits_{{i = 1}}^{n} {\left( {a_{c} + \sum\limits_{{j = 1}}^{m} {b_{{cj}} } x_{{ij}} - y_{{ci}} } \right)^{2} } \\ & \; + \sum\limits_{{i = 1}}^{n} {\left( {a_{l} + \sum\limits_{{j = 1}}^{m} {b_{{lj}} } x_{{ij}} - y_{{li}} } \right)^{2} } + \sum\limits_{{i = 1}}^{n} {\left( {a_{r} + \sum\limits_{{j = 1}}^{m} {b_{{rj}} } x_{{ij}} - y_{{ri}} } \right)^{2} } \\ \end{aligned} .$$

Note that the above regression coefficients, \(\widetilde{a}\) and \({\widetilde{b}}_{j}\), are in the spread format. Rewriting the triangular fuzzy coefficients in the endpoint format yields the following version of the Diamond method [20].

[Diamond]

$$\begin{aligned} \mathop {{\text{min}}}\limits_{{a_{c} ,a_{L} ,a_{R} ,b_{{cj}} ,b_{{Lj}} ,b_{{Rj}} }} \sum\limits_{{i = 1}}^{n} {\left( {\widehat{{\tilde{y}}}_{i} - \tilde{y}_{i} } \right)^{2} } = & \sum\limits_{{i = 1}}^{n} {\left( {a_{c} + \sum\limits_{{j = 1}}^{m} {b_{{cj}} } x_{{ij}} - y_{{ci}} } \right)^{2} } \\ & \quad + \sum\limits_{{i = 1}}^{n} {\left( {a_{L} + \sum\limits_{{j = 1}}^{m} {b_{{Lj}} } x_{{ij}} - y_{{Li}} } \right)^{2} } + \sum\limits_{{i = 1}}^{n} {\left( {a_{R} + \sum\limits_{{j = 1}}^{m} {b_{{Rj}} } x_{{ij}} - y_{{Ri}} } \right)^{2} } \\ \end{aligned}.$$

2.3 Convex Nonparametric Least Squares (CNLS)

CNLS estimates a nonparametric regression model of the form

$$y=f\left(\mathbf{x}\right)+{\varepsilon }^{CNLS},$$

where \(f\left(\mathbf{x}\right)\) is a function with shape restrictions, \(y\) is the crisp output variable, \(\mathbf{x}\) is a vector of crisp input variables, and \({\varepsilon }^{CNLS}\) is a random variable satisfying \(E({\varepsilon }^{CNLS}|\mathbf{x})=0\). Kuosmanen [17] derived the following quadratic programming formulation to estimate \(f\left(\mathbf{x}\right)\).

[CNLS]

$${\mathrm{min}}_{a,b,\varepsilon } \sum_{i=1}^{n}{({\varepsilon }_{i}^{\mathrm{CNLS}})}^{2}$$
$$s.t.\;y_{i} = a_{i} + \sum\limits_{{j = 1}}^{m} {b_{{ij}} } x_{{ij}} + \varepsilon _{i}^{{{\text{CNLS}}}} \;{\text{for}}\;i = 1, \ldots ,n,$$
(2)
$$a_{i} + \sum\limits_{{j = 1}}^{m} {b_{{ij}} } x_{{ij}} \le a_{h} + \sum\limits_{{j = 1}}^{m} {b_{{hj}} } x_{{ij}} \;{\text{for}}\;i,h = 1, \ldots ,n\;{\text{and}}\;i \ne h,$$
(3)
$$b_{{ij}} \ge 0,\;{\text{for}}\;i = 1, \ldots ,n;\;j = 1, \ldots ,m$$
(4)

where \({y}_{i}\) is the crisp output, \({x}_{ij}\) is the crisp input \(j\), and \({\varepsilon }_{i}^{CNLS}\) is the disturbance term representing the deviation of observation \(i\) from the estimated function.

The first \(n\) equality constraints (2) define the hyperplanes used to approximate the unknown underlying function \(f\left(\mathbf{x}\right)\), where \({a}_{i}\) and \({b}_{ij}\) are specific to each observation \(i\). The Afriat concavity inequalities and the monotonicity constraints are given by (3) and (4), respectively.
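To make the formulation concrete, here is a minimal sketch of the [CNLS] quadratic program (2)-(4) in Python using the cvxpy modeling library; the function name cnls_fit and the concave flag are our own conventions (the flag anticipates the convexity variant of Sect. 3.1):

```python
import numpy as np
import cvxpy as cp

def cnls_fit(X, y, concave=True):
    """Solve the [CNLS] quadratic program (2)-(4).

    X: (n, m) array of crisp inputs; y: (n,) array of crisp outputs.
    concave=False flips the Afriat inequalities (3) from "<=" to ">=",
    the convexity variant discussed in Sect. 3.1.
    Returns observation-specific intercepts a_i, slopes b_ij, and the
    (unique) fitted values.
    """
    n, m = X.shape
    a = cp.Variable(n)                      # intercept of hyperplane i
    b = cp.Variable((n, m), nonneg=True)    # slopes; monotonicity (4)
    fitted = a + cp.sum(cp.multiply(b, X), axis=1)   # a_i + sum_j b_ij * x_ij

    # Afriat inequalities (3): n(n-1) pairwise constraints.
    constraints = []
    for i in range(n):
        for h in range(n):
            if i != h:
                other = a[h] + b[h, :] @ X[i, :]
                constraints.append(fitted[i] <= other if concave
                                   else fitted[i] >= other)

    prob = cp.Problem(cp.Minimize(cp.sum_squares(y - fitted)), constraints)
    prob.solve()
    return a.value, b.value, fitted.value
```

Note that the Afriat inequalities produce n(n - 1) constraints, so the problem grows quadratically with the sample size; this is the computational burden revisited in the conclusion.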

According to Kuosmanen and Kortelainen [23], [CNLS] may not generate a unique optimal solution in terms of \({a}_{i}\) and \({b}_{ij}\). However, the fitted values, \({\widehat{y}}_{i}={\widehat{a}}_{i}+\sum_{j=1}^{m}{\widehat{b}}_{ij}{x}_{ij}\), are unique, so a lower concave envelope of the estimated function can be calculated. Kuosmanen and Kortelainen [23] proposed to estimate this envelope by solving the following problem for each observation \(\widehat{i}\), using the \({\widehat{y}}_{i}\) estimated by [CNLS].


[CNLS-LCEi]

$${\mathrm{min}}_{{a}_{\widehat{i}},{b}_{\widehat{i}j}}\left\{{a}_{\widehat{i}}+\sum_{j=1}^{m}{b}_{\widehat{i}j}{x}_{\widehat{i}j}\;\middle|\;{a}_{\widehat{i}}+\sum_{j=1}^{m}{b}_{\widehat{i}j}{x}_{ij}\ge {\widehat{y}}_{i}\quad \forall i\right\},$$

where \({a}_{\widehat{i}}\) and \({b}_{\widehat{i}j}\) form the unique optimal hyperplane representation, which may be distinct from the (non-unique) \(({\widehat{a}}_{i},{\widehat{b}}_{ij})\) estimated in [CNLS]. The resulting model (a set of unique hyperplanes) of [CNLS-LCEi] is used for the forecasting process. The number of unique hyperplanes is typically smaller than the number of observations.
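A matching sketch of the envelope problem, again with our own helper name and keeping the monotonicity sign restriction of (4); the convex branch anticipates 6(d) of Sect. 3.1:

```python
def cnls_lce(X, y_hat, concave=True):
    """[CNLS-LCEi]: for each observation i_hat, find the tightest single
    hyperplane that majorizes (concave case) or, per 6(d) of Sect. 3.1,
    minorizes (convex case) all CNLS fitted values y_hat.
    Returns one (a, b) pair per observation; dropping duplicates gives
    the smaller set of unique hyperplanes used for forecasting."""
    n, m = X.shape
    planes = []
    for i_hat in range(n):
        a = cp.Variable()
        b = cp.Variable(m, nonneg=True)       # keep monotonicity as in (4)
        values = a + X @ b                    # hyperplane at every observed x_i
        if concave:
            prob = cp.Problem(cp.Minimize(a + X[i_hat] @ b), [values >= y_hat])
        else:
            prob = cp.Problem(cp.Maximize(a + X[i_hat] @ b), [values <= y_hat])
        prob.solve()
        planes.append((float(a.value), b.value))
    return planes
```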

CNLS is nonparametric, yields a nonlinear model, and provides better estimation performance than ordinary least squares (OLS) [17, 23, 24, 25]; these advantages carry over to Fuzzy-CNLS.

3 Fuzzy-CNLS Regression Method

In this section, we derive the Fuzzy-CNLS regression method, discuss the issue of shape constraints, and provide a forecasting process with Fuzzy-CNLS models.

The main question is how to employ the CNLS method in fuzzy regression analysis. We observe that the Diamond method admits the CNLS method because [Diamond] decomposes into three individual OLS problems, [Diamond-c], [Diamond-L], and [Diamond-R], shown below. A similar approach can be found in [20].


[Diamond-c]

$$\underset{{a}_{c},{b}_{cj},{\varepsilon }_{i}^{c}}{\mathrm{min}}\left\{\sum_{i=1}^{n}{({\varepsilon }_{i}^{c})}^{2}|{y}_{ci}={a}_{c}+\sum_{j=1}^{m}({{b}_{c})}_{j}{x}_{ij}+{\varepsilon }_{i}^{c} \forall i\right\}$$

[Diamond-L]

$$\underset{{a}_{L},{b}_{Lj},{\varepsilon }_{i}^{L}}{\mathrm{min}}\left\{\sum_{i=1}^{n}{({\varepsilon }_{i}^{L})}^{2}|{y}_{Li}={a}_{L}+\sum_{j=1}^{m}({{b}_{L})}_{j}{x}_{ij}+{\varepsilon }_{i}^{L} \forall i\right\}$$

[Diamond-R]

$$\underset{{a}_{R},{b}_{Rj},{\varepsilon }_{i}^{R}}{\mathrm{min}}\left\{\sum_{i=1}^{n}{({\varepsilon }_{i}^{R})}^{2}|{y}_{Ri}={a}_{R}+\sum_{j=1}^{m}({{b}_{R})}_{j}{x}_{ij}+{\varepsilon }_{i}^{R} \forall i\right\}$$

[Diamond-c] is the regression model for the center point, while [Diamond-L] and [Diamond-R] are for the left and right endpoints, respectively.

For brevity, we define [Diamond-(K)], \(K=\{c,L,R\}\), to represent the three Diamond models above.


[Diamond-(K)]

$$\underset{{a}_{K},{b}_{Kj},{\varepsilon }_{i}^{K}}{\mathrm{min}}\left\{\sum_{i=1}^{n}{({\varepsilon }_{i}^{K})}^{2}|{y}_{Ki}={a}_{K}+\sum_{j=1}^{m}({{b}_{K})}_{j}{x}_{ij}+{\varepsilon }_{i}^{K} \forall i\right\}$$
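Since each [Diamond-(K)] is an ordinary least squares problem, the decomposition can be sketched in a few lines (the helper name diamond_fit is our own):

```python
def diamond_fit(X, Yc, YL, YR):
    """Solve [Diamond-(K)] for K in {c, L, R} as three independent OLS
    problems. X: (n, m) crisp inputs; Yc, YL, YR: (n,) vectors of the
    center, left-endpoint, and right-endpoint outputs."""
    n = X.shape[0]
    Xd = np.hstack([np.ones((n, 1)), X])      # prepend an intercept column
    coef = {}
    for K, y in zip("cLR", (Yc, YL, YR)):
        beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
        coef[K] = (beta[0], beta[1:])         # (a_K, (b_K)_1..(b_K)_m)
    return coef
```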

Note that Diamond used OLS to approximate the fuzzy regression functions of the center, left endpoint, and right endpoint. Instead of OLS, we propose to use CNLS to approximate these fuzzy regression functions. We consider the regression in the form \({y}_{K}={f}_{K}\left(\mathbf{x}\right)+{\varepsilon }^{K}\) for each \(K=\{c,L,R\}\). Applying [CNLS] to each \(K\), we have


[Fuzzy-CNLS-(K)]

$$\underset{{a}_{Ki},({{b}_{K})}_{ij},{\varepsilon }_{i}^{K}}{\mathrm{min}}\sum_{i=1}^{n}{({\varepsilon }_{i}^{K})}^{2}$$
$$s.t.\;y_{{Ki}} = a_{{Ki}} + \sum\limits_{{j = 1}}^{m} ( b_{K} )_{{ij}} x_{{ij}} + \varepsilon _{i}^{K} \;{\text{for}}\;i = 1, \ldots ,n ,$$
(5a)
$$a_{{Ki}} + \sum\limits_{{j = 1}}^{m} ( b_{K} )_{{ij}} x_{{ij}} \le a_{{Kh}} + \sum\limits_{{j = 1}}^{m} ( b_{K} )_{{hj}} x_{{ij}} ,\;{\text{for}}\;i,h = 1, \ldots ,n,\;{\text{and}}\;i \ne h .$$
(5b)
$$(b_{K} )_{{ij}} \ge 0,\;{\text{for}}\;i = 1, \ldots ,n;\;j = 1, \ldots ,m .$$
(5c)

Like [CNLS], [Fuzzy-CNLS-(K)] may not generate a unique optimal solution, and we calculate a lower concave envelope for the estimated function using the following model.

[Fuzzy-CNLS-(K)-LCEi]

$${\mathrm{min}}_{{a}_{K\widehat{i}},({b}_{K})_{\widehat{i}j}}\left\{{a}_{K\widehat{i}}+\sum_{j=1}^{m}({b}_{K})_{\widehat{i}j}{x}_{\widehat{i}j}\;\middle|\;{a}_{K\widehat{i}}+\sum_{j=1}^{m}({b}_{K})_{\widehat{i}j}{x}_{ij}\ge {\widehat{y}}_{Ki}\quad \forall i\right\},$$
(5d)

where \({a}_{K\widehat{i}}\) and \(({b}_{K})_{\widehat{i}j}\) form the unique optimal hyperplane representation, which may be distinct from the \(({a}_{Ki},{({b}_{K})}_{ij})\) estimated in [Fuzzy-CNLS-(K)]. Consequently, we can use [Fuzzy-CNLS-(K)] to conduct a fuzzy regression analysis, and the resulting fuzzy regression model, found by solving [Fuzzy-CNLS-(K)-LCEi], is nonlinear. That is,

$$\left\{\begin{array}{l}{a}_{c\widehat{i}}+\sum_{j=1}^{m}({b}_{c})_{\widehat{i}j}{x}_{ij}\quad \forall \widehat{i},\\ {a}_{L\widehat{i}}+\sum_{j=1}^{m}({b}_{L})_{\widehat{i}j}{x}_{ij}\quad \forall \widehat{i},\\ {a}_{R\widehat{i}}+\sum_{j=1}^{m}({b}_{R})_{\widehat{i}j}{x}_{ij}\quad \forall \widehat{i}.\end{array}\right.$$
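Putting the two stages together, a minimal sketch of the full Fuzzy-CNLS fit, reusing cnls_fit and cnls_lce from the Sect. 2.3 sketches (the per-K shape flags anticipate Sect. 3.1, and the dictionary layout is our own convention):

```python
def fuzzy_cnls(X, Y, shapes):
    """Fit [Fuzzy-CNLS-(K)] followed by [Fuzzy-CNLS-(K)-LCEi].

    Y: dict mapping K in {'c', 'L', 'R'} to the (n,) output vector;
    shapes: dict mapping K to True (concavity model) or False
    (convexity model, Sect. 3.1). Returns, per K, the hyperplanes that
    define the resulting piecewise-linear fuzzy regression model."""
    model = {}
    for K in ("c", "L", "R"):
        _, _, y_hat = cnls_fit(X, Y[K], concave=shapes[K])   # Sect. 2.3 sketch
        model[K] = cnls_lce(X, y_hat, concave=shapes[K])
    return model
```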

Note that [Fuzzy-CNLS-(K)] inherits properties from both the [Diamond] and [CNLS] methods. Table 1 shows the differences and similarities between the Diamond and Fuzzy-CNLS methods.

Table 1 Differences and similarities of the Diamond and Fuzzy-CNLS methods

According to [24], the coefficient of determination of [CNLS] is no less than that of OLS (\({R}_{OLS}^{2}\le {R}_{CNLS}^{2}\)). Hence, the goodness of fit of the Fuzzy-CNLS method is at least as good as that of the Diamond method, because each submodel of the Diamond method is an OLS.

3.1 The Issue of Shape Constraints

In Kuosmanen's development of CNLS [17], the application domain is concave productivity function forecasting, so the Afriat concavity inequalities (inequalities (3) of [CNLS]) are employed. However, when we apply [Fuzzy-CNLS-(K)], convex regression functions may be involved. Lee et al. [25] mentioned these concavity and convexity constraints in the context of CNLS. To capture convexity instead of concavity, the inequalities (5b) are modified by changing "\(\le\)" to "\(\ge\)". That is,

$$a_{{Ki}} + \sum\limits_{{j = 1}}^{m} ( b_{K} )_{{ij}} x_{{ij}} \ge a_{{Kh}} + \sum\limits_{{j = 1}}^{m} ( b_{K} )_{{hj}} x_{{ij}} \;{\text{for}}\;i,h = 1, \ldots ,n\;{\text{and}}\;i \ne h.$$
(6b)

Accordingly, the [Fuzzy-CNLS-(K)-LCEi] is also changed to

$${\mathrm{max}}_{{a}_{K\widehat{i}},({b}_{K})_{\widehat{i}j}}\left\{{a}_{K\widehat{i}}+\sum_{j=1}^{m}({b}_{K})_{\widehat{i}j}{x}_{\widehat{i}j}\;\middle|\;{a}_{K\widehat{i}}+\sum_{j=1}^{m}({b}_{K})_{\widehat{i}j}{x}_{ij}\le {\widehat{y}}_{Ki}\quad \forall i\right\}.$$
(6d)

3.2 Development of Fuzzy Nonlinear Regression Model by Fuzzy-CNLS

[Fuzzy-CNLS-c], [Fuzzy-CNLS-L], and [Fuzzy-CNLS-R] may need different concavity or convexity constraints to estimate the concavity or the convexity of the corresponding regression functions.

We call [Fuzzy-CNLS-(K)] with 5(a), 5(b), 5(c) the concavity model and [Fuzzy-CNLS-(K)] with 5(a), 6(b), 5(c) the convexity model.

This subsection describes in detail the calculation steps for developing a fuzzy nonlinear regression model with [Fuzzy-CNLS-(K)], [Fuzzy-CNLS-(K)-LCEi], and the two types of shape constraints.

First, for \(K=\{c,L,R\}\), solve both the concavity model and the convexity model. This yields eight [Fuzzy-CNLS] models with different combinations of the concavity and convexity models, shown in Table 2 below.

Table 2 Eight [Fuzzy-CNLS] models with different combinations of the shape constraints

Second, the similarity of each fuzzy nonlinear regression model is calculated. The model with the highest similarity score is selected.

Third, use 5(a) of the corresponding concavity or convexity model to obtain the hyperplanes of the Center, the Left, and the Right models.

Fourth, [Fuzzy-CNLS-(K)-LCEi] with 5(d) or 6(d) is used to calculate the resulting fuzzy nonlinear regression model, depending on whether [Fuzzy-CNLS-(K)] is a concavity or a convexity model, respectively. Note that the fitted values of the hyperplanes 5(a) of [Fuzzy-CNLS-(K)] are used as input to [Fuzzy-CNLS-(K)-LCEi]. For example, if [CCV] has the largest similarity, [Fuzzy-CNLS-(K)-LCEi] with 5(d) is used to calculate the Center and Left models and with 6(d) to calculate the Right model. The similarity calculation is discussed in Sect. 4; a sketch of the whole selection loop follows.
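As a hedged sketch of these steps (all helper names are our own), the following loop reuses fuzzy_cnls from the Sect. 3 sketch, forecast from Sect. 3.3, similarity from Sect. 4, and to_spread from Sect. 2, and enumerates the eight combinations of Table 2:

```python
import itertools

def select_fuzzy_cnls(X, Y):
    """Steps 1-2 of Sect. 3.2: fit all eight shape-constraint
    combinations of Table 2 and keep the model with the highest
    average similarity over the observations."""
    obs = list(zip(Y["c"], Y["L"], Y["R"]))          # observed endpoint triples
    best = (-np.inf, None, None)
    for flags in itertools.product((True, False), repeat=3):  # True = concave
        shapes = dict(zip("cLR", flags))
        model = fuzzy_cnls(X, Y, shapes)
        scores = [similarity(to_spread(forecast(model, shapes, x)),
                             to_spread(o))
                  for x, o in zip(X, obs)]           # Eq. (8), spread format
        if np.mean(scores) > best[0]:
            best = (np.mean(scores), model, shapes)
    return best    # (similarity score, hyperplane model, shape flags)
```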

Figure 1 provides the information flow of the Fuzzy-CNLS calculation steps in developing fuzzy nonlinear regression models.

Fig. 1 Information flow of the Fuzzy-CNLS

3.3 Forecasting Process with Fuzzy-CNLS Regression Models

In [CNLS], the forecasting process for a given observation is described in [24] for concavity models. In short, let \({x}_{tj}\) be the observed inputs of a new observation \(t\). We first compute the fitted value \({\widehat{y}}_{ti}={\widehat{a}}_{i}+\sum_{j=1}^{m}{\widehat{b}}_{ij}{x}_{tj}\) of every hyperplane \(i\). Then, the forecast is

$${\widehat{y}}_{t}=\mathrm{min}\left\{{\widehat{y}}_{ti}|{\widehat{y}}_{ti}={\widehat{a}}_{i}+\sum_{j=1}^{m}{\widehat{b}}_{ij}{x}_{tj} \forall i\right\}.$$
(7a)

See Fig. 2 for an illustration of the forecasting result for a regression model with one input and four hyperplanes.

Fig. 2 Illustration of the forecast \(\widehat{{y}_{t}}\) for a regression model with one input and four hyperplanes (concavity model)

In Fuzzy-CNLS, we apply this CNLS forecasting process to the hyperplanes from [Fuzzy-CNLS-(K)-LCEi] for K = {c, L, R} to find \({\widehat{\widetilde{y}}}_{t}=( {\widehat{y}}_{ct}, {\widehat{y}}_{Lt}, {\widehat{y}}_{Rt})\) for concavity models. For convexity models, we use

$${\widehat{y}}_{t}=\mathrm{max}\left\{{\widehat{y}}_{ti}|{\widehat{y}}_{ti}={\widehat{a}}_{i}+\sum_{j=1}^{m}{\widehat{b}}_{ij}{x}_{tj} \forall i\right\}$$
(7b)

For example, if [CCV] has the largest similarity, 7(a) is used for the Center and Left models and 7(b) for the Right model. Also, see Fig. 1 for reference.
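A combined sketch of (7a) and (7b), reusing the hyperplane dictionary of the Sect. 3 sketches (the helper name forecast is our own):

```python
def forecast(model, shapes, x_t):
    """Forecast the fuzzy response at x_t via (7a)/(7b): evaluate every
    hyperplane of each component K and take the minimum for a concavity
    model or the maximum for a convexity model."""
    y_t = []
    for K in ("c", "L", "R"):
        vals = [a + b @ x_t for a, b in model[K]]          # one value per hyperplane
        y_t.append(min(vals) if shapes[K] else max(vals))  # 7(a) if concave, 7(b) if convex
    return tuple(y_t)   # endpoint format (y_c, y_L, y_R)
```

For instance, in the worked example of Sect. 5.1.1, the Center component reduces to max{30.896, 29.639} = 30.896 because the Center there is a convexity model.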

4 The Goodness of Fit (Similarity-Distance Measure)

To evaluate the performance of fuzzy regression results, several popular approaches can be found in the literature: goodness-of-fit indices, similarity measures, and distance measures.

Although distance measures are more popular than similarity measures, Zeng et al. [26] derived the following formula to calculate the similarity between two triangular fuzzy numbers, \(\widetilde{a}\) and \(\widetilde{b}\), given in the spread format.

$$S\left(\widetilde{a},\widetilde{b}\right)=1-\frac{\left|{a}_{c}-{b}_{c}\right|+\left|{a}_{l}-{b}_{l}\right|+\left|{a}_{r}-{b}_{r}\right|}{\mathrm{max}\left({a}_{c}+{a}_{r}, {b}_{c}+{b}_{r}\right)-\mathrm{min}\left({a}_{c}- {a}_{l}, {b}_{c}- {b}_{l}\right)}$$
(8)

Zeng et al. proved that \(S\left(\widetilde{a},\widetilde{b}\right)\) satisfies:

  (P1) \(\widetilde{a}=\widetilde{b}\Leftrightarrow S\left(\widetilde{a},\widetilde{b}\right)=1\);

  (P2) \(S\left(\widetilde{a},\widetilde{b}\right)=S(\widetilde{b},\widetilde{a})\);

  (P3) \(\widetilde{a}\subseteq \widetilde{b}\subseteq \widetilde{c}\Rightarrow S\left(\widetilde{a},\widetilde{c}\right)\le \mathrm{min}\left\{S\left(\widetilde{a},\widetilde{b}\right),S\left(\widetilde{b},\widetilde{c}\right)\right\}\);

  (P4) \(S\left(\widetilde{a},{\widetilde{a}}^{c}\right)=0\) if \(\widetilde{a}\) is a crisp set.

Indeed, \(S\left(\widetilde{a},\widetilde{b}\right)\) also satisfies (P5), \(0\le S(\widetilde{a},\widetilde{b})\le 1\), as shown in Theorem 1.

Theorem 1

\(S\left(\widetilde{a},\widetilde{b}\right)\) is a similarity measure of triangular fuzzy numbers \(\widetilde{a}\) and \(\widetilde{b}\) with properties (P1) to (P5).

Proof

For (P1), (P2), (P3), and (P4), see Theorem 5 of Zeng et al. [26]. For (P5), \(0\le S(\widetilde{a},\widetilde{b})\le 1\), we prove that \(\mathrm{max}\left({a}_{c}+{a}_{r}, {b}_{c}+{b}_{r}\right)-\mathrm{min}({a}_{c}- {a}_{l}, {b}_{c}- {b}_{l})\ge \left|{a}_{c}-{b}_{c}\right|+\left|{a}_{l}-{b}_{l}\right|+|{a}_{r}-{b}_{r}|\) as follows. The term \(\mathrm{max}\left({a}_{c}+{a}_{r}, {b}_{c}+{b}_{r}\right)-\mathrm{min}({a}_{c}- {a}_{l}, {b}_{c}- {b}_{l})\) yields four scenarios: (i) \({a}_{c}+{a}_{r}-({b}_{c}- {b}_{l})\), (ii) \({b}_{c}+{b}_{r}-({b}_{c}- {b}_{l})\), (iii) \({b}_{c}+{b}_{r}-({a}_{c}- {a}_{l})\), and (iv) \({a}_{c}+{a}_{r}-({a}_{c}- {a}_{l})\).

$$\begin{aligned} {\text{For }}\left( {\text{i}} \right)a_{c} + a_{r} - & \left( {b_{c} - b_{l} } \right) - \left( {\left| {a_{c} - b_{c} } \right| + \left| {a_{l} - b_{l} } \right| + |a_{r} - b_{r} |} \right) \\ = & (a_{c} - b_{c} ) - \left| {a_{c} - b_{c} } \right| + a_{r} + b_{l} - \left( {\left| {a_{l} - b_{l} } \right| + |a_{r} - b_{r} |} \right) \\ \ge & a_{r} + b_{l} - \left( {\left| {a_{l} - b_{l} } \right| + \left| {a_{r} - b_{r} } \right|} \right) = (a_{r} - b_{r} ) + b_{r} + b_{l} - \left( {\left| {a_{l} - b_{l} } \right| + \left| {a_{r} - b_{r} } \right|} \right) \\ \ge & b_{r} + b_{l} - \left( {\left| {a_{l} - b_{l} } \right|} \right) = b_{r} + \left( {b_{l} - a_{l} } \right) + a_{l} - \left( {\left| {b_{l} - a_{l} } \right|} \right) \ge b_{r} + a_{l} \ge 0 \\ \end{aligned}$$
$$\begin{aligned} {\text{For}}({\text{ii}}),b_{c} + b_{r} - & (b_{c} - b_{l} ) - \left( {\left| {a_{c} - b_{c} } \right| + \left| {a_{l} - b_{l} } \right| + |a_{r} - b_{r} |} \right) \\ & \ge \left( {b_{c} + b_{r} } \right) - \left( {b_{c} - b_{l} } \right) - [\left( {b_{c} + b_{r} } \right) - \left( {a_{c} + a_{r} } \right)] - \left( {\left| {a_{c} - b_{c} } \right| + \left| {a_{l} - b_{l} } \right| + |a_{r} - b_{r} |} \right) \\ & = (a_{c} + a_{r} ) - \left( {b_{c} - b_{l} } \right) - \left( {\left| {a_{c} - b_{c} } \right| + \left| {a_{l} - b_{l} } \right| + |a_{r} - b_{r} |} \right) \ge 0 \\ \end{aligned}$$

The results for (iii) and (iv) follow similarly to (i) and (ii), respectively.

\(S\left(\widetilde{a},\widetilde{b}\right)\) is also based on the distance \(\left|{a}_{c}-{b}_{c}\right|+\left|{a}_{l}-{b}_{l}\right|+|{a}_{r}-{b}_{r}|\), which does not require intersection properties. Because \(S\left(\widetilde{a},\widetilde{b}\right)\) combines similarity and distance measure properties, we employ it to evaluate the empirical results of the current paper.
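For reference, a minimal sketch of Eq. (8), assuming both arguments are triangular fuzzy numbers in the spread format (the function name similarity matches the sketches of Sect. 3):

```python
def similarity(a, b):
    """Similarity S(a, b) of Eq. (8) between two triangular fuzzy
    numbers, each given in the spread format (center, left spread,
    right spread)."""
    (ac, al, ar), (bc, bl, br) = a, b
    numerator = abs(ac - bc) + abs(al - bl) + abs(ar - br)
    denominator = max(ac + ar, bc + br) - min(ac - al, bc - bl)
    return 1.0 - numerator / denominator
```

By Theorem 1, the returned value lies in [0, 1], with 1 attained exactly when the two fuzzy numbers coincide.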

5 Illustrative Example and Application

We use one example and one application to illustrate the proposed [Fuzzy-CNLS-(K)] model and compare the results with the [Diamond] method and the FLAR method of Zeng et al. [26]. All models and examples are coded in GAMS [27] with the PATHNLP solver and run on a PC. We compare against FLAR because it is becoming popular and we believe its performance is better than Diamond's. The formulations of FLAR are given in the supplementary material.

5.1 An Illustrative Example

This illustrative example, Example 1, is retrieved from [28]. Table 3 shows the data with one input \(({x}_{1})\), and Fig. 3 is the scatter diagram. Following the development steps in Sect. 3.2, we first calculated three concavity models ([Fuzzy-CNLS-(K)] with 5(a), 5(b), 5(c)) and three convexity models ([Fuzzy-CNLS-(K)] with 5(a), 6(b), 5(c)) for K = {c, L, R}.

Table 3 Data of Example 1
Fig. 3 Scatter plot of Example 1

Second, using Eq. (8), eight similarity scores are found and shown in Table 4 below.

Table 4 Similarity scores of Examples 1 and 2

Third, the selected [Fuzzy-CNLS] model (VCV), which has the largest similarity score, consists of two convexity models (the Center and the Right) and one concavity model (the Left). By 5(a) of the corresponding models, we have eight hyperplanes for each of the Center, the Left, and the Right, shown in rows 3–11 of Table 5. The last two rows present the results of the Diamond and FLAR models, respectively. The similarity of Fuzzy-CNLS (0.749) is much better than that of Diamond (0.543) and FLAR (0.593) in this example. Figure 4 shows the results of the Diamond method, which consists of three single hyperplanes.

Table 5 The results of the Fuzzy-CNLS analysis for Example 1
Fig. 4 Diamond results and the observations

Fourth, since the Center and the Right are convexity models, 6(d) of [Fuzzy-CNLS-(K)-LCEi] is used to calculate the hyperplanes of the corresponding regression models. On the other hand, the Left is a concavity model, so 5(d) of [Fuzzy-CNLS-(K)-LCEi] is used to find the hyperplanes representing its fuzzy nonlinear regression model. Tables 6, 7, and 8 summarize the hyperplanes of the Center, Left, and Right regression models, respectively. There are two, four, and five hyperplanes for the Center, Left endpoint, and Right endpoint, respectively.

Table 6 Fuzzy nonlinear regression model of Example 1 by [Fuzzy-CNLS-(K)-LCEi] for Center

5.1.1 An Instance of a Forecasting Process with Hyperplanes

The last columns of Tables 6, 7, and 8 give the forecasting results for \({x}_{t}=21\). For the Center, there are two hyperplanes, C1 and C2. We use \({\widehat{y}}_{Ct}={a}_{C}+{b}_{C}{x}_{t}\) to find the corresponding estimates: 30.896 by C1 and 29.639 by C2. Since the Center is a convexity model, the maximum of 7(b) is used, so \({\widehat{y}}_{Ct}=\mathrm{max}\left\{30.896, 29.639\right\}=30.896\). Similarly, we obtain \({\widehat{y}}_{Lt}=26.569\) and \({\widehat{y}}_{Rt}=37.251\). Then, \({\widehat{\widetilde{y}}}_{t}=\left({\widehat{y}}_{ct}, {\widehat{y}}_{Lt}, {\widehat{y}}_{Rt}\right)=(30.896, 26.569, 37.251)\).

Figure 5 presents the positions of the hyperplanes of Example 1. C1 and C2 are the hyperplanes of the Center; L1-L4 are the hyperplanes of the Left endpoint; R1-R5 are the hyperplanes of the Right endpoint. These hyperplanes are listed in Tables 6, 7 and 8. Note that different numbers of hyperplanes may be obtained for the Center, the Left, and the Right.

Table 7 Fuzzy nonlinear regression model of Example 1 by [Fuzzy-CNLS-(K)-LCEi] for Left
Table 8 Fuzzy nonlinear regression model of Example 1 by [Fuzzy-CNLS-(K)-LCEi] for Right
Fig. 5 Hyperplanes of Example 1 with observations

5.2 An Application

This example is retrieved from [20], in which the affordable levels of house prices in Shanghai (China) are analyzed. They considered a set of influential factors, including three policy factors (the mortgage interest rate (\({x}_{2}\)), the real estate tax (\({x}_{3}\)), and the down payment ratio (\({x}_{4}\))) and three non-policy factors (the housing size (\({x}_{1}\)), the annual household income (\({x}_{5}\)), and the family population (\({x}_{6}\))). 147 samples were collected, and the corresponding data table is given in the supplementary material (see S1).

We employed all 147 samples, and the resulting hyperplanes are provided in Tables S2, S3, S4 in the supplementary material. Figures 6 and 7 show the possible need for convexity constraints in estimating the regression functions of the Center, Left endpoint, and Right endpoint. As shown in the last column of Table 4, the [Fuzzy-CNLS] model (VVV) obtains the largest similarity; hence, convexity constraints are employed for all K. On the other hand, Zhou et al. [20] mentioned that the three policy factors (\({x}_{2}, {x}_{3}, {x}_{4}\)) and the family population (\({x}_{6}\)) are negatively correlated with the affordable levels of house prices. Hence, these variables are negated so that \(({b}_{K})_{ij}\ge 0\) holds in [Fuzzy-CNLS-(K)].

Fig. 6 Scatter plots of Y values for Example 2. *The horizontal axis is the observation number in ascending order of \(y_{c}\)

Fig. 7 Line plots of x values for Example 2. *The horizontal axis is the observation number in ascending order of \(y_{c}\)

After solving [Fuzzy-CNLS-(K)] and then [Fuzzy-CNLS-(K)-LCEi], we obtain 117, 128, and 108 hyperplanes for the Center, Left endpoint, and Right endpoint, respectively; see the last column of Table 9. The details of the hyperplanes are given in the supplementary material (Tables S2, S3, S4) and can be used for the forecasting process. For instance, the estimated \({\widehat{\widetilde{y}}}_{t}\) is (298.67, 180.3, 365.6) for a new sample (\({x}_{1}=70\), \({x}_{2}=4.9\), \({x}_{3}=0.6\), \({x}_{4}=35\), \({x}_{5}=30\), \({x}_{6}=4\)). The forecasting results based on Tables S2, S3, and S4 are shown in Table 9.

Table 9 Forecasting example and number of hyperplanes

For the similarity of different methods, we have 0.716, 0.719, and 0.783 for the FLAR, Diamond, and Fuzzy-CNLS, respectively.

As mentioned in [20], Zhou et al. treated two of the 147 samples as outliers based on their far-from-zero residual errors; hence, 145 samples were employed in their analyses. These two outliers are bolded in Table S1. It is therefore interesting to investigate empirically whether the similarities of Diamond, FLAR, and Fuzzy-CNLS for these 145 observations differ significantly from the similarities over all 147 observations (including the outliers). Table 10 summarizes the similarity scores of the different methods. The outliers do not affect the similarity significantly, and the performance of Fuzzy-CNLS remains better than that of Diamond and FLAR.

Table 10 Similarity results of Example 2

6 Conclusion

This paper proposes a new fuzzy convex nonparametric least squares method (Fuzzy-CNLS) for the crisp-input, triangular-fuzzy-output model. The method is nonparametric, so no prior specification of the functional form of the resulting regression function is required, and the resulting fuzzy regression model is nonlinear. We first separate Diamond's fuzzy least squares model into three submodels: Center, Left endpoint, and Right endpoint. Then, we employ a CNLS approach on each submodel to find the corresponding regression model. The original CNLS was developed with concavity shape constraints for productivity efficiency analysis. Since the regression function of a submodel may be convex instead of concave, we propose a second type of shape constraint (the convexity constraint) for submodels whose regression functions follow a convex pattern; indeed, in Examples 1 and 2 of the current paper, some estimated functions follow a convex pattern rather than a concave one. The two examples illustrate the use of the shape constraints, the better similarity scores, and the insensitivity of Fuzzy-CNLS to outliers.

The limitation of Fuzzy-CNLS comes from the very shape constraints that produce the advantages above: the Afriat inequalities generate on the order of \(n^{2}\) constraints, which incurs a computational burden. Lee et al. [25] and Mazumder et al. [29] have proposed efficient algorithms for this problem class. Further research is needed on applying these algorithms to the three Fuzzy-CNLS submodels, which share the same set of input variables. Other directions for future research are to extend Fuzzy-CNLS with ridge approaches, as in [30], and to other fuzzy regression settings, such as the fuzzy-input fuzzy-output model.