1 Introduction

Regression models are widely used in the social sciences, economics, and the life sciences to describe how an output (response) variable depends on input (explanatory) variables (Fahrmeir et al. [11]). The simplest regression model is the linear regression given by

$$y=\beta _1x_{1}+\cdots +\beta _kx_{k}+\epsilon,$$

where y is the output variable, \(x_{j}, j=1,\cdots ,k,\) are the input variables, \(\varvec{\beta }=(\beta _1,\cdots ,\beta _k)\) is the unknown vector of regression coefficients, and \(\epsilon\) is the random error. Estimating the unknown vector \(\varvec{\beta }\) is an important topic in regression analysis. In recent years, several shrinkage methods have been developed for this problem, and the lasso (least absolute shrinkage and selection operator) proposed by Tibshirani [28] is the most commonly used one. Following [28], the lasso estimate of \(\varvec{\beta }\) is obtained by imposing an \(L_1\) penalty on the least-squares error, that is, by solving the following optimization problem:

$$\mathop {\arg \min }_{\varvec{\beta }}\sum _{i=1}^n\Big (y_i-\sum _{j=1}^k\beta _jx_{ij}\Big )^2 +\lambda \sum _{j=1}^k{|\beta _j|},$$

where \((x_{i1},\cdots ,x_{ik};y_i)\) and \(\lambda >0\) denote the observed data and the tuning parameter, respectively. The lasso estimate reduces to the ordinary least-squares estimate as \(\lambda\) goes to zero, and all coefficients shrink to zero as \(\lambda\) goes to infinity. Moreover, the lasso estimate usually has smaller variance and higher prediction accuracy than the ordinary least-squares estimate when k is large. However, the lasso penalizes all coefficients equally, which is unfair to the important covariates. To remedy this, Zou [37] proposed the adaptive lasso, which assigns different weights to different coefficients. Following [37], the adaptive lasso estimate of \(\varvec{\beta }\) solves the following minimization problem:

$$\mathop {\arg \min }_{\varvec{\beta }}\sum _{i=1}^n\Big (y_i-\sum _{j=1}^k\beta _jx_{ij}\Big )^2+\lambda \sum _{j=1}^k{\omega _j|\beta _j|},$$

where the \(\omega _j\)s are positive, data-dependent weights. Compared with the lasso, the adaptive lasso has two advantages: near-minimax optimality and the oracle property [13, 37]. For more discussions of the lasso, the adaptive lasso, and other shrinkage methods, we refer the reader to Knight and Fu [20], Fan and Li [13], Zou and Hastie [38], Wang et al. [32], Hastie et al. [14], and the references therein.
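To fix ideas in the crisp setting, the following R sketch fits the adaptive lasso on simulated data using the glmnet package, whose penalty.factor argument implements coefficient-specific weights. The simulated data, the choice \(\tau =1\), and the OLS initial estimate are illustrative assumptions, not part of any formulation above.

```r
# Classical (crisp) adaptive lasso via glmnet's penalty.factor argument;
# a minimal sketch on simulated data, with tau = 1 (one of Zou's suggested values).
library(glmnet)

set.seed(1)
n <- 100; k <- 8
x <- matrix(rnorm(n * k), n, k)
beta <- c(3, 0, 2, rep(0, 5))            # a few large effects
y <- as.vector(x %*% beta + rnorm(n))

ols <- coef(lm(y ~ x))[-1]               # initial OLS estimate (intercept dropped)
tau <- 1
w <- 1 / abs(ols)^tau                    # data-dependent weights omega_j

cv  <- cv.glmnet(x, y, alpha = 1, penalty.factor = w)   # lambda by cross-validation
fit <- glmnet(x, y, alpha = 1, penalty.factor = w, lambda = cv$lambda.min)
coef(fit)                                # unimportant coefficients are exactly zero
```

Coefficients with large initial estimates receive small weights and are barely penalized, while coefficients near zero receive large weights and are shrunk to exactly zero.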

In real-life applications, we often encounter vague or imprecise data, such as “young,” “low,” and “about 5 inches.” To handle such data, Tanaka et al. [26] extended traditional linear regression to the fuzzy environment. Since then, a great number of studies on fuzzy regression analysis have appeared [2,3,4,5,6, 17, 18, 36]. Two approaches are mainly used to handle fuzzy regression problems: the possibilistic approach [26, 27] and the least-squares approach [2, 6, 9, 33]. However, only a few works apply shrinkage techniques to fuzzy regression analysis. Farnoosh et al. [12] proposed a ridge-type estimate for the fuzzy linear regression model with fuzzy inputs and fuzzy outputs. Recently, Hesamian and Akbari [15] extended the lasso technique to fuzzy linear regression and obtained a fuzzy lasso estimate of the unknown coefficients. The fuzzy lasso estimate performs well in prediction (estimation) and variable selection [15]. However, it also has some drawbacks: (a) it assigns the same weight to every coefficient, which produces biased estimates for the significant ones, and (b) the oracle properties do not hold for the lasso estimate [13, 37]. In this paper, we propose a fuzzy adaptive lasso estimate for fuzzy linear regression with crisp inputs and fuzzy outputs. The new estimate generalizes the fuzzy lasso estimate [15]. It penalizes each covariate according to its importance, that is, it places larger penalties on unimportant covariates and smaller penalties on important ones. Moreover, it achieves parameter estimation and variable selection simultaneously. Numerical studies show that the proposed fuzzy adaptive lasso estimate outperforms the fuzzy lasso estimate and several popular methods in prediction and variable selection.

The rest of this paper is organized as follows. We review some basic concepts and results on fuzzy numbers in Sect. 2. In Sect. 3, we introduce the fuzzy linear regression model and state three commonly used estimates. In Sect. 4, we propose the fuzzy adaptive lasso estimate for the fuzzy linear regression model and present the corresponding algorithm. To evaluate the performance of the proposed estimator, numerical experiments are reported in Sect. 5. Finally, concluding remarks are provided in Sect. 6.

2 Fuzzy Number, Fuzzy Arithmetic and Fuzzy Distance

In this section, we introduce some fundamental concepts of fuzzy numbers that will be used later. Moreover, some arithmetic operations and a distance metric between fuzzy numbers are presented. For more details, see [1, 8, 19].

Let \(\tilde{A}\) be a fuzzy set on \(\mathbb {R}\) (the real line) with membership function \(\mu _{\tilde{A}}\) from \(\mathbb {R}\) to the interval [0, 1]. The fuzzy set \(\tilde{A}\) is said to be a fuzzy number if it satisfies the properties of normality, convexity, and boundedness [1, 8]. The fuzzy number \(\tilde{A}\) is said to be an LR fuzzy number if its membership function has the form:

$$\begin{aligned} \mu _{\tilde{A}}(x)= {\left\{ \begin{array}{ll} L\left( \frac{a^C-x}{a^L}\right) &{} x\le a^C,\\ R\left( \frac{x-a^C}{a^U}\right) &{} x> a^C, \end{array}\right. } \end{aligned}$$

where L and R are two strictly decreasing, continuous functions from [0, 1] to [0, 1], and \(a^C\), \(a^L>0\), and \(a^U>0\) denote the center, left spread, and right spread of \(\tilde{A}\), respectively. For simplicity, the LR fuzzy number \(\tilde{A}\) defined above is denoted by \((a^L,a^C,a^U)_{LR}\). In particular, when \(L(x)=R(x)=1-x, \,0\le x\le 1\), \(\tilde{A}\) reduces to the triangular fuzzy number \(\tilde{A}=(a^L,a^C,a^U)_T\), whose membership function takes the form:

$$\begin{aligned} \mu _{\tilde{A}}(x)= {\left\{ \begin{array}{ll} \frac{x-(a^C-a^L)}{a^L}&{} a^C-a^L\le x\le a^C,\\ \frac{a^C+a^U-x}{a^U}&{} a^C\le x\le a^C+a^U,\\ 0&{} \text{ else }. \end{array}\right. } \end{aligned}$$

If \(a^L=a^U\), \(\tilde{A}\) is a symmetric triangular fuzzy number denoted by \((a^C,a^L)_{T}\). Moreover, if \(a^L=a^U=0\), it reduces to a crisp real number \(a^C\).

In the following, we state two algebraic operations on triangular fuzzy numbers [15, 21], namely addition \((\oplus )\) and scalar multiplication \((\otimes )\). Let \(\tilde{A}=(a^L,a^C,a^U)_{T}\) and \(\tilde{B}=(b^L,b^C,b^U)_{T}\) be two triangular fuzzy numbers and \(\rho\) a crisp number. Then, we have

$$\begin{aligned}&\tilde{A}\oplus \tilde{B}=(a^L+b^L,\,a^C+b^C,\,a^U+b^U)_T,\\&\rho \otimes \tilde{A}= {\left\{ \begin{array}{ll} (\rho a^L,\,\rho a^C,\,\rho a^U)_T &{} \text{ if } \rho \ge 0,\\ (-\rho a^U,\,\rho a^C,\,-\rho a^L)_T &{} \text{ if } \rho <0. \end{array}\right. } \end{aligned}$$
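As a small illustration, the following R sketch (our own helper functions, not taken from any reference) encodes a triangular fuzzy number as a vector of left spread, center, and right spread and implements the two operations above.

```r
# A triangular fuzzy number stored as (left spread, center, right spread);
# illustrative helpers for the two operations above.
tfn <- function(l, ctr, u) c(L = unname(l), C = unname(ctr), U = unname(u))

# Addition: left spreads, centers, and right spreads add componentwise.
tfn_add <- function(A, B) tfn(A["L"] + B["L"], A["C"] + B["C"], A["U"] + B["U"])

# Scalar multiplication: a negative scalar swaps the left and right spreads.
tfn_scale <- function(rho, A) {
  if (rho >= 0) tfn(rho * A["L"], rho * A["C"], rho * A["U"])
  else          tfn(-rho * A["U"], rho * A["C"], -rho * A["L"])
}

A <- tfn(0.5, 3, 0.4); B <- tfn(0.3, 2, 0.3)
tfn_add(A, B)       # (0.8, 5, 0.7)_T
tfn_scale(-2, A)    # (0.8, -6, 1.0)_T: spreads swapped and kept positive
```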

Yang and Ko [34] defined a distance between LR fuzzy numbers that takes into account the shape of the membership function. In Sect. 5, we use it to evaluate the closeness between the observed and estimated outputs. For two LR fuzzy numbers \(\tilde{A}=(a^L,a^C,a^U)_{LR}\) and \(\tilde{B}=(b^L,b^C,b^U)_{LR}\), the distance between \(\tilde{A}\) and \(\tilde{B}\) is defined as follows:

$$D(\tilde{A},\tilde{B})=\Big \{(a^C-b^C)^2+[(a^C-\eta _1a^L)-(b^C-\eta _1b^L)]^2+[(a^C+\eta _2a^U)-(b^C+\eta _2b^U)]^2\Big \}^{1/2},$$

where \(\eta _1=\int _{0}^1L^{-1}(x)dx\) and \(\eta _2=\int _{0}^1R^{-1}(x)dx\) [34]. When \(\tilde{A}\) and \(\tilde{B}\) are two triangular fuzzy numbers, the distance defined above reduces to

$$D(\tilde{A},\tilde{B})=\Big \{(a^C-b^C)^2+[(a^C-0.5a^L)-(b^C-0.5b^L)]^2+[(a^C+0.5a^U)-(b^C+0.5b^U)]^2\Big \}^{1/2}.$$
(2.1)

For more properties of the distance defined above, we refer the reader to [9, 10, 15, 34].
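For later use, the distance (2.1) is straightforward to code. The following R sketch (an illustrative helper of ours) takes each triangular fuzzy number as a vector (left spread, center, right spread).

```r
# Yang-Ko distance (Eq. 2.1) between two triangular fuzzy numbers,
# each given as a vector (left spread, center, right spread).
D_tfn <- function(A, B) {
  sqrt((A[2] - B[2])^2 +
       ((A[2] - 0.5 * A[1]) - (B[2] - 0.5 * B[1]))^2 +
       ((A[2] + 0.5 * A[3]) - (B[2] + 0.5 * B[3]))^2)
}
D_tfn(c(0.5, 3, 0.4), c(0.3, 2, 0.3))   # distance between (0.5,3,0.4)_T and (0.3,2,0.3)_T
```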

3 Fuzzy Linear Regression Model and Some Existing Estimates

Consider the following fuzzy linear regression model:

$$\tilde{y}_i=\oplus _{j=1}^k(x_{ij}\otimes {\tilde{\beta }}_j)\oplus {\tilde{\epsilon }}_i, \,i=1,\cdots ,n,$$
(3.1)

where \(\tilde{y}_i=(y_i^L,y_i^C,y_i^U)_T\) is the fuzzy response variable, \(x_{i1}, \cdots , x_{ik}\) are crisp explanatory variables, \({\tilde{\beta }}_j=(\beta _j^L,\beta _j^C,\beta _j^U)_T\) are the unknown fuzzy coefficients, and \({\tilde{\epsilon }}_i\) denotes the fuzzy error term. Let \(\tilde{y}_i^*=(y_i^{*L},y_i^{*C},y_i^{*U})_T=\oplus _{j=1}^k(x_{ij}\otimes {\tilde{\beta }}_j)\). From the arithmetic of triangular fuzzy numbers, we have, for \(1\le i\le n\),

$$\begin{aligned}&y_i^{*C}=\sum _{j=1}^k\beta _j^Cx_{ij},\\&y_i^{*L}=\sum _{j=1}^k[\gamma _{ij}\beta _j^Lx_{ij}-(1-\gamma _{ij})\beta _j^Ux_{ij}],\\&y_i^{*U}=\sum _{j=1}^k[\gamma _{ij}\beta _j^Ux_{ij}-(1-\gamma _{ij})\beta _j^Lx_{ij}], \end{aligned}$$

where \(\gamma _{ij}=I(x_{ij}>0)\) and \(I(\cdot )\) denotes the indicator function.
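In code, the decomposition above amounts to splitting each input into its positive and negative parts. The following R sketch (an illustrative helper reused in later sketches) computes \(\tilde{y}_i^*\) for all i at once.

```r
# Predicted fuzzy output y_i^* = (+)_j x_ij (x) beta_j for a crisp input matrix X
# and triangular coefficients; betaL, betaC, betaU are length-k vectors of
# left spreads, centers, and right spreads.
fuzzy_predict <- function(X, betaL, betaC, betaU) {
  xpos <- pmax(X, 0)                          # gamma_ij * x_ij
  xneg <- pmin(X, 0)                          # (1 - gamma_ij) * x_ij
  list(L = xpos %*% betaL - xneg %*% betaU,   # y*L: left spreads
       C = X    %*% betaC,                    # y*C: centers
       U = xpos %*% betaU - xneg %*% betaL)   # y*U: right spreads
}
```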

In the following, we review three popular estimates for the unknown coefficients in model (3.1).

3.1 Fuzzy Least-Squares Estimate (FLSE)

The most widely used method in linear regression analysis is the ordinary least-squares method. Diamond [7] generalized it to the fuzzy environment and obtained the fuzzy least-squares estimate (FLSE) of \(\tilde{\varvec{\beta }}=({\tilde{\beta }}_1,\cdots ,{\tilde{\beta }}_k)\) in model (3.1) as follows:

$$\begin{aligned} \hat{\tilde{\varvec{\beta }}}^{FLSE}=\mathop {\arg \min }_{\tilde{\varvec{\beta }}}\sum _{i=1}^n d_1^2(\tilde{y}_i,\tilde{y}_i^*), \end{aligned}$$
(3.2)

where

$$\begin{aligned} d_1(\tilde{y}_i,\tilde{y}_i^*)=\Big \{(y_i^C-y_i^{*C})^2+[(y_i^C-y_i^L)-(y_i^{*C}-y_i^{*L})]^2 +[(y_i^C+y_i^U)-(y_i^{*C}+y_i^{*U})]^2\Big \}^{1/2}. \end{aligned}$$

Although the FLSE is accurate and simple to compute, it performs poorly when outliers exist; even a single unusual value can greatly influence the estimate. For other fuzzy least-squares estimates for model (3.1), we refer the reader to Chachi [2], Coppi et al. [6], D’Urso [9], and Xu and Li [33].

3.2 Fuzzy Least Absolute Estimate (FLAE)

Since the least-squares estimate is sensitive to outliers, the least absolute method was developed to overcome this problem. Choi and Buckley [3] showed that the least absolute estimate is more efficient than the least-squares estimate when unusual values exist. Later, based on the distance

$$\begin{aligned} d_2(\tilde{y}_i,\tilde{y}_i^*)=|y_i^C-y_i^{*C}|+|y_i^L-y_i^{*L}| +|y_i^U-y_i^{*U}|, \end{aligned}$$
(3.3)

Zeng et al. [36] established a fuzzy least absolute estimate (FLAE) for \(\tilde{\varvec{\beta }}=({\tilde{\beta }}_1,\cdots ,\tilde{\beta }_k)\) as follows:

$$\begin{aligned} \hat{\tilde{\varvec{\beta }}}^{FLAE}=\mathop {\arg \min }_{\tilde{\varvec{\beta }}} \sum _{i=1}^nd_2(\tilde{y}_i,\tilde{y}_i^*). \end{aligned}$$

For other fuzzy least absolute estimates, see Choi and Buckley [3] and Taheri and Kelkinnama [25].

3.3 Fuzzy Lasso Estimate

The lasso is a shrinkage method that imposes an \(L_1\)-penalty to shrink coefficients toward zero; moreover, it performs variable selection simultaneously [28]. Recently, Hesamian and Akbari [15] applied the lasso technique to fuzzy linear regression and obtained the following estimate of \(\tilde{\varvec{\beta }}=(\tilde{\beta }_1,\cdots ,\tilde{\beta }_k)\):

$$\begin{aligned}&\hat{\tilde{\varvec{\beta }}}^{lasso}=\mathop {\arg \min }_{\tilde{\varvec{\beta }}}\sum _{i=1}^n D^2(\tilde{y}_i,\tilde{y}_i^*),\nonumber \\&s.t.\quad {\left\{ \begin{array}{ll} \sum _{j=1}^k |\beta _j^C| \le \lambda _1, &{} \\ \sum _{j=1}^k \beta _j^L \le \lambda _2, &{} \\ \sum _{j=1}^k \beta _j^U \le \lambda _3, &{} \\ \end{array}\right. } \end{aligned}$$
(3.4)

where the distance \(D(\cdot ,\cdot )\) is defined by Eq. (2.1) and \(\lambda _1, \lambda _2, \lambda _3\) are three positive tuning parameters determined by the cross-validation criterion [23]. For more properties of the lasso method, see Tibshirani [28] and Hastie et al. [14].

4 Fuzzy Adaptive Lasso Estimate

As stated in Sect. 1, the lasso technique is unfair to significant covariates because it penalizes all coefficients equally in the \(L_1\)-penalty. Hence, in the following, we extend the adaptive lasso technique (Zou [37]) to estimate \(\tilde{\varvec{\beta }}=(\tilde{\beta }_1,\cdots ,\tilde{\beta }_k)\), assigning larger penalties to unimportant covariates and smaller penalties to important ones.

Let \(\tau\) be a crisp positive number. Suppose that \(\hat{\tilde{\varvec{\beta }}}=(\hat{\tilde{\beta }}_1,\cdots ,\hat{\tilde{\beta }}_k)\) denotes a fuzzy least-squares estimate of \(\tilde{\varvec{\beta }}=(\tilde{\beta }_1,\cdots ,\tilde{\beta }_k)\) with \(\hat{\tilde{\beta }}_j=(\hat{\beta }_j^L,\hat{\beta }_j^C,\hat{\beta }_j^U)_T\). For example, we can use Diamond’s estimate \(\hat{\tilde{\varvec{\beta }}}^{FLSE}\) given in Eq. (3.2). The fuzzy adaptive lasso estimate \(\hat{\tilde{\varvec{\beta }}}^{alasso}\) is defined by

$$\begin{aligned} \hat{\tilde{\varvec{\beta }}}^{alasso}=\mathop {\arg \min }_{\tilde{\varvec{\beta }}}\sum _{i=1}^n D^2(\tilde{y}_i,\tilde{y}_i^*) +\lambda \sum _{j=1}^k\Big (w_{j1}|\beta _j^C| +w_{j2}\beta _j^L +w_{j3}\beta _j^U\Big ), \end{aligned}$$
(4.1)

where

$$\begin{aligned} w_{j1}=(1/|\hat{\beta }_{j}^C|)^{\tau }, \, w_{j2}= (1/\hat{\beta }_{j}^L)^{\tau }, \, w_{j3}=(1/\hat{\beta }_{j}^U)^{\tau },\,1\le j\le k, \end{aligned}$$

and \(\lambda\) is a tuning parameter. The two parameters \(\tau\) and \(\lambda\) will be determined later.
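As a sketch of how the objective in Eq. (4.1) can be evaluated in practice, the following R function (our illustrative code, reusing the fuzzy_predict helper sketched in Sect. 3) writes out the squared distances (2.1) plus the weighted penalty. Clamping the spread coordinates at zero is our implementation choice, since spreads must be nonnegative.

```r
# Penalized objective of Eq. (4.1); a minimal sketch. `theta` stacks
# (betaL, betaC, betaU); yL, yC, yU hold the observed spreads and centers;
# w1, w2, w3 are the weights of Eq. (4.1), computed from an initial FLSE.
alasso_objective <- function(theta, X, yL, yC, yU, lambda, w1, w2, w3) {
  k <- ncol(X)
  bL <- pmax(theta[1:k], 0)                 # spreads kept nonnegative
  bC <- theta[(k + 1):(2 * k)]
  bU <- pmax(theta[(2 * k + 1):(3 * k)], 0)
  p  <- fuzzy_predict(X, bL, bC, bU)
  # squared Yang-Ko distances (Eq. 2.1), written out componentwise
  sse <- sum((yC - p$C)^2 +
             ((yC - 0.5 * yL) - (p$C - 0.5 * p$L))^2 +
             ((yC + 0.5 * yU) - (p$C + 0.5 * p$U))^2)
  sse + lambda * sum(w1 * abs(bC) + w2 * bL + w3 * bU)
}
# One possible minimizer: a derivative-free call to optim(), e.g.
# optim(theta0, alasso_objective, X = X, yL = yL, yC = yC, yU = yU,
#       lambda = lambda, w1 = w1, w2 = w2, w3 = w3, method = "Nelder-Mead")
```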

Remark 4.1

Although we recommend applying Diamond’s estimate \(\hat{\tilde{\varvec{\beta }}}^{FLSE}\) given in Eq. (3.2), other least-squares estimates may be used [6, 9, 33]. Moreover, if the input variables are highly collinear, the ridge estimate (Hong and Hwang [16]) could be used because of its stability.

Remark 4.2

Determining the tuning parameters \(\tau\) and \(\lambda\) is a crucial problem in the adaptive lasso procedure [37]. In the numerical studies, we use a two-dimensional cross-validation procedure [23] to choose their optimal values. In addition, the parameter \(\tau\) is selected from the set \(\{0.5,1,2\}\), as suggested by Zou [37].

4.1 Goodness-of-Fit Measures

Suppose that \(\hat{\tilde{y}}_i\) denotes the estimated value of \(\tilde{y}_i\). To assess the prediction accuracy of the fuzzy adaptive lasso estimate, we use the following three widely used measures [15, 36].

  • Mean-Square Error (MSE):

    $$\begin{aligned} MSE=\frac{1}{n}\sum _{i=1}^n D^2(\tilde{y}_i,\hat{\tilde{y}}_i), \end{aligned}$$

    where the distance \(D(\cdot ,\cdot )\) is defined by Eq. (2.1).

  • Mean Absolute Error (MAE):

    $$\begin{aligned} MAE=\frac{1}{n}\sum _{i=1}^n d_2(\tilde{y}_i,\hat{\tilde{y}}_i), \end{aligned}$$

    where the distance \(d_2(\cdot ,\cdot )\) is defined by Eq. (3.3).

  • Mean Similarity Measure (MSM):

    $$\begin{aligned} MSM=\frac{1}{n}\sum _{i=1}^n S(\tilde{y}_i,\hat{\tilde{y}}_i), \end{aligned}$$
    (4.2)

where the similarity measure \(S(\cdot ,\cdot )\) is defined as follows:

$$\begin{aligned} S(\tilde{A},\tilde{B})=1-\frac{d_2(\tilde{A},\tilde{B})}{\max (a^C+a^U,b^C+b^U)-\min (a^C-a^L,b^C-b^L)} \end{aligned}$$

with \(\tilde{A}=(a^L,a^C,a^U)_T\), \(\tilde{B}=(b^L,b^C,b^U)_T\). This measure is also used by Zeng et al. [36]. In addition, it satisfies that (i) \(S(\tilde{A},\tilde{B})=S(\tilde{B},\tilde{A})\), (ii) \(0\le S(\tilde{A},\tilde{B})\le 1\) and \(S(\tilde{A},\tilde{B})=1\) if and only if \(\tilde{A}=\tilde{B}\) [36].

Note that smaller values of MSE and MAE indicate smaller total estimation errors and better prediction performance. Conversely, a larger value of MSM indicates better prediction performance.
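For completeness, the three measures can be computed jointly; the following R sketch (an illustrative helper) takes the observed and fitted spreads and centers as vectors.

```r
# MSE, MAE, and MSM for observed (yL, yC, yU) and fitted (hL, hC, hU)
# triangular outputs; a minimal sketch using Eqs. (2.1) and (3.3).
fit_measures <- function(yL, yC, yU, hL, hC, hU) {
  D2 <- (yC - hC)^2 +
        ((yC - 0.5 * yL) - (hC - 0.5 * hL))^2 +
        ((yC + 0.5 * yU) - (hC + 0.5 * hU))^2        # squared distance (2.1)
  d2 <- abs(yC - hC) + abs(yL - hL) + abs(yU - hU)   # distance (3.3)
  S  <- 1 - d2 / (pmax(yC + yU, hC + hU) - pmin(yC - yL, hC - hL))  # similarity
  c(MSE = mean(D2), MAE = mean(d2), MSM = mean(S))
}
```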

4.2 Algorithm for the Estimate \(\hat{\tilde{\varvec{\beta }}}^{alasso}\)

In the following, we state the algorithm of the fuzzy adaptive lasso method, in which the optimal values of \(\lambda\) and \(\tau\) are also determined. The flowchart of the proposed method is given in Fig. 1.

Fig. 1 Flowchart for fuzzy adaptive lasso estimate
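Since the algorithm appears only as a flowchart, the following R sketch reconstructs the tuning loop of Remark 4.2 under stated assumptions: fit_alasso and predict_alasso are hypothetical wrappers around a minimizer of Eq. (4.1) (e.g., optim on the objective sketched above), and the \(\lambda\) grid and fold number \(K=5\) are our choices, not prescribed by the method.

```r
# A hedged reconstruction of the tuning loop in Fig. 1 (not the original code):
# two-dimensional cross-validation over (tau, lambda), as in Remark 4.2.
# fit_alasso() and predict_alasso() are assumed, hypothetical wrappers that
# minimize Eq. (4.1) and return fitted spreads/centers (list with $L, $C, $U).
cv_tune <- function(X, yL, yC, yU, taus = c(0.5, 1, 2),
                    lambdas = 10^seq(-3, 1, length.out = 20), K = 5) {
  n <- nrow(X)
  fold <- sample(rep(1:K, length.out = n))   # random K-fold assignment
  best <- list(err = Inf)
  for (tau in taus) for (lambda in lambdas) {
    err <- 0
    for (kk in 1:K) {
      tr <- fold != kk; te <- !tr
      fit <- fit_alasso(X[tr, ], yL[tr], yC[tr], yU[tr], lambda, tau)
      p   <- predict_alasso(fit, X[te, ])
      err <- err + sum((yC[te] - p$C)^2 +    # held-out squared distance (2.1)
                       ((yC[te] - 0.5 * yL[te]) - (p$C - 0.5 * p$L))^2 +
                       ((yC[te] + 0.5 * yU[te]) - (p$C + 0.5 * p$U))^2)
    }
    if (err < best$err) best <- list(err = err, tau = tau, lambda = lambda)
  }
  best   # the (tau, lambda) pair with the smallest cross-validated error
}
```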

5 Numerical Examples

In this section, we evaluate the performance of the fuzzy adaptive lasso estimate. In Subsect. 5.1, we compare the prediction accuracy of the new estimate with several commonly used methods, including Diamond [7] (FLSE), Coppi et al. [6], Taheri and Kelkinnama [25], Zeng et al. [36] (FLAE), and Hesamian and Akbari [15] (fuzzy lasso). The three goodness-of-fit measures (MSE, MAE, MSM) introduced in Subsect. 4.1 are calculated to assess prediction accuracy. In Subsect. 5.2, we illustrate the performance of the fuzzy adaptive lasso estimate in variable selection. All experiments were implemented in the R language [22].

5.1 Prediction Accuracy of Fuzzy Adaptive Lasso Estimate

Example 5.1

(A few large effects) Consider the following fuzzy linear regression model:

$$\begin{aligned} \tilde{y}_i=\oplus _{j=1}^{8}(x_{ij}\otimes \tilde{\beta }_j)\oplus \tilde{\epsilon }_i, \, i=1,\cdots ,n, \end{aligned}$$

where the true regression coefficients are \(\tilde{\beta }_1=(0.5,3,0.4)_T, \, \tilde{\beta }_3=(0.3,2,0.3)_T\), \(\tilde{\beta }_2=\tilde{\beta }_4=\tilde{\beta }_7=\tilde{\beta }_8=(0,0,0)_T\), \(\tilde{\beta }_5=(0.001,0,0)_T\), and \(\tilde{\beta }_6=(0,0,0.001)_T\). The crisp inputs \(\varvec{x}_i=(x_{i1},\cdots ,x_{i8}), i=1,\cdots ,n\), are iid normal vectors with mean zero and covariance matrix \(\Sigma =(\sigma _{ij})_{8\times 8}\), \(\sigma _{ij}=0.5^{|i-j|}\). In addition, the fuzzy error term is \(\tilde{\epsilon }_i=(\epsilon _i^L,\epsilon _i^C,\epsilon _i^U)_T\) with \(\epsilon _i^L\sim N(0,\sigma _1^2)\), \(\epsilon _i^C\sim N(0,\sigma _2^2),\) and \(\epsilon _i^U\sim N(0,\sigma _3^2)\). We set \(\varvec{\sigma }=(\sigma _1,\sigma _2,\sigma _3)\) to (0.5, 1, 0.5), (0.2, 0.4, 0.2), and (0.1, 0.2, 0.1).
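The data-generating step can be sketched in R as follows. MASS::mvrnorm draws the AR(0.5)-correlated inputs; taking absolute values of the normal draws for the spread errors is our reading, since left and right spreads must stay nonnegative.

```r
# Data generation for Example 5.1; a sketch reusing the fuzzy_predict helper
# from Sect. 3. The absolute values on the spread errors are our assumption.
library(MASS)

set.seed(123)
n <- 40; k <- 8
Sigma <- 0.5^abs(outer(1:k, 1:k, "-"))        # sigma_ij = 0.5^|i-j|
X <- mvrnorm(n, mu = rep(0, k), Sigma = Sigma)

betaC <- c(3, 0, 2, 0, 0, 0, 0, 0)            # centers of the true coefficients
betaL <- c(0.5, 0, 0.3, 0, 0.001, 0, 0, 0)    # left spreads
betaU <- c(0.4, 0, 0.3, 0, 0, 0.001, 0, 0)    # right spreads

sig <- c(0.5, 1, 0.5)                         # (sigma_1, sigma_2, sigma_3)
p  <- fuzzy_predict(X, betaL, betaC, betaU)
yL <- p$L + abs(rnorm(n, 0, sig[1]))          # observed left spreads
yC <- p$C + rnorm(n, 0, sig[2])               # observed centers
yU <- p$U + abs(rnorm(n, 0, sig[3]))          # observed right spreads
```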

In this example, we generated \(N=100\) simulated datasets with \(n=20\) and 40. We calculated the mean MSE, MAE, and MSM of the fuzzy adaptive lasso estimate and the five other popular methods. The results are summarized in Table 1, from which we make the following observations. First, in all cases, the MSE and MAE values of the fuzzy adaptive lasso estimate are smaller than those of the other methods, so the proposed method performs best in terms of the MSE and MAE measures. Second, according to the MSM measure, Zeng et al.’s method outperforms the other five estimates. Finally, all methods perform better as the variances decrease, which is consistent with the traditional statistical case [37].

Besides the criteria introduced in Subsect. 4.1, we also consider the general defuzzification method introduced by Sugeno [24], called “the center of gravity.” This defuzzification method transforms fuzzy quantities into crisp ones. From [24], the center of gravity of a triangular fuzzy number \(\tilde{A}=(a^L,a^C,a^U)_T\) is

$$G_{\tilde{A}}=a^C+\frac{1}{3}(a^U-a^L).$$
(5.1)
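In R, Eq. (5.1) is a one-liner (an illustrative helper):

```r
# Center of gravity (Eq. 5.1) of a triangular fuzzy number (aL, aC, aU)_T.
cog <- function(aL, aC, aU) aC + (aU - aL) / 3
cog(0.5, 3, 0.4)   # defuzzified value of (0.5, 3, 0.4)_T
```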

The centers of gravity of the observed and estimated outputs for Example 5.1 with \(n=20\) and 40 are plotted in Fig. 2. The figure suggests that the estimated outputs agree with the observed outputs very closely.

Table 1 Comparison of the performances of fuzzy adaptive lasso estimate and five popular methods for Example 5.1 with \(n=20, 40\)
Fig. 2 The centers of gravity for the observed outputs and the estimated ones of the 1st simulated dataset in Example 5.1 with \(n=20\) (left) and \(n=40\) (right)

Example 5.2

(Many small effects) In this example, we also consider fuzzy linear regression model

$$\begin{aligned} \tilde{y}_i=\oplus _{j=1}^{8}(x_{ij}\otimes \tilde{\beta }_j)\oplus \tilde{\epsilon }_i, \, i=1,\cdots ,n, \end{aligned}$$

where \(\varvec{x}_i=(x_{i1},\cdots ,x_{i8})\) and \(\tilde{\epsilon }_i=(\epsilon _i^L,\epsilon _i^C,\epsilon _i^U)_T\) are the same as in Example 5.1. We set the fuzzy coefficients to \(\tilde{\beta }_1=\tilde{\beta }_2=\cdots =\tilde{\beta }_8=(0.2,0.85,0.2)_T\). As in Example 5.1, we generated \(N=100\) datasets with \(n=20\) and 40. The three measures for the six methods are reported in Table 2. Furthermore, the centers of gravity of the observed and estimated outputs for this example are given in Fig. 3. From Table 2 and Fig. 3, we reach conclusions similar to those of Example 5.1.

Table 2 Comparison of the performances of fuzzy adaptive lasso estimate and five popular methods for Example 5.2 with \(n=20, 40\)
Fig. 3 The centers of gravity for the observed outputs and the estimated ones of the 1st simulated dataset in Example 5.2 with \(n=20\) (left) and \(n=40\) (right)

Example 5.3

To illustrate the application of the fuzzy adaptive lasso estimate, we consider an example from [33], which analyzed the effect of the composition of Portland cement on the heat evolved during hardening. The dataset is presented in Table 3. In this dataset, the symmetric triangular fuzzy output \(\tilde{y}_i=(y_i,e_i)_T\) denotes the heat evolved in calories per gram of cement, and the crisp inputs \(x_1, x_2,\) and \(x_3\) denote the amounts of tricalcium aluminate, tricalcium silicate, and tetracalcium aluminoferrite, respectively.

Table 3 Dataset in Example 5.3
Table 4 The estimated coefficients and three measures of four methods for Example 5.3

In this example, we employ the fuzzy adaptive lasso method, Hesamian and Akbari [15] (fuzzy lasso), Coppi et al. [6], and Zeng et al. [36] to estimate the unknown fuzzy coefficients. The estimated coefficients and the three measures (MSE, MAE, MSM) for these methods are reported in Table 4. From Table 4, no method universally dominates the others. Specifically, the MSE and MAE values of the fuzzy adaptive lasso estimate are 48.4402 and 5.9162, respectively, which are smaller than those of the other methods. In terms of the MSM measure, Zeng et al. [36] performs best.

Example 5.4

In this example, we consider the house price data previously studied by Tanaka et al. [26] and Choi [4]. The dataset is presented in Table 5. The symmetric triangular fuzzy output \(\tilde{y}_i=(y_i,e_i)_T\) denotes the house price (1000 yen), and the crisp inputs \(x_1, x_2,\) and \(x_3\) denote the first-floor space (m\(^2\)), the second-floor space (m\(^2\)), and the number of rooms, respectively. We again use the fuzzy adaptive lasso method, Hesamian and Akbari [15] (fuzzy lasso), Coppi et al. [6], and Zeng et al. [36] to estimate the unknown fuzzy coefficients. The corresponding results are presented in Table 6. In terms of MSE and MSM, the fuzzy adaptive lasso estimate performs best. According to the MAE measure, Zeng et al. [36] outperforms the other methods.

Table 5 Dataset in Example 5.4
Table 6 The estimated coefficients and three measures of four methods for Example 5.4

5.2 Variable Selection of Fuzzy Adaptive Lasso Estimate

Hesamian and Akbari [15] called a fuzzy coefficient “about 0” if it equals \((0,0,0)_T\), \((l,0,0)_T,\) or \((0,0,r)_T\). Moreover, the corresponding explanatory variable is said to be completely noninformative if its fuzzy coefficient is \((0,0,0)_T\), and strongly noninformative if its fuzzy coefficient is \((l,0,0)_T\) or \((0,0,r)_T\). According to this criterion, the explanatory variables \(x_2, x_4, x_7, x_8\) in Example 5.1 are completely noninformative, and \(x_5, x_6\) are strongly noninformative. In this subsection, we use this criterion to study the variable selection performance of the fuzzy adaptive lasso method.

Table 7 presents the variable selection results of the fuzzy adaptive lasso and fuzzy lasso estimates for Example 5.1 with three different variances. In this table, the column labeled “C” gives the average number of selected nonzero components, and the column labeled “I” gives the average number of “about zero” components incorrectly selected into the final model. From column “C,” both methods correctly select the two informative variables (\(x_1\) and \(x_3\)) in all cases. From column “I,” the fuzzy adaptive lasso estimate tends to select fewer noninformative variables when the variances are low, whereas the fuzzy lasso estimate tends to outperform the fuzzy adaptive lasso estimate when the variances are high. We also studied the case \(n=20\) and reached the same conclusion.

Table 7 Average number of selected variables of fuzzy adaptive lasso, fuzzy lasso for Example 5.1 with \(n=40\)

6 Concluding Remarks

In this paper, we have proposed a fuzzy adaptive lasso estimate for fuzzy linear regression with crisp inputs and fuzzy outputs. The new estimate is a penalized one that imposes a weighted \(L_1\) penalty on the least-squares error. Compared with the fuzzy lasso estimate of Hesamian and Akbari [15], it assigns different weights to different coefficients, which is fairer to the important ones. The tuning parameters are determined by a two-dimensional cross-validation procedure. We compared the fuzzy adaptive lasso estimate with five commonly used estimates through several numerical experiments. The experimental results show that the proposed estimate is an effective tool for estimation and variable selection.

The proposed method has, of course, some limitations. Note that, in many real applications, the input variables can also be fuzzy [5, 9]. Moreover, compared with fuzzy sets, Pythagorean fuzzy sets are an effective tool for dealing with fuzziness and imprecision, especially in multiple attribute decision making problems [30, 31, 35]. Hence, extending the proposed method to the situation where both input and output variables are fuzzy or Pythagorean fuzzy is one direction for our future research. Moreover, as stated in Vidaurre et al. [29], the adaptive lasso is sensitive to collinearity. Thus, generalizing robust penalized techniques such as the elastic net (Zou and Hastie [38]) and the LAD-lasso (Wang et al. [32]) to the fuzzy environment will be another research topic in the future.