1 Introduction

Planning for the future has become increasingly important, and accurate and reliable forecasts are vital when making such plans. In recent years, fuzzy inference systems have joined the set of methods used to obtain forecasts. Fuzzy inference systems work with fuzzy sets created for the input and output data at hand. In the literature, the fuzzy inference system proposed by Takagi and Sugeno (1985) and the adaptive network-based fuzzy inference system (ANFIS) proposed by Jang (1993) have been widely used for forecasting problems. Both of these fuzzy inference systems rely on a rule base. Determining these rules requires expert opinion, which prevents the inference system from operating objectively.

For this reason, the fuzzy function approach was proposed by Turksen (2008). The fuzzy function approach does not need a rule base, because it uses fuzzy functions instead of rules. Different applications of the fuzzy function approach were developed by Celikyilmaz and Turksen (2008a, b) and Turksen (2009), and detailed information about fuzzy functions can be found in Celikyilmaz and Turksen (2009). In addition, a hybrid fuzzy function approach was proposed by Zarandi et al. (2013). The fuzzy function approach was also reconsidered for forecasting problems by Beyhan and Alci (2010) and Aladag et al. (2014, 2016). In recent years, granular computing has emerged as a computing paradigm of information processing. Granular computing is more a theoretical perspective than a set of methods; its approaches are data analysis techniques, like fuzzy methods, that attempt to recognize structure at various levels of scale. Recent contributions to granular computing are given by Pedrycz and Chen (2011, 2015a, b) and Maciel et al. (2016).

The fuzzy function approach can produce successful results for prediction problems. While the fuzzy functions are being obtained, the input matrix is extended with various nonlinear transformations of the membership values. In the multiple regression models used to obtain the fuzzy functions, there should be no significant linear relationship between the explanatory variables. Such a relationship leads to the multicollinearity problem, which inflates the variance of the estimators and produces unreliable predictions. However, the nonlinear functions of the memberships used to obtain the fuzzy functions are clearly related to each other.

As a result, the Type 1 fuzzy function (T1FF) approach works with regression models that suffer from the multicollinearity problem. Ridge regression can be used instead of classical regression analysis in the presence of multicollinearity. By using a shrinkage parameter, ridge regression yields estimators that are biased but have smaller variance. Ridge regression is a remedy for the multicollinearity problem and was first proposed by Hoerl and Kennard (1970). It has two important advantages over the ordinary least squares (OLS) method: it resolves the multicollinearity problem and it decreases the mean square error (MSE) of the estimators. A simple formula for the shrinkage parameter in ridge regression was proposed by Hoerl et al. (1975). The motivation of this paper is to use ridge regression to eliminate the multicollinearity problem in the T1FF approach.

In this paper, the multicollinearity problem in the T1FF approach is solved using ridge regression, and this is the contribution of the paper. The proposed approach is called the "Type 1 fuzzy function approach based on ridge regression (T1FFRR)". The proposed T1FFRR approach was applied to several real-world time series datasets and the results are compared with those obtained from other techniques.

The rest of the paper is organized as follows. The second section describes the T1FF approach. Ridge regression is briefly summarized in Sect. 3. In the fourth section, the proposed T1FFRR approach is introduced in detail. Section 5 presents the results of applying the proposed method to real-life datasets, and Sect. 6 presents the conclusions and discussion.

2 Type 1 fuzzy function approach

The fuzzy inference system proposed by Takagi and Sugeno (1985) and ANFIS are fuzzy inference systems that require the creation of a rule base. The fuzzy function approach proposed by Turksen (2008) is a fuzzy inference system that works without rules. In the fuzzy function approach, a linear function is built by linear regression for each fuzzy set obtained from fuzzy c-means, and the output of the system is the combination of the fuzzy function predictions weighted by the memberships. The fuzzy function approach was also reconsidered for forecasting problems by Beyhan and Alci (2010) and Aladag et al. (2014, 2016). Four different fuzzy function approaches are given in Celikyilmaz and Turksen (2009). The T1FF method is given in Algorithm 1.

Algorithm 1. T1FF Approach.

Step 1 Generate a matrix of lagged variables of time series.

Step 2 Determine the degrees of belonging to the fuzzy sets.

The matrix consisting of lagged variables is clustered using the FCM technique, which yields the membership values \(\left( {{\mu _{ik}},\;i=1,2, \ldots ,c;\;k=1,2, \ldots ,n} \right)\), where \(c\) and \(n\) represent the number of fuzzy sets and the number of crisp observations, respectively.

Step 3 Constitute the fuzzy regression functions.

A fuzzy regression function is constituted for each fuzzy set; it can be expressed as in Eq. (1).

$${Y^{\left( i \right)}}={X^{\left( i \right)}}{\beta ^{\left( i \right)}}+{\varepsilon ^{\left( i \right)}},\quad i=1,2, \ldots c~$$
(1)

Celikyilmaz and Turksen (2009) stated that various logarithmic and exponential transformations of the membership values may increase the performance of the system. Equations (2) and (3) represent the targeted outputs and the inputs of the system, respectively.

$${Y^{\left( i \right)}}=\left[ {\begin{array}{*{20}{c}} {{y_1}} \\ {\begin{array}{*{20}{c}} {{y_2}} \\ \vdots \end{array}} \\ {{y_n}} \end{array}} \right],\quad i=1,2, \ldots c~$$
(2)
$${X^{\left( i \right)}}=\left[ {\begin{array}{*{20}{c}} {{\mu _{i1}}}&{\mu _{i1}^{2}}&{\exp \left( {{\mu _{i1}}} \right)}&{\ln \left( {\left( {1 - {\mu _{i1}}} \right)/{\mu _{i1}}} \right)}&{{x_{11}}}& \cdots &{{x_{p1}}} \\ {{\mu _{i2}}}&{\mu _{i2}^{2}}&{\exp \left( {{\mu _{i2}}} \right)}&{\ln \left( {\left( {1 - {\mu _{i2}}} \right)/{\mu _{i2}}} \right)}&{{x_{12}}}& \cdots &{{x_{p2}}} \\ \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots \\ {{\mu _{in}}}&{\mu _{in}^{2}}&{\exp \left( {{\mu _{in}}} \right)}&{\ln \left( {\left( {1 - {\mu _{in}}} \right)/{\mu _{in}}} \right)}&{{x_{1n}}}& \cdots &{{x_{pn}}} \end{array}} \right],\quad i=1,2, \ldots ,c$$
(3)

where \(p\) represents the number of crisp inputs.

Step 4 Estimate the fuzzy regression functions.

The fuzzy regression function for each fuzzy set is estimated by the OLS method.

$${\hat {Y}^{\left( i \right)}}={X^{\left( i \right)}}{\hat {\beta }^{\left( i \right)}},\quad i=1,2, \ldots c~$$
(4)

Step 5 Obtain the outputs.

The outputs obtained from each estimated fuzzy regression function are weighted by the corresponding membership values, and the final outputs are calculated by Eq. (5).

$${\hat {y}_k}=\frac{{\mathop \sum \nolimits_{{i=1}}^{c} {{\hat {y}}_{ik}}{\mu _{ik}}}}{{\mathop \sum \nolimits_{{i=1}}^{c} {\mu _{ik}}}},\quad i=1,2, \ldots c;\,\,\,k=1,2, \ldots ,n$$
(5)
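To make Algorithm 1 concrete, a minimal sketch in Python with NumPy is given below. It is an illustration, not the original implementation: the membership matrix `U` is assumed to come from FCM (random values stand in for it here), the augmented matrix of Eq. (3) is built without an intercept, one OLS function is fitted per cluster as in Eq. (4), and the cluster outputs are combined with membership weights as in Eq. (5).

```python
import numpy as np

def augment_inputs(mu_i, X):
    """Augmented design matrix of Eq. (3): memberships, their nonlinear
    transformations, and the original crisp inputs."""
    eps = 1e-6                                   # guard the log-odds transform
    mu = np.clip(mu_i, eps, 1.0 - eps)
    return np.column_stack([mu, mu ** 2, np.exp(mu), np.log((1.0 - mu) / mu), X])

def t1ff_fit_predict(X, y, U):
    """One OLS fuzzy function per cluster (Eq. 4) and the membership-weighted
    combination of the cluster outputs (Eq. 5)."""
    c, n = U.shape
    preds = np.zeros((c, n))
    for i in range(c):
        Xi = augment_inputs(U[i], X)
        beta_i, *_ = np.linalg.lstsq(Xi, y, rcond=None)   # OLS estimate
        preds[i] = Xi @ beta_i
    return (U * preds).sum(axis=0) / U.sum(axis=0)        # Eq. (5)

# toy example: 3 clusters, 2 crisp inputs, 50 observations
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = X @ np.array([1.5, -0.7]) + rng.normal(scale=0.1, size=50)
U = rng.dirichlet(np.ones(3), size=50).T        # stand-in for FCM memberships
print(t1ff_fit_predict(X, y, U)[:5])
```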

In the fuzzy function approach, as many fuzzy functions as clusters are obtained, and the method requires fewer parameters than other fuzzy inference systems. Because the fuzzy function approach is data driven, it can model the nonlinear relationship between inputs and outputs. The most important issue in the fuzzy function approach is that it suffers from the multicollinearity problem, and no solution to this problem has been offered in the literature. In this study, the T1FF approach is modified to resolve the multicollinearity problem.

3 Ridge regression

Ridge regression is a remedy used in the presence of the multicollinearity problem and was first proposed by Hoerl and Kennard (1970). Ridge regression has two important advantages over the OLS method: it resolves the multicollinearity problem and it decreases the mean square error (MSE). The solution technique of ridge regression is similar to that of OLS; the difference is the k value. This k value, also called the biasing or shrinkage parameter, takes values between 0 and 1. It is added to the diagonal elements of the correlation matrix, so the resulting regression coefficients are biased; that is, ridge regression is a biased regression method. There are several ways of detecting the multicollinearity problem, based on graphs and statistics. Variance inflation factors (VIF), the eigenvalues of the correlation matrix of the independent variables, and the signs of the regression coefficients can all signal multicollinearity.

The OLS estimates of regression coefficients and ridge estimates of regression coefficients are shown in Eqs. (6) and (7) respectively.

$$\hat {\beta }={\left( {{X^\prime }X} \right)^{ - 1}}{X^\prime }Y$$
(6)
$${\hat {\beta }_R}={\left( {{X^\prime }X+kI} \right)^{ - 1}}{X^\prime }Y$$
(7)

As noted above, ridge regression is a biased regression method, as shown in Eq. (8).

$$\begin{aligned} {{\hat {\beta }}_R} & ={\left( {{X^\prime }X+kI} \right)^{ - 1}}{X^\prime }Y \\ & ={\left( {{X^\prime }X+kI} \right)^{ - 1}}{X^\prime }X\hat {\beta }=Z\hat {\beta } \\ \end{aligned}$$
(8)
$$E\left( {{{\hat {\beta }}_R}} \right)=E\left( {Z\hat {\beta }} \right)=Z\beta$$

Since \(Z \ne I\) whenever \(k>0\), the ridge estimates of the regression coefficients \(\left( {{{\hat {\beta }}_R}} \right)\) are biased. One of the most important points to be considered in ridge regression is the choice of the k value, and many methods have been proposed in the literature to find an optimal k. One of them is the ridge trace, a plot of the elements of the ridge estimator versus k, usually over the interval (0, 1). Another method used to find the optimal k value is given in Eq. (9); it was proposed by Hoerl et al. (1975).

$$k=\frac{{p~{{\hat {\sigma }}^2}}}{{{{\hat {\beta }}^\prime }\hat {\beta }}}$$
(9)
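The shrinkage rule of Eq. (9) is straightforward to implement. The sketch below (Python with NumPy, illustrative data only, regressors assumed standardized as is usual before ridge regression) computes the OLS estimate of Eq. (6), the k value of Eq. (9) and the ridge estimate of Eq. (7).

```python
import numpy as np

def hoerl_kennard_ridge(X, y):
    """OLS fit (Eq. 6), shrinkage parameter of Eq. (9), ridge fit (Eq. 7).
    X is assumed to be standardized beforehand."""
    n, p = X.shape
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)          # Eq. (6)
    resid = y - X @ beta_ols
    sigma2 = resid @ resid / (n - p)                          # residual variance
    k = p * sigma2 / (beta_ols @ beta_ols)                    # Eq. (9)
    beta_ridge = np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)  # Eq. (7)
    return beta_ridge, k

# example with two nearly collinear regressors
rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
X = np.column_stack([x1, x1 + 1e-3 * rng.normal(size=100)])
X = (X - X.mean(axis=0)) / X.std(axis=0)                      # standardize
y = X[:, 0] + rng.normal(scale=0.1, size=100)
print(hoerl_kennard_ridge(X, y))
```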

Another tool used to assess the effects of the multicollinearity problem is the VIF. The VIFs are proportional to the diagonal elements of \({\text{Var}}\left( {\hat {\beta }} \right)\) and are given by Eq. (10).

$${\text{VI}}{{\text{F}}_j}=\frac{1}{{\left( {1 - R_{j}^{2}} \right)}}\quad j=1, \ldots ,p$$
(10)

In Eq. (10), \(R_{j}^{2}\) is the coefficient of determination obtained from the multiple regression of \({X_j}\) on the remaining \(\left( {p - 1} \right)\) regressor variables in the model. Large VIF values (commonly VIF ≥ 10) indicate a multicollinearity problem among the independent variables.
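For completeness, a small sketch of the VIF computation of Eq. (10) is given below; it regresses each column of X on the remaining columns and is meant only as an illustration.

```python
import numpy as np

def vif(X):
    """Variance inflation factors of Eq. (10): regress each column of X on
    the remaining columns and compute 1 / (1 - R_j^2)."""
    n, p = X.shape
    out = np.zeros(p)
    for j in range(p):
        Xj = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, Xj, rcond=None)
        resid = Xj - others @ coef
        r2 = 1.0 - resid @ resid / ((Xj - Xj.mean()) @ (Xj - Xj.mean()))
        out[j] = 1.0 / (1.0 - r2)
    return out

# the second and third columns are almost identical, so their VIF values
# should far exceed the usual threshold of 10
rng = np.random.default_rng(2)
a = rng.normal(size=200)
X = np.column_stack([rng.normal(size=200), a, a + 1e-2 * rng.normal(size=200)])
print(vif(X))
```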

4 The proposed method

In the T1FF approach, ridge regression can be used to eliminate the multicollinearity problem that arises from the relationships between the nonlinear transformations of the membership values. A fuzzy function approach free of multicollinearity is expected to have better forecasting performance. The main difference between the proposed method and the T1FF method is the use of ridge regression for obtaining the fuzzy functions; the k value of the ridge regression is determined using Eq. (9). The T1FFRR approach is given in Algorithm 2.

Algorithm 2: The Proposed T1FFRR Method.

Step 1. The inputs of the system are the lagged variables (p of them).

The model order \(p\) is determined according to the structure of the time series. The time series is written as a column vector \({X_t}={[{x_1},{x_2}, \ldots ,{x_n}]^\prime }\). The matrix Z is composed of the inputs and outputs of the system and has dimension \((n - p) \times (p+1)\).

$$Z=\left[ {\begin{array}{*{20}{c}} {{X_t}}&{{X_{t - 1}}}&{{X_{t - 2}}}& \cdots &{{X_{t - p}}} \end{array}} \right]$$
(11)
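A minimal sketch of this lagging step is given below, under the convention that the first column of Z holds the target \(X_t\) and the remaining columns hold the lags.

```python
import numpy as np

def lagged_matrix(x, p):
    """Matrix Z of Eq. (11): each row is (x_t, x_{t-1}, ..., x_{t-p}),
    giving an (n - p) x (p + 1) array whose first column is the target."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.column_stack([x[p - j:n - j] for j in range(p + 1)])

# example with p = 2: rows are (x_t, x_{t-1}, x_{t-2})
print(lagged_matrix([1, 2, 3, 4, 5, 6], p=2))
```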

The rows of this matrix are clustered using the fuzzy c-means (FCM) technique proposed by Bezdek (1981). FCM is applied iteratively using Eqs. (12) and (13). In these equations, \(c\), \({v_i}\;(i=1,2, \ldots ,c)\) and \({\mu _{ik}}\;\left( {i=1,2, \ldots ,c;\;k=1,2, \ldots ,n} \right)\) represent the number of fuzzy sets, the cluster centers and the membership values, respectively.

$${v_i}=\frac{{\mathop \sum \nolimits_{{k=1}}^{n} {{({\mu _{ik}})}^f}{z_k}}}{{\mathop \sum \nolimits_{{k=1}}^{n} {{({\mu _{ik}})}^f}}},\quad ~i=1,~2,~ \ldots ,~c$$
(12)
$${\mu _{ik}}={\left[ {\mathop \sum \limits_{{j=1}}^{c} {{\left( {\frac{{d({z_k},{v_i})}}{{d({z_k},{v_j})}}} \right)}^{\frac{2}{{f - 1}}}}} \right]^{ - 1}},\quad i=1,~2,~ \ldots ,~c~;~k=1,~2,~ \ldots ,~n$$
(13)

where \(f\) is the degree of fuzziness, \({z_k}\) is the vector formed by the kth row of \(Z\), \({\mu _{ik}}\) is the degree of belongingness of the kth observation to the ith cluster, and \(d(z,v)\) is the Euclidean distance computed using Eq. (14).

$$d\left( {{z_k},{v_i}} \right)=\left\| {{z_k} - {v_i}} \right\|$$
(14)
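The FCM updates of Eqs. (12)–(14) can be sketched from scratch as follows; the random initialization, tolerance and iteration cap are implementation choices of this illustration rather than part of the original algorithm.

```python
import numpy as np

def fcm(Z, c, f=2.0, max_iter=100, tol=1e-6, seed=0):
    """Fuzzy c-means: alternate the center update of Eq. (12) and the
    membership update of Eq. (13) until the memberships stop changing."""
    rng = np.random.default_rng(seed)
    n = Z.shape[0]
    U = rng.dirichlet(np.ones(c), size=n).T          # memberships, shape (c, n)
    for _ in range(max_iter):
        W = U ** f
        V = (W @ Z) / W.sum(axis=1, keepdims=True)   # Eq. (12): cluster centers
        D = np.linalg.norm(Z[None, :, :] - V[:, None, :], axis=2)  # Eq. (14)
        D = np.maximum(D, 1e-12)                     # avoid division by zero
        U_new = 1.0 / (D ** (2.0 / (f - 1.0)) *
                       np.sum(D ** (-2.0 / (f - 1.0)), axis=0))     # Eq. (13)
        if np.abs(U_new - U).max() < tol:
            return V, U_new
        U = U_new
    return V, U

# cluster a toy lagged matrix into c = 2 fuzzy sets
rng = np.random.default_rng(3)
Z = np.vstack([rng.normal(0, 1, size=(30, 3)), rng.normal(5, 1, size=(30, 3))])
V, U = fcm(Z, c=2)
print(V)
```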

Step 2 The membership values of the input dataset are determined, as in Eq. (15), using the cluster centers obtained from the FCM technique.

$${\mu _{ik}}={\left[ {\mathop \sum \limits_{{j=1}}^{c} {{\left( {\frac{{d({x_k},{v_i})}}{{d({x_k},{v_j})}}} \right)}^{\frac{2}{{{f_i} - 1}}}}} \right]^{ - 1}},\quad i=1,~2,~ \ldots ,~c~;~k=1,~2,~ \ldots ,~n$$
(15)

where \({x_k}\) is the vector formed by the kth row of the input matrix generated from the lagged variables; the membership value \({\mu _{ik}}\) is set to zero when it is smaller than the \(\alpha\)-cut level, and \({f_i}\) is the fuzziness index.

$$X=\left[ {\begin{array}{*{20}{c}} {{X_{t - 1}}}&{{X_{t - 2}}}& \cdots &{{X_{t - p}}} \end{array}} \right]$$
(16)
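A small sketch of Step 2 is given below. It assumes that the cluster centers passed in have the same number of columns as the input matrix X (for instance, the input part of the centers obtained in Step 1), and it applies the \(\alpha\)-cut by zeroing memberships below the threshold.

```python
import numpy as np

def memberships(X, V, f=2.0, alpha=0.0):
    """Membership values of Eq. (15) for the rows of the input matrix X,
    given fixed cluster centers V; values below the alpha-cut are set to zero."""
    D = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)
    D = np.maximum(D, 1e-12)
    U = 1.0 / (D ** (2.0 / (f - 1.0)) * np.sum(D ** (-2.0 / (f - 1.0)), axis=0))
    U[U < alpha] = 0.0                      # alpha-cut on the memberships
    return U

# example: 2 centers in a 2-dimensional input space, alpha-cut of 0.1
X = np.array([[0.1, 0.2], [4.9, 5.1], [2.4, 2.6]])
V = np.array([[0.0, 0.0], [5.0, 5.0]])
print(memberships(X, V, alpha=0.1))
```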

Step 3 For each cluster \(i\), the membership values of the input data samples \(({\mu _{ik}})\), the nonlinear transformations of these membership values and the original inputs are used as explanatory variables, and the ith fuzzy function is obtained by estimating the multiple regression model \({{\varvec{Y}}^{(i)}}={{\varvec{X}}^{(i)}}{\varvec{\beta}^{(i)}}+{\varvec{\varepsilon}^{(i)}}\).

When the number of lagged variables is \(p\) and the nonlinear transformations \({\mu _{i1}}^{2}\), \(\exp ({\mu _{i1}})\) and \(\ln \left( {\left( {1 - {\mu _{i1}}} \right)/{\mu _{i1}}} \right)\) of the membership values are used, the \({X^{(i)}}\) and \({Y^{(i)}}\) matrices are as follows.

$${{\varvec{X}}^{(i)}}=\left[ {\begin{array}{*{20}{c}} {{\mu _{i1}}}&{\mu _{i1}^{2}}&{\exp ({\mu _{i1}})}&{\ln \left( {\left( {1 - {\mu _{i1}}} \right)/{\mu _{i1}}} \right)}&{{x_1}}& \cdots &{{x_p}} \\ {{\mu _{i2}}}&{\mu _{i2}^{2}}&{\exp ({\mu _{i2}})}&{\ln \left( {\left( {1 - {\mu _{i2}}} \right)/{\mu _{i2}}} \right)}&{{x_2}}& \cdots &{{x_{p+1}}} \\ \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots \\ {{\mu _{in}}}&{\mu _{in}^{2}}&{\exp ({\mu _{in}})}&{\ln \left( {\left( {1 - {\mu _{in}}} \right)/{\mu _{in}}} \right)}&{{x_{n - p}}}& \cdots &{{x_{n - 1}}} \end{array}} \right]$$
(17)
$${{\varvec{Y}}^{(i)}}=\left[ {\begin{array}{*{20}{c}} {{x_{p+1}}} \\ {{x_{p+2}}} \\ {\begin{array}{*{20}{c}} \vdots \\ {{x_n}} \end{array}} \end{array}} \right]$$
(18)

The ridge estimators of the regression parameters are obtained as follows.

$$\hat {\beta }_{R}^{{(i)}}={({{\varvec{X}}^{(i)\prime }}{{\varvec{X}}^{(i)}}+kI)^{ - 1}}{{\varvec{X}}^{(i)\prime }}{{\varvec{Y}}^{(i)}}$$
(19)
$${\hat {{\varvec{Y}}}^{({\varvec{i}})}}={{\varvec{X}}^{(i)}}\hat {\beta }_{R}^{{(i)}}$$
(20)

Here, the shrinkage parameter \(k\) is obtained using Eq. (9).
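The following sketch combines Eqs. (17), (19) and (20) for a single cluster: it builds the augmented matrix, chooses k by Eq. (9) from an auxiliary OLS fit, and computes the ridge estimate. The toy data are for illustration only and the snippet is not the authors' implementation.

```python
import numpy as np

def ridge_fuzzy_function(mu_i, X, y):
    """Fit the ith fuzzy function by ridge regression: build the augmented
    matrix of Eq. (17), choose k by Eq. (9) from an auxiliary OLS fit, and
    apply the ridge estimator of Eq. (19); Eq. (20) gives the fitted values."""
    eps = 1e-6
    mu = np.clip(mu_i, eps, 1.0 - eps)
    Xi = np.column_stack([mu, mu ** 2, np.exp(mu), np.log((1.0 - mu) / mu), X])
    n, p = Xi.shape
    beta_ols, *_ = np.linalg.lstsq(Xi, y, rcond=None)
    sigma2 = np.sum((y - Xi @ beta_ols) ** 2) / (n - p)
    k = p * sigma2 / (beta_ols @ beta_ols)                         # Eq. (9)
    beta_r = np.linalg.solve(Xi.T @ Xi + k * np.eye(p), Xi.T @ y)  # Eq. (19)
    return Xi @ beta_r, beta_r                                     # Eq. (20)

# toy example with one cluster's memberships and two lagged inputs
rng = np.random.default_rng(4)
X = rng.normal(size=(60, 2))
y = 0.8 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.1, size=60)
mu = rng.uniform(0.05, 0.95, size=60)
y_hat, beta = ridge_fuzzy_function(mu, X, y)
print(beta)
```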

Step 4 The outputs obtained from the fuzzy functions are weighted according to the membership values, and the final output values are calculated as follows.

$${\hat {y}_k}=\frac{{\mathop \sum \nolimits_{{i=1}}^{c} {{\hat {y}}_{ik}}{\mu _{ik}}}}{{\mathop \sum \nolimits_{{i=1}}^{c} {\mu _{ik}}}}~,\quad i=1,2, \ldots ,c,~~k=1,2, \ldots ,n - p$$
(21)

where \({\hat {y}_{ik}}\) is the predicted value obtained from the ith cluster for observation \(k\) and \({\hat {y}_k}\) is the forecast of the approach for observation \(k\). The flowchart of the proposed method is given in Fig. 1.

Fig. 1 The flowchart of the proposed method

5 Applications

In this study, twelve time series are analyzed to evaluate the forecasting performance of the proposed method. The first five are Istanbul Stock Exchange (BIST100) series observed daily between 2009 and 2013. The next five are Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) series observed daily between 2000 and 2004. The eleventh is the Australian beer consumption (AUST) series observed quarterly between 1956 and 1994. Finally, the last dataset is the Turkey electricity consumption (TEC) series observed monthly from January 2002 to December 2013.

These time series and their features are presented in Table 1. The methods are compared using the root mean square error (RMSE) and mean absolute percentage error (MAPE) criteria, which are calculated using Eqs. (22) and (23).

Table 1 The names and features of time series
$${\text{RMSE}}=\sqrt {\frac{1}{n}\mathop \sum \limits_{{k=1}}^{n} {{({y_k} - {{\hat {y}}_k})}^2}}$$
(22)
$${\text{MAPE}}=\frac{1}{n}\mathop \sum \limits_{{k=1}}^{n} \frac{{\left| {{y_k} - {{\hat {y}}_k}} \right|}}{{{y_k}}} \times 100$$
(23)
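The two criteria of Eqs. (22) and (23) can be computed as in the short sketch below (the values are illustrative only).

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean square error of Eq. (22)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return np.sqrt(np.mean((y - y_hat) ** 2))

def mape(y, y_hat):
    """Mean absolute percentage error of Eq. (23)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return np.mean(np.abs(y - y_hat) / y) * 100.0

y_true = [100.0, 102.0, 98.0, 105.0]
y_pred = [101.0, 101.5, 99.0, 104.0]
print(rmse(y_true, y_pred), mape(y_true, y_pred))
```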

In the first case, the proposed method was applied to the BIST100 datasets. The alternative forecasting methods used in the analysis of the BIST100 datasets, besides the proposed method, are listed below.

ARIMA: Autoregressive integrated moving average model; the best model was determined with the Box–Jenkins procedure (Box and Jenkins 1970).

ES: Exponential smoothing; simple, Holt and Winters exponential smoothing methods were applied and the best model was selected (Brown 1963).

MLP-ANN: Multilayer perceptron artificial neural network; the numbers of inputs and hidden layer neurons were varied from 1 to 5 and the best architecture was selected by trial and error. The Levenberg–Marquardt algorithm was used for training (Wilamowski et al. 2007).

SC: Song and Chissom time-invariant fuzzy time series method (Song and Chissom 1993); the number of fuzzy sets was varied from 5 to 15 and the best number of fuzzy sets was selected.

FF: Fuzzy function approach (Turksen 2008); the model order and the number of fuzzy sets were varied from 1 to 5 and from 5 to 15, respectively.

The best situations were determined for each series. Table 2 summarizes the results obtained from the test set for Series 1–5.

Table 2 The obtained results for series 1–5 when ntest = 7 and 15

The proposed method is successful in 70% of the BIST100 cases and provides a noticeable improvement in forecasting performance. Even in the remaining 30% of cases, the multicollinearity problem was eliminated by the proposed method. An example of the multicollinearity problem in the T1FF approach is given in Table 3 for Series 1 when ntest = 7. The T1FF method suffers from multicollinearity because some VIF values are greater than 10, whereas there is no multicollinearity problem with the T1FFRR method because all VIF values are less than 10.

Table 3 The comparison of VIF values obtained from T1FF and T1FFRR methods for series 1 when ntest = 7

In addition, the best situations giving these results for series 1–5 are given in Table 4.

Table 4 The best situations for series 1–5

Moreover, the Kruskal–Wallis H test was applied to the test-set RMSE values of the different methods given in Table 2. The significance value is smaller than 0.05, so there is a significant difference among the applied methods.

In the second case, the proposed method was applied to the TAIEX datasets. Table 5 summarizes the results obtained from the test set for series 6–10.

Table 5 All obtained results for TAIEX

Analysis of Table 5 reveals that the proposed method exhibits better forecasting performance than the other methods in terms of the RMSE criterion. In addition, the best situations for series 6–10 are given in Table 6.

Table 6 The best situations for series 6–10

In the next case, the proposed method was applied to the AUST dataset (series 11). Table 7 summarizes the results obtained from the test set for Series 11. Series 11 was also analyzed with the seasonal autoregressive integrated moving average model (SARIMA), Winters' multiplicative exponential smoothing method (WMES) proposed by Winters (1960), the linear and nonlinear artificial neural network model (L&NL-ANN) proposed by Yolcu et al. (2013), the multiplicative neuron model-based fuzzy time series method (MNM-FTS) proposed by Aladag (2013) and the T1FF method proposed by Turksen (2008).

Table 7 All obtained results for Series 11

Analysis of Table 7 reveals that the proposed method exhibits better forecasting performance than the other methods in terms of both the MAPE and RMSE measures. The best result is obtained for \(m=8\), \({\text{cn}}=5\) and \(\alpha {\text{-cut}}=0\). In Table 7, the results of the methods compared with the proposed method were taken from [20].

The real observations and the forecasts obtained from the proposed method for the test set are plotted in Fig. 2. According to this graph, the forecasts obtained from the proposed method are very accurate.

Fig. 2 The real observations and the forecasts obtained from the proposed method for series 11

Finally, the proposed method was applied to the TEC dataset (series 12). The dataset was forecasted using SARIMA, MLP-ANN, the multiplicative neuron model artificial neural network (MNM-ANN), L&NL-ANN, and the T1FF approach.

Table 8 summarizes the results obtained from the test set for Series 12 and reveals that the proposed method exhibits better forecasting performance than the other methods in terms of both the MAPE and RMSE measures. The best result is obtained for \(m=16\), \({\text{cn}}=4\) and \(\alpha {\text{-cut}}=0.1\).

Table 8 All obtained results for Series 12

The real observations and the forecasts obtained from the proposed method for the test set are plotted in Fig. 3. According to this graph, the forecasts obtained from the proposed method are very accurate.

Fig. 3 The real observations and the forecasts obtained from the proposed method for series 12

6 Conclusion and discussions

The fuzzy function approach is a kind of fuzzy inference system that can produce successful results for forecasting problems. In the fuzzy function approach, a fuzzy function corresponding to each fuzzy set is generated using multiple regression analysis, and high correlations are found among the nonlinear transformations of the membership values. To overcome this problem, a new fuzzy function approach that uses ridge regression instead of multiple linear regression in the Type 1 fuzzy function approach is proposed; this is the contribution of the paper. The multicollinearity problem is eliminated by the proposed method. Moreover, the proposed method was applied to many real-life datasets and its superior forecasting performance was demonstrated; the reason for this performance is the elimination of the multicollinearity problem. The proposed method can be used in different areas that require time series forecasting, such as economics, finance, meteorology and hydrology.

In future studies, the proposed approach can be extended to the Type 2 fuzzy function approach, and different regression techniques can be used within the Type 1 fuzzy function approach.