Clustered ANFIS network using fuzzy c-means, subtractive clustering, and grid partitioning for hourly solar radiation forecasting

Benmouiza, Khalil; Cheknane, Ali

doi:10.1007/s00704-018-2576-4

Original Paper
Published: 01 August 2018

Volume 137, pages 31–43, (2019)

Theoretical and Applied Climatology

Abstract

In this paper, an improved clustered adaptive neuro-fuzzy inference system (ANFIS) is introduced to forecast hour-ahead solar radiation data over 915 h. First, we classify the historical solar radiation time series using clustering methods in order to decrease the input sample size. Three methods are used, namely, fuzzy c-means (FCM), subtractive clustering, and grid partitioning. These methods classify the input data into groups; each group has similar properties, which helps to understand the correlation between the data and consequently simplifies the forecasting process. Second, we design an ANFIS structure that combines the advantages of fuzzy theory, which describes the uncertain phenomena in the data, and of artificial neural network algorithms, which have a self-learning ability. Finally, by combining the clustered data and the ANFIS model, an hour-ahead forecast is obtained and validated using measured data. The advantage of the proposed method is that it implicitly uses the information associated with the forecasting problem, without a priori knowledge of the relationships between the different variables that govern solar radiation. The comparison results show that the ANFIS with FCM clustering model gives the best results, with an RMSE equal to 112 W/m² and high values of forecasting skill (FS).

1 Introduction

The application of solar energy at a given site requires complete and detailed knowledge of the solar radiation at that site (Belaid and Mellit 2016). At ground level, it is an important element for conversion systems using solar energy. This conversion is generally easy when the site is provided with a radiometric measurement station that has been running regularly for several years. This information can be collected by different methods, such as measuring solar radiation with pyranometers or reference cells, or by satellite measurements. However, in most areas of the world, these measurements are not easily available due to financial, technical, or institutional limitations (Bezdek 1981; Zhang et al. 1998; Zhang 2003; Kaplanis and Kaplani 2007; Badescu et al. 2013; Benmouiza and Cheknane 2016).

Forecasting seems to be the solution when estimation is not possible; it consists of finding future values of a time series based only on past data. It is known to be a difficult problem due to the non-linearity and complexity of the solar radiation time series (Kaplanis and Kaplani 2007; Ji and Chee 2011; Gan et al. 2012; Peled and Appelbaum 2013). Several methods have been proposed for forecasting purposes, such as stochastic models (Flores et al. 2012; Huang et al. 2013; Rout et al. 2014; Ren et al. 2015; David et al. 2016). They treat the solar radiation as a time series; the mathematical model of this series is used to forecast future values. However, linear regression models such as the traditional ARMA are unable to fully describe the complicated relationships in the data because of its dynamics and nonlinearity. They cannot take into account the effect of other factors that influence the time series. In addition, outliers lead to a high fitting error, which limits the application of this kind of model.

On the other hand, artificial intelligence methods have attracted the attention of many researchers in the field of renewable energy, in particular for the forecasting of meteorological data such as solar irradiation. Indeed, many research works have proven the ability of artificial intelligence methods to forecast meteorological data (Mellit 2008; Diagne et al. 2013). They demonstrated that these methods are more suitable and give better results than the conventional approximation methods proposed by other researchers for the forecasting of solar irradiation. Artificial neural networks have been widely used for forecasting several kinds of time series (Benmouiza and Cheknane 2013; Kashyap et al. 2015; Qazi et al. 2015; Benmouiza and Cheknane 2016; Azimi et al. 2016). However, they suffer from some drawbacks, such as poor global search and long training time. Fuzzy logic inference systems, in turn, depend on the knowledge and experience of professional experts, which makes it difficult to obtain satisfactory results when the knowledge database lacks information (Boata and Gravila 2012; Bas et al. 2015; Suganthi et al. 2015; Olatomiwa et al. 2015; Chen et al. 2016).

Hence, based on the above discussion, we propose in this paper an improved adaptive neuro-fuzzy inference system (ANFIS) for hour-ahead forecasting of solar radiation time series. First, in order to decrease the input size, fuzzy c-means (FCM) (Dunn 1973; Bezdek 1981), subtractive clustering (Chiu 1994; Yager and Filev 1994), and grid partitioning (Simon 1991; Giotis and Giannakoglou 1998) are used. They cluster the inputs into groups with similar properties. Before that, each input is reconstructed using phase space reconstruction based on the time delay embedding method, which helps to understand the underlying dynamics of the input dataset. Second, each clustered input is assigned to an ANFIS structure. After that, the ANFIS is trained using the training dataset and the model is tested against a testing dataset. The advantage of this method is that it combines the ANFIS method, based on fuzzy theory and artificial neural network algorithms, with classification methods that simplify the analysis. The comparison results show the goodness of the proposed clustered ANFIS structure. The novelty of the paper can be summarized as follows:

Using FCM, subtractive clustering, and grid partitioning to decrease the ANFIS input size, which leads to full comprehension of the dynamic behavior of the solar radiation time series
Obtaining the best forecasting results using both neural network and fuzzy logic concepts
Decreasing the calculation time, which makes it possible to test more configurations in less time with good forecasting results

2 Data selection

For the forecasting purpose, the location of Ghardaia, Algeria (32.4908° N, 3.6728° E) is selected. It is located in northern-central Algeria in the Sahara Desert. Ghardaia is characterized by a dry and arid climate, with a yearly average of the global solar radiation measured on a horizontal plane that exceeds 6000 Wh/m² and more than 3000 sunny hours per year, which promotes the use of solar energy in various fields such as bioclimatic applications, water heating, electricity production, and food drying.

To test the proposed models, we have selected the daylight-hour data for 2010 recorded at Ghardaia by the National Meteorological Office of Algeria. The office regularly measures the global horizontal radiation using a CM 11 pyranometer (Capderou 1986) with a measurement error equal to 2% over the year and a sampling interval of 1 h. These data are divided into a training set for the model development and a testing set to evaluate the established model. To this end, k-fold cross validation is applied, and the data are separated into a training set from 1 January 2010 to 31 October 2010 and a test set from 1 November 2010 to 31 December 2010.

3 Methods

The primary objective is to forecast hourly global horizontal solar radiation from past data based on the ANFIS model, clustering algorithms, and time delay methods. Figure 1 illustrates the block diagram of the proposed method.

Neuro-fuzzy inference systems are realized by combining appropriate neural networks and fuzzy systems. This combination allows the use of the numerical and linguistic power of these two intelligent systems. In neuro-fuzzy theory, different fuzzification strategies with different rules may yield various solutions for a given task. In addition, a high number of fuzzy sets implies a high number of fuzzy rules, which allows a proper study of the nonlinear effects on the overall behavior of the system. However, it dramatically increases the processing time. Moreover, a neural network alone cannot exploit qualitative knowledge about the data, which leads to an increase in the number of neurons and layers. Approximating the non-linearity of the data is a solution; however, it makes learning and implementation more difficult. Hence, to solve this problem, an adaptive neuro-fuzzy inference system that takes advantage of both fuzzy logic theory and neural networks is proposed.

3.1 Adaptive neuro-fuzzy inference system

As its name implies, an adaptive network is a network structure with an overall input–output behavior determined by the values of the collection of editable parameters. ANFIS is an adaptive neuro-fuzzy inference system introduced in (Jang 1993). It is a hybrid intelligent system that combines both ANN and fuzzy logic theory in a single system; it employs the ANN to update the parameters of the Takagi–Sugeno-type inference model. As shown in Fig. 1, an ANFIS model generally contains five layers.

3.1.1 Layer one

Called the fuzzification layer, it is characterized by fuzzy sets described as neurons. Each neuron has its own membership function with adjustable parameters called premise parameters. The membership function is generally a Gaussian bell-shaped function, expressed as follows:

$$ {y}_{ij}^1=\exp \left(-\frac{1}{2}{\left(\frac{x_i-{a}_{ij}}{\sigma_{ij}}\right)}^2\right) $$

(1)

x_i: the ith input (i = 1, …, n_in, where n_in is the number of inputs).
j: the index of the jth fuzzy set of the ith input (j = 1, …, n_mf, where n_mf is the number of fuzzy sets per input).
y_ij: the output of the layer-one neuron for the ith input and jth fuzzy set.
a_ij and σ_ij: the center and standard deviation of the Gaussian bell-shaped function, respectively.

3.1.2 Layer two

It is the rule layer, which calculates the degree of activation of the premises. Each neuron of this layer represents the premise of a rule. It receives as input the degrees of truth of the fuzzy sets that make up the premise and computes the degree of truth of the premise itself. The activation function used for these neurons depends on the operators (AND or OR) present in the rules. With the product used for the AND operator, the output of each node is given by:

$$ {y}_l^2={y}_{ij}^1.{y}_{i^{\prime }{j}^{\prime}}^1 $$

(2)

l = 1, …, n_r, where n_r is the number of rules: n_r = (n_mf)^(n_in).

3.1.3 Layer three

In this normalization layer, each neuron is a circle (fixed) neuron; the lth neuron calculates the ratio between the weight of the lth rule and the sum of the weights of all the rules. This operation is called weight normalization.

$$ {y}_l^3=\frac{y_l^2}{\sum \limits_{m=1}^{n_r}{y}_m^2} $$

(3)

3.1.4 Layer four

Each neuron of this layer comprises n_in + 1 adjustable parameters (b_0, b_1, …, b_n_in), and the output of this layer is written as

$$ {y}_l^4={y}_l^3\left(\sum \limits_{i=1}^{n_{in}}{b}_i{x}_i+{b}_0\right) $$

(4)

b_i: the consequent parameters associated with the normalized weight of the lth rule from the previous layer; each rule has its own set of parameters.

3.1.5 Layer five

The output layer contains one single neuron which computes the overall output as an addition of all incoming signals.

$$ {y}^5=\sum \limits_{l=1}^{n_r}{y}_l^4 $$

(5)
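
The five layers of Eqs. (1) to (5) can be sketched as a single forward pass. This is a minimal illustration, not the authors' implementation: it assumes a grid-style rule base (one rule per combination of fuzzy sets, so n_r = (n_mf)^(n_in)), the product as the AND operator, and one parameter list `[b_0, b_1, …, b_n_in]` per rule. The function and argument names are hypothetical.

```python
import math
from itertools import product


def gaussian_mf(x, center, sigma):
    """Layer 1 (Eq. 1): Gaussian membership degree of x."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def anfis_forward(x, centers, sigmas, consequents):
    """Forward pass of a first-order Sugeno ANFIS (illustrative sketch).

    x           : list of n_in input values
    centers     : centers[i][j] = a_ij, center of fuzzy set j of input i
    sigmas      : sigmas[i][j] = sigma_ij, width of fuzzy set j of input i
    consequents : one list [b_0, b_1, ..., b_n_in] per rule
    Returns the network output and the normalized rule weights.
    """
    n_in = len(x)
    # Layer 1: membership degree of every input in each of its fuzzy sets
    y1 = [[gaussian_mf(x[i], a, s) for a, s in zip(centers[i], sigmas[i])]
          for i in range(n_in)]
    # Layer 2 (Eq. 2): rule firing strength = product of memberships
    rules = list(product(*[range(len(c)) for c in centers]))
    y2 = [math.prod(y1[i][j] for i, j in enumerate(rule)) for rule in rules]
    # Layer 3 (Eq. 3): normalize the firing strengths
    total = sum(y2)
    y3 = [w / total for w in y2]
    # Layer 4 (Eq. 4): normalized weight times the linear consequent
    y4 = [w * (b[0] + sum(bi * xi for bi, xi in zip(b[1:], x)))
          for w, b in zip(y3, consequents)]
    # Layer 5 (Eq. 5): sum of all incoming signals
    return sum(y4), y3
```

With two inputs and two fuzzy sets per input, `consequents` must hold four parameter lists, one per rule. Training (tuning a_ij, σ_ij, and b_i) is outside the scope of this sketch.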

Generally, building an ANFIS model consists of a construction phase and a training phase. The number and type of the membership functions, as well as the division of the input–output data into rule patches, are set in the construction phase. To carry out this task, clustering methods are used as powerful tools to understand and classify the inputs into groups, which facilitates the training phase of the ANFIS model. We have chosen three clustering methods, namely fuzzy c-means (FCM), subtractive clustering, and grid partitioning, described in what follows.

3.1.6 Fuzzy c-means

The FCM algorithm is widely used for clustering. It separates the data into groups by optimizing an objective function (Dunn 1973; Bezdek 1981). In our case, the hourly global solar radiation time series is non-linear, which makes the clustering process more difficult. Hence, phase space reconstruction is used in order to understand the underlying dynamics of this time series (MacQueen 1967; Benmouiza et al. 2016).

Phase space reconstruction consists of determining the minimum appropriate embedding dimension for a time series (Benmouiza and Cheknane 2013). The most widely used version is the time delay embedding method (Rand and Young 1981). A scalar time series x(t_i) is embedded into an m-dimensional space, denoted X(t_i), as expressed in Eq. (6):

$$ X\left({t}_i\right)=\left(x\left({t}_i\right),x\left({t}_i+\tau \right),\dots, x\left({t}_i+\left(m-1\right)\tau \right)\right) $$

(6)

where i = (1, 2, …, M), τ is the delay time, m is the embedding dimension, and M is the number of embedded points in the m-dimensional space given by Eq. (7), where N is the total number of points of the time series and X(t_i) is the embedded time series into an m-dimensional space:

$$ M=N-\left(m-1\right)\tau $$

(7)
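
As an illustration of Eqs. (6) and (7), the following sketch builds the delay vectors from a list of samples. It assumes the delay τ is given as an integer number of samples; the function name `delay_embed` is hypothetical.

```python
def delay_embed(series, m, tau):
    """Time delay embedding (Eq. 6).

    Returns M = N - (m - 1) * tau vectors (Eq. 7), where vector i is
    (x[i], x[i + tau], ..., x[i + (m - 1) * tau]).
    """
    n_vectors = len(series) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series is too short for this m and tau")
    return [[series[i + k * tau] for k in range(m)] for i in range(n_vectors)]
```

For example, a series of N = 10 samples embedded with m = 3 and τ = 2 gives M = 10 − 2·2 = 6 vectors.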

To determine the delay time, the mutual information method proposed by Fraser and Swinney (1986) was used. The optimum delay is equal to the first minimum of the mutual information plotted against τ, expressed by the following equation:

$$ I\left(x(t),x\left(t-\tau \right)\right)=\sum \limits_{x(t)}\sum \limits_{x\left(t-\tau \right)}p\left(x(t),x\left(t-\tau \right)\right)\log \frac{p\left(x(t),x\left(t-\tau \right)\right)}{p\left(x(t)\right)p\left(x\left(t-\tau \right)\right)} $$

(8)

I(x(t), x(t − τ)) is the mutual information, p(x(t), x(t − τ)) is the joint probability mass function of x(t) and x(t − τ), and p(x(t)) and p(x(t − τ)) are the corresponding marginal probability mass functions.
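
A possible way to estimate Eq. (8) from data is to approximate the probability mass functions with histogram counts. The sketch below does this with equal-width bins and then looks for the first local minimum over τ. The bin count, the search range `max_tau`, and both function names are assumptions made for illustration; the source does not state how the probabilities were estimated.

```python
import math
from collections import Counter


def mutual_information(series, tau, n_bins=16):
    """Histogram estimate of I(x(t), x(t - tau)) in nats, for tau >= 1."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant series

    def bin_of(value):
        return min(int((value - lo) / width), n_bins - 1)

    # Pairs (x(t), x(t - tau)) mapped to their bin indices
    pairs = [(bin_of(a), bin_of(b)) for a, b in zip(series[tau:], series[:-tau])]
    n = len(pairs)
    joint = Counter(pairs)
    p_now = Counter(p[0] for p in pairs)
    p_lag = Counter(p[1] for p in pairs)
    return sum((c / n) * math.log((c / n) / ((p_now[i] / n) * (p_lag[j] / n)))
               for (i, j), c in joint.items())


def first_minimum_delay(series, max_tau=48, n_bins=16):
    """Smallest tau at which the estimated mutual information has a local minimum."""
    mi = [mutual_information(series, tau, n_bins) for tau in range(1, max_tau + 1)]
    for k in range(1, len(mi) - 1):
        if mi[k] < mi[k - 1] and mi[k] <= mi[k + 1]:
            return k + 1  # mi[k] corresponds to tau = k + 1
    return None  # no local minimum found in the searched range
```

The estimate is sensitive to the number of bins, so in practice the location of the first minimum should be checked for a few bin counts.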

In addition, the false nearest neighbor method was chosen to set the suitable embedding dimension (Kennel et al. 1992). This method finds the nearest neighbor of every point in a given dimension and then checks whether the two points are still close neighbors in the next higher dimension.
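
The check described above can be sketched as follows. This is a simplified, brute-force version that uses only the distance-ratio test: a neighbor is counted as false when adding the (m + 1)th coordinate increases the separation by more than `r_tol` times the distance in dimension m. The threshold value and the function name are assumptions for illustration, not values taken from the source.

```python
import math


def false_neighbor_fraction(series, m, tau, r_tol=15.0):
    """Fraction of nearest neighbors in dimension m that are false in m + 1."""
    n = len(series) - m * tau  # points that also exist in dimension m + 1
    vectors = [[series[i + k * tau] for k in range(m)] for i in range(n)]
    false_count = 0
    counted = 0
    for i in range(n):
        # Brute-force search for the nearest neighbor of point i
        best_j, best_d = None, math.inf
        for j in range(n):
            if j == i:
                continue
            d = math.dist(vectors[i], vectors[j])
            if d < best_d:
                best_j, best_d = j, d
        if best_j is None or best_d == 0.0:
            continue  # skip degenerate cases (no neighbor or duplicate point)
        # Separation contributed by the extra coordinate of dimension m + 1
        extra = abs(series[i + m * tau] - series[best_j + m * tau])
        counted += 1
        if extra / best_d > r_tol:
            false_count += 1
    return false_count / counted if counted else 0.0
```

The embedding dimension is then taken as the smallest m for which this fraction drops close to zero. The search is O(N²), which is acceptable for a sketch but slow for long series.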

After determining the optimal embedding dimension, the reconstructed phase space of the solar radiation data is clustered using the fuzzy c-means algorithm. Each data point is assigned a membership in each cluster that depends on the distance between the cluster center and the data point. Data points near a cluster center have a high membership in that cluster.

Each point belongs to a cluster with some degree of belonging defined by a membership grade. The FCM algorithm minimizes an objective function J_FCM that calculates the weighted within-group sum of squared errors, as expressed in Eq. (9):

$$ {J}_{FCM}=\sum \limits_{k=1}^n\sum \limits_{i=1}^c{\left({u}_{ik}\right)}^q{d}^2\left({x}_k,{v}_i\right) $$

(9)

where n is the length of the data, c is the number of clusters, u_ik is the degree of membership of x_k in the ith cluster, q is a weighting exponent on each fuzzy membership (a real number greater than 1) (Chiu 1994), X = (x₁, x₂, …, x_n) is the data in the m-dimensional vector space, v_i is the center of cluster i, and d²(x_k, v_i) is the squared distance between data point x_k and cluster center v_i.

The summary of the FCM algorithm is illustrated by the following steps (Dunn 1973; Bezdek 1981; Benmouiza et al. 2016):

1. Initialize the values c, q, and the error tolerance ε.
2. Initialize the cluster center matrix $ {V}^{\left(t=0\right)}=\left[{v}_i^{\left(t=0\right)}\right] $ and the membership matrix $ {U}^{\left(t=0\right)}=\left[{u}_{ik}^{\left(t=0\right)}\right] $.
3. Increase the iteration counter t and calculate the new c cluster centers V^(t) = [v_i^(t)]:

$$ {v}_i^{(t)}=\frac{\sum \limits_{k=1}^n{\left({u}_{ik}^{(t)}\right)}^q{x}_k}{\sum \limits_{k=1}^n{\left({u}_{ik}^{(t)}\right)}^q} $$

(10)

4. Calculate the new membership values U^(t + 1) = [u_ik^(t + 1)]:

$$ {u}_{ik}^{\left(t+1\right)}=\frac{1}{\sum \limits_{j=1}^c{\left(\frac{d_{ik}}{d_{jk}}\right)}^{2/\left(q-1\right)}} $$

(11)

where d_ik = ‖x_k − v_i‖ and 1 ≤ k ≤ n; 1 ≤ i ≤ c.

5. If ‖U^(t) − U^(t + 1)‖ < ε, stop. Otherwise, increase t and go to step 3.

The FCM algorithm depends strongly on the position of the initialization points. Hence, an important task in the FCM algorithm is choosing the correct number of clusters to avoid falling into a local minimum. In this paper, FCM is used to cluster the input data into groups with similar properties, which are then used in the forecasting phase by the ANFIS model.
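
The five FCM steps listed above can be sketched in plain Python as follows. This is an illustrative version under stated assumptions: the membership matrix is initialized randomly (with a fixed seed for repeatability), the stopping norm is the largest absolute change in any membership, and a point that coincides with a center receives full membership in it. The function name and default values are hypothetical.

```python
import math
import random


def fuzzy_c_means(points, c, q=2.0, eps=1e-5, max_iter=100, seed=0):
    """Cluster `points` (list of equal-length lists) into c fuzzy clusters.

    Returns (centers, u), where u[i][k] is the membership of point k in cluster i.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Step 2: random memberships, normalized so each point's memberships sum to 1
    u = [[rng.random() for _ in range(n)] for _ in range(c)]
    for k in range(n):
        col = sum(u[i][k] for i in range(c))
        for i in range(c):
            u[i][k] /= col
    centers = []
    for _ in range(max_iter):
        # Step 3 (Eq. 10): centers are membership-weighted means of the data
        centers = []
        for i in range(c):
            w = [u[i][k] ** q for k in range(n)]
            total = sum(w)
            centers.append([sum(w[k] * points[k][d] for k in range(n)) / total
                            for d in range(dim)])
        # Step 4 (Eq. 11): update memberships from distances to the centers
        new_u = [[0.0] * n for _ in range(c)]
        for k in range(n):
            dists = [math.dist(points[k], centers[i]) for i in range(c)]
            zero = [i for i, d in enumerate(dists) if d == 0.0]
            if zero:  # point sits exactly on one or more centers
                for i in zero:
                    new_u[i][k] = 1.0 / len(zero)
            else:
                for i in range(c):
                    new_u[i][k] = 1.0 / sum(
                        (dists[i] / dj) ** (2.0 / (q - 1.0)) for dj in dists)
        # Step 5: stop when the memberships no longer change noticeably
        change = max(abs(new_u[i][k] - u[i][k])
                     for i in range(c) for k in range(n))
        u = new_u
        if change < eps:
            break
    return centers, u
```

Applied to the delay vectors of the reconstructed phase space, `centers` would give the cluster prototypes and `u` the membership grades of each vector.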

3.1.7 Subtractive clustering

This technique is applied when there is no clear idea about the number of cluster centers in the data. The subtractive method is an extension of the classification method proposed by Yager (Yager and Filev 1994). In this algorithm, each data point is considered a candidate cluster center, and its potential is calculated by measuring the density of the data points surrounding it. The algorithm is iterative: it chooses the point with the highest potential as the first cluster center, then excludes all the points that lie inside the radius of this center (the radius defines the neighborhood of the center), and recalculates the potential of the other points to determine the next cluster center. This step is repeated until all the data are within the radius of a cluster center. The algorithm can be summarized as follows:

1. Consider a collection of n data points in an m-dimensional space; the data point with the highest potential is selected as the center of the first group.
2. Measure the density index D_i corresponding to data point x_i:

$$ {D}_i=\sum \limits_{j=1}^n\exp \left(-\frac{{\left\Vert {x}_i-{x}_j\right\Vert}^2}{{\left({r}_a/2\right)}^2}\right) $$

(12)

where r_a is a positive number that represents the radius within which all data points are considered neighbors; the data point with the highest density measure is selected as the first cluster center, denoted x_c1, and its density is D_c1.

3. Recalculate the density measure of each data point x_i using Eq. (13):

$$ {D}_i^{\prime }={D}_i-{D}_{c1}\exp \left(-\frac{{\left\Vert {x}_i-{x}_{c1}\right\Vert}^2}{{\left({r}_b/2\right)}^2}\right) $$

(13)

where r_b = K·r_a (K is a positive number, usually K = 1.5 (Chiu 1994; Yager and Filev 1994)). As a consequence, all the points near the first cluster center x_c1 will have a low density measure and will therefore not be considered as the next cluster center. The next cluster center x_c2 is selected after the density measure of each data point is recalculated.

4. Recalculate the density measures of all data points again, and repeat the process until a sufficient number of cluster centers has been generated.
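
The steps above can be sketched as follows, using the density measure with squared distances divided by (r_a/2)² and (r_b/2)². The source does not specify what counts as "a sufficient number" of centers, so the stopping rule here (stop when the highest remaining density falls below a fraction `stop_ratio` of the first peak) is an assumption, as are the function and parameter names. The data are assumed to be scaled so that r_a is meaningful in every dimension.

```python
import math


def subtractive_clustering(points, r_a, k_ratio=1.5, stop_ratio=0.15,
                           max_centers=None):
    """Return cluster centers chosen among `points` by subtractive clustering."""
    r_b = k_ratio * r_a
    alpha = 1.0 / (r_a / 2.0) ** 2
    beta = 1.0 / (r_b / 2.0) ** 2

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Step 2 (Eq. 12): density index of every data point
    density = [sum(math.exp(-alpha * sq_dist(p, q)) for q in points)
               for p in points]
    first_peak = max(density)
    centers = []
    while max_centers is None or len(centers) < max_centers:
        idx = max(range(len(points)), key=density.__getitem__)
        peak = density[idx]
        if peak < stop_ratio * first_peak:
            break  # assumed stopping rule: remaining densities are too low
        centers.append(points[idx])
        # Steps 3 and 4 (Eq. 13): subtract the influence of the new center
        density = [d - peak * math.exp(-beta * sq_dist(p, points[idx]))
                   for d, p in zip(density, points)]
    return centers
```

A smaller r_a tends to produce more centers, which matches the role of the cluster radius as the main tuning parameter of this method.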

3.1.8 Grid partitioning

In this method, the input data space is divided into rectangular subspaces using an axis-parallel partition; each input is partitioned into identically shaped membership functions. The number of fuzzy if-then rules is equal to Mⁿ, where n is the input dimension and M is the number of partitioned fuzzy subsets for each input variable.

The grid is constructed without taking into account any physical meaning or the density distribution of the data, and each part of the grid is used to generate fuzzy rules based on the system input–output training data, which allows fast learning and optimized calculation time. However, the performance of this method depends strongly on the size of the inputs and of the grid; generally, a finer grid leads to higher performance. Adaptive grid partitioning can be used to optimize the size and location of the fuzzy grid regions.
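
To make the Mⁿ rule count concrete, the sketch below places M evenly spaced Gaussian membership functions on each input range and enumerates one rule per combination. The even spacing, the choice of width (half the spacing), and the function name are assumptions for illustration only.

```python
from itertools import product


def grid_partition(ranges, n_mf):
    """Evenly partition each input range into n_mf Gaussian fuzzy sets (n_mf >= 2).

    ranges : list of (low, high) tuples, one per input
    Returns (params, rules): params[i] is a list of (center, sigma) pairs for
    input i, and rules is the list of all fuzzy-set index combinations.
    """
    params = []
    for low, high in ranges:
        step = (high - low) / (n_mf - 1)
        sigma = step / 2.0  # assumed width, chosen so neighboring sets overlap
        params.append([(low + j * step, sigma) for j in range(n_mf)])
    # One rule per combination of fuzzy sets: n_mf ** len(ranges) rules
    rules = list(product(range(n_mf), repeat=len(ranges)))
    return params, rules
```

With three inputs, two fuzzy sets per input give 2³ = 8 rules and three give 3³ = 27, which illustrates how quickly the rule base grows with a finer grid.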

3.2 Forecasting of solar radiation

As shown in Fig. 1, to forecast solar radiation data using the ANFIS model, the hourly solar radiation time series needs to be embedded and delayed in order to obtain the input dataset. Moreover, the k-fold cross validation method is used to choose the proper training and testing datasets and to avoid the over-fitting problem (Kohavi 1995). In this method, the dataset is divided into k subsets; in each trial, one subset is used for testing and the remaining k − 1 subsets are used for training. Then, the average error across all k trials is computed, and the process is repeated until the best training and testing datasets are reached (Klipp et al. 2005). After that, each input is classified using the FCM, subtractive clustering, and grid partitioning methods for different sets of parameters; these classified data are used in the training phase of the ANFIS model. Finally, the model is used to forecast future values of the series.
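
The k-fold splitting described above can be sketched as follows. The folds are kept contiguous (no shuffling), which is an assumption made here so that the time order of the series is preserved inside each fold; the function name is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds.

    Returns a list of (train_indices, test_indices) pairs, one per trial:
    each trial tests on one fold and trains on the other k - 1 folds.
    """
    # Distribute the remainder so fold sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if f < n_samples % k else 0)
                  for f in range(k)]
    splits, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        splits.append((train, test))
        start += size
    return splits
```

The error of a given model configuration would then be averaged over the k trials before comparing configurations.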

3.3 Error metrics and data

Our objective is to choose the best model for an hour-ahead forecasting of hourly solar radiation using different clustering techniques in ANFIS model. Different error calculations are used to evaluate the forecasting accuracy measures of the proposed model. They are summarized as follows:

3.3.1 The root-mean-square error

It allows a term by term comparison of actual deviation between measured and forecasted data; it provides information on the short-term performance of correlations. The model with the lowest RMSE is considered the best:

$$ RMSE=\sqrt{\frac{\sum \limits_{i=1}^n{\left({I}_{i, predicted}-{I}_{i, measured}\right)}^2}{n}} $$

(14)

3.3.2 Forecasting skill

The forecasting skill (FS) measures the accuracy of a forecast relative to a reference model. It is given by Eq. (15); an FS of zero represents no improvement over the reference model and an FS of one represents a perfect forecast (Mazorra Aguiar et al. 2015; Schmidt et al. 2016):

$$ FS=1-\frac{RMSE}{RMSE_{smart}} $$

(15)

where RMSE_smart is the RMSE of the smart persistence model. This model assumes that the clear sky index at time t persists over the forecast horizon h (Inman et al. 2013), as expressed by Eq. (16):

$$ \widehat{GHI}\left(t+h\right)={k}_t^{\ast }(t){GHI}_{clear}\left(t+h\right) $$

(16)

k_t^* represents the clear sky index expressed as follows:

$$ {k}_t^{\ast }=\frac{GHI}{GHI_{clear}} $$

(17)

GHI is the measured hourly global horizontal solar radiation at ground level and GHI_clear is the calculated clear sky hourly solar radiation. In this paper, the W.M.O. model (W.M.O. 1981) is used to determine the clear sky data; this model depends on the solar height (hs) and the Linke turbidity factor (TL) (Tadj et al. 2014).
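
The forecasting skill against a smart persistence reference can be sketched as follows. The sketch assumes the usual reading of smart persistence: the clear sky index observed at time t is held constant and multiplied by the clear sky irradiance at the target time t + h. The clear sky series is taken as a given input (the W.M.O. model itself is not implemented here), and all function names are hypothetical.

```python
import math


def rmse(predicted, measured):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured))
                     / len(measured))


def smart_persistence(ghi, ghi_clear, h=1):
    """Forecast GHI(t + h) as k_t*(t) * GHI_clear(t + h).

    The returned list is aligned with ghi[h:].
    """
    forecasts = []
    for t in range(len(ghi) - h):
        # Clear sky index at time t; set to 0 when the clear sky value is 0
        kt = ghi[t] / ghi_clear[t] if ghi_clear[t] > 0 else 0.0
        forecasts.append(kt * ghi_clear[t + h])
    return forecasts


def forecasting_skill(model_forecast, reference_forecast, measured):
    """FS = 1 - RMSE_model / RMSE_reference (Eq. 15)."""
    return 1.0 - rmse(model_forecast, measured) / rmse(reference_forecast, measured)
```

Here `measured` would be `ghi[h:]`, so that the model forecast, the reference forecast, and the measurements all refer to the same hours.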

3.3.3 R-squared value

It is used as a metric to judge the goodness of the forecast:

$$ {R}^2=1-\left(\frac{\sum \limits_{i=1}^n{\left({I}_{i,\mathrm{measured}}-{I}_{i,\mathrm{predicted}}\right)}^2}{\sum \limits_{i=1}^n{\left({I}_{i,\mathrm{measured}}-{\overline{I}}_{\mathrm{measured}}\right)}^2}\right) $$

(18)
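
For reference, Eqs. (14) and (18) translate directly into a few lines of code. The function names below are hypothetical, and the inputs are assumed to be equal-length sequences of predicted and measured values.

```python
import math


def rmse(predicted, measured):
    """Root-mean-square error (Eq. 14)."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)


def r_squared(predicted, measured):
    """Coefficient of determination (Eq. 18)."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for p, m in zip(predicted, measured))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot
```

A forecast equal to the measurements gives RMSE = 0 and R² = 1, while a forecast that always returns the mean of the measurements gives R² = 0.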

3.3.4 Histogram

It is an estimate of the probability distribution of a continuous variable represented by a graphical display of data using bars of different heights.

4 Simulation results

Our objective is to present the best model based on the clustered ANFIS model for 1-h-ahead GHI forecasting. For that, 4530 h from 1 January 2010 to 31 October 2010 are used to train the model and 915 h (the total forecasted hours) from 1 November 2010 to 31 December 2010 are used to test it. The simulation results of different hybrid combinations are presented below.

4.1 Fuzzy c-means clustering

The first selected method is ANFIS with FCM clustering. Different numbers of clusters, nodes, and fuzzy sets are tested in the simulation. The performance has been evaluated using RMSE, FS, R-squared value, and calculation time. The results are presented in Table 1. In addition, the simulation results of the ANFIS model using FCM clustering are shown in Fig. 2. The figure shows the forecasted data and the measured testing data for November and December 2010; the black dotted line in Fig. 2a represents the forecasted data and the red line represents the measured hourly global solar radiation data. It also shows the error and its histogram. Moreover, Fig. 3 shows the training and testing data versus their fit, with the different R-squared values.

Table 1 Performance analysis of ANFIS with FCM clustering for an hour-ahead forecasting (915 forecasted hours)

From Table 1 and Figs. 2 and 3, we can clearly see the influence of the number of clusters, nodes, and fuzzy rules on the results. Eight clusters are found to be the best configuration, which uses 103 nodes and 8 fuzzy rules. The lowest RMSE is equal to 124.07 W/m² and the FS is 47.46%, with an R-squared value of 0.949. In addition, low numbers of clusters do not give good results because the inputs are poorly partitioned. Moreover, a high number of clusters requires many nodes and fuzzy rules, which increases the calculation time without necessarily improving the forecast.

4.2 Subtractive clustering

In the same way, the methodology is applied to subtractive clustering; different cluster radii in the range of 0.2 to 0.9 are tested. The performance is evaluated on the training and testing data using RMSE and FS. The results are shown in Table 2 and in Fig. 4, which shows the measured testing data and the forecasted data as well as the error and its histogram. Moreover, the R-squared values are shown in Fig. 5.

Table 2 Performance analysis of ANFIS with subtractive clustering for an hour-ahead forecasting (915 forecasted hours)

From Table 2 and Figs. 4 and 5, the lowest RMSE is equal to 124.52 W/m², with an FS of 47.28% for the testing forecasted data. The R-squared value is 0.9450. The number of nodes used is 104 for 8 fuzzy rules. Low radius values do not allow the subtractive clustering ANFIS model to map the data well. However, high radius values increase the difficulty of training and lead to over-fitting or memorizing undesirable inputs.

4.3 Grid partitioning

The results of forecasting the hourly solar radiation using this classification method are shown in Table 3 and Figs. 6 and 7. We have only chosen 2 and 3 partitions per input. This method gives a high RMSE of 144.94 W/m² with an FS equal to 40.62% for the testing forecasted data. Moreover, the R-squared values presented in Fig. 7 show the low performance of this method, in addition to its high computation time. This is because the method needs a small number of membership functions for each input, which is not the case for the chosen solar radiation time series.

Table 3 Performance analysis of ANFIS with grid partitioning for an hour-ahead forecasting (915 forecasted hours)

4.4 Comparison with other models

The first comparison aims to choose the best model among the three introduced clustering methods. Hence, 2 days are selected: one is a clear sky day (22 December 2010) and the other is a cloudy day (23 December 2010), as shown in Figs. 8a and 8b, respectively. Moreover, 11 testing days (from 20 to 31 December 2010) are also used for comparison. The best configurations found in Tables 1, 2, and 3 are used. The comparison results of the forecasted and measured series are shown in Fig. 9. The forecasting skill values for the clear and cloudy days are as follows: FCM (clear sky day 48.29%, cloudy day 37.20%), subtractive clustering (clear sky day 48.21%, cloudy day 35.88%), and grid partitioning (clear sky day 41.30%, cloudy day 32.37%). From these results, we can clearly note that forecasting using ANFIS with FCM clustering gives the best results, with an RMSE equal to 85.4151 W/m² (from Fig. 9). In addition, the three proposed models perform better on clear sky days than on cloudy ones.

Moreover, to evaluate the goodness of the proposed forecast, a comparison between existing models in the literature and the presented hybrid clustered ANFIS models is needed. Hence, two models based on a hybrid methodology are selected. First, the coupled autoregressive and dynamical system (CARDS) model for forecasting solar radiation proposed in Huang et al. (2013) is used. In addition, the hybrid ARMA NAR model presented in Benmouiza and Cheknane (2016) is also chosen. The same dataset used in Huang et al. (2013), for the city of Mildura in 2001 (testing day: 25 January), is used to test the forecasting models. The RMSE results are presented in Table 4.

Table 4 RMSE for testing and measuring data of different models

From Table 4, it is clear that the ANFIS FCM model gives the lowest RMSE, equal to 112 W/m². This result shows the robustness and the goodness of the proposed model for forecasting solar radiation time series.

5 Conclusion

Forecasting of solar radiation is an important key in the field of solar radiation applications where ground measurements are not available. In this paper, we have proposed an improved ANFIS model to forecast hourly solar radiation data over 915 h for the site of Ghardaia, Algeria. The adopted methodology consists of clustering the input data using FCM, subtractive clustering, and grid partitioning. The time delay embedding method is used to understand the underlying dynamics of the input series. It extracts information from this series that helps the classification phase.

The results obtained in this paper confirm the interest of using the ANFIS model for long-term forecasting objectives. Choosing the number of clusters is an important key for forecasting; a low number of clusters leads to a poor partitioning of the inputs. On the other hand, a high number of clusters significantly increases the calculation time and can affect the goodness of the forecast.

The grid partitioning method shows the lowest performance compared to the other methods. This is due to the number of chosen membership functions; a high number of membership functions increases the computation time and affects the forecast. Subtractive clustering gives good results. However, the ANFIS with FCM clustering model is the best one; it uses few membership functions and has a low calculation time. Hence, this method is chosen as the best one for this case. Finally, by comparison with other methods presented in the literature, the proposed clustered ANFIS model can be considered a good method for similar forecasting problems.

References

Azimi R, Ghayekhloo M, Ghofrani M (2016) A hybrid method based on a new clustering technique and multilayer perceptron neural networks for hourly solar radiation forecasting. Energy Convers Manag 118:331–344. https://doi.org/10.1016/j.enconman.2016.04.009
Badescu V, Gueymard CA, Cheval S, Oprea C, Baciu M, Dumitrescu A, Iacobescu F, Milos I, Rada C (2013) Accuracy analysis for fifty-four clear-sky solar radiation models using routine hourly global irradiance measurements in Romania. Renew Energy 55:85–103. https://doi.org/10.1016/j.renene.2012.11.037
Article Google Scholar
Bas E, Egrioglu E, Aladag CH, Yolcu U (2015) Fuzzy-time-series network used to forecast linear and nonlinear time series. Appl Intell 43:343–355. https://doi.org/10.1007/s10489-015-0647-0
Article Google Scholar
Belaid S, Mellit A (2016) Prediction of daily and mean monthly global solar radiation using support vector machine in an arid climate. Energy Convers Manag 118:105–118. https://doi.org/10.1016/j.enconman.2016.03.082
Article Google Scholar
Benmouiza K, Cheknane A (2013) Forecasting hourly global solar radiation using hybrid k-means and nonlinear autoregressive neural network models. Energy Convers Manag 75:561–569. https://doi.org/10.1016/j.enconman.2013.07.003
Article Google Scholar
Benmouiza K, Cheknane A (2016) Small-scale solar radiation forecasting using ARMA and nonlinear autoregressive neural network models. Theor Appl Climatol 124:945–958. https://doi.org/10.1007/s00704-015-1469-z
Article Google Scholar
Benmouiza K, Tadj M, Cheknane A (2016) Classification of hourly solar radiation using fuzzy c-means algorithm for optimal stand-alone PV system sizing. Int J Electr Power Energy Syst 82:233–241. https://doi.org/10.1016/j.ijepes.2016.03.019
Article Google Scholar
Bezdek JC (1981) Pattern recognition with fuzzy objective function algorithms. Springer US, Boston
Book Google Scholar
Boata RS, Gravila P (2012) Functional fuzzy approach for forecasting daily global solar irradiation. Atmos Res 112:79–88. https://doi.org/10.1016/j.atmosres.2012.04.011
Article Google Scholar
Capderou M (1986) Atlas solaire de l’algerie.tome1, Office des. Office des publications universitaires
Chen Y-S, Cheng C-H, Chiu C-L, Huang S-T (2016) A study of ANFIS-based multi-factor time series models for forecasting stock index. Appl Intell 45:277–292. https://doi.org/10.1007/s10489-016-0760-8
Article Google Scholar
Chiu SL (1994) Fuzzy model identification based on cluster estimation. J Intell Fuzzy Syst Appl Eng Technol 2:267–278
Google Scholar
David M, Ramahatana F, Trombe PJ, Lauret P (2016) Probabilistic forecasting of the solar irradiance with recursive ARMA and GARCH models. Sol Energy 133:55–72. https://doi.org/10.1016/j.solener.2016.03.064
Article Google Scholar
Diagne M, David M, Lauret P, Boland J, Schmutz N (2013) Review of solar irradiance forecasting methods and a proposition for small-scale insular grids. Renew Sust Energ Rev 27:65–76. https://doi.org/10.1016/j.rser.2013.06.042
Article Google Scholar
Dunn JC (1973) A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. J Cybern 3:32–57. https://doi.org/10.1080/01969727308546046
Article Google Scholar
Flores JJ, Graff M, Rodriguez H (2012) Evolutive design of ARMA and ANN models for time series forecasting. Renew Energy 44:225–230. https://doi.org/10.1016/j.renene.2012.01.084
Article Google Scholar
Fraser A, Swinney H (1986) Independent coordinates for strange attractors from mutual information. Phys Rev A Gen Phys 33:1134–1140
Article Google Scholar
Gan M, Huang Y, Ding M, Dong XP, Peng JB (2012) Testing for nonlinearity in solar radiation time series by a fast surrogate data test method. Sol Energy 86:2893–2896. https://doi.org/10.1016/j.solener.2012.04.021
Article Google Scholar
Giotis AP, Giannakoglou KC (1998) An unstructured grid partitioning method based on genetic algorithms. Adv Eng Softw 29:129–138. https://doi.org/10.1016/S0965-9978(98)00014-3
Article Google Scholar
Huang J, Korolkiewicz M, Agrawal M, Boland J (2013) Forecasting solar radiation on an hourly time scale using a coupled autoregressive and dynamical system (CARDS) model. Sol Energy 87:136–149. https://doi.org/10.1016/j.solener.2012.10.012
Article Google Scholar
Inman RH, Pedro HTC, Coimbra CFM (2013) Solar forecasting methods for renewable energy integration. Prog Energy Combust Sci 39:535–576. https://doi.org/10.1016/j.pecs.2013.06.002
Article Google Scholar
Jang J-SR (1993) ANFIS: adaptive-network-based fuzzy inference system. IEEE Trans Syst Man Cybern 23:665–685. https://doi.org/10.1109/21.256541
Article Google Scholar
Ji W, Chee KC (2011) Prediction of hourly solar radiation using a novel hybrid model of ARMA and TDNN. Sol Energy 85:808–817. https://doi.org/10.1016/j.solener.2011.01.013
Article Google Scholar
Kaplanis S, Kaplani E (2007) A model to predict expected mean and stochastic hourly global solar radiation I(h;nj) values. Renew Energy 32:1414–1425. https://doi.org/10.1016/j.renene.2006.06.014
Article Google Scholar
Kashyap Y, Bansal A, Sao AK (2015) Solar radiation forecasting with multiple parameters neural networks. Renew Sust Energ Rev 49:825–835. https://doi.org/10.1016/j.rser.2015.04.077
Article Google Scholar
Kennel MB, Brown R, Abarbanel HDI (1992) Determining embedding dimension for phase-space reconstruction using a geometrical construction. Phys Rev A 45:3403–3411. https://doi.org/10.1103/PhysRevA.45.3403
Article Google Scholar
Klipp E, Herwig R, Kowald A, Wierling C, Lehrach H (2005) Systems biology in practice. Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim
Book Google Scholar
Kohavi R (1995) A study of cross-validation and bootstrap for accuracy estimation and model selection. 1137–1143
MacQueen J (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, volume 1: statistics. The regents of the University of California
Mazorra Aguiar L, Pereira B, David M, Díaz F, Lauret P (2015) Use of satellite data to improve solar radiation forecasting with Bayesian artificial neural networks. Sol Energy 122:1309–1324. https://doi.org/10.1016/j.solener.2015.10.041
Article Google Scholar
Mellit A (2008) Artificial intelligence technique for modelling and forecasting of solar radiation data: a review
Olatomiwa L, Mekhilef S, Shamshirband S, Petković D (2015) Adaptive neuro-fuzzy approach for solar radiation prediction in Nigeria. Renew Sust Energ Rev 51:1784–1791. https://doi.org/10.1016/j.rser.2015.05.068
Article Google Scholar
Peled A, Appelbaum J (2013) Evaluation of solar radiation properties by statistical tools and wavelet analysis. Renew Energy 59:30–38. https://doi.org/10.1016/j.renene.2013.03.019
Article Google Scholar
Qazi A, Fayaz H, Wadi A, Raj RG, Rahim NA, Khan WA (2015) The artificial neural network for solar radiation prediction and designing solar systems: a systematic literature review. J Clean Prod 104:1–12. https://doi.org/10.1016/j.jclepro.2015.04.041
Article Google Scholar
Rand D, Young L-S (eds) (1981) Dynamical systems and turbulence, Warwick 1980. Springer Berlin Heidelberg, Berlin
Google Scholar
Ren Y, Suganthan PN, Srikanth N (2015) Ensemble methods for wind and solar power forecasting—a state-of-the-art review. Renew Sust Energ Rev 50:82–91. https://doi.org/10.1016/j.rser.2015.04.081
Article Google Scholar
Rout M, Majhi B, Majhi R, Panda G (2014) Forecasting of currency exchange rates using an adaptive ARMA model with differential evolution based training. J King Saud University 26:7–18. https://doi.org/10.1016/j.jksuci.2013.01.002
Google Scholar
Schmidt T, Kalisch J, Lorenz E, Heinemann D (2016) Evaluating the spatio-temporal performance of sky-imager-based solar irradiance analysis and forecasts. Atmos Chem Phys 16:3399–3412. https://doi.org/10.5194/acp-16-3399-2016
Article Google Scholar
Simon HD (1991) Partitioning of unstructured problems for parallel processing. Comput Syst Eng 2:135–148. https://doi.org/10.1016/0956-0521(91)90014-V
Article Google Scholar
Suganthi L, Iniyan S, Samuel AA (2015) Applications of fuzzy logic in renewable energy systems – a review. Renew Sust Energ Rev 48:585–607. https://doi.org/10.1016/j.rser.2015.04.037
Article Google Scholar
Tadj M, Benmouiza K, Cheknane A, Silvestre S (2014) Improving the performance of PV systems by faults detection using GISTEL approach. Energy Convers Manag 80:298–304. https://doi.org/10.1016/j.enconman.2014.01.030
Article Google Scholar
W.M.O (1981) Meteorological aspects of the utilization of solar radiation as an energy source, illustrate. Secretariat of the World Meteorological Organization
Yager RR, Filev DP (1994) Generation of fuzzy rules by mountain clustering. J Intell Fuzzy Syst Appl Eng Technol 2:209–219
Google Scholar
Zhang GP (2003) Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing 50:159–175. https://doi.org/10.1016/S0925-2312(01)00702-0
Article Google Scholar
Zhang G, Eddy Patuwo B, Hu MY (1998) Forecasting with artificial neural networks. Int J Forecast 14:35–62. https://doi.org/10.1016/S0169-2070(97)00044-7
Article Google Scholar

Download references

Author information

Authors and Affiliations

Laboratoire des Semi-conducteurs et Matériaux Fonctionnels, Université Amar Telidji de Laghouat, Bd des Martyrs, BP 37G, 03000, Laghouat, Algeria
Khalil Benmouiza & Ali Cheknane

Corresponding author

Correspondence to Khalil Benmouiza.

About this article

Cite this article

Benmouiza, K., Cheknane, A. Clustered ANFIS network using fuzzy c-means, subtractive clustering, and grid partitioning for hourly solar radiation forecasting. Theor Appl Climatol 137, 31–43 (2019). https://doi.org/10.1007/s00704-018-2576-4

Received: 06 January 2017
Accepted: 14 July 2018
Published: 01 August 2018
Issue Date: 01 July 2019
DOI: https://doi.org/10.1007/s00704-018-2576-4
