
1 Introduction

The main reservation around large-scale adoption of wind energy is its intermittency, as energy production is directly dependent on uncertain future atmospheric conditions [3]. Forecasting of short-term wind energy production has thus become critical to operations management and planning for electricity suppliers [11]. Such forecasts can then be used for proactive reserve management and energy market trading to ensure that electricity load demands are optimally met [11, 13].

Statistical and machine learning methods have become increasingly prominent in wind power forecasting. These are mainly based on using historical wind power output to refine Numerical Weather Predictions (NWP) into localised predictions for the wind farms in question. Adaptive Neuro-Fuzzy Inference Systems (ANFIS) have also been prominent in the wind power prediction literature. The authors in [4] propose a two-stage ANFIS in which the first stage refines the NWP wind speeds while the second uses the refined wind speed estimate for wind power prediction. The authors in [9] compare ANFIS and Artificial Neural Networks (ANNs) for power prediction. Other techniques have included Bayesian Neural Networks [11], Classification and Regression Trees, Support Vector Regression and Random Forests [5]. A comprehensive review of the current state-of-the-art methods in wind power forecasting is given in [17].

There has also been work on creating hybrids between the genetic algorithm (GA) and particle swarm optimisation (PSO). A hybrid algorithm is proposed in [6], where randomly selected particles from the PSO are evolved using the GA and returned to the PSO for further optimisation. The authors in [19] propose randomly partitioning the particle population into sections where PSO and GA are applied independently; the results are then combined and re-partitioned after every iteration.

The objective of this paper is to investigate two hybrid GA-PSO algorithms, which we then apply to training an ANFIS model for one-hour ahead wind power prediction. The rest of the paper is arranged as follows: Sect. 2 introduces our proposed hybrid GA-PSO algorithms and sets out the ANFIS optimisation problem. Section 3 gives the experiment setup. Section 4 provides and discusses the results of the experiments. Section 5 gives the conclusions and possible future work.

2 Methods

In this paper, two hybrid methods based on GA and PSO are investigated for predicting one-hour ahead wind power production using an ANFIS model.

2.1 Genetic Algorithm

The GA is inspired by the biological process of natural selection. In the GA, candidate solutions are individuals within a population, and each individual's fitness is evaluated using a cost function. At each iteration, three evolutionary procedures, selection, crossover and mutation, are executed in order to search for a global minimum. First, certain individuals are sampled from the population (the selection step) for crossover, where parts of the selected individuals are randomly exchanged. Next, another set of randomly selected individuals is mutated. In this paper, a continuous GA is used to optimise the parameters of the ANFIS model, where mutation is performed by adding Gaussian noise to randomly selected parts of the vector of unknown parameters. The process continues until the algorithm converges or a specified maximum number of iterations is executed.
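To make the procedure concrete, the following is a minimal sketch of a continuous GA with elitist selection, uniform crossover and Gaussian mutation; the selection scheme, operator rates and search bounds are illustrative assumptions, not the exact settings used in this paper.

```python
import numpy as np

def continuous_ga(cost, dim, pop_size=40, max_iter=1500,
                  crossover_rate=0.8, mutation_rate=0.1, sigma=0.1, rng=None):
    """Minimise `cost` over real-valued parameter vectors of length `dim`."""
    rng = np.random.default_rng() if rng is None else rng
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, dim))     # candidate parameter vectors
    for _ in range(max_iter):
        fitness = np.array([cost(ind) for ind in pop])
        pop = pop[np.argsort(fitness)]                      # lower cost = fitter
        elite = pop[: pop_size // 2]                        # selection: keep the fitter half
        children = []
        while len(children) < pop_size - len(elite):
            p1, p2 = elite[rng.integers(len(elite), size=2)]
            if rng.random() < crossover_rate:               # uniform crossover on random genes
                mask = rng.random(dim) < 0.5
                child = np.where(mask, p1, p2)
            else:
                child = p1.copy()
            mutate = rng.random(dim) < mutation_rate        # Gaussian mutation on random genes
            child[mutate] += rng.normal(0.0, sigma, size=mutate.sum())
            children.append(child)
        pop = np.vstack([elite, np.array(children)])
    fitness = np.array([cost(ind) for ind in pop])
    return pop[np.argmin(fitness)]
```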

2.2 Particle Swarm Optimisation

The PSO is one of the most recognised meta-heuristic optimisation algorithms and is inspired by the flocking behaviour of birds in search of food. In the PSO algorithm, each particle in a swarm is considered a candidate solution of the optimisation problem. The position of each particle is updated in each iteration using the following equation:

$$\begin{aligned} P_{i+1}= P_{i}+ V_{i+1} \end{aligned}$$
(1)

Here \(V_{i+1}\) is the particle's velocity, which is updated by:

$$\begin{aligned} V_{i+1}= w_0V_i + c_1r_1(P_{best}-P_i)+c_2r_2(G_{best}-P_i) \end{aligned}$$
(2)

where \(w_0\) is the inertia weight which maintains the previous velocity, \(c_1\) is the particle's acceleration constant towards its personal best solution \(P_{best}\), and \(c_2\) is the acceleration constant towards the best known position \(G_{best}\) amongst all particles. \(r_1\) and \(r_2\) are randomly selected from a uniform distribution U(0, 1) to add randomness to the search space exploration. These updates continue until the algorithm converges or a specified maximum number of iterations is executed.
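The two update rules in Eqs. (1) and (2) can be sketched directly in code; the inertia and acceleration constants below are illustrative values rather than the settings used in the experiments.

```python
import numpy as np

def pso(cost, dim, n_particles=40, max_iter=1500,
        w0=0.7, c1=1.5, c2=1.5, rng=None):
    """Minimise `cost` using the standard PSO updates of Eqs. (1)-(2)."""
    rng = np.random.default_rng() if rng is None else rng
    P = rng.uniform(-1.0, 1.0, size=(n_particles, dim))     # positions
    V = np.zeros_like(P)                                    # velocities
    p_best = P.copy()
    p_best_cost = np.array([cost(p) for p in P])
    g_best = p_best[np.argmin(p_best_cost)].copy()
    for _ in range(max_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        V = w0 * V + c1 * r1 * (p_best - P) + c2 * r2 * (g_best - P)   # Eq. (2)
        P = P + V                                                       # Eq. (1)
        costs = np.array([cost(p) for p in P])
        improved = costs < p_best_cost
        p_best[improved] = P[improved]
        p_best_cost[improved] = costs[improved]
        g_best = p_best[np.argmin(p_best_cost)].copy()
    return g_best
```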

2.3 Proposed Approach

Generally, the main issue with the GA is its lack of memory: information contained in candidate solutions that are not selected for crossover (or mutation) may be lost to future generations [6]. In this paper, two hybrid methods combining GA and PSO are proposed so that the GA can be improved by the memory and social learning elements of the PSO.

  1. GA with PSO Crossover (GA-PSO): Here we adapt the GA crossover by probabilistically alternating between the standard GA crossover and the PSO velocity updates. A random number R(i) is drawn from a U(0, 1) distribution; if R(i) exceeds a threshold T, the PSO updates are performed instead of the GA crossover. This algorithm is shown in Fig. 1.

  2. GA with PSO initialisation (GAPSO-I): Here we run the PSO algorithm for a limited number of iterations and use the best particle obtained by the PSO as one of the individuals that initialise the GA population. This is most similar to the algorithm proposed in [18]; however, we use only one particle from the PSO rather than all of the M best particles in the GA initialisation. We believe that using just the best PSO particle alongside other random population members widens the search space of the GA, whereas using M particles could localise the search. A minimal sketch of both strategies is given after this list.
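The sketch below illustrates the two hybrid strategies, assuming a flat parameter-vector encoding and reusing the pso routine from the earlier sketch; the threshold T, the number of PSO warm-up iterations and the helper names are assumptions for exposition, not the authors' exact implementation.

```python
import numpy as np

def gapso_crossover(parent1, parent2, velocity, p_best, g_best,
                    T=0.5, w0=0.7, c1=1.5, c2=1.5, rng=None):
    """Return one offspring: standard GA crossover or a PSO-style update, chosen at random."""
    rng = np.random.default_rng() if rng is None else rng
    if rng.random() > T:                        # R(i) > T: apply the PSO velocity update instead
        r1, r2 = rng.random(parent1.size), rng.random(parent1.size)
        velocity = w0 * velocity + c1 * r1 * (p_best - parent1) + c2 * r2 * (g_best - parent1)
        return parent1 + velocity
    mask = rng.random(parent1.size) < 0.5       # otherwise: standard uniform crossover
    return np.where(mask, parent1, parent2)

def gapso_i_init(cost, dim, pop_size=40, pso_iters=100, rng=None):
    """Seed the GA population with the best PSO particle plus random individuals (GAPSO-I)."""
    rng = np.random.default_rng() if rng is None else rng
    best = pso(cost, dim, n_particles=pop_size, max_iter=pso_iters, rng=rng)  # from the PSO sketch
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, dim))
    pop[0] = best                               # a single PSO-refined individual
    return pop
```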

Fig. 1. Flowchart showing the GAPSO algorithm.

Fig. 2. Simple ANFIS architecture with two inputs and two rules.

2.4 Adaptive Neuro-Fuzzy Inference Systems

ANFIS is a class of Fuzzy Inference System (FIS) that adaptively adjusts its membership functions and consequent parameters based on training data. Figure 2 shows the ANFIS architecture proposed in [8]. ANFIS consists of five consecutive layers that sequentially process information from the inputs towards the output. These five layers operate as follows:

Layer 1 is a fuzzification layer, where crisp inputs are converted into fuzzy set membership values. This is done using membership functions (MFs), which are bounded in the range [0, 1]. The output of the \(j^{th}\) node in this layer will be of the form:

$$\begin{aligned} O_j^1=\mu _{A_j}(x) \quad \text {j=1, 2} \end{aligned}$$
(3)

where \(\mu _{A_j}(x)\) is the MF. In this paper, a Gaussian MF, as described in Eq. 4, is selected for the modelling process.

$$\begin{aligned} \mu _{A_j}(x) = \exp {\bigg (-\frac{(x-p_j)^2}{2\alpha _j^2}\bigg )} \end{aligned}$$
(4)

Layer 2 combines the incoming signals from the fuzzy sets in the previous layer using a T-norm operator. The result of this operation is the combined firing strength of each rule. If the chosen T-norm operator is multiplication then the output of the \(j^{th}\) node in this layer is:

$$\begin{aligned} O_j^2= w_j=\mu _{A_j}(x_1) \times \mu _{B_j}(x_2) \quad \text {j=1, 2} \end{aligned}$$
(5)

Layer 3 is a normalisation layer where the relative firing strength of each rule is calculated as the ratio of its firing strength \(w_j\) to the sum of the firing strengths of all rules. The normalised firing strength of the \(j^{th}\) node in this layer will be:

$$\begin{aligned} O_j^3 = \bar{w_j} =\frac{w_j}{w_1+w_2} \quad \text {j=1, 2}\ \end{aligned}$$
(6)

Layer 4 calculates the consequent part of a Takagi-Sugeno type FIS. The result is a linear combination of the inputs for each rule, weighted by its respective normalised firing strength \(\bar{w_j}\). This weighted linear combination is of the form:

$$\begin{aligned} O_j^4= \bar{w_j}f_j=\bar{w_j}(a_jx_1+b_jx_2+c_j) \end{aligned}$$
(7)

where \(a_j\), \(b_j\) and \(c_j\) are the unknown consequent parameters.

Layer 5 performs an aggregation of the consequent values evaluated in the previous layer as a weighted average. The final output is therefore:

$$\begin{aligned} O^5= \sum _j{\bar{w}_j f_j} \end{aligned}$$
(8)

The unknown parameters of the MFs in layer 1 and the linear coefficients in layer 4 need to be estimated from training data. Multiple optimisation techniques have been used for tuning these parameters [15]. A two-step process is suggested in [8] where the linear consequent parameters are optimised using Least Squares Estimation (LSE) in the forward pass, while the MF parameters are optimised using gradient descent in the backward pass. The authors in [14] use PSO for the MF parameters and LSE for the consequent parameters. PSO is used for both the MF and consequent parameters in [16], while the genetic algorithm is used for both sets of parameters in [2]. In this paper, two algorithms based on the combination of GA and PSO are proposed for estimating these parameters.
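As an illustration of how the five layers combine into a single function of the unknown parameters, the following is a compact forward pass for a two-input ANFIS of the kind shown in Fig. 2; the flat packing of MF centres, widths and consequent coefficients into one vector is an assumption made here so that the GA/PSO variants can operate on it directly.

```python
import numpy as np

def anfis_forward(params, X, n_rules=2):
    """Evaluate a two-input ANFIS; X has shape (N, 2), params packs all unknowns."""
    n_inputs = X.shape[1]
    n_mf = n_rules * n_inputs                              # one Gaussian MF per input per rule
    centres = params[:n_mf].reshape(n_rules, n_inputs)     # p_j in Eq. (4)
    widths = params[n_mf:2 * n_mf].reshape(n_rules, n_inputs)   # alpha_j in Eq. (4)
    conseq = params[2 * n_mf:].reshape(n_rules, n_inputs + 1)   # a_j, b_j, c_j in Eq. (7)
    # Layers 1-2: Gaussian memberships and rule firing strengths (product T-norm)
    mu = np.exp(-((X[:, None, :] - centres) ** 2) / (2.0 * widths ** 2))
    w = mu.prod(axis=2)                                    # shape (N, n_rules)
    # Layer 3: normalised firing strengths
    w_bar = w / w.sum(axis=1, keepdims=True)
    # Layers 4-5: rule consequents and weighted aggregation
    f = X @ conseq[:, :-1].T + conseq[:, -1]               # shape (N, n_rules)
    return (w_bar * f).sum(axis=1)
```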

3 Experiments

In this section, the relative performance of the proposed hybrid methods is investigated when training an ANFIS for predicting one-hour ahead wind power production on the Norwegian wind farm dataset.

3.1 Dataset

The Norwegian wind farm dataset consists of 7384 records covering the period from January 2014 to December 2016 [11]. The dataset features include the wind farm online capacity, one- and two-hour lagged historical power production values, as well as NWP estimates of humidity, temperature and wind speed.

We randomly split the data into 70% for training and 30% for testing. The model performance is evaluated on the basis of Root Mean Square Error (RMSE) as defined in Eq. 9 for observation series T and corresponding model prediction series Y.

$$\begin{aligned} RMSE=\sqrt{\frac{1}{N}\sum ^{N}_{i=1}\big (T_i-Y_i\big )^2} \end{aligned}$$
(9)
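Wrapping the forward pass from Sect. 2.4 in the RMSE of Eq. 9 gives the cost function that each optimiser minimises during training. The sketch below shows this wiring; the anfis_forward helper is the one from the earlier sketch and X_train/y_train are placeholders for the training split of the Norwegian data.

```python
import numpy as np

def rmse(targets, predictions):
    """Root Mean Square Error as defined in Eq. (9)."""
    return np.sqrt(np.mean((targets - predictions) ** 2))

def make_cost(X_train, y_train):
    """Return the cost function passed to the GA, PSO or hybrid optimisers."""
    def cost(params):
        return rmse(y_train, anfis_forward(params, X_train))
    return cost
```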

3.2 Model Setup

An ANFIS with 3 Gaussian MFs for each input was trained using the following methods:

  1. The hybrid backpropagation least squares (BP-LS) method of [8]

  2. The normal GA [2]

  3. The proposed GAPSO

  4. The GAPSO-I

Table 1. List of additional parameters

In order to account for the stochastic effects of random initialisation, we repeat the training of each algorithm 30 times. This also allows us to perform statistical significance tests on the results using a non-parametric Kruskal-Wallis (KW) test [12] and a post-hoc Bonferroni [7] test for pairwise comparisons. In all cases, the ANFIS is first initialised using Fuzzy C-Means clustering [1]. Training for each of the algorithms is capped at a maximum of 1500 iterations. A population size of 40 is used for the GA, GAPSO and GAPSO-I. Table 1 shows a list of the additional parameter settings.
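The evaluation protocol can be sketched as follows, assuming the per-method test RMSE values from the 30 trials have already been collected; the choice of Mann-Whitney U as the Bonferroni-corrected pairwise test is an assumption made for illustration, since the paper does not name the pairwise statistic.

```python
from itertools import combinations
from scipy.stats import kruskal, mannwhitneyu

def compare_methods(rmse_by_method, alpha=0.05):
    """rmse_by_method: dict mapping method name -> sequence of 30 test RMSE values."""
    names = list(rmse_by_method)
    h_stat, p_value = kruskal(*rmse_by_method.values())        # omnibus KW test
    print(f"Kruskal-Wallis: H={h_stat:.2f}, p={p_value:.2e}")
    pairs = list(combinations(names, 2))
    corrected_alpha = alpha / len(pairs)                        # Bonferroni correction
    for a, b in pairs:
        _, p = mannwhitneyu(rmse_by_method[a], rmse_by_method[b])
        verdict = "significant" if p < corrected_alpha else "not significant"
        print(f"{a} vs {b}: p={p:.3e} ({verdict})")
```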

4 Results and Discussions

Figures 3 and 4 show the results of the experiments described in Sect. 3.

Fig. 3. Boxplot showing the distribution of testing RMSE from the 30 trials of each method.

Fig. 4. Left: mean predictions from the GAPSO and the true targets for each sample in the testing set. Right: mean predictions from the GAPSO-I and the true targets for each sample in the testing set.

The boxplot in Fig. 3 shows the distribution of testing RMSE from the different models. It can be seen from the plot that the evolutionary techniques show lower mean testing RMSE, with the GAPSO-I achieving the lowest mean RMSE of 2919.60 kWh.

The results also show that the testing RMSE values of the evolutionary techniques have greater variation than those of the BP-LS, since these algorithms contain more stochastic elements than just the random initialisation of the backpropagation in the BP-LS.

We also see from Table 2 that while the BP-LS displays the lowest training RMSE of 2955.34 kWh, it displays the highest mean testing RMSE of 2971.57 kWh. This indicates the common problem of over-fitting to the training data displayed by gradient-based methods [10].

Table 2. Results showing the mean RMSE from 30 trials of the ANFIS trained by the various optimisation algorithms for one-hour ahead wind power prediction.

A KW statistical test performed on the testing RMSE gives a p-value of \(1.07e{-12}\), indicating that the differences in model performance are statistically significant. A further Bonferroni test for pair-wise differences shows that the difference between the testing RMSE of the evolutionary techniques and that of the BP-LS is statistically significant in all cases at a significance level of \(\alpha =0.05\). The Bonferroni test also showed that the GAPSO-I had a significantly lower RMSE than all the other methods.

5 Conclusions and Future Work

In this paper, two hybrid GA-PSO techniques were proposed for optimising ANFIS parameters. We applied these techniques to predicting one-hour ahead wind power production on the Norwegian wind farm dataset. The results showed that both hybrid methods produce statistically significantly lower RMSE than the traditional BP-LS. Furthermore, the GAPSO-I significantly outperformed the normal GA and the GAPSO.

Future improvements to this work could explore longer horizons in wind power prediction, such as 24 h ahead [11]. An ANFIS version of a two-stage wind power modelling framework, such as that in [20], could also be adopted. The effect of different types of membership functions on the predictive performance could also be explored.