1 Introduction

Renewable energy has shown promising growth in recent years due to its sustainability, environmentally friendly nature, and abundant availability as a source of electric energy [22]. Among various types of renewable energy, wind power has demonstrated remarkable growth as one of the most effective strategies to combat climate change and meet greenhouse gas emission targets in many countries. Governments and researchers strongly encourage the production and consumption of wind energy [22]. Accurately quantifying the amount of renewable energy production, particularly wind energy generation, is crucial for the safe integration of renewable energy into the grid system and to enhance efficient power grid operation [3]. However, wind power generation is inherently random, nonlinear, non-stationary, and highly intermittent, making its integration with the grid system challenging.

Despite significant renewable energy potential in Ethiopia, including hydroelectric, wind, and geothermal energy, current energy production is limited, and the energy supply falls short of rising energy consumption demands [24]. Furthermore, the absence of electric load estimation and modeling methods contributes to energy fluctuations and power interruptions that affect electric energy transmission and distribution systems [15]. As a result, energy outages and power interruptions affect all customer categories, increasing defensive expenditures due to unreliable and unstable energy supply. Therefore, accurate prediction of wind power generation can play a key role in improving the reliability and stability of the power system [7] and enable safe integration of produced wind power into the grid system [4].

Recently, deep learning has shown remarkable performance in various applications, including renewable energy forecasting, due to its ability to handle nonlinear, non-stationary, spatiotemporal data generated from energy systems. Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), Deep Belief Networks (DBN), and Multilayer Perceptron (MLP) are among the well-known and widely used deep learning algorithms, with Long Short-Term Memory (LSTM) standing out in the context of time series forecasting [6].

Various artificial intelligence techniques have been applied to improve the accuracy of energy forecasting in the renewable energy sector. One example is the Bayesian optimization-based artificial neural network model developed in [19]. Additionally, the precision of short-term forecasting has been enhanced by hybrid deep learning models that integrate several neural network architectures. Several studies have shown that these hybrid models outperform single models in various applications [11, 19, 20], and they have been successful in accurately predicting wind speed and evapotranspiration [7, 27].

This paper evaluates, for the first time, the effectiveness of a hybrid 1D-CNN and LSTM model for forecasting wind power time series data in Ethiopia. The model leverages the CNN's ability to extract features from nonlinear wind power time series and the LSTM's ability to learn temporal dependencies in the data. The paper's contributions can be summarized as follows:

  1. The optimal hyperparameters were determined using a Bayesian optimization algorithm to obtain the best performance of the proposed model.

  2. A CNN-LSTM hybrid model was developed for day-ahead wind power forecasting by using the effective feature extraction capabilities of 1D-CNN and the forecast generalization of LSTM models.

  3. The effectiveness of the hybrid CNN-LSTM model against baseline models such as 1D-CNN, LSTM, ANN, and BiLSTM is verified for wind power forecasting using the metrics of mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE).

2 Related Works

Industries and institutions can generate a considerable volume of data on their day-to-day operations by introducing sensor devices [18]. The energy sector, including renewable energy, is among the sectors that record data at regular intervals, such as customer energy consumption per minute, hour, or day. Furthermore, wind farms' Supervisory Control and Data Acquisition (SCADA) systems collect data related to wind power at specified time intervals. This type of data is known as time series data, representing a series of periodic measurements of a variable. Specifically, time series data generated from energy systems is nonlinear and non-stationary, exhibiting not only temporal correlation but also spatial patterns [8].

The energy sector, particularly in the field of renewable energy forecasting, has made significant advancements with the utilization of Artificial Intelligence (AI) methods such as machine learning and deep learning techniques [1, 21]. These algorithms have been extensively used to forecast various weather parameters. Furthermore, a combination of different AI techniques has emerged as a preferred approach in recent times to develop models that perform better than individual models.

Several studies have shown that hybrid deep learning models outperform individual or single models [10]. For example, Goh et al. [5] investigated a hybrid of a one-dimensional convolutional neural network (1D-CNN) and a long short-term memory (LSTM) network. They obtained an improvement of 16.73% for single-step prediction and 20.33% for 24-step load prediction. Additionally, [11] implemented a hybrid 1D-CNN and BiLSTM model to enhance wind speed prediction accuracy and address uncertainty modeling issues. Results indicated that the proposed hybrid approach achieved a 42% improvement over reference approaches. Another study by [9] proposed the combination of Ensemble Empirical Mode Decomposition (EEMD) and a BiDLSTM system for accurate wind speed forecasting.

Furthermore, Wang et al. [25] proposed a 3-hour ahead average wind power prediction method based on a convolutional neural network. The authors in [26] introduced a deep learning approach based on a pooling long short-term memory (LSTM) based convolutional neural network to predict short- and medium-term electric consumption. Results revealed that the proposed method improved short- and medium-term load forecasting performance. Authors in [20] developed the CNN-LSTM-LightGBM-based short-term wind power prediction model by considering various environmental factors. Moreover, a hybrid deep learning model to accurately forecast the very short-term (5-min and 10-min) wind power generation of the Boco Rock wind farm in Australia was proposed by Hossain et al. [7]. However, the authors used the Harris Hawks Optimization algorithm to improve the proposed model.

Another hybrid model was introduced by J. Yin et al. [27] to forecast short-term (1–7-day lead time) evapotranspiration (ET0). The authors used a hybrid model that combines BiLSTM and ANN, with three meteorological variables (maximum temperature, minimum temperature, and sunshine duration) as inputs. The authors reported that this model gave the best forecast performance for short-term daily ET0.

Lu et al. [14] proposed a hybrid model based on a convolutional neural network and long short-term memory network (CNN-LSTM) for short-term load forecasting (STLF). The authors noted that forecasting accuracy can be notably improved. T. Li et al. [13] introduced a hybrid CNN-LSTM model to forecast the next 24-hour PM2.5 concentration in Beijing, China. Results indicated that the proposed multivariate CNN-LSTM model achieved the best results, with low error and shorter training time.

3 Methods

This section presents the main parts of the methodology behind the proposed hybrid deep learning model, the CNN-LSTM model, for wind power forecasting. The main features of the methodology are presented in Fig. 1.

In particular, this section describes the selected deep learning algorithms, CNN and LSTM, used to build the hybrid model, as well as data preprocessing, hyperparameter tuning, model training, and evaluation. The study uses three wind power datasets generated from different wind farm sites as case study data. Each dataset is divided into training and test sets while maintaining the order of the time series data. The first three years of data are used to build and fit the model, and the final year of data is used to assess the model's performance. Hyperparameter tuning is performed on the training data using a k-fold cross-validation approach.
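The order-preserving split and the framing of a series as supervised samples can be sketched as follows. This is only an illustration under stated assumptions: the 75% training fraction (three of four years), the lag length `n_lags`, the helper names, and the toy series are hypothetical and are not taken from the study.

```python
import numpy as np

def chronological_split(series, train_fraction=0.75):
    """Split a series into train/test parts without shuffling (order is kept)."""
    split_at = int(len(series) * train_fraction)
    return series[:split_at], series[split_at:]

def make_windows(series, n_lags):
    """Turn a 1-D series into lagged inputs and next-step targets.

    Returns X with shape (samples, n_lags) and y with shape (samples,),
    where y[i] is the value that immediately follows the window X[i].
    """
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

# Toy daily series standing in for wind power (not the paper's data).
series = np.arange(20, dtype=float)
train, test = chronological_split(series, train_fraction=0.75)
X_train, y_train = make_windows(train, n_lags=3)
```

With a daily series, each target is the value one day after its input window, which corresponds to a day-ahead forecast in this simplified setting.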

3.1 Deep Learning Models

Deep learning algorithms have emerged as one of the most widely used approaches in artificial intelligence in recent years. One of the key advantages of deep learning is its ability to automatically learn features and extract multi-level abstract representations from complex data sets, setting it apart from other machine learning models. In particular, deep learning models such as those discussed in this section have demonstrated better performance in handling large and complex data, including image processing, pattern extraction, classification, and time series forecasting [12].

RNNs are a type of deep learning method that is particularly effective in handling large datasets containing temporal dependencies. RNNs can learn sequential data by recursively applying operations during the forward pass and using backpropagation through time for learning. As such, RNNs have been studied for many real-world applications that generate sequential and time series data, including speech synthesis, natural language processing, and image captioning [17]. However, a challenge for RNNs is long-term dependency, which leads to the vanishing gradient problem as the gap between relevant information and the point where it is needed grows [17]. To address the limitations of RNNs, LSTM and Gated Recurrent Unit (GRU) techniques were introduced. These methods can retain information over long periods, allowing for processing complex and sequential data. LSTM is especially notable for its ability to handle time series forecasting. The two deep learning algorithms that make up the hybrid CNN-LSTM model, CNN and LSTM, are briefly described below.

Fig. 1. General scheme of the methodology including CNN-LSTM hybrid model.

CNN. This type of deep learning algorithm mimics the human visual perception system. CNNs have become the most widely used and extensively studied deep learning method for tasks such as computer vision, image segmentation, classification, and natural language processing, demonstrating remarkable performance [5]. Additionally, CNNs have recently gained attention from researchers as a solution for time series forecasting problems such as wind power and solar radiation [8].

LSTM. This type of deep learning algorithm was developed to solve the vanishing gradient problem in RNN by introducing an efficient memory cell that can handle long-term dependencies [2]. The memory cells in LSTM networks can retain the previous information for the next learning step. In addition to the cell unit, LSTM includes three gate structures: the input gate, forget gate, and output gate [16]. The main function of these gates in LSTM layers is to control the flow of data into and out of the cell state. The forget gate, which consists of sigmoid activation nodes, determines which previous states should be retained and which ones should be discarded [23].
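For reference, the gate mechanism described above is commonly written as follows. This is the standard textbook formulation rather than notation taken from this study; the symbols are assumptions introduced here: \(x_t\) is the input at time step t, \(h_t\) the hidden state, \(c_t\) the cell state, \(\sigma\) the sigmoid function, \(\odot\) the element-wise product, and \(W\) and \(b\) the learned weights and biases of each gate.

$$\begin{aligned} f_t &= \sigma \big (W_f [h_{t-1}, x_t] + b_f\big ) \\ i_t &= \sigma \big (W_i [h_{t-1}, x_t] + b_i\big ) \\ \tilde{c}_t &= \tanh \big (W_c [h_{t-1}, x_t] + b_c\big ) \\ c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\ o_t &= \sigma \big (W_o [h_{t-1}, x_t] + b_o\big ) \\ h_t &= o_t \odot \tanh (c_t) \end{aligned}$$

Here \(f_t\), \(i_t\), and \(o_t\) are the forget, input, and output gates. Because the cell state \(c_t\) is updated additively, information can be carried across many time steps, which is how the memory cell mitigates the vanishing gradient problem.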

This paper proposes a hybrid model that combines a one-dimensional convolutional neural network (1D-CNN) and LSTM, as shown in Fig. 2. The 1D-CNN is capable of extracting meaningful features from wind power time series data, and the LSTM network can leverage long-term dependencies among the extracted features to produce improved prediction results.

Fig. 2. Hybrid of 1D-CNN-LSTM architecture.
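To make the data flow of such a hybrid concrete, the following NumPy-only sketch runs one forward pass through a 1-D convolution, an LSTM layer, and a dense output unit. It is an illustration, not the model used in this study: the layer sizes, the random untrained weights, and all function and parameter names are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv1d_relu(x, kernels, bias):
    """Valid 1-D convolution over time followed by ReLU.

    x: (timesteps, channels_in); kernels: (width, channels_in, filters).
    Returns an array of shape (timesteps - width + 1, filters).
    """
    width = kernels.shape[0]
    steps = x.shape[0] - width + 1
    out = np.stack([
        np.tensordot(x[t:t + width], kernels, axes=([0, 1], [0, 1])) + bias
        for t in range(steps)
    ])
    return np.maximum(out, 0.0)

def lstm_last_hidden(seq, W, U, b, hidden):
    """Run one LSTM layer over seq and return the final hidden state.

    seq: (steps, features); W: (features, 4*hidden); U: (hidden, 4*hidden).
    Gate order in the stacked weights: input, forget, candidate, output.
    """
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x_t in seq:
        z = x_t @ W + h @ U + b
        i = sigmoid(z[:hidden])
        f = sigmoid(z[hidden:2 * hidden])
        g = np.tanh(z[2 * hidden:3 * hidden])
        o = sigmoid(z[3 * hidden:])
        c = f * c + i * g          # update the cell state
        h = o * np.tanh(c)         # expose part of it as the hidden state
    return h

def cnn_lstm_forward(window, params):
    """Convolution extracts local features, LSTM summarizes them, dense predicts."""
    features = conv1d_relu(window, params["k"], params["kb"])
    h = lstm_last_hidden(features, params["W"], params["U"],
                         params["b"], params["hidden"])
    return float(h @ params["w_out"] + params["b_out"])

# Hypothetical sizes and random (untrained) weights, for shape checking only.
rng = np.random.default_rng(0)
n_lags, filters, hidden, width = 7, 4, 8, 3
params = {
    "k": rng.normal(scale=0.1, size=(width, 1, filters)),
    "kb": np.zeros(filters),
    "W": rng.normal(scale=0.1, size=(filters, 4 * hidden)),
    "U": rng.normal(scale=0.1, size=(hidden, 4 * hidden)),
    "b": np.zeros(4 * hidden),
    "hidden": hidden,
    "w_out": rng.normal(scale=0.1, size=hidden),
    "b_out": 0.0,
}
window = rng.random((n_lags, 1))  # one window of scaled wind power values
prediction = cnn_lstm_forward(window, params)
```

In practice the weights would be learned by backpropagation in a deep learning framework; the sketch only shows how the output of the convolutional stage becomes the input sequence of the LSTM stage.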

3.2 Data Preprocessing

For this study, the dataset was obtained with the permission of the Ethiopian Electric Power Corporation. The source dataset was collected from three groups of wind power generation plants managed by the corporation. The three groups, referred to as Dataset 1, Dataset 2, and Dataset 3, share the same time resolution and cover the period from 9 February 2019 to 25 July 2022. The wind power plants from which the data are generated are located just outside Adama Town in the Oromia Regional State of Ethiopia, 95 km southeast of Addis Ababa, the capital city of Ethiopia. For further information, the dataset used to establish the findings of this paper is available at https://github.com/DataLabUPO/WindPower_HAIS23.

To improve the performance of the proposed model, we performed data preprocessing, which included handling missing values and removing duplicates. For a deep learning model to learn effectively from the input data, the dataset must also be rescaled to a suitable range. In this case, the dataset is scaled into [0, 1], which improves computation and model convergence speed. The min-max normalization method was used to transform the data into the range [0, 1], as expressed by Eq. 1.

$$\begin{aligned} n = \frac{\big (X_{0}-X_{min}\big )}{X_{max}-X_{min}} \end{aligned}$$
(1)

where n represents the normalized values of X, while \(X_0\) represents the current value of the variable X. \(X_{min}\) and \(X_{max}\) refer to the minimum and maximum data points for the variable X in the input dataset.
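Eq. 1 can be sketched in a few lines of NumPy. The sketch assumes that the minimum and maximum are taken from the training portion only and then reused for the test portion, which is a common practice to avoid leaking test information; the study does not state which portion was used. The helper names and the toy values are hypothetical.

```python
import numpy as np

def fit_min_max(train_values):
    """Return (X_min, X_max) computed on the training data only."""
    return float(np.min(train_values)), float(np.max(train_values))

def min_max_scale(values, x_min, x_max):
    """Apply Eq. 1: n = (X_0 - X_min) / (X_max - X_min)."""
    return (np.asarray(values, dtype=float) - x_min) / (x_max - x_min)

def min_max_invert(scaled, x_min, x_max):
    """Map scaled values back to the original units."""
    return np.asarray(scaled, dtype=float) * (x_max - x_min) + x_min

# Toy values standing in for wind power measurements.
train = np.array([10.0, 25.0, 40.0, 55.0])
test = np.array([30.0, 60.0])

x_min, x_max = fit_min_max(train)
train_scaled = min_max_scale(train, x_min, x_max)
test_scaled = min_max_scale(test, x_min, x_max)
```

Under this assumption, test values outside the training range fall slightly outside [0, 1], as the second test value does here. Predictions are mapped back to the original units with `min_max_invert` before being reported.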

3.3 Performance Evaluation

Several evaluation metrics, namely MAE (Eq. 2), RMSE (Eq. 3), MSE (Eq. 4), and MAPE (Eq. 5), have been used to determine the prediction performance of the trained models.

$$\begin{aligned} MAE = \frac{1}{n}\sum _{i=1}^{n} \left| y_i - \hat{y}_i\right| \end{aligned}$$
(2)
$$\begin{aligned} RMSE = \sqrt{\frac{1}{n}\sum _{i=1}^{n}{\big (y_i - \hat{y}_i\big )^2}} \end{aligned}$$
(3)
$$\begin{aligned} MSE = \frac{1}{n}\sum _{i=1}^{n}{\big (y_i - \hat{y}_i\big )^2} \end{aligned}$$
(4)
$$\begin{aligned} MAPE = \frac{1}{n}\sum _{i=1}^{n}\left| \frac{y_i - \hat{y}_i}{y_i}\right| \times 100 \end{aligned}$$
(5)

where \(y_i\) and \(\hat{y}_i\) represent the actual and predicted values of the i-th observation, respectively, and n represents the number of observations in the set on which the error is computed (training or test).
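The four metrics translate directly into NumPy. The function names and the toy values below are illustrative and are not results from the study.

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error (Eq. 2)."""
    return float(np.mean(np.abs(np.asarray(y) - np.asarray(y_hat))))

def mse(y, y_hat):
    """Mean squared error (Eq. 4)."""
    return float(np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2))

def rmse(y, y_hat):
    """Root mean squared error (Eq. 3)."""
    return float(np.sqrt(mse(y, y_hat)))

def mape(y, y_hat):
    """Mean absolute percentage error (Eq. 5); undefined if any actual value is 0."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.mean(np.abs((y - y_hat) / y)) * 100.0)

# Toy actual and predicted values.
y = np.array([2.0, 4.0, 5.0, 10.0])
y_hat = np.array([1.0, 5.0, 5.0, 8.0])

print(mae(y, y_hat))   # 1.0
print(mse(y, y_hat))   # 1.5
print(mape(y, y_hat))  # 23.75
```

Because MAPE divides by the actual value, periods with zero generation need separate handling, for example by excluding them from the MAPE computation.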

3.4 Analysis and Result Discussion

To develop the hybrid CNN-LSTM model for wind power forecasting, it is crucial to determine the optimal hyperparameters. In this study, we define the range and type of the hyperparameters for each deep learning model, including the 1D-CNN and LSTM models, within their respective hyperparameter spaces. The best combination of hyperparameters was determined using the Bayesian optimization algorithm. Some of the optimal values found differ between models, even though the same hyperparameter space and types (such as learning_rate and activation function) were defined for them. For example, the hyperparameter space, parameter type, and optimal values found for the 1D-CNN and LSTM models are shown in Table 1.
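The idea behind Bayesian optimization can be illustrated with a small NumPy-only sketch. The study does not specify the surrogate model or the acquisition function, so the Gaussian-process surrogate with an RBF kernel and the expected-improvement criterion used here are assumptions, chosen because they are a common combination. The objective `toy_validation_loss` is a hypothetical stand-in for training a model with a given hyperparameter value (normalized to [0, 1]) and returning its validation error.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.15):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-4):
    """Posterior mean and std of a zero-mean GP with an RBF kernel."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_new)
    K_inv = np.linalg.inv(K)
    mean = K_s.T @ K_inv @ y_obs
    var = 1.0 - np.sum(K_s * (K_inv @ K_s), axis=0)
    return mean, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mean, std, best):
    """Expected improvement for minimization, relative to the best loss so far."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def toy_validation_loss(u):
    """Hypothetical objective; a real one would train and validate a model."""
    return (u - 0.3) ** 2 + 0.05 * np.sin(12.0 * u)

def bayesian_optimize(objective, n_init=3, n_iter=15, seed=0):
    rng = np.random.default_rng(seed)
    candidates = np.linspace(0.0, 1.0, 201)
    x_obs = rng.choice(candidates, size=n_init, replace=False)
    y_obs = objective(x_obs)
    for _ in range(n_iter):
        # Standardize losses so the unit-variance GP prior is reasonable.
        y_std = (y_obs - y_obs.mean()) / (y_obs.std() + 1e-9)
        mean, std = gp_posterior(x_obs, y_std, candidates)
        ei = expected_improvement(mean, std, y_std.min())
        ei[np.isin(candidates, x_obs)] = -np.inf  # never re-evaluate a point
        x_next = candidates[np.argmax(ei)]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    return x_obs, y_obs

x_hist, y_hist = bayesian_optimize(toy_validation_loss)
best_u = x_hist[np.argmin(y_hist)]
best_loss = y_hist.min()
```

Each iteration fits the surrogate to the losses observed so far and evaluates the candidate with the highest expected improvement, so expensive model trainings are spent on promising regions of the search space. Real hyperparameter spaces are multi-dimensional and mix continuous and categorical types, for which a dedicated library is normally used.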

Table 1. Hyperparameters used for the proposed hybrid CNN-LSTM model.
Table 2. Forecasting performance of deep learning models using MAE, RMSE, and MAPE metrics on training and test data.

After searching for the optimal parameters for each deep learning model, the 1D-CNN and LSTM models were stacked as sequentially connected layers. In the first learning phase, time series features were extracted using the 1D-CNN convolution layer. In the second learning phase, the temporal correlation of the time series data was learned. Finally, the fully connected layer, defined as the last layer, produced the predicted output.

Table 2 presents the results of day-ahead wind power prediction using the hybrid deep learning model that combines 1D-CNN and LSTM for the three wind power datasets. Using optimal hyperparameter configurations, the proposed CNN-LSTM model was compared against individual baseline models, including a shallow ANN, simple RNN, LSTM, 1D-CNN, and BiLSTM. Based on the MAE, RMSE, and MAPE results presented in Table 2, the shallow ANN exhibited the worst performance, with the highest MAE and RMSE values of 0.4791 and 0.5451, respectively, on the training data for all three cases. The ANN also performed poorly on the test data, with the largest MAE and RMSE values of 0.5461 and 0.6167, respectively. It was followed by the RNN models, which performed worse on both the training and test data than the remaining deep learning models (LSTM, BiLSTM, 1D-CNN) and CNN-LSTM, as summarized in Table 2. More importantly, based on the MAE and RMSE metrics, the CNN-LSTM model exhibits the lowest error on both the training and test data for all case study datasets used in this paper. From this, we conclude that the CNN-LSTM combination, enhanced by hyperparameter tuning, outperforms the single optimized deep learning models and performs well on the nonlinear and highly intermittent wind power forecasting problem.

Additionally, Fig. 3 displays the average MSE values of all models on the three wind power datasets analyzed in this study. The results indicate that the ANN model had significantly larger MSE values than the other deep learning models. This demonstrates the capability of deep learning models to learn nonlinear and complex wind power data compared to the shallow ANN architecture. Moreover, the hybridization of the LSTM and 1D-CNN models, combined with an automatic hyperparameter optimization approach, yields the lowest MSE and exhibits improved forecasting performance. Furthermore, the CNN-LSTM model exhibited the best performance on the test data, with the smallest MSE, as depicted in Fig. 4, while the shallow ANN was the poorest model, followed by the RNN model, for all three wind power datasets analyzed.

Fig. 3. MSE error on the training set for different models on three datasets.

Fig. 4. MSE error on the test set for different models on three datasets.

Fig. 5. Actual and predicted values for CNN-LSTM model in Dataset 1.

Figure 5 shows the predicted curve and the actual test data for Dataset 1 of daily wind power generation obtained by the CNN-LSTM model. The actual observations and the model output follow the same curve, and the gap between the two curves is very small. Therefore, it can be concluded that the CNN-LSTM model fits the actual curve accurately. In other words, the model performed well and showed no apparent overestimation or underestimation on the test data.

4 Conclusion

This paper proposes a hybrid CNN-LSTM model for improved wind power forecasting by leveraging the feature extraction potential of CNN and the temporal forecasting capabilities of LSTM. A Bayesian optimization approach was applied to select the optimal hyperparameters that improve model accuracy. Using the automatically selected optimal parameters of CNN and LSTM, the proposed hybrid CNN-LSTM model was developed for each wind power dataset. The results of the comparative analysis with benchmark models reveal that ANN exhibits the lowest performance, followed by the RNN models. In contrast, the hybrid CNN-LSTM model outperforms the benchmark methods for daily wind power forecasting. Specifically, the performance of the hybrid model was verified against the single models on each dataset, and the CNN-LSTM model yielded significant improvements in terms of lower MAE, RMSE, and MAPE values for the three wind power datasets.