
1 Introduction

Energy demand has increased rapidly in recent decades, and global warming, which harms the natural world, has become a major threat to our future. Energy is one of the major commodities in world trade, and its dependence on fossil fuels makes it a major contributor to global warming [1]. Renewable energy sources offer an alternative, but they depend heavily on meteorological conditions and are therefore dynamic in nature. Several forms of renewable energy can be used to meet energy demand, including geothermal power, hydropower, wind power, and solar power. In recent years, solar power has attracted the attention of researchers because it is abundant, renewable, and sustainable [2]. Solar power output depends mainly on solar radiation, temperature, humidity, and other meteorological factors. These characteristics pose new challenges for solar power systems, including added complexity and difficulty [3].

Accurate estimation of solar radiation is strongly associated with effective planning, management, and implementation of solar power systems. In this paper, the term “solar radiation” refers to “horizontal global solar radiation”. A number of studies use solar radiation to forecast solar power [4, 5], whereas others forecast solar power generation directly from historical values of solar power [6].

According to the literature, there are four types of solar power forecasting methods: physical, statistical, machine learning, and deep learning. Methods based on machine learning and deep learning have an advantage because they can model non-linear relationships, reason in a way that resembles the human brain, and exploit strong computer simulation capabilities [7].

A forecasting horizon is the period of time into the future over which a forecast is desired. Designing a forecasting model requires selecting a suitable time horizon, which in turn helps keep the accuracy of solar power output forecasts at a satisfactory level. The forecasting horizons for solar power output are very short term (one minute to 60 min), short term (60 min to 24 h), medium term (one day to one year), and long term (one year to five years or more) [8, 9]. The purpose of each forecasting horizon is given in Table 1.

Table 1 Types of forecasting horizons

Our major contributions include a literature review of solar power and solar radiation forecasting using deep learning, an implementation of all four proposed deep learning methods in Python, and a comparative performance analysis based on evaluation metrics. The rest of the paper is arranged as follows: Sect. 2 reviews the literature on solar power and irradiance forecasting using deep learning. A detailed discussion of the framework for the proposed models is provided in Sect. 3, and the experimental results are presented in Sect. 4. Finally, concluding remarks are provided in Sect. 5.

2 Related Work

The growing demand for renewable energy and its incorporation into the power grid support a viable and sustainable future. Renewable energy sources such as solar are volatile, which creates significant challenges for the secure and reliable operation of renewable-energy-integrated power grids [10, 11]. Machine learning models can extract and map nonlinear, complex features efficiently, and for this reason machine learning-based methods are among the most commonly used for prediction [12,13,14]. In this line of research, artificial neural network-based models were developed using meteorological data, including humidity, temperature, pressure, and wind speed, as inputs. Solar irradiation is directly related to solar power; in [15], the author proposes an artificial neural network model for predicting solar irradiance that performs better than the other prediction models considered.

Deep neural networks extract features better than simple neural networks and handle the vanishing gradient problem more effectively [16]. In convolutional neural networks (CNN), hidden features are extracted from the data through hidden layers, and regression is performed by the final fully connected layer. With solar data measured at photovoltaic plants as input, Wang et al. developed a new hybrid prediction model based on a convolutional neural network and a long short-term memory neural network [17]. Zeng et al. and Ray et al. proposed hybrid models to improve short-term forecasting accuracy for solar power: variational mode decomposition splits the series into subseries of different frequencies, which are then fed to a convolutional neural network combined with long short-term memory [18, 19].

Solar irradiance is directly related to solar power generation. Sharadga et al. proposed a time series-based forecast for short-term solar power generation using a long short-term memory (LSTM) model with solar irradiance as a key input parameter [20]. Mishra et al. proposed a model for short-term solar power forecasting at different time horizons, using a long short-term memory neural network with meteorological factors of similar characteristics as input parameters [8]. Gao et al. used weather time series classification data as input to an LSTM for short-term solar power forecasting [21]. Day-ahead solar irradiation prediction is required for reliable electrical grid operation, and for this purpose historical weather data have been used as input to an LSTM network [22].

LSTM and Bi-LSTM are among the best deep learning models for time series prediction, since they can handle correlations in time series better than other models. For time series analysis, Bi-LSTM provides better accuracy and faster learning owing to its bidirectional learning capability. Zhen et al. developed a prediction model for short-term solar power output using the Bi-LSTM framework, with the solar power output time series as input [23]. Solar power prediction is an important part of grid scheduling; for this purpose, Sharadga et al. developed a bidirectional LSTM model that outperforms traditional neural network models and statistical models for predicting solar power output at large scales [20].

3 Methodology

3.1 Fully Connected Neural Network

As a subset of artificial intelligence, neural networks are modeled on the biological structure of human neurons. Each neural network is made up of neurons, simple processing elements that are organized according to the type of network. Properly trained neural networks can identify trends and patterns in datasets. Artificial neural networks work in two steps: learning and recalling. During training, the weights and biases of the network are adjusted iteratively to reduce the error in the generated output. The input and output layers of a neural network are related as shown in Eq. 1 [24].

$$Z = \sum_{r=1}^{N_{i-1}} X_{r}^{i-1} W_{r,i} - \varnothing_{r}$$
(1)

where \(W_{r,i}\) is the weight, \(X_{r}^{i-1}\) is the input to the neuron at layer i, and \(\varnothing_{r}\) is the bias.
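As a minimal sketch of Eq. 1, the weighted sum for a single neuron can be computed as below. The function name `neuron_preactivation` and the numeric values are illustrative assumptions, not taken from the experiments; the bias is subtracted to match the sign convention of Eq. 1.

```python
import numpy as np

def neuron_preactivation(x_prev, w, bias):
    """Eq. 1: weighted sum of the previous layer's outputs minus the bias."""
    x_prev = np.asarray(x_prev, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.dot(x_prev, w) - bias)

# Illustrative values only (hypothetical inputs and weights).
x_prev = [0.5, -1.0, 2.0]   # outputs of layer i-1
w = [0.2, 0.4, 0.1]         # weights into one neuron of layer i
z = neuron_preactivation(x_prev, w, bias=0.1)
print(z)
```

In a full network, an activation function would then be applied to Z before it is passed to the next layer.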

3.2 Convolutional Neural Network

A convolutional neural network is a type of deep learning neural network that has been applied successfully in many fields. It is an effective technique for extracting features automatically, and it also shows high potential for handling time series, for example in automatic speech recognition and wind speed forecasting. A convolutional neural network consists of convolutional, pooling, and fully connected layers. The tasks performed by each layer are summarized below.

Convolutional Layer: In a convolutional neural network, the convolutional layer generates new feature maps using convolutional kernels. Because the kernel weights are shared among all input maps, this operation is useful for local feature extraction.

Pooling Layer: The pooling layer reduces the in-plane dimensionality of the input maps, thereby reducing the number of learnable parameters and the probability of overfitting. Pooling can take different forms, such as average and maximum pooling.

Fully Connected Layer: As the final layer of the convolutional neural network, the fully connected layer functions as a “classifier.” Every neuron between two layers is connected [17].
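The three layer types can be illustrated with a small one-dimensional sketch in NumPy. This is not the architecture used in the experiments: the kernel, pooling size, dense weights, and input signal are hypothetical, and the convolution is implemented as a sliding dot product (cross-correlation), as is common in deep learning libraries.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Slide one shared kernel over x (no padding, stride 1)."""
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    k = len(kernel)
    n_out = len(x) - k + 1
    return np.array([np.dot(x[i:i + k], kernel) for i in range(n_out)])

def max_pool1d(x, size=2):
    """Non-overlapping max pooling; a trailing remainder is dropped."""
    x = np.asarray(x, dtype=float)
    n_out = len(x) // size
    return x[:n_out * size].reshape(n_out, size).max(axis=1)

def dense(x, weights, bias):
    """Fully connected output neuron used here for regression."""
    return float(np.dot(x, weights) + bias)

# Hypothetical input sequence and parameters.
signal = [1.0, 2.0, 4.0, 3.0, 0.0, 1.0, 5.0]
features = conv1d_valid(signal, kernel=[1.0, -1.0])   # local differences
pooled = max_pool1d(features, size=2)                 # shorter feature map
prediction = dense(pooled, weights=np.full(len(pooled), 0.5), bias=0.0)
print(features, pooled, prediction)
```

The same two kernel weights are reused at every position of the input, which is the weight sharing described above, and pooling halves the length of the feature map before the fully connected layer.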

3.3 Long Short-Term Memory Neural Network

Hochreiter and Schmidhuber analyzed and solved the vanishing gradient problem for prediction using time series data [25]. LSTM is an upgraded version of the recurrent neural network that addresses the problems of long-term dependency on past information and vanishing gradients. As shown in Fig. 1, LSTM has memory blocks through which information propagates forward.

Fig. 1 LSTM neural network structure (four memory blocks unfolded from left to right, each with input X and output H)

The memory cell and the input, forget, and output gates are the core elements of the long short-term memory network. In Fig. 1, the inputs are (X1,…,Xt) and the outputs are (H1,…,Ht) [26].
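How the memory cell and the three gates interact can be sketched for a single time step. The gate equations are not given in this paper, so the code below assumes the standard LSTM formulation; the stacked weight layout, the dimensions, and the random inputs are illustrative choices only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step (standard formulation, assumed here).

    W: (4*H, D), U: (4*H, H), b: (4*H,), with rows stacked in the order
    input gate, forget gate, candidate cell state, output gate.
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:H])           # input gate: how much new content to write
    f = sigmoid(z[H:2 * H])       # forget gate: how much old memory to keep
    g = np.tanh(z[2 * H:3 * H])   # candidate cell content
    o = sigmoid(z[3 * H:4 * H])   # output gate: how much memory to expose
    c_t = f * c_prev + i * g      # updated memory cell
    h_t = o * np.tanh(c_t)        # output H_t
    return h_t, c_t

# Unfold the cell over a short random sequence (hypothetical sizes).
rng = np.random.default_rng(0)
D, H, T = 3, 4, 5
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)
xs = rng.normal(size=(T, D))      # inputs X_1..X_T

h, c = np.zeros(H), np.zeros(H)
outputs = []
for x_t in xs:
    h, c = lstm_step(x_t, h, c, W, U, b)
    outputs.append(h)
outputs = np.stack(outputs)       # outputs H_1..H_T, shape (T, H)
print(outputs.shape)
```

The additive update of the memory cell is what lets information and gradients persist across many time steps.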

3.4 Bi-Directional Long Short-Term Memory

In bi-directional long short-term memory (Bi-LSTM), each time step is connected not only to previous time steps but also to future ones, so the network is characterized by two time directions. Compared with LSTM, Bi-LSTM has greater learning capability because both past and future information are used as inputs. The forward and backward layers in Bi-LSTM feed information to every hidden layer. The α activation function is used to calculate the output of the Bi-LSTM network [27].
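The two time directions can be sketched as follows. To keep the example short, a simple tanh recurrent cell stands in for the full LSTM cell, which is a deliberate simplification; the helper names, sizes, and random weights are hypothetical. Only the bidirectional wiring is the point here.

```python
import numpy as np

def run_rnn(xs, W, U, b):
    """Run a simple tanh recurrent cell over a sequence (stand-in for LSTM)."""
    h = np.zeros(U.shape[0])
    states = []
    for x_t in xs:
        h = np.tanh(W @ x_t + U @ h + b)
        states.append(h)
    return np.stack(states)

def bidirectional(xs, fwd_params, bwd_params):
    """Concatenate forward-in-time and backward-in-time hidden states."""
    forward = run_rnn(xs, *fwd_params)
    # Process the reversed sequence, then flip back so rows align in time.
    backward = run_rnn(xs[::-1], *bwd_params)[::-1]
    return np.concatenate([forward, backward], axis=1)

rng = np.random.default_rng(1)
D, H, T = 2, 3, 6

def make_params():
    return (rng.normal(scale=0.5, size=(H, D)),
            rng.normal(scale=0.5, size=(H, H)),
            np.zeros(H))

fwd, bwd = make_params(), make_params()
xs = rng.normal(size=(T, D))
out = bidirectional(xs, fwd, bwd)   # shape (T, 2*H)
print(out.shape)
```

At each time step, the first H values summarize the past and the last H values summarize the future, which is why a bidirectional model needs the whole input window before it can produce an output.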

3.5 Performance Measures

Three accuracy performance indices, Mean Squared Error (MSE), Root Mean Square Error (RMSE), and coefficient of determination (R2), were utilized to assess the effectiveness of the suggested model for the short-term forecast [28]. The following are brief descriptions of these indices.

The RMSE is the square root of the mean squared prediction error and reflects the spread of the errors: a higher value indicates a wider error spread. It is calculated using Eq. 2.

$${RMSE}= \sqrt{\frac{1}{N}{\sum }_{i=1}^{N}{\left({t}_{i}- {f}_{i}\right)}^{2}}$$
(2)

The mean squared error (MSE) is the average of the squared prediction errors. For forecast f and target t over N steps, it is calculated using Eq. 3.

$$MSE=\frac{1}{N}\sum\limits_{i=1}^{N}{\left({t}_{i}- {f}_{i}\right)}^{2}$$
(3)

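\({R}^{2}\) indicates how well the model's predictions fit the observed data and is evaluated on both the training and testing data. With observed values \({y}_{i}\), predicted values \({\widehat{y}}_{i}\), and mean observed value \(\overline{y}\), it is calculated using Eq. 4.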

$${R}^{2}=1- \frac{\sum {\left({y}_{i}- {\widehat{y}}_{i}\right)}^{2}}{\sum {\left({y}_{i}- \overline{y }\right)}^{2}}$$
(4)
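Equations 2 to 4 translate directly into a few lines of NumPy. The sketch below uses a small made-up target and forecast series purely to show the arithmetic; the values are not experimental results.

```python
import numpy as np

def mse(t, f):
    """Eq. 3: mean of squared differences between target t and forecast f."""
    t, f = np.asarray(t, dtype=float), np.asarray(f, dtype=float)
    return float(np.mean((t - f) ** 2))

def rmse(t, f):
    """Eq. 2: square root of the MSE."""
    return float(np.sqrt(mse(t, f)))

def r2(y, y_hat):
    """Eq. 4: one minus residual sum of squares over total sum of squares."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up values for illustration only.
target = [3.0, 5.0, 7.0, 9.0]
forecast = [2.0, 5.0, 8.0, 10.0]
print(mse(target, forecast))    # squared errors 1, 0, 1, 1 -> 0.75
print(rmse(target, forecast))   # sqrt(0.75)
print(r2(target, forecast))     # 1 - 3/20 = 0.85
```

A lower MSE and RMSE and a higher \({R}^{2}\) indicate a better forecast, which is how the models are ranked in Sect. 4.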

4 Experimental Analysis

The meteorological data required for this research were downloaded from [29]. The time series dataset contains 11 variables with a resolution of 5 min, including pressure, temperature, radiation, wind speed, direction, and humidity. A total of 32,700 measurements from the four months from September to December 2016 were selected for the experiment.

Selecting the input parameters for the proposed models is an important part of data preprocessing. The Pearson correlation coefficient was used to compute the correlation between solar radiation and each of the other parameters. Table 2 shows the influential parameters. Based on Table 2, wind speed, pressure, and temperature are the most influential parameters.

Table 2 Correlation matrix
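The correlation-based selection step can be sketched as below. The data are synthetic stand-ins generated on the spot, not the dataset from [29], and the column names and the linear relationship between temperature and radiation are assumptions made only so that the ranking has something to find.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))

# Synthetic stand-in data (hypothetical, for illustration only).
rng = np.random.default_rng(42)
n = 500
temperature = rng.normal(15.0, 5.0, n)
humidity = rng.normal(60.0, 10.0, n)
radiation = 20.0 * temperature + rng.normal(0.0, 30.0, n)

features = {"temperature": temperature, "humidity": humidity}

# Rank candidate inputs by the absolute value of their correlation with the target.
ranking = sorted(
    ((name, pearson(column, radiation)) for name, column in features.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranking:
    print(f"{name}: {r:+.3f}")
```

Ranking by absolute value keeps strongly negative correlations, which are just as informative for forecasting as strongly positive ones.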

The datasets must then be prepared for training and testing. The data are standardized so that the models fit better and deviations are kept as low as possible: a standard scaler removes the mean from the data and scales it to unit variance. 80% of the data is used for training and the remaining 20% for testing.
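A minimal sketch of this preparation step is given below. Two details are assumptions, since the text does not state them: the split is taken in time order without shuffling, and the scaling statistics are computed on the training portion only and then reused for the test portion. The data are random placeholders.

```python
import numpy as np

def split_ordered(data, train_fraction=0.8):
    """Split in time order: first part for training, the rest for testing."""
    n_train = int(len(data) * train_fraction)
    return data[:n_train], data[n_train:]

def fit_standardizer(train):
    """Column-wise mean and standard deviation from the training data."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std = np.where(std == 0, 1.0, std)   # guard against constant columns
    return mean, std

def standardize(data, mean, std):
    """Remove the mean and scale to unit variance."""
    return (data - mean) / std

# Placeholder data: 100 time steps, 3 input parameters (hypothetical).
rng = np.random.default_rng(7)
data = rng.normal(loc=[10.0, 1000.0, 50.0], scale=[3.0, 20.0, 15.0], size=(100, 3))

train, test = split_ordered(data, train_fraction=0.8)
mean, std = fit_standardizer(train)
train_scaled = standardize(train, mean, std)
test_scaled = standardize(test, mean, std)
print(train_scaled.shape, test_scaled.shape)
```

Fitting the scaler on the training portion alone avoids leaking information from the test period into the model inputs.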

The effectiveness of the deep learning models in this study, namely the fully connected neural network, CNN, LSTM, and Bi-LSTM, was evaluated through simulations. The parameters of the proposed deep learning models were determined by trial and error and are listed in Table 3. Each deep learning model was trained and validated for 100 epochs.

Table 3 Parameters settings for deep learning models

The values of RMSE, MSE, and R2 determined for each model are shown in Table 4. Our findings show that the Bi-LSTM forecasting model performs better than the other techniques for short-term forecasting. Among all the forecasting models, Bi-LSTM has the lowest RMSE and MSE and the highest R2. The MSE of the Bi-LSTM model is 0.05121, which is the best among CNN, LSTM, and Fully Connected Neural Network respectively, at 0.07883, 0.07102, 0.06431, and 39,744.4. The RMSE of the Bi-LSTM model is likewise lower than that of the other forecasting models. According to the forecasting results, Bi-LSTM outperforms the other models, although all of them have acceptable forecasting abilities.

Table 4 Comparative analysis of models

Figure 2 shows a comparison between observed solar radiation and predicted solar radiation for CNN, LSTM, Bi-LSTM, and fully connected neural network.

Fig. 2 Forecasting performance of different models (expected versus predicted solar radiation for CNN, LSTM, Bi-LSTM, and the fully connected neural network)

Figure 2 shows that the convolutional neural network gives some negative values when predicting the output. The fully connected neural network, by comparison, predicts outcomes that differ from what is expected. For LSTM and Bi-LSTM, the expected and predicted output values are almost identical, with Bi-LSTM matching more closely than LSTM.

5 Conclusion

The future prospects for sustainable solar power are very promising. Accuracy and stability should both be taken into account when constructing a forecasting model, so an accurate and precise short-term solar energy forecasting technique is needed. This paper compares deep learning-based forecasting approaches, using the RMSE, MSE, and R2 metrics to assess their forecasting performance. On these metrics, the Bi-LSTM model produces the lowest errors, and the numerical experimental results and analysis indicate that it has a good degree of accuracy and precision. In this study, deep learning models were used to forecast solar radiation; this domain will be the subject of further study in the future.