1 Introduction

With large numbers of new industrial and residential sectors being set up, electricity demand is rising day by day [1]. In 2019, Marsal-Pomianowska et al. reported that buildings alone accounted for 40–50% of electricity consumption, compared with only 16% in 2013 [2]. Renewable energy sources are a promising way to meet this rising demand [3]. Among them, solar energy is one of the most popular and eco-friendly options; it generates electricity by converting the solar radiation received from the sun [4, 5]. Solar plants are now used as power generation sources not only in standalone mode but also in grid-interconnected mode [6], and their scale and number are growing at an exponential rate [7]. Despite this growth, the uncertainty and randomness of power generation by PV panels remains a major challenge. Forecasting the next-step electricity generation is therefore necessary to manage the planning, maintenance and operation of grid-interconnected plants [8]. A precise and accurate forecast of solar irradiation not only helps in grid management but also helps plant operators avoid penalties.

Generally, there are two approaches to forecasting solar global horizontal irradiance (GHI): direct and indirect [9]. Direct methods consider only historical variables as input features, whereas indirect methods consider meteorological variables [3]. Numerous studies have forecast solar GHI using artificial neural networks (ANN), support vector machines (SVM), regression methods, etc. K. Mohammadi et al. developed a forecasting model based on SVM; they used wavelet transform (WT) decomposition to build a hybrid model and showed that WT + SVM performed better than SVM alone [10]. Likewise, Bae et al. combined SVM with k-means clustering, and this combination again outperformed a backpropagation neural network (BPNN) [11]. A combination of wavelets with a neural network was also studied by Sharma et al. [12], and Sharika et al. conducted a comparative study of regression-based models and SVM [13]. The ability of deep learning to handle large datasets has made it popular in forecasting in recent years, but very few studies have applied deep learning networks to solar GHI forecasting. Sharadga H. et al. predicted PV power output using a bidirectional long short-term memory (Bi-LSTM) network [14], whereas Srivastava S. et al. forecast solar GHI using an LSTM model [1]. Gao M. et al. proposed an LSTM network to forecast the PV power of a plant; that study used meteorological variables as inputs for smooth datasets and time series data for non-ideal weather conditions [3]. Aslam M. et al. used a gated recurrent unit (GRU) deep learning network to forecast hourly solar GHI for a year and compared the developed model with benchmark models: SVM and feed-forward neural network (FFNN) [15].

Motivated by the power of deep learning, this paper develops two deep learning networks to forecast one-hour-ahead solar GHI. The research contributions of the paper are:

  (i) A brief literature review of deep learning for solar GHI forecasting.

  (ii) A theoretical overview of the LSTM and GRU networks.

  (iii) Calculation of the clear sky index (CSI) to stationarize the meteorological data.

  (iv) Simulation of a naïve model (benchmark model), an LSTM network and a GRU network for solar GHI forecasting.

  (v) Performance evaluation of the developed models using RMSE, MAPE and \(R^{2}\).

The paper is organized as follows: Section 2 describes the theoretical background of the LSTM and GRU deep learning techniques. Section 3 discusses the experimental setup, including the data description, the developed forecasting model and the error metrics. The results and analysis are provided in Sect. 4. Finally, the study is concluded in Sect. 5.

2 Theoretical Background

This section provides the theoretical background of LSTM and GRU networks in brief.

2.1 Long Short-Term Memory Network

Generally, the vanishing gradient problem arises in the simple recurrent neural network (RNN) due to its limited memory [16]. The LSTM network solves this problem. In an LSTM, memory blocks are present instead of the summation units used in an RNN [1]. The gate \(g_{t}\) collects the output of the previous time step, \(h_{t-1}\), and the input at the present time step, \(x_{t}\), is also fed to the gates of the cell. The outputs of the four gates are denoted as input \(i_{t}\), output \(O_{t}\), update \(g_{t}\) and forget \(f_{t}\). The input gate decides the information passed to the cell as:

$$ i_{t} = {\text{sigm}}(\theta^{i} x_{t} + \theta^{hi} h_{t - 1} ) $$
(1)

The forget gate decides how much of the previous state information is passed on, and is expressed as:

$$ f_{t} = {\text{sigm}}(\theta^{f} x_{t} + \theta^{hf} h_{t - 1} ) $$
(2)

whereas the output gate decides how much of the internal state information is passed on:

$$ O_{t} = {\text{sigm}}(\theta^{o} x_{t} + \theta^{ho} h_{t - 1} ) $$
(3)

So, the internal memory state \(C_{t}\) is updated as:

$$ C_{t} = f_{t} *C_{t - 1} + i_{t} *\tilde{C}_{t} $$
(4)

where the candidate state \(\tilde{C}_{t}\) is

$$ \tilde{C}_{t} = \tanh (\theta^{g} x_{t} + \theta^{hg} h_{t - 1} ) $$
(5)

and the output of the cell is

$$ h_{t} = \tanh (C_{t} )*O_{t} $$
(6)
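
As an illustration, Eqs. (1)–(6) can be traced for a single time step with a short plain-Python sketch. The dictionary `theta`, the helper names and the toy weight values are hypothetical and are not taken from the study. Bias terms are omitted because the equations above carry none, and the sketch assumes the standard cell-state update \(f_{t} *C_{t - 1} + i_{t} *\tilde{C}_{t}\) without any further squashing.

```python
import math

def sigm(v):
    """Element-wise logistic sigmoid of a list of floats."""
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def tanh_vec(v):
    """Element-wise hyperbolic tangent."""
    return [math.tanh(a) for a in v]

def matvec(m, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(w * a for w, a in zip(row, v)) for row in m]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def mul(u, v):
    """Element-wise (Hadamard) product, written '*' in the equations."""
    return [a * b for a, b in zip(u, v)]

def lstm_step(x_t, h_prev, c_prev, theta):
    """One LSTM time step.

    theta maps a gate name ('i', 'f', 'o', 'g') to a pair
    (input weights, recurrent weights); this layout is an assumption.
    """
    def pre(gate):
        w_x, w_h = theta[gate]
        return add(matvec(w_x, x_t), matvec(w_h, h_prev))

    i_t = sigm(pre("i"))          # input gate, Eq. (1)
    f_t = sigm(pre("f"))          # forget gate, Eq. (2)
    o_t = sigm(pre("o"))          # output gate, Eq. (3)
    c_tilde = tanh_vec(pre("g"))  # candidate state, Eq. (5)
    c_t = add(mul(f_t, c_prev), mul(i_t, c_tilde))  # cell state update
    h_t = mul(tanh_vec(c_t), o_t)                   # hidden state, Eq. (6)
    return h_t, c_t

# Toy usage: 2 input features, 2 hidden units, all weights zero (hypothetical).
zeros = [[0.0, 0.0], [0.0, 0.0]]
theta = {gate: (zeros, zeros) for gate in "ifog"}
h_t, c_t = lstm_step([1.0, 2.0], [0.0, 0.0], [1.0, -1.0], theta)
print(h_t, c_t)
```

With all weights at zero every gate evaluates to 0.5, so the cell state is simply halved; this makes the role of the forget gate easy to check by hand.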

2.2 Gated Recurrent Unit (GRU)

Similar to the LSTM, the GRU uses gating units, namely a reset gate and an update gate. The purpose of the gating units is to modulate the flow of information within the unit [17]. The reset gate of a GRU cell is expressed as:

$$ r_{t} = {\text{sigm}}(\theta^{r} x_{t} + \theta^{hr} h_{t - 1} ) $$
(7)

Similarly, the update gate is expressed as:

$$ z_{t} = {\text{sigm}}(\theta^{z} x_{t} + \theta^{hz} h_{t - 1} ) $$
(8)

The current hidden state of the cell can finally be expressed as (Fig. 1):

$$ h_{t} = (1 - z_{t} )*h_{t - 1} + z_{t} *\tilde{h}_{t} $$
(9)

$$ \tilde{h}_{t} = \tanh (\theta^{h} x_{t} + \theta^{hh} (r_{t} *h_{t - 1} )) $$
(10)

Fig. 1 a, b Architecture of LSTM and GRU
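
The GRU update in Eqs. (7)–(10) can likewise be sketched for one time step. This is a minimal illustration, not the implementation used in the study: the `theta` layout, the helper names and the toy values are hypothetical, and the recurrent weight of the candidate state is assumed to act on the reset-gated previous state \(r_{t} *h_{t - 1}\).

```python
import math

def sigm(v):
    """Element-wise logistic sigmoid of a list of floats."""
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def matvec(m, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(w * a for w, a in zip(row, v)) for row in m]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def mul(u, v):
    """Element-wise product, written '*' in the equations."""
    return [a * b for a, b in zip(u, v)]

def gru_step(x_t, h_prev, theta):
    """One GRU time step.

    theta maps 'r', 'z' and 'h' to a pair (input weights, recurrent weights);
    this layout is an assumption made for the sketch.
    """
    r_t = sigm(add(matvec(theta["r"][0], x_t), matvec(theta["r"][1], h_prev)))  # Eq. (7)
    z_t = sigm(add(matvec(theta["z"][0], x_t), matvec(theta["z"][1], h_prev)))  # Eq. (8)
    # Candidate state: the reset gate scales h_prev before the recurrent weights.
    pre_h = add(matvec(theta["h"][0], x_t),
                matvec(theta["h"][1], mul(r_t, h_prev)))
    h_tilde = [math.tanh(a) for a in pre_h]                                     # Eq. (10)
    # Blend previous and candidate states with the update gate, Eq. (9).
    return [(1.0 - z) * hp + z * ht for z, hp, ht in zip(z_t, h_prev, h_tilde)]

# Toy usage: 2 input features, 2 hidden units, all weights zero (hypothetical).
zeros = [[0.0, 0.0], [0.0, 0.0]]
theta = {gate: (zeros, zeros) for gate in "rzh"}
h_t = gru_step([1.0, 2.0], [0.4, -0.2], theta)
print(h_t)
```

Compared with the LSTM, the GRU keeps a single state vector and two gates, so each step needs three weight pairs instead of four.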

3 Materials and Methodology

3.1 Data Description

To compare the performance of the LSTM and GRU networks, a dataset for Ahmadabad, Gujarat, has been considered. Ahmadabad is a city in the state of Gujarat, India, located at a latitude/longitude of 23° 0.05′/72° 0.35′, with an extreme type of climate. The year is divided into three climatic seasons: summer, winter and monsoon [18]. The hourly dataset for the study was collected from the database of the National Renewable Energy Laboratory (NREL). Eight meteorological variables were collected: wind direction, dew point, pressure, temperature, solar zenith angle, relative humidity, wind speed and precipitation. The model was trained using one year of data, and one-step-ahead forecasting was performed on a monthly basis.

3.2 Forecasting Process

Figure 2 shows the flow graph of the process used to perform forecasting with the LSTM and GRU networks.

Fig. 2 Flow process of forecasting using LSTM and GRU

At stage I, the meteorological data was collected for the targeted site. The collected data is often in raw form and includes some missing and incorrect values. A data quality check was therefore performed to remove the night hours as well as missing and false observations. The night hours were removed because no GHI is available at night. Once the quality of the data had been checked, the clear sky index was calculated to make the data stationary. The clear sky index can be calculated as:

$$ K_{t} = \frac{Y}{{Y_{CS} }} $$
(11)

where \(Y_{CS} = E_{o} \exp \left( { - \frac{\tau }{{\sin^{a} (\mu (t))}}} \right)\sin (\mu (t))\),

and \(Y\) is the solar irradiation, \(\mu (t)\) is the solar height (elevation angle) in degrees, \(E_{o}\) is the extraterrestrial irradiation, \(\tau\) is the optical depth, and \(a\) is the fitting parameter.
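
A minimal sketch of Eq. (11) and the clear-sky expression above is given here for illustration. The function names and all numeric parameter values are hypothetical; the sketch also assumes that \(\mu (t)\) is an elevation-type angle supplied in degrees and that hours with the sun at or below the horizon have already been filtered out or are mapped to zero clear-sky irradiation.

```python
import math

def clear_sky_irradiance(e_o, tau, a, mu_deg):
    """Y_CS = E_o * exp(-tau / sin(mu)^a) * sin(mu), with mu given in degrees."""
    s = math.sin(math.radians(mu_deg))
    if s <= 0.0:
        # Sun at or below the horizon: treated as no clear-sky irradiation
        # (an assumption; the study removes night hours beforehand).
        return 0.0
    return e_o * math.exp(-tau / s ** a) * s

def clear_sky_index(y, y_cs):
    """K_t = Y / Y_CS; returns None when Y_CS is not positive."""
    return y / y_cs if y_cs > 0.0 else None

# Hypothetical parameter values, for illustration only.
y_cs = clear_sky_irradiance(e_o=1361.0, tau=0.3, a=0.5, mu_deg=60.0)
k_t = clear_sky_index(650.0, y_cs)
print(round(y_cs, 1), round(k_t, 3))
```

Dividing by the clear-sky value removes most of the deterministic daily and seasonal pattern, which is why the resulting index is closer to stationary than the raw GHI series.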

Normalization techniques can, however, be used in place of the clear sky index. In the next step, the correlation coefficient of each variable with the target variable was computed. Variables with a strong correlation were selected, while weakly correlated variables were removed. The next important step is the selection of the hyperparameters of the deep learning networks. The literature provides no particular rule for selecting these hyperparameters, so trial and error is the only practical approach. This study conducted various experiments to select proper hyperparameters. The number of hidden units was varied from 20 to 100 with learning rates of 0.2, 0.02, 0.002 and 0.0007. A configuration of 70 hidden units and 500 epochs with an initial learning rate of 0.02 performed best for both networks. The Adam optimizer was used, as also prescribed in the literature. The learning rate drop factor was 0.2 with a drop period of 125. Both models were trained with one year of meteorological data, and one-step-ahead monthly GHI was forecasted. After training and testing, the error metrics were calculated. If the results are satisfactory, the model is finalized; otherwise, the hyperparameters are reselected and the process is repeated.
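
Two steps of this process lend themselves to a short sketch: correlation-based variable selection and the learning-rate drop. The code below is only an illustration under stated assumptions. The correlation threshold and the toy data are hypothetical (the study reports no threshold), Pearson correlation is assumed as the correlation measure, and the schedule assumes a piecewise-constant rule in which the rate is multiplied by the drop factor once every drop period, counted in epochs from zero.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined for constant inputs (division by zero)

def select_features(features, target, threshold):
    """Keep variables whose |r| with the target reaches the threshold."""
    return [name for name, values in features.items()
            if abs(pearson(values, target)) >= threshold]

def step_decay_lr(epoch, initial_lr=0.02, drop_factor=0.2, drop_period=125):
    """Learning rate at a 0-based epoch under a piecewise-constant drop."""
    return initial_lr * drop_factor ** (epoch // drop_period)

# Toy data, invented for illustration only.
features = {
    "temperature": [20.0, 25.0, 30.0, 35.0],
    "wind_direction": [10.0, 350.0, 10.0, 350.0],
}
ghi = [200.0, 400.0, 600.0, 800.0]
selected = select_features(features, ghi, threshold=0.5)  # hypothetical threshold
print(selected)

# Learning rate at the start of each 125-epoch block of a 500-epoch run.
print([step_decay_lr(e) for e in (0, 125, 250, 375)])
```

Under these assumptions the rate falls by a factor of five three times during the 500 epochs, so the last block trains with a rate of 0.02 × 0.2³.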

3.3 Evaluation Metrics

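The performance of the models is evaluated using the root mean square error (RMSE), the mean absolute percentage error (MAPE) and the coefficient of determination (\(R^{2}\)), defined as:

$$ {\text{RMSE}} = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {{\text{GHI}}_{p,i} - {\text{GHI}}_{a,i} } \right)^{2} } } $$
(12)

$$ {\text{MAPE}} = \frac{100}{n}\sum\limits_{i = 1}^{n} {\left| {\frac{{{\text{GHI}}_{p,i} - {\text{GHI}}_{a,i} }}{{{\text{GHI}}_{a,i} }}} \right|} $$
(13)

$$ R^{2} = 1 - \frac{{{\text{var}} ({\text{GHI}}_{a,i} - {\text{GHI}}_{p,i} )}}{{{\text{var}} ({\text{GHI}}_{p,i} )}} $$
(14)

where \({\text{GHI}}_{p,i}\) is the predicted/forecasted irradiation, \({\text{GHI}}_{a,i}\) is the real/actual irradiation, and \(n\) is the number of observations.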
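
For illustration, the three metrics can be computed with a few lines of plain Python. The function names and the toy values are hypothetical. MAPE is scaled by 100 so that it is expressed in percent, as in the reported results, and \(R^{2}\) follows Eq. (14) as printed, with the variance of the predicted series in the denominator and population variances assumed.

```python
import math

def variance(v):
    """Population variance of a sequence."""
    m = sum(v) / len(v)
    return sum((a - m) ** 2 for a in v) / len(v)

def rmse(pred, actual):
    """Root mean square error, Eq. (12)."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

def mape(pred, actual):
    """Mean absolute percentage error in percent, Eq. (13).

    Requires non-zero actual values; night hours (zero GHI) are assumed removed.
    """
    return 100.0 / len(actual) * sum(abs((p - a) / a) for p, a in zip(pred, actual))

def r2(pred, actual):
    """R^2 as written in Eq. (14): 1 - var(actual - pred) / var(pred)."""
    residuals = [a - p for p, a in zip(pred, actual)]
    return 1.0 - variance(residuals) / variance(pred)

# Toy values, invented for illustration only.
ghi_pred = [110.0, 190.0, 300.0]
ghi_actual = [100.0, 200.0, 300.0]
print(rmse(ghi_pred, ghi_actual), mape(ghi_pred, ghi_actual), r2(ghi_pred, ghi_actual))
```

RMSE keeps the unit of the irradiation and penalizes large misses, while MAPE is relative, so the two can rank models differently in months with low GHI.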

4 Results and Analysis

Table 1 presents the RMSE (W/m²), MAPE (%) and \(R^{2}\) of the benchmark model, the LSTM network and the GRU network for the different months.

Table 1 Results for naïve, LSTM and GRU

According to Table 1, the RMSE of the GRU network is better than that of the LSTM network except in three months: April, August and October. The minimum RMSE obtained by the GRU network is 34.59 W/m² in March, whereas its maximum RMSE is 158.83 W/m² in September. The minimum RMSE obtained by the LSTM network is 28.77 W/m² in April, whereas its maximum is 155.66 W/m² in September. The benchmark model obtained its minimum RMSE (134.01 W/m²) in December and its maximum RMSE (169 W/m²) in August. As far as MAPE is concerned, the minimum MAPE obtained by the LSTM network is 3.09% in April, whereas its maximum is 35.88% in September. The GRU network achieved its minimum MAPE of 3.59% in March and its maximum of 29.97% in September, whereas the benchmark model obtained its minimum MAPE (20.77%) in April and its maximum MAPE (37.09%) in January. Overall, on the basis of the annual average RMSE and MAPE, the GRU is still a better choice than the LSTM for forecasting using climatic variables. The higher RMSE and MAPE in June, July, August, September and October are due to the uncertainty in the data caused by rainy and cloudy days.

The \(R^{2}\) statistic is also observed to evaluate the curve fitting of the models. The results show that the overall maximum \(R^{2}\) (0.86) is obtained by the GRU model, while the benchmark model and the LSTM network obtained \(R^{2}\) equal to 0.52 and 0.85, respectively. For further analysis, Fig. 3a, b shows how the models trace the actual GHI for one week of data from May and September, respectively. The plots clearly show that May has a smooth dataset, which both the LSTM and the GRU trace properly. September, however, has larger variation, randomness and uncertainty in the data, which neither the LSTM model nor the GRU model traces properly, although the GRU model traces the GHI more precisely than the LSTM model. In addition, Fig. 4a, b shows the annual GHI plot for the LSTM model and the GRU model, respectively.

Fig. 3 a, b Performance of LSTM and GRU on one week of May and September

Fig. 4 a, b Performance of LSTM and GRU for annual dataset

Overall, the GRU model achieved an annual average RMSE of 69.8117 W/m², whereas the benchmark model and the LSTM network achieved 148.94 W/m² and 71.7772 W/m², respectively. Moreover, the MAPE obtained by the GRU network, the LSTM network and the benchmark model was 11.5205%, 11.8521% and 28.64%, respectively. In addition, the LSTM, GRU and benchmark models obtained an annual \(R^{2}\) of 0.85, 0.86 and 0.52, respectively. Table 2 compares the results of the performed experiments with those of other studies.

Table 2 Comparative results of the performed experiments with other studies

This study used the clear sky index to make the data stationary instead of simple normalization. The results in Table 2 show that both networks performed well with the clear sky index calculation. Of the two networks, the GRU is better than the LSTM in terms of RMSE for meteorological data. This study achieved an RMSE of 69.8117 W/m², compared with 122.45 W/m² in Ref. [17] and 127.3 W/m² in Ref. [19].

5 Conclusion

This paper studies LSTM and GRU networks that use meteorological variables to forecast the GHI for the Ahmadabad, Gujarat, area. The meteorological variables are chosen on the basis of the strength of their correlation with the target variable. The clear sky index is used to stationarize the data and enhance the models' efficiency. A one-step-ahead monthly forecast was carried out to compare the performance of the LSTM and GRU networks with the benchmark model using the RMSE, MAPE and \(R^{2}\) statistics. The model configuration was selected by varying different hyperparameters; an initial learning rate of 0.02 with 70 hidden units performed best. The GRU and LSTM networks obtained RMSEs of 69.8117 W/m² and 71.7772 W/m² and MAPEs of 11.5205% and 11.8521%, respectively. Moreover, the LSTM and GRU networks obtained \(R^{2}\) statistics of 0.85 and 0.86, respectively. These comparative results show that the GRU network performs better than the LSTM network when meteorological variables are considered as input features.