1 Introduction

Electricity suppliers need load balancing and supply-and-demand management to operate power plants up to the required demand level. Forecasting is commonly divided into three categories by horizon: (1) long term, (2) mid term, and (3) short term. Such forecasts can reduce electricity expenditure, because analyzing consumption data and patterns yields approximate values of future consumption. With these approximate future values, electricity suppliers can manage electricity prices [1], which eventually benefits the country’s economy. Properly planned electricity load and consumption save money that can support economic growth [2].

Statistical as well as machine learning models are useful for gaining insight into electricity consumption from a dataset. Generally, these consumption datasets are represented as time series. Time series data arise in any domain where values change over time, such as the stock market [3]. Statistical or machine learning models can be applied in other fields as well; for example, [4] introduces an approach to analyze a churn dataset with respect to geographic locations.

Auto-regressive and moving average models are widely used for such forecasting studies, but models from artificial intelligence can also be utilized to increase forecast accuracy. ARIMA stands for “Auto Regressive Integrated Moving Average.” An ARIMA model is described by three terms: p (the order of the auto-regressive part), d (the degree of differencing), and q (the order of the moving average part). Differencing means computing the difference between the present value and the prior value. ARIMA has been used in studies such as forecasting wheat production, infant mortality rate, and automated modeling for electricity production [5,6,7], among many others.
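The differencing step behind the d term can be illustrated with a minimal sketch. The function name `difference` and the sample readings are hypothetical and are not taken from the study's dataset.

```python
def difference(series, d=1):
    """Apply d rounds of first differencing: y'_t = y_t - y_{t-1}."""
    values = list(series)
    for _ in range(d):
        # Each round shortens the series by one observation.
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


# Hypothetical readings with an accelerating upward trend (illustrative only).
load = [10.0, 12.0, 15.0, 19.0, 24.0]
first_diff = difference(load, d=1)   # removes the level
second_diff = difference(load, d=2)  # removes the linear trend in the differences
```

In this toy series the second difference is constant, which is the kind of behavior differencing is meant to produce before the AR and MA terms are fitted.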

Forecasting electricity can impact industries that are directly linked to electricity production. Prices can rise and fall in such industries, for example the oil and gas industry [8], which supports electricity production. This underlines the importance of the study, as many other factors can be affected by improper management of electricity load balancing, production, supply, and demand.

In this study, the authors investigated comprehensive models for a related dataset. The authors identified the hyperparameters for the models, conducted experiments, and compared results on the basis of the MSE and RMSE metrics. The structure of the paper is as follows: Section 2 reviews related work and provides a better understanding of models such as ARIMA and neural networks. Section 3 describes the material and methods used in this study. Section 4 describes the experiments and discusses the observed results. Section 5 concludes the paper and discusses the results with a view to future work.

2 Literature Review

Neural networks are best known for image processing, especially medical image processing. Different models with various parameter settings have to be analyzed to determine the best outcomes. The study in [9] shows good progress in medical image processing: the authors analyze models with 5 thousand to 160 million parameters and varying numbers of layers on their large-scale dataset, which eventually helps in computer-aided detection (CAD). Neural networks can also help in understanding personality dynamics; they can determine whether a personality state is stable [10] and which variables affect it.

Photovoltaic (PV) integration can support economic growth, as PV is a promising source of renewable energy, and it therefore requires prediction and forecasting to support future decisions. PV data can be forecast with a neural network model known as LSTM. Abdel-Nasser and Mahmoud [11] analyze an LSTM-RNN that can capture temporal changes in PV output, evaluated on an hourly dataset covering one year.

Because electricity plays a key role in the economy, it has been studied by many researchers with different models and approaches. Bouktif et al. [12] combine LSTM with a genetic algorithm to obtain better results and performance on time series data for short-term and long-term forecasting. Similarly, [13, 14] propose LSTM-based models capable of forecasting the load of a single residence, where certain additional parameters are involved.

3 Material and Methods

In terms of hardware, we used a Core i7 processor along with an NVIDIA GPU with 8 GB of memory to reduce computation time. To develop the prediction models, we used Python with a Python integrated environment, the TensorFlow and Keras libraries for neural networks, and the statsmodels library for statistical models (Table 1).

Table 1 Abbreviation table

4 Building the Forecasting Model

We applied statistical as well as machine learning algorithms to extract information from the dataset. To evaluate our experiments, we considered the MSE and RMSE performance metrics.
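For reference, the two metrics can be computed as in the following minimal sketch. The helper names `mse` and `rmse` and the sample values are illustrative assumptions, not the study's code or data.

```python
import math


def mse(actual, predicted):
    """Mean squared error: average of the squared forecast errors."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: square root of the MSE, in the data's units."""
    return math.sqrt(mse(actual, predicted))


# Hypothetical observed and forecast values (illustrative only).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
example_mse = mse(actual, predicted)
example_rmse = rmse(actual, predicted)
```

Because RMSE is expressed in the same units as the series, it is often easier to interpret than MSE, while both rank models in the same order.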

4.1 ARIMA Model

ARIMA is a statistical model for forecasting values. Combining the AR and MA model formulas, we get the general formula

$$\displaystyle \begin{aligned} \hat{y}_{t}=\mu+\phi_{1} y_{t-1} + \cdots + \phi_{p} y_{t-p} - \theta_{1} e_{t-1} - \cdots - \theta_{q} e_{t-q}. \end{aligned} $$
(1)
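As a worked illustration of Eq. (1), the sketch below evaluates a one-step forecast from given coefficients. The function name `arma_one_step` and all numeric values are hypothetical; when d > 0, the inputs would be the differenced series rather than the raw observations.

```python
def arma_one_step(mu, phi, theta, past_y, past_e):
    """Evaluate Eq. (1) for one step ahead.

    phi[i] multiplies y_{t-1-i}; theta[j] multiplies e_{t-1-j}.
    past_y[-1] is y_{t-1} and past_e[-1] is e_{t-1}.
    """
    if len(past_y) < len(phi) or len(past_e) < len(theta):
        raise ValueError("not enough history for the given orders")
    ar_part = sum(phi[i] * past_y[-1 - i] for i in range(len(phi)))
    ma_part = sum(theta[j] * past_e[-1 - j] for j in range(len(theta)))
    # The MA terms enter with a minus sign, as in Eq. (1).
    return mu + ar_part - ma_part


# Hypothetical coefficients for p = 3 and q = 1 (illustrative only).
forecast = arma_one_step(
    mu=0.5,
    phi=[0.6, 0.2, 0.1],
    theta=[0.4],
    past_y=[1.0, 2.0, 3.0],
    past_e=[0.5],
)
# 0.5 + (0.6*3.0 + 0.2*2.0 + 0.1*1.0) - 0.4*0.5 = 2.6
```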

Figure 1 indicates that our data is not fully stationary, so some data preparation is needed before the ARIMA model can be used. It can also be observed that a number of spikes fall outside the critical range. These observations guide the choice of the orders for the ARIMA model: the significant spikes in the ACF and PACF suggest candidate values for q and p, and the lack of stationarity indicates that differencing (d) is required.
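To make the idea of spikes outside the critical range concrete, the following sketch computes a sample autocorrelation function and the usual approximate 95% bound of ±1.96/√n. The helper names and the trending toy series are assumptions for illustration; they do not reproduce Fig. 1.

```python
import math


def sample_acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("a constant series has no defined autocorrelation")
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def critical_bound(n):
    """Approximate 95% bound for the ACF of white noise of length n."""
    return 1.96 / math.sqrt(n)


# Hypothetical trending series (illustrative only); a trend is one
# common cause of non-stationarity.
series = [float(t) for t in range(20)]
acf_values = sample_acf(series, max_lag=5)
bound = critical_bound(len(series))
significant_lags = [k for k in range(1, 6) if abs(acf_values[k]) > bound]
```

For a trending series like this one, the autocorrelation decays slowly and stays above the bound at short lags, which is a typical sign that differencing is needed.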

Fig. 1 ACF and PACF

Figure 2 provides the results for the different hyperparameter values determined from Fig. 1. In Fig. 2, it can be seen that the suitable values for (p, d, q) are (3, 1, 1). Figure 3 shows the ARIMA model with hyperparameters (p, d, q) = (3, 1, 1), which yields an MSE of 0.028. This shows that the model ran successfully with this hyperparameter setting and yielded good results.

Fig. 2 ARIMA(3,1,1) result for GURO district

Fig. 3 ARIMA(3,1,1) result for GURO district

Figures 4 and 5 show the 4-month prediction: the blue line is the forecast, and the grey area is the 95% interval.

Fig. 4 Four-month prediction with ARIMA(3,1,1)

Fig. 5 Enhanced 4-month prediction with ARIMA(3,1,1)

5 Conclusions

The main focus of the study is to determine whether statistical models or neural networks perform better on the provided dataset. To compare all the models, we keep the hyperparameters consistent so that the comparison is as realistic as possible. ARIMA requires heavy data preprocessing, while neural networks are easy to adopt. This study will be expanded to obtain results from different neural network models that are applicable to time series prediction. Results produced by ARIMA are consistent and can be applied in real-world applications where the data pattern does not change most of the time. Meanwhile, LSTM and other neural network models are difficult to train but produce better results.