Abstract
Electricity plays a critical role in building an economy, and electricity consumption and generation can affect a country's overall policy. This importance opens an area for intelligent systems that can provide future insights. Intelligent management of electric power consumption requires predictions of future consumption with low error. These predictions support decisions that streamline policy and grow the country's economy. Predictions can be divided into three categories: (1) long-term, (2) short-term, and (3) mid-term. This study considers mid-term electricity consumption prediction. The dataset was provided by Korea Electric power supply to gain insights into a metropolitan city, Seoul. Because the dataset is a time series, both statistical and machine learning models can be used. This study provides experimental results for the proposed ARIMA and CNN-Bi-LSTM models. Hyperparameters are tuned for the ARIMA and neural network models to increase their accuracy; the results look promising, with an RMSE of 0.14 for training and 0.20 for testing.
1 Introduction
Electricity suppliers require forecasts for load balancing and for supply and demand management so that power plant capacity is used up to the required demand level. Forecasting is divided into three categories: (1) long-term, (2) mid-term, and (3) short-term. All three are necessary to streamline load balancing, demand, and supply according to the market's needs, and all three have been investigated in recent years with different parameters and perspectives. Such studies can reduce electricity expenditure by analyzing the data and consumption patterns behind approximate future consumption values. By analyzing these approximate future values, electricity suppliers can manage electricity prices [1], which eventually helps the country's economy. Properly planned electricity load and consumption can save money that helps build the economy [2].
Statistical and machine learning models are useful for extracting information about electricity consumption from a dataset. Generally, consumption datasets are time series, which can be univariate or multivariate. Time series observations carry time stamps whose granularity can range from seconds to years. Time series data arise in any domain where values change with time, such as the stock market [3]. Statistical and machine learning models can also be applied to other fields; for example, [4] introduces an approach to analyze a churn dataset that also contains geographic locations. Several other studies forecast electricity consumption with different variables and parameters [5,6,7,8,9]. Such models can help explain the variation in consumption while increasing accuracy and decreasing computation time.
Auto-regressive and moving-average models are widely used for such forecasting studies. However, models from artificial intelligence can also be used to increase forecast accuracy. ARIMA stands for "Auto-Regressive Integrated Moving Average". It comprises models that predict a value from its own past values, i.e., its lags, and from lagged forecast errors. Any non-seasonal time series that exhibits a pattern and is not random white noise can be modeled with ARIMA. An ARIMA model is described by three terms: p, the number of autoregressive lags; d, the degree of differencing; and q, the number of lagged forecast errors. The first step in building an ARIMA model is to make the time series stationary, because the auto-regressive part is a linear regression that uses the series' own lagged values as predictors, and linear regression works best when the predictors are uncorrelated and independent of each other. The most common way to make a series stationary is differencing, i.e., subtracting the previous value from the current value. Depending on the complexity of the series, differencing may have to be repeated. The value of d is therefore the minimum number of differencing operations needed to make the series stationary. ARIMA is used in studies such as forecasting wheat production, infant mortality rate, and automated modeling of electricity production [5, 10,11,12], among many others. Neural network models are inspired by biological neurons: neurons and the synapses connecting them carry out processing in the brain. These models can help in a broad spectrum of domains such as decision making, learning, emotions, and language, and they are well equipped to handle such experiments. Forecasting electricity can impact industries that are directly linked to electricity production.
Prices can go up and down for industries such as oil and gas [13], which contribute to electricity production. This emphasizes the study's importance, as many other factors can affect electricity load balancing, production, supply, and demand. In this study, the authors investigate several models for the dataset. The authors identify the hyperparameters for each model, conduct experiments, and compare results based on the MSE, RMSE, MAE, and MAPE metrics. These models provide enough insight into the dataset to produce more accurate forecasts. The authors also combine CNN with Bi-LSTM to create a new model to forecast and analyze electricity consumption. The paper is structured as follows: Section 2 reviews related work and gives a better understanding of models such as ARIMA and neural networks. Section 3 describes the materials and methods used in this study. Section 4 describes the experiments, and Section 5 discusses the observed results. Section 6 outlines future work, and Section 7 concludes the study.
2 Related work
ARIMA is a well-known forecasting model, especially in stock and finance applications, and [14] discusses in detail its pros, cons, and techniques to improve it. A forecasting model for commercial users in South Korea was studied by [8] for gas, petroleum, electric heat, and renewable energy to accommodate the United Nations convention on climate change. Likewise, [15] forecasts the production of fossil fuel sources in Turkey by comparing regression and ARIMA models and reports reliable results. As ARIMA is popular for stock market analysis, [3] forecasts stock market crises by considering the probability of a stock market crash in various time frames. ARIMA is also explored by [10], which uses the hyperparameter values (2,1,2) for wheat production. The exponential smoothing method, along with ARIMA, is studied by [16], which extends the approach with bagging to determine monthly load and states that the resulting models are suitable for developing countries. The "FB-Prophet" model is gaining popularity due to its usability and adaptability and is studied by [17] in a search for open-source tools and algorithms.
Neural networks are best known for image processing, especially medical image processing. Different models and parameters have to be analyzed to determine the best outcomes, and the studies of [18, 26] show promising results. Neural networks can also help in understanding personality dynamics: they can determine whether a personality state is stable [19] and which variables affect it. Photovoltaic (PV) integration can help economic growth. PV is a promising renewable energy source and thus requires prediction and forecasting to support future decisions. PV data can be forecast with the neural network model LSTM. LSTM-RNN is analyzed by [20], which shows that temporal changes in PV output can be evaluated from an hourly dataset covering one year. Electricity is considered a key player in economics and has therefore been studied by many researchers with different models and approaches. The approach of [21] uses LSTM together with a genetic algorithm to obtain better results and performance on time series data for short-term and long-term forecasting. Forecasting with LSTM is further improved in [22], where the authors combine CNN with Bi-LSTM to obtain better electricity forecasts for households. Because electricity plays an essential role, [23] also proposes an LSTM-based framework capable of forecasting the load of a single residence, where several other parameters are involved, and evaluates it on real residential smart meter data. Residential usage is essential, and many researchers are looking more deeply for patterns in it. Long-term electricity demand forecasts for residential users are also affected by other variables, such as the number of households in the residential area. The study in [24] considers these variables along with average consumption and electrification rate.
Modeling at fine granularity is quite challenging, as shown by [25]. Long- and short-term temporal patterns are modeled with LSTNet in [27], which extracts temporal dependencies among variables in a given time series dataset. Experiments conducted by [28] show the importance of data standardization and data sampling in overcoming the uncertainty associated with neural network training on time-distributed data.
Credit scoring can be improved with neural networks; [29] investigates five models with 10-fold cross-validation on two real datasets. Recurrent neural networks (RNNs) can be used for dynamic modeling of nonlinear data. Data play a vital role in overall modeling and experimentation, so [30] makes a simple modification to the RNN so that it works with nonlinear spatio-temporal data for forecasting applications. Computation time is also a critical factor in the overall forecasting process. It can be reduced substantially by decreasing the number of variables, as [31] did by relying only on past solar energy consumption data.
3 Materials and methods
In terms of hardware, we used a Core i7 processor and an NVIDIA GPU with 8 GB of memory to reduce computation time. To develop the prediction models, we used Python in an integrated development environment, with the TensorFlow and Keras libraries for the neural network models and the statsmodels library for the statistical models.
3.1 Dataset
The dataset was provided by the Korea electric supply company. It has a shape of (3120, 6), i.e., 3120 observations collected across 25 districts. The six columns are household, public, services, industries, Total, and Districts, with the timestamp as the index for time series prediction. The dataset can be preprocessed into a univariate representation by considering one district at a time. The observations span ten years, from 1st January 2009 to 1st December 2018. While preprocessing the dataset, we found multiple observations for January, which made the data uninterpretable for time series analysis. We treated this as a labeling mistake because the observations for October were missing, and we relabeled the second January observation as October. After resolving the duplicate observations, we checked for missing values and found none in the 3120 observations, which simplified further preprocessing.
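The relabeling step can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `relabel_duplicate_january`, the (date, value) record layout, and the toy rows are hypothetical, and a year is repaired only when it has two January rows and no October row.

```python
from datetime import date


def relabel_duplicate_january(records):
    """Relabel the second January record of each year as October of that year.

    `records` is a list of (date, value) pairs in their original file order.
    Assumption: a year is repaired only if it has two January rows and no October row.
    """
    october_years = {d.year for d, _ in records if d.month == 10}
    seen_january = set()
    repaired = []
    for d, value in records:
        if d.month == 1:
            if d.year in seen_january and d.year not in october_years:
                d = date(d.year, 10, 1)  # second January becomes October
                october_years.add(d.year)
            seen_january.add(d.year)
        repaired.append((d, value))
    # Sort chronologically so the relabeled row lands in its October position.
    return sorted(repaired, key=lambda pair: pair[0])


# Hypothetical 2009 rows: January appears twice and October is absent.
rows = [(date(2009, m, 1), float(m)) for m in range(1, 13) if m != 10]
rows.insert(9, (date(2009, 1, 1), 99.0))

fixed = relabel_duplicate_january(rows)
months = [d.month for d, _ in fixed]
missing = [d for d, v in fixed if v is None]  # missing-value check
print(months, len(missing))
```

After the repair, each year should contain exactly one row per month, which is the property a monthly time series index needs.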
Table 1 is the abbreviation table, and Table 2 describes the dataset's variables. As the observations are monthly, we can use this dataset for mid-term forecasting, i.e., forecasting a few months ahead. In the Districts column, the value named "Total" can be excluded because the monthly total can be calculated separately. This exclusion makes the dataset easier to interpret.
Figure 1 shows the 10-year electricity consumption in the Guro district, whereas Fig. 2 shows the Dobong district in the "Public" category. We chose the Guro district for our analysis, which also makes the time series univariate. Statistical analysis can then be performed on the preprocessed dataset (Table 3).
4 Building the forecasting
We applied statistical as well as machine learning algorithms to extract information from the dataset. To evaluate our experiments, we considered performance metrics such as MSE and RMSE.
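For reference, the error metrics used in this study (MSE and RMSE, together with MAE and MAPE) can be computed as in the sketch below. The function name `forecast_errors` and the sample values are illustrative assumptions; the definitions themselves are the standard ones.

```python
import math


def forecast_errors(actual, predicted):
    """Return MSE, RMSE, MAE and MAPE (in percent) for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [a - p for a, p in zip(actual, predicted)]
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    # MAPE is undefined when an actual value is zero; this sketch assumes none are.
    mape = 100.0 * sum(abs(r / a) for r, a in zip(residuals, actual)) / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape}


# Hypothetical values, for illustration only.
scores = forecast_errors([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(scores)
```

RMSE is reported in the same units as the series, which is why scores computed on normalized data are not directly comparable with scores computed on raw consumption values.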
4.1 ARIMA model
ARIMA is a statistical model for forecasting values. Its general formula is obtained by combining the AR and MA model formulas.
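In standard textbook notation, the combined ARIMA(p, d, q) model can be written as below. The symbols are the conventional ones and are assumed here rather than taken from the source: c is a constant, the φ_i are the AR coefficients, the θ_j are the MA coefficients, ε_t is a white-noise error, B is the backshift operator (B y_t = y_{t-1}), and y'_t is the series after d rounds of differencing.

```latex
y'_t = c + \sum_{i=1}^{p} \phi_i \, y'_{t-i} + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j} + \varepsilon_t,
\qquad y'_t = (1 - B)^d \, y_t
```

For d = 1 the differenced series is simply y'_t = y_t - y_{t-1}.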
Figure 3 indicates that our data are not stationary, so some data preparation is needed before using the ARIMA model. It can also be observed that a number of spikes fall outside the critical range. This observation can be used to determine the p and d values for the ARIMA model.
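The usual preparation for a non-stationary series is differencing, which is what the d term of ARIMA counts. The sketch below is a minimal pure-Python illustration; the function name `difference` and the toy series are hypothetical and not taken from the dataset.

```python
def difference(series, d=1):
    """Apply first-order differencing d times: y'[t] = y[t] - y[t-1]."""
    values = list(series)
    for _ in range(d):
        # Each pass shortens the series by one observation.
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


# Hypothetical toy series with a quadratic trend.
toy = [1, 4, 9, 16, 25, 36]
once = difference(toy, d=1)   # a linear trend remains
twice = difference(toy, d=2)  # constant values, so the trend is removed
print(once, twice)
```

In this toy case two rounds are needed before the values stop trending, so d would be 2; for a series with only a linear trend, one round is enough.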
Figure 4 provides the results for the different hyperparameters determined from Fig. 3. In Fig. 4, it can be seen that suitable values for p, d, and q are 3, 1, and 1, respectively. Figure 5 provides a pictorial view of the ARIMA model with hyperparameter (p, d, q) values (3, 1, 1), which gives an MSE of 0.028. It shows that the model ran successfully with this hyperparameter setting and yielded good results.
Figure 6 shows the 4-month forecast as a blue line, with the 95% interval around the predicted values shaded in gray (Fig. 7).
4.2 LSTM
LSTM requires the data to be reshaped into samples, time steps, and features. The data were reshaped accordingly and passed to two LSTM layers configured with 50 neurons each to boost the learning process. The ReLU activation function was used with return_sequences set to True so that data can be passed from one layer to the next. After 200 epochs, we obtained a train score of 0.15 RMSE and a test score of 0.21 RMSE.
Figure 8 shows the LSTM result for the 120 observations with a 70–30% split: 70% for training and 30% for testing (Fig. 9).
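The reshaping into (samples, time steps, features) and the 70/30 split can be sketched as follows. This is an illustration under assumptions: the helper names, the window length of 3 time steps, and the synthetic series are hypothetical, and the split is assumed to be chronological (no shuffling), which is the usual choice for time series.

```python
def make_windows(series, timesteps):
    """Turn a 1-D series into (samples, timesteps, features=1) inputs and next-step targets."""
    inputs, targets = [], []
    for start in range(len(series) - timesteps):
        window = series[start:start + timesteps]
        inputs.append([[value] for value in window])  # one feature per time step
        targets.append(series[start + timesteps])
    return inputs, targets


def chronological_split(items, train_fraction=0.7):
    """Split without shuffling so that the test set lies strictly after the training set."""
    cut = round(len(items) * train_fraction)
    return items[:cut], items[cut:]


# Hypothetical stand-in for 120 monthly observations of one district.
series = [float(i) for i in range(120)]
train, test = chronological_split(series)
x_train, y_train = make_windows(train, timesteps=3)  # window length is an assumption
print(len(train), len(test), len(x_train), len(x_train[0]), len(x_train[0][0]))
```

With 120 observations the split gives 84 training and 36 testing values, which illustrates how little data each recurrent model sees.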
4.3 Bi-LSTM model
The Bi-LSTM model also receives normalized data in our experiment. We used two bidirectional LSTM layers with the same configuration as in the simple LSTM model. The ReLU activation function was used with Adamax as the optimizer. To keep the same criteria for training and testing, we also ran this model for 200 epochs, which produced a train score of 0.14 RMSE and a test score of 0.22 RMSE.
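The text does not state which normalization was applied. One common choice is min-max scaling, sketched below purely as an assumption; the function names and values are hypothetical. The scaling range is learned from the training portion only, so that no information from the test period leaks into training.

```python
def fit_min_max(values):
    """Return (low, high) learned from the training portion only."""
    return min(values), max(values)


def scale(values, low, high):
    """Map values linearly so that low -> 0 and high -> 1."""
    span = high - low
    if span == 0:
        raise ValueError("cannot scale a constant series")
    return [(v - low) / span for v in values]


def unscale(values, low, high):
    """Invert `scale`, e.g. to report forecasts in the original units."""
    return [v * (high - low) + low for v in values]


# Hypothetical consumption values.
train_values = [10.0, 20.0, 30.0]
test_values = [40.0]
low, high = fit_min_max(train_values)
scaled_train = scale(train_values, low, high)
scaled_test = scale(test_values, low, high)  # may exceed 1 when the test period is higher
restored = unscale(scaled_test, low, high)
print(scaled_train, scaled_test, restored)
```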
Figure 10 shows the training and testing results, while Fig. 11 shows the model loss. It can be seen that the loss increases after a certain period. This is because there are few observations to work with.
4.4 Bi-LSTM: LSTM model
Our experiments also include a combined model of Bi-LSTM and LSTM layers. The configuration is the same as in the previous experiments. We used two Bi-LSTM layers and two LSTM layers. After 200 epochs, we obtained a training score of 0.15 RMSE and a testing score of 0.26 RMSE.
Figure 12 shows the training and testing results of the model for the 120 observations, while Fig. 13 shows the model loss. The loss fluctuates more than that of the Bi-LSTM model, and its trend is upward, which means that the model produces more loss over time. Like the other models, this model can be re-tuned for mid-term forecasting applications.
4.5 CNN-LSTM model
A convolution layer is introduced in this experiment with the number of filters equal to 1 and the kernel size equal to 1. We chose the same ReLU activation function for consistency. A max pooling layer with pool size 1 is used. Two LSTM layers with the same configuration are also included, followed by one dense layer because a single output is needed. The dataset is split in a 70:30 ratio for training and testing, respectively. After 200 epochs, the model produces 0.13 RMSE for training and 2.06 RMSE for testing. The testing score is not as good as we anticipated, which suggests that CNN combined with LSTM might not be a good combination for our dataset.
Figure 14 shows the CNN-LSTM results, while Fig. 15 shows the model loss. After the 100th observation there is a downward spike in Fig. 14, and around the same observation Fig. 15 shows a higher model loss. This is because the LSTM layers took some time to learn from the features fed to them by the CNN layer.
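To see what the convolutional front end computes with these settings, the sketch below reproduces it in plain Python for a single-feature sequence. The function names, the weight, and the bias are hypothetical. With one filter and kernel size 1, the convolution applies the same scalar weight and bias to every time step, and max pooling with pool size 1 returns its input unchanged, so neighboring time steps are never mixed before the recurrent layers.

```python
def conv1d_single_filter_k1(sequence, weight, bias):
    """Conv1D with one filter and kernel size 1 on a single-feature sequence, followed by ReLU."""
    return [max(0.0, weight * x + bias) for x in sequence]


def max_pool1d(sequence, pool_size):
    """Non-overlapping max pooling (stride equal to pool_size)."""
    return [max(sequence[i:i + pool_size])
            for i in range(0, len(sequence) - pool_size + 1, pool_size)]


# Hypothetical normalized inputs and hypothetical learned parameters.
inputs = [0.25, 0.5, 1.0]
activated = conv1d_single_filter_k1(inputs, weight=2.0, bias=-1.0)
pooled = max_pool1d(activated, pool_size=1)
print(activated, pooled)
```

Note that ReLU maps every negative pre-activation to zero, so with an unfavorable weight or bias several distinct inputs can collapse to the same value before reaching the LSTM layers.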
4.6 CNN-Bi-LSTM model
The main focus of this study is ARIMA and this model, as they can produce better forecasting results. The proposed model combines a CNN layer with Bi-LSTM layers. The CNN layer is configured as in the CNN-LSTM model to keep evaluation and comparison consistent. After the convolution layer, we added two Bi-LSTM layers. After 200 epochs, the model produces a training score of 0.14 RMSE and a testing score of 0.20 RMSE.
Figure 16 shows the training and testing results of the proposed model, while Fig. 17 shows its loss. From Fig. 17, it can be observed that the loss fluctuates but still has a downward trend, which suggests that this model will perform better with a larger dataset.
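A layer stack matching this description could be assembled in Keras roughly as follows. This is a sketch under assumptions, not the authors' implementation: the window length, the loss function, and the names `CONFIG` and `build_cnn_bilstm` are hypothetical, and the 50 units, ReLU activation, Adamax optimizer, and single dense output are carried over from the earlier model descriptions.

```python
# Settings taken from the text where stated; entries marked "assumed" are not in the source.
CONFIG = {
    "timesteps": 3,        # assumed window length
    "filters": 1,
    "kernel_size": 1,
    "pool_size": 1,
    "lstm_units": 50,      # carried over from the LSTM description
    "activation": "relu",
    "optimizer": "adamax", # carried over from the Bi-LSTM description
    "loss": "mse",         # assumed training loss
    "epochs": 200,
}


def build_cnn_bilstm(cfg=CONFIG):
    """Build Conv1D -> MaxPooling1D -> 2 x Bidirectional(LSTM) -> Dense(1)."""
    # Imported lazily so that the settings can be inspected without TensorFlow installed.
    from tensorflow import keras

    layers = keras.layers
    model = keras.Sequential([
        keras.Input(shape=(cfg["timesteps"], 1)),
        layers.Conv1D(cfg["filters"], cfg["kernel_size"], activation=cfg["activation"]),
        layers.MaxPooling1D(pool_size=cfg["pool_size"]),
        # The first recurrent layer returns the full sequence so the second can consume it.
        layers.Bidirectional(layers.LSTM(cfg["lstm_units"], activation=cfg["activation"],
                                         return_sequences=True)),
        layers.Bidirectional(layers.LSTM(cfg["lstm_units"], activation=cfg["activation"])),
        layers.Dense(1),  # single forecast value
    ])
    model.compile(optimizer=cfg["optimizer"], loss=cfg["loss"])
    return model
```

Calling `build_cnn_bilstm()` returns a compiled model that can be trained with `model.fit` on inputs of shape (samples, time steps, 1) for the number of epochs given in `CONFIG`.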
5 Experimental results
Table 4 shows the results of the proposed models. ARIMA performs extremely well, but it requires extensive data preprocessing before the data can be analyzed by the model, and its results can vary when the patterns in the incoming data change. Moreover, real-world series contain seasonality, trend, and white noise, which makes them difficult to make stationary, and ARIMA cannot be applied to a time series that is not stationary. Meanwhile, CNN-Bi-LSTM also shows good results without extensive data preprocessing, which makes neural networks easy to adopt in real-world applications. RNN-LSTM models in particular make it easy to forecast short-, mid-, and long-term electricity consumption. Our models show good results except for the CNN-LSTM model, which spikes after the 100th observation: the LSTM was not able to interpret the features coming from the CNN layer because the number of observations was very small. To be precise, we have only 120 observations for a single district, which explains this behavior.
6 Future directions
In the future, we plan to develop and test automated ARIMA models that can determine the ARIMA order and preprocess the incoming data before it is passed to the model. Neural networks also show promising results, and we believe the process can be optimized to obtain more accurate results with less computation time.
7 Conclusions
This study developed and proposed a number of forecasting models for energy consumption prediction. The main focus of the study is to determine whether statistical models or neural networks perform better on the provided dataset. To compare all the models, we kept the hyperparameters consistent so that the comparison is as realistic as possible. ARIMA and CNN with Bi-LSTM perform well in our study. ARIMA requires heavy data preprocessing, while neural networks are easy to adopt. Our results show that only the CNN-LSTM combination did not perform well, as discussed in the experiment and results sections. The results produced by ARIMA are dependable and can be applied in real-world applications where data patterns rarely change. Meanwhile, CNN combined with Bi-LSTM performed well, with the lowest MSE and RMSE after ARIMA.
References
Siano P (2014) Demand response and smart grids—a survey. Renew Sustain Energy Rev. https://doi.org/10.1016/j.rser.2013.10.022
Ardakani FJ, Ardehali MM (2014) Long-term electrical energy consumption forecasting for developing and developed economies based on different optimized models and historical data types. Energy. https://doi.org/10.1016/j.energy.2013.12.031
Chatzis SP, Siakoulis V, Petropoulos A, Stavroulakis E, Vlachogiannakis N (2018) Forecasting stock market crisis events using deep and statistical machine learning techniques. Expert Syst Appl. https://doi.org/10.1016/j.eswa.2018.06.032
Long HV, Son LH, Khari M, Arora K, Chopra S, Kumar R, Le T, Baik SW (2019) A new approach for construction of geodemographic segmentation model and prediction analysis. Comput Intell Neurosci. https://doi.org/10.1155/2019/9252837
Kavasseri RG, Seetharaman K (2009) Day-ahead wind speed forecasting using f-ARIMA models. Renew Energy. https://doi.org/10.1016/j.renene.2008.09.006
Fan S, Hyndman RJ (2012) Short-term load forecasting based on a semi-parametric additive model. IEEE Trans Power Syst. https://doi.org/10.1109/TPWRS.2011.2162082
Kaytez F, Taplamacioglu MC, Cam E, Hardalac F (2015) Forecasting electricity consumption: a comparison of regression analysis, neural networks and least squares support vector machines. Int J Electr Power Energy Syst. https://doi.org/10.1016/j.ijepes.2014.12.036
Ha S, Tae S, Kim R (2019) Energy demand forecast models for commercial buildings in South Korea. Energies. https://doi.org/10.3390/en12122313
Shinde P, Amelin M (2019) A literature review of intraday electricity markets and prices. IEEE Milan PowerTech. https://doi.org/10.1109/PTC.2019.8810752
Masood MA, Abid S (2018) Forecasting wheat production using time series models in Pakistan. Asian J Agric Rural Dev. https://doi.org/10.18488/journal.1005/2018.8.2/1005.2.172.177
Mishra AK, Sahanaa C, Manikandan M (2019) Forecasting Indian infant mortality rate: an application of autoregressive integrated moving average model. J Family Community Med. https://doi.org/10.4103/jfcm.JFCM_51_18
Amin P, Cherkasova L, Aitken R, Kache V (2019) Automating energy demand modeling and forecasting using smart meter data. In: Proceedings—2019 IEEE International Congress on Internet Of Things, ICIOT 2019—Part of the 2019 IEEE World Congress on Services. https://doi.org/10.1109/ICIOT.2019.00032
Debnath KB, Mourshed M (2018) Forecasting methods in energy planning models. Renew Sustain Energy Rev. https://doi.org/10.1016/j.rser.2018.02.002
Author Information Pack (2018) Adv. Account. https://doi.org/10.1016/s0882-6110(18)30184-6
Ediger VŞ, Akar S, Uǧurlu B (2006) Forecasting production of fossil fuel sources in Turkey using a comparative regression and ARIMA model. Energy Policy. https://doi.org/10.1016/j.enpol.2005.08.023
de Oliveira EM, Cyrino Oliveira FL (2018) Forecasting mid-long term electric energy consumption through bagging ARIMA and exponential smoothing methods. Energy. https://doi.org/10.1016/j.energy.2017.12.049
Romero-Gelvez JI, Delgado-Sierra EA, Herrera-Cuartas JA, Garcia-Bedoya O (2019) Demand forecasting and material requirement planning optimization using open source tools. In: CEUR workshop proceedings
Shin HC, Roth HR, Gao M, Lu L, Xu Z, Nogues I, Yao J, Mollura D, Summers RM (2016) Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging. https://doi.org/10.1109/TMI.2016.2528162
Saeed F, Paul A, Hong WH, Seo H (2020) Machine learning based approach for multimedia surveillance during fire emergencies. Multimed Tools Appl. https://doi.org/10.1007/s11042-019-7548-x
Read SJ, Droutman V, Smith BJ, Miller LC (2019) Using neural networks as models of personality process: a tutorial. Pers Individ Differ. https://doi.org/10.1016/j.paid.2017.11.015
Abdel-Nasser M, Mahmoud K (2019) Accurate photovoltaic power forecasting models using deep LSTM-RNN. Neural Comput Appl. https://doi.org/10.1007/s00521-017-3225-z
Bouktif S, Fiaz A, Ouni A, Serhani MA (2018) Optimal deep learning LSTM model for electric load forecasting using feature selection and genetic algorithm: comparison with machine learning approaches. Energies. https://doi.org/10.3390/en11071636
Le T, Vo MT, Vo B, Hwang E, Rho S, Baik SW (2019) Improving electric energy consumption prediction using CNN and Bi-LSTM. Appl Sci. https://doi.org/10.3390/app9204237
Kong W, Dong ZY, Jia Y, Hill DJ, Xu Y, Zhang Y (2019) Short-term residential load forecasting based on LSTM recurrent neural network. IEEE Trans Smart Grid. https://doi.org/10.1109/TSG.2017.2753802
Pessanha JFM, Leon N (2015) Forecasting long-term electricity demand in the residential sector. Proc Comput Sci. https://doi.org/10.1016/j.procs.2015.07.032
Rodrigues F, Cardeira C, Calado JMF (2014) The daily and hourly energy consumption and load forecasting using artificial neural network method: a case study using a set of 93 households in Portugal. Energy Proc. https://doi.org/10.1016/j.egypro.2014.12.383
Lai G, Chang WC, Yang Y, Liu H (2018) Modeling long- and short-term temporal patterns with deep neural networks. In: 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018. https://doi.org/10.1145/3209978.3210006
Kourentzes N, Barrow DK, Crone SF (2014) Neural network ensemble operators for time series forecasting. Expert Syst Appl. https://doi.org/10.1016/j.eswa.2013.12.011
West D (2000) Neural network credit scoring models. Oper Res. https://doi.org/10.1016/S0305-0548(99)00149-5
McDermott PL, Wikle CK (2019) Bayesian recurrent neural network models for forecasting and quantifying uncertainty in spatial-temporal data. Entropy. https://doi.org/10.3390/e21020184
Majidpour M, Nazaripouya H, Chu P, Pota H, Gadh R (2018) Fast univariate time series prediction of solar power for real-time control of energy storage system. Forecasting. https://doi.org/10.3390/forecast1010008
Acknowledgements
This work was supported by the faculty research fund of Sejong University in 2020 and also supported by the Energy Cloud R&D Program (Grant No. 2019M3F2A1073184) through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT.
Cite this article
Gul, M.J., Urfa, G.M., Paul, A. et al. Mid-term electricity load prediction using CNN and Bi-LSTM. J Supercomput 77, 10942–10958 (2021). https://doi.org/10.1007/s11227-021-03686-8
DOI: https://doi.org/10.1007/s11227-021-03686-8