1 Introduction

Automated Metering Infrastructure (AMI) is an integral part of a smart grid. AMI offers consumers facilities such as real-time consumption monitoring and automated billing, and it allows power utilities to implement value-added services such as distribution automation (DA) and demand response (DR). A typical AMI system is shown in Fig. 1 [2]. One of the major aspects of AMI is its ability to learn customer electricity consumption behavior and to optimize the quality of the smart grid services delivered to the user accordingly. A detailed analysis of the issues related to load forecasting and DR is presented in [3, 4]. AMI data has been applied to the scheduling of residential microgrids [5] and to weather-based load normalization aimed at increasing the accuracy of hourly consumption forecasts [6]. User segmentation based on forecasts of hourly power consumption profiles, with the aim of maximizing the benefits of smart grid services, especially DR, is addressed in [7].

Fig. 1. AMI block diagram [2]

However, few studies address the identification of suitable techniques for electricity consumption forecasting. Consumption patterns also change with the time of day and with the season, and this aspect has likewise received little attention. Consumption behavior is, moreover, very different for domestic consumers than for industrial users. The authors' earlier work [2] attempted to address these aspects empirically by analyzing the forecast performance of machine learning based systems for 485 Small and Medium Enterprise consumers. This paper presents a more comprehensive study covering 5275 users and four forecasting techniques.

The rest of the paper is organised as follows: Sect. 2 describes the proposed system and the methodology, Sect. 3 details the experimental results, and Sect. 4 presents the conclusions.

2 System Description and Methodology

The block diagram of the proposed system is presented in Fig. 2.

Fig. 2. Proposed system block diagram

2.1 Dataset and Preprocessing

All the models were developed on the CER electricity dataset acquired from the Irish Social Science Data Archive (ISSDA) [8], which comprises half-hourly electricity consumption data for 6445 users in three categories: (a) Small and Medium Enterprise (SME), 485 users; (b) Residential (RES), 4225 users; and (c) ‘Others’ (OTH), 1735 users. The data was recorded over the period 00:00 h, 14/07/2009 to 23:59 h, 29/12/2010. Inspection showed that, in the OTH category, consumption data for 1170 users is not available for the entire duration listed above; these users were not considered for further analysis. The total number of users considered is therefore \(6445 - 1170 = 5275\).

As the first preprocessing step, the consumption data for each user is split into four subsets: summer (May to July), winter (November to January), autumn (August to October) and spring (February to April). Different forecasting models are trained for each season separately. For comparison purposes, the models were also trained on the data without seasonal splits.
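
The seasonal split can be illustrated with a short sketch. The month-to-season mapping follows the definition above; the function name, the data layout (timestamp, reading) and the synthetic readings are illustrative assumptions and are not taken from the CER dataset.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Month-to-season mapping as defined in the text.
SEASON_OF_MONTH = {
    5: "summer", 6: "summer", 7: "summer",
    8: "autumn", 9: "autumn", 10: "autumn",
    11: "winter", 12: "winter", 1: "winter",
    2: "spring", 3: "spring", 4: "spring",
}

def split_by_season(readings):
    """Group (timestamp, kWh) pairs by season, preserving time order."""
    subsets = defaultdict(list)
    for stamp, kwh in readings:
        subsets[SEASON_OF_MONTH[stamp.month]].append((stamp, kwh))
    return dict(subsets)

# Synthetic half-hourly readings for one hypothetical user: 60 days from 14/07/2009.
start = datetime(2009, 7, 14, 0, 0)
readings = [(start + timedelta(minutes=30 * i), 0.5) for i in range(48 * 60)]
seasons = split_by_season(readings)
```

A separate forecasting model would then be trained on each entry of `seasons`, alongside one model trained on the unsplit `readings`.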

As the next step, after normalization of the data, the trend and cyclic components of the data are separated using the Hodrick-Prescott (HP) filter [9]. The HP filter has been used successfully in [10,11,12,13] as a preprocessing technique for financial forecasting systems, and it has been reported to improve the performance of electricity consumption forecasting systems in [2]. The machine learning based forecasting systems described later in the paper are trained separately to forecast the trend and cyclic components, and the two forecasts are then added to yield the estimated consumption for that instant. Systems that forecast the consumption directly from the normalized data (i.e. without the HP filter) are also evaluated.
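
A minimal sketch of the trend and cycle separation is given here, assuming the standard HP formulation in which the trend \(\tau\) minimizes \(\sum_t (y_t-\tau_t)^2 + \lambda \sum_t (\tau_{t+1}-2\tau_t+\tau_{t-1})^2\), so that \(\tau = (I+\lambda D^{T}D)^{-1}y\) with \(D\) the second-difference operator. The smoothing parameter \(\lambda\) is not stated in the text; the value 1600 used below is only a placeholder, and the input series is synthetic.

```python
import numpy as np

def hp_filter(y, lam):
    """Return (trend, cycle) of a 1-D series via the Hodrick-Prescott filter."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-difference operator D of shape (n-2, n).
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    # Closed-form minimizer of the HP objective (dense solve; fine for short series).
    trend = np.linalg.solve(np.eye(n) + lam * (D.T @ D), y)
    return trend, y - trend

# Synthetic series: slow drift plus a 48-step (one-day) cycle.
t = np.arange(96)
y = 0.02 * t + np.sin(2 * np.pi * t / 48)
trend, cycle = hp_filter(y, lam=1600.0)  # lam is an assumed value, not from the text
```

By construction `trend + cycle` reproduces the input, which is why the two component forecasts can simply be added to obtain the consumption estimate.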

The next step in the process is to train a machine learning system to estimate the electricity consumption. Three machine learning techniques, as well as linear regression (LR) models, are investigated for this purpose. Each technique is described in Sect. 2.2.

Fig. 3. MdAPE plots for all models with RES category users: (a) all seasons together (b) Autumn (c) Winter (d) Spring and (e) Summer.

Fig. 4. MdAPE plots for all models with SME users: (a) all seasons together (b) Autumn (c) Winter (d) Spring and (e) Summer.

Fig. 5. MdAPE plots for all models with OTH users: (a) all seasons together (b) Autumn (c) Winter (d) Spring and (e) Summer.

Fig. 6. MdAPE histogram for RES, SME and OTH consumers using the HP-ELM 2-hour ahead forecasting system for (a) all seasons together (b) Autumn (c) Winter (d) Spring and (e) Summer.

2.2 Machine Learning Techniques Considered

A total of four techniques, namely ELM, ANN, ERT and LR, were evaluated in the present study. Each technique was applied both with and without HP filter preprocessing, giving a total of eight forecasting models (see Fig. 2). The techniques and their system parameters are described below.

2.2.1 Extreme Learning Machines (ELMs)

ELMs are single hidden layer feedforward neural networks (SLFNNs) in which the weights connecting the input and hidden layers are fixed at random values and are not updated. The hidden-to-output layer weights are learnt in a single step using the pseudoinverse, which makes training much faster than for iteratively trained neural networks and is reported to give good generalization, with minimal training error and a small norm of weights [14, 15]. Kernel based ELMs have also been proposed [16], and the results of employing them have been encouraging, as seen in [16, 17]. A detailed treatment of kernel ELMs can be found in [15]. In the proposed work, the ELM models are designed using a linear kernel with a regularization coefficient of 10. The kernel function and the regularization coefficient were arrived at by trial and error. The estimated output of the ELM, \(\widehat{Y}\), can be expressed as follows:

$$\begin{aligned} \widehat{Y} = W_{H-O} \sigma _{A}(W_{I-H}X) \end{aligned}$$
(1)

where: \(W_{I-H}\) = input-to-hidden layer weight matrix, \(W_{H-O}\) = hidden-to-output layer weight matrix, \(X\) = input and \(\sigma _{A}\) = activation function.

Algorithm:

Step 1: Initialize \(W_{I-H}\) with random weights.

Step 2: Compute \(W_{H-O}\) using pseudoinverse:

$$\begin{aligned} W_{H-O} = Y \, \sigma _{A}(W_{I-H}X)^+ \end{aligned}$$
(2)
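
The two steps can be sketched as follows for the basic (non-kernel) ELM of Eq. (1). This is an illustration under stated assumptions, not the configuration used in the study: the kernel variant with regularization is not shown, the tanh activation, the hidden layer size and the toy data are assumed, and samples are stored as columns so that the least-squares output weights are \(Y\) times the pseudoinverse of the hidden activations (with samples stored as rows, the same solution is written as the pseudoinverse times \(Y\)).

```python
import numpy as np

def elm_fit(X, Y, n_hidden, rng):
    """Fit a basic ELM. X: (n_features, n_samples), Y: (n_outputs, n_samples)."""
    # Step 1: random input-to-hidden weights, never updated afterwards.
    W_ih = rng.standard_normal((n_hidden, X.shape[0]))
    H = np.tanh(W_ih @ X)               # hidden activations, (n_hidden, n_samples)
    # Step 2: hidden-to-output weights in one shot via the pseudoinverse.
    W_ho = Y @ np.linalg.pinv(H)
    return W_ih, W_ho

def elm_predict(W_ih, W_ho, X):
    """Evaluate Eq. (1): Y_hat = W_ho * activation(W_ih X)."""
    return W_ho @ np.tanh(W_ih @ X)

# Toy regression problem (synthetic, for illustration only).
rng = np.random.default_rng(0)
X = np.linspace(-1.0, 1.0, 200).reshape(1, -1)
Y = np.sin(3.0 * X)
W_ih, W_ho = elm_fit(X, Y, n_hidden=20, rng=rng)
mse = float(np.mean((elm_predict(W_ih, W_ho, X) - Y) ** 2))
```

No iterative optimization is involved: the only fitted quantity is `W_ho`, obtained from a single linear least-squares solve.
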

Fig. 7. MdAPE histogram for RES, SME and OTH category consumers using the ELM (without HP filter) 2-hour ahead forecasting system for (a) all seasons together (b) Autumn (c) Winter (d) Spring and (e) Summer.

2.2.2 Artificial Neural Networks (ANNs)

ANNs have been widely used for forecasting purposes [18]. The present study employs single hidden layer feedforward ANNs with 5 hidden neurons (a number arrived at by trial and error), trained using the Levenberg-Marquardt (LM) learning algorithm.

2.2.3 Ensemble Regression Trees (ERTs)

ERTs are forecast models formed, in general, as weighted combinations of several regression trees, which improves on the predictive performance of a single regression tree. The ERT models in the present study are developed using bagging [19] with 5 trees (a number selected by trial and error).
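
The bagging idea can be sketched as follows. To keep the example short, depth-1 regression trees (stumps) stand in for full regression trees, and the ensemble output is the unweighted average of the individual trees; both choices, together with all function names and the toy data, are assumptions made for illustration only.

```python
import numpy as np

def fit_stump(X, y):
    """Depth-1 regression tree: the (feature, threshold) split minimising squared error."""
    best = (np.inf, 0, -np.inf, y.mean(), y.mean())
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j])[:-1]:      # keep both sides non-empty
            left = X[:, j] <= thr
            lo, hi = y[left].mean(), y[~left].mean()
            sse = ((y[left] - lo) ** 2).sum() + ((y[~left] - hi) ** 2).sum()
            if sse < best[0]:
                best = (sse, j, thr, lo, hi)
    return best[1:]

def predict_stump(stump, X):
    j, thr, lo, hi = stump
    return np.where(X[:, j] <= thr, lo, hi)

def fit_bagged(X, y, n_trees, rng):
    """Train each tree on a bootstrap resample (sampling with replacement)."""
    stumps = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(y), size=len(y))
        stumps.append(fit_stump(X[idx], y[idx]))
    return stumps

def predict_bagged(stumps, X):
    """Average the predictions of all trees in the ensemble."""
    return np.mean([predict_stump(s, X) for s in stumps], axis=0)

# Synthetic step-shaped target, 5 trees as in the text.
rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 40).reshape(-1, 1)
y = (X[:, 0] >= 0.5).astype(float)
trees = fit_bagged(X, y, n_trees=5, rng=rng)
pred = predict_bagged(trees, np.array([[0.1], [0.9]]))
```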

2.2.4 Linear Regression (LR)

LR models are among the simplest curve fitting techniques available. The coefficients of the LR models considered in the present study were identified using the Ordinary Least Squares technique.
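
As a brief illustration, an Ordinary Least Squares fit can be obtained with a single least-squares solve. The inclusion of an intercept term, the function names and the toy data are assumptions; the text does not specify the exact form of the LR models.

```python
import numpy as np

def ols_fit(X, y):
    """Return OLS coefficients [intercept, slope_1, ..., slope_k] for rows of X."""
    A = np.column_stack([np.ones(len(X)), X])    # prepend a column of ones
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def ols_predict(coef, X):
    return coef[0] + X @ coef[1:]

# Noise-free toy data generated from y = 2 + 3*x1 - x2.
X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 5.0], [4.0, 2.0]])
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1]
coef = ols_fit(X, y)
```

In a forecasting setting, each row of `X` would hold the lagged consumption values and `y` the consumption at the forecast horizon.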

3 Results and Analysis

Due to the large amount of data to be processed, a representative subset from the full set of users was selected for each of the three categories SME, RES and OTH. All the models were first validated on these subsets and only the best performing models were considered for evaluation on the complete datasets. The subset selection criterion, as proposed in [20] is given below:

$$\begin{aligned} \text {sample size} = \frac{z^2 \, p(1-p)}{c^2} \end{aligned}$$
(3)

where: \(z=1.96\) for a \(95\%\) confidence level, \(p=0.5\) (variability; the maximum is 0.5) and \(c=0.05\) (i.e. 5%) is the confidence interval.

$$\begin{aligned} \text {sample size for a finite population} = \frac{\text {sample size}}{1+ \frac{\text {sample size} -1}{\text {total population}}} \end{aligned}$$
(4)
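
Equations (3) and (4) can be evaluated directly, as in the sketch below. The population sizes are the category sizes listed in Sect. 2.1; rounding to the nearest integer is an assumption, as the rounding rule is not stated.

```python
def sample_size(population, z=1.96, p=0.5, c=0.05):
    """Finite-population sample size from Eqs. (3) and (4)."""
    n0 = (z ** 2) * p * (1 - p) / c ** 2         # Eq. (3): about 384.16
    return n0 / (1 + (n0 - 1) / population)      # Eq. (4)

populations = {"SME": 485, "RES": 4225, "OTH": 1735}
# Rounding to the nearest integer is an assumption.
sizes = {name: round(sample_size(n)) for name, n in populations.items()}
# sizes -> {"SME": 215, "RES": 352, "OTH": 315}
```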

The nominal sample size required is found to be 215 for SME users, 352 for RES users and 315 for OTH users, with a confidence level of 95% and a confidence interval of 5%. The performance metric considered for the present study is the Median Absolute Percentage Error (MdAPE) [21], which has also been used as the performance measure in [2].
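
For reference, MdAPE is the median of the absolute percentage errors \(100\,|y-\widehat{y}|/|y|\). A minimal sketch follows; skipping instants with zero actual consumption is an assumption made here to avoid division by zero, and the numbers are made up.

```python
from statistics import median

def mdape(actual, forecast):
    """Median absolute percentage error, in percent."""
    errors = [
        abs(a - f) / abs(a) * 100.0
        for a, f in zip(actual, forecast)
        if a != 0                      # assumption: zero readings are skipped
    ]
    return median(errors)

actual = [100.0, 200.0, 50.0, 80.0, 10.0]
forecast = [110.0, 180.0, 50.0, 100.0, 40.0]
score = mdape(actual, forecast)
```

The individual errors here are approximately 10%, 10%, 0%, 25% and 300%. The median is about 10%, so the single large error does not dominate the score as it would with a mean-based measure.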

Fig. 8. Percentage increase in MdAPE between different forecast periods for forecasting systems trained using ELMs without splitting the time series using the HP filter: (a) Residential, (b) SMEs and (c) Others category users

All the systems under consideration are validated for their 2, 4, 6 and 8-hour ahead forecast capabilities. Since data is available at half-hourly intervals, this translates to 4, 8, 12 and 16 step-ahead forecasts. A sliding window technique with a window length of 10 was used to re-train the forecasting systems, i.e., for the two-hour ahead case,

$$\begin{aligned} \widehat{y}(t+4)=f(y(t), y(t-1),..., y(t-9)) \end{aligned}$$
(5)

where: \(\widehat{y}(t+4)\) = estimated two-hour ahead consumption, \(f(\cdot )\) = machine learning based forecasting technique and \(y(t)\) = consumption at time instant \(t\).
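
The input and target pairs implied by Eq. (5) can be built as in the sketch below. The function name and the toy series are illustrative assumptions; each window is stored oldest-first, i.e. \(y(t-9), \ldots, y(t)\), which carries the same information as the ordering in Eq. (5).

```python
def make_windows(series, window=10, horizon=4):
    """Pair each run of `window` readings with the value `horizon` steps after its last one."""
    inputs, targets = [], []
    for t in range(window - 1, len(series) - horizon):
        inputs.append(series[t - window + 1 : t + 1])   # y(t-9), ..., y(t)
        targets.append(series[t + horizon])             # y(t+horizon)
    return inputs, targets

# Forecast horizons in half-hourly steps: 2 h -> 4, 4 h -> 8, 6 h -> 12, 8 h -> 16.
steps = {hours: hours * 2 for hours in (2, 4, 6, 8)}

series = list(range(20))                                # toy series, not real data
inputs, targets = make_windows(series, window=10, horizon=steps[2])
```

Changing `horizon` to `steps[4]`, `steps[6]` or `steps[8]` gives the training pairs for the longer forecast horizons.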

Forecasting results for all the eight models (1–8 in Figs. 3, 4 and 5) are presented as stacked column charts in Figs. 3, 4 and 5. The five colors in the charts represent the MdAPE ranges 0–20%, 20–40%, 40–60%, 60–80% and >80% (legends 20, 40, 60, 80 and 80+).

It can be observed from the results presented in Figs. 3, 4 and 5 that ELM based models tend to outperform all the other models considered. Hence, the two ELM based models (HP-ELM and ELM without HP preprocessing) were evaluated on the entire set of 5275 users. The results are presented in Figs. 6 and 7. It can be observed from Fig. 8 that the 2-hour ahead forecasts were consistently better than the longer horizon forecasts of 6 and 8 hours ahead. The forecasting system trained with ELMs without the HP filter produced better results, and did so more consistently, than the other systems considered. Systems that employed the HP filter to separate the trend and cyclic components and were trained using ELMs or ANNs also generated good forecasts, albeit slightly worse than those of the ELM system without the HP filter. Linear regression models were consistently worse than the other three techniques considered.

4 Conclusion

Based on the results presented above, it can be concluded that shorter horizon forecasts tend to be more accurate with the proposed forecasting systems. ELM based forecasting systems are also observed to generate better forecasts than ANN or ERT based systems. Given this forecasting performance, ELMs can be considered good candidates for generating electricity consumption forecasts as part of a larger AMI system.