Keywords

1 Introduction

The largest consumer of electricity in the world is the building sector, for example, the US building’s primary and electrical energy consumption is more than 40 and 76% respectively [1]. Indeed, reducing energy consumption is essential to meet national energy and environmental challenges and reduce costs for building users. According to the US department of energy (DOE), the possibilities for improving efficiency are colossal. By 2030, the energy consumption of buildings could be reduced by more than 20% by using technologies known to be profitable today, and by more than 35% if the research objectives are met [2].

The main goal of a Home Energy Management System (HEMS) is to manage efficiently the flow of electricity in the house so as to lower the price of the electric bill, while maintaining the comfort of its occupants [3]. Monitoring the electricity consumption of several devices is a first step of any HEMS and aims to perform (1) detailed energy detection, (2) provide information on the distribution of energy consumed, (3) profile energy-consuming devices to develop energy conservation strategies, like reprogramming consumption of high power devices during off-peak hours [3, 4].

In some cases, to minimize waste and improper use of energy, users must provide a lot of configuration information, this collection of information generally involving arduous measurement campaigns [5, 6]. Thus, forecasting electrical devices consumption is a necessary and an important part of managing the power system in order to forecast people’s penchant for using electricity [7]. Indeed, forecasting consumer demand becomes very important as the non-urgent energy needs of users (washing machines, air conditioning or refrigeration) can be relatively postponed or anticipated to obtain a good adjustment of the production profile while maintaining consumer comfort and the level of service [7, 8]. Moreover, most of the work in the literature based on the forecasting of building consumption does not claim any attention to the individual consumption of each appliance.

In this paper, we propose a non-linear autoregressive neural network (NAR) model to predict the electrical consumption of the device in a house. Our objective is to provide a methodological framework for analyzing the historical data of the time series of device power consumption and to know if the forecasting of such energy consumption can be performed with this type of model. Moreover, we will assess the feasibility of the method using the UK-DALE public dataset and asses the model performance using various delays as well as different model configurations.

The rest of paper is organized as follows: Sect. 2 reviews related works on the prediction of electrical consumption in a house. Section 3 presents the proposed methodology and the description of the NAR model. It also describes the public dataset used, and the evaluation criteria employed. Section 4 shows the experimental results of the proposed model and conclusions are drawn in Sect. 5.

2 Related Work

Several works have been carried out in the literature to analyze, profile, classify and/or predict the electrical consumption in a house. This information can be used to plan energy-saving strategies, improve user behavior by helping them to change their habits in the use of household appliances, improve the overall grid performance and decrease the consumption of electricity [5]. Indeed, Ruiz et al. proposed in [9] a method based on non-linear autoregressive neural networks method to forecast energy consumption in public buildings. They compared the NAR model with the NARX model and demonstrated that, if no external data is available, energy consumption only can be used to obtain accurate forecasts. A method for building energy consumption using weighted SVRs was developed by Zhang et al. in [10]. Good results were obtained, their results showing that using half-hourly data a higher weight is given to nu-SVR , while, for daily data, a greater weight for epsilon SVR is applied. In [11], Fumo et al. presented a method based on linear regression analysis to forecast energy consumption in an individual family household. They claim that the future of residential energy prediction is going towards the development of single model for each house, due to the accessibility of smart meters data. Deb et al. [12] introduced a data-driven model to predict diurnal cooling load energy consumption for institutional buildings. They demonstrated that the artificial neural network is capable to predict the next day energy consumption based on 5 preceding days data with a suitable accuracy. In [13], Gul et Patidar focused on understanding the influence of occupancy of multi-purpose educational buildings in their energy consumption. Thus, they noted that detailed information on occupancy preference could help the management staff to rethink control policies for optimal energy management of the building. A technique based on short term load prediction in distribution systems using a neural network method was proposed by Ding et al. in [14]. They performed feature selection in models for electrical load prediction to enforce an optimal generalization capacity of the model. Motepe et al. presented a load prediction process for the distribution of power utilizing deep learning and hybrid AI algorithms [15]. They investigated the effect of the inclusion of loading cleaned data and weather variables on the load prediction performance of an hybrid AI method. In [16], Hong et al. developed a method for short-term residential load prediction based on deep learning. The results showed that both the devices load data and the proposed iterative ResBlocks can help to enhance the prediction performance.

All of these above studies, however, are focused on forecasting the building consumption and no attention is paid to forecasting the use of a single appliance, thus preventing their applicability to appliance scheduling. Several works have focused on appliance disaggregation [17,18,19,20,21,22], Kelly et al. proposed a deep neural network for energy disaggregation [35]. They obtained good performances and showed a good capacity for generalization on unknown houses. A method based on sequence-to-point learning with CNN for energy disaggregation was developed by Zhang et al. in [24]. They showed that the CNN can inherently learn the features of the target devices, and their methods achieve state-of-the-art performance. Barbato et al. focused on prediction the usage of household devices via power meter in [5]. They showed the effectiveness of their model in predicting device usage through experimental tests.

3 Methodology

A Non-linear autoregressive model (NAR) is a recurrent neural network model that can accept dynamic inputs [9]. Indeed, the choice of NAR can be explained by the fact that not only the power demand of electrical devices is a time series but also the classic recurrent network encounters some difficulties in the face of long-term dependence problems [25]. These difficulties have their origin in the problem of gradient descent [26]. According to Lin et al. in [27] the exponential decrease in the gradient means that the weights of the distant values do not change and, therefore, the network cannot be trained effectively. All internal recurrent networks suffer of this problem, which makes NARs models very adequate for use in forecasting and modeling time series [25].

The NAR model is used to forecast the values of a time series y(t), using the d past values of the series y(t). The NAR model has the following formula [9, 25]:

$$\begin{aligned} y(t) =F(y(t-1), y(t-2), ... , y(t-d)) + e(t) \end{aligned}$$
(1)

where the function F(\(\cdot \)) is, in our case, a neural network. Therefore, the purpose is to train the model in order to approximate the unknown function by optimizing the network bias and weights. The term e(t) represents the approximation error at time t. The d elements y (t − 1), y (t − 2), . . . , y (t − d), are called feedback delays. The architecture used is a Multi-Layer Perceptron (MLP) composed of three layers: input layer, hidden layer(s) and output layer. The number of neurons per layer and hidden layers are utterly flexible and optimized via a trial and error procedure to obtain the network topology that could give the best performance.

In the intended application, the NAR is used to predict the appliance’s power consumption. The inputs are the delayed versions of the consumption power, and the output the next value in the series y(t). In order to find the best delays, we trained a model with a specific topology, varying the number of delays. As it will be explained later on, we shall use power series of four different appliances: kettle, Fridge, Dish-washer and Washing Machine. The RMSEs (Root Mean Square Error) that were obtained in the test set are shown in Table 1.

Table 1 RMSE (W) with various delays

As it can be seen, there is not a specific delay that obtained the best values for all appliances. Using the average RMSE, the best value is obtained with 4 delays and the worst performance was obtained with 10 delays. We shall use 4 delays in the models from now on.

After the delays defined, the next step is to determine the number of hidden neurons and number of hidden layers to be used in the model.

Several tests were performed in order to determine the number of hidden layers and the number of hidden neurons in each layer. For that, the RMSE and the MAE (Mean Absolute Error) criteria. were used. For the training of the network we used the Levenberg-Marquardt algorithm as it is recognized to be the best method to train static neural networks [28]. Some results are presented in Table 2. Note that all the experimental results are obtained for the test dataset.

Table 2 Statistical indicators obtained with various number of hidden layers and hidden neurons: RMSE (W) and MAE (W)

As before, there is not a single topology that obtains the best values for all cases. Using again the average values of the RMSE we can conclude that the best configuration contains 8 neurons in one hidden layer. The performance of this configuration in terms of RMSE are: 78.51 W, 24.92 W, 34.70 W, and 38.96 W respectively for Kettle, Fridge, Dishwasher, and Washing Machine.

3.1 Evaluation Criteria

To assess Artificial Neural Network forecasting performance, many prediction accuracy catalogs are proposed in the literature. We have selected among others the Root Mean Square Error (RMSE), and the Mean Absolute Error (MAE), which seem to be the most used metric in model assessment [16, 29,30,31].

$$\begin{aligned} RMSE = \sqrt{\frac{1}{N}\sum _{1}^{N} {(y-p)^2}} \end{aligned}$$
(2)
$$\begin{aligned} MAE = \frac{\sum _{1}^{N} {|y-p|}}{N} \end{aligned}$$
(3)

In (2) and (3) N is the number of patterns, y are the measured values and p the predicted values

3.2 Dataset

Since the work of Kolter et al. [32] and the deployment of the smart meter, several datasets containing data from house-hold electrical consumption as well as individual appliance consumption have been made available for the public [23, 33, 34].

We used the UK-DALE (United Kingdom-Domestic Appliance Level) data set, which regroups aggregated and disaggregated device data for five houses in London, England, over several years. Aggregate data represents the power demand of all devices in the house. The power data of each device was measured via smart sockets on single devices which measured their individual energy demands. The individual power of each appliance and the global aggregated power data are sampled every six seconds (0.1667 Hz). The labeled measurement data of each device are also available.

The dataset regroups the measurements of over 10 types of devices, however, we used four devices in all our experiments: Kettle, fridge, dishwasher, and washing machine which are popular appliances for evaluating energy disaggregation algorithms [23, 24, 35, 36]. We selected these devices because all exist in at least 2 houses in the dataset. This means that, for each device, we can train our model in at least 1 house and test on another house not seen during training. Moreover, these four devices consume a considerable proportion of energy and expose a range of various power features from the two-states on/off of the kettle to the complex characteristics of the washing machine.

4 Results

We trained our ANN-NAR model on house 5 (Train dataset) of the UK-DALE dataset and evaluated the testing performance on house 2. House 2 (Test dataset) contains data for several months between 17-02-2013 and 10-10-2013 and house 5 (Train dataset) contains data measurements from 29-06-2014 to 13-11-2014. The results in terms of the MAE criterion in comparison with some works mentioned on the state of the art are presented in Table 3. We point out that all the 5 methods used techniques to disaggregate the device data from the aggregate consumption data of the house prior to predict the device power consumption. Here we only used the historical data of each device to train our model. It should also be noted that the experiments were carried out over the entire period that constitutes the test dataset (house 2 data).

Table 3 shows that the developed model has surpassed all the 5 other state of-the-art methods, in terms of the MAE criterion, for every appliance. FHMM and CO proposed by Batra et al. present the worst performance, followed by LSTM proposed by Kelly et al., for Kettle and Fridge. The Seq2point proposed by Zhang et al. out-performs LSTM, CO and FHMM on the four appliances but falls behind our NAR model, where the MAE was reduced by 33%, 92%, 90% and 75% for the Kettle, Fridge, Dishwasher and washing machine, respectively. The Autoencoder developed by Kelly et al. outperforms Seq2point for the kettle and dishwasher, but is still worse than our model, which obtains a reduction in MAE of 14% and 89% respectively.

Table 3 Mean absolute error (MAE) (W) performance for UK-DALE dataset on a house not seen during training. Best performances are shown in bold

To assess the performance of the proposed method, Figure 1 shows the measured power and the predicted power, for the 4 different appliances and for short periods. Notice, however, that the results shown in Table 3 are applicable to the whole test set.

Fig. 1
figure 1

Predicted versus measured appliance power consumption for a Kettle, b Fridge, c Dishwasher and d Washing machine

5 Conclusion

In this study, we proposed a non-linear autoregressive neural network model for forecasting appliance power consumption. We have considered four appliances that consume an important proportion of energy and exhibit different operation patterns, from two states to more complex pattern. We have tested the impact of hidden neurons and hidden layers, as well as number of delays on the performances of the model. It has also been shown that the model is highly efficient, obtaining better prediction performance than some state-of-the-art methods.

Our future research will focus on studying the effect of exogenous data on the forecasting method and how data other than power could be used to improve the accuracy of forecasting.