1 Introduction

The spread of prosumers in smart grids has forced companies to offer sustainable energy services using bidirectional flows [1, 2]. Currently, prosumers consume their own energy, but when production is not enough, they demand it from the utilities. In contrast, when they produce a surplus, they inject it into the public grid to get an economic incentive in return [3]. According to several investigations, optimizing energy management is one of the most essential factors in the use of renewable systems [1, 4]. Efficient consumption benefits prosumers in reducing bills or even fully covering their energy needs; however, it is necessary to guide them in the transition to this process [3]. Fortunately, the interaction with technological applications (e.g. recommender systems, customer service portals) has demonstrated positive results by helping prosumers reduce energy costs in the long term [5,6,7].

On the other hand, utilities are responsible for collecting consumption data from smart meters at different intervals (e.g. hourly, daily). Analysis of this data combined with additional variables (e.g. weather and energy prices) [8, 9] can help them identify consumption patterns. Currently, there are several applications that allow the utilities to forecast energy demand and supply. Usually, these applications consider different factors such as weather conditions, socio-economic data and energy load profiles (e.g. BDEW standard load profiles). Some of these applications are currently used mostly at the level of (larger) network operators or by large utilities for better short-term management of supply and demand.

Forecasting applications typically rely on complex machine learning models that operate as black boxes [10] and can be challenging to follow for users without technical knowledge in the area [11]. Their internal functioning cannot be easily interpreted, so it is hard to understand how the forecasting results were actually produced. Knowing the input and output data without being able to understand the procedure used to calculate the forecast can be a concerning situation, especially for small municipal utilities, which are often not used to sophisticated modeling and simulation approaches.

In order to provide an explainable forecasting process (avoiding the black-box procedure), we propose a method for heuristic day-ahead forecasting based on direct interaction with a visual analytics system. The idea is to model and implement analytics procedures that allow the utility analyst to inspect and compare historical data with respect to the available attributes characterizing the day ahead (e.g. weather conditions, month, day of the week and related factors) in order to identify consumption patterns that are likely to correspond to the expected situation on the next day. This process could combine manual and semi-automatic analysis (with the system suggesting similar days and periods based on criteria defined by the analyst). Supporting this exploratory process with appropriate interactive visualizations can aid the manual analysis, support sense-making of the information presented and allow the analyst to determine the day-ahead forecast in a heuristic and explainable manner.

This article presents work in progress on the development of the main use cases as part of the SIT4Energy project. The proposal comprises an approach, a method, a system design and some initial mockups for the implementation of an explainable Smart Energy Dashboard to forecast energy consumption and production for the utilities. We propose the integration of decision trees and k-nearest neighbor (KNN) methods to implement an explainable forecasting process. This paper is structured as follows. Section 2 presents related work on the development of solutions to support the utilities in managing energy regarding the penetration of prosumers into the energy market. Section 3 presents the analysis of needs and requirements obtained from initial research with target end-users. Section 4 describes the approach for the development of our solution. Section 5 describes the system design for the implementation of the proposed approach. Section 6 presents the definition of guidelines for the evaluation of the proposed approach. Finally, Sect. 7 provides conclusions and outlines future work.

2 Related Work

One of the main concerns related to renewable energy is the development of technology to support the new energy market, which includes the installation of renewable systems and provides prosumers with monitoring and supporting tools [12, 13]. For example, Fiorentino et al. reported that recommendation systems have a significant impact by helping prosumers optimize energy consumption and production [5]. In addition, Yan et al. [8] proposed designing scenarios in order to define significant variables (e.g. temperature, solar radiation, humidity) to provide accurate information on energy demand. In this way, users can get a more personalized prediction based on their consumption preferences (e.g. low, medium, or high consumption) as well as storage device types and capacity.

On the other hand, due to changes in the power generation market, the operations performed by the utilities have been extended with activities such as planning energy production or purchasing it from third parties [2]. This situation leads to the necessity to put data analysis at the forefront. At the same time, real-time pricing has emerged as one of the main factors impacting the energy supply, especially during peak times [14, 15]. Therefore, the utilities have to anticipate changes in energy consumption associated with dynamic prices [16]. This situation has encouraged the development of tools to support the management of energy consumption and production [17, 18], including algorithms to forecast different conditions such as building temperature, solar radiation and gas emission, and therefore, contributing to the identification of patterns in the prediction of energy demand.

Currently, there are several machine learning methods to forecast energy consumption and production; however, it is essential to analyze the trade-off between explainability and accuracy (Fig. 1). As an example, neural networks focus on providing more accurate results, but they lack interpretability because they operate as black boxes [19]. In contrast, the decision tree model is considered one of the most transparent methods; it is based on a multi-level structure and has been widely applied in data mining for knowledge discovery [20] (Fig. 2).

Fig. 1. Accuracy vs. explainability of machine learning models (image adapted from [22])

Fig. 2. (a) The Neural Network Model implements hidden and complex layers to produce results. (b) The Decision Tree Model generates an explainable tree (image adapted from [23])

Forecasting energy consumption is a complex process that involves identifying multiple variables that together comprise energy demand, for example, weather conditions and historical consumption among others [2]. Usually, the results are integrated into charts using different periods (e.g. daily, monthly and yearly). However, this information is not always presented to the user under usability standards that facilitate understanding [21].

Another contribution proposed a decision-support tool based on predefined scenarios to allow users to analyze the environmental, economic and cultural impact of the exploitation of renewable resources available [24]. However, to validate the operation of such tools, it is important to evaluate them in different climatological regions. Climate can be classified by clustering similar conditions to facilitate learning about specific situations such as sunny days [9]. Most of these applications focus on the needs of researchers or professionals to understand and improve forecasting methods [25]. Even for such expert users, Mentler et al. [21] consider the interface design to be also of essential importance.

In addition, the use of visual analytics has shown multiple benefits in the analysis of information to support decision-making; for that reason, it is gaining the attention of researchers and developers, primarily to support decisions based on the analysis of large and complex data [26]. Several contributions have explored the design of dashboards to evaluate energy-related activities including the following features: (a) electrical power consumption monitoring using real-time systems, (b) comparing information and (c) providing educational information [27]. However, challenges in the design of such dashboards include the availability and completeness of data, its structuring and exploration to produce meaningful graphs while avoiding misinterpretation [26]. Therefore, the analysis of the information necessary to include in the dashboards is of paramount importance. Furthermore, using dashboards by building managers has demonstrated a positive impact resulting in attitudinal changes such as increased awareness, desire to learn and motivation to save energy [27]. Hence, user-centered visual analytics dashboards could also motivate utility analysts to learn from their customers’ behavior regarding energy demand and production.

The majority of contributions presented above focused on facilitating information analysis by prosumers (e.g. to support their decision-making) and on improving energy forecasting models to benefit the utilities. However, most energy forecasting procedures are trained and tested with historical data [25]. According to some studies, the variation in the historical data (e.g. new appliances, new habits, a new family member) over time affects the actual performance of forecasting methods [28]. Therefore, it is paramount to consider the effects of usage of increasingly common customer applications (e.g. consumer portals, energy saving services) that can influence prosumers to change consumption habits in the long term [10].

3 Identification of Needs and Requirements

In line with the user-centered design methodology, utility employees have been involved in the design process as target users, starting with the requirements analysis and definition (e.g. interviews, feedback to mock-ups). Based on the literature review and interviews with utility representatives, initial information needs and requirements for analyzing energy demand and production were identified, and first visual mock-ups of possible solution ideas developed. The initial mock-ups were tested in a workshop with utility representatives and based on obtained feedback the requirements were consolidated and main use cases for the development of a visual analytics dashboard to support energy analysis needs for small utilities were defined.

Based on the results of this process, the main identified need was to be able to perform day-ahead forecasts of energy demand in prosumer scenarios in an easily understandable way. Another essential requirement was to enable the analyst to explore what-if scenarios by varying different input parameters that were identified as main factors influencing energy demand and production (e.g. weather conditions, season, day of the week, energy prices). This should enable the utility to better plan the energy production for the next day, a process that is currently performed manually, based on the previous experience of the people in charge and without any tools to specifically support that task.

The development of a visual tool to analyze the relationships between the most important factors, such as weather conditions (sun hours, temperature and wind speed), energy prices, supply and demand, can help utilities improve their energy estimation for the next day. The simulation of different conditions, not just for the next day but also to identify specific possible situations, can help the utilities analyze their impact on energy demand and supply, e.g. regarding the holiday season or as a result of increasing energy prices. This could be supported through the creation of “what-if-scenarios” in a visual analytics process to help utilities identify or prevent issues in the energy demand forecasting and planning process.

4 Approach

Based on the described analysis, we propose the development of a visual analytics tool to help the utilities better understand consumption and production patterns to determine a day-ahead forecast. That includes two main use cases:

  (1) Using a Smart Energy Dashboard to perform dynamic analysis and interactive day-ahead forecasting of energy supply and demand,

  (2) Analyzing the historical and forecasted energy consumption and production patterns to provide recommendations to prosumers for optimizing their consumption and production.

To this end, we propose the development of an interactive Smart Energy Dashboard based on a combination of visual analytics and explainable machine learning models. The implementation of explainable AI is crucial to gain users’ trust when they interact with the results [20]. In this way, we expect to support the decision-making of utilities regarding energy management activities, avoiding the use of complex black-box models. Instead of implementing sophisticated and accurate algorithms that can be difficult to understand, we intend to implement an explainable model compatible with existing practices (e.g. heuristic assessment based on previous experience).

In order to implement the forecasting process, we selected decision tree and KNN methods (Fig. 3). We want to compare the suitability of these two methods and investigate the trade-offs between the accuracy and explainability that they can provide to end users. In both cases, the overall forecasting process is modelled as follows. The analyst sets the input parameters for the factors influencing consumption and production (sun hours, wind, temperature, weekday/weekend/holiday and month) to the values expected for the next day. Based on historical data, the system then applies either the KNN or the decision tree method to identify days in the past that are most similar to the conditions the analyst has defined by setting the input parameters. The consumption and production patterns corresponding to those most similar days are then displayed in a visual analytics dashboard, accompanied by data on the corresponding context conditions (e.g. day of the week, month, weather conditions and energy prices on those days). The analyst can then inspect and explore the presented information in order to identify which of the presented historical contexts and energy demand data reflect the situation most comparable to the one expected for the next day, and derive the expected energy demand from it.

Fig. 3. Decision tree model and K-nearest neighbor model (image adapted from [23])

For the decision tree model, the input parameters are used to apply the decision rules that split the data, in order to determine dates in the past that match the initial parameters. Explainability is achieved by presenting the decision tree computed as part of the process [23]. The selected pathway is then used to create the visualization in the dashboard; therefore, it should be easy to analyze the relationships between the input parameters and energy consumption and production.
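As an illustration only, the idea of following a pathway through a tree and showing the rules that led to a set of historical days could be sketched as below. This is a minimal, hand-rolled regression tree in plain Python and not the RapidMiner model used in the project; the records, the feature names (`temp`, `sun_hours`, `weekend`), the target `demand_kwh` and the depth limit are all assumptions made for the example.

```python
# Hypothetical daily records (illustrative values, not project data).
DAYS = [
    {"date": "2019-01-14", "temp": 2.0, "sun_hours": 1.0, "weekend": 0, "demand_kwh": 580.0},
    {"date": "2019-01-19", "temp": 1.0, "sun_hours": 2.0, "weekend": 1, "demand_kwh": 520.0},
    {"date": "2019-02-11", "temp": 4.0, "sun_hours": 1.5, "weekend": 0, "demand_kwh": 570.0},
    {"date": "2019-07-08", "temp": 26.0, "sun_hours": 10.0, "weekend": 0, "demand_kwh": 400.0},
    {"date": "2019-07-13", "temp": 25.5, "sun_hours": 9.7, "weekend": 1, "demand_kwh": 340.0},
    {"date": "2019-07-15", "temp": 25.0, "sun_hours": 9.5, "weekend": 0, "demand_kwh": 410.0},
]
FEATURES = ["temp", "sun_hours", "weekend"]  # assumed input parameters
TARGET = "demand_kwh"                        # assumed quantity to explain


def sse(rows):
    """Sum of squared errors of the target around its mean."""
    if not rows:
        return 0.0
    mean = sum(r[TARGET] for r in rows) / len(rows)
    return sum((r[TARGET] - mean) ** 2 for r in rows)


def best_split(rows):
    """Find the (feature, threshold) that minimises the summed SSE of both sides."""
    best = None
    for f in FEATURES:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r for r in rows if r[f] <= threshold]
            right = [r for r in rows if r[f] > threshold]
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, f, threshold, left, right)
    return best


def build(rows, depth=2):
    """Grow a small regression tree; leaves keep the historical days they contain."""
    split = best_split(rows) if depth > 0 and len(rows) > 1 else None
    if split is None:
        return {"days": rows}
    _, f, threshold, left, right = split
    return {"feature": f, "threshold": threshold,
            "left": build(left, depth - 1), "right": build(right, depth - 1)}


def explain(tree, query):
    """Follow the query down the tree, returning the rules used and the leaf's days."""
    path = []
    while "days" not in tree:
        f, t = tree["feature"], tree["threshold"]
        if query[f] <= t:
            path.append(f"{f} <= {t}")
            tree = tree["left"]
        else:
            path.append(f"{f} > {t}")
            tree = tree["right"]
    return path, tree["days"]


tree = build(DAYS, depth=2)
path, similar = explain(tree, {"temp": 24.0, "sun_hours": 9.0, "weekend": 0})
print(" AND ".join(path))            # the rules that can be shown to the analyst
print([d["date"] for d in similar])  # the historical days in the reached leaf
```

The list of rules returned by `explain` plays the role of the pathway: it states in readable terms why the listed days were considered comparable to the query.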

On the other hand, the KNN model is a regression model that computes the distance between the data points and a query point in a multidimensional space. The model compares the input parameters of the query point with the data points in the matrix using a distance metric [19]. The model is explainable because the vicinity between the data points in the matrix and the query point can be analyzed in terms of the input parameters. In this way, it should be easy for the utilities to analyze the closeness of the data in order to understand the prediction.
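A minimal sketch of such a similarity search is given below, assuming min-max scaling of the input parameters and the Euclidean distance; the source does not state which scaling or distance metric is used, so both are assumptions, as are the records and feature names. The distances are returned together with the days so that the closeness of each neighbour can be shown to the analyst.

```python
import math

# Hypothetical historical records: one entry per past day.
# Feature names and values are illustrative assumptions, not project data.
HISTORY = [
    {"date": "2019-06-03", "sun_hours": 9.0, "wind": 3.0, "temp": 24.0, "weekend": 0, "demand_kwh": 410.0},
    {"date": "2019-06-08", "sun_hours": 8.5, "wind": 2.5, "temp": 25.0, "weekend": 1, "demand_kwh": 350.0},
    {"date": "2019-11-12", "sun_hours": 1.0, "wind": 6.0, "temp": 5.0, "weekend": 0, "demand_kwh": 560.0},
    {"date": "2019-11-16", "sun_hours": 0.5, "wind": 7.0, "temp": 4.0, "weekend": 1, "demand_kwh": 500.0},
    {"date": "2019-06-04", "sun_hours": 9.5, "wind": 3.5, "temp": 23.0, "weekend": 0, "demand_kwh": 420.0},
]
FEATURES = ["sun_hours", "wind", "temp", "weekend"]


def feature_ranges(days):
    """Return (min, max) per feature so all features can be scaled to [0, 1]."""
    return {f: (min(d[f] for d in days), max(d[f] for d in days)) for f in FEATURES}


def scaled(day, ranges):
    """Min-max scale one day's features; constant features map to 0."""
    out = []
    for f in FEATURES:
        lo, hi = ranges[f]
        out.append((day[f] - lo) / (hi - lo) if hi > lo else 0.0)
    return out


def nearest_days(query, days, k=2):
    """Return the k historical days closest to the query as (distance, day) pairs."""
    ranges = feature_ranges(days)
    q = scaled(query, ranges)
    ranked = sorted(
        ((math.dist(q, scaled(d, ranges)), d) for d in days),
        key=lambda pair: pair[0],
    )
    return ranked[:k]


# Conditions expected for the next day, as entered by the analyst.
query = {"sun_hours": 9.0, "wind": 3.0, "temp": 24.0, "weekend": 0}
neighbours = nearest_days(query, HISTORY, k=2)
for distance, day in neighbours:
    print(f"{day['date']}  distance={distance:.3f}  demand={day['demand_kwh']} kWh")

# A simple KNN regression estimate: the mean demand of the k nearest days.
estimate = sum(day["demand_kwh"] for _, day in neighbours) / len(neighbours)
print(f"estimated demand: {estimate} kWh")
```

Scaling matters here because the raw features have very different ranges; without it, temperature would dominate the distance and the weekday/weekend flag would have almost no influence.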

Finally, the described interactive forecasting process would also allow utilities to simulate “what-if-scenarios”: by specifying different hypothetical conditions, they could analyze the obtained forecast results for different scenarios, in order to identify or prevent issues in different possible future scenarios (e.g. peak energy demand in specific summer conditions). Combining the described forecasting features and the creation of “what-if-scenarios” could thus allow the utilities to understand the relationship between consumption patterns in different conditions and the factors that determine them. In this way, the utilities could also define recommendations for prosumers to help them optimize energy consumption and production.
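To make the what-if idea concrete, the following sketch compares a few hypothetical scenarios against historical days. It uses a simple tolerance band on temperature as a stand-in for the similarity model, and the records, the scenario names, the `tolerance` and the `capacity_kw` limit are all assumptions introduced for illustration.

```python
# Hypothetical history: daily temperature and peak demand (illustrative values only).
HISTORY = [
    {"date": "2019-07-01", "temp": 22.0, "peak_kw": 48.0},
    {"date": "2019-07-02", "temp": 24.0, "peak_kw": 50.0},
    {"date": "2019-07-10", "temp": 29.0, "peak_kw": 58.0},
    {"date": "2019-07-11", "temp": 31.0, "peak_kw": 62.0},
    {"date": "2019-07-24", "temp": 35.0, "peak_kw": 71.0},
    {"date": "2019-07-25", "temp": 36.0, "peak_kw": 73.0},
]


def matching_days(scenario, days, tolerance=2.0):
    """Days whose temperature lies within +/- tolerance of the scenario value."""
    return [d for d in days if abs(d["temp"] - scenario["temp"]) <= tolerance]


def assess(scenario, days, capacity_kw):
    """Summarise one scenario: matched dates, highest observed peak, capacity flag."""
    matches = matching_days(scenario, days)
    peak = max((d["peak_kw"] for d in matches), default=None)
    return {
        "scenario": scenario["name"],
        "dates": [d["date"] for d in matches],
        "peak_kw": peak,
        "exceeds_capacity": peak is not None and peak > capacity_kw,
    }


SCENARIOS = [
    {"name": "mild", "temp": 23.0},
    {"name": "hot", "temp": 30.0},
    {"name": "heatwave", "temp": 35.5},
]
results = [assess(s, HISTORY, capacity_kw=65.0) for s in SCENARIOS]
for r in results:
    print(r)
```

Listing the matched dates next to each flag keeps the scenario result traceable: the analyst can open those days in the dashboard and check whether they really resemble the hypothetical situation.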

5 System Design

The basis for the implementation of the described approach is a combination of the machine learning platform RapidMiner and the visual analytics platform QlikSense. The former allows the straightforward implementation of several prediction models (incl. decision trees and KNN), while the latter provides functionalities for designing and implementing customized visual analytics dashboards. The implementation of the workflow leading to the presentation and exploration of the forecasting results in the Smart Energy Dashboard is given in Fig. 4.

Fig. 4. The Smart Energy Dashboard system design

The first step is to collect and prepare the input data from historical datasets (energy consumption and production, weather, energy prices, etc.). Once the dataset is updated, the user enters the input parameters in the Smart Energy Dashboard and selects the forecasting model (decision tree or KNN). The dashboard is connected to the RapidMiner platform to produce the forecasting results. In addition, the user can analyze the generated model (e.g. the pathway obtained in the decision tree). Subsequently, the output results are visualized and can be interactively explored and analyzed in the Smart Energy Dashboard (implemented with QlikSense). The information is presented per hour to help the user analyze the progress of consumption, production and other values throughout the day (Fig. 5). The user can also interact with the explainable model generated through the machine learning methods to understand the processing of data that produced the results.

Fig. 5. The Smart Energy Dashboard prototype. (a) The selection of input parameters. (b) Consumption and production per hour. Each line represents a date obtained from the similarity analysis. (c) The weather conditions per hour and energy prices per hour.

The main features of the Smart Energy Dashboard have been defined in a low-fidelity prototype (Fig. 5). First, the user selects the input parameters to guide the forecasting process as well as the model to be applied (Fig. 5a). As a result, the dashboard presents production and consumption visualizations for the different dates obtained from the forecasting model (Fig. 5b). The users can also analyze the relationships between the input parameters and the information presented, thanks to additional visualizations such as weather conditions and energy prices (Fig. 5c). To analyze a specific day, the user can directly select the corresponding consumption line, and the dashboard will be updated to present only the corresponding information. To create “what-if-scenarios,” the users can simply define the desired input parameters in the dashboard to describe a scenario they would like to inspect, in the same way as they define the parameters for the day-ahead forecast. The identified historical data matching the defined scenario parameters would then be presented in the dashboard in the same way as described for the day-ahead analysis.

6 Evaluation

In order to validate our solution design, we performed an initial “heuristic evaluation.” During an interview with one utility representative, we presented the system design, method and visualizations. The participant understood the objective of the mockups and was satisfied with the proposal as well as the interaction flow because it was compatible with existing practices in the company. In addition, the participant noted that the interaction with the visual tool could benefit the employees by augmenting their existing experience with the presented data-driven analysis. After completing the implementation of the functional prototype, we will perform a comprehensive evaluation of accuracy and explainability with more utility employees as follows:

  • (1) Accuracy Evaluation - A historical test data set and a set of test days from the past will be defined (from different seasons). The users will be asked to select input parameters to reflect the test days and to analyze the obtained system results to determine the most likely forecast for a given test day. We will then compare the forecast identified by the users with the consumption actually measured on the test day to determine the accuracy of this approach. This will be performed for both the KNN and the decision tree models, to compare their accuracy.

  • (2) Explainability Evaluation - A user-centered evaluation will be performed to determine perceived usefulness and explainability. When performing the test run described above, the users will also be asked to assess the perceived usefulness and performance of the system (through interviews and questionnaires). They will be given a set of questions about how well they understood the results of the system in determining similar days in the past regarding the input parameters. This will also include questions about observed relationships between specific input parameters and obtained consumption patterns, in order to verify the explainability more objectively. Finally, a control group of users will be presented with a simulation of automatic forecasting results for the same test days (common “Wizard-of-Oz” technique), without the possibility to set input parameters themselves and without the explanation of the rules that have been used to identify the forecast (as would be the case with a black-box machine learning method).
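The comparison between forecast and measured consumption in the accuracy evaluation could, for example, be quantified as sketched below. The source does not name an error metric; the mean absolute error (MAE) and the mean absolute percentage error (MAPE) are used here as assumptions, and the values are invented for illustration.

```python
def mae(forecast, actual):
    """Mean absolute error between two equally long series."""
    if len(forecast) != len(actual) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)


def mape(forecast, actual):
    """Mean absolute percentage error; assumes no measured value is zero."""
    if len(forecast) != len(actual) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return 100 * sum(abs(f - a) / abs(a) for f, a in zip(forecast, actual)) / len(actual)


# Hypothetical values for one test day (kWh per interval), not measured data.
measured = [100.0, 200.0, 400.0, 200.0]
forecasts = {
    "KNN": [110.0, 190.0, 380.0, 210.0],
    "decision tree": [120.0, 180.0, 360.0, 220.0],
}

scores = {name: (mae(f, measured), mape(f, measured)) for name, f in forecasts.items()}
for name, (abs_err, pct_err) in scores.items():
    print(f"{name}: MAE={abs_err:.1f} kWh, MAPE={pct_err:.2f} %")
```

Reporting both values per model and per test day would allow the two methods to be compared across seasons with the same yardstick.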

7 Conclusions

We proposed an approach for interactive forecasting of day-ahead energy demand based on historical data, combining explainable machine learning and visual analytics so that the forecasting process remains understandable for the users. Based on the selection of input parameters (e.g. weather conditions, energy price, day of the week), the proposed approach also allows utilities to analyze “what-if” scenarios of energy demand and their relationship to the conditions that would generate them. This could help them to identify and/or prevent critical situations and issues in the future.

Furthermore, we proposed to extend this approach with a behavioral analysis of the interaction of prosumers with technological tools (e.g. customer service portal, recommendation systems) in order to identify the possible impact of such tools on consumption patterns. Understanding these relationships could be a key factor for the utilities to identify and define recommendations to optimize energy consumption. The behavioral analysis could also provide an improvement to traditional forecasting models which usually analyze historical data, but do not consider the effects of additional influences on the consumption behavior of prosumers. The evaluation approach presented in this work will be used to empirically evaluate the suitability of the proposed system design and identify possibilities for improvement. In future work, we also intend to extend the evaluation to different test years and different utilities in order to collect more substantial evidence regarding the suitability and usability of the proposed approach.