
1 Introduction

Many studies in Ukraine address current problems of energy markets [1,2,3,4] and electricity systems [5,6,7,8,9]. Ukraine's transition to a newly liberalized electricity market introduced several segments: the bilateral agreements market, the day-ahead market, the intraday market, and the balancing market [10, 11]. The emergence of these segments has made it more urgent to improve the accuracy and stability of short-term forecasts of both total electrical load (TEL) [12] and nodal load. In particular, forecast accuracy determines the level of electricity imbalances that different market participants create in the power system [13, 14]. Different approaches can be used for short-term forecasting of both TEL and nodal electric load.

One approach is to forecast the load of the Integrated Power System (IPS) of Ukraine in the short term at each of its three hierarchical levels independently. Research in this direction is presented in [15, 16]. Modern methods of hierarchical forecasting fall into two groups: “bottom-up” and “top-down”. The first combines lower-level forecasts to produce the forecast for each higher level, and the second uses only historical data from all levels. It can therefore be argued that increasing forecast accuracy at the upper level of the hierarchy requires increasing accuracy at the lower levels.

Another approach to short-term forecasting of TEL is to build a multifactor mathematical model that accounts for the structure and nature of electricity consumption and for the influencing factors. Improving short-term TEL forecasting methods increases the efficiency of market participants [13] and distribution system operators (DSO) [14] in the organized segments of the electricity market, as well as of the transmission system operator (TSO) in organizing the balancing electricity market of Ukraine [17].

The development of multifactor models is also effective for predicting nodal loads. The relationships among load nodes, and between node loads and additional influencing factors, likewise indicate the need to consider additional factors when predicting load.

The accuracy of forecasts of both total and nodal loads affects the cost-effectiveness of generating equipment and, accordingly, the cost of electricity. In particular, the forecast of nodal loads [18, 19] is needed to optimize future operating regimes and adjust current ones, to accept operational dispatch requests, and to submit applications for the purchase and sale of electricity to distribution system operators, which requires forecast data for electricity purchases in different market segments.

2 Application of Artificial Neural Networks for Forecasting Electrical Load at Different Levels of Power Systems

2.1 Forecasting of Hierarchical Levels of the Power System

Artificial neural networks demonstrate advantages over classical statistical forecasting methods in energy problems. For example, [20] considers several statistical and artificial intelligence methods for predicting electric load and analyzes the factors that influence forecast accuracy. There is also a transition toward hybrid models, which combine two or more models. In [21], it was shown that neural network models are gradually becoming more accurate for load prediction than multiple linear regression, the support vector method, Random Forest, and others. Data from the Irish energy system were used to test short-term forecasting methods for different types of load (residential customers and small and medium-sized enterprises). The results demonstrate the high accuracy of neural networks compared with other methods, especially for short-term forecasting with a horizon of 1–7 days.

To test the effectiveness of forecasting different hierarchical levels, a model was built for each hierarchical level of the IPS of Ukraine based on artificial neural networks, namely:

  • for the distribution system operator (DSO) level;

  • for the level of the regional power system of the transmission system operator (TSO);

  • for the IPS level of Ukraine.

The models are evaluated on data from Kyivenerho, the Central Electric Power System of the National Power Company (NPC) Ukrenergo, and the IPS of Ukraine for the period 2015–2016.

Total electric load data are time series, that is, values collected at successive points in time over a period. In this study, the hourly values of TEL in MW at each of the given hierarchical levels of the IPS of Ukraine were used. A recurrent artificial neural network, which is widely used for time series prediction, was chosen for modeling.

A recurrent neural network is an extension of a conventional artificial neural network (multilayer perceptron) that contains feedback connections to store information. One type of recurrent architecture is LSTM (long short-term memory) [22], a network capable of learning long-term dependencies.

For this task, a single-layer recurrent neural network of the LSTM type was used, followed by a two-layer fully connected network. Data for two weeks with hourly resolution are fed to the network input. The input layer has 24 neurons, i.e., each neuron of the LSTM layer receives the values for one hour of the day over the previous two weeks. The input data for a particular hour thus enter a particular neuron, which in turn passes its output to the next neuron both horizontally and vertically. The neural network is implemented in Python. Figure 1 shows the general architecture of the proposed neural network.

Fig. 1
A flowchart shows the layers of the neural network: input data, LSTM layer, fully connected layer, fully connected layer, and output data.

Neural network architecture
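The layer stack in Fig. 1 can be sketched as a plain NumPy forward pass. This is only an illustration with randomly initialized weights, not the authors' implementation: the hidden size (8), the width of the first fully connected layer (16), the single output value, and the input shape (14 days by 24 hourly values) are assumptions chosen for the example, and no training is shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(z, 0.0)

def lstm_forward(x_seq, W, U, b):
    """Run one LSTM layer over x_seq (timesteps, features); return the last hidden state."""
    hidden = U.shape[1]
    h = np.zeros(hidden)          # hidden state (the feedback signal)
    c = np.zeros(hidden)          # cell state (the long-term memory)
    for x_t in x_seq:
        z = W @ x_t + U @ h + b   # stacked pre-activations of the four gates
        i = sigmoid(z[0:hidden])                # input gate
        f = sigmoid(z[hidden:2 * hidden])       # forget gate
        o = sigmoid(z[2 * hidden:3 * hidden])   # output gate
        g = np.tanh(z[3 * hidden:4 * hidden])   # candidate cell values
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

def init_params(features, hidden, dense, outputs, seed=0):
    """Random weights for illustration only (all sizes are assumptions)."""
    rng = np.random.default_rng(seed)
    return {
        "W": rng.normal(0.0, 0.1, (4 * hidden, features)),
        "U": rng.normal(0.0, 0.1, (4 * hidden, hidden)),
        "b": np.zeros(4 * hidden),
        "W1": rng.normal(0.0, 0.1, (dense, hidden)), "b1": np.zeros(dense),
        "W2": rng.normal(0.0, 0.1, (outputs, dense)), "b2": np.zeros(outputs),
    }

def forecast(x_seq, p):
    """LSTM layer followed by two fully connected layers with ReLU activations."""
    h = lstm_forward(x_seq, p["W"], p["U"], p["b"])
    d1 = relu(p["W1"] @ h + p["b1"])
    return relu(p["W2"] @ d1 + p["b2"])

params = init_params(features=24, hidden=8, dense=16, outputs=1)
x = np.random.default_rng(1).random((14, 24))   # 14 days x 24 normalized hourly values
y_hat = forecast(x, params)
```

In practice such a network would be built and trained with a deep learning library; the explicit loop is shown here only to make the feedback between time steps visible.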

Before being fed to the network input, the training sample data were normalized to the range from 0 to 1 according to formula (1). Test sample data were normalized in the same way, but using the minimum and maximum values from the training sample.

$$ x_{i,j} = \frac{x_{i,j} - \min(x_{j})}{\max(x_{j}) - \min(x_{j})}, $$
(1)

where i is the row number and j is the column number.
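Formula (1) and the rule of reusing the training minimum and maximum for the test sample can be illustrated with a short sketch. The function names and the toy numbers are hypothetical; only the normalization itself follows the text.

```python
import numpy as np

def fit_min_max(train):
    """Column-wise minimum and maximum of the training sample."""
    return train.min(axis=0), train.max(axis=0)

def apply_min_max(data, col_min, col_max):
    """Formula (1), using statistics taken from the training sample."""
    return (data - col_min) / (col_max - col_min)

# Toy data: rows are observations, columns are variables.
train = np.array([[1000.0, 5.0], [1500.0, 15.0], [2000.0, 25.0]])
test = np.array([[1250.0, 30.0]])

col_min, col_max = fit_min_max(train)
train_scaled = apply_min_max(train, col_min, col_max)
test_scaled = apply_min_max(test, col_min, col_max)
```

Because the test sample is scaled with training statistics, its values can fall slightly outside the range from 0 to 1 (the second test value here does), which is expected and avoids leaking test information into training.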

The LSTM expects its input as a three-dimensional array. The previous time steps of the time series are therefore used as input data to predict the output at the next step. That is, each neuron receives the data of a separate time interval, and through feedback the information from previous steps is passed on to the following ones. Thus, the network receives not only the data for a specific hour but also information from previous time steps.
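A minimal sketch of how a load series can be arranged into such a three-dimensional array (samples, time steps, features) with next-step targets is given below. The helper name `make_windows` and the window length are assumptions for illustration.

```python
import numpy as np

def make_windows(series, n_steps):
    """Split a 1-D series into inputs of shape (samples, n_steps, 1) and next-step targets."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_steps
    X = np.stack([series[i:i + n_steps] for i in range(n_samples)])
    y = series[n_steps:]
    return X[..., np.newaxis], y

load = np.arange(10.0)            # stand-in for a normalized hourly load series
X, y = make_windows(load, n_steps=3)
# X[0] holds the values 0, 1, 2 and its target y[0] is 3.
```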

2.2 Retrospective Data and Results of Forecasting Different Hierarchical Levels of the Power System

Each hierarchical level is forecast with the artificial neural network model described above; training was conducted separately for each level on the corresponding data samples. The forecasting results were validated on data from DTEK Kyiv Electric Networks and the Central Electric Power System of NPC Ukrenergo for the period from 2015 to 2017 with hourly resolution. Training samples of the same dimension, covering January 2, 2015, to August 22, 2016, were used to train the models. Test samples were divided into summer and winter. The summer sample contained data from August 22 to September 1, and the winter sample from December 22 to 31, 2016. The ReLU function was used as the activation function of the fully connected layers, and RMSE was used as the evaluation metric.

Table 1 shows the RMSE forecast errors (root mean square error, i.e., the square root of the mean squared error) as a percentage and in absolute values for the test samples.
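The two forms of the metric can be sketched as follows. The text does not state how the percentage is obtained, so dividing by the mean actual load is an assumption made here for illustration; the numbers are toy values.

```python
import math

def rmse(actual, predicted):
    """Root mean square error in the units of the data (e.g. MW)."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def rmse_percent(actual, predicted):
    """RMSE relative to the mean actual load, in percent (assumed normalization)."""
    mean_actual = sum(actual) / len(actual)
    return 100.0 * rmse(actual, predicted) / mean_actual

actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
abs_error = rmse(actual, predicted)
pct_error = rmse_percent(actual, predicted)
```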

Table 1 Forecast errors

RMSE graphs for the summer and winter testing periods are presented in Figs. 2 and 3, respectively, for the DSO level, the regional TSO power system, and the IPS level of Ukraine.

Fig. 2
A graph depicts the RMSE errors of NEK, CES, and Kyivenergo for the summer period. The CES curve has the highest error percentage, reaching above 8% at 10 hours.

RMSE errors for summer period for all power system levels

Fig. 3
A graph depicts the RMSE errors of NEK, CES, and Kyivenergo for the winter period. The NEK curve has the lowest error percentage and the CES curve has the highest.

RMSE errors for winter period for all power system levels

The results of the calculations show that forecasting accuracy increases with each higher hierarchical level. This is explained by the factors that affect each level: the lower levels are influenced by a larger number of factors. The forecast for the winter period shows a smaller error. The graphs show that at each higher hierarchical level the error is more uniform, without pronounced skews.

The analysis of the forecasting results showed that the forecast error is smallest in the winter period for all hierarchical levels, in the range of 1.5–4.5%, while for the summer period the error is in the range of 2.6–5.9%. With each higher level, the error decreases in both testing periods, because the lower levels are affected by more external factors and therefore show greater dispersion. To test the impact of external factors on the lower levels of the power system, a study was conducted to predict the total load of the DSO taking into account air temperature. This study is described in Sects. 2.3 and 2.4.

2.3 Decomposition of Schedules of Total Electric Load and Forecasting of Total Electric Load Taking into Account Temperature

Since at the lower levels the TEL values are influenced by both internal (technological) and external (meteorological, astronomical, etc.) factors, it is advisable, in order to determine the degree of influence of each factor, to decompose the TEL graphs of hourly cross-sections and predict each component separately depending on the factor.

In this model, the Hilbert-Huang method is used to decompose TEL schedules into temperature and base components [23]. This method is promising for the study of nonlinear and nonstationary processes. The classical algorithm of the Hilbert-Huang method looks like this:

  1. Search the TEL curve of the hourly cross-section P(x) for local extrema, grouping the local minima and maxima of TEL separately.

  2. Construct envelope curves by interpolating the curves of local minima ub(xb) and maxima ut(xt). Since the number of points in the two curves can differ significantly, their functions must be interpolated (using cubic splines) and extrapolated (using the first-order Brown method) over the entire sample, giving ub(x) and ut(x), respectively, where x varies from 1 to n, the sample size.

  3. Find the first component m as the mean of the functions ub(x) and ut(x) (2):

    $${m}_{i}=\frac{u{b}_{i}+u{t}_{i}}{2}.$$
    (2)

  4. Find the second component ck (k is the iteration number) as the difference between the full load values and the first component.

  5. In the following iterations, the input curve P(x) takes the value mk-1, and steps 1–4 are repeated until the number of local minima or maxima is less than 2.
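The five steps above can be sketched in NumPy as follows. This is a simplified illustration, not the authors' code: linear interpolation with constant edge values (`np.interp`) is swapped in for the cubic splines and the first-order Brown extrapolation named in step 2, and the function names and the iteration cap are hypothetical.

```python
import numpy as np

def local_extrema(p):
    """Step 1: indices of interior local minima and maxima of a 1-D curve."""
    idx = np.arange(1, len(p) - 1)
    minima = idx[(p[idx] < p[idx - 1]) & (p[idx] < p[idx + 1])]
    maxima = idx[(p[idx] > p[idx - 1]) & (p[idx] > p[idx + 1])]
    return minima, maxima

def sift_once(p):
    """Steps 2-4: envelopes, their mean m (formula (2)), and the component c = p - m."""
    x = np.arange(len(p))
    minima, maxima = local_extrema(p)
    ub = np.interp(x, minima, p[minima])   # lower envelope (linear stand-in for splines)
    ut = np.interp(x, maxima, p[maxima])   # upper envelope
    m = (ub + ut) / 2.0
    return m, p - m

def decompose(p, max_iter=20):
    """Step 5: repeat on the mean curve until fewer than 2 minima or maxima remain."""
    components = []
    current = np.asarray(p, dtype=float)
    for _ in range(max_iter):
        minima, maxima = local_extrema(current)
        if len(minima) < 2 or len(maxima) < 2:
            break
        mean_curve, c = sift_once(current)
        components.append(c)
        current = mean_curve
    return components, current

# Synthetic curve: trend plus two oscillations (illustrative data only).
x = np.arange(200)
p = 100 + 0.1 * x + 10 * np.sin(2 * np.pi * x / 24) + 3 * np.sin(2 * np.pi * x / 5)
components, residual = decompose(p)
```

By construction, the extracted components and the final mean curve sum back to the original curve, which gives a simple consistency check for any implementation of the procedure.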

In [24], the described method is used to pre-process data for one-factor forecasting with neural networks.

In the developed model, this algorithm is adapted to match the decomposition results to the real process of the effect of temperature change on the TEL. In particular, the following changes were made:

  1. Only the curve of the local minima of the TEL schedule is used in the calculations, so the base and temperature components are for the most part positive. In addition, the limit of the “insensitivity zone” is determined: at temperatures below this limit, the temperature component is zero.

  2. After each iteration, the extracted components ck are summed, and the correlation coefficient between the sum Σck and the air temperature is calculated; this serves as an additional stopping condition for the decomposition cycle.

To predict the temperature component, polynomial regression is used with the selection of the optimal degree and model (3):

$$P=\sum_{i=0}^{m}{a}_{i}{t}^{i},$$
(3)

where i varies from 0 to the optimal degree m, t is the air temperature, and ai are the coefficients of the polynomial equation.

These coefficients are determined as follows. A system of algebraic equations (4) is formed in matrix form. Since the matrix of input parameters (air temperature values) t = {[1], [ti], [ti^2], …, [ti^m]} is often rectangular, both sides of Eq. (4) are multiplied on the left by the transposed matrix, and the required coefficients are then determined from Eq. (5):

$$tA=P; $$
(4)
$${t}^{T}tA=\left({t}^{T}P\right).$$
(5)

To make the solution of system (4) more general, namely to avoid cases in which the matrix tTt has no inverse, the resulting system of algebraic equations is solved using the Gaussian method. Preliminary calculations showed that searching over degrees from 2 to 10 is sufficient. The optimal model is selected for each degree, with the minimum mean absolute percentage error (MAPE) as the objective function.
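The fitting procedure of Eqs. (3)–(5) with degree selection by MAPE can be sketched as below. Several details are assumptions made for illustration: the split into a training part and a validation part, the function names, and the use of `np.linalg.solve` (an LU-based solver, i.e., Gaussian elimination with pivoting) in place of a hand-written Gaussian method.

```python
import numpy as np

def fit_polynomial(t, p, degree):
    """Solve the normal equations (5), t^T t A = t^T P, for the coefficients A."""
    T = np.vander(t, degree + 1, increasing=True)   # columns 1, t, t^2, ..., t^m
    return np.linalg.solve(T.T @ T, T.T @ p)

def predict(t, coeffs):
    """Evaluate polynomial (3) for the given temperatures."""
    return np.vander(t, len(coeffs), increasing=True) @ coeffs

def mape(actual, predicted):
    """Mean absolute percentage error; actual values must be non-zero."""
    return 100.0 * np.mean(np.abs((actual - predicted) / actual))

def select_degree(t_train, p_train, t_val, p_val, degrees=range(2, 11)):
    """Return (degree, MAPE, coefficients) with the smallest validation MAPE."""
    best = None
    for m in degrees:
        coeffs = fit_polynomial(t_train, p_train, m)
        err = mape(p_val, predict(t_val, coeffs))
        if best is None or err < best[1]:
            best = (m, err, coeffs)
    return best

# Synthetic example: a quadratic dependence on (scaled) temperature.
t = np.linspace(-1.0, 1.0, 60)
p = 500.0 + 20.0 * t + 30.0 * t ** 2
best_degree, best_mape, best_coeffs = select_degree(t[::2], p[::2], t[1::2], p[1::2])
```

Scaling the temperature to a small range before building the powers of t keeps the normal equations better conditioned at higher degrees.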

Pugachev's method of canonical decomposition of random processes was used to predict the base component of TEL [25]. The method of canonical decomposition is a representation of the function Pb(t) in the form:

$$Pb\left(t\right)={m}_{Pb}\left(t\right)+\sum_{v}{V}_{v}{\varphi }_{v}\left(t\right),$$
(6)

where mPb(t) is the mathematical expectation of the base component of TEL, Vv are random variables whose mathematical expectation is 0, and φv(t) is the coordinate function calculated by the following formula:

$${\varphi }_{v}\left(t\right)=\frac{1}{{D}_{v}}M\left(Pb\left(t\right){V}_{v}\right),$$
(7)

where Dv is the variance of the array of random numbers, and Pb(t) denotes the values of the base component of TEL centered on the average value (the deviation of the original function from the average value).

An array of random numbers must satisfy the following conditions:

$$M\left[{V}_{v}\right]=0; M\left[{V}_{v}{V}_{m}\right]=0 \left(m\ne v\right).$$
(8)

The random numbers were obtained using a white noise generator.

Prediction of the base component of TEL is performed by the formula:

$$Pb\left(t+1\right)={m}_{Pb}\left(t\right)+{\varphi }_{v}\left(t\right){V}_{v}.$$
(9)

The synthesis of the forecast graph is performed as the algebraic sum of the temperature and base components in each hour of the daily schedule.

2.4 Analysis of the Results of Total Forecasting Taking into Account the Temperature

The study was conducted on Kyivenerho data for the winter period from 01/11/2015 to 31/03/2016 and the summer period from 01/06/2015 to 31/08/2015. Both samples are hourly and contain only working days from Tuesday to Thursday. Air temperature data for the city of Kyiv were obtained from open sources with a resolution of 3 h, so these data were interpolated to obtain hourly values.
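The conversion of 3-hourly temperature readings to hourly values can be sketched as follows. The text does not name the interpolation method, so linear interpolation is an assumption here, as are the function name and the sample values.

```python
import numpy as np

def to_hourly(temps_3h):
    """Linearly interpolate readings taken every 3 h onto an hourly grid."""
    temps_3h = np.asarray(temps_3h, dtype=float)
    hours_3h = np.arange(len(temps_3h)) * 3      # 0, 3, 6, ...
    hours_1h = np.arange(hours_3h[-1] + 1)       # 0, 1, 2, ..., last reading
    return np.interp(hours_1h, hours_3h, temps_3h)

hourly = to_hourly([0.0, 3.0, 0.0])   # readings at 00:00, 03:00, 06:00
```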

Figure 4 shows the graphs of the temperature component and the air temperature for the 12-h cross-section of both samples, where an inverse (for winter) and a direct (for summer) correlation are observed. The mathematical model was tested on four days for the summer period and on three days for the winter period. The MAPE value is used to estimate the forecast error. The forecasting results are given in Table 2.

Fig. 4
Two graphs depict the temperature component and air temperature, P t and t for the summer and winter periods. Both have lines with peaks and troughs. The summer period has a direct correlation.

Schedules of the temperature component and air temperature

Table 2 Errors of the total forecast taking into account the temperature

3 Short-Term Nodal Load Forecasting Taking into Account Temperature

Short-term forecasts of nodal loads are the basis on which power system mode-planning services solve most technical planning tasks aimed at improving the efficiency and reliability of power systems. At present, this problem is solved very simply: the node loads are determined from the total load using distribution coefficients that reflect the degree of their relationship with the node loads. However, some works use more advanced forecasting methods to determine nodal loads. In [26], inversion of a neural network based on a multilayer perceptron is used to predict nodal loads. In [14], an algorithm based on a multilayer perceptron artificial neural network combined with autoregression was considered for predicting nodal loads: the autoregression method is used to pre-process the data and to estimate the parameters of the mathematical model (MM). The forecast error for working days is in the range of 2.4–6.2%. In addition, methods based on artificial neural networks can be applied to renewable energy sources and their forecasting [27, 28]. Some published works on short-term forecasting of total electrical load take into account the influence of meteorological factors (temperature, cloud cover, etc.) [12]. Preliminary studies have also shown that increasing the accuracy and reliability of short-term forecasts requires taking into account additional technological factors, in particular the operating mode of energy-intensive enterprises.

An LSTM deep learning neural network, whose architecture is presented in [18], was used to predict nodal loads. It is a combined architecture based on a multilayer perceptron whose hidden layer contains a recurrent LSTM memory module [22], together with two fully connected layers and one bypass connection that passes the input to the output, where it is summed to improve the learning process. Data are fed to the network input in increments of 24 values. The SELU (scaled exponential linear unit) function is used as the activation function of the hidden layers [29]. Training is carried out using the ADAM optimizer (adaptive moment estimation) [30] for 100 epochs. Ambient temperature data were treated as a virtual node and concatenated with the input load vector of the nodes.
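Three of the ingredients named above (the SELU activation, the temperature as a virtual node, and the bypass connection) can be illustrated with a short NumPy sketch. It is not the architecture from [18]: the LSTM module is omitted, a single dense transform stands in for the hidden layers, and all shapes, weights, and function names are assumptions.

```python
import numpy as np

# Standard SELU constants (Klambauer et al.).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805

def selu(z):
    """Scaled exponential linear unit."""
    negative = SELU_ALPHA * (np.exp(np.minimum(z, 0.0)) - 1.0)
    return SELU_SCALE * np.where(z > 0, z, negative)

def add_virtual_node(loads, temperature):
    """Append temperature as an extra column: (hours, nodes) -> (hours, nodes + 1)."""
    return np.concatenate([loads, temperature[:, np.newaxis]], axis=1)

def block_with_bypass(x, W, b):
    """Hidden transform plus a bypass connection that adds the input to the output."""
    return selu(x @ W + b) + x

loads = np.ones((24, 15))                      # 24 hourly values for 15 nodes (toy data)
temperature = np.linspace(-5.0, 5.0, 24)       # hourly air temperature (toy data)
x = add_virtual_node(loads, temperature)       # shape (24, 16)

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, (16, 16))
out = block_with_bypass(x, W, np.zeros(16))
```

The bypass term means that the block only has to learn a correction to its input, which is the usual motivation for such connections.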

To study the influence of air temperature on the accuracy of nodal load forecasting, load data from the automated system of control and accounting of electricity (ASCAE) of “Vinnytsiaoblenergo” were used for the period from 10.01.2017 to 06.10.2019: monthly load records of 15 nodes with hourly resolution. Air temperature data were obtained from the Vinnytsia meteorological station (World Meteorological Organization index 33562). The temperature data have a resolution of one hour.

To determine the relationship between the load nodes, the correlation of the data was computed separately for working days and weekends.
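A correlation analysis of this kind can be sketched as follows. The Pearson coefficient, the 0.2 threshold for "no connection", the helper names, and the synthetic three-node data are assumptions for illustration; node indices in the code are zero-based.

```python
import numpy as np

def node_correlation(loads):
    """Pearson correlation matrix between the columns (nodes) of an (hours, nodes) array."""
    return np.corrcoef(loads, rowvar=False)

def split_by_day_type(loads, is_weekend):
    """Separate the rows into working-day and weekend samples."""
    is_weekend = np.asarray(is_weekend, dtype=bool)
    return loads[~is_weekend], loads[is_weekend]

def unrelated_nodes(corr, threshold=0.2):
    """Nodes whose correlation with every other node lies within [-threshold, threshold]."""
    off_diag = corr - np.eye(len(corr))
    return [j for j in range(len(corr)) if np.all(np.abs(off_diag[j]) <= threshold)]

# Synthetic data: node 1 follows node 0, node 2 is unrelated to both.
hours = np.arange(48)
base = np.sin(2 * np.pi * hours / 24)
loads = np.column_stack([base, 2 * base + 1, np.cos(2 * np.pi * hours / 24)])

corr = node_correlation(loads)
isolated = unrelated_nodes(corr)
working, weekend = split_by_day_type(loads, hours // 24 == 1)
```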

As Tables 3 and 4 show, the working-day correlation analysis revealed several nodes that have no connection with the other nodes, namely nodes 4, 11, and 13 (their correlation coefficients lie in the range from −0.2 to 0.2), whereas for weekends only 4 nodes have no connection with other nodes. The nodes with the highest correlation coefficients are 3, 5, 6, 7, 8, and 12 (correlation in the range from 0.7 to 1). Nodes 9–10 can also be grouped with nodes 14–15, with correlation coefficients of 0.6–0.7, as can nodes 1–2, with correlation coefficients of 0.4–0.5.

Table 3 Correlation between nodal loads on working days
Table 4 Correlation between nodal loads on weekends

The correlation between node loads and temperature was also investigated. The results (Table 5) show that the relationship with temperature is the same for working days and weekends. Almost all nodes have a negative correlation with temperature.

Table 5 Correlation of load nodes with temperature

To determine the optimal scope of training samples for winter and summer periods, a comparative analysis of the average daily load and temperature graphs for all nodes for the entire period was conducted. After analyzing the graphs of load and air temperature, we can identify the following common features:

  • All nodes depend on temperature in winter; some have a linear relationship (i.e., the load graph has a shape similar to that of the temperature graph), while others have a negative correlation.

  • The winter period can be conditionally taken to run from 25/09–10/10/2017 to 04–05/04/2018. During this period the temperature decreases and the node loads increase.

  • Then, from 4 to 9 April, the load declines sharply and the conditional summer period begins, during which the load is almost independent of temperature. This period ends on September 20–25, 2018.

  • Some nodes show a significant number of load drops (in some cases attributable to holidays, in others to probable emergencies), and node 10 shows an abnormal load increase in November 2018, exceeding normal values by a factor of 4.

Thus, the training samples can be divided into a conditional winter period from 10/01/2017 to 04/04/2018 and a conditional summer period from 04/09/2018 to 09/20/2018, with the last 7 days set aside to assess the forecast. Figures 5, 6 and 7 show examples of load and temperature graphs for the selected period.

Fig. 5
A graph depicts the correlation of temperature and load of node 3 from 2017 to 2019. It has 2 fluctuating curves with alternative peaks and troughs.

Graphs of the ratio of load and temperature of the node 3

Fig. 6
A graph depicts the correlation of temperature and load of node 4 from 2017 to 2019. It has 2 fluctuating curves overlapping each other.

Graphs of the ratio of load and temperature of the node 4

Fig. 7
A graph depicts the correlation of temperature and load of node 8 from 2017 to 2019. It has 2 fluctuating curves with alternative peaks and troughs.

Graphs of the ratio of load and temperature of the node 8

In addition, to check the effectiveness of nodal load forecasting, the data were analyzed to identify anomalous values and gaps (hereinafter referred to as data validation). For this purpose, a two-stage validation algorithm was developed. In the first stage, the data are clustered to select anomalous values and replace them. In the second stage, a seasonal decomposition method extracts the residual component, which is re-checked by the clustering method.
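The two-stage structure can be illustrated with a simplified sketch. It does not reproduce the developed algorithm: a median absolute deviation (MAD) rule is swapped in for the clustering step, the seasonal decomposition is reduced to subtracting the mean daily profile, flagged values are replaced by linear interpolation, and all names, thresholds, and data are hypothetical.

```python
import numpy as np

def flag_outliers(values, k=5.0):
    """Flag points far from the median (MAD rule; a stand-in for the clustering step)."""
    med = np.median(values)
    mad = np.median(np.abs(values - med))
    if mad == 0:
        return np.zeros(len(values), dtype=bool)
    return np.abs(values - med) > k * 1.4826 * mad

def replace_flagged(values, flags):
    """Replace flagged points by linear interpolation between unflagged neighbours."""
    x = np.arange(len(values))
    fixed = values.copy()
    fixed[flags] = np.interp(x[flags], x[~flags], values[~flags])
    return fixed

def seasonal_residual(values, period=24):
    """Residual after removing the mean daily profile (simplified seasonal decomposition)."""
    profile = np.array([values[h::period].mean() for h in range(period)])
    return values - profile[np.arange(len(values)) % period]

def validate(values, period=24):
    """Stage 1 on the raw data, stage 2 on the seasonal residuals of the stage-1 result."""
    values = np.asarray(values, dtype=float)
    stage1 = replace_flagged(values, flag_outliers(values))
    flags2 = flag_outliers(seasonal_residual(stage1, period))
    return replace_flagged(stage1, flags2)

# Two weeks of synthetic hourly load with a small deterministic disturbance.
hours = np.arange(24 * 14)
load = 100 + 20 * np.sin(2 * np.pi * hours / 24)
dirty = load + np.sin(1.7 * hours)
dirty[50] = 1000.0      # gross spike, visible in the raw data (stage 1)
dirty[200] += 15.0      # small anomaly hidden inside the daily swing (stage 2)
clean = validate(dirty)
```

The example is built so that the first anomaly is caught on the raw data, while the second only stands out once the daily profile has been removed, which is the motivation for a second pass on the residuals.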

Detailed analysis of the node load data revealed a significant number of anomalous values that need to be replaced.

Table 6 shows the statistical load characteristics of nodes 1 and 11 before and after the validation procedure.

Table 6 Statistical load characteristics of nodes 1 and 11 before and after the validation procedure

The load graphs of the corresponding nodes before and after validation are shown in Fig. 8 (Table 7).

Fig. 8
Two graphs depict the input data before validation and the corrected data after validation. Graph 1 has dense plots for input data with sharp peaks, whereas graph 2 has a dense wave for corrected data.

Load graphs of the nodes before and after validation

Table 7 Training and test data samples for forecasting

As can be seen from the above data, the validation algorithm as a whole successfully detected and restored single outliers, but the quality of detection and restoration of group outliers is much lower.

Tables 8 and 9 show the forecast results. The MAPE was used to assess error. The calculation of the error was performed on the data for the period from 01/01/2019 to 06/10/2019, which was not used for neural network training.

Table 8 Forecast errors for different configurations
Table 9 Forecast errors for different day types

Thus, it is shown that applying the validation method to nodal load data can reduce the average forecast error from 13.74 to 11.52%. Using air temperature data as an additional forecasting factor can further reduce forecast errors, in the range from 14.22 to 11.17%. Forecast accuracy also depends on the data samples: using samples for the conditional winter or summer period reduces the forecast error in some cases, but accuracy depends primarily on the sample size.

4 Conclusion

The comprehensive studies presented here aim to improve the accuracy of electrical load forecasting by using artificial neural networks and by taking into account additional factors, including air temperature. For total load forecasting, decomposing the TEL graphs (separately for each hourly cross-section) by the Hilbert-Huang method with the proposed modifications yields a temperature component that correlates closely with air temperature, which helps to build a more exact regression dependence for its prediction. The proposed method keeps the error of short-term TEL forecasts within 1.5–3.15%.

A recurrent neural network is effective for forecasting data of different dimensions and provides high forecasting accuracy at the level of the IPS of Ukraine, with an error within 1.5–2.6%. For the other hierarchical levels, the forecast error rises to 6%. To increase accuracy at the regional level and the IPS level of Ukraine, it is advisable to take into account the forecasts for lower hierarchical levels, together with the listed external factors.

Using air temperature as an additional factor for short-term forecasting of nodal load can reduce the forecast error from 11.52 to 11.17%. The analysis of load and temperature data showed that they are negatively correlated. The effect of temperature also varies with the type of data sample, which changes the accuracy of the forecasting results. It is established that the accuracy of the forecasting results depends on the choice of the training sample and its size. The developed validation method detects significant anomalous values and gaps in the data, thereby improving forecast accuracy. Careful analysis of the nodal load forecasts showed that reducing the error for nodes with sharply variable loads requires a more advanced method of data validation.