
1 Introduction

With the continued spread and development of the smart grid, accurate load forecasting is vital for planning power generation, distribution and dispatching in the power system on a sound basis [1]. Short-term load forecasting predicts the load for the next day or the next few days, and it is affected by many factors. Forecasting methods have been improved continuously, yet forecasting accuracy still leaves room for improvement. Methods for short-term load forecasting are mainly classified into classical and modern methods. Classical methods include empirical and traditional forecasting methods, whereas modern methods use machine learning algorithms to build models from historical load data and the factors that influence load, such as air temperature and special events. Current research methods include the support vector machine, random forest and neural network [2].

In recent years, the development and reform of the power market have placed higher requirements on the dynamic balance of power supply and demand, so the accuracy required of short-term load forecasting keeps rising. At the same time, as research methods in the field have been renewed, many researchers have proposed combination forecasting. A combination forecasting model not only overcomes the limitations of a single model, but also exploits the multi-feature nature of load data so that different neural networks complement each other. It is now widely applied and studied in short-term load forecasting. Two approaches are common. The first combines different neural network models to forecast the load data, as in the CNN-LSTM and GRU-NN combination models [3, 4]. The second preprocesses the original load sequence, extracts features with a decomposition method such as variational mode decomposition (VMD) or local mean decomposition (LMD) [5, 6], and then builds a separate forecasting model for each extracted component.

Building on existing research on combination forecasting and on the periodicity, non-linearity, non-stationarity and strong randomness of short-term load sequences, this paper proposes a short-term load combination forecasting method for the power grid based on the Hilbert-Huang Transform (HHT). HHT is an adaptive time-frequency analysis method that decomposes a signal locally. Firstly, the load data is decomposed by the empirical mode decomposition (EMD) algorithm into intrinsic mode function (IMF) components, and the instantaneous frequency of each component is obtained by the Hilbert transform; secondly, different load forecasting models are selected to forecast the high-frequency, medium-frequency and low-frequency components respectively. Such a combination model makes full use of the characteristics of the different components and lets the different forecasting models complement each other. The rationality and effectiveness of the combination model are validated by tests in this paper.

2 Theoretical Basis of Combination Model

2.1 Empirical Mode Decomposition (EMD)

HHT consists of EMD and the Hilbert transform. EMD is the core of HHT and its genuinely innovative part. It is an iterative process that fits envelopes through the extreme values of the data; this process is called sifting (screening). Once the iterated data sequence meets certain conditions, it becomes an intrinsic mode function (IMF). An IMF has two properties: 1) over the whole sequence, the number of extreme points and the number of zero crossings are equal or differ by at most 1; 2) at any time, the mean of the upper envelope and the lower envelope defined by the local extreme points is zero [7]. The general steps of EMD are:

  1) The local maxima and minima of the input signal \(x\left( t \right)\) are found and used to fit the upper and lower envelopes. The curve fitting here is an important issue in \(EMD\): the interpolation method chosen for envelope fitting directly influences the \(EMD\) result. The cubic spline interpolation method is used in this paper;

  2) The mean \(m\left( t \right)\) of the upper envelope and lower envelope is computed, and \(h\left( t \right)\) is obtained from \(h\left( t \right) = x\left( t \right) - m\left( t \right)\);

  3) The \(IMF\) conditions are checked for \(h\left( t \right)\). If both properties of an \(IMF\) mentioned above are met, \(h\left( t \right)\) is the first \(IMF\) obtained by the decomposition, recorded as \(h_{1} \left( t \right)\); if not, \(h\left( t \right)\) is taken as the new input sequence in place of \(x\left( t \right)\), and Step 1 and Step 2 are repeated;

  4) The new sequence \(r\left( t \right)\) is obtained from \(r\left( t \right) = x\left( t \right) - h_{n} \left( t \right)\), and the stopping condition of the decomposition is checked. If it is met, the decomposition ends, \(r\left( t \right)\) is the residual component, and \(n\) \(IMF\) components have been obtained; if not, \(r\left( t \right)\) is taken as the new input sequence in place of \(x\left( t \right)\), and the above steps are repeated.

Decomposing the original signal \(x\left( t \right)\) yields a certain number of \(IMF\) components and a residual component \(r\), so that \(x\left( t \right)\) can be expressed as:

$$x\left( t \right) = \mathop \sum \nolimits_{i = 1}^{n} h_{i} \left( t \right) + r\left( t \right)$$
(1)

where \(h_{i} \left( t \right)\) is the \(i\)-th \(IMF\) component and \(r\left( t \right)\) is the residual component. Each \(IMF\) component obtained here is an independent data sequence with its own characteristic scale [8].
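The sifting steps and Eq. (1) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the treatment of the two end points as envelope knots, the squared-difference stopping test with tolerance `tol`, and the iteration limits are all assumptions made for illustration. The zero-crossing condition of an IMF is not checked explicitly.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def local_extrema(x):
    """Return indices of interior local maxima and minima of a 1-D array."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return maxima, minima


def envelope_mean(x, maxima, minima):
    """Mean m(t) of the cubic-spline upper and lower envelopes (Steps 1 and 2)."""
    t = np.arange(len(x))
    ends = np.array([0, len(x) - 1])  # assumption: end points act as knots
    up_idx = np.unique(np.concatenate([ends, maxima]))
    lo_idx = np.unique(np.concatenate([ends, minima]))
    upper = CubicSpline(up_idx, x[up_idx])(t)
    lower = CubicSpline(lo_idx, x[lo_idx])(t)
    return 0.5 * (upper + lower)


def emd(x, max_imfs=8, max_sift=50, tol=0.05):
    """Decompose x into IMFs h_i(t) and a residual r(t), as in Eq. (1)."""
    residual = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few extrema left: treat what remains as the residual
        h = residual.copy()
        for _ in range(max_sift):
            maxima, minima = local_extrema(h)
            if len(maxima) < 2 or len(minima) < 2:
                break
            h_new = h - envelope_mean(h, maxima, minima)
            # Assumed stopping test: relative size of the removed envelope mean.
            sd = np.sum((h - h_new) ** 2) / (np.sum(h ** 2) + 1e-12)
            h = h_new
            if sd < tol:
                break
        imfs.append(h)
        residual = residual - h  # Step 4: r(t) = x(t) - h_n(t)
    return np.array(imfs), residual
```

Because each IMF is subtracted from the running residual, the IMFs and the residual sum back to the input by construction, which is exactly the statement of Eq. (1).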

2.2 Hilbert Spectrum Analysis

After \(EMD\) sifting, the \(Hilbert\) transform is applied to each \(IMF\) component to obtain its instantaneous frequency and instantaneous amplitude for analysis. The \(Hilbert\) transform shifts the phase of the fundamental and of every harmonic by exactly \(90^{^\circ }\) while leaving the amplitude unchanged [9]. For a given signal \(x\left( t \right)\), the \(Hilbert\) transform is defined as:

$$H\left[ {x\left( t \right)} \right] = \frac{1}{\pi }\mathop \smallint \nolimits_{ - \infty }^{ + \infty } \frac{x\left( \tau \right)}{{t - \tau }}d\tau = x\left( t \right)*\frac{1}{\pi t}$$
(2)

Combining the signal \(x\left( t \right)\) with its \(Hilbert\) transform gives the analytic signal:

$$z\left( t \right) = x\left( t \right) + jH\left[ {x\left( t \right)} \right] = a\left( t \right)e^{j\theta \left( t \right)}$$
(3)

The instantaneous frequency then follows from the phase \(\theta \left( t \right)\) of the analytic signal:

$$f\left( t \right) = \frac{1}{2\pi }\omega \left( t \right) = \frac{1}{2\pi }\frac{{d\theta \left( t \right)}}{dt}$$
(4)

Applying the \(Hilbert\) transform to each \(IMF\) component \(h_{i} \left( t \right)\) gives its analytic function, where \(\tilde{h}_{i} \left( t \right) = H\left[ {h_{i} \left( t \right)} \right]\):

$$z_{i} \left( t \right) = h_{i} \left( t \right) + j\tilde{h}_{i} \left( t \right) = a_{i} \left( t \right)e^{{j\theta_{i} \left( t \right)}}$$
(5)

The instantaneous frequency and instantaneous amplitude of each \(IMF\) component are obtained by applying Eq. (4) to the amplitude \(a_{i} \left( t \right)\) and phase \(\theta_{i} \left( t \right)\) of Eq. (5). In essence, the \(Hilbert\) transform gives the best local approximation of the signal by a sine function, and this locality is further reinforced by the differentiation used to compute the instantaneous frequency [10].
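As a hedged illustration of Eqs. (3) to (5), the following sketch builds a discrete analytic signal with the FFT and differentiates its unwrapped phase. The function names and the sampling-rate parameter `fs` are assumptions, and the FFT construction is the usual discrete approximation of the integral in Eq. (2), not necessarily the procedure used in the paper.

```python
import numpy as np


def analytic_signal(x):
    """Discrete analytic signal z = x + j*H[x], built in the frequency domain."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0  # keep the DC term
    if n % 2 == 0:
        gain[n // 2] = 1.0       # keep the Nyquist term
        gain[1:n // 2] = 2.0     # double positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    # Negative frequencies keep gain 0, which yields the analytic signal.
    return np.fft.ifft(spectrum * gain)


def instantaneous_frequency(x, fs=1.0):
    """Return amplitude a(t) and frequency f(t) = (1/2*pi) * d(theta)/dt."""
    z = analytic_signal(x)
    amplitude = np.abs(z)
    phase = np.unwrap(np.angle(z))
    # Finite difference of the phase; the result has one sample fewer than x.
    frequency = np.diff(phase) * fs / (2.0 * np.pi)
    return amplitude, frequency
```

For an hourly series, a component with a daily cycle would show an instantaneous frequency near 1/24 cycles per sample. Averaging `frequency` over time gives a mean frequency per component.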

2.3 RBF Neural Network

The \(RBF\) neural network can approximate arbitrary nonlinear functions well, and it is widely applied in load forecasting because of its simple structure and fast learning convergence. An \(RBF\) neural network based on the Gaussian kernel is used here [11]. The input vector is assumed to be \(n\)-dimensional and is denoted as \(x = \left( {x_{1} ,x_{2} , \ldots ,x_{n} } \right)^{T}\). The model has \(k\) hidden nodes and \(m\) outputs, and \(h_{i} \left( x \right)\) denotes the \(i\)-th hidden layer node. The Gaussian function serves as the kernel function of the hidden layer neurons and maps the input information into a new space:

$$h_{i} \left( x \right) = {\Phi }_{i} \left( x \right) = e^{{ - \frac{{x^{2} }}{{\delta_{i}^{2} }}}}$$
(6)

\(\delta_{i}\) is the spread constant. When the vector is input to the neural network through the \(Gaussian\) radial basis function, the output of the \(j\)-th hidden layer node is:

$${\Phi }_{j} \left( x \right) = {\text{exp}}\left( { - \frac{{\parallel x - c_{j} \parallel^{2} }}{{2\sigma_{j}^{2} }}} \right)$$
(7)

where \(c_{j}\) is the center of the Gaussian function of the \(j\)-th hidden node, \(\parallel \cdot\parallel \) is the Euclidean norm, and \(\sigma_{j}\) is the width of the Gaussian function of the \(j\)-th hidden node. The output of the \(RBF\) neural network is:

$$Y = \left( {y_{1} ,y_{2} , \ldots ,y_{m} } \right)^{T} = \mathop \sum \nolimits_{j = 1}^{k} w_{j} {\Phi }_{j} \left( x \right)$$
(8)

where \(w_{j}\) is the vector of connection weights between the \(j\)-th hidden layer node and the \(m\) output nodes [18]. The \(RBF\) neural network forecasting model turns the nonlinear mapping from the input layer to the hidden layer into a linear mapping in another space, which makes it better suited to forecasting signals with high frequency, large volatility and strong randomness [19].
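The forward pass of such a network can be sketched in a few lines. The sketch assumes the usual squared-distance Gaussian form for the hidden layer, \(k\) hidden nodes with one weight vector each, and no bias term. The function name and the array shapes are illustrative assumptions, and training of the centers, widths and weights is not shown.

```python
import numpy as np


def rbf_forward(x, centers, sigmas, weights):
    """Forward pass of a Gaussian RBF network.

    x:       input vector, shape (n,)
    centers: hidden-node centers c_j, shape (k, n)
    sigmas:  hidden-node widths sigma_j, shape (k,)
    weights: hidden-to-output weights, shape (k, m)
    Returns the output Y with shape (m,) and the hidden activations (k,).
    """
    x = np.asarray(x, dtype=float)
    centers = np.asarray(centers, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Squared Euclidean distance from the input to every center.
    sq_dist = np.sum((centers - x) ** 2, axis=1)
    phi = np.exp(-sq_dist / (2.0 * sigmas ** 2))
    # The output layer is a plain weighted sum of the hidden activations.
    return phi @ weights, phi
```

The output layer is linear in `phi`, so once the centers and widths are fixed the weights can be fitted by ordinary least squares, which is one reason RBF networks train quickly.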

2.4 LSTM Recurrent Neural Network

LSTM is an improved structure proposed to address the vanishing and exploding gradients that an ordinary \(RNN\) easily suffers in practical training. It introduces gates into the neuron (cell) of the standard \(RNN\) model: an input gate, an output gate and a forget gate [12]. The forget gate decides which past information is forgotten and which is kept. The LSTM cell structure is shown in Fig. 1.

Fig. 1. LSTM unit structure diagram

Through the gate mechanism in the cell structure, the LSTM model decides which information is forgotten and which is updated, and so forms a long short-term memory network. In the LSTM cell structure of Fig. 1, \(C_{t}\) is the cell state at time \(t\); \(x_{t}\) is the input at time \(t\); \(h_{t}\) is the output at time \(t\); and \(f_{t}\), \(i_{t}\) and \(o_{t}\) are the outputs of the forget gate, input gate and output gate respectively. The cell operates as follows:

$$f_{t} = \sigma \left( {W_{f} \left[ {h_{t - 1} ,x_{t} } \right] + b_{f} } \right)$$
(9)
$$i_{t} = \sigma \left( {W_{i} \left[ {h_{t - 1} ,x_{t} } \right] + b_{i} } \right)$$
(10)
$$\overline{C}_{t} = tanh\left( {W_{c} \left[ {h_{t - 1} ,x_{t} } \right] + b_{c} } \right)$$
(11)
$$o_{t} = \sigma \left( {W_{o} \left[ {h_{t - 1} ,x_{t} } \right] + b_{o} } \right)$$
(12)
$$C_{t} = f_{t} C_{t - 1} + i_{t} \overline{C}_{t}$$
(13)
$$h_{t} = o_{t} {\text{tanh}}\left( {C_{t} } \right)$$
(14)

where \(W_{f}\), \(W_{i}\), \(W_{c}\) and \(W_{o}\) are weight matrices, \(b_{f}\), \(b_{i}\), \(b_{c}\) and \(b_{o}\) are bias vectors, and \(\sigma\) is the sigmoid function. Through the additional forget gate, the \(LSTM\) model controls how the gradient propagates during training and thereby mitigates the vanishing and exploding gradient problems.
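Eqs. (9) to (14) describe a single time step of the cell, which can be written out directly. The sketch below is illustrative only: the function names and the dictionaries keyed by gate name are assumptions, the products in Eqs. (13) and (14) are taken element-wise as is standard, and training is not shown.

```python
import numpy as np


def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following Eqs. (9)-(14).

    W[g] has shape (hidden, hidden + input) and b[g] has shape (hidden,)
    for each gate key g in "f", "i", "c", "o".
    """
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate, Eq. (9)
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate, Eq. (10)
    c_bar = np.tanh(W["c"] @ z + b["c"])     # candidate state, Eq. (11)
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate, Eq. (12)
    c_t = f_t * c_prev + i_t * c_bar         # cell state update, Eq. (13)
    h_t = o_t * np.tanh(c_t)                 # cell output, Eq. (14)
    return h_t, c_t
```

Running `lstm_step` over a sequence, feeding each returned `h_t` and `c_t` into the next call, reproduces the recurrent behaviour of the cell.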

3 Short-Term Load Forecasting Combination Model Based on HHT

The short-term load of the power grid is affected by human production and daily life, changes in meteorological conditions, economic factors, political factors and so on. The system load data therefore mixes multiple characteristics, and its essential characteristics are difficult to extract directly. To explore the inherent law of the load data, a short-term load forecasting combination model of the power grid based on \(HHT\) is established. The load data is decomposed into a number of \(IMF\) components by the \(EMD\) algorithm, and each component is then processed separately by the \(Hilbert\) transform to obtain its instantaneous frequency and instantaneous amplitude. Different neural network models are selected for forecasting according to the characteristics of each \(IMF\), and the results are summed to give the final load forecast. In addition, because changes in air temperature strongly influence load fluctuation, the correlation between the regional temperature data and the \(IMF\) components is used to improve forecasting accuracy.

3.1 Hilbert-Huang Transform (HHT) of Load Data

Data samples from March 2021 for a region in East China are selected for the test. The load curve is shown in Fig. 2. The load sequence is decomposed by \(EMD\) with envelopes fitted by the cubic spline interpolation method, which yields 7 \(IMF\) components and a residual component \(r\). The result is shown in Fig. 3.

Fig. 2. Load data curve (March 2021)

Fig. 3. \(EMD\) decomposition result

In Fig. 3, the \(IMF_{1}\) and \(IMF_{2}\) components have high frequencies, while the frequencies of \(IMF_{3}\) to \(IMF_{7}\) decrease progressively. To analyze each component further, the \(Hilbert\) transform is applied to obtain the instantaneous frequency curve of each component, as shown in Fig. 4.

Fig. 4. Instantaneous frequency of \(IMF\) components

The mean frequency of each \(IMF\) component is listed in Table 1. Fig. 4 and Table 1 show that each \(IMF\) component has different frequency characteristics: the instantaneous frequency decreases progressively from one component to the next, and the calculated mean frequency decreases in the same order. \(IMF_{1}\) to \(IMF_{3}\) show large fluctuation, strong randomness and high frequency, and are treated as the random part of the load; \(IMF_{4}\) and \(IMF_{5}\) fluctuate steadily with a lower mean frequency, and represent the periodicity of the load; finally, according to the mean frequencies in Table 1, those of \(IMF_{6}\), \(IMF_{7}\) and the residual term \(r\) approach zero, and these components represent the trend of the load. The components are thus divided into random, periodic and trend components according to their characteristics. This grouping reduces the difficulty of building the forecasting model and lets each model focus on the distinct characteristics of its component. The model is built by combining the \(RBF\) neural network with the recurrent neural network based on \(LSTM\). It uses the fast learning of the \(RBF\) neural network to process the highly volatile, high-frequency signals, and uses \(LSTM\) to handle the strongly periodic components that are highly correlated in time, which effectively improves forecasting accuracy.
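The grouping by mean frequency can be expressed as a simple thresholding rule. The sketch below is only an illustration: the paper does not state numeric cut-offs, so the thresholds `high` and `low` and the function name are hypothetical and would have to be chosen from the values in Table 1.

```python
def group_components(mean_freqs, high=0.1, low=0.01):
    """Assign each component index to a group by its mean frequency.

    mean_freqs: mean frequency of each component, in decomposition order.
    high, low:  hypothetical thresholds (same unit as mean_freqs).
    """
    groups = {"random": [], "periodic": [], "trend": []}
    for index, freq in enumerate(mean_freqs):
        if freq >= high:
            groups["random"].append(index)     # volatile, high frequency
        elif freq >= low:
            groups["periodic"].append(index)   # steady, mid frequency
        else:
            groups["trend"].append(index)      # mean frequency near zero
    return groups
```

Each group can then be routed to the model that suits it, for example the \(RBF\) network for the random group and \(LSTM\) for the periodic group, as described above.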

Table 1. Average frequency of IMF component

3.2 Correlation Analysis of IMF Components and Temperature

Power load data mixes several kinds of electricity use, such as industrial load, appliance load and transportation load. Different \(IMF\) components reflect different kinds of electricity use and respond differently to weather, so meteorological factors should be considered during preliminary data processing. Here air temperature is taken as the representative meteorological factor: the correlation between each \(IMF\) component and temperature is compared, and the result is used to adjust the input data and model parameters when the neural network for each \(IMF\) component is built.

The correlation coefficient of each \(IMF\) component and air temperature is defined as

$$r_{i} = \frac{{cov\left( {h_{i} \left( t \right), c_{i} \left( t \right)} \right)}}{{\sqrt {var\left( {h_{i} \left( t \right)} \right)} \sqrt {var\left( {c_{i} \left( t \right)} \right)} }}$$
(15)

where \(c_{i} \left( t \right)\) is the air temperature at the corresponding sampling point (\(i = 1,2, \ldots ,n\); \(n\) is the total number of \(IMF\) components). The correlation differs markedly between seasons: the correlation coefficient between the \(IMF\) components and temperature is high in summer and winter. Because the load data selected here is from March, the correlation between the decomposed \(IMF\) components and the air temperature data is small on the whole; the correlation coefficients are plotted in Fig. 5. In particular, \(IMF_{1}\) and \(IMF_{2}\) show almost no correlation with the air temperature data. \(IMF_{3}\) and \(IMF_{4}\) are positively correlated with air temperature, whereas \(IMF_{5}\) and \(IMF_{7}\) are negatively correlated, so the short-term load forecasting combination model is trained separately for each \(IMF\) component. Taking \(IMF_{4}\) as an example, Fig. 5 shows that it is affected by weather more strongly than the other components, which makes it harder to forecast. In the combination model for the \(IMF_{4}\) component, the training, validation and test data account for about 90%, 5% and 5% of the samples respectively.
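The correlation coefficient of Eq. (15) and the 90%/5%/5% split can be sketched as follows. The function names are illustrative, and the chronological ordering of the split is an assumption: the paper gives only the proportions, not how the samples are ordered.

```python
import numpy as np


def imf_temperature_correlation(imfs, temperature):
    """Pearson correlation between each IMF component and the temperature."""
    temperature = np.asarray(temperature, dtype=float)
    coeffs = []
    for h in np.asarray(imfs, dtype=float):
        # Covariance over the product of standard deviations, as in Eq. (15).
        cov = np.mean((h - h.mean()) * (temperature - temperature.mean()))
        coeffs.append(cov / (h.std() * temperature.std()))
    return np.array(coeffs)


def chronological_split(n_samples, train=0.90, val=0.05):
    """Index slices for training, validation and test data, in time order."""
    n_train = int(n_samples * train)
    n_val = int(n_samples * val)
    return (slice(0, n_train),
            slice(n_train, n_train + n_val),
            slice(n_train + n_val, n_samples))
```

A coefficient near zero, as reported for \(IMF_{1}\) and \(IMF_{2}\), suggests that temperature adds little as an input for that component, while a larger magnitude argues for including it.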

Fig. 5. Correlation between IMF components and temperature

3.3 Short-Term Load Forecasting Combination Model of Power Grid

The load sequence of a power system is volatile and has its own periodicity, and it depends strongly on the actual setting. For example, differences in geographic position and living habits between southern and northern China, and economic and social differences between first- and second-tier cities and third- and fourth-tier cities, lead to different periodicity and volatility in the load data, as do climate, major events and electricity price fluctuations. The short-term load combination forecasting model of the power grid based on \(HHT\) proposed here decomposes the short-term load data to expose its underlying structure, and then forecasts each component with a neural network model suited to its characteristics, so as to improve forecasting accuracy and stability.

The concrete steps of the short-term load combination forecasting model of the power grid based on \(HHT\) are:

  (1) Preprocess the historical load data and specify the evaluation index;

  (2) Decompose the load data by the \(EMD\) algorithm, apply the \(Hilbert\) transform to each \(IMF\) component, and obtain the instantaneous frequency;

  (3) Forecast each component with an appropriate neural network model according to its frequency characteristics;

  (4) Add the forecast results of all components to get the final result;

  (5) Finally, compute the accuracy index and compare it with that of the non-combination methods.
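Steps (3) and (4) amount to forecasting each component separately and summing the results. The sketch below shows only this combination logic. The `persistence_forecast` function is a deliberately naive placeholder that stands in for the trained \(RBF\) and \(LSTM\) models, and all names and the 24-sample period are assumptions for illustration.

```python
import numpy as np


def persistence_forecast(history, horizon, period=24):
    """Placeholder forecaster: repeat the last `period` samples."""
    history = np.asarray(history, dtype=float)
    repeats = int(np.ceil(horizon / period))
    return np.tile(history[-period:], repeats)[:horizon]


def combined_forecast(components, forecasters, horizon=24):
    """Forecast every component with its own model and add the results.

    components:  list of 1-D arrays (IMF components and the residual).
    forecasters: list of callables f(history, horizon) -> array of length
                 horizon, one per component.
    """
    total = np.zeros(horizon)
    for component, forecast in zip(components, forecasters):
        total += forecast(component, horizon)
    return total
```

In the scheme described here, the list of forecasters would hold a trained \(RBF\) model for each random component and a trained \(LSTM\) model for each periodic component, in the same order as the components.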

4 Simulated Analysis

The short-term load forecasting model is applied with the load data of March 2021 from one region in eastern China as the training sample. The load for the 24 h of April 1, 2021 is forecast, and the accuracy of the result is analyzed. The actual and forecast load curves are shown in Fig. 6. The forecast values and their absolute percentage errors are listed in Table 2, where the absolute percentage error is defined as:

$$APE = \left| {\frac{{A_{t} - P_{t} }}{{A_{t} }}} \right| \times 100\%$$
(16)

where \(A_{t}\) is the actual load value and \(P_{t}\) is the forecast load value.

Fig. 6. Comparison between real load and forecasting load

Table 2. APE of forecasting load

For comparison with other methods, the mean absolute percentage error (\(MAPE\)) is selected as the evaluation index of short-term load forecasting for the power grid:

$$MAPE = \frac{1}{n}\mathop \sum \nolimits_{t = 1}^{n} \left| {\frac{{A_{t} - P_{t} }}{{A_{t} }}} \right| \times 100\%$$
(17)
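Both error measures, Eq. (16) and Eq. (17), are short enough to state in code. The function names are illustrative, and the sketch assumes that no actual load value is zero.

```python
def ape(actual, predicted):
    """Absolute percentage error of one forecast point, Eq. (16)."""
    return abs((actual - predicted) / actual) * 100.0


def mape(actuals, predictions):
    """Mean absolute percentage error over all forecast points, Eq. (17)."""
    if len(actuals) != len(predictions) or len(actuals) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [ape(a, p) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)
```

For a 24 h forecast, `mape` would be called with the 24 actual and 24 forecast load values of the day.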

Table 3 compares the mean-load forecasts for the region obtained with the method proposed herein, the \(RNN\) recurrent neural network forecasting model and the \(LSTM\) recurrent neural network forecasting model; the proposed method achieves the higher forecasting accuracy.

Table 3. MAPE of short-term load forecasting

Compared with the \(MAPE\) of the network models based on \(LSTM\) and \(RNN\), the \(MAPE\) of the combination forecasting method proposed in this paper fluctuates but stays basically below 2%, which is clearly better than that of the recurrent neural network forecasting model based on \(LSTM\) applied alone.

5 Conclusions

In this paper, a short-term load combination forecasting model of the power grid based on \(HHT\) is studied. The original load sequence is decomposed by the \(EMD\) algorithm, and each \(IMF\) component is then transformed by the Hilbert transform (\(HT\)). According to the characteristics of the different components and the analysis of their correlation with air temperature data, the components are forecast by combining the \(RBF\) neural network with the recurrent neural network based on \(LSTM\). The method not only uses the ability of \(HHT\) to process nonlinear and non-stationary signals, but also lets the different neural networks complement each other. The experimental results show that the short-term load forecasting combination method of the power grid based on \(HHT\) achieves higher accuracy. The characteristic analysis of the IMF components still needs further study, in particular their correlation with other factors such as social and holiday activity factors. The combined neural network model can also be improved further to raise the load forecasting accuracy.