1 Introduction

Data mining techniques are usually divided into two categories. The first comprises statistical models, most commonly probability analysis, correlation analysis, clustering analysis, and discriminant analysis; the second is machine learning from artificial intelligence. In this paper, we use an artificial intelligence model to forecast electricity demand. The purpose of short-term power load forecasting is to obtain accurate load forecasts: provided that power supply quality requirements are met, accurate forecasts help power providers make the best use of power supply system construction funds and thereby maximize social and economic benefits.

Recently, researchers have put forward a range of models for electricity demand forecasting based on time series techniques, such as artificial neural network models [1, 2] and fuzzy logic and grey-based approaches [3, 4]. However, none of these models achieves the expected forecasting accuracy on every electricity demand forecasting problem [5]; no single method gives the best results under all conditions. Hybrid, or combined, models were introduced to address this problem.

A combined model draws on the advantages of two or more individual models. Bates and Granger [6] proposed the hybrid model, and Dickinson [7] later showed that a hybrid model can achieve higher accuracy than an individual model. In this paper, we therefore propose a new combined model, named SPLSSVM, that combines seasonal adjustment (SA), particle swarm optimization (PSO), and the least squares support vector machine (LSSVM). First, SPLSSVM applies SA to eliminate the seasonal component. It then uses the LSSVM for model training and fitting, with the LSSVM parameters optimized by the PSO algorithm. A comparison of the proposed model with other models shows that SPLSSVM improves forecasting accuracy.

The layout of this paper is as follows. We review the theory of SA, PSO, and the LSSVM model in Sect. 2, and present a case study of electric load forecasting in Sect. 3. The last section concludes the paper.

2 The Hybrid Model

2.1 A Review of Seasonal Adjustment

The data series \( x_1, x_2, \dots, x_T \) (\( T = ml \)) is arranged in turn as \( x_{11}, x_{12}, \dots, x_{1l};\ x_{21}, x_{22}, \dots, x_{2l};\ \dots;\ x_{m1}, x_{m2}, \dots, x_{ml} \), that is, as m periods of l observations each. We then calculate the average value of each period as \( {\overline{x}}_k=\left({x}_{k1}+{x}_{k2}+\cdots +{x}_{kl}\right)/l \) (k = 1, 2, …, m).

Dividing each observation by the average of its period gives the ratio:

$$ {I}_{ks}=\frac{x_{ks}}{{\overline{x}}_k}\;\left(k=1,2,\dots, m;\kern1em s=1,2,\dots, l\right). $$
(41.1)

The average of \( I_{kj} \) at the same position j across all m periods is taken as the seasonal index:

$$ {I}_j=\frac{I_{1j}+{I}_{2j}+\cdots +{I}_{mj}}{m}\;\left(j=1,2,\dots, l\right). $$
(41.2)

Then the sequence without the seasonal effects is obtained:

$$ {y}_{ks}=\frac{x_{ks}}{I_s}\;\left(k=1,2,\dots, m;\kern1.5em s=1,2,\dots, l\right). $$
(41.3)
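Equations (41.1) to (41.3) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `seasonal_adjust` and the toy series are assumptions made for the example, and the series length is assumed to be an exact multiple of the period l.

```python
import numpy as np

def seasonal_adjust(x, l):
    """Remove a multiplicative seasonal pattern of period l from series x.

    Returns (y, indices): the adjusted series and the l seasonal indices.
    The function name and interface are illustrative, not from the paper.
    """
    x = np.asarray(x, dtype=float)
    if x.size % l != 0:
        raise ValueError("series length must be a multiple of the period l")
    X = x.reshape(-1, l)                           # row k = period k, column s = position s
    period_means = X.mean(axis=1, keepdims=True)   # x-bar_k
    ratios = X / period_means                      # I_ks, Eq. (41.1)
    indices = ratios.mean(axis=0)                  # I_s,  Eq. (41.2)
    y = (X / indices).ravel()                      # y_ks, Eq. (41.3)
    return y, indices

# Toy series (hypothetical data): a level that drifts over 3 periods,
# multiplied by a fixed within-period pattern of length 4.
pattern = np.array([0.8, 1.0, 1.3, 0.9])
levels = np.array([100.0, 110.0, 120.0])
x = (levels[:, None] * pattern).ravel()

y, idx = seasonal_adjust(x, l=4)
```

In this toy case the pattern averages to one, so the estimated indices recover the pattern and the adjusted series reduces to the underlying level of each period. A forecast made on the adjusted series would presumably be multiplied by the matching index to return to the original scale.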

2.2 A Review of Particle Swarm Optimization

In an m-dimensional search space, each particle represents a candidate solution to the problem. Here \( X_i = (x_{i1}, x_{i2}, \dots, x_{im}) \) is the current position of particle i, \( V_i = (v_{i1}, v_{i2}, \dots, v_{im}) \) is its current velocity, \( P_i = (p_{i1}, p_{i2}, \dots, p_{im}) \) is the best position it has found so far, and \( P_g = (p_{g1}, p_{g2}, \dots, p_{gm}) \) is the best position found by all particles. The velocity and position of particle i are then updated according to Eqs. (41.4) and (41.5) [8].

$$ {v}_i^{k+1}=w\cdot {v}_i^k+{c}_1\cdot {r}_1\cdot \left({p}_i^k-{x}_i^k\right)+{c}_2\cdot {r}_2\cdot \left({p}_g^k-{x}_i^k\right), $$
(41.4)
$$ {x}_i^{k+1}={x}_i^k+\alpha \cdot {v}_i^k, $$
(41.5)

where \( v_i^k \) and \( x_i^k \) are the velocity and position of particle i at iteration k, \( c_1 \) and \( c_2 \) are two positive constants, \( r_1 \) and \( r_2 \) are two independent random variables drawn from [0, 1], w is the inertia weight, and α is a factor that scales the velocity in the position update.
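The update rules can be illustrated with a short NumPy sketch. Several choices here are assumptions rather than details from the paper: the function name `pso`, the default values of w, c1, c2, and α, the clipping of positions to the search bounds, and the use of the freshly updated velocity in the position step (the common convention, whereas Eq. (41.5) as printed uses the velocity of iteration k). The random numbers are drawn uniformly and separately for each dimension.

```python
import numpy as np

def pso(objective, lower, upper, n_particles=20, n_iter=100,
        w=0.7, c1=1.5, c2=1.5, alpha=1.0, seed=0):
    """Minimize `objective` over a box; all defaults are illustrative."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size

    x = rng.uniform(lower, upper, size=(n_particles, dim))  # positions X_i
    v = np.zeros((n_particles, dim))                        # velocities V_i
    p = x.copy()                                            # personal bests P_i
    p_val = np.array([objective(xi) for xi in x])
    g = p[np.argmin(p_val)].copy()                          # swarm best P_g
    g_val = p_val.min()

    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity update, Eq. (41.4)
        v = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)
        # Position update in the spirit of Eq. (41.5); clipping is an added assumption
        x = np.clip(x + alpha * v, lower, upper)

        val = np.array([objective(xi) for xi in x])
        improved = val < p_val
        p[improved] = x[improved]
        p_val[improved] = val[improved]

        best = np.argmin(p_val)
        if p_val[best] < g_val:
            g, g_val = p[best].copy(), p_val[best]
    return g, g_val

# Hypothetical test problem: a shifted sphere with its minimum at (1, -2).
target = np.array([1.0, -2.0])
best, best_val = pso(lambda z: float(np.sum((z - target) ** 2)),
                     lower=[-5.0, -5.0], upper=[5.0, 5.0])
```

For tuning an LSSVM, the objective would be a validation error evaluated at a candidate pair (c, σ), so that each particle encodes one parameter setting.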

2.3 Least Squares Support Vector Machine Model

The LSSVM model was proposed by Suykens and Vandewalle [9]. Given a training dataset of N points \( \{x_i, y_i\}_{i=1}^{N} \) with input data \( x_i \in \mathbb{R}^n \) and output data \( y_i \in \mathbb{R} \), we define the decision function [10]:

$$ y(x)={w}^T\varphi (x)+b. $$
(41.6)

To solve the function estimation problem, this paper introduces structural risk minimization to realize function optimization:

$$ \mathrm{Minimize}:\;\frac{1}{2}{\left\Vert w\right\Vert}^2+\frac{1}{2}c{\displaystyle \sum_{i=1}^N{\varepsilon}_i^2}. $$
(41.7)

Subject to: \( y_i = w^T \varphi(x_i) + b + \varepsilon_i \), i = 1, …, N.

To solve this constrained problem, the Lagrange multipliers \( a_i \) are introduced as follows:

$$ L\left(w,b,\varepsilon, a\right)=\frac{1}{2}{\left\Vert w\right\Vert}^2+\frac{1}{2}c{\displaystyle \sum_{i=1}^N{\varepsilon}_i^2}-{\displaystyle \sum_{i=1}^N{a}_i\left[{w}^T\varphi \left({x}_i\right)+b+{\varepsilon}_i-{y}_i\right]}. $$
(41.8)

According to the Karush–Kuhn–Tucker conditions, the resulting LSSVM model for function estimation can be described as:

$$ f(x)={\displaystyle \sum_{i=1}^N{a}_iK\left(x,{x}_i\right)+b}, $$
(41.9)

where \( K(x, x_i) \) is the kernel function, which replaces the dot product \( \varphi(x)^T \varphi(x_i) \). The radial basis function (RBF) kernel with width σ is applied in this paper:

$$ K\left(x,{x}_i\right)= \exp \left(-0.5{\left\Vert x-{x}_i\right\Vert}^2/{\sigma}^2\right). $$
(41.10)
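Setting the derivatives of the Lagrangian (41.8) with respect to w, b, and ε to zero and eliminating w and ε leaves a linear system in the multipliers a and the bias b, which is the standard way an LSSVM is trained:

$$ \begin{bmatrix} 0 & \mathbf{1}^T \\ \mathbf{1} & K + I/c \end{bmatrix} \begin{bmatrix} b \\ a \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}, $$

where K is the N × N kernel matrix with entries \( K(x_i, x_j) \). The sketch below solves this system with NumPy and evaluates Eqs. (41.9) and (41.10). The function names, the toy sine data, and the values of c and σ are assumptions chosen for illustration; in the proposed model these two parameters would be selected by PSO.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """RBF kernel matrix between the rows of A and B, Eq. (41.10)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / sigma ** 2)

def lssvm_fit(X, y, c, sigma):
    """Solve the LSSVM linear system; returns multipliers a and bias b."""
    N = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = 1.0                     # first row encodes sum(a) = 0
    A[1:, 0] = 1.0                     # bias column
    A[1:, 1:] = K + np.eye(N) / c      # kernel matrix plus ridge term from eps_i = a_i / c
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]

def lssvm_predict(X_train, a, b, sigma, X_new):
    """Evaluate f(x) = sum_i a_i K(x, x_i) + b, Eq. (41.9)."""
    return rbf_kernel(X_new, X_train, sigma) @ a + b

# Hypothetical toy data: a sine curve sampled at 40 points.
X = np.linspace(0.0, 2.0 * np.pi, 40)[:, None]
y = np.sin(X[:, 0])
c, sigma = 100.0, 1.0                  # illustrative values only

a, b = lssvm_fit(X, y, c, sigma)
fitted = lssvm_predict(X, a, b, sigma, X)
```

Because \( \varepsilon_i = a_i / c \) at the optimum, the training residuals of the fitted model equal a / c, which gives a quick consistency check on any implementation.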

3 Simulation Results

For this paper, an electric load dataset from New South Wales, Australia, was used; the data were collected 48 times per day. The data series covers 35 days of electricity demand, 1,680 values in total, as shown in Fig. 41.1. Using the first 1,440 values (30 days) for model training and fitting, we forecast the following 240 values (5 days).

Fig. 41.1 The electricity demand data for model fitting and training

Figure 41.1 shows that the load curves for the same day of the week in different weeks resemble each other more closely than the curves for different days within one week. Given this cyclic behavior, this study divides the original data series into five groups by day of the week (a Monday group, a Tuesday group, and so on). Each group is then analyzed separately and used to forecast the corresponding day of the week; for example, the Monday group is used to forecast the following Monday.
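The split and grouping can be sketched as follows. This is only an illustration under stated assumptions: the 35 days are taken to be consecutive days that cycle through the five groups in order (the paper does not spell out the ordering), the helper name `split_by_day_group` is hypothetical, and a placeholder array stands in for the actual load values.

```python
import numpy as np

POINTS_PER_DAY = 48   # observations per day, as stated in the text
N_GROUPS = 5          # five day-of-week groups, as stated in the text

def split_by_day_group(series, points_per_day=POINTS_PER_DAY, n_groups=N_GROUPS):
    """Return one 2-D array per group, with one row per day in that group.

    Assumes the days cycle through the groups in a fixed order.
    """
    series = np.asarray(series, dtype=float)
    days = series.reshape(-1, points_per_day)      # one row per day
    if days.shape[0] % n_groups != 0:
        raise ValueError("number of days must be a multiple of n_groups")
    return [days[g::n_groups] for g in range(n_groups)]

# Placeholder for the 1,680 load values (not real data).
series = np.arange(35 * POINTS_PER_DAY, dtype=float)

train, test = series[:1440], series[1440:]         # 30 days and 5 days
train_groups = split_by_day_group(train)
test_groups = split_by_day_group(test)
```

Under these assumptions each training group holds six days of 48 values, and each test group holds the single day to be forecast.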

From Fig. 41.1, it can be seen that these five data series show strong seasonality. Forecasting accuracy can therefore be expected to improve if the seasonal component is eliminated before forecasting. The Monday group data after elimination of the seasonal component is shown in Fig. 41.2. Figure 41.2a shows the original Monday group electricity demand, while Fig. 41.2b shows the seasonally adjusted Monday group electricity demand.

Fig. 41.2 The Monday group electricity demand before and after seasonal adjustment

After eliminating the seasonal component, we used the resulting five group datasets to forecast the next five days' electricity demand. Here, the LSSVM model was applied to model training and fitting, and the LSSVM parameters were optimized by PSO. To verify that the proposed model (SPLSSVM) indeed improves forecasting accuracy, we compared it with two other models: LSSVM, and LSSVM optimized by PSO (named PLSSVM). The forecasting results of all three models (LSSVM, PLSSVM, and SPLSSVM) for each day are shown in Fig. 41.3.

Fig. 41.3 Forecasting results of the three models

Table 41.1 lists three performance measures for the three forecasting models; the results are discussed below.

Table 41.1 Three performance measures of the three models

Comparing PLSSVM with LSSVM gives the following result. On the three measures in Table 41.1, PLSSVM shows the expected lower values than LSSVM on three days: Tuesday, Wednesday, and Friday. On Monday and Thursday, PLSSVM shows slightly higher values than LSSVM on all three measures. In terms of average values for the entire week, the root mean square error (RMSE) of PLSSVM is slightly larger than that of LSSVM, whereas the mean absolute error (MAE) and mean absolute percentage error (MAPE) of PLSSVM are lower by 4.3 % and 5.4 %, respectively. Overall, PLSSVM performs better than LSSVM.
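The three measures are not written out in the text; the sketch below uses their standard definitions, which is an assumption about how the values in Table 41.1 were computed. The function names and the small example arrays are hypothetical.

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean square error."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

def mae(actual, forecast):
    """Mean absolute error."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs(actual - forecast)))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100.0)

# Hypothetical values, not taken from the paper.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]

scores = {"RMSE": rmse(actual, forecast),
          "MAE": mae(actual, forecast),
          "MAPE": mape(actual, forecast)}
```

RMSE penalizes large individual errors more heavily than MAE does, which is one way the weekly RMSE of a model can rise slightly while its MAE and MAPE fall.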

Comparing SPLSSVM with PLSSVM gives the following result. On the three measures RMSE, MAE, and MAPE, SPLSSVM has, as desired, lower values than PLSSVM for every day of the week. Considering the average values for the whole week, SPLSSVM reduces RMSE, MAE, and MAPE by 44.1 %, 41.1 %, and 38.3 %, respectively, relative to PLSSVM.

To sum up, among the three models (LSSVM, PLSSVM, and SPLSSVM), SPLSSVM has the best performance for every day of the week. In terms of average values for the whole week, SPLSSVM reduces RMSE by 44.1 % and 43.4 %, MAE by 41.1 % and 43.7 %, and MAPE by 38.3 % and 41.7 % compared with PLSSVM and LSSVM, respectively.

Conclusion

A new electricity demand forecasting model named SPLSSVM is proposed in this paper. SPLSSVM first uses seasonal adjustment to remove seasonal factors from the original data series. Next, SPLSSVM employs LSSVM to model the adjusted series, with PSO used to optimize the LSSVM parameters. Measured by RMSE, MAE, and MAPE, SPLSSVM improves the accuracy of electricity demand forecasting, and the proposed model could help power utilities in the control and dispatch of electricity.