Introduction

Currently, the energy structure is mainly based on traditional primary energy and supplemented by new energy. However, primary energy is a non-renewable resource, and there is an unadjustable contradiction between the utilization of primary energy and the storage of primary energy. At the same time, these used primary energy sources often produce a large amount of carbon dioxide, sulfur dioxide, carbon monoxide, and other substances that have a bad impact on the environment. These substances will have a great negative impact on the environment, forming acid rain, aggravating the greenhouse effect, and leading to the ecological environment of the earth (Zhang 2021). Therefore, all countries have begun to focus on energy issues, improve their energy structure, and adjust the proportion of new energy sources. In order to alleviate these problems, the large-scale use of renewable energy is imperative.

Wind power is clean and renewable, and is one of the most widely used power generation methods in the field of renewable energy. With the continuous advancement of wind power technology, the global wind power industry is booming. According to Global Wind Energy Council (GWEC) statistics, as of the end of 2019, the cumulative grid-connected installed capacity of global wind power reached 651 GW, an increase of more than 26 times compared with the end of 2001, with an average annual compound growth rate of 20.12% (Xia 2020). According to the Global Wind Energy Report 2021 released by GWEC in 2021, the cumulative installed capacity of global wind power has reached 743 GW in 2020, reducing carbon dioxide emissions by more than 1.1 billion tons. The newly installed capacity is 93 GW, an increase of 53% over 2019 (Luo 2021). And GWEC forecasts that global installed wind power capacity will reach 840 GW by 2022 (Shen 2019).

At present, there are more than 90 countries and regions using wind power in the world. From the perspective of the status quo of wind power generation in various countries, as of the end of 2019, the top five countries with cumulative installed capacity of onshore wind power in the world were China, the USA, India, Spain, and Sweden. According to the statistics of wind power grid connection data of National Energy Administration (National Energy Commission 2016; National Energy Commission 2017; National Energy Commission 2018; National Energy Commission 2019; National Energy Commission 2020), significant progress has been made in the field of wind power generation in China in the past 5 years. From 2016 to 2019, the cumulative grid-connected installed capacity of wind power jumped from 149 million kilowatts to 210 million kilowatts, an increase of 40.94%. In the past 5 years, the annual wind power generation has soared from 241 billion kilowatt-hours to 405.7 billion kilowatt-hours. In addition, the quality of wind power generation in China has rapidly improved, and the wind curtailment rate has dropped from 20 to 4%.

However, as the installed capacity of wind power continues to increase, some negative impacts also follow. Since wind power is intermittent and random, large-scale wind power grid integration will have a serious impact on the traditional power system. Therefore, in order to maintain the power balance and frequency stability of the grid, there is a large-scale wind abandoning phenomenon, resulting in an obvious contradiction between large installed fan capacity and wind abandoning volume. Based on this, studying the uncertainty of wind power output and accurately predicting the output information of wind power are of great significance to reducing wind curtailment and the stable operation of the power system.

Wind power prediction methods can be divided into physical methods (Feng et al. 2010), statistical methods (Liu and Meng 2013), and machine learning methods (G. N. Kariniotakis et al. 1996; Yang et al. 2017). Physical methods mainly refer to prediction methods based on digital weather forecasts. This method generally uses meteorological data and landmark information as initial and boundary conditions, directly solves the physics equations to obtain the wind speed and direction at the height of the wind turbine hub, and calculates the output power through the power curve of the wind farm. However, this method has a large amount of calculation, slow prediction speed, and poor accuracy. The statistical method is a method of estimating the wind power in the future based on the historical wind power. However, this method is difficult to deal with scenes with drastic changes and needs to be used in combination with specific scenarios. Machine learning method establishes the relationship between input and output through some learning rules and establishes a prediction model based on historical wind power related data. It has the characteristics of high prediction accuracy and is a hot research topic in recent years. In recent years, in view of the high complexity of data input for wind power forecasting, a large number of scholars have conducted research on wind power forecasting decomposition technology. Literature (Zhang et al. 2021) uses the EMD method to decompose the wind power and uses the k-means clustering method to perform the clustering, respectively, constructing the empirical mode decomposition-relevance vector machine prediction model, which improves the prediction efficiency. However, EMD decomposition has the phenomenon of modal aliasing. In order to avoid the appearance of modal aliasing, literature (Yang et al. 2018) uses the EEMD method to decompose historical wind power data and achieves good results. EEMD adds symmetrical Gaussian white noise on the basis of EMD decomposition, which effectively avoids the phenomenon of modal overlap. However, if the added noise amplitude is not appropriate, it will not only increase the calculation scale, but also make the IMF component doped. There are more false components, which in turn affects accuracy. In view of the above considerations, literature (Zhao et al. 2019) uses the VMD method to decompose the wind power, each sub-component is predicted by the ARIMA method, and the GARCH model is introduced to eliminate the heteroscedasticity characteristics, and finally the results are superimposed to obtain the predicted value. This VMD method avoids the phenomenon of modal aliasing. However, VMD decomposition has a large reconstruction error. Therefore, literature (Zhao et al. 2020) uses the CEEMD method to decompose wind power input, which reduces the reconstruction error and improves the prediction accuracy on the basis of avoiding modal aliasing. In the literature (Zhao et al. 2020), in view of the high complexity of CEEMDAN decomposition, the CEEMDAN-VMD secondary decomposition method is proposed to effectively solve the problem of excessive high frequency complexity and improve the accuracy of wind power forecasting. Wind power point forecasting can only output forecast values and cannot describe the interval range of wind power forecasting. Therefore, in the form of forecasting, the current forecast has gradually shifted from point forecasting to interval forecasting.

However, the strong volatility, randomness, and intermittent nature of wind energy have brought great impacts and challenges to the integration of wind power into the grid. In order to ensure the safety of power integration, reduce the uncertainty caused by wind power grid integration, and provide more auxiliary information for grid dispatchers, wind power interval prediction is becoming more and more important.

According to different interval formation methods, wind power interval prediction methods are mainly divided into two categories: the first category is to build a dual-output model of wind power based on neural networks to predict the upper and lower bounds of wind power that may occur (Yang et al. 2020a, b); the second type is to assume or estimate the probability distribution function of wind power prediction error in advance, and perform inverse calculation on it to generate the confidence interval of wind power power(Yang et al. 2019).

The first type of method avoids the problem of strong hypothesis and weak versatility of the probability distribution function of the second type of method. Wind power interval prediction based on fuzzy information granulation is the first type of method. This article conducts an in-depth study of this method.

Yin et al. proposed a wind speed multi-step interval prediction model based on singular spectrum analysis-fuzzy information granulation and extreme learning machine. Wind speed sequence data was preprocessed through singular spectrum analysis, and all decomposed components were preprocessed through fuzzy granulation (Yin et al. 2018, 2019). Zeng et al. (2018) proposed a short-term wind speed interval prediction model based on empirical wavelet transform-fuzzy information granulation and mutation robust extreme learning machine. Through empirical wavelet transform, the wind speed decomposition components are obtained, and only the component with the highest frequency is used to perform fuzzy granulation. Literature (Yin et al. 2018, 2019) granulating all components leads to excessive computational cost, while literature (Zeng et al. 2018) only granulates the highest frequency component, and the interval prediction effect is not good. Therefore, it is necessary to select the best fuzzy granulation component.

In addition, most of the above literatures use traditional machine learning methods for interval prediction, and machine learning methods cannot effectively use a large amount of historical data due to their structural limitations, which will cause low prediction accuracy. In recent years, deep learning has made great progress. Compared with models such as extreme learning machines, deep neural networks have a memory function and have natural advantages for predicting and analyzing time series data. Compared with the traditional neural network algorithm, BiLSTM has memory ability, and the forward propagation layer and the back propagation layer are connected to the output layer together. Considering the correlation between data, BiLSTM can make the output result more accurate (Xue et al. 2020).

In view of the above considerations, this paper proposes an ultra-short-term wind power multi-step interval prediction method based on CEEMDAN-FIG and CNN-BiLSTM models. Firstly, CEEMDAN is used to decompose the wind power time series. Secondly, the components are selected for fuzzy information granulation. Then, the Low, R, Up, and other components obtained by fuzzy granulation are combined with wind speed data, wind direction data, and temperature data as the input of the CNN-BiLSTM model. Output of this model is obtained. Finally, CWCproposed is used as the objective function to optimize the prediction interval to generate a wind power prediction interval under a given confidence level.

Basic method principle

Complete ensemble empirical mode decomposition with adaptive noise

Wind power time series are non-stationary and nonlinear; if it is directly used as the input of the prediction model, it is generally difficult to obtain accurate prediction results. The nonlinearity and non-stationarity of wind power time series can be reduced and the prediction accuracy can be improved by using data decomposition technology.

EMD (empirical mode decomposition, EMD) can decompose nonlinear time series into finite modal components by using cubic spline interpolation functions. However, EMD has the phenomenon of “modal aliasing.” EEMD (ensemble empirical mode decomposition, EEMD) reduces the phenomenon of “modal aliasing” by adding white noise to the original signal. But EEMD has noise residue. CEEMDAN (Torres et al. 2011) is an improved algorithm of EEMD. By adding a finite amount of white noise at each stage of decomposition process, it successfully solves the problem of noise residue.

Graining of fuzzy information

Information granulation (He et al. 2019) is the study of a set of close or similar elements as a whole. The whole is called the information granule. Fuzzy information granulation is an information granulation model based on fuzzy set theory. Fuzzy information granulation mainly includes two steps: time window division and fuzzification. The division time window is to divide the original time series into multiple time series data of the same length according to a certain time. The time window is divided into each small piece of data. The width of the time window is the time span of each small piece of data. Fuzzification is the membership function to determine the information granulation. Information fuzzification uses triangular fuzzy particles to obtain the maximum value, average value, and minimum value of the window data. The membership function is:

$$A=\left\{\begin{array}{c}\genfrac{}{}{0pt}{}{0,x<a}{\frac{x-a}{m-a},a\le x\le m}\\ \genfrac{}{}{0pt}{}{\frac{b-x}{b-m},m<x\le b}{0,x>b}\end{array}\right.$$
(1)

Where x is a time domain variable; α, m, and b respectively correspond to the maximum, average, and minimum value of data in each window after fuzzy graining.

CNN-BiLSTM model

Convolutional neural network

Convolutional neural network (CNN) is one of the representative algorithms of deep learning. In particular, it has been widely used in the field of image processing (Qiu 2020).

Long short-term memory

In order to selectively update memory cells, long short-term memory (LSTM) introduces the cell state on the basis of the hidden layer state of the original RNN to store long-term memory and reflect the dependence of adjacent moments learned by the deep network at any time step and the structural characteristics of input data a long time ago (Hochreiter and Schmidhuber 1997). Each LSTM calculation unit contains three control gates, which are input gate, output gate, and forget gate. The structure is shown in Fig. 1.

Fig. 1
figure 1

LSTM network unit structure

ft is the forgetting gate, It is the input gate, and ot is the output gate. The relationship between output ft of LSTM network forgetting gate and current input xt is as follows:

$${f}_{t}=\sigma \left({\omega }_{f}\bullet \left[{h}_{t-\mathrm{l}},{x}_{t}\right]+{b}_{f}\right)$$
(2)

In formula (2), ωt and bf are the weight and bias of the forgetting gate input respectively, σ(⋅) is the activation function sigmoid, which is used to restrict the passage of information, 0 means “completely discarded,” and 1 means “completely reserved.” The intermediate variable is used to determine whether the cell state will be added. The specific functional relationship is as follows:

$$\stackrel{\sim }{{c}_{t}}=\mathrm{tanh}\left({\omega }_{c}\bullet \left[{h}_{t-\mathrm{l}},{x}_{t}\right]+{b}_{c}\right)$$
(3)
$${i}_{t}=\sigma \left({\omega }_{i}\bullet \left[{h}_{t-\mathrm{l}},{x}_{t}\right]+{b}_{i}\right)$$
(4)

In formulas (3) and (4), ωc and ωi are the weight values of the intermediate variable and the input gate; bc and bi are the offset value. The unit state ct has the following functional relationship:

$${c}_{t}={c}_{t-\mathrm{l}}\odot {f}_{t}+{i}_{t}\odot \stackrel{\sim }{{c}_{t}}$$
(5)

Based on the above variables, the output value of the output gate at the current time can be obtained:

$${o}_{t}=\sigma \left({\omega }_{o}\bullet \left[{h}_{t-\mathrm{l}},{x}_{t}\right]+{b}_{o}\right)$$
(6)
$${h}_{t}={o}_{t}\odot \mathrm{tanh}\left({c}_{t}\right)$$
(7)

In formulas (6) and (7), ωo and bo are the weight and bias of intermediate variables respectively. ⊙is the Hadamard product.

Generally, the information in the LSTM network is one-way transmission, and only the past information can be used; the future information cannot be used.

Bidirectional long short-term memory

Bidirectional long short-term memory (BiLSTM) is composed of two LSTM networks in opposite directions. The structure is shown in Fig. 2. The forward LSTM can obtain the past data information of the input sequence, and the backward LSTM can obtain the future data information of the input sequence; the forward and backward LSTM training of the time series can further improve the global and integrity of feature extraction.

Fig. 2
figure 2

BiLSTM network structure

The hidden state at time t is defined as forward \({h}_{t}^{\left(1\right)}\) and reverse \({h}_{t}^{\left(2\right)}\), as shown in Eqs. (8) and (9) respectively:

$${h}_{t}^{\left(\mathrm{l}\right)}=\overrightarrow{{h}_{t}}=\overrightarrow{LSTM}\left({h}_{t-\mathrm{l}},{x}_{t},{c}_{t-\mathrm{l}}\right)$$
(8)
$${h}_{t}^{\left(2\right)}=\overleftarrow{{h}_{t}}=\overleftarrow{LSTM}\left({h}_{t+\mathrm{l}},{x}_{t},{c}_{t+\mathrm{l}}\right)$$
(9)

Then, the output of the hidden layer at time \(t\) is determined by the forward hidden layer and the reverse hidden layer at the same time, namely:

$${h}_{t}={h}_{t}^{\left(\mathrm{l}\right)}\oplus {h}_{t}^{\left(2\right)}$$
(10)

In formula (10), ⊕is the vector splicing operation.

CNN-BiLSTM model

In order to improve the accuracy of interval multi-step prediction, CNN-BILSTM is selected in this paper to construct the prediction model. The CNN-BiLSTM model combines the feedforward mechanism of the convolutional neural network with the feedback mechanism of the BiLSTM network.

Through the feature extraction of the convolutional neural network, the computational cost is greatly reduced. BiLSTM has a memory function, and the prediction of the BiLSTM model can improve the model forecast accuracy of multi-step forecasts. The structure of the CNN-BiLSTM combined prediction model is shown in Fig. 3.

Fig. 3
figure 3

CNN-BiLSTM model structure diagram

Wind power interval prediction based on CEEMDAN-FIG and CNN-BiLSTM

CEEMDAN-FIG-CNN-BiLSTM prediction mode

Aiming at the randomness and non-stationarity of wind power time series, this paper proposes a wind power interval prediction method based on CEEMDAN-FIG and CNN-BiLSTM. Fig 4 is a flowchart of the prediction model.

Fig. 4
figure 4

CEEMDAN-FIG and CNN-BiLSTM prediction model flowchart. Note that WS refers to wind speed. Dr refers to direction and T refers to temperature

The model prediction steps are as follows:

  1. (1)

    The original wind power time series is decomposed by CEEMDAN to obtain n components including the remaining components;

  2. (2)

    Select some components for fuzzy information granulation and obtain the minimum sequence, maximum value sequence, and average value sequence. Combine the wind speed, wind direction, and temperature sequence as the input of the CNN-BiLSTM combined model. The remaining components are treated in the same way;

  3. (3)

    The prediction result of the minimum sequence combined with the prediction results of the remaining components is used as the lower limit of the interval; the prediction result of the maximum sequence combined with the prediction results of the remaining components is used as the upper limit of the interval; the prediction result of the average sequence combined with the prediction results of the remaining components is used as the point prediction result.

Evaluation index of prediction results

This paper selects prediction interval coverage probability (PICP) (Li et al. 2019), prediction interval normalized average width (PINAW) (Yang et al. 2020a, b), and coverage width criterion proposed (CWCproposed) (Chen 2019) to evaluate the performance of the prediction model.

$$PICP=\frac{\mathrm{l}}{N}\sum_{i=\mathrm{l}}^{N}{c}_{i}\times 100\%$$
(11)
$${c}_{i}=\left\{\begin{array}{c}1,if{P}_{ti}\in \left({L}_{i},{U}_{i}\right)\\ 0,if{P}_{ti}\notin \left({L}_{i},{U}_{i}\right)\end{array}\right.$$
(12)

In formulas (11) and (12), N is the total number of samples, and Ci is the reliability index of the upper limit Ui and the lower limit Li of the interval to the real value Pti of wind power.

$$PINAW=\frac{\mathrm{l}}{N}\sum_{i-\mathrm{l}}^{N}\frac{{U}_{i}-{L}_{i}}{{P}_{ti}}\times 100\mathrm{\%}$$
(13)
$$\left\{\begin{array}{c}\genfrac{}{}{0pt}{}{{CWC}_{proposed=}}{\beta \bullet \mathrm{PINAW}}\\ \left(\alpha +\beta \bullet \mathrm{PINAW}\right)\bullet (l+exp(-\eta \bullet \genfrac{}{}{0pt}{}{\genfrac{}{}{0pt}{}{(\mathrm{PICP}\ge \mu }{\mathrm{PICP}-\mu )))}}{\mathrm{PICP}<\mu }\end{array}\right.$$
(14)

In formula (14), α is used to avoid the problem that the influence of PICP is ignored when PINAW is too small, β is the proportion coefficient of the PINAW index, \(\eta\) is the penalty parameter, and µ is the preset nominal confidence level of the interval prediction.

Experiment

This paper selects wind power data, wind speed data, wind direction data, and temperature data of a wind turbine in northwest China from January to March as experimental samples, and divides the data into training set, validation set, and test set. The data sampling interval is 5 min. And the wind power historical data is input into the adftest function of MATLAB for testing, and the test result is 0, which indicates that the wind power data is non-stationary.

The application of CEEMDAN to decompose wind power sequence

Because the wind power sequence fluctuates greatly, the data needs to be processed. In this paper, CEEMDAN is used to decompose the original sequence into several relatively flat components. The result of applying CEEMDAN to decompose the original wind power time series is shown in Fig. 5.

Fig. 5
figure 5

The result of applying CEEMDAN to decompose the original wind power time series

It can be seen from the wind power decomposition diagram that the original power sequence is decomposed into 14 sub-components. From IMF1-RES, the components gradually transition from high-frequency components to low-frequency components.

Selecting the optimal fuzzy information granulation component

In order to avoid the influence of artificial selection of fuzzy granulation components on the interval prediction effect, when selecting sub-components for fuzzy information granulation, we respectively select IMF1, IMF1-IMF2, IMF1-IMF8 etc. components for experiments. The IMF1 component accounts for a large proportion of all the original data, so the experiments all contain the IMF1 component. By comparing the interval prediction effects of the 8 groups of experiments on the validation set, the best granular fuzzy information component is screened out.

Normalize the Low sequence, R sequence, Up sequence with the resolution of 15 min, and the remaining components with the resolution of 5 min obtained by granulating the fuzzy information of 8 groups of experiments to [0 1]. The normalized sequence is divided into training set, validation set, and test set according to 8:1:1.

The purpose of this section is to select the best fuzzy information granulation component. Taking into account that the BiLSTM network operation occupies a lot of computing resources, here, a single-layer LSTM network is selected for prediction, and the optimal granulation component is determined by comparing the one-step interval prediction results of 8 groups of experiments on the verification set at a 90% confidence level.

It can be seen from Table 1 that when the confidence level is 90%, the minimum average bandwidth PINAW of the prediction interval of the fuzzy information granulation component of the second group of experiments is 0.1423, and the minimum coverage width criterion of the interval comprehensive evaluation index is 0.8542, so the best granular component of fuzzy information is IMF1IMF2

Table 1 One-step interval prediction evaluation index on the 8 sets of experimental verification sets

Select the granular component of fuzzy information IMF1IMF2 and reconstruct it into IMF; the remaining components IMF3,…, RES will not be processed. Perform FIG processing on the components IMF. The selected fuzzy particles are triangular fuzzy particles, and the time window width is 3. In this way, the components with a resolution of 5 min are granulated by fuzzy information and processed into a minimum sequence (Low), an average sequence (R), and a maximum sequence (Up) with a resolution of 15 min. The granular view of the component blur information is shown in Fig. 6.

Fig. 6
figure 6

\(IMF^{^{\prime}}\) component fuzzy information granular view

So far, three sequences of Low sequence, R sequence, and Up sequence with a resolution of 15 min have been obtained, and 12 components with a resolution of 5 min IMF3,…, RES, etc. have been obtained. Combine the wind speed sequence, wind direction sequence (processed into wind direction sine and wind direction cosine), and temperature sequence as the input of the prediction model to obtain the initial wind power prediction interval, and finally take the improved coverage width criterion as the objective function to optimize the PICP and PINAW of the initial prediction interval to generate the wind power prediction interval under the given confidence level.

One-step interval prediction result analysis

Table 2 shows the one-step interval prediction results of the CNN-BiLSTM model, CNN-GRU model, CNN-LSTM model, KELM model, and SVR model at the 90% confidence level. Table 3 shows the one-step interval prediction results of the five models with 80% confidence level. And the CNN-BiLSTM model structure includes one CNN, one max pooling layer, two BiLSTM layers, and one fully connected layer. The structure and parameters of the CNN-GRU model and the CNN-LSTM model are consistent with the CNN-BiLSTM model, and the BiLSTM network layer is replaced by the GRU layer and the LSTM layer, respectively. The parameters of the KELM model are set as follows: Elm_Type selects 0, indicating that regression analysis is performed; the regularization coefficient or penalty coefficient C is set to 1; the kernel function Kernel_type selects RBF_kernel. The parameters of the SVR model are set as follows: SVM type selects ε; the kernel function is the RBF kernel function; the loss function ε in ε-SVR is set to 0.01; the parameter C of ε-SVR and the γ in the kernel function are selected by grid search method, and the initial range is [− 10, 10]. In addition, the parameters of CWCproposed are as follows: α is 0.1, β is 6, \(\eta\) is 15, and µ is the preset nominal confidence level of the interval prediction.

Table 2 One-step interval prediction results of 5 models at 90% confidence level
Table 3 One-step interval prediction results of 5 models with 80% confidence level

From Table 2 and Table 3, when the confidence level is the same, the average bandwidth PINAW of the prediction interval of the CNN-BiLSTM model is the smallest, and the CWCproposed value of the improved coverage width criterion is the smallest, indicating that the one-step interval prediction effect of the CNN-BiLSTM model is the best.

When the confidence level is 90%, the prediction interval average bandwidth index PINAW of the CNN-BiLSTM model is increased by 3.79%, 1.89%, 14.57%, and 10.13% compared with the CNN-GRU model, CNN-LSTM model, KELM model, and SVR model, respectively. Compared with the other four models, the improved coverage width criterion CWCproposed index of the CNN-BiLSTM model is increased by 3.76%, 1.79%, 14.51%, and 10.08%, respectively. The two indicators, the prediction interval average bandwidth PINAW and the improved coverage width criterion CWCproposed, fully demonstrate the superiority and effectiveness of the CNN-BiLSTM model.

Table 4 is the results of the CNN-BiLSTM model using different decomposition methods. It can be seen from Table 4 that when the confidence level is the same, the CEEMDAN-FIG-CNN-BiLSTM model has the highest prediction interval coverage rate PICP, the smallest average bandwidth PINAW, and the smallest improved coverage width criterion CWCproposed value, indicating the effectiveness of CEEMDAN-FIG.

Table 4 One-step interval prediction results of 3 models with 90% confidence level

Figures 7 and 8 are the results of ultra-short-term one-step interval prediction of wind power with a confidence level of 80% and 90% of the CEEMDAN-FIG-CNN-BiLSTM prediction model, respectively. Figs 9 and 10 respectively show the ultra-short-term one-step interval prediction results of wind power of EEMD-FIG-CNN-BiLSTM and EMD-FIG-CNN-BiLSTM models with confidence level of 90%.

Fig. 7
figure 7

CEEMDAN-FIG-CNN-BiLSTM ultra-short-term one-step interval prediction with 80% confidence level

Fig. 8
figure 8

CEEMDAN-FIG-CNN-BiLSTM ultra-short-term one-step interval prediction with 90% confidence level

Fig. 9
figure 9

EEMD-FIG-CNN-BiLSTM ultra-short-term one-step interval prediction with 90% confidence level

Fig. 10
figure 10

EMD-FIG-CNN-BiLSTM ultra-short-term one-step interval prediction with 90% confidence level

Multi-step interval prediction results analysis

From Table 5, when the confidence level is the same and the number of prediction steps is the same, the prediction interval average bandwidth PINAW of the CNN-BiLSTM model and the improved coverage width criterion CWCproposed are better than the other four comparison models.

Table 5 Multi-step interval prediction results of 5 models with 90% confidence level

For example, in the 2-step interval prediction results, the average bandwidth of the prediction interval PINNAW of the CNN-BiLSTM model is increased by 3.71%, 3.22%, 20.91%, and 18.01% compared with the CNN-GRU model, CNN-LSTM model, KELM model, and SVR model, respectively. Compared with the other four models, the improved coverage width criterion CWCproposed of the CNN-BiLSTM model is increased by 3.67%, 3.19%, 25.48%, and 21.19%, respectively.

Figure 11, Fig. 12, and Fig. 13 are the ultra-short-term 2-step, 3-step, and 4-step interval prediction results of wind power with a 90% confidence level of the CEEMDAN-FIG-CNN-BiLSTM model.

Fig. 11
figure 11

Two-step interval prediction result graph

Fig. 12
figure 12

Three-step interval prediction result graph

Fig. 13
figure 13

Four-step interval prediction result graph

Conclusion

In this paper, the method of constructing a dual-output wind power model based on neural network in the prediction of wind power interval and generating the confidence interval of wind power is studied. Through simulation and comparison experiments, the following conclusions are obtained:

  1. (1)

    An ultra-short-term wind power multi-step interval prediction method based on CEEMDAN-FIG and CNN-BiLSTM is proposed;

  2. (2)

    By analyzing the prediction results of different granulation components on the verification set, the optimal granulation component is selected objectively, and the prediction error caused by artificial selection of the granulation component is avoided;

  3. (3)

    Compared with the CNN-GRU model, CNN-LSTM model, KELM model, and SVR model, the CNN-BiLSTM combined model has superiority in the results of one-step interval prediction and multi-step interval prediction.