Introduction

Against the background of the global “dual-carbon” consensus and the reform of the world energy system, energy storage plants, which provide smooth transition, peak shaving and valley filling, frequency regulation, and voltage regulation, have received widespread attention and developed rapidly [1]. Lithium-ion batteries are widely used in energy storage power plants because of their high storage capacity, small size, and zero pollution [2,3,4]. To ensure operational safety, durability, and reliability, the battery state needs to be monitored accurately and in real time by an advanced battery management system (BMS) [5,6,7,8]. The state of charge (SOC) is the ratio of the current capacity to the full capacity of the battery and is one of the most important states monitored by the BMS [9]. However, because of the complex, dynamically coupled processes inside the battery, SOC cannot be measured directly and must be obtained indirectly from measured battery variables (voltage, current, and temperature) combined with an SOC estimation algorithm [10,11,12]. Accurately obtaining the battery SOC remains a challenge because of the highly time-varying and nonlinear nature of Li-ion batteries.

Literature review

Many SOC estimation methods have been proposed, and they can be classified into three categories: traditional estimation methods, model-based methods, and deep learning–based methods. The traditional methods comprise two types: the ampere-hour integration method [13] and the open-circuit voltage (OCV) method [14]. The ampere-hour integration method calculates the change in battery charge by integrating the current, which is simple and suitable for real-time application. However, it relies heavily on the initial SOC value and accumulates error as the discharged charge increases, which has a large impact on the final estimation results. The OCV method estimates the SOC by calibrating the one-to-one relationship between the open-circuit voltage and the SOC [15]. Although this method can easily meet SOC estimation requirements, it cannot perform real-time estimation because of the long rest time required before the OCV can be measured. Thus, various model-based filtering methods have been proposed to achieve more accurate battery SOC estimation.
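The dependence of ampere-hour integration on the initial SOC and on measurement bias can be illustrated with a minimal sketch. The function name, the sign convention (positive current means discharge), and the constant-current scenario are assumptions made for illustration; only the 280 Ah capacity is taken from the battery described in this paper.

```python
def coulomb_count(soc0, currents_a, dt_s, capacity_ah):
    """Ampere-hour integration: SOC_k = SOC_{k-1} - I_k * dt / (3600 * C).

    Positive current means discharge (an assumed sign convention).
    """
    soc = soc0
    trace = []
    for current in currents_a:
        soc -= current * dt_s / (3600.0 * capacity_ah)
        trace.append(soc)
    return trace


# Hypothetical scenario: a 280 Ah cell discharged at a constant 140 A
# for one hour, sampled once per second.
currents = [140.0] * 3600
true_trace = coulomb_count(1.00, currents, 1.0, 280.0)

# Case 1: the initial SOC is wrong by 5%. The offset is never corrected.
biased_trace = coulomb_count(0.95, currents, 1.0, 280.0)
initial_value_gap = true_trace[-1] - biased_trace[-1]

# Case 2: the current sensor reads 1 A too high. The error grows with time.
drift_trace = coulomb_count(1.00, [c + 1.0 for c in currents], 1.0, 280.0)
accumulated_gap = true_trace[-1] - drift_trace[-1]
```

In this sketch the initial-value error stays at 5% for the whole cycle, while the sensor bias produces an error that keeps growing with the integrated charge.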

The model-based methods mainly include the H-infinity filter (HIF) [16], the Luenberger observer [17], the sliding mode observer (SMO) [18], the particle filter (PF) [19], and the Kalman filter (KF) [20]. Among them, the KF [21] is the most widely used and has been theoretically proven to be an effective SOC estimation method owing to its self-correction and online computation capabilities. Since the Li-ion battery is a nonlinear system and the basic KF algorithm applies only to linear systems, several extensions have been developed, including the extended Kalman filter (EKF) [22], the unscented Kalman filter (UKF) [23], and the cubature Kalman filter (CKF) [24]. If the selected battery model accurately simulates the real internal state of the battery, these methods perform well in SOC estimation. Unfortunately, model-based methods have some inherent disadvantages. For instance, model construction requires cumbersome tests, which must be repeated for different battery chemistries. In addition, errors arise when the process noise and measurement noise are assumed constant, although they are time-varying in reality [25].

With the rapid development of artificial intelligence and chip technology, researchers are paying increasing attention to deep learning–based SOC estimation methods [26, 27]. These methods can directly map sampled battery operating signals (e.g., current and voltage) to SOC, so arduous battery modeling or feature engineering is no longer needed [28,29,30]. In addition, deep learning methods are highly scalable, which allows models to be trained on large datasets from different types of batteries. Current deep learning–based SOC estimation methods are mainly deep neural networks (DNNs) built from different types of layers. Among them, fully connected neural networks (FCNNs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs) are the most commonly used [31].

Chemali et al. [32] developed an SOC estimation method based on an FCNN. In this method, the present SOC is modeled as a function of the present voltage, current, and temperature and of the average current and voltage over the 400 preceding time steps. The method shows high accuracy under various temperatures. In a follow-up study, How et al. [33] investigated the influence of the number of hidden layers on SOC estimation. Their results show that deep FCNNs have improved modeling ability under unseen conditions but face a risk of overfitting. FCNNs also have intrinsic limitations. First, the fully connected structure comprises a large number of parameters, especially when a long sequence is taken directly as input, which gives rise to high computational cost and overfitting risk. Second, it can hardly process multivariate sequences unless the input data are flattened. Therefore, the fully connected structure can hardly be used directly to estimate battery SOC without condensing its input, and more advanced DNN architectures have become the mainstream for SOC estimation.

Hannan et al. [34] developed a CNN in which four blocks comprising convolutional and max-pooling layers are stacked to process the input current, voltage, and temperature sequences, followed by a fully connected layer that maps the extracted features to the SOC estimate. In addition to the basic CNN, more advanced CNN variants have been applied to SOC estimation. For example, Hu et al. [35] and Guo et al. [36] devised temporal convolutional networks (TCNs) for SOC estimation. These examples confirm the ability of CNNs to provide effective SOC estimation. However, CNNs still have limitations in this task: they are relatively insensitive to the order of the inputs and therefore struggle to capture temporal dependencies in the input sequence.

Since battery SOC estimation is a time series problem, RNNs and their variants, which are good at processing time series data, are often used. Chaoui et al. used an RNN to estimate SOC from past charging and discharging data and verified the effectiveness of the method. However, because of the vanishing and exploding gradient problems, standard RNNs have difficulty capturing long-term dependencies. Two gated RNNs with a superior ability to handle long sequences, the long short-term memory (LSTM) network and the gated recurrent unit (GRU), were proposed to solve this problem. Chemali et al. used an LSTM network for SOC estimation, and their method correctly estimates the SOC at various ambient temperatures. Meng et al. proposed a GRU-based deep learning method that uses measured voltages and currents as inputs to estimate the SOC. Hannan et al. [37] developed a GRU model to estimate SOC from current, voltage, and temperature sequences, adopting a one-cycle learning rate policy to improve the DNN performance. Their validation results demonstrate that the proposed method is more accurate and efficient than LSTM.

The flexibility of RNNs allows multiple layers to be stacked to improve learning. Yang et al. [38] built a deep LSTM model by stacking three LSTM layers, and Hannan et al. [37] developed a two-layer GRU model. Their validation results show that deep RNNs with more than one layer learn better than RNNs with only one layer. However, stacking more layers does not always improve accuracy and may lead to overfitting, which limits the generalization of a DNN to unseen inputs. In addition to stacked RNN layers, the bidirectional architecture is an attractive alternative for improving the learning ability of DNNs. A bidirectional LSTM was used for SOC estimation in [39], and the experimental validation confirms that it outperforms the single-direction LSTM. Although deep learning–based methods have been successfully applied in many studies, almost all methods based only on deep learning suffer from large fluctuations in the estimated SOC under complex operating conditions with large variations in battery current.

In recent years, combined approaches based on neural networks and filtering methods have been proposed. This type of method first uses a neural network to establish a nonlinear mapping between the battery measurement variables and the SOC, and then uses a filter to smooth the SOC output by the neural network. In 2014, He et al. [40] were the first to use the UKF to smooth the NN output SOC and validated the method using lithium battery data collected under the Federal Urban Driving Schedule (FUDS) and the Dynamic Stress Test (DST). Subsequently, Yang et al. [41] combined LSTM-RNN with UKF to accomplish similar work. Chen et al. [42] combined GRU with AKF to estimate SOC. Their algorithm uses the AKF with noise adaptation to smooth the output SOC of the GRU, which solves the problem of unstable estimation by a single GRU and greatly improves the SOC estimation accuracy. Tian et al. [43] proposed an adaptive closed-loop DNN model that first decouples the voltage and current sequences into open-circuit voltage (OCV), ohmic response, and polarization voltage to enhance the inputs to the DNN. The SOC estimates from the DNN are then adaptively fused with short-term ampere-hour predictions using a Kalman filtering algorithm to obtain an accurate final SOC estimate. Tian et al. [44] proposed an SOC estimation method that combines a DNN with a Kalman filter to improve the robustness of SOC estimation against random noise and error spikes. Tian et al. [45] combined LSTM-RNN with ACKF to estimate SOC. Two points in their LSTM-ACKF approach are particularly noteworthy. The first is that a “many-to-one” structured LSTM model improves the estimation accuracy. The second is the necessity of noise adaptation in the KF.
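The general idea of smoothing a network output with a Kalman-type filter can be sketched with a scalar Kalman filter whose prediction step is ampere-hour integration and whose measurement is the network estimate. This is a generic illustration, not the implementation of any cited work: the function name, the fixed noise variances `q` and `r`, the initial variance `p0`, and the synthetic alternating error are all assumptions.

```python
def kalman_smooth_soc(nn_soc, currents_a, dt_s, capacity_ah, soc0,
                      p0=1e-2, q=1e-6, r=1e-3):
    """Fuse per-step network SOC estimates with ampere-hour predictions.

    State model:  soc_k = soc_{k-1} - I_k * dt / (3600 * C) + noise (variance q)
    Measurement:  nn_soc_k = soc_k + noise (variance r)
    q, r, and p0 are assumed constants chosen only for this illustration.
    """
    soc, p = soc0, p0
    fused = []
    for z, current in zip(nn_soc, currents_a):
        # Predict with ampere-hour integration (requires a known initial SOC).
        soc = soc - current * dt_s / (3600.0 * capacity_ah)
        p = p + q
        # Correct with the network estimate.
        gain = p / (p + r)
        soc = soc + gain * (z - soc)
        p = (1.0 - gain) * p
        fused.append(soc)
    return fused


def mean_sq_err(est, ref):
    return sum((a - b) ** 2 for a, b in zip(est, ref)) / len(ref)


# Synthetic example: constant 140 A discharge of a 280 Ah cell, with a
# hypothetical network output that alternates +/-3% around the true SOC.
capacity_ah, dt_s = 280.0, 1.0
currents = [140.0] * 200
true_soc, soc = [], 1.0
for current in currents:
    soc -= current * dt_s / (3600.0 * capacity_ah)
    true_soc.append(soc)
nn_soc = [s + (0.03 if k % 2 == 0 else -0.03) for k, s in enumerate(true_soc)]

fused = kalman_smooth_soc(nn_soc, currents, dt_s, capacity_ah, soc0=1.0)
mse_raw = mean_sq_err(nn_soc, true_soc)
mse_fused = mean_sq_err(fused, true_soc)
```

Note that the prediction step needs `soc0` and the measured current, so a filter of this kind depends on a known initial SOC and an ampere-hour reference.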

All six of these papers used a combination of deep neural networks and Kalman-type filtering to estimate SOC, and in all of them the SOC estimation accuracy was improved. However, the biggest drawback of Kalman-type filtering is that it requires a determined initial SOC and a reference SOC computed by the ampere-hour counting method in order to filter the SOC estimated by the neural network. This method cannot be implemented when the initial SOC is unknown, and in practice the initial SOC of a lithium-ion battery used for energy storage is often unknown. The filtering effectiveness is also degraded when the initial SOC is known but inaccurate. To solve this problem, Jiao et al. [46] proposed SG-BILSTM, a method that combines a Savitzky-Golay (SG) filter with a bidirectional LSTM neural network. The SG filter used in this method can directly smooth the initial SOC estimated by the bidirectional LSTM neural network without a reference SOC. This method can greatly reduce the hardware requirements of the computer, and its real-time estimation performance is also stronger. However, it currently has two drawbacks. The first is that it uses a “one-to-one” structured LSTM, which does not take into account the influence of previous measurements on the current SOC, resulting in relatively low estimation accuracy. The second is that the SG filter cannot adaptively select the optimal window length W for different operating conditions to achieve the best filtering effect [47].

Key contributions

To accurately estimate battery SOC under complex operating conditions in energy storage plants, and to address the problems of current methods described above, this paper proposes a robust and efficient combined SOC estimation method, GRU-ASG. The main contributions of this paper are as follows:

  1. We propose the combined method GRU-ASG, which requires only energy storage battery data (voltage, current, and temperature) to accurately estimate SOC under complex operating conditions.

  2. We build a “many-to-one” structured GRU network consisting of one GRU layer and three fully connected layers. This network makes full use of the influence of previous measurements on the current SOC and thus improves the accuracy of SOC estimation.

  3. We propose an adaptive SG (ASG) filter based on the Spearman correlation coefficient. The method selects the optimal window length W adaptively online, which in turn greatly improves the SOC estimation accuracy. Moreover, it shows good SOC estimation performance and generalization ability on unknown data under different complex working conditions.

Organization of the paper

The remainder of this paper is organized as follows. The “Datasets and ASG-GRU-related algorithm principles” section introduces the energy storage plant operation dataset, the GRU neural network, and the SG filter. The “GRU-ASG model” section describes the overall structure and specific process of GRU-ASG, the architecture and parameter settings of the “many-to-one” GRU model, and the theoretical basis and implementation of the ASG filter. The “Results and analysis” section presents the validation and discussion of the proposed method. Finally, the paper is summarized in the “Conclusions” section.

Datasets and ASG-GRU-related algorithm principles

Datasets introduction

The battery data used in this paper are actual operating data from an energy storage plant. The battery is a 280 Ah, 3.2 V lithium iron phosphate battery (CB310) for energy storage produced by CATL. The specific parameters of the battery are shown in Table 1.

Table 1 Battery parameters

Six actual operating datasets collected during the discharge of the energy storage plant were randomly selected, and all six datasets corresponded to different complex operating conditions. The variation of the discharge current for the six operating conditions is shown in Fig. 1. In this paper, four datasets with discharge current conditions a–d are used as training sets for model training, and two datasets with discharge current conditions e and f are used as test sets for model testing. Table 2 shows the specific purposes of the six datasets in this paper.

Fig. 1 Six different discharge strategies: a–f

Table 2 Specific uses of the dataset

Introduction to the principle of GRU algorithm

Since voltage, current, temperature, and SOC are time-varying physical quantities, we construct the SOC estimation model with an RNN variant, which is well suited to handling time series correlation. GRU and LSTM are both variants of the RNN, and their roles are similar. The main structural difference is that the GRU combines the forget gate and the input gate of the LSTM into a single update gate, making it simpler and easier to train than the LSTM. Therefore, the GRU is chosen as the base network in this paper, and its neuron structure is shown in Fig. 2. The GRU has two gates, the update gate and the reset gate. The update gate defines how much previous information is carried over to the current time step, and the reset gate controls how much of the previous hidden state is combined with the current input.

Fig. 2 GRU recurrent neural network parameter transfer process

The GRU is computed as follows: the input \({x}_t\) at moment t and the hidden layer state \({h}_{t-1}\) at moment t−1 are used as inputs. The outputs \({z}_t\) and \({r}_t\) of the update gate and the reset gate are computed by Eq. (1) and Eq. (2), respectively. The candidate state \({\tilde{h}}_t\) is computed by Eq. (3), and the hidden layer state \({h}_t\) is updated by Eq. (4). In effect, \({h}_t\) selectively retains information from \({x}_t\) and \({h}_{t-1}\).

$${z}_t=\sigma \left({W}_z\cdot \left[{h}_{t-1},{x}_t\right]\right)$$
(1)
$${r}_t=\sigma \left({W}_r\cdot \left[{h}_{t-1},{x}_t\right]\right)$$
(2)
$${\tilde{h}}_t=\tanh \left({W}_h\cdot \left[{r}_t\cdot {h}_{t-1},{x}_t\right]\right)$$
(3)
$${h}_t=\left(1-{z}_t\right)\cdot {h}_{t-1}+{z}_t\cdot {\tilde{h}}_t$$
(4)

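where \({W}_z\), \({W}_r\), and \({W}_h\) denote the weight matrices, σ is the sigmoid activation function, and \(\left[\cdot, \cdot \right]\) denotes vector concatenation. “·” denotes multiplication: matrix multiplication between a weight matrix and a vector, and element-wise multiplication between a gate vector and a state vector. “+” denotes element-wise addition.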
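A single GRU update following Eqs. (1)–(4) can be sketched in plain Python. Bias terms are omitted because the equations above do not include them. The hidden size of 2, the weight values, and the function names are assumptions made for illustration.

```python
import math


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def matvec(W, v):
    # Matrix-vector product: one output per row of W.
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def gru_step(x_t, h_prev, Wz, Wr, Wh):
    """One GRU update following Eqs. (1)-(4), without bias terms."""
    concat = h_prev + x_t                              # [h_{t-1}, x_t]
    z_t = sigmoid(matvec(Wz, concat))                  # Eq. (1): update gate
    r_t = sigmoid(matvec(Wr, concat))                  # Eq. (2): reset gate
    gated = [r * h for r, h in zip(r_t, h_prev)]       # r_t * h_{t-1}, element-wise
    h_tilde = [math.tanh(a) for a in matvec(Wh, gated + x_t)]   # Eq. (3)
    # Eq. (4): blend the previous state and the candidate state.
    return [(1.0 - z) * h + z * c for z, h, c in zip(z_t, h_prev, h_tilde)]


# Illustrative call: input [V, I, T] and a hidden state of size 2.
# Each weight matrix has shape 2 x (2 + 3); the values are arbitrary.
W_demo = [[0.1, -0.2, 0.05, 0.01, -0.01],
          [0.3, 0.1, -0.05, 0.02, 0.01]]
h_next = gru_step([3.3, 10.0, 25.0], [0.0, 0.0], W_demo, W_demo, W_demo)
```

With all-zero weights both gates equal 0.5 and the candidate state is zero, so the update simply halves the previous hidden state, which gives a quick sanity check of Eq. (4).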

Introduction of SG filter principle

The SG filter is a low-pass digital filter proposed by Savitzky and Golay in 1964 [48]. It can smooth a set of data without changing the trend of the signal and thus improve the accuracy of the data. The SG filter is implemented as a convolution: each subset of adjacent data points is fitted with a low-order polynomial by least squares. When the data points are equally spaced, the least squares problem has an analytical solution in the form of a set of “convolution coefficients” that can be applied to all subsets of the data.

The principle of the SG filter is illustrated in Fig. 3. Consider a time series dataset X of length T:

$$X=\left({x}_1,{x}_2,{x}_3,\cdots, {x}_{T-2},{x}_{T-1},{x}_T\right)$$
(5)
Fig. 3 SG filter schematic

Let there be 2M+1 discrete data points in each window, so that the window length is W = 2M+1. The subset \({C}_i\) of data in each filtering window, drawn from the time series X, is given in Eq. (6). Take the first window \({C}_1\) as an example. First, the N-order polynomial in Eq. (7) is fitted to the 2M+1 data points within the window. Then, Eq. (8) gives the mean square approximation error \({\varepsilon}_N\) for the data subset \({C}_1\) centered at \({\xi}_0=0\). Finally, by the least squares principle, the optimal coefficients of the polynomial p(ξ) are those that minimize \({\varepsilon}_N\). The filtered value at the window center is then obtained from the optimal coefficients as \(p(0)={a}_0\).

$${\mathrm{C}}_i={X}_i={\left[{x}_{-M},\cdots, {x}_0,\cdots, {x}_M\right]}_i,i=1,2,\cdots, n$$
(6)
$${\displaystyle \begin{array}{c}p\left(\xi \right)=\sum_{k=0}^N{a}_k{\xi}^k={a}_0+{a}_1\xi +{a}_2{\xi}^2+\cdots +{a}_N{\xi}^N\\ {}\xi =\left[{\xi}_{-M},\cdots, {\xi}_0,\cdots, {\xi}_M\right]=\left[-M,\cdots, 0,\cdots, M\right]\end{array}}$$
(7)
$${\varepsilon}_N=\sum\nolimits_{d=-M}^M{\left(p\left({\xi}_d\right)-{C}_1\right)}^2=\sum\nolimits_{d=-M}^M{\left(\sum\nolimits_{k=0}^N{a}_k{\xi}_d^k-{x}_d\right)}^2$$
(8)
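The window-by-window least squares fit of Eqs. (6)–(8) can be sketched directly, without precomputed convolution coefficients. The function names are hypothetical, and the handling of the first and last M samples (copied unchanged) is an assumed simplification that the text does not specify.

```python
def solve_linear(A, b):
    """Solve A a = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    a = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * a[j] for j in range(i + 1, n))
        a[i] = (aug[i][n] - s) / aug[i][i]
    return a


def sg_filter(x, window, order):
    """Savitzky-Golay smoothing by an explicit least squares fit per window.

    window = 2M + 1 must be odd and larger than the polynomial order N.
    Edge samples (first and last M points) are copied unchanged.
    """
    if window % 2 == 0 or order >= window:
        raise ValueError("window must be odd and larger than order")
    M = window // 2
    xi = [float(d) for d in range(-M, M + 1)]          # xi = [-M, ..., 0, ..., M]
    # Normal equations (A^T A) a = A^T c, with A[d][k] = xi_d ** k, see Eq. (7).
    ata = [[sum(d ** (j + k) for d in xi) for k in range(order + 1)]
           for j in range(order + 1)]
    out = list(x)
    for i in range(M, len(x) - M):
        c = x[i - M:i + M + 1]                          # window centered on x[i]
        atc = [sum((d ** j) * v for d, v in zip(xi, c)) for j in range(order + 1)]
        coeffs = solve_linear(ata, atc)                 # minimizes Eq. (8)
        out[i] = coeffs[0]                              # p(0) = a_0
    return out


# Illustrative call on a short noisy sequence (values are arbitrary).
noisy = [1.00, 0.97, 0.99, 0.94, 0.95, 0.90, 0.92, 0.88, 0.86, 0.87, 0.83]
smoothed = sg_filter(noisy, window=5, order=2)
```

A useful property for checking such a sketch is that data lying exactly on a polynomial of degree at most N pass through the filter unchanged. For high orders and very long windows, the normal equations become poorly conditioned, so a library routine would be preferable in practice.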

GRU-ASG model

GRU-ASG model overall structure

The GRU-ASG model is composed of two components. The first part is the GRU neural network, which is mainly used for feature extraction and initial SOC estimation. The second part is the ASG filter, which is mainly used to filter the initial SOCs estimated by the GRU model and output the final estimates. The structure of the GRU-ASG model is shown in Fig. 4.

Fig. 4 Overall structure of GRU-ASG model

To evaluate the estimation accuracy of the GRU-ASG model, we use two indicators, the mean squared error (MSE) and the mean absolute error (MAE). Smaller MSE and MAE values represent better model fitting ability and higher estimation accuracy. The MSE is also used as an evaluation index of the filtering effect, where a smaller MSE means a better filtering effect. Both indicators are calculated as shown in Eq. (9).

$${\displaystyle \begin{array}{c} MSE=\frac{1}{L}{\sum}_{l=1}^L{\left(y(l)-\dot{y}(l)\right)}^2\\ {} MAE=\frac{1}{L}{\sum}_{l=1}^L\left|y(l)-\dot{y}(l)\right| \end{array}}$$
(9)

where y(l) denotes the true value of SOC, \(\dot{y}(l)\) denotes the estimated value of SOC, and L denotes the length of \(\dot{y}\).
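Eq. (9) translates directly into code. The sketch below uses hypothetical function names and a three-point example chosen only for illustration.

```python
def mse(y_true, y_est):
    """Mean squared error, first line of Eq. (9)."""
    L = len(y_est)
    return sum((a - b) ** 2 for a, b in zip(y_true, y_est)) / L


def mae(y_true, y_est):
    """Mean absolute error, second line of Eq. (9)."""
    L = len(y_est)
    return sum(abs(a - b) for a, b in zip(y_true, y_est)) / L


# Illustrative values: errors of 0.1, 0.0, and 0.1 on three samples.
y_true = [1.0, 0.8, 0.6]
y_est = [0.9, 0.8, 0.7]
example_mse = mse(y_true, y_est)   # (0.01 + 0 + 0.01) / 3
example_mae = mae(y_true, y_est)   # (0.1 + 0 + 0.1) / 3
```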

Many-to-one structural GRU neural network

In this paper, a GRU network is developed for SOC estimation under complex working conditions. The network consists of three parts: the input layer, the hidden layer, and the output layer. The input layer receives the voltage, current, and temperature at each time step t. The hidden layer contains one GRU layer and three fully connected (FC) layers: the GRU layer extracts features, and the FC layers map the hidden features extracted by the GRU layer. The output layer gives the estimated SOC. Because the dataset has a high sampling frequency, a discharge cycle contains a large number of steps. To make the most of past information, we modify the standard GRU network into the “many-to-one” structured GRU network shown in Fig. 5.

Fig. 5 Many-to-one structured GRU network

Consider an input \(X=\left[{\delta}_{t_1},\cdots, {\delta}_{t_n}\right]\), where n is the number of steps in the entire discharge cycle and \({\delta}_t=\left[{V}_t,{I}_t,{T}_t\right]\) represents the voltage, current, and temperature at each time step. The standard GRU network outputs one estimated SOC for each input \({\delta}_t\). In contrast, the “many-to-one” structured GRU network takes n steps of \({\delta}_t\) as input before it outputs one estimated SOC.
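One common way to prepare data for a many-to-one network is to pair each block of consecutive measurements with the SOC at the block's last step. The sketch below assumes a sliding window of `n_steps` samples with stride 1. The window length, the stride, the function name, and the sample values are hypothetical, since the text does not state how the input sequences are formed.

```python
def make_many_to_one_samples(delta, soc, n_steps):
    """Pair each run of n_steps measurements [V, I, T] with one SOC label.

    delta: list of [V_t, I_t, T_t] per time step
    soc:   list of SOC values, same length as delta
    Returns a list of (window, label) tuples, where label is the SOC at the
    last time step of the window.
    """
    if len(delta) != len(soc):
        raise ValueError("delta and soc must have the same length")
    samples = []
    for end in range(n_steps, len(delta) + 1):
        window = delta[end - n_steps:end]
        samples.append((window, soc[end - 1]))
    return samples


# Illustrative data: five time steps of [V, I, T] and the matching SOC.
delta = [[3.30, 100.0, 25.0],
         [3.29, 120.0, 25.1],
         [3.28, 90.0, 25.2],
         [3.27, 110.0, 25.2],
         [3.26, 105.0, 25.3]]
soc = [0.90, 0.89, 0.88, 0.87, 0.86]
samples = make_many_to_one_samples(delta, soc, n_steps=3)
```

Each sample then has an input of shape (n_steps, 3) and a single scalar target, which is the shape a recurrent layer that returns only its last output would consume.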

The experimental platform comprised a CPU (Intel(R) Core(TM) i5-10400F), a GPU (NVIDIA GeForce GTX 1060), and the Windows operating system. The network model was built with the TensorFlow framework, and the basic parameters of the network are set as shown in Table 3.

Table 3 GRU Network parameters

ASG for SOC filtering

Usually, SG filtering requires not only an input time series signal but also two parameters that strongly affect the filtering result: the fitting order N and the window length W. Regarding the optimal N, it follows from the nature of SG filtering that its value is generally constant when the input time series signals are of the same type. Combining this property with the characteristics of the SOC under the different operating condition datasets of the energy storage plant, we can determine the applicable best-fit order Nbest in advance. We therefore applied the enumeration method to four different operating condition datasets, and the experimental results are shown in Table 4. From these results, we conclude that the best-fit order Nbest is 5 when SG filtering is applied to the SOC of the energy storage battery. For a conventional signal, the optimal window length Wbest can also be found by enumeration, because the true value of such a signal is generally known and constant. However, enumeration cannot be used to find Wbest for online SOC estimation of energy storage batteries, because the true SOC differs between operating condition datasets and is unknown.

Table 4 Experimental results of the best-fit order

To solve this problem, this paper proposes an ASG filtering algorithm based on the Spearman correlation coefficient, which finds Wbest online for different complex operating conditions when the true value of SOC is unknown. The ASG algorithm consists of the following three steps.

Step 1: Set the fitting order of the SG filter to 5, and then filter the initial SOC estimate \(\hat{y}\) obtained from the GRU network using Eq. (10) to obtain the filtering results \({\tilde{y}}_i\) under different window lengths.

$${\tilde{y}}_i= SG\left(\hat{y},{W}_i,N\right)= SG\left(\hat{y},{W}_i,5\right)=\left[{\tilde{y}}_{\left({W}_1,5\right)},{\tilde{y}}_{\left({W}_2,5\right)},\cdots, {\tilde{y}}_{\left({W}_i,5\right)}\right]$$
(10)

where \(0.02L<{W}_i<L\), \(i=1,2,\cdots, L-50\), and L is the length of \(\hat{y}\).

Step 2: Calculate the Spearman coefficient of \({\tilde{y}}_i\) and the initial estimated SOC \(\hat{y}\) using Eq. (11). The calculation results in the set R shown in Eq. (12).

$$P\left({\tilde{y}}_i,\hat{y}\right)=1-\frac{6\sum {d}_j^2}{n\left({n}^2-1\right)}$$
(11)

where \({d}_j\) denotes the difference between the ranks of \({\tilde{y}}_i\) and \(\hat{y}\) at the j-th data pair, and n denotes the total number of observed samples.

$$R=\left[P\left({\widetilde y}_{\left(W_1,5\right)},\widehat y\right),P\left({\widetilde y}_{\left(W_2,5\right)},\widehat y\right),\cdots,P\left({\widetilde y}_{\left(W_{L-50},5\right)},\widehat y\right)\right]=\left[p_1,p_2,\cdots,p_{L-50}\right]$$
(12)

Step 3: Find the first descending convergence point Pc in the set R. The Wi corresponding to Pc is the optimal window length Wbest found by ASG filtering algorithm.
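Steps 1–3 can be sketched as follows. The Spearman coefficient follows Eq. (11), which assumes distinct ranks, so ties are broken by position here as a simplification. The rule used to locate the “first descending convergence point” (the first place where P has started to fall and then changes by less than a tolerance `tol`) is a hypothetical reading of Step 3. The smoother is passed in as a function so that any SG implementation can be used. All names are assumptions.

```python
def ranks(values):
    """Ordinal ranks (1 = smallest); ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(a, b):
    """Spearman coefficient per Eq. (11): 1 - 6 * sum(d_j^2) / (n (n^2 - 1))."""
    n = len(a)
    d2 = sum((p - q) ** 2 for p, q in zip(ranks(a), ranks(b)))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))


def first_convergence_index(R, tol=1e-3):
    """Index where P first stops decreasing after a descent has begun.

    Assumed rule: a step counts as 'descending' if P drops by more than tol;
    the first later step with a smaller change marks convergence.
    """
    descending = False
    for i in range(1, len(R)):
        if R[i - 1] - R[i] > tol:
            descending = True
        elif descending:
            return i - 1
    return len(R) - 1


def select_window(y_hat, windows, smooth, tol=1e-3):
    """Steps 1-3: filter with each W, build R as in Eq. (12), pick W at Pc."""
    R = [spearman(smooth(y_hat, W), y_hat) for W in windows]
    return windows[first_convergence_index(R, tol)], R


# Synthetic P curve: falls from 1.0 and then levels off near 0.88.
R_demo = [1.0, 0.99, 0.95, 0.90, 0.88, 0.8795, 0.8793, 0.8794]
pc_index = first_convergence_index(R_demo)
```

A call such as `select_window(y_hat, windows, lambda y, W: my_sg(y, W, 5))` would tie this to an SG filter of order 5, where `my_sg` stands for whichever SG implementation is available.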

ASG is based on the properties of the SG filter and the Spearman correlation coefficient, as elaborated below. First, the standard SG filter has one important property: the smaller the W, the closer the filtered curve is to the real curve; the larger the W, the stronger the smoothing effect. To investigate the consequences of this property for SOC filtering, we conducted experiments using Testing 1. The results in Fig. 6 show that when W is small, the filtered curve \(\overset{\sim }{y}\) is very close to the pre-filtered curve \(\hat{y}\), resulting in a poor filtering effect. When W is large, the filtered curve \(\overset{\sim }{y}\) is excessively smooth and the filtering effect is equally poor, because only the general trend of the curve \(\hat{y}\) is retained and the intermediate details are completely ignored. Hence, we can draw Conclusion 1: when W reaches a critical point, the closeness to the real curve and the smoothing effect reach a balance. At this point, W is the optimal window length Wbest, the MSE attains its minimum, and the filtering effect is the best. When W passes the critical point, the balance is lost and the MSE increases again. Conclusion 1 was verified using Testing 1, and the experimental results are shown in Fig. 7.

Fig. 6 Filtering effect with three different window lengths W: a W takes the minimum value of 31; b W takes the optimum value of 861; c W takes the maximum value of 1563

Fig. 7 Schematic diagram of SG filtering performance MSE with window length W

For SOC filtering of the energy storage battery, the principle of ASG follows from Conclusion 1 combined with a property of the Spearman correlation coefficient P: for a data pair (X, Y), when X is unchanged and Y changes, P does not change as long as the ranks at the corresponding positions of X and Y remain unchanged. The principle of ASG is then as follows:

When W is small, the filtered curve \(\overset{\sim }{y}\) is very close to the pre-filtered curve \(\hat{y}\). The ranks of \(\hat{y}\) and \(\overset{\sim }{y}\) at each time step change little, so the rank difference d is small. Since the total number of observed samples n is constant, P is highest when W is smallest. As W increases but remains below the critical point Wbest, the curve \(\overset{\sim }{y}\) becomes smoother overall and the rank difference d grows, so P decreases toward its minimum. When W crosses Wbest and continues to increase, the overall trend and waveform of \(\overset{\sim }{y}\) essentially stop changing, so P remains essentially constant and converges around its minimum value.

The proposed ASG method is validated on Testing 1 and Testing 2. The experimental results in Fig. 8 show that the Wbest obtained using the ASG algorithm is quite close to the actual optimal window length, which demonstrates the feasibility of the proposed ASG algorithm.

Fig. 8 Variation of the Spearman coefficient P of different test sets with window length: a Testing 1; b Testing 2

Results and analysis

Comparison experiments of three neural networks: RNN, LSTM, and GRU

To verify the superiority of the improved “many-to-one” structured GRU network over RNN and LSTM models in SOC estimation of energy storage batteries, we compare its estimation results with those of the RNN and LSTM. The three models are identical except for the type of recurrent layer. The specific parameter information is shown in Table 3. The estimated SOC and estimation error of the three neural networks on Testing 1 and Testing 2 are shown in Fig. 9, and the values of the evaluation metrics are shown in Fig. 10 and Table 5. Fig. 9 shows that the three neural networks can basically capture the decreasing trend of SOC, but the maximum estimation errors are all relatively large. The GRU model has the lowest maximum estimation error at about 7%, followed by the LSTM model at about 8%, and the RNN model has the largest at about 12%. This indicates that their estimation results are volatile and need to be smoothed with a filter. Fig. 10 shows that the SOC estimation accuracy of the GRU network is the best in all cases. The average MSE and MAE over the two test sets can be derived from Table 5. The MSE of the GRU network is 0.17%, which is 59% lower than that of the RNN network and 39% lower than that of the LSTM network; the MAE of the GRU network is 3.49%, which is 33% lower than that of the RNN network and 20% lower than that of the LSTM network. These results illustrate the accuracy advantage of the proposed “many-to-one” structured GRU network. To satisfy the requirements of practical engineering applications, we use the ASG filtering algorithm to further improve the accuracy of SOC estimation. The results are given and discussed in the “ASG filtering algorithm performance evaluation experiment” section.

Fig. 9 Estimated SOC and estimation errors for 2 test sets under different networks: a Testing 1; b Testing 2

Fig. 10 Evaluation metrics of the 2 test sets in different neural networks: a MSE; b MAE

Table 5 Performance of the 2 test sets with different neural networks

ASG filtering algorithm performance evaluation experiment

First, for all three models in this comparison experiment (GRU, Best-GRU-SG, and GRU-ASG), the parameters of the GRU neural network are the same as those in Table 3. The fitting order N of the SG filters in both Best-GRU-SG and GRU-ASG was set to 5 based on the results in Table 4.

Then, we summarize the actual optimal window length Wbest and the window length W2 obtained with the ASG algorithm for the two test sets in Table 6. Table 6 shows that, over a very large range of candidate values of W, the ASG algorithm finds a value close to the actual optimal window length for both test sets, with an average error of only 39. The estimation results of the GRU-ASG model are compared with those of the GRU network and of the Best-GRU-SG model, which uses the actual optimal window length Wbest. The SOC estimation results of the three models on the two test sets are depicted in Figs. 11 and 12 and summarized in Table 7. Comparing the SOC estimated by GRU-ASG and GRU, Fig. 11 clearly shows that adding ASG filtering greatly smooths the output SOC of the GRU and illustrates the accuracy advantage of the GRU-ASG model over the GRU. The maximum estimation error of GRU-ASG is about 5%, which is about 2 percentage points lower than that of the GRU model. The average MSE and MAE over the two test sets can be derived from Table 7. The MSE of the GRU-ASG network is 0.1%, which is 41% lower than that of the GRU network. The MAE of the GRU-ASG network is 2.61%, which is 25% lower than that of the GRU network. This indicates that adding ASG significantly reduces the SOC estimation error.

Table 6 Window length W selection for 2 test sets
Fig. 11 Estimated SOC and estimation errors for 2 test sets before and after ASG filtering: a Testing 1; b Testing 2

Fig. 12

Evaluation metrics of the 2 test sets before and after ASG filtering: a MSE; b MAE

Table 7 Performance of the 2 test sets before and after ASG filtering

Moreover, Figs. 11 and 12 show that the GRU-ASG and Best-GRU-SG estimation results are essentially the same and that their evaluation metrics are nearly equal. According to Table 7, the MSE and MAE of the GRU-ASG model are, on average over the two test sets, only 0.001% and 0.01% lower than those of the Best-GRU-SG model. This demonstrates the validity of the ASG filtering algorithm proposed in this paper.
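To make the Best-GRU-SG baseline concrete, the sketch below implements a plain fixed-window SG filter and an exhaustive search for the window length that minimizes the MSE against a reference SOC. It is a minimal illustration under stated assumptions, not the authors' implementation: the search procedure for Wbest is assumed rather than taken from the paper, the filter is centred and leaves the first and last half-window of samples unfiltered, the helper names are hypothetical, and the data are synthetic. It does not implement the adaptive window update of ASG.

```python
import random


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def sg_weights(window, order):
    """Weights giving the centre value of a least-squares polynomial fit."""
    if window % 2 == 0 or window <= order:
        raise ValueError("window must be odd and larger than the polynomial order")
    half = window // 2
    xs = [k / half for k in range(-half, half + 1)]  # scaled to [-1, 1]
    normal = [[sum(x ** (i + j) for x in xs) for j in range(order + 1)]
              for i in range(order + 1)]
    z = solve_linear(normal, [1.0] + [0.0] * order)
    return [sum(z[j] * x ** j for j in range(order + 1)) for x in xs]


def sg_filter(y, window, order):
    """Centred SG smoothing; edge samples are left unchanged (a simplification)."""
    half = window // 2
    w = sg_weights(window, order)
    out = list(y)
    for i in range(half, len(y) - half):
        out[i] = sum(wk * y[i - half + k] for k, wk in enumerate(w))
    return out


def mse(estimated, reference):
    return sum((e - r) ** 2 for e, r in zip(estimated, reference)) / len(reference)


def best_window(noisy, reference, order, candidates):
    """Exhaustive search for the window length with the lowest MSE."""
    return min(candidates, key=lambda w: mse(sg_filter(noisy, w, order), reference))


# Synthetic example: a linear discharge ramp with Gaussian noise (not paper data).
rng = random.Random(0)
reference = [1.0 - 0.8 * i / 199 for i in range(200)]
noisy = [r + rng.gauss(0.0, 0.02) for r in reference]
candidates = [7, 11, 21, 31, 51]
w_best = best_window(noisy, reference, order=5, candidates=candidates)
print("best window among candidates:", w_best)
```

Because this search needs the reference SOC, it can only be run offline, which is why a filter that chooses its window length without the reference is the practically relevant case.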

Comparison experiments under different filters

For all three models in this comparison experiment, the parameters of the GRU network are also the same as in Table 4, and the parameters of GRU-ASG are the same as those in the “ASG filtering algorithm performance evaluation experiment” section. To verify the superiority of the ASG filtering algorithm over other online filtering algorithms for SOC estimation in energy storage plants, we applied two commonly used filters, the moving median filter (MD) and the Gaussian filter (GA), to the GRU model estimation results. The estimated SOCs and estimation errors of the three filters are shown in Fig. 13, and the values of the evaluation metrics are shown in Fig. 14 and Table 8. Best-GRU-MD and Best-GRU-GA denote the filtering results obtained when the parameters of the MD filter and the GA filter, respectively, are set to their optimal values. As can be seen from Fig. 13, all three filters reduce the initial SOC fluctuations and improve the stability of the GRU network output. Fig. 14 shows that GRU-ASG achieves the highest SOC estimation accuracy. The average MSE and MAE over the two test sets can be derived from Table 8: the MSE and MAE of the GRU-ASG model are 29% and 23% lower than those of the Best-GRU-MD model and 20% and 12% lower than those of the Best-GRU-GA model. These results indicate that, among the filters compared, the combination of GRU and ASG performs best.
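For readers unfamiliar with the two baseline filters, the following Python sketch shows one simple form of each. It is an illustration under stated assumptions rather than the configuration used in the experiment: both filters are written as centred windows that shrink at the sequence edges, the kernel radius of three standard deviations is a common convention chosen here, and the function names and parameters are hypothetical.

```python
import math
from statistics import median


def moving_median(y, window):
    """Centred moving median; the window is truncated at the sequence edges."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    return [median(y[max(0, i - half): i + half + 1]) for i in range(len(y))]


def gaussian_filter(y, sigma):
    """Gaussian-weighted moving average, renormalized where the kernel is cut off."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    radius = max(1, math.ceil(3 * sigma))  # assumed cut-off at three sigma
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    out = []
    for i in range(len(y)):
        num = 0.0
        den = 0.0
        for k, wk in zip(offsets, kernel):
            j = i + k
            if 0 <= j < len(y):
                num += wk * y[j]
                den += wk
        out.append(num / den)
    return out


# Hypothetical SOC trace with a single outlier (not data from the paper).
trace = [0.50, 0.50, 0.90, 0.50, 0.50]
print(moving_median(trace, 3))
print(gaussian_filter(trace, 1.0))
```

The median rejects an isolated spike outright, whereas the Gaussian average spreads it over neighbouring samples; both depend on a fixed window or width parameter that has to be tuned, which is the role played by the "Best-" settings in the comparison.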

Fig. 13

Estimated SOC and estimation errors for 2 test sets at different filters: a Testing 1; b Testing 2

Fig. 14

Evaluation metrics of the 2 test sets with different filters: a MSE; b MAE

Table 8 Performance of the 2 test sets with different filters

Conclusions

In this paper, we propose GRU-ASG, a combined SOC estimation method based on the GRU network and the ASG filter. The method first uses the GRU network to obtain an approximate SOC estimate and then applies ASG to further improve its accuracy. The GRU network adopts a “many-to-one” structure, which makes better use of previous measurements to improve estimation accuracy. The proposed ASG algorithm updates the window length adaptively online, so the optimal window length does not need to be selected manually. The GRU-ASG model is trained on the first four of six datasets collected from an energy storage plant under different operating conditions and validated on the last two. The experimental results show that the SOC estimation accuracy of GRU-ASG is better than that of GRU, GRU-MD, and GRU-GA and is extremely close to that of GRU-SG with the actual optimal window length. In particular, over the two test sets, the maximum MSE does not exceed 0.15% and the maximum MAE does not exceed 3%. In addition, the proposed method generalizes well and is able to correct SOC errors under different practical and complex operating conditions. GRU-ASG may therefore have broad application prospects in practice. As large numbers of energy storage plants are built and put into operation, abundant data will become available to train the neural network, making the proposed method more applicable to SOC estimation and safety management of energy storage plants. In future work, the effects of temperature and battery capacity degradation in the operating environment of energy storage plants will be considered. Transfer learning techniques will also be introduced to further improve the applicability of the proposed approach in real-world conditions.