1 Introduction

Lithium-ion batteries (LIBs) have found increasing use over the past two decades in consumer electronics, power backup, grid-scale energy storage, and electric vehicles [1, 2]. They are at the forefront of decarbonizing society and are a viable alternative to carbon-based energy resources [3, 4], although there is still a long way to go to achieve parity with those resources in energy and power density [5, 6], life, reliability, and safety. Numerous research works [7] have pursued non-Edisonian approaches [8] to understand LIB properties from different aspects [9, 10]. Because of the complexity of the cell manufacturing process [11], the assembly of battery packs from cells [12], the wide range of cell properties, and the variety of applications across wide operating and environmental conditions [13, 14], multi-scale integrated computational modeling and data-driven methods are used. The estimation of non-measurable states is widely required across the process–structure–property–performance chain of a lithium-ion battery [15]. The electrochemical performance characteristics [16] of a LIB, including energy density, power density, and capacity [17], depend strongly on the electrode structure produced during manufacturing and are used to define its performance [18]. The heterogeneous electrochemical behavior of lithium-ion batteries, which span several different rechargeable cell types, makes it difficult to formulate predictions and estimates of their state [19]. The battery is a non-linear, time-varying system, which makes state estimation exceptionally difficult: the final result is obtained indirectly from measurements of other parameters, and its accuracy varies with the estimation method, battery model, and optimization method used [20].

Among the several LIB states, the state of charge (SOC) is a vital aspect, and its estimation is a main barrier to adopting LIB-based electric vehicles [9] as an alternative to conventional internal combustion engine vehicles. Precise SOC estimation is vital for extending cell life and guaranteeing safe operation [21]. SOC is not a tangible parameter; rather, it is a co-state within the battery management system (BMS) that cannot be captured directly by measuring instruments. Numerous researchers have proposed methods for estimating SOC, but a significant portion of them lack precision. The methods are categorized into online and offline approaches. Online methods can be employed for real-time state estimation, whereas offline approaches are not suited to battery operation because of their rigorous experimental protocols or expensive processing requirements. Model-based methods, coulomb counting, Kalman filters, electrochemical methods, hybrid methods, and machine learning approaches are used to estimate the state of charge.

The challenges of accurate SOC estimation in LIBs stem from the non-linear relationship between voltage and SOC, which shifts with operating temperature and aging. Operating temperature depends on the current drawn from the battery and requires robust thermal management techniques such as air or liquid cooling. Capacity degradation of the battery pack over time, an age-related variation tracked through the cycle index, introduces error into the system. Accurate current measurement is vital but prone to errors and requires frequent calibration. Charge and discharge cycles of the battery pack are further complicated by hysteresis effects, and the state of health (SOH) of the battery, which reflects its overall condition, also depends on SOC estimation. Open-circuit voltage, C-rate dependency, and the need for precise calibration add to the complexity of SOC estimation. Researchers employ different state estimation methods to handle these challenges, striving to enhance SOC prediction accuracy for diverse applications.

The objective of this paper is to review state of charge estimation methods that offer enhanced accuracy, extended battery life, and system reliability through proper prognostics and diagnostics. The objective also covers identifying the performance parameters of the artificial intelligence algorithms and deep learning methods used for SOC estimation with different amounts and types of data, such as voltage, current, temperature, and impedance. The novelty of the work is to ease and standardize SOC investigations through a simple, systematic approach for commercial lithium-ion batteries. The goal of the review is to understand the advantages and limitations of the different estimation methods for lithium-ion batteries, which is essential for safe and efficient operation across a wide range of applications.

The remainder of the paper is structured as follows. Section 2 describes the state of charge estimation methods reported in the literature, with the findings and limitations of the different techniques. Section 3 describes the different state of charge estimation methods, their performance characteristics, and process flow diagrams. Section 4 summarizes different deep learning methods with multiple operating profiles, the types of cells used for electric vehicle applications, the advantages and disadvantages of different deep learning methods, and different cell assembly patterns. Section 5 covers key issues and challenges, and Sect. 6 covers conclusions, future work, and recommendations.

2 Related Works

Almaita et al. addressed the shortcomings of the nonlinear battery model with a long short-term memory (LSTM) neural network model that can adapt to the complexity of the battery [22]. The accuracy of the model was compared with that of feed-forward and deep feed-forward neural network [23, 24] topologies under three distinct time series. It was shown to be less superior owing to the uncertainty of the estimation process [25]. The battery dynamics can be self-learned by an artificial neural network (ANN) [26], which makes it competitive with conventional SOC estimation methods [13, 27]. Additionally, the ANN's assessment is more reliable because its inputs exclude the prior SOC level [28]. For the SOC estimation of lithium-ion batteries in hybrid and electric cars, a trade-off analysis of five alternative ANN designs [29] found that the nonlinear autoregressive exogenous model architecture performed best in terms of estimation error, training time, and computing cost.

State estimation has been examined with ensemble bagging, linear regression, Gaussian process regression (GPR), support vector machine (SVM) [19], and ensemble boosting [30]. Of the six algorithms compared, ANN and GPR were found to be the best, with MSE and RMSE of (0.0004, 0.00170) and (0.023, 0.04118), respectively, and this information was used to enhance the battery's performance parameters [31]. A recurrent neural network based on a genetic algorithm, a gated recurrent unit (GRU) network, was proposed together with a mechanism for creating training and testing datasets, and was tested under four dynamic driving conditions at five different temperatures [32]. The authors concluded that the proposed method achieves high robustness and accuracy. A combination of an adaptive H-infinity filter and long short-term memory network [33, 34] modeling was also proposed. The advantage of this combined method is that it can increase the application efficiency of the algorithm [35] by avoiding the precise battery modeling and taxing model parameter identification required by conventional observers or filters [36].

J. Hong et al. reviewed studies that predict SOC under the actual driving cycles of electric vehicles [27, 37] using intricate mathematical formulas rather than machine learning (ML); the temporal attention long short-term memory model was found to predict SOC more accurately than other models. The SOC of drive cycles hidden during training [38] may be predicted by a deep neural network (DNN) with enough hidden layers [30]. It was established that adding hidden layers to a DNN, up to four, reduces error rates and enhances SOC estimation, while adding hidden layers beyond that raises error rates. Current deep learning (DL) based approaches for SOC estimation have several research gaps [39]. The nonlinear nature of the LIB [26, 40] makes it challenging to model accurately. It is also challenging to evaluate the internal environment of a LIB [41, 42], which can differ between laboratory and real-world conditions [43]. These discrepancies can increase the LIB's instability; therefore, more development is needed to improve SOC estimation accuracy in EV LIBs [44].

A self-supervised learning method [45] was proposed for end-to-end SOC estimation without feature engineering or adaptive filtering. It demonstrated that the deep learning-enabled transformer model [46, 47] achieves the lowest mean absolute error (MAE) of 0.7% and root mean square error (RMSE) of 1.2% on the test dataset at various ambient temperatures [45, 48]. The temporal convolution network (TCN) technique was developed to estimate SOC [30, 49] over several drive cycles, including the highway fuel economy test (HWFET), the unified cycle driving schedule (UCDS, also known as LA92), the urban dynamometer driving schedule (UDDS), and US06, at 1C and 25 °C, and the TCN design obtained an accuracy of 99.1%. A machine learning-enabled approach based on a recurrent neural network with long short-term memory (LSTM) was introduced for real-time multi-forward-step SOC prediction [27, 50]. The long training module demonstrates that the offline LSTM-based model is capable of quick and accurate multi-forward-step battery SOC forecasts.

3 Methods

SOC is the ratio, expressed as a percentage, of the battery's residual charge to its total available charge under particular operating conditions such as variable load and temperature. SOC estimation methodologies are divided into five categories: the traditional approach, adaptive filter methods [51], deep learning methods, nonlinear observers, and hybrid algorithms. Lithium-ion battery SOC estimation is a critical task in electric vehicle, renewable energy, and portable device applications. A variety of techniques are used, each with its own advantages and limitations. The traditional mathematical modeling approach offers precision but requires a thorough understanding of the battery's characteristics to describe its electrochemical behavior. Deep learning techniques use neural networks to learn complicated relationships from data sets and produce accurate SOC predictions. For real-time applications, filter techniques such as the Kalman filter or extended Kalman filter combine mathematical models with measurements; they are robust to errors and address the non-linear dynamics of battery systems. Hybrid approaches combine several techniques to exploit their respective benefits, frequently employing mathematical models for preliminary estimation and deep learning for refinement.
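As an illustration of how a filter technique combines a model with measurements, the following is a minimal sketch of a scalar Kalman filter for SOC. It is not taken from any of the reviewed works. The linear terminal-voltage model v = a + b·SOC − i·R0, all parameter values, and the function name `kalman_soc_step` are assumptions made for the example. A real open-circuit voltage curve is non-linear, which is where the extended Kalman filter would be used instead.

```python
import numpy as np

def kalman_soc_step(soc, p, current_a, voltage_v, dt_s, cap_as,
                    a=3.0, b=1.2, r0=0.05, q=1e-7, r=1e-4):
    """One predict/update cycle of a scalar Kalman filter with SOC as the state.

    Assumed (illustrative) measurement model: v = a + b*soc - current*r0.
    q and r are the assumed process and measurement noise variances.
    """
    # Predict: propagate SOC by coulomb counting (discharge current positive).
    soc_pred = soc - current_a * dt_s / cap_as
    p_pred = p + q
    # Update: correct the prediction with the measured terminal voltage.
    v_pred = a + b * soc_pred - current_a * r0
    gain = p_pred * b / (b * b * p_pred + r)
    soc_new = soc_pred + gain * (voltage_v - v_pred)
    p_new = (1.0 - gain * b) * p_pred
    return soc_new, p_new

# Synthetic run: the filter starts from a deliberately wrong SOC guess (0.5)
# while the simulated cell starts at 0.8 and is discharged at a constant 1 A.
rng = np.random.default_rng(0)
cap_as = 2.5 * 3600.0            # hypothetical 2.5 Ah cell, in ampere-seconds
true_soc, est_soc, p = 0.8, 0.5, 0.1
for _ in range(600):
    i_a = 1.0
    true_soc -= i_a * 1.0 / cap_as
    v_meas = 3.0 + 1.2 * true_soc - i_a * 0.05 + rng.normal(0.0, 0.01)
    est_soc, p = kalman_soc_step(est_soc, p, i_a, v_meas, 1.0, cap_as)
```

In this sketch the voltage measurement pulls the estimate back toward the simulated SOC despite the wrong initial guess, which coulomb counting alone cannot do.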

The principal approaches to SOC estimation are data-driven, direct measurement [52], and model-based, along with combinations of two or more of these. Direct measurement approaches include the open-circuit voltage and coulomb counting methods. The model-based method uses complicated mathematical equations, internal electrochemical processes, the electrical characteristics [53] of the components used to describe them, and in-depth knowledge of electrochemistry to model SOC [54]. The equivalent circuit model [55], electrochemical model, sliding mode observer, electrochemical impedance model, Kalman filters, Luenberger observer, and other prominent model-based approaches are depicted in Fig. 1, which groups the state of charge estimation methods into conventional mathematical modeling, deep learning algorithms, filter algorithms, non-linear observers, and hybrid algorithms.
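The open-circuit voltage method mentioned above can be sketched as a table lookup with linear interpolation. The OCV–SOC points below are invented for illustration and do not describe any particular cell; in practice the table would come from a rested-cell characterization test, and the function name `soc_from_ocv` is hypothetical.

```python
import numpy as np

# Hypothetical OCV-SOC table (illustrative values only, not measured data).
SOC_POINTS = np.array([0.0, 0.1, 0.2, 0.4, 0.6, 0.8, 1.0])
OCV_POINTS = np.array([3.00, 3.45, 3.55, 3.65, 3.80, 4.00, 4.20])  # volts

def soc_from_ocv(ocv_v):
    """Estimate SOC (as a fraction) from a rested open-circuit voltage.

    np.interp needs increasing x-coordinates, which holds here because the
    assumed OCV curve is monotonic in SOC. Values outside the table are
    clamped to the end points.
    """
    return float(np.interp(ocv_v, OCV_POINTS, SOC_POINTS))

soc_mid = soc_from_ocv(3.725)   # halfway between the 0.4 and 0.6 table points
```

The lookup is only meaningful when the cell has rested long enough for the terminal voltage to relax to the open-circuit value, which is why this method is usually combined with coulomb counting during operation.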

Fig. 1 Different state of charge estimation methods: conventional mathematical modeling, deep learning algorithms, filter algorithms, non-linear observers, and hybrid algorithms

Although the model-based method yields dependable and accurate models, it calls for in-depth domain expertise, meticulous feature engineering, and a lengthy development period. It also does not scale across cell chemistries or form factors, so a change in cell chemistry or form factor requires a separate model to be redeveloped. Cell irregularities such as inconsistent manufacture, erratic operating conditions, and cell deterioration [42, 56] are not considered in the model-based approach. Because of these inadequacies, researchers are shifting their attention toward model-less, data-driven approaches to SOC estimation, in which the temperature, current, and voltage of the cells, measured under various operating and environmental conditions and across various cell chemistries, form factors, and manufacturers, are used directly to predict the SOC. Techniques for data-driven SOC estimation [57] include fuzzy logic, wavelet neural networks, support vector machines, extreme learning machines, nonlinear autoregressive with exogenous input neural networks, and artificial neural networks (ANN) [58]. SOC is defined mathematically through Eqs. (1) and (2). Equation (1) expresses it as the percentage of the battery capacity [31] in the current state (Ccurr) relative to the battery capacity at full charge (Co).

$${SOC}_{curr}= \frac{{C}_{curr}}{{C}_{o}} \times 100 \%$$
(1)

The following function can also serve as a representation of the definition of SOC.

$${SOC}_{curr}= {SOC}_{0}- \frac{\int_{0}^{t} i\left(\varepsilon \right)\,d\varepsilon }{{C}_{n}}$$
(2)

where SOCcurr signifies the SOC at time t, SOC0 represents the initial SOC value, i(ε) represents the current at time ε, and Cn is the nominal capacity. With the minus sign in Eq. (2), discharge current is taken as positive.
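Equation (2) can be evaluated on sampled data by replacing the integral with a running sum. The sketch below assumes a fixed sampling interval, SOC expressed as a fraction rather than a percentage, and discharge current taken as positive. The function name `coulomb_count` and the 2.5 Ah cell are illustrative assumptions.

```python
import numpy as np

def coulomb_count(soc0, current_a, dt_s, capacity_ah):
    """Discrete form of Eq. (2): SOC_k = SOC_0 - (1 / C_n) * sum(i * dt).

    soc0        : initial SOC as a fraction (0 to 1)
    current_a   : sampled current in A, discharge positive (assumed convention)
    dt_s        : sampling interval in seconds
    capacity_ah : nominal capacity C_n in Ah
    """
    current_a = np.asarray(current_a, dtype=float)
    capacity_as = capacity_ah * 3600.0          # convert Ah to ampere-seconds
    discharged_as = np.cumsum(current_a) * dt_s  # rectangular integration of i(t)
    return soc0 - discharged_as / capacity_as

# Hypothetical 2.5 Ah cell discharged at a constant 2.5 A (1C) for 30 minutes,
# sampled once per second: half of the nominal capacity is removed.
soc = coulomb_count(soc0=1.0, current_a=np.full(1800, 2.5), dt_s=1.0, capacity_ah=2.5)
```

Because the sum accumulates every current measurement, any sensor offset accumulates with it, which is consistent with the calibration concerns raised in the Introduction.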

With the fast-expanding need for machines that can learn to solve a wide range of complicated problems, machine learning [59], data science, and artificial intelligence (AI) help accelerate and simplify the process. Deep learning technology, which originated from ANNs and was introduced by Geoffrey E. Hinton, has excellent capabilities for learning from data and can be interpreted in terms of the universal approximation theorem [60] or probabilistic inference. It has gained popularity in the computing world and is regarded as a foundational technology of the current fourth industrial revolution [61, 62]. It is extensively used in a variety of application fields, including healthcare, image identification, text analytics, cybersecurity, and many more. Figure 2 illustrates the different artificial intelligence and deep learning algorithms, covering supervised, unsupervised, and hybrid techniques, used for SOC estimation. These algorithms play an important role in improving SOC estimation accuracy. In the supervised learning approach, convolutional neural networks (CNN), recurrent neural networks (RNN), self-organizing maps (SOM), and linear regression are employed to model the complex relationships between the input data (voltage, current, and temperature) and SOC. Unsupervised learning techniques such as autoencoders (AE), restricted Boltzmann machines (RBM), and generative adversarial networks (GAN) help to identify patterns and similar battery behaviors that enhance SOC estimation. Hybrid approaches combine multiple reinforcement learning algorithms for their adaptability and performance in dynamic and uncertain environments.

Fig. 2 Different artificial intelligence and deep learning algorithms, covering supervised, unsupervised, and hybrid techniques, used for the SOC estimation approach

In essence, deep learning (DL) is a neural network with three or more layers that learns the relationship between specific input and output information [63]. DL is a key component of data science, which also encompasses statistics and predictive modeling [64]. Data scientists responsible for collecting, analyzing, and interpreting vast volumes of data find DL highly helpful because it makes the process quicker and simpler [65, 66]. DL is considered a means to automate, predict, and analyze at multiple levels. In contrast to typical linear ML algorithms, DL algorithms are built in a hierarchy of increasing complexity and abstraction and can make use of unsupervised learning. They automate feature extraction and can ingest and analyze unstructured data, including text and pictures, reducing the need for human expertise [67]. DL also reduces the data pre-processing that is generally necessary with ML. Because, by the universal approximation theorems, it can in principle approximate a wide class of functions given accurate data, DL has emerged during the past several years as a topic of interest for academics studying energy storage [68]. Figure 3 shows how the performance of machine learning and deep learning methods varies with the amount of data used by the algorithm.

Fig. 3 Performance characteristics of machine learning algorithms and deep learning methods as a function of the amount of data used by the algorithm

Figure 3 gives an example of how DL modeling with massive volumes of data can improve performance compared with conventional machine learning (ML) techniques [69]. In essence, DL [70, 71] may be used to map individual cell signals (voltage, current, and temperature) directly to SOC [72, 73] without further processing such as adaptive filters. This does away with manual feature engineering, which still yields accurate SOC estimates but requires a lot of effort and in-depth domain expertise. Deep neural networks (DNN) and long short-term memory (LSTM) networks were introduced in groundbreaking research to estimate SOC from cell temperature, voltage, and current without extra filters [74]. Figure 4 illustrates the process flow of the deep learning technique for calculating SOC in lithium-ion batteries used in two-wheel electric vehicles. It covers data collection, data pre- and post-processing, feature engineering, model training, testing, and model prediction. Data processing requires voltage, current, temperature, capacity, cycle life, and time.
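The data preparation stages of such a process flow can be sketched as follows. This is a minimal illustration on synthetic signals, not the pipeline of any reviewed study: it covers only min-max scaling, windowing of the voltage, current, and temperature signals, and a chronological train/test split. The function name `make_windows`, the window length, and the 80/20 split are assumptions.

```python
import numpy as np

def make_windows(features, soc, window):
    """Scale each signal to [0, 1] and stack `window` consecutive samples.

    Each input is a (window, n_signals) slice; its label is the SOC at the
    last time step of that slice.
    """
    features = np.asarray(features, dtype=float)
    soc = np.asarray(soc, dtype=float)
    lo, hi = features.min(axis=0), features.max(axis=0)
    # Constant signals (hi == lo) are mapped to 0 instead of dividing by zero.
    scaled = (features - lo) / np.where(hi > lo, hi - lo, 1.0)
    n = len(scaled) - window + 1
    x = np.stack([scaled[k:k + window] for k in range(n)])
    y = soc[window - 1:]
    return x, y

# Synthetic constant-current discharge: voltage falls, temperature rises.
t = np.arange(100)
features = np.column_stack([4.2 - 0.01 * t,       # voltage in V
                            np.full(100, 2.5),    # current in A
                            25.0 + 0.05 * t])     # temperature in deg C
soc = 1.0 - t / 100.0
x, y = make_windows(features, soc, window=10)

# Chronological split (no shuffling) so the test set lies after the training set.
split = int(0.8 * len(x))
x_train, x_test = x[:split], x[split:]
y_train, y_test = y[:split], y[split:]
```

The arrays `x_train` and `y_train` have the shape a recurrent or convolutional SOC model would typically consume. A chronological split is used here so that overlapping windows do not appear in both sets.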

Fig. 4 Process flow diagram of the deep learning method for estimating SOC in the LIBs used for two-wheel electric vehicles

The final step is model evaluation, in which the predicted SOC values are compared with the actual SOC values in the test set using the root mean square error (RMSE), the mean absolute error (MAE), or the mean square error (MSE) [75, 76], given in Eqs. (3)–(5).

$$RMSE= \sqrt{\frac{1}{N}\sum_{K=1}^{N}{\left({SOC}_{pre}-{SOC}_{act}\right)}^{2}}$$
(3)
$$MAE= \frac{1}{N} \sum_{K=1}^{N}\left|{SOC}_{pre}-{SOC}_{act}\right|$$
(4)
$$MSE= \frac{1}{N} \sum_{K=1}^{N}{\left({SOC}_{pre}-{SOC}_{act}\right)}^{2}$$
(5)

where N is the number of data points, SOCpre is the SOC value predicted by the deep learning model, and SOCact is the actual SOC value in the test set, both taken at the K-th data point. The smaller the error obtained from the above formulas, the higher the model accuracy.
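The three error measures can be computed directly from paired predicted and actual SOC values. The sketch below is a straightforward NumPy transcription of the RMSE, MAE, and MSE definitions. The function name `soc_errors` and the three sample values are made up for illustration.

```python
import numpy as np

def soc_errors(soc_pred, soc_act):
    """Return RMSE, MAE, and MSE between predicted and actual SOC values."""
    err = np.asarray(soc_pred, dtype=float) - np.asarray(soc_act, dtype=float)
    mse = float(np.mean(err ** 2))
    return {
        "rmse": float(np.sqrt(mse)),         # Eq. (3): square root of the MSE
        "mae": float(np.mean(np.abs(err))),  # Eq. (4): mean absolute error
        "mse": mse,                          # Eq. (5): mean square error
    }

# Illustrative values (SOC as a fraction); the errors are 0.02, 0.00, and -0.04.
metrics = soc_errors([0.82, 0.60, 0.41], [0.80, 0.60, 0.45])
```

Because RMSE squares the errors before averaging, it penalizes the single 0.04 error more heavily than MAE does, so reporting both gives a fuller picture of estimator behavior.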

4 Findings and discussions

Table 1 lists the deep learning methods applied to state of charge estimation with multiple operating profiles. Different data-driven methods are used to calculate the SOC of LIBs, taking advantage of the availability of charge–discharge data and hardware computing capability. It is still difficult to choose the discriminative features and the best supervised machine learning models for precise estimation of battery states.

Table 1 Different deep learning methods applied for the state of charge estimation with multiple operating profiles

Electric cars employ three types of battery cells: pouch, prismatic, and cylindrical. Additionally, coin cells are used for testing in research and development. Because cylindrical cells are self-contained in a shell that provides adequate mechanical resistance, they are the most affordable configuration to manufacture. Prismatic cells range in size from 20 to 100 times that of cylindrical cells, require less casing material, and often deliver more power and store more energy for the same volume. The thickness and form of the casing also allow better heat management than in cylindrical cells. Compared with other cell types, pouch cells are designed to give greater power. They are also highly effective at using available space; however, they have the lowest mechanical resistance of all cell types because of their flexible plastic housing. Table 2 summarizes the types of cells used in lithium-ion batteries for electric vehicles and allied applications. Nickel manganese cobalt (NMC) batteries provide an excellent mix of power and energy. Three further lithium compounds, lithium nickel cobalt aluminum (NCA), lithium cobalt oxide (LCO), and lithium iron phosphate (LFP), play a crucial part in the electrification revolution's drive to reduce carbon emissions. Table 3 lists the advantages and limitations of the methods used for SOC calculation in electric vehicle applications: LSTM, RNN, SVM, and RVM. SVM offers good accuracy in multi-dimensional systems with quick and accurate SOC estimation, but it has high computational complexity and lacks sparseness.

Table 2 Different types of cells used in lithium-ion batteries for electric vehicles and allied applications
Table 3 Advantages and limitations of different deep learning methods used for SOC calculation for electric vehicle applications

Typical energy storage options for two-wheel electric vehicles are lithium-ion batteries, lead-acid batteries, nickel metal hydride batteries, and ultra-capacitors. Lithium-ion batteries work well at high temperatures, have outstanding specific energy, and have a low self-discharge rate. Table 4 lists the typical lithium-ion batteries used in mass-produced two-wheel electric vehicles in India by manufacturers such as Ather, OLA, Tork KRATUS, TVS, Hero Electric, Okinawa, and Ampere Magnus. Typical battery configurations are 48–72 V with 1–2 kWh for normal use, and up to 5 kWh for high-performance vehicles.
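To make the quoted pack ratings concrete, the arithmetic below estimates how many cells a pack in this range would need. The cell values (3.6 V nominal, 2.5 Ah) are assumptions typical of a cylindrical cell and are not taken from the tables of this paper. The function name `size_pack` is hypothetical, and rounding up is one possible design choice among several.

```python
import math

def size_pack(pack_voltage_v, pack_energy_wh, cell_voltage_v=3.6, cell_capacity_ah=2.5):
    """Return (series count, parallel count, total cells) for a target pack.

    Series cells set the pack voltage; parallel strings set the capacity.
    Both counts are rounded up so the pack meets or exceeds the targets.
    """
    series = math.ceil(pack_voltage_v / cell_voltage_v)
    pack_capacity_ah = pack_energy_wh / (series * cell_voltage_v)
    parallel = math.ceil(pack_capacity_ah / cell_capacity_ah)
    return series, parallel, series * parallel

# A 48 V, 1.5 kWh pack with the assumed cell gives 14 in series and 12 in parallel.
series, parallel, cells = size_pack(48.0, 1500.0)
```

With these assumptions the pack uses 168 cells and stores about 1.51 kWh (14 × 12 × 3.6 V × 2.5 Ah). Commercial packs may choose a different series count for the same nominal voltage class.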

Table 4 Typical lithium-ion battery used for mass production of two-wheel electric vehicles in India

Lithium-ion batteries now hold the top spot in battery technology owing to their energy density of 150–265 Wh/kg. However, they are susceptible to thermal runaway, in which a cell can burst and release all of its stored energy. For this reason, a BMS is needed to keep them in check. This section covers the fundamental components of a conventional BMS as well as the fundamentals of the different BMS states. Four lithium-ion battery packs connected in series are handled by the BMS. A cell monitoring mechanism measures the voltages of all the cells and equalizes them, a process known as balancing. A microcontroller unit manages telemetry data, switch activation, and the cell balancing strategy (active or passive). Cells in a pack differ in capacity and impedance; therefore, a charge differential between cells builds up as the pack ages. A weaker set of cells with less capacity will charge more quickly than the others in the series string. To prevent overcharging of the weaker cells, the BMS must stop the other cells from charging. Conversely, if a cell discharges more quickly, there is a chance that it will fall below the minimum voltage, and a BMS without a cell balancer would need to cut off the power early in this situation. With balancing, a circuit discharges the higher-SOC cell until it matches the other cells in the series string. Figure 5 illustrates the typical two-wheel electric vehicle lithium-ion cell (18650) as an individual cell, a battery bank, and the battery management system. The individual cylindrical cell, 18 mm in diameter and 65 mm in length, acts as the energy storage unit and comprises a cathode, an anode, and an electrolyte. Multiple cells are grouped to form a battery pack, in which they are connected in series and parallel configurations. The BMS monitors individual cell parameters, manages charge balancing, safeguards against overcharging and over-discharging, and communicates with peripheral devices.
Figure 6 illustrates two-wheel electric vehicle batteries in assembly: with cylindrical cells, with prismatic cells, and the Ola S1 Pro battery bank. It provides a visual representation of different battery assembly configurations and highlights the diversity of battery pack designs in two-wheel electric vehicles, which gives vehicle manufacturers a choice of technologies.
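The balancing and protection decisions described above can be sketched as simple threshold checks on the measured cell voltages. This is an illustrative passive-balancing rule, not the logic of any specific BMS. The 10 mV tolerance, the 4.2 V and 3.0 V limits, and the function names are assumptions.

```python
def cells_to_bleed(cell_voltages_v, tolerance_v=0.01):
    """Indices of cells whose voltage exceeds the lowest cell by more than the tolerance.

    In passive balancing these cells would be discharged through a bleed
    resistor until they match the weakest cell.
    """
    v_min = min(cell_voltages_v)
    return [k for k, v in enumerate(cell_voltages_v) if v - v_min > tolerance_v]

def charge_allowed(cell_voltages_v, v_max=4.2):
    """Charging must stop as soon as any single cell reaches the upper limit."""
    return max(cell_voltages_v) < v_max

def discharge_allowed(cell_voltages_v, v_min=3.0):
    """Discharging must stop as soon as any single cell reaches the lower limit."""
    return min(cell_voltages_v) > v_min

# Four series cells: cells 1 and 2 sit above the lowest cell and would be bled.
voltages = [3.95, 4.02, 3.97, 3.95]
bleed = cells_to_bleed(voltages)
```

The sketch also shows why an unbalanced pack loses usable capacity: a single cell at the upper or lower limit ends charging or discharging for the whole series string.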

Fig. 5 Typical two-wheel electric vehicle energy system components: (a) lithium-ion cell (18650), (b) cell lot (18650), (c) battery management system (BMS)

Fig. 6 A two-wheel electric vehicle battery in assembly: (a) with cylindrical cell, (b) with prismatic cell, (c) Ola S1 Pro battery

Rechargeable batteries used in electric transportation to power electric motors are usually referred to as traction batteries. Lithium-ion batteries are frequently created with higher energy capacity and lower specific charge density. Traction batteries are deep-cycle batteries, made to provide power for extended periods, which sets them apart from starting, lighting, and ignition batteries. For electric transportation systems, compact and lightweight batteries are preferable because they add less to the weight of the vehicle and hence improve vehicle performance. Electric vehicle batteries are distinguished by their relatively high power-to-weight ratio, energy density, and specific energy. Today's battery technologies nevertheless have substantially lower specific energies than liquid fuels, which frequently limits the vehicle driving range. Table 5 lists lithium-ion battery cell specifications used in two-wheel electric vehicles by manufacturer, model number, form factor, electrochemistry, weight in grams, diameter in millimeters, height in millimeters, nominal capacity, and nominal voltage.

Table 5 Lithium-ion battery cell specifications used in two-wheel electric vehicles

Among two-wheel electric vehicle manufacturers, battery packs are predominantly made with cylindrical NMC cells, with very few exceptions that use pouch cells or other chemistries because of form factor requirements. Cells are typically rated for discharge at 3C or higher because two-wheel electric vehicle operation requires high discharge currents for short durations, which makes state of charge estimation a vital parameter. Table 6 presents an analysis of the BMS functionalities available in typical two-wheel electric vehicle batteries, classified as essential, desirable, or non-essential: cell voltage level monitoring, input/output current monitoring, charging and discharging control, cell balancing (active/passive), thermal management, and user interface attributes.

Table 6 An analysis of BMS functionalities available in typical two-wheel electric vehicle batteries

5 Key issues and challenges

In the literature on DL methods, only a few standard profiles are used for estimation, which may not ensure reliable SOC estimation under the variety of actual EV driving conditions. This is a typical case of "covariate shift" in ML and makes the algorithms susceptible to failure. It is a significant issue because a LIB has a long cycle life and the DL prediction model must make thousands of extrapolation forecasts. Under the effect of variables such as cumulative error and random noise, the predictions are very likely to be incorrect. The demand exceeds what current technologies can handle, particularly for the long-term prediction of batteries with numerous formulations. Most of the literature considers only ideal constant current–constant voltage charging protocols, which are rare in real life, where charging differs across regions, drivers, and durations. The actual effects of partial charging, over-charging, partial discharge, or temperature during charging are not considered when verifying the performance of DL models.

Most of the literature focuses on datasets from a single cell model or on set-ups with a particular charging method and a patterned discharging mode. Very few studies compare datasets obtained from different cell chemistries, charging methods, and discharging profiles (UDDS, DST, UNIBO, HWFET, NRDC, US06, FUDS, etc.). Drive cycles are also predominantly designed for high-end four-wheel EVs, with hardly any analysis or reference for two-wheel electric vehicles, and country-specific or geography-specific needs are not addressed in the available literature. Little research and development has addressed the prediction of future SOC trends using DL methods, although it is crucial both to measure the existing SOC precisely and to anticipate the impending SOC from current driving data. Additionally, once the SOC is precisely calibrated, it may be determined via Ah counting, which is quick and accurate, so live SOC calculation based on neural networks is then not required. Few studies address collaborative estimation and prediction models through DL, even though SOC is related to other indirectly measurable states.

6 Conclusions and recommendations

In this paper, numerous SOC estimation methodologies are critically analyzed with respect to their underlying assumptions, accuracy, execution, advantages, and limitations. Model-based and data-driven estimators are extensively studied in the field of SOC prediction, and both have produced noteworthy outcomes. This examination indicates that a model-based approach gives the best performance as long as the system behavior is known ahead of implementation, whereas a data-driven method can perform better than model-based solutions if the system is not well understood. To get the best of both strategies, several researchers have been attempting to combine them as hybrid models. Nevertheless, research and development is moving toward data-driven SOC estimation because of technological advances such as fast processors, high-capacity storage devices, and the availability of big data.

The findings lead to several recommendations that should substantially enhance future methodologies for estimating SOC. In real-world applications, a LIB may be exposed to additional environmental dynamics that are difficult to replicate in a laboratory. The findings of SOC estimation should thus be further examined in light of numerous uncertainties, such as temperature, aging, and noise effects. The electrochemical battery model needs to be thoroughly investigated in terms of capacity loss, temperature failure, internal reaction kinetics, and mechanical fatigue. An enhanced fusion rule that combines data sets and sensor information under various operating conditions in a fusion model should cover battery cathode chemistry and battery aging. Additional research is needed on the SOC estimation techniques employed in real-time battery management systems and on the optimization strategies required to lower their computational complexity.