LSTM-based low-impedance fault and high-impedance fault detection and classification

Bhatnagar, Maanvi; Yadav, Anamika; Swetapadma, Aleena; Abdelaziz, Almoataz Y.

doi:10.1007/s00202-024-02381-0

Original Paper
Published: 20 April 2024


Electrical Engineering



Abstract

In this article, a long short-term memory based protection scheme for power transmission lines is presented. A fault detection framework is developed that uses voltage and current signals’ RMS values as input. The proposed work is established for various shunt faults, both low-impedance and high-impedance faults which are tested on a standard IEEE 14 bus system and an existing real transmission network. Results confirm the detection and classification of faults with accuracy and precision higher than 99%. The impact of non-faulty events such as load switching, capacitor switching, noisy data, and load variation conditions is also studied to analyze the model’s performance. The efficacy of the proposed method is confirmed by comparing it with different methods in the literature. Results indicate the aptness of the proposed scheme for the protection of power transmission lines.


1 Introduction

The modern power system network has grown rapidly in size and complexity with the integration of renewable energy resources [1]. As the system is exposed to varying atmospheric conditions, the chances of faults are greater. A fault is a condition in which abnormal electric current flows through the system. The main functions of the protection system are to detect faults, classify them into different types (LG, LL, LLG, LLL, and LLLG), and identify the faulty phase(s). Faults are nevertheless frequent and unavoidable, arising from a variety of random causes; they severely affect the performance of the power system, interrupt the supply of energy, and compromise the efficiency and reliability of the system [2]. To meet the increasing energy demand and ensure continuous delivery, the impact of faults must be minimized. It is therefore crucial to detect these faults early so that they can be cleared as quickly as possible. Prompt repair of the faulted component allows faster recovery of the main system function and restores component reliability within minutes. Mitigating these faults and restoring proper system operation requires effective protection and maintenance schemes. Short-circuit faults may occur throughout generation, transmission, and distribution systems, including in generators, transformers, insulation, HVDC converters, feeder buses, and transmission lines. They adversely affect power system performance and threaten the key function of the system. Many of the faults occur on transmission and distribution lines, as mentioned in [3]. The short-circuit fault is common and is the most hazardous type of fault, posing high risks for the line, including reduced component life expectancy, increased power loss and heating, and damaged insulators. Short circuits can be divided into symmetrical and asymmetrical faults. 
A symmetrical, or balanced, fault keeps the system balanced. This category comprises three-line-to-ground (LLLG) and three-line (LLL) faults. They are relatively rare, but they cause the most harm to system equipment because of the higher fault current magnitude involved. Double line-to-ground (LLG), line-to-ground (LG), and line-to-line (LL) faults are asymmetrical, or unbalanced, faults. Single line-to-ground faults have an occurrence probability of 0.80, although they are less severe than balanced faults. Another type of fault, which presents a large impedance and therefore leads to very small fault currents, is known as a high-impedance fault (HIF). In addition to endangering power system equipment, HIFs can also endanger human safety. Designing a proper protection scheme for HIFs is therefore not only a matter of detecting this hazardous fault but also of avoiding fatal accidents, because an HIF typically involves a current-carrying conductor lying on the ground on a high-impedance surface, where it can cause fatal injury to people. The design must also account for the fact that HIF features depend on many factors, such as surface humidity, soil type, and weather conditions [4]. HIF current and voltage waveforms are random, asymmetric, and nonlinear, making it imperative to develop a robust detection scheme using effective signal processing methods.

The type, location, and duration of a fault affect the system's performance. Detecting, identifying, and localizing faults faster improves system protection and maintenance strategies and helps maintain the quality and continuity of the power supply. Researchers and practitioners study fault detection and classification extensively. Currently, most methods rely on digital sampling of voltage and/or current signals. Detection and classification tasks are then performed on the sampled data. As part of the processing phase, features are extracted and used to detect and classify faults [2]. Data-driven approaches based on machine learning and other artificial intelligence (AI) tools are expected to replace rule-based algorithms for fault detection and classification. Compared to rule-based algorithms, data-driven models are more effective and can help develop generic solutions. With state-of-the-art AI techniques, as in the proposed scheme, features do not need to be extracted explicitly.

Detecting and classifying low-impedance faults (LIFs) usually involves two steps: (a) extracting features from the input signals, and (b) computing results from those features with the help of various AI tools. Several techniques are used for the first step. These include fast Fourier transforms (FFT) [5], wavelet transforms (WT) [6], S-transforms (ST) [7], and other statistical methods. Feature extraction requires repeated effort and is usually specific to a system configuration. For example, multiresolution wavelet analysis and statistical-feature-based techniques have been used in conjunction with an artificial neural network (ANN) [8]. This method is remarkably accurate but is sensitive to higher fault impedances and has not been tested for noise. For feature extraction, the WT and the discrete wavelet transform (DWT) are commonly used algorithms [6]. Their applications, nevertheless, do not separate the feature extraction from the working algorithm. Because the ST can reveal joint time–frequency characteristics, some researchers have used it instead of the WT [7]. The ST generally reveals harmonics better than the DWT and is less susceptible to noise. However, the criteria for selecting features are similar, typically based on the standard deviation or signal energy. Feature extraction may be limited by a lack of generalization, as well as by the need to investigate parameters not directly recommended in the literature. For instance, using the WT or DWT involves choosing an appropriate mother wavelet and decomposition level, which is not only computationally expensive but can also affect algorithm performance [6]. Next, the features must be processed to detect and classify faults. Researchers have used a variety of methods to accomplish this. Owing to their capability to learn from patterns, neural networks (NNs) have attracted a lot of attention [9]. 
Data-driven tools such as decision trees (DT) [10] and k-nearest neighbors (k-NN) [11] are also useful for fault analysis and decision making. The literature also reports contemporary deep learning methods such as convolutional neural networks (CNNs) [2]. However, these techniques are quite inefficient because they rely on two distinct parts: one to extract features and one to run the working algorithm. Ideally, feature extraction should be eliminated and the operational data should be processed directly. There is no generalized set of rules for feature extraction. Owing to this unpredictability, feature extraction can be time-consuming and can degrade the system's performance.

HIF detection and classification can also be achieved by different signal processing techniques, as discussed in [4, 12]. HIF detection schemes can be categorized according to the domain used for feature extraction: the time domain or the frequency/time–frequency domain. Time-domain methods typically detect an HIF by measuring voltages and currents and analyzing their unique properties. Mathematical morphology [13], fractal geometry techniques [14], and Kullback–Leibler (KL) divergence [15] are some of the time-domain feature extraction techniques available for HIF detection. Frequency-domain methods analyze the spectral content of HIF voltage and current signals. In [16], the FFT is used to detect HIFs by calculating distance changes between the harmonic components of fault currents.

Considering low-impedance faults (LIFs) alongside high-impedance faults (HIFs) is a crucial aspect of this paper. Because the system may be subjected to both kinds of faults, a protection scheme can be considered reliable and robust only if it detects both. The proposed scheme is designed to cater to both kinds of faults and therefore provides reliability and robustness against both LIFs and HIFs. While HIFs are extensively studied for their damaging effects, the inclusion of LIFs is equally important for a comprehensive understanding of the overall fault landscape. LIFs, characterized by low fault resistance, pose distinct challenges and have different implications from their high-impedance counterparts. They may trigger different protective responses in the system and exhibit unique fault signatures. By analyzing both LIFs and HIFs, the paper provides a more holistic view of fault scenarios, leading to a more effective and versatile protective framework. Addressing both also makes the model more practical, ensuring that the fault detection system is equipped to handle a wide range of potential issues in real-world electrical systems. Data sources for the modern power grid include intelligent electronic devices (IEDs), phasor measurement units (PMUs), digital fault recorders (DFRs), and many other devices [17]. According to the literature review, there is a need for automated fault detection and classification methods whose parameters are flexible with respect to working conditions and data sources. Using long short-term memory (LSTM) units [18], this paper presents a technique that automates feature extraction, avoiding a distinct feature extraction task by merging it with the working algorithm. 
The feature extraction and working-algorithm parts of the learning process are unified, which makes deployment more feasible. The main advantages of the method are that it does not need communication links, requires only a low sampling frequency, is easy to implement, is not affected by system noise, unavoidable transients, or operating conditions, and is overall robust and reliable in operation. The paper is organized as follows: Sect. 2 describes the LSTM method used, Sect. 3 contains the proposed method, Sects. 4, 5, and 6 cover the results, Sect. 7 evaluates the scheme on an Indian power system network, Sect. 8 contains a comparison with other schemes, and Sect. 9 concludes the work.

2 Long short-term memory method

LSTM [18] is a type of artificial neural network used in artificial intelligence and deep learning. Unlike a feed-forward neural network, an LSTM has feedback connections. This type of recurrent neural network (RNN) can process not only single data points but also whole sequences of data. Like a standard RNN, an LSTM has both long-term and short-term memories. The weights and biases of the network are altered in each training iteration, just as synaptic strength changes physiologically to accumulate long-term memories; the activation patterns in the network change once per time step, similarly to how short-term memories are stored in the brain through moment-to-moment electrical firing patterns. The LSTM architecture gives an RNN a short-term memory that can last thousands of time steps, hence the name long short-term memory. Three gates control the flow of information into and out of a common LSTM cell unit: a forget gate, an input gate, and an output gate [18]. With these gates controlling information flow, the cell can remember values for arbitrary periods. Because a time series may contain long lags between significant events, LSTM networks are well suited to classifying, processing, and predicting events in such series. LSTMs were developed to overcome the vanishing gradient problem encountered in training conventional RNNs. In various applications, LSTMs have an advantage over conventional RNNs and other sequence learning approaches because they are relatively insensitive to gap length. The basic block diagram of a conventional LSTM unit is illustrated in Fig. 1.

The LSTM begins by deciding how much information from the previous cell state $C_{t-1}$ to discard. A sigmoid layer known as the “forget gate layer” makes this decision. It looks at $h_{t-1}$ (the hidden state vector of the previous time step) and $x_{t}$ (the input vector to the LSTM unit) and returns a value between 0 and 1 for each number in the cell state $C_{t-1}$. A value of “1” indicates that the information should be kept completely, while “0” indicates that it should be discarded completely. This step is shown mathematically in Eq. (1).

$$f_{t} = \sigma \left( W_{f} \left[ h_{t-1}, x_{t} \right] + b_{f} \right) \tag{1}$$

Here $f_{t}$ is the activation vector of the forget gate, $W_{f}$ is the corresponding weight matrix, $b_{f}$ is the bias vector, and $\sigma$ represents the sigmoid activation function. The next step is to choose the new data to store in the cell state. It consists of two parts. First, a sigmoid layer known as the “input gate layer” decides which values to update. Afterward, a $\tanh$ layer creates a vector of new candidate values, $C^{\prime}_{t}$ (the cell input activation vector), which could be added to the state. In the next step, these two are combined to create an update to the state [24].

$$i_{t} = \sigma \left( W_{i} \left[ h_{t-1}, x_{t} \right] + b_{i} \right) \tag{2}$$

$$C^{\prime}_{t} = \tanh \left( W_{C} \left[ h_{t-1}, x_{t} \right] + b_{c} \right) \tag{3}$$

Here $i_{t}$ is the activation vector of the input gate, $W_{i}$ and $W_{C}$ are the corresponding weight matrices, and $b_{i}$ and $b_{c}$ are the bias vectors. Next, the old cell state $C_{t-1}$ is updated to the new cell state $C_{t}$. The previous steps decided what information to forget and what to add; this step carries out those decisions. The old state is multiplied by $f_{t}$, which forgets the information that is no longer needed. Then $i_{t} C^{\prime}_{t}$ is added to it, that is, the new candidate values scaled by how much each state value is to be updated. Mathematically, this step is represented by Eq. (4).

$$C_{t} = f_{t} C_{t-1} + i_{t} C^{\prime}_{t} \tag{4}$$
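As a quick numerical illustration of Eq. (4) for a single cell, the values below are assumed purely for the example and do not come from the paper: a forget gate of 0.9 keeps most of the old state, while an input gate of 0.5 admits half of a negative candidate value.

$$f_{t} = 0.9,\quad C_{t-1} = 2.0,\quad i_{t} = 0.5,\quad C^{\prime}_{t} = -1.0$$

$$C_{t} = f_{t} C_{t-1} + i_{t} C^{\prime}_{t} = (0.9)(2.0) + (0.5)(-1.0) = 1.8 - 0.5 = 1.3$$

For vector-valued states the products in Eq. (4) are taken element by element, so each entry of the cell state is updated in this way.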

The last stage is the output stage. The output depends on a filtered version of the cell state. First, the information passes through a sigmoid layer that decides which parts of the cell state will be output. Afterward, the cell state passes through a $\tanh$ layer to produce values between −1 and 1. The result is then multiplied by the sigmoid layer's output so that only the parts selected earlier are output. The mathematical representation of the last stage is given by Eqs. (5) and (6).

$$O_{t} = \sigma \left( W_{O} \left[ h_{t-1}, x_{t} \right] + b_{O} \right) \tag{5}$$

$$h_{t} = O_{t} \tanh \left( C_{t} \right) \tag{6}$$

Here $O_{t}$ is the activation vector of the output gate, and $W_{O}$ and $b_{O}$ represent the corresponding weight matrix and bias vector.
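The gate equations, Eqs. (1) to (6), can be sketched as a single time step in plain Python. This is a minimal illustration rather than the authors' implementation: the function names (`lstm_step`, `affine`, `zero_params`), the list-based vectors, and the parameter layout are all assumptions made for the example.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, the gate activation in Eqs. (1), (2), and (5)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, v):
    """Return W v + b for a list-of-rows matrix W and vectors b and v."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Eqs. (1)-(6).

    params (hypothetical layout) maps "f", "i", "c", "o" to (W, b) pairs.
    Each W has len(h_prev) rows and len(h_prev) + len(x_t) columns.
    """
    z = h_prev + x_t  # list concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(*params["f"], z)]          # Eq. (1)
    i_t = [sigmoid(a) for a in affine(*params["i"], z)]          # Eq. (2)
    c_cand = [math.tanh(a) for a in affine(*params["c"], z)]     # Eq. (3)
    c_t = [f * c + i * g
           for f, c, i, g in zip(f_t, c_prev, i_t, c_cand)]      # Eq. (4)
    o_t = [sigmoid(a) for a in affine(*params["o"], z)]          # Eq. (5)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]           # Eq. (6)
    return h_t, c_t


def zero_params(hidden, inputs):
    """All-zero weights and biases, useful only for checking the wiring."""
    W = [[0.0] * (hidden + inputs) for _ in range(hidden)]
    b = [0.0] * hidden
    return {gate: (W, b) for gate in ("f", "i", "c", "o")}
```

With all-zero parameters every gate evaluates to 0.5 and the candidate vector to 0, so the step simply halves the previous cell state, which makes the wiring easy to check by hand.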

The LSTM network used in the proposed approach detects the presence of faults and classifies various types of symmetrical and asymmetrical faults. The data are transferred from the input nodes to the LSTM hidden layer, which is composed of several LSTM units. A fully connected dense layer receives the LSTM layer output. Because the output of this layer provides the probability of the class labels, the fully connected layer is responsible for the high-level reasoning required for classification. Furthermore, a fully connected layer optimizes the objective by learning nonlinear combinations of the learned features. Unlike pooling layers, fully connected layers have trainable parameters: the input features are multiplied by trainable weights, and a bias term can optionally be added. Finally, for multiclass classification the last dense layer uses a softmax activation function; for binary classification it uses a sigmoid. CNNs are considered one of the most favored deep learning approaches, but they suffer from overfitting and vanishing gradient problems, and training large-scale networks requires a lot of processing power. LSTM uses an additive gradient structure to address the vanishing gradient problem. The network has direct access to the forget gate activations and can update the gates at every time step, thereby obtaining the desired behavior from the error gradients. With LSTMs, patterns can be remembered over long periods, giving them an advantage over other deep learning methods [18]. In LSTMs, information flows through the cell state, allowing some information to be retained while other information is forgotten. Unlike other methods, an LSTM modifies its cell state only through multiplication and addition. In addition, LSTM networks keep the backpropagated error nearly constant, so the network can learn dependencies even across a large number of time steps.
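The two output heads described above, softmax for multiclass fault classification and sigmoid for binary fault detection, can be sketched as follows. This is an illustrative sketch only: the helper names, the weight values, and the class labels used in the usage note are hypothetical, and the input `h` stands in for the LSTM layer output.

```python
import math


def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(W, b, h):
    """Fully connected layer: trainable weights times features plus bias."""
    return [sum(w * x for w, x in zip(row, h)) + b_k for row, b_k in zip(W, b)]


def classify_fault(h, W, b, labels):
    """Multiclass head: dense layer followed by softmax over the class labels."""
    probs = softmax(dense(W, b, h))
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs


def detect_fault(h, w, b, threshold=0.5):
    """Binary head: single sigmoid unit; the 0.5 threshold is an assumption."""
    p = sigmoid(sum(w_k * h_k for w_k, h_k in zip(w, h)) + b)
    return p >= threshold, p
```

For example, `classify_fault([1.0, 0.0], [[2.0, 0.0], [0.0, 2.0], [-1.0, 0.0]], [0.0, 0.0, 0.0], ["LG", "LL", "LLG"])` yields logits of 2, 0, and −1 and therefore selects the first label, with the returned probabilities summing to one.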

3 Proposed methodology

The relaying scheme proposed for fault detection and classification in the transmission system consists of three parts: input preparation, fault detection, and fault classification. The detailed architecture of the proposed methodology is shown in Fig. 2. The various steps incorporated in the relaying scheme design are discussed below.

1. Simulation of power system module: The IEEE 14 bus system has been simulated in the MATLAB/Simulink environment [19]. All the simulations are executed on a system having an Intel Core i5 processor, 3.4 GHz CPU speed, and 8 GB RAM. Various fault and no-fault cases have been simulated by varying system parameters such as fault location and impedance to test the efficacy of the proposed scheme.
2. Input preparation: The two-cycle post-fault three-phase voltage and current signals are collected from buses 1, 2, and 3. The root mean square (RMS) value of these signals is calculated using a one-cycle-long moving window. The recursive RMS voltage and current obtained are then utilized as input for the fault detection and classification modules.
3. Training modules: Two modules have been created, one for fault detection (FD) and one for fault type classification (FC), using the input RMS voltage and current values and their corresponding targets. The modules are trained using suitable LSTM parameters.
4. Testing modules: 20% of the entire dataset, selected at random, has been reserved for testing both modules. The testing data determine the efficacy of the trained modules in terms of various parameters such as accuracy.
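The random 80/20 partition described in the testing step can be sketched as below. The paper does not state how the split was implemented, so the function name, the fixed seed, and the rounding rule are assumptions for illustration.

```python
import random


def split_dataset(samples, test_fraction=0.2, seed=0):
    """Randomly reserve a fraction of the samples for testing.

    The seed is fixed only to make the sketch reproducible.
    """
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(samples) * test_fraction)
    test = [samples[k] for k in indices[:n_test]]
    train = [samples[k] for k in indices[n_test:]]
    return train, test
```

Every sample lands in exactly one of the two sets, so the testing data remain unseen during training.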

3.1 Power system model under study

In this paper, an IEEE 14 bus transmission system [20], shown in Fig. 3, has been utilized for fault detection and classification purposes. It is a 220/132 kV, 60 Hz system with fourteen buses, five synchronous generators located on buses 1, 2, 3, 6, and 8, twenty transmission lines, eleven loads, and various transformers. Of the five synchronous generators, three are synchronous condensers located on buses 3, 6, and 8. The 11 loads have a combined demand of 259 MW and 81.3 MVar. All the necessary system parameters, such as line lengths, are discussed in detail in [20]. The dataset has been generated by simulating the different types of symmetrical and unsymmetrical faults along with some no-fault cases such as load variation and switching events. The faults are simulated on the lines connecting buses 1 and 2, buses 2 and 3, and buses 1 and 5, and data from buses 1, 2, and 3 are recorded.

3.2 Preprocessing and data collection

The proposed scheme uses processed three-phase voltage and current signals collected from different buses. The sampling frequency used for data collection is 1.2 kHz. Therefore, the number of samples in each cycle is $\frac{1200}{60} = 20$. These signals are further processed by an RMS filter with a moving window of one cycle. The data from two cycles post-fault are then utilized for further processing. Figures 4 and 5 show how pre-processing is performed. The process is shown in detail for an AB-type fault occurring on line 1–5 at 20 km, simulated at t = 1 s. Bus 1 voltage and current signals are obtained, and the RMS is calculated. Finally, the RMS voltage and current signals of two cycles post-fault are used as input for the fault detection and classification modules.
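The pre-processing described above can be sketched in a few lines: a one-cycle moving-window RMS over a sampled signal, followed by extraction of the two post-fault cycles. This is a minimal sketch, not the Simulink RMS block used by the authors; the function names are hypothetical, and dividing by the number of samples seen so far while the window is still filling is an assumption.

```python
import math

FS_HZ = 1200                          # sampling frequency stated in the text
F0_HZ = 60                            # system frequency
SAMPLES_PER_CYCLE = FS_HZ // F0_HZ    # 20 samples per cycle


def moving_rms(signal, window=SAMPLES_PER_CYCLE):
    """RMS over a sliding window, updated recursively with a running sum."""
    out = []
    sq_sum = 0.0
    for k, v in enumerate(signal):
        sq_sum += v * v
        if k >= window:
            sq_sum -= signal[k - window] ** 2   # drop the sample leaving the window
        n = min(k + 1, window)                  # assumption: partial window at start
        out.append(math.sqrt(max(sq_sum, 0.0) / n))
    return out


def post_fault_window(rms_values, fault_index, cycles=2,
                      window=SAMPLES_PER_CYCLE):
    """Return the RMS samples covering `cycles` cycles after fault inception."""
    return rms_values[fault_index:fault_index + cycles * window]
```

For a unit-amplitude 60 Hz sine wave the RMS settles at $1/\sqrt{2}$ once the window is full, and two post-fault cycles correspond to 40 samples per signal at this sampling rate.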

To train and check the versatility of the proposed model, a variety of fault and no-fault cases are simulated. To achieve this, the modeled system is configured, and its parameters are adjusted. Tables 1 and 2 depict the possible configurations for fault and non-fault cases. All possible fault resistance configurations and fault inception angles are evaluated at different lines (1–2, 2–3, and 1–5) and locations of the system to test all different types of faults. Both LIF and HIF have been simulated.

Table 1 Fault cases used for validation of the proposed method
