Introduction

Power transmission systems are among the most widespread networks. Transmission lines span long distances and pass through varied terrain, so they are frequently subject to faults. Environmental conditions such as storms, snow and ice often initiate faults in these lines. Other factors, such as animals, birds, growing vegetation, weeds and tree branches, often cause short circuits between the three phases of the transmission line and ground. In most cases, the fault occurs between a single line and ground. Fast detection of a fault and identification of the faulted phase is one of the key research areas in power system protection. Classification of faults is therefore essential for identifying the disturbed line, which enables quick isolation of the faulted line. This restricts unwanted flow of electric power and prevents harm to personnel and damage to equipment. System stability is also maintained by the removal of the faulted line.

Researchers have worked on several methodologies for fault treatment [1]. Advances in soft computing and artificial intelligence-based techniques have paved the way for the development of digital relays. In this paper, a simple and effective PCA-based method for classifying power system faults is described. The method uses Principal Component Analysis (PCA) to develop a Principal Component Index (PCI) for each of the three phase fault signals over a half-cycle post-fault duration. Fault features of each of the ten classes are extracted using PCA and then analyzed using a two-level threshold-based scheme that separates the three fault disturbance levels and hence yields the classifier model. The proposed classifier is verified using diverse fault signals for faults conducted at various intermediate locations along the line. Fault resistance, another vital fault parameter, is also varied in steps to test the robustness of the classifier.

The proposed classifier is simple and requires very little computation, as it uses only a threshold-based classification model built on the PCI values; hence, it is fast and easily implementable in practical power system protection schemes. PCA is the only feature extraction tool used. Avoiding supervised learning models such as neural networks removes the need for a large training data set and the associated long training time. PCA is also a simpler analysis model than computationally heavier transform-based models using wavelet or Fourier transforms, which makes the overall analysis easier. Another major feature of the proposed work is that it takes into account practical variability such as varying fault resistance and inherent power line noise. Most importantly, the proposed fault classifier is shown to be insensitive to adverse noise levels. Gaussian white noise is added to the fault signals to produce noise-contaminated fault signals, and the noise level is varied in four steps by changing the SNR. The proposed model is studied down to an SNR of 15 dB, which represents a more severe noise level than that adopted in most studies. The effect of this adverse noise is considered even with simultaneous variation of fault location and fault resistance. Hence, an attempt has been made to design a robust fault classifier using the feature-extracting properties of PCA, followed by the proposed two-level threshold identifier, at this intense level of noise.
Overall, a very high classifier accuracy of 99.78% is obtained using a simple PCA-threshold-based scheme with reduced computational complexity. The analysis uses practical-like simulated fault signals incorporating simultaneous variation of fault location and fault resistance and, most importantly, a very high noise level of 15 dB SNR, which demonstrates the effectiveness of the proposed classifier.

The development of the algorithm is explained in steps using case studies, followed by testing of the model using unknown fault signals. Finally, a comparison with some of the existing methodologies is presented in the discussion section.

Background

Multivariate statistical methods such as Principal Component Analysis (PCA), as adopted in this work, have often been used to detect and classify faults in long transmission systems. PCA is one of the most useful tools for determining the directions of highest variance in a multivariate sample, in descending order of importance. PCA is widely applied in power system research because of its usefulness for such feature identification in multivariate data sets containing various combinations of voltage, current and power. Fault analysis in particular requires identifying the directions of variation of the fault signals of the three phases, which PCA converts into useful fault features. A further advantage of PCA is its inherent ability to reduce the effect of noise by retaining only the most important directions, in decreasing order of importance. Power system applications therefore find it very useful, especially in fault analysis, where electrical signals such as voltage and current are most often corrupted with power line noise. The proposed work also investigates the effectiveness of PCA for extracting fault features in an adverse noisy environment. PCA-based classifiers are relatively simple and are used heavily for fault analysis [2,3,4,5,6,7,8,9,10,11]. PCA is often used as a standalone feature extractor [2,3,4], while hybrid models combining PCA with other popular methodologies are also used extensively [5,6,7,8,9,10,11]. The researchers in [2] have used a symmetrical pattern-based application of PCA, while the authors of [3, 4] have used ratio-based analysis of the PCA features in terms of the PCA index for classification of faults. PCA is often combined with a supervised pattern recognition method such as the probabilistic neural network (PNN) for effective classification of power system faults [5, 6].
Signal differentiation is another method that has often been used for highlighting transient disturbances. The researchers in [7] have used this scheme, followed by analysis of the highlighted transients using PCA. The support vector machine (SVM), which has proven extremely effective for pattern recognition problems, has often been used in hybrid combination with PCA to develop effective classifiers [8]. Traveling wave-based fault analysis has long been practiced. It has also been used with PCA for effective protection algorithms [8, 9], often incorporating wavelets as an additional feature extractor [10]. Researchers have also tried to develop fault analysis algorithms that combine several other methodologies with PCA, as is done by the authors of [11], although incorporating diverse methodologies increases the computational complexity of the analysis. Apart from classification, the PCA feature analyzed here has also been applied to fault localization, used with best-fit analysis [12, 13].

Various other methodologies have also been applied in power system fault analysis for detection, classification and prediction of fault location in transmission lines. The artificial neural network (ANN), along with a major variant of the neural network family, the probabilistic neural network (PNN), has extensive applications in pattern recognition and is hence used widely in power system fault analysis [14, 15]. Supervised learning methods such as neural network-based approaches are quite accurate, but training the network requires a large and diverse data set for successful development of the model, which also entails a long training time. The wavelet transform (WT) has also been employed extensively in fault analysis [16], although WT suffers from computational intricacy, especially at higher levels of decomposition. The fuzzy inference system (FIS) is another method that has found good application in the same field. The investigators in [17, 18] have proposed a FIS-based multi-sensor data fusion approach for accurate fault analysis. Different combinations of ANN, WT and FIS have been applied very effectively in power system protection models. Hybrid methods such as combined WT and ANN approaches have been extremely popular, as they combine the accuracy of both methods [19]. The researchers in [20, 21] have developed WT and FIS-based models, while the researchers in [22, 23] have combined all three methods into a wavelet-neuro-fuzzy adaptive network (ANFIS) model to develop effective fault analysis schemes. ANFIS-based models are extremely effective, although the method is computationally heavier, as it combines supervised training with a mathematical transform-based technique.
Apart from these traditional methods of fault analysis, the support vector machine (SVM) has produced superior results, although SVM is sometimes prone to inaccuracy due to noise perturbation. SVM is often used standalone for extracting fault features [24, 25], as well as in hybrid models with other methodologies for developing the complete fault analyzer [26,27,28]. The researchers in [26] have proposed a discrete orthogonal S-transform (DOST)-based SVM analyzer, whereas the authors of [27] have used SVM with wavelet analysis, and those of [28] have described a hybrid model of SVM and radial basis function neural network. The PNN is very effective for pattern recognition problems and is hence used extensively for power system fault analysis, especially in classification-related research [15, 29]. PNN is also used with PCA to develop active fault classifier models [5, 6]. Traveling wave-based methods have also been applied by several researchers [30, 31], although the wave acquisition time depends on the location of the fault, which is a variable measure.

Recent research includes several modern applications of soft computing techniques. Phasor measurement unit (PMU)-based analysis has become extremely popular, especially during the last few years [32,33,34], although PMU applications often need additional devices installed at both ends of the line to acquire fault features simultaneously, which adds installation cost. Sequence network-based fault analysis schemes have long been practiced, and recent advances in soft computing have led to their latest developments [35, 36]. Advanced forms of neural network such as the extreme learning machine (ELM) [37] or the polynomial-based Chebyshev neural network (ChNN) [38], hybrid methods with artificial intelligence-based methods such as genetic algorithms [39], and mathematical morphology-based fault feature extraction using a median filter [40] have been recent additions to the fault analysis methodologies. Combined use of methods such as independent component analysis (ICA), traveling wave theory and SVM is shown in [41], and the researchers in [42] have used stationary wavelet transform (SWT), determinant function feature (DFF), SVM and support vector regression (SVR) to develop an accurate classifier. Several methods are still being developed to improve accuracy using shorter durations of fault signals, which would aid the development of effective digital relaying schemes.

Methods of Analysis

Simulation of Faults

The proposed work illustrates a simple method of fault classification in a long transmission line. A 132 kV three phase transmission line is modelled in ATPDraw; the line is very similar to the one described in [3, 4, 6, 7, 12, 13]. Fifteen individual blocks of 10 km each are connected in cascade to form the proposed 150 km long ‘JMarti’ model of the transmission line. Fault data are collected from the simulated model in the form of fault transient voltage and current for analysis with the proposed algorithm. The three phase transmission line has a planar configuration of the three lines, where each line is separated by 17.5 m. The height of the line at the tower is 27.9 m, and at mid-span it is 13 m. The line resistance is selected as 0.0585 Ω/km for each conductor. Two ground conductors with a resistance of 0.304 Ω/km are placed above the three phases at a mutual separation of 26.4 m, at a tower height of 41.05 m. The sampling rate is set to 2000 samples per cycle, which corresponds to 100 kHz. Ten fault prototypes, namely three single line to ground (SLG) faults, three double line (DL) faults, three double line to ground (DLG) faults and one three line (LLL) fault, are investigated for classification along with the healthy condition.
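As a quick check of the sampling figures, and assuming a 50 Hz system frequency (not stated explicitly above, but implied by 2000 samples per cycle corresponding to 100 kHz), the sampling rate and the length of a half-cycle analysis window work out as follows. The symbols f_s and N_{1/2} are introduced here only for illustration.

$$f_{s} = 2000\ \text{samples/cycle} \times 50\ \text{cycles/s} = 100\ \text{kHz}, \qquad N_{1/2} = \frac{2000}{2} = 1000\ \text{samples} = 10\ \text{ms}$$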

To make the simulation more realistic, Gaussian white noise is added to the fault signals in the MATLAB environment at the required signal to noise ratio (SNR). The noise is generated explicitly in software and added to the fault signals generated using ATP Analyzer. The noise level is selected as required by this work. Many studies have used the same Gaussian white noise [13, 14, 31]; hence, it is used in this work as well. Several papers also report that authors prefer a noise level between 75 and 20 dB [14, 31, 41] or, in a few cases, down to 15 dB. Hence, the SNR is varied here between 30 and 15 dB in steps of 5 dB as and when required. Altogether, this creates a more practical fault environment.
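The noise contamination step can be sketched as follows. This is a minimal Python illustration rather than the MATLAB procedure used in this work; the function name `add_awgn`, the synthetic sinusoid standing in for a phase current, and the choice to define SNR relative to the measured power of the clean signal are all assumptions.

```python
import numpy as np

def add_awgn(signal, snr_db, rng=None):
    """Return the signal plus zero-mean Gaussian white noise at the given SNR (dB)."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    # SNR(dB) = 10 log10(signal power / noise power)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

# Hypothetical stand-in for one cycle of a per unit phase current
# (2000 samples per cycle, as in the simulation described above).
n = np.arange(2000)
clean = np.sin(2 * np.pi * n / 2000)

rng = np.random.default_rng(0)
# SNR varied between 30 and 15 dB in steps of 5 dB
noisy_by_snr = {snr: add_awgn(clean, snr, rng) for snr in (30, 25, 20, 15)}
```

Measuring the power of `noisy_by_snr[snr] - clean` against that of `clean` gives a simple way to confirm that each contaminated signal sits close to its target SNR.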

Analysis of Fault Levels

The three phase current waveforms for all ten fault classes are studied carefully to extract features in terms of the Principal Component Index (PCI) values. It is observed that fault signals produce high frequency transient oscillations immediately on occurrence of a fault. These transients develop primarily in the directly affected phase(s) for all four categories of fault and less prominently in the other un-faulted lines. For ground faults such as SLG and DLG faults, the fault features appear in the un-faulted lines with a markedly minor impact: the circulation of zero sequence current through the grounded neutral and the grounded faulty line causes minor irregularities in the line currents of the un-grounded lines. These disturbances are nevertheless larger than the almost negligible disturbance of the un-faulted line for DL faults, where the less prominent zero sequence current flow leaves the un-faulted line almost undisturbed. Hence, for DL faults, disturbances are observed in the affected phases only. The three phase lines are therefore categorized into three distinct classes according to the level of disturbance produced in each phase by the fault. These classes are subsequently referred to as fault disturbance levels and are as follows:

  (i) Directly affected phase for all faults.

  (ii) Directly un-faulted phases for ground faults like SLG and DLG faults.

  (iii) Directly un-faulted phase for DL faults.

The key features of the faults, expressed in terms of the PCI values, are arranged in order so as to categorize them according to the above three classes. The fault transient signals are scaled before analysis in order to generalize the proposed scheme. These prepared signals are subsequently referred to as f(n). The phase A currents for three fault classes are shown in Fig. 1 as examples for identifying the above three categories.

Fig. 1 Phase A currents for three classes of faults: (a) AG, (b) BG and (c) BC faults conducted at 70 km with fault resistance of 10 Ω

It is observed from Fig. 1 that phase A is affected in three different ways, with three different levels of disturbance. Phase A is directly affected in the case of the AG fault and is hence the most disturbed. This is seen in Fig. 1a, where the per unit line current rises almost 6 times within 300 samples, i.e., 3 ms, after the fault occurs at a sample index of approximately 500. Figure 1b shows the line A waveform for the BG fault, where line A acts as the indirectly affected line for a ground fault. As expected, the effect of the fault is much less prominent for line A because the fault affects this line indirectly; yet a minor effect is observed due to the indirect connection to ground through the directly faulted line B. Finally, for the BC fault, as observed from Fig. 1c, line A is almost undisturbed by the fault. The only major disturbance observed in this line is due to noise. Hence, the three fault disturbance levels are clearly distinguished graphically in Fig. 1. Similar observations are found for the other phases, as well as for each phase under the different fault prototypes. The three phases of the ten fault prototypes studied in this work are categorized using the above concept, and the fault disturbance levels are illustrated in Fig. 2. These variations are also evident in the PCI values obtained at all fault locations, presented later in Table 1. Hence, this work attempts to differentiate the three fault disturbance levels directly, using two threshold values.

Fig. 2 Variation of fault disturbance levels for ten different fault prototypes

Table 1 Three phase PCI for different fault locations for ten classes of faults, with half cycle post-fault data, SNR = 30 dB

Effect of Variation of Fault Location

The robustness of the proposed algorithm is also validated using fault data corresponding to faults conducted at various intermediate locations along the line. Ten different faults are conducted initially at three intermediate equidistant points: 30 km, 70 km and 110 km from the source and the PCI values are observed.

In this work, only the first two principal components are considered. The data set is arranged so that the healthy signals are placed in the first column, followed by the fault signals corresponding to the different fault conditions. The PC scores (PCS) in the first two principal directions are computed, and the distance of the PCS of each fault signal from the PCS of the healthy condition is calculated. This distance is referred to as the PCI in this work. The PCI values thus indicate the extent of disturbance caused in the signals by the fault: the greater the disturbance, the higher the PCI, and vice versa.
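One possible reading of this PCI computation is sketched below in Python. It is a sketch under stated assumptions, not the exact procedure of this work: each signal (column) is treated as one PCA observation, the scores are obtained from an SVD of the centred data, and the distance is taken as Euclidean. The function name `principal_component_index` and the synthetic signals are hypothetical.

```python
import numpy as np

def principal_component_index(signals, n_components=2):
    """PCI of each fault signal relative to the healthy signal.

    signals: array of shape (n_samples, n_signals); column 0 holds the
    healthy signal and the remaining columns hold fault signals.
    Returns an array of length n_signals - 1.
    """
    # Assumption: each signal (column) is one observation for PCA.
    X = np.asarray(signals, dtype=float).T
    Xc = X - X.mean(axis=0)                      # centre every time sample
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, S.size)
    scores = U[:, :k] * S[:k]                    # PC scores, one row per signal
    # Assumption: Euclidean distance from the healthy-condition scores
    return np.linalg.norm(scores[1:] - scores[0], axis=1)

# Hypothetical half-cycle signals (1000 samples), for illustration only.
n = np.arange(1000)
healthy = np.sin(2 * np.pi * n / 2000)
mild = healthy + 0.05 * np.sin(2 * np.pi * n / 50)
severe = healthy + 3.0 * np.exp(-n / 300) * np.sin(2 * np.pi * n / 50)

pci = principal_component_index(np.column_stack([healthy, mild, severe]))
```

In this toy example the heavily disturbed signal yields a larger index than the mildly disturbed one, which matches the interpretation of PCI as a measure of deviation from the healthy condition.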

The variation of line A currents for variation of fault locations is observed as a prototype case for AG fault in Fig. 3, where it is observed that as the fault point moves away from the source, the magnitude of transients gradually reduces in proportion. This is also reflected from the PCI value from Table 1.

Fig. 3 Fault transients for line A current for AG fault conducted at three intermediate locations of 30, 70 and 110 km, fault resistance 10 Ω

The nature of transients discussed earlier, as well as shown from Fig. 1, is also found to comply with the PCI so found in Table 1. It is observed that the PCI for line A is much higher compared to the PCI of B and C for AG fault, since A is the most disturbed phase for AG fault. Other faults too are found to follow the inferences drawn earlier. It is further observed from Fig. 3 that as the fault moves away, the level of disturbance also reduces for a fixed fault resistance. Hence, it is well inferred that PCI values basically are a measure of the level of disturbance of the fault signal from the healthy condition. Other faults are also found to follow the disturbance levels as described from Fig. 3. The three levels of disturbances are also prominently observed from the same Table 1. It is further observed that the diminishing PCI characteristics with fault location are most prominently observed for the directly faulted lines, compared to the indirectly affected ones.

Effect of Variation of Fault Resistance Along with Fault Location, with Increased Noise Level

The variation of fault resistance is also studied in this work by setting it to 0.01, 1, 10 and 100 Ω in turn. The noise contamination is simultaneously increased to 15 dB SNR to observe the effect of noise under this increased distortion. Figure 4 shows the variation of line A current for AG fault as the fault resistance is varied at a particular fault location. The three phase PCI values for these four fault resistances and for three intermediate locations are given in Table 2.

Fig. 4 Fault transients for line A current for AG fault conducted at 70 km from the sending end, fault resistance is varied in four steps

Table 2 Three phase PCI for different combinations of fault location (Df) and fault resistance (Rf) for AG fault with half cycle post-fault data, SNR = 15 dB

Analysis and Algorithm Development

Comparative Analysis of the Observations

The observations from all the above tables are used to develop a classifier algorithm. It can be inferred from Tables 1 and 2 that the PCI of the directly affected line is very high for all faults, followed by a much lower PCI for the un-grounded lines of ground faults. Finally, the un-faulted lines for DL faults produce the lowest PCI, which justifies the classification scheme developed in Sect. 2.2. These observations hold irrespective of fault location and fault resistance, although the magnitude of PCI reduces gradually with increasing fault distance and with increasing fault resistance, in accordance with the observations from Figs. 3 and 4, respectively. A comparative analysis of the three phase PCI values is shown in Table 3, considering faults conducted at two more locations near the line terminals, 10 km and 140 km, in addition to 30, 70 and 110 km. Hence, the total set of fault locations used for developing the classifier becomes 10, 30, 70, 110 and 140 km. In addition, the fault resistance is varied within the range 0.01 to 100 Ω.

Table 3 The limiting values of three phase PCI considering variation of fault location and fault resistance, for four major fault categories including ten fault prototypes, using half cycle post-fault signal

The PCI values which are not required for developing the algorithm are marked as NR in Table 3. The distinct difference between the PCI ranges of the three fault disturbance levels is observed prominently from Table 1. These variations are again found justified even for variations in practically varying fault parameters like fault location and fault resistance. Hence, considering the above variations, the two intended threshold levels are developed as two intermediate boundaries between the three sets of fault disturbance levels so conceptualized. Figure 5 shows the method of separation of the three fault disturbance levels using the developed two threshold levels. The highlighted values from Table 3 are used to design these two threshold levels, which is illustrated graphically in Fig. 5.

Fig. 5 Segmentation of three levels of fault disturbance using two-level threshold of PCI values

It is observed that the directly faulted phases produce the highest oscillation, followed in order by the indirectly faulted phases of ground faults and the indirectly faulted phases of non-ground faults. Hence, the PCI-level boundaries appear in the same order in Fig. 5. The NR values in Table 3 are either the upper limits of the directly faulted phases or the lower limits of the indirectly faulted phases of DL faults, which behave almost like the no-fault condition. Fig. 5 also shows that these two limits are not required for developing the classifier thresholds. Instead, one threshold lies between the lower limit of the directly affected phases and the upper limit of the directly un-faulted phases for ground faults, and the other lies between the lower limit of the directly un-faulted phases for ground faults and the upper limit of the directly un-faulted phase for double line faults. The numerical PCI values of these limiting conditions can be read from Table 3 as well as from Fig. 5. Considering all these values, the classifier logic is designed as follows.

Fault classifier threshold selection

Two threshold values: θL and θH, as is observed from Fig. 5, are selected for the development of the algorithm.

θL is denoted as the lower threshold which corresponds to the maximum predicted obtainable PCI by the un-faulted line of DL faults or no-fault signals. Hence, un-faulted phases of DL faults are predicted to produce PCI less than θL.

Similarly, θH is termed as the upper or fault threshold which indicates the least predicted PCI level of the directly faulted lines for any fault. The directly faulted lines of all faults are expected to produce higher PCI than θH.

The values lying in the intermediate zone (between θL and θH) represent the PCI of the un-faulted lines of ground faults. Hence, the un-grounded lines of ground faults (DLG and SLG faults) are expected to produce PCI in this range.

These values are chosen based on a large set of observations considering diverse conditions of fault location and fault resistance. As mentioned already, fault is conducted at 10, 30, 70, 110 and 140 km with fault resistance varying in discrete steps of 0.01, 1, 10 and 100 Ω in order to obtain fault signals for developing the classifier. The following major findings were obtained from Table 3 as well as from Fig. 5:

Case a) The highest PCI of un-faulted line for all DL faults is 0.0742.

Case b) Also, the minimum value of PCI of any directly affected line considering all four major categories of faults is 26.6141 (for SLG fault).

Case c) The un-faulted lines in case of ground faults (SLG and DLG) produce PCI in the range 1.6832 to 7.3953.

A factor of safety is then imposed on these values to obtain the threshold limits: a tolerance of 25% is applied to each value. This alters the numerical PCI boundaries shown in Fig. 5, and the modified PCI values are shown in the modified PCI threshold diagram of Fig. 6.

Fig. 6 Segmentation of three levels of fault disturbance using two-level modified threshold of PCI values considering 25% tolerance level

The modified values of the limiting boundaries of the three fault disturbance levels are obtained from Fig. 6. Taking 25% above the upper limit obtained in case (a) gives 0.0742 × 1.25 = 0.0927, and taking 25% below the lower limit obtained in case (b) gives 26.6141 × 0.75 = 19.9606. The range of PCI in case (c) for ground faults, [1.6832: 7.3953], becomes [1.2624: 9.2441] after applying the same 25% factor of safety and still lies well within the modified lower and upper limits of 0.0927 and 19.9606, respectively. The thresholds θL and θH are therefore chosen following Fig. 6 as follows:

a) θL is selected in the intermediate range between 0.0927 and 1.2624 and is hence chosen as 0.5.

b) θH is selected in the intermediate range between 9.2441 and 19.9606 and is hence chosen as 15. Thus,

θL = 0.5 and θH = 15;
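The margin arithmetic and the position of the two thresholds can be checked with a short Python sketch. The limiting PCI values and the 25% tolerance are taken from the text; the variable names are illustrative only.

```python
TOLERANCE = 0.25  # 25% factor of safety, as stated in the text

# Limiting PCI values quoted in cases (a) to (c)
dl_unfaulted_max = 0.0742
ground_unfaulted_range = (1.6832, 7.3953)
direct_faulted_min = 26.6141

# Widen each band by the tolerance, towards its neighbouring band
low_band_top = dl_unfaulted_max * (1 + TOLERANCE)
mid_band = (ground_unfaulted_range[0] * (1 - TOLERANCE),
            ground_unfaulted_range[1] * (1 + TOLERANCE))
high_band_bottom = direct_faulted_min * (1 - TOLERANCE)

theta_l, theta_h = 0.5, 15.0

# Each threshold must fall in the gap between two adjacent widened bands.
theta_l_ok = low_band_top < theta_l < mid_band[0]
theta_h_ok = mid_band[1] < theta_h < high_band_bottom

print(f"gap for theta_L: ({low_band_top:.4f}, {mid_band[0]:.4f})")
print(f"gap for theta_H: ({mid_band[1]:.4f}, {high_band_bottom:.4f})")
print(theta_l_ok, theta_h_ok)
```

Both checks hold for the chosen values, which shows that the two thresholds sit inside the gaps even after the bands are widened.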

Thus,

  (a) If any line has PCI less than θL, the line is denoted as directly un-faulted phase for DL faults or no-fault,

  (b) If any line has PCI higher than θH, the line is denoted as directly affected phase for any fault, and

  (c) If any line has PCI in between θL and θH, the line is denoted as directly un-faulted phases for ground faults.

The fault classifier logic is then developed as follows:

  • Case 1: If the PCI for all three lines is less than θL, it is classified as no-fault.

  • Case 2: If the PCI for any two lines is higher than θH and one line is less than θL, it is detected as DL fault.

  • Case 3: If the PCI for any two lines is higher than θH and one line is in between θL and θH, it is detected as DLG fault.

  • Case 4: If the PCI for any one line is higher than θH and that of other two is less than θH, it is detected as SLG fault.

The sub-classifications of faults are done according to the logic developed in Table 4. This defines the final fault signatures for classification. The test PCI values PCIT for the three lines are compared to the two threshold levels θH and θL for fault classification. The basic classifier logic is illustrated in a flowchart in Fig. 7, and the sub-classification is done according to Table 4 as mentioned earlier.
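Cases 1 to 4 can be sketched as a small Python function. This is an illustrative sketch, not the implementation of Table 4 or Fig. 7: the function name `classify_fault`, the handling of values exactly equal to a threshold, the rule that three lines above θH indicate an LLL fault, the "undetermined" fallback, and the naming of faulted phases from the lines above θH are all assumptions. The default thresholds are the values selected at this stage.

```python
THETA_L = 0.5   # lower threshold selected in the text
THETA_H = 15.0  # upper threshold selected in the text

def classify_fault(pci_a, pci_b, pci_c, theta_l=THETA_L, theta_h=THETA_H):
    """Return (fault_type, faulted_phases) from the three phase PCI values."""
    pci = {"A": pci_a, "B": pci_b, "C": pci_c}
    faulted = "".join(p for p, v in pci.items() if v > theta_h)
    others = [v for v in pci.values() if v <= theta_h]

    if all(v < theta_l for v in pci.values()):
        return "no-fault", ""                      # Case 1
    if len(faulted) == 3:
        # Assumption: rule for the LLL class, not spelled out in Cases 1 to 4
        return "LLL", faulted
    if len(faulted) == 2:
        if others[0] < theta_l:
            return "DL", faulted                   # Case 2
        return "DLG", faulted                      # Case 3
    if len(faulted) == 1:
        return "SLG", faulted                      # Case 4
    # Assumption: no line exceeds theta_h but at least one exceeds theta_l
    return "undetermined", faulted

# Hypothetical test PCI values for lines A, B and C
print(classify_fault(30.0, 2.0, 3.0))
print(classify_fault(30.0, 40.0, 0.05))
```

Passing the thresholds as parameters keeps the same logic usable if θL is later adjusted for a different noise level.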

Table 4 Fault classifier logic development

Fig. 7 Flowchart of the proposed fault classifier

Noise Immunity of the Proposed Classifier

PCA is well known for reducing the effect of noise, as it identifies the principal directions of variation of a multivariate data set in descending order of importance. Since noise inherently has a much lower magnitude than the power frequency or fault frequency signals, its effect is naturally eliminated from these principal directions. This noise immunity is also investigated in this work. Table 5 compares the PCI of the direct standardized fault signals with that of their filtered form.

Table 5 Comparative analysis of PCI for direct and filtered signals for AG fault conducted at fault location 70 km with fault resistance 10Ω, with half cycle post-fault data, SNR = 30 dB

The above observations show that filtering does not affect the outcome of the PCA algorithm, as there is no remarkable change in the magnitudes of PCI, i.e., the PCI values for the direct and filtered signals differ insignificantly. Hence, one key advantage of the proposed PCA-based fault analyzer, as evident from Table 5, is that it reduces the computational burden by eliminating the need for filtering. The proposed algorithm is further tested under more adverse conditions of higher noise. The SNR is varied for this purpose to observe the variation in PCI. Figure 8 shows the line A current signal (per unit) under healthy condition for four different SNR levels. The PCI values of the direct and the filtered signals are presented in Table 6.

Fig. 8 Magnified view of line A per unit healthy current signal for variation in SNR in four levels

Table 6 Comparative analysis of PCI values of phase A for direct and filtered signals for AG fault conducted at fault location 70 km with fault resistance 1Ω, with half cycle post-fault data, SNR is varied in four stages

It can be inferred from Tables 5 and 6 that the impact of noise is handled well by PCA, as it retains only the most important directions of variation of the data and thus naturally discards the effect of low-magnitude noise. The filtered and unfiltered signals, when analyzed using PCA, produce very close outcomes, which is reflected in the PCI values of both tables. Hence, filtering becomes unnecessary with PCA, saving the time and computation associated with it. This shows the inherent property of PCA to ignore the effect of noise to a great extent.

Table 6 further shows that the proposed PCA-based analyzer produces robust fault features in terms of Principal Component Indices (PCI). The range of PCI values of the non-faulted A phase for the BC fault is highlighted in Table 6. It is observed from Table 6 that the average PCI of the non-faulted phase in DL faults varies from 0.048 (for SNR = 30 dB) to 0.36 (for SNR = 15 dB). This increase in PCI is naturally due to the higher level of noise. The lower threshold θL was previously selected as 0.5. It is found here that the PCI of the un-faulted line (line A) of the DL fault (BC fault) is less than 0.22 for 20 dB SNR, a level that is used in many practical simulations and in many studies [7]. The PCI rises only marginally above 0.36, to about 0.37, even for the very high noise level of 15 dB SNR. This still lies below the previously selected threshold of θL = 0.5 by a margin of approximately 26%, i.e., (0.5 − 0.37)/0.5. Since the PCI of the un-grounded lines of the ground faults does not differ much between the 30 dB and 15 dB cases, a further margin of 20% may be considered, and θL may safely be raised from the previously selected value of 0.5 to a new level of 0.8, in view of the new PCI magnitude of 0.37. This updated value of θL is still much less than 1.6832, which is the minimum PCI of the un-grounded lines of the ground faults given in Table 3 and shown in Fig. 6. θH, however, is kept at 15 in all cases, as it is not affected by minor noise disturbances. Hence, the threshold levels are finally set as:

$$\theta_{L} = \mathbf{0.8}\;\text{and}\;\theta_{H} = \mathbf{15}$$
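The two thresholds can be turned into a simple decision rule, sketched below under stated assumptions. The values θL = 0.8 and θH = 15 are from the text. The reading of the three levels is an interpretation: PCI below θL is taken as an unaffected phase, PCI between θL and θH as a healthy phase during a ground fault, and PCI at or above θH as a directly faulted phase. The function names, the level labels, the handling of values exactly on a threshold and the class labels are hypothetical and may differ from the authors' classifier.

```python
THETA_L = 0.8   # lower PCI threshold, from the text
THETA_H = 15.0  # upper PCI threshold, from the text


def disturbance_level(pci):
    """Map one phase's PCI to an assumed level label."""
    if pci < THETA_L:
        return "low"      # assumed: phase not affected
    if pci < THETA_H:
        return "medium"   # assumed: healthy phase disturbed by a ground fault
    return "high"         # assumed: directly faulted phase


def classify_fault(pci_a, pci_b, pci_c):
    """Return an assumed fault-class label from the three phase PCI values."""
    levels = {
        phase: disturbance_level(value)
        for phase, value in zip("ABC", (pci_a, pci_b, pci_c))
    }
    faulted = "".join(p for p in "ABC" if levels[p] == "high")
    if not faulted:
        return "no fault"
    if len(faulted) == 3:
        return "ABC"
    if len(faulted) == 1:
        return faulted + "G"  # a single faulted phase implies a line-to-ground fault
    # Two faulted phases: the third phase decides whether ground is involved.
    pair = "CA" if faulted == "AC" else faulted
    grounded = any(level == "medium" for level in levels.values())
    return pair + "G" if grounded else pair


if __name__ == "__main__":
    print(classify_fault(0.36, 30.0, 30.0))  # BC
    print(classify_fault(1.70, 30.0, 30.0))  # BCG
```

The rule needs only three comparisons per phase, which is consistent with the text's point that the threshold logic adds very little computation on top of PCA.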

Result and Discussion

The proposed classifier is tested for ten different fault classes over a wide variation in fault location and fault resistance and, most importantly, over a large variation in noise. Of the fourteen intermediate locations between 10 and 140 km, spaced 10 km apart along the 150 km long transmission line, fault data corresponding to 10, 30, 70, 110 and 140 km are used for designing the classifier, whereas the remaining nine locations, i.e., 20, 40, 50, 60, 80, 90, 100, 120 and 130 km, are used for testing it. Fault resistance is also varied within the specified range of 0.01 to 100 Ω. Finally, the noise level is changed in four steps of SNR to test the robustness of the classifier under practically simulated circumstances. Each fault class is tested for faults conducted at the above-mentioned nine fault locations with five different fault resistance values for each level of noise. Hence, the volume of the test dataset for each fault class becomes (9 locations × 5 fault resistances), i.e., 45, for each of the four levels of SNR. The classifier outcomes for the (45 observations × 10 fault classes), i.e., 450 unknown faults, are summarized in Table 7.
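The size of the test set can be checked with a short enumeration, sketched here under assumptions. The design and test locations are those stated above. The five resistance values are hypothetical placeholders, since the text gives only the 0.01 to 100 Ω range, and the ten class labels are the conventional names assumed for the ten fault classes.

```python
from itertools import product

ALL_LOCATIONS_KM = list(range(10, 150, 10))  # 10, 20, ..., 140 km
DESIGN_LOCATIONS_KM = [10, 30, 70, 110, 140]
TEST_LOCATIONS_KM = [
    km for km in ALL_LOCATIONS_KM if km not in DESIGN_LOCATIONS_KM
]

# Hypothetical steps: only the 0.01-100 ohm range is given in the text.
FAULT_RESISTANCES_OHM = [0.01, 1, 10, 50, 100]

# Assumed conventional labels for the ten fault classes.
FAULT_CLASSES = ["AG", "BG", "CG", "AB", "BC", "CA", "ABG", "BCG", "CAG", "ABC"]


def build_test_cases():
    """All (fault class, location, resistance) combinations for one SNR level."""
    return list(product(FAULT_CLASSES, TEST_LOCATIONS_KM, FAULT_RESISTANCES_OHM))


if __name__ == "__main__":
    cases = build_test_cases()
    print(len(TEST_LOCATIONS_KM), "test locations")
    print(len(cases) // len(FAULT_CLASSES), "cases per class")
    print(len(cases), "cases per SNR level")
```

With 450 cases per SNR level, a single misclassification would correspond to 449/450, or about 99.78%.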

Table 7 Classifier results for different levels of SNR for variation in fault location and fault resistance over the line span

Besides, the proposed model is also studied for variation in fault inception angle (FIA). Three discrete values of FIA, 0°, 45° and 90°, have been tested using this scheme, since each quarter cycle of a sinusoidal signal is symmetrical. It is found that this fault parameter has little impact on the results and, more importantly, hardly any impact on the determination of the faulty phase(s). All the directly affected lines were identified correctly even with the variation of FIA. Identification of the faulted phases is of utmost importance, since heavy power flows out through these faulted lines, causing system unbalance and possible damage to persons and equipment. This shows the robustness of the classifier against the variation of another practical parameter, the FIA.

Analysis of Results

A comparative analysis of the proposed method with some of the existing methods is given in Table 8. Fault classifier features such as classification accuracy, range of variation of fault resistance, noise level incorporated and post-fault detection duration are tabulated for several studies, which shows the utility of the proposed scheme, especially under severely noise-affected conditions.

Table 8 Comparative analysis of the present method with some of the existing researches

It can readily be inferred from the above table that the proposed PCA-threshold-based fault classifier yields effective results when the entire set of above-mentioned fault features is considered together. The major highlight of this research, the high noise tolerance of the method, is evident from the table. Most of the schemes use an SNR of 20 dB or more, whereas the proposed classifier is investigated at an SNR as low as 15 dB. At the same time, it produces an average classifier accuracy of 99.78%, which is very high compared to the existing schemes and is particularly significant given the low SNR investigated here. Some of the methods have considered only very low noise levels, i.e., an SNR of 40 dB or more [31, 41]; compared to these, the proposed method is examined in a considerably more adverse noisy environment.

The range of variation of other practical parameters like fault resistance is also comparable to that of several existing schemes. Besides, the post-fault signal duration required for detection of a fault is only half a cycle, which is comparable to most of the methods and better than some of the existing ones. Finally, the PCA-threshold-based classifier is computationally lighter than most of the existing methods, which use either supervised learning or computationally intensive mathematical or statistical analysis. This ease of computation makes the scheme more effective for practical implementation. The major highlights of this method are described in the following section.

Major Features of the Proposed Transmission Line Fault Classifier Model

The major highlights of the proposed classifier are as follows:

(i) The proposed model considers a practically simulated environment by varying two major fault parameters, fault location and fault resistance, in order to ascertain the robustness and flexibility of the model. Fault location is varied from 10 to 140 km, and fault resistance is varied between 0.01 and 100 Ω in steps. It is found that the proposed PCI threshold-based classifier is almost insensitive to the variations of these parameters.

(ii) Power system noise, another practical fault parameter, has also been considered as a major point of investigation in this work. It is evident from the above results that the proposed classifier produces very high classification accuracy. The model yields an accuracy of 99.78% even for a very high noise level of 15 dB SNR, which in normal practice is quite an adverse level of disturbance. Hence, the proposed model may be regarded as a robust classifier for detecting transmission line faults, even in an adverse noisy environment.

(iii) Another important feature of this classifier is that it classifies the SLG faults with 100% accuracy, even considering the practical variations and the high noise level of 15 dB SNR. This result is all the more important since 70–80% of the total fault cases in transmission lines are of SLG type.

(iv) The proposed method uses PCA as the major classifier tool, which is computationally light. Further, the classifier is based on two threshold limits selected using the PCI levels. The threshold-based classifier logic is also very simple, involving only a comparison of the test PCI with the prototype PCI fault signatures so developed. Thus, the proposed method requires considerably less time for computation of the fault class.

(v) PCA is inherently immune to noise, and hence the requirement of an additional filter is naturally eliminated, which also removes the computation associated with filtering of the signal.

(vi) Finally, direct application of PCA makes the analysis simpler. Besides, the absence of heavier soft computational analyses, including supervised learning algorithms or major statistical approaches, makes the proposed classifier highly effective for fast fault detection as well as classification.

Conclusion

The proposed fault classifier presents a simple and accurate scheme for classification of faults in a long transmission line. Principal Component Analysis (PCA) has been successfully applied to extract fault features in terms of the Principal Component Indices (PCI) using the fault transient oscillations of the line current signals. The proposed method is fast enough to detect and classify faults within half a cycle after the occurrence of a fault, with a classifier accuracy of 100% for SNR down to 20 dB. The method is further tested under more adverse conditions, with a higher noise level of 15 dB SNR, where it is still found to produce a high accuracy of 99.78%, a strong result at this high level of noise.

The proposed method uses PCA to extract the fault features, followed by a simple comparative analysis of the PCI so found using two threshold PCI levels. The threshold-based segmentation of fault disturbance levels makes the classifier scheme easy to implement and keeps its computational burden low. The method does not involve computationally heavier techniques such as neural networks, wavelet transforms or fuzzy inference. The effect of practical constraints like variation of fault resistance and fault location is also considered in this work, unlike in some other methods. Most importantly, the effect of an adverse level of noise is taken into consideration. A noise level of 15 dB SNR, which is considerably heavy for normal power flow operation, is investigated in this work. The present work achieves a very high classifier accuracy of 99.78% even at this high level of noise. Besides, the method described here requires only half a cycle of post-fault transient signal for classification of faults, which is comparable to, and even better than, some of the existing works. These major features make the proposed classifier practically suitable, even for heavily noise-corrupted fault signals.