1 Introduction

In recent years, the analysis and assessment of voltage stability have become a major concern in power system planning and operation, since voltage instability has been the cause of many power system blackouts around the world (Taylor 1994). The lack of new generation and transmission facilities and the overexploitation of existing facilities, driven by increased load demand, especially in the deregulated environment, force systems to operate closer to their security boundaries, which can lead to instability. Loading margin analysis has been identified as one of the fundamental measurements in voltage collapse or voltage stability studies. Voltage collapse is predicted to occur when the load is increased beyond the maximum loading point and the system subsequently loses its equilibrium.

Generally, the loading margin can be determined by two techniques, viz., the direct method and the Homotopy method. The direct method finds the critical point, where the Jacobian is singular, by solving an enlarged system of power flow equations in one step. The major drawback of this method is that it requires a good initial guess for successful convergence. It also doubles the number of equations to be solved. For more complicated practical nonlinear problems, direct methods may not work at all (Ajjarapu 2006).

The lack of a good initial guess can be tackled by the Homotopy method. This method first defines an easy problem for which a solution is known and then defines a path from the easy problem to the problem that needs to be solved. The solution of the easy problem, with which the Homotopy method starts, is gradually transformed into the solution of the hard problem. The Homotopy method uses a continuation equation whose Jacobian matrix is not singular at bifurcation points, hence it is numerically robust (Kundur 1994). The continuation power flow (CPF) approach is one of the Homotopy methods and uses a predictor–corrector scheme to locate the bifurcation point of the PV or QV curve (Ajjarapu 2006; Naoto Yorino 2005; Chiang et al. 1995; Canizares 2002). At some critical points (turning or fold points), the singularity of the Jacobian matrix often causes trouble in either the prediction or the correction process, because the selection of step size in the step length control scheme is problem dependent.

In Irisarri et al. (1997), an interior-point optimization method is used to determine optimal loading parameters for loading maximization. Multiple load flow solutions are used to obtain the minimum loadability margins in Yorino et al. (1997). An energy function-based approach is proposed to evaluate the loadability margins in a given loading direction in Klump and Overbye (1997) and Chiang and Ningqiang (2013). In Tare and Bijwe (1997), the authors maximize the loadability margin in a given loading direction by minimizing the reactive power losses near the critical loading point with a linear programming technique.

Thus, various techniques for estimating the loadability margin of power systems have been proposed related to voltage stability limits but all these techniques tend to depend on size and nonlinearity of the problem (Zambroni et al. 2011; Nagao et al. 1997; Van Cutsem 2002).

In recent years, research has been devoted to neural network applications in voltage security assessment and monitoring. The artificial neural network (ANN) is a highly efficient computational tool that could be used for online loadability evaluation. Multi-layered feedforward neural networks have been widely used for power margin estimation associated with static voltage stability limits by adopting different training criteria and algorithms. In Saikat and Benjamin (2007), Bahmanyar and Karami (2014), Wan and Song (1998) and Srivastava et al. (2000), the active and reactive powers and the bus voltage magnitudes and angles are used as the input attributes to the proposed ANN models. In Torre et al. (2007), the authors propose a methodology for loading margin estimation based on subtractive clustering and an adaptive neuro-fuzzy inference system, wherein various voltage stability indices are selected as inputs. This method has been shown to give good results under uncertain load behavior and hence can be implemented in a real-time environment. However, most of the artificial intelligence (AI)-based methods have failed to predict the voltage stability margin correctly, because they cannot find the global minimum accurately.

Ant colony optimization has been used in Kalil (2006), while Ismail and Titik (2003) utilized evolutionary programming. The results obtained using the multi-layer perceptron (MLP) associated with a hybrid particle swarm optimization technique have been compared with the results of the conventional CPF technique in El-Dib (2006). A fuzzy logic-based expert system has been used to determine the maximum loadability limit in Babulal et al. (2008), but it has not been shown to give global optima. In Suganyadevi and Babulal (2009), a comparative study of various voltage stability indices for the estimation of loadability margin is presented, which also gives other useful information such as the identification of the critical bus/line of a power system.

A new DEPSO algorithm combining the advantages of the differential evolution (DE) and particle swarm optimization (PSO) algorithms has been proposed in Gnanambal and Babulal (2012) to determine the maximum loading point, and it has provided more promising and accurate results than other evolutionary algorithms. In general, in any evolutionary technique, the time per iteration is greater and hence it cannot be directly applied to practical large-scale interconnected power system networks.

A broad review of the literature shows that the problem of loadability prediction still needs further investigation for real-time applications. In the operation of practical power systems, different loading scenarios may result in very different loadability margins for the same operating point, hence loadability margins should be predicted for any loading direction. Therefore, this paper presents a new methodology for the estimation of the voltage stability margin of a power system based on support vector regression (SVR).

The proposed method is tested on IEEE 30 bus and Indian 181 bus systems. The simulation results are compared with the results obtained by CPF and ANN, ESVM (Liu et al. 2008), ELM (Huang et al. 2006) and OS-ELM regression methods (Liang et al. 2006). The performance of SVR (Vapnik 1998) is compared with other algorithms on statistical measures like mean square error and average computation time.

FACTS controllers enhance the voltage profile and also the loadability margin of power systems (Natesan and Radman 2004; Gotham and Heydt 1998). FACTS devices are also used in power flow analysis (Enrique et al. 2004; Eberhart and Kennedy 1995). FACTS devices can be connected to a transmission line in various ways, such as in series, in shunt, or in a combination of series and shunt. For example, the static VAR compensator (SVC) and static synchronous compensator (STATCOM) are connected in shunt; the static synchronous series compensator (SSSC) and thyristor-controlled series capacitor (TCSC) are connected in series; the thyristor-controlled phase-shifting transformer (TCPST) and unified power flow controller (UPFC) are connected in a series and shunt combination. The behavior of the test system with and without FACTS devices under different loading conditions is studied.

The organization of the paper is as follows: Sect. 2 summarizes the prediction of the loadability margin of a power system during normal as well as contingency cases. Section 3 presents the SVM methodology, Sect. 4 the algorithm for the prediction of loadability margin, and Sect. 5 the test systems and the process of data generation. The simulation results of the proposed approach, including the comparison of SVR results with and without FACTS devices and with other algorithms, are given in Sect. 6. Finally, important conclusions are drawn in Sect. 7.

2 Prediction of loadability margin

Loadability margin is the distance, with respect to the loading parameter, from the current operating point to the voltage collapse point (Kundur 1994). The most common method to determine the loadability margin is the continuation power flow (CPF) technique. CPF employs a predictor–corrector scheme to find a solution path of a set of power flow equations reformulated to include a load parameter. It starts from a known solution and uses a tangent predictor to estimate a subsequent solution corresponding to a different value of the load parameter. This estimate is then corrected using Newton–Raphson power flow. The local parameterization provides a means of identifying each point along the solution path and plays an integral part in avoiding singularity of the Jacobian. Figure 1 shows a typical PV curve, which is recognized as an important tool for assessing voltage stability. The continuation method systematically increases the loading level, or bifurcation parameter, until the bifurcation or point of collapse is determined.

Fig. 1 PV curve and voltage stability limit
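To make the notion of a nose point concrete, the following minimal sketch locates the maximum loading of a lossless two-bus system (source voltage `e`, line reactance `x`, load scaled by the loading parameter). It is not the CPF predictor–corrector scheme: it simply bisects on the closed-form two-bus voltage equation, and the system data (`p0`, `q0`, `e`, `x`) are illustrative assumptions that do not come from the test systems in this paper.

```python
import math

def upper_voltage(lam, p0=1.0, q0=0.0, e=1.0, x=0.1):
    """High-voltage solution |V| (p.u.) at loading lam, or None beyond the nose."""
    p, q = lam * p0, lam * q0
    # Lossless two-bus model: V^4 + (2*Q*X - E^2)*V^2 + X^2*(P^2 + Q^2) = 0
    b = 2.0 * q * x - e ** 2
    c = x ** 2 * (p ** 2 + q ** 2)
    disc = b ** 2 - 4.0 * c
    if disc < 0.0:
        return None  # no real solution: load exceeds the maximum loading point
    return math.sqrt((-b + math.sqrt(disc)) / 2.0)

def max_loading(p0=1.0, q0=0.0, e=1.0, x=0.1, tol=1e-9):
    """Bisect on the loading parameter until the solution just ceases to exist."""
    lo, hi = 0.0, 1.0
    while upper_voltage(hi, p0, q0, e, x) is not None:
        lo, hi = hi, 2.0 * hi  # expand the bracket past the nose
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if upper_voltage(mid, p0, q0, e, x) is None:
            hi = mid
        else:
            lo = mid
    return lo

lam_max = max_loading()
```

For a unity power factor load this reproduces the textbook limit \(P_{\max } = E^2/(2X)\). On a realistic network no closed form exists, which is why a continuation method is needed there.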

2.1 LM index

In this paper, we define a Loadability Margin index (LM_Index), given by Eqs. (1) and (2), which indicates the distance from the current operating point to the voltage collapse point in terms of the loading parameter for a given system operating condition, with and without a specified contingency. Here \(\lambda _0\) denotes the loading parameter at the current operating point, and \(\lambda _{\mathrm{VC} (\mathrm{Pre})}\) and \(\lambda _{\mathrm{VC} (\mathrm{Post})}\) denote the loading parameter at the voltage collapse point before and after the contingency, respectively.

$$\begin{aligned} \mathrm{LM\_Index_{(Pre{\text {-}}Contingency)}} =\lambda _{\mathrm{VC} (\mathrm{Pre})} -\lambda _0 \end{aligned}$$
(1)
$$\begin{aligned} \mathrm{LM\_Index_{(Post{\text {-}}Contingency)}} =\lambda _{\mathrm{VC} (\mathrm{Post})} -\lambda _0 \end{aligned}$$
(2)
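Equations (1) and (2) amount to a single subtraction, as the short sketch below illustrates. The function name `lm_index` and the numerical values are hypothetical and serve only as an example; they are not results from this paper.

```python
def lm_index(lambda_vc, lambda_0):
    """Loadability margin index: distance from the operating point to voltage collapse."""
    if lambda_vc < lambda_0:
        raise ValueError("collapse point must not lie below the operating point")
    return lambda_vc - lambda_0

# Hypothetical loading parameters for one operating point
pre = lm_index(lambda_vc=2.5, lambda_0=1.0)   # pre-contingency, Eq. (1)
post = lm_index(lambda_vc=2.0, lambda_0=1.0)  # post-contingency, Eq. (2)
```

In this made-up case the outage lowers the collapse point, so the post-contingency index is smaller than the pre-contingency one.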

2.2 Loadability margin for pre-contingency

The load direction is not known in advance in real time; hence the present work considers seven different load scenarios under normal operating conditions, covering the whole spectrum of possible load directions. The scenarios considered are listed as follows:

  • Scenario 1: real power load alone is increased at a single load bus.

  • Scenario 2: real power load alone is increased at multiple load buses.

  • Scenario 3: reactive power load alone is increased at a single load bus.

  • Scenario 4: reactive power load alone is increased at multiple load buses.

  • Scenario 5: both real and reactive power loads are increased simultaneously at a single load bus for different loading factors.

  • Scenario 6: both real and reactive power loads are increased simultaneously at multiple load buses for different loading factors.

  • Scenario 7: both real and reactive power loads at all load buses are increased, keeping the power factor constant at 0.8 lagging.

The conventional CPF algorithm available in the PSAT toolbox (Federico 2009) is used for the above scenarios to generate a sufficient data set. The data set includes the real and reactive power loads at all buses and the corresponding loading parameter \(\lambda \).
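The seven scenarios differ only in which buses are scaled and whether the real load, the reactive load, or both are changed. A minimal sketch of how such load patterns might be built before being passed to a CPF run is given below. The helper names `apply_scenario` and `constant_pf_scenario`, the bus indices and the scaling factors are assumptions for illustration and are not part of PSAT.

```python
import math

def apply_scenario(p_load, q_load, buses, k_p=1.0, k_q=1.0):
    """Scale P by k_p and Q by k_q at the selected bus indices only (Scenarios 1-6)."""
    p_new, q_new = list(p_load), list(q_load)
    for i in buses:
        p_new[i] *= k_p
        q_new[i] *= k_q
    return p_new, q_new

def constant_pf_scenario(p_load, k, pf=0.8):
    """Scenario 7: scale P at all buses and set Q for a constant lagging power factor."""
    tan_phi = math.tan(math.acos(pf))  # Q/P ratio implied by the power factor
    p_new = [k * p for p in p_load]
    q_new = [p * tan_phi for p in p_new]
    return p_new, q_new

# Scenario 1: real power alone increased by 50 % at bus index 0 (illustrative data)
p1, q1 = apply_scenario([1.0, 2.0], [0.5, 1.0], buses=[0], k_p=1.5)
# Scenario 7: all loads doubled at 0.8 lagging power factor
p7, q7 = constant_pf_scenario([1.0, 2.0], k=2.0)
```

At a power factor of 0.8 lagging the reactive load is 0.75 times the real load, which is what `constant_pf_scenario` enforces.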

2.3 Loadability margin for post-contingency

Post-contingency is the state of the power system after a contingency has occurred. A contingency is the loss or failure of a small part of the power system (e.g., a transmission line), or an individual equipment failure (such as a generator or transformer). This is also called an unplanned “outage”.

Many possible outage conditions could happen in a power system. It is therefore necessary to study the system behavior under a large number of contingency cases, so that the power system operator can be given a warning signal to initiate appropriate corrective action and prevent serious damage or overload on other equipment. Studying all possible outages is difficult and time-consuming, yet the results must be presented quickly for corrective actions to be taken.

The post-contingency analysis is used as a tool for the offline analysis of contingency events, and as an online tool to show operators what would be the effects of future outages. It allows operators to be better prepared to react to outages using pre-planned recovery scenarios.

The sequence of steps carried out offline in applying regression approach for the prediction of loadability margin is depicted in Fig. 2.

Fig. 2 Loadability margin prediction framework

3 SVM methodology

The SVM has two special properties: (1) it achieves high generalization by maximizing the margin and (2) it supports efficient learning of nonlinear functions through the kernel trick. For classification, the SVM tries to find the optimal hyperplane, which is expressed as a linear combination of a subset of the training data (called support vectors), by solving a linearly constrained quadratic programming (QP) problem with a maximum margin between the two classes. Additionally, with the introduction of Vapnik’s \(\varepsilon \)-insensitive loss function, the SVM has been extended to solve nonlinear regression-estimation problems, called the SVM for regression (Jin-Tsong 2006).

SVM Regression (SVR) is a method to estimate a function that maps from an input object to a real number based on training data. Similar to SVM Classification (SVC), SVR has the same properties of the margin maximization and kernel trick for nonlinear mapping (Hwanjo and Sungchul 2010).

Figure 3 shows the structure of the SVR predictor, which realizes the mapping function. The basic idea of SVR is to map the data of the input space into a high dimensional feature space via a nonlinear mapping and to do linear regression in this space. The predictor takes the form \(\hat{x}_{t+1} =f(x_t ,x_{t-1} ,x_{t-2},\ldots ,x_{t-(m-1)} )\), in which \(\hat{x}_{t+1}\) is the predicted value and \(x_t ,\ldots ,x_{t-(m-1)}\) are the observed values.

Fig. 3 Structure of support vector regression predictor

A training set for regression is represented as follows.

$$\begin{aligned} D=\left\{ {( {x_1 ,y_1 }),( {x_2 ,y_2 })\ldots \ldots ( {x_m ,y_m })} \right\} \end{aligned}$$
(3)

where \(x_i\) is an \(n\)-dimensional vector and \(y_i\) is the real-valued target for each \(x_i\). The SVR function \(F(x_i)\) maps an input vector \(x_i\) to the target \(y_i\) and takes the form

$$\begin{aligned} F(x)=w*x+b \end{aligned}$$
(4)

where \(w\) is the weight vector and \(b\) is the bias. The goal is to estimate the parameters (\(w\) and \(b\)) of the function that give the best fit of the data. An SVR function \(F(x)\) approximates all pairs (\(x_i\), \(y_i\)) while keeping the differences between the estimated values and the real values within \(\varepsilon \) precision. That is, for every input vector \(x_i\) in \(D\),

$$\begin{aligned} y_i -w*x_i -b\le \varepsilon \end{aligned}$$
$$\begin{aligned} w*x_i +b-y_i \le \varepsilon \end{aligned}$$

The margin is

$$\begin{aligned} \mathrm{margin}=\frac{1}{\left\| w \right\| } \end{aligned}$$

By minimizing \(\left\| w \right\| ^2\) to maximize the margin, the training in SVR becomes a constrained optimization problem as follows.

$$\begin{aligned} \hbox {Minimize:} \ L(w)=\frac{1}{2}\left\| w \right\| ^2 \end{aligned}$$
(5)
$$\begin{aligned} \hbox {Subject to:} \ y_i -w*x_i -b\le \varepsilon \end{aligned}$$
(6)
$$\begin{aligned} w*x_i +b-y_i \le \varepsilon \end{aligned}$$
(7)

The solution of this problem does not allow any errors. To tolerate some errors and deal with noise in the training data, the soft margin SVR uses slack variables \(\xi \) and \({\hat{\xi }}\).

Then, the optimization problem can be revised as follows.

$$\begin{aligned} \hbox {Minimize:} \ L( {w,\xi })=\frac{1}{2}\left\| w \right\| ^2+C\sum \limits _i {\left( {\xi _i + {\hat{\xi }}_i}\right) } ,\quad C>0 \end{aligned}$$
(8)

Subject to:

$$\begin{aligned} y_i -w*x_i -b\le \varepsilon +\xi _i \quad \forall (x_i ,y_i )\in D \end{aligned}$$
(9)
$$\begin{aligned} w*x_i +b-y_i \le \varepsilon + {\hat{\xi }}_i \quad \forall (x_i ,y_i )\in D \end{aligned}$$
(10)
$$\begin{aligned} \xi _i ,{\hat{\xi }}_i \ge 0 \end{aligned}$$
(11)

The constant \(C >\) 0 is the trade-off parameter between the margin size and the amount of errors.

The slack variables \(\xi \) and \({\hat{\xi }}\) deal with infeasible constraints of the optimization problem by imposing a penalty on the excess deviations that are larger than \(\varepsilon \).
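The role of the slack variables can be illustrated numerically. The sketch below evaluates, for a fixed linear model, the smallest slacks that satisfy constraints (9) to (11) and the resulting soft margin objective. It assumes \(F(x)=w \cdot x+b\) and a penalty that is linear in the slacks, \(C\sum _i (\xi _i +{\hat{\xi }}_i)\); the data and parameter values are made up for illustration.

```python
def predict_linear(w, b, x):
    """Linear SVR function F(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def slacks(w, b, x, y, eps):
    """Smallest (xi, xi_hat) satisfying the two epsilon-tube constraints for one sample."""
    residual = y - predict_linear(w, b, x)
    xi = max(0.0, residual - eps)       # target lies above the tube
    xi_hat = max(0.0, -residual - eps)  # target lies below the tube
    return xi, xi_hat

def primal_objective(w, b, data, eps, c):
    """0.5 * ||w||^2 + C * sum of slacks (linear slack penalty assumed)."""
    penalty = sum(sum(slacks(w, b, x, y, eps)) for x, y in data)
    return 0.5 * sum(wi * wi for wi in w) + c * penalty

# Three illustrative samples: inside the tube, above it, and below it
data = [([1.0], 1.05), ([2.0], 2.5), ([3.0], 2.0)]
objective = primal_objective(w=[1.0], b=0.0, data=data, eps=0.1, c=10.0)
```

Only the second and third samples contribute to the penalty, which is the \(\varepsilon \)-insensitive behavior described above: deviations smaller than \(\varepsilon \) cost nothing.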

To solve the optimization problem Eq. (8), we can construct a Lagrange function from the objective function with Lagrange multipliers as follows:

$$\begin{aligned} \hbox {minimize:} \ L&= \frac{1}{2}\left\| w \right\| ^2+C\sum \limits _i {\left( {\xi _i +{\hat{\xi }}_i }\right) } -\sum \limits _i {\left( {\eta _i \xi _i +{\hat{\eta }}_i {\hat{\xi }}_i }\right) }\nonumber \\&\quad -\sum \limits _i {\alpha _i \left( {\varepsilon +\xi _i -y_i +w \cdot x_i +b}\right) }\nonumber \\&\quad -\sum \limits _i {{\hat{\alpha }}_i \left( {\varepsilon +{\hat{\xi }}_i +y_i -w \cdot x_i -b}\right) } \end{aligned}$$
(12)
$$\begin{aligned} \hbox {subject to:} \ \eta _i ,{\hat{\eta }}_i \ge 0 \end{aligned}$$
(13)
$$\begin{aligned} \alpha _i ,{\hat{\alpha }} _i \ge 0 \end{aligned}$$
(14)

where \(\eta _i , \, {\hat{\eta }}_i, \, \alpha _i ,{\hat{\alpha }}_i\) are the Lagrange multipliers, which satisfy non-negativity constraints.

The saddle point is found by setting the partial derivatives of \(L\) with respect to the primal variables (\(b\), \(w\) and the slack variables) to zero.

$$\begin{aligned} \frac{\partial L}{\partial b}=\sum \limits _i {\left( {\hat{\alpha }_i -\alpha _i }\right) =0} \end{aligned}$$
(15)
$$\begin{aligned} \frac{\partial L}{\partial w}=w-\sum \limits _i {\left( {\alpha _i -\hat{\alpha }_i }\right) x_i =0} ,w=\sum \limits _i {\left( {\alpha _i -\hat{\alpha }_i }\right) x_i }\nonumber \\ \end{aligned}$$
(16)
$$\begin{aligned} \frac{\partial L}{\partial {\hat{\xi }_i}}=C-\hat{\alpha }_i -\hat{\eta }_i =0,\hat{\eta }_i =C-\hat{\alpha }_i \end{aligned}$$
(17)
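Equation (17) has a counterpart for the unhatted slack variable, which is the source of the relation between \(\eta _i\) and \(\alpha _i\) used later. Assuming the slack variables enter the objective linearly, as the form of Eq. (17) implies, the same differentiation with respect to \(\xi _i\) gives:

$$\begin{aligned} \frac{\partial L}{\partial \xi _i}=C-\alpha _i -\eta _i =0,\quad \eta _i =C-\alpha _i \end{aligned}$$

Because \(\eta _i \ge 0\) and \({\hat{\eta }}_i \ge 0\), these two conditions imply \(\alpha _i \le C\) and \({\hat{\alpha }}_i \le C\), which is where the upper bound in the dual constraints comes from.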

The optimization problem with inequality constraints can be changed into the following dual optimization problem by substituting Eqs. (15), (16) and (17) in (12).

$$\begin{aligned}&\hbox {maximize: }L(\alpha )=\sum \limits _i {y_i \left( {\alpha _i-\hat{\alpha }_i }\right) } -\varepsilon \sum \limits _i {\left( {\alpha _i +\hat{\alpha }_i }\right) }\nonumber \\&\quad -\frac{1}{2}\sum \limits _i {\sum \limits _j {\left( {\alpha _i -\hat{\alpha }_i }\right) \left( {\alpha _j -\hat{\alpha }_j }\right) x_i \cdot x_j } } \end{aligned}$$
(18)
$$\begin{aligned} \hbox {subject to:}\sum \limits _i {\left( {\alpha _i -\hat{\alpha }_i }\right) } =0 \end{aligned}$$
(19)
$$\begin{aligned} 0\le \alpha ,\hat{\alpha }\le C \end{aligned}$$
(20)

The dual variables \(\eta _i \), \(\hat{\eta }_i \) are eliminated in rewriting Eq. (12) as Eq. (18). The stationarity conditions can be rewritten as follows.

$$\begin{aligned}&w=\sum \limits _i {\left( {\alpha _i -\hat{\alpha }_i }\right) } x_i\end{aligned}$$
(21)
$$\begin{aligned}&\eta _i =C-\alpha _i\end{aligned}$$
(22)
$$\begin{aligned}&\hat{\eta }_i =C-\hat{\alpha }_i \end{aligned}$$
(23)

where \(w\) is represented by a linear combination of the training vectors \(x_i\). Accordingly, the SVR function \(F(x)\) becomes the following function.

$$\begin{aligned} F(x)=\sum \limits _i {\left( {\alpha _i -\hat{\alpha }_i }\right) } x_i \cdot x +b \end{aligned}$$
(24)

Equation (24) can map the training vectors to real target values while allowing some errors, but it cannot handle the nonlinear SVR case. The same kernel trick as in classification can be applied by replacing the inner product of two vectors \(x_i\) and \(x_j\) with a kernel function \(K(x_i, x_j)\). The transformed feature space is usually high dimensional, and the SVR function that is linear in this space becomes nonlinear in the original input space. Using the kernel function \(K\), the inner product in the transformed feature space can be computed as fast as the inner product \(x_i \cdot x_j\) in the original input space. Once the original inner product is replaced with a kernel function \(K\), the remaining process for solving the optimization problem is very similar to that for the linear SVR. The dual objective can be rewritten using the kernel function as follows.

$$\begin{aligned}&\hbox {maximize:} \ L(\alpha )=\sum \limits _i {y_i \left( {\alpha _i -\hat{\alpha }_i }\right) } -\varepsilon \sum \limits _i {\left( {\alpha _i +\hat{\alpha }_i }\right) }\nonumber \\&\quad -\frac{1}{2}\sum \limits _i {\sum \limits _j {\left( {\alpha _i -\hat{\alpha }_i }\right) \left( {\alpha _j -\hat{\alpha }_j }\right) K(x_i ,x_j) } }\end{aligned}$$
(25)
$$\begin{aligned}&\hbox {subject to:} \ \sum \limits _i {( {\alpha _i -\hat{\alpha }_i })} =0\end{aligned}$$
(26)
$$\begin{aligned}&\hat{\alpha }_i \ge 0,\alpha _i \ge 0\end{aligned}$$
(27)
$$\begin{aligned}&0\le \alpha ,\hat{\alpha }\le C \end{aligned}$$
(28)

Finally, the SVR function \(F\)(x) becomes the following using the kernel function.

$$\begin{aligned} F(x)=\sum \limits _i {\left( {\alpha _i -\hat{\alpha }_i }\right) } K(x_i ,x) +b \end{aligned}$$
(29)
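Once the dual coefficients are known, evaluating the kernel form of \(F(x)\) is a weighted sum over the support vectors. The following sketch reads the kernel term of Eq. (29) as \(K(x_i ,x)\) and assumes an RBF kernel of the form \(K(x,z)=\exp (-\gamma \Vert x-z\Vert ^2)\), which is a common convention but is not spelled out in this section. The support vectors, coefficients and bias below are invented for illustration and are not trained values.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2) (assumed form)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def svr_predict(x, support_vectors, dual_coefs, b, gamma):
    """F(x) = sum_i (alpha_i - alpha_hat_i) * K(x_i, x) + b."""
    total = 0.0
    for sv, coef in zip(support_vectors, dual_coefs):
        total += coef * rbf_kernel(sv, x, gamma)  # coef stands for alpha_i - alpha_hat_i
    return total + b

# Illustrative model with two support vectors
svs = [[0.0, 0.0], [1.0, 1.0]]
coefs = [0.5, -0.25]
y_hat = svr_predict([0.0, 0.0], svs, coefs, b=0.1, gamma=0.2)
```

Only training points with a nonzero coefficient contribute to the sum, so prediction cost scales with the number of support vectors rather than with the size of the training set.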

4 Algorithm for predicting loadability margin

The detailed algorithm for the prediction of loadability margin of a power system is given as follows:

  1. Run the test system data in PSAT for all seven loading scenarios.

  2. From the CPF results, create a database with the input vector in the form [P\(_\mathrm{G}\), P\(_\mathrm{L}\), Q\(_\mathrm{G}\), Q\(_\mathrm{L}\)], where P\(_\mathrm{G}\), P\(_\mathrm{L}\), Q\(_\mathrm{G}\) and Q\(_\mathrm{L}\) are the real and reactive powers at the generator and load buses, respectively, and the output in the form of the scalar \(\lambda \) (loading margin) for the corresponding input vectors, as shown in module 1 of Figs. 4 and 5.

  3. Choose the kernel type, the kernel parameters and the SVR parameters (\(C\) and \(\gamma \)) used to train the SVR network.

  4. Train the SVR network using the training data set. Test the accuracy of the regression model on unseen test samples and verify the predicted loadability margin value.

  5. Compare the loadability margin predicted by SVR with that of the other algorithms and the conventional CPF technique in terms of computational time and mean squared error (MSE), as given in Eq. (30).

$$\begin{aligned} \mathrm{MSE}=\sum \limits _{i=1}^n {\frac{(X_i -Y_i )^2}{n}} \end{aligned}$$
(30)

where \(X_i\) is the predicted loading margin value, \(Y_i\) is the target value (i.e., the loading margin from CPF) and \(n\) is the number of samples.
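Equation (30) translates directly into a few lines of code. The sketch below is a plain implementation of the formula; the sample values are illustrative and are not results from the test systems.

```python
def mean_squared_error(predicted, target):
    """MSE = sum_i (X_i - Y_i)^2 / n, with X the predictions and Y the CPF targets."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predicted)
    return sum((x - y) ** 2 for x, y in zip(predicted, target)) / n

# Illustrative loading margins: the third prediction is off by 2
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```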

Fig. 4 Structure of SVR model

Fig. 5 Nose curve of IEEE 30 bus system

Here, \(\left[ P_{li,1} ,P_{li,2} ,\ldots ,P_{li,n}\right] \) is the real power load vector of the \(i\)th bus for \(n\) patterns and \(\left[ Q_{li,1} ,Q_{li,2} ,\ldots ,Q_{li,n}\right] \) is the reactive power load vector of the \(i\)th bus for \(n\) patterns.

The structure of the proposed SVR model consists of module 1 and module 2, as shown in Fig. 4. In module 1, the continuation power flow method is used for the various loading scenarios of the test systems. The P and Q values from module 1 are given as input to the different SVRs of module 2. These SVR models estimate the loadability margin of the given power system.

5 Test systems and data generation

The proposed algorithm (explained in Sect. 4) is applied to the IEEE 30 bus and Indian 181 bus systems. The numerical data for IEEE 30 bus and Indian 181 bus systems are taken from Power System Test Archive-UWEE and Babulal and Kannan (2006), respectively. In machine learning approaches, the generated data set must adequately represent the entire range of power system operating states.

5.1 Pre-contingency cases

The input and output patterns can be generated either from real-time measurements or from offline simulation. In this work, a large number of operating points are generated through offline simulation using the CPF method in PSAT for the IEEE 30 bus system and the CPF method in MATPOWER (Zimmerman et al. 2011) for the Indian 181 bus system. Figures 5 and 6 show the nose curves (loading parameter versus voltage) of the IEEE 30 bus and Indian 181 bus systems, respectively. These curves are obtained using CPF for all possible loading scenarios by increasing the real and reactive power load from its base case condition.

Fig. 6 Nose curve of Indian 181 bus system

In the IEEE 30 bus system, 600 patterns were generated by varying the real and reactive power loads randomly from their base case values to 150 %. Out of the 600 patterns generated for the IEEE 30 bus system, 80 % (480 patterns) are selected arbitrarily for training, while the remaining 20 % (120 patterns) are used for testing. In the Indian 181 bus test system, 750 patterns were generated by changing the loading at each bus randomly over a wide range (\(\pm \)50 % of base case). Thus, 271,500 data samples are used for the simulation study. Out of the 750 patterns, 80 % (600 patterns) are taken for training and the remaining 20 % (150 patterns) for testing. The size of the data set adopted for the training and testing phases under pre-contingency cases is shown in Table 1.
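The 80/20 split described above can be reproduced with a shuffled index list. The sketch below assumes a simple random shuffle with a fixed seed; how the patterns were actually selected in the original study is not stated beyond "arbitrarily", so the function `split_patterns` and the seed are illustrative assumptions.

```python
import random

def split_patterns(n_patterns, train_fraction=0.8, seed=0):
    """Return (train_indices, test_indices) after a seeded random shuffle."""
    indices = list(range(n_patterns))
    random.Random(seed).shuffle(indices)  # seed fixed only for repeatability
    n_train = round(n_patterns * train_fraction)
    return indices[:n_train], indices[n_train:]

train_30, test_30 = split_patterns(600)    # IEEE 30 bus, pre-contingency
train_181, test_181 = split_patterns(750)  # Indian 181 bus, pre-contingency

# 750 patterns x 181 buses x 2 quantities (P and Q) per bus
n_samples_181 = 750 * 181 * 2
```

The last line reproduces the figure of 271,500 data samples quoted for the Indian 181 bus system, under the reading that each pattern holds one P and one Q value per bus.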

Table 1 Data size of pre-contingency and post-contingency cases

5.2 Post-contingency cases

For loading margin estimation in the contingency cases, a large number of load patterns are generated for all single line outages and selected double line outages in the power system, to capture all possible scenarios. In the IEEE 30 bus system, 450 patterns were generated for all single line outages and some double line outages. In the Indian 181 bus test system, 480 patterns were generated for N-1 contingency scenarios. Out of the 450 patterns generated for the IEEE 30 bus system, 80 % (360 patterns) are selected arbitrarily for training, while the remaining 20 % (90 patterns) are used for testing. Similarly, for the Indian 181 bus system, out of 480 patterns, 80 % (384 patterns) are taken for training and the remaining 20 % (96 patterns) for testing. The size of the data set adopted for the training and testing phases under normal as well as contingency cases is shown in Table 1.

6 Simulation results and discussion

6.1 Selection of kernel and SVR parameters using tenfold cross-validation approach

LIBSVM (Babulal and Kannan 2006) and MATLAB routines are used for training the SVMs in both classification and regression. The training performance of the SVR module depends on proper selection of the SVR parameters, namely the cost parameter \(C\), the kernel parameter \(\gamma \) and the kernel type. The kernel types considered for SVM regression are the RBF, linear, polynomial and Gaussian kernels. This paper uses the RBF kernel because of its superiority (Suganyadevi and Babulal 2013). The best combination of \(C\) and \(\gamma \) is often selected by a grid search with exponentially growing sequences of \(C\) and \(\gamma \), for example, \(C\in \left\{ {2^{-5},2^{-3},\ldots ,2^{13},2^{15}} \right\} \); \(\gamma \in \left\{ {2^{-15},2^{-13},\ldots ,2^1,2^3} \right\} \). Typically, every combination of parameter choices is checked using cross-validation, and the parameters with the best cross-validation accuracy are selected. Improper selection of these two parameters can lead to over-fitting or under-fitting. A tenfold cross-validation approach within the grid-search method is used to determine the optimal values of \(C\) and \(\gamma \) so that the regression model can accurately predict unknown data (Kalyani and Swarup 2013). The highest cross-validation accuracy of 93 % is obtained for \(C=100\) and \(\gamma = 0.2\).
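The grid search with tenfold cross-validation can be sketched independently of any particular SVR library. In the code below, `exponent_grid` builds the exponential sequences quoted above, `kfold_indices` produces the fold index lists, and `grid_search` picks the best pair according to a user-supplied scoring function. The scoring function `toy_score` is only a placeholder with a known peak; in practice it would train an SVR on each training fold and return the mean validation accuracy. All function names are hypothetical.

```python
import math

def exponent_grid(start, stop, step=2):
    """Powers of two 2^start, 2^(start+step), ..., 2^stop."""
    return [2.0 ** e for e in range(start, stop + 1, step)]

def kfold_indices(n_samples, k=10):
    """Yield (train, validation) index lists for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size

def grid_search(score_fn, c_values, gamma_values):
    """Return (best_score, best_C, best_gamma) over the full grid."""
    best = None
    for c in c_values:
        for gamma in gamma_values:
            score = score_fn(c, gamma)
            if best is None or score > best[0]:
                best = (score, c, gamma)
    return best

def toy_score(c, gamma):
    # Placeholder with a peak at C = 2^7, gamma = 2^-3; NOT a real cross-validation score
    return -abs(math.log2(c) - 7) - abs(math.log2(gamma) + 3)

c_grid = exponent_grid(-5, 15)       # 2^-5, 2^-3, ..., 2^15
gamma_grid = exponent_grid(-15, 3)   # 2^-15, 2^-13, ..., 2^3
folds = list(kfold_indices(480, k=10))
best = grid_search(toy_score, c_grid, gamma_grid)
```

With 480 training patterns, each of the ten folds holds out 48 patterns for validation and trains on the other 432. A coarse power-of-two grid like this is often followed by a finer search around the best cell.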

6.2 Comparison of various regression model’s loadability margin without FACTS devices

This section compares the performance of ANN, ELM, OS-ELM and ESVM with that of SVM in predicting the loadability margin of the IEEE 30 bus and Indian 181 bus systems without FACTS devices. The ANN is designed with an input layer of three neurons, a hidden layer of five neurons and an output layer of one neuron. The Levenberg–Marquardt backpropagation training function is used. The network is trained for up to 300 epochs to an error goal of 0.000001. In the ELM and OS-ELM algorithms, the RBF activation function is used to compute the hidden layer output matrix. A sigmoidal activation function is used for ESVM.

The proposed SVM model is used to determine the loadability margin and the results in terms of training time, testing time, training MSE and testing MSE for both pre-contingency and post-contingency cases are compared and shown in Tables 2 and 3. Clearly from the tables, the performance of SVM is better than others. Hence the proposed SVM regression model may be suitable for online implementation. As SVM predicts quickly, it may allow the operator to monitor the power system stability from time to time and take appropriate control and preventive actions accordingly.

Table 2 Comparative analysis of various regression models for LM estimation for IEEE 30 bus test system
Table 3 Comparative analysis of various regression models for LM estimation for Indian 181 bus test system

6.3 Prediction of VSM after placing FACTS devices

The best location for shunt reactive power compensation, as far as the improvement of the voltage stability margin is concerned, is the weakest bus of the system. The weakest bus of the system can be identified using nodal voltage stability indices (Suganyadevi and Babulal 2009). Introducing shunt compensation devices at this location will improve the LM. In the IEEE 30 bus system, bus number 26 is identified as the weakest bus; injecting 2.5 p.u. of reactive power at this bus from an SVC increases the loading margin by 33.34 %. Similarly, in the Indian 181 bus system, an SVC (reactive power 3.75 p.u.) connected at the weakest bus, number 109, increases the loading margin by 37.14 %. The enhancement of LM is also studied with the introduction of a TCSC in the weakest transmission line, identified by the line voltage stability indices (Suganyadevi and Babulal 2009). In the IEEE 30 bus system, 20 % of line reactance compensation is inserted in the weakest line, number 34. Increases in loading of \(P=1.5~\%\) and \(Q=1.75~\%\) are obtained due to the inclusion of the TCSC. Similarly, for the Indian 181 bus system, a TCSC (30 % of the line reactance of the weakest line, between buses 110 and 141) is connected, and improvements in loading of \(P=2.89~\%\) and \(Q=3.52~\%\) are obtained. The nose curves of the IEEE 30 bus and Indian 181 bus systems after placing an SVC at their weakest buses are shown in Figs. 7 and 8, respectively.

Fig. 7 Nose curve of IEEE 30 bus system after placing SVC at 26th bus

Fig. 8 Nose curve of Indian 181 bus system after placing SVC at 109th bus

The enhanced LM values with SVC and TCSC are determined, and the results in terms of training time, testing time and testing MSE are compared and presented for the IEEE 30 bus and Indian 181 bus systems in Tables 4 and 5, respectively. The tables show clearly that the proposed SVM model estimates the same loadability margin as obtained by the other techniques with greater accuracy. The training time of the ELM network is slightly higher than that of SVM and ANN. The SVM training and testing times are lower, and its predictions more accurate, than those of the ANN model. The results show that the SVM predicts the result quickly compared with the other models, with a computational time in the order of 10\(^{-4}\).

Table 4 Comparison of various regression models by placing FACTS devices for IEEE 30 bus test system
Table 5 Comparison of various regression models by placing FACTS devices for Indian 181 bus test system

Figure 9 shows the variation of LM for the base case, contingency, TCSC and SVC cases of the IEEE 30 bus system for some testing patterns obtained from the SVM models. The SVC gives better voltage stability than the series compensation device TCSC, i.e., the LM of the system with SVC is higher than that with TCSC. A shunt compensation device injects reactive power at the weakest bus to which it is connected, whereas a series compensation device inserts reactive power in the line to which it is connected (the weakest line). This shows that the IEEE 30 bus system needs more reactive power at the load bus than in the line. The weakest bus, bus 26, requires more reactive power, and injection of reactive power at bus 26 or in its vicinity can improve the voltage stability margin. Similarly, the variation of LM for the base case, contingency, TCSC and SVC cases of the Indian 181 bus system for some testing patterns obtained from the SVM models is shown in Fig. 10. The weakest bus, bus 109, requires more reactive power, and injection of reactive power at bus 109 or in its vicinity can improve the voltage stability margin.

Fig. 9 Variation of LM for IEEE 30 bus system with and without FACTS devices

Fig. 10 LM of Indian 181 bus test system with and without FACTS devices

7 Conclusion

This paper has presented a new SVM model for online prediction of the loadability margin of a power system, leading to fast voltage stability assessment. The comparative results of LM for the IEEE 30 bus and Indian 181 bus systems under normal as well as post-contingency cases demonstrate the efficacy of the SVM model for online estimation. This, in turn, will help the power system operator to take the necessary control actions at the appropriate time, thereby preventing voltage collapse and system blackout. The proposed SVM model is well trained to predict the voltage stability margin in a short time frame for the considered power systems. Injection of reactive power at the weakest bus using an SVC improves the LM more than a TCSC does. The estimation of LM using the SVM model is achieved with the lowest error and the lowest training and testing computational times compared with the other machine learning models. Future work will focus on applying the proposed approach to the estimation of loadability margin in a restructured environment.